Getting started

This guide covers the prerequisites, initial setup, and basic configuration for implementing OpenALBA.

Prerequisites

Before implementing OpenALBA, ensure you have:

  • OpenTelemetry instrumentation deployed in your applications, with traces exported to a collector
  • Time-series storage for observability data (ClickHouse recommended, but Prometheus, TimescaleDB, or other options work)
  • Compute resources for running ALBA processing jobs (Kubernetes CronJobs recommended)
  • Alerting infrastructure for routing notifications (PagerDuty, Slack, email)

1. Verify instrumentation

OpenALBA depends on specific OpenTelemetry attributes to function. Verify that your applications emit the following:

Required Attributes

```yaml
# Resource attributes
service.name: "my-service"
service.version: "1.2.3"
deployment.environment.name: "production"

# Span attributes
http.route: "/api/users/:id"
http.request.method: "GET"
http.response.status_code: 200
client.address: "203.0.113.42"
```

Tip

For enhanced detection, add identity attributes like user.id and session.id to your spans. See Signal Definitions for the complete list.
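One way to verify coverage is to compare each exported span's attributes against the required set. A minimal sketch in Python (the `missing_attributes` helper and the combined resource/span attribute dict are illustrative, not part of OpenALBA):

```python
# Hypothetical helper: check an exported span's attributes (resource and
# span level combined into one dict) against the set OpenALBA requires.
REQUIRED_ATTRIBUTES = {
    "service.name",
    "service.version",
    "deployment.environment.name",
    "http.route",
    "http.request.method",
    "http.response.status_code",
}

def missing_attributes(attributes: dict) -> set:
    """Return the required attribute keys absent from one span."""
    return REQUIRED_ATTRIBUTES - attributes.keys()

span = {
    "service.name": "my-service",
    "service.version": "1.2.3",
    "http.route": "/api/users/:id",
    "http.request.method": "GET",
    "http.response.status_code": 200,
}
print(sorted(missing_attributes(span)))  # → ['deployment.environment.name']
```

Running a check like this against a sample of exported spans before enabling detection catches instrumentation gaps early.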

2. Configure storage

Set up tables for raw data, aggregated metrics, and ALBA-specific data. Example ClickHouse schema:

ClickHouse Schema (abbreviated)

```sql
-- Raw traces (7-day retention)
CREATE TABLE traces_raw (
    timestamp DateTime64(9),
    trace_id String,
    span_id String,
    service_name LowCardinality(String),
    http_route LowCardinality(String),
    http_status_code UInt16,
    user_id String,
    client_address String,
    duration_ms Float64
) ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (service_name, timestamp)
TTL timestamp + INTERVAL 7 DAY;

-- User metrics (hourly aggregation, 90-day retention)
CREATE TABLE user_metrics_hourly (
    hour DateTime,
    user_id String,
    request_count UInt64,
    unique_endpoints UInt32,
    error_count UInt64,
    response_bytes_total UInt64
) ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (user_id, hour)
TTL hour + INTERVAL 90 DAY;
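The roll-up from raw traces to hourly user metrics is a GROUP BY per user and hour bucket. A sketch using SQLite for illustration (in ClickHouse the same shape would use `toStartOfHour()` and `countIf()`; the sample data is invented):

```python
import sqlite3

# Illustrative hourly roll-up from traces_raw into user_metrics_hourly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE traces_raw (
    timestamp TEXT, user_id TEXT, http_route TEXT, http_status_code INTEGER
);
INSERT INTO traces_raw VALUES
    ('2026-01-31 10:05:00', 'u1', '/api/users/:id', 200),
    ('2026-01-31 10:20:00', 'u1', '/api/orders',    500),
    ('2026-01-31 10:40:00', 'u1', '/api/users/:id', 200),
    ('2026-01-31 11:02:00', 'u2', '/api/orders',    200);
""")
rows = conn.execute("""
    SELECT strftime('%Y-%m-%d %H:00:00', timestamp) AS hour,
           user_id,
           COUNT(*)                   AS request_count,
           COUNT(DISTINCT http_route) AS unique_endpoints,
           SUM(http_status_code >= 500) AS error_count
    FROM traces_raw
    GROUP BY hour, user_id
    ORDER BY hour, user_id
""").fetchall()
for row in rows:
    print(row)
# u1 in the 10:00 hour → 3 requests, 2 unique endpoints, 1 error
```

Scheduling this query hourly (for example as a Kubernetes CronJob, per the prerequisites) keeps the aggregate table current while raw traces age out under the 7-day TTL.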

3. Establish baselines

Before detecting anomalies, ALBA needs baseline data. Run the baseline job with historical data:

Initialize Baselines

```bash
# Run baseline initialization with 14 days of historical data
alba baseline init --window 14d

# Or run incrementally on new data
alba baseline update --incremental
```

Warning

Baselines require sufficient sample data to be meaningful. Entities with fewer than 100 samples will use fallback baselines (peer group or population). Allow 2-4 weeks of data collection for optimal accuracy.
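The fallback behaviour described in the warning can be sketched as follows (a simplified model, assuming baselines are mean/standard-deviation pairs and that peer-group samples are available; the function names are illustrative):

```python
import statistics

MIN_SAMPLES = 100  # below this, fall back to the peer-group baseline

def baseline(samples: list[float], peer_samples: list[float]) -> tuple[float, float]:
    """Return (mean, stdev) from the entity's own history, or from its
    peer group when the entity has too few samples to be meaningful."""
    source = samples if len(samples) >= MIN_SAMPLES else peer_samples
    return statistics.mean(source), statistics.pstdev(source)

def zscore(value: float, samples: list[float], peer_samples: list[float]) -> float:
    mean, stdev = baseline(samples, peer_samples)
    return 0.0 if stdev == 0 else (value - mean) / stdev

# An entity with only 5 samples is scored against its peer group.
own = [10.0] * 5
peers = [10.0] * 150 + [20.0] * 50
print(zscore(25.0, own, peers))
```

Population-level fallback works the same way with a wider sample pool; as the entity accumulates history past the sample threshold, its own baseline takes over.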

4. Enable detection

Configure which detection patterns to enable and their thresholds:

Detection Configuration

```yaml
detection:
  patterns:
    - name: user_request_volume_anomaly
      enabled: true
      zscore_threshold: 2.5

    - name: geographic_anomaly
      enabled: true
      new_country_score: 60
      impossible_travel_score: 90

    - name: service_error_rate_anomaly
      enabled: true
      zscore_threshold: 2.5
      min_volume: 100

  schedule:
    interval: 5m
    timeout: 2m
```
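As an illustration of what a pattern like `geographic_anomaly` evaluates, impossible-travel detection can be sketched as a great-circle-distance-over-time check (the speed threshold and function names are assumptions for this sketch, not OpenALBA internals):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points, in kilometres.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

MAX_PLAUSIBLE_KMH = 900.0  # roughly commercial-flight speed (assumed threshold)

def impossible_travel(prev, curr) -> bool:
    """prev/curr: (lat, lon, unix_seconds) of consecutive requests
    from the same user, geolocated from client.address."""
    dist_km = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = max((curr[2] - prev[2]) / 3600.0, 1e-9)
    return dist_km / hours > MAX_PLAUSIBLE_KMH

# New York at t=0, then London one hour later: flagged as impossible.
print(impossible_travel((40.71, -74.01, 0), (51.51, -0.13, 3600)))
```

When such a check fires, the configured `impossible_travel_score` (90 above) would feed into the alert's risk score.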

5. Configure alerting

Route alerts to appropriate teams based on anomaly type and risk:

Alert Routing

```yaml
alerting:
  routes:
    - name: security
      conditions:
        anomaly_types: [geographic, credential_stuffing, data_exfil]
        risk_score_min: 50
      channels:
        critical: [pagerduty:security, slack:#security-critical]
        high: [slack:#security-alerts]

    - name: sre
      conditions:
        anomaly_types: [error_rate, latency, dependency_failure]
        risk_score_min: 50
      channels:
        critical: [pagerduty:sre, slack:#incidents]
        high: [slack:#sre-alerts]
```
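The routing this configuration implies can be sketched as first-match evaluation: pick the first route whose conditions match, then the channel list for the alert's severity (a simplified model; `route_alert` is a hypothetical name):

```python
# Routes mirroring the YAML configuration above.
ROUTES = [
    {
        "anomaly_types": {"geographic", "credential_stuffing", "data_exfil"},
        "risk_score_min": 50,
        "channels": {
            "critical": ["pagerduty:security", "slack:#security-critical"],
            "high": ["slack:#security-alerts"],
        },
    },
    {
        "anomaly_types": {"error_rate", "latency", "dependency_failure"},
        "risk_score_min": 50,
        "channels": {
            "critical": ["pagerduty:sre", "slack:#incidents"],
            "high": ["slack:#sre-alerts"],
        },
    },
]

def route_alert(anomaly_type: str, risk_score: int, severity: str) -> list[str]:
    """Return the notification channels for the first matching route."""
    for route in ROUTES:
        if anomaly_type in route["anomaly_types"] and risk_score >= route["risk_score_min"]:
            return route["channels"].get(severity, [])
    return []  # unmatched alerts fall through; a default route could catch them

print(route_alert("geographic", 75, "critical"))
```

Note that alerts below `risk_score_min`, or of an unlisted anomaly type, match no route here; in practice you would add a catch-all route so nothing is silently dropped.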

Last updated: 2026-01-31