2. Core architecture

Status: Stable
Version: 2.0.0
Last updated: 2026-01-31
Authors: OpenALBA Working Group

2.1 System components

The OpenALBA processing pipeline consists of the following components:

Processing Pipeline
DATA SOURCES → AGGREGATION → BASELINES → ANOMALY DETECTION → RISK SCORING → ALERTS
     │              │            │              │                 │            │
   OTel         Pre-agg      Per-entity    Statistical +       Multipliers   Routing
   Traces       Metrics      + Peer group     ML methods       + Decay       + Delivery

Each component has distinct responsibilities:

  • Data Sources: Applications instrumented with OpenTelemetry SDK emit traces, metrics, and logs
  • Aggregation: Raw observability data is pre-aggregated into metrics per entity per time window
  • Baselines: Statistical and ML models establish “normal” behavior per entity and peer group
  • Anomaly Detection: Current behavior is compared to baselines to calculate objective anomaly scores
  • Risk Scoring: Anomaly scores are adjusted by entity criticality, sensitivity, and consumer-specific weights
  • Alerts: Risk scores exceeding thresholds are routed to appropriate teams
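The stage responsibilities above can be sketched as a chain of small functions. This is an illustrative sketch only, not the OpenALBA implementation; the function names, the z-score anomaly method, and the criticality multiplier values are assumptions chosen to mirror the Aggregation → Anomaly Detection → Risk Scoring steps.

```python
def aggregate(spans):
    """Pre-aggregate raw spans into per-entity request counts.
    Hypothetical: keys on user.id only, for brevity."""
    counts = {}
    for span in spans:
        key = span["user.id"]
        counts[key] = counts.get(key, 0) + 1
    return counts

def anomaly_score(current, baseline_mean, baseline_std):
    """Objective anomaly score: deviation from the baseline in
    standard deviations (one of many possible statistical methods)."""
    if baseline_std == 0:
        return 0.0
    return abs(current - baseline_mean) / baseline_std

def risk_score(anomaly, criticality=1.0):
    """Adjust the objective anomaly score by an entity-criticality
    multiplier, as the Risk Scoring stage describes."""
    return anomaly * criticality

# 120 requests in the detection window vs. a baseline of 40 ± 10:
spans = [{"user.id": "u1"}] * 120
counts = aggregate(spans)
a = anomaly_score(counts["u1"], baseline_mean=40, baseline_std=10)
r = risk_score(a, criticality=2.0)
```

The split between `anomaly_score` and `risk_score` reflects the document's separation of objective detection from consumer-specific weighting.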

2.2 Entity model

OpenALBA profiles behavior for multiple entity types:

| Entity Type | Primary Key                  | Baseline Scope        | Typical Metrics                                |
|-------------|------------------------------|-----------------------|------------------------------------------------|
| User        | user.id                      | Per-user + peer group | Request volume, endpoints, data volume, timing |
| Session     | session.id                   | Per-user historical   | Duration, actions, navigation pattern          |
| Service     | service.name                 | Per-service + type    | Request rate, error rate, latency, dependencies |
| Endpoint    | service.name + http.route    | Per-endpoint          | Volume, response size, error rate              |
| Dependency  | service.name + peer.service  | Per-pair              | Call volume, error rate, latency               |
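The primary keys in the table above can be expressed as a lookup from entity type to attribute names. This is a hypothetical helper, not part of the specification; the attribute names follow OpenTelemetry semantic conventions as listed in the table.

```python
# Entity type → attribute names forming the primary key (from the table).
ENTITY_KEYS = {
    "user": ("user.id",),
    "session": ("session.id",),
    "service": ("service.name",),
    "endpoint": ("service.name", "http.route"),
    "dependency": ("service.name", "peer.service"),
}

def entity_key(entity_type, attributes):
    """Build a composite key for one entity from span attributes.
    Illustrative helper; raises KeyError if a key attribute is missing."""
    fields = ENTITY_KEYS[entity_type]
    return tuple(attributes[f] for f in fields)

span_attrs = {"service.name": "checkout", "http.route": "/api/pay"}
key = entity_key("endpoint", span_attrs)
```

Composite keys like `service.name + http.route` keep endpoint baselines distinct even when several services expose the same route.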

2.3 Time windows

OpenALBA uses multiple time windows to balance responsiveness with accuracy:

| Window            | Purpose                       | Typical Duration |
|-------------------|-------------------------------|------------------|
| Detection Window  | Current behavior measurement  | 5-15 minutes     |
| Short Baseline    | Recent “normal”               | 24-72 hours      |
| Standard Baseline | Primary behavioral baseline   | 7-30 days        |
| Long Baseline     | Seasonal patterns             | 90-365 days      |

Note

The detection window SHOULD be short enough to catch attacks but long enough to avoid noise from brief fluctuations. Implementations SHOULD allow this to be configured per detection pattern.
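Per-pattern window configuration, as the note recommends, might look like the following sketch. The field names and defaults are assumptions; the defaults sit inside the typical ranges from the table above.

```python
from dataclasses import dataclass

@dataclass
class WindowConfig:
    """Hypothetical per-detection-pattern window settings."""
    detection_minutes: int = 10        # typical range: 5-15 minutes
    short_baseline_hours: int = 48     # typical range: 24-72 hours
    standard_baseline_days: int = 14   # typical range: 7-30 days
    long_baseline_days: int = 180      # typical range: 90-365 days

# A fast-moving attack pattern might override only the detection window:
exfil_windows = WindowConfig(detection_minutes=5)
```

Keeping the baselines fixed while tuning the detection window per pattern preserves comparability of anomaly scores across patterns.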

2.4 Data flow

Data flows through the system in the following sequence:

  1. Applications emit spans and metrics via OpenTelemetry SDK to an OTel Collector
  2. Collector exports data to storage (e.g., ClickHouse) via appropriate exporters
  3. Aggregation jobs run periodically to compute per-entity metrics from raw spans
  4. Baseline jobs update statistical models and ML models on configured schedules
  5. Detection jobs compare current metrics to baselines and calculate anomaly scores
  6. Risk scoring applies multipliers and decay to produce final risk scores
  7. Alert evaluation checks thresholds and routes notifications to configured channels
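Steps 3-7 above run as scheduled jobs, each consuming upstream results. A minimal sketch of that ordering, with illustrative job names and trivial handlers standing in for real implementations:

```python
# Ordered job stages from the data-flow sequence (steps 3-7);
# names are illustrative, not normative.
PIPELINE_JOBS = [
    "aggregation",   # per-entity metrics from raw spans
    "baselines",     # update statistical and ML models
    "detection",     # compare current metrics to baselines
    "risk_scoring",  # apply multipliers and decay
    "alerting",      # threshold checks and notification routing
]

def run_cycle(jobs, handlers):
    """Run each stage in order; each handler sees upstream results."""
    state = {}
    for job in jobs:
        state[job] = handlers[job](state)
    return state

# Trivial stand-in handlers, just to show the wiring:
handlers = {j: (lambda state, j=j: f"{j} done") for j in PIPELINE_JOBS}
result = run_cycle(PIPELINE_JOBS, handlers)
```

In practice each stage would run on its own schedule (e.g. baseline jobs far less often than detection jobs), but the data dependency order is as listed.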

2.5 Conformance

Implementations claiming conformance to this architecture:

  • MUST support the User and Service entity types
  • SHOULD support Session, Endpoint, and Dependency entity types
  • MUST implement at least detection and short baseline windows
  • SHOULD implement standard and long baseline windows for improved accuracy

Tip

Continue to Section 3: Baseline Methodology for details on how baselines are established.