4. Anomaly score calculation

Status: Stable
Version: 2.0.0
Last updated: 2026-01-31
Authors: OpenALBA Working Group

4.1 Overview

The Anomaly Score quantifies how unusual an observed behavior is compared to established baselines. This section defines the calculation methodology.

The anomaly score is an objective, mathematical measure in the range [0, 100] that does not consider business context or risk implications. Those factors are applied in the Risk Score calculation.

4.2 Conformance

Implementations MUST:

  • Calculate anomaly scores in the range [0, 100]
  • Implement at least one statistical deviation method from Section 4.3
  • Apply component weights as defined in Section 4.7

Implementations SHOULD:

  • Implement confidence adjustment for entities with limited baseline data
  • Support configurable component weights

4.3 Deviation component (0-100)

Measures statistical distance from the baseline.

4.3.1 Z-Score method

For normally distributed metrics:

Z-Score Calculation (Python)
z = (observed - baseline.mean) / baseline.stddev
deviation_score = min(100, abs(z) * 20)

Mapping:
    z = 0 (at mean) → score = 0
    z = 2.5 → score = 50
    z = 5 → score = 100
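A minimal runnable sketch of this method; the `Baseline` dataclass and the sample mean/stddev are illustrative, not part of the specification:

```python
from dataclasses import dataclass

# Illustrative baseline container; the spec does not mandate a structure.
@dataclass
class Baseline:
    mean: float
    stddev: float

def zscore_deviation(observed: float, baseline: Baseline) -> float:
    """Map |z| onto [0, 100], saturating at |z| = 5."""
    z = (observed - baseline.mean) / baseline.stddev
    return min(100.0, abs(z) * 20)

b = Baseline(mean=200.0, stddev=40.0)
print(zscore_deviation(200.0, b))  # at the mean -> 0.0
print(zscore_deviation(300.0, b))  # z = 2.5 -> 50.0
print(zscore_deviation(500.0, b))  # z = 7.5, capped -> 100.0
```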

4.3.2 Modified Z-Score method (RECOMMENDED)

For metrics with outliers in the baseline:

Modified Z-Score Calculation (Python)
modified_z = 0.6745 * (observed - baseline.median) / baseline.MAD
deviation_score = min(100, abs(modified_z) * 18)

Where MAD is the Median Absolute Deviation.

Tip

The modified Z-score is RECOMMENDED for most use cases as it is more resistant to outliers in the baseline data.
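The method can be sketched end-to-end from raw baseline samples. The `modified_z_deviation` helper, the zero-MAD guard, and the sample data below are illustrative assumptions, not spec requirements:

```python
import statistics

def modified_z_deviation(observed: float, samples: list[float]) -> float:
    """Modified z-score deviation; median and MAD computed from raw samples."""
    median = statistics.median(samples)
    mad = statistics.median(abs(x - median) for x in samples)
    if mad == 0:  # degenerate baseline with no spread (guard is an assumption)
        return 0.0 if observed == median else 100.0
    modified_z = 0.6745 * (observed - median) / mad
    return min(100.0, abs(modified_z) * 18)

# One large outlier (100) barely moves the median/MAD, unlike mean/stddev.
samples = [10, 12, 11, 13, 12, 11, 100]
print(modified_z_deviation(12, samples))  # at the median -> 0.0
```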

4.3.3 IQR method

For non-normal distributions:

IQR Method (Python)
IQR = baseline.q3 - baseline.q1
lower_fence = baseline.q1 - 1.5 * IQR
upper_fence = baseline.q3 + 1.5 * IQR

if observed < lower_fence:
    distance = (lower_fence - observed) / IQR
elif observed > upper_fence:
    distance = (observed - upper_fence) / IQR
else:
    distance = 0

deviation_score = min(100, distance * 30)
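As a self-contained sketch (the quartile values passed in below are illustrative):

```python
def iqr_deviation(observed: float, q1: float, q3: float) -> float:
    """Distance past the Tukey fences, in IQR units, scaled by 30 and capped."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    if observed < lower_fence:
        distance = (lower_fence - observed) / iqr
    elif observed > upper_fence:
        distance = (observed - upper_fence) / iqr
    else:
        distance = 0.0
    return min(100.0, distance * 30)

# q1=100, q3=200 -> IQR=100, fences at -50 and 350
print(iqr_deviation(150, 100, 200))  # inside the fences -> 0.0
print(iqr_deviation(450, 100, 200))  # 1 IQR past the upper fence -> 30.0
```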

4.4 Rarity component (0-100)

Measures how uncommon the observed value is historically.

4.4.1 Percentile-based rarity

Percentile Rarity (Python)
percentile = percentile_rank(observed, baseline.distribution)

if percentile <= 50:
    rarity_score = (1 - percentile / 50) * 100  # Lower is rarer
else:
    rarity_score = ((percentile - 50) / 50) * 100  # Higher is rarer

Example:
    Observed at 5th percentile → rarity = 90
    Observed at 50th percentile → rarity = 0
    Observed at 95th percentile → rarity = 90
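A runnable sketch, including one possible `percentile_rank` (a mid-rank definition over a sorted baseline sample; both the helper and the toy distribution are assumptions, since the spec does not fix a percentile definition):

```python
from bisect import bisect_left, bisect_right

def percentile_rank(value: float, sorted_dist: list[float]) -> float:
    """Mid-rank percentile of value within a sorted baseline sample."""
    lo = bisect_left(sorted_dist, value)
    hi = bisect_right(sorted_dist, value)
    return 100.0 * (lo + hi) / (2 * len(sorted_dist))

def percentile_rarity(value: float, sorted_dist: list[float]) -> float:
    p = percentile_rank(value, sorted_dist)
    if p <= 50:
        return (1 - p / 50) * 100   # lower tail is rarer
    return ((p - 50) / 50) * 100    # upper tail is rarer

dist = sorted(float(v) for v in range(1, 101))  # toy baseline: 1..100
print(percentile_rarity(5.0, dist))   # deep in the lower tail -> high rarity
print(percentile_rarity(50.0, dist))  # near the middle -> near 0
```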

4.4.2 Frequency-based rarity

For categorical values:

Frequency Rarity (Python)
frequency = baseline.value_frequencies.get(observed, 0)
total = sum(baseline.value_frequencies.values())
rarity_score = (1 - frequency / total) * 100

Example:
    Value seen 1000 times out of 10000 → rarity = 90
    Value never seen before → rarity = 100
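A sketch that builds the frequency table from raw history with `collections.Counter` (the country-code history is illustrative):

```python
from collections import Counter

def frequency_rarity(observed: str, history: list[str]) -> float:
    """Rarity of a categorical value based on its historical frequency."""
    counts = Counter(history)
    total = sum(counts.values())
    return (1 - counts.get(observed, 0) / total) * 100

history = ["US"] * 900 + ["DE"] * 90 + ["BR"] * 10
print(frequency_rarity("US", history))  # common value -> low rarity (~10)
print(frequency_rarity("JP", history))  # never seen -> 100.0
```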

4.5 Velocity component (0-100)

Measures rate of change from the previous period.

4.5.1 Simple rate of change

Simple Velocity (Python)
if previous == 0:
    rate_of_change = float("inf") if current > 0 else 0
else:
    rate_of_change = (current - previous) / previous

velocity_score = min(100, abs(rate_of_change) * 50)

Mapping:
    0% change → score = 0
    50% change → score = 25
    100% change (doubled) → score = 50
    200% change → score = 100
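As a runnable sketch (the sample counts are illustrative; note that `min` caps the infinite rate for a jump from zero):

```python
import math

def simple_velocity(current: float, previous: float) -> float:
    """Score the relative change between two periods, capped at 100."""
    if previous == 0:
        rate_of_change = math.inf if current > 0 else 0.0
    else:
        rate_of_change = (current - previous) / previous
    return min(100.0, abs(rate_of_change) * 50)

print(simple_velocity(100, 100))  # no change -> 0.0
print(simple_velocity(200, 100))  # doubled -> 50.0
print(simple_velocity(500, 0))    # from zero -> capped at 100.0
```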

4.5.2 Baseline-normalized velocity

Baseline-Normalized Velocity (Python)
change = abs(current - previous)
normalized_change = change / baseline.stddev
velocity_score = min(100, normalized_change * 25)

Mapping:
    0 stddev change → score = 0
    2 stddev change → score = 50
    4+ stddev change → score = 100
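A one-function sketch, passing the baseline stddev directly (the sample values are illustrative):

```python
def normalized_velocity(current: float, previous: float, stddev: float) -> float:
    """Change between periods measured in baseline standard deviations."""
    return min(100.0, abs(current - previous) / stddev * 25)

print(normalized_velocity(260, 200, 30))  # 2 stddev change -> 50.0
print(normalized_velocity(200, 200, 30))  # no change -> 0.0
```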

4.6 Persistence component (0-100)

Measures duration of anomalous state.

4.6.1 Consecutive periods

Consecutive Persistence (Python)
# consecutive = count of consecutive periods where score > threshold
persistence_score = min(100, consecutive * 10)

Example:
    Threshold: 40
    Period 1: score=55 → consecutive=1 → persistence=10
    Period 2: score=52 → consecutive=2 → persistence=20
    Period 3: score=48 → consecutive=3 → persistence=30
    Period 4: score=30 → consecutive=0 → persistence=0 (reset)
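The counting step can be made concrete by scanning the score history backwards from the most recent period; the helper below is an illustrative sketch:

```python
def consecutive_persistence(scores: list[int], threshold: int = 40) -> int:
    """Count trailing periods above threshold; 10 points each, capped at 100."""
    consecutive = 0
    for s in reversed(scores):  # walk back from the most recent period
        if s > threshold:
            consecutive += 1
        else:
            break  # a sub-threshold period resets the streak
    return min(100, consecutive * 10)

print(consecutive_persistence([55, 52, 48]))      # 3 trailing periods -> 30
print(consecutive_persistence([55, 52, 48, 30]))  # reset by period 4 -> 0
```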

4.6.2 Weighted persistence

Weighted Persistence (Python)
N = len(recent_scores)  # number of recent periods considered
weighted_sum = sum(recent_scores)
max_possible = N * 100
persistence_score = weighted_sum / max_possible * 100

Example (last 5 periods):
    Scores: [45, 52, 48, 55, 50]
    Sum: 250, Max: 500
    Persistence: 250/500 × 100 = 50
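A minimal sketch of this calculation:

```python
def weighted_persistence(recent_scores: list[int]) -> float:
    """Average of the last N per-period scores, expressed on a 0-100 scale."""
    return sum(recent_scores) / (len(recent_scores) * 100) * 100

print(weighted_persistence([45, 52, 48, 55, 50]))  # 250/500 -> 50.0
```

Note that as written the formula weighs all N periods equally; an implementation that wants to emphasize recent periods would need to introduce per-period weights.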

4.7 Composite score calculation

4.7.1 Standard formula

Standard Composite Formula (Python)
AnomalyScore = 0.40 * Deviation + 0.25 * Rarity + 0.20 * Velocity + 0.15 * Persistence

Example:
    Deviation: 65 (z-score of 3.25)
    Rarity: 80 (frequency-based: value seen in 20% of baseline observations)
    Velocity: 40 (1.6 stddev change)
    Persistence: 30 (3 consecutive periods)

    Score = 0.40×65 + 0.25×80 + 0.20×40 + 0.15×30
          = 26 + 20 + 8 + 4.5
          = 58.5
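The worked example above can be reproduced with a small helper; the `WEIGHTS` dict layout is an illustrative choice, not a mandated structure:

```python
# Standard weights from Section 4.7.1 (the dict layout is an assumption)
WEIGHTS = {"deviation": 0.40, "rarity": 0.25, "velocity": 0.20, "persistence": 0.15}

def anomaly_score(components: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the four component scores."""
    return sum(weights[name] * components[name] for name in weights)

score = anomaly_score({"deviation": 65, "rarity": 80, "velocity": 40, "persistence": 30})
print(score)  # ~58.5, matching the worked example
```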

4.7.2 Detection-specific weights

Detection-Specific Weights (YAML)
volumetric_anomaly:  # DDoS/abuse - sudden spikes matter
  deviation: 0.50
  rarity: 0.15
  velocity: 0.30
  persistence: 0.05

access_pattern:  # Scraping - pattern matters most
  deviation: 0.25
  rarity: 0.45
  velocity: 0.15
  persistence: 0.15

data_exfiltration:  # Slow leaks - persistence matters
  deviation: 0.30
  rarity: 0.20
  velocity: 0.10
  persistence: 0.40

geographic:  # Rare events matter most
  deviation: 0.20
  rarity: 0.50
  velocity: 0.20
  persistence: 0.10

4.7.3 Multi-signal aggregation

When an entity has multiple anomalous metrics:

Multi-Signal Aggregation (Python)
# Method: maximum + breadth bonus (RECOMMENDED)
base_score = max(individual_scores)
breadth_bonus = min(20, sum(1 for s in individual_scores if s > 40) * 5)
entity_score = min(100, base_score + breadth_bonus)

Example:
    Request volume: 55
    Unique endpoints: 45
    Geographic: 75

    Base: 75
    Breadth: 3 signals > 40 → bonus = 15
    Entity score: 75 + 15 = 90
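A self-contained sketch reproducing the example (the function name is illustrative):

```python
def aggregate_signals(scores: list[int]) -> int:
    """Strongest signal plus a capped bonus for breadth across signals."""
    base_score = max(scores)
    breadth_bonus = min(20, sum(1 for s in scores if s > 40) * 5)
    return min(100, base_score + breadth_bonus)

print(aggregate_signals([55, 45, 75]))  # base 75 + bonus 15 -> 90
```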

4.7.4 Confidence adjustment

Confidence Adjustment (Python)
adjusted_score = raw_score * sqrt(confidence)

Example:
    Raw score: 70
    Confidence: 0.25 (new entity)

    Adjusted: 70 × sqrt(0.25) = 70 × 0.5 = 35

Rationale:
    sqrt() provides gradual adjustment
    At confidence 0.25 → 50% of score
    At confidence 0.64 → 80% of score
    At confidence 1.0 → 100% of score
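As a runnable sketch (the function name is an illustrative choice):

```python
import math

def adjust_for_confidence(raw_score: float, confidence: float) -> float:
    """Scale score by sqrt(confidence) so sparse baselines damp it gradually."""
    return raw_score * math.sqrt(confidence)

print(adjust_for_confidence(70, 0.25))  # new entity -> 35.0
print(adjust_for_confidence(70, 1.0))   # full baseline -> 70.0
```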

4.8 Security considerations

Anomaly scores may be manipulated by adversaries who gradually shift behavior to poison baselines. Implementations SHOULD:

  • Use robust statistics (median, MAD) resistant to outliers
  • Implement baseline poisoning detection
  • Maintain historical baselines for comparison
  • Use slow adaptation rates (low alpha in EWMA) for user behavior

Note

Continue to Section 5: Risk Score Calculation for details on applying context and multipliers.