4. Anomaly score calculation
| Status | Stable |
| Version | 2.0.0 |
| Last updated | 2026-01-31 |
| Authors | OpenALBA Working Group |
4.1 Overview
The Anomaly Score quantifies how unusual an observed behavior is compared to established baselines. This section defines the calculation methodology.
The anomaly score is an objective, mathematical measure in the range [0, 100] that does not consider business context or risk implications. Those factors are applied in the Risk Score calculation.
4.2 Conformance
Implementations MUST:
- Calculate anomaly scores in the range [0, 100]
- Implement at least one statistical deviation method from Section 4.3
- Apply component weights as defined in Section 4.6
Implementations SHOULD:
- Implement confidence adjustment for entities with limited baseline data
- Support configurable component weights
4.3 Deviation component (0-100)
Measures statistical distance from the baseline.
4.3.1 Z-Score method
For normally distributed metrics:
z = (observed - baseline.mean) / baseline.stddev
deviation_score = min(100, |z| × 20)
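As a hedged illustration, the Z-score method can be sketched in Python; the function name is ours, and it assumes the baseline statistics are precomputed and `stddev > 0`:

```python
def zscore_deviation(observed: float, mean: float, stddev: float) -> float:
    """Deviation component via Z-score; assumes stddev > 0."""
    z = (observed - mean) / stddev
    return min(100.0, abs(z) * 20)
```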
Mapping:
z = 0 (at mean) → score = 0
z = 2.5 → score = 50
z = 5 → score = 100
4.3.2 Modified Z-Score method (RECOMMENDED)
For metrics with outliers in the baseline:
modified_z = 0.6745 × (observed - baseline.median) / baseline.MAD
deviation_score = min(100, |modified_z| × 18)
Where MAD is the Median Absolute Deviation.
Tip
The modified Z-score is RECOMMENDED for most use cases as it is more resistant to outliers in the baseline data.
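A sketch of the modified Z-score computed from a raw baseline sample; the helper name and the handling of a zero MAD are assumptions, not spec requirements:

```python
from statistics import median

def modified_z_deviation(observed: float, baseline: list) -> float:
    """Deviation component via modified Z-score (median/MAD, robust to outliers)."""
    med = median(baseline)
    mad = median(abs(x - med) for x in baseline)
    if mad == 0:
        # Degenerate baseline (all values identical): assumed convention
        return 0.0 if observed == med else 100.0
    modified_z = 0.6745 * (observed - med) / mad
    return min(100.0, abs(modified_z) * 18)
```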
4.3.3 IQR method
For non-normal distributions:
IQR = baseline.q3 - baseline.q1
lower_fence = baseline.q1 - 1.5 × IQR
upper_fence = baseline.q3 + 1.5 × IQR
if observed < lower_fence:
distance = (lower_fence - observed) / IQR
elif observed > upper_fence:
distance = (observed - upper_fence) / IQR
else:
distance = 0
deviation_score = min(100, distance × 30)
4.4 Rarity component (0-100)
Measures how uncommon the observed value is historically.
4.4.1 Percentile-based rarity
percentile = percentile_rank(observed, baseline.distribution)
if percentile <= 50:
rarity_score = (1 - percentile/50) × 100 # Lower is rarer
else:
rarity_score = ((percentile - 50)/50) × 100 # Higher is rarer
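A runnable sketch of the above, assuming a midpoint definition of `percentile_rank` over a raw baseline sample (the spec does not mandate a particular definition):

```python
from bisect import bisect_left, bisect_right

def percentile_rank(observed, distribution) -> float:
    """Midpoint percentile rank (0-100) of observed within a sample.
    Hypothetical helper; one of several common definitions."""
    s = sorted(distribution)
    lo = bisect_left(s, observed)
    hi = bisect_right(s, observed)
    return 100.0 * (lo + hi) / (2 * len(s))

def percentile_rarity(observed, distribution) -> float:
    p = percentile_rank(observed, distribution)
    if p <= 50:
        return (1 - p / 50) * 100  # lower is rarer
    return ((p - 50) / 50) * 100   # higher is rarer
```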
Example:
Observed at 5th percentile → rarity = 90
Observed at 50th percentile → rarity = 0
Observed at 95th percentile → rarity = 90
4.4.2 Frequency-based rarity
For categorical values:
frequency = baseline.value_frequencies.get(observed, 0)
total = sum(baseline.value_frequencies.values())
rarity_score = (1 - frequency / total) × 100
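A minimal sketch of frequency-based rarity; the empty-baseline convention is our assumption:

```python
def frequency_rarity(observed, value_frequencies: dict) -> float:
    """Rarity of a categorical value; unseen values score 100."""
    total = sum(value_frequencies.values())
    if total == 0:
        return 100.0  # no baseline data: assumed convention
    frequency = value_frequencies.get(observed, 0)
    return (1 - frequency / total) * 100
```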
Example:
Value seen 1000 times out of 10000 → rarity = 90
Value never seen before → rarity = 100
4.5 Velocity component (0-100)
Measures rate of change from the previous period.
4.5.1 Simple rate of change
if previous == 0:
rate_of_change = infinity if current > 0 else 0
else:
rate_of_change = (current - previous) / previous
velocity_score = min(100, |rate_of_change| × 50)
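The zero-previous special case above can be handled with `math.inf`, which the cap absorbs naturally; a sketch:

```python
import math

def simple_velocity(current: float, previous: float) -> float:
    """Velocity component from simple rate of change."""
    if previous == 0:
        rate = math.inf if current > 0 else 0.0  # any growth from zero is maximal
    else:
        rate = (current - previous) / previous
    return min(100.0, abs(rate) * 50)
```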
Mapping:
0% change → score = 0
50% change → score = 25
100% change (doubled/halved) → score = 50
200% change → score = 100
4.5.2 Baseline-normalized velocity
change = |current - previous|
normalized_change = change / baseline.stddev
velocity_score = min(100, normalized_change × 25)
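A sketch of the baseline-normalized variant, assuming `stddev > 0`:

```python
def normalized_velocity(current: float, previous: float, stddev: float) -> float:
    """Velocity component normalized by baseline stddev; assumes stddev > 0."""
    normalized_change = abs(current - previous) / stddev
    return min(100.0, normalized_change * 25)
```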
Mapping:
0 stddev change → score = 0
2 stddev change → score = 50
4+ stddev change → score = 100
4.6 Persistence component (0-100)
Measures duration of anomalous state.
4.6.1 Consecutive periods
consecutive = count of consecutive periods where score > threshold
persistence_score = min(100, consecutive × 10)
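A sketch of counting the trailing run of anomalous periods; it assumes scores are ordered oldest to newest:

```python
def consecutive_persistence(period_scores: list, threshold: float) -> float:
    """Persistence from the trailing run of periods above threshold
    (most recent period last; any sub-threshold period resets the run)."""
    consecutive = 0
    for score in reversed(period_scores):
        if score > threshold:
            consecutive += 1
        else:
            break
    return min(100.0, consecutive * 10)
```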
Example:
Threshold: 40
Period 1: score=55 → consecutive=1 → persistence=10
Period 2: score=52 → consecutive=2 → persistence=20
Period 3: score=48 → consecutive=3 → persistence=30
Period 4: score=30 → consecutive=0 → persistence=0 (reset)
4.6.2 Weighted persistence
weighted_sum = sum(recent_scores)
max_possible = N × 100
persistence_score = weighted_sum / max_possible × 100
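As written, the formula weights the last N periods equally; a sketch under that reading (a decaying weight scheme would be a straightforward extension):

```python
def weighted_persistence(recent_scores: list) -> float:
    """Persistence as the share of the maximum possible score over
    the last N periods (equal weights)."""
    if not recent_scores:
        return 0.0
    return sum(recent_scores) / (len(recent_scores) * 100) * 100
```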
Example (last 5 periods):
Scores: [45, 52, 48, 55, 50]
Sum: 250, Max: 500
Persistence: 250/500 × 100 = 50
4.7 Composite score calculation
4.7.1 Standard formula
AnomalyScore = 0.40×Deviation + 0.25×Rarity + 0.20×Velocity + 0.15×Persistence
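A sketch of the composite; the default tuple carries the standard weights, which are assumed to sum to 1.0:

```python
def anomaly_score(deviation: float, rarity: float, velocity: float,
                  persistence: float,
                  weights: tuple = (0.40, 0.25, 0.20, 0.15)) -> float:
    """Weighted composite of the four components; weights sum to 1.0."""
    wd, wr, wv, wp = weights
    return wd * deviation + wr * rarity + wv * velocity + wp * persistence
```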
Example:
Deviation: 65 (z-score of 3.25)
Rarity: 80 (10th percentile of distribution)
Velocity: 40 (1.6 stddev change)
Persistence: 30 (3 consecutive periods)
Score = 0.40×65 + 0.25×80 + 0.20×40 + 0.15×30
= 26 + 20 + 8 + 4.5
= 58.5
4.7.2 Detection-specific weights
volumetric_anomaly: # DDoS/abuse - sudden spikes matter
deviation: 0.50
rarity: 0.15
velocity: 0.30
persistence: 0.05
access_pattern: # Scraping - pattern matters most
deviation: 0.25
rarity: 0.45
velocity: 0.15
persistence: 0.15
data_exfiltration: # Slow leaks - persistence matters
deviation: 0.30
rarity: 0.20
velocity: 0.10
persistence: 0.40
geographic: # Rare events matter most
deviation: 0.20
rarity: 0.50
velocity: 0.20
persistence: 0.10
4.7.3 Multi-signal aggregation
When an entity has multiple anomalous metrics:
Method: Maximum + Breadth Bonus (RECOMMENDED)
base_score = max(individual_scores)
breadth_bonus = min(20, count(scores > 40) × 5)
entity_score = min(100, base_score + breadth_bonus)
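A sketch of the maximum-plus-breadth aggregation; the function name is ours:

```python
def aggregate_entity_score(individual_scores: list) -> float:
    """Maximum + breadth bonus: strongest signal, plus up to 20 points
    for multiple signals scoring above 40."""
    base = max(individual_scores)
    breadth_bonus = min(20, sum(1 for s in individual_scores if s > 40) * 5)
    return min(100.0, base + breadth_bonus)
```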
Example:
Request volume: 55
Unique endpoints: 45
Geographic: 75
Base: 75
Breadth: 3 signals > 40 → bonus = 15
Entity score: 75 + 15 = 90
4.7.4 Confidence adjustment
adjusted_score = raw_score × sqrt(confidence)
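A one-line sketch of the adjustment, which damps scores for entities with thin baselines:

```python
import math

def confidence_adjusted(raw_score: float, confidence: float) -> float:
    """Scale by sqrt(confidence) so low-confidence scores are damped gradually."""
    return raw_score * math.sqrt(confidence)
```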
Example:
Raw score: 70
Confidence: 0.25 (new entity)
Adjusted: 70 × sqrt(0.25) = 70 × 0.5 = 35
Rationale:
sqrt() provides gradual adjustment
At confidence 0.25 → 50% of score
At confidence 0.64 → 80% of score
At confidence 1.0 → 100% of score
4.8 Security considerations
Anomaly scores may be manipulated by adversaries who gradually shift behavior to poison baselines. Implementations SHOULD:
- Use robust statistics (median, MAD) resistant to outliers
- Implement baseline poisoning detection
- Maintain historical baselines for comparison
- Use slow adaptation rates (low alpha in EWMA) for user behavior
Note
Continue to Section 5: Risk Score Calculation for details on applying context and multipliers.