Understanding CRML Parameters
A practical guide to choosing the right parameters for your cyber risk models.
Simulation Parameters
Number of Iterations
What it is: How many times the simulation runs to estimate your risk.
How to choose:
| Iterations | Use Case | Accuracy | Speed |
|---|---|---|---|
| 100-1,000 | Quick testing, prototyping | Low | Very Fast (< 1s) |
| 5,000-10,000 | Standard analysis, reporting | Good | Fast (2-5s) |
| 50,000-100,000 | High-stakes decisions, compliance | Excellent | Slower (10-30s) |
Rule of thumb: Start with 10,000 for most use cases. Increase if: - Making million-dollar decisions - Presenting to executives or board - Need regulatory compliance documentation
Why it matters: More iterations = more stable, reliable results. Like flipping a coin - 10 flips might give you 7 heads, but 10,000 flips will be very close to 50/50.
Random Seed
What it is: A number that makes your simulation reproducible.
When to use: - ✅ Testing: Use the same seed (e.g., 42) to verify code changes don't affect results - ✅ Documentation: Include seed in reports so others can reproduce your analysis - ✅ Debugging: Use a seed to get consistent results while troubleshooting - ❌ Production: Leave empty for different results each time (more realistic)
Example:
# Reproducible (always same results)
pipeline:
simulation:
monte_carlo:
runs: 10000
random_seed: 42
# Non-reproducible (different each time)
pipeline:
simulation:
monte_carlo:
runs: 10000
# No seed specified
Distribution Selection
Understanding Distributions
Distributions describe how often (frequency) and how bad (severity) cyber events occur.
Frequency Distributions
Question: "How many times will this happen per year?"
Poisson Distribution
Use when: Events are rare and random (most cyber risks)
Parameters:
- lambda: Average number of events per year per asset
How to choose lambda:
| Lambda | Meaning | Example |
|---|---|---|
| 0.01 | 1% chance per asset per year | Zero-day exploits |
| 0.05 | 5% chance per asset per year | Data breaches |
| 0.1 | 10% chance per asset per year | Phishing incidents |
| 0.5 | 50% chance per asset per year | Failed login attempts |
| 1.0 | Expected once per asset per year | Malware detections |
Real-world example:
# Ransomware scenario
# Industry data: ~8% of organizations hit per year
# You have 500 critical systems
frequency:
model: poisson
parameters:
lambda: 0.08 # 8% per system = ~40 expected incidents/year across all systems
Where to get lambda values: - Industry reports (Verizon DBIR, IBM Cost of Data Breach) - Your own historical incident data - Threat intelligence feeds - Peer benchmarking
Gamma Distribution
Use when: Event frequency varies over time or between assets
Parameters:
- shape: How variable the rate is
- scale: Average rate
When to use: Advanced modeling when Poisson is too simple (e.g., seasonal attacks, varying asset criticality)
Severity Distributions
Question: "When it happens, how much will it cost?"
Lognormal Distribution
Use when: Losses are typically small but can be extremely large (most cyber losses)
Parameters:
- median: The typical (median) loss amount in real currency. Accepts numbers or strings with spaces (e.g., "100 000").
- currency: The currency code for the median value (e.g., USD, EUR)
- sigma: Controls the variability (how spread out losses are)
- mu: Alternative to median - log-space mean (advanced users only)
- single_losses: List of observed or estimated single-event loss amounts for auto-calibration (do not combine with median/mu/sigma). Each value can be a number or a string with spaces.
- cardinality: Number of assets of this type. Accepts numbers or strings with spaces (e.g., "10 000").
- lambda: Poisson rate parameter. Accepts numbers or strings with spaces (e.g., "1 200").
- alpha_base: Gamma shape parameter for hierarchical_gamma_poisson. Accepts numbers, strings with spaces, or expressions (e.g., "1 000", "CI * 2 + 1").
- beta_base: Gamma rate parameter for hierarchical_gamma_poisson. Accepts numbers or strings with spaces (e.g., "10 000").
Number Format: Large numbers support ISO 80000-1 style space separators for readability. Both 100000 and "100 000" are valid.
This applies to all relevant parameters, including median, cardinality, lambda, alpha_base, beta_base, shape, scale, and single_losses. For example:
assets:
- name: Laptops
cardinality: "10 000" # 10,000 laptops
- name: Servers
cardinality: 500
frequency:
model: poisson
parameters:
lambda: "1 200" # 1,200 expected events
# Hierarchical example
model: hierarchical_gamma_poisson
parameters:
alpha_base: "1 000"
beta_base: "10 000"
How to choose median:
Simply use the typical loss amount directly from industry reports, own historical loss data or expert judgement.
| Median | Use Case |
|---|---|
| 8 000 | Minor incidents (laptop theft) |
| "100 000" | Data breaches (small) |
| "700 000" | Ransomware (medium enterprise) |
| "9 000 000" | Major data breach (large enterprise) |
How to choose sigma:
| sigma | Variability | Meaning |
|---|---|---|
| 0.5 | Low | Losses are predictable, clustered around median |
| 1.0-1.5 | Medium | Typical cyber risk - some variation |
| 2.0+ | High | Extreme variation - rare catastrophic losses possible |
Real-world example:
# Data breach severity
# Industry data: Median cost ~$100K, but can reach millions
severity:
model: lognormal
parameters:
median: "100 000" # $100K median (directly from IBM report)
currency: USD # Explicit currency
sigma: 1.2 # Moderate variability
Visual guide: - Low sigma (0.5): 📊 Narrow bell curve - predictable losses - Medium sigma (1.2): 📊 Wider curve - some big losses possible - High sigma (2.0): 📊 Very wide - rare but catastrophic losses
Note on mu: For advanced users, you can still use mu (the log-space mean where mu = ln(median)). However, median is recommended because:
- It's more intuitive and human-readable
- It directly maps to industry report data
- No manual log transformation required
Gamma Distribution
Use when: Losses have a natural minimum but long tail (e.g., recovery costs)
Parameters:
- shape: Controls the distribution shape
- scale: Controls the average loss size
When to use: Alternative to lognormal when you want more control over the tail behavior
Practical Workflow
Step 1: Gather Data
For Frequency (lambda): 1. Check your incident logs (how many times did X happen last year?) 2. Consult industry reports (Verizon DBIR, Ponemon, etc.) 3. Ask: "Out of 100 similar assets, how many get hit per year?"
For Severity (mu, sigma): 1. Review past incident costs (direct + indirect) 2. Use industry benchmarks (IBM Cost of Data Breach Report) 3. Consider: downtime, recovery, legal, reputation, fines
Step 2: Start Simple
# Begin with a simple model
model:
assets:
cardinality: 100 # Number of assets you're protecting
frequency:
model: poisson
parameters:
lambda: 0.05 # 5% chance per asset (conservative estimate)
severity:
model: lognormal
parameters:
median: "100 000" # $100K median loss
currency: USD
sigma: 1.2 # Moderate variability
### Step 3: Calibrate
Run the simulation and check if results make sense:
**Ask yourself:**
- Does $500K/year seem reasonable for my organization?
- Would I be comfortable explaining this to my CISO?
- Does it align with our cyber insurance premium?
### Step 4: Adjust
If results seem off:
**Too high?**
- Reduce `lambda` (events are rarer than you thought)
- Reduce `median` (losses are smaller than you thought)
**Too low?**
- Increase `lambda` (events are more common)
- Increase `median` (losses are larger)
- Increase `sigma` (more variability, captures rare big losses)
---
## Common Scenarios
### Scenario 1: Ransomware Risk
**Context:** Enterprise with 500 critical servers
```yaml
model:
assets:
cardinality: 500
frequency:
model: poisson
parameters:
lambda: 0.08 # 8% annual probability (industry average)
severity:
model: lognormal
parameters:
median: "700 000" # $700K median (ransom + downtime + recovery)
currency: USD
sigma: 1.8 # High variability (some pay $50K, others $5M)
Why these values? - Lambda: Sophos reports ~66% of orgs hit in 2 years ≈ 33%/year, but per-server is lower - Median: Average ransomware cost is 700K-1.4M (Sophos, 2023) - Sigma: High because costs vary wildly based on negotiation, backups, etc.
Scenario 2: Data Breach (PII)
Context: 50 databases containing customer PII
model:
assets:
cardinality: 50
frequency:
model: poisson
parameters:
lambda: 0.05 # 5% per database per year
severity:
model: lognormal
parameters:
median: "100 000" # $100K median
currency: USD
sigma: 1.2 # Moderate variability
Why these values? - Lambda: IBM reports 1 in 20 orgs have breach per year ≈ 5% - Median: IBM Cost of Data Breach 2023: $4.45M average, but varies by size - Sigma: Moderate because costs are somewhat predictable (per-record costs)
Scenario 3: Phishing Incidents
Context: 1000 employees, measuring credential compromise
model:
assets:
cardinality: 1000 # employees
frequency:
model: poisson
parameters:
lambda: 0.2 # 20% of employees click phishing per year
severity:
model: lognormal
parameters:
median: "8 000" # $8K median (mostly time to remediate)
currency: USD
sigma: 1.5 # Some lead to major breaches
Why these values? - Lambda: Industry average click rate is 10-30% - Median: Most phishing is caught quickly, low cost - Sigma: But some lead to major breaches, so high variability
Data Sources
Industry Reports (Free)
- Verizon DBIR: Breach frequency by industry
- IBM Cost of Data Breach: Average costs by breach type
- Ponemon Institute: Various cost studies
- SANS Institute: Incident statistics
Threat Intelligence
- MITRE ATT&CK: Technique frequency
- CISA Alerts: Current threat landscape
- Your SIEM/EDR: Historical incident data
Insurance Data
- Cyber insurance applications: Often require incident history
- Industry loss data: Some insurers publish aggregated data
Quick Reference Card
Starting point for common scenarios:
| Risk Type | Lambda | Median | Sigma | Rationale |
|---|---|---|---|---|
| Ransomware (Enterprise) | 0.08 | $700,000 | 1.8 | Industry avg: 8%, $700K median, high variance |
| Data Breach (SMB) | 0.05 | $100,000 | 1.2 | 5% annual, $100K median, moderate variance |
| Phishing (per employee) | 0.2 | $8,000 | 1.5 | 20% click rate, $8K median, some escalate |
| DDoS Attack | 0.15 | $35,000 | 1.0 | 15% annual, $35K median, predictable costs |
| Insider Threat | 0.02 | $1,200,000 | 2.0 | Rare (2%), $1.2M median, highly variable |
Simulation iterations: - Quick test: 1,000 - Standard analysis: 10,000 - Board presentation: 50,000+
Next Steps
- Start with examples: Use the pre-built models in the simulation page
- Modify one parameter at a time: See how it affects results
- Compare to your budget: Does the EAL match your cyber spend?
- Iterate: Refine based on your organization's data
Remember: All models are wrong, but some are useful. Start simple, validate with stakeholders, and refine over time.