Per-service carbon attribution

Design notes for the runtime-calibrated per-service energy and carbon attribution surfaced in GreenSummary and consumed by the periodic disclosure aggregator. Pairs with Methodology (operator-facing) and 08 · Periodic disclosure (aggregator + wire schema).


Why
The first disclosure release recomputed aggregate.total_energy_kwh via a proxy at aggregate time, even when the underlying daemon had measured energy through Scaphandre or cloud SPECpower. It also distributed window-level CO2 to services in proportion to per-service I/O ops, ignoring the fact that two services in different regions can run on grids with very different carbon intensities.
The fix is to compute and serialise per-service energy + carbon at scoring time, so the aggregator can sum directly. Per-service values are runtime-calibrated end to end: the daemon sees the real region for each service and the real energy backend tag.
Algorithm
Scoring runs in score::compute_carbon_report. The function already loops once over all spans in the batch and accumulates per-region carbon into RegionAccumulator. Per-service attribution adds a parallel BTreeMap<String, ServiceCarbonAccumulator> that follows the same single-pass shape.
For each span, after computing the per-span energy, region, intensity, and PUE, the inner loop now also runs:
```rust
let svc = state
    .per_service
    .entry(span.event.service.to_string())
    .or_insert_with(|| ServiceCarbonAccumulator {
        energy_kwh: 0.0,
        operational_gco2: 0.0,
        region: region_ctx.region_ref.to_string(),
    });
svc.energy_kwh += energy_kwh;
svc.operational_gco2 += op_co2;
```
Once the loop completes, score_green produces the GreenSummary maps:

per_service_energy_kwh[svc] = acc.energy_kwh
per_service_carbon_kgco2eq[svc] = acc.operational_gco2 / 1000.0
per_service_region[svc] = acc.region (or "unknown" sentinel if empty)
energy_kwh = sum(per_service_energy_kwh.values())
energy_model = select_co2_model_tag(window_flags) when energy > 0, else empty string

The per-service map is keyed by service name (lowercased upstream by CarbonContext.service_regions). The region field on the accumulator is also lowercased before storage, matching the keys in per_region so the two maps collate. A window with zero energy yields an empty energy_model string, which routes the window to the aggregator's proxy fallback path.
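The fold from accumulators to summary maps can be sketched as follows. This is a minimal illustration, not the actual score_green code: ServiceMaps and fold_service_maps are hypothetical names, and the real GreenSummary carries more fields than shown here.

```rust
use std::collections::BTreeMap;

/// Per-service accumulator with the same fields as in the scoring loop.
#[derive(Debug, Clone, Default)]
pub struct ServiceCarbonAccumulator {
    pub energy_kwh: f64,
    pub operational_gco2: f64,
    pub region: String,
}

/// Hypothetical holder for the per-service fields of the summary.
#[derive(Debug, Default)]
pub struct ServiceMaps {
    pub per_service_energy_kwh: BTreeMap<String, f64>,
    pub per_service_carbon_kgco2eq: BTreeMap<String, f64>,
    pub per_service_region: BTreeMap<String, String>,
    pub energy_kwh: f64,
}

/// Hypothetical helper: turns the accumulators into the summary maps.
pub fn fold_service_maps(acc: &BTreeMap<String, ServiceCarbonAccumulator>) -> ServiceMaps {
    let mut out = ServiceMaps::default();
    for (svc, a) in acc {
        out.per_service_energy_kwh.insert(svc.clone(), a.energy_kwh);
        // Accumulators hold grams; the wire field is in kilograms.
        out.per_service_carbon_kgco2eq
            .insert(svc.clone(), a.operational_gco2 / 1000.0);
        // An empty region is replaced by the "unknown" sentinel.
        let region = if a.region.is_empty() {
            "unknown".to_string()
        } else {
            a.region.clone()
        };
        out.per_service_region.insert(svc.clone(), region);
    }
    // The window total is the sum of the per-service energies.
    out.energy_kwh = out.per_service_energy_kwh.values().sum();
    out
}
```

Because the window total is derived from the map rather than tracked separately, the two cannot drift apart.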
Region attribution
The region recorded for a service is the region of the first span observed for that service in the window. Later spans for the same service keep this region even if they carry a different cloud_region attribute. Two consequences:

A service deployed in two regions within the same scoring window is attributed entirely to its first observed region. The per-region row in GreenSummary.regions still reflects the split, so the global figures stay correct.
Long-running services with stable service_regions configuration are unaffected: every span resolves to the same region.

This trade-off keeps the per-service map simple. A more granular BTreeMap<(String, String), ServiceCarbonAccumulator> keyed by (service, region) would surface multi-region splits but enlarge the wire payload and force consumers to fold rows themselves. v1.0 prefers the simpler shape.
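The first-observed-region rule follows directly from the entry API: the initialiser closure only runs when the key is absent. The sketch below illustrates this with a hypothetical SpanSample struct and attribute function standing in for the real span and scoring loop.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Default)]
struct ServiceCarbonAccumulator {
    energy_kwh: f64,
    operational_gco2: f64,
    region: String,
}

/// Hypothetical flattened span, for illustration only.
struct SpanSample {
    service: &'static str,
    region: &'static str,
    energy_kwh: f64,
    op_co2: f64,
}

/// Hypothetical single-pass attribution over a batch of spans.
fn attribute(spans: &[SpanSample]) -> BTreeMap<String, ServiceCarbonAccumulator> {
    let mut per_service: BTreeMap<String, ServiceCarbonAccumulator> = BTreeMap::new();
    for span in spans {
        let svc = per_service
            .entry(span.service.to_string())
            // Runs only for the first span of a service, so the first region sticks.
            .or_insert_with(|| ServiceCarbonAccumulator {
                energy_kwh: 0.0,
                operational_gco2: 0.0,
                region: span.region.to_lowercase(),
            });
        // Energy and carbon keep accumulating regardless of the span's region.
        svc.energy_kwh += span.energy_kwh;
        svc.operational_gco2 += span.op_co2;
    }
    per_service
}
```

A service seen first in one region and later in another ends up with the combined energy and carbon of both, labelled with the first region only.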
Model tag precedence
The per-window energy_model reuses the existing select_co2_model_tag from score::region_breakdown, which already implements the canonical precedence:
electricity_maps_api > scaphandre_rapl > kepler_ebpf > redfish_bmc > cloud_specpower > io_proxy_v3 > io_proxy_v2 > io_proxy_v1
with the optional +cal suffix when calibration data is active. The tag reflects the highest-fidelity model present in the window. No per-service breakdown of model tags is exposed: a transparent global tag is more useful than a per-service map that consumers would have to fold anyway.
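A minimal sketch of the precedence rule, assuming the set of models present in the window is available as a slice of tags. pick_model_tag is a hypothetical stand-in: the real select_co2_model_tag takes window flags, and its signature is not shown in these notes. Appending +cal to whichever tag wins is also an assumption of this sketch.

```rust
/// Precedence order copied from the text, highest fidelity first.
const PRECEDENCE: [&str; 8] = [
    "electricity_maps_api",
    "scaphandre_rapl",
    "kepler_ebpf",
    "redfish_bmc",
    "cloud_specpower",
    "io_proxy_v3",
    "io_proxy_v2",
    "io_proxy_v1",
];

/// Hypothetical helper: returns the highest-precedence tag among `present`,
/// with "+cal" appended when calibration is active, or an empty string
/// when no known model is present.
fn pick_model_tag(present: &[&str], calibrated: bool) -> String {
    match PRECEDENCE.iter().find(|tag| present.contains(*tag)) {
        Some(tag) if calibrated => format!("{tag}+cal"),
        Some(tag) => tag.to_string(),
        None => String::new(),
    }
}
```

Scanning the fixed precedence list, rather than sorting the inputs, keeps the ordering in one place and makes the result independent of the order in which models were observed.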
Embodied carbon stays at the global level
The SCI M term lives only in co2.total and aggregate.total_carbon_kgco2eq. Per-service maps carry the operational term only. Reasons:

Per-request embodied amortisation is already an arbitrary spread. Splitting it per service would surface a precision that does not exist in the underlying data.
Embodied is not actionable through software optimisation. Removing N+1 patterns has no effect on M.
Consumers (auditors, public dashboards) who need the per-service operational figure benefit from a cleaner number that maps directly to actionable optimisations.

The invariant sum(per_service_carbon_kgco2eq) × 1000 ≈ co2.operational_gco2 (tolerance 1e-6) is tested.
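The invariant can be expressed as a small predicate. The sketch below assumes the tolerance is absolute and expressed in grams, which the notes do not state; operational_invariant_holds is a hypothetical name, not the test in the codebase.

```rust
use std::collections::BTreeMap;

/// Hypothetical check: per-service kilograms, converted back to grams,
/// should match the window's operational total within `tolerance` grams.
fn operational_invariant_holds(
    per_service_carbon_kgco2eq: &BTreeMap<String, f64>,
    operational_gco2: f64,
    tolerance: f64,
) -> bool {
    let sum_g = per_service_carbon_kgco2eq.values().sum::<f64>() * 1000.0;
    (sum_g - operational_gco2).abs() <= tolerance
}
```

A tolerance is needed because dividing each service by 1000 and summing does not round identically to summing in grams.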
Aggregator branching
report::periodic::aggregator::Builder::process_window checks two predicates:

report.green_summary.per_service_carbon_kgco2eq.is_empty() && report.green_summary.per_service_energy_kwh.is_empty() — runtime maps absent.
report.green_summary.energy_kwh > 0.0 — runtime energy total present.

When both runtime maps are non-empty, the aggregator sums the per-service values directly. When they are empty, it falls back to the proxy path inherited from the first release (proportional I/O share for carbon, total_io_ops × ENERGY_PER_IO_OP_KWH for energy). The two paths can coexist within the same archive directory: each window applies its own strategy.
A single tracing::warn! per archive file flags fallback usage so operators can spot stale archives. The counters runtime_windows and fallback_windows on AggregateInputs carry the split for downstream diagnostics.
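The branching can be sketched as below, for the energy total only. Several things here are assumptions: WindowSummary is a hypothetical subset of the archived report, the value of ENERGY_PER_IO_OP_KWH is a placeholder, and the way the two predicates combine (runtime path only when the maps are present and the energy total is positive) is one plausible reading of the description rather than the confirmed logic.

```rust
use std::collections::BTreeMap;

/// Placeholder coefficient for illustration; the real value is not given here.
const ENERGY_PER_IO_OP_KWH: f64 = 1.0e-7;

/// Hypothetical subset of an archived window report.
struct WindowSummary {
    per_service_energy_kwh: BTreeMap<String, f64>,
    per_service_carbon_kgco2eq: BTreeMap<String, f64>,
    energy_kwh: f64,
    total_io_ops: u64,
}

/// Hypothetical subset of AggregateInputs, with the two diagnostic counters.
#[derive(Debug, Default)]
struct AggregateInputs {
    total_energy_kwh: f64,
    runtime_windows: u64,
    fallback_windows: u64,
}

fn process_window(agg: &mut AggregateInputs, w: &WindowSummary) {
    let maps_absent =
        w.per_service_carbon_kgco2eq.is_empty() && w.per_service_energy_kwh.is_empty();
    let energy_present = w.energy_kwh > 0.0;

    if !maps_absent && energy_present {
        // Runtime path: sum the calibrated per-service values directly.
        agg.total_energy_kwh += w.per_service_energy_kwh.values().sum::<f64>();
        agg.runtime_windows += 1;
    } else {
        // Fallback path: proxy energy from the I/O operation count.
        agg.total_energy_kwh += w.total_io_ops as f64 * ENERGY_PER_IO_OP_KWH;
        agg.fallback_windows += 1;
    }
}
```

Since the decision is made per window, an archive that mixes old and new lines increments both counters, which is what makes the split visible downstream.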
Hardening at the archive boundary
Archive lines are operator-controlled state on disk. The aggregator treats every f64 field read out of an archive as untrusted:

energy_kwh, per_service_energy_kwh.values() and per_service_carbon_kgco2eq.values() go through sanitize_f64, which clamps NaN, +/-Inf and negative numbers to 0.0. Without this guard, a single poisoned line would propagate NaN to every downstream sum.
The per_service map is capped at MAX_SERVICES = 4096 entries. Once the cap is reached, additional distinct services from the archive are silently dropped. Findings already routed to a known bucket continue to accumulate.
energy_source_models is capped at MAX_ENERGY_MODELS = 64 entries and each energy_model string is rejected when longer than 64 bytes. Tags differing only by the +cal suffix collapse into a single bare entry, so the set never carries both scaphandre_rapl and scaphandre_rapl+cal.

These caps mirror the runtime-side MAX_REGIONS cap in score::carbon_compute. They are silent (no error): the aggregator treats them as best-effort folding.
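The guards described above can be sketched as follows. The body of sanitize_f64 is inferred from its description, and add_service_energy and normalize_model_tag are hypothetical helper names. Checking the 64-byte limit before stripping the suffix is an assumption.

```rust
use std::collections::BTreeMap;

const MAX_SERVICES: usize = 4096;
const MAX_MODEL_TAG_BYTES: usize = 64;

/// Clamps NaN, +/-Inf and negative values to 0.0.
fn sanitize_f64(v: f64) -> f64 {
    if v.is_finite() && v >= 0.0 { v } else { 0.0 }
}

/// Hypothetical helper: adds sanitised energy to a service bucket,
/// silently dropping new services once the cap is reached.
fn add_service_energy(per_service: &mut BTreeMap<String, f64>, service: &str, energy_kwh: f64) {
    let value = sanitize_f64(energy_kwh);
    if let Some(slot) = per_service.get_mut(service) {
        // Known buckets keep accumulating even when the map is full.
        *slot += value;
    } else if per_service.len() < MAX_SERVICES {
        per_service.insert(service.to_string(), value);
    }
    // Otherwise: cap reached, new service dropped without error.
}

/// Hypothetical helper: rejects over-long tags and collapses "+cal"
/// variants onto the bare tag.
fn normalize_model_tag(tag: &str) -> Option<&str> {
    if tag.len() > MAX_MODEL_TAG_BYTES {
        return None;
    }
    Some(tag.strip_suffix("+cal").unwrap_or(tag))
}
```

Sanitising before accumulation matters because NaN is absorbing under addition: once it enters a sum, no later value can clear it.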
Backward compatibility
All five new GreenSummary fields carry #[serde(default)]. An archive line written without runtime energy attribution deserialises with energy_kwh = 0.0, energy_model = "", and empty maps. The aggregator detects this and falls back to the proxy.
No schema version bump. perf-sentinel-report/v1.0 stays the wire identifier. Consumers that read only the documented v1.0 baseline set keep working; consumers that opt into the new fields gain runtime-calibrated values automatically.
What we did not do

Per-service energy model tags (per_service_energy_model: BTreeMap<String, String>). Possible but unused today; the window-level tag carries enough fidelity for the disclosure's audit trail.
Multi-region per-service splits. The wire shape stays simple at the cost of approximate attribution for services that move regions mid-window.
Embodied carbon attribution per service. Deliberately excluded.
Schema version bump. The change is strictly additive.