Making Automotive SoCs Truly Safe
Why functional safety changes the verification game
Most SoC verification teams know how to prove a chip meets its functional specification. You write and run simulations, close coverage, debug mismatches, and tape out. But in automotive design — especially when working to ISO 26262 — the job isn’t done when the chip “works.” It must also fail safely, and you must be able to prove it.
Functional safety (FuSa) in this context is the assurance that single-point faults, latent faults, and combinations of faults cannot cause a hazardous event without being detected in time. ISO 26262 formalises this with the Single Point Fault Metric (SPFM), Latent Fault Metric (LFM), and Diagnostic Coverage (DC), each linked to an ASIL (Automotive Safety Integrity Level) target.
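These metrics reduce to simple ratios over apportioned failure rates. The sketch below is a minimal illustration, not a tool's API: it assumes FIT values (failures in time, 10⁻⁹/h) have already been apportioned per element by the FMEDA, and it uses the ASIL-D architectural-metric targets from ISO 26262-5.

```python
# Minimal sketch of FMEDA metric arithmetic. FIT apportionment per element
# is assumed to come from the FMEDA; values below are illustrative only.

def spfm(total_fit: float, spf_fit: float, rf_fit: float) -> float:
    """Single Point Fault Metric: 1 - (single-point + residual) / total."""
    return 1.0 - (spf_fit + rf_fit) / total_fit

def lfm(total_fit: float, spf_fit: float, rf_fit: float,
        latent_fit: float) -> float:
    """Latent Fault Metric: 1 - latent multi-point / (total - SPF - RF)."""
    return 1.0 - latent_fit / (total_fit - spf_fit - rf_fit)

def diagnostic_coverage(detected_dangerous_fit: float,
                        dangerous_fit: float) -> float:
    """DC: fraction of the dangerous failure rate caught by the mechanism."""
    return detected_dangerous_fit / dangerous_fit

# Hardware architectural metric targets for ASIL D (ISO 26262-5).
ASIL_D_SPFM, ASIL_D_LFM = 0.99, 0.90

m = spfm(total_fit=1000.0, spf_fit=2.0, rf_fit=5.0)
print(f"SPFM = {m:.3f}, meets ASIL-D target: {m >= ASIL_D_SPFM}")
```

The same three functions reappear at sign-off, where the tracked DC, SPFM, and LFM numbers are compared against the per-ASIL targets.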
Automotive SoCs are particularly challenging because they often integrate safety-critical domains (ASIL-B/C/D) alongside non-safety QM domains. Preventing interference between them — freedom-from-interference (FFI) — is critical, enforced through safety controllers, fault managers, and memory protection hardware. Verification of these mechanisms must happen at subsystem and SoC level, not just at isolated IPs. Many designs also follow the SEooC (Safety Element out of Context) model, meaning the SoC is delivered with assumptions about the larger system — and those assumptions themselves need verification evidence.
From a “working” SoC to a “safe” SoC
For teams new to automotive, the shift can be abrupt. A design can be functionally complete — specification implemented, RTL stable, UVM environments passing — when the safety manager hands over the FuSa list:
- Show diagnostic coverage for all ASIL-D relevant elements.
- Prove that QM accesses cannot corrupt safety-critical resources.
- Inject faults and classify each as safe, detected dangerous, or undetected dangerous.
- Close or justify all undetected dangerous faults.
Suddenly, the challenge is not just functional correctness but proving fault detection under all required conditions, with traceable evidence that can stand up to an external safety audit.
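The classification step in the list above reduces to two observations per injected fault: did a safety-relevant output deviate, and did a safety mechanism flag the fault within its detection-time budget. A minimal sketch, with illustrative field names (`corrupted_output`, `alarm_cycle`, `budget`) rather than any tool's result format:

```python
# Sketch of per-fault classification from campaign results. A fault that is
# detected only after the time budget expires still counts as dangerous.

from dataclasses import dataclass
from typing import Optional

@dataclass
class FaultResult:
    corrupted_output: bool        # did a safety-relevant output deviate?
    alarm_cycle: Optional[int]    # cycle a safety mechanism fired, or None
    budget: int                   # detection-time budget, in cycles

def classify(r: FaultResult) -> str:
    if not r.corrupted_output:
        return "safe"
    if r.alarm_cycle is not None and r.alarm_cycle <= r.budget:
        return "detected_dangerous"
    return "undetected_dangerous"

campaign = [
    FaultResult(False, None, 100),   # never propagates: safe
    FaultResult(True, 40, 100),      # caught in time: detected dangerous
    FaultResult(True, None, 100),    # never caught: undetected dangerous
    FaultResult(True, 150, 100),     # caught too late: still dangerous
]
buckets = [classify(r) for r in campaign]
```

Closing or justifying every fault that lands in the `undetected_dangerous` bucket is the bulk of the remaining work.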
Early planning and partitioning
The work starts with hazard analysis, safety goals, and derived requirements. The SoC is partitioned into safety and non-safety domains, and an FFI matrix is created, defining allowable master/slave access combinations, conditions, and protections.
From here, each requirement is mapped to a specific verification activity and its associated evidence — the safety case thread. This ties directly into FMEDA metrics like DC, SPFM, and LFM. Importantly, hooks for Safety Controller (SC) and Fault Manager (FM) verification are added early at the SoC level to allow these cross-domain mechanisms to be tested in realistic conditions.
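In its simplest form the FFI matrix is a lookup from (master domain, slave domain) to an expected verdict. The sketch below is deliberately reduced — a real matrix also keys on address window, privilege level, and operating mode, as noted above — and the domain names are illustrative:

```python
# Reduced sketch of an FFI access matrix. Real matrices also condition on
# address window, privilege, and mode; this keeps only the domain pairing.

FFI_MATRIX = {
    # (master domain, slave domain): access allowed?
    ("ASIL_D", "ASIL_D"): True,
    ("ASIL_D", "QM"):     True,    # safety masters may reach QM resources
    ("QM",     "ASIL_D"): False,   # QM must never touch safety resources
    ("QM",     "QM"):     True,
}

def expected_verdict(master_domain: str, slave_domain: str) -> str:
    """Safety-controller verdict the testbench should check for."""
    allowed = FFI_MATRIX[(master_domain, slave_domain)]
    return "allow" if allowed else "block_and_log"
```

Encoding the matrix as data makes it natural to drive every combination from a testbench and check each verdict, which is exactly what the SCVC described next does in hardware terms.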
Dedicated verification of cross-domain safety infrastructure
Reusable verification components speed up and standardise this work. The Safety Controller Verification Component (SCVC) configures masters/slaves as safe or non-safe, drives all combinations, checks block/allow decisions, verifies error logging, and confirms correct fault signalling to the FM or interrupt controller.
The Fault Manager Verification Component (FMVC) configures fault responses, injects faults, checks correct fault IDs without aliasing, and confirms routing to IRQ, PMU, RMU, or alarms. These run at IP, subsystem, and SoC level to ensure both configuration correctness and FFI policy enforcement.
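Two of the FMVC's checks — no fault-ID aliasing and correct routing — are easy to state as reference checks. A minimal sketch, assuming the FMVC yields (fault, logged ID) pairs and a fault-to-destination map; the data shapes are illustrative, not a component's actual interface:

```python
# Sketch of two FMVC reference checks: every distinct fault must log a
# distinct ID (no aliasing), and each fault must reach its configured
# destination (IRQ, PMU, alarm, ...). Data shapes are illustrative.

def check_no_aliasing(observed):
    """observed: list of (injected_fault, logged_id) pairs."""
    seen = {}
    for fault, logged_id in observed:
        if logged_id in seen and seen[logged_id] != fault:
            return False, f"{fault} aliases {seen[logged_id]} on ID {logged_id}"
        seen[logged_id] = fault
    return True, "no aliasing"

def check_routing(fault_config, observed_routes):
    """Each configured fault must reach exactly its configured destination."""
    return all(observed_routes.get(f) == dest
               for f, dest in fault_config.items())

# "wdt" reusing ID 3 would mask the ECC fault in the logs -> flagged.
ok, msg = check_no_aliasing([("ecc_sram", 3), ("lockstep", 4), ("wdt", 3)])
```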
Fault list generation, pruning, and sampling
Fault injection begins with an RTL-level fault list. Working at RTL is intentionally conservative — if the RTL meets DC targets, the gate-level netlist will too. Formal cone-of-influence (COI) analysis removes structurally safe faults that can’t influence safety outputs, avoiding wasted simulation.
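Structurally, COI pruning is a reachability question on the netlist fanout graph: a fault site whose fanout never reaches a safety output cannot cause a dangerous failure. A toy sketch of that idea (real tools prove this formally rather than by graph walk):

```python
# Sketch of COI pruning as forward reachability. A fault site that cannot
# structurally reach any safety output is classified safe without simulation.

from collections import deque

def coi_prune(fanout, fault_sites, safety_outputs):
    """fanout: dict node -> iterable of driven nodes.
    Returns (faults to simulate, faults structurally safe)."""
    keep, safe = set(), set()
    for site in fault_sites:
        seen, work, reaches = {site}, deque([site]), False
        while work:
            n = work.popleft()
            if n in safety_outputs:
                reaches = True
                break
            for nxt in fanout.get(n, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    work.append(nxt)
        (keep if reaches else safe).add(site)
    return keep, safe

g = {"a": {"b"}, "b": {"safety_out"}, "c": {"d"}}
keep, safe = coi_prune(g, {"a", "c"}, {"safety_out"})
# "c" never reaches a safety output, so it is pruned before simulation
```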
Because the fault set is still large, statistical fault sampling is applied. A random set of ~4,000 faults typically yields DC estimates within ±2% at 99% confidence. No cherry-picking or collapsing beyond proven-safe equivalence is allowed. Runs are limited to normal operating modes, since ISO 26262 doesn’t require coverage for every boot or test mode.
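The ~4,000 figure falls out of the standard normal-approximation sample-size formula, n = z²·p(1−p)/e², evaluated at the worst case p = 0.5. A small sketch using the Python standard library:

```python
# Worst-case sample size for estimating a proportion to within +/-margin at
# a given confidence, via the normal approximation n = z^2 * p(1-p) / e^2.

import math
from statistics import NormalDist

def fault_sample_size(margin: float, confidence: float,
                      p: float = 0.5) -> int:
    """Faults to sample so the DC estimate is within +/-margin at the given
    confidence; p = 0.5 is the worst case (maximum variance)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

n = fault_sample_size(margin=0.02, confidence=0.99)
print(n)  # 4147 — consistent with the ~4,000-fault sample quoted above
```

Tightening the margin to ±1% roughly quadruples the sample, which is why ±2% at 99% is a common operating point.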
Combining formal and simulation effectively
Formal excels at mechanisms like ECC, parity, CRC, or lockstep comparators. Sequential Equivalence Checking (SEC) compares fault-free and faulted designs, proving safety outputs change within the detection time budget. It’s also valuable for checking that added safety features don’t break original functionality, and for analysing latent/multi-point faults.
Simulation dominates large-scale fault injection. It measures DC, detection latency, and system effects, classifying results into safe, detected dangerous, and undetected dangerous. Persistent unclassified (UU) or undetected dangerous faults are tackled using Fault Barrier Analysis — identifying propagation chokepoints like clock gates, mux selects, or mode bits, stimulating around them, and re-running the affected faults. This is iterative: removing one barrier often reveals another.
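A barrier, structurally, is a node that lies on every propagation path from the fault site to the observation point — if stimulus never opens it, nothing behind it can be detected. The toy sketch below finds barriers by enumerating paths and intersecting them; on a real netlist this would be a dominator computation, and the node names are invented for illustration:

```python
# Toy sketch of fault barrier analysis: a barrier is any node common to
# every fault-to-observation path. Path enumeration only scales to small
# graphs; production flows compute dominators instead.

def all_paths(graph, src, dst, path=None):
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, ()):
        if nxt not in path:                 # avoid cycles
            yield from all_paths(graph, nxt, dst, path)

def barriers(graph, fault_site, obs_point):
    paths = list(all_paths(graph, fault_site, obs_point))
    if not paths:
        return set()
    common = set(paths[0]).intersection(*map(set, paths[1:]))
    return common - {fault_site, obs_point}

# Every path from the flop funnels through the clock gate "cg": stimulate
# around it, then re-run the affected faults.
g = {"flop": ["cg"], "cg": ["mux_a", "mux_b"],
     "mux_a": ["obs"], "mux_b": ["obs"]}
print(barriers(g, "flop", "obs"))
```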
The strongest flows use formal and simulation together — formal prunes safe faults and produces activation traces; simulation executes those traces in the full-system context; barrier analysis closes the last gaps.
Addressing software safety mechanisms
Some datapaths are only covered by software safety mechanisms (STLs). To verify them, gap analysis first identifies unprotected areas. Static structural “River Flow Mode” analysis then traces all paths from a fault site to the software checkpoint. On this reduced scope, formal detectability proofs inject faults and confirm they change the software-observed data.
Practical constraints like tying off unused ports, using protocol VIPs, or black-boxing non-critical logic keep the formal runs tractable. This combination of static and formal methods often recovers DC missed by simulation alone.
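The static trace step amounts to intersecting forward reachability from the fault site with backward reachability from the software checkpoint, skipping black-boxed logic; the surviving node set is the reduced scope handed to formal, and an empty set flags a coverage gap. A toy sketch with invented node names:

```python
# Sketch of the static path-trace step: keep only the nodes lying on some
# fault-site-to-checkpoint path, excluding black-boxed logic. An empty
# result means no structural path exists -- a gap for the gap analysis.

def trace_scope(fanout, fault_site, checkpoint, blackboxed=frozenset()):
    # Forward reachability from the fault site, skipping black boxes.
    fwd, stack = {fault_site}, [fault_site]
    while stack:
        for nxt in fanout.get(stack.pop(), ()):
            if nxt not in fwd and nxt not in blackboxed:
                fwd.add(nxt)
                stack.append(nxt)
    if checkpoint not in fwd:
        return set()
    # Backward reachability from the checkpoint, restricted to fwd.
    fanin = {}
    for src, dsts in fanout.items():
        for d in dsts:
            fanin.setdefault(d, set()).add(src)
    bwd, stack = {checkpoint}, [checkpoint]
    while stack:
        for prv in fanin.get(stack.pop(), ()):
            if prv in fwd and prv not in bwd:
                bwd.add(prv)
                stack.append(prv)
    return bwd

g = {"dp": {"acc", "dbg"}, "acc": {"stl_read"}, "dbg": {"trace"}}
scope = trace_scope(g, "dp", "stl_read", blackboxed={"dbg"})
# black-boxing "dbg" drops the debug cone from the formal scope
```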
Verifying FFI and dependent fault behaviour
FFI policy verification means exhaustively exercising address windows, permissions, and modes across all SC instances, ensuring correct block/allow decisions, fault logging, and routing to FM. Coverage here is taken to 100%.
Dependent faults — where one fault might hide another — are also checked. The FM must not alias unrelated faults or mis-route them and must respect the defined priority handling.
Metrics, evidence, and sign-off
Throughout, the team tracks DC with its statistical margin, SPFM, LFM, rates of safe and residual faults, and trends in the UU bucket before and after closure. FFI coverage and SCVC/FMVC coverage are also monitored, linked to functional/code coverage for stimulus sufficiency.
Sign-off requires that all ASIL targets are met at RTL, no unreviewed UUs remain, FM routing is fully verified, SCVC/FMVC coverage is complete, and all formal proofs are either passing or justified. The safety case is updated with the full evidence chain: coverage reports, fault logs, SCVC/FMVC results, formal proofs, and barrier analysis data.
Making it repeatable
On mature projects, this becomes a repeatable workflow: safety requirements and FFI matrix defined at the start; SCVC/FMVC harnesses running nightly; fault list generated, pruned, and sampled; simulations classifying faults; formal and barrier loops closing UUs; and static/formal proving software mechanism coverage.
By the end, every safety goal maps directly to a piece of evidence. For engineers from non-automotive backgrounds, the takeaway is that functional safety verification is not a final-phase add-on — it’s a discipline embedded from the first day of the project, balancing simulation and formal intelligently, and delivering an SoC that is not only functional, but provably safe and ready for ISO 26262 sign-off.
Further Reading
- A Comprehensive Safety Verification Solution for SEooC Automotive SoC – Cadence
- Formal and Simulation Methods Unite to Rescue the Damsel in Distress — Unclassified Faults – Synopsys
- Lessons Learned Using Formal for Functional Safety – Siemens EDA
- Static Structural Analysis and Formal Verification of SoC with Software Safety Mechanisms for Functional Safety – Synopsys