Week 10 — Capstone: Safety Runtime, Calibration, and Full Jetson Integration

Overview

The capstone ties the whole course into one runtime on the Jetson Orin Nano: the control loop (Week 8), CAN actuator protocol (Week 8), sensor ingestion and state estimation (Week 9), prediction (Weeks 4–5), and planning (Week 6), all governed by a safety state machine with a watchdog, rate limiter, and fault manager — plus calibration tooling. The emphasis shifts from any single algorithm to system integration and safety: how the pieces compose, how the system behaves when a component is late or wrong, and how it degrades gracefully rather than failing dangerously.

This is the portfolio artifact for an embedded-autonomy role. A reviewer should be able to read the safety case, see the jitter and latency numbers, watch a fault get injected and the system enter a safe state, and trust that you understand autonomy as a system, not a collection of demos.

Readings

Boyd (apply): least squares and regularized least squares. Extract: calibration as a least-squares fit (sensor alignment, bias estimation).
Numerical Python (Johansson): NumPy/SciPy/matplotlib for fitting and analysis. Extract: the calibration and analysis tooling.
HLW: boot, devices, filesystems, processes, logs/system management. Extract: deploying and operating the runtime on the Jetson.
Embedded AI: deployment constraints, debugging, and drift. Extract: what changes between bench and deployed operation.

Key Concepts

The safety state machine

Define explicit states (e.g. INIT, CALIBRATING, NOMINAL, DEGRADED, SAFE_STOP) with guarded transitions. Every component reports health; the supervisor transitions to DEGRADED or SAFE_STOP on a missed heartbeat (Week 8), an inconsistent estimate (Week 9), a planning timeout (Week 6), or a calibration failure. SAFE_STOP is reachable from every other state and commands a safe default. This explicit, inspectable machine is the safety case.
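One way to make the machine concrete is a small supervisor class. The sketch below is illustrative only: the state names come from the text, but the `Fault` names, the split between degradable and fatal faults, the latching of SAFE_STOP, and the zero safe-default command are all assumptions to be replaced by your own safety analysis.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    CALIBRATING = auto()
    NOMINAL = auto()
    DEGRADED = auto()
    SAFE_STOP = auto()


class Fault(Enum):
    MISSED_HEARTBEAT = auto()
    INCONSISTENT_ESTIMATE = auto()
    PLANNING_TIMEOUT = auto()
    CALIBRATION_FAILURE = auto()


# Assumed policy (hypothetical): these faults permit reduced operation.
DEGRADABLE = {Fault.PLANNING_TIMEOUT, Fault.INCONSISTENT_ESTIMATE}
SAFE_DEFAULT_COMMAND = 0.0  # assumed safe default; actuator-specific in practice


class Supervisor:
    def __init__(self):
        self.state = State.INIT

    def step(self, faults=(), start=False, calibrated=False):
        """Advance one supervisor tick and return the resulting state."""
        faults = set(faults)
        running = self.state in (State.NOMINAL, State.DEGRADED)
        if self.state is State.SAFE_STOP:
            pass  # latched: leaving SAFE_STOP needs an explicit reset (not shown)
        elif faults - DEGRADABLE or (faults and not running):
            self.state = State.SAFE_STOP  # reachable from every other state
        elif faults:
            self.state = State.DEGRADED
        elif self.state is State.INIT and start:
            self.state = State.CALIBRATING
        elif self.state is State.CALIBRATING and calibrated:
            self.state = State.NOMINAL
        elif self.state is State.DEGRADED:
            self.state = State.NOMINAL  # faults cleared, resume nominal
        return self.state

    def command(self, planned):
        """Pass the planned command through only while the system is running."""
        if self.state in (State.NOMINAL, State.DEGRADED):
            return planned
        return SAFE_DEFAULT_COMMAND
```

Keeping every transition in one `step` method means the guards can be read top to bottom and compared directly against the drawn state diagram.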

Watchdog, rate limiter, fault manager

A watchdog independently monitors loop liveness and forces a safe state if the loop stalls. A rate limiter bounds how fast commands change (no actuator step-jumps even if the planner outputs one). The fault manager centralizes detection (timeouts, range checks, consistency checks) and the response policy. These are generic, reusable robotics-runtime patterns.
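The first two patterns fit in a few lines each. The following is a minimal sketch of the logic only: the class names, the `max_rate`/`dt` parameters, and the injectable `clock` are hypothetical, and a deployed watchdog would need to run outside the loop it monitors (a separate process or a hardware timer) rather than as an object the loop itself owns.

```python
import time


class RateLimiter:
    """Bound how far a command may move per control cycle."""

    def __init__(self, max_rate, dt, initial=0.0):
        self.max_step = max_rate * dt  # largest allowed change per cycle
        self.value = initial

    def update(self, target):
        delta = target - self.value
        delta = max(-self.max_step, min(self.max_step, delta))
        self.value += delta
        return self.value


class Watchdog:
    """Report expiry when the monitored loop has not kicked in time."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the logic can be tested
        self.last_kick = clock()

    def kick(self):
        self.last_kick = self.clock()

    def expired(self):
        return self.clock() - self.last_kick > self.timeout_s


if __name__ == "__main__":
    limiter = RateLimiter(max_rate=1.0, dt=0.1)
    # A step command from 0 to 1 becomes a ramp of at most 0.1 per cycle.
    print([round(limiter.update(1.0), 3) for _ in range(12)])
```

The limiter clamps the change, not the value, so a planner step is turned into a ramp whose slope is set by `max_rate`.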

Calibration

Sensors need calibration: IMU bias/scale, mounting alignment, time offset between streams. Frame these as least-squares fits (Boyd/Course 5): collect data under known conditions, solve for the parameters, validate residuals. Calibration tooling and stored calibration are part of a deployable system, not an afterthought.
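As a small instance of this framing, consider a single sensor axis modeled as measured = scale * reference + bias. The sketch below solves the regularized normal equations (AᵀA + λI)x = Aᵀy + λx₀ directly for the two unknowns; the function name, the `lam` weight, and the prior of (1, 0) are assumptions for illustration. For larger problems, NumPy's `numpy.linalg.lstsq` would be the more usual tool.

```python
import math


def fit_scale_bias(reference, measured, lam=0.0, prior=(1.0, 0.0)):
    """Fit measured ~= scale * reference + bias by (regularized) least squares.

    With A = [reference, 1] the normal equations are
    (A^T A + lam * I) x = A^T y + lam * x0, where x = (scale, bias)
    and x0 is the prior. lam = 0 gives ordinary least squares.
    """
    n = len(reference)
    a11 = sum(r * r for r in reference) + lam
    a12 = sum(reference)
    a22 = n + lam
    b1 = sum(r * y for r, y in zip(reference, measured)) + lam * prior[0]
    b2 = sum(measured) + lam * prior[1]
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("degenerate data: vary the reference or set lam > 0")
    scale = (b1 * a22 - a12 * b2) / det
    bias = (a11 * b2 - a12 * b1) / det
    return scale, bias


def rms_residual(reference, measured, scale, bias):
    """Root-mean-square residual of the fitted model, for validation."""
    sq = [(y - (scale * r + bias)) ** 2 for r, y in zip(reference, measured)]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    # Synthetic data with a known answer (illustrative values only).
    ref = [-9.81, -4.9, 0.0, 4.9, 9.81]
    meas = [1.02 * r + 0.15 for r in ref]
    s, b = fit_scale_bias(ref, meas)
    print(s, b, rms_residual(ref, meas, s, b))
```

Comparing `rms_residual` before the fit (with scale 1, bias 0) and after it gives the before/after residual check; a residual well above the sensor noise floor suggests the model is missing a term.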

Jetson deployment and budgeting

Bring the full stack onto the Jetson. Apply Week 1’s roofline analysis and Week 3’s precision policy to fit prediction/planning in the per-cycle budget; profile the integrated loop; confirm that the Week 8 jitter bound still holds under real load. Deployment surfaces power, thermal, and memory limits that the Mac hides.
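A budget check can be as simple as summing per-component worst-case estimates against the loop period. The sketch below assumes a hypothetical `check_budget` helper and a 20% headroom margin; the numbers in the usage example are placeholders, not measurements, and should be replaced with profiled values from the Jetson.

```python
def check_budget(period_ms, component_p99_ms, margin_frac=0.2):
    """Compare summed per-component p99 latencies against the loop period.

    Summing p99 values is conservative, since the components rarely all
    hit their tails in the same cycle. margin_frac reserves headroom for
    jitter and for work that is not measured here.
    """
    total = sum(component_p99_ms.values())
    limit = period_ms * (1.0 - margin_frac)
    largest = max(component_p99_ms, key=component_p99_ms.get)
    return {
        "total_ms": total,
        "limit_ms": limit,
        "fits": total <= limit,
        "largest": largest,  # first candidate for optimization or a fallback
    }


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not measured).
    budget = check_budget(
        period_ms=20.0,
        component_p99_ms={
            "estimation": 1.0,
            "prediction": 8.0,
            "planning": 5.0,
            "command": 0.5,
        },
    )
    print(budget)
```

If `fits` is false, the `largest` entry points at where the Week 3 precision policy or a cheaper fallback is most likely to pay off.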

Theory Exercises

Draw the safety state machine: states, transition guards, and the safe default for each fault source from Weeks 6/8/9.
Derive a sensor-bias/alignment calibration as a (regularized) least-squares problem; write the normal equations.
Justify a watchdog independent of the main loop; explain what it must not share with the loop to remain trustworthy.
Design a rate limiter and show its effect on a step command; relate the limit to actuator and comfort constraints (Week 2).
Build a per-cycle time budget for the integrated loop on the Jetson; identify the component most at risk of overrun and the fallback.

Implementation

Integrate all prior labs into one Jetson runtime: estimation → prediction → planning → command, inside the Week 8 loop, supervised by the safety state machine with watchdog, rate limiter, and fault manager. Build calibration tools (app/) and store/load calibration. Add structured logging for post-run analysis.
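For the structured logging, one common choice is one JSON object per line, which is easy to append from the loop and to parse afterwards. The sketch below uses only the standard library; the `log_event` and `load_events` helpers and the field names are hypothetical, not a format prescribed by the course.

```python
import io
import json
import time


def log_event(stream, component, event, **fields):
    """Append one JSON record per line (JSON Lines) to a text stream."""
    record = {
        "t": time.monotonic(),  # monotonic time, suitable for latency analysis
        "component": component,
        "event": event,
        **fields,
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")


def load_events(stream):
    """Read records back for post-run analysis, skipping blank lines."""
    return [json.loads(line) for line in stream if line.strip()]


if __name__ == "__main__":
    # An in-memory buffer stands in for a log file here.
    buf = io.StringIO()
    log_event(buf, "planner", "timeout", latency_ms=12.5)
    log_event(buf, "supervisor", "transition", src="NOMINAL", dst="DEGRADED")
    buf.seek(0)
    for rec in load_events(buf):
        print(rec["component"], rec["event"])
```

Logging state transitions with their source and destination makes it straightforward to verify afterwards that an injected fault produced the expected transition.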

Benchmark

Integrated loop: per-component and end-to-end latency (p50/p99/max), jitter under full load, and verification that each injected fault (dead CAN peer, late plan, inconsistent estimate) drives the correct safe transition within budget. Calibration residuals before/after. Power/thermal on the Jetson under sustained load.
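The latency and jitter figures can be computed from logged samples with a few helper functions. This is a sketch under stated assumptions: it uses the nearest-rank percentile definition and takes jitter to be the worst deviation of the observed loop period from its nominal value; other definitions (for example, the standard deviation of the period) are equally valid as long as the report states which one is used.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples):
    """Return the p50/p99/max triple used in the benchmark report."""
    return {
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


def period_jitter(timestamps, nominal_period):
    """Worst absolute deviation of the loop period from its nominal value."""
    periods = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return max(abs(p - nominal_period) for p in periods)


if __name__ == "__main__":
    # Synthetic samples for illustration only.
    print(latency_summary(list(range(1, 101))))
    print(period_jitter([0.0, 0.010, 0.021, 0.030], nominal_period=0.010))
```

Reporting the maximum alongside p99 matters for a safety argument, because the watchdog timeout has to be derived from the worst case rather than from typical behavior.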

Expected baselines: the integrated loop holds its target rate with bounded tails under load; every injected fault reaches SAFE_STOP within the derived timeout; calibration reduces residuals to the noise floor; the Jetson sustains the load within thermal limits (possibly after applying the Week 3 precision policy).

Connections

This capstone composes every prior week and is the course’s headline portfolio piece. It demonstrates the full arc — GPU-efficient ML, prediction, planning, real-time control, estimation, and a safety runtime — on real embedded hardware. It draws on Course 5 (optimization for calibration, probability for estimation) and Course 6 (the sensor signal front end), showing the foundations courses paying off in a deployed system. The optional Weeks 11+ directions extend it toward a real CAN hardware loop, a humanoid joint module, or richer perception.