Supported Datasets
==================

Each row below lists the modalities the dataset processor currently emits
into the unified format. ``✓`` means the modality is wired end-to-end through
preprocessing and the on-disk ``TransformedFrameData`` schema; ``—`` means the
dataset doesn't ship that modality (and it's surfaced as the corresponding
default value via
:class:`~standard_e2e.dataset_utils.modality_defaults.ModalityDefaults`).

.. list-table::
   :header-rows: 1
   :widths: 22 10 10 10 10 12 14

   * - Dataset
     - Cameras
     - LiDAR
     - HD map (BEV)
     - 3D detections
     - Driving command
     - Preference trajectory
   * - `Waymo Open E2E `__
     - ✓ (8 ring cameras)
     - —
     - —
     - —
     - ✓
     - ✓
   * - `Waymo Open Perception `__
     - ✓ (5 cameras)
     - ✓ (top + side, ego frame)
     - ✓
     - ✓
     - —
     - —
   * - `Argoverse 2 Sensor `__
     - ✓ (7 ring cameras)
     - ✓ (merged sweep, ego frame)
     - ✓
     - ✓
     - —
     - —
   * - `Argoverse 2 Lidar `__
     - —
     - ✓ (merged sweep, ego frame)
     - ✓
     - —
     - —
     - —
   * - `NAVSIM `__ (OpenScene-v1.1)
     - ✓ (8 cameras: front/left×3/right×3/rear)
     - ✓ (merged sweep, ego frame)
     - ✓ (via nuPlan ``map.gpkg`` → unified taxonomy; lane boundaries carry
       no paint info, since nuPlan doesn't store paint type)
     - ✓
     - ✓ (4-class one-hot → :class:`~standard_e2e.enums.Intent`)
     - —
   * - `WayveScenes101 `__
     - ✓ (5 fisheye: forward + side arc)
     - ✓ (COLMAP SfM, ego frame) [#wayve_lidar]_
     - —
     - —
     - —
     - —
   * - `comma2k19 `__
     - ✓ (1 forward: comma EON, 1164×874 pinhole) [#comma2k19]_
     - —
     - —
     - —
     - —
     - —

All datasets also emit the ego **past/future trajectory** (from each dataset's
poses, via the segment-context aggregator) regardless of the columns above.

.. note::

   **comma2k19 is high-volume** — 20 Hz × ~2000 one-minute segments ≈ 2.4 M
   frames (~2 TB at the native 1164×874 resolution). Two converter knobs bound
   the output size and processing time: ``--frame_stride N`` keeps **every
   N-th frame** (``1`` = full 20 Hz; e.g. ``--frame_stride 4`` ≈ 5 Hz), and the
   ``cameras_identity_adapter``'s ``max_size`` param **downscales** each frame
   so its longest side is at most that many px (intrinsics scaled to match).

.. [#wayve_lidar] WayveScenes101 ships **no sensor lidar**. Its ``lidar_pc``
   is populated from the per-scene **COLMAP SfM** point cloud: filtered
   (reprojection error ≤ 6, track length ≥ 2), converted OpenCV→FLU, then
   transformed into each frame's ego (FLU, x-forward/y-left/z-up) frame with
   the *world→ego* pose and range-clipped (50 m) so it flows through the
   standard lidar adapters. It is photogrammetric (sparse, up-to-scale), not a
   sensor measurement. The ego, cameras and lidar share one FLU frame, so a
   frame's cloud lifted by ``aux_data["pose_matrix"]`` reproduces the source
   SfM cloud exactly.

.. [#comma2k19] comma2k19 ships a **single forward-facing** 20 Hz camera
   (comma EON, 1164×874, treated as a pinhole; identity extrinsics, since the
   dataset pose *is* the camera pose) plus a fused GNSS/IMU ego pose and CAN
   telemetry — no lidar, HD map, 3D boxes, or driving command. The ego pose is
   derived from the ECEF ``global_pose`` into a per-segment local FLU frame
   (x-forward/y-left/z-up), so ``global_position`` X/Y/Z/heading are
   segment-relative; ``global_position`` additionally carries the ego
   **speed** (:attr:`~standard_e2e.enums.TrajectoryComponent.SPEED`) from the
   ECEF velocity. Segments must be extracted from the distributed
   ``Chunk_*.zip`` archives first (as with WayveScenes101); each
   ``video.hevc`` is then decoded forward-only, since HEVC random seek is
   unreliable. Native rate is 20 Hz — use ``--frame_stride`` to subsample.

How datasets are added
----------------------

See `Adding New Datasets Guide `_ for the full processor → adapter →
aggregator pipeline a new dataset has to plug into.
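
Sizing a comma2k19 conversion
-----------------------------

The two comma2k19 knobs described in the note above reduce to simple
arithmetic, which the sketch below works through. It is illustrative only: the
helper names, the rounding, and the intrinsics values are assumptions made for
this example, not the converter's or the adapter's actual code. It assumes that
``--frame_stride N`` keeps frames 0, N, 2N, ... of each segment and that
``max_size`` never upscales a frame.

```python
import math

# Figures quoted in the note above (both approximate).
NATIVE_HZ = 20
SEGMENTS = 2000
SEGMENT_SECONDS = 60


def frames_after_stride(frame_stride: int) -> int:
    """Estimate total frames kept when every N-th frame is retained."""
    frames_per_segment = NATIVE_HZ * SEGMENT_SECONDS
    # Keeping indices 0, N, 2N, ... yields ceil(frames / N) frames per segment.
    kept_per_segment = math.ceil(frames_per_segment / frame_stride)
    return SEGMENTS * kept_per_segment


def downscale(width, height, fx, fy, cx, cy, max_size):
    """Shrink a frame so its longest side is at most max_size pixels.

    Pinhole intrinsics are scaled by the same factor as the image.
    Hypothetical helper: the real adapter may round differently.
    """
    scale = min(1.0, max_size / max(width, height))
    new_size = (round(width * scale), round(height * scale))
    new_intrinsics = (fx * scale, fy * scale, cx * scale, cy * scale)
    return new_size, new_intrinsics


full = frames_after_stride(1)      # native 20 Hz
strided = frames_after_stride(4)   # roughly 5 Hz

# Placeholder intrinsics (fx, fy, cx, cy); not taken from the dataset.
size, intrinsics = downscale(1164, 874, 910.0, 910.0, 582.0, 437.0, max_size=582)
print(full, strided, size, intrinsics)
```

Under these assumptions a stride of 4 cuts the frame count by a factor of four,
and halving the longest side reduces the pixel count per frame by roughly
another factor of four, so the two knobs compound.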