StandardE2E 0.0.5
NavsimDatasetConverter

class standard_e2e.caching.src_datasets.navsim.NavsimDatasetConverter(source_processor, input_path, split, num_workers=0, do_parallel_processing=True, arguments=None)

Bases: SourceDatasetConverter

Iterates NAVSIM scenes frame-by-frame.

Expected on-disk layout under input_path:

navsim_logs/<split>/*.pkl       per-log scene pickles
sensor_blobs/<split>/<log>/     CAM_F0/ CAM_L{0,1,2}/ CAM_R{0,1,2}/
                                CAM_B0/ MergedPointCloud/

Each pickle is a list of frame dicts, and each frame corresponds to one ego timestamp. The converter yields (log_pickle_path, frame_idx) pairs ordered first by log and then by frame, so consecutive frames within a multiprocessing chunk come from the same pickle and the processor’s per-pickle cache stays warm.

With STANDARD_E2E_DEBUG=true only the first log is processed.
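
The ordering described above can be sketched as follows. This is an illustration only, not the converter's actual code: iter_frame_keys and write_demo_logs are hypothetical helpers, the frame dict contents are placeholders, and the way STANDARD_E2E_DEBUG is parsed (a case-insensitive comparison with "true") is an assumption.

```python
import os
import pickle
from pathlib import Path
from typing import Iterator, Optional, Tuple


def iter_frame_keys(
    input_path: str, split: str, debug: Optional[bool] = None
) -> Iterator[Tuple[Path, int]]:
    """Yield (log_pickle_path, frame_idx), ordered by log and then by frame."""
    if debug is None:
        # Assumption: the flag is enabled by the literal value "true".
        debug = os.environ.get("STANDARD_E2E_DEBUG", "").lower() == "true"
    log_paths = sorted((Path(input_path) / "navsim_logs" / split).glob("*.pkl"))
    if debug:
        log_paths = log_paths[:1]  # debug mode: only the first log
    for log_path in log_paths:
        with open(log_path, "rb") as f:
            frames = pickle.load(f)  # one list of frame dicts per log
        for frame_idx in range(len(frames)):
            yield log_path, frame_idx


def write_demo_logs(root: str, split: str) -> None:
    """Create two tiny stand-in log pickles (placeholder frame dicts)."""
    log_dir = Path(root) / "navsim_logs" / split
    log_dir.mkdir(parents=True)
    for name, n_frames in (("log_a", 2), ("log_b", 3)):
        frames = [{"frame": i} for i in range(n_frames)]
        with open(log_dir / f"{name}.pkl", "wb") as f:
            pickle.dump(frames, f)
```

Because all frames of one log are emitted back to back, a worker that receives a contiguous chunk of keys mostly reads from a single pickle before moving to the next one.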

Parameters:
  • source_processor (SourceDatasetProcessor)
  • input_path (str)
  • split (str)
  • num_workers (int)
  • do_parallel_processing (bool)
  • arguments (Namespace | dict | None)

convert()

Convert all frames then run any configured context aggregators.

Return type:

None

property dataset_name: str

Return the name of the dataset.

classmethod get_arg_parser()

Return an argument parser for the converter.

property max_workers: int | None

Optional cap on parallel-pool size; None means no cap.

Used by datasets where pool throughput plateaus or regresses past a certain worker count, typically because the processor carries large state (e.g. a prescanned HD-map cache) and Pool’s per-task dispatch overhead grows with worker count. Subclasses whose processors are small can leave this at None.
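
As a rough illustration of how such a cap might combine with the requested worker count, consider the sketch below. effective_pool_size is a hypothetical helper; how the library actually applies max_workers is not shown on this page, so the min() rule is an assumption.

```python
from typing import Optional


def effective_pool_size(num_workers: int, max_workers: Optional[int]) -> int:
    """Clamp the requested worker count to an optional cap (None = no cap)."""
    if max_workers is None:
        return num_workers
    return min(num_workers, max_workers)
```

For example, effective_pool_size(32, 8) gives 8, while effective_pool_size(16, None) leaves the request at 16.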

property multiprocessing_start_method: str

Start method for the worker pool.

The default, "spawn", is the conservative choice: TensorFlow and OpenCV both keep global thread and mutex state that a fork()ed child inherits in a deadlock-prone way (the hang typically occurs before the first frame completes). Spawn pays a per-worker import cost (~5 s per worker, dominated by TensorFlow) but is the safe pattern for any worker that may run TF or cv2 code.

Subclasses whose worker hot path is fully TF-free (no tf.io.decode_image, no frame_utils.* calls, etc.) may override to "fork" to avoid the spawn import tax. This is a very large speedup on small / DEBUG runs and a meaningful one on full splits.
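
The override pattern can be sketched with stand-in classes. BaseConverterSketch and TfFreeConverterSketch are hypothetical and do not come from the library; they only mirror the property described above. multiprocessing.get_context is the standard-library call that returns a context for a named start method, and "fork" is not available on every platform (for example Windows).

```python
import multiprocessing


class BaseConverterSketch:
    """Stand-in for a converter base class; not the library's class."""

    @property
    def multiprocessing_start_method(self) -> str:
        # Conservative default: workers start from a fresh interpreter.
        return "spawn"


class TfFreeConverterSketch(BaseConverterSketch):
    """Stand-in subclass whose worker hot path is assumed to be TF-free."""

    @property
    def multiprocessing_start_method(self) -> str:
        # Skips the per-worker import cost; only safe without TF/cv2 state.
        return "fork"


def make_context(converter: BaseConverterSketch):
    """Build a multiprocessing context from the converter's start method."""
    return multiprocessing.get_context(converter.multiprocessing_start_method)
```

A pool created from make_context(converter).Pool(...) would then start its workers with the method the converter asks for.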

Copyright © 2026, Stepan Konev and contributors