Name | Sensing Modalities | Year (published) | Labelled (benchmark) | Recording area | Size | Categories / Remarks | Link |
Ford AV Dataset [ref] | Visual camera (7), 3D LiDAR (4) | 2020 | 6 DoF Pose | Michigan | 1.6 TB (number of frames not given) | Seasonal variation in weather, lighting, construction and traffic conditions | Dataset Website |
Toyota Research Institute DDAD [ref] | Visual camera (6), 3D LiDAR | 2020 | Depth | San Francisco, Bay Area, Cambridge, Detroit, Ann Arbor, Tokyo, Odaiba | Labeled: 99k frames (camera); 200 scenes | Long-range depth (~250m) | Dataset Website |
PandaSet [ref] | 3D LiDAR (2), Visual cameras (6), GNSS and inertial sensors | 2020 | 3D bounding box | San Francisco, El Camino Real | 48k frames (camera), 16k frames (LiDAR), 100+ scenes | 28 classes, 37 semantic segmentation labels; Solid state LiDAR | Dataset Website |
CADC [ref] | Visual camera (8), 3D LiDAR | 2020 | 3D bounding boxes | Waterloo (Canada) | Labeled: 56k frames (camera), 7k frames (LiDAR); Raw: 263k frames (camera), 32k frames (LiDAR) | Car, Pedestrian, Truck, Bus, Garbage Containers on Wheels, Traffic Guidance Objects, Bicycle, Pedestrian With Object, Horse and Buggy, Animals; Adverse Weather conditions, different intensities of snowfall | Dataset Website |
Astyx HiRes2019 [ref] | Radar, Visual camera, 3D LiDAR | 2019 | 3D bounding boxes | n.a. | 500 frames (5000 annotated objects) | Car, Bus, Cyclist, Motorcyclist, Person, Trailer, Truck | Dataset Website |
A2D2 [ref] | Visual cameras (6); 3D LiDAR (5); Bus data | 2019 | 2D/3D bounding boxes, 2D/3D instance segmentation | Gaimersheim, Ingolstadt, Munich | 40k frames (semantics), 12k frames (3D objects), 390k frames unlabeled | Car, Bicycle, Pedestrian, Truck, Small vehicles, Traffic signal, Utility vehicle, Sidebars, Speed bumper, Curbstone, Solid line, Irrelevant signs, Road blocks, Tractor, Non-drivable street, Zebra crossing, Obstacles / trash, Poles, RD restricted area, Animals, Grid structure, Signal corpus, Drivable cobblestone, Electronic traffic, Slow drive area, Nature object, Parking area, Sidewalk, Ego car, Painted driv. instr., Traffic guide obj., Dashed line, RD normal street, Sky, Buildings, Blurred area, Rain dirt | Dataset Website |
A*3D Dataset [ref] | Visual cameras (2); 3D LiDAR | 2019 | 3D bounding boxes | Singapore | 39k frames, 230k objects | Car, Van, Bus, Truck, Pedestrians, Cyclists, and Motorcyclists; Afternoon and night, wet and dry | Dataset Website |
EuroCity Persons [ref] | Visual camera; Announced: stereo, LiDAR, GNSS and inertial sensors | 2019 | 2D bounding boxes | 12 countries in Europe, 27 cities | 47k frames, 258k objects | Pedestrian, Rider, Bicycle, Motorbike, Scooter, Tricycle, Wheelchair, Buggy, Co-Rider; Highly diverse: 4 seasons, day and night, wet and dry | Dataset Website |
Oxford RobotCar [ref] (2016), [ref] (2019) | 2016: Visual cameras (fisheye & stereo), 2D & 3D LiDAR, GNSS, and inertial sensors; 2019: Radar, 3D LiDAR (2), 2D LiDAR (2), visual cameras (6), GNSS and inertial sensors | 2016, 2019 | no | Oxford | 2016: 11,070,651 frames (stereo), 3,226,183 frames (3D LiDAR); 2019: 240k scans (Radar), 2.4M frames (LiDAR) | Long-term autonomous driving. Various weather conditions, including heavy rain, night, direct sunlight and snow. | Dataset Website 2016, Dataset Website 2019 |
Waymo Open Dataset [ref] | 3D LiDAR (5), Visual cameras (5) | 2019 | 3D bounding box, Tracking | n.a. | 200k frames, 12M objects (3D LiDAR), 1.2M objects (2D camera) | Vehicles, Pedestrians, Cyclists, Signs | Dataset Website |
Lyft Level 5 AV Dataset 2019 [ref] | 3D LiDAR (5), Visual cameras (6) | 2019 | 3D bounding box | n.a. | 55k frames | Semantic HD map included | Dataset Website |
Argoverse [ref] | 3D LiDAR (2), Visual cameras (9, 2 stereo) | 2019 | 3D bounding box, Tracking, Forecasting | Pittsburgh (Pennsylvania), Miami (Florida) | 113 scenes, 300k trajectories | Vehicle, Pedestrian, Other Static, Large Vehicle, Bicycle, Bicyclist, Bus, Other Mover, Trailer, Motorcyclist, Moped, Motorcycle, Stroller, Emergency Vehicle, Animal, Wheelchair, School Bus; Semantic HD maps (2) included | Dataset Website |
nuScenes dataset [ref] | Visual cameras (6), 3D LiDAR, Radars (5) | 2019 | 3D bounding box | Boston, Singapore | 1000 scenes, 1.4M frames (camera, Radar), 390k frames (3D LiDAR) | Car or Van or SUV, Truck, Pickup Truck, Front Of Semi Truck, Bendy Bus, Rigid Bus, Construction Vehicle, Motorcycle, Bicycle, Bicycle Rack, Trailer, Police Vehicle, Ambulance, Train, Adult Pedestrian, Child Pedestrian, Construction Worker, Stroller, Wheelchair, Portable Personal Mobility Vehicle, Traffic Police, Other Police, Animal, Traffic Cone, Temporary Traffic Barrier, Pushable Pullable Object, Debris | Dataset Website |
BLVD [ref] | Visual (Stereo) camera, 3D LiDAR | 2019 | 3D bounding box, Tracking, Interaction, Intention | Changshu | 120k frames, 249,129 objects | Vehicle, Pedestrian, Rider during day and night | Dataset Website |
H3D dataset [ref] | Visual cameras (3), 3D LiDAR | 2019 | 3D bounding box | San Francisco | 27,721 frames, 1,071,302 objects | Car, Pedestrian, Cyclist, Truck, Misc, Animal, Motorcyclist, Bus | Dataset Website |
ApolloScape [ref] | Visual (Stereo) camera, 3D LiDAR, GNSS and inertial sensors | 2018, 2019 | 2D/3D pixel-level segmentation, lane marking, instance segmentation, Depth | n.a. | 143,906 frames, 89,430 objects | Rover, Sky, Car, Motorbicycle, Bicycle, Person, Rider, Truck, Bus, Tricycle, Road, Sidewalk, Traffic Cone, Road Pile, Fence, Traffic Light, Pole, Traffic Sign, Wall, Dustbin, Billboard, Building, Bridge, Tunnel, Overpass, Vegetation | Dataset Website |
DBNet dataset [ref] | 3D LiDAR, Dashboard visual camera, GNSS | 2018 | Driving behaviours (Vehicle speed and wheel angles) | Multiple areas in China | Over 10k frames | In total seven datasets with different test scenarios, such as seaside roads, school areas, mountain roads | Dataset Website |
KAIST multispectral dataset [ref] | Visual (Stereo) and thermal camera, 3D LiDAR, GNSS and inertial sensors | 2018 | 2D bounding box, drivable region, image enhancement, depth, colorization | Seoul | 7,512 frames, 308,913 objects | Person, Cyclist, Car; day and night, fine time slots (sunrise, afternoon, ...) | Dataset Website |
Multi-spectral Object Detection dataset [ref] | Visual and thermal cameras | 2017 | 2D bounding box | University environment in Japan | 7,512 frames, 5,833 objects | Bike, Car, Car Stop, Color Cone, Person during day and night | Dataset Website |
Multi-spectral Semantic Segmentation dataset [ref] | Visual and thermal camera | 2017 | 2D pixel-level segmentation | n.a. | 1569 frames | Bike, Car, Person, Curve, Guardrail, Color Cone, Bump during day and night | Dataset Website |
Multi-modal Panoramic 3D Outdoor (MPO) dataset [ref] | Visual camera, LiDAR and GNSS | 2016 | Place categorization | Fukuoka | 650 scans (dense), 34200 scans (sparse) | No dynamic objects | Dataset Website |
KAIST multispectral pedestrian [ref] | Visual and thermal camera | 2015 | 2D bounding box | Seoul | 95,328 frames, 103,128 objects | Person, People, Cyclist during day and night | Dataset Website |
KITTI [ref] (2012), [ref] (2013) | Visual (Stereo) camera, 3D LiDAR, GNSS and inertial sensors | 2012, 2013, 2015 | 2D, 3D bounding box, visual odometry, road detection, optical flow, tracking, depth, 2D instance and pixel-level segmentation | Karlsruhe | 7481 frames (training), 80,256 objects | Car, Van, Truck, Pedestrian, Person (sitting), Cyclist, Tram, Misc | Dataset Website |
The Málaga Stereo and Laser Urban dataset [ref] | Visual (Stereo) camera, 5x 2D LiDAR (yielding 3D information), GNSS and inertial sensors | 2014 | no | Málaga | 113,082 frames, 5,654.6 s (camera); >220,000 frames, ~5,000 s (LiDARs) | n.a. | Dataset Website |