Kitti 3D Object Detection Evaluation 2017 dataset - don't understand ground truth labels format

Question

cdahms

August 13, 2021, 03:38

I don't understand the format of the ground truth labels in the Kitti 3D Object Detection Evaluation 2017 dataset.

Since there are many Kitti datasets, let me be clear: I'm talking about the set available at http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d, from which I downloaded the Velodyne point clouds and training labels.

Every source I can find, for example https://github.com/NVIDIA/DIGITS/blob/master/digits/extensions/data/objectDetection/README.md or https://medium.com/@desjoerdhaan/kitti-3d-object-detection-data-set-ef8ee6409574, states that the columns are as follows:

type truncated occluded alpha x1 y1 x2 y2 height width length x y z yaw

where distances are in meters and yaw is in the range -pi to +pi.

x1, y1, x2, and y2 are locations in the 2D image from one of the cameras. I'm only concerned with 3D for the moment, so I'm ignoring those.

height, width, length, x, y, z, and yaw are supposedly the size, location, and orientation of the bounding box in 3D space.
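To make the column layout concrete, here is a minimal sketch of reading one label line into named fields. It assumes the column order quoted above from those sources is correct and that fields are whitespace-separated. The names `KittiLabel` and `parse_label_line` are my own, and the sample line is made up for illustration; it is not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class KittiLabel:
    # Field order follows the column list quoted in this post;
    # it has not been checked against the official devkit here.
    type: str
    truncated: float
    occluded: int
    alpha: float
    x1: float
    y1: float
    x2: float
    y2: float
    height: float
    width: float
    length: float
    x: float
    y: float
    z: float
    yaw: float


def parse_label_line(line: str) -> KittiLabel:
    """Split one whitespace-separated label line into the 15 named columns."""
    parts = line.split()
    if len(parts) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(parts)}")
    return KittiLabel(
        parts[0],                    # type, e.g. "Car"
        float(parts[1]),             # truncated
        int(float(parts[2])),        # occluded
        *map(float, parts[3:15]),    # alpha through yaw (12 values)
    )


if __name__ == "__main__":
    # Made-up example line, NOT real data from 000008.txt.
    sample = "Car 0.00 0 -1.57 100.0 150.0 200.0 250.0 1.5 1.6 4.0 2.0 1.5 20.0 -1.6"
    label = parse_label_line(sample)
    print(label.type, (label.x, label.y, label.z), label.yaw)
```

Printing the x, y, z fields for every line of a label file this way makes it easy to see which column holds which range of values.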

Where I'm having a problem is that after downloading the labels, for example file 000008.txt contains the following:

I added the column titles but the data is unchanged.

What I don't understand is that the x, y, z, width, length, height, and yaw data make no sense to me. For example, how does z vary from 3.68 meters to 33.2 meters? Even the lowest value of 3.68 meters would mean the car is floating about 12 feet in the air, and 33.2 meters would be multiple stories up. Also, how are the x and y locations mostly less than 5 meters? A typical car is about 3.5 meters in length, so that would mean most of the 6 cars are on top of the ego car!

Moreover, if the lidar on the ego car is the 0, 0, 0 point, which would be the standard for datasets like this, then the center of mass of the surrounding cars would be approximately half a meter below it. I would therefore have expected the z values to all be approximately -0.5, but no column has values consistently close to this, so it isn't simply a matter of mixed-up columns.

Also, I checked the other files and the values are similarly not what I would have expected.

What am I missing here?

Something else I should mention is that most of the other autonomous car datasets have a readme explaining this sort of thing, a forum, or both, but I couldn't find anything along those lines for the 2017 Kitti 3D Object Detection dataset. If this question is off topic for Data Science Stack Exchange, I apologize; please feel free to suggest a better place to ask questions about the Kitti datasets if there is one. As you can probably tell, I'm more familiar with some of the other autonomous car datasets and new to working with the Kitti set.

Topic 3d-object-detection dataset

Category Data Science
