Point cloud object detection with PointPillars: testing and evaluation, TensorRT post-training quantization, and ROS visualization
The code is open source: https://github.com/Xiao-Hu-Z/pointpillars_int8
Environment setup
Prepare the OpenPCDet environment
Export ONNX
To export your own models, run:
python3 export_onnx.py \
--cfg_file pointpillar.yaml \
--ckpt_path your_model.pth \
--onnx_file pfe.onnx
python3 export_onnx.py \
--cfg_file pointpillar.yaml \
--ckpt_path your_model.pth \
--onnx_file rpn.onnx
Here we extract two pure NN models, pfe and rpn, from the whole computation graph; this makes it easier for TensorRT to optimize their inference engines with int8.
Int8 post-training quantization
Generate calib_data
To perform implicit PTQ quantization, you first need to generate calibration files.
To obtain the input of the backbone, modify pcdet/models/detectors/pointpillar.py directly instead of duplicating code:
def forward(self, batch_dict):
    # for cur_module in self.module_list:
    #     batch_dict = cur_module(batch_dict)
    # Run only the first two modules (PillarVFE and PointPillarScatter);
    # batch_dict['spatial_features'] then holds the input of the rpn backbone.
    batch_dict = self.module_list[0](batch_dict)
    batch_dict = self.module_list[1](batch_dict)
    return batch_dict
Run the following command to generate the calibration input files:
python3 generate_calib_data.py \
--cfg_file pointpillar.yaml \
--data_path <your dataset file> \
--ckpt your_model.pth \
--calib_file_path <where to store the calibration input files>
This generates the calibration input files used later for int8 calibration.
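The exact on-disk layout is defined by generate_calib_data.py; a common convention, assumed here, is one raw little-endian float32 dump per frame. A minimal sketch of that round trip (helper names are illustrative):

```python
import struct
from pathlib import Path

def save_calib_frame(path, values):
    """Dump a flat list of float32 values as raw little-endian binary."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))

def load_calib_frame(path):
    """Read a raw float32 binary file back into a flat list of floats."""
    data = Path(path).read_bytes()
    return list(struct.unpack(f"<{len(data) // 4}f", data))  # 4 bytes per float32
```

The calibrator later streams these frames back in fixed-size batches.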
Generate TensorRT serialized engines
Actually you can create TensorRT engines directly from the ONNX models and skip this step; however, it is more convenient to load previously saved serialized engine files.
You can run:
python3 ptq_int8.py \
--config waymo_centerpoint_pp_two_pfn_stride1_3x.py \
--pfe_onnx_file pfe.onnx \
--rpn_onnx_file rpn.onnx \
--pfe_engine_path pfe_fp.engine \
--rpn_engine_path rpn_fp.engine \
--mode <quantization mode: fp32, fp16 or int8> \
--calib_file_path <where the calibration input files are stored>
By default this will generate int8-engine files.
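Internally, an int8 calibrator such as TensorRT's IInt8EntropyCalibrator2 is polled with get_batch until it returns None. The batching contract can be sketched without TensorRT (class and file names are illustrative):

```python
class CalibBatcher:
    """Mimics the get_batch contract of a TensorRT int8 calibrator:
    return the next fixed-size batch, or None when data is exhausted."""

    def __init__(self, files, batch_size):
        self.files = list(files)
        self.batch_size = batch_size
        self.pos = 0

    def get_batch(self):
        end = self.pos + self.batch_size
        if end > len(self.files):
            return None  # tells TensorRT that calibration is finished
        batch = self.files[self.pos:end]
        self.pos = end
        return batch
```

A real calibrator would additionally load each file (e.g. with load_calib_frame), copy it to device memory, and return device pointers instead of paths.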
You can also use the trtexec command to obtain fp16 or fp32 TensorRT engines:
# x86
# fp16
TensorRT-8.4.3.1/targets/x86_64-linux-gnu/bin/trtexec --onnx=pp_pfe.onnx --explicitBatch --saveEngine=pp_pfe_fp16_trtexec.trt --fp16 --workspace=1024 --verbose
# fp32
TensorRT-8.4.3.1/targets/x86_64-linux-gnu/bin/trtexec --onnx=pp_pfe.onnx --explicitBatch --saveEngine=pp_pfe_fp32_trtexec.trt --workspace=1024 --verbose
Run inference
Building the code produces two executables: one runs inference on the data, the other is for visualization.
Run the following:
./test_point_pillars_cuda
For the single frame data/000003.bin, the output is:
# pytorch :
8
16.999187 3.7984838 -0.94293684 4.421786 1.6759017 1.471789 6.259531 0.95879066 1
5.309882 11.392116 -1.2433618 3.5853221 1.5456746 1.4377103 3.157524 0.94660896 1
30.813366 -0.572901 -0.75715184 4.3241267 1.661644 1.5736197 6.2604256 0.91791165 1
53.294666 2.0207937 -0.40219212 3.9299285 1.6296755 1.5902154 6.171218 0.5074535 1
53.318325 -2.7356923 -0.4548468 3.872774 1.5616711 1.6064421 6.0431376 0.441119 1
18.733013 15.593122 -1.3264668 4.466153 1.6951573 1.5360177 6.388626 0.32318175 1
0.6559625 -0.5785154 -1.131638 4.1749973 1.6668894 1.560942 6.034973 0.17306222 1
64.31819 8.491507 -0.48887318 4.0612826 1.5959842 1.5436162 6.276266 0.14615808 1
# c++/cuda/tensorrt
## fp32
8
16.999060 3.798431 -0.942883 4.421756 1.675821 1.471751 6.259527 0.958724 0.000000 3.798431 1
5.309843 11.392059 -1.243091 3.585154 1.545641 1.437766 3.157501 0.946587 0.000000 11.392059 1
30.813334 -0.572902 -0.757078 4.324007 1.661577 1.573622 6.260392 0.917729 0.000000 -0.572902 1
53.294670 2.020855 -0.402193 3.929884 1.629674 1.590329 6.171206 0.507371 0.000000 2.020855 1
53.318218 -2.735692 -0.454844 3.872712 1.561591 1.606453 6.043070 0.440806 0.000000 -2.735692 1
18.732822 15.593171 -1.326382 4.465864 1.695055 1.535907 6.388664 0.322872 0.000000 15.593171 1
0.650379 -0.578194 -1.131110 4.173248 1.666790 1.560607 6.036040 0.174399 0.000000 -0.578194 1
64.318169 8.491416 -0.488864 4.061276 1.596020 1.543714 6.276298 0.146148 0.000000 8.491416 1
## fp16
8
16.998623 3.798585 -0.942919 4.422517 1.675865 1.471489 6.259427 0.958924 0.000000 3.798585 1
5.310803 11.391916 -1.243179 3.584088 1.545765 1.437490 3.157401 0.947088 0.000000 11.391916 1
30.812458 -0.572833 -0.757393 4.321916 1.660997 1.572475 6.260312 0.917746 0.000000 -0.572833 1
53.294903 2.017216 -0.407383 3.933108 1.630412 1.597092 6.169904 0.569374 0.000000 2.017216 1
53.315487 -2.735236 -0.455371 3.879523 1.567689 1.614882 6.045636 0.475118 0.000000 -2.735236 1
18.732012 15.592730 -1.323921 4.463737 1.694481 1.535416 6.388288 0.321673 0.000000 15.592730 1
64.321220 8.491755 -0.483936 4.050905 1.591818 1.535908 6.274869 0.185652 0.000000 8.491755 1
0.646314 -0.577143 -1.130635 4.173886 1.667091 1.560809 6.028302 0.175538 0.000000 -0.577143 1
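Each line above encodes x, y, z, l, w, h, yaw, score and finally the class label; the TensorRT dumps insert two extra debug fields before the label. A small helper to measure how far the TensorRT boxes drift from the PyTorch ones (only the first 8 shared fields are compared):

```python
def parse_box(line):
    """Take the first 8 numeric fields: x, y, z, l, w, h, yaw, score.
    Works for both the 9-field PyTorch dump and the 11-field TensorRT dump."""
    return [float(v) for v in line.split()[:8]]

def max_abs_diff(line_a, line_b):
    """Largest per-field absolute deviation between two detections."""
    a, b = parse_box(line_a), parse_box(line_b)
    return max(abs(x - y) for x, y in zip(a, b))
```

For this frame, the fp32 engine matches PyTorch to within about 1e-4 per field, while fp16 drifts somewhat more.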
Evaluation
Convert the predictions and the pkl file produced by OpenPCDet into the evaluation data format:
cd /eval
python kitti_format.py
The converted data is stored in the kitti/object/pred/ directory.
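kitti-object-eval-python expects one text file per frame, each line holding the standard 16-field KITTI result format: type, truncated, occluded, alpha, 2D bbox (left top right bottom), dimensions (h w l), location (x y z), rotation_y, score. A minimal formatter (the helper name is illustrative; the -1 -1 -10 placeholders for truncated/occluded/alpha follow the common convention for detector outputs):

```python
def kitti_result_line(cls, bbox2d, hwl, xyz, rotation_y, score):
    """Format one detection as a 16-field KITTI result line.
    truncated/occluded/alpha are not predicted here, so the usual
    placeholders -1, -1, -10 are written instead."""
    fields = [cls, -1, -1, -10.0, *bbox2d, *hwl, *xyz, rotation_y, score]
    return " ".join(f"{v:.2f}" if isinstance(v, float) else str(v) for v in fields)
```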
- Run the evaluation kit on the prediction and pcdet outputs
Reference: https://github.com/traveller59/kitti-object-eval-python. The relevant dependency functions have been extracted into kitti-object-eval-python, so there is no need to install second-1.5.1 or spconv-1.0 separately.
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pcdet/ --label_split_file=./val.txt --current_class=0,1,2 --coco=False
Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:90.77, 89.77, 88.75
bev AP:89.52, 87.06, 84.11
3d AP:85.96, 77.09, 74.41
aos AP:90.76, 89.57, 88.42
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:90.77, 89.77, 88.75
bev AP:90.78, 90.15, 89.42
3d AP:90.78, 90.03, 89.19
aos AP:90.76, 89.57, 88.42
Pedestrian AP(Average Precision)@0.50, 0.50, 0.50:
bbox AP:66.18, 62.16, 59.18
bev AP:61.34, 56.08, 52.51
3d AP:56.59, 51.94, 47.61
aos AP:48.25, 45.33, 42.83
Pedestrian AP(Average Precision)@0.50, 0.25, 0.25:
bbox AP:66.18, 62.16, 59.18
bev AP:72.08, 69.11, 66.13
3d AP:72.00, 68.88, 64.93
aos AP:48.25, 45.33, 42.83
Cyclist AP(Average Precision)@0.50, 0.50, 0.50:
bbox AP:85.03, 72.54, 68.55
bev AP:81.97, 66.16, 62.32
3d AP:79.70, 62.35, 59.36
aos AP:84.48, 70.69, 66.66
Cyclist AP(Average Precision)@0.50, 0.25, 0.25:
bbox AP:85.03, 72.54, 68.55
bev AP:86.24, 70.33, 66.47
3d AP:86.24, 70.33, 66.47
Run the following to evaluate the TensorRT inference results:
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pred/fp32 --label_split_file=./val.txt --current_class=0,1,2 --coco=False
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pred/fp16 --label_split_file=./val.txt --current_class=0,1,2 --coco=False
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pred/int8 --label_split_file=./val.txt --current_class=0,1,2 --coco=False
Speed evaluation
| Model | 3060 Ti (ms) |
|---|---|
| PointPillars-FP32 | 8.06 |
| PointPillars-FP16 | 5.15 |
| PointPillars-int8 | 4.24 |
| pfe_FP32 + rpn_fp16 | 5.95 |
| pfe_FP16 + rpn_fp32 | 7.30 |
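From the table, fp16 gives roughly a 1.6x speedup and int8 roughly 1.9x over fp32 on the 3060 Ti:

```python
# Per-frame inference times (ms) from the table above.
latency = {"fp32": 8.06, "fp16": 5.15, "int8": 4.24}

def speedup(baseline, target):
    """Speedup of `target` relative to `baseline` as a latency ratio."""
    return latency[baseline] / latency[target]
```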
Accuracy evaluation
3D detection performance on the moderate-difficulty val set, with IoU thresholds of 0.7, 0.5 and 0.5 for Car, Pedestrian and Cyclist respectively:
| Model | Car@R11 | Pedestrian@R11 | Cyclist@R11 |
|---|---|---|---|
| OpenPCDet | 77.09 | 51.94 | 62.35 |
| PointPillars-FP32 | 76.75 | 52.82 | 61.89 |
| PointPillars-FP16 | 76.70 | 52.78 | 61.94 |
| PointPillars-int8 | 60.53 | 10.79 | 7.57 |
Mixed-precision testing is also possible by modifying the yaml parameters in the config; when evaluating, make sure the paths correspond one to one.
The int8 accuracy loss is severe, so quantization-aware training is needed next; NVIDIA's quantization toolkit can be used for it.
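Relative to fp32, the loss can be read off the table directly: int8 drops about 16 AP on Car and more than 40 AP on Pedestrian and Cyclist:

```python
# Moderate 3D AP@R11 from the table above.
ap_3d = {
    "fp32": {"Car": 76.75, "Pedestrian": 52.82, "Cyclist": 61.89},
    "int8": {"Car": 60.53, "Pedestrian": 10.79, "Cyclist": 7.57},
}

def int8_ap_drop(cls):
    """Absolute AP lost by the int8 engine versus fp32 for one class."""
    return round(ap_3d["fp32"][cls] - ap_3d["int8"][cls], 2)
```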
Visualization
After building the code, play the KITTI ROS bag, then run the following in the build directory:
./test_point_pillars_cuda_ros
References
This project refers to code from: