Point cloud object detection with PointPillars: testing and evaluation, TensorRT post-training quantization, and ROS visualization
The code is open source: https://github.com/Xiao-Hu-Z/pointpillars_int8
Environment setup
Prepare the OpenPCDet environment
Export ONNX
To export your own models, run:
python3 export_onnx.py \
--cfg_file pointpillar.yaml \
--ckpt_path your_model.pth \
--onnx_file pfe.onnx
python3 export_onnx.py \
--cfg_file pointpillar.yaml \
--ckpt_path your_model.pth \
--onnx_file rpn.onnx
Here we extract two pure NN models, pfe and rpn, from the whole computation graph; this makes it easier for TensorRT to optimize their inference engines with int8.
Int8 post-training quantization
Generate calib_data
To perform implicit PTQ quantization, you first need to generate calibration files.
To obtain the input of the backbone, modify pcdet/models/detectors/pointpillar.py directly instead of duplicating code:
def forward(self, batch_dict):
    # for cur_module in self.module_list:
    #     batch_dict = cur_module(batch_dict)
    # Run only the first two modules (PillarVFE and PointPillarScatter);
    # batch_dict['spatial_features'] then holds the input of the rpn backbone.
    batch_dict = self.module_list[0](batch_dict)
    batch_dict = self.module_list[1](batch_dict)
    return batch_dict
Run the following command to generate the calibration input files:
python3 generate_calib_data.py \
--cfg_file pointpillar.yaml \
--data_path <your dataset file> \
--ckpt your_model.pth \
--calib_file_path <where to store the calibration input files>
This generates the calibration input files used later for int8 calibration.
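The exact on-disk layout is defined by generate_calib_data.py; a common convention, assumed here, is one raw little-endian float32 dump per frame. A minimal sketch of that round trip (helper names are illustrative):

```python
import struct
from pathlib import Path

def save_calib_frame(path, values):
    """Dump a flat list of float32 values as raw little-endian binary."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))

def load_calib_frame(path):
    """Read a raw float32 binary file back into a flat list of floats."""
    data = Path(path).read_bytes()
    return list(struct.unpack(f"<{len(data) // 4}f", data))  # 4 bytes per float32
```

The calibrator later streams these frames back in fixed-size batches.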
Generate TensorRT serialized engines
Actually you can create TensorRT engines directly from the ONNX models and skip this step; however, it is more convenient to load previously saved serialized engine files.
You can run:
python3 ptq_int8.py \
--config waymo_centerpoint_pp_two_pfn_stride1_3x.py \
--pfe_onnx_file pfe.onnx \
--rpn_onnx_file rpn.onnx \
--pfe_engine_path pfe_fp.engine \
--rpn_engine_path rpn_fp.engine \
--mode <quantization mode: fp32, fp16 or int8> \
--calib_file_path <where the calibration input files are stored>
By default this will generate int8-engine files.
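Internally, an int8 calibrator such as TensorRT's IInt8EntropyCalibrator2 is polled with get_batch until it returns None. The batching contract can be sketched without TensorRT (class and file names are illustrative):

```python
class CalibBatcher:
    """Mimics the get_batch contract of a TensorRT int8 calibrator:
    return the next fixed-size batch, or None when data is exhausted."""

    def __init__(self, files, batch_size):
        self.files = list(files)
        self.batch_size = batch_size
        self.pos = 0

    def get_batch(self):
        end = self.pos + self.batch_size
        if end > len(self.files):
            return None  # tells TensorRT that calibration is finished
        batch = self.files[self.pos:end]
        self.pos = end
        return batch
```

A real calibrator would additionally load each file (e.g. with load_calib_frame), copy it to device memory, and return device pointers instead of paths.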
You can also use the trtexec command to obtain fp16 or fp32 TensorRT engines:
# x86
# fp16
TensorRT-8.4.3.1/targets/x86_64-linux-gnu/bin/trtexec --onnx=pp_pfe.onnx --explicitBatch --saveEngine=pp_pfe_fp16_trtexec.trt --fp16 --workspace=1024 --verbose
# fp32
TensorRT-8.4.3.1/targets/x86_64-linux-gnu/bin/trtexec --onnx=pp_pfe.onnx --explicitBatch --saveEngine=pp_pfe_fp32_trtexec.trt --workspace=1024 --verbose
Run inference
Building the code produces two executables: one runs inference on the data, the other is for visualization.
Run the following:
./test_point_pillars_cuda
For the single frame data/000003.bin, the output is:
# pytorch :
8
16.999187 3.7984838 -0.94293684 4.421786 1.6759017 1.471789 6.259531 0.95879066 1
5.309882 11.392116 -1.2433618 3.5853221 1.5456746 1.4377103 3.157524 0.94660896 1
30.813366 -0.572901 -0.75715184 4.3241267 1.661644 1.5736197 6.2604256 0.91791165 1
53.294666 2.0207937 -0.40219212 3.9299285 1.6296755 1.5902154 6.171218 0.5074535 1
53.318325 -2.7356923 -0.4548468 3.872774 1.5616711 1.6064421 6.0431376 0.441119 1
18.733013 15.593122 -1.3264668 4.466153 1.6951573 1.5360177 6.388626 0.32318175 1
0.6559625 -0.5785154 -1.131638 4.1749973 1.6668894 1.560942 6.034973 0.17306222 1
64.31819 8.491507 -0.48887318 4.0612826 1.5959842 1.5436162 6.276266 0.14615808 1
# c++/cuda/tensorrt
## fp32
8
16.999060 3.798431 -0.942883 4.421756 1.675821 1.471751 6.259527 0.958724 0.000000 3.798431 1
5.309843 11.392059 -1.243091 3.585154 1.545641 1.437766 3.157501 0.946587 0.000000 11.392059 1
30.813334 -0.572902 -0.757078 4.324007 1.661577 1.573622 6.260392 0.917729 0.000000 -0.572902 1
53.294670 2.020855 -0.402193 3.929884 1.629674 1.590329 6.171206 0.507371 0.000000 2.020855 1
53.318218 -2.735692 -0.454844 3.872712 1.561591 1.606453 6.043070 0.440806 0.000000 -2.735692 1
18.732822 15.593171 -1.326382 4.465864 1.695055 1.535907 6.388664 0.322872 0.000000 15.593171 1
0.650379 -0.578194 -1.131110 4.173248 1.666790 1.560607 6.036040 0.174399 0.000000 -0.578194 1
64.318169 8.491416 -0.488864 4.061276 1.596020 1.543714 6.276298 0.146148 0.000000 8.491416 1
## fp16
8
16.998623 3.798585 -0.942919 4.422517 1.675865 1.471489 6.259427 0.958924 0.000000 3.798585 1
5.310803 11.391916 -1.243179 3.584088 1.545765 1.437490 3.157401 0.947088 0.000000 11.391916 1
30.812458 -0.572833 -0.757393 4.321916 1.660997 1.572475 6.260312 0.917746 0.000000 -0.572833 1
53.294903 2.017216 -0.407383 3.933108 1.630412 1.597092 6.169904 0.569374 0.000000 2.017216 1
53.315487 -2.735236 -0.455371 3.879523 1.567689 1.614882 6.045636 0.475118 0.000000 -2.735236 1
18.732012 15.592730 -1.323921 4.463737 1.694481 1.535416 6.388288 0.321673 0.000000 15.592730 1
64.321220 8.491755 -0.483936 4.050905 1.591818 1.535908 6.274869 0.185652 0.000000 8.491755 1
0.646314 -0.577143 -1.130635 4.173886 1.667091 1.560809 6.028302 0.175538 0.000000 -0.577143 1
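Each line above encodes x, y, z, l, w, h, yaw, score and finally the class label; the TensorRT dumps insert two extra debug fields before the label. A small helper to measure how far the TensorRT boxes drift from the PyTorch ones (only the first 8 shared fields are compared):

```python
def parse_box(line):
    """Take the first 8 numeric fields: x, y, z, l, w, h, yaw, score.
    Works for both the 9-field PyTorch dump and the 11-field TensorRT dump."""
    return [float(v) for v in line.split()[:8]]

def max_abs_diff(line_a, line_b):
    """Largest per-field absolute deviation between two detections."""
    a, b = parse_box(line_a), parse_box(line_b)
    return max(abs(x - y) for x, y in zip(a, b))
```

For this frame, the fp32 engine matches PyTorch to within about 1e-4 per field, while fp16 drifts somewhat more.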
Evaluation
Convert the predictions and the pkl file produced by OpenPCDet into the evaluation data format:
cd /eval
python kitti_format.py
The converted data is stored in the kitti/object/pred/ directory.
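kitti-object-eval-python expects one text file per frame, each line holding the standard 16-field KITTI result format: type, truncated, occluded, alpha, 2D bbox (left top right bottom), dimensions (h w l), location (x y z), rotation_y, score. A minimal formatter (the helper name is illustrative; the -1 -1 -10 placeholders for truncated/occluded/alpha follow the common convention for detector outputs):

```python
def kitti_result_line(cls, bbox2d, hwl, xyz, rotation_y, score):
    """Format one detection as a 16-field KITTI result line.
    truncated/occluded/alpha are not predicted here, so the usual
    placeholders -1, -1, -10 are written instead."""
    fields = [cls, -1, -1, -10.0, *bbox2d, *hwl, *xyz, rotation_y, score]
    return " ".join(f"{v:.2f}" if isinstance(v, float) else str(v) for v in fields)
```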
- Run the evaluation kit on the prediction and pcdet outputs
Reference: https://github.com/traveller59/kitti-object-eval-python. The relevant dependency functions have been extracted into kitti-object-eval-python, so there is no need to install second-1.5.1 or spconv-1.0 separately.
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pcdet/ --label_split_file=./val.txt --current_class=0,1,2 --coco=False
Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:90.77, 89.77, 88.75
bev AP:89.52, 87.06, 84.11
3d AP:85.96, 77.09, 74.41
aos AP:90.76, 89.57, 88.42
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:90.77, 89.77, 88.75
bev AP:90.78, 90.15, 89.42
3d AP:90.78, 90.03, 89.19
aos AP:90.76, 89.57, 88.42
Pedestrian AP(Average Precision)@0.50, 0.50, 0.50:
bbox AP:66.18, 62.16, 59.18
bev AP:61.34, 56.08, 52.51
3d AP:56.59, 51.94, 47.61
aos AP:48.25, 45.33, 42.83
Pedestrian AP(Average Precision)@0.50, 0.25, 0.25:
bbox AP:66.18, 62.16, 59.18
bev AP:72.08, 69.11, 66.13
3d AP:72.00, 68.88, 64.93
aos AP:48.25, 45.33, 42.83
Cyclist AP(Average Precision)@0.50, 0.50, 0.50:
bbox AP:85.03, 72.54, 68.55
bev AP:81.97, 66.16, 62.32
3d AP:79.70, 62.35, 59.36
aos AP:84.48, 70.69, 66.66
Cyclist AP(Average Precision)@0.50, 0.25, 0.25:
bbox AP:85.03, 72.54, 68.55
bev AP:86.24, 70.33, 66.47
3d AP:86.24, 70.33, 66.47
Run the following to evaluate the TensorRT inference results:
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pred/fp32 --label_split_file=./val.txt --current_class=0,1,2 --coco=False
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pred/fp16 --label_split_file=./val.txt --current_class=0,1,2 --coco=False
python ./kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti/object/training/label_2/ --result_path=./kitti/object/pred/int8 --label_split_file=./val.txt --current_class=0,1,2 --coco=False
Speed evaluation
| Model | 3060 Ti (ms) |
|---|---|
| PointPillars-FP32 | 8.06 |
| PointPillars-FP16 | 5.15 |
| PointPillars-int8 | 4.24 |
| pfe_FP32 + rpn_fp16 | 5.95 |
| pfe_FP16 + rpn_fp32 | 7.30 |
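From the table, fp16 gives roughly a 1.6x speedup and int8 roughly 1.9x over fp32 on the 3060 Ti:

```python
# Per-frame inference times (ms) from the table above.
latency = {"fp32": 8.06, "fp16": 5.15, "int8": 4.24}

def speedup(baseline, target):
    """Speedup of `target` relative to `baseline` as a latency ratio."""
    return latency[baseline] / latency[target]
```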
Accuracy evaluation
3D detection performance on the moderate-difficulty val set, with IoU thresholds of 0.7, 0.5 and 0.5 for Car, Pedestrian and Cyclist respectively:
| Model | Car@R11 | Pedestrian@R11 | Cyclist@R11 |
|---|---|---|---|
| OpenPCDet | 77.09 | 51.94 | 62.35 |
| PointPillars-FP32 | 76.75 | 52.82 | 61.89 |
| PointPillars-FP16 | 76.70 | 52.78 | 61.94 |
| PointPillars-int8 | 60.53 | 10.79 | 7.57 |
Mixed-precision testing is also possible by modifying the yaml parameters in the config; when evaluating, make sure the paths correspond one to one.
The int8 accuracy loss is severe, so quantization-aware training is needed next; NVIDIA's quantization toolkit can be used for it.
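Relative to fp32, the loss can be read off the table directly: int8 drops about 16 AP on Car and more than 40 AP on Pedestrian and Cyclist:

```python
# Moderate 3D AP@R11 from the table above.
ap_3d = {
    "fp32": {"Car": 76.75, "Pedestrian": 52.82, "Cyclist": 61.89},
    "int8": {"Car": 60.53, "Pedestrian": 10.79, "Cyclist": 7.57},
}

def int8_ap_drop(cls):
    """Absolute AP lost by the int8 engine versus fp32 for one class."""
    return round(ap_3d["fp32"][cls] - ap_3d["int8"][cls], 2)
```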
Visualization
After building the code, play the KITTI ROS bag, then run the following in the build directory:
./test_point_pillars_cuda_ros
References
This project refers to code from: