Commit 157a63a

fix doc typos, a new nuscenes model with 1 nds better result

tianweiy committed Apr 13, 2021
1 parent cb25e87
Showing 9 changed files with 273 additions and 22 deletions.
15 changes: 7 additions & 8 deletions README.md
@@ -20,9 +20,11 @@

## NEWS

+ [2021-04-13] Better nuScenes results by fixing the sync-bn bug and using stronger augmentations. Please refer to [nuScenes](configs/nusc/README.md).

[2021-02-28] CenterPoint is accepted at CVPR 2021 :fire:

- [2021-01-06] CenterPoint v1.0 is released. Without bells and whistles, we rank first among all Lidar-only methods on Waymo Open Dataset with a single model that runs at 11 FPS. Check out CenterPoint's model zoo for [Waymo](configs/waymo/README.md) and [nuScenes](configs/nusc/README.md).
+ [2021-01-06] CenterPoint v0.1 is released. Without bells and whistles, we rank first among all Lidar-only methods on Waymo Open Dataset with a single model that runs at 11 FPS. Check out CenterPoint's model zoo for [Waymo](configs/waymo/README.md) and [nuScenes](configs/nusc/README.md).

[2020-12-11] 3 out of the top 4 entries in the recent NeurIPS 2020 [nuScenes 3D Detection challenge](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Any) used CenterPoint. Congratulations to the other participants, and please stay tuned for more updates on nuScenes and Waymo.

@@ -94,17 +96,14 @@ All results are tested on a Titan RTX GPU with batch size 1.

Please refer to [INSTALL](docs/INSTALL.md) to set up libraries needed for distributed training and sparse convolution.

First download the model (by default, [centerpoint_pillar_512](https://drive.google.com/drive/folders/1K_wHrBo6yRSG7H7UUjKI4rPnyEA8HvOp)) and put it in ```work_dirs/centerpoint_pillar_512_demo```.

We provide a driving sequence clip from the [nuScenes dataset](https://www.nuscenes.org). Download the [folder](https://drive.google.com/file/d/1bK-xeq5UwJzpPfVDhICDJeKiU1QVZwtI/view?usp=sharing) and put it in the main directory.
Then run the demo with ```python tools/demo.py```. If everything is set up correctly, you will see an output video like the one below (red: ground-truth objects, blue: predictions):

<p align="center"> <img src='docs/demo.gif' align="center" height="350px"> </p>

## Benchmark Evaluation and Training

Please refer to [GETTING_START](docs/GETTING_START.md) to prepare the data. Then follow the instructions there to reproduce our detection and tracking results. All detection configurations are included in [configs](configs), and we provide the scripts for all tracking experiments in [tracking_scripts](tracking_scripts).

### ToDo List
- [ ] Support visualization with Open3D
- [ ] Colab demo
- [ ] Docker

## License

10 changes: 10 additions & 0 deletions configs/nusc/README.md
@@ -11,6 +11,16 @@
**We provide training / validation configurations, logs, pretrained models, and prediction files for all models in the paper**

### VoxelNet
| Model | Validation mAP | Validation NDS | Link |
|-----------------------|-----------------|-----------------|---------------|
| [centerpoint_voxel_1440](voxelnet/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z.py) | 58.5 | 66.3 | [URL](https://drive.google.com/drive/folders/1FOfCe9nWQrySUx42PlZyaKWAK2Or0sZQ?usp=sharing) |



### VoxelNet (deprecated)

These results were obtained before the sync-BN bug fix.

| Model | FPS | Validation mAP | Validation NDS | Link |
|-----------------------|------------------|-----------------|-----------------|---------------|
| [centerpoint_voxel_1024](voxelnet/nusc_centerpoint_voxelnet_01voxel.py) | 16 | 56.4 | 64.8 | [URL](https://drive.google.com/drive/folders/1RyBD23GDfeU4AnRkea2BxlrosbKJmDKW?usp=sharing) |
232 changes: 232 additions & 0 deletions configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z.py

@@ -0,0 +1,232 @@

```python
import itertools
import logging

from det3d.utils.config_tool import get_downsample_factor

tasks = [
dict(num_class=1, class_names=["car"]),
dict(num_class=2, class_names=["truck", "construction_vehicle"]),
dict(num_class=2, class_names=["bus", "trailer"]),
dict(num_class=1, class_names=["barrier"]),
dict(num_class=2, class_names=["motorcycle", "bicycle"]),
dict(num_class=2, class_names=["pedestrian", "traffic_cone"]),
]

class_names = list(itertools.chain(*[t["class_names"] for t in tasks]))
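# Flatten the six per-task class lists into one ordered list of all ten nuScenes classes.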

# training and testing settings
target_assigner = dict(
tasks=tasks,
)

# model settings
model = dict(
type="VoxelNet",
pretrained=None,
reader=dict(
type="VoxelFeatureExtractorV3",
# type='SimpleVoxel',
num_input_features=5,
),
backbone=dict(
type="SpMiddleResNetFHD", num_input_features=5, ds_factor=8
),
neck=dict(
type="RPN",
layer_nums=[5, 5],
ds_layer_strides=[1, 2],
ds_num_filters=[128, 256],
us_layer_strides=[1, 2],
us_num_filters=[256, 256],
num_input_features=256,
logger=logging.getLogger("RPN"),
),
bbox_head=dict(
type="CenterHead",
in_channels=sum([256, 256]),
tasks=tasks,
dataset='nuscenes',
weight=0.25,
code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0],
common_heads={'reg': (2, 2), 'height': (1, 2), 'dim':(3, 2), 'rot':(2, 2), 'vel': (2, 2)},
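# Each common_heads entry is (output_channels, num_conv): 2D sub-voxel center
# offset, z height, 3D dimensions, rotation encoded as (sin, cos), and BEV velocity.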
share_conv_channel=64,
dcn_head=False
),
)

assigner = dict(
target_assigner=target_assigner,
out_size_factor=get_downsample_factor(model),
dense_reg=1,
gaussian_overlap=0.1,
max_objs=500,
min_radius=2,
)
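# AssignLabel draws a Gaussian around each ground-truth center on the class heatmap;
# gaussian_overlap and min_radius bound the Gaussian radius, and max_objs caps the
# number of boxes supervised per frame.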


train_cfg = dict(assigner=assigner)

test_cfg = dict(
post_center_limit_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0],
max_per_img=500,
nms=dict(
use_rotate_nms=True,
use_multi_class_nms=False,
nms_pre_max_size=1000,
nms_post_max_size=83,
nms_iou_threshold=0.2,
),
score_threshold=0.1,
pc_range=[-54, -54],
out_size_factor=get_downsample_factor(model),
voxel_size=[0.075, 0.075]
)
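# A 108 m x 108 m range at 0.075 m voxels gives a 1440 x 1440 BEV grid (hence the
# model name centerpoint_voxel_1440); with out_size_factor 8 the heatmap is 180 x 180.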

# dataset settings
dataset_type = "NuScenesDataset"
nsweeps = 10
data_root = "data/nuScenes"

db_sampler = dict(
type="GT-AUG",
enable=False,
db_info_path="data/nuScenes/dbinfos_train_10sweeps_withvelo.pkl",
sample_groups=[
dict(car=2),
dict(truck=3),
dict(construction_vehicle=7),
dict(bus=4),
dict(trailer=6),
dict(barrier=2),
dict(motorcycle=6),
dict(bicycle=6),
dict(pedestrian=2),
dict(traffic_cone=2),
],
db_prep_steps=[
dict(
filter_by_min_num_points=dict(
car=5,
truck=5,
bus=5,
trailer=5,
construction_vehicle=5,
traffic_cone=5,
barrier=5,
motorcycle=5,
bicycle=5,
pedestrian=5,
)
),
dict(filter_by_difficulty=[-1],),
],
global_random_rotation_range_per_object=[0, 0],
rate=1.0,
)
train_preprocessor = dict(
mode="train",
shuffle_points=True,
global_rot_noise=[-0.78539816, 0.78539816],
global_scale_noise=[0.95, 1.05],
global_translate_std=0.5,
db_sampler=db_sampler,
class_names=class_names,
)
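# global_translate_std=0.5 enables the new translation augmentation added in this
# commit (see the preprocess.py diff below): the whole scene is randomly shifted.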

val_preprocessor = dict(
mode="val",
shuffle_points=False,
)

voxel_generator = dict(
range=[-54, -54, -5.0, 54, 54, 3.0],
voxel_size=[0.075, 0.075, 0.2],
max_points_in_voxel=10,
max_voxel_num=[120000, 160000],
)
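# max_voxel_num is presumably the [train, test] cap on non-empty voxels per frame.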

train_pipeline = [
dict(type="LoadPointCloudFromFile", dataset=dataset_type),
dict(type="LoadPointCloudAnnotations", with_bbox=True),
dict(type="Preprocess", cfg=train_preprocessor),
dict(type="Voxelization", cfg=voxel_generator),
dict(type="AssignLabel", cfg=train_cfg["assigner"]),
dict(type="Reformat"),
# dict(type='PointCloudCollect', keys=['points', 'voxels', 'annotations', 'calib']),
]
test_pipeline = [
dict(type="LoadPointCloudFromFile", dataset=dataset_type),
dict(type="LoadPointCloudAnnotations", with_bbox=True),
dict(type="Preprocess", cfg=val_preprocessor),
dict(type="Voxelization", cfg=voxel_generator),
dict(type="AssignLabel", cfg=train_cfg["assigner"]),
dict(type="Reformat"),
]

train_anno = "data/nuScenes/infos_train_10sweeps_withvelo_filter_True.pkl"
val_anno = "data/nuScenes/infos_val_10sweeps_withvelo_filter_True.pkl"
test_anno = None

data = dict(
samples_per_gpu=4,
workers_per_gpu=6,
train=dict(
type=dataset_type,
root_path=data_root,
info_path=train_anno,
ann_file=train_anno,
nsweeps=nsweeps,
class_names=class_names,
pipeline=train_pipeline,
),
val=dict(
type=dataset_type,
root_path=data_root,
info_path=val_anno,
test_mode=True,
ann_file=val_anno,
nsweeps=nsweeps,
class_names=class_names,
pipeline=test_pipeline,
),
test=dict(
type=dataset_type,
root_path=data_root,
info_path=test_anno,
ann_file=test_anno,
nsweeps=nsweeps,
class_names=class_names,
pipeline=test_pipeline,
),
)



optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# optimizer
optimizer = dict(
type="adam", amsgrad=0.0, wd=0.01, fixed_wd=True, moving_average=False,
)
lr_config = dict(
type="one_cycle", lr_max=0.001, moms=[0.95, 0.85], div_factor=10.0, pct_start=0.4,
)
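# One-cycle schedule: lr climbs from lr_max/div_factor to lr_max over the first 40%
# (pct_start) of training, then anneals back down while momentum moves inversely
# between moms[0] and moms[1].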

checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
interval=5,
hooks=[
dict(type="TextLoggerHook"),
# dict(type='TensorboardLoggerHook')
],
)
# yapf:enable
# runtime settings
total_epochs = 20
device_ids = range(8)
dist_params = dict(backend="nccl", init_method="env://")
log_level = "INFO"
work_dir = './work_dirs/{}/'.format(__file__[__file__.rfind('/') + 1:-3])
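# work_dir is derived from this file's name: ./work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z/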
load_from = None
resume_from = None
workflow = [('train', 1)]
```
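With the config in place, training should follow the repository's usual distributed recipe from GETTING_START — for example, something like ```python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z.py```. Eight GPUs match ```device_ids = range(8)```; with ```samples_per_gpu=4``` that gives a global batch size of 32.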
4 changes: 4 additions & 0 deletions det3d/datasets/pipelines/preprocess.py
@@ -34,6 +34,7 @@ def __init__(self, cfg=None, **kwargs):
if self.mode == "train":
self.global_rotation_noise = cfg.global_rot_noise
self.global_scaling_noise = cfg.global_scale_noise
+ self.global_translate_std = cfg.get('global_translate_std', 0)
self.class_names = cfg.class_names
if cfg.db_sampler != None:
self.db_sampler = build_dbsampler(cfg.db_sampler)
@@ -130,6 +131,9 @@ def __call__(self, res, info):
gt_dict["gt_boxes"], points = prep.global_scaling_v2(
gt_dict["gt_boxes"], points, *self.global_scaling_noise
)
+ gt_dict["gt_boxes"], points = prep.global_translate_(
+     gt_dict["gt_boxes"], points, noise_translate_std=self.global_translate_std
+ )
elif self.no_augmentation:
gt_boxes_mask = np.array(
[n in self.class_names for n in gt_dict["gt_names"]], dtype=np.bool_
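The body of ```prep.global_translate_``` is not part of this diff. As a minimal sketch of what such a global translation augmentation typically does — the implementation below is an assumption for illustration, not the repository's exact code:

```python
import numpy as np

def global_translate_(gt_boxes, points, noise_translate_std=0.5):
    """Hypothetical sketch: shift the whole scene by one random offset."""
    if noise_translate_std == 0:
        return gt_boxes, points
    # one (x, y, z) offset drawn per frame, applied to every point and box
    noise = np.random.normal(scale=noise_translate_std, size=3)
    points[:, :3] += noise      # translate point coordinates
    gt_boxes[:, :3] += noise    # translate box centers by the same offset
    return gt_boxes, points
```

Because the boxes and points move by the same offset, the labels stay consistent; only the scene's absolute position changes.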
12 changes: 5 additions & 7 deletions docs/NUSC.md
@@ -75,7 +75,11 @@ The pretrained models and configurations are in [MODEL ZOO](../configs/nusc/README.md)

### Tracking

- Please refer to [tracking_scripts](../tracking_scripts) to reproduce all tracking results. The detection files are provided in the [MODEL ZOO](../configs/nusc/README.md).
+ You can find the detection files in the [MODEL ZOO](../configs/nusc/README.md). After downloading the detection files, you can simply run

```bash
python tools/nusc_tracking/pub_test.py --work_dir WORK_DIR_PATH --checkpoint DETECTION_PATH
```

### Test Set

@@ -107,9 +111,3 @@ Download the ```centerpoint_voxel_1440_dcn_flip``` [here](https://drive.google.c
```bash
python tools/dist_test.py configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_dcn_flip.py --work_dir work_dirs/nusc_centerpoint_voxelnet_dcn_0075voxel_flip_testset --checkpoint work_dirs/nusc_0075_dcn_flip_track/voxelnet_converted.pth --testset --speed_test
```

- With the generated detection files, you can create the tracking prediction by running
-
- ```bash
- bash tracking_scripts/centerpoint_voxel_1440_dcn_flip_testset.sh
- ```
18 changes: 14 additions & 4 deletions docs/WAYMO.md
@@ -27,13 +27,13 @@ Convert the tfrecord data to pickle files.

```bash
# train set
-CUDA_VISIBLE_DEVICES=-1 python det3d/datasets/waymo/waymo_converter.py --record_path 'WAYMO_DATASET_ROOT/tfrecord_training/segment-*.tfrecord' --root_path 'WAYMO_DATASET_ROOT/train/'
+CUDA_VISIBLE_DEVICES=-1 python det3d/datasets/waymo/waymo_converter.py --record_path 'WAYMO_DATASET_ROOT/tfrecord_training/*.tfrecord' --root_path 'WAYMO_DATASET_ROOT/train/'

# validation set
-CUDA_VISIBLE_DEVICES=-1 python det3d/datasets/waymo/waymo_converter.py --record_path 'WAYMO_DATASET_ROOT/tfrecord_validation/segment-*.tfrecord' --root_path 'WAYMO_DATASET_ROOT/val/'
+CUDA_VISIBLE_DEVICES=-1 python det3d/datasets/waymo/waymo_converter.py --record_path 'WAYMO_DATASET_ROOT/tfrecord_validation/*.tfrecord' --root_path 'WAYMO_DATASET_ROOT/val/'

# testing set
-CUDA_VISIBLE_DEVICES=-1 python det3d/datasets/waymo/waymo_converter.py --record_path 'WAYMO_DATASET_ROOT/tfrecord_testing/segment-*.tfrecord' --root_path 'WAYMO_DATASET_ROOT/test/'
+CUDA_VISIBLE_DEVICES=-1 python det3d/datasets/waymo/waymo_converter.py --record_path 'WAYMO_DATASET_ROOT/tfrecord_testing/*.tfrecord' --root_path 'WAYMO_DATASET_ROOT/test/'
```

Create a symlink to the dataset root
@@ -111,7 +111,17 @@ python det3d/datasets/waymo/waymo_common.py --info_path data/Waymo/infos_val_01s

All pretrained models and configurations are in [MODEL ZOO](../configs/waymo/README.md).

- Our final model follows a two-stage training process. For example, to train the two-stage CenterPoint-Voxel model, you first need to train the one stage model using [ONE_STAGE](../configs/voxelnet/waymo_centerpoint_voxelnet_3x.py) and then train the second stage module using [TWO_STAGE](../configs/voxelnet/two_stage/waymo_centerpoint_voxelnet_two_stage_bev_5point_ft_6epoch_freeze.py). You can also contact us to access the pretrained models, see details [here](../configs/waymo/README.md).
### Second-stage Training

+ Our final model follows a two-stage training process. For example, to train the two-stage CenterPoint-Voxel model, you first need to train the one-stage model using [ONE_STAGE](../configs/waymo/voxelnet/waymo_centerpoint_voxelnet_3x.py) and then train the second-stage module using [TWO_STAGE](../configs/waymo/voxelnet/two_stage/waymo_centerpoint_voxelnet_two_stage_bev_5point_ft_6epoch_freeze.py). You can also contact us for access to the pretrained models; see details [here](../configs/waymo/README.md).

### Tracking

Please refer to the options in [test.py](../tools/waymo_tracking/test.py). The prediction file is an intermediate file generated by [dist_test.py](../tools/dist_test.py) that stores predictions in the KITTI lidar format.
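Judging from the argument parser in the test.py diff below, an invocation presumably looks like ```python tools/waymo_tracking/test.py --work_dir WORK_DIR_PATH --checkpoint PREDICTION_PATH --info_path INFO_PATH```.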

### Visualization

Please refer to [visual.py](../tools/visual.py). It takes a prediction file generated by [simple_inference_waymo.py](../tools/simple_inference_waymo.py) and visualizes the point cloud and detections.

### Test Set

2 changes: 1 addition & 1 deletion tools/waymo_tracking/test.py
@@ -26,7 +26,7 @@ def parse_args():
parser = argparse.ArgumentParser(description="Tracking Evaluation")
parser.add_argument("--work_dir", help="the dir to save logs and tracking results")
parser.add_argument(
"--checkpoint", help="the dir to checkpoint which the model read from"
"--checkpoint", help="the dir to prediction file"
)
parser.add_argument(
"--info_path", type=str
1 change: 0 additions & 1 deletion tracking_scripts/centerpoint_voxel_1440_dcn_flip.sh

This file was deleted.

1 change: 0 additions & 1 deletion tracking_scripts/centerpoint_voxel_1440_dcn_flip_testset.sh

This file was deleted.
