forked from gigo-team/bev_lane_det
commit 2d66872
Showing 44 changed files with 4,888 additions and 0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
# BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline
## Introduction
BEV-LaneDet is an efficient and robust monocular 3D lane detection system. First, we introduce the Virtual Camera, which unifies the intrinsic/extrinsic parameters of cameras mounted on different vehicles to ensure a consistent spatial relationship between cameras; the unified visual space effectively promotes the learning process. Second, we propose a simple but efficient 3D lane representation called the Key-Points Representation, which is better suited to the complicated and diverse structure of 3D lanes. Finally, we present a lightweight and chip-friendly spatial transformation module, the Spatial Transformation Pyramid, which transforms multi-scale front-view features into BEV features. Experimental results demonstrate that our work outperforms state-of-the-art approaches in terms of F-Score, being 10.6% higher on the OpenLane dataset and 5.9% higher on the Apollo 3D synthetic dataset, with a speed of 185 FPS. Our paper has been accepted by CVPR 2023: [arXiv](https://arxiv.org/abs/2210.06006).
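The Virtual Camera amounts to a per-image perspective warp: a handful of ground-plane points is projected through both the real camera and a standard ("virtual") camera, a homography is fitted between the two pixel sets, and the input image is warped with it (this is what the CPU implementation under ./csrc does). The following NumPy/OpenCV sketch illustrates the idea, assuming each camera is described by 3x3 intrinsics `K`, camera-to-ego rotation `R` and camera position `t`, with the ground plane at `z = h`; it is an illustration of the warp, not the project's training code.

```python
import cv2
import numpy as np

def vcam_homography(K_src, R_src, t_src, K_dst, R_dst, t_dst, h=-0.39):
    """Homography mapping source-camera pixels to the standard (virtual) camera.

    Sketch only: ground-plane points at z = h are projected into both cameras
    and a homography is fitted between the two pixel sets (mirroring csrc/main.cpp).
    """
    # A small grid of ground-plane points in the ego frame (4+ non-collinear points suffice).
    xs, ys = np.meshgrid([10.0, 30.0, 80.0], [-8.0, 0.0, 8.0])
    pts3d = np.stack([xs.ravel(), ys.ravel(), np.full(xs.size, h)], axis=1)  # (9, 3)

    def project(K, R, t):
        cam = (pts3d - t) @ np.linalg.inv(R).T   # ego frame -> camera frame
        pix = cam @ K.T                          # pinhole projection
        return (pix[:, :2] / pix[:, 2:3]).astype(np.float32)

    H, _ = cv2.findHomography(project(K_src, R_src, t_src),
                              project(K_dst, R_dst, t_dst))
    return H

# usage: warped = cv2.warpPerspective(img, H, (std_width, std_height))
```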
- [Get Started](#getstart)
- [Benchmark](#benchmark)
- [Visualization](#visualization)

## <span id="getstart">Get Started</span>

### Installation
- To run our code, make sure you are using a machine with at least one GPU (a quick sanity check is sketched below).
- Set up the environment:
```
pip install -r requirements.txt
```
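Inference speed in the benchmarks below is reported for PyTorch, so before training it is worth a quick check that PyTorch (assumed here to be among the pinned requirements) can see the GPU:

```python
import torch

# Expect True followed by the name of the first CUDA device.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```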
### Training and evaluation on OpenLane
- Please refer to [OpenLane](https://github.com/OpenPerceptionX/OpenLane) for downloading the OpenLane dataset, for example to /dataset/openlane.

- How to train:
1. Please modify the configuration in /tools/openlane_config.py (a sketch of the kind of settings involved follows this section).
2. Execute the following code:
```
cd tools
python3 train_openlane.py
```
- How to evaluate:
1. Please modify the configuration in /tools/val_openlane.py
2. Execute the following code:
```
cd tools
python val_openlane.py
```
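The exact option names live in /tools/openlane_config.py (and analogously in /tools/apollo_config.py below); the snippet here is only a hypothetical illustration of the kind of values that need to point at your local setup, not the file's actual contents:

```python
# Hypothetical example only -- check tools/openlane_config.py for the real option names.
dataset_root = "/dataset/openlane"                     # where the OpenLane download was unpacked
images_dir   = dataset_root + "/images"                # camera images (assumed layout)
lane3d_dir   = dataset_root + "/lane3d_1000/training"  # 3D lane annotations (assumed layout)
batch_size   = 8                                       # adjust to your GPU memory
num_workers  = 4                                       # data-loading worker processes
```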
### Training and evaluation on Apollo 3D Lane Synthetic
- Please refer to [Apollo 3D Lane Synthetic](https://github.com/yuliangguo/3D_Lane_Synthetic_Dataset) for downloading the Apollo 3D Lane Synthetic dataset, for example to /dataset/apollo.

- How to train:
1. Please modify the configuration in /tools/apollo_config.py (as sketched above).
2. Execute the following code:
```
cd tools
python3 train_apollo.py
```
- How to evaluate:
1. Please modify the configuration in /tools/val_apollo.py
2. Execute the following code:
```
cd tools
python val_apollo.py
```

## <span id="benchmark">Benchmark</span>

### Results of different models on the OpenLane dataset

| Method | F-Score | X error near (m) | X error far (m) | Z error near (m) | Z error far (m) | GFLOPs | TensorRT | PyTorch |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| Gen-LaneNet | 29.7 | 0.309 | 0.877 | 0.16 | 0.75 | 34 | – | 54 FPS |
| PersFormer | 47.8 | 0.322 | 0.778 | 0.213 | 0.681 | 143 | – | 21 FPS |
| Ours | 58.4 | 0.309 | 0.659 | 0.244 | 0.631 | 53 | 185 FPS | 102 FPS |

### Results of different models on Apollo 3D Lane Synthetic (Balanced Scene)

| Method | F-Score | X error near (m) | X error far (m) | Z error near (m) | Z error far (m) |
| ---- | ---- | ---- | ---- | ---- | ---- |
| 3D-LaneNet | 86.4 | 0.068 | 0.477 | 0.015 | 0.202 |
| Gen-LaneNet | 88.1 | 0.061 | 0.486 | 0.012 | 0.214 |
| CLGO | 91.9 | 0.061 | 0.361 | 0.029 | 0.25 |
| PersFormer | 92.9 | 0.054 | 0.356 | 0.01 | 0.234 |
| Ours | 98.7 | 0.016 | 0.242 | 0.02 | 0.216 |

### Virtual Camera

The CPU implementation is here: [Virtual Camera on CPU](./csrc/README.md)

| Hardware | Single-thread | Multi-thread |
| ---- | ---- | ---- |
| Apple M1 | 1.5 ms | 0.5 ms |
| Intel Xeon Platinum 8163 @ 2.5 GHz | 5.5 ms | 1.2 ms |
| NVIDIA V100 | - | TODO |

## <span id="visualization">Visualization</span>
### OpenLane
A full-length (10 min) video on OpenLane is here: [Video](./virtualization/ol.mp4), or on YouTube: https://www.youtube.com/watch?v=Mqh0N2cOctM

![OpenLane](./visualization/ol.gif)

### Apollo 3D Lane Synthetic
You can watch the Apollo 3D Lane Synthetic video at https://www.youtube.com/watch?v=WC36c4wO_QM

csrc/CMakeLists.txt
@@ -0,0 +1,17 @@
cmake_minimum_required(VERSION 3.22)
project(test)

set(CMAKE_CXX_STANDARD 17)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3")
find_package(Eigen3 3.3 REQUIRED NO_MODULE)
find_package(OpenCV REQUIRED)
message(${OpenCV_INCLUDE_DIRS})

#target_include_directories(<target> PUBLIC
#        ${OpenCV_INCLUDE_DIRS}
#        )
add_executable(test main.cpp)
target_link_libraries(test Eigen3::Eigen ${OpenCV_LIBS})

csrc/README.md
@@ -0,0 +1,15 @@
Dependencies:
1. OpenCV
2. Eigen

``` shell
cmake ./
make
./test
```

It measures the latency of transforming 1000 images with the Virtual Camera, twice (a rough Python equivalent is sketched below).

+ The program loads the sample image named a.jpg.
+ The first run is only for warm-up.
+ The transformed image is written out as vcam.jpg.
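A rough Python equivalent of the same measurement, assuming a 3x3 homography `H` has already been computed (the C++ program in this directory is the authoritative benchmark; `np.eye(3)` below is just a placeholder):

```python
import time
import cv2
import numpy as np

img = cv2.imread("a.jpg")           # same sample image as the C++ test
H = np.eye(3)                       # placeholder; use the real virtual-camera homography
dst_size = (1024, 576)              # working resolution used in main.cpp

# cv2.setNumThreads(1)              # uncomment to reproduce the single-thread numbers

for label in ("warm-up", "timed"):  # the first pass is warm-up only, as in main.cpp
    start = time.perf_counter()
    for _ in range(1000):
        out = cv2.warpPerspective(img, H, dst_size, flags=cv2.INTER_NEAREST)
    print(f"{label}: {(time.perf_counter() - start) * 1000:.1f} ms for 1000 warps")

cv2.imwrite("vcam.jpg", out)
```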
(Binary file not shown.)

csrc/main.cpp
@@ -0,0 +1,189 @@
#include <math.h>
#include <iostream>
#include <chrono>
#include "Eigen/Eigen"

#include "opencv2/opencv.hpp"
#include "opencv2/core.hpp"

using namespace std;

namespace os::gigo::d3 {

const double pi = 3.1415926535897932385;

struct CamParamMini {
    double fx, fy, cx, cy;
    double pose_tx, pose_ty, pose_tz;
    double qx, qy, qz, qw;
    double yaw, pitch, roll;
    int image_width, image_height;
};

inline Eigen::Matrix3d R_from_quaternion(double x, double y, double z, double w) {
    Eigen::Quaterniond q;
    q.x() = x;
    q.y() = y;
    q.z() = z;
    q.w() = w;
    return q.normalized().toRotationMatrix();
}

inline Eigen::Matrix3d R_matrix(const double yaw, const double pitch, const double roll) {
    auto y = yaw / 180 * pi;
    auto p = pitch / 180 * pi;
    auto r = roll / 180 * pi;
    Eigen::Matrix3d Rz;
    Rz << cos(y), -sin(y), 0, sin(y), cos(y), 0, 0, 0, 1;
    Eigen::Matrix3d Ry;
    Ry << cos(p), 0, sin(p), 0, 1, 0, -sin(p), 0, cos(p);
    Eigen::Matrix3d Rx;
    Rx << 1, 0, 0, 0, cos(r), -sin(r), 0, sin(r), cos(r);
    return Rz * Ry * Rx;
}
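
// Note: calc_vcam_matrix() below uses the quaternion form (R_from_quaternion);
// R_matrix() builds the same camera rotation from yaw/pitch/roll in degrees but
// is not used in this file.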

inline Eigen::Matrix3d K_matrix(const CamParamMini &p) {
    Eigen::Matrix3d K;
    K << p.fx, 0, p.cx, 0, p.fy, p.cy, 0, 0, 1;
    return K;
}

// Project 3D points to the image with the pinhole camera model.
std::vector<cv::Point2f>
PersMapping(Eigen::Matrix<double, 3, 15> p3d, Eigen::Matrix3d P, Eigen::Matrix<double, 3, 1> T) {
    std::vector<cv::Point2f> pts;
    auto p3cam = p3d.colwise() - T;  // 3x15 - 3x1 -> 3x15
    auto p2d = P * p3cam;            // 3x3 * 3x15 -> 3x15
    for (int i = 0; i < 15; ++i) {
        auto u = p2d(0, i);
        auto v = p2d(1, i);
        auto z = p2d(2, i);
        pts.push_back({static_cast<float>(u / z), static_cast<float>(v / z)});
    }
    return pts;
}

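// Compute the homography that maps pixels of the source (real) camera to the
// destination (standard/virtual) camera: ground-plane points at z = height are
// projected into both images and a homography is fitted between the two pixel sets.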
cv::Mat
calc_vcam_matrix(const CamParamMini &src_cam_params, const CamParamMini &dst_cam_params, double height = -0.393) {
    // 4 points are enough. Here we choose 15 points.
    Eigen::Matrix<double, 15, 3> pts3d;
    pts3d << 10, 0, height, 7 + 2., -1, height,
            80, 0, height, 80, -8, height, 80, 8, height,
            30, 0, height, 30, -8, height, 30, 8, height,
            10, 0, height, 10, -8, height, 10, 8, height,
            10, -4, height, 10, 4, height,
            12, -2, height, 12, 2, height;
    auto pts3d_T = pts3d.transpose();
    auto Rs = R_from_quaternion(src_cam_params.qx, src_cam_params.qy, src_cam_params.qz, src_cam_params.qw);
    auto Rd = R_from_quaternion(dst_cam_params.qx, dst_cam_params.qy, dst_cam_params.qz, dst_cam_params.qw);
    Eigen::Matrix<double, 3, 1> Ts;
    Ts << src_cam_params.pose_tx, src_cam_params.pose_ty, src_cam_params.pose_tz;
    Eigen::Matrix<double, 3, 1> Td;
    Td << dst_cam_params.pose_tx, dst_cam_params.pose_ty, dst_cam_params.pose_tz;

    auto Ks = K_matrix(src_cam_params);
    auto Kd = K_matrix(dst_cam_params);

    auto Ps = (Rs * Ks.inverse()).inverse();
    auto Pd = (Rd * Kd.inverse()).inverse();

    auto pts_src = PersMapping(pts3d_T, Ps, Ts);
    auto pts_dst = PersMapping(pts3d_T, Pd, Td);

    auto h = cv::findHomography(pts_src, pts_dst);

    return h;
}

class VirtualCamera {
public:
    VirtualCamera() = delete;

    VirtualCamera(const CamParamMini std_cam) {
        this->std_cam_ = std_cam;
        this->std_size = cv::Size(std_cam.image_width, std_cam.image_height);
    }

    void set_src_cam(const CamParamMini cam, double height = -0.393) {
        this->h_ = calc_vcam_matrix(cam, this->std_cam_, height);
    }

    cv::Mat transform(cv::Mat original_img) {
        cv::Mat output;
        cv::warpPerspective(original_img, output, this->h_, this->std_size, 0);
        return output;
    }

private:
    CamParamMini std_cam_;
    cv::Mat h_;
    cv::Size std_size;
};
}  // namespace os::gigo::d3

int main() {
    float stride = 3.75;
    float std_stride = 1.875;
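    // stride / std_stride downscale the raw resolutions (3840x2160 for the real
    // camera, 3840/2 x 2160/2 for the standard camera) to the 1024x576 working size used below.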

    // Uncomment this line if you need single-thread execution.
    // cv::setNumThreads(1);

    // Settings of the Virtual Camera
    os::gigo::d3::CamParamMini std_cam
            {1948.3936767578125 / std_stride, 1957.3106689453125 / std_stride, 962.94580078125 / std_stride, 559.2073974609375 / std_stride,
             2.1791250705718994, 0.0012771482579410076, 1.0726985931396484,
             -0.4864780008792877, 0.4880296289920807, -0.5088465809822083, 0.5159858465194702,
             1.5563105, -3.1311595, 1.6163771,
             static_cast<int>(3840 / 2 / std_stride), static_cast<int>(2160 / 2 / std_stride)
            };

    // Settings of the Real Camera
    os::gigo::d3::CamParamMini src_cam{
            3807.4036 / stride, 3822.3098 / stride, 1903.5006 / stride, 1147.8899 / stride,
            2.187498, -0.009653946, 1.015064,
            -0.48510775, 0.4893328, -0.5197188, 0.5050903,
            1.5516857, -3.1316676, 1.621268,
            static_cast<int>(3840 / stride), static_cast<int>(2160 / stride)
    };

    cout << "Loading image" << endl;
    auto src_img = cv::imread("a.jpg");

    using timer = chrono::high_resolution_clock;

    cv::resize(src_img, src_img, cv::Size(3840 / stride, 2160 / stride));
    cv::Size dst_size{src_cam.image_width, src_cam.image_height};

    cv::Mat output;

    // Establish the Virtual Camera
    os::gigo::d3::VirtualCamera vc(std_cam);
    vc.set_src_cam(src_cam);

    auto times = 1000;

    // Warm up
    cout << "Warm Up...." << endl;

    auto s = timer::now();
    for (auto i = 0; i < times; ++i) {
        output = vc.transform(src_img);
    }
    auto e = timer::now();
    cout << std::chrono::duration_cast<std::chrono::milliseconds>(e - s).count() << " ms @ " << times << " times" << endl;
    cout << "Warm Up.... DONE" << endl;

    s = timer::now();
    for (auto i = 0; i < times; ++i) {
        output = vc.transform(src_img);
    }
    e = timer::now();
    auto int_ms = std::chrono::duration_cast<std::chrono::milliseconds>(e - s);
    cout << int_ms.count() << " ms @ " << times << " times" << endl;
    cv::imwrite("vcam.jpg", output);
    return 0;
}
(Two empty files; the remaining changed files are not shown.)