4.5 Lidar-Lidar Synchronization

The lidar rotates and publishes data continuously at a given frequency, which means that it cannot be triggered in the way a camera can. Since the lidars used in this project have no mechanism for being started simultaneously, they may publish data asynchronously. The job of the lidar synchronization is therefore to counteract the delay between the lidars. The method used in this thesis is from "Multi-Lidar Calibration and Synchronization for Autonomous Vehicles" by Zhang et al. [61]. It uses the motion transformation between the lidars to synchronize the scans to a reference time-stamp.

4.5.1 Method

Time-stamped data from a lidar is related to a global transformation $T(t_1)$ for time $t_1$, while data from a second lidar is related to a global transformation $T(t_2)$. The global transformations between the two data inputs can be related to each other as

\[
T = T(t_1)^{-1} T(t_2). \tag{4.4}
\]

This is illustrated in figure 4.8. This can be further extended to several lidars by relating the transformations $T(t_i)$, $i = 1, \ldots, n$, for $n$ lidars.

Figure 4.8: Transformation from synchronization
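As a concrete illustration, the following is a minimal sketch of equation (4.4) using Eigen and PCL, assuming the global poses $T(t_1)$ and $T(t_2)$ are available as Eigen isometries; the function and point type names are illustrative and not the thesis implementation.

```cpp
// Minimal sketch of equation (4.4): relate two time-stamped lidar poses and
// re-express the second scan at the reference time-stamp t1. Assumes the
// global poses T(t1) and T(t2) are given as Eigen isometries.
#include <Eigen/Geometry>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/common/transforms.h>

pcl::PointCloud<pcl::PointXYZI>::Ptr
synchronizeScan(const Eigen::Isometry3d& T_t1,  // global pose at reference time t1
                const Eigen::Isometry3d& T_t2,  // global pose at scan time t2
                const pcl::PointCloud<pcl::PointXYZI>& scan_t2)
{
  // T = T(t1)^{-1} T(t2), the motion between the two time-stamps.
  const Eigen::Isometry3d T = T_t1.inverse() * T_t2;
  const Eigen::Matrix4f T_f = T.matrix().cast<float>();

  // Transform the second scan so it is expressed at the reference time-stamp.
  pcl::PointCloud<pcl::PointXYZI>::Ptr synced(new pcl::PointCloud<pcl::PointXYZI>);
  pcl::transformPointCloud(scan_t2, *synced, T_f);
  return synced;
}
```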

4.5.2 Implementation

To implement this method, a set of prerequisites must be addressed. The first is related to time-stamping the data. Over the last two years, ROS-time has been used to time-stamp the incoming data, and it is used for this project as well. This means that the data is time-stamped on arrival at the processing unit on the vehicle; a consequence of this is that the delay from sending the data from the sensor to the processing unit is not accounted for. There are ways of synchronizing the time across the sensors, including GPS time and the NTP protocol [22]. If the performance of the vehicle is significantly affected by this, it is something that can be considered in future projects.

The other aspect is how to retrieve the global transformation $T(t_i)$ at time $t_i$. This is done by relating each incoming point cloud callback to a time-stamp $t_i$, and then finding the matching state estimation callback whose time-stamp is as close as possible to $t_i$. The lidars publish point clouds at 20 Hz, while the Vectornav VN-300 publishes at 400 Hz, which means that it is possible to find a state estimation callback with a time-stamp approximately equal to $t_i$. The worst-case time deviation is half the state estimation period, $(1/400\,\mathrm{Hz})/2 = 1.25 \cdot 10^{-3}\,\mathrm{s}$, which at $20\,\mathrm{m/s}$ corresponds to $1.25 \cdot 10^{-3}\,\mathrm{s} \cdot 20\,\mathrm{m/s} = 2.5\,\mathrm{cm}$.
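A sketch of this matching could look as follows, assuming the state estimates are buffered in a map keyed by their ROS time-stamps; all names here are illustrative, not the actual implementation.

```cpp
// Hypothetical sketch of the time-stamp matching: buffer the 400 Hz state
// estimates and, for each point cloud time-stamp t_i, return the estimate
// whose stamp is closest. Assumes the buffer is non-empty.
#include <iterator>
#include <map>

struct StateEstimate { /* pose from the state estimation callback */ };

std::map<double, StateEstimate> buffer;  // key: ROS time-stamp in seconds

const StateEstimate& closestEstimate(double t_i)
{
  auto upper = buffer.lower_bound(t_i);            // first stamp >= t_i
  if (upper == buffer.begin()) return upper->second;
  if (upper == buffer.end())   return std::prev(upper)->second;
  auto lower = std::prev(upper);
  // Pick the neighbour with the smaller deviation (at most 1.25 ms at 400 Hz).
  return (t_i - lower->first) < (upper->first - t_i) ? lower->second
                                                     : upper->second;
}
```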

4.6 Localization

To determine the class of a cone candidate, it first needs to be located in the point cloud. This is achieved by implementing a localization algorithm that uses a set of filtering techniques alongside conditional Euclidean clustering to find regions of interest in 3D-space. The following section introduces the Euclidean clustering based on an approach given by R. B. Rusu [47]. Finally, the suggested localization framework is introduced.

4.6.1 Euclidean Clustering

Euclidean clustering uses the Euclidean norm to evaluate whether a set of points belongs to a distinct cluster. First the point cloud is divided into smaller sections, which reduces the processing time since a cluster is found by looking at neighbouring points. The kd-tree representation is a typical space-partitioning structure that organizes points in a space with $k$ dimensions; for a point cloud this is three dimensions. A thorough introduction to kd-trees is given in [6]. Given points $p_i \in P$ and $p_j \in P$ from two candidate clusters and a defined threshold $d_{th}$, the two clusters are said to be distinct if the minimum Euclidean distance between their points satisfies

\[
\min \lVert p_i - p_j \rVert_2 \geq d_{th}. \tag{4.5}
\]

A set of points is found by nearest neighbour search using the kd-tree structure. For every point $p$ in the point cloud, the neighbours within a radius $r < d_{th}$ are said to belong to the same cluster as $p$.
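As a sketch, this radius search can be expressed with PCL's kd-tree as below; the point type and parameter names are illustrative.

```cpp
// Minimal sketch of the nearest neighbour search: a kd-tree radius search
// around a point p, returning the indices of the points that belong to the
// same cluster as p (for a radius r < d_th).
#include <pcl/kdtree/kdtree_flann.h>
#include <pcl/point_types.h>
#include <vector>

std::vector<int>
neighboursWithin(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr& cloud,
                 const pcl::PointXYZ& p, float radius)
{
  pcl::KdTreeFLANN<pcl::PointXYZ> tree;
  tree.setInputCloud(cloud);

  std::vector<int> indices;         // indices of neighbours within the radius
  std::vector<float> sq_distances;  // squared distances to p
  tree.radiusSearch(p, radius, indices, sq_distances);
  return indices;
}
```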

4.6.2 Localization Framework

The first step is to filter out irrelevant data. Since the topology of the racetrack is consistently flat, and since the sparsity of data results in a maximum possible detection range, a deterministic xyz-filter is applied. This is done by neglecting points beyond user-defined xyz-constraints.

Secondly, a voxelgrid downsampling is performed. This works by dividing the point cloud into boxes (voxels) with a user-defined size. All points located inside a voxel are represented by one point, the centroid. These two steps reduce the size of the point cloud, resulting in faster processing time for the clustering.
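A minimal sketch of these two steps with PCL could look as follows; the constraint limits and voxel size are illustrative placeholders for the tuned values in table 4.4.

```cpp
// Sketch of the pre-filtering: a pass-through xyz-filter followed by
// voxelgrid downsampling. Limits and leaf size are illustrative.
#include <pcl/filters/passthrough.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/point_types.h>

pcl::PointCloud<pcl::PointXYZI>::Ptr
preFilter(const pcl::PointCloud<pcl::PointXYZI>::ConstPtr& raw)
{
  // Neglect points outside the user-defined x-constraint (repeated for y and z).
  pcl::PointCloud<pcl::PointXYZI>::Ptr cropped(new pcl::PointCloud<pcl::PointXYZI>);
  pcl::PassThrough<pcl::PointXYZI> pass;
  pass.setInputCloud(raw);
  pass.setFilterFieldName("x");
  pass.setFilterLimits(0.0, 30.0);  // illustrative limits in metres
  pass.filter(*cropped);

  // Represent all points inside a voxel by their centroid.
  pcl::PointCloud<pcl::PointXYZI>::Ptr downsampled(new pcl::PointCloud<pcl::PointXYZI>);
  pcl::VoxelGrid<pcl::PointXYZI> voxel;
  voxel.setInputCloud(cropped);
  voxel.setLeafSize(0.05f, 0.05f, 0.05f);  // illustrative voxel size in metres
  voxel.filter(*downsampled);
  return downsampled;
}
```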

The conditional Euclidean clustering uses certain user-defined conditions to evaluate the point cloud. These are the maximum and minimum number of points that can be in a cluster, the threshold $d_{th}$, and a condition that states that points from at least two different channels have to be present in a cluster. The last condition ensures that a certain height is preserved in the clustering, and that clusters with only horizontal points are not considered as candidates. When the clustering finds a candidate, the width and the height between the outermost points in the cluster are checked to ensure they are within the cone width $w_c = 228\,\mathrm{mm}$ and height $h_c = 325\,\mathrm{mm}$. Finally, the set of points representing a cluster is reduced to one point, the centroid of the cluster.
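A sketch with PCL's conditional Euclidean clustering is given below. The pairwise grow-condition shown is a placeholder, the channel and dimension checks are indicated as post-processing, and all parameter values are illustrative.

```cpp
// Hypothetical sketch of the conditional Euclidean clustering step.
#include <pcl/point_types.h>
#include <pcl/segmentation/conditional_euclidean_clustering.h>

// Pairwise condition used while growing a cluster (placeholder: accept all
// candidates already within the cluster tolerance d_th).
bool growCondition(const pcl::PointXYZI&, const pcl::PointXYZI&, float)
{
  return true;
}

pcl::IndicesClusters
findCandidates(const pcl::PointCloud<pcl::PointXYZI>::ConstPtr& cloud)
{
  pcl::ConditionalEuclideanClustering<pcl::PointXYZI> cec;
  cec.setInputCloud(cloud);
  cec.setConditionFunction(&growCondition);
  cec.setClusterTolerance(0.3);  // d_th in metres (illustrative)
  cec.setMinClusterSize(2);      // min points per cluster (illustrative)
  cec.setMaxClusterSize(200);    // max points per cluster (illustrative)

  pcl::IndicesClusters clusters;
  cec.segment(clusters);
  // Post-processing (not shown): keep only clusters that span at least two
  // channels and fit within w_c = 228 mm and h_c = 325 mm, then reduce each
  // cluster to its centroid.
  return clusters;
}
```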

The final filter performs outlier radius rejection to filter out candidates that are too close to each other. This exploits the fact that cones are spaced a certain distance apart on the racetrack to remove false positives. The filter checks how many candidates there are within a user-defined radius and neglects them if there are too many. When a cone candidate is found, the point representing the cluster is reconstructed by taking the raw data from the input point cloud within a radius of the candidate. The localization flowchart is illustrated in figure 4.9.
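The rejection can be sketched as below. Note that PCL's built-in RadiusOutlierRemoval removes points with too few neighbours, which is the opposite of what is described here, so a small custom check is shown instead; names and values are illustrative.

```cpp
// Hypothetical sketch of the radius rejection: neglect a candidate if too
// many other candidates lie within a user-defined radius.
#include <pcl/kdtree/kdtree_flann.h>
#include <pcl/point_types.h>
#include <vector>

pcl::PointCloud<pcl::PointXYZ>::Ptr
rejectCrowded(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr& candidates,
              float radius, int max_neighbours)
{
  pcl::KdTreeFLANN<pcl::PointXYZ> tree;
  tree.setInputCloud(candidates);

  pcl::PointCloud<pcl::PointXYZ>::Ptr kept(new pcl::PointCloud<pcl::PointXYZ>);
  std::vector<int> idx;
  std::vector<float> sq_dist;
  for (const auto& c : candidates->points) {
    // radiusSearch also returns the query point itself, hence the +1.
    if (tree.radiusSearch(c, radius, idx, sq_dist) <= max_neighbours + 1)
      kept->push_back(c);
  }
  return kept;
}
```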

The performance of the localization algorithm is highly dependent on the user-defined parameters, which need to be tuned to ensure good candidate localization. The tuning parameters are summarized in table 4.4. The voxel size and the clustering parameters are particularly dependent on each other, and are the main parameters that define whether the cone candidate extraction is successful. For instance, increasing the voxel size results in a greater distance between the points, so the threshold $d_{th}$ might need to be increased in order to find relevant clusters.

The localization algorithm is implemented in C++ as a ROS node that subscribes to the raw point cloud, either directly from the lidar ROS-driver or from the merged data from the lidar-lidar fusion. All the filtering techniques and the clustering are implemented using the PCL.

Figure 4.9: Localization flowchart

Procedure                 Tuning parameter
xyz-filter                (x, y, z)-constraints
Voxelgrid downsampling    voxel size
Clustering                min/max points, threshold $d_{th}$
Outlier removal           radius

Table 4.4: Tuning parameters

4.7 2D Projection

The reconstructed data from the localization framework consists of all the points that represent a cone candidate. As mentioned in section 3.3, representing the data as an image is thought to be a good method for classification purposes. The reason for this is that a cone has a triangular shape that is captured by the lidar. The proposed method for image generation, or 2D projection of the data, is given in the following sections, but first the classification range is evaluated.

4.7.1 Evaluation of Classification Range

Due to the sparsity of data, the number of points hitting a cone decreases as the distance to the cone increases. The localization system is able to find a cone candidate with only two points hitting it, but this results in a low-information image. The localization system should therefore be capped at a certain distance, depending on whether one or two lidars are used.

To determine the distance at which the classification is to be capped, the specifications of the lidars are evaluated alongside the approximate placements of the lidars on the vehicle/trolley. To classify the cone color, the intensity pattern across the height of a cone should be recognized. This means that the dark patch on a yellow cone, and the light patch on a blue cone, should be present in the data. To classify the cone shape, it is thought that at least three vertical channels should hit the cone. The decided capping distances are summarized in table 4.5 and illustrated in figure 4.10. In the illustrations, the pink line represents the detection range for shape classification, and the yellow and black lines represent the detection range for color classification. The yellow and black lines also represent the true placement of the black/white patch on a cone.

Lidar configuration   Color classification   Shape classification
Pandar40              15 m                   24 m
Pandar20B             12 m                   24 m
Lidar-lidar fusion    17 m                   28 m

Table 4.5: Maximum classification distance for various lidar configurations

Figure 4.10: Illustration of vertical resolution and classification range for different lidar configurations

4.7.2 Proposed Method

The cones define the boundary of the track and are therefore mostly situated at an angle relative to the body frame of the vehicle. To achieve a correct front view of the candidates, all the points in a candidate are rotated to the Y-Z plane.
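A minimal sketch of this rotation is given below, assuming the bearing of a candidate is taken from its centroid; this is an assumption for illustration, not necessarily how the thesis implementation obtains the angle.

```cpp
// Sketch of the front-view rotation: rotate the candidate about the z-axis by
// the negative of its bearing so its front view lies in the Y-Z plane.
#include <Eigen/Geometry>
#include <cmath>
#include <pcl/common/centroid.h>
#include <pcl/common/transforms.h>
#include <pcl/point_types.h>

void rotateToYZ(pcl::PointCloud<pcl::PointXYZI>& candidate)
{
  // Bearing of the candidate centroid in the vehicle body frame (assumption).
  Eigen::Vector4f c;
  pcl::compute3DCentroid(candidate, c);
  const float yaw = std::atan2(c.y(), c.x());

  // Rotating by -yaw about z places the candidate on the x-axis, so viewing
  // along x gives the correct front view in the Y-Z plane.
  Eigen::Affine3f t(Eigen::AngleAxisf(-yaw, Eigen::Vector3f::UnitZ()));
  pcl::transformPointCloud(candidate, candidate, t);
}
```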

After the rotation, each point is mapped to an $n \times n$ image. The size of the image is chosen to be $28 \times 28$, as this is small enough to give a fast processing time and large enough to capture the shape of a cone. This is something that can easily be adjusted. The mapping is performed by dividing the width and the height of a candidate into an $n \times n$ grid, and giving a binary value to a cell in the grid if a point relates to that cell. If several points relate to the same cell, the value of the cell is still one. The result is an $n \times n$ image with ones in the pixels where there are points. This is illustrated in figure 4.11, which shows how a $10 \times 10$ image could look.
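A sketch of this mapping is shown below, using a flat vector as the image buffer; the function name and bounding-box handling are illustrative.

```cpp
// Sketch of the binary 2D projection: bin the (already rotated) candidate
// points onto an n x n grid; a cell is one if at least one point falls in it.
#include <algorithm>
#include <cstdint>
#include <vector>
#include <pcl/common/common.h>
#include <pcl/point_types.h>

std::vector<std::uint8_t>
projectTo2D(const pcl::PointCloud<pcl::PointXYZI>& candidate, int n = 28)
{
  std::vector<std::uint8_t> image(n * n, 0);  // row-major n x n binary image

  // Bounding box of the candidate in the Y-Z plane.
  pcl::PointXYZI min_pt, max_pt;
  pcl::getMinMax3D(candidate, min_pt, max_pt);
  const float width  = max_pt.y - min_pt.y;
  const float height = max_pt.z - min_pt.z;
  if (width <= 0.0f || height <= 0.0f) return image;  // degenerate candidate

  for (const auto& p : candidate.points) {
    // Divide width and height into n cells and clamp to the valid range.
    const int col = std::min(static_cast<int>((p.y - min_pt.y) / width * n), n - 1);
    const int row = std::min(static_cast<int>((max_pt.z - p.z) / height * n), n - 1);
    image[row * n + col] = 1;  // several points in one cell still give one
  }
  return image;
}
```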