Chinese Journal of Information Fusion
ISSN: 2998-3371 (Online) | ISSN: 2998-3363 (Print)
Email: [email protected]
With the widespread adoption of high-precision sensors in fields such as autonomous driving, robot navigation, and security monitoring, extended target tracking (ETT) has emerged as a research hotspot [1, 2]. Compared with traditional point target tracking, ETT must handle multiple information dimensions, such as position, shape, and velocity, which places more stringent requirements on the accuracy and complexity of tracking algorithms [3].
To track extended targets effectively, numerous methods have been proposed. For example, Granstrom et al. [4] introduced the probability hypothesis density (PHD) filter and the cardinalized probability hypothesis density (CPHD) filter into the ETT field. Two measurement set partitioning methods [5] were then added to the filters in [4]. Additionally, Habtemariam et al. [6] integrated a measurement unit generation strategy with joint probability data association (JPDA), thereby proposing the multi-detection joint probability data association (MD-JPDA) method. Zhang et al. [7] applied the cardinality balanced multi-target multi-Bernoulli (CBMeMBer) algorithm to the ETT task. In [8], the generalized labeled multi-Bernoulli (GLMB) and Gamma Gaussian inverse Wishart (GGIW) distributions were used to model the states and extension characteristics of multiple extended targets. The GGIW Poisson model was then embedded into the multi-Bernoulli filter to handle multiple extended target tracking [9]. Recently, an approach based on irregular probability distributions has also been proposed for this problem [10].
However, when extended targets are occluded or densely distributed, their non-point nature and complex interactions in dynamic environments make the methods above prone to target splitting or merging. For example, in Figure 1(a), the radar of the car is occluded by a motorcycle, so the truck is identified as two split targets, as shown in Figure 1(c). The key to solving this problem lies in accurately distinguishing whether the target has truly split or the new detection is merely a false alarm.
A few related works address the above problem. For visual targets, [11] detected target splitting positions and segmented trajectories by stacking temporal dilated convolution blocks and using an adaptive Gaussian smoothing label strategy. For missile targets, [12] constructed a mathematical model for splitting event detection and tracking within the joint integrated probabilistic data association (JIPDA) framework, achieving point target splitting determination and tracking through probability calculations of splitting events. [13] optimized the de-correlation time of group targets using Pareto analysis based on the interactive multiple model-unscented Kalman filter (IMM-UKF) framework, which essentially performs data association on point targets within group targets. It is important to note that these methods use only the position data of point targets. Applying them directly to extended target tracking cannot fully exploit the extended information of targets, which leads to poor performance. To the best of our knowledge, no existing work addresses the splitting and merging of extended targets.
Motivated by this, we aim to use extended information to determine and fuse split targets accurately. To this end, we first analyze the extended target splitting problem with PHD-based filters and then examine the similarity of the track features of extended targets. Next, we extend the spatio-temporal clue-aware trajectory similarity (CATS) method [14] to ETT by integrating the Gaussian Wasserstein (GW) distance. Finally, we develop an extended target split error correction algorithm.
In summary, the main contribution of this paper is a method that solves the splitting problem of extended targets by using their spatio-temporal trajectories and extended information. Furthermore, to the best of our knowledge, this is the first work to address the splitting of extended targets.
The organization of this paper is as follows. Section 2 describes the problem of extended target splitting in detail. Section 3 analyzes the information of split tracks. Section 4 elaborates on the proposed splitting determination method. Section 5 builds a simulation scenario to verify the effectiveness of the proposed method. Section 6 summarizes the entire paper.
In this paper, a two-dimensional ellipse is used to represent an extended target. The extended target state is defined as a triple:
where represents the measurement rate, represents the kinematic state, which includes its position , velocity and turn-rate that characterizes the rate of alteration in the direction of the velocity vector , where denotes the set of real vectors. represents the extended geometric information that includes the shape, size and direction of the ellipse extended target and
where denotes the set of symmetric positive definite matrices. The rotation matrix and the diagonal matrix are represented as follows:
where is the rotation angle of the ellipse, and are defined as the major/minor axes of the ellipse and controls the rotation.
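As a minimal sketch of how an elliptical extent can be assembled from a rotation matrix and a diagonal axis matrix, the Python snippet below builds a symmetric positive definite matrix from a rotation angle and two semi-axis lengths. The function names and the convention of placing the squared semi-axes on the diagonal are assumptions made for illustration; the scaling used in this paper may differ.

```python
import numpy as np

def rotation_matrix(theta: float) -> np.ndarray:
    """2-D counter-clockwise rotation by theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def extent_matrix(theta: float, semi_major: float, semi_minor: float) -> np.ndarray:
    """Assumed convention: X = R diag(a^2, b^2) R^T for semi-axes a and b."""
    R = rotation_matrix(theta)
    D = np.diag([semi_major ** 2, semi_minor ** 2])
    return R @ D @ R.T

# Hypothetical example: a 4.7 m x 1.8 m vehicle rotated by 30 degrees.
X = extent_matrix(np.pi / 6, 2.35, 0.9)
# The eigenvalues of X are the squared semi-axes, whatever the rotation.
print(np.sqrt(np.linalg.eigvalsh(X)))
```

Because the rotation only changes the eigenvectors, the semi-axis lengths can always be recovered from the eigenvalues of the extent matrix.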
PHD-based filters, such as the GGIW-PHD and GGIW-CPHD filters [15], are widely used for multiple extended target tracking. To formulate the target splitting problem, we take a PHD-based filter as the front-end process. Assume that the GGIW-PHD filter [16] outputs a set of labeled extended target tracks. Specifically, at time step , the track information obtained from the front-end tracker is represented as , where is the total number of tracks in the set. Each element is defined as:
where is the unique index (it is referred to as label in the following text) of each track, denotes the extended target detection time, denotes the extended target state, denotes the "age" that target exists.
By grouping together the elements from different time steps with the same label , we can obtain the track sequence arranged in chronological order:
where and denote the start and end time step of track sequence. It should be noted that in this paper, track sequence will be called "track" and will be called the "element" of track .
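The grouping of labeled elements into chronologically ordered tracks can be sketched as follows. The `Element` class and its field names are hypothetical stand-ins for the label, detection time, state, and age described above; they are not taken from the GGIW-PHD implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

@dataclass(frozen=True)
class Element:
    label: int      # unique track index
    time: float     # detection time
    state: tuple    # placeholder for the extended target state
    age: int        # number of steps the target has existed

def group_tracks(elements: Iterable[Element]) -> Dict[int, List[Element]]:
    """Collect elements that share a label and sort each track by time."""
    tracks: Dict[int, List[Element]] = defaultdict(list)
    for element in elements:
        tracks[element.label].append(element)
    for sequence in tracks.values():
        sequence.sort(key=lambda e: e.time)
    return dict(tracks)

# Hypothetical tracker output arriving out of chronological order.
output = [
    Element(1, 0.2, (), 2),
    Element(2, 0.2, (), 1),
    Element(1, 0.1, (), 1),
]
tracks = group_tracks(output)
print({label: [e.time for e in seq] for label, seq in tracks.items()})
```

The first and last entries of each sorted list then give the start and end time steps of the corresponding track.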
Consider the autonomous driving scenario shown in Figure 1. At time step , in addition to the original surviving target track with label , there appears a new track with label , indicating the potential emergence of a new extended target. There are three possibilities for this new track:
The problem is to determine accurately which of the above three cases the new target state belongs to. The goal of this paper is therefore to propose an effective method for determining whether an extended target has split and, if so, to select an appropriate method for fusing the two tracks.
In this section, we analyze the track features of split extended targets.
As described above, the information of an elliptical extended target includes position , detection time and geometric information . This ellipse can be interpreted as the following Gaussian distribution [17]:
where is a user-defined scaling factor related to the tolerance region. In addition, owing to the uncertainty of sensor measurement and data processing, the detection time can also be considered to obey the following Gaussian distribution:
where is time interval and represents the scaling factor.
Suppose that an existing track moves in a two-dimensional plane and, after being processed by the PHD-based filter, splits into two tracks and with distribution sets and . Their center sets and can then each be connected as a curve in three-dimensional space-time, as shown in Figure 2.
It can be observed that, owing to various factors, the two distributions deviate from each other in the region where they should overlap. The deviations have the following characteristics: those in detection time are highly random, those in spatial position are relatively fixed, and those in rotation angle are slight. Additionally, some measurement data are missing. Hence, if the two tracks originate from the same track, they form a kind of spatio-temporal prism structure [18] with a range of uncertainty. We can therefore use the similarity of historical track information to determine the split possibility of an extended target.
To determine the split possibility, we choose the clue-aware trajectory similarity (CATS) method based on spatial and temporal information in [19]. Its main idea is to find potential "matching points" on the two tracks when evaluating the spatial and temporal similarity. However, since the center point alone cannot represent the extent of an ellipse, applying the CATS method directly gives unsatisfactory results. Therefore, we propose a new method called GW-CATS to determine the splitting of extended targets, which is elaborated in the next section.
Before introducing the determination method, we first introduce the Gaussian Wasserstein distance [17].
The elements and define two elliptical extended targets that follow the Gaussian distributions below:
The Gaussian Wasserstein distance between the two extended targets serves as the similarity metric and is defined as:
where represents the trace operator.
This metric simultaneously captures positional offsets and quantifies the congruence between the two targets' shapes through their covariance matrices. In this article, will be represented by the shorthand notation .
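For illustration, the widely used closed form of the 2-Wasserstein distance between two Gaussians, which combines the squared Euclidean distance of the means with a trace term on the covariances, can be sketched in Python as below. The function names are hypothetical, and the covariances are passed in as given, so any scaling factor applied to the extent matrix is assumed to have been included by the caller.

```python
import numpy as np

def sqrtm_psd(A: np.ndarray) -> np.ndarray:
    """Principal square root of a symmetric positive semi-definite matrix."""
    A = 0.5 * (A + A.T)                 # remove numerical asymmetry
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)           # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T

def gw_distance(m1, S1, m2, S2) -> float:
    """GW distance between N(m1, S1) and N(m2, S2).

    d^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2})
    """
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    S1, S2 = np.asarray(S1, float), np.asarray(S2, float)
    root = sqrtm_psd(S1)
    cross = sqrtm_psd(root @ S2 @ root)
    d2 = np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross)
    return float(np.sqrt(max(d2, 0.0)))  # clip round-off below zero

# Same shape, centers 5 m apart: only the positional term contributes.
S = np.diag([4.0, 1.0])
print(gw_distance([0.0, 0.0], S, [3.0, 4.0], S))
```

When the two covariances are equal, the trace term vanishes and the distance reduces to the Euclidean distance between the centers; differences in size or orientation add to it through the trace term.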
The main idea of the CATS method is to evaluate the spatial and temporal similarity of different tracks. The core workflow is as follows: First, temporal and spatial thresholds are set to filter elements contained in the two tracks, selecting element pairs from different tracks that are temporally and spatially close. Subsequently, the spatial distances between these element pairs are normalized to identify the most similar pairs. Lastly, the similarity between two tracks is computed as the average of all normalized similarity scores of their best-matching element pairs.
Since CATS handles point targets through the Euclidean distance, it fails to account for the extent of the target. We therefore propose the GW-CATS method, which addresses this limitation by incorporating geometric information, enabling a more reasonable use of extended information. The detailed implementation of the proposed method is as follows.
At time step , suppose that the track information is obtained from the front-end tracker, where represents a newborn track and represents an existing track. To calculate the similarity between track and track , the following four-step process is adopted:
Step 1: Finding Spatio-temporal Matching Elements
Given a spatial threshold , a time threshold , two elements and , if and satisfy the following conditions:
(1) , (2) ,
then we call is the spatio-temporal matching element of and is a spatio-temporal matching pair.
Similar to the CATS method, we set a time threshold and a space threshold to compensate for the uncertainty of target kinematics and sensor measurements. Because velocity information is available for extended targets, once the user-defined time threshold has been chosen according to the actual situation, the spatial threshold can be calculated as follows.
As shown in Figure 3, the initial center position of extended target is set as , its speed is , and turning rate perpendicular to the direction of the velocity vector is . After passing time , the center of the target reaches the position . Since is relatively small, the target speed can be approximated as constant during the motion. Then, the Euclidean distance between point and can be expressed as:
where denotes the 2-norm operator. Using as the radius, a validation gate is constructed to filter out spatially irrelevant elements, as shown by the gray region in Figure 3. Since is relatively small, the rotation angle of the ellipse remains small. Neglecting rotational effects, we derive:
The detailed derivation is provided in the Appendix.
Step 2: Similarity Calculation of Matching Elements
For any element in the reference track, the number of matching elements from other tracks may be zero, one, or multiple.
To distinguish matching elements and find the most similar matching pair, we normalize the similarity scores to the range [0,1], which enables selection of the best match. The similarity of matching elements is calculated as follows [20]:
where is a matching element of and the value of the function is limited to [0,1]. The closer the position and geometric information of two ellipses are, the larger the function value, indicating greater similarity. If two extended targets are exactly the same, the function value equals 1. For brevity, will be abbreviated to .
Step 3: Best Matching Element Confirmation
After similarity calculation, we can confirm the best matching element of . Suppose that track contains elements, for , if:
then we call the best matching element of , where represents any matching element of in track .
Best matching element pairs are defined as those that optimally capture the same kinematic characteristics. When two tracks are hypothesized to originate from the same physical target, our objective is to systematically identify these optimal element pairs, thereby enabling the subsequent processing step.
Step 4: Similarity Calculation of Two Tracks
Finally, after the above three steps, we obtain the best matching element and the matching value of each element in track . The inter-track similarity is determined by averaging the normalized similarity scores across all matched element pairs. Thus, the spatio-temporal similarity of track to track is defined as:
where refers to the number of elements in and is the corresponding best matching element of . For brevity, will be abbreviated to . In summary, a complete pseudocode implementation of the proposed method is provided in Algorithm 1.
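The four steps above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not Algorithm 1 itself: the element layout `(time, center)`, the stand-in `center_distance` function (where a GW distance would be plugged in), the linear score `1 - d / eps`, and the choice that unmatched elements contribute zero are all assumptions made for this example.

```python
import math
from typing import Callable, Sequence, Tuple

# Hypothetical element layout for this sketch: (detection_time, (x, y)).
Element = Tuple[float, Tuple[float, float]]

def center_distance(a: Element, b: Element) -> float:
    """Stand-in distance on ellipse centers; a GW distance would replace it."""
    return math.dist(a[1], b[1])

def track_similarity(
    new_track: Sequence[Element],
    ref_track: Sequence[Element],
    tau: float,
    eps: float,
    distance: Callable[[Element, Element], float] = center_distance,
) -> float:
    """Average best-match score of new_track with respect to ref_track."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    if not new_track:
        return 0.0
    total = 0.0
    for a in new_track:
        best = 0.0  # assumed: an element with no match contributes 0
        for b in ref_track:
            # Step 1: keep only pairs that are close in time and in space.
            if abs(a[0] - b[0]) > tau:
                continue
            d = distance(a, b)
            if d > eps:
                continue
            # Step 2: normalized score in [0, 1] (assumed linear form).
            score = 1.0 - d / eps
            # Step 3: remember the best matching element for a.
            best = max(best, score)
        # Step 4: accumulate for the average over all elements of new_track.
        total += best
    return total / len(new_track)

new = [(0.0, (0.0, 0.0)), (1.0, (1.0, 0.0))]
ref = [(0.0, (0.0, 0.0))]
print(track_similarity(new, ref, tau=0.5, eps=2.0))  # only the first element matches
```

Note that the measure is not symmetric in general: it averages over the elements of the newborn track, so swapping the two arguments can change the result.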
For any newborn track , its similarity score with respect to each established independent track can be systematically computed through Algorithm 1. By establishing a similarity threshold , we implement the following decision rule:
If two tracks are determined to be similar, the Monte Carlo Minimum Mean Gaussian Wasserstein (MC-MMGW) method can be used to fuse the information of the two extended targets. For details of the fusion method, see [21].
In this section, we set up a highway autonomous driving simulation scenario to evaluate the proposed GW-CATS method. We use the optimal sub-pattern assignment (OSPA) [22] and generalized optimal sub-pattern assignment (GOSPA) [23] metrics to verify its effectiveness.
We first set up the simulation scenario, which spans time steps in total. The scenario is a two-dimensional three-lane highway with a lane width of 3.5 m. The road centerline coordinates are [0 0 ; 50 0 ; 100 0 ; 250 20 ; 400 35] m. There are five vehicles on the highway in total, each traveling along its own lane. Target parameters are listed in Table 1.
Parameter | Dimensions (m²) | Velocity (m/s) | Lifetime (s) |
---|---|---|---|
RadarCar | 4.7 × 1.8 | | 0.1-13.6 |
Target 1 | 4.7 × 1.8 | 25 | 0.1-13.6 |
Target 2 | 2.0 × 1.0 | 24 | 0.2-13.6 |
Target 3 | 4.7 × 1.8 | 26 | 0.3-13.6 |
Target 4 | 9.3 × 2.2 | | 0.1-13.6 |
Specifically, Target 4 represents a motorcycle, Target 1 is a truck, and Targets 2 and 3 are standard vehicles. The RadarCar is an autonomous vehicle equipped with four radars, whose parameters are listed below:
Left/Right radars: 160° detection angle, 30 m range.
Front/Rear radars: 30° detection angle, 50 m range.
Detection probability .
False alarm rate .
Clutter intensity (Poisson point process distribution).
The GGIW-PHD filter is used to track these targets; its parameters are shown in Table 2. For details of the parameters, see [16].
Parameter | Value |
---|---|
Birth rate | |
Death rate | |
Assignment threshold | 220 |
Extraction threshold | 0.8 |
Confirmation threshold | 0.95 |
Deletion threshold | |
Labeling threshold | [1.1 1 0.8] |
Merging threshold | 50 |
A. Selection of Time Threshold
The selection of the time threshold combines experience and mathematical principles. In the GW-CATS method, the time threshold physically represents the maximum acceptable time interval between the split target and the original target. Its value is based on the theory of spatio-temporal trajectory similarity: if two trajectories originate from the split of the same target, their spatio-temporal distributions should remain continuous over a finite time.
Taking the simulation scenario in this section as an example, the speed of the split truck is and its length is , then the time required for the target to completely cross its own length is . Considering the geometric uncertainty of the elliptical target after splitting, is finally selected as the equilibrium value. In order to verify the rationality of the threshold, a comparative experiment of is designed, and the key parameters are set as follows:
The experiments show that the tracking performance is optimal when . When , the real split targets cannot be merged because the time window is too narrow, and when , adjacent targets are prone to false merging. The OSPA metrics for different values of are shown in Figure 5. In what follows, of the proposed GW-CATS method is uniformly taken as .
B. Single-Run Results
Figure 4(a) shows the GGIW-PHD tracking results. All detected targets are represented as elliptical extended targets. It is evident that target 1 (truck) exhibits distinct splitting after it is occluded by target 4 (motorcycle), resulting in several false newborn tracks (targets 5-12).
For comparison, Figure 4(b) shows the result with the proposed GW-CATS method. From to , the truck target split into several false targets. Some of these false targets disappeared during their movement, while others remained until . Taking target 6 generated at k = 72 as an example, after iterations, at , the similarity between new track 6 and tracks 1-4 is [0.845, 0.629, 0, 0.627], so the proposed GW-CATS method decided to fuse targets 6 and 1. Further, at and , the similarity between track 6 and track 1 is 0.896 and 0.910, respectively, so the target fusion process continued, and ultimately the proposed GW-CATS method successfully completed the split determination task. The fusion result is shown in Figure 6.
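The decision made for track 6 can be reproduced with a small threshold check on the similarity values quoted above. In this sketch the function name and the threshold value of 0.8 are hypothetical and chosen only for illustration, and taking the most similar established track is one plausible reading of the decision rule rather than its exact form.

```python
from typing import Dict, Optional

def select_fusion_candidate(similarities: Dict[int, float], threshold: float) -> Optional[int]:
    """Return the label of the most similar established track, or None
    when no similarity reaches the threshold (new track kept as is)."""
    if not similarities:
        return None
    label = max(similarities, key=similarities.get)
    return label if similarities[label] >= threshold else None

# Similarities of new track 6 to tracks 1-4, as reported in the text.
scores = {1: 0.845, 2: 0.629, 3: 0.0, 4: 0.627}
print(select_fusion_candidate(scores, threshold=0.8))  # hypothetical threshold
```

With these values, track 1 is the only candidate whose similarity exceeds the assumed threshold, which agrees with the reported decision to fuse targets 6 and 1.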
C. Monte Carlo Results
To evaluate the performance of the GW-CATS method, this section introduces the point-target based CATS method and the global nearest neighbor (GNN) method for comparison with the proposed GW-CATS method. The parameters of the GW-CATS method are consistent with those described in 5.1. The specific parameters of the CATS method are as follows:
The key parameter configurations of the GNN method are as follows:
We further conducted 50 Monte Carlo (MC) trials to demonstrate the effectiveness of the proposed GW-CATS method. The tracking error evaluated by the mean OSPA metric is shown in Figure 7. The tracking error evaluated by the mean GOSPA metric with and is shown in Figure 8.
It can be observed that, once the split target is successfully determined, the OSPA and GOSPA values of the GGIW-PHD filter with GW-CATS correction are substantially lower than those of the original GGIW-PHD filter, the GNN method, and the CATS method. This indicates that the GW-CATS method can significantly improve the tracking accuracy of extended targets in occlusion scenes.
To address the splitting correction problem in extended target tracking under occlusion, we propose a novel track splitting determination method named GW-CATS, which integrates the GW distance with spatio-temporal similarity analysis. Simulation results demonstrate that the proposed method successfully identifies target splitting, reduces the OSPA metric in split scenarios, and achieves stable track fusion.
First, let us set time threshold , , , and turn-rate . Then, the direction angle of the velocity vector at the initial time is , the angle of the target's rotation around the center of the circle is and the radius of the arc is . On the one hand, when , it follows that
Then, the Euclidean distance can be calculated by the following formula
On the other hand, when , the velocity displacement formula can be applied directly. Hence, it follows that:
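The two cases of the derivation can be checked numerically with the sketch below. It relies only on standard constant-speed circular-arc geometry: with arc radius equal to speed divided by turn rate and a swept angle equal to turn rate times elapsed time, the straight-line distance between the start and end points is the chord of the arc. The function names and the tolerance used to detect a zero turn rate are assumptions, and any additional terms the spatial threshold may include (for example the ellipse size) are not modeled.

```python
import math
from typing import Tuple

def travel_distance(speed: float, turn_rate: float, tau: float) -> float:
    """Euclidean distance between the start and end centers after time tau."""
    if abs(turn_rate) < 1e-9:           # straight-line case: d = v * tau
        return speed * tau
    radius = speed / abs(turn_rate)     # arc radius
    return 2.0 * radius * abs(math.sin(0.5 * turn_rate * tau))  # chord length

def ct_endpoint(speed: float, turn_rate: float, heading: float, tau: float) -> Tuple[float, float]:
    """End position of a constant-turn motion that starts at the origin."""
    if abs(turn_rate) < 1e-9:
        return speed * tau * math.cos(heading), speed * tau * math.sin(heading)
    r = speed / turn_rate
    x = r * (math.sin(heading + turn_rate * tau) - math.sin(heading))
    y = -r * (math.cos(heading + turn_rate * tau) - math.cos(heading))
    return x, y

# Hypothetical values: 25 m/s, 0.2 rad/s turn rate, 0.4 s time threshold.
x, y = ct_endpoint(25.0, 0.2, 0.3, 0.4)
print(math.hypot(x, y), travel_distance(25.0, 0.2, 0.4))
```

Both printed values agree, and as the turn rate approaches zero the chord length tends to the straight-line displacement, so the two cases are consistent with each other.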