Extraction of Motion Information from Occupancy Grid Map Using Keystone Transform

Hongqi Fan; Dawei Lu; Yanwen Jiang; Achim J. Lilienthal

doi:10.62762/CJIF.2024.361892

CiteScore

2.43

Impact Factor

Volume 1, Issue 1, Chinese Journal of Information Fusion

Volume 1, Issue 1, 2024

Submit Manuscript Edit a Special Issue

Table of Content

1. Introduction
2. One Dimensional Spatail KST
3. Two Dimensional Spatial KST
4. Algorithm Implementation
5. Results
6. Conclusion

Chinese Journal of Information Fusion, Volume 1, Issue 1, 2024: 63-78

Open Access | Research Article | 10 June 2024

Extraction of Motion Information from Occupancy Grid Map Using Keystone Transform

Hongqi Fan 1

Dawei Lu 1 *

Yanwen Jiang 1

Achim J. Lilienthal 2

1 National Key Laboratory of Automatic Target Recognition (ATR), National University of Defense Technology, Changsha 410073, China

2 Perception for Intelligent Systems, Technical University of Munich, Munich, Germany

* Corresponding Author: Dawei Lu, [email protected]

DOI: 10.62762/CJIF.2024.361892

Received: 17 March 2024, Accepted: 06 June 2024, Published: 10 June 2024

Cited by: 3 (Source: Web of Science) , 4 (Source: Google Scholar)

PDF (3.29 MB) Full-Text HTML XML

Article Metrics Cite This Article

Abstract

Considering the tractability of OGM (Occupancy Grid Map) and its wide use in the dynamic environment representation of mobile robotics, the extraction of motion information from successive OGMs are very important for many tasks, such as SLAM (Simultaneously Localization And Mapping), DATMO (Detection and Tracking of Moving Object) and informaiton fusion for situation awareness. In this paper, we propose a novel motion extraction method based on the signal transform, called as S-KST (Spatial Keystone Transform), for the motion detection and estimation from successive noisy OGMs. It extends the KST in radar imaging or motion compensation to 1D spatial case (1DS-KST) and 2D spatial case (2DS-KST) combined multiple hypotheses about possible directions of moving obstacles. Meanwhile, the fast algorithm of 2DS-KST based on Chirp Z-Transform (CZT) is also given, which five steps, i.e. spatial FFT, directional filtering, CZT, spatial IFFT and Maximal Power Detector (MPD) merging and its computational complexity is proportional to the 2D-FFT. Simulation test results for the point objects and the extended objects show that SKST has a good performance on the extraction of sub-pixel motions in very noisy environment, especially for those slowly moving obstacles.

Keywords

mobile robotics

occupancy grid map

moving object

keystone transform

2DS-KST

velocity estimation

situation informaiton fusion

Notifications

$l$ Unit of imaginary number;
$l, m$ Discrete spatial variables for continuous ones $x, y$ respectively;
$i, j$ Discrete spatial frequencies for continuous ones $u, v$ respectively;
$n$ Discrete time variables for continuous one $t$ ;
$k$ Discrete temporal frequency for continuous one $f$ ;
$∆ x = ∆ y = R$ Size of Spatial cell, i.e., sampling interval in space;
$∆ t = T$ Frame period of 2D or 1D data, i.e., sampling interval in time;
$V_{\max}$ Maximum speed of interest, known as a prior for a given problem;
$L$ Number of space cells along $x$ or $y$ direction;
$N$ Number of data frames, i.e. the length of 2D or 1D signal along $t$ ;
$f (x, y, t)$ OGM or image at $t$ instant for continuous case, denoted $f_{t}(x,y)$ as well;
$f (l, m, n)$ OGM or image at $t_{n}$ instant for discrete case, denoted $f_{n}(l,m)$ as well;
$F (\cdot, \cdot, \cdot)$ Fourier transform of $f(\cdot,\cdot,\cdot)$ along any number of dimensions.

1. Introduction

Efficient perception of and reasoning about environments is still a major challenge for mobile robots operating in dynamic, densely cluttered or highly populated environments [1, 2, 3]. In the context of DAS (Driver Assistance System) [1, 4, 5, 6] or industrial field applications [7, 8], environment perception (or monitoring) usually includes three interleaved tasks, i.e., SLAM (Simultaneous Localization And Mapping) [9, 10], DATMO (Detection And Tracking of Moving Objects) [11] and CTM (Cell Transition Mapping) [12], which have already received a wide attention in the society of mobile robotics.

For the three tasks above, one of the most important things is the fast and reliable extraction of motion information from successive sensor observations. It consists of MOD (Moving Object Detection) even as well as their velocity estimation, which directly affects the performance of localization, mapping, moving object tracking [13]. Furthermore, high level tasks such as path planning, collision avoidance, will be affected by it as well [14]. For example, reasoning on behaviors in DAS requires to separate environment into static and dynamic parts [15, 6]. False negatives (dynamic objects) can lead to serious errors in the resulting maps such as spurious objects or misalignments due to localization errors [16]. On the other side, false positives (static objects) will degrade the performance and computability of tracking module, since the complexity of typical object tracking algorithm increases combinatorially with object number. Meanwhile, MOT (Moving Object Tracking) and CTM can directly benefit from the accurate and timely velocity estimation for moving objects in the dynamic environments.

This paper discuss the problem of extracting motion information from successive sensor observations, including MOD and their velocity estimation. In order to solve this problem effectively, the first thing of all is choosing a suitable representation of dynamic environments. Among all the existing environment representations, OGM (occupancy grid map) proposed by Elfes [17, 18] is the most popular one, which maps the environment as an array of probabilistic cells and easily integrates scans from multiple sensors, even from different type of sensors, for instance, sonar, laser range finder, IR camera [6]. Considering this tractability of OGMs and its wide use in SLAM and DATMO, this paper will take OGMs as inputs, and pay attention on the problem of extracting motion information from successive OGMs.

1.1 Related work

Almost all research so far on extracting motion information from OGMs have been done under the framework of DATMO. Petrovskaya et al. [11] propose to classify these methods in three categories: Traditional DATMO, Model-based DATMO and Grid-based DATMO. See [11] and the comprehensive survey [19] for more details about DATMO.

In this paper, we focus our attention on the detection of moving grid cells and their velocity estimation, regardless of the tracking problem on object level. According to different ways of understanding motion, we classify those techniques into three categories: Object-Oriented (OO), Cell-Oriented(CO), and OCcupancy-Oriented(OCO).

1.1.1 OO methods

OO approaches attribute the dynamic changes of successive local OGMs to moving objects. As a result, DATMO are divided into two sub-problems, MOD and MMOT (Multiple Moving Object Tracking) [3, 13, 4]. MOD is to isolate the moving objects in dynamic environments and MMOT is to achieve their state estimation and filter false alarms using typical multiple target tracking algorithms, such as MHT [13, 4] and JPDA [3]. For MOD we are interested in, a consistency based approach, which is based on inconsistencies observed for new data by comparing them with maps constructed by SLAM [13, 4], has often been used. In [3], a time-fading static map is used for consistency detector instead of the static map constructed by SLAM. Another important clue whether the object is moving or not is those moving objects detected in the past. To exploiting this information, a local dynamic grid map is also created to store information about previously detected moving objects in [13, 4]. If an occupied grid cell is near an area that was previously occupied by moving objects, it can be recognized as a potential moving object. There exist at least three drawbacks as follows for OO motion information extraction:

Consistency-based MOD can not effectively detect those slowly moving objects, such as pedestrians on the road, especially in the case of short time interval between consecutive measurements.
Velocity of every cell can not be extracted from OGMs directly, which usually is estimated by subsequent MMOT algorithm or by fusing MOD results with other sensor capable of measuring velocity, such as radar. For example, in [4], the author proposed a generic architecture to solve SLAM and DATMO in dynamic outdoor environments, in which the object detection results were fusing with radar local data and provide the detected objects with their velocities.
The last, perhaps one having most criticisms, is that under the view of OO, MMOT following MOD usually need complex data clustering and association which have combinatorial complexity and whose performance drastically degrade with the number of false alarms of MOD. To suppress the off-road false alarms given by MOD, [20] integrated the road model into their DATMO framework.

1.1.2 CO methods

Different from the viewpoint of OO, CO methods think that the dynamic changes of local OGMs are caused by the motion of gird cells rather than objects. The most remarkable work in this aspect is so called BOF (Bayesian Occupancy Filter), [1] and [21]. The former (4D-BOF) combines the occupancy grid with probabilistic velocity objects and leads to a four dimensional grid representation. However, the latter (2D-BOF) still uses a 2-dimensional occupancy grid but attaches an associated velocity distribution for every cell. It is obviously found that the 4D-BOF can represent overlapping objects with different velocities, while the 2D-BOF has the advantages to be computationally less demanding [6]. [2] evaluated these BOFs respectively through two experiments, the collision danger estimation for 4D-BOF and the human tracking for 2D-BOF. BOFs provide a Bayesian framework for grid-based monitoring of the dynamic environment. It allows us to extract information of motion cells, containing both occupancy and velocity, only based on the sequences of local OGMs. Furthermore, the complex track association operations is no need, as not existing concepts of objects or tracks in BOF model. Thus it is very suit for those applications in which information on the object level is not concerned. However, for those applications in which object level information is concerned, [22] give an easy way to integrate the 2D-BOF in [23, 1] with FCTA (Fast Clustering and Tracking Algorithm). Because BOF estimates the occupancy and velocity values for both static and dynamic parts of the environment, the subsequent FCTA has a dependency of parameters. In order to solve this problem, [6] integrated a novel real time scheme of MOD into the framework of [22] so that the static parts could be effectively removed from the output of BOF. The MOD in [6] is essentially a consistency-based detector, but it is based on transferring occupancy information between consecutive data grids rather than performing a complete SLAM solution. See [6] for more details about its MOD method.

1.1.3 OCO methods

The key insight of OCO methods is that the dynamic changes of local OGMs are caused by the flowing occupancies in grid cells rather not by the motion of cells themselves. Thus velocity is no longer defined for cells, but for occupancy inside the cell boundaries [5]. Therefore, every occupied cell could have more than one ancestor cell, that is to say, it can be having different velocities simultaneously. Another key fact under the OCO viewpoint is the occupancy preservation, which means the occupancy cannot disappear, unless at the border. From this viewpoint, [5] proposed an improved 2D-BOF, called as BOFUM (BOF Using prior Map). In BOFUM, the prior map information is encapsulated into the reachability matrix and its probability, which describe respectively whether a cell C can be reached from an ancestor cell A, and how the probability of this transition is. BOFUM also uses additional velocity states for dynamic model adaption, which clearly differs in previous approaches [21, 2]. As a result, BOFUM can predict the cell transitions more accurately than those CO methods. Similar to the reachability matrix and its probability in [5], [12] proposed a concept of CTMap (Conditional Transition Map) to model the motion patterns in dynamic environments. However, the transition parameters of CTMap was learned from a temporal signal of occupancy in cells in [12] rather than the method using in [5]. To extract the possible transition, a local neighborhood cross-correlation method was used in [12], which resulted in a large computationally demanding and was constrained with noise in OGMs, such as leaves or hands shaking. Moreover, the cross-correlation method can only give the direction of cell transition, but not give the full information of cell velocity. Nonetheless, it is worth pointing that regarding occupancy in cells as a temporal signal in [12] open a new window for motion information extraction form OGMs. Following this idea and based on the OCO viewpoint, a dual PHD filter was proposed [24], which can separate the dynamic and static cells and estimate the posterior occupancy and its flowing velocity in each cell under the random finite sets filtering framework. It provided the formal and strong way to deal with the OGMs for multiple purposes, such as BOF, CTMap building. However, like the most of RFS filters, the import modeling process according to the application scenario is relatively hard for the engineers.

1.2 KST in radar signal processing

In this paper, we attempt to propose a different way of extracting motion information from successive OGMs based on a signal transformation, called the keystone transform (KST) , which has been popularly used in the radar signal processing community. For example, [25] employed the KST was used to remove the linear component of the range migration for the moving target in the synthetic aperture radar (SAR) imaging. [26] proposed the fast implementation of KST based on Chirp-Z transform (CZT) in order to overcome the range migration in the long time coherent accumulation for detecting the dim moving targets. Recently, the high order KST were proposed for the above applications in [27][28] and [29] to correct the three order phase migration of the radar echo signal. For extracting the motion information from the OGMs, there exist many similarities to the range migration correction in the radar signal processing, such as focusing or accumulating the moving object with the unknown velocity, and the concepts of range and Doppler in radar signal processing are like the space cell and the velocity in the OGMs. Nevertheless, there also exist several differences with the KST of radar signal processing. One significant difference is that the radar signal in the complex number field and the OGMs is in the real number field. Another important difference is the unknown motion is usually the redial motion along the line of sight of the radar, although it can be high order motion (i.e. with the non-zero acceleration). However, the spatial motion in the OGMs can be one dimensional along the street way, two dimensional along the ground, and three dimensional in the free space. For the ground mobile robotics, we consider one dimensional and two dimensional cases in this paper.

1.3 Scope of this paper

This paper tries to develop a different way of extracting motion information from successive OGMs based on a signal transformation by extending the KST in the radar signal processing community to the 1D and 2D spatial case. The main theoretic idea occurred in our conference paper [30] and this journal article was mainly enlarged from the three aspects, i.e., the more detailed survey, the fast algorithm implementation of 2DS-KST and the experiments for the extended objects. For the sake of integrity, the original point object test results are also kept in this article.

2. One Dimensional Spatail KST

Occupancy grid maps model the environment as an array of cells. Typically, these are layered out in a two-dimensional grid. However, we first discuss the keystone transform for the one dimensional case, which is called as 1DS-KST hereinafter. One reason is that there has a prior straight line constraint with the motion of objects in many applications, for instance, cars moving on the highway or city roads. Detection and estimation the velocity of object moving along straight line themselves have a certain significance for these cases. Another reason is that 1DS-KST, where the concept of fast time is instead by one dimensional spatial grid, has a more direct relationship with KST in radar signal processing. It is helpful to understand the principle of keystone transform, especially for understanding the two dimensional spatial KST introduced in the next section.

For 1DS-KST, the main assumptions are the following:

The velocities for all moving objects are constant during $N$ successive frames.
$R\geq 2\cdot V_{\max}\cdot T$ , i.e., $V_{\max}\leq R/(2T)$ .
The sensor is motionless or the motion of it has been compensated by SLAM or other methods.

The first assumption always holds as long as the total length of time window $N\cdot T$ is very short or the amount of velocity change is less than one velocity resolution cell of KST. The second assumption is borrowed from [12], and it is derived from the Nyquist Sampling condition, which ensures the temporal signal of the occupancies in every grid cell are not aliased at a time sampling period $T$ . Meanwhile it ensures the spatial continuity of the motion of objects, which is necessary to filter the non-continuous changes of OGMs. In fact, for typical applications and modern sensors, this condition is easy to meet. For example, if the size of grid cell $R$ is equal to 2 meters and the time sampling period $T$ is equal to 0.1 seconds, the maximum velocity of objects is 10 m/s, which is enough high for most dynamic environment monitoring applications involving pedestrians, industrial robots and vehicles. As for those applications having higher maximal speed, such as automotive application, we can use a larger $R$ or smaller $T$ in order to avoid objects "jumping over" adjacent cells. As the focus of this paper is to extract the motion information from OGMs, the third assumption is natural and it can make this problem isolated from other problems, such as registration and localization.

Let us first consider the case that there is only one occupied grid cell, called as an ideal point object blow, in the sensor field of view. Assume it is moving at a constant velocity $V$ , $V\in[-V_{\max}/2,+V_{\max}/2)$ , and has an initial position $l_{0}R$ , then for any given time instant $t$ , the obtained OGM has the following form:

f_{t}(l)=\delta_{r_{t}}(l)

where $\delta$ denote a unit pulse function, which is defined as

\delta_{r_{t}}(l)=\begin{cases}1&\quad\text{if}\quad r_{t}\in[lR-\frac{R}{2},% lR+\frac{R}{2})\\ 0&\quad\text{otherwise}\end{cases}

And in (1) $r_{t}$ is the position of this object at time instant $t$ , which can be written as

r_{t}=l_{0}R+Vt

The Discrete Fourier Transform $\mathcal{F}_{t}(i)$ of $f_{t}(l)$ can be given as follows

\displaystyle\mathcal{F}_{t}(i)\overset{\text{def.}}{=}\sum_{l=0}^{L-1}f_{t}(l% )\cdot\exp\left(-\iota 2\pi\frac{l\cdot i}{L}\right)

\displaystyle\cong\exp\left(-\iota 2\pi\frac{r_{t}\cdot i}{LR}\right)

\displaystyle=\exp\left(-\iota 2\pi\frac{l_{0}i}{L}\right)\cdot\exp\left(-% \iota 2\pi\frac{V\cdot t\cdot i}{LR}\right)

Since the signal $f_{t}(l)$ is a real signal in the case of OGM, $\mathcal{F}_{t}(i)$ always satisfies conjugate symmetry, i.e.,

\mathcal{F}_{t}(i)=\mathcal{F}_{t}^{\ast}(L-i)

Therefore, we only concern the non-negative spatial frequency cells, that is, cells of $i=0,\ldots,L/2-1$ .

To use KST, we need choose a fixed cell of spatial frequency $i_{c}$ as a reference. In general, the center frequency within the effective bandwidth of signal is chosen for a reference in the application of radar signal processing. For our case of 1D-OGM, we can multiply $\mathcal{F}_{t}(i)$ by a window $w(i)$ in the spatial frequency domain, which corresponds to a spatial filtering process and results in a blurred OGM. Without loss of generality, we denote this window $W(i)$ as the following:

W(i),\quad i_{\min}\leq i\leq i_{\max}

So $i_{c}=(i_{\min}+i_{\max})/2$ can be selected as the reference. Thus (6) can be expressed as

\displaystyle\mathcal{F}_{t}(i)={}\exp\left(-\iota 2\pi\frac{l_{0}i}{L}\right)% \cdot W(i)\cdot\exp\left(-\iota 2\pi\frac{Vi_{c}}{LR}\cdot\frac{i}{i_{c}}t% \right),

\displaystyle i_{\min}\leq i\leq i_{\max}.

If let $t^{\prime}=\frac{i}{i_{c}}\cdot t$ , we can get

\displaystyle\mathcal{F}_{t}(i)={}\exp\left(-\iota 2\pi\frac{l_{0}i}{L}\right)% \cdot W(i)

\displaystyle\cdot\exp\left(-\iota 2\pi\frac{Vi_{c}}{LR}\cdot t^{\prime}\right)

\displaystyle\overset{\text{def.}}{=}\mathcal{F}_{i}(t^{\prime})

Now it is the time to consider a temporal variable $t$ . During the time interval $[-NT/2,NT/2)$ , we obtained $N$ successive OGMs at some discrete time instants. Without loss of generality, we can assume that $t=-NT/2,\ldots,0,T,\ldots,NT/2-T$ , that is to say, we obtain many signals $\mathcal{F}_{i}(t^{\prime})$ at some time instants $t^{\prime}$ .

Since the scale factor $i/i_{c}$ between $t$ and $t^{\prime}$ is variable with $i$ , the sampling period $T_{i}$ and total time interval $NT_{i}$ for $t^{\prime}$ are both different in terms of $i$ . The sampling patterns of $\mathcal{F}_{t}(i)$ and $\mathcal{F}_{i}(t^{\prime})$ are shown as Figure 1. Notably, the sampling pattern of $\mathcal{F}_{i}(t^{\prime})$ is like a keystone shape, that's why this corresponding transform is called as KST.

Figure 1 Sampling pattern of Keystone transform.

To correct the keystone effect in the sampling pattern of $\mathcal{F}_{i}(t^{\prime})$ , we need to compute the values at the discrete time $t^{\prime}=-NT/2,\ldots,0,T,\ldots,NT/2-T$ for every $i$ , which are usually obtained by an interpolate filter $h_{i}(n^{\prime})$ in the context of KST. See [31] for more details of the interpolate filter $h_{i}(n^{\prime})$ . Thus, after the interpolate filtering,

\displaystyle\tilde{\mathcal{F}}_{i}(n)={}\mathcal{F}_{i}(n^{\prime})\otimes h% _{i}(n^{\prime})

\displaystyle\cong{}\exp\left(-\iota 2\pi\frac{l_{0}i}{L}\right)\cdot W(i)

\displaystyle\cdot\exp\left(-\iota 2\pi\frac{Vi_{c}}{LR}\cdot nT\right),

\displaystyle n={}-N/2,\ldots,0,1,\ldots,N/2-1

Then the IDFT transform of $\tilde{\mathcal{F}}_{i}(n)$ in terms of $i$ will give the following result:

\displaystyle\tilde{f}_{n}(l)=\delta(l-l_{0})\otimes w(l)\cdot\exp\left(-\iota 2% \pi\frac{Vi_{c}}{LR}\cdot nT\right)

\displaystyle=w(l-l_{0})\cdot\exp\left(-\iota 2\pi\frac{Vi_{c}}{LR}\cdot nT\right)

\displaystyle=\begin{cases}w(0)\cdot\exp\left(-\iota 2\pi\frac{Vi_{c}}{LR}% \cdot nT\right)&\text{if}\;l=l_{0}\\ w(l-l_{0})\cdot\exp\left(-\iota 2\pi\frac{Vi_{c}}{LR}\cdot nT\right)&\text{% otherwise}\end{cases}

where $w(l)$ is coefficients of spatial filter corresponding with the window $W(i)$ .

In general, we must choose an appropriate window type and a suitable width of $W(i)$ so that $w(i)$ has a locally compact main-lobe and a side-lobe low enough. Thus, for those cells of $l\neq l_{0}$ , the non-zero $\tilde{f}_{n}(l)$ will not affect the analysis of velocity as long as the distance of object is larger than the width of main-lobe of $w(i)$ . This is easy to meet if the moving objects are not closely spaced pixel by pixel. Fortunately, this is the fact in OGM case. In fact, even for the superpositional target, if they have different velocities, the non-zero coefficients of $w(l)$ at $l\neq 0$ have no effect to velocity analysis as well. Therefore, let us consider the cell $l_{0}$ as the next step,

\tilde{f}_{n}(l_{0})=w(0)\cdot\exp\left(-\iota 2\pi\frac{VTi_{c}}{LR}\cdot n\right)

After doing the DFT for $\tilde{f}_{n}(l_{0})$ in terms of $n$ , we can get:

\displaystyle\tilde{\mathcal{F}}_{l_{0}}(k)={}\sum_{n=-N/2}^{N/2-1}w(0)\cdot% \exp\left(-\iota 2\pi\frac{VTi_{c}}{LR}\cdot n\right)

\displaystyle\cdot\exp\left(-\iota 2\pi\frac{nk}{N}\right)

\displaystyle={}\sum_{n=-N/2}^{N/2-1}w(0)

\displaystyle\cdot\exp\left(-\iota 2\pi n\left(\frac{VTi_{c}}{LR}+\frac{k}{N}% \right)\right)

As (19) shown, different velocities $V$ will be located at the different frequency grid cells $k$ . The KST method, therefore, allow for different velocities in a single cell, which is similar to 4D-BOF [1] and BOFUM [5]. For an object having the velocity $V$ , we have the following approximation:

\frac{k}{N}\cong-\frac{VTi_{c}}{LR}\Rightarrow V\cong-\frac{kLR}{NTi_{c}}

The corresponding resolution of the above velocity measurement is

\Delta V=\frac{LR}{NTi_{c}}

and the normalized one is

\Delta\mathsf{V}=\frac{\Delta V\cdot T}{R}=\frac{L}{Ni_{c}}

We can choose $i_{c}$ and $N$ according to the dynamic characteristics of environment and the velocity resolution of interest which is related to the minimal detectable velocity.

3. Two Dimensional Spatial KST

A one-dimensional OGM is of limited practical use. For mobile robots, the two-dimensional OGM is the usual case. This section will develop a method of two dimensional spatial KST (denoted as 2DS-KST) for the purpose of extracting motion information from successive OGMs. Besides of those assumptions in section 2, two additional ones needed here is as the following:

All objects are moving along nearly the same but unknown direction during $N$ frames11.
The unknown motion directions for all objects belong to a prior set with finite number of elements and are constant during $N$ frames.

3.1 2DS-KST with multiple hypotheses

Let us still consider only one ideal point object in the sensor field of view. Assume its initial position is $\mathbf{r}_{0}=[l_{0},m_{0}]^{T}$ and it has a constant velocity $\mathbf{V}=[V_{x},V_{y}]^{T}=[V\cos(\theta),V\sin(\theta)]^{T}$ , in which $V_{x},V_{y}$ satisfy the condition of $V_{x},V_{y}\in[-V_{\max}/2,+V_{\max}/2)$ . Then for a given time instant $t$ , the OGM has the following form:

\delta_{\mathbf{r}_{t}}(l,m)=\begin{cases}1&\quad\text{if}\;[r_{xt},r_{yt}]^{T% }\in\textrm{Rect}(l,m)\\ 0&\quad\text{otherwise}\end{cases}

where $\textrm{Rect}(l,m)$ denote the region of the cell $(l,m)$ .

For the 2D spatial case, equation (5) becomes as follows:

\displaystyle\mathcal{F}_{t}(i,j)=\exp\left(-\iota 2\pi\frac{\mathbf{r}_{0}% \cdot\mathbf{i}}{L}\right)\cdot\exp\left(-\iota 2\pi\frac{V\mathbf{u}_{\theta}% \cdot\mathbf{i}}{LR}t\right)

\displaystyle=\exp\left(-\iota 2\pi\frac{\mathbf{r}_{0}\cdot\mathbf{i}}{L}% \right)\cdot\exp\left(-\iota 2\pi\frac{Vi_{\theta}}{LR}t\right)

where $\mathbf{i}=[i,j]^{T}$ , $\mathbf{u}_{\theta}=[\cos\theta,\sin\theta]^{T}$ and $i_{\theta}=i\cos\theta+j\sin\theta$ .

As the above described, we assume there are finite possible hypotheses for $\theta$ , and denote the $p$ th hypothesis as $\theta_{p}$ ( $p=1,\ldots,\nu$ ). But the real case is we don't know the actually moving direction of the object, so we need to do KST for every possible hypothesis, then we can get:

\displaystyle\mathcal{F}_{t}^{\theta_{p}}(i,j)={}\exp\left(-\iota 2\pi\frac{% \mathbf{r}_{0}\cdot\mathbf{i}}{L}\right)\cdot W_{\theta_{p}}(\mathbf{i})

\displaystyle\cdot\exp\left(-\iota 2\pi\frac{V_{\theta_{p}}i_{c}^{\theta_{p}}}% {LR}\cdot\frac{i_{\theta_{p}}}{i_{c}^{\theta_{p}}}\cdot t\right)

\displaystyle\cdot\exp\left(-\iota 2\pi\frac{V_{\theta_{p}^{\perp}}i_{\theta_{% p}^{\perp}}}{LR}\cdot t\right)

where

$V_{\theta_{p}}$ , $i_{c}^{\theta_{p}}$ and $i_{\theta_{p}}$ are projections of vectors $\mathbf{V}$ , $\mathbf{i}_{c}^{\theta_{p}}$ and $\mathbf{i}$ along the $\theta_{p}$ direction, respectively, while $\mathbf{i}_{c}^{\theta_{p}}$ is the reference spatial frequency vector for $\theta_{p}$ hypothesis when doing KST in next step;
$V_{\theta_{p}^{\perp}}$ , $i_{\theta_{p}^{\perp}}$ are projections of vectors $\mathbf{V}$ , $\mathbf{i}$ perpendicular to the $\theta_{p}$ direction, respectively;
$W_{\theta_{p}}(\mathbf{i})$ is the two dimensional window function for $\theta_{p}$ hypothesis which has the same meaning as the window (8) in one dimensional case.

(4) result from the fact that the item $Vi_{\theta}$ in (3) can be rewritten as:

Vi_{\theta}=V_{\theta_{p}}i_{\theta_{p}}+V_{\theta_{p}^{\perp}}i_{\theta_{p}^{% \perp}}

Furthermore, if the hypothesis $\theta_{p}$ is true, that is to say, $\theta$ is approximately parallel to $\theta_{p}$ , the last product item in (4) can be ignored further. In this case, for the $\theta_{p}$ hypothesis, (4) can be approximated as the following:

\displaystyle\mathcal{F}_{t}^{\theta_{p}}(i,j)\cong{}\exp\left(-\iota 2\pi% \frac{\mathbf{r}_{0}\cdot\mathbf{i}}{L}\right)\cdot W_{\theta_{p}}(\mathbf{i})

\displaystyle\cdot\exp\left(-\iota 2\pi\frac{V_{\theta_{p}}i_{c}^{\theta_{p}}}% {LR}\cdot\frac{i_{\theta_{p}}}{i_{c}^{\theta_{p}}}\cdot t\right)

\displaystyle\overset{\textrm{def.}}{=}{}\mathcal{F}_{\mathbf{i}}^{\theta_{p}}% (t^{\prime})

where $t^{\prime}$ is defined as:

t^{\prime}=\frac{i_{\theta_{p}}}{i_{c}^{\theta_{p}}}t

The interpolate filter of Keystone transform is the same as 1D case, so we can get $\tilde{f}_{n}^{\theta_{p}}(l_{0},m_{0})$ as follows:

\tilde{f}_{n}^{\theta_{p}}(l_{0},m_{0})=w_{\theta_{p}}(0,0)\cdot\exp\left(-% \iota 2\pi\frac{V_{\theta_{p}}Ti_{c}^{\theta_{p}}}{LR}\cdot n\right)

The counterpart of the above equation in 1D case is (17). After doing the DFT for $\tilde{f}_{n}^{\theta_{p}}(l_{0},m_{0})$ in terms of $n$ for every occupied grid cell $[l_{0},m_{0}]^{T}$ , we can get:

\displaystyle\tilde{\mathcal{F}}_{l_{0},m_{0}}^{\theta_{p}}(k)={}\sum_{n=-N/2}% ^{N/2-1}w_{\theta_{p}}(0,0)

\displaystyle\cdot\exp\left(-\iota 2\pi n\left(\frac{V_{\theta_{p}}Ti_{c}^{% \theta_{p}}}{LR}+\frac{k}{N}\right)\right)

Thus, similar to 1D case, $V_{\theta_{p}}$ , the magnitude of the velocity in the grid cell $[l_{0},m_{0}]^{T}$ can be given as follows:

V_{\theta_{p}}\cong-\frac{kLR}{NTi_{c}^{\theta_{p}}}

while the direction of the velocity in this cell is given by $\theta_{p}$ , which is the assumption given in the previous, once used in (6).

3.2 Merging multiple hypotheses

The velocity measurement given by (11) is in the case of the $\theta_{p}$ hypothesis is true for this cell. In practice, we don't know which hypothesis is true for any occupied grid cell $[l,m]^{T}$ in advance, so we must merge the results of those multiple hypotheses so that the correct motion information can be given.

In fact, by 2DS-KST with multi-hypothesis, we can get $\nu$ cuboid $\tilde{\mathcal{F}}^{\theta_{p}}(l,m,k)$ , which is the general form of (10) for any arbitrary grid cell $[l,m]^{T}$ under the hypothesis $\theta_{p}$ . That is to say, we get the two dimensional matrix $\tilde{\mathcal{F}}_{l,m}(\theta_{p},k)$ for every grid cell $[l,m]^{T}$ , which represents the possible velocity in this cell, including the magnitude (denoted by $k$ , which can be negative) and the direction (denoted by $\theta_{p}$ ).

Here we use a MPD (Maximal Power Detector) based method for the purpose of extracting the motion information from $\tilde{\mathcal{F}}_{l,m}(\theta_{p},k)$ , which is based on the fact that the power item $|\tilde{\mathcal{F}}_{l,m}(\theta_{p},k)|^{2}$ will be larger when $\theta_{p}$ and $k$ are more matched with the true value of the velocity. It can be described as the following four steps:

The first step is the maximal merging step:

$\mathcal{P}(l,m)=\max_{\theta_{p},k}|\tilde{\mathcal{F}}_{l,m}(\theta_{p},k)|^% {2}$
The second is the following power detector:

$\mathcal{P}(l,m)\mathop{\gtrless}_{H0}^{H1}P_{\min}$

where $H1$ and $H0$ denote the event whether the grid cell to be determined is occupied or not, respectively, and where $P_{\min}$ denote the threshold of MPD.
The third is the following estimator if $H1$ holds on:

$\displaystyle(\hat{k},\hat{\theta})=\mathop{\arg}_{\theta_{p},k}\max|\tilde{% \mathcal{F}}_{l,m}(\theta_{p},k)|^{2}$

$\displaystyle\hat{\mathbf{V}}_{l,m}=\hat{V}_{\hat{\theta}}\cdot\exp(\iota\hat{% \theta})$

where $\hat{V}_{\hat{\theta}}$ can be obtained according to (11) if $\hat{k}$ given.
The last step is the separator as follows:

$|\hat{\mathbf{V}}_{l,m}|\mathop{\gtrless}_{S}^{D}V_{\min}$

where $D$ and $S$ denote the event whether the grid cell to be determined is dynamic or not, respectively, and where $V_{\min}$ denote the threshold of the separator, which directly related to the minimal detectable velocity or $\Delta V$ in (21).

As the above MPD approach of merging multiple hypotheses only outputs the most likely point estimation for the velocity, it is obvious that it is only appropriate for the case when the occupancies in each grid cell only have one significant velocity during the time interval. Nevertheless, it does not affect the versatility of 2D-KST for extracting motion information. For instance, we can use the more complex GMM (Gaussian Mixture Model) instead of the point estimation in MPD method as the velocity model of occupancy, which would undoubtedly increase the robustness and the compatibility to the complex situations. In fact, which velocity model should be selected is a key problem in practice and it closely depends on the physical application.

For the more complex method of merging multiple hypotheses, it is already beyond the scope of this paper and may be an appropriate topic for the future work.

4. Algorithm Implementation

In the previous two sections, 1DS-KST and 2DS-KST are presented, which are both interpolate filter based KST. In this section, a different implementation of 2DS-KST, which is more effective for our problem, is presented . The other issues related to implementation will be also discussed.

According to the description in Section 3.2, in order to extract the motion information from OGMs, we need the cuboid of $\mathcal{\tilde{F}}^{\theta_{p}}(l,m,k)$ under each hypothesis $\theta_{p}$ ( $p=1,\ldots,\nu$ ), rather than the direct result $\tilde{f}^{\theta_{p}}(l,m,n)$ of the keystone transform for the original input $f(l,m,n)$ . In the previous two sections, we used the interpolate filter based keystone transform to obtain $\tilde{f}^{\theta_{p}}(l,m,n)$ first, then obtained $\mathcal{\tilde{F}}^{\theta_{p}}(l,m,k)$ by doing the DFT transform for $\tilde{f}^{\theta_{p}}(l,m,n)$ in terms of $n$ .

Here we introduce another implementation without using interpolate filter, which is based on the so called CZT (Chirp-Z Transform)[32]. It can both correct the keystone effect in the sampling pattern of the data $\mathcal{F}^{\theta_{p}}(i,j,n)$ and transform them to the $k$ domain simultaneously. That is to say, it outputs $\mathcal{\tilde{F}}^{\theta_{p}}(i,j,k)$ instead of $\mathcal{\tilde{F}}^{\theta_{p}}(i,j,n)$ in the interpolate filter based method, so it is more effective for our problem. More importantly, CZT has the fast implementation and can flexibly control the frequency range to be analyzed. See [32, 26] for more details and applications of CZT. The flow diagram of CZT based 2D-KST is shown in Figure 2, where the data pattern in every step is explicitly shown and the meanings of those variables can be seen in section 2, 3 and the notations before the introduction.

Figure 2 Flow diagram of 2D-KST based on CZT.

The main steps in Figure 2 are explained as follows:

Spatial FFT: The OGMs are transformed from the spatial domain $(l,m)$ to the spatial frequency domain $(i,j)$ by 2D-FFT in this step, which can be done sequentially for each new OGM, or with batch processing for all $N$ OGMs through parallel computing. In order to use FFT, $L$ is usually a power of $2$ . If it is not this case, we can pad zeros at the tail. Thus the complexity of this step is $\mathcal{O}(2\alpha NL^{2}\log L)$ , where $\alpha$ ranges around the interval from 4 to 5 depending on the implementation [33].
Directional filtering: The window $W_{\theta_{p}}(i,j)$ in spatial frequency domain under every possible direction $\theta_{p}$ of velocity is multiplied respectively to the previous results, which corresponds to let every OGM pass several bandpass spatial filters, respectively. Here we use the following rectangle window,

$W_{\theta_{p}}(i,j)=\begin{cases}1&\text{if}\;i_{\theta_{p}}\in[i_{\min}^{% \theta_{p}},i_{\max}^{\theta_{p}}]\\ 0&\text{otherwise}\end{cases}$

where $i_{\min}^{\theta_{p}}$ and $i_{\max}^{\theta_{p}}$ are the minimal and the maximal spatial frequency respectively, and the response of the spatial filter are completely determined by them. The proper values for these two parameters are closely dependent on the size of the moving objects. We will discuss this question in Section 5. From the computational point of view, the rectangle window in (1) has two advantages at least. One is that the multiplication operation can be bypassed, the other is that only the data points inside of $W_{\theta_{p}}(i,j)$ need to be computed in the next step. Because the window can be preset beforehand and there is no multiplication operation, the complexity of this step can be ignored.
Chirp-Z transform: As described before and shown in Figure 2, CZT is used both to correct the keystone effect in the sampling pattern of the data $\mathcal{F}^{\theta_{p}}(i,j,n)$ and to transform them to the $k$ domain simultaneously. For our application, the parameters of CZT are set as follows:

$\displaystyle z_{k}=A_{\theta_{p}}\cdot\omega_{\theta_{p}}^{-k}$

$\displaystyle\omega_{\theta_{p}}=\exp\left(-\iota\frac{2\pi}{N}\cdot\frac{i_{% \theta_{p}}}{i_{c}^{\theta_{p}}}\right)$

$\displaystyle A_{\theta_{p}}=\omega_{\theta_{p}}^{-K/2}$

$\displaystyle K\geq\frac{N\arg_{\theta_{p}}\max(i_{c}^{\theta_{p}})}{L}$

where $K$ is the number of points of temporal frequency. To keep all hypotheses have the same the number of points of temporal frequency, we use the item $\arg_{\theta_{p}}\max(i_{c}^{\theta_{p}})$ in (5) instead of $i_{c}^{\theta_{p}}$ . According to the principle of CZT, there is no other constraint to $K$ . However, if $K$ can divide $N$ and $N$ is a power of 2, the complexity of CZT is minimal and given by [33]:

$\mathcal{C}_{CZT}=4\alpha N\log K+(4\alpha+25)N-K$

For our problem, the typical setting is $K=N/2$ and the area of every window $S(W_{\theta_{p}})=L^{2}/4$ , thus the complexity of this step is approximate to $\mathcal{O}(\nu L^{2}\cdot(\alpha N\log N+6))$ .
Spatial IFFT: As shown in Figure 2, this step uses the 2D-FFT to transform every cuboid $\mathcal{\tilde{F}}^{\theta_{p}}(i,j,k)$ from spatial frequency domain to spatial domain. As the migration across grid cells caused by motion has been compensated, the moving grid cell can be focused to the average, initial, or last position during the $N T$ interval, which depends on the definition of the time index used in the CZT. Obviously, the complexity of this step is $\mathcal{O}(2\alpha\nu NL^{2}\log L)$ .
MPD merging: The velocity of any cell identified as moving one is estimated in this step. The complexity of this step is approximately $\mathcal{O}(\nu NL^{2}/2)$ .

According to the above analysis, the total complexity of 2D-KST can be approximated as follows:

\mathcal{C}_{\text{Total}}\cong\mathcal{O}\left((1+\nu\log_{L}N/2+\nu)\cdot 2% \alpha NL^{2}\log L\right)

(7) means that the computational complexity of the total processing of 2D-KST is about $(\nu\log_{L}N/2+\nu)N$ times more than the complexity of 2D-FFT with the side length $L$ .

5. Results

In this section, we design three experiments to evaluate the performance of the proposed method on the extraction of motion information. The first two is based on numerical simulation data, and the last one is based on the real data from mobile robot.

The first experiment, called as point object test, is to demonstrate the validity of our method through point object data, which means all objects only occupy one grid cell in this experiment. Some intermediate results are also shown as figures in this experiment, which is benefit to understanding the process of KST.

The second experiment, i.e. extent object test, is to illustrate the responses of the spatial frequency window to the different size of objects. Some useful guidelines about how to choose the appropriate parameters for the spatial frequency window are derived further.

The third experiment, referred to as real data test, is to uncover some potential applications for the methods proposed in this paper. Two datasets are used in this experiment, which are collected by the robot in the indoor and outdoor environments respectively. The difference is that the ground truth is known precisely for the first dataset, while it is not the case of the outdoor dataset.

5.1 Point object test

In this experiment, we use the simulation data of point object to test the 1D-KST and 2D-KST. Considering the sensor noise and the imperfections in the OGM building process, we add to the OGMs some Poisson noise, which is uniformly distributed in the grid cells and whose number obeys the Poisson distribution. The main parameters in the simulation is shown in the Table 1, where the velocities $\mathsf{V}$ are represented in the normalized format, and the scale factor $\alpha$ in $i_{c}^{\theta_{p}}$ is defined as follows:

\alpha=\max(|\cos(\theta_{p})|,|\sin(\theta_{p})|)

Table 1 Simulation Parameters in the Experiment I.

	Parm.	1D case	2D case
OGMs	$L$	128	64
	$N$	100	40
Objects	#0	$20,0$	$10,10,0,-$
State	#1	$40,-0.5$	$20,15,0.5,0^{\circ}$
$l_{0},m_{0},\mathsf{V},\theta$	#2	$60,0.05$	$30,20,0.1,90^{\circ}$
	#3	$80,-0.2$	$35,30,0.2,45^{\circ}$
	#4	$100,0.1$	$40,40,0.3,135^{\circ}$
	#5	—	$45,50,0.4,165^{\circ}$
Poisson Rate	$\lambda$	16	64
Windows	$\theta_{p}$	—	$0:\pi/8:7\pi/8$
	$i_{c}^{\theta_{p}}$	$L/4$	$L/4\alpha$
	$i_{\min}$	$L/8$	$i_{c}^{\theta_{p}}/2$
	$i_{\max}$	$3L/8$	$3i_{c}^{\theta_{p}}/2$

The results of the one dimensional test are shown in Figures 3 - 5. From Figure 3, we can see five objects moving at different speeds along positive direction or negative direction. The speed of the fastest one is $0.5$ , which is the maximal feasible velocity given by the Nyquist sampling theorem, and the speed of the slowest moving object is $0.05$ , which is very close to the velocity resolution given by (22). Through this setting, we can validate 1D-KST in terms of velocity measurement capability. Meanwhile, we set a stationary object at $l_{0}=20$ , which can be used to check the performance on the separation of the dynamic grids from the static ones. After adding the Poisson noise, the OGMs look very noisy.

Figure 3 Sequence of the simulated one dimensional OGMs:

f(l,n)

(a) amplitude

(b) phase

Figure 4 Result of OGM sequence by the spatial Fourier transform:

\mathcal{F}(i,n)

Figure 5 Result of 1D-KST:

\tilde{\mathcal{F}}(l,k)

. The color denotes the total power of the accumulated occupancies in this cell during the

N

time instants. The green box on every object indicates the corresponding resolution of 1D-KST for the position and the velocity.

The result of OGM sequence by the spatial Fourier transform is shown in Figure 4. As the noise in the OGMs, it is hardly to see any change information along the time axis from Figure 4(a). However, it can still be seen from the phase of $\mathcal{F}(i,n)$ in Fig. 4(b). Moreover, it is not hardly to see the symmetry with respect to the spatial frequency axis from Figure 4 because our OGMs are all real numbers, thus from the information point of view we can only set the spatial frequency window in one side of the spatial frequency axis.

The final result of 1D-KST is shown in Figure 5. When looked together with Figure 3, it is clearly shown that all objects are well located in the grid cells at the midpoint time instant and their velocities are also measured with a high precision even for these so noisy OGMs22. Moreover, the occupies of every are well focused in the green box determined by 1D-KST's resolution, which is determined by (22) and the width of the spatial frequency window:

\Delta l\cong L/(i_{\max}-i_{\min})

From this result, we can conclude that 1D-KST has a good performance for the point object in terms of OGM filtering and the extraction of motion information.

The results of the two dimensional test are shown in Figures 6 - 10. Similar to the one dimensional case, we choose the maximal feasible and minimal detectable speed to are set to evaluate the capability of 2D-KST in terms of velocity measurement. Different from one dimensional case, we use eight hypotheses of direction to match the possible moving directions in OGMs. Among all five moving objects, four are moving along the direction in the hypothesis sets, while the last one (#5) is moving near the middle direction between the hypothesis $\theta_{p}=7\pi/8$ and the reverse direction of $\theta_{p}=0$ , but slightly close to $\theta_{p}=7\pi/8$ . Through this setting, we can evaluate the performance when the true moving direction dose not match any hypotheses. To see the result more clearly, we decrease the map size $L=64$ in the two dimensional test. The detailed parameter setting used in this simulation can be seen in Table 1.

Figure 6 Sequence of the simulated two dimensional OGMs:

f(l,m,n)

Figure 7 Spatial frequency window

W_{\theta_{p}}(i,j)

. (a)-(h) are the cases of

\theta_{p}=0,\pi/8,\ldots,7\pi/8

, respectively. The x-ticks and y-ticks are both the normalized values of spatial frequency. The value of the point inside the colorful quadrangle is equal to 1, while zero for the outside point.

Figure 8 Result of 2D-KST after the first merging step of MPD:

\mathcal{P}(l,m)

. The color denotes the total power of the accumulated occupancies in this cell during the

N

time instants.

Figure 9 Result of 2D-KST after the processing of MPD. The color has the same meaning as Figure 8. The blue arrows denote the velocities of the grid cells, whose lengths are proportional to the speed.

Figure 10 Comparison of the measurement and the true value of the velocity.

Figure 6 shows the noisy 2D-OGM sequence. It is very difficult to recognize the trajectory of each object by human eyes. To make the window $W_{\theta_{p}}(i,j)$ more intuitive than the expression (1) to the reader, we show them in Figure 7. In principle, the counterpart of the spatial frequency window here is the directional filter in optical signal processing, this is also the reason why we call the second step of 2D-KST as directional filtering.

After the first merging step of MPD, the result $\mathcal{P}(l,m)$ of 2D-KST processing is shown in Figure 8, which can be regard as the filtered OGM at the time instant $n=0$ . Figure 8, we can found that the noise in the original OGM is almost filtered, and the grid cells in the vicinity of objects have a significant occupancy value. The occupancies in the yellow grid cells around each object are the leakage of occupancy in the corresponding object grid cell, which is the consequence caused by the window $W_{\theta_{p}}(i,j)$ . Among all the six objects, the stationary object at $(10,10)$ behaves obviously isotropic, while the other moving objects behave the obvious directionality, which means that they have the biggest extent along their moving direction. Moreover, we can found that the grid cells near the mismatched one at $(45,50)$ have a relatively smaller value than those near the matched objects. Nonetheless, their values are at least twice as large as the values of those yellow grid cells. Thus, if we set the appropriate threshold $P_{\min}$ , see (13), we can remove the leakage occupancies in the yellow cells but maintain those in the vicinity of the mismatched object grid cell. Furthermore, if our requirement is to extract the moving objects, we can filter the stationary one by setting the appropriate threshold $V_{\min}$ , see (16).

The result of 2D-KST after MPD processing is shown in Figure 9, where $P_{\min}=0.3981$ (-8dB) and $V_{\min}=0.085$ are used, and where the blue arrows represent the velocity of the occupies in the corresponding cells, the length and pointing for the amplitude and the direction respectively. As can be seen from Figure 9, the stationary object has been removed successfully, although it has the maximal occupancy, while the five moving objects all persist in existing. What's more, the velocity measurements are highly in accordance with the ground truth in Table 1. To see it more obviously, the simple plot extractor is used, which extracts the local power maximum grid cells from the MPD results as the candidate detections and computes the occupancy weighted velocity as the velocity measurement of each candidate detection. The measurement results are shown in Figure 10, which suggest that 2D-KST has a good precision of velocity measurement for the point object and can be easily integrated with the plot extractor or other object clustering algorithm, such as FCTA [22].

5.2 Extend object test

The purpose of this experiment is to illustrate how the parameters of the spatial frequency window affect the output for different object sizes. Since the window parameters have the same effect on the output for the 1D and 2D cases and the 1D case is more easily to explain, we test two typical windows in one dimensional case for five different object sizes firstly. Through this test, some guidelines about how to choose the parameters for the spatial frequency window are derived. Then we validate these guidelines through 2D test for different object sizes. The size of each object used in this experiment is listed in Table 2, where the value means how many grid cells are occupied by this object. For 2D case, the two values correspond to the numbers of the occupied grid cells parallel and perpendicular to the direction of the velocity respectively. The other parameters are the same as Table 1 if without any special explanations.

Table 2 Sizes of the extent objects.

object label	#0	#1	#2	#3	#4	#5
1D case	5	4	1	3	2	–
2D case	6,3	3,3	1,1	2,1	2,2	3,2

The responses to the two typical windows in one dimensional case to the five object sizes are shown in Figure 11.

As can be seen from Figure 11 11(a) and 11(b), the wide window has a better resolution whether for the position or for the velocity, because it has a wider window and a higher reference spatial frequency $i_{c}$ . See (2) and (22) for the formulas about the KST resolution. But the large objects, object #0 at cell 20 and object #1 at cell 40 are split into two parts because the bandpass spatial filter is used in our KST approach. However, the two parts, the head and the tail, are both in the extent of the corresponding object which is represented by the green box. Fortunately, this split outcome may be accepted for many applications, such as CTM building [12] or DATMO using FCTA [22].

(a) Wide window

(b) Narrow window

Figure 11 Responses of the two typical windows in one dimensional case to the different sizes of objects. (a)

i_{c}=L/4,i_{\min}=L/8,i_{\max}=3L/8

; (b)

i_{c}=L/8,i_{\min}=L/16,i_{\max}=3L/16

. The color denotes the total power of the accumulated occupancies in this cell during the

N

time instants. The length of the green box on each object along

l

direction denotes the sum of the position resolution of 1D-KST and the extent of this object, while the length along normalized velocity direction denotes the corresponding velocity resolution of 1D-KST.

If it can not be accepted, the narrow window can be used instead. As shown in Figure 11 11(b), all objects are focused in the corresponding green box. However, the velocity resolution gets worsened as the lower $i_{c}$ used, and the object #2 at cell 60 has a negligible occupancy inside its green box since only very little proportion of its power can enter the narrow window.

In principle, we can get the following two empirical criteria for selecting the parameters of the spatial frequency:

$i_{\max}-i_{\min}\leq i_{c}$ , which ensures that the scaled time for each spatial frequency $i$ shown in Figure 1 has not too big gap with the one for $i_{c}$ . As a result, the average resolution of frequency for every cell $i$ can be approximated by the one of $i_{c}$ .
$i_{\min}\cdot E_{o}<L/2$ , where $E_{o}$ is the possible maximal size of the object along its moving direction. This criterion ensures that the object can be focused along their moving direction, because for the $E_{o}$ length continuous segment in OGM, its maximal effective spatial frequency is at $L/(2E_{o})$ . Only if this condition holds, the information of this object as a whole can be maintained.

In practice, we can choose the appropriate parameters of spatial frequency window to meet the above two criteria. But for those applications in which the moving objects have the significant difference in size, for example, the case of Figure 11, we can use multiple windows to match different object sizes, or use a uniform window but based on the multiple resolution grid cell maps for different sizes of objects. Along this path, it will result in a multiple resolution KST approach, which is beyond the scope of this paper and maybe the next step work in the future.

Next we validate the wide window in 2D case, the parameters can be seen in Table 1, through those objects in Table 2, where the extensions of all moving objects are no larger than 3 grid cells. The corresponding results are shown in Figure 12. From these results, we can conclude that when using the appropriate setting of spatial frequency window, the 2D-KST can perform very well for the extend objects when they have no too much difference in size.

Similar to the point object test, in order to demonstrate the potential capability of integrating with FCTA or other extend target tracking algorithms, we give the result of the plot extractor, which is shown in Table 3. As can be seen from Table 3, ten detections are achieved for those five moving objects altogether, and the average position is at the center of extend object, see Table 1 for the ground truth. Moreover, the velocity measurements also have a high precision, even for the mismatch one (#5), the maximal speed error is less than 0.05 and the direction of velocity is no more than 7 degrees among those three detections generated by this object.

Table 3 Detections by the plot extractor.

	$\hat{l},\hat{m}$	$\hat{V}$	$\hat{\theta}$ (deg)	object label
1	(20,14)	0.5	0
2	(20,15)	0.5	0	#1
3	(20,16)	0.5	0
4	(30,20)	0.09	87.1	#2
5	(34,29)	0.20	45	#3
6	(36,31)	0.20	45
7	(40,40)	0.31	135	#4
8	(47,49)	0.38	161.4
9	(45,50)	0.35	171.6	#5
10	(43,51)	0.38	161.4

Above all, as the point object test, it seems that 2D-KST has a good precision of velocity measurement for the extend object and can be easily integrated with the plot extractor or any other extend object tracking algorithm.

(a) Result after the first mergering step of MPD

(b) Result after the MPD processing

Figure 12 Result of 2D-KST for extend object. The color denotes the total power of the accumulated occupancies in this cell during the

N

time instants. The blue arrows denote the velocities of the grid cells, whose lengths are proportional to the speed.

6. Conclusion

This paper developed a different way of extracting motion information from successive noisy OGMs based on a signal transformation by extending the KST in the radar signal processing community to the 1D and 2D spatial case. And the fast algorithm for the 2DS-KST is also given and has the proportional computational complexity with the 2D-FFT. Simulation results show that our method can extract the sub-pixel motions effectively from the sequence of very noisy OGMs, which has a wide use, in many application scenarios, such as the industrial field, airport and other indoor environment.

Further evaluation by real data, multi-resolution KST for complex scenarios, integration with Bayesian Occupancy Filter and CTMAP, and hypothesis merging method based on more complex velocity model, are worth being paid attentions in the next step.

Data Availability Statement

Data will be made available on request.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62303478; in part by the ATR Foundation under Grant 2035250204; in part by the Key Lab. Foundation under Grant 220302.

Conflicts of Interest

The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

References

Coué, C., Pradalier, C., Laugier, C., Fraichard, T., & Bessière, P. (2006). Bayesian occupancy filtering for multitarget tracking: an automotive application. The International Journal of Robotics Research, 25(1), 19-30.
[CrossRef] [Google Scholar]
Tay, M. K., Mekhnacha, K., Yguel, M., Coue, C., Pradalier, C., Laugier, C., ... & Bessiere, P. (2008). The Bayesian occupation filter. In Probabilistic Reasoning and Decision Making in Sensory-Motor Systems (pp. 77-98). Berlin, Heidelberg: Springer Berlin Heidelberg.
[CrossRef] [Google Scholar]
Kondaxakis, P., Kasderidis, S., & Trahanias, P. (2008). Tracking multiple targets from a mobile robot platform using a laser range scanner. In 2008 IET Seminar on Target Tracking and Data Fusion: Algorithms and Applications (pp. 175-184).
[CrossRef] [Google Scholar]
Vu, T. D., Burlet, J., & Aycard, O. (2008). Mapping of environment, detection and tracking of moving objects using occupancy grids. In Intelligent Vehicles Symposium (pp. 684-689). Los Alamitos: IEEE.
[Google Scholar]
Gindele, T., Brechtel, S., Schroder, J., & Dillmann, R. (2009, June). Bayesian occupancy grid filter for dynamic environments using prior map knowledge. In 2009 IEEE Intelligent Vehicles Symposium (pp. 669-676). IEEE.
[CrossRef] [Google Scholar]
Baig, Q., Perrollaz, M., & Laugier, C. (2014). Advances in the Bayesian Occupancy Filter framework using robust motion detection technique for dynamic environment monitoring. IEEE Robotics and Automation Magazine.
[Google Scholar]
Wolcott, R. W., & Eustice, R. M. (2015, May). Fast LIDAR localization using multiresolution Gaussian mixture maps. In 2015 IEEE international conference on robotics and automation (ICRA) (pp. 2814-2821). IEEE.
[CrossRef] [Google Scholar]
Behley, J., & Stachniss, C. (2018, June). Efficient surfel-based SLAM using 3D laser range data in urban environments. In Robotics: science and systems (Vol. 2018, p. 59).
[Google Scholar]
Durrant-Whyte, H., & Bailey, T. (2006). Simultaneous localization and mapping: part I. IEEE robotics & automation magazine, 13(2), 99-110.
[CrossRef] [Google Scholar]
Bailey, T., & Durrant-Whyte, H. (2006). Simultaneous localization and mapping (SLAM): Part II. IEEE robotics & automation magazine, 13(3), 108-117.
[CrossRef] [Google Scholar]
Petrovskaya, A., Perrollaz, M., Oliveira, L., Spinello, L., Triebel, R., Makris, A., ... & Bessière, P. (2012). Awareness of road scene participants for autonomous driving. Handbook of Intelligent Vehicles, 1383-1432.
[Google Scholar]
Kucner, T., Saarinen, J., Magnusson, M., & Lilienthal, A. J. (2013, November). Conditional transition maps: Learning motion patterns in dynamic environments. In 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 1196-1201). IEEE.
[CrossRef] [Google Scholar]
Wang, C. C., Thorpe, C., Thrun, S., Hebert, M., & Durrant-Whyte, H. (2007). Simultaneous localization, mapping and moving object tracking. The International Journal of Robotics Research, 26(9), 889-916.
[CrossRef] [Google Scholar]
Ess, A., Leibe, B., Schindler, K., & Van Gool, L. (2009, May). Moving obstacle detection in highly dynamic scenes. In 2009 IEEE International Conference on Robotics and Automation (pp. 56-63). IEEE.
[CrossRef] [Google Scholar]
Baig, Q., Perrollaz, M., Do Nascimento, J. B., & Laugier, C. (2012, December). Using fast classification of static and dynamic environment for improving Bayesian occupancy filter (BOF) and tracking. In 2012 12th International Conference on Control Automation Robotics & Vision (ICARCV) (pp. 656-661). IEEE.
[CrossRef] [Google Scholar]
Hahnel, D., Triebel, R., Burgard, W., & Thrun, S. (2003, September). Map building with mobile robots in dynamic environments. In 2003 IEEE International Conference on Robotics and Automation (Cat. No. 03CH37422) (Vol. 2, pp. 1557-1563). IEEE.
[CrossRef] [Google Scholar]
Elfes, A. (1989). A probabilistic framework for robot perception and navigation. PhD thesis, Carnegie-Mellon University.
[Google Scholar]
Elfes, A. (1989). Using occupancy grids for mobile robot perception and navigation. Computer, 22(6), 46-57.
[CrossRef] [Google Scholar]
Pancham, A., Tlale, N., & Bright, G. (2011). Literature review of SLAM and DATMO. RobMech 2011. http://hdl.handle.net/10204/5457
[Google Scholar]
Baig, Q., & Aycard, O. (2012, June). Improving moving objects tracking using road model for laser data. In 2012 IEEE Intelligent Vehicles Symposium (pp. 790-795). IEEE.
[CrossRef] [Google Scholar]
Chen, C., Tay, C., Laugier, C., & Mekhnacha, K. (2006, December). Dynamic environment modeling with gridmap: a multiple-object tracking application. In 2006 9th International Conference on Control, Automation, Robotics and Vision (pp. 1-6). IEEE.
[CrossRef] [Google Scholar]
Mekhnacha, K., Mao, Y., Raulo, D., & Laugier, C. (2008, September). Bayesian occupancy filter based" fast clustering-tracking" algorithm. In IROS 2008.
[Google Scholar]
Tay, M. K., Mekhnacha, K., Chen, C., Yguel, M., & Laugier, C. (2008). An efficient formulation of the Bayesian occupation filter for target tracking in dynamic environments. International Journal of Vehicle Autonomous Systems, 6(1-2), 155-171.
[CrossRef] [Google Scholar]
Fan, H., Kucner, T. P., Magnusson, M., Li, T., & Lilienthal, A. J. (2017). A dual PHD filter for effective occupancy filtering in a highly dynamic environment. IEEE Transactions on Intelligent Transportation Systems, 19(9), 2977-2993.
[CrossRef] [Google Scholar]
Zhu, D., Li, Y., & Zhu, Z. (2007). A keystone transform without interpolation for SAR ground moving-target imaging. IEEE Geoscience and Remote Sensing Letters, 4(1), 18-22.
[CrossRef] [Google Scholar]
Zhao, Y., Wang, J., Huang, L., & Yang, R. (2011, October). Low complexity keystone transform without interpolation for dim moving target detection. In Proceedings of 2011 IEEE CIE International Conference on Radar (Vol. 2, pp. 1745-1748). IEEE.
[CrossRef] [Google Scholar]
Huang, P., Liao, G., Yang, Z., Xia, X. G., Ma, J., & Zheng, J. (2016). Ground maneuvering target imaging and high-order motion parameter estimation based on second-order keystone and generalized Hough-HAF transform. IEEE Transactions on Geoscience and Remote Sensing, 55(1), 320-335.
[CrossRef] [Google Scholar]
Huang, P., Liao, G., Yang, Z., Xia, X. G., Ma, J. T., & Ma, J. (2016). Long-time coherent integration for weak maneuvering target detection and high-order motion parameter estimation based on keystone transform. IEEE Transactions on Signal Processing, 64(15), 4013-4026.
[CrossRef] [Google Scholar]
Zhan, M., Huang, P., Zhu, S., Liu, X., Liao, G., Sheng, J., & Li, S. (2021). A modified keystone transform matched filtering method for space-moving target detection. IEEE Transactions on Geoscience and Remote Sensing, 60, 1-16.
[CrossRef] [Google Scholar]
Fan, H., Lu, D., Kucner, T. P., Magnusson, M., & Lilienthal, A. (2018, July). 2D spatial keystone transform for sub-pixel motion extraction from noisy occupancy grid map. In 2018 21st International Conference on Information Fusion (FUSION) (pp. 1-7). IEEE.
[CrossRef] [Google Scholar]
Richards, M. A. (2014). The keystone transformation for correcting range migration in range-doppler processing. pulse, 1000(1).
[Google Scholar]
Rabiner, L. R., Schafer, R. W., & Rader, C. M. (1969). The chirp z‐transform algorithm and its application. Bell System Technical Journal, 48(5), 1249-1292.
[CrossRef] [Google Scholar]
Rajmic, P., Prusa, Z., & Wiesmeyr, C. (2014, September). Computational cost of Chirp Z-transform and Generalized Goertzel algorithm. In 2014 22nd European Signal Processing Conference (EUSIPCO) (pp. 1004-1008). IEEE.
[Google Scholar]

Cite This Article

APA Style

Fan, H., Lu, D., Jiang, Y., & Lilienthal, A. J. (2024). Extraction of Motion Information from Occupancy Grid Map Using Keystone Transform. Chinese Journal of Information Fusion, 1(1), 63–78. https://doi.org/10.62762/CJIF.2024.361892

Article Metrics

Citations:

Google Scholar

Crossref

Scopus

Web of Science

Article Access Statistics:

PDF Downloads: 603

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Copyright © 2024 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

Chinese Journal of Information Fusion

ISSN: 2998-3371 (Online) | ISSN: 2998-3363 (Print)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/

Table of Content

1. Introduction

1.1 Related work

1.1.1 OO methods

1.1.2 CO methods

1.1.3 OCO methods

1.2 KST in radar signal processing

1.3 Scope of this paper

2. One Dimensional Spatail KST

3. Two Dimensional Spatial KST

3.1 2DS-KST with multiple hypotheses

3.2 Merging multiple hypotheses

4. Algorithm Implementation

5. Results

5.1 Point object test

5.2 Extend object test

6. Conclusion

Google Scholar

Crossref

Scopus

Web of Science

We use cookies