<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD with MathML3 v1.1d2 20140930//EN" "JATS-journalpublishing1-mathml3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.1d2" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="nlm-ta">CJIF</journal-id>
      <journal-id journal-id-type="publisher-id">ICCK</journal-id>
      <journal-title-group>
        <journal-title>Chinese Journal of Information Fusion</journal-title>
      </journal-title-group>
      <issn pub-type="ppub" publication-format="print">2998-3363</issn>
      <issn pub-type="epub" publication-format="electronic">2998-3371</issn>
      <publisher>
        <publisher-name>Institute of Central Computation and Knowledge Inc</publisher-name>
        <publisher-loc>522 W RIVERSIDE AVE STE N, SPOKANE, WA, 99201-0508, UNITED STATES</publisher-loc>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.62762/CJIF.2025.413277</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Research Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Radar Multi-Feature Graph Representation and Graph Network Fusion Target Detection Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">https://orcid.org/0009-0009-7805-7561</contrib-id>
          <name>
            <surname>Su</surname>
            <given-names>Ningyuan</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-1040-1655</contrib-id>
          <name>
            <surname>Chen</surname>
            <given-names>Xiaolong</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">https://orcid.org/0009-0006-3526-2635</contrib-id>
          <name>
            <surname>Guan</surname>
            <given-names>Jian</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">https://orcid.org/0009-0004-2606-4771</contrib-id>
          <name>
            <surname>Wang</surname>
            <given-names>Xinghai</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1112-5418</contrib-id>
          <name>
            <surname>Zhou</surname>
            <given-names>Liangjiang</given-names>
          </name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">https://orcid.org/0009-0005-9689-7952</contrib-id>
          <name>
            <surname>Wang</surname>
            <given-names>Jinhao</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Wang</surname>
            <given-names>Hongyong</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1"><label>1</label>Naval Aviation University, Yantai 264001, China</aff>
        <aff id="aff2"><label>2</label>Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China</aff>
      </contrib-group>
      <author-notes>
        <corresp id="cor2">Corresponding Author: Xiaolong Chen. Email: <email>cxlcxl1209@163.com</email></corresp>
      </author-notes>
      <pub-date date-type="pub" pub-type="epub" publication-format="online">
        <day>26</day>
        <month>3</month>
        <year>2025</year>
      </pub-date>
      <volume>2</volume>
      <issue>1</issue>
      <fpage>59</fpage>
      <lpage>69</lpage>
      <history>
        <date date-type="received">
          <day>27</day>
          <month>2</month>
          <year>2025</year>
        </date>
        <date date-type="accepted">
          <day>19</day>
          <month>3</month>
          <year>2025</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>© 2025 by the Authors. Published by Institute of Central Computation and Knowledge. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/).</copyright-statement>
        <copyright-year>2025</copyright-year>
        <copyright-holder>The Authors</copyright-holder>
        <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
        </license>
      </permissions>
      <self-uri xlink:href="https://www.icck.org/article/abs/cjif.2025.413277">This article is available from https://www.icck.org/article/abs/cjif.2025.413277</self-uri>
      <abstract>
        <p>In the context of neural network-based radar feature extraction and detection methods, single-feature detection approaches exhibit limited capability in distinguishing targets from background in complex environments such as sea clutter. To address this, a Multi-Feature Extraction Network and Graph Fusion Detection Network (MFEn-GFDn) method is proposed, leveraging feature complementarity and enhanced information utilization. MFEn extracts features from various time-frequency maps of radar signals to construct Multi-Feature Graph Data (MFG), a graph representation of multiple features. Subsequently, GFDn performs fusion detection on MFG containing multi-feature information. Experimental results on a dataset composed of measured IPIX data demonstrate that the detection probability of MFEn-GFDn is approximately 8% higher than that of the Dual-Channel Convolutional Neural Network (DCCNN). Additionally, MFEn-GFDn enhances detection performance by expanding the feature dimension, particularly in environments lacking corresponding training samples.</p>
      </abstract>
      <kwd-group kwd-group-type="author" xml:lang="en">
        <kwd>radar target detection</kwd>
        <kwd>feature fusion</kwd>
        <kwd>graph data</kwd>
        <kwd>graph fusion detection network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="S1">
      <label>1.</label>
      <title>Introduction</title>
      <p id="S1.p1">Due to the time-varying and non-stationary nature of sea clutter, as well as the diversity of target types and motion states [<xref rid="ref001" ref-type="bibr">1</xref>], data-driven target detection methods based on model features and deep learning struggle to distinguish targets from clutter signals in actual observation environments, which hinders further improvement in detection performance. Many scholars have explored the complementary effects of different features to enhance detection performance through multi-feature joint detection methods. Current multi-feature joint detection methods primarily rely on convex hull learning algorithms, which are challenging to apply in high-dimensional feature spaces. Time-frequency features, which reflect how the frequency distribution of the sea surface background and targets changes over time, provide effective support for sea surface target detection, especially as radar resolution improves [<xref rid="ref002" ref-type="bibr">2</xref>, <xref rid="ref003" ref-type="bibr">3</xref>]. Micro-Doppler theory reveals that sea surface targets exhibit micro-motion characteristics [<xref rid="ref004" ref-type="bibr">4</xref>, <xref rid="ref005" ref-type="bibr">5</xref>], reflecting changes in radial velocity and target image affected by waves (e.g., roll, pitch, yaw), which offers additional information for detection. Consequently, time-frequency features have been widely used in sea surface target detection because they enhance the difference in radial velocity variation between targets and sea clutter [<xref rid="ref006" ref-type="bibr">6</xref>, <xref rid="ref007" ref-type="bibr">7</xref>, <xref rid="ref008" ref-type="bibr">8</xref>, <xref rid="ref009" ref-type="bibr">9</xref>].</p>
      <p>
        <fig id="F1">
          <label>Figure 1.</label>
          <caption>
            <p>STFT features with low discrimination between clutter and targets.</p>
          </caption>
          <graphic xlink:href="fig1.jpg"/>
        </fig>
      </p>
      <p id="S1.p2">However, time-frequency feature detection methods face challenges similar to those of other feature detection methods. The complex sea detection environment and diverse target characteristics result in time-varying micro-motion features of targets, and the lack of regular, periodic frequency modulation characteristics degrades detection performance [<xref rid="ref010" ref-type="bibr">10</xref>]. Additionally, factors such as sea spikes cause sea clutter signals to exhibit two-dimensional time-frequency characteristics similar to those of targets. For instance, some scholars [<xref rid="ref011" ref-type="bibr">11</xref>] have used a CNN to process the time-frequency features of several types of micro-motion targets, achieving target recognition with good performance. However, analysis of sea radar echo data reveals that time-frequency features perform unstably in distinguishing sea clutter from target signals, primarily because of background characteristics. In sea surface target detection, targets are influenced by factors such as waves, leading to complex and discontinuous echo fluctuations, which appear as discontinuous areas of concentrated energy in the time-frequency features. Moreover, when the target radial speed is low, the target is more likely to overlap with clutter on the time-frequency map. Therefore, in actual sea observation environments, a single time-frequency feature is often insufficient to distinguish targets from clutter.</p>
      <p id="S1.p3">This article proposes a feature fusion detection method based on time-frequency features, from the perspective of data-driven neural network detection. It addresses the limited ability of single-feature detection methods to distinguish targets from backgrounds in complex sea clutter environments. The Multi-Feature Extraction Network (MFEn) and the Graph Fusion and Detection Network (GFDn) are proposed. Together, they represent multiple features of each signal sample as a graph and achieve multi-feature fusion detection through graph classification.</p>
    </sec>
    <sec id="S2">
      <label>2.</label>
      <title>Multi-Feature Extraction Network and Multi-Feature Graph Data</title>
      <p id="S2.p1">Under actual detection conditions, detection based on amplitude and time-frequency features faces numerous challenges due to complex environments and target characteristics. In high sea states or for weak targets, the amplitude feature discriminates far less well between targets and clutter signals. The Short-Time Fourier Transform (STFT) time-frequency feature also struggles to distinguish targets from clutter signals in many cases, as shown in Figure <xref ref-type="fig" rid="F1">1</xref>. Clutter signals may span a wide Doppler range while targets move with low radial velocities, so clutter may cover targets in the time-frequency feature. Additionally, clutter sometimes exhibits characteristics similar to those of targets, resulting in missed detections and false alarms.</p>
      <p id="S2.p2">To address the instability of single-model features in distinguishing target and clutter samples under complex conditions, increasing the number of model feature types is a crucial way to enhance detection performance. Some target signals may be difficult to distinguish from clutter in certain features but highly distinguishable in other feature domains [<xref rid="ref012" ref-type="bibr">12</xref>]. However, as the feature dimensionality increases, integrating multiple features and making decisions to generate detection results becomes a key and challenging research problem. The DCCNN fusion detection method [<xref rid="ref013" ref-type="bibr">13</xref>] extracts more types of features by increasing the number of channels, obtaining high-dimensional combined features, and then performs fusion detection through a feature fusion classifier. However, as the feature dimension increases, the number of network parameters rises significantly, making the model difficult to train and fit. Furthermore, expanding the training dataset is an essential way to improve the detection performance and generalization ability of deep learning methods. In practical detection tasks, the detection network is obtained by optimizing parameters on a fixed, finite dataset. However, the diversity of potential target types, motion types, and background features makes it challenging to achieve stable detection performance when the target under test differs significantly from the samples in the training dataset.</p>
      <p id="S2.p3">In response to these issues, this paper proposes the MFEn-GFDn method for target detection, which improves the detection performance of multi-feature neural network detection methods under complex conditions. Self-supervised and adaptive structures replace various components of the complex network [<xref rid="ref014" ref-type="bibr">14</xref>], easing model training and enhancing the model's generalization ability.</p>
      <p id="S2.p4">Compared to the DCCNN method, the MFEn-GFDn fusion detection method differs in three ways: 1) in the hidden feature extraction channel structure, different input features share the same channel; 2) hidden features are combined into a graph structure through graph representation, replacing feature concatenation; 3) GFDn extracts and aggregates features from graph data composed of different features, instead of relying solely on the Multi-Layer Perceptron (MLP) module in DCCNN for feature fusion detection.</p>
      <p>
        <fig id="F2">
          <label>Figure 2.</label>
          <caption>
            <p>Network structure and data processing flow of the proposed method.</p>
          </caption>
          <graphic xlink:href="fig2.jpg"/>
        </fig>
      </p>
      <p>
        <fig id="F3">
          <label>Figure 3.</label>
          <caption>
            <p>Structure of GFDn.</p>
          </caption>
          <graphic xlink:href="fig3.jpg"/>
        </fig>
      </p>
      <p id="S2.p5">As shown in Figure <xref ref-type="fig" rid="F2">2</xref>, the Multi-Feature Extraction Network (MFEn) is constructed from the encoder part of a Convolutional Autoencoder (CAE) and consists of 2 convolutional layers, 2 pooling layers, and 1 fully connected layer. The first convolutional layer has 64 3×3 convolution kernels, and the second has 128 3×3 convolution kernels. The fully connected layer outputs a 256-dimensional hidden feature vector. The parameters are optimized by training the full CAE. An autoencoder (AE) is an unsupervised neural network model that learns hidden features of its input data for dimensionality reduction. Like an AE, a CAE includes both an encoder and a decoder, and its training involves encoding and decoding stages. In the encoding stage, the encoder maps the input data to the hidden layer space. The decoder then decodes the hidden features output by the encoder and reconstructs the corresponding input data. After training, only the encoder is kept as the feature extraction channel. The CAE introduces convolutional layers, which enhance feature extraction for two-dimensional data such as time-frequency maps. Additionally, the CAE loss is a function of the reconstructed output and the raw input rather than of the classification labels, which reduces the interference of data labeling errors when training the feature extraction network.</p>
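      <p>As a rough check on the layer sizes listed above, the following sketch tallies output shapes and parameter counts for an encoder with the stated structure. The 64×64 single-channel input, the size-preserving ("same") padding, the 2×2 pooling, and the bias terms are assumptions made for illustration only, and the function names are hypothetical; none of these details is given in the text.</p>
      <preformat>
```python
def conv_params(in_ch, out_ch, k):
    # Kernel weights plus one bias per output channel (bias is an assumption).
    return out_ch * in_ch * k * k + out_ch


def encoder_summary(height=64, width=64, in_ch=1, k=3, pool=2, hidden=256):
    """List (layer, output shape, parameter count) for the encoder.

    Structure from the text: conv(64 kernels) + pool, conv(128 kernels) + pool,
    then a fully connected layer with a 256-dimensional output.
    Input size, padding, and pooling size are illustrative assumptions.
    """
    layers = []
    ch = in_ch
    for out_ch in (64, 128):
        # Assumed "same" padding: the convolution keeps height and width.
        layers.append(("conv", (out_ch, height, width), conv_params(ch, out_ch, k)))
        # Assumed pool x pool pooling: halves height and width when pool is 2.
        height, width = height // pool, width // pool
        layers.append(("pool", (out_ch, height, width), 0))
        ch = out_ch
    flat = ch * height * width
    layers.append(("fc", (hidden,), flat * hidden + hidden))
    return layers


summary = encoder_summary()
total = sum(params for _, _, params in summary)
for name, shape, params in summary:
    print(name, shape, params)
print("total parameters:", total)
```
      </preformat>
      <p>Under these assumptions the fully connected layer holds almost all of the parameters, so the chosen input size and pooling factor largely determine the size of the encoder.</p>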
      <p id="S2.p6">Multiple hidden features are extracted from different model features of the same signal sample using MFEn for graph representation, with each sample corresponding to a Multi-Feature Graph Data (MFG) <inline-formula><mml:math alttext="G_{i}(V,E,F)" display="inline"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>V</mml:mi><mml:mo>,</mml:mo><mml:mi>E</mml:mi><mml:mo>,</mml:mo><mml:mi>F</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. Each node in the node set of MFG corresponds to a hidden feature of the sample. The feature matrix of MFG <inline-formula><mml:math alttext="F\in\mathbb{R}^{N_{node}\times N_{feature}}" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi>ℝ</mml:mi><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo lspace="0.222em" rspace="0.222em">×</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>u</mml:mi><mml:mo>⁢</mml:mo><mml:mi>r</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the set of all hidden feature information, <inline-formula><mml:math alttext="N_{node}" display="inline"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the node dimension of MFG, and <inline-formula><mml:math alttext="N_{feature}" 
display="inline"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>u</mml:mi><mml:mo>⁢</mml:mo><mml:mi>r</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the dimension of node features. The nodes are fully connected to one another, so the edge set <inline-formula><mml:math alttext="E" display="inline"><mml:mi>E</mml:mi></mml:math></inline-formula>, represented as an adjacency matrix, is an all-ones matrix.</p>
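      <p>The construction of one MFG can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper name <monospace>build_mfg</monospace>, the dictionary layout, the choice of three nodes, and the random placeholder vectors standing in for MFEn outputs are all assumptions. Only the 256-dimensional node features and the all-ones adjacency matrix come from the text.</p>
      <preformat>
```python
import random


def build_mfg(hidden_features):
    """Build one Multi-Feature Graph from the hidden features of one sample.

    hidden_features: one vector per model feature (one graph node each).
    Returns the feature matrix F (N_node x N_feature) and the adjacency
    matrix A of a fully connected graph (all ones, self-loops included).
    """
    n_node = len(hidden_features)
    n_feature = len(hidden_features[0])
    if any(len(h) != n_feature for h in hidden_features):
        raise ValueError("all hidden features must share one dimension")
    feature_matrix = [list(h) for h in hidden_features]
    adjacency = [[1] * n_node for _ in range(n_node)]
    return {"F": feature_matrix, "A": adjacency}


# Placeholder data: three 256-dimensional vectors standing in for the MFEn
# outputs of three time-frequency maps of one sample (the count is assumed).
rng = random.Random(0)
hidden = [[rng.random() for _ in range(256)] for _ in range(3)]
graph = build_mfg(hidden)
print(len(graph["F"]), len(graph["F"][0]))
print(graph["A"])
```
      </preformat>
      <p>Because the adjacency matrix is fixed, the graph structure itself carries no sample-specific information in this sketch; all of it sits in the feature matrix.</p>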
    </sec>
    <sec id="S3">
      <label>3.</label>
      <title>Graph Fusion Detection Network</title>
      <p id="S3.p1">When hidden features are fused through fully connected layers, the parameters of the trained network are fixed, so each feature is assigned a fixed weight during fusion. However, under actual detection conditions there is no stable and effective criterion for modeling the correlation between different features. Therefore, a Graph Fusion and Detection Network (GFDn) structure is proposed, which learns fusion weights as network parameters and adaptively adjusts them based on the hidden feature information. The MFG and GFDn structures are shown in Figure <xref ref-type="fig" rid="F3">3</xref>.</p>
      <p id="S3.p2">The final detection result for the MFG in Figure <xref ref-type="fig" rid="F3">3</xref> is obtained through GFDn fusion detection. During this process, the data dimension is reduced from <inline-formula><mml:math alttext="N_{node}\times N_{feature}" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo lspace="0.222em" rspace="0.222em">×</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>u</mml:mi><mml:mo>⁢</mml:mo><mml:mi>r</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> to a single scalar of size <inline-formula><mml:math alttext="1\times 1" display="inline"><mml:mrow><mml:mn>1</mml:mn><mml:mo lspace="0.222em" rspace="0.222em">×</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula>, which represents the probability that the sample is classified as a target. GFDn consists of a feature fusion network and a feature detection network. The feature fusion network consists of a graph attention convolutional layer, a graph pooling layer [<xref rid="ref015" ref-type="bibr">15</xref>], and a feature reading module. In the graph attention convolutional layer, each node serves as a central node, and its features are fused with those of other nodes using the corresponding attention coefficients:</p>
      <p>
        <disp-formula id="S3.E1">
          <mml:math alttext="z_{i}^{(l)}=W^{(l)}h_{i}^{(l)}" display="block">
            <mml:mrow>
              <mml:msubsup>
                <mml:mi>z</mml:mi>
                <mml:mi>i</mml:mi>
                <mml:mrow>
                  <mml:mo stretchy="false">(</mml:mo>
                  <mml:mi>l</mml:mi>
                  <mml:mo stretchy="false">)</mml:mo>
                </mml:mrow>
              </mml:msubsup>
              <mml:mo>=</mml:mo>
              <mml:mrow>
                <mml:msup>
                  <mml:mi>W</mml:mi>
                  <mml:mrow>
                    <mml:mo stretchy="false">(</mml:mo>
                    <mml:mi>l</mml:mi>
                    <mml:mo stretchy="false">)</mml:mo>
                  </mml:mrow>
                </mml:msup>
                <mml:mo>⁢</mml:mo>
                <mml:msubsup>
                  <mml:mi>h</mml:mi>
                  <mml:mi>i</mml:mi>
                  <mml:mrow>
                    <mml:mo stretchy="false">(</mml:mo>
                    <mml:mi>l</mml:mi>
                    <mml:mo stretchy="false">)</mml:mo>
                  </mml:mrow>
                </mml:msubsup>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p>
        <disp-formula id="S3.E2">
          <mml:math alttext="e_{ij}^{(l)}=\text{LeakyReLU}\left(a^{(l)^{T}}\left(z_{i}^{(l)}\,\|\,z_{j}^{(l)}\right)\right)" display="block">
            <mml:mrow>
              <mml:msubsup>
                <mml:mi>e</mml:mi>
                <mml:mrow>
                  <mml:mi>i</mml:mi>
                  <mml:mo>⁢</mml:mo>
                  <mml:mi>j</mml:mi>
                </mml:mrow>
                <mml:mrow>
                  <mml:mo stretchy="false">(</mml:mo>
                  <mml:mi>l</mml:mi>
                  <mml:mo stretchy="false">)</mml:mo>
                </mml:mrow>
              </mml:msubsup>
              <mml:mo>=</mml:mo>
              <mml:mrow>
                <mml:mtext>LeakyReLU</mml:mtext>
                <mml:mo>⁢</mml:mo>
                <mml:mrow>
                  <mml:mo>(</mml:mo>
                  <mml:mrow>
                    <mml:msup>
                      <mml:mi>a</mml:mi>
                      <mml:msup>
                        <mml:mrow>
                          <mml:mo stretchy="false">(</mml:mo>
                          <mml:mi>l</mml:mi>
                          <mml:mo stretchy="false">)</mml:mo>
                        </mml:mrow>
                        <mml:mi>T</mml:mi>
                      </mml:msup>
                    </mml:msup>
                    <mml:mo>⁢</mml:mo>
                    <mml:mrow>
                      <mml:mo>(</mml:mo>
                      <mml:mrow>
                        <mml:msubsup>
                          <mml:mi>z</mml:mi>
                          <mml:mi>i</mml:mi>
                          <mml:mrow>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mi>l</mml:mi>
                            <mml:mo stretchy="false">)</mml:mo>
                          </mml:mrow>
                        </mml:msubsup>
                        <mml:mo rspace="0.448em">∥</mml:mo>
                        <mml:msubsup>
                          <mml:mi>z</mml:mi>
                          <mml:mi>j</mml:mi>
                          <mml:mrow>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mi>l</mml:mi>
                            <mml:mo stretchy="false">)</mml:mo>
                          </mml:mrow>
                        </mml:msubsup>
                      </mml:mrow>
                      <mml:mo>)</mml:mo>
                    </mml:mrow>
                  </mml:mrow>
                  <mml:mo>)</mml:mo>
                </mml:mrow>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p>
        <disp-formula id="S3.E3">
          <mml:math alttext="\alpha_{ij}^{(l)}=\frac{\exp(e_{ij}^{(l)})}{\sum_{m\in\tilde{N}(v_{i})}\exp(e_{im}^{(l)})}" display="block">
            <mml:mrow>
              <mml:msubsup>
                <mml:mi>α</mml:mi>
                <mml:mrow>
                  <mml:mi>i</mml:mi>
                  <mml:mo>⁢</mml:mo>
                  <mml:mi>j</mml:mi>
                </mml:mrow>
                <mml:mrow>
                  <mml:mo stretchy="false">(</mml:mo>
                  <mml:mi>l</mml:mi>
                  <mml:mo stretchy="false">)</mml:mo>
                </mml:mrow>
              </mml:msubsup>
              <mml:mo>=</mml:mo>
              <mml:mfrac>
                <mml:mrow>
                  <mml:mi>exp</mml:mi>
                  <mml:mo>⁡</mml:mo>
                  <mml:mrow>
                    <mml:mo stretchy="false">(</mml:mo>
                    <mml:msubsup>
                      <mml:mi>e</mml:mi>
                      <mml:mrow>
                        <mml:mi>i</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>j</mml:mi>
                      </mml:mrow>
                      <mml:mrow>
                        <mml:mo stretchy="false">(</mml:mo>
                        <mml:mi>l</mml:mi>
                        <mml:mo stretchy="false">)</mml:mo>
                      </mml:mrow>
                    </mml:msubsup>
                    <mml:mo stretchy="false">)</mml:mo>
                  </mml:mrow>
                </mml:mrow>
                <mml:mrow>
                  <mml:msub>
                    <mml:mo>∑</mml:mo>
                    <mml:mrow>
                      <mml:mi>m</mml:mi>
                      <mml:mo>∈</mml:mo>
                      <mml:mrow>
                        <mml:mover accent="true">
                          <mml:mi>N</mml:mi>
                          <mml:mo>~</mml:mo>
                        </mml:mover>
                        <mml:mo>⁢</mml:mo>
                        <mml:mrow>
                          <mml:mo stretchy="false">(</mml:mo>
                          <mml:msub>
                            <mml:mi>v</mml:mi>
                            <mml:mi>i</mml:mi>
                          </mml:msub>
                          <mml:mo stretchy="false">)</mml:mo>
                        </mml:mrow>
                      </mml:mrow>
                    </mml:mrow>
                  </mml:msub>
                  <mml:mrow>
                    <mml:mi>exp</mml:mi>
                    <mml:mo>⁡</mml:mo>
                    <mml:mrow>
                      <mml:mo stretchy="false">(</mml:mo>
                      <mml:msubsup>
                        <mml:mi>e</mml:mi>
                        <mml:mrow>
                          <mml:mi>i</mml:mi>
                          <mml:mo>⁢</mml:mo>
                          <mml:mi>m</mml:mi>
                        </mml:mrow>
                        <mml:mrow>
                          <mml:mo stretchy="false">(</mml:mo>
                          <mml:mi>l</mml:mi>
                          <mml:mo stretchy="false">)</mml:mo>
                        </mml:mrow>
                      </mml:msubsup>
                      <mml:mo stretchy="false">)</mml:mo>
                    </mml:mrow>
                  </mml:mrow>
                </mml:mrow>
              </mml:mfrac>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p>
        <disp-formula id="S3.E4">
          <mml:math alttext="h_{i}^{(l+1)}=\sigma\left(\sum_{j\in\tilde{N}(v_{i})}\alpha_{ij}^{(l)}z_{j}^{(l)}\right)" display="block">
            <mml:mrow>
              <mml:msubsup>
                <mml:mi>h</mml:mi>
                <mml:mi>i</mml:mi>
                <mml:mrow>
                  <mml:mo stretchy="false">(</mml:mo>
                  <mml:mrow>
                    <mml:mi>l</mml:mi>
                    <mml:mo>+</mml:mo>
                    <mml:mn>1</mml:mn>
                  </mml:mrow>
                  <mml:mo stretchy="false">)</mml:mo>
                </mml:mrow>
              </mml:msubsup>
              <mml:mo>=</mml:mo>
              <mml:mrow>
                <mml:mi>σ</mml:mi>
                <mml:mo>⁢</mml:mo>
                <mml:mrow>
                  <mml:mo>(</mml:mo>
                  <mml:mrow>
                    <mml:munder>
                      <mml:mo lspace="0em" movablelimits="false">∑</mml:mo>
                      <mml:mrow>
                        <mml:mi>j</mml:mi>
                        <mml:mo>∈</mml:mo>
                        <mml:mrow>
                          <mml:mover accent="true">
                            <mml:mi>N</mml:mi>
                            <mml:mo>~</mml:mo>
                          </mml:mover>
                          <mml:mo>⁢</mml:mo>
                          <mml:mrow>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:msub>
                              <mml:mi>v</mml:mi>
                              <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo stretchy="false">)</mml:mo>
                          </mml:mrow>
                        </mml:mrow>
                      </mml:mrow>
                    </mml:munder>
                    <mml:mrow>
                      <mml:msubsup>
                        <mml:mi>α</mml:mi>
                        <mml:mrow>
                          <mml:mi>i</mml:mi>
                          <mml:mo>⁢</mml:mo>
                          <mml:mi>j</mml:mi>
                        </mml:mrow>
                        <mml:mrow>
                          <mml:mo stretchy="false">(</mml:mo>
                          <mml:mi>l</mml:mi>
                          <mml:mo stretchy="false">)</mml:mo>
                        </mml:mrow>
                      </mml:msubsup>
                      <mml:mo>⁢</mml:mo>
                      <mml:msubsup>
                        <mml:mi>z</mml:mi>
                        <mml:mi>j</mml:mi>
                        <mml:mrow>
                          <mml:mo stretchy="false">(</mml:mo>
                          <mml:mi>l</mml:mi>
                          <mml:mo stretchy="false">)</mml:mo>
                        </mml:mrow>
                      </mml:msubsup>
                    </mml:mrow>
                  </mml:mrow>
                  <mml:mo>)</mml:mo>
                </mml:mrow>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p id="S3.p3">where <inline-formula><mml:math alttext="W^{(l)}" display="inline"><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> is the parameter of feature extraction in this layer, <inline-formula><mml:math alttext="h_{i}^{(l)}" display="inline"><mml:msubsup><mml:mi>h</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is the input feature of node <inline-formula><mml:math alttext="i" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> in this layer. <inline-formula><mml:math alttext="z_{i}^{(l)}" display="inline"><mml:msubsup><mml:mi>z</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>, the preliminary feature extraction result, is obtained from <inline-formula><mml:math alttext="h_{i}^{(l)}" display="inline"><mml:msubsup><mml:mi>h</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> via <inline-formula><mml:math alttext="W^{(l)}" display="inline"><mml:msup><mml:mi>W</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula>. <inline-formula><mml:math alttext="e_{ij}^{(l)}" display="inline"><mml:msubsup><mml:mi>e</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is the preliminary attention coefficient between paired nodes. 
<inline-formula><mml:math alttext="a^{(l)}" display="inline"><mml:msup><mml:mi>a</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>l</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> is the attention parameter matrix. <inline-formula><mml:math alttext="v_{j}\in\tilde{N}(v_{i})" display="inline"><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>∈</mml:mo><mml:mrow><mml:mover accent="true"><mml:mi>N</mml:mi><mml:mo>~</mml:mo></mml:mover><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> is the mask of neighboring nodes. <inline-formula><mml:math alttext="h_{i}^{(l+1)}" display="inline"><mml:msubsup><mml:mi>h</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is the output of this graph attention convolutional layer.</p>
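      <p>As an illustration of the layer described above, the following NumPy sketch computes one graph attention convolution. It assumes the standard single-head GAT form: the scoring rule (LeakyReLU applied to the attention parameters times the concatenated pair of node features), self-loops in the neighborhood, ReLU as the output activation, and every function and variable name are assumptions made for illustration, not the authors' implementation.</p>
      <p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def gat_layer(H, A, W, a, negative_slope=0.2):
    """One graph attention convolution (single head, standard GAT form assumed).

    H: (N, F_in) node features h_i
    A: (N, N) adjacency matrix, nonzero where an edge exists
    W: (F_in, F_out) feature extraction weights
    a: (2 * F_out,) attention parameters
    """
    N = H.shape[0]
    Z = H @ W                                    # z_i = W h_i
    F_out = Z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]), split into the z_i and z_j parts
    src = Z @ a[:F_out]
    dst = Z @ a[F_out:]
    E = src[:, None] + dst[None, :]
    E = np.where(E > 0, E, negative_slope * E)
    # Neighborhood mask with self-loops (assumed meaning of the tilde)
    mask = (A != 0) | np.eye(N, dtype=bool)
    E = np.where(mask, E, -np.inf)
    # alpha_ij = softmax of e_ij over the masked neighborhood of node i
    E = E - E.max(axis=1, keepdims=True)
    alpha = np.exp(E)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    # h_i^(l+1) = sigma(sum_j alpha_ij z_j), with ReLU standing in for sigma
    return np.maximum(alpha @ Z, 0.0), alpha

# Hypothetical example: 12 nodes on a ring, 256-dimensional features
rng = np.random.default_rng(0)
N = 12
i = np.arange(N)
A = np.zeros((N, N))
A[i, (i + 1) % N] = 1
A[(i + 1) % N, i] = 1
H = rng.normal(size=(N, 256))
W = rng.normal(size=(256, 256)) * 0.05
a = rng.normal(size=512) * 0.05
H_next, alpha = gat_layer(H, A, W, a)
```
]]></preformat>
      </p>
      <p>Each row of the returned attention matrix sums to one and is zero outside the neighborhood of the corresponding node, which is the role of the neighbor mask in the normalization.</p>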
      <p id="S3.p4">The graph pooling layer selects the nodes to be retained according to the pooling attention coefficient of each node. After several pooling layers, the multi-feature fusion result held by the single remaining node is retained as the final fused embedded feature. The graph attention convolutional layer performs the fusion of the input features. In this article, the node feature dimension of the input graph data, i.e., the dimension of the hidden features, is <inline-formula><mml:math alttext="N_{feature}=256" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>u</mml:mi><mml:mo>⁢</mml:mo><mml:mi>r</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>256</mml:mn></mml:mrow></mml:math></inline-formula>.</p>
      <p>
        <fig id="F4">
          <label>Figure 4.</label>
          <caption>
            <p>Process of graph pooling layer.</p>
          </caption>
          <graphic xlink:href="fig4.jpg"/>
        </fig>
      </p>
      <p id="S3.p5">Figure <xref ref-type="fig" rid="F4">4</xref> shows an example of the graph pooling layer. During the graph pooling process, the node dimension of the sample features <inline-formula><mml:math alttext="F" display="inline"><mml:mi>F</mml:mi></mml:math></inline-formula>, i.e., the number of hidden features, is reduced from <inline-formula><mml:math alttext="N_{node}" display="inline"><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> to 1 through multiple graph pooling layers. In the <inline-formula><mml:math alttext="m" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula>th graph pooling layer, the pooling attention coefficient of each node, <inline-formula><mml:math alttext="\mathbf{z}_{att}^{(m)}" display="inline"><mml:msubsup><mml:mi>𝐳</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>m</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>, is first calculated from the network parameters of the layer, <inline-formula><mml:math alttext="W_{att}^{(m)}" display="inline"><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>m</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>, and the node features <inline-formula><mml:math alttext="h_{i}^{(m)}" display="inline"><mml:msubsup><mml:mi>h</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>m</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula>, <inline-formula><mml:math alttext="i=1,2,\dots,N_{node}" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow></mml:math></inline-formula>, input to the layer:</p>
      <p>
        <disp-formula id="S3.E5">
          <mml:math alttext="z_{att,i}^{(m)}=W_{att}^{(m)}h_{i}^{(m)}" display="block">
            <mml:mrow>
              <mml:msubsup>
                <mml:mi>z</mml:mi>
                <mml:mrow>
                  <mml:mrow>
                    <mml:mi>a</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>t</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>t</mml:mi>
                  </mml:mrow>
                  <mml:mo>,</mml:mo>
                  <mml:mi>i</mml:mi>
                </mml:mrow>
                <mml:mrow>
                  <mml:mo stretchy="false">(</mml:mo>
                  <mml:mi>m</mml:mi>
                  <mml:mo stretchy="false">)</mml:mo>
                </mml:mrow>
              </mml:msubsup>
              <mml:mo>=</mml:mo>
              <mml:mrow>
                <mml:msubsup>
                  <mml:mi>W</mml:mi>
                  <mml:mrow>
                    <mml:mi>a</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>t</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>t</mml:mi>
                  </mml:mrow>
                  <mml:mrow>
                    <mml:mo stretchy="false">(</mml:mo>
                    <mml:mi>m</mml:mi>
                    <mml:mo stretchy="false">)</mml:mo>
                  </mml:mrow>
                </mml:msubsup>
                <mml:mo>⁢</mml:mo>
                <mml:msubsup>
                  <mml:mi>h</mml:mi>
                  <mml:mi>i</mml:mi>
                  <mml:mrow>
                    <mml:mo stretchy="false">(</mml:mo>
                    <mml:mi>m</mml:mi>
                    <mml:mo stretchy="false">)</mml:mo>
                  </mml:mrow>
                </mml:msubsup>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p id="S3.p6">Then, based on the preset graph pooling rate <inline-formula><mml:math alttext="k\in(0,1]" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>∈</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, the indices of the nodes retained in this layer and their attention coefficients <inline-formula><mml:math alttext="Z_{idx}" display="inline"><mml:msub><mml:mi>Z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are obtained as:</p>
      <p>
        <disp-formula id="S3.E6">
          <mml:math alttext="\text{idx}=\text{top-rank}\left(Z,\left\lceil kN\right\rceil\right)\quad Z_{mask}=Z_{idx}" display="block">
            <mml:mrow>
              <mml:mrow>
                <mml:mtext>idx</mml:mtext>
                <mml:mo>=</mml:mo>
                <mml:mrow>
                  <mml:mtext>top-rank</mml:mtext>
                  <mml:mo>⁢</mml:mo>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mi>Z</mml:mi>
                    <mml:mo>,</mml:mo>
                    <mml:mrow>
                      <mml:mo>⌈</mml:mo>
                      <mml:mrow>
                        <mml:mi>k</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>N</mml:mi>
                      </mml:mrow>
                      <mml:mo>⌉</mml:mo>
                    </mml:mrow>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                </mml:mrow>
              </mml:mrow>
              <mml:mspace width="1em"/>
              <mml:mrow>
                <mml:msub>
                  <mml:mi>Z</mml:mi>
                  <mml:mrow>
                    <mml:mi>m</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>a</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>s</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>k</mml:mi>
                  </mml:mrow>
                </mml:msub>
                <mml:mo>=</mml:mo>
                <mml:msub>
                  <mml:mi>Z</mml:mi>
                  <mml:mrow>
                    <mml:mi>i</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>d</mml:mi>
                    <mml:mo>⁢</mml:mo>
                    <mml:mi>x</mml:mi>
                  </mml:mrow>
                </mml:msub>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p id="S3.p7">where <inline-formula><mml:math alttext="Z_{mask}" display="inline"><mml:msub><mml:mi>Z</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the graph attention mask.</p>
      <p>
        <disp-formula id="S3.E7">
          <mml:math alttext="X^{\prime}=X_{idx,:},\quad X_{out}=X^{\prime}\odot Z_{mask},\quad A_{out}=A_{idx,idx}" display="block">
            <mml:mrow>
              <mml:mrow>
                <mml:msup>
                  <mml:mi>X</mml:mi>
                  <mml:mo>′</mml:mo>
                </mml:msup>
                <mml:mo>=</mml:mo>
                <mml:msub>
                  <mml:mi>X</mml:mi>
                  <mml:mrow>
                    <mml:mrow>
                      <mml:mi>i</mml:mi>
                      <mml:mo>⁢</mml:mo>
                      <mml:mi>d</mml:mi>
                      <mml:mo>⁢</mml:mo>
                      <mml:mi>x</mml:mi>
                    </mml:mrow>
                    <mml:mo>,</mml:mo>
                    <mml:mo>:</mml:mo>
                  </mml:mrow>
                </mml:msub>
              </mml:mrow>
              <mml:mo rspace="1.167em">,</mml:mo>
              <mml:mrow>
                <mml:mrow>
                  <mml:msub>
                    <mml:mi>X</mml:mi>
                    <mml:mrow>
                      <mml:mi>o</mml:mi>
                      <mml:mo>⁢</mml:mo>
                      <mml:mi>u</mml:mi>
                      <mml:mo>⁢</mml:mo>
                      <mml:mi>t</mml:mi>
                    </mml:mrow>
                  </mml:msub>
                  <mml:mo>=</mml:mo>
                  <mml:mrow>
                    <mml:msup>
                      <mml:mi>X</mml:mi>
                      <mml:mo>′</mml:mo>
                    </mml:msup>
                    <mml:mo lspace="0.222em" rspace="0.222em">⊙</mml:mo>
                    <mml:msub>
                      <mml:mi>Z</mml:mi>
                      <mml:mrow>
                        <mml:mi>m</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>a</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>s</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>k</mml:mi>
                      </mml:mrow>
                    </mml:msub>
                  </mml:mrow>
                </mml:mrow>
                <mml:mo rspace="1.167em">,</mml:mo>
                <mml:mrow>
                  <mml:msub>
                    <mml:mi>A</mml:mi>
                    <mml:mrow>
                      <mml:mi>o</mml:mi>
                      <mml:mo>⁢</mml:mo>
                      <mml:mi>u</mml:mi>
                      <mml:mo>⁢</mml:mo>
                      <mml:mi>t</mml:mi>
                    </mml:mrow>
                  </mml:msub>
                  <mml:mo>=</mml:mo>
                  <mml:msub>
                    <mml:mi>A</mml:mi>
                    <mml:mrow>
                      <mml:mrow>
                        <mml:mi>i</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>d</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>x</mml:mi>
                      </mml:mrow>
                      <mml:mo>,</mml:mo>
                      <mml:mrow>
                        <mml:mi>i</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>d</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>x</mml:mi>
                      </mml:mrow>
                    </mml:mrow>
                  </mml:msub>
                </mml:mrow>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
      <p>where <inline-formula><mml:math alttext="X_{idx,:}" display="inline"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>,</mml:mo><mml:mo>:</mml:mo></mml:mrow></mml:msub></mml:math></inline-formula> is the feature of each node based on the node index, <inline-formula><mml:math alttext="\odot" display="inline"><mml:mo>⊙</mml:mo></mml:math></inline-formula> represents the operation of preserving the node feature of some nodes based on the mask, and <inline-formula><mml:math alttext="A_{idx,idx}" display="inline"><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>d</mml:mi><mml:mo>⁢</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:mrow></mml:msub></mml:math></inline-formula> is the adjacency matrix of the graph data after preserving some nodes based on the index. <inline-formula><mml:math alttext="X_{out}" display="inline"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>u</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math alttext="A_{out}" display="inline"><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>u</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represent the node features and adjacency matrix of the output graph data from this layer.</p>
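      <p>The pooling steps above can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the pooling parameters are taken to be a single weight vector giving one score per node, the mask is applied by scaling the retained features with the tanh of their scores (a choice borrowed from self-attention graph pooling, since the text only states that features are preserved according to the mask), and the names <monospace>topk_pool</monospace>, <monospace>w_att</monospace>, and the example rates are hypothetical.</p>
      <p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
import numpy as np

def topk_pool(X, A, w_att, k):
    """Attention-based top-k graph pooling sketch.

    X: (N, F) node features, A: (N, N) adjacency matrix,
    w_att: (F,) pooling attention weights, k: pooling rate in (0, 1].
    """
    N = X.shape[0]
    z = X @ w_att                                   # z_att,i = W_att h_i
    n_keep = math.ceil(k * N)                       # ceil(kN) nodes are kept
    idx = np.argsort(-z, kind="stable")[:n_keep]    # idx = top-rank(Z, ceil(kN))
    z_mask = z[idx]                                 # Z_mask = Z_idx
    X_prime = X[idx, :]                             # X' = X_idx,:
    # X_out = X' (.) Z_mask; the tanh gating is an assumption, see lead-in
    X_out = X_prime * np.tanh(z_mask)[:, None]
    A_out = A[np.ix_(idx, idx)]                     # A_out = A_idx,idx
    return X_out, A_out, idx

# Hypothetical example: three pooling layers applied to a 12-node graph.
# The rates are chosen only so that the node counts become 6, 3, and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 256))
A = np.ones((12, 12)) - np.eye(12)
sizes = []
for rate in (0.5, 0.5, 0.3):
    w_att = rng.normal(size=X.shape[1])
    X, A, idx = topk_pool(X, A, w_att, rate)
    sizes.append(X.shape[0])
```
]]></preformat>
      </p>
      <p>Note that a single rate of 0.5 in all three layers would leave two nodes after the third layer, because the ceiling of 1.5 is 2, so the last rate in this sketch is set lower.</p>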
      <p id="S3.p8">The MFG input to GFDn in this paper consists of 12 nodes, corresponding to 12 hidden features. Graph pooling layers 1, 2, and 3 retain 6, 3, and 1 nodes, respectively. After each pooling layer, the output features of that layer are read out through a feature readout operation:</p>
      <p>
        <disp-formula id="S3.E8">
          <mml:math alttext="Z_{read}=\frac{1}{\left\lceil kN\right\rceil}\sum_{i=1}^{\left\lceil kN\right\rceil}x^{\prime}_{i}\Bigg{\|}\max_{i=1}^{\left\lceil kN\right\rceil}(x^{\prime}_{i})" display="block">
            <mml:mrow>
              <mml:msub>
                <mml:mi>Z</mml:mi>
                <mml:mrow>
                  <mml:mi>r</mml:mi>
                  <mml:mo>⁢</mml:mo>
                  <mml:mi>e</mml:mi>
                  <mml:mo>⁢</mml:mo>
                  <mml:mi>a</mml:mi>
                  <mml:mo>⁢</mml:mo>
                  <mml:mi>d</mml:mi>
                </mml:mrow>
              </mml:msub>
              <mml:mo>=</mml:mo>
              <mml:mrow>
                <mml:mrow>
                  <mml:mfrac>
                    <mml:mn>1</mml:mn>
                    <mml:mrow>
                      <mml:mo>⌈</mml:mo>
                      <mml:mrow>
                        <mml:mi>k</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>N</mml:mi>
                      </mml:mrow>
                      <mml:mo>⌉</mml:mo>
                    </mml:mrow>
                  </mml:mfrac>
                  <mml:mo>⁢</mml:mo>
                  <mml:mrow>
                    <mml:munderover>
                      <mml:mo movablelimits="false">∑</mml:mo>
                      <mml:mrow>
                        <mml:mi>i</mml:mi>
                        <mml:mo>=</mml:mo>
                        <mml:mn>1</mml:mn>
                      </mml:mrow>
                      <mml:mrow>
                        <mml:mo>⌈</mml:mo>
                        <mml:mrow>
                          <mml:mi>k</mml:mi>
                          <mml:mo>⁢</mml:mo>
                          <mml:mi>N</mml:mi>
                        </mml:mrow>
                        <mml:mo>⌉</mml:mo>
                      </mml:mrow>
                    </mml:munderover>
                    <mml:msubsup>
                      <mml:mi>x</mml:mi>
                      <mml:mi>i</mml:mi>
                      <mml:mo>′</mml:mo>
                    </mml:msubsup>
                  </mml:mrow>
                </mml:mrow>
                <mml:mo mathsize="2.600em">∥</mml:mo>
                <mml:mrow>
                  <mml:munderover>
                    <mml:mi>max</mml:mi>
                    <mml:mrow>
                      <mml:mi>i</mml:mi>
                      <mml:mo>=</mml:mo>
                      <mml:mn>1</mml:mn>
                    </mml:mrow>
                    <mml:mrow>
                      <mml:mo>⌈</mml:mo>
                      <mml:mrow>
                        <mml:mi>k</mml:mi>
                        <mml:mo>⁢</mml:mo>
                        <mml:mi>N</mml:mi>
                      </mml:mrow>
                      <mml:mo>⌉</mml:mo>
                    </mml:mrow>
                  </mml:munderover>
                  <mml:mo>⁡</mml:mo>
                  <mml:mrow>
                    <mml:mo stretchy="false">(</mml:mo>
                    <mml:msubsup>
                      <mml:mi>x</mml:mi>
                      <mml:mi>i</mml:mi>
                      <mml:mo>′</mml:mo>
                    </mml:msubsup>
                    <mml:mo stretchy="false">)</mml:mo>
                  </mml:mrow>
                </mml:mrow>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
      </p>
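      <p>The readout in the equation above concatenates the mean and the element-wise maximum of the retained node features. A minimal NumPy sketch, with the function name <monospace>readout</monospace> and the small input matrix chosen only for illustration:</p>
      <p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def readout(X_out):
    """Z_read = mean over nodes || max over nodes, for X_out of shape (n_keep, F)."""
    return np.concatenate([X_out.mean(axis=0), X_out.max(axis=0)])

# Two retained nodes with two features each
X_out = np.array([[1.0, 4.0],
                  [3.0, 2.0]])
z_read = readout(X_out)   # mean [2, 3] followed by max [3, 4]
```
]]></preformat>
      </p>
      <p>The readout vector therefore has twice the node feature dimension, regardless of how many nodes the pooling layer retains.</p>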
      <p id="S3.p9">The embedded feature obtained by concatenating the readout features of each layer is input into the feature detection network. The feature detection network is an MLP composed of two fully connected layers with output dimensions of 64 and 2, respectively, which maps the fused embedded feature to the detection result.</p>
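      <p>A sketch of such a detection head is given below. The layer widths of 64 and 2 come from the text; the ReLU hidden activation, the softmax output, the per-layer readout width of 512 (mean and maximum of 256-dimensional node features, giving 3 times 512 = 1536 inputs for three pooling layers), and all names are assumptions for illustration.</p>
      <p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def detection_head(readouts, W1, b1, W2, b2):
    """Two fully connected layers (64 and 2 outputs) on the concatenated readouts."""
    z = np.concatenate(readouts)               # fused embedded feature
    hidden = np.maximum(z @ W1 + b1, 0.0)      # 64 units, ReLU is an assumption
    logits = hidden @ W2 + b2                  # 2 units: clutter vs. target
    p = np.exp(logits - logits.max())
    return p / p.sum()                         # softmax class probabilities

# Hypothetical inputs: one 512-dimensional readout per pooling layer
rng = np.random.default_rng(0)
readouts = [rng.normal(size=512) for _ in range(3)]
W1 = rng.normal(size=(1536, 64)) * 0.05
W2 = rng.normal(size=(64, 2)) * 0.05
probs = detection_head(readouts, W1, np.zeros(64), W2, np.zeros(2))
```
]]></preformat>
      </p>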
    </sec>
    <sec id="S4">
      <label>4.</label>
      <title>Experimental analysis</title>
      <p id="S4.p1">This article tests the performance of the proposed method on measured staring-mode radar signals from the IPIX dataset. Staring-mode radar signals consist of multiple coherent pulses collected over a long observation time, which makes them suitable for extracting various features of the detection samples.</p>
      <sec id="S4.SS1">
        <label>4.1</label>
        <title>Dataset Introduction</title>
        <p id="S4.SS1.p1">Intelligent Pixel Processing (IPIX) [<xref rid="ref016" ref-type="bibr">16</xref>] data are high-resolution sea clutter measurements that are widely used in sea clutter research. They were collected by Haykin of McMaster University in measurement experiments with the IPIX radar in 1993 (Dartmouth, Nova Scotia) and 1998 (Grimsby, Ontario). The radar parameters, data formats, and other related information are shown in Tables <xref rid="T1" ref-type="table">1</xref> and <xref rid="T2" ref-type="table">2</xref>. Three sets of data from the 1993 experiment are used to validate the performance of the method in this paper.</p>
        <p>
          <table-wrap id="T1">
            <label>Table 1</label>
            <caption>
              <p>IPIX radar parameters.</p>
            </caption>
            <table>
              <thead>
                <tr>
                  <th style="border-top: 1px solid black;" colspan="4" align="center">Radar parameters</th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <td style="border-top: 1px solid black;" align="center">Antenna gain</td>
                  <td style="border-top: 1px solid black;" align="center">44 dB</td>
                  <td style="border-top: 1px solid black;" align="center">Peak power</td>
                  <td style="border-top: 1px solid black;" align="center">8 kW</td>
                </tr>
                <tr>
                  <td align="center">Sidelobe</td>
                  <td align="center">-30 dB</td>
                  <td align="center">Antenna diameter</td>
                  <td align="center">2.4 m</td>
                </tr>
                <tr>
                  <td align="center">Instantaneous dynamic range</td>
                  <td align="center">50 dB</td>
                  <td align="center">Beamwidth</td>
                  <td align="center">0.9°</td>
                </tr>
                <tr>
                  <td align="center">Range resolution</td>
                  <td align="center">30 m</td>
                  <td align="center">Bandwidth</td>
                  <td align="center">5 MHz</td>
                </tr>
                <tr>
                  <td style="border-bottom: 1px solid black;" align="center">Polarization</td>
                  <td style="border-bottom: 1px solid black;" align="center">HH/VV/HV/VH</td>
                  <td style="border-bottom: 1px solid black;" align="center">Pulse repetition frequency</td>
                  <td style="border-bottom: 1px solid black;" align="center">1000 Hz</td>
                </tr>
              </tbody>
            </table>
          </table-wrap>
        </p>
        <p>
          <table-wrap id="T2">
            <label>Table 2</label>
            <caption>
              <p>IPIX data information.</p>
            </caption>
            <table>
              <thead>
                <tr>
                  <th style="border-top: 1px solid black;" align="center">No.</th>
                  <th style="border-top: 1px solid black;" align="center">File name</th>
                  <th style="border-top: 1px solid black;" align="center">Target cell</th>
                  <th style="border-top: 1px solid black;" align="center">Guard cells</th>
                  <th style="border-top: 1px solid black;" align="center">Sea state</th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <td style="border-top: 1px solid black;" align="center">IPIX01</td>
                  <td style="border-top: 1px solid black;" align="center">19931108_220902_starea</td>
                  <td style="border-top: 1px solid black;" align="center">7</td>
                  <td style="border-top: 1px solid black;" align="center">6, 8</td>
                  <td style="border-top: 1px solid black;" align="center">2</td>
                </tr>
                <tr>
                  <td align="center">IPIX02</td>
                  <td align="center">19931118_023604_stareC0000</td>
                  <td align="center">8</td>
                  <td align="center">7, 9-10</td>
                  <td align="center">3</td>
                </tr>
                <tr>
                  <td style="border-bottom: 1px solid black;" align="center">IPIX03</td>
                  <td style="border-bottom: 1px solid black;" align="center">19931107_135603_starea</td>
                  <td style="border-bottom: 1px solid black;" align="center">9</td>
                  <td style="border-bottom: 1px solid black;" align="center">8, 10-11</td>
                  <td style="border-bottom: 1px solid black;" align="center">4</td>
                </tr>
              </tbody>
            </table>
          </table-wrap>
        </p>
        <p id="S4.SS1.p2">Amplitude and time-frequency features are widely studied in radar target detection, and in recent years scholars have proposed many feature models for target detection. In this experiment, the multiple features comprise signals with different polarization modes and time-frequency maps obtained by different time-frequency analysis methods. The echo characteristics of signals with different polarization modes differ significantly. Taking the IPIX01 data as an example, the data include radar signals with four polarization modes: HH, HV, VH, and VV. Experiments show that the target detection performance differs significantly across these four types of data. A dataset was constructed using 6000 clutter samples and 6000 target samples, and binary classification was performed using LeNet. The results are shown in Table <xref rid="T3" ref-type="table">3</xref>.</p>
        <p>
          <table-wrap id="T3">
            <label>Table 3</label>
            <caption>
              <p>Comparison of polarization mode characteristics.</p>
            </caption>
            <table>
              <thead>
                <tr>
                  <th style="border-top: 1px solid black;" align="center">Polarization mode</th>
                  <th style="border-top: 1px solid black;" align="center">HH</th>
                  <th style="border-top: 1px solid black;" align="center">HV</th>
                  <th style="border-top: 1px solid black;" align="center">VH</th>
                  <th style="border-top: 1px solid black;" align="center">VV</th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <td style="border-top: 1px solid black;" align="center">Accuracy (target)</td>
                  <td style="border-top: 1px solid black;" align="center">89.57%</td>
                  <td style="border-top: 1px solid black;" align="center">87.04%</td>
                  <td style="border-top: 1px solid black;" align="center">87.18%</td>
                  <td style="border-top: 1px solid black;" align="center">88.70%</td>
                </tr>
                <tr>
                  <td style="border-bottom: 1px solid black;" align="center">Accuracy (clutter)</td>
                  <td style="border-bottom: 1px solid black;" align="center">96.74%</td>
                  <td style="border-bottom: 1px solid black;" align="center">96.15%</td>
                  <td style="border-bottom: 1px solid black;" align="center">97.28%</td>
                  <td style="border-bottom: 1px solid black;" align="center">97.08%</td>
                </tr>
              </tbody>
            </table>
          </table-wrap>
        </p>
        <p id="S4.SS1.p3">From Table <xref rid="T3" ref-type="table">3</xref>, it can be seen that data with different polarization modes differ in how well target and clutter samples can be distinguished. Under HH polarization, reflection from the sea surface is weaker, especially in low sea states, so the sea surface echo intensity is lower and the contrast between target and clutter improves. STFT is a classical time-frequency analysis method for sea clutter signals [<xref rid="ref017" ref-type="bibr">17</xref>]. However, its window function causes energy dispersion. The Wigner-Ville Distribution (WVD) avoids the energy dispersion caused by window functions and, when analyzing sea clutter, accumulates the energy of the target signal well, but cross-term interference limits its performance. The Smoothed Pseudo Wigner-Ville Distribution (SPWVD) suppresses the cross-term interference of WVD. As shown in Figure <xref ref-type="fig" rid="F5">5</xref>, the features used in this section are three time-frequency representations (STFT, WVD, and SPWVD) of the four polarization signals, totaling 12 features.</p>
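        <p>To make the difference between the windowed and the quadratic representation concrete, the following NumPy sketch computes a magnitude STFT and a discrete WVD of a synthetic complex tone. The window length, hop size, function names, and the test signal are assumptions for illustration and are not the settings used in the experiments; SPWVD, which additionally smooths the WVD in time and lag, is omitted for brevity.</p>
        <p>
          <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def stft_mag(x, win_len=64, hop=16):
    """Magnitude STFT with a Hann window; returns (n_frames, win_len)."""
    window = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    frames = np.stack([x[s:s + win_len] * window for s in starts])
    return np.abs(np.fft.fft(frames, axis=1))

def wvd(x):
    """Discrete Wigner-Ville distribution; returns (len(x), len(x)).

    Row n is the FFT over lag m of x[n+m] * conj(x[n-m]). Because the lag
    product doubles the phase increment, a tone at normalized frequency f
    appears at bin 2 * f * len(x).
    """
    N = len(x)
    out = np.zeros((N, N))
    for n in range(N):
        m_max = min(n, N - 1 - n)              # largest lag inside the signal
        m = np.arange(-m_max, m_max + 1)
        kernel = np.zeros(N, dtype=complex)
        kernel[m % N] = x[n + m] * np.conj(x[n - m])
        out[n] = np.fft.fft(kernel).real       # Hermitian kernel, real spectrum
    return out

# Synthetic test signal (not IPIX data): complex tone at normalized frequency 0.125
n = np.arange(128)
tone = np.exp(2j * np.pi * 0.125 * n)
S = stft_mag(tone)
Wd = wvd(tone)
```
]]></preformat>
        </p>
        <p>Applying three such transforms to each of the four polarization channels would yield the 12 time-frequency maps described above.</p>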
        <p>
          <fig id="F5">
            <label>Figure 5.</label>
            <caption>
              <p>Examples of the features.</p>
            </caption>
            <graphic xlink:href="fig5.jpg"/>
          </fig>
        </p>
        <p>
          <fig id="F6">
            <label>Figure 6.</label>
            <caption>
              <p>Detection performance comparison of models with different training data.</p>
            </caption>
            <p>
              <fig id="F6.fig1">
                <caption>
                  <p>(a) Test results, IPIX01 test data</p>
                </caption>
                <graphic xlink:href="fig6a.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F6.fig2">
                <caption>
                  <p>(b) False alarm loss, IPIX01 test data</p>
                </caption>
                <graphic xlink:href="fig6b.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F6.fig3">
                <caption>
                  <p>(c) Test results, IPIX02 test data</p>
                </caption>
                <graphic xlink:href="fig6c.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F6.fig4">
                <caption>
                  <p>(d) False alarm loss, IPIX02 test data</p>
                </caption>
                <graphic xlink:href="fig6d.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F6.fig5">
                <caption>
                  <p>(e) Test results, IPIX03 test data</p>
                </caption>
                <graphic xlink:href="fig6e.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F6.fig6">
                <caption>
                  <p>(f) False alarm loss, IPIX03 test data</p>
                </caption>
                <graphic xlink:href="fig6f.jpg"/>
              </fig>
            </p>
          </fig>
        </p>
        <p id="S4.SS1.p4">This paper proposes a multi-feature fusion network structure. Time-frequency features are studied as an example; model features in other domains are not investigated further. The experiments are run in TensorFlow 1.13. The training parameters are a batch size of 32, a learning rate of 0.01, and 10000 epochs. Weights are initialized with Xavier initialization, and biases are initialized to 0. Parameters are optimized with a gradient descent optimizer and an MSE loss function, while GFDn uses the Adam optimizer and a cross-entropy loss function.</p>
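        <p>For readers unfamiliar with the initialization mentioned above, the following Python sketch gathers the stated hyperparameters and shows one common form of Xavier initialization. It is only an illustration under assumptions: the text does not say whether the uniform or the normal variant was used, so the uniform variant is assumed here, and the layer size of 64 by 32, the dictionary layout, and the function name xavier_uniform are hypothetical.</p>
        <p>
          <code language="python"><![CDATA[
```python
import math
import random

# Hyperparameters quoted in the text; the dictionary layout is hypothetical.
TRAINING_CONFIG = {
    "batch_size": 32,
    "learning_rate": 0.01,
    "epochs": 10000,
    "bias_init": 0.0,
}


def xavier_uniform(fan_in, fan_out, rng):
    """Draw a fan_in x fan_out weight matrix from U(-limit, limit).

    limit = sqrt(6 / (fan_in + fan_out)) is the uniform variant of Xavier
    (Glorot) initialization; which variant the paper used is an assumption.
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


rng = random.Random(0)                      # fixed seed for repeatability
weights = xavier_uniform(64, 32, rng)       # hypothetical 64 -> 32 layer
biases = [TRAINING_CONFIG["bias_init"]] * 32
weight_limit = math.sqrt(6.0 / (64 + 32))   # bound on every drawn weight
```
]]></code>
        </p>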
      </sec>
      <sec id="S4.SS2">
        <label>4.2</label>
        <title>Performance analysis of data detection in different environments</title>
        <p id="S4.SS2.p1">Seven training datasets are built from the IPIX01-03 data, as shown in Table <xref rid="T4" ref-type="table">4</xref>.</p>
        <p>
          <table-wrap id="T4">
            <label>Table 4</label>
            <caption>
              <p>Explanation of training sets for different sea states.</p>
            </caption>
            <table>
              <thead>
                <tr>
                  <th style="border-top: 1px solid black;" align="center">  No.</th>
                  <th style="border-top: 1px solid black;" align="center">  Training set data</th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <th style="border-top: 1px solid black;" align="center">  1</th>
                  <td style="border-top: 1px solid black;" align="center">  IPIX01</td>
                </tr>
                <tr>
                  <th align="center">  2</th>
                  <td align="center">  IPIX02</td>
                </tr>
                <tr>
                  <th align="center">  3</th>
                  <td align="center">  IPIX03</td>
                </tr>
                <tr>
                  <th align="center">  4</th>
                  <td align="center">  IPIX01, IPIX02, IPIX03 mixture</td>
                </tr>
                <tr>
                  <th align="center">  5</th>
                  <td align="center">  IPIX01, IPIX02 mixture</td>
                </tr>
                <tr>
                  <th align="center">  6</th>
                  <td align="center">  IPIX01, IPIX03 mixture</td>
                </tr>
                <tr>
                  <th style="border-bottom: 1px solid black;" align="center">  7</th>
                  <td style="border-bottom: 1px solid black;" align="center">  IPIX02, IPIX03 mixture</td>
                </tr>
              </tbody>
            </table>
          </table-wrap>
        </p>
        <p id="S4.SS2.p2">Datasets 1, 2, and 3 each consist of observation data from a single environment. Dataset 4 is an equal mixture of samples from the three environments, and Datasets 5, 6, and 7 each consist of observation data from two environments. DCCNN and MFEn-GFDn were trained on the different training sets, and their detection performance was tested on three test sets (IPIX01-03). The results are shown in Figure <xref ref-type="fig" rid="F6">6</xref>.</p>
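        <p>The composition of the training sets in Table 4 can be written down compactly, as in the Python sketch below. The sketch also lists which environments each training set leaves out and draws an equal number of samples per environment. The sample counts, the placeholder samples, and the names unseen_environments and equal_mixture are hypothetical, and drawing equal shares for every mixed set is an assumption, since the text states an equal mixture only for Dataset 4.</p>
        <p>
          <code language="python"><![CDATA[
```python
import random
from itertools import combinations

ENVIRONMENTS = ("IPIX01", "IPIX02", "IPIX03")

# Training set number -> source datasets, copied from Table 4.
TRAINING_SETS = {
    1: ("IPIX01",),
    2: ("IPIX02",),
    3: ("IPIX03",),
    4: ("IPIX01", "IPIX02", "IPIX03"),
    5: ("IPIX01", "IPIX02"),
    6: ("IPIX01", "IPIX03"),
    7: ("IPIX02", "IPIX03"),
}

# Sets 5, 6, and 7 are exactly the three possible pairs of environments.
PAIR_SETS = list(combinations(ENVIRONMENTS, 2))


def unseen_environments(set_no):
    """Environments that do not contribute to the given training set."""
    return tuple(e for e in ENVIRONMENTS if e not in TRAINING_SETS[set_no])


def equal_mixture(samples_by_env, set_no, per_env, rng):
    """Draw per_env samples from each environment of a training set."""
    mixed = []
    for env in TRAINING_SETS[set_no]:
        mixed.extend(rng.sample(samples_by_env[env], per_env))
    rng.shuffle(mixed)
    return mixed


# Hypothetical stand-in samples: (environment, index) pairs, not radar data.
samples_by_env = {env: [(env, i) for i in range(100)] for env in ENVIRONMENTS}
mixture = equal_mixture(samples_by_env, 4, per_env=20, rng=random.Random(0))
```
]]></code>
        </p>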
        <p id="S4.SS2.p3">First, the detection results on the IPIX01 data are analyzed, as shown in Figure <xref ref-type="fig" rid="F6">6</xref> (a). The detection performance curve reflects the separability and distribution of targets and clutter: the closer the detection value is to 0, the more likely the sample is clutter. The detection probabilities of DCCNN trained on training sets 1 and 4 are close. MFEn-GFDn exhibits a higher detection probability than DCCNN at low false alarm rates, and it significantly improves the discrimination between targets and clutter near the target decision (detection value 1) in the detection result domain. Expanding the types of features can therefore further improve the discrimination between targets and clutter. The false alarm loss curve reflects how well the model, with its parameters optimized on the training set, adapts to the clutter samples in the test set. As shown in Figure <xref ref-type="fig" rid="F6">6</xref> (b), the false alarm losses of MFEn-GFDn and DCCNN are similar.</p>
        <p id="S4.SS2.p4">As shown in Figures <xref ref-type="fig" rid="F6">6</xref> (c) and (d), in the IPIX02 testing experiment, the detection performance is similar across training sets and models. High-performance detection can be achieved with the STFT features of the amplitude and HH polarization data. The MFEn-GFDn method further improves the discrimination between targets and clutter by expanding the types of features. In the DCCNN detection results obtained with training set 4, some clutter samples are firmly detected as targets, so the false alarm rate cannot be reduced further when the threshold is controlled by the training samples. MFEn-GFDn avoids this problem.</p>
        <p id="S4.SS2.p5">As shown in Figures <xref ref-type="fig" rid="F6">6</xref> (e)-(f) and Table <xref rid="T5" ref-type="table">5</xref>, in the IPIX03 testing experiment, the DCCNN trained on the mixed data of training set 4 improves detection performance significantly over the DCCNN trained on the single-environment training set 3 (detection probability 0.596 versus 0.467) and effectively suppresses false alarm losses. Compared with DCCNN, MFEn-GFDn has a significant advantage in detection performance. It is worth noting that under the high sea state of IPIX03 the false alarm loss is severe, and the detection results of both DCCNN and MFEn-GFDn contain a large number of stable false alarms that are difficult to suppress.</p>
        <p id="S4.SS2.p6">Expanding the training dataset is an effective way to improve the detection probability and generalization ability of data-driven detection methods. The effective utilization of information can further enhance the performance of the method.</p>
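        <p>The false alarm rate and detection probability discussed above can be computed from detector outputs as in the following Python sketch, which sets a threshold from clutter scores for a desired false alarm rate and then counts exceedances. It is a generic illustration, not the thresholding procedure of DCCNN or MFEn-GFDn: the scores are made-up values, and the function names threshold_for_false_alarm_rate and rate_above are hypothetical.</p>
        <p>
          <code language="python"><![CDATA[
```python
def threshold_for_false_alarm_rate(clutter_scores, target_pfa):
    """Threshold whose empirical false alarm rate does not exceed target_pfa.

    A sample is declared a target when its score is strictly above the
    threshold, so the threshold is the (allowed + 1)-th largest clutter score.
    """
    ordered = sorted(clutter_scores, reverse=True)
    # Number of clutter samples that may exceed the threshold.
    allowed = int(target_pfa * len(ordered) + 1e-9)
    if allowed >= len(ordered):
        return float("-inf")
    return ordered[allowed]


def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)


# Made-up detector outputs in [0, 1]; values near 1 indicate a target.
clutter_scores = [0.05, 0.10, 0.02, 0.90, 0.20, 0.15, 0.08, 0.30, 0.12, 0.01]
target_scores = [0.95, 0.85, 0.40, 0.25, 0.99]

threshold = threshold_for_false_alarm_rate(clutter_scores, target_pfa=0.1)
false_alarm_rate = rate_above(clutter_scores, threshold)
detection_probability = rate_above(target_scores, threshold)
```
]]></code>
        </p>
        <p>In this toy example, the clutter sample with score 0.90 plays the role of a stable false alarm: it stays above the threshold unless the threshold is raised to a level that also rejects some of the target samples.</p>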
        <p>
          <table-wrap id="T5">
            <label>Table 5</label>
            <caption>
              <p>Detection results on IPIX03.</p>
            </caption>
            <table>
              <thead>
                <tr>
                  <th style="border-top: 1px solid black;"/>
                  <th style="border-top: 1px solid black;" align="center">False alarm rate</th>
                  <th style="border-top: 1px solid black;" align="center">Detection probability</th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <th style="border-top: 1px solid black;" align="center">DCCNN<break/>(Training set 3)</th>
                  <td style="border-top: 1px solid black;" align="center">0.00224</td>
                  <td style="border-top: 1px solid black;" align="center">0.467</td>
                </tr>
                <tr>
                  <th align="center">DCCNN<break/>(Training set 4)</th>
                  <td align="center">0.00144</td>
                  <td align="center">0.596</td>
                </tr>
                <tr>
                  <th style="border-bottom: 1px solid black;" align="center">MFEn-GFDn</th>
                  <td style="border-bottom: 1px solid black;" align="center">0.00144</td>
                  <td style="border-bottom: 1px solid black;" align="center">0.681</td>
                </tr>
              </tbody>
            </table>
          </table-wrap>
        </p>
        <p id="S4.SS2.p7">Under actual observation conditions, the targets or the environment may not have been observed before, which places demands on the generalization ability of detection methods. Three detection experiments are conducted to test DCCNN and MFEn-GFDn, one on each of IPIX01-IPIX03, with the corresponding training set chosen from sets 5, 6, and 7. The test results are shown in Figure <xref ref-type="fig" rid="F7">7</xref>.</p>
        <p>
          <fig id="F7">
            <label>Figure 7.</label>
            <caption>
              <p>Detection performance comparison between DCCNN and MFEn-GFDn in new environments.</p>
            </caption>
            <p>
              <fig id="F7.fig1">
                <caption>
                  <p>(a) Test results, IPIX01 test data</p>
                </caption>
                <graphic xlink:href="fig7a.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F7.fig2">
                <caption>
                  <p>(b) False alarm loss, IPIX01 test data</p>
                </caption>
                <graphic xlink:href="fig7b.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F7.fig3">
                <caption>
                  <p>(c) Test results, IPIX02 test data</p>
                </caption>
                <graphic xlink:href="fig7c.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F7.fig4">
                <caption>
                  <p>(d) False alarm loss, IPIX02 test data</p>
                </caption>
                <graphic xlink:href="fig7d.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F7.fig5">
                <caption>
                  <p>(e) Test results, IPIX03 test data</p>
                </caption>
                <graphic xlink:href="fig7e.jpg"/>
              </fig>
            </p>
            <p>
              <fig id="F7.fig6">
                <caption>
                  <p>(f) False alarm loss, IPIX03 test data</p>
                </caption>
                <graphic xlink:href="fig7f.jpg"/>
              </fig>
            </p>
          </fig>
        </p>
        <p>
          <table-wrap id="T6">
            <label>Table 6</label>
            <caption>
              <p>Detection performance comparison on IPIX03.</p>
            </caption>
            <table>
              <tbody>
                <tr>
                  <td style="border-top: 1px solid black;"/>
                  <th style="border-top: 1px solid black;" align="center">False alarm rate</th>
                  <th style="border-top: 1px solid black;" align="center">Detection probability</th>
                </tr>
                <tr>
                  <td style="border-top: 1px solid black;" align="center">DCCNN</td>
                  <td style="border-top: 1px solid black;" align="center">0.00193</td>
                  <td style="border-top: 1px solid black;" align="center">0.411</td>
                </tr>
                <tr>
                  <td style="border-bottom: 1px solid black;" align="center">MFEn-GFDn</td>
                  <td style="border-bottom: 1px solid black;" align="center">0.00176</td>
                  <td style="border-bottom: 1px solid black;" align="center">0.612</td>
                </tr>
              </tbody>
            </table>
          </table-wrap>
        </p>
        <p id="S4.SS2.p8">As shown in Figure <xref ref-type="fig" rid="F7">7</xref> (a)-(f), MFEn-GFDn has a significant advantage in detection probability in the IPIX01-IPIX03 testing experiments. However, because the test and training samples come from data obtained in different environments, their features differ more, so detection performance decreases and false alarm losses are significant for both methods on all three test sets. In the IPIX02 testing experiment, the detection probability of DCCNN decreases significantly, as shown in Figure <xref ref-type="fig" rid="F7">7</xref> (c)-(d). Compared with DCCNN, MFEn-GFDn achieves a higher detection probability and a lower false alarm loss on data that differ significantly from the training set, but false alarm loss remains difficult to avoid completely in some environments, as shown in Figure <xref ref-type="fig" rid="F7">7</xref> (f) and Table <xref rid="T6" ref-type="table">6</xref>. More types of model features enhance the discrimination between target and clutter samples, while MFEn, through self-supervised training, keeps the network from overfitting the training set, which alleviates the reduced ability to distinguish samples with unknown characteristics.</p>
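        <p>The size of the detection probability gains can be read directly from Tables 5 and 6; the short Python sketch below does the arithmetic. The table values are copied from the paper, while the function name pd_gain and the row labels used as dictionary keys are hypothetical. Note that the two rows of Table 6 are reported at slightly different false alarm rates, so that comparison is approximate.</p>
        <p>
          <code language="python"><![CDATA[
```python
# (false alarm rate, detection probability) copied from Tables 5 and 6.
TABLE_5 = {
    "DCCNN (training set 3)": (0.00224, 0.467),
    "DCCNN (training set 4)": (0.00144, 0.596),
    "MFEn-GFDn": (0.00144, 0.681),
}
TABLE_6 = {
    "DCCNN": (0.00193, 0.411),
    "MFEn-GFDn": (0.00176, 0.612),
}


def pd_gain(table, baseline, candidate):
    """Return the (absolute, relative) gain in detection probability."""
    base_pd = table[baseline][1]
    cand_pd = table[candidate][1]
    return cand_pd - base_pd, (cand_pd - base_pd) / base_pd


gain_table_5 = pd_gain(TABLE_5, "DCCNN (training set 4)", "MFEn-GFDn")
gain_table_6 = pd_gain(TABLE_6, "DCCNN", "MFEn-GFDn")
```
]]></code>
        </p>
        <p>With these values, the absolute gain of MFEn-GFDn over DCCNN is 0.085 in Table 5 and 0.201 in Table 6.</p>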
      </sec>
    </sec>
    <sec id="S5">
      <label>5.</label>
      <title>Conclusion</title>
      <p id="S5.p1">This paper addresses the limited ability of single-feature detection methods to distinguish targets from the background in complex sea clutter. By expanding the types of features and exploiting the complementarity between different features, an MFEn-GFDn feature fusion detection method is proposed. Multiple time-frequency maps of radar signals are extracted via the CAE-based MFEn to construct the radar signal MFG. The MFG, which contains information from multiple features, is then fused and used for detection by GFDn. In target detection experiments across different environments, expanding the dataset significantly improves the model's generalization ability. MFEn-GFDn trained on a mixture of the three datasets achieves a detection probability about 8% higher than that of DCCNN. In addition, MFEn-GFDn further improves detection performance by expanding the feature dimensions and generalizes better, especially in environments that lack corresponding training samples. For practical sea detection, the proposed method still requires data from more sea states for network training, but it may achieve higher performance when detecting in new sea environments. The proposed method also has higher computational complexity because more time-frequency features must be extracted.</p>
    </sec>
  </body>
  <back>
    <ack>
      <title>Acknowledgments</title>
      <p id="ack.p1">This work was supported by the National Natural Science Foundation of China under Grant 62222120 and the Natural Science Foundation of Shandong under Grant ZR2024JQ003.</p>
    </ack>
    <sec id="sec0100" sec-type="COI-statement">
      <title>Conflict of interest</title>
      <p>The authors declare no conflicts of interest.</p>
    </sec>
    <ref-list>
      <title>References</title>
      <ref id="ref001">
        <label>[1]</label>
        <mixed-citation> Xin, Z., Liao, G., Yang, Z., Zhang, Y., &amp; Dang, H. (2016). A deterministic sea-clutter space–time model based on physical sea surface. <italic>IEEE Transactions on Geoscience and Remote Sensing</italic>, 54(11), 6659-6673. [<uri>https://doi.org/10.1109/TGRS.2016.2587739</uri>] </mixed-citation>
      </ref>
      <ref id="ref002">
        <label>[2]</label>
        <mixed-citation> del-Rey-Maestre, N., Jarabo-Amores, M. P., Mata-Moya, D., Barcena-Humanes, J. L., &amp; del Hoyo, P. G. (2017). Machine learning techniques for coherent CFAR detection based on statistical modeling of UHF passive ground clutter. <italic>IEEE Journal of Selected Topics in Signal Processing</italic>, 12(1), 104-118. [<uri>https://doi.org/10.1109/JSTSP.2017.2780798</uri>] </mixed-citation>
      </ref>
      <ref id="ref003">
        <label>[3]</label>
        <mixed-citation> Jarabo-Amores, M. P., Rosa-Zurera, M., Gil-Pita, R., &amp; Lopez-Ferreras, F. (2009). Study of two error functions to approximate the Neyman–Pearson detector using supervised learning machines. <italic>IEEE Transactions on Signal Processing</italic>, 57(11), 4175-4181. [<uri>https://doi.org/10.1109/TSP.2009.2025077</uri>] </mixed-citation>
      </ref>
      <ref id="ref004">
        <label>[4]</label>
        <mixed-citation> del-Rey-Maestre, N., Mata-Moya, D., Jarabo-Amores, M. P., Gomez-del-Hoyo, P. J., &amp; Barcena-Humanes, J. L. (2018). Artificial intelligence techniques for small boats detection in radar clutter. Real data validation. <italic>Engineering Applications of Artificial Intelligence</italic>, 67, 296-308. [<uri>https://doi.org/10.1016/j.engappai.2017.10.005</uri>] </mixed-citation>
      </ref>
      <ref id="ref005">
        <label>[5]</label>
        <mixed-citation> Li, J., &amp; Stoica, P. (2008). <italic>MIMO radar signal processing</italic>. John Wiley &amp; Sons. </mixed-citation>
      </ref>
      <ref id="ref006">
        <label>[6]</label>
        <mixed-citation> Cheikh, K., &amp; Soltani, F. (2006). Application of neural networks to radar signal detection in K-distributed clutter. <italic>IEE Proceedings-Radar, Sonar &amp; Navigation</italic>, 153(5), 460-466. [<uri>https://doi.org/10.1049/ip-rsn:20050103</uri>] </mixed-citation>
      </ref>
      <ref id="ref007">
        <label>[7]</label>
        <mixed-citation> Vicen-Bueno, R., Rosa-Zurera, M., Jarabo-Amores, M. P., &amp; Gil-Pita, R. (2010). Automatic target detection in simulated ground clutter (Weibull distributed) by multilayer perceptrons in a low-resolution coherent radar. <italic>IET Radar, Sonar &amp; Navigation</italic>, 4(2), 315-328. [<uri>https://doi.org/10.1049/iet-rsn.2009.0080</uri>] </mixed-citation>
      </ref>
      <ref id="ref008">
        <label>[8]</label>
        <mixed-citation> Vicen-Bueno, R., Rosa-Zurera, M., Jarabo-Amores, M. P., &amp; de la Mata-Moya, D. (2010). Coherent detection of Swerling 0 targets in sea-ice Weibull-distributed clutter using neural networks. <italic>IEEE Transactions on Instrumentation and Measurement</italic>, 59(12), 3139-3151. [<uri>https://doi.org/10.1109/TIM.2010.2047579</uri>] </mixed-citation>
      </ref>
      <ref id="ref009">
        <label>[9]</label>
        <mixed-citation> Cai, Z., Zhang, M., &amp; Liu, Y. (2018). Sea-surface weak target detection scheme using a cultural algorithm aided time-frequency fusion strategy. <italic>IET Radar, Sonar &amp; Navigation</italic>, 12(7), 711-720. [<uri>https://doi.org/10.1049/iet-rsn.2018.0004</uri>] </mixed-citation>
      </ref>
      <ref id="ref010">
        <label>[10]</label>
        <mixed-citation> Zhang, R., &amp; Cao, S. (2018). Real-time human motion behavior detection via CNN using mmWave radar. <italic>IEEE Sensors Letters</italic>, 3(2), 1-4. [<uri>https://doi.org/10.1109/LSENS.2018.2889060</uri>] </mixed-citation>
      </ref>
      <ref id="ref011">
        <label>[11]</label>
        <mixed-citation> Tang, X., Li, D., Cheng, W., Su, J., &amp; Wan, J. (2021, March). A novel sea clutter suppression method based on deep learning with exploiting time-frequency features. In <italic>2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)</italic> (Vol. 5, pp. 2548-2552). IEEE. [<uri>https://doi.org/10.1109/IAEAC50856.2021.9390660</uri>] </mixed-citation>
      </ref>
      <ref id="ref012">
        <label>[12]</label>
        <mixed-citation> Guo, R., Yue, Z., Tian, B., Xiao, Y., Hu, J., Xu, S., &amp; Chen, Z. (2022). Review of the technology, development and applications of holographic staring radar. <italic>Journal of Radars</italic>, 12(2), 389-411. [<uri>https://doi.org/10.12000/JR22153</uri>] </mixed-citation>
      </ref>
      <ref id="ref013">
        <label>[13]</label>
        <mixed-citation> Chen, X., Su, N., Huang, Y., &amp; Guan, J. (2021). False-alarm-controllable radar detection for marine target based on multi features fusion via CNNs. <italic>IEEE Sensors Journal</italic>, 21(7), 9099-9111. [<uri>https://doi.org/10.1109/JSEN.2021.3054744</uri>] </mixed-citation>
      </ref>
      <ref id="ref014">
        <label>[14]</label>
        <mixed-citation> Feng, K., Chen, H., Kong, Y., Zhang, L., Yu, X., &amp; Yi, W. (2022, July). Prediction of Multi-Function Radar Signal Sequence Using Encoder-Decoder Structure. In <italic>2022 7th International Conference on Signal and Image Processing (ICSIP)</italic> (pp. 152-156). IEEE. [<uri>https://doi.org/10.1109/ICSIP55141.2022.9887351</uri>] </mixed-citation>
      </ref>
      <ref id="ref015">
        <label>[15]</label>
        <mixed-citation> Lee, J., Lee, I., &amp; Kang, J. (2019, May). Self-attention graph pooling. In <italic>International Conference on Machine Learning</italic> (pp. 3734-3743). PMLR. </mixed-citation>
      </ref>
      <ref id="ref016">
        <label>[16]</label>
        <mixed-citation> Mei, X. A. (2007). A study on the statistical characteristics of IPIX radar sea spikes. <italic>Journal of Spacecraft TT&amp;C Technology</italic>, 26(2), 19-23. </mixed-citation>
      </ref>
      <ref id="ref017">
        <label>[17]</label>
        <mixed-citation> Ningyuan, S., Xiaolong, C., Jian, G., Xiaoqian, M., &amp; Ningbo, L. (2018). Detection and classification of maritime target with micro-motion based on CNNs. <italic>Journal of Radars</italic>, 7(5), 565-574. [<uri>https://doi.org/10.12000/JR18077</uri>] </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>
