PlumX Metrics

Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Neurocomputing, ISSN: 0925-2312, Vol: 466, Page: 285-297
2021
  • Citations: 58
  • Usage: 0
  • Captures: 55
  • Mentions: 0
  • Social Media: 31

Metrics Details

  • Citations: 58
    • Citation Indexes: 58
  • Captures: 55
  • Social Media: 31
    • Shares, Likes & Comments: 31
      • Facebook: 31

Article Description

In recent years, deep reinforcement learning (DRL) has shown great potential in multi-agent cooperation. However, applying DRL to the multi-target tracking (MTT) problem for unmanned aerial vehicle (UAV) swarms is challenging: 1) the number of UAVs may be large, and existing multi-agent reinforcement learning (MARL) methods that rely on global or joint information of all agents suffer from the curse of dimensionality; 2) the dimension of each UAV's received information is variable, which is incompatible with neural networks that require fixed input dimensions; 3) the UAVs are homogeneous and interchangeable, so each UAV's policy should be invariant to the permutation of its received information. To this end, we propose a DRL method for UAV swarms to solve the MTT problem. Firstly, a decentralized swarm-oriented Markov Decision Process (MDP) model is presented for UAV swarms, which involves each UAV's local communication and partial observation. Secondly, to achieve better scalability, a cartogram feature representation (FR) is proposed to integrate the variable-dimensional information set into a fixed-shape input variable; the cartogram FR also remains invariant to the permutation of the information. Then, the double deep Q-learning network with dueling architecture is adapted to the MTT problem, and an experience-sharing training mechanism is adopted to learn a shared cooperative policy for the UAV swarm. Extensive experiments show that our method successfully learns a cooperative tracking policy for UAV swarms and outperforms the baseline method in tracking ratio and scalability.
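The abstract does not spell out how the cartogram FR is constructed, but the general idea of mapping a variable-length, unordered set of observations into a fixed-shape, permutation-invariant input can be sketched as an occupancy grid centered on the UAV. The following is a minimal illustration of that general principle, not the paper's actual implementation; the grid size, sensing radius, and function names are assumptions for the sketch.

```python
import numpy as np

def grid_features(neighbor_positions, self_position, grid_size=8, radius=10.0):
    """Map a variable-length set of 2-D neighbor positions into a fixed-shape
    occupancy grid centered on the observing UAV.

    Counting neighbors per cell makes the output invariant to the order
    (permutation) of the input set, and the output shape is fixed at
    (grid_size, grid_size) regardless of how many neighbors are observed.
    """
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    self_position = np.asarray(self_position, dtype=np.float64)
    for pos in neighbor_positions:
        # relative position, normalized to [-1, 1) within the sensing radius
        rel = (np.asarray(pos, dtype=np.float64) - self_position) / radius
        if np.all(np.abs(rel) < 1.0):
            # map each coordinate in [-1, 1) to a cell index in [0, grid_size)
            ix = int((rel[0] + 1.0) / 2.0 * grid_size)
            iy = int((rel[1] + 1.0) / 2.0 * grid_size)
            grid[iy, ix] += 1.0
    return grid
```

Shuffling the input set leaves the grid unchanged, and a neural network with a fixed input dimension of `grid_size * grid_size` can consume it directly, which is the property the abstract highlights.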
