PlumX Metrics

Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Neurocomputing, ISSN: 0925-2312, Vol: 466, Page: 285-297
2021
  • Citations: 58
  • Usage: 0
  • Captures: 55
  • Mentions: 0
  • Social Media: 31

Metrics Details

  • Citations: 58
    • Citation Indexes: 58
  • Captures: 55
  • Social Media: 31
    • Shares, Likes & Comments: 31
      • Facebook: 31

Article Description

In recent years, deep reinforcement learning (DRL) has shown great potential in multi-agent cooperation. However, applying DRL to the multi-target tracking (MTT) problem for unmanned aerial vehicle (UAV) swarms is challenging: 1) the number of UAVs may be large, and existing multi-agent reinforcement learning (MARL) methods that rely on global or joint information of all agents suffer from the curse of dimensionality; 2) the dimension of each UAV's received information is variable, which is incompatible with neural networks that require fixed input dimensions; 3) the UAVs are homogeneous and interchangeable, so each UAV's policy should be invariant to the permutation of its received information. To this end, we propose a DRL method for UAV swarms to solve the MTT problem. Firstly, a decentralized swarm-oriented Markov Decision Process (MDP) model is presented for UAV swarms, which involves each UAV's local communication and partial observation. Secondly, to achieve better scalability, a cartogram feature representation (FR) is proposed to integrate the variable-dimensional information set into a fixed-shape input variable; the cartogram FR also remains invariant to the permutation of the information. Then, the double deep Q-learning network with dueling architecture is adapted to the MTT problem, and an experience-sharing training mechanism is adopted to learn a shared cooperative policy for the UAV swarm. Extensive experiments show that our method successfully learns a cooperative tracking policy for UAV swarms and outperforms the baseline method in tracking ratio and scalability.
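The abstract does not spell out how the cartogram FR is constructed, but the general idea of mapping a variable-length, unordered set of observations into a fixed-shape, permutation-invariant input can be sketched as an occupancy grid centered on the UAV. The following is a minimal illustration of that general principle, not the paper's actual implementation; the grid size, sensing radius, and function names are assumptions for the sketch.

```python
import numpy as np

def grid_features(neighbor_positions, self_position, grid_size=8, radius=10.0):
    """Map a variable-length set of 2-D neighbor positions into a fixed-shape
    occupancy grid centered on the observing UAV.

    Counting neighbors per cell makes the output invariant to the order
    (permutation) of the input set, and the output shape is fixed at
    (grid_size, grid_size) regardless of how many neighbors are observed.
    """
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    self_position = np.asarray(self_position, dtype=np.float64)
    for pos in neighbor_positions:
        # relative position, normalized to [-1, 1) within the sensing radius
        rel = (np.asarray(pos, dtype=np.float64) - self_position) / radius
        if np.all(np.abs(rel) < 1.0):
            # map each coordinate in [-1, 1) to a cell index in [0, grid_size)
            ix = int((rel[0] + 1.0) / 2.0 * grid_size)
            iy = int((rel[1] + 1.0) / 2.0 * grid_size)
            grid[iy, ix] += 1.0
    return grid
```

Shuffling the input set leaves the grid unchanged, and a neural network with a fixed input dimension of `grid_size * grid_size` can consume it directly, which is the property the abstract highlights.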
