Video sketch: A middle-level representation for action recognition
Applied Intelligence, ISSN: 1573-7497, Vol: 51, Issue: 4, Page: 2589-2608
2021
- 19Citations
- 8Captures
Metric Options: CountsSelecting the 1-year or 3-year option will change the metrics count to percentiles, illustrating how an article or review compares to other articles or reviews within the selected time period in the same journal. Selecting the 1-year option compares the metrics against other articles/reviews that were also published in the same calendar year. Selecting the 3-year option compares the metrics against other articles/reviews that were also published in the same calendar year plus the two years prior.
Example: if you select the 1-year option for an article published in 2019 and a metric category shows 90%, that means that the article or review is performing better than 90% of the other articles/reviews published in that journal in 2019. If you select the 3-year option for the same article published in 2019 and the metric category shows 90%, that means that the article or review is performing better than 90% of the other articles/reviews published in that journal in 2019, 2018 and 2017.
Citation Benchmarking is provided by Scopus and SciVal and is different from the metrics context provided by PlumX Metrics.
Example: if you select the 1-year option for an article published in 2019 and a metric category shows 90%, that means that the article or review is performing better than 90% of the other articles/reviews published in that journal in 2019. If you select the 3-year option for the same article published in 2019 and the metric category shows 90%, that means that the article or review is performing better than 90% of the other articles/reviews published in that journal in 2019, 2018 and 2017.
Citation Benchmarking is provided by Scopus and SciVal and is different from the metrics context provided by PlumX Metrics.
Article Description
Different modalities extracted from videos, such as RGB and optical flows, may provide complementary cues for improving video action recognition. In this paper, we introduce a new modality named video sketch, which implies the human shape information, as a complementary modality for video action representation. We show that video action recognition can be enhanced by using the proposed video sketch. More specifically, we first generate video sketch with class distinctive action areas and then employ a two-stream network to combine the shape information extracted from image-based sketch and point-based sketch, followed by fusing the classification scores of two streams to generate shape representation for videos. Finally, we use the shape representation as the complementary one for the traditional appearance (RGB) and motion (optical flow) representations for the final video classification. We conduct extensive experiments on four human action recognition datasets – KTH, HMDB51, UCF101, Something-Something and UTI. The experimental results show that the proposed method outperforms the existing state-of-the-art action recognition methods.
Bibliographic Details
Springer Science and Business Media LLC
Provide Feedback
Have ideas for a new metric? Would you like to see something else here?Let us know