A novel iteration scheme with conjugate gradient for faster pruning on transformer models

Citation Data: Complex and Intelligent Systems, ISSN: 2198-6053, Vol: 10, Issue: 6, Page: 7863-7875

Publication Year: 2024

Citations: 0; Usage: 0; Captures: 84; Mentions: 0; Social Media: 0

Article Description

Pre-trained models based on the Transformer architecture have significantly advanced research within the domain of Natural Language Processing (NLP) due to their superior performance and extensive applicability across multiple technological sectors. Despite these advantages, there is a significant challenge in optimizing these models for more efficient deployment. To be concrete, the existing post-training pruning frameworks of transformer models suffer from inefficiencies in the crucial stage of pruning accuracy recovery, which impacts the overall pruning efficiency. To address this issue, this paper introduces a novel and efficient iteration scheme with conjugate gradient in the pruning recovery stage. By constructing a series of conjugate iterative directions, this approach ensures each optimization step is orthogonal to the previous ones, which effectively reduces redundant explorations of the search space. Consequently, each iteration progresses effectively towards the global optimum, thereby significantly enhancing search efficiency. The conjugate gradient-based faster-pruner reduces the time expenditure of the pruning process while maintaining accuracy, demonstrating a high degree of solution stability and exceptional model acceleration effects. In pruning experiments conducted on the BERT and DistilBERT models, the faster-pruner exhibited outstanding performance on the GLUE benchmark dataset, achieving a reduction of up to 36.27% in pruning time and a speed increase of up to 1.45× on an RTX 3090 GPU.
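
To make the idea of conjugate iterative directions concrete, the following is a minimal sketch of the textbook linear conjugate gradient method on a small symmetric positive definite system. It is not the paper's pruning-recovery procedure: the function name `conjugate_gradient`, the random test matrix, and the tolerance are illustrative assumptions. In the textbook method, successive search directions are conjugate with respect to the system matrix (p_i^T A p_j = 0 for i ≠ j), which is the sense in which a new step does not undo progress made along earlier directions.

```python
import numpy as np


def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Textbook CG for A x = b, assuming A is symmetric positive definite.

    Returns the approximate solution and the list of search directions used.
    """
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    r = b - A @ x            # residual, i.e. negative gradient of 0.5 x^T A x - b^T x
    p = r.copy()             # first direction is plain steepest descent
    rs_old = r @ r
    directions = []
    for _ in range(max_iter or n):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)        # exact line search along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        directions.append(p.copy())
        p = r + (rs_new / rs_old) * p    # next direction, A-conjugate to the earlier ones
        rs_old = rs_new
    return x, directions


# Illustrative problem (assumed, not from the paper): a small, well-conditioned SPD system.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M.T @ M + 6.0 * np.eye(6)
b = rng.standard_normal(6)

x, directions = conjugate_gradient(A, b)
residual_norm = np.linalg.norm(A @ x - b)
# Largest |p_i^T A p_j| over distinct pairs; close to zero when conjugacy holds.
max_cross_term = max(
    abs(directions[i] @ A @ directions[j])
    for i in range(len(directions))
    for j in range(i + 1, len(directions))
)
print(residual_norm, max_cross_term)
```

In exact arithmetic this method solves an n-dimensional system of this kind in at most n iterations, which is why a conjugate scheme can need fewer steps than repeated steepest descent; how the paper adapts the idea to pruning recovery is described in the article itself.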

Bibliographic Details

DOI: 10.1007/s40747-024-01595-w

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85200560098&origin=inward; http://dx.doi.org/10.1007/s40747-024-01595-w; https://link.springer.com/10.1007/s40747-024-01595-w; https://dx.doi.org/10.1007/s40747-024-01595-w; https://link.springer.com/article/10.1007/s40747-024-01595-w

AUTHOR(S)

Jun Li; Yuchen Zhu; Kexue Sun

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Computer Science; Engineering; Mathematics
