Fast dictionary-based compression for inverted indexes

WSDM 2019 - Proceedings of the 12th ACM International Conference on Web Search and Data Mining, Page: 6-14
2019
  • Citations: 21
  • Usage: 0
  • Captures: 20
  • Mentions: 0
  • Social Media: 0

Metrics Details

  • Citations: 21
    • Citation Indexes: 21
  • Captures: 20

Conference Paper Description

Dictionary-based compression schemes provide fast decoding, typically at the expense of reduced compression effectiveness compared to statistical or probability-based approaches. In this work, we apply dictionary-based techniques to the compression of inverted lists, showing that the high degree of regularity that these integer sequences exhibit is a good match for certain types of dictionary methods, and that an important new trade-off between compression effectiveness and compression efficiency can be achieved. Our observations are supported by experiments using the document-level inverted index data for two large text collections, with a wide range of other index compression implementations as reference points. Those experiments demonstrate that the gap between efficiency and effectiveness can be substantially narrowed.
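
To make the general idea concrete, the following is a minimal toy sketch of dictionary coding applied to an inverted list. It is not the scheme evaluated in the paper: the fixed block length `BLOCK`, the function names, and the tagged-tuple output format are all assumptions chosen for illustration. The sketch converts sorted document IDs to gaps, collects the most frequent fixed-length gap patterns into a dictionary, and replaces each matching block with its dictionary index, falling back to the raw gaps otherwise.

```python
from collections import Counter
from itertools import accumulate

BLOCK = 4  # hypothetical pattern length, chosen only for this illustration


def to_gaps(postings):
    """Turn a strictly increasing docID list into gaps (first gap = first docID)."""
    return [b - a for a, b in zip([0] + postings[:-1], postings)]


def build_dictionary(gap_lists, size=256):
    """Collect the `size` most frequent full-length gap blocks as dictionary entries."""
    counts = Counter()
    for gaps in gap_lists:
        # Only complete blocks are counted, so every entry has exactly BLOCK gaps.
        for i in range(0, len(gaps) - BLOCK + 1, BLOCK):
            counts[tuple(gaps[i:i + BLOCK])] += 1
    return [pattern for pattern, _ in counts.most_common(size)]


def encode(gaps, dictionary):
    """Replace each block by a dictionary code, or keep it raw if it is not listed."""
    index = {pattern: i for i, pattern in enumerate(dictionary)}
    tokens = []
    for i in range(0, len(gaps), BLOCK):
        block = tuple(gaps[i:i + BLOCK])
        if block in index:
            tokens.append(("code", index[block]))
        else:
            tokens.append(("raw", block))  # escape: store the gaps themselves
    return tokens


def decode(tokens, dictionary):
    """Expand codes by table lookup, then prefix-sum the gaps back into docIDs."""
    gaps = []
    for kind, value in tokens:
        gaps.extend(dictionary[value] if kind == "code" else value)
    return list(accumulate(gaps))


# Usage: a round trip over one small, made-up postings list.
postings = [3, 4, 5, 6, 10, 11, 12, 13, 20]
gaps = to_gaps(postings)
dictionary = build_dictionary([gaps])
tokens = encode(gaps, dictionary)
assert decode(tokens, dictionary) == postings
```

Decoding here is a table lookup followed by a prefix sum, which hints at why dictionary methods tend to decode quickly. A real implementation would pack the tokens into bytes or words rather than Python tuples, and would need a principled way to choose the dictionary contents.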
