Stream VByte: Faster byte-oriented integer compression

Citation data:

Information Processing Letters, ISSN: 0020-0190, Vol: 130, Page: 1-6

Publication Year:
Daniel Lemire; Nathan Kurz; Christoph Rupp
Elsevier BV
Mathematics; Computer Science
Article description
Arrays of integers are often compressed in search engines. Though there are many ways to compress integers, we are interested in the popular byte-oriented integer compression techniques (e.g., VByte or Google's varint-GB). Although not known for their speed, they are appealing due to their simplicity and engineering convenience. Amazon's varint-G8IU is one of the fastest byte-oriented compression techniques published so far. It makes judicious use of the powerful single-instruction-multiple-data (SIMD) instructions available in commodity processors. To surpass varint-G8IU, we present Stream VByte, a novel byte-oriented compression technique that separates the control stream from the encoded data. Like varint-G8IU, Stream VByte is well suited for SIMD instructions. We show that Stream VByte decoding can be up to twice as fast as varint-G8IU decoding over real data sets. In this sense, Stream VByte establishes new speed records for byte-oriented integer compression, at times exceeding the speed of the memcpy function. On a 3.4 GHz Haswell processor, it decodes more than 4 billion differentially-coded integers per second from RAM to L1 cache.
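The idea of keeping the control stream apart from the encoded data can be illustrated with a small scalar sketch. It is not the authors' SIMD implementation and makes no attempt at speed. It assumes 32-bit unsigned integers, a 2-bit length code per integer (byte count minus one), four codes packed per control byte with the first integer in the lowest bits, and little-endian data bytes. The function names (encode, decode, delta_encode, delta_decode) are hypothetical and chosen only for this illustration.

```python
def encode(values):
    """Encode 32-bit unsigned integers into separate control and data streams.

    Assumed layout (illustrative): each integer takes 1 to 4 data bytes, and
    its byte count minus one is stored as a 2-bit code in the control stream.
    """
    control = bytearray()
    data = bytearray()
    for i, v in enumerate(values):
        if not 0 <= v < 2**32:
            raise ValueError("value does not fit in 32 bits")
        nbytes = max(1, (v.bit_length() + 7) // 8)  # 1..4 bytes
        if i % 4 == 0:
            control.append(0)  # start a new control byte every four integers
        control[-1] |= (nbytes - 1) << (2 * (i % 4))
        data += v.to_bytes(nbytes, "little")
    return bytes(control), bytes(data)


def decode(control, data, count):
    """Decode `count` integers from the two streams produced by encode()."""
    out = []
    pos = 0
    for i in range(count):
        code = (control[i // 4] >> (2 * (i % 4))) & 3
        nbytes = code + 1
        out.append(int.from_bytes(data[pos:pos + nbytes], "little"))
        pos += nbytes
    return out


def delta_encode(sorted_values):
    """Differential coding: store gaps between successive values."""
    prev = 0
    deltas = []
    for v in sorted_values:
        deltas.append(v - prev)
        prev = v
    return deltas


def delta_decode(deltas):
    """Invert delta_encode() with a running sum."""
    total = 0
    out = []
    for d in deltas:
        total += d
        out.append(total)
    return out


if __name__ == "__main__":
    doc_ids = [3, 10, 270, 70000, 70001]
    control, data = encode(delta_encode(doc_ids))
    restored = delta_decode(decode(control, data, len(doc_ids)))
    print(len(control), len(data), restored == doc_ids)
```

Because every length code sits in the control stream, a decoder can read one control byte and know at once how the next four integers are laid out in the data stream. That property is what makes the layout a good fit for table-driven SIMD byte shuffles, whereas this sketch simply loops one integer at a time.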