Is a Serialized Roaring Bitmap compressed?
By definition, a Roaring Bitmap is a "better compressed bitset", which means it is compressed by nature. But according to my test, Dgraph's Serialized Roaring Bitmap (sroar) does not appear to be compressed:
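    package main

    import (
        "bytes"
        "fmt"
        "math/rand"

        "github.com/RoaringBitmap/roaring"
        "github.com/dgraph-io/sroar"
    )

    func main() {
        // github.com/dgraph-io/sroar: add random uint64 values until 1000 are distinct.
        sr := sroar.NewBitmap()
        for sr.GetCardinality() < 1000 {
            sr.Set(uint64(rand.Int()))
        }
        buf := sr.ToBuffer()
        fmt.Printf("sroar size: %d\n", len(buf))

        // github.com/RoaringBitmap/roaring: AddInt casts each int to uint32.
        r := roaring.New()
        for r.GetCardinality() < 1000 {
            r.AddInt(rand.Int())
        }
        var bs bytes.Buffer
        if _, err := r.WriteTo(&bs); err != nil {
            panic(err)
        }
        fmt.Printf("roar size: %d\n", bs.Len())
    }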
Output:

    sroar size: 152704
    roar size: 9936
That is, the "original" Roaring Bitmap needs 9936 bytes for one thousand uint32 values (that library casts int to uint32), which is 4000 bytes of raw data, while the Dgraph SRB needs 152704 bytes for one thousand uint64 values, which is 8000 bytes of raw data.
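To put numbers on that comparison, here is a minimal standard-library sketch. It computes the overhead ratios from the sizes reported above, then counts how many distinct high-16-bit prefixes occur among 1000 random uint32 values. The helper names (`overheadRatio`, `distinctChunks`, `estimateRoaringBytes`, `randomDistinct`) are hypothetical, and the size estimate assumes the standard Roaring serialization layout with array containers only (no run containers): an 8-byte header, 8 bytes of key/cardinality/offset metadata per container, and 2 bytes per stored value. It does not model sroar's layout.

```go
package main

import (
	"fmt"
	"math/rand"
)

// overheadRatio returns the serialized size divided by the raw size of
// n fixed-width integers of bytesPerValue bytes each.
func overheadRatio(serializedBytes, n, bytesPerValue int) float64 {
	return float64(serializedBytes) / float64(n*bytesPerValue)
}

// distinctChunks counts the different high-16-bit prefixes in values.
// A 32-bit Roaring bitmap keeps one container per such prefix.
func distinctChunks(values []uint32) int {
	seen := make(map[uint16]struct{})
	for _, v := range values {
		seen[uint16(v>>16)] = struct{}{}
	}
	return len(seen)
}

// estimateRoaringBytes approximates the serialized size when every
// container is an array container (assumed layout, see the lead-in):
// 8-byte header + 8 bytes per container + 2 bytes per value.
func estimateRoaringBytes(containers, values int) int {
	return 8 + 8*containers + 2*values
}

// randomDistinct returns n distinct pseudo-random uint32 values.
func randomDistinct(n int, seed int64) []uint32 {
	rng := rand.New(rand.NewSource(seed))
	seen := make(map[uint32]struct{}, n)
	out := make([]uint32, 0, n)
	for len(out) < n {
		v := rng.Uint32()
		if _, dup := seen[v]; !dup {
			seen[v] = struct{}{}
			out = append(out, v)
		}
	}
	return out
}

func main() {
	// Ratios computed from the sizes reported in the question.
	fmt.Printf("roaring overhead: %.2fx raw\n", overheadRatio(9936, 1000, 4))
	fmt.Printf("sroar overhead:   %.2fx raw\n", overheadRatio(152704, 1000, 8))

	// Random values rarely share a high-16-bit prefix, so almost every
	// value ends up alone in its own container.
	values := randomDistinct(1000, 1)
	chunks := distinctChunks(values)
	fmt.Printf("containers for %d values: %d\n", len(values), chunks)
	fmt.Printf("estimated roaring size: %d bytes\n",
		estimateRoaringBytes(chunks, len(values)))
}
```

Under those assumptions, 991 containers holding 1000 values would come to 8 + 8·991 + 2·1000 = 9936 bytes, which is consistent with the size reported for the original library: nearly all of the space goes to per-container metadata, because uniformly random values almost never share a container.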
Is it true that a Roaring Bitmap is compressed by nature and can be used as-is (without decompressing it first)?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
