Bloom Filters for Filesystem Forensics

Publication Year:
Usage 306
Downloads 254
Abstract Views 52
Repository URL:
Bourg, Rachel
Computer crimes investigation; Forensic sciences; Computer crimes investigation; Forensic sciences
thesis / dissertation description
Digital forensics investigations become more time consuming as the amount of data to be investigated grows. Secular growth trends between hard drive and memory capacity just exacerbate the problem. Bloom filters are space-efficient, probabilistic data structures that can represent data sets with quantifiable false positive rates that have the potential to alleviate the problem by reducing space requirements. We provide a framework using Bloom filters to allow fine-grained content identification to detect similarity, instead of equality. We also provide a method to compare filters directly and a statistical means of interpreting the results. We developed a tool--md5bloom--that uses Bloom filters for standard queries and direct comparisons. We provide a performance comparison with a commonly used tool, md5deep, and achieved a 50% performance gain that only increases with larger hash sets. We compared filters generated from different versions of KNOPPIX and detected similarities and relationships between the versions.