Finding Frequent Items:
(version 1.0)

 

Description:
This package provides implementations of several one-pass algorithms for finding frequent items in data streams. In particular, it contains the following:
  • Frequent Algorithm
  • Lossy Counting, and variations
  • Space Saving
  • Greenwald & Khanna
  • Quantile Digest
  • Count Sketch
  • Hierarchical Count-Min Sketch
  • Combinatorial Group Testing
The code is an extension of the MassDAL library. Implementations are by Graham Cormode.
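
As an illustration of the simplest algorithm in the list, the following is a minimal sketch of the Frequent algorithm (also known as Misra-Gries) in C++. It is not taken from the package: the class name FrequentSketch and its methods update and estimate are hypothetical, and items are assumed to be strings. With m counters and a stream of n items, any item occurring more than n/(m+1) times is guaranteed to be retained, and each estimate undercounts the true frequency by at most n/(m+1).

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Illustrative sketch of the Frequent (Misra-Gries) algorithm.
// FrequentSketch is a hypothetical name, not the package's API.
class FrequentSketch {
public:
    // num_counters is the maximum number of items tracked at once.
    explicit FrequentSketch(std::size_t num_counters)
        : num_counters_(num_counters) {}

    // Process one item from the stream.
    void update(const std::string& item) {
        auto it = counters_.find(item);
        if (it != counters_.end()) {
            // Item is already tracked: increment its counter.
            ++it->second;
        } else if (counters_.size() < num_counters_) {
            // A free counter is available: start tracking the item.
            counters_.emplace(item, 1);
        } else {
            // No free counter: decrement every counter and drop
            // those that reach zero. The new item is not stored.
            for (auto c = counters_.begin(); c != counters_.end();) {
                if (--c->second == 0) {
                    c = counters_.erase(c);
                } else {
                    ++c;
                }
            }
        }
    }

    // Lower bound on the frequency of item (0 if it is not tracked).
    std::size_t estimate(const std::string& item) const {
        auto it = counters_.find(item);
        return it == counters_.end() ? 0 : it->second;
    }

    // Number of items currently tracked.
    std::size_t size() const { return counters_.size(); }

private:
    std::size_t num_counters_;
    std::unordered_map<std::string, std::size_t> counters_;
};
```

A caller would construct the sketch with a fixed number of counters, call update once per stream item, and then query estimate for candidate items. Because the estimates are lower bounds, a second pass over the data (when available) is needed to confirm exact frequencies.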

Download:
  • C++ source code v1.0 (Visual Studio 2005, gcc 3.4.6): [bzip2], [zip].

 

Citations:
  • Finding Frequent Items in Data Streams [pdf],
    G. Cormode, M. Hadjieleftheriou
    Proc. of the International Conference on Very Large Data Bases (VLDB)
    Auckland, New Zealand, August 2008.