Go back to overview

Algorithm for estimating distinct elements in streams

May 26, 2023

It is such a beautiful algorithm that it fits in a tweet.


The F0-Estimator algorithm uses a sampling strategy to keep the sample set small. It adjusts the sampling rate to prevent the set from exceeding a threshold.

After processing the data stream, it outputs |X|/p where 'p' is the final sampling rate.

Arxiv preprint