Summarize metrics with random deletion

You have a metric for which you have a result every second. You can't keep this granularity forever; it would be too big. The standard solution is to produce e.g. hourly logs with summaries, e.g. min, max, mean, p50, p99. My suggested alternative: just keep the original data points, but randomly delete some. You can then run any aggregation over them when required.
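A minimal sketch of the idea, assuming per-second latency samples and a nearest-rank percentile (the data, keep fraction, and helper names here are illustrative, not from the original post):

```python
import random

def thin(samples, keep_fraction):
    """Randomly keep roughly keep_fraction of the samples, deleting the rest."""
    return [s for s in samples if random.random() < keep_fraction]

def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    s = sorted(samples)
    k = round(p / 100 * (len(s) - 1))
    return s[k]

# One day of per-second readings (hypothetical exponential latencies)
day = [random.expovariate(1 / 50) for _ in range(86_400)]

# Keep ~1% of the points at ingestion time ...
kept = thin(day, 0.01)

# ... and run whatever aggregation you like later, on demand
print(len(kept), min(kept), max(kept), percentile(kept, 99))
```

The point is that nothing is decided up front: min, max, mean, or any percentile can be computed later from the surviving points, whereas a precomputed hourly summary fixes the set of questions you can ask.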

How does random deletion affect expected percentiles?
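One way to probe the question empirically is a simulation: compute a percentile on the full data, then compare it with the same percentile computed on many independent random thinnings. This is a sketch under assumed inputs (exponential latencies, a 1% keep rate), not an analytical answer:

```python
import random
import statistics

def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    s = sorted(samples)
    k = round(p / 100 * (len(s) - 1))
    return s[k]

random.seed(0)

# "True" p99 over the full, undeleted data
full = [random.expovariate(1 / 50) for _ in range(100_000)]
true_p99 = percentile(full, 99)

# p99 estimated from 200 independent 1% thinnings
estimates = [
    percentile([x for x in full if random.random() < 0.01], 99)
    for _ in range(200)
]

print(true_p99, statistics.mean(estimates), statistics.stdev(estimates))
```

Because each surviving point is an unbiased uniform sample of the original stream, the order statistics of the thinned data track the original quantiles on average; the cost of deletion shows up as variance in the estimate, which grows for more extreme percentiles and smaller keep fractions.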



All content copyright James Fisher 2018. This post is not associated with my employer.