Summarize metrics with random deletion

You have a metric that produces a result every second. You can't keep this granularity forever; it would be too big. The standard solution is to produce, say, hourly logs with summaries such as min, max, mean, p50, and p99. My suggested alternative: just keep the original data points, but randomly delete some. You can then run any aggregation over them when required.
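As a rough illustration of the idea, here is a minimal Python sketch. It assumes each point is kept independently with a fixed probability; the post does not say how deletion should be scheduled, so `thin`, `summarize`, `keep_probability`, and the generated readings are all hypothetical names and data chosen for the example.

```python
import math
import random
import statistics


def thin(points, keep_probability, rng):
    """Keep each (timestamp, value) point independently with the given probability."""
    return [p for p in points if rng.random() < keep_probability]


def summarize(values):
    """Compute, on demand, the summaries an hourly log would have stored."""
    if not values:
        raise ValueError("no data points left to summarize")
    ordered = sorted(values)

    def percentile(q):
        # Nearest-rank percentile: the smallest value with at least q% of points at or below it.
        rank = max(1, math.ceil(q / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "p50": percentile(50),
        "p99": percentile(99),
    }


rng = random.Random(0)  # seeded so the example is repeatable

# One hour of made-up per-second readings (assumption: roughly normal values).
points = [(t, rng.gauss(100.0, 15.0)) for t in range(3600)]

# Randomly delete about 90% of the points, then aggregate whatever survived.
kept = thin(points, keep_probability=0.1, rng=rng)
summary = summarize([value for _, value in kept])
print(len(kept), summary)
```

Because the surviving points are still raw observations, any aggregation can be chosen later, including ones nobody thought to precompute.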

How does random deletion affect expected percentiles?
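One way to explore this question empirically is a small simulation: thin the same data many times and compare the percentile of each thinned copy with the percentile of the full data. The sketch below is only that, a sketch. The exponential "latency" data, the keep probability of 0.1, the 200 trials, and the helper names `nearest_rank` and `percentile_after_deletion` are all assumptions made for illustration, not anything taken from the post.

```python
import math
import random


def nearest_rank(ordered, q):
    """Nearest-rank q-th percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def percentile_after_deletion(values, q, keep_probability, trials, rng):
    """Return the q-th percentile of each randomly thinned copy of `values`."""
    estimates = []
    for _ in range(trials):
        kept = sorted(v for v in values if rng.random() < keep_probability)
        if kept:  # skip a trial in which every point happened to be deleted
            estimates.append(nearest_rank(kept, q))
    return estimates


rng = random.Random(1)  # seeded so the example is repeatable

# One hour of made-up, skewed per-second readings (assumption: exponential, mean 50).
values = [rng.expovariate(1 / 50.0) for _ in range(3600)]
full = sorted(values)

for q in (50, 99):
    estimates = percentile_after_deletion(values, q, 0.1, 200, rng)
    mean_estimate = sum(estimates) / len(estimates)
    print(
        f"p{q}: full={nearest_rank(full, q):.1f} "
        f"mean after deletion={mean_estimate:.1f} "
        f"range=({min(estimates):.1f}, {max(estimates):.1f})"
    )
```

Comparing the spread of the estimates for p50 and p99 gives a feel for how much each percentile moves once most of the points are gone.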


All content copyright James Fisher 2018. This post is not associated with my employer.