
Outlier detection in R

Mar 9, 2018

Speaking of outliers, it’s not always obvious when and why a data point is an outlier. The Overview of Outliers package in R (OutliersO3) by Antony Unwin lets you compare which points different detection methods flag.

Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don’t follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with ‘real’ datasets. A method can be considered successful if it finds the outliers we all agree on, but do we all agree on which cases are outliers?

See also Unwin’s talk from 2017 for more about the thinking behind the package.
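
To make the "do we all agree" question concrete, here is a minimal sketch in Python. It is not the package's API or its plots, and the data and function names are made up for illustration. It runs two common rules of thumb, a z-score cutoff and Tukey's IQR fences, on the same small sample.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]

def iqr_outliers(values, k=1.5):
    """Flag values outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sample: nine values close together and one far away.
data = [10, 11, 12, 12, 13, 13, 14, 15, 16, 30]

print("z-score rule:", zscore_outliers(data))
print("IQR rule:    ", iqr_outliers(data))
```

On this sample the IQR rule flags 30 and the z-score rule flags nothing, because the extreme value inflates the standard deviation it is measured against. Same data, two reasonable methods, two different answers.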
