Outlier detection in R

Mar 9, 2018

Speaking of outliers, it’s not always obvious when and why a data point is an outlier. The Overview of Outliers package in R by Antony Unwin lets you compare outlier detection methods.

Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don’t follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with ‘real’ datasets. A method can be considered successful if it finds the outliers we all agree on, but do we all agree on which cases are outliers?

See also Unwin’s talk from 2017 for more about the thinking behind the package.
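
To make the question of agreement concrete, here is a minimal sketch in Python (not the package's own API) that runs two common textbook rules on the same numbers. The function names, the made-up data, and the thresholds (3 standard deviations, 1.5 times the interquartile range) are assumptions chosen for illustration, not anything taken from Unwin's package.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]


def tukey_outliers(values, k=1.5):
    """Flag values beyond k * IQR outside the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented data for illustration: one value sits well away from the rest.
data = [10, 11, 12, 12, 13, 13, 14, 15, 16, 30]

print("z-score rule:", zscore_outliers(data))  # prints: z-score rule: []
print("Tukey fences:", tukey_outliers(data))   # prints: Tukey fences: [30]
```

The two rules disagree here. The value 30 inflates the standard deviation enough that the z-score rule flags nothing, while the quartile-based fences flag it.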
