Posted to Statistics

Datalandia, the fictional town saved by data

GE has a short video series on a fictional town called Datalandia where machines talk to each other and data is exchanged in a hero-like fashion. "This summer the most…

Predicting riots

Hannah Fry and her group at University College London investigated data from the 2011 London riots and found that the complex activity of rioters is reminiscent of shopping behavior and…

Dictionary of Numbers extension adds context to numbers

We read and hear numbers in the news all the time, but it can be hard to imagine what those numbers mean. For example, big numbers, on the scale of…

Statistics jokes

There's a fun CrossValidated thread on statistics jokes. Here's the one with the top votes: A statistician's wife had twins. He was delighted. He rang the minister who was also…

Beer recommendation system in R

Using data from Beer Advocate, in the form of 1.5 million reviews, yhat shows how to build a recommendation system in R. The goal for our system will be for…

Twitter trend detection algorithm

Stuff happens, and people tweet about it. Something major happens, and a lot of people tweet about it. Masters student Stanislav Nikolov and his adviser Devavrat Shah are working on…

Non-statistician analysts are the new norm

As data grows cheaper and more easily accessible, the people who analyze it aren't always statisticians. They're likely to not even have had any statistical training. Biostatistics professor Jeff Leek…

The differences between a geek and a nerd

Curious about how people use "geek" and "nerd" to describe themselves and if there was any difference between the two terms, Burr Settles analyzed words used in tweets that contained…

Hans Rosling explains population growth and climate change

Because every day is a good day to listen to Hans Rosling talk numbers. In this short video, Rosling uses Lego bricks to explain population growth and the gaps in…

Myths of big data

Microsoft researcher Kate Crawford describes several myths of big data. Myth #4: It makes cities smarter. "It's only as good as the people using it," Ms. Crawford said. Many of…

Medicare provider charge data released

The Centers for Medicare and Medicaid Services released billing data for more than 3,000 U.S. hospitals, showing high variance in the cost of health care across the country and even between…

Convergence of Miss Korea faces

After seeing a Reddit post on the convergence of Miss Korea faces, supposedly due to high rates of plastic surgery, graduate student Jia-Bin Huang analyzed the faces of 20 contestants.…

Length of the average dissertation

On R is My Friend, as a way to procrastinate on his own dissertation, beckmw took a look at dissertation length via the digital archives at the University of Minnesota.…

The Numbers Game on National Geographic

Jake Porway, the founder of DataKind, has a new show on the National Geographic channel called The Numbers Game. I unfortunately don't have the channel, so the clips on the…

Flexible data

Data is an abstraction of something that happened in the real world. How people move. How they spend money. How a computer works. The tendency is to approach data and…

Problematic databases used to track employee theft

Employee theft accounts for billions of dollars of lost merchandise per year, so it's a huge concern for retailers, but it often goes unreported as a crime. If only there…

How to become a password cracker in a day

Deputy editor at Ars Technica Nate Anderson was curious if he could learn to crack passwords in a day. Although there's definitely a difference between advanced and beginner crackers, openly…

Odds of a perfect NCAA March Madness bracket

Math professor Jeff Bergen explains the odds of picking a perfect bracket.…

Declining songwriter ratings with age

Do singer-songwriters age well like a fine wine, or does quality decline with age? Kyle Biehle analyzed fan ratings by age. I understand all of the reasons for not comparing…

Data hackathon challenges and why questions are important

Jake Porway, executive director of DataKind, on data hackathons and why they require careful planning to actually work: Any data scientist worth their salary will tell you that you should…

What data brokers know about you

Lois Beckett for ProPublica has a thorough piece on data brokers — companies that collect and sell information about you — and what they know and where they get the…

Using search data to find drug side effects

Along the same lines as Google Flu Trends, researchers at Microsoft, Stanford and Columbia University are investigating whether search data can be used to find interactions between drugs. They recently…