Optimizing your R code

Posted to Coding  |  Nathan Yau

Hadley Wickham offers a detailed, practical guide to finding and removing the major bottlenecks in your R code.

It’s easy to get caught up in trying to remove all bottlenecks. Don’t! Your time is valuable and is better spent analysing your data, not eliminating possible inefficiencies in your code. Be pragmatic: don’t spend hours of your time to save seconds of computer time. To enforce this advice, you should set a goal time for your code and optimise only up to that goal. This means you will not eliminate all bottlenecks. Some you will not get to because you’ve met your goal. Others you may need to pass over and accept either because there is no quick and easy solution or because the code is already well-optimized and no significant improvement is possible. Accept these possibilities and move on to the next candidate.

This is how I approach it. Some people spend a lot of time optimizing, but I’m usually better off writing code without speed in mind initially. Then I deal with it if it’s actually a problem. I can’t remember the last time that happened, though. Obviously, this approach won’t work in all settings, so just use common sense. If it takes you longer to optimize than it does to run your “slow” code, you’ve got your answer.

