rvest: R package to scrape web data

Posted to Software  |  Nathan Yau

Inspired by the Python libraries RoboBrowser and BeautifulSoup, the rvest package by Hadley Wickham helps you scrape web data via R in a similar way.

With it, you can parse tables into data frames, navigate around a website, and of course, extract bits from a page. I’ll stick to BeautifulSoup for now, but I’m bookmarking this one. I’m sure it’ll come in handy sooner rather than later.
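For a rough idea of what that looks like, here is a minimal sketch. It assumes a recent rvest release (1.0 or later, where these function names apply; older releases used different ones), and the tiny page, its table, and its link are made up for illustration so the example runs without touching a real site.

```r
library(rvest)

# Hypothetical inline page, so no network access is needed.
page <- minimal_html('
  <h1>Fruit prices</h1>
  <table>
    <tr><th>fruit</th><th>price</th></tr>
    <tr><td>apple</td><td>1.2</td></tr>
    <tr><td>pear</td><td>0.8</td></tr>
  </table>
  <a href="/more">More prices</a>
')

# Extract a bit of text using a CSS selector.
title <- page %>% html_element("h1") %>% html_text2()

# Parse the table into a data frame (returned as a tibble).
prices <- page %>% html_element("table") %>% html_table()

# Pull an attribute, such as the target of a link.
link <- page %>% html_element("a") %>% html_attr("href")

print(title)
print(prices)
print(link)
```

For a live page you would swap `minimal_html()` for `read_html()` with a URL. Navigating between pages goes through rvest's session functions, which need a real site and so are left out here.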
