Following up on my post last week about using Twitter to track eating and weight, some of you expressed interest in creating your own Twitter bot. This post covers how you can do that.
The Gist of It
Creating my own Twitter bot was pretty straightforward (much more so than I thought it’d be), mostly because Twitter provides an API and the resources to make it that way.
I wanted something really simple that I could play around with. I just wanted to be able to send a direct message to my Twitter bot, and from there, it would store my data. OK, so here are the basic steps I took:
- Create Twitter account for bot
- Turn on email notification for direct messages only
- Check email periodically for new direct messages
- Parse direct messages and store in database
Create a Twitter Account (and Email Address)
The first step is easy. Create a Twitter account specifically for your bot. The account name should be short and easy to remember. Make sure you enter an IMAP email address that is only for your bot. You could put in a general purpose email address, but it’ll make your life a lot easier if the email address was specifically for Twitter.
Turn on Email Notifications
Once you’ve set up your bot account, turn on email notifications via the Twitter options menu. For now, tell Twitter to only send you notifications when your bot receives direct messages and not when someone new follows.
Check Email and Do Something with Messages
Here’s where the actual code comes in. Here’s the general framework. I’ve left out some details that will be specific to your own purposes.
import imaplib
import re
import email.utils
from email.parser import BytesParser

# Connect to email server
server = imaplib.IMAP4("__EMAIL_SERVER.COM__")
server.login("__EMAIL_ACCOUNT_NAME__", "__EMAIL_PASSWORD__")
server.select("INBOX")

# Find only new mail (i.e. new direct messages)
r, data = server.search(None, "(NEW)")

# Headers to fetch for each message, including Twitter's custom ones
header_query = ('(BODY[HEADER.FIELDS (DATE SUBJECT FROM X-TwitterEmailType '
                'X-TwitterSenderScreenName X-TwitterCreatedAt '
                'X-TwitterRecipientScreenName)])')
matchemail = re.compile(r'[\w\-][\w\-\.]+@[\w\-][\w\-\.]+[a-zA-Z]{1,4}')

# If there are new direct messages:
if len(data[0]) > 0:
    p = BytesParser()

    # Loop through new emails
    for num in data[0].split():
        r, header_data = server.fetch(num, header_query)
        msg = p.parsebytes(header_data[0][1])

        # Who email is from
        who = msg['From']
        email_addy = matchemail.findall(who)[0]

        # Twitter username
        twitter_un = msg['X-TwitterSenderScreenName']

        # If the email is a direct message sent from Twitter
        if msg['X-TwitterEmailType'] == 'direct_message':

            # When direct message sent, convert to epoch seconds
            # (parsedate_tz/mktime_tz respect the timezone in the header)
            twitter_time = msg['X-TwitterCreatedAt'].strip()
            time_tuple = email.utils.parsedate_tz(twitter_time)
            epoch_seconds = email.utils.mktime_tz(time_tuple)

            # Get body of email sent by Twitter
            r, body_data = server.fetch(num, '(RFC822.TEXT)')
            body = body_data[0][1].decode('utf-8', 'replace')
            twitter_dm = body.split("\r\n\r\n")[0].strip()

            # Do something with the twitter direct message...
            # Parse it...
            # Store it in a database?...

# Logout of email server
server.logout()
I run this script every 30 minutes with a cron job. You could of course run it more frequently. The important part of this code, though, is that Twitter attaches its own special headers (e.g. X-TwitterEmailType). If you wanted your bot to automatically follow users that followed it, you could check the X-TwitterEmailType header and then use the Twitter API to follow that user. For my simple purposes, though, I only cared about direct messages.
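To make the header check concrete, here is a minimal sketch that reads the raw headers of one notification and decides what to do with it. Only the direct_message value comes from the script above. The is_following value and the classify function are assumptions for illustration, so look at the headers on a real notification before relying on them.

```python
from email.parser import BytesParser

def classify(raw_headers):
    """Return (action, screen_name) for one Twitter notification email."""
    msg = BytesParser().parsebytes(raw_headers, headersonly=True)
    email_type = msg.get('X-TwitterEmailType', '')
    sender = msg.get('X-TwitterSenderScreenName', '')
    if email_type == 'direct_message':
        return ('store', sender)
    if email_type == 'is_following':  # assumed value, not confirmed above
        return ('follow_back', sender)
    return ('ignore', sender)

# Made-up headers shaped like the ones the script fetches
sample = (b"From: Twitter <noreply@example.com>\r\n"
          b"X-TwitterEmailType: direct_message\r\n"
          b"X-TwitterSenderScreenName: some_user\r\n\r\n")
print(classify(sample))
```

Keeping the decision in one small function means the follow-back case can be added later without touching the IMAP loop.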
That’s all. There is of course plenty of room for improvement. Like I said, you could make this useful to lots of users by making your bot automatically follow those who follow it, since a user can only send a direct message to an account that follows them. I would also delete emails that have already been read and stored somewhere so that the INBOX doesn’t pile up. Yup.
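As a rough sketch of the store-then-delete cleanup, the two helpers below save a direct message to SQLite and then flag the processed emails for deletion. The table name, column names, and function names are hypothetical, and SQLite is just a stand-in for whatever database you use. The store and expunge calls are standard imaplib methods.

```python
import sqlite3

def store_dm(conn, sender, sent_at, text):
    """Save one direct message; create the (hypothetical) table if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS direct_messages "
        "(sender TEXT, sent_at REAL, body TEXT)"
    )
    conn.execute(
        "INSERT INTO direct_messages (sender, sent_at, body) VALUES (?, ?, ?)",
        (sender, sent_at, text),
    )
    conn.commit()

def delete_processed(server, nums):
    """Flag already-stored messages as deleted, then expunge them."""
    for num in nums:
        server.store(num, '+FLAGS', '\\Deleted')
    server.expunge()
```

In the script, you would call store_dm inside the loop once twitter_dm is known, collect each num that was stored, and call delete_processed(server, nums) just before server.logout().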
Did I miss anything?
Thanks for sharing, Nathan!
I’m willing to bet 99% of people would, at step 4 upon seeing that code, find that to be anything but “pretty straightforward” unless they’re a programmer.
Why don’t you use Twitter directly? There is a very nice API for Python where you can just ask for direct messages without having to mess with parsing an entire email. After your last post on the subject, I built a little bot with it that checks for new messages every minute and can answer questions about my server (where it runs) – and sends a tweet when the load gets too high. The next step will be to have it log data I send it, too.
@Ian – Agreed. I know a large percentage of the FlowingData readers do have some programming in their blood though :)
@Robert – I originally wrote an email bot sans Twitter. I was sending emails to a dedicated email address, so my code was already in place to parse. I switched to Twitter, though, as my imagination went wild and I figured I didn’t want to handle spam, malicious users, etc.
Plus, this is the beginning of something much bigger, and I wanted to avoid any Twitter API request limit road blocks further along. It might not even matter, but I like the option.
Do you want to share your bot code that uses the Twitter API? I imagine it’s much much shorter than mine, and I’m sure readers would be interested.
Ah, I see. The good thing about email is that you could do this in a “push” fashion through something like procmail that runs your script whenever an email arrives. The API limits aren’t too bad, you get 100 API calls per hour, that lets me ask for new messages once a minute and still have some breathing room for tweeting back. I hope they’ll eventually implement a push service for this, that would make the service a lot more useful and stable at the same time.
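The arithmetic behind that breathing room is easy to check. This is only a sketch: the 100 calls per hour figure is the one quoted above, and the function name is made up.

```python
def spare_calls_per_hour(limit_per_hour, poll_interval_seconds):
    """Calls left each hour after polling at a fixed interval."""
    polls_per_hour = 3600 // poll_interval_seconds
    return limit_per_hour - polls_per_hour

print(spare_calls_per_hour(100, 60))   # 40 calls left for tweeting back
print(spare_calls_per_hour(100, 30))   # -20: polling every 30 s overshoots
```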
I was certainly planning on releasing my code, I just want to get the data logging part done – should not take long to do that, though.
Oooo, push email. Do you have any resource suggestions for procmail or something else to implement that?
Email is always push ;) I’m not talking about getting the email from your account to your mail program, but the point where it gets delivered to your mail server. When that is a unix machine and you have an account there directly, you get the email the moment it comes in. Procmail is an email filter program that can be run on every email when it arrives, and that can run outside programs like your bot. Google has tons of hits for procmail, though I’m not seeing a really good intro right now. The key question though is where your bot runs and how close that is to the mail server.
Here’s another thought on data delivery… as you may know, Google has an SMS service where you send a text message request for directions, etc. and receive a text message reply. Your concept could be extended to provide on-demand data delivery via SMS using IMAP…
For cellphone microbrowser queries, an XHTML form could be used to send text requests, then graphical data generated on the fly using the GD library and saved as a GIF file. Graphics may be dynamically scaled to the client device using the free WURFL API.
It’s hard to fit much graphical data on a screen as small as 128×128 pixels – maybe you could have another contest? ;)
@Robert – Semantics :) Push email client…better? From what I understand, it looks like I’ve got use of .procmailrc, so I should be in business when the time’s right. Now I just have to sift through the docs to figure out how to run a python script when an email comes in.
@Kevin – Hmm, a fit what you can in 128×128 pixels contest… i like it
I wasn’t criticizing your use of the word, just wanted to clarify. Running a python program should be like running any other program, you’ll just have to adapt your script to get the email fed through stdin.
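A minimal sketch of what a stdin-fed version might look like, assuming the notification is a single-part, plain-text email and that the message is the first paragraph of the body (as in the polling script). The function name is made up for illustration.

```python
import sys
from email import message_from_bytes

def read_direct_message(raw):
    """Return (sender, text) for a direct-message email, else None."""
    msg = message_from_bytes(raw)
    if msg.get('X-TwitterEmailType') != 'direct_message':
        return None
    # Assumes a single-part, plain-text notification email
    payload = msg.get_payload(decode=True) or b''
    text = payload.decode('utf-8', 'replace').replace('\r\n', '\n')
    # The message itself is taken to be the first paragraph of the body
    return msg.get('X-TwitterSenderScreenName'), text.split('\n\n')[0].strip()

# Only run when an email is actually piped in (e.g. by procmail)
if __name__ == '__main__' and not sys.stdin.isatty():
    result = read_direct_message(sys.stdin.buffer.read())
    if result is not None:
        sender, text = result
        print(sender, text)  # replace with your own parsing and storage
```

Because the whole email arrives on stdin, there is no IMAP login or search step at all; the mail filter decides when the script runs.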
@Robert – yeah, that’s what i figured. thanks for the tip!
Having a program run from procmail is quite simple:
:0:
* ^X-TwitterEmailType
# execute the lines below if the above header is found
| /home/bin/mycommand.py
The place to dig into procmail is probably here:
http://www.perlcode.org/tutorials/procmail/proctut/
“… 3) Check email periodically for new direct messages … Did I miss anything? …”
Nice example & I agree with not having to use the twitter api. You could also use urllib and call the individual RSS user feed (eg: http://twitter.com/statuses/user_timeline/12436.rss) reading the url periodically and parsing the RSS. The data is already structured and the tools exist to parse.
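Here is a rough sketch of that RSS approach using only the standard library. It assumes the feed is plain RSS 2.0 with title and pubDate elements in each item, and the feed URL quoted above may no longer be available. Both function names are made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

def parse_timeline(rss_text):
    """Return (title, pubDate) pairs for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext('title', ''), item.findtext('pubDate', ''))
            for item in root.iter('item')]

def fetch_timeline(url):
    """Download a feed and parse it (needs network access)."""
    with urllib.request.urlopen(url) as resp:
        return parse_timeline(resp.read())

# Made-up feed content, just to show the shape of the result
sample = """<rss version="2.0"><channel><title>timeline</title>
<item><title>some_user: weight 165</title>
<pubDate>Wed, 29 Oct 2008 17:03:21 +0000</pubDate></item>
</channel></rss>"""
print(parse_timeline(sample))
```

Keeping the parsing separate from the download makes it possible to try the parser on a saved copy of the feed first.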
One trouble I’m finding though, now that I’ve been using this technique for a little while, is that Twitter is sometimes really slow to send that email that a direct message came in…
“… One trouble I’m finding though, now that I’ve been using this technique for a little while, is that Twitter is sometimes really slow to send that email …”
@Nathan they might do a batch process on the updates. Once a day run a sql query of all new events then send them off. Sounds silly but it’s notification not real-time like the API and the RSS feed – not 100% sure about the RSS but pretty sure.
Most of the time the emails come immediately, but sometimes the emails get delayed by more than an hour. Sometimes they don’t come at all. I dunno…
for more information on running python scripts when your mail server receives an email see:
http://bilumi.org/trac/wiki/cellphone_install#InstallMailRoutingwpostfixandprocmail
I found the discussion here useful. Thanks.
It still seems like the best architecture is a polling scheme. The system can then interact with responses and status in addition to direct messages. One might be able to get whitelisted to reduce lag.
Anyone know of whitelisted apps? (I’m trying to gauge the likelihood of this happening with my app)
@nathan,
I think that the lag has more to do with twitter in general than anything else. I know that from time to time SMS updates take hours to go through. There is no apparent rhyme or reason to these delays, and sometimes they will delay for some users, and not to others. I would imagine that this delay would likewise affect the API.
I really like the idea of ‘push’ update via email; I wish there was a convenient way for twitter to directly support a ‘push’ interface. Something like the “twitter do” interface (www.twitter.com/tdo) would be really cool. It runs a URL when you send a DM to @tdo; you register your commands and everything; but it isn’t really useful for setting up your OWN bot, since it won’t respond to a DM to YOU.
The problem with the API is the 100 requests per hour; if you have thousands of users, you’re going to run out of requests in a hurry. Although in fairness, if you are running for 1000s of users, your server load is going to be pretty high using your emails anyway.
Some users don’t have direct access to procmail or similar (for example, lots of people on web hosts). If you have a cPanel web host, you can set up “email filters”, which when matched can “pipe to a program”, which will have the same effect.
FWIW, this should also work with languages other than Python, such as PHP, if people are more familiar with that. Although you’ll need to know more about your hosting environment, no reason C/C++ wouldn’t work, for that matter.
HTH
thanks..
Thanks for the guide! I’ve just made a Twitter bot. More info http://download11.com/twitter
That’s great that you are talking about the Twitter API. A good example of searching with the Twitter API is on twiogle.com, because you can search on Twitter and Google at the same time.
Thanks, I’ve been using the Twitter API for some time now just messing around with some sports push bots. Really cool stuff