Analysis of passwords in Sony security breach

Posted to Statistics

A little over a week ago, Sony was hit with yet another security breach. This time, over one million passwords, all stored in plain text, were released into the wild. Software architect Troy Hunt took a closer look at the dataset and found just how predictable people's passwords are.

We know passwords are too short, too simple, too predictable and too much like the other ones the individual has created in other locations. The bit which did take me aback a bit was the extent to which passwords conformed to very predictable patterns, namely only using alphanumeric characters, being 10 characters or less and having a much better than average chance of being the same as other passwords the user has created on totally independent systems.

The 25 most used passwords? seinfeld, password, winner, 123456, purple, sweeps, contest, princess, maggie, 9452, peanut, shadow, ginger, michael, buster, sunshine, tigger, cookie, george, summer, taylor, bosco, abc123, ashley, bailey
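
As a rough illustration of the patterns described in the quote, the sketch below checks the 25 listed passwords against two of them: alphanumeric characters only, and 10 characters or less. The function name `pattern_summary` and the dictionary keys are made up for this example, and it only looks at the 25 passwords shown here, not the full dataset.

```python
# The 25 most used passwords, copied from the list in the post.
TOP_25 = [
    "seinfeld", "password", "winner", "123456", "purple",
    "sweeps", "contest", "princess", "maggie", "9452",
    "peanut", "shadow", "ginger", "michael", "buster",
    "sunshine", "tigger", "cookie", "george", "summer",
    "taylor", "bosco", "abc123", "ashley", "bailey",
]


def pattern_summary(passwords):
    """Count how many passwords match each simple pattern (hypothetical helper)."""
    return {
        "total": len(passwords),
        # Letters and digits only, no symbols or spaces.
        "alphanumeric_only": sum(pw.isalnum() for pw in passwords),
        # Ten characters or fewer.
        "ten_chars_or_less": sum(len(pw) <= 10 for pw in passwords),
    }


summary = pattern_summary(TOP_25)
for name, count in summary.items():
    print(f"{name}: {count}")
```

Every one of the 25 entries matches both patterns, which fits the quote, although a top-25 list is of course a small and skewed sample.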

It can be a pain to keep track of a lot of different passwords for every site, but it just might save you much bigger headaches later on. Avoid actual words at the very least. Or maybe someone should hurry up and let us access our accounts via eye scanners.

[A brief Sony password analysis via @kevinweil]

11 Comments

  • These passwords were associated with a raffle of some sort. With most common passwords like sweeps, contest, winner, and seinfeld (the subject of the drawing), it may have been that some users weren’t concerned about the security of an account they only expected to use once. I’ve certainly created weak passwords out of convenience in those situations, but use much more secure passwords for sites I use regularly.
    The basic premise — that many users create weak passwords — is correct, but I don’t think this is the correct dataset to show that.

  • I think the real problem is that a company as respectable (still?) as Sony stores passwords in plain text. Even if you use a very simple password, it would take a targeted brute-force attack to guess it. But if all the passwords in a database are stolen and they were stored in plain text(!), your best super-random 20+ symbol password has no advantage over “123456”.
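
To make the plain-text point concrete, here is a minimal sketch of storing a salted hash instead of the password itself, using PBKDF2 from Python's standard library. This is an illustration only, not a description of any system Sony ran. The function names and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import secrets

# Assumption: an illustrative work factor, not a value taken from the post.
ITERATIONS = 600_000


def hash_password(password, iterations=ITERATIONS):
    """Return (salt, digest) for storage in place of the plain-text password."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, digest, iterations=ITERATIONS):
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(candidate, digest)


# Two users with the same weak password end up with different stored values.
salt_a, digest_a = hash_password("123456")
salt_b, digest_b = hash_password("123456")
print(digest_a.hex())
print(digest_b.hex())
print(verify_password("123456", salt_a, digest_a))
```

With storage like this, a stolen database no longer hands over every password at once. An attacker has to guess each one separately, and only then does the difference between “123456” and a long random password start to matter.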

  • I agree that they really need to be encrypted, and encrypted with some sort of salt. But if your password is stored encrypted, here are some tips I got from Steve Gibson at GRC:
    1) Length is the most important factor, so pad a simple-to-remember password with some easy-to-remember scheme, e.g. <<<<>>>>
    2) Include at least one uppercase, one lowercase, one number and one symbol. e.g. <<<<>>>>

    See https://www.grc.com/haystack.htm
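
The two tips above can be put into rough numbers. The sketch below estimates how many candidates a brute-force search would have to cover: it sums alphabet_size^k over every length k up to the password's length. The character-class sizes, the function names, and the padded example password are all assumptions made for this illustration, not values taken from the comment.

```python
import string

# Assumption: class sizes for printable ASCII
# (33 symbols = 32 punctuation characters plus the space).
CLASS_SIZES = {"lower": 26, "upper": 26, "digit": 10, "symbol": 33}


def alphabet_size(password):
    """Add up the sizes of the character classes the password uses."""
    classes = set()
    for ch in password:
        if ch in string.ascii_lowercase:
            classes.add("lower")
        elif ch in string.ascii_uppercase:
            classes.add("upper")
        elif ch in string.digits:
            classes.add("digit")
        else:
            classes.add("symbol")
    return sum(CLASS_SIZES[name] for name in classes)


def search_space(password):
    """Count all strings over that alphabet, from length 1 up to len(password)."""
    size = alphabet_size(password)
    return sum(size ** k for k in range(1, len(password) + 1))


plain = "maggie"                # taken from the top-25 list
padded = "Maggie1" + "!" * 8    # hypothetical padded variant
print(f"{plain}: {search_space(plain):,}")
print(f"{padded}: {search_space(padded):,}")
```

This count only models blind brute force. It says nothing about dictionary attacks, which is why a word from the top-25 list is weak no matter what the arithmetic suggests.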

  • With eye scanners, at least you could use a different eye for each site you access…

  • In 2005 “members of a violent gang […] chopped off a car owner’s finger to get round the vehicle’s hi-tech security system” (Source: http://news.bbc.co.uk/2/hi/asia-pacific/4396831.stm)

  • Gerard St. Croix June 13, 2011 at 12:31 pm

    Eye scanner fans need to check “Demolition Man” out.

    Anyway, I’m surprised to see that there were two peaks at 6 & 8 while 7 got a relatively low score. Curious.

  • Anyone understand the significance of 9452? Nothing’s jumping out on Google.

  • On password length, I was surprised as well that there was a peak at 6 and 8 compared to 7, since I believe there are more common names of 7 letters than of 6 or 8, and since 7 is considered an ideal length for memorizing. However, if you look at the given list of the top 25 passwords, the distribution is as follows: 17 of 6 characters, 4 of 8 characters, 2 of 7 characters, 1 of 4 characters, 1 of 5 characters.
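
The length tally in the last comment is easy to check. A short sketch using `collections.Counter` on the 25 listed passwords reproduces the same distribution (the variable names are chosen for this example):

```python
from collections import Counter

# The 25 most used passwords, copied from the list in the post.
TOP_25 = [
    "seinfeld", "password", "winner", "123456", "purple",
    "sweeps", "contest", "princess", "maggie", "9452",
    "peanut", "shadow", "ginger", "michael", "buster",
    "sunshine", "tigger", "cookie", "george", "summer",
    "taylor", "bosco", "abc123", "ashley", "bailey",
]

# Map each password length to the number of passwords with that length.
length_counts = Counter(len(pw) for pw in TOP_25)

# Print the lengths from most to least common.
for length, count in length_counts.most_common():
    print(f"{count} passwords of length {length}")
```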