My son asked the other night if hacking computers really worked like it does on television, and if numbers and words really flashed across the screen. I explained a bit, then told him we could make something that looked like what he had seen on television. Below is a printout I wrote for him. Keep in mind that I tried to simplify it for my 10-year-old son.
Hacking MD5 Hashes
What is an MD5 Hash?
MD5 is a widely used message-digest algorithm that creates one-way hash values.
What is a Hash?
A hash function takes an arbitrary block of data and returns a fixed-size bit string.
How Do MD5 Hashes Work in Python?
In : import hashlib
In : hashlib.md5('python').hexdigest()
Out: '23eeeb4347bdd26bfc6b7ee9a3b755dd'
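The session above is Python 2. As a small sketch of the "fixed-size" idea, assuming Python 3 (where text has to be encoded to bytes before it can be hashed), the following hashes inputs of very different lengths and prints the length of each result. The helper name md5_hex and the choice of UTF-8 are assumptions made for this example.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hash of a text string as hexadecimal characters."""
    # Python 3 hashes bytes, so the text is encoded first (UTF-8 is assumed here).
    return hashlib.md5(text.encode('utf-8')).hexdigest()

# Inputs of very different sizes...
samples = ['a', 'python', 'x' * 10000]

for sample in samples:
    digest = md5_hex(sample)
    # ...always give a digest of the same length: 128 bits, or 32 hex characters.
    print(len(sample), len(digest))
```

No matter how long the input is, the second number printed is always 32.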
Why Are Hashes Used in Programming?
Because hashes are one-way, we cannot take the output hash and work backward to the original text.
For example, we can't take the above '23eeeb4347bdd26bfc6b7ee9a3b755dd' hash and get the string 'python' back out of it.
That makes hashes useful for storing sensitive data such as passwords. The system never needs to keep your actual password; it only compares hashes to verify that they match.
Bob uses a simple Dictionary Word as his Password.
Let's say our user named Bob has a dictionary password of baseball. When Bob logs into his computer, he types his password; the computer hashes what he typed and compares the resulting hash with what is stored in the computer's database.
The database stores only the hashed password, because the password itself is sensitive. The hash of baseball is 276f8db0b86edaa7fc805516c852c889.
In : hashlib.md5('baseball').hexdigest()
Out: '276f8db0b86edaa7fc805516c852c889'
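Bob's login check could be sketched roughly like this, assuming Python 3. The stored_hashes dictionary stands in for the computer's database, and the names md5_hex and check_login are invented for illustration; a real system would be more involved.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hash of a text string as hexadecimal characters."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()

# Hypothetical "database": it holds only hashes, never the passwords themselves.
stored_hashes = {'bob': md5_hex('baseball')}

def check_login(username, typed_password):
    """Return True if the typed password hashes to the stored value."""
    stored = stored_hashes.get(username)
    if stored is None:
        # Unknown user: nothing to compare against.
        return False
    # Hash what was typed and compare it with the stored hash.
    return md5_hex(typed_password) == stored

print(check_login('bob', 'baseball'))  # True
print(check_login('bob', 'football'))  # False
```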
Are Dictionary Passwords Secure?
Let's say you have a list containing every dictionary word. You could hash each word in turn and compare the result to Bob's password hash.
In : password_hash = '276f8db0b86edaa7fc805516c852c889'
In : words = ['basketball', 'soccer', 'football', 'baseball']
In : for word in words:
   ....:     h = hashlib.md5(word).hexdigest()
   ....:     print 'Comparing %s with %s' % (h, password_hash)
   ....:     if h == password_hash:
   ....:         print 'The word %s matches hash' % word
   ....:         break
   ....:
Comparing d0199f51d2728db6011945145a1b607a with 276f8db0b86edaa7fc805516c852c889
Comparing da443a0ad979d5530df38ca1a74e4f80 with 276f8db0b86edaa7fc805516c852c889
Comparing 37b4e2d82900d5e94b8da524fbeb33c0 with 276f8db0b86edaa7fc805516c852c889
Comparing 276f8db0b86edaa7fc805516c852c889 with 276f8db0b86edaa7fc805516c852c889
The word baseball matches hash
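The same loop can be wrapped in a small reusable function. This is a sketch assuming Python 3; the function name crack_md5 is made up for this example, and Bob's hash is computed in the code instead of being typed in by hand.

```python
import hashlib

def crack_md5(target_hash, candidates):
    """Return the first candidate whose MD5 hash equals target_hash, or None."""
    for word in candidates:
        # Hash each guess and compare it with the hash we are trying to match.
        if hashlib.md5(word.encode('utf-8')).hexdigest() == target_hash:
            return word
    # None of the candidates matched.
    return None

words = ['basketball', 'soccer', 'football', 'baseball']
bobs_hash = hashlib.md5(b'baseball').hexdigest()

print(crack_md5(bobs_hash, words))     # baseball
print(crack_md5(bobs_hash, ['golf']))  # None
```

Returning None when nothing matches makes it easy to tell "not in the list" apart from a successful guess.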
Do I Need to Create a List of all Dictionary Words?
No. Many Unix-like systems, such as Linux and macOS, come with a file called words that contains most dictionary words. Below is a simple line count of the file:
[root@localhost ~]# wc -l /usr/share/dict/words
479829 /usr/share/dict/words
How can we use a file like this in Python?
Python's built-in open function lets us open a file, and the readlines method then reads all of its lines into a list.
In : f = open('/usr/share/dict/words', 'r')
In : lines = f.readlines()
In : lines[0:5]
Out: ['1080\n', '10-point\n', '10th\n', '11-point\n', '12-point\n']
Each entry in the lines list ends with a trailing \n, which is called a newline character. We will use the rstrip method to remove it.
In : lines[0].rstrip()
Out: '1080'
In : lines[1].rstrip()
Out: '10-point'
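To strip every line at once instead of one at a time, a list comprehension is one option. This sketch assumes Python 3 and uses the five sample lines shown earlier in place of the real file; the names raw_lines and words are chosen just for this example.

```python
# The first five entries from the word list, newline characters included.
raw_lines = ['1080\n', '10-point\n', '10th\n', '11-point\n', '12-point\n']

# Build a new list with the trailing newline removed from each entry.
words = [line.rstrip('\n') for line in raw_lines]

print(words)  # ['1080', '10-point', '10th', '11-point', '12-point']
```

Passing '\n' to rstrip removes only newline characters, while calling rstrip() with no argument removes any trailing whitespace.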
How many words do we have?
The length of our lines list matches the line count above.
>>> len(lines)
479829
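Putting the pieces together, the word file can be loaded and every word tried against Bob's hash. This is only a sketch, assuming Python 3 and assuming the word list lives at /usr/share/dict/words as in the example above. The function names load_words and crack_md5 are invented here, and the code skips the search if the file is missing.

```python
import hashlib
import os

def load_words(path):
    """Read a word list file and return its words without trailing newlines."""
    # errors='replace' avoids crashing on any bytes that are not valid UTF-8.
    with open(path, 'r', encoding='utf-8', errors='replace') as f:
        return [line.rstrip('\n') for line in f]

def crack_md5(target_hash, candidates):
    """Return the first candidate whose MD5 hash equals target_hash, or None."""
    for word in candidates:
        if hashlib.md5(word.encode('utf-8')).hexdigest() == target_hash:
            return word
    return None

WORDS_PATH = '/usr/share/dict/words'            # location assumed from the example above
bobs_hash = '276f8db0b86edaa7fc805516c852c889'  # Bob's stored password hash

if os.path.exists(WORDS_PATH):
    words = load_words(WORDS_PATH)
    print(len(words), 'words loaded')
    print('Match:', crack_md5(bobs_hash, words))
else:
    print('No word list found at', WORDS_PATH)
```

Using with to open the file closes it automatically once the words have been read.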