Category Archives: Computers

TV Style Computer Hacking!

My son asked the other night if hacking computers really worked like on television,
and if numbers/words really flashed across the screen. 

I explained to him a bit, then told him we could make something that looked like what he 
had seen on television. Below is a printout I wrote for him.

Keep in mind I tried to dumb it down a bit for my 10-year-old son.

Hacking MD5 Hashes

What is an MD5 Hash?

MD5 is a widely used message-digest algorithm used for creating one-way hash values.

What is a Hash?

A hash function takes an arbitrary block of data and returns a fixed-size bit string.

How Do MD5 Hashes Work in Python?

In []: import hashlib

In []: hashlib.md5('python').hexdigest()
Out[]: '23eeeb4347bdd26bfc6b7ee9a3b755dd'
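The session above is Python 2. As a minimal sketch for Python 3 (where hashlib wants bytes rather than text), the helper below, whose name md5_hex is my own, also shows the "fixed-size" part: inputs of any length produce a 32-character hex string.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hex digest of a text string (Python 3 needs bytes)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Inputs of very different lengths all produce a 32-character hex string.
for sample in ["a", "python", "a much longer sentence about hashing"]:
    digest = md5_hex(sample)
    print(len(digest), digest)
```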

Why Are Hashes Used in Programming?

Because hashes are one-way, we cannot take the output hash and work back to the original text.

For example, we can’t take the above ‘23eeeb4347bdd26bfc6b7ee9a3b755dd’ hash and get the string ‘python’ back.

That makes hashes rather useful for storing sensitive data such as passwords. The system never has to store your actual password; it only compares hashes to verify that they match.

Bob uses a simple Dictionary Word as his Password.

Let’s say our user named Bob has a dictionary password of baseball. When Bob logs into his computer he types his password; the computer hashes it and compares the resulting hash with what is stored in the computer’s database.

The database stores only the hashed password, as the password itself is sensitive. The hash of baseball is 276f8db0b86edaa7fc805516c852c889:

In []: hashlib.md5('baseball').hexdigest()
Out[]: '276f8db0b86edaa7fc805516c852c889'
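Bob's login check could be sketched roughly like this in Python 3. The stored_hashes dictionary and the check_login function are made-up names standing in for the computer's database and login code; they are not from any real system.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hex digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical "database": user name -> stored hash (never the password itself).
stored_hashes = {"bob": md5_hex("baseball")}

def check_login(user, attempt):
    """Hash the typed password and compare it with the stored hash."""
    stored = stored_hashes.get(user)
    if stored is None:
        return False  # unknown user
    return md5_hex(attempt) == stored

print(check_login("bob", "baseball"))  # True
print(check_login("bob", "football"))  # False
```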

Are Dictionary Passwords Secure?

Let’s say you had a list containing every dictionary word; you could hash each word and compare the result to Bob’s password hash.


In []: password_hash = '276f8db0b86edaa7fc805516c852c889'

In []: words = ['basketball', 'soccer', 'football', 'baseball']

In []: for word in words:
   ....:     h = hashlib.md5(word).hexdigest()
   ....:     print 'Comparing %s with %s' % (h, password_hash)
   ....:     if h == password_hash:
   ....:         print 'The word %s matches hash' % word
   ....:         break
Comparing d0199f51d2728db6011945145a1b607a with 276f8db0b86edaa7fc805516c852c889
Comparing da443a0ad979d5530df38ca1a74e4f80 with 276f8db0b86edaa7fc805516c852c889
Comparing 37b4e2d82900d5e94b8da524fbeb33c0 with 276f8db0b86edaa7fc805516c852c889
Comparing 276f8db0b86edaa7fc805516c852c889 with 276f8db0b86edaa7fc805516c852c889
The word baseball matches hash
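The same loop can be wrapped in a function. This is a sketch for Python 3; the function name crack is my own choice, and the target hash is computed in the code rather than typed in.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hex digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def crack(password_hash, words):
    """Return the first word whose MD5 hash matches, or None if none do."""
    for word in words:
        if md5_hex(word) == password_hash:
            return word
    return None

words = ["basketball", "soccer", "football", "baseball"]
target = md5_hex("baseball")
print(crack(target, words))      # baseball
print(crack(target, ["chess"]))  # None
```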

Do I Need to Create a List of all Dictionary Words?

No, most Linux and Unix systems come with a file called words that contains most dictionary words. Below I show a simple line count of the file:

[root@localhost ~]# wc -l /usr/share/dict/words
479829 /usr/share/dict/words

How can we use a file like this in Python?

Python provides us a tool for opening system files, then reading them line-by-line.

In []: f = open('/usr/share/dict/words', 'r')

In []: lines = f.readlines()

In []: lines[0:5]
Out[]: ['1080\n', '10-point\n', '10th\n', '11-point\n', '12-point\n']

We will need to clean up each entry in lines by removing the trailing \n, which is called a newline character. To do this we will use the rstrip method:

In []: lines[0].rstrip()
Out[]: '1080'

In []: lines[1].rstrip()
Out[]: '10-point'
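Reading and stripping can be done in one step. The sketch below is Python 3 and uses a small temporary stand-in file (my own invention) so it does not depend on /usr/share/dict/words being present; load_words is a hypothetical helper name.

```python
import os
import tempfile

def load_words(path):
    """Read a word list, one word per line, without the trailing newlines."""
    with open(path, "r") as f:
        return [line.rstrip("\n") for line in f]

# A small stand-in file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1080\n10-point\n10th\n")
    sample_path = tmp.name

words = load_words(sample_path)
os.remove(sample_path)
print(words)  # ['1080', '10-point', '10th']
```

On a system that has the file, load_words('/usr/share/dict/words') should work the same way.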

How many words do we have?

Notice how this matches our line count above.

In []: len(lines)
Out[]: 479829


Split Python List by Nth Item

This has been one of the processes I’ve normally not solved in a clean or readable way.

In []: a = [1,2,3,4,5,6,7,8]

Taking the list above as input, I want to return something similar to the output below.

Out[]: [(1, 2), (3, 4), (5, 6), (7, 8)]

Using a Pythonic approach this task isn’t too difficult, but there is some explaining to do.

First let’s look at the code:

In []: a = iter([1,2,3,4,5,6,7,8])

In []: [ i for i in zip(a, a) ]
Out[]: [(1, 2), (3, 4), (5, 6), (7, 8)]

First we need to pass our list to the iter function, which turns our list into an iterator object:

In []: a
Out[]: <listiterator at 0x1037c3110>

This iterator can be used like any other iterator as it has a next method:

In []: a = iter([1,2,3,4,5,6,7,8])

In []: a.next()
Out[]: 1

In []: a.next()
Out[]: 2

In []: a.next()
Out[]: 3

The next bit to understand is the zip function:

In []: a = [1,2,3,4]

In []: b = [5,6,7,8]

In []: zip(a, b)
Out[]: [(1, 5), (2, 6), (3, 7), (4, 8)]

What zip does is take the first object (via the next method) from each iterable passed in
and pack them together in a tuple. It then continues until any one of them raises StopIteration:

In []: a = iter([1,2])

In []: a.next()
Out[]: 1

In []: a.next()
Out[]: 2

In []: a.next()
StopIteration                             Traceback (most recent call last)
<ipython-input-52-aa817a57a973> in <module>()
----> 1 a.next()
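
One consequence of zip stopping at the first StopIteration: when a single iterator is zipped with itself, an odd-length list quietly loses its last item. A small Python 3 sketch (pairs is my own helper name; in Python 3 zip is lazy, hence the list call):

```python
def pairs(items):
    """Pair up consecutive items by zipping one iterator with itself."""
    it = iter(items)
    return list(zip(it, it))

print(pairs([1, 2, 3, 4]))  # [(1, 2), (3, 4)]
print(pairs([1, 2, 3]))     # [(1, 2)]  the 3 is consumed and then dropped
```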


So to put it all together, zip takes our ‘a’ variable (an iter object of our list) as an argument twice.

[ i for i in zip(a, a) ]

Zip then grabs the first item from our first argument using the next method, and the first item from our second argument using the next method and bundles them in a tuple. If we were to do this by hand it would look like this:

In []: a = iter([1,2,3,4,5,6,7,8])

In []: first = a.next()

In []: second = a.next()

In []: first, second
Out[]: (1, 2)

Then if we were to do it again:

In []: first = a.next()

In []: second = a.next()

In []: first, second
Out[]: (3, 4)

Hopefully this makes sense now.
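
The same idea stretches to groups of any size. Here is a sketch in Python 3, where chunk is a name I made up; it repeats the one iterator n times so zip pulls n items per tuple.

```python
def chunk(items, n):
    """Split items into tuples of n; leftovers that do not fill a tuple are dropped."""
    it = iter(items)
    # [it] * n is a list holding the SAME iterator n times,
    # so each tuple zip builds consumes n consecutive items.
    return list(zip(*[it] * n))

print(chunk([1, 2, 3, 4, 5, 6, 7, 8], 2))  # [(1, 2), (3, 4), (5, 6), (7, 8)]
print(chunk([1, 2, 3, 4, 5, 6], 3))        # [(1, 2, 3), (4, 5, 6)]
```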


Linux /proc/net/route addresses unreadable

So you may have looked at /proc/net/route before and thought, how the heck am I supposed to read this? Well, here is the lowdown.

This file stores each address as hexadecimal with the bytes in reverse
(little-endian) order. Each byte is two hex digits; for example 192 as hex is C0:

In []: hex(192)
Out[]: '0xc0'

So let’s take a look at our route file:

Iface   Destination     Gateway         Flags   RefCnt  Use     Metric  Mask            MTU     Window  IRTT
eth0    00087F0A        00000000        0001    0       0       0       00FFFFFF        0       0       0
eth0    0000FEA9        00000000        0001    0       0       1002    0000FFFF        0       0       0
eth0    00000000        01087F0A        0003    0       0       0       00000000        0       0       0

Now the first entry has a destination of 00087F0A; let’s go ahead and chunk it into two-character hex bytes:

In []: x = iter('00087F0A')

In []: res = [ ''.join(i) for i in zip(x, x) ]

In []: res
Out[]: ['00', '08', '7F', '0A']

Now if we wanted we could convert these hex codes manually:

In []: int('0A', 16)
Out[]: 10

But we want to do this in one fell swoop:

In []: d = [ str(int(i, 16)) for i in res ]

In []: d
Out[]: ['0', '8', '127', '10']

And there we go, our IP address; however, it appears to be backwards. Let’s go ahead and fix that, and return it as a string:

In []: '.'.join(d[::-1])
Out[]: '10.127.8.0'

And there we have it! Your function may look something like this when all is said and done:

In []: def hex_to_ip(hexaddr):
   ....:     x = iter(hexaddr)
   ....:     res = [str(int(''.join(i), 16)) for i in zip(x, x)]
   ....:     return '.'.join(res[::-1])

And with output like so:

In []: hex_to_ip('00087F0A')
Out[]: '10.127.8.0'

In []: hex_to_ip('0000FEA9')
Out[]: '169.254.0.0'
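
For comparison, here is a different route to the same result using the standard struct and socket modules, plus a rough parser for the whole table. This is a Python 3 sketch, not the method used above: parse_routes is a hypothetical helper, the sample text is copied from the route file shown earlier, and it assumes the file came from a little-endian machine.

```python
import socket
import struct

def hex_to_ip(hexaddr):
    """Convert a hex address from /proc/net/route to dotted-quad form."""
    # '<L' packs the integer as 4 bytes, least significant byte first,
    # which reverses the byte order relative to the hex string as written.
    return socket.inet_ntoa(struct.pack("<L", int(hexaddr, 16)))

def parse_routes(text):
    """Turn the route table text into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    routes = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        for key in ("Destination", "Gateway", "Mask"):
            row[key] = hex_to_ip(row[key])
        routes.append(row)
    return routes

sample = """\
Iface   Destination     Gateway         Flags   RefCnt  Use     Metric  Mask            MTU     Window  IRTT
eth0    00087F0A        00000000        0001    0       0       0       00FFFFFF        0       0       0
eth0    00000000        01087F0A        0003    0       0       0       00000000        0       0       0
"""
routes = parse_routes(sample)
for route in routes:
    print(route["Iface"], route["Destination"], route["Gateway"], route["Mask"])
```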


Reading Yum Repository Data

I’ve spent a lot of time working with RPM in the last couple of years, and have had the pleasure of maintaining the IUS Community.

I wanted to share a small utility we use quite often called repodataParser. It is a Python class for working with RPM repositories, and is used in a few of our Django applications.

The idea is that every RPM repository contains an XML file with details about the packages it contains. Let’s take CentOS’s Vault for example:

In [1]: from RepoParser.RepoParser import Parser

In [2]: parser = Parser(url='')

We are now provided two methods, getList and getPackage. Let’s go over getList first:

In [3]: help(parser.getList)

getList(self) method of RepoParser.RepoParser.Parser instance
    returns a python list of dicts of the nodes in a XML files TagName

According to the docstring it returns a Python list of dicts; let’s check:

In [4]: type(parser.getList())
Out[4]: list

In [5]: len(parser.getList())
Out[5]: 6019

Yup, we have a list with 6019 packages; let’s have a look at the first one:

In [6]: parser.getList()[0]
{u'arch': (u'i686', None),
 u'checksum': (u'36099439b7dbc9323588f1999bff9b1738bb8b4df56149eb7ebb5b5226107665',
  {u'pkgid': u'YES', u'type': u'sha256'}),
 u'description': (u'The libjpeg package contains a library of functions for manipulating\nJPEG images, as well as simple client programs for accessing the\nlibjpeg functions.  Libjpeg client programs include cjpeg, djpeg,\njpegtran, rdjpgcom and wrjpgcom.  Cjpeg compresses an image file into\nJPEG format.  Djpeg decompresses a JPEG file into a regular image\nfile.  Jpegtran can perform various useful transformations on JPEG\nfiles.  Rdjpgcom displays any text comments included in a JPEG file.\nWrjpgcom inserts text comments into a JPEG file.',
  None),
 u'format': (u'\n    ', None),
 u'location': (None, {u'href': u'Packages/libjpeg-6b-46.el6.i686.rpm'}),
 u'name': (u'libjpeg', None),
 u'packager': (u'CentOS BuildSystem <>', None),
 u'size': (None,
  {u'archive': u'289416', u'installed': u'287173', u'package': u'135732'}),
 u'summary': (u'A library for manipulating JPEG image format files', None),
 u'time': (None, {u'build': u'1282396975', u'file': u'1309667078'}),
 u'url': (u'', None),
 u'version': (None, {u'epoch': u'0', u'rel': u'46.el6', u'ver': u'6b'})}

Now let’s have a look at getPackage:

In [7]: help(parser.getPackage)

getPackage(self, package) method of RepoParser.RepoParser.Parser instance
    return a python list of dicts for a package name

We know a CentOS 6 server should provide a php package, so let’s look it up.

In [8]: parser.getPackage('php')
[{u'arch': (u'x86_64', None),
  u'checksum': (u'8387996f9876fd0be5ae30845e8bb4c65371d54c4969ebe61c7e6fa771622f5b',
   {u'pkgid': u'YES', u'type': u'sha256'}),
  u'description': (u'PHP is an HTML-embedded scripting language. PHP attempts to make it\neasy for developers to write dynamically generated webpages. PHP also\noffers built-in database integration for several commercial and\nnon-commercial database management systems, so writing a\ndatabase-enabled webpage with PHP is fairly simple. The most common\nuse of PHP coding is probably as a replacement for CGI scripts.\n\nThe php package contains the module which adds support for the PHP\nlanguage to Apache HTTP Server.',
   None),
  u'format': (u'\n    ', None),
  u'location': (None, {u'href': u'Packages/php-5.3.2-6.el6.x86_64.rpm'}),
  u'name': (u'php', None),
  u'packager': (u'CentOS BuildSystem <>', None),
  u'size': (None,
   {u'archive': u'3648536', u'installed': u'3647853', u'package': u'1169480'}),
  u'summary': (u'PHP scripting language for creating dynamic web sites', None),
  u'time': (None, {u'build': u'1289553183', u'file': u'1309669007'}),
  u'url': (u'', None),
  u'version': (None, {u'epoch': u'0', u'rel': u'6.el6', u'ver': u'5.3.2'})}]

And here is how we would grab a package version for the first php package provided by the repository:

In [9]: php = parser.getPackage('php')

In [10]: php[0]['version'][1]['ver']
Out[10]: u'5.3.2'

repodataParser is pretty rough around the edges, but as you can see it does work. Hopefully
this will have helped someone out there needing to check RPM repository XML data.
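
For anyone curious how the (text, attributes) tuples could come about, here is a rough Python 3 sketch using only the standard library. It is not repodataParser's actual code: the sample XML is a hypothetical, heavily trimmed stand-in for a repository's primary.xml, the namespace in it is my assumption, and get_list / get_package are stand-in names that only mimic the shape of the output shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample in the style of a repository's primary.xml.
SAMPLE = """<metadata xmlns="http://linux.duke.edu/metadata/common" packages="1">
  <package type="rpm">
    <name>libjpeg</name>
    <arch>i686</arch>
    <version epoch="0" ver="6b" rel="46.el6"/>
    <location href="Packages/libjpeg-6b-46.el6.i686.rpm"/>
  </package>
</metadata>"""

def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.split("}")[-1]

def package_to_dict(package):
    """Map each child tag to a (text, attributes-or-None) tuple."""
    return {
        local_name(child.tag): (child.text, dict(child.attrib) or None)
        for child in package
    }

def get_list(xml_text):
    """Return one dict per <package> element."""
    root = ET.fromstring(xml_text)
    return [package_to_dict(el) for el in root if local_name(el.tag) == "package"]

def get_package(xml_text, name):
    """Return the dicts for every package with the given name."""
    return [p for p in get_list(xml_text) if p["name"][0] == name]

libjpeg = get_package(SAMPLE, "libjpeg")
print(libjpeg[0]["version"][1]["ver"])  # 6b
```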


Random Board Game Selection using Board Game Geek

Using Board Game Geek’s API I wanted to create a simple Python tool for randomly picking a game to play.
Below is some quick Python code to achieve my goal:

from urllib2 import urlopen
from lxml import etree
from random import choice

def get_xml():
    req = urlopen('')
    return req

def get_items():
    xml = etree.parse(get_xml())
    return xml.xpath('//item')

def get_thumbnail(item):
    t = item.xpath('thumbnail')
    if len(t) == 1:
        return t[0].text

def get_name(item):
    t = item.xpath('name')
    if len(t) == 1:
        return t[0].text

def get_stats(item):
    t = item.xpath('stats')
    if len(t) == 1:
        return dict(t[0].items())

def get_minplayers(item):
    return get_stats(item).get('minplayers')

def get_maxplayers(item):
    return get_stats(item).get('maxplayers')

def get_playingtime(item):
    return get_stats(item).get('playingtime')

def get_as_dict(item):
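    return dict(
        name=get_name(item),
        thumbnail=get_thumbnail(item),
        minplayers=get_minplayers(item),
        maxplayers=get_maxplayers(item),
        playingtime=get_playingtime(item),
    )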

def get_games():
    items = get_items()
    return [get_as_dict(i) for i in items]

def get_random_game():
    games = get_games()
    return choice(games)
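
The code above is Python 2 and needs lxml. As a rough Python 3 sketch of the same parsing with only the standard library, the block below works on an inline sample instead of a live request. The sample XML is hypothetical: it is shaped to match the lookups above (name, thumbnail, and a stats element carrying attributes), and the thumbnail values are placeholders.

```python
import xml.etree.ElementTree as ET
from random import choice

# Hypothetical sample shaped to match the xpath lookups above.
SAMPLE = """<items>
  <item>
    <name>Castle Panic</name>
    <thumbnail>castle-panic.jpg</thumbnail>
    <stats minplayers="1" maxplayers="6" playingtime="60"/>
  </item>
  <item>
    <name>Munchkin Pathfinder</name>
    <thumbnail>munchkin-pathfinder.jpg</thumbnail>
    <stats minplayers="3" maxplayers="6" playingtime="90"/>
  </item>
</items>"""

def item_to_dict(item):
    """Collect the fields used above; missing elements become None."""
    stats = item.find("stats")
    attrs = stats.attrib if stats is not None else {}
    return {
        "name": item.findtext("name"),
        "thumbnail": item.findtext("thumbnail"),
        "minplayers": attrs.get("minplayers"),
        "maxplayers": attrs.get("maxplayers"),
        "playingtime": attrs.get("playingtime"),
    }

def parse_games(xml_text):
    """Return one dict per <item> element in the XML text."""
    root = ET.fromstring(xml_text)
    return [item_to_dict(item) for item in root.iter("item")]

games = parse_games(SAMPLE)
print(choice(games)["name"])
```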

And using this would work like so:

In [1]: get_random_game()
{'maxplayers': '6',
 'minplayers': '1',
 'name': 'Castle Panic',
 'playingtime': '60',
 'thumbnail': ''}

In [2]: get_random_game()
{'maxplayers': '6',
 'minplayers': '3',
 'name': 'Munchkin Pathfinder',
 'playingtime': '90',
 'thumbnail': ''}
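
A natural next step is to pick only from games that fit the number of people at the table. This is a Python 3 sketch under my own assumptions: games_for and pick_game are made-up names, and the sample data is copied from the two results above with thumbnails left out. Note the player counts arrive as strings, so they are converted with int.

```python
from random import choice

def games_for(games, players):
    """Keep only games whose min/max player range includes `players`."""
    return [
        g for g in games
        if int(g["minplayers"]) <= players <= int(g["maxplayers"])
    ]

def pick_game(games, players):
    """Pick a random suitable game, or None if nothing fits."""
    suitable = games_for(games, players)
    return choice(suitable) if suitable else None

# Sample data copied from the two results shown above (thumbnails omitted).
games = [
    {"name": "Castle Panic", "minplayers": "1", "maxplayers": "6", "playingtime": "60"},
    {"name": "Munchkin Pathfinder", "minplayers": "3", "maxplayers": "6", "playingtime": "90"},
]
print(pick_game(games, 2)["name"])  # Castle Panic (the only game that allows 2 players)
```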
