Using Regex to extract parts of a postcode in Python

The concepts

Every time I try to do something with postcodes I find myself trying to remember the different potential formats and googling for effective regular expressions (the one I want is never in the first five I try).

I decided to record what I found to help out future me (and others).

Firstly, this page is brill from the GetTheData peeps:

Programmer’s guide to UK postcodes

It provides some really useful context (including a clear graphic describing how UK postcodes are constructed) and has some sample regular expressions. This basically unblocked me last time I went round this loop.

Getting the different parts of a postcode

I used the regex from getthedata.com to easily extract the Outcode from a full postcode string.

import re
source_string = "wc2b 3dx"

string_to_process = source_string.replace(" ","").upper()
# WC2B3DX

matches = re.findall(r'^((([A-Z][A-Z]{0,1})([0-9][A-Z0-9]{0,2})) {0,}(([0-9])([A-Z]{2})))', string_to_process)
# [('WC2B3DX', 'WC2B', 'WC', '2B', '3DX', '3', 'DX')]

Because the pattern contains several groups, re.findall returns a list of tuples, one tuple per match. The regex is anchored to the start of the string with ^, so it matches at most once, and I know I’m only ever passing a single postcode in, so I take the first match from the list to get a tuple containing the various postcode parts.

postcode_parts = matches[0]
# ('WC2B3DX', 'WC2B', 'WC', '2B', '3DX', '3', 'DX')
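Picking parts out of a tuple by position is easy to get wrong, and matches[0] raises an IndexError if the input doesn't look like a postcode. As a minimal sketch (not something from the getthedata.com guide), an equivalent pattern can be written with named groups and re.match, which returns None when nothing matches. The group names and the split_postcode helper are made up for illustration.

```python
import re

# Equivalent to the pattern above, with illustrative (hypothetical) group names.
POSTCODE_RE = re.compile(
    r'^(?P<outcode>(?P<area>[A-Z]{1,2})(?P<district>[0-9][A-Z0-9]{0,2}))'
    r' *'
    r'(?P<incode>(?P<sector>[0-9])(?P<unit>[A-Z]{2}))'
)


def split_postcode(raw):
    """Return a dict of postcode parts, or None if raw doesn't match."""
    match = POSTCODE_RE.match(raw.replace(" ", "").upper())
    return match.groupdict() if match else None


parts = split_postcode("wc2b 3dx")
print(parts["outcode"], parts["incode"])
# WC2B 3DX
```

With this approach the caller can check for None before using the parts, rather than catching an exception.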

I can then use whichever part of the postcode I need. In my example I wanted to first look for a complete match in a big list, and if there’s no match look for a match on just the Outcode.

First I get the full postcode and outcode to work with:

postcode = postcode_parts[0]
outcode = postcode_parts[1]

Then I can try and find them in the list:

list_of_postcodes = ['WC2B3DF', 'WC2D5FD', 'WC2B', 'AB34FD', 'AB3']

if postcode in list_of_postcodes:
    print('Found the full postcode!')
elif outcode in list_of_postcodes:
    print('Found the outcode!')
else:
    print("Didn't find anything :(")

# Found the outcode!
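The steps above can be pulled together into one function. This is a rough sketch rather than a definitive implementation: find_postcode_match is a hypothetical helper name, and it assumes the known postcodes are held in a set (membership checks are faster than on a list) with spaces already removed.

```python
import re

# Pattern from the post: group 1 is the full postcode, group 2 the outcode.
POSTCODE_PATTERN = re.compile(
    r'^((([A-Z][A-Z]{0,1})([0-9][A-Z0-9]{0,2})) {0,}(([0-9])([A-Z]{2})))'
)


def find_postcode_match(raw, known_postcodes):
    """Return 'full', 'outcode' or None depending on what was found."""
    match = POSTCODE_PATTERN.match(raw.replace(" ", "").upper())
    if match is None:
        return None  # input doesn't look like a postcode
    postcode, outcode = match.group(1), match.group(2)
    if postcode in known_postcodes:
        return 'full'
    if outcode in known_postcodes:
        return 'outcode'
    return None


known = {'WC2B3DF', 'WC2D5FD', 'WC2B', 'AB34FD', 'AB3'}
print(find_postcode_match("wc2b 3dx", known))
# outcode
```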

Hopefully this might help save others a few minutes by avoiding the Google / Stack Overflow loop!
