UPDATE: I downloaded the full postal codes listing for Australia from Australia Post’s website. I then wrote a script to geocode these postcodes by querying the Geonames postcode API. Here is a file with all the postcodes I couldn’t find in Geonames. Next I will spatially look for the electoral divisions for the postcodes I have found.
In case you are interested, here is the code (a spec – don’t you love Ruby/RSpec – a specification that actually works :) )
require 'open-uri'
require 'csv'
require 'json'

class GeocodeAustralianPostcodes
  @@base_url = "http://ws.geonames.org/postalCodeSearchJSON"

  # Ask the Geonames postcode search for up to 5 matches for one Australian postcode.
  # Note: on Ruby 3.0+ open-uri no longer overrides Kernel#open; call URI.open(url) there.
  def find(postcode)
    url = @@base_url+"?postalcode=#{postcode}&maxRows=5&country=AU"
    JSON.parse(open(url).read)
  end

  # Read the CSV and turn each record into a hash keyed by the header row.
  def postcode_hashes(file_location)
    csv = CSV::parse(File.open(file_location, 'r') {|f| f.read })
    fields = csv.shift
    csv.collect { |record| Hash[*fields.zip(record).flatten ] }
  end

  # Write one line per Geonames match to outfile_location;
  # postcodes Geonames does not know go to a separate "not found" file.
  def write(infile_location, outfile_location)
    phashes = postcode_hashes(infile_location)
    outfile = File.open(outfile_location, 'w+')
    bad_outfile = File.open('data/postcodes_notfound.csv', 'w+')
    outfile.puts("POSTCODE, LAT, LON, adminCode1, adminName1, adminName2, placeName")
    phashes.each do |h|
      response = find(h['Pcode'])
      if response["postalCodes"].empty?
        bad_outfile.puts("#{h['Pcode']} not found")
      else
        response["postalCodes"].each do |r|
          outfile.puts "#{h['Pcode']}, #{r["lat"]}, #{r["lng"]}, #{r["adminCode1"]}, #{r["adminName1"]}, #{r["adminName2"]}, #{r["placeName"]}"
        end
      end
    end
    outfile.close
    bad_outfile.close  # flush the "not found" list as well
  end
end
describe GeocodeAustralianPostcodes do
  before do
    @gap = GeocodeAustralianPostcodes.new
    @phashes = @gap.postcode_hashes('data/postcode_sampledata.txt')
  end

  it "should read the postcodes from the text file and convert them to an array of hashes" do
    # CSV::Reader is the Ruby 1.8 CSV API
    csv_reader = CSV::Reader.parse(File.open('data/postcode_sampledata.txt', 'r'))
    csv_reader.shift  # skip the header row
    @phashes.each do |h|
      # without .should the comparison is evaluated and thrown away, so nothing is checked
      h['Pcode'].should == csv_reader.shift[0]
    end
  end

  it "should find the postcode on each line and geocode the postcode using the Geonames JSON API" do
    @phashes.each do |h|
      resp = @gap.find(h['Pcode'])
      if resp["postalCodes"].empty?
        #puts("#{h['Pcode']} not found")
      else
        resp["postalCodes"].each do |r|
          #puts "#{h['Pcode']}, #{r["lat"]}, #{r["lng"]}"
        end
      end
    end
  end

  it "should write back to an output file with Lat, Lon, Postcode if geocoded, otherwise postcode# not found" do
    @gap.write('data/pc-full_20080529.csv', 'data/aus_post_output.csv')
  end
end
I noticed the Postcode Areas (POA) data at the ABS is missing postcodes. The POAs were created by aggregating Census Districts (CDs), so some postcodes dropped out because the CDs they covered had already been allocated to other postcodes. Strangely, one-to-many associations don’t seem to be supported between CDs and POAs. I might need to get in touch with Australia Post to see if they have any point or polygon data for postcodes that has been mapped independently of Census Districts, and then do a join with the Electoral Divisions from the ABS. Has anyone out there already done this?
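To see which postcodes drop out of the POA data, a set difference between the two listings is probably enough. Here is a minimal sketch; the method name `missing_postcodes` and both sample lists are made up for illustration and are not real Australia Post or ABS data.

```ruby
require 'set'

# Hypothetical sample lists, NOT real Australia Post or ABS data.
AUSPOST_POSTCODES = %w[0001 0002 0003 0004]
ABS_POA_POSTCODES = %w[0001 0003 0004]

# Postcodes present in the full listing but absent from the POA listing.
def missing_postcodes(all_postcodes, poa_postcodes)
  (all_postcodes.to_set - poa_postcodes.to_set).to_a.sort
end

puts missing_postcodes(AUSPOST_POSTCODES, ABS_POA_POSTCODES).inspect
```

In practice the two lists would come from the Australia Post CSV and the POA attribute table rather than from constants.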

Back in Feb, at the Sydney Ruby meetup, Matthew Landauer gave a talk on the Open Australia project. Today the site went live and it’s looking great! (OpenAustralia.org) The project has been developed by volunteers. You can enter your postcode, see who your local representative in parliament is, and watch their activities in parliament. I love the ability to comment on each statement spoken by a parliamentarian. I actually had a very-teeny-tiny-little-bit to do with the postcode functionality: I sniffed out some of the data for the postcode to commonwealth electoral division mapping. Matt informs me the first bug reports have to do with missing postcode data *blush*… yep, I screwed up!… I didn’t spend enough time looking at the data before I sent it, thus overnight turning my geo-guru status in Matt’s eyes to that of a mere geo-goat!

Now, having had a closer look at the ABS site, I think I may have spotted the problem… the data for the Electoral Divisions and the Postcodes are not in the same hierarchy. In other words, one dataset cannot be aggregated or de-aggregated to create the other. So what’s the solution? Yep, the *real* geo-gurus are thinking full outer spatial-join.
For the geo-goats (like me) it will help to check out this diagram that shows how Australia is chopped up into shapes by the very competent folks at the ABS.
What’s going on is that the two sets of regions are derived separately, in separate branches of the hierarchy. I am currently looking at fixing this by getting hold of the two data sets and freshly deriving the postcodes.
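For fellow geo-goats, here is roughly what a spatial join between the two datasets could look like in plain Ruby. This is only a sketch under big assumptions: postcodes are treated as single points (as a geocoder returns them) rather than polygons, the test is a basic ray-casting point-in-polygon check with no handling of holes or multi-part boundaries, and every name and coordinate below (`point_in_polygon?`, `spatial_join`, the divisions, the postcodes) is invented for illustration. A real job would more likely use a spatial database or GIS tool.

```ruby
# Ray-casting point-in-polygon test.
# polygon is an array of [lon, lat] vertices; the ring is closed implicitly.
def point_in_polygon?(lon, lat, polygon)
  inside = false
  j = polygon.length - 1
  polygon.each_with_index do |(xi, yi), i|
    xj, yj = polygon[j]
    # Does the edge (j -> i) straddle the horizontal line through the point?
    crosses = (yi > lat) != (yj > lat)
    if crosses && lon < (xj - xi) * (lat - yi) / (yj - yi).to_f + xi
      inside = !inside
    end
    j = i
  end
  inside
end

# Toy "full outer" join between postcode points and division polygons.
# Returns [matches, unmatched_postcodes, empty_divisions].
def spatial_join(postcodes, divisions)
  matches = []
  postcodes.each do |pc|
    divisions.each do |name, polygon|
      matches << [pc[:postcode], name] if point_in_polygon?(pc[:lon], pc[:lat], polygon)
    end
  end
  matched_postcodes = matches.map { |postcode, _| postcode }
  matched_divisions = matches.map { |_, name| name }
  unmatched = postcodes.map { |pc| pc[:postcode] } - matched_postcodes
  empty = divisions.keys - matched_divisions
  [matches, unmatched, empty]
end

# Made-up squares and points, NOT real ABS boundaries or Geonames results.
DIVISIONS = {
  'Division A' => [[0, 0], [10, 0], [10, 10], [0, 10]],
  'Division B' => [[10, 0], [20, 0], [20, 10], [10, 10]],
  'Division C' => [[40, 40], [50, 40], [50, 50], [40, 50]]
}
POSTCODES = [
  { :postcode => '0001', :lon => 5,  :lat => 5 },
  { :postcode => '0002', :lon => 15, :lat => 5 },
  { :postcode => '0003', :lon => 30, :lat => 30 }
]

matches, unmatched, empty = spatial_join(POSTCODES, DIVISIONS)
puts "matches: #{matches.inspect}"
puts "postcodes with no division: #{unmatched.inspect}"
puts "divisions with no postcode: #{empty.inspect}"
```

The "outer" part is the two leftover lists: postcodes that fall in no division and divisions that contain no postcode are exactly the rows that would otherwise go missing without anyone noticing.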
So I got the data from the ABS and here are the results: it looks as though an electoral division may contain more than one postcode. Now to generate a new dataset… A division with several postcodes is not a problem at the application level. The problem might occur if a postcode overlaps more than one electoral division… still checking the data for that.
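Checking for the awkward case can be done by grouping the joined rows by postcode and looking for any postcode with more than one distinct division. A small sketch, assuming the join result is available as (postcode, division) pairs; the method name `postcodes_spanning_divisions` and the sample pairs are hypothetical.

```ruby
# Given [postcode, division] pairs, return a hash of only those postcodes
# that map to more than one distinct division.
def postcodes_spanning_divisions(pairs)
  by_postcode = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |postcode, division| by_postcode[postcode] << division }

  by_postcode.each_with_object({}) do |(postcode, divisions), result|
    unique = divisions.uniq
    result[postcode] = unique if unique.size > 1
  end
end

# Made-up pairs, NOT real postcode or electoral division data.
SAMPLE_PAIRS = [
  ['0001', 'Division A'],
  ['0001', 'Division B'],
  ['0002', 'Division A'],
  ['0002', 'Division A'],
  ['0003', 'Division B']
]

puts postcodes_spanning_divisions(SAMPLE_PAIRS).inspect
```

An empty result would mean every postcode sits in exactly one division, in which case a plain postcode-to-division lookup is enough for the application.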
