Islands of Information

Snow on Tree, Poland 2006

The snow and the vodka of Christmas in Poland are but a distant memory now.

I'm knee-deep in one of the banes of my professional life: islands of information.

Years ago we produced a detailed Flash presentation explaining how large companies suffer when their data is stranded in islands of information created by the different software used by various corporate functions such as accounting, stock control, marketing, payroll and so on. Our client was a leading ERP supplier for the construction industry – their message was all about switching to a single integrated system. Despite spending a huge amount of time on this presentation, never once did we think that islands of information would be an issue for us and our school clients. (The presentation is still online here [3.3MB Flash])

Every week I'm presented with a new format for storing alumni, parent, pupil and teacher data. Naturally every vendor has designed their database in a unique way and, if they provide an export feature at all, outputs in their own special layout of columns and rows. Some don't believe in normalisation so you end up with three people per row. Others believe in such levels of customisability it's impossible to create a re-usable tool.
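To make the "three people per row" case concrete, here is a minimal sketch of splitting one wide row into one row per person. The layout is purely an assumption for illustration (a household id in the first column followed by repeating name/email pairs), and `WideRowSplitter`, `PersonRow` and `split` are hypothetical names, not part of our actual tool.

```java
import java.util.ArrayList;
import java.util.List;

final class WideRowSplitter {

    /** One normalised output row: a single person linked back to the source row. */
    record PersonRow(String householdId, String name, String email) {}

    // Assumed layout: column 0 is a household id, followed by repeating
    // (name, email) pairs, one pair per person. Real exports will differ.
    static final int FIELDS_PER_PERSON = 2;

    /** Splits one wide row into one PersonRow per non-empty person slot. */
    static List<PersonRow> split(String[] wideRow) {
        List<PersonRow> people = new ArrayList<>();
        String householdId = wideRow[0].trim();
        for (int i = 1; i + FIELDS_PER_PERSON <= wideRow.length; i += FIELDS_PER_PERSON) {
            String name = wideRow[i].trim();
            String email = wideRow[i + 1].trim();
            if (name.isEmpty()) {
                continue; // unused slot: this household has fewer than three people
            }
            people.add(new PersonRow(householdId, name, email));
        }
        return people;
    }

    public static void main(String[] args) {
        String[] wide = {"H42", "Anna Kowalska", "anna@example.com", "Jan Kowalski", "", "", ""};
        split(wide).forEach(System.out::println);
    }
}
```

Skipping empty slots matters: not every row really has three people in it, so the output row count is at most three times the input, not always exactly that.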

Of course our system uses its own unique data structure too, though it's fully normalised. Which is great apart from when I need to normalise 7,000 rows from someone else's program into 21,000 rows for our system. Dates are a horror too: some systems use yyyy-mm-dd or dd/mm/yyyy, whilst others have a separate column for each portion of the date. Sigh.
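For the date mess, a rough sketch of the kind of helper that copes with the three shapes mentioned above, using the standard `java.time` classes. `DateNormaliser`, `parse` and `fromParts` are hypothetical names, and the sketch assumes slash-separated dates are always day-first, which is exactly the sort of thing a human still has to confirm per export.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.List;
import java.util.Optional;

final class DateNormaliser {

    // Candidate input patterns, tried in order. STRICT resolution rejects
    // impossible dates such as 31/02/2006 instead of quietly adjusting them.
    private static final List<DateTimeFormatter> FORMATS = List.of(
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT),
            DateTimeFormatter.ofPattern("dd/MM/uuuu").withResolverStyle(ResolverStyle.STRICT));

    /** Parses a single-column date; empty if no known pattern matches. */
    static Optional<LocalDate> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String text = raw.trim();
        for (DateTimeFormatter format : FORMATS) {
            try {
                return Optional.of(LocalDate.parse(text, format));
            } catch (DateTimeParseException e) {
                // Not this pattern; fall through to the next candidate.
            }
        }
        return Optional.empty();
    }

    /** Builds a date from separate year, month and day columns. */
    static Optional<LocalDate> fromParts(String year, String month, String day) {
        try {
            return Optional.of(LocalDate.of(
                    Integer.parseInt(year.trim()),
                    Integer.parseInt(month.trim()),
                    Integer.parseInt(day.trim())));
        } catch (NumberFormatException | DateTimeException e) {
            return Optional.empty(); // non-numeric or out-of-range part
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("25/12/2006"));
        System.out.println(parse("2006-12-25"));
        System.out.println(fromParts("2006", "12", "25"));
    }
}
```

Returning an empty `Optional` rather than throwing means unparseable values can be collected and handed to a person for review instead of stopping the whole conversion run.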

Data conversion and transposition tools aren't new and the problem we face every week isn't new. And that's probably the most depressing thing. We've come so far in so many ways, yet when it comes to representing people, systems are continuously re-inventing the wheel. There are too many standards floating around to define people and their relationships to each other – the result being that none has been settled on.

If everybody could export and import vCard (or whatever; I'm not arguing for any particular standard here, just a standard) life would be a breeze.

Instead I'm left to keep tweaking our command-line Java tool for data conversion. Because while mapping one field in one database to a different field in another is easy, it's the little yet big problems of normalisation and data formats that take a human to mess up and hence a human to sort out.

Horses in Snow, Poland 2005