Programmatic migration – what can go wrong?

Useful Learning From My Experience

Programmatic migration means using software to move data from one system to another. It is far faster than migrating content by hand.

It collects information held in one format and transforms it so that it can be read in a new context or programme. Web content, for example, is often held in a SQL database.

A SQL database is a collection of linked tables.

You might be replacing an old Content Management System (CMS) with a new one, for example.

I’ve worked on several migrations from older CMS platforms, such as WordPress or Joomla, to new SharePoint websites. Microsoft have invested in making SharePoint a powerful tool for running intranets and public sites, so it is an attractive proposition for companies like AELTC and organisations like BEIS.

To work in SharePoint, content has to be set out in the correct format, matching the database fields behind the scenes.

Clever Software for fast data migration

Clever software is used to capture content in one format and transform it into a new arrangement.
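
At its simplest, that transformation is a renaming of fields. Here is a minimal sketch in Python, assuming made-up field names on both sides: post_title, post_content and post_date stand in for the old system, and Title, Body and PublishedDate for the new one. None of these are taken from a real WordPress or SharePoint schema, and transform_row is a hypothetical helper, not part of any migration tool.

```python
# Hypothetical field names: neither side is copied from a real
# WordPress or SharePoint schema.
FIELD_MAP = {
    "post_title": "Title",
    "post_content": "Body",
    "post_date": "PublishedDate",
}


def transform_row(source_row, field_map=FIELD_MAP):
    """Rename the fields of one source record to their target names."""
    missing = [name for name in field_map if name not in source_row]
    if missing:
        # Fail loudly rather than migrate a half-empty record.
        raise KeyError(f"source row is missing fields: {missing}")
    return {target: source_row[source] for source, target in field_map.items()}


old_row = {
    "post_title": "Opening hours",
    "post_content": "<p>We are open 9 to 5.</p>",
    "post_date": "2019-03-01",
}
new_item = transform_row(old_row)
print(new_item)
```

Keeping the mapping in one dictionary, rather than scattered through the script, makes it something you can review against both databases before any content moves.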

But there are quite a few things that can go wrong in this process.

Experian have published an excellent blog post on this, which has stood the test of time. Experian settled on the catchy number of eight, and please do visit their site to read their list. Here I’ve identified five problems I’ve seen myself.

  1. Clients having poor knowledge of the problems with their existing data, and assuming technology will find a solution quickly.
  2. Pre-migration laziness – not carrying out a thorough enough data analysis exercise beforehand to understand the fundamental structure of the AS IS database and its fields. This means not noticing guff in the wrong fields, and if that is not picked up, content with errors in it can be migrated. If the site has thousands of pages, manual testers tend only to sample pages, rather than read the whole of the content in the new environment. As a result, hunting down the problem pieces of content once migrated is a massive task, and confidence in your solution will disappear.
  3. Non-validation of the specification and requirements of the new database. This specification is known as the database schema: an abstract mapping of the fields contained in the tables which make up the database. SharePoint has its own database schema, which sets out how information shown to the end user through their web browser is stored behind the scenes. Serious study of the TO BE database schema is needed to specify how information from the AS IS system is to be stored in the new TO BE database.
  4. Not enough early-stage testing – sometimes people use sample content which is not representative of the main content to be migrated.
  5. Post-migration – bandwidth constraints limit the amount of information which can be migrated in an hourly or daily period. Often people wait until all of the data is transferred before testing the new content in the new system to see if it has been properly transferred. This delays a horrible surprise which is better coming early, so you have time to respond. That is, you get to the end and find out that all the headline text from one website is coming through as body text in the new website, because the fields were not properly mapped against each other in your script. Introducing batching, with associated early and agile prototyping stages, means you will find problems and deal with them early, rather than waiting until the end, when your scope for avoiding a poor user experience and embarrassment is limited.
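
To make the batching point in problem 5 concrete, here is a rough sketch in Python of migrating in small batches and checking each batch before moving on. Everything named here is an assumption for illustration: the headline, body, Title and Body fields, the batch size, and the write_batch function, which stands in for whatever actually writes to the new system and hands back what was stored. In a real migration the check would read the content back from the new system rather than trust a return value.

```python
def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def verify_batch(source_rows, migrated_rows, field_map):
    """Return (id, source field, target field) for every mismatch."""
    problems = []
    for src, dst in zip(source_rows, migrated_rows):
        for src_field, dst_field in field_map.items():
            if dst.get(dst_field) != src.get(src_field):
                problems.append((src.get("id"), src_field, dst_field))
    return problems


def migrate_in_batches(rows, field_map, write_batch, size=2):
    """Migrate batch by batch, stopping at the first batch with problems."""
    done = 0
    for batch in batches(rows, size):
        migrated = write_batch(batch)  # hypothetical writer for the new system
        problems = verify_batch(batch, migrated, field_map)
        if problems:
            return done, problems
        done += len(batch)
    return done, []


# Hypothetical field names, not taken from any real schema.
FIELD_MAP = {"headline": "Title", "body": "Body"}


def faulty_write_batch(batch):
    # Stand-in for the new system. It deliberately puts the headline
    # into the body field, mimicking a badly mapped script.
    return [{"Title": "", "Body": row["headline"]} for row in batch]


rows = [
    {"id": i, "headline": f"Headline {i}", "body": f"Text {i}"}
    for i in range(1, 6)
]
done, problems = migrate_in_batches(rows, FIELD_MAP, faulty_write_batch, size=2)
print(done, len(problems))
```

With the faulty writer, the run stops at the first batch of two pages instead of after all five, which is the point: the horrible surprise arrives while there is still time to fix the mapping.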