Monday, August 06, 2007

It is all about the data

...but the question is where to start. I can't tell you how many times I go into an account and find that they have no idea what data models sit behind their systems, or whether the models they do have are adequate or accurate. I think it is fair to say that most IT organizations still view modeling as a nice-to-have. It should come as no surprise, then, that most applications are still deployed as stovepipes within the organization. To truly integrate systems, you must have a good understanding of each system. By integration I am not talking about an arbitrary pump of data to fill out a form; I am talking about the true data reliance we will need when we finally achieve the service-oriented architectures we talk so much about. Here are the most common excuses I hear from IT about modeling when we go in to do IT master planning.

1) We have too many systems and too much legacy data
2) Modeling takes more time than it is worth
3) The users won't put in all of the data required so we have bad data
4) Not all systems require it because we are not integrating all systems
5) We have some systems that we cannot customize

I understand and can empathize: if you have never done it and you have a medium-sized IT infrastructure, the task can be daunting. However, it is not one large task; it can be split into several smaller pieces. Here are the first two that I suggest.

1) Catalog your current models and objects

Objects in the models are things such as employees, customers, locations and orders. Once there is a good grasp of these objects, the group should pick one. I usually start with employees because it is the easiest object for people to reason about: what data they would want to know about an employee, and how to uniquely identify that employee as the record moves from system to system.

The next two steps are to determine which system will be the "main" system that houses the authoritative data set for this object (Active Directory, an ERP system, a customer service system, etc.) and which systems this object appears in. There will not always be a single main system, but where one is possible it makes data integrity easier to maintain.

Once the group goes through that exercise, they should be able to rinse and repeat for all objects. Once they have all of the objects, they should be able to go back and build out the models for all of the systems. It is not always that easy, but it gives them a good foundation.
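To make the catalog exercise concrete, here is a minimal sketch in Python of what one catalog entry might look like. Everything in it is an assumption for illustration: the `CatalogEntry` class, its field names, the `employee_id` key and the list of systems are hypothetical, not taken from any particular product.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CatalogEntry:
    """One business object and where it lives (all names are illustrative)."""
    name: str                            # e.g. "employee"
    unique_key: str                      # attribute that identifies it everywhere
    authoritative_system: Optional[str]  # None when there is no single main system
    appears_in: Set[str] = field(default_factory=set)


def systems_to_reconcile(entry: CatalogEntry) -> List[str]:
    """Systems that hold a copy of the object but are not the authoritative one."""
    return sorted(s for s in entry.appears_in if s != entry.authoritative_system)


# Hypothetical entry for the employee object.
employee = CatalogEntry(
    name="employee",
    unique_key="employee_id",
    authoritative_system="Active Directory",
    appears_in={"Active Directory", "ERP", "Customer Service"},
)

print(systems_to_reconcile(employee))
```

Even a list this simple tells the group which systems have to be kept in line with the main one, and it can be repeated object by object.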

2) Begin to change the user input

I love the "The users won't put in all of the data required so we have bad data" excuse. I ask the groups: if a customer filled out an order or a contract but left off their name, the company name or a signature, would you accept it? There is no reason to accept partial data if there is a good business reason why full data needs to be captured.
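As a rough sketch of what refusing partial data could look like at the point of input, the Python below rejects an order that is missing any required field. The field names and the `accept_order` function are hypothetical; the real list of required fields should come from the business reason for capturing them.

```python
# Hypothetical required fields for an order; adjust to the model you settle on.
REQUIRED_ORDER_FIELDS = ("customer_name", "company_name", "signature")


def missing_fields(record, required=REQUIRED_ORDER_FIELDS):
    """Return the required fields that are absent or blank in the record."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def accept_order(record):
    """Accept the order only when every required field has been filled in."""
    problems = missing_fields(record)
    if problems:
        raise ValueError("Order rejected, missing: " + ", ".join(problems))
    return record


try:
    accept_order({"customer_name": "Ann Smith", "company_name": " "})
except ValueError as err:
    print(err)
```

Telling the user exactly which fields are missing, rather than silently storing a partial record, is what keeps the bad data out in the first place.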

Some of this behavior is caused by the fact that not all of the information is integrated. For instance, as a user I know that if I put a unique identifier on a form (e.g. an e-mail address), there is no reason a company shouldn't be able to pull my address from its data banks if I have entered it before. Without that, users get really tired of filling in long forms and view them as a work detractor instead of something that is helping them.
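The idea of pulling known data from a unique identifier can be sketched as follows. This assumes a hypothetical `known_customers` lookup keyed by e-mail address; in practice it would be a query against whichever system is authoritative for the customer object.

```python
def prefill_form(email, submitted, known_customers):
    """Fill blank form fields from data already held for this e-mail address.

    known_customers is a hypothetical dict keyed by lower-cased e-mail.
    Values the user actually typed always win over stored values.
    """
    key = email.strip().lower()
    merged = dict(known_customers.get(key, {}))
    merged.update({k: v for k, v in submitted.items() if v})
    merged["email"] = key
    return merged


known_customers = {
    "pat@example.com": {"name": "Pat", "address": "1 Main St"},
}

# The user only typed a name; the address comes from what is already on file.
form = prefill_form(" Pat@Example.com ", {"name": "Patricia", "address": ""}, known_customers)
print(form)
```

The user types one identifier and corrects only what has changed, which is a much easier sell than a long blank form.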

Take the one object that the group identified above and start to change the input for that object to match the model you want and to enforce data integrity. If you start off a little bit at a time, the users will get used to putting in data of a certain format without it seeming too overbearing. Gradually introduce the rest of the input for the other objects until you have a system that makes logical sense and you can be sure you are capturing the data needed to run it.

