The Causes Of Bad Data, And How To Fix Them
Poor-quality data is costing organisations a lot of money. In fact, research suggests that ‘bad data’ costs businesses 30% or more of their revenue on average, which is pretty staggering when you think about it.
Obviously, as with all such research, these figures should be taken with a pinch of salt. But one thing does seem clear – if you don’t focus on putting measures in place that will stop the flow of poor-quality data into and around your organisation, it will hit you hard and you will lose money because of it.
What are the causes of bad data?
As more and more information enters your organisation from different points of origin, the likelihood that mistakes will be made automatically increases. The first mistake that many organisations make is disregarding the potentially damaging impact of bad data that’s already flowing around between their enterprise applications, and playing down the problem by assuming that their data is clean and accurate. However, as the figure mentioned above makes clear, you cannot simply ignore the problem and hope it goes away.
The second mistake that organisations make is to allow different parts of the business to become completely isolated from one another, as this creates departmental silos. Once these silos have been created and taken root, not only are they hard to break down, but they also encourage the use of data definitions that are different to those used by other parts of the organisation. This makes it difficult to manage and reconcile important data between teams and applications.
Of course, there’s one cause of bad data that has always existed and always will to some degree, and that’s human error. Regardless of the amount of supplier data that’s being entered or used, human error will always be a factor that must be accounted for, as typos and incorrect or incomplete information can find their way into the application. Human fallibility is one of the main reasons why you need data governance measures in place, as one of the main aims of governance is to make sure there are stringent approval processes that will catch errors early on in the data lifecycle.
When you combine the two points above, the existence of silos and the constant risk of human error, you can see why it’s so easy for data to be duplicated. Duplicate data is a common problem and one that every organisation is likely to experience to a certain extent, but that doesn’t mean you can just leave it and hope it disappears. Duplicated information can seriously affect the accuracy of your reporting and analysis, and make data cleansing exercises difficult and expensive, so it’s something that must be tackled proactively.
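To give a feel for how duplicates slip in, here is a minimal sketch in Python of one way to flag likely duplicate supplier records. It assumes each record is a dictionary with hypothetical `id` and `name` fields, and that two names refer to the same supplier if they match once case, punctuation and a short, assumed list of legal suffixes are removed. Real matching rules will usually need to be stricter than this.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "llc"}


def normalise_name(name):
    """Lower-case the name, drop punctuation and remove legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    words = [word for word in cleaned.split() if word not in LEGAL_SUFFIXES]
    return " ".join(words)


def find_duplicates(records):
    """Group record ids by normalised name and keep groups with more than one id."""
    groups = defaultdict(list)
    for record in records:
        groups[normalise_name(record["name"])].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample data, as it might look when two silos enter the same supplier.
suppliers = [
    {"id": 1, "name": "Acme Ltd."},
    {"id": 2, "name": "ACME Limited"},
    {"id": 3, "name": "Globex Inc"},
]
duplicates = find_duplicates(suppliers)
```

In this sample, the two Acme records are grouped together even though they were typed differently, which is exactly the kind of duplication that silos and human error tend to produce.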
Having mentioned data governance above, it’s worth noting that the absence of any such governance measures can itself be a cause of bad data. Without such policies and procedures in place, it’s easy to see how bad practices can become prevalent within your organisation.
What does bad quality data look like?
Before your organisation can solve its data quality problems, it first needs to understand what bad supplier data actually looks like. This means communicating clearly, across all seniority levels and silos, the following warning signs to look out for:
Inaccurate information – arguably the hardest type of bad vendor data to identify is that which is inaccurate. This is because, while all of the required information may look like it’s present and consistent, there are subtle errors (for example, in names, addresses, phone numbers or payment details) that are hard to detect and require prior knowledge to spot.
Incomplete information – when important pieces of information are missing, the entry is incomplete. This is usually the easiest type of bad data to spot.
Inconsistent information – like inaccurate information, this can be quite tricky to spot at first glance because it might seem as if all of the fields have been filled in (and they may technically be correct), but the information is recorded or displayed inconsistently. For example, this could be telephone numbers with no dialling code or too many spaces, money in different currencies, or a name written as an initial rather than in full.
Other ways in which data can be defined as being of a bad quality include invalid, redundant or non-standardised information, which you can read more about in this blog.
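Two of the warning signs above, incomplete and inconsistent information, lend themselves to simple automated checks. The Python sketch below is one possible approach, not a definitive rule set. The required field names and the phone convention (a plus sign followed by 7 to 15 digits, ignoring spaces) are assumptions made for illustration. Inaccurate information is deliberately left out, because a value can pass every format check and still be wrong.

```python
import re

# Assumed required fields for a supplier record.
REQUIRED_FIELDS = ("name", "address", "phone", "currency")

# Assumed convention: "+" followed by 7 to 15 digits once spaces are removed.
PHONE_PATTERN = re.compile(r"^\+\d{7,15}$")


def check_record(record):
    """Return a list of warning strings for one supplier record."""
    issues = []
    for field in REQUIRED_FIELDS:
        # A field counts as missing if it is absent, None or only whitespace.
        if not str(record.get(field) or "").strip():
            issues.append(f"incomplete: missing {field}")
    phone = record.get("phone") or ""
    if phone and not PHONE_PATTERN.match(phone.replace(" ", "")):
        issues.append("inconsistent: phone lacks international dialling code")
    return issues


# Hypothetical record with an empty address and a phone number without a dialling code.
record = {"name": "Acme Ltd", "address": "", "phone": "020 7946 0000", "currency": "GBP"}
issues = check_record(record)
```

A check like this will not catch everything, but it gives people across the organisation a shared, concrete definition of what an incomplete or inconsistent record looks like.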
How can you fix your organisation’s bad data?
Fixing your organisation’s supplier data is more than just the act of cleaning and changing the information itself. To really fix your bad data and the causes of it, you need to transform your business’s mindset and attitude towards the way it captures and stores information.
This means you should not treat data cleansing as a one-time project. To truly maintain the quality of your vendor data in the long term, you need a data governance team in place that not only defines processes and policies for your data, but also defines metrics that allow it to track how the information changes over time.
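As an illustration of the kind of metric a governance team might track, the Python sketch below calculates a simple completeness rate (the share of records in which every required field is filled in) for a series of snapshots. The field names, the monthly snapshot labels and the sample records are all hypothetical, and completeness is only one of many metrics you could choose.

```python
def completeness_rate(records, required_fields):
    """Share of records in which every required field is non-empty."""
    if not records:
        return 0.0  # nothing to measure; treated as 0.0 in this sketch
    complete = sum(
        all(str(record.get(field) or "").strip() for field in required_fields)
        for record in records
    )
    return complete / len(records)


# Hypothetical monthly snapshots of the same supplier data.
snapshots = {
    "2024-01": [
        {"name": "Acme", "phone": ""},
        {"name": "Globex", "phone": "+441632960000"},
    ],
    "2024-02": [
        {"name": "Acme", "phone": "+441632960001"},
        {"name": "Globex", "phone": "+441632960000"},
    ],
}
trend = {
    month: completeness_rate(rows, ("name", "phone"))
    for month, rows in snapshots.items()
}
```

Comparing the rate from one snapshot to the next shows whether quality is improving or slipping, which is more useful than a single measurement taken after a cleansing exercise.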
One particular analogy we like to use here is that of cleaning the ocean. Even if you focus all of your efforts on removing plastic from the ocean right now, more will come in to replace it as quickly as you take it out. You’ll be in a constant process of cleaning without actually fixing the root of the problem. The main focus should be on stopping the pollution from taking place in the first instance.
It is the same with data. Once you’ve stopped poor quality supplier data from entering the system and adding to the problem, you can turn your attention to fixing the bad data you already have. This is also one of the main reasons why it is so beneficial to establish a single entry point, or supplier portal, for your supplier data, as you can then control the flow of information into your organisation and put thorough quality-control measures in place. You can find out more about the advantages of having a global supplier portal in this blog.
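The idea of a single entry point with quality control can be sketched in a few lines of Python. The `SupplierPortal` class below is hypothetical and is not based on any particular product. It assumes every submission must include `name`, `tax_id` and `country` fields, and that `tax_id` uniquely identifies a supplier. Submissions that fail a check are rejected with a reason instead of being stored.

```python
# Assumed required fields and unique key for a supplier submission.
REQUIRED_FIELDS = ("name", "tax_id", "country")
KEY_FIELD = "tax_id"


class SupplierPortal:
    """Hypothetical single entry point that checks records before storing them."""

    def __init__(self):
        self.accepted = {}   # stored records, keyed by tax_id
        self.rejected = []   # (record, reason) pairs kept for review

    def submit(self, record):
        """Store the record and return True, or reject it and return False."""
        missing = [
            field for field in REQUIRED_FIELDS
            if not str(record.get(field) or "").strip()
        ]
        if missing:
            self.rejected.append((record, "missing fields: " + ", ".join(missing)))
            return False
        key = str(record[KEY_FIELD]).strip()
        if key in self.accepted:
            self.rejected.append((record, "duplicate " + KEY_FIELD))
            return False
        self.accepted[key] = record
        return True


portal = SupplierPortal()
first = portal.submit({"name": "Acme Ltd", "tax_id": "GB123", "country": "GB"})
second = portal.submit({"name": "ACME Limited", "tax_id": "GB123", "country": "GB"})
third = portal.submit({"name": "Globex", "tax_id": "", "country": "US"})
```

Because every record passes through the same `submit` method, the duplicate and the incomplete submission are stopped at the door and logged with a reason, rather than being discovered later in a cleansing exercise.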
However, even once you have created your data governance function, you still need to make sure you draw in other people from across the organisation’s business community, including subject matter experts, in order to make sure everyone is working in alignment with one another.
A key part of this initial phase is assessing what is important to your organisation, which is why you need alignment with other parts of the business. Which applications are most important to your business, or used the most? Which data elements are the most essential (i.e. information that you can’t function without)? You can’t begin to make changes until you know what position you’re actually in.
You can read more about why data governance is so important and what’s involved in creating a data governance framework in the following white papers and blogs: