Beware Of The Bad Data!
In 2017, IBM estimated that bad data cost the US economy $3.1 trillion. Imagine what it costs your organization.
Ever since Clive Humby coined the phrase "data is the new oil" in 2006, everyone has been gung-ho about data. Large volumes of big data churn through every organization's arteries. Big data is touted as the disruptive driver of competitive advantage. In a digitally integrated world, data drives and transforms the entire value chain spanning revenue, cost and innovation.
As with oil, data needs to be processed and refined for downstream solutions, but it is useful only when it is of the right quality. How much of all this data is accurate or clean to begin with? A study reported in MIT Sloan Management Review indicated that bad data can cost companies 15% to 25% of revenue.
Four factors determine the value of big data: Volume, Velocity, Variety, and Veracity. Our concern in this article is with the Veracity of data, representing its reliability and integrity. A decision maker needs to be satisfied with the integrity of the data value chain, from source to the decision. Based on our experience and observation, there are six kinds of flaws in Data Veracity:
Akash Jain, Founder, InfoTech Navigator
1. Pollution of raw data: This stems from bad surveys, subconsciously biased sampling, impersonation, and bad data entry processes. A classic example is social media, where bots masquerade as humans.
2. Contextual adequacy or appropriateness: Data collected could be factual, but contextually irrelevant. A leading hypermarket chain noted mismatched results for direct marketing campaigns directed at certain loyalty card users. "In-store" interviews revealed that loyalty cards were actually owned by house-help staff making purchases on behalf of their employers.
3. Disharmony with data velocity: In a fast-moving digital marketing environment, data obsolescence renders analytics incorrect or useless in a very short time.
4. Isolation of business functions and data processes: This leads to duplication, with instances of the same data not updated simultaneously.
5. Fraudulent data: A problem that even the most secure technologies, such as blockchain, have not been able to fully control.
6. Omissions: caused primarily by laziness, lack of comprehension, or impending deadlines.
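Some of these flaws lend themselves to simple automated checks. The following is a minimal sketch, not a definitive implementation, of how omissions (flaw 6), stale records (flaw 3) and duplicated entries (flaw 4) might be flagged. The sample records, field names, the one-year staleness threshold and all function names are hypothetical assumptions made for illustration; flaws such as contextual inadequacy or fraud generally cannot be detected this mechanically.

```python
from datetime import date, timedelta

# Hypothetical customer records; the field names are illustrative assumptions.
RECORDS = [
    {"id": 1, "email": "a@example.com", "city": "Pune", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "a@example.com", "city": "Mumbai", "updated": date(2024, 5, 20)},
    {"id": 3, "email": "", "city": "Delhi", "updated": date(2024, 5, 25)},
    {"id": 4, "email": "d@example.com", "city": None, "updated": date(2023, 1, 10)},
]


def find_omissions(records, required):
    """Return ids of records where any required field is missing or empty."""
    return [r["id"] for r in records
            if any(r.get(field) in (None, "") for field in required)]


def find_stale(records, today, max_age_days):
    """Return ids of records last updated more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return [r["id"] for r in records if r["updated"] < cutoff]


def find_duplicates(records, key):
    """Return groups of ids that share the same non-empty value for key."""
    groups = {}
    for r in records:
        value = r.get(key)
        if value:
            groups.setdefault(value, []).append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def veracity_report(records, today):
    """Combine the three checks into one summary dictionary."""
    return {
        "omissions": find_omissions(records, ["email", "city"]),
        "stale": find_stale(records, today, max_age_days=365),  # assumed threshold
        "duplicates": find_duplicates(records, "email"),
    }


if __name__ == "__main__":
    print(veracity_report(RECORDS, today=date(2024, 6, 1)))
```

In this sketch, records 1 and 2 share an email address but disagree on the city, which is the kind of unsynchronized duplication that isolated business functions tend to produce.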
Bad data can lead to revenue loss, cost inefficiencies and missed opportunities. Leaders need to recognize and act on this expeditiously. Some steps to eliminate the scourge of bad data include:
• Articulate a vision for a data-driven culture and a sustainable program to implement it.
• Invest in strong leadership and talent to drive data-driven decision making.
• Identify what data points are key to the core decision areas.
• Treat data with the same respect as other key assets such as plant and machinery, human talent, etc.
• Engage continually with points of data origin: customers, vendors, partners, social media, regulators, shareholders, etc.
The guru of quality, W. Edwards Deming, said, "In God we trust; all others bring data." While following this dictum, we should also remember that it needs to be the right data, because, as Ronald H. Coase, a renowned British economist, said, "If you torture the data long enough, it will confess to anything." So the risk of bad data is truly in what you will never know.