
DATA QUALITY MATURITY IS AN ELUSIVE METRIC.

LEARN HOW TO MEASURE IT

Elements of Data Consumption:

In any data system, you will find data generators and/or producers responsible for creating a piece of information that is a measure of either a business process or an outcome, and you will find data consumers who are interested in understanding the business process across the horizon. This demonstrates the need to integrate the data generated across the different applications participating in a business process. It’s important to understand that data producers and consumers operate with different objectives and goals, and that the same differences carry over to the applications built to support them.

In a perfectly orchestrated data process, data that is generated at the data producer application gets integrated to produce reports that are simply information about the business process or the state of the process. When there is a business motive that strives to explore data further by correlating data from different domains and identifying patterns using statistical models, the information becomes a business insight, later leading to knowledge over a period of time.

The next level of data maturity comes when insight, intelligence, and perception intersect naturally, leading to wisdom. Wisdom is the ultimate driver for higher-level cognitive decisions, which become more intuition-driven as vast amounts of knowledge are turned into wisdom. Future applications that use vast amounts of data (e.g., IoT, big data) and can process such large data in short durations can generate vast amounts of knowledge, leading to cognitive decision systems with sagacity, that is, the ability to make good judgements.

However, success in data maturity models depends critically on the quality of the data. Unfortunately, data quality is often investigated only just before data integration, with a short-term focus on fixing data quality parameters, for example by cleansing the data through data augmentation or by standardizing it with a data quality tool. Ignoring data quality issues that arise within the source of truth during the data generation process can lead to a bad data experience, raising questions about transparency and fostering a belief that the data is not trustworthy.

For information to move from intelligence to a state of sagacity, plenty of human factors need to shape its usage, such as motive, perception based on experience, and the context in which the data presents itself at the time of use. To provide a complete view of data issues, it’s important to correlate data quality metrics that are collected objectively, through profiling and measurement, with those that are collected subjectively, through a user survey that helps you understand usage experience and content clarity.
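One way to relate the objective and subjective measurements described above is a simple correlation coefficient. The sketch below is illustrative only: the function name `pearson`, the completeness ratios, and the survey trust scores are assumptions made for this example and do not come from any product.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long series of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Hypothetical inputs: one value per data set.
completeness_ratio = [0.5, 0.6, 0.7, 0.8]   # objective, from profiling
survey_trust_score = [2.0, 2.5, 3.0, 3.5]   # subjective, from a user survey

r = pearson(completeness_ratio, survey_trust_score)
```

A value of `r` close to 1 would suggest that users trust the data sets that also profile well, while a value near 0 would suggest that the survey is picking up issues the objective metrics do not capture.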

Data Quality Control Framework:

Data governance objectives such as understanding the value of a data asset, providing optimal use of data to the enterprise, creating awareness among data users, and promoting self-service data usage can only be accomplished by providing a complete view of the data, so that you can have a meaningful discussion with stakeholders.

The image below provides a complete view of the data quality parameters that can be validated across both the producer and consumer level based upon three categories that will be explored in further detail – (1) data representation, (2) data value usage, and (3) data context.

[Figure: data quality parameters across the Data Producer, Data Context, and Data Consumer levels. Labels shown: Data, Data Integration, BI Reporting, Analytics, Cognitive, Perceptive, Motive; Insights, Intelligence, Analysis; Enterprise, Local, Personal.]

What is Data Representation?

Data representation refers to the relationship within the values that are associated with an application’s attributes corresponding to its variables. As an example, consider an application that is used for processing loan applications:

1. At the application level, the application variable defines the loan application domain. Variations at the process level allow you to generate or measure a process, and the results are captured as process measures.

2. The process output is represented within the application as attributes, which describe different properties of the intended business action. In the above example, different properties or characteristics of a loan application itself are captured as data attributes.

3. Data attributes are bound or constrained by their data value properties, such as numeric values, float or money values, percentage values, etc. These boundaries restrict the data to ensure data quality is not compromised.

4. All data values and attributes have relationships across other attributes; they’re not binary. These relationships are important because they help define the business rules within the application and create dependencies between data values.

Changes in data attributes, including their values and relationships, are the main reasons that the quality of data represented in the application can degrade. It’s important to understand the root cause of such degraded attributes so that you can fix the quality at the source and prevent further damage downstream.
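The value bounds and attribute relationships described in the list above can be sketched as a small check on a loan application record. This is a minimal illustration, not a description of any real application: the `LoanApplication` fields and the rules in `check_representation` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoanApplication:
    # Hypothetical attributes chosen for illustration only.
    loan_amount: float
    interest_rate_pct: float
    term_months: int
    monthly_payment: float


def check_representation(app: LoanApplication) -> List[str]:
    """Return the list of violated rules; an empty list means the record passes."""
    issues = []
    # Value-level bounds on individual attributes.
    if app.loan_amount <= 0:
        issues.append("loan_amount must be positive")
    if not 0 <= app.interest_rate_pct <= 100:
        issues.append("interest_rate_pct must be between 0 and 100")
    if app.term_months <= 0:
        issues.append("term_months must be positive")
    # Relationship rule across attributes: repayments should cover the principal.
    if app.term_months > 0 and app.monthly_payment * app.term_months < app.loan_amount:
        issues.append("payments over the term do not cover loan_amount")
    return issues


good = LoanApplication(10000.0, 5.0, 24, 450.0)
bad = LoanApplication(10000.0, 120.0, 24, 100.0)
```

Here `check_representation(good)` returns an empty list, while `check_representation(bad)` reports both the out-of-range percentage and the broken relationship between payment, term, and amount.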

Data Representation with Data Quality Controls:

Data controls at the application level can be deployed to profile and monitor data at the time it is generated to ensure quality. Controls can measure the data quality at the data representation level, collect control metrics like volumetric information, and generate reports outlining the metrics that are independent of the complexity and sensitivity of the data itself. This leads to transparent and sharable data quality reports on the data.

Application-level data quality controls can monitor and identify data quality issues at the design level, or when they’re created due to faulty business rules in the application where the data is originating. Such controls not only help improve the data quality at the source of data origination, but can also help improve application extensibility and usability across the enterprise.
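As a rough illustration of a control that reports metrics without exposing the data itself, the sketch below profiles one column and returns only counts and ratios. The function name `profile_column` and the sample records are assumptions for this example; they do not represent a specific tool.

```python
from typing import Any, Dict, List


def profile_column(rows: List[Dict[str, Any]], column: str) -> Dict[str, float]:
    """Collect volumetric metrics for one column; no raw values are returned."""
    values = [row.get(column) for row in rows]  # a missing key counts as null
    total = len(values)
    null_count = sum(1 for value in values if value is None)
    distinct_count = len({value for value in values if value is not None})
    return {
        "row_count": total,
        "null_count": null_count,
        "null_ratio": null_count / total if total else 0.0,
        "distinct_count": distinct_count,
    }


# Hypothetical records as they might leave a producing application.
records = [{"ssn": "a"}, {"ssn": None}, {"ssn": "a"}, {}]
metrics = profile_column(records, "ssn")
```

Because the result holds only aggregate figures, the same report can be shared for sensitive and non-sensitive columns alike.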

Below are some control parameters that can be measured at the application level:

Control Name Description/Definition

Null Value Representation

Data Value Level Data Quality Controls:

Data movement between applications involves complex data transformations such as filter, aggregation, qualification, etc. As data moves through multiple stages of integration, the maturity of the data usage improves drastically, which results in maximizing the value of data as an enterprise asset. A data integration point is an ideal location to measure data quality metrics that can help an enterprise to understand value creation parameters of the data. Data value level quality parameters are measurable data properties and can be expressed as quantitative data metrics.

This space is dominated by many data quality product vendors and tools. They all provide similar profiling capabilities within the tool or profiler interfaces, but such capabilities are intended to be leveraged on a static data set, to help identify data patterns and data anomalies to aid users in building data standardization rules or data cleansing routines.

Data profilers are not usually used to monitor data in real time, and they lack the capability to compare quality metrics over time to produce data quality trends. For example, an organization can profile a data extract for a given day and receive the profiling results, but it cannot build profiles on a particular data field that occurs across multiple extracts, correlate the results, or trend them for subjective interpretation.
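To make the idea of trending one field across several extracts concrete, the following sketch computes a null ratio per daily extract and the change from the previous day. The function names, the date keys, and the sample values are hypothetical, and treating empty strings as nulls is an assumption of this example.

```python
from typing import Dict, List, Optional, Tuple


def null_ratio(values: List[Optional[str]]) -> float:
    """Share of values that are missing (None or empty string)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v is None or v == "") / len(values)


def trend(extracts: Dict[str, List[Optional[str]]]) -> List[Tuple[str, float, Optional[float]]]:
    """Return (extract_date, null_ratio, change_from_previous) in date order."""
    result = []
    previous = None
    for day in sorted(extracts):  # ISO dates sort chronologically as strings
        ratio = null_ratio(extracts[day])
        change = None if previous is None else ratio - previous
        result.append((day, ratio, change))
        previous = ratio
    return result


# Hypothetical values of one field taken from two daily extracts.
daily_extracts = {
    "2024-01-02": ["a", None, "b", "c"],
    "2024-01-01": ["a", "b", "c", "d"],
}
history = trend(daily_extracts)
```

A rising `change_from_previous` value on a field is the kind of signal that a single-day profile cannot show.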

Infogix data controls are deployable in real time or in batch, and can capture data quality metrics across different gradients. They can then report metrics and identify trends that can be correlated with other applications involved in the data flow process. Controls can be run in real time before or after a data integration routine.

The following are a set of Infogix data value controls that can measure different data quality parameters at the data value level. Such controls are repeatable, reusable, quick to deploy, and capture metrics in real-time.


Data value-level controls can be expanded by creating additional combinations of basic controls to build upon complex data quality parameters. Alternatively, custom controls that are hand-coded with control rules can be created to measure and monitor specific contextual data value measures.
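The idea of building a more complex control out of basic ones can be sketched with small composable functions. This is a generic illustration under stated assumptions: `not_null`, `in_range`, `all_of`, and `pass_rate` are hypothetical helpers written for this example, not the API of any product.

```python
from typing import Callable, Dict, List

Control = Callable[[Dict], bool]


def not_null(field: str) -> Control:
    """Basic control: the field must be present and not None."""
    return lambda record: record.get(field) is not None


def in_range(field: str, low: float, high: float) -> Control:
    """Basic control: the field must hold a value within [low, high]."""
    def check(record: Dict) -> bool:
        value = record.get(field)
        return value is not None and low <= value <= high
    return check


def all_of(*controls: Control) -> Control:
    """Composite control: passes only when every basic control passes."""
    return lambda record: all(control(record) for control in controls)


def pass_rate(records: List[Dict], control: Control) -> float:
    """Share of records that satisfy the control."""
    if not records:
        return 1.0
    return sum(1 for record in records if control(record)) / len(records)


valid_loan = all_of(not_null("id"), in_range("rate", 0, 100))
sample = [
    {"id": 1, "rate": 5},
    {"id": None, "rate": 5},
    {"id": 2, "rate": 150},
    {"id": 3, "rate": 0},
]
rate = pass_rate(sample, valid_loan)
```

Keeping each basic control as a plain function makes it reusable, and a hand-coded custom rule can be passed to `all_of` in the same way as the built-in ones.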

Data Value in the Age of Big Data

Data service channels are continually expanding their traditional data reporting. This process of organizing data into a summary of information that allows users to understand the current state of the business, while deriving business value through information insight, is becoming a more popular approach in the big data analytics world. Business teams are frequently pushed toward analysis and self-service options that allow them to explore the data and extract actionable insight.

The evolution of data provisioning channels means an increased demand for metadata and business glossaries that can aid businesses in their journey of transitioning data to insight. Yet as this evolution takes place, data quality challenges arise in understanding the data context when converting insight to knowledge and thereafter to actionable wisdom. Ignoring the opportunity to measure and correct data quality issues at the context level lengthens the time it takes the organization to realize value from its data.

Data Context Controls for Data Quality

Data context is defined as the interpretation of data at the time of its usage or at the point of conversion from data to insight. Reporting tools and downstream data extracts used in third-party applications are the ideal places to measure data quality values, though it’s not uncommon to measure them at the data warehouse level too. Infogix controls are independent and easily deployable at either the reporting tool or a third-party data application. There they capture data metrics that can be correlated with survey results to provide an end-to-end, contextual understanding of data quality.


Enterprise Data Integrity for Better Results

To make the most of your enterprise systems, it’s vital to verify, balance, reconcile, and track data end-to-end to ensure it is high quality, accessible and standardized for compliance. Important considerations to keep in mind include:

• Addressing any data quality issues closest to the source to avoid propagation of bad data

• Ensuring that data quality improvement is a continuous cycle requiring people, processes, and technology working in collaboration to succeed
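As a small illustration of what balancing and reconciling data between two systems can look like, the sketch below compares record keys and a control total between a source and a target. The function name `reconcile`, the field names, and the sample records are hypothetical and serve only this example.

```python
from typing import Dict, List


def reconcile(source: List[Dict], target: List[Dict], key: str, amount: str) -> Dict:
    """Compare record keys and a control total between two data sets."""
    source_keys = {record[key] for record in source}
    target_keys = {record[key] for record in target}
    source_total = sum(record[amount] for record in source)
    target_total = sum(record[amount] for record in target)
    return {
        "missing_in_target": sorted(source_keys - target_keys),
        "unexpected_in_target": sorted(target_keys - source_keys),
        # Rounded to 2 decimals to keep the control total readable.
        "amount_difference": round(target_total - source_total, 2),
    }


# Hypothetical records before and after a data hop.
source_rows = [{"id": 1, "amt": 10.0}, {"id": 2, "amt": 20.0}, {"id": 3, "amt": 5.5}]
target_rows = [{"id": 1, "amt": 10.0}, {"id": 3, "amt": 5.5}]
report = reconcile(source_rows, target_rows, key="id", amount="amt")
```

Running such a check at each hop points to the step where a record was dropped, which supports fixing the issue close to its source.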

Infogix provides data integrity at an enterprise-level for all business information. Working with Infogix helps to bridge the gap between business and IT by making sure all of the data hops (exchanges) perform flawlessly and the delivery of data is done accurately across your enterprise systems. This is why data-intensive organizations are choosing Infogix standardized and automated data integrity solutions to provide enterprise visibility into the health of their data.


About Infogix

In our fourth decade as an industry pioneer, Infogix continues to provide large and mid-market companies around the globe with a broad range of integrated and configurable tools to govern, manage and use data. From operations and the office of data to sales, and from product and customer service to marketing—users across the entire organization rely on our software to remove barriers to data access, accelerate time to insight, increase operational efficiency, and confidently trust business decisions. Our best in class retention rate is proof of our customer-centric focus as we partner with them to thrive in today’s data-driven economy.

Optimize your data management and governance strategy with Infogix. Visit www.infogix.com or call +1.630.505.1800 (US, Canada, and International), +44 1242 674 137 (UK and Europe).

twitter.com/Infogix facebook.com/Infogix linkedin.com/Infogix plus.google.com/+Infogix

