Infosys Data Governance Framework
Introduction

In today’s business environment, companies need to react quickly to changes in demand and customer environments. Here are some challenges faced in today’s digital environment:

• Data and compute proliferation – Rise of cloud and big data alongside existing investments creates the need for a unified view
• Increased regulatory and compliance requirements – CCAR, GDPR, Solvency, HSE/REACH etc.
• Key to enable boundaryless data – Intelligent applications need to know enterprise data and “all” about it
• Digital journeys – Need for trustworthy data to enable digital journeys

The goal of a Data Governance initiative is to ensure timely, trustworthy and relevant information delivery that enables informed decision making. Characteristics of an ideal data governance solution are:

• Unified data management – Across data in traditional systems, big data systems and data on cloud
• Collaborative and metrics driven – Caters to all stakeholders like data stewards, domain champs, CDO, IT, business support, legal and compliance
• Smart – An intelligent way to discover, tag and heal data and to apply machine learning, since traditional ways are not scalable to meet future business needs
• Active – The data governance platform is the centerpiece of the next generation enterprise: it can manage, monitor, alert on and act on policies and applications that depend on day-to-day runs

Data governance is not only about data, but also about enabling clear ownership, business rules, operational requirements, tools and business processes, an easy decision making process, stakeholder interaction and business data access.
Across most organizations, as data is distributed across sources like data lakes, data warehouses and individual silos, it is important to create a boundaryless view to monitor the data. Infosys proposes the data governance framework to address this challenge. It helps govern the Augmented Enterprise Data Warehouse, making use of Infosys custom solutions and data management tools/components from vendors like Informatica and Collibra.

[Figure: Infosys Data Governance framework diagram. Labels shown: Data Discovery, Classification and Catalogues; Data Governance Hub; Data Quality Metrics, Governance Metrics, Measures; Data Governance Tools, Organization, Policies, Standards etc.; Unified Metadata Hub; Technical and Business Metadata; Metadata Extractor; Metadata Exchange; Metadata Generators; Data Quality Hub; Data Quality rules, exceptions; Data Quality Patterns; Smart DQ - Analytics, Machine Learning Algorithms; Compute Hairball; Custom code; Tool Repositories, Files, ETLs etc.; Data Quality Tools, Exceptions etc.; Other forms of data.]
Highlights of Infosys Data Governance
framework are:
Unified Metadata Hub: Displays a unified view of organization metadata by integrating structured metadata from tool repositories like ETL/reports, unstructured metadata from custom code, and metadata from data lakes and tools like Apache Atlas and Waterline Data.

Data Quality Hub: Provides integrated data quality management capability to define data quality rules, monitor data quality progress, and self-heal through machine learning that predicts missing values (covering data at rest, data in motion and computed data). The Infosys Smart DQ solution provides the advanced machine learning capabilities.

Data Governance Hub: Configurable data governance activities (define metrics, policies, catalogs, standards, visualization). Plug and play model: only the applications needed by the enterprise are registered.

Data Governance Applications: Capabilities for analyzing data lineage, and dashboards for data protection and data quality metrics. The Infosys Data Governance tool (IDG) provides these capabilities to enable organizations for next generation data governance.

Data Classification and Cataloging: Catalog and organize the data scattered across the organization. The solution can utilize tools like Informatica EIC, Waterline Data etc. for this purpose.
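The Data Quality Hub idea of defining data quality rules and monitoring a metric over them can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: the `Rule` class, the `evaluate` and `exceptions` helpers, the rule names and the sample records are assumptions, not part of IDG or any vendor tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


@dataclass
class Rule:
    """A named data quality rule (hypothetical structure)."""
    name: str
    check: Callable[[Record], bool]  # returns True when the record passes


def evaluate(records: List[Record], rules: List[Rule]) -> Dict[str, float]:
    """Return the pass rate (0.0 to 1.0) of each rule over the records."""
    if not records:
        return {rule.name: 1.0 for rule in rules}
    return {
        rule.name: sum(1 for r in records if rule.check(r)) / len(records)
        for rule in rules
    }


def exceptions(records: List[Record], rules: List[Rule]) -> List[Tuple[int, str]]:
    """List (record index, rule name) pairs for every failed check."""
    return [
        (i, rule.name)
        for i, r in enumerate(records)
        for rule in rules
        if not rule.check(r)
    ]


# Illustrative rules and data (assumed, not taken from the framework).
RULES = [
    Rule("customer_id_present", lambda r: bool(r.get("customer_id"))),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

SAMPLE = [
    {"customer_id": "C1", "amount": 10.0},
    {"customer_id": "", "amount": 5.0},
    {"customer_id": "C3", "amount": -2.0},
    {"customer_id": "C4", "amount": 7.5},
]

if __name__ == "__main__":
    print(evaluate(SAMPLE, RULES))
    print(exceptions(SAMPLE, RULES))
```

Keeping rules as plain named predicates means the same list can feed both the metric (pass rate per rule) and the exception list that a data steward would review.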
[Figure: Infosys solution and Informatica components. Labels shown: Infosys solution; Informatica; Intelligent Data Lake; Metadata Management; Enterprise Information Catalog (EIC); Database Tables/Columns; Data Catalog; Physical Data Model; Logical Data Model; Data Lake – HDFS/HIVE; Data Tagging; Live Data Map; Data modeling tools; Data Integration; Data Discovery/Search; Technical Metadata; Operational Metadata; Business Glossary; BDM, IIS, Power Exchange; Atlas; NIFI/Spark/Map Reduce etc.; Oozie; Infosys – Data Governance Solution (IDG 2.0 & Apps); DG Metrics; Data Quality Exceptions captured for data in motion; Metadata Exchange; Informatica Data Quality; Informatica Analyst; Workbench; DQ Metrics Definition; Data profiling; DQ Rules Management; Data Stewardship; Data Quality; Smart DQ.]

Infosys tools complement data governance implementation patterns using standard tools from vendors like Informatica. Infosys tools utilized in the solution are:

Infosys Data Governance tool (IDG): Supports data governance operations and helps in defining governance strategy and framework for next generation needs. The IDG tool helps to achieve the core principles of accuracy and lineage by providing intuitive metrics to various roles. It also provides single point access to data quality rules management, stewardship and data quality metrics reporting, along with traceability visualization.

Infosys Smart DQ: Solution to pre-fill unknown master attributes in transactional data by mapping to history/master data using machine learning models. Manual effort is reduced by more than 75%.
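The Smart DQ idea of pre-filling unknown master attributes from history can be sketched in Python as follows. This is not the Smart DQ implementation: a most-frequent-value lookup stands in for the machine learning models, and the function names, field names and sample rows are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def build_lookup(history: List[Row], key_field: str, target_field: str) -> Dict[str, str]:
    """Map each key value to the target value seen most often in history."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for row in history:
        key, target = row.get(key_field), row.get(target_field)
        if key is not None and target is not None:
            counts[key][target] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def prefill(transactions: List[Row], lookup: Dict[str, str],
            key_field: str, target_field: str) -> List[Dict[str, object]]:
    """Return copies of the transactions with missing targets filled where possible."""
    filled = []
    for row in transactions:
        new_row: Dict[str, object] = dict(row)
        if new_row.get(target_field) is None and new_row.get(key_field) in lookup:
            new_row[target_field] = lookup[new_row[key_field]]
            new_row["prefilled"] = True  # flag so a data steward can review it
        filled.append(new_row)
    return filled


# Illustrative history/master data and transactions (assumed values).
HISTORY = [
    {"vendor": "ACME", "category": "Hardware"},
    {"vendor": "ACME", "category": "Hardware"},
    {"vendor": "ACME", "category": "Software"},
    {"vendor": "Globex", "category": "Services"},
]

TRANSACTIONS = [
    {"vendor": "ACME", "category": None},
    {"vendor": "Globex", "category": "Services"},
    {"vendor": "Initech", "category": None},
]

if __name__ == "__main__":
    lookup = build_lookup(HISTORY, "vendor", "category")
    for row in prefill(TRANSACTIONS, lookup, "vendor", "category"):
        print(row)
```

Rows with no matching history are left unfilled rather than guessed, and filled rows carry a `prefilled` flag so that manual review can focus on them.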
© 2018 Infosys Limited, Bengaluru, India. All Rights Reserved. Infosys believes the information in this document is accurate as of its publication date; such information is subject to change without notice. Infosys
acknowledges the proprietary rights of other companies to the trademarks, product names and such other intellectual property rights mentioned in this document. Except as expressly permitted, neither this
documentation nor any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the
prior permission of Infosys Limited and/ or any named intellectual property rights holders under this document.