Data Lineage and Metadata Management in Enterprise Data Lakes: Tools, Frameworks, and Compliance Implications

Author(s): Pavan Kumar Mantha

Publication #: 2512020

Date of Publication: 11.07.2022

Country: United States

Pages: 1-6

Published In: Volume 8 Issue 4 July-2022

DOI: https://doi.org/10.62970/IJIRCT.v9.i4.2512020

Abstract

Obtaining high-quality metadata and data lineage in enterprise data lakes is a challenge for organizations. With metadata and lineage of sufficient quality, however, organizations gain higher data quality, greater transparency, and easier compliance. In this report, we discuss three approaches to metadata and data lineage management: Apache Atlas, Collibra, and custom engineering. We compare them on feature availability, usability, and governance capability. Apache Atlas is open source and offers good lineage tracking; Collibra has the strongest stewardship workflows of the three, along with some compliance capability; and custom engineering lets an organization build exactly the functionality it needs. In summary, the analysis shows that all three are viable ways to help an organization stay audit ready, build trust with key stakeholders, and maintain reasonable oversight of the enterprise's data governance strategies.

Keywords: Data Lineage, Metadata Management, Enterprise Data Lakes, Regulatory Compliance.
