Data Lineage and Metadata Management in Enterprise Data Lakes: Tools, Frameworks, and Compliance Implications
Author(s): Pavan Kumar Mantha
Publication #: 2512020
Date of Publication: 11.07.2022
Country: United States
Pages: 1-6
Published In: Volume 8 Issue 4 July-2022
DOI: https://doi.org/10.62970/IJIRCT.v9.i4.2512020
Abstract
Maintaining high-quality metadata and data lineage in enterprise data lakes is a persistent challenge for organizations. With metadata and lineage of sufficient quality, however, organizations gain better data quality, greater transparency, and easier compliance. In this report, we discuss three approaches to metadata and data lineage management: Apache Atlas, Collibra, and custom engineering. We compare them on feature availability, usability, and governance capability. Apache Atlas is open source and offers good lineage tracking; Collibra has the strongest stewardship workflow of the three and some compliance capability; custom engineering lets an organization build the specific functionality it needs. In summary, the analysis shows that all three are viable ways to help an organization become audit ready, build trust with key stakeholders, and maintain reasonable oversight of the enterprise's data governance strategy.
Keywords: Data Lineage, Metadata Management, Enterprise Data Lakes, Regulatory Compliance.