
Abstract:
The event is dedicated to data discovery and data management: what they mean, why you should improve them, and how you should go about it. It includes two examples of how others have done it, Amundsen from Lyft and Artifact from Shopify.
Talk #1: From discovery to trusting data, by Lyft
At Lyft, we have made our analysts and data scientists over 30% more productive by making it easier to discover data. This talk gives a quick overview of Amundsen and then goes into detail on how we have tried both automated and curated metadata to showcase what is and is not trusted in Amundsen. It will dive deep into linking the Airflow DAG that produced the data (task-level lineage), linking which and how many dashboards are built from a given data set (table-level lineage), and using SLAs and historical landing times to give users a signal of what can be trusted.
Talk #2: How We’re Solving Data Discovery Challenges at Shopify
Shopify developed an in-house solution to its data discovery and management challenges. This talk will address the data discovery problems faced, how users were impacted, the solution and approach, and finally the trade-offs that were evaluated throughout the building process.
Ranko Cupovic
Senior Product Manager in the Shopify Data Science and Engineering organization, where he works on building data products.