Project Nessie: A git-like experience for Data Lakes

Apr 14, 12:00 PM PDT (07:00 PM GMT)
  • Free 117 Attendees
This event is hosted by SF Big Analytics Group.

A number of new technologies, most notably Apache Iceberg, have recently made traditional data warehouse concepts such as transactions, commits, and rollbacks possible on the data lake. This is leading to a convergence of the data lake and the data warehouse. A new open source project, Project Nessie, seeks to take this trend one step further. Project Nessie enables multi-table transactions on the data lake and provides decoupled transactions, making distributed transactions a reality. It does this by introducing Git-like semantics to data lakes. By using versioning concepts, users can work in an entirely new way, experimenting with or preparing data without impacting the live view of the data, which opens up a whole world of possibilities for true DataOps on the data lake. This talk will discuss the benefits of Nessie and Iceberg and how these technologies can work together in modern data platforms.

Ryan Murray (Dremio)

Ryan Murray is an open source engineer at Dremio in the office of the CTO. He previously worked in the financial services industry in roles ranging from bond trader to data engineering lead. Ryan holds a PhD in theoretical physics and is an active open source contributor who dislikes it when data isn't accessible in an organisation. He is passionate about making customers successful and self-sufficient, and still dreams of one day winning the Stanley Cup.
The event has ended.
Watch Recording
*Recordings are hosted on YouTube; clicking the link opens the YouTube page.