Centralizing data on S3 and ADLS is common practice, but it has always been challenging to make that data accessible to users of Tableau, Power BI, Looker, and other BI tools. ETLing the data into a cloud data warehouse and managing BI extracts is equally complex and expensive.
In this talk, we will discuss:
* Advancements in data management, such as (1) Apache Iceberg, an open source table format that enables transactions and time travel; and (2) Project Nessie, an open source metastore that supports Git-like semantics
* Advancements in query acceleration, such as (1) Apache Arrow, a columnar memory format and execution kernel; (2) C3, a columnar cloud cache; and (3) Data Reflections, data structures that enable query acceleration
* Best practices for designing a shared semantic layer in the lakehouse
Tomer Shiran (Dremio)
Tomer Shiran is the founder and Chief Product Officer of Dremio, the cloud data lake company. Prior to Dremio, he was VP Product and employee #5 at MapR. Tomer previously held numerous product management and engineering positions at Microsoft and IBM Research. He holds a master’s degree in electrical and computer engineering from Carnegie Mellon University and a bachelor’s in computer science from Technion - Israel Institute of Technology, as well as five U.S. patents.