Introducing BeamLearningMonth in May 2020! In collaboration with the Google Cloud team, we are hosting a series of practical introductory sessions on Apache Beam!
Apache Beam is a rich unified model that lets you run data-intensive processing pipelines, written in your favorite language, on multiple target execution systems. Apache Spark is the most popular open source big data framework; it is mature and has a rich ecosystem of tools, resources, and support.
In this talk, we introduce Apache Beam users to the Spark runner and discuss why it is a perfect match for users who want to run their data jobs on an open source system that works across different clusters and clouds.
This is the third talk in our series. Do not forget to check out Webinar 1 on May 6, Interactive Introduction to Apache Beam (Session 1), and Webinar 2 on May 13, Best Practices Towards a Production-ready Beam Pipeline (Session 2).
Open Source Software Engineer at Talend France, with more than ten years of experience designing and developing information systems for financial groups, telecom companies, and startups. Focused on Big Data and Cloud architectures (aka Distributed Systems). He is an Apache Beam and Apache Avro committer and PMC member, and an enthusiastic contributor to several other open source projects.