Apache Beam everywhere. An introduction to the Spark runner


May 20, 10:00 AM PDT
  • Virtual AICamp
  • 273 RSVPs
Description
Introducing BeamLearningMonth in May 2020! In collaboration with the Google Cloud team, we are hosting a series of practical introductory sessions on Apache Beam!
Apache Beam is a rich, unified model for executing data-intensive processing pipelines, written in your favorite language, on multiple target execution systems. Apache Spark is the most popular open source big data framework; it is mature and has a rich ecosystem of tools, resources, and support.

In this talk, we introduce Apache Beam users to the Spark runner and discuss why it is a perfect match for users who care about running their data jobs on an open source system that lets jobs run on different clusters and clouds.

This is our third talk in the series. Do not forget to check out Webinar 1 on May 6, Interactive Introduction to Apache Beam (Session 1), and Webinar 2 on May 13, Best Practices Towards a Production-ready Beam Pipeline (Session 2).

