
Two years ago, DoorDash started the journey of building a real-time event processing system to replace our legacy data pipelines and meet the event processing needs of our growing business. We created a scalable system that could handle heterogeneous data sources and destinations, was easily accessible through different levels of abstraction, and enforced schemas end to end. We accomplished this by shifting our strategy from relying heavily on AWS and third-party data services to leveraging open source frameworks that could be customized and better integrated with our infrastructure.
In this session, I will share what we learned and how we put this system together with Apache Flink, Kafka, and Kubernetes.
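
To give a flavor of the stack the session covers, here is a minimal sketch of a Flink DataStream job consuming events from Kafka. This is not DoorDash's actual pipeline code; the broker address, topic, consumer group, and transformation below are hypothetical placeholders, and a real job would deserialize records against a registered schema rather than treat them as plain strings.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EventProcessingJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Read raw events from Kafka. Broker, topic, and group id are
        // illustrative placeholders.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("events")
                .setGroupId("event-processing")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> events = env.fromSource(
                source, WatermarkStrategy.noWatermarks(), "kafka-events");

        // Placeholder transformation; a production job would validate each
        // record against its schema and route it to downstream sinks.
        events.map(String::toUpperCase).print();

        env.execute("event-processing-job");
    }
}
```

A job like this would typically be packaged as a container image and deployed as a Flink application on Kubernetes, which is the operating model the session walks through.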