  • This event has passed.

Incremental data loading with Azure Databricks by Dustin Vannoy

Mar 19 at 7:00 pm - 9:00 pm

***Please Use Source Link Below to Confirm Event Details***

Details

Abstract
There is a growing push to load data incrementally throughout the day, sometimes within minutes of its arrival. For example, you may replicate data for analytics from an application database using Change Data Feed. Apache Spark and Delta Lake are well suited to doing this at large scale. Using Azure Databricks for this type of processing gives you the power of Apache Spark and Delta Lake, plus added features such as Auto Loader and Delta Live Tables. In this session, you will learn best practices for incremental data processing and see several techniques for building these data pipelines with Azure Databricks.

Speaker Bio
Dustin Vannoy is a Data Engineering Consultant experienced in solving business problems with analytics and big data solutions. He is passionate about all aspects of data engineering, especially building data platforms and streaming data pipelines, and currently works mainly with Apache Spark / Databricks, Kafka, Python, and Scala. He is a co-founder of the Data Engineering San Diego meetup and helps others grow their data skills by creating tutorials, mentoring, and speaking at events.

Details

Date: Mar 19
Time: 7:00 pm - 9:00 pm
Website: https://www.meetup.com/ladataplatform/events/298093425/?

Venue

Virtual / Online

Organizer

Los Angeles Data Platform