
Data Engineering Podcast

Technology

Weekly deep dives on data management with the engineers and entrepreneurs who are shaping the industry

Popular episodes

Creating A Unified Experience For The Modern Data Stack At Mozart Data

Nov 27 • 58:31

Summary

The modern data stack has been gaining a lot of attention recently with a rapidly growing set of managed services for different stages of the data lifecycle. With all of the available options, it is possible to run a scalable, production-grade data platform with a small team, but there are still sharp edges and integration challenges to work through. Peter Fishman an...

Doing DataOps For External Data Sources As A Service at Demyst

Nov 27 • 59:16

Summary

The data that you have access to affects the questions that you can answer. By using external data sources you can drastically increase the range of analysis that is available to your organization. The challenge comes in all of the operational aspects of finding, accessing, organizing, and serving that data. In this episode Mark Hookey discusses how he and his team ...

Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

Nov 20 • 52:53

Summary

One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. With the improvements in streaming engines, it is now possible to perform all of your data integration in near real time, but it can be challenging to understand the proper processing patterns to make that performant. In this episode Ori Rafael shares his expe...

Laying The Foundation For The Era Of Big Complexity With Dagster

Nov 20 • 01:05:25

Summary

The technology for scaling storage and processing of data has gone through massive evolution over the past decade, leaving us with the ability to work with massive datasets at the cost of massive complexity. Nick Schrock created the Dagster framework to help tame that complexity and scale the organizational capacity for working with data. In this episode he shares t...

Data Quality Starts At The Source

Nov 14 • 58:54

Summary

The most important gauge of success for a data platform is the level of trust in the accuracy of the information that it provides. In order to build and maintain that trust, it is necessary to invest in defining, monitoring, and enforcing data quality metrics. In this episode Michael Harper advocates for proactive data quality and starting with the source, rather tha...

Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata

Nov 10 • 01:06:54

Summary

A significant source of friction and wasted effort in building and integrating data management systems is the fragmentation of metadata across various tools. After experiencing the impacts of fragmented metadata and previous attempts at building a solution, Suresh Srinivas and Sriharsha Chintalapani created the OpenMetadata project. In this episode they share the les...

Business Intelligence Beyond The Dashboard With ClicData

Nov 6 • 01:02:00

Summary

Business intelligence is often equated with a collection of dashboards that show various charts and graphs representing data for an organization. What is overlooked in that characterization is the level of complexity and effort that are required to collect and present that information, and the opportunities for providing those insights in other contexts. In this epi...

Exploring The Evolution And Adoption of Customer Data Platforms and Reverse ETL

Nov 5 • 01:02:06

Summary

The precursor to widespread adoption of cloud data warehouses was the creation of customer data platforms. Acting as a centralized repository of information about how your customers interact with your organization, they drove a wave of analytics about how to improve products based on actual usage data. A natural outgrowth of that capability is the more recent growth ...

Removing The Barrier To Exploratory Analytics with Activity Schema and Narrator

Oct 29 • 01:08:48

Summary

The perennial question of data warehousing is how to model the information that you are storing. This has given rise to methods as varied as star and snowflake schemas, data vault modeling, and wide tables. The challenge with many of those approaches is that they are optimized for answering known questions but brittle and cumbersome when exploring unknowns. In this ...

Streaming Data Pipelines Made SQL With Decodable

Oct 29 • 01:09:32

Summary

Streaming data systems have been growing more capable and flexible over the past few years. Despite this, it is still challenging to build reliable pipelines for stream processing. In this episode Eric Sammer discusses the shortcomings of the current set of streaming engines and how they force engineers to work at an extremely low level of abstraction. He also expla...

© 2021 Akora Labs, Inc.