Dani, "Apache Iceberg: The Hadoop of the Modern Data Stack?", Data Engineer Things, Dec 12, 2024. The bigger they are the harder they fall.
Jesus L. Lobo, "Detecting real-time and unsupervised anomalies in streaming data: a starting point", TDS Archive, Jan 10, 2020. Sensors enable the Internet of Things (IoT) by collecting the data for smarter decisions in all kinds of systems. Data are usually…
Keven Pinto, "Running pyspark jobs on Google Cloud using Serverless Dataproc", Qodea Google Cloud Tech Blog, Mar 24, 2022. Run Spark batch workloads without having to bother with the provisioning and management of clusters!
Andy Bryant, "Processing guarantees in Kafka", Nov 16, 2019. Each of the projects I’ve worked on in the last few years has involved a distributed message system such as AWS SQS, AWS Kinesis and more…
💡Mike Shakhomirov, "Data pipeline design patterns", TDS Archive, Jan 2, 2023. Choosing the right architecture with examples.
Chengzhi Zhao, "10 Fantastic Classic Books For Data Engineering", TDS Archive, Dec 6, 2022. The Books Make You Success For Data Engineers.
Apache DolphinScheduler, "Apache Open-source Projects in Modern Data Stacks", Nov 23, 2022. Editor: Detong, github.com/mischaZhang
Darth Data, "Data is not the new oil! Lessons in evolving a Data Platform", Oct 27, 2022. Our data platform has been running for years. We hired a bunch of data scientist and engineers, too many unused tables which are running…
Ahmed Omrane, "BQ+DBT: 5 proven practices to scale you analytics infrastructure effectively without exploding your…", Nov 13, 2022. At The Fabulous, we have been using BQ and DBT as the core of our Data Analytics for the past 2 years. As of end of 2022, we have fully…
The Data Observer, "Data Engineering Best Practices: How Big Tech & FAANG Firms Manage and Optimize Apache Kafka", Jun 22, 2022. The popular open-source messaging/streaming system, Apache Kafka, is a key enabler for some of the most data-driven and disruptive…
Alexandre Beauvois, "Data Platforms: The Present", Jun 10, 2022. The Evolution of Data Platforms.
Dagster Blog, "Postgres: a better message queue than Kafka? | Dagster Blog", Oct 5, 2022. We shipped Dagster Cloud 1.0 in August. It’s been pretty successful so far, and this is the first in a series of blog posts talking about…
Benoît Goujon, "dbt coalesce 2022 recap", Artefact Engineering and Data Science, Oct 21, 2022. This year’s edition was taking place in New Orleans. And as in the past editions, we learned a ton about the analytics engineering…
Staffan Helgesson, "A New Era Of Data Analysis — Enter Mason", Creandum, Oct 18, 2022. After a decade of quasi-code tools and pure no-code tools for data analysis and visualization, users of all sorts still return to SQL.
Xinran Waibel, "Data Engineering Excellency at Netflix", Data Engineer Things, Oct 7, 2022. Takeaways from two years as a data engineer at Netflix.
Global Technology, "McDonald’s event-driven architecture: The data journey and how it works", McDonald’s Technical Blog, Aug 30, 2022. Part two of event-driven architecture post.
Paul Lee, "I Got Rejected From Meta’s Data Engineer Interview— Here’s What I Learned", Geek Culture, Jul 1, 2022. Through every success… every failure… there’s a lesson to learn in every endeavor — Paul Lee (Author)