This guide covers the entire data integration lifecycle, beginning with core ETL/ELT concepts and data mesh architectures. You will progress from data profiling and modeling to building production-ready pipelines using AWS Glue, Azure Data Factory, and Apache NiFi. The guide also explores real-time streaming with Kafka and Kinesis, workflow orchestration via Airflow, and real-world applications in sectors such as banking and healthcare.
Author: Sayan Guha
