Data engineering course
Master the basics of data engineering, designing systems and infrastructure for effective data management.
Data engineering is one of the most important disciplines of the twenty-first century, and it is essential for building scalable, efficient, and fault-tolerant data pipelines.
Who is this for?
This course is designed for data engineers who want to master the tools and techniques used in modern data engineering. We'll cover HDFS as the foundation of big data, and data processing with Hive, Impala, and Spark. We then move on to advanced data storage technologies such as HBase and Kudu, and finish the training with job orchestration using Airflow.
Program goal - What you will take away from the course
You'll learn how to design and build robust data pipelines that can handle large volumes of data, perform complex data transformations, and integrate with a wide range of data sources and destinations. With hands-on exercises and real-world examples, you'll gain practical experience working with these powerful tools and be ready to apply your new skills to your own data engineering projects.
Introduction to data engineering and big data
Learn the fundamentals of data engineering and big data to efficiently handle data processing and analysis.
Hadoop Distributed File System (HDFS) and Hadoop ecosystem tools
Gain knowledge about the Hadoop Distributed File System and Hadoop ecosystem tools used for managing and processing large volumes of data.
Apache Hive and Impala for SQL-based data processing
Discover Apache Hive and Impala for enabling SQL-based data processing in big data environments.
Apache HBase and Apache Kudu for NoSQL data storage
Learn about Apache HBase and Apache Kudu for implementing NoSQL data storage in distributed environments.
Apache Spark for distributed data processing and analytics
Explore Apache Spark as a powerful framework for distributed data processing and analytics in big data scenarios.
Airflow for data pipeline orchestration and scheduling
Master Airflow for efficiently orchestrating and scheduling data pipelines, ensuring smooth data flow.
Best practices for data engineering development, deployment, and monitoring
Understand the best practices for data engineering project development, deployment, and monitoring to achieve optimal results.
Meet the Creators
Chief Technology Officer & Principal Big Data Solutions Architect Lead, Ultra Tendency
Professional Software Architect, Ultra Tendency
Senior Lead Big Data Developer & Berlin Territory Manager, Ultra Tendency
Unlock the Ultra Tendency program to help your team deliver meaningful impact today.