Data Engineering projects


How to build a DAG-based Task Scheduling tool for Multiprocessor systems using Python


Scheduling Big Data Workloads in the Cloud with pyDag


Design, Development and Deployment of a simple Data Pipeline

Data Engineering technical challenge


How to Build a Lossless Data Compression and Data Decompression Pipeline


Introduction to Apache Spark

Big data with Apache Spark


A Text Analysis of Andres Manuel Lopez Obrador’s Speeches


Building an Amazon Prime content-based Movie Recommender System


Understanding Similarity Measures for Text Analysis


Distance Metric of similarity


Dockerizing an Apache Spark Standalone Cluster


Build a Big Data Pipeline with PySpark and AWS EMR on EC2


How to Build a Big Data Pipeline with PySpark and AWS EMR on EC2 Spot Fleets and On-Demand Instances


Building an ETL pipeline with Apache Airflow and Visualizing AWS Redshift data using Power BI




Ramses Alexander Coraspe Valdez