Ramses Alexander Coraspe Valdez
Data Engineering: summary of topics for the course
Class 1: Data Modeling
Class 2: ETL Processes
Class 3: Data Lake
Class 4: Business Intelligence
Class 5: GCP as a Cloud Provider
Class 6: Distributed Systems
Class 7: Privacy
How to build a DAG based Task Scheduling tool for Multiprocessor systems using python: Scheduling Big Data Workloads in the Cloud with pyDag
Design, Development and Deployment of a simple Data Pipeline: Data Engineering technical challenge
How to Build a Lossless Data Compression and Data Decompression Pipeline: A parallel implementation of the bzip2 high-quality data compressor tool in Python.
Introduction to Apache Spark: Big data with Apache Spark
A Text Analysis of Andres Manuel Lopez Obrador’s Speeches: Text analytics with python
Building an Amazon Prime content-based Movie Recommender System: TF-IDF, Cosine similarity, BM25, BERT
Understanding Similarity Measures for Text Analysis: Distance Metric of similarity
Dockerizing an Apache Spark Standalone Cluster: Small Big Data ecosystem with In-Memory Processing
Build a Big Data Pipeline with PySpark and AWS EMR on EC2: How to Build a Big Data Pipeline with PySpark and AWS EMR on EC2 Spot Fleets and On-Demand Instances
Building an ETL pipelin...