DSAN 6000 Sections 01 and 02
| | Can be stored on single machine? Yes | Can be stored on single machine? No |
|---|---|---|
| Can be processed on single machine? Yes | Small (Your Laptop) | Medium (Data Streaming) |
| Can be processed on single machine? No | Medium (Parallel Processing) | Big! Parallel + Distributed Processing |
This webpage just serves as a hub collecting Jeff’s slides for DSAN 6000: Big Data and Cloud Computing, Fall 2025 at Georgetown University. It is not a replacement for the main course webpage!
Section 01 of the course takes place on Wednesdays from 3:30pm to 6:00pm in Walsh 394.
| Title | Date |
|---|---|
| Week 1: Course Overview | Aug 28 |
| Week 2: Cloud Computing | Sep 2 |
| Week 3: Parallelization Concepts | Sep 8 |
| Week 4: DuckDB, Polars, File Formats | Sep 15 |
| Week 5: Data Engineering | Sep 22 |
| Week 6: Introduction to Spark | Sep 29 |
| Week 7: Spark DataFrames and Spark SQL | Oct 6 |
| Week 8: SparkNLP | Oct 20 |
| Week 9: SparkML | Oct 27 |
| Week 10: ETL Pipeline Orchestration with Airflow | Nov 3 |
| Week 11: Vector Databases | Nov 10 |
| Week 12: In-Class Office Hours | Nov 17 |
| Week 13: Serverless and Container Orchestration | Nov 24 |
| Week 14: Final Topics, Review | Dec 1 |
| Week 15: Final Project Presentations | Dec 8 |