Syllabus
Welcome to DSAN 5500: Data Structures, Objects, and Algorithms in Python! The course meets on Mondays from 12:30pm to 3:00pm in Car Barn Room 204.
Course Staff
- Prof. Jeff Jacobs, jj1088@georgetown.edu
  - Office hours (Click to reserve): Monday and Tuesday, 3:30-6pm, held in Car Barn Room 207-04A
- TA Yihan Bian, yb214@georgetown.edu
  - Office hours: By appointment
- TA Binhui Chen, bc928@georgetown.edu
  - Office hours (Click to reserve): Wednesday, 3:30-4:30pm, held on Zoom
- TA Brian Kwon, sk2338@georgetown.edu
  - Office hours: By appointment
Course Overview
My goal in creating this course is to take the general, language-agnostic data science concepts you’ve learned in (e.g.) DSAN 5000 and work through how to implement them efficiently in Python, where “efficient” can be defined in different ways depending on the goals we have in different settings as data scientists.
The graded components for the course consist of four homework assignments, an in-class midterm, and a final project. Grades will be allocated as follows:
Category | Percent of Final Grade |
---|---|
In-Class Midterm | 30% |
Final Project | 30% |
Homeworks | 40% |
HW1: Python Fundamentals | 10% |
HW2: Data Structures, Algorithms, and Complexity | 10% |
HW3: Data-Processing Pipelines | 10% |
HW4: Parallel Computing | 10% |
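As a rough illustration of how these weights combine, here is a minimal Python sketch. The function name `final_numeric_grade`, the dictionary keys, and the example scores are all made up for illustration and are not part of any official grading tool. The sketch assumes every component is scored on a 0-100 scale and rounds the result to two decimal places, as described in the section on final letter grades.

```python
# Weights copied from the table above, as fractions of the final grade.
WEIGHTS = {
    "midterm": 0.30,
    "final_project": 0.30,
    "hw1": 0.10,
    "hw2": 0.10,
    "hw3": 0.10,
    "hw4": 0.10,
}

# Sanity check: the weights should cover 100% of the grade.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9


def final_numeric_grade(scores: dict) -> float:
    """Combine per-component scores (each 0-100) into one weighted grade."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total, 2)


# Hypothetical scores, purely for illustration.
example_scores = {
    "midterm": 85,
    "final_project": 90,
    "hw1": 100,
    "hw2": 95,
    "hw3": 80,
    "hw4": 90,
}
print(final_numeric_grade(example_scores))  # 89.0
```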
The course does not have any “official” prerequisites, but a general comfort with Python is strongly recommended. If you have never used Python before (or if you haven’t used it in a while and feel like your skills are rusty), you can browse the materials on the Resources page!
Course Topics / Calendar
The following is a rough map of what we will work through together this semester. Since everyone learns at a different pace, my aim is to leave us a good amount of flexibility in how much time we spend on each topic. If I find that it takes me longer than a week to convey a certain topic in sufficient depth, for example, then I view it as a strength rather than a weakness of the course that we can rearrange the calendar below by adding an extra week on that topic! Similarly, if it seems like I am spending too much time on a topic, to the point that students seem bored or impatient to move on to the next one, we can move a topic intended for the next week into the current week!
| Unit | Week | Date | Topic |
|---|---|---|---|
| Unit 1: Python Fundamentals | 1 | Jan 10 (Wednesday) | Course Intro and Motivation |
| | | Jan 15 | No Class (Martin Luther King, Jr. Day) |
| | 2 | Jan 22 | Software Design Patterns and Object-Oriented Programming |
| | 3 | Jan 29 | Data Structures and Computational Complexity |
| | | Feb 2 (Friday), 11:59pm EST | HW1 (Python Fundamentals) Due |
| Unit 2: Data Structures, Algorithms, and Complexity | 4 | Feb 5 | Heaps, Stacks, Hash Maps |
| | 5 | Feb 12 | Binary Search Trees |
| | 6 | Feb 20 (Tuesday) | Code Examples and Midterm Prep |
| Midterm | 7 | Feb 26 | In-Class Midterm |
| | | Mar 1 (Friday), 5:59pm EST | HW2 (Data Structures, Algorithms, and Complexity) Due |
| | | Mar 4 | No Class (Spring Break) |
| Unit 3: Data-Processing Pipelines | 8 | Mar 11 | Data Validation, Data Processing Pipelines |
| | 9 | Mar 18 | Data Pipeline Orchestration |
| | | Mar 22 (Friday), 5:59pm EDT | HW3 (Data-Processing Pipelines) Due |
| Unit 4: Parallel Computing | 10 | Mar 25 | Moving from Serial to Parallel Pipelines |
| | | Apr 1 | No Class (Easter Break) |
| | 11 | Apr 8 | Parallel Pipelines and Map-Reduce |
| | 12 | Apr 15 | Final Projects, Interfaces |
| | | Apr 19 (Friday), 11:59pm EDT | HW4 (Parallel Computing) Due |
| Unit 5: Advanced Topics and Applications | 13 | Apr 22 | Applications in NLP |
| | 14 | Apr 29 | Applications in Bioinformatics |
| | | May 3 (Friday), 11:59pm EDT | Final Project Due |
Assignment Distribution, Submission, and Grading
The programming assignments for the course will be managed through Google Classroom. This means that, to work on and submit the assignments, you will use the following workflow:
- Open the .ipynb file for the assignment from within Google Classroom.
- Work on the problems within the file, saving your progress early and often! You can try things out or create drafts of your solutions however you’d like (for example, in VSCode or JupyterLab or any other IDE), but your final submission for each assignment must be submitted through the Google Classroom interface!
- Submit the completed version of the assignment by clicking the blue “Hand in” button on the assignment page.
The interface allows you to unsubmit and continue working on an assignment, for example if you find a mistake, but be careful and make sure you resubmit once you’ve fixed the mistake, since submissions will not be accepted after the grace period for late submission has ended.
Late Policy
After the due date for each homework assignment, you will have a grace period of 24 hours to submit the assignment without a lateness penalty. After this 24-hour grace period, late penalties will be applied up until 66 hours after the due date. Specifically, late penalties will be applied based on the following scale (unless you obtain an excused lateness from one of the instructional staff!):
- 0 to 24 hours after due date: no penalty
- 24 to 30 hours after due date: 2.5% penalty
- 30 to 42 hours after due date: 5% penalty
- 42 to 54 hours after due date: 10% penalty
- 54 to 66 hours after due date: 20% penalty
- More than 66 hours after due date: Assignment submissions no longer accepted (without instructor approval)
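The scale above can be sketched as a small lookup function. This is only an illustration: the name `late_penalty` is hypothetical, and the sketch assumes that the upper end of each range is inclusive (so a submission exactly 24 hours late has no penalty, and one exactly 66 hours late gets the 20% penalty), which the list above does not state explicitly.

```python
from typing import Optional

# (upper bound in hours after the due date, penalty in percent),
# copied from the scale above.
PENALTY_SCALE = [
    (24, 0.0),
    (30, 2.5),
    (42, 5.0),
    (54, 10.0),
    (66, 20.0),
]


def late_penalty(hours_late: float) -> Optional[float]:
    """Return the percent penalty, or None if no longer accepted.

    Assumption: each upper bound is inclusive.
    """
    if hours_late <= 0:
        return 0.0  # submitted on time
    for upper_bound, penalty in PENALTY_SCALE:
        if hours_late <= upper_bound:
            return penalty
    # More than 66 hours late: requires instructor approval.
    return None


print(late_penalty(36))  # 5.0
print(late_penalty(70))  # None
```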
Final Letter Grade Determination
Once all assignments have been graded, we will compute your final numeric grade according to the above weighting, rounded to two decimal places. The letter grade that we report to Georgetown on the basis of this numeric grade will then follow the DSAN letter grade policy as follows, where the start and end points for each range are inclusive:
Range Start | Range End | Letter Grade |
---|---|---|
92.50 | 100.00 | A |
89.50 | 92.49 | A- |
87.99 | 89.49 | B+ |
81.50 | 87.98 | B |
79.50 | 81.49 | B- |
69.50 | 79.49 | C |
59.50 | 69.49 | D |
0.00 | 59.49 | F |
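For readers who want to check their own standing, the table above can be expressed as a short lookup. This is a sketch, not the official grading code: the name `letter_grade` is hypothetical, and it assumes the numeric grade is on a 0-100 scale and is rounded to two decimal places before the lookup, as described above.

```python
# (range start, letter grade), copied from the table above,
# ordered from the highest range to the lowest.
GRADE_FLOORS = [
    (92.50, "A"),
    (89.50, "A-"),
    (87.99, "B+"),
    (81.50, "B"),
    (79.50, "B-"),
    (69.50, "C"),
    (59.50, "D"),
    (0.00, "F"),
]


def letter_grade(numeric_grade: float) -> str:
    """Map a final numeric grade (0-100) to a letter grade."""
    grade = round(numeric_grade, 2)
    if not 0.0 <= grade <= 100.0:
        raise ValueError(f"Grade out of range: {numeric_grade}")
    # Return the first range whose start the grade meets or exceeds.
    for floor, letter in GRADE_FLOORS:
        if grade >= floor:
            return letter
    raise AssertionError("unreachable: the 0.00 floor catches everything")


print(letter_grade(92.49))  # A-
print(letter_grade(87.99))  # B+
```

Because the grade is rounded to two decimal places first, every possible value falls into exactly one range, so comparing against the range starts alone is enough.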
Official Course Description
The Data Structures, Objects, and Algorithms in Python course will look at built-in data structures, such as dictionaries, lists, tuples, sets, strings, and frozen sets. The course will also cover objects and classes in Python, as well as building new structures and objects. The class will cover algorithms including runtime, recurrence, and development. Applications will include data science problems. Prerequisite: A working or intermediate knowledge of Python. 3 credits.
Title IX/Sexual Misconduct Statement
Georgetown University and its faculty are committed to supporting survivors and those impacted by sexual misconduct, which includes sexual assault, sexual harassment, relationship violence, and stalking. Georgetown requires faculty members, unless otherwise designated as confidential, to report all disclosures of sexual misconduct to the University Title IX Coordinator or a Deputy Title IX Coordinator.
If you disclose an incident of sexual misconduct to a professor in or outside of the classroom (with the exception of disclosures in papers), that faculty member must report the incident to the Title IX Coordinator or a Deputy Title IX Coordinator. The coordinator will, in turn, reach out to the student to provide support, resources, and the option to meet. [Please note that the student is not required to meet with the Title IX Coordinator.] More information about reporting options and resources can be found in the Sexual Misconduct Resource Center.
If you would prefer to speak to someone confidentially, Georgetown has a number of fully confidential professional resources that can provide support and assistance. These resources include:
- Health Education Services for Sexual Assault Response and Prevention: Confidential email sarp@georgetown.edu
- Counseling and Psychiatric Services (CAPS): 202-687-6985
- After hours you can call 833-960-3006 to reach Fonemed, a telehealth service, and ask for the on-call CAPS clinician
GSAS Resources and Policies for Students
You can find a collection of relevant resources and policies for students on the GSAS website, and the Provost’s policy on accommodating students’ religious observances on the Campus Ministry website.
You can also make use of the Student Academic Resource Center. In particular, within the Resource Center there is a link to Georgetown’s Disability Support page. If you believe you have a disability, you can contact the Academic Resource Center (arc@georgetown.edu) for further information. The ARC is located in the Leavey Center, Suite 335 (202-687-8354), and it is the campus office responsible for reviewing documentation provided by students with disabilities and for determining reasonable accommodations in accordance with the Americans with Disabilities Act (ADA) and University policies.