Syllabus
Welcome to DSAN 5450: Data Ethics and Policy at Georgetown University!
The course meets on Wednesdays from 3:30-6pm in the Walsh Building, Room 498.
Course Staff
- Prof. Jeff Jacobs, jj1088@georgetown.edu
  - Office hours (Click to schedule): Monday, Tuesday, 3:30-6pm
- TA Amelia Baier, ab3868@georgetown.edu
  - Office hours: By appointment
- TA Sonali Dabhi, sd1387@georgetown.edu
  - Office hours: By appointment
Course Description
This graduate-level course will train students to navigate the landscape of ethical issues which arise at each step of the data science process, with an eye towards developing policy recommendations for governments and organizations seeking expert advice on how to tackle these issues from a regulatory perspective. Students will explore and critically evaluate a range of data-related issues in contemporary society, such as responsible data collection, algorithmic bias, privacy, transparency, accountability, democratic participation in data usage and data-driven decisions, and the ethical implications of emerging technologies like artificial intelligence and machine learning (self-driving cars, ChatGPT, crowd-sourced training data, etc.).
Beginning with a set of historical case studies—instances in which scientists, engineers, and policymakers have been forced to re-evaluate their ethical intuitions in light of technological developments (nuclear power, use of social media platforms to organize protests and influence political outcomes, deployment of facial recognition software and predictive AI by police and military forces)—the course then introduces a set of general ethical frameworks (consequentialism, deontological ethics, and virtue ethics), challenging students to consider their relative strengths and weaknesses for addressing modern technological-ethical dilemmas faced by businesses, healthcare organizations, governments, and academic institutions. After a final portion of the course linking these ethical frameworks with practical regulatory and policy considerations, students will write and present a policy whitepaper analyzing a data-ethical issue of particular interest to them, integrating ethical perspectives, regulatory principles, and domain knowledge into a recommendation of best practices for the relevant agency, firm, or institution.
The course will thus equip students with a robust ethical “toolbox” for conscientiously gathering, interpreting, and extracting meaning from data throughout their careers as data scientists, while respecting privacy, fairness, transparency, democratic accountability, and other social concerns. Prerequisites: None. 3 credits.
Course Overview
The course revolves around three “pillars”, which we’ll examine individually before bringing them together for your final projects at the end of the class.
Data Science
A portion of the course will focus on introductions to cutting-edge technologies like self-driving cars, ChatGPT, facial detection algorithms, and various applications of AI to police and military technologies. For this portion, we’ll draw fairly often from the contents of the following books:
- Catherine D’Ignazio and Lauren F. Klein (2020). Data Feminism. Cambridge, MA: MIT Press. [Free, open-source!]
- Cathy O’Neil (2016). Weapons of Math Destruction. New York, NY: Crown Books.
Since there are plenty of in-depth resources available to you (e.g., other Georgetown courses!) for learning the technical details of these technologies, our goal in this course will be to learn just the particular aspects of each technology which are most relevant to the ethical and policy issues they present.
For example, we will look at neural-network-based machine learning algorithms, but we will focus specifically on how the performance of these algorithms on a given task depends crucially on the existence of effective training data for that task. The breakthroughs in Artificial Intelligence which have had an immense impact on society over the past few decades have not come about because of new algorithms (neural networks have been around since the 1950s). Rather, they have come about because of the massive, exponential increase in the amount of data available to train these already-existing algorithms: data scraped from across the entire web, for instance, or from millions of scanned books, or from Wikipedia’s massive collection of articles. This means that these algorithms encode the pre-existing human biases contained in their training data into algorithmically-derived “rules”, thus motivating the next pillar of the course: Ethics!
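To make the point about training data concrete, here is a minimal sketch in Python. It is deliberately not a neural network: the "model" is just a majority-vote lookup table, and the `history` records are made-up toy data in which equally qualified applicants from a hypothetical group "B" were approved less often than those from group "A". All names here (`history`, `train`, `rules`) are illustrative assumptions, not course materials.

```python
from collections import defaultdict

# Made-up historical decisions: (group, qualified, approved).
# Qualified "B" applicants were mostly denied; qualified "A" applicants were approved.
history = (
    [("A", True, True)] * 3
    + [("A", False, False)] * 2
    + [("B", True, True)] * 1
    + [("B", True, False)] * 2
    + [("B", False, False)] * 2
)

def train(records):
    """'Learn' one rule per (group, qualified) pair by majority vote over past decisions."""
    counts = defaultdict(lambda: [0, 0])  # key -> [denials, approvals]
    for group, qualified, approved in records:
        counts[(group, qualified)][int(approved)] += 1
    # Predict approval only when approvals strictly outnumber denials.
    return {key: approvals > denials for key, (denials, approvals) in counts.items()}

rules = train(history)
print(rules[("A", True)])  # qualified applicant from group A
print(rules[("B", True)])  # equally qualified applicant from group B
```

The learned rule approves the qualified applicant from group A and denies the equally qualified applicant from group B, even though nothing in the code mentions bias: the disparity comes entirely from the historical decisions it was trained on.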
Ethics
For the ethics-focused portion of the course, we’ll be reading selections from the following textbook:
- Lewis Vaughn and Louis P. Pojman (2021). The Moral Life: An Introductory Reader in Ethics and Literature. Oxford, UK: Oxford University Press. [PDF]
From the vast array of readings contained in this collection, we’ll look at both “standard” ethical readings (from Jeremy Bentham and Immanuel Kant, for example) and readings from literary sources like Ursula Le Guin and Ambrose Bierce.
Public Policy
For the final piece of the course we will take the technological developments discussed in the first portion, analyze them using the ethical frameworks discussed in the second portion, and come to conclusions as to what lawmakers, governments, and civil society organizations (NGOs and think tanks, for example) can do in practice to address the ethical issues raised by these technologies. Specifically, the recommended final project for the course will be a Policy Whitepaper, where you will choose a particular institution and recommend how it can use its power (for example, the power to pass laws) to most effectively address an ethical issue that you believe is important.
For this portion of the class we’ll have to draw on a wide range of different readings, depending on what particular subdomains of public policy are most interesting to you all, but as a general textbook on ethics in data science which does focus a good amount on policy specifically, we will look at:
- Anne L. Washington (2023). Ethical Data Science: Prediction in the Public Interest. New York, NY: Oxford University Press.
Now that you have an overview of the trajectory of the course, the following section contains the particulars of what we’ll be reading and working on each week!
Schedule
The following is a rough map of what we will work through together throughout the semester. Given that everyone learns at a different pace, my aim is to leave us with a good amount of flexibility in terms of how much time we spend on each topic. If I find that it takes me longer than a week to convey a certain topic in sufficient depth, for example, then I view it as a strength rather than a weakness of the course that we can rearrange the calendar below by adding an extra week on that particular topic! Similarly, if it seems like I am spending too much time on a topic, to the point that students seem bored or impatient to move on to the next topic, we can move a topic intended for the next week to the current week!
| Unit | Week | Date | Topic |
|---|---|---|---|
| Unit 1: Ethical Frameworks and Fairness in AI | 1 | Jan 17 | Introduction to the Course |
| | 2 | Jan 24 | Machine Learning, Training Data, and Bias |
| | 3 | Jan 31 | (Descriptive) Fairness in AI |
| | 4 | Feb 7 | Fairness in AI |
| | | Feb 9 (Friday), 11:59pm EST | [Deliverable] HW1: Nuts and Bolts for Fairness in AI |
| | 5 | Feb 14 | Context-Sensitive Fairness |
| | 6 | Feb 21 | Causality in Ethics and Policy |
| | | Feb 26 (Monday), 11:59pm EST | [Deliverable] HW2: Context-Sensitive Fairness |
| | 7 | Feb 28 | In-Class Midterm: Data Ethics, Fairness, Privacy, Causality |
| | | Mar 6 | No Class (Spring Break) |
| Unit 3: Policy Frameworks | 8 | Mar 13 | From Data Ethics to Data Policy |
| | 9 | Mar 20 | Privacy Policies, Incomplete Contracts, and Power |
| | 10 | Mar 27 | Econometric Policy Evaluation and Inverse Fairness |
| | 11 | Apr 3 | Fairness vs. Social Welfare |
| | 12 | Apr 10 | Project Talk, Causality and Identity Formation |
| | | Apr 12 (Friday) | [Deliverable] HW3: Privacy Policies as Incomplete Contracts |
| | 13 | Apr 17 | Applications: Race, Class, Gender, Sexuality, and Disability (Data Feminism) |
| | 14 | Apr 24 | Republican Liberty and the Kindly Slavemaster |
| | | Apr 26 (Friday) | [Deliverable] HW4: Policy Evaluation |
| | | May 10 (Friday) | [Deliverable] Policy Whitepaper |
Assignments and Grading
The main assignment in the course will be your policy whitepaper, submitted at the end of the semester. However, there will also be a midterm exam and a series of assignments which exist to let you explore each of the modules of the course, in turn.
| Assignment | Due Date | % of Grade |
|---|---|---|
| HW1: Nuts and Bolts for Fairness in AI | Friday, February 9 | 10% |
| HW2: Context-Sensitive Fairness | Monday, February 26 | 10% |
| Midterm | Wednesday, February 28 | 30% |
| HW3: Privacy Policies as Incomplete Contracts | Friday, April 12 | 10% |
| HW4: Policy Evaluation | Friday, April 26 | 10% |
| Policy Whitepaper | Friday, May 10 | 30% |
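If you want to estimate where you stand, the weights in the table above can be combined as in the following sketch. It assumes the final grade is a plain weighted average of scores on a 0-100 scale, with no curve or rounding rule (the syllabus does not say either way), and the scores in the usage example are hypothetical.

```python
from typing import Dict

# Weights (percent of final grade) copied from the table above.
WEIGHTS: Dict[str, int] = {
    "HW1": 10,
    "HW2": 10,
    "Midterm": 30,
    "HW3": 10,
    "HW4": 10,
    "Policy Whitepaper": 30,
}

def final_grade(scores: Dict[str, float]) -> float:
    """Weighted average of per-assignment scores, each on a 0-100 scale.

    Assumption: a simple weighted average; the actual grading procedure may differ.
    """
    assert sum(WEIGHTS.values()) == 100, "weights should cover the whole grade"
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100

# Hypothetical scores, for illustration only.
example_scores = {
    "HW1": 100, "HW2": 100, "Midterm": 80,
    "HW3": 100, "HW4": 100, "Policy Whitepaper": 90,
}
print(final_grade(example_scores))  # 91.0
```

Note that the midterm and the whitepaper together account for 60% of the grade, so they move the result far more than any single homework.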
Homework Lateness Policy
After the due date, for each homework assignment, you will have a grace period of 24 hours to submit the assignment without a lateness penalty. After this 24-hour grace period, late penalties will be applied based on the following scale (unless you obtain an excused lateness from one of the instructional staff!):
- 0 to 24 hours late: no penalty
- 24 to 30 hours late: 2.5% penalty
- 30 to 42 hours late: 5% penalty
- 42 to 54 hours late: 10% penalty
- 54 to 66 hours late: 20% penalty
- More than 66 hours late: Assignment submissions no longer accepted (without instructor approval)
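The scale above can be read as a simple lookup, sketched below in Python. Two things here are assumptions rather than stated policy: a submission landing exactly on a boundary (for example, exactly 30 hours late) is placed in the more lenient tier, and the function only reports the penalty percentage without deciding how it is applied to the score. The names `PENALTY_SCALE` and `late_penalty` are hypothetical.

```python
from typing import Optional

# (upper bound in hours late, penalty in percent), copied from the scale above.
PENALTY_SCALE = [(24, 0.0), (30, 2.5), (42, 5.0), (54, 10.0), (66, 20.0)]

def late_penalty(hours_late: float) -> Optional[float]:
    """Return the penalty percentage, or None if submissions are no longer accepted.

    Assumption: a submission exactly on a tier boundary gets the lower penalty.
    """
    if hours_late <= 0:
        return 0.0  # submitted on time
    for upper_bound, penalty in PENALTY_SCALE:
        if hours_late <= upper_bound:
            return penalty
    return None  # more than 66 hours late: requires instructor approval

print(late_penalty(12))  # 0.0 (within the grace period)
print(late_penalty(36))  # 5.0
print(late_penalty(70))  # None
```

An excused lateness from the instructional staff overrides this scale, which the sketch does not model.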