Syllabus

Welcome to DSAN 5450: Data Ethics and Policy at Georgetown University!

The course meets on Wednesdays from 3:30-6pm in the Walsh Building, Room 498.

Course Staff

Course Description

This graduate-level course will train students to navigate the landscape of ethical issues which arise at each step of the data science process, with an eye towards developing policy recommendations for governments and organizations seeking expert advice on how to tackle these issues from a regulatory perspective. Students will explore and critically evaluate a range of data-related issues in contemporary society, such as responsible data collection, algorithmic bias, privacy, transparency, accountability, democratic participation in data usage and data-driven decisions, and the ethical implications of emerging technologies like artificial intelligence and machine learning (self-driving cars, ChatGPT, crowd-sourced training data, etc.).

Beginning with a set of historical case studies—instances in which scientists, engineers, and policymakers have been forced to re-evaluate their ethical intuitions in light of technological developments (nuclear power, use of social media platforms to organize protests and influence political outcomes, deployment of facial recognition software and predictive AI by police and military forces)—the course then introduces a set of general ethical frameworks (consequentialism, deontological ethics, and virtue ethics), challenging students to consider their relative strengths and weaknesses for addressing modern technological-ethical dilemmas faced by businesses, healthcare organizations, governments, and academic institutions. After a final portion of the course linking these ethical frameworks with practical regulatory and policy considerations, students will write and present a policy whitepaper analyzing a data-ethical issue of particular interest to them, integrating ethical perspectives, regulatory principles, and domain knowledge into a recommendation of best practices for the relevant agency, firm, or institution.

The course will thus equip students with a robust ethical “toolbox” for conscientiously gathering, interpreting, and extracting meaning from data throughout their careers as data scientists, while respecting privacy, fairness, transparency, democratic accountability, and other social concerns. Prerequisites: None. 3 credits.

Course Overview

The course revolves around three “pillars”, which we’ll examine individually before bringing them together for your final projects at the end of the class: high-level ethical issues in data science, general ethical frameworks, and public policy applications.

Data Science

A portion of the course will focus on introductions to cutting-edge technologies like self-driving cars, ChatGPT, facial detection algorithms, and various applications of AI to police and military technologies. For this portion, we’ll draw fairly often from the contents of the following books:

  • Caroline Criado Perez (2019). Invisible Women: Data Bias in a World Designed for Men. Abrams.
  • Catherine D’Ignazio and Lauren F. Klein (2020). Data Feminism. Cambridge, MA: MIT Press. [Free, open-source!]
  • Cathy O’Neil (2016). Weapons of Math Destruction. New York, NY: Crown Books.

Since there are plenty of in-depth resources available to you (e.g., other Georgetown courses!) for learning the technical details of these technologies, our goal in this course will be to learn just the particular aspects of each technology which are most relevant to the ethical and policy issues they present.

For example, we will look at neural network-based machine learning algorithms, but we will focus specifically on how their performance on a given task depends crucially on the existence of effective training data for that task. The breakthroughs in artificial intelligence which have had an immense impact on society over the past few decades have not come about because of new algorithms (neural networks have been around since the 1950s). Rather, they have come about because of the massive, exponential increase in the amount of data available to train these already-existing algorithms: data scraped from across the entire web, from millions of scanned books, or from Wikipedia’s massive collection of articles. This means that these algorithms can end up encoding the human biases already present in that data into algorithmically-derived “rules”, thus motivating the next pillar of the course: Ethics!
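As a rough illustration of how a learned “rule” can mirror its training data, here is a minimal sketch in Python. It is not a neural network: it is a toy majority-vote rule, and the groups, counts, and the function name `fit_majority_rule` are all hypothetical, made up purely for this example.

```python
from collections import defaultdict

# Hypothetical historical decisions as (group, approved) pairs.
# Group "A" was approved 80% of the time, group "B" only 30% of the time.
history = (
    [("A", 1)] * 80 + [("A", 0)] * 20 +
    [("B", 1)] * 30 + [("B", 0)] * 70
)

def fit_majority_rule(records):
    """Learn one decision per group: whichever outcome was more common in the data."""
    counts = defaultdict(lambda: [0, 0])  # group -> [times denied, times approved]
    for group, approved in records:
        counts[group][approved] += 1
    # Predict approval (1) only if approvals outnumber denials for that group.
    return {group: int(c[1] > c[0]) for group, c in counts.items()}

rule = fit_majority_rule(history)
print(rule)  # the learned rule simply repeats the historical pattern per group
```

The “model” never sees anything except past decisions, so whatever pattern those decisions contain, fair or not, is exactly what it reproduces.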

Ethics

For the ethics-focused portion of the course, we’ll be reading selections from the following textbook:

From the vast array of readings contained in this collection, we’ll look at both “standard” ethical readings from, e.g., Jeremy Bentham and Immanuel Kant, and readings from literary sources like Ursula Le Guin and Ambrose Bierce.

Public Policy

For the final piece of the course we will take the technological developments discussed in the first portion, analyze them using the ethical frameworks discussed in the second portion, and come to conclusions about what lawmakers, governments, and civil society organizations (NGOs and think tanks, for example) can do in practice to address the ethical issues raised by these technologies. Specifically, the recommended final project for the course will be a Policy Whitepaper, where you will choose a particular institution and recommend how it can use its power (for example, the power to pass laws) to most effectively address an ethical issue that you believe is important.

For this portion of the class we’ll have to draw on a wide range of different readings, depending on what particular subdomains of public policy are most interesting to you all, but as a general textbook on ethics in data science which does focus a good amount on policy specifically, we will look at:

Now that you have an overview of the trajectory of the course, the following section contains the particulars of what we’ll be reading and working on each week!

Schedule

The following is a rough map of what we will work through together throughout the semester. Given that everyone learns at a different pace, my aim is to leave us with a good amount of flexibility in how much time we spend on each topic. If I find that it takes me longer than a week to convey a certain topic in sufficient depth, for example, then I view it as a strength rather than a weakness of the course that we can rearrange the calendar below by adding an extra week on that topic! Similarly, if I am spending so much time on a topic that students seem bored or impatient to move on, we can pull a topic intended for the next week into the current week!

| Unit | Week | Date | Topic |
| --- | --- | --- | --- |
| Unit 1: Ethical Frameworks | 1 | Jan 15 | Introduction to the Course |
| | 2 | Jan 22 | Machine Learning, Training Data, and Bias |
| Unit 2: Fairness in AI | 3 | Jan 29 | (Descriptive) Fairness in AI |
| | | Jan 31 (Friday), 5:59pm EST | [Deliverable] HW1: Nuts and Bolts for Fairness in AI |
| | 4 | Feb 5 | (Normative) Fairness in AI |
| | 5 | Feb 12 | Context-Sensitive Fairness |
| | 6 | Feb 19 | Causality in Ethics and Policy |
| Midterm | 7 | Feb 26 | In-Class Midterm: Data Ethics, Fairness, Privacy, Causality |
| | | Mar 6 | No Class (Spring Break) |
| Unit 3: Policy Frameworks | 8 | Mar 12 | Privacy Policies, Incomplete Contracts, and Power |
| | 9 | Mar 19 | From Data Ethics to Data Policy |
| | 10 | Mar 26 | Econometric Policy Evaluation and Inverse Fairness |
| | 11 | Apr 2 | Fairness vs. Social Welfare |
| Unit 4: Applications | 12 | Apr 9 | Project Talk, Causality and Identity Formation |
| | 13 | Apr 16 | Applications: Race, Class, Gender, Sexuality, and Disability (Data Feminism) |
| | 14 | Apr 23 | Republican Liberty and the Kindly Slavemaster |
| | | May 10 (Friday) | [Deliverable] Policy Whitepaper |

Assignments and Grading

The main assignment in the course will be your policy whitepaper, submitted at the end of the semester. However, there will also be a midterm exam and a series of homework assignments that let you explore each module of the course in turn.

| Assignment | Due Date | % of Grade |
| --- | --- | --- |
| HW1: Nuts and Bolts for Fairness in AI | Friday, February 9 | 10% |
| HW2: Context-Sensitive Fairness | Friday, February 26 | 10% |
| Midterm | Wednesday, February 28 | 30% |
| HW3: Privacy Policies as Incomplete Contracts | Friday, April 12 | 10% |
| HW4: Policy Evaluation | Friday, April 26 | 10% |
| Policy Whitepaper | Friday, May 10 | 30% |
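
If you would like to estimate your overall grade from the weights above, the arithmetic can be sketched in Python as follows. This is only an illustration: the names `WEIGHTS` and `final_grade` and the example scores are hypothetical, and the sketch assumes each item is scored on a 0 to 100 scale with no curve or rounding applied.

```python
# Weights copied from the grading table (as fractions of the final grade).
WEIGHTS = {
    "HW1": 0.10,
    "HW2": 0.10,
    "Midterm": 0.30,
    "HW3": 0.10,
    "HW4": 0.10,
    "Policy Whitepaper": 0.30,
}

def final_grade(scores):
    """Weighted average of per-item scores (each assumed to be out of 100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Hypothetical scores, for illustration only.
example_scores = {
    "HW1": 90, "HW2": 85, "Midterm": 80,
    "HW3": 95, "HW4": 100, "Policy Whitepaper": 88,
}
print(round(final_grade(example_scores), 2))
```

Because the midterm and the whitepaper each carry 30%, those two items together count for more than all four homework assignments combined.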

Homework Lateness Policy

After the due date, for each homework assignment, you will have a grace period of 24 hours to submit the assignment without a lateness penalty. After this 24-hour grace period, late penalties will be applied based on the following scale (unless you obtain an excused lateness from one of the instructional staff!):

  • 0 to 24 hours late: no penalty
  • 24 to 30 hours late: 2.5% penalty
  • 30 to 42 hours late: 5% penalty
  • 42 to 54 hours late: 10% penalty
  • 54 to 66 hours late: 20% penalty
  • More than 66 hours late: Assignment submissions no longer accepted (without instructor approval)
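
The scale above can be read as a simple lookup, sketched here in Python. The function name `late_penalty` is hypothetical, and one detail is an assumption on my part: the list does not say which tier applies at exactly 24, 30, 42, 54, or 66 hours, so this sketch treats each upper bound as inclusive (the more lenient reading). Excused lateness is not modeled.

```python
from typing import Optional

# (upper bound in hours late, penalty as a fraction of the grade),
# taken from the list above. Treating upper bounds as inclusive is an assumption.
TIERS = [
    (24, 0.0),
    (30, 0.025),
    (42, 0.05),
    (54, 0.10),
    (66, 0.20),
]

def late_penalty(hours_late: float) -> Optional[float]:
    """Return the penalty fraction, or None if the submission is no longer accepted."""
    if hours_late <= 0:
        return 0.0  # submitted on time
    for upper_bound, penalty in TIERS:
        if hours_late <= upper_bound:
            return penalty
    return None  # more than 66 hours late: needs instructor approval

for hours in (10, 25, 40, 50, 60, 70):
    print(hours, late_penalty(hours))
```

Returning `None` rather than a 100% penalty keeps "not accepted" distinct from "accepted with a deduction", since the former can still be overridden with instructor approval.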

References

Perez, Caroline Criado. 2019. Invisible Women: Data Bias in a World Designed for Men. Abrams.