Final Project Details
PPOL 6805 / DSAN 6750: GIS for Spatial Data Science
Project Overview
Your job, for the final project, will be to:
1. Choose an application area that you are interested in (this can be drawn from the list in the “Application Areas” section below, but doesn’t have to be! You are very welcome to choose any area that is of interest to you!)
2. Develop a hypothesis about some spatial phenomenon within that application area, which can be addressed using the GIS/Spatial Data Science tools we’ve learned in the course. You will need to be specific about whether your hypothesis relates to a first-order property of the observations in your dataset, a second-order property of these observations, or both.
3. Present some initial evidence regarding your hypothesis, whether in the form of visualizations or naïve clustering measures such as the global or local versions of Moran’s \(I\).
4. Assess the veracity of the hypothesis using formal hypothesis-evaluation approaches. The general “template” for this formal hypothesis evaluation would be to:
   a. Compute and visualize the intensity function and/or Pairwise Correlation Function for your observed data,
   b. Run 999 Monte Carlo simulations of the spatial patterns that would result from your null hypothesis, then
   c. Compare the intensity functions or Pairwise Correlation Functions for these 999 simulations with those of your observed spatial pattern from step (a).
5. Draw an initial conclusion regarding your hypothesis from Step 2, on the basis of the analysis carried out in Step 4.
6. Conclude with a “roadmap” of the next steps that you would need to take to further explore and/or refine your initial conclusion from Step 5.
   - For example, if your dataset had missing values for important regions/time periods, here you can discuss how you might go about collecting observations to form a better dataset that would allow you to evaluate your hypothesis with greater confidence!
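To make the Monte Carlo template from Step 4 a bit more concrete, here is a minimal Python sketch. It rests on several assumptions that are not part of the assignment: the study region is the unit square, the null hypothesis is complete spatial randomness (uniformly scattered points), and the summary statistic is a naive Ripley's K estimate without edge correction (a cumulative relative of the pair correlation function, swapped in here to keep the code short). The function names and the toy clustered dataset are hypothetical. Because the same estimator is applied to the observed and simulated patterns, the comparison stays fair even though the estimator itself is biased near the edges.

```python
import numpy as np

def k_estimate(points, r, area=1.0):
    """Naive Ripley's K estimate at radius r (no edge correction)."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Count ordered pairs (i != j) closer than r; drop the n zero self-distances
    close_pairs = (dists < r).sum() - n
    return area * close_pairs / (n * (n - 1))

def csr_monte_carlo(observed, r=0.1, n_sims=999, seed=0):
    """Compare the observed K(r) with K(r) from n_sims uniform random patterns."""
    rng = np.random.default_rng(seed)
    n = len(observed)
    k_obs = k_estimate(observed, r)
    # Null hypothesis (an assumption): n points placed uniformly on the unit square
    k_sims = np.array(
        [k_estimate(rng.uniform(size=(n, 2)), r) for _ in range(n_sims)]
    )
    # One-sided Monte Carlo p-value for clustering: the observed value is
    # ranked among the simulations plus itself
    p_value = (1 + (k_sims >= k_obs).sum()) / (n_sims + 1)
    return k_obs, k_sims, p_value

# Hypothetical toy data: 5 cluster centers with 20 tightly scattered points each
data_rng = np.random.default_rng(42)
centers = data_rng.uniform(0.2, 0.8, size=(5, 2))
observed = np.clip(
    np.repeat(centers, 20, axis=0) + data_rng.normal(scale=0.02, size=(100, 2)),
    0.0,
    1.0,
)

k_obs, k_sims, p_value = csr_monte_carlo(observed)
print(f"observed K(0.1) = {k_obs:.3f}, Monte Carlo p-value = {p_value:.3f}")
```

With 999 simulations the smallest attainable p-value is 1/1000, which is one reason this simulation count is a common choice. For an actual project you would replace the toy data, the unit-square window, and possibly the null model with ones that match your own hypothesis.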
Application Areas
Urban Planning
I put this first only because it is the application area where I have the most experience, so I can probably be most helpful in terms of project guidance!
Though it is a broad topic, one way to narrow your scope to make it feasible in the time we have left is to choose between a comparative approach and a case study approach:
Comparative Approach:
Here the idea is to understand a spatial phenomenon by analyzing how it is implemented across different urban areas. For example, you could choose a measure of transportation efficiency (on-time rate, proportion of urban population living within walking distance of a subway/bus stop, etc.), then see how it is distributed across different urban areas, whether within a specific country or internationally. Then, the project could be rooted in a hypothesis regarding why some urban areas have high efficiency and others have low efficiency.
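As a rough illustration of one such measure, the sketch below computes the share of a population living within walking distance of the nearest transit stop. Everything specific in it is an assumption made for illustration: the 800-meter threshold, the function name, and the toy coordinates (taken to be in meters in a projected coordinate system, so that straight-line distances are meaningful).

```python
import numpy as np

def share_within_walking_distance(homes, population, stops, max_dist_m=800.0):
    """Population share whose nearest stop is at most max_dist_m away."""
    diffs = homes[:, None, :] - stops[None, :, :]
    # Straight-line distance from each home location to its nearest stop
    nearest = np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)
    return population[nearest <= max_dist_m].sum() / population.sum()

# Hypothetical toy city: four residential locations and two stops
homes = np.array([[0.0, 0.0], [500.0, 0.0], [2000.0, 0.0], [3000.0, 400.0]])
population = np.array([100.0, 300.0, 400.0, 200.0])
stops = np.array([[300.0, 0.0], [3000.0, 0.0]])

share = share_within_walking_distance(homes, population, stops)
print(f"Share within walking distance: {share:.2f}")
```

Computing this one number per city (or per neighborhood) gives you the variable whose spatial distribution the hypothesis would then try to explain.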
Case Study Approach:
Here, rather than comparing across different urban areas, the idea would be more focused on telling a story about a particular city or urban area. This approach is mentioned after the Comparative Approach because, in reality, what you’re doing here is still comparative! It’s just that, in this case, the comparisons are being made between e.g. subregions of the urban area, or between the urban area observed at different time periods.
For example, a famous instance of the use of GIS to study a particular urban area is the Massachusetts state government’s Logan Airport Health Study, which found significantly higher rates of asthma among children living closer to Logan Airport. Here the point is that the rates were higher for these children relative to children living in other parts of the same urban area. That is, it’s still a “comparative” approach, just comparing areas-near-Logan with areas-not-near-Logan!
Climate Change / Renewable Energy
Public Policy Evaluation
Here, I specify policy evaluation rather than just public policy studies in general, since I think that some of the tools we’ve learned in class can be particularly helpful for spatially-based comparison of the effectiveness of a given policy choice.
That’s a fairly vague statement so, as a concrete example,
International Relations
Public Health / Global Health
Diffusion Processes
Google Cloud Platform Credits
You should have received the Google Cloud Platform credits at your @georgetown.edu email address if you’re enrolled in this course. The code in the email should add a new billing account to your Google Cloud Console, with $50 for:
- Creating a full-on Virtual Machine for the project, if you think that you’ll need a bunch of storage or computing power or GPU usage, and/or
- For use with Google Maps Engine, if you think that one of their Maps Engine datasets/tools may be useful!
As mentioned in the email: A reasonable choice, which ensures that you’ll stay within budget even if you run the VM instance from now until the end of the semester, is to create an instance (you can name it “gis-project”, for example) with all of the default options except:
- Choose e2-small for the Machine Type (e2-medium is the default, but based on the estimated cost for this type, it would run out of credits before the semester ends)
- Click the “Change” button under “Boot Disk” and then set the size to 25 GB (this is what I meant by “trying a few different options”: with the money saved by moving from e2-medium to e2-small, you can increase from the default 10 GB disk to a 25 GB disk without the expected cost going over $50 for the semester).
GIS Datasets/Project Ideas Collection
Once we get to the Spatial Data Science unit, I’m hoping that you will have lots of inspiration in terms of the different methods that can be used for drawing inferences about spatial phenomena! Until then, TA Billy and I will add links to the following Raindrop.io collection, which you can browse to potentially draw inspiration for your own final project topic: