Week 6: Random Fields and Spatial Autocorrelation

PPOL 6805 / DSAN 6750: GIS for Spatial Data Science
Fall 2024

Author
Affiliation

Jeff Jacobs

Published

Wednesday, October 2, 2024

Open slides in new tab →

Random Fields

  • (Now on slides rather than scribbled on the whiteboard!)

D = Rectangle; Z(s) here: height at coordinate sD

Key Notation / Definition

Data={Z(s)sDRd}

  • d>1: Data forms a Random Field (this class: d=2!)
Geostatistical Data Lattice/Region Data Point Pattern
Criteria Fixed D, Continuous Fixed D, Discrete Random subset DD
Interest Infer non-observed parts of D Autocorrelation, clustering Point-generating process
Example
  • N trees s1,s2,,sN, observed within a sample window DR2, (D some finite plot of land)
  • Z(si): Attribute(s) at site si
  • Example: Height. Z(s1)=500m, Z(s2)=850m,
  • Z(s) observed over N×N grid of plots

  • Contiguity, Neighbors (next section of slides!)

  • Autocorrelation: Are points around si likely to have values similar to Z(si)?

  • Unknown number of lightning strikes s1,s2,
  • Contrast with geostatistical: all of D is observed, but what determines the subset D where events occur?
  • “Unmarked”: Just locations
  • “Marked”: Locations+info (e.g., intensity of strike)

Geostatistical Data: Dumping Sand onto a Table

  • Domain D: tabletop; Random process: dumping sand onto the tabletop
  • Ω: All possible realizations of process. ωiΩ Particular realization
Figure 1: ω1Ω
Figure 2: ω2Ω
Figure 3: ω3Ω
Figure 4: ω4Ω

Geostatistical Modeling: Terrains Are Not Spiky Chaotic Landscapes!

Geostatistical Data Point Pattern

  • Indicator function: 1[Z(s)>860m]
Code
set.seed(6805)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code
library(sf)
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
Code
library(terra)
terra 1.7.78

Attaching package: 'terra'

The following object is masked from 'package:tidyr':

    extract
Code
library(latex2exp)
#### Define area
lat_range <- c(-5.0, -6.0)
lon_range <- c(15, 20)
#lon_center <- -5.4
#lat_center <- 36.75
# And the window around this centroid
lat_radius <- 0.05
lon_radius <- 0.1

gen_random_table <- function(world_num=1, return_data=FALSE, rand_seed=NULL) {
  if (!is.null(rand_seed)) {
    set.seed(rand_seed)
  } else {
    rand_seed <- sample(1:9999, size=1)
    set.seed(rand_seed)
  }
  lat_center <- runif(1, min=min(lat_range), max=max(lat_range))
  lon_center <- runif(1, min=min(lon_range), max=max(lon_range))
  lon_lower <- lon_center - lon_radius
  lon_upper <- lon_center + lon_radius
  lat_lower <- lat_center - lat_radius
  lat_upper <- lat_center + lat_radius
  coords <- data.frame(x = c(lon_lower, lon_lower, lon_upper, lon_upper),
                       y = c(lat_lower, lat_upper, lat_lower, lat_upper))
  coords_sf <- st_as_sf(coords, coords = c("x", "y"), crs = 4326)
  # Using geodata
  elev <- geodata::elevation_3s(lon = lon_center, lat = lat_center, res=2.5, path = getwd())
  elev <- terra::crop(elev, coords_sf)
  return(elev)
}

compute_hillshade <- function(elev) {
  # Calculate hillshade
  slopes <- terra::terrain(elev, "slope", unit = "radians")
  aspect <- terra::terrain(elev, "aspect", unit = "radians")
  hs <- terra::shade(slopes, aspect)
  return(hs)
}

plot_hillshade <- function(elev, hs, title=NULL) {
  ## Plot hillshading as basemap
  # (here using terra::plot, but could use tmap)
  # lat_upper_trimmed <- lat_upper - 0.001*lat_upper
  # lat_lower_trimmed <- lat_lower - 0.001*lat_lower
  base_plot <- terra::plot(
    hs, col = gray(0:100 / 100), legend = FALSE, axes = FALSE, mar=c(0,0,1,0), grid=TRUE
  )
  #    xlim = c(elev_xmin, elev_xmax), ylim = c(elev_ymin, elev_ymax))
  # overlay with elevation
  
  color_vec <- terrain.colors(25)
  plot(elev, col = color_vec, alpha = 0.5, legend = FALSE, axes = FALSE, add = TRUE)
  
  # add contour lines
  terra::contour(elev, col = "grey30", add = TRUE)
  if (!is.null(title)) {
    world_name <- TeX(sprintf(r"(World $\omega_{%d} \in \Omega$)", world_num))
    title(world_name)
  }
}
elev_r1 <- gen_random_table(rand_seed=6810)
hs_r1 <- compute_hillshade(elev_r1)
plot_hillshade(elev_r1, hs_r1)

Code
elev_indic_r1 <- terra::ifel(
  elev_r1 > 860, elev_r1, NA
)
plot(elev_indic_r1)

Code
elev_r2 <- gen_random_table(rand_seed=6815)
hs_r2 <- compute_hillshade(elev_r2)
plot_hillshade(elev_r2, hs_r2)

Code
my_quant <- function(x) quantile(x, 0.99)
elev_99 <- terra::global(elev_r2, my_quant)[1,1]
elev_indic_r2 <- terra::ifel(
  elev_r2 > elev_99, elev_r2, NA
)
plot(elev_indic_r2)

Geospatial Data Lattice/Region Data

Code
table_elev <- gen_random_table(rand_seed=6805)
table_hs <- compute_hillshade(table_elev)
#plot_hillshade(table_elev, table_hs)
#par(mfrow=c(1,2), mar=c(0,0,1,1))
plot(table_elev)

(Pretend this is infinite-resolution) Infinite number of neighbors around each point!
Code
elev_coarse <- terra::aggregate(table_elev, 12)
plot(elev_coarse)

Finite number of neighbors around each observation Can study correlations between neighboring cells!

Doing Things with nbdep

  • “Neighbor” Definitions
  • Testing Hypotheses About Neighborhoods!

Who Are My Neighbors?

  • Contiguity-Based:

  • Distance-Based

  • K-Nearest Neighbors

Distance-Based Neighbors

From Xie ()

Spatial Autocorrelation

  • Location i has high value locations near i more likely to have high values

Moran’s I

I=ni=1n(yiy)2Inverse of Variancei=1nj=1nwij(yiy)(yjy)Weighted Covariancei=1nj=1nwijNormalize Weights

  • I is Large when:
    • yi and yj are neighbors: wij, and
    • yi and yj large at the same time: (yiy)(yjy)

Local Indicators of Spatial Autocorrelation (LISA)

  • tldr: See how much a given location s contributes to overall Moran’s I
  • Local Moran’s I:

Ii=yiySi2j=1nwij(yjy)

Our Null Hypothesis: Complete Spatial Randomness (CSR)

Leaving You With a Challenge

  • These point patterns can be classified as clustered or regular:

Leaving You With a Challenge

References

Schabenberger, Oliver, and Carol A. Gotway. 2004. Statistical Methods for Spatial Data Analysis. CRC Press.
Xie, Sherrie. 2023. “Analyzing Geospatial Data in R.”