Week 7: Point Processes, Clustering, and Regularity

PPOL 6805 / DSAN 6750: GIS for Spatial Data Science
Fall 2024

Class Sessions

Author

Affiliation

Jeff Jacobs

jj1088@georgetown.edu

Published

Wednesday, October 9, 2024

Open slides in new tab →

Naïve Clustering

Weeks 6-8 (My labels)	Naïve Clustering	→	Point Process Models	→	Autocorrelation in Point Data	→	Autocorrelation in Lattice Data
Waller and Gotway (2004)	(A bunch of stuff we already learned)	→	Ch 5: Analysis of Spatial Point Patterns	→	Ch 6: Point Data (Cases and Controls)	→	Ch 7: Regional Count Data (First mention of autocorrelation on page 227)
Schabenberger and Gotway (2004)	Ch 1 (Page 14): Autocorrelation	→	(A bunch of more complicated stuff we’ll learn later)

Why “Naïve”?

[Continuing from last week] Lattice data: easiest way to visualize / operationalize simple measures of clustering via spatial autocorrelation
So, we develop intuitions for measures like Moran’s \(I\) by looking at lattice data, but…
Naïve because we’re not modeling the lattice regions (cells) themselves!
Non-naïve clustering: Knowing the clusters but also what process led to their formation!

Spatial Randomness

Code

library(tidyverse)
library(spatstat)
set.seed(6805)
N <- 60
r_core <- 0.05
obs_window <- square(1)
# Regularity via Inhibition
#reg_sims <- rMaternI(N, r=r_core, win=obs_window)
cond_reg_sims <- rSSI(r=r_core, N)
# CSR data
#csr_sims <- rpoispp(N, win=obs_window)
cond_sr_sims <- rpoint(N, win=obs_window)
### Clustered data
#clust_sims <- rMatClust(kappa=6, r=2.5*r_core, mu=10, win=obs_window)
#clust_sims <- rMatClust(mu=5, kappa=1, scale=0.1, win=obs_window, n.cond=N, w.cond=obs_window)
#clust_sims <- rclusterBKBC(clusters="MatClust", kappa=10, mu=10, scale=0.05, verbose=FALSE)
# Each cluster consist of 10 points in a disc of radius 0.2
nclust <- function(x0, y0, radius, n) {
    #print(n)
    return(runifdisc(10, radius, centre=c(x0, y0)))
}
cond_clust_sims <- rNeymanScott(kappa=5, expand=0.0, rclust=nclust, radius=2*r_core, n=10)
# And PLOT
plot_w <- 400
plot_h <- 400
plot_scale <- 2.25
cond_reg_plot <- cond_reg_sims |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  dsan_theme()
ggsave("images/cond_reg.png", cond_reg_plot, width=plot_w, height=plot_h, units="px", scale=plot_scale)
cond_sr_plot <- cond_sr_sims |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  dsan_theme()
ggsave("images/cond_sr.png", cond_sr_plot, width=plot_w, height=plot_h, units="px", scale=plot_scale)
cond_clust_plot <- cond_clust_sims |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  dsan_theme()
ggsave("images/cond_clust.png", cond_clust_plot, width=plot_w, height=plot_h, units="px", scale=plot_scale)

Autocorrelation	\(I = -1\)	\(I = 0\)	\(I = 1\)
Description	Negative Autocorr	No Autocorr	Positive Autocorr
Event at \(\mathbf{s} = (x,y)\) Implies	Less likely to find another point nearby	No information about nearby points	More likely to find another point nearby
Resulting Pattern	Regularity	Reg/Clustered Mix	Clustering
Process(es) Which Could Produce Pattern	1st Order: Random within even-spaced grid 2nd Order: Competition	1st Order: i.i.d. points 2nd Order: i.i.d. distances	1st Order: Tasty food at clust centers 2nd Order: Cooperation
Fixed \(N\)	60	60	60

Complete Spatial Randomness (CSR)

Code

library(tidyverse)
library(spatstat)
set.seed(6807)
lambda <- 60
r_core <- 0.05
obs_window <- square(1)
# Regularity via Inhibition
# Regularity via Inhibition
reg_sims <- rMaternI(lambda, r=r_core, win=obs_window)
# CSR data
csr_sims <- rpoispp(N, win=obs_window)
### Clustered data
clust_mu <- 10
clust_sims <- rMatClust(kappa=lambda / clust_mu, scale=2*r_core, mu=10, win=obs_window)
# And PLOT
plot_w <- 400
plot_h <- 400
plot_scale <- 2.25
reg_plot <- reg_sims |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  labs(title=paste0("N = ",reg_sims$n)) +
  dsan_theme()
ggsave("images/reg.png", reg_plot, width=plot_w, height=plot_h, units="px", scale=plot_scale)
csr_plot <- csr_sims |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  labs(title=paste0("N = ",csr_sims$n)) +
  dsan_theme()
ggsave("images/csr.png", csr_plot, width=plot_w, height=plot_h, units="px", scale=plot_scale)
clust_plot <- clust_sims |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  labs(title=paste0("N = ",clust_sims$n)) +
  dsan_theme()
ggsave("images/clust.png", clust_plot, width=plot_w, height=plot_h, units="px", scale=plot_scale)

Autocorrelation	\(I = -1\)	\(I = 0\)	\(I = 1\)
Description	Negative Autocorr	No Autocorr	Positive Autocorr
Event at \(\mathbf{s} = (x,y)\) Implies	Less likely to find another point nearby	No information about nearby points	More likely to find another point nearby
Resulting Pattern	Regularity	Reg/Clustered Mix	Clustering
Process(es) Which Could Produce Pattern	1st Order: Random within even-spaced grid 2nd Order: Competition	1st Order: i.i.d. points 2nd Order: i.i.d. distances	1st Order: Tasty food at clust centers 2nd Order: Cooperation
Fixed Intensity \(\lambda\)	60	60	60
Random \(N\)

Measures are Relative to Window of Observation

Same data can be spatially random within one window, clustered or regular in others!

Code

N <- 60
obs_window <- square(1)
window_scale <- 3.5
csr_sims_square <- rpoispp(N, win=obs_window)
# Triangular window
obs_window_tri <- st_sfc(st_polygon(list(
    matrix(c(0.3,0.1,0.7,0.1,0.5,0.5,0.3,0.1), byrow=TRUE, nrow=4)
)))
obs_window_geom <- st_sfc(st_linestring(
    matrix(c(0,0,1,0,1,1,0,1,0,0), byrow=TRUE, nrow=5)
))
csr_sims_tri <- ppp(csr_sims_square$x, csr_sims_square$y, window=as.owin(obs_window_tri))
tri_plot <- csr_sims_tri |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  dsan_theme("quarter");
ggsave("images/window_tri.png", tri_plot, width=plot_w, height=plot_h, units="px", scale=window_scale)
# Square window
square_plot <- csr_sims_square |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  geom_sf(data=obs_window_tri |> sf::st_boundary()) +
  dsan_theme("quarter")
ggsave("images/window_square.png", square_plot, width=plot_w, height=plot_h, units="px", scale=window_scale)
# Circular window
obs_window_disc <- st_sfc(st_point(c(1, 0.5))) |> st_buffer(1.2)
csr_sims_circ <- ppp(csr_sims_square$x, csr_sims_square$y, window=as.owin(obs_window_disc))
circ_plot <- csr_sims_circ |> sf::st_as_sf() |>
  ggplot() +
  geom_sf() +
  geom_sf(data=obs_window_geom) +
  dsan_theme("quarter")
ggsave("images/window_circ.png", circ_plot, width=plot_w, height=plot_h, units="px", scale=window_scale)

Regular	CSR	Clustered

Point Process Models

Our New Library: `spatstat`!

Homepage: spatstat.org
GitHub: github.com/spatstat
Book: Baddeley, Rubak, and Turner (2015) [Companion website]
PDF: here

Why Do Events Appear Where They Do?

Code

center_l_function <- function(x, ...) {

  if (!spatstat.geom::is.ppp(x) && !spatstat.geom::is.fv(x)) {
    stop("Please provide either ppp or fv object.")
  }

  if (spatstat.geom::is.ppp(x)) {
    x <- spatstat.explore::Lest(x, ...)
  }

  r <- x$r

  l_centered <- spatstat.explore::eval.fv(x - r)

  return(l_centered)
}
cond_clust_sf <- cond_clust_sims |> sf::st_as_sf()
pines_plot <- cond_clust_sf |>
  ggplot() +
  geom_sf() +
  dsan_theme("full")
ggsave("images/pines.png", pines_plot)
# density() calls density.ppp() if the argument is a ppp object
den <- density(cond_clust_sims, sigma = 0.1)
#summary(den)
png("images/intensity_plot.png")
plot(den, main = "Intensity λ(s)")
contour(den, add = TRUE) # contour plot
dev.off()
# And Kest / Lest
kest_result <- Kest(cond_clust_sims, rmax=0.5, correction="best")
lest_result <- center_l_function(cond_clust_sims, rmax=0.5)
png("images/lest.png")
plot(lest_result, main="K(h)")
dev.off()

	First-Order	Second-Order
	Events considered individually \(\implies\) Intensity function \(\lambda(\mathbf{s})\)	Second-Order: Events considered pairwise \(\implies\) \(K\)-function \(K(\vec{h})\)

References

Baddeley, Adrian, Ege Rubak, and Rolf Turner. 2015. Spatial Point Patterns: Methodology and Applications with R. CRC Press.

Schabenberger, Oliver, and Carol A. Gotway. 2004. Statistical Methods for Spatial Data Analysis. CRC Press.

Waller, Lance A., and Carol A. Gotway. 2004. Applied Spatial Statistics for Public Health Data. John Wiley & Sons.

Naïve Clustering

Why “Naïve”?

Spatial Randomness

Complete Spatial Randomness (CSR)

Measures are Relative to Window of Observation

Point Process Models

Our New Library: spatstat!

Why Do Events Appear Where They Do?

References

Our New Library: `spatstat`!