Urban Growth Modeling in Delaware County, PA

Charlie Huemmler & Kate Tanabe

2023-05-06

Introduction

Understanding where development is likely to take place is crucial for urban planners and decision makers. Regional land use planning requires both supply and demand-side insights to inform how demand interacts with current supply and future goals. Planners are responsible for navigating the trade-offs between economic growth, environmental sustainability, and community objectives. Including future land development predictions in planning analyses allows planners to understand infrastructure needs, advocate for certain types of growth, and incentive development that meets the community’s goals.

Motivation

This project analyzes Delaware County’s historic patterns of development to predict areas that are likely to be developed to inform regional planners of the current and future demands. This information can be used by Delaware County planners when considering regional growth patterns, prioritizing and allocating new developments, and understanding how developments fit within goals for the future.

This model uses national land cover and land cover change datasets from 2009 and 2019 to predict 2029 development patterns. We build a binary logistic model using variables including distance to highways, distance to current development, distance to train lines, and demographic information. We hypothesize that future land development is a function of these variables.

We also use this model to create an allocation procedure for urban growth based on population change, development demand, and supply of land in 2029. The allocation procedure prioritizes future development based on community values by using the model’s outcome, which is probability of development. Planners can use the allocation procedure to understand how different land use and development scenarios will impact the community.

Development Scenarios

We use two scenarios to predict future land development in our analysis. The first scenario looks at demand-side change based on the population projections for Delaware County. The population predictions will inform planners of where demand for development will be greater. The second scenario examines the incorporation of a hypothetical new development, the extension of SEPTA’s Westchester Branch from Wawa west to the county border. This scenario explores how the addition of a new development will change potential allocation. We believe planners can use these scenarios and predictions when considering transit-oriented development and the addition of affordable and mixed-use housing.

Set Up

library(tidyverse)
library(sf)
library(raster)
library(knitr)
library(kableExtra)
library(tidycensus)
library(tigris)
library(FNN)
#library(QuantPsyc) # JE Note: in R 4.1, QuantPsyc package not available.
library(caret)
library(yardstick)
library(pscl)
library(plotROC) 
library(ggrepel)
library(pROC)
library(grid)
library(gridExtra)
library(viridis)
#library(igraph)
library(mapview)
library(here)
library(terra)
library(cowplot)
plotTheme <- theme(
  plot.title =element_text(size=12),
  plot.subtitle = element_text(size=8),
  plot.caption = element_text(size = 6),
  axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
  axis.text.y = element_text(size = 10),
  axis.title.y = element_text(size = 10),
  # Set the entire chart region to blank
  panel.background=element_blank(),
  plot.background=element_blank(),
  #panel.border=element_rect(colour="#F0F0F0"),
  # Format the grid
  panel.grid.major=element_line(colour="#D0D0D0",size=.2),
  axis.ticks=element_blank())

mapTheme <- function(base_size = 8, title_size = 10, small_size = 6) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = title_size, colour = "black", hjust = 0.5),
    plot.subtitle=element_text(size = base_size, colour = "black", hjust = 0.5, face="italic"),
    plot.caption=element_text(size = small_size, colour = "black", hjust = 0.5),
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    panel.background = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_blank(),
    strip.text.x = element_text(size = base_size),
    strip.text.y = element_text(size = base_size),
    strip.background = element_rect(colour="transparent", fill="transparent"),
    legend.title = element_text(size = small_size),
    legend.text = element_text(size = small_size),
    legend.key.size = unit(0.25, "cm"))
}

colors <- c("#2F4858", "#1372A4", "#86BBD8", "#A8C686","#E75E15")

#this function converts a column in to quintiles. It is used for mapping.
quintileBreaks <- function(df,variable) {
    as.character(quantile(df[[variable]],
                          c(.01,.2,.4,.6,.8),na.rm=T))
}

#This function can be used to convert a polygon sf to centroids xy coords.
xyC <- function(aPolygonSF) {
  as.data.frame(
    cbind(x=st_coordinates(st_centroid(aPolygonSF))[,1],
          y=st_coordinates(st_centroid(aPolygonSF))[,2]))
} 

#this function convert a raster to a data frame so it can be plotted in ggplot
rast <- function(inRaster) {
  data.frame(
    xyFromCell(inRaster, 1:ncell(inRaster)), 
    value = getValues(inRaster)) }

datafolder <- file.path(here() %>% dirname(), 'assn5data')


palette2 <- c("#41b6c4","#253494")
palette4 <- c("#a1dab4","#41b6c4","#2c7fb8","#253494")
palette5 <- c("#ffffcc","#a1dab4","#41b6c4","#2c7fb8","#253494")
palette10 <- c("#f7fcf0","#e0f3db","#ccebc5","#a8ddb5","#7bccc4",
               "#4eb3d3","#2b8cbe","#0868ac","#084081","#f7fcf0")

Data Wrangling

We move onto data wrangling and start by importing land cover data from the US Geological Survey’s National Land Cover Database. We specifically pulled 2008 land cover, 2019 land cover, and land cover change for Delaware County.

lc_2008 <- raster(paste(datafolder, "delco_2008_final.tif", sep = '/')) 


lc_2019 <- raster(paste(datafolder, "delco_2019_Clip_Clip.tif", sep = '/')) 
lc_change <- raster(paste(datafolder, "delco_change_final.tif", sep = '/')) 
crs(lc_change) <- crs(lc_2008)


delco <- 
  get_decennial(geography = 'county', state = 'PA', year = 2020, county = 'Delaware', variables = 'P1_005N', geometry = T) %>% st_transform(crs = st_crs(lc_change))
LCnames <-c("None","Water",
  "Urban",
  "Wetland within Class",
  "Herbaceous Wetland",
  "Agriculture within Class",
  "Cultivated Crop",
  "Hay/Pasture",
  "Rangeland herbaceous and shrub",
  "Barren",
  "Forest",
  "Woody Wetland")


lc_changespat <- as(lc_change, "SpatRaster")
LCcodes <- unique(lc_change)

nlcdcols_c <- data.frame(coltab(lc_changespat))
nlcdcols_c <- nlcdcols_c[LCcodes + 1,] 
LCcolors <- rgb(red = nlcdcols_c$red,
                green = nlcdcols_c$green,
                blue = nlcdcols_c$blue,
                names = as.character(LCcodes),
                maxColorValue = 255)[3:length(LCnames)]

ggplot()+
  geom_sf(data=delco) +
  geom_raster(data=rast(lc_change) %>% na.omit %>% filter(value > 1),
              aes(x,y,fill=as.factor(value))) +
  scale_fill_manual(name = "Land cover change",
                    values = LCcolors,
                    labels =  LCnames[2:length(LCnames)],
                    na.translate = FALSE) +   labs(title = "Land Cover Change, 2008-2019") +
  mapTheme()

We classify the land cover change data in order to identify what areas have experienced development change since 2008. Here, we reclassify so the raster data so that each area is assigned a value of 0, meaning no development has occurred, or 1, meaning development has occurred. This binary development change will be our dependent variable for our analysis and modeling.

reclassMatrix <- matrix(c(
  0, 0,
  1, 0,
  2, 0,
  3, 1,
  4, 0,
  5, 0,
  6, 0,
  7, 1, 
  8, 0, 
  9, 0, 
  10, 0, 
  11, 0, 
  12, 0, 
  Inf, 0
), ncol=2, byrow=T)

reclassMatrix
##       [,1] [,2]
##  [1,]    0    0
##  [2,]    1    0
##  [3,]    2    0
##  [4,]    3    1
##  [5,]    4    0
##  [6,]    5    0
##  [7,]    6    0
##  [8,]    7    1
##  [9,]    8    0
## [10,]    9    0
## [11,]   10    0
## [12,]   11    0
## [13,]   12    0
## [14,]  Inf    0
lc_change2 <- 
  reclassify(lc_change,reclassMatrix)

lc_change2[lc_change2 < 1] <- NA

names(lc_change2) <- "lc_change"


ggplot() +
  geom_sf(data=delco) +
  geom_raster(data=rast(lc_change2) %>% na.omit, 
              aes(x,y,fill=as.factor(value))) +
  scale_fill_viridis(discrete=TRUE, name ="Land Cover\nChange") + 
  labs(title="Development Land Use Change") +
  mapTheme()

Creating the Fishnet

We use a fishnet that covers Delaware County for our analysis. This will allow us to assign values for each of the features in our analysis to cells within the fishnet. This fishnet uses a 500-meter resolution.

delco_fishnet <- 
  st_make_grid(delco, 500, square = T) %>%
  st_sf()

delco_fishnet <-
  delco_fishnet[delco,]

ggplot() +
  geom_sf(data=delco_fishnet) +
  labs(title="Fishnet, 500 Meter Resolution") +
  mapTheme()

We now aggregate the change in development plotted above into the fishnet and assign each cell as developed or no change.

changePoints <-
  rasterToPoints(lc_change2) %>%
  as.data.frame() %>%
  st_as_sf(coords = c("x", "y"), crs = st_crs(delco_fishnet))

fishnet <- 
  aggregate(changePoints, delco_fishnet, sum) %>%
  mutate(lc_change = ifelse(is.na(lc_change),0,lc_change),
         lc_change = ifelse(lc_change > 20 ,1 ,0),
         lc_change = as.factor(lc_change))

ggplot() +
  geom_sf(data=delco) +
  geom_sf(data=fishnet, aes(fill=lc_change), color = NA) +
  scale_fill_manual(values = colors,
                      labels=c("No Change","New Development"),
                      name = "") +
  labs(title = "Land Cover Development Change", subtitle = "As fishnet centroids") +
  mapTheme()

2.3. Land Cover in 2008 and 2019

Below we plot the land cover by type in Delaware County for 2008 and 2019.

#https://www.mrlc.gov/data/legends/national-land-cover-database-class-legend-and-description


LCnames <-c("NA",
  "Water",
  "DevelopedOpen",
  "DevelopedLow",
  "DevelopedMed",
  "DevelopedHigh",
  "Barren",
  "DeciduousForest",
  "EvergreenForest",
  "MixedForest",
  "ShrubScrub",
  "GrassHerbaceous",
  "PastureHay",
  "CultCrops",
  "WoodyWetlands",
  "EmergentHerbWet")


lc_2008spat <- as(lc_2008, "SpatRaster")
LCcodes <- unique(lc_2008)

nlcdcols08 <- data.frame(coltab(lc_2008spat))
nlcdcols08 <- nlcdcols08[LCcodes + 1,]
LCcolors <- rgb(red = nlcdcols08$red,
                green = nlcdcols08$green,
                blue = nlcdcols08$blue,
                names = as.character(LCcodes),
                maxColorValue = 255)

p1<-ggplot() +
  geom_sf(data=delco) +
  geom_raster(data=rast(lc_2008) %>% na.omit %>% filter(value > 0), 
              aes(x,y,fill=as.factor(value))) +
  scale_fill_manual(name = "Land cover",
                    values = LCcolors,
                    labels = LCnames,
                    na.translate = FALSE) +  
  labs(title = "Land Cover Classification", subtitle = 2008) +
  mapTheme()

p2<-ggplot()+
  geom_sf(data=delco) +
  geom_raster(data=rast(lc_2019) %>% na.omit %>% filter(value > 0), 
              aes(x,y,fill=as.factor(value))) +
  scale_fill_manual(name = "Land cover",
                    values = LCcolors,
                    labels = LCnames,
                    na.translate = FALSE) + 
  labs(title = "Land Cover Classification", subtitle = "2019") +
  mapTheme()+
  theme(legend.position = 'none')

legend <- get_legend(p1)
p1<-p1 +  theme(legend.position = 'none')

# move the legend to the right
ggdraw(plot_grid(p1, p2, legend, ncol=3, align='v', rel_widths=c(1, 1, .4)))

Feature Engineering

In this section, we engineer additional features that will be used as independent variables for our modeling. Features include land cover type, population data, distance to highways, distance to railways, and spatial lag of development.

Land Cover Type Categories

We further refine the land cover types into categories in order to understand where development happens most often. Here we recategorize the land use classification so that any open space plus any low, medium, or high intensity development is considered developed. We categorize the remaining land cover types into the following groups: forest, farm, woodlands, water, or other undeveloped land.

developed <- lc_2008 == 21 | lc_2008 == 22 | lc_2008 == 23 | lc_2008 == 24
forest <- lc_2008 == 41 | lc_2008 == 42 | lc_2008 == 43 
farm <- lc_2008 == 81 | lc_2008 == 82 
wetlands <- lc_2008 == 90 | lc_2008 == 95 
otherUndeveloped <- lc_2008 == 52 | lc_2008 == 71 | lc_2008 == 31 
water <- lc_2008 == 11

names(developed) <- "developed"
names(forest) <- "forest"
names(farm) <- "farm"
names(wetlands) <- "wetlands"
names(otherUndeveloped) <- "otherUndeveloped"
names(water) <- "water"

aggregateRaster <- function(inputRasterList, theFishnet) {
  #create an empty fishnet with the same dimensions as the input fishnet
  theseFishnets <- theFishnet %>% dplyr::select()
  #for each raster in the raster list
  for (i in inputRasterList) {
  #create a variable name corresponding to the ith raster
  varName <- names(i)
  #convert raster to points as an sf
    thesePoints <-
      rasterToPoints(i) %>%
      as.data.frame() %>%
      st_as_sf(coords = c("x", "y"), crs = st_crs(theFishnet)) %>%
      filter(.[[1]] == 1)
  #aggregate to the fishnet
    thisFishnet <-
      aggregate(thesePoints, theFishnet, length) %>%
      mutate(!!varName := ifelse(is.na(.[[1]]),0,1))
  #add to the larger fishnet
    theseFishnets <- cbind(theseFishnets,thisFishnet)
  }
  #output all aggregates as one large fishnet
   return(theseFishnets)
  }

theRasterList <- c(developed,forest,farm,wetlands,otherUndeveloped,water)

aggregatedRasters <-
  aggregateRaster(theRasterList, delco_fishnet) %>%
  dplyr::select(developed,forest,farm,wetlands,otherUndeveloped,water) %>%
  mutate_if(is.numeric,as.factor)

aggregatedRasters %>%
  gather(var,value,developed:water) %>%
  st_cast("POLYGON") %>%    #just to make sure no weird geometries slipped in
  mutate(X = xyC(.)$x,
         Y = xyC(.)$y) %>%
  ggplot() +
    geom_sf(data=delco) +
    geom_sf(aes(fill=as.factor(value)),color = NA) +
    facet_wrap(~var) +
    scale_fill_manual(values = colors,
                        labels=c("Other","Land Cover"),
                        name = "") +
    labs(title = "Land Cover Types, 2001",
         subtitle = "As fishnet centroids") +
   mapTheme()

Census Data

We also include population and population change variables in our analysis given that these population metrics impact demand for development. We import census population data for each of the census tracts in Delaware County. We present 2008 and 2019 populations for the county by tract and by fishnet below.

#census_api_key("YOUR KEY GOES HERE", overwrite = TRUE)
census_api_key("4bbe4bead4e5817f6a6b79e62c5bea69e77f1887", overwrite = TRUE)
# Specify which variable(s) you would like to grab. Here, only one (Total Population) is listed, but you could add more to the call.
acs_vars <- c("B02001_001E")

#haven't updated any of the acs stuff 

# Using "tract" as the geography and 2019 as the year, download data data for the Houston MSA counties listed.
delcoPop09 <- get_acs(geography = "tract", 
                        variables = acs_vars, 
                        year = 2009,
                        state = "PA", 
                        geometry = TRUE, 
                        output = "wide",
                        county=c("Delaware")) %>%
                rename(pop2009 = B02001_001E) %>%
                dplyr::select(-starts_with("B"))

# Make sure to transform to the crs of the fishnet!
delcoPop09  <- delcoPop09  %>%
  st_transform(st_crs(delco_fishnet))

# Specify which variable(s) you would like to grab. Here, only one (Total Population) is listed, but you could add more to the call.
acs_vars <- c("B02001_001E")

# Using "tract" as the geography and 2019 as the year, download data data for the Houston MSA counties listed.
delcoPop19 <-  get_acs(geography = "tract", 
                        variables = acs_vars, 
                        year = 2019,
                        state = "PA", 
                        geometry = TRUE, 
                        output = "wide",
                        county=c("Delaware")) %>%
                rename(pop2019 = B02001_001E) %>%
                dplyr::select(-starts_with("B"))# Make sure to transform to the crs of the fishnet!
delcoPop19 <- delcoPop19 %>%
  st_transform(st_crs(delco_fishnet))

Both years of census data are then plotted, first as polygons and then aggregated to the fishnet.

grid.arrange(
ggplot() +
  geom_sf(data = delcoPop09, aes(fill=factor(ntile(pop2009,5))), colour=NA) +
  scale_fill_manual(values = palette5,
                    labels=quintileBreaks(delcoPop09,"pop2009"),
                   name="Quintile\nBreaks") +
  labs(title="Population, Delware County: 2009") +
  mapTheme(),

ggplot() +
  geom_sf(data = delcoPop19, aes(fill=factor(ntile(pop2019,5))), colour=NA) +
  scale_fill_manual(values = palette5,
                    labels=quintileBreaks(delcoPop19,"pop2019"),
                   name="Quintile\nBreaks") +
  labs(title="Population, Delaware County: 2019") +
  mapTheme(), ncol=2)

delco_fishnet <-
  delco_fishnet %>%
  rownames_to_column("fishnetID") %>% 
  mutate(fishnetID = as.numeric(fishnetID)) %>%
  dplyr::select(fishnetID)

fishnetPopulation09 <-
  st_interpolate_aw(delcoPop09["pop2009"], delco_fishnet, extensive=TRUE) %>%
  as.data.frame(.) %>%
  rownames_to_column(var = "fishnetID") %>%
  left_join(delco_fishnet %>%
              mutate(fishnetID = as.character(fishnetID)),
            ., by=c("fishnetID"='fishnetID')) %>% 
  mutate(pop2009 = replace_na(pop2009,0)) %>%
  dplyr::select(pop2009)

fishnetPopulation19<-
  st_interpolate_aw(delcoPop19["pop2019"],delco_fishnet, extensive=TRUE) %>%
  as.data.frame(.) %>%
  rownames_to_column(var = "fishnetID") %>%
  left_join(delco_fishnet %>%
              mutate(fishnetID = as.character(fishnetID)),
            ., by=c("fishnetID"='fishnetID')) %>% 
  mutate(pop2019 = replace_na(pop2019,0)) %>%
  dplyr::select(pop2019)

fishnetPopulation <- 
  cbind(fishnetPopulation09,fishnetPopulation19) %>%
  dplyr::select(pop2009,pop2019) %>%
  mutate(pop_Change = pop2019 - pop2009)

grid.arrange(
ggplot() +
  geom_sf(data=delcoPop19, aes(fill=factor(ntile(pop2019,5))),colour=NA) +
  scale_fill_manual(values = palette5,
                    labels=substr(quintileBreaks(delcoPop19,"pop2019"),1,4),
                   name="Quintile\nBreaks") +
  labs(title="Population, Delaware County: 2019",
       subtitle="Represented as tracts; Boundaries omitted") +
  mapTheme(),

ggplot() +
  geom_sf(data=fishnetPopulation, aes(fill=factor(ntile(pop2019,5))),colour=NA) +
  scale_fill_manual(values = palette5,
                   labels=substr(quintileBreaks(fishnetPopulation,"pop2019"),1,4),
                   name="Quintile\nBreaks") +
  labs(title="Population, Delaware County: 2019",
       subtitle="Represented as fishnet gridcells; Boundaries omitted") +
  mapTheme(), ncol=2)

Existing Development & Infrastructure

In addition to land types and census data, we incorporate spatial variables related to existing development and infrastructure. Here we create the following features: distance to highways, distance to railways, municipalities, and spatial lag to development. Spatial lag takes the distance to existing development into consideration, as we hypothesize that new development is a function of the previous variables plus the pattern of existing development.

delcoRoads <- read_sf("C:/Users/cchue/Documents/GIS/delcodata/streets.shp")  


delcoHighways <- delcoRoads %>% st_transform(st_crs(lc_change)) %>% filter(RoadClass %in% c("INTERSTATE", "MAJOR"))

delcoRail <- delcoRoads %>% st_transform(st_crs(lc_change)) %>% filter(RoadClass %in% c("RAIL"))
emptyRaster <- lc_change
emptyRaster[] <- NA

highway_raster <- 
  as(delcoHighways,'Spatial') %>%
  rasterize(.,emptyRaster)

highway_raster_distance <- distance(highway_raster)
names(highway_raster_distance) <- "distance_highways"

highwayPoints <-
  rasterToPoints(highway_raster_distance) %>%
  as.data.frame() %>%
  st_as_sf(coords = c("x", "y"), crs = st_crs(delco_fishnet))

highwayPoints_fishnet <- 
  aggregate(highwayPoints, delco_fishnet, mean) %>%
  mutate(distance_highways = ifelse(is.na(distance_highways),0,distance_highways))

ggplot() +
  geom_sf(data=delco) +
  geom_sf(data=highwayPoints_fishnet, aes(fill=factor(ntile(distance_highways,5))),color = NA) +
  scale_fill_manual(values = palette5,
                      labels=substr(quintileBreaks(highwayPoints_fishnet,"distance_highways"),1,8),
                      name="Quintile\nBreaks") +
  geom_sf(data=delcoHighways, colour = "red") +
  labs(title = "Distance to Highways",
       subtitle = "As fishnet centroids; Highways visualized in red") +
  mapTheme()

septa_rr_line <- read_sf(paste(datafolder, "rail_existing_clip.shp", sep = '/')) %>% 
  st_transform(crs = st_crs(delco)) 

septa_rr_line_ext <- read_sf(paste(datafolder, "rail_merge_clip.shp", sep = '/')) %>% 
  st_transform(crs = st_crs(delco)) 


septa_rr_line_addition <- read_sf(paste(datafolder, "rail_extension_clip.shp", sep = '/')) %>% 
  st_transform(crs = st_crs(delco)) 

septa_rr_line_ext <- septa_rr_line_ext[delco,]

septa_rr_line <- septa_rr_line[delco,]

#septa_rr_station <- read_sf("C:/Users/cchue/Documents/GIS/Transit/transitshp/regionalrailstats.shp")  

#septa_trolley_line <- read_sf("C:/Users/cchue/Documents/GIS/Transit/transitshp/trolley.shp")  
#septa_trolley_station <- read_sf("C:/Users/cchue/Documents/GIS/Transit/transitshp/trolleystops.shp")  

ggplot()+
  geom_sf(data=delco, fill ="#86BBD8",color = "transparent") +
  geom_sf(data = septa_rr_line, line = 'dashed', color ="red")+
  labs(title = "Existing Regional Rail in Delaware County") +
  mapTheme()+
  theme(legend.position = 'none')
## Warning: Ignoring unknown parameters: line

emptyRaster <- lc_change
emptyRaster[] <- NA

#for existing rail
rr_raster <- 
  as(septa_rr_line,'Spatial') %>%
  rasterize(.,emptyRaster)

rr_raster_distance <- distance(rr_raster)
names(rr_raster_distance) <- "distance_rr"

rrPoints <-
  rasterToPoints(rr_raster_distance) %>%
  as.data.frame() %>%
  st_as_sf(coords = c("x", "y"), crs = st_crs(delco_fishnet))

rrPoints_fishnet <- 
  aggregate(rrPoints, delco_fishnet, mean) %>%
  mutate(distance_rr = ifelse(is.na(distance_rr),0,distance_rr))

#for future rail extension/2030 predictions
rr_raster2 <- 
  as(st_zm(septa_rr_line_addition),'Spatial') %>%
  rasterize(.,emptyRaster)

rr_raster_distance2 <- distance(rr_raster2)
names(rr_raster_distance2) <- "distance_rr2"

rrPoints2 <-
  rasterToPoints(rr_raster_distance2) %>%
  as.data.frame() %>%
  st_as_sf(coords = c("x", "y"), crs = st_crs(delco_fishnet))

rrPoints2_fishnet <- 
  aggregate(rrPoints2, delco_fishnet, mean) %>%
  mutate(distance_rr2 = ifelse(is.na(distance_rr2),0,distance_rr2))

#back to existing
rrPoints_fishnet <- 
  aggregate(rrPoints, delco_fishnet, mean) %>%
  mutate(distance_rr = ifelse(is.na(distance_rr),0,distance_rr))

ggplot() +
  geom_sf(data=delco) +
  geom_sf(data=rrPoints_fishnet, aes(fill=factor(ntile(distance_rr,5))),color = NA) +
  scale_fill_manual(values = palette5,
                      labels=substr(quintileBreaks(rrPoints_fishnet,"distance_rr"),1,8),
                      name="Quintile\nBreaks") +
  geom_sf(data=septa_rr_line, colour = "red") +
  labs(title = "Distance to Regional Rail",
       subtitle = "As fishnet centroids; current Regional Rail visualized in red") +
  mapTheme()

### New Development: Extension of the SEPTA’s Westchester Brand For the supply-side development scenario, we propose extending the Westchester Brand from Wawa west to the county boarder as shown below.

ggplot()+
  geom_sf(data=delco, fill ="#86BBD8", color = "transparent") +
  geom_sf(data = septa_rr_line, line = 'dashed', color ="red")+
  geom_sf(data = septa_rr_line_addition, line = 'dashed', color ="black", width = 5)+
  labs(title = "Proposed Extension of Regional Rail in Delaware County",
       subtitle="Existing rail displayed in red, proposed extension displayed in black") +
  mapTheme()+
  theme(legend.position = 'none')
## Warning: Ignoring unknown parameters: line
## Warning: Ignoring unknown parameters: line, width

delcoMunic <- read_sf("C:/Users/cchue/Documents/GIS/delcodata/municipals.shp") %>% st_transform(st_crs(delco_fishnet)) %>% 
  mutate(munic_area = as.numeric(st_area(.)))

delcoMunic_fishnet <- st_join(delco_fishnet %>% st_centroid(), delcoMunic, join = st_within) %>% st_drop_geometry() %>% 
  full_join(., delco_fishnet) %>% st_sf() %>% 
  dplyr::select(c('Name','fishnetID','munic_area'))
## Warning in st_centroid.sf(.): st_centroid assumes attributes are constant over
## geometries of x
## Joining, by = "fishnetID"
grid.arrange(
ggplot(delcoMunic_fishnet)+
  geom_sf(aes(fill = munic_area), color = NA)+
  mapTheme(),
ggplot(delcoMunic_fishnet)+
  geom_sf(aes(fill = Name), color = NA)+
  theme(legend.position = 'none')+
  mapTheme(),
ncol = 2
)

nn_function <- function(measureFrom,measureTo,k) {
  #convert the sf layers to matrices
  measureFrom_Matrix <-
    as.matrix(measureFrom)
  measureTo_Matrix <-
    as.matrix(measureTo)
  nn <-   
    get.knnx(measureTo, measureFrom, k)$nn.dist
    output <-
    as.data.frame(nn) %>%
    rownames_to_column(var = "thisPoint") %>%
    gather(points, point_distance, V1:ncol(.)) %>%
    arrange(as.numeric(thisPoint)) %>%
    group_by(thisPoint) %>%
    summarize(pointDistance = mean(point_distance)) %>%
    arrange(as.numeric(thisPoint)) %>% 
    dplyr::select(-thisPoint) %>%
    pull()
  
  return(output)  
}

st_crs(fishnet) == st_crs(aggregatedRasters)
## [1] TRUE
fishnet$lagDevelopment <-
    nn_function(xyC(fishnet),
                xyC(filter(aggregatedRasters,developed==1)),
                2)

ggplot() +
  geom_sf(data=fishnet, 
             aes(fill=factor(ntile(lagDevelopment,5))), color = NA) +
  scale_fill_manual(values = palette5,
                     labels=substr(quintileBreaks(fishnet,"lagDevelopment"),1,7),
                     name="Quintile\nBreaks") +
  labs(title = "Spatial Lag to 2010 Development") +
  mapTheme()

### Final Dataset Below we combine all of the features we engineered and map them to our fishnet.

dat <- 
  cbind(
    fishnet, highwayPoints_fishnet, rrPoints_fishnet, fishnetPopulation, aggregatedRasters, delcoMunic_fishnet) %>%
  dplyr::select(lc_change, developed, forest, farm, wetlands, otherUndeveloped, water,
                pop2009, pop2019, pop_Change, distance_highways, distance_rr, lagDevelopment, Name, munic_area) %>%
  mutate(developed10 = ifelse(lc_change == 1 & developed == 1, 0, developed)) %>%
  filter(water == 0) 

Exploratory Analysis

We now continue into the exploratory analysis process. In this section, we explore the association between development change and our spatial and population variables. We examine the ways in which these variables interact with the fishnet cells that experienced new development.

Continuous Variables

Below we explore the continuous variables: distance to highways, distance to railways, and spatial lag to development. We can see that new development occurs in cells that are closer to highways, railways, and other development. The relationship between new development and distance to new development is split more evenly than with distance to highways or rail, but it does seem that new development occurs in cells with smaller distances.

dat %>%
  dplyr::select(distance_highways,distance_rr,lagDevelopment, lc_change) %>%
  gather(Variable, Value, -lc_change, -geometry) %>%
  ggplot(., aes(lc_change, Value, fill=lc_change)) + 
    geom_bar(position = "dodge", stat = "summary", fun.y = "mean") +
    facet_wrap(~Variable) +
    scale_fill_manual(values = palette2,
                      labels=c("No Change","New Development"),
                      name="") +
    labs(title="New Development as a Function of the Continuous Variables") +
    plotTheme 

### Population Variables We also examine the relationship between new development and population variables. Interestingly, new development in Delaware County occurs in cells where population change is lower than other cells. New development seems to occur in cells with lower population, as well. This is different than our original hypothesis that larger populations and higher rates of population change would lead to development in the county.

dat %>%
  dplyr::select(pop2009,pop2019,pop_Change,lc_change) %>%
  gather(Variable, Value, -lc_change, -geometry) %>%
  ggplot(., aes(lc_change, Value, fill=lc_change)) + 
    geom_bar(position = "dodge", stat = "summary", fun.y = "mean") +
    facet_wrap(~Variable, scale='free') +
    scale_fill_manual(values = palette2,
                      labels=c("No Change","New Development"),
                      name="") +
    labs(title="New Development as a Function of Factor Variables") +
    plotTheme

## Land Conversion Rates We present a table of conversion rates for each land cover category below. The table below indicates that forest and farm land are most often converted to development in Delaware County.

dat %>%
  dplyr::select(lc_change:otherUndeveloped,developed) %>%
  gather(Land_Cover_Type, Value, -lc_change, -geometry) %>%
   st_set_geometry(NULL) %>%
     group_by(lc_change, Land_Cover_Type) %>%
     summarize(n = sum(as.numeric(Value))) %>%
     ungroup() %>%
    mutate(Conversion_Rate = paste0(round(100 * n/sum(n), 2), "%")) %>%
    filter(lc_change == 1) %>%
  dplyr::select(Land_Cover_Type,Conversion_Rate) %>%
  kable() %>% kable_styling(full_width = F)
Land_Cover_Type Conversion_Rate
developed 5.12%
farm 3.89%
forest 4.88%
otherUndeveloped 2.28%
wetlands 0.81%

Predicting for 2030

We now move into the modeling section of the analysis. We developed seven logistic regression models to predict where development will occur for 2019 and 2029/2030. We discuss the models’ ability to predict accurate results and the implications for planners in the following sections.

Modeling

As mentioned, we build seven models using a combination of the following variables: land cover type, spatial lag to development population in 2009, population in 2019, population change, distance to highways, and distance to railroads. We used a 50% split to separate our data into two parts: a training set and a test set.

set.seed(3456)
trainIndex <- 
  createDataPartition(dat$developed, p = .50,
                                  list = FALSE,
                                  times = 1)
datTrain <- dat[ trainIndex,]
datTest  <- dat[-trainIndex,]

Model1 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped, 
              family="binomial"(link="logit"), data = datTrain)

Model2 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped + lagDevelopment, 
              family="binomial"(link="logit"), data = datTrain)
              
Model3 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped + lagDevelopment + pop2009, 
              family="binomial"(link="logit"), data = datTrain)          
              
Model4 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped + lagDevelopment + pop2009 + 
              pop2019, 
              family="binomial"(link="logit"), data = datTrain)              
            
Model5 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped + lagDevelopment + pop_Change, 
              family="binomial"(link="logit"), data = datTrain)              
              
Model6 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped + lagDevelopment + pop_Change + 
              distance_highways, 
              family="binomial"(link="logit"), data = datTrain) 

Model7 <- glm(lc_change ~ wetlands + forest  + farm + otherUndeveloped + lagDevelopment + pop2009 + pop2019 + 
              distance_rr + distance_highways,
              family="binomial"(link="logit"), data = datTrain) 


modelList <- paste0("Model", 1:7)
map_dfc(modelList, function(x)pR2(get(x)))[4,] %>%
  setNames(paste0("Model",1:7)) %>%
  gather(Model,McFadden) %>%
  ggplot(aes(Model,McFadden)) +
    geom_bar(stat="identity") +
    labs(title= "McFadden R-Squared by Model") +
    plotTheme
## fitting null model for pseudo-r2
## fitting null model for pseudo-r2
## fitting null model for pseudo-r2
## fitting null model for pseudo-r2
## fitting null model for pseudo-r2
## fitting null model for pseudo-r2
## fitting null model for pseudo-r2

### Predictions

predsForMap <-         
  dat %>%
    mutate(probs = predict(Model7, dat, type="response") ,
           Threshold_5_Pct = as.factor(ifelse(probs >= 0.05 ,1,0)),
           Threshold_17_Pct =  as.factor(ifelse(probs >= 0.17 ,1,0)),
           Threshold_50_Pct =  as.factor(ifelse(probs >= 0.5 ,1,0))) %>%
    dplyr::select(lc_change,Threshold_5_Pct,Threshold_17_Pct,Threshold_50_Pct) %>%
    gather(Variable,Value, -geometry) %>%
    st_cast("POLYGON")

Predictions Based on Threshold

options(yardstick.event_first = FALSE)

testSetProbs <- 
  data.frame(class = datTest$lc_change,
             probs = predict(Model7, datTest, type="response")) 

testSetProbs <- 
  testSetProbs %>% 
  mutate(
         predClass_17 = as.factor(ifelse(testSetProbs$probs >= 0.17 ,1,0)),
         predClass_50 = as.factor(ifelse(testSetProbs$probs >= 0.50 ,1,0))) 

testSetProbs %>%
   dplyr::select(-probs) %>%
   gather(Variable, Value, -class) %>%
  group_by(Variable) %>%
   summarize(Sensitivity = round(yardstick::sens_vec(class,factor(Value)),2),
             Specificity = round(yardstick::spec_vec(class,factor(Value)),2),
             Accuracy = round(yardstick::accuracy_vec(class,factor(Value)),2)) %>% 
   kable() %>%
   kable_styling(full_width = F)
Variable Sensitivity Specificity Accuracy
predClass_17 0.55 0.78 0.75
predClass_50 0.00 1.00 0.85
ggplot(testSetProbs, aes(probs)) +
  geom_density(aes(fill=class), alpha=0.5) +
  scale_fill_manual(values = palette2,
                    labels=c("No Change","New Development")) +
  labs(title = "Histogram of test set predicted probabilities",
       x="Predicted Probabilities",y="Density") +
  plotTheme

ggplot() +
  geom_sf(data=predsForMap, aes(fill=Value), color =NA) +
  facet_wrap(~Variable) +
  scale_fill_manual(values = palette2, labels=c("No Change","New Development"),
                      name="") +
  labs(title="Development Predictions - Low Threshold") + 
  mapTheme()

Predictions Based on Confusion Matrix

ConfusionMatrix.metrics <-
  dat %>%
    mutate(probs = predict(Model6, dat, type="response") ,
           Threshold_5_Pct = as.factor(ifelse(probs >= 0.05 ,1,0)),
           Threshold_17_Pct =  as.factor(ifelse(probs >= 0.17 ,1,0))) %>%
    mutate(TrueP_05 = ifelse(lc_change  == 1 & Threshold_5_Pct == 1, 1,0),
           TrueN_05 = ifelse(lc_change  == 0 & Threshold_5_Pct == 0, 1,0),
           TrueP_17 = ifelse(lc_change  == 1 & Threshold_17_Pct == 1, 1,0),
           TrueN_17 = ifelse(lc_change  == 0 & Threshold_17_Pct == 0, 1,0)) %>%
    dplyr::select(., starts_with("True")) %>%
    gather(Variable, Value, -geometry) %>%
    st_cast("POLYGON") 

ggplot(data=ConfusionMatrix.metrics) +
  geom_sf(aes(fill = as.factor(Value)), color = NA) +
  facet_wrap(~Variable) +
  scale_fill_manual(values = palette2, labels=c("Correct","Incorrect"),
                       name="") +
  labs(title="Development Predictions - Low Threshold") + mapTheme()

Generalizability & Spatial Cross Validation

spatialCV <- function(dataFrame, uniqueID, dependentVariable, modelName) {

#initialize a data frame 
endList <- list()

#create a list that is all the spatial group unqiue ids in the data frame (ie counties)    
  uniqueID_List <- unique(dataFrame[[uniqueID]])  
  x <- 1
  y <- length(uniqueID_List)
  
#create a counter and while it is less than the number of counties...  
  while(x <= y) 
  {
#call a current county    
    currentUniqueID <- uniqueID_List[x]
#create a training set comprised of units not in that county and a test set of units
#that are that county
    training <- dataFrame[ which(dataFrame[[uniqueID]] != uniqueID_List[x]),]
    testing <- dataFrame[ which(dataFrame[[uniqueID]] == uniqueID_List[x]),]
#create seperate xy vectors
    trainingX <- training[ , -which(names(training) %in% c(dependentVariable))]
    testingX <- testing[ , -which(names(testing) %in% c(dependentVariable))]
    
    trainY <- training[[dependentVariable]]
    testY <- testing[[dependentVariable]]
#Calculate predictions on the test county as part of a data frame including the observed
#outcome and the unique county ID    
   thisPrediction <- 
     data.frame(class = testY,
                probs = predict(modelName, testingX, type="response"),
                county = currentUniqueID) 

#Row bind the predictions to a data farme
   endList <- rbind(endList, thisPrediction)
#iterate counter    
    x <- x + 1 
  } 
#return the final list of counties and associated predictions  
  return (as.data.frame(endList))
}

dat_cv <- dat %>% mutate(Name = ifelse(!is.na(Name),Name,'Edge') )

spatialCV_counties <-
  spatialCV(dat_cv,"Name","lc_change", Model6) %>%
  mutate(predClass = as.factor(ifelse(probs >= 0.1 ,1,0)))

spatialCV_metrics <-
  spatialCV_counties %>% 
    group_by(county) %>% 
    summarize(Observed_Change = sum(as.numeric(as.character(class))),
              Sensitivity = round(yardstick::sens_vec(class,predClass),2),
              Specificity = round(yardstick::spec_vec(class,predClass),2),
              Accuracy = round(yardstick::accuracy_vec(class,predClass),2)) 

spatialCV_metrics %>%
  kable() %>%
  kable_styling(full_width = F)
county Observed_Change Sensitivity Specificity Accuracy
Aldan Borough 0 NA 1.00 1.00
Aston Township 13 0.69 0.46 0.51
Bethel Township 24 0.83 0.31 0.56
Brookhaven Borough 2 0.50 0.71 0.69
Chadds Ford Township 7 1.00 0.13 0.21
Chester City 3 0.33 0.88 0.84
Chester Heights Borough 2 0.50 0.11 0.14
Chester Township 6 0.67 0.38 0.50
Clifton Heights Borough 0 NA 1.00 1.00
Collingdale Borough 0 NA 0.86 0.86
Colwyn Borough 0 NA 0.67 0.67
Concord Township 47 0.87 0.11 0.38
Darby Borough 0 NA 0.60 0.60
Darby Township 1 1.00 0.85 0.86
East Lansdowne Borough 0 NA 1.00 1.00
Eddystone Borough 0 NA 1.00 1.00
Edge 3 0.67 0.79 0.78
Edgmont Township 19 0.84 0.20 0.33
Folcroft Borough 0 NA 0.75 0.75
Glenolden Borough 0 NA 0.78 0.78
Haverford Township 2 1.00 0.68 0.68
Lansdowne Borough 0 NA 0.92 0.92
Lower Chichester Township 4 0.75 0.57 0.64
Marcus Hook Borough 1 0.00 1.00 0.90
Marple Township 5 0.80 0.60 0.61
Media Borough 0 NA 1.00 1.00
Middletown Township 18 0.89 0.21 0.30
Millbourne Borough 0 NA 0.50 0.50
Morton Borough 0 NA 1.00 1.00
Nether Providence Township 4 0.50 0.57 0.57
Newtown Township 39 0.95 0.27 0.54
Norwood Borough 0 NA 0.86 0.86
Parkside Borough 0 NA 1.00 1.00
Prospect Park Borough 0 NA 0.83 0.83
Radnor Township 5 0.80 0.37 0.39
Ridley Park Borough 0 NA 0.92 0.92
Ridley Township 2 0.50 0.71 0.70
Rose Valley Borough 1 0.00 0.60 0.50
Rutledge Borough 0 NA 1.00 1.00
Sharon Hill Borough 0 NA 0.88 0.88
Springfield Township 5 0.60 0.68 0.67
Swarthmore Borough 2 0.50 0.92 0.86
Thornbury Township 16 0.94 0.10 0.25
Tinicum Township 8 0.12 0.88 0.71
Trainer Borough 0 NA 0.75 0.75
Upland Borough 0 NA 0.50 0.50
Upper Chichester Township 14 0.71 0.58 0.61
Upper Darby Township 1 1.00 0.83 0.83
Upper Providence Township 6 0.67 0.39 0.42
Yeadon Borough 0 NA 0.94 0.94

Predicting Land Cover Demand for 2030

dat <-
  dat %>%
  mutate(lagDevelopment = nn_function(xyC(.), xyC(filter(.,developed10 == 1)),2))

Population Projections

dat %>% mutate(Name = ifelse(!is.na(Name),Name,'Edge') )
## Simple feature collection with 1886 features and 16 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 1716243 ymin: 2053225 xmax: 1748243 ymax: 2083225
## Projected CRS: Albers_Conical_Equal_Area
## First 10 features:
##    lc_change developed forest farm wetlands otherUndeveloped water    pop2009
## 1          0         1      0    0        0                0     0   6.554760
## 5          0         0      1    0        1                0     0   7.064492
## 6          0         1      1    1        1                0     0  16.090944
## 7          0         1      0    0        0                0     0  13.417415
## 8          0         1      0    0        0                0     0  80.563266
## 9          0         1      0    0        0                0     0  87.340328
## 10         0         1      0    0        0                0     0 128.905373
## 16         0         0      1    1        1                1     0  31.429047
## 17         0         0      1    0        1                0     0   3.173708
## 20         0         1      1    1        0                0     0  17.396063
##       pop2019 pop_Change distance_highways distance_rr lagDevelopment
## 1    6.077885 -0.4768746          758.8382   1087.9817      3835.5919
## 5    6.942876 -0.1216163         3633.0470  13666.2160       353.5534
## 6   18.628428  2.5374836         3731.4078  13297.0370       500.0000
## 7   13.646500  0.2290857          720.0269    594.1372      3640.8530
## 8   75.222460 -5.3408062          414.9727    198.9697      3646.4631
## 9   84.848747 -2.4915810          136.8162    393.6153      3656.6916
## 10 125.177814 -3.7275597          403.2284    742.9094      3671.7102
## 16  35.622632  4.1935844         3251.9454  12955.9549       250.0000
## 17   3.065627 -0.1080801         3398.2401  12597.2737       250.0000
## 20  19.682510  2.2864471         2474.5082  11519.2789      1250.0000
##                    Name munic_area                       geometry developed10
## 1                  Edge         NA POLYGON ((1731743 2053225, ...           2
## 5                  Edge         NA POLYGON ((1716743 2053725, ...           1
## 6  Chadds Ford Township   22629279 POLYGON ((1717243 2053725, ...           2
## 7                  Edge         NA POLYGON ((1730243 2053725, ...           2
## 8                  Edge         NA POLYGON ((1730743 2053725, ...           2
## 9   Marcus Hook Borough    4217854 POLYGON ((1731243 2053725, ...           2
## 10  Marcus Hook Borough    4217854 POLYGON ((1731743 2053725, ...           2
## 16 Chadds Ford Township   22629279 POLYGON ((1717243 2054225, ...           1
## 17                 Edge         NA POLYGON ((1717743 2054225, ...           1
## 20                 Edge         NA POLYGON ((1719243 2054225, ...           2
muniPopulation_2030 <- 
  data.frame(
   Name = 
     c("Aldan Borough","Aston Township", "Bethel Township","Brookhaven Borough", "Chadds Ford Township", "Chester City", "Chester Heights Borough", "Chester Township", "Clifton Heights Borough", "Collingdale Borough", "Colwyn Borough", "Concord Township", "Darby Borough", "Darby Township", "East Lansdowne Borough", "Eddystone Borough", "Edgmont Township", "Folcroft Borough", "Glenolden Borough", "Haverford Township", "Landsdowne Borough", "Lower Chichester Township", "Marcus Hook Borough", "Marple Township", "Media Borough", "Middletown Township", "Millbourne Borough","Morton Borough", "Nether Providence Township", "Newton Township", "Norwood Borough", "Parkslide Borough", "Prospect Park Borough", "Radnor Township", "Ridley Park Borough", "Ridley Township", "Rose Valley Borough", "Rutledge Borough", "Sharon Hill Borough", "Springfield Township", "Swarthmore Borough", "Thornbury Township", "Tinicum Township", "Trainer Borough", "Upland Borough", "Upper Chichester Township", "Upper Darby Township", "Uppery Providence Township", "Yeadon Borough"),
   muni_projection_2030 = 
     c(4137, 17137, 10142, 8308, 3978, 31988, 2688, 9679, 6535, 8795, 2304, 16580, 11882, 28332, 2679, 2666, 3684, 6447, 7046, 51881, 10625, 3665, 2552, 24550, 5391, 17792, 1168, 2727, 13861, 4145, 5931, 2184, 6112, 32266, 7104, 29300, 903, 811, 5793, 24130, 6331, 8508,4273, 1751, 2734, 17416, 82464, 21180, 11472)) %>% 
  right_join(
     dat %>%
       st_drop_geometry() %>%
       group_by(Name) %>%
       summarize(muni_pop2009 = round(sum(pop2009))))

muniPopulation_2030_1 <- muniPopulation_2030 %>%
  group_by(Name) %>%
  summarize(muni_pop2030 = round(mean(muni_projection_2030)))
colnames(muniPopulation_2030_1)[1] = "Name"

muniPopulation_2030 <- left_join(muniPopulation_2030, muniPopulation_2030_1)
muniPopulation_2030 <- muniPopulation_2030 %>%
  dplyr::select(Name, muni_pop2009, muni_pop2030)

muniPopulation_2030 %>%
  gather(Variable,Value, -Name) %>%
  ggplot(aes(reorder(Name,-Value),Value)) +
  geom_bar(aes(fill=Variable), stat = "identity", position = "dodge") +
  scale_fill_manual(values = palette2,
                    labels=c("2008","2030"),
                    name="Population") +
  labs(title="Population Change by County: 2008 - 2030",
       x="County", y="Population") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  plotTheme

Development Demand in 2030

 dat_infill <-
   dat %>%
   #calculate population change
     left_join(muniPopulation_2030) %>%
     mutate(county_population_2030 = sum(muni_pop2030),
            proportion_of_county_pop = muni_pop2009 / county_population_2030,
            pop_2030.infill = proportion_of_county_pop * muni_pop2030,
            pop_Change = round(pop_2030.infill - muni_pop2009),2) %>%
     dplyr::select(-county_population_2030, -county_population_2030, 
                 -proportion_of_county_pop, -pop_2030.infill) %>%
 #predict for 2020
     mutate(predict_2030.infill = predict(Model7,. , type="response"))
## Joining, by = "Name"
 dat_infill %>%
   ggplot() +  
   geom_sf(aes(fill = factor(ntile(predict_2030.infill,5))),color=NA) +
   scale_fill_manual(values = palette5,
                     labels=substr(quintileBreaks(dat_infill,"predict_2030.infill"),1,4),
                     name="Quintile\nBreaks") +
   #geom_sf(data=delco, fill=NA, colour="black", size=1) +
   labs(title= "Development Demand in 2030: Predicted Probabilities") +
   mapTheme()

6. Comparing Predicted Development Demand & Environmental Sensitivity

We now have a really strong indicator of development demand for 2020 to help guide local land use planning. Demand however, is only one side of the equation. It must balanced with the supply of environmentally sensitive land. Understanding the interplay between demand and supply is the first stage of the ‘Allocation’ phase, where Planners ultimately decide which land should be developed and which should not.

For this analysis farmland and undeveloped land are be deemed Suitable, while environmentally sensitive areas like wetlands and forest are be deemed Not Suitable. Below, 2011 land cover data is read in and several measures of environmental sensitivity are created by county. These include:

  1. The total amount of wetlands and forest land cover area in 2011.
  2. The amount of sensitive land (wetland and forest) lost between 2001 and 2011.
  3. The total area of large sensitive landscape ‘patches’ in 2011.

The third metric warrants some further discussion. In the context of leapfrog development, Section 2.6 discusses the concept of landscape fragmentation - the idea that discontinuous development across space carves out disjointed slivers of wilderness. This fragmentation reduces biodiversity particularly for species that need room to roam. Below, environmentally sensitive_regions are created to represent large areas of unfragmented natural resources. We then consider the total area of these clumps for each county.

6.2. 2019Land Cover Data

To begin, the 2011 Land Cover data is read in and reclassified.

developed19 <- lc_2019 == 21 | lc_2019 == 22 | lc_2019 == 23 | lc_2019 == 24
forest19 <- lc_2019 == 41 | lc_2019 == 42 | lc_2019 == 43 
farm19 <- lc_2019 == 81 | lc_2019 == 82 
wetlands19 <- lc_2019 == 90 | lc_2019 == 95 
otherUndeveloped19 <- lc_2019 == 52 | lc_2019 == 71 | lc_2019 == 31 
water19 <- lc_2019 == 11

names(developed19) <- "developed19"
names(forest19) <- "forest19"
names(farm19) <- "farm19"
names(wetlands19) <- "wetlands19"
names(otherUndeveloped19) <- "otherUndeveloped19"
names(water19) <- "water19"


LCnames <-c("NA",
  "Water",
  "DevelopedOpen",
  "DevelopedLow",
  "DevelopedMed",
  "DevelopedHigh",
  "Barren",
  "DeciduousForest",
  "EvergreenForest",
  "MixedForest",
  "ShrubScrub",
  "GrassHerbaceous",
  "PastureHay",
  "CultCrops",
  "WoodyWetlands",
  "EmergentHerbWet")


lc_2008spat <- as(lc_2008, "SpatRaster")
LCcodes <- unique(lc_2008)

nlcdcols08 <- data.frame(coltab(lc_2008spat))
nlcdcols08 <- nlcdcols08[LCcodes + 1,]
LCcolors <- rgb(red = nlcdcols08$red,
                green = nlcdcols08$green,
                blue = nlcdcols08$blue,
                names = as.character(LCcodes),
                maxColorValue = 255)

p1<-ggplot() +
  geom_sf(data=delco) +
  geom_raster(data=rast(lc_2008) %>% na.omit %>% filter(value > 0), 
              aes(x,y,fill=as.factor(value))) +
  scale_fill_manual(name = "Land cover",
                    values = LCcolors,
                    labels = LCnames,
                    na.translate = FALSE) +  
  labs(title = "Land Cover Classification", subtitle = 2008) +
  mapTheme()

p2<-ggplot()+
  geom_sf(data=delco) +
  geom_raster(data=rast(lc_2019) %>% na.omit %>% filter(value > 0), 
              aes(x,y,fill=as.factor(value))) +
  scale_fill_manual(name = "Land cover",
                    values = LCcolors,
                    labels = LCnames,
                    na.translate = FALSE) + 
  labs(title = "Land Cover Classification", subtitle = "2019") +
  mapTheme()+
  theme(legend.position = 'none')

legend <- get_legend(p1)
p1<-p1 +  theme(legend.position = 'none')

# move the legend to the right
ggdraw(plot_grid(p1, p2, legend, ncol=3, align='v', rel_widths=c(1, 1, .4)))

Next, each raster is aggregated to the fishnet using the aggregateRaster function and 2011 land cover types are mapped.

theRasterList19 <- c(developed19,forest19,farm19,wetlands19,otherUndeveloped19,water19)

dat2 <-
  aggregateRaster(theRasterList19, dat) %>%
  dplyr::select(developed19,forest19,farm19,wetlands19,otherUndeveloped19,water19) %>%
  st_set_geometry(NULL) %>%
  bind_cols(.,dat) %>%
  st_sf() %>%
  st_cast("POLYGON")

dat2 %>%
  gather(var,value,developed19:water19) %>%
  ggplot() +
    geom_sf(data=delco) +
    geom_sf(aes(fill=as.factor(value)),color=NA) +
    facet_wrap(~var) +
    scale_fill_manual(values = palette2,
                        labels=c("Other","Land Cover"),
                        name = "") +
    labs(title = "Land Cover Types, 2019",
         subtitle = "As fishnet centroids") +
   mapTheme()

6.3. Sensitive Land Cover Lost

Below an indicator sensitive_lost is created indicating grid cells that were either forest or wetlands in 2001 but were no longer so in 2011. The output layer, sensitive_land_lost, gives a sense for how development in the recent past has effected the natural environment.

dat2 <-
  dat2 %>%
   mutate(sensitive_lost19 = ifelse(forest == 1 & forest19 == 0 |
                                    farm == 1 & farm19 == 0,1,0))
                      
ggplot() +
  geom_sf(data=dat2, aes(fill=as.factor(sensitive_lost19)),color=NA) +
  scale_fill_manual(values = palette2,
                      labels=c("No Change","Sensitive Lost"),
                      name = "") +
  labs(title = "Sensitive lands lost: 2008 - 2019",
       subtitle = "As fishnet centroids") +
  mapTheme()

# 7. Allocation

Allocation is the final stage of the urban growth modeling process. Now that both demand and supply is understood, Planners can allocate development rights accordingly. Of course, this could take many forms of regulation including zoning, subdivision approval or outright conservation. In this section, demand and supply are visualized for two counties, Fort Bend and Montgomery. The data suggests that the former is more conducive to growth while the latter, less so.

First, development demand is predicted for Fort Bend. Then a layer, fortBend_landUse is created, that includes indicators for both previously developed land and environmentally unsuitable land. This layer then is overlayed atop development demand and projected population change to give the full supply and demand-side picture in Fort Bend.

There are some clear opportunities for development in Fort Bend. Significant infill opportunities exist along the northeast boundary where population change is projected to be greatest. There is also a good deal of environmentally suitable land in the center of the county around a highway interchange. This would be ideal space for large footprint, suburban shopping.
upperDarby <-
  dat2 %>%
    mutate(Development_Demand = predict(Model6, dat2, type="response")) %>%
    filter(Name == "Upper Darby Township") 

upperDarby_landUse <- rbind(
  filter(upperDarby, forest19 == 1 | farm19 == 1 ) %>%
  dplyr::select() %>% mutate(Land_Use = "Not Suitable"),
  filter(upperDarby, developed19 == 1) %>%
  dplyr::select() %>% mutate(Land_Use = "Developed"))

grid.arrange(
ggplot() +
  geom_sf(data=upperDarby, aes(fill=factor(ntile(Development_Demand,5))), colour=NA) +
  geom_point(data=upperDarby_landUse, aes(x=xyC(upperDarby_landUse)[,1], 
                                        y=xyC(upperDarby_landUse)[,2], colour=Land_Use),
                                        shape = 15, size = 2) +
  geom_sf(data=st_intersection(septa_rr_line_ext,filter(dat2, Name=="Upper Darby Township")), size=2) +
  scale_fill_manual(values = palette5, name="Development\nDemand",
                    labels=substr(quintileBreaks(upperDarby,"Development_Demand"),1,5)) +
  scale_colour_manual(values = c("black","red")) + 
  labs(title = "Development Potential, 2020: Upper Darby Township") + mapTheme() +
  guides(fill = guide_legend(order = 1), colour = guide_legend(order = 2)),

ggplot() +
  geom_sf(data=upperDarby, aes(fill=factor(ntile(pop_Change,5))), colour=NA) +
  geom_point(data=upperDarby_landUse, aes(x=xyC(upperDarby_landUse)[,1], 
                                        y=xyC(upperDarby_landUse)[,2], colour=Land_Use),
                                        shape = 15, size = 2) +
  geom_sf(data=st_intersection(septa_rr_line_ext,filter(dat2, Name=="Upper Darby Township")), size=2) +
  scale_fill_manual(values = palette5, name="Population\nChange",
                    labels=substr(quintileBreaks(upperDarby,"pop_Change"),1,5)) +
  scale_colour_manual(values = c("black","red")) + 
  labs(title = "Projected Population, 2020: Upper Darby Township") + mapTheme() +
  guides(fill = guide_legend(order = 1), colour = guide_legend(order = 2)), ncol=2)

haverford <-
  dat2 %>%
    mutate(Development_Demand = predict(Model6, dat2, type="response")) %>%
    filter(Name == "Haverford Township") 

haverford_landUse <- rbind(
  filter(haverford, forest19 == 1 | farm19 == 1 ) %>%
  dplyr::select() %>% mutate(Land_Use = "Not Suitable"),
  filter(haverford, developed19 == 1) %>%
  dplyr::select() %>% mutate(Land_Use = "Developed"))

grid.arrange(
ggplot() +
  geom_sf(data=haverford, aes(fill=factor(ntile(Development_Demand,5))), colour=NA) +
  geom_point(data=haverford_landUse, aes(x=xyC(haverford_landUse)[,1], 
                                        y=xyC(haverford_landUse)[,2], colour=Land_Use),
                                        shape = 15, size = 2) +
  geom_sf(data=st_intersection(septa_rr_line_ext,filter(dat2, Name=="Haverford Township")), size=2) +
  scale_fill_manual(values = palette5, name="Development\nDemand",
                    labels=substr(quintileBreaks(haverford,"Development_Demand"),1,5)) +
  scale_colour_manual(values = c("black","red")) + 
  labs(title = "Development Potential, 2020: Haverford Township") + mapTheme() +
  guides(fill = guide_legend(order = 1), colour = guide_legend(order = 2)),

ggplot() +
  geom_sf(data=haverford, aes(fill=factor(ntile(pop_Change,5))), colour=NA) +
  geom_point(data=haverford_landUse, aes(x=xyC(haverford_landUse)[,1], 
                                        y=xyC(haverford_landUse)[,2], colour=Land_Use),
                                        shape = 15, size = 2) +
  geom_sf(data=st_intersection(septa_rr_line_ext,filter(dat2, Name=="Haverford Township")), size=2) +
  scale_fill_manual(values = palette5, name="Population\nChange",
                    labels=substr(quintileBreaks(haverford,"pop_Change"),1,5)) +
  scale_colour_manual(values = c("black","red")) + 
  labs(title = "Projected Population, 2020: Haverford Township") + mapTheme() +
  guides(fill = guide_legend(order = 1), colour = guide_legend(order = 2)), ncol=2)