Exploring the use of graph neural networks for urban analytics

Stef De Sabbata

sdesabbata.github.io

Overview

A brief introduction to
- Graph Neural Networks (GNNs)
Applications
- Exploring urban form
- Geodemographic classification
Some thoughts on GeoAI

Slides at sdesabbata.github.io/#events

Thanks to my collaborators

on the projects included in this presentation:
- (De Sabbata, Ballatore, Liu, et al. 2023; De Sabbata and Liu 2023)
- Pengyuan Liu, Andrea Ballatore, Nick Tate
and on the new IJGIS special issue on GeoAI in urban analytics
- Andrea Ballatore, Harvey Miller, Renée Sieber, Ivan Tyukin and Godwin Yeboah

Graphs in GIScience

Graphs have long been used in geography and GIScience

to represent networks
- transportation networks
  - street networks (geographic)
  - space syntax
- social networks
to encode proximity
- distance weights

Contains National Statistics data Crown copyright and database right 2015; Contains Ordnance Survey data Crown copyright and database right 2015. Data by OpenStreetMap, under ODbL, and by Boeing (2020), under CC0 1.0.

Graph Neural Networks (GNNs) were developed in machine learning

generalisation of Convolutional Neural Networks
“deep neural networks on graphs other than regular grids” (Bruna et al. 2014)

Graph neural networks

Bruna et al. (2014) proposed a spectral construction approach
Kipf and Welling (2017) proposed a message passing approach
- Graph Convolutional Network (GCN) layer for a node \(v\), with weights \(W^{(l)}\) and activation function \(\sigma\), defined as

\[ h_{v}^{(l)} = \sigma \left( W^{(l)} \sum_{u \in N(v)} \frac{1}{|N(v)|} h_{u}^{(l-1)} \right) \]

Hamilton, Ying, and Leskovec (2017) proposed a generalisation
- in GraphSAGE, a simple mean is used as the aggregate function and a sum as the combine function

\[ h_{v}^{(l)} = \sigma \left( W^{(l)} \ {\scriptstyle COMBINE} \left( h_{v}^{(l-1)}, {\scriptstyle AGGREGATE} \left( \bigl\{ h_{u}^{(l-1)}, \forall u \in N(v) \bigr\} \right) \right) \right) \]
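The two equations above can be sketched in a few lines of NumPy. This is a minimal illustration that follows the equations as written on this slide (a single weight matrix, mean aggregation, sum as combine); the toy graph, the features and the identity weights are assumptions made for readability, and library implementations differ in normalisation and weight handling.

```python
import numpy as np

# Hypothetical toy graph: 4 nodes, neighbour lists N(v) for an undirected graph.
neighbours = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}

# Hypothetical previous-layer representations h^(l-1): 4 nodes, 2 features each.
h_prev = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [2.0, 2.0],
                   [4.0, 0.0]])

# Hypothetical weights W^(l); the identity keeps the arithmetic easy to follow.
W = np.eye(2)

def relu(x):
    return np.maximum(x, 0.0)

def mean_aggregate(h, neighbours, v):
    """AGGREGATE: mean of the neighbours' previous-layer representations."""
    return h[neighbours[v]].mean(axis=0)

def gcn_mean_layer(h, neighbours, W):
    """h_v = sigma(W * mean of h_u over u in N(v)), as in the first equation."""
    out = np.stack([W @ mean_aggregate(h, neighbours, v) for v in range(len(h))])
    return relu(out)

def sage_sum_layer(h, neighbours, W):
    """h_v = sigma(W * COMBINE(h_v, AGGREGATE(...))) with COMBINE = sum."""
    out = np.stack([W @ (h[v] + mean_aggregate(h, neighbours, v))
                    for v in range(len(h))])
    return relu(out)

h_gcn = gcn_mean_layer(h_prev, neighbours, W)
h_sage = sage_sum_layer(h_prev, neighbours, W)
```

The only difference between the two functions is whether the node's own representation enters the update, which is the role of COMBINE in the generalised form.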

Graph AutoEncoder (GAE)

Unsupervised learning of node representations

by optimising a dimensionality reduction model
encoder: uses graph-convolution and linear layers
decoder: commonly an inner product of the embeddings
loss: binary cross entropy for positive and negative sampled edges
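As a rough sketch of the decoder and loss listed above, the following NumPy code scores edges with an inner product of node embeddings and computes a binary cross entropy over positive and negative edges. The embeddings and edge lists are hypothetical stand-ins for an encoder's output and for sampled edges; this is not the code used in the studies presented here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embeddings Z produced by an encoder: 5 nodes, 2 dimensions.
Z = rng.normal(size=(5, 2))

# Hypothetical positive edges (present in the graph) and
# sampled negative edges (absent from the graph).
pos_edges = np.array([[0, 1], [1, 2], [3, 4]])
neg_edges = np.array([[0, 3], [1, 4], [2, 4]])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode(Z, edges):
    """Inner-product decoder: edge probability = sigmoid(z_u . z_v)."""
    return sigmoid(np.sum(Z[edges[:, 0]] * Z[edges[:, 1]], axis=1))

def reconstruction_loss(Z, pos_edges, neg_edges, eps=1e-15):
    """Binary cross entropy: positive edges target 1, negative edges target 0."""
    pos_loss = -np.log(decode(Z, pos_edges) + eps).mean()
    neg_loss = -np.log(1.0 - decode(Z, neg_edges) + eps).mean()
    return pos_loss + neg_loss

loss = reconstruction_loss(Z, pos_edges, neg_edges)
```

Training the encoder to minimise this loss pushes connected nodes towards embeddings with a large inner product and unconnected ones towards a small inner product.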

Case study 1: exploring urban form

Exploring urban form

The analysis of urban physical form or built form (Batty 2008)

Topics

street connectivity and structure
- geographic street network
- space syntax
building structure and arrangement
size and shape of urban areas

Approaches

network analysis
- centrality
- clustering
- modularity
- Barabási-Albert model
fractals
agent-based modelling

Learning urban form via GAE

(De Sabbata, Ballatore, Liu, et al. 2023)

Pre-processing

random 1% of nodes from 137 UK cities
an ego-graph for each node
- 500m network distance (min 8 nodes)
- junctions as nodes
  - num. of segments as an attribute
  - bounded min-max (1 to 4)
- street segments as edges
  - length as an edge attribute
  - bounded min-max (50m to 500m)
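The pre-processing steps above can be sketched with the standard library alone. The street network below is hypothetical, and "bounded min-max" is read here as clipping a value to the stated bounds and then rescaling to [0, 1]; that reading is an assumption, as is the helper naming. The minimum of 8 nodes per ego-graph is not enforced in this sketch.

```python
import heapq

# Hypothetical street network:
# junction -> list of (neighbouring junction, segment length in metres).
streets = {
    "a": [("b", 120.0), ("c", 300.0)],
    "b": [("a", 120.0), ("d", 450.0)],
    "c": [("a", 300.0), ("d", 90.0)],
    "d": [("b", 450.0), ("c", 90.0), ("e", 200.0)],
    "e": [("d", 200.0)],
}

def ego_nodes(graph, centre, max_dist=500.0):
    """Junctions within `max_dist` metres of `centre` along the network (Dijkstra)."""
    dist = {centre: 0.0}
    queue = [(0.0, centre)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for nxt, length in graph[node]:
            nd = d + length
            if nd <= max_dist and nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return dist

def bounded_min_max(value, lower, upper):
    """Clip to [lower, upper], then rescale to [0, 1] (assumed reading)."""
    clipped = min(max(value, lower), upper)
    return (clipped - lower) / (upper - lower)

# Ego-graph around junction "a" at 500 m network distance.
ego = ego_nodes(streets, "a")

# Node attribute: number of segments at each junction, bounded between 1 and 4.
node_attr = {n: bounded_min_max(len(streets[n]), 1, 4) for n in ego}

# Edge attribute: a 620 m segment is clipped to the 500 m upper bound.
edge_attr = bounded_min_max(620.0, 50.0, 500.0)
```

Junction "e" is left out of the ego-graph because its shortest network path from "a" is 590 m, even though it is only one hop beyond "d".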

Model

PyTorch Geometric
three-layer encoder
- two GINE (Hu et al. 2020b) layers
  - 64 hidden features
- one linear layer
  - 64 features to 2 embedding dimensions
trained for 1000 epochs
- AdamW optimiser
- 0.0001 learning rate
- random 80% of ego-graphs
tested on remaining 20%

Case study

Leicester (UK)

Population: 368,600 at the 2021 UK Census, increased by 11.8% since 2011
Minority-majority city: 43.4% identify as Asian, 33.2% are White British
Area: about 73 km² (28 sq mi)
Simplified OSM street network data by Boeing (2020)

Results

Street network data by OpenStreetMap, under ODbL, and by Boeing (2020), under CC0 1.0

Results (embedding clustering)

Results (ego-graph pooled)

Baselines comparison

		Node embeddings		Ego-graph embeddings
	Measure	First dimension	Second dimension	First dimension	Second dimension
Node in city
	closeness centrality	0.262***	-0.194***	0.365***	-0.337***
	betweenness centrality	0.242***	-0.026***	0.117***	-0.155***
Ego-graph
	count of nodes	-0.033***	-0.104***	-0.138***	-0.226***
	count of edges	0.013*	-0.101***	-0.068***	-0.213***
	average node degree	0.261***	0.005	0.377***	0.037***
	total edge length	0.210***	-0.131***	0.208***	-0.246***
	average edge length	0.370***	-0.045***	0.580***	-0.022***
	average count of streets per node	0.280***	-0.232***	0.431***	-0.421***
	count of intersections	0.047***	-0.144***	-0.019***	-0.302***
	total street segment length	0.192***	-0.163***	0.190***	-0.315***
	count of street segments	0.009	-0.134***	-0.070***	-0.285***
	average street segment length	0.365***	-0.044***	0.589***	-0.015*
	average street circuity	-0.028***	0.131***	-0.066***	0.225***

On-going analysis

Conclusions (Case study 1)

GNNs can be used as an unsupervised framework to explore urban form

merely a first exploratory study
- the design space is vast
- a systematic approach is necessary
testing can be particularly challenging
- no “ground-truth” labels

Future work

adaptability and usefulness through space, time and scale
encoding places beyond junctions, including buildings or points of interest
encoding flows beyond networks, including commuting or communications

Case study 2: spatial geodemographic classification

Geodemographic classification

Crucial tools in quantitative geography (Webber and Burrows 2018)

aim: better understand the places we live and how they change
- social sciences
- social policy
- urban planning
- business strategy
- marketing
methods: machine learning
- socio-demographic data (e.g. census)
- unsupervised clustering

earlier works
- Shevky and Williams (1949) and Shevky and Bell (1955)
modern geodemographics
- academic research
  - Webber and Craig (1976)
  - Webber and Craig (1978)
- commercial classifications
  - CACI’s Acorn (1979)
  - Experian’s Mosaic (1985)

Creating a classification

Can we automatically identify the two groups visible in the scatterplot, without any previous knowledge of the groups?

Methods:

Centroid-based
- k-means
- fuzzy c-means
Hierarchical
Mixed
- bootstrap aggregating
Density-based
- DBSCAN
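As an illustration of the centroid-based family, here is a minimal k-means sketch (Lloyd's algorithm) in NumPy, applied to two synthetic groups of points. The data, the random initialisation and the fixed number of iterations are assumptions for illustration; production classifications typically rely on a library implementation with multiple restarts.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: two well-separated groups of points in a 2D scatterplot.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b])

def k_means(X, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    init_rng = np.random.default_rng(seed)
    # Initialise centroids on k distinct data points.
    centroids = X[init_rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its points
        # (keep the old centroid if a cluster is empty).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

labels, centroids = k_means(X, k=2)
```

No group labels are given to the algorithm; the two groups are recovered from the geometry of the points alone, which is the sense in which the classification is unsupervised.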

Source: Office for National Statistics, Census 2021. Contains National Statistics data Crown copyright and database right 2022; Contains Ordnance Survey data Crown copyright and database right 2022.

Spatial geodemographics

Carver (1998) proposed adjusting fuzzy c-means membership based on neighbours
- after computation, adjust membership (\(m_i\)) of an areal unit (\(i\))
- spatial weights (\(w_{ij}\)) and parameters (\(\alpha, \beta, A\))

\[ m'_i=\alpha m_i+\beta\frac{1}{A}\sum_j^n{w_{ij}m_j} \]

Mason and Jacobson (2007) suggested adjusting membership at each iteration
Grekousis (2021) introduced a distance-based neighbourhood
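Carver's adjustment above can be written as a single matrix operation. The sketch below assumes a small chain of four areal units with binary weights, treats \(A\) as a scalar normalisation constant, and uses arbitrary parameter values; all of these are illustrative assumptions rather than values from the cited papers.

```python
import numpy as np

# Hypothetical fuzzy memberships m: 4 areal units (rows) x 2 classes (columns).
m = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.3, 0.7],
              [0.1, 0.9]])

# Hypothetical binary spatial weights w_ij for a chain of units 0-1-2-3.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

def adjust_membership(m, w, alpha, beta, A):
    """m'_i = alpha * m_i + beta * (1 / A) * sum_j w_ij * m_j, for all units at once."""
    return alpha * m + beta * (w @ m) / A

# Assumed parameter values, for illustration only.
m_adj = adjust_membership(m, w, alpha=0.5, beta=0.5, A=2.0)
```

Unit 1, which sits between a unit leaning towards the first class and one leaning towards the second, moves from (0.8, 0.2) to (0.7, 0.3) as its neighbours pull its membership towards theirs.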

Intuition: is membership update akin to graph convolution?

\[ h_{v}^{(l)} = \sigma \left( W^{(l)} \ {\scriptstyle COMBINE} \left( h_{v}^{(l-1)}, {\scriptstyle AGGREGATE} \left( \bigl\{ h_{u}^{(l-1)}, \forall u \in N(v) \bigr\} \right) \right) \right) \]
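One way to make the intuition concrete, as a sketch rather than a formal result: choose an identity weight matrix and activation, a weighted sum as COMBINE, and a normalised weighted sum as AGGREGATE, where the neighbourhood \(N(v)\) is taken to be the units with non-zero spatial weight.

\[ {\scriptstyle COMBINE}(a, b) = \alpha a + \beta b, \qquad {\scriptstyle AGGREGATE} \left( \bigl\{ h_{u}^{(l-1)}, \forall u \in N(v) \bigr\} \right) = \frac{1}{A} \sum_{u \in N(v)} w_{vu} h_{u}^{(l-1)}, \qquad W^{(l)} = I, \qquad \sigma(x) = x \]

\[ h_{v}^{(l)} = \alpha h_{v}^{(l-1)} + \beta \frac{1}{A} \sum_{u \in N(v)} w_{vu} h_{u}^{(l-1)} \]

With \(h = m\), \(v = i\) and \(u = j\), this has the same form as Carver's membership adjustment \(m'_i\). Under these assumptions, the membership update is a single graph-convolution step with fixed rather than learned parameters.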

NAGAE

(De Sabbata and Liu 2023)

Map data source: CDRC LOAC Geodata Pack by the ESRC Consumer Data Research Centre; Contains National Statistics data Crown copyright and database right 2015; Contains Ordnance Survey data Crown copyright and database right 2015.

Setup

Data

Greater London
- 25053 Output Areas (OAs)
167 census variables considered by Gale et al. (2016) for the 2011 Output Area Classification (OAC)
- created 167 z-scores
60 variables (60 k-vars)
- used by Gale et al. (2016) to create 2011 OAC
- and by Singleton and Longley (2015) to create the 2011 London Output Area Classification (LOAC)

Evaluation framework

Spatially clustered OAs
- based on join count on each class
Squared Euclidean Distance (SED) based on
- 60 k-vars
- 167 z-scores
Matching OAs
- overlap with LOAC
- not a quality measure
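The first two measures can be sketched as follows. The sketch counts same-class joins per class and sums squared Euclidean distances between each area and its class centroid; both are assumed readings of the measures named above, on hypothetical data, and the sketch omits any significance testing that a full join count analysis would include.

```python
import numpy as np

# Hypothetical example: 5 areas, their class labels, and undirected neighbour pairs.
labels = np.array([0, 0, 1, 1, 0])
joins = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]

# Hypothetical standardised attributes (rows: areas, columns: variables).
X = np.array([[0.0, 0.0],
              [0.2, 0.0],
              [1.0, 1.0],
              [1.2, 1.0],
              [0.1, 0.3]])

def same_class_joins(labels, joins):
    """Count, per class, the neighbouring pairs whose two areas share that class."""
    counts = {int(c): 0 for c in np.unique(labels)}
    for i, j in joins:
        if labels[i] == labels[j]:
            counts[int(labels[i])] += 1
    return counts

def total_sed(X, labels):
    """Sum of squared Euclidean distances between each area and its class centroid."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return float(total)

join_counts = same_class_joins(labels, joins)
sed = total_sed(X, labels)
```

More same-class joins indicate a more spatially clustered classification, while a lower SED indicates more homogeneous classes in attribute space; the two can pull in opposite directions.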

Models

Baselines
- 60 k-vars → spatial fuzzy c-means (SFCM)
- 167 z-scores → PCA(60) and k-means
- 167 z-scores → PCA(60) and SFCM
Graph representation
- 167 z-scores → Att2Vec
- 167 z-scores → Node2Vec
Graph neural networks
- 167 z-scores → GCN + CorrNet
- 167 z-scores → GraphSAGE
- 167 z-scores → GraphSAGE + CorrNet
- 167 z-scores → NAGAE (design space search^†)

Spatial graphs
- Queen contiguity (Queens)
- Eight nearest neighbours (KNN8)
- Maximum distance threshold (MDT)
  - 2,098 metres
  - i.e., minimum threshold allowing all OAs to have at least one neighbour
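The two distance-based graphs can be sketched in NumPy from a set of centroids. The coordinates below are random and hypothetical; Queen contiguity needs polygon geometries and is not covered here. The threshold is computed as the largest nearest-neighbour distance, which is the smallest value that leaves no unit without a neighbour.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical centroids of 20 areal units, in metres.
coords = rng.uniform(0, 5000, size=(20, 2))

# Pairwise Euclidean distances; the diagonal is set to infinity
# so that a unit is never its own neighbour.
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))
np.fill_diagonal(dist, np.inf)

def knn_graph(dist, k):
    """Directed k-nearest-neighbour lists: row i holds the indices of its k closest units."""
    return np.argsort(dist, axis=1)[:, :k]

def min_distance_threshold(dist):
    """Smallest threshold giving every unit at least one neighbour."""
    return dist.min(axis=1).max()

knn8 = knn_graph(dist, 8)
mdt = min_distance_threshold(dist)

# Boolean adjacency matrix: units within the threshold distance are neighbours.
mdt_adjacency = dist <= mdt
```

The k-nearest-neighbour graph gives every unit the same number of neighbours regardless of density, whereas the distance threshold gives units in dense areas many more neighbours than isolated ones.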

Results

Our best performing (NAGAE-d1)
- higher quality in spatial clustering
- comparable quality in class homogeneity
GNNs can create geodemographic classification
- incorporate neighbouring effects
- minimal attribute preprocessing
Results can vary starkly
- based on spatial graph and hyperparameters
- node attribute reconstruction helps to avoid oversmoothing

Result maps

Data source: CDRC LOAC Geodata Pack by the ESRC Consumer Data Research Centre; Contains National Statistics data Crown copyright and database right 2015; Contains Ordnance Survey data Crown copyright and database right 2015.

Conclusions (Case study 2)

Our GNN framework has the potential to develop into a wide range of approaches

Key challenges
- expertise in deep learning and new body of common practices
- no specific procedure to assess the optimal number of classes
- lower explainability
  - eXplainable AI (Xing and Sieber 2023; Liu, Zhang, and Biljecki)

Key opportunities
- new approach to fuzzy geodemographic classifications
- flexible approach to combine a wide range of variables
- use a diverse set of networks
  - commuting patterns, travel time
  - non-(only-)spatial relationships, such as virtual interactions

Some thoughts

Geospatial AI and Geographical AI

Graph Neural Networks hold great potential in urban analytics

Foundation models will be cornerstones of many future methods and studies

how do we adapt foundation models for geospatial applications? (Mai et al. 2023)

how do we ground such models in geographical theory and ethics to unlock a broader use of AI in geography? (De Sabbata, Ballatore, Miller, et al. 2023)
- “reason” with core geographical concepts
- broader, purposeful aim: to do geography with AI
- a more-than-quantitative approach (Bennett and De Sabbata 2023)

Thank you for your attention

Dr Stef De Sabbata (they/them)

Associate Professor of Geographical Information Science at the School of Geography, Geology and the Environment

Research theme lead for Cultural Informatics at the Institute for Digital Culture

University of Leicester, University Road, Leicester, LE1 7RH, UK

Contact: s.desabbata@le.ac.uk

Check out my GitHub repos at: github.com/sdesabbata

(De Sabbata, Ballatore, Liu, et al. 2023; De Sabbata and Liu 2023)

Additional slides

^† NAGAE design space search

Systematic search of the design space (You, Ying, and Leskovec 2020)

An encoder taking as input the 167 z-scores and composed of
- 1 preprocessing linear layer with either 167 or 60 output features
- a series of 2, 4, or 6 GAT convolutional layers with
  - 60 output features
  - 2, 4 or 8 attention heads
  - 0.0, 0.1, 0.2, 0.3, 0.4 or 0.5 dropout rate
- 1 or 2 post-processing linear layers with 60 output features.
A decoder composed of 1, 2 or 4 linear layers with
- either 60 or 167 output features (last always 167)
10368 possible designs → random sample 100 designs → random sample 100 designs (of the 1152) → final 288 designs.

NAGAE designs

NAGAE-d1
- among the simplest in our design space
- 1 preprocessing layer with 60 output features
- 2 GAT layers with 2 attention heads and a 0.0 edge dropout rate
- 1 post-processing layer
- 1 layer in the decoder
- trained for 1167 epochs, batches of 1024 nodes, using Queens spatial graph
NAGAE-d2
- similar to NAGAE-d1, but 2 post-processing layers and 0.5 dropout rate
- similar, slightly lower performance with Queens
- far better performance with KNN8 and MDT

GAT layer

Graph attentional operator defined by Veličković et al. (2018)

\[ \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}_{s}\mathbf{x}_{i} + \sum_{j \in N(i)} \alpha_{i,j}\mathbf{\Theta}_{t}\mathbf{x}_{j} \]

\[ \alpha_{i,j} = \frac{ \exp\left(\mathrm{LeakyReLU}\left( \mathbf{a}^{\top}_{s} \mathbf{\Theta}_{s}\mathbf{x}_i + \mathbf{a}^{\top}_{t} \mathbf{\Theta}_{t}\mathbf{x}_j \right)\right)} {\sum_{k \in N(i) \cup \{ i \}} \exp\left(\mathrm{LeakyReLU}\left( \mathbf{a}^{\top}_{s} \mathbf{\Theta}_{s}\mathbf{x}_i + \mathbf{a}^{\top}_{t}\mathbf{\Theta}_{t}\mathbf{x}_k \right)\right)} \]
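A single-head version of the two equations above can be sketched in NumPy. The graph, features and parameters are random and hypothetical, and the LeakyReLU slope of 0.2 is an assumed value; in practice the parameters are learned and several attention heads are combined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical graph and features: 4 nodes, 3 input features, 2 output features.
neighbours = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
x = rng.normal(size=(4, 3))

# Hypothetical parameters (random here; learned in practice).
theta_s = rng.normal(size=(2, 3))   # transform applied to the centre node
theta_t = rng.normal(size=(2, 3))   # transform applied to neighbouring nodes
a_s = rng.normal(size=2)
a_t = rng.normal(size=2)

def leaky_relu(z, slope=0.2):
    return np.where(z > 0, z, slope * z)

def gat_layer(x, neighbours):
    out = np.zeros((len(x), theta_s.shape[0]))
    alphas = {}
    for i in range(len(x)):
        ks = [i] + neighbours[i]                    # the node itself, then N(i)
        source = theta_s @ x[i]
        targets = np.stack([theta_t @ x[k] for k in ks])
        scores = leaky_relu(a_s @ source + targets @ a_t)
        weights = np.exp(scores - scores.max())     # numerically stable softmax
        weights /= weights.sum()
        # Self term uses theta_s, neighbour terms use theta_t, as in the equation.
        out[i] = weights[0] * source + weights[1:] @ targets[1:]
        alphas[i] = weights
    return out, alphas

x_new, attention = gat_layer(x, neighbours)
```

The attention coefficients for each node sum to one, so the layer computes a weighted average in which the weights depend on the features of both the node and each neighbour.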

GIN layer

Graph isomorphism operator defined by Xu et al. (2019)

\[ \mathbf{x}^{\prime}_i = h_{\mathbf{\Theta}} \left( (1 + \epsilon) \cdot \mathbf{x}_i + \sum_{j \in N(i)} \mathbf{x}_j \right) \]

where \(h_{\mathbf{\Theta}}\) is a multi-layer perceptron (MLP)
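The operator above can be sketched in NumPy as follows. The toy graph, the features and the small fixed-weight MLP are hypothetical choices that keep the arithmetic easy to check by hand.

```python
import numpy as np

# Hypothetical toy graph and node features.
neighbours = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0],
              [4.0, 0.0]])

# Hypothetical two-layer MLP h_Theta with fixed weights for readability.
W1 = np.array([[1.0, -1.0],
               [0.5, 0.5]])
W2 = np.array([[1.0, 1.0]])

def mlp(z):
    """Linear layer, ReLU, linear layer."""
    return W2 @ np.maximum(W1 @ z, 0.0)

def gin_layer(x, neighbours, eps=0.0):
    """x'_i = MLP((1 + eps) * x_i + sum of the neighbours' features)."""
    return np.stack([
        mlp((1.0 + eps) * x[i] + x[neighbours[i]].sum(axis=0))
        for i in range(len(x))
    ])

x_gin = gin_layer(x, neighbours)
```

Unlike the mean used in the GCN formulation earlier, the sum keeps information about how many neighbours a node has.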

GINE layer

Modified graph isomorphism operator defined by Hu et al. (2020a) to incorporate edge features

\[ \mathbf{x}^{\prime}_i = h_{\mathbf{\Theta}} \left( (1 + \epsilon) \cdot \mathbf{x}_i + \sum_{j \in N(i)} \mathrm{ReLU} ( \mathbf{x}_j + \mathbf{e}_{j,i} ) \right) \]

where \(h_{\mathbf{\Theta}}\) is a multi-layer perceptron (MLP)
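A minimal NumPy sketch of the edge-aware operator above follows. The directed edge list, the edge features and the single linear map standing in for the MLP are hypothetical; the sketch also assumes edge features with the same dimension as node features, so that they can be added directly.

```python
import numpy as np

# Hypothetical directed edges (j -> i), one feature vector e_{j,i} per edge.
edges = [(1, 0), (2, 0), (0, 1), (0, 2), (3, 2), (2, 3)]
edge_attr = np.array([[0.5, 0.0], [-3.0, 0.0], [0.5, 0.0],
                      [-3.0, 0.0], [1.0, 1.0], [1.0, 1.0]])

# Hypothetical node features.
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0],
              [4.0, 0.0]])

# Hypothetical single linear layer standing in for the MLP h_Theta.
W = np.array([[1.0, 1.0]])

def gine_layer(x, edges, edge_attr, eps=0.0):
    """x'_i = h((1 + eps) * x_i + sum over j in N(i) of ReLU(x_j + e_{j,i}))."""
    agg = np.zeros_like(x)
    for (j, i), e in zip(edges, edge_attr):
        agg[i] += np.maximum(x[j] + e, 0.0)   # message from j to i
    return ((1.0 + eps) * x + agg) @ W.T

x_gine = gine_layer(x, edges, edge_attr)
```

In the urban form case study, this is where an attribute such as street segment length can modulate the message passed between two junctions.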

References

Batty, Michael. 2008. “The Size, Scale, and Shape of Cities.” Science 319 (5864): 769–71. https://doi.org/10.1126/science.1151419.

Bennett, Katy, and Stefano De Sabbata. 2023. “Introducing a More-Than-Quantitative Approach to Explore Emerging Structures of Feeling in the Everyday.” Emotion, Space and Society 49: 100965.

Boeing, Geoff. 2020. “Global Urban Street Networks GraphML.” Harvard Dataverse. https://doi.org/10.7910/DVN/KA5HJ3.

Bruna, Joan, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. 2014. “Spectral Networks and Locally Connected Networks on Graphs.” https://arxiv.org/abs/1312.6203.

Carver, Steve. 1998. “Fuzzy Geodemographics: A Contribution from Fuzzy Clustering Methods.” In Innovations in GIS 5, 141–49. CRC Press.

De Sabbata, Stefano, Andrea Ballatore, Pengyuan Liu, and Nicholas J. Tate. 2023. “Learning Urban Form Through Unsupervised Graph-Convolutional Neural Networks.” In Proceedings of the 2nd International Workshop on Geospatial Knowledge Graphs and GeoAI: Methods, Models, and Resources.

De Sabbata, Stefano, Andrea Ballatore, Harvey J. Miller, Renée Sieber, Ivan Tyukin, and Godwin Yeboah. 2023. “GeoAI in Urban Analytics.” International Journal of Geographical Information Science 37 (12): 2455–63. https://doi.org/10.1080/13658816.2023.2279978.

De Sabbata, Stefano, and Pengyuan Liu. 2023. “A Graph Neural Network Framework for Spatial Geodemographic Classification.” International Journal of Geographical Information Science 37 (12): 2464–86. https://doi.org/10.1080/13658816.2023.2254382.

Gale, Christopher G, Alexander D Singleton, Andrew G Bates, and Paul A Longley. 2016. “Creating the 2011 Area Classification for Output Areas (2011 OAC).” Journal of Spatial Information Science 2016 (12): 1–27. https://doi.org/10.5311/JOSIS.2016.12.232.

Grekousis, George. 2021. “Local Fuzzy Geographically Weighted Clustering: A New Method for Geodemographic Segmentation.” International Journal of Geographical Information Science 35 (1): 152–74. https://doi.org/10.1080/13658816.2020.1808221.

Hamilton, Will, Zhitao Ying, and Jure Leskovec. 2017. “Inductive Representation Learning on Large Graphs.” In Advances in Neural Information Processing Systems, edited by I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett. Vol. 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/5dd9db5e033da9c6fb5ba83c7a7ebea9-Paper.pdf.

Hu, Weihua, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay Pande, and Jure Leskovec. 2020a. “Strategies for Pre-Training Graph Neural Networks.” https://arxiv.org/abs/1905.12265.

———. 2020b. “Strategies for Pre-training Graph Neural Networks.” arXiv. https://doi.org/10.48550/arXiv.1905.12265.

Kipf, Thomas N., and Max Welling. 2017. “Semi-Supervised Classification with Graph Convolutional Networks.” https://arxiv.org/abs/1609.02907.

Liu, Pengyuan, Yan Zhang, and Filip Biljecki. “Explainable Spatially Explicit Geospatial Artificial Intelligence in Urban Analytics.” Environment and Planning B: Urban Analytics and City Science 0 (0): 23998083231204689. https://doi.org/10.1177/23998083231204689.

Mai, Gengchen, Weiming Huang, Jin Sun, Suhang Song, Deepak Mishra, Ninghao Liu, Song Gao, et al. 2023. “On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence.” arXiv Preprint arXiv:2304.06798.

Mason, GA, and RD Jacobson. 2007. “Fuzzy Geographically Weighted Clustering.” In Proceedings of the 9th International Conference on Geocomputation, Maynooth, Eire, Ireland, 3–5.

Shevky, Eshref, and Wendell Bell. 1955. Social Area Analysis; Theory, Illustrative Application and Computational Procedures. Stanford University Press.

Shevky, Eshref, and Marilyn Williams. 1949. The Social Areas of Los Angeles. Berkeley, California, USA: University of California Press.

Singleton, Alex David, and Paul Longley. 2015. “The Internal Structure of Greater London: A Comparison of National and Regional Geodemographic Models.” Geo: Geography and Environment 2 (1): 69–87. https://doi.org/10.1002/geo2.7.

Veličković, Petar, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. “Graph Attention Networks.” https://arxiv.org/abs/1710.10903.

Webber, Richard, and Roger Burrows. 2018. The Predictive Postcode: The Geodemographic Classification of British Society. Sage.

Webber, Richard, and John Craig. 1976. “Which Local Authorities Are Alike.” Population Trends 5: 13–19.

———. 1978. Socio-Economic Classification of Local Authority Areas. 35. HM Stationery Office.

Xing, Jin, and Renee Sieber. 2023. “The Challenges of Integrating Explainable Artificial Intelligence into GeoAI.” Transactions in GIS 27 (3): 626–45. https://doi.org/10.1111/tgis.13045.

Xu, Keyulu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. “How Powerful Are Graph Neural Networks?” https://arxiv.org/abs/1810.00826.

You, Jiaxuan, Zhitao Ying, and Jure Leskovec. 2020. “Design Space for Graph Neural Networks.” In Advances in Neural Information Processing Systems, edited by H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, 33:17009–21. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2020/file/c5c3d4fe6b2cc463c7d7ecba17cc9de7-Paper.pdf.