Digital urban geographies

class: center, middle, inverse, title-slide

# Digital urban geographies
## The quantitative, the qualitative and the convolutional
### Stefano De Sabbata | <a href="https://sdesabbata.github.io/" style="color: white;">sdesabbata.github.io</a>
### 2021-02-25

---

class: center, middle

# the digital

***Information has always had geography***. *It is from somewhere; about somewhere; it evolves and is transformed somewhere; it is mediated by networks, infrastructures, and technologies: all of which exist in physical, material places.*

.referencenote[
Graham, M., De Sabbata, S., and Zook, M. A. (2015) [Towards a study of information geographies: (im)mutable augmentations and a mapping of the geographies of information](https://rgs-ibg.onlinelibrary.wiley.com/action/showCitFormats?doi=10.1002%2Fgeo2.8). Geo: Geography and Environment, 2(1), 88–105, doi: 10.1002/geo2.8.
]

*"It is now somehow obvious to state that the digital phenomena have radically transformed every aspect of human life. [...] **Digital platforms** are changing what constitutes **"the field"**: the rise of digital content comprises new forms of evidence with which to approach long-standing geographical concerns"*

.referencenote[
Ash, J., et al. (2018). [Digital Geographies](https://uk.sagepub.com/en-gb/eur/digital-geographies/book258271), SAGE Publications.
]

---

# Digital (urban) geographies

.pull-left[

### the quantitative

- access
- participation
- representativeness
- operationalisation
{{content}}

]
.pull-right[

![](../assets/images/mapping-multiculture/entropy-leicester.png)

]

### the qualitative

- everyday multiculture
{{content}}

### the convolutional

- Graph Convolutional Neural Networks
- Encoding spatio-temporal information

---

# Special thanks to...

.pull-left[

.large[

- [Dr Andrea Ballatore](https://aballatore.space/), Birkbeck, University of London
- [Dr Katy Bennett](https://www2.le.ac.uk/departments/geography/people/kjb33), University of Leicester
- [Dr Jonathan Bright](https://www.oii.ox.ac.uk/people/jonathan-bright/), Oxford Internet Institute, University of Oxford
- [Dr Zoe Gardner](https://www2.le.ac.uk/departments/geography/people/dr-zoe-gardner), University of Leicester
- [Prof Mark Graham](https://www.oii.ox.ac.uk/people/mark-graham/), Oxford Internet Institute, University of Oxford
- [Pengyuan Liu](https://geography.digital/), University of Leicester and [University of Helsinki](https://www.helsinki.fi/en/people/people-finder/pengyuan-liu-9426324)

]

]
.pull-right[

![](../assets/images/mapping-multiculture/entropy-leicester.png)

]

---
class: inverse, center, middle

# the quantitative

---

# Access

Understanding **geographies of access and enablement** provides important insights into the distribution of technologies and services that are essential for digital communication, participation, and representation.

.left-column-large[
![](../assets/images/oii/oii-internet-penetration.png)
]
.right-column-small[
 
.referencenote[
Graham, M., De Sabbata, S., and Zook, M. A. (2015) [Towards a study of information geographies: (im)mutable augmentations and a mapping of the geographies of information](https://rgs-ibg.onlinelibrary.wiley.com/action/showCitFormats?doi=10.1002%2Fgeo2.8). Geo: Geography and Environment, 2(1), 88–105, doi: 10.1002/geo2.8.
]
]

---

# Participation

Access to the internet is only one aspect of the complex network of factors that drive participation

.left-column-large[
![](../assets/images/oii/oii-github.png)
]
.right-column-small[
 
.referencenote[
Graham, M., De Sabbata, S., and Zook, M. A. (2015) [Towards a study of information geographies: (im)mutable augmentations and a mapping of the geographies of information](https://rgs-ibg.onlinelibrary.wiley.com/action/showCitFormats?doi=10.1002%2Fgeo2.8). Geo: Geography and Environment, 2(1), 88–105, doi: 10.1002/geo2.8.
]
]

---

# Participation

.pull-left[
Participation in knowledge production is also affected by non-geographic biases, which have an effect on geographic data

#### OpenStreetMap

- 95–98% of all contributions to OSM are produced by men
- differences in modes of contributions between men and women

]
.pull-right[
![](../assets/images/s10708-019-10035-z/10708_2019_10035_Fig4_HTML.png)
]

.referencenote[
Gardner, Z., Mooney, P., De Sabbata, S. et al. [Quantifying gendered participation in OpenStreetMap: responding to theories of female (under) representation in crowdsourced mapping](https://link.springer.com/article/10.1007/s10708-019-10035-z). GeoJournal 85, 1603–1620 (2020). doi: 10.1007/s10708-019-10035-z
]

---

# Representativeness

.pull-left[

.large[Representation shows similar biases to participation]

- Higher qualifications strongest factor
- Wealth (house prices) strong factor in both, more so for Wikipedia
- Twitter strongly influenced by the percentage of people aged 30–44 (positively) and of households with dependent children (negatively)
- Models account for only about 44–55% of variability
- Need for more explanatory factors, e.g., tourism-related activities
- Ethnic composition is not a factor in the UK

]
.pull-right[
![](../assets/images/tgis-12600/tgis-12600-london-wikipedia-twitter.png)

.referencenote[
Ballatore A., De Sabbata S. (2018) [Charting the Geographies of Crowdsourced Information in Greater London](https://link.springer.com/chapter/10.1007/978-3-319-78208-9_8). In Technologies for All. AGILE 2018. Lecture Notes in Geoinformation and Cartography. Springer, Cham. doi: 10.1007/978-3-319-78208-9_8
]

]

---

# Representativeness

.left-column-large[

![:scale 48%](../assets/images/tgis-12600/tgis-12600-london-twitter.png)
![:scale 48%](../assets/images/tgis-12600/tgis-12600-london-wikipedia.png)
![:scale 48%](../assets/images/tgis-12600/tgis-12600-london-wikipedia-twitter.png)
![:scale 48%](../assets/images/tgis-12600/tgis-12600-london-loac.png)

]
.right-column-small[

[London Output Area Classification](https://data.london.gov.uk/dataset/london-area-classification), see also:

Singleton, A. D. and Longley, P. (2015). [The internal structure of Greater London: a comparison of national and regional geodemographic models](https://rgs-ibg.onlinelibrary.wiley.com/doi/full/10.1002/geo2.7). Geo: Geography and Environment, 2(1):69–87. doi: 10.1002/geo2.7

]

---

# Representativeness

Twitter and Wikipedia have similar but distinct geographies, each representative only of itself

.pull-left[

![](../assets/images/tgis-12600/tgis-12600-london-wikipedia-twitter-model.png)

]
.pull-right[
![](../assets/images/tgis-12600/tgis-12600-london-wikipedia-twitter-sa.png)
]

---

# Representativeness

.pull-left[

London and L.A. are broadly similar, but each place and platform has its own idiosyncrasies

- Affluence has seemingly opposite effects in London and L.A.
- Ethnic composition has no explanatory power in London, while presence of white and Asian residents is associated with more data in L.A.
- The 30–44 age group makes a clear contribution to data variability in London, but it is not a factor in L.A.
- In London, the variability in Wikipedia is linked to up to 49% of that in Twitter, but only up to 6% in L.A.

]
.pull-right[
![:scale 95%](../assets/images/tgis-12600/tgis-12600-la-4paltforms-density.png)

]

---

# Operationalisation

.pull-left[

Content created by users on digital platforms is biased and varying in quality

- (how) can we use it for geographic research?
- can we “exploit” the bias?

![:scale 95%](../assets/images/oii/health-replicate.png)

.referencenote[
Bright, J., De Sabbata, S., Lee, S., Ganesh, B. and Humphreys, D.K., 2018. [OpenStreetMap data for alcohol research: Reliability assessment and quality indicators](https://www.sciencedirect.com/science/article/pii/S1353829217305804). Health & place, 50, pp.130-136. doi: 10.1016/j.healthplace.2018.01.009
]

]
.pull-right[
.center[
![:scale 80%](../assets/images/oii/health-quality-index.png)
]

]

---
class: inverse, center, middle

# the qualitative

---

# Mapping multiculture

An *(on-going)* mixed-methods exploration of the digital geographies of Leicester

.pull-left[
![:scale 95%](../assets/images/mapping-multiculture/entropy_as_diversity_map_300dpi.png)
]
.pull-right[
![:scale 95%](../assets/images/mapping-multiculture/ethnic_groups_sim_points_1ppp_300dpi.png)
]

---

# Mapping multiculture

.pull-left[

An integrated approach to mixed qualitative and quantitative methods

- Digital qualitative methods
    - Interviews
    - Qualitative social media analysis
- Results from qualitative analysis as a base for quantitative social media analysis
- A critical approach to quantitative analysis
- A self-reflexive analysis of the process

.referencenote[
See also: [Leverhulme Trust Newsletter, May 2019](https://www.leverhulme.ac.uk/sites/default/files/LT%20Newsletter%20May19%20Lo-res.pdf) *"Mapping multiculture: disrupting representations of an ethnically diverse city"*
]

]
.pull-right[
![](../assets/images/mapping-multiculture/MMc-Nandos264-wordcloud_nandos_outline.png)
]

---
class: inverse, center, middle

# the convolutional

---

# Deep learning in digital geographies

When analysing social media data

.left-column-smallish[

- **Qualitative** methods are nuanced but resource-intensive
  - Can only be reasonably applied to small samples
- **Quantitative** approaches can be applied to vast amounts of data, but they are blunt instruments
  - Difficult to adapt to specific cases, areas and topics

.referencenote[
Liu, P. and De Sabbata, S., 2021. [A graph-based semi-supervised approach to classification learning in digital geographies](https://www.sciencedirect.com/science/article/pii/S0198971520303161). Computers, Environment and Urban Systems, 86, p.101583. doi: 10.1016/j.compenvurbsys.2020.101583
]

]
.right-column-largish[

![](../assets/images/j-compenvurbsys-2020-101583/1-s2.0-S0198971520303161-gr1_lrg.jpg)
]

---

# Deep learning in digital geographies

Can we combine the nuance of qualitative analysis with the scalability of quantitative analysis in a single mixed-methods approach?

A semi-supervised neural network might be the way forward...

.center[
![:scale 85%](../assets/images/j-compenvurbsys-2020-101583/j-compenvurbsys-2020-101583-model.png)
]

---

# Multimodal autoencoder

Combine

- image representations 
    - similar to a Residual Neural Network (ResNet, see Mao et al., 2016)
- text representations 
    - Long Short-Term Memory Neural Network (LSTM) 
- into a joint representation
    - similar to a Correlational Neural Network (Corrnet, see Chandar et al., 2016)
        - minimise self-reconstruction error
        - minimise cross-reconstruction error between image and text
        - maximise correlation between hidden representations of both components

`$$\mathcal{J}_{\mathcal{Z}} = \sum^{N}_{i=1}(L(z_{i},g(h(z_{i})))+L(z_{i},g(h(x_{i})))+L(z_{i},g(h(y_{i}))))-\lambda corr(h(X),h(Y))$$`

`$$corr(h(X),h(Y)) = \frac{\sum^{N}_{i=1}(h(x_{i})-\overline{h(X)})(h(y_{i})-\overline{h(Y)})}{\sqrt{\sum^{N}_{i=1}(h(x_{i})-\overline{h(X)})^{2}\sum^{N}_{i=1}(h(y_{i})-\overline{h(Y)})^{2}}}$$`
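
As a rough illustration of the correlation term above, the sketch below computes `corr(h(X), h(Y))` for a single hidden unit in plain Python and plugs it into an objective of the same shape. The function names `hidden_correlation` and `corrnet_style_objective`, the toy activations and the value of `lam` are assumptions made for this example, not the code used in the paper.

```python
import math


def hidden_correlation(hx, hy):
    """Pearson correlation between two lists of hidden activations.

    hx[i] and hy[i] stand for h(x_i) and h(y_i) for a single hidden unit.
    """
    n = len(hx)
    mean_x = sum(hx) / n
    mean_y = sum(hy) / n
    # Numerator: sum of products of centred activations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(hx, hy))
    # Denominator: square root of the product of summed squared deviations
    var_x = sum((a - mean_x) ** 2 for a in hx)
    var_y = sum((b - mean_y) ** 2 for b in hy)
    return cov / math.sqrt(var_x * var_y)


def corrnet_style_objective(recon_losses, hx, hy, lam=1.0):
    """Sum of reconstruction losses minus lambda times the correlation."""
    return sum(recon_losses) - lam * hidden_correlation(hx, hy)


# Toy activations (hypothetical): the text view is a rescaled image view
image_hidden = [0.1, 0.4, 0.7, 1.0]
text_hidden = [0.2, 0.8, 1.4, 2.0]
print(hidden_correlation(image_hidden, text_hidden))
```

Because the correlation is subtracted, a higher agreement between the image and text representations lowers the objective, which is what the third bullet asks for.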

---

# Graph Convolutional Network

Node-level output: &nbsp;&nbsp;&nbsp; `$Z = f(X,A) = \textit{softmax}(H^{(L)})$`

`$X$` holds the autoencoder representation of each post, `$A$` is the graph adjacency matrix

Layer-wise propagation rule for GCN: &nbsp;&nbsp;&nbsp; `$H^{(L+1)} = \sigma(\hat{D}^{-\frac{1}{2}}\hat{A}\hat{D}^{-\frac{1}{2}}H^{(L)}W^{(L)})$`

- `$\hat{A}=A+I_N$`, where `$I_N$` is the identity matrix (adding self-loops to `$A$`)
- `$\hat{D}$` is the degree matrix of `$\hat{A}$`, with `$\hat{D}_{ii} = \sum_j\hat{A}_{ij}$`
- `$W^{(L)}$` is the trainable weight matrix of the `$L$`th layer of the neural network
- `$\sigma(\cdot)$` is a non-linear activation, using `$\mathrm{ReLU}(\cdot) = \max(0,\cdot)$`
- `$H^{(L)}$` is the activation matrix for the `$L$`th layer
    - `$H^{(0)}=X$`
    - `$H^{(L)}=\sigma(\hat{D}^{-\frac{1}{2}}\hat{A}\hat{D}^{-\frac{1}{2}}H^{(L-1)}W^{(L-1)})$`
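
A single propagation step can be sketched in plain Python as follows. This is a minimal illustration on a toy three-node graph, assuming dense list-of-lists matrices; the helper names (`matmul`, `normalised_adjacency`, `gcn_layer`) and all values are made up for the example and are not taken from the paper's implementation.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalised_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, with D the degree matrix of A + I."""
    n = len(adj)
    # Add self-loops so that each node keeps its own features
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j]
             for j in range(n)] for i in range(n)]


def gcn_layer(adj_norm, h, w):
    """One propagation step: ReLU(adj_norm @ h @ w)."""
    z = matmul(matmul(adj_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Toy path graph with three posts: 0 - 1 - 2
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
# Two input features per post and a 2 x 2 weight matrix (arbitrary values)
X = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
W0 = [[1.0, -1.0],
      [0.5, 0.5]]

A_norm = normalised_adjacency(A)
H1 = gcn_layer(A_norm, X, W0)
print(H1)
```

Each row of `H1` mixes a post's own features with those of its graph neighbours, which is how the graph structure enters the classifier.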
  
Cross-entropy error: &nbsp;&nbsp;&nbsp; `$\mathcal{L}=-\sum_{l\in \mathcal{Y}_{L}}\sum_{f=1}^{F}\mathcal{Y}_{lf}\ln{Z_{lf}}$`

- `$\mathcal{Y}_L$` is the set of nodes that have labels.
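
The semi-supervised aspect lies in the loss being summed over labelled nodes only. A minimal sketch, assuming class labels are stored as integer indices and unlabelled nodes are marked with `None` (both conventions, like the function names, are choices made for this example):

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_entropy(scores, labels):
    """Cross-entropy summed over labelled nodes only.

    scores: one row of class scores per node (the final-layer activations).
    labels: class index per node, or None where the node is unlabelled.
    """
    loss = 0.0
    for row, label in zip(scores, labels):
        if label is None:
            continue  # unlabelled nodes do not contribute to the loss
        probs = softmax(row)
        loss -= math.log(probs[label])  # one-hot Y picks a single term
    return loss


# Three nodes, two classes; only nodes 0 and 2 are labelled (toy values)
scores = [[2.0, 0.0], [0.3, 0.1], [0.0, 0.0]]
labels = [0, None, 1]
loss = masked_cross_entropy(scores, labels)
print(loss)
```

Unlabelled nodes still influence training, because their features flow to labelled neighbours through the propagation rule, but they add no term to the error.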

---

# Deep learning, spatio-temporally

.pull-left[

![](../assets/images/j-compenvurbsys-2020-101583/1-s2.0-S0198971520303161-gr3.jpg)

]
.pull-right[

![](../assets/images/j-compenvurbsys-2020-101583/1-s2.0-S0198971520303161-gr4.jpg)
.referencenote[
Liu, P. and De Sabbata, S., 2021. [A graph-based semi-supervised approach to classification learning in digital geographies](https://www.sciencedirect.com/science/article/pii/S0198971520303161). Computers, Environment and Urban Systems, 86, p.101583. doi: 10.1016/j.compenvurbsys.2020.101583
]

]

---

# Deep learning, spatio-temporally

Results of the experiments using a Minimum Spanning Tree (3 km left, 4 km right)

.pull-left[

![](../assets/images/j-compenvurbsys-2020-101583/1-s2.0-S0198971520303161-gr6_lrg.jpg)

]
.pull-right[

![](../assets/images/j-compenvurbsys-2020-101583/1-s2.0-S0198971520303161-gr8_lrg.jpg)
]
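
One way to picture a distance-limited spanning tree is sketched below. It assumes that the 3 km and 4 km values act as a cut-off on candidate edge length, which the slide does not spell out, and that post locations are given as planar coordinates in kilometres; the function name `thresholded_mst` and the toy points are hypothetical. It uses Kruskal's algorithm and is not the paper's implementation.

```python
import math


def thresholded_mst(points, max_km):
    """Kruskal's algorithm restricted to edges no longer than max_km.

    points: list of (x, y) positions in kilometres on a projected plane.
    Returns a list of (i, j, distance) edges forming a spanning forest.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    candidates = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= max_km:
                candidates.append((d, i, j))

    edges = []
    for d, i, j in sorted(candidates):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            edges.append((i, j, d))
    return edges


# Hypothetical post locations (km); the last one is far from the others
posts = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (10.0, 10.0)]
edges_3km = thresholded_mst(posts, max_km=3.0)
print(edges_3km)
```

With a cut-off, posts further than the threshold from every other post stay disconnected, so the result is a forest rather than a single tree.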

---

# Deep learning, spatio-temporally

Ultimately, our results illustrate the advantages (necessity?) of understanding geo-located social media posts as geographic events

| Model input                          | Representation Extractor | Model                                     | Accuracy | Micro-F1 Score |
|--------------------------------------|--------------------------|-------------------------------------------|----------|----------------|
| A-spatial with Images and text       | Multi-modal Autoencoder  | SVM (no graph structure)                  | 15.87%   | 9.13%          |
| A-spatial with Images and text       | Multi-modal Autoencoder  | GCN (Cycle Graph)                         | 68.63%   | 65.94%         |
| Spatial with Images and text         | Multi-modal Autoencoder  | GCN (Weighted MST (3 km))                 | 73.57%   | 72.89%         |
| Spatio-temporal with Images and text | Multi-modal Autoencoder  | GCN (StN (temporally-weighted, 4 km))     | 78.98%   | 76.72%         |
| Spatio-temporal with Images and text | Multi-modal Autoencoder  | GCN (StN (distance-temp.-weighted, 4 km)) | 80.08%   | 78.65%         |

---
class: inverse, center, middle

# More on geography and deep learning

---

# GeoConvolution

Adapting the idea of a convolutional neural network to statistical analysis of area units

- **GeoConvolution**: a custom Lambda layer computing a weighted average over the geographic neighbourhood
- **GeoBatch**: geographic selection of each batch
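
The weighted neighbourhood average can be sketched without any deep learning framework. In the example below, the function name `geo_convolution`, the inclusion of each area in its own neighbourhood and the toy weights are all assumptions made for illustration; the actual layer is a custom Lambda layer whose details are in the paper.

```python
def geo_convolution(features, neighbours):
    """Weighted average of the features of each area's neighbourhood.

    features: dict mapping an area id to a list of attribute values.
    neighbours: dict mapping an area id to {neighbour id: weight}.
    """
    output = {}
    for area, weights in neighbours.items():
        total = sum(weights.values())
        n_attrs = len(features[area])
        output[area] = [
            sum(w * features[other][k] for other, w in weights.items()) / total
            for k in range(n_attrs)
        ]
    return output


# Three hypothetical output areas with two attributes each
features = {"a": [1.0, 0.0], "b": [3.0, 2.0], "c": [5.0, 4.0]}
# Each area weights itself and its adjacent areas (toy weights)
neighbours = {
    "a": {"a": 2.0, "b": 1.0},
    "b": {"a": 1.0, "b": 2.0, "c": 1.0},
    "c": {"b": 1.0, "c": 2.0},
}
smoothed = geo_convolution(features, neighbours)
print(smoothed)
```

The output has the same shape as the input, so it can be fed to the following layers in place of the raw area attributes.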

.left-column-large[
.center[
![](../assets/images/deep-learning-geodemo/geographic-convolution-and-batch.png)
]
]
.right-column-small[
.referencenote[
De Sabbata, S. and Liu, P., 2019. [Deep learning geodemographics with autoencoders and geographic convolution](https://agile-online.org/images/conference_2019/documents/short_papers/90_Upload_your_PDF_file.pdf). In Proceedings of the 22nd AGILE conference on Geographic Information Science, Limassol, Greece.
]

]

---

# Reproducing the LOAC 2011

.left-column-smallish[
.center[
![:scale 90%](../assets/images/deep-learning-geodemo/geodemo-loac.png)
![:scale 90%](../assets/images/deep-learning-geodemo/geodemo-geoconv-map.png)
]
]
.right-column-largish[

- A chi-square test shows a significant association between the 2011 LOAC and the GCNN output, `$\chi^2 (49) = 61881$`, `$p < 0.001$`.
- Similar squared Euclidean distance (SED): the 2011 LOAC scored 0.6999, the GCNN output scored 0.7005.

.center[
![:scale 65%](../assets/images/deep-learning-geodemo/geodemo-geoconv-alluvial.png)
]
]
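
For readers less familiar with the test, the sketch below computes a Pearson chi-square statistic from a cross-tabulation of two classifications. The counts are made up and the function name `chi_square` is an assumption for this example. The 49 degrees of freedom reported above are consistent with an 8 x 8 table, since (8 - 1) x (8 - 1) = 49.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two classifications were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Toy 2 x 2 cross-tabulation of two classifications (made-up counts)
toy = [[30, 10],
       [10, 30]]
statistic, dof = chi_square(toy)
print(statistic, dof)
```

A large statistic relative to the degrees of freedom indicates that areas assigned to a group by one classification tend to fall into a matching group in the other.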

---
class: bottom

background-image: url(../assets/images/mina-catching-the-snow.png)
background-size: cover

# Thanks!

Slides created via the R package [**xaringan**](https://github.com/yihui/xaringan).

The chakra comes from [remark.js](https://remarkjs.com), [**knitr**](https://yihui.org/knitr), and [R Markdown](https://rmarkdown.rstudio.com).

---

# Contacts and acknowledgements

.bottom[

.pull-left[

.large[Get in touch!]

👋😊

- Email me at [s.desabbata@le.ac.uk](mailto:s.desabbata@le.ac.uk)
- [sdesabbata.github.io](http://sdesabbata.github.io/) is my website

You can find me

- [@maps4thought](https://twitter.com/maps4thought) on Twitter
- [sdesabbata](https://github.com/sdesabbata) on GitHub
- As well as on
    - [ResearchGate](https://www.researchgate.net/profile/Stefano-De-Sabbata)
    - [Academia.edu](https://leicester.academia.edu/StefanoDeSabbata)
    - [Google Scholar](https://scholar.google.com/citations?user=VcSXvCYAAAAJ&hl=en)
    - [LinkedIn](https://www.linkedin.com/in/stefanodesabbata/?originalSubdomain=uk)

]
.pull-right[
.center[![:scale 50%](../assets/images/mapping-multiculture/entropy-leicester.png)]

.referencenote[
Images, maps and results included in these slides contain public sector information from Office for National Statistics and Ordnance Survey licensed under the [Open Government Licence v3.0](http://www.nationalarchives.gov.uk/doc/open-government-licence). Data from the [GH Archive](https://www.gharchive.org/) under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/), [OpenStreetMap](https://www.openstreetmap.org/) under [ODbL](http://www.openstreetmap.org/copyright), [Twitter](https://twitter.com/) under the [Developer Agreement](https://developer.twitter.com/en/developer-terms/agreement), [Wikipedia](https://en.wikipedia.org/wiki/Main_Page) under [CC BY 3.0](https://creativecommons.org/licenses/by/3.0/) and the [World Bank Open Data](https://data.worldbank.org/) portal under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/). Map tiles by [Stamen Design](http://maps.stamen.com/#toner/12/37.7706/-122.3782), under [CC BY 3.0](https://creativecommons.org/licenses/by/3.0/).
]

]