OpenGeoHub summer school, Sep 1-3 2021

What is spherical geometry?

Earth-related coordinates of spatial data can be either

  • projected: measured on a two-dimensional surface, which is associated with a projection from the Earth’s surface
  • geocentric: x-, y-, and z-coordinates associated with the three principal directions, measured from Earth’s center
  • geographic: degrees latitude and longitude, associated with a datum (ellipsoidal or even spherical model of the Earth)

Geometric operations include computation of

  • measures (area, length/distance, direction, …),
  • predicates (intersects, covers, contains, touches, …) and
  • transformations (intersection, union, difference, symmetric difference, buffer)

Geometrical operations on projected or geocentric coordinates can be done using Euclidean geometry, where all lines are straight.

Spherical (or ellipsoidal) geometry operations are computed over the surface of the sphere (ellipsoid); lines connecting points are great circle arcs.

Why is this worth talking about?

  • I believe that applying Euclidian geometry to geographical coordinates is the number one most common error made in spatial data science (closely followed by ignoring the support of data).

  • technological advances have made spherical geometry a good option, but inertia in legacy GIS, OGC, and data science software have slowed down its adoption / uptake

  • We have gotten used to Plate Carree, and modified our datasets accordingly.

  • Even GeoJSON has written down that the world is 2D – Clause 3.1.9. Antimeridian Cutting:

    • In representing Features that cross the antimeridian, interoperability is improved by modifying their geometry. Any geometry that crosses the antimeridian SHOULD be represented by cutting it in two such that neither part’s representation crosses the antimeridian.

What is (so bad about) Plate Carrée?

Compared to other projections:

  • every global projection is miserable in its own way
  • it preserves shape (1 unit easting equals 1 unit Northing) only at the equator
  • it is hopeless at the poles and at the antimeridian

Compared to an ellipsoid, or sphere:

  • It is 2D. The world is round, bro.

Consider, for a moment, the difference between:

  • a flat projection of the sphere, and a sphere (1, 2)
  • a sphere and an ellipsoid with 1/300 flattening (2, 3)

drawing drawing drawing

Equirectangular projection: where meridians and parallels form equal rectangles (left; shape preserving at center), not squares (right: nowhere true)

drawing

Prior work

  • GeographicLib (C.F.F. Karney), “[…] a small set of C++ classes for […] solving geodesic problems.
  • PostGIS (liblwgeom)’s geography type
    • funded by palantir
    • feature list,
    • cheats: “The buffer and intersection functions are actually wrappers on top of a cast to geometry, and are not carried out natively in spherical coordinates. As a result, they may fail to return correct results for objects with very large extents that cannot be cleanly converted to a planar representation.

  • R package geosphere providing distance, bearing, etc, and several alternative measures
  • R packages sp, gstat, spdep using geodetic distances in case of geographic coordinates
  • R package lwgeom interfacing parts of liblwgeom

Modern solutions, open source libraries:

  • Google’s s2geometry (C++, Java, Go, Python)
  • H3: Uber’s Hexagonal Hierarchical Spatial Index

The situation with sf and stars

st_as_sfc("POLYGON((0 0,1 0,1 1,0 1,0 0))") %>% 
    st_area()
## [1] 1
st_as_sfc("POLYGON((0 0,1 0,1 1,0 1,0 0))", crs = "EPSG:4326") %>% 
    st_area() %>%
    set_units(km^2)
## 12364.04 [km^2]

\(\Rightarrow\) measures are computed over the sphere/ellipsoid

sf_use_s2(FALSE)
## Spherical geometry (s2) switched off
a <- st_as_sfc("POINT(7 53)", crs = "EPSG:4326") %>% 
    st_buffer(set_units(1, degree)) %>%
    st_transform("EPSG:25832") # UTM zone 32 N
## Warning in st_buffer.sfc(., set_units(1, degree)): st_buffer does not correctly
## buffer longitude/latitude data
b <- st_as_sfc("POINT(7 53)", crs = "EPSG:4326") %>% 
    st_transform("EPSG:25832") %>% # UTM zone 32 N
    st_buffer(set_units(111, km))

\(\Rightarrow\) wrong computations raise a warning

st_as_sfc("POINT(0 89)", crs = "EPSG:4326") %>% 
    st_buffer(set_units(5, degree)) %>%
    st_bbox()
## Warning in st_buffer.sfc(., set_units(5, degree)): st_buffer does not correctly
## buffer longitude/latitude data
## xmin ymin xmax ymax 
##   -5   84    5   94

\(\Rightarrow\) nonsense coordinates may come out

(you could get proper buffering by round-tripping through a projection; this is what PostGIS does)

How is spherical geometry different?

  • in contrast to the plane \(R^2\), the sphere (\(S^2\)) is bounded
  • space continues over the antimeridian, poles are points
  • in linestrings and polygons, points are connected by great circle arcs, not straight lines
  • where in \(R^2\) polygons have an unambiguous inside, on \(S^2\) polygons divide the sphere’s surface in two parts \(\Rightarrow\) ring direction disambiguates
  • the empty polygon has a complement: the full polygon.
  • two new bounding structures:
    • bounding cap: smalles circumfering circle (center point + radius)
    • bounding rectangle: lat/lng range that might cross the antimeridian

Plotting virtual globes

  • Google Earth
  • Google Maps: now has “enable/disable globe view” switches between azimuthal perspective and Web Mercator.
  • Cesium (can switch between Web Mercator, perspective and orthographic)

Analysing data with geographic coordinates

Measures

distance, area, direction, …, bounding box, bounding cap

Predicates

intersects, overlaps, covers, touches, is_within_distance, …

Transformations

intersection, union, difference, sym_difference, buffer

How do we do this? Welcome s2geometry

  • an open source library written and supported by Google, powering
    • Google Maps, Earth, Earth Engine
    • Google’s serverless SQL engine bigquery GIS
  • provides all the measures, predicates, transformations
  • allows several snapping to grid options, preserving validity
  • provides, and uses, for spatial index a space-filling hilbert curve, after projecting a sphere onto six cube-faces of an earth-cube

sf package with using s2

  • through R package s2, sf uses all s2geometry measures / predicates / transformations (*)
sf_use_s2(TRUE) # default since sf 1.0-0, released June 9, 2021
## Spherical geometry (s2) switched on
a <- st_as_sfc("POINT(7 53)", crs = "EPSG:4326") %>% 
    st_buffer(set_units(1, degree))

\(\Rightarrow\) no more long/lat warning!

(*) with some exceptions: st_relate, st_buffer approximates, …

Examples

sf_use_s2(TRUE)
a <- st_as_sfc("POINT(7 52)", crs = "EPSG:4326") %>% 
    st_buffer(set_units(1, degree)) %>% # No more warnings!
    st_transform("EPSG:25832") # UTM zone 32 N
b <- st_as_sfc("POINT(7 52)", crs = "EPSG:4326") %>% 
    st_buffer(set_units(1, degree), max_cells = 300) %>%
    st_transform("EPSG:25832") # UTM zone 32 N
c <- st_as_sfc("POINT(7 52)", crs = "EPSG:4326") %>% 
    st_buffer(set_units(1, degree), max_cells = 100) %>%
    st_transform("EPSG:25832") # UTM zone 32 N

\(\Rightarrow\) no longer raise a warning

Migrating sf to default to s2: challenges

  • \(\Rightarrow\) for complex shapes, what is a good default value for max_cells?
  • \(\Rightarrow\) when are buffers needed, when is st_is_within_distance enough?
  • st_is_within_distance in s2 is fast (indexed), but slow in lwgeom

Outlook

  • sf with s2 switched on: the default since June 2021.
  • sf_use_s2() queries, and can switch back and forth
  • best practices: use st_is_within_distance not st_buffer
  • many valid geometries on “plate carree” are hard to make valid on S2
  • replacing GEOS with s2geometry for geographic coordinates solves some problems, and introduces new ones: see sf issues on github
  • Mission: the world is round. GIS was wrong all the time.

References