`mss`

PackageThis vignette introduces an R package for meaningful spatial statistics,

```
library(mss)
```

The package introduces three classes for spatial data: `SField`

for representing geostatistical data (spatial fields), `SObjects`

for point patterns (spatial objects), and `SLattice`

for lattice
(areal) data (spatial lattices). The abbreviations follow this
paper
(which is under review).

`SExtent`

: extent (window) of field or point pattern`SField`

and `SObjects`

objects contain an `SExtent`

, representing
the domain or extent of the field, or the window of observation
for a point pattern.

The window class defines an area:

```
showClass("SExtent")
```

```
## Class "SExtent" [package "mss"]
##
## Slots:
##
## Name: area
## Class: Spatial
##
## Extends: "SExtentOrNULL"
```

The area can be represented either by a grid, as in

```
library(sp)
demo(meuse, ask = FALSE, echo = FALSE)
w2 = SExtent(meuse.grid) # note the grid is passed, not the polygons
```

or by a polygon or set of polygons, as in

```
w1 = SExtent(meuse.area)
```

`SField`

: geostatistical dataSFields represent *continuous* functions, which over a
domain \(D\) have a value at every *point* \(s \in D\).

```
showClass("SField")
```

```
## Class "SField" [package "mss"]
##
## Slots:
##
## Name: observations domain cellsArePoints
## Class: Spatial SExtentOrNULL logical
```

Observations on fields variables are often done on irregular
point locations. Such a field variable is created from a
`SpatialPointsDataFrame`

object:

```
sf = SField(meuse, meuse.grid)
```

where, here, `meuse.grid`

defines the domain for the
field. Alternatively, *point observations* can be gridded, and
defined as a field by

```
sf = SField(meuse.grid, meuse.grid, cellsArePoints = TRUE)
```

Note that the first argument defines the point observations, and the second argument the domain: the first argument is treated as a (gridded) set of points (grid cell centres), the second as a gridded set of areas (cells).

We can alternatively assume that each grid cell represents an area with constant point values. Such a field is defined by:

```
sf = SField(meuse.grid, meuse.grid, cellsArePoints = FALSE)
```

This field is completely known, for every point location. Note
that grid cell values are *constant throughout the cell*, and are
not conceived as cell aggregate (e.g. average, or otherwise
convoluted) values. For that case, `SpatialAggregates`

are used.

Sets of points can not only be defined by grid cells, but also through lines, or polygons. For instance, points on a contour line could form a sample of a field:

```
library(maptools)
```

```
## Checking rgeos availability: TRUE
```

```
library(rgeos)
```

```
## rgeos version: 0.3-15, (SVN revision (unknown))
## GEOS runtime version: 3.4.2-CAPI-1.8.2 r3921
## Linking to sp version: 1.2-1
## Polygon checking: TRUE
```

```
cl = ContourLines2SLDF(contourLines(as.image.SpatialGridDataFrame(
meuse.grid["dist"])))
proj4string(cl) = CRS(proj4string(meuse.grid))
sf.lines = SField(cl, meuse.grid)
```

or, alternatively, as a set of polygons; in this example, a single polygon with a constant value:

```
pol = addAttrToGeom(meuse.area, data.frame(value = 3), FALSE)
sf.pol = SField(pol, meuse.area)
```

`SObjects`

: point patterns or objectsThe `SObjects`

class can be used to represent objects in a space,
where the *window of observation* reflects the area for which all
points or objects are available:

```
showClass("SObjects")
```

```
## Class "SObjects" [package "mss"]
##
## Slots:
##
## Name: observations window
## Class: Spatial SExtentOrNULL
```

We can create a point pattern e.g. for the sample locations of the `meuse`

data set:

```
pts = geometry(meuse)
se.pts = SObjects(pts, meuse.area)
```

but also areas can indicate entities:

```
se.area = SObjects(meuse.area, meuse.area)
```

`SLattice`

: areal dataAggregations refer to values, measured or computed as the aggregate
over a set of (point) values, e.g. the mean, maximum or minimum value.
As such, aggregations need to be defined for *sets of points*, i.e.
for lines, polygons, or grid cells:

```
showClass("SLattice")
```

```
## Class "SLattice" [package "mss"]
##
## Slots:
##
## Name: observations
## Class: Spatial
```

*Sets of points* can be represented by anything but points: lines, grids,
and polygons:

```
sa.1 = SLattice(cl)
sa.3 = SLattice(meuse.grid)
sa.2 = SLattice(pol)
```

Spatial grid data can be conceived as having values for 1. points, at grid cell centres 2. constant values throughout each grid cell 3. aggregated values over the grid cell region The difference is illustrated by querying each at two points, one coinciding with a grid cell centre, one not coinciding. The two points are indicated by red circles:

```
xy = rbind(coordinates(meuse)[1,], coordinates(meuse.grid)[10,])
pts = SpatialPoints(xy, CRS(proj4string(meuse.grid)))
plot(as(meuse.grid[1:21,], "SpatialPolygons"))
plot(pts, col = 'red', pch = 16, add = TRUE)
```

```
## Error in plot.xy(xy.coords(x, y), type = type, ...): plot.new has not been called yet
```

```
plot(meuse.grid[1:21,], add = TRUE)
```

```
## Error in segments(p[, 1], p[, 2], p[, 3], p[, 4], col = col, lty = lty, : plot.new has not been called yet
```

```
text(coordinates(pts), c("1", "2"), pos = 2)
```

```
## Error in text.default(coordinates(pts), c("1", "2"), pos = 2): plot.new has not been called yet
```

Querying a spatial field where cells are points gives non-missing values only when the point coincides with the grid cell centre:

```
f1 = SField(meuse.grid, meuse.area, cellsArePoints = TRUE)
pts = SField(pts, meuse.grid)
over(pts, f1)
```

```
## part.a part.b dist soil ffreq
## 1 NA NA NA <NA> <NA>
## 2 1 0 0.0122243 1 1
```

Querying a spatial field where cells are constant valued areas gives the grid cell values of the cells where the point is in:

```
f2 = SField(meuse.grid, meuse.area, cellsArePoints = FALSE)
over(pts, f2) # matches pts to the complete grid cell
```

```
## part.a part.b dist soil ffreq
## 1 1 0 0.00135803 1 1
## 2 1 0 0.01222430 1 1
```

Querying a grid where cell values are aggregations does not answer and generates a warning:

```
a = SLattice(meuse.grid)
over(pts, a)
```

```
## Warning: deriving field values from aggregations is not considered
## meaningful
```

```
## NULL
```

We can predict to points, laid out regularly, inside the polygon of the field domain:

```
m = SField(meuse, meuse.area)
v = gstat::vgm(.6, "Sph", 900, .05) # variogram model
i1 = interpolate(log(zinc)~1, m, model = v)
```

```
## [using ordinary kriging]
```

```
spplot(i1, "var1.pred")
```

To show that these really concern points, plotting for a smaller region gives actual point symbols:

```
spplot(i1[1:300,], "var1.pred")
```

Alternatively, we can predict values on a pre-defined grid, taken from the domain:

```
m = SField(meuse, meuse.grid)
i2 = interpolate(log(zinc)~1, m, model = v)
```

```
## [using ordinary kriging]
```

```
print(as(i2[1:3,], "data.frame"), digits = 10) # point support
```

```
## var1.pred var1.var x y
## 1 6.501248333 0.3222480995 181180 333740
## 2 6.624055036 0.2534788321 181140 333700
## 3 6.506542742 0.2747544736 181180 333700
```

From point values, we can predict the polygon area mean value:

```
library(rgeos)
m3 = SLattice(meuse.area)
i3 = interpolate(log(zinc)~1, m, m3, model = v) # interpolate from point TO polygon support
```

```
## [using ordinary kriging]
```

```
i3$var1.pred # a single value, estimate of mean for the whole area
```

```
## [1] 5.716157
```

```
mean(log(m$zinc)) # sample mean
```

```
## [1] 5.885776
```

```
mean(i1$var1.pred) # mean of predictions, similar to i2$var1.pred
```

```
## [1] 5.706189
```

or predict values for grid cell averages (block kriging):

```
m4 = SLattice(meuse.grid)
i4 = interpolate(log(zinc)~1, m, m4, model = v) # interpolate from point TO feature (cell) support
```

```
## [using ordinary kriging]
```

```
print(as(i4[1:3,], "data.frame"), digits = 10) # grid cells support
```

```
## var1.pred var1.var x y
## 1 6.500767419 0.2521655376 181180 333740
## 2 6.623255277 0.1837112271 181140 333700
## 3 6.505998695 0.2048321051 181180 333700
```

```
spplot(i4, "var1.pred")
```

Suppose we have grid data on a 3 x 3 grid, and want to estimate on a new location, indicated by a red circle:

```
set.seed(131)
gt = GridTopology(cellcentre.offset = c(0,0), cellsize = c(1,1), cells.dim = c(3,3))
sgdf = SpatialGridDataFrame(SpatialGrid(gt),
data.frame(r = round(runif(9, max = 10), 1)))
plot(as(sgdf, "SpatialPolygons"), axes = TRUE)
text(coordinates(sgdf), labels = sgdf[[1]])
```

```
## Error in text.default(coordinates(sgdf), labels = sgdf[[1]]): plot.new has not been called yet
```

```
pt.red = SpatialPoints(cbind(0.25, 0.25))
points(pt.red, col = 'red', pch = 1)
```

```
## Error in plot.xy(xy.coords(x, y), type = type, ...): plot.new has not been called yet
```

The first case is when the grid cell data represent
point support values at the grid cell centre, which
is identical to case `i1`

above:

```
sf = SField(as(sgdf, "SpatialPointsDataFrame"), sgdf)
vm = gstat::vgm(1, "Exp", 1)
i5 = interpolate(r~1, sf, SField(pt.red, sgdf), model = vm)
```

```
## [using ordinary kriging]
```

```
as(i5[1,], "data.frame")
```

```
## coords.x1 coords.x2 var1.pred var1.var
## coords.x1 0.25 0.25 4.741938 0.4233456
```

The second case, grid cells represent constant area with point support value, meaning that every point location within a grid cell has the value of the grid cell assigned:

```
sf = SField(sgdf, sgdf, cellsArePoints = FALSE)
i6 = interpolate(r~1, sf, SField(pt.red, sgdf), model = vm)
as(i6[1,], "data.frame") # does not interpolate, but queries grid cell
```

```
## r coords.x1 coords.x2
## 1 5.2 0.25 0.25
```

as case `i4`

above, from point values we can predict a mean value for a grid cell by

```
sf = SField(sgdf, sgdf, cellsArePoints = TRUE)
gt = GridTopology(cellcentre.offset = c(.25,.25), cellsize = c(1,1), cells.dim = c(1,1))
grd.red = SpatialGrid(gt)
i7 = interpolate(r~1, sf, SLattice(grd.red), model = vm)
```

```
## [using ordinary kriging]
```

```
i7@observations@data
```

```
## var1.pred var1.var
## 1 4.609252 0.1198707
```

```
sp = as(sgdf, "SpatialPolygonsDataFrame")
gf = SLattice(sp)
kr = interpolate(r~1, gf, SField(pt.red, sp), model = vm)
kr@observations[[1]]
```

```
## [1] 5.356593
```

```
library(gstat)
krige0(r ~ 1, sp, pt.red, vgmArea, vgm = vm) # check
```

```
## [,1]
## 1 5.356593
```

```
sp = as(sgdf, "SpatialPolygonsDataFrame")
gr = as(grd.red, "SpatialPolygons")
grd = addAttrToGeom(gr, data.frame(r=0), FALSE)
plot(as(sgdf, "SpatialPolygons"), axes = TRUE)
text(coordinates(sgdf), labels = sgdf[[1]])
```

```
## Error in text.default(coordinates(sgdf), labels = sgdf[[1]]): plot.new has not been called yet
```

```
plot(grd, add = TRUE, border = 'red')
```

```
## Error in polypath(x = mcrds[, 1], y = mcrds[, 2], border = border, col = col, : plot.new has not been called yet
```

```
kr = interpolate(r~1, gf, SLattice(grd), model = vm)
kr@observations[[1]]
```

```
## [1] 5.029828
```

```
krige0(r ~ 1, sp, grd, vgmArea, vgm = vm) # check
```

```
## [,1]
## 1 5.029828
```

For `SpatialEntity`

objects, we can estimate the density of objects; this
method reuses `MASS:kde2d`

:

```
e = SObjects(meuse, meuse.area)
d = density(e, newdata = meuse.grid)
lt = list(list("sp.polygons", meuse.area, border = 'black',
first=FALSE), list("sp.points", meuse, col = grey(.5), pch = 16))
class(d)
```

```
## [1] "SLattice"
## attr(,"package")
## [1] "mss"
```

```
spplot(d, sp.layout = lt)
```

The reason why this returns a `SLattice`

is that the
values computed (densities) do not correspond to quantities that are
measurable at point locations where they are registered (grid cells).
In fact, they are aggregated values for areas even much larger than
the grid cells.

`SField`

data```
m = SField(meuse["zinc"], meuse.area)
a = aggregate(m, SLattice(meuse.area), mean) # no warning
a[[1]]
```

```
## [1] 469.7161
```

```
a = aggregate(m, SLattice(meuse.area), sum) # warns:
```

```
## Warning: for SField objects, aggregation using a sum function is not
## considered meaningful
```

```
a[[1]]
```

```
## [1] 72806
```

`SpatialEntity`

data```
m = SObjects(meuse["zinc"], meuse.area)
a = aggregate(m, SLattice(meuse.area), mean) # warns:
```

```
## Warning: for SObjects objects, aggregation using a non-sum function may not
## be meaningful
```

```
a[[1]]
```

```
## [1] 469.7161
```

```
a = aggregate(m, SLattice(meuse.area), sum) # no warning
a[[1]]
```

```
## [1] 72806
```

Pebesma et al. (2014) make assertions A1-A7.

**A prediction is meaningful, if it provides an estimate for a potential observation
Prediction on a field variable generates no warning.**

Where prediction of a `SField`

varialbe generates no warning,

```
m = SField(meuse, meuse.area) # data are point support
res = interpolate(log(zinc)~1, m) # interpolate from point TO point support
```

```
## [inverse distance weighted interpolation]
```

prediction of a point pattern variable generates a warning:

```
m = SObjects(meuse, meuse.area)
sa = SLattice(meuse.grid)
res = interpolate(log(zinc)~1, m, sa) # interpolate point pattern variable: warns
```

```
## Warning: interpolation of SObjects is not considered meaningful
```

**Summing up values over an area is meaningful, if the observed window corresponds to the target geometry (grouping predicate) of the aggregation.**

For `SpatialEntity`

data, spatial aggregation over an area larger than the
observation window generates a warning:

```
x = bbox(meuse)[1,]
y = bbox(meuse)[2,]
bb = Polygon(cbind(c(x[1],x[1],x[2],x[2],x[1]), c(y[1],y[2],y[2],y[1],y[1])))
bbx = SpatialPolygons(list(Polygons(list(bb),"bbox_meuse")),
proj4string = CRS(proj4string(meuse)))
m = SObjects(meuse["zinc"], meuse.area)
a = aggregate(m, SLattice(bbx), sum) # warns:
```

```
## Warning: aggregation over an area larger than the observation window is not
## considered meaningful
```

Stasch et al. 2014 prove A1 and A2 more formally.

Further assertions in the poster:

**Polygon data may reflect constant values over areas (geology) or means
(population density), this matters for (further) aggregation or downsampling.**

Package `mss`

distinguishes between `SField`

and
`SLattice`

to represent constant values, and aggregations,
respectively. Subsampling `SField`

variabls does not not warn,

```
area = addAttrToGeom(meuse.area, data.frame(v=1), match.ID = FALSE)
f = SField(area, meuse.area)
pts = SField(meuse[1:5,], meuse.area)
over(pts, f) # retrieve f at the points of pts:
```

```
## v
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
```

but subsampling `SLattice`

variables triggers a warning, and no
useful value

```
a = SLattice(area)
over(pts, a)
```

```
## Warning: deriving field values from aggregations is not considered
## meaningful
```

```
## NULL
```

as it should really disaggregate first, from area to point.

**The same is true for grid cells.**

`SField`

and `SLattice`

both generalize polygon and grid data.

```
f1 = SField(meuse.grid, meuse.area, cellsArePoints = TRUE)
pts = SField(meuse[1:5,], meuse.area)
over(pts, f1) # matches pts to grid cell center points:
```

```
## part.a part.b dist soil ffreq
## 1 NA NA NA <NA> <NA>
## 2 NA NA NA <NA> <NA>
## 3 NA NA NA <NA> <NA>
## 4 NA NA NA <NA> <NA>
## 5 NA NA NA <NA> <NA>
```

```
f2 = SField(meuse.grid, meuse.area, cellsArePoints = FALSE)
over(pts, f2) # matches pts to the complete grid cell
```

```
## part.a part.b dist soil ffreq
## 1 1 0 0.00135803 1 1
## 2 1 0 0.01222430 1 1
## 3 1 0 0.10302900 1 1
## 4 1 0 0.19009400 2 1
## 5 1 0 0.27709000 2 1
```

```
a = SLattice(meuse.grid)
over(pts, a)
```

```
## Warning: deriving field values from aggregations is not considered
## meaningful
```

```
## NULL
```

**Point pattern data often come without observation window, over which aggregation is meaningful**

Attempting to create a `SObjects`

object without an observation window specified gives
a warning message:

```
pts <- SObjects(meuse)
```

```
## Warning: window set to the observation data features
```

Providing a window that does not contain all the points gives an error:

```
tr = try(pts <- SObjects(meuse, meuse[1:10,]))
attr(tr, "condition")
```

```
## <simpleError in validityMethod(object): one or more features are outside the observation window>
```

**Field measurements often come without a notion for which locations interpolations based on them make sense.**

```
m = SField(meuse["zinc"], meuse.area)
a = interpolate(log(zinc)~1, m, SLattice(bbx)) # warns:
```

```
## Warning: interpolating over an area larger than the domain is not
## considered meaningful
```

```
## [inverse distance weighted interpolation]
```

**File types do not inform on meaningfulness**

`SpatialPoints`

, `SpatialPixels`

, `SpatialGrid`

, `SpatialLines`

,
`SpatialPolygons`

and their `*DataFrame`

counterparts do not inform
whether data concern fields (geostatistical data), entities (point
patterns), or aggregations. The classes in `mss`

: `SField`

,
`SObjects`

, and `SLattice`

, do.

- E. Pebesma, C. Stasch, B. Graeler and S. Scheider, 2014. Meaningfully Integrating Big Earth Science Data; AGU fall meeting, poster IN33A-3757 pdf
- Stasch, C., S. Scheider, E. Pebesma, W. Kuhn, 2014. Meaningful Spatial Prediction and Aggregation. Environmental Modelling & Software, 51, 149-165 open access.