Variables that enter hydrological analysis, such as temperature or precipitation, are measured locally at point locations. Many of these quantities are then recalculated over different domains using interpolation or extrapolation algorithms: we interpolate within the boundary given by the known values and extrapolate outside this boundary. Here, two common interpolation methods are demonstrated.
5.1 Inverse distance weighting (IDW)
IDW is a simple, deterministic and non-parametric interpolation method. The interpolated values are calculated as a weighted average of the values at our disposal, where the weight assigned to each point decreases with increasing distance from the interpolated location.
The value \(Z_L\) at an unsampled location \(L\) is calculated from the known values \(z_i\) and the distances \(d_i\), raised to the power \(p\):
\[
Z_L = \dfrac{\sum\limits_{i=1}^{n}\dfrac{z_i}{d_i^p}}{\sum\limits_{i=1}^{n}\dfrac{1}{d_i^p}}
\]
Let’s imagine we have three values \(z_1 = 10\), \(z_2 = 15\), \(z_3 = 20\) and we need to estimate the value at a location \(L\). The distances from these points to the location are \(d_1 = 3.0\), \(d_2 = 2.0\), \(d_3 = 1.5\), and the commonly used power is \(p = 2\).
The respective weights \(w_i = 1/d_i^p\) are then \(w_1 = 1/3^2 \approx 0.111\), \(w_2 = 1/2^2 = 0.25\) and \(w_3 = 1/1.5^2 \approx 0.444\).
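Substituting these numbers into the formula above gives the interpolated value at \(L\):
\[
Z_L = \dfrac{\dfrac{10}{3^2} + \dfrac{15}{2^2} + \dfrac{20}{1.5^2}}{\dfrac{1}{3^2} + \dfrac{1}{2^2} + \dfrac{1}{1.5^2}} = \dfrac{1.111 + 3.750 + 8.889}{0.111 + 0.250 + 0.444} = \dfrac{13.75}{0.806} \approx 17.07
\]
The estimate is pulled towards \(z_3 = 20\), the closest point, which carries the largest weight.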
Since this method is very simple, let’s calculate it ourselves, step by step.
First we generate some points in the spatial domain, which will represent our measurements.
Next we create a new, spatially regular domain onto which we will interpolate.
Finally we will perform the calculations and visualize the results.
5.1.1 Generation of a random measurement network
We will create 30 points in space with random values and assign a coordinate reference system to them. These points will be used for both methods.
Code
library(sf)    # Simple Features library for spatial data
library(scico) # Scientific palette collection called "scico"

n <- 30
dom <- data.frame(x = runif(n, 0, 50),
                  y = runif(n, 0, 50),
                  value = rnorm(n, mean = 5)) |>
  sf::st_as_sf(coords = c("x", "y"),
               crs = 4326)

plot(dom,
     nbreaks = n + 1,
     pal = scico(n = n,
                 palette = "davos",
                 end = 0.9,
                 direction = -1),
     main = "Precipitation [mm]",
     pch = 20,          # Point character selection
     graticule = TRUE,  # Display graticules
     axes = TRUE,       # Display plot axes
     bgc = "#f0f0f033", # Background color
     key.pos = 1)       # Legend position
1. Create 30 points with random values, which will serve as the computation origin,
2. store them as a Simple Features (sf) object with a coordinate reference system given by an EPSG code (codes can be looked up at https://epsg.io),
3. plot them via the sf:::plot.sf() method, using the scientific color maps from the scico package.
We now have the original points, which are sparsely distributed across the domain. Next we need a grid of new points, representing the centroids of the raster onto which we want to recalculate, and the distances from every measurement to every grid point, which we will use to compute the missing values.
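The regular grid and the distance matrix used in the next code block can be prepared with sf; a minimal sketch, assuming a 50 × 50 grid of cell centres over the extent of dom (the resolution is an arbitrary choice) and with the distance units dropped so the matrix can be used in plain arithmetic:
Code
# Regular grid of cell centres covering the extent of the measurement points
# (a 50 x 50 grid is an assumed resolution, adjust as needed)
grid <- st_make_grid(dom, n = c(50, 50), what = "centers") |>
  st_as_sf()

# Distance matrix: rows = measurement points, columns = grid points
distances <- st_distance(dom, grid) |>
  units::drop_units()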
Code
epsilon <- 1e-6
distances <- distances + epsilon
power <- 2

# Compute weighted sums for all grid points
weighted_sum <- sapply(1:ncol(distances), function(i) {
  sum(dom$value / (distances[, i]^power))
})

# Compute weights sum for all grid points
weights_sum <- sapply(1:ncol(distances), function(i) {
  sum(1 / (distances[, i]^power))
})

# Calculate interpolated values for all grid points
interpolated_values <- weighted_sum / weights_sum

# Create a spatial data frame for results
results <- data.frame(x = st_coordinates(grid)[, "X"],
                      y = st_coordinates(grid)[, "Y"],
                      value = interpolated_values) |>
  sf::st_as_sf(coords = c("x", "y"), crs = 4326)

# Replace NaN values with NA for plotting
results$value[is.nan(results$value)] <- NA

# Join results with the grid for visualization
join <- st_join(grid, results)
1. Set a small tolerance to avoid division by zero where a grid point coincides with a measurement.
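The interpolated field can then be visualized in the same way as the input points; a minimal sketch reusing the plotting arguments from above (the number of breaks is an arbitrary choice):
Code
plot(results,
     nbreaks = 25,
     pal = scico(n = 24, palette = "davos", end = 0.9, direction = -1),
     main = "Precipitation [mm]",
     pch = 20,          # Plot the grid centroids as points
     graticule = TRUE,
     axes = TRUE,
     bgc = "#f0f0f033",
     key.pos = 1)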
There are many libraries listed in the CRAN geostatistics task view. One of them is gstat, developed and maintained by Edzer Pebesma, who is also the author of the sf and stars packages. The gstat package contains functions not only for interpolation, but also for variogram modelling, kriging and geostatistical simulation.
Code
library(gstat)
power <- 2   # You can adjust the power parameter

# Weighted sums of the measured values for each grid point
# (columns of the distance matrix correspond to grid points)
weighted_sum <- apply(distances, 2, function(d) sum(dom$value / (d^power)))
weighted_sum
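For comparison, the same interpolation can be obtained directly from gstat with its idw() function; a minimal sketch, assuming the dom and grid objects created above (the object name idw_res_gstat is only illustrative, and idp is the inverse distance power, equivalent to the p = 2 used before):
Code
# Inverse distance weighted interpolation of the point values onto the grid
idw_res_gstat <- gstat::idw(formula = value ~ 1,
                            locations = dom,
                            newdata = grid,
                            idp = 2)

# Predicted values are stored in the var1.pred column of the resulting sf object
head(idw_res_gstat$var1.pred)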
5.3 Kriging
Another interpolation technique is kriging. In contrast to the inverse distance weighting method, kriging is a geostatistical approach: it models the spatial correlation of the data through a variogram and, in addition to the predicted value, also provides an estimate of the prediction uncertainty.
Code
projected_crs <- 32633   # Replace with the appropriate UTM zone for your data
dom_projected <- st_transform(dom, crs = projected_crs)
grid_projected <- st_transform(grid, crs = projected_crs)

set.seed(123)            # For reproducibility
idw_points <- st_centroid(idw_res)
Warning: st_centroid assumes attributes are constant over geometries
An essential part of kriging is the definition of a variogram: a function describing the degree of spatial dependence of a spatial process. Because the range parameter is expressed in the distance units of the coordinate system, the spatial data need to be projected first.
Code
# Exponential variogram model with manually chosen parameters;
# the range is in metres of the projected CRS
variogram <- vgm(psill = 5.05, model = "Exp", range = 300000, nugget = 0.15)
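The parameters above are chosen by hand. In practice they would typically be estimated from the data by computing an empirical variogram and fitting a model to it; a minimal sketch using gstat (assuming the projected points dom_projected created above, with the exponential model as an assumption):
Code
# Empirical variogram of the measured values (projected coordinates, distances in metres)
v_emp <- gstat::variogram(value ~ 1, data = dom_projected)

# Fit an exponential model, starting from the manually chosen parameters above
v_fit <- gstat::fit.variogram(v_emp,
                              model = vgm(psill = 5.05, model = "Exp",
                                          range = 300000, nugget = 0.15))
v_fit
With only a few dozen points the fit can be unstable, so the fitted parameters should be inspected before use.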
Code
krig_res <- gstat::krige(
  formula = value ~ 1,
  locations = combined_points,
  newdata = grid_projected,
  model = variogram        # Use the appropriate fitted variogram model
)
[using ordinary kriging]
Code
# Step 3: Reproject the kriged result back to the original CRS (WGS84)
krig_res <- st_transform(st_as_sf(krig_res), crs = 4326)

# Generate unique breaks
min_val <- min(krig_res$var1.pred, na.rm = TRUE)
max_val <- max(krig_res$var1.pred, na.rm = TRUE)

# Ensure that min and max are sufficiently different to create unique breaks
if (max_val - min_val > 0) {
  breaks <- seq(min_val, max_val, length.out = 25)
} else {
  breaks <- unique(c(min_val, max_val)) # Fallback for nearly constant data
}

# Use the scico color palette with the correct number of colors
colors <- scico(length(breaks) - 1, palette = "davos", end = 0.9, direction = -1)

# Plotting using the 'pal' argument for sf objects
plot(
  krig_res |> dplyr::select(var1.pred),
  breaks = breaks,
  pal = colors,        # Use 'pal' to specify the color palette
  lwd = 0.001,
  main = "Precipitation [mm]",
  graticule = TRUE,
  axes = TRUE,
  bgc = "#f0f0f033",
  key.pos = 1,
  reset = FALSE
)
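Besides the predictions, krige() also returns the kriging variance in the var1.var column, which shows where the estimate is least certain (typically far from the measurements). A minimal sketch of how it could be displayed with the same plotting approach:
Code
# Kriging variance: a measure of prediction uncertainty at each grid point
plot(krig_res |> dplyr::select(var1.var),
     main = "Kriging variance",
     lwd = 0.001,
     graticule = TRUE,
     axes = TRUE,
     bgc = "#f0f0f033",
     key.pos = 1)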