Single-band Raster Data

Single-band Raster Data#

In the Introduction to Raster Data Notebook, we used the example of a photograph to introduce raster data stored as pixels.

However, photographs are not the only form of data which is best used in raster format. A photograph of the land surface can hold colour values in each pixel, but pixels aren’t restricted to colours. Instead of colours, the value of each pixel can represent any data which varies discontinuously across an area. Examples include temperature data, rainfall, and the elevation of the land surface. Let’s look at some examples.

Important note: unless you really want to learn the Python code, don’t focus on trying to understand it - a lot of the plotting code tends to be quite fiddly. The concepts are what is important here, not how to do everything in Python. The same concepts apply to all GIS software.

1. Digital Elevation Models#

In a digital elevation model (DEM), each pixel holds a value which represents the elevation of the ground surface.

A couple of important things to note. First of all, pixels are not points. Each pixel covers an area of the ground surface. The value is best thought of as the average elevation of the ground area covered by the pixel. As we saw in the Introduction to Raster Data notebook, the spatial resolution of raster data can vary considerably, and it’s really important to ensure that you know the resolution of the source data. It’s entirely possible to process the data such that the actual pixels cover smaller areas than the original data, but analyses which need higher resolution data won’t become valid if you simply increase the number of pixels in low spatial resolution data.

Second, what do I mean by the ground surface? At first glance that seems obvious, but - say I have elevation data for an area which includes a forest. Is my elevation data measuring the top of the forest canopy, or the land underneath the trees? What about an area with buildings on it - are we including the tops of buildings, or are they removed from the dataset to show the surface as it would be without them?

The answer to this is that it varies. You can find all of these. In fact, digital elevation models are a key tool in forestry monitoring, where they’ll want the top surface, but if you’re doing geology the trees are just getting in your way and you don’t need them. Similarly, urban geography often uses elevation models e.g. to track urban growth, but again if you want the natural surface, you don’t want the buildings.

So, there are generally two categories of DEM.

A Digital Surface Model or DSM will hold data on the elevation of whatever is the first surface to return data to the sensor. This will include forest canopies, buildings, bridges, and anything else present in the area big enough to return a value.

A Digital Terrain Model or DTM, on the other hand, will only have data for the underlying terrain, as the name suggests.

It’s worth taking a moment here to think about how DEM models, whether DSM or DTM, are constructed. It’s possible to make a DEM by manually measuring elevation at a bunch of points and making what’s called an interpolation between them. This basically means estimating what’s in between known points. Although early topographic maps were essentially constructed in this way, it’s really not practical for high resolution data.

Instead, high-resolution DEMs are constructed by collecting data from overhead surveys - by aircraft, by unmanned aerial vehicles (UAVs, commonly known as drones), or by satellite. The aircraft, UAV, or satellite will have a sensor which sends a signal to the ground, and measures the time taken for that signal to be reflected back to the sensor from different locations.

Two different kinds of sensor are used for DEMs.

RADAR We’re generally familiar with aircraft and ground stations using RADAR (Radio Detection And Ranging) to detect other aircraft by transmitting radio waves horizontally, and measure the time taken for the reflected signals to return. It’s also possible to point a RADAR transmitter vertically, at the ground, and produce a DEM by measuring the signal which comes back. This can be done by any of aircraft, UAV, or satellite, but works best from satellite. (We’ll come back to RADAR again, because it’s useful for other things as well as DEMs.)
LiDAR Instead of radio waves, we can also point a laser at the ground and measure the reflection. LiDAR (Light Detection and Ranging) is not so possible from satellites due to the distance, how the signals spread out and weaken, and how light can be affected by travelling through the atmosphere, but from aircraft and UAVs it can produce very high resolution data. UL Geography has just acquired a UAV with a LiDAR camera for use in research projects.

The wavelength of radio waves is much longer than that of light - if you’re not familiar with wavelength, it’s not crucial to understand right now, just think of it as the size of the signal pulse. There’s a few different wavelengths used, which are generally referred to as lettered bands. Many are just used for cilivian and/or military aviation and similar purposes and we don’t need to explain them, but in geospatial research three bands are commonly used. C-band RADAR has a wavelength around 6cm; there’s also the shorter wavelength X-band, around 3cm, and the longer wavelenth L-band, around 24cm. Even longer P-band (up to 1m) wavelengths are sometimes used, but much rarer.

Why I’m explaining this is that the shorter wavelength X-band RADAR signals are reflected by the tops of trees, while the longer wavelength L-band RADAR signals are too big to be reflected by individual leaves and branches, and so pass through forest canopies.

This is not something that’s really crucial to consider most of the time, but it does mean that some RADAR DEMs will measure the terrain, while others will include the height of vegetation.

LiDar, on the other hand, has a wavelength small enough to be reflected from leaves and branches. So, not only will it measure the top of the forest canopy, it will map details of that top surface.

Both LiDAR and RADAR will include buildings - so if you want a true DSM without any anthropogenic (human-made) structures, they have to be removed from the dataset. This is rarely perfect, and you’ll commonly see remnants of buildings as artefacts in DSM models.

Publicly available global DEMs reach a resolution of down to around 30m per pixel. Data with higher spatial resolution is generally only available for specific smaller areas.

LiDAR data for parts of Ireland can be found on the Open Topographic Data Viewer, most of which has been collected by the OPW for flood risk and coastal monitoring purposes, and by TII (Transport Infrastructure Ireland) for road projects.

RADAR DEMs covering Ireland are available from Copernicus (the EU’s Earth Observation programme), specifically the COP30 DEM; the SRTM (Shuttle Radar Topography Mission) DEM available from the USGS Earth Explorer; and ALOS World 3D DEM, available from the Japanese space agency. All have a spatial resolution of approximately 30m per pixel.

The example below uses the Digital Surface Model (DSM) of the COP30 DEM.

if 'google.colab' in str(get_ipython()):
    !pip install osmnx earthpy rasterio

import matplotlib.pyplot as plt
import numpy as np
import geopandas as gpd
from shapely.geometry import box
import osmnx as ox
import earthpy.spatial as es
import rasterio as rio
import rasterio.mask
from rasterio.plot import show
from rasterio import features
from rasterio.warp import calculate_default_transform, reproject, Resampling

First, we’ll use rasterio to read in the DEM:

dem = rio.open(
    'https://raw.githubusercontent.com/bamacgabhann/GY4006/main/gy4006/sample_data/Killarney/DEM_COPERNICUS_30.tiff'
)

and let’s have a look at how it’s stored:

dem.read()

array([[[      nan,       nan,       nan, ..., 156.7302 , 156.13272,
         156.7329 ],
        [      nan,       nan,       nan, ..., 158.12102, 157.84673,
         158.99535],
        [      nan,       nan,       nan, ..., 161.69852, 160.20248,
         161.26048],
        ...,
        [ 74.0657 ,  73.65856,  68.50836, ..., 427.5    , 427.5    ,
         427.5    ],
        [ 62.10695,  61.19374,  60.     , ..., 427.5    , 427.5    ,
         427.5    ],
        [ 60.     ,  60.     ,  60.     , ..., 451.19452, 442.69153,
         427.5    ]]], shape=(1, 383, 516), dtype=float32)

So, this file holds elevation values in each pixel, and those values are stored as float32 numbers. We can also see there’s some pixels which don’t have values - those are listed as nan, which stands for Not A Number. This is how the numpy library represents missing or non-numerical values.

dem.meta

{'driver': 'GTiff',
 'dtype': 'float32',
 'nodata': None,
 'width': 516,
 'height': 383,
 'count': 1,
 'crs': CRS.from_wkt('PROJCS["WGS 84 / UTM zone 29N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-9],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32629"]]'),
 'transform': Affine(30.028653565891467, 0.0, 452133.272164,
        0.0, -29.981948775458136, 5769540.027478)}

OK, now let’s plot it and see what it looks like:

show(dem)

../../_images/c75843d2c50f5e6416ec3a9ae8e33d13a0172eacc3c5f20422b475b30d23aaf1.png

<Axes: >

In the previous notebook, rasterio was able to automatically detect that the satellite photographs were photographs, but here, it only has a single band of data, and so it uses a different colour scheme. We can specify what colour scheme to use.

Both QGIS and ArcGIS Pro will also allow specifying different colour schemes:

show(dem, cmap='Greens')

../../_images/a7723f49ac7b31ed58abc855bbbee869d811a7648a1e48e9ce1132a3724f61d6.png

<Axes: >

Like GeoPandas, rasterio uses matplotlib to do the plotting of the map, so any colour map in matplotlib can be used. You can find the list here: https://matplotlib.org/stable/users/explain/colors/colormaps.html

Some are quite specific:

show(dem, cmap='terrain')

../../_images/34b45bacf9a130f8639f8ae2088fea47eade25626348fe337300508797e71829.png

<Axes: >

This is the right colour scheme, but it’s showing land in blue - which is normally for oceans, of course. We can use some slightly more complex code to split the colour scheme:

(again, QGIS and ArcGIS will also allow getting more specific with the colour scheme)

import matplotlib.colors as colors
colors_undersea = plt.cm.terrain(np.linspace(0, 0.17, 256))
colors_land = plt.cm.terrain(np.linspace(0.25, 1, 256))
all_colors = np.vstack((colors_undersea, colors_land))
terrain_map = colors.LinearSegmentedColormap.from_list(
    'terrain_map', all_colors)
divnorm = colors.TwoSlopeNorm(vmin=0., vcenter=19, vmax=1000)

and use the new split terrain_map colour scheme to show the DEM. I split the two colour schemes at 19m, because the lake water level is 17-18m, so this will show the lake in blue:

fig, ax = plt.subplots(figsize=(10, 6))
terrain = show(dem, ax=ax, cmap=terrain_map, norm=divnorm)
ax.ticklabel_format(style='plain')
plt.show()

../../_images/225fec1b568186006c905aa1c6dc2ee2a3c87b1997e0149964db04ad836260b7.png

Looking better. It would be better with a scalebar:

fig, ax = plt.subplots(figsize=(10, 6))

scale_setup = ax.imshow(dem.read()[0], cmap=terrain_map, norm=divnorm)
cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")
cb.set_ticks([19, 0, 200, 400, 600, 800, 1000])

ax.ticklabel_format(style='plain')
show(dem, ax=ax, cmap=terrain_map, norm=divnorm)
plt.show()

../../_images/c692ccc67be4033683c3c4b4a60602f87c878378d44ebd60e2e4ab9b566efa73.png

2. DEM Hillshading#

We can make the DEM look even more like topography by creating a hillshade - this simulates light from the sun coming from a certain direction and height to show lit areas and shadows, giving a 3D effect. To do this in Python, we’ll use the library earthpy. You can specify the azimuth (i.e. direction) and altitude (height) of the sun, both in degrees.

(Feel free to tweak these numbers and see how it changes the look of the map below. For reference if you do change them, my version has azimuth=270, altitude=30.)

QGIS and ArcGIS also have hillshade options allowing you to adjust the azimuth and altitude for DEM data.

shade = es.hillshade(dem.read(1), azimuth=270, altitude=30)

fig, ax = plt.subplots(figsize=(10, 6))
#scale_setup = ax.imshow(dem.read()[0], cmap=terrain_map, norm=divnorm)
#cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")
#cb.set_ticks([19, 0, 200, 400, 600, 800, 1000])
ax.ticklabel_format(style='plain')
show(shade, transform=dem.transform, ax=ax, cmap="Greys")
plt.show()

../../_images/ea1ce1906630bb085b33c74265d584d923a2b2455065bff64a363993a1c2068e.png

And we can even layer the coloured version on top of the hillshaded version, by drawing the hillshade layer on top with 30% transparency (alpha=0.3).

Again, QGIS and ArcGIS can do this, and even have additional colour blending modes.

fig, ax = plt.subplots(figsize=(10, 6))
scale_setup = ax.imshow(dem.read()[0], cmap=terrain_map, norm=divnorm)
cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")
cb.set_ticks([19, 0, 200, 400, 600, 800, 1000])
ax.ticklabel_format(style='plain')
show(dem, ax=ax, cmap=terrain_map, norm=divnorm)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.3)
plt.show()

../../_images/60c304f3b1528c01edbc752f62807029ee74de62b637784b51cce6fa5fbadcfe.png

We could layer vector features on top of this. For example, let’s grab the roads from OpenStreetMap, and plot them. We just need a little prep work to get the coordinates of the area covered by the DEM, and convert this from the UTM29N zone CRS EPSG:32629 to the global latitude-longitude CRS EPSG:4326 used by OpenStreetMap.

QGIS and ArcGIS will allow displaying of raster and vector data saved in mixed coordinate reference systems, and will do what is called an ‘on the fly’ transformation - meaning it transforms the data in memory to be in the same CRS, but without changing the saved files on disk.

bounds  = gpd.GeoDataFrame({"id": 1, "geometry":[box(*dem.bounds)]}).set_crs(32629)

killarney_roads = ox.features_from_polygon(bounds.to_crs(4326)["geometry"][0], tags={"highway": True}).to_crs(32629)

fig, ax = plt.subplots(figsize=(10, 6))
#scale_setup = ax.imshow(dem.read()[0], cmap=terrain_map, norm=divnorm)
cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")
cb.set_ticks([19, 0, 200, 400, 600, 800, 1000])
ax.ticklabel_format(style='plain')
ax.set_xlim(452100, 467700)
ax.set_ylim(5758000, 5769500)
show(dem, ax=ax, cmap=terrain_map, norm=divnorm)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.3)
killarney_roads.plot(ax=ax, column="highway", cmap="gist_yarg")
plt.show()

/tmp/ipykernel_10027/1758495432.py:3: UserWarning: Adding colorbar to a different Figure <Figure size 1000x600 with 2 Axes> than <Figure size 1000x600 with 2 Axes> which fig.colorbar is called on.
  cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")

../../_images/fd23cefd505ca630264351d3769b0730a3322eb9cdfe9819a8fc16e24485905d.png

Or we could use satellite imagery instead of the topographic color map:

photo = rio.open(
    'https://raw.githubusercontent.com/bamacgabhann/GY4006/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_True_color.tiff'
)
fig, ax = plt.subplots(figsize=(10, 6))
ax.ticklabel_format(style='plain')
ax.set_xlim(452100, 467700)
ax.set_ylim(5758000, 5769500)
show(photo, ax=ax)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.2)
killarney_roads.plot(ax=ax, column="highway", cmap="gist_yarg")
plt.show()

../../_images/221a5501884bf45e827990008b2c09c395bb43dd27511dbfab9f4a462ff422fc.png

We could make this look nicer here, but I think that’s enough to make the point, that you can combine plots of multiple raster and vector data sources.

This might be a case where QGIS or ArcGIS is slightly easier to work with, because they will allow you to tweak the colours and symbols used for individual vector features a bit more easily than in Python, and there’s easier options for overlaying maps on hillshade layers.

3. DEM Raster Geoprocessing#

Elevation data isn’t just for looking at, either. It can be particularly useful in the case of flood risk maps. Say the water level in the local river reaches 21m. In that case, we need to create a new array from the DEM layer, identifying areas where the DEM elevation is below 21m.

If the elevation is above 21m, we don’t need any data, and can set the value to a no data value. Because we’re using float data, we can use NumPy’s Not a Number (np.nan) as our no data value.

We might also set elevation 19m or below as no data, because that’s normally covered in water and so wouldn’t be regarded as flooded.

Once we’ve done that, we can set a single value - which would normally be 1, incicating a Boolean True - for everywhere with elevation above 19m. Since we’ve already made the non-flooded areas nan values, this will only be the areas where the elevation is between 19m and 21m.

flood = dem.read()
flood[flood>21]=np.nan
flood[flood<19]=np.nan
flood[flood>19]=1

fig, ax = plt.subplots(figsize=(10, 6))
#scale_setup = ax.imshow(dem.read()[0], cmap=terrain_map, norm=divnorm)
cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")
cb.set_ticks([19, 0, 200, 400, 600, 800, 1000])
ax.ticklabel_format(style='plain')
ax.set_xlim(452100, 467700)
ax.set_ylim(5758000, 5769500)
show(dem, ax=ax, cmap=terrain_map, norm=divnorm)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.3)
show(flood, transform=dem.transform, ax=ax, cmap="tab20b")
killarney_roads.plot(ax=ax, column="highway", cmap="gist_yarg")
plt.show()

/tmp/ipykernel_10027/4148257295.py:3: UserWarning: Adding colorbar to a different Figure <Figure size 1000x600 with 2 Axes> than <Figure size 1000x600 with 2 Axes> which fig.colorbar is called on.
  cb = fig.colorbar(scale_setup, ax=ax, extend='both', label="Elevation (m)")

../../_images/34a44b1df4567ef3ebe8c991ba1ad6d54b7c4ebcc0f3fe4f5474f529e9cd5611.png

You can see a few dark blue areas close to the lake, at the edge of Killarney town, and smaller bits at the edges of rivers and lakes. Just a quick example.

4. Rasters and CRS: Temperature Data#

As mentioned in the introduction, any data which varies discontinuously across an area is best used as raster data. Another example is temperature data. Let’s have a look at (and save a copy of) the global maximum land surface temperature data for April 2021 (I have a reason, wait for it) from WorldClim.

with rio.open('zip+https://geodata.ucdavis.edu/climate/worldclim/2_1/hist/cts4.06/2.5m/wc2.1_cruts4.06_2.5m_tmax_2020-2021.zip!wc2.1_2.5m_tmax_2021-04.tif') as src:
    dst_meta = src.meta.copy()
    dst_meta.update({
        'compress': 'lzw'
    })
    with rio.open('../sample_data/wc2.1_2.5m_tmax_2021-04.tif', 'w', **dst_meta) as dst:

        dst.write(src.read())
    
    show(src)

../../_images/3d1f6c832237106a7720c4235ea23fa4f1020f8e3a7a92a8e868ae92179efbde.png

This is in the EPSG:4326 CRS, the global latitude-longitude system treated as X-Y coordinates (the first case example we looked at in the Coordinate Reference Systems notebook. We can of course reproject this to a different coordinate reference system. Doing so does, however, mean that we need to work out a new transform for our raster pixels - but rasterio has built-in functions for this, as will any GIS software.

dst_crs = 'ESRI:54009'

with rio.open('../sample_data/wc2.1_2.5m_tmax_2021-04.tif') as src:
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds)
    kwargs = src.meta.copy()
    kwargs.update({
        'crs': dst_crs,
        'transform': transform,
        'width': width,
        'height': height,
        'compress': 'lzw'
    })

    with rio.open('../sample_data/wc2.1_2.5m_tmax_2021-04_moll.tif', 'w', **kwargs) as dst:
        for i in range(1, src.count + 1):
            reproject(
                source=rio.band(src, i),
                destination=rio.band(dst, i),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs=dst_crs,
                resampling=Resampling.nearest)

Before showing this, let’s get the world oceans and latitude-longitude lines to add to the map - we’ll need to reproject those to the same CRS:

ocean = gpd.read_file('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/ocean.gpkg')
grid = gpd.read_file('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/LatLongGrid.gpkg')

ocean_moll = ocean.to_crs(dst_crs)
grid_moll = grid.to_crs(dst_crs)

We don’t need this to be full resolution, so I’m also going to reduce it by 50% before plotting it

with rio.open('../sample_data/wc2.1_2.5m_tmax_2021-04_moll.tif') as src:
    # Calculate the new shape
    new_height = int(src.height * 0.5)
    new_width = int(src.width * 0.5)
    
    # Update metadata
    transform = src.transform * src.transform.scale(
        (src.width / new_width),
        (src.height / new_height)
    )
    
    kwargs = src.meta.copy()
    kwargs.update({
        'height': new_height,
        'width': new_width,
        'transform': transform
    })
    
    # Perform the resampling
    with rasterio.open('../sample_data/wc2.1_2.5m_tmax_2021-04_moll_red.tif', 'w', **kwargs) as dst:
        for i in range(1, src.count + 1):
            dst.write(
                src.read(
                    i,
                    out_shape=(new_height, new_width),
                    resampling=Resampling.bilinear
                ),
                indexes=i
            )

Now let’s plot those together with the data:

with rio.open('../sample_data/wc2.1_2.5m_tmax_2021-04_moll_red.tif') as moll:

    fig, ax = plt.subplots(figsize = (10,6))
    
    temp_scale_setup = ax.imshow(moll.read()[0], cmap='coolwarm', vmin=-40, vmax=50)
    cb = fig.colorbar(temp_scale_setup, ax=ax, extend='both', label="Temperature (°C)", shrink=0.6)
    cb.set_ticks([-30, -20, -10, 0, 10, 20, 30, 40])
    ocean_moll.plot(ax=ax, color='lightblue')
    grid_moll.plot(ax=ax, color='white', linewidth=0.5)
    show(moll, ax=ax, cmap='coolwarm', vmin=-40, vmax=50)
    ax.axis('off')
    
    plt.show()

../../_images/a7f7041d0474730c836782a4b4d97cea95c670bf258dae6a82b8a6a0c235be2d.png

That looks a bit nicer, doesn’t it? Much better way of presenting the data - using the Mollweide projection, and adding vector features for the oceans in light blue, and the global latitude-longitude grid in white. Plus, using the blue-red colourmap to illustrate the temperature is far more intuitive.

5. Masking and Cropping Rasters#

Of course, while this is global temperature, it does also cover the region in Killarney we’ve been using so far. It’s at a far lower resolution, but it does have a few pixels covering the region. Let’s have a look - although, honestly, more because it demonstrates some points than because it’s super useful.

To do this, first, we’ll use the GeoDataFrame (called bounds) containing a polygon shape of the raster DEM of the Killarney area (we created this earlier for getting the OpenStreetMap data), and use a global latitude-longitude CRS EPSG:4326 reprojection of this to extract the temperature data within that area.

Rasterio refers to showing only part of an image as masking; and running mask with crop=True extracts the non-masked part of the data to a new dataset. That’s what’s happening in this code. It’s also updating the meta information of the new cropped dataset, so that the shape and transform refer only to the cropped data, rather than the full original dataset.

with rio.open('../sample_data/wc2.1_2.5m_tmax_2021-04.tif') as src:
    out_image, out_transform = rio.mask.mask(src, bounds.to_crs(4326).envelope, crop=True)
    out_meta = src.meta
    out_meta.update({"driver": "GTiff",
                     "height": out_image.shape[1],
                     "width": out_image.shape[2],
                     "transform": out_transform})
with rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K.tif", "w", **out_meta) as dest:
    dest.write(out_image)
tempK = rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K.tif")
show(tempK)

../../_images/5fa0cb9f70ac873053696aba417a89d6495ce07ec8da14865e95d50253088a9a.png

<Axes: >

Not many pixels, just 18 to cover the whole area (6 x 3) - but I’m trying to make a point. We’ll get there. Now, to show this with the original image, we need to reproject it to the CRS used by the Killarney DEM - that’s the UTM29N grid, EPSG:32629.

dst_crs = 'EPSG:32629'

with rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K.tif") as src:
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds)
    kwargs = src.meta.copy()
    kwargs.update({
        'crs': dst_crs,
        'transform': transform,
        'width': width,
        'height': height
    })

    with rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K_UTM29.tif", 'w', **kwargs) as dst:
        for i in range(1, src.count + 1):
            reproject(
                source=rio.band(src, i),
                destination=rio.band(dst, i),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs=dst_crs,
                resampling=Resampling.nearest)

tempK = rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K_UTM29.tif")
show(tempK)

../../_images/c4508e6030996e2e7e710499e091dde2646f2d84ffbeaa51d037d0541f481588.png

<Axes: >

Now, here’s what I want you to notice. The pixels now look pretty close to square - and count them! There’s no longer just 18 - we now have a 5 x 4 shape with 20 pixels. And it looks like the pixels have been stretched in the vertical direction - the overall shape is much more square than it used to be.

I wanted with this to reinforce the point that while pixels are rectangular, and often displayed as square, on maps - that doesn’t mean the area they cover on the ground is square. In fact, it never is, due to what we discussed in the Coordinate Reference Systems notebook: the impossibility of plotting a round surface of the Earth on a flat map or computer screen.

Also, the pixels haven’t actually been stretched - the values have actually been resampled into a new set of pixels in the new coordinate reference system, which most closely matches the new coordinate reference system. This means that some of the pixels will actually be covering a different area - so, a small warning about reprojections of low-resolution data.

Still, let’s see what it looks like on our Killarney DEM map:

fig, ax = plt.subplots(figsize=(10, 6))

temp_scale_setup = ax.imshow(tempK.read()[0], cmap='coolwarm', vmin=-40, vmax=50)
cb = fig.colorbar(temp_scale_setup, ax=ax, extend='both', label="Temperature (°C)", shrink=0.6)
cb.set_ticks([-30, -20, -10, 0, 10, 20, 30, 40])

ax.ticklabel_format(style='plain')
ax.set_xlim(452100, 467700)
ax.set_ylim(5758000, 5769500)

show(tempK, ax=ax, cmap='coolwarm', vmin=-40, vmax=50)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.3)
killarney_roads.plot(ax=ax, column="highway", cmap="gist_yarg")

plt.show()

../../_images/ebf77f24b1a3e48a6d0f5b724bcc20e22a66a3676722a4354c1880ccd68fd138.png

So, only the southeasternmost pixel very noticeably showing as different, although there are some minor variations.

6. Raster Coregistration#

Hopefully I have made the point about the area on the ground represented by pixels - but it is still the case that the temperature data has pixels which are much larger than the pixels of the DEM. Now, in this case, we don’t actually need to do any complex analysis comparing these two layers - but you might want to do that. And if so, you would not want what you have here.

Remember that when we reprojected the temperature data to this CRS, we ended up with 5 x 4 pixels instead of 6 x 3? That’s because the data was resampled to a pixel size and shape which suited the new CRS. But if we’re resampling the data anyway, why not just resample to exactly the same pixel size and shape as the DEM data?

This is referred to as coregistration of rasters, where we match a second raster to the extent and resolution of a first.

dst_crs = dem.crs
dst_width = dem.width
dst_height = dem.height
dst_bounds = dem.bounds

with rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K.tif") as src:
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, dst_width=dst_width, dst_height=dst_height, *src.bounds)
    kwargs = src.meta.copy()
    kwargs.update({
        'crs': dst_crs,
        'transform': transform,
        'width': width,
        'height': height
    })

    with rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K_coreg.tif", 'w', **kwargs) as dst:
        for i in range(1, src.count + 1):
            reproject(
                source=rio.band(src, i),
                destination=rio.band(dst, i),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs=dst_crs,
                resampling=Resampling.nearest)

tempK_coreg = rio.open("../sample_data/wc2.1_2.5m_tmax_2021-04_K_coreg.tif")

fig, ax = plt.subplots(figsize=(10, 6))

temp_scale_setup = ax.imshow(tempK_coreg.read()[0], cmap='coolwarm', vmin=-40, vmax=50)
cb = fig.colorbar(temp_scale_setup, ax=ax, extend='both', label="Temperature (°C)", shrink=0.6)
cb.set_ticks([-30, -20, -10, 0, 10, 20, 30, 40])

ax.ticklabel_format(style='plain')
ax.set_xlim(452100, 467700)
ax.set_ylim(5758000, 5769500)

show(tempK_coreg, ax=ax, cmap='coolwarm', vmin=-40, vmax=50)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.3)
killarney_roads.plot(ax=ax, column="highway", cmap="gist_yarg")

plt.show()

../../_images/fe6e5beb59d3a6c786b53f67ee942122ce2ea271381f3b28ee2b6c0417e81503.png

It looks very similar - well, it should, it’s the same data plotted in the same way, just with a different pixel size. But, check the southeasternmost corner in this and the previous map, and you will see the difference. We can also see a difference if we look at the numpy arrays behind the scenes:

# Previous transform
print(f'Previous array shape: {tempK.shape}')
tempK.read()

Previous array shape: (4, 5)

array([[[13., 13., 13., 12., 12.],
        [11., 11., 11., 12., 12.],
        [11., 11., 11., 12., 12.],
        [11., 12., 12., 11.,  9.]]], dtype=float32)

# This transform
print(f'This array shape: {tempK_coreg.shape}')
tempK_coreg.read()

This array shape: (383, 516)

array([[[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [11., 11., 11., ...,  9.,  9.,  9.],
        [11., 11., 11., ...,  9.,  9.,  9.],
        [11., 11., 11., ...,  9.,  9.,  9.]]],
      shape=(1, 383, 516), dtype=float32)

383 x 516 pixels is a huge increase on 4 x 5.

So, we can see that different transforms do make a big difference, and it is best to match the pixel size of rasters if we want to do any comparisons.

But, it is also crucial to remember that we have not increased the spatial resolution of the data - only the spatial resolution at which it is displayed. The spatial resolution of the data is the spatial resolution of the source data - and that’s best seen in the original 6 x 3 pixel grid when we first cropped the temperature data to the Killarney area.

Summary#

So, we’ve seen here that we can use single-band rasters as digital elevation models, plotting on maps them using colour scales, and as hillshades, or both, and can layer vector data over them. We can also use the elevation values in calculations, including making flood risk maps.

We’ve also seen that we can use other data, such as temperatures, and used that data to try and understand pixel size and shape in rasters, particularly when combining datasets in different coordinate reference systems from different sources.

The key takeaways here are using raster data for elevation and data like temperatures, the concept that we can do calculations with the data beyond just showing it on maps, and that we have to be a bit careful when using multiple rasters - matching CRS and resolution in order to do calculations, but bearing in mind that increasing the number of pixels doesn’t increase the spatial resolution of the data.

GY4006 Notebooks in Colab:

Data Types
Vector Data
Attribute Data
Coordinate Reference Systems
Geospatial Data Files
Vector Geoprocessing
Introduction to Raster Data
Single-band Raster Data
Multi-band Raster Data: Passive Remote Sensing