
Multi-band Raster Data: Passive Remote Sensing#

In the Introduction to Raster Data Notebook, we used a satellite photograph as our first step in working with raster data. In the Single-band Raster Data Notebook, we then looked at rasters with a single band or dataset - elevation data first, and then temperature data.

However, the photographs we used in the Introduction to Raster Data Notebook actually had three bands, not just a single band - one band each for red, green, and blue colour.

This is actually multi-band passive remote sensing data, the subject of this Notebook.

We saw in the photographs how each raster had separate arrays for red, green and blue - that’s our three bands, hence multi-band. But what do I mean by passive remote sensing?

The sensing part is using an instrument to collect data. A camera counts as a sensing instrument: it has a sensor which records values for the intensity of the red, green, and blue light it collects.

The remote part simply means “at a distance”. We’re not taking a reading in contact with the object we’re measuring, like when we put a ruler across something to measure its size. We’re doing it from far away. A satellite is pretty far above the Earth when it takes a photo: that definitely counts as remote.

The passive part means that the instrument is not transmitting a signal - just receiving. The satellite taking the photograph isn’t shining a light onto the Earth - it’s taking advantage of another source of light. The sun illuminates the Earth, and some of the sunlight is reflected from the ground back into space, where it can be collected by cameras on satellites.

We can take a step back here, and think about everything which has to happen in order for a remote sensing instrument to receive data.

  1. A source of energy emits that energy

  2. That energy travels from the source to the target

  3. The energy interacts in some way with the target

  4. The energy travels from the target to the sensor

  5. The sensor records the energy as data

  6. The data is processed

  7. The processed data is analysed

For most remote sensing, the energy is part of the electromagnetic spectrum. That’s absolutely not always the case - seismographs measure vibrations (basically sound energy) to monitor earthquakes, and seismic surveys by oil and gas companies do the same. Some satellites measure gravity. We won’t be looking at any of that data, though.

The electromagnetic spectrum includes the visible light we can see; plus higher-energy ultraviolet, X-rays, and gamma rays; and lower energy infrared and radio waves. There are passive remote sensing instruments across this entire range: however, some of these like X-rays and gamma rays are mostly used in astronomy.

In this Notebook, we will focus on visible and infrared light.

The source of energy in passive remote sensing of visible and infrared light is, as we said, the Sun. So, breaking down into our list again:

  1. The Sun emits energy. The Sun emits energy across most of the electromagnetic spectrum. There are some finer details which matter if you’re interested in solar astronomy, but for our purposes we don’t have to worry about them. It’s not as though the Sun emits far more green light than red light, say: for our purposes, we can consider its output essentially even, as white light.

  2. Some of that energy travels to Earth.

The travel from the Sun to Earth has little effect on the signal; however, once it reaches Earth, it has to pass through the atmosphere before reaching the ground. This does have some effects: some light bounces off air molecules in the atmosphere - mostly blue light, which is why the sky is blue when you look up. Well, unless it’s cloudy, I suppose. Which is really the major issue: clouds block the light, and satellite photographs of a cloudy area will just show…clouds. Not the ground surface.

I’ll skip a step here to note that the same thing will happen at the later stages:

  4. The energy travels from the target to the sensor. For satellite sensors, the energy must travel all the way back through the atmosphere. There are other remote sensing platforms - aircraft and UAVs can take photographs using the light that manages to get through the clouds, or can fly below the clouds so they don’t need to worry about them at all. There are also instruments like cameras which can be used on the ground.

  5. The sensor records the energy as data. Of course, the sensitivity and resolution of the sensor control what can be recorded. Again, we can use the example of a camera, and remember how much better phone cameras are now than a decade ago - more megapixels means more detailed data.

  6. The data is processed. The processing is not something most people working with data really need to get into - but the first stage would be things like adding geographic information. There can also be corrections to account for the effects of passing through the atmosphere.

  7. The processed data is analysed. The analysis is the bit that we do.

The skipped step is crucial, though:

  3. The energy interacts with the target. This is the key step. There are essentially three things which can happen here.

    1. The energy can be transmitted, passing through the target completely.

    2. The energy can be absorbed by the target.

    3. The energy can be reflected by the target.

We discussed an example of transmitted energy in the Single-Band Raster Data Notebook - how some RADAR signals can pass through tree canopies. This will then reach another target - in that case, the ground.

Whether the signal is ultimately absorbed or reflected is something we have a word for, when it comes to visible light: colour.

Plants are green because they reflect green light, but absorb blue and red light. So when white light from the sun hits a plant, we see only the reflected green light. Similarly, tomatoes are red because they absorb the green and blue but reflect the red light. Objects which are other colours reflect some combinations, for example purple objects reflect both red and blue. White objects reflect all light, and black objects absorb it all.

But - this only refers to the visible colours we can see. What about the infrared?

Exactly the same thing happens - we just can’t see it. But we have sensors which can.

1. Multispectral Imaging (MSI)#

The satellite image we used for Killarney in the Introduction to Raster Data Notebook wasn’t actually a true photograph. It was a combination of red, green, and blue bands collected by one of the Sentinel-2 satellites, part of the EU’s Copernicus Earth Observation programme. The Sentinel-2 satellites don’t just collect visible light - they are Multispectral Imaging satellites, which actually collect data in 13 different bands. As well as red, green, and blue visible light, there’s one band at the ultraviolet edge of blue light, and nine different bands of infrared light. You can think of this as nine different infrared colours.

Let’s look at the full dataset for Killarney.

Note: Again, I just want to reinforce that you don’t have to understand the code here - and honestly, rasterio is very, very user-unfriendly, so I really don’t want you to get discouraged by this.

Instead, it’s the concepts which are important here - and those will be shared across essentially all GIS platforms.

if 'google.colab' in str(get_ipython()):
    !pip install osmnx earthpy rasterio

import matplotlib.pyplot as plt
import numpy as np
import geopandas as gpd
import pandas as pd
from shapely.geometry import box
import osmnx as ox
import earthpy.spatial as es
import rasterio as rio
from rasterio.plot import show
from rasterio.features import dataset_features
band01 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B01.tiff')
band02 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B02.tiff')
band03 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B03.tiff')
band04 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B04.tiff')
band05 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B05.tiff')
band06 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B06.tiff')
band07 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B07.tiff')
band08 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B08.tiff')
band09 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B09.tiff')
band11 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B11.tiff')
band12 = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B12.tiff')
band8A = rio.open('https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B8A.tiff')
fig, ((ax01, ax02, ax03, ax04), (ax05, ax06, ax07, ax08), (ax09, ax11, ax12, ax8A)) = plt.subplots(3,4, figsize=(24,14), sharex=True, sharey=True)

show((band01, 1), ax=ax01, cmap='Purples', title='B01 (Blue/UV edge, 60m)')
show((band02, 1), ax=ax02, cmap='Blues', title='B02 (Blue visible light, 10m)')
show((band03, 1), ax=ax03, cmap='Greens', title='B03 (Green visible light, 10m)')
show((band04, 1), ax=ax04, cmap='Reds', title='B04 (Red visible light, 10m)')
show((band05, 1), ax=ax05, cmap='Greys', title='B05 (Red edge, 20m)')
show((band06, 1), ax=ax06, cmap='Greys', title='B06 (Red edge, 20m)')
show((band07, 1), ax=ax07, cmap='Greys', title='B07 (Red edge, 20m)')
show((band08, 1), ax=ax08, cmap='Greys', title='B08 (Near Infrared (NIR), 10m)')
show((band09, 1), ax=ax09, cmap='Greys', title='B09 (Near Infrared (NIR), 60m)')
show((band11, 1), ax=ax11, cmap='Greys', title='B11 (Short Wave Infrared (SWIR), 20m)')
show((band12, 1), ax=ax12, cmap='Greys', title='B12 (Short Wave Infrared (SWIR), 20m)')
show((band8A, 1), ax=ax8A, cmap='Greys', title='B8A (Narrow Near Infrared (NIR), 20m)')

for ax in fig.axes:
    ax.ticklabel_format(style='plain')
    ax.set_xlim(452100, 467700)
    ax.set_ylim(5758000, 5769500)
    ax.title.set_size('medium')

plt.show()

Three things to say here.

First, only bands 01-04 are visible light. B01 captures a narrow region at the UV edge of blue light; and B02, B03, and B04 are our blue, green, and red light. I’ve shown these in appropriate colours. B05 and B06 are on the edge between red and infrared, and the other bands are fully infrared ‘colours’: and I’ve shown those in grey above because we don’t have a colour language for them.
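For reference, here’s a rough lookup of the central wavelengths of the Sentinel-2 bands as a Python dictionary. These values are approximate (rounded, and they differ slightly between the two Sentinel-2 satellites) - check the Copernicus Sentinel-2 documentation for the exact figures:

```python
# Approximate central wavelengths of the Sentinel-2 bands, in nanometres.
# Rounded values - consult the official Sentinel-2 documentation for exact figures.
sentinel2_wavelengths_nm = {
    'B01': 443,   # blue/UV edge (coastal aerosol)
    'B02': 490,   # blue
    'B03': 560,   # green
    'B04': 665,   # red
    'B05': 705,   # red edge
    'B06': 740,   # red edge
    'B07': 783,   # red edge
    'B08': 842,   # near infrared (NIR)
    'B8A': 865,   # narrow NIR
    'B09': 945,   # NIR (water vapour)
    'B11': 1610,  # short wave infrared (SWIR)
    'B12': 2190,  # SWIR
}

# Anything beyond roughly 700 nm is infrared - invisible to our eyes
infrared = [b for b, wl in sentinel2_wavelengths_nm.items() if wl > 700]
print(infrared)
```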

The image below shows the visible light spectrum and the near and shortwave infrared, with the ranges of the Sentinel 2 bands:

Sentinel 2 band ranges

In each of the images above, you should be able to see that different parts of the area are lighter or darker in that particular colour - and the patterns are not identical across all of the images. What’s lighter in some is darker in others; some features are visible in one image, but not in another. This is because different objects have different colours, whether in visible or infrared light.

Second, you might notice that there is no Band 10 image in the maps above. This is because the data I’ve given here as the sample has been processed to what’s referred to as L2A (Level 2A). It’s not crucial to understand all the data processing steps, but L1C (Level 1C) data is essentially the data processed to the correct geographic area, with no further corrections; L2A data has been further processed to remove the effects of the signal travelling through the atmosphere. Band 10 is used for looking at the atmosphere itself - so the L2A correction essentially removes all data from Band 10, which is why it isn’t shown above.

Third, this is a rather awkward way of showing everything - I have to specify each image individually. That’s fine for a one-off, but if I want to do more with it, it’s going to get awkward.

Each of the images is a raster which stores only a single band. But it’s all showing the same area - so why not make it a single multiband raster? That’s really what it is.

meta = band01.meta
meta.update(count=13)

B01 = {'band': 'B01', 'index': 1, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B01.tiff'}
B02 = {'band': 'B02', 'index': 2, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B02.tiff'}
B03 = {'band': 'B03', 'index': 3, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B03.tiff'}
B04 = {'band': 'B04', 'index': 4, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B04.tiff'}
B05 = {'band': 'B05', 'index': 5, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B05.tiff'}
B06 = {'band': 'B06', 'index': 6, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B06.tiff'}
B07 = {'band': 'B07', 'index': 7, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B07.tiff'}
B08 = {'band': 'B08', 'index': 8, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B08.tiff'}
B09 = {'band': 'B09', 'index': 9, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B09.tiff'}
B11 = {'band': 'B11', 'index': 11, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B11.tiff'}
B12 = {'band': 'B12', 'index': 12, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B12.tiff'}
B8A = {'band': 'B8A', 'index': 13, 'file': 'https://github.com/bamacgabhann/GY4006/raw/main/gy4006/sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_L2A_B8A.tiff'}

with rio.open('../sample_data/2021-04-25_11-46-59_Sentinel-2_L2A.tif', 'w', **meta) as dst:
    bands = [B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A]
    for band in bands:
        b = rio.open(band['file'])
        dst.write(b.read(1), band['index'])
        dst.set_band_description(band['index'], band['band'])
        b.close()

I set Band 8A to be indexed as band 13 in this combined raster, because otherwise it would end up as band 9, with Band 09 as band 10 and so on which would get confusing very quickly.
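To keep track of that numbering, here’s the band-name-to-index mapping for the combined raster, written out as a Python dictionary (note that index 10 is deliberately left unwritten - that’s where the discarded B10 would go):

```python
# Band name -> band index in the combined raster.
# Index 10 is left unwritten (that's where B10 would go),
# and B8A is appended at the end as index 13.
band_index = {
    'B01': 1, 'B02': 2, 'B03': 3, 'B04': 4,
    'B05': 5, 'B06': 6, 'B07': 7, 'B08': 8,
    'B09': 9,
    'B11': 11, 'B12': 12,
    'B8A': 13,
}

print(band_index['B8A'])  # 13
```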

killarney = rio.open('../sample_data/2021-04-25_11-46-59_Sentinel-2_L2A.tif')
killarney.meta
{'driver': 'GTiff',
 'dtype': 'float32',
 'nodata': None,
 'width': 1549,
 'height': 1148,
 'count': 13,
 'crs': CRS.from_wkt('PROJCS["WGS 84 / UTM zone 29N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-9],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32629"]]'),
 'transform': Affine(10.003089244673982, 0.0, 452133.272164,
        0.0, -10.002688485192044, 5769540.027478)}

Now, we can do some interesting things with the data a bit more easily.

2. Data Cleaning#

Unfortunately there’s one more preliminary step we have to do, and that’s a little bit of data cleaning.

All of the values contained in our rasters should be floating point numbers between 0 and 1. However, as a consequence of the preliminary data processing to L2A, some of the values have been increased above 1.

We can see this if we ask rasterio to show us all values above 1:

killarney.read()[killarney.read()>1]
array([1.0408, 1.0912, 1.0632, 1.0351, 1.0795, 1.0115, 1.0164, 1.0837,
       1.0431, 1.0433, 1.0135, 1.041 , 1.0789, 1.0129, 1.0082, 1.0734,
       1.0346, 1.031 , 1.0018, 1.0414, 1.101 , 1.048 , 1.0529, 1.0188,
       1.0005, 1.0264, 1.1168, 1.1912, 1.0184, 1.0128, 1.0616, 1.1376,
       1.276 , 1.3104, 1.2736, 1.028 , 1.1872, 1.1784, 1.1296, 1.0576,
       1.04  , 1.0352, 1.012 , 1.0128, 1.0056, 1.1176, 1.0686, 1.1327,
       1.1046, 1.0368, 1.0534, 1.1332, 1.145 , 1.2644, 1.4308, 1.4245,
       1.1414, 1.1372, 1.4183, 1.5269, 1.4907, 1.2835, 1.0792, 1.1326,
       1.4379, 1.5442, 1.5247, 1.3933, 1.1456, 1.1638, 1.0407, 1.3253,
       1.4826, 1.525 , 1.4702, 1.2932, 1.1846, 1.119 , 1.2221, 1.4279,
       1.528 , 1.5212, 1.4091, 1.1683, 1.1551, 1.1641, 1.0242, 1.1261,
       1.3788, 1.5331, 1.5463, 1.4932, 1.4286, 1.2591, 1.0758, 1.176 ,
       1.1132, 1.0214, 1.3315, 1.5411, 1.5581, 1.5277, 1.555 , 1.4588,
       1.0907, 1.1598, 1.1642, 1.2854, 1.5514, 1.5562, 1.5135, 1.5517,
       1.4949, 1.2358, 1.1145, 1.176 , 1.2017, 1.5131, 1.5403, 1.5121,
       1.5503, 1.5132, 1.3388, 1.0713, 1.0394, 1.0815, 1.4276, 1.5104,
       1.5225, 1.549 , 1.5123, 1.3998, 1.1664, 1.3374, 1.4774, 1.5329,
       1.5505, 1.5137, 1.4506, 1.2549, 1.2411, 1.4402, 1.5428, 1.5542,
       1.5169, 1.4904, 1.3354, 1.0911, 1.3475, 1.5118, 1.5531, 1.5365,
       1.5261, 1.3815, 1.2005, 1.4396, 1.5446, 1.5684, 1.5537, 1.3909,
       1.0201, 1.2626, 1.347 , 1.3505, 1.3144, 1.1616, 1.0077, 1.2594,
       1.2452, 1.1032, 1.2182, 1.5389, 1.6002, 1.4924, 1.1722, 1.0303,
       1.2635, 1.3106, 1.2356, 1.0065, 1.0061, 1.0484, 1.0514, 1.0154,
       1.0372, 1.2445, 1.1604, 1.1547, 1.4934, 1.4146, 1.1507, 1.1799,
       1.5189, 1.4784, 1.2408, 1.0618, 1.3714, 1.3986, 1.2356, 1.1105,
       1.302 , 1.3238, 1.0391, 1.0543, 1.1786, 1.1378, 1.1085, 1.406 ,
       1.5679, 1.5632, 1.3211, 1.1271, 1.1984, 1.2242, 1.5111, 1.5667,
       1.5275, 1.4772, 1.267 , 1.237 , 1.4109, 1.181 , 1.2313, 1.5382,
       1.557 , 1.5128, 1.558 , 1.4277, 1.2685, 1.5257, 1.3546, 1.0458,
       1.1312, 1.4873, 1.5359, 1.5153, 1.5642, 1.4736, 1.0787, 1.2246,
       1.5463, 1.4665, 1.2037, 1.025 , 1.4223, 1.5091, 1.5173, 1.5559,
       1.5019, 1.2828, 1.0126, 1.1292, 1.4936, 1.5236, 1.3404, 1.3418,
       1.4761, 1.5189, 1.5327, 1.5116, 1.4959, 1.3695, 1.0053, 1.3684,
       1.5261, 1.4551, 1.0113, 1.2187, 1.4208, 1.5221, 1.527 , 1.5157,
       1.5842, 1.5332, 1.2088, 1.2397, 1.4875, 1.5057, 1.107 , 1.054 ,
       1.3438, 1.5271, 1.538 , 1.514 , 1.5516, 1.5102, 1.3114, 1.0423,
       1.106 , 1.4064, 1.4914, 1.1538, 1.2698, 1.5374, 1.5504, 1.5128,
       1.5294, 1.5041, 1.4248, 1.2238, 1.1918, 1.2982, 1.0308, 1.198 ,
       1.552 , 1.564 , 1.5121, 1.5158, 1.5122, 1.5464, 1.4108, 1.1035,
       1.505 , 1.5469, 1.5135, 1.5126, 1.5157, 1.5972, 1.4973, 1.3977,
       1.4997, 1.5168, 1.5196, 1.5147, 1.5793, 1.4865, 1.0296, 1.2489,
       1.4401, 1.5352, 1.5431, 1.5291, 1.5852, 1.4981, 1.0919, 1.059 ,
       1.3664, 1.5657, 1.5801, 1.5556, 1.6098, 1.5261, 1.141 , 1.194 ,
       1.4323, 1.4457, 1.3982, 1.3871, 1.277 , 1.1401, 1.1453, 1.0633,
       1.2586, 1.3798, 1.4125, 1.4366, 1.4134, 1.3601, 1.1953, 1.1044,
       1.542 , 1.6079, 1.5774, 1.583 , 1.5709, 1.5679, 1.4712, 1.1583,
       1.1893, 1.5461, 1.5715, 1.5207, 1.5197, 1.5204, 1.5295, 1.5204,
       1.487 , 1.392 , 1.1937, 1.1448, 1.4452, 1.5347, 1.5582, 1.5682,
       1.5511, 1.4909, 1.4567, 1.5386, 1.5806, 1.4756, 1.2962, 1.0432,
       1.4281, 1.5277, 1.4211, 1.0808, 1.2213, 1.4337, 1.5881, 1.6211,
       1.5701, 1.3937, 1.2666, 1.3576, 1.4727, 1.5077, 1.5049, 1.4764,
       1.3892, 1.2216, 1.1162, 1.5941, 1.6302, 1.4476, 1.025 , 1.008 ,
       1.155 , 1.2   , 1.1687, 1.0108, 1.0791, 1.2428, 1.3996, 1.5307,
       1.5683, 1.439 , 1.2145, 1.2782, 1.5254, 1.4056, 1.1312, 1.0499,
       1.2921, 1.428 , 1.3584, 1.2234, 1.117 , 1.1268, 1.4359, 1.5206,
       1.19  , 1.0507, 1.0483, 1.0165, 1.0619, 1.0293, 1.3149, 1.5616,
       1.5187, 1.1082, 1.3743, 1.5104, 1.3619, 1.2194, 1.4184, 1.338 ,
       1.1126, 1.3418, 1.333 , 1.0979, 1.0028, 1.3162, 1.5055, 1.4951,
       1.3419, 1.0303, 1.1722, 1.3958, 1.2195, 1.264 , 1.5325, 1.5632,
       1.4687, 1.2995, 1.0346, 1.3848, 1.5494, 1.4873, 1.133 , 1.0818,
       1.4747, 1.5378, 1.5053, 1.535 , 1.4148, 1.1241, 1.5891, 1.6544,
       1.4765, 1.0038, 1.2804, 1.4433, 1.5161, 1.5891, 1.5225, 1.1858,
       1.2523, 1.5298, 1.5603, 1.3719, 1.2822, 1.4868, 1.5071, 1.4403,
       1.3677, 1.1599, 1.1601, 1.4308, 1.5107, 1.3582, 1.0661, 1.0879,
       1.3938, 1.4529, 1.4302, 1.4538, 1.3181, 1.4715, 1.6171, 1.4951,
       1.0845, 1.2269, 1.3802, 1.4322, 1.4492, 1.3113, 1.578 , 1.6811,
       1.5151, 1.0525, 1.1117, 1.2827, 1.3533, 1.258 , 1.1118, 1.536 ,
       1.6412, 1.5315, 1.155 , 1.021 , 1.1678, 1.1361, 1.0036, 1.327 ,
       1.527 , 1.5018, 1.3281, 1.0238, 1.4383, 1.5591, 1.5052, 1.2715,
       1.3106, 1.5785, 1.5195, 1.3028, 1.3059, 1.5171, 1.5141, 1.3052,
       1.455 , 1.5952, 1.5091, 1.2716, 1.3877, 1.5539, 1.5161, 1.3347,
       1.0277, 1.1168, 1.3945, 1.4818, 1.2921, 1.1275, 1.3854, 1.1615,
       1.022 , 1.1631, 1.4005, 1.4133, 1.2581, 1.195 , 1.463 , 1.5316,
       1.43  , 1.0958, 1.2466, 1.3791, 1.3813, 1.1972, 1.0528, 1.2492,
       1.3431, 1.2781, 1.1426, 1.0938, 1.2747, 1.3219, 1.2943, 1.2391,
       1.0762, 1.0802, 1.2798, 1.3911, 1.393 , 1.2527, 1.1811, 1.437 ,
       1.4697, 1.3449, 1.0373, 1.1662, 1.4265, 1.4992, 1.4068, 1.0966,
       1.195 , 1.3757, 1.4858, 1.4304, 1.0807, 1.1455, 1.3822, 1.4727,
       1.3976, 1.0703, 1.0456, 1.4139, 1.4532, 1.3288, 1.0671, 1.3577,
       1.4175, 1.331 , 1.1673, 1.2465, 1.3744, 1.3791, 1.3194, 1.1025,
       1.1681, 1.3708, 1.4296, 1.3076, 1.0325, 1.1197, 1.3908, 1.4716,
       1.179 , 1.1682, 1.4014, 1.4248, 1.0268, 1.2759, 1.3834, 1.2877,
       1.1724, 1.3004, 1.184 , 1.2503, 1.3534, 1.2509, 1.339 , 1.2896,
       1.0475, 1.0119, 1.2412, 1.0499, 1.0147, 1.0462, 1.0637, 1.2344,
       1.3372, 1.0116, 1.2776, 1.1098, 1.0744, 1.0985, 1.1955, 1.013 ,
       1.3044, 1.0117, 1.2282, 1.078 , 1.1915, 1.021 , 1.2559, 1.152 ,
       1.1954, 1.1691, 1.2049, 1.2818, 1.1509, 1.1977, 1.3044, 1.191 ,
       1.0235, 1.2333, 1.206 , 1.42  , 1.447 , 1.2359, 1.4868, 1.5932,
       1.4733, 1.1215, 1.0846, 1.535 , 1.6592, 1.5784, 1.2804, 1.1695,
       1.5312, 1.6305, 1.5565, 1.3109, 1.0669, 1.3538, 1.3968, 1.2959,
       1.0881, 1.0717, 1.0471, 1.0868, 1.1528, 1.2009, 1.3018, 1.07  ,
       1.0315, 1.0585, 1.0392, 1.1036, 1.1389, 1.0782, 1.0533, 1.1194,
       1.0863, 1.0073, 1.0714, 1.0466], dtype=float32)

That looks like a lot, but for context:

killarney.read()[killarney.read()>1].shape
(740,)

That means there are 740 values above 1. The whole dataset, though:

killarney.read().shape
(13, 1148, 1549)

It’s actually 12 bands, since band 10 is empty, but that’s still 12 x 1148 x 1549 pixels: a total of 21,339,024.

740 errors in over 21 million is a pretty good ratio, really.
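That `killarney.read()[killarney.read()>1]` pattern is NumPy boolean indexing: comparing an array to a value gives an array of True/False, which can then be used to select elements. A tiny sketch with made-up values:

```python
import numpy as np

a = np.array([0.5, 1.2, 0.9, 1.01], dtype=np.float32)

mask = a > 1             # element-wise comparison -> boolean array
print(mask)              # [False  True False  True]

selected = a[mask]       # keeps only the positions where mask is True
print(selected.shape)    # (2,) - two values above 1
```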

Anyway, we can easily fix this issue by just setting all these values to 1:

k = killarney.read()
k[k>1]=1.0
kmeta=killarney.meta
with rio.open('../sample_data/2021-04-25_11-46-59_Sentinel-2_L2A.tif', 'w', **kmeta) as dst:
    dst.write(k)

And just to check it worked:

killarney = rio.open('../sample_data/2021-04-25_11-46-59_Sentinel-2_L2A.tif')
killarney.meta
{'driver': 'GTiff',
 'dtype': 'float32',
 'nodata': None,
 'width': 1549,
 'height': 1148,
 'count': 13,
 'crs': CRS.from_wkt('PROJCS["WGS 84 / UTM zone 29N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-9],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32629"]]'),
 'transform': Affine(10.003089244673982, 0.0, 452133.272164,
        0.0, -10.002688485192044, 5769540.027478)}
killarney.read()[killarney.read()>1]
array([], dtype=float32)

Empty array means it worked!
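Assigning via a boolean mask, as above, is one common way to cap values; NumPy’s `np.clip` does the same job in a single call. A small sketch with made-up values:

```python
import numpy as np

k = np.array([0.2, 1.04, 0.9, 1.3], dtype=np.float32)

# Cap everything at 1.0; None means no lower bound is applied
clipped = np.clip(k, None, 1.0)
print(clipped)  # values above 1 become exactly 1; others are unchanged
```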

It is important to do things like this, because if you want to do calculations involving subtraction and division, out-of-range values could lead to some unfortunate errors.

Anyway, let’s get on with the fun parts.

3. False Colour Visualisations#

The problem with trying to look at 12 or 13 different colours, including a number of infrared ‘colours’, is that we can only see three: every colour we perceive is a combination of red, green, and blue.

But just because an image has been captured in a colour we’re unable to see, doesn’t mean we can’t show it in a colour we can see. That’s essentially what I did above, showing the infrared colours in grey. But we can show them in any colour we want to.

For example, here’s Band 8A shown in green.

fig, ax = plt.subplots(figsize=(10, 6))
show((killarney, 13), ax=ax, cmap='Greens', title='B8A (Narrow Near Infrared (NIR), 20m)')
ax.ticklabel_format(style='plain')
plt.show()

And here’s the blue light Band 02 shown in red:

fig, ax = plt.subplots(figsize=(10, 6))
show((killarney, 2), ax=ax, cmap='Reds', title='B02 (Blue visible light, 10m)')
ax.ticklabel_format(style='plain')
plt.show()

In these cases, I’m specifying which single colour to use - but we don’t have to stick with just one colour at a time. If Rasterio is passed 3 bands, it treats them as RGB - and of course, there’s no law saying we can only pass red, green, and blue as RGB.

For example, here’s a commonly used false colour combination based on bands 8, 4, and 3. This shows Band 08 (NIR) as red, Band 04 (red light) as green, and band 03 (green light) as blue.

(Rasterio expects the bands to be presented in the order red, green, blue - hence 8, 4, 3 in that order.)

fig, ax = plt.subplots(figsize=(10, 6))
show((killarney, [8,4,3]), ax=ax, title='False colour visualisation (B08, B04, B03)')
ax.ticklabel_format(style='plain')
plt.show()
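Under the hood, an RGB display is just a three-layer array. Here’s a minimal NumPy sketch of what passing three bands in R, G, B order amounts to, using made-up 2×2 “bands”:

```python
import numpy as np

# Three made-up single-band arrays, same shape, values between 0 and 1
nir   = np.full((2, 2), 0.8)  # will be displayed as red
red   = np.full((2, 2), 0.3)  # will be displayed as green
green = np.full((2, 2), 0.2)  # will be displayed as blue

# Stack into a (rows, cols, 3) array - the shape an RGB display expects
false_colour = np.dstack([nir, red, green])
print(false_colour.shape)  # (2, 2, 3)
```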

And here’s a commonly used visualisation referred to as SWIR (Short Wave Infrared), based on Band 12 (SWIR), Band 8A (NIR), and Band 04 (red light):

fig, ax = plt.subplots(figsize=(10, 6))
show((killarney, [12,13,4]), ax=ax, title='SWIR visualisation (B12, B8A, B04)')
ax.ticklabel_format(style='plain')
plt.show()

This SWIR visualisation is extremely useful for distinguishing clouds, snow, and ice in areas where those are present (which all look white in RGB images, but show quite differently in SWIR). It’s also useful for distinguishing different kinds of vegetation, and different kinds of rocks - as well as something in the National Park area which I’m quite sure you’ve noticed, and are wondering about. Hold that thought.

Here’s one for you to play with a little more easily.

fig, ax = plt.subplots(figsize=(10, 6))

red_band = 9    # Change this number to the band you want to show in Red
green_band = 6  # Change this number to the band you want to show in Green
blue_band = 1   # Change this number to the band you want to show in Blue

show((killarney, [red_band, green_band, blue_band]), ax=ax, title='Custom visualisation')
ax.ticklabel_format(style='plain')
plt.show()

It would be impossible for me to list all the combinations and what they’re useful for, in exactly the same way that I wouldn’t be able to list how to identify every object by its colour - it’s too varied; there are just too many things. There are many, many uses, including mapping water, urban areas, the type and health of vegetation, geology, land use, and much more. This is another case of: if you have an idea in mind, look up how to do it.

4. Indices#

We can go further. We’re not limited to the bands as they are saved in the raster file - we can do calculations with them.

Again, this is a case where there are hundreds, thousands of different calculations possible, so it’s not about learning the possibilities. Once again, it’s a case of, when you have something you want to do, you figure out how to do it. That said, you need to have some understanding of what’s possible in order to have an idea of where to start, so I’m going to show you two examples in order to explain the concept.

NDVI

Possibly the best known and most-used index is the Normalised Difference Vegetation Index, or NDVI. This is based on just two bands: Band 08 (NIR) and Band 04 (red light).

Instead of plotting these two bands in different colours, the index compares the values of the two bands. This is the Difference part of the index formula:

B08 - B04

We can plot just this part:

fig, ax = plt.subplots(figsize=(10, 6))
show((killarney.read(8)-killarney.read(4)), ax=ax, title='B08-B04')
ax.axis('off')
plt.show()

There is a problem with doing this though. Say we have two pixels.

In the first, B08 is 0.8, and B04 is 0.7.

In the second, B08 is 0.2, and B04 is 0.1.

In both cases, B08 - B04 gives 0.1 - but that’s now hiding the very different original values in B08 and B04 in the two pixels.

To avoid this, what we do is normalise the difference by the total of the original values, i.e. B08 + B04. Doing this means the 0.1 from our first pixel will be divided by 0.8 + 0.7 = 1.5, giving a result of about 0.067; while the 0.1 from our second pixel will be divided by 0.2 + 0.1 = 0.3, giving a result of about 0.333. The difference is the same, but the normalised values now reflect the different original values.

So, the total formula is (B08 - B04) / (B08 + B04).
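We can sketch the formula as a small Python function and plug in the two example pixels:

```python
def normalised_difference(b08, b04):
    """(B08 - B04) / (B08 + B04) - the NDVI formula, for single values."""
    return (b08 - b04) / (b08 + b04)

# First pixel: B08 = 0.8, B04 = 0.7 -> 0.1 / 1.5
print(round(normalised_difference(0.8, 0.7), 3))  # 0.067

# Second pixel: B08 = 0.2, B04 = 0.1 -> 0.1 / 0.3
print(round(normalised_difference(0.2, 0.1), 3))  # 0.333
```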

This also gives us a good, fixed scale. Since both bands vary between 0 and 1, the minimum that B08 - B04 can be is -1, in the case where B08 is 0 and B04 is 1. Similarly, the maximum is the case where B08 is 1 and B04 is 0, meaning B08 - B04 = 1. In both cases, B08 + B04 = 1. So, the NDVI can vary between -1 and +1.

Let’s look at the full formula result:

ndvi = (killarney.read(8)-killarney.read(4))/(killarney.read(8)+killarney.read(4))
ndvi_meta = killarney.meta
ndvi_meta.update(count=1)

with rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_NDVI.tif', 'w', **ndvi_meta) as dst:
    dst.write(ndvi, 1)
    dst.set_band_description(1, 'NDVI')
/tmp/ipykernel_345/991251910.py:1: RuntimeWarning: invalid value encountered in divide
  ndvi = (killarney.read(8)-killarney.read(4))/(killarney.read(8)+killarney.read(4))
killarney_NDVI = rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_NDVI.tif')
fig, ax = plt.subplots(figsize=(10, 6))

temp_scale_setup = ax.imshow(killarney_NDVI.read()[0], cmap='RdYlGn', vmin=-1, vmax=1)
cb = fig.colorbar(temp_scale_setup, ax=ax, label="NDVI")
cb.set_ticks([-1, -0.5, 0, 0.5, 1])

show((killarney_NDVI, 1), ax=ax, title='NDVI: Normalised Difference Vegetation Index, (B08 - B04) / (B08 + B04)', cmap='RdYlGn', vmin=-1, vmax=1)
ax.ticklabel_format(style='plain')
plt.show()

As you might guess from the name, the NDVI is excellent at distinguishing vegetation. Values close to 1 indicate strong vegetation like forestry. Low positive numbers generally indicate grasslands and shrubs, while bare soil, rock, and water will usually have negative values.
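Those rough bands can be turned into a quick classification sketch with np.digitize. Note that the thresholds here are illustrative rules of thumb I've chosen for the example, not fixed standards - in practice they vary with sensor, season, and land cover:

```python
import numpy as np

ndvi = np.array([-0.3, 0.15, 0.75])  # e.g. water/rock, grassland, forest
bins = [0.0, 0.4]                    # rough, illustrative class boundaries
labels = ['non-vegetated', 'grass/shrub', 'strong vegetation']

# np.digitize returns the bin index for each value: <0 -> 0, 0-0.4 -> 1, >=0.4 -> 2
classes = [labels[i] for i in np.digitize(ndvi, bins)]
print(classes)  # ['non-vegetated', 'grass/shrub', 'strong vegetation']
```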

NDMI

Similarly, we can calculate the Normalised Difference Moisture Index, using bands B8A and B11.

ndmi = (killarney.read(13)-killarney.read(11))/(killarney.read(13)+killarney.read(11))

ndmi_meta = killarney.meta
ndmi_meta.update(count=1)

with rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_NDMI.tif', 'w', **ndmi_meta) as dst:
    dst.write(ndmi, 1)
    dst.set_band_description(1, 'NDMI')

killarney_NDMI = rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_NDMI.tif')
fig, ax = plt.subplots(figsize=(10, 6))

temp_scale_setup = ax.imshow(killarney_NDMI.read()[0], cmap='jet_r', vmin=-1, vmax=1)
cb = fig.colorbar(temp_scale_setup, ax=ax, label="NDMI")
cb.set_ticks([-1, -0.5, 0, 0.5, 1])

show((killarney_NDMI, 1), ax=ax, title='NDMI: Normalised Difference Moisture Index, (B8A - B11) / (B8A + B11)', cmap='jet_r', vmin=-1, vmax=1)
ax.ticklabel_format(style='plain')
plt.show()
/tmp/ipykernel_345/4234981925.py:1: RuntimeWarning: invalid value encountered in divide
  ndmi = (killarney.read(13)-killarney.read(11))/(killarney.read(13)+killarney.read(11))
[Figure: NDMI: Normalised Difference Moisture Index, (B8A - B11) / (B8A + B11)]

You might be wondering why the lake isn’t showing up solid red, and that’s because it’s a moisture index, not a water index - it’s very good at highlighting moisture in vegetation. So, strong blue colours here will represent vegetation with a lot of water, while light blues to greens will represent drier vegetation suffering from water stress. Green to red colours generally show non-vegetated areas.

There’s many, many more where these came from - I just wanted to use these two as examples. There’s a great list curated on GitHub which you can find here: awesome-spectral-indices/awesome-spectral-indices

5. Segmentation#

We can go beyond indices, and do more complex calculations to identify particular features in our imagery.

Again, I’ll demonstrate with an example. We’ll start by doing a band calculation similar to the NDVI or NDMI indices:

segmented = (killarney.read(12)-killarney.read(11))/(killarney.read(12)+killarney.read(11)+0.25)

Now we apply a threshold value - setting any pixels below that value as np.nan, Not a Number. This leaves only the data we’re interested in, i.e. the areas where the pixel value is higher than our threshold.

segmented[segmented<0.04]=np.nan

We can plot this on a map - and let’s add an RGB basemap, so that there’s something to see in the areas where our segmented raster values are np.nan.

fig, ax = plt.subplots(figsize=(10, 6))
k = killarney.read()
show(k[1:4][::-1]*2.5, transform=killarney.transform, ax=ax)
show(segmented, transform=killarney.transform, ax=ax, title='Killarney Wildfire 25 April 2021', cmap="Reds_r")

ax.ticklabel_format(style='plain')
plt.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..2.5].
[Figure: Killarney Wildfire 25 April 2021, segmented fire pixels over RGB basemap]

You can see the red areas on the map, I hope - and you can also see the title. This process has isolated pixels which are interpreted as showing active wildfire burning.

I can confirm their accuracy: I was there at the time. I’m in that photo, somewhere. Just a bit too small to make out!

We can save this as a file:

fire_meta = killarney.meta
fire_meta.update(count=1)

with rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_fire.tif', 'w', **fire_meta) as dst:
    dst.write(segmented, 1)
    dst.set_band_description(1, 'Wildfire 2021-04-25')

Now, we didn’t actually have to do the thresholding there. We could have just shown the result of the calculation, without the threshold:

calc = (killarney.read(12)-killarney.read(11))/(killarney.read(12)+killarney.read(11)+0.25)
fig, ax = plt.subplots(figsize=(10, 6))

show(calc, transform=killarney.transform, ax=ax, title='Killarney Wildfire 25 April 2021', cmap="Reds_r")

ax.ticklabel_format(style='plain')
plt.show()
[Figure: Killarney Wildfire 25 April 2021, unthresholded band calculation]

We can still see the active fire pixels are much brighter there, but the rest of the map isn’t telling us anything useful. Much better to isolate the useful pixels, and show only them.

We can also create an array of Boolean values, listing False for non-fire and True for fire pixels. This would commonly be referred to as a mask - such Boolean data is used to mask out (i.e. hide) pixels with a Boolean value of False in the mask data.

fire_shape = (killarney.read(12)-killarney.read(11))/(killarney.read(12)+killarney.read(11)+0.25)
fire_shape[np.isnan(fire_shape)]=0  # treat NaN (nodata) pixels as non-fire
fire_shape[fire_shape<0.04]=0
fire_shape[fire_shape>=0.04]=1
fire_mask = fire_shape.astype(bool)
fire_mask
array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]],
      shape=(1148, 1549))
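As an aside, the same Boolean mask can be built in a single comparison, with no intermediate 0/1 step. A comparison with NaN evaluates to False, so any NaN pixels (from nodata, say) automatically end up as non-fire. A minimal sketch with a small stand-in array:

```python
import numpy as np

calc = np.array([[0.10, 0.01],
                 [np.nan, 0.50]])

# NaN >= 0.04 evaluates to False, so NaN pixels are excluded from the mask
fire_mask = calc >= 0.04
print(fire_mask)
```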

Something along these lines is often a key step in building machine learning models, where the computer essentially learns where particular values are True, and uses that to predict where values would be True elsewhere.

I’m not going any further into machine learning though. More usefully for our purposes right now, we can create vector shapes for the fire areas:

fire_shape[fire_shape==0]=np.nan

with rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_fire_shape.tif', 'w', **fire_meta) as dst:
    dst.write(fire_shape, 1)
    dst.set_band_description(1, 'Fire')

fireshapes_raster = rio.open('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_fire_shape.tif')
fireshapes_df = dataset_features(fireshapes_raster, bidx=1, band=True, as_mask=False, with_nodata=False, geographic=False)
fire_vector = gpd.GeoDataFrame.from_features(fireshapes_df,crs=32629)
fire_vector = fire_vector[fire_vector['val']==1.0]
fire_vector = fire_vector.drop(columns=['val', 'filename'])
fire_vector['Fire']=True
fire_vector.to_file('../sample_data/Killarney/2021-04-25_11-46-59_Sentinel-2_fire_vector.gpkg')
fig, ax = plt.subplots(figsize=(10, 6))
k = killarney.read()
show(k[1:4][::-1]*2.5, transform=killarney.transform, ax=ax)
fire_vector.plot(ax=ax, color='Red')

ax.ticklabel_format(style='plain')
plt.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..2.5].
[Figure: fire area vector shapes over RGB basemap]

Once you have vector shapes, more geoprocessing options open up:

sum(fire_vector.geometry.area)
185407.0767243383

That’s how many square metres were burning when this imagery was collected.
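Square metres are hard to picture at this scale; a quick conversion to hectares is often more readable. A small sketch, using the area value printed above:

```python
# Total area from fire_vector.geometry.area, in square metres
# (EPSG:32629 is a metre-based UTM CRS, so .area gives m²)
area_m2 = 185407.0767243383

area_ha = area_m2 / 10_000  # 1 hectare = 10,000 m²
print(f"{area_ha:.1f} ha")  # 18.5 ha
```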

Of course, we can also combine some of the data we’ve just analysed with some of the data from the Single-band Raster Data Notebook:

dem = rio.open('https://raw.githubusercontent.com/bamacgabhann/GY4006/main/gy4006/sample_data/Killarney/DEM_COPERNICUS_30.tiff')
shade = es.hillshade(dem.read(1), azimuth=270, altitude=30)
bounds  = gpd.GeoDataFrame({"id": 1, "geometry":[box(*dem.bounds)]}).set_crs(32629)
killarney_roads = ox.features_from_polygon(bounds.to_crs(4326)["geometry"][0], tags={"highway": True}).to_crs(32629)

fig, ax = plt.subplots(figsize=(10, 6))

temp_scale_setup = ax.imshow(killarney_NDVI.read()[0], cmap='RdYlGn', vmin=-1, vmax=1)
cb = fig.colorbar(temp_scale_setup, ax=ax, label="NDVI")
cb.set_ticks([-1, -0.5, 0, 0.5, 1])

ax.ticklabel_format(style='plain')
ax.set_xlim(452100, 467700)
ax.set_ylim(5758000, 5769500)

show((killarney_NDVI, 1), ax=ax, title='Killarney 2021-04-25, NDVI with hillshade DEM, roads, and active wildfires', cmap='RdYlGn', vmin=-1, vmax=1)
show(shade, transform=dem.transform, ax=ax, cmap="Greys", alpha=0.3)
killarney_roads.plot(ax=ax, column="highway", cmap="gist_yarg")
fire_vector.plot(ax=ax, color='Red')

plt.show()
[Figure: Killarney 2021-04-25, NDVI with hillshade DEM, roads, and active wildfires]

Now, how about that for a tasty map?

Summary#

So, yes, through all the raster data so far we’ve actually been exploring data relevant to the Killarney wildfire of April 2021.

  • We started with photographs (actually bands 04, 03, and 02 from the Sentinel 2 satellite);

  • We brought in a DEM of the area;

  • We looked at the land surface temperatures in the area for that month (maximum of 12°C, according to our data);

  • We looked at the NDVI to see vegetation in the area (not great in the fire area, but that’s mostly after the fire);

  • We looked at the NDMI for vegetation moisture (pretty dry in the fire area);

  • Finally, we extracted the shape of the active fires at the time the imagery was collected.

If wildfires are what you’re interested in, there’s plenty more you can do. We could also have tried to extract the shape of the burned area - you can see it pretty clearly on the RGB imagery. We could look at vegetation and moisture before wildfires, and find other places with similar vegetation and moisture which could be at risk. We could assess the vegetation before and after the wildfire to look at the changes.

But this is only one example of how remote sensing data can be used in environmental geospatial data analysis.

Remote sensing imagery can also be used to help assess biodiversity, to monitor agriculture, to look at flooding, to map landslides, to assess water quality, to measure snow and ice cover, and to map land use.

Even when I was doing my PhD, remote sensing data was very hard to work with, and not easy to get. Now? With the massive advance in computational power, as well as resources like Colab, analysis of the data has become possible for essentially anyone who wants to try. Even more importantly, we all have access to some absolutely incredible datasets. Sentinel 2A was launched in 2015, and all of its data is open access. Most data from the US civilian agencies is open access now as well, including Landsat and MODIS, and there’s data available from other countries too.

And that’s just the multispectral satellites. In the EU’s Copernicus programme, there’s also Sentinel 3 monitoring the oceans, Sentinel 5P monitoring the atmosphere and air pollution, and Sentinel 6 measuring sea level change - plus Sentinel 1, which uses radar for Earth observation. That’s up next week for us.

The easiest way to access the data is through the Copernicus Browser. Copernicus is the EU’s Earth Observation programme - registration is required, but it’s free. US data is available from USGS Earth Explorer or NASA Earthdata.

Between the availability of data, and the computing resources available to use it, the possibilities are virtually limitless. It’s just a case of, what do you want to do with it?


GY4006 Notebooks in Colab:

  1. Data Types Open In Colab

  2. Vector Data Open In Colab

  3. Attribute Data Open In Colab

  4. Coordinate Reference Systems Open In Colab

  5. Geospatial Data Files Open In Colab

  6. Vector Geoprocessing Open In Colab

  7. Introduction to Raster Data Open In Colab

  8. Single-band Raster Data Open In Colab

  9. Multi-band Raster Data: Passive Remote Sensing Open In Colab