Geospatial Software and Programming Languages

0. Closed and Open Source

In this module, you’ll be using computer software to map geospatial data. We’ll introduce some of the software you’ll be using below, but before talking about particular software and languages, it’s worth understanding the difference between closed and open source software.

Closed Source / Proprietary Software

When you buy software like ArcGIS Pro, or Microsoft Word, have you ever read the license agreement, or did you just click Accept?

Software like this is proprietary software. It’s created by a company like ESRI or Microsoft, and it’s protected by copyright and other laws. To use the software, you have to pay for it, and what you’re actually paying for is a license to use the software. This license usually imposes strict limits on use - for example allowing installation on only one device (or a small number of devices), banning sharing, and prohibiting users from trying to reverse engineer the source code.

The last point there is a crucial one. With the source code closed in this manner, only the company who writes the software can generally see or edit the code. The reason for this is pretty obvious - if you’re a company who writes and publishes a software package, that’s your product, your company’s income. If you let anyone see the code, then people could just copy it, and then why would anyone pay for your product? So you wouldn’t make any money off it, and that’s your company out of business.

Free and Open Source Software (FOSS)

However, there’s another category of software: open source software, which is usually free and open source (FOSS). Open source software makes its code available to anyone who wants to see it. Most open source software is also free - and in software, there are two specific meanings of the word “free”. It can be “free as in speech”, meaning anyone is free to take that source code and do whatever they want with it, including editing it and distributing their own versions. It can also be “free as in beer”, meaning there’s no payment required. Most open source software is both, but there are notable exceptions.

An open source alternative to ArcGIS Pro is QGIS, which is free both as in speech and as in beer. You don’t have to pay anything for it, and not only can you view the source code, but you can do what you want with it.

FOSS software like this is generally written by anyone and everyone who wants to contribute to it. Usually, there will be one or a small number of project maintainers in charge of the official version, who review all the suggested additions or changes to the code, and publish regular updates. People contribute because they find it useful themselves but want a new feature, because they want the challenge, or because they want to contribute something which will be useful to other people - everyone has different reasons. There’s a huge open source community, and it’s really pretty amazing.

Open source software often has a reputation for lagging behind proprietary software. After all, since it’s free, there’s not a company with loads of full-time staff who can be assigned to development of the software. I’m not sure how many staff ESRI has working on updating and adding new features to ArcGIS Pro, but it’s probably a lot.

Some open source software does have full-time staff though, funded by donations and sponsorship. QGIS, for example, has full-time staff funded not only by donations from individual users, but also by companies and organisations. These include other GIS companies which you might think are competitors, like Felt - which is actually the biggest individual sponsor of QGIS - and organisations which use the software, including the Irish Government Office of Public Works (the list is huge). QGIS probably has fewer people working full time on it than ESRI has on ArcGIS Pro, but

Other open source projects are now increasingly pursuing different income models, for example being free for personal or educational use but charging for commercial use, or allowing free use but offering subscriptions for support services.

The Python language is completely free and open source, maintained by the Python Software Foundation. It does have some full-time staff, including core developers, funded again by donations, but anyone is able to contribute. Probably the vast majority of software written in Python is itself open source, and again anyone can contribute to those projects.

Open Source licenses

Open source software isn’t completely uncontrolled - most will still have a license explaining what you can and can’t do with the code. There are a number of different licenses, with different levels of permissions. One of the most common is the MIT license:

MIT License

Copyright (c) [year] [fullname]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

It’s a bit shorter than the license agreements for software like Word that nobody ever fully reads, right? It also allows anyone to do almost anything - including selling adapted versions - with the only condition being that the copyright notice and permission notice are preserved.

Another common one is the BSD 3-clause license:

Copyright (c) [year] [fullname]

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

More restrictive is the GPL - the GNU General Public License - which is too long to paste here. It requires that any modified versions you distribute must use the same license, and that their source code must be made available.

It’s not important to memorise the license types, but do bear it in mind for a couple of reasons.

First, simply so that you understand what you’ll be using in this module. I will be advocating using open source software here - in fact, you already are. The Notebooks in which I have written the content for this module use the Python programming language, which is open source and published under a custom open source license. The Notebooks themselves are Jupyter Notebooks, published by Project Jupyter under the BSD 3-clause license, and you are running them in Google Colab, which is simply a modified version of Jupyter Notebook hosted online - in other words, Google made use of the BSD license and modified the software to run their own version. QGIS is published under the GNU General Public License (version 2 or later). Both QGIS and ArcGIS Pro also use Python and other open source elements in their software, including the option to write Python code for specific tasks inside the software; and we’ll also be using a few open source Python projects even just in these Notebooks.

Second, so that you understand what you’re allowed to do with certain software and code after the course - including if you want to write any code or contribute to any projects yourself. You might! It’s not as difficult as you might think, and it’s a little bit addictive!

1. Python

As I said, these Notebooks use the Python programming language. The Python language itself is open source, and many open source projects are written in Python - including many extremely useful geospatial projects.

I find that using Python is often the best way to teach geospatial data analysis. In part, that’s because I can write Notebooks like these, giving code examples which you can run - and of course there are no Notebooks for QGIS or ArcGIS. I’ll record some walkthroughs for those, but using the Notebooks means you don’t have to do as much at first, versus if I went straight into QGIS, where you’d have to follow written instructions and try to find the right menus to click, type in the right text, etc.

But it’s also because using Python, you can do things a little more step by step, breaking things down a bit more. In that sense, it’s similar to how it’s better to teach maths by doing calculations and drawing graphs by hand before jumping into using calculators or software like Excel. So, hopefully using Python like this will let you understand what you’re doing, rather than just knowing which button to click to do what you want.

To use Python, essentially you just type out the commands for what you want to do. You can do this either in an interactive shell, where you type in a command and it runs immediately; or a Notebook like this one, where you type out bits of code and run them when you’re ready; or by writing a script, where you write out a full set of code for what you want to do, and run the entire script to do it.
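As a minimal sketch of what “typing out the commands” looks like, the few lines below would work the same way typed into an interactive shell, run in a Notebook cell, or saved as a script. The variable names and the approximate coordinates for Limerick are just values I’ve chosen for illustration.

```python
# Approximate longitude and latitude of Limerick, for illustration only
longitude = -8.63
latitude = 52.66

# Build a short text description of the point and display it
description = f"Point at {latitude} degrees N, {abs(longitude)} degrees W"
print(description)
```

In a shell you would type these lines one at a time; in a Notebook they would sit together in one cell; in a script they would be saved in a file and run all at once.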

I won’t say much more about Python here - but if you’re interested in learning more about it, do let me know and I’ll point you in the right direction.

2. QGIS

QGIS is free and open source desktop GIS software. You can download QGIS from qgis.org and install it on any Windows, Mac, or Linux desktop or laptop computer you want. QGIS is already installed on the UL computer lab PCs, and you will be able to use it for the mapping tasks for this module.

With software like Word, we’re used to opening a file, editing it, and saving it within that software. QGIS works a little differently.

When you open QGIS, the first step is to create a new project. The project file is what QGIS will be editing and saving. Into that project you can add geospatial data as different layers. These layers are completely separate from the project file. The QGIS project file saves which layers are part of the project, where on your computer to find the data for those layers, and how to display them. In some cases, QGIS can edit these layers, and you can create new layers within QGIS, but even when you’re doing that, one of the key steps is to identify where and how that layer should be saved. You can think of the QGIS project as a container for your geospatial data, simply controlling how it’s displayed.
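To make the “container” idea a bit more concrete, here is a toy sketch in Python. It is not the real QGIS project file format, and the class names, file paths, and the single `colour` setting are all hypothetical, made up for illustration. The point is simply that the project stores references to the layers and how to display them, never the data itself.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class LayerReference:
    """What a project remembers about one layer (hypothetical structure)."""
    name: str
    source_path: str  # where the data lives on disk, outside the project file
    colour: str       # a stand-in for the layer's display settings


@dataclass
class Project:
    """A toy stand-in for a GIS project file; not the real QGIS format."""
    title: str
    layers: list = field(default_factory=list)

    def add_layer(self, name, source_path, colour="grey"):
        # Adding a layer records where to find it, without copying any data
        self.layers.append(LayerReference(name, source_path, colour))

    def to_json(self):
        # Only references and styling are saved, never the data itself
        return json.dumps(asdict(self), indent=2)


project = Project("Limerick map")
project.add_layer("rivers", "data/rivers.gpkg", colour="blue")
project.add_layer("roads", "data/roads.shp")
print(project.to_json())
```

One consequence of this design is that if you move or delete one of the data files, the project can no longer find that layer, even though the project file itself is untouched.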

QGIS also isn’t really just one piece of software. You can better think of it as a collection of multiple different open source geospatial tools, handily collected all in one place. It uses GDAL - the Geospatial Data Abstraction Library - for many functions, GRASS GIS and SAGA GIS for others, as well as smaller tools. It can be extended using plugins, additional tools for specific purposes written by particular contributors. Plugins aren’t part of the base QGIS package because they might be useful to only a small subset of users, so it’s preferable not to over-complicate things by installing them for everyone by default.

QGIS will also install a version of Python, because it uses a lot of Python behind the scenes. When you click on a toolbar icon, or on a menu item, it will run a specific command in the background - and many of those are in Python.

3. ArcGIS Pro

ArcGIS Pro is proprietary desktop GIS software published by ESRI. You should already have been added to the UL Geography ArcGIS license, giving you access to ArcGIS Pro and a range of additional tools provided by ESRI, and you should have used ArcGIS Online with Dr. Porter.

ArcGIS Pro works similarly to QGIS, in that what the software itself mainly saves is a project file, which records which layers are part of the project, where to find those on your computer, and how to display them. Like QGIS, it can also be thought of as a container for several different geospatial tools, including some open source tools.

Why might you consider using ArcGIS Pro instead of QGIS? One reason is that ArcGIS Pro is widely regarded as the industry standard, so if you are working for a company which uses ArcGIS Pro, or aim to work for one in the future, it’s best to stick with the tools you’ll need to know.

ArcGIS Pro also does have more features than QGIS. Many of these are features you’d only need for very specialised applications - but you might have one of those some day. More significantly, ESRI also provides a huge range of services and adaptations of ArcGIS, including online storage and tools like ArcGIS Online, Survey123, and StoryMaps, and these are all integrated with ArcGIS Pro. If you have any interest in using tools or services in this wider ArcGIS ecosystem, using ArcGIS Pro would be much simpler.

If you want to use ArcGIS Pro as part of this module instead of QGIS, you can do that - but be aware that I don’t personally use ArcGIS Pro for much - I make no secret that I am an advocate for open source software, and QGIS is more than enough for the GIS work I don’t use Python for. So ArcGIS Pro content and support for this course will be provided by the TAs.

4. Google Colab

Colab is an online platform which allows you to run Python code in Jupyter Notebooks that you have uploaded or created, or from a GitHub repository. So, if there’s a repository with Python (or Julia, or R) code in a Jupyter Notebook, you can run - and even edit - that code in Colab. The Python version and environment for Colab are fixed for all users, so you have to work with whatever versions of Python and specific Python packages Google has chosen, or which are compatible with what Google has chosen - but that’s easily enough for 99.9% of purposes.
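If you’re curious which versions you have been given, a short check like the sketch below will tell you. It uses only Python’s standard library, so it should run in Colab or anywhere else; the helper function `installed_version` and the list of package names are my own choices for illustration.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The first part of sys.version is the Python version number itself
print("Python version:", sys.version.split()[0])

# Package names below are examples; swap in whichever ones you need
for name in ["numpy", "geopandas", "rasterio"]:
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```

The output will differ depending on where and when you run it, which is exactly the point: it reports whatever environment you happen to be working in.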

Using Colab has some significant advantages, including integration with not just GitHub but also Google Drive. It also offers significantly more processing power than most people have access to themselves, including free access to GPUs. This means you can use Colab for heavy processing tasks, including machine learning, making it a viable option as your main tool for analysing geospatial data.

5. Others

There are other programming languages and tools available. There are other online platforms like Google Earth Engine and Microsoft’s Planetary Computer. There’s other GIS desktop software, including Global Mapper, and apps for Android and iOS such as SW Maps. A lot of geospatial analysis has also been done using the R programming language, and there’s growing use of the languages Julia and Rust, and the new Mojo, as well as long-standing use of C (which is a much more complicated language to learn and use - C is normally compiled into a complete program before it can run, rather than being run piece by piece in a Notebook, for example).

In general, the best tool to use is simply one that does what you need and which you’re comfortable with. So, for the module exercises, you are free to use QGIS, or ArcGIS Pro, or even just Python code if you want.


GY4006 Notebooks:

  1. Data Types
  2. Vector Data
  3. Attribute Data
  4. Coordinate Reference Systems
  5. Geospatial Data Files
  6. Vector Geoprocessing
  7. Introduction to Raster Data
  8. Single-band Raster Data
  9. Multi-band Raster Data: Passive Remote Sensing