Site Catchment Analysis needs a certain number of steps to be repeated for all sites. So, the best way to do this is via a script! Here we presents a solution with PyQGIS.

This blog post is written by a small learning group for Python. We decided to join forces to motivate each other and help each other out when we’re stuck. After going through two introductory courses, one general and one focused on Python in QGIS, we thought it was time to try our new skills on a small project. We decided on a topic that Sophie will be working on during her PhD. Once she’s got all her archaeological sites, she plans on analysing the parameters of the locations. Some frequent questions in settlement archaeology include on what kind of soils, level of elevation or slope people in the past built their settlements. Of course, determining the “point” where a settlement was built is not the only matter of interest. The surrounding area needs to be considered as well. For this, site catchment analysis is used. Around each “point” of the settlement a circle is drawn that represents the area we assume was important for the settlers. This “site catchment area” is then further analysed. The first step is to extract the percentages of e. g. soil types per area around a settlement.

As far as we could see, there was neither a plug-in that took on that task, nor a script that offered to automatise the workflow. So, that’s what we developed! 

We hope this script can be helpful for others as well. Here we describe what the code does. If you just want to download the whole py-script, go to Github: Site Catchment Script.

The QGIS Python console

QGIS is a great free and open-source geographic information system. We used the QGIS Python console to write the script. It produces a csv-table as output, which tells us the percentage of different areas per point. Using a GIS environment makes it easier to understand what is happening. Therefore, with the script we also generate a temporary canvas with all layers that are needed for the computation. You can save it if you want.

If you install QGIS, you also simultaneously install Python. So don’t bother installing it separately 😉 . This might only lead to confusion once you want to install external modules (which we won’t get into now). If you want to write a script too, there’s a neat little trick: If you use the toolbox in QGIS, you can get the code for each function you’ve used. This makes scripting much easier, because once you’ve got your workflow, you can just get the code from there and adapt it as needed. Since version 3.0 QGIS comes with Python 3. We worked with QGIS 3.18 “Zürich”, and QGIS 3.16 “Hannover” on Windows, as well as a Linux distribution, so you should be fine using either.

Where to find the QGIS Python console? If you open QGIS, go to Plugins and then to Python console to open it. Now you can open a script file (ends on *.py).

Example data

The island of AtlantGIS
(https://github.com/kacebe/AtlantGIS/),
coloured according to landuse type.
Points are the “archaeological sites”.

The example data we put in the github is called AtlantGIS. Kai-Christian Bruhn created it for teaching geo-information systems to archaeology students. It is online under https://github.com/kacebe/AtlantGIS/. As you may understand from the name, it describes an imaginary island (on Atlantis see e.g. Flint Dibble). We chose to use shape-files for now (there are several data types available): archaeological_sites, which is a point data set and the landtype polygon data that describes different landscapes. These data sets are very close to our use case. Be aware though! In the data set of archaeological sites, the ID 120 is given to two different points (this gave us a little scare, I tell you…).

Two things are important: The files you use need to be projected. The example data uses WGS 84 / UTM zone 28N (EPSG 32628). Related to this is the second point: You need to be aware of the unit of your projection, because you will need to give the radius of the circle we use as a catchment area. Here it is of course meter.

Code

In this part, we will go through the code of our sitecatchment analysis script bit by bit and explain how it works.

We start by importing two libraries that we need to handle and write the QGIS vector files. Libraries offer functionalities for Python by providing “reusable” code chunks – they are a bit like extensions or plugins in other programs:

from qgis.PyQt.QtCore import QVariant
from qgis.core import QgsVectorFileWriter

Define your variables

First we need to set a couple of variables so that the script runs on your system. The code parts you need to adapt all start with an ‘INSERT_…’.

As described above, for input data we need two shape files: the point data locating sites, as well as polygons defining soil types. You will have to give the path to the point file to the variable samp and the path to the polygon file to the variable land. Paths always need to go in “quotation marks”. If you use Windows, be careful to use the correct complete path, e.g. starting with “C:/…” The path also should end with the file name and extension (so e. g. land = ".../AtlantGIS_data/landtype.shp")

For both layers we will also need the information which column of the attribute data we will handle later on. land_id should be the column name which denotes the different land types. In our example data that’d be land_id ='Landtype'. samp_id needs to be the ID field of the archaeological sites. In the example it’s: samp_id = 'ID_AR_SITE'.

The output will be a csv file denoting area percentages of soil types per site. Therefore, you’ll need to assign a path where the csv should be saved to the variable output.

The last variable deals with how to calculate the catchment area. It is a circle whose size is something you need to define beforehand by assigning a radius. This needs to be a number without any ‘ ‘-signs. So it would be radius = 5000 to have a five-kilometer radius around the site point (see figure below).

land = 'INSERT_YOUR_ABSOLUT_PATH_TO_POLYGON_LAYER'
land_id = 'INSERT_ID´S NAME'
samp = 'INSERT_YOUR_ABSOLUT_PATH_TO_POINT_LAYER'
samp_id = 'INSERT_ID´S NAME'

output = 'INSERT_YOUR_ABSOLUT_PATH_TO_CSV_LAYER'
radius = INSERT NUMBER

Define vector layers

So far, we’ve only assigned the paths to variables. Now, we want to add the actual shape files to the QGIS canvas. For this we use iface.addVectorLayer(). “iface” is basically a Python wrapper for the C++ class QgisInterface. The corresponding methods allow us to interact with QGIS either by loading layers into the map canvas or by changing their properties. iface.addVectorLayer() takes the layers path as first argument and its name in QGIS secondly. The third parameter refers to the provider’s name being the ogr library supporting a variety of vector formats.

landuse = iface.addVectorLayer(land, 'landuse', 'ogr')
samples = iface.addVectorLayer(samp, 'sample', 'ogr')

Once you’ve done this you should see the two layers “appear” in your QGIS.

Sadly, we still need some more steps before we can start with the actual calculations. We create a temporary layer called result with the function QgsVectorLayer(). Here, we want to define it as a point layer with the coordinate reference system correlating to that of our point data ('Point?crs=' + samples.crs().authid() ). We also add a data provider for the result layer with result.dataProvider(). A data provider helps us to manipulate the attribute table of a layer. We use it to add attributes to the layer:

  • id — this will be the site id of the archaeological sites;
  • obj — here we will store the landtypes;
  • area — this will be the absolute area of a particular land type around the archaeological site denoted by id; and lastly
  • area% which will be the percentage of this landtype in the radius drawn around the site.

We update the resulting layer with result.updateFields() and create fields, a container of the fields in the attribute table.

Lastly, we need a list of features, to which we can later append the results in a loop. We call it feats.

result = QgsVectorLayer('Point?crs=' + samples.crs().authid(), 'Result', 'memory')
prov = result.dataProvider()
prov.addAttributes([QgsField('id', QVariant.Int), QgsField('obj', QVariant.String), QgsField('area', QVariant.Double), QgsField('area%', QVariant.Double)])
result.updateFields()
fields = prov.fields()

feats = []

Now, finally, the fun can start!

Let’s loop it!

The basic functionality of the script is a nested loop, iterating over sites and soils by using the getFeatures() method. So, for each point in samples and for each polygon, we first draw a buffer. Then, we trim the polygon layer with the buffer. From these parts, we get the values for landtype, as well as how large the area with this type of land is. If it is larger than 0, we calculate the percentage and add all the relevant information to the feats – list. This approach was heavily inspired by Detlev Neumanns answer to a question on ResearchGate.

a 25-edges-polygon (almost a circle) drawn around a point and a 5km radius is marked. The circle is filled with several brown and green coloured shapes that represent different landtypes
Example buffer with 25
segments and 5km radius
drawn around a point.
The colours represent
different landtypes within
this circle.

In the first step, we create a buffer called puffer. Using sites.geometry() and buffer(), at every site we draw a circle, using the radius size you set at the beginning. The 25 represents the segments of the buffer that determine how true to a circle it is. We chose it as a value that seemed fine, not overly precise but fast.

Next, the buffer is cut into various parts by polygons representing different soils. For further use, a QgsFeature object is created, which incorporates the fields we had designed for the results layer above. The feature inherits the geometry of the point with which we are working at the moment. It receives the sample_id value, the land type and the size of the area with this land type (feat['area'] = part.area() ).

Now, if this area is larger than 0 we calculate the percentage of the area and add the whole set of informations to the list feats.

for sample in samples.getFeatures():
    puffer = sample.geometry().buffer(radius, 25) 
    for poly in landuse.getFeatures():
        part = puffer.intersection(poly.geometry())
        feat = QgsFeature(fields) 
        feat.setGeometry(sample.geometry())
        feat['id'] = sample[samp_id]
        feat ['obj'] = poly[land_id]
        feat['area'] = part.area()
        if feat ['area'] > 0:
            feat['area%'] = (part.area())/(puffer.area())
            feats.append(feat)

Wrap up for export

Now, when this loop is complete, we add our feats – list to the results layer and add it as a layer to our QGIS canvas.

prov.addFeatures(feats)
QgsProject.instance().addMapLayer(result)

You can see it now on your QGIS Canvas and it’s export time! We write the table as a csv-table at the path you specified in the beginning. As a csv you can import it in any other software for further analysis.

QgsVectorFileWriter.writeAsVectorFormat(result, output,'UTF-8',  result.crs(), 'CSV')

That’s it!

What we did

So, we have a script that does a site catchment analysis in one go: It takes our archaeological sites, draws a circle around them and calculates how large the areas of landtypes of the layer beneath are.

table showing part of the results of python script for site catchments
result.csv – Table with the output of the example data set.

Our script gives us a csv file as output. We can open it with LibreOffice, Excel or a text editor to look at it. The columns are as described above: id of your sites, obj for the different landtypes, area around the site which belong to the landtype and area% for the percentage of the landtype around the area inside the calculated circle.

We hope you find it useful as well!

Do you want to know what the hardest part of this little project was? Between designing this script and writing the blog post lay two years of a pandemic, in which two babies were born into the working group and none of us really remembered how the script worked. A reminder to comment your code! And to comment it better than we did…

Further reading

Just a small list of resources we used for this project:

This blog post is licensed under CC-BY and the code under MIT.

Sophie Schmidt

Founder & Editor

About the Author

My name is Sophie, I am a prehistoric and computational archaeologist and have been research associate at the Universities of Bonn and Cologne, as well as for the NFDI4Objects project at the German Archaeological Institute. I teach statistics for archaeologists, work on new methods in settlement archaeology (GIS, geostatistics in R and stuff) and am interested in archaeogaming. Now I started my PhD-project on the 5th mill. BC in Brandenburg (that's North-East Germany).

View Articles

About the Author

Hi, my name is Sara. I am passionate about data and dinosaurs. For a long time I was focused on geostatistics, but now my interest has shifted towards text analytics and predictive modeling. I am here to do the fun stuff.

View Articles

Kai Vogl

Author

About the Author

Hi, my name is Kai. In the past I have been interested in GIS and geostatistical applications. In the last years I've focused on hardware-based things like Arduino, Raspberry Pi etc. Especially systems and technologies that can be used in the museum or on-site.

View Articles

Sandra Rung

Author

About the Author

Bio will follow

View Articles