Now that my profile is online at the Berlin Graduate School of Ancient Studies, I feel I really should get into this PhD project and also, maybe, start talking about it. So what is the first thing an archaeoinformatics-inclined woman is to do about a PhD? — Design a database! What else?!
My PhD focuses on the 5th millennium BC in Brandenburg, Germany, and on the relations and subsistence strategies of the people living there. My main archaeological source will be their ceramics. So I need a ceramics database. Now, going for archaeoinformatics inside an archaeological PhD, here are 5 things I want it to handle well:
- it will be mapped to CIDOC-CRM so that it is interoperable with other data sets
- it will be published online one day, so I want it to be able to merge into the Linked Open Data Cloud
- I want it to use standards of ceramic description
- it will need to be able to integrate images and RTI-images
- it should handle spatial data well
Why all these things? Let me elaborate:
1. CIDOC-CRM
What does CIDOC-CRM mean? The International Committee for Documentation (CIDOC) of the International Council of Museums develops this Conceptual Reference Model (CRM). A Conceptual Reference Model is a structure for the design of databases. The CIDOC-CRM is a quasi-standard ontology built on events, actors and actions and is now widely used in heritage management. It enables a read-only integration of very diverse datasets. I’m not the first one to apply it to ceramics: OntoCeramic 2.0 has been published by Brancato et al. They focus on Sicily and on Hellenistic as well as Roman wares. Nonetheless I’ve been able to use their model as a basis for my own ideas.
I want to be able to publish my data in a FAIR way, which means it shall be Findable, Accessible, Interoperable and Re-usable. Using CIDOC-CRM helps with the interoperability and the re-usability points.
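To make the idea of "mapping to CIDOC-CRM" a bit more concrete, here is a minimal sketch of how one sherd record might be expressed as event-centred triples. It is not my actual schema: the `https://example.org/ceramics/` namespace, the function name and the identifiers are made up for illustration, and the class and property names follow recent CRM versions (older versions call E22 "Man-Made Object").

```python
# Sketch only: one sherd expressed as CIDOC-CRM-style triples.
# BASE and all identifiers are hypothetical; CRM names follow recent CRM versions.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
BASE = "https://example.org/ceramics/"  # assumed project namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def sherd_to_triples(sherd_id: str, technique: str) -> list[tuple[str, str, str]]:
    """Return (subject, predicate, object) triples for one sherd.

    The object is linked to a production event, and the event (not the
    object) carries the technique. This is the event-based modelling
    that CIDOC-CRM is built around.
    """
    sherd = f"{BASE}sherd/{sherd_id}"
    production = f"{BASE}production/{sherd_id}"
    technique_type = f"{BASE}type/{technique}"
    return [
        (sherd, RDF_TYPE, CRM + "E22_Human-Made_Object"),
        (production, RDF_TYPE, CRM + "E12_Production"),
        (sherd, CRM + "P108i_was_produced_by", production),
        (technique_type, RDF_TYPE, CRM + "E55_Type"),
        (production, CRM + "P32_used_general_technique", technique_type),
    ]


for triple in sherd_to_triples("0001", "incised"):
    print(triple)
```

The point of the detour via a production event is that other datasets mapped to the same model can be queried in the same way, whatever their internal table layout looks like.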
2. Linked Open Data
Flo and I talked about the Linked and SPARQLing Ogham Project before. The world wide web offers an amazing resource for information, but it is so important that things are linked to each other. Otherwise they are hard to find. Google’s selling point is that it finds links for us. But for data sets, the linking is especially important: To explicitly state which two data entries in two databases talk about the same object makes research so much easier for you and me. Links should always be persistent URIs, so that they don’t break over the years.
So, I want my data to be well linked to existing repositories. I guess, though, that there are not yet many repos on prehistoric ceramics online and even fewer on Brandenburg’s 5th millennium BC. Nonetheless I can add links e.g. to OpenStreetMap and maybe some Wikidata entries.
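As a small illustration of what such a link looks like in practice, the sketch below writes one statement in N-Triples syntax saying that a local record and an external record describe the same thing. Both URIs are placeholders on `example.org`, not real identifiers, and `owl:sameAs` is just one possible choice: it is a strong claim, and a weaker predicate such as `skos:closeMatch` may fit some cases better.

```python
# Sketch only: serialise one "same thing" link as an N-Triples line.
# The URIs used below are placeholders, not real repository identifiers.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_line(local_uri: str, external_uri: str) -> str:
    """Build an N-Triples statement linking a local URI to an external one."""
    for uri in (local_uri, external_uri):
        # Linked Data relies on absolute, resolvable HTTP(S) URIs.
        if not uri.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute HTTP(S) URI: {uri!r}")
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."


print(same_as_line(
    "https://example.org/ceramics/site/42",   # hypothetical local site record
    "https://example.org/external/site-42",   # stand-in for an external entry
))
```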
3. Standards for ceramic description
There is a regional standard for describing ceramics in Northern Germany developed by Doris Mischka. It is called Nordmitteleuropäische Neolithische Keramik (North-central European Neolithic Ceramics) – NoNeK. At the University of Kiel and the Lower Saxony Institute for Historical Coastal Research people still work on enhancing the catalogue. By now people also use it for other periods, e.g. for Bronze Age ceramics. The main point in the description of ornaments is their reduction to elements, such as “a line”, which have an orientation (e.g. “vertical”), a technique (e.g. “incised”) and a type (“of triangles”). Repeated elements form a pattern. Patterns at a certain place on a vessel describe an ornament. For more information, have a look here (German only, I’m sorry).
Why is it important to use a standard like NoNeK? While I believe that every data collection needs to be tailored to your specific use case, it can be very helpful to rely on already existing schemata. Thereby you create a much more comparable data set. It will be relevant to more people if it can be compared and integrated into other data sets easily. Also, I don’t think everyone needs to reinvent the wheel*.
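The element, pattern, ornament hierarchy described above could be sketched as a small data structure like this. The class and field names are my own shorthand for illustration, not the official NoNeK codes, and "rim" is just an assumed example of a position on the vessel.

```python
# Sketch only: the element -> pattern -> ornament hierarchy as data classes.
# Field names and example values are illustrative, not official NoNeK codes.
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str         # e.g. "line"
    orientation: str  # e.g. "vertical"
    technique: str    # e.g. "incised"
    kind: str         # the element's type, e.g. "of triangles"


@dataclass(frozen=True)
class Pattern:
    element: Element
    repetitions: int  # repeated elements form a pattern


@dataclass(frozen=True)
class Ornament:
    position: str                 # place on the vessel
    patterns: tuple[Pattern, ...]


def describe(ornament: Ornament) -> str:
    """Render an ornament as a short human-readable description."""
    parts = [
        f"{p.repetitions} x {p.element.orientation} {p.element.technique} "
        f"{p.element.name} {p.element.kind}"
        for p in ornament.patterns
    ]
    return f"{ornament.position}: " + "; ".join(parts)


line = Element("line", "vertical", "incised", "of triangles")
print(describe(Ornament("rim", (Pattern(line, 4),))))
```

Breaking ornaments down like this means two vessels can be compared attribute by attribute instead of by free-text description.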
4. Images and RTI-images
Every archaeologist needs a database that can do images. But I also plan on using RTI – Reflectance Transformation Imaging (developed by Cultural Heritage Imaging). So I need to include these into my database design. RTI-images are computationally created from ca. 60 “normal photos” shot with light at differing angles. The images are then merged into one, for which you can interactively change the light settings. To be thorough, I will need to add the 60 source images as well as the RTI-image to my database and link them. RTI has been used to document faint production traces in pottery, which I hope to be able to document as well. Also, I of course want to add images from publications, drawings etc.
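Linking an RTI-image to its source photos is a classic many-to-one relation. Here is a minimal sketch of how that could look, using Python's built-in SQLite as a stand-in so the example runs anywhere; table names, column names and file names are all assumptions for illustration, not my final design.

```python
# Sketch only: one RTI image linked to its source photos.
# SQLite stands in for the real database; all names are hypothetical.
import sqlite3

SCHEMA = """
CREATE TABLE image (
    id        INTEGER PRIMARY KEY,
    file_path TEXT NOT NULL UNIQUE,
    kind      TEXT NOT NULL
              CHECK (kind IN ('photo', 'drawing', 'rti', 'rti_source'))
);
-- Link table: which source photos were merged into which RTI image.
CREATE TABLE rti_source (
    rti_id    INTEGER NOT NULL REFERENCES image(id),
    source_id INTEGER NOT NULL REFERENCES image(id),
    PRIMARY KEY (rti_id, source_id)
);
"""


def build_demo(n_sources: int = 60) -> sqlite3.Connection:
    """Create an in-memory database with one RTI image and its sources."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    rti_id = conn.execute(
        "INSERT INTO image (file_path, kind) VALUES (?, 'rti')",
        ("sherd_0001.ptm",),
    ).lastrowid
    for i in range(n_sources):
        source_id = conn.execute(
            "INSERT INTO image (file_path, kind) VALUES (?, 'rti_source')",
            (f"sherd_0001_{i:02d}.jpg",),
        ).lastrowid
        conn.execute("INSERT INTO rti_source VALUES (?, ?)", (rti_id, source_id))
    conn.commit()
    return conn


conn = build_demo()
count = conn.execute("SELECT COUNT(*) FROM rti_source").fetchone()[0]
print(f"source photos linked to the RTI image: {count}")
```

Keeping drawings, publication images and RTI files in the same `image` table, distinguished only by `kind`, is one possible design choice; separate tables per image type would work as well.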
5. Spatial data
Spatial information is very important to me. Not just because archaeology is an inherently spatial discipline (distribution maps galore!), but also because I want to focus on settlement patterns and their relationship to landscapes.
I decided to use PostgreSQL. Why PostgreSQL? PostgreSQL is an open-source object-relational database management system. It is well known for being reliable and well suited to geospatial data. I already used it for my Master’s thesis. The thesis focused on settlement and feature distributions and their geo-spatial and statistical analysis. So I know how well it worked for me and will gladly use it again.
Putting archaeoinformatics into the archaeology PhD
To conclude: I know digital strategies are used by many archaeologists. But I hope I could show some aspects here that not so many prehistoric archaeologists think about when designing their databases. Using a thoroughly digital, archaeoinformatic approach means I consider not only my personal needs at the moment, but also how to produce the best digital output. I think about my future workflows and other people’s needs for re-using my data.
I will give a more thorough talk on this topic at session S02 during the international CAA conference in June. If you are interested, tune in! 🙂
*footnote: This saying always somehow irks me. I mean. The wheel has been invented several times.