AfriArch is an archaeological and paleoenvironmental data community designed to integrate datasets related to human-environmental interactions in Holocene Africa. This resource encompasses radiocarbon data, stable isotope measurements from human, plant, and animal tissues, and zooarchaeological and archaeobotanical information from c. 12,000 years ago to the present day. Here, we present a continent-wide compilation of carbon, nitrogen, and oxygen stable isotope measurements from bioarchaeological samples and modern samples when reported within original archaeological publications.
Isotopic research on Holocene African archaeology developed from foundational ecological and environmental studies that focused on understanding modern animal dietary behaviours. In the late 1970s, researchers working in southern and eastern Africa identified that the relative contributions of C3 and C4 plants to herbivore diets could be fruitfully explored through the stable carbon isotope (δ13C) content of animal bone collagen (southern Africa: Vogel 1978; eastern Africa: Tieszen et al. 1979). The implications of this research for investigating paleoenvironments and past human diets were quickly recognized, especially once δ13C values were combined with stable nitrogen isotopic (δ15N) data to estimate consumer trophic level [3, 4]. In southern Africa, this research focused on characterizing the diets of Holocene hunter-gatherers (e.g., Sealy and van der Merwe 1985; Lee-Thorp et al. 1989) [5, 6]. In eastern Africa, researchers focused on understanding how isotopic variability related to different ecological guilds among non-human mammals and to different subsistence strategies among modern and archaeological human groups (e.g., Ambrose and DeNiro 1986a, b) [7, 8]. From these beginnings, archaeological stable isotope research has since been widely applied across the African continent to explore dietary signatures associated with diverse hunter-gatherer, fisher-forager, pastoralist, and agriculturalist contexts.
Despite the prevalence of isotopic research within African Holocene archaeology, there has been little macro-scale or regional comparative investigation of dietary patterns. Assessing the rate and scale of transformative processes like the spread of cattle herding and expansion of grain agriculture has often been limited to localized studies. Many factors contribute to this problem. In part, the divergent research traditions between regions, partially a legacy of colonialism, have prevented the integration of data across study areas. For example, Egyptology often remains poorly integrated within Africanist studies, and there remains a general lack of research integration between scholars working in Francophone and Anglophone areas. This is exacerbated by the fact that many areas with high-resolution records are geographically separated by large expanses where little research has been carried out that might help link regional patterns. Assessing dietary change requires study of human remains, and, while large cemeteries of well-preserved individuals across multiple periods have been found in regions like Egypt, such instances remain especially rare across much of the continent.
A second set of problems relates to the heterogeneity and complexity of African ecosystems. Strategies of mobile pastoralism involved different degrees of reliance on wild and domesticated animals/secondary products as they spread from the Sahara to southern Africa, often making mobile pastoralists difficult to distinguish isotopically from hunter-gatherers. Additionally, the dominant African domesticated grains of sorghum (Sorghum bicolor), pearl millet (Pennisetum glaucum) and finger millet (Eleusine coracana) are all C4 plants, and their consumption cannot be readily detected against the backdrop of C4 grasslands that characterize many lowland savanna ecosystems across Africa. In some cases, it has been possible to overcome these challenges in focused case studies; however, interpreting patterns in stable isotope data across such economically and ecologically diverse zones remains a major challenge. Detecting changes in human-environmental interactions related to climate change, the spread of populations, new food systems, or crops, and the introduction of new technologies, and understanding how these changes manifest socially, remain important areas for comparative study. Developing analytical strategies to carry out such broad-scale research requires first the centralized compilation of datasets generated for African prehistory.
This dataset includes samples from Africa, the second largest continent in the world, with a landmass covering approximately 30.4 million square km.
Northern boundary: 37.8°
Southern boundary: –37.0°
Eastern boundary: 57.4°
Western boundary: –26.3°
Ca. 12,000 BCE – CE 2020
This includes major archaeological periods defined by specific technological, subsistence, or socio-political changes that may be regionally specific or time-transgressive across space. Africa also preserves long-term records of forager and food producer coexistence, meaning that major temporal periods do not indicate a universal economic shift even within specific areas. Figure 1 is a summed probability distribution of unique archaeological phases included in the AfriArch isotopic dataset and is indicative of the relative temporal availability of isotopic measurements on African bioarchaeological samples across the Holocene.
This dataset compiles published measurements of stable carbon (δ13C), nitrogen (δ15N), and oxygen (δ18O) isotope ratios from mainland Africa. The datasets combine isotopic measurements on human and animal remains (e.g., bone and teeth) as well as available data from carbonized (i.e., burnt) plant remains from archaeological sites. Some modern animal samples are also included to further contextualize and interpret human dietary signatures and to provide information on past local environmental conditions and agricultural practices. When available, direct radiocarbon measurements on sampled materials were used to assign chronologies. If direct dates were not available, samples were assigned estimated ranges based on radiocarbon dates from associated organic materials in the same horizon or archaeological site, or the estimated age ranges for the related cultural period based on pottery or artefact seriation or other patterns of material culture or mortuary tradition.
Each sample was also assigned geographic coordinates related to the archaeological site from which it was recovered or, for modern samples, the approximate area of collection. Published site coordinates were entered in Decimal Degrees (DD) relative to the WGS84 system, either as presented by excavators or calculated from DMS or other coordinate reference systems. In many cases, site coordinates were not explicitly reported and were estimated from published maps. In such cases, a distance estimate was included with the coordinates to capture the possible area as accurately as possible.
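The conversion from degrees, minutes, and seconds (DMS) to Decimal Degrees can be sketched as follows. This is an illustrative example rather than the procedure used for the compilation; the function name `dms_to_dd` and the sample coordinate are hypothetical.

```python
def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert a DMS coordinate plus hemisphere letter to decimal degrees.

    Southern latitudes and western longitudes are returned as negative
    values, matching the sign convention of WGS84 decimal degrees.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in {"N", "S", "E", "W"}:
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must lie in [0, 60)")
    decimal = abs(degrees) + minutes / 60 + seconds / 3600
    return -decimal if hemisphere in {"S", "W"} else decimal


# Hypothetical coordinate: 34 degrees 5 minutes 30 seconds South
print(round(dms_to_dd(34, 5, 30, "S"), 4))
```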
All reported radiocarbon measurements were calibrated using the recent IntCal20, SHCal20, and Marine20 radiocarbon calibration curves and the calibration software OxCal v.4.4.4 [9, 10, 11]. We divided Africa into three zones (northern, central, and southern) using as reference the extreme uncertainty ranges for the location of the Intertropical Convergence Zone after Robinson and Henderson-Sellers (1999). Samples located in the northern zone were assigned the IntCal20 curve as the reference terrestrial radiocarbon calibration curve, samples located in the southern zone were assigned the SHCal20 curve, and samples located in the central (intermediate) zone were assigned a uniform mix (between 0 and 100%) of IntCal20 and SHCal20. These terrestrial curves alone were used to calibrate radiocarbon measurements from organisms that do not consume marine foods and from samples whose burial location was more than 50 km from the modern-day coastline.
The remaining samples were of humans buried closer to the present coastline. For these, we assumed that the carbon contribution from marine foods is unknown whenever radiocarbon measurements were made on bone bioapatite or tooth enamel. Marine carbon contributions were also assumed as unknown for proteinaceous tissues (e.g., collagen, hair keratin) if either organic carbon or nitrogen stable measurements were not available. To calibrate radiocarbon measurements for these samples, we assumed that the marine carbon contribution towards the 14C-measured sample could be anywhere from 0 to 75% of total carbon, a range that we believe to be conservative. Calibration was done by assigning this contribution to the marine calibration curve Marine20 mixed with the chosen terrestrial calibration curve as described in the previous paragraph. For the remainder of proteinaceous samples, those for which organic carbon or nitrogen stable measurements were available, we made relatively refined estimates of marine dietary carbon contribution towards the 14C-measured sample.
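The curve-selection rules described in the two preceding paragraphs can be summarized in a short sketch. The latitude limits of the central zone are not stated in the text, so `ZONE_NORTH_LIMIT` and `ZONE_SOUTH_LIMIT` below are placeholder values chosen only for illustration; the function names and the single `has_dietary_isotopes` flag are likewise assumptions about how the decision might be encoded.

```python
# Placeholder limits of the central (mixed-curve) zone in decimal degrees.
# These are NOT the values used for the dataset; they are illustrative only.
ZONE_NORTH_LIMIT = 15.0
ZONE_SOUTH_LIMIT = -15.0
COAST_LIMIT_KM = 50.0  # from the text: beyond 50 km, no marine correction


def terrestrial_curve(latitude):
    """Pick the reference terrestrial curve from the sample latitude."""
    if latitude > ZONE_NORTH_LIMIT:
        return "IntCal20"
    if latitude < ZONE_SOUTH_LIMIT:
        return "SHCal20"
    return "uniform mix (0-100%) of IntCal20 and SHCal20"


def marine_contribution(consumes_marine, km_to_coast, is_proteinaceous,
                        has_dietary_isotopes):
    """Describe the marine carbon prior, or return None if none applies."""
    if not consumes_marine or km_to_coast > COAST_LIMIT_KM:
        return None  # terrestrial curve only
    if not is_proteinaceous or not has_dietary_isotopes:
        # Bioapatite/enamel dates, or collagen/keratin without the needed
        # stable isotope values: marine contribution treated as unknown.
        return "uniform 0-75% Marine20"
    return "ReSources estimate (normal distribution)"


print(terrestrial_curve(30.0), "|", marine_contribution(True, 10.0, False, False))
```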
Dietary estimates were made using the Bayesian software ReSources, an upgraded version of the Bayesian software FRUITS [13, 14]. The general approach was similar to that described in Fernandes et al. (2021). For modelling purposes, we considered four main food groups (C3 plants, C4 plants, animal products from terrestrial herbivores, and marine vertebrates) and two isotopic proxies (organic δ13C and δ15N). Isotopic averages for food macronutrients were calculated using a mix of archaeological and modern samples. Modern samples were only considered to calculate isotopic averages for the marine vertebrates food group. For these, a correction (+1‰) was applied to reported δ13C values to account for the Suess effect in marine contexts. When necessary, we applied isotopic corrections to account for isotopic differences between measured tissues (e.g., collagen, charred plants) and edible macronutrients, following Fernandes et al. (2015) and Soncin et al. (2022) [17, 18]. Macronutrient concentrations for each food group were as described in Fernandes et al. (2021).
Isotopic means (δ13C and δ15N) for the marine vertebrates food group were calculated using all modern entries available in the AfriArch isotopic dataset. For the δ13C values of C3 and C4 plants, we used all available archaeological entries in the dataset. Plant δ15N values for each burial location were calculated by applying an offset (–3.5‰) to herbivore bone collagen δ15N values. Herbivore mean δ13C and δ15N bone collagen values were estimated for each human burial location by producing a smoothed Bayesian isoscape using the model AverageR [20, 21]. A minimum uncertainty of 1‰ was employed for all food macronutrients. The uncertainty for human isotopic measurements was set at 0.5‰. We employed an offset of 5.5 ± 0.5‰ between the δ15N value for edible protein and human proteinaceous tissues. For δ13C, we considered an offset of 4.8 ± 0.5‰ between edible macronutrients and human proteinaceous tissues and that 74 ± 4% of the carbon signal was routed from food protein while the remainder was from carbohydrates/lipids. Each human dietary estimate was generated independently using ReSources to account for differences in local food baselines. The full model description is available online (see Dataset description).
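To illustrate how the diet-to-tissue offsets and carbon routing stated above act on a consumer signal, the following sketch computes expected consumer values from the central parameter estimates only. It is a simplified forward calculation, not the ReSources model: it ignores all uncertainties and macronutrient concentrations, and the function names and input values are hypothetical.

```python
D13C_OFFSET = 4.8       # per mil, edible macronutrients to proteinaceous tissue
D15N_OFFSET = 5.5       # per mil, edible protein to proteinaceous tissue
PROTEIN_ROUTING = 0.74  # fraction of tissue carbon routed from food protein


def expected_consumer_d13c(d13c_protein, d13c_energy):
    """Expected tissue d13C from mixed dietary protein and energy d13C.

    d13c_protein: d13C of the dietary protein pool (per mil)
    d13c_energy:  d13C of the dietary carbohydrate/lipid pool (per mil)
    """
    routed = PROTEIN_ROUTING * d13c_protein + (1 - PROTEIN_ROUTING) * d13c_energy
    return routed + D13C_OFFSET


def expected_consumer_d15n(d15n_protein):
    """Expected tissue d15N from the d15N of the dietary protein pool."""
    return d15n_protein + D15N_OFFSET


# Hypothetical diet: protein pool at -20 per mil, energy pool at -25 per mil,
# dietary protein d15N at 6 per mil.
print(round(expected_consumer_d13c(-20.0, -25.0), 2))
print(round(expected_consumer_d15n(6.0), 2))
```

In the actual analysis these quantities are estimated in the opposite direction: the Bayesian model infers food contributions from measured consumer values while propagating the stated uncertainties.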
The local marine radiocarbon reservoir offset (ΔR) relative to Marine20 was calculated around each burial location (radius of 100 km) using an estimate generated by the Bayesian model AverageR [20, 21]. The ΔR smoothed surface was produced using ΔR data from the Marine Reservoir Correction database. Radiocarbon measurements were calibrated into calendar dates using the Bayesian chronological software OxCal v4.4.4, the selected terrestrial calibration curve (see above), and a mixed contribution from the marine calibration curve Marine20. This contribution corresponds to the ReSources estimate of the contribution of marine dietary carbon towards human proteinaceous tissues, expressed as a normal distribution.
Stable isotope measurements were obtained from all existing publications that reported bioarchaeological δ13C, δ15N, and δ18O values from the African continent that were known to the authors (Figure 2). This includes site reports, research articles and publicly available theses and dissertations.
We include along with stable isotope measurements any metrics relevant for assessing bone collagen preservation that were available in the source publication (e.g., collagen yield, %C, %N, atomic C/N). There were no preservation thresholds used to determine if samples would be included in the dataset, thus allowing other researchers to choose whatever filtering criteria are appropriate for their study. We specify whether δ13C measurements were taken from organic vs. inorganic elements, and we differentiated δ18O measurements from phosphate and carbonate and noted whether these are expressed relative to the V-PDB or V-SMOW standards.
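Because δ18O values in the compilation may be expressed against either standard, users comparing records may need to convert between scales. A minimal sketch follows, assuming the widely used linear relation δ18O(V-SMOW) = 1.03091 × δ18O(V-PDB) + 30.91 (Coplen et al. 1983); the relation is not taken from this dataset, and the function names are hypothetical.

```python
SLOPE = 1.03091     # scale factor between the V-PDB and V-SMOW scales
INTERCEPT = 30.91   # per mil


def vpdb_to_vsmow(d18o_vpdb):
    """Convert a d18O value from the V-PDB scale to the V-SMOW scale."""
    return SLOPE * d18o_vpdb + INTERCEPT


def vsmow_to_vpdb(d18o_vsmow):
    """Convert a d18O value from the V-SMOW scale to the V-PDB scale."""
    return (d18o_vsmow - INTERCEPT) / SLOPE


# Hypothetical enamel carbonate value of -5 per mil on the V-PDB scale
print(round(vpdb_to_vsmow(-5.0), 2))
```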
The quality of compiled data also depends on the critical evaluation of published records. Variations of site names were standardized so that the relationship between site names and coordinates was 1:1. We also identified >50 records of 14C and stable isotope data in the published literature with unclear data attribution, which led to duplicate entries with identical elemental data. We deleted these duplicates but note that duplicate entries remain given a history of research associated with particular specimens. For example, the compilation includes three entries for human remains from Robberg, South Africa (Pta-6613 2360 ± 20 14C yr BP) given that Sealy (2010) reports collagen δ13C and δ15N data from this individual, and Loftus & Sealy (2012) report both enamel δ13C data and bone apatite δ13C data from this individual. Additionally, duplicate data sometimes come from the same individual and material, as is the case with, for example, a δ13C measurement from human collagen at Oakhurst, South Africa (Pta-4367 5450 ± 70 14C yr BP) of –12.4‰ reported by Sealy et al. (1992) and a collagen δ13C value of –11.6‰ from the same individual reported by Sealy (2010).
Relatively cryptic cases of duplicate data also exist given absent 14C lab numbers and incomplete specimen ID numbers. This is the case with data from Rigo Cave (Kenya), Gishimangeda Cave (Tanzania) and Lukenya Hill (Kenya), which were generated by Ambrose & DeNiro (1986) and later expanded by Prendergast et al. (2019) and Wang et al. (2022). Given our inability to identify duplication between the earlier and later work to more clearly separate human individual remains at these sites, we chose to remove from our compilation 17 of the earlier entries from Ambrose & DeNiro (1986). Note that we also removed from our compilation 6 historic human entries reported by Ambrose & DeNiro (1986) given that these specimens were likely gathered in a way that today we would deem as unethical.
Poor preservation of biological materials (e.g., bone) is a general constraint for archaeological science in Africa and follows from 1) equatorial climate, 2) highly acidic sediments, 3) intensive processing (e.g., Gifford-Gonzalez 2014; Marshall 1990) [30, 31], 4) partial carbonization of some African grains (e.g., Mueller et al. 2022), and 5) limited application of intensive flotation to recover plant remains (Figure 2).
The compiled dataset is also biased by the nature of archaeological research in Africa, and this is particularly clear with the representation of human remains. An inevitable constraint is geographic, with data largely clustered into regions with longer histories of intensive archaeological research. A second source of bias is the heterogeneity in human mortuary practices across space and time, with formal cemeteries generally rare or constrained to specific phases, especially for mobile herders in sub-Saharan Africa (Sawchuk et al. 2018). Additionally, some depositional environments include highly fragmentary human remains that cannot be assessed for biological sex and age at death. Consequently, these fields are left blank for entries with missing data.
Presently, this dataset is focused on commonly studied isotope systems related directly to bioarchaeological research. Strontium isotopic measurements are used primarily to study mobility and were not included at this stage. This is because these data are relevant primarily within a specific region where baseline values are established and are not meaningful for inter-regional comparative analyses. We also chose to omit data from other stable isotope systems (e.g., sulphur and zinc) that are rarely applied in Africa and are similarly not yet conducive to comparative study.
Chronological assignments may be improved in the future. Our dataset includes radiocarbon measurements on bone apatite and tooth enamel, often due to an absence of alternative organic fractions (e.g., collagen) for dating. These inorganic fractions typically have a lower dating accuracy [34, 35]. Due to a lack of data on local freshwater radiocarbon reservoir effects and of isotopic baselines for freshwater foods, we did not consider the impact of freshwater diet on consumer radiocarbon measurements. More complex modelling (e.g., using Bayesian methods) that combines multiple lines of archaeological information could also be employed in the future to improve dating precision.
We plan to continuously update the database for already included isotopic proxies and to add new proxies once data coverage justifies it.
(3) Dataset description
The dataset reported here is made available via a single table in .xlsx and .csv formats (“afriarch-isotopic-dataset.xlsx” and “afriarch-isotopic-dataset.csv”), available through the data community AfriArch set up within the Pandora Initiative data platform (https://pandoradata.earth/organization/afriarch). Metadata descriptions, also in .xlsx and .csv formats, together with a ReSources file describing the model used for dietary reconstruction, are made available at the same location. AfriArch is a broad project that aggregates a wide range of paleoecological and archaeological databases relevant to studies of the African past. The AfriArch isotopic database is also part of the IsoMemo network of autonomous databases (https://isomemo.com/).
The AfriArch isotopic dataset is organized according to a series of nested descriptive fields. In addition to a unique numerical identifier for each entry within the dataset (ID), these fields describe the archaeological site/sampling source (Site_Name), country of origin (Site_Country), and geographic coordinates in WGS84 Decimal Degrees (Latitude, Longitude). During the standardization of site names, extra qualifications of site names in the published literature were separated (Locality_Notes). When site coordinates are estimated, an additional field (Radius) denotes the approximate radius in kilometres around the coordinates in which a site is located. The distance separating sites from the coast (Km_to_Coast) does not account for location uncertainty and was identified by applying the NNJoin tool in QGIS to Africa’s coastline and site latitude and longitude.
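For readers without access to QGIS, a nearest-neighbour distance comparable in spirit to Km_to_Coast can be approximated in plain Python. The sketch below substitutes a great-circle (haversine) search over a list of coastline vertices for the NNJoin tool, so results will not match the published field exactly; the coastline points and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS84 points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def km_to_nearest_coast_point(lat, lon, coast_points):
    """Distance from a site to the closest of the supplied coastline vertices."""
    return min(haversine_km(lat, lon, c_lat, c_lon) for c_lat, c_lon in coast_points)


# Hypothetical coastline vertices (latitude, longitude)
coast = [(0.0, 1.0), (0.0, 5.0), (2.0, 3.0)]
print(round(km_to_nearest_coast_point(0.0, 0.0, coast), 1))
```

As with the published field, this calculation uses the site point only and does not account for the location uncertainty recorded in Radius.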
Individual samples or reported values for multiple samples are identified based on the identifier given in the original publication (Specimen ID) where one is provided. If available, there are additional fields to denote the type of organic material sampled (Material_Type), the skeletal element (Bone_Element), or the specific type of tooth (Tooth_Element) from which a measurement was obtained.
If a sample has been directly radiocarbon dated or if a radiocarbon date is directly associated with the material, the radiocarbon lab identification number for that analysis is listed, as is the conventional radiocarbon age (RC_age_est) plus associated 1-sigma uncertainty of the measurement (RC_age_sd). Radiocarbon measurements made using Accelerator Mass Spectrometry (AMS) methods are marked by a boolean field (is_Date_AMS), and the type of material that was subject to radiocarbon measurement is identified using a precise classification (14C_dated_material) and a more general categorization (Dated_material_general). If there are no directly associated radiocarbon dates for a sample, only an approximate age range from the original report is used. For recalibrated radiocarbon measurements, the reference terrestrial calibration curve is identified (Calibration_curve) and, if a marine dietary correction was applied, then a categorical field identifies the type of correction (Diet_estimate). Concerning the latter, two fields (Marine_C_Min_or_Mean and Marine_C_Max_or_SD) give the mean or minimum range value and standard deviation or maximum range value for the contribution of marine carbon. When applicable, the marine radiocarbon reservoir for each burial location is summarized as a mean and standard error of the mean (MRE_Mean and MRE_SEM). Calibrated ranges corresponding to a 95% credible interval are given in two separate fields (Date_min and Date_max). If the sample age is based on associations, then the minimum and maximum BCE/CE ages are based on associations to traditions or material culture patterns. Dates BCE are listed with negative integers, and dates CE are presented as positive integers. The type of dating (direct 14C, archaeological context, historical context, or modern collection) is listed in a text field (Date_Type).
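Because BCE dates are stored as negative integers and CE dates as positive integers, selecting entries that fall within a time window reduces to a numeric interval comparison on Date_min and Date_max. A minimal sketch using only the Python standard library follows; the inline rows are invented for illustration and do not come from the dataset, and the function names are hypothetical.

```python
import csv
import io

# Invented example rows that use the field names described above.
SAMPLE_CSV = """ID,Site_Name,Date_min,Date_max,Date_Type
1,Site A,-3500,-3100,direct 14C
2,Site B,1200,1400,archaeological context
3,Site C,,,modern collection
"""


def overlaps_window(row, start, end):
    """True if the entry's [Date_min, Date_max] range intersects [start, end].

    Years are signed: negative for BCE, positive for CE.
    Entries with a missing bound are excluded.
    """
    if not row["Date_min"] or not row["Date_max"]:
        return False
    date_min, date_max = float(row["Date_min"]), float(row["Date_max"])
    return date_min <= end and date_max >= start


def select_by_window(rows, start, end):
    """Return the rows whose date range overlaps the requested window."""
    return [row for row in rows if overlaps_window(row, start, end)]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
# Entries overlapping 4000-3000 BCE
print([row["Site_Name"] for row in select_by_window(rows, -4000, -3000)])
```

To run this against the released file, the `io.StringIO(SAMPLE_CSV)` source would be replaced by an open handle to “afriarch-isotopic-dataset.csv”.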
Several fields are included to describe sample features, including its biological taxonomy (Kingdom, Class, Order, Family) and, when possible, genus and species (Taxon), as well as species common name (Taxon_Common_Name). To ease the identification of specimens according to frequently used categories in isotopic research, we included specification (Category) of plant photosynthetic pathway and the general habitat and diet of consumers (e.g., terrestrial herbivore vs. marine fish). If original publications listed an approximate age-at-death (Age) or biological sex (Bio_sex) estimation of the sampled individual, this information is also included in respective fields. In a few cases, only the averages of stable isotope ratios across individuals are reported, and we marked these entries with the corresponding number of samples that contributed to the average (Averaged_N). Ages are most frequently reported to age categories (e.g., Adult, Juvenile, Infant), and numerical ages are less common in the published literature. Given known difficulties in estimating specific ages of skeletal elements in under-studied populations, we chose to distinguish only adult (>25 years) from juvenile (<25 years). Any other relevant contextual information relating to the sample is listed under a specific comments (Comments) field.
The stable isotope measurements are differentiated based on fields that separate element and material class. Carbon (δ13C) data are differentiated into fields for measurements of organic (δ13C_org) and mineral-bound (δ13C_inorg) carbon, and oxygen (δ18O) is likewise divided into carbonate (δ18O_Carbonate) and phosphate (δ18O_Phosphate) measurements (separate fields are used to report measurements relative to V-PDB and V-SMOW standards). Nitrogen (δ15N) data are represented only by a single field (δ15N). These are accompanied by fields that specify the weight percent carbon (pct_C), the weight percent nitrogen (pct_N), the carbon to nitrogen atomic ratio (CN_ratio), and the collagen yield for bone samples (C_yield_raw), as available from source publications. We include fields for the number of samples (N_Samples) and averaged sample standard deviation for carbon (δ13C_org_sd/δ13C_inorg_sd) and nitrogen (δ15N_sd) for cases in which several values are reported together.
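Since no preservation thresholds were applied during compilation, users will typically filter entries themselves. The sketch below shows one possible filter on CN_ratio, assuming the commonly cited atomic C/N acceptance range of 2.9 to 3.6 for collagen (DeNiro 1985); the thresholds are defaults that a user may change, and the function name and example rows are hypothetical.

```python
def passes_cn_filter(row, cn_min=2.9, cn_max=3.6, keep_missing=False):
    """Check whether an entry's atomic C/N ratio lies within [cn_min, cn_max].

    row: mapping with a "CN_ratio" key, e.g. a row from csv.DictReader
    keep_missing: value returned when no C/N ratio was reported
    """
    raw = str(row.get("CN_ratio") or "").strip()
    if not raw:
        return keep_missing
    return cn_min <= float(raw) <= cn_max


# Invented example entries
entries = [
    {"ID": "1", "CN_ratio": "3.2"},
    {"ID": "2", "CN_ratio": "3.9"},
    {"ID": "3", "CN_ratio": ""},
]
print([e["ID"] for e in entries if passes_cn_filter(e)])
```

Setting `keep_missing=True` retains entries from publications that did not report a C/N ratio, which may be preferable when coverage matters more than strict quality control.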
The source publication for stable isotope data from each sample is given in a reference field that includes publication authors, date of publication, title, and publication citation (Reference) plus the respective persistent identifier (DOI or other). This allows users to reference both the present data compilation and original research when using subsets of the compilation for future research. Any supporting information employed to describe samples which may not be included in the stable isotope publication (e.g., chronology, archaeological context) is mentioned in a comments field (Comments).
Format names and versions
CSV, Excel, & ZIP.
Records created from November 2020 to August 2022.
Jesse Wolfhagen, Steve Goldstein, Sean Hixon, Ricardo Fernandes.
Creative Commons Attribution-ShareAlike.
(4) Reuse potential
The compiled dataset can be used to investigate human subsistence strategies and mobility patterns at various spatiotemporal scales. Animal and plant isotopic data offer isotopic baselines for the aforementioned human studies and can also be used to study animal and crop management practices and to reconstruct paleo-environments and paleo-climates. The dataset is also a useful resource for heritage conservation activities and for the study of the preservation conditions of archaeological materials. Finally, our dataset provides a useful resource for identifying gaps in the available data, which can then serve as the basis for selecting future research targets.