Timing the Mesolithic-Neolithic Transition in the Iberian Peninsula: The Radiocarbon Dataset

Context
In recent years archaeological research has shown a growing awareness of the need to share data for the sake of reproducibility and transparency, and this journal is a good example of that trend. This shift is reflected in the growing number of radiocarbon datasets. In addition to the radiocarbon datasets already available online, such as Banadora or Radon [1, 2], new datasets such as EUROEVOL or IDEArq [3, 4] have been published in recent years.
The dataset published in this paper was developed and collected in the context of the EVOLPAST project (2016–2018), although its preliminary design goes back to the beginning of the twenty-first century. The chronological framework of the dataset (8000–5500 BP) is characterised by the development of the last hunter-gatherer groups (Mesolithic societies) and the arrival of the first agricultural societies in the Western Mediterranean [5–8], which has opened an interesting debate about the mechanisms involved in this spread [9].
The main goals of the project related to this database were: a) to examine the socioecological dynamics of the middle Holocene in Mediterranean Iberia, in order to test temporal probability curves built from radiocarbon dates as a relative proxy for exploring possible links between population trends and cultural changes from an evolutionary perspective; b) to explore the possible interaction between the last hunter-gatherer and the first agricultural groups in the region, using the radiocarbon dates to create a chronological model based on Bayesian statistics.
Some results of the project have already been published on evolutionary approaches to lithics and pottery [10, 11]. Bayesian and geostatistical analyses of the Mesolithic-Neolithic transition at intra-site or regional scale [12–14], and studies of demographic trends based on settlement distributions and summed probability distributions of calibrated radiocarbon dates [15, 16], have also been published.

Spatial coverage
Iberian Peninsula (Figure 1).

(2) Methods
The database has been built on rigorous research criteria. The core data were obtained directly from published papers and grey literature, as well as from information provided by colleagues who work in the Iberian Peninsula.

Steps
The original database was designed with FileMaker software because it integrates the database engine with the user interface in a user-friendly way. The dataset was collected in several steps:
1. We imported the basic information that we had compiled (site name, laboratory code, BP, SD, sample dated and cultural period) into the dataset and checked it for inconsistencies (see the Quality Control section).
2. We added the geospatial information for each record, taking into account modern administrative divisions (town, district and country), hydrographic region and site coordinates (WGS 84). We also completed additional information, such as the archaeological context of each radiocarbon date (layer, stratigraphic units, features and the like), the type of site and any data not accessible in step 1, by consulting the original paper.
3. We added new radiocarbon dates from papers and from several radiocarbon compilations for the Mesolithic and Neolithic [17,18].

Sampling strategy
We conducted a rigorous search for radiocarbon dates published in any medium up to February 2017. Some thermoluminescence (TL) dates have also been included.

Quality Control
First, we checked all records for inconsistencies based on the laboratory codes. Second, the coordinates were checked against the geographical information provided in the original publication when possible; otherwise, we used the coordinates of the modern town. Finally, the cultural affiliation (entidad cultural) of each radiocarbon date was entered according to the original paper.
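As an illustration of a laboratory-code consistency check, the following minimal sketch flags records that share the same code once spelling variants are normalised. It is not the procedure used to build the database: the field names `id` and `lab_code`, the normalisation rule and the example codes are all assumptions made for the example, and the "n.d" marker is the one the dataset uses for unknown values.

```python
from collections import defaultdict

def find_duplicate_lab_codes(records, missing="n.d"):
    """Return normalised laboratory codes that occur in more than one record."""
    by_code = defaultdict(list)
    for rec in records:
        code = rec["lab_code"].strip()
        if not code or code == missing:
            continue  # unknown codes cannot be compared
        # Assumed normalisation: ignore case, spaces and hyphens
        key = code.upper().replace(" ", "").replace("-", "")
        by_code[key].append(rec["id"])
    return {code: ids for code, ids in by_code.items() if len(ids) > 1}

# Invented records, for illustration only
records = [
    {"id": 1, "lab_code": "Beta-12345"},
    {"id": 2, "lab_code": "beta 12345"},
    {"id": 3, "lab_code": "OxA-9999"},
    {"id": 4, "lab_code": "n.d"},
]
duplicates = find_duplicate_lab_codes(records)
```

Records flagged in this way would still need to be compared by hand, since a repeated code may be a genuine duplicate or a typing error in one of the sources.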

Constraints
Although how radiocarbon dates should be reported has been carefully discussed since 1970 [19], and several scientific journals now have protocols for publishing radiocarbon dating reports, key information such as the laboratory code or the material dated is sometimes unavailable. In such cases the attribute is marked as "n.d", meaning that the information is unknown. For instance, only 1.5% of the listed dates lack information about the sample material dated. The most common sample material is charcoal (44.1%), followed by bone (37.6%), seed-fruit (8.1%), shell (6.3%), sediment (0.75%), pollen (0.21%), pine bract (0.05%) and others (1.18%).
It is important to note that NºID is a unique identifier, which was assigned automatically when the record was entered.
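Percentages such as those above can be recomputed from the material column while keeping "n.d" as its own category. The sketch below is a hedged illustration: the function name and the short sample list are invented, and the category labels are given in English although the original database uses Spanish labels.

```python
from collections import Counter

def material_shares(materials):
    """Return the percentage share of each material label, rounded to 2 decimals."""
    counts = Counter(m.strip() for m in materials)
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

# Invented sample, not taken from the dataset
sample = ["charcoal", "bone", "charcoal", "n.d", "shell",
          "bone", "charcoal", "seed-fruit"]
shares = material_shares(sample)
unknown_share = shares.get("n.d", 0.0)  # share of records with unknown material
```

Keeping "n.d" in the denominator makes the shares comparable with those reported here, which are expressed over all listed dates.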

(3) Dataset description

Object name
C14_Piberia.xlsx is the file in which radiocarbon dates related to the Neolithic transition in the Iberian Peninsula are compiled. It is composed of several sheets, each containing specific information. The field NºID is common to all sheets and is used to identify each radiocarbon sample and its information throughout the .xlsx file. So that it can be used in any software or programming environment, each sheet of the database has also been uploaded in comma-separated values (.csv) format.
The database also includes all the fields needed to integrate it with other databases, as described below, because some fields (Site, BP, radiocarbon laboratory identification, Standard Deviation or Sample dated) are the same as those used in databases of this kind.
Dataciones y contexto provides radiocarbon information [BP, SD, laboratory code, material, species, radiocarbon method] and archaeological context [site number, type of site, stratum information, level].
Información geoespacial provides the spatial information of each record, including the modern town, province and country. It also provides major hydrographical information and the geographical coordinates, using the World Geodetic System 84 (WGS84), designated by EPSG code 4326 (https://epsg.io/4326).
Referencias gives the original reference from which the information was collected. It has three attributes: NºID, citation (first author and year) and the complete reference.
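Because NºID is shared by all sheets, the .csv exports can be joined on that field. The following is a minimal sketch under stated assumptions: only NºID comes from the dataset description, while the other column names (Site, BP, SD, Lat, Lon), the inline CSV text standing in for the exported files, and the values are invented for illustration.

```python
import csv
import io

# Invented stand-ins for two exported sheets; real column names may differ
dates_csv = """NºID,Site,BP,SD
1,Site A,6500,50
2,Site B,7200,60
"""
geo_csv = """NºID,Lat,Lon
1,38.80,-0.50
2,39.45,-0.95
"""

def index_by_id(text, key="NºID"):
    """Parse CSV text and index its rows by the shared identifier."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}

def join_sheets(*sheets):
    """Merge rows that share an identifier across any number of sheets."""
    merged = {}
    for sheet in sheets:
        for rec_id, row in sheet.items():
            merged.setdefault(rec_id, {}).update(row)
    return merged

merged = join_sheets(index_by_id(dates_csv), index_by_id(geo_csv))
```

When reading the real files from disk, the text encoding may need to be given explicitly so that the "º" in NºID is decoded correctly.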

Data type
Primary and secondary data

Language
Spanish. The database includes a file, "equivalence Spanish English fields", that provides an English translation of the fields and categories of the original database in order to facilitate reuse by non-Spanish speakers.

License
Creative Commons Attribution 4.0

Publication date
The dataset was published in the repository on 17/04/2018.

(4) Reuse potential
Our dataset offers great potential for exploring the Neolithic transition in the Iberian Peninsula. On the one hand, its information can be used to explore the chronology of the transition by applying Bayesian methods at intra-site or regional scale [12,14]. On the other hand, it can be used to explore the neolithisation phenomenon through maps of radiocarbon date distributions or demographic trends [15,16,20].
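As a rough illustration of how the BP and SD fields could feed a temporal density curve, the sketch below sums one normal density per date on the uncalibrated BP scale over the 8000–5500 BP window. This is a deliberate simplification and not a summed probability distribution of calibrated dates: a real analysis would first calibrate each date against a calibration curve with dedicated software. The function names, the 10-year grid and the three example dates are assumptions made for the example.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def summed_density(dates, grid):
    """Sum one normal density per (BP, SD) pair at each grid year (uncalibrated)."""
    return [sum(normal_pdf(year, bp, sd) for bp, sd in dates) for year in grid]

# Invented (BP, SD) pairs, not taken from the dataset
dates = [(6500, 50), (6520, 40), (7200, 60)]
grid = range(8000, 5499, -10)  # 8000 to 5500 BP in 10-year steps
curve = summed_density(dates, grid)
peak_year = max(zip(curve, grid))[1]  # grid year with the highest summed density
```

The area under such a curve approximates the number of dates, so peaks reflect both sampling intensity and the precision of individual measurements, which is one reason such curves are treated only as a relative proxy.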