Indigenous Landscape Transformation on Northern Haytí: An Archaeological and Environmental Database of the Montecristi Coast

This paper describes a database collected and constructed as part of a PhD research project on the north-western coast of the Dominican Republic. The PhD was part of the ERC Synergy Grant NEXUS 1492: New World Encounters in a Globalizing World. The database was collected during fieldwork campaigns between 2014 and 2015. Fieldwork consisted of a regional survey, material culture registry and collection, test pit excavation, and the processing of relevant environmental variables. The archaeological data consists of a record of 102 archaeological sites, the material culture associated with them (lithic, shell and coral objects, shell mollusk species), and the relationship between site location and a set of relevant environmental variables used for statistical analysis. This database is one of the few open access archaeological databases currently available in the Caribbean and can be reused by any Caribbean archaeologist working in the Greater Antilles.

The research reconstructed the different spatial scales at which human action was evident, which in turn constituted the essence of the indigenous landscape. By comparing these results with early Spanish cartographies and chronicles, it was possible to evaluate the indigenous landscape before the arrival of Columbus and its transformation into the colonial landscape imposed during the colonization process.

Spatial coverage
Description: Coast of the Montecristi province in northwestern Dominican Republic.

Temporal coverage
1200 to 1500 AD. This temporal coverage was obtained from the regional relative chronology based on the spatiotemporal distribution of ceramic series in the research region. In addition, a set of 7 radiocarbon samples from excavated contexts at four archaeological sites in the research area confirmed this temporal span.

Steps: Survey methods and the definition of site
The archaeological data was collected by applying different regional survey strategies. The first method employed was a Systematic Total Area Survey. However, due to the combination of complex topography, weather conditions and vegetation, as has been reported for other regions [6,7], this method proved to be inefficient and time-consuming. A second strategy consisted of developing a more opportunistic survey [8], which provided the expected and necessary results. Third, to assist the opportunistic survey and maintain the aim of collecting data in a systematic manner, predictive models were calculated for the area. With the combination of the three methods, a total of 102 archaeological sites was recorded (Figure 2).
In order to register and classify spatial distributions from a regional systematic perspective, a standard method for data recovery was developed. The aim behind this was to define 'sites' after fieldwork rather than "find hidden sites on the field" (see the classic critique of the 'site' definition by [9,10]). Building upon Foley's [11] off-site archaeology as well as on Ingold's theoretical ideas on taskscape, a particular field registry methodology was shaped. For this, the collected data was divided into three categories: single, cluster and scatter finds. Each spatial category, called a 'spatial dataset' in the thesis and the database, was registered as a point with coordinates and as a polygon representing its spatial extent, alongside metadata on the material culture associated with it.
During post-fieldwork analysis, a nearest neighbour histogram was calculated to determine the optimal distances at which these spatial phenomena clustered. Based on this result, previous archaeological literature and the researcher's own experience in the field, it was decided to use a standard distance of 100 m as a measure of aggregation. In this sense, an archaeological site was defined in this research as a spatial cluster of material culture evidence, observed in the form of single, cluster and/or scatter finds separated from one another by no more than 100 m. After the 'site' was defined, all the points and polygons of the 'spatial datasets' associated with it were combined and given a code (MC = Montecristi, followed by an increasing number, e.g. MC-1). From the combination of the 'spatial datasets' points a 'site polygon' was projected, and from this a 'site centroid' was calculated. The resulting 'site centroid' coordinate was used as the site coordinate for all the analyses in the research. This definition provided a standard spatial classification for the study area and therefore a robust concept of the archaeological site for both the analysis and the interpretation processes.
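The aggregation rule described above can be illustrated with a short Python sketch. It is not the procedure used in the research, which was carried out on the 'spatial datasets' in a GIS environment. It makes three assumptions: coordinates are projected and expressed in metres, the 100 m rule is read as single-linkage grouping between find points, and the centroid is approximated as the mean of the member points rather than derived from a projected 'site polygon'. The function names are hypothetical.

```python
import math
from collections import defaultdict


def nearest_neighbour_distances(points):
    """Distance from each point to its closest other point.

    Assumes at least two points, given as (x, y) in projected metres.
    These values are the input for a nearest neighbour histogram.
    """
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def group_into_sites(points, max_gap=100.0):
    """Group find points into 'sites' by single linkage.

    Two points end up in the same site when a chain of points links
    them with every step no longer than max_gap (assumed reading of
    the 100 m rule).
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_gap:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, point in enumerate(points):
        groups[find(i)].append(point)

    sites = {}
    # Sites are numbered in the order their first member appears.
    for n, members in enumerate(groups.values(), start=1):
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        # Simplification: mean of member points, not a polygon centroid.
        sites[f"MC-{n}"] = {"members": members, "centroid": (cx, cy)}
    return sites
```

With four hypothetical points at (0, 0), (60, 0), (120, 0) and (1000, 1000), the first three are chained into one site and the last forms a site of its own. Note that single linkage can join points that are more than 100 m apart when intermediate finds connect them, which is one possible reading of the definition and not necessarily the one applied in the research.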

Sampling strategy
The dataset consists of raw materials collected in the field and their classification. Considering the limited storage space for archaeological materials in the Dominican Republic and The Netherlands, it was decided to collect only diagnostic materials from each of the 'spatial datasets'. Diagnostic materials were defined as any complete or quasi-complete lithic, shell or coral object. In cases where materials were too abundant, a sample of the observed materials was collected and the rest were photographed and left in situ. In terms of ceramic materials, only sherds that were either decorated or larger than 5 cm were collected.

Quality Control
All records of the attribute table generated for the archaeological sites were checked against the original files at every crucial point in the processing of materials and sites. In the case of material culture, the check consisted of reviewing the database against the digital pictures and paper field forms from each of the sites. In the case of the archaeological sites and their polygons, the check consisted of comparing the location of each site centroid against the locations of the points and polygons of its 'spatial datasets'.
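A spatial check of this kind could be automated along the following lines. This is only a sketch under stated assumptions, not the check performed in the research: each site is represented by a hypothetical record holding a "centroid" coordinate and a list of "members" (the points of its 'spatial datasets'), and a centroid is flagged when it falls outside the bounding box of those points. A bounding box is a coarse stand-in for the actual polygons.

```python
def centroid_outside_extent(site_centroid, member_points, tolerance=0.0):
    """Return True when the centroid lies outside the bounding box
    of the member points, widened on every side by `tolerance`."""
    xs = [x for x, _ in member_points]
    ys = [y for _, y in member_points]
    cx, cy = site_centroid
    inside = (
        min(xs) - tolerance <= cx <= max(xs) + tolerance
        and min(ys) - tolerance <= cy <= max(ys) + tolerance
    )
    return not inside


def flag_sites(sites):
    """List the codes of sites whose stored centroid fails the check.

    `sites` maps a site code to a dict with the hypothetical keys
    "centroid" and "members".
    """
    return [
        code
        for code, record in sites.items()
        if centroid_outside_extent(record["centroid"], record["members"])
    ]
```

Flagged codes would then be reviewed by hand against the original field records, since a centroid outside the bounding box signals a likely mismatch rather than proving an error.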

Constraints
Since most of the archaeological sites had no absolute dates, but only a relative chronological association in the regional context, data analysis and interpretation were challenging. In this research, the issue was addressed by focusing on the spatial attributes and patterns within the idea of 'sites as tendencies' of human action [12: 30].

(3) Dataset description
The database has been stored in the EASY DANS repository in different folders. Each folder contains files related to specific data. The main folder, "Dataset Content", contains six subfolders and four files. Since DANS has a policy of only using open access formats for the data (e.g. '.ods' for spreadsheets and '.mid' for spatial data), the dataset was saved in these formats and additionally in the formats originally used (e.g. '.csv' for spreadsheets and '.shp' for spatial data).

Object name
"Dataset Content" Folder
Appendix_Archaeological comparisons_in Spanish -an appendix from the PhD dissertation with various maps depicting the relations between certain environmental variables and the distributions of material culture and archaeological sites (as .pdf file).
Appendix_Environmental variables data (in spanish) -an appendix describing the environmental features worked in this research as well as their use as environmental variables (with maps) for the different analyses (as .pdf file).
Archaeological-Environmental-Database -a spreadsheet containing all the information about archaeological and environmental variables. This includes: site locations, types of material culture, relation with the environmental variables and metadata associated with each of the datasets considered (as .csv and .ods files). The Codebook gives more information about the codes used in this file.
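As an illustration of reuse, the .csv version of this spreadsheet can be read with the Python standard library. The sketch below rests on assumptions that should be checked against the deposited files and the Codebook: the file name "Archaeological-Environmental-Database.csv", a comma delimiter, UTF-8 encoding, and the column name "site_code" are all hypothetical.

```python
import csv
from collections import Counter


def load_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle))


def count_by_column(rows, column):
    """Count how many records share each value of the given column."""
    return Counter(row[column] for row in rows)


def summarise(path, column):
    """Open the database file and tally its records by one column.

    Both `path` and `column` are supplied by the caller; the encoding
    is an assumption.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return count_by_column(load_rows(handle), column)
```

A call such as summarise("Archaeological-Environmental-Database.csv", "site_code") would return the number of records per site, provided the file and column exist under those names.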

"1-Research area_polygon" Sub-folder
research-polygon -a vector polygon representing the spatial extent of the archaeological survey area from which the data of the 'spatial datasets' were collected (as .shp and with associated files, also in .mid and .mif files).

"2-Spatial datasets_points" Sub-folder
spatial-datasets -a vector point dataset representing the location and metadata of each of the single, cluster and/or scatter finds recorded during fieldwork (as .shp and with associated files, also in .mid and .mif files).

"3-Spatial datasets_polygon" Sub-folder
spat-ds-polygon -a vector polygon dataset representing the spatial extent of each of the single, cluster and/or scatter finds recorded during fieldwork (as .shp and with associated files, also in .mid and .mif files).

"4-Archaeological sites_points" Sub-folder
sites -a vector point dataset representing the location and metadata of each of the centroid points of the archaeological sites defined during the research (as .shp and with associated files, also in .mid and .mif files).

"5-Archaeological site_polygon" Sub-folder
site-polygon -a vector polygon representing the spatial extent of the archaeological sites defined for the research on the basis of the 'spatial dataset' distribution (as .shp and with associated files, also in .mid and .mif files).

"codebooks" Sub-folder
CODEBOOK-Folders -a file with the object name explanation as presented in this paper (as a .txt file).
CODEBOOK-database -a file with the explanation of the codes used for the different archaeological and environmental variables in the research in Montecristi (as a .txt file).
object_names_codebook -a file with the object name explanation as presented in this paper (as a .csv file).

Data type
Primary and secondary data, and processed data from published materials.

Dataset Creators
The researcher responsible for the data collection and processing was Eduardo Herrera Malatesta.

(4) Reuse potential
The database is currently being reused by researchers within the NEXUS 1492 research project, particularly by colleagues from the Archaeological and Network Science teams. In addition, considering the broad scope of the research area and the detailed information contained in it, it represents an excellent case study for other researchers in the Caribbean and other regions in the world for further spatial analysis, as well as for regional comparisons.