(1) Overview


These datasets were collected as part of PhD research conducted on the coastal area of the Montecristi province, included within the project NEXUS 1492: New World Encounters in a Globalising World. This ERC project aimed to investigate the impacts of colonial encounters in the Caribbean, the nexus of the first interactions between the New and the Old World. NEXUS1492 addresses intercultural Amerindian-European-African dynamics at multiple temporal and spatial scales across the historical divide of 1492. The datasets presented here were part of the sub-project “Transformations of Indigenous Caribbean Cultures and Societies across the historical Divide”, which formed the archaeological backbone of NEXUS1492 and aimed to examine transformations of indigenous cultures and societies across the historical divide, bridging the pre-colonial and colonial era (AD 1000–1800).

The investigation focused on the current province of Montecristi because this area was visited and named by Christopher Columbus during his first voyage in 1492, and because it lies between the first Spanish villas and forts founded during the initial colonization period, which lasted until 1516 (Figure 1). The datasets described here constituted the main data used by the first author for PhD research, with a focus on understanding the indigenous landscape of the northern region of the island of Haytí (renamed La Española by Columbus and shared today by the Dominican Republic and Haiti) and its transformation into the colonial landscape imposed by the Spanish invaders. The project pursued this objective by combining archaeological data collected during regional surveys between 2014 and 2015 with cartographical and documentary data from the early chroniclers and explorers who visited the island. The indigenous landscape was reconstructed by analysing regional archaeological and environmental patterns by means of Geographical Information Systems and spatial statistics. The environmental data was provided by Dominican Republic national agencies in digital and paper format [1]. Finally, the research drew on a theoretical background built upon landscape archaeology, Bender’s ideas of contested landscapes [2, 3] and Ingold’s concept of taskscapes [4, 5]. Within this wider context, the research was able to reconstruct the different spatial scales at which human action was evident, which in turn constituted the essence of the indigenous landscape. By comparing these results with early Spanish cartographies and chronicles, it was possible to evaluate the indigenous landscape before the arrival of Columbus and its transformation into the colonial landscape imposed during the colonization process.

Figure 1 

Map showing the location of the contemporary province of Montecristi (Dominican Republic) in relation to the location of the “Ruta de Colón” and the spatial organization of the Spanish settlements and forts built between 1492 and 1516.

Spatial coverage

Description: Coast of the Montecristi province in northwestern Dominican Republic.

Geographic Coordinate system: World Geodetic System (WGS) 1984.

Datum: World Geodetic System (WGS) 1984.

Projected Coordinate System: WGS_1984_UTM_Zone_19N

Northern boundary: 2203784,56

Southern boundary: 21735544,97

Eastern boundary: 271982,97

Western boundary: 208716,68

Temporal coverage

1200 to 1500 AD.

This temporal coverage was obtained from regional relative chronology based on the spatiotemporal distribution of ceramic series in the research region. In addition, a set of 7 radiocarbon samples from excavated contexts at four archaeological sites in the research area confirmed this temporal span.

(2) Methods

Steps: Survey methods and the definition of site

The archaeological data was collected by applying different regional survey strategies. The first method employed was a Systematic Total Area Survey. However, due to the combination of complex topography, weather conditions and vegetation, and as has been reported for other regions [6, 7], this method proved to be inefficient and time-consuming. The second strategy consisted of a more opportunistic survey [8], which provided the expected and necessary results. Third, to assist the opportunistic survey while still collecting data in a systematic manner, predictive models were calculated for the area. With the combination of the three methods, a total of 102 archaeological sites were recorded (Figure 2).

Figure 2 

Archaeological site distribution on the coast of the Montecristi Province, Dominican Republic (DEM downloaded from https://gdex.cr.usgs.gov/gdex/; ASTER GDEM is a product of NASA and METI).

In order to register and classify spatial distributions from a regional systematic perspective, a standard method for data recovery was developed. The aim was to define ‘sites’ after fieldwork rather than to “find hidden sites on the field” (see the classic critique of ‘site’ definition by [9, 10]). Building upon Foley’s [11] off-site archaeology as well as on Ingold’s theoretical ideas on taskscape, a particular field registry methodology was shaped. For this, the collected data was divided into three categories: single, cluster and scatter finds. Each spatial category, referred to in the thesis and the database as ‘spatial datasets’, was registered as a point with coordinates and as a polygon that represents it spatially, alongside metadata related to the material culture associated with it.

During post-fieldwork analysis, a nearest neighbour histogram was calculated to determine the optimal distances at which these spatial phenomena were clustered. Based on this result, previous archaeological literature and the researcher’s own experience in the field, it was decided to use a standard distance of 100 m as a measurement of aggregation. In this sense, an archaeological site was defined in this research as a spatial cluster of evidence of material culture, observed in the form of single, cluster and/or scatter finds that are separated from each other by no more than 100 m. After the ‘site’ was defined, all the points and polygons of the ‘spatial datasets’ associated with it were combined and given a code (MC = Montecristi, followed by an increasing number, e.g. MC-1). From the combination of the ‘spatial datasets’ points a ‘site polygon’ was projected, and from this a ‘site centroid’ was calculated. The coordinate of the ‘site centroid’ was the one used as the site coordinate for all the analyses in the research.
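As an illustration of the nearest neighbour histogram mentioned above, the following minimal Python sketch computes, for each find, the distance to its closest neighbour and counts those distances per bin. It is not the procedure used in the original GIS workflow: the function names, the 100 m bin width and the sample coordinates are all hypothetical, and the coordinates are assumed to be projected (easting, northing) values in metres.

```python
import math
from collections import Counter


def nearest_neighbour_distances(points):
    """Return, for each point, the distance (m) to its closest other point."""
    distances = []
    for i, (x1, y1) in enumerate(points):
        nearest = min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
        distances.append(nearest)
    return distances


def histogram(distances, bin_width=100.0):
    """Count distances per bin; each key is the lower edge of a bin in metres."""
    counts = Counter(int(d // bin_width) * bin_width for d in distances)
    return dict(sorted(counts.items()))


# Hypothetical projected coordinates (easting, northing) in metres.
finds = [(0, 0), (30, 40), (60, 80), (500, 500), (530, 540), (2000, 2000)]

nn = nearest_neighbour_distances(finds)
hist = histogram(nn)
for lower_edge, count in hist.items():
    print(f"{lower_edge:>7.0f} m and above: {count} finds")
```

A peak in the lowest bins of such a histogram suggests the distance at which finds tend to aggregate, which is the kind of evidence that can inform a threshold such as the 100 m used here.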

This definition provided a standard spatial classification for the study area and therefore a robust concept of the archaeological site for both the analysis and the interpretation processes.
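The site definition above can be sketched in code as follows. This is a simplified illustration under stated assumptions, not the workflow used in the research: it interprets the 100 m rule as single-linkage grouping (a find joins a site if it lies within 100 m of any find already in that site), and it uses the mean of the member coordinates as a stand-in for the centroid of the projected ‘site polygon’. The function names and sample coordinates are hypothetical.

```python
import math


def group_into_sites(points, max_gap=100.0):
    """Group points so each lies within max_gap of another point in its group."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_gap:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, point in enumerate(points):
        groups.setdefault(find(i), []).append(point)
    return list(groups.values())


def mean_centre(points):
    """Mean coordinate of the points (assumed stand-in for the polygon centroid)."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Hypothetical projected coordinates (easting, northing) in metres.
finds = [(0, 0), (60, 0), (120, 0), (1000, 1000), (1050, 1000)]

sites = {
    f"MC-{n}": {"members": group, "centroid": mean_centre(group)}
    for n, group in enumerate(group_into_sites(finds), start=1)
}
for code, site in sites.items():
    print(code, len(site["members"]), site["centroid"])
```

Note that under this interpretation the first and third finds belong to the same site even though they are 120 m apart, because the find between them links the two.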

Sampling strategy

The dataset consists of raw materials collected in the field and their classification. Considering the limited storage space for archaeological materials in the Dominican Republic and The Netherlands, it was decided to collect only diagnostic materials from each of the ‘spatial datasets’. Diagnostic materials were defined as any complete or quasi-complete lithic, shell or coral object. Where materials were too abundant, a sample of the observed materials was collected and the rest were photographed and left in situ. For ceramic materials, only sherds that were either decorated or larger than 5 cm were collected.
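The collection criteria described above can be expressed as a simple decision rule. The sketch below is only an illustration: the `Find` record and its field names are hypothetical and do not correspond to columns in the published database, and the sampling of over-abundant materials is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Find:
    material: str             # e.g. "ceramic", "lithic", "shell", "coral"
    complete: bool = False    # complete or quasi-complete object
    decorated: bool = False   # relevant for ceramic sherds
    max_size_cm: float = 0.0  # largest dimension of the piece


def should_collect(find):
    """Apply the collection criteria stated in the sampling strategy."""
    if find.material == "ceramic":
        # Sherds are kept when decorated or larger than 5 cm.
        return find.decorated or find.max_size_cm > 5.0
    if find.material in {"lithic", "shell", "coral"}:
        # Other materials are kept when complete or quasi-complete.
        return find.complete
    return False


examples = [
    Find("ceramic", decorated=True, max_size_cm=2.0),
    Find("ceramic", max_size_cm=3.5),
    Find("shell", complete=True),
    Find("lithic", complete=False),
]
decisions = [should_collect(f) for f in examples]
print(decisions)
```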

Quality Control

All records of the attribute table generated for the archaeological sites were checked against the original files at every crucial point in the processing of materials and sites. In the case of material culture, the check consisted of reviewing the database with the digital pictures and paper field forms from each of the sites. In the case of the archaeological site and its polygons, a revision was performed by comparing the location of the centroid of the site against the location of the spatial datasets points and polygons.


Since most of the archaeological sites had no absolute dates, but only a relative chronological association in the regional context, this created a challenge for data analysis and interpretation. In this research, the issue was addressed by focusing on the study of spatial attributes and patterns within the idea of ‘sites as tendencies’ of human action [12: 30].

(3) Dataset description

The database has been stored in the EASY DANS repository in different folders. Each folder contains files related to specific data. The main folder, “Dataset Content”, contains six subfolders and four files. Since DANS has a policy of only using open access formats for the data (e.g. ‘.ods’ for spreadsheets and ‘.mid’ for spatial data), the dataset was saved in these formats and additionally in the formats originally worked with (e.g. ‘.csv’ for spreadsheets and ‘.shp’ for spatial data).

Object name

“Dataset Content” Folder

Appendix_Archaeological comparisons_in Spanish – an appendix from the PhD dissertation with various maps depicting the relations between certain environmental variables and the distributions of material culture and archaeological sites (as .pdf file).

Appendix_Environmental variables data (in Spanish) – an appendix describing the environmental features worked with in this research as well as their use as environmental variables (with maps) for the different analyses (as .pdf file).

Archaeological-Environmental-Database – a spreadsheet containing all the information about archaeological and environmental variables. This includes: site locations, types of material culture, relation with the environmental variables and metadata associated with each of the datasets considered (as .csv and .ods files). The Codebook contains more information on the codes used in this file.

“1-Research area_polygon” Sub-folder

research-polygon – a vector polygon representing the spatial extent of the archaeological survey area from which the spatial datasets’ data have been collected (as .shp and with associated files, also in .mid and .mif files).

“2-Spatial datasets_points” Sub-folder

spatial-datasets – a vector point dataset representing the location and metadata of each of the single, cluster and/or scatter finds recorded during fieldwork (as .shp and with associated files, also in .mid and .mif files).

“3-Spatial datasets_polygon” Sub-folder

spat-ds-polygon – a vector polygon dataset representing the spatial extent of each of the single, cluster and/or scatter finds recorded during fieldwork (as .shp and with associated files, also in .mid and .mif files).

“4-Archaeological sites_points” Sub-folder

sites – a vector point dataset representing the location and metadata of each of the centroid points of the archaeological sites defined during the research (as .shp and with associated files, also in .mid and .mif files).

“5-Archaeological site_polygon” Sub-folder

site-polygon – a vector polygon representing the spatial extent of the archaeological sites defined for the research on the basis of the ‘spatial dataset’ distribution (as .shp and with associated files, also in .mid and .mif files).

“codebooks” Sub-folder

CODEBOOK-Folders – a file with the object name explanation as presented in this paper (as a .txt file).

CODEBOOK-database – a file with the explanation of the codes used for the different archaeological and environmental variables in the research in Montecristi (as a .txt file).

object_names_codebook – a file with the object name explanation as presented in this paper (as a .csv file).
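For readers who wish to reuse the spreadsheet listed above, the following sketch shows one way to read a .csv export into Python using only the standard library. The column names (`site_code`, `easting`, `northing`), the sample values and the comma delimiter are assumptions made for illustration; the actual field names and codes should be taken from CODEBOOK-database.

```python
import csv
import io


def load_site_rows(handle, delimiter=","):
    """Read a CSV file-like object into a list of dictionaries, one per row."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return list(reader)


# Hypothetical stand-in for the contents of the .csv file; in practice the
# handle would come from open(<path to the downloaded file>, newline="").
sample = io.StringIO(
    "site_code,easting,northing\n"
    "MC-1,230000.0,2190000.0\n"
    "MC-2,231500.5,2191200.25\n"
)

rows = load_site_rows(sample)
coords = {
    row["site_code"]: (float(row["easting"]), float(row["northing"]))
    for row in rows
}
print(coords)
```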

Data type

Primary and secondary data, and processed data from published materials.

Format names and versions

.shp, .dbf, .prj, .sbn, .sbx, .shp, .xml, .shx, .pdf, .csv, .txt, .mid, .mif, .cpg

Creation dates

The datasets were created between 2014 and 2016 as part of the PhD research of Eduardo Herrera Malatesta, in the context of the ERC Synergy Grant NEXUS 1492: New World Encounters In A Globalizing World project, under the direction of Prof. Dr. Corinne L. Hofman.

Dataset Creators

The researcher responsible for the data collection and processing was Eduardo Herrera Malatesta.


Language

Spanish and English


License

Open access

Repository location

The full datasets are available in the DANS repository at https://doi.org/10.17026/dans-xyn-cu72.

Publication date


(4) Reuse potential

The database is currently being reused by researchers within the NEXUS 1492 research project, particularly by colleagues from the Archaeological and Network Science teams. In addition, considering the broad scope of the research area and the detailed information contained in it, it represents an excellent case study for other researchers in the Caribbean and other regions in the world for further spatial analysis, as well as for regional comparisons.