(1) Overview


ARCHIPELAGO is an integrated archaeological and historical database of land and sea food resources utilised by humans in the Japanese Islands. Here, we present the first dataset from this initiative: human bone and hair carbon and nitrogen stable isotope measurements from archaeological sites in the Japanese archipelago covering the temporal range from the Upper Palaeolithic until the end of the early modern Tokugawa period (ca. 19,000 BC to AD 1868).

The use of stable isotope analysis of archaeological human remains to reconstruct past diets began in the late 1970s [1, 2]. The technique exploits the fact that food groups differ in their stable isotope compositions for certain chemical elements, most often carbon and nitrogen. The isotopic compositions of human tissues reflect those of consumed foods. Stable carbon and nitrogen isotope analyses of hair keratin and of collagen extracted from bone or teeth are particularly informative of protein sources, whereas stable carbon isotope measurements of human bone carbonate or tooth enamel reflect the mix of all food macronutrients [3, 4]. By 1987, stable isotope analysis was being applied to Japanese prehistory by Brian Chisholm and Hiroko Koike [5, 6, 7] and by Takeru Akazawa and Masao Minagawa [8, 9]. Minoru Yoneda, a student of Akazawa, began his research in the early 1990s with analyses of Kitamura and other sites in Nagano prefecture [10]. Hiroto Takamiya included isotopic analyses (conducted by Brian Chisholm) of sites in Okinawa and Hokkaido in his 1997 PhD dissertation [11]. Building on these early applications, stable isotope analysis has become widely employed in Japan over the last decade or so, in salvage archaeology as well as in university-based research projects. A brief history of isotope archaeology in Japan up to 2004 is provided by Chisholm [12]. Recent overviews of applications to the Neolithic Jōmon and Bronze Age Yayoi periods were published by Kusaka [13] and Yoneda and Yamazaki [14], respectively.

The results of stable isotope analyses have only gradually been incorporated into interpretations of Japanese archaeology. Many texts do not mention the technique at all [e.g., 15, 16]. Some publications have used isotope analysis to explore the diversity of Jōmon diets in Neolithic Japan [17, 18], but the potential of the technique for Japanese history and archaeology remains underused. According to Chisholm’s summary of the major results of archaeological isotope analysis in Japan, by 2004 there were a number of broad conclusions that seemed to be supported [12]. The first was that C4 plants played, at best, only a minor role in human diets in the archipelago. This assumption reflects a rice-centred view of Japanese history and needs to be re-evaluated in light of growing evidence for millet cultivation and other economic activities from the Yayoi period onwards [19]. A second finding was a very high marine component in the diets of prehistoric populations in Hokkaido, especially those of the Iron Age/medieval Okhotsk culture. This seems to be generally supported by later research, although details regarding the role of trade still need further research [20, 21]. Thirdly, by the early 2000s it was not yet possible to find clear evidence of a dietary change at the Jōmon-Yayoi transition when full-scale cereal farming reached Japan. This point certainly requires new analysis. Finally, except in Hokkaido, few gender differences in diets in prehistoric Japan had been recognised. Despite huge advances in the quantity of isotopic data from Japan since 2004, there remains a real need to investigate these and other questions of historical relevance.

Spatial coverage

The dataset covers the Japanese archipelago (Figure 1), a land area of 364,545 km².

Figure 1 

Distribution of Japanese archaeological sites from which isotopic data in the database was collected.

Description: Japan

Northern boundary: 45.3138

Southern boundary: 20.2531

Eastern boundary: 153.5911

Western boundary: 122.5601
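The bounding box above can serve as a quick sanity check on site coordinates. The following is a minimal sketch, assuming the four boundaries are decimal degrees with north and east positive; the function name is hypothetical and is not part of the dataset or its tooling.

```python
# Bounding box of the dataset as listed above.
# Assumption: values are decimal degrees, north and east positive.
NORTH, SOUTH = 45.3138, 20.2531
EAST, WEST = 153.5911, 122.5601


def in_dataset_bounds(latitude: float, longitude: float) -> bool:
    """Return True if the point lies inside the stated bounding box."""
    return SOUTH <= latitude <= NORTH and WEST <= longitude <= EAST


if __name__ == "__main__":
    # A point in central Honshu, used only as an illustration.
    print(in_dataset_bounds(35.68, 139.77))
```

A check of this kind only flags gross entry errors, such as swapped latitude and longitude; it says nothing about the accuracy of a coordinate that falls inside the box.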

Temporal coverage

ca. 20,000 BC to AD 1868.

Table 1 provides the periodisation employed in our dataset. Samples from the Jōmon (Neolithic) and early modern Tokugawa periods were the most common (Figure 2).

Table 1

Sample periodisation employed in the dataset.


Period | Generic period | Dates
Palaeolithic | Palaeolithic | Up to 14,520 BC
Incipient Jōmon | Jōmon/Neolithic | 14,520–10,550 BC
Initial Jōmon | Jōmon/Neolithic | 10,550–5050 BC
Early Jōmon | Jōmon/Neolithic | 5050–3520 BC
Middle Jōmon | Jōmon/Neolithic | 3520–2470 BC
Late Jōmon | Jōmon/Neolithic | 2470–1250 BC
Final Jōmon (southwest Japan) | Jōmon/Neolithic | 1250–970 BC
Final Jōmon (northeast Japan) | Jōmon/Neolithic | 1250–400 BC
Initial Yayoi | Yayoi-Kofun | 1000–800 BC
Early Yayoi | Yayoi-Kofun | 800–450 BC
Middle Yayoi | Yayoi-Kofun | 450 BC–AD 50
Late Yayoi | Yayoi-Kofun | AD 50–250
Early Kofun | Yayoi-Kofun | AD 250–400
Middle Kofun | Yayoi-Kofun | AD 400–500
Late Kofun | Yayoi-Kofun | AD 500–710
Epi-Jōmon (Hokkaido) | Yayoi-Kofun | 340 BC–AD 700
Nara | Nara-Heian | AD 710–794
Heian | Nara-Heian | AD 795–1185
Okhotsk & Satsumon | Medieval Hokkaido | AD 500–1200
Kamakura (early medieval) | Medieval Japan | AD 1185–1333
Muromachi & Azuchi-Momoyama (late medieval) | Medieval Japan | AD 1333–1603
Tokugawa (early modern) | Early modern | AD 1603–1868

Figure 2 

Distribution of number of database entries by generic time period.

Notes: The Jōmon/Neolithic category includes Neolithic sites from the Ryukyu Islands. The Medieval Hokkaido category consists of samples from the Okhotsk and Satsumon cultures.

(2) Methods


Published stable carbon and nitrogen isotope data (δ13C and δ15N) were collected for premodern Japan up to the Meiji Restoration (1868). The majority of samples were derived from excavated human skeletons, but historic hair from the early modern Tokugawa period (1603–1868) was also included.

Where radiocarbon dates were not available, samples were cross-dated on the basis of pottery and other artefacts. Table 1 shows current widely-accepted dates for archaeological and historical periods in Japan, with the prehistoric chronology taken from Barnes [22].

Geographic coordinates follow those reported in the site report by the organisation responsible for the excavation. These were reported using the Japanese Geodetic Datum 2000 (JGD2000) or the World Geodetic System 1984 (WGS 84). The reporting of geographic coordinates in Japanese archaeological site reports became standard practice after 2004. Coordinates for sites published before that date were estimated from Google Earth, typically with an estimated accuracy better than 10 km.

Sampling strategy

The datasets were derived from all existing publications known to the authors, including research articles and archaeological site reports. The Comprehensive Database of Archaeological Site Reports in Japan run by the Nara National Research Institute for Cultural Properties (https://sitereports.nabunken.go.jp/en) was used to search site reports.

Quality control

Whenever provided we included in the isotopic collection the standard parameters (collagen yield, %C, %N, atomic C/N) for assessment of bone collagen preservation [23], the principal type of tissue included in our data collection. We did not exclude data for which reported parameter values were outside of the recommended range since such data can still be useful for sample preservation studies. Furthermore, such data can be easily filtered prior to a study on human diet.
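Filtering on preservation criteria before a dietary study might look like the sketch below. It assumes rows read from the CSV as dictionaries (for example with csv.DictReader) and uses the column name "C/N" from the dataset description. The acceptance range of 2.9 to 3.6 for atomic C/N is a commonly cited convention and is an assumption here; it is not necessarily the range recommended in [23]. The function name and sample rows are hypothetical.

```python
# Assumption: 2.9-3.6 is used as the acceptable atomic C/N range.
# This is a widely cited convention, not a value taken from this paper.
CN_MIN, CN_MAX = 2.9, 3.6


def passes_cn_criterion(row: dict, cn_min: float = CN_MIN, cn_max: float = CN_MAX) -> bool:
    """Return True if the row has an atomic C/N value inside the given range.

    Rows with a blank or non-numeric C/N are treated as failing the check,
    since preservation cannot be assessed for them.
    """
    raw = (row.get("C/N") or "").strip()
    if not raw:
        return False
    try:
        cn = float(raw)
    except ValueError:
        return False
    return cn_min <= cn <= cn_max


if __name__ == "__main__":
    # Hypothetical rows shaped like csv.DictReader output.
    rows = [
        {"Entry_ID": "1", "C/N": "3.2"},
        {"Entry_ID": "2", "C/N": "4.1"},
        {"Entry_ID": "3", "C/N": ""},
    ]
    kept = [r["Entry_ID"] for r in rows if passes_cn_criterion(r)]
    print(kept)
```

Treating blank values as failures is a conservative design choice; a preservation study would instead keep all rows and inspect the excluded ones.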


Bone preservation is generally poor in the acid soils found in Japan. Shell middens, wetland sites and sites on limestone geology provide major exceptions and most of the Neolithic Jōmon samples are from coastal shell middens. Skeletal remains from the Kofun period (AD 250–710) are biased towards elite burials in burial mounds (kofun in Japanese). Under Buddhist influence, cremation was commonly practiced during the Nara (710–794) and Heian (794–1185) periods and human skeletal remains from these centuries are rare.

Our initial compilation was focused on the study of ancient human diets using bulk stable isotope measurements (e.g. on bulk extracted bone collagen or hair keratin). Data from less commonly employed isotopic proxies (e.g. sulphur, hydrogen, etc.), from isotopic proxies relative to the study of human mobility (e.g. oxygen or strontium isotopes), single compound isotope measurements (e.g. amino acids), and isotopic measurements from archaeological plants or animals were not included. Planned future data collections will add these data.

It was not possible to complete data input for all individuals. When this occurred, the fields for which data was not available were left blank. For four entries chronological data was not available and in the corresponding field ‘Period tags’ a question mark was entered. We will update our data collection whenever new data for already recorded individuals becomes available.

(3) Dataset description

The dataset consists of a single table (available as “Japan human SI data v2.csv” and as “Japan human SI data v2.xlsx”) deposited at the data platform of the Pandora initiative (https://pandoradata.earth/) within the ARCHIPELAGO community (https://pandoradata.earth/organization/archipelago). The ARCHIPELAGO isotopic database is also a member of the IsoMemo initiative (https://isomemo.com), a network of isotopic databases that includes a Webapp for querying and modelling isotopic data (https://isomemoapp.com/).

The data table consists of fields organized into thematic groups. Each data entry is identified by a unique sequential key (Entry_ID). The data submitter is identified by name (ID_submitter) and may add comments on matters not covered by the existing fields (Comments).

The archaeological site and sample context are described in several fields: a site name (Site_name), a short description of the type of site (Site_description), a short description of the burial context (Context_description), a context identifier as given in the original publication (Context_ID), the name of the locality at which the site is located (Locality), the corresponding region (Region), and the site altitude in metres (Altitude). Latitude (Latitude) and longitude (Longitude) are given using the WGS84 coordinate system.

Each archaeological individual from which the sample was taken is identified using the identification provided in the original publication (Individual_ID) followed by a short description of the burial (Burial_type_skeletal_context). Additional sample description includes taxon (Taxon) and the corresponding name in common language (Taxon_common_name). Our dataset currently contains only human data but in the future we plan to expand it to include other taxa. Osteological information includes sex identification (Sex), a text description of age (Age_category_individual), numeric ranges in years for minimum (Min_age_individual) and maximum (Max_age_individual) biological age of the individual at death, and the type of bone or hair material sampled from the individual (Sample_type).

Biological age categories for skeletal individuals followed the published reports. These reports used standard bioarchaeological categories based on dental and skeletal age [24]: infant: birth – 3 years; child: 3–12 years; adolescent: 12–20 years; young adult: 20–35 years; middle adult: 35–50 years; old adult: 50 + years. In some cases, particularly with older publications, the ages of these categories may differ slightly. The age >55 is sometimes used for ‘older adults’ in the Japanese literature. Cases where the same values are reported for minimum and maximum individual ages represent average estimated age. Some of the studies used here report very precise age estimates for sub-adults [e.g., 25]. In such cases our dataset follows these estimates, which are derived from different methods described in the studies concerned.
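The standard categories listed above can be expressed as a lookup from category to an age range in years. This is a sketch only: the exact strings stored in Age_category_individual are not specified here, so the keys below are assumptions, and individual publications may use slightly different limits, as noted above.

```python
from typing import Optional, Tuple

# Age ranges in years for the standard categories listed in the text.
# None marks an open upper limit ("50 + years").
# Assumption: the keys match the wording used in the text, not
# necessarily the strings stored in the dataset.
AGE_CATEGORIES = {
    "infant": (0, 3),
    "child": (3, 12),
    "adolescent": (12, 20),
    "young adult": (20, 35),
    "middle adult": (35, 50),
    "old adult": (50, None),
}


def age_range(category: str) -> Tuple[int, Optional[int]]:
    """Return (minimum, maximum) age in years for a category name."""
    return AGE_CATEGORIES[category.strip().lower()]


if __name__ == "__main__":
    print(age_range("Young adult"))
```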

The chronological range of the sample is given by a minimum age (Min_chronology) and maximum age (Max_chronology) in years BC and AD, with years BC expressed as negative numbers. Age assignment followed a hierarchical approach. Whenever available, we employed direct dates from samples (e.g. radiocarbon dates, for which the calibrated 95% range is reported) or from coeval samples from the same archaeological context. If necessary, corrections for marine radiocarbon reservoir effects were applied to direct radiocarbon measurements of human bones (see below). A dataset field (Dietary_model_selection) identifies the type of Bayesian model employed to estimate the dietary contributions from marine carbon. If no secure dating was available from the sample context, we employed the site’s chronology, usually given in the archaeological report. If this was also not available, we employed the full cultural range to which the sample was assigned. A field identifies the dating method employed (Dating_method). Also included are fields for the uncalibrated direct radiocarbon date on the sample (14C) and its uncertainty (14_unc). Period tags are also used to provide traditional chronological information (Period_tags).
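Because years BC are stored as negative numbers, a small helper can turn the numeric fields into readable labels. The sketch below is illustrative; the function names are hypothetical, and the handling of a value of 0 is an assumption since the BC/AD system has no year zero.

```python
def year_label(year: int) -> str:
    """Format a signed calendar year, with negative values read as BC."""
    # Assumption: a value of 0 is labelled as AD; the dataset convention
    # for this edge case is not stated.
    return f"{-year} BC" if year < 0 else f"AD {year}"


def chronology_label(first: int, second: int) -> str:
    """Format a chronological range, earliest limit first."""
    start, end = sorted((first, second))
    return f"{year_label(start)} to {year_label(end)}"


if __name__ == "__main__":
    print(chronology_label(-450, 50))
```

Sorting the two limits numerically avoids relying on which of Min_chronology and Max_chronology holds the earlier date.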

Measurements of stable carbon (delta_13C_coll) and nitrogen (delta_15N_coll) isotopic ratios in bone collagen and hair keratin are reported together with measurement quality indicators, the percentage of elemental carbon (%C), the percentage of elemental nitrogen (%N), the carbon to nitrogen atomic ratio (C/N), and the collagen yield for bone samples (Collagen_yield).

A reference in the format author(s)/year of publication/title identified the source publication or publications from where the data was collected (Reference), in addition to a link to the publication whenever available (Link), a Digital object identifier as a persistent identifier (DOI), and the publication date or dates (Publication_date). Macrons were not used for Japanese titles in the list.

(4) Bayesian modelling of direct human radiocarbon measurements

Our dataset contains 292 human samples for which chronological information is based on direct bone radiocarbon measurements. These measurements may be influenced by the consumption of aquatic foods, in particular marine foods from the seas surrounding Japan, which typically result in radiocarbon ages apparently older than the actual chronology of the analysed individual. This effect is known as a dietary marine radiocarbon reservoir effect (dietary MRE) and its correction requires an estimate of the contribution from marine carbon to human bone collagen plus an estimate of the MRE of consumed marine foods given that these can vary in space and time. Below we describe the use of Bayesian modelling to perform such a correction for the direct human radiocarbon measurements within our dataset.

Dietary estimates of marine carbon contributions to human bone collagen were obtained following similar procedures to those described in [26, 27]. Briefly, we used the Bayesian software ReSources, available via the IsoMemo Webapp (https://isomemoapp.com/), to generate the dietary estimates. This software is an upgraded version of the Bayesian software FRUITS, allowing for the implementation of different Bayesian mixing model variants [28]. We considered three models: model 1) a model that relies only on δ13C measurements and included four food groups (C3 terrestrial plants, terrestrial mammals, marine fish and marine shellfish); model 2) a model with four food groups relying on both δ13C and δ15N measurements (C3 terrestrial plants, terrestrial mammals, marine fish and marine shellfish); model 3) a model with five food groups relying on both δ13C and δ15N measurements (C3 terrestrial plants, C4 terrestrial plants, terrestrial mammals, marine fish and marine shellfish). Because C4 plants (e.g., millets) have δ13C values similar to those of marine foods, models that do not include C4 plants typically allow for dietary estimates of higher precision.

Prior to the arrival of broomcorn and foxtail millet after 1000 BC, barnyard millet (Echinochloa utilis) was also likely cultivated by some Jōmon groups [29, 30]. According to the radiocarbon database of the National Museum of Japanese History (https://www.rekihaku.ac.jp/up-cgi/login.pl?p=param/esrd/db_param), the earliest directly dated Echinochloa remains from Japan are two seeds from Middle Jōmon Tomi-no-sawa (Aomori), here re-calibrated to 95% credible intervals of 3006–2703 BC and 2879–2851 BC. However, most finds with direct dates are from the historic era. The majority of archaeological finds of barnyard millet also have a limited distribution in southwest Hokkaido/northeast Honshu. The contribution of Echinochloa to the overall prehistoric diet in Japan was probably limited and as such we only considered C4 plants as a potential significant food source from 1000 BC onwards.

In terms of model selection, for human samples with both δ13C and δ15N values reported, we employed model 3 when either limit of the 95% credible interval, calibrated without any dietary correction, was equal to or later than 1200 BC, and model 2 for older samples. A reference cut-off older than 1000 BC was chosen because marine dietary intake shifts radiocarbon ages towards older values, so mixed consumers of C4 plants and marine foods dating after 1000 BC could appear older from their radiocarbon measurements. The average surface marine radiocarbon reservoir is c. 400 years, so our reference cut-off of 1200 BC, 200 years older than 1000 BC, would correspond to a marine carbon dietary contribution of c. 50% (this is an approximate estimate given potential differences in local marine radiocarbon reservoir effects and fluctuations in terrestrial and marine calibration curves). Such high levels of marine consumption are not expected following the introduction of farming. For 30 individuals only δ13C values were available. Fortunately, for these cases the calibrated radiocarbon ranges prior to dietary corrections clearly separated individuals dating older or younger than 1200 BC. For individuals dating younger we applied no model, as higher stable carbon isotopic ratios likely reflect millet rather than marine consumption, while for the remaining individuals we employed model 1. Within the dataset, individuals with direct bone radiocarbon dates are tagged as ‘none’, ‘model 1’, ‘model 2’, or ‘model 3’ under the field ‘Dietary_model_selection’.
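The selection rule described in this paragraph can be sketched as a small decision function. This is one reading of the text, not the authors' code: it assumes that years BC are negative numbers, that "equal or higher than 1200 BC" means equal to or later than 1200 BC in calendar terms (consistent with C4 plants being considered only from 1000 BC onwards), and that the same test applies to the individuals with δ13C values only. Function names are hypothetical.

```python
CUTOFF_YEAR = -1200  # 1200 BC, with years BC expressed as negative numbers


def select_dietary_model(cal_limit_a: int, cal_limit_b: int, has_d15n: bool) -> str:
    """Return the Dietary_model_selection tag for a directly dated individual.

    cal_limit_a and cal_limit_b are the limits of the 95% calibrated range
    obtained without any dietary correction.
    """
    # "Either limit" at or after the cut-off is the same as testing the later one.
    dates_after_cutoff = max(cal_limit_a, cal_limit_b) >= CUTOFF_YEAR
    if has_d15n:
        # Model 3 adds C4 plants; model 2 omits them.
        return "model 3" if dates_after_cutoff else "model 2"
    # With d13C alone, millet and marine signals cannot be separated.
    return "none" if dates_after_cutoff else "model 1"


def approx_marine_fraction(age_shift_years: float, reservoir_years: float = 400.0) -> float:
    """Rough marine carbon fraction implied by an apparent age shift.

    Assumption: the shift scales linearly with the marine fraction and the
    reservoir age is about 400 years, as in the approximate estimate above.
    """
    return age_shift_years / reservoir_years


if __name__ == "__main__":
    print(select_dietary_model(-1300, -1100, has_d15n=True))
    print(approx_marine_fraction(200))
```

The second helper reproduces the reasoning behind the cut-off: a 200-year shift against a reservoir of about 400 years corresponds to roughly half of the collagen carbon being marine.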

The macronutrient composition, expressed in weight carbon percentage (wtC%), for each food group (protein vs. carbohydrates/lipids) was as reported in [26] but with doubled uncertainty values. That study reports macronutrient composition values for C3 plants (protein: 10 ± 5 wtC%; carbs/lipids: 90 ± 5 wtC%), terrestrial herbivores (protein: 30 ± 5 wtC%; carbs/lipids: 70 ± 5 wtC%), and freshwater fish (protein: 65 ± 10 wtC%; carbs/lipids: 35 ± 10 wtC%). We assumed that C4 and C3 plants had similar macronutrient compositions and that marine fish and shellfish compositions would be similar to those of freshwater fish given in [26].

For food reference isotopic values, we used the δ13C and δ15N values from modern bulk C3 plants (δ13C = –25.4 ± 1.6‰; δ15N = 1.2 ± 2.4‰), modern bulk C4 plants (δ13C = –10 ± 0.5‰; δ15N = 1 ± 1.9‰), archaeological bone collagen from terrestrial mammals (δ13C = –20.8 ± 1.3‰; δ15N = 5.3 ± 1.0‰), archaeological bone collagen from marine fish (δ13C = –11.7 ± 0.9‰; δ15N = 13.4 ± 0.8‰), and modern marine shellfish (δ13C = –14.3 ± 1.6‰; δ15N = 8.3 ± 2.1‰), as summarised by [31]. That summary combines the archaeological measurements reported in the same study with earlier modern values reported by [32] (C4 plant values were only reported in [32]). The modern values listed above include a Suess effect correction to account for temporal differences in atmospheric and marine δ13C values for carbon dioxide (correction made by [32]). However, none of the values reported here include the diet-to-consumer isotopic fractionation corrections given in [31].

To obtain δ13C and δ15N values of protein and carbohydrates/lipids components we applied, when available, an offset correction between the measured material and the nutritional component of the respective food group as described in [26]. From this we obtained the macronutrient isotopic values for C3 plants (δ13Cprotein = –27.4 ± 2‰; δ13Ccarbs/lipids = –24.9 ± 2‰; δ15Nprotein = 1.2 ± 2.5‰) and terrestrial herbivores (δ13Cprotein = –22.8 ± 1.5‰; δ13Ccarbs/lipids = –28.8 ± 1‰; δ15Nprotein = 7.3 ± 1‰). We applied to C4 plants (δ13Cprotein = –12 ± 1‰; δ13Ccarbs/lipids = –9.5 ± 1‰; δ15Nprotein = 1 ± 2‰) similar offset corrections as those applied to C3 plants. The same offset corrections given in [26] for freshwater fish were employed here for marine fish (δ13Cprotein = –11.7 ± 1‰; δ13Ccarbs/lipids = –17.7 ± 1‰; δ15Nprotein = 15.4 ± 1‰). In the case of shellfish, we used no offset for δ15Nprotein since reported modern values were measured directly on muscle meat and for δ13Cprotein and δ13Ccarbs/lipids we assumed an offset between protein and lipids of 6‰ (same value as for fish mentioned above) and from its macronutrient composition we did a simple mass conservation estimate of its δ13C protein and carbohydrates/lipid values (δ13Cprotein = –7.7 ± 2‰; δ13Ccarbs/lipids = –13.7 ± 2‰; δ15Nprotein = 11.7 ± 2.5‰). Uncertainties for macronutrient isotopic values were rounded up to multiples of 0.5.

Typical measurement uncertainties for δ13C and δ15N are c. 0.2‰. However, given that our data compilation contains data produced by different labs, we need to take into account inter-lab differences as reported in previous studies [33]. Furthermore, isotopic differences may also occur within a bone given different tissue renewal rates across it, which is relevant when comparing radiocarbon and stable isotope results as these may be sampled from different sections of a bone [34]. Different skeletal elements may also exhibit large isotopic differences due to different renewal rates, although for the most part it is likely that the same bone or similar bones were used for both radiocarbon and stable isotope analyses. In our modelling, we set the uncertainties for isotopic measurements in humans at 0.5‰.

A final aspect to be considered in Bayesian dietary modelling is the set of offsets between dietary macronutrients and human tissues, together with metabolic routing mechanisms. For human bone collagen δ15N we took as reference an offset of 5.5 ± 0.5‰ relative to dietary protein [26]. In the case of human bone collagen δ13C we considered a routed model in which dietary protein contributed 74 ± 4% of the collagen signal and the remaining 26% originated from dietary carbohydrates/lipids [4]. The employed δ13C offset between diet and human bone collagen was 4.8 ± 0.5‰. We also employed a Bayesian prior that limited the contribution of dietary protein to between 10 and 35% of total calories in accordance with physiological studies [35].
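To make the routing and offset parameters concrete, the sketch below computes point estimates of collagen δ13C and δ15N for a given diet using a concentration-weighted mixing equation of the general kind used in FRUITS-type models. It is a simplified forward calculation under stated assumptions, not the ReSources implementation: all uncertainties and the calorie prior are ignored, diet shares are read as shares of food intake weighted by carbon content, nitrogen is assumed to track protein carbon, and shellfish are omitted for brevity. The food values are the mean macronutrient values given in the text; the function name is hypothetical.

```python
# Routing of dietary carbon to collagen and diet-to-collagen offsets (means only).
ROUTING = {"protein": 0.74, "energy": 0.26}  # "energy" = carbohydrates/lipids
D13C_OFFSET = 4.8  # per mil
D15N_OFFSET = 5.5  # per mil

# Mean macronutrient composition (wtC%) and isotopic values from the text.
FOODS = {
    "C3 plants": {"wtC": {"protein": 10.0, "energy": 90.0},
                  "d13C": {"protein": -27.4, "energy": -24.9}, "d15N": 1.2},
    "C4 plants": {"wtC": {"protein": 10.0, "energy": 90.0},
                  "d13C": {"protein": -12.0, "energy": -9.5}, "d15N": 1.0},
    "terrestrial mammals": {"wtC": {"protein": 30.0, "energy": 70.0},
                            "d13C": {"protein": -22.8, "energy": -28.8}, "d15N": 7.3},
    "marine fish": {"wtC": {"protein": 65.0, "energy": 35.0},
                    "d13C": {"protein": -11.7, "energy": -17.7}, "d15N": 15.4},
}


def predict_collagen(shares: dict) -> tuple:
    """Return (d13C, d15N) of bone collagen for a diet given as food shares."""
    num_c = den_c = num_n = den_n = 0.0
    for food, share in shares.items():
        info = FOODS[food]
        for part, weight in ROUTING.items():
            # Carbon reaching collagen from this macronutrient of this food.
            carbon = share * weight * info["wtC"][part]
            num_c += carbon * (info["d13C"][part] + D13C_OFFSET)
            den_c += carbon
        # Assumption: nitrogen is supplied by protein only.
        protein = share * info["wtC"]["protein"]
        num_n += protein * (info["d15N"] + D15N_OFFSET)
        den_n += protein
    return num_c / den_c, num_n / den_n


if __name__ == "__main__":
    print(predict_collagen({"C3 plants": 0.5, "marine fish": 0.5}))
```

The Bayesian models run this kind of calculation in reverse, searching for diet shares that are compatible with the measured collagen values while propagating the uncertainties that this sketch leaves out.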

The ReSources software generates different dietary estimates but for chronological purposes estimates of marine carbon contribution towards bone collagen are the relevant ones. These were expressed as a mean and standard deviation and represent the sum of estimates obtained separately for marine fish and shellfish. Full specification of the models is given in R workspace files (“Model 1.Rdata”, “Model 2.RData”, “Model 3.RData”) made available at the ARCHIPELAGO depository. Estimates were generated for all individuals irrespective of their C/N atomic ratio values. However, as mentioned previously, bone samples for which C/N atomic ratio values are outside acceptable ranges should not be employed for dietary studies and are included in our dataset only for preservation studies.

In addition to dietary estimates, a marine reservoir effect human dietary correction also requires an estimate of the magnitude of the MRE of consumed marine species. For these we lack approximate contemporaneous values for each individual. Thus, we relied on radiocarbon measurements on modern pre-bomb marine samples listed in the Marine Reservoir Correction Database [36]. Using this data, we produced an estimate of the spatial variations in marine ΔR (representing the local MRE offset from the marine calibration curve) along the coast of Japan (Figure 3) using the Bayesian model AverageR available via the IsoMemo Webapp [37, 38]. Average and standard deviation values for ΔR of marine foods for each individual within the dataset were obtained by considering an area within a 100 km radius around assigned geographical coordinates.
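The idea of summarising ΔR within 100 km of a site can be illustrated with a great-circle distance filter. Note that this sketch swaps in a much simpler technique than the one described above: it takes an unweighted mean and standard deviation of raw reference points, whereas the authors averaged a Bayesian spatial estimate produced with AverageR. The function names and the sample points are hypothetical.

```python
import math
from statistics import mean, stdev

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def local_delta_r(site_lat: float, site_lon: float, points, radius_km: float = 100.0):
    """Summarise Delta R values of reference points within radius_km of a site.

    points is an iterable of (latitude, longitude, delta_r) tuples.
    Returns (mean, standard deviation, count), or None if no point is in range.
    """
    values = [dr for lat, lon, dr in points
              if haversine_km(site_lat, site_lon, lat, lon) <= radius_km]
    if not values:
        return None
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread, len(values)


if __name__ == "__main__":
    # Hypothetical reference points: (latitude, longitude, Delta R in 14C years).
    reference = [(35.0, 139.0, 50.0), (35.1, 139.1, 70.0), (40.0, 141.0, 200.0)]
    print(local_delta_r(35.0, 139.0, reference))
```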

Figure 3 

Bayesian estimates of the spatial distribution of marine ΔR around Japan relying on radiocarbon measurements on marine samples (x points) listed in the Marine Reservoir Correction Database [36].

Radiocarbon calibration was done using the Bayesian chronological software OxCal v. 4.4 [39]. We employed a mixed curve model consisting of the terrestrial IntCal20 calibration curve for the northern hemisphere [40] and the Marine20 marine calibration curve [41], in which the contribution from Marine20 is the isotope-based, normally distributed dietary estimate for each individual. Calibrated radiocarbon results for each individual are reported as the 95% highest posterior density interval in the dataset fields “Min_chronology” and “Max_chronology”.
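A mixed-curve calibration of this kind is typically expressed in OxCal's command language with the Curve, Delta_R, Mix_Curve and R_Date commands. The Python sketch below only assembles such a command block as text for one individual; it does not run OxCal. The curve file names, the labels "LocalMarine" and "Mixed", and the example numbers are assumptions for illustration and are not taken from the authors' model files.

```python
def oxcal_mixed_curve_command(name: str, c14_age: float, c14_unc: float,
                              marine_pct: float, marine_pct_sd: float,
                              delta_r: float, delta_r_sd: float) -> str:
    """Build an OxCal command block for a date calibrated on a mixed curve.

    marine_pct and marine_pct_sd are the mean and standard deviation of the
    marine carbon contribution in percent; delta_r and delta_r_sd describe
    the local marine reservoir offset in 14C years.
    """
    lines = [
        # Assumption: curve file names as distributed with OxCal 4.4.
        'Curve("IntCal20","IntCal20.14c");',
        'Curve("Marine20","Marine20.14c");',
        # Local offset applied to the marine curve defined just above.
        f'Delta_R("LocalMarine",{delta_r:g},{delta_r_sd:g});',
        # Mix terrestrial and local marine curves by the dietary estimate.
        f'Mix_Curve("Mixed","IntCal20","LocalMarine",{marine_pct:g},{marine_pct_sd:g});',
        f'R_Date("{name}",{c14_age:g},{c14_unc:g});',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical individual: 3000 +/- 30 BP, 40 +/- 10 % marine carbon.
    print(oxcal_mixed_curve_command("Ind-1", 3000, 30, 40, 10, 100, 50))
```

Generating the commands from the dataset fields keeps the dietary estimate, the local ΔR and the radiocarbon measurement for each individual together in one reproducible block.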

Data type


Format Names and Versions

CSV, Excel, and .RData; UTF-8 encoding

Creation Dates

Records created from November 2019 to May 2021.

Dataset Creators

The primary researcher responsible for the metadata structure and modelling was Ricardo Fernandes and for data collation was Mark Hudson.




Creative Commons Attribution-ShareAlike

Web App Location


(5) Reuse potential

The collected dataset combines isotopic data, informative of diet, with chronological, osteological, cultural, and other types of archaeological and historical information. This provides the basis for future research on the association between ancient diets and socio-economic status, cultural and religious choices, and local palaeo-environmental and palaeo-climatic conditions, among others. We also aim to investigate spatial and diachronic trends in dietary patterns across Japan using the modelling tools available via IsoMemo (https://isomemoapp.com/).