Open science practices are an increasingly important element of scientific research and are likely to become a requirement of every journal, funding body and employer. Open science is based on four pillars: open data, open methods (protocols and code), open papers and open reviews. This movement seeks to open up science to the wider community, both academic and public, by making all scientific outputs available to enable greater transparency. The adoption of these approaches will facilitate more efficient transfer of skills and knowledge throughout the research community, and so improve the diversity of the discipline and equity of access. A discussion of open science in Archaeology by Marwick et al. identifies specific practices that will benefit researchers and the wider scientific community: i) the use of pre-prints; ii) depositing data in repositories; and iii) transparent and reproducible workflows. However, in many scientific fields the current extent of these practices is unknown, so reviews are needed to assess the current situation.
As much of the output of scientific research takes the form of journal articles, several reviews of data sharing and other aspects of open science in journal articles have been conducted in archaeology, but none concerning phytolith research [4, 5, 8]. Recent reviews of archaeological science and macro-botanical remains found low levels of data sharing, 53% and 56% respectively. It was therefore important to assess where phytolith research stood in relation to other sub-disciplines of environmental archaeology and archaeological science in general.
The use of phytolith analysis in archaeological and palaeoecological studies has been increasing in recent years, in terms of both its methods and their applications [1, 3]. Phytoliths are silica bodies that form within plant cells during the lifetime of the plant and can be used to identify plant taxa to different taxonomic levels. Phytolith analysis is not only used to examine the floral component of past sediments from archaeological sites and their wider environment but is also increasingly being used for radiocarbon dating and isotope analysis [2, 12, 13]. The extraction of phytoliths from artefacts and ecofacts, such as grinding stones, tooth calculus and pottery, is an innovation that is addressing new archaeological questions [1, 10].
It is therefore important that this increase in research, and particularly the upsurge in publications with associated data, is made as useful to other colleagues as possible. Embedding open science practices in a research project allows for the greatest transparency and therefore gives other researchers the ability to validate findings and build on the research. Knowing the current extent of open science practices in phytolith research highlights areas in need of improvement and enables guidelines to be established that bring researchers into closer alignment with good open science practices.
Articles in this study cover a global range.
Articles in this study are not restricted to one particular time period. They range from studies of the palaeoenvironments of early hominids to historical archaeological sites. They also include articles focused on methodological studies of modern environments.
This dataset was designed to complement the dataset produced by Lodwick. Therefore, the same journals (see Table 1 for the list of journals sampled) and the same period (2009–2018) were sampled. This enables the two datasets to be compared, as they both concern sub-disciplines of archaeobotany.
|Archaeobotanical journals||Archaeological science journals||General Archaeology||Cross-disciplinary Journals|
|Vegetation History and Archaeobotany||Archaeological and Anthropological Sciences||Antiquity||PLOS One|
| ||Environmental Archaeology||Journal of Field Archaeology||Proceedings of the National Academy of Sciences|
| ||The Holocene||Oxford Journal of Archaeology|| |
| ||Journal of Archaeological Science||Journal of Anthropological Archaeology|| |
| ||Journal of Archaeological Science: Reports||Journal of World Prehistory|| |
| ||Journal of Ethnobiology|| || |
| ||Journal of Wetland Archaeology|| || |
To find the articles needed for this research on open science practices in phytolith research, the following steps were taken:
- The author accessed the journal website and searched the term ‘Phytolith’.
- This was then refined to the 10-year period required (2009–2018).
- Once the list of articles was found, each article in the list was examined for primary phytolith data. Only articles that provided primary data were selected for the dataset. The articles could be archaeological, palaeoenvironmental or methodological. This was determined from the main focus of the article and the research questions being addressed. There is often overlap between palaeoenvironmental and archaeological studies and therefore some articles could have fallen into either category. In these cases, the author put the articles into the archaeological category as they were focused on samples from an archaeological site.
A full list of the categories (column headings) recorded in the dataset can be found in Table 2, which also sets out the key to the codes used. The categories recorded from each article were selected to gain the most information concerning open science practices. They therefore included open access, data sharing and other information provided with the articles, such as methods, pictures and use of the International Code for Phytolith Nomenclature (ICPN). These latter categories could also be termed the metadata. Both the raw data and metadata should be made available with all articles to allow thorough peer review and to give other researchers the opportunity to build on previous research.
|Category name||Codes and details recorded in category|
|Journal name||Full name of the journal|
|Year||Range between 2009 and 2018|
|Title||Full title of the article|
|Region||Geographic country of study – GeoNames used to standardise (https://www.geonames.org/).|
|Period||Archaeological period – dates or name of period used in the study – given in the introduction of each article.|
|Type of study||Archaeological|
|Sub-type||For methodological papers only: MRS = Modern reference soil/habitat; MRP = Modern reference plant; M = Morphometrics; T = Taphonomy; R = Radiocarbon dating; MD = Method development|
|Data location||N = no data given; .docx = Word document; .xlsx = Excel spreadsheet|
|Raw count data in re-useable format (supp material as Excel or CSV, or in repository)||Y = Yes; N = No|
|Pictures/photos provided of phytolith morphotypes||Y = Yes; N = No|
|Open access article||Y = Yes; N = No|
|Open journal||Y = Yes; N = No|
|Other access found||N/A – open access articles already; Repository – other than social media ones above; 6 months – article made available after 6 months by journal; N – had to request; N – could not request|
|Used ICPN 1.0||Y = Yes; N = No|
|Full methods given||Y/N; Y = Written in full in the text/supplementary file or referenced the use of one method article.|
|Other details – details of data given.||More detail of the types of data were recorded – what form the data was provided in – raw counts/absolute counts/percentage/types of graphs, etc.|
The author decided to simplify the collection of data from the selected articles by taking a presence/absence approach to most of the categories in the dataset. Therefore, several categories need further clarification as to how they were recorded as Yes or No answers:
- Raw count data in re-useable format – the articles used a wide variety of data presentation methods and data types. Recording all of these in this category would not add to the argument about data reusability, so they were noted in the other details column of the dataset instead. The author determined that, for other researchers to reuse phytolith data, the raw counts and the weights taken during processing need to be provided. These are the actual raw data created in phytolith analysis, and making them available allows other researchers to conduct any form of analysis on the data. The data also need to be in an easily accessible format. Therefore, to get a Yes for this category, the raw count data had to be provided in an Excel spreadsheet or CSV file, either as supplementary material or in an open repository.
- Picture of phytolith morphotypes – this could be photographs or diagrams of the morphotypes identified in the study. These are important for validation of identification.
- Other access found – many of the journal articles were not Gold open access. However, the author felt that it was important to record how available these articles were online in other locations. Many researchers post their articles on academic social media sites such as ResearchGate and Academia.edu, or make them available through open repositories such as their own university repositories. Recording this gave a truer sense of the openness of the publications in this study.
- Use of ICPN 1.0 – an effort has been made by the phytolith community to standardise the nomenclature used in research. The author wanted to determine to what extent this was being used, as the standardisation and use of specific codes for morphotypes is important for the reuse of data.
- Full method – it was determined that a full method was supplied if a full description of the phytolith extraction process (from sediment or plant material) was given in the text of the article or as a supplementary file, or there was a clear reference to one methodological paper.
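The Yes/No rule for the raw count data category can be sketched in code. This is only an illustration of the criterion described above, not the procedure used to build the dataset, which was coded by hand. The function name, its arguments and the set of accepted file extensions (including `.xls`) are assumptions made for the example.

```python
# Hypothetical helper illustrating the presence/absence rule for the
# "Raw count data in re-useable format" category.
# Assumption: reusable formats are identified by file extension.
REUSABLE_FORMATS = {".xlsx", ".xls", ".csv"}


def raw_data_reusable(has_raw_counts: bool, data_location: str) -> str:
    """Return 'Y' only if raw counts are shared as an Excel or CSV file."""
    location = data_location.strip().lower()
    in_reusable_file = any(location.endswith(ext) for ext in REUSABLE_FORMATS)
    return "Y" if has_raw_counts and in_reusable_file else "N"


# Raw counts in a spreadsheet qualify; raw counts in a Word file do not,
# and a spreadsheet without raw counts (e.g. percentages only) does not.
print(raw_data_reusable(True, ".xlsx"))
print(raw_data_reusable(True, ".docx"))
print(raw_data_reusable(False, ".csv"))
```

Both conditions must hold for a Yes, which mirrors the requirement that the data be raw counts and also be in an easily accessible format.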
Once the data collection stage was completed, the dataset was checked for spelling mistakes and consistency of terms. All entries were checked for inconsistencies using Table 2 to confirm that the codes used and the criteria for presence/absence categories were applied correctly. Location (Region) data was standardised to countries using geonames (https://www.geonames.org/).
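A consistency check of this kind could also be scripted. The following is a minimal sketch, assuming the CSV column headings match the category names in Table 2 (the actual headings in the published file may differ) and using two invented example rows rather than real entries from the dataset.

```python
import csv
import io

# Allowed codes per column, taken from the Y/N categories in Table 2.
# Assumption: column headings are identical to the category names.
ALLOWED_CODES = {
    "Open access article": {"Y", "N"},
    "Open journal": {"Y", "N"},
    "Used ICPN 1.0": {"Y", "N"},
    "Full methods given": {"Y", "N"},
}


def find_invalid_codes(rows, allowed=ALLOWED_CODES):
    """Return (row number, column, value) for every entry not in the key."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for column, codes in allowed.items():
            value = (row.get(column) or "").strip()
            if value not in codes:
                problems.append((row_number, column, value))
    return problems


# Invented sample rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "Title,Open access article,Open journal,Used ICPN 1.0,Full methods given\n"
    "Example article A,Y,N,Y,Y\n"
    "Example article B,y,N,No,Y\n"
)
problems = find_invalid_codes(csv.DictReader(sample))
for problem in problems:
    print(problem)
```

In this invented sample the second row is flagged twice, once for a lower-case `y` and once for `No`, which are the kinds of inconsistency a manual check against Table 2 is meant to catch.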
The period category was entered for archaeological articles only. However, standardising these entries proved difficult due to the global nature of the dataset, because named periods often have different meanings in different geographic regions. Therefore, this data was not standardised and was collected as either a period name, date or date range given in the introduction of the article.
The decision to use a presence/absence category to record the sharing of raw data was determined partly by problems with labelling tables and graphs in the published articles. Some of the data was not labelled adequately to allow the type of data to be determined.
Another factor that did not constrain but hindered the collection of data was the poor labelling of supplementary data files. Often the files were labelled as supplementary file 1, with no other explanation of the contents of the file in the title. It was also found that these files were not adequately referred to in the text. Therefore, to determine what the file contained, it had to be downloaded. If a researcher was collating a large amount of data for meta-analysis, this lack of labelling would add considerably to their workload. All supplementary files in this dataset were downloaded and examined for the completion of the dataset.
(3) Dataset description
Raw data table for Karoune 2020 Assessing Open Science Practices in Phytolith Research – information extracted from 341 articles of primary phytolith data from 16 archaeology and palaeoecology journals between 2009–2018.
Key to codes for Karoune 2020 – description of the data and codes used in each column heading of the dataset.
Primary and secondary data.
Format Names and Versions
Raw data table for Karoune 2020 Assessing Open Science Practices in Phytolith Research – CSV file.
Key to codes for Karoune 2020 – CSV file.
Dataset created between October 2019 and June 2020.
The primary researcher responsible for the data collection was Emma Karoune.
CC0 1.0 Universal
Open Science Framework: https://osf.io/8p3bn/
(4) Reuse Potential
There are several potential ways that this data could be reused. Firstly, this dataset adds to the growing review of open science practice in Archaeology and more specifically Environmental Archaeology. It could therefore be collated with other evidence or built upon further to draw together an overview throughout the discipline.
It could serve as an aid to teaching open science practices. The dataset could be used for a simple data analysis task for students in teaching modules, along with other such evidence from Archaeology or Science in general.
The dataset could also be used in teaching environmental archaeology, particularly phytoliths or archaeobotany modules. It could aid the teaching of data analysis and discussions concerning academic publishing and the application of open science practices.
Within phytolith research, this dataset can be used by the phytolith community to examine the way forward in terms of drawing up guidelines for data sharing and open science practices in this field.
In terms of academic publishing, this data could be used by journal editors to draw up guidelines for journal data availability policies.
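As an illustration of the simple data analysis task suggested for teaching, the sketch below computes the percentage of Yes entries per year for one category. It assumes column headings named `Year` and `Open access article`, which may not match the published CSV file exactly, and it runs on a few invented rows, so the numbers it produces are not results from the dataset.

```python
import csv
import io
from collections import defaultdict


def share_of_yes_by_year(rows, column):
    """Return {year: percentage of rows coded 'Y' in the given column}."""
    totals = defaultdict(int)
    yes = defaultdict(int)
    for row in rows:
        year = row["Year"].strip()
        totals[year] += 1
        if row[column].strip() == "Y":
            yes[year] += 1
    return {year: 100 * yes[year] / totals[year] for year in sorted(totals)}


# Invented rows for illustration only; replace with the downloaded CSV file.
sample = io.StringIO(
    "Year,Open access article\n"
    "2009,N\n"
    "2009,Y\n"
    "2018,Y\n"
    "2018,Y\n"
)
result = share_of_yes_by_year(csv.DictReader(sample), "Open access article")
print(result)
```

To use the real data, the `sample` object could be replaced with the raw data table opened from a local copy of the file, and the same function could be called for any of the Yes/No categories in Table 2.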