Geospatial Data from the “Building a Model to Reconstruct the Hellenistic and Roman Road Networks of the Eastern Desert of Egypt, a Semi-Empirical Approach Based on Modern Travelers’ Itineraries” Paper

– Topographic factor: Digital Elevation Model (DEM).
– Navigation factor: discriminates valley areas (wadis), where travelers orient themselves easily, from areas with few landscape landmarks. It was produced by analyzing the theoretical hydrographic network, itself derived by DEM processing.
– Terrain factor: this layer represents the zones that are difficult for a loaded camel to walk across, namely coherent outcropping rocks and coarse sedimentary deposits. It was produced by combining geological and topographical data with photo-interpretation of satellite images.

More details on input data and their creation are in the associated paper.
The following steps describe the development of least-cost camel paths in the Egyptian Eastern Desert according to the method described in the associated publication. All steps are performed in ArcGIS 10.5 (ArcGIS Desktop: Release 10.5.1, Advanced license, Copyright © 1995-2017 ESRI), and a data processing model is used to carry out each step (Figure 1).
The first step generates a cost grid that combines the navigation and terrain movement factors with the calibration parameters of these raster grids, which are specific to camel movement in this area. The next step builds the isotropic least-cost network. It takes as input the archaeological and watering sites, the previously created cost grid, and the topographic factor with its calibration table.
The raw network is then cleaned of duplicates and generalized with a tolerance of 40 m in order to limit the crossroads zones. Because the previous least-cost analyses are performed on 11 m raster grids, the modelled site locations fall at the center of the pixel that overlaps each site; a correction is therefore applied to link the actual site location to the network.
Post-processing is then applied to this reworked network to prepare it for the calculation of the cost of mobility of each segment. This process also generates a point layer corresponding to the intersections of the network edges.
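The two steps above can be illustrated with a minimal, pure-Python sketch. It is not the ArcGIS processing model of the repository: the class codes, the calibration values, the choice to multiply the two factors, and the function names `build_cost_grid` and `least_cost_path` are all assumptions made for illustration, and the topographic factor is left out. The least-cost search uses Dijkstra's algorithm on an 8-connected grid, a common way to compute isotropic least-cost paths.

```python
import heapq
import math


def build_cost_grid(navigation, terrain, nav_table, terrain_table):
    """Reclassify two class grids with calibration tables and multiply them.

    Multiplying the factors is an assumption of this sketch.
    """
    return [
        [nav_table[n] * terrain_table[t] for n, t in zip(nav_row, ter_row)]
        for nav_row, ter_row in zip(navigation, terrain)
    ]


def least_cost_path(cost, start, goal):
    """Dijkstra on an 8-connected grid; returns (total_cost, list of cells)."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        acc, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if acc > best[cell]:
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                step = math.hypot(dr, dc)  # 1 for straight moves, sqrt(2) for diagonals
                new = acc + step * (cost[r][c] + cost[nr][nc]) / 2.0
                if new < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new
                    parent[(nr, nc)] = cell
                    heapq.heappush(queue, (new, (nr, nc)))
    # Walk back from the goal (assumes the goal is reachable from the start).
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return best[goal], path[::-1]


# Hypothetical class codes: navigation 1 = wadi, 2 = few landmarks;
# terrain 1 = easy ground, 2 = outcropping rock or coarse deposits.
navigation = [[1, 1, 1], [2, 2, 1], [2, 2, 1]]
terrain = [[1, 1, 1], [1, 2, 1], [1, 1, 1]]
nav_table = {1: 1.0, 2: 3.0}       # hypothetical calibration values
terrain_table = {1: 1.0, 2: 5.0}   # hypothetical calibration values

cost = build_cost_grid(navigation, terrain, nav_table, terrain_table)
total, path = least_cost_path(cost, (0, 0), (2, 2))
```

In this toy grid the path stays in the cells coded as wadi and avoids the difficult central cell, which is the behavior the calibration parameters are meant to encourage.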
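The site-location correction can be pictured as follows. This is a sketch under stated assumptions, not the repository's tool: it supposes a north-up raster whose origin is the upper-left corner, and the helper names `pixel_centre` and `connector` are hypothetical. It computes the center of the 11 m pixel containing a site and returns the short segment joining the actual site location to that center, where the modelled path ends.

```python
def pixel_centre(x, y, origin_x, origin_y, cell_size):
    """Centre of the raster cell containing (x, y).

    (origin_x, origin_y) is assumed to be the upper-left corner of the grid.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    cx = origin_x + (col + 0.5) * cell_size
    cy = origin_y - (row + 0.5) * cell_size
    return cx, cy


def connector(site_xy, origin_xy, cell_size=11.0):
    """Two-point segment from the actual site location to its pixel centre."""
    centre = pixel_centre(*site_xy, *origin_xy, cell_size)
    return [site_xy, centre]


# Hypothetical site at (25, 100) in a grid whose upper-left corner is (0, 110).
link = connector((25.0, 100.0), (0.0, 110.0))
```

With an 11 m cell, such a connector can never be longer than half the pixel diagonal, about 7.8 m.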
The last step is to calculate the cost of mobility for each segment of the reworked network. This process uses the same processing as the creation of the raw network but applies it to the cleaned and generalized one.

Quality Control
The calibration tables used in the network's design were created to ensure that the modelled paths fit as closely as possible the itineraries of modern travelers who journeyed under conditions similar to those of ancient travel. These travelers' itinerary data were divided into two datasets. One is used to create the calibration tables (with a cumulative distance of 1.343 km), the other to evaluate path quality (with a cumulative distance of 1.103 km). All the geospatial data and spreadsheets used in this quality control are in the repository.
The calibration of the travel cost factors and the model's ability to replicate alternative routes are evaluated by comparing measured deviations, but final validation relies on comparing the valley bottoms crossed.
Uncertainties in the calibration and validation datasets are difficult to quantify because they rely mainly on the interpretation of modern travelers' texts. The geomorphological nature of the information (natural landmarks, valleys explored, surfaces avoided) nevertheless allows us to estimate an accuracy at the scale of the valley. Between the starting and destination points of a calibration or validation step, a modelled least-cost route is considered valid when it follows the same wadi bottom as the control route. Because the valleys are wider in the lowland areas, the associated and accepted uncertainties are greater there than in the steeper areas.
The model results are therefore validated against the calibration and validation data by comparing the wadis crossed, regardless of their width: a valid modelled path follows the right valleys. A quantitative study of the deviations has also been conducted in the associated paper. This quantitative analysis compares the trajectories of the model paths with the calibration and validation ones, using a method close to the Path Deviation Index developed by Jan, Horowitz and Peng [1]. It shows lower uncertainties in the validation data than in the calibration data, and higher model deviations in non-mountainous areas.
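To give an idea of what a deviation measure of this kind computes, here is a minimal sketch. It assumes that the index is the area enclosed between the modelled path and the control route divided by the length of the control route; the exact variant used in the associated paper may differ. The function names are hypothetical, coordinates are assumed to be in a projected system in metres, and the two paths are assumed to share their endpoints and not to cross each other in between (the shoelace formula would otherwise let areas cancel out).

```python
import math


def polyline_length(points):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def enclosed_area(model, control):
    """Shoelace area of the ring made of the model path and the reversed control path."""
    ring = list(model) + list(reversed(control))
    twice = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1])
    )
    return abs(twice) / 2.0


def deviation_index(model, control):
    """Enclosed area divided by control length; the result is a distance in metres."""
    return enclosed_area(model, control) / polyline_length(control)


# Hypothetical example: a straight 10 m control route and a model path
# that bulges 2 m away from it at mid-course.
control_route = [(0.0, 0.0), (10.0, 0.0)]
model_route = [(0.0, 0.0), (5.0, 2.0), (10.0, 0.0)]
index = deviation_index(model_route, control_route)
```

The index can be read as the mean lateral offset of the modelled path from the control route, which makes it comparable between steps of different lengths.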

Constraints
The data processing models cannot be used directly as standard ArcGIS tools. The tools called by the models are already set up, but the user must open each processing model in edit mode in order to run it. The tools are therefore run either from the ModelBuilder interface of ArcGIS or with the Python scripts. This choice was made so that users can define the paths of the intermediate data according to their needs rather than having the data stored in a temporary folder. The diagram view also provides a good visualization of the structure of the processing chain and of the role of the intermediate data.
The "Least Cost Path network to and from all sites" and "Network Cost Value Postprocessing" models use parameterized iterators to reproduce our data. These operations are computationally very costly. It is strongly advised to run a few iterations first, in order to determine whether the machine can perform them and to estimate the time needed for all the iterations. For the "Least Cost Path network to and from all sites" model, if the machine is not able to perform an iteration, it is recommended to split the study area into overlapping sectors using a raster clipping tool and a selection of the vector data (be careful to recalculate the "id" column of the start and end input site points).
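The advice to test a few iterations can be followed with a simple timing wrapper. The sketch below is generic Python, not part of the repository's scripts: `estimate_total_seconds` and `run_iteration` are hypothetical names, and the estimate assumes that all iterations take roughly the same time, which may not hold if some sites are much farther apart than others.

```python
import time


def estimate_total_seconds(run_iteration, n_total, n_trial=3, clock=time.perf_counter):
    """Time the first n_trial iterations and extrapolate linearly to n_total."""
    n_trial = min(n_trial, n_total)
    if n_trial <= 0:
        return 0.0
    start = clock()
    for i in range(n_trial):
        run_iteration(i)  # placeholder for one iteration of the processing model
    elapsed = clock() - start
    return elapsed / n_trial * n_total


def example_iteration(i):
    """Stand-in workload; replace with a call to one real model iteration."""
    sum(range(10000))


estimate = estimate_total_seconds(example_iteration, n_total=100)
```

If the estimate is too long for the machine at hand, splitting the study area into overlapping sectors, as recommended above, reduces the cost of each iteration.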

Object name
The repository structure and file names are described in Figure 2.
A metadata Excel spreadsheet (produced with DicoGIS 3.0.0, with modifications) is also available at the root of the repository; it details the basic metadata of the GIS files. In each folder, xml files describe the metadata following the ISO 19139 standard, and readme text files summarize the folder's content.

Data type
The repository provides the input data, a preprocessing folder for the preliminary preparation of the data, and an output folder containing the raw and post-processed results. The "Desert_Networks_Tools" folder contains the ArcGIS 10.5 processing models required for preprocessing and output. The processing tools are provided in both toolbox and Python formats so that all users can access the code. The "Quality_Training_Validation_Paths" folder contains all the data and spreadsheets used in the training and validation process.

Format names and versions
Shapefile, GeoJSON, GeoTIFF, xlsx, PNG, text, ASCII, tbx, xml, py

(4) Reuse potential
The input data of this dataset make it possible to reproduce the output data stored in the same repository. The only obstacle to full reproducibility is the DEM used. The DEM used in our study and in our publication is a combination of the TanDEM-X 0.4 arcsec (~11 m over our study area) provided by the German Aerospace Center (©DLR 2019) and, in the sectors not covered by the former, the SRTM 1 arc second (NASA and NGA) reworked by ATDI at 25 m and resampled at 11 m (the spatial extents are shown in the associated paper). These data were provided for our scientific use only and cannot be shared freely. We have therefore placed in this repository the SRTM 1 arc second (NASA and NGA), resampled at 11 m and clipped to our study area, to allow other users to test our tools.
The main objective of reuse is to allow others to implement our approach of creating a least-cost network based on the study and mapping of travel conditions. The movement and calibration factor data provided in this repository are specific to our study area and to the defined mode of transportation (camel). They are therefore not suitable for other study areas whose environment differs from that of the Egyptian Eastern Desert. However, the approach and the technical tools are reusable. This dataset also provides methods for evaluating the calibration of the movement factors and for validating the results.