BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//talks.staging.osgeo.org//foss4g-europe-2025//talk//9AMAM
 N
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T040000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-foss4g-europe-2025-9AMAMN@talks.staging.osgeo.org
DTSTART;TZID=CET:20250716T160000
DTEND;TZID=CET:20250716T163000
DESCRIPTION:Digitalization and a collaborative approach drive Open
  Science\, the modern way of conducting research. In fact\, Open Science
  can be defi
 ned as “a collaborative culture enabled by technology that empowers the 
 open sharing of data\, information\, and knowledge within the scientific c
 ommunity and the wider public to accelerate scientific research and unders
 tanding”. Its three major objectives are: (a) increase the accessibility
  to the scientific body of knowledge\, (b) increase the efficiency of the 
 processes to share research outputs and findings\, and (c) improve the eva
 luation of the science impact considering new metrics. Due to technologica
 l advances of the last decades\, modern research is\, today\, mainly data-
 driven\, so Open Research Data (ORD)\, which refers to "the data unde
 rpinning scientific research results that has no restrictions on its acces
 s\, enabling anyone to access it." [1]\, is extremely important. \nThe in
 tent of sharing data openly is to accelerate and boost new findi
 ngs and innovations\, minimizing data duplications and enabling interdisci
 plinary and wider collaborative research. To be effectively used by other 
 researchers\, ORD need to follow the specific principles of Findability\, 
 Accessibility\, Interoperability\, and Reuse (FAIR) [2]\, which led to
  the creation of data repositories that allow users to register\, store\,
  find and access data following interoperable metadata standards.
  Available repositor
 ies offer services which generally adhere to ORD best practices by offerin
 g open data access\, associating a license to data\, making them persisten
 t\, providing unique citable identifiers (DOI)\, adopting repository stand
 ards\, and providing a defined data policy. Nevertheless\, in most of thos
 e repositories it is only possible to deposit static files\, preferably ar
 chived using standard open formats and metadata. \n\nHowever\, to fully ex
 ploit ORD with modern applications\, using for example AI techniques\, big
  data requires specialized services that offer a systematic and regular de
 livery of Analysis Ready Data (ARD) and filtering capabilities [3]. Sharin
 g ARD fits well with the European vision of establishing Data Spaces a
 s an interoperable digital place to facilitate data exchange and usage in 
 a secured and controlled environment among different disciplines with the 
 goal of boosting innovation\, economic growth and digital transformation [
 4]. This concept goes beyond the simple technical data sharing issues and 
 encompasses the need to offer a space to share data that is compliant
  with privacy and security regulations.\n\nIn the geospatial context\,
  operat
 ional data sharing has been implemented by means of Spatial Data Infrastru
 ctures (SDIs). They have been implemented based on sharing principles whic
 h led to the adoption of interoperable geoservices by which today thousand
 s of geospatial layers are offered to millions of applications worldwide a
 dopting interoperable geostandards that are mainly from the Open Geospatia
 l Consortium (OGC). The technological growth in the last decades led to th
 e explosive increase in time-varying data\, which dynamically change to
  represent phenomena that grow\, persist and decline\, or that constantly
  va
 ry due to data curation processes that periodically insert\, update\, or d
 elete information related to data and metadata. \n\nTherefore\, based on t
 he current trends\, the ability to link Open Science concepts with interop
 erability and time-varying data management is paramount. In particular\, t
 he capability of obtaining results consistent with a prior study using the
  same materials\, procedures\, and conditions of analyses is very importan
 t since it increases scientific transparency\, fosters a better understand
 ing of the study\, produces an increased impact of the research and ultima
 tely reinforces the credibility of science. In the Open Science paradigm t
 his is indicated as Reproducible Research\, and it can be guaranteed only 
 if the same source code\, dataset\, and configuration used in the study
  are persistently available. For geospatial data\, while the presented
  OGC sta
 ndards enable an almost FAIR [5] and modern data sharing\, they do not ade
 quately support the reproducibility concept as pursued in Open Science. In
  other words\, they do not offer any guarantee that the geodata accessed i
 n a given instant in a geoservice can be persistently accessed\, immutably
 \, in the future.\n\nThe needs and practices of time-varying data updates 
 are supported by real case examples related to common operations that
  update data or metadata of the different geospatial data types\, for
  example: environmental and climate data for sensor observations\, cad
 astral and OSM data for vector datasets\, and satellite-derived land
  cover\, crop maps and observations of water for raster series. From a
  technical p
 erspective\, the capacity of accessing data as they were in a specific pre
 vious state is strictly linked to the capacity of supporting data versioni
 ng. This feature is found in specific tools and approaches that differ
  widely across data formats and storage types (databases\, files and
  Log-Structured Tables\, LSTs) but rarely support geospatial data.\n\n
 While a defined approach to support
  system-time exists in SQL and LSTs [7]\, it is not yet adopted in
  commonly used storage solutions such as those offered by OSGeo’s
  projects [8]\, nor is it easy to integrate into them without new
  software developm
 ent. \nWe conclude that OGC Geospatial web services that are currently use
 d in Spatial Data Infrastructures do not meet the reproducibility requirem
 ent set by Open Science since they do not guarantee the immutable access t
 o a dataset in its status at a specific time of consumption. \nTo support 
 this capability\, we propose that geospatial data management infrastructur
 es manage dataset versioning and expose these features to users through
  standard Web services. Since version numbers may evolve extremely fast
  and a
 re not meaningful to the user\, the system time\, which identifies the ins
 tant for which archived information had specific values\, should be used\,
  in conjunction with web service URL\, as a unique identifier of the datas
 et. Finally\, together with versioning support\, we propose to record
  git-like metadata on the user and motivation of data transactions:
  this would greatly support reproducibility and Open Science. In fact\,
  it would
  not only allow us to retrieve temporal versions of the dataset\, but it w
 ould also permit us to perform data lineage analysis to fully understand t
 he historical changes and better comprehend the dataset (including data pr
 ovenance and ownership) with the effect of fostering the transparency and\
 , ultimately\, the credibility of science.
DTSTAMP:20260527T015055Z
LOCATION:PA01 (Quarticle)
SUMMARY:The challenges of reproducibility for research based on geodata web
  services - Massimiliano Cannata
URL:https://talks.staging.osgeo.org/foss4g-europe-2025/talk/9AMAMN/
END:VEVENT
END:VCALENDAR
