BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//talks.staging.osgeo.org//foss4g-europe-2024//talk//3WEQY
 E
BEGIN:VTIMEZONE
TZID:EET
BEGIN:STANDARD
DTSTART:20000101T000000
RRULE:FREQ=YEARLY;BYMONTH=1;UNTIL=20001231T220000Z
TZNAME:EET
TZOFFSETFROM:+0200
TZOFFSETTO:+0200
END:STANDARD
BEGIN:STANDARD
DTSTART:20021027T050000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:EET
TZOFFSETFROM:+0300
TZOFFSETTO:+0200
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20020331T040000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:EEST
TZOFFSETFROM:+0200
TZOFFSETTO:+0300
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-foss4g-europe-2024-3WEQYE@talks.staging.osgeo.org
DTSTART;TZID=EET:20240703T140000
DTEND;TZID=EET:20240703T143000
DESCRIPTION:## 1. Introduction\n\nTraditional maps use projections to repre
 sent geospatial data in a 2-dimensional plane. This is both very convenien
 t and computationally efficient. However\, this also introduces distortion
 s in terms of area and angles\, especially for global data sets (de Sousa 
 et al.\, 2019). Several global grid system approaches like Equi7Grid or UT
 M aim to reduce the distortions by dividing the surface of the earth into 
 many zones and using an optimized projection for each zone to minimize dis
 tortions. However\, this introduces analysis discontinuities at the zone b
 oundaries and makes it difficult to combine data sets of varying overlappi
 ng extents (Bauer-Marschallinger et al.\, 2014).\n\nDiscrete Global Grid S
 ystems (DGGS) provide a new approach by introducing a hierarchy of global 
 grids that tessellate the Earth’s surface evenly into equal-area grid ce
 lls at different spatial resolutions\, and providing a uni
 que indexing system (Sahr et al.\, 2004). DGGS are now defined in the join
 t ISO and OGC DGGS Abstract Specification Topic 21 (ISO 19170-1:2021). DGG
 S serve as spatial reference systems facilitating data cube construction\,
  enabling integration and aggregation of multi-resolution data sources. Va
 rious tessellation schemes such as hexagons and triangles cater to differe
 nt needs - equal area\, optimal neighborhoods\, congruent parent-child rel
 ationships\, ease of use\, or vector field representation in modeling flow
 s.\n\nPurss et al. (2019) have explained the idea to combine DGGS and data
  cubes and underlined the compatibility of these two concepts. Thus\, DGGS
  are a promising way to harmonize\, store\, and analyse spatial data on a 
 planetary scale. DGGS are commonly used with tabular data\, where the cel
 l id is a column. Many datasets have other dimensions\, such as time\, ver
 tical level\, ensemble member\, etc. For these\, the vision was to use Xa
 rray (Hoyer and Hamman 2017)\, one of the core packages in th
 e Pangeo ecosystem\, as a container for DGGS data.\n\nAt the joint OSGeo a
 nd Pangeo code sprint at the ESA BiDS’23 conference (6.-9. November\, 20
 23\, Vienna)\, members from both communities came together and envisioned 
 implementing support for DGGS in the popular Xarray Python package\, which
  is at the core of many geospatial big data processing workflows. The resu
 lt of the code sprint is a prototype Xarray extension named xdggs (https:
 //github.com/xarray-contrib/xdggs)\, which we describe in this article.\n\
 n## 2. Design and methodology\n\nThere are several open-source libraries t
 hat make it possible to work with DGGS\, including Uber H3\, HEALPix\, rH
 EALPix\, DGGRID\, Google S2\, and OpenEAGGR. Many\, if not most\, have Py
 thon bindings (Kmoch et al. 2022). However\, they often come with their o
 wn APIs that are not easy to use\, and with different assumptions and fun
 ctionality. This makes it difficult for users to explore the wider possib
 ilities that DGGS can of
 fer.\nThe aim of xdggs is to provide a unified\, high-level\, and user-fri
 endly API that simplifies working with various DGGS types and their respec
 tive backend libraries\, seamlessly integrating with Xarray and the Pangeo
  open-source geospatial computing ecosystem. Executable notebooks demonstr
 ating the use of the xdggs package are also being developed to showcase i
 ts capabilities. The xdggs contributors started from a set of guideline
 s and common DGGS features that xdggs should provide or facilitate\, so t
 hat DGGS semantics and operations can be used through the user-friendly X
 array API for working with labelled arrays.\n\n## 3. Results\n\nThis deve
 lopm
 ent represents a significant step forward. With xdggs\, DGGS become more a
 ccessible and actionable for data users. As with traditional cartograph
 ic projections\, a user does not need to be an expert on the peculiaritie
 s of var
 ious grids and libraries to work with DGGS\, and can continue working in t
 he well-known Xarray workflow. One of the aims of xdggs is making DGGS dat
 a access and conversion user-friendly\, while dealing with the coordinates
 \, tessellations\, and projections under the hood.\n\nDGGS-indexed data c
 an be stored in an appropriate format like Zarr or (Geo)Parquet\, with ac
 companying metadata that identifies which DGGS (and potentially which spe
 cif
 ic configuration) is needed to address the grid cell indices correctly. A
 n interactive tutorial on Pangeo-Forge is also being developed as an open
 -access resource to show users how to use these storage formats effective
 ly\, thereby facilitating knowledge transfer in data storage best practic
 es within the geospatial open-source community.\n\nNevertheles
 s\, continuous efforts are necessary to broaden the accessibility of DGGS 
 for scientific and operational applications\, especially in handling gridd
 ed data such as global climate and ocean modeling\, satellite imagery\, ra
 ster data\, and maps. This would require\, for example\, an agreeme
 nt\, ideally with entities such as the OGC\, on a registry of DGGS refere
 nce systems (similar to the EPSG/CRS/PROJ database).\n\n## 4. Discussi
 on and outlook\n\nOne of the big advantages of using DGGS via Xarray is t
 he integration of multi-source\, multi-sensor EO data and large global-sc
 ale ocean and climate models within the Pangeo environment\, making da
 ta access and development practical and FAIR (Findable\, Accessible\, Int
 eroperable\, Reusable) in the community. Two additional directions to imp
 rove uptake and knowledge transfer could include:\n\n1) The implem
 entation o
 f DGGS such as HEALPix\, DGGRID-based equal-area DGGS (ISEA)\, rHEALP
 ix\, and (currently) more industry-friendly DGGS (Uber H3\, Google S2) i
 n Xarray should be improved further\, and a more user-friendly API for re
 -gridding existing data into DGGS grids should be provided. Training mate
 rials should be developed and Pangeo sessions conducted to demonstrate th
 e use of DGGS in Xarray\, aimed at enhancing the skillset of practitioner
 s and researchers in geospatial data handling and spatial data analysis a
 t professional and academic institutions
 .\n\n2) DGGS-indexed reference datasets could be validated and used in ca
 se studies and instructional material for academic courses and workshop
 s\, focusing on the practical applications of data fusion\, quick address
 ing of equal-area cell grids\, AI\, and socio-economic and environment
 al studies. In particular\, the emerging capability of selecting cell ran
 ges from different data sources and joining and integrating them based on
 ly on cell ids could make partial data access and sharing more dynamic an
 d easier.
DTSTAMP:20260530T213221Z
LOCATION:Omicum
SUMMARY:XDGGS: A community-developed Xarray package to support planetary DG
 GS data cube computations - Alexander Kmoch
URL:https://talks.staging.osgeo.org/foss4g-europe-2024/talk/3WEQYE/
END:VEVENT
END:VCALENDAR
