BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//talks.staging.osgeo.org//foss4g-europe-2024-academic-tra
 ck//speaker//GQL9US
BEGIN:VTIMEZONE
TZID:EET
BEGIN:STANDARD
DTSTART:20000101T000000
RRULE:FREQ=YEARLY;BYMONTH=1;UNTIL=20001231T220000Z
TZNAME:EET
TZOFFSETFROM:+0200
TZOFFSETTO:+0200
END:STANDARD
BEGIN:STANDARD
DTSTART:20021027T050000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:EET
TZOFFSETFROM:+0300
TZOFFSETTO:+0200
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20020331T040000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:EEST
TZOFFSETFROM:+0200
TZOFFSETTO:+0300
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-foss4g-europe-2024-academic-track-XUEUD9@talks.staging.osgeo.or
 g
DTSTART;TZID=EET:20240703T110000
DTEND;TZID=EET:20240703T113000
DESCRIPTION:Building footprints (hereinafter buildings) represent key geosp
 atial datasets for several applications\, including city planning\, demogr
 aphic analyses\, modelling energy production and consumption\, disaster pr
 eparedness and response\, and digital twins. Traditionally\, buildings are
  produced by governmental organisations as part of their cartographic data
 bases\, with coverage ranging from local to national and licensing conditi
 ons being heterogeneous and not always open. This makes it challenging to 
 derive open building datasets with a continental or global scale. Over the
  last decade\, however\, the unparalleled developments in the resolution o
 f satellite imagery\, artificial intelligence techniques and citizen engag
 ement in geospatial data collection have enabled the birth of several buil
 ding datasets available at least at a continental scale under open license
 s.\nIn this work\, we analyse four such open building datasets. The first 
 is the building dataset extracted from the well-known OpenStreetMap (OSM\,
  https://www.openstreetmap.org) crowdsourcing project\, which creates and 
 maintains a database of the whole world released under the Open Database L
 icense (ODbL). OSM buildings are typically derived from the digitisation
  of high-resolution satellite imagery\, and in some cases from the import
  of other databases with ODbL-compatible licenses. The second dataset is
  EUB
 UCCO (https://eubucco.com)\, a pan-European building database produced by 
 a research team at the Technical University Berlin by merging different in
 put sources: governmental datasets when available and open\, and OSM other
 wise [1]. EUBUCCO is mostly licensed under the ODbL\, with exceptions only
  for two regions in Italy and the Czech Republic. The third dataset is
  Microso
 ft Open Building Footprints (MS\, https://github.com/microsoft/GlobalMLBui
 ldingFootprints)\, extracted through the application of machine learning t
 echnology from high-resolution Bing Maps satellite imagery between 2014 an
 d 2023\, available at the global scale and also licensed under the ODbL. T
 he fourth dataset\, called Digital Building Stock Model (DBSM)\, was produ
 ced by the Joint Research Centre (JRC) of the European Commission to suppo
 rt energy-related studies. It is an ODbL-licensed pan-European
  dataset produced from the hierarchical conflation of three input datasets
 : OSM\, MS and the European Settlement Map [2].\nThe objective of this wor
 k is to compare the four datasets – which derive from different approach
 es following heterogeneous processing steps and governance rules – in te
 rms of their geometry (i.e. attributes are out of scope) in order to draw 
 conclusions on their similarities and differences. It is known from the
  literature that building completeness in OSM (which plays a key role in
  three out of the four datasets – OSM itself\, EUBUCCO and DBSM) varies
  with the de
 gree of urbanisation [3] and that machine learning applied to satellite im
 agery (used in MS) may have different performance depending on the urban o
 r rural context [4]. In light of this\, we analyse the building datasets a
 ccording to the degree of urbanisation of their location using the adminis
 trative boundaries provided by Eurostat\, which classifies each European p
 rovince as urban\, semi-urban or rural (https://ec.europa.eu/eurostat/web/
 gisco/geodata/reference-data/administrative-units-statistical-units/countr
 ies). \nWe chose five European Union (EU) countries for the analysis: Malt
 a (MT)\, Greece (EL)\, Belgium (BE)\, Denmark (DK) and Sweden (SE). The ch
 oice was motivated by the need to: i) select countries of different size
  and geographical location\, which ensures that their national OSM
  communiti
 es are substantially different\; ii) select countries having different por
 tions of urban\, semi-urban and rural areas\; and iii) select two sets of 
 countries for which the input source for EUBUCCO buildings was a governmen
 tal dataset (BE\, DK) and OSM (MT\, EL\, SE) to detect possibly different 
 behaviours. \nFrom the methodological point of view\, for each country and
  degree of urbanisation we first calculated and compared the total number 
 and total area of buildings in all datasets and we examined their statisti
 cs through box plots. This was followed by the calculation\, for each pair
  of datasets and degree of urbanisation\, of the building area of inters
 ection and its fraction of the total building area of each of the two data
 sets. Finally\, we intersected all four datasets and calculated the fr
 action of the area of each dataset that this intersection represents.\nRes
 ults show that in urban areas\, while the datasets are overall similar in 
 terms of total area of buildings\, the total number of buildings is typica
 lly higher in EUBUCCO for DK and BE\, where the information comes from gov
 ernmental datasets. This suggests that such datasets outperform OSM in mod
 elling the footprints of individual buildings in the most urbanised areas.
  In contrast\, in semi-urban and rural areas\, where OSM traditionally lac
 ks completeness\, MS (and as a consequence DBSM\, which is also based on M
 S) captures more buildings. This is especially evident in SE\, where 94% o
 f the country area is not urban. When calculating the intersection between
  building areas for each pair of datasets in all countries and urban are
 as\, the area of OSM buildings scores the lowest percentages of intersecti
 on when compared to the building areas of the other datasets. The lowest s
 uch percentages\, equal to 25%\, are scored when compared to MS in non-urb
 an areas. EUBUCCO represents an obvious exception for the countries (MT\, 
 EL and SE) where it uses OSM. Finally\, the dataset for which the area of 
 intersection between the buildings of all four datasets represents the
  largest percentage of the area is OSM\, with values even higher than 80% 
 for urban areas. This suggests that EUBUCCO and\, even more so\, DBSM can
  be considered a sort of ‘OSM extension’ that improves its
  completeness. By contrast\, t
 he lowest values are scored by MS and result from its radically different 
 generation process compared to the other datasets. \nThe whole procedure w
 as written in Python using libraries such as Pandas\, Dask-GeoPandas and P
 lotly. The code is available under the European Union Public License (EUPL
 ) v1.2 at https://github.com/eurogeoss/building-datasets in the form of Ju
 pyter Notebooks. Work is ongoing to extend the analysis to the whole EU in
  order to validate the results of this study and formulate recommendations
  at the continental level.
DTSTAMP:20260415T153655Z
LOCATION:Omicum
SUMMARY:Pan-European open building footprints: analysis and comparison in s
 elected countries - Marco Minghini
URL:https://talks.staging.osgeo.org/foss4g-europe-2024-academic-track/talk/
 XUEUD9/
END:VEVENT
END:VCALENDAR
