Bavaria Buildings is a large, analysis-ready dataset providing openly available, co-registered 40cm aerial imagery of Upper Bavaria paired with building footprint information. The Bavaria Buildings dataset (BBD) contains 18,205 orthophotos of 2500 × 2500 pixels, where each pixel covers 40cm × 40cm on the ground (Digitales Orthophoto 40cm - DOP40). The imagery has been pre-processed and co-registered, and the dataset additionally provides a set of 5.5 million image tiles of 250 × 250 pixels ready for deep learning and image analysis tasks. For each image tile, we provide two segmentation masks: one based on the official building footprint (Hausumringe) data published by the Free State of Bavaria and one based on a historic OpenStreetMap (OSM) extract from 2021. The dataset is ready for essential analysis tasks such as detection, segmentation, instance extraction, footprint geometry extraction, multimodal localization, and multimodal data quality assessment of buildings in Bavaria. We plan to update the dataset with each major re-publication of the upstream data sources to foster change detection research in the future. The BBD is available at https://doi.org/10.14459/2023mp1709451.
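To illustrate the tiling step described above, the following Python sketch cuts a 2500 × 2500 DOP40 orthophoto into 250 × 250 tiles; the array layout and the dummy image are illustrative assumptions and do not reflect the BBD's actual on-disk structure.

# Minimal sketch: tiling a 2500 x 2500 orthophoto into 250 x 250 tiles, as in the BBD.
# The dummy array stands in for a real DOP40 image; this is not the dataset's actual layout.
import numpy as np

def tile_orthophoto(image: np.ndarray, tile_size: int = 250):
    """Yield (row, col, tile) triples for an orthophoto whose sides are divisible by tile_size."""
    h, w = image.shape[:2]
    for r in range(0, h, tile_size):
        for c in range(0, w, tile_size):
            yield r // tile_size, c // tile_size, image[r:r + tile_size, c:c + tile_size]

# Example: a dummy 2500 x 2500 RGB orthophoto yields 10 x 10 = 100 tiles.
dop40 = np.zeros((2500, 2500, 3), dtype=np.uint8)
tiles = list(tile_orthophoto(dop40))
assert len(tiles) == 100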
The last decade has witnessed rapid development in geospatial applications of artificial intelligence (GeoAI). However, due to a misalignment with progress in the wider computer science community, the geospatial community has long worked with powerful but over-sophisticated tools and software whose functionality goes far beyond the basic needs of GeoAI tasks. This, to a certain extent, hinders progress towards establishing sustainable and replicable GeoAI models. In this paper, we address this challenge by introducing an efficient big data framework based on modern HDF5 technology, called AtlasHDF, in which we design lossless data mappings (an immediate mapping and an analysis-ready mapping) from OpenStreetMap (OSM) vector data into a single HDF5 data container to facilitate fast and flexible GeoAI applications learned from OSM data. Since HDF5 is included as a default dependency in most GeoAI and high-performance computing (HPC) environments, the proposed AtlasHDF provides a cross-platform, single-technology solution for handling heterogeneous big geodata for GeoAI. As a case study, we conducted a comparative analysis of the AtlasHDF framework against three commonly used data formats (i.e., PBF, Shapefile, and GeoPackage) using the latest OSM data from the city of Berlin (Germany), and elaborated on the advantages of each data format with respect to file size, querying, rendering, dependencies, and data extensibility. Given the wide range of GeoAI tasks that can potentially benefit from our framework, our future work will focus on extending the framework to heterogeneous big geodata (vector and raster) to support seamless and fast data integration without any geospatial software dependency until the training stage of GeoAI. A reference implementation of the framework developed in this paper is provided to the public at: https://github.com/tumbgd/hdf4water.
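As a rough illustration of packing OSM vector data into a single HDF5 container, the sketch below stores toy polyline geometries as a flat coordinate table plus an offsets index using h5py; the group and dataset names ("geometry/coords", "geometry/offsets") and the file name are assumptions for illustration, not the actual AtlasHDF schema.

# Minimal sketch of an HDF5-based container for OSM-like vector data, in the spirit of AtlasHDF.
# Group/dataset names and the toy geometries are illustrative assumptions, not the AtlasHDF schema.
import h5py
import numpy as np

# Two toy polylines given as (lon, lat) coordinate pairs.
ways = [
    np.array([[13.40, 52.52], [13.41, 52.52], [13.41, 52.53]]),
    np.array([[13.38, 52.51], [13.39, 52.51]]),
]

coords = np.vstack(ways)                            # flat coordinate table
offsets = np.cumsum([0] + [len(w) for w in ways])   # start index of each geometry

with h5py.File("osm_berlin_demo.h5", "w") as f:
    g = f.create_group("geometry")
    g.create_dataset("coords", data=coords, compression="gzip")
    g.create_dataset("offsets", data=offsets)
    f.attrs["crs"] = "EPSG:4326"

# Reading a single geometry back only needs NumPy slicing -- no geospatial software dependency.
with h5py.File("osm_berlin_demo.h5", "r") as f:
    i = 1
    start, end = f["geometry/offsets"][i], f["geometry/offsets"][i + 1]
    way_1 = f["geometry/coords"][start:end]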
Attention-based methods have played an important role in model interpretation, where the calculated attention weights are expected to highlight the critical parts of inputs (e.g., keywords in sentences). However, recent research points out that attention-as-importance interpretations often do not work as well as expected. For example, learned attention weights are frequently uncorrelated with other feature importance indicators such as gradient-based measures, and a debate on the effectiveness of attention-based interpretations has arisen. In this paper, we reveal that one root cause of this phenomenon can be ascribed to combinatorial shortcuts, meaning that models may obtain information not only from the parts highlighted by attention mechanisms but also from the attention weights themselves. We design an intuitive experiment to demonstrate the existence of combinatorial shortcuts and propose two methods to mitigate this issue. Empirical studies on attention-based instance-wise feature selection interpretation models show that the proposed methods can effectively improve the interpretability of attention mechanisms on a variety of datasets.
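The combinatorial shortcut can be illustrated with a toy example (not the paper's experiment): if a degenerate attention step selects a different feature for each class, a downstream classifier can recover the label from the selection pattern alone, even though the selected feature values are pure noise.

# Toy illustration of a combinatorial shortcut: the selection pattern produced by an
# "attention" step can itself encode the label, even when the feature values are noise.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n, d = 4000, 8
X = rng.normal(size=(n, d))          # feature values carry no label information
y = rng.integers(0, 2, size=n)       # random binary labels

# Degenerate "attention": keep feature 0 for class 0 and feature 1 for class 1.
mask = np.zeros_like(X)
mask[np.arange(n), y] = 1.0
X_masked = X * mask                  # the downstream model only sees the masked input

# The which-feature-is-nonzero pattern encodes y, so a simple tree recovers the label
# far above chance even though the values themselves are uninformative.
clf = DecisionTreeClassifier(max_depth=4).fit(X_masked[:3000], y[:3000])
print("held-out accuracy:", clf.score(X_masked[3000:], y[3000:]))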
SpectralGPT is the first purpose-built foundation model designed explicitly for spectral RS data. It considers the unique characteristics of spectral data, i.e., spatial-spectral coupling and spectral sequentiality, within the MAE framework using a simple yet effective 3D GPT network. We will gradually release the trained models (SpectralGPT, SpectralGPT+), the new benchmark dataset (SegMunich) for the downstream task of semantic segmentation, the original code, and implementation instructions.
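A minimal sketch of the kind of spatial-spectral (3D) patch tokenization such a masked-autoencoder backbone operates on is given below; the patch sizes, cube dimensions, and masking ratio are illustrative assumptions, not SpectralGPT's actual settings.

# Minimal sketch of 3D (spatial-spectral) patch tokenization with MAE-style masking.
# Patch sizes and the masking ratio are illustrative assumptions, not SpectralGPT's settings.
import torch

def patchify_3d(cube, ps=8, pb=3):
    """Split a (bands, H, W) spectral cube into flattened pb x ps x ps tokens."""
    tokens = (cube
              .unfold(0, pb, pb).unfold(1, ps, ps).unfold(2, ps, ps)  # (nb, nh, nw, pb, ps, ps)
              .reshape(-1, pb * ps * ps))
    return tokens

cube = torch.randn(12, 96, 96)              # stand-in for a 12-band spectral image patch
tokens = patchify_3d(cube)                  # (4 * 12 * 12, 192) tokens
keep = torch.rand(tokens.shape[0]) > 0.9    # MAE-style masking: keep roughly 10% of tokens
visible = tokens[keep]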
In recent years, various segmentation models have been developed successively. However, due to the limited availability of nighttime datasets and the complexity of nighttime scenes, high-performance nighttime semantic segmentation models remain scarce. Analysis of nighttime scenes reveals that the primary challenges are overexposure and underexposure. In view of this, our proposed Histogram Multi-Scale Retinex with Color Restoration and No-Exposure Semantic Segmentation Network targets semantic segmentation of nighttime scenes and consists of three modules and a multi-head decoder. The three modules are Histogram, Multi-Scale Retinex with Color Restoration (MSRCR), and No Exposure (N-EX); they aim to enhance the robustness of image segmentation under different lighting conditions. The Histogram module prevents over-fitting to well-lit images, and the MSRCR module enhances insufficiently lit images, improving object recognition and facilitating segmentation. The N-EX module uses a dark channel prior to remove excess light covering the surface of an object. Extensive experiments show that the three modules suit different network models and can be freely inserted and used. They significantly improve segmentation of nighttime images while retaining good generalization ability. When added to the multi-head decoder network, mean intersection over union increases by 6.2% on the nighttime dataset Rebecca and 1.5% on the daytime dataset CamVid.
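For reference, the dark channel prior itself can be sketched as a channel-wise minimum followed by a local minimum filter (He et al.); the patch size and the thresholding of bright dark-channel values below are illustrative assumptions, not the paper's exact N-EX formulation.

# Minimal sketch of the dark channel prior the N-EX module builds on.
# Patch size and the overexposure threshold are illustrative assumptions only.
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image: np.ndarray, patch: int = 15) -> np.ndarray:
    """Dark channel prior: per-pixel minimum over RGB, then a local minimum filter."""
    min_rgb = image.min(axis=2)                       # (H, W) channel-wise minimum
    return minimum_filter(min_rgb, size=patch)        # local patch minimum

img = np.random.rand(128, 128, 3).astype(np.float32)  # stand-in for a nighttime frame
dc = dark_channel(img)
# Bright dark-channel values flag glare / overexposed regions that excess-light removal targets.
overexposed_mask = dc > 0.8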
Using Proofs of Retrievability (PORs), a file owner is able to check that a cloud server correctly stores her files. Using Proofs of Retrievability and Reliability (PORRs), she can even verify at the same time that the cloud server correctly stores both her original files and their replicas. In 2020, a new PORR combined with Verifiable Delay Functions (VDFs) was presented by Gritti. VDFs are special functions whose evaluation is slow while their verification is fast. Therefore, these functions help guarantee that the original files and their replicas are stored at rest. Moreover, an important feature of the 2020 PORR solution is that anyone can verify the cloud provider's behaviour, not only the file owner. This paper extends Gritti's version. In particular, a realistic cloud framework is defined in order to implement and evaluate the scheme accurately. Results show that this PORR solution is well suited to cloud storage services.
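The VDF idea of slow sequential evaluation versus fast verification can be sketched as repeated modular squaring. In the toy Python example below, fast verification cheats by using the factorization of the modulus as a trapdoor, whereas real VDF constructions (e.g., Wesolowski's or Pietrzak's) ship a short proof instead; the parameters are far too small for any real deployment.

# Toy sketch of the VDF idea: evaluation takes T sequential squarings, verification is fast.
# Here verification uses phi(N) as a trapdoor for illustration; real VDFs use short proofs,
# and real moduli are large RSA-style numbers. Parameters are illustrative only.
N_P, N_Q = 1000003, 1000033          # toy primes
N = N_P * N_Q
PHI = (N_P - 1) * (N_Q - 1)
T = 100_000                          # number of sequential squarings (the "delay")

def vdf_eval(x: int) -> int:
    """Slow path: T sequential squarings, inherently non-parallelizable."""
    y = x % N
    for _ in range(T):
        y = (y * y) % N
    return y

def vdf_verify_with_trapdoor(x: int, y: int) -> bool:
    """Fast path: with phi(N), the exponent 2^T is reduced in a single step."""
    return pow(x, pow(2, T, PHI), N) == y

x = 123456789
assert vdf_verify_with_trapdoor(x, vdf_eval(x))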
Social media has become an important communication tool, especially following an extreme event. Research in social psychology has shown that people engage in information gathering, "milling", and confirmation seeking during the process of forming an intent to take action or voice an opinion. Twitter serves as a communication channel where people converge to compile collective intelligence, provide event reporting, and diffuse information. In this paper, we investigate Twitter usage to describe human participation on Twitter following a controversial extreme event, the 2013 Syria sarin gas attack. The methodology incorporates Natural Language Processing (NLP) and network analysis to trace human response on Twitter to this event. NLP techniques include Named Entity Recognition (NER) to extract relevant entities (e.g., countries), Event Extraction (EE) to excerpt relevant events (e.g., conflict, movement, life), and the Stanford Parser to detect actionable verbs discussed by Twitter participants. Network analysis constructs a network based on the Twitter users' communications, detects communities, extracts their leaders, and identifies their roles based on structural properties of the networks. Specifically, the research examined Twitter data for the two days (August 22-23, 2013) following the event. The research suggests that (1) there was no immediate polarization of opinions following the event, (2) the primary topics of Twitter communication were the conflict and information about the victims of the event, (3) Twitter communities were too sparse to produce a substantial amount of social pressure to force an opinion or opinion shift, (4) top community leaders were news sources, political activists, and select individuals, and (5) 'individual' leaders' political agendas were not revealed.
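The network-analysis side of such a pipeline can be sketched as follows: build a communication graph from user-mention pairs, detect communities, and take each community's most central node as its leader. The edge list, the greedy modularity algorithm, and degree centrality below are illustrative assumptions rather than the exact methods used in the paper.

# Minimal sketch: community detection and leader extraction on a toy mention graph.
# Edge list and algorithm choices are illustrative assumptions, not the paper's setup.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy retweet/mention edges: (source_user, target_user)
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
G = nx.Graph(edges)

communities = greedy_modularity_communities(G)
centrality = nx.degree_centrality(G)

for i, members in enumerate(communities):
    leader = max(members, key=centrality.get)   # most central member as the community "leader"
    print(f"community {i}: size={len(members)}, leader={leader}")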