Publications
-
- 17.06.2025
- Foundations of ML
- Foundations of ML: Natural Language Processing
A Perspective on Deep Vision Performance with Standard Image and Video Codecs
DOI: Not available
Resource-constrained hardware, such as edge devices or cell phones, often relies on cloud servers to provide the required computational resources for inference in deep vision models. However, transferring image and video data from an edge or mobile device to a cloud server requires coding to deal with network constraints. The use of standardized codecs, such as JPEG or H.264, is prevalent and required to ensure interoperability. This paper aims to examine the implications of employing standardized codecs within deep vision pipelines. We find that using JPEG and H.264 coding significantly deteriorates the accuracy across a broad range of vision tasks and models. For instance, strong compression rates reduce semantic segmentation accuracy by more than 80% in mIoU. In contrast to previous findings, our analysis extends beyond image and action classification to localization and dense prediction tasks, thus providing a more comprehensive perspective.
-
- 26.04.2025
- Foundations of ML: Computer Vision
Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes
DOI: arXiv:2411.19233
State-of-the-art novel view synthesis methods achieve impressive results for multi-view.
-
- 26.12.2024
- Foundations of ML: Computer Vision
Evaluating Self-Supervised Learning in Medical Imaging: A Benchmark for Robustness, Generalizability, and Multi-Domain Impact
DOI:
Self-supervised learning (SSL) has emerged as a promising paradigm in medical imaging, addressing the chronic challenge of limited labeled data in healthcare settings. While SSL has shown impressive results, existing studies in the medical domain are often limited in scope, focusing on specific datasets or modalities, or evaluating only isolated aspects of model performance. This fragmented evaluation approach poses a significant challenge, as models deployed in critical medical settings must not only achieve high accuracy but also demonstrate robust performance and generalizability across diverse datasets and varying conditions. To address this gap, we present a comprehensive evaluation of SSL methods within the medical domain, with a particular focus on robustness and generalizability. Using the MedMNIST dataset collection as a standardized benchmark, we evaluate 8 major SSL methods across 11 different medical datasets. Our study provides an in-depth analysis of model performance in both in-domain scenarios and the detection of out-of-distribution (OOD) samples, while exploring the effect of various initialization strategies, model architectures, and multi-domain pre-training. We further assess the generalizability of SSL methods through cross-dataset evaluations and the in-domain performance with varying label proportions (1%, 10%, and 100%) to simulate real-world scenarios with limited supervision. We hope this comprehensive benchmark helps practitioners and researchers make more informed decisions when applying SSL methods to medical applications.
-
- 24.12.2024
- Foundations of ML
- Transdisciplinary Applications
StaR Maps: Unveiling Uncertainty in Geospatial Relations
DOI: https://doi.org/10.48550/arXiv.2412.18356
The growing complexity of intelligent transportation systems and their applications in public spaces has increased the demand for expressive and versatile knowledge representation. While various mapping efforts have achieved widespread coverage, including detailed annotation of features with semantic labels, it is essential to understand their inherent uncertainties, which are commonly underrepresented by the respective geographic information systems. Hence, it is critical to develop a representation that combines a statistical, probabilistic perspective with the relational nature of geospatial data. Further, such a representation should facilitate an honest view of the data's accuracy and provide an environment for high-level reasoning to obtain novel insights from task-dependent queries. Our work addresses this gap in two ways. First, we present Statistical Relational Maps (StaR Maps) as a representation of uncertain, semantic map data. Second, we demonstrate efficient computation of StaR Maps to scale the approach to wide urban spaces. Through experiments on real-world, crowd-sourced data, we underpin the application and utility of StaR Maps in terms of representing uncertain knowledge and reasoning for complex geospatial information.
-
- 04.12.2024
- Foundations of ML: Computer Vision
- Applications in Autonomous Systems
BIM-based AI-supported LiDAR-Camera Pose Refinement
DOI: https://doi.org/10.48550/arXiv.2412.03434
This paper introduces BIMCaP, a novel method to integrate mobile 3D sparse LiDAR data and camera measurements with pre-existing building information models (BIMs), enhancing fast and accurate indoor mapping with affordable sensors. BIMCaP refines sensor poses by leveraging a 3D BIM and employing a bundle adjustment technique to align real-world measurements with the model. Experiments using real-world open-access data show that BIMCaP achieves superior accuracy, reducing translational error by over 4 cm compared to current state-of-the-art methods. This advancement enhances the accuracy and cost-effectiveness of 3D mapping methodologies like SLAM. BIMCaP's improvements benefit various fields, including construction site management and emergency response, by providing up-to-date, aligned digital maps for better decision-making and productivity.
-
- 11.11.2024
- Transdisciplinary Applications
RNA-Protein Interaction Classification via Sequence Embeddings
DOI: https://doi.org/10.1101/2024.11.08.622607
RNA-protein interactions (RPI) are ubiquitous in cellular organisms and essential for gene regulation. In particular, protein interactions with non-coding RNAs (ncRNAs) play a critical role in these processes. Experimental analysis of RPIs is time-consuming and expensive, and existing computational methods rely on small and limited datasets. This work introduces RNAInterAct, a comprehensive RPI dataset, alongside RPIembeddor, a novel transformer-based model designed for classifying ncRNA-protein interactions. By leveraging two foundation models for sequence embedding, we incorporate essential structural and functional insights into our task. We demonstrate RPIembeddor’s strong performance and generalization capability compared to state-of-the-art methods across different datasets and analyze the impact of the proposed embedding strategy on the performance in an ablation study.