Session
21a. Methods of Artificial Intelligence 1
Session Topics: Methods of Artificial Intelligence
Presentations
8:30am - 8:42am
ID: 154 Conference Paper Topics: Methods of Artificial Intelligence HOLa: HoloLens Object Labeling 1University of Applied Sciences Landshut, Germany; 2LAKUMED Hospital Landshut-Achdorf, Germany In the context of medical Augmented Reality (AR) applications, object tracking is a key challenge and requires a large number of annotation masks. With the emergence of segmentation foundation models such as the Segment Anything Model (SAM), zero-shot segmentation can obtain high-quality object masks with only minimal human participation. We introduce HoloLens Object Labeling (HOLa), a Unity and Python application based on the SAM-Track algorithm that offers fully automatic single-object annotation for HoloLens 2 while requiring minimal human participation. HOLa does not have to be adjusted to a specific image appearance and could thus facilitate AR research in any application field. We evaluate HOLa for different degrees of image complexity in open liver surgery and in medical phantom experiments. Using HOLa for image annotation can increase labeling speed more than 500-fold while providing Dice scores between 0.875 and 0.982, which are comparable to those of human annotators. Our code is publicly available at: https://github.com/mschwimmbeck/HOLa
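The paper's central quality metric is the Dice score between HOLa's automatic masks and human reference annotations. As a minimal sketch of that metric (not the authors' evaluation code, which is available in the linked repository), the overlap of two binary masks can be computed as follows:

```python
# Dice score between a predicted and a reference binary mask:
# Dice = 2|A ∩ B| / (|A| + |B|).
import numpy as np

def dice_score(pred: np.ndarray, ref: np.ndarray) -> float:
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    total = pred.sum() + ref.sum()
    return 1.0 if total == 0 else float(2.0 * intersection / total)

# Hypothetical example: two 512x512 masks with slightly shifted regions.
pred = np.zeros((512, 512), dtype=bool); pred[100:300, 100:300] = True
ref = np.zeros((512, 512), dtype=bool); ref[110:310, 110:310] = True
print(f"Dice: {dice_score(pred, ref):.3f}")
```

A score of 1.0 indicates perfect overlap; the reported range of 0.875 to 0.982 is at the level typically achieved by human annotators.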
8:42am - 8:54am
ID: 171 Conference Paper Topics: Methods of Artificial Intelligence One-shot learning HMI for people with disabilities Hannover University of Applied Sciences and Arts, Germany For people with physical disabilities, it is often desirable to regain control over their personal environment and communication tools. This paper introduces a novel Human-Machine Interface (HMI) that uses one-shot learning to provide individualized control signals without extensive training or specialized hardware. Our approach is a modular system that utilizes common, easily accessible devices such as webcams to interpret user-defined gestures and commands from a single demonstration. As a feasibility study on healthy volunteers, we investigate the control of a computer mouse by head movements alone. We present the technical details of the HMI and discuss its potential applications in enhancing the autonomy and interaction capabilities of users with disabilities. By combining user-centric design principles with advances in one-shot learning, we aim to forge a more inclusive, accessible path forward in the development of assistive technologies.
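The abstract does not disclose implementation details. One plausible minimal realization of the one-shot idea, purely an assumption for illustration, is nearest-template matching: the single demonstration is stored as a normalized feature vector (e.g. head-pose angles over a short window), and live input is matched by cosine similarity.

```python
# Minimal sketch (an assumed realization, not the authors' implementation):
# one-shot gesture recognition via nearest-template cosine matching.
import numpy as np

class OneShotGestureHMI:
    def __init__(self, threshold: float = 0.9):
        self.templates: dict[str, np.ndarray] = {}
        self.threshold = threshold

    def demonstrate(self, name: str, features: np.ndarray) -> None:
        """Register a gesture from a single user demonstration."""
        self.templates[name] = features / np.linalg.norm(features)

    def recognize(self, features: np.ndarray):
        """Return the best-matching gesture name, or None below threshold."""
        f = features / np.linalg.norm(features)
        best, score = None, self.threshold
        for name, template in self.templates.items():
            similarity = float(f @ template)
            if similarity > score:
                best, score = name, similarity
        return best

# Hypothetical usage: a 30-frame window of (yaw, pitch) head angles,
# flattened to a 60-dimensional feature vector.
rng = np.random.default_rng(0)
demo = rng.normal(size=60)
hmi = OneShotGestureHMI()
hmi.demonstrate("nod_left", demo)
print(hmi.recognize(demo + 0.05 * rng.normal(size=60)))  # -> "nod_left"
```

A recognized gesture would then be mapped to an action such as a mouse click or cursor movement.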
8:54am - 9:06am
ID: 183 Abstract Oral Session Topics: Methods of Artificial Intelligence Using signals generated by ECG denoising autoencoder during its learning results in a denoising performance collapse Technische Hochschule Mittelhessen (THM), Germany

Introduction: An electrocardiogram (ECG) provides information about the heart's electrical activity. If the ECG is affected by interference, e.g. noise, signal processing methods can improve the signal quality. For this purpose, so-called denoising autoencoders (DAEs), which are special artificial neural networks, have been proposed. However, the denoising quality depends on the DAE's properties and on the training data. Using ECG data, we show that the performance of the DAE collapses when data denoised by the DAE is used to generate new training data during continued learning.

Methods: For simplicity, a three-layer DAE was designed. The input and output layer sizes were 80 data points each. The hidden layer contained 40 neurons, each with a ReLU activation function, and the output neurons were linear. For basic learning, 1,000 ECG segments for training and 250 for validation were randomly selected without replacement from a dataset. Each QRS-aligned segment lasted 0.8 s and included the P- and T-waves. The basic learning phase learned the mapping s+n → s, for signal s and Gaussian white noise (GWN) n, over all training segments by backpropagation batch learning. After learning converged, the following continued learning sequence was executed repeatedly: k times, a randomly selected segment of the training dataset was superimposed with GWN, denoised by the DAE, and used to replace the selected segment; a single batch learning step was then executed. The validation segments remained unchanged throughout.

Results: Monitoring of the training and validation errors showed a significant decrease in training error combined with a significant increase in validation error during the continued learning phase. Interestingly, the decrease in denoising performance correlated positively with the increase in noise level.

Conclusion: The introduced process efficiently mimics the extension of DAE training data with DAE-generated segments. The results show that recursive DAE processing of data must be avoided to prevent a denoising performance collapse.
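The experiment is specified precisely enough to sketch in code. The following is an assumed PyTorch port (the abstract names no framework): an 80-40-80 DAE with ReLU hidden units and linear outputs is first trained on (segment + noise) → segment; continued learning then replaces a randomly selected training segment with its own DAE-denoised version before each further batch step, while the validation set stays fixed.

```python
# Sketch of the continued-learning collapse experiment (assumed PyTorch
# port; random data stands in for the real 0.8 s QRS-aligned ECG segments).
import torch
import torch.nn as nn

torch.manual_seed(0)
dae = nn.Sequential(nn.Linear(80, 40), nn.ReLU(), nn.Linear(40, 80))
opt = torch.optim.Adam(dae.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

train = torch.randn(1000, 80)   # placeholder for 1,000 training segments
val = torch.randn(250, 80)      # placeholder for 250 validation segments
sigma = 0.1                     # GWN standard deviation

def batch_step(clean: torch.Tensor) -> float:
    """One backpropagation batch step on the mapping s+n -> s."""
    noisy = clean + sigma * torch.randn_like(clean)
    loss = loss_fn(dae(noisy), clean)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

for _ in range(500):            # basic learning until (approximate) convergence
    batch_step(train)

for _ in range(2000):           # continued learning
    i = torch.randint(len(train), (1,)).item()
    with torch.no_grad():       # replace a segment by its DAE-denoised version
        train[i] = dae(train[i] + sigma * torch.randn(80))
    batch_step(train)
    # Monitoring loss_fn(dae(val + noise), val) here reproduces the reported
    # effect: training error falls while validation error rises.
```

The sketch uses k=1 replacement per batch step; the paper's k replacements per step generalize this loop.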
9:06am - 9:18am
ID: 264 Conference Paper Topics: Methods of Artificial Intelligence Deep Learning-Based Real Time Human Detection System Using LiDAR Data for Smart Healthcare Monitoring 1Hochschule Aalen, Germany; 2Hochschule Wismar In current healthcare systems, the continuous monitoring of patients is crucial for ensuring safety. Traditionally, this requires constant oversight by healthcare professionals, a practice that is resource-intensive. Furthermore, prevalent detection mechanisms often use cameras, which may intrude on patient privacy by recording detailed visual information. This work introduces a solution by training a machine learning model, based on the YOLOv5 deep learning algorithm and tailored to human detection in rooms, on data from a digital LiDAR sensor. The LiDAR sensor was mounted on the ceiling for data acquisition and real-time processing. The motivation behind this project arises from the need to accelerate response times, preserve privacy, maintain safety during contagious disease outbreaks, address healthcare worker shortages, facilitate efficient monitoring without invasive sensors, and enable simultaneous patient monitoring. After training a YOLOv5 object detection model on our data, its performance is promising, with a Mean Average Precision (mAP) of 99.4% at an Intersection over Union (IoU) threshold of 0.5, a Precision of 99.6%, and a Recall of 99.4%.
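The abstract gives no implementation details beyond the model family; as a hedged sketch, inference with a fine-tuned YOLOv5 model could look like the following, using the public torch.hub entry point of the ultralytics/yolov5 repository (the weight file and frame name are hypothetical):

```python
# Load a custom-trained YOLOv5 model and detect humans in one LiDAR frame.
import torch

# 'best.pt' stands in for weights trained on the ceiling-mounted LiDAR data.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')
model.conf = 0.5  # confidence threshold for reported detections

# A LiDAR depth/intensity frame rendered as an image file (assumed format).
results = model('lidar_frame.png')
for *box, conf, cls in results.xyxy[0].tolist():  # (x1, y1, x2, y2, conf, class)
    print(f"human at {[round(v) for v in box]}, confidence {conf:.2f}")
```

Because the input is a LiDAR-derived image rather than an RGB photo, the detector can localize people in the room without recording privacy-sensitive visual detail.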
9:18am - 9:30am
ID: 295 Conference Paper Topics: Methods of Artificial Intelligence Synthetic Data in Supervised Monocular Depth Estimation of Laparoscopic Liver Images Karlsruhe Institute of Technology (KIT), Germany Monocular depth estimation is an important topic in minimally invasive surgery, providing valuable information for downstream applications such as navigation systems. Deep learning for this task requires a large amount of training data to yield an accurate and robust model. Especially in the medical field, acquiring ground truth depth information is rarely possible due to patient safety and technical limitations. This problem is tackled by many approaches, including the use of synthetic data, which raises the question of how well synthetic data allows the prediction of depth information on clinical data. To evaluate this, the synthetic data is used to train and optimize a U-Net, including hyperparameter tuning and augmentation. The trained model is then used to predict depth on clinical images, and the predictions are analyzed for quality and for consistency across the same scene, over time, and under color changes. The results demonstrate that synthetic data sets can be used for training: the model achieves an accuracy of over 77% and an RMSE below 10 mm on the synthetic data set and performs well on clinical data that resembles the synthetic scenes, but it also has limitations due to the complexity of clinical environments. Synthetic data sets are thus a promising approach, enabling monocular depth estimation in fields that otherwise lack data.
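The abstract reports an RMSE below 10 mm and an "accuracy" above 77% without defining the latter; the threshold accuracy δ < 1.25 shown below is a common choice in monocular depth estimation but is an assumption here, not confirmed by the paper.

```python
# Two standard monocular-depth metrics on a hypothetical depth map (mm).
import numpy as np

def rmse(pred: np.ndarray, gt: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - gt) ** 2)))

def threshold_accuracy(pred: np.ndarray, gt: np.ndarray,
                       delta: float = 1.25) -> float:
    """Fraction of pixels with max(pred/gt, gt/pred) < delta."""
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < delta))

rng = np.random.default_rng(0)
gt = rng.uniform(50.0, 150.0, size=(256, 256))      # ground-truth depths
pred = gt + rng.normal(0.0, 5.0, size=gt.shape)     # simulated predictions
print(f"RMSE: {rmse(pred, gt):.1f} mm, "
      f"accuracy (delta < 1.25): {threshold_accuracy(pred, gt):.1%}")
```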
9:30am - 9:42am
ID: 392 Conference Paper Topics: Methods of Artificial Intelligence Multi-view surgical phase recognition during laparoscopic cholecystectomy University of Stuttgart, Institute of Medical Device Technology, Germany In the realm of laparoscopic procedures, integrating intelligent sensors and artificial intelligence (AI) holds promise for enhancing surgical workflows and patient safety. This study addresses the challenge of recognizing surgical phases, crucial for implementing context-aware systems, through a novel multi-view approach combining data from laparoscopic and in-room cameras. Leveraging a Trans-SVNet model architecture, which integrates a ResNet, a Temporal Convolutional Network (TCN), and a Transformer, the study aimed to improve phase recognition accuracy. The fusion of the laparoscopic and in-room data streams via a late fusion approach yielded mixed results, despite initial expectations. The arrangement of the architectural components warrants reconsideration to better extract information from the diverse streams and achieve improved accuracy after fusion. In addition, disparities in accuracy between the laparoscopic and in-room models highlight the need for improved data quality and fusion techniques. This research sheds light on the complexities and opportunities of integrating multi-view data for surgical phase recognition, emphasizing the importance of diverse data collection strategies and model interpretability for real-world surgical settings.
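The abstract states only that the two streams were combined by late fusion. A minimal sketch of one common late-fusion scheme, assumed here for illustration, is a weighted average of each view's per-frame phase probabilities before the argmax decision:

```python
# Late fusion of laparoscopic and in-room phase probabilities (assumed scheme).
import numpy as np

N_PHASES = 7  # e.g. the seven standard phases of laparoscopic cholecystectomy

def late_fusion(p_laparoscopic: np.ndarray, p_inroom: np.ndarray,
                w: float = 0.5) -> np.ndarray:
    """Fuse two (T, N_PHASES) probability sequences frame by frame."""
    fused = w * p_laparoscopic + (1.0 - w) * p_inroom
    return fused.argmax(axis=1)  # predicted phase index per frame

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical model outputs for a 100-frame clip from each camera view.
rng = np.random.default_rng(0)
p_lap = softmax(rng.normal(size=(100, N_PHASES)))
p_room = softmax(rng.normal(size=(100, N_PHASES)))
print(late_fusion(p_lap, p_room)[:10])
```

The weight w would be tuned on validation data; how much earlier in the architecture to fuse is exactly the design question the study leaves open.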