Session
21a. Methods of Artificial Intelligence 1
Session Topics: Methods of Artificial Intelligence
Presentations
8:30am - 8:42am
ID: 154 Conference Paper Topics: Methods of Artificial Intelligence HOLa: HoloLens Object Labeling 1University of Applied Sciences Landshut, Germany; 2LAKUMED Hospital Landshut-Achdorf, Germany In the context of medical Augmented Reality (AR) applications, object tracking is a key challenge and requires a large number of annotation masks. With the emergence of segmentation foundation models such as the Segment Anything Model (SAM), zero-shot segmentation can obtain high-quality object masks with only minimal human participation. We introduce HoloLens Object Labeling (HOLa), a Unity and Python application based on the SAM-Track algorithm that offers fully automatic single-object annotation for HoloLens 2 while requiring minimal human participation. HOLa does not have to be adjusted to a specific image appearance and could thus facilitate AR research in any application field. We evaluate HOLa for different degrees of image complexity in open liver surgery and in medical phantom experiments. Using HOLa for image annotation can increase labeling speed more than 500-fold while providing Dice scores between 0.875 and 0.982, which are comparable to those of human annotators. Our code is publicly available at: https://github.com/mschwimmbeck/HOLa
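The paper's central quality metric is the Dice score between HOLa's automatic masks and human reference annotations. As a minimal sketch of that metric (not the authors' evaluation code, which is available in the linked repository), the overlap of two binary masks can be computed as follows:

```python
# Dice score between a predicted and a reference binary mask:
# Dice = 2|A ∩ B| / (|A| + |B|).
import numpy as np

def dice_score(pred: np.ndarray, ref: np.ndarray) -> float:
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    total = pred.sum() + ref.sum()
    return 1.0 if total == 0 else float(2.0 * intersection / total)

# Hypothetical example: two 512x512 masks with slightly shifted regions.
pred = np.zeros((512, 512), dtype=bool); pred[100:300, 100:300] = True
ref = np.zeros((512, 512), dtype=bool); ref[110:310, 110:310] = True
print(f"Dice: {dice_score(pred, ref):.3f}")
```

A score of 1.0 indicates perfect overlap; the reported range of 0.875 to 0.982 is at the level typically achieved by human annotators.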
8:42am - 8:54am
ID: 171 Conference Paper Topics: Methods of Artificial Intelligence One-shot learning HMI for people with disabilities Hannover University of Applied Sciences and Arts, Germany For people with physical disabilities, it is often desirable to regain control over their personal environment and communication tools. This paper introduces a novel Human-Machine Interface (HMI) that uses one-shot learning to provide individualized control signals without extensive training or specialized hardware. Our approach is a modular system that utilizes common, easily accessible devices such as webcams to interpret user-defined gestures and commands from a single demonstration. As a feasibility study on healthy volunteers, we investigate the control of a computer mouse by head movements alone. We present the technical details of the HMI and discuss its potential applications in enhancing the autonomy and interaction capabilities of users with disabilities. By combining user-centric design principles with advances in one-shot learning, we aim to forge a more inclusive, accessible path forward in the development of assistive technologies.
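The abstract does not disclose implementation details. One plausible minimal realization of the one-shot idea, purely an assumption for illustration, is nearest-template matching: the single demonstration is stored as a normalized feature vector (e.g. head-pose angles over a short window), and live input is matched by cosine similarity.

```python
# Minimal sketch (an assumed realization, not the authors' implementation):
# one-shot gesture recognition via nearest-template cosine matching.
import numpy as np

class OneShotGestureHMI:
    def __init__(self, threshold: float = 0.9):
        self.templates: dict[str, np.ndarray] = {}
        self.threshold = threshold

    def demonstrate(self, name: str, features: np.ndarray) -> None:
        """Register a gesture from a single user demonstration."""
        self.templates[name] = features / np.linalg.norm(features)

    def recognize(self, features: np.ndarray):
        """Return the best-matching gesture name, or None below threshold."""
        f = features / np.linalg.norm(features)
        best, score = None, self.threshold
        for name, template in self.templates.items():
            similarity = float(f @ template)
            if similarity > score:
                best, score = name, similarity
        return best

# Hypothetical usage: a 30-frame window of (yaw, pitch) head angles,
# flattened to a 60-dimensional feature vector.
rng = np.random.default_rng(0)
demo = rng.normal(size=60)
hmi = OneShotGestureHMI()
hmi.demonstrate("nod_left", demo)
print(hmi.recognize(demo + 0.05 * rng.normal(size=60)))  # -> "nod_left"
```

A recognized gesture would then be mapped to an action such as a mouse click or cursor movement.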
8:54am - 9:06am
ID: 183 Abstract Oral Session Topics: Methods of Artificial Intelligence Using signals generated by ECG denoising autoencoder during its learning results in a denoising performance collapse Technische Hochschule Mittelhessen (THM), Germany

Introduction: An electrocardiogram (ECG) provides information about the heart's electrical activity. If the ECG is affected by interference, e.g. noise, signal processing methods can improve the signal quality. For this purpose, so-called denoising autoencoders (DAEs), which are special artificial neural networks, have been proposed. However, the denoising quality depends on the DAE's properties and on the training data. Using ECG data, we show that the performance of the DAE collapses when data denoised by the DAE is used to generate new training data during continued learning.

Methods: For simplicity, a three-layer DAE was designed. The input and output layer sizes were 80 data points each. The hidden layer contained 40 neurons, each with a ReLU activation function, and the output neurons were linear. For basic learning, 1,000 ECG segments for training and 250 for validation were randomly selected without replacement from a dataset. Each QRS-aligned segment lasted 0.8 s and included the P- and T-waves. The basic learning phase learned the mapping s+n → s, for signal s and Gaussian white noise (GWN) n, over all training segments by backpropagation batch learning. After learning converged, the following continued learning sequence was executed repeatedly: k times, a randomly selected segment of the training dataset was superimposed with GWN, denoised by the DAE, and used to replace the selected segment; a single batch learning step was then executed. The validation segments remained unchanged throughout.

Results: Monitoring of the training and validation errors showed a significant decrease in training error combined with a significant increase in validation error during the continued learning phase. Interestingly, the decrease in denoising performance correlated positively with the increase in noise level.

Conclusion: The introduced process efficiently mimics the extension of DAE training data with DAE-generated segments. The results show that recursive DAE processing of data must be avoided to prevent a denoising performance collapse.
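The experiment is specified precisely enough to sketch in code. The following is an assumed PyTorch port (the abstract names no framework): an 80-40-80 DAE with ReLU hidden units and linear outputs is first trained on (segment + noise) → segment; continued learning then replaces a randomly selected training segment with its own DAE-denoised version before each further batch step, while the validation set stays fixed.

```python
# Sketch of the continued-learning collapse experiment (assumed PyTorch
# port; random data stands in for the real 0.8 s QRS-aligned ECG segments).
import torch
import torch.nn as nn

torch.manual_seed(0)
dae = nn.Sequential(nn.Linear(80, 40), nn.ReLU(), nn.Linear(40, 80))
opt = torch.optim.Adam(dae.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

train = torch.randn(1000, 80)   # placeholder for 1,000 training segments
val = torch.randn(250, 80)      # placeholder for 250 validation segments
sigma = 0.1                     # GWN standard deviation

def batch_step(clean: torch.Tensor) -> float:
    """One backpropagation batch step on the mapping s+n -> s."""
    noisy = clean + sigma * torch.randn_like(clean)
    loss = loss_fn(dae(noisy), clean)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

for _ in range(500):            # basic learning until (approximate) convergence
    batch_step(train)

for _ in range(2000):           # continued learning
    i = torch.randint(len(train), (1,)).item()
    with torch.no_grad():       # replace a segment by its DAE-denoised version
        train[i] = dae(train[i] + sigma * torch.randn(80))
    batch_step(train)
    # Monitoring loss_fn(dae(val + noise), val) here reproduces the reported
    # effect: training error falls while validation error rises.
```

The sketch uses k=1 replacement per batch step; the paper's k replacements per step generalize this loop.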
9:06am - 9:18am
ID: 264 Conference Paper Topics: Methods of Artificial Intelligence Deep Learning-Based Real Time Human Detection System Using LiDAR Data for Smart Healthcare Monitoring 1Hochschule Aalen, Germany; 2Hochschule Wismar In current healthcare systems, the continuous monitoring of patients is crucial for ensuring safety. Traditionally, this requires constant oversight by healthcare professionals, a practice that is resource-intensive. Furthermore, prevalent detection mechanisms often use cameras, which may intrude on patient privacy by recording detailed visual information. This work introduces a solution by training a machine learning model, based on the YOLOv5 deep learning algorithm and tailored to human detection in rooms, on data from a digital LiDAR sensor. The LiDAR sensor was mounted on the ceiling for data acquisition and real-time processing. The motivation behind this project arises from the need to accelerate response times, preserve privacy, maintain safety during contagious disease outbreaks, address healthcare worker shortages, facilitate efficient monitoring without invasive sensors, and enable simultaneous patient monitoring. After training a YOLOv5 object detection model on our data, its performance is promising, with a Mean Average Precision (mAP) of 99.4% at an Intersection over Union (IoU) threshold of 0.5, a Precision of 99.6%, and a Recall of 99.4%.
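The abstract gives no implementation details beyond the model family; as a hedged sketch, inference with a fine-tuned YOLOv5 model could look like the following, using the public torch.hub entry point of the ultralytics/yolov5 repository (the weight file and frame name are hypothetical):

```python
# Load a custom-trained YOLOv5 model and detect humans in one LiDAR frame.
import torch

# 'best.pt' stands in for weights trained on the ceiling-mounted LiDAR data.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')
model.conf = 0.5  # confidence threshold for reported detections

# A LiDAR depth/intensity frame rendered as an image file (assumed format).
results = model('lidar_frame.png')
for *box, conf, cls in results.xyxy[0].tolist():  # (x1, y1, x2, y2, conf, class)
    print(f"human at {[round(v) for v in box]}, confidence {conf:.2f}")
```

Because the input is a LiDAR-derived image rather than an RGB photo, the detector can localize people in the room without recording privacy-sensitive visual detail.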
9:18am - 9:30am
ID: 295 Conference Paper Topics: Methods of Artificial Intelligence Synthetic Data in Supervised Monocular Depth Estimation of Laparoscopic Liver Images Karlsruhe Institute of Technology (KIT), Germany Monocular depth estimation is an important topic in minimally invasive surgery, providing valuable information for downstream applications such as navigation systems. Deep learning for this task requires a large amount of training data to yield an accurate and robust model. Especially in the medical field, acquiring ground truth depth information is rarely possible due to patient safety and technical limitations. This problem is tackled by many approaches, including the use of synthetic data, which raises the question of how well synthetic data allows the prediction of depth information on clinical data. To evaluate this, the synthetic data is used to train and optimize a U-Net, including hyperparameter tuning and augmentation. The trained model is then used to predict depth on clinical images, and the predictions are analyzed for quality and for consistency across the same scene, over time, and under color changes. The results demonstrate that synthetic data sets can be used for training: the model achieves an accuracy of over 77% and an RMSE below 10 mm on the synthetic data set and performs well on clinical data that resembles the synthetic scenes, but it also has limitations due to the complexity of clinical environments. Synthetic data sets are thus a promising approach, enabling monocular depth estimation in fields that otherwise lack data.
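The abstract reports an RMSE below 10 mm and an "accuracy" above 77% without defining the latter; the threshold accuracy δ < 1.25 shown below is a common choice in monocular depth estimation but is an assumption here, not confirmed by the paper.

```python
# Two standard monocular-depth metrics on a hypothetical depth map (mm).
import numpy as np

def rmse(pred: np.ndarray, gt: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - gt) ** 2)))

def threshold_accuracy(pred: np.ndarray, gt: np.ndarray,
                       delta: float = 1.25) -> float:
    """Fraction of pixels with max(pred/gt, gt/pred) < delta."""
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < delta))

rng = np.random.default_rng(0)
gt = rng.uniform(50.0, 150.0, size=(256, 256))      # ground-truth depths
pred = gt + rng.normal(0.0, 5.0, size=gt.shape)     # simulated predictions
print(f"RMSE: {rmse(pred, gt):.1f} mm, "
      f"accuracy (delta < 1.25): {threshold_accuracy(pred, gt):.1%}")
```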
9:30am - 9:42am
ID: 392 Conference Paper Topics: Methods of Artificial Intelligence Multi-view surgical phase recognition during laparoscopic cholecystectomy University of Stuttgart, Institute of Medical Device Technology, Germany In the realm of laparoscopic procedures, integrating intelligent sensors and artificial intelligence (AI) holds promise for enhancing surgical workflows and patient safety. This study addresses the challenge of recognizing surgical phases, crucial for implementing context-aware systems, through a novel multi-view approach combining data from laparoscopic and in-room cameras. Leveraging a Trans-SVNet model architecture, which integrates a ResNet, a Temporal Convolutional Network (TCN), and a Transformer, the study aimed to improve phase recognition accuracy. The fusion of the laparoscopic and in-room data streams via a late fusion approach yielded mixed results, despite initial expectations. The arrangement of the architectural components warrants reconsideration to better extract information from the diverse streams and achieve improved accuracy after fusion. In addition, disparities in accuracy between the laparoscopic and in-room models highlight the need for improved data quality and fusion techniques. This research sheds light on the complexities and opportunities of integrating multi-view data for surgical phase recognition, emphasizing the importance of diverse data collection strategies and model interpretability for real-world surgical settings.
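The abstract states only that the two streams were combined by late fusion. A minimal sketch of one common late-fusion scheme, assumed here for illustration, is a weighted average of each view's per-frame phase probabilities before the argmax decision:

```python
# Late fusion of laparoscopic and in-room phase probabilities (assumed scheme).
import numpy as np

N_PHASES = 7  # e.g. the seven standard phases of laparoscopic cholecystectomy

def late_fusion(p_laparoscopic: np.ndarray, p_inroom: np.ndarray,
                w: float = 0.5) -> np.ndarray:
    """Fuse two (T, N_PHASES) probability sequences frame by frame."""
    fused = w * p_laparoscopic + (1.0 - w) * p_inroom
    return fused.argmax(axis=1)  # predicted phase index per frame

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical model outputs for a 100-frame clip from each camera view.
rng = np.random.default_rng(0)
p_lap = softmax(rng.normal(size=(100, N_PHASES)))
p_room = softmax(rng.normal(size=(100, N_PHASES)))
print(late_fusion(p_lap, p_room)[:10])
```

The weight w would be tuned on validation data; how much earlier in the architecture to fuse is exactly the design question the study leaves open.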