Hybrid modelling has recently gained significant attention in machine learning. The concept revolves around integrating a fundamental understanding of biological, chemical, and physical mechanisms with data-driven models. The resulting hybrid models draw on both theoretical knowledge and empirical data, offering a more robust framework for solving complex problems.
In vibrational spectroscopy, hybrid models have been employed for decades to merge principles from chemistry and light scattering with data analysis. These models have been particularly impactful in infrared (IR) spectroscopy, where they are used to address complex scattering and absorption phenomena. One prominent application is infrared microspectroscopy of cells and tissues, where intricate scattering effects are commonly encountered. In infrared microscopy, the wavelength of the electromagnetic radiation is comparable to the size of microparticles and subcellular structures. This similarity in scale, particularly in the mid-infrared (mid-IR) range (approximately 5–50 µm), gives rise to prominent scattering signatures in the measured spectra. In addition, most chemical compounds absorb strongly in the mid-IR region, making it a rich source of information about the physical and chemical properties of a wide range of samples, including cells, tissues, and microplastics. To extract meaningful information, a key goal is therefore to disentangle chemical absorption effects from optical scattering effects.
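To give a feel for why scattering signatures become prominent when the wavelength is comparable to the particle size, the following minimal sketch evaluates the van de Hulst (anomalous diffraction) approximation for the extinction efficiency of a non-absorbing sphere. This formula is a textbook approximation chosen here for illustration, not necessarily a model used in the presentation; it is reasonable only for spheres whose refractive index is close to that of the surrounding medium. The function name `van_de_hulst_qext`, the 5 µm radius, and the relative refractive index of 1.3 are assumptions made for this example.

```python
import numpy as np


def van_de_hulst_qext(wavelength_um, radius_um, n_rel):
    """Approximate extinction efficiency of a non-absorbing sphere.

    Uses the van de Hulst approximation
        Q_ext = 2 - (4 / rho) * sin(rho) + (4 / rho**2) * (1 - cos(rho)),
    where rho = 4 * pi * radius * (n_rel - 1) / wavelength is the phase
    lag of a ray passing through the centre of the sphere.
    """
    wavelength_um = np.asarray(wavelength_um, dtype=float)
    rho = 4.0 * np.pi * radius_um * (n_rel - 1.0) / wavelength_um
    return 2.0 - (4.0 / rho) * np.sin(rho) + (4.0 / rho**2) * (1.0 - np.cos(rho))


# Hypothetical sample: a cell-sized sphere (radius 5 um, relative index 1.3)
# evaluated over the wavelength range quoted in the text.
wavelengths = np.linspace(5.0, 50.0, 10)
q_ext = van_de_hulst_qext(wavelengths, radius_um=5.0, n_rel=1.3)

for lam, q in zip(wavelengths, q_ext):
    print(f"wavelength {lam:5.1f} um   Q_ext = {q:.3f}")
```

Under these assumptions the extinction efficiency varies strongly and smoothly with wavelength, which is the kind of broad, non-chemical feature that overlays the absorption bands in a measured spectrum.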
The retrieval of pure chemical absorbance spectra from spectra heavily distorted by scattering constitutes an inverse scattering problem. This process involves estimating the optical properties of a sample from its measured spectrum. However, inverse scattering problems are inherently ill-posed, as multiple sets of optical properties can produce the same measured spectrum. Consequently, solving such problems requires strategies to constrain the solution space and reduce ambiguity.
One effective approach involves iterative methods, where solutions are refined by searching within a narrow region close to a known solution, for example, the pure absorbance spectrum of a chemically or biologically similar sample. While these methods can achieve high precision, they are not universally applicable, as they rely heavily on prior knowledge.
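The two ideas above, non-uniqueness and iterative refinement close to a known solution, can be illustrated with a deliberately simplified sketch. It assumes a linear forward model with a random matrix `A` standing in for the real, nonlinear scattering physics, and it uses a plain Landweber iteration started from a reference spectrum. Both choices, along with the names `make_toy_problem` and `landweber_from_reference` and the band positions, are hypothetical and serve only to make the argument concrete; they are not the methods covered in the presentation.

```python
import numpy as np


def make_toy_problem(seed=0, n_channels=50, n_measurements=20):
    """Build an underdetermined toy problem y = A @ x_true."""
    rng = np.random.default_rng(seed)
    nu = np.linspace(1000.0, 1800.0, n_channels)  # wavenumber grid, cm^-1
    # Reference: absorbance band of a "similar" sample (assumed known).
    x_ref = np.exp(-0.5 * ((nu - 1650.0) / 30.0) ** 2)
    # Truth: same band, slightly shifted and scaled (unknown in practice).
    x_true = 1.1 * np.exp(-0.5 * ((nu - 1655.0) / 30.0) ** 2)
    # Stand-in linear forward operator with fewer rows than unknowns.
    A = rng.normal(size=(n_measurements, n_channels))
    y = A @ x_true
    return A, y, x_ref, x_true


def landweber_from_reference(A, y, x_ref, n_iter=5000):
    """Landweber iteration x <- x + step * A.T @ (y - A @ x), started at x_ref.

    Every update lies in the row space of A, so the iterates move towards
    the solution of A @ x = y that is closest to the starting point x_ref.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # below the stability limit 2 / sigma_max^2
    x = x_ref.copy()
    for _ in range(n_iter):
        x = x + step * (A.T @ (y - A @ x))
    return x


A, y, x_ref, x_true = make_toy_problem()

# Non-uniqueness: adding a null-space direction of A leaves the data unchanged.
_, _, vt = np.linalg.svd(A)
x_alt = x_true + 5.0 * vt[-1]
print("same data from a different x:", np.allclose(A @ x_alt, y))

# Without prior knowledge: the minimum-norm solution.
x_min_norm = np.linalg.pinv(A) @ y
# With prior knowledge: iterate starting from the reference spectrum.
x_iter = landweber_from_reference(A, y, x_ref)

print("error, minimum-norm solution:", np.linalg.norm(x_min_norm - x_true))
print("error, started at reference: ", np.linalg.norm(x_iter - x_true))
```

Both estimates reproduce the data, but only the one anchored to the reference stays near the true spectrum. The sketch also shows the limitation noted above: the quality of the result depends on how close the reference is to the truth.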
In this presentation, we provide an overview of the intricate interplay between scattering and absorption in mid-IR spectroscopy of cells and tissues. We explore a range of methodologies for retrieving pure absorbance spectra and estimating optical properties from scatter-distorted spectra. The approaches discussed range from theory-driven frameworks to purely machine-learning-based methods, and we highlight their respective strengths, limitations, and potential for advancing the field.