Applications of Machine Learning for Medical Technology

Article authored by Limor Wainstein

Healthcare has been built on data analysis since the Babylonians and Egyptians wrote the first diagnostic handbooks three thousand years ago. The 21st century has seen an explosion in medical data available to researchers and practitioners: unsurprisingly, this has been paralleled by advances in machine learning and AI techniques to help use this data both to make healthcare more efficient and to find new ways to improve diagnosis and case management.

Machine learning works by training computers to recognize patterns in data. There are various ways to do this, but a typical approach in healthcare is to take data sets - often 2D, 3D or full-motion images - that correspond to known medical conditions or high-risk scenarios and feed them into an analytic system that identifies key features or makes assessments. Correct assessments reinforce the pathways within the system that produced them; conversely, incorrect assessments weaken those pathways. Once the system is trained, it is fed data sets that have not previously been analyzed and classifies them according to the rules it has learned.
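To make the train-then-classify loop concrete, here is a minimal sketch using a perceptron, one of the simplest learning algorithms. It is a simplified variant of the idea described above: only incorrect assessments trigger an adjustment, while correct ones leave the weights alone. The two-number "measurements" and their labels are invented for illustration and do not come from any real medical data set.

```python
def predict(weights, bias, x):
    """Return 1 (condition present) or 0 (absent) for one sample."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, epochs=200, learning_rate=0.1):
    """Learn weights for a linear classifier from labeled samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # error is 0 when the assessment is correct, +1 or -1 when wrong
            error = y - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: two normalized measurements per case,
# labeled 0 (healthy) or 1 (at risk).
train_samples = [[0.1, 0.1], [0.4, 0.2], [0.2, 0.4],
                 [0.9, 0.9], [0.6, 0.8], [0.8, 0.6]]
train_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_perceptron(train_samples, train_labels)

# Cases the system has not seen during training.
unseen = [[0.2, 0.2], [0.8, 0.8]]
unseen_predictions = [predict(weights, bias, x) for x in unseen]
```

Real medical systems use far richer models and far more data, but the shape is the same: adjust on labeled examples, then classify new ones.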

The potential upsides are huge, especially as demand for healthcare consistently outstrips resources. Machines with knowledge of thousands of conditions that can analyze medical data swiftly and in bulk can help doctors treat many more patients. The downsides are also significant: if the training data is badly chosen or the training process inappropriate, the results can be very good for the training data but very poor on a wider selection. Machine learning can also give good results while the process by which the system arrives at a decision remains hard or impossible to understand. This makes the results harder for a clinician to assess, and thus reduces their value. And one of the biggest issues with machine learning is deciding what counts as good data in the first place.
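The gap between training results and real-world results can be seen in a toy experiment. The sketch below uses a nearest-neighbor classifier, which simply memorizes its training cases, on synthetic data where 20% of the labels are deliberately wrong. The data, the noise level and the function names are all assumptions made for illustration.

```python
import random


def make_data(rng, n, noise=0.2):
    """Synthetic cases: label is 1 when x > 0.5, but some labels are flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < noise:
            y = 1 - y  # simulate a mislabeled case
        data.append((x, y))
    return data


def nearest_label(train, x):
    """Classify x with the label of the closest training case."""
    return min(train, key=lambda item: abs(item[0] - x))[1]


def accuracy(train, data):
    """Fraction of cases in data that the memorized training set gets right."""
    return sum(nearest_label(train, x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_data(rng, 50)
held_out = make_data(rng, 200)

train_accuracy = accuracy(train, train)        # every case matches itself
held_out_accuracy = accuracy(train, held_out)  # noticeably lower
print(train_accuracy, held_out_accuracy)
```

Because each training case is its own nearest neighbor, the model scores perfectly on the data it has memorized, noise included, while its score on held-out cases drops. This is why evaluation on data kept out of training matters.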

All data is good data if it is classified appropriately - even identifying otherwise useless information is worthwhile in helping to design future filters. One classic technique in data analytics is ETL - Extract, Transform, Load - where promising information is extracted from a large data set, cleaned, transformed and reformatted into a known state, and then loaded back into a database for further analysis, often by machine learning, further down the pipeline.
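The three ETL stages can be sketched in a few lines. The example below is a minimal illustration, not a production pipeline: the record layout, the unit strings and the table name are invented, and an in-memory SQLite database stands in for the analytics store.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from mixed sources.
RAW_RECORDS = [
    {"patient": "p1", "weight": "82.5 kg", "height": "1.80 m"},
    {"patient": "p2", "weight": "n/a", "height": "1.65 m"},
    {"patient": "p3", "weight": "70 kg", "height": "1.75 m"},
]


def extract(records):
    """Extract: keep records whose weight and height carry the expected units."""
    return [r for r in records
            if r["weight"].endswith(" kg") and r["height"].endswith(" m")]


def transform(records):
    """Transform: parse the numbers and derive body mass index (kg / m^2)."""
    rows = []
    for r in records:
        weight_kg = float(r["weight"].split()[0])
        height_m = float(r["height"].split()[0])
        bmi = round(weight_kg / height_m ** 2, 1)
        rows.append((r["patient"], weight_kg, height_m, bmi))
    return rows


def load(rows, connection):
    """Load: write the cleaned rows into a table for later analysis."""
    connection.execute(
        "CREATE TABLE measurements "
        "(patient TEXT, weight_kg REAL, height_m REAL, bmi REAL)"
    )
    connection.executemany(
        "INSERT INTO measurements VALUES (?, ?, ?, ?)", rows
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_RECORDS)), connection)
loaded = connection.execute(
    "SELECT patient, bmi FROM measurements ORDER BY patient"
).fetchall()
```

Here the record with a missing weight is filtered out at the extract stage; a fuller pipeline would also log such rejects, since knowing what was discarded helps in designing later filters.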

This is being explored for one of the biggest healthcare issues of today, the obesity epidemic. Obesity has proved stubborn to tackle because, alongside the many medical issues in which it is a factor, it is as much the product of habit and culture as of nutrition and genetics. In “A health analytics semantic ETL service for obesity surveillance” (M. Poulymenopoulou et al., Studies in Health Technology and Informatics, Vol 210), a system is proposed that can ingest not only highly structured standard clinical records but also sensor data from fitness monitors, heart rate detectors, etc., and social media self-reporting on lifestyle and food choices. The system constructs semantic models that filter and categorize the unstructured data, transforming information where appropriate from low-level (geographic locations over time) to high-level (types of physical activity), and combines them in ontologies informed by the more structured, industry-standard data representation models found in existing medical record sharing.

This approach provides a much broader set of signals for machine learning to act on, while retaining considerable utility for researchers and clinicians who understand the medical pathologies related to obesity but are less familiar with the wide-scale behavior of large populations.
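As an illustration of the low-level to high-level transformation described above, the sketch below turns a sequence of timestamped GPS fixes into activity labels by estimating speed. This is not the method from the cited paper: the speed thresholds, the label names and the sample track are assumptions chosen only to show the idea.

```python
import math

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify_speed(speed_m_per_s):
    """Map a speed to a coarse activity type (thresholds are illustrative)."""
    if speed_m_per_s < 0.3:
        return "stationary"
    if speed_m_per_s < 2.0:
        return "walking"
    if speed_m_per_s < 4.5:
        return "running"
    return "cycling or vehicle"


def label_segments(track):
    """Label each consecutive pair of (seconds, latitude, longitude) fixes."""
    labels = []
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(track, track[1:]):
        speed = haversine_m(lat1, lon1, lat2, lon2) / (t2 - t1)
        labels.append(classify_speed(speed))
    return labels


# Hypothetical track: one fix every 10 seconds.
track = [
    (0, 41.3800, 2.1700),
    (10, 41.3801, 2.1700),   # about 11 m in 10 s
    (20, 41.3804, 2.1700),   # about 33 m in 10 s
    (30, 41.3804, 2.1700),   # no movement
]
activities = label_segments(track)
```

The output of a step like this, a list of activity types rather than raw coordinates, is the kind of high-level signal that can then be combined with structured clinical records.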

 

Selecting the right kind of machine learning for a particular task is vital. One good example is the Bill and Melinda Gates Foundation-funded project “Machine learning from fetal flow waveforms to predict adverse perinatal outcomes”, led by Aga Khan University (AKU) in Karachi, Pakistan.

Pakistan has some of the highest rates of stillbirth and early neonatal mortality in the world. Many medical techniques can significantly reduce these rates, such as ultrasound measurement of fetal and placental blood flow followed by appropriate treatment such as early delivery. Such options, however, need trained diagnosticians and good infrastructure, both of which are in short supply in the developing world.

The project aims to test a computer model of fetal haemodynamics, using machine learning based on Doppler patterns of fetal cardiovascular, cerebral and placental flows, to spot mothers at risk of stillbirth and other perinatal and neonatal morbidities.

Machine learning offers the best route for efficient data analysis, and this part of the project is led by Bart Bijnens, from the PhySense research group, and Gemma Piella, from the SIMBIOsys research group, both part of the Barcelona MedTech Unit in DTIC.

Fetal ultrasound data is complex and highly diverse - multivariate and heterogeneous, in analytic terms. The key technique in the project is Multiple Kernel Learning - MKL - which uses an array of comparison functions, or kernels. A sample consists of several features, and a different kernel is assigned to each type of feature to measure how similar that feature is between two different individuals. By identifying the most significant features in data from existing cases, where the abnormalities and outcomes are already known, the system constrains and reduces the amount of analysis it needs to do. The next step is to learn how to group cases into clusters of similar type. When presented with novel data from a patient not used in the learning phase, the system can place it in one of the groups it knows about, and thus predict an outcome.
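The core idea of one kernel per feature type, weighted by how informative it is, can be sketched as follows. This is not the project's implementation: it is a small supervised stand-in that weights kernels by kernel-target alignment (a common heuristic) and assigns a new case to the group it is most similar to, rather than learning clusters. The feature names, the sample values and the labels (+1 for an adverse outcome, -1 otherwise) are invented.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Similarity of two numeric feature vectors (1.0 means identical)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def match_kernel(a, b):
    """Similarity of two categorical values."""
    return 1.0 if a == b else 0.0


# One kernel per feature type (hypothetical feature names).
KERNELS = {"flow": rbf_kernel, "category": match_kernel}


def alignment(kernel, feature, samples, labels):
    """Kernel-target alignment: do this kernel's similarities agree with labels?"""
    numerator = 0.0
    squared_norm = 0.0
    for si, yi in zip(samples, labels):
        for sj, yj in zip(samples, labels):
            k = kernel(si[feature], sj[feature])
            numerator += k * yi * yj
            squared_norm += k * k
    return numerator / (math.sqrt(squared_norm) * len(samples))


def learn_weights(samples, labels):
    """Give each kernel a weight proportional to its (non-negative) alignment."""
    scores = {f: max(alignment(k, f, samples, labels), 0.0)
              for f, k in KERNELS.items()}
    total = sum(scores.values())
    if total == 0:
        return {f: 1.0 / len(scores) for f in scores}  # fall back to uniform
    return {f: s / total for f, s in scores.items()}


def combined_kernel(weights, a, b):
    """Weighted sum of the per-feature kernels."""
    return sum(w * KERNELS[f](a[f], b[f]) for f, w in weights.items())


def predict(weights, samples, labels, new_sample):
    """Assign the new case to the group it is, on balance, more similar to."""
    score = sum(y * combined_kernel(weights, s, new_sample)
                for s, y in zip(samples, labels))
    return 1 if score > 0 else -1


# Known cases: "flow" separates the groups, "category" does not.
samples = [
    {"flow": [0.1, 0.2], "category": "a"},
    {"flow": [0.2, 0.1], "category": "b"},
    {"flow": [0.9, 0.8], "category": "a"},
    {"flow": [0.8, 0.9], "category": "b"},
]
labels = [-1, -1, 1, 1]

weights = learn_weights(samples, labels)
new_case = {"flow": [0.85, 0.85], "category": "a"}
outcome = predict(weights, samples, labels, new_case)
```

In this toy data the categorical feature carries no information about the outcome, so its kernel receives zero weight and the prediction rests entirely on the flow measurements, a small-scale version of identifying the most significant features before classifying a new patient.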

If the proof of concept works, there is huge potential for rapidly reducing morbidity around pregnancy and childbirth in a way that’s appropriate for available local resources. Identifying the right class of available data and the best fit for machine learning techniques for a particular task hugely increases the possibility of practical success.

Finding such good fits is a particular challenge for machine learning, which even in its purest form has to bring together engineering and mathematical expertise in the right ways. Applying it to practical use in medicine, a field which itself has to effectively bring different disciplines together, requires a great deal of expertise interchange between industry, academia and clinical practice.

CardioFunXion is an EU-funded, €1m four-year project that started in 2015 and focuses on new ideas in cardiological imaging-based medicine. It is led by Universitat Pompeu Fabra (PhySense Group: Sensing in Physiology and Biomedicine, Department of Information and Communication Technologies) in Barcelona, together with Philips Research France, Medisys, the Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS) in Barcelona and the Centre Hospitalier et Universitaire de Caen.

It combines doctoral research in biomedical imaging and analysis with clinical data specialists, and workstation and sensor developers. In a series of summer and winter schools, new computational models of heart function and disease are discussed, together with potential new machine learning techniques and algorithms for diagnostics and experimental treatment exploration, how synthetic predictive ideas can be meshed with actual clinical work, and what protocols can best link all of the above into a framework for future use.

In this article, we discussed the relevance of machine learning in modern healthcare. We reviewed the advantages and disadvantages of using machine learning in healthcare scenarios, and provided a few examples where machine learning is leveraged to improve healthcare, such as the use of machine learning from fetal flow waveforms to help identify mothers at risk of stillbirth and related conditions. Such examples raise multiple questions: What are the best ways of validating data manipulated by machine learning? Is the data secure? Which types of storage should be considered for the data? These questions and more should be explored as additional and innovative applications of machine learning for healthcare are uncovered.

 

The author:

Limor Wainstein