Phenomix – Phenotypes mixing structured and unstructured data

Hospitals collect large volumes of original inpatient data through Electronic Medical Records (EMRs), and these data could be reused for research. In addition, information on outpatient care and causes of death is available through the SNDS (Système National des Données de Santé, the French national healthcare database). To date, most research projects have relied mainly on structured data from the SNDS. Searching EMRs for predictive elements requires looking not only at structured data, but also at unstructured data such as free text and at changes over time in parameters such as laboratory results.

In addition, some deep learning methods make it possible to build new representations that synthesize the information associated with each patient. These representations, learned from massive data, could then be transferred to prospective research settings. Finally, the steady growth in applications of reinforcement learning raises the question of its applicability to healthcare, i.e. how to make decisions while accounting for uncertainty in the available information.

The PHENOMIX research program aims to develop a set of tools, models and software components with two goals. First, they will enable the construction of a “patient embedding” representation and its use to predict the severity of patients’ medical interventions or to identify homogeneous patient groups within a large population. Second, PHENOMIX will apply reinforcement-learning methods to massive health data to help clinicians make a series of decisions.

Principal Investigators (PIs): Vincent Sobanski and Grégoire Ficheur