Why is machine learning the new perspective for generating evidence in pharma?

The global pharmaceutical industry is no longer rewarded for selling the most drugs, but for providing the best healthcare services and products.

The method for assessing the outcomes of health services needs to be tailored to different contexts, each involving unique patients, medical practitioners, drugs and therapies.

Machine learning (ML) is about context

ML aims to understand human context in its natural setting, and how new knowledge can be gained by learning from examples. As a result, it is not constrained by the assumptions of classical statistical models. By means of feature extraction, ML maps real-world data into a non-linear feature space to improve the effectiveness and efficiency of predictive models. ML models take a more individualised, data-driven approach to outcome prediction, with minimal assumptions about data distributions.
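To make the idea of mapping data to a non-linear space concrete, here is a minimal sketch in Python. It is a toy illustration rather than a real clinical model: the four data points, the `feature_map` function and the hand-picked threshold are all assumptions made for this example. The two classes cannot be separated by a straight line in the original two-dimensional space, but after one interaction feature is added, a simple threshold separates them.

```python
# Toy XOR-style data: the label is 1 when the two inputs have opposite signs.
# No single straight line in the (x1, x2) plane separates the two classes.
points = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
labels = [0, 1, 1, 0]


def feature_map(x1, x2):
    """Hypothetical non-linear feature map: append the interaction term x1 * x2."""
    return (x1, x2, x1 * x2)


def predict(x1, x2):
    """Linear rule in the mapped space: threshold the interaction feature at 0."""
    _, _, interaction = feature_map(x1, x2)
    return 1 if interaction < 0 else 0


predictions = [predict(x1, x2) for x1, x2 in points]
print(predictions)  # [0, 1, 1, 0], identical to labels
```

In practice the feature map would be learned or chosen by the algorithm (for example through kernels or neural network layers) rather than written by hand as it is here.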

ML can deal with human context, and hence with the individual nature of healthcare services, which address different patients with different lifestyles and different interactions with the healthcare system. Many statistical approaches depend on randomisation and blinding. ML, however, provides alternative approaches that are fully data-driven and don’t make any assumptions about the data distribution.

Data-driven approaches make no assumptions about the data distribution, hence they impose no strict requirements on the data

ML does not assume that real-world data is normally distributed, and does not depend on whether the Central Limit Theorem applies. The most popular algorithms are fully data-driven: they learn from the data as it is, without underlying assumptions about its distribution. Moreover, computer scientists bring a different mindset to generating evidence. Their neutrality towards the domain allows them to remain objective when predicting the outcome. The main focus is on maximising the predictive power of the ML algorithm, which leads to the best objective evidence that can be derived from the data, limited only by the quantity and quality of the data available.
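One example of an algorithm that learns from the data as it is, without fitting any distribution, is k-nearest neighbours. The sketch below is a minimal illustration, not a recommended clinical method. The function name `knn_predict`, the heavily skewed measurements and the "early"/"late" labels are hypothetical and invented for this example.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the majority label among the k training points closest to query.

    train is a list of (value, label) pairs; no distribution is fitted.
    """
    neighbours = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical, heavily right-skewed measurements (far from normally distributed).
train = [
    (1, "early"), (2, "early"), (2, "early"), (3, "early"),
    (30, "late"), (45, "late"), (120, "late"), (365, "late"),
]

print(knn_predict(train, 4))   # early
print(knn_predict(train, 60))  # late
```

The prediction depends only on which training examples lie closest to the query, so the skew of the data does not need to be modelled or corrected first. The quality of the result still depends on how many examples are available and how representative they are.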


ML handles articulation 

Health services are characterised by a high dependency on the context of both patient and clinician. The problem of extracting such context from low-level data is known as the semantic gap, and it cannot be addressed within the randomised controlled trial (RCT) framework. The semantics of a specific human task depend on the context in which it is performed. Capturing them requires converting the tacit knowledge of clinicians and patients into explicit knowledge, a process known as articulation that can only be addressed using ML.

ML provides tools for handling articulation by inferring high-level tasks from low-level events. It also provides personalised tools for inferring similar context from different executions of a task. By deriving the context, ML allows the clinical outcome to be assessed for individual patients. It also allows outliers to be handled, and it does not place the same restrictions on the data collection procedure as the RCT framework. ML could therefore provide a pragmatic way to analyse real-world data and create evidence from it in the healthcare ecosystem.
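As a rough illustration of inferring a high-level task from low-level events, the sketch below compares a new sequence of events with a small set of labelled examples and returns the label of the most similar one. Everything in it is an assumption made for illustration: the event names, the task labels, and the choice of cosine similarity over event counts. A real system would need far more data and a properly validated model.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two event-count vectors stored as Counters."""
    dot = sum(a[key] * b[key] for key in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def infer_task(events, labelled_examples):
    """Return the task label of the most similar labelled event sequence."""
    counts = Counter(events)
    best = max(labelled_examples, key=lambda ex: cosine(counts, Counter(ex[0])))
    return best[1]


# Hypothetical low-level events and high-level task labels.
labelled_examples = [
    (["open_record", "read_labs", "read_labs", "write_note"], "review_results"),
    (["open_record", "select_drug", "set_dose", "sign_order"], "prescribe"),
]

# A different execution of the same task: one extra step, same underlying intent.
new_execution = ["open_record", "select_drug", "check_allergy", "set_dose", "sign_order"]
print(infer_task(new_execution, labelled_examples))  # prescribe
```

The new execution contains a step that appears in neither labelled example, yet it is still matched to the prescribing task. This is a very simple instance of recognising similar context across different executions.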