
Prediction Modelling Presentations


Our Prediction Modelling group will be hosting regular presentations to engage and introduce new members to our community.

All our presentations will be uploaded below. To find out about upcoming presentations, please email raquel.iniesta@kcl.ac.uk

Applying Statistical Learning Methods to Improve Analyses of Medical Research Studies

By Daniel Stahl, 2 May 2020

The aim of this presentation is an introduction to statistical learning and prediction modelling.

Daniel explains the key differences between inferential statistical modelling and prediction modelling and then introduces the concepts of prediction modelling and statistical learning. Finally, Daniel assesses the usefulness of statistical learning algorithms for applications in medical research as an alternative to classical statistical inference methods by reanalysing an event-related brain potential (ERP) dataset from infants at high or low risk of developing autism. Daniel also explains the concept of cross-validation for model selection and validation and provides a brief introduction to regularized regression.




Using modern statistical learning methods to estimate all cause mortality risk

By Dr Olesya Ajnakina, 1 July 2020

Dr Olesya Ajnakina discusses her large population-based cohort study, which addresses the need to develop a robust prediction model for estimating an individual's risk of all-cause mortality. This allows relevant assessments and interventions to be targeted appropriately.

Built with modern statistical learning algorithms and designed to address the weaknesses of previous models, the new mortality model achieved good discrimination and calibration in quantifying the absolute 10-year risk of all-cause mortality in older adults, as shown by its performance in a separate validation cohort. The model can be useful for clinical, policy, and epidemiological applications.



Frontal lobes dysfunction across clinical clusters of acute schizophrenia

By Filippo Corponi, 10 September 2020

Schizophrenia is a heterogeneous disease comprising manifold clinical phenotypes, which may reflect distinct biological underpinnings. The frontal lobes are a key area of brain dysfunction in schizophrenia. The frontal assessment battery (FAB) is a screening battery for dysexecutive syndrome in neurodegenerative diseases.

Filippo Corponi presents his work investigating the relationship between frontal lobe impairment and symptom profiles defined along the Positive and Negative Syndrome Scale (PANSS) principal components in patients with acute schizophrenia.



Efficient penalized regression methods for genetic prediction



Development and validation of a non-remission risk prediction model in First Episode Psychosis

By Dr Sam Leighton, 18 November 2020

Sam covers the development and validation of a risk prediction model of symptom non-remission in first-episode psychosis. His development cohort consisted of 1027 patients with first-episode psychosis recruited between 2005 and 2010 from 14 early intervention services across the National Health Service in England.

The prediction model showed good discrimination (C-statistic of 0.72 (0.66, 0.78)) and adequate calibration, with an intercept alpha of 0.14 (-0.11, 0.39) and a slope beta of 1.15 (0.76, 1.53). The model improved the net benefit by 13%, equivalent to 13 more detected non-remitted first-episode psychosis individuals per 100. Hence, using the model would be worthwhile if we accept applying it to eight individuals to detect one additional non-remitted individual, or, put another way, applying it to eight individuals will avoid unnecessary additional interventions in one individual.
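The step from 13 additional detections per 100 individuals to roughly one in eight is simple arithmetic. A minimal Python sketch follows; the function name is hypothetical, and rounding up to a whole individual is an assumption about how the figure of eight was obtained, not something stated in the talk summary.

```python
import math


def individuals_per_extra_detection(extra_detected_per_100: float) -> int:
    """Return how many individuals must be assessed to gain one extra detection.

    Hypothetical helper for illustration only. The result is rounded up
    to a whole individual, which is an assumption.
    """
    if extra_detected_per_100 <= 0:
        raise ValueError("extra_detected_per_100 must be positive")
    # 100 individuals yield `extra_detected_per_100` extra detections,
    # so one extra detection needs 100 / extra_detected_per_100 individuals.
    return math.ceil(100 / extra_detected_per_100)


# 100 / 13 is about 7.7, which rounds up to 8 individuals.
print(individuals_per_extra_detection(13))
```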



Augmenting Machine Learning with Topological Data Analysis for precision

By Dr Raquel Iniesta, 27 January 2021

Topological Data Analysis (TDA) is a recently emerged field offering promising tools to extract descriptors of the shape and structure of complex data.

In this talk, Raquel provides an overview of TDA methods that complement current analytical approaches based on machine learning for precision medicine studies. She also introduces two popular TDA techniques, the persistence diagram and the Mapper graph, and discusses their effectiveness based on the available literature in which TDA has been applied to precision medicine. Lastly, she briefly presents ongoing work by her and her team on integrating TDA with machine learning models to identify homogeneous subgroups of patients and predict clinical outcomes.



Regularization and Effect Selection in Cox Frailty Models

By Dr Andreas Groll, 17 February 2021

In this talk, Dr Andreas Groll investigates the effect structure in the Cox frailty model, which is the most widely used model that accounts for heterogeneity in survival data.

Since survival models have to account for possible variation of the effect strength over time, the selection of the relevant features has to distinguish between several cases: covariates can have time-varying effects, have time-constant effects, or be irrelevant. Regularization approaches are discussed that are able to distinguish between these types of effects to obtain a sparse representation that includes the relevant effects in a proper form. This idea is applied to a real-world data set, illustrating that the complexity of the influence structure can be strongly reduced by using such a regularization approach.


Regularised Structural Equation Modelling Application to Psychometric Scales

By Isobel Ridler, 31 March 2021

Isobel talks about structural equation modelling (SEM) and regularised SEM (regSEM), a method that incorporates penalised likelihood into the SEM framework. In this seminar, regSEM is applied to a model of outcome prediction that includes a large psychometric scale, first in a simulation study and then in a real-world longitudinal data set. This allows a comparison of standard maximum likelihood estimation with regSEM and demonstrates the ability of regSEM to perform sparse model selection and hence potentially optimise a scale for outcome prediction.



Harnessing repeated measurements of predictor variables: A review of existing methods for clinical risk prediction

By Lucy Bull, 21 April 2021

In this talk, Lucy Bull provides an overview of her methodological work on how we can make better use of routinely collected medical data to enhance the reliability and applicability of clinical prediction models (CPMs). More specifically, Lucy highlights the motivations behind incorporating longitudinal data into clinical prediction models, provides a detailed overview of available methodology, and discusses the challenges faced when applying such methodology to real-world data, using a case study in chronic disease.




Creating Ensembles of Generative Adversarial Network Discriminators for One-class Classification

By Mihai Ermaliuc, 7 July 2021

In this talk, Mihai introduces a new technique based on Generative Adversarial Networks (GANs) that achieves high performance in the one-class classification problem. He presents an algorithm for one-class classification based on binary classification of the target class against synthetic samples. Mihai's work was recently nominated for the best PhD student paper award of the International Conference on Engineering Applications of Neural Networks, EANN 2021.



Introduction to dCVnet – software for clinical prediction

By Dr Andrew Lawrence, 15 September 2021

In this talk, Andrew speaks about dCVnet, a software tool for prediction modelling. It produces tuned elastic-net regression models with cross-validated prediction performance measures. This approach can be useful in smaller samples or with many predictors. The tool is fast, easy to use and, in contrast to more general prediction modelling software, requires minimal statistical programming experience. dCVnet was developed recently with support from the Biomedical Research Centre for Mental Health at the South London and Maudsley NHS Trust and King’s College London and is freely available at https://github.com/AndrewLawrence/dCVnet


Combining classical and machine-learning methods in Survival Analysis to boost predictive performance and preserve interpretability

By Diana Shamsutdinova, 22 September 2021

In this talk, Diana speaks about survival analysis, which deals with longitudinal data and estimates both the distribution of time-to-event in a population over the observation time and how the time-to-event depends on risk factors.


Identifying homogeneous subgroups of patients and important features: a topological machine learning approach

By Dr Ewan Carr, 20 October 2021

In this talk, Ewan introduces a pipeline that exploits recent developments in topological data analysis to identify homogeneous clusters in high-dimensional data. The approach is based on Mapper, an algorithm that reduces a point cloud into a one-dimensional graph. Written in Python and freely available online, the pipeline offers several advantages over existing clustering techniques. These include the ability to integrate prior knowledge into the clustering process and selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types.

Building and validating prediction models: An overview of sample size guidance and the pmsampsize package

By Dr Joie Ensor, 24 November 2021

In this talk, Joie discusses some of the considerations when deciding how much data is ‘enough’ when looking to i) develop a new clinical prediction model (CPM) and ii) validate an existing CPM. When designing a study to develop a new CPM, researchers must ensure a large enough sample size to develop a model that predicts as accurately as possible. Conversely, when designing a study to validate an existing CPM, we must ensure a sample size large enough to estimate model performance accurately and precisely in an external sample.

Artificial Intelligence: Challenges and Ethical Concerns

By Dr Nicholas Cummins, 30 March 2022

Artificial Intelligence (AI) systems and applications are gaining greater prominence in everyday life. With this growth comes the need to discuss and debate the implications of this development. With this in mind, this talk aims to introduce some of the key concepts relating to what AI really means and the different means of achieving it, and to outline key challenges and ethical considerations.


Network Clustering of Cognition and Clinical Symptoms in Psychosis

By George Gifford, 25 May 2022

Unsupervised learning techniques have been applied to psychosis groups in the hope of finding meaningful but undiscovered groupings of patients. A methodological option for unsupervised learning is network-based clustering, which relies on the topology of the data represented as a network. This study used cognitive and symptom data from a cohort of healthy controls and those with a Clinical High Risk of Psychosis to test the validity of graph clustering and to explore the use of a multilayer clustering method for multimodal unsupervised learning. Graph clustering was able to produce results highly similar to k-means clustering and to separate groups into those with significantly different functioning scores. Multilayer clustering was used to tune the similarity of clustering solutions between modalities.

Performance comparison of dynamic prediction based on joint models and landmark analysis

By Dr Mizanur Khondoker, 29 June 2022

In conventional prediction models, predictors are typically measured at a single fixed time point, such as at baseline or the most recent follow-up. Dynamic prediction has emerged as a more appealing prediction technique that takes account of the longitudinal history of biomarkers when making predictions. In this talk, Dr Mizanur Khondoker presents results from a simulation study comparing the prediction performance of two well-known approaches to dynamic prediction, namely joint modelling and landmarking.

The use of machine learning to improve prognostic and diagnostic accuracy

By Dr Lauric Ferrat, 27 July 2022

The use of machine learning to improve prognostic and diagnostic accuracy has been increasing at the expense of classic statistical models. In this talk, Dr Lauric Ferrat presents results comparing the prediction performance of several well-known machine learning approaches with that of logistic regression. He then argues that the focus should be not on performance optimisation but on clinical utility and ease of model access.

Prediction Modelling Group

Our Prediction Modelling Group provides a forum for researchers, clinicians and other experts.

About precision medicine and prediction modelling

Precision medicine is an emerging approach that focuses on identifying treatments or approaches.

Initiatives and aims

We are seeking prediction modelling researchers to establish an online database listing members’ areas of interest and expertise.


Meet the current members of the group and discover their areas of interest.


Predictive models that use data from individuals are an important source of information in medical settings.

Join our prediction modelling group

Joining our group helps create our database of researchers and establish collaborations within our institution and beyond.

Prediction Modelling Workshop

Videos from the inaugural workshop of our Prediction Modelling Group including presentations from world-renowned experts.

Implementation research

Our Prediction Modelling Group provides innovative approaches to tackle this translational gap and advance implementation science.