CRIS blog: The future of psychiatry research

Dr Karyn Ayre, NIHR Academic Clinical Fellow at King's College London, was awarded second prize in the 2018 Duncan Macmillan essay competition for her essay “The future of psychiatry research”, which argued for new digital methodologies to be used in psychiatry research in order to improve the links between research data and the lived experience of people with mental health problems. We have republished her essay here as a CRIS blog post.

The Duncan Macmillan essay prize competition is held annually by the Institute of Mental Health, a partnership between the University of Nottingham and Nottinghamshire Healthcare NHS Foundation Trust. This year it was held in partnership with the Royal College of Psychiatrists. The purpose of the Duncan Macmillan prize is to inspire eminent psychiatry trainees nationwide. The competition is named after Nottingham-based psychiatrist Duncan Macmillan, who helped pioneer a community-centred approach to mental health in the 1950s and 1960s.

Since the days of Jaspers and Kraepelin, the scope of psychiatry has expanded significantly. It now encompasses a vast array of topics that interface directly with other areas of medicine and sociology: developmental neuropsychiatry, psychopharmacology, genomics and liaison psychiatry to name but a few. However, what if the next big thing in psychiatry research is not what we research, but how we research it?

The paradigm of traditional research is to recruit a sample, design an outcome measure and apply that outcome measure to the sample in order to answer the research question. The issue with this paradigm is that researchers can never be sure how truly representative of real life their study findings are – people who agree to participate in research may be significantly different from those who do not and what a participant says in a research questionnaire can be affected by all kinds of things, particularly in psychiatry, where stigma still lingers. Essentially, research is an artificial scenario, where researchers try as hard as possible to make sure it represents real life. A way around this is to use routinely collected data from the general population, something that often allows very large sample sizes to be generated, as exemplified by the population registries in Scandinavian countries.

However, this is also problematic as it rarely captures nuanced factors particularly relevant to mental health, e.g. relationship difficulties or attitudes towards the outcome in question. Healthcare outcomes are often recorded in such registries using alphanumerical codes from the International Classification of Diseases (ICD) manual, applied by a coder when someone accesses a healthcare service. However, this coding is also subject to bias – the coder may have no expertise in mental health, and stigmatised events such as self-harm may be under-coded or misclassified. A third way of carrying out a research study is to examine the content of healthcare records of real service-users. This can provide more nuanced data than simple discharge codes, but is labour-intensive, limiting the sample size considerably.

We are now re-defining the paradigm: we can now use computers to “learn” to extract nuanced data from vast amounts of real, rich, pre-existing clinical material. This is known as machine learning or “natural language processing” (NLP). Doing this digitally means very large sample sizes can be created and therefore research using real clinical material can be done on a much larger scale than previously possible.

A prime showcase of NLP is CRIS. This stands for Clinical Record Interactive Search and is a de-identified database of real-life electronic mental healthcare clinical notes. The original CRIS was developed at the National Institute for Health Research (NIHR) Maudsley Biomedical Research Centre in South London but has since been rolled out across thirteen other mental health Trusts in England.

Self-harm research is a good example of the utility of CRIS and NLP. Measuring how common self-harm is can be very complicated because “self-harm” can be described in many different ways: “attempted suicide”, “parasuicide”, “suicidal gesture”, “overdosing”, “cutting yourself”, “trying to end your life” to name but a few. In traditional research, a commonly used outcome measure is a questionnaire or interview, where the participant is asked whether they have ever self-harmed.

What if the questionnaire worded it wrongly and the participant didn’t think the question applied to them? Someone who “cut themselves” may not have been trying to “end their life”. It would be very cumbersome to ask a participant a question in six different ways, to make sure we had covered all the potential definitions. The use of techniques and resources like NLP and CRIS means that instead of limiting ourselves to one definition, we can search for all the definitions we can think of. Clinical input into CRIS research is clearly valuable here, and an example of how researchers with clinical and non-clinical expertise can work together to get the best out of a research tool.

So, researchers can do a “free text” search for synonyms that capture a broad range of ways an outcome is defined. However, a potential problem with this is that it can generate a large number of false positives – e.g. records with mentions of “no self-harm” or that the service-user “has never self-harmed”. The beauty of NLP is that researchers can design “applications”, i.e. specific pieces of computer code, that allow the “machine to learn” to filter out these non-relevant mentions of the outcome of interest. The application can be applied to very large datasets, meaning the process of filtering out non-relevant material, which would take an unfeasible amount of time were it to be done manually, can be done with ease.
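To make the false-positive problem concrete, here is a minimal Python sketch of a synonym search with a crude negation check. It is not how CRIS applications are built: it uses hand-written rules rather than machine learning, and the term list, the negation cues and the function name `classify_mentions` are all assumptions made for illustration.

```python
import re

# Hypothetical synonym patterns, loosely based on the phrases quoted in this post.
TERM_RE = re.compile(
    r"self[- ]harm\w*|attempted suicide|parasuicide|suicidal gesture|overdos\w*",
    re.IGNORECASE,
)

# Hypothetical negation cues; a real application would need far more care.
NEGATION_RE = re.compile(r"\b(no|not|never|denies|denied|without)\b", re.IGNORECASE)


def classify_mentions(note, window=3):
    """Return (matched_text, affirmed) for each synonym found in a note.

    A mention counts as negated if any of the `window` words directly
    before it is a negation cue.
    """
    results = []
    for match in TERM_RE.finditer(note):
        preceding_words = note[:match.start()].split()[-window:]
        negated = any(NEGATION_RE.search(word) for word in preceding_words)
        results.append((match.group(0), not negated))
    return results


if __name__ == "__main__":
    for note in ["No self-harm reported.",
                 "Has never self-harmed.",
                 "Took an overdose last night."]:
        print(note, classify_mentions(note))
```

A plain free-text search would flag all three example notes. The negation check keeps only the third as a relevant mention. A fixed word window like this ignores sentence boundaries and misses many phrasings, which is part of the reason a learned model is attractive.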

NLP is not perfect. There are problems with missing data, and with not picking up potentially new and unexpected ways of defining outcomes that the clinician was not aware of. Using NLP in CRIS cohorts means we are seeing what the service-user is saying through the prism of what the clinician is hearing and documenting. Researchers must ensure any application they develop is validated against a gold standard of manually coded text. Finally, the use of CRIS means cohorts inevitably only hold information on people in contact with mental healthcare services.
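Validation against a gold standard is usually summarised with figures such as precision and recall. The Python sketch below shows the arithmetic on made-up labels, assuming each document carries one true/false label from the application and one from a manual coder. The function name `precision_recall` and the example data are hypothetical.

```python
def precision_recall(predicted, gold):
    """Compare application labels with manually coded gold-standard labels.

    Both arguments are equal-length sequences of booleans, one per document.
    """
    pairs = list(zip(predicted, gold))
    true_pos = sum(1 for p, g in pairs if p and g)
    false_pos = sum(1 for p, g in pairs if p and not g)
    false_neg = sum(1 for p, g in pairs if not p and g)

    # Precision: of the documents the application flagged, how many were right.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of the documents the manual coder flagged, how many were found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up labels for four documents, purely for illustration.
    app_labels = [True, True, False, False]
    manual_labels = [True, False, True, False]
    print(precision_recall(app_labels, manual_labels))
```

Low precision means too many false positives are getting through. Low recall means relevant mentions are being missed.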

However, CRIS and NLP in general represent an incredibly exciting and revolutionary new research paradigm: what goes on at the front-line of mental healthcare can provide the material for research on a vast scale previously unattainable through manual analysis. The next big thing in psychiatry research is therefore ironically something that allows us to better collate and analyse the information we already had from our day-to-day working lives helping service-users with lived experience of mental illness.


Tags: CRIS blog

By NIHR Maudsley BRC at 16 Nov 2018, 10:51 AM

