Unlocking the data potential within the United Kingdom for population-scale research

Tom Starkey, Maria Ionescu - Opinion Editorial (UK COVID/cancer clinical lead)

Population-scale real-world datasets are becoming increasingly important in health research. Evaluations from these resources are being routinely used to assess and improve health outcomes, guide clinical decision-making and inform government policy.

The population-scale approach took centre stage during the Coronavirus (COVID-19) pandemic, with health data evaluation and dissemination profoundly affecting public health policy and clinical outcomes globally.

To date, multiple important large-scale datasets have been curated in the United Kingdom (UK). These include the UK Biobank (500k participants) and the 100,000 Genomes Project, which have started to facilitate cutting-edge research into primary cancers and rare diseases. However, such datasets are still rare and are sometimes challenging to link to relevant clinical information. More pertinently, they remain limited in scope for assessing the impact of Coronavirus in individuals with cancer.

The UK Coronavirus Cancer Programme (UKCCP) dataset was one of the first in the UK to link COVID-19 testing and vaccination records with NHS hospital records and cancer datasets. Across the COVID-19 pandemic, our population-scale dataset comprised 198,819 positive SARS-CoV-2 test results from 127,322 individual Coronavirus infections in people with cancer*. This enabled us to perform the high-quality real-world evaluations of cancer patients with COVID-19 that are required to reduce risk and improve outcomes for these patients. This work has included, but was not limited to, determining primary and booster vaccination effectiveness across different cancer patient subgroups and subtypes, as well as assessing clinical outcomes, thereby identifying the most clinically vulnerable individuals with cancer. In turn, these individuals are among those who may benefit from additional measures, such as prophylactic antibody therapies, to boost anti-SARS-CoV-2 immunity.

The UKCCP has delivered the largest COVID/cancer analyses in the world and changed global practice, catalysing protection for cancer patients.

We addressed several challenges and developed new solutions. The mission was to use data for health, break down silo working, build new bridges and provide confidence in our academic analyses, whilst simultaneously working within the trusted national data platforms. We were successful, and this process has left a lasting legacy of data linkages and scripts which can be run to generate near real-time updates for our UKCCP dataset, based on the latest information from each individual data source. Furthermore, we have generated extensive data analysis protocols and code as part of the programme to facilitate up-to-date assessments of Coronavirus outcomes in people with cancer and other immunodeficiency diseases on a population scale.

The Coronavirus pandemic has underlined the importance of population-scale real-world datasets and data evaluations. This work requires the skill sets of clinicians, health researchers and data scientists, as well as public engagement. Indeed, this collaborative approach has had a profound effect within the medical research community, with over 250 individuals involved in the UKCCP alone.

Our programme was successful in providing proof of principle, and impetus, through the secure integration of multiple health datasets, a process which, if developed and refined on a global scale, has the potential to deliver significant improvements in health research and patient care.

Going forward, population-scale data research provides great potential to bring together clinicians, researchers, charities, industry partners and the public to develop new ways to prevent, detect and treat cancer.
