Genome-Centric Multimodal Data Integration in Personalised Cardiovascular Medicine.

Deliverables

The NextGen Data Management Plan v1.0

NextGen addresses the complex problem of integrating multiple types of data (multi-modal data), including genomic data, into research pathways that might include AI tools. NextGen develops tools to overcome barriers in data integration and access, to lead to an improved clinical outcome from healthcare research.
NextGen builds on federated technology and decentralized data management techniques. Centrally aggregating data is outside the declared project scope, which is particularly relevant for data subject to the GDPR.
Starting with datasets accessible by NextGen participants locally or through biobanks, the integration tooling listed will first be deployed locally, while a concept for a decentralized platform supporting research portability across different sites is developed around specific pilots defined during the project.

Data Discovery Functionality – Part 1

The NextGen Project aims to develop tooling for integrating genome-centric multimodal data for research analytics. These tools promote a federated approach to handling the sensitive data required for personalised cardiovascular medicine, and they will be deployed in pilots that serve as NextGen's pathfinder projects. The vision of the NextGen federated approach is to create a health data ecosystem where data remains under the control of each research organisation. As a result, participants will be able to perform research activities beyond their own institutions while remaining within the ecosystem's governance.
Achieving this vision requires new mechanisms to lawfully discover which data is available for research purposes while the datasets remain secured by each research organisation. The search must work while the data stays behind each organisation's walls and cannot be moved in advance. Providing such a tool is the goal of NextGen's Federated Catalogues.
Current discovery mechanisms are based on search solutions that require the data, and the information about the data (the metadata), to be moved to a hosted system (i.e. the catalogue). Existing federated catalogue solutions are not suitable for sensitive health data, as a participating research institution cannot maintain sufficient control over the structure and usage of the exposed metadata. This is particularly true for health data spaces spanning multiple jurisdictions.
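
The contrast between a hosted catalogue and federated discovery can be illustrated with a minimal Python sketch. It is not NextGen's implementation: the class `SiteCatalogue`, the function `federated_discover`, the field names and the "counts only" policy are all hypothetical, chosen only to show a query travelling to each site while the records stay in place and each site decides what it exposes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SiteCatalogue:
    """Hypothetical catalogue run by one research organisation.

    The metadata records are held locally and are never returned to callers.
    """
    site: str
    records: list[dict] = field(default_factory=list, repr=False)
    # Assumption: each site decides which metadata fields may be searched.
    searchable_fields: frozenset = frozenset()

    def count_matches(self, query: dict) -> Optional[int]:
        """Answer a discovery query with a count only; None means refused."""
        if not set(query) <= self.searchable_fields:
            return None  # the query touches a field this site does not expose
        return sum(
            all(record.get(key) == value for key, value in query.items())
            for record in self.records
        )


def federated_discover(sites: list[SiteCatalogue], query: dict) -> dict:
    """Send the query to every site and collect only the counts they release."""
    results = {}
    for site in sites:
        count = site.count_matches(query)
        if count is not None:
            results[site.site] = count
    return results


# Illustrative data only; the site names and fields are made up.
site_a = SiteCatalogue(
    "site_a",
    [{"modality": "genomic"}, {"modality": "imaging"}],
    frozenset({"modality"}),
)
site_b = SiteCatalogue("site_b", [{"modality": "genomic"}], frozenset())

print(federated_discover([site_a, site_b], {"modality": "genomic"}))
# prints {'site_a': 1}
```

In this sketch `site_b` holds a matching record but exposes no searchable fields, so it declines to answer and does not appear in the result. The point is that control over what is discoverable stays with each organisation rather than with a central host.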