Integrative Learning

Integreat will integrate different forms of knowledge into joint analyses, and combine that knowledge with a wide variety of data types. Knowledge can be expressed through stochastic models, differential equations, semantic technologies, graphical representations, logic-based ontologies, and so on. The gap between these approaches is considerable today, but Integreat will study how they can be used in combination.

Data can be big, heterogeneous, and variable in granularity and signal-to-noise level, or smaller and curated; they can take the form of natural language texts, images, and data streams. Using these together is often difficult. With integrative learning we can increase accuracy and explainability, quantify uncertainty, and disentangle the incremental contribution that each knowledge and data component makes to the results, while protecting the results against overconfidence that arises only from using the same information more than once. We will use constrained optimisation and regularisation (where the penalty terms carry prescribed knowledge), information theory, ontology-based data integration, and Bayesian hierarchical models, which so far have not been studied in combination for the purpose of knowledge and data integration.
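As a minimal illustration of a penalty that carries prescribed knowledge, the sketch below fits a single regression slope while shrinking it towards a value suggested by prior knowledge. The function name `penalised_slope`, the parameter `prior_slope`, and the quadratic form of the penalty are assumptions made for this example, not a description of Integreat's methods.

```python
def penalised_slope(x, y, prior_slope, lam):
    """Slope b minimising sum((y_i - b*x_i)^2) + lam * (b - prior_slope)^2.

    Hypothetical one-parameter example: `prior_slope` stands in for
    prescribed knowledge, and `lam` controls how strongly it is trusted.
    Setting the derivative to zero gives the closed form used below.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return (sxy + lam * prior_slope) / (sxx + lam)


# Illustrative data lying exactly on the line y = 2x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

unpenalised = penalised_slope(x, y, prior_slope=1.0, lam=0.0)   # data only
balanced = penalised_slope(x, y, prior_slope=1.0, lam=14.0)     # data and knowledge
```

With `lam = 0` the estimate is the ordinary least-squares slope; as `lam` grows, the estimate moves towards `prior_slope`, so the penalty weight expresses how much the prescribed knowledge is trusted relative to the data.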

A different challenge of integrative learning emerges when two or more data or knowledge sources carry inconsistent information about the unknown parameters and processes. Integreat will exploit domain knowledge to assign different levels of trust to data sources and knowledge, for example by automatically down-weighting data that show anomalies or that contradict knowledge more than measurement error would allow. A further form of data integration is active learning, where the algorithm in charge of the data analysis and decisions determines when and how to enrich its training data. Integreat will develop knowledge-based approaches to determine how to enrich existing data, e.g., by adding new layers of data and new types of measurements of the system under study, for optimal predictive accuracy.
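One simple way to picture such down-weighting is a Huber-type weight: a measurement keeps full weight while it deviates from the knowledge-based expectation by no more than a few standard errors, and loses weight in proportion beyond that. The sketch below is only an illustration under these assumptions; the names `huber_weight` and `weighted_estimate`, the threshold `k = 2`, and the numbers are all hypothetical.

```python
def huber_weight(residual, sigma, k=2.0):
    """Weight 1 within k standard errors of the expectation, k/|z| beyond."""
    z = abs(residual) / sigma
    return 1.0 if z <= k else k / z


def weighted_estimate(values, sigmas, expected, k=2.0):
    """Precision-weighted mean, with each source down-weighted when it
    contradicts `expected` (a hypothetical knowledge-based prediction)
    by more than its measurement error `sigma` would allow."""
    weights = [
        huber_weight(v - expected, s, k) / s**2
        for v, s in zip(values, sigmas)
    ]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Three illustrative sources; the third deviates by ten standard errors.
values = [10.0, 10.2, 15.0]
sigmas = [0.5, 0.5, 0.5]

robust = weighted_estimate(values, sigmas, expected=10.0)
plain = sum(values) / len(values)
```

Here the anomalous third source still contributes, but with one fifth of the weight of the other two, so `robust` stays much closer to the consistent sources than the plain average does.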

Compared to the current approach of knowledge-agnostic data generation, knowledge-steered data enrichment and optimal experimental design will increase accuracy, improve decisions and save energy. This is particularly important when learning multi-way interactions, where the number of combinations grows exponentially with the number of interacting objects, as for example in reinforcement learning. In multi-way interaction models we will, for example, investigate multi-source input data under the assumption of a higher-order tensor structure.
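The exponential growth mentioned above can be made concrete with a small count. The sketch compares the number of entries in a full interaction tensor with the number of parameters in a low-rank (CP, or canonical polyadic) factorisation of the same tensor. The choice of a CP structure and the function names are assumptions made purely for illustration; the text above only refers to a higher-order tensor structure in general.

```python
def full_tensor_size(n, d):
    """Entries in an order-d tensor with n levels per mode: n ** d."""
    return n ** d


def cp_parameter_count(n, d, rank):
    """Parameters in a rank-`rank` CP factorisation: one n-by-rank
    factor matrix per mode (an assumed low-rank structure)."""
    return rank * d * n


# Illustrative case: 4-way interactions among objects with 10 levels each.
full = full_tensor_size(10, 4)
low_rank = cp_parameter_count(10, 4, rank=3)
```

In this example the full tensor has 10,000 entries, while the rank-3 factorisation has 120 parameters, which shows why imposing structure can make multi-way interactions learnable from limited data.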



Published July 3, 2023 10:55 - Last modified Sep. 6, 2023 18:19