Vision

PLIADES aims to facilitate cross-domain data sharing by enabling users from diverse and heterogeneous data spaces to exchange data safely and seamlessly. The project lays the foundation for a versatile data exchange ecosystem that integrates various data spaces. This is accomplished by extending the IDS RAM specification with cross-domain data services, which augment the capabilities of IDS Connectors and facilitate decentralized data management, as presented in the following figure. In addition, the project presents a cutting-edge data integration framework that places emphasis on the entire data life cycle, while also considering environmental and security factors.

PLIADES vision

Each distinct data space provides a variety of services for data governance and management, data exchange, identity management, certification, and other domain-specific services and authorities. In accordance with the conventional data exchange paradigm of IDS RAM v4.0, each data transaction involves a data provider and a data consumer, as illustrated in the figure. The data consumer communicates with a metadata broker service to discover data sources (data providers) in the data space. A certification authority verifies the legitimacy of both parties and grants them permission to exchange data through connector services. These services offer data transformation capabilities and comply with a specified communication protocol to transmit the data. To promote interoperability, the PLIADES research project suggests a collection of cross-domain services that amalgamate information from various domain services.
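The conventional exchange described above (discovery through a broker, a certification check on both parties, then transfer through connectors) can be sketched roughly as follows. This is a minimal in-memory illustration, not the IDS RAM API: the class names `MetadataBroker` and `CertificationAuthority`, the `exchange` function, and all identifiers and payloads are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataBroker:
    """Hypothetical in-memory stand-in for a data space metadata broker."""
    catalog: dict = field(default_factory=dict)  # dataset name -> provider id

    def register(self, dataset: str, provider_id: str) -> None:
        self.catalog[dataset] = provider_id

    def discover(self, keyword: str) -> dict:
        # Return every catalog entry whose dataset name contains the keyword.
        return {d: p for d, p in self.catalog.items() if keyword in d}


@dataclass
class CertificationAuthority:
    """Hypothetical authority holding the set of certified participants."""
    certified: set = field(default_factory=set)

    def is_certified(self, participant_id: str) -> bool:
        return participant_id in self.certified


def exchange(consumer_id, dataset, broker, authority, payloads):
    """Look up the provider, check both parties, then hand over the data."""
    provider_id = broker.catalog.get(dataset)
    if provider_id is None:
        raise LookupError(f"no provider offers {dataset!r}")
    for party in (consumer_id, provider_id):
        if not authority.is_certified(party):
            raise PermissionError(f"{party} is not certified")
    # A real connector would transform and transmit the data here;
    # this sketch simply returns the stored payload.
    return payloads[(provider_id, dataset)]


broker = MetadataBroker()
broker.register("energy-load-2023", "provider-a")
authority = CertificationAuthority({"provider-a", "consumer-b"})
payloads = {("provider-a", "energy-load-2023"): [1.2, 3.4]}

found = broker.discover("energy")
data = exchange("consumer-b", "energy-load-2023", broker, authority, payloads)
```

In this sketch an uncertified participant is rejected before any data leaves the provider, which mirrors the role the text gives to the certification authority.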

The proposed services and their functionalities are presented below:

Active Data Discovery is a service that communicates with domain metadata brokers to enable cross-domain data search and discovery operations, in the form of an AI-boosted broker. Through the service’s recommendation engine, ranking system, and AI declarative querying, users can easily identify optimal datasets, providers, and consumers and establish a connection with them. This transforms passive communication with domain metadata brokers into an interactive experience.

This service allows private and secure cross-domain data exchange and analysis and enforces data protection legislation on the connectors. It combines the data legislation mandates specified by the data provider and/or EU law (e.g. GDPR) with active discovery of the nature of the data to be exchanged, using a special communication node at each connector that employs semantics and trained ML models.

Virtual Replica acts as the central unit of multiple (one for each data space) federated Digital Twin Models. Using generative AI, each connector can elaborate on the raw data of each data provider, exchanging or using simulated data instead of actual data, enhancing data privacy and security and increasing the volume of the data. For the generative AI models’ training, federated learning is employed, partially training models on every data provider and creating federated models for each data space that continuously improve with new information and data types.
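One common way to combine models trained separately at each data provider is federated averaging, where the shared model is the average of the local parameters weighted by local dataset size. The source does not state which aggregation rule PLIADES uses, so the following is only an illustrative sketch under that assumption; `federated_average` and the example numbers are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average per-provider parameter vectors, weighted by local dataset size.

    Assumption: a FedAvg-style rule; the source does not name one.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two providers with locally trained parameters and different data volumes.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]

global_weights = federated_average(client_weights, client_sizes)
print(global_weights)  # [2.5, 3.5]
```

Only the parameters leave each provider in this scheme, not the raw records, which is the privacy property the paragraph above relies on.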

This service enforces interoperability between the domain semantic annotation services of each data space. It employs ML-based semantic annotation techniques to assist the transfer of domain knowledge to the semantic annotation and labelling processes. Through human-in-the-loop interaction and active learning techniques, the module aims to efficiently identify the most informative examples for human annotation, reducing the need for manual effort.
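Selecting the most informative examples for human annotation is often done with uncertainty sampling: the items on which the current model is least certain are sent to the annotator first. The source does not specify the selection criterion, so the sketch below assumes entropy-based uncertainty; `entropy`, `select_for_annotation`, and the record identifiers are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Return the ids of the `budget` items with the highest predictive entropy."""
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs: item id -> predicted probabilities per label.
predictions = {
    "record-1": [0.98, 0.02],  # confident prediction, low priority
    "record-2": [0.55, 0.45],  # near a coin flip, most informative
    "record-3": [0.70, 0.30],
}

to_label = select_for_annotation(predictions, budget=2)
print(to_label)  # ['record-2', 'record-3']
```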

During the data exchange process, a cross-domain service will provide data quality assessment functionality to the connectors to identify low-quality data, which will be marked as such and, if needed, excluded from the data transfer and/or analysis. Data providers can then assess their datasets and dispose of the low-quality data that did not contribute to the data analysis, saving data storage in line with the Green Deal. A low-quality mark can also be added to data irrespective of the time or state of the data consumer or provider.
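A rough sketch of marking and filtering low-quality records follows. It assumes a simple completeness score (the share of required fields that are filled in) and a fixed threshold. The source names neither a metric nor a threshold, so both are hypothetical, as are the function and field names.

```python
def quality_score(record, required_fields):
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for f in required_fields if record.get(f) not in (None, ""))
    return filled / len(required_fields)


def mark_quality(records, required_fields, threshold=0.75):
    """Return copies of the records with a boolean `low_quality` mark added."""
    marked = []
    for record in records:
        score = quality_score(record, required_fields)
        marked.append({**record, "low_quality": score < threshold})
    return marked


def transferable(marked):
    """Keep only the records that were not marked as low quality."""
    return [r for r in marked if not r["low_quality"]]


REQUIRED = ["id", "timestamp", "value", "unit"]
records = [
    {"id": 1, "timestamp": "2024-01-01T00:00", "value": 20.5, "unit": "C"},
    {"id": 2, "timestamp": "2024-01-01T01:00", "value": None, "unit": ""},
]

marked = mark_quality(records, REQUIRED)
to_send = transferable(marked)
print([r["id"] for r in to_send])  # [1]
```

Keeping the mark on the record, rather than dropping the record immediately, leaves the decision to exclude or dispose of the data with the provider, as the text describes.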

A central identity provider enables interoperability between the domain identity providers of each data space, covering various privacy-ensuring techniques that protect sensitive information without hindering analysis. Self-sovereign identification (SSI), a decentralized identity management concept that gives individuals authority over their identity information, enhances data privacy and cybersecurity.

Maintaining the core of the IDS Connector specification, PLIADES expands the standardized features of the connectors with communication nodes that connect to all the centralized services presented above. This enhances each connector functionality and achieves secure and efficient cross-domain data sharing while preserving privacy.

To realise the aforementioned concept, a high-level technical architecture of the PLIADES framework is presented in the figure below. It consists of four technical layers and three satellite processes. The technical layers are:

  1. Data Creation Layer
  2. Data Processing and Analytics
  3. Data Space Integration and Data Sharing
  4. Data Management and Security

The three satellite processes are (i) the Application Areas, Third-party marketplaces and EU initiatives; (ii) the Project Pilots and End-Users; and (iii) the Data Spaces interconnectivity process, which is achieved through the technical layers.

The PLIADES architecture will be realized by integrating (i) existing background cutting-edge technological assets from PLIADES beneficiaries and external third-party initiatives; (ii) new ad-hoc foreground technological assets developed during the project; and (iii) adapted existing background AI models and assets that will enrich the umbrella of services offered.