The General Data Protection Regulation (GDPR) is a European data protection regulation that significantly changes the way personal data and consent are collected, used, and shared. With fines of up to 20 million euros or 4% of global annual turnover, whichever is higher, compliance is an important activity that must be supported by maintaining appropriate documentation of activities.
This project aims to use Semantic Web technologies to provide a common and cohesive framework for representing, and aiding compliance with, legislation such as the GDPR. This research provides for the integration of data management across different information systems, specifically adhering to the GDPR and helping controllers to demonstrate compliance.
Analysis of GDPR
We have carried out an analysis of the GDPR to understand its structure, obligations, and implications. Our key areas of focus were to understand and quantify the terms and obligations, and to understand the nature of interoperability for entities affected by the GDPR.
GDPRtEXT (GDPR text extension) is the semantification of GDPR into a linked data resource and ontology. It enables referencing articles and concepts/terms within the GDPR using RDF/RDFS/OWL. Read more about GDPRtEXT.
We explore the interoperability of information between the various entities mentioned within the GDPR. We identify the procedures outlined for information flows that carry explicit requirements, such as the presence of structured data or the use of specific data formats, and we discuss existing standards by evaluating the state of the art with respect to the standards provided by the World Wide Web Consortium (W3C) for representing information. Read more about GDPR Interoperability Model.
CDMM Model and Consent Ontology
Along with the various information flows identified from the perspective of interoperability, we have also formalised a model for consent and data management. The model describes the various stages of data management along with the entities involved and their responsibilities. This includes preliminary work on representing consent as an ontology. Read more about consent ontology.
Consent-aware Mapping Engine for Generating Policy-compliant Datasets
Abstract: The development of intelligent (machine learning or AI-based) applications increasingly requires governance models and processes, as financial and legal sanctions are increasingly associated with violations of policies (e.g. due to the GDPR). An ontology representing the (informed) consent captured by an organization can be used to assess a dataset prior to its use in any type of data processing activity. We demonstrate its utility using a particular scenario, where datasets are generated “just in time” for a particular purpose such as sending newsletters. This scenario shows how data processing activities can be managed in such a way as to support compliance verification. This is a work in progress. Read more about this work.
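The "just in time" scenario can be illustrated with a minimal sketch. It assumes consent is held as simple records with one entry per data subject and purpose; the `Consent` class, the `generate_dataset` function, and all field names are hypothetical and do not reflect the actual ontology or mapping engine.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Consent:
    """One consent record (hypothetical shape, not the ontology's terms)."""
    subject_id: str
    purpose: str
    given: bool  # False models refused or withdrawn consent


def generate_dataset(
    records: List[Dict[str, str]],
    consents: List[Consent],
    purpose: str,
) -> List[Dict[str, str]]:
    """Return only the records whose data subject consented to `purpose`."""
    allowed: Set[str] = {
        c.subject_id for c in consents if c.purpose == purpose and c.given
    }
    return [r for r in records if r["subject_id"] in allowed]


records = [
    {"subject_id": "u1", "email": "a@example.org"},
    {"subject_id": "u2", "email": "b@example.org"},
    {"subject_id": "u3", "email": "c@example.org"},
]
consents = [
    Consent("u1", "newsletter", True),
    Consent("u2", "newsletter", False),  # consent refused or withdrawn
    Consent("u3", "analytics", True),    # consent given for another purpose
]

# Build the dataset only at the moment it is needed, for one stated purpose.
newsletter_dataset = generate_dataset(records, consents, "newsletter")
print([r["email"] for r in newsletter_dataset])
```

Because the dataset is derived from the consent records at the time of use, a change in consent is reflected in the next generated dataset without editing the source data.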
GConsent - GDPR Privacy and Consent Ontology
This work explores how consent can be represented programmatically as machine-readable data. This is a work in progress. Read more about GConsent.
To represent the various steps and activities regarding the GDPR, we extended the PROV-O and P-Plan ontologies to create the GDPRov ontology, which uses GDPR-specific terms and concepts to define the provenance of consent and data along with their relevant activities. Read more about GDPRov.
Changes in Provenance
By representing provenance metadata, it is possible to assist in identifying expected changes to data and consent and to their associated activities. We explored this idea as a proof of concept.
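One way to picture this, as a sketch under stated assumptions: if two versions of the provenance metadata are available as sets of subject-predicate-object triples, a set difference shows which statements about consent or data changed. The `ex:` terms and the `diff_provenance` function below are hypothetical and are not taken from GDPRov or from the proof-of-concept itself.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def diff_provenance(old: Set[Triple], new: Set[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (added, removed) statements between two provenance snapshots."""
    return new - old, old - new


# Hypothetical snapshots: the consent used by a collection step is replaced.
old_plan: Set[Triple] = {
    ("ex:CollectEmail", "ex:usesConsent", "ex:ConsentV1"),
    ("ex:SendNewsletter", "ex:usesData", "ex:Email"),
}
new_plan: Set[Triple] = {
    ("ex:CollectEmail", "ex:usesConsent", "ex:ConsentV2"),
    ("ex:SendNewsletter", "ex:usesData", "ex:Email"),
}

added, removed = diff_provenance(old_plan, new_plan)
print("added:", sorted(added))
print("removed:", sorted(removed))
```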
Data Protection Rights Language
Our exploration yielded the Data Protection Rights Language, which is based on the ODRL 2.0 template feature and allows rights to be tracked and propagated through entities.
Data Sharing Agreement Ontology
Under the GDPR, the agreements between Controllers, Processors, and other relevant entities are important for ensuring compliance. We explore their semantification and usage based on the concept of smart agreements. Currently, we are working on creating an ontology for representing the data sharing agreement between organisations.
Currently, we are working on representing certification using an ontology, and on self-assessing certifications based on testing approaches.
Evaluating GDPR Readiness
By exploiting the metadata provided by our other work, we evaluate GDPR readiness using the checklist provided by Ireland's Data Protection Commissioner's office. This is an application that uses SPARQL queries to pull in information from a triple-store for the queries defined in the checklist. The application demonstrates the feasibility and usefulness of our approach. The application is available online.
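The idea of answering checklist questions from stored metadata can be sketched without a triple-store. The example below is a plain-Python stand-in for SPARQL ASK queries: each question maps to a triple pattern in which `None` plays the role of a query variable. The questions, the `ex:` terms, and the `ask` helper are hypothetical; they are not taken from the actual checklist or application.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def ask(graph: List[Triple], pattern: Pattern) -> bool:
    """True if any triple matches the pattern; None acts as a wildcard."""
    return any(
        all(want is None or want == have for want, have in zip(pattern, triple))
        for triple in graph
    )


# Hypothetical compliance metadata held as triples.
graph: List[Triple] = [
    ("ex:NewsletterProcessing", "ex:hasLegalBasis", "ex:Consent"),
    ("ex:NewsletterProcessing", "ex:hasPurpose", "ex:Marketing"),
]

# Hypothetical checklist: one pattern per question.
checklist: Dict[str, Pattern] = {
    "Is a legal basis recorded for processing?": (None, "ex:hasLegalBasis", None),
    "Is a retention period recorded?": (None, "ex:hasRetentionPeriod", None),
}

report = {question: ask(graph, pattern) for question, pattern in checklist.items()}
for question, answered in report.items():
    print("PASS" if answered else "FAIL", question)
```

In a setting with a real triple-store, each pattern would correspond to a SPARQL query sent to the store's endpoint, and the report would be built from the query results.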
Evaluating Compliance Data
Before evaluating compliance, it is necessary to ensure that organisations maintain all the required information. This is explored by using SHACL to identify and validate the provenance information defined in a compliance graph. An online article explains this process in more detail.
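The kind of check this involves can be sketched in plain Python. The example mirrors a SHACL node shape that targets a class (`sh:targetClass`) and requires at least one value for certain properties (`sh:minCount 1`), but it is a simplified stand-in rather than a SHACL processor. The `ex:` terms, the `SHAPE` dictionary, and the `validate` function are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical shape: every ex:ConsentRecord needs each listed property.
SHAPE: Dict[str, object] = {
    "target_class": "ex:ConsentRecord",
    "required": ["ex:givenBy", "ex:forPurpose", "ex:givenAt"],
}


def validate(graph: List[Triple], shape: Dict[str, object]) -> List[str]:
    """Return one message per required property missing from a target node."""
    targets = sorted(
        s for s, p, o in graph if p == "rdf:type" and o == shape["target_class"]
    )
    violations: List[str] = []
    for node in targets:
        present = {p for s, p, _ in graph if s == node}
        for prop in shape["required"]:
            if prop not in present:
                violations.append(f"{node} is missing {prop}")
    return violations


graph: List[Triple] = [
    ("ex:c1", "rdf:type", "ex:ConsentRecord"),
    ("ex:c1", "ex:givenBy", "ex:u1"),
    ("ex:c1", "ex:forPurpose", "ex:Newsletter"),
    ("ex:c1", "ex:givenAt", "2018-05-25T10:00:00Z"),
    ("ex:c2", "rdf:type", "ex:ConsentRecord"),
    ("ex:c2", "ex:givenBy", "ex:u2"),
    ("ex:c2", "ex:forPurpose", "ex:Newsletter"),  # no ex:givenAt recorded
]

violations = validate(graph, SHAPE)
print(violations)
```

An empty result means the required information is present, which is the precondition for evaluating compliance itself.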
By using the metadata explored through our other work, it is possible to create a contextualised knowledge graph that can assist in the determination, documentation, and evaluation of compliance.
Currently, we are working on creating reports and documentation from the available metadata regarding the GDPR and compliance with it.