The General Data Protection Regulation (GDPR) is a European data protection regulation that significantly changes the way personal data and consent are collected, used, and shared. With fines of up to 20 million euros or 4% of global turnover, whichever is higher, compliance is an important activity that requires maintaining appropriate documentation of activities.
This project aims to utilise semantic web technologies to provide a common and cohesive framework for representing legislation like the GDPR and aiding compliance with it. This research provides integration of data management across different information systems, specifically adhering to the GDPR, and helps controllers demonstrate compliance.
Analysis of GDPR
We have carried out an analysis of GDPR towards understanding its structure, obligations, and implications. Our key areas of focus were to understand and quantify the terms and obligations, as well as to understand the nature of interoperability for entities affected by the GDPR.
GDPRtEXT (GDPR text extension) is the semantification of the GDPR into a linked data resource and ontology. It enables referencing articles and concepts or terms within the GDPR using RDF, RDFS, and OWL. Read more about GDPRtEXT. This work has been published as:
GDPRtEXT - GDPR as a Linked Data Resource. Harshvardhan J. Pandit, Kaniz Fatema, Declan O'Sullivan, Dave Lewis. 15th European Semantic Web Conference (ESWC), Resource Track. Crete, Heraklion, Greece. 2018 proceedings PDF; alternate PDF
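As a rough illustration of what referencing an article as linked data can look like, the sketch below stores a few triples as plain Python tuples and looks up the concepts tied to one article. The `EX` namespace, the `Article7`, `Article`, `Concept`, and `Consent` identifiers, and the `involves` property are hypothetical placeholders and not actual GDPRtEXT terms; only the `rdf:type` and `rdfs:label` IRIs are real RDF/RDFS vocabulary.

```python
# Minimal sketch: GDPR articles and concepts as subject-predicate-object triples.
# The EX namespace and every term under it are hypothetical placeholders,
# not the actual GDPRtEXT IRIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/gdpr#"

triples = {
    (EX + "Article7", RDF_TYPE, EX + "Article"),
    (EX + "Article7", RDFS_LABEL, "Conditions for consent"),
    (EX + "Article7", EX + "involves", EX + "Consent"),
    (EX + "Consent", RDF_TYPE, EX + "Concept"),
}


def objects(subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Look up the label of the article and the concepts it is linked to.
article_label = objects(EX + "Article7", RDFS_LABEL)
article_concepts = objects(EX + "Article7", EX + "involves")
```

Because every article and concept has its own identifier, other datasets can point at a specific part of the regulation instead of quoting its text.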
We explore the interoperability of information between the various entities mentioned within the GDPR. We identify the procedures it outlines for information flows, some of which contain explicit requirements such as the presence of structured data or the use of specific data formats. We also discuss existing standards by evaluating the state of the art with respect to the standards provided by the World Wide Web Consortium (W3C) for representing information. Read more about GDPR Interoperability Model. This work has been published as:
GDPR Data Interoperability Model. Harshvardhan J. Pandit, Declan O'Sullivan, Dave Lewis. 23rd EURAS Annual Standardisation Conference, Dublin, Ireland. view preprint; PDF
CDMM Model and Consent Ontology
Alongside the information flows studied from the point of view of interoperability, we have also formalised a model for consent and data management. The model describes the various stages of data management along with the entities involved and their responsibilities. This includes preliminary work on representing consent as an ontology. Read more about consent ontology. This work has been published as:
Compliance through Informed Consent: Semantic Based Consent Permission and Data Management Model. Kaniz Fatema, Ensar Hadziselimovic, Harshvardhan J. Pandit, Dave Lewis. Society, Privacy and the Semantic Web - Policy and Technology (PrivOn), co-located with ISWC 2017. proceedings; online published (PDF); alternate PDF download
GDPRivacy - GDPR Privacy and Consent Ontology
This work explores how consent can be represented programmatically as machine-readable data, based on the GDPR-related ontologies in this project. This is a work in progress. Read more about GDPRivacy.
Provenance forms an important aspect of metadata for GDPR compliance. Our approach is to capture this provenance metadata so that it can be used for modelling, documenting, and evaluating compliance.
To represent the various steps and activities regarding the GDPR, we extended the PROV-O and P-Plan ontologies to create the GDPRov ontology, which uses GDPR-specific terms and concepts to define the provenance of consent and data along with their relevant activities. Read more about GDPRov. This work has been published as:
Modelling provenance for GDPR compliance using linked open data vocabularies. Harshvardhan J. Pandit, Dave Lewis. Society, Privacy and the Semantic Web - Policy and Technology (PrivOn), co-located with ISWC 2017. proceedings; online published (PDF); alternate PDF download
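To make the combination of PROV-O and P-Plan more concrete, the sketch below separates a plan (what is supposed to happen) from an execution trace (what actually happened) and traces a consent record back to the plan step that produced it. The `prov:` and `p-plan:` terms come from those two vocabularies; every `ex:` identifier is a hypothetical placeholder and not a GDPRov term, and the tuple-based store is an assumption made only to keep the example self-contained.

```python
# Sketch of plan-level and execution-level provenance as prefixed-name triples.
# prov: and p-plan: terms are from PROV-O and P-Plan; every ex: identifier is
# a hypothetical placeholder and not a GDPRov term.
triples = [
    # Plan level: what is supposed to happen.
    ("ex:SignupProcess", "rdf:type", "p-plan:Plan"),
    ("ex:ObtainConsentStep", "rdf:type", "p-plan:Step"),
    ("ex:ObtainConsentStep", "p-plan:isStepOfPlan", "ex:SignupProcess"),
    # Execution level: what actually happened.
    ("ex:obtainConsent_run42", "rdf:type", "p-plan:Activity"),
    ("ex:obtainConsent_run42", "p-plan:correspondsToStep", "ex:ObtainConsentStep"),
    ("ex:consentRecord_42", "rdf:type", "prov:Entity"),
    ("ex:consentRecord_42", "prov:wasGeneratedBy", "ex:obtainConsent_run42"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def steps_behind(entity):
    """Trace an entity to the plan steps whose executions generated it."""
    steps = []
    for _, _, activity in match(s=entity, p="prov:wasGeneratedBy"):
        steps += [o for _, _, o in match(s=activity, p="p-plan:correspondsToStep")]
    return steps


consent_steps = steps_behind("ex:consentRecord_42")
```

Keeping the plan and the execution trace as separate but linked layers is what lets a record of an actual event be compared against the process that was meant to produce it.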
Changes in Provenance
By representing provenance metadata, it is possible to assist in the identification of expected changes to data and consent, as well as to their associated activities. We explored this concept in the publication:
GDPR-driven Change Detection in Consent and Activity Metadata. Harshvardhan J. Pandit, Declan O'Sullivan, Dave Lewis. Managing the Evolution and Preservation of the Data Web (MEPDaW). Co-located with 15th European Semantic Web Conference (ESWC). Crete, Heraklion, Greece. 2018 view proceedings; PDF
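The general idea of detecting changes in consent metadata can be sketched as a comparison between two snapshots of the same record. This is a simplified stand-in and not the method from the publication; the field names and values are hypothetical and chosen only for illustration.

```python
# Sketch: detect changes between two snapshots of consent metadata.
# Field names and values are hypothetical, chosen only for illustration.
def diff_metadata(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    changed = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changed[key] = (old.get(key), new.get(key))
    return changed


consent_v1 = {"purpose": "newsletter", "status": "given", "third_party": None}
consent_v2 = {"purpose": "newsletter", "status": "withdrawn", "third_party": None}

# Only the fields whose values differ between the snapshots are reported.
changes = diff_metadata(consent_v1, consent_v2)
```

A detected change such as a withdrawn consent can then be followed through the provenance links to find the activities that relied on it.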
Under the GDPR, the various agreements between Controllers, Processors, and other relevant entities are important for ensuring compliance. We explore their semantification and usage based on the concept of smart agreements.
Data Protection Rights Language
Our exploration yielded the Data Protection Rights Language, which is based on the ODRL 2.0 template feature and allows rights to be tracked and propagated through entities. This work has been published as:
Linked Data Contracts to Support Data Protection and Data Ethics in the Sharing of Scientific Data. Ensar Hadziselimovic, Kaniz Fatema, Harshvardhan J. Pandit, Dave Lewis. Sharing of Scientific Data in Enabling Open Semantic Science (SemSci), co-located with ISWC 2017. proceedings; online published (PDF); alternate PDF download
We also explore how our work in modelling and representing GDPR-related concepts can assist in documenting and achieving compliance.
By using the metadata explored through our other work, it is possible to create a contextualised knowledge graph that can assist in the determination, documentation, and evaluation of compliance. This concept is explored in the publication:
Towards Knowledge-based Systems for GDPR Compliance. Harshvardhan J. Pandit, Declan O'Sullivan, Dave Lewis. Contextualised Knowledge Graphs (CKG), ISWC 2018 Workshop, Monterey, California, USA. 2018 preprint PDF
Evaluating GDPR Readiness
By exploiting the metadata provided by our other work, we evaluate GDPR readiness using the checklist provided by Ireland's Data Protection Commissioner's office. This is an application that uses SPARQL queries to retrieve information from a triple-store for the questions defined in the checklist. The application demonstrates the feasibility and usefulness of our approach. The application is available online, and has been published as:
Queryable Provenance Metadata For GDPR Compliance. Harshvardhan J. Pandit, Declan O'Sullivan, Dave Lewis. SEMANTiCS 2018 – 14th International Conference on Semantic Systems. view preprint
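The checklist approach described above can be sketched as a mapping from questions to queries over stored metadata. In the sketch below a small in-memory matcher stands in for a SPARQL endpoint, and the roughly equivalent SPARQL is given in a docstring. The question is a hypothetical example and is not quoted from the Data Protection Commissioner's checklist, and all `ex:` terms are placeholders.

```python
# Sketch: answer checklist-style questions from stored metadata.
# A real deployment would send SPARQL to a triple-store; here a tiny in-memory
# matcher stands in for the query engine. All ex: terms are hypothetical.
triples = [
    ("ex:NewsletterProcessing", "rdf:type", "ex:ProcessingActivity"),
    ("ex:NewsletterProcessing", "ex:hasLegalBasis", "ex:Consent"),
    ("ex:AnalyticsProcessing", "rdf:type", "ex:ProcessingActivity"),
]


def subjects(p, o=None):
    """Subjects having predicate p (and object o, if given)."""
    return {s for s, pp, oo in triples if pp == p and (o is None or oo == o)}


def activities_without_legal_basis():
    """Roughly: SELECT ?a WHERE { ?a a ex:ProcessingActivity .
    FILTER NOT EXISTS { ?a ex:hasLegalBasis ?b } }"""
    activities = subjects("rdf:type", "ex:ProcessingActivity")
    return sorted(activities - subjects("ex:hasLegalBasis"))


checklist = {
    "Is a legal basis recorded for every processing activity?":
        activities_without_legal_basis,
}

# Each question maps to the list of offending items; an empty list means "yes".
report = {question: check() for question, check in checklist.items()}
```

Returning the offending items instead of a bare yes or no keeps the report actionable, since it shows which records need attention.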
Evaluating Compliance Data
Before evaluating compliance, it is necessary to ensure that organisations maintain all the required information. This is explored by using SHACL to identify and validate the provenance information defined in a compliance graph. An online article explains this process in more detail. This work has been explored in the following publication:
Exploring GDPR Compliance Over Provenance Graphs Using SHACL. Harshvardhan J. Pandit, Declan O'Sullivan, Dave Lewis. SEMANTiCS 2018 – 14th International Conference on Semantic Systems, Vienna, Austria. 2018 preprint PDF
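The kind of check that SHACL performs here can be approximated with a short sketch that verifies required properties are present on each record before any compliance evaluation is attempted. It mimics only a minimum-count constraint (in SHACL, `sh:minCount 1`) and does not use a SHACL engine; the class and property names are hypothetical.

```python
# Sketch: check that required properties are present before evaluating
# compliance. This stands in for SHACL min-count constraints; the shape,
# class, and property names are hypothetical.
graph = {
    "ex:consent_1": {"rdf:type": ["ex:ConsentRecord"],
                     "ex:givenBy": ["ex:alice"],
                     "ex:forPurpose": ["ex:Newsletter"]},
    "ex:consent_2": {"rdf:type": ["ex:ConsentRecord"],
                     "ex:givenBy": ["ex:bob"]},
}

# For each target class, the properties that must have at least one value.
shapes = {"ex:ConsentRecord": ["ex:givenBy", "ex:forPurpose"]}


def validate(graph, shapes):
    """Return sorted (node, missing_property) pairs for unmet constraints."""
    violations = []
    for node, props in graph.items():
        for cls in props.get("rdf:type", []):
            for required in shapes.get(cls, []):
                if not props.get(required):
                    violations.append((node, required))
    return sorted(violations)


violations = validate(graph, shapes)
```

An empty result means the graph holds the information needed for evaluation; any entry points at a record that must be completed first.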