Validating requirements using SHACL

ADAPT Centre

We convert the gathered constraints into SHACL shapes and use them to validate the underlying data graph. The shapes must use the same representations as the data graph, i.e. the same ontologies and design patterns.

To distinguish constraints that are checked automatically on the data graph from those that are checked manually, we define the following classes:

:Constraint rdfs:subClassOf sh:NodeShape ;
  rdfs:label "Constraint" .
:AutomaticallyCheckedConstraint rdfs:subClassOf :Constraint, sh:NodeShape ;
  rdfs:label "Automatically Checked Constraint" .
:ManuallyCheckedConstraint rdfs:subClassOf :Constraint, sh:NodeShape ;
  rdfs:label "Manually Checked Constraint" .

To link a constraint with the GDPR, we define a property whose value is the relevant GDPR resource in GDPRtEXT:

:linkToGDPR a rdf:Property ;
  rdfs:range eli:LegalResourceSubdivision ;
  rdfs:label "linkToGDPR" .

We then define constraints using either property shapes or SPARQL queries, depending on the complexity required. For example, to check the requirement that consent is associated with exactly one data subject, we define a property shape as follows:

:ConsentHasDataSubject a sh:PropertyShape, :AutomaticallyCheckedConstraint ;
  sh:name "Consent --> Data Subject" ;
  :linkToGDPR gdpr:article4-11 ;
  sh:path gc:isConsentForDataSubject ;
  sh:minCount 1 ;
  sh:maxCount 1 ;
  sh:or ( [ sh:class gc:DataSubject ] [ sh:class gdprov:DataSubject ] ) ;
  sh:message "Consent should be linked to Data Subject" .
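A property shape such as :ConsentHasDataSubject only produces results once it is attached to a shape that selects focus nodes. The text does not show this wiring, so the following is a minimal sketch under stated assumptions: the node shape name :ConsentShape, the namespace bound to the empty prefix, and the ex: instances are all hypothetical, and the empty prefix has to match the one used in the actual shapes file.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix gc: <https://w3id.org/GConsent#> .
@prefix :   <http://example.org/shapes#> .  # hypothetical; must match the shapes file
@prefix ex: <http://example.org/data#> .    # hypothetical instance namespace

# Hypothetical node shape: applies the property shape to every gc:Consent instance.
:ConsentShape a sh:NodeShape ;
  sh:targetClass gc:Consent ;
  sh:property :ConsentHasDataSubject .

# Expected to conform: exactly one data subject of an accepted class.
ex:consent1 a gc:Consent ;
  gc:isConsentForDataSubject ex:alice .
ex:alice a gc:DataSubject .

# Expected to violate sh:minCount 1: no data subject at all.
ex:consent2 a gc:Consent .
```

With data like this, a validator would be expected to report a single violation for ex:consent2, carrying the sh:message of the property shape.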

To check whether a consent has a timestamp, we use a SPARQL constraint as follows:

:ConsentHasTimestamp a sh:SPARQLConstraint, :AutomaticallyCheckedConstraint ;
  sh:name "Consent --> Timestamp" ;
  sh:prefixes :Shape ;
  sh:select """
    SELECT $this WHERE {
            FILTER NOT EXISTS { $this gc:atTime ?time } .
            FILTER NOT EXISTS { $this prov:generatedAtTime ?time } .
            FILTER NOT EXISTS { $this a gdprov:ConsentAgreementTemplate } .
    }
    """ ;
  sh:message "Consent should have a timestamp" .
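In SHACL, a SPARQL constraint is attached to a shape through sh:sparql, and the query is evaluated once per focus node with $this pre-bound; every row it returns is reported as a violation. The sketch below illustrates that wiring. The shape name :ConsentTimestampShape, the empty-prefix namespace, and the ex: instances are assumptions made for illustration and do not come from the source.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix gc:  <https://w3id.org/GConsent#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix :    <http://example.org/shapes#> .  # hypothetical; must match the shapes file
@prefix ex:  <http://example.org/data#> .    # hypothetical instance namespace

# Hypothetical node shape: runs the SPARQL constraint for every gc:Consent instance.
:ConsentTimestampShape a sh:NodeShape ;
  sh:targetClass gc:Consent ;
  sh:sparql :ConsentHasTimestamp .

# The first FILTER NOT EXISTS fails, so the query returns no row: no violation.
ex:consent1 a gc:Consent ;
  gc:atTime "2018-05-25T10:00:00Z"^^xsd:dateTime .

# All three filters pass, so the query returns $this: one violation.
ex:consent2 a gc:Consent .
```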

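The same check can also be expressed without SPARQL by combining two property shapes with sh:or; which form to use is then a matter of preference. Because the outer shape has no sh:path of its own, it is declared as a node shape. Unlike the SPARQL version, this form does not exempt instances of gdprov:ConsentAgreementTemplate. The shape would be as follows:

:ConsentHasTimestamp a sh:NodeShape ;
  sh:or (
    [ sh:path gc:atTime ; sh:minCount 1 ; sh:maxCount 1 ]
    [ sh:path prov:generatedAtTime ; sh:minCount 1 ; sh:maxCount 1 ]
  ) .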

For the manually checked constraints, we define a class ManualTest and associate it with properties that record the outcome of each manual validation as a boolean value. We then define a SHACL shape that checks this boolean value, which stands in for the validation of the requirement. For example, whether consent is freely given is tested as follows:

:ValidconsentIsFreelyGiven a sh:PropertyShape, :ManuallyCheckedConstraint ;
  # R42 freely given - Consent should not be regarded as freely given if the data subject has no genuine or free choice or is unable to refuse or withdraw consent without detriment.
  :linkToGDPR gdpr:article4-11 ;
  sh:name "Consent == Freely Given" ;
  sh:path m:consentIsFreelyGiven ;
  sh:hasValue true ;
  sh:message "(MANUAL-TEST) Consent should be freely given" .
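To make the manual-test pattern concrete, the following sketch shows what the recorded outcomes might look like in the data graph. It rests on several assumptions that are not stated in the source: the namespace bound to m:, the class IRI m:ManualTest, the node shape :ManualTestShape, and the ex: instances are hypothetical.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix m:  <http://example.org/manual#> .  # hypothetical namespace for manual tests
@prefix :   <http://example.org/shapes#> .  # hypothetical; must match the shapes file
@prefix ex: <http://example.org/data#> .    # hypothetical instance namespace

# Hypothetical node shape: applies the manual-test property shape to every ManualTest.
:ManualTestShape a sh:NodeShape ;
  sh:targetClass m:ManualTest ;
  sh:property :ValidconsentIsFreelyGiven .

# A reviewer recorded that the requirement holds: sh:hasValue true is satisfied.
ex:manualTest1 a m:ManualTest ;
  m:consentIsFreelyGiven true .

# A reviewer recorded a failure: sh:hasValue true is not satisfied, so a violation is expected.
ex:manualTest2 a m:ManualTest ;
  m:consentIsFreelyGiven false .
```

A ManualTest instance with no m:consentIsFreelyGiven value at all would also fail, since sh:hasValue requires the value to be present.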

To associate prefixes with the SPARQL queries within the SHACL shapes, we define an ontology at the base that declares the required prefixes using sh:declare, and reference it from the SPARQL constraints using sh:prefixes as follows:

:Shape a owl:Ontology ;
  sh:declare [ sh:prefix "rdfs"; sh:namespace "http://www.w3.org/2000/01/rdf-schema#"^^xsd:anyURI ; ] ;
  sh:declare [ sh:prefix "gc"; sh:namespace "https://w3id.org/GConsent#"^^xsd:anyURI ; ] ;
  sh:declare [ sh:prefix "gdprov"; sh:namespace "https://w3id.org/GDPRov#"^^xsd:anyURI ; ] ;
  sh:declare [ sh:prefix "gdprtext"; sh:namespace "https://w3id.org/GDPRtEXT#"^^xsd:anyURI ; ] ;
  sh:declare [ sh:prefix "p-plan"; sh:namespace "http://purl.org/net/p-plan#"^^xsd:anyURI ; ] ;
  sh:declare [ sh:prefix "prov"; sh:namespace "http://www.w3.org/ns/prov#"^^xsd:anyURI ; ] ;
  rdfs:label "Shape declarations" .
:CheckHandleDataBreachConstraint a sh:SPARQLConstraint ;
  :linkToGDPR gdpr:article33 ;
  :linkToGDPR gdpr:article34 ;
  sh:name "Handle Data Breach" ;
  sh:prefixes :Shape ;
  sh:select """
      SELECT $this {
          FILTER NOT EXISTS { ?X a gdprov:HandleDataBreachProcess }
      }
  """ .

To check whether the model of consent has been found compliant, we use sh:ValidationReport itself as the value of the sh:targetClass property, so that the shape is validated against the validation report produced for the consent model.

:ConsentModelConstraints a sh:NodeShape ;
  sh:targetClass sh:ValidationReport ;
  sh:property :ValidationReportConforms ;
  rdfs:label "Given Consent following Consent Model constraints" .
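The property shape :ValidationReportConforms is referenced here but not shown. One plausible definition, offered only as an assumption, checks that the report's sh:conforms value is true. In the sketch below, the body of the shape, the empty-prefix namespace, and the ex:modelReport instance are hypothetical.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix :   <http://example.org/shapes#> .  # hypothetical; must match the shapes file
@prefix ex: <http://example.org/data#> .    # hypothetical instance namespace

# Assumed definition: the report for the consent model must state sh:conforms true.
:ValidationReportConforms a sh:PropertyShape ;
  sh:path sh:conforms ;
  sh:hasValue true ;
  sh:message "Consent model should conform to its constraints" .

# A report produced by validating the consent model, loaded into the data graph.
ex:modelReport a sh:ValidationReport ;
  sh:conforms true .
```

Under this assumption, the consent model's validation report has to be part of the data graph when the constraints on given consent are run, which fits validating the model first and the instances afterwards.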

We divide the constraints into three parts as follows:

  1. Part A: constraints related to the model of the system
  2. Part B: constraints related to instances of given consent
  3. Part C: constraints related to model of consent + instances of given consent

Part A tests requirements such as the presence of a DPO and of procedures to handle the various rights. Part B checks for requirements directly associated with an instance of given consent; these constraints have to be executed for every instance of given consent. Part C splits the requirements from Part B into two groups: one common to all consent and validated against a 'model' or 'template' of consent, and the other validated against each instance of given consent. Because most constraints are moved to the model and only need to be checked once, this makes the validation of given consent more efficient.

To execute the SHACL constraints, we used the TopBraid SHACL binary and a bash script. The script executed the constraints of each part in the correct order and persisted the results to files.