[!TIP] When in doubt regarding the intended meaning of a certain term, refer to the Glossary.
The Data Veracity Assurance building block (DVA from now on) allows data exchange participants to agree on and later prove/verify quality requirements or properties of the exchanged data.
For example, if a data producer (abbreviated P from now on) provides simple sensor data to a data consumer (C from now on), DVA can help P prove (or at least claim) and C verify that the provided data is credible (e.g., temperature values are within a certain range, say in the interval (-100 °C, +50 °C)).
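
The range check from this example can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `in_open_range` and `all_credible` are hypothetical and are not part of any DVA API, and the bounds are simply the ones quoted above.

```python
from typing import Iterable

# Bounds taken from the example above: the open interval (-100 °C, +50 °C).
LOWER_C = -100.0
UPPER_C = 50.0


def in_open_range(value: float, lower: float = LOWER_C, upper: float = UPPER_C) -> bool:
    """Return True if value lies strictly between lower and upper."""
    return lower < value < upper


def all_credible(readings: Iterable[float]) -> bool:
    """Return True only if every reading passes the range check."""
    return all(in_open_range(r) for r in readings)
```

P could run such a check before attesting, and C could run the same check on receipt; which party does so is exactly what the agreement described next is meant to settle.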
DVA requires a veracity level agreement (VLA) between the exchange participants. This agreement is part of the contract and targets a specific data exchange unit (instance). The VLA defines a number of veracity objectives that each describe a data quality aspect (e.g., completeness or accuracy) and an evaluation scheme (e.g., value is within a numerical range). The VLA also defines how the evaluation is to be performed (e.g., with a certain algorithm or software library). When the data exchange occurs, in the simplest model, P attaches an attestation (or even a proof) regarding the exchanged data’s quality that C can verify and trust.
The high-level concepts of the DVA BB have been summarized in the knowledge graph below. The second graph visualizes a concrete example of using DVA in a use case where xAPI training data is exchanged.
---
title: High-Level Data Veracity Concepts (Knowledge Graph / Metamodel)
---
graph TD
xchg(["Data\n Exchange"]):::External
va(["Veracity\n Assurance"]):::Assurance
aov(["Attestation\n of Veracity"]):::Assurance
pov(["Proof\n of Veracity"]):::Assurance
voe(["Veracity Objective Evaluation"]):::Assurance
eval(["Evaluation"]):::Assurance
vla(["Veracity\n Level\n Agreement"]):::Agreement
vo(["Veracity\n Objective"]):::Agreement
qa(["Quality\n Aspect"]):::Agreement
es(["Evaluation\n Scheme"]):::Agreement
crit(["Criterion\n Type"]):::Agreement
method(["Evaluation\n Method"]):::Agreement
syntax(["Syntax\n (ISO 8000)"]):::Aspect
timeliness(["Timeliness\n (ISO 25000)"]):::Aspect
accuracy(["Accuracy\n (ISO 25000)"]):::Aspect
completeness(["Completeness\n (ISO 25000)"]):::Aspect
consistency(["Consistency\n (ISO 25000)"]):::Aspect
validinvalid(["Valid/\n Invalid"]):::Agreement
inrange(["In\n Range"]):::Agreement
greaterless(["Greater Than\n Less Than"]):::Agreement
vla-- targets exchange -->xchg
vla-- has objective -->vo
vo-- targets aspect -->qa
vo-- can be evaluated using -->es
es-- has type -->crit
es-- has method -->method
syntax & timeliness & accuracy & completeness & consistency-- is a -->qa
validinvalid & inrange & greaterless-- is a -->crit
va-- for agreement -->vla
aov & pov-- is a -->va
va-- has evaluation -->voe
voe-- targets objective -->vo
voe-- has evaluation -->eval
classDef Agreement fill:#fcdc00,stroke:#000,color:#000
classDef Aspect fill:#fb4b00,stroke:#000,color:#000
classDef External fill:#73d8ff,color:#000
classDef Assurance fill:#a4dd00,stroke:#000,color:#000
linkStyle default stroke-width:4px
---
title: Data Veracity Concepts Example (xAPI Learning Traces)
---
graph LR
xchg(["xAPI Learning\n Traces Exchange"]):::External
aov(["Attestation\n of Veracity"]):::Assurance
voe_syn(["Syntax\n Evaluation"]):::Assurance
voe_rec(["Recency\n Evaluation"]):::Assurance
eval_syn(["Valid"]):::Assurance
eval_rec(["3 Days\n Old"]):::Assurance
vla(["xAPI Learning Trace\n Veracity Level Agreement"]):::Agreement
vo_syn(["Valid\n Syntax"]):::Agreement
vo_rec(["Recency"]):::Agreement
qa_syn(["Syntax"]):::Aspect
qa_rec(["Timeliness"]):::Aspect
es_syn(["Syntax\n Checking"]):::Agreement
es_rec(["Timeliness\n Checking"]):::Agreement
crit_syn(["Valid/\n Invalid"]):::Agreement
crit_rec(["Greater Than\nLess Than"]):::Agreement
method_syn(["Syntax\n Checker"]):::Agreement
method_rec(["Value\n Comparison"]):::Agreement
vla-- targets exchange -->xchg
vla-- has objective -->vo_syn & vo_rec
vo_syn-- targets aspect -->qa_syn
vo_rec-- targets aspect -->qa_rec
vo_syn-- can be evaluated using -->es_syn
vo_rec-- can be evaluated using -->es_rec
es_syn-- has type -->crit_syn
es_rec-- has type -->crit_rec
es_syn-- has method -->method_syn
es_rec-- has method -->method_rec
aov-- for agreement -->vla
aov-- has evaluation -->voe_syn & voe_rec
voe_syn-- has evaluation -->eval_syn
voe_rec-- has evaluation -->eval_rec
voe_syn-- targets objective --->vo_syn
voe_rec-- targets objective --->vo_rec
classDef Agreement fill:#fcdc00,stroke:#000,color:#000
classDef Aspect fill:#fb4b00,stroke:#000,color:#000
classDef External fill:#73d8ff,color:#000
classDef Assurance fill:#a4dd00,stroke:#000,color:#000
linkStyle default stroke-width:4px
Key functionalities:

- Manage data veracity level agreements (VLAs)
- Provide means to… the veracity of exchanged data
- Log veracity verification results

Optional functionalities:
The technical usage scenarios have been summarized in the following UML use case diagram.
… `NaN` values a dataset contains and how to count them) for inclusion in the VLAs of new contracts.

VLAs describe exactly what data quality P ‘promises’ and/or C expects. The format and exact contents of VLAs are further detailed later in this document.
While VLAs are struck and primarily managed by the Contract Manager, DVA supports the process by managing VLA templates. The dataspace orchestrator is authorized to select the set of templates whose usage is allowed in the dataspace.
DVA of course also provides the means for P to prove (or attest) and C to verify that the exchanged data fulfils the requirements set by the VLA; see below.
We approach veracity compliance assurance as a challenge at the intersection of technology and trust.
There are chiefly two ways P can offer veracity assurance regarding the exchanged data:
In some cases, VLAs do not need to be supported by an explicit AoV/PoV at all: the VLA serves as a kind of a ‘data contract’ where C takes on the responsibility of checking compliance on receiving data.
The primary deficiency of this ‘trust, but verify’ model is that C may not be willing, or even able, to (fully) check compliance with a VLA. Attestations of veracity provide a trust-based solution to establish compliance without consumer-side checking.
We distinguish two major categories of attestations:
Proofs of veracity (PoVs), on the other hand, establish compliance through cryptographic rather than trust-based approaches, when this is required and feasible. Such proofs are sound, meaning that a cheating P cannot forge a PoV for a piece of data that does not adhere to the VLA’s requirements. (Mathematically and succinctly) verifiable zero-knowledge and non-zero-knowledge proofs on data have been an emerging field of mathematics in the last two decades, with increasingly rapid development in the last few years. However, as algorithms, standards, software frameworks, and use cases are still evolving, the DVA building block will provide a highly extensible framework for PoVs, driven by the use cases of the project.
DVA defines what proofs and attestations are (see later in this document) and provides means to generate PoVs and AoVs and to verify veracity.
DVA also keeps track of veracity verification results for traceability and auditing purposes.
- [BB_08__01] DVA MUST define schemata for VLAs
- [BB_08__02] DVA MUST provide VLA templates
- [BB_08__03] DVA SHOULD support editing available VLA templates
- [BB_08__04] DVA MUST support striking VLAs
- [BB_08__05] DVA MUST provide multiple veracity assurance methods
- [BB_08__06] DVA MUST support veracity attestation (i.e., either P or a third party attests that veracity requirements are met)
- [BB_08__07] DVA SHOULD support veracity self-attestation
- [BB_08__08] DVA SHOULD support third-party veracity attestation
- [BB_08__09] DVA SHOULD support provider-proven veracity
- [BB_08__10] DVA SHOULD support consumer-verified veracity
- [BB_08__11] DVA MUST interface with the Contract Manager service
- [BB_08__12] DVA MUST interface with the Dataspace Connector
- [BB_08__13] DVA MUST log verification results

---
title: DVA Requirements
---
requirementDiagram
requirement BB_08__01 {
id: BB_08__01
text: "DVA MUST define schemata for VLAs"
risk: medium
verifymethod: demonstration
}
requirement BB_08__02 {
id: BB_08__02
text: "DVA MUST provide VLA templates"
risk: medium
verifymethod: demonstration
}
requirement BB_08__03 {
id: BB_08__03
text: "DVA SHOULD support editing available VLA templates"
risk: low
verifymethod: demonstration
}
functionalRequirement BB_08__04 {
id: BB_08__04
text: "DVA MUST support striking VLAs"
risk: medium
verifymethod: test
}
functionalRequirement BB_08__05 {
id: BB_08__05
text: "DVA MUST provide multiple veracity assurance methods"
risk: low
verifymethod: demonstration
}
functionalRequirement BB_08__06 {
id: BB_08__06
text: "DVA MUST support veracity attestation"
risk: low
verifymethod: demonstration
}
functionalRequirement BB_08__07 {
id: BB_08__07
text: "DVA SHOULD support veracity self-attestation"
risk: low
verifymethod: demonstration
}
functionalRequirement BB_08__08 {
id: BB_08__08
text: "DVA SHOULD support third-party veracity attestation"
risk: low
verifymethod: demonstration
}
functionalRequirement BB_08__09 {
id: BB_08__09
text: "DVA SHOULD support provider-proven veracity"
risk: medium
verifymethod: demonstration
}
functionalRequirement BB_08__10 {
id: BB_08__10
text: "DVA SHOULD support consumer-verified veracity"
risk: medium
verifymethod: demonstration
}
interfaceRequirement BB_08__11 {
id: BB_08__11
text: "DVA MUST interface with the Contract Manager service"
risk: medium
verifymethod: test
}
interfaceRequirement BB_08__12 {
id: BB_08__12
text: "DVA MUST interface with the Dataspace Connector"
risk: medium
verifymethod: test
}
functionalRequirement BB_08__13 {
id: BB_08__13
text: "DVA MUST log verification results"
risk: medium
verifymethod: test
}
BB_08__02 - refines -> BB_08__01
BB_08__03 - refines -> BB_08__01
BB_08__06 - refines -> BB_08__05
BB_08__07 - refines -> BB_08__06
BB_08__08 - refines -> BB_08__06
BB_08__09 - refines -> BB_08__05
BB_08__10 - refines -> BB_08__05
No direct integrations.
There are ISO standards that define data-quality-related concepts:
Other possibly relevant standards and specifications:
PoVs and AoVs are planned to be manifested as W3C verifiable credentials (VCs):
DSSC: see the Value-Added Services building block.
IDS RAM: see 4.3.6 Data Quality in the Governance Perspective.
[!NOTE] The precise language of VLAs is still being worked out. This should not be a concern to other components such as the Contract Manager at this point, as VLAs are expected to be embedded into the contracts. Take, for example, the Bilateral Contract example: the mockup VLA could be added to this contract under an additional `vla` key (with some minor modifications and after converting the YAML to JSON, of course).
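
As a rough illustration of the embedding described in this note, the sketch below attaches a VLA to a contract under a `vla` key and serializes the result as JSON. The contract fields shown are placeholders (the real Bilateral Contract has its own schema), the VLA is assumed to be already parsed from YAML into a dictionary, and `embed_vla` is a hypothetical helper name.

```python
import json

# Placeholder contract; the fields of the real Bilateral Contract are not reproduced here.
contract = {"id": "contract-98765", "status": "signed"}

# Mockup VLA as a plain dict, i.e. what a YAML parser would return (trimmed).
vla = {
    "id": "urn:vla:example:vrtraces",
    "exchange": "cdef77c9-4016-45bb-868d-6f014e17ed2d",
    "objectives": [
        {
            "name": "xapi_syntax",
            "aspect": "syntax",
            "evaluation": {
                "method": {"id": "syntax_check", "args": {"checker": "xapi"}},
                "type": "valid_invalid",
            },
        }
    ],
}


def embed_vla(contract: dict, vla: dict) -> dict:
    """Return a copy of the contract with the VLA stored under the `vla` key."""
    extended = dict(contract)  # shallow copy, the original contract is left untouched
    extended["vla"] = vla
    return extended


contract_json = json.dumps(embed_vla(contract, vla), indent=2)
```

Because the VLA language is still being worked out, the nesting of `method`, `args`, and `type` shown here is an assumption rather than a fixed schema.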
Initial mockup VLAs based on data contracts:
---
id: urn:vla:example:vrtraces
meta:
  title: VR Learning Traces VLA Example
  version: 0.1.0
  description: |
    A simple Veracity Level Agreement (VLA) example based on the
    VR Learning Traces building block.
exchange: cdef77c9-4016-45bb-868d-6f014e17ed2d
models:
  trace:
    description: A VR learning trace
    type: xapi
    xapi_extensions:
      - http://example.com/exercises/b9e16535-4fc9-4c66-ac87-3ad7ce515f5c/sensors/score
objectives:
  - name: xapi_syntax
    description: Data is a valid xAPI JSON file
    aspect: syntax
    evaluation:
      method:
        id: syntax_check
        args:
          checker: xapi
      type: valid_invalid
  - name: 1w_freshness
    description: Learning trace is not too old
    aspect: timeliness
    evaluation:
      method:
        id: timestamp_comparison
        args:
          timestamp: xapi_timestamp
          within: 1w
      type: in_range
  - name: new_user
    description: No data has been supplied about this actor in the past
    aspect: uniqueness
    evaluation:
      method:
        id: uniqueness_check
        args:
          target: actor.id
      type: valid_invalid
---
id: urn:vla:example:moodle
meta:
  title: Moodle Learning Traces VLA Example
  version: 0.1.0
  description: |
    A simple Veracity Level Agreement (VLA) example for Moodle-like xAPI
    data
exchange: bb54352d-3da4-4b6d-a4db-3639003f5f99
models:
  trace:
    description: xAPI trace
    type: xapi
objectives:
  - name: is_dases
    description: Trace is within the subset defined by Gaia-X DaSES
    aspect: schema
    evaluation:
      method:
        id: xapi_schema_dases
      type: valid_invalid
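
To make the mockups more concrete, the following sketch shows one way the `timestamp_comparison` and `uniqueness_check` objectives might be evaluated against a single trace. Everything here is an assumption made for illustration: `method` and `type` are taken to be siblings under `evaluation`, `xapi_timestamp` is taken to mean the statement's `timestamp` field, the `within` shorthand is interpreted as hours, days, or weeks, and the trace is a simplified dictionary rather than a full xAPI statement. None of the function names come from the DVA implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumed meaning of the `within` shorthand used in the mockup VLA (e.g. "1w").
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_within(spec: str) -> timedelta:
    """Turn a shorthand such as '1w' or '3d' into a timedelta."""
    amount, unit = int(spec[:-1]), spec[-1]
    return timedelta(**{_UNITS[unit]: amount})


def lookup(obj: dict, dotted_path: str):
    """Resolve a dotted path such as 'actor.id' in nested dicts."""
    for key in dotted_path.split("."):
        obj = obj[key]
    return obj


def timestamp_comparison(trace: dict, args: dict, now: datetime) -> bool:
    """True if the trace is not from the future and no older than `within`."""
    age = now - datetime.fromisoformat(trace["timestamp"])
    return timedelta(0) <= age <= parse_within(args["within"])


def uniqueness_check(trace: dict, args: dict, seen: set) -> bool:
    """True if the targeted value has not been seen before."""
    return lookup(trace, args["target"]) not in seen


def evaluate(objective: dict, trace: dict, now: datetime, seen: set) -> bool:
    """Dispatch on the objective's evaluation method id."""
    method = objective["evaluation"]["method"]
    args = method.get("args", {})
    if method["id"] == "timestamp_comparison":
        return timestamp_comparison(trace, args, now)
    if method["id"] == "uniqueness_check":
        return uniqueness_check(trace, args, seen)
    raise ValueError(f"unsupported evaluation method: {method['id']}")
```

Timestamps are expected to carry an explicit UTC offset (for example `2025-01-09T12:00:00+00:00`) so that they can be compared with a timezone-aware `now`.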
AoVs (and PoVs) will manifest as verifiable credentials. The information graph that summarizes the contents of these credentials can be seen below.
---
title: Information Graph of an AoV Verifiable Credential
---
graph TD
vc(["(AoV) Credential Instance"]):::Main
id[Credential ID #123456789]:::Optional
type([Attestation of Veracity]):::Required
validFrom[2025-01-12T12:31:33Z]:::Optional
subj([Data Exchange Instance]):::Required
issuer([Example Org]):::Required
subjId[Data Exchange ID #ABCD1234]:::Optional
subjContract[Contract ID #98765]:::Custom
subjEval1[Evaluation of Objective ID #AAA]:::Custom
subjEval2[Evaluation of Objective ID #AAB]:::Custom
vc-- id -->id
vc-- type -->type
vc-- validFrom -->validFrom
vc-- credentialSubject -->subj
vc-- issuer -->issuer
subj-- id -->subjId
subj-- contractId -->subjContract
subj-- evaluations -->subjEval1 & subjEval2
classDef Main fill:#fff,stroke:#000,color:#000
classDef Required fill:#0fa,stroke:#000,color:#000
classDef Optional fill:#7d7,stroke:#000,color:#000
classDef Custom fill:#ffa,stroke:#000,color:#000
linkStyle default stroke-width:4px
---
title: Information Graph of a PoV Verifiable Credential
---
graph TD
vc(["(PoV) Credential Instance"]):::Main
id[Credential ID #123456789]:::Optional
type([Proof of Veracity]):::Required
validFrom[2025-01-12T12:31:33Z]:::Optional
subj([Data Exchange Instance]):::Required
issuer([Example Org]):::Required
subjId[Data Exchange ID #ABCD1234]:::Optional
subjContract[Contract ID #98765]:::Custom
proof[Proof]:::Custom
vc-- id -->id
vc-- type -->type
vc-- validFrom -->validFrom
vc-- credentialSubject -->subj
vc-- issuer -->issuer
subj-- id -->subjId
subj-- contractId -->subjContract
subj-- proof -->proof
classDef Main fill:#fff,stroke:#000,color:#000
classDef Required fill:#0fa,stroke:#000,color:#000
classDef Optional fill:#7d7,stroke:#000,color:#000
classDef Custom fill:#ffa,stroke:#000,color:#000
linkStyle default stroke-width:4px
For AoVs, specifying concrete evaluation results is optional. The important elements of an attestation are its issuer and the identifiers of the relevant data exchange (and contract).
For PoVs, the proof is a crucial element of the credential.
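
The AoV information graph can be read as the outline of a credential document. The sketch below assembles such a document as a plain dictionary, assuming the W3C Verifiable Credentials Data Model 2.0 base context. The type name `AttestationOfVeracity`, the helper `build_aov`, and the example identifiers are hypothetical, and the result is unsigned: securing the credential is out of scope here.

```python
from datetime import datetime, timezone
from typing import Optional


def build_aov(
    issuer: str,
    exchange_id: str,
    contract_id: str,
    evaluations: Optional[list] = None,
    credential_id: Optional[str] = None,
    valid_from: Optional[datetime] = None,
) -> dict:
    """Assemble an unsigned AoV credential following the information graph."""
    credential = {
        "@context": ["https://www.w3.org/ns/credentials/v2"],
        # "AttestationOfVeracity" is an assumed type name, not a registered one.
        "type": ["VerifiableCredential", "AttestationOfVeracity"],
        "issuer": issuer,
        "credentialSubject": {"id": exchange_id, "contractId": contract_id},
    }
    if credential_id is not None:
        credential["id"] = credential_id
    if valid_from is not None:
        # Expects a timezone-aware datetime; rendered in UTC.
        utc = valid_from.astimezone(timezone.utc)
        credential["validFrom"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    if evaluations:
        # Concrete evaluation results are optional for AoVs.
        credential["credentialSubject"]["evaluations"] = list(evaluations)
    return credential
```

A PoV credential would follow the same outline, with the `evaluations` entry replaced by the proof itself.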
---
title: Data Veracity Assurance High-Level Architecture
---
graph LR
apip>"fa:fa-plug\n Data Provider API"]:::API
apic>"fa:fa-plug\n Data Consumer API"]:::API
apio>"fa:fa-plug\n Orchestrator API"]:::API
apim>"fa:fa-plug\n Contract API"]:::API
att["fa:fa-stamp\n Attestation Component"]:::Component
attloc["Local Attestation"]:::Misc
attext["External Attestation"]:::Misc
vla["fa:fa-file\n VLA Component"]:::Component
prov["fa:fa-file-circle-check\n Proving Component"]:::Component
verif["fa:fa-check-double\n Verification Component"]:::Component
gen["Built-in Proof Generator"]:::Misc
gen_ext["External Proof Generator"]:::Misc
ver["Proof Verifier"]:::Misc
apio -- manage templates -->vla
apim -- get templates -->vla
apip -- create AoV --> att
apip -- create PoV --> prov
apic -- verify AoV --> att
apic -- verify PoV --> prov
apic -- check data compliance --> verif
att --> attloc & attext
prov --> gen & gen_ext & ver
classDef default color:#000
classDef API fill:lightgreen
classDef Controller fill:cyan
classDef Component fill:orange
classDef Misc fill:green
The sequence diagrams below describe possible DVA additions to the basic Connector flows.
---
title: Data Exchange with Attestation or Proof of Veracity (AoV/PoV)
---
sequenceDiagram
participant c as Consumer PDC
box rgba(50, 100, 20, .5) Data Provider
participant p as Provider PDC
participant dva as Provider DVA
end
participant ctr as Contract Manager
box rgba(150, 50, 50, .5) 3rd Party DVA Organization
participant pdc3 as PDC X
participant dva3 as 3rd Party DVA
end
box rgba(100, 100, 130, .5) Organization/Individual A
participant pdca as PDC A
participant dvaa as DVA A
participant svca as Service A
end
box rgba(100, 100, 130, .5) Organization/Individual B
participant pdcb as PDC B
participant dvab as DVA B
participant svcb as Service B
end
c ->> ctr : Request data processing chain
ctr --) c: Return processing sequence
c -) p: Initiate data transaction
alt self-attestation or self-generated proof
p ->> dva: Create self-AoV or Generate PoV
dva --) p: Return AoV/PoV
else third-party attestation or proving
p ->> pdc3: Request AoV/PoV
pdc3 ->> dva3: Create AoV/PoV
dva3 --) pdc3: Return AoV/PoV
pdc3 --) p: Return AoV/PoV
end
p -) pdca: Send raw data (+ AoV/PoV) for processing
pdca -) dvaa: Verify AoV/PoV
pdca ->> svca: Process data
svca --) pdca: Return processed data
pdca --) c: Notify progress
pdca -) pdcb: Send data for next processing
pdcb -) dvab: Verify AoV/PoV
pdcb ->> svcb: Process data
svcb --) pdcb: Return processed data
pdcb -) c: Notify progress
pdcb --) c: Send final processed data
The data space orchestrator may configure some basic aspects of DVA, such as…
The main potential error scenarios of DVA are caused by not being able to access the data for which AoVs or PoVs should be generated or verified and by possible limitations in resources required to generate AoVs and PoVs.
To be able to generate an AoV or a PoV, DVA needs access to the data under assessment. It may also be impossible to analyze incomplete or corrupted data properly. To be valid and reliable, AoVs and PoVs must ‘prove’ that they have been created based on the right data (this can be most simply accomplished by ‘committing’ them to a checksum).
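
A minimal sketch of such a commitment is shown below, assuming SHA-256 as the checksum. The `dataDigest` field name and the helper functions are hypothetical and do not come from the DVA specification.

```python
import hashlib


def data_digest(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the exchanged data."""
    return hashlib.sha256(data).hexdigest()


def commit(attestation: dict, data: bytes) -> dict:
    """Return a copy of the attestation that records the digest of the data."""
    committed = dict(attestation)
    # `dataDigest` is an illustrative field name, not a standardized one.
    committed["dataDigest"] = {"algorithm": "sha256", "value": data_digest(data)}
    return committed


def matches(attestation: dict, data: bytes) -> bool:
    """Check that the attestation was committed to exactly this data."""
    recorded = attestation.get("dataDigest", {}).get("value")
    return recorded == data_digest(data)
```

The digest only binds the attestation to the data if the attestation itself is protected against tampering, for instance by the signature of the verifiable credential that carries it.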
Access to the data may be necessary not only for generation by P but also for verification by C. This is only relevant in the case of PoVs, which are verifiable proofs that a given piece of data fulfils the VLA – the proof can only be checked if the original data is available.
Some DVA operations may require surprisingly high computational power. This is especially true for PoVs, which are inherently more complex than AoVs.
Furthermore, DVA will not be prepared to handle extreme workloads and will likely start thrashing above a certain limit of request frequency.
For verifiable-credentials-related operations, DVA will rely on:
For performing veracity checks, DVA will use:
Other potential, less important libraries planned to be used by the implementation:
The core functionality of DVA will be implemented over the JVM in Java/Kotlin. Some verifiable-credential-related functionality will be implemented in TypeScript.
The current specification can be found in `spec/openapi.yaml`.
The primary objective of testing will be to validate that exchanged data is handled correctly, whether or not it complies with the quality aspects established in the VLA. Several data examples (including correct and incorrect samples) will be used for these tests. Various data quality aspects will be targeted, and case studies will be conducted using the data types of the main project use cases, such as VR traces (xAPI), Moodle learning traces (xAPI), and skills (ontology/terminology).
The integration with the Dataspace Connector component will be tested thoroughly to verify that the necessary interactions are indeed possible and that error cases are handled properly (e.g., when no data is received during a data exchange or data is received but without a PoV/AoV even though it would be required).
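
The two error cases named above could be distinguished along the following lines. This is only a sketch for test planning purposes: `ExchangeCheck` and `check_incoming` are hypothetical names, and the actual Connector interaction is not modelled.

```python
from enum import Enum
from typing import Optional


class ExchangeCheck(Enum):
    OK = "ok"
    NO_DATA = "no_data"
    MISSING_ASSURANCE = "missing_assurance"


def check_incoming(
    data: Optional[bytes],
    credential: Optional[dict],
    assurance_required: bool,
) -> ExchangeCheck:
    """Classify an incoming exchange before any veracity verification runs."""
    if not data:
        # Nothing (or an empty payload) arrived during the exchange.
        return ExchangeCheck.NO_DATA
    if assurance_required and credential is None:
        # Data arrived, but the required AoV/PoV is absent.
        return ExchangeCheck.MISSING_ASSURANCE
    return ExchangeCheck.OK
```

Test cases can then assert the expected classification for each combination of payload, credential, and requirement.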
While DVA will not directly integrate with the Contract Manager component, it should be tested that DVA can recognize VLA fragments defined in the contracts and that it is possible to extend existing contracts with VLA fragments. In the end, this functionality will be provided by (or at least via) the Catalogue, not this BB.
Furthermore, interactions with other components, such as the Data Value Chain Tracker (DVCT), will be validated through testing, as these potentially involve new interactions, protocols, and interfaces.
The DVA BB test acceptance criteria are, informally, and without striving for completeness:
BME (the BB leader) shall design and implement DVA.
DVA may be involved in various service chains and use cases. So far, the following usages have been identified.
Abbreviation | Expansion |
---|---|
DVA | data veracity assurance building block |
VLA | veracity level agreement |
P | data producer |
C | data consumer |
PoV | proof of veracity |
AoV | attestation of veracity |
- attestation: the issue of a statement, based on a decision, that fulfillment of specified requirements has been demonstrated (ISO/IEC 17000:2020)
- data consumer: a transaction participant to whom data is, or is to be technically supplied by a data provider in the context of a specific data transaction (DSSC Glossary v2.0 2023-09 2. Core Concepts: Data Recipient)
- data producer: a transaction participant that, in the context of a specific data transaction, technically provides data to the data recipients that have a right or duty to access and/or receive that data (DSSC Glossary v2.0 2023-09 2. Core Concepts: Data Provider)
- data veracity: completeness and/or accuracy of data (ISO/IEC 20546:2019 3.1.16)
- orchestrator: a data space participant that represents and is accountable for a specific use case in the context of the governance framework. The orchestrator establishes and enforces business rules and other conditions to be followed by the use case participants. (DSSC Glossary v2.0 2023-09 3. Data space use cases and business model: Use case orchestrator)
- proof: a fact or piece of information that shows that something exists or is true (Cambridge Dictionary)
- verifiable credential: a tamper-evident credential that has authorship that can be cryptographically verified (W3C Verifiable Credentials Data Model 2.0)
- credential: a set of one or more claims made by an issuer. The claims in a credential can be about different subjects. The definition of credential used in this specification differs from NIST's definitions of credential. (W3C Verifiable Credentials Data Model 2.0)
- claim: an assertion made about a subject. (W3C Verifiable Credentials Data Model 2.0)