LORIA's Trustworthy AI (LOLA) Design Document – Prometheus-X Components & Services

LOLA is an autonomous platform designed to audit AI algorithms as closely as possible to their conditions of use. It relies on datasets shared by the EdTech community, and more specifically by publishers of digital educational resources and providers of online learning services.

The platform hosts a set of scenarios, each of which represents a use case built around a combination of datasets, AI algorithms, and quality metrics specific to that scenario.

It is thus possible to evaluate an algorithm against a diversity of datasets and, conversely, to benchmark several algorithms on the same dataset.

The main challenges of LOLA are:

Technical usage scenarios & Features

Features/main functionalities

As stated previously, LOLA provides space to host scenarios. A scenario is a contribution from an organization and corresponds to a task linked to a learning analytics issue that requires an AI algorithm. Some (non-exhaustive) examples of such issues: prediction of dropouts, detection of students at risk, recommendation of educational resources, etc.

Each scenario is constructed according to the model represented schematically as follows:

A Simplified View of Scenario Components
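The scenario model above can be sketched as a simple data structure. This is only an illustration; the field names and example values below are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    """Illustrative model: a scenario couples datasets, algorithms, and metrics."""
    name: str
    datasets: list = field(default_factory=list)    # dataset identifiers
    algorithms: list = field(default_factory=list)  # algorithm identifiers
    metrics: list = field(default_factory=list)     # quality metrics specific to this scenario

# Example: a hypothetical dropout-prediction scenario benchmarking two algorithms
scenario = Scenario(
    name="dropout-prediction",
    datasets=["moodle-logs-2023"],
    algorithms=["logistic-regression", "random-forest"],
    metrics=["accuracy", "recall"],
)
```

The structure makes the two usage directions from the previous section visible: adding entries to `datasets` evaluates one algorithm on several datasets, while adding entries to `algorithms` benchmarks several algorithms on one dataset.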

Consequently, we distinguish three types of use of the platform:

  1. Design of a new scenario: this activity requires
  2. Audit of a new algorithm: this requires
  3. Sharing of a new dataset: this requires

Technical usage scenarios

Account creation

Whatever the intended use, users must first have an account to access the platform's services.

Here is the use case diagram for this step.

Use case #1

Some comments:

Uploading a dataset

Here is the use case diagram for the dataset sharing activity.

Use case #2

Some comments:

Auditing an algorithm

Here is the use case diagram for the algorithm auditing activity.

Use case #3

Some comments:

Designing a scenario

Here is the use case diagram for the scenario designing activity.

Use case #4

Some comments:

Requirements

The LOLA platform is continuously fed by a set of scenarios, each of which relies on datasets and algorithms that are uploaded asynchronously. As part of the EDGE Skills project, the main needs concern collaborations with partners who have datasets to integrate into a specific scenario.

At this stage, a charter is planned indicating the commitments of all stakeholders, namely, the LOLA administrator, data providers and algorithm providers.

Another possibility would be to rely on BB#05 (Consent Agent) to formalize the agreements between the actors.

Integrations

Direct Integrations with Other BBs

It would be worth studying the possibility of using BB#05 for contract management between users (i.e., data providers or algorithm designers) and the LOLA administrator.

Integrations via Connector

The main integration need in the framework of this project concerns the upload of datasets by data providers. To achieve this task, we require the use of common standards:

Regarding data transfer, we detail the transmission over the dataspace connector in the following:

Dataspace-connector-usage

For now, we transfer the data as xAPI statements stored inside compressed files sent over the dataspace connector. We strongly advise encrypting and signing the archive transferred over the dataspace connector using asymmetric cryptography, as this provides a recommended level of security for the confidentiality and integrity of the data.
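The packaging step above can be sketched as follows. This is a minimal illustration using only the standard library: it writes statements into a compressed archive and computes a SHA-256 digest for the integrity check. The asymmetric encryption and signing recommended above would be layered on top with a dedicated library (e.g., the `cryptography` package), which is outside this sketch; the function name and file names are assumptions.

```python
import hashlib
import json
import tempfile
import zipfile
from pathlib import Path

def package_statements(statements, archive_path):
    """Write xAPI statements into a compressed archive and return its SHA-256 digest.

    In production the archive would additionally be encrypted and signed with an
    asymmetric key pair; the digest here only illustrates the integrity-check
    side of that step.
    """
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("statements.json", json.dumps(statements))
    return hashlib.sha256(Path(archive_path).read_bytes()).hexdigest()

# Example usage with a placeholder statement list
archive = Path(tempfile.mkdtemp()) / "statements.zip"
digest = package_statements([{"id": "stmt-1"}], archive)
```

The digest would accompany the archive so the receiving side can verify that the data was not altered in transit.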

In the future, improvements to the dataspace connector may allow us to transfer xAPI data directly, using xAPI queries between end-to-end LRSs on both sides of the connector, removing the need to create and encrypt a file containing all the data.
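For reference, an xAPI statement of the kind exchanged in either transfer mode follows the actor/verb/object structure of the xAPI specification. The identifiers below are illustrative placeholders, not values from the LOLA platform.

```python
import json

# Minimal xAPI statement (actor / verb / object) per the xAPI specification;
# the actor mailbox and activity IRI are illustrative placeholders.
statement = {
    "actor": {"objectType": "Agent", "mbox": "mailto:learner@example.org"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {
        "objectType": "Activity",
        "id": "http://example.org/course/algebra-1",
    },
}

payload = json.dumps(statement)
```

An LRS on the receiving side would accept a list of such statements through its statements endpoint.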

Relevant Standards

Data Format Standards

Mapping to Data Space Reference Architecture Models

DSCC :

Input / Output Data

As explained above, there are several types of interaction with the platform, namely dataset deposit, algorithm sending, scenario design, and scenario execution.

The execution of a scenario is the final objective which allows obtaining an assessment.

In a simplified view, the input to this process is the joint choice of a dataset and an algorithm.

The result is the evaluation, presented as indicators and metrics specific to each scenario. The output format is a JSON file.
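A JSON result of this kind could take the following shape. The field names and values here are assumptions for illustration; the actual indicators are scenario-specific and not fixed by this document.

```python
import json

# Hypothetical shape of a scenario-evaluation result. Inputs (scenario,
# dataset, algorithm) are echoed alongside the computed metrics.
result = {
    "scenario": "dropout-prediction",
    "dataset": "moodle-logs-2023",
    "algorithm": "random-forest",
    "metrics": {"accuracy": 0.87, "recall": 0.79},
}

report = json.dumps(result, indent=2)
```

Keeping the chosen dataset and algorithm in the output makes each assessment self-describing, which simplifies later comparison across runs.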

Architecture

The following diagram presents a general view of the secure architecture of the platform.

LORIA's LOLA platform architecture

Dynamic Behavior

Here are the sequence diagrams corresponding to the main activities.

Send Dataset

Store dataset

Upload Algorithm

Store algorithm

Prepare a scenario

Prepare a scenario

Execute a scenario

Execute a scenario

Configuration and deployment settings

Configuration and logging

The configuration of the LOLA platform mainly relies on networking and data-transfer protocol settings.

The main component of the platform is the Python wrapper used to coordinate the other components of the application. It exposes logging methods accessible via API requests.
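One way such API-accessible logging could work is to keep the wrapper's log records in a buffer that an endpoint returns on request. This is a minimal sketch with the standard library; the logger name and the `GET /logs` endpoint mentioned in the comments are assumptions.

```python
import logging
from io import StringIO

# Hypothetical sketch: the wrapper accumulates log records in memory so that
# an API endpoint (e.g., GET /logs) could return them to the caller.
log_buffer = StringIO()
handler = logging.StreamHandler(log_buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("lola.wrapper")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

def get_logs() -> str:
    """What a log-retrieval endpoint might return."""
    return log_buffer.getvalue()

logger.info("scenario execution started")
```

In a deployment, the buffer would typically be replaced by a rotating file or a log store, with the endpoint reading from it.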

Error Scenarios

Limits will come from the networking environment and the infrastructure used to run the application. Note that the LRS used in the application can be particularly resource-intensive, so careful resource allocation is strongly recommended.

Third Party Components & Licenses

| Third Party Component | License | Link |
| --- | --- | --- |
| Nextflow | Apache 2.0 | https://www.nextflow.io/docs/latest/index.html |
| TRAX LRS 1.0 | EUPL 1.2 | https://github.com/trax-project/trax-lrs |
| Docker | Apache 2.0 | https://www.docker.com |
| Slurm | GNU General Public License | https://slurm.schedmd.com |

Implementation Details

As the LOLA platform uses various components, we introduced a Python wrapper to centralize the different services and make them easier to use. We defined an API that communicates with a web application in order to use the services inside LOLA.
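A client for such an API might be organized as below. This is only a sketch: the base URL and endpoint paths are illustrative assumptions, not the platform's documented routes.

```python
from urllib.parse import urljoin

class LolaClient:
    """Hypothetical client for the LOLA wrapper API.

    The endpoint paths below are illustrative; the real routes would be
    defined in the platform's OpenAPI specification.
    """

    def __init__(self, base_url: str):
        # Normalize so urljoin treats the base as a directory
        self.base_url = base_url.rstrip("/") + "/"

    def dataset_url(self, dataset_id: str) -> str:
        return urljoin(self.base_url, f"datasets/{dataset_id}")

    def run_url(self, scenario_id: str) -> str:
        return urljoin(self.base_url, f"scenarios/{scenario_id}/run")

client = LolaClient("https://lola.example.org/api")
```

Centralizing URL construction in one class keeps the web application decoupled from the individual services the wrapper coordinates.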

OpenAPI Specification

The OpenAPI specification will be linked here in a future version.

Test specification

Test plan

The LOLA platform will be tested with the first scenario designed, "Recommender Systems", in collaboration with our partner Maskott.

Internal unit tests

We can carry out specific unit tests with the sandbox application we provide. This sandbox is mainly used to validate scenarios and algorithms. To unit test the platform effectively, a defined dataset as well as a specific scenario first have to be created.

Component-level testing

The main component that has to be tested is the API of the platform, which consists of a Python wrapper around the different services used in LOLA. The possible API responses are documented and can be tested with tools such as Postman.
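Alongside Postman checks, response handling can also be covered by ordinary unit tests. The sketch below validates the shape of a hypothetical status response; the endpoint, field names, and sample bodies are assumptions for illustration.

```python
import json
import unittest

def parse_status_response(body: str) -> dict:
    """Validate the body of a hypothetical GET /status response."""
    data = json.loads(body)
    if "status" not in data:
        raise ValueError("missing 'status' field")
    return data

class StatusResponseTest(unittest.TestCase):
    def test_valid_body(self):
        data = parse_status_response('{"status": "ok", "version": "1.0"}')
        self.assertEqual(data["status"], "ok")

    def test_missing_status_rejected(self):
        with self.assertRaises(ValueError):
            parse_status_response('{"version": "1.0"}')

# Run the tests when the module is executed directly
unittest.main(exit=False, argv=["status_tests"], verbosity=0)
```

Such tests can run in CI against recorded response bodies, complementing the interactive checks done in Postman.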

UI test (where relevant)

There is nothing in particular to test about LOLA's UI.

Partners & roles

This component is part of the building block "Trustworthy AI: Algorithm assessment", in which it is designed as one tool in the Trustworthy AI toolbox. It is developed by LORIA and complements the tools provided by the University of Koblenz (see Carisma) and Affectlog (see Affectlog 360).

Usage in the dataspace

Data collection is carried out through a standardized connector (PDC). A transfer can be initiated at the request of an authorized data provider who has the permissions for this operation. The access security protocol is therefore handled within the framework of this PDC, which is expected to support the consent and contract procedures upstream. The other operations are carried out directly on the platform, interactively, through a web application.