VIDS Compliance Scoring Rubric¶
The 22-dimension framework used to score medical imaging datasets in VIDS Compliance Evaluations. Published openly so that any score in any evaluation report is traceable to documented criteria.
Purpose¶
The rubric makes evaluations consistent across operators, defensible against scrutiny, and reproducible from the same dataset inputs. It complements the VIDS Reference Validator without replacing it.
Relationship to the validator¶
The VIDS Reference Validator checks 21 rules and produces a binary PASS/FAIL outcome. The rubric scores 22 dimensions and produces a granular score (X / 22) used for diagnostic detail. The two scoring systems serve different purposes:
| Validator (21 rules) | Rubric (22 dimensions) |
|---|---|
| Binary PASS/FAIL at the specified Profile | Granular scoring (X / 22) for diagnostic detail |
| Used in: contract acceptance, attestation issuance | Used in: evaluation reports, public benchmarks |
| Automated, reproducible from any installation | Manual scoring by trained evaluator |
Output: validation_report.json |
Output: VIDS Compliance Evaluation Report |
The rubric is wider than the validator by design. It covers some recommended fields (such as class distribution documentation) that are not yet hard-validated rules but still inform a buyer's judgment. Validator results take precedence for contract decisions; rubric scores supply the explanation.
Scoring model¶
Each dimension is scored binary:
- 1 — Present and compliant. The pass criterion is fully met.
- 0 — Missing, non-compliant, or partially met. No partial credit.
No partial credit. A dimension either meets the pass criterion or it does not. This is deliberate: subjective half-credit scoring breaks reproducibility across evaluators.
Profile thresholds¶
| Profile | Required | Notes |
|---|---|---|
| POC | 16 / 22 | All Structure (S1–S5) and all Provenance (P1–P6) dimensions must be present (5 + 6 = 11 mandatory). Quality and ML Readiness dimensions add to the count but are individually optional. Threshold = mandatory 11 + 5 of remaining 11. |
| Full | 22 / 22 | All 22 dimensions must score 1. A single missing dimension = FAIL. This aligns with the validator's zero-FAIL requirement at the Full profile. |
PASS / FAIL is the headline outcome. The X / 22 score exists to explain what failed, not to create a "this is mostly compliant" soft pass. There is no soft pass.
A. Structure (5 dimensions)¶
Structural compliance: directory layout, naming, and dataset-level metadata.
S1 — Directory Structure¶
Dataset root contains required directories (sub-*/ses-*/<modality>/) and matches the VIDS reference layout. derivatives/annotations/ tree mirrors the source tree.
S2 — File Naming¶
All files follow the sub-<ID>_ses-<ID>_<modality>_<suffix>.<extension> pattern. Naming is deterministic and consistent across all subjects.
S3 — Dataset Description¶
dataset_description.json present at root with all required fields: Name, VIDSVersion, DatasetVersion, License, Description, Authors.
S4 — Metadata Consistency¶
Required fields populated across all subjects, sessions, and annotation files. Participants registry (.json or .tsv) lists every subject directory.
S5 — Version Declaration¶
.vids marker file present and correctly declares profile (POC or Full) and vids_version (1.0).
B. Annotation Provenance (6 dimensions)¶
The category that distinguishes VIDS from other dataset standards. Every annotation must trace back to who, when, how, and under what review.
P1 — Annotator Identity¶
Each annotation sidecar records annotator ID or name. Pseudonymized identifiers are acceptable if mapped consistently across the dataset.
P2 — Annotator Role / Type¶
Each annotation declares whether the annotator is human, automated, or hybrid. Human annotators carry credentials or specialty when available.
P3 — Annotation Timestamp¶
Each annotation records the date (or datetime) the annotation was performed.
P4 — Annotation Tool¶
Each annotation records the tool and version used (for example, 3D Slicer 5.6.2, MD.ai, custom pipeline).
P5 — Annotation Protocol¶
A defined annotation protocol or guideline document is referenced from dataset_description.json or README. The same protocol applies to all subjects unless explicitly varied.
P6 — Multi-annotator Tracking¶
Where multiple annotators contributed, each finding is attributable to a specific annotator. Reviewer / second-annotator records appear in the QualityControl block where applicable.
C. Quality Documentation (5 dimensions)¶
Required for Full profile. Documents how quality was measured, not whether it is good.
Q1 — QA Process Defined¶
quality/quality_summary.json describes the QA process: review percentage, double-annotation rate, review method.
Q2 — QA Results¶
Measured outputs are reported: pass rates on first submission, after revision, total revisions required.
Q3 — Inter-annotator Agreement¶
quality/annotation_agreement.json present with method (for example, Dice), sample size, aggregate statistics, and per-subject results where applicable.
Q4 — Validation Evidence¶
Validation report from the VIDS Reference Validator is included in the dataset archive and references the specific validator version used.
Q5 — Class Distribution¶
quality/class_distribution.json (or equivalent) documents class frequencies, anatomical or clinical-score distributions, and any known imbalances. Recommended for Full profile and required for any dataset intended for ML training.
D. ML Readiness (6 dimensions)¶
Whether the dataset is usable in an ML pipeline without further restructuring.
M1 — Label Consistency¶
Annotation schema is uniform across all subjects. LabelMap (for segmentation) is consistent; bounding box / classification taxonomies do not vary mid-dataset.
M2 — Format Compatibility¶
Files are in standard ML-ready formats: NIfTI for imaging, JSON for sidecars. Export paths to nnU-Net, MONAI, COCO, or flat NIfTI are documented or exportable via the reference tooling.
M3 — Data Completeness¶
No critical files are missing. Every subject directory contains the imaging and annotation files declared in the manifest. Empty or stub files are flagged in the documentation.
M4 — Intended Use Declared¶
dataset_description.json declares the dataset's intended use (training, validation, regulatory submission) in the Description or DatasetType field.
M5 — Known Limitations¶
README or dataset_description.json explicitly documents known limitations: anonymized identifiers, missing demographics, modality biases, geographic source restrictions.
M6 — Train / Test Separation¶
ml/splits.json present with subject-level splits (no slice-level or study-level splits that risk data leakage). Random seed and split strategy documented.
What this rubric does not do¶
- Assess clinical correctness of annotations. The rubric checks documentation; it does not verify whether a segmentation is anatomically correct.
- Replace domain-specific quality criteria. Buyer-defined acceptance criteria (subject count, modality coverage, label accuracy) layer on top of the rubric, not within it.
- Certify regulatory compliance. The rubric provides auditable evidence; it does not constitute or replace FDA, EU AI Act, or CDSCO certification.
Operator¶
The current operator of VIDS Compliance Evaluations using this rubric is Princeton Medical Systems. The rubric itself is part of the VIDS open standard and is governed under the same license terms (CC BY 4.0).
Related¶
- Compliance Evaluation — how the rubric is applied to a specific dataset
- Validation Attestation — issued for datasets that achieve 22 / 22 at the Full profile
- VIDS Specification — the underlying technical standard
- Reference Procurement Language — contract clauses that make validator results binding