Vident-synth: a synthetic intra-oral video dataset for optical flow estimation

Description

We introduce Vident-synth, a large dataset of synthetic dental videos with corresponding ground truth forward and backward optical flows and occlusion masks. It can be used for:

evaluation of optical flow models in challenging dynamic scenes characterized by fast and complex motions, independently moving objects, variable illumination, specular reflections, fluid dynamics, and sparse textures
evaluation of blind measures for unsupervised learning
development of long temporal models for optical flow estimation and dense point trackers
training models in a supervised or semi-supervised manner, including domain adaptation and transfer learning
cross-domain learning
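
As an illustration of the first use case, the sketch below computes the average end-point error between a predicted and a ground truth flow field while ignoring occluded pixels. It is a minimal example under stated assumptions: the function name `masked_epe`, the `(H, W, 2)` array layout, and the convention that `True` in the mask marks an occluded pixel are all hypothetical and do not describe the documented Vident-synth file format.

```python
import numpy as np


def masked_epe(flow_pred, flow_gt, occluded):
    """Mean end-point error over pixels that are not occluded.

    flow_pred, flow_gt: float arrays of shape (H, W, 2) holding (dx, dy) per pixel.
    occluded: array of shape (H, W), truthy where the pixel is occluded.
    These layouts are assumptions made for this sketch only.
    """
    # Per-pixel Euclidean distance between predicted and ground truth vectors.
    error = np.linalg.norm(np.asarray(flow_pred) - np.asarray(flow_gt), axis=-1)
    visible = ~np.asarray(occluded, dtype=bool)
    if not visible.any():
        return float("nan")  # no visible pixel to evaluate
    return float(error[visible].mean())


# Toy 2x2 example: one visible pixel is off by (3, 4), one occluded pixel is ignored.
gt = np.zeros((2, 2, 2))
pred = np.zeros((2, 2, 2))
pred[0, 0] = (3.0, 4.0)
pred[1, 1] = (10.0, 0.0)
occ = np.zeros((2, 2), dtype=bool)
occ[1, 1] = True
print(masked_epe(pred, gt, occ))
```

Restricting the average to visible pixels is one common choice; errors over all pixels or over occluded pixels only can be reported in the same way by changing the mask.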

The simulated motions are complex, combining rotation, scaling, and perspective changes caused by the small distance between the camera and the observed objects and by frequent, fast changes in depth. The dataset is an order of magnitude larger than the Sintel dataset. We used Blender to manually craft animations ranging from close imitations of real dental scene videos in the Vident-real dataset to less similar ones. To generate ground truth optical flow, we used the Vision Blender library. Creating the synthetic videos involved the following steps:

manual preparation of models of the mouth interior with exact 3D scans of real, extracted teeth with different textures,
modelling of independently moving objects such as dental tools, tongues, rolls, and tubes,
rendering the sequences with different kinds of artifacts.

The sequence models were rendered in Blender in three different variants. The first variant generated sequences with constant light and without blur or water. The second variant induced motion and focus blur by changing camera parameters over time (depth of field, f-stop, and blades), with a point light source attached to the camera. The third variant generated water spills, which are treated as artifacts and are therefore excluded from the dense motion of the main dental scene. We used BlenderKit for skin and metal textures. Tooth textures were transferred from photos of real teeth and mapped onto the 3D teeth models.

Optical flow is integral to numerous video processing tasks such as restoration, super-resolution, and stabilization. Although recent optical flow estimation methods perform well in general scenes, their applicability to challenging medical scenarios, which exhibit unique domain-specific visual phenomena, remains limited. Supervised learning methods facilitate the robust training of motion estimators. However, the absence of ground truth optical flow in many medical video-assisted applications poses a significant barrier to their progress. This is particularly evident in Video-Assisted Dentistry (VAD), where enhanced and continual vision could improve educational, training, and fully operational dental workflows. Developing domain-specific synthetic datasets with ground truth optical flow is therefore a natural first step towards adapting general-purpose optical flow models to domain-specific real scenes using domain adaptation techniques.

Research data file

Vident-synth.zip

0.0 B, S3 ETag , downloads: 12

The file hash is computed using the formula
hexmd5(md5(part1)+md5(part2)+...)-{parts_count}, where a single file part is 512 MB in size

Example script for computing it:
https://github.com/antespi/s3md5
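
The hash formula given above can also be sketched in a few lines of Python. This is only an illustration, not the repository's own tooling: the function name `multipart_etag` is hypothetical, and reading "512 MB" as 512 × 1024 × 1024 bytes is an assumption that may need adjusting if the computed value does not match the published one.

```python
import hashlib
import io

# Assumption: "512 MB" is interpreted as 512 MiB.
PART_SIZE = 512 * 1024 * 1024


def multipart_etag(stream, part_size=PART_SIZE):
    """Return hexmd5(md5(part1)+md5(part2)+...)-{parts_count} for a binary stream."""
    digests = []
    while True:
        part = stream.read(part_size)  # one whole part is held in memory at a time
        if not part:
            break
        digests.append(hashlib.md5(part).digest())  # raw 16-byte digest of the part
    # MD5 of the concatenated part digests, followed by the number of parts.
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"


# Small in-memory demonstration with a tiny part size.
print(multipart_etag(io.BytesIO(b"a" * 10), part_size=4))
```

For the downloaded archive, the same function could be called as `multipart_etag(open("Vident-synth.zip", "rb"))` and the result compared with the S3 ETag shown on the page.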

File details

License: CC BY-NC (non-commercial use)
File embargo: 2025-09-30

Details

Year of publication: 2024
Approval date: 2024-07-01
Research data language: English
DOI: 10.34808/8yba-cr72
Verified by: Politechnika Gdańska (Gdańsk University of Technology)

Related resources

research data: Vident-lab: a dataset for multi-task video processing of phantom dental scenes
research data: Vident-real: an intra-oral video dataset for multi-task learning
