Description
We introduce Vident-synth, a large dataset of synthetic dental videos with corresponding ground truth forward and backward optical flows and occlusion masks. It can be used for:
- evaluation of optical flow models in challenging dynamic scenes characterized by fast and complex motions, independently moving objects, variable illumination, specular reflections, fluid dynamics, and sparse textures
- evaluation of blind measures for unsupervised learning
- development of long temporal models for optical flow estimation and dense point trackers
- training models in a supervised or semi-supervised manner, including domain adaptation and transfer learning
- cross-domain learning
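For the evaluation use case, a common metric is the average end-point error (EPE): the mean Euclidean distance between predicted and ground truth flow vectors. The following is a minimal sketch, assuming the flow fields are already loaded as NumPy arrays of shape (H, W, 2) and that the occlusion mask is a boolean array in which True marks occluded pixels. The dataset's file layout and mask convention are not described here, so both are assumptions, and the function name is illustrative only.

```python
import numpy as np


def endpoint_error(flow_pred, flow_gt, occlusion_mask=None):
    """Mean end-point error, optionally restricted to non-occluded pixels.

    Assumptions (not taken from the dataset documentation):
      - flows have shape (H, W, 2) holding (dx, dy) per pixel,
      - occlusion_mask has shape (H, W), with True meaning "occluded".
    """
    flow_pred = np.asarray(flow_pred, dtype=np.float64)
    flow_gt = np.asarray(flow_gt, dtype=np.float64)
    # Per-pixel Euclidean distance between the two flow vectors.
    per_pixel = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    if occlusion_mask is None:
        return float(per_pixel.mean())
    visible = ~np.asarray(occlusion_mask, dtype=bool)
    if not visible.any():
        return float("nan")  # nothing left to evaluate
    return float(per_pixel[visible].mean())
```

Reporting the error separately for all pixels and for non-occluded pixels helps distinguish matching errors from errors in regions where no correspondence exists.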
The simulated motions are complex, combining rotation, scaling, and perspective changes caused by the small distance between the camera and the observed objects and by frequent, fast changes in depth. The dataset is an order of magnitude larger than the Sintel dataset. We used Blender to manually craft animations that range from closely resembling the real dental scene videos of the Vident-real dataset to being less similar to them. To generate ground truth optical flow, we employed the Vision Blender library. Creating the synthetic videos involved the following steps:
- manual preparation of mouth interior models using exact 3D scans of real, extracted teeth with different textures,
- modelling of independently moving objects such as dental tools, tongues, rolls, and tubes,
- rendering of the sequences with different kinds of artifacts.
The modelled sequences were rendered in Blender in three variants. The first variant generated sequences with constant light and without blur or water. The second variant induced motion and focus blur by changing camera parameters over time (depth of field, f-stop, and blades), with a point light source attached to the camera. The third variant generated water spills, which are treated as artifacts and thus excluded from the dense motion of the main dental scene. We used BlenderKit for skin and metal textures. Tooth textures were transferred from photos of real teeth and mapped onto the 3D tooth models.
Optical flow is integral to numerous video processing tasks such as restoration, super-resolution, and stabilization. Although recent advances in optical flow estimation have proven effective in general scenes, their applicability to challenging medical scenarios, which exhibit unique domain-specific visual phenomena, remains limited. Supervised learning methods facilitate the robust training of motion estimators. However, the absence of ground truth optical flow in many medical video-assisted applications poses a significant barrier to their progress. This is particularly evident in Video-Assisted Dentistry (VAD), where enhanced and continual vision could improve educational, training, and fully operational dental workflows. Therefore, developing domain-specific synthetic datasets with ground truth optical flow is a natural first step towards adapting general-purpose optical flow models to domain-specific real scenes using domain adaptation techniques.
Research data file
hexmd5(md5(part1)+md5(part2)+...)-{parts_count}
where a single file part is 512 MB in size.
Example script for computing the checksum:
https://github.com/antespi/s3md5
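The checksum formula above can also be sketched in Python: compute the MD5 digest of each part, concatenate the raw digests, hash the result, and append the part count. This is a minimal sketch rather than the repository's own tool. The function names are illustrative, and treating "512 MB" as 512 * 1024 * 1024 bytes is an assumption, since the page does not say whether decimal or binary megabytes are meant.

```python
import hashlib
import io

# Assumption: "512 MB" is interpreted as 512 * 1024 * 1024 bytes.
PART_SIZE = 512 * 1024 * 1024


def multipart_md5(stream, part_size=PART_SIZE, block_size=1024 * 1024):
    """Return hexmd5(md5(part1) + md5(part2) + ...)-{parts_count} for a binary stream."""
    part_digests = []
    while True:
        part_hash = hashlib.md5()
        remaining = part_size
        # Read the current part in small blocks to keep memory use bounded.
        while remaining > 0:
            block = stream.read(min(block_size, remaining))
            if not block:
                break
            part_hash.update(block)
            remaining -= len(block)
        if remaining == part_size:
            break  # no bytes were read: end of stream
        part_digests.append(part_hash.digest())
        if remaining > 0:
            break  # short final part: end of stream
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


def multipart_md5_file(path, part_size=PART_SIZE):
    """Apply multipart_md5 to a file on disk."""
    with open(path, "rb") as handle:
        return multipart_md5(handle, part_size)


# Small in-memory demonstration with a 4-byte part size (3 parts for 10 bytes).
print(multipart_md5(io.BytesIO(b"a" * 10), part_size=4))
```

For a downloaded file, the value returned by `multipart_md5_file` can be compared with the checksum published alongside the file.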
File details
- License: CC BY-NC (non-commercial use)
- File embargo: 2025-09-30
Details
- Year of publication: 2024
- Approval date: 2024-07-01
- Research data language: English
- Disciplines:
  - technical computer science and telecommunications (field of engineering and technology)
  - automation, electronics, electrical engineering and space technologies (field of engineering and technology)
  - medical sciences (field of medical and health sciences)
  - biomedical engineering (field of engineering and technology)
- DOI: 10.34808/8yba-cr72
- Verification: Politechnika Gdańska
Related resources
- research data: Vident-lab: a dataset for multi-task video processing of phantom dental scenes
- research data: Vident-real: an intra-oral video dataset for multi-task learning