Search results for: multimodal
-
MODALITY corpus - SPEAKER 40 - SEQUENCE S1
Open Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 34 - COMMANDS C1
Open Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
Towards New Mappings between Emotion Representation Models
Publication: There are several models for representing emotions in affect-aware applications, and available emotion recognition solutions provide results using diverse emotion models. As multimodal fusion is beneficial in terms of both accuracy and reliability of emotion recognition, one of the challenges is mapping between the models of affect representation. This paper addresses this issue by: proposing a procedure to elaborate new mappings,...
-
Efficient Simulation-Based Global Antenna Optimization Using Characteristic Point Method and Nature-Inspired Metaheuristics
Publication: Antenna structures are designed nowadays to fulfil rigorous demands, including multi-band operation, where the center frequencies need to be precisely allocated at the assumed targets while improving other features, such as impedance matching. Achieving this requires simultaneous optimization of antenna geometry parameters. When considering multimodal problems or if a reasonable initial design is not at hand, one needs to rely...
-
MODALITY corpus - SPEAKER 17 - SEQUENCE S1
Open Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
Development of Intelligent Road Signs with V2X Interface for Adaptive Traffic Controlling
Publication: The objective of this paper is to present a practical project of intelligent road signs, under which a series of new products for the regulation of traffic is being created. The engineering part of the project, described in this paper, was preceded by a series of experimental studies, the results of which were described in another paper accepted for publication at the MTS-ITS conference 2019, entitled "Comparative study on the effectiveness...
-
Scoreboard Architectural Pattern and Integration of Emotion Recognition Results
Publication: This paper proposes a new design pattern, named Scoreboard, dedicated to applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of the Blackboard design pattern and comes under...
-
Marking the Allophones Boundaries Based on the DTW Algorithm
Publication: The paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking allophone boundaries in continuous speech is difficult due to the mutual influence of adjacent phonemes. On the one hand, it is this neighborhood that creates variants of phonemes, that is, allophones, and on the other hand it affects that the border...
-
Interpretable deep learning approach for classification of breast cancer - a comparative analysis of multiple instance learning models
Publication: Breast cancer is the most frequent female cancer. Its early diagnosis increases the chances of a complete cure for the patient. Suitably designed deep learning algorithms can be an excellent tool for quick screening analysis and can support radiologists and oncologists in diagnosing breast cancer. The design of a deep learning-based system for automated breast cancer diagnosis is not easy due to the lack of annotated data, especially...
-
Globalized Simulation-Driven Miniaturization of Microwave Circuits by Means of Dimensionality-Reduced Constrained Surrogates
Publication: Small size has become a crucial prerequisite in the design of modern microwave components. Miniaturized devices are essential for a number of application areas, including wireless communications, 5G/6G technology, wearable devices, or the internet of things. Notwithstanding, size reduction generally degrades the electrical performance of microwave systems. Therefore, trade-off solutions have to be sought that represent acceptable...
-
Global EM-Driven Optimization of Multi-Band Antennas Using Knowledge-Based Inverse Response-Feature Surrogates
Publication: Electromagnetic simulation tools have been playing an increasing role in the design of contemporary antenna structures. The employment of electromagnetic analysis ensures reliability of evaluating antenna characteristics but also incurs considerable computational expenses whenever massive simulations are involved (e.g., parametric optimization, uncertainty quantification). This high cost is the most serious bottleneck of simulation-driven...
-
The Revitalization Processes of the Port Structures in Gdynia and Gdansk on the Background of Contemporary Port Changes
Publication: Transformations of port facilities driven by the modernization of port structures have been present in many city-port centers for more than 50 years. The modernization taking place in the ports located in Gdynia-Gdansk mainly concerns communication availability and adaptation to the multimodal technology of transport and transshipment. Developing specialized tech-terminals serving a specific type of load causes development of the...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publication: Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
ALOFON corpus
Open Research Data: The ALOFON corpus is one of the multimodal databases of word recordings in English, available at http://www.modality-corpus.org/. The ALOFON corpus is oriented towards the recording of the speech equivalence variants. For this purpose, a total of 7 people who are native speakers or speak English with native-speaker fluency and a variety of Standard Southern British...