Search results for: MODALITY CORPUS · ENGLISH LANGUAGE CORPUS · SPEECH RECOGNITION · AVSR
-
Introduction to the special issue on machine learning in acoustics
PublicationWhen we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. These included, but were not limited to, the following: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...
-
English, French, and Polish Aliases of Criminals: Diversity of Inspirations in their Creation and Typical Nicknaming Schemes
PublicationThe present paper examines the topic of aliases of criminals, which seems to be understudied in linguistic research. Therefore, this article’s primary goal is to describe how criminals’ aliases are created and what the differences and similarities in that process are in English, French, and Polish. Firstly, the theoretical background concerning the topic of pseudonyms is presented. Then, the corpus gathered for this paper (available...
-
S’attaquer à la suprématie du masculin sur le féminin : le français inclusif dans les publications des universités françaises dans les réseaux sociaux
PublicationThis paper aims to examine the use of inclusive French in the Internet publications of Paris universities on their social media. Three higher education institutions were selected: Paris Dauphine-PSL University, Gustave Eiffel University, and Sorbonne Paris North University. The publications were obtained from Facebook, Instagram, and LinkedIn. Firstly, the groups of people to whom the use of inclusive French referred...
-
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
PublicationThe article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...
-
Unités phraséologiques au pays de la traduction: transfert des collocations nomino-adjectivales avec le lexème «femme» dans la traduction de la littérature houellebecquienne du français vers l’italien et le polonais
PublicationThe present paper examines the transfer of nomino-adjectival collocations based on the word ‘femme’ (‘woman’) in the literary translation from French into Italian and Polish. The lexical connection analysed in the article can be defined as the habitual juxtaposition of a word with another word (or words) that has a significant frequency in a given language. The research corpus comprises seven of Michel Houellebecq’s novels written...
-
Methodology and technology for the polymodal allophonic speech transcription
PublicationA method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...
-
Methodology and technology for the polymodal allophonic speech transcription
PublicationA method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography, and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...
-
Sésame, ouvre-toi: internationalisme phraséologique à contenu universel
PublicationPhraseological units, characterised by their opaque meaning, are the subject of multiple theoretical works. The following article adds to this discussion by providing another interesting example. It analyses the case of the Arabic phraseological unit ‘open sesame’ from the “Ali Baba and the Forty Thieves” folk tale, permeating into French, Italian, Polish, Turkish and Japanese – languages distant both linguistically and culturally....
-
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
PublicationThe main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...
-
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
PublicationIn order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...
-
Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System
PublicationAlthough there has been a proliferation of multimedia platforms worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, and aid another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...
-
Once in a season – the pragmatic function of fuck in “BoJack Horseman” TV Show
PublicationThis article investigates the use and pragmatic functions of the swear word fuck in “BoJack Horseman”, produced by Netflix, and bridges the gap in the linguistic research on this particular TV show. Incorporating corpus linguistics tools, the BoJack Horseman Corpus was compiled and the lemma fuck was investigated and analysed from the multimodal perspective....
-
Towards facts extraction from text in Polish language
PublicationNatural Language Processing (NLP) finds many uses in different fields of endeavor. Many tools exist for analyzing the English language. For Polish the situation is different, as the language itself is more complicated. In this paper we show the differences between NLP for Polish and for English. Existing solutions are presented and the TEAMS software for fact extraction is described. The paper also shows evaluation of...
-
Selection of Features for Multimodal Vocalic Segments Classification
PublicationEnglish speech recognition experiments are presented employing both audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction on the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...
-
Extracting concepts from the software requirements specification using natural language processing
PublicationExtracting concepts from the software requirements is one of the first steps on the way to automating the software development process. This task is difficult due to the ambiguity of the natural language used to express the requirements specification. The methods used so far consist mainly of statistical analysis of words and matching expressions with a specific ontology of the domain in which the planned software will be applicable....
-
Exploring the preferences of Polish EFL teachers towards the accents of English
PublicationThis language attitudes study investigates the preferences of EFL (English as a foreign language) teachers from Poland towards the accents of English they speak and teach. Despite the substantial amount of research on EFL learners, little has been done to investigate the impact of preferences of Polish teachers for different variations of English language on their...
-
Phraseological Units in Audiovisual Translation. A Case Study of Polish Dubbing of Disney’s 'The Little Mermaid'
PublicationThe paper aims to discuss phraseological units as the object of audiovisual translation in the Polish dubbing of Disney’s 'The Little Mermaid', to discuss the role of phraseological translation techniques, and to present possible translation inconsistencies. A theoretical introduction presents definitions for crucial terms. It is followed by the analysis of the corpus of phraseological units in Disney’s The Little Mermaid and...
-
Audio Feature Analysis for Precise Vocalic Segments Classification in English
PublicationAn approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...
-
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
PublicationThis paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...
-
XVIII Międzynarodowe Sympozjum Inżynierii i Reżyserii Dźwięku
PublicationThe subjective assessment of speech signals takes into account previous experiences and habits of an individual. Since the perception process deteriorates with age, differences should be noticeable among people from dissimilar age groups. In this work, we investigated the difference of speech quality assessment between high school students and university students. The study involved 60 participants, with 30 people in both the adolescents...
-
Bimodal Emotion Recognition Based on Vocal and Facial Features
PublicationEmotion recognition is a crucial aspect of human communication, with applications in fields such as psychology, education, and healthcare. Identifying emotions accurately is challenging, as people use a variety of signals to express and perceive emotions. In this study, we address the problem of multimodal emotion recognition using both audio and video signals, to develop a robust and reliable system that can recognize emotions...
-
Accelerated remyelination and immune modulation by the EBI2 agonist 7α,25-dihydroxycholesterol analogue in the cuprizone model
PublicationResearch indicates a role for EBI2 receptor in remyelination, demonstrating that its deficiency or antagonism inhibits this process. However, activation of EBI2 with its endogenous ligand, oxysterol 7α,25-dihydroxycholesterol (7α,25OHC), does not enhance remyelination beyond the levels observed in spontaneously remyelinating tissue. We hypothesized that the short half-life of the natural ligand might explain this lack of beneficial...
-
Enhanced voice user interface employing spatial filtration of signals from acoustic vector sensor
PublicationSpatial filtration of sound is introduced to enhance speech recognition accuracy in noisy conditions. An acoustic vector sensor (AVS) is employed. The signals from the AVS probe are processed in order to attenuate the surrounding noise. As a result, the signal-to-noise ratio is increased. An experiment is featured in which speech signals are disturbed by babble noise. The signals before and after spatial filtration are processed...
-
Intelligent multimedia solutions supporting special education needs.
PublicationThe role of computers in school education is briefly discussed. The history of multimodal interface development is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech stretching audio interface representing the audio modality....
-
Intelligent video and audio applications for learning enhancement
PublicationThe role of computers in school education is briefly discussed. The history of multimodal interface development is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech stretching audio interface representing the audio modality....
-
WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE
PublicationThis article presents an analysis of speech-based emotion recognition solutions and the possibility of using them in emotional speech synthesis with neural networks. Current solutions for emotion recognition in speech and methods of speech synthesis using neural networks are presented. A significant increase in interest in and use of deep learning is currently observed in applications related...
-
Badania empiryczne związane z ewolucją języków - wybrane zagadnienia
PublicationAlthough language evolution is an area in science yet to be developed, its foundations lie in empirical research. The aim of this article is to present three categories of ways to get empirical data on language evolution: observing language in the laboratory, monitoring animal communication and analysing pidgins and creoles. The part of the paper about language in the laboratory is based on English-language articles presenting the experiments...
-
Building Knowledge for the Purpose of Lip Speech Identification
PublicationConsecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...
-
Semantic OLAP with FluentEditor and Ontorion Semantic Excel Toolchain
PublicationSemantic technologies appear as a step on the way to creating systems capable of representing the physical world as real time computational processes. In this context, the paper presents a toolchain for an ontology based knowledge management system. It consists of the ontology editor, FluentEditor and the distributed knowledge representation system, Ontorion. FluentEditor is a comprehensive tool for editing and manipulating complex...
-
Contextual ontology for tonality assessment
Publicationclassification tasks. The discussion focuses on two important research hypotheses: (1) whether it is possible to construct such an ontology from a corpus of textual documents, and (2) whether it is possible and beneficial to use inferencing from this ontology to support the process of sentiment classification. To support the first hypothesis we present a method of extracting a hierarchy of contexts from a set of textual documents...
-
Scoreboard Architectural Pattern and Integration of Emotion Recognition Results
PublicationThis paper proposes a new design pattern, named Scoreboard, dedicated to applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of the Blackboard design pattern and comes under...
-
Automatic Marking of Allophone Boundaries in Isolated English spoken Words
PublicationThe work presents a method that allows delimiting the borders of allophones in isolated English words. The described method is based on the DTW algorithm combining two signals, a reference signal and an analyzed one. As the reference signal, recordings from the MODALITY database were used, from which the words were extracted. This database was also used for tests, which were described. Test results show that the automatic determination...
-
Electrochemical Evaluation of Sustainable Corrosion Inhibitors via Dynamic Electrochemical Impedance Spectroscopy
PublicationFinding suitable measurement methods for the effective management of electrochemical problems is of paramount importance, particularly for improving efficiency in corrosion protection. The need for accurate measurement techniques specific to nonstationary conditions has long been recognized, and promising approaches have emerged. This chapter introduces dynamic electrochemical impedance spectroscopy as a novel advancement in electrochemistry...
-
Enriching the Context: Methods of Improving the Non-contextual Assessment of Sentence Credibility
PublicationThis paper presents several methods of automatic context enrichment of sentences that need to be evaluated, tagged or fact-checked by human judges. We have created a corpus of medical Web articles. Sentences from this corpus have been fact-checked by medical experts in two modes: contextually (reading the entire article and evaluating sentence by sentence) and without context (evaluating sentences from all articles in random order)....
-
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
PublicationThe paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...
-
Computer-assisted pronunciation training—Speech synthesis is almost all you need
PublicationThe research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...
-
Robot-Based Intervention for Children With Autism Spectrum Disorder: A Systematic Literature Review
PublicationChildren with autism spectrum disorder (ASD) have deficits in the socio-communicative domain and frequently face severe difficulties in the recognition and expression of emotions. Existing literature suggested that children with ASD benefit from robot-based interventions. However, studies varied considerably in participant characteristics, applied robots, and trained skills. Here, we reviewed robot-based interventions targeting...
-
Theory of recognition in a historical perspective. Axel Honneth's Anerkennung: Eine europäische Ideengeschichte
PublicationThe article discusses Honneth's excursion into the realm of the history of ideas. This time Honneth focuses on the notion of "recognition" in three different cultural areas and three different traditions: French, English, and German. The article discusses Honneth's perspective and attempts to find the common thread that would link the three aforementioned traditions.
-
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
PublicationThe approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...
-
Automatic Classification of Polish Sign Language Words
PublicationIn the article we present an approach to automatic recognition of hand gestures using the eGlove device. We present the research results of the system for detection and classification of static and dynamic words of the Polish language. The results indicate that using eGlove yields good recognition quality, which can be further improved using additional data sources such as RGB cameras.
-
Voice command recognition using hybrid genetic algorithm
PublicationSpeech recognition is a process of converting the acoustic signal into a set of words, whereas voice command recognition consists in the correct identification of voice commands, usually single words. Voice command recognition systems are widely used in the military, control systems, electronic devices, such as cellular phones, or by people with disabilities (e.g., for controlling a wheelchair or operating a computer...
-
Decoding imagined speech for EEG-based BCI
PublicationBrain–computer interfaces (BCIs) are systems that transform the brain's electrical activity into commands to control a device. To create a BCI, it is necessary to establish the relationship between a certain stimulus, internal or external, and the brain activity it provokes. A common approach in BCIs is motor imagery, which involves imagining limb movement. Unfortunately, this approach allows few commands. As an alternative, this...
-
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
PublicationSpeech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. Such models could be used as the basis of a formant speech synthesizer. The proposed models are based on dividing...
-
Rozwijanie kreatywności ucznia w procesie kształtowania umiejętności językowych. Innowacja pedagogiczna z elementami neurodydaktyki w edukacji wczesnoszkolnej
PublicationThis text is a ready-to-use pedagogical innovation program combining teaching English and classes developing creativity in early childhood education. Classes developing creativity are a unique opportunity to implement innovative solutions and ideas to develop language competencies and key competencies, which can be difficult during a standard English lesson. The...
-
Vocalic Segments Classification Assisted by Mouth Motion Capture
PublicationVisual features convey important information for automatic speech recognition (ASR), especially in noisy environments. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose motion capture markers were placed on speakers' faces to obtain lip-tracking data during speaking. Different parameterization strategies were tested...
-
Towards Facts Extraction From Texts in Polish Language
PublicationThe Polish language differs from English in many ways. It has more complicated conjugation and declension. Because of that, automatic fact extraction from texts is difficult. In this paper we present basic differences between those languages. The paper presents an algorithm for extraction of facts from articles from Polish Wikipedia. The algorithm is based on seven proposed fact schemes that are searched for in the analyzed text....
-
Words context analysis for improvement of information retrieval
PublicationIn the article we present an approach to improving information retrieval from large text collections using word context vectors. The vectors have been created by analyzing English Wikipedia with the Hyperspace Analogue to Language model of word similarity. For test phrases we evaluate retrieval with direct user queries as well as retrieval with context vectors of these queries. The results indicate that the proposed method can not...
-
Performance Analysis of the OpenCL Environment on Mobile Platforms
PublicationToday’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...
-
Quantifying inconsistencies in the Hamburg Sign Language Notation System
PublicationThe advent of machine learning (ML) has significantly advanced the recognition and translation of sign languages, bridging communication gaps for hearing-impaired communities. At the heart of these technologies is data labeling, crucial for training ML algorithms on a huge amount of consistently labeled data to achieve models that generalize well. The adoption of language-agnostic annotations is essential to connect different sign...
-
Modeling Object Oriented Systems via Controlled English Verbalization of Description Logic
PublicationThe need for formal methods for Object Oriented (OO) systems resulted in methods like UML and Lepus3 that are de-facto graphical languages equipped with formal tools that are able to handle the design of OO systems. However, they lack precise semantics which might lead to problems, such as inconsistencies or redundancies. On the other hand, to our knowledge, there is no approach that allows one to understand and follow the requirements...