Search results for: Query by Sketch


Displayed results came from alternative search method.

total: 689

Performance Evaluation of a 650V E-HEMT GaN Power Switch
Publication
- P. Czyż
- Year 2015
GaN power switches have better characteristics compared to the state-of-the-art Si power transistors. These devices offer high operating temperature and current densities, fast switching and low on-resistance. However, currently only a few producers offer technology of high voltage GaN transistors. Immaturity of this technology is the reason why experimental evaluation of GaN parameters must be performed to properly exploit their...
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
Publication
- Year 2016
Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

Full text to download in external service
Human-computer interactions in speech therapy using a blowing interface
Publication
- Year 2014
In this paper we present a new human-computer interface for the quantitative measurement of blowing activities. The interface can measure the air flow and air pressure during the blowing activity. The measured values are stored and used to control the state of the graphical objects in the graphical user interface. In speech therapy children will find it easier to play attractive therapeutic games than to perform repetitive and tedious,...

Full text to download in external service
Distortion of speech signals in the listening area: its mechanism and measurements
Publication
- H. Lasota
- R. Mazurek
- I. Kochańska
- Year 2014
The paper deals with the problem of how the number and distribution of loudspeakers in speech reinforcement systems influence the quality of publicly addressed voice messages, namely speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of the same form and slightly different magnitudes that reach a listener from numerous coherent sources is accompanied by interference effects...

Full text to download in external service
Noise profiling for speech enhancement employing machine learning models
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Journal of the Acoustical Society of America - Year 2022
This paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...

Full text available to download
International Journal of Speech Technology

Journals

ISSN: 1381-2416 , eISSN: 1572-8110
Journal of Monolingual and Bilingual Speech

Journals

ISSN: 2631-8407 , eISSN: 2631-8415
The Influence of Stretch Rod Speed on the Relationship between Preblown Bottle Aesthetic Quality and Final Blown Bottle Thickness Profile in Stretch Blow Molding from Preform Process
Publication
- P. Wawrzyniak
- Applied Mechanics and Materials - Year 2015
From a mechanical point of view, the aesthetic quality of preblown PET bottles and thickness profile of final blown PET bottles manufactured in ISBM process are determined by mechanical and thermal response of blown preforms. From the microscopic point of view the biggest influence on the mechanical and thermal properties of PET bottles have orientation and crystallization processes. From a technological point of view, the aesthetic...
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
Publication
- B. Kostek
- B. Szyca
- Journal of the Acoustical Society of America - Year 2023
The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...

Full text available to download
Time-scale modification of speech signals for supporting hearing impaired schoolchildren
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2009
A study of time-scale modification algorithms applied to supporting hearing-impaired schoolchildren is presented. A variety of algorithms are considered, namely: overlap and add, two variations of synchronized overlap and add, and the phase vocoder. Their effectiveness as well as real-time processing capabilities are examined.
Effects of acceptor doping on a metalorganic switch: DFT vs. model analysis
Publication
- T. Ślusarski
- T. Kostyrko
- V. García-Suárez
- PHYSICAL CHEMISTRY CHEMICAL PHYSICS - Year 2018
Full text to download in external service
Estimation of the short-term predictor parameters of speech under noisy conditions
Publication
- M. Kuropatwinski
- W. Kleijn
- M. Kuropatwiński
- IEEE Transactions on Audio Speech and Language Processing - Year 2006
Full text to download in external service
Speech formant frequency and pitch estimation using instantaneous complex frequency
Publication
- M. Kaniewska
- Year 2008
The paper describes an algorithm for estimating the fundamental frequency as well as the center frequencies and bandwidths of speech formants using instantaneous complex frequency. Results of the algorithm for Polish vowels are also presented.
Corrupted speech intelligibility improvement using adaptive filter based algorithm
Publication
- D. Ellwart
- A. Czyżewski
- Year 2010
A technique for improving the quality of speech signals recorded in strong noise is presented. The proposed algorithm employing adaptive filtration is described and additional possibilities of speech intelligibility improvement are discussed. Results of the tests are presented.
Regulation of the switch from early to late bacteriophage λ DNA replication
Publication
- S. Barańska
- M. Gabig
- A. Węgrzyn
- G. Konopa
- A. Herman-Antosiewicz
- P. Hernandez
- J. Schvartzman
- D. Helinski
- G. Węgrzyn
- MICROBIOLOGY-SGM - Year 2001
Full text to download in external service
The influence of PET mechanical properties on Stretch Blow Molding (SBM) process
Publication
- P. Wawrzyniak
- Year 2013
The paper discusses how the mechanical properties of PET influence changes in SBM process parameters. It also covers the PET orientation and crystallization processes, which strongly affect the mechanical and thermal properties of PET material during the SBM process. All mechanical data of PET material and the changes of SBM process parameters over time were taken from collected literature which...
A non-uniform real-time speech time-scale stretching method
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2011
An algorithm for non-uniform real-time speech stretching is presented. It combines the typical SOLA (Synchronous Overlap and Add) algorithm with vowel, consonant, and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...
High quality speech codec employing sines+noise+transients model
Publication
- Archives of Acoustics - Year 2006
A method of high quality wideband speech signal representation employing sines+transients+noise model is presented. The need for a wideband speech coding approach as well as various methods for analysis and synthesis of sines, residual and transient states of speech signal is discussed. The perceptual criterion is applied in the proposed approach during encoding of sines amplitudes in order to reduce bandwidth requirements and...

Full text available to download
Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition
Publication
- Year 2016
The problem of accurately differentiating between speaker utterances and noise parts in a speech signal is considered. The influence of utilizing voice activity detection in speech signals on the accuracy of an automatic speech recognition (ASR) system is presented. The examined methods of voice activity detection are based on acoustic and visual modalities. The problem of detecting the voice activity in clean and noisy...
Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publication
- G. Korvel
- P. Treigys
- G. Tamulevicus
- J. Bernataviciene
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2018
convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Publication
- G. Korvel
- O. Kurasova
- B. Kostek
- Year 2019
The speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...

Full text available to download
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publication
- Electronics - Year 2022
Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...

Full text available to download
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Year 2022
The aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...

Full text available to download
Information Retrieval with the Use of Music Clustering by Directions Algorithm
Publication
- A. Kaczmarek
- Year 2013
This paper introduces the Music Clustering by Directions (MCBD) algorithm. The algorithm is designed to support users of query by humming systems in formulating queries. This kind of systems makes it possible to retrieve songs and tunes on the basis of a melody recorded by the user. The Music Clustering by Directions algorithm is a kind of an interactive query expansion method. On the basis of query, the algorithm provides suggestions...

Full text to download in external service
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
Publication
- G. Tamulevicius
- G. Korvel
- A. B. Yayak
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Electronics - Year 2020
In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset with the size of more than 10.000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Full text available to download
Combining visual and acoustic modalities to ease speech recognition by hearing impaired people
Publication
- B. Kostek
- P. Dalka
- Year 2005
The article presents a system intended to facilitate correct-pronunciation training for people with severe hearing impairments. Acoustic and visual parameters are used in the speech analysis. Active Shape Models are used to determine the visual parameters from the shape and movement of the lips. The acoustic parameters are based on mel-cepstral coefficients. For the classification of the uttered speech sounds...
Elimination of clicks from archive speech signals using sparse autoregressive modeling
Publication
- M. Niedźwiecki
- M. Ciołek
- Year 2012
This paper presents a new approach to elimination of impulsive disturbances from archive speech signals. The proposed sparse autoregressive (SAR) signal representation is given in a factorized form: the model is a cascade of the so-called formant filter and pitch filter. Such a technique has been widely used in code-excited linear prediction (CELP) systems, as it guarantees model stability. After detection of noise pulses using linear...

Full text to download in external service
Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency
Publication
- Elektronika : konstrukcje, technologie, zastosowania - Year 2008
In this paper we propose a novel method of pitch estimation, based on instantaneous complex frequency (ICF). A new iterative algorithm for analysis of the ICF of a speech signal is presented. Obtained results are compared with commonly used methods to prove its accuracy and the connection between ICF and pitch, particularly for narrowband-filtered speech signals.
Improving signal quality of a speech codec using hybrid perceptual-parametric algorithm
Publication
- International Journal of Intelligent Information and Database Systems - Year 2008
The article presents a hybrid parametric-perceptual architecture of a speech codec. It is based on a CELP codec supported by a perceptual codec. The aim of the proposed method is to improve the coding quality of the speech signal. Two architectures were examined; in one of them, the voiced parts of the CELP codec's residual signal are coded perceptually. The second of the proposed codecs performs...

Full text to download in external service
Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency
Publication
- T. Bandurski
- Ł. Hamerski
- M. Papaj
- A. Paruzel
- K. Świder
- Year 2007
In this paper we propose a novel method of pitch estimation, based on instantaneous complex frequency (ICF). A new iterative algorithm for analysis of the ICF of a speech signal is presented. Obtained results are compared with commonly used methods to prove its accuracy and the connection between ICF and pitch, particularly for narrowband-filtered speech signals.
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2018
The aim of the work is to analyze the Lombard speech effect in recordings and then modify the speech signal in order to improve objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of Lombard speech, and in particular on the effect of increasing the fundamental frequency...
Database of speech and facial expressions recorded with optimized face motion capture settings
Publication
- A. Czyżewski
- M. Kawaler
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2019
The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...

Full text available to download
Computer-assisted pronunciation training—Speech synthesis is almost all you need
Publication
- D. Korzekwa
- J. Lorenzo-trueba
- T. Drugman
- B. Kostek
- SPEECH COMMUNICATION - Year 2022
The research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...

Full text available to download
Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System
Publication
- M. Zamłyńska
- P. Falkowski-Gilski
- G. Debita
- B. Miedziński
- Year 2021
Although there has been an outbreak of multiple multimedia platforms worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, as well as aid to another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...

Full text to download in external service
A survey of automatic speech recognition deep models performance for Polish medical terms
Publication
- Year 2023
Among the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....

Full text to download in external service
International Journal of Speech-Language Pathology (previously called Advances in Speech-Language Pathology)

Journals

ISSN: 1754-9507 , eISSN: 1754-9515
An Efficient Noisy Binary Search in Graphs via Median Approximation
Publication
- D. Dereniowski
- A. Łukasiewicz
- P. Uznański
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2021
Consider a generalization of the classical binary search problem in linearly sorted data to the graph-theoretic setting. The goal is to design an adaptive query algorithm, called a strategy, that identifies an initially unknown target vertex in a graph by asking queries. Each query is conducted as follows: the strategy selects a vertex q and receives a reply v: if q is the target, then v = q, and if q is not the target, then v is a...

Full text to download in external service
Puhe ja Kieli (Speech and Language)

Journals

ISSN: 1458-3410
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING

Journals

ISSN: 1063-6676
JOURNAL OF MEDICAL SPEECH-LANGUAGE PATHOLOGY

Journals

ISSN: 1065-1438
LANGUAGE SPEECH AND HEARING SERVICES IN SCHOOLS

Journals

ISSN: 0161-1461 , eISSN: 1558-9129
AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY

Journals

ISSN: 1058-0360 , eISSN: 1558-9110
American Speech: a Quarterly of Linguistic Usage

Journals

ISSN: 0003-1283 , eISSN: 1527-2133
Journal of Speech, Language, and Hearing Research

Journals

ISSN: 1092-4388 , eISSN: 1558-9102
NLP Questions Answering Using DBpedia and YAGO
Publication
- Vietnam Journal of Computer Science - Year 2020
In this paper, we present results of employing DBpedia and YAGO as lexical databases for answering questions formulated in the natural language. The proposed solution has been evaluated for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for TREC conference). Our method uses dependency trees generated from the user query. The trees are browsed for paths leading from the root of the tree to the question...

Full text available to download
Automated detection of pronunciation errors in non-native English speech employing deep learning
Publication
- D. Korzekwa
- Year 2023
Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...

Full text available to download
Design of Intelligent Low-Voltage Load Switch for Remote Control System in Smart Grid
Publication
- D. Xiong
- X. Chen
- R. Martinek
- H. Wen
- D. Luo
- J. Smulko
- Iranian Journal of Science and Technology-Transactions of Electrical Engineering - Year 2021
Current low-voltage load switches do not support remote disconnect/connect and real-time monitoring of a disconnect/connect state. To address these issues, this paper presents a low-voltage load switch for a smart remote control system, which uses a one-chip microcontroller board and a DC step motor drive mechanism and also provides feedback on the switch status. Arrears disconnect and full-pay connect control is implemented...

Full text available to download
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
Publication
- S. Zaporowski
- Year 2024
The article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...

Full text available to download
Novel Family of modified qZS buck-boost multilevel inverters with reduced switch count
Publication
- O. Husev
- R. Strzelecki
- F. Blaabjerg
- V. Chopyk
- D. Vinnikov
- Year 2015
Full text to download in external service
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
Publication
- Journal of the Acoustical Society of America - Year 2018
A method for automatic transcription of English speech into International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...

Full text to download in external service
