Search results for: SPEECH EMOTION RECOGNITION
-
Automatic sound recognition for security purposes
Publication: In the paper, an automatic sound recognition system is presented. It forms part of a larger security system developed to monitor outdoor places for atypical audio-visual events. The analyzed audio signal is recorded by a microphone mounted outdoors, so non-stationary noise of significant energy is present in it. In the paper, a specially designed algorithm for outdoor noise reduction is presented,...
-
Building Knowledge for the Purpose of Lip Speech Identification
Publication: Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...
-
Recognition of Hand Drawn Flowcharts
Publication: In this paper, the problem of recognizing hand-drawn flowcharts is presented. Two approaches to this problem are described: on-line and off-line. The concept of FCE, a system for recognizing and understanding freehand on-line flowcharts drawn on desktop computers and mobile devices, is presented. The first experiments with the FCE system and plans for the future are also described.
-
Semantic Integration of Heterogeneous Recognition Systems
Publication: Computer perception of real-life situations is performed using a variety of recognition techniques, including video-based computer vision, biometric systems, RFID devices and others. The proliferation of recognition modules enables development of complex systems by integration of existing components, analogously to the Service Oriented Architecture technology. In the paper, we propose a method that enables integration of information...
-
Communication Platform for Evaluation of Transmitted Speech Quality
Publication: A voice communication system that was designed and implemented is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessment by listening tests or to objective evaluation employing...
-
Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency
Publication: In this paper, we propose a novel method of pitch estimation based on instantaneous complex frequency (ICF). A new iterative algorithm for analysis of the ICF of a speech signal is presented. The obtained results are compared with commonly used methods to demonstrate its accuracy and the connection between ICF and pitch, particularly for narrowband-filtered speech signals.
-
Automated detection of pronunciation errors in non-native English speech employing deep learning
Publication: Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...
-
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
Publication: Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors, such as the different perception of each speech sub-band by the human hearing sense or the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...
-
Dependable Integration of Medical Image Recognition Components
Publication: Computer-driven medical image recognition may support medical doctors in the diagnosis process, but requires high dependability considering the potential consequences of incorrect results. The paper presents a system that improves dependability of medical image recognition by integration of results from redundant components. The components implement alternative recognition algorithms of diseases in the field of gastrointestinal endoscopy....
-
Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech
Publication: We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. To train this model, phonetically transcribed L2 speech is not required and we only need to mark mispronounced words. The lack of phonetic transcriptions for L2 speech means that the model has to learn only from a weak signal of word-level mispronunciations. Because of that and due to the limited amount of mispronounced...
-
Feature extraction in detection and recognition of graphical objects
Publication: Detection and recognition of graphic objects in images are of great and growing importance in many areas, such as medical and industrial diagnostics, control systems in automation and robotics, or various types of security systems, including biometric security systems related to the recognition of the face or iris of the eye. In addition, there are all systems that facilitate the personal life of blind people, visually impaired...
-
Guido: a musical score recognition system
Publication: This paper presents Guido, an optical music recognition system that can automatically recognize the main musical symbols of music scores that were scanned or photographed with a digital camera. The application is based on an object model of musical notation and uses a linguistic approach for symbol interpretation and error correction. The system offers a music editor with partially automatic error correction.
-
Mowa nienawiści (hate speech) a odpowiedzialność dostawców usług internetowych w orzecznictwie sądów europejskich [Hate speech and the liability of Internet service providers in the case law of European courts]
Publication: The article analyses the phenomenon of hate speech on the Internet in relation to the problem of the responsibility of Internet Service Providers for such abuses of freedom of expression. The text provides an analysis of the jurisprudence of two European Courts. On the one hand it presents the position of the European Court of Human Rights on the problem of hate speech: its definition and the liability for it as an exception...
-
Multiclass AdaBoost Classifier Parameter Adaptation for Pattern Recognition
Publication: The article presents the problem of parameter value selection for the multiclass "one against all" approach of an AdaBoost algorithm in tasks of object recognition based on two-dimensional graphical images. The AdaBoost classifier with Haar features is still used in mobile devices due to its processing speed in contrast to other methods like deep learning or SVM, but its main drawback is the need to assemble the results of binary...
-
Objectivization of phonological evaluation of speech elements by means of audio parametrization
Publication: This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...
-
Anion recognition by N,N'-diarylalkanediamides
Publication: The preparation of N,N'-diarylalkanediamides from the respective aliphatic dicarboxylic acids and 4-nitroaniline via microwave-promoted reactions is presented. The most positive effect of microwave irradiation was observed for N,N'-bis(4-nitrophenyl)butanediamide. Anion binding studies on the obtained diamides were carried out in DMSO and acetonitrile using UV-vis and 1H NMR spectroscopy. A mechanism for selective fluoride recognition...
-
Elimination of clicks from archive speech signals using sparse autoregressive modeling
Publication: This paper presents a new approach to elimination of impulsive disturbances from archive speech signals. The proposed sparse autoregressive (SAR) signal representation is given in a factorized form: the model is a cascade of the so-called formant filter and pitch filter. Such a technique has been widely used in code-excited linear prediction (CELP) systems, as it guarantees model stability. After detection of noise pulses using linear...
-
Robust and Efficient Machine Learning Algorithms for Visual Recognition
Publication: In visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...
-
AN ALGORITHM FOR PORTAL HYPERTENSIVE GASTROPATHY RECOGNITION ON THE ENDOSCOPIC RECORDINGS
Publication: Symptoms of portal hypertensive gastropathy (PHG) can be recognized by analysing endoscopic recordings, but manual analysis by a physician may take a long time. This increases the probability of missing some symptoms, and automated methods may be applied to prevent that. In this paper, a novel hybrid algorithm for recognition of the early stage of portal hypertensive gastropathy is proposed. First, image preprocessing is described....
-
Human-computer interactions in speech therapy using a blowing interface
Publication: In this paper, we present a new human-computer interface for the quantitative measurement of blowing activities. The interface can measure the air flow and air pressure during the blowing activity. The measured values are stored and used to control the state of the graphical objects in the graphical user interface. In speech therapy, children will find it easier to play attractive therapeutic games than to perform repetitive and tedious,...
-
Accelerometer signal pre-processing influence on human activity recognition
Publication: A study of the influence of data pre-processing on accelerometer-based human activity recognition algorithms is presented. The frequency band used to filter the accelerometer signals and the number of accelerometers involved were considered in terms of their influence on the recognition accuracy.
-
Music Genre Recognition in the Rough Set-Based Environment
Publication: The aim of this paper is to investigate music genre recognition in the rough set-based environment. Experiments involve a parameterized music database containing 1100 music excerpts. The database is divided into 11 classes corresponding to music genres. Tests are conducted using the Rough Set Exploration System (RSES), a toolset for analyzing data with the use of methods based on the rough set theory. Classification effectiveness...
-
Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System
Publication: Although multimedia platforms have proliferated worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, and aid another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...
-
Database of speech and facial expressions recorded with optimized face motion capture settings
Publication: The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...
-
Hand gesture recognition supported by fuzzy rules and Kalman filters
Publication: The paper presents a system based on a camera and a multimedia projector that enables a user to control computer applications by dynamic hand gestures. A gesture recognition methodology based on representing the hand movement trajectory by motion vectors analysed using fuzzy rule-based inference is given first. Kalman filters are employed for effective hand position tracking. The system is developed using J2SE and C++/OpenCV technology....
-
Transfer learning in imagined speech EEG-based BCIs
Publication: Brain–Computer Interfaces (BCIs) based on electroencephalograms (EEG) are systems whose aim is to provide any person with a communication channel to a computer. They were initially proposed to aid people with disabilities, but wider applications have since been proposed. These devices allow users to send messages or to control devices using brain signals. There are different neuro-paradigms which evoke brain signals of interest...
-
Estimation of the short-term predictor parameters of speech under noisy conditions
Publication
-
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication: The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...
-
Adversarial attack algorithm for traffic sign recognition
Publication: Deep learning suffers from the threat of adversarial attacks, and defense methods against them have become a research hotspot. Among the applications of deep learning, intelligent driving is an important and promising one, yet it also faces a serious threat from adversarial attacks. To address adversarial attacks, this paper takes traffic sign recognition as a typical object, since it is the core function of intelligent driving. Considering...
-
Estimation of the excitation variances of speech and noise AR-models for enhanced speech coding
Publication
-
Topology recognition and leader election in colored networks
Publication: Topology recognition and leader election are fundamental tasks in distributed computing in networks. The first of them requires each node to find a labeled isomorphic copy of the network, while the result of the second one consists in a single node adopting the label 1 (leader), with all other nodes adopting the label 0 and learning a path to the leader. We consider both these problems in networks whose nodes are equipped with...
-
Gesture recognition framework for multimedia content viewer controlling
Publication: In the paper a system for controlling a multimedia content viewer by hand gestures is presented. First, selected methods used for gesture recognition are described. Two different application cases of the system, i.e. for multimedia presentation purposes and for multimedia content viewing are outlined. Moreover, a proposal of improvement of the system combining these approaches is also given. The system work cycle is reviewed. The...
-
Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System
Publication: The broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...
-
Comparison of edge detection algorithms for electric wire recognition
Publication: Edge detection is the preliminary step in image processing for object detection and recognition. It removes useless information and reduces the amount of data before further analysis. The paper compares edge detection algorithms optimized for detection of horizontal edges. For comparison purposes, the algorithms were implemented in the developed application dedicated to detection of electric line...
-
Optical recognition elements: macrocyclic imidazole chromoionophores entrapped in silica xerogel
Publication: Materials containing new chromoionophores consisting of crown residue and azole moiety as parts of macrocycles were encapsulated by the sol-gel procedure in silica xerogel matrices and proposed as chemical recognition elements especially for such metal ions as Li+, Cs+ and Cu2+. Action of these recognition elements is in principle based on changes of reflectance. The recognition elements containing 21-membered chromogenic...
-
Acceleration of decision making in sound event recognition employing supercomputing cluster
Publication: Parallel processing of audio data streams is introduced to shorten the decision making time in hazardous sound event recognition. A supercomputing cluster environment with a framework dedicated to processing multimedia data streams in real time is used. The sound event recognition algorithms employed are based on detecting foreground events, calculating their features in short time frames, and classifying the events with Support...
-
Gesture Recognition With the Linear Optical Sensor and Recurrent Neural Networks
Publication: In this paper, the optical linear sensor, a representative of low-resolution sensors, was investigated in the multiclass recognition of near-field hand gestures. The recurrent neural network (RNN) with a gated recurrent unit (GRU) memory cell was utilized as a gestures classifier. A set of 27 gestures was collected from a group of volunteers. The 27 000 sequences obtained were divided into training, validation, and test subsets....
-
Digits Recognition with Quadrant Photodiode and Convolutional Neural Network
Publication: In this paper we have investigated the capabilities of a quadrant photodiode based gesture sensor in the recognition of digits drawn in the air. The sensor consisting of 4 active elements, 4 LEDs and a pinhole was considered as input interface for both discrete and continuous gestures. Index finger and a round pointer were used as navigating mediums for the sensor. Experiments performed with 5 volunteers...
-
Camera angle invariant shape recognition in surveillance systems
Publication: A method for human action recognition in surveillance systems is described. Problems within this task are discussed and a solution based on 3D object models is proposed. The idea is shown and some of its limitations are discussed. Shape description methods are introduced along with their main features. The parameterization algorithm used is presented. The classification problem, restricted to binary cases, is discussed. Support vector...
-
Intelligent processing of stuttered speech.
Publication: The article presents several methods for analysing and automatically counting articulation disfluencies associated with stuttering, based on learning algorithms: artificial neural networks and rough sets.
-
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
Publication: In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...
-
System for automatic singing voice recognition
Publication: The article presents a system for automatic recognition of the quality and type of singing voice. The database and the implemented parameters are described. The decision algorithm is an artificial neural network. The trained decision system achieves an effectiveness of about 90% in both recognition categories. In addition, statistical methods were used to show that the results of the system for automatic assessment of technical quality...
-
Pose classification in the gesture recognition using the linear optical sensor
Publication: Gesture sensors for mobile devices that are capable of distinguishing hand poses require efficient and accurate classifiers in order to recognize gestures based on sequences of primitives. Two methods of pose recognition for the optical linear sensor were proposed and validated. The Gaussian distribution fitting and Artificial Neural Network based methods represent two kinds of classification approaches. Three types...
-
Influence of accelerometer signal pre-processing and classification method on human activity recognition
Publication: A study of the influence of data pre-processing on accelerometer-based human activity recognition algorithms is presented. The frequency band used to filter the accelerometer signals and the number of accelerometers involved were considered in terms of their influence on the recognition accuracy. Four classification methods were used in the test: support vector machine, decision trees, neural network, and k-nearest neighbor.
-
On practical application of Shannon theory to character recognition and more
Publication: Let us consider an optical character recognition system, which in particular can be used for identifying objects that were assigned strings of some length. The system is not perfect, for example, it sometimes recognizes wrongly the characters "Y" and "V". What is the largest set of strings of given length for the system under consideration, which can be mutually correctly recognized, and the corresponding objects correctly identified?...
-
Molecular Recognition in Complexes of TRF Proteins with Telomeric DNA
Publication: Telomeres are specialized nucleoprotein assemblies that protect the ends of linear chromosomes. In humans and many other species, telomeres consist of tandem TTAGGG repeats bound by a protein complex known as shelterin that remodels telomeric DNA into a protective loop structure and regulates telomere homeostasis. Shelterin recognizes telomeric repeats through its two major components known as Telomere Repeat-Binding Factors, TRF1...
-
Parameters optimization in medicine supporting image recognition algorithms
Publication: In this paper, a procedure of automatic set up of image recognition algorithms' parameters is proposed, for the purpose of reducing the time needed for algorithms' development. The procedure is presented on two medicine supporting algorithms, performing bleeding detection in endoscopic images. Since the algorithms contain multiple parameters which must be specified, empirical testing is usually required to optimise the algorithm's...
-
Accelerometer-based Human Activity Recognition and the Impact of the Sample Size
Publication: The presented study focused on the recognition of eight user activities (e.g. walking, lying, climbing stairs) based on measurements from an accelerometer embedded in a mobile device. It is assumed that the device is carried in a specific location of the user’s clothing. Three types of classifiers were tested on different sizes of the samples. The influence of the time window (the duration of a single trial) on selected activities...
-
Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets
Publication: The aim of the research is automatic recognition of singing voices in the categories of voice type and technical quality of singing. The article describes the created database of voices, which contains voice samples of professional and amateur singers. Next, parameters defined on the basis of biomechanical phenomena in the vocal organ during singing are described. Based on the created parameter matrices, automatic classifiers were trained and compared...