Publikacje
Filtry
wszystkich: 890
Katalog Publikacji
-
A comparative study of English viseme recognition methods and algorithm
PublikacjaAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...
-
System for monitoring road slippery based on CCTV cameras and convolutional neural networks
PublikacjaThe slipperiness of the surface is essential for road safety. The growing number of CCTV cameras opens the possibility of using them to automatically detect the slippery surface and inform road users about it. This paper presents a system of developed intelligent road signs, including a detector based on convolutional neural networks (CNNs) and the transferlearning method employed to the processing of images acquired with video...
-
3D Acoustic Field Intensity Probe Design and Measurements
PublikacjaThe aim of this paper is two-fold. First, some basic notions on acoustic field intensity and its measurement are shortly recalled. Then, the equipment and the measurement procedure used in the sound intensity in the performed research study are described. The second goal is to present details of the design of the engineered 3D intensity probe, as well as the algorithms developed and applied for that purpose. Results of the intensity...
-
Human Computer Interface for Tracking Eye Movements Improves Assessment and Diagnosis of Patients With Acquired Brain Injuries
PublikacjaOne of the first clinical signs differentiating the minimally conscious state from the vegetative state is the presence of smooth pursuit eye movements occurring in direct response to moving salient stimuli. Glasgow Coma Scale (GCS) is one of the most commonly used diagnostic tools for acute phase assessment of the level of consciousness, together with a neurological examination. These classic measures are limited to qualitative...
-
Analysis of results of large-scale multimodal biometric identity verification experiment
PublikacjaAn analysis of a large set of biometric data obtained during the enrolment and the verification phase in an experimental biometric system installed in bank branches is presented. Subjective opinions of bank clients and of bank tellers were also surveyed concerning the studied biometric methods in order to discover and to explore relations emerging from the obtained multimodal dataset. First, data acquisition and identity verification...
-
Music Information Retrieval in Music Repositories
PublikacjaThis chapter reviews the key concepts associated with automated Music Information Retrieval (MIR). First, current research trends and system solutions in terms of music retrieval and music recommendation are discussed. Next, experiments performed on a constructed music database are presented. A proposal for music retrieval and annotation aided by gaze tracking is also discussed.
-
Employing Subjective Tests and Deep Learning for Discovering the Relationship between Personality Types and Preferred Music Genres
PublikacjaThe purpose of this research is two-fold: (a) to explore the relationship between the listeners’ personality trait, i.e., extraverts and introverts and their preferred music genres, and (b) to predict the personality trait of potential listeners on the basis of a musical excerpt by employing several classification algorithms. We assume that this may help match songs according to the listener’s personality in social music networks....
-
Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling
PublikacjaA common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare it to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...
-
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
PublikacjaThe approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...
-
Instrument detection and pose estimation with rigid part mixtures model in video-assisted surgeries
PublikacjaLocalizing instrument parts in video-assisted surgeries is an attractive and open computer vision problem. A working algorithm would immediately find applications in computer-aided interventions in the operating theater. Knowing the location of tool parts could help virtually augment visual faculty of surgeons, assess skills of novice surgeons, and increase autonomy of surgical robots. A surgical tool varies in appearance due to...
-
Digital Transformation and Its Influence on Sustainable Manufacturing and Business Practices
PublikacjaThe paper focuses on the relationship between businesses and digital transformation, and how digital transformation has changed manufacturing in several ways. Aspects like Cloud Computing, vertical and horizontal integration, data communication, and the internet have contributed to sustainable manufacturing by decentralizing supply chains. In addition, digital transformation inventions such as predictive analysis and big data analytics...
-
Independent dynamics of low, intermediate, and high frequency spectral intracranial EEG activities during human memory formation
PublikacjaA wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various frequency ranges are coordinated across the space of the human cortex and time of memory processing is inconclusive. They can either be coordinated together across the frequency spectrum at the same cortical site and time or induced independently in particular bands. We used a large dataset of human intracranial...
-
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
PublikacjaThe Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...
-
Creating Acoustic Maps Employing Supercomputing Cluster
PublikacjaThe implemented online urban noise pollution monitoring system is presented with regard to its conceptual assumptions and technical realization. A concept of the noise source parameters dynamic assessment is introduced. The idea of noise modeling, based on noise emission characteristics and emission simulations, was developer and practically utilized in the system. Furthermore, the working system architecture and the data acquisition...
-
Validating data acquired with experimental multimodal biometric system installed in bank branches
PublikacjaAn experimental system was engineered and implemented in 100 copies inside a real banking environment comprising: dynamic handwritten signature verification, face recognition, bank client voice recognition and hand vein distribution verification. The main purpose of the presented research was to analyze questionnaire responses reflecting user opinions on: comfort, ergonomics, intuitiveness and other aspects of the biometric enrollment...
-
Computer-Aided Diagnosis of COVID-19 from Chest X-ray Images Using Hybrid-Features and Random Forest Classifier
PublikacjaIn recent years, a lot of attention has been paid to using radiology imaging to automatically find COVID-19. (1) Background: There are now a number of computer-aided diagnostic schemes that help radiologists and doctors perform diagnostic COVID-19 tests quickly, accurately, and consistently. (2) Methods: Using chest X-ray images, this study proposed a cutting-edge scheme for the automatic recognition of COVID-19 and pneumonia....
-
Measurements and Visualization of Sound Intensity Around the Human Head in Free Field Using Acoustic Vector Sensor
PublikacjaThis paper presents measurements and visualization of sound intensity around the human head simulator in a free field. A Cartesian robot, applied for precise positioning of the acoustic vector sensor, was used to measure sound intensity. Measurements were performed in a free field using a head and torso simulator and the setup consisting of four different loudspeaker configurations. The acoustic vector sensor was positioned around...
-
A new method for measuring the psychoacoustical properties of tinnitus
Publikacjainformation, select the tinnitus treatment and quantitatively substantiate its effects, the measurement of the Tinnitus psychoacoustic parameters should be made an inherent part of the Tinnitus therapy. Methods For this purpose the multimedia-based sound synthesizer has been proposed for testing tinnitus and the results obtained this way are compared with the outcome of the audiometer-based Wilcoxon test. The method has been verified...
-
New approach to railway noise modeling employing Genetic Algorithms
PublikacjaMain goal of this paper was to describe an innovative method of noise prediction based on Genetic Algorithms. First part of the paper addresses the problem of growing noise, mainly in the context of a unified method for measuring noise. Further, Genetic Algorithms are described with regards to their fundamental features. Further a description is provided as to how Genetic Algorithms were used in the area of noise modeling. Next...
-
Long-term comparative evaluation of an acoustic climate in selected schools before and after the acoustic treatment
PublikacjaThe results of long-term continuous noise measurements in two selected schools are presented in the paper. Noise characteristics were measured continuously there for approximately 16 months. Measurements started eight months prior to the acoustic treatment of the school corridors of both schools. An evaluation of the acoustic climates in both schools, before and after the acoustic treatment, was performed based on comparison of...
-
Examining Feature Vector for Phoneme Recognition
PublikacjaThe aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists in several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Comparison of Classification Methods for EEG Signals of Real and Imaginary Motion
PublikacjaThe classification of EEG signals provides an important element of brain-computer interface (BCI) applications, underlying an efficient interaction between a human and a computer application. The BCI applications can be especially useful for people with disabilities. Numerous experiments aim at recognition of motion intent of left or right hand being useful for locked-in-state or paralyzed subjects in controlling computer applications....
-
Speech Analytics Based on Machine Learning
PublikacjaIn this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...
-
Gaze-tracking and acoustic vector sensors technologies for PTZ camera steering and acoustic event detection
Publikacja...
-
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
PublikacjaMethods developed for real-time time scale modification (TSM) of speech signal are presented. They are based onthe non-uniform, speech rate depended SOLA algorithm (Synchronous Overlap and Add). Influence of theproposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearingimpaired children and elderly listeners. It was shown that for the speech with average rate equal to or...
-
Video content analysis in the urban area telemonitoring system
PublikacjaThe task of constant monitoring of video streams from a large number of cameras and reviewing the recordings in order to find a specified event requires a considerable amount of time and effort from the system operators and it is prone to errors. A solution to this problem is an automatic system for constant analysis of camera images being able to raise an alarm if a predefined event is detected. The chapter presents various aspects...
-
Evaluation of excessive noise effects on hearing employing psychoacoustic dosimetry
PublikacjaResearch results regarding the noise impact on hearing applying the concept of the Psychoacoustic Noise Dosimetry (PND) are presented. The general characteristics of the PND algorithm are discussed. Additionally, the results of hearing examinations conducted in the laboratory conditions are shown. The main objective of the research was to determine the time needed for the Temporary Threshold Shift to reverse. The results were used...
-
Lip movement and gesture recognition for a multimodal human-computer interface
Publikacja -
Creating new voices using normalizing flows
PublikacjaCreating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
-
Employing flowgraphs for forward route reconstruction in video surveillance system
PublikacjaPawlak’s flowgraphs were utilized as a base idea and knowledge container for prediction and decision making algorithms applied to experimental video surveillance system. The system is used for tracking people inside buildings in order to obtain information about their appearance and movement. The fields of view of the cameras did not overlap. Therefore, when an object was moving through unsupervised areas, prediction was needed...
-
Processing of acoustical data in a multimodal bank operating room surveillance system
PublikacjaAn automatic surveillance system capable of detecting, classifying and localizing acoustic events in a bank operating room is presented. Algorithms for detection and classification of abnormal acoustic events, such as screams or gunshots are introduced. Two types of detectors are employed to detect impulsive sounds and vocal activity. A Support Vector Machine (SVM) classifier is used to discern between the different classes of...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
PublikacjaArtificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
Computer-Aided Detection of Hypertensive Retinopathy Using Depth-Wise Separable CNN
PublikacjaHypertensive retinopathy (HR) is a retinal disorder, linked to high blood pressure. The incidence of HR-eye illness is directly related to the severity and duration of hypertension. It is critical to identify and analyze HR at an early stage to avoid blindness. There are presently only a few computer-aided systems (CADx) designed to recognize HR. Instead, those systems concentrated on collecting features from many retinopathy-related...
-
Detection of moving objects in images combined from video and thermal cameras
PublikacjaAn algorithm for detection of moving objects in video streams from the monitoring cameras is presented. A system composed of a standard video camera and a thermal camera, mounted in close proximity to each other, is used for object detection. First, a background subtraction is performed in both video streams separately, using the popular Gaussian Mixture Models method. For the next processing stage, the authors propose an algorithm...
-
Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream
PublikacjaA background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....
-
Detection and localization of selected acoustic events in 3D acoustic field for smart surveillance applications
PublikacjaA method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...
-
Hand gesture recognition supported by fuzzy rules and Kalman filters
PublikacjaThe paper presents a system based on camera and multimediaprojector enabling a user to control computer applications by dynamic hand gestures. Gesture recognition methodology based on representing hand movement trajectory by motion vectors analysed using fuzzy rule-based inference is first given. For effective hand position tracking Kalman filters are employed. The system engineered is developed using J2SE and C++/OpenCV technology....
-
Computer based system for strabismus and amblyopia therapy
PublikacjaW publikacji opisano system komputerowy do badania i treningu zeza i amblyopii.W przypadku zeza i amblyopii lub tak zwanego syndromu leniwego oka terapia polega na zasłanianiu oka dominującego przez kilka godzin dziennie lub rozmywanie obrazu w tym oku poprzez zastasowanie kropli do oczu lub silnych soczewek w okularach. Taki sposób terapii powoduje zaburzenie widzenia obuocznego. Proponowane rozwiązanie zachowuje widzenie obuoczne....
-
Evaluation of Face Detection Algorithms for the Bank Client Identity Verification
PublikacjaResults of investigation of face detection algorithms efficiency in the banking client visual verification system are presented. The video recordings were made in real conditions met in three bank operating outlets employing a miniature industrial USB camera. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide...
-
Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network
PublikacjaThe goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...
-
Evaluation of Decision Fusion Methods for Multimodal Biometrics in the Banking Application
PublikacjaAn evaluation of decision fusion methods based on Dempster-Shafer Theory (DST) and its modifications is presented in the article, studied over real biometric data from the engineered multimodal banking client verification system. First, the approaches for multimodal biometric data fusion for verification are explained. Then the proposed implementation of comparison scores fusion is presented, including details on the application...
-
Acoustic Detector of Road Vehicles Based on Sound Intensity
PublikacjaA method of detecting and counting road vehicles using an acoustic sensor placed by the road is presented. The sensor measures sound intensity in two directions: parallel and perpendicular to the road. The sound intensity analysis performs acoustic event detection. A normalized position of the sound source is tracked and used to determine if the detected event is related to a moving vehicle and to establish the direction of movement....
-
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
PublikacjaDeveloping signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....
-
Genetic programming extension to APF-based monocular human body pose estimation
PublikacjaNew method of the human body pose estimation based on a single camera 2D observation is presented, aimed at smart surveillance related video analysis and action recognition. It employs 3D model of the human body, and genetic algorithm combined with annealed particle filter for searching the global optimum of model state, best matching the object's 2D observation. Additionally, new motion cost metric is employed, considering current...
-
Direct electrical stimulation of the human brain has inverse effects on the theta and gamma neural activities
PublikacjaObjective: Our goal was to analyze the electrophysiological response to direct electrical stimulation (DES) systematically applied at a wide range of parameters and anatomical sites, with particular focus on neural activities associated with memory and cognition. Methods: We used a large set of intracranial EEG (iEEG) recordings with DES from 45 subjects with electrodes...
-
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
PublikacjaThe paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...
-
Sound intensity distribution around organ pipe
PublikacjaThe aim of the paper was to compare acoustic field around the open and stopped organ pipes. The wooden organ pipe was located in the anechoic chamber and activated with a constant air flow, produced by an external air-compressor. Thus, long-term steady state response was possible to obtain. Multichannel acoustic vector sensor was used to measure the sound intensity distribution of radiated acoustic energy. Measurements have been...
-
Vehicle Detection with Self-Training for Adaptative Video Processing Embedded Platform
PublikacjaTraffic monitoring from closed-circuit television (CCTV) cameras on embedded systems is the subject of the performed experiments. Solving this problem encounters difficulties related to the hardware limitations, and possible camera placement in various positions which affects the system performance. To satisfy the hardware requirements, vehicle detection is performed using a lightweight Convolutional Neural Network (CNN), named...
-
Examining Classifiers Applied to Static Hand Gesture Recognition in Novel Sound Mixing System
PublikacjaThe main objective of the chapter is to present the methodology and results of examining various classifiers (Nearest Neighbor-like algorithm with non-nested generalization (NNge), Naive Bayes, C4.5 (J48), Random Tree, Random Forests, Artificial Neural Networks (Multilayer Perceptron), Support Vector Machine (SVM) used for static gesture recognition. A problem of effective gesture recognition is outlined in the context of the system...
-
Music Data Processing and Mining in Large Databases for Active Media
PublikacjaThe aim of this paper was to investigate the problem of music data processing and mining in large databases. Tests were performed on a large data-base that included approximately 30000 audio files divided into 11 classes cor-responding to music genres with different cardinalities. Every audio file was de-scribed by a 173-element feature vector. To reduce the dimensionality of data the Principal Component Analysis (PCA) with variable...