prof. dr hab. inż. Andrzej Czyżewski
Employment
- Head of Department at Department of Multimedia Systems
- Professor at Department of Multimedia Systems
Publications
Filters
total: 446
Catalog Publications
Year 2020
-
1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type
PublicationA network architecture that may be employed to sensing and recognition of a type of vehicle on the basis of audio recordings made in the proximity of a road is proposed in the paper. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain vehicles passing sounds are also taken into account and marked as ones containing silence....
-
Audio Feature Analysis for Precise Vocalic Segments Classification in English
PublicationAn approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...
-
Automatic Marking of Allophone Boundaries in Isolated English spoken Words
PublicationThe work presents a method that allows delimiting the borders of allophones in isolated English words. The described method is based on the DTW algorithm combining two signals, a reference signal and an analyzed one. As the reference signal, recordings from the MODALITY database were used, from which the words were extracted. This database was also used for tests, which were described. Test results show that the automatic determination...
-
Comparing traffic intensity estimates employing passive acoustic radar and microwave Doppler radar sensor
PublicationThe purpose of our applied research project is to develop an autonomous road sign with built-in radar devices of our design. In this paper, we show that it is possible to calibrate the acoustic vector sensor so that it can be used to measure traffic volume and count the vehicles involved in the traffic through the analysis of the noise emitted by them. Signals obtained from a Doppler radar are used as a reference source. Although...
Year 2016
-
3D Acoustic Field Intensity Probe Design and Measurements
PublicationThe aim of this paper is two-fold. First, some basic notions on acoustic field intensity and its measurement are shortly recalled. Then, the equipment and the measurement procedure used in the sound intensity in the performed research study are described. The second goal is to present details of the design of the engineered 3D intensity probe, as well as the algorithms developed and applied for that purpose. Results of the intensity...
-
A handwritten signature verification method employing a tablet
PublicationA signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method's efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
-
A Study in Experimental Methods of Human-Computer Communication for Patients After Severe Brain Injuries
PublicationExperimental research in the domain of multimedia technology applied to medical practice is discussed, employing a prototype of integrated multimodal system to assist diagnosis and polysensory stimulation of patients after severe brain injury. The system being developed includes among others: eye gaze tracker, and EEG monitoring of non-communicating patients after severe brain injuries. The proposed solutions are used for collecting...
-
A system for acoustic field measurement employing cartesian robot
PublicationA system setup for measurements of acoustic field, together with the results of 3D visualisations of acoustic energy flow are presented in the paper. Spatial sampling of the field is performed by a Cartesian robot. Automatization of the measurement process is achieved with the use of a specialized control system. The method is based on measuring the sound pressure (scalar) and particle velocity (vector) quantities. The aim of the...
-
Adaptive Personal Tuning of Sound in Mobile Computers
PublicationAn integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of their acoustic track to changing acoustic conditions of the environment and to users’ individual preferences. Signal processing algorithms are introduced that concern: linearization of frequency response, dialogue intelligibility enhancement, and dynamics processing tuned up to the users’...
Year 2010
-
3D Morphable Models Application for Expanding Face Database Limited to Single Frontal Face Per Person
Publication1. Zaprezentowany materiał dotyczył badań nad rozszerzeniem dysponowanej bazy wzorców wizerunków twarzy, o dodatkowe wzorce z wariacją w ustawieniu. Dodatkowe wzorce były usyskiwane poprzez przejście z wizerunku twarzy 2D na model 3D, zasymulowanie zadanego ustawienia i powrót do dziedziny 2D (poprzez rzutowanie 3D->2D). W fazie konstrukcji modelu 3D, z wizerunku 2D była ściągana zarówno tekstura twarzy jak i siatka punktów charakterystycznych....
-
A framework for automatic detection of abandoned luggage in airport terminal
PublicationA framework for automatic detection of events in a video stream transmitted from a monitoring system is presented. The framework is based on the widely used background subtraction and object tracking algorithms. The authors elaborated an algorithm for detection of left and removed objects based on mor-phological processing and edge detection. The event detection algorithm collects and analyzes data of all the moving objects in...
-
Acoustic radar employing particle velocity sensors
PublicationA concept, practical realization and applications of a passive acoustic radar to automatic localization, tracking of sound sources were presented in the paper. The device consist of the new kind of multichannel miniature sound intensity sensors and a group of digital signal processing algorithms. Contrary to active radars, it does not emit the scanning beam but after receiving surroundings sounds it provide information about the...
-
Advanced surveillance and operational communication system employing mobile terminals
PublicationDistributed surveillance and operational communications system based on XMPP protocol is presented. Its architecture and assumptions leading to the depicted design are shown. Features of XMPP protocol are portrayed with the emphasis on those most important in the context of the application. Real-time multimedia transmission with the use of Jingle/XMPP extension is discussed. The use of PDA-class computers as mobile terminals is...
-
Algorytm ekstrakcji cech biometrycznych twarzy
PublicationW referacie zawarto opis metody automatycznej lokalizacji oraz parametryzacji punktów charakterystycznych w obrazie twarzy. Do lokalizacji punktów charakterystycznych wykorzystano zmodyfikowany algorytm EBGM (ang. Elastic Bunch Graph Matching). Algorytm ten pozwala lokalizować punkty w obrazie przy założeniu niezmienności topologii grafu połączeń między nimi.W referacie przedstawiono podstawy teoretyczne metody oraz zaimplementowany...
-
Analiza zachowań tłumu w multimedialnym systemie bezpieczeństwa
PublicationW niniejszym referacie zawarto opis metody detekcji zachowań tłumu na podstawie analizy obrazu. Koncepcja docelowego wykorzystania to wspomaganie pracy operatorów w systemach monitoringu, w szczególności podczas imprez masowych, np. na stadionach wyposażonych w wiele kamer. Celem opracowanej metody jest wykrywanie normalnych oraz potencjalnie niebezpiecznych zachowań tłumu, takich jak: panika, kierunkowy ruch masy ludzi, czy gromadzenie...
-
Audio content analysis in the urban area telemonitoring system
PublicationArtykuł przedstawia możliwości rozwinięcie monitoringu miejskiego o automatyczną analizę dźwięku. Przedstawiono metody parametryzacji dźwięku, które możliwe są do zastosowania w takim systemie oraz omówiono aspekty techniczne implementacji. W kolejnej części przedstawiono system decyzyjny oparty na drzewach zastosowany w systemie. System ten rozpoznaje dźwięki niebezpieczne (strzał, rozbita szyba, krzyk) wśród dźwięków zarejestrowanych...
-
Automatic audio-visual threat detection
PublicationThe concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of new kind multichannel miniature sound intensity sensors, digital Pan Tilt Zoom and fixed cameras and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...
-
Automatic localization and continous tracking of mobile sound source using passive acoustic radar
PublicationA concept, practical realization and applications of the passive acoustic radar for localization and continuous tracking of fixed and mobile sound sources such as: cars, trucks, aircrafts and sources of shooting, explosions were presented in the paper. The device consists of the new kind of multi-channel miniature three dimensional sound intensity sensors invented by the Microflown company and a group of digital signal processing...
-
Automatyczna lokalizacja źródła dźwięku w obecności zakłóceń z wykorzystaniem wektorowych czujników akustycznych
PublicationW referacie przedstawiono pomysł i praktyczną realizację pasywnego radaru akustycznego do automatycznego lokalizowania i śledzenia źródeł dźwięku w warunkach zakłóceń. Urządzenie składa się z nowego typu wielokanałowych miniaturowych czujników natężeniowych oraz algorytmów cyfrowego przetwarzania sygnałów. Czułość radaru akustycznego została zbadana w warunkach pola swobodnego. Użyto sygnałów testowych takich jak: sygnały tonalne...
-
Badanie i terapia zaburzeń widzenia obuocznego wspomagana przez bezkontaktowy system śledzenia punktu fiksacji wzroku
PublicationNa rynku znajduje się klika systemów pozwalających na badanie lub trening syndromu leniwego oka z użyciem komputera PC. Niewiele z nich bazuje na wirtualnej rzeczywistości. Większość jedynie skupia się na terapii niedowidzenia bez mierzenia jakichkolwiek parametrów lub wykonuje tylko same pomiary. Proponowane rozwiązanie to kompletny system diagnostyczno - terapeutyczny do detekcji i terapii zaburzeń widzenia obuocznego - zwłaszcza...
-
Binocular Vision Impairments Therapy Supported By Contactless Eye-gaze Tracking System
PublicationBinocular vision impairments often result in partial or total loss of stereoscopic vision. The lack of binocular vision is a serious vision impairment that deserves more attention. Very important result of the binocular vision impairments is a binocular depth perception. This paper describes also a concept of a measurement and therapy system for the binocular vision impairments by using eye-gaze tracking system.
-
Camera angle invariant shape recognition in surveillance systems
PublicationA method for human action recognition in surveillance systems is described. Problems within this task are discussed and a solution based on 3D object models is proposed. The idea is shown and some of its limitations are talked over. Shape description methods are introduced along with their main features. Utilized parameterization algorithm is presented. Classification problem, restricted to bi-nary cases is discussed. Support vector...
Year 2018
-
A comparative study of English viseme recognition methods and algorithm
PublicationAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...
-
A comparative study of English viseme recognition methods and algorithms
PublicationAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector construction...
-
A Device for Measuring Auditory Brainstem Responses to Audio
PublicationStandard ABR devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer by collecting and analyzing auditory brainstem responses (ABR). The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...
-
An application of acoustic sensors for the monitoring of road traffic
PublicationAssessment of road traffic parameters for the developed intelligent speed limit setting decision system constitutes the subject addressed in the paper. Current traffic conditions providing vital data source for the calculation of the locally fitted speed limits are assessed employing an economical embedded platform placed at the roadside. The use of the developed platform employing a low-powered processing unit with a set of microphones,...
-
Analysis of results of large-scale multimodal biometric identity verification experiment
PublicationAn analysis of a large set of biometric data obtained during the enrolment and the verification phase in an experimental biometric system installed in bank branches is presented. Subjective opinions of bank clients and of bank tellers were also surveyed concerning the studied biometric methods in order to discover and to explore relations emerging from the obtained multimodal dataset. First, data acquisition and identity verification...
-
Assessment of Therapeutic Progress After Acquired Brain Injury Employing Electroencephalography and Autoencoder Neural Networks
PublicationA method developed for parametrization of EEG signals gathered from participants with acquired brain injuries is shown. Signals were recorded during therapeutic session consisting of a series of computer assisted exercises. Data acquisition was performed in a neurorehabilitation center located in Poland. The presented method may be used for comparing the performance of subjects with acquired brain injuries (ABI) who are involved...
-
Automatic Clustering of EEG-Based Data Associated with Brain Activity
PublicationThe aim of this paper is to present a system for automatic assigning electroencephalographic (EEG) signals to appropriate classes associated with brain activity. The EEG signals are acquired from a headset consisting of 14 electrodes placed on skull. Data gathered are first processed by the Independent Component Analysis algorithm to obtain estimates of signals generated by primary sources reflecting the activity of the brain....
-
Badanie stanu nawierzchni drogowej z wykorzystaniem uczenia maszynowego
PublicationW artykule opisano budowę systemu informowania o stanie nawierzchni drogowej z wykorzystaniem metod cyfrowego przetwarzania obrazów oraz uczenia maszynowego. Efektem wykonanych prac badawczych jest eksperymentalna platforma, pozwalająca na rejestrację uszkodzeń na drogach, system do analizy, przetwarzania i klasyfikacji danych oraz webowa aplikacja użytkownika do przeglądu stanu nawierzchni w wybranej lokalizacji.
-
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
PublicationA method for automatic transcription of English speech into International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...
Year 2009
-
A double-talk detector using audio watermarking
Publicationa novel approach to double-talk detection in the acoustic echo canceler is proposed. a hidden signature is embedded into the arriving signal, using the echo-hiding method. next detection of the presence of this signature in the microphone signal is performed. the results of the signature detection may be used by the acoustic echo canceler to stop or restart the adaptation process.
-
A new methodological approach to the noise threat evaluation based on the selected physiological properties of the human hearing system
PublicationA new way of assessment of noise-induced harmful effects on human hearing system is presented in the paper. The method takes into consideration properties of the selected physiological human hearing system. On the basis of the hearing examinations and noise measurements results and psychoacoustical noise dosimeter performance the new indicators of the noise harmfulness were proposed. The evaluation of the proposed indicators were...
-
Accelerometer signal pre-processing influence on human activity recognition
PublicationA study of data pre-processing influence on accelerometer-based human activity recognition algorithms is presented. The frequency band used to filter-out the accelerometer signals and the number of accelerometers involved were considered in terms of their influence on the recognition accuracy.
-
Audio codec employing frequency-derived tonality measure
PublicationA transform codec employing efficient algorithm for detection of spectral tonal components is presented. The tonality measure used in MPEG psychoacoustic model is replaced with the method providing adequate tonality estimates even if the tonal components are deeply frequency modulated. The reliability of hearing threshold estimated using psychoacoustic model with standardized tonality measure and the proposed one is investigated...
Year 2006
-
A hybrid speech codec employing parametric and perceptual coding techniques
PublicationW referacie przedstawiono hybrydowy kodek mowy dla zastosowan w komunikacji VoIP wykorzystujący kodowanie parametryczne i percetualne. Sygnał mowy jest dzielony na składowe dźwięczne, które podlegają kodowania perceptualnemu, składowe bezdźwięczne, które kodowane są metodą parametryczną oraz transjenty, które nie są kodowane żadną stratną metodą. Dodatkowo przedstawiono architekturę kodeka, w której perceptualnie kodowana i przesyłana...
-
Accidental wow defect evaluation using sinusoidal analysis enhanced by artificial neural networks
PublicationArtykuł przedstawia metodę do wyznaczania charakterystyki pasożytniczych modulacji częstotliwości (kołysanie) obecnych w archiwalnych nagraniach dźwiękowych. Prezentowane podejście wykorzystuje śledzenie zmian sinusoidalnych komponentów dźwięku które odzwierciedlają przebieg kołysania. Analiza sinusoidalna wykorzystana jest do ekstrakcji składowych tonalnych ze zniekształconych nagrań dźwiękowych. Dodatkowo, w celu zwiększenia...
-
Accidental wow evaluation based on sinusoidal modeling and neural nets prediction
PublicationReferat przedstawia opis algorytmu do określenia charakterystyki zniekształcenia kołysania dźwięku. Prezentowane podejście wykorzystuje sinusoidalną analizę dźwięku bazującą zarówno na amplitudowym jak i fazowym widmie sygnału fonicznego. Trajektorie poszczególnych składowych tonalnych, obrazujące zniekształcenie kołysania, określane są na podstawie analizy ich chwilowych amplitud, częstotliwości i faz. Dodatkowo referat przedstawia...
-
Applications of knowledge technologies to sound and vision engineering
PublicationSpecjalność Inżynieria Dźwięku i Obrazu jest ukierunkowana przede wszystkim na aplikacje praktyczne metod rejestracji i przetwarzania sygnałów fonicznych i wizyjnych we współczesnej telekomunikacji i w multimediach. W związku z tym, specjalność ta wykorzystuje również wiedzę z obszaru akustyki, psychofizjologii percepcji a także estetyki muzycznej. W zastosowaniach multimedialnej technologii informatycznej w telekomunikacji, w...
-
Audiovisual speech recognition for training hearing impaired patients
PublicationPraca przedstawia system rozpoznawania izolowanych głosek mowy wykorzystujący dane wizualne i akustyczne. Modele Active Shape Models zostały wykorzystane do wyznaczania parametrów wizualnych na podstawie analizy kształtu i ruchu ust w nagraniach wideo. Parametry akustyczne bazują na współczynnikach melcepstralnych. Sieć neuronowa została użyta do rozpoznawania wymawianych głosek na podstawie wektora cech zawierającego oba typy...
Year 2008
-
A low complexity double-talk detector based on the signal envelope
PublicationA new algorithm for double-talk detection, intended for use in the acoustic echo canceller for voice communication applications, is proposed. The communication system developed by the authors required the use of a double-talk detection algorithm with low complexity and good accuracy. The authors propose an approach to doubletalk detection based on the signal envelopes. For each of three signals: the far-end speech, the microphone...
-
A novel dynamic noise maps visualization tool
PublicationW referacie przedstawiono aplikację realizujacą wizualizację dynamicznych map akustycznych zintegrowaną z multimedialnym systemem monitoringu hałasu. Moduł ten został oparty na nowym podejściu do wykreślania dynamicznych map, w referacie przedstawiono porównanie wyników uzyskanych metodami tradycyjnymi i zaproponowaną metodą. Słowa kluczowe: dynamiczne mapy, wizualizacja, monitoring, hałas, system GIS
-
Automatic Singing Voice Recognition EmployingNeural Networks and Rough Sets
PublicationCelem badań jest automatyczne rozpoznawanie głosów śpiewaczych w kategorii rodzaju i jakości technicznej śpiewu. W artykule opisano stworzoną bazę danych głosów, która zawiera próbki głosu śpiewaków profesjonalnych i amatorskich. W dalszej części opisano parametry zdefiniowane w oparciu o zjawiska biomechaniczne w narządzie głosu podczas śpiewania. W oparciu o stworzone macierze parametrów wytrenowano i porównano automatyczne klasyfikatory...
-
Automatic system for audio-video material reconstruction and archiving
PublicationReferat przedstawia propozycję modelu systemu automatycznej archiwizacji i rekonstrukcji nagrań audio-wideo. Założeniem tego rozwiązania jest uczynienie procesu rekonstrukcji nagrań bardziej niezależnym od człowieka. Ma to na celu redukcję kosztów rekonstrukcji przetwarzanych nagrań. Z powodu dużej liczby archiwalnych nagrań audio-wideo istnieje potrzeba stworzenia systemu który umożliwi automatyczną indeksację ich treści. Pomoże...
Year 2015
-
A method for counting people attending large public events
PublicationThe algorithm for people counting in crowded scenes, based on the idea of virtual gate which uses optical flow method is presented. The concept and practical application of the developed algorithm under real conditions is depicted. The aim of the work is to estimate the number of people passing through entrances of a large sport hall. The most challenging problem was the unpredicted behavior of people while entering the building....
-
Analiza drgań struny gitarowej z użyciem szybkich kamer
PublicationW referacie przedstawiono metodę analizy i wizualizacji ruchu struny gitarowej. Drgania struny zostały zarejestrowane za pomocą szybkich kamer. Układ optyczny zastosowany do rejestracji został dobrany w taki sposób, by móc obserwować drgania wzdłuż struny. Obrazy zarejestrowane za pomocą szybkich kamer zostały przeanalizowane za pomocą algorytmów cyfrowego przetwarzania sygnałów tak, aby z dużą dokładnością śledzić wychylenia i...
-
Analysis of impact of audio modifications on the robustness of watermark for non-blind architecture
PublicationThe aim of this paper is to assess the robustness of the non-blind audio content watermarking scheme proposed by the authors. The authors present the architecture of the designed system along with the employed workflows for embedding and extracting the watermark followed by the implementation phase description and the analysis of the experimental results. Some possible attack simulations on the embedded watermarks are reviewed,...
-
Application of Fast Cameras to String Vibrations Recording
PublicationA hardware and software solution for guitar string vibration measurement by fast cameras is described. Orthogonal setup for 3D image acquisition is proposed capable to capture several thousand image frames per second. Dedicated image processing algorithm was developed and described in the paper, aimed at tracking the movement of some selected points along the string. Fast and accurate tracking results provided a detailed information...
-
Automatyczna weryfikacja klienta bankowego w oparciu o multimodalne technologie biometryczne
PublicationW referacie przedstawiono przegląd rozwiązań wykorzystywanych w bankach do weryfikacji tożsamości klientów. Ponadto zawarto opis metod biometrycznych aktualnie wykorzystywanych w placówkach bankowych wraz z odniesieniem do skuteczności i wygody korzystania z dostępnych rozwiązań. Zaproponowano rozszerzenie zakresu wykorzystania technologii biometrycznych, wskazując kierunek rozwoju systemów bezpieczeństwa dla poprawy dostępu do...
Year 2017
-
A Method of Object Re-identiciation Applicable to Multicamera Surveillance Systems
PublicationThe paper addresses some challenges pertaining to the methods for tracking of objects in multi-camera systems. The tracking methods related to a single Field of Vision (FOV) are quite different from inter-camera tracking, especially in case of non-overlapping FOVs. In this case, the processing is directed to determine the probability of a particular object’s identity seen in a pair of cameras in the presence of places non-observed...
-
An audio-visual corpus for multimodal automatic speech recognition
Publicationreview of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings is provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...
-
Analysis of allophones based on audio signal recordings and parameterization
PublicationThe aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...
-
Assessment of hearing in coma patients employing auditory brainstem response, electroencephalography, and eye-gaze-tracking
PublicationThe results of the study conducted by Tagliaferri et al. in 12 European countries indicate that the ratio of registered brain injury cases in Europe amounts to 150-300 per 100 000 people, with the European mean value of 235 cases per 100 000 people. The project presented in the paper assumes development of a combined metric of patients’ state remaining in coma by intelligent fusion of GCS (subjective Glasgow Coma Scale or its derivatives)...
-
Building Knowledge for the Purpose of Lip Speech Identification
PublicationConsecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...
-
Classifying type of vehicles on the basis of data extracted from audio signal characteristics
PublicationThe aim of this study is to find and optimize a feature vector for an automatic recognition of the type of vehicles, extracted form an audio signal. First, the influence of weather-based conditions of road surface on spectral characteristic of the audio signal recorded from a passing vehicle in close proximity to the road is discussed. Next, parameterization of the recorded audio signal is performed. For that purpose, the MIRtoolbox,...
-
Comparative Study of Self-Organizing Maps vs. Subjective Evaluation of Quality of Allophone Pronunciation for Nonnative English Speakers
PublicationThe purpose of this study was to apply Self-Organizing Maps to differentiate between the correct and the incorrect allophone pronunciations and to compare the results with subjective evaluation. Recordings of a list of target words, containing selected allophones of English plosive consonants, the velar nasal and the lateral consonant, were made twice. First, the target words were read from the list by 9 non-native speakers and...
Year 2012
-
A Method of Real-Time Non-uniform Speech Stretching
PublicationDeveloped method of real-time non-uniform speech stretching is presented.The proposed solution is based on the well-known SOLA algorithm(Synchronous Overlap and Add). Non-uniform time-scale modification isachieved by the adjustment of time scaling factor values in accordance with thesignal content. Dependently on the speech unit (vowels/consonants), instantaneousrate of speech (ROS), and speech signal presence, values of the scalingfactor...
-
Akustyka
PublicationW artykule przedstawiono zadania realizowane w ramach projektu PL GRID Plus przez zespół wykonawców Katedry Systemów Multimedialnych. Zadanie te obejmują przygotowanie zestawu usług umożliwiających wykonywanie obliczeń map hałasu i wpływu hałasu na słuch z wykorzystaniem infrastruktury PL GRID.
-
Analysis of impact of lossy audio compression on the robustness of watermark embedded in the DWT domain for non-blind copyright protection
PublicationA methodology of non-blind watermarking of the audio content is proposed. The outline of audio copyright problem and motivation for practical applications are discussed. The algorithmic theory pertaining watermarking techniques is briefly introduced. The system architecture together with employed workflows for embedding and extracting the watermarks are described. The implemented approach is described and obtained results are reported....
-
Application of virtual gate for counting people participating in large public events
PublicationThe concept and practical application of the developed algorithm forpeople counting in crowded scene is presented. The aim of the work is to estimatethe number of people passing towards entrances of a large sport hall. Thedetails of implemented the Virtual Gate algorithm are presented. The video signalfrom the camera installed in the building constituted the input for the algorithm.The most challenging problem was the unpredicted...
-
Awareness evaluation of patients in vegetative state employing eye-gaze tracking system
PublicationApplication of eye-gaze tracking system to awareness evaluation is demonstrated. Hitherto awareness evaluation methods are presented. The assumptions of proposed method based on analysis of visual activity of patients in vegetative state are demonstrated. The eye-gaze tracking system ''Cyber-Eye'' developed at the Multimedia Systems Department employed to conducted experiments is presented. Research described in the paper indicates...
Year 2011
-
A non-uniform real-time speech time-scale stretching method
PublicationAn algorithm for non-uniform real-time speech stretching is presented. It provides a combination of typical SOLA algorithm (Synchronous Overlap and Add ) with the vowels, consonants and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...
-
An approach to determining tinnitus acoustical characteristic
PublicationFor many treatment methods, accurate estimation of Tinnitus(ringing in ears) concerning sound type, level, and bandwidth or frequency is inevitable. The proposed way of obtaining Tinnitus parameters is described in this paper. The method employs sound synthesis, aimed at obtaining sound which is closest to perceived Tinnitus. The proposed method assumes running a designed application on a multimedia PC provided with a special graphical...
-
Application of Vector Sensors to Acoustic Surveillance of a Public Interior Space
PublicationPrzedstawiono metodę precyzyjnej detekcji i lokalizacji źródeł dźwięku w pomieszczeniach. Wykorzystano wektorowe czujniki akustyczne, dostarczające sygnałów ciśnienia akustycznego i prędkości cząsteczek powietrza. Zaprezentowano metodę lokalizacji źródeł dźwięku na widowni wydarzenia publicznego. Przedstawiono demonstracyjny system zainstalowany w sali wykładowej. System poddano ocenie dokładności na podstawie przeprowadzonych...
-
Automatic prosodic modification in a Text-To-Speech synthesizer of Polish language
PublicationPrzedstawiono system syntezy mowy polskiej z funkcją automatycznej modyfikacji prozodii wypowiedzi. Opisane zostały metody automatycznego wyznaczania akcentu i intonacji wypowiedzi. Przedstawiono zastosowanie algorytmów przetwarzania sygnału mowy w procesie kształtowania prozodii. Omówiono wpływ zastosowanych modyfikacji na naturalność brzmienia syntezowanego sygnału. Zastosowana metoda oparta jest na algorytmie TD-PSOLA. Opracowany...
-
Automatic sound source localization in disturbing conditions using acoustic vector sensors
PublicationA concept, practical realization and applications of a passive acoustic radar to automatic localization and tracking of sound sources in disturbing conditions were presented in the paper. The device consists of the new kind of multichannel miniature sound intensity sensors and a group of digital signal processing algorithms. The sensitivity of the realized acoustic radar was examined in free sound field. Several kinds of sound...
-
Badanie rozpoznawania twarzy przez człowieka z wykorzystaniem systemu śledzenia fiksacji wzroku Cyber-Oko
PublicationW celu dokładniejszego zrozumienia sposobu rozpoznawania i zapamiętywania twarzy przez człowieka przeprowadzono doświadczenie na grupie 20 osób z wykorzystaniem wcześniej opracowanego systemu śledzenia fiksacji wzroku Cyber-Oko. Wykorzystując diody i kamerę podczerwieni wraz z dedykowanym oprogramowaniem Cyber-Oko, które pozwala na śledzenie punktu skupienia wzroku na ekranie. Każdej osobie biorącej udział w doświadczeniu pokazano...
-
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
PublicationA concept and practical implementation of a crowd management system which acquires input data by the set of monitoring cameras is presented. Two leading threads are considered. First concerns the crowd behavior analysis. Second thread focuses on detection of a hold-ups in the doorway. The optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...
-
Camera sabotage detection for surveillance systems
PublicationCamera dysfunction detection algorithms and their utilization in realtime video surveillance systems are described. The purpose of using the proposed analysis is explained. Regarding image tampering three algorithms for focus loss, scene obstruction and camera displacement detection are implemented and presented. Features of each module are described and certain scenarios for best performance are depicted. Implemented solutions...
-
Communication Platform for Evaluation of Transmitted Speech Quality
PublicationA voice communication system designed and implemented is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessments by listening tests or, objective evaluation employing...
Year 2023
-
A survey of automatic speech recognition deep models performance for Polish medical terms
PublicationAmong the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....
-
Autoencoder application for anomaly detection in power consumption of lighting systems
PublicationDetecting energy consumption anomalies is a popular topic of industrial research, but there is a noticeable lack of research reported in the literature on energy consumption anomalies for road lighting systems. However, there is a need for such research because the lighting system, a key element of the Smart City concept, creates new monitoring opportunities and challenges. This paper examines algorithms based on the deep learning...
Year 2004
-
A system for multitask noisy speech enhancement.
PublicationW artykule przedstawiono ogolną charakterystyke opracowanego systemu rejestracji i rekonstrukcji mowy. Artykuł zawiera opis składników systemu, ktory jest oprogramowaniem zawierającym zaawansowane narzędzia służące poprawie zrozumiałości mowy. Zaimplementowane narzędzia systemu umożliwiają wyszukiwanie nagrań dźwiękowych i ich obróbkę przy pomocy zaimplementowanych pluginów. W artykule przedstawione wykorzystane w systemie algorytmy...
-
Badanie jakości transmisji mowy w sieciach IP.
PublicationPraca zawiera opis eksperymentu mającego na celu zbadanie relacji pomiędzy oceną subiektywną sygnału mowy a jakością transmisji tego sygnału w kanale telekomunikacyjnym. Zrealizowano symulację transmisji pakietowej sygnału mowy w sieci Internet (VoIP). Wykonano serię testów odsłuchowych opartych na listach logatomowych i odpowiednio dobranych zdaniach. Do interpretacji wyników zastosowano analizę statystyczną.
-
Badanie jakości transmisji mowy w sieciach IP.
PublicationPraca zawiera opis eksperymentu mającego na celu zbadanie relacji pomiędzy oceną subiektywną sygnału mowy a jakością transmisji tego sygnału w kanale telefonicznym VoIP. Wykorzystano symulacje transmisji pakietowej sygnału w sieci IP. Wykonano serie testów odsłuchowych opartych na listach logatomowych i odpowiednio dobranych zdaniach. Do interpretacji wyników zastosowano analizę statystyczną.
-
Comparing noise levels and audiometric testing results employing it based diagnostic systems.
PublicationW referacie przedstawiono Internetowy system przeznaczony do przeprowadzania przesiewowych testów słuchu. Zaprezentowano również system informacyjny przeznaczony do monitorowania hałasu środowiskowego. Obie Internetowe aplikacje mogą być pomocne w zmniejszaniu częstości występowania chorób słuchu powodowanych przez hałas środowiskowy i przemysłowy. Porównano wyniki testów audiometrycznych z pomiarami hałasu na podstawie zawartości...
Year 2014
-
Acceleration of decision making in sound event recognition employing supercomputing cluster
PublicationParallel processing of audio data streams is introduced to shorten the decision making time in hazardous sound event recognition. A supercomputing cluster environment with a framework dedicated to processing multimedia data streams in real time is used. The sound event recognition algorithms employed are based on detecting foreground events, calculating their features in short time frames, and classifying the events with Support...
-
Adaptive acoustic crosstalk cancellation in mobile computer device
PublicationThe cancellation of acoustic crosstalk is employed to enhance the stereo image in mobile listening conditions. A practical setup employing a mobile computer is employed. The adaptation of the crosstalk cancellation filter to the position of the listener's head is featured. The measurement evaluating the possibility of practical application of the method are described. The head and torso simulator was used for measurements. The...
-
Aktywny system RFID do lokalizacji i identyfikacji obiektów w wielomodalnej infrastrukturze bezpieczeństwa
PublicationPrzedstawiono prace koncepcyjne, badawcze oraz implementacyjne skoncentrowane na praktycznej realizacji systemu detekcji obiektów z wykorzystaniem kamer wizyjnych i identyfikacji radiowej. Zaproponowano rozbudowę wielomodalnego teleinformatycznego systemu bezpieczeństwa o warstwę identyfikacji radiowej obiektów. Omówiono założenia zaprojektowanego systemu oraz opracowaną warstwę sprzętową. Zaproponowano i przedyskutowano praktyczne...
-
Application of PL-Grid Platform for Modeling of the Selected Acoustic Phenomena
PublicationDomain grids are specific computational environments, developed within the PLGrid Plus project. For the Acoustic domain grid two supercomputer grid based services were prepared. Dedicated software consists of the outdoor sound propagation module and psychoacoustical noise dosimeter. The results are presented in a form of maps of sound level and Temporary Threshold Shift (TTS) values, therefore the services may play an informative...
-
Auto adaptation of mobile device characteristics to various acoustic conditions
PublicationThe proposed methodology of auto adaptation of the mobile device characteristics to various acoustic conditions is presented in the paper. The first goal of this study was to determine the parameters of the acoustic path of the mobile device, for both transmitting (speaker) and receiver (microphone). Results of the measurement of characteristics of mobile devices were presented. Information about characteristics of individual parts...
Year 2013
-
Acoustics - new services for urban planning, research and education
PublicationThe main purpose of the presented design is twofold, namely: providing detailed information about the noise threats that occur every day in city areas and preventing the noise induced hearing loss especially among young people. An experimental system designed for the continuous monitoring of the acoustic climate of urban areas was developed and implemented within the PLGrid Plus project. The assessment of environmental threats...
-
Adaptive Method of Adjusting Flowgraph for Route Reconstruction in Video Surveillance Systems
PublicationPawlak’s flowgraph has been applied as a suitable data structure for description and anal- ysis of human behaviour in the area supervised with multicamera video surveillance system. Infor- mation contained in the flowgraph can be easily used to predict consecutive movements of a partic- ular object. Moreover, utilization of the flowgraph can support reconstructing object route from the past video images. However, such a flowgraph with...
-
Audio-visual surveillance system for application in bank operating room
PublicationAn audio-visual surveillance system able to detect, classify and to localize acoustic events in a bank operating room is presented. Algorithms for detection and classification of abnormal acoustic events, such as screams or gunshots are introduced. Two types of detectors are employed to detect impulsive sounds and vocal activity. A Support Vector Machine (SVM) classifier is used to discern between the different classes of acoustic...
-
Auditory-visual attention stimulator
PublicationNew approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, displayed text is modified using several techniques i.e. zooming, highlighting etc. In the experimental part of...
-
Cartographic Representation of Route Reconstruction Results in Video Surveillance System
PublicationThe video streams available in a surveillance system distributed on the wide area may be accompanied by metadata are obtained as a result of video processing. Many algorithms applied to surveillance systems, e.g. event detection or object tracking, are strictly connected with localization of the object and reconstruction of its route. Drawing related information on a plan of a building or on a map of the city can facilitate the...
Year 2021
-
Adaptive Method for Modeling of Temporal Dependencies between Fields of Vision in Multi-Camera Surveillance Systems
PublicationA method of modeling the time of object transition between given pairs of cameras based on the Gaussian Mixture Model (GMM) is proposed in this article. Temporal dependencies modeling is a part of object re-identification based on the multi-camera experimental framework. The previously utilized Expectation-Maximization (EM) approach, requiring setting the number of mixtures arbitrarily as an input parameter, was extended with the...
-
An Automated Method for Biometric Handwritten Signature Authentication Employing Neural Networks
PublicationHandwriting biometrics applications in e-Security and e-Health are addressed in the course of the conducted research. An automated graphomotor analysis method for the dynamic electronic representation of the handwritten signature authentication was researched. The developed algorithms are based on dynamic analysis of electronically handwritten signatures employing neural networks. The signatures were acquired with the use of the...
-
Closer Look at the Uncertainty Estimation in Semantic Segmentation under Distributional Shift
PublicationWhile recent computer vision algorithms achieve impressive performance on many benchmarks, they lack robustness - presented with an image from a different distribution, (e.g. weather or lighting conditions not considered during training), they may produce an erroneous prediction. Therefore, it is desired that such a model will be able to reliably predict its confidence measure. In this work, uncertainty estimation for the task...
Year 2005
-
Advanced speech archiving and restoration system for aviation applications
PublicationW referacie przedstawiono opracowany System Rejestracji I Rekonstrukcji Mowy dla potrzeb lotnictwa. System ten umożliwia jednoczesny zapis, archiwizację i poprawę zrozumiałości sygnału mowy pochodzącego z wielu różnych kanałów komunikacji radiowej. Głównym celem systemu jest rejestracja i rekonstrukcja komunikatów słownych wymienianych drogą radiową pomiędzy pilotem samolotu a stacją kontroli lotów - jest to niezwykle istotne w...
-
Application of hybrid signals processors to speech and hearing aids
PublicationDzięki postępowi w technice Cyfrowych Procesorów Sygnałowych (ang. DSP) stało się możliwe budowanie miniaturowych protez słuchu i mowy. Mimo niewielkich wymiarów procesory te są w stanie wykonywać złożone algorytmy. Ich dodatkową zaletą jest łatwość zmiany oprogramowania, a co za tym idzie łatwość zmiany dziedziny zastosowań. W pracy skupiono się na zagadnieniach związanych z projektowanie i implementacją algorytmów mających zastosowanie...
Year 2022
-
Algoritmically improved microwave radar monitors breathing more acurrate than sensorized belt
PublicationThis paper describes a novel way to measure, process, analyze, and compare respiratory signals acquired by two types of devices: a wearable sensorized belt and a microwave radar-based sensor. Both devices provide breathing rate readouts. First, the background research is presented. Then, the underlying principles and working parameters of the microwave radar-based sensor, a contactless device for monitoring breathing, are described....
Year 2019
-
Application of autoencoder to traffic noise analysis
PublicationThe aim of an autoencoder neural network is to transform the input data into a lower-dimensional code and then to reconstruct the output from this code representation. Applications of autoencoders to classifying sound events in the road traffic have not been found in the literature. The presented research aims to determine whether such an unsupervised learning method may be used for deploying classification algorithms applied to...
-
Automatic labeling of traffic sound recordings using autoencoder-derived features
PublicationAn approach to detection of events occurring in road traffic using autoencoders is presented. Extensions of existing algorithms of acoustic road events detection employing Mel Frequency Cepstral Coefficients combined with classifiers based on k nearest neighbors, Support Vector Machines, and random forests are used. In our research, the acoustic signal gathered from the microphone placed near the road is split into frames and converted...
-
Comparative study on the effectiveness of various types of road traffic intensity detectors
PublicationVehicle detection and speed measurements are crucial tasks in traffic monitoring systems. In this work, we focus on several types of electronic sensors, operating on different physical principles in order to compare their effectiveness in real traffic conditions. Commercial solutions are based on road tubes, microwave sensors, LiDARs, and video cameras. Distributed traffic monitoring systems require a high number of monitoring...
Year 2002
-
Applications of neural networks and perceptual masking to audio restoration
PublicationOmówiono zastosowania algorytmów uczących się w dziedzinie rekonstruowania nagrań fonicznych. Szczególną uwagę zwrócono na zastosowanie sztucznych sieci neuronowych do usuwania zakłócających impulsów. Ponadto opisano zastosowanie inteligentnego algorytmu decyzyjnego do sterowania maskowaniem perceptualnym w celu redukowania szumu.
-
Comparing some convolution-based methods for creation of surround sound
PublicationW referacie przedstawiono eksperymenty związane z symulacją dźwięku dookólnego w sali koncertowej. W tym celu wykorzystano splot odpowiedzi impulsowej z danego wnętrza (wielokanałowe nagrania odpowiedzi impulsowej) z nagraniami z komory bezechowej. Uzyskany w ten sposób sygnał został następnie przypisany do odpowiednich kanałów w systemie dookólnym. Uzyskane w ten sposób nagrania były następnie porównywane w testach subiektywnych...
-
Comparing some convolution-based methods for creation of surround sound. W: [CD-ROM] Collected papers. First Pan-American/Iberian Meeting on Acoustics. 144 Meeting of the Acoustical Society of America. III Iberoamerican Cong- ress of Acoustics. 9o Mexican Congress of Acoustics. Cancun, Q. R. Mxico, 2-6 Dec. 2002. [B.m.:ASA]**2002 paper 2pPP22, 7 s. 8 rys. 1 tab. bibliogr. 12 poz. system nagrań dźwięku dookólnego z wykorzystaniem splotu odpowiedzi impul- sowej sali.
PublicationW referacie przedstawiono eksperymenty związane z symulacją dźwięku dookól-nego w sali koncertowej. W tym celu wykorzystano splot odpowiedzi impulsowejz danego wnętrza (wielokanałowe nagrania odpowiedzi impulsowej) z nagraniamiz komory bezechowej. Uzyskany w ten sposób sygnał został następnie przypisa-ny do odpowiednich kanałów w systemie dookólnym. Uzyskane w ten sposób nag-rania były następnie porównywane w testach subiektywnych...
Year 2003
-
Automatic assessment of the hearing aid dynamics based on fuzzy logic
PublicationPrzedstawiono podstawy koncepcyjne systemu dopasowania protez słuchu opartego na logice rozmytej. Przeprowadzono dyskusje na temat metody skalowania głośności. Następnie podano szczegóły procesu aproksymacji funkcji przynależności odzwierciedlające słuchowe wrażenia głośności. Załączono wnioski.
Year 2007
-
Automatic singing voice recognition employing neural networks and rough sets
PublicationCelem prac opisanych w referacie jest automatyczne rozpoznawanie głosów śpiewaczych. Do tego celu utworzona została baza nagrań próbek śpiewu profesjonalnego i amatorskiego. Próbki poddane zostały parametryzacji parametrami zaproponowanymi przez autorów ściśle do tego celu. Sposób wyznaczenia parametrów i ich interpretacja fizyczna przedstawione są w referacie. Parametry wprowadzane są do systemów decyzyjnych, klasyfikatorów opartych...
seen 7985 times