Publications
Filters
total: 908
Catalog Publications
Year 2013
-
A Study on Influence of Normalization Methods on Music Genre Classification Results Employing kNN Algorithms
PublicationThis paper presents a comparison of different normalization methods applied to the set of feature vectors of music pieces. Test results show the influence of min-nlax and Zero-Mean normalization methods, employing different distance functions (Euclidean, Manhattan, Chebyshev, Minkowski) as a pre-processing for genre classification, on k-Nearest Neighbor (kNN) algorithm classification results.
-
Acoustics - new services for urban planning, research and education
PublicationThe main purpose of the presented design is twofold, namely: providing detailed information about the noise threats that occur every day in city areas and preventing the noise induced hearing loss especially among young people. An experimental system designed for the continuous monitoring of the acoustic climate of urban areas was developed and implemented within the PLGrid Plus project. The assessment of environmental threats...
-
Adaptive Method of Adjusting Flowgraph for Route Reconstruction in Video Surveillance Systems
PublicationPawlak’s flowgraph has been applied as a suitable data structure for description and anal- ysis of human behaviour in the area supervised with multicamera video surveillance system. Infor- mation contained in the flowgraph can be easily used to predict consecutive movements of a partic- ular object. Moreover, utilization of the flowgraph can support reconstructing object route from the past video images. However, such a flowgraph with...
-
An Approach to the Detection of Bank Robbery Acts Employing Thermal Image Analysis
PublicationA novel approach to the detection of selected security-related events in bank monitoring systems is presented. Thermal camera images are used for the detection of people in difficult lighting conditions. Next, the algorithm analyses movement of objects detected in thermal or standard monitoring cameras using a method evolved from the motion history images algorithm. At the same time, thermal images are analyzed in order to detect...
-
APPLICATION OF THE HIGH FREQUENCY LINEARIZATION OF THE EAR IN PATIENTS WITH TINNITUS . Metoda linearyzacji narządu słuchu u osób cierpiących z szumami usznymi
PublicationThis paper summarises the problem of tinnitus, hypotheses on its causes and the treatment methods. Moreover, a hypothesis on tinnitus origins is explained, based on the mechanisms of the analog-to-digital conversion and quantization. In addition, this paper describes methods of determining the acoustic intensity and spectra of low- level ultrasonic signals, as well as impedance characteristics of an ultrasound transducer. Furthermore,...
-
Audio-visual surveillance system for application in bank operating room
PublicationAn audio-visual surveillance system able to detect, classify and to localize acoustic events in a bank operating room is presented. Algorithms for detection and classification of abnormal acoustic events, such as screams or gunshots are introduced. Two types of detectors are employed to detect impulsive sounds and vocal activity. A Support Vector Machine (SVM) classifier is used to discern between the different classes of acoustic...
-
AUDITORY DISPLAY FROM THE MUSIC TECHNOLOGY PERSPECTIVE . Obecność wirtualnego środowiska dźwiękowego w technologiach muzycznych
PublicationThis paper presents some applications of Auditory Displays (AD) in the domain of music technology. First, the scope of music technology and auditory display areas are shortly outlined. Then, the research trends and system solutions within the fields of music technology, music information retrieval and music recommendation are discussed. Finally, an example of an auditory display that facilities music annotation process based on...
-
Auditory-visual attention stimulator
PublicationNew approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, displayed text is modified using several techniques i.e. zooming, highlighting etc. In the experimental part of...
-
Cartographic Representation of Route Reconstruction Results in Video Surveillance System
PublicationThe video streams available in a surveillance system distributed on the wide area may be accompanied by metadata are obtained as a result of video processing. Many algorithms applied to surveillance systems, e.g. event detection or object tracking, are strictly connected with localization of the object and reconstruction of its route. Drawing related information on a plan of a building or on a map of the city can facilitate the...
-
Creating Dynamic Maps of Noise Threat Using PL-Grid Infrastructure
PublicationThe paper presents functionality and operation results of a system for creating dynamic maps of acoustic noise employing the PL-Grid infrastructure extended with a distributed sensor network. The work presented provides a demonstration of the services being prepared within the PLGrid Plus project for measuring, modeling and rendering data related to noise level distribution in city agglomerations. Specific computational environments,...
-
Creating dynamic maps of noise threat using pl-grid infrastructure; materiały konferencyjne
PublicationThis paper presents functionality and operation results of the system for creating dynamic maps of noise thread with the use of the PL-Grid infrastructure integrated with distributed sensors network for measuring, modeling and rendering noise level distribution. The work presented provides a demonstration of the services being prepared within the PLGrid Plus project. Specific computational environments, so called domain grids,...
-
Detection of moving objects in images combined from video and thermal cameras
PublicationAn algorithm for detection of moving objects in video streams from the monitoring cameras is presented. A system composed of a standard video camera and a thermal camera, mounted in close proximity to each other, is used for object detection. First, a background subtraction is performed in both video streams separately, using the popular Gaussian Mixture Models method. For the next processing stage, the authors propose an algorithm...
-
Development of Domain-Specific Solutions within the Polish Infrastructure for Advanced Scientific Research
PublicationThe Polish Grid computing infrastructure was established during the PL-Grid project (2009-2012). The main purpose of this Project was to provide the Polish scientists with an IT basic platform, allowing them to conduct interdisciplinary research on a national scale, and giving them transparent access to international grid resources via international grid infrastructures. Currently, the infrastructure is maintained and extended...
-
Drum Replacement Using Wavelet Filtering Podmienianie próbek perkusyjnych przy zastosowaniu filtracji falkowej .
PublicationThe paper presents the solution that can be used to unify snare drum sound within a chosen fragment. The algorithm is based on the wavelet transformation and allows replacement of sub-bands of particular sounds, which are outside a certain range. Five experienced sound engineers put the algorithm under the test using samples of five different snare drums. Wavelet filtering seems to be useful in terms of drum replacement, while...
-
Drum Replacement Using Wavelet Filtering Podmienianie próbek perkusyjnych przy zastosowaniu filtracji falkowej
PublicationThe paper presents the solution that can be used to unify snare drum sound within a chosen fragment. The algorithm is based on the wavelet transformation and allows replacement of sub-bands of particular sounds, which are outside a certain range. Five experienced sound engineers put the algorithm under the test using
-
Europejski projekt ADDPRIV Automatyczna interpretacja danych pozyskiwanych z obrazu dla potrzeb systemów monitoringu wizyjnego funkcjonujących z poszanowaniem prywatności osób
PublicationSystemy monitorowania bezpieczeństwa publicznego generują i przechowują ogromne ilości danych implikując wzrost prawdopodobieństwa użycia tych danych w sposób nieodpowiedni z punktu widzenia ochrony danych osobowych. W niniejszym referacie zaprezentowany jest europejski projekt ADDPRIV, który bezpośrednio odnosi się do kwestii poszanowania prywatności poprzez automatyczne rozpoznawanie istotności danych pochodzących z rozproszonego systemu...
-
Evaluation of Sound Enhancement in Mobile Device Using Virtual Bass Synthesiss Algorithm
PublicationAn experiment conducted to validate possibility of use virtual bass synthesis (VBS) algorithm in a portable computer is presented. The subjective listening tests based on the procedure of pairwise comparison between VBS, based on the so-called missing fundamental phenomenon, and standard bass boost technique are employed. The evaluation was carried out in two types of conditions: in a professional listening room and employing an...
-
Examining Classifiers Applied to Static Hand Gesture Recognition in Novel Sound Mixing System
PublicationThe main objective of the chapter is to present the methodology and results of examining various classifiers (Nearest Neighbor-like algorithm with non-nested generalization (NNge), Naive Bayes, C4.5 (J48), Random Tree, Random Forests, Artificial Neural Networks (Multilayer Perceptron), Support Vector Machine (SVM) used for static gesture recognition. A problem of effective gesture recognition is outlined in the context of the system...
-
EXPERIMENTAL ANALYSIS OF CONNECTION BETWEEN OBJECT-ORIENTED METRICS AND SOFTWARE CHANGEABILITY
PublicationFor the purpose of video surveillance software quality assessment in this work the ISO/IEC-9126 norm was used with a particular focus on maintainability of the software system. The paper presents a study on the connection between software metrics derived from the static analysis of the source code and changeability of the video surveillance software system. It is shown that meeting requirements of software quality metrics may result...
-
Gesture-controlled Sound Mixing System With a Sonified Interface
PublicationIn this paper the Authors present a novel approach to sound mixing. It is materialized in a system that enables to mix sound with hand gestures recognized in a video stream. The system has been developed in such a way that mixing operations can be performed both with or without visual support. To check the hypothesis that the mixing process needs only an auditory display, the influence of audio information visualization on sound...
-
In uence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classi cation
PublicationWe present a comprehensive evaluation of the infuence of 'harmonic' and rhythmic sections contained in an audio file on automatic music genre classi cation. The study is performed using the ISMIS database composed of music files, which are represented by vectors of acoustic parameters describing low-level music features. Non-negative Matrix Factorization serves for blind separation of instrument components. Rhythmic components...
-
Influence of image transformations and quality degradations on SURF detector efficiency
PublicationA method for task-oriented examination of SURF keypoint detector accuracy is presented in the paper. It consists of generating test images, based on a given exemplar, processed by affine transformations: random rotation and scaling, and varying degree of degradations: darkening, blurring, noising, and compression. Details of applied degradation procedure are presented, followed by essentials of SURF-based images matching. A distance...
-
Language material for English audiovisual speech recognition system developmen . Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
PublicationThe bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
LINEARYZACJA CHARAKTERYSTYKI TRANSMISYJNEJ UCHA Z ZASTOSOWANIEM NISKICH POZIOMÓW SZUMU ULTRADŹWIĘKOWEGO U PACJENTÓW CIERPIĄCYCH NA SZUMY USZNE
PublicationW pracy przedstawiono pokrótce problematykę szumów usznych, przegląd hipotez ich powstawania oraz stosowane metody terapii. Dodatkowo przywołano jedną z teorii powstawania szumów usznych opartą na mechanizmie działania układów kwantyzacji. W dalszej kolejności zawarto opis przeprowadzonych badań przeprowadzonych z pacjentami cierpiącymi na szumy uszne, w których wykorzystano mechanizm linearyzacji z użyciem szumu ultradźwiękowego...
-
Low-Level Music Feature Vectors Embedded as Watermarks
PublicationIn this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...
-
Measurements of acoustic crosstalk cancellation efficiency in mobile listening conditions
PublicationThe cancellation of acoustic crosstalk is employed to enhance the stereo image in mobile listening conditions. The implementation of the crosstalk cancellation algorithm in Matlab is introduced. The measurement signals and equipment are described. A practical setup employing a mobile computer and a head and torso simulator is employed. The results of the measurements provided conclusions regarding the employment of acoustic crosstalk...
-
Metoda dopasowania charakterystyk toru fonicznego komputera przenośnego do preferencji słuchowych użytkownika
PublicationUżytkownicy urządzeń przenośnych, takich jak smartfony, tablety, ultrabooki, coraz częściej zwracają uwagę na niedoskonałości dźwięku emitowanego przez te urządzenia. Zmiana wzmocnienia czy korekcja barwy nie wystarczają, by dopasować dźwięk do preferencji użytkownika. W referacie zaproponowano nowe podejście do tego zagadnienia, polegające na dynamicznej kontroli poziomu dźwięku, tak aby jak najlepiej odwzorować sposób postrzegania...
-
Metoda i algorytmy modyfikacji sygnału do celu wspomagania rozumienia mowy przez osoby z pogorszoną rozdzielczością czasową słuchu
PublicationPrzedmiotem badań przeprowadzonych w ramach rozprawy są metody modyfikacji czasu trwania sygnału (ang. Time Scale Modification –TSM) mowy operujące w czasie rzeczywistym oraz ocena ich wpływu na rozumienie wypowiedzi przez osoby z pogorszoną rozdzielczością czasową słuchu. Pogorszona rozdzielczość słuchu jest jednym z symptomów związanych z ośrodkowymi zaburzeniami słuchu (ang. Cetnral Auditory Processing Disorder – CAPD). W odróżnieniu...
-
Metoda i algorytmy sterowania procesami miksowania dźwięku za pomocą gestów w oparciu o analizę obrazu wizyjnego
PublicationGłównym celem rozprawy było opracowanie systemu miksowania dźwięku za pomocą gestów rąk wykonywanych w powietrzu oraz zbadanie możliwości oferowanych przez takie rozwiązanie w porównaniu ze współczesną metodą miksowania sygnałów fonicznych, wykorzystującą środowisko komputera. Opracowany system rozpoznaje zarówno dynamiczne jak i statyczne gesty rąk. Rozpoznawanie gestów dynamicznych zrealizowano w oparciu o metody logiki rozmytej...
-
Metoda zliczania osób w tłumie z zastosowaniem wirtualnej bramki
PublicationW referacie przedstawiono koncepcję oraz wyniki realizacji praktycznej algorytmu zliczania osób w tłumie. Zaprezentowano szczegóły opracowanej metody zwanej wirtualną bramką, której działanie wymaga obliczenia przepływu optycznego w obrazie. Zilustrowano możliwości praktycznego zastosowania opracowanego algorytmu do zliczania osób w obszarach o rozmiarach znacznie przekraczających szerokość typowych wejść, gdzie mają zastosowanie...
-
Metody Śledzenia Obiektów W Rozproszonych Systemach Monitoringu Wideo
PublicationSystemy monitoringu wideo stały się powszechną częścią zarówno przestrzeni publicznej jak również miejsc o ograniczonym dostępie. Nadzór obszaru o dużej powierzchni wymaga rozmieszczenia wielu kamer. Skuteczna analiza przez człowieka dużej liczby obrazów wideo jest praktycznie niemożliwa. Dlatego rozwijane są metody służące do automatycznego przetwarzania wideo ukierunkowanego na analizę kontekstową. W przypadku niepokrywających...
-
Multidimensional Scaling Analysis Applied to Music Mood Recognition
PublicationThe paper presents two experiments aimed at categorizing mood associated with music. Two parts of a listening test were designed and carried out with a group of students, most of whom where users of online social music services. The initial experiment was designed to evaluate the extent to which a given label describes the mood of the particular music excerpt. The second subjective test was conducted to collect the similarity data...
-
Multimodal English corpus for automatic speech recognition
PublicationA multimodal corpus developed for research of speech recognition based on audio-visual data is presented. Besides usual video and sound excerpts, the prepared database contains also thermovision images and depth maps. All streams were recorded simultaneously, therefore the corpus enables to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Multimodal human-computer interfaces based on advanced video and audio analysis
PublicationMultimodal interfaces development history is reviewed briefly in the introduction. Examples of applications of multimodal interfaces to education software and for the disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and the audio interface for speech stretching for hearing impaired and stuttering people. The Smart...
-
Multimodal Surveillance Based Personal Protection System
PublicationA novel, multimodal approach for automatic detection of abduction of a protected individual, employing dedicated personal protection device and a city monitoring system is proposed and overviewed. The solution is based on combining four modalities (signals coming from: Bluetooth, fixed and PTZ cameras, thermal camera, acoustic sensors). The Bluetooth signal is used continuously to monitor the protected person presence, and in case...
-
Music Information Retrieval in Music Repositories
PublicationThis chapter reviews the key concepts associated with automated Music Information Retrieval (MIR). First, current research trends and system solutions in terms of music retrieval and music recommendation are discussed. Next, experiments performed on a constructed music database are presented. A proposal for music retrieval and annotation aided by gaze tracking is also discussed.
-
Music Recommendation Based on Multidimensional Description and Similarity Measures . Rekomendacja muzyki na podstawie wielowymiarowego wektora cech i miar podobieństwa
PublicationThis study aims to create an algorithm for assessing the degree to which songs belong to genres defined a priori. Such an algorithm is not aimed at providing unambiguous classification-labelling of songs, but at producing a multidimensional description encompassing all of the defined genres. The algorithm utilized data derived from the most relevant examples belonging to a particular genre of music. For this condition to be met,...
-
Network oscillations modulate interictal epileptiform spike rate during human memory
PublicationEleven patients being evaluated with intracranial electroencephalography for medically resistant temporal lobe epilepsy participated in a visual recognition memory task. Interictal epileptiform spikes were manually marked and their rate of occurrence compared between baseline and three 2 s periods spanning a 6 s viewing period. During successful, but not unsuccessful, encoding of the images there was a significant reduction in...
-
New Aspects of Virtual Sound Source Localization Research—Impact of Visual Angle and 3-D Video Content on Sound Perception
PublicationThe influence of image on virtual sound source localization, called the “image proximity effect” or the “ventriloquism effect”, is a well known phenomenon. This paper focuses on other aspects related to this effect, namely the impact of the visual angle of the presented object and 3D video content on sound perception. The research conducted confirmed that the visual angle of the presented object determines the image proximity effect...
-
Novel 5.1 Downmix Algorithm with Improved Dialogue Intelligibility
PublicationA new algorithm for 5.1 to stereo downmix is introduced, which addresses the problem of dialogue intelligibility. The algorithm utilizes proposed signal processing algorithms to enhance the intelligibility of movie dialogues, especially in difficult listening conditions or in compromised speaker setup. To account for the latter, a playback configuration utilizing a portable device, i.e. an ultrabook, is examined. The experiments...
-
OCHRONA PRYWATNOŚCI W SYSTEMACH MONITORINGU WIZYJNEGO, PRZEGLĄD OPRACOWANYCH ARCHITEKTUR I ALGORYTMÓW
PublicationNieustannie rozwijające się technologie informacyjne związane z inteligentnym monitoringiem wizyjnym stwarzają ryzyko niewłaściwego wykorzystywania danych osobowych. W celu zapewnienia prawidłowej ochrony materiału wizyjnego, w ramach projektów realizowanych w Katedrze Systemów Multimedialnych WETI PG, opracowany został szereg architektur i algorytmów, które ułatwiają ochronę danych wrażliwych, takich jak: wizerunki osób, numery...
-
Online sound restoration system for digital library applications.
PublicationAudio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Online sound restoration system for digital library applications
PublicationAudio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Open standards-based communication system for distributed intelligent surveillance solution
PublicationThe paper presents an open standards-based communication system being a part of a distributed surveillance solution. The paradigm of “intelligent” surveillance approach is introduced, and employed video processing is discussed briefly. Requirements analysis toward the design of communication subsystem architecture is presented. Special attention is paid to the multimedia streaming functionality of presented solution, which is based...
-
Parametrization and Correlation Analysis Applied to Music Mood Classification .
PublicationThe paper presents a study on music mood categorization. First, a review of music mood models is presented. Then, the preparation of a set of music excerpts to be used in the experiments and music parametrization is described. Next, some listening tasks performed to obtain mood descriptors are introduced. Finally,the correlation between mood descriptors and features extracted from parameters is discussed. The paper concludes with...
-
Pose-Configurable Generic Tracking of Elongated Objects
PublicationElongated objects have various shapes and can shift, rotate, change scale, and be rigid or deform by flexing, articulating, and vibrating, with examples as varied as a glass bottle, a robotic arm, a surgical suture, a finger pair, a tram, and a guitar string. This generally makes tracking of poses of elongated objects very challenging. We describe a unified, configurable framework for tracking the pose of elongated objects, which...
-
Reversible Video Stream Anonymization for Video Surveillance Systems Based on Pixels Relocation and Watermarking
PublicationA method of reversible video image regions of interest anonymization for applications in video surveillance systems is described. A short introduction to theanonymization procedures is presented together with the explanation of its relation to visual surveillance. A short review of state of the art of sensitive data protection in media is included. An approach to reversible Region of Interest (ROI) hiding in video is presented,...
-
Rozpoznawanie osób i zdarzeń: Charakterystyka algorytmów
PublicationRozpoznawanie osób i zdarzeń, analiza strumieni wielomadalnych, cyfrowe przetwarzanie sygnałów.
-
Rozpoznawanie osób i zdarzeń: Ocena jakościowa aplikacji
PublicationRozpoznawanie osób i zdarzeń, analiza strumieni wielomadalnych, cyfrowe przetwarzanie sygnałów.
-
Rozpoznawanie osób i zdarzeń: Opis aplikacji rozpoznawania obiektów i zdarzeń
PublicationRozpoznawanie osób i zdarzeń, analiza strumieni wielomadalnych, cyfrowe przetwarzanie sygnałów.