Search results for: AUDIO ENGINEERING, SEMANTIC AUDIO - MOST Wiedzy

  • SYNAT Music Genre Parameters PCA 19

    Research Data

    The dataset contains feature vectors obtained by applying Principal Component Analysis (PCA): 19-element vectors derived from music excerpts, covering 11 music genres. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of 52532 music excerpts described...

  • SYNAT_PCA_48

    Research Data

    This is one of a series of datasets containing feature vectors derived from music tracks. The dataset covers 51582 music tracks (22 music genres) and contains 48-element feature vectors derived from music excerpts and obtained by applying Principal Component Analysis (PCA). Originally, a feature vector containing 173 elements was conceived in earlier...

  • SYNAT_PCA_11

    Research Data

    The dataset covers 51582 music tracks (22 music genres) and contains 11-element feature vectors derived from music excerpts and obtained by applying Principal Component Analysis (PCA). Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of more than...

  • Auditory Brainstem Responses recorded employing Audio ABR device

    Research Data
    open access

    The dataset consists of ABR measurements employing click, burst and speech stimuli. Parameters of the particular stimuli were as follows:

  • Multimodal system for diagnosis and polysensory stimulation of subjects with communication disorders

    An experimental multimodal system, designed for polysensory diagnosis and stimulation of persons with impaired communication skills or even non-communicative subjects is presented. The user interface includes an eye tracking device and the EEG monitoring of the subject. Furthermore, the system consists of a device for objective hearing testing and an autostereoscopic projection system designed to stimulate subjects through their...

  • Testbed analysis of video and VoIP transmission performance in IEEE 802.11 b/g/n networks

    The aim of the work is to analyze the capabilities and limitations of different implementations of IEEE 802.11 technologies (IEEE 802.11 b/g/n) used for both video streaming and VoIP calls directed to mobile devices. Our preliminary research showed that results obtained with currently popular simulation tools can differ drastically from those obtainable in a real-world environment, so, in order to correctly evaluate performance...

    Full text available for download in the portal

  • Multimodal Surveillance Based Personal Protection System

    A novel, multimodal approach for automatic detection of abduction of a protected individual, employing a dedicated personal protection device and a city monitoring system, is proposed and overviewed. The solution is based on combining four modalities (signals coming from: Bluetooth, fixed and PTZ cameras, thermal camera, acoustic sensors). The Bluetooth signal is used continuously to monitor the protected person's presence, and in case...

  • ANALYSIS OF SPEECH SIGNAL PARAMETERS IN THE CONTEXT OF THEIR USEFULNESS IN AUTOMATIC ASSESSMENT OF SINGING EXPRESSION QUALITY

    The work concerns an approach to parameterization for the classification of emotions in singing and a comparison with emotion classification in speech. For this purpose, the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) database of emotionally expressive speech and singing was used, containing recordings of professional actors presenting six different emotions. The authors then computed mel-frequency cepstral coefficients (MFCC) and selected descriptors...

    Full text available for download in the portal

  • A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces

    Publication
    • G. Tamulevicius
    • G. Korvel
    • A. B. Yayak
    • P. Treigys
    • J. Bernataviciene
    • B. Kostek

    - Electronics - Year 2020

    In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset of more than 10,000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

    Full text available for download in the portal

  • Ranking Speech Features for Their Usage in Singing Emotion Classification

    Publication

    This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected low-level MPEG-7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

    Full text available for download in the portal

  • Automatic audio-visual threat detection

    Publication

    - Year 2010

    The concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of a new kind of multichannel miniature sound intensity sensors, digital Pan-Tilt-Zoom and fixed cameras, and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...

  • New Applications of Multimodal Human-Computer Interfaces

    Publication

    - Year 2012

    Multimodal computer interfaces and examples of their applications to education software and to software for disabled people are presented. The proposed interfaces include an interactive electronic whiteboard based on video image analysis, an application for controlling computers with gestures, and an audio interface for speech stretching for hearing-impaired and stuttering people. Application of the eye-gaze tracking system to awareness...

  • Marek Blok dr hab. inż.

    People

    Marek Blok graduated in Telecommunications from the Faculty of Electronics of Gdańsk University of Technology in 1994, obtaining an MSc Eng. degree. He received his PhD in telecommunications in 2003 at the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology. In 2017 he obtained the degree of habilitated doctor (DSc) in the discipline of telecommunications. His research interests focus on telecommunication...

  • Rough Sets Applied to Mood of Music Recognition

    Publication

    With the growth of accessible digital music libraries over the past decade, there is a need for research into automated systems for searching, organizing and recommending music. Mood of music is considered one of the most intuitive criteria for listeners, thus this work is focused on the emotional content of music and its automatic recognition. The research study presented in this work contains an attempt at music emotion recognition...

  • Bimodal Emotion Recognition Based on Vocal and Facial Features

    Emotion recognition is a crucial aspect of human communication, with applications in fields such as psychology, education, and healthcare. Identifying emotions accurately is challenging, as people use a variety of signals to express and perceive emotions. In this study, we address the problem of multimodal emotion recognition using both audio and video signals, to develop a robust and reliable system that can recognize emotions...

    Full text available for download in the portal

  • Study on CPU and RAM Resource Consumption of Mobile Devices using Streaming Services

    Publication

    Streaming multimedia services have become very popular in recent years, due to the development of wireless networks. With the growing number of mobile devices worldwide, service providers offer dedicated applications that deliver on-demand audio and video content anytime and everywhere. The aim of this study was to compare different streaming services and investigate their impact on the CPU and RAM resources, with respect...

    Full text available for download from an external service

  • Subjective and Objective Quality Evaluation Study of BPL-PLC Wired Medium

    Publication

    - Elektronika Ir Elektrotechnika - Year 2020

    This paper presents results of research on the effectiveness of bi-directional voice transmission in a 6 kV mine cable network using BPL-PLC (Broadband over Power Line - Power Line Communication) technology. It covers both the emergency cable state (supply outage with cable shorted at both ends) and the cable loaded with distorted current waveforms. The narrowband (0.5 MHz–15 MHz) and broadband (two different modes, frequency range of 3 MHz–7.5...

    Full text available for download in the portal

  • Musical Instrument Identification Using Deep Learning Approach

    Publication

    The work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata...

    Full text available for download in the portal

  • Architecture Design of a Networked Music Performance Platform for a Chamber Choir

    This paper describes an architecture design process for a Networked Music Performance (NMP) platform for medium-sized conducted music ensembles, based on remote rehearsals of the Academic Choir of Gdańsk University of Technology. The issues of real-time remote communication, in-person music performance, and NMP are described. Three iterative steps defining and extending the architecture of the NMP platform with additional features to...

    Full text available for download from an external service

  • Multimodal Audio-Visual Recognition of Traffic Events

    A demonstrator of a system for detecting dangerous road traffic events, based on simultaneous analysis of visual and acoustic data, is presented. The system is part of an automatic safety surveillance system. It uses cameras and microphones as data sources. The algorithms used are presented: sound event recognition algorithms and image analysis algorithms. The results produced by the algorithms are shown using the example of...

  • Adaptive filter for reconstruction of stereo audio signals.

    Publication

    - Year 2004

    The article discusses a method for reconstructing stereo signals corrupted by impulsive disturbances. A model of the stereo signal is defined and a Kalman filter designed for this model is presented. Modifications of the filter are presented, as a result of which the algorithm reconstructs the impulsively corrupted signal in one channel using additional information contained in the uncorrupted samples of the signal coming from...

  • Intelligent algorithms for optical track audio restoration

    The paper presents two algorithms dedicated to reducing parasitic sound distortions encountered in optical soundtracks. The first algorithm reduces broadband noise in audio recordings. It uses a psychoacoustic model of hearing based on the Unpredictability Measure of the signal. The quality of the noise reduction was assessed using intelligent methods....

  • A Device for Measuring Auditory Brainstem Responses to Audio

    Standard ABR devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer by collecting and analyzing auditory brainstem responses (ABR). The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...

    Full text available for download in the portal

  • Smart Virtual Bass Synthesis Algorithm Based on Music Genre Classification

    Publication

    The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) algorithms applied to portable computers. The proposed algorithm employed automatic music genre recognition to determine the optimum parameters for the synthesis of additional frequencies. The synthesis was carried out using the non-linear device (NLD) and phase vocoder (PV) methods depending on the music excerpt genre. Classification of musical...

  • TRANSPORT POSSIBILITY FOR MPEG-4/AVC- AND MPEG-2-ENCODED VIDEO DATA IN IPTV: A COMPARISON STUDY

    Publication

    - Year 2013

    IPTV (Television over IP) is a modern service with a great potential to expand. It uses the IP transport platform, which is already in worldwide operation. At the time of writing, two techniques are used to transport the video and audio data of IPTV: MPEG-2 TS and Native RTP. The two techniques quite definitely have an influence on both quality of service (QoS) and quality of experience (QoE). This paper sets out to demonstrate...

  • A Review of Emotion Recognition Methods Based on Data Acquired via Smartphone Sensors

    Publication

    In recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly in the environment of personal computers offering the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the increase in computing and communication capabilities of smartphones,...

    Full text available for download in the portal

  • Automatic Breath Analysis System Using Convolutional Neural Networks

    Publication

    Diseases related to the human respiratory system have always been a burden for the entire society. The situation has become particularly difficult now after the outbreak of the COVID-19 pandemic. Even now, however, it is not uncommon for people to consult their doctor too late, after the disease has developed. To protect patients from severe disease, it is recommended that any symptoms disturbing the respiratory system be detected...

    Full text available for download from an external service

  • Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera

    Publication

    This paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...

    Full text available for download from an external service

  • Speech Analytics Based on Machine Learning

    In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

    Full text available for download from an external service

  • Automatic Breath Analysis System Using Convolutional Neural Networks

    Publication

    Diseases related to the human respiratory system have always been a burden for the entire society. The situation has become particularly difficult now after the outbreak of the COVID-19 pandemic. Even now, however, it is common for people to consult their doctor too late, after the disease has developed. To protect patients from severe disease, it is recommended that any symptoms disturbing the respiratory system be detected as...

    Full text available for download from an external service

  • e-lecture "Fizyk pod wodą" ("A Physicist Underwater") - Brygida Mielewska (FTiMS)

    Online Courses
    • B. Mielewska

    The course contains lecture material entitled "Fizyk pod wodą" ("A Physicist Underwater") on the physical and biophysical aspects of diving. The lecture supplements the content of the "Biophysics" course and can also serve as standalone popular-science material that requires no specialist knowledge. The course includes a 3-part audio lecture in SCORM format, supporting materials for note-taking, and short thematic quizzes for each part. To use the full...

  • Study Analysis of Transmission Efficiency in DAB+ Broadcasting System

    Publication

    - Year 2018

    DAB+ is a very innovative and universal multimedia broadcasting system. Thanks to its updated multimedia technologies and metadata options, digital radio keeps pace with changing consumer expectations and the impact of media convergence. Analog and digital radio broadcasting differ in the devices used on both the transmitting and receiving sides, as well as in content processing mechanisms. However, the biggest difference is...

    Full text available for download in the portal

  • Comparing traffic intensity estimates employing passive acoustic radar and microwave Doppler radar sensor

    The purpose of our applied research project is to develop an autonomous road sign with built-in radar devices of our design. In this paper, we show that it is possible to calibrate the acoustic vector sensor so that it can be used to measure traffic volume and count the vehicles involved in the traffic through the analysis of the noise emitted by them. Signals obtained from a Doppler radar are used as a reference source. Although...

    Full text available for download from an external service

  • Creating a Remote Choir Performance Recording Based on an Ambisonic Approach

    The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...

    Full text available for download in the portal

  • Analysis of allophones based on audio signal recordings and parameterization

    The aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...

    Full text available for download from an external service

  • Wireless intelligent audio-video surveillance prototyping system

    Publication

    The presented system is based on the Virtex6 FPGA and several supporting devices such as a fast DDR3 memory, a small HD camera, a microphone with an A/D converter, a WiFi radio communication module, etc. The system is controlled by the Linux operating system. The Linux drivers for devices implemented in the system have been prepared. The system has been successfully verified in an H.264 compression accelerator prototype in which the most demanding...

    Full text available for download in the portal

  • Audio codec employing frequency-derived tonality measure

    Publication

    A transform codec employing an efficient algorithm for detection of spectral tonal components is presented. The tonality measure used in the MPEG psychoacoustic model is replaced with a method providing adequate tonality estimates even if the tonal components are deeply frequency modulated. The reliability of the hearing threshold estimated using the psychoacoustic model with the standardized tonality measure and with the proposed one is investigated...

  • Applications of neural networks and perceptual masking to audio restoration

    Applications of learning algorithms in the field of audio recording restoration are discussed. Particular attention is paid to the use of artificial neural networks for removing disturbing impulses. In addition, the use of an intelligent decision algorithm for controlling perceptual masking in order to reduce noise is described.

  • Wow detection and compensation employing spectral processing of audio.

    The work describes the developed algorithms for detecting and compensating parasitic frequency modulations resulting from irregular movement of the sound carrier. The proposed methods were developed with particular attention to accidental wow distortions present in archival film soundtracks. Additionally, the algorithms examine the influence of the distortions on the formant structure of signals. Analysis of changes in formant positions...

  • New algorithms for wow and flutter detection and compensation in audio

    The paper presents new methods for discriminating between natural musical effects and parasitic wow and flutter distortions. In addition, it describes methods for determining the time course of these distortions. These include: detection of signal periodicity in individual time frames, tracking of changes in power-line hum using AR-modeled signal spectra, and tracking of changes in the high-frequency bias current....

  • Language material for English audiovisual speech recognition system development

    Publication

    - Year 2013

    The bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training database of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...

  • Two-stage method of impulsive noise detection for audio signals

    A new two-stage method for detecting impulsive disturbances, based on analysis of the probability density function of the corrupted signal, is presented. An algorithm for determining the trigger level of the threshold detector is described.

  • Acoustic analysis of road traffic intensity for traffic management systems

    Publication

    - Year 2019

    The work introduces selected issues in road transport management in Poland and worldwide. In this context, market needs, requirements, and capabilities for acquiring information on the current state of road networks are presented. An acoustic method of road traffic monitoring is proposed, along with its capabilities in the context of traffic management systems. A signal acquisition scheme is presented together with reference data....

  • Evaluation of aspiration problems in L2 English pronunciation employing machine learning

    The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...

    Full text available for download in the portal

  • Buzz-based honeybee colony fingerprint

    Non-intrusive remote monitoring has applications in a variety of areas. In industrial surveillance, devices are capable of detecting anomalies that may threaten machine operation. Similarly, agricultural monitoring devices are used to supervise livestock or provide higher yields. Modern IoT devices are often coupled with Machine Learning models, which provide valuable insights into device operation. However, the data...

    Full text available for download in the portal

  • Comparative study on the effectiveness of various types of road traffic intensity detectors

    Publication

    - Year 2019

    Vehicle detection and speed measurements are crucial tasks in traffic monitoring systems. In this work, we focus on several types of electronic sensors, operating on different physical principles in order to compare their effectiveness in real traffic conditions. Commercial solutions are based on road tubes, microwave sensors, LiDARs, and video cameras. Distributed traffic monitoring systems require a high number of monitoring...

    Full text available for download from an external service

  • MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES

    Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

    Full text available for download in the portal

  • Fully Automated AI-powered Contactless Cough Detection based on Pixel Value Dynamics Occurring within Facial Regions

    Publication

    - Year 2021

    Increased interest in non-contact evaluation of the health state has led to higher expectations for delivering automated and reliable solutions that can be conveniently used during daily activities. Although some solutions for cough detection exist, they suffer from a series of limitations. Some of them rely on gesture or body pose recognition, which might not be possible in cases of occlusions, closer camera distances or impediments...

    Full text available for download from an external service

  • Multimodal human-computer interfaces based on advanced video and audio analysis

    The development history of multimodal interfaces is reviewed briefly in the introduction. Some applications of multimodal interfaces to education software for disabled people are presented. One of them, the LipMouse, is a novel, vision-based human-computer interface that tracks the user’s lip movements and detects lip gestures. A new approach to diagnosing Parkinson’s disease is also shown. The progression of the disease can be measured employing...

    Full text available for download from an external service