Search results for: ARCHIWIZACJA AUDIO-WIDEO

total: 499

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING

Journals

ISSN: 1063-6676
Two-stage method of impulsive noise detection for audio signals
Publication
- K. Cisowski
- Poznan University of Technology Academic Journals. Electrical Engineering - Year 2007
A new two-stage method for detecting impulsive disturbances is presented, based on analysis of the probability density function of the corrupted signal. An algorithm for determining the trigger level of the threshold detector is described.
IEEE Transactions on Audio Speech and Language Processing

Journals

ISSN: 1558-7916
Multimodal human-computer interfaces based on advanced video and audio analysis
Publication
- Advances in Intelligent Systems and Computing - Year 2014
The development history of multimodal interfaces is reviewed briefly in the introduction. Some applications of multimodal interfaces to education software for disabled people are presented. One of them, the LipMouse, is a novel, vision-based human-computer interface that tracks the user’s lip movements and detects lip gestures. A new approach to diagnosing Parkinson’s disease is also shown. The progression of the disease can be measured employing...

Full text to download in external service
Noise reduction in audio employing spectral unpredictability measure and neural net.
Publication
- A. Czyżewski
- M. Dziubiński
- Year 2004
... psychoacoustic model are discussed. A learning decision algorithm based on an artificial neural network was used to classify components as parasitic or useful. A new iterative procedure for computing the masking threshold is also presented. The paper includes experimental results and conclusions relating to the presented algorithms.
Intelligent acquisition of audio signals, employing neural networks and rough set algorithms
Publication
- A. Czyżewski
- Year 2003
Algorithms based on artificial neural networks and the rough set method were applied to localize audio signals corrupted by parasitic noise and reverberation. Information about the direction of sound arrival was obtained at the outputs of these algorithms on the basis of a parametric representation. Experimental results are presented and discussed.
Pomiary wartości opóźnień w torze audio urządzeń z systemem Android
Publication
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2018
This article describes methods for measuring latency in the audio path of devices running various versions of Android. The first part gives a brief characterization of the Android environment in the context of audio path latency. It then presents how latency in the audio path was measured using the SuperPowered Latency and Dr. Rick O’Rang Loopback applications. In the final...

Full text available to download
IEEE-ACM Transactions on Audio Speech and Language Processing

Journals

ISSN: 2329-9290
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publication
- Year 2018
The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...

Full text to download in external service
Evaluation of Six Degrees of Freedom 3D Audio Orchestra Recording and Playback using multi-point Ambisonic interpolation
Publication
- T. Ciotucha
- A. Rumiński
- T. Żernicki
- B. Mróz
- Scopus - Year 2021
This paper describes a strategy for recording sound and enabling six-degrees-of-freedom playback, making use of multiple simultaneous and synchronized Higher Order Ambisonics (HOA) recordings. Such a strategy enables users to navigate in a simulated 3D space and listen to the six-degrees-of-freedom recordings from different perspectives. For the evaluation of the proposed approach, an Unreal Engine-based navigable 3D audiovisual...

Full text to download in external service
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
Publication
- K. Łopatka
- Year 2015
A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
Testing A Novel Gesture-Based Mixing Interface
Publication
- M. Lech
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2013
With a digital audio workstation, in contrast to the traditional mouse-keyboard computer interface, hand gestures can be used to mix audio with eyes closed. Mixing with a visual representation of audio parameters during experiments led to broadening the panorama and a more intensive use of shelving equalizers. Listening tests proved that the use of hand gestures produces mixes that are aesthetically as good as those obtained using...

Full text available to download
EURASIP Journal on Audio Speech and Music Processing

Journals

ISSN: 1687-4714 , eISSN: 1687-4722
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication
- D. Koszewski
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2020
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available to download
Adaptive Personal Tuning of Sound in Mobile Computers
Publication
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2016
An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of their acoustic track to changing acoustic conditions of the environment and to users’ individual preferences. Signal processing algorithms are introduced that concern: linearization of frequency response, dialogue intelligibility enhancement, and dynamics processing tuned up to the users’...

Full text available to download
Editor's note and 2018 reviewers
Publication
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2018
This note refers to works published in 2018 as well as to the series of articles in the special issue: Special Issue on Augmented and Participatory Sound and Music Interaction Using Semantic Audio.

Full text to download in external service
KORPUS MOWY ANGIELSKIEJ DO CELÓW MULTIMODALNEGO AUTOMATYCZNEGO ROZPOZNAWANIA MOWY
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2016
The paper presents an audiovisual speech corpus containing 31 hours of English speech recordings. The corpus is intended for automatic audiovisual speech recognition. It contains video recordings from a high-speed stereo camera and audio recorded by a microphone array and a laptop microphone. Because it includes recordings made in noisy conditions, the corpus...
Digital Audio Effects Conference

Conferences
Sylwester Kaczmarek dr hab. inż.

People

Department of Teleinformation Networks

Sylwester Kaczmarek received his M.Sc in electronics engineering, Ph.D. and D.Sc. in switching and teletraffic science from the Gdansk University of Technology, Gdansk, Poland, in 1972, 1981 and 1994, respectively. His research interests include: IP QoS and GMPLS and SDN networks, switching, QoS routing, teletraffic, multimedia services and quality of services. Currently, his research is focused on developing and applicability...
EVENTS VISUALIZATION POST IN A DISTRIBUTED TELEINFORMATION SYSTEM FOR THE BORDER GUARD
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2017
Events Visualization Post is a part of the STRADAR project, which is dedicated to streaming real-time data in distributed dispatcher and teleinformation systems of the Border Guard. Events Visualization Post is software designed for simultaneous visualization of data of different types. In the paper, the structure of the software is presented, the process of generation of tasks is described, and the visualization of audio, files,...
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication
- D. Koszewski
- T. Görne
- G. Korvel
- B. Kostek
- EURASIP Journal on Audio Speech and Music Processing - Year 2023
The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

Full text available to download
Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publication
- G. Korvel
- P. Treigys
- G. Tamulevicus
- J. Bernataviciene
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2018
convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
Bass Enhancement Settings in Portable Devices Based on Music Genre Recognition
Publication
- P. Hoffmann
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2015
The paper presents a novel approach to the Virtual Bass Synthesis (VBS) applied to mobile devices, called Smart VBS (SVBS). The proposed algorithm uses an intelligent, rule-based setting of bass synthesis parameters adjusted to the particular music genre. Harmonic generation is based on a nonlinear device (NLD) method with the intelligent controlling system adapting to the recognized music genre. To automatically classify music...

Full text available to download
Measuring and Analyzing Audio Levels in Film, Commercials, and Movie Trailers Using Leq(A) Values and the LUFS Loudness Model. Analiza pomiarów dźwięku w filmie oraz w reklamach filmowych z wykorzystaniem modelu głośności
Publication
- Year 2015
The purpose of this paper is to describe the measurement of loudness levels in movies, movie trailers, and commercials displayed before feature films at movie theaters. In the initial section, the paper discusses the issues related to measurement of loudness levels, provides recommendations regarding permissible loudness levels during movie screenings, and mentions the applied units of measurement. The following section of the...
SYSTEMY BEZDOTYKOWEJ OCENY PARAMETRÓW ŻYCIOWYCH
Publication
- J. Rumiński
- Year 2019
The chapter presents methods for extracting biomedical signals and medical parameters from facial video. In particular, it discusses methods for obtaining the pulse from video captured in the visible range and respiration parameters from recorded thermographic image sequences.
Piotr Odya dr inż.

People

Department of Multimedia Systems

Piotr Odya was born in Gdansk in 1974. He received his M.Sc. in 1999 from the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. His thesis was related to the problem of sound quality improvement in the contemporary broadcasting studio. He is interested in video editing and multichannel sound systems. Mr. Odya's Ph.D. thesis concerned methods and algorithms for correcting...
International Symposium on Audio, Video, Image Processing and Intelligent Applications

Conferences
Grzegorz Szwoch dr hab. inż.

People

Department of Multimedia Systems

Grzegorz Szwoch was born in 1972 in Gdansk. In 1991-1996 he studied at the Technical University of Gdansk. In 1996 he graduated as a student from the Sound Engineering Department. His thesis was related to physical modeling of musical instruments. Since that time he has been a member of the research staff at the Multimedia Systems Department as a PhD student (1996-2001), Assistant (2001-2004), Assistant professor (2004-2020) and...
Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling
Publication
- S. Raczyński
- E. Vincent
- S. Sagayama
- IEEE Transactions on Audio Speech and Language Processing - Year 2013
Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...

Full text to download in external service
Instalacja artystyczna "W sztuce lubię: romantyzm, poezję i figle"
Publication
- P. Różycki
- Year 2019
The art installation "W sztuce lubię: romantyzm, poezję i figle" consists of 70 men's shirts in various colors, accompanied by a video projected on the ceiling. Exhibition at the Instytut Cybernetyki Sztuki.
Automatic sound recognition for security purposes
Publication
- P. Żwan
- Year 2008
In the paper an automatic sound recognition system is presented. It forms a part of a bigger security system developed in order to monitor outdoor places for non-typical audio-visual events. The analyzed audio signal is recorded by a microphone mounted outdoors, so non-stationary noise of significant energy is present in it. A specially designed algorithm for outdoor noise reduction is presented,...
Workflow application for detection of unwanted events
Publication
- P. Czarnul
- W. Kicior
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
A distributed application for detecting potentially dangerous events in input video streams is presented. Recognition of unwanted events triggers alarms and sends notifications to the appropriate services, and also causes the footage to be recorded. The application model consists of nodes that include cameras, acquire data streams, process data, send notifications, and store data. The implemented application...
QoS/QoE in the Heterogeneous Internet of Things (IoT)
Publication
- K. Nowicki
- T. Uhl
- Year 2017
Applications provided in the Internet of Things can generally be divided into three categories: audio, video and data. This has given rise to the popular term Triple Play Services. The most important audio applications are VoIP and audio streaming. The most notable video applications are VToIP, IPTV, and video streaming, and the service WWW is the most prominent example of data-type services. This chapter elaborates on the most...
Akcelerator transformacji DCT do kompresji obrazu w sensorach wizyjnych
Publication
- Year 2015
This communication presents a configurable digital DCT accelerator intended for an H.264 video encoder. The accelerator also performs the inverse DCT as well as quantization and dequantization. It was initially implemented in an FPGA, successfully verified, and then implemented as an ASIC in UMC 90 nm technology. Detailed test results for the ASIC accelerator were...
Akcelerator transformacji DCT do kompresji obrazu w sensorach wizyjnych
Publication
- Przegląd Elektrotechniczny - Year 2015
This communication presents a configurable digital DCT accelerator intended for an H.264 video encoder. The accelerator also performs the inverse DCT as well as quantization and dequantization. It was initially implemented in an FPGA, successfully verified, and then implemented as an ASIC in UMC 90 nm technology. Detailed test results for the ASIC accelerator were...

Full text available to download
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
Publication
- Year 2016
Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

Full text to download in external service
Analiza stanu nawierzchni i klas pojazdów na podstawie parametrów ekstrahowanych z sygnału fonicznego
Publication
- K. Marciniuk
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2016
The aim of the research is to find feature-vector parameters extracted from the audio signal in the context of automatic recognition of road surface condition and vehicle type. First, the influence of weather conditions on the spectral characteristics of the audio signal recorded near passing vehicles is presented. Next, the audio signal was parameterized and a correlation analysis was carried out in order to...

Full text available to download
Intelligent multimedia applications - scanning the issue
Publication
- B. Kostek
- Year 2005
The aim of this special issue of the JIIS journal, entitled "Intelligent multimedia processing", was to present research in this field conducted at various centers around the world. The articles in this volume concerned intelligent processing of audio and video signals, as well as music.
Reliability of Pulse Measurements in Videoplethysmography
Publication
- J. Rumiński
- Metrology and Measurement Systems - Year 2016
Reliable, remote pulse rate measurement is potentially very important for medical diagnostics and screening. In this paper the Videoplethysmography was analyzed especially to verify the possible use of signals obtained for the YUV color model in order to estimate the pulse rate, to examine what is the best pulse estimation method for short video sequences and finally, to analyze how potential PPG-signals can be distinguished from...

Full text available to download
Kaskada - scenariusze analizy strumieni multimedialnych
Publication
- J. Proficz
- Year 2010
The basic operating mechanisms of the KASKADA platform are presented with regard to the execution of simple services, which implement multimedia stream analysis algorithms, as well as complex services, which execute more complicated scenarios. Implementation problems are considered using the example of a scenario for analyzing streams from video cameras.
Examining Acoustic Emission of Engineered Ultrasound Loudspeakers
Publication
- Year 2014
Measurement results of the sound emitted from a custom-made ultrasound system with high spatial directivity are presented. The proposed system uses modulated ultrasound waves, which demodulate in a nonlinear medium, resulting in audible sound. The system is aimed at enhancing the users’ personal audio space, therefore the measurements are performed using the Head and Torso Simulator which provides the realistic reproduction of...
Akcelerator predykcji wewnątrzramkowej H.264 do kompresji obrazu w sensorach wizyjnych
Publication
- Year 2016
This communication presents a configurable digital intra-frame prediction accelerator intended for an H.264 video encoder. The accelerator performs "intra" prediction for luma macroblocks of size 4x4 and 16x16. It was first implemented in an FPGA, where it was successfully verified, and then implemented as an ASIC in UMC 90 nm technology. Detailed test results...
Akcelerator predykcji wewnątrzramkowej H.264 do kompresji obrazu w sensorach wizyjnych
Publication
- Elektronika : konstrukcje, technologie, zastosowania - Year 2016
This article presents a configurable digital intra-frame prediction accelerator intended for an H.264 video encoder. The accelerator performs "intra" prediction for luma macroblocks of size 4x4 and 16x16. It was first implemented in an FPGA, where it was successfully verified, and then implemented as an ASIC in UMC 90 nm technology. Detailed test results for the accelerator...

Full text to download in external service
A concept of Signal Equalization Method Based on Music Genre and the Listener's Room Characteristics
Publication
- B. Kostek
- P. Hoffmann
- Year 2016
A research study that investigates the influence of the room acoustics environment on the frequency characteristic of the audio signal playback is presented. First, a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the frequency response of the room, a system for room acoustics compensation based on eight-band equalizer is proposed. The system settings depend on music genre. In...
Przegląd metod szybkiego prototypowania algorytmów uczenia maszynowego w FPGA
Publication
- R. Smyk
- P. Kowalski
- Poznan University of Technology Academic Journals. Electrical Engineering - Year 2021
The article describes open tools that can be used to support rapid prototyping of machine learning (ML) and artificial intelligence (AI) algorithms on modern FPGA platforms. An example of a fast path for building a video pipeline is presented, together with an implementation of a sample processing algorithm running in live mode.

Full text available to download
Measurements and Simulations of Engineered Ultrasound Loudspeakers
Publication
- Computational Methods in Science and Technology - Year 2015
Simulation and measurement results of the sound emitted from a custom-made ultrasound system with high spatial directivity are presented. The proposed system uses modulated ultrasound waves, which demodulate in a nonlinear medium, resulting in audible sound. The system is aimed at enhancing the users’ personal audio space, therefore the measurements are performed using the Head and Torso Simulator which provides realistic reproduction...

Full text to download in external service
Intelligent multimedia solutions supporting special education needs.
Publication
- A. Czyżewski
- B. Kostek
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011
The role of computers in school education is briefly discussed. The development history of multimodal interfaces is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expression, and a speech stretching audio interface representing audio modality....
Quality Aspects in Digital Broadcasting and Webcasting Systems: Bitrate versus Loudness
Publication
- Journal of Telecommunications and Information Technology - Year 2017
In this paper the quality aspects of bitrate and loudness in digital broadcasting and webcasting systems are examined. The authors discuss a survey concerning user preferences related to processing and managing audio content. The coding efficiency of a popular audio format is analyzed in the context of storing media. An objective study on a representative group of signal samples, as well as a subjective study of the perceived...

Full text available to download
Bloki Funkcjonalne Systemów Elektronicznych 22/23
e-Learning Courses
- A. Kwiatkowski
The course contains lecture materials (slides and video materials) and laboratory materials (instructions) for the subject Bloki Funkcjonalne Systemów Elektronicznych.
Bloki Funkcjonalne Systemów Elektronicznych
e-Learning Courses
- A. Kwiatkowski
The course contains lecture materials (slides and video materials) and laboratory materials (instructions) for the subject Bloki Funkcjonalne Systemów Elektronicznych.
