Search results for: AUDIO PROCESSING


Total: 225


IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING

Journals

ISSN: 1063-6676
Wow detection and compensation employing spectral processing of audio.
Publication
- Year 2004
The paper describes algorithms developed for detecting and compensating parasitic frequency modulation caused by non-uniform transport of the sound carrier. The proposed methods were developed with particular attention to the random flutter distortions present in archival film soundtracks. In addition, the algorithms examine the effect of the distortions on the formant structure of signals. Analysis of changes in formant positions...
IEEE Transactions on Audio Speech and Language Processing

Journals

ISSN: 1558-7916
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
Publication
- M. Niedźwiecki
- M. Ciołek
- IEEE Transactions on Audio Speech and Language Processing - Year 2013
In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...

Full text available for download in the portal
RENOVATION OF ARCHIVE AUDIO RECORDINGS USING SPARSE AUTOREGRESSIVE MODELING AND BIDIRECTIONAL PROCESSING
Publication
- M. Niedźwiecki
- M. Ciołek
- Year 2013
The paper presents a new approach to elimination of broadband noise and impulsive disturbances from archive audio recordings. The proposed adaptive Kalman-like algorithm, based on a sparse autoregressive model of the audio signal, simultaneously detects noise pulses, interpolates the irrevocably distorted samples and performs signal smoothing. It is shown that bidirectional (forward-backward) processing of the archive signal improves...

Full text available for download from an external service
Intelligent Audio Signal Processing − Do We Still Need Annotated Datasets?
Publication
- B. Kostek
- Year 2022
In this paper, intelligent audio signal processing examples are briefly described. The focus is, however, on the machine learning approach and datasets needed, especially for deep learning models. Years of intense research produced many important results in this area; however, the goal of fully intelligent signal processing, characterized by its autonomous acting, has not yet been achieved. Therefore, a review of state-of-the-art concerning...

Full text available for download in the portal
IEEE-ACM Transactions on Audio Speech and Language Processing

Journals

ISSN: 2329-9290
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
Publication
- K. Łopatka
- Year 2015
A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
EURASIP Journal on Audio Speech and Music Processing

Journals

ISSN: 1687-4714 , eISSN: 1687-4722
International Symposium on Audio, Video, Image Processing and Intelligent Applications

Conferences
Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering
Publication
- IEEE Transactions on Audio Speech and Language Processing - Year 2015
This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...

Full text available for download in the portal
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication
- D. Koszewski
- T. Görne
- G. Korvel
- B. Kostek
- EURASIP Journal on Audio Speech and Music Processing - Year 2023
The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

Full text available for download in the portal
Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling
Publication
- S. Raczyński
- E. Vincent
- S. Sagayama
- IEEE Transactions on Audio Speech and Language Processing - Year 2013
Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...

Full text available for download from an external service
Estimation of the short-term predictor parameters of speech under noisy conditions
Publication
- M. Kuropatwinski
- W. Kleijn
- M. Kuropatwiński
- IEEE Transactions on Audio Speech and Language Processing - Year 2006
Full text available for download from an external service
Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation
Publication
- S. Raczyński
- E. Vincent
- IEEE Transactions on Audio Speech and Language Processing - Year 2014
In this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...

Full text available for download from an external service
New approach for determining the QoS of MP3-coded voice signals in IP networks
Publication
- T. Uhl
- S. Paulsen
- K. Nowicki
- EURASIP Journal on Audio Speech and Music Processing - Year 2017
Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...

Full text available for download in the portal
Piotr Szczuko dr hab. inż.

People

Department of Multimedia Systems

Dr hab. inż. Piotr Szczuko graduated in 2002 from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology with an MSc degree in engineering. His thesis investigated the simultaneous perception of digital images and surround sound. In 2008 he defended his doctoral dissertation, titled "Zastosowanie reguł rozmytych w komputerowej animacji postaci" (Application of fuzzy rules in computer character animation), for which he received the award of the President of the Council...
Marek Blok dr hab. inż.

People

Marek Blok graduated in 1994 in Telecommunications from the Faculty of Electronics of Gdańsk University of Technology, receiving an MSc degree in engineering. He received his PhD in telecommunications in 2003 at the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology. In 2017 he received the degree of doktor habilitowany in the discipline of telecommunications. His research interests focus on telecommunication...
Michał Lech dr inż.

People

Michał Lech was born in Gdynia in 1983. In 2007 he graduated from the faculty of Electronics, Telecommunications and Informatics of Gdansk University of Technology. In June 2013, he received his Ph.D. degree. The subject of the dissertation was: “A Method and Algorithms for Controlling the Sound Mixing Processes with Hand Gestures Recognized Using Computer Vision”. The main focus of the thesis was the bias of audio perception caused...
Personal adaptive tuning of mobile computer audio
Publication
- Year 2015
An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of the acoustic track to the changing conditions and to the user's individual preferences. Original signal processing algorithms are introduced, which concern: linearization of frequency response, dialogue intelligibility enhancement and dynamics processing tuned up to the user's preferences....
Measurement of Latency in the Android Audio Path
Publication
- Year 2018
This paper describes experimental investigations comparing the audio path characteristics of various Android versions. First, information about the changes in each system version in the context of the latency they cause is presented. Then, a measurement procedure employing available applications to measure latency is described and compared with results available on the Internet. Finally, a comparison...

Full text available for download from an external service
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication
- D. Koszewski
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2020
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available for download in the portal
Data obtained via parametrization of differently mixed audio signals
Research Data
open access
- J. Stefański
- K. Marciniuk
Dataset consists of audio samples and the results of their parametrization. The extraction of music parameters was performed using MIRToolbox. Information extracted from the samples was used as a database for master's thesis titled 'The influence of audio signal processing chain in mixing on the emotional state of a music piece'.
Music Data Processing and Mining in Large Databases for Active Media
Publication
- B. Kostek
- P. Hoffmann
- Year 2014
The aim of this paper was to investigate the problem of music data processing and mining in large databases. Tests were performed on a large database that included approximately 30000 audio files divided into 11 classes corresponding to music genres with different cardinalities. Every audio file was described by a 173-element feature vector. To reduce the dimensionality of data the Principal Component Analysis (PCA) with variable...

Full text available for download from an external service
A Study on Audio Signal Processed by "Instant Mastering"
Publication
- M. Piotrowska
- S. Piotrowski
- B. Kostek
- Year 2018
An increasing amount of music produced in home- and project-studios results in development and growth of "automatic mastering services". The presented investigation explores changes introduced to audio signal by various online mastering platforms. A music set consisting of 10 songs produced in small facilities was processed by eight on-line automatic mastering services. Additionally, some laboratory-constructed signals were tested....
Fitting the mobile device characteristics to the user's hearing preferences
Publication
- Year 2014
A method for fitting the mobile computer audio characteristics to the user's hearing preferences is proposed. The process consists of two stages: calibration and dynamics processing. During the calibration phase the user performs a loudness scaling test, reporting the perceived loudness. The dynamics processing performed on this basis sets the loudness to the most comfortable level. The processing accounts for both...

Full text available for download from an external service
Adaptive Personal Tuning of Sound in Mobile Computers
Publication
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2016
An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of their acoustic track to changing acoustic conditions of the environment and to users’ individual preferences. Signal processing algorithms are introduced that concern: linearization of frequency response, dialogue intelligibility enhancement, and dynamics processing tuned up to the users’...

Full text available for download in the portal
An audio-visual corpus for multimodal automatic speech recognition
Publication
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017
A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available for download in the portal
Grzegorz Szwoch dr hab. inż.

People

Department of Multimedia Systems

Grzegorz Szwoch was born in 1972 in Gdańsk. From 1991 to 1996 he studied at the Faculty of Electronics of Gdańsk University of Technology. In 1996 he completed his studies at the Sound Engineering Department (now the Department of Multimedia Systems), defending a thesis titled "Modelowanie fizyczne wybranych instrumentów muzycznych" (Physical modeling of selected musical instruments). In the same year he joined the Department's research team as a doctoral student. Since January 2001...
Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication
- D. Koszewski
- Year 2023
The purpose of this dissertation is to develop an automatic song mixing system that is capable of automatically mixing a song with good quality in any music genre. This work recalls first the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture built based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...

Full text available for download in the portal
Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning
Publication
- B. Kostek
- Year 2023
In this paper, the data, information, knowledge, and wisdom (DIKW) pyramid is revisited in the context of deep learning applied to machine learning-based audio signal processing. A discussion on the DIKW schema is carried out, resulting in a proposal that may supplement the original concept. Parallels between DIKW pertaining to audio processing are presented based on examples of the case studies performed by the author and her collaborators....

Full text available for download from an external service
Audio content analysis in the urban area telemonitoring system
Publication
- Year 2010
The article presents the possibility of extending urban monitoring with automatic sound analysis. Sound parametrization methods applicable in such a system are presented, and technical aspects of the implementation are discussed. The next part presents the tree-based decision system used in the system. The system recognizes dangerous sounds (gunshot, broken glass, scream) among the recorded sounds...

Full text available for download from an external service
Elimination of impulsive disturbances from archive audio files – comparison of three noise pulse detection schemes
Publication
- M. Niedźwiecki
- M. Ciołek
- Year 2014
The problem of elimination of impulsive disturbances (such as clicks, pops, ticks, crackles, and record scratches) from archive audio recordings is considered and solved using autoregressive modeling. Three classical noise pulse detection schemes are examined and compared: the approach based on open-loop multi-step-ahead signal prediction, the approach based on decision-feedback signal prediction, and the double threshold approach,...

Full text available for download from an external service
Quality Aspects in Digital Broadcasting and Webcasting Systems: Bitrate versus Loudness
Publication
- Journal of Telecommunications and Information Technology - Year 2017
In this paper the quality aspects of bitrate and loudness in digital broadcasting and webcasting systems are examined. The authors discuss a survey concerning user preferences related to processing and managing audio content. The coding efficiency of a popular audio format is analyzed in the context of storing media. An objective study on a representative group of signal samples, as well as a subjective study of the perceived...

Full text available for download in the portal
Localization of impulsive disturbances in audio signals using template matching
Publication
- M. Niedźwiecki
- M. Ciołek
- DIGITAL SIGNAL PROCESSING - Year 2015
In this paper, a new solution to the problem of elimination of impulsive disturbances from audio signals, based on the matched filtering technique, is proposed. The new approach stems from the observation that a large proportion of noise pulses corrupting audio recordings have highly repetitive shapes that match several typical “patterns”. In many cases a representative set of exemplary pulse waveforms can be extracted from the...

Full text available for download in the portal
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster. Rozpoznawanie niebezpiecznych zdarzeń dźwiękowych z wykorzystaniem równoległego przetwarzania na klastrze superkomputerowym
Publication
- K. Łopatka
- A. Czyżewski
- Year 2015
A method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
Piotr Odya dr inż.

People

Department of Multimedia Systems

Piotr Odya was born in Gdańsk in 1974. In 1999 he graduated with honors from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology with an MSc degree in engineering. His thesis addressed problems of improving sound quality in the broadcast studios of contemporary radio stations. His interests include audio-video editing and multichannel sound systems. As part of his doctoral studies...
Reduction of parasitic pitch variations in archival musical recordings
Publication
- SIGNAL PROCESSING - Year 2010
A new method for reducing parasitic pitch variations in archival audio recordings is presented. The method is intended for analyzing movie soundtracks recorded in optical films. It utilizes image processing for calculating and reducing effects of tape shrinkage being one of the main reasons for parasitic pitch variations in audio accompanying moving images. As long as the film tape characteristics are known the new method can be...

Full text available for download in the portal
Acceleration of decision making in sound event recognition employing supercomputing cluster
Publication
- K. Łopatka
- A. Czyżewski
- INFORMATION SCIENCES - Year 2014
Parallel processing of audio data streams is introduced to shorten the decision making time in hazardous sound event recognition. A supercomputing cluster environment with a framework dedicated to processing multimedia data streams in real time is used. The sound event recognition algorithms employed are based on detecting foreground events, calculating their features in short time frames, and classifying the events with Support...

Full text available for download from an external service
Audio Content and Crowdsourcing: A Subjective Quality Evaluation of Radio Programs Streamed Online
Publication
- P. Falkowski-Gilski
- Year 2023
Radio broadcasting has been present in our lives for over 100 years. The transmission of speech and music signals accompanies us from an early age. Broadcasts provide the latest information from home and abroad. They also shape musical tastes and allow many artists to share their creativity. Modern distribution involves transmission over a number of terrestrial systems. The most popular are analog FM (Frequency Modulation) and...

Full text available for download from an external service
Online sound restoration system for digital library applications
Publication
- Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

Full text available for download from an external service
Online Sound Restoration for Digital Library Applications
Publication
- Year 2012
A system for sound restoration was conceived and engineered having the following features: no special sound restoration software is needed to perform audio restoration by the user, the process of restoration employs automatic reduction of noise, wow and impulse distortions performed in the online mode, no skills in digital signal processing from the user are needed. The principles of the created system and its features as well...

Full text available for download from an external service
Creating a Reliable Music Discovery and Recommendation System
Publication
- Year 2014
The aim of this paper is to show problems related to creating a reliable music discovery system. The SYNAT database that contains audio files is used for the purpose of experiments. The files are divided into 22 classes corresponding to music genres with different cardinality. Of utmost importance for a reliable music recommendation system are the assignment of audio files to their appropriate genres and optimum parameterization...

Full text available for download from an external service
Processing of musical data employing rough sets and artificial neural networks
Publication
- Year 2005
The article describes the assumptions of a system for automatic identification of music and musical sounds. The MPEG-7 standard is reviewed, with particular emphasis on audio descriptors. Problems of audio data analysis related to applications using MPEG-7 are discussed. Based on experiments, the effectiveness of low-level descriptors in the automatic recognition of musical instrument sounds is presented. The article also discusses...
Processing of musical data employing rough sets and artificial neural networks
Publication
- Year 2004
The article describes the assumptions of a system for automatic identification of music and musical sounds. The MPEG-7 standard is reviewed, with particular emphasis on audio descriptors. Problems of audio data analysis related to applications using MPEG-7 are discussed. Based on experiments, the effectiveness of low-level descriptors in the automatic recognition of musical instrument sounds is presented. The article also discusses...
Online sound restoration system for digital library applications.
Publication
- Journal of the Acoustical Society of America - Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
Publication
- M. Blok
- P. Drózda
- Archives of Acoustics - Year 2014
In this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...

Full text available for download in the portal
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication
- Year 2014
New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

Full text available for download from an external service
New semi-causal and noncausal techniques for detection of impulsive disturbances in multivariate signals with audio applications
Publication
- M. Niedźwiecki
- M. Ciołek
- IEEE TRANSACTIONS ON SIGNAL PROCESSING - Year 2017
This paper deals with the problem of localization of impulsive disturbances in nonstationary multivariate signals. Both unidirectional and bidirectional (noncausal) detection schemes are proposed. It is shown that the strengthened pulse detection rule, which combines analysis of one-step-ahead signal prediction errors with critical evaluation of leave-one-out signal interpolation errors, allows one to noticeably improve detection results...

Full text available for download in the portal
Multimodal English corpus for automatic speech recognition
Publication
- Year 2013
A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
