Search results for: 2d space feature, speech analysis, deep learning, spectrogram, cepstrogram, chromagram

Total: 149

Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publication
- G. Korvel
- P. Treigys
- G. Tamulevicius
- J. Bernataviciene
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2018
convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
Speech Analytics Based on Machine Learning
Publication
- Year 2019
In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

Full text available for download from an external service
Analysis-by-synthesis paradigm evolved into a new concept
Publication
- B. Kostek
- Journal of the Acoustical Society of America - Year 2022
This work aims at showing how the well-known analysis-by-synthesis paradigm has recently been evolved into a new concept. However, in contrast to the original idea stating that the created sound should not fail to pass the foolproof synthesis test, the recent development is a consequence of the need to create new data. Deep learning models are greedy algorithms requiring a vast amount of data that, in addition, should be correctly...

Full text available for download from an external service
Playback detection using machine learning with spectrogram features approach
Publication
- J. Dembski
- J. Rumiński
- Year 2017
This paper presents 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), HAAR wavelets with AdaBoost classifier and deep convolutional neural networks (CNN) were compared on different data partitions in respect...

Full text available for download in the portal
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publication
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Year 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available for download in the portal
Detecting Lombard Speech Using Deep Learning Approach
Publication
- K. Kąkol
- G. Korvel
- G. Tamulevicius
- B. Kostek
- SENSORS - Year 2023
Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

Full text available for download in the portal
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
Publication
- G. Tamulevicius
- G. Korvel
- A. B. Yayak
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Electronics - Year 2020
In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset with the size of more than 10.000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Full text available for download in the portal
Automated detection of pronunciation errors in non-native English speech employing deep learning
Publication
- D. Korzekwa
- Year 2023
Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...

Full text available for download in the portal
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication
- K. Kąkol
- Year 2023
The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...

Full text available for download in the portal
Deep Learning Optimization for Edge Devices: Analysis of Training Quantization Parameters
Publication
- A. Kwaśniewska
- M. Szankin
- M. Ozga
- J. Wolfe
- A. Das
- A. Zajac
- J. Rumiński
- P. Rad
- Year 2019
This paper focuses on convolution neural network quantization problem. The quantization has a distinct stage of data conversion from floating-point into integer-point numbers. In general, the process of quantization is associated with the reduction of the matrix dimension via limited precision of the numbers. However, the training and inference stages of deep learning neural network are limited by the space of the memory and a...

Full text available for download in the portal
Interpretable deep learning approach for classification of breast cancer - a comparative analysis of multiple instance learning models
Publication
- J. Buler
- R. Buler
- M. Bobowicz
- M. Ferlin
- M. Rygusik
- A. Kwasigroch
- M. Grochowski
- Year 2023
Breast cancer is the most frequent female cancer. Its early diagnosis increases the chances of a complete cure for the patient. Suitably designed deep learning algorithms can be an excellent tool for quick screening analysis and support radiologists and oncologists in diagnosing breast cancer. The design of a deep learning-based system for automated breast cancer diagnosis is not easy due to the lack of annotated data, especially...

Full text available for download from an external service
OmicSelector: automatic feature selection and deep learning modeling for omic experiments
Publication
- K. Stawiski
- M. Kaszkowiak
- D. Mikulski
- P. Hogendorf
- A. Durczyński
- J. Strzelczyk
- D. Chowdhury
- W. Fendler
- Year 2022
Full text available for download from an external service
Thermal Images Analysis Methods using Deep Learning Techniques for the Needs of Remote Medical Diagnostics
Publication
- A. Kwaśniewska
- Year 2020
Remote medical diagnostic solutions have recently gained more importance due to global demographic shifts and play a key role in evaluation of health status during epidemic. Contactless estimation of vital signs with image processing techniques is especially important since it allows for obtaining health status without the use of additional sensors. Thermography enables us to reveal additional details, imperceptible in images acquired...

Full text available for download in the portal
Orientation-aware ship detection via a rotation feature decoupling supported deep learning approach
Publication
- X. Chen
- H. Wu
- B. Han
- W. Liu
- J. Montewka
- R. W. Liu
- ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE - Year 2023
Ship imaging position plays an important role in visual navigation, and thus significant focuses have been paid to accurately extract ship imaging positions in maritime videos. Previous studies are mainly conducted in the horizontal ship detection manner from maritime image sequences. This can lead to unsatisfied ship detection performance due to that some background pixels maybe wrongly identified as ship contours. To address...

Full text available for download from an external service
Solubility of dapsone in deep eutectic solvents: Experimental analysis, molecular insights and machine learning predictions
Publication
- T. Jeliński
- M. Przybyłek
- R. Różalski
- P. Cysewski
- Polimery w Medycynie - Year 2024
Background. Dapsone (DAP) is an anti-inflammatory and antimicrobial active pharmaceutical ingredient used to treat, e.g., AIDS-related diseases. However, low solubility is a feature hampering its efficient use. Objectives. First, deep eutectic solvents...

Full text available for download in the portal
Intra-subject class-incremental deep learning approach for EEG-based imagined speech recognition
Publication
- J. S. Garcia Salinas
- A. A. Torres-García
- C. A. Reyes-García
- L. Villaseñor-Pineda
- Biomedical Signal Processing and Control - Year 2023
Brain–computer interfaces (BCIs) aim to decode brain signals and transform them into commands for device operation. The present study aimed to decode the brain activity during imagined speech. The BCI must identify imagined words within a given vocabulary and thus perform the requested action. A possible scenario when using this approach is the gradual addition of new words to the vocabulary using incremental learning methods....

Full text available for download from an external service
Analysis of the Capability of Deep Learning Algorithms for EEG-based Brain-Computer Interface Implementation
Publication
- K. Ledwosiński
- P. Czapla
- T. Kocejko
- J. Kang-Hyun
- Year 2023
Machine learning models have received significant attention for their exceptional performance in classifying electroencephalography (EEG) data. They have proven to be highly effective in extracting intricate patterns and features from the raw signal data, thereby contributing to their success in EEG classification tasks. In this study, we explore the possibilities of utilizing contemporary machine learning algorithms in decoding...

Full text available for download from an external service
Investigating Feature Spaces for Isolated Word Recognition
Publication
- G. Korvel
- G. Tamulevicius
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Year 2018
Much attention is given by researchers to the speech processing task in automatic speech recognition (ASR) over the past decades. The study addresses the issue related to the investigation of the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
Investigating Feature Spaces for Isolated Word Recognition
Publication
- P. Treigys
- G. Korvel
- G. Tamulevicius
- J. Bernataviciene
- B. Kostek
- Year 2020
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Full text available for download from an external service
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
Publication
- P. Filipowicz
- B. Kostek
- Applied Sciences-Basel - Year 2023
This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...

Full text available for download in the portal
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
Publication
- D. Korzekwa
- R. Barra-Chicote
- S. Zaporowski
- G. Beringer
- J. Lorenzo-Trueba
- A. Serafinowicz
- J. Droppo
- T. Drugman
- B. Kostek
- Year 2021
This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

Full text available for download in the portal
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication
- S. Zaporowski
- B. Kostek
- Year 2020
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Full text available for download in the portal
WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE (Using Neural Networks for the Synthesis of Speech Expressing Emotions)
Publication
- S. Zaporowski
- B. Kostek
- Year 2018
This article presents an analysis of speech-based emotion recognition solutions and the possibilities of using them in the synthesis of emotional speech by means of neural networks. Current solutions for emotion recognition in speech and methods of speech synthesis using neural networks are presented. A significant increase in the interest in and use of deep learning is currently observed in applications related...
Deep neural networks for data analysis
Online Courses
- K. Draszawka
The aim of the course is to familiarize students with the methods of deep learning for advanced data analysis. Typical areas of application of these types of methods include: image classification, speech recognition and natural language understanding.
Introduction to the special issue on machine learning in acoustics
Publication
- Z. Michalopoulou
- P. Gerstoft
- B. Kostek
- M. A. Roch
- Journal of the Acoustical Society of America - Year 2021
When we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. They were listed, but not limited to, as follows: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...

Full text available for download in the portal
Computer-assisted pronunciation training—Speech synthesis is almost all you need
Publication
- D. Korzekwa
- J. Lorenzo-Trueba
- T. Drugman
- B. Kostek
- SPEECH COMMUNICATION - Year 2022
The research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...

Full text available for download in the portal
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publication
- Year 2018
The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...

Full text available for download from an external service
Tool Wear Monitoring Using Improved Dragonfly Optimization Algorithm and Deep Belief Network
Publication
- L. Gertrude David
- R. Kumar Patra
- P. Falkowski-Gilski
- P. Bidare Divakarachari
- L. J. Antony Marcilin
- Applied Sciences-Basel - Year 2022
In recent decades, tool wear monitoring has played a crucial role in the improvement of industrial production quality and efficiency. In the machining process, it is important to predict both tool cost and life, and to reduce the equipment downtime. The conventional methods need enormous quantities of human resources and expert skills to achieve precise tool wear information. To automatically identify the tool wear types, deep...

Full text available for download in the portal
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication
- D. Koszewski
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2020
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available for download in the portal
Variable Data Structures and Customized Deep Learning Surrogates for Computationally Efficient and Reliable Characterization of Buried Objects
Publication
- R. Yurt
- H. Torpi
- A. Kizilay
- S. Kozieł
- P. Mahouti
- Scientific Reports - Year 2024
In this study, in order to characterize the buried object via deep-learning-based surrogate modeling approach, 3-D full-wave electromagnetic simulations of a GPR model has been used. The task is to predict simultaneously and independent of each characteristic parameters of a buried object of several radii at different positions (depth and lateral position) in various dispersive subsurface media. This study has analyzed variable...

Full text available for download in the portal
Looking through the past: better knowledge retention for generative replay in continual learning
Publication
- V. Khan
- S. Cygert
- K. Deja
- T. Trzciński
- B. Twardowski
- IEEE Access - Year 2024
In this work, we improve the generative replay in a continual learning setting to perform well on challenging scenarios. Because of the growing complexity of continual learning tasks, it is becoming more popular, to apply the generative replay technique in the feature space instead of image space. Nevertheless, such an approach does not come without limitations. In particular, we notice the degradation of the continually trained...

Full text available for download in the portal
Selection of Features for Multimodal Vocalic Segments Classification
Publication
- S. Zaporowski
- A. Czyżewski
- Year 2018
English speech recognition experiments are presented employing both: audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction for the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...

Full text available for download from an external service
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
Publication
- B. Kostek
- B. Szyca
- Journal of the Acoustical Society of America - Year 2023
The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...

Full text available for download in the portal
Spiral Search Grasshopper Features Selection with VGG19-ResNet50 for Remote Sensing Object Detection
Publication
- A. Stateczny
- G. Uday Kiran
- G. Bindu
- K. Ravi Chythanya
- K. Ayyappa Swamy
- Remote Sensing - Year 2022
Remote sensing object detection plays a major role in satellite imaging and is required in various scenarios such as transportation, forestry, and the ocean. Deep learning techniques provide efficient performance in remote sensing object detection. The existing techniques have the limitations of data imbalance, overfitting, and lower efficiency in detecting small objects. This research proposes the spiral search grasshopper (SSG)...

Full text available for download in the portal
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
Publication
- G. Korvel
- K. Kąkol
- O. Kurasova
- B. Kostek
- IEEE Access - Year 2020
The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Full text available for download in the portal
Deep learning techniques for biometric security: A systematic review of presentation attack detection systems
Publication
- K. Shaheed
- P. Szczuko
- M. Kumar
- I. Qureshi
- Q. Abbas
- I. Ullah
- ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE - Year 2024
Biometric technology, including finger vein, fingerprint, iris, and face recognition, is widely used to enhance security in various devices. In the past decade, significant progress has been made in improving biometric systems, thanks to advancements in deep convolutional neural networks (DCNN) and computer vision (CV), along with large-scale training datasets. However, these systems have become targets of various attacks, with...

Full text available for download from an external service
Application of gas chromatographic data and 2D molecular descriptors for accurate global mobility potential prediction
Publication
- W. Studziński
- M. Przybyłek
- A. Gackowska
- ENVIRONMENTAL POLLUTION - Year 2023
Mobility is a key feature affecting the environmental fate, which is of particular importance in the case of persistent organic pollutants (POPs) and emerging pollutants (EPs). In this study, the global mobility classification artificial neural networks-based models employing GC retention times (RT) and 2D molecular descriptors were constructed and validated. The high usability of RT was confirmed based on the feature selection...

Full text available for download from an external service
Deep CNN based decision support system for detection and assessing the stage of diabetic retinopathy
Publication
- A. Kwasigroch
- B. Jarzembinski
- M. Grochowski
- Year 2018
The diabetic retinopathy is a disease caused by long-standing diabetes. Lack of effective treatment can lead to vision impairment and even irreversible blindness. The disease can be diagnosed by examining digital color fundus photographs of retina. In this paper we propose deep learning approach to automated diabetic retinopathy screening. Deep convolutional neural networks (CNN) - the most popular kind of deep learning algorithms...

Full text available for download from an external service
A Comprehensive Analysis of Deep Neural-Based Cerebral Microbleeds Detection System
Publication
- M. Ferlin
- M. Grochowski
- A. Kwasigroch
- A. Mikołajczyk-Bareła
- E. Szurowska
- M. Grzywińska
- A. Sabisz
- Electronics - Year 2021
Machine learning-based systems are gaining interest in the field of medicine, mostly in medical imaging and diagnosis. In this paper, we address the problem of automatic cerebral microbleeds (CMB) detection in magnetic resonance images. It is challenging due to difficulty in distinguishing a true CMB from its mimics, however, if successfully solved it would streamline the radiologists work. To deal with this complex three-dimensional...

Full text available for download in the portal
PHONEME DISTORTION IN PUBLIC ADDRESS SYSTEMS
Publication
- I. Kochańska
- H. Lasota
- Year 2015
The quality of voice messages in speech reinforcement and public address systems is often poor. The sound engineering projects of such systems take care of sound intensity and possible reverberation phenomena in public space without, however, considering the influence of acoustic interference related to the number and distribution of loudspeakers. This paper presents the results of measurements and numerical simulations of the...
Voice command recognition using hybrid genetic algorithm
Publication
- M. Wroniszewska
- J. Dziedzic
- TASK Quarterly - Year 2010
Abstract: Speech recognition is a process of converting the acoustic signal into a set of words, whereas voice command recognition consists in the correct identification of voice commands, usually single words. Voice command recognition systems are widely used in the military, control systems, electronic devices, such as cellular phones, or by people with disabilities (e.g., for controlling a wheelchair or operating a computer...

Full text available for download in the portal
Distortion of speech signals in the listening area: its mechanism and measurements
Publication
- H. Lasota
- R. Mazurek
- I. Kochańska
- Year 2014
The paper deals with a problem of the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of a same form and slightly different magnitudes that reach a listener from numerous coherent sources, is accompanied by interference effects...

Full text available for download from an external service
Bi-GRU-APSO: Bi-Directional Gated Recurrent Unit with Adaptive Particle Swarm Optimization Algorithm for Sales Forecasting in Multi-Channel Retail
Publication
- A. Mogarala Guruvaya
- A. Kollu
- P. Bidare Divakarachari
- P. Falkowski-Gilski
- H. Dwaraka Praveena
- Telecom - Year 2024
In the present scenario, retail sales forecasting has a great significance in E-commerce companies. The precise retail sales forecasting enhances the business decision making, storage management, and product sales. Inaccurate retail sales forecasting can decrease customer satisfaction, inventory shortages, product backlog, and unsatisfied customer demands. In order to obtain a better retail sales forecasting, deep learning models...

Full text available for download from an external service
Data augmentation for improving deep learning in image classification problem
Publication
- A. Mikołajczyk-Bareła
- M. Grochowski
- Year 2018
These days deep learning is the fastest-growing field in the field of Machine Learning (ML) and Deep Neural Networks (DNN). Among many of DNN structures, the Convolutional Neural Networks (CNN) are currently the main tool used for the image analysis and classification purposes. Although great achievements and perspectives, deep neural networks and accompanying learning algorithms have some relevant challenges to tackle. In this...

Full text available for download from an external service
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication
- P. Rościszewski
- J. Kaliski
- Year 2017
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpora of annotated Polish speech data. We propose a MPI-based modification of the training program which minimizes the...

Full text available for download from an external service
DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING
Publication
- N. Rizun
- J. Taranenko
- Rocznik Naukowy Wydzialu Zarzadzania w Ciechanowie - Year 2017
The algorithm and the software for conducting the procedure of Preprocessing of the reviews of films in the Polish language were developed. This algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming...

Full text available for download in the portal
Impact of Visual Image Quality on Lymphocyte Detection Using YOLOv5 and RetinaNet Algorithms
Publication
- Year 2024
Lymphocytes, a type of leukocytes, play a vital role in the immune system. The precise quantification, spatial arrangement and phenotypic characterization of lymphocytes within haematological or histopathological images can serve as a diagnostic indicator of a particular lesion. Artificial neural networks, employed for the detection of lymphocytes, not only can provide support to the work of histopathologists but also enable better...

Full text available for download from an external service
Multiplicative Long Short-Term Memory with Improved Mayfly Optimization for LULC Classification
Publication
- A. Stateczny
- S. M. Bolugallu
- P. B. Divakarachari
- K. Ganesan
- J. R. Muthu
- Remote Sensing - Year 2022
Land Use and Land Cover (LULC) monitoring is crucial for global transformation, sustainable land control, urban planning, urban growth prediction, and the establishment of climate regulations for long-term development. Remote sensing images have become increasingly important in many environmental planning and land use surveys in recent times. LULC is evaluated in this research using the Sat 4, Sat 6, and Eurosat datasets. Various...

Full text available for download in the portal
Deep learning based thermal image segmentation for laboratory animals tracking
Publication
- M. Mazur-Milecka
- J. Rumiński
- QIRT Journal - Year 2020
Automated systems for behaviour classification of laboratory animals are an attractive alternative to manual scoring. However, the proper animals separation and tracking, especially when they are in close contact, is the bottleneck of the behaviour analysis systems. In this paper, we propose a method for the segmentation of thermal images of laboratory rats that are in close contact during social behaviour tests. For this, we are...

Full text available for download from an external service
English Language Learning Employing Developments in Multimedia IS
Publication
- Year 2024
In the realm of the development of information systems related to education, integrating multimedia technologies offers novel ways to enhance foreign language learning. This study investigates audio-video processing methods that leverage real-time speech rate adjustment and dynamic captioning to support English language acquisition. Through a mixed-methods analysis involving participants from a language school, we explore the impact...

Full text available for download from an external service