Search results for: training set
-
Creating neural models using an adaptive algorithm for optimal size of neural network and training set.
PublicationAn adaptive algorithm for generating neural models of linear microwave circuits is presented; it is able to estimate the optimal size of both the training set and the neural network. Several models of waveguide and microstrip discontinuities were created, and their correctness was then verified by comparing the results of mode-matching and method-of-moments analyses of band-pass filters.
-
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
PublicationThis work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
PublicationIn the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
Real and Virtual Instruments in Machine Learning – Training and Comparison of Classification Results
PublicationThe continuous growth of the computing power of processors, as well as the fact that computational clusters can be created from combined machines, allows for increasing the complexity of algorithms that can be trained. The process, however, requires expanding the basis of the training sets. One of the main obstacles in music classification is the lack of a high-quality, real-life recording database for every instrument with a variety...
-
Vehicle detector training with minimal supervision
PublicationRecently many efficient object detectors based on convolutional neural networks (CNN) have been developed and they achieved impressive performance on many computer vision tasks. However, in order to achieve practical results, CNNs require really large annotated datasets for training. While many such databases are available, many of them can only be used for research purposes. Also some problems exist where such datasets are not...
-
Color-based Detection of Bleeding in Endoscopic Images
PublicationIn this paper a color descriptor designed for bleeding detection in endoscopic images is proposed. The development of the algorithm was carried out on a representative training set of 36 images of bleeding and 25 clear images. Another 38 bleeding and 26 normal images were used in the final stage as a test set. All of the considered images were extracted from separate endoscopic examinations. The experiments include color distribution...
-
Active Learning Based on Crowdsourced Data
PublicationThe paper proposes a crowdsourcing-based approach for annotated data acquisition and means to support Active Learning training approach. In the proposed solution, aimed at data engineers, the knowledge of the crowd serves as an oracle that is able to judge whether the given sample is informative or not. The proposed solution reduces the amount of work needed to annotate large sets of data. Furthermore, it allows a perpetual increase...
-
Semantic segmentation training using imperfect annotations and loss masking
PublicationOne of the most significant factors affecting supervised neural network training is the precision of the annotations. Also, in the case of an expert group, the problem of inconsistent data annotations is an integral part of real-world supervised learning processes, well-known to researchers. One practical example is a weak ground truth delineation for medical image segmentation. In this paper, we have developed a new method of accurate...
-
Frequency response spectra applied to assess efficiency of the training techniques
PublicationThe purpose of the research is to assess the increase in muscle strength and power. The motion of a human body impacting a stationary or moving body is considered. The waveform produced by an impact is transformed into the frequency domain: the acceleration record is converted into a complex spectrum using the Discrete Fourier Transform. In this paper the applications of the discrete...
-
Biometric identity verification
PublicationThis chapter discusses methods which are capable of protecting automatic speaker verification systems (ASV) from playback attacks. Additionally, it presents a new approach, which uses computer vision techniques, such as the texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. We show that in this case training the system with large amounts of spectrogram patches may be difficult, and...
-
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
PublicationPractical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...
-
Texture Features for the Detection of Playback Attacks: Towards a Robust Solution
PublicationThis paper describes the new version of a method that is capable of protecting automatic speaker verification (ASV) systems from playback attacks. The presented approach uses computer vision techniques, such as the texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. Our goal is to make the algorithm independent from the contents of the training set as much as possible; we look for the...
-
Performance improvement of NN based RTLS by customization of NN structure - heuristic approach
PublicationThe purpose of this research is to improve the performance of the Hybrid Scene Analysis – Neural Network indoor localization algorithm applied in a Real-time Locating System (RTLS). A properly customized Neural Network structure and training algorithms for a specific operating environment will enhance the system’s performance in terms of localization accuracy and precision. Due to nonlinearity and model complexity, a heuristic analysis...
-
Musical Instrument Identification Using Deep Learning Approach
PublicationThe work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper then presents the background, i.e., metadata...
-
Active Annotation in Evaluating the Credibility of Web-Based Medical Information: Guidelines for Creating Training Data Sets for Machine Learning
PublicationBackground: The spread of false medical information on the web is rapidly accelerating. Establishing the credibility of web-based medical information has become a pressing necessity. Machine learning offers a solution that, when properly deployed, can be an effective tool in fighting medical misinformation on the web. Objective: The aim of this study is to...
-
FEEDB: A multimodal database of facial expressions and emotions
PublicationIn this paper a first version of a multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using the Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...
-
Emotion Recognition and Its Applications
PublicationThe paper proposes a set of research scenarios to be applied in four domains: software engineering, website customization, education and gaming. The goal of applying the scenarios is to assess the possibility of using emotion recognition methods in these areas. It also points out the problems of defining sets of emotions to be recognized in different applications, representing the defined emotional states, gathering the data and...
-
Low-cost data-driven modelling of microwave components using domain confinement and PCA-based dimensionality reduction
PublicationFast data-driven surrogate models can be employed as replacements of computationally demanding full-wave electromagnetic simulations to facilitate the microwave design procedures. Unfortunately, practical application of surrogate modelling is often hindered by the curse of dimensionality and/or considerable nonlinearity of the component characteristics. This paper proposes a simple yet reliable approach to cost-efficient modelling...
-
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
PublicationThis paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...
-
Creating new voices using normalizing flows
PublicationCreating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
-
Cost‐efficient performance‐driven modelling of multi‐band antennas by variable‐fidelity electromagnetic simulations and customized space mapping
PublicationElectromagnetic (EM) simulations have become an indispensable tool in the design of contemporary antennas. EM‐driven tasks, for example, parametric optimization, entail considerable computational efforts, which may be reduced by employing surrogate models. Yet, data‐driven modelling of antenna characteristics is largely hindered by the curse of dimensionality. This may be addressed using the recently reported domain‐confinement...
-
Towards Scalable Simulation of Federated Learning
PublicationFederated learning (FL) allows models to be trained on decentralized data while maintaining data privacy, which unlocks the availability of large and diverse datasets for many practical applications. The ongoing development of aggregation algorithms, distribution architectures and software implementations aims to enable federated setups employing thousands of distributed devices, selected from millions. Since the availability of...
-
Local Texture Pattern Selection for Efficient Face Recognition and Tracking
PublicationThis paper describes research aimed at finding the optimal configuration of a face recognition algorithm based on local texture descriptors (binary and ternary patterns). Since the identification module was supposed to be a part of the face tracking system developed for an interactive wearable computer, proper feature selection, allowing for real-time operation, became particularly important. Our experiments showed that it is...
-
Evaluating Performance and Accuracy Improvements for Attention-OCR
PublicationIn this paper we evaluated a set of potential improvements to the successful Attention-OCR architecture, designed to predict multiline text from unconstrained scenes in real-world images. We investigated the impact of several optimizations on the model’s accuracy, including employing dynamic RNNs (Recurrent Neural Networks), scheduled sampling, BiLSTM (Bidirectional Long Short-Term Memory) and a modified attention model. BiLSTM was...
-
Multiobjective Aerodynamic Optimization by Variable-Fidelity Models and Response Surface Surrogates
PublicationA computationally efficient procedure for multiobjective design optimization with variable-fidelity models and response surface surrogates is presented. The proposed approach uses the multiobjective evolutionary algorithm that works with a fast surrogate model, obtained with kriging interpolation of the low-fidelity model data enhanced by space-mapping correction exploiting a few high-fidelity training points. The initial Pareto...
-
Automated Classifier Development Process for Recognizing Book Pages from Video Frames
PublicationOne of the latest developments made by publishing companies is introducing mixed and augmented reality to their printed media (e.g. to produce augmented books). An important computer vision problem that they are facing is classification of book pages from video frames. The problem is non-trivial, especially considering that typical training data is limited to only one digital original per book page, while the trained classifier...
-
Accurate Modeling of Antenna Structures by Means of Domain Confinement and Pyramidal Deep Neural Networks
PublicationThe importance of surrogate modeling techniques has been gradually increasing in the design of antenna structures over the recent years. Perhaps the most important reason is a high cost of full-wave electromagnetic (EM) analysis of antenna systems. Although imperative in ensuring evaluation reliability, it entails considerable computational expenses. These are especially pronounced when carrying out EM-driven design tasks such...
-
Domain segmentation for low-cost surrogate-assisted multi-objective design optimisation of antennas
PublicationInformation regarding the best possible design trade-offs of an antenna structure can be obtained through multiobjective optimisation (MO). Unfortunately, MO is extremely challenging if full-wave electromagnetic (EM) simulation models are used for performance evaluation. Yet, for the majority of contemporary antennas, EM analysis is the only tool that ensures reliability. This study introduces a procedure for accelerated...
-
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster
PublicationA method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. An evaluation of the recognition engine is provided, both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
-
Low-Cost Multi-Objective Optimization Yagi-Uda Antenna in Multi-Dimensional Parameter Space
PublicationA surrogate-based technique for fast multi-objective optimization of a multi-parameter planar Yagi-Uda antenna structure is presented. The proposed method utilizes response surface approximation (RSA) models constructed using training samples obtained from evaluation of the low-fidelity antenna model. Utilization of the RSA models allows for fast determination of the best possible trade-offs between conflicting objectives in multi-objective...
-
Pose classification in the gesture recognition using the linear optical sensor
PublicationGesture sensors for mobile devices that are capable of distinguishing hand poses require efficient and accurate classifiers in order to recognize gestures based on sequences of primitives. Two methods of pose recognition for the optical linear sensor were proposed and validated. The Gaussian distribution fitting and Artificial Neural Network based methods represent two kinds of classification approaches. Three types...
-
On Reduced-Cost Design-Oriented Constrained Surrogate Modeling of Antenna Structures
PublicationDesign of contemporary antenna structures heavily relies on full-wave electromagnetic (EM) simulation models. Such models are essential to ensure reliability of evaluating antenna characteristics, yet, they are computationally expensive and therefore unsuitable for handling tasks that require multiple analyses, e.g., parametric optimization. The cost issue can be alleviated by using fast surrogate models. Conventional data-driven...
-
Evaluation of sound event detection, classification and localization in the presence of background noise for acoustic surveillance of hazardous situations
PublicationAn evaluation of the sound event detection, classification and localization of hazardous acoustic events in the presence of background noise of different types and changing intensities is presented. The methods for separating foreground events from the acoustic background are introduced. The classifier, based on a Support Vector Machine algorithm, is described. The set of features and samples used for the training of the classifier...
-
Optimizing Medical Personnel Speech Recognition Models Using Speech Synthesis and Reinforcement Learning
PublicationText-to-Speech synthesis (TTS) can be used to generate training data for building Automatic Speech Recognition models (ASR). Access to medical speech data is limited because it is sensitive data that is difficult to obtain for privacy reasons; TTS can help expand the data set. Speech can be synthesized by mimicking different accents, dialects, and speaking styles that may occur in a medical language. Reinforcement Learning (RL), in the...
-
Viewpoint independent shape-based object classification for video surveillance
PublicationA method for shape-based object classification is presented. Unlike methods based on object dimensions, it does not require any system calibration techniques. A number of 3D object models are utilized as a source of the training dataset for a specified camera orientation. Using the 3D models allows the dataset creation process to be performed semi-automatically. The background subtraction method is used for the purpose of detecting moving objects...
-
Uniform sampling in constrained domains for low-cost surrogate modeling of antenna input characteristics
PublicationIn this letter, a design of experiments technique that permits uniform sampling in constrained domains is proposed. The discussed method is applied to generate training data for construction of fast replacement models (surrogates) of antenna input characteristics. The modeling process is design-oriented with the surrogate domain spanned by a set of reference designs optimized with respect to the performance figures and/or operating...
-
The Influence of Selecting Regions from Endoscopic Video Frames on The Efficiency of Large Bowel Disease Recognition Algorithms
PublicationThe article presents our research in the field of the automatic diagnosis of large intestine diseases on endoscopic video. It focuses on the methods of selecting regions of interest from endoscopic video frames for further analysis by specialized disease recognition algorithms. Four methods of selecting regions of interest have been discussed: a. trivial, b. with the deletion of characteristic, endoscope specific additions to the...
-
Monitoring Parkinson's disease patients employing biometric sensors and rule-based data processing
PublicationThe article presents an automatic system for detecting health deterioration in patients with Parkinson's disease, developed within the PERFORM project. The paper presents how rule-based processing can be applied to automatically evaluate the motor state of Parkinson's Disease patients. Automatic monitoring of patients by using biometric sensors can provide assessment of the Parkinson's Disease symptoms. All data on PD patients' state are compared...
-
OBTAINING FLUID FLOW PATTERN FOR TURBINE STAGE WITH NEURAL MODEL.
PublicationThe paper discusses the possibility of applying a neural model to obtain patterns of proper operation of the fluid flow in a turbine stage for fluid-flow diagnostics. The main differences between Computational Fluid Dynamics (CFD) solvers and the neural model are given, and the limitations and advantages of both are considered. The calculation times of both methods are given, along with the possibilities of shortening that time while preserving the accuracy...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
PublicationA method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
-
Accelerated multi-objective design optimization of antennas by surrogate modeling and domain segmentation
PublicationMulti-objective optimization yields indispensable information about the best possible design trade-offs of an antenna structure, yet it is challenging if full-wave electromagnetic (EM) analysis is utilized for performance evaluation. The latter is a necessity for the majority of contemporary antennas as it is the only way of achieving acceptable modeling accuracy. In this paper, a procedure for accelerated multi-objective design of...
-
Detection, classification and localization of acoustic events in the presence of background noise for acoustic surveillance of hazardous situations
PublicationEvaluation of sound event detection, classification and localization of hazardous acoustic events in the presence of background noise of different types and changing intensities is presented. The methods for discerning between the events being in focus and the acoustic background are introduced. The classifier, based on a Support Vector Machine algorithm, is described. The set of features and samples used for the training of the...
-
Reliable Surrogate Modeling of Antenna Input Characteristics by Means of Domain Confinement and Principal Components
PublicationA reliable design of contemporary antenna structures necessarily involves full-wave electromagnetic (EM) analysis which is the only tool capable of accounting, for example, for element coupling or the effects of connectors. As EM simulations tend to be CPU-intensive, surrogate modeling allows for relieving the computational overhead of design tasks that require numerous analyses, for example, parametric optimization or uncertainty...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
PublicationArtificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
News that Moves the Market: DSEX-News Dataset for Forecasting DSE Using BERT
PublicationThe stock market is a complex and dynamic industry that has always presented challenges for stakeholders and investors due to its unpredictable nature. This unpredictability motivates the need for more accurate prediction models. Traditional prediction models have limitations in handling the dynamic nature of the stock market. Additionally, previous methods have used less relevant data, leading to suboptimal performance. This study...
-
Assessing the attractiveness of human face based on machine learning
PublicationThe attractiveness of the face plays an important role in everyday life, especially in the modern world where social media and the Internet surround us. In this study, an attempt to assess the attractiveness of a face by machine learning is shown. Attractiveness is determined by three deep models whose sum of predictions is the final score. Two annotated datasets available in the literature are employed for training and testing...
-
Visual Features for Improving Endoscopic Bleeding Detection Using Convolutional Neural Networks
PublicationThe presented paper investigates the problem of endoscopic bleeding detection in endoscopic videos in the form of a binary image classification task. A set of definitions of high-level visual features of endoscopic bleeding is introduced, which incorporates domain knowledge from the field. The high-level features are coupled with respective feature descriptors, enabling automatic capture of the features using image processing methods....
-
Improved Uniform Sampling in Constrained Domains for Data-Driven Modelling of Antennas
PublicationData-driven surrogate modelling of antenna structures is an attractive way of accelerating the design process, in particular, parametric optimization. In practice, construction of surrogates is hindered by the curse of dimensionality as well as the wide ranges of geometry parameters that need to be covered in order to make the model useful. These difficulties can be alleviated by constrained performance-driven modelling with the surrogate...
-
Investigating Feature Spaces for Isolated Word Recognition
PublicationThe study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...
-
Neural Modelling of Steam Turbine Control Stage
PublicationThe paper describes the possibility of creating a neural model of a steam turbine control stage. This is of great importance because the wider application of green energy creates severe conditions for controlling the operation of energy generation systems. Measurement results from a selected 200 MW steam turbine are used as an example showing how such a neural model can be created. They serve as training and testing data for the neural model. Relatively simple...