Search results for: deep convolutional neural network
-
CNN Architectures for Human Pose Estimation from a Very Low Resolution Depth Image
Publication: The paper is dedicated to proposing and evaluating a number of convolutional neural network architectures for calculating a multiple regression on 3D coordinates of human body joints tracked in a single low resolution depth image. The main challenge was to obtain a high precision in case of a noisy and coarse scan of the body, as observed by a depth sensor from a large distance. The regression network was expected to reason about...
-
Speech Analytics Based on Machine Learning
Publication: In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...
-
Machine Learning and Deep Learning Methods for Fast and Accurate Assessment of Transthoracic Echocardiogram Image Quality
Publication: High-quality echocardiogram images are the cornerstone of accurate and reliable measurements of the heart. Therefore, this study aimed to develop, validate and compare machine learning and deep learning algorithms for accurate and automated assessment of transthoracic echocardiogram image quality. In total, 4090 single-frame two-dimensional transthoracic echocardiogram...
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publication: The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
-
Intelligent Autonomous Robot Supporting Small Pets in Domestic Environment
Publication: In this contribution, we present preliminary results of a student project aimed at the development of an intelligent autonomous robot supporting small pets in a domestic environment. The main task of this robot is to protect freely moving small pets from being accidentally stepped on by home residents. For this purpose, we have developed a mobile robot that follows a pet and sounds an alarm signal when a human is approaching....
-
Image Classifier Architectures (Architektury klasyfikatorów obrazów)
Publication: Image classification is a problem in the field of computer vision. It consists of analyzing an image as a whole and assigning it to one or more categories (classes). Modern solutions to this problem are largely implemented using deep convolutional neural networks (CNN). This chapter describes breakthrough CNN architectures and the evolution of the state of the art in classification...
-
Using Long-Short term Memory networks with Genetic Algorithm to predict engine condition
Publication: Predictive maintenance (PdM) is a type of approach for maintenance processes, allowing maintenance actions to be managed depending on the machine's current condition. Maintenance is therefore carried out before failures occur. The approach doesn’t only help avoid abrupt failures but also helps lower maintenance cost and provides possibilities to manufacturers to manage maintenance budgets in a more efficient way. A new deep neural...
-
Open-Set Speaker Identification Using Closed-Set Pretrained Embeddings
Publication: The paper proposes an approach for extending deep neural networks-based solutions to closed-set speaker identification toward the open-set problem. The idea is built on the characteristics of deep neural networks trained for the classification tasks, where there is a layer consisting of a set of deep features extracted from the analyzed inputs. By extracting this vector and performing anomaly detection against the set of known...
-
Towards Cancer Patients Classification Using Liquid Biopsy
Publication: Liquid biopsy is a useful, minimally invasive diagnostic and monitoring tool for cancer disease. Yet developing accurate methods remains very challenging, given the potentially large number of input features and the usually small dataset sizes. Recently, a novel feature parameterization based on the RNA-sequenced platelet data which uses the biological knowledge from the Kyoto Encyclopedia of Genes and Genomes, combined with a classifier...
-
Driver fatigue detection method based on facial image analysis
Publication: Nowadays, ensuring road safety is a crucial issue that demands continuous development and measures to minimize the risk of accidents. This paper presents the development of a driver fatigue detection method based on the analysis of facial images. To monitor the driver's condition in real-time, a video camera was used. The method of detection is based on analyzing facial features related to the mouth area and eyes, such as...
-
LSTM-based method for LOS/NLOS identification in an indoor environment
Publication: Due to multipath propagation, a harsh indoor environment significantly impacts transmitted signals, which may adversely affect the quality of radiocommunication services, particularly real-time ones. This negative effect may be significantly reduced (e.g. through resource management and allocation) or compensated (e.g. by correcting position estimation in radiolocalisation) by a LOS/NLOS identification algorithm. This paper...
-
When Neural Networks Meet Decisional DNA: A Promising New Perspective for Knowledge Representation and Sharing
Publication: In this article, we introduce a novel concept combining neural network technology and Decisional DNA for knowledge representation and sharing. Instead of using traditional machine learning and knowledge discovery methods, this approach explores the way of knowledge extraction through deep learning processes based on a domain’s past decisional events captured by Decisional DNA. We compare our approach with kNN (k-nearest...
-
Sign Language Recognition Using Convolution Neural Networks
Publication: The objective of this work was to provide an app that can automatically recognize hand gestures from the American Sign Language (ASL) on mobile devices. The app employs a model based on Convolutional Neural Network (CNN) for gesture classification. Various CNN architectures and optimization strategies suitable for devices with limited resources were examined. InceptionV3 and VGG-19 models exhibited negligibly higher accuracy than...
-
Investigating Feature Spaces for Isolated Word Recognition
Publication: Researchers have given much attention to the speech processing task in automatic speech recognition (ASR) over the past decades. The study investigates the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
-
Data Acquisition and Processing for GeoAI Models to Support Sustainable Agricultural Practices
Publication: There are growing opportunities to leverage new technologies and data sources to address global problems related to sustainability, climate change, and biodiversity loss. The emerging discipline of GeoAI resulting from the convergence of AI and Geospatial science (Geo-AI) is enabling the possibility to harness the increasingly available open Earth Observation data collected from different constellations of satellites and sensors...
-
Deep Learning: A Case Study for Image Recognition Using Transfer Learning
Publication: Deep learning (DL) is a rising star of machine learning (ML) and artificial intelligence (AI) domains. Until 2006, many researchers had attempted to build deep neural networks (DNN), but most of them failed. In 2006, it was proven that deep neural networks are one of the most crucial inventions for the 21st century. Nowadays, DNN are being used as a key technology for many different domains: self-driven vehicles, smart cities,...
-
Deep Learning
Publication: Deep learning (DL) is a rising star of machine learning (ML) and artificial intelligence (AI) domains. Until 2006, many researchers had attempted to build deep neural networks (DNN), but most of them failed. In 2006, it was proven that deep neural networks are one of the most crucial inventions for the 21st century. Nowadays, DNN are being used as a key technology for many different domains: self-driven vehicles, smart cities,...
-
Deep learning techniques for biometric security: A systematic review of presentation attack detection systems
Publication: Biometric technology, including finger vein, fingerprint, iris, and face recognition, is widely used to enhance security in various devices. In the past decade, significant progress has been made in improving biometric systems, thanks to advancements in deep convolutional neural networks (DCNN) and computer vision (CV), along with large-scale training datasets. However, these systems have become targets of various attacks, with...
-
Bees Detection on Images: Study of Different Color Models for Neural Networks
Publication: This paper presents an approach to bee detection in video streams using a neural network classifier. We describe the motivation for our research and the methodology of data acquisition. The main contribution of this work is a comparison of different color models used as an input format for a feedforward convolutional architecture applied to bee detection. The detection process is based on a neural binary classifier that classifies...
-
Olgun Aydin dr
People: Olgun Aydin finished his PhD by publishing a thesis about Deep Neural Networks. He works as a Principal Machine Learning Engineer at Nike and as an Assistant Professor at Gdansk University of Technology in Poland. Dr. Aydin is part of the editorial board of the "Journal of Artificial Intelligence and Data Science". Dr. Aydin served as Vice-Chairman of the Why R? Foundation and is a member of the Polish Artificial Intelligence Society. Olgun is...
-
Visual Content Learning in a Cognitive Vision Platform for Hazard Control (CVP-HC)
Publication: This work is part of an effort for the development of a Cognitive Vision Platform for Hazard Control (CVP-HC) for applications in industrial workplaces, adaptable to a wide range of environments. The paper focuses on hazards resulting from the nonuse of personal protective equipment (PPE). Given the results of previous analysis of supervised techniques for the problem of classification of a few PPE (boots, hard hats, and gloves...
-
Playback detection using machine learning with spectrogram features approach
Publication: This paper presents a 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), HAAR wavelets with AdaBoost classifier and deep convolutional neural networks (CNN) were compared on different data partitions in respect...
-
A Novel IoT-Perceptive Human Activity Recognition (HAR) Approach Using Multi-Head Convolutional Attention
Publication: Together with the fast advancement of the Internet of Things (IoT), smart healthcare applications and systems are equipped with increasingly more wearable sensors and mobile devices. These sensors are used not only to collect data, but also, and more importantly, to assist in daily activity tracking and analyzing of their users. Various human activity recognition (HAR) approaches are used to enhance such tracking. Most of the existing...
-
Adaptive Hounsfield Scale Windowing in Computed Tomography Liver Segmentation
Publication: In computed tomography (CT) imaging, the Hounsfield Unit (HU) scale quantifies radiodensity, but its nonlinear nature across organs and lesions complicates machine learning analysis. This paper introduces an automated method for adaptive HU scale windowing in deep learning-based CT liver segmentation. We propose a new neural network layer that optimizes HU scale window parameters during training. Experiments on the Liver Tumor...
-
Assessing the attractiveness of human face based on machine learning
Publication: The attractiveness of the face plays an important role in everyday life, especially in the modern world where social media and the Internet surround us. In this study, an attempt to assess the attractiveness of a face by machine learning is shown. Attractiveness is determined by three deep models whose sum of predictions is the final score. Two annotated datasets available in the literature are employed for training and testing...
-
Deep learning-enabled integration of renewable energy sources through photovoltaics in buildings
Publication: Installing photovoltaic (PV) systems in buildings is one of the most effective strategies for achieving sustainable energy goals and reducing carbon emissions. However, the requirement for efficient energy management, the fluctuating energy demands, and the intermittent nature of solar power are a few of the obstacles to the seamless integration of PV systems into buildings. These complexities surpass the capabilities of rule-based...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication: This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
Publication: In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset with the size of more than 10,000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...
-
Vehicle detector training with labels derived from background subtraction algorithms in video surveillance
Publication: Vehicle detection in video from a miniature stationary closed-circuit television (CCTV) camera is discussed in the paper. The camera provides one of the components of the intelligent road sign developed in the project concerning the traffic control with the use of autonomous devices being developed. Modern Convolutional Neural Network (CNN) based detectors need big data input, usually demanding their manual labeling. In the presented...
-
Improving Accuracy of Respiratory Rate Estimation by Restoring High Resolution Features With Transformers and Recursive Convolutional Models
Publication: Non-contact evaluation of vital signs has become increasingly important, especially in light of the COVID-19 pandemic, which is causing the whole world to examine people’s interactions in public places at a scale never seen before. However, evaluating one’s vital signs can be a relatively complex procedure, which requires both time and physical contact between examiner and examinee. These requirements limit the number...
-
Deep-Learning-Based Precise Characterization of Microwave Transistors Using Fully-Automated Regression Surrogates
Publication: Accurate models of scattering and noise parameters of transistors are instrumental in facilitating design procedures of microwave devices such as low-noise amplifiers. Yet, data-driven modeling of transistors is a challenging endeavor due to complex relationships between transistor characteristics and its designable parameters, biasing conditions, and frequency. Artificial neural network (ANN)-based methods, including deep learning...
-
Towards bees detection on images: study of different color models for neural networks
Publication: This paper presents an approach to bee detection in video streams using a neural network classifier. We describe the motivation for our research and the methodology of data acquisition. The main contribution of this work is a comparison of different color models used as an input format for a feedforward convolutional architecture applied to bee detection. The detection process is based on a neural...
-
Deep Learning-Based Intrusion System for Vehicular Ad Hoc Networks
Publication: The increasing use of the Internet with vehicles has made travel more convenient. However, hackers can attack intelligent vehicles through various technical loopholes, resulting in a range of security issues. Due to these security issues, the safety protection technology of the in-vehicle system has become a focus of research. Using the advanced autoencoder network and recurrent neural network in deep learning, we investigated...
-
MP3vec: A Reusable Machine-Constructed Feature Representation for Protein Sequences
Publication: Machine Learning (ML) methods have been used with varying degrees of success on protein prediction tasks, with two inherent limitations. First, prediction performance often depends upon the features extracted from the proteins. Second, experimental data may be insufficient to construct reliable ML models. Here we introduce MP3vec, a transferable representation for protein sequences that is designed to be used specifically for sequence-to-sequence...
-
Deep learning for recommending subscription-limited documents
Publication: Document recommendation for a commercial, subscription-based online platform is important due to the difficulty in navigation through a large volume and diversity of content available to clients. However, this is also a challenging task due to the number of new documents added every day and decreasing relevance of older contents. To solve this problem, we propose a deep neural network architecture that combines autoencoder with...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
Publication: A method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
-
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication: The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...
-
Vehicle Detection with Self-Training for Adaptative Video Processing Embedded Platform
Publication: Traffic monitoring from closed-circuit television (CCTV) cameras on embedded systems is the subject of the performed experiments. Solving this problem encounters difficulties related to the hardware limitations, and possible camera placement in various positions which affects the system performance. To satisfy the hardware requirements, vehicle detection is performed using a lightweight Convolutional Neural Network (CNN), named...
-
INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH
Publication: The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...
-
Improving the Classification Quality of Deep Neural Networks by Optimizing Their Structure and Using a Two-Stage Learning Process (Poprawa jakości klasyfikacji głębokich sieci neuronowych poprzez optymalizację ich struktury i dwuetapowy proces uczenia)
Publication: This doctoral dissertation addresses the problem of implementing deep learning algorithms when training data are scarce. The main goal was to develop an approach that optimizes the neural network structure and applies two-stage learning in order to obtain smaller structures while preserving accuracy. The proposed solutions were tested on the task of classifying skin lesions as malignant or benign. In the first...
-
Optimal selection of input features and an acompanying neural network structure for the classification purposes - skin lesions case study
Publication: Malignant melanomas are the most deadly type of skin cancer; however, when detected early enough, they give a high chance of successful treatment. Recent years have seen dynamic growth of interest in automatic computer-aided skin cancer diagnosis. Every month brings new research results on new approaches to this problem, new methods of preprocessing, new classifiers, new ideas to follow etc. In particular, the rapid development of dermatoscopy,...
-
IFE: NN-aided Instantaneous Pitch Estimation
Publication: Pitch estimation is still an open issue in contemporary signal processing research. Nowadays, growing momentum of machine learning techniques application in the data-driven society allows for tackling this problem from a new perspective. This work leverages such an opportunity to propose a refined Instantaneous Frequency and power based pitch Estimator method called IFE. It incorporates deep neural network based pitch estimation...
-
Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network
Publication: To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...
-
BP-EVD: Forward Block-Output Propagation for Efficient Video Denoising
Publication: Denoising videos in real-time is critical in many applications, including robotics and medicine, where varying light conditions, miniaturized sensors, and optics can substantially compromise image quality. This work proposes the first video denoising method based on a deep neural network that achieves state-of-the-art performance on dynamic scenes while running in real-time on VGA video resolution with no frame latency. The backbone...
-
Detecting Objects of Various Categories in Optical Remote Sensing Imagery Using Neural Networks
Publication: The effective detection of objects in remote sensing images is of great research importance, so recent years have seen significant progress in deep learning techniques in this field. However, despite much valuable research being conducted, many challenges still remain. A lot of research projects focus on detecting objects of a single category (class), while correctly detecting objects of different categories is much harder. The...
-
Toward Intelligent Recommendations Using the Neural Knowledge DNA
Publication: In this paper we propose a novel recommendation approach using past news click data and the Neural Knowledge DNA (NK-DNA). The Neural Knowledge DNA is a novel knowledge representation method designed to support discovering, storing, reusing, improving, and sharing knowledge among machines and computing systems. We examine our approach for news recommendation tasks on the MIND benchmark dataset. By taking advantage of NK-DNA, deep...
-
Urban scene semantic segmentation using the U-Net model
Publication: Vision-based semantic segmentation of complex urban street scenes is a very important function during autonomous driving (AD), which will become an important technology in industrialized countries in the near future. Today, advanced driver assistance systems (ADAS) improve traffic safety thanks to the application of solutions that enable detecting objects, recognising road signs, segmenting the road, etc. The basis for these functionalities...
-
How to Sort Them? A Network for LEGO Bricks Classification
Publication: LEGO bricks are highly popular due to the ability to build almost any type of creation. This is possible thanks to the availability of multiple shapes and colors of the bricks. For a smooth build process, the bricks need to be properly sorted and arranged. In our work we aim at creating an automated LEGO bricks sorter. With over 3700 different LEGO parts, bricks classification has to be done with deep neural networks. The question arises...
-
Satellite Image Classification Using a Hierarchical Ensemble Learning and Correlation Coefficient-Based Gravitational Search Algorithm
Publication: Satellite image classification is widely used in various real-time applications, such as the military, geospatial surveys, surveillance and environmental monitoring. Therefore, the effective classification of satellite images is required to improve classification accuracy. In this paper, the combination of Hierarchical Framework and Ensemble Learning (HFEL) and optimal feature selection is proposed for the precise identification...
-
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
Publication: The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...