Search results for: DATASET CONSTRUCTION
-
Automated Valuation Model based on fuzzy and rough set theory for real estate market with insufficient source data
PublicationObjective monitoring of the real estate value is a requirement to maintain balance, increase security and minimize the risk of a crisis in the financial and economic sector of every country. The valuation of real estate is usually considered from two points of view, i.e. individual valuation and mass appraisal. It is commonly believed that Automated Valuation Models (AVM) should be devoted to mass appraisal, which requires a large...
-
Self-Supervised Learning to Increase the Performance of Skin Lesion Classification
PublicationTo successfully train a deep neural network, a large amount of human-labeled data is required. Unfortunately, in many areas, collecting and labeling data is a difficult and tedious task. Several ways have been developed to mitigate the problem associated with the shortage of data, the most common of which is transfer learning. However, in many cases, the use of transfer learning as the only remedy is insufficient. In this study,...
-
Bimodal deep learning model for subjectively enhanced emotion classification in films
PublicationThis research delves into the concept of color grading in film, focusing on how color influences the emotional response of the audience. The study commenced by recalling state-of-the-art works that process audio-video signals and associated emotions by machine learning. Then, assumptions of subjective tests for refining and validating an emotion model for assigning specific emotional labels to selected film excerpts were presented....
-
Efficiency of Artificial Intelligence Methods for Hearing Loss Type Classification: an Evaluation
PublicationThe evaluation of hearing loss is primarily conducted by pure tone audiometry testing, which is often regarded as the gold standard for assessing auditory function. If the presence of hearing loss is determined, it is possible to differentiate between three types of hearing loss: sensorineural, conductive, and mixed. This study presents a comprehensive comparison of a variety of AI classification models, performed on 4007 pure tone...
-
Medical Image Dataset Annotation Service (MIDAS)
PublicationMIDAS (Medical Image Dataset Annotation Service) is a custom-tailored tool for creating and managing datasets for deep learning, machine learning, or any form of statistical research. The aim of the project is to provide a one-size-fits-all platform for creating medical image datasets that could easily blend into a hospital's workflow. In our work, we focus on the importance of medical data anonymization, discussing the...
-
Consumer Bankruptcy Prediction Using Balanced and Imbalanced Data
PublicationThis paper examines the usefulness of logit regression in forecasting the consumer bankruptcy of households using an imbalanced dataset. The research on consumer bankruptcy prediction is of paramount importance as it aims to build statistical models that can identify consumers in a difficult financial situation that may lead to consumer bankruptcy. In the face of the current global pandemic crisis, the future of household finances...
-
A comprehensive review of open data platforms, prevalent technologies, and functionalities
PublicationOpen data can play a crucial role in different sectors of the world, such as government, science, research, technology, culture, and finance. There are several necessary measures that every organization needs to consider before opening data. There are three major steps to opening the data: (1) Preparation stage, and (2) launching the open data initiative (3) In this case, the feedback mechanism study such as expand and sustain stage,...
-
Urban scene semantic segmentation using the U-Net model
PublicationVision-based semantic segmentation of complex urban street scenes is a very important function during autonomous driving (AD), which will become an important technology in industrialized countries in the near future. Today, advanced driver assistance systems (ADAS) improve traffic safety thanks to the application of solutions that enable detecting objects, recognising road signs, segmenting the road, etc. The basis for these functionalities...
-
Prediction of fracture toughness in fibre-reinforced concrete, mortar, and rocks using various Machine learning techniques
PublicationMachine learning (ML) methods are widely used in engineering applications such as fracture mechanics. In this study, twenty different ML algorithms were employed and compared for the prediction of the fracture toughness and fracture load in modes I, II, and mixed-mode (I-II) of various materials, including fibre-reinforced concrete, cement mortar, sandstone, white travertine, marble, and granite. A set of 401 specimens of “Brazilian...
-
Towards application of uncertainty quantification procedure combined with experimental procedure for assessment of the accuracy of the DEM approach dedicated for granular flow modeling
PublicationThere is a high demand for accurate and fast numerical models for dense granular flows found in many industrial applications. Nevertheless, before a numerical model can be used, it always needs to be validated against experimental data. During the validation, it is important to consider how the measurement data sets, as well as the numerical models, are affected by errors and uncertainties. In this study, the uncertainty quantification...
-
Cost-Efficient Multi-Objective Design of Miniaturized Microwave Circuits Using Machine Learning and Artificial Neural Network
PublicationDesigning microwave components involves managing multiple objectives such as center frequencies, impedance matching, and size reduction for miniaturized structures. Traditional multi-objective optimization (MO) approaches heavily rely on computationally expensive population-based methods, especially when executed with full-wave electromagnetic (EM) analysis to guarantee reliability. This paper introduces a novel and cost-effective...
-
Closer Look at the Uncertainty Estimation in Semantic Segmentation under Distributional Shift
PublicationWhile recent computer vision algorithms achieve impressive performance on many benchmarks, they lack robustness - presented with an image from a different distribution, (e.g. weather or lighting conditions not considered during training), they may produce an erroneous prediction. Therefore, it is desired that such a model will be able to reliably predict its confidence measure. In this work, uncertainty estimation for the task...
-
Semantic URL Analytics to Support Efficient Annotation of Large Scale Web Archives
PublicationLong-term Web archives comprise Web documents gathered over longer time periods and can easily reach hundreds of terabytes in size. Semantic annotations such as named entities can facilitate intelligent access to the Web archive data. However, the annotation of the entire archive content on this scale is often infeasible. The most efficient way to access the documents within Web archives is provided through their URLs, which are...
-
Investigating Feature Spaces for Isolated Word Recognition
PublicationResearchers have given much attention to the speech processing task in automatic speech recognition (ASR) over the past decades. The study investigates the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
PublicationIn the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
Going all in or spreading your bet: a configurational perspective on open innovation interaction channels in production sectors
PublicationUsing different interaction channels within open innovation partnerships holds the potential to enhance the chance of success in production sectors. However, our comprehension of how open innovation partnerships are affected by varying combinations of interaction channels, and how this correlates with their level of open innovation output, remains limited. There are discrepancies in the current literature regarding the individual...
-
Learning sperm cells part segmentation with class-specific data augmentation
PublicationInfertility affects around 15% of couples worldwide. Male fertility problems include poor sperm quality and low sperm count. The advanced fertility treatment methods like ICSI are nowadays supported by vision systems to assist embryologists in selecting good quality sperm. Computer-Assisted Semen Analysis (CASA) provides quantitative and qualitative sperm analysis concerning concentration, motility, morphology, vitality, and fragmentation....
-
Solubility Characteristics of Acetaminophen and Phenacetin in Binary Mixtures of Aqueous Organic Solvents: Experimental and Deep Machine Learning Screening of Green Dissolution Media
PublicationThe solubility of active pharmaceutical ingredients is a mandatory physicochemical characteristic in pharmaceutical practice. However, the number of potential solvents and their mixtures prevents direct measurements of all possible combinations for finding environmentally friendly, operational and cost-effective solubilizers. That is why support from theoretical screening seems to be valuable. Here, a collection of acetaminophen...
-
Systematic Literature Review for Emotion Recognition from EEG Signals
PublicationResearchers have recently become increasingly interested in recognizing emotions from electroencephalogram (EEG) signals and many studies utilizing different approaches have been conducted in this field. For the purposes of this work, we performed a systematic literature review including over 40 articles in order to identify the best set of methods for the emotion recognition problem. Our work collects information about the most...
-
Categorization of emotions in dog behavior based on the deep neural network
PublicationThe aim of this article is to present a neural system based on stock architecture for recognizing emotional behavior in dogs. Our considerations are inspired by the original work of Franzoni et al. on recognizing dog emotions. An appropriate set of photographic data has been compiled taking into account five classes of emotional behavior in dogs of one breed, including joy, anger, licking, yawning, and sleeping. Focusing on a particular...
-
Transfer learning in imagined speech EEG-based BCIs
PublicationBrain–Computer Interfaces (BCI) based on electroencephalograms (EEG) are systems whose aim is to provide a communication channel between any person and a computer. They were initially proposed to aid people with disabilities, but wider applications have since been proposed. These devices allow users to send messages or to control devices using brain signals. There are different neuro-paradigms which evoke brain signals of interest...
-
CNN-CLFFA: Support Mobile Edge Computing in Transportation Cyber Physical System
PublicationIn the present scenario, the transportation Cyber Physical System (CPS) improves the reliability and efficiency of transportation systems by enhancing the interactions between the physical and cyber systems. With the provision of better storage ability and enhanced computing, cloud computing extends transportation CPS in Mobile Edge Computing (MEC). According to the existing literature, cloud computing cannot fulfill...
-
Independent dynamics of low, intermediate, and high frequency spectral intracranial EEG activities during human memory formation
PublicationA wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various frequency ranges are coordinated across the space of the human cortex and time of memory processing is inconclusive. They can either be coordinated together across the frequency spectrum at the same cortical site and time or induced independently in particular bands. We used a large dataset of human intracranial...
-
Selected Technical Issues of Deep Neural Networks for Image Classification Purposes
PublicationIn recent years, deep learning and especially Deep Neural Networks (DNN) have obtained amazing performance on a variety of problems, in particular in classification or pattern recognition. Among many kinds of DNNs, the Convolutional Neural Networks (CNN) are most commonly used. However, due to their complexity, there are many problems related but not limited to optimizing network parameters, avoiding overfitting and ensuring good...
-
The use of fast molecular descriptors and artificial neural networks approach in organochlorine compounds electron ionization mass spectra classification
PublicationThe development of theoretical tools can be very helpful for supporting new pollutant detection. Nowadays, a combination of mass spectrometry and chromatographic techniques is the most basic environmental monitoring method. In this paper, two organochlorine compound mass spectra classification systems were proposed. The classification models were developed within the framework of artificial neural networks (ANNs) and fast 1D and...
-
Machine learning-based prediction of preplaced aggregate concrete characteristics
PublicationPreplaced-Aggregate Concrete (PAC) is a type of preplaced concrete where coarse aggregate is placed in the mold and a Portland cement-sand grout with admixtures is injected to fill the voids. Due to the complex nature of PAC, many studies were conducted to determine the effects of admixtures and the compressive and tensile strengths of PAC. Considering that a prediction tool is needed to estimate the compressive and tensile...
-
Exploring the Solubility Limits of Edaravone in Neat Solvents and Binary Mixtures: Experimental and Machine Learning Study
PublicationThis study explores the edaravone solubility space encompassing both neat and binary dissolution media. Efforts were made to reveal the inherent concentration limits of common pure and mixed solvents. For this purpose, the published solubility data of the title drug were scrupulously inspected and curated, which made the dataset consistent and coherent. However, the lack of some important types of solvents in the collection called...
-
Platelet RNA Sequencing Data Through the Lens of Machine Learning
PublicationLiquid biopsies offer minimally invasive diagnosis and monitoring of cancer disease. This biosource is often analyzed using sequencing, which generates highly complex data that can be used using machine learning tools. Nevertheless, validating the clinical applications of such methods is challenging. It requires: (a) using data from many patients; (b) verifying potential bias concerning sample collection; and (c) adding interpretability...
-
Towards High-Value Datasets Determination for Data-Driven Development: A Systematic Literature Review
PublicationOpen government data (OGD) is seen as a political and socio-economic phenomenon that promises to promote civic engagement and stimulate public sector innovations in various areas of public life. To bring the expected benefits, data must be reused and transformed into value-added products or services. This, in turn, sets another precondition for data that are expected to not only be available and comply with open data principles,...
-
Detecting type of hearing loss with different AI classification methods: a performance review
PublicationHearing is one of the most crucial senses for all humans. It allows people to hear and connect with the environment, the people they can meet and the knowledge they need to live their lives to the fullest. Hearing loss can have a detrimental impact on a person's quality of life in a variety of ways, ranging from fewer educational and job opportunities due to impaired communication to social withdrawal in severe situations. Early...
-
Magnetic Signature Description of Ellipsoid-Shape Vessel Using 3D Multi-Dipole Model Fitted on Cardinal Directions
PublicationThe article presents a continuation of the research on the 3D multi-dipole model applied to the reproduction of magnetic signatures of ferromagnetic objects. The model structure has been modified to improve its flexibility - model parameters determined by optimization can now be located in the cuboid contour representing the object's hull. To stiffen the model, the training dataset was expanded to data collected from all four cardinal...
-
Know your safety indicator – A determination of merchant vessels Bow Crossing Range based on big data analytics
PublicationEven in the era of automation, maritime safety constantly needs improvement. Regardless of the presence of crew members on board, both manned and autonomous ships should follow clear guidelines (whether as bridge procedures or algorithms). To date, many safety indicators, especially in collision avoidance, have been proposed. One such parameter, commonly used in day-to-day navigation but usually omitted by researchers, is...
-
Interrelations between Travel Patterns and Urban Spatial Structure of the Largest Russian Cities
PublicationThe study presented within this dissertation involves the analysis of the relationship between urban spatial structure and travel patterns in the largest Russian cities. It is an empirical investigation of how the spatial structure, formed during the Soviet and post-Soviet periods, affects the travel patterns in the largest cities of contemporary Russia. It aims to determine what measures, both urban structure and transportation...
-
Reliable computationally-efficient behavioral modeling of microwave passives using deep learning surrogates in confined domains
PublicationThe importance of surrogate modeling techniques has been steadily growing over the recent years in high-frequency electronics, including microwave engineering. Fast metamodels are employed to speedup design processes, especially those conducted at the level of full-wave electromagnetic (EM) simulations. The surrogates enable massive system evaluations at nearly EM accuracy and negligible costs, which is invaluable in parameter...
-
Machine learning-based seismic fragility and seismic vulnerability assessment of reinforced concrete structures
PublicationMany studies have been performed to incorporate the quantification of uncertainties into the seismic risk assessment of reinforced concrete (RC) buildings. This paper provides a risk-assessment support tool for the purpose of retrofitting and potential design strategies of RC buildings. Machine Learning (ML) algorithms were developed in Python using innovative methods of hyperparameter optimization, such as halving search, grid search, random...
-
Dependent self-employed individuals: are they different from paid employees?
PublicationThis study focuses on dependent self-employment, which covers a situation where a person works for the same employer as a typical worker while on a self-employment contractual basis, i.e., without a traditional employment contract and without certain rights granted to "regular" employees. The research exploits the individual-level dataset of 35 European countries extracted from the 2017 edition of the European Labour Force Survey...
-
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
PublicationThis work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...
-
Deep learning-based waste detection in natural and urban environments
PublicationWaste pollution is one of the most significant environmental issues in the modern world. The importance of recycling is well known, both for economic and ecological reasons, and the industry demands high efficiency. Current studies towards automatic waste detection are hardly comparable due to the lack of benchmarks and widely accepted standards regarding the used metrics and data. Those problems are addressed in this article by...
-
Diurnal variability of atmospheric water vapour, precipitation and cloud top temperature across the global tropics derived from satellite observations and GNSS technique
PublicationThe diurnal cycle of convection plays an important role in cloud and water vapour distribution across the global tropics. In this study, we utilize integrated moisture derived from the global navigation satellite system (GNSS), satellite precipitation estimates from TRMM and a merged infrared dataset to investigate links between variability in tropospheric moisture, cloud development and precipitation at a diurnal time scale. Over...
-
Impact of optimization of ALS point cloud on classification
PublicationAirborne laser scanning (ALS) is one of the LIDAR (Light Detection and Ranging) technologies. It provides information about the terrain in the form of a point cloud. During measurement, spatial data (the object's coordinates X, Y, Z) and collateral data, such as the intensity of the reflected signal, are acquired. The obtained point cloud is typically applied for generating a digital terrain model (DTM) and a digital surface model (DSM). For DTM...
-
Fault diagnosis of marine 4-stroke diesel engines using a one-vs-one extreme learning ensemble
PublicationThis paper proposes a novel approach for intelligent fault diagnosis for 4-stroke diesel marine engines, which are commonly used in on-road and marine transportation. The safety and reliability of a ship's work rely strongly on the performance of such an engine; therefore, early detection of any type of failure that affects the engine is of crucial importance. Automatic diagnostic systems are of special importance because they can...
-
An Analysis of Neural Word Representations for Wikipedia Articles Classification
PublicationOne of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...
-
Emissions and toxic units of solvent, monomer and additive residues released to gaseous phase from latex balloons
PublicationThis study describes the VOC emissions from commercially available latex balloons. Nine compounds are determined to be emitted from 13 types of balloons of different colors and imprints at 30 and 60 °C. The average values of total volatile organic compounds (TVOCs) emitted from studied samples ranged from 0.054 up to 7.18 μg∙g⁻¹ and from 0.27 up to 36.13 μg∙g⁻¹ for 30 °C and 60 °C, respectively. The dataset is treated with principal...
-
Graph Representation Integrating Signals for Emotion Recognition and Analysis
PublicationData reusability is an important feature of current research in just about every field of science. Modern research in Affective Computing often relies on datasets containing experiment-originated data such as biosignals, video clips, or images. Moreover, conducting experiments with a vast number of participants to build datasets for Affective Computing research is time-consuming and expensive. Therefore, it is extremely important to...
-
imPlatelet classifier: image‐converted RNA biomarker profiles enable blood‐based cancer diagnostics
PublicationLiquid biopsies offer a minimally invasive sample collection, outperforming traditional biopsies employed for cancer evaluation. The widely used material is blood, which is the source of tumor-educated platelets. Here, we developed the imPlatelet classifier, which converts RNA-sequenced platelet data into images in which each pixel corresponds to the expression level of a certain gene. Biological knowledge from the Kyoto Encyclopedia...
-
Super-resolved Thermal Imagery for High-accuracy Facial Areas Detection and Analysis
PublicationIn this study, we evaluate various Convolutional Neural Network-based Super-Resolution (SR) models to improve facial area detection in thermal images. In particular, we analyze the influence of selected spatiotemporal properties of thermal image sequences on detection accuracy. For this purpose, a thermal face database was acquired for 40 volunteers. Contrary to most existing thermal databases of faces, we publish our dataset...
-
ELECTRICAL CONDUCTIVITY AND pH IN SURFACE WATER AS TOOL FOR IDENTIFICATION OF CHEMICAL DIVERSITY
PublicationIn the present study, the creeks and lakes located at the western shore of Admiralty Bay were analysed. The impact of various sources of water supply was considered, based on the parameters of temperature, pH and specific electrolytic conductivity (SEC25). All measurements were conducted during a field campaign in January-February 2017. A multivariate dataset was also created and a biplot of SEC25 and pH of the investigated waters...
-
Fusion-based Representation Learning Model for Multimode User-generated Social Network Content
PublicationAs mobile networks and APPs are developed, user-generated content (UGC), which includes multi-source heterogeneous data like user reviews, tags, scores, images, and videos, has become an essential basis for improving the quality of personalized services. Due to the multi-source heterogeneous nature of the data, big data fusion offers both promise and drawbacks. With the rise of mobile networks and applications, UGC, which includes...