Search results for: DATASET FEATURES, DATASET PROFILING VOCABULARIES
-
Increasing K-Means Clustering Algorithm Effectivity for Using in Source Code Plagiarism Detection
Publication: The problem of plagiarism is becoming increasingly significant with the growth of Internet technologies and the availability of information resources. Many tools have been successfully developed to detect plagiarism in textual documents, but the situation is more complicated in the field of source code plagiarism, where the problem is equally serious. At present, there are no complex tools available to detect plagiarism...
-
The New Klebsiella pneumoniae ST152 Variants with Hypermucoviscous Phenotype Isolated from Renal Transplant Recipients with Asymptomatic Bacteriuria-Genetic Characteristics by WGS.
Publication: Klebsiella pneumoniae (Kp) is one of the most important etiological factors of urinary tract infections in renal transplant (RTx) recipients. We described the antimicrobial susceptibility phenotypes and genomic features of two hypermucoviscous (HM) Kp isolates recovered from RTx recipients with asymptomatic bacteriuria (ABU). Using whole genome sequencing (WGS) data, we showed that the strains belong to the ST152 lineage with the...
-
Application 2D Descriptors and Artificial Neural Networks for Beta-Glucosidase Inhibitors Screening
Publication: Beta-glucosidase inhibitors play important medical and biological roles. In this study, simple two-variable artificial neural network (ANN) classification models were developed for beta-glucosidase inhibitors screening. All bioassay data were obtained from the ChEMBL database. The classifiers were generated using 2D molecular descriptors and the data miner tool available in the STATISTICA package (STATISTICA Automated Neural...
-
Optimization algorithm and filtration using the adaptive TIN model at the stage of initial processing of the ALS point cloud
Publication: Airborne laser scanning (ALS) provides survey results in the form of a point cloud. The ALS point cloud is a source of data used primarily for constructing a digital terrain model (DTM). To generate a DTM, the set of ALS observations must first be subjected to the point cloud processing methodology. A standard methodology is composed of the following stages: acquisition of the ALS data, initial processing (including filtration),...
-
Application of multivariate statistics in assessment of green analytical chemistry parameters of analytical methodologies
Publication: The study offers a multivariate statistical analysis of a dataset, including the major metrological, “greenness” and methodological parameters of 43 analytical methodologies applied for aldrin determination (a frequently analyzed organic compound) in water samples. The variables (parameters) chosen were as follows: metrological (LOD, recovery, RSD), describing the “greenness” (amount of the solvent used, amount of waste generated)...
-
Independent dynamics of slow, intermediate, and fast intracranial EEG spectral activities during human memory formation
Publication: A wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various low and high frequencies are spatiotemporally coordinated across the human brain during memory processing is inconclusive. They can either be coordinated together across a wide range of the frequency spectrum or induced in specific bands. We used a large dataset of human intracranial electroencephalography...
-
Data regarding a new, vector-enzymatic DNA fragment amplification-expression technology for the construction of artificial, concatemeric DNA, RNA and proteins, as well as biological effects of selected polypeptides obtained using this method
Publication: Applications of bioactive peptides and polypeptides are emerging in areas such as drug development and drug delivery systems. These compounds are bioactive, biocompatible and represent a wide range of chemical properties, enabling further adjustments of obtained biomaterials. However, delivering large quantities of peptide derivatives is still challenging. Several methods have been developed for the production of concatemers –...
-
Clothes Detection and Classification Using Convolutional Neural Networks
Publication: In this paper, we describe the development of a computer vision system for accurate detection and classification of clothes for e-commerce images. We present a set of experiments on well-established architectures of convolutional neural networks, including Residual networks, SqueezeNet and Single Shot MultiBox Detector (SSD). The clothes detection network was trained and tested on the DeepFashion dataset, which contains box annotations...
-
Hey student, are you sharing your knowledge? A cluster typology of knowledge sharing behaviours among students
Publication: Knowledge Sharing (KS) is crucial for all organisations to better face current and future challenges. It is justifiable to assume that after graduation, students will have to face the coming challenges at societal and business levels, and that they will need adequate KS skills to do so. Though the importance of KS is established, the understanding of how students pass on their knowledge is still fragmented and underdeveloped....
-
Enabling Deeper Linguistic-based Text Analytics – Construct Development for the Criticality of Negative Service Experience
Publication: Significant progress has been made in linguistic-based text analytics, particularly with the increasing availability of data and deep learning computational models for more accurate opinion analysis and domain-specific entity recognition. In understanding customer service experience from texts, analysis of sentiments associated with different stages of the service lifecycle is a useful starting point. However, when richer insights...
-
Machine learning applied to acoustic-based road traffic monitoring
Publication: The motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...
-
Material characterisation of biaxial glass-fibre non-crimp fabrics as a function of ply orientation, stitch pattern, stitch length and stitch tension
Publication: Due to their high density-specific stiffnesses and strength, fibre reinforced plastic (FRP) composites are particularly interesting for mobility and transport applications. Warp-knitted non-crimp fabrics (NCF) are one possible way to produce such FRP composites. They are advantageous because of their low production costs and the ability to tailor the properties of the textile to the reinforcement and drape requirements of the application....
-
An Automated Method for Biometric Handwritten Signature Authentication Employing Neural Networks
Publication: Handwriting biometrics applications in e-Security and e-Health are addressed in the course of the conducted research. An automated graphomotor analysis method for the dynamic electronic representation of the handwritten signature authentication was researched. The developed algorithms are based on dynamic analysis of electronically handwritten signatures employing neural networks. The signatures were acquired with the use of the...
-
Selection of an artificial pre-training neural network for the classification of inland vessels based on their images
Publication: Artificial neural networks (ANN) are the most commonly used algorithms for image classification problems. An image classifier takes an image or video as input and classifies it into one of the possible categories that it was trained to identify. They are applied in various areas such as security, defense, healthcare, biology, forensics, communication, etc. There is no need to create one’s own ANN because there are several pre-trained...
-
Automated Valuation Model based on fuzzy and rough set theory for real estate market with insufficient source data
Publication: Objective monitoring of the real estate value is a requirement to maintain balance, increase security and minimize the risk of a crisis in the financial and economic sector of every country. The valuation of real estate is usually considered from two points of view, i.e. individual valuation and mass appraisal. It is commonly believed that Automated Valuation Models (AVM) should be devoted to mass appraisal, which requires a large...
-
Self-Supervised Learning to Increase the Performance of Skin Lesion Classification
Publication: To successfully train a deep neural network, a large amount of human-labeled data is required. Unfortunately, in many areas, collecting and labeling data is a difficult and tedious task. Several ways have been developed to mitigate the problem associated with the shortage of data, the most common of which is transfer learning. However, in many cases, the use of transfer learning as the only remedy is insufficient. In this study,...
-
The Verification of the Usefulness of Electronic Nose Based on Ultra-Fast Gas Chromatography and Four Different Chemometric Methods for Rapid Analysis of Spirit Beverages
Publication: Spirit beverages are a diverse group of foodstuffs. They are very often counterfeited, which causes low-quality or wrongly labelled products to appear on the market. It is important to find a suitable method for quality control and botanical origin determination that at the same time enables a preliminary check of the composition of the investigated samples, which was the main goal of this work. For this purpose, the usefulness of electronic nose...
-
Consumer Bankruptcy Prediction Using Balanced and Imbalanced Data
Publication: This paper examines the usefulness of logit regression in forecasting the consumer bankruptcy of households using an imbalanced dataset. The research on consumer bankruptcy prediction is of paramount importance as it aims to build statistical models that can identify consumers in a difficult financial situation that may lead to consumer bankruptcy. In the face of the current global pandemic crisis, the future of household finances...
-
Semantic URL Analytics to Support Efficient Annotation of Large Scale Web Archives
Publication: Long-term Web archives comprise Web documents gathered over longer time periods and can easily reach hundreds of terabytes in size. Semantic annotations such as named entities can facilitate intelligent access to the Web archive data. However, the annotation of the entire archive content on this scale is often infeasible. The most efficient way to access the documents within Web archives is provided through their URLs, which are...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication: In this paper, we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
Urban scene semantic segmentation using the U-Net model
Publication: Vision-based semantic segmentation of complex urban street scenes is a very important function during autonomous driving (AD), which will become an important technology in industrialized countries in the near future. Today, advanced driver assistance systems (ADAS) improve traffic safety thanks to the application of solutions that enable detecting objects, recognising road signs, segmenting the road, etc. The basis for these functionalities...
-
Closer Look at the Uncertainty Estimation in Semantic Segmentation under Distributional Shift
Publication: While recent computer vision algorithms achieve impressive performance on many benchmarks, they lack robustness: presented with an image from a different distribution (e.g. weather or lighting conditions not considered during training), they may produce an erroneous prediction. Therefore, it is desirable that such a model be able to reliably predict its confidence measure. In this work, uncertainty estimation for the task...
-
Towards application of uncertainty quantification procedure combined with experimental procedure for assessment of the accuracy of the DEM approach dedicated for granular flow modeling
Publication: There is a high demand for accurate and fast numerical models for dense granular flows found in many industrial applications. Nevertheless, before a numerical model can be used, it always needs to be validated against experimental data. During the validation, it is important to consider how the measurement data sets, as well as the numerical models, are affected by errors and uncertainties. In this study, the uncertainty quantification...
-
Solubility Characteristics of Acetaminophen and Phenacetin in Binary Mixtures of Aqueous Organic Solvents: Experimental and Deep Machine Learning Screening of Green Dissolution Media
Publication: The solubility of active pharmaceutical ingredients is a mandatory physicochemical characteristic in pharmaceutical practice. However, the number of potential solvents and their mixtures prevents direct measurements of all possible combinations for finding environmentally friendly, operational and cost-effective solubilizers. That is why support from theoretical screening seems to be valuable. Here, a collection of acetaminophen...
-
Selected Technical Issues of Deep Neural Networks for Image Classification Purposes
Publication: In recent years, deep learning and especially Deep Neural Networks (DNN) have obtained amazing performance on a variety of problems, in particular in classification or pattern recognition. Among many kinds of DNNs, the Convolutional Neural Networks (CNN) are most commonly used. However, due to their complexity, there are many problems related to, but not limited to, optimizing network parameters, avoiding overfitting and ensuring good...
-
Categorization of emotions in dog behavior based on the deep neural network
Publication: The aim of this article is to present a neural system based on stock architecture for recognizing emotional behavior in dogs. Our considerations are inspired by the original work of Franzoni et al. on recognizing dog emotions. An appropriate set of photographic data has been compiled taking into account five classes of emotional behavior in dogs of one breed, including joy, anger, licking, yawning, and sleeping. Focusing on a particular...
-
Transfer learning in imagined speech EEG-based BCIs
Publication: Brain–Computer Interfaces (BCI) based on electroencephalograms (EEG) are systems whose aim is to provide any person with a communication channel to a computer. They were initially proposed to aid people with disabilities, but wider applications have since been proposed. These devices make it possible to send messages or to control devices using brain signals. There are different neuro-paradigms which evoke brain signals of interest...
-
Independent dynamics of low, intermediate, and high frequency spectral intracranial EEG activities during human memory formation
Publication: A wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various frequency ranges are coordinated across the space of the human cortex and time of memory processing is inconclusive. They can either be coordinated together across the frequency spectrum at the same cortical site and time or induced independently in particular bands. We used a large dataset of human intracranial...
-
Exploring the Solubility Limits of Edaravone in Neat Solvents and Binary Mixtures: Experimental and Machine Learning Study
Publication: This study explores the edaravone solubility space encompassing both neat and binary dissolution media. Efforts were made to reveal the inherent concentration limits of common pure and mixed solvents. For this purpose, the published solubility data of the title drug were scrupulously inspected and curated, which made the dataset consistent and coherent. However, the lack of some important types of solvents in the collection called...
-
Impact of optimization of ALS point cloud on classification
Publication: Airborne laser scanning (ALS) is one of the LIDAR technologies (Light Detection and Ranging). It provides information about the terrain in the form of a point cloud. During measurement, spatial data (the object’s X, Y, Z coordinates) and collateral data, such as the intensity of the reflected signal, are acquired. The obtained point cloud is typically applied for generating a digital terrain model (DTM) and a digital surface model (DSM). For DTM...
-
The use of fast molecular descriptors and artificial neural networks approach in organochlorine compounds electron ionization mass spectra classification
Publication: The development of theoretical tools can be very helpful in supporting the detection of new pollutants. Nowadays, combinations of mass spectrometry and chromatographic techniques are the most basic environmental monitoring methods. In this paper, two organochlorine compound mass spectra classification systems were proposed. The classification models were developed within the framework of artificial neural networks (ANNs) and fast 1D and...
-
An Analysis of Neural Word Representations for Wikipedia Articles Classification
Publication: One of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...
-
Detecting type of hearing loss with different AI classification methods: a performance review
Publication: Hearing is one of the most crucial senses for all humans. It allows people to hear and connect with the environment, the people they can meet and the knowledge they need to live their lives to the fullest. Hearing loss can have a detrimental impact on a person's quality of life in a variety of ways, ranging from fewer educational and job opportunities due to impaired communication to social withdrawal in severe situations. Early...
-
Diurnal variability of atmospheric water vapour, precipitation and cloud top temperature across the global tropics derived from satellite observations and GNSS technique
Publication: The diurnal cycle of convection plays an important role in clouds and water vapour distribution across the global tropics. In this study, we utilize integrated moisture derived from the global navigation satellite system (GNSS), satellite precipitation estimates from TRMM and a merged infrared dataset to investigate links between variability in tropospheric moisture, cloud development and precipitation at a diurnal time scale. Over...
-
Fault diagnosis of marine 4-stroke diesel engines using a one-vs-one extreme learning ensemble
Publication: This paper proposes a novel approach for intelligent fault diagnosis for 4-stroke Diesel marine engines, which are commonly used in on-road and marine transportation. The safety and reliability of a ship's work rely strongly on the performance of such an engine; therefore, early detection of any type of failure that affects the engine is of crucial importance. Automatic diagnostic systems are of special importance because they can...
-
Emissions and toxic units of solvent, monomer and additive residues released to gaseous phase from latex balloons
Publication: This study describes the VOC emissions from commercially available latex balloons. Nine compounds are determined to be emitted from 13 types of balloons of different colors and imprints at 30 and 60°C. The average values of total volatile organic compounds (TVOCs) emitted from studied samples ranged from 0.054 up to 7.18 μg∙g⁻¹ and from 0.27 up to 36.13 μg∙g⁻¹ for 30°C and 60°C, respectively. The dataset is treated with principal...
-
imPlatelet classifier: image‐converted RNA biomarker profiles enable blood‐based cancer diagnostics
Publication: Liquid biopsies offer a minimally invasive sample collection, outperforming traditional biopsies employed for cancer evaluation. The widely used material is blood, which is the source of tumor-educated platelets. Here, we developed the imPlatelet classifier, which converts RNA-sequenced platelet data into images in which each pixel corresponds to the expression level of a certain gene. Biological knowledge from the Kyoto Encyclopedia...
-
Quantitative Risk Assessment in Construction Disputes Based on Machine Learning Tools
Publication: The high monetary value of construction projects is one of the reasons for frequent disputes between a general contractor (GC) and a client. A construction site is a unique, one-time, and single-product factory with many parties involved and dependent on each other. The organizational dependencies and their complexity make any fault or mistake propagate and influence the final result (delays, cost overruns). The constant will...
-
Graph Representation Integrating Signals for Emotion Recognition and Analysis
Publication: Data reusability is an important feature of current research in just about every field of science. Modern research in Affective Computing often relies on datasets containing experiments-originated data such as biosignals, video clips, or images. Moreover, conducting experiments with a vast number of participants to build datasets for Affective Computing research is time-consuming and expensive. Therefore, it is extremely important to...
-
Super-resolved Thermal Imagery for High-accuracy Facial Areas Detection and Analysis
Publication: In this study, we evaluate various Convolutional Neural Network-based Super-Resolution (SR) models to improve facial areas detection in thermal images. In particular, we analyze the influence of selected spatiotemporal properties of thermal image sequences on detection accuracy. For this purpose, a thermal face database was acquired for 40 volunteers. Unlike most existing thermal databases of faces, we publish our dataset...
-
Reliable computationally-efficient behavioral modeling of microwave passives using deep learning surrogates in confined domains
Publication: The importance of surrogate modeling techniques has been steadily growing over the recent years in high-frequency electronics, including microwave engineering. Fast metamodels are employed to speed up design processes, especially those conducted at the level of full-wave electromagnetic (EM) simulations. The surrogates enable massive system evaluations at nearly EM accuracy and negligible costs, which is invaluable in parameter...
-
Machine learning-based seismic fragility and seismic vulnerability assessment of reinforced concrete structures
Publication: Many studies have been performed to incorporate uncertainty quantification into the seismic risk assessment of reinforced concrete (RC) buildings. This paper provides a risk-assessment support tool for the purpose of retrofitting and potential design strategies of RC buildings. Machine Learning (ML) algorithms were developed in Python using innovative methods of hyperparameter optimization, such as halving search, grid search, random...
-
Dependent self-employed individuals: are they different from paid employees?
Publication: This study focuses on dependent self-employment, which covers a situation where a person works for the same employer as a typical worker while on a self-employment contractual basis, i.e., without a traditional employment contract and without certain rights granted to "regular" employees. The research exploits the individual-level dataset of 35 European countries extracted from the 2017 edition of the European Labour Force Survey...
-
Towards High-Value Datasets Determination for Data-Driven Development: A Systematic Literature Review
Publication: Open government data (OGD) is seen as a political and socio-economic phenomenon that promises to promote civic engagement and stimulate public sector innovations in various areas of public life. To bring the expected benefits, data must be reused and transformed into value-added products or services. This, in turn, sets another precondition for data that are expected to not only be available and comply with open data principles,...
-
Deep learning-based waste detection in natural and urban environments
Publication: Waste pollution is one of the most significant environmental issues in the modern world. The importance of recycling is well known, both for economic and ecological reasons, and the industry demands high efficiency. Current studies towards automatic waste detection are hardly comparable due to the lack of benchmarks and widely accepted standards regarding the metrics and data used. Those problems are addressed in this article by...
-
ELECTRICAL CONDUCTIVITY AND pH IN SURFACE WATER AS TOOL FOR IDENTIFICATION OF CHEMICAL DIVERSITY
Publication: In the present study, the creeks and lakes located at the western shore of Admiralty Bay were analysed. The impact of various sources of water supply was considered, based on the parameters of temperature, pH and specific electrolytic conductivity (SEC25). All measurements were conducted during a field campaign in January-February 2017. A multivariate dataset was also created and a biplot of SEC25 and pH of the investigated waters...
-
Magnetic Signature Description of Ellipsoid-Shape Vessel Using 3D Multi-Dipole Model Fitted on Cardinal Directions
Publication: The article presents a continuation of the research on the 3D multi-dipole model applied to the reproduction of magnetic signatures of ferromagnetic objects. The model structure has been modified to improve its flexibility: model parameters determined by optimization can now be located in the cuboid contour representing the object's hull. To stiffen the model, the training dataset was expanded to include data collected from all four cardinal...
-
Know your safety indicator – A determination of merchant vessels Bow Crossing Range based on big data analytics
Publication: Even in the era of automation, maritime safety constantly needs improvement. Regardless of the presence of crew members on board, both manned and autonomous ships should follow clear guidelines (whether as bridge procedures or algorithms). To date, many safety indicators, especially in collision avoidance, have been proposed. One such parameter commonly used in day-to-day navigation but usually omitted by researchers is...
-
Interrelations between Travel Patterns and Urban Spatial Structure of the Largest Russian Cities
Publication: The study presented within this dissertation involves the analysis of the relationship between urban spatial structure and travel patterns in the largest Russian cities. It is an empirical investigation of how the spatial structure, formed during the Soviet and post-Soviet periods, affects the travel patterns in the largest cities of contemporary Russia. It aims to determine what measures, both urban structure and transportation...