Filtry
wszystkich: 378
wybranych: 308
Filtry wybranego katalogu
Wyniki wyszukiwania dla: DATASET CONSTRUCTION
-
Down-Sampling of Large LiDAR Dataset in the Context of Off-Road Objects Extraction
PublikacjaNowadays, LiDAR (Light Detection and Ranging) is used in many fields, such as transportation. Thanks to the recent technological improvements, the current generation of LiDAR mapping instruments available on the market allows to acquire up to millions of three-dimensional (3D) points per second. On the one hand, such improvements allowed the development of LiDAR-based systems with increased productivity, enabling the quick acquisition...
-
Active Learning on Ensemble Machine-Learning Model to Retrofit Buildings Under Seismic Mainshock-Aftershock Sequence
PublikacjaThis research presents an efficient computational method for retrofitting of buildings by employing an active learning-based ensemble machine learning (AL-Ensemble ML) approach developed in OpenSees, Python and MATLAB. The results of the study shows that the AL-Ensemble ML model provides the most accurate estimations of interstory drift (ID) and residual interstory drift (RID) for steel structures using a dataset of 2-, to 9-story...
-
Data on LEGO sets release dates and worldwide retail prices combined with aftermarket transaction prices in Poland between June 2018 and June 2023
PublikacjaThe dataset contains LEGO bricks sets item count and pricing history for AI-based set pricing prediction. The data spans the timeframe from June 2018 to June 2023. The data was obtained from three sources: Brickset.com (LEGO sets retail prices, release dates, and IDs), Lego.com official web page (ID number of each set that was released by Lego, its retail prices, the current status of the set) and promoklocki.pl web page (the retail...
-
C-reactive protein (CRP) evaluation in human urine using optical sensor supported by machine learning
PublikacjaThe rapid and sensitive indicator of inflammation in the human body is C-Reactive Protein (CRP). Determination of CRP level is important in medical diagnostics because, depending on that factor, it may indicate, e.g., the occurrence of inflammation of various origins, oncological, cardiovascular, bacterial or viral events. In this study, we describe an interferometric sensor able to detect the CRP level for distinguishing between...
-
Induction of the common-sense hierarchies in lexical data
PublikacjaUnsupervised organization of a set of lexical concepts that captures common-sense knowledge inducting meaningful partitioning of data is described. Projection of data on principal components allow for dentification of clusters with wide margins, and the procedure is recursively repeated within each cluster. Application of this idea to a simple dataset describing animals created hierarchical partitioning with each clusters related...
-
Using contextual conditional preferences for recommendation taska: a case study in the movie domain
PublikacjaRecommendation engines aim to propose users items they are interested in by looking at the user interaction with a system. However, individual interests may be drastically influenced by the context in which decisions are taken. We present an attempt to model user interests via a set of contextual conditional preferences. We show that usage of proposed preferences gives reasonable values of the accuracy and the precision even when...
-
Herbarium of Division of Marine Biology and Ecology as the Primary Basis for Conservation Status Assessments in the Gulf of Gdańsk
PublikacjaThe dataset titled Herbarium of Division of Marine Biology and Ecology University of Gdańsk (DMBE) is a research herbarium encompassing specimens of vascular plants and algae hosted by the Laboratory of Marine Plant Ecology at the University of Gdańsk, Poland. The aim of Herbarium is to preserve marine plant and algae collections mostly from the Gulf of Gdańsk, but the herbarium also holds specimens from other parts of the world.
-
Selection of Visual Descriptors for the Purpose of Multi-camera Object Re-identification
PublikacjaA comparative analysis of various visual descriptors is presented in this chapter. The descriptors utilize many aspects of image data: colour, texture, gradient, and statistical moments. The descriptor list is supplemented with local features calculated in close vicinity of key points found automatically in the image. The goal of the analysis is to find descriptors that are best suited for particular task, i.e. re-identification...
-
Dataset Relating Collective Angst, Identifications, Essentialist Continuity and Collective Action for Progressive City Policy among Gdańsk Residents
PublikacjaThis dataset contains the individual responses of 456 residents of Gdańsk who participated in the study. The study was conducted before the second term of the presidential election in Poland in 2020. Demographic variables as well as psychological measures of angst, place attachment, identification in-group continuity and willingness to engage in collective action were collected. We also measured the perception of the risk of...
-
Minimal Sets of Lefschetz Periods for Morse-Smale Diffeomorphisms of a Connected Sum of g Real Projective Planes
PublikacjaThe dataset titled Database of the minimal sets of Lefschetz periods for Morse-Smale diffeomorphisms of a connected sum of g real projective planes contains all of the values of the topological invariant called the minimal set of Lefschetz periods, computed for Morse-Smale diffeomorphisms of a non-orientable compact surface without boundary of genus g (i.e. a connected sum of g real projective planes), where g varies from 1 to...
-
On Bayesian Tracking and Prediction of Radar Cross Section
PublikacjaWe consider the problem of Bayesian tracking of radar cross section. The adopted observation model employs the gamma family, which covers all Swerling cases in a unified framework. State dynamics are modeled using a nonstationary autoregressive gamma process. The principal component of the proposed solution is a nontrivial gamma approximation, applied during the time update recursion. The superior performance of the proposed approach...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
PublikacjaArtificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
Application of Multivariate Adaptive Regression Splines (MARSplines) Methodology for Screening of Dicarboxylic Acids Cocrystal Using 1D and 2D Molecular Descriptors
PublikacjaDicarboxylic acids (DiAs) are probably one of the most popular cocrystals formers. Due to the high hydrophilicity and non-toxicity, they are promising solubilizes of active pharmaceutical ingredients (APIs). Although DiAs appear to be highly capable of forming multicomponent crystals with various compounds, some systems reported in the literature are physical mixtures the solid state without forming stable intermolecular complex....
-
Preprocessing of Document Images Based on the GGD and GMM for Binarization of Degraded Ancient Papyri Images
PublikacjaThresholding of document images is one of the most relevant operations that influence the final results of their further analysis. Although many image binarization methods have been proposed during recent several years, starting from global thresholding, through local and adaptive methods, to more sophisticated multi-stage algorithms and the use of deep convolutional neural networks, proper thresholding of degraded historical...
-
Description of the Dataset Rhetoric at School – a Selection of the Syllabi from the Academic Gymnasium in Gdańsk – Transcription and Photographs
PublikacjaThe research dataset described in the article was based on photographs and transcription of a textual record from Latin syllabi for classes at the Gdańsk Academic Gymnasium. The syllabi concern the years 1645/1648/1652/1653. The original document is held in the collection of the Gdańsk Library of the Polish Academy of Sciences [reference number: Ma 3920 8o]. The collected research material can be used for studying the practical...
-
Surf Zone Currents in the Coastal Zone of the Southern Baltic Sea – a Modelling Approach
PublikacjaNearshore currents in a multi-bar non-tidal coastal zone environment located in the Southern Baltic Sea are studied. Spatiotemporal seaward-directed jets – so-called rip currents – are an important part of the nearshore current system. In previous research, Dudkowska et al. (2020) performed an extended modelling experiment to determine the wave conditions that are conducive to the emergence of rip currents. In this paper, the...
-
On Computing Curlicues Generated by Circle Homeomorphisms
PublikacjaThe dataset entitled Computing dynamical curlicues contains values of consecutive points on a curlicue generated, respectively, by rotation on the circle by different angles, the Arnold circle map (with various parameter values) and an exemplary sequence as well as corresponding diameters and Birkhoff averages of these curves. We additionally provide source codes of the Matlab programs which can be used to generate and plot the...
-
Detection of the Oocyte Orientation for the ICSI Method Automation
PublikacjaAutomation or even computer assistance of the popular infertility treatment method: ICSI (Intracytoplasmic Sperm Injection) would speed up the whole process and improve the control of the results. This paper introduces a preliminary research for automatic spermatozoon injection into the oocyte cytoplasm. Here, the method for detection a correct orientation of the polar body of the oocyte is presented. Proposed method uses deep...
-
Comprehensive Comparison of a Few Variants of Cluster Analysis as Data Mining Tool in Supporting Environmental Management
PublikacjaA few variants of hierarchical cluster analysis (CA) as tool of assessment of multidimensional similarity in environmental dataset are compared. The dataset consisted of analytical results of determination of metals (Na, K, Ca, Sc, Fe, Co, Zn, As, Br, Rb, Mo, Sb, Cs, Ba, La, Ce, Sm, Hf and Th) in ambient air dried and kept alive, by the means of hydroponics, moss baskets collected in 12 locations on the area of Tricity (Poland)....
-
Deep neural networks for human pose estimation from a very low resolution depth image
PublikacjaThe work presented in the paper is dedicated to determining and evaluating the most efficient neural network architecture applied as a multiple regression network localizing human body joints in 3D space based on a single low resolution depth image. The main challenge was to deal with a noisy and coarse representation of the human body, as observed by a depth sensor from a large distance, and to achieve high localization precision....
-
The Central European GNSS Research Network (CEGRN) dataset
PublikacjaThe Central European GNSS Research Network (CEGRN) collects GNSS data since 1994 from contributors which today include 42 Institutions in 33 Countries. CEGRN returns a dataset of coordinates and velocities computed according to international standards and the most recent processing procedures and recommendations. We provide a dataset of 1229 positions and velocities resulting from 3 or more repetitions of coordinate measurements...
-
Domain adaptation for inpainting-based face recognition studies
PublikacjaRecent inpainting methods have demonstrated im-pressive outcomes in filling missing parts of images, especially for reconstructing facial areas obscured by occlusions. However, studies show that these models are not adequately effective in real-world applications, primarily due to data bias and the distribution of faces in images. This research focuses on domain adaptation of the commonly used Labeled Faces in the Wild (LFW) dataset,...
-
Personalized prediction of the secondary oocytes number after ovarian stimulation: A machine learning model based on clinical and genetic data
PublikacjaControlled ovarian stimulation is tailored to the patient based on clinical parameters but estimating the number of retrieved metaphase II (MII) oocytes is a challenge. Here, we have developed a model that takes advantage of the patient’s genetic and clinical characteristics simultaneously for predicting the stimulation outcome. Sequence variants in reproduction-related genes identified by next-generation sequencing were matched...
-
Data-driven, probabilistic model for attainable speed for ships approaching Gdańsk harbour
PublikacjaThe growing demand for maritime transportation leads to increased traffic in ports. From this arises the need to observe the consequences of the specific speed ships reach when approaching seaports. However, usually the analyzed cases refer only to the statistical evaluation of the studied phenomenon or to the empirical modelling, ignoring the mutual influence of variables such as ship type, length or weather conditions. In this...
-
Multiple Group Membership and Collective Action Intention
PublikacjaDatasets from two studies conducted in Poland on the relation between identity fusion, group identification, multiple group membership, perceived injustice, and collective action intention. The presented studies, in the context of protests against attempts to restrict abortion law, were conducted to examine the link between belonging to multiple groups, group efficacy & identification, perceived injustice and collective...
-
High Resolution Sea Ice Floe Size and Shape Data from Knox Coast, East Antarctica
PublikacjaThis dataset contains floe size distribution data from a very high resolution (pixel size: 0.3 m) optical satellite image of sea ice, acquired on 16 Feb. 2019 off the Knox Coast (East Antarctica). The image shows relatively small ice floes produced by wave-induced breakup of landfast ice between Mill Island and Bowman Island. The ice floes are characterised by a narrow size distribution and angular, polygonal shapes, typical...
-
Thermal imaging in automatic rodent’s social behaviour analysis
PublikacjaLaboratory rodent social behaviour analysis is an extremely important task for biological, medical and pharmacological researches. In this work thermal images features that facilitate analysis are presented. Methods to distinguish objects on the basis of thermal distribution are tested. Actions of grooming or biting one rodent by another - important social behaviour incidents - are clearly visible...
-
From Data to Decision: Interpretable Machine Learning for Predicting Flood Susceptibility in Gdańsk, Poland
PublikacjaFlood susceptibility prediction is complex due to the multifaceted interactions among hydrological, meteorological, and urbanisation factors, further exacerbated by climate change. This study addresses these complexities by investigating flood susceptibility in rapidly urbanising regions prone to extreme weather events, focusing on Gdańsk, Poland. Three popular ML techniques, Support Vector Machine (SVM), Random Forest (RF), and...
-
Deep CNN based decision support system for detection and assessing the stage of diabetic retinopathy
PublikacjaThe diabetic retinopathy is a disease caused by long-standing diabetes. Lack of effective treatment can lead to vision impairment and even irreversible blindness. The disease can be diagnosed by examining digital color fundus photographs of retina. In this paper we propose deep learning approach to automated diabetic retinopathy screening. Deep convolutional neural networks (CNN) - the most popular kind of deep learning algorithms...
-
Convolutional Neural Networks for C. Elegans Muscle Age Classification Using Only Self-Learned Features
PublikacjaNematodes Caenorhabditis elegans (C. elegans) have been used as model organisms in a wide variety of biological studies, especially those intended to obtain a better understanding of aging and age-associated diseases. This paper focuses on automating the analysis of C. elegans imagery to classify the muscle age of nematodes based on the known and well established IICBU dataset. Unlike many modern classification methods, the proposed...
-
Pedestrian detection in low-resolution thermal images
PublikacjaOver one million people die in car accidents worldwide each year. A solution that will be able to reduce situations in which pedestrian safety is at risk has been sought for a long time. One of the techniques for detecting pedestrians on the road is the use of artificial intelligence in connection with thermal imaging. The purpose of this work was to design a system to assist the safety of people and car intelligence with the use...
-
News that Moves the Market: DSEX-News Dataset for Forecasting DSE Using BERT
PublikacjaStock market is a complex and dynamic industry that has always presented challenges for stakeholders and investors due to its unpredictable nature. This unpredictability motivates the need for more accurate prediction models. Traditional prediction models have limitations in handling the dynamic nature of the stock market. Additionally, previous methods have used less relevant data, leading to suboptimal performance. This study...
-
Detection of circulating tumor cells by means of machine learning using Smart-Seq2 sequencing
PublikacjaCirculating tumor cells (CTCs) are tumor cells that separate from the solid tumor and enter the bloodstream, which can cause metastasis. Detection and enumeration of CTCs show promising potential as a predictor for prognosis in cancer patients. Furthermore, single-cells sequencing is a technique that provides genetic information from individual cells and allows to classify them precisely and reliably. Sequencing data typically...
-
Assessing the attractiveness of human face based on machine learning
PublikacjaThe attractiveness of the face plays an important role in everyday life, especially in the modern world where social media and the Internet surround us. In this study, an attempt to assess the attractiveness of a face by machine learning is shown. Attractiveness is determined by three deep models whose sum of predictions is the final score. Two annotated datasets available in the literature are employed for training and testing...
-
Testing the Effect of Bathymetric Data Reduction on the Shape of the Digital Bottom Model
PublikacjaDepth data and the digital bottom model created from it are very important in the inland and coastal water zones studies and research. The paper undertakes the subject of bathymetric data processing using reduction methods and examines the impact of data reduction according to the resulting representations of the bottom surface in the form of numerical bottom models. Data reduction is an approach that is meant to reduce the size...
-
Cyanobacterial and Algal Strains in the Culture Collection of Baltic Algae (CCBA)
PublikacjaThe dataset titled Microalgal strains from “Culture Collection of Baltic Algae (CCBA)” is a representation of cyanobacterial and microalgal cultures isolated from the Baltic Sea. It is a unique catalogue of strains of the dominant and rare species found in the Baltic phytoplankton and microphytobenthos assemblages. The main purpose of the collection is to extend the knowledge on the Baltic microbial communities by providing...
-
Macrophytobenthos in the Puck Bay in 2010–2018 Dataset
PublikacjaThe dataset titled Biomass of macrophytobenthos in the Puck Bay in 2010-2018 con-tains data on the qualitative composition and biomass of macrophytobenthos (flow-er plants and macroalgae) in samples collected in the Puck Bay area (Gulf of Gdańsk, southern Baltic Sea) at 20 stations between 2010–2018. The data was supplemented with additional information: values of measured parameters of water and sediment, e.g. tem-perature...
-
Simplified AutoDock force field for hydrated binding sites
Publikacjahas been extracted from the Protein Data Bank and used to test and recalibrate AutoDock force field. Since for some binding sites water molecules are crucial for bridging the receptor-ligand interactions, they have to be included in the analysis. To simplify the process of incorporating water molecules into the binding sites and make it less ambiguous, new simple water model was created. After recalibration of the force field on...
-
Comparison of image pre-processing methods in liver segmentation task
PublikacjaAutomatic liver segmentation of Computed Tomography (CT) images is becoming increasingly important. Although there are many publications in this field there is little explanation why certain pre-processing methods were utilised. This paper presents a comparison of the commonly used approach of Hounsfield Units (HU) windowing, histogram equalisation, and a combination of these methods to try to ascertain what are the differences...
-
Educational Dataset of Handheld Doppler Blood Flow Recordings
PublikacjaVital signals registration plays a significant role in biomedical engineering and education process. Well acquired data allow future engineers to observe certain physical phenomena as well learn how to correctly process and interpret the data. This dataset was designed for students to learn about Doppler phenomena and to demonstrate correctly and incorrectly acquired signals as well as the basic methods of signal processing. This...
-
KEMR-Net: A Knowledge-Enhanced Mask Refinement Network for Chromosome Instance Segmentation
PublikacjaThis article proposes a mask refinement method for chromosome instance segmentation. The proposed method exploits the knowledge representation capability of Neural Knowledge DNA (NK-DNA) to capture the semantics of the chromosome’s shape, texture, and key points, and then it uses the captured knowledge to improve the accuracy and smoothness of the masks. We validate the method’s effectiveness on our latest high-resolution chromosome...
-
Operational Enhancement of Numerical Weather Prediction with Data from Real-time Satellite Images
PublikacjaNumerical weather prediction (NWP) is a rapidly expanding field of science, which is related to meteorology, remote sensing and computer science. Authors present methods of enhancing WRF EMS (Weather Research and Forecast Environmental Modeling System) weather prediction system using data from satellites equipped with AMSU sensor (Advanced Microwave Sounding Unit). The data is acquired with Department of Geoinformatics’ ground...
-
Legislation and Practice of Selected State Aid Issues, According to EU and Polish Law
PublikacjaThe dataset encompasses several tables, each consisting of three elements: legislation, jurisprudence and scientific articles on numerous subjects and economic activities receiving public financial support in the form of state aid instruments. The set includes a subjective list of the most commonly used and/or disputable examples of granting aid, such as for (local) airports and airlines, steel production, shipyards, and coalmines....
-
Long-term Hindcast Simulation of Currents, Sea Level, Water Temperature and Salinity in the Baltic Sea
PublikacjaThis dataset contains the results of numerical modelling of currents, sea level, water temperature and salinity over a period of 50 years (1958–2007) in the Baltic Sea. A long-term hindcast simulation was performed using a three-dimensional hydrodynamic model (PM3D) based on the Princeton Ocean Model (POM). The spatial resolution was 3 nautical miles, i.e. about 5.5 km. Currents, water temperature, and salinity were recorded...
-
Simultaneous grouping and ranking with combination of SOM and TOPSIS for selection of preferable analytical procedure for furan determination in food
PublikacjaNovel methodology for grouping and ranking with application of self-organizing maps and multicriteria decision analysis is presented. The dataset consists of 22 objects that are analytical procedures applied to furan determination in food samples. They are described by 10 variables, referred to their analytical performance, environmental and economic aspects. Multivariate statistics analysis allows to limit the amount of input...
-
Toward Intelligent Recommendations Using the Neural Knowledge DNA
PublikacjaIn this paper we propose a novel recommendation approach using past news click data and the Neural Knowledge DNA (NK-DNA). The Neural Knowledge DNA is a novel knowledge representation method designed to support discovering, storing, reusing, improving, and sharing knowledge among machines and computing systems. We examine our approach for news recommendation tasks on the MIND benchmark dataset. By taking advantages of NK-DNA, deep...
-
Global Value Chains and Wages: Multi-Country Evidence from Linked Worker-Industry Data
PublikacjaThis paper uses a multi-country microeconomic setting to contribute to the literature on the nexus between production fragmentation and wages. Exploiting a rich dataset on over 110,000 workers from nine Eastern and Western European countries and the United States, we study the relationship between individual workers’ wages and industry ties into global value chains (GVCs). We find an inverse (but weak) relationship between the...
-
Analysing By-Products Interaction as an Industry Resource of Circular Economy in Ukraine and the World
PublikacjaThe paper analyses existing and current scientific developments and literature sources, which show the advantages and disadvantages of many different influences of waste in Ukraine and other countries of Europe and the world. As a research result, stable connections have been established between the factors and criteria in assessing the by-product interaction as an industry resource. In our research, we used programs R.Studio and...
-
Towards semantic-rich word embeddings
PublikacjaIn recent years, word embeddings have been shown to improve the performance in NLP tasks such as syntactic parsing or sentiment analysis. While useful, they are problematic in representing ambiguous words with multiple meanings, since they keep a single representation for each word in the vocabulary. Constructing separate embeddings for meanings of ambiguous words could be useful for solving the Word Sense Disambiguation (WSD)...
-
Balanced Spider Monkey Optimization with Bi-LSTM for Sustainable Air Quality Prediction
PublikacjaA reliable air quality prediction model is required for pollution control, human health monitoring, and sustainability. The existing air quality prediction models lack efficiency due to overfitting in prediction model and local optima trap in feature selection. This study proposes the Balanced Spider Monkey Optimization (BSMO) technique for effective feature selection to overcome the local optima trap and overfitting problems....