total: 378
Search results for: DATASET CONSTRUCTION
-
The most popular fields of study at full-time first-cycle studies and uniform master's studies in 2010/2011
Open Research Data: Universities try to adjust their didactic offer to the preferences of university candidates and the market situation. The socio-economic changes of the transformation period meant that specialists in the field of the market economy were in demand on the labor market. Subsequently, at universities, fields such as "Management" or "Pedagogy" are ranked...
-
Structure of the SME sector by economy sectors in Poland and the EU-28
Open Research Data: Eurostat data show that almost 75% of SMEs in Poland are in trade and services, every seventh enterprise conducts activities related to construction, and every eighth operates in industry. Comparing the industry structure of Polish companies with the structure of enterprises operating in EU countries, it should be noted that we are characterized by a greater...
-
The OptD-multi method in LiDAR processing
Publication: New and constantly developing technology for acquiring spatial data, such as LiDAR (light detection and ranging), is a source of large volumes of data. However, such an amount of data is not always needed for developing the most popular LiDAR products: digital terrain model (DTM) or digital surface model. Therefore, in many cases, the number of contained points is reduced in the pre-processing stage. The degree of reduction is determined...
-
Style Transfer for Detecting Vehicles with Thermal Camera
Publication: In this work we focus on nighttime vehicle detection for intelligent traffic monitoring from a thermal camera. To train a Convolutional Neural Network (CNN) detector we create a stylized version of the COCO (Common Objects in Context) dataset using a Style Transfer technique that imitates images obtained from thermal cameras. This new dataset is further used for fine-tuning of the model and as a result detection accuracy on images...
-
Crack Mouth Opening Displacement for EH36 Shipbuilding Steel Measurements
Publication: The dataset titled EH36 steel for shipbuilding (plate thickness 50 mm) - CMOD - force record, a0/W = 0.6 contains a CMOD (Crack Mouth Opening Displacement) - Force record which is the base for evaluation of the fracture toughness of structural steel. Bend specimens with a Bx2B section (B = 50 mm), and relative initial crack length a0/W = 0.60 were used. The test was carried out at ambient temperature in accordance with the ISO 12135 standard....
-
Exploratory analysis and ranking of analytical procedures for short-chain chlorinated paraffins determination in environmental solid samples
Publication: Short-chain chlorinated paraffins are among the most recent chemical compounds that have been classified as persistent organic pollutants. They have various applications and are emitted to the environment. Despite the fact that the content levels of these compounds in the environmental compartments should be monitored, there is still a lack of well-defined and validated analytical procedures, proposed or suggested by the national...
-
Crack Mouth Opening Displacement for EH36 Shipbuilding Steel Measurements Dataset
Publication: The dataset titled EH36 steel for shipbuilding (plate thickness 50 mm) – CMOD – force record, a0/W=0.6 contains a CMOD (Crack Mouth Opening Displacement) – Force record which is the base for evaluation of the fracture toughness of structural steel. Bend specimens with a Bx2B section (B = 50 mm), and relative initial crack length a0/W=0.60 were used. The test was carried out at ambient temperature in accordance with the ISO 12135...
-
Annual growth of imports of goods and services in selected countries, 1997–2021
Open Research Data: The table shows the import of goods and services, which is the value of all goods and other market services received from the rest of the world. They include the value of goods, freight, insurance, transportation, travel, royalties, license fees, and other services such as communications, construction, finance, information, business, personal, and government...
-
Extending touch-less interaction with smart glasses by implementing EMG module
Publication: In this paper we propose to use temporal muscle contraction to perform certain actions. Method: The set of muscle contractions corresponding to one of three actions including “single-click”, “double-click”, “click-n-hold” and “non-action” were recorded. After recording a certain number of signals, a set of five parameters was calculated. These parameters served as an input matrix for the neural network. Two-layer feedforward neural...
-
Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models
Publication: High-quality labelled datasets represent a cornerstone in the development of deep learning models for land use classification. The high cost of data collection, the inherent errors introduced during data mapping efforts, the lack of local knowledge, and the spatial variability of the data hinder the development of accurate and spatially-transferable deep learning models in the context of agriculture. In this paper, we investigate...
-
Long-Term GNSS Tropospheric Parameters for the Tropics (2001-2018) Derived from Selected IGS Stations
Publication: This paper describes the dataset “Tropospheric parameters derived from selected IGS stations in the tropics for the years 2001-2018”, which contains GNSS-derived zenith tropospheric delay (ZTD), a posteriori corrected zenith wet delay (ZWD), and precipitable water vapour (PWV) time series. These troposphere-related data were estimated for the Jan 2001 – Dec 2018 period for 43 International GNSS Service (IGS) stations located across the global...
-
Viewpoint independent shape-based object classification for video surveillance
Publication: A method for shape based object classification is presented. Unlike object dimension based methods, it does not require any system calibration techniques. A number of 3D object models are utilized as a source of the training dataset for a specified camera orientation. Usage of the 3D models allows the dataset creation process to be performed semiautomatically. The background subtraction method is used for the purpose of detecting moving objects...
-
Outlier detection method by using deep neural networks
Publication: Detecting outliers in the data set is quite important for building effective predictive models. Consistent prediction cannot be made through models created with data sets containing outliers, or robust models cannot be created. In such cases, it may be possible to exclude observations that are determined to be outliers from the data set, or to assign less weight to these points of observation than to other points of observation....
-
Training of Deep Learning Models Using Synthetic Datasets
Publication: In order to solve increasingly complex problems, the complexity of Deep Neural Networks also needs to be constantly increased, and therefore training such networks requires more and more data. Unfortunately, obtaining such massive real-world training data to optimize neural network parameters is a challenging and time-consuming task. To solve this problem, we propose an easy-to-use and general approach to training deep learning...
-
Local variability in snow concentrations of chlorinated persistent organic pollutants as a source of large uncertainty in interpreting spatial patterns at all scales
Publication: Single point sampling, a widespread practice in snow studies in remote areas due to logistical constraints, can introduce an unquantified error into the final study results. The low concentrations of studied chemicals, such as chlorinated persistent organic pollutants, contribute to the uncertainty. We conducted a field experiment in the Arctic to estimate the error stemming from differences in the composition of snow at short distances...
-
Dataset Related Experimental Investigation of Chess Position Evaluation Using a Deep Neural Network
Publication: The idea of training Artificial Neural Networks to evaluate chess positions has been widely explored in the last ten years. In this paper we investigated dataset impact on chess position evaluation. We created two datasets with over 1.6 million unique chess positions each. In one of those we also included randomly generated positions resulting from consideration of potentially unpredictable chess moves. Each position was evaluated...
-
Searching for Solvents with an Increased Carbon Dioxide Solubility Using Multivariate Statistics
Publication: Ionic liquids (ILs) are used in various fields of chemistry. One of them is CO2 capture, a process that is quite well described. The solubility of CO2 in ILs can be used as a model to investigate gas absorption processes. The aim is to find the relationships between the solubility of CO2 and other variables—physicochemical properties and parameters related to greenness. In this study, 12 variables are used to describe a dataset...
-
Gdynia 2019 - video data - pedestrian, bicycles, vehicles
Open Research Data: Gdynia 2019 - video data - pedestrian, bicycles, vehicles
-
Application of the Optimum Dataset Method in Archeological Studies on Barrows
Publication: Light Detection and Ranging (LiDAR) has become one of the technologies used in archaeological research. It allows for relatively easy detection of archaeological sites that have their own field form, e.g. barrows, fortresses, tracts, ancient fields [1]. As a result of the scanning, the so-called point cloud is obtained, often consisting of millions of points. Such large measurement datasets are very time-consuming and labor-intensive...
-
Detecting Objects of Various Categories in Optical Remote Sensing Imagery Using Neural Networks
Publication: The effective detection of objects in remote sensing images is of great research importance, so recent years have seen significant progress in deep learning techniques in this field. However, despite much valuable research being conducted, many challenges still remain. A lot of research projects focus on detecting objects of a single category (class), while correctly detecting objects of different categories is much harder. The...
-
High performance filtering for big datasets from Airborne Laser Scanning with CUDA technology
Publication: There are many studies on the problems of processing big datasets provided by Airborne Laser Scanning (ALS). The processing of point clouds is often executed in stages or on the fragments of the measurement set. Therefore, solutions that enable the processing of the entire cloud at the same time in a simple, fast, efficient way are the subject of much research. In this paper, the authors propose to use General-Purpose computation...
-
Standard deviation as the optimization criterion in the OptD method and its influence on the generated DTM
Publication: Reduction of the measurement dataset is one of the current issues related to constantly developing technologies that provide large datasets, e.g. laser scanning. It could seem that the presence and evolution of computer processors, increases in hard drive capacity, etc. are the solution for the development of such large datasets. And in fact they are; however, the “lighter” datasets are easier to work with. Additionally, reduced datasets can...
-
Application of Binary Image Quality Assessment Methods to Predict the Quality of Optical Character Recognition Results
Publication: One of the continuous challenges related to the growing popularity of mobile devices and embedded systems with limited memory and computational power is the development of relatively fast methods for real-time image and video analysis. One such example is Optical Character Recognition (OCR), which is usually too complex for such devices. Considering that images captured by cameras integrated into mobile devices may be acquired...
-
C-reactive protein (CRP) evaluation in human urine using optical sensor supported by machine learning
Publication: C-Reactive Protein (CRP) is a rapid and sensitive indicator of inflammation in the human body. Determination of the CRP level is important in medical diagnostics because, depending on that factor, it may indicate, e.g., the occurrence of inflammation of various origins, oncological, cardiovascular, bacterial or viral events. In this study, we describe an interferometric sensor able to detect the CRP level for distinguishing between...
-
Data on LEGO sets release dates and worldwide retail prices combined with aftermarket transaction prices in Poland between June 2018 and June 2023
Publication: The dataset contains LEGO bricks sets item count and pricing history for AI-based set pricing prediction. The data spans the timeframe from June 2018 to June 2023. The data was obtained from three sources: Brickset.com (LEGO sets retail prices, release dates, and IDs), Lego.com official web page (ID number of each set that was released by Lego, its retail prices, the current status of the set) and promoklocki.pl web page (the retail...
-
Active Learning on Ensemble Machine-Learning Model to Retrofit Buildings Under Seismic Mainshock-Aftershock Sequence
Publication: This research presents an efficient computational method for retrofitting of buildings by employing an active learning-based ensemble machine learning (AL-Ensemble ML) approach developed in OpenSees, Python and MATLAB. The results of the study show that the AL-Ensemble ML model provides the most accurate estimations of interstory drift (ID) and residual interstory drift (RID) for steel structures using a dataset of 2- to 9-story...
-
Down-Sampling of Large LiDAR Dataset in the Context of Off-Road Objects Extraction
Publication: Nowadays, LiDAR (Light Detection and Ranging) is used in many fields, such as transportation. Thanks to the recent technological improvements, the current generation of LiDAR mapping instruments available on the market allows the acquisition of up to millions of three-dimensional (3D) points per second. On the one hand, such improvements allowed the development of LiDAR-based systems with increased productivity, enabling the quick acquisition...
-
Using contextual conditional preferences for recommendation tasks: a case study in the movie domain
Publication: Recommendation engines aim to propose items that users are interested in by looking at the user interaction with a system. However, individual interests may be drastically influenced by the context in which decisions are taken. We present an attempt to model user interests via a set of contextual conditional preferences. We show that the use of the proposed preferences gives reasonable values of the accuracy and the precision even when...
-
Induction of the common-sense hierarchies in lexical data
Publication: Unsupervised organization of a set of lexical concepts that captures common-sense knowledge, inducing meaningful partitioning of data, is described. Projection of data on principal components allows for identification of clusters with wide margins, and the procedure is recursively repeated within each cluster. Application of this idea to a simple dataset describing animals created hierarchical partitioning with each cluster related...
-
Herbarium of Division of Marine Biology and Ecology as the Primary Basis for Conservation Status Assessments in the Gulf of Gdańsk
Publication: The dataset titled Herbarium of Division of Marine Biology and Ecology University of Gdańsk (DMBE) is a research herbarium encompassing specimens of vascular plants and algae hosted by the Laboratory of Marine Plant Ecology at the University of Gdańsk, Poland. The aim of the Herbarium is to preserve marine plant and algae collections mostly from the Gulf of Gdańsk, but the herbarium also holds specimens from other parts of the world.
-
Selection of Visual Descriptors for the Purpose of Multi-camera Object Re-identification
Publication: A comparative analysis of various visual descriptors is presented in this chapter. The descriptors utilize many aspects of image data: colour, texture, gradient, and statistical moments. The descriptor list is supplemented with local features calculated in close vicinity of key points found automatically in the image. The goal of the analysis is to find descriptors that are best suited for a particular task, i.e. re-identification...
-
On Bayesian Tracking and Prediction of Radar Cross Section
Publication: We consider the problem of Bayesian tracking of radar cross section. The adopted observation model employs the gamma family, which covers all Swerling cases in a unified framework. State dynamics are modeled using a nonstationary autoregressive gamma process. The principal component of the proposed solution is a nontrivial gamma approximation, applied during the time update recursion. The superior performance of the proposed approach...
-
Minimal Sets of Lefschetz Periods for Morse-Smale Diffeomorphisms of a Connected Sum of g Real Projective Planes
Publication: The dataset titled Database of the minimal sets of Lefschetz periods for Morse-Smale diffeomorphisms of a connected sum of g real projective planes contains all of the values of the topological invariant called the minimal set of Lefschetz periods, computed for Morse-Smale diffeomorphisms of a non-orientable compact surface without boundary of genus g (i.e. a connected sum of g real projective planes), where g varies from 1 to...
-
Dataset Relating Collective Angst, Identifications, Essentialist Continuity and Collective Action for Progressive City Policy among Gdańsk Residents
Publication: This dataset contains the individual responses of 456 residents of Gdańsk who participated in the study. The study was conducted before the second term of the presidential election in Poland in 2020. Demographic variables as well as psychological measures of angst, place attachment, identification, in-group continuity and willingness to engage in collective action were collected. We also measured the perception of the risk of...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publication: Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
Application of Multivariate Adaptive Regression Splines (MARSplines) Methodology for Screening of Dicarboxylic Acids Cocrystal Using 1D and 2D Molecular Descriptors
Publication: Dicarboxylic acids (DiAs) are probably one of the most popular cocrystal formers. Due to their high hydrophilicity and non-toxicity, they are promising solubilizers of active pharmaceutical ingredients (APIs). Although DiAs appear to be highly capable of forming multicomponent crystals with various compounds, some systems reported in the literature are physical mixtures in the solid state without forming a stable intermolecular complex....
-
Preprocessing of Document Images Based on the GGD and GMM for Binarization of Degraded Ancient Papyri Images
Publication: Thresholding of document images is one of the most relevant operations that influence the final results of their further analysis. Although many image binarization methods have been proposed in recent years, starting from global thresholding, through local and adaptive methods, to more sophisticated multi-stage algorithms and the use of deep convolutional neural networks, proper thresholding of degraded historical...
-
Description of the Dataset Rhetoric at School – a Selection of the Syllabi from the Academic Gymnasium in Gdańsk – Transcription and Photographs
Publication: The research dataset described in the article was based on photographs and transcription of a textual record from Latin syllabi for classes at the Gdańsk Academic Gymnasium. The syllabi concern the years 1645/1648/1652/1653. The original document is held in the collection of the Gdańsk Library of the Polish Academy of Sciences [reference number: Ma 3920 8o]. The collected research material can be used for studying the practical...
-
Surf Zone Currents in the Coastal Zone of the Southern Baltic Sea – a Modelling Approach
Publication: Nearshore currents in a multi-bar non-tidal coastal zone environment located in the Southern Baltic Sea are studied. Spatiotemporal seaward-directed jets – so-called rip currents – are an important part of the nearshore current system. In previous research, Dudkowska et al. (2020) performed an extended modelling experiment to determine the wave conditions that are conducive to the emergence of rip currents. In this paper, the...
-
On Computing Curlicues Generated by Circle Homeomorphisms
Publication: The dataset entitled Computing dynamical curlicues contains values of consecutive points on a curlicue generated, respectively, by rotation on the circle by different angles, the Arnold circle map (with various parameter values) and an exemplary sequence as well as corresponding diameters and Birkhoff averages of these curves. We additionally provide source codes of the Matlab programs which can be used to generate and plot the...
-
Detection of the Oocyte Orientation for the ICSI Method Automation
Publication: Automation or even computer assistance of the popular infertility treatment method ICSI (Intracytoplasmic Sperm Injection) would speed up the whole process and improve the control of the results. This paper introduces preliminary research on automatic spermatozoon injection into the oocyte cytoplasm. Here, a method for detecting the correct orientation of the polar body of the oocyte is presented. The proposed method uses deep...
-
Comprehensive Comparison of a Few Variants of Cluster Analysis as Data Mining Tool in Supporting Environmental Management
Publication: A few variants of hierarchical cluster analysis (CA) as a tool for assessing multidimensional similarity in an environmental dataset are compared. The dataset consisted of analytical results of determination of metals (Na, K, Ca, Sc, Fe, Co, Zn, As, Br, Rb, Mo, Sb, Cs, Ba, La, Ce, Sm, Hf and Th) in ambient air dried and kept alive, by means of hydroponics, moss baskets collected in 12 locations in the area of Tricity (Poland)....
-
Domain adaptation for inpainting-based face recognition studies
Publication: Recent inpainting methods have demonstrated impressive outcomes in filling missing parts of images, especially for reconstructing facial areas obscured by occlusions. However, studies show that these models are not adequately effective in real-world applications, primarily due to data bias and the distribution of faces in images. This research focuses on domain adaptation of the commonly used Labeled Faces in the Wild (LFW) dataset,...
-
Data-driven, probabilistic model for attainable speed for ships approaching Gdańsk harbour
Publication: The growing demand for maritime transportation leads to increased traffic in ports. From this arises the need to observe the consequences of the specific speed ships reach when approaching seaports. However, usually the analyzed cases refer only to the statistical evaluation of the studied phenomenon or to the empirical modelling, ignoring the mutual influence of variables such as ship type, length or weather conditions. In this...
-
Personalized prediction of the secondary oocytes number after ovarian stimulation: A machine learning model based on clinical and genetic data
Publication: Controlled ovarian stimulation is tailored to the patient based on clinical parameters, but estimating the number of retrieved metaphase II (MII) oocytes is a challenge. Here, we have developed a model that takes advantage of the patient’s genetic and clinical characteristics simultaneously for predicting the stimulation outcome. Sequence variants in reproduction-related genes identified by next-generation sequencing were matched...
-
Deep neural networks for human pose estimation from a very low resolution depth image
Publication: The work presented in the paper is dedicated to determining and evaluating the most efficient neural network architecture applied as a multiple regression network localizing human body joints in 3D space based on a single low resolution depth image. The main challenge was to deal with a noisy and coarse representation of the human body, as observed by a depth sensor from a large distance, and to achieve high localization precision....
-
The Central European GNSS Research Network (CEGRN) dataset
Publication: The Central European GNSS Research Network (CEGRN) has collected GNSS data since 1994 from contributors which today include 42 institutions in 33 countries. CEGRN returns a dataset of coordinates and velocities computed according to international standards and the most recent processing procedures and recommendations. We provide a dataset of 1229 positions and velocities resulting from 3 or more repetitions of coordinate measurements...
-
Data from terrestrial laser scanning: The Forge in the district of Gdańsk Orunia
Open Research Data: Among the uses of terrestrial laser scanning, there are numerous examples of the registration of buildings, including historical and culturally valuable ones. Data were acquired using a Leica Geosystems C10 laser scanner. The data cover the blacksmith's forge, a historic building located in Gdańsk Orunia, 10 Goscinna Street. Scanning...
-
Multiple Group Membership and Collective Action Intention
Publication: Datasets from two studies conducted in Poland on the relation between identity fusion, group identification, multiple group membership, perceived injustice, and collective action intention. The presented studies, in the context of protests against attempts to restrict abortion law, were conducted to examine the link between belonging to multiple groups, group efficacy & identification, perceived injustice and collective...
-
High Resolution Sea Ice Floe Size and Shape Data from Knox Coast, East Antarctica
Publication: This dataset contains floe size distribution data from a very high resolution (pixel size: 0.3 m) optical satellite image of sea ice, acquired on 16 Feb. 2019 off the Knox Coast (East Antarctica). The image shows relatively small ice floes produced by wave-induced breakup of landfast ice between Mill Island and Bowman Island. The ice floes are characterised by a narrow size distribution and angular, polygonal shapes, typical...