Publications
total: 858
Catalog Publications
Year 2025
-
AI-Driven Sustainability in Agriculture and Farming
In this chapter, we discuss the role of artificial intelligence (AI) in promoting sustainable agriculture and farming. Three main themes run through the chapter. First, we review the state of the art of smart farming and explore the transformative impact of AI on modern agricultural practices, focusing on its contribution to sustainability. With this in mind, our analysis focuses on topics such as data collection and storage, AI...
-
Declarative ship arenas under favourable conditions
According to maritime regulations, a collision-avoidance action shall be taken in “ample time”, although the strict interpretation of this term is ambiguous. Evasive manoeuvres, executed by marine navigators on a daily basis, are usually carried out well in advance, while the distance at which they decide to perform such a manoeuvre is mostly subjective and results, e.g., from the navigator's seagoing experience. A proper understanding...
Year 2024
-
A Machine Learning Approach for Estimating Overtime Allocation in Software Development Projects
Overtime planning in software projects has traditionally been approached with search-based multi-objective optimization algorithms. However, the explicit solutions produced by these algorithms often lack applicability and acceptance in the software industry due to their disregard for project managers' intuitive knowledge. This study presents a machine learning model that learns the preferred overtime allocation patterns from solutions...
-
A review of explainable fashion compatibility modeling methods
The paper reviews methods used in the fashion compatibility recommendation domain. We select methods based on reproducibility, explainability, and novelty aspects and then organize them chronologically and thematically. We present general characteristics of publicly available datasets that are related to the fashion compatibility recommendation task. Finally, we analyze the representation bias of datasets, fashion-based algorithms’...
-
A Survey on the Datasets and Algorithms for Satellite Data Applications
This survey compiles insights and describes datasets and algorithms for applications based on remote sensing. The goal of this review is twofold: to survey datasets for particular groups of tasks and to describe the high-level steps of data flow between satellite instruments and end applications from an implementation and development perspective. The article outlines the generalized data processing pipelines, taking into account the variations in...
-
Adaptive Hounsfield Scale Windowing in Computed Tomography Liver Segmentation
In computed tomography (CT) imaging, the Hounsfield Unit (HU) scale quantifies radiodensity, but its nonlinear nature across organs and lesions complicates machine learning analysis. This paper introduces an automated method for adaptive HU scale windowing in deep learning-based CT liver segmentation. We propose a new neural network layer that optimizes HU scale window parameters during training. Experiments on the Liver Tumor...
-
An intelligent cellular automaton scheme for modelling forest fires
Forest fires have devastating consequences for the environment, the economy and human lives. Understanding their dynamics is therefore crucial for planning the resources allocated to combat them effectively. In a world where the incidence of such phenomena is increasing every year, efficient and accurate computational models are becoming increasingly necessary. In this study, we perform a revision of an initial proposal...
-
Assessment Of the Relevance of Best Practices in The Development of Medical R&D Projects Based on Machine Learning
Machine learning has emerged as a fundamental tool for numerous endeavors within health informatics, bioinformatics, and medicine. However, novices among biomedical researchers and IT developers frequently lack the requisite experience to effectively execute a machine learning project, thereby increasing the likelihood of adopting erroneous practices that may result in common pitfalls or overly optimistic predictions. The paper...
-
Data on LEGO sets release dates and worldwide retail prices combined with aftermarket transaction prices in Poland between June 2018 and June 2023
The dataset contains LEGO brick set item counts and pricing history for AI-based set pricing prediction. The data spans the timeframe from June 2018 to June 2023. The data was obtained from three sources: Brickset.com (LEGO sets retail prices, release dates, and IDs), Lego.com official web page (ID number of each set that was released by LEGO, its retail prices, the current status of the set) and promoklocki.pl web page (the retail...
-
Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits
The Contextual Multi-Armed Bandits (CMAB) framework is pivotal for learning to make decisions. However, due to challenges in deploying online algorithms, there is a shift towards offline policy learning, which relies on pre-existing datasets. This study examines the relationship between the quality of these datasets and the performance of offline policy learning algorithms, specifically, Neural Greedy and NeuraLCB. Our results...
-
Deep Learning-Based Cellular Nuclei Segmentation Using Transformer Model
Accurate segmentation of cellular nuclei is imperative for various biological and medical applications, such as cancer diagnosis and drug discovery. Histopathology, a discipline employing microscopic examination of bodily tissues, serves as a cornerstone for cancer diagnosis. Nonetheless, the conventional histopathological diagnosis process is frequently marred by time constraints and potential inaccuracies. Consequently, there...
-
Enhancing Word Embeddings for Improved Semantic Alignment
This study introduces a method for the improvement of word vectors, addressing the limitations of traditional approaches like Word2Vec or GloVe by introducing richer semantic properties into the embeddings. Our approach leverages supervised learning methods, with shifts in vectors in the representation space enhancing the quality of word embeddings. This ensures better alignment with semantic reference resources, such as WordNet....
-
Holistic collision avoidance decision support system for watchkeeping deck officers
The paper presents a 3-stage synthesis-based Decision Support System for watchkeeping deck officers. Its functional scope covers conflict detection, maneuver selection, and maneuver execution, all phases supplemented by collision alerts. First, a customized elliptic ship domain is used for checking if both the own ship (OS) and the target ship (TS) will have enough free space. A survey-based navigators’ declarative OS arena is then used to determine the time at...
-
Investigation of Performance and Energy Consumption of Tokenization Algorithms on Multi-core CPUs Under Power Capping
In this paper we investigate performance-energy optimization of tokenizer algorithm training using power capping. We focus on parallel, multi-threaded implementations of Byte Pair Encoding (BPE), Unigram, WordPiece, and WordLevel run on two systems with different multi-core CPUs: Intel Xeon 6130 and desktop Intel i7-13700K. We analyze execution times and energy consumption for various numbers of threads and various power caps and...
-
LSA Is not Dead: Improving Results of Domain-Specific Information Retrieval System Using Stack Overflow Questions Tags
The paper presents an approach to using tags from Stack Overflow questions as a data source in the process of building domain-specific unsupervised term embeddings. Using a huge dataset of Stack Overflow posts, our solution employs the LSA algorithm to learn latent representations of information technology terms. The paper also presents the Teamy.ai system, currently developed by the Scalac company, which serves as a platform that...
-
Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems
Graphics processing units (GPUs) are nowadays widely used in all-atom molecular simulations because of the advantage of efficient partitioning of atom pairs between the kernels to compute the contributions to energy and forces, thus enabling the treatment of very large systems. Extension of time- and size-scale of computations is also sought through the development of coarse-grained (CG) models, in which atoms are merged into extended...
-
Multi-GPU-powered UNRES package for physics-based coarse-grained simulations of structure, dynamics, and thermodynamics of protein systems at biological size- and timescales
Coarse-grained models are nowadays extensively used in biomolecular simulations owing to the tremendous extension of size- and time-scale of simulations. The physics-based UNRES (UNited RESidue) model of proteins developed in our laboratory has only two interaction sites per amino-acid residue (united peptide groups and united side chains) and implicit solvent. However, owing to rigorous physics-based derivation, which enabled...
-
Performance and Energy Aware Training of a Deep Neural Network in a Multi-GPU Environment with Power Capping
In this paper we demonstrate that it is possible to obtain considerable improvement of performance and energy aware metrics for training of deep neural networks using a modern parallel multi-GPU system, by enforcing selected, non-default power caps on the GPUs. We measure the power and energy consumption of the whole node using a professional, certified hardware power meter. For a high performance workstation with 8 GPUs, we were...
-
Sampling-based novel heterogeneous multi-layer stacking ensemble method for telecom customer churn prediction
In recent times, customer churn has become one of the most significant issues in business-oriented sectors with telecommunication being no exception. Maintaining current customers is particularly valuable due to the high degree of rivalry among telecommunication companies and the costs of acquiring new ones. The early prediction of churned customers may help telecommunication companies to identify the causes of churn and design...
-
Segmentation-Based BI-RADS ensemble classification of breast tumours in ultrasound images
Background: The development of computer-aided diagnosis systems in breast cancer imaging is exponential. Since 2016, 81 papers have described the automated segmentation of breast lesions in ultrasound images using artificial intelligence. However, only two papers have dealt with complex BI-RADS classifications. Purpose: This study addresses the automatic classification of breast lesions into binary classes (benign vs. malignant)...
-
Teaching High–performance Computing Systems – A Case Study with Parallel Programming APIs: MPI, OpenMP and CUDA
High performance computing (HPC) education has become essential in recent years, especially as parallel computing on high performance computing systems enables modern machine learning models to grow in scale. This significant increase in the computational power of modern supercomputers relies on a large number of cores in modern CPUs and GPUs. As a consequence, parallel program development based on parallel thinking has become...
Year 2023
-
A Formal Approach to Model the Expansion of Natural Events: The Case of Infectious Diseases
A formal approach to modeling the expansion of natural events is presented in this paper. Since the mathematical, statistical or computational methods used are not relevant for development, a modular framework is proposed that guides from the external observation down to the innermost level of the variables that have to appear in the future mathematical-computational formalization. As an example we analyze the expansion of COVID-19....
-
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
In the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs, and extended CUDA calls make it possible to manage CUDA objects and data and to launch kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...
-
AngioScore: An artificial intelligence tool to assess coronary artery lesions
The functionality scope of the AngioScore tool in semi-automatic assessment of stenoses according to the SYNTAX scale was presented. An evaluation of the preliminary accuracy of AngioScore in lesion assessment was performed.
-
Antipsychotic drug prescription sequence analysis in relation to death occurrence and cardiometabolic drug usage: A retrospective longitudinal study
The potential role of antipsychotics in increasing cardiovascular risk of mortality is still debated. The aim of this study was to assess the death risk associated with sequences of first-generation antipsychotic (FGA) and second-generation antipsychotic (SGA) prescriptions, including clozapine and lithium, and drugs for cardiometabolic diseases. We conducted a retrospective longitudinal analysis involving 84,881 patients who received...
-
Application of a stochastic compartmental model to approach the spread of environmental events with climatic bias
Wildfires have significant impacts on both environment and economy, so understanding their behaviour is crucial for the planning and allocation of firefighting resources. Since forest fire management is of great concern, there has been an increasing demand for computationally efficient and accurate prediction models. In order to address this challenge, this work proposes applying a parameterised stochastic model to study the propagation...
-
Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA
Large-scale Graph Convolutional Network (GCN) inference on traditional CPU/GPU systems is challenging due to a large memory footprint, sparse computational patterns, and irregular memory accesses with poor locality. Intel’s Programmable Integrated Unified Memory Architecture (PIUMA) is designed to address these challenges for graph analytics. In this paper, a detailed characterization of GCNs is presented using the Open-Graph Benchmark...
-
Comparison of Selected Neural Network Models Used for Automatic Liver Tumor Segmentation
Automatic and accurate segmentation of liver tumors is crucial for the diagnosis and treatment of hepatocellular carcinoma or metastases. However, the task remains challenging due to imprecise boundaries and significant variations in the shape, size, and location of tumors. The present study focuses on tumor segmentation as a more critical aspect from a medical perspective, compared to liver parenchyma segmentation, which is the...
-
Dataset Related Experimental Investigation of Chess Position Evaluation Using a Deep Neural Network
The idea of training Artificial Neural Networks to evaluate chess positions has been widely explored in the last ten years. In this paper we investigated dataset impact on chess position evaluation. We created two datasets with over 1.6 million unique chess positions each. In one of those we also included randomly generated positions resulting from consideration of potentially unpredictable chess moves. Each position was evaluated...
-
Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
GPU accelerators have become essential to the recent advance in computational power of high-performance computing (HPC) systems. Current HPC systems’ reaching an approximately 20–30 megawatt power demand has resulted in increased CO2 emissions and energy costs and necessitates increasingly complex cooling systems. This is a very real challenge. To address this, new mechanisms of software power control could be employed. In this...
-
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings should be preferred for various domain...
-
Empirical analysis of tree-based classification models for customer churn prediction
Customer churn is a vital and recurring problem facing most business industries, particularly the telecommunications industry. Considering the fierce competition among telecommunications firms and the high expenses of attracting and gaining new subscribers, keeping existing loyal subscribers becomes crucial. Early prediction of disgruntled subscribers can assist telecommunications firms in identifying the reasons for churn and...
-
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
High-performance computing (HPC), as its name suggests, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
-
From Scores to Predictions in Multi-Label Classification: Neural Thresholding Strategies
In this paper, we propose a novel approach for obtaining predictions from per-class scores to improve the accuracy of multi-label classification systems. In a multi-label classification task, the expected output is a set of predicted labels for each test sample. Typically, these predictions are calculated by implicit or explicit thresholding of per-class real-valued scores: classes with scores exceeding a given threshold value...
-
General Provisioning Strategy for Local Specialized Cloud Computing Environments
The well-known management strategies in cloud computing based on SLA requirements are considered. A deterministic parallel provisioning algorithm has been prepared and used to show its behavior for three different requirements: load balancing, consolidation, and fault tolerance. The impact of these strategies on the total execution time of different sets of services is analyzed for randomly chosen sets of data. This makes it possible...
-
Long‐time scale simulations of virus‐like particles from three human‐norovirus strains
The dynamics of the virus-like particles (VLPs) corresponding to the GII.4 Houston, GII.2 SMV, and GI.1 Norwalk strains of human noroviruses (HuNoV) that cause gastroenteritis was investigated by means of long-time (about 30 μs in the laboratory timescale) molecular dynamics simulations with the coarse-grained UNRES force field. The main motion of VLP units turned out to be the bending at the junction between the P1 subdomain (that...
-
Machine Learning Assisted Interactive Multi-objectives Optimization Framework: A Proposed Formulation and Method for Overtime Planning in Software Development Projects
Hammed A. Mojeed & Rafal Szlapczynski. Conference paper. First online: 14 September 2023. Part of the Lecture Notes in Computer Science book series (LNAI, volume 14125). Abstract: A software development project requires proper planning to mitigate risk and...
-
Optimization of Bread Production Using Neuro-Fuzzy Modelling
Automation of food production is an actively researched domain. One of the areas where automation is still not progressing significantly is bread making. The process still relies on expert knowledge regarding how to react to procedure changes depending on environmental conditions, quality of the ingredients, etc. In this paper, we propose an ANFIS-based model for changing the mixer speed during the kneading process. Although the...
-
Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins
We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...
-
Optymalizacja zasobów chmury obliczeniowej z wykorzystaniem inteligentnych agentów w zdalnym nauczaniu [Optimization of cloud computing resources using intelligent agents in remote teaching]
The dissertation concerns the optimization of cloud computing resources using intelligent agents in remote teaching. The issue is important in education, where modern technologies such as the Internet of Things, augmented and virtual reality, and deep learning are used in a cloud computing environment. The issue is also important in a situation where a pandemic forces the use of remote teaching on a large scale...
-
Parallel implementation of a Sailing Assistance Application in a Cloud Environment
Sailboat weather routing is a highly complex problem in terms of both the computational time and memory. The reason for this is a large search resulting in a multitude of possible routes and a variety of user preferences. Analysing all possible routes is only feasible for small sailing regions, low-resolution maps, or sailboat movements on a grid. Therefore, various heuristic approaches are often applied, which can find solutions...
-
Performance assessment of OpenMP constructs and benchmarks using modern compilers and multi-core CPUs
Considering ongoing developments of both modern CPUs, especially in the context of increasing numbers of cores, cache memory and architectures, as well as compilers, there is a constant need for benchmarking representative and frequently run workloads. The key metric is speed-up as the computational power of modern CPUs stems mainly from using multiple cores. In this paper, we show and discuss results from running codes such as:...
-
Photos and rendered images of LEGO bricks
The paper describes a collection of datasets containing both LEGO brick renders and real photos. The datasets contain around 155,000 photos and nearly 1,500,000 renders. The renders aim to simulate real-life photos of LEGO bricks allowing faster creation of extensive datasets. The datasets are publicly available via the Gdansk University of Technology “Most Wiedzy” institutional repository. The source files of all tools used during...
-
Previous Opinions is All You Need - Legal Information Retrieval System
We present a system for retrieving the legal opinions most relevant to a given legal case or question. To this end, we checked several state-of-the-art neural language models. As training and testing data, we use tens of thousands of legal cases as question-opinion pairs. Text data has been subjected to advanced pre-processing adapted to the specifics of the legal domain. We empirically chose the BERT-based HerBERT model to perform...
-
Simulation Environment in Python for Ship Encounter Situations
To assess the risk of collision in radar navigation, distance-based safety measures such as Distance at the Closest Point of Approach and Time to the Closest Point of Approach are most commonly used. The Bow Crossing Range and Bow Crossing Time measures are also a good complement to the picture of the meeting situation. When a ship safety domain is considered, Degree of Domain Violation and Time to Domain Violation can be applied. This...
-
The Idea of a Student Research Project as a Method of Preparing a Student for Professional and Scientific Work
In the paper we present the idea and implementation of a student research project course within the master’s program at the Faculty of Electronics, Telecommunications and Informatics, Gdansk Tech. It aims at preparing students for performing research and scientific tasks in future professional work. We outline the evolution from group projects into research projects and the current deployment of both at bachelor’s and master’s levels...
-
UNRES-GPU for Physics-Based Coarse-Grained Simulations of Protein Systems at Biological Time- and Size-Scales
The dynamics of the virus-like particles (VLPs) corresponding to the GII.4 Houston, GII.2 SMV, and GI.1 Norwalk strains of human noroviruses (HuNoV) that cause gastroenteritis was investigated by means of long-time (about 30 μs in the laboratory timescale) molecular dynamics simulations with the coarse-grained UNRES force field. The main motion of VLP units turned out to be the bending at the junction between the P1 subdomain (that...
-
Visual Features for Improving Endoscopic Bleeding Detection Using Convolutional Neural Networks
The presented paper investigates the problem of endoscopic bleeding detection in endoscopic videos in the form of a binary image classification task. A set of definitions of high-level visual features of endoscopic bleeding is introduced, which incorporates domain knowledge from the field. The high-level features are coupled with respective feature descriptors, enabling automatic capture of the features using image processing methods....
Year 2022
-
Active Learning Based on Crowdsourced Data
The paper proposes a crowdsourcing-based approach for annotated data acquisition and a means to support the Active Learning training approach. In the proposed solution, aimed at data engineers, the knowledge of the crowd serves as an oracle that is able to judge whether the given sample is informative or not. The proposed solution reduces the amount of work needed to annotate large sets of data. Furthermore, it allows a perpetual increase...
-
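Algorytm mrówkowy do zarządzania zasobami sprzętowymi chmury obliczeniowej w przypadku różnych kategorii usług [Ant colony algorithm for managing cloud computing hardware resources for different categories of services]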
Cloud computing management takes place on two levels: managing the requests of cloud clients and managing the infrastructure on which these services are provided. Drawing on an analysis of service management standards, this chapter focuses on the second level of management, whose main goal is the efficient execution of a given service (or services) on the available hardware resources, in such a way as to satisfy...