dr inż. Paweł Rościszewski
Publications
total: 23
Catalog Publications
Year 2021
-
TensorHive: Management of Exclusive GPU Access for Distributed Machine Learning Workloads
TensorHive is a tool for organizing work of research and engineering teams that use servers with GPUs for machine learning workloads. In a comprehensive web interface, it supports reservation of GPUs for exclusive usage, hardware monitoring, as well as configuring, executing and queuing distributed computational jobs. Focusing on easy installation and simple configuration, the tool automatically detects the available computing...
-
Towards Scalable Simulation of Federated Learning
Federated learning (FL) allows training models on decentralized data while maintaining data privacy, which unlocks the availability of large and diverse datasets for many practical applications. The ongoing development of aggregation algorithms, distribution architectures and software implementations aims to enable federated setups employing thousands of distributed devices, selected from millions. Since the availability of...
Year 2020
-
Automated Classifier Development Process for Recognizing Book Pages from Video Frames
One of the latest developments made by publishing companies is introducing mixed and augmented reality to their printed media (e.g. to produce augmented books). An important computer vision problem that they are facing is classification of book pages from video frames. The problem is non-trivial, especially considering that typical training data is limited to only one digital original per book page, while the trained classifier...
-
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Auto-tuning of configuration and application parameters makes it possible to achieve significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...
-
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...
Year 2018
-
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
Many important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing the full power of such systems requires programming parallel applications that are hybrid in two senses: they can utilize parallelism on multiple levels at the same time and combine programming interfaces...
Year 2017
-
From Linear Classifier to Convolutional Neural Network for Hand Pose Recognition
Recently gathered image datasets and the new capabilities of high-performance computing systems have allowed developing new artificial neural network models and training algorithms. Using the new machine learning models, computer vision tasks can be accomplished based on the raw values of image pixels instead of specific features. The principle of operation of deep neural networks resembles more and more what we believe to be happening...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training
In the paper we tackle the bi-objective execution time and power consumption optimization problem concerning execution of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...
Year 2016
-
Executing Multiple Simulations in the MERPSYS Environment
The chapter investigates the steps necessary to perform a simulation instance in the MERPSYS environment and discusses potential limitations when vast numbers of simulations are required. An extended architecture is proposed which includes a JMS-based simulation queue and multiple distributed simulators, overcoming the potential bottlenecks. The chapter also introduces methods for preparing suites of multiple simulations...
-
KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs
The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....
-
Modeling energy consumption of parallel applications
The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. The application model uses the Java language with an extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...
Year 2015
-
Simulation of parallel similarity measure computations for large data sets
The paper presents our approach to implementation of a similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...
Year 2014
-
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of utilized hardware, as well as software solutions. The problems that we wish to solve efficiently using those systems have different complexity, not only considering magnitude, but also the type of complexity: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
-
Data Mining Applications and Methods in Medicine
In this paper we describe the research area of data mining and its applications in medicine. The origins of data mining and its crucial features are briefly presented. We discuss the specificity of medicine as an application area for computer systems. Characteristic features of the medical data are investigated. Common problems in the area are also presented as well as the strengths and capabilities of the data mining methods....
-
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework
Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Obtaining a Well-Trained Artificial Intelligence Algorithm from Cross-Validation in Endoscopy
The article briefly discusses endoscopic video analysis problems and artificial intelligence algorithms supporting it. The most common method of efficiency testing of these algorithms is to perform intensive cross-validation. This allows accurate evaluation of their generalization performance. One of the main problems of this procedure is that there is no simple and universal way of obtaining a specific instance of a well-trained...
-
Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs
The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns them to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...
-
Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment
The paper presents the design, implementation and real-life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...
-
Simulation of Parallel Applications on Large-scale Distributed Systems
This chapter takes the form of a review article in the field of simulating High-Performance Computing systems. We justify the need for a new versatile simulator considering heterogeneity, energy efficiency and reliability of HPC systems. We sketch the problems that need to be solved by such a simulator and rationalize using discrete-event simulation for this purpose. Based on a review of existing discrete-event HPC simulation solutions...
Year 2013
-
A ROLE PLAYING GAME NAME GENERATOR LEARNING ITS CREATIVITY FROM ARKADIA MUD PLAYERS
The paper proposes an approach to creative generation of new names for the purposes of Role Playing Games in fantasy realms. The generator, based on an existing database of names, is able to propose a set of new names with regard to demanded attributes, such as: length of the name, sex and race of the character, a given phrase as the origin for the generated name as well as subjective evaluations from former users. The software...