Search results for: cialo czarne
-
Planowanie przestrzenne i kształtowanie krajobrazu miasta w kontekście terenów zalewowych na przykładzie Drezdenka / Spatial planning and landscaping of the city in the context of the floodplains on the example of Drezdenko
Publication
-
Does the community-based combined Meeting Center Support Programme (MCSP) make the pathway to day-care activities easier for people living with dementia? A comparison before and after implementation of MCSP in three European countries
Publication
-
Skanery zwiększają wydajność: Nowoczesne tartaki w Szwecji [Scanners increase productivity: modern sawmills in Sweden]
Publication
The report presents a number of Swedish sawmills whose sawing lines include a variety of scanners, among them scanners designed to detect internal wood defects and scanners that detect so-called black knots after sawing.
-
Applications of Unnatural Amino Acids in Protease Probes
Publication
-
Określenie miejsca wypadku przy pomocy metody Slibara [Determining the accident location using the Slibar method]
Publication
Road accident statistics indicate a significant share of accidents involving pedestrians. A collision between a motor vehicle and a pedestrian is a complex process that depends on many factors, such as impact speed, the pedestrian's height, the shape of the vehicle body, the point at which the vehicle strikes the pedestrian's body, and many others. To establish the actual course of such an event, appropriate computational tools must be used...
-
The Dream of Black
Publication
The Dream of Black. The exhibition Sen o czerni (The Dream of Black) is a project by teachers from the University of Ostrava and the Wyspa Progress Foundation, or rather by their students, who are themselves teachers today. In many cases this is already the second generation of students. The Dream of Black offers a broad spectrum of artistic forms. The authors' selection forming the core of the exhibition project (guest: Viktor Frešo and other artists associated with the Gdańsk art scene) is only...
-
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
Publication
The paper presents architecture and design of three versions for fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...
-
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
Publication
The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...
-
Use of ICT infrastructure for teaching HPC
Publication
In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...
-
Human awareness versus Autonomous Vehicles view: comparison of reaction times during emergencies
Publication
Human safety is one of the most critical factors when a new technology is introduced to the everyday use. It was no different in the case of Autonomous Vehicles (AV), designed to replace generally available Conventional Vehicles (CV) in the future. AV rules, from the start, focus on guaranteeing safety for passengers and other road users, and these assumptions usually work during normal traffic conditions. However, there is still...
-
Benchmarking Deep Neural Network Training Using Multi- and Many-Core Processors
Publication
In the paper we provide thorough benchmarking of deep neural network (DNN) training on modern multi- and many-core Intel processors in order to assess performance differences for various deep learning as well as parallel computing parameters. We present performance of DNN training for Alexnet, Googlenet, Googlenet_v2 as well as Resnet_50 for various engines used by the deep learning framework, for various batch sizes. Furthermore,...
-
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication
In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...
-
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication
In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...
-
Investigation of Performance and Configuration of a Selected IoT System—Middleware Deployment Benchmarking and Recommendations
Publication
Nowadays Internet of Things is gaining more and more focus all over the world. As a concept it gives many opportunities for applications for society and it is expected that the number of software services deployed in this area will still grow fast. Especially important in this context are properties connected with deployment such as portability, scalability and balance between software requirements and hardware capabilities. In...
-
Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware
Publication
In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...
-
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...
-
Dataset Related Experimental Investigation of Chess Position Evaluation Using a Deep Neural Network
Publication
The idea of training Artificial Neural Networks to evaluate chess positions has been widely explored in the last ten years. In this paper we investigated dataset impact on chess position evaluation. We created two datasets with over 1.6 million unique chess positions each. In one of those we also included randomly generated positions resulting from consideration of potentially unpredictable chess moves. Each position was evaluated...
-
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Publication
Auto-tuning of configuration and application parameters allows significant performance gains to be achieved in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...
-
Multi-agent large-scale parallel crowd simulation with NVRAM-based distributed cache
Publication
This paper presents the architecture, main components and performance results for a parallel and modular agent-based environment aimed at crowd simulation. The environment allows simulation of thousands or more agents on maps of square kilometers or more, features a modular design and incorporates non-volatile RAM (NVRAM) with a fail-safe mode that can be activated to allow computations to continue from a recently analyzed state in...
-
Performance/energy aware optimization of parallel applications on GPUs under power capping
Publication
In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark, which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...
-
Performance evaluation of Unified Memory with prefetching and oversubscription for selected parallel CUDA applications on NVIDIA Pascal and Volta GPUs
Publication
The paper presents assessment of Unified Memory performance with data prefetching and memory oversubscription. Several versions of code are used with: standard memory management, standard Unified Memory and optimized Unified Memory with programmer-assisted data prefetching. Evaluation of execution times is provided for four applications: Sobel and image rotation filters, stream image processing and computational fluid dynamic simulation,...
-
Recent advances in traffic optimisation: systematic literature review of modern models, methods and algorithms
Publication
Over the past few decades, the increasing number of vehicles and imperfect road traffic management have been sources of congestion in cities and reasons for deteriorating health of its inhabitants. With the help of computer simulations, transport engineers optimise and improve the capacity of city streets. However, with an enormous number of possible simulation types, it is difficult to grasp valuable, innovative solutions which...
-
Benchmarking Scalability and Security Configuration Impact for A Distributed Sensors-Server IOT Use Case
Publication
Internet of Things has been getting more and more attention and found numerous practical applications. Especially important in this context are performance, security and ability to cope with failures. It is especially crucial to find a good trade-off between these. In this article we present results of practical tests with multiple clients representing sensors sending notifications to an IoT middleware – DeviceHive. We investigate performance...
-
Performance assessment of OpenMP constructs and benchmarks using modern compilers and multi-core CPUs
Publication
Considering ongoing developments of both modern CPUs, especially in the context of increasing numbers of cores, cache memory and architectures as well as compilers there is a constant need for benchmarking representative and frequently run workloads. The key metric is speed-up as the computational power of modern CPUs stems mainly from using multiple cores. In this paper, we show and discuss results from running codes such as:...
-
Some Security Features of Selected IoT Platforms
Publication
IoT (Internet of Things) is certainly one of the leading current and future trends for processing in the current distributed world. It is changing our life and society. IoT allows new ubiquitous applications and processing, but, on the other hand, it introduces potentially serious security threats. Nowadays researchers in IoT areas should, without a doubt, consider and focus on security aspects. This paper is aimed at a high-level...
-
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
Publication
The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...
-
Performance Modeling and Prediction of Real Application Workload in a Volunteer-based System
Publication
The goal of this paper is to present a model that predicts the real workload placed on a volunteer based system by an application, with incorporation of not only performance but also availability of volunteers. The application consists of multiple data packets that need to be processed. Knowing the computational workload demand of a single data packet we show how to estimate the application workload in a volunteer based system. Furthermore,...
-
A distributed system for conducting chess games in parallel
Publication
This paper proposes a distributed and scalable cloud based system designed to play chess games in parallel. Games can be played between chess engines alone or between clusters created by combined chess engines. The system has a built-in mechanism that compares engines, based on Elo ranking which finally presents the strength of each tested approach. If an approach needs more computational power, the design of the system allows...
-
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
Publication
The aim of this paper is to evaluate performance of new CUDA mechanisms, unified memory and dynamic parallelism, for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...
-
Considerations of Computational Efficiency in Volunteer and Cluster Computing
Publication
In the paper we focus on analysis of performance and power consumption statistics for two modern environments used for computing – volunteer and cluster based systems. The former integrate computational power donated by volunteers from their own locations, often towards social oriented or targeted initiatives, be it of medical, mathematical or space nature. The latter is meant for high performance computing and is typically installed...
-
Dynamic Compatibility Matching of Services for Distributed Workflow Execution
Publication
The paper presents a concept and an implementation of dynamic learning of compatibilities of services used in a workflow application. While services may have the same functionality, they may accept input and produce output in different formats. The proposed solution learns matching of outputs and inputs at runtime and uses this knowledge in subsequent runs of workflow applications. The presented solution was implemented in an...
-
Modelling and simulation of GPU processing in the MERPSYS environment
Publication
In this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...
-
Automatic conversion of legacy applications into services in BeesyCluster
Publication
The paper presents a solution that gives the user a simple way to automatically convert applications available on Unix systems into services in the BeesyCluster system. BeesyCluster is a middleware layer providing access to a network of clusters through the WWW. To ensure a wide range of available services, many Linux packages can be converted at once. Based on the extracted information, the tool makes it possible...
-
Workflow application for detection of unwanted events
Publication
The paper presents a distributed application for detecting potentially dangerous events in input video streams. Recognition of unwanted events triggers alarms and sends notifications to the appropriate services, and also causes the video to be recorded. The application model consists of nodes with cameras and nodes that fetch data streams, process data, send notifications, and store data. The implemented application...
-
Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs
Publication
The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns them to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...
-
Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.
Publication
The paper investigates and simulates electrophysiological phenomena in the myocardium using parallel MPI-based software developed for this purpose. Code improvements leading to good scalability were implemented and studied, and performance tests were run on the latest 32- and 64-bit Linux clusters. The work is an attempt at a parallel implementation of a known approach...
-
New user-guided and ckpt-based checkpointing libraries for parallel MPI applications
Publication
The paper presents design and implementation details as well as performance results of two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API based on MPI constants. It uses MPI-2 I/O functions or a dedicated master process...
-
Portable parallel simulator using MPI for 2D and 3D domains: design and performance testing
Publication
In the article we present design and implementation details of our modular MPI-based simulation code, including overlapping of computation and communication. We emphasize the modularity of our implementation, which allows the code to be easily adapted to other applications. We present the relationship between speed-up and the size and shapes of three-dimensional domains with various ratios of the number of nodes updated by a processor...
-
Mixed electromagnetic-circuits modeling and parallelization for rigorous characterization of cosite interference in wireless communication channels. In: UGC 2002 Homepage [online]. Department of Defense High Performance Computing Modernization Program. Users Group Conference 2002. Austin, Texas, USA. June 10-14, 2002. [Accessed: 15 December 2002]. Available on the World Wide Web: http://www.hpcmo.hpc.mil/Htdocs/UGC/UGC02/paper/ [45 slides]. Polish title: Modelowanie układów elektromagnetycznych i zrównoleglanie w celu określenia wzajemnych oddziaływań w bezprzewodowych kanałach komunikacyjnych.
Publication
Simultaneous operation of adjacent transceiver modules typically leads to side effects caused by mutual interference, which degrade network performance. To characterize such effects, a time-domain solution of Maxwell's equations with modeling of nonlinear effects is presented.
-
The influence of strain on the Verwey transition as a function of dopant concentration: towards a geobarometer for magnetite-bearing rocks
Publication
-
Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming
Publication
In the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...
-
From Sequential to Parallel Implementation of NLP Using the Actor Model
Publication
The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...
-
Analyzing energy/performance trade-offs with power capping for parallel applications on modern multi and many core processors
Publication
In the paper we present extensive results from analyzing energy/performance trade-offs with power capping observed on four different modern CPUs, for three different parallel applications such as 2D heat distribution, numerical integration and Fast Fourier Transform. The CPUs tested represent both multi-core type CPUs such as Intel® Xeon® E5, desktop and mobile i7 as well as many-core Intel® Xeon Phi™ x200 but also server, desktop...
-
Highlights from RNDM 2018 – 10th Anniversary Workshop on Resilient Networks Design and Modeling
Publication
An article reporting on the RNDM 2018 workshop.
-
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
Publication
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...
-
DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing
Publication
In the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms, linear search (LS) or golden section search (GSS), finds...
-
GPU Power Capping for Energy-Performance Trade-Offs in Training of Deep Convolutional Neural Networks for Image Recognition
Publication
In the paper we present performance-energy trade-off investigation of training Deep Convolutional Neural Networks for image recognition. Several representative and widely adopted network models, such as Alexnet, VGG-19, Inception V3, Inception V4, Resnet50 and Resnet152 were tested using systems with Nvidia Quadro RTX 6000 as well as Nvidia V100 GPUs. Using GPU power capping we found other than default configurations minimizing...
-
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
Publication
High-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
-
Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
Publication
GPU accelerators have become essential to the recent advance in computational power of high-performance computing (HPC) systems. Current HPC systems’ power demand, reaching approximately 20–30 megawatts, has resulted in increasing CO2 emissions and energy costs and necessitates increasingly complex cooling systems. This is a very real challenge. To address this, new mechanisms of software power control could be employed. In this...
-
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
Publication
Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...