(Field of Science):
- Automation, electronic and electrical engineering (Engineering and Technology)
- Information and communication technology (Engineering and Technology)
- Biomedical engineering (Engineering and Technology)
- Management and quality studies (Social studies)
- Computer and information sciences (Natural sciences)
Ministry points:
| Year | Points | List |
|---|---|---|
| 2021 | 70 | Ministry Scored Journals List 2019 |
| 2020 | 70 | Ministry Scored Journals List 2019 |
| 2019 | 70 | Ministry Scored Journals List 2019 |
Papers published in journal
The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to imbalances in process arrival times” was analyzed, commented on, and improved. The comments concern the handling of certain edge cases in the original pseudocode and description, i.e., adding another process state, improved cache friendliness, more precise complexity estimations, and some other issues improving the robustness of the algorithm implementation....
Performance evaluation of Unified Memory with prefetching and oversubscription for selected parallel CUDA applications on NVIDIA Pascal and Volta GPUs
The paper presents an assessment of Unified Memory performance with data prefetching and memory oversubscription. Several versions of the code are compared, using standard memory management, standard Unified Memory, and optimized Unified Memory with programmer-assisted data prefetching. Execution times are evaluated for four applications: Sobel and image rotation filters, stream image processing and computational fluid dynamics simulation,...
Two new algorithms for the all-reduce operation, optimized for imbalanced process arrival patterns (PAPs), are presented: (1) sorted linear tree and (2) pre-reduced ring. A new method of online PAP detection is also introduced, including process arrival time estimation and the distribution of these estimates between cooperating processes. The idea, pseudo-code, implementation details, benchmark for performance evaluation and a real case example...
The paper presents the design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computing the similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on the currently widespread hybrid CPU+GPU systems targeted in the paper. The following are presented and tested for computation of all vector...
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
The aim of this paper is to evaluate the performance of two new CUDA mechanisms, unified memory and dynamic parallelism, in real parallel applications compared to versions using the standard CUDA API. In order to gain insight into the performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...
A model, design, and implementation of an efficient multithreaded workflow execution engine with data streaming, caching, and storage constraints
The paper proposes a model, design, and implementation of an efficient multithreaded engine for execution of distributed service-based workflows with data streaming defined on a per-task basis. The implementation takes into account capacity constraints of the servers on which services are installed and, if needed, the workflow data footprint. Furthermore, it also considers the storage space of the workflow execution engine and its cost....
Modeling, run-time optimization and execution of distributed workflow applications in the JEE-based BeesyCluster environment
The article presents a complete solution for modeling scientific and business scenarios, static and dynamic selection of services with quality parameters taken into account, and execution of scenarios in a real environment. A scenario is modeled as a directed acyclic graph in which nodes represent tasks and edges represent dependencies between tasks. The BeesyCluster middleware is used to enable...
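The graph model mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or from BeesyCluster: the task names and the `execution_order` helper are hypothetical, and the standard-library `graphlib` module stands in for whatever scheduler the authors actually use. It only shows how a workflow expressed as a directed acyclic graph yields a valid task execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow (assumption, not from the paper):
# each key is a task, each value is the set of tasks it depends on.
workflow = {
    "preprocess": {"fetch_input"},
    "simulate": {"preprocess"},
    "visualize": {"simulate"},
    "archive": {"simulate"},
    "report": {"visualize", "archive"},
}


def execution_order(dag):
    """Return one valid order in which the tasks can run.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)
```

Every task appears after all of the tasks it depends on; tasks with no mutual dependency, such as `visualize` and `archive` here, may appear in either order and could in principle run concurrently.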