Filtry
wszystkich: 1555
-
Katalog
- Publikacje 975 wyników po odfiltrowaniu
- Czasopisma 207 wyników po odfiltrowaniu
- Konferencje 25 wyników po odfiltrowaniu
- Osoby 40 wyników po odfiltrowaniu
- Wynalazki 1 wyników po odfiltrowaniu
- Projekty 4 wyników po odfiltrowaniu
- Kursy Online 52 wyników po odfiltrowaniu
- Wydarzenia 1 wyników po odfiltrowaniu
- Dane Badawcze 250 wyników po odfiltrowaniu
wyświetlamy 1000 najlepszych wyników Pomoc
Wyniki wyszukiwania dla: OVERLAPPING COMMUNICATION AND COMPUTATIONS
-
Benchmarking overlapping communication and computations with multiple streams for modern GPUs
PublikacjaThe paper presents benchmarking a multi-stream application processing a set of input data arrays. Tests have been performed and execution times measured for various numbers of streams and various compute intensities measured as the ratio of kernel compute time and data transfer time. As such, the application and benchmarking is representative of frequently used operations such as vector weighted sum, matrix multiplication etc....
-
Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams
PublikacjaThe paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams,...
-
Parallelization of large vector similarity computations in a hybrid CPU+GPU environment
PublikacjaThe paper presents design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computation of similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on currently widespread hybrid CPU+GPU systems targeted in the paper. The following are presented and tested for computation of all vector...
-
Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors
PublikacjaThe paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...
-
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
PublikacjaIn the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs and extended CUDA calls allow to manage CUDA objects, data and launch kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...
-
Parallel Programming for Modern High Performance Computing Systems
PublikacjaIn view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...
-
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
PublikacjaThe paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...
-
Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming
PublikacjaIn the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...
-
ENGINEERING COMPUTATIONS
Czasopisma -
Modeling SPMD Application Execution Time
PublikacjaParallel applications in a Single Process Multiple Data paradigm assume splitting huge amounts of data to multiple processors working in parallel at small data packets. As the individual data packets are not independent, the processors must interact with each other to exchange results of the calculations with their adjacent partners and take these results into account in their own computations. An example of SPMD is geometric parallelism...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
PublikacjaIn this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins
PublikacjaWe report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...
-
Accuracy of the discrete Green's function computations
PublikacjaThis paper discusses the accuracy of the discrete Green's function (DGF) computations. Recently closed-form expression of the DGF and its efficient numerical implementation were presented which facilitate the DGF applications in FDTD simulations of radiation and scattering problems. By carefully comparing the DGF results to those of the FDTD simulation, one can make conclusions about the range of the applicability of the DGF for...
-
Acceleration of the discrete Green's function computations
PublikacjaResults of the acceleration of the 3-D discrete Green's function (DGF) computations on the multicore processor are presented. The code was developed in the multiple precision arithmetic with use of the OpenMP parallel programming interface. As a result, the speedup factor of three orders of magnitude compared to the previous implementation was obtained thus applicability of the DGF in FDTD simulations was significantly improved.
-
Electromagnetic Problems Requiring High-Precision Computations
PublikacjaAn overview of the applications of multiple-precision arithmetic in CEM was presented in this paper for the first time. Although double-precision floating-point arithmetic is sufficient for most scientific computations, there is an expanding body of electromagnetic problems requiring multiple-precision arithmetic. Software libraries facilitating these computations were described, and investigations requiring multiple-precision...
-
IP Core of Coprocessor for Multiple-Precision-Arithmetic Computations
PublikacjaIn this paper, we present an IP core of coprocessor supporting computations requiring integer multiple-precision arithmetic (MPA). Whilst standard 32/64-bit arithmetic is sufficient to solve many computing problems, there are still applications that require higher numerical precision. Hence, the purpose of the developed coprocessor is to support and offload central processing unit (CPU) in such computations. The developed digital...
-
Simulation of parallel similarity measure computations for large data sets
PublikacjaThe paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...
-
Parallel computations in the volunteer based Comcute system
PublikacjaThe paper presents Comcute which is a novel multi-level implemen- tation of the volunteer based computing paradigm. Comcute was designed to let users donate the computing power of their PCs in a simplified manner, requiring only pointing their web browser at a specific web address and clicking a mouse. The server side appoints several servers to be in charge of execution of particular tasks. Thanks to that the system can survive...
-
Windowing of the Discrete Green's Function for Accurate FDTD Computations
PublikacjaThe paper presents systematic evaluation of the applicability of parametric and nonparametric window functions for truncation of the discrete Green's function (DGF). This function is directly derived from the FDTD update equations, thus the FDTD method and its integral discrete formulation can be perfectly coupled using DGF. Unfortunately, the DGF computations require processor time, hence DGF has to be truncated with appropriate...
-
Maritime Communication and Sea Safety of the Future - Machnine-type 5G Communication Concept
PublikacjaThe article presents the concept of a system based on 5G network and M2M communication increasing maritime safety. Generally, the focus was on presenting a proposal for a hierarchical, hybrid, cooperative system with M2M communication coordinated with BAN networks. The possi-ble applications of M2M communication at sea were also presented.