Search results for: parallel programming
-
Fast implementation of FDTD-compatible green's function on multicore processor
Publication: In this letter, a numerically efficient implementation of the finite-difference time-domain (FDTD)-compatible Green's function on a multicore processor is presented. Recently, a closed-form expression of this discrete Green's function (DGF) was derived, which simplifies its application in FDTD simulations of radiation and scattering problems. Unfortunately, the new DGF expression involves binomial coefficients, whose computations...
-
Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines: Complexity and Algorithms
Publication: In this paper, the problem of scheduling on parallel machines in the presence of incompatibilities between jobs is considered. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. The paper provides several results concerning schedules, optimal or approximate with respect to the two most popular criteria of optimality:...
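The incompatibility model described in this abstract can be illustrated with a small feasibility check. This is only a sketch under stated assumptions, not an algorithm from the paper: the dictionaries `assignment` and `part_of` and the function name `is_feasible` are hypothetical. It relies on one property of a complete multipartite graph: two jobs are joined by an edge exactly when they belong to different parts, so a machine may only hold jobs from a single part.

```python
from collections import defaultdict


def is_feasible(assignment, part_of):
    """Return True if no machine holds two incompatible jobs.

    assignment: dict mapping job -> machine (hypothetical representation)
    part_of:    dict mapping job -> index of its part in the
                complete multipartite incompatibility graph
    """
    parts_on_machine = defaultdict(set)
    for job, machine in assignment.items():
        parts_on_machine[machine].add(part_of[job])
    # Jobs in different parts are always incompatible, so every
    # machine must see at most one part.
    return all(len(parts) <= 1 for parts in parts_on_machine.values())


# Illustrative data: jobs a and b share a part, c and d are each alone.
part_of = {"a": 0, "b": 0, "c": 1, "d": 2}
ok = is_feasible({"a": 0, "b": 0, "c": 1, "d": 2}, part_of)
bad = is_feasible({"a": 0, "c": 0, "b": 1, "d": 2}, part_of)
```

Here `ok` describes a schedule that keeps each part on its own machine, while `bad` places jobs from two different parts on machine 0.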
-
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication: Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....
-
Scheduling of compatible jobs on parallel machines
Publication: The dissertation discusses the problems of scheduling compatible jobs on parallel machines. Some jobs are incompatible, which is modeled as a binary relation on the set of jobs; the relation is often modeled by an incompatibility graph. We consider two models of machines. The first model, more emphasized in the thesis, is a classical model of scheduling, where each machine does one job at a time. The second one is a model of p-batching...
-
An facile Fortran-95 algorithm to simulate complex instabilities in three-dimensional hyperbolic systems
Research Data: It is well known that the simulation of fractional systems is a difficult task from all points of view. In particular, the computer implementation of numerical algorithms to simulate fractional systems of partial differential equations in three dimensions is a hard task which has not been solved satisfactorily. Here, we provide a Fortran-95 code to solve...
-
Use of ICT infrastructure for teaching HPC
Publication: In this paper we look at modern ICT infrastructure as well as the curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, and at the node level using OpenMP and CUDA. We present...
-
Tuning matrix-vector multiplication on GPU
Publication: A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations, such as the conjugate gradients method (cg), the minimal residual method (minres) and the generalized residual method (gmres), and it exerts an influence on the overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
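As a point of reference for the matvec operation named in this abstract, the following is a minimal sequential sketch for a sparse matrix. The compressed sparse row (CSR) layout and the names `csr_matvec`, `data`, `indices` and `indptr` are assumptions made for illustration; the abstract does not say which storage format or kernel the paper tunes.

```python
def csr_matvec(data, indices, indptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    data:    nonzero values, row by row
    indices: column index of each nonzero value
    indptr:  indptr[r]..indptr[r + 1] delimits the nonzeros of row r
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Each row is an independent dot product, which is what makes
        # the operation a natural candidate for parallel execution.
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Illustrative 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]].
data = [4.0, 1.0, 3.0, 2.0, 5.0]
indices = [0, 2, 1, 0, 2]
indptr = [0, 2, 3, 5]
y = csr_matvec(data, indices, indptr, [1.0, 2.0, 3.0])
```

Because the rows are independent, the outer loop is the part a parallel implementation would typically distribute across threads.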
-
Accurate modeling of quasi-resonant inverter fed IM drive
Publication: In this paper, a wide-band modeling methodology of a parallel quasi-resonant dc link inverter (PQRDCLI) fed induction machine (IM) is presented. The modeling objective is early-design stage prediction of conductive electromagnetic interference (EMI) emissions of the considered converter fed IM drive system. Operation principles of the selected topology of PQRDCLI feeding IM drive are given. Modeling of the converter drive system is...
-
Laboratory investigation with subbottom parametric echosounder SES-2000 standard with an emphasis on reflected pure signals analysis
Publication: The main goal of the paper is to describe correlations between the results of trials in which the bottom of the Gulf of Gdańsk was sounded with the parametric echosounder SES-2000 Standard and laboratory research in which sediments collected during the survey were measured. Stationary tests took place at Gdansk University of Technology, where a water tank 30 meters long, 1.8 meters deep and 3 meters wide is located. The main lobe of the antenna was directed...
-
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication: Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...
-
Application of mechanistic and data-driven models for nitrogen removal in wastewater treatment systems
Publication: In this dissertation, the application of mechanistic and data-driven models in nitrogen removal systems including nitrification and deammonification processes was evaluated. In particular, the parameters influencing Nitrospira activity were assessed using response surface methodology (RSM). Various long-term biomass washout experiments were operated in two parallel sequencing batch reactors (SBRs) with a different...
-
Nieliniowa statyka 6-parametrowych powłok sprężysto plastycznych. Efektywne obliczenia MES [Nonlinear statics of 6-parameter elastic-plastic shells. Efficient FEM computations]
Publication: The main issue discussed in the monograph is the formulation of an elastic-plastic constitutive law in the nonlinear 6-parameter shell theory. A distinguishing feature of this theory is the so-called sixth degree of freedom, the drilling rotation, which arises in it naturally. The basic assumption of the work is a plane stress state generalized to a Cosserat-type medium. This approach is an original aspect of the study....
-
Computer controlled systems - 2022/2023
Online Courses: Materials supporting the second-cycle (master's level) lecture in the ACR programme titled "komputerowe systemy automatyki" (computer automation systems). 1. Computer system – controlled plant interfacing technique; simple interfacing and with both side acknowledgement; ideas, algorithms, acknowledge passing. 2. Methods of acknowledgement passing: software checking and passing, using interrupt techniques, using readiness checking (ready – wait lines). The best solution...
-
CCS-lecture-2023-2024
Online Courses: Materials supporting the second-cycle (master's level) lecture in the ACR programme titled "komputerowe systemy automatyki" (computer automation systems). 1. Computer system – controlled plant interfacing technique; simple interfacing and with both side acknowledgement; ideas, algorithms, acknowledge passing. 2. Methods of acknowledgement passing: software checking and passing, using interrupt techniques, using readiness checking (ready – wait lines). The best solution optimization...
-
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication: In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...
-
Parallelization of large vector similarity computations in a hybrid CPU+GPU environment
Publication: The paper presents the design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computation of similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on currently widespread hybrid CPU+GPU systems targeted in the paper. The following are presented and tested for computation of all vector...
-
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
Publication: The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of a predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...
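The batch-wise master–slave loop described in this abstract can be sketched as follows. This is a structural illustration only and not one of the paper's OpenMP implementations: it swaps in Python's `concurrent.futures.ThreadPoolExecutor`, and the function name `master_slave` and its parameters `chunk_size`, `chunks_per_batch`, `n_workers` and `work` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def master_slave(data, chunk_size, chunks_per_batch, n_workers, work):
    """Process data batch by batch with a pool of worker threads.

    The master cuts the next batch into at most chunks_per_batch chunks,
    hands them to the workers, waits for the whole batch, and repeats
    until the input is exhausted.
    """
    results = []
    batch_span = chunk_size * chunks_per_batch
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for start in range(0, len(data), batch_span):
            batch = data[start:start + batch_span]
            chunks = [batch[i:i + chunk_size]
                      for i in range(0, len(batch), chunk_size)]
            # pool.map returns results in chunk order; consuming them
            # here means the master waits for the whole batch.
            results.extend(pool.map(work, chunks))
    return results


# Illustrative run: sum each chunk of two numbers from 0..9.
totals = master_slave(list(range(10)), chunk_size=2,
                      chunks_per_batch=2, n_workers=2, work=sum)
```

Python threads share the interpreter lock, so this sketch shows the control flow of the paradigm rather than its performance characteristics.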