Filtry
wszystkich: 23
Wyniki wyszukiwania dla: MULTICORE PROCESSING
-
Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit
PublikacjaParallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid enabling the perfect hybridization of FDTD...
-
Implementation of FDTD-Compatible Green's Function on Graphics Processing Unit
PublikacjaIn this letter, implementation of the finite-difference time domain (FDTD)-compatible Green's function on a graphics processing unit (GPU) is presented. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates its applications in the FDTD simulations of radiation and scattering problems. Unfortunately, implementation of the new DGF formula in software requires a multiple precision...
-
Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology
PublikacjaThe discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...
-
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
PublikacjaThis paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...
-
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
PublikacjaWe present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...
-
Preconditioners with Low Memory Requirements for Higher-Order Finite-Element Method Applied to Solving Maxwell’s Equations on Multicore CPUs and GPUs
PublikacjaThis paper discusses two fast implementations of the conjugate gradient iterative method using a hierarchical multilevel preconditioner to solve the complex-valued, sparse systems obtained using the higher order finite-element method applied to the solution of the time-harmonic Maxwell equations. In the first implementation, denoted PCG-V, a classical V-cycle is applied and the system of equations on the lowest level is solved...
-
Multi Queue Approach for Network Services Implemented for Multi Core CPUs
PublikacjaMultiple core processors have already became the dominant design for general purpose CPUs. Incarnations of this technology are present in solutions dedicated to such areas like computer graphics, signal processing and also computer networking. Since the key functionality of network core components is fast package servicing, multicore technology, due to multi tasking ability, seems useful to support packet processing. Dedicated...
-
Single and Dual-GPU Generalized Sparse Eigenvalue Solvers for Finding a Few Low-Order Resonances of a Microwave Cavity Using the Finite-Element Method
PublikacjaThis paper presents two fast generalized eigenvalue solvers for sparse symmetric matrices that arise when electromagnetic cavity resonances are investigated using the higher-order finite element method (FEM). To find a few loworder resonances, the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm with null-space deflation is applied. The computations are expedited by using one or two graphical processing...
-
Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card
PublikacjaIn this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...
-
A GPU Solver for Sparse Generalized Eigenvalue Problems with Symmetric Complex-Valued Matrices Obtained Using Higher-Order FEM
PublikacjaThe paper discusses a fast implementation of the stabilized locally optimal block preconditioned conjugate gradient (sLOBPCG) method, using a hierarchical multilevel preconditioner to solve nonHermitian sparse generalized eigenvalue problems with large symmetric complex-valued matrices obtained using the higher-order finite-element method (FEM), applied to the analysis of a microwave resonator. The resonant frequencies of the low-order...
-
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
PublikacjaHigh-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
-
Advanced Potential Energy Surfaces for Molecular Simulation
PublikacjaAdvanced potential energy surfaces are defined as theoretical models that explicitly include many-body effects that transcend the standard fixed-charge, pairwise-additive paradigm typically used in molecular simulation. However, several factors relating to their software implementation have precluded their widespread use in condensed-phase simulations: the computational cost of the theoretical models, a paucity of approximate models...
-
Generation of large finite-element matrices on multiple graphics processors
PublikacjaThis paper presents techniques for generating very large finite-element matrices on a multicore workstation equipped with several graphics processing units (GPUs). To overcome the low memory size limitation of the GPUs, and at the same time to accelerate the generation process, we propose to generate the large sparse linear systems arising in finite-element analysis in an iterative manner on several GPUs and to use the graphics...
-
Block-based Representation of Application Execution on Modern Parallel Systems
PublikacjaThe chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses pre- viously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.
-
Acceleration of the discrete Green's function computations
PublikacjaResults of the acceleration of the 3-D discrete Green's function (DGF) computations on the multicore processor are presented. The code was developed in the multiple precision arithmetic with use of the OpenMP parallel programming interface. As a result, the speedup factor of three orders of magnitude compared to the previous implementation was obtained thus applicability of the DGF in FDTD simulations was significantly improved.
-
Parallel Programming for Modern High Performance Computing Systems
PublikacjaIn view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...
-
Fast implementation of FDTD-compatible green's function on multicore processor
PublikacjaIn this letter, numerically efficient implementation of the finite-difference time domain (FDTD)-compatible Green's function on a multicore processor is presented. Recently, closed-form expression of this discrete Green's function (DGF) was derived, which simplifies its application in the FDTD simulations of radiation and scattering problems. Unfortunately, the new DGF expression involves binomial coefficients, whose computations...
-
High-Speed Serial Embedded Deterministic Test for System-on-Chip Designs
PublikacjaThe paper presents a high-speed serial interface between external tester and Embedded Deterministic Test (EDT) compression logic hosted by SoC designs. With only a single bidirectional link, the system is capable of feeding distributed heterogeneous cores with hundreds of test channels. Moreover, it synergistically supports EDT bandwidth management to improve the overall test performance. A detailed study indicates a high potential...
-
An Efficient Framework For Fast Computer Aided Design of Microwave Circuits Based on the Higher-Order 3D Finite-Element Method
PublikacjaIn this paper, an efficient computational framework for the full-wave design by optimization of complex microwave passive devices, such as antennas, filters, and multiplexers, is described. The framework consists of a computational engine, a 3D object modeler, and a graphical user interface. The computational engine, which is based on a finite element method with curvilinear higher-order tetrahedral elements, is coupled with built-in...
-
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
PublikacjaIn the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...
-
High-Power Jamming Attack Mitigation Techniques in Spectrally-Spatially Flexible Optical Networks
PublikacjaThis work presents efficient connection provisioning techniques mitigating high-power jamming attacks in spectrally-spatially flexible optical networks (SS-FONs) utilizing multicore fibers. High-power jamming attacks are modeled based on their impact on the lightpaths’ quality of transmission (QoT) through inter-core crosstalk. Based on a desired threshold on a lightpath’s QoT, the modulation format used, the length of the path,...
-
Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors
PublikacjaThe paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...
-
Visual Data Encryption for Privacy Enhancement in Surveillance Systems
PublikacjaIn this paper a methodology for employing reversible visual encryption of data is proposed. The developed algorithms are focused on privacy enhancement in distributed surveillance architectures. First, motivation of the study performed and a short review of preexisting methods of privacy enhancement are presented. The algorithmic background, system architecture along with a solution for anonymization of sensitive regions of interest...