Search results for: FINITE DIFFERENCE RIEMANN SOLVER MUSTA-FORCE ALGORITHM PARALLEL ALGORITHMS CUDA
-
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication
Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....
-
Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
Publication
This paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA‐FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...
-
OpenGL accelerated method of the material matrix generation for FDTD simulations
Publication
This paper presents the accelerated technique of the material matrix generation from CAD models utilized by the finite-difference time-domain (FDTD) simulators. To achieve high performance of these computations, the parallel-processing power of a graphics processing unit was employed with the use of the OpenGL library. The method was integrated with the developed FDTD solver, providing approximately five-fold speedup of the material...
-
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication
We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication
Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
Performance evaluation of parallel background subtraction on GPU platforms
Publication
Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...
-
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...
-
Single and Dual-GPU Generalized Sparse Eigenvalue Solvers for Finding a Few Low-Order Resonances of a Microwave Cavity Using the Finite-Element Method
Publication
This paper presents two fast generalized eigenvalue solvers for sparse symmetric matrices that arise when electromagnetic cavity resonances are investigated using the higher-order finite element method (FEM). To find a few low-order resonances, the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm with null-space deflation is applied. The computations are expedited by using one or two graphical processing...
-
Performance evaluation of the parallel object tracking algorithm employing the particle filter
Publication
An algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with the CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...
-
Computer experiments with a parallel clonal selection algorithm for the graph coloring problem
Publication
Artificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...
-
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
Publication
Performance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
On root finding algorithms for complex functions with branch cuts
Publication
A simple and versatile method is presented, which enhances the complex root finding process by eliminating branch cuts and branch points in the analyzed domain. For any complex function defined by a finite number of Riemann sheets, a pointwise product of all the surfaces can be obtained. Such a single-valued function is free of discontinuity caused by branch cuts and branch points. The roots of the new function are the same as the...
-
GPU based implementation of Temperature-Vegetation Dryness Index for AVHRR3 Satellite Data
Publication
The paper presents an implementation of the TVDI (Temperature-Vegetation-Dryness Index) algorithm on a GPU (Graphics Processing Unit). Calculation of this index is based on LST (Land Surface Temperature) and NDVI (Normalized Difference Vegetation Index). The discussed results are based on multi-spectral imagery retrieved from AVHRR3 sensors for the area of Poland. All phases of the TVDI implementation on GPU are modified with respect to the CUDA platform....
-
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
Publication
The aim of this paper is to evaluate performance of new CUDA mechanisms (unified memory and dynamic parallelism) for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...
-
Designing acoustic scattering elements using machine learning methods
Publication
In the process of the design and correction of room acoustic properties, it is often necessary to select the appropriate type of acoustic treatment devices and make decisions regarding their size, geometry, and location of the devices inside the room under the treatment process. The goal of this doctoral dissertation is to develop and validate a mathematical model that allows predicting the effects of the application of the scattering...
-
Finite-window RLS algorithms
Publication
Two recursive least-squares (RLS) adaptive filtering algorithms are most often used in practice, the exponential and sliding (rectangular) window RLS algorithms. This popularity is mainly due to the existence of low-complexity versions of these algorithms. However, these two windows are not always the best choice for identification of fast time-varying systems, when the identification performance is most important. In this paper, we...
-
Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology
Publication
The discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...
-
Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform
Publication
Results of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method that together provide satisfactory accuracy and efficiency of real-time processing of high resolution camera images, maintaining the cost of resource usage at a reasonable level. Two...
-
Implementation of FDTD-Compatible Green's Function on Graphics Processing Unit
Publication
In this letter, implementation of the finite-difference time domain (FDTD)-compatible Green's function on a graphics processing unit (GPU) is presented. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates its applications in the FDTD simulations of radiation and scattering problems. Unfortunately, implementation of the new DGF formula in software requires a multiple precision...
-
Genetic Positioning of Fire Stations Utilizing Grid-computing Platform
Publication
The chapter presents a model for determining near-optimal locations of fire stations based on the topography of a given area and the location of forests, rivers, lakes and other elements of the site. The model is based on principles of genetic algorithms and utilizes the power of the grid to distribute and execute in parallel the most performance-demanding computations involved in the algorithm.
-
Generalized adaptive comb filter with improved accuracy and robustness properties
Publication
Generalized adaptive comb filters can be used to identify/track parameters of quasi-periodically varying systems. In a special, signal case they reduce down to adaptive comb filters, applied to elimination or extraction of nonstationary multi-harmonic signals buried in noise. We propose a new algorithm which combines, in an adaptive way, results yielded by several, simultaneously working generalized adaptive comb filters. Due to its highly...
-
New Approach to Noncausal Identification of Nonstationary Stochastic FIR Systems Subject to Both Smooth and Abrupt Parameter Changes
Publication
In this technical note, we consider the problem of finite-interval parameter smoothing for a class of nonstationary linear stochastic systems subject to both smooth and abrupt parameter changes. The proposed parallel estimation scheme combines the estimates yielded by several exponentially weighted basis function algorithms. The resulting smoother automatically adjusts its smoothing bandwidth to the type and rate of nonstationarity...
-
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication
In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...
-
Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit
Publication
Parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid enabling the perfect hybridization of FDTD...
-
Self-optimizing generalized adaptive notch filters - comparison of three optimization strategies
Publication
The paper provides a comparison of three different approaches to on-line tuning of generalized adaptive notch filters (GANFs), the algorithms used for identification/tracking of quasi-periodically varying dynamic systems. Tuning is needed to adjust adaptation gains, which control tracking performance of ANF algorithms, to the unknown and/or time-varying rate of system nonstationarity. Two out of three compared approaches are classical...
-
Simulation of unsteady flow over floodplain using the diffusive wave equation and the modified finite element method
Publication
We consider solution of 2D nonlinear diffusive wave equation in a domain temporarily covered by a layer of water. A modified finite element method with triangular elements and linear shape functions is used for spatial discretization. The proposed modification refers to the procedure of spatial integration and leads to a more general algorithm involving a weighting parameter. The standard finite element method and the finite difference...
-
Variable-structure algorithm for identification of quasi-periodically varying systems
Publication
The paper presents a variable-structure version of a generalized notch filtering (GANF) algorithm. Generalized notch filters are used for identification of quasi-periodically varying dynamic systems and can be considered an extension, to the system case, of classical adaptive notch filters. The proposed algorithm is a cascade of two GANF filters: a multiple-frequency "precise" filter bank, used for precise system tracking, and a...
-
Analysis of Floodplain Inundation Using 2D Nonlinear Diffusive Wave Equation Solved with Splitting Technique
Publication
In the paper a solution of the two-dimensional (2D) nonlinear diffusive wave equation in a partially dry and wet domain is considered. The splitting technique, which reduces the 2D problem to a sequence of one-dimensional (1D) problems, is applied. The obtained 1D equations with regard to x and y are spatially discretized using the modified finite element method with the linear shape functions. The applied modification referring...
-
Process zone in the Single Cantilever Beam under transverse loading. - Part I: Theoretical analysis
Publication
Single Cantilever Beam (SCB) specimen loaded with a transverse force parallel to the crack front is proposed for the analysis of crack propagation phenomena under mixed mode conditions. The stress redistribution in the adhesive layer in the vicinity of the crack front so as the beam deformation are estimated using a Timoshenko beam on elastic foundation model. This model emphasizes the Mode II contribution due to flexural beam...
-
Fast implementation of FDTD-compatible Green's function on multicore processor
Publication
In this letter, numerically efficient implementation of the finite-difference time domain (FDTD)-compatible Green's function on a multicore processor is presented. Recently, closed-form expression of this discrete Green's function (DGF) was derived, which simplifies its application in the FDTD simulations of radiation and scattering problems. Unfortunately, the new DGF expression involves binomial coefficients, whose computations...
-
Kinetics of molecular decomposition under irradiation of gold nanoparticles with nanosecond laser pulses—A 5-Bromouracil case study
Publication
Laser illuminated gold nanoparticles (AuNPs) efficiently absorb light and heat up the surrounding medium, leading to versatile applications ranging from plasmonic catalysis to cancer photothermal therapy. Therefore, an in-depth understanding of the thermal, optical, and electron induced reaction pathways is required. Here, the electrophilic DNA nucleobase analog 5-Bromouracil (BrU) has been used as a model compound to...
-
Computationally-efficient design optimisation of antennas by accelerated gradient search with sensitivity and design change monitoring
Publication
Electromagnetic (EM) simulation tools are of primary importance in the design of contemporary antennas. The necessity of accurate performance evaluation of complex structures is a reason why the final tuning of antenna dimensions, aimed at improvement of electrical and field characteristics, needs to be based on EM analysis. Design automation is highly desirable and can be achieved by coupling EM solvers with numerical optimisation...
-
New First - Path Detector for LTE Positioning Reference Signals
Publication
In today's world, where positioning applications have reached huge popularity and become virtually ubiquitous, there is a strong need for determining a device location as accurately as possible. Cellular networks, such as Long Term Evolution (LTE), play a particularly important role in positioning. In the LTE Observed Time Difference of Arrival (OTDOA) positioning method, precision of device location estimation depends on accuracy...
-
On Computational Aspects of Greedy Partitioning of Graphs
Publication
In this paper we consider a problem of graph P-coloring consisting in partitioning the vertex set of a graph such that each of the resulting sets induces a graph in a given additive, hereditary class of graphs P. We focus on partitions generated by the greedy algorithm. In particular, we show that given a graph G and an integer k deciding if the greedy algorithm outputs a P-coloring with at least k colors is NP-complete for an infinite...
-
Online sound restoration system for digital library applications
Publication
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
The Quick Measure of a Nurbs Surface Curvature for Accurate Triangular Meshing
Publication
NURBS surfaces are the most widely used surfaces for three-dimensional models in CAD/CAE programs. When a model for FEM calculation is prepared with a CAD program, it must eventually be meshed. There are many algorithms for meshing planar regions. Some of them may be used for meshing surfaces but it is necessary to take the curvature of the surface under consideration to avoid poor quality mesh. The mesh must be denser in the...
-
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
Publication
Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted...
-
Reinforcement Learning Algorithm and FDTD-based Simulation Applied to Schroeder Diffuser Design Optimization
Publication
The aim of this paper is to propose a novel approach to the algorithmic design of Schroeder acoustic diffusers employing a deep learning optimization algorithm and a fitness function based on a computer simulation of the propagation of acoustic waves. The deep learning method employed for the research is a deep policy gradient algorithm. It is used as a tool for carrying out a sequential optimization process the goal of which is...
-
Computational aspects of greedy partitioning of graphs
Publication
In this paper we consider a variant of graph partitioning consisting in partitioning the vertex set of a graph into the minimum number of sets such that each of them induces a graph in a hereditary class of graphs P (the problem is also known as P-coloring). We focus on the computational complexity of several problems related to greedy partitioning. In particular, we show that given a graph G and an integer k deciding if the greedy...
-
On noncausal weighted least squares identification of nonstationary stochastic systems
Publication
In this paper, we consider the problem of noncausal identification of nonstationary, linear stochastic systems, i.e., identification based on prerecorded input/output data. We show how several competing weighted (windowed) least squares parameter smoothers, differing in memory settings, can be combined together to yield a better and more reliable smoothing algorithm. The resulting parallel estimation scheme automatically adjusts...
-
Modelling of Flood Wave Propagation with Wet-dry Front by One-dimensional Diffusive Wave Equation
Publication
A full dynamic model in the form of the shallow water equations (SWE) is often useful for reproducing the unsteady flow in open channels, as well as over a floodplain. However, most of the numerical algorithms applied to the solution of the SWE fail when flood wave propagation over an initially dry area is simulated. The main problems are related to the very small or negative values of water depths occurring in the vicinity of...
-
Application of the finite element methods in long-term simulation of the multi-physics systems with large transient response differences
Publication
Application of the Finite Element Method (FEM) and the Multibody Dynamics Method allows complex physical systems to be analyzed. The complexity of the system can be related both to the geometry and to the physical description of the phenomenon. The method is an excellent tool for analyzing the statics or dynamics of mechanical systems, and permits tracking of the Multi Body System (MBS) transient response for long-term simulations and application...
-
FDTD Method for Electromagnetic Simulations in Media Described by Time-Fractional Constitutive Relations
Publication
In this paper, the finite-difference time-domain (FDTD) method is derived for electromagnetic simulations in media described by the time-fractional (TF) constitutive relations. TF Maxwell’s equations are derived based on these constitutive relations and the Grünwald–Letnikov definition of a fractional derivative. Then the FDTD algorithm, which includes memory effects and energy dissipation of the considered media, is introduced....
-
Simulating coherent light propagation in a random scattering materials using the perturbation expansion
Publication
Multiple scattering of coherent light plays an important role in optical metrology. Probably the most important phenomenon caused by multiple scattering is the speckle patterns present in every optical imaging method based on coherent or partially coherent light illumination. In many cases the speckle patterns are considered as an undesired noise. However, they were found useful in various subsurface imaging methods such as...
-
A novel heterogeneous model of concrete for numerical modelling of ground penetrating radar
Publication
The ground penetrating radar (GPR) method has increasingly been applied in the non-destructive testing of reinforced concrete structures. The most common approach to the modelling of radar waves is to consider concrete as a homogeneous material. This paper proposes a novel, heterogeneous, numerical model of concrete for exhaustive interpretation of GPR data. An algorithm for determining the substitute values of the material constants...
-
MEMORY EFFECT ANALYSIS USING PIECEWISE CUBIC B-SPLINE OF TIME FRACTIONAL DIFFUSION EQUATION
Publication
The purpose of this work is to study the memory effect of the Caputo–Fabrizio time fractional diffusion equation by means of cubic B-spline functions. The Caputo–Fabrizio interpretation of the fractional derivative involves a non-singular kernel that makes it possible to describe some classes of material heterogeneities and the effect of memory more effectively. The proposed numerical technique relies on finite difference approach and cubic...
-
Using GPUs for Parallel Stencil Computations in Relativistic Hydrodynamic Simulation
Publication
This paper explores the possibilities of using a GPU for complex 3D finite difference computation. We propose a new approach to this topic using surface memory and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. The case study was performed for the extensive computation of collisions between heavy nuclei in terms of relativistic hydrodynamics.
-
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication
New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...
-
Dynamic coloring of graphs
Publication
Dynamics is an inherent feature of many real life systems so it is natural to define and investigate the properties of models that reflect their dynamic nature. Dynamic graph colorings can be naturally applied in system modeling, e.g. for scheduling threads of parallel programs, time sharing in wireless networks, session scheduling in high-speed LANs, channel assignment in WDM optical networks as well as traffic scheduling. In...