Search results for: FPGAS, MULTIPLE-PRECISION ARITHMETIC, SCIENTIFIC COMPUTING, PARALLEL PROCESSING, EMBEDDED SYSTEMS
-
IP Core of Coprocessor for Multiple-Precision-Arithmetic Computations
Publication: In this paper, we present an IP core of a coprocessor supporting computations requiring integer multiple-precision arithmetic (MPA). Whilst standard 32/64-bit arithmetic is sufficient to solve many computing problems, there are still applications that require higher numerical precision. Hence, the purpose of the developed coprocessor is to support and offload the central processing unit (CPU) in such computations. The developed digital...
-
FPGA implementation of the multiplication operation in multiple-precision arithmetic
Publication: Although standard 32/64-bit arithmetic is sufficient to solve most scientific-computing problems, there are still problems that require higher numerical precision. Multiple-precision arithmetic (MPA) libraries are software tools for emulation of computations in a user-defined precision. However, availability of reconfigurable cards based on field-programmable gate arrays (FPGAs) in computing systems allows one to implement...
-
Verification and Benchmarking in MPA Coprocessor Design Process
Publication: This paper presents verification and benchmarking required for the development of a coprocessor digital circuit for integer multiple-precision arithmetic (MPA). Its code is developed, with the use of very high speed integrated circuit hardware description language (VHDL), as an intellectual property core. Therefore, it can be used by a final user within their own computing system based on field-programmable gate arrays (FPGAs)....
-
Open-Source Coprocessor for Integer Multiple Precision Arithmetic
Publication: This paper presents an open-source digital circuit of the coprocessor for integer multiple-precision arithmetic (MPA). The purpose of this coprocessor is to support a central processing unit (CPU) by offloading computations requiring integer precision higher than 32/64 bits. The coprocessor is developed using the very high speed integrated circuit hardware description language (VHDL) as an intellectual property (IP) core. Therefore,...
-
Implementation of Addition and Subtraction Operations in Multiple Precision Arithmetic
Publication: In this paper, we present a digital circuit of an arithmetic unit implementing addition and subtraction operations in multiple-precision arithmetic (MPA). This adder-subtractor unit is part of an MPA coprocessor supporting and offloading the central processing unit (CPU) in computations requiring precision higher than 32/64 bits. Although addition and subtraction operations of two n-digit numbers require O(n) operations, the efficient...
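The O(n) cost mentioned in this abstract can be illustrated with a minimal software sketch of limb-wise addition with carry propagation. This is not the paper's circuit: the 64-bit limb width and the helper names `to_limbs`, `from_limbs`, and `add_limbs` are assumptions made only for illustration.

```python
from typing import List

LIMB_BITS = 64                       # assumed limb width, chosen for illustration
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(value: int) -> List[int]:
    """Split a non-negative integer into little-endian limbs."""
    limbs = []
    while True:
        limbs.append(value & LIMB_MASK)
        value >>= LIMB_BITS
        if value == 0:
            return limbs


def from_limbs(limbs: List[int]) -> int:
    """Reassemble little-endian limbs into a Python integer."""
    result = 0
    for i, limb in enumerate(limbs):
        result |= limb << (i * LIMB_BITS)
    return result


def add_limbs(a: List[int], b: List[int]) -> List[int]:
    """Add two limb arrays in one pass; each limb is visited once, hence O(n)."""
    out = []
    carry = 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        total = x + y + carry
        out.append(total & LIMB_MASK)   # low bits stay in this limb
        carry = total >> LIMB_BITS      # overflow moves to the next limb
    if carry:
        out.append(carry)
    return out
```

The carry chain is the sequential dependency in this loop: each limb needs the carry from the previous one before its result is final.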
-
Implementation of Coprocessor for Integer Multiple Precision Arithmetic on Zynq Ultrascale+ MPSoC
Publication: Recently, we have opened the source code of a coprocessor for multiple-precision arithmetic (MPA). In this contribution, the implementation and benchmarking results for this MPA coprocessor are presented on a modern Zynq Ultrascale+ multiprocessor system on chip, which combines a field-programmable gate array with a quad-core ARM Cortex-A53 64-bit central processing unit (CPU). In our benchmark, a single coprocessor can be up to 4.5 times...
-
Parallel Programming for Modern High Performance Computing Systems
Publication: In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...
-
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
Publication: Many important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing the full power of such systems requires programming parallel applications that are hybrid in two senses: they can utilize parallelism on multiple levels at the same time and combine programming interfaces...
-
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
Publication: This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...
-
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Publication: Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams
Publication: The paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams,...
-
Highly parallel distributed computing systems with optical interconnections
Publication
-
A Taxonomy of Model-Based Testing for Embedded Systems from Multiple Industry Domains
Publication: This chapter provides a taxonomy of Model-Based Testing (MBT) based on the approaches that are presented throughout this book as well as in the related literature. The techniques for testing are categorized using a number of dimensions to familiarize the reader with the terminology used throughout the chapters that follow. In this chapter, after a brief introduction, a general definition of MBT and related work on available MBT...
-
Electromagnetic Problems Requiring High-Precision Computations
Publication: An overview of the applications of multiple-precision arithmetic in computational electromagnetics (CEM) was presented in this paper for the first time. Although double-precision floating-point arithmetic is sufficient for most scientific computations, there is an expanding body of electromagnetic problems requiring multiple-precision arithmetic. Software libraries facilitating these computations were described, and investigations requiring multiple-precision...
-
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
Publication: This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously applies the computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...
-
Implementation of FDTD-Compatible Green's Function on Graphics Processing Unit
Publication: In this letter, an implementation of the finite-difference time-domain (FDTD)-compatible Green's function on a graphics processing unit (GPU) is presented. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates its applications in the FDTD simulations of radiation and scattering problems. Unfortunately, implementation of the new DGF formula in software requires a multiple precision...
-
Acceleration of the discrete Green's function computations
Publication: Results of the acceleration of the 3-D discrete Green's function (DGF) computations on a multicore processor are presented. The code was developed in multiple-precision arithmetic with the use of the OpenMP parallel programming interface. As a result, a speedup factor of three orders of magnitude compared to the previous implementation was obtained, and thus the applicability of the DGF in FDTD simulations was significantly improved.
-
Exception handling model influence factors for distributed systems. In: Proceedings. PPAM 2003. Parallel Processing and Applied Mathematics. 5th International Conference. Częstochowa, 7-10 September 2003. (Polish title: Model obsługi wyjątków uwzględniający wpływ czynników systemu rozproszonego.)
Publication: A program specification is clearly defined in sequential systems, where it has standard and exceptional transitions. The paper presents an extended system specification model for distributed environments that takes a number of specific factors into account. The model includes an analysis of the specification with respect to exception handling for distributed data and interprocessor communication. The general model was implemented in the environment...
-
Fast implementation of FDTD-compatible Green's function on multicore processor
Publication: In this letter, a numerically efficient implementation of the finite-difference time-domain (FDTD)-compatible Green's function on a multicore processor is presented. Recently, a closed-form expression of this discrete Green's function (DGF) was derived, which simplifies its application in the FDTD simulations of radiation and scattering problems. Unfortunately, the new DGF expression involves binomial coefficients, whose computations...
-
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster. (Polish title: Rozpoznawanie niebezpiecznych zdarzeń dźwiękowych z wykorzystaniem równoległego przetwarzania na klastrze superkomputerowym)
Publication: A method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication: Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card
Publication: In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...
-
Affective Learning Manifesto – 10 Years Later
Publication: In 2004 a group of affective computing researchers proclaimed a manifesto of affective learning that outlined the prospects and white spots of research at that time. Ten years have passed, and affective computing has developed many methods and tools for tracking human emotional states as well as models for affective systems construction. There are multiple examples of affective methods applications in Intelligent Tutoring Systems (ITS)....
-
A New Expression for the 3-D Dyadic FDTD-Compatible Green's Function Based on Multidimensional Z-Transform
Publication: In this letter, a new analytic expression for the time-domain discrete Green's function (DGF) is derived for the 3-D finite-difference time-domain (FDTD) grid. The derivation employs the multidimensional Z-transform and the impulse response of the discretized scalar wave equation (i.e., scalar DGF). The derived DGF expression involves elementary functions only and requires the implementation of a single function in the multiple-precision...
-
Tools, Methods and Services Enhancing the Usage of the Kepler-based Scientific Workflow Framework
Publication: Scientific workflow systems are designed to compose and execute either a series of computational or data manipulation steps, or workflows in a scientific application. They are usually a part of a larger eScience environment. The usage of workflow systems, however very beneficial, is mostly not irrelevant for scientists. There are many requirements for additional functionalities around scientific workflows systems that need to be...
-
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication: We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...
-
Smart Embedded Systems with Decisional DNA Knowledge Representation
Publication: Embedded systems have been in use since the 1970s. For most of their history embedded systems were seen simply as small computers designed to accomplish one or a few dedicated functions; and they were usually working under limited resources, i.e., limited computing power, limited memories, and limited energy sources. As such, embedded systems have not drawn much attention from researchers, especially from those in the artificial...
-
Networked grid-type distributed processing systems: system solutions and application examples (Polish title: Sieciowe systemy przetwarzania rozproszonego typu GRID – rozwiązania systemowe oraz przykłady aplikacyjne)
Publication: The possibilities of using and integrating the distributed computing power of Internet users' computers in the global WWW network are presented. The paradigms of Internet-based distributed processing of the grid computing and volunteer computing type are shown. Attention is drawn to the importance of this type of processing in solving problems that require very large computing power. Representative examples of system solutions of this...
-
Internet grid-type distributed processing systems in business applications (Polish title: Internetowe systemy przetwarzania rozproszonego typu grid w zastosowaniach biznesowych)
Publication: The focus is on the possibilities of using and integrating the distributed computing power of Internet users' computers in the global WWW network. The paradigms of networked processing of the grid computing and volunteer computing type are presented. The importance of this type of processing for problems requiring very large computing power is emphasized. Examples of system solutions of this type are presented: the BOINC system, which is...
-
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication: Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication: In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework
Publication: Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Digital structures for high-speed signal processing
Publication: The work covers several issues of realization of digital structures for pipelined processing of real and complex signals with the use of binary arithmetic and residue arithmetic. Basic rules of performing operations in residue arithmetic are presented along with selected residue number systems for processing of complex signals and computation of convolution. Subsequently, methods of conversion of numbers from weighted systems to...
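The basic rules of residue arithmetic referred to in this abstract can be sketched as follows: a number is represented by its remainders modulo pairwise coprime moduli, addition and multiplication act independently on each residue channel, and the Chinese Remainder Theorem converts back to a weighted representation. The moduli (7, 11, 13) and all function names are assumptions chosen for illustration, not taken from the work itself. The sketch requires Python 3.8 or later for `math.prod` and the modular inverse form of `pow`.

```python
from math import prod
from typing import Sequence, Tuple

MODULI = (7, 11, 13)   # pairwise coprime, illustrative only; dynamic range is 7*11*13 = 1001


def to_rns(x: int, moduli: Sequence[int] = MODULI) -> Tuple[int, ...]:
    """Forward conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli: Sequence[int] = MODULI) -> Tuple[int, ...]:
    """Channel-wise addition; no carries cross between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli: Sequence[int] = MODULI) -> Tuple[int, ...]:
    """Channel-wise multiplication."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli: Sequence[int] = MODULI) -> int:
    """Reverse conversion via the Chinese Remainder Theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        total += r * partial * pow(partial, -1, m)   # modular inverse of partial mod m
    return total % big_m
```

Results are only meaningful while they stay within the dynamic range (here 0 to 1000), which is why scaling and conversion techniques matter in practical residue-arithmetic designs.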
-
A Workflow Application for Parallel Processing of Big Data from an Internet Portal
Publication: The paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first ready subdirectories can start fast and is handled by services implemented as parallel multithreaded applications using multiple...
-
Tuning matrix-vector multiplication on GPU
Publication: A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations such as the conjugate gradients method (cg), the minimal residual method (minres), and the generalized minimal residual method (gmres), and it exerts an influence on the overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
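As a point of reference for the matvec operation described in this abstract, here is a minimal sequential sketch for a sparse matrix stored in compressed sparse row (CSR) form. The CSR format and the name `csr_matvec` are assumptions for illustration; the truncated abstract does not say which storage format or kernel design the paper tunes.

```python
from typing import List, Sequence


def csr_matvec(data: Sequence[float],
               indices: Sequence[int],
               indptr: Sequence[int],
               x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a CSR matrix A.

    data    -- nonzero values, stored row by row
    indices -- column index of each nonzero
    indptr  -- indptr[r]:indptr[r + 1] is the slice of nonzeros in row r
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]   # indirect, irregular read of x
        y[row] = acc
    return y
```

The rows are independent, so they can be distributed over parallel threads, while the indirect reads of `x` produce the irregular memory access pattern that makes this operation hard to run efficiently on a GPU.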
-
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Publication: Auto-tuning of configuration and application parameters makes it possible to achieve significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...
-
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
Publication: Performance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
A Survey on the Datasets and Algorithms for Satellite Data Applications
Publication: Recent advances in the area of the Internet of Things show that devices are usually resource-constrained. To enable advanced applications on these devices, it is necessary to enhance their performance by leveraging external computing resources available in the network. This work presents a study of computational platforms to increase the performance of these devices based on the Mobile Cloud Computing (MCC) paradigm. The main...
-
Network-assisted processing of advanced IoT applications: challenges and proof-of-concept application
Publication: Recent advances in the area of the Internet of Things show that devices are usually resource-constrained. To enable advanced applications on these devices, it is necessary to enhance their performance by leveraging external computing resources available in the network. This work presents a study of computational platforms to increase the performance of these devices based on the Mobile Cloud Computing (MCC) paradigm. The main...
-
Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms
Publication: Implementation of the background subtraction algorithm using the OpenCL platform is presented. The algorithm processes a live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...
-
Online sound restoration system for digital library applications
Publication: Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Analysis of radiation and scattering problems with the use of hybrid techniques based on the discrete Green's function formulation of the FDTD method
Publication: In this contribution, simulation scenarios are presented which take advantage of the hybrid techniques based on the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method. DGF-FDTD solutions are compatible with the finite-difference grid and can be applied for perfect hybridization of the FDTD method. The following techniques are considered: (i) DGF-FDTD for antenna simulations, (ii) DGF-based...
-
Analytical Expression for the Time-Domain Discrete Green's Function of a Plane Wave Propagating in the 2-D FDTD Grid
Publication: In this letter, a new closed-form expression for the time-domain discrete Green's function (DGF) of a plane wave propagating in the 2-D finite-difference time-domain (FDTD) grid is derived. For the sake of its verification, the time-domain implementation of the analytic field propagator (AFP) technique was developed for the plane wave injection in 2-D total-field/scattered-field (TFSF) FDTD simulations. Such an implementation of...
-
Analytical Expression for the Time-Domain Green's Function of a Discrete Plane Wave Propagating in the 3-D FDTD Grid
Publication: In this paper, a closed-form expression for the time-domain dyadic Green’s function of a discrete plane wave (DPW) propagating in a 3-D finite-difference time-domain (FDTD) grid is derived. In order to verify our findings, the time-domain implementation of the DPW-injection technique is developed with the use of the derived expression for 3-D total-field/scattered-field (TFSF) FDTD simulations. This implementation requires computations...
-
Image Processing Techniques for Distributed Grid Applications
Publication: Parallel approaches to 2D and 3D convolution processing of series of images have been presented. A distributed, practically oriented, 2D spatial convolution scheme has been elaborated and extended into the temporal domain. Complexity of the scheme has been determined and analysed with respect to coefficients in convolution kernels. Possibilities of parallelisation of the convolution operations have been analysed and the results...
-
Developing Methods for Building Intelligent Systems of Information Resources Processing Using an Ontological Approach
Publication: The problem of developing methods of information resource processing is investigated. A formal procedure description of processing text content is developed. A new ontological approach to the implementation of business processes is proposed. The aim of our work is to develop methods and tools for building intelligent systems of information resource processing, the core of knowledge bases of which are ontologies, and...
-
CNN-CLFFA: Support Mobile Edge Computing in Transportation Cyber Physical System
Publication: In the present scenario, the transportation Cyber Physical System (CPS) improves the reliability and efficiency of the transportation systems by enhancing the interactions between the physical and cyber systems. With the provision of better storage ability and enhanced computing, cloud computing extends transportation CPS in Mobile Edge Computing (MEC). According to the existing literature, cloud computing cannot fulfill...
-
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication: New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...
-
Scaling of numbers in residue arithmetic with the flexible selection of scaling factor
Publication: A scaling technique for numbers in residue arithmetic with flexible selection of the scaling factor is presented. The required scaling factor can be selected from the set of moduli products of the Residue Number System (RNS) base. By permutation of moduli of the number system base it is possible to create many auxiliary Mixed-Radix Systems associated with the given RNS with respect to the base, but they have different sets...