Total: 42
Search results for: MULTIPLE PRECISION ARITHMETIC, FPGA
-
FPGA implementation of the multiplication operation in multiple-precision arithmetic
Publication: Although standard 32/64-bit arithmetic is sufficient to solve most scientific-computing problems, there are still problems that require higher numerical precision. Multiple-precision arithmetic (MPA) libraries are software tools for emulation of computations in a user-defined precision. However, the availability of reconfigurable cards based on field-programmable gate arrays (FPGAs) in computing systems allows one to implement...
-
Open-Source Coprocessor for Integer Multiple Precision Arithmetic
Publication: This paper presents an open-source digital circuit of a coprocessor for integer multiple-precision arithmetic (MPA). The purpose of this coprocessor is to support a central processing unit (CPU) by offloading computations requiring integer precision higher than 32/64 bits. The coprocessor is developed using the very high speed integrated circuit hardware description language (VHDL) as an intellectual property (IP) core. Therefore,...
-
Implementation of Addition and Subtraction Operations in Multiple Precision Arithmetic
Publication: In this paper, we present a digital circuit of an arithmetic unit implementing addition and subtraction operations in multiple-precision arithmetic (MPA). This adder-subtractor unit is part of an MPA coprocessor supporting and offloading the central processing unit (CPU) in computations requiring precision higher than 32/64 bits. Although addition and subtraction operations of two n-digit numbers require O(n) operations, the efficient...
-
IP Core of Coprocessor for Multiple-Precision-Arithmetic Computations
Publication: In this paper, we present an IP core of a coprocessor supporting computations requiring integer multiple-precision arithmetic (MPA). Whilst standard 32/64-bit arithmetic is sufficient to solve many computing problems, there are still applications that require higher numerical precision. Hence, the purpose of the developed coprocessor is to support and offload the central processing unit (CPU) in such computations. The developed digital...
-
Implementation of Coprocessor for Integer Multiple Precision Arithmetic on Zynq Ultrascale+ MPSoC
Publication: Recently, we have opened the source code of a coprocessor for multiple-precision arithmetic (MPA). In this contribution, the implementation and benchmarking results for this MPA coprocessor are presented on the modern Zynq Ultrascale+ multiprocessor system on chip, which combines a field-programmable gate array with a quad-core ARM Cortex-A53 64-bit central processing unit (CPU). In our benchmark, a single coprocessor can be up to 4.5 times...
-
Electromagnetic Problems Requiring High-Precision Computations
Publication: An overview of the applications of multiple-precision arithmetic in CEM was presented in this paper for the first time. Although double-precision floating-point arithmetic is sufficient for most scientific computations, there is an expanding body of electromagnetic problems requiring multiple-precision arithmetic. Software libraries facilitating these computations were described, and investigations requiring multiple-precision...
-
Implementation of FDTD-Compatible Green's Function on Graphics Processing Unit
Publication: In this letter, an implementation of the finite-difference time domain (FDTD)-compatible Green's function on a graphics processing unit (GPU) is presented. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates its applications in the FDTD simulations of radiation and scattering problems. Unfortunately, implementation of the new DGF formula in software requires a multiple precision...
-
Acceleration of the discrete Green's function computations
Publication: Results of the acceleration of the 3-D discrete Green's function (DGF) computations on a multicore processor are presented. The code was developed in multiple-precision arithmetic with the use of the OpenMP parallel programming interface. As a result, a speedup factor of three orders of magnitude over the previous implementation was obtained, which significantly improved the applicability of the DGF in FDTD simulations.
-
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
Publication: This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously uses the computational power of the central processing unit (CPU) and the graphics processing unit (GPU), assigning to each the computational tasks best suited to its architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...
-
A New Expression for the 3-D Dyadic FDTD-Compatible Green's Function Based on Multidimensional Z-Transform
Publication: In this letter, a new analytic expression for the time-domain discrete Green's function (DGF) is derived for the 3-D finite-difference time-domain (FDTD) grid. The derivation employs the multidimensional Z-transform and the impulse response of the discretized scalar wave equation (i.e., scalar DGF). The derived DGF expression involves elementary functions only and requires the implementation of a single function in the multiple-precision...
-
Discrete convolution based on polynomial residue representation
Publication: This paper presents a study of fast discrete convolution calculation with the use of the Polynomial Residue Number System (PRNS). Convolution can be based on an algorithm similar to polynomial multiplication. Residue arithmetic allows for fast realization of multiplication and addition, which are the most important arithmetic operations in the implementation of convolution. The practical aspects of hardware realization of PRNS...
-
Fast implementation of FDTD-compatible Green's function on multicore processor
Publication: In this letter, a numerically efficient implementation of the finite-difference time domain (FDTD)-compatible Green's function on a multicore processor is presented. Recently, a closed-form expression of this discrete Green's function (DGF) was derived, which simplifies its application in the FDTD simulations of radiation and scattering problems. Unfortunately, the new DGF expression involves binomial coefficients, whose computations...
-
FPGA-Based System for Electromagnetic Interference Evaluation in Random Modulated DC/DC Converters
Publication: Field-programmable gate arrays (FPGAs) provide the possibility to design new “electromagnetic compatibility (EMC) friendly” control techniques for power electronic converters. Such control techniques use pseudo-random modulators (RanM) to control the converter switches. However, some issues connected with the FPGA-based design of RanM, such as matching the range of fixed-point numbers, might be challenging. The modern programming...
-
Implementation of discrete convolution using polynomial residue representation
Publication: Convolution is one of the main algorithms performed in digital signal processing. The algorithm is similar to polynomial multiplication and is computationally very intensive. This paper presents a new convolution algorithm based on the Polynomial Residue Number System (PRNS). The use of the PRNS makes it possible to decompose the computation problem and thereby reduce the number of multiplications. The algorithm has been implemented in Xilinx...
-
Analytical Expression for the Time-Domain Discrete Green's Function of a Plane Wave Propagating in the 2-D FDTD Grid
Publication: In this letter, a new closed-form expression for the time-domain discrete Green's function (DGF) of a plane wave propagating in the 2-D finite-difference time-domain (FDTD) grid is derived. For the sake of its verification, the time-domain implementation of the analytic field propagator (AFP) technique was developed for the plane wave injection in 2-D total-field/scattered-field (TFSF) FDTD simulations. Such an implementation of...
-
Analytical Expression for the Time-Domain Green's Function of a Discrete Plane Wave Propagating in the 3-D FDTD Grid
Publication: In this paper, a closed-form expression for the time-domain dyadic Green’s function of a discrete plane wave (DPW) propagating in a 3-D finite-difference time-domain (FDTD) grid is derived. In order to verify our findings, the time-domain implementation of the DPW-injection technique is developed with the use of the derived expression for 3-D total-field/scattered-field (TFSF) FDTD simulations. This implementation requires computations...
-
Analysis of radiation and scattering problems with the use of hybrid techniques based on the discrete Green's function formulation of the FDTD method
Publication: In this contribution, simulation scenarios are presented which take advantage of the hybrid techniques based on the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method. DGF-FDTD solutions are compatible with the finite-difference grid and can be applied for perfect hybridization of the FDTD method. The following techniques are considered: (i) DGF-FDTD for antenna simulations, (ii) DGF-based...
-
FIReWORK: FIR Filters Hardware Structures Auto-Generator
Publication: The paper presents an application called FIReWORK that allows for automatic creation of VHDL hardware structures of FIR filters. Automatically generated specialized hardware solutions dedicated to FPGAs and ASICs are commonly known as Intellectual Property Cores. The essential feature of the application is easy initialization of FIR filter parameters in the GUI, followed by automatic design, calculation and generation of the IP Core...
-
HIGH LEVEL SYNTHESIS IN FPGA OF TCS/RNS CONVERTER
Publication: The work presents the design process of the TCS/RNS (two's complement-to-residue) converter in a Xilinx FPGA with the use of the HLS approach. This new approach allows for the design of dedicated FPGA circuits using high-level languages such as C++. Such an approach replaces, to some extent, the much more tedious design with VHDL or Verilog and facilitates the design process. The algorithm realized by the given hardware circuit is...
-
FPGA computation of magnitude of complex numbers using modified CORDIC algorithm
Publication: In this work, we present computation of the magnitude of complex numbers using a modified version of the CORDIC algorithm that uses only five iterations. The relationship between the computation error and the number of CORDIC iterations is presented for floating-point and integer arithmetic. The proposed modification of CORDIC for integer arithmetic relies upon the introduction of correction once basic computations are performed...
-
Digital structures for high-speed signal processing
Publication: The work covers several issues of realization of digital structures for pipelined processing of real and complex signals with the use of binary arithmetic and residue arithmetic. Basic rules of performing operations in residue arithmetic are presented along with selected residue number systems for processing of complex signals and computation of convolution. Subsequently, methods of conversion of numbers from weighted systems to...
-
Verification and Benchmarking in MPA Coprocessor Design Process
Publication: This paper presents verification and benchmarking required for the development of a coprocessor digital circuit for integer multiple-precision arithmetic (MPA). Its code is developed, with the use of very high speed integrated circuit hardware description language (VHDL), as an intellectual property core. Therefore, it can be used by a final user within their own computing system based on field-programmable gate arrays (FPGAs)....
-
Deep Learning Optimization for Edge Devices: Analysis of Training Quantization Parameters
Publication: This paper focuses on the convolutional neural network quantization problem. Quantization has a distinct stage of data conversion from floating-point into integer numbers. In general, the process of quantization is associated with the reduction of the matrix dimension via limited precision of the numbers. However, the training and inference stages of deep learning neural networks are limited by the space of the memory and a...
-
GPU-Accelerated Finite-Element Matrix Generation for Lossless, Lossy, and Tensor Media [EM Programmer's Notebook]
Publication: This paper presents an optimization approach for limiting memory requirements and enhancing the performance of GPU-accelerated finite-element matrix generation applied in the implementation of the higher-order finite-element method (FEM). It emphasizes the details of the implementation of the matrix-generation algorithm for the simulation of electromagnetic wave propagation in lossless, lossy, and tensor media. Moreover, the impact...
-
FPGA realization of FIR filter in residue arithmetic
Publication: The paper presents an FPGA realization of a pipelined FIR filter with constant coefficients in residue arithmetic, using eight 5-bit moduli with a total number range of 37.07 bits. The direct form of the FIR filter is used. Multiplications are performed by memory look-up. Summations in each channel are realized with a multi-level adder structure based on 4-operand CSA adders. In the final stage...
-
A memory efficient and fast sparse matrix vector product on a GPU
Publication: This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it has both a low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of a large sparse and complex-valued system of linear equations arising...
-
Pipelined division of signed numbers with the use of residue arithmetic in FPGA
Publication: An architecture of a pipelined signed residue divider for small number ranges is presented. The divider makes use of the multiplicative division algorithm where initially the reciprocal of the divisor is calculated and subsequently multiplied by the dividend. The divisor represented in the signed binary form is used to compute the approximated reciprocal in the residue form by the table look-up. In order to reduce the needed length...
-
FPGA realization of high-speed multi-stage FIR filter in residue arithmetic
Publication: The paper presents an implementation of a high-speed multi-stage cascaded FIR filter in FPGA technology. The use of residue arithmetic makes it possible to achieve high sampling frequencies owing to the use of small multipliers. The advantages of residue arithmetic are limited to some extent by the need to perform scaling when FIR filters are connected in cascade, so as to avoid arithmetic overflow...
-
FPGA implementation of the two-stage high-speed FIR filter in residue arithmetic
Publication: The paper presents an implementation of a high-speed two-stage cascaded FIR filter in FPGA technology using residue arithmetic. The use of residue arithmetic makes it possible to achieve high pipelining frequencies owing to the use of small multipliers. The advantages of residue arithmetic are limited to some extent by the need to perform scaling after the first filter stage in order to avoid arithmetic overflow...
-
Marek Czachor prof. dr hab.
Person
-
Sprzętowa implementacja transformacji Hougha w czasie rzeczywistym [Real-time hardware implementation of the Hough transform]
Publication: The article presents an FPGA hardware implementation of an algorithm for detecting shapes approximated by a set of straight lines during real-time digital image processing. In the developed hardware structure, processing efficiency was increased through pipelined processing, look-up tables, exclusive use of integer arithmetic, and distribution of the voting memory. Experimentally...
-
Programmable Input Mode Instrumentation Amplifier Using Multiple Output Current Conveyors
Publication: In this paper, a programmable input mode instrumentation amplifier (IA) utilising second generation, multiple output current conveyors and transmission gates is presented. Its main advantage is the ability to choose a voltage or current mode of inputs by setting the voltage of two configuration nodes. The presented IA is prepared as an integrated circuit block to be used alone or as a sub-block in a microcontroller or in a field...
-
Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling
Publication: A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare them to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...
-
CNN Architectures for Human Pose Estimation from a Very Low Resolution Depth Image
Publication: The paper is dedicated to proposing and evaluating a number of convolutional neural network architectures for calculating a multiple regression on 3D coordinates of human body joints tracked in a single low resolution depth image. The main challenge was to obtain high precision in the case of a noisy and coarse scan of the body, as observed by a depth sensor from a large distance. The regression network was expected to reason about...
-
Multiple Cues-Based Robust Visual Object Tracking Method
Publication: Visual object tracking is still considered a challenging task in the computer vision research community. The object of interest undergoes significant appearance changes because of illumination variation, deformation, motion blur, background clutter, and occlusion. Kernelized correlation filter- (KCF) based tracking schemes have shown good performance in recent years. The accuracy and robustness of these trackers can be further enhanced...
-
Attention-Based Deep Learning System for Classification of Breast Lesions—Multimodal, Weakly Supervised Approach
Publication: Breast cancer is the most frequent female cancer, with a considerable disease burden and high mortality. Early diagnosis with screening mammography might be facilitated by automated systems supported by deep learning artificial intelligence. We propose a model based on a weakly supervised Clustering-constrained Attention Multiple Instance Learning (CLAM) classifier able to train under data scarcity effectively. We used a private...
-
Obliczanie składowej OEE przy wielu operacjach technologicznych [Calculating the OEE component for multiple technological operations]
Publication
-
Deep neural networks for human pose estimation from a very low resolution depth image
Publication: The work presented in the paper is dedicated to determining and evaluating the most efficient neural network architecture applied as a multiple regression network localizing human body joints in 3D space based on a single low resolution depth image. The main challenge was to deal with a noisy and coarse representation of the human body, as observed by a depth sensor from a large distance, and to achieve high localization precision....
-
Neural network training with limited precision and asymmetric exponent
Publication: Along with a rapidly increasing number of mobile devices, sensors and other smart utilities, an unprecedented growth of data can be observed in today’s world. In order to address multiple challenges facing the big data domain, machine learning techniques are often leveraged for data analysis, filtering and classification. Wide usage of artificial intelligence with large amounts of data creates growing demand not only for storage...
-
Application of gas chromatography–tandem mass spectrometry for the determination of amphetamine-type stimulants in blood and urine
Publication: Amphetamine, methamphetamine, phentermine, 3,4-methylenedioxyamphetamine (MDA), 3,4-methylenedioxymethamphetamine (MDMA), and 3,4-methylenedioxy-N-ethylamphetamine (MDEA) are the most popular amphetamine-type stimulants. The use of these substances is a serious societal problem worldwide. In this study, a method based on gas chromatography-tandem mass spectrometry (GC-MS/MS) with simple and rapid liquid-liquid extraction (LLE)...
-
Real-Time Facial Features Detection from Low Resolution Thermal Images with Deep Classification Models
Publication: Deep networks have already shown spectacular success in object classification and detection for various applications, from everyday use cases to advanced medical problems. The main advantage of the classification models over the detection models is less time and effort needed for dataset preparation, because classification networks do not require bounding box annotations, but labels at the image level only. Yet, after passing...
-
Study of Multi-Class Classification Algorithms’ Performance on Highly Imbalanced Network Intrusion Datasets
Publication: This paper is devoted to the problem of class imbalance in machine learning, focusing on the intrusion detection of rare classes in computer networks. The problem of class imbalance occurs when one class heavily outnumbers examples from the other classes. In this paper, we are particularly interested in classifiers, as pattern recognition and anomaly detection could be solved as a classification problem. As still a major part of...