Search results for: CPTU
-
Using Disparity Map for Moving Object Position Estimation in Pan Tilt Camera Images
Publication: In this paper we present an algorithm for rapid moving object position estimation in images acquired from a pan-tilt camera. Detecting a moving object in an image acquired from a moving camera can be quite challenging. Standard methods that rely on analyzing two consecutive frames are not applicable because of the changing background. To overcome this problem we decided to evaluate the possibility of calculating a disparity...
-
DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing
Publication: In the article we propose an automatic power capping software tool, DEPO, that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of two exploration algorithms, linear search (LS) or golden section search (GSS), finds...
-
The SPH+FEM method illustrated by a simulation of subsoil strengthening by dynamic replacement [Metoda SPH+MES na przykładzie symulacji wzmocnienia podłoża gruntowego metodą wymiany dynamicznej]
Publication: The paper describes a hybrid method that couples the meshless Smooth Particle Hydrodynamics (SPH) method with the Finite Element Method. The SPH method is applicable to problems involving complex, time-varying contact algorithms, which allowed it to be used to simulate column formation by driving and displacing backfill material. Guidelines are given for preparing a numerical simulation using...
-
Influence of installation of piles with partial and full displacement of the soil on the subsoil strength
Publication: Phenomena occurring in the immediate vicinity of the shaft of displacement piles are important for how the piles work in the subsoil. Changes in the vertical and horizontal stress state and in soil structure, moisture content, and density create new interaction conditions in the shaft-subsoil zone. Results of CPT soundings performed near CFA piles in layered non-cohesive/cohesive soils are presented. Attention is drawn to the selection of parameters...
-
Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training
Publication: In the paper we tackle the bi-objective problem of optimizing the execution time and power consumption of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...
-
Implementation of TVDI calculation for coastal zone
Publication: This paper presents an implementation of the TVDI (Temperature-Vegetation-Dryness Index) algorithm on a GPU (Graphics Processing Unit). Calculation of this index is based on LST (Land Surface Temperature) and NDVI (Normalized Difference Vegetation Index). The discussed results are based on multi-spectral imagery retrieved from AVHRR3 sensors for the area of Poland, especially the Gdańsk coastal zone. All phases of TVDI implementation...
-
Preconditioners with Low Memory Requirements for Higher-Order Finite-Element Method Applied to Solving Maxwell’s Equations on Multicore CPUs and GPUs
Publication: This paper discusses two fast implementations of the conjugate gradient iterative method using a hierarchical multilevel preconditioner to solve the complex-valued, sparse systems obtained using the higher order finite-element method applied to the solution of the time-harmonic Maxwell equations. In the first implementation, denoted PCG-V, a classical V-cycle is applied and the system of equations on the lowest level is solved...
-
Big Data and the Internet of Things in Edge Computing for Smart City
Publication: Requests expressing collective human expectations and outcomes from city service tasks can be partially satisfied by processing Big Data provided to a city cloud via the Internet of Things. To improve the efficiency of city clouds, edge computing has been introduced for Big Data mining. This intelligent and efficient distributed system can be developed for citizens that are supposed to be informed and educated by the...
-
Tuning matrix-vector multiplication on GPU
Publication: Matrix-vector multiplication (matvec) is a cornerstone operation in iterative methods for solving large sparse systems of equations, such as the conjugate gradient method (cg), the minimal residual method (minres), and the generalized minimal residual method (gmres), and it affects the overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
-
Linux scheduler improvement for time demanding network applications, running on Communication Platform Systems
Publication: Communication Platform Systems, e.g. ATCA standard blades located in a standardized chassis, provide high-level communication services between system peripherals. Each ATCA blade brings dedicated functionality to the system but can also exist as a separate host responsible for servicing a set of tasks. According to the platform philosophy, these parts of the system can be quite independent of other solutions provided by competitors....
-
Multi Queue Approach for Network Services Implemented for Multi Core CPUs
Publication: Multiple core processors have already become the dominant design for general purpose CPUs. Incarnations of this technology are present in solutions dedicated to areas such as computer graphics, signal processing and computer networking. Since the key functionality of network core components is fast packet servicing, multicore technology, thanks to its multitasking ability, seems useful to support packet processing. Dedicated...
-
Numerical analysis of pile installation effects in cohesive soils
Publication: In this thesis an empirical equation is proposed for calculating the radial effective stress after displacement pile installation and the subsequent consolidation phase. The equation is based on numerical studies performed with Updated Lagrangian, Arbitrary Lagrangian-Eulerian and Coupled Eulerian-Lagrangian formulations, as well as on a calibration procedure using a database of 30 pile static loading tests from around the world in cohesive...
-
Design of a Multidomain IMS/NGN Service Stratum
Publication: The paper continues our research concerning the Next Generation Network (NGN), which is standardized for delivering multimedia services with strict quality and includes elements of the IP Multimedia Subsystem (IMS). A design algorithm for a multidomain IMS/NGN service stratum is proposed, which calculates the necessary CSCF server CPU message processing times and link bandwidths with respect to the given maximum values of mean...
-
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Publication: Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of both the hardware and the software solutions they use. The problems that we wish to solve efficiently using those systems differ in complexity, not only in magnitude but also in type: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
-
GPU-Accelerated Finite-Element Matrix Generation for Lossless, Lossy, and Tensor Media [EM Programmer's Notebook]
Publication: This paper presents an optimization approach for limiting memory requirements and enhancing the performance of GPU-accelerated finite-element matrix generation applied in the implementation of the higher-order finite-element method (FEM). It emphasizes the details of the implementation of the matrix-generation algorithm for the simulation of electromagnetic wave propagation in lossless, lossy, and tensor media. Moreover, the impact...
-
An Efficient Framework For Fast Computer Aided Design of Microwave Circuits Based on the Higher-Order 3D Finite-Element Method
Publication: In this paper, an efficient computational framework for the full-wave design by optimization of complex microwave passive devices, such as antennas, filters, and multiplexers, is described. The framework consists of a computational engine, a 3D object modeler, and a graphical user interface. The computational engine, which is based on a finite element method with curvilinear higher-order tetrahedral elements, is coupled with built-in...
-
Assessment of the interaction of Vibro piles with the subsoil based on in-situ tests [Ocena współpracy pali Vibro z podłożem gruntowym na podstawie badań in-situ]
Publication: In line with current trends toward optimal design of engineering structures, the aim is to determine the actual values of subsoil-structure interaction. Vibro piles belong to the group of displacement piles with an enlarged base and are characterized by a very high bearing capacity, especially in non-cohesive soils. Static load tests indicate that the bearing capacity of Vibro piles is considerably higher than the assumed...
-
Scalability of surrogate-assisted multi-objective optimization of antenna structures exploiting variable-fidelity electromagnetic simulation models
Publication: Multi-objective optimization of antenna structures is a challenging task due to the high computational cost of evaluating the design objectives as well as the large number of adjustable parameters. Design speedup can be achieved by means of surrogate-based optimization techniques. In particular, a combination of variable-fidelity electromagnetic (EM) simulations, design space reduction techniques, response surface approximation (RSA) models,...
-
Performance and Energy Aware Training of a Deep Neural Network in a Multi-GPU Environment with Power Capping
Publication: In this paper we demonstrate that it is possible to obtain considerable improvement of performance and energy aware metrics for training of deep neural networks using a modern parallel multi-GPU system, by enforcing selected, non-default power caps on the GPUs. We measure the power and energy consumption of the whole node using a professional, certified hardware power meter. For a high performance workstation with 8 GPUs, we were...
-
Model tests of cast-in-place piles formed by using different types of auger
Publication: Model tests are still a popular research tool used to observe and determine the mechanisms of pile-soil interaction. Due to the significant scale effect, the results of model tests performed in the 1g system can only be analysed qualitatively. This article describes and presents the results of 1g pile model tests carried out for comparative purposes. The tests examined the effectiveness and efficiency of various types...
-
Food Classification from Images Using a Neural Network Based Approach with NVIDIA Volta and Pascal GPUs
Publication: In the paper we investigate the problem of food classification from images, for the Food-101 dataset extended with 31 additional food classes from Polish cuisine. We adopted transfer learning and first measured training times for models such as MobileNet, MobileNetV2, ResNet50, ResNet50V2, ResNet101, ResNet101V2, InceptionV3, InceptionResNetV2, Xception, NasNetMobile and DenseNet, for systems with NVIDIA Tesla V100 (Volta) and...
-
A memory efficient and fast sparse matrix vector product on a GPU
Publication: This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it has both a low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of a large sparse and complex-valued system of linear equations arising...
-
Cost-Efficient Design Methodology for Compact Rat-Race Couplers
Publication: In this article, a reliable and low-cost design methodology for simulation-driven optimization of miniaturized rat-race couplers (RRCs) is presented. We exploit a two-stage design approach, where a composite structure (a basic building block of the RRC structure) is first optimized using a pattern search algorithm, and, subsequently, the entire coupler is tuned by means of a surrogate-based optimization (SBO) procedure. SBO is executed...
-
Expedited Gradient-Based Design Closure of Antennas Using Variable-Resolution Simulations and Sparse Sensitivity Updates
Publication: Numerical optimization has been playing an increasingly important role in the design of contemporary antenna systems. Due to the shortage of design-ready theoretical models, optimization is mainly based on electromagnetic (EM) analysis, which tends to be costly. Numerous techniques have evolved to abate this cost, including surrogate-assisted frameworks for global optimization, or sparse sensitivity updates for speeding up local...
-
Reduced-Cost Constrained Modeling of Microwave and Antenna Components: Recent Advances
Publication: Electromagnetic (EM) simulation models are ubiquitous in the design of microwave and antenna components. EM analysis is reliable but CPU intensive. In particular, multiple simulations entailed by parametric optimization or uncertainty quantification may considerably slow down the design processes. In order to address this problem, it is possible to employ fast metamodels. Here, the popular solution approaches are approximation...
-
A highly-efficient technique for evaluating bond-orientational order parameters
Publication: We propose a novel, highly-efficient approach for the evaluation of bond-orientational order parameters (BOPs). Our approach exploits the properties of spherical harmonics and Wigner 3j-symbols to reduce the number of terms in the expressions for BOPs, and employs simultaneous interpolation of normalised associated Legendre polynomials and trigonometric functions to dramatically reduce the total number of arithmetic operations....
-
Application of high-level programming languages to determining the bearing capacity of screw displacement piles [Zastosowanie wysokopoziomowych języków programowania do wyznaczania nośności przemieszczeniowych pali wkręcanych]
Publication: The article addresses contemporary, more economical pile design. Code-based solutions, e.g. PN-83-B-2482, usually rely on quantities such as the density index or the liquidity index. As a result, data obtained directly from ground investigations are correlated twice. This study proposes using transfer functions to determine pile bearing capacity directly...
-
Optimization of cloud computing resources using intelligent agents in remote teaching [Optymalizacja zasobów chmury obliczeniowej z wykorzystaniem inteligentnych agentów w zdalnym nauczaniu]
Publication: The dissertation concerns the optimization of cloud computing resources in which intelligent agents are applied to remote teaching. The issue matters in education, which uses modern technologies such as the Internet of Things, augmented and virtual reality, and deep learning in a cloud computing environment. It also matters when a pandemic forces remote teaching on a large scale...
-
Optimization of the computational performance of the finite element method on the CUDA architecture [Optymalizacja wydajności obliczeniowej metody elementów skończonych w architekturze CUDA]
Publication: The aim of this dissertation and of the scholarship completed within the project was to develop a numerically efficient algorithmic and hardware solution that speeds up the analysis of electromagnetic problems by the finite element method (FEM) with higher-order basis functions. The finite element method in the frequency domain is an efficient and versatile tool for analyzing microwave circuits (Fig....
-
Triangulation-based Constrained Surrogate Modeling of Antennas
Publication: Design of contemporary antenna structures is heavily based on full-wave electromagnetic (EM) simulation tools. They provide accuracy but are CPU-intensive. Reduction of EM-driven design procedure cost can be achieved by using fast replacement models (surrogates). Unfortunately, standard modeling techniques are unable to ensure sufficient predictive power for real-world antenna structures (multiple parameters, wide parameter ranges,...
-
A distributed system for conducting chess games in parallel
Publication: This paper proposes a distributed and scalable cloud based system designed to play chess games in parallel. Games can be played between chess engines alone or between clusters created by combined chess engines. The system has a built-in mechanism that compares engines, based on Elo ranking which finally presents the strength of each tested approach. If an approach needs more computational power, the design of the system allows...
-
Two-Stage Variable-Fidelity Modeling of Antennas with Domain Confinement
Publication: Surrogate modeling has become the method of choice in solving an increasing number of antenna design tasks, especially those involving expensive full-wave electromagnetic (EM) simulations. Notwithstanding, the curse of dimensionality considerably affects conventional metamodeling methods, and their capability to efficiently handle nonlinear antenna characteristics over broad ranges of the system parameters is limited. Performance-driven...
-
Antiproliferative, Antiangiogenic, and Antimetastatic Therapy Response by Mangiferin in a Syngeneic Immunocompetent Colorectal Cancer Mouse Model Involves Changes in Mitochondrial Energy Metabolism
Publication: In spite of the current advances and achievements in cancer treatments, colorectal cancer (CRC) persists as one of the most prevalent and deadly tumor types in both men and women worldwide. Drug resistance, adverse side effects and high rates of angiogenesis, metastasis and tumor relapse remain among the greatest challenges in the long-term management of CRC and create an urgent need for new anticancer drug leads. We demonstrate that CRC...
-
Expedited Simulation-Driven Multi-Objective Design Optimization of Quasi-Isotropic Dielectric Resonator Antenna
Publication: The majority of practical engineering design problems require simultaneous handling of several criteria. Although many design tasks can be turned into single-objective problems using suitable formulations, in some situations, acquiring comprehensive knowledge about possible trade-offs between conflicting objectives may be necessary. This calls for multi-objective optimization that aims at identifying a set of alternative, Pareto-optimal...
-
Application of GPGPU technology to support engineering numerical computations, illustrated by the analysis of flow through a two-phase fluid-solid medium [Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn - ciało stałe]
Publication: After presenting basic information on GPGPU technology and the NVIDIA CUDA architecture, the article describes the conservation equations governing flows and their numerical discretization. It then examines the use of GPGPU technology to reduce the execution time of numerical computations of flow through a two-phase medium (fluid and solid particles) resembling a porous medium. To this end,...
-
Fast EM-Driven Parameter Tuning of Microwave Circuits with Sparse Sensitivity Updates via Principal Directions
Publication: Numerical optimization has become more important than ever in the design of microwave components and systems, primarily as a consequence of increasing performance demands and growing complexity of the circuits. As the parameter tuning is more and more often executed using full-wave electromagnetic (EM) models, the CPU cost of the overall process tends to be excessive even for local optimization. Some ways of alleviating these issues...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication: In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
Design-oriented computationally-efficient feature-based surrogate modelling of multi-band antennas with nested kriging
Publication: Design of modern antenna structures heavily depends on electromagnetic (EM) simulation tools. EM analysis provides reliable evaluation of increasingly complex designs but tends to be CPU intensive. When multiple simulations are needed (e.g., for parameter tuning), the aggregated simulation cost may become a serious bottleneck. As one possible way of mitigating the issue, the recent literature fosters utilization of faster representations,...
-
Application of GPGPU technology to support engineering numerical computations, illustrated by the analysis of flow through a two-phase fluid-solid medium [Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn-ciało stałe]
Publication: After presenting basic information on GPGPU technology and the NVIDIA CUDA architecture, the article describes the conservation equations governing flows and their numerical discretization. It then examines the use of GPGPU technology to reduce the execution time of numerical computations of flow through a two-phase medium (fluid and solid particles) resembling a porous medium. To this end,...
-
Reliable Surrogate Modeling of Antenna Input Characteristics by Means of Domain Confinement and Principal Components
Publication: A reliable design of contemporary antenna structures necessarily involves full-wave electromagnetic (EM) analysis which is the only tool capable of accounting, for example, for element coupling or the effects of connectors. As EM simulations tend to be CPU-intensive, surrogate modeling allows for relieving the computational overhead of design tasks that require numerous analyses, for example, parametric optimization or uncertainty...
-
Efficient Simulation-Based Global Antenna Optimization Using Characteristic Point Method and Nature-Inspired Metaheuristics
Publication: Antenna structures are designed nowadays to fulfil rigorous demands, including multi-band operation, where the center frequencies need to be precisely allocated at the assumed targets while improving other features, such as impedance matching. Achieving this requires simultaneous optimization of antenna geometry parameters. When considering multimodal problems or if a reasonable initial design is not at hand, one needs to rely...
-
Simulation-Driven Antenna Modeling by Means of Response Features and Confined Domains of Reduced Dimensionality
Publication: In recent years, the employment of full-wave electromagnetic (EM) simulation tools has become imperative in antenna design, mainly for reliability reasons. While the CPU cost of a single simulation is rarely an issue, the computational overhead associated with EM-driven tasks that require massive EM analyses may become a serious bottleneck. A widely used approach to lessen this cost is the employment of surrogate models, especially...
-
A GPU Solver for Sparse Generalized Eigenvalue Problems with Symmetric Complex-Valued Matrices Obtained Using Higher-Order FEM
Publication: The paper discusses a fast implementation of the stabilized locally optimal block preconditioned conjugate gradient (sLOBPCG) method, using a hierarchical multilevel preconditioner to solve non-Hermitian sparse generalized eigenvalue problems with large symmetric complex-valued matrices obtained using the higher-order finite-element method (FEM), applied to the analysis of a microwave resonator. The resonant frequencies of the low-order...
-
Implementation of Addition and Subtraction Operations in Multiple Precision Arithmetic
Publication: In this paper, we present a digital circuit of an arithmetic unit implementing addition and subtraction operations in multiple-precision arithmetic (MPA). This adder-subtractor unit is part of an MPA coprocessor supporting and offloading the central processing unit (CPU) in computations requiring precision higher than 32/64 bits. Although addition and subtraction operations of two n-digit numbers require O(n) operations, the efficient...
-
Surrogate-assisted EM-driven miniaturization of wideband microwave couplers by means of co-simulation low-fidelity models
Publication: This article proposes a methodology for rapid design optimization of miniaturized wideband couplers. More specifically, a class of circuits is considered, in which conventional transmission lines are replaced by their abbreviated counterparts referred to as slow-wave compact cells. Our focus is on explicit reduction of the structure size as well as on reducing the CPU cost of the design process. For the sake of computational feasibility,...
-
Smaller Representation of Finite State Automata
Publication: This paper is a follow-up to Jan Daciuk's experiments on space-efficient finite state automata representation that can be used directly for traversals in main memory. We investigate several techniques for reducing the memory footprint of minimal automata, mainly exploiting the fact that transition labels and transition pointer offset values are not evenly distributed and so are suitable for compression. We achieve a gain of around 20-30%...
-
Expedited EM-Driven Design of Miniaturized Microwave Hybrid Couplers Using Surrogate-Based Optimization
Publication: Miniaturization of microwave hybrid couplers is important for contemporary wireless communication engineering. Using standard computer-aided design methods for development of compact structures is extremely challenging due to a general lack of computationally efficient and accurate simulation models. Poor accuracy of available equivalent circuits results from neglecting parasitic cross-couplings that greatly affect the performance...
-
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
Publication: The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and their relative performance on modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of a predefined number of data chunks, which are then processed in parallel by a set of slaves, and the procedure is repeated until all input data has been processed. The paper experimentally...
-
Rapid Multi-Criterial Antenna Optimization by Means of Pareto Front Triangulation and Interpolative Design Predictors
Publication: Modern antenna systems are designed to meet stringent performance requirements pertinent to both their electrical and field properties. The objectives typically stay in conflict with each other. As the simultaneous improvement of all performance parameters is rarely possible, compromise solutions have to be sought. The most comprehensive information about available design trade-offs can be obtained through multi-objective optimization...
-
Reduced-Cost Microwave Modeling Using Constrained Domains and Dimensionality Reduction
Publication: Development of modern microwave devices largely exploits full-wave electromagnetic (EM) simulations. Yet, simulation-driven design may be problematic due to the incurred CPU expenses. Addressing the high-cost issues stimulated the development of surrogate modeling methods. Among them, data-driven techniques seem to be the most widespread owing to their flexibility and accessibility. Nonetheless, applicability of approximation-based...