Search results for: PARALLEL PERFORMANCE
-
Planning optimised multi-tasking operations under the capability for parallel machining
Publication: The advent of advanced multi-tasking machines (MTMs) in the metalworking industry has provided the opportunity for more efficient parallel machining as compared to traditional sequential processing. It entailed the need for developing appropriate reasoning schemes for efficient process planning to take advantage of machining capabilities inherent in these machines. This paper addresses an adequate methodical approach for a non-linear...
-
Experimental Research on the Energy Efficiency of a Parallel Hybrid Drive for an Inland Ship
Publication: The growing requirements for limiting the negative impact of all modes of transport on the natural environment mean that clean technologies are becoming more and more important. The global trend of e-mobility also applies to sea and inland water transport. This article presents the results of experimental tests carried out on a life-size, parallel diesel-electric hybrid propulsion system. The efficiency of the propulsion system was...
-
Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines: Complexity and Algorithms
Publication: In this paper, the problem of scheduling on parallel machines in the presence of incompatibilities between jobs is considered. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. The paper provides several results concerning schedules, optimal or approximate with respect to the two most popular criteria of optimality:...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication: In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
A Parallel Corpus-Based Approach to the Crime Event Extraction for Low-Resource Languages
Publication: These days, a lot of crime-related events take place all over the world. Most of them are reported in news portals and social media. Crime-related event extraction from the published texts can allow monitoring, analysis, and comparison of police or criminal activities in different countries or regions. Existing approaches to event extraction mainly suggest processing texts in English, French, Chinese, and some other resource-rich...
-
Molecular Diffusion Simulation on ARUZ – Massively-parallel FPGA-based Machine
Publication
-
Scheduling with precedence constraints: mixed graph coloring in series-parallel graphs.
Publication: The paper considers the mixed graph coloring problem, which models task scheduling in which the temporal dependencies between tasks take the form of a partial order or mutual exclusion. For the case in which the dependency graph is series-parallel, an algorithm is given that solves the problem optimally in $O(n^{3.376} \log n)$ time.
-
Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology
Publication: The discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...
-
Effective methods for functional conformance testing of parallel and distributed programming libraries.
Publication: The dissertation presents a complete methodology for creating Conformance Test Suites for programming languages, libraries, and APIs, with particular emphasis on parallel and distributed programming languages and libraries. The author began research in the field of conformance testing for parallel and distributed programming libraries, but the Consecutive Confinements Method (CoCoM), created by the author,...
-
Towards Efficient Parallel Image Processing on Cluster Grids Using GIMP.
Publication: Because few users have the knowledge needed to use low-level parallel programming libraries to speed up programs that operate on images, we propose a plugin for the well-known GIMP application that enables pipelined execution of a series of filters on images loaded by the plugin. We present implementation details, test scenarios, and results on clusters, potentially...
-
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
Publication: This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes the computational power of the central processing unit (CPU) and the graphics processing unit (GPU) for the computational tasks best suited to each architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...
-
Benchmarking Parallel Chess Search in Stockfish on Intel Xeon and Intel Xeon Phi Processors
Publication: The paper presents results from benchmarking the parallel multithreaded Stockfish chess engine on selected multi- and many-core processors. It is shown how the strength of play for an n-thread version compares to a 1-thread version on both Intel Xeon and the latest Intel Xeon Phi x200 processors. Results such as the number of wins, losses and draws are presented, along with how these change for growing numbers of threads. Impact of using particular...
-
Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit
Publication: Parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid, enabling the perfect hybridization of FDTD...
-
Edge-Guided Mode Performance and Applications in Nonreciprocal Millimeter-Wave Gyroelectric Components
Publication: The analogies between the behavior of gyromagnetic and gyroelectric nonreciprocal structures, the use of the simple transfer matrix approach, and the edge-guided (EG) wave property, supported in a parallel plate model for integrated magnetized semiconductor waveguide, are investigated in those frequency regions, where the effective permittivity is negative or positive. As with their ferrite counterparts, the leakage of the EG waves...
-
Makespan minimization of multi-slot just-in-time scheduling on single and parallel machines
Publication: The article addresses a task scheduling problem in which time is divided into slots of equal length and each task has a fixed length and a completion time that is relative to the end of a slot. Finding a schedule consists of assigning tasks to individual slots; in general, the length of a task may force a situation in which the task is executed not only in the slot in which...
-
Parallel in vitro and in silico investigations into anti-inflammatory effects of non-prenylated stilbenoids
Publication
-
From the Dynamic Lattice Liquid Algorithm to the Dedicated Parallel Computer – mDLL Machine
Publication
-
Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
Publication: This paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA‐FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...
-
New user-guided and ckpt-based checkpointing libraries for parallel MPI applications
Publication: The paper presents design and implementation details as well as performance results for two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API that uses MPI constants. It uses MPI-2 I/O functions or a dedicated master process...
-
Generating reliable conformance test suites for parallel and distributed languages, libraries, and APIs.
Publication: The article outlines a new methodology for creating Conformance Test Suites (CTSs) for parallel and distributed programming languages, libraries, and APIs. The author began research on conformance testing for the data-driven parallel language Athapascan, developed a methodology for designing and analysing CTSs called the Consecutive Confinements Method (CoCoM), created the CTS Designer tool,...
-
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Publication: The rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Cellulosic bionanocomposites based on acrylonitrile butadiene rubber and Cuscuta reflexa: adjusting structure-properties balance for higher performance
Publication: Design and manufacture of cellulosic nanocomposites with acceptable performance is in the period of a transition from fantasy to reality. Typically, cellulosic nanofillers reveal poor compatibility with polymer matrices. Thus, adjusting the balance between structure and properties of cellulosic bionanocomposites by careful selection of parent ingredients is the first priority. Herein, we incorporated Cuscuta reflexa derived cellulose...
-
Genetic Positioning of Fire Stations Utilizing Grid-computing Platform
Publication: The chapter presents a model for determining near-optimal locations of fire stations based on the topography of a given area and the location of forests, rivers, lakes and other elements of the site. The model is based on the principles of genetic algorithms and utilizes the power of the grid to distribute and execute in parallel the most performance-demanding computations involved in the algorithm.
-
Single and Series of Multi-valued Decision Diagrams in Representation of Structure Function
Publication: The structure function, which defines the dependency of the performance of the system on the performance of its components, is a key part of system description in reliability analysis. In this paper, we compare two approaches for representation of the structure function. The first one is based on the use of a single Multi-valued Decision Diagram (MDD) and the second on the use of a series of MDDs. The obtained results indicate that the series of MDDs...
-
Measurements of the coefficients of current distribution between two generators operating in parallel in a ship power station
Research data: The presented dataset is part of research focusing on the assessment of metrological properties of the instrument, Estimator/Analyzer (E/A v.2), developed and made at the Faculty of Electrical Engineering, Department of Marine Electrical Power Engineering of Gdynia Maritime University. The attached dataset contains processed data, expressing the coefficients...
-
Measurements of the rms currents in two phases in a ship power station with two generators operating in parallel
Research data: The presented dataset is part of research focusing on the assessment of metrological properties of the instrument, Estimator/Analyzer (E/A v.2), developed and made at the Faculty of Electrical Engineering, Department of Marine Electrical Power Engineering of Gdynia Maritime University. The attached dataset contains processed data, expressing the rms...
-
Measurements of the rms voltages on main bars in a ship power station with two generators operating in parallel
Research data: The presented dataset is part of research focusing on the assessment of metrological properties of the instrument, Estimator/Analyzer (E/A v.2), developed and made at the Faculty of Electrical Engineering, Department of Marine Electrical Power Engineering of Gdynia Maritime University. The attached dataset contains processed data, expressing the rms...
-
Efficient parallel algorithms in global optimization of potential energy functions for peptides, proteins, and crystals
Publication
-
High power, zero ripples active filtering system with power modules operating in parallel
Publication
-
ARUZ — Large-scale, massively parallel FPGA-based analyzer of real complex systems
Publication
-
Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.
Publication: The paper investigates and simulates electrophysiological phenomena in the myocardium using MPI-based parallel software developed for this purpose. Code improvements leading to good scalability were implemented and examined, and performance tests were carried out on the latest 32- and 64-bit Linux clusters. The paper is an attempt at a parallel implementation of a known approach...
-
Mechanism of recognition of parallel G-quadruplexes by DEAH/RHAU helicase DHX36 explored by molecular dynamics simulations
Publication: Because of high stability and slow unfolding rates of G-quadruplexes (G4), cells have evolved specialized helicases that disrupt these non-canonical DNA and RNA structures in an ATP-dependent manner. One example is DHX36, a DEAH-box helicase, which participates in gene expression and replication by recognizing and unwinding parallel G4s. Here, we studied the molecular basis for the high affinity and specificity of DHX36 for parallel-type...
-
A CMOS Pixel With Embedded ADC, Digital CDS and Gain Correction Capability for Massively Parallel Imaging Array
Publication: In the paper, a CMOS pixel has been proposed for imaging arrays with massively parallel image acquisition and simultaneous compensation of dark signal nonuniformity (DSNU) as well as photoresponse nonuniformity (PRNU). In our solution the pixel contains all necessary functional blocks: a photosensor and an analog-to-digital converter (ADC) with built-in correlated double sampling (CDS) integrated together. It is implemented in...
-
A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications
Publication: The paper presents a fail-safe NVRAM-based mechanism for creation and recovery of data copies during parallel MPI application runtime. Specifically, we target a cluster environment in which each node has an NVRAM installed in it. Our previously developed extension to the MPI I/O API can take advantage of NVRAM regions in order to provide an NVRAM-based, cache-like mechanism to significantly speed up I/O operations and allow to preload...
-
Effective configuration of a double triad planar parallel manipulator for precise positioning of heavy details during their assembling process
Publication: In the paper, a dynamics analysis of a parallel manipulator is presented. It is an atypical manipulator, intended to help in assembling heavy industrial constructions. A few atypical properties are required: small workspace, low velocities, and high loads. Initially, a short discussion about the definition of parallel manipulators is presented, as well as the sketch of the proposed structure. In parallel, some definitions, assumptions...
-
Low-Power Receivers for Wireless Capacitive Coupling Transmission in 3-D-Integrated Massively Parallel CMOS Imager
Publication: The paper presents pixel receivers for massively parallel transmission of video signal between capacitively coupled integrated circuits (ICs). The receivers meet the key requirements for massively parallel transmission, namely low power consumption below a single μW, small area of less than 205 μm², high sensitivity better than 160 mV, and good immunity to crosstalk. The receivers were implemented and measured in a 3-D IC (two face-to-face...
-
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication: Since many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...
-
Parallelization of video stream algorithms in kaskada platform
Publication: The purpose of this work is to present different techniques for parallelizing video stream algorithms provided by the Kaskada platform - a novel system working in a supercomputer environment, designed for multimedia stream processing. Considered parallelization methods include frame-level concurrency, multithreading and pipeline processing. Execution performance was measured on four time-consuming image recognition algorithms,...
-
Chained machine learning model for predicting load capacity and ductility of steel fiber–reinforced concrete beams
Publication: One of the main issues associated with steel fiber–reinforced concrete (SFRC) beams is the ability to anticipate their flexural response. With a comprehensive grid search, several stacked models (i.e., chained, parallel) consisting of various machine learning (ML) algorithms and artificial neural networks (ANNs) were developed to predict the flexural response of SFRC beams. The flexural performance of SFRC beams under bending was...
-
Construction of highly stable parallel two-step Runge-Kutta methods for delay differential equations
Publication: The paper shows that every A-stable two-step Runge-Kutta method for ordinary differential equations of order p1 and stage order q=p1 can be generalized to a P-stable method for delay differential equations that converges uniformly with order p=p1.
-
Modelling of First- and Second-order Chemical Reactions on ARUZ – Massively-parallel FPGA-based Machine
Publication
-
Carbonized Lanthanum-Based Metal-Organic Framework with Parallel Arranged Channels for Azo-Dye Adsorption
Publication
-
Infrared techniques for natural convection investigations in channels between two vertical, parallel, isothermal and symmetrically heated plates
Publication: The effect of the gap width between two symmetrically heated vertical, parallel, isothermal plates on intensity of natural convective heat transfer in a gas (Pr = 0.71) was experimentally studied using the balance and gradient methods. In the former method heat fluxes were determined based on measurements of the voltage and electric current supplying the heaters placed inside the walls. In the latter, heat fluxes were calculated...
-
Two Stage SVM and kNN Text Documents Classifier
Publication: The paper presents an approach to the large-scale text document classification problem in parallel environments. A two-stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machine classification methods. The details of the classifier and the parallelisation of the classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...
-
GPU-Accelerated LOBPCG Method with Inexact Null-Space Filtering for Solving Generalized Eigenvalue Problems in Computational Electromagnetics Analysis with Higher-Order FEM
Publication: This paper presents a GPU-accelerated implementation of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method with an inexact null-space filtering approach to find eigenvalues in electromagnetics analysis with higher-order FEM. The performance of the proposed approach is verified using the Kepler (Tesla K40c) graphics accelerator, and is compared to the performance of the implementation based on functions from...
-
OpenGL accelerated method of the material matrix generation for FDTD simulations
Publication: This paper presents an accelerated technique for generating the material matrix from CAD models for use in finite-difference time-domain (FDTD) simulators. To achieve high performance of these computations, the parallel-processing power of a graphics processing unit was employed with the use of the OpenGL library. The method was integrated with the developed FDTD solver, providing approximately five-fold speedup of the material...
-
Benchmarking Deep Neural Network Training Using Multi- and Many-Core Processors
Publication: In the paper we provide thorough benchmarking of deep neural network (DNN) training on modern multi- and many-core Intel processors in order to assess performance differences for various deep learning as well as parallel computing parameters. We present performance of DNN training for Alexnet, Googlenet, Googlenet_v2 as well as Resnet_50 for various engines used by the deep learning framework, for various batch sizes. Furthermore,...
-
Self-optimizing generalized adaptive notch filters - comparison of three optimization strategies
Publication: The paper provides a comparison of three different approaches to on-line tuning of generalized adaptive notch filters (GANFs), the algorithms used for identification/tracking of quasi-periodically varying dynamic systems. Tuning is needed to adjust adaptation gains, which control the tracking performance of ANF algorithms, to the unknown and/or time-varying rate of system nonstationarity. Two out of three compared approaches are classical...
-
Measurements of the coefficients of active power distribution between two generators operating in parallel in a ship power station
Research data: The presented dataset is part of research focusing on the assessment of metrological properties of the instrument, Estimator/Analyzer (E/A v.2), developed and made at the Faculty of Electrical Engineering, Department of Marine Electrical Power Engineering of Gdynia Maritime University. The attached dataset contains processed data, expressing coefficients...
-
Influence of permanent magnetic field on wear performance of dry sliding contacts
Publication: Results of experimental studies concerning the influence of a permanent magnetic field on the wear of dry sliding contacts are presented. It was found that a magnetic field of horizontal orientation causes a temporary reduction of the material's hardness. It also facilitates and accelerates the removal of wear particles from the contact zone. Formation of bulges, consisting of accumulated and compacted anti-ferromagnetic material produced...