Filtry
wszystkich: 426
wybranych: 416
Wyniki wyszukiwania dla: PARALLEL ALGORITHMS
-
Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi
PublikacjaParallel algorithms are popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with multi-core computational accelerator:...
-
Efficient parallel algorithms in global optimization of potential energy functions for peptides, proteins, and crystals
Publikacja -
Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines: Complexity and Algorithms
PublikacjaIn this paper, the problem of scheduling on parallel machines with a presence of incompatibilities between jobs is considered. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. The paper provides several results concerning schedules, optimal or approximate with respect to the two most popular criteria of optimality:...
-
Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform
PublikacjaResults of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to chose an algorithm, a number of threads and a task scheduling method, that together provide satisfactory accuracy and efficiency of a real-time processing of high resolution camera images, maintaining the cost of resources usage at a reasonable level. Two...
-
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
PublikacjaPerformance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
PublikacjaSpectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
Parallelization of video stream algorithms in kaskada platform
PublikacjaThe purpose of this work is to present different techniques of video stream algorithms parallelization provided by the Kaskada platform - a novel system working in a supercomputer environment designated for multimedia streams processing. Considered parallelization methods include frame-level concurrency, multithreading and pipeline processing. Execution performance was measured on four time-consuming image recognition algorithms,...
-
Computer experiments with a parallel clonal selection algorithm for the graph coloring problem
PublikacjaArtificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...
-
NVRAM as Main Storage of Parallel File System
PublikacjaModern cluster environments' main trouble used to be lack of computational power provided by CPUs and GPUs, but recently they suffer more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, a lot of solutions base on emerging memory technologies. This paper presents evaluation of using non-volatile random-access memory as a...
-
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
PublikacjaIn the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...
-
Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines
PublikacjaIn this paper we consider a problem of job scheduling on parallel machines with a presence of incompatibilities between jobs. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. Our research stems from the works of Bodlaender, Jansen, and Woeginger (1994) and Bodlaender and Jansen (1993). In particular, we pursue the...
-
Scheduling of compatible jobs on parallel machines
PublikacjaThe dissertation discusses the problems of scheduling compatible jobs on parallel machines. Some jobs are incompatible, which is modeled as a binary relation on the set of jobs; the relation is often modeled by an incompatibility graph. We consider two models of machines. The first model, more emphasized in the thesis, is a classical model of scheduling, where each machine does one job at time. The second one is a model of p-batching...
-
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
PublikacjaTwo novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted...
-
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster . Rozpoznawanie niebezpiecznych zdarzeń dźwiękowych z wykorzystaniem równoległego przetwarzania na klastrze superkomputerowym
PublikacjaA method for automatic recognition of hazardous acoustic events operating on a super computing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
-
Identification of nonstationary processes using noncausal bidirectional lattice filtering
PublikacjaThe problem of off-line identification of a nonstationary autoregressive process with a time-varying order and a time-varying degree of nonstationarity is considered and solved using the parallel estimation approach. The proposed parallel estimation scheme is made up of several bidirectional (noncausal) exponentially weighted lattice algorithms with different estimation memory and order settings. It is shown that optimization of...
-
Process arrival pattern aware algorithms for acceleration of scatter and gather operations
PublikacjaImbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...
-
Self-optimizing generalized adaptive notch filters - comparison of three optimization strategies
PublikacjaThe paper provides comparison of three different approaches to on-line tuning of generalized adaptive notch filters (GANFs) the algorithms used for identification/tracking of quasi-periodically varying dynamic systems. Tuning is needed to adjust adaptation gains, which control tracking performance of ANF algorithms, to the unknown and/or time time-varying rate of system nonstationarity. Two out ofthree compared approaches are classical...
-
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
PublikacjaA system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classication of acoustic events are introduced. The recognition engine is based onthreshold-based detection with adaptive threshold and Support Vector Machine classifcation. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
-
Approximation algorithms for job scheduling with block-type conflict graphs
PublikacjaThe problem of scheduling jobs on parallel machines (identical, uniform, or unrelated), under incompatibility relation modeled as a block graph, under the makespan optimality criterion, is considered in this paper. No two jobs that are in the relation (equivalently in the same block) may be scheduled on the same machine in this model. The presented model stems from a well-established line of research combining scheduling theory...
-
Design of weighted PID controllers for control of the Stewart-Gough platform
PublikacjaStewart-Gough platform (SGP) is a popular parallel type manipulator that involves a 6 degrees of freedom (DOF) motion. In this paper, the process of mathematical modelling of SGP is presented. Two selected control algorithms that use PID controllers and weighted PID controllers are designed. Both control systems using these algorithms are implemented in MATLAB environment as well as on the actual SGP. Parameters of the controllers...
-
Planning optimised multi-tasking operations under the capability for parallel machining
PublikacjaThe advent of advanced multi-tasking machines (MTMs) in the metalworking industry has provided the opportunity for more efficient parallel machining as compared to traditional sequential processing. It entailed the need for developing appropriate reasoning schemes for efficient process planning to take advantage of machining capabilities inherent in these machines. This paper addresses an adequate methodical approach for a non-linear...
-
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
PublikacjaMany important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing full power of such systems requires programming parallel applications that are hybrid in two meanings: they can utilize parallelism on multiple levels at the same time and combine together programming interfaces...
-
Genetic Positioning of Fire Stations Utilizing Grid-computing Platform
PublikacjaA chapter presents a model for determining near-optimal locations of fire stations based on topography of a given area and location of forests, rivers, lakes and other elements of the site. The model is based on principals of genetic algorithms and utilizes the power of the grid to distribute and execute in parallel most performance-demanding computations involved in the algorithm.
-
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
PublikacjaThe aim of this paper is to evaluate performance of new CUDA mechanisms—unified memory and dynamic parallelism for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...
-
Decoupled Kalman filter based identification of time-varying FIR systems
PublikacjaWhen system parameters vary at a fast rate, identification schemes based on model-free local estimation approaches do not yield satisfactory results. In cases like this, more sophisticated parameter tracking procedures must be used, based on explicit models of parameter variation (often referred to as hypermodels), either deterministic or stochastic. Kalman filter trackers, which belong to the second category, are seldom used in...
-
New Approach to Noncasual Identification of Nonstationary Stochastic FIR Systems Subject to Both Smooth and Abrupt Parameter Changes
PublikacjaIn this technical note, we consider the problem of finite-interval parameter smoothing for a class of nonstationary linear stochastic systems subject to both smooth and abrupt parameter changes. The proposed parallel estimation scheme combines the estimates yielded by several exponentially weighted basis function algorithms. The resulting smoother automatically adjusts its smoothing bandwidth to the type and rate of nonstationarity...
-
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
PublikacjaRelativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....
-
Acceleration of decision making in sound event recognition employing supercomputing cluster
PublikacjaParallel processing of audio data streams is introduced to shorten the decision making time in hazardous sound event recognition. A supercomputing cluster environment with a framework dedicated to processing multimedia data streams in real time is used. The sound event recognition algorithms employed are based on detecting foreground events, calculating their features in short time frames, and classifying the events with Support...
-
Generalized adaptive comb filter with improved accuracy and robustness properties
PublikacjaGeneralized adaptive comb lters can be used to identify/track parameters of quasi-periodically varying systems.In a special, signal case they reduce down to adaptive comblters, applied to elimination or extraction of nonstationarymulti-harmonic signals buried in noise. We proposea new algorithm which combines, in an adaptive way, resultsyielded by several, simultaneously working generalizedadaptive comb lters. Due to its highly...
-
The Quick Measure of a Nurbs Surface Curvature for Accurate Triangular Meshing
PublikacjaNURBS surfaces are the most widely used surfaces for three-dimensional models in CAD/CAE programs. As a model for FEM calculation is prepared with a CAD program it is inevitable to mesh it finally. There are many algorithms for meshing planar regions. Some of them may be used for meshing surfaces but it is necessary to take the curvature of the surface under consideration to avoid poor quality mesh. The mesh must be denser in the...
-
KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs
PublikacjaThe paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....
-
Online sound restoration system for digital library applications
PublikacjaAudio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Investigation of Performance and Energy Consumption of Tokenization Algorithms on Multi-core CPUs Under Power Capping
PublikacjaIn this paper we investigate performance-energy optimization of tokenizer algorithm training using power capping. We focus on parallel, multi-threaded implementations of Byte Pair Encoding (BPE), Unigram, WordPiece, and WordLevel run on two systems with different multi-core CPUs: Intel Xeon 6130 and desktop Intel i7-13700K. We analyze execution times and energy consumption for various numbers of threads and various power caps and...
-
On noncausal weighted least squares identification of nonstationary stochastic systems
PublikacjaIn this paper, we consider the problem of noncausal identification of nonstationary, linear stochastic systems, i.e., identification based on prerecorded input/output data. We show how several competing weighted (windowed) least squares parameter smoothers, differing in memory settings, can be combined together to yield a better and more reliable smoothing algorithm. The resulting parallel estimation scheme automatically adjusts...
-
Online sound restoration system for digital library applications.
PublikacjaAudio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Variable-structure algorithm for identification of quasi-periodically varying systems
PublikacjaThe paper presents a variable-structure version of a generalized notchfiltering (GANF) algorithm. Generalized notch filters are used for identification of quasi-periodically varying dynamic systems and can be considered an extension, to the system case, of classical adaptive notch filters. The proposed algorithm is a cascade of two GANF filters: a multiple-frequency "precise" filter bank, used for precise system tracking, and a...
-
A self-optimization mechanism for generalized adaptive notch smoother
PublikacjaTracking of nonstationary narrowband signals is often accomplished using algorithms called adaptive notch filters (ANFs). Generalized adaptive notch smoothers (GANSs) extend the concepts of adaptive notch filtering in two directions. Firstly, they are designed to estimate coefficients of nonstationary quasi-periodic systems, rather than signals. Secondly, they employ noncausal processing, which greatly improves their accuracy and...
-
Identification of nonstationary multivariate autoregressive processes– Comparison of competitive and collaborative strategies for joint selection of estimation bandwidth and model order
PublikacjaThe problem of identification of multivariate autoregressive processes (systems or signals) with unknown and possibly time-varying model order and time-varying rate of parameter variation is considered and solved using parallel estimation approach. Under this approach, several local estimation algorithms, with different order and bandwidth settings, are run simultaneously and compared based on their predictive performance. First,...
-
VARIANT DESIGNING IN the PRELIMINARY SMALL SHIP DESIGN PROCESS
PublikacjaShip designing is a complex process, as the ship itself is a complex, technical multi-level object which operates in the air/water boundary environment and is exposed to the action of many different external and internal factors resulting from the adopted technical solutions, type of operation, and environmental conditions. A traditional ship design process consists of a series of subsequent multistage iterations, which gradually...
-
Parallel tabu search for graph coloring problem
PublikacjaTabu search is a simple, yet powerful meta-heuristic based on local search that has been often used to solve combinatorial optimization problems like the graph coloring problem. This paper presents current taxonomy of patallel tabu search algorithms and compares three parallelization techniques applied to Tabucol, a sequential TS algorithm for graph coloring. The experimental results are based on graphs available from the DIMACS...
-
Chained machine learning model for predicting load capacity and ductility of steel fiber–reinforced concrete beams
PublikacjaOne of the main issues associated with steel fiber–reinforced concrete (SFRC) beams is the ability to anticipate their flexural response. With a comprehensive grid search, several stacked models (i.e., chained, parallel) consisting of various machine learning (ML) algorithms and artificial neural networks (ANNs) were developed to predict the flexural response of SFRC beams. The flexural performance of SFRC beams under bending was...
-
DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing
PublikacjaIn the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms—linear search (LS) and golden section search (GSS), finds...
-
Integration of Services into Workflow Applications
PublikacjaDescribing state-of-the-art solutions in distributed system architectures, Integration of Services into Workflow Applications presents a concise approach to the integration of loosely coupled services into workflow applications. It discusses key challenges related to the integration of distributed systems and proposes solutions, both in terms of theoretical aspects such as models and workflow scheduling algorithms, and technical...
-
Total Completion Time Minimization for Scheduling with Incompatibility Cliques
PublikacjaThis paper considers parallel machine scheduling with incompatibilities between jobs. The jobs form a graph equivalent to a collection of disjoint cliques. No two jobs in a clique are allowed to be assigned to the same machine. Scheduling with incompatibilities between jobs represents a well-established line of research in scheduling theory and the case of disjoint cliques has received increasing attention in recent...
-
Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
PublikacjaThis paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA‐FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...
-
Further Developments of the Online Sound Restoration System for Digital Library Applications
PublikacjaNew signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Jannsen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...
-
Condition-Based Monitoring of DC Motors Performed with Autoencoders
PublikacjaThis paper describes a condition-based monitoring system estimating DC motor degradation with the use of an autoencoder. Two methods of training the autoencoder are evaluated, namely backpropagation and extreme learning machines. The root mean square (RMS) error in the reconstruction of successive fragments of the measured DC motor angular-frequency signal, which is fed to the input of autoencoder, is used to determine the health...
-
Nieliniowa statyka 6-parametrowych powłok sprężysto plastycznych. Efektywne obliczenia MES
PublikacjaGłównym zagadnieniem omawianym w monografii jest sformułowanie sprężysto-plastycznego prawa konstytutywnego w nieliniowej 6-parametrowej teorii powłok. Wyróżnikiem tej teorii jest występujący w niej w naturalny sposób tzw. stopień 6 swobody, czyli owinięcie (drilling rotation). Podstawowe założenie pracy to przyjęcie płaskiego stanu naprężenia uogólnionego na ośrodek typu Cosseratów. Takie podejście stanowi oryginalny aspekt opracowania....
-
Multi-core processing system for real-time image processing in embedded computer vision applications
PublikacjaW artykule opisano architekturę wielordzeniowego programowalnego systemu do przetwarzania obrazów w czasie rzeczywistym. Dane obrazu są przetwarzane równocześnie przez wszystkie procesory. System umożliwia niskopoziomowe przetwarzanie obrazów,np. odejmowanie tła, wykrywanie obiektów ruchomych, transformacje geometryczne, indeksowanie wykrytych obiektów, ocena ich kształtu oraz podstawowa analiza trajektorii ruchu. Ang:This paper...
-
A novel hybrid adaptive framework for support vector machine-based reliability analysis: A comparative study
PublikacjaThis study presents an innovative hybrid Adaptive Support Vector Machine - Monte Carlo Simulation (ASVM-MCS) framework for reliability analysis in complex engineering structures. These structures often involve highly nonlinear implicit functions, making traditional gradient-based first or second order reliability algorithms and Monte Carlo Simulation (MCS) time-consuming. The application of surrogate models has proven effective...