Filters
total: 316
filtered: 135
Search results for: AUDIO-VIDEO RECORDINGS DATABASE
-
Systematic approach to binary classification of images in video streams using shifting time windows
Publicationin the paper, after pointing out of realistic recordings and classifications of their frames, we propose a new shifting time window approach for improving binary classifications. We consider image classification in tewo steps. in the first one the well known binary classification algorithms are used for each image separately. In the second step the results of the previous step mare analysed in relatively short sequences of consecutive...
-
Detection of Face Position and Orientation Using Depth Data
PublicationIn this paper an original approach is presented for real-time detection of user's face position and orientation based only on depth channel from a Microsoft Kinect sensor which can be used in facial analysis on scenes with poor lighting conditions where traditional algorithms based on optical channel may have failed. Thus the proposed approach can support, or even replace, algorithms based on optical channel or based on skeleton...
-
Extraction of stable foreground image regions for unattended luggage detection
PublicationA novel approach to detection of stationary objects in the video stream is presented. Stationary objects are these separated from the static background, but remaining motionless for a prolonged time. Extraction of stationary objects from images is useful in automatic detection of unattended luggage. The proposed algorithm is based on detection of image regions containing foreground image pixels having stable values in time and...
-
Uwierzytelnienie i autoryzacja w systemie STRADAR
PublicationPrzedstawiono rozwiązanie serwera uwierzytelnienia i autoryzacji (AA) w rozproszonym systemie STRADAR, udostępniającym funkcjonalności dla prowadzenia działań operacyjnych Morskiego Oddziału Straży Granicznej. System umożliwia prezentację na stanowisku wizualizacji zdarzeń (SWZ) bieżącej i archiwalnej sytuacji na mapie (AIS, radary), obrazu z kamer, zdjęć, notatek, rozmów telefonicznych oraz plików i wiadomości tekstowych (SMS)...
-
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
PublicationThe aim of the work is to analyze Lombard speech effect in recordings and then modify the speech signal in order to obtain an increase in the improvement of objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
-
A comparative study of English viseme recognition methods and algorithms
PublicationAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector construction...
-
A comparative study of English viseme recognition methods and algorithm
PublicationAn elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...
-
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
PublicationA concept and practical implementation of a crowd management system which acquires input data by the set of monitoring cameras is presented. Two leading threads are considered. First concerns the crowd behavior analysis. Second thread focuses on detection of a hold-ups in the doorway. The optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...
-
Multimodal Surveillance Based Personal Protection System
PublicationA novel, multimodal approach for automatic detection of abduction of a protected individual, employing dedicated personal protection device and a city monitoring system is proposed and overviewed. The solution is based on combining four modalities (signals coming from: Bluetooth, fixed and PTZ cameras, thermal camera, acoustic sensors). The Bluetooth signal is used continuously to monitor the protected person presence, and in case...
-
Akustyczna analiza parametrów ruchu drogowego z wykorzystaniem informacji o hałasie oraz uczenia maszynowego
PublicationCelem rozprawy było opracowanie akustycznej metody analizy parametrów ruchu drogowego. Zasada działania akustycznej analizy ruchu drogowego zapewnia pasywną metodę monitorowania natężenia ruchu. W pracy przedstawiono wybrane metody uczenia maszynowego w kontekście analizy dźwięku (ang.Machine Hearing). Przedstawiono metodologię klasyfikacji zdarzeń w ruchu drogowym z wykorzystaniem uczenia maszynowego. Przybliżono podstawowe...
-
ANALIZA PARAMETRÓW SYGNAŁU MOWY W KONTEKŚCIE ICH PRZYDATNOŚCI W AUTOMATYCZNEJ OCENIE JAKOŚCI EKSPRESJI ŚPIEWU
PublicationPraca dotyczy podejścia do parametryzacji w przypadku klasyfikacji emocji w śpiewie oraz porównania z klasyfikacją emocji w mowie. Do tego celu wykorzystano bazę mowy i śpiewu nacechowanego emocjonalnie RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song), zawierającą nagrania profesjonalnych aktorów prezentujących sześć różnych emocji. Następnie obliczono współczynniki mel-cepstralne (MFCC) oraz wybrane deskryptory...
-
Sound quality metrics applied to road noise evaluation
PublicationRoad noise monitoring systems typically measure sound levels in specific time periods. The more insightful approach suggests to measure also the nature of noise. Sound quality of sounds such as car noise can be objectively evaluated by several parameters. One of them is psychoacoustic annoyance, described by loudness, tone color, and the temporal structure of sound. In this paper the assessment of several sound quality parameters, such...
-
Verification of the Parameterization Methods in the Context of Automatic Recognition of Sounds Related to Danger
PublicationW artykule opisano aplikację, która automatycznie wykrywa zdarzenia dźwiękowe takie jak: rozbita szyba, wystrzał, wybuch i krzyk. Opisany system składa się z bloku parametryzacji i klasyfikatora. W artykule dokonano porównania parametrów dedykowanych dla tego zastosowania oraz standardowych deskryptorów MPEG-7. Porównano też dwa klasyfikatory: Jeden oparty o Percetron (sieci neuronowe) i drugi oparty o Maszynę wektorów wspierających....
-
Bimodal Emotion Recognition Based on Vocal and Facial Features
PublicationEmotion recognition is a crucial aspect of human communication, with applications in fields such as psychology, education, and healthcare. Identifying emotions accurately is challenging, as people use a variety of signals to express and perceive emotions. In this study, we address the problem of multimodal emotion recognition using both audio and video signals, to develop a robust and reliable system that can recognize emotions...
-
Study on CPU and RAM Resource Consumption of Mobile Devices using Streaming Services
PublicationStreaming multimedia services have become very popular in recent years, due to the development of wireless networks. With the growing number of mobile devices worldwide, service providers offer dedicated applications that allow to deliver on-demand audio and video content anytime and everywhere. The aim of this study was to compare different streaming services and investigate their impact on the CPU and RAM resources, with respect...
-
DevEmo—Software Developers’ Facial Expression Dataset
PublicationThe COVID-19 pandemic has increased the relevance of remote activities and digital tools for education, work, and other aspects of daily life. This reality has highlighted the need for emotion recognition technology to better understand the emotions of computer users and provide support in remote environments. Emotion recognition can play a critical role in improving the remote experience and ensuring that individuals are able...
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
PublicationThe aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
-
Monitoring of Caged Bluefin Tuna Reactions to Ship and Offshore Wind Farm Operational Noises
PublicationUnderwater noise has been identified as a relevant pollution affecting marine ecosystems in different ways. Despite the numerous studies performed over the last few decades regarding the adverse effect of underwater noise on marine life, a lack of knowledge and methodological procedures still exists, and results are often tentative or qualitative. A monitoring methodology for the behavioral response of bluefin tuna (Thunnus thynnus)...
-
TRANSPORT POSSIBILITY FOR MPEG-4/AVC- AND MPEG-2-ENCODED VIDEO DATA IN IPTV: A COMPARISON STUDY
PublicationIPTV (Television over IP) is a modern service with a great potential to expand. It uses the IP transport platform, that is already in worldwide operation. At the time of writing, two techniques are used to transport the video and audio data of IPTV: MPEG-2 TS and Native RTP. The two techniques quite definitely have an influence on both quality of service (QoS) and quality of experience (QoE). This paper sets out to demonstrate...
-
Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera
PublicationThis paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...
-
A Review of Emotion Recognition Methods Based on Data Acquired via Smartphone Sensors
PublicationIn recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly in the environment of personal computers offering the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the increase in computing and communication capabilities of smartphones,...
-
Language material for English audiovisual speech recognition system developmen . Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
PublicationThe bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
Comparison of Classification Methods for EEG Signals of Real and Imaginary Motion
PublicationThe classification of EEG signals provides an important element of brain-computer interface (BCI) applications, underlying an efficient interaction between a human and a computer application. The BCI applications can be especially useful for people with disabilities. Numerous experiments aim at recognition of motion intent of left or right hand being useful for locked-in-state or paralyzed subjects in controlling computer applications....
-
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
PublicationThe approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...
-
Objectivization of Audio-Visual Correlation analysis
PublicationSimultaneous perception of audio and visual stimuli often causes the concealment or misrepresentation of information actually contained in these stimuli. Such effects are called the ''image proximity effect'' or the ''ventriloquism effect'' in literature. Until recently, most research carried out to understand their nature was based on subjective assessments. The Authors of this paper propose a methodology based on both subjective...
-
Application of autoencoder to traffic noise analysis
PublicationThe aim of an autoencoder neural network is to transform the input data into a lower-dimensional code and then to reconstruct the output from this code representation. Applications of autoencoders to classifying sound events in the road traffic have not been found in the literature. The presented research aims to determine whether such an unsupervised learning method may be used for deploying classification algorithms applied to...
-
Fully Automated AI-powered Contactless Cough Detection based on Pixel Value Dynamics Occurring within Facial Regions
PublicationIncreased interest in non-contact evaluation of the health state has led to higher expectations for delivering automated and reliable solutions that can be conveniently used during daily activities. Although some solutions for cough detection exist, they suffer from a series of limitations. Some of them rely on gesture or body pose recognition, which might not be possible in cases of occlusions, closer camera distances or impediments...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
PublicationArtificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
IMAGE CORRELATION AS A TOLL FOR TRACKING FACIAL CHANGES CAUSING BY EXTERNAL STIMULI
PublicationExpressions of the human face bring a lot of information, which are a valuable source in the areas of computer vision, remote sensing and affective computing. For years, by analyzing the movement of the skin and facial muscles scientists are trying to create the perfect tool, based on image analysis, allowing the recognition of emotional states of human beings. To create a reliable algorithm, it is necessary to explore and examine...
-
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
PublicationAutomatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and selforganizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...
-
The effect of groyne field on trapping macroplastic. Preliminary results from laboratory experiments
PublicationMacroplastic, a precursor of microplastic pollution, has become a new scope of research interest. However, the physical processes of macroplastic transport and deposition in rivers are poorly understood, which makes the decisions of where to locate macroplastic trapping infrastructure difficult. In this research, we conducted a series of experiments in a laboratory channel, exploring the impact of groynes and flexible artificial...
-
New Applications of Multimodal Human-Computer Interfaces
PublicationMultimodal computer interfaces and examples of their applications to education software and for the disabled people are presented. The proposed interfaces include the interactive electronic whiteboard based on video image analysis, application for controlling computers with gestures and the audio interface for speech stretching for hearing impaired and stuttering people. Application of the eye-gaze tracking system to awareness...
-
System for automatic singing voice recognition
PublicationW artykule przedstawiono system automatycznego rozpoznawania jakości i typu głosu śpiewaczego. Przedstawiono bazę danych oraz zaimplementowane parametry. Algorytmem decyzyjnym jest algorytm sztucznych sieci neuronowych. Wytrenowany system decyzyjny osiąga skuteczność ok. 90% w obydwu kategoriach rozpoznawania. Dodatkowo wykazano przy pomocy metod statystycznych, że wyniki działania systemu automatycznej oceny jakości technicznej...
-
Improving automatic surveillance by sound analysis
PublicationAn automatic surveillance system, based on event detection in the video image can be improved by implementing algorithms for audio analysis. Dangerous or illegal actions are often connected with distinctive sound events like screams or sudden bursts of energy. A method for detection and classification of alarming sound events is presented. Detection is based on the observation of sudden changes in sound level in distinctive sub-bands...
-
ZINTEGROWANY SYSTEM DOMOWEGO MONITORINGU PARAMETRÓW MEDYCZNYCH OSÓB STARSZYCH I CHORYCH
PublicationProponowane rozwiązania mają na celu wspomaganie osób starszych i chorych, tak by mogły jak najdłużej mieszkać i żyć samodzielnie ze zwiększonym poczuciem bezpieczeństwa, iż są nadzorowane i w razie nagłego zagrożenia życia nie pozostaną bez pomocy. System jednocześnie nie narusza poczucia zachowania prywatności i intymności, gdyż nie są używane do monitoringu kamery wizyjne czy też stały nasłuch audio. Dodatkowo gromadzone informacje...