Search results for: emotion recognition, dataset, video annotation

Multimodal English corpus for automatic speech recognition

Publication

- Year 2013

A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...

Recognizing emotions on the basis of keystroke dynamics

Publication

A. Kołakowska

- Year 2015

The article describes research on recognizing emotional states on the basis of keystroke dynamics. An overview of various studies and applications of emotion recognition based on keyboard data is presented. Then the idea of an experiment is presented, i.e. the way of collecting and labeling training data, extracting features and finally training classifiers. Different classification approaches are proposed to be...

Full text to download in external service

Domain adaptation for inpainting-based face recognition studies

Publication

- Year 2024

Recent inpainting methods have demonstrated impressive outcomes in filling missing parts of images, especially for reconstructing facial areas obscured by occlusions. However, studies show that these models are not adequately effective in real-world applications, primarily due to data bias and the distribution of faces in images. This research focuses on domain adaptation of the commonly used Labeled Faces in the Wild (LFW) dataset,...

Full text to download in external service

Focus on Misinformation: Improving Medical Experts’ Efficiency of Misinformation Detection

Publication

A. Nabożny
B. Balcerzak
M. Morzy
A. Wierzbicki

- Year 2021

Fighting medical disinformation in the era of the global pandemic is an increasingly important problem. As of today, automatic systems for assessing the credibility of medical information do not offer sufficient precision to be used without human supervision, and the involvement of medical expert annotators is required. Thus, our work aims to optimize the utilization of medical experts’ time. We use the dataset of sentences taken...

Full text to download in external service

Analysis of human behavioral patterns

Publication

A. Kołakowska

- Year 2022

Widespread usage of Internet and mobile devices entailed growing requirements concerning security which in turn brought about development of biometric methods. However, a specially designed biometric system may infer more about users than just verifying their identity. Proper analysis of users’ characteristics may also tell much about their skills, preferences, feelings. This chapter presents biometric methods applied in several...

Full text to download in external service

Improvement of Image Binarization Methods Using Image Preprocessing with Local Entropy Filtering for Alphanumerical Character Recognition Purposes

Publication

H. Michalak
K. P. Okarma

- ENTROPY - Year 2019

Automatic text recognition from the natural images acquired in uncontrolled lighting conditions is a challenging task due to the presence of shadows hindering the shape analysis and classification of individual characters. Since the optical character recognition methods require prior image binarization, the application of classical global thresholding methods in such case makes it impossible to preserve the visibility of all...

Full text to download in external service

Visual Lip Contour Detection for the Purpose of Speech Recognition

Publication

- Year 2014

A method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from a mouth region. Different Active Appearance Models are employed for finding lips in video frames and for lip shape and texture statistical description. Search initialization procedure is proposed and error measure values are...

Discovering Rule-Based Learning Systems for the Purpose of Music Analysis

Publication

G. Korvel
B. Kostek

- Journal of the Acoustical Society of America - Year 2019

Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

Full text available to download

Video Semantic Analysis Framework based on Run-time Production Rules - Towards Cognitive Vision

Publication

E. Szczerbicki
C. Toro
C. Sanin

- JOURNAL OF UNIVERSAL COMPUTER SCIENCE - Year 2015

This paper proposes a service-oriented architecture for video analysis which separates object detection from event recognition. Our aim is to introduce new tools to be considered in the pathway towards Cognitive Vision as a support for classical Computer Vision techniques that have been broadly used by the scientific community. In the article, we particularly focus on solving some of the reported scalability issues found in current...

Full text available to download

Knowledge representation of motor activity of patients with Parkinson’s disease

Publication

- Natural Computing - Year 2015

An approach to the knowledge representation extraction from biomedical signals analysis concerning motor activity of Parkinson disease patients is proposed in this paper. This is done utilizing accelerometers attached to their body as well as exploiting video image of their hand movements. Experiments are carried out employing artificial neural networks and support vector machine to the recognition of characteristic motor activity...

Full text available to download

Affect-awareness framework for intelligent tutoring systems

Publication

A. Landowska

- Year 2013

The paper proposes a framework for construction of Intelligent Tutoring Systems (ITS) that take into consideration student emotional states and make affective interventions. The paper provides definitions of 'affect-aware systems' and 'affective interventions' and describes the concept of the affect-awareness framework. The proposed framework separates emotion recognition from its definition, processing and making decisions on...

Full text to download in external service

Music Mood Visualization Using Self-Organizing Maps

Publication

- Archives of Acoustics - Year 2015

Due to the increasing amount of music being made available in digital form on the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of mood of songs based on Self-Organizing Maps. Parameters describing mood of music are proposed and calculated and then analyzed employing correlation with mood dimensions based on the Multidimensional Scaling. A map is created in which...

Full text available to download

A Novel IoT-Perceptive Human Activity Recognition (HAR) Approach Using Multi-Head Convolutional Attention

Publication

H. Zhang
Z. Xiao
J. Wang
F. Li
E. Szczerbicki

- IEEE Internet of Things Journal - Year 2019

Together with fast advancement of the Internet of Things (IoT), smart healthcare applications and systems are equipped with increasingly more wearable sensors and mobile devices. These sensors are used not only to collect data, but also, and more importantly, to assist in daily activity tracking and analyzing of their users. Various human activity recognition (HAR) approaches are used to enhance such tracking. Most of the existing...

Full text available to download

Vehicle detector training with labels derived from background subtraction algorithms in video surveillance

Publication

- Year 2018

Vehicle detection in video from a miniature stationary closed-circuit television (CCTV) camera is discussed in the paper. The camera provides one of the components of the intelligent road sign developed in the project concerning the traffic control with the use of autonomous devices being developed. Modern Convolutional Neural Network (CNN) based detectors need big data input, usually demanding their manual labeling. In the presented...

Using Convolutional Neural Networks for Corneal Arcus Detection Towards Familial Hypercholesterolemia Screening

Publication

T. Kocejko
J. Rumiński
M. Mazur-Milecka
M. Romanowska-Kocejko
K. Chlebus
J. Kang-Hyun

- Journal of King Saud University-Computer and Information Sciences - Year 2022

Familial hypercholesterolemia (FH) is a highly undiagnosed disease. Among FH patients, the onset of premature coronary artery disease is 13 times higher than in the general population. Early diagnosis and treatment is essential to prevent cardiovascular diseases and their complications, and to prolong life. One of the clinical criteria of FH is the occurrence of a corneal arcus (CA) among patients, especially those under 45 years...

Full text available to download

Thermal imaging in automatic rodent’s social behaviour analysis

Publication

M. Mazur-Milecka

- Year 2016

Laboratory rodent social behaviour analysis is an extremely important task for biological, medical and pharmacological research. In this work, thermal image features that facilitate the analysis are presented. Methods to distinguish objects on the basis of thermal distribution are tested. Actions of grooming or biting one rodent by another - important social behaviour incidents - are clearly visible...

Full text to download in external service

Methodology and technology for the polymodal allophonic speech transcription

Publication

- Journal of the Acoustical Society of America - Year 2016

A method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

Full text to download in external service

Methodology and technology for the polymodal allophonic speech transcription

Publication

- Journal of the Acoustical Society of America - Year 2016

A method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

Full text to download in external service

Detecting Objects of Various Categories in Optical Remote Sensing Imagery Using Neural Networks

Publication

A. Madajczak
M. Ciecholewski

- Year 2024

The effective detection of objects in remote sensing images is of great research importance, so recent years have seen significant progress in deep learning techniques in this field. However, despite much valuable research being conducted, many challenges still remain. Many research projects focus on detecting objects of a single category (class), while correctly detecting objects of different categories is much harder. The...

Full text to download in external service

Virtual Whiteboard: A gesture-controlled pen-free tool emulating school whiteboard

Publication

- Year 2012

In the paper the so-called Virtual Whiteboard is presented, which may be an alternative to modern electronic whiteboards based on electronic pens and sensors. The presented tool enables the user to write, draw and handle whiteboard contents using his/her hands only. Additional equipment such as infrared diodes, infrared cameras or cyber gloves is not needed. The user's interaction with the Virtual Whiteboard computer...

Robot-Based Intervention for Children With Autism Spectrum Disorder: A Systematic Literature Review

Publication

K. D. Bartl-Pokorny
P. Uluer
D. E. Barkana
A. Baird
H. Kose
T. Zorcec
B. Robins
B. Schuller
A. Landowska
M. Pykała

- IEEE Access - Year 2021

Children with autism spectrum disorder (ASD) have deficits in the socio-communicative domain and frequently face severe difficulties in the recognition and expression of emotions. Existing literature suggested that children with ASD benefit from robot-based interventions. However, studies varied considerably in participant characteristics, applied robots, and trained skills. Here, we reviewed robot-based interventions targeting...

Full text available to download

Investigating Feature Spaces for Isolated Word Recognition

Publication

P. Treigys
G. Korvel
G. Tamulevicius
J. Bernataviciene
B. Kostek

- Year 2020

The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Full text to download in external service

Investigating Feature Spaces for Isolated Word Recognition

Publication

G. Korvel
G. Tamulevicius
P. Treigys
J. Bernataviciene
B. Kostek

- Year 2018

Much attention has been given by researchers to the speech processing task in automatic speech recognition (ASR) over the past decades. The study addresses the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...

Detection of Face Position and Orientation Using Depth Data

Publication

M. Szwoch
P. Pieniążek

- Advances in Intelligent Systems and Computing - Year 2015

In this paper an original approach is presented for real-time detection of user's face position and orientation based only on depth channel from a Microsoft Kinect sensor which can be used in facial analysis on scenes with poor lighting conditions where traditional algorithms based on optical channel may have failed. Thus the proposed approach can support, or even replace, algorithms based on optical channel or based on skeleton...

Full text to download in external service

Multi-Stage Video Analysis Framework

Publication

- Year 2011

The chapter is organized as follows. Section 2 presents the general structure of the proposed framework and a method of data exchange between system elements. Section 3 describes the low-level analysis modules for detection and tracking of moving objects. In Section 4 we present the object classification module. Sections 5 and 6 describe specialized modules for detection and recognition of faces and license plates, respectively....

Full text to download in external service

Parallelization of video stream algorithms in kaskada platform

Publication

A. Brzeski

- Year 2011

The purpose of this work is to present different techniques of video stream algorithms parallelization provided by the Kaskada platform - a novel system working in a supercomputer environment designated for multimedia streams processing. Considered parallelization methods include frame-level concurrency, multithreading and pipeline processing. Execution performance was measured on four time-consuming image recognition algorithms,...

An audio-visual corpus for multimodal automatic speech recognition

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017

A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings is provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available to download

Automatic recognition of males and females among web browser users based on behavioural patterns of peripherals usage

Publication

A. Kołakowska
A. Landowska
P. Jarmolkowicz
M. Jarmolkowicz
K. Sobota

- Internet Research - Year 2016

Purpose: The purpose of this paper is to answer the question whether it is possible to recognise the gender of a web browser user on the basis of keystroke dynamics and mouse movements. Design/methodology/approach: An experiment was organised in order to track mouse and keyboard usage using a special web browser plug-in. After collecting the data, a number of parameters describing the users’ keystrokes, mouse movements and clicks...

Full text to download in external service

Gesture Recognition With the Linear Optical Sensor and Recurrent Neural Networks

Publication

- IEEE SENSORS JOURNAL - Year 2018

In this paper, the optical linear sensor, a representative of low-resolution sensors, was investigated in the multiclass recognition of near-field hand gestures. The recurrent neural network (RNN) with a gated recurrent unit (GRU) memory cell was utilized as a gestures classifier. A set of 27 gestures was collected from a group of volunteers. The 27 000 sequences obtained were divided into training, validation, and test subsets....

Full text available to download

Neural Network Subgraphs Correlation with Trained Model Accuracy

Publication

I. Wrosz

- Year 2020

Neural Architecture Search (NAS) is a computationally demanding process of finding optimal neural network architecture for a given task. Conceptually, NAS comprises applying a search strategy on a predefined search space accompanied by a performance evaluation method. The design of search space alone is expected to substantially impact NAS efficiency. We consider neural networks as graphs and find a correlation between the presence...

Full text to download in external service

Application of autoencoder to traffic noise analysis

Publication

- Journal of the Acoustical Society of America - Year 2019

The aim of an autoencoder neural network is to transform the input data into a lower-dimensional code and then to reconstruct the output from this code representation. Applications of autoencoders to classifying sound events in the road traffic have not been found in the literature. The presented research aims to determine whether such an unsupervised learning method may be used for deploying classification algorithms applied to...

Full text available to download

Driver fatigue detection method based on facial image analysis

Publication

- Year 2024

Nowadays, ensuring road safety is a crucial issue that demands continuous development and measures to minimize the risk of accidents. This paper presents the development of a driver fatigue detection method based on the analysis of facial images. To monitor the driver's condition in real-time, a video camera was used. The method of detection is based on analyzing facial features related to the mouth area and eyes, such as...

Full text to download in external service

Noise profiling for speech enhancement employing machine learning models

Publication

K. Kąkol
G. Korvel
B. Kostek

- Journal of the Acoustical Society of America - Year 2022

This paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...

Full text available to download

Driver’s Condition Detection System Using Multimodal Imaging and Machine Learning Algorithms

Publication

- Year 2023

To this day, driver fatigue remains one of the most significant causes of road accidents. In this paper, a novel way of detecting and monitoring a driver’s physical state has been proposed. The goal of the system was to make use of multimodal imaging from RGB and thermal cameras working simultaneously to monitor the driver’s current condition. A custom dataset was created consisting of thermal and RGB video samples. Acquired data...

Full text to download in external service

Challenges in Observing the Emotions of Children with Autism Interacting with a Social Robot

Publication

D. Erol Barkana
K. D. Bartl-Pokorny
H. Kose
A. Landowska
M. Milling
B. Robins
B. Schuller
P. Uluer
M. Wróbel
T. Zorcec

- International Journal of Social Robotics - Year 2024

This paper concerns the methodology of multi-modal data acquisition in observing emotions experienced by children with autism while they interact with a social robot. As robot-enhanced therapy gains more and more attention and has proved to be effective in autism, such observations might influence the future development and use of such technologies. The paper is based on an observational study of child-robot interaction, during which...

Full text available to download

Thermal Image Processing for Respiratory Estimation from Cubical Data with Expandable Depth

Publication

M. Szankin
A. Kwaśniewska
J. Rumiński

- Journal of Imaging - Year 2023

As healthcare costs continue to rise, finding affordable and non-invasive ways to monitor vital signs is increasingly important. One of the key metrics for assessing overall health and identifying potential issues early on is respiratory rate (RR). Most of the existing methods require multiple steps that consist of image and signal processing. This might be difficult to deploy on edge devices that often do not have specialized...

Full text available to download

A comparative study of English viseme recognition methods and algorithms

Publication

- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2018

An elementary visual unit, the viseme, is considered in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector construction...

Full text available to download

Controlling computer by lip gestures employing neural network

Publication

- Year 2010

Results of experiments regarding lip gesture recognition with an artificial neural network are discussed. The neural network module forms the core element of a multimodal human-computer interface called LipMouse. This solution allows a user to work on a computer using lip movements and gestures. A user face is detected in a video stream from a standard web camera using a cascade of boosted classifiers working with Haar-like features....

Full text to download in external service

A comparative study of English viseme recognition methods and algorithm

Publication

- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2018

An elementary visual unit, the viseme, is considered in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector...

Full text available to download

Improving methods for detecting people in video recordings using shifting time-windows

Publication

A. Blokus
H. Krawczyk

- Year 2018

We propose a novel method for improving algorithms which detect the presence of people in video sequences. Our focus is on algorithms for applications which require reporting and analyzing all scenes with detected people in long recordings. Therefore one of the target qualities of the classification result is its stability, understood as a low number of invalid scene boundaries. Many existing methods process images in the recording...

Full text to download in external service

Deep CNN based decision support system for detection and assessing the stage of diabetic retinopathy

Publication

A. Kwasigroch
B. Jarzembinski
M. Grochowski

- Year 2018

Diabetic retinopathy is a disease caused by long-standing diabetes. Lack of effective treatment can lead to vision impairment and even irreversible blindness. The disease can be diagnosed by examining digital color fundus photographs of the retina. In this paper we propose a deep learning approach to automated diabetic retinopathy screening. Deep convolutional neural networks (CNN) - the most popular kind of deep learning algorithms...

Full text to download in external service

Convolutional Neural Networks for C. Elegans Muscle Age Classification Using Only Self-Learned Features

Publication

- Journal of Telecommunications and Information Technology - Year 2022

Nematodes Caenorhabditis elegans (C. elegans) have been used as model organisms in a wide variety of biological studies, especially those intended to obtain a better understanding of aging and age-associated diseases. This paper focuses on automating the analysis of C. elegans imagery to classify the muscle age of nematodes based on the known and well established IICBU dataset. Unlike many modern classification methods, the proposed...

Full text available to download

Real-Time Gastrointestinal Tract Video Analysis on a Cluster Supercomputer

Publication

- Year 2012

The article presents a novel approach to medical video data analysis and recognition. Emphasis has been put on adapting existing algorithms detecting lesions and bleedings for real time usage in a medical doctor's office during an endoscopic examination. A system for diagnosis recommendation and disease detection has been designed taking into account the limited mobility of the endoscope and the doctor's requirements. The...

Real-Time Bleeding Detection in Gastrointestinal Tract Endoscopic Examinations Video

Publication

- International Journal of Distributed and Parallel Systems - Year 2013

The article presents a novel approach to medical video data analysis and recognition of bleedings. Emphasis has been put on adapting pre-existing algorithms dedicated to the detection of bleedings for real-time usage in a medical doctor’s office during an endoscopic examination. A real-time system for analyzing endoscopic videos has been designed according to the most significant requirements of medical doctors. The main goal of...

Full text available to download

The Hough transform in the classification process of inland ships

Publication

K. Bobkowska
N. Wawrzyniak

- Zeszyty Naukowe Akademii Morskiej w Szczecinie - Year 2019

This article presents an analysis of the possibilities of using image processing methods for feature extraction that allows kNN classification based on a ship’s image delivered from an on-water video surveillance system. The subject of the analysis is the Hough transform which enables the detection of straight lines in an image. The recognized straight lines and the information about them serve as features in the classification...

Full text available to download

Concurrent Video Denoising and Deblurring for Dynamic Scenes

Publication

- IEEE Access - Year 2021

Dynamic scene video deblurring is a challenging task due to the spatially variant blur inflicted by independently moving objects and camera shakes. Recent deep learning works bypass the ill-posedness of explicitly deriving the blur kernel by learning pixel-to-pixel mappings, which is commonly enhanced by larger region awareness. This is a difficult yet simplified scenario because noise is neglected when it is omnipresent in a wide...

Full text available to download

Identification of Emotional States Using Phantom Miro M310 Camera

Publication

M. Przyborski

- Internal Security - Year 2013

The purpose of this paper is to present the possibilities associated with the use of remote sensing methods in identifying human emotional states, and to present the results of the research conducted by the authors in this field. The studies presented involved the use of advanced image analysis to identify areas on the human face that change their activity along with emotional expression. Most of the research carried out in laboratories...

Toward Robust Pedestrian Detection With Data Augmentation

Publication

- IEEE Access - Year 2020

In this article, the problem of creating a safe pedestrian detection model that can operate in the real world is tackled. While recent advances have led to significantly improved detection accuracy on various benchmarks, existing deep learning models are vulnerable to changes in the input image that are invisible to the human eye, which raises concerns about their safety. A popular and simple technique for improving robustness is using data...

Full text available to download

Audio content analysis in the urban area telemonitoring system

Publication

- Year 2010

The article presents the possibilities of extending urban monitoring with automatic sound analysis. Sound parameterization methods that can be applied in such a system are presented, and technical aspects of the implementation are discussed. The next part presents the tree-based decision system used in the system. This system recognizes dangerous sounds (gunshot, broken glass, scream) among the sounds recorded...

Full text to download in external service
