Search results for: visual processing
-
Hidden Markov Models for Visual Processing of Marketing Leaflets
Publication -
International Journal of Image Processing and Visual Communication
Journals -
IEEE International Conference on Visual Communications and Image Processing
Conferences -
Pan-Sydney Area Workshop on Visual Information Processing
Conferences -
Endoscopic Videos Deinterlacing and On-Screen Text and Light Flashes Removal and Its Influence on Image Analysis Algorithms' Efficiency
Publication
In this article, methods for deinterlacing endoscopic video images and for removing on-screen text and light flashes from them are discussed. The research is intended to improve disease recognition algorithms' performance. In the article, four configurations of deinterlacing methods and another four configurations of text and flashes removal methods are described and examined. The efficiency of endoscopic video analysis algorithms is measured...
-
Marek Blok dr hab. inż.
People
Marek Blok graduated in 1994 from the Faculty of Electronics at Gdansk University of Technology, receiving his MSc in telecommunications. He received his Ph.D. in 2003 and his D.Sc. in 2017 in telecommunications from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology. His research interests are focused on the application of digital signal processing in telecommunications. He provides lectures, laboratory...
-
Michał Lech dr inż.
People
Michał Lech was born in Gdynia in 1983. In 2007 he graduated from the Faculty of Electronics, Telecommunications and Informatics of Gdansk University of Technology. In June 2013, he received his Ph.D. degree. The subject of the dissertation was: “A Method and Algorithms for Controlling the Sound Mixing Processes by Hand Gestures Recognized Using Computer Vision”. The main focus of the thesis was the bias of audio perception caused...
-
Visual Features for Endoscopic Bleeding Detection
Publication
Aims: To define a set of high-level visual features of endoscopic bleeding and evaluate their capabilities for potential use in automatic bleeding detection. Study Design: Experimental study. Place and Duration of Study: Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, between March 2014 and May 2014. Methodology: The features have...
-
Piotr Szczuko dr hab. inż.
People
Piotr Szczuko received his M.Sc. degree in 2002. His thesis was dedicated to the examination of correlation phenomena between the perception of sound and vision for surround sound and digital image. He finished Ph.D. studies in 2007 and one year later completed a dissertation "Application of Fuzzy Rules in Computer Character Animation" that received the award of the Prime Minister of Poland. His interests include: processing of audio and video, computer...
-
Visual Data Encryption for Privacy Enhancement in Surveillance Systems
Publication
In this paper, a methodology for employing reversible visual encryption of data is proposed. The developed algorithms are focused on privacy enhancement in distributed surveillance architectures. First, the motivation for the study and a short review of existing methods of privacy enhancement are presented. The algorithmic background, system architecture along with a solution for anonymization of sensitive regions of interest...
-
Human verbal memory encoding is hierarchically distributed in a continuous processing stream
Publication
Processing of memory is supported by coordinated activity in a network of sensory, association, and motor brain regions. It remains a major challenge to determine where memory is encoded for later retrieval. Here we used direct intracranial brain recordings from epilepsy patients performing free recall tasks to determine the temporal pattern and anatomical distribution of verbal memory encoding across the entire human cortex. High...
-
UAV Design and Construction for Real Time Photogrammetry and Visual Navigation
Publication
Applications of unmanned aerial vehicles (UAVs) in photogrammetry have increased rapidly in recent years. In some applications, fast data gathering and processing in real time become crucial. In the paper, a real-time solution is proposed: real-time photogrammetry from a UAV, in which image data are gathered and processed on board the UAV and a reconstructed 3D model and measurements are finally delivered. The paper...
-
A Model-Driven Solution for Development of Multimedia Stream Processing Applications
Publication
This paper presents results of action research related to model-driven solutions in the area of multimedia stream processing. The practical problem to be solved was the need to support application developers who make their multimedia stream processing applications in a supercomputer environment. The solution consists of a domain-specific visual language for composing complex services from simple services called Multimedia Stream...
-
An audio-visual corpus for multimodal automatic speech recognition
Publication
A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus, containing 31 hours of recordings, was created specifically to assist the development of audio-visual speech recognition (AVSR) systems. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...
-
High frequency oscillations are associated with cognitive processing in human recognition memory
Publication
High frequency oscillations are associated with normal brain function, but also increasingly recognized as potential biomarkers of the epileptogenic brain. Their role in human cognition has been predominantly studied in classical gamma frequencies (30-100 Hz), which reflect neuronal network coordination involved in attention, learning and memory. Invasive brain recordings in animals and humans demonstrate that physiological oscillations...
-
Combined Single Neuron Unit Activity and Local Field Potential Oscillations in a Human Visual Recognition Memory Task
Publication
GOAL: Activities of neuronal networks range from action potential firing of individual neurons and coordinated oscillations of local neuronal assemblies to distributed neural populations. Here, we describe recordings using hybrid electrodes, containing both micro- and clinical macroelectrodes, to simultaneously sample both large-scale network oscillations and single neuron spiking activity in the medial temporal lobe structures...
-
Guitar String Sound Retrieved from Moving Pixels
Publication
The aim of this study was to develop a method for visually recording and analyzing the vibrations of guitar strings using high-speed cameras and dedicated video processing algorithms. The recording of a plucked string reveals the way in which the deformations propagate, composing the standing and travelling wave. The paper compares the results for a few selected models of classical and acoustic guitars, and it involves processing...
-
Visual Features for Improving Endoscopic Bleeding Detection Using Convolutional Neural Networks
Publication
The presented paper investigates the problem of endoscopic bleeding detection in endoscopic videos in the form of a binary image classification task. A set of definitions of high-level visual features of endoscopic bleeding is introduced, which incorporates domain knowledge from the field. The high-level features are coupled with respective feature descriptors, enabling automatic capture of the features using image processing methods....
-
Patryk Ziółkowski dr inż.
People
Patryk Ziółkowski is a graduate of the Faculty of Civil and Environmental Engineering at the Gdansk University of Technology, specializing in Building and Engineering Structures. He works as an Assistant Professor at the Department of Engineering Structures. He participated in international projects, including projects for the Ministry of Transportation of the State of Alabama (2015). He is also the winner of a grant from the Kosciuszko...
-
Piotr Odya dr inż.
People
Piotr Odya was born in Gdansk in 1974. He received his M.Sc. in 1999 from the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. His thesis was related to the problem of sound quality improvement in the contemporary broadcasting studio. He is interested in video editing and multichannel sound systems. Mr. Odya's Ph.D. thesis concerned methods and algorithms for correcting...
-
Flock behavior and control
Publication
In this paper we present the results of the Flock Behaviour and Control workshop cluster during “Shapes of Logic Conference 2015”. During the event, students became familiar with the techniques of both visual and sound real-time data processing. The second topic presented to students was a behaviour-based approach to the design process, mainly based on the mathematical rules formulated by Craig Reynolds for swarm behaviour. The aim of the...
-
Sensors and System for Vehicle Navigation
Publication
In recent years, vehicle navigation, in particular autonomous navigation, has been at the center of several major developments, both in civilian and defense applications. New technologies, such as multisensory data fusion, big data processing, or deep learning, are changing the quality of areas of applications, improving the sensors and systems used. Recently, the influence of artificial intelligence on sensor data processing and...
-
Multi-task Video Enhancement for Dental Interventions
Publication
A microcamera firmly attached to a dental handpiece allows dentists to continuously monitor the progress of conservative dental procedures. Video enhancement in video-assisted dental interventions alleviates low-light, noise, blur, and camera handshakes that collectively degrade visual comfort. To this end, we introduce a novel deep network for multi-task video enhancement that enables macro-visualization of dental scenes. In particular,...
-
Multimodal English corpus for automatic speech recognition
Publication
A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously; the corpus therefore makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Context-Aware Indexing and Retrieval for Cognitive Systems Using SOEKS and DDNA
Publication
Visual content searching, browsing and retrieval tools have been an area of active interest, as they are required by systems from many different domains. Context-based, content-based, and semantic-based approaches are used for indexing and retrieval, but they have drawbacks when applied to systems that aim to mimic human capabilities. Such systems, also known as Cognitive Systems, are still limited in terms of processing...
-
Framework for Structural Health Monitoring of Steel Bridges by Computer Vision
Publication
Monitoring the structural condition of steel bridges is an important issue. Good condition of infrastructure facilities ensures the safety and economic well-being of society. At the same time, due to continuous development, the rising wealth of society and the socio-economic integration of countries, the number of infrastructural objects is growing. Therefore, there is a need to introduce an easy-to-use and relatively low-cost...
-
Art Composition
e-Learning Courses
Person in charge: prof. Krzysztof Wróblewski, Department of Visual Arts. Teacher: mgr Patryk Różycki, Department of Visual Arts. Five Words. Society and Politics. What? By What? General assumptions. The aim of the proposed two artistic compositions is a creative processing of emotions related to socio-political issues. In general, it is about personal views and feelings, but it must also be considered that architects are...
-
In vivo imaging of the human eye using a two-photon excited fluorescence scanning laser ophthalmoscope
Publication
BACKGROUND. Noninvasive assessment of metabolic processes that sustain regeneration of human retinal visual pigments (visual cycle) is essential to improve ophthalmic diagnostics and to accelerate development of new treatments to counter retinal diseases. Fluorescent vitamin A derivatives, which are the chemical intermediates of these processes, are highly sensitive to UV light; thus, safe analyses of these processes in humans...
-
Independent dynamics of slow, intermediate, and fast intracranial EEG spectral activities during human memory formation
Publication
A wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various low and high frequencies are spatiotemporally coordinated across the human brain during memory processing is inconclusive. They can either be coordinated together across a wide range of the frequency spectrum or induced in specific bands. We used a large dataset of human intracranial electroencephalography...
-
An Overview of Image Analysis Techniques in Endoscopic Bleeding Detection
Publication
The authors review existing bleeding detection methods, focusing on the image processing techniques utilised in the algorithms. In the article, 18 methods were analysed and their functional components were identified. The authors proposed six different groups, to which algorithms’ components were assigned: colour techniques, reflecting features of pixels as individual values, texture techniques, considering spatial...
-
Automatic audio-visual threat detection
Publication
The concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of a new kind of multichannel miniature sound intensity sensors, digital Pan Tilt Zoom and fixed cameras, and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...
-
Independent dynamics of low, intermediate, and high frequency spectral intracranial EEG activities during human memory formation
Publication
A wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various frequency ranges are coordinated across the space of the human cortex and time of memory processing is inconclusive. They can either be coordinated together across the frequency spectrum at the same cortical site and time or induced independently in particular bands. We used a large dataset of human intracranial...
-
Video content analysis in the urban area telemonitoring system
Publication
The task of constant monitoring of video streams from a large number of cameras and reviewing the recordings in order to find a specified event requires a considerable amount of time and effort from the system operators and it is prone to errors. A solution to this problem is an automatic system for constant analysis of camera images being able to raise an alarm if a predefined event is detected. The chapter presents various aspects...
-
Brain-computer interaction based on EEG signal and gaze-tracking information = Analiza interakcji mózg-komputer wykorzystująca sygnał EEG i informacje z systemu śledzenia punktu fiksacji wzroku
Publication
The article presents an attempt to integrate EEG signal analysis with information about human visual activities, i.e. gaze fixation point. The results from gaze-tracking-based measurement were combined with the standard EEG analysis. A search for correlation between the brain activity and the region of the screen observed by the user was performed. The preliminary stage of the study consists in electrooculography (EOG) signal processing....
-
How Can We Identify Electrophysiological iEEG Activities Associated with Cognitive Functions?
Publication
Electrophysiological activities of the brain are engaged in its various functions and give rise to a wide spectrum of low and high frequency oscillations in the intracranial EEG (iEEG) signals, commonly known as the brain waves. The iEEG spectral activities are distributed across networks of cortical and subcortical areas arranged into hierarchical processing streams. It remains a major challenge to identify these activities in...
-
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
Publication
A method for automatic transcription of English speech into the International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...
-
Multibeam Echosounder and LiDAR in Process of 360-Degree Numerical Map Production for Restricted Waters with HydroDron
Publication
In order to increase the safety of inland navigation and facilitate the monitoring of the coastal zone of restricted waters, a model of multi-sensory fusion of data from hydroacoustic and optoelectronic systems mounted on the autonomous survey vessel HydroDron will be developed. The research will use a LiDAR laser scanner and a multibeam echosounder. To increase the visual quality and map accuracy, additionally side scan...
-
Detection of debonding in adhesive joints using Lamb wave propagation
Publication
Adhesively bonded joints are widely used in many branches of industry. Mechanical degradation of this type of connection does not have significant symptoms that can be noticed during visual assessment, so non-destructive testing becomes a very important issue. The paper deals with experimental investigations of adhesively bonded steel plates with different defects. Five samples (an intact one and four with damages in the form...
-
Pupil size reflects successful encoding and recall of memory in humans
Publication
Pupil responses are known to indicate brain processes involved in perception, attention and decision-making. They can provide an accessible biomarker of human memory performance and cognitive states in general. Here we investigated changes in the pupil size during encoding and recall of word lists. Consistent patterns in the pupil response were found across and within distinct phases of the free recall task. The pupil was most...
-
Automatic Watercraft Recognition and Identification on Water Areas Covered by Video Monitoring as Extension for Sea and River Traffic Supervision Systems
Publication
The article presents the watercraft recognition and identification system as an extension for the presently used visual water area monitoring systems, such as VTS (Vessel Traffic Service) or RIS (River Information Service). The watercraft identification systems (AIS - Automatic Identification Systems) which are presently used in both sea and inland navigation require purchase and installation of relatively expensive transceivers...
-
Hazard Control in Industrial Environments: A Knowledge-Vision-Based Approach
Publication
This paper proposes the integration of image processing techniques (such as image segmentation, feature extraction and selection) and a knowledge representation approach in a framework for the development of an automatic system able to identify, in real time, unsafe activities in industrial environments. In this framework, the visual information (feature extraction) acquired from video-camera images and other context based gathered...
-
Visual Object Tracking System Employing Fixed and PTZ Cameras
Publication
The paper presents a video monitoring system utilizing fixed and PTZ cameras for tracking moving objects. The first type of camera provides images for background modelling, which is employed for foreground object localization. Estimated object locations are then utilised for steering the PTZ cameras when observing targeted objects in close-up. Objects are classified into several classes, then basic event detection is being...
-
Orientation-aware ship detection via a rotation feature decoupling supported deep learning approach
Publication
Ship imaging position plays an important role in visual navigation, and thus significant attention has been paid to accurately extracting ship imaging positions in maritime videos. Previous studies mainly perform horizontal ship detection on maritime image sequences. This can lead to unsatisfactory ship detection performance because some background pixels may be wrongly identified as ship contours. To address...
-
Distributed Framework for Visual Event Detection in Parking Lot Area
Publication
The paper presents a framework for automatic detection of various events occurring in a parking lot, based on multiple camera video analysis. The framework is massively distributed, both in the logical and physical sense. It consists of several entities called node stations that use the XMPP protocol for internal communication and the SRTP protocol with the Jingle extension for video streaming. Recognized events include detecting parking...
-
Visual Detection of People Movement Rules Violation in Crowded Indoor Scenes
Publication
The paper presents a camera-independent framework for detecting violations of two typical people movement rules that are in force in many public transit terminals: moving in the wrong direction or across designated lanes. Low-level image processing is based on object detection with Gaussian Mixture Models and employs Kalman filters with conflict resolving extensions for the object tracking. In order to allow an effective event...
-
Human memory enhancement through electrical stimulation in the temporal cortex
Publication
Direct electrical stimulation of the human brain can elicit sensory and motor perceptions as well as recall of memories. Stimulating higher order association areas of the lateral temporal cortex in particular was reported to activate visual and auditory memory representations of past experiences (Penfield and Perot, 1963). We hypothesized that this effect could be used to modulate memory processing. Recent attempts at memory enhancement...
-
Advanced Visual Interfaces
Conferences -
Visual Analytics [VA]
Conferences -
Visual Information Communication and Interaction (Visual Information Communications International)
Conferences -
Network oscillations modulate interictal epileptiform spike rate during human memory
Publication
Eleven patients being evaluated with intracranial electroencephalography for medically resistant temporal lobe epilepsy participated in a visual recognition memory task. Interictal epileptiform spikes were manually marked and their rate of occurrence compared between baseline and three 2 s periods spanning a 6 s viewing period. During successful, but not unsuccessful, encoding of the images there was a significant reduction in...
-
Human Feedback and Knowledge Discovery: Towards Cognitive Systems Optimization
Publication
Current computer vision systems, especially those using machine learning techniques, are data-hungry and frequently only perform well when dealing with patterns they have seen before. As an alternative, cognitive systems have become a focus of attention for applications that involve complex visual scenes, and in which conditions may vary. In theory, cognitive applications use current machine learning algorithms, such as deep learning,...
-
“Shadow” vs. “Phase 3D” method within endoscopic examinations of marine engines
Publication
Visual inspection of the surfaces forming the internal working spaces of marine combustion engines by means of specialized viewfinders, so-called endoscopes, is at present almost a basic method of technical diagnostics. The surface structure of the constructional material is visible during inspection as if through a magnifying glass (usually with a precisely determined magnification), which makes possible the detection, recognition...
-
User experience evaluation study on the quality of 1K, 2K, and 4K H.265/HEVC video content
Publication
Nowadays, most content creators focus on distributing rich media at the highest possible resolution. Currently, the majority of sold consoles, media players, computer hardware, as well as displays and TVs are advertised as 4K-compatible. The same trend is observed in the case of popular online streaming services and terrestrial TV broadcasts. Generally speaking, it is assumed that higher bitrates provide higher subjective judgements....
-
Testing the Effect of Bathymetric Data Reduction on the Shape of the Digital Bottom Model
Publication
Depth data and the digital bottom model created from it are very important in studies and research on inland and coastal water zones. The paper addresses bathymetric data processing using reduction methods and examines the impact of data reduction on the resulting representations of the bottom surface in the form of numerical bottom models. Data reduction is an approach that is meant to reduce the size...
-
Knowledge Visualization and Visual Thinking
Conferences -
Visual Languages and Formal Methods
Conferences -
International Symposium on Visual Computing
Conferences -
Between Idea and Interpretation - Design Process Augmentation
Publication
The following paper investigates the idea of reducing human digital intervention to a minimum during the advanced design process. The outcome attributes are augmented beyond the designer's capabilities by computational design methods, data collection, data computing and digital fabrication, which together imitate the human design process. The primary technical goal of the research was verification of restrictions and abilities used...
-
International Conference on Visual Information Systems
Conferences -
BP-EVD: Forward Block-Output Propagation for Efficient Video Denoising
Publication
Denoising videos in real-time is critical in many applications, including robotics and medicine, where varying light conditions, miniaturized sensors, and optics can substantially compromise image quality. This work proposes the first video denoising method based on a deep neural network that achieves state-of-the-art performance on dynamic scenes while running in real-time on VGA video resolution with no frame latency. The backbone...
-
SPIE Conference on Visual Data Exploration and Analysis
Conferences -
IEEE Symposium on Visual Analytics Science and Technology
Conferences -
IFIP Working Conference on Visual Database Systems
Conferences -
IEEE Workshop on Computational Intelligence for Visual Intelligence
Conferences -
Special forms of echo visual representation in an ahead looking sonar.
Publication
The paper discusses ways to organise visual representation in multi-beam ahead-looking sonars whose function is to detect objects on the bottom and in pelagic zones. Forms of visual representation are shown and illustrated on the basic screen (panoramic representation and setting, alarms) and on the auxiliary screen (type A, B and special). Special forms of visual representation are mainly used in detecting objects in difficult...
-
Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions - Headphones vs Loudspeakers: A Case Study
Publication
This study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are presented to the listeners with or without visual cues. The subjective test participants are tasked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology...
-
Visual Management as the support in building the concept of continuous improvement in the enterprise
Publication
The following article presents one of the selected tools of the Lean Management concept – visual management. This method enables enterprises to strengthen their process of continuous improvement. With the support of visual management, the managerial board can manage information more effectively and improve the communication process within the company. In the first part, the author describes the concept...
-
Visual TreeCmp: Comprehensive Comparison of Phylogenetic Trees on the Web
Publication
1. We present Visual TreeCmp, a package of applications for comparing phylogenetic tree sets. 2. Visual TreeCmp includes a graphical web interface allowing the visualization of compared trees and a command-line application extended by comparison methods recently proposed in the literature. 3. The phylogenetic tree similarity analysis in Visual TreeCmp can be performed using 18 metrics, of which 11 are dedicated to rooted trees...
-
Multimodal Attention Stimulator
Publication
A multimodal attention stimulator was proposed and tested for improving auditory and visual attention, including in pupils with developmental dyslexia. Results of the conducted experiments showed that the designed stimulator can be used to improve comprehension during reading tasks. The changes in visual attention, observed in reading test results, translate into the overall reading performance.
-
The short-term flicker severity level measured in the industrial power system supplying the rolling mill motors
Open Research Data
The dataset presents a short-term flicker severity level measured on the bus bars of the main switchgear of the industrial power network for the supply of rolling mills. The data were obtained during an experiment whose purpose was to determine a level of short-term and long-term flicker caused by voltage fluctuations. In the virtual application of...
-
Visual Capacity Assessment of the Open Landscape in Terms of Protection and Shaping: Case Study of a Village in Poland
Publication
This article describes the methodology and results of research on landscape visual capacity. The aim of the project was to develop a tool that would support planning and design decisions at the level of communal management in rural areas in Poland through systematic application of visual criteria. Their importance in the protection, management and shaping of space is underlined by the document produced at the European Landscape...
-
Visual Lip Contour Detection for the Purpose of Speech Recognition
Publication
A method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from a mouth region. Different Active Appearance Models are employed for finding lips in video frames and for lip shape and texture statistical description. A search initialization procedure is proposed and error measure values are...
-
Exploiting audio-visual correlation by means of gaze tracking
Publication
This paper presents a novel means for increasing the reliability of audio-visual correlation analysis. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, past and current research in the area of audio-visual perception analysis is briefly reviewed. Then the methodology employing gaze tracking is presented along with the...
-
IEEE Symposium on Visual Languages and Human-Centric Computing (was VL)
Conferences -
Modelling Of Commercial Websites. A New Perspective On Usability And Customer Relation
Publication
From an economic point of view, a critical aspect of online services is their ability to retain customers. The aim of the presented study was to apply the layered VIPR (Visual - Interaction - Process - Relation) model to commercial online services. The indicators of trust and of establishing lasting relationships were assessments collected from experienced users of commercial online services (n = 207), obtained by means of Web Credibility...
-
Objectivization of Audio-Visual Correlation analysis
PublicationSimultaneous perception of audio and visual stimuli often causes the concealment or misrepresentation of information actually contained in these stimuli. Such effects are called the "image proximity effect" or the "ventriloquism effect" in the literature. Until recently, most research carried out to understand their nature was based on subjective assessments. The authors of this paper propose a methodology based on both subjective...
-
A comparative study of English viseme recognition methods and algorithms
PublicationThe viseme, an elementary visual unit, is considered in the paper in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector construction...
-
A comparative study of English viseme recognition methods and algorithm
PublicationThe viseme, an elementary visual unit, is considered in the paper in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector...
-
Visual content representation and retrieval for Cognitive Cyber Physical Systems
PublicationCognitive Cyber Physical Systems have gained significant attention from academia and industry during the past few decades. One of the main reasons behind this interest is the potential of such technologies to revolutionize human life since they intend to work robustly under complex visual scenes, in which environmental conditions may vary, adapting to a comprehensive range of unforeseen changes, and exhibiting prospective behavior...
-
Lighting conditions in Home Office and occupant’s perception: an international study
PublicationThe global pandemic and physical distancing restrictions are forcing us to rethink how residential buildings are used regarding the visual environment. This paper describes home office lighting conditions within different countries and continents. The aim is to define the current limitations of home offices in providing a resilient visual environment. The work was developed by a team of international experts working together on...
-
Methodology and technology for the polymodal allophonic speech transcription
PublicationA method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...
-
Methodology and technology for the polymodal allophonic speech transcription
PublicationA method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...
-
A new method of audio-visual correlation analysis
PublicationThis paper presents a new methodology for conducting audio-visual correlation analysis employing a gaze tracking system. The interaction and mutual reinforcement of two perceptual modalities, seeing and hearing, in a complex relationship has been the subject of many research studies. An earlier stage of the experiments carried out at the Multimedia Systems Department (MSD) showed that there exists a relationship between...
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
PublicationIn this paper an analysis of audio-visual recordings of the Lombard effect is presented. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect that are present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
Vocalic Segments Classification Assisted by Mouth Motion Capture
PublicationVisual features convey important information for automatic speech recognition (ASR), especially in noisy environments. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose motion capture markers were placed on speakers' faces to obtain lip tracking data during speaking. Different parameterization strategies were tested...
-
Simple gait parameterization and 3D animation for anonymous visual monitoring based on augmented reality
PublicationThe article presents a method for video anonymization and replacing real human silhouettes with virtual 3D figures rendered on a screen. The video stream is processed to detect and track objects, whereas the anonymization stage animates avatars according to the behavior of detected persons. Location, movement speed, direction, and person height are taken into account during the animation and rendering phases. This approach requires...
-
Robust and Efficient Machine Learning Algorithms for Visual Recognition
PublicationIn visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...
-
Smart Modeling of Maritime Vessels
PublicationCurrently, the market offers many visualization tools available to graphic designers, engineers, managers and academics working on maritime environments. The practice of visualization involves making and manipulating images that convey novel phenomena and ideas. Visual communication, together with virtual reality environments, is an emerging and rapidly evolving discipline. It brings great advantage over written word or voice alone,...
-
Visual and auditory attention stimulator for assisting pedagogical therapy (Stymulator uwagi wzrokowej i słuchowej do wspomagania terapii pedagogicznej)
PublicationThe visual and auditory attention stimulator is a system developed to improve reading skills through simultaneous presentation of text in its visual form and in a transformed auditory form, accompanied by related movie material. The described research involved 40 children aged 8-13 years who had difficulties in learning to read and were diagnosed with developmental dyslexia. It was shown that application...
-
Visual and Auditory Attention Stimulator for Assisting Pedagogical Therapy
PublicationThe visual and auditory attention stimulator is a system developed to improve reading skills through simultaneous presentation of text in its visual form and in a transformed auditory form, accompanied by related movie material. The described research involved 40 children aged 8-13 years who had difficulties in learning to read and were diagnosed with developmental dyslexia. It was shown that application...
-
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
PublicationThe problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
-
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
PublicationThe problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
-
Light formed through urban morphology and different organism groups: First findings from a systematic review
PublicationThe prevailing implementation and usage of contemporary lighting technologies and design practices in cities have created over-illuminated built environments. Recent studies indicate that exposure to electric lighting effects formed through spatial characteristics has visual, physiological, and behavioural effects on both humans and non-humans, such as wildlife. In order to gain a better understanding of the impact that electric...
-
Public spaces connecting cities. Green and Blue Infrastructures potential.
PublicationCity fragmentation causes many negative effects in the urban environment, such as disconnected environmental, functional and compositional relations, a loss of urban compactness, chaotic development, visual chaos, domination of the technical landscape, and reduced security. This is why one of the main challenges for urban planners is to connect the fragmented structures by creating friendly, attractive and safe public space....
-
Augmented Reality for Privacy-Sensitive Visual Monitoring
PublicationThe paper presents a method for video anonymization and replacing real human silhouettes with virtual 3D figures rendered on the screen. The video stream is processed to detect and track objects, whereas the anonymization stage employs a fast blurring method. Substitute 3D figures are animated according to the behavior of detected persons. Their location, movement speed, direction, and person height are taken into account during the animation...
-
Remote Estimation of Video-Based Vital Signs in Emotion Invocation Studies
PublicationThe goal of this study is to examine the influence of various imitated and video-invoked emotions on the vital signs (respiratory and pulse rates). We also perform an analysis of the possibility to extract signals from sequences acquired with cost-effective cameras. The preliminary results show that the respiratory rate allows for better separation of some emotions than the pulse rate, yet this relation highly depends...
-
Objectivization of audio-video correlation assessment experiments
PublicationThe purpose of this paper is to present a new method of conducting an audio-visual correlation analysis employing a head-motion-free gaze tracking system. First, a review of related works in the domain of sound and vision correlation is presented. Then assumptions concerning audio-visual scene creation are briefly described. The objectivization process of carrying out correlation tests employing a gaze-tracking system is outlined....
-
Preferences of the Facade Composition in the Context of Its Regularity and Irregularity
PublicationThe aim of this study is to determine the preferences of Polish society towards building facades depending on the degree of compositional regularity of the facade elements. The subject matter is inspired by the authors’ observations in relation to the current architectural trends. The purposefulness of the conducted research results from several issues. Firstly, the reports of psychology and neurosciences clearly indicate...
-
New Aspects of Virtual Sound Source Localization Research—Impact of Visual Angle and 3-D Video Content on Sound Perception
PublicationThe influence of image on virtual sound source localization, called the “image proximity effect” or the “ventriloquism effect”, is a well-known phenomenon. This paper focuses on other aspects related to this effect, namely the impact of the visual angle of the presented object and 3D video content on sound perception. The research conducted confirmed that the visual angle of the presented object determines the image proximity effect...
-
Support for argument structures review and assessment
PublicationArgument structures are commonly used to develop and present cases for safety, security and for other properties of systems. Such structures tend to grow excessively, which causes problems with their review and assessment. Two issues are of particular interest: (1) systematic and explicit assessment of the compelling power of an argument, and (2) communication of the result of such an assessment to relevant recipients. The paper...