Search results for: allophonic speech transcription
-
Molecular and structural basis of inner core lipopolysaccharide alterations in Escherichia coli: incorporation of glucuronic acid and phosphoethanolamine in the heptose region.
Publication: It is well established that lipopolysaccharide (LPS) often carries nonstoichiometric substitutions in lipid A and in the inner core. In this work, the molecular basis of inner core alterations and their physiological significance are addressed. A new inner core modification of LPS is described, which arises due to the addition of glucuronic acid on the third heptose with a concomitant loss of phosphate on the second heptose. This...
-
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
Publication: This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with voice assistants. The first part of this work provides an overview of examples of classical and deep learning...
-
Neotenic phenomenon in gene expression in the skin of Foxn1- deficient (nude) mice - a projection for regenerative skin wound healing
Publication: Mouse fetuses up to day 16 of embryonic development and nude (Foxn1-deficient) mice are examples of animals that undergo regenerative (scar-free) skin healing. The expression of the transcription factor Foxn1 in the epidermis of mouse fetuses begins at embryonic day 16.5, which coincides with the transition point from scar-free to scar-forming skin wound healing. In the present study, we tested the hypothesis that Foxn1 expression...
-
Tombstone of Simon Bahr in St. Mary's Church in Gdańsk
Open Research Data: The data set concerns epigraphy. It refers to the tombstone placed in St. Mary’s Church in Gdańsk which is dedicated to Simon Bahr, the merchant and banker of the Swedish King John III Vasa and his son, Sigismund III Vasa, king of Poland and Sweden. Simon formed powerful alliances by marrying his numerous offspring into the finest Gdańsk families. The...
-
Epitaph of Dorothy and John Brandes in St. Mary's Church in Gdansk
Open Research Data: The data set concerns epigraphy. It refers to the epitaph placed in St. Mary’s Church in Gdańsk, which is dedicated to John Brandes, a highly influential citizen of Gdańsk whose official career was crowned with the function of mayor, and his wife Dorothy née Zimmermann. The epitaph was founded by their family, and especially by their grandson John Speymann,...
-
Modeling and Designing Acoustical Conditions of the Interior – Case Study
Publication: The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...
-
A comparative study of English viseme recognition methods and algorithms
Publication: The paper concerns the viseme, an elementary visual unit, in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is to review various approaches to the problem, implement algorithms proposed in the literature, and compare their effectiveness. In the course of the study an optimal feature vector construction...
-
A comparative study of English viseme recognition methods and algorithm
Publication: The paper concerns the viseme, an elementary visual unit, in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is to review various approaches to the problem, implement algorithms proposed in the literature, and compare their effectiveness. In the course of the study an optimal feature vector...
-
Comparative analysis of various transformation techniques for voiceless consonants modeling
Publication: In this paper, a comparison of various transformation techniques, namely the Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT), is performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....
-
Playback detection using machine learning with spectrogram features approach
Publication: This paper presents a 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as the speech signal representation. Three feature extraction and classification methods, namely histograms of oriented gradients (HOG) with support vector machines (SVM), Haar wavelets with an AdaBoost classifier, and deep convolutional neural networks (CNN), were compared on different data partitions in respect...
-
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
Publication: The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...
-
Intelligent multimedia solutions supporting special education needs.
Publication: The role of computers in school education is briefly discussed. The history of multimodal interface development is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech-stretching audio interface representing the audio modality....
-
Intelligent video and audio applications for learning enhancement
Publication: The role of computers in school education is briefly discussed. The history of multimodal interface development is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech-stretching audio interface representing the audio modality....
-
Evaluation Criteria for Affect-Annotated Databases
Publication: In this paper a set of comprehensive evaluation criteria for affect-annotated databases is proposed. These criteria can be used to evaluate the quality of a database at the stage of its creation as well as to evaluate and compare existing databases. The usefulness of these criteria is demonstrated on several databases selected from the affective computing domain. The databases contain different kinds of data: video or still...
-
Effect of new bisacridines IKE16, IKE18 and IE10 on the yeast topoisomerase II relaxation activity
Open Research Data: The datasets contain the results of new bisacridines IKE16, IKE18 and IE10 inhibition activity against yeast topoisomerase II. DNA topoisomerases (Topo) are enzymes that catalyze changes in the spatial structure of DNA and play an important role in replication, transcription and recombination. Beyond their normal functions, DNA topo are significant...
-
Sepulchral plate of Thomas Tympfius in St. Mary's Church in Gdańsk
Open Research Data: The data set concerns epigraphy. It refers to the sepulchral plate placed in St. Mary’s Church in Gdańsk which is dedicated to Thomas Tympfius, a minter who, together with his brother Andrew, rented several mints in the Kingdom of Poland, and gave his name to the timpf - the silver zloty released by the Tympfius mint. Unfortunately, as a result of...
-
The role of epigenetics in regeneration
Publication: Complex changes in chromatin structure and at the transcriptional level occur from the creation of a single parental gamete throughout fertilization, embryo development and the life of an adult organism. Epigenetic changes, such as methylation and hydroxymethylation of DNA or histone methylation and acetylation, are an important part of these processes. Epigenetic regulation has an essential influence on gene expression level. DNA...
-
Deep neural networks for data analysis
e-Learning Courses: The aim of the course is to familiarize students with deep learning methods for advanced data analysis. Typical areas of application of these methods include image classification, speech recognition and natural language understanding.
-
Szymon Andrzejewski dr
People: Master’s degree from the University of Gdańsk in 2008, with a major in political system and self-government. Postgraduate studies at the Gdańsk University of Technology ("Management and evaluation of projects financed from EU funds") and at AGH University of Science and Technology (protection against noise and vibration). PhD student in sociology at the University of Gdańsk since 2016. The research scope is democracy and institutions...
-
Analysis-by-synthesis paradigm evolved into a new concept
Publication: This work aims to show how the well-known analysis-by-synthesis paradigm has recently evolved into a new concept. However, in contrast to the original idea stating that the created sound should not fail to pass the foolproof synthesis test, the recent development is a consequence of the need to create new data. Deep learning models are greedy algorithms requiring a vast amount of data that, in addition, should be correctly...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication: In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING
Publication: An algorithm and software for preprocessing film reviews written in Polish were developed. The algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming...
-
A study on signal processing methods applied to hearing aids
Publication: This paper presents a short survey of the technology currently available in hearing aids, with a focus on the digital signal processing techniques used. First, factors influencing hearing aid effectiveness are introduced. Then, examples of present DSP methods and strategies are provided. Also, a description of current limitations of hearing aids and future trends of development are shown. Finally, the notion of computational auditory...
-
Selection of Features for Multimodal Vocalic Segments Classification
Publication: English speech recognition experiments are presented employing both audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction on the accuracy of vocalic segment classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...
-
Anticancer imidazoacridinone C-1311 inhibits hypoxia-inducible factor-1α (HIF-1α), vascular endothelial growth factor (VEGF) and angiogenesis
Publication: Antitumor imidazoacridinone C-1311 is a DNA-reactive topoisomerase II and FLT3 receptor tyrosine kinase inhibitor. Here, we demonstrate the mechanism of C-1311 inhibitory action on novel targets: hypoxia-inducible factor-1α (HIF-1α), vascular-endothelial growth factor (VEGF), and angiogenesis. In a cell-free system, C-1311 prevented HIF-1α binding to an oligonucleotide encompassing a canonical hypoxia-responsive element (HRE),...
-
Promoting the resources of the Pomeranian Digital Library (Pomorska Biblioteka Cyfrowa) using the example of an 18th-century manuscript
Publication: The aim of the article is to present a way of sharing and promoting manuscript collections, using the example of an 18th-century manuscript by Christian Gabriel Fisher available in the Pomorska Biblioteka Cyfrowa (hereafter: PBC). The manuscript inspired cooperation between the Library of the Gdańsk University of Technology and the Instytut Kultury Miejskiej in Gdańsk. Thanks to this joint initiative, work began on transcribing the German text...
-
Justification of quasi-stationary approximation in models of gene expression of a self-regulating protein
Publication: We analyse a model of Hes1 gene transcription and protein synthesis with a negative feedback loop. The effect of multiple binding sites in the Hes1 promoter as well as the dimer formation process are taken into account. We consider three, possibly different, time scales connected with: (i) the process of binding to/dissolving from a binding site, (ii) formation and dissociation of dimers, (iii) production and degradation of Hes1...
-
Sound engineering as our commitment to its creators in Poland
Publication: Sound engineering is an interdisciplinary and rapidly expanding domain. It covers many aspects, such as sound perception, studio and sound mastering technology, music information retrieval including content-based search systems and automatic music transcription frameworks, sound synthesis, sound restoration, electroacoustics, and other areas that make up multimedia technology. Moreover, machine learning methods applied to the topics...
-
Triazoloacridone C-1305 impairs XBP1 splicing by acting as a potential IRE1α endoribonuclease inhibitor
Publication: Inositol-requiring enzyme 1 alpha (IRE1α) is one of three signaling sensors in the unfolded protein response (UPR) that alleviates endoplasmic reticulum (ER) stress in cells and functions to promote cell survival. During conditions of irrevocable stress, proapoptotic gene expression is induced to promote cell death. One of the three signaling sensors, IRE1α is a serine/threonine-protein kinase/endoribonuclease (RNase) that...
-
HDAC Inhibitors: Innovative Strategies for Their Design and Applications
Publication: Histone deacetylases (HDACs) are a large family of epigenetic metalloenzymes that are involved in gene transcription and regulation, cell proliferation, differentiation, migration, and death, as well as angiogenesis. In particular, disorders of HDAC expression are linked to the development of many types of cancer and neurodegenerative diseases, making them interesting molecular targets for the design of new efficient drugs...
-
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
Publication: In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing: noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...
-
Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling
Publication: Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...
-
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication: The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work first recalls selected methods for automatic audio mixing. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...
-
Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation
Publication: In this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...
-
Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering
Publication: This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...
-
Vocalic Segments Classification Assisted by Mouth Motion Capture
Publication: Visual features convey important information for automatic speech recognition (ASR), especially in noisy environments. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose motion capture markers were placed on speakers' faces to obtain lip-tracking data during speaking. Different parameterization strategies were tested...
-
A Device for Measuring Auditory Brainstem Responses to Audio
Publication: Standard ABR devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer by collecting and analyzing auditory brainstem responses (ABR). The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...
-
Secured wired BPL voice transmission system
Publication: Designing a secured voice transmission system is not a trivial task. Wired media, thanks to their reliability and resistance to mechanical damage, seem an ideal solution. The BPL (Broadband over Power Line) cable is resistant to electricity stoppage and partial damage of phase conductors, ensuring continuity of transmission in case of an emergency. It seems an appropriate tool for delivering critical data, mostly clear and understandable...
-
Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training
Publication: In the paper we tackle the bi-objective problem of optimizing execution time and power consumption for parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...
-
Examining Feature Vector for Phoneme Recognition / Analiza parametrów w kontekście automatycznej klasyfikacji fonemów
Publication: The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
Publication: The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...
-
Multimedia industrial and medical applications supported by machine learning
Publication: This article outlines a keynote paper presented at the Intelligent Decision Technologies conference, held as part of the KES Multi-theme Conference “Smart Digital Futures” organized in Rome on June 14–16, 2023. It briefly discusses projects related to traffic control using developed intelligent traffic signs and diagnosing the health of wind turbine mechanisms and multimodal biometric authentication for banking branches to provide...
-
Orken Mamyrbayev Professor
People: 1. Education: Higher. In 2001, graduated from the Abay Almaty State University (now Abay Kazakh National Pedagogical University), in the specialty: Computer science and computerization manager. 2. Academic degree: Ph.D. in the specialty "6D070300-Information systems". The dissertation was defended in 2014 on the topic: "Kazakh soileulerin tanudyn kupmodaldy zhuyesin kuru". Under my supervision, 16 masters, 1 dissertation...
-
Chirp Rate and Instantaneous Frequency Estimation: Application to Recursive Vertical Synchrosqueezing
Publication: This letter introduces new chirp rate and instantaneous frequency estimators designed for frequency-modulated signals. These estimators are first investigated from a deterministic point of view, then compared in terms of statistical efficiency. They are also used to design new recursive versions of the vertically synchrosqueezed short-time Fourier transform, using a previously published method (D. Fourer, F. Auger, and...
-
The Innovative Faculty for Innovative Technologies
Publication: A leaflet describing the Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. The Multimedia Systems Department describes its laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...
-
Ultrawideband transmission in physical channels: a broadband interference view
Publication: The superposition of multipath components (MPC) of an emitted wave, formed by reflections from limiting surfaces and obstacles in the propagation area, strongly affects communication signals. In the case of modern wideband systems, the effect should be seen as a broadband counterpart of classical interference which is the cause of fading in narrowband systems. This paper shows that in wideband communications, the time- and frequency-domain...
-
Examining Feature Vector for Phoneme Recognition
Publication: The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...
-
Cytokine IL6, but not IL-1β, TNF-α and NF-κB is increased in paediatric cancer patients
Publication: Cytokines are responsible for maintaining homeostasis as mediators of cell growth, differentiation, migration and apoptosis. They play a pivotal role in immune responses to inflammatory reactions. In oncological diseases, the cross-talk between cells of the immunological system and cells of the tumour microenvironment is led by cytokines. Also, the overproduction of cytokines may change the tumour microenvironment and stimulate tumour...
-
Quality Evaluation of Novel DTD Algorithm Based on Audio Watermarking
Publication: Echo cancellers typically employ a doubletalk detection (DTD) algorithm in order to keep the adaptive filter from diverging in the presence of a near-end speech signal or other disruptive sounds in the microphone signal. A novel doubletalk detection algorithm based on techniques similar to those used for audio signal watermarking was introduced by the authors. The application of the described DTD algorithm within acoustic echo cancellation...
-
Detection and localization of selected acoustic events in 3D acoustic field for smart surveillance applications
Publication: A method for automatically determining the position of selected sound events, such as speech signals and impulse sounds, in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on a modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...