The work was carried out by young scientist Alexandra Tatarinova
Content-based retrieval is a rapidly developing field of computer science. One of its pressing problems is searching spoken documents by a user's text query. How can the right material be found in a collection of audio files without listening to each one? Can it be done by entering an ordinary text query?
Alexandra Tatarinova, assistant professor of applied mathematics and computer science at Vyatka State University, explained how to do this.
"We proceeded from the hypothesis that the search should be performed not on the recognized text itself, but on its phoneme representation."
Together with her supervisor Dmitry Prozorov, professor of the Department of Radioelectronic Tools of Vyatka State University, the young scientist achieved this goal: they proposed a search method and a phonemic transcription algorithm based on multiply connected Markov chains. Now a user with a collection of audio files can enter a text query, and the system converts it into a phoneme representation in order to maximize search accuracy.
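The idea of matching a text query against audio at the phoneme level can be illustrated with a minimal sketch. It is not the researchers' method: instead of a transcription algorithm based on multiply connected Markov chains, it assumes a tiny hand-written rule table (`DIGRAPHS`, `SINGLE`), and instead of real recognizer output it uses made-up transcripts. All function names, file names, and rules here are hypothetical.

```python
# Toy grapheme-to-phoneme rules (hypothetical, for illustration only;
# a real system would use a trained transcription model).
DIGRAPHS = {"ph": "F", "ck": "K", "sh": "SH", "ch": "CH", "th": "TH"}
SINGLE = {"c": "K", "q": "K"}


def to_phonemes(text):
    """Convert text to a list of toy phoneme symbols."""
    text = text.lower()
    phonemes = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
            continue
        ch = text[i]
        if ch.isalpha():
            phonemes.append(SINGLE.get(ch, ch.upper()))
        i += 1  # spaces and punctuation are skipped
    return phonemes


def best_match_distance(query, doc):
    """Smallest edit distance between query and any contiguous part of doc."""
    prev = [0] * (len(doc) + 1)  # an empty query matches anywhere at cost 0
    for i in range(1, len(query) + 1):
        cur = [i]  # query prefix of length i against an empty span
        for j in range(1, len(doc) + 1):
            substitution = prev[j - 1] + (query[i - 1] != doc[j - 1])
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, substitution))
        prev = cur
    return min(prev)


def search(query_text, transcripts):
    """Rank documents by how closely their phonemes match the query."""
    query = to_phonemes(query_text)
    scored = [
        (name, best_match_distance(query, to_phonemes(text)))
        for name, text in transcripts.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Made-up recognizer output; "fone" stands in for a recognition error.
transcripts = {
    "a.wav": "the fone rang twice",
    "b.wav": "weather report for monday",
}
ranked = search("phone", transcripts)
print(ranked)
```

In this toy setup the query "phone" and the misrecognized word "fone" map to the same symbol sequence, so the first file is ranked on top even though the spellings differ. This is the kind of robustness that motivates searching on phonemes instead of recognized text.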
The study formed the basis of Alexandra Tatarinova's dissertation, and the latest results are reported in the article “Comparison Of Grapheme-to-Phoneme Conversions For Spoken Document Retrieval”, which was included in the proceedings of the IEEE EWDTS 2019 conference.
The method and algorithm developed by the Vyatka State University scientists can be used to build systems that extract relevant information for commercial companies and government institutions, from consumer complaints about goods and services to evidence that confidential information has been disclosed.
Today, Alexandra Tatarinova is also actively engaged in research on Grammatical Error Correction. This primarily means eliminating violations of grammatical agreement between words in a sentence, a task for which deep neural networks can be used.
"In effect, we must train a neural network to find and correct inconsistencies in sentences that result from typos or from the user's poor knowledge of Russian. We do this with a neural network based on the Transformer architecture, whose self-attention mechanism allows it to better learn the connections between words within a sentence," explained A.G. Tatarinova.
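To show what a self-attention mechanism computes, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification, not the researchers' model: a real Transformer applies learned query, key, and value projections and uses several attention heads, while this sketch assumes identity projections and hand-picked toy vectors.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Each word vector is used as its own query, key, and value
    (an assumption made here to keep the sketch short).
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # The output is a weighted mix of all word vectors in the sentence.
        outputs.append([
            sum(w * value[c] for w, value in zip(row, vectors))
            for c in range(dim)
        ])
    return outputs, weights


# Three toy "word" vectors; the first and third are identical.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(sentence)
print(weights[0])
```

Each row of `weights` says how strongly one word attends to every word in the sentence. In this toy input the first word gives more weight to the identical third word than to the dissimilar second one, which is how related words, such as those that must agree grammatically, can influence each other's representations.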
The research conducted by young scientists is fully consistent with global trends. It is important that these topics are reflected in classes, introducing students to the latest scientific achievements. This applies in particular to the courses "Mathematical Models for Pattern Recognition" and "Computer Vision", which Alexandra Gennadyevna teaches at the Institute of Mathematics and Information Systems of Vyatka State University.
In the figure: diagram of a system for searching spoken documents by text query.