Khokhlov Y., Prisyach T., Mitrofanov A., Dutov D., Agafonov I., Timofeeva T., Romanenko A., Korenevsky M. Classification of Room Impulse Responses and its application for channel verification and diarization. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2024. pp. 274-278.
Prisyach T.N., Khokhlov Y.Y., Korenevsky M., Mitrofanov A., Timofeeva T.N., Mitrofanova M.M., Novoselov S.A., Romanenko A.N. STCON System for the CHiME-7 Challenge. International Workshop on Speech Processing in Everyday Environments (CHiME 2023). 2023. pp. 87-92.
Andrusenko A., Nasretdinov R., Romanenko A. UCONV-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition. 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2023. pp. 1-5.
Andrusenko A., Romanenko A. Improving out of vocabulary words recognition accuracy for an end-to-end Russian speech recognition system. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2022. Vol. 22. No. 6(142). pp. 1143-1149.
Mitrofanov A., Korenevskaya M., Podluzhny I., Khokhlov Y., Laptev A., Andrusenko A., Ilin A., Korenevsky M., Medennikov I., Romanenko A. LT-LM: A Novel Non-Autoregressive Language Model for Single-Shot Lattice Rescoring. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2021. Vol. 3. pp. 2053-2057.
Medennikov I., Korenevsky M., Prisyach T., Khokhlov Y., Korenevskaya M., Sorokin I., Timofeeva T.N., Mitrofanov A., Andrusenko A., Podluzhny I., Laptev A., Romanenko A. Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. pp. 274-278.
Medennikov I., Korenevsky M., Prisyach T., Khokhlov Y., Korenevskaya M., Sorokin I., Timofeeva T., Mitrofanov A., Andrusenko A., Laptev A., Romanenko A. The STC System for the CHiME-6 Challenge. 6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020). 2020. pp. 36-41.
Khokhlov Y., Zatvornitskiy A., Medennikov I., Sorokin I., Prisyach T., Romanenko A., Mitrofanov A., Bataev V., Andrusenko A.I., Korenevskaya M., Petrov O. R-vectors: New Technique for Adaptation to Room Acoustics. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2019. pp. 1243-1247.
Medennikov I., Khokhlov Y., Romanenko A., Sorokin I., Mitrofanov A., Bataev V., Andrusenko A.I., Korenevskaya M., Petrov O., Zatvornitskiy A. The STC ASR System for the VOiCES from a Distance Challenge 2019. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2019. pp. 2453-2457.
Medennikov I., Khokhlov Y., Romanenko A., Popov D., Tomashenko N., Sorokin I., Zatvornitskiy A. An investigation of mixup training strategies for acoustic models in ASR. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2018. pp. 2903-2907.
Medennikov I., Romanenko A., Sorokin I., Popov D., Khokhlov Y., Prisyach T.N., Malkovsky N., Bataev V., Astapov S., Korenevsky M., Zatvornitskiy A. The STC System for the CHiME 2018 Challenge. CHiME 2018 Workshop on Speech Processing in Everyday Environments. 2018. pp. 1-5.
Romanenko A.N., Matveev Yu.N., Minker W. Knowledge transfer in the task of automatic recognition of Russian speech in telephone conversations. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2018. Vol. 18. No. 2(114). pp. 236-242.
Romanenko A.N. Feature combination in the task of training neural network acoustic models. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2018. Vol. 18. No. 2(114). pp. 350-352.
Medennikov I., Romanenko A., Prudnikov A., Mendelev V., Khokhlov Y.Y., Korenevsky M., Tomashenko N., Zatvornitskiy A. Acoustic Modeling In The STC Keyword Search System For OpenKWS 2016 Evaluation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2017. Vol. 10458. pp. 76-86.
Romanenko A.N. Using word fragments to improve the quality of search for out-of-vocabulary tokens. Альманах научных работ молодых ученых Университета ИТМО [Almanac of Scientific Works of Young Scientists of ITMO University]. 2017. Vol. 3. pp. 161-163.
Khokhlov Y., Tomashenko N., Medennikov I., Romanenko A. Fast and Accurate OOV Decoder on High-Level Features. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2017. pp. 2884-2888.
Khokhlov Y., Medennikov I., Romanenko A., Mendelev V., Korenevsky M., Prudnikov A., Tomashenko N., Zatvornitsky A. The STC Keyword Search System For OpenKWS 2016 Evaluation. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2017. pp. 3602-3606.
Improving the quality of search for tokens not contained in the recognition vocabulary
Romanenko A.N. Development of an automatic speech recognition system for the Egyptian dialect of Arabic in the telephone channel. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2016. Vol. 16. No. 4(104). pp. 703-709.
Romanenko A., Mendelev V. Speaker-dependent bottleneck features for Egyptian Arabic speech recognition. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2016. Vol. 9811. pp. 620-626.
Recognition of spontaneous Arabic speech in the telephone channel
Korenevsky M., Romanenko A. Feature space VTS with phase term modeling. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2016. Vol. 9811. pp. 312-320.
Using a simplified randomized stochastic approximation algorithm to optimize decoder parameters in the speech recognition task
Romanenko A.N. Investigation of a mixture of training speech corpora in the task of spontaneous speech recognition. Альманах научных работ молодых ученых Университета ИТМО [Almanac of Scientific Works of Young Scientists of ITMO University]. 2015. Vol. 3. pp. 54-56.
Zatvornitskiy A., Romanenko A.N., Korenevsky M. Proportional-Integral-Derivative Control of Automatic Speech Recognition Speed. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. Vol. 8773. No. LNAI. pp. 360-367.
Merkin N., Medennikov I.P., Romanenko A.N., Zatvornitskiy A. Controlling the uncertainty area in the real time LVCSR application. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. Vol. 8773. No. LNAI. pp. 153-160.
Romanenko A.N., Zatvornitsky A., Medennikov I.P. Simplified Simultaneous Perturbation Stochastic Approximation for the optimization of free decoding parameters. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. Vol. 8773. No. LNAI. pp. 402-409.