Mitrofanov A., Prisyach T., Timofeeva T., Novoselov S., Korenevsky M., Khokhlov Y., Akulov A., Anikin A., Khalili R., Lezhenin I., Melnikov A., Miroshnichenko D., Mamaev N., Odegov I., Rudnitskaya O., Romanenko A. STCON System for the CHiME-8 Challenge. 8th International Workshop on Speech Processing in Everyday Environments (CHiME 2024). 2024. pp. 13-17.
Khokhlov Y., Prisyach T., Mitrofanov A., Dutov D., Agafonov I., Timofeeva T., Romanenko A., Korenevsky M. Classification of Room Impulse Responses and its application for channel verification and diarization. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2024. pp. 274-278.
Andrusenko A., Nasretdinov R., Romanenko A. UCONV-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition. 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2023. pp. 1-5.
Prisyach T., Khokhlov Y., Korenevsky M., Mitrofanov A., Timofeeva T., Odegov I., Nasretdinov R., Lezhenin I., Miroshnichenko D., Karelin A., Mitrofanova M., Svechnikov R., Novoselov S., Romanenko A. STCON System for the CHiME-7 Challenge. 7th International Workshop on Speech Processing in Everyday Environments (CHiME 2023). 2023. pp. 87-92.
Andrusenko A., Romanenko A. Improving out of vocabulary words recognition accuracy for an end-to-end Russian speech recognition system. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2022. Vol. 22. No. 6(142). pp. 1143-1149.
Mitrofanov A., Korenevskaya M., Podluzhny I., Khokhlov Y., Laptev A., Andrusenko A., Ilin A., Korenevsky M., Medennikov I., Romanenko A. LT-LM: A Novel Non-Autoregressive Language Model for Single-Shot Lattice Rescoring. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2021. Vol. 3. pp. 2053-2057.
Medennikov I., Korenevsky M., Prisyach T., Khokhlov Y., Korenevskaya M., Sorokin I., Timofeeva T.N., Mitrofanov A., Andrusenko A., Podluzhny I., Laptev A., Romanenko A. Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2020. pp. 274-278.
Medennikov I., Korenevsky M., Prisyach T., Khokhlov Y., Korenevskaya M., Sorokin I., Timofeeva T., Mitrofanov A., Andrusenko A., Laptev A., Romanenko A. The STC System for the CHiME-6 Challenge. 6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020). 2020. pp. 36-41.
Khokhlov Y., Zatvornitskiy A., Medennikov I., Sorokin I., Prisyach T., Romanenko A., Mitrofanov A., Bataev V., Andrusenko A.I., Korenevskaya M., Petrov O. R-vectors: New Technique for Adaptation to Room Acoustics. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2019. pp. 1243-1247.
Medennikov I., Khokhlov Y., Romanenko A., Sorokin I., Mitrofanov A., Bataev V., Andrusenko A.I., Korenevskaya M., Petrov O., Zatvornitskiy A. The STC ASR System for the VOiCES from a Distance Challenge 2019. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2019. pp. 2453-2457.
Romanenko A.N., Matveev Yu.N., Minker W. Knowledge transfer in automatic recognition of Russian speech in telephone conversations. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2018. Vol. 18. No. 2(114). pp. 236-242.
Medennikov I., Khokhlov Y., Romanenko A., Popov D., Tomashenko N., Sorokin I., Zatvornitskiy A. An investigation of mixup training strategies for acoustic models in ASR. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2018. pp. 2903-2907.
Romanenko A.N. Feature fusion for training neural network acoustic models. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2018. Vol. 18. No. 2(114). pp. 350-352.
Medennikov I., Romanenko A., Sorokin I., Popov D., Khokhlov Y., Prisyach T.N., Malkovskiy N., Bataev V., Astapov S., Korenevsky M., Zatvornitskiy A. The STC System for the CHiME 2018 Challenge. CHiME 2018 Workshop on Speech Processing in Everyday Environments. 2018. pp. 1-5.
Romanenko A.N. Using word fragments to improve the search for tokens not contained in the vocabulary. Альманах научных работ молодых ученых Университета ИТМО [Almanac of Scientific Works of Young Scientists of ITMO University]. 2017. Vol. 3. pp. 161-163.
Khokhlov Y., Medennikov I., Romanenko A., Mendelev V., Korenevsky M., Prudnikov A., Tomashenko N., Zatvornitsky A. The STC Keyword Search System For OpenKWS 2016 Evaluation. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2017. pp. 3602-3606.
Medennikov I., Romanenko A., Prudnikov A., Mendelev V., Khokhlov Y.Y., Korenevsky M., Tomashenko N., Zatvornitskiy A. Acoustic Modeling In The STC Keyword Search System For OpenKWS 2016 Evaluation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2017. Vol. 10458. pp. 76-86.
Khokhlov Y., Tomashenko N., Medennikov I., Romanenko A. Fast and Accurate OOV Decoder on High-Level Features. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2017. pp. 2884-2888.
Improving the search for tokens not contained in the recognition vocabulary
Korenevsky M., Romanenko A. Feature space VTS with phase term modeling. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2016. Vol. 9811. pp. 312-320.
Romanenko A.N. Development of an automatic speech recognition system for the Egyptian dialect of Arabic in the telephone channel. Научно-технический вестник информационных технологий, механики и оптики [Scientific and Technical Journal of Information Technologies, Mechanics and Optics]. 2016. Vol. 16. No. 4(104). pp. 703-709.
Romanenko A., Mendelev V. Speaker-dependent bottleneck features for Egyptian Arabic speech recognition. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2016. Vol. 9811. pp. 620-626.
Romanenko A.N. A study of mixing training speech corpora for spontaneous speech recognition. Альманах научных работ молодых ученых Университета ИТМО [Almanac of Scientific Works of Young Scientists of ITMO University]. 2015. Vol. 3. pp. 54-56.
Using a simplified randomized stochastic approximation algorithm to optimize decoder parameters in speech recognition
Zatvornitskiy A., Romanenko A.N., Korenevsky M. Proportional-Integral-Derivative Control of Automatic Speech Recognition Speed. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. Vol. 8773. No. LNAI. pp. 360-367.
Merkin N., Medennikov I.P., Romanenko A.N., Zatvornitskiy A. Controlling the uncertainty area in the real time LVCSR application. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. Vol. 8773. No. LNAI. pp. 153-160.
Romanenko A.N., Zatvornitsky A., Medennikov I.P. Simplified Simultaneous Perturbation Stochastic Approximation for the optimization of free decoding parameters. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. Vol. 8773. No. LNAI. pp. 402-409.