[1] LIU Yue, LIN Jun, YOU Jun. Application and Development of Automatic Speech Recognition in Vehicle Field[J]. Control and Information Technology (formerly High Power Converter Technology), 2019, (02): 1-6. [doi:10.13889/j.issn.2096-5427.2019.02.001]

Application and Development of Automatic Speech Recognition in Vehicle Field

Control and Information Technology (formerly High Power Converter Technology) [ISSN: 2095-3631 / CN: 43-1486/U]

Volume:
Issue:
2019, No. 02
Pages:
1-6
Section:
Reviews and Commentary
Publication date:
2019-04-05

Article Information

Title:
Application and Development of Automatic Speech Recognition in Vehicle Field
Article ID:
2096-5427(2019)02-0001-06
Author(s):
LIU Yue, LIN Jun, YOU Jun
(CRRC Zhuzhou Institute Co., Ltd., Zhuzhou, Hunan 412001, China)
Keywords:
automatic speech recognition; deep learning; human-computer interaction; automobile; rail transit
CLC number:
TP391.4
DOI:
10.13889/j.issn.2096-5427.2019.02.001
Document code:
A
Abstract:
Speech recognition, one of the key technologies of artificial intelligence, has attracted wide attention, and the emergence of deep learning has made its commercialization possible. This paper analyzes the principles, basic framework, and research history of speech recognition, and reviews the technical breakthroughs made at each stage of development in acoustic models and language models, the two most important components of the field. On the application side, it focuses on the current status and future trends of speech recognition in passenger cars. It also describes, in light of the particular requirements of rail transit, the applications already under way in that field and outlines future research directions.

Memo

Received: 2018-07-17
About the author: LIU Yue (born 1983), female, PhD, engineer; works mainly on speech processing.
Funding: National Key Research and Development Program of China (2018YFB1201602)
Last Update: 2019-04-19