Contact Us
Contact:Ms Xie
Tel:0755-27176355
E-Mail:fae@twsdy.com;jelly@jmk.com.cn
Add:No. 3 Elevator, 6th Building, 4th Floor, Huashun Industrial Park, Gongming Town, Guangming New District, Shenzhen City

Understanding the structure and trends of China's speech and semantics industry

From:Shenzhen jie mei ke security technology co., LTD    Date:2019-06-01
Following iFLYTEK and SinoVoice, a younger generation of companies such as AISpeech, Unisound, and Mobvoi has emerged in the industry. Beyond traditional sectors such as education, customer service, and telecommunications, they have opened up new applications in the smart home, healthcare, and hardware.
At the same time, natural language processing (NLP), an important part of human-computer interaction technology, has also contributed. Siri's launch set the precedent for voice interaction. It not only spawned a number of speech and semantics startups but also prompted Baidu, Sogou, and other large internet companies to invest in speech and semantic technology.
Because NLP and semantic understanding let machines understand people's intentions and needs and return the corresponding content to the user, they are widely used in service industries to reduce labor costs and improve operating efficiency.
So which technologies does China's speech and semantics industry mainly involve? How mature are they, and what problems remain? What are the application fields, players, and business models? What will the industry's structure and future trends look like? This article sets out to answer these questions.
Part One, technology: speech recognition and NLP are still immature
Speech semantics includes three main technologies: speech synthesis, speech recognition and Natural Language Processing (NLP).
Speech synthesis was developed first and is already in fairly common use. Apart from synthesized voices still sounding somewhat mechanical, it has no major problems. Speech recognition accuracy rose dramatically after convolutional neural networks (CNN) were applied in 2012, and the technology is now widely used on both the consumer (C) side and the business (B) side, but results and user experience are still not ideal. NLP has long been applied in search engines, but in human-computer interaction it is still limited to shallow processing.
The "robustness" problem in speech recognition is significant
In biology, the term "robustness" refers to the ability of a system to maintain its characteristic behavior in the presence of disturbances or uncertainties. The same problem exists in speech recognition.
The full speech recognition process includes many stages, such as speech signal processing, silence removal, acoustic feature extraction, and pattern matching. Because speech signals are diverse and complex, a system performs satisfactorily only under certain conditions. In real scenarios, with far-field speech, dialects, noise, and so on, accuracy drops sharply. The 97% accuracy rate the industry generally claims today comes mostly from manual evaluation and can be reached only for near-field recognition in a quiet room.
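The article does not say how accuracy figures of this kind are measured. One common convention, assumed here purely for illustration, is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the reference length, with accuracy often quoted as 1 minus WER. The function names and the sample sentences below are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to hyp prefixes
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance from ref[:i] to an empty hyp
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = edit distance over words / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "turn on the living room light"
    clean = "turn on the living room light"  # hypothetical quiet-room output
    noisy = "turn on the living light"       # hypothetical noisy output
    print(word_error_rate(reference, clean))
    print(word_error_rate(reference, noisy))
```

Under this convention a single dropped word in a six-word command already pushes the error rate to about 17%, which is one way to see why a figure obtained in quiet, near-field conditions may not carry over to noisy or far-field use.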
Solving the robustness problem requires optimization on two fronts: technology and product. On the technology side, continued investment is needed in speech enhancement, microphone arrays, and multi-speaker separation, combined with back-end semantics to improve the understanding of context and thereby improve recognition. On the product side, the design needs to be optimized, for example by using further interaction with the user to make recognition more accurate.
Semantic analysis is still shallow processing
NLP technology includes three levels: lexical analysis, syntactic analysis, and semantic analysis. The three are progressive, each level building on and containing the one before it.
Word sense disambiguation is the biggest bottleneck in NLP. After segmenting the text, tagging parts of speech, and identifying the words, the machine still has to work out what each word means. Because words are polysemous, people resolve the meaning using existing knowledge and context, but this is hard for a machine. Syntactic analysis of the sentence can help the machine understand word meaning and semantics to some extent, but in practice the results are not ideal.
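To make the disambiguation problem concrete, here is a minimal sketch of a simplified Lesk-style approach, a classic baseline that picks the sense whose dictionary gloss shares the most words with the surrounding context. This is not a method the article names. The tiny sense inventory, the stopword list, and the function names are all invented for illustration, and a real system would use a full lexicon and far richer context.

```python
# Hypothetical two-sense inventory for one ambiguous word (illustration only).
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_side": "the sloping land beside a river or lake",
    }
}

# Small hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "at", "or", "and", "that", "from", "she", "they"}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, context):
    """Return the sense whose gloss overlaps most with the context words.

    Ties are resolved in favor of the sense listed first.
    """
    context_set = content_words(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context_set & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(disambiguate("bank", "she deposits money at the bank"))
    print(disambiguate("bank", "they fished from the bank of the river"))
```

The sketch works only when the context happens to repeat words from a gloss. With no overlap at all it falls back to the first listed sense, which hints at why disambiguation based on surface cues remains unreliable.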
At present, machine understanding of a sentence reaches only the level of semantic role labeling, that is, marking the active and passive relationships within a sentence. This is shallow semantic analysis, and the technology is relatively mature. For machines to understand human language better and interact naturally, they will still need to rely on deep learning, trained on large-scale data. In practical applications, product design can also be used to reduce ambiguity in question-and-answer content and so improve the user experience.
Because AI technology relies heavily on data, technical progress and industrialization reinforce each other: projects improve the technology's performance and user experience, which promotes industrial application, and the data and feedback from real-world use in turn drive technical breakthroughs. So what problems arise when speech and semantic technology is applied in industry?
Part Two, applications: improving experience on the C side, improving efficiency on the B side
Through question answering and chat-style services, speech and semantic technology is applied across a wide range of products and industries. These applications can be divided simply into two directions: the C side (consumers) and the B side (businesses).
Figure: Applications of speech recognition/NLP technology
C-side applications mainly target three scenarios: mobile devices, cars, and the smart home, where they change the original mode of human-computer interaction. B-side applications serve the needs of vertical industries, either improving labor efficiency, for example by helping doctors enter electronic medical records, or replacing human work, for example by answering most simple, repetitive customer service questions. Because the two sides solve different problems, they face different challenges.
C-side applications: changing the mode of interaction, with demand and experience as the key
Voice gives the C side a new way to interact, but its adoption and popularity depend on the specific scenario and on demand.
Copyright © SDY Electronics Co., Ltd