History & Current State of AI in China: Industry Breakdown & Analysis
1. What is AI?
Artificial intelligence (AI) as a term was proposed in 1956 by John McCarthy, then an assistant professor at Dartmouth College. However, there has never been a unified definition of artificial intelligence; different scholars and researchers have put forward a variety of definitions according to different contexts and perspectives.
Based on years of investment experience and field due diligence on AI projects, the author believes that artificial intelligence refers to the use of machines in place of humans to perform functions such as cognition, recognition, analysis, and decision-making. Its essence is the simulation of the information processes of human consciousness and thinking. Measuring the ability of an AI system involves three aspects: computing ability, perception ability, and cognitive ability.
Computing ability refers to the machine’s fast calculation and memory storage capabilities; here computers have far surpassed humans. Perception ability solves the problem of machines hearing and seeing, and generally covers vision, hearing, and touch; at the technical level, voice recognition, image recognition, and similar technologies are generally considered to belong to perceptual intelligence. Cognitive ability solves the problem of machine understanding, or, put simply, being able to “understand and think”.
2. AI Market in China
(1) At the technical level, computer vision is the current hot spot, and the bottom layer such as chips and algorithms will be the future direction
From the above figure, we can see that the domestic AI industry has developed rapidly in recent years, with a month-on-month growth rate of about 45%. It is estimated that by 2020 the entire domestic AI market will reach 71 billion yuan. Having covered overall market capacity, the author now looks at the current technology structure of the AI market:
The main technical applications of each technology sector are as follows:
Voice: voice recognition, voice synthesis, voice interaction, voice evaluation, man-machine dialogue, voiceprint recognition;
Computer vision: biometrics (face recognition, iris recognition, fingerprint recognition, vein recognition), affective computing, emotion recognition, expression recognition, behavior recognition, gesture recognition, human body recognition, video content recognition, object and scene recognition, machine vision, mobile vision, OCR, handwriting recognition, text recognition, image processing, image recognition, pattern recognition, eye tracking, human-computer interaction, SLAM, spatial recognition, 3D scanning, 3D reconstruction;
Natural language processing: natural language interaction, natural language understanding, semantic understanding, machine translation, text mining (semantic analysis, semantic computing, classification, clustering), information extraction, human-computer interaction;
ML/DL algorithms and platforms: machine learning, deep learning, algorithm platforms;
Basic hardware: chips, high-definition image transmission equipment, lidar, sensors, servers.
From the pie chart, it can be seen that the current domestic artificial intelligence technology is mainly in perceptual intelligence. Perceptual intelligence is advancing by leaps and bounds, and the technological maturity is relatively high. For cognitive intelligence (natural language understanding, etc.), further development is still needed.
(2) At the application level, “AI+” and intelligent robots dominate
Through the above figure, we can find that at the application level, AI+ applications account for the largest proportion, reaching 40%, followed by intelligent robots, reaching 27%. The specific applications of each industry are as follows:
Intelligent robots (including solutions): industrial robots (focused on production processes such as handling, welding, assembly, palletizing, spraying, etc.), industry service robots (applied in banks, restaurants, hotels, shopping malls, exhibition halls, hospitals, logistics), personal/home robots (virtual assistants, emotional companion robots, children’s robots, educational robots, housework robots (floor sweeping, window cleaning, etc.), home security robots, vehicle-mounted robots);
Intelligent driving (including solutions): intelligent driving, unmanned driving, autonomous driving, assisted driving, advanced driver assistance systems (ADAS), lidar, ultrasonic radar, millimeter wave radar, GPS positioning, high-precision maps, automotive chips, human-vehicle interaction, Internet of Vehicles;
Drones (including solutions): consumer drones (entertainment, aerial photography), industrial drones (agriculture, forestry, power, logistics, security, etc.);
Big data and data services: data visualization, data collection, data cleaning, data mining, data solutions.
In terms of AI+, the current combination of AI and specific industries or scenarios is as follows, among which the main scenarios are finance, manufacturing, e-commerce, and medical care.
Data source: 2017 CSDN China Developer Survey
(3) At the financing level, total financing has increased year by year, and the average deal size has gradually increased
Data source: IT Orange 2012–2017 AI financing trends
Since 2012, there have been 1,354 companies in the AI field in China, with 1,353 investment events and a total investment of 144.8 billion yuan. In 2012, there were 26 AI investment events in China, with an investment amount of 600 million yuan. By 2017, there were 334 investment events, and the total investment exceeded 55 billion yuan, roughly ninety times the 2012 figure. Compared with 2016, the number of investment events in 2017 declined, but the total investment rose sharply, so capital’s enthusiasm for AI remains evident.
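The growth multiple and the change in average deal size can be checked directly from the figures quoted above. This is only a back-of-the-envelope sketch: the 2017 total is given as a lower bound ("exceeded 55 billion yuan"), and the variable names are illustrative.

```python
# Figures quoted in the paragraph above (amounts in yuan, counts are investment events).
events_2012, amount_2012 = 26, 600e6
events_2017, amount_2017 = 334, 55e9   # "exceeded 55 billion", so treat as a lower bound

growth_multiple = amount_2017 / amount_2012
avg_deal_2012 = amount_2012 / events_2012
avg_deal_2017 = amount_2017 / events_2017

print(f"2017 vs 2012 total investment: {growth_multiple:.1f}x")
print(f"average deal size 2012: {avg_deal_2012 / 1e6:.1f} million yuan")
print(f"average deal size 2017: {avg_deal_2017 / 1e6:.1f} million yuan")
```

On these numbers the average deal grows from roughly 23 million to roughly 165 million yuan, which is consistent with the claim that single deal sizes have increased.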
3. History of Development of AI
1. History of AI
From its conception in 1956 to the large-scale explosion in 2016, artificial intelligence has gone through three waves of ups and downs over 60 years. Around 1970 and around 2000, its development fell into troughs.
At those times, algorithms and technology could not meet the needs of artificial intelligence; in the end, breakthroughs depended on the evolution of algorithms and the improvement of computing power.
Applications of artificial intelligence did not meet people’s expectations, and governments reduced their investment.
Currently, artificial intelligence is in its third boom. Beyond improvements in technology and algorithms, the biggest feature of this boom is that, through the combination of deep learning and big data, artificial intelligence has found real application scenarios in multiple fields, has been combined with specific business scenarios, and has begun to deliver results in some industries.
2. Three key factors for AI Development
If a formula is used to summarize the factors that affect the development of artificial intelligence, it can be written as AI = computing power + algorithm + data. Regarding the relationship among the three, the well-known AI expert Andrew Ng (Wu Enda) has a famous analogy: developing artificial intelligence is like launching a satellite with a rocket, which needs a powerful engine and sufficient fuel. If the fuel is not enough, the rocket cannot push the satellite into a proper orbit; if the engine thrust is not enough, the rocket cannot even take off. Here the algorithm model is the engine, a high-performance computer is the tool for building the engine, and massive amounts of data are the fuel.
(1) Computing power, mainly including chip + supercomputer + cloud computing
FLOPS stands for floating-point operations per second, the number of floating-point operations a computer performs each second. It is a measure of computing power: the more floating-point operations performed per second, the stronger the computing power. For example, 1 GFLOPS (gigaFLOPS) = 1 billion (10⁹) floating-point operations per second.
Note: Floating-point computing power:
1 MFLOPS (megaFLOPS) = 1 million (10⁶) floating-point operations per second
1 GFLOPS (gigaFLOPS) = 1 billion (10⁹) floating-point operations per second
1 TFLOPS (teraFLOPS) = 1 trillion (10¹²) floating-point operations per second
1 PFLOPS (petaFLOPS) = 1 quadrillion (10¹⁵) floating-point operations per second
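The unit relationships above amount to powers of ten, which a short conversion helper makes concrete. The function names here are hypothetical and exist only for this illustration.

```python
# Decimal (SI) prefixes for the FLOPS units listed above.
FLOPS_UNITS = {"MFLOPS": 10**6, "GFLOPS": 10**9, "TFLOPS": 10**12, "PFLOPS": 10**15}

def to_flops(value, unit):
    """Convert a value in the given unit to raw floating-point operations per second."""
    return value * FLOPS_UNITS[unit]

def format_flops(flops):
    """Express a raw FLOPS figure in the largest listed unit that keeps it at or above 1."""
    for unit, scale in sorted(FLOPS_UNITS.items(), key=lambda kv: kv[1], reverse=True):
        if flops >= scale:
            return f"{flops / scale:g} {unit}"
    return f"{flops:g} FLOPS"

print(format_flops(to_flops(1920, "GFLOPS")))  # 1920 GFLOPS expressed in TFLOPS
```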
In the case of traditional neural networks, due to the limitations of technical capabilities, artificial intelligence could not be specifically applied and implemented in many industries. After the emergence of deep neural networks, the technical capabilities of artificial intelligence have been rapidly improved.
Take computer vision as an example: its main recognition method has undergone a major change, and the self-learning state has become the mainstream of visual recognition. The machine automatically summarizes object features from a massive database, and then recognizes objects according to the feature law. The accuracy of image recognition has also been greatly improved, from 70%+ to 95%.
In 2017, the global population was 7.5 billion, and one person generates approximately 52 GB of information in a year. Although each of us is very small as an individual, the development of artificial intelligence is inseparable from everyone’s contribution, because each of us is delivering fuel to AI all the time.
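To get a feel for the scale of this "fuel", the two figures above can simply be multiplied. This is a rough estimate that assumes the per-person figure applies uniformly to the whole population and uses decimal units (1 EB = 10⁹ GB).

```python
population = 7.5e9          # people, the 2017 figure from the text
gb_per_person_year = 52     # GB per person per year, from the text

total_gb = population * gb_per_person_year
total_eb = total_gb / 1e9   # decimal units: 1 exabyte = 10^9 gigabytes

print(f"about {total_eb:.0f} EB of information per year")
```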
4. AI Industry Pyramid
From the bottom to the application layer, artificial intelligence can be roughly divided into the technical support layer, the basic application layer and the solution integration layer. The author will explain each piece of content below.
1. Technology Infrastructure
1) Chip classification
A chip generally refers to the carrier of an integrated circuit, cut from a wafer. Chips can be divided into many types according to function: some are responsible for audio and video processing, some for image processing, and some for complex calculations. Algorithms need chips in order to run, and different scenarios and technologies place different performance requirements on the chip. At present, the chips most people are familiar with are the CPU and the GPU.
CPU: designed for high versatility and strong logic, and biased toward cognitive applications.
The CPU needs strong versatility to process many different data types; it is extremely limited in large-scale parallel computing and better at logic control. Its core principle: programs are stored and executed sequentially.
GPU: designed for high throughput and high concurrency, and biased toward perception applications.
Unlike general-purpose processors, GPUs are good at large-scale concurrent computing, which is exactly what workloads such as password cracking need.
The difference between CPU and GPU:
For example, imagine a mathematics professor competing against 100 elementary school students. The first round is basic arithmetic, 100 questions. The professor takes the paper and works through the questions one by one, while the hundred pupils each take one question and work on it separately. When the professor has just started the fifth question, the pupils hand in their papers together, so the pupils crush the professor in the first round.
The second round is a proof question. There is only one question; after the professor has solved it, the 100 elementary school students still don’t know where to begin…
In the second round, the professor crushes the hundred elementary school students. This is a simple comparison of CPU and GPU.
High versatility: besides basic arithmetic, there can also be proof questions, geometry, calculus, etc.
Strong logic: in a proof question, each step follows from the previous one, and a single line taken in isolation is meaningless.
High concurrency: 100 questions at once.
Because the CPU architecture requires a lot of space for storage units and control units, the computing units occupy only a small part of it, so the CPU is extremely limited in large-scale parallel computing and better at logic control. The GPU is just the opposite, but it cannot work alone and must be controlled by the CPU. The CPU can work alone to handle complex logical operations and different data types, and when a large amount of uniformly typed data needs processing, it can call on the GPU for parallel computing.
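The professor-and-pupils comparison can be sketched in code as one worker handling independent problems in sequence versus many workers each taking one. This is only an analogy under stated assumptions: Python threads here illustrate the division of labor, not real GPU hardware parallelism, and the function names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def solve(problem):
    """Solve one independent arithmetic problem given as (a, op, b)."""
    a, op, b = problem
    return OPS[op](a, b)

def professor(problems):
    """CPU-like: one capable worker goes through the problems one by one."""
    return [solve(p) for p in problems]

def pupils(problems, workers=100):
    """GPU-like: many simple workers each take one problem; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(solve, problems))

# 100 independent questions; the divisor i + 1 is never zero.
problems = [(i, "+-*/"[i % 4], i + 1) for i in range(100)]
sequential_answers = professor(problems)
parallel_answers = pupils(problems)
print(sequential_answers == parallel_answers)
```

Both approaches give the same answers; the difference lies only in how the work is distributed. This split works because the 100 questions are independent, whereas a single proof whose steps depend on each other cannot be divided this way.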
2) The upstream and downstream chip industry chain
Different chips have different functions and values under different algorithms and application scenarios. The reason is mainly related to the structure of integrated circuit design.
Wafer: The chip is a semiconductor whose main material is silicon. As the manufacturing process gets smaller and thinner, it saves material, and more chips can be made per unit of material.
Professional packaging and testing: Packaging materials include plastics, ceramics, glass, and metals. After packaging is completed, the chip enters the testing stage, where it is confirmed whether the packaged IC operates normally before it is shipped to the assembly plant.
In the chip industry chain, the closer to the upstream, the higher the added value, the higher the technical threshold, and the higher the efficiency of capital investment. Currently, giants such as Intel, IBM, and Samsung cover the entire upstream and downstream chip process and have strong control over the whole industrial chain. By contrast, the domestic ZTE Group procures a large number of core components for its communication equipment and mobile phones from abroad. Once trade friction arises, the supply is easily cut off by foreign countries, leaving it under the control of others.
3) In the future, AI custom chips must be the trend
When any product or business model first appears and shows significant economic value in its industry, it is usually generalized first and then verticalized; the advantages of verticalized services enhance the core competitiveness of the whole product and business model. As described above, both the CPU and the GPU are relatively general-purpose chips. However, with the rapid development of the industry, people have increasingly specific requirements for chips, and a universal tool will never be as efficient as a special-purpose one.
As the field of artificial intelligence is a data-intensive field, traditional data processing technology cannot meet the processing requirements of high-intensity parallel data. In order to solve this problem, after CPU and GPU, NPU, FPGA, DSP and other chips specifically for AI have appeared one after another.
TPU: a chip developed to accelerate deep learning computation
Representative company: Google
Most machine learning and image processing algorithms originally ran on GPUs and FPGAs (semi-customized chips), but both are still general-purpose chips, so they cannot be closely adapted to machine learning algorithms in terms of performance and power consumption. Compared with CPUs and GPUs of the same period, the TPU can provide a 15–30 times performance improvement and a 30–80 times efficiency (performance per watt) improvement.
NPU: a neural network processor, which uses circuits to simulate the structure of human neurons and synapses
Representative company: Cambricon (Cambrian)
Like GPUs for graphics-related calculations and ISPs for imaging-related calculations, NPUs are processors customized to perform AI-related calculations efficiently. NPU performance has reached 1.92 TFLOPS; computation on the NPU is 25 times faster than on the CPU, with 50 times the energy efficiency.
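The NPU figures above imply a couple of further numbers, worked out below. This is a sketch that assumes the 25x speedup and the 50x energy-efficiency gain were measured on the same workload and that "energy efficiency" means performance per watt; the text states neither assumption explicitly.

```python
npu_tflops = 1.92        # NPU throughput, from the text
speedup_vs_cpu = 25      # NPU is 25x faster than the CPU, from the text
efficiency_gain = 50     # 50x energy efficiency, from the text

# Assumption: the speedup compares throughput on the same workload.
implied_cpu_gflops = npu_tflops * 1000 / speedup_vs_cpu

# efficiency gain = speedup / power ratio, so power ratio = speedup / efficiency gain
power_ratio = speedup_vs_cpu / efficiency_gain

print(f"implied CPU throughput: {implied_cpu_gflops:.1f} GFLOPS")
print(f"NPU power draw relative to CPU: {power_ratio:.2f}x")
```

Under these assumptions the NPU would deliver 25 times the throughput while drawing about half the power, which is where the combined 50x efficiency figure comes from.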
4) Introduction of domestic chip companies
The author has said a lot to introduce and analyze artificial intelligence, so what is the intelligence in it, that is, what is the core of artificial intelligence? In the author’s opinion, it comes down to two words: machine learning.
Machine learning needs to be supported by algorithms. The role of an algorithm is the induction and deduction of data; the ultimate goal is to improve the efficiency and accuracy of recognition, and then to make decisions and predictions about events in the real world. The core of artificial intelligence is for the machine to make itself smarter through continuous learning.
So what is machine learning?
Machine learning is the ability of computers to use large amounts of data to “train” and learn how to complete tasks without being explicitly programmed. Before the advent of deep learning, the mainstream of machine learning consisted of various shallow learning algorithms, such as the neural network backpropagation algorithm (BP algorithm), support vector machines (SVM), Boosting, and logistic regression. The limitation of these algorithms is that, with limited samples and computing units, their ability to express complex functions is limited, which restricts their handling of complex data.
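To make "learning without explicit programming" concrete, here is a minimal sketch of logistic regression, one of the shallow algorithms named above, trained by plain gradient descent. The toy data, learning rate, and epoch count are made up for illustration; no threshold is written into the code, and the decision boundary is learned from the labeled examples.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=1.0, epochs=5000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data (invented for this example): the label is 1 when x is above roughly 0.5.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(xs, ys)

def predict(x):
    """Classify a new input using the learned parameters."""
    return int(sigmoid(w * x + b) >= 0.5)

print([predict(x) for x in xs])
```

A single weight and bias can only draw one straight boundary, which illustrates the limited expressive power of shallow models that the paragraph above describes.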
1) Deep learning is learning through a multi-layer neural network that simulates the structure of the brain
Neurons in the brain, also known as nerve cells, are the basic units that make up the structure and function of the nervous system. They consist of a cell body and cell processes. Each neuron has several dendrites and only one axon, which carries excitation from the cell body to another neuron or to other tissue, such as a muscle or gland.
Unlike a neuron in the brain, which can connect to any neuron within a certain distance, an artificial neural network has discrete layers, and each layer connects only to the layers that follow the direction of data propagation.
The “depth” of deep learning refers to the number of layers in the multi-layer neural network, that is, the depth of the model structure. There are usually 5, 6, or even 10 hidden layers, and each layer is equivalent to a machine learning step that solves a different aspect of the problem. With this deep non-linear network structure, deep learning can approximate complex functions and form a distributed representation of the input data, and thus shows a powerful ability to learn the essential characteristics of a data set from a small number of samples and to make the probability vector converge better.
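The layered structure described above can be sketched as a forward pass in which each layer feeds only the next one. This is an untrained network with random weights, intended only to show the data flow and what "five hidden layers" means; the layer sizes and helper names are arbitrary choices for the example.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (illustrative only)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Pass the input through each layer in order; each layer feeds only the next one."""
    activation = x
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)  # non-linear unit
            for row, b in zip(weights, biases)
        ]
    return activation

rng = random.Random(0)
sizes = [4, 8, 8, 8, 8, 8, 2]   # 4 inputs, 5 hidden layers of 8 units, 2 outputs
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

output = forward([0.5, -0.2, 0.1, 0.9], layers)
hidden_layer_count = len(layers) - 1   # the "depth" the text refers to
print(hidden_layer_count, len(output))
```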
2) The emergence of deep learning makes artificial intelligence have the possibility of value realization in many industries
After the advent of deep learning, the main recognition method of computer vision underwent a major change, and self-learning became the mainstream of visual recognition. That is, the machine automatically summarizes object features from a massive database and then recognizes objects according to those features. The accuracy of image recognition has also improved greatly, from 70%+ to 95%. In the field of medical imaging, recognition at 95% accuracy has some practical value, and once accuracy exceeds 97%, it has value for assisted diagnosis.
3) The open source of deep learning frameworks has become a trend
Machine learning and deep learning need to be supported by algorithms, and algorithms need to be continuously trained and optimized with data. Since the breakthrough of deep learning, the tech giants have frequently open-sourced their frameworks. When AI companies use open source platforms to iterate on algorithms, the platforms obtain data and market feedback on application scenarios, which accelerates model training.
In this context, Google open-sourced TensorFlow in November 2015 and released version 1.0 on February 15, 2017. With deep learning developing ever faster, code bases are updated rapidly, and the developer ecosystem built through open source is extremely important.
4) Association of algorithm and basic application technology
The above picture is based on the author’s own investment experience and understanding: the difficulty of building an artificial intelligence algorithm is related to the basic application technology involved. The more scenario-specific the technology, the greater the difficulty of the algorithm.
(1) Voice recognition
Speech recognition converts the sound signal into a digital signal, then performs feature extraction and induction, and infers the corresponding text. Its main difficulties lie in two areas.
The first is data acquisition and cleaning. Speech recognition requires large amounts of standardized corpus data from specific segments as support, and the diversity of local dialects in particular increases the workload of corpus collection.
The second difficulty is the extraction of speech features. At present, this is mainly solved by deep learning with multi-layer neural networks. A multi-layer neural network is equivalent to a feature extractor that deepens the feature description of the signal layer by layer, from the part to the whole and from the general to the concrete, so that the original characteristics of the signal are restored to the greatest extent.
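The "digitize, then extract features" steps can be illustrated with two simple hand-crafted features computed over short frames. This is a deliberately simplified stand-in: the text says production systems rely on multi-layer neural networks as the feature extractor, and the synthetic tones, frame sizes, and function names below are assumptions made for the example.

```python
import math

def synth_tone(freq_hz, n_samples, sample_rate=8000):
    """Digitized stand-in for a recorded sound: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n_samples)]

def frames(signal, size=200, hop=100):
    """Split the signal into overlapping short-time frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def frame_features(frame):
    """Two classic hand-crafted features: mean energy and zero-crossing rate."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return energy, crossings / (len(frame) - 1)

low = [frame_features(f) for f in frames(synth_tone(200, 800))]
high = [frame_features(f) for f in frames(synth_tone(1000, 800))]

# A higher-pitched tone changes sign more often, so its zero-crossing rate is higher.
print(max(z for _, z in low) < min(z for _, z in high))
```

Even these crude features separate the two tones, which hints at why richer, learned features are needed for something as varied as real speech.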
Investment Values & Opportunities:
Although the speech recognition market is huge, dominant players have already emerged, and there are not many opportunities for startups. According to the Research and Markets report, the global intelligent speech market will continue to grow significantly and is expected to reach 19.17 billion US dollars by 2020. According to the Capvision report, in terms of market share in the voice industry, Nuance leads the world, while iFlytek is the dominant player in China.
Take iFlytek as an example, as of 2018:
1) Open platform
The iFlytek open platform reached 518,000 developers (a year-on-year increase of 102%), with the annual increase exceeding the total of the previous five years; the total number of applications reached 400,000 (a year-on-year increase of 88%), again with the annual increase exceeding the total of the previous five years; and the total number of terminal devices connected to the platform reached 1.76 billion (a year-on-year increase of 93%). iFlytek currently packages many speech recognition modules as SDKs and charges according to the monthly concurrency of the developer’s terminal app or device.
2) Speech recognition accuracy
The current speech recognition accuracy rate is above 98%, which can basically meet a large number of business scenarios and needs.
3) Deep penetration in smart education, political and legal affairs, and city services
It is difficult for startups to compete with iFlytek in these three industries.
(2) Semantic recognition
Speech recognition solves the problem of computer “hearing”, while semantic recognition solves the problem of “understanding”. Natural language processing (NLP) builds a computational framework to implement a language model, designs practical systems on top of that model, infers what users want to express based on statistical principles, predicts user behavior, and then gives the corresponding instructions or feedback.
At present, the bottleneck of NLP technology is mainly in the complexity of semantics, including the context of causality and logical reasoning, etc. The current ideas for solving these problems mainly rely on deep learning.
Investment Values & Opportunities:
In the current field of semantic recognition, the large technology giants are willing to acquire, while small, focused companies prefer to specialize in specific scenarios.
Regarding startup companies in the field of semantic recognition, domestic representative companies include Go Smart 360, Go Ask, Triangle Beast, Sudden Cognition, etc.
Large companies are more inclined to build general technology into a platform and, when a good project appears, acquire it directly. For a small, focused company, semantic analysis in a specific scenario is less difficult than general-purpose semantic analysis, and the accuracy rate can even exceed 85%. The reason is that the analysis is based on a corpus from a specific scenario; because the corpus is relatively specific, accuracy can be improved to a certain extent.
Smart customer service: Wisdom Teeth Technology, Small I
Legal consultation: No Litigation, Legal Valley, etc.
(3) Computer vision
In the technical process of computer vision, real-time data must first be obtained, which is done through a series of sensors. A small part of the data can be processed directly on the sensor side with MEMS functions. Most of the data continues on to the “brain” platform, which is composed of computing units and algorithms and is where calculation and decision support take place.
Computer vision application scenarios can be divided into two categories: image recognition and face recognition. Each can be further divided into dynamic and static, giving four categories that basically cover the current application scenarios of computer vision. Among them, dynamic face recognition is currently the most active segment for startups, with finance and security as its key scenarios.
Investment Values & Opportunities:
For computer vision, the main bottleneck lies in the impact of image quality and lighting conditions: existing image recognition technology struggles with image defects, overexposure, and dark images. In addition, it is limited by the volume of labeled data; without a large amount of high-quality data from a specific application scenario, it is difficult to achieve a breakthrough in algorithm iteration for that scenario. Currently, computer vision technology is relatively mature and a first tier of companies has formed, leaving little opportunity for startups.
1) At present, the development of AI is still in its early stage, certain achievements have been made in perception technology, and the development of cognitive technology still needs a breakthrough;
2) In the future of chips, customized AI chips will be the trend. Start-up companies still have opportunities in the R&D field of vertical AI chips;
3) Algorithms are a barrier to competition, but integrating AI with everyday life and business scenarios so that it can actually be deployed is the real difficulty;
4) Semantic recognition still faces large bottlenecks in its development, and small, focused companies have acquisition value;
5) Companies with first-hand data sources that can integrate with actual business are expected to establish their own barriers to competition. Data will become an important factor restricting the development speed of AI companies in the industry and the threshold of competition.