Andrew Ng proposes taking AI from the top 1% to the masses


In 2015, modern AI pioneer Andrew Ng’s recipe for success was to bet on neural networks, data and monolithic systems. Now, this recipe has created a problem: the technology is dominated by only a few wealthy corporations with the money and manpower to build such huge systems.

But the landscape doesn't have to stay that way, according to Ng, the Baidu and Google Brain alum (and current CEO of software maker Landing.AI). Instead, he sketched an approach to making machine learning more inclusive and open during a session at Nvidia's GPU Technology Conference last week.

Ng suggested building better AI analysis tools and leaning on domain knowledge, with the goal, essentially, of doing more with less. The key to AI accessibility, he argued, is being able to extract patterns and trends from smaller datasets.

“We know that in mainstream internet companies you can have a billion users and a giant data set. But when you go into other industries, the sizes are often much smaller,” Ng said.

Ng spoke about building AI systems in places like hospitals, schools or factories, which lack the resources and datasets to develop and train AI models.

“AI is meant to change all industries. We are not yet seeing this happen at the rate we would like, and we need data-centric AI tools and principles to make AI useful to everyone in the world… not just the big consumer internet companies,” Ng said.

As an example, he cited the thousands of $1m-to-$5m projects in places like hospitals, which typically operate under tight budgets and could shift to smaller custom AI systems to improve their analytics.

Ng said he had seen manufacturing environments with only 50 images on which to build a computer-vision-based inspection system to weed out defective parts.
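In that regime, one standard way to stretch a tiny image set is aggressive data augmentation. The snippet below is a minimal sketch of the idea using PyTorch's torchvision library; the "defects/" folder layout and the specific transforms are illustrative assumptions, not details from Ng's talk.

```python
# Minimal sketch: stretching a ~50-image inspection dataset with augmentation.
# Assumes PyTorch/torchvision; the "defects/" folder layout is hypothetical.
import torch
from torchvision import datasets, transforms

train_tfms = transforms.Compose([
    transforms.RandomHorizontalFlip(),                       # defects are usually orientation-agnostic
    transforms.RandomRotation(degrees=10),                   # small rotations mimic camera jitter
    transforms.ColorJitter(brightness=0.2, contrast=0.2),    # lighting varies on the line
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),     # random crops add positional variety
    transforms.ToTensor(),
])

# ImageFolder expects one subdirectory per class, e.g. defects/ok and defects/scratch.
train_set = datasets.ImageFolder("defects/", transform=train_tfms)
loader = torch.utils.data.DataLoader(train_set, batch_size=8, shuffle=True)
```

Because the transforms are applied randomly at load time, each epoch effectively sees a different variant of every image, which helps a small dataset go further without collecting new photos.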

“The only way for the AI community to build these very large numbers of systems is to start building vertical platforms that bring all of these use cases together. This allows the end customer to build a personalized AI system,” Ng said.

One such step is better “data preparation” – as distinct from data cleaning – used to improve the machine learning system iteratively. The idea is not to polish every record in a large dataset, but to apply an error analysis technique that identifies a subset or slice of the data, which can then be improved.

“Rather than trying to improve all the data, which is just too much, you might say you want to improve this part of the data and leave the rest alone. You can be much more focused,” Ng said.
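As a concrete illustration of that kind of focus, the snippet below, a minimal sketch rather than anything shown in the session, groups validation results by a metadata tag (here a hypothetical "condition" column) and surfaces the slice with the worst error rate, which then becomes the only part of the data targeted for more collection or relabeling.

```python
# Minimal sketch of slice-based error analysis: find the subset of the
# validation data where the model does worst, then focus effort there.
# Column names ("condition", "correct") are hypothetical metadata tags.
import pandas as pd

results = pd.DataFrame({
    "condition": ["daylight", "daylight", "low_light", "low_light", "glare"],
    "correct":   [True, True, False, True, False],
})

# Error rate per slice, worst first.
error_by_slice = (
    results.groupby("condition")["correct"]
           .apply(lambda s: 1.0 - s.mean())
           .sort_values(ascending=False)
)
print(error_by_slice)   # glare: 1.0, low_light: 0.5, daylight: 0.0
# Only the worst slices get more data collection or relabeling effort.
```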

For example, if images of a particular kind of fault are where the model struggles, error analysis can direct the acquisition of more targeted, specific data, which is a better way to train the system. This small-data approach is more efficient than indiscriminately acquiring ever-larger datasets, which can be expensive and resource intensive.

“It allows you to go to a much more focused data acquisition process where you’re going to say, ‘Hey, let’s go to the fab and get a lot more photos,’” Ng said, adding that consistent and effective labeling is a big part of the process.

Ng gave a specific example of error analysis in speech recognition, where the task is to separate car noise from human speech in an audio clip. Engineers were tempted to build a pipeline that detects car noise, filters it out, and then discards it.

A more effective approach, he said, is to gather more data containing human speech over car background noise, use error analysis to slice out the problematic car-noise examples, and then apply targeted data augmentation and generation to improve performance on that slice.
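A common way to implement that kind of targeted augmentation is to mix recorded background noise into clean speech at controlled signal-to-noise ratios. The NumPy sketch below illustrates the idea; the synthetic arrays and the 10 dB target are assumptions for illustration, not details from Ng's example.

```python
# Minimal sketch: augment a problematic slice by mixing car noise into
# clean speech at a chosen signal-to-noise ratio (SNR). Random arrays
# stand in for real audio loaded from disk; 10 dB is an assumed target.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested SNR, then add it."""
    noise = np.resize(noise, speech.shape)             # loop/trim noise to match length
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12          # avoid division by zero
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise

rng = np.random.default_rng(0)
speech = rng.standard_normal(16_000)    # one second of fake 16 kHz "speech"
car_noise = rng.standard_normal(8_000)  # shorter noise clip, looped to fit
augmented = mix_at_snr(speech, car_noise, snr_db=10.0)
```

Sweeping the SNR across a range of values produces many training variants of the same utterance, concentrated exactly on the noise condition the error analysis flagged.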

Classic AI approaches based on big data are still good, but error analysis is better suited to limited datasets. “You decide what you want to improve and weigh the cost of further data acquisition – [whether] it’s reasonable compared to the potential gain,” Ng said.

It may take a decade and thousands of research papers to flesh out a cohesive data-centric model for deep learning, he said.

“We’re still in the early stages of defining the principles as well as the tools to systematically capture data,” Ng said, adding, “I’m looking forward to seeing a lot of people doing this and celebrating their work as well.” ®
