By AI Trends Staff
Advances in the AI behind speech recognition are driving market growth, attracting venture capital to fund startups, and posing challenges to established players.
The increased acceptance and use of speech recognition devices is driving the market, which is expected to reach $26.8 billion by 2025, according to an estimate from Meticulous Research reported by Analytics Insight. Better speed and accuracy are among the advantages of the evolving technology.
One company competing for this new growth is AssemblyAI of San Francisco, which offers speech recognition APIs that can transcribe videos, podcasts, phone calls and remote meetings. The company was founded in 2017 by CEO Dylan Fox and is backed by startup accelerator Y Combinator and by Nvidia.
Fox has an unusual background for a high-tech entrepreneur. He graduated from George Washington University with degrees in business administration, business economics and public policy. He then landed a job as a machine learning software engineer at Cisco’s emerging products lab in San Francisco, working on deep neural networks and machine learning. There he got the idea for AssemblyAI and attracted capital from Y Combinator, which allowed him to hire data scientists and data engineers to build the technology.
Asked in an interview with AI Trends how he made the transition from an undergraduate in business administration and economics to high-tech entrepreneur, Fox said he learned machine learning along the way. At the time, his team at Cisco was working on a Siri-like assistant for the enterprise.
To speed up the work, Cisco was considering acquiring speech recognition software, and Fox was in the catbird seat for the search. He looked, for example, at Nuance, which was recognized as a market leader in speech recognition software. (Microsoft’s acquisition of Nuance for $19.6 billion is expected to be finalized by the end of the year.) The young entrepreneur-to-be was not impressed. “It was crazy how bad all the options were from an accuracy and developer’s perspective,” he said.
He was impressed, however, with Twilio, a San Francisco-based company founded in 2008 that went on to raise $103 million in venture capital. “They were setting new standards for great APIs for developers,” Fox said.
Fox’s idea was to use AI and machine learning to “achieve very accurate results and enable developers to easily incorporate APIs into their products.” One customer is CallRail, which provides call tracking and marketing analytics software. It plans to use AssemblyAI’s API to gain insight into why its customers’ callers are calling. NBC is another customer.
“We’ve been working to get as close to human-quality speech recognition as possible, and that’s a lot of work,” Fox said. He hopes to reach that plateau in 2022.
He targets companies that want to incorporate speech recognition into their products, and aims to make purchasing easy. Customers pay on a usage basis: AssemblyAI charges a fraction of a penny per second of audio transcribed, and clients are billed monthly. A customer transcribing 10 hours a month pays about $9; a customer transcribing one million hours a month pays around $900,000.
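The two price points quoted here are consistent with a single flat per-second rate. The short Python sketch below works through that arithmetic. The rate is not a published price: it is derived from the article's "$9 for 10 hours" figure, and the names `IMPLIED_RATE_PER_SECOND` and `monthly_cost` are hypothetical, chosen only for illustration. It also assumes there are no volume discounts or minimum charges, which the article does not address.

```python
# Per-second rate implied by the article's figure of about $9 for 10 hours.
# This is a derived assumption, not an official AssemblyAI price.
IMPLIED_RATE_PER_SECOND = 9.0 / (10 * 3600)  # about $0.00025, a fraction of a penny

SECONDS_PER_HOUR = 3600


def monthly_cost(hours: float, rate_per_second: float = IMPLIED_RATE_PER_SECOND) -> float:
    """Return the monthly bill in dollars for `hours` of transcribed audio.

    Assumes flat usage-based pricing with no discounts or minimums.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(hours * SECONDS_PER_HOUR * rate_per_second, 2)


if __name__ == "__main__":
    # The two usage levels mentioned in the article.
    for hours in (10, 1_000_000):
        print(f"{hours:>9,} hours -> ${monthly_cost(hours):,.2f}")
```

Under that assumed flat rate, the implied price is about 90 cents per hour of audio, which reproduces both the $9 and the $900,000 figures.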
Speech recognition is a hot market. “There are a lot of new startups being launched,” Fox said, and that creates opportunity. “Many interesting new businesses are built on voice data.”
AssemblyAI’s products can also detect sensitive content such as hate speech and profanity, allowing customers to save on human content moderation.
Asked to explain what distinguishes his technology, Fox said, “We’re an experienced team of deep learning researchers,” with experience at companies such as BMW, Apple and Facebook. “We build very large, very accurate deep learning models with much more accurate recognition results than traditional machine learning approaches. We are building very large models using advanced neural network technology.” He compared the approach to the one OpenAI used in developing its GPT-3 large language model.
Additionally, AI capabilities can be built on top of the transcripts to summarize audio and video content and to make it searchable and indexable. “It goes beyond just transcription,” Fox said.
The company currently has 25 employees, and that number is expected to double in about four months. Business has been good. “We see a lot of demand as audio and video data explodes online and customers want to make it available,” Fox said.
Learn more at AssemblyAI.