It all started with Anirudh Koul—a data scientist working with machine learning and natural language processing in Bing. In early , Anirudh realized that his grandfather, who was gradually losing his vision with age, was unable to recognize him during Skype calls.
Anirudh was also aware of an emerging trend in computer vision: image classification errors were decreasing at a rate of 50 percent year-over-year, so machines were likely to catch up to human accuracy in the near future. A short mobile prototype, while promising, left much to be desired in accuracy. His idea of helping users who are blind navigate to nearby objects would have to wait. Then, in just a year, two big breakthroughs changed everything.
First, a team of Microsoft researchers developed vision-to-language technology that was recognized as the most humanlike in the world. Equally important, the best image classification system in the world built by another Microsoft Research team recorded a 3. And just like that, the building blocks were ready. Anirudh began recruiting people to join his project, dubbed Deep Vision, for the Hackathon. He started with the researchers who had developed the vision-to-language technology.
He also scoured the Internet to find published accessibility experts within Microsoft.
Writing a comprehensive review of Seeing AI is beyond the scope of this article and at the rate techies are reviewing it, there will be plenty of tutorials and guidance available elsewhere. That said, the structure and components of the app itself provide an excellent outline for a review of some of the apps it is competing with, and the concepts that may be helpful to users as they decide which of these software tools is right for them.
Both Person and Scene are beta works in progress. These channels correspond to a specific type of app or software utility in the following broader categories: Traditionally, users would get one or more apps for each feature listed above, and pay for several apps. However, consumers may discover that a critical comparison of Seeing AI with some of the stand-alone competitors reveals significant features in the stand-alone apps that make them a good value, even when compared to the free Seeing AI app.
For example, you can take a picture of a magazine article or printed recipe and have it read aloud by the smartphone! OCR is certainly nothing new and has been around for decades. Years ago, when Ray Kurzweil developed one of the first commercially available OCR devices, it originally cost tens of thousands of dollars and was the size of a filing cabinet—neither cheap nor portable.
Over time these devices have morphed into software applications that can run on a smartphone and cost far less, or nothing at all, as in the case of Seeing AI. However, it works well with a screen reader and has some support for focusing the camera.
The software can identify the edges of the paper to be photographed and prompt users when all edges of a document are within the viewfinder.
Intelligently built: Seeing AI is a Microsoft research project that brings together the power of the cloud and AI to deliver an intelligent app, designed to help you navigate your day.
Turns the visual world into an audible experience: With this intelligent camera app, just hold up your phone and hear information about the world around you.
Recognize friends and their facial expressions: Recognize and locate the faces of people you're with, as well as facial characteristics, approximate age, emotion, and more.
Read text quickly: Hear short snippets of text instantly and get audio guidance to capture full documents.