COOL BLIND TECH

Now Available In Narrator! AI-tagged Image Descriptions


If one word had to define the early 21st century, it would no doubt be "the cloud." Nearly everything now incorporates some cloud-based feature or another. Screen readers have offered OCR and other image recognition before, but that magic was always performed locally rather than by sending data to the cloud for processing. The rise of huge AI clusters, which can now process trillions of calculations a second, has created an opportunity to offload far more to these systems.

Microsoft's offering is called Cognitive Services, a platform any developer can use to create AI-generated content for an image.
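For the developers among our readers, here is a rough sketch of what asking such a service for an image description might look like. It is not Narrator's actual code. The endpoint URL, the `westus` region, the `Ocp-Apim-Subscription-Key` header, and the response layout are assumptions based on the Computer Vision "describe" REST operation as we understand it, and the helper names `build_describe_request` and `best_caption` are made up for illustration. Check the current Cognitive Services documentation before relying on any of it.

```python
import json
import urllib.request

# Assumption: endpoint shape follows the Computer Vision "describe" operation.
# The region ("westus") and API version are placeholders, not confirmed values.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/vision/v1.0/describe"


def build_describe_request(image_url, subscription_key, endpoint=ENDPOINT):
    """Build (but do not send) a POST request asking for one image caption."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint + "?maxCandidates=1",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed header name for the subscription key.
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def best_caption(response_json):
    """Return the highest-confidence caption text, or None if there is none.

    Assumes a response shaped like:
    {"description": {"captions": [{"text": "...", "confidence": 0.9}]}}
    """
    captions = response_json.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c.get("confidence", 0.0))["text"]
```

To actually send the request, you would pass the result of `build_describe_request` to `urllib.request.urlopen`, decode the reply with `json.load`, and hand it to `best_caption`. A screen reader could then speak the returned sentence.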

Once you combine this with a screen reader, things get very interesting. That is exactly what has happened in the latest Windows 10 Insider build, 16226, where Narrator gains AI-generated image descriptions. It also raises bigger questions. Could a screen reader someday suggest labels for controls through an intelligent API that can interpret icons? Could a computer describe a PowerPoint presentation, typically full of rectangles and decorations?

To be clear, Apple is also doing this in iOS 11, though its image processing is done entirely on the device rather than in the cloud. Microsoft is the first to announce this to the public, and users will be able to compare it with Apple's version once the iOS 11 public beta is released. Narrator is notable for generating full-sentence descriptions of images, rather than just a few words or phrases explaining their content, as iOS and Facebook do. The Insider Program is open to anyone, and submitting feedback is highly encouraged if you venture into this world. You will also want plenty of patience, and ideally a virtual machine or spare computer to install it on rather than your main machine.

Do you see ways in which you would use this technology for yourself to help understand the web or apps better? Feel free to let us know in the comments.
