



OpenAI has added a new “Read Aloud” feature to iOS, Android, and Web.


In a post shared on X on Monday, San Francisco-based OpenAI announced a new feature in its iOS and Android apps that allows chats to be read aloud to users. The feature is also available on the web.

In an interview with me late last week ahead of the announcement, Joanne Jang, who leads the model behavior product at OpenAI, explained that the company has been thinking about accessibility for some time. She pointed to a feature released about a year ago that let people use an image as input and ask questions about it. She said she knew the technology was powerful but wasn't sure how it would be used around the world. OpenAI approached the team at Be My Eyes to gain insight and critical feedback from people who are blind or have low vision, Jang said. She said OpenAI was overwhelmed by the feedback and that many of the comments were unexpected. She cited use cases such as people taking a photo of their outfit and asking ChatGPT whether it matches, or taking a photo of their garden and asking ChatGPT to describe it. People were able to notice new things about their environment that they hadn't known before, Jang said. This was enlightening, she added, because the description came from a more neutral observer than a stranger would be.

“I think that's when we learned, ‘Okay, there's something about AI here,’” Jang said of OpenAI's proverbial lightbulb moment. “It's the fact that it's not a person that can offer a different approach to accessibility. [The chatbot] provides a new, more objective perspective.”

She continued: “We were very interested to learn from the accessibility and blind community. We didn't mean to say, ‘Our app works perfectly for every accessibility use case.’ We still have a long way to go, but we definitely want to improve everyone's lives with this technology. We welcome your feedback.”

Elaborating on her comments, Jang said the team has learned a lot through its partnership with Be My Eyes. One key lesson was that working with people in the blind and low vision community doesn't just mean providing visual aids. OpenAI has learned that many people use voice-centric software such as screen readers, she said. For example, Jang noted that an operating system's native text-to-speech engine can sometimes struggle with a large number of items in a shopping list. In contrast, a user could take a screenshot of their Amazon cart and ask ChatGPT questions about it. That capability didn't exist before, Jang said, and it should make shopping easier and more accessible.

“Through our collaboration with Be My Eyes, as well as feedback from many of our users, we believe we have learned that [AI] can be used [for accessibility],” Jang said. “In fact, I think that was one of the many reasons we decided to support Read Aloud.”

Be My Eyes CEO Mike Buckley raved about OpenAI's efforts in an interview with me last December. He said he thinks OpenAI is better at putting accessibility first during development, adding that he has a lot of respect for the company because it devoted a lot of engineering, project and product manager time to accessibility when no one was looking.

The depth of Read Aloud's impact excites OpenAI.

“We were very excited about this technology because we had been working on it for a while,” Jang said of the mood around Read Aloud. “I think I'm excited about this in several ways. One is that before these voice features came along, a lot of the work was done through writing [and] typing. But not everyone thinks through writing. I think by writing, but there are many people around me who think better by saying things out loud. I think many people feel nervous when it comes to writing to a chatbot, but find it much more intuitive to talk interactively with ChatGPT. What I'm most excited about is that [voice capabilities] open up another avenue for people to interact with advanced technology and, in doing so, open up ways for people to better communicate and express their ideas. Honestly, that's one of the things we're most excited about at OpenAI. OpenAI aims to ensure that advanced AI technologies benefit all of humanity. I'm especially excited about the fact that this also caters to people who don't like writing for whatever reason.”

Jang's colleague Mada Aflak, an engineer on the ChatGPT team, agreed.

“Speaking is a basic human skill, [so] it's also important to enable AI to communicate by voice,” she said in an interview conducted concurrently with Jang's. “While there are many conversations that are fine to have in writing, there are some that seem much more natural when spoken aloud, such as when brainstorming. All of these use cases work better with voice. I believe that anything you can do with typing should also be possible with voice commands. Building technology that understands and speaks natural language, so users can issue voice commands, will make any digital device more accessible. Now you can also generate [images] using voice. You don't need to type anything. Ultimately, anything that can be typed should also be possible with voice.”

When asked about OpenAI's hopes and dreams for the future, Jang said candidly that the company has made great strides in the past year in terms of ChatGPT's features and accessibility. Her hope is that AI will become smarter and more capable, allowing people to automate tasks that were once tedious or not readily accessible. Doing so, she said, could allow people to be more creative and independent and to do what they love. “Based on how people interact around specific tasks, we expect that as ChatGPT advances, people will be able to perform more high-level tasks faster,” Jang said.

Aflak again agreed with Jang.

“We imagine a world where you don't need [user interfaces], at least in their traditional form,” she said of the potential for future growth in AI. “Everything you can do can be done with voice commands. I feel like that's a great future.”

