OpenAI’s ChatGPT is receiving major updates that add voice conversations and image-based queries. Users can now hold voice conversations with ChatGPT on Android and iOS, and feed images to the chatbot on all platforms. These new features are currently available to Plus and Enterprise users, with wider access to the image-based functions expected in the future.
To try out voice conversations, users must opt in to this feature in the ChatGPT app by navigating to Settings and then New Features. By tapping the microphone button, users can select from five different voices to interact with ChatGPT.
OpenAI powers the back-and-forth voice conversations with a new text-to-speech model capable of generating “human-like audio from just text and a few seconds of sample speech.” The five voices were created with the help of professional voice actors. In the other direction, OpenAI’s open-source Whisper speech recognition system converts users’ spoken words into text.
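For developers who want to experiment with a similar speech-in, speech-out loop, OpenAI’s public API exposes comparable building blocks, though it is not the app’s actual pipeline. The sketch below is only an illustration: it assumes the openai Python package, an API key in the environment, a local audio file named question.m4a, and illustrative model and voice names (whisper-1, gpt-4, tts-1, alloy).

```python
# Illustrative voice round trip: Whisper transcription -> chat reply -> text-to-speech.
# File names, model names, and the voice are assumptions, not ChatGPT's internals.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Transcribe the user's spoken question with Whisper.
with open("question.m4a", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Get a text reply from a chat model.
reply = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = reply.choices[0].message.content

# 3. Synthesize the reply as audio with the text-to-speech endpoint.
speech = client.audio.speech.create(
    model="tts-1",   # illustrative TTS model name
    voice="alloy",   # one of the preset voices
    input=answer,
)
with open("answer.mp3", "wb") as out:
    out.write(speech.content)
```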
The image-based functions of ChatGPT are also quite intriguing. OpenAI states that users can show the chatbot a photo of their grill and inquire why it won’t start, seek assistance in meal planning based on a picture of their refrigerator contents, or prompt the chatbot to solve a math problem by capturing an image of it. It’s interesting to note that Microsoft recently showcased its Copilot AI’s ability to solve math problems at the Surface event.
OpenAI leverages GPT-3.5 and GPT-4 to power the image recognition features in ChatGPT. To utilize the chatbot’s image-based functions, users simply need to tap the photo button (iOS users may need to tap the plus button first) to capture a photo or select an existing image on their device. ChatGPT supports multiple photos, and users can also use a drawing tool to focus on specific parts of an image.
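Developers can approximate the same workflow through OpenAI’s API by attaching an image to a chat request. The snippet below is a hedged sketch rather than the app’s internal code: the model name gpt-4-vision-preview, the file grill.jpg, and the prompt are assumptions chosen for illustration.

```python
# Illustrative example of asking a vision-capable GPT-4 model about a local photo.
# The model name, file path, and question are placeholders, not ChatGPT's own code.
import base64
from openai import OpenAI

client = OpenAI()

# Encode a local photo (e.g. of a grill that won't start) as base64.
with open("grill.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed vision-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Why won't this grill start?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```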
In announcing these updates, OpenAI acknowledges the potential for harm the technology poses: bad actors could use it to mimic the voices of public figures or ordinary people, opening the door to fraud. For that reason, OpenAI says it is limiting the new text-to-speech model primarily to voice conversations in ChatGPT, while also collaborating with select partners on other limited use cases. The company has also published a paper on the safety properties of the image-based functionality, which it refers to as GPT-4 with vision.
ChatGPT understands English text in images better than text in other languages, so OpenAI advises non-English users not to rely on ChatGPT for text in images for the time being, particularly for languages that use non-Roman scripts.
Meanwhile, Spotify has joined forces with OpenAI to leverage the voice-based technology for an intriguing purpose. Spotify is piloting a tool called Voice Translation for podcasters, which can translate podcasts into different languages using the voices of the podcast participants. This innovative tool retains the speech characteristics of the original speaker, even after the conversion to other languages. Initially, Spotify is converting select English-based shows into several languages, with Spanish versions of certain “Armchair Expert” and “The Diary of a CEO with Steven Bartlett” episodes already available, and French and German versions to follow.
With these voice and image capabilities, ChatGPT is expanding beyond text-only conversation and finding applications in new contexts, from podcasts to everyday user interactions. The continued development and deployment of such technology also raises important questions about potential misuse and the need for responsible implementation to limit adverse consequences.