OpenAI Announced Sora: The Game-Changing Text-to-Video AI Model

OpenAI has announced Sora, a text-to-video model that turns text prompts into high-quality, realistic videos up to one minute long.

Not to be outdone by competitors such as Google, which recently launched a text-to-video tool of its own, AI firm OpenAI on Thursday announced its text-to-video model, Sora.

Like Google's Lumiere, Sora has limited availability. Unlike Lumiere, it can produce videos up to one minute long.

Text-to-video has become the latest arms race in artificial intelligence (AI). OpenAI, Google, Microsoft, and others are looking beyond text and image generation to secure their position in a sector expected to generate $1.3 trillion in revenue by 2032, and to win over users who have been drawn to generative AI since ChatGPT appeared a little more than a year ago.


In a post on 15 Feb 2024, the company, the creator of both ChatGPT and Dall-E, announced that Sora will be available to “red teamers,” or experts in areas such as misinformation, violent speech and bias, who will “adversarially test the model.” OpenAI will also work with visual artists, designers and filmmakers to gather feedback from creative professionals. Adversarial testing will be especially important in addressing the risk of deepfakes, a key concern when AI is used to generate images and videos.

In addition to collecting feedback from outside the organization, the AI company said it wants to share its progress early to “give the public a sense of what AI capabilities are on the horizon.”

Strengths of Sora

One thing that may set Sora apart is its ability to interpret long prompts, including one example that ran to 135 words. OpenAI released an example video on Thursday that shows Sora producing a variety of characters and scenes, featuring people, animals and fluffy monsters, as well as urban environments, landscapes, zen gardens, and even a submerged New York City.

This can be attributed in part to OpenAI’s prior work on the Dall-E and GPT models. Dall-E 3, a text-to-image generator, was launched in September 2023. Sora uses Dall-E 3’s recaptioning method, which, according to OpenAI, produces “highly descriptive captions for the visual training data.”

“Sora can generate complicated scenes that include numerous characters, certain kinds of motion and accurate details of the subject and background,” according to the statement. “The program understands not only what the user was looking for in the prompt, but also how those things function in the physical world.”

OpenAI’s sample videos on X (formerly known as Twitter) are very realistic, except for close-ups of human faces and swimming sea creatures. Otherwise, it can be difficult to tell what is real and what is not.

Like Lumiere, the model can create video from still images, extend existing videos, and fill in missing frames.

“Sora provides a foundation for models that can understand and recreate the real world, which we believe will be an important milestone in achieving AGI,” according to the announcement.

Sora’s weaknesses

OpenAI admits that Sora has weaknesses: it can struggle to accurately convey the physics of a complex scene and to understand cause and effect. “For example, a person could take a bite out of a cookie, but afterwards, the cookie might not have a bite mark,” the post stated.

Anyone who still has to make an L with their hands to work out which side is left can take heart: Sora also confuses left and right.

Is OpenAI’s Sora accessible to the public?

OpenAI did not say when Sora will be publicly available, but said it wants to take “multiple essential safety measures” first. These include complying with OpenAI’s existing safety guidelines, which ban extreme violence, sexual content, hateful imagery, celebrity likenesses and the use of others’ intellectual property.

“Despite years of testing and research, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will misuse it,” the post said. It went on to say, “That is why we believe that gaining knowledge from real-world use is an essential part of creating and releasing increasingly safe AI systems over time.”
