easy to use
easy to use
easy to use
easy to use
easy to use
easy to use
tokyo-walk

Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Join our alert list for Sora's upcoming launch details!
Join WaitList

Transform Your Ideas into Videos with Sora

Unlock the power of AI storytelling with Sora, OpenAI's revolutionary text-to-video tool.

Sora empowers creators, educators, and businesses alike to effortlessly bring their narratives to life. With just a few lines of text, you can generate stunning videos that captivate your audience, enhance your message, and elevate your digital presence.

Experience the seamless fusion of creativity and technology, and let Sora transform your imaginative concepts into visual masterpieces. Whether you're crafting educational content, marketing materials, or personal stories, Sora is your gateway to a world where your ideas are not just heard, but seen and felt.

Latest Sora AI News & Insights

Discover daily updates on the latest Sora AI News & Insights at Soramaker.ai. Dive into the world of AI innovation with fresh, engaging content every day. Stay ahead in the AI landscape with our expert analysis and in-depth articles.

Introducing

Harms or Risks

Today, Sora is becoming available to red teamers to assess critical areas for harms or risks. We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals.

We’re sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon.

Complex Scenes With Multiple Characters

Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.

Create Multiple Shots Within a Single Generated Video

The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video that accurately persist characters and visual style.

Weaknesses

The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.

The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.

How to Use Sora - OpenAI

OpenAI’s Sora Turns AI Prompts Into Photorealistic Videos. OpenAI's entry into generative AI video is an impressive first step. The model can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt.

  • OpenAI didn’t let me enter my own prompts, but it shared four instances of Sora’s power. (None approached the purported one-minute limit; the longest was 17 seconds.) The first came from a detailed prompt that sounded like an obsessive screenwriter’s setup: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”
  • The result is a convincing view of what is unmistakably Tokyo, in that magic moment when snowflakes and cherry blossoms coexist. The virtual camera, as if affixed to a drone, follows a couple as they slowly stroll through a streetscape. One of the passersby is wearing a mask. Cars rumble by on a riverside roadway to their left, and to the right shoppers flit in and out of a row of tiny shops.
  • It’s not perfect. Only when you watch the clip a few times do you realize that the main characters—a couple strolling down the snow-covered sidewalk—would have faced a dilemma had the virtual camera kept running. The sidewalk they occupy seems to dead-end; they would have had to step over a small guardrail to a weird parallel walkway on their right. Despite this mild glitch, the Tokyo example is a mind-blowing exercise in world-building. Down the road, production designers will debate whether it’s a powerful collaborator or a job killer. Also, the people in this video—who are entirely generated by a digital neural network—aren’t shown in close-up, and they don’t do any emoting. But the Sora team says that in other instances they’ve had fake actors showing real emotions.

  • The other clips are also impressive, notably one asking for “an animated scene of a short fluffy monster kneeling beside a red candle,” along with some detailed stage directions (“wide eyes and open mouth”) and a description of the desired vibe of the clip. Sora produces a Pixar-esque creature that seems to have DNA from a Furby, a Gremlin, and Sully in Monsters, Inc. I remember when that latter film came out, Pixar made a huge deal of how difficult it was to create the ultra-complex texture of a monster’s fur as the creature moved around. It took all of Pixar’s wizards months to get it right. OpenAI’s new text-to-video machine … just did it.