Sora: Creating video from text

dang

Related ongoing thread: Video generation models as world simulators - https://news.ycombinator.com/item?id=39391458 - Feb 2024 (43 comments)

Also (since it's been a while): there are over 2000 comments in the current thread. To read them all, you need to click More links at the bottom of the page, or like this:

https://news.ycombinator.com/item?id=39386156&p=2

https://news.ycombinator.com/item?id=39386156&p=3

https://news.ycombinator.com/item?id=39386156&p=4 [etc.]

February 16, 2024 at 11:33 AM

crazygringo

This is insane. But I'm impressed most of all by the quality of motion. I've quite simply never seen convincing computer-generated motion before. Just look at the way the wooly mammoths connect with the ground, and their lumbering mass feels real.

Motion-capture works fine because that's real motion, but every time people try to animate humans and animals, even in big-budget CGI movies, it's always ultimately obviously fake. There are so many subtle things that happen in terms of acceleration and deceleration of all of the different parts of an organism, that no animator ever gets it 100% right. No animation algorithm gets it to a point where it's believable, just where it's "less bad".

But these videos seem to be getting it entirely believable for both people and animals. Which is wild.

And then of course, not to mention that these are entirely believable 3D spaces, with seemingly full object permanence. As opposed to other efforts I've seen which are basically briefly animating a 2D scene to make it seem vaguely 3D.

February 16, 2024 at 3:29 AM

patall

I disagree, just look at the legs of the woman in the first video. First she seems to be limping, then the legs rotate. The mammoths are totally uncanny to me, as they are both running and walking at the same time.

Don't get me wrong, it is impressive. But I think many people will be very uncomfortable with such motion very quickly. Same story as the fingers before.

February 16, 2024 at 5:09 AM

netcan

> I think many people will be very uncomfortable with such motion very quickly

So... I think OP's point stands. (impressive, surpasses human/algorithmic animation thus far).

You're also right. There are "tells." But, a tell isn't a tell until we've seen it a few times.

Jaron Lanier makes a point about novel technology. The first gramophone users thought it sounded identical to a live orchestra. When very early films depicted a train coming towards the camera, people fell out of their chairs... Blurry black and white, super slow frame rate, projected on a bedsheet.

Early 3d animation was mindblowing in the 90s. Now it seems like a marionette show. Well... I suppose there was a time when marionette shows were not campy. They probably looked magic.

It seems we need some experience before we internalize the tells and it starts to look fake. My own eye for CG images seems to be improving faster than the quality. We're all learning to recognize GPT generated text. I'm sure these motion captures will look more fake to us soon.

That said... the fact that we're having this discussion proves that what we have here is "novel." We're looking at a breakthrough in motion/animation.

Also, I'm not sure "real" is necessary. For games or film what we need is rich and believable, not real.

February 16, 2024 at 9:15 PM

Jensson

> You're also right. There are "tells." But, a tell isn't a tell until we've seen it a few times.

Once you have seen a few you can tell instantly. They all move at 2 keyframes per second, which makes all movements seem alien, and everything in an image moves strangely in sync. The dog moves in slow motion since they need more keyframes etc. On that street, some people look like they move in slow motion and others don't.

People will quickly learn to notice those issues, they aren't even subtle once you are aware of them, not to mention the disappearing things etc.

And that wouldn't be very easy to fix, they need to train it on keyframes because training frame by frame is too much.

But that should make this really easy for others to replicate. You just train on keyframes and then train a model to fill in between keyframes, and you get this. It has some limitations as we see with movement keeping the same pace in every video, but there are a lot of cool results from it anyway.

February 17, 2024 at 12:06 AM

kurthr

I have a friend who has worked on many generations of video compression over the last 20 years. He would rather watch a movie on film without effects than anything on a TV or digital theater. He's trained himself to spot defects and now even with the latest HEVC H.265 he finds it impossible to enjoy. It's artifacts all the way down and the work never ends. At the Super Bowl he was obsessed with blocking for fast objects, screen edge artifacts, flat field colors, and something with the grass.

Luckily, I think he'll retire sooner rather than later, and maybe it will get better then.

February 17, 2024 at 10:45 AM

justworkout

I think a lot of these issues could be "solved" by lowering the resolution, using a low quality compression algorithm, and trimming clips down to under 10 seconds.

And by solved, I mean they'll create convincing clips that'll be hard for people to dismiss unless they're really looking closely. I think it's only a matter of time until fake video clips lead to real life outrage and violence. This tech is going to be militarized before we know it.

February 16, 2024 at 11:39 AM

OscarTheGrinch

Yeah, we are very close to losing video as a source of truth.

I showed these demos to my partner yesterday and she was upset about how real AI has become, how little we will be able to trust what we see in the future. Authoritative sources will be more valuable, but they themselves may struggle to publish only the facts and none of the fiction.

Here's one possible military / political use:

The commander of Russia's Black Sea Fleet, Viktor Sokolov, is widely believed to have been killed by a missile strike on 22 September 2023. https://en.wikipedia.org/wiki/Viktor_Sokolov_(naval_officer)

Russian authorities deny his death and have released proof of life footage, which may be doctored or taken before his death. Authoritative source Wikipedia is not much help in establishing truth here, because without proof of death they must default to toeing the official line.

I predict that in the coming months Sokolov (who just yesterday was removed from his post) will re-emerge in the video realm, and go on to have a glorious career. Resurrecting dead heroes is a perfect use of this tech, for states where feeding people lies is preferable to arming them with the truth.

Sokolov may even go on to be the next Russian President.

February 16, 2024 at 4:34 PM

antris

> Yeah, we are very close to losing video as a source of truth.

I think this way of thinking is distracted. No type of media has ever been a source of truth in itself. Videos have been edited convincingly for a long time, and people can lie about their context or cut them in a way that flips their meaning.

Text is the easiest medium to lie in, you can freely just make stuff up as you go, yet we don't say "we cannot trust written text anymore".

Well yeah duh, you can trust no type of media just because it is formatted in a certain way. We arrive at the truth by using multiple sources and judging the sources' track records of the past. AI is not going to change how sourcing works. It might be easier to fool people who have no media literacy, but those people have always been a problem for society.

February 16, 2024 at 4:43 PM

subtra3t

Text was never looked at as a source of truth like video was. If you messaged someone something, they wouldn't necessarily believe it. But if you sent them a video of that something, they would feel that they would have no choice but to believe that something.

> Well yeah duh, you can trust no type of media just because it is formatted in a certain way

Maybe you wouldn't, but the layperson probably would.

> We arrive at the truth by using multiple sources and judging the sources' track records of the past

Again, this is something that the ideal person would do, not the average layperson. Almost nobody would go through all that to decide if they want to believe something or not. Presenting them a video of this something would've been a surefire way to force them to believe it though, at least before Sora.

> people have always been a problem for society

Unrelated, but I think this attitude is by far the bigger "problem for society". It encourages us to look down on some people even when we do not know their circumstances or reasons, all for an extremely trivial matter. It encourages gatekeeping and hostility, and I think that kind of attitude is at least as detrimental to society as people with no media literacy.

February 16, 2024 at 8:16 PM

krab

During a significant part of history, text was definitely considered a source of truth, at least to the extent a lot of people see video now. A fancy recommendation letter from a noble would get you far. It makes sense because if you forge it, that means you had to invest a significant amount of effort and therefore you had to plan the deception. It's a different kind of behavior than just lying on a whim.

But even then, as nowadays, people didn't trust the medium absolutely. The possibility of forgery was real, as it has been with video, even before generative AI.

February 17, 2024 at 4:01 PM

quartesixte

To back up this claim, when fictional novels first became a literary format in the Western world, there was immense consternation about the fact that un-true things were being said in text. It actually took a while for authors to start writing in anything besides formats that mimicked non-fictional writing (letters, diary entries, etc.).

February 17, 2024 at 4:49 PM

xanderlewis

> No type of media has ever been a source of truth in itself.

'pics or it didn't happen' has been a thing (possibly) until very recently for good reason.

February 16, 2024 at 11:02 PM

throwup238

And they've been doctored almost as long as photography has been around: https://en.wikipedia.org/wiki/Censorship_of_images_in_the_So...

February 16, 2024 at 11:23 PM

xanderlewis

As has been pointed out ad nauseam by now, no one's suggesting that AI unlocks the ability to doctor images; they're suggesting that it makes it trivially easy for anyone, no matter how unskilled, to do so.

I really find this constant back and forth exhausting. It's always the same conversation: '(gen)AI makes it easy to create lots of fake news and disinformation etc.' --> 'but we've always been able to do that. have you guys not heard of photoshop?' --> 'yes, but not on this scale this quickly. can you not see the difference?'

Anyway, my original point was simply to say that a lot of people have (rightly or wrongly) indeed taken photographic evidence seriously, even in the age of photographic manipulation (which as you point out, pretty much coincides with the age of photography itself).

February 17, 2024 at 1:57 AM

victor106

> Videos have been edited convincingly for a long time,

You are right but the thing with this is the speed and ease with which you can generate something completely fake.

February 16, 2024 at 8:12 PM

rightbyte

> Yeah, we are very close to losing video as a source of truth.

Why have you been trusting videos? The only difference is that the cost will decrease.

Haven't you seen Hollywood movies? CGI has been convincing enough for a decade. Just add some compression and shaky mobile cam and it would be impossible to tell the difference on anything.

February 16, 2024 at 6:31 PM

taylorius

Of course, any video could be a fake, it's a question of the cost, and corresponding likelihood of that being the case.

February 17, 2024 at 4:22 AM

snowram

Hell, some people have been doubting moon landing videos for even longer now. Video hasn't been a reliable source since its inception.

February 16, 2024 at 7:17 PM

scotty79

The truth is to be found in sources not the content itself.

Every piece of information should have "how do you know?" question attached.

February 16, 2024 at 9:33 PM

cruffle_duffle

> Yeah, we are very close to losing video as a source of truth.

We've been living in a post-truth society for a while now. Thanks to "the algorithm" interacting with basic human behavior, you can find something somewhere that will tell you anything is true. You'll even find a community of people who'll be more than happy to feed your personal echo chamber -- downvoting & blocking any objections and upvoting and encouraging anything that feeds the beast.

And this doesn't just apply to "dumb people" or "the others", it applies to the very people reading this forum right now. You and me and everybody here lives in their safe, sound truth bubble. Don't like what people tell you? Just find somebody or something that will assure you that whatever it is you think, you are thinking the truth. No, everybody else is the asshole who is wrong. Fuck those pond scum spreaders of "misinformation".

It could be a blog, it could be some AI generated video, it could even be "esteemed" newspapers like the New York Times or NPR. Everybody thinks their truth is the correct one and thanks to the selective power of the internet, we can all believe whatever truth we want. And honestly, at this point, I am suspecting there might not be any kind of ground truth. It's bullshit all the way down.

February 16, 2024 at 11:54 PM

fragmede

so where do we go from here? the moon landing was faked, we're ruled by lizard people, and there are microchips in the vaccine. at some level, you can believe what you want to believe, and if the checkout clerk thinks the moon is made of cheese, it makes no difference to me, I still get my groceries. but for things like nuclear fusion, are we actually making progress on it or is it also a delusion. where the rubber meets the road is how money gets spent on building big projects. is JWST bullshit? is the LHC? ITER? GPS?

we need ground truths for these things to actually function. how else can things work together?

February 19, 2024 at 5:17 AM

geysersam

I've always found that take quite ridiculous. Fake videos have existed for a long time. This technology reduces the effort required but if we're talking about state actors that was never an issue to begin with.

People already know that video cannot be taken at face value. Lord of the Rings didn't make anyone believe orcs really exist.

February 16, 2024 at 5:58 PM

latexr

> This technology reduces the effort required

Which is a huge deal. It’s absurd to brush that off.

> People already know that video cannot be taken at face value.

No, no they do not. People don’t even know to not take photos at face value, let alone video.

https://www.forbes.com/sites/mattnovak/2023/03/26/that-viral...

February 16, 2024 at 7:39 PM

justworkout

Lord of the Rings had a budget in the high millions and took years to make with a massive advertising campaign.

Riots happen due to out of context video clips. Violence happens due to people seeing grainy phone videos and acting on it immediately. We're reaching a point where these videos can be automatically generated instantly by anyone. If you can't see the difference between anyone with a grudge generating a video that looks realistic enough, and something that requires hundreds of millions of dollars and hundreds of employees to attain similar quality, then you're simply lying.

February 17, 2024 at 1:15 AM

_kb

A key difference in the current trajectory is it's becoming feasible to generate highly targeted content down to an individual level. This can also be achieved without state actor level resources or the time delays needed to traditionally implement, regardless of budget. The fact it could also be automated is mildly terrifying.

February 16, 2024 at 6:37 PM

roenxi

Coordinated campaigns of hate through the mass media - like kicking up war fever before any major war you care to name - are far more concerning and have already been with us for about a century. Look at WWII and what Hitler was doing with it for the clearest example; propaganda was the name of the game. The techniques haven't gone anywhere.

If anything, making it cheap enough that people have to dismiss video footage might soften the impact. It is interesting how the internet is making it much harder for the mass media to peddle unchallenged lies or slanted perspectives. This tech might counter-intuitively make it harder again.

February 16, 2024 at 7:17 PM

_kb

I have no doubt trust levels will adjust, eventually. The challenge is that takes a non-trivial amount of time.

It's still an issue with traditional mass media. See basically any political environment where the Murdoch media empire is active. The long tail of (I hate myself for this terminology, but hey, it's HN) 'legacy humans' still vote and have a very real effect on society.

February 16, 2024 at 9:02 PM

yurishimo

It's funny you mention LotR, because the vast vast vast majority of the character effects were practical (at least in the original trilogy). They were in fact, entirely real, even if they were not true to life.

February 16, 2024 at 6:09 PM

ksangeelee

You can still be enraged by things you know are not real. You can reason about your emotional response, but it's much harder to prevent an emotional response from happening in the first place.

February 16, 2024 at 6:25 PM

tomaskafka

... and learning to prevent emotional response means unlearning to be human, like burnt out people.

The only winning move is to not watch.

February 16, 2024 at 6:43 PM

geysersam

You can have an emotional response and still act rationally.

February 16, 2024 at 9:08 PM

galdauts

The issue is not even so much generating fake videos as creating plausible deniability. Now everything can be questioned for the pure reason of seeming AI-generated.

February 16, 2024 at 6:42 PM

lukan

Yeah, it looks good at first glance. Also the fingers are still weird. And I suppose for every somewhat working vid, there were dozens of garbage ones. At least that was my experience with image generation.

I don't believe, movie makers are out of business any time soon. They will have to incorporate it though. So far this can make convincing background scenery.

February 16, 2024 at 5:30 AM

anoopelias

> I don't believe, movie makers are out of business any time soon

My son was learning how to play keyboard and he started practicing with a metronome. At some point, I was thinking, why is he learning it at all? We can program which key is to be pressed at which point in time, and then software can play it by itself! Why bother?

Then it hit me! Musicians could automate all the instruments with incredible accuracy since a long time. But they never do that. For some reason, they still want a person behind the piano / guitar / drums.

February 16, 2024 at 10:30 AM

inference-lord

Isn't it obvious? Life is about experiences and enjoyment, all of this tech is fun and novel and interesting but realistically, it's really exciting for tech people because it's going to be used to make more computer games, social media posts and advertisements, essentially, it's exciting because it's going to "make money".

Outside of that, people just want to know what it feels like to be able to play their favorite song on guitar and to go skiing etc.

Being perfect at everything would be honestly boring as shit.

February 16, 2024 at 12:42 PM

bruce511

I completely agree. There is more to a product than the final result. People who don't play an instrument see music in terms of money. (Hint: there's no money in music). But those who play know that the pleasure is in the playing, and jamming with your mates. Recording and selling are work, not pleasure.

This is true for literally every hobby people do for fun. I am learning ceramics. Everything I've ever made could be bought in a shop for a 100th of the cost, and would be 100 times "better". But I enjoy making the pot, and it's worth more to me than some factory item.

Sora will allow a new hobby, and lots will have fun with it. Pros will still need to do Pro things. Not everything has to be viewed through the lens of money.

February 16, 2024 at 1:20 PM

disqard

You articulated what I wanted to add to this thread -- thank you!

I play the piano, and even though MIDI exists, I still derive a lot of enjoyment from playing an acoustic instrument.

February 16, 2024 at 1:49 PM

vitro

I like this saying: “The woods would be very silent if no birds sang except those who sang the best.” It's fun learning to play the instrument.

February 16, 2024 at 3:56 PM

numpad0

I think it's not. If musicians and only musicians wanted themselves behind instruments, for the sake of being, there should be a market for autogenerated self-playing music machines for their former patrons who wouldn't care. And that's not the case; the market for ambient sound machines is small. It takes equal or more insanity to have one at home than, say, having a military armored car in the garage.

On the other hand you've probably heard of an iPod, which I think I could describe as a device dedicated to give false sense of an ever-present musician, so to speak.

So, "they" in "they still want a person behind the piano" is not just limited to hobbyists and enthusiasts. People want people behind an instrument, for some reason. People pay for others' suffering, not for a thing's peculiarity.

February 16, 2024 at 4:54 PM

alisonatwork

I don't think this is entirely accurate. There are entire genres of music where the audience does not want a person behind the piano/guitar/drums. Plenty of electronic artists have tried the live band gimmick and while it goes down well with a certain segment of the audience, it turns off another segment that doesn't want to hear "humanized" cover versions of the material. But the point is that both of those audiences exist, and they both have lots of opportunity to hear the music they want to hear. The same will be true of visual art created by computers. Some people will prefer a stronger machine element, other people will prefer a stronger human element, and there is room for us all.

February 16, 2024 at 1:42 PM

bamboozled

> I don't think this is entirely accurate. There are entire genres of music where the audience does not want a person behind the piano/guitar/drums.

Hilariously, nearly every electronic artist I can think of stands in front of a crowd and plays "live" by twisting dials etc., so I think it's fairly accurate.

Carl Cox, Tycho, Aphex Twin, Chemical Brothers, Underworld, to name a few.

February 16, 2024 at 7:09 PM

alisonatwork

DJ performances far outnumber "live" performances in the electronic scene. Perhaps you can cherry-pick certain DJs and make a point that they are creating a new musical composition by live-remixing the tracks they play, but even then a significant number of clubbers don't care, they just want to dance to the music. There are venues where a bunch of the audience can't even see the DJ and they still dance because they are enjoying the music on its own merits.

I stand by my original point. There are plenty of people who really do not care if there is a human somewhere "performing" the music or not. And that's totally fine.

February 16, 2024 at 10:47 PM

bamboozled

If there is no human performing there, then it's a completely different event, so I actually have little idea what we're debating.

February 17, 2024 at 9:17 AM

alisonatwork

Your reasoning is circular. Humans who go to performances of other humans playing instruments enjoy seeing other humans playing instruments. That should not be surprising. The question is whether humans as a whole intrinsically prefer seeing other humans playing instruments over hearing a "perfect" machine reproduction. And the answer to that question is no. There are plenty of humans who really do prefer the machine reproduction.

February 17, 2024 at 11:33 AM

bamboozled

If you're still talking about whether people want to hear live covers, or recordings, I think it's an apples to oranges comparison therefore I don't see the point in it.

February 17, 2024 at 2:02 PM

taylorius

Why does the DJ need to be there, in such a case?

February 17, 2024 at 4:28 AM

alisonatwork

Mainly to pick songs that fit the mood of the audience. At the moment, humans seem to do a better job "reading" the emotions of other humans in this kind of group setting than computers do, and people are willing to pay for experts who have that skill.

An ML model could probably do a good job at selecting tunes of a particular genre that fit into a pre-defined "journey" that the promoter is trying to construct, so I could see a role for "AI DJs" in the future, especially for low budget parties during unpopular timeslots like first day of a festival while people are still arriving and the crew is still setting up. Some of that is already done by just chucking a smart playlist on shuffle. But then you also have up-and-comer or hobbyist DJs who will play for free in those slots, so maybe there's not really a need for a smarter computer to take over the job.

This whole thread started from the question of why a human should do something when a machine can do it better. And the answer is simple: because humans like to do stuff. It is not because humans doing stuff adds some kind of hand-wavey X factor that other humans intrinsically prefer.

February 17, 2024 at 11:23 AM

czl

> Musicians could automate all the instruments with incredible accuracy since a long time. But they never do that.

What do you judge was the ratio of automated music (recordings played back) to live music played in the last year?

February 16, 2024 at 10:43 AM

anoopelias

Just to be clear, I was talking about the original sound produced by a person (vs. a machine). Of course it was recorded and played back a _lot_ more than folks listening live.

But I take it, maybe I'm not so familiar with world music, I was talking more about Indian music. While the music is recorded and mixed across several tracks electronically, I think most of it is played (or sung) originally by a person.

February 16, 2024 at 11:30 AM

Larrikin

His point still stands.

In the US at least there's the occasional acoustic song that becomes a hit, but rock music is obviously on its way to slowly becoming jazz status. It and country are really the last genres where live traditional instruments are common during live performances. Pop, Hip Hop, and EDM basically all are put together as being nearly computer perfect.

All the great producers can play instruments, and that's often times the best way to get a section out initially. But what you hear on Spotify is more and more meticulously put together note by note on a computer after the fact.

Live instruments on stage are now often for spectacle or worse a gimmick, and it's not the song people came to love. I think the future will have people like Lionclad[1] in it pushing what it means to perform live, but I expect them to become fewer and fewer as music just gets more complex to produce overall.

[1] https://www.youtube.com/watch?v=MuBas80oGEU

February 16, 2024 at 2:33 PM

Tainnor

Thankfully, art is not about the least common denominator and I'm confident that there will continue to be music played live as long as humanity exists.

February 16, 2024 at 5:10 PM

Larrikin

Music has a lot of people who believe that not only is their favorite genre the best but that they must tear down people who don't appreciate it.

You aren't better because you prefer live music, you just have a preference. Music wasn't better some arbitrary number of years ago, you just have a preference.

Nobody said one form is objectively better, just that there is a form that is becoming more popular.

But to state my opinion, I can't imagine something more boring than thinking the best of music, performance, TV, or media in general was done best and created in the past.

February 16, 2024 at 5:19 PM

Tainnor

It's not that I think my tastes in music are objectively better, it's that I strongly feel that music is a very personal matter for many people and there will be enough people who will seek out different forms of music than what is "popular". Rock, jazz, even classical music, are still alive and well.

> But to state my opinion, I can't imagine something more boring than thinking the best of music, performance, TV, or media in general was done best and created in the past.

And to state my opinion, art isn't about "the best" or any sort of progress, it's about the way we humans experience the world, something I consider to be a timeless preoccupation, which is why a song from 2024 can be equally touching as a ballad from the 14th century.

February 17, 2024 at 10:13 AM

picklesman

When I was studying music technology and using state of the art software synthesizers and sequencers, I got more and more into playing my acoustic guitar. There's a deep and direct connection and a pleasure that comes with it that computers (and now/eventually AI) will never be able to match.

(That being said, a realtime AI-based bandmate could be interesting...)

February 16, 2024 at 12:17 PM

inference-lord

My son is an interesting example of this, I can play all the best guitar music on earth via the speakers, but when I physically get the guitar out and strum it, he sits up like he has just seen god, and is in total awe of the sounds of it, the feel of the guitar and the sight of it. It's like nothing else can compare. Even if he is hysterically crying, the physical instrument and the sound of it just makes him calm right down.

I wonder if something is lost in the recording process that just cannot be replicated? A live instrument is something that you can actually feel the sound of IMO, I've never felt the same with recorded music even though I of course enjoy it.

I wonder if when we get older we just get kind of "bored" (sadly) and it doesn't mean as much to us as it probably should.

February 16, 2024 at 12:48 PM

vczf

Mirror neurons?

February 16, 2024 at 1:12 PM

inference-lord

What does this have to do with it?

February 16, 2024 at 1:33 PM

vczf

I'm speculating that one would have more mirror neuron activation watching a person perform live, compared to listening to a recording or watching a video. Thus the missing component that makes live performance special.

February 16, 2024 at 1:36 PM

rightbyte

The sound feels present with live music. Speakers have this synthetic far away feel no matter how good they are.

February 16, 2024 at 6:51 PM

tzs

What about live music on non-acoustic instruments so it inherently comes through a speaker?

February 19, 2024 at 5:50 AM

inference-lord

My son isn't even a toddler so I don't think it would possibly be "mirror neurons".

February 16, 2024 at 7:10 PM

_glass

For me the guitar is like the keyboard I am writing on right now. It will never be replaced, because that is how I input music into the world. I could not program that. I was doing tracker music as a teenager, and all of the songs sounded weird, because the timing and so on was not right. And now when I transcribe demos and put them into a DAW, there seem to be milliseconds off that are not quite right. I still play the piano parts live, because we don't have the technology right now to make it sound better than a human, and even if we had, it would not be my music, but what an AI performed.

February 16, 2024 at 4:48 PM

throwaway14356

I really briefly looked at AI in music, lots of wild things are made. It is hard to explain, one was generating a bunch of sliders after mimicking a sample from sine waves (quite accurately)

February 16, 2024 at 2:01 PM

sdrothrock

> Musicians could automate all the instruments with incredible accuracy since a long time. But they never do that. For some reason, they still want a person behind the piano / guitar / drums.

This actually happened on a recent hit, too -- Dua Lipa's Break My Heart. They originally had a drum machine, but then brought in Chad Smith to actually play the drums for it.

Edit: I'm not claiming this was new or unusual, just providing a recent example.

February 16, 2024 at 12:18 PM

shon

This goes way back. Nine Inch Nails was a synth-first band with the music being written by Trent in a studio on a DAW. That worked, but what really made the band was the live shows, so they found ways, even using 2 drummers, to translate the synths and machines into human-played instruments.

Also, way before that, back in the early ’80s, Depeche Mode displayed the recorded drum reel onstage so everyone knew what it was, but when they got big enough they also transitioned into an epic live show with guitars and live drums, as well as synth-hooked drum devices they could bang on in addition to keyboards.

We are human. We want humans. Same reason I want a hipster barista to pour my coffee when a machine could do it just as well.

February 16, 2024 at 12:34 PM

inference-lord

> Same reason I want a hipster barista to pour my coffee when a machine could do it just as well.

I've wondered about this for a long time too: why on earth is anyone still able to be a barista? It turns out people actually like the community around cafes, and often that means interacting with the staff on a personal level.

Some of my best friends have been baristas I've gone to over several years.

February 16, 2024 at 12:45 PM

baq

Back before Twitter was born, or perhaps tv, cafes were just that - a place to spend evenings (…just don’t ask who watched over the kids)

February 16, 2024 at 2:48 PM

lox

It’s more than that: doing it well is still beyond sophisticated automation. There are many variables that need to be constantly adjusted for. Humans are still much better at it than machines, regardless of the social element.

February 16, 2024 at 6:17 PM

shon

If true, probably not for long. Still, my point is that people are the customers. It’s more fun to think about what won’t change. I think we will still have baristas.

February 17, 2024 at 9:15 AM

lukan

A good live performance is intentionally not 100% the same as in the studio, but there can and should be variations. A refrain repeated another time, some improvisation here. Playing with the tempo there. It takes a good band, who know each other intimately, to make that work, though. (a good DJ can also do this with electronic music)

A recorded studio version, I can also listen to at home. But a full band performing in this very moment is a different experience to me.

February 16, 2024 at 2:18 PM

zogrodea

Regarding your point about music:

There are subtle and deliberate deviations in timing and elements like vibrato when a human plays the same song on an instrument twice, which is partly why (aside from recording tech) people prefer live or human musicians.

Think about how precise and exacting a computer can be. It can play the same notes in a MIDI editor with exact timing, always playing note B after 18 seconds of playing note A. Human musicians can't always be that precise in timing, but we seem to prefer how human musicians sound with all of the variations they make. We seem to dislike the precise mechanical repetition of music playback on a computer comparatively.

I think the same point generalises into a general dislike on the part of humans of sensory repetition. We want variety. (Compare the first and second grass pictures at [0] and you will probably find that the second which has more "dirt" and variety looks better.) "Semantic satiation" seems to be a specific case of the same tendency.

I'm not saying that's something a computer can't achieve eventually but it's something that will need to be done before machines can replace musicians.

[0] http://gas13.ru/v3/tutorials/sywtbapa_gradient_tool.php

February 16, 2024 at 8:03 PM

riwsky

You can modulate MIDI timing with noise. In some programs, there’s literally a Humanize button.
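
A minimal sketch of what such a Humanize step might do, assuming notes are plain (onset in seconds, MIDI pitch, velocity) tuples rather than any particular DAW's format. The function name `humanize` and the jitter defaults are hypothetical, not taken from a real program:

```python
import random


def humanize(notes, timing_jitter=0.010, velocity_jitter=6, seed=None):
    """Return a copy of `notes` with small random timing and velocity offsets.

    `notes` is a list of (onset_seconds, midi_pitch, velocity) tuples.
    `timing_jitter` is the standard deviation of the onset noise in seconds;
    `velocity_jitter` is the maximum velocity change in either direction.
    Both defaults are illustrative guesses, not values from any real tool.
    """
    rng = random.Random(seed)
    out = []
    for onset, pitch, velocity in notes:
        # Gaussian noise around the quantized onset; never start before time 0.
        new_onset = max(0.0, onset + rng.gauss(0.0, timing_jitter))
        # Keep the velocity inside the 1..127 range used for sounding notes.
        offset = rng.randint(-velocity_jitter, velocity_jitter)
        new_velocity = min(127, max(1, velocity + offset))
        out.append((new_onset, pitch, new_velocity))
    # Jitter can reorder notes that were very close together, so re-sort by onset.
    out.sort(key=lambda note: note[0])
    return out


# Example: three perfectly quantized notes, loosened slightly.
quantized = [(0.0, 60, 100), (0.5, 62, 100), (1.0, 64, 100)]
loosened = humanize(quantized, seed=42)
```

Random jitter like this only loosens the grid. It does not try to model the deliberate, phrase-level timing choices a player makes.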

February 17, 2024 at 2:25 PM

zogrodea

Yes. I tried that with some software-based synthesisers (like the SWAM violin and Reason's Friktion) which are designed for human-playing (humans controlling the VST through a device that emits MIDI CC control messages) but my understanding is that the modulation that skilled human players perform with tends to be better/more desirable than what software modulators can currently achieve.

February 18, 2024 at 8:00 AM

code51

The real dilemma is with composition/song-writing.

Ability to create live experiences can still be a motivating factor for musicians (aside from the love of learning). Yet, when AI does the song-writing far more effectively, then will the musician ignore this?

It's like Brave New World. Musicians who don't use these AI tools for song-writing will be like a tribe outside modern world. That's a tough future to prepare for. We won't know whether a song was actually the experience and emotions of a person or not.

February 16, 2024 at 3:53 PM

palmfacehn

Even if we assume that people want fully automated music, the process of learning to play educates the musician. Similarly, you'd still need a director/auteur, editors, writers and other roles I have no appreciation or knowledge of to create a film from AI models.

Steam shovels and modern excavators didn't remove our need for shovels or more importantly, the know-how to properly apply these tools. Naturally, most people use a shovel before they operate an excavator.

February 16, 2024 at 6:58 PM

Avicebron

It's interesting though. The question really becomes: if 10 people used to shovel manually to feed their families, and now it takes 1 person and an excavator, what in good faith do you tell the other 9? "Don't worry, you can always be a hobby shovelist"?

February 16, 2024 at 7:35 PM

palmfacehn

They can apply their labor wherever it is valued. Perhaps they will become more productive excavator operators. By creating value in a specialized field their income would increase. Technology does not decrease the need for labor. Rather it increases the productivity of the laborer.

Human ingenuity always finds a need for value creation. Greater abundance creates new opportunities.

Take the inverse position. Should we go back to reading by candlelight to increase employment in candle making?

No, electric lighting allowed people to become productive during night hours. A market was created for electricity producers, which allowed additional products which consume electricity to be marketed. Technological increases in productivity cascade into all areas of life, increasing our living standards.

A more interesting, if not controversial line of inquiry might start with: If technology is constantly advancing human productivity, why do modern economies consistently experience price inflation?

February 16, 2024 at 8:04 PM

stevesimmons

You miss the important point, which is the productivity gain means the average living standard of society as a whole increases. A chunk of what is now regarded as 'toil' work disappears, and the time freed up is able to be deployed more productively in other areas.

Of course, this change is dislocating for the particular people whose toil disappeared. They need support to retrain to new occupations.

The alternative is to cling to a past where everyone - on average - is poorer, less healthy, and works in more dangerous jobs.

February 16, 2024 at 8:06 PM

Avicebron

That's awesome, sign me up for retraining. Where do I go and who can I talk to so I can be retrained into a less drudgery filled position?

Clearly if there are ways out of being displaced, please share them

February 16, 2024 at 8:22 PM

Someone

The ‘augmented singer’ is very popular, though. https://en.wikipedia.org/wiki/Auto-Tune: “Auto-Tune has been widely criticized as indicative of an inability to sing on key.”

February 16, 2024 at 7:42 PM

grotorea

Live play is what, 1% of all music heard in the world? Computers, radios, iPods and phones all play automated reproductions.

February 16, 2024 at 8:44 PM

anigbrowl

> Musicians could automate all the instruments with incredible accuracy since a long time. But they never do that. For some reason, they still want a person behind the piano / guitar / drums.

You've never been to a rave, huh? For that matter, there's a lot of pop artists that use sequencers and dispense with the traditional band on stage.

February 16, 2024 at 2:32 PM

itronitron

I can see this being used extensively for short commercials, as the uncanny aspect of a lot of the figures will help to capture people's attention. I don't necessarily believe it will be less expensive than hiring a director and film crew however.

February 16, 2024 at 5:56 PM

Solvency

I love these hot takes based on profoundly incredible tech that literally just launched. Acting like 2030 isn't around the corner.

February 16, 2024 at 6:02 AM

chefandy

> I love these hot takes based on profoundly incredible tech that literally just launched. Acting like 2030 isn't around the corner.

It seems bizarre to think the gee whiz factor in a new commercial creative product makes critiquing its output out-of-bounds. This isn't a university research team: they're charging money for this. Most people have to determine if something is useful before they pay for it.

February 16, 2024 at 7:13 AM

goatlover

Let me guess, hard singularity take-off in 2030? Does the hype cycle not exist for techno-optimists? Just one breathless prediction after another?

February 16, 2024 at 1:30 PM

andrepd

Anything less than absolute enrapture is a "hot take"... :)

February 16, 2024 at 7:04 AM

bamboozled

We’re glad you love them.

February 16, 2024 at 6:46 AM

dugite-code

> fingers are still weird

Also keep an eye on teeth and high contrast text. Anything small and prone to distortion in low resolution video and images used to train this stuff.

February 16, 2024 at 2:39 PM

sinuhe69

Yeah. I think people nowadays are in a kind of AI-euphoria and they take every advancement in AI for more than what it really is. The realization of their limitations will set in once people have been working long enough on the stuff. The capacity of the newfangled AIs is impressive. But even more impressive are their mimicry capabilities.

February 16, 2024 at 2:04 PM

Qwero

Are you joking?

We were not even able to create random videos just by text prompting a few years back, and now this.

The progress is crazy.

Why do you dismiss this?

February 16, 2024 at 3:26 PM

cezart

Not dismissing, but being realistic. I've observed that all the AI tools usually amaze most people initially by showing capabilities never seen before. Then people realise their limitations, i.e. what capabilities are still missing. And they're like: "oh, this is no genie in a bottle capable of satisfying every wish. We'll still have to work to obtain our vision..." So the magic fades away, and the world returns to normal, but now with an additional tool very useful in some situations :)

February 16, 2024 at 4:40 PM

Qwero

I'm still amazed.

The progress doesn't slow down right now at all.

This is probably one of the most exciting developments in the world besides the Internet.

And Gemini's news regarding the 1 million token window shows where we are going.

This will impact a lot of people faster than a lot of people realize.

February 16, 2024 at 5:27 PM

attilakun

I agree. Skepticism usually serves people well as a lot of new tech turns out to be just hype. Except when it is not and I think this is one of those few cases.

February 16, 2024 at 6:15 PM

MSFT_Edging

Not who you're replying to but this is a toy.

AI won't make artistic decisions that wow an audience.

AI won't teach you something about the human condition.

AI will only enable higher quarterly profits from layoffs until GPU costs catch up.

What the fuck is the point of AI automating away jobs when the only people who benefit are the already enormously wealthy? AI won't be providing time to relax for the average worker, it will induce starvation. Anything to prevent will be stopped via lobbying to ensure taxes don't rise.

Seriously, what is the point? What is the point? What the fuck is there to live for when art and humanities is undermined by the MBA class and all you fucking have is 3 gig jobs to prevent starvation?

February 16, 2024 at 9:22 PM

fennecbutt

Problem isn't the tool but with the tools using the tool.

It's not ML's fault that we don't have UBI, it's the voters' fault.

February 18, 2024 at 9:38 PM

Qwero

I believe ai and full automatisation is critical for a Star Trek society.

We are not very good at providing anything reasonable today because capitalism is still way too strong and manual labor still way too necessary.

Nonetheless look at my country Germany: we are a social state. Plenty of people get 'free' money and it works.

The other good thing: there are plenty of people who know what good is (good art etc.) but are not able to draw. They can also express themselves. AI as a tool.

If we as society discover that there will be no really new music or art happening I don't know what we will do.

Plenty of people are well entertained with crap anyway.

February 18, 2024 at 1:36 AM

bowsamic

Sure there are limitations but this is still absurdly impressive.

My benchmark is the following: imagine if someone 5 years ago told you that in 5 years we could do this, you would think they were crazy.

February 16, 2024 at 5:39 PM

patall

I would not. Five (six, seven?) years ago, we had style transfer with video and everyone was also super euphoric about that. If I compare to those videos, there is clearly progress but it is not like we started from zero 2 years ago.

February 16, 2024 at 6:52 PM

bowsamic

I don't really know what you mean by "euphoric", this is a term I only know from drugs. Can you define it?

February 16, 2024 at 7:10 PM

Avicebron

"Blissful/happy", which is why the word euphoria is often abused to be sinister

February 16, 2024 at 7:27 PM

npinsker

It means "extremely happy", but it's usually used to refer to a particular moment in time (rather than a general sentiment), and so the word sounds a bit out of place here, to me.

February 16, 2024 at 7:26 PM

dugite-code

And further down the page:

"The camera follows behind a white vintage SUV with a black roof": The letters clearly wobble inconsistently.

"A drone camera circles around a beautiful historic church built on a rocky outcropping along the Amalfi Coast": The woman in the white dress in the bottom left suddenly splits into multiple people like she was a single cell microbe multiplying.

February 16, 2024 at 2:37 PM

Yiin

Sure, but think what it will be capable of two papers ahead :)

February 16, 2024 at 3:19 PM

csomar

Progress in this field has not been linear, though. So it's quite possible that two papers ahead we are still in the same place.

February 16, 2024 at 3:44 PM

dr_dshiv

On the other hand, this is the first convincing use of a “diffusion transformer” [1]. My understanding is that videos and images are tokenized into patches, through a process that compresses the video/images into abstracted concepts in latent space. Those patches (image/video concepts in latent space) can then be used with transformers (because patches are the tokens). The point is that there is plenty of room for optimization following the first demonstration of a new architecture.

Edit: sorry, it’s not the first diffusion transformer. That would be [2]

[1] https://openai.com/research/video-generation-models-as-world...

[2] https://arxiv.org/abs/2212.09748
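
The patch step described above can be sketched roughly as follows. This is a toy illustration only: it cuts a raw video (nested lists indexed [frame][row][column]) into non-overlapping spacetime blocks and flattens each block into one token-like vector. It skips the learned compression into latent space entirely, and the function name and patch sizes are assumptions, not details from [1] or [2]:

```python
def to_spacetime_patches(video, pt, ph, pw):
    """Split `video` into non-overlapping blocks of pt frames x ph rows x pw columns.

    `video` is a nested list indexed as video[frame][row][column]. Each entry
    can be a scalar or a per-pixel value such as an (r, g, b) tuple.
    Returns a list of flattened patches, ordered by time, then row, then column.
    Hypothetical helper for illustration; real models patchify a learned
    latent representation, not raw pixels.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Flatten one pt x ph x pw block into a single sequence element.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# Example: a 2-frame, 4x4 "video" whose values encode their own position.
video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)] for t in range(2)]
tokens = to_spacetime_patches(video, pt=1, ph=2, pw=2)
```

Each element of `tokens` plays the role a word token plays in a text transformer, which is the sense in which "patches are the tokens".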

February 16, 2024 at 4:31 PM

koconder

Here is an explainer https://towardsdatascience.com/explaining-openai-soras-space...

February 17, 2024 at 5:34 AM

dr_dshiv

I think it is misleading. The role of the diffusion network is completely absent from this explanation

February 18, 2024 at 6:28 PM

fennecbutt

Hold on to your papers~

February 18, 2024 at 9:38 PM

brookst

It’s not perfect, for sure. But maybe this isn’t the final pinnacle of the tech?

February 16, 2024 at 9:13 PM

Hoasi

> I disagree, just look at the legs of the woman in the first video.

The people behind her all walk at the same pace and seem to be floating. The moving reflections, on the other hand, are impressive make-believe.

February 16, 2024 at 5:24 AM

b1gnasty

Really makes me think of The Matrix scene with the woman in the red dress. Can't tell if they did this on purpose to freak us all out? Are we all just prompts?

February 16, 2024 at 12:13 PM

grotorea

I'm 99% sure this is supposed to invoke cyberpunk but not sure about The Matrix.

February 16, 2024 at 8:34 PM

kyrra

If you watch the background, you'll see one guy has his pants change color. And also, some of the guys are absolute giants compared to people around them.

February 16, 2024 at 6:08 AM

matt_s

Yep. If you look at the detail you can find obvious things wrong and these are limited to 60s in length with zero audio so I doubt full motion picture movies are going to be replaced anytime soon. B-roll background video or AI generated backgrounds for a green screen sure.

I would expect any subscription to use this service when it comes out to be very expensive. At some point I have to imagine the GPU/CPU horsepower needed will outweigh the monetary costs that could be recovered. Storage costs too. It's much easier to tinker with generating text or static images in that regard.

Of note: NVDA's quarterly results come out next week.

February 16, 2024 at 8:59 PM

anigbrowl

> Same story as the fingers before.

This is weird to me considering how much better this is than the SOTA still images 2 years ago. Even though there's weirdo artefacts in several of their example videos (indeed including migrating fingers), that stuff will be super easy to clean up, just as it is now for stills. And it's not going to stop improving.

February 16, 2024 at 2:27 PM

bamboozled

Agreed and these are the cherry picked examples of course.

February 16, 2024 at 6:46 AM

jstummbillig

> But I think many people will be very uncomfortable with such motion very quickly.

Given the momentum in this space, I think you will have to get very uncomfortable super quick about any of the shortcomings of any particular model.

February 17, 2024 at 1:53 AM

samstave

>>>just look at the legs of the woman

Denise Richards hard sharp knees in '97

--

these infant tech are already insanely good... just wait, and rather try to focus on "what should I be betting on 5 years from now?"

I suggest 'invisibility cloaks' (ghosts in machines?)

February 16, 2024 at 12:38 PM

josemanuel

At second 15, of the woman video, the legs switch sides!! Definitely there are some glitches :)

February 17, 2024 at 6:08 AM

4b11b4

The left and right side of her face are almost... a different person.

February 17, 2024 at 1:46 AM

gerash

When others create text to video systems (eg. Lumiere from Google) they publish the research (eg. https://arxiv.org/pdf/2401.12945.pdf). OpenAI is all about commercialization. I don't like their attitude.

February 16, 2024 at 4:57 AM

comex

Google is hardly a good actor here. They just announced Gemini 1.5 along with a "technical report" [1] whose entire description of the model architecture is: "Gemini 1.5 Pro is a sparse mixture-of-expert (MoE) Transformer-based model". Followed by a list of papers that it "builds on", followed by a definition of MoE. I suppose that's more than OpenAI gave in their GPT-4 technical report. But not by much!

[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...

February 16, 2024 at 11:13 AM

cubefox

The report and the previous one for 1.0 definitely contain much more information than the GPT-4 whitepaper. And Google regularly publishes technical details on other models, like Lumiere, things that OpenAI stopped doing after their InstructGPT paper.

February 16, 2024 at 8:47 PM

cchance

Maybe because GPT3.5 is closer to what Gemini 1.0 was... GPT4 and Gemini 1.5 are similarly sparse in their "how we did it and what we used" when it comes to papers

February 17, 2024 at 2:42 AM

jstummbillig

Not to be overly cute, but if the cutting edge research you do is maybe changing the world fundamentally, forever, guarding that tech should be really, really, really far up your list of priorities and everyone else should be really happy about your priorities.

And that should probably take precedence over the semantics of your moniker, every single time (even if hn continues to be super sour about it)

February 16, 2024 at 5:09 AM

cloogshicer

I'd much rather this tech be open - better for everyone to have it than a select few.

The more powerful, the more important it is that everyone has access.

February 16, 2024 at 5:17 AM

crazygringo

Do you feel the same way about nuclear weapons tech?

That "the more powerful, the more important it is that everyone has access"?

Especially considering that the biggest killer app for AI could very well be smart weapons like we've never seen before.

February 16, 2024 at 5:23 AM

spdustin

I feel this is a false equivalence.

Nukes aren’t even close to being commodities, cannot be targeted at a class of people (or a single person), and have a minutely small number of users. (Don’t argue semantics with “class of people” when you know what I mean, btw)

On the other hand, tech like this can easily become as common as photoshop, can cause harm to a class of people, and be deployed on a whim by an untrained army of malevolent individuals or groups.

February 16, 2024 at 5:34 AM

nearbuy

So if someone discovered a weapon of mass destruction (say some kind of supervirus) that could be produced and bought cheaply and could be programmed to only kill a certain class of people, then you'd want the recipe to be freely available?

February 16, 2024 at 1:28 PM

sanitycheck

This poses no direct threat to human life though. (Unlike, say, guns - which are totally fine for everyone in the US!)

The direct threat to society is actually this kind of secrecy.

If ordinary people don't have access to the technology they don't really know what it can do, so they can't develop a good sense of what could now be fake that only a couple of years ago must have been real.

Imagine if image editing technology (Photoshop etc) had been restricted to nation states and large powerful corporations. The general public would be so easy to fool with mere photographs - and of course more openly nefarious groups would have found ways to use it anyway. Instead everybody now knows how easily we can edit an image and if we see a shot of Mr Trump apparently sharing a loving embrace with Mr Putin we can make the correct judgement regarding a probable origin.

February 16, 2024 at 3:24 PM

war321

The bottleneck for bioterrorism isn't AI telling you how to do something, it's producing the final result. You wanna curtail bioweapons, monitor the BSL labs, biowarfare labs, bioreactors, and organic 3D printers. ChatGPT telling me how to shoot someone isn't gonna help me if I can't get a gun.

February 16, 2024 at 11:24 PM

nearbuy

This isn't related to my comment. I wasn't asking what if an AI invents a supervirus. I was asking what if someone invents a supervirus. AI isn't involved in this hypothetical in any way.

I was replying to a comment saying that nukes aren't commodities and can't target specific classes of people, and I don't understand why those properties in particular mean access to nukes should be kept secret and controlled.

February 17, 2024 at 4:06 AM

lastdong

I understand your perspective regarding the potential risks associated with freely available research, particularly when it comes to illegal weapons and dangerous viruses. However, it's worth considering that by making research available to the world, we enable a collaborative effort in finding solutions and antidotes to such threats. In the case of Covid, the open sharing of information led to the development of vaccines in record time.

It's important to weigh the benefits of diversity and open competition against the risks of bad actors misusing the tools. Ultimately, finding a balance between accessibility and responsible use is key.

What guarantee do we have that OpenAI won't become an evil actor like Skynet?

February 16, 2024 at 2:51 PM

nearbuy

I'm not advocating for or against secrecy. I'm just not understanding the parent comment I replied to. They said nukes are different than AI because they aren't commodities and can't target specific classes of people, and presumably that's why nukes should be kept secret and AI should be open. Why? That makes no sense to me. If nukes had those qualities, I'd definitely want them kept secret and controlled.

February 16, 2024 at 4:30 PM

tavavex

An AI video generator can't kill billions of people, for one. I'd prefer it if access wasn't limited to a single corporation that's accountable to no one and is incentivized to use it for their benefits only.

February 16, 2024 at 5:39 AM

jstummbillig

> accountable to no one

What do you mean? Are you being dramatic or do you actually believe that the US government will/can not absolutely shut OpenAI down, if they feel it was required to guarantee state order?

February 16, 2024 at 5:56 AM

tavavex

For the US government to step in, they'd have to do something extremely dangerous (and refuse to share with the government). If we're talking about video generation, the benefits they have are financial, and the lack of accountability is in that they can do things no one else can. I'm not saying they'll be allowed to break the law, there's plenty of space between the two extremes. Though, given how things were going, I can also see OpenAI teaming up with the US government and receiving exclusive privileges to run certain technologies for the sake of "safety". It's what Altman has already been pushing for.

February 16, 2024 at 6:01 AM

NoGravitas

> An AI video generator can't kill billions of people, for one.

Not directly. But I won't be surprised if AI video generators are somewhere in the chain of causes of gigadeaths this century.

February 17, 2024 at 12:10 AM

huytersd

I think it could. The right sequence of videos sent to the right people could definitely set something catastrophic off.

February 16, 2024 at 8:19 AM

czl

> The right sequence of videos sent to the right people could definitely set something catastrophic off.

...after amazing public world wide demos that show how real the AI generated videos can be? How long has Hollywood had similar "fictional videos" powers?

February 16, 2024 at 10:50 AM

NoGravitas

> ...after amazing public world wide demos that show how real the AI generated videos can be?

How quickly do you think our gerontocracy will adapt to the new reality?

February 17, 2024 at 12:12 AM

huytersd

Flat earth Billy can now make videos with a $20 subscription.

February 16, 2024 at 1:16 PM

8n4vidtmkvmk

I think that's great. Billy will feed his flat earther friends for a few weeks or months and pretty soon the entire world will wise up and be highly skeptical of any new such videos. The more of this that gets out there, the quicker people will learn. If it's 1 or 2 videos to spin an election... People might not get wise to it.

February 16, 2024 at 2:07 PM

huytersd

Given the last 10 years I have no such faith in the common person.

February 16, 2024 at 2:56 PM

WhrRTheBaboons

which will only continue to convince people if the technology stays safely locked away in possession of a single corp.

if it were opened to the public, faking such videos would lose (nearly) all of its power

February 16, 2024 at 6:52 PM

ngcazz

Make it high-enough fidelity, and it will be used to convince people to kill billions.

February 16, 2024 at 4:58 PM

jiggawatts

Video can convince people to kill each other now because it is assumed to show real things. Show people a Jew killing a Palestinian, and that will rile up the Muslims, or vice versa.

When a significant fraction of video is generated content spat out by a bored teenager on 4chan, then people will stop trusting it, and hence it will no longer have the power to convince people to kill.

February 16, 2024 at 7:23 PM

mengibar10

You don't need to generate fake videos for that example. The State of Israel has been killing Palestinians en masse for a long time and has intensified the effort over the last 4 months. The death toll is 29,000+ and counting. Two thirds are children and women.

Israeli media machinery has been parading photographs of damaged houses, damage that could only have been done by heavy artillery or tank shells, and blaming it on rebels carrying infantry rifles.

But I agree, as if the current tools were not enough to sway people they will have more means to sway public opinion.

February 16, 2024 at 11:35 PM

ardaoweo

Hamas has similarly been shooting rockets into Israel for a long time. Eventually people get tired and stop caring about long-lasting conflicts, just like we don't care about concentration camps in North Korea and China, or various deadly civil wars in Sub-Saharan Africa, some of which have killed way more civilians than all wars in Palestinian history. One can already see support towards Ukraine fading as well, even though there Western countries would have a real geopolitical interest.

February 17, 2024 at 1:42 AM

solardev

> Especially considering that the biggest killer app for AI could very well be smart weapons like we've never seen before.

A homing missile that chases you across continents and shows you disturbing deepfakes of yourself until you lose your mind and ask it to kill you. At that point it switches to encourage mode, rebuilds your ego, and becomes your lifelong friend.

February 16, 2024 at 8:49 AM

bb88

I don't think it's really that hard to make a nuclear weapon, honestly. Just because you have the plans for one, doesn't mean you have the uranium/plutonium to make one. Weapons-grade uranium doesn't fall into your lap.

The ideas of critical mass, prompt fission, and uranium purification, along with the design of the simplest nuclear weapon possible, have been out in the public domain for a long time.

February 16, 2024 at 9:52 AM

Vinnl

Oof, imagine if our safeguard for nuclear weapons was that a private company kept it safe.

February 16, 2024 at 6:45 AM

nlnn

While it's probably too idealistic to be possible, I'd rather try and focus on getting people/society/the world to a state where it doesn't matter if everyone has access (i.e. getting to a place where it doesn't matter if everyone has access to nuclear weapons, guns, chemical weapons, etc., because no-one would have the slightest desire to use them).

As things are at the moment, while suppression of a technology has benefits, it seems like a risky long-term solution. All it takes is for a single world-altering technology to slip through the cracks, and a bad actor could then forever change the world with it.

February 16, 2024 at 5:28 PM

lagrange77

On a geopolitical level 'everyone' does have access.

February 16, 2024 at 7:15 AM

Fidelix

Do you feel the same way about electricity?

February 16, 2024 at 10:49 PM

jstummbillig

As long as destroying things remains at least two orders of magnitude easier than building things and defending against attacks, this take (as a blanket statement) will continue to be indefensible and irresponsible.

February 16, 2024 at 5:25 AM

iwsk

Should nukes be open source?

February 16, 2024 at 5:24 AM

spdustin

I humbly refer you to this comment:

https://news.ycombinator.com/item?id=39389262

February 16, 2024 at 5:35 AM

esafak

ML models of this complexity are just as accessible as nuclear weapons. How many nations possess a GPT-4? The only reason nuclear weapons are not more common is because their proliferation is strictly controlled by conventions and covert action.

February 16, 2024 at 12:17 PM

nradov

The basic designs for workable (although inefficient) nuclear weapons have been published in open sources for decades. The hard part is obtaining enough uranium and then refining it.

February 16, 2024 at 1:22 PM

baq

If you have two pieces of plutonium and put them too close together you have accidentally created a nuclear weapon… so yeah nukes are open source, plutonium breeding isn’t.

February 16, 2024 at 2:51 PM

extheat

I love it when people make this “nuke” argument because it tells you a lot more about them than it does about anything else. There are so many low information people out there, it’s a bit sad the state of education even in developed countries. There’s people trotting around the word “chemical” at things that are scary without understanding what exactly the word means, how it differs from the word mixture or anything like that. I don’t expect most people to understand the difference between a proton and a quark but at least a general understanding of physics and chemistry would save a lot of people from falling into the “world is magic and information is hidden away inside geniuses” mentality.

February 16, 2024 at 6:12 PM

Fidelix

Should electricity?

February 16, 2024 at 10:50 PM

bamboozled

What a load… imagine if everyone else guarded all their discoveries; there’d be no text-to-video, would there?

February 16, 2024 at 6:45 AM

andrepd

People defending this need to meditate on the meaning of the phrase "shoulders of giants".

February 16, 2024 at 7:02 AM

clayhacks

New technology will always be new giants to see from, but open source really is a nice ladder up to the shoulders of giants. So many benefits from sharing the tech

February 16, 2024 at 7:48 AM

spookie

This reminded me of a conversation with a historian. He requested the reconstruction of a monument in France that a game studio had already made.

The studio told him the model was their property, and they wouldn't share it.

Peculiar reasoning, isn't it?

February 16, 2024 at 12:45 PM

creatonez

This is meaningless until you've defined "world changing". It's possible that open sourcing AIs will be world-changing in a good way and developing closed source AIs will be world-changing in a bad way.

If I engineered the tech I would be much more fearful of the possibility of malice in the future leadership of the organization I'm under if they continue to keep it closed, than I would be fearful of the whole world getting the capability if they decide to open source.

I feel that, like with Yellow Journalism of the 1920s, much of the misinformation problem with generative AI will only be mitigated during widespread proliferation, wherein people become immune to new tactics and gain a new skepticism of the media. I've always thought it strange when news outlets discuss new deepfakes but refuse to show them, even with a watermark indicating they are fake. Misinformation research shows that people become more skeptical once they learn about the technological measures (e.g. buying karma-farmed Reddit accounts, or in the 1920s, taking advantage of dramatically lower newspaper printing costs to print sensationalism) through which misinformation is manufactured.

February 16, 2024 at 6:09 AM

grotorea

The problem is when we start to run out of reliable sources after becoming sceptical of everything.

February 16, 2024 at 8:43 PM

MeImCounting

It will be kind of like most of history where the only trustworthy method of communication is with face to face communication or with a letter or book (perhaps cryptographically) verified from a person you personally know or trust. Sounds good to me

February 17, 2024 at 12:05 PM

towelpluswater

This is a fantastic write up and great parallel to the state of where we’re headed.

February 16, 2024 at 12:22 PM

opportune

How convenient for all the OpenAI employees trying to make millions of dollars by commercializing their technology. Surely this technology won’t be well-understood and easily replicable in a few years as FOSS

February 16, 2024 at 7:35 AM

spookie

It will, even if they guard their secret sauce. Let's not be naive about this: obfuscation is and always will be a minor nuisance.

February 16, 2024 at 12:33 PM

andrepd

>If you have world-changing technology it's better for a megacorp to control it.

You need to watch more dystopian movies.

February 16, 2024 at 7:00 AM

RandomLensman

The wheel should have been a tightly controlled technology?

February 16, 2024 at 6:42 PM

y_gy

Ironic, isn't it! OpenAI started out "open," publishing research, and now "ClosedAI" would be a much better name.

February 16, 2024 at 5:02 AM

ionwake

TBH they should just rename to ClosedAI and run with it, I and others would appreciate the honesty plus it would be amusing.

February 16, 2024 at 5:11 AM

polygamous_bat

However if you are playing for the regulatory capture route (which Sam Altman seems to be angling for) it’s much easier if your name is “OpenAI”.

February 16, 2024 at 5:27 AM

tavavex

If you go full regulatory capture, you might as well name it "AI", The AI Company.

February 16, 2024 at 5:40 AM

ionwake

You never go "full" regulatory capture.

February 16, 2024 at 6:37 AM

efrank3

gottem

February 16, 2024 at 5:43 AM

ShamelessC

Sick burn!

February 16, 2024 at 6:17 AM

neya

When has OpenAI - for a company named "Open" AI ever released any of their stuff into anything open?

February 16, 2024 at 1:17 PM

sebzim4500

They actually did a few years ago, but that's ancient history in AI terms.

The most recent thing they released was Whisper, which to be fair is the only model with absolutely no safety implications.

February 16, 2024 at 6:41 PM

ambrose2

From what I remember reading, Open was never supposed to be like open source with the internals freely available, but Open as in available for the public to use, as opposed to a technology only for the company to wield and create content with.

February 16, 2024 at 9:09 PM

hnben

They stopped releasing their stuff openly around the time GPT3 came to be.

February 16, 2024 at 4:59 PM

sebzim4500

Whisper was after GPT3 and that was fully open.

February 16, 2024 at 6:41 PM

disillusioned

More like ClosedAI, amirite?

February 16, 2024 at 8:16 AM

mtillman

OAI requires a real mobile phone number to sign up and is therefore an adtech company.

February 16, 2024 at 5:09 AM

BadHumans

Might be one of the most absurd things said on here. Requiring a phone number for sign up does not automatically mean you are selling ads.

February 16, 2024 at 5:17 AM

polygamous_bat

When the time for making money comes, if you don’t think OpenAI will sell every drop of information they have on you, then you are incredibly naive. Why would they leave money on the table when everyone else has been doing it for forever without any adverse effects?

February 16, 2024 at 5:26 AM

Zacharias030

They are currently hiring people with Adtech experience.

The simplest version would be an ad-supported ChatGPT experience. Anyone thinking that an internet consumer company with 100m weekly active users (I’m citing from their job ad) is not going to sell ads is lacking imagination.

February 16, 2024 at 6:30 AM

jstummbillig

If Google Workspace was selling my or any customer's information, at all or "forever", it would not be called Google Workspace, it would be called Google We-died-in-the-most-expensive-lawsuit-of-all-time.

February 16, 2024 at 6:03 AM

8n4vidtmkvmk

There's a difference. OpenAI essentially has 2 products: the $20-a-month chat bot for Joe Schmoe, which they admit to training on your prompts, and the API for businesses. Workspace is like the latter. The former is closer to Google search.

February 16, 2024 at 2:13 PM

jstummbillig

Sure, but there is no ambiguity about that, is there? You know that, because they tell you (and, sure, maybe they only tell you, because they have to, by law – but they do and you know)

How do we get from there to "just assume every company in the world will sell your data in wildly and obviously illegal ways", I don't know.

February 16, 2024 at 10:24 PM

erhaetherth

Well..that does seem to be the default. If they don't explicitly say they won't, they probably will. It's a sad world.

February 17, 2024 at 10:38 AM

esafak

We're face to face with AGI and you're worried about ads?? Get your risks in order!!

February 16, 2024 at 12:23 PM

8n4vidtmkvmk

We're still nowhere near AGI.

February 16, 2024 at 2:09 PM

ilrwbwrkhv

The day the AI stops listening to prompts instead of following them is the day I will worry about AGI.

February 16, 2024 at 6:54 PM

esafak

You'd be too late. You're just waiting for someone to imbue a model with agency. We have agency due to evolution. Robots need it programmed into them, and honestly, that is easy to do compared with instilling reasoning. Primitive animals have agency. No animal can reason on the level of GPT. That will get us to HAL 9000. If you stick it in a robot, you have the Terminator.

February 17, 2024 at 11:34 AM

dudel

AI doesn’t exist. Neither in practice nor theoretically. Artificial intelligence is an oxymoron. Intelligence is a complex system. Artificial systems are logic systems. You live in a complex universe that you cannot perceive, i.e. we perceive it as noise/randomness only. All you can see are the logical systems expressed at the surface (Mandelbrot set) of the noise. Everything you see and know is strictly logical; all known laws of the universe are derived from those logical systems. Hence, we can only build logical systems. Not complex systems. There is a limit to what we can build here on the surface (Church-Turing). We never have and never will build a complex system.

February 18, 2024 at 11:49 AM

Sohcahtoa82

> Motion-capture works fine because that's real motion

Except in games where they mo-cap at a frame rate less than what it will be rendered at and just interpolate between mo-cap samples, which makes snappy movements turn into smooth movements and motions end up in the uncanny valley.

It's especially noticeable when a character is talking and makes a "P" sound. In a "P", your lips basically "pop" open. But if the motion is smoothed out, it gives the lips the look of making an "mm" sound. The lips of someone saying "post" look like they are saying "most".

At 30 fps, it's unnoticeable. At 144 fps, it's jarring once you see it and can't unsee it.
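
To make the smoothing effect concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the "true" lip motion is a made-up curve that snaps open within 5 ms, the capture rate is 30 fps, the render rate is 144 fps, and the in-between frames are filled with plain linear interpolation (real engines may use other schemes).

```python
def lip_opening(t, pop_at=0.100, pop_duration=0.005):
    """Hypothetical 'true' lip motion for a "P": closed, then fully open within 5 ms."""
    return min(1.0, max(0.0, (t - pop_at) / pop_duration))


def sample(signal, fps, duration):
    """Sample a continuous signal at a fixed frame rate (the mo-cap capture step)."""
    n = int(round(duration * fps)) + 1
    return [signal(i / fps) for i in range(n)]


def interpolate(samples, capture_fps, t):
    """Linearly interpolate between the two captured samples surrounding time t."""
    pos = t * capture_fps
    i = min(int(pos), len(samples) - 2)
    frac = pos - i
    return samples[i] * (1 - frac) + samples[i + 1] * frac


def largest_frame_to_frame_change(frames):
    """How 'snappy' the motion looks: the biggest jump between consecutive frames."""
    return max(abs(b - a) for a, b in zip(frames, frames[1:]))


CAPTURE_FPS, RENDER_FPS, DURATION = 30, 144, 0.2

captured = sample(lip_opening, CAPTURE_FPS, DURATION)
render_times = [i / RENDER_FPS for i in range(int(DURATION * RENDER_FPS) + 1)]

# What a 144 fps camera would have seen vs. what interpolated 30 fps mo-cap shows.
ground_truth = [lip_opening(t) for t in render_times]
interpolated = [interpolate(captured, CAPTURE_FPS, t) for t in render_times]

print("snappiest true frame step:        ", largest_frame_to_frame_change(ground_truth))
print("snappiest interpolated frame step:", largest_frame_to_frame_change(interpolated))
```

In this toy setup the true motion covers most of the opening in a single rendered frame, while the interpolated version spreads it over roughly five frames, which is the "pop" turning into an "mm".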

February 16, 2024 at 3:57 AM

omega3

Out of all the examples, the wooly mammoths one actually feels like CGI the most to me, the other ones are much more believable than this one.

February 16, 2024 at 4:58 AM

mtlmtlmtlmtl

Possibly because there are no videos or even photos of live wooly mammoths, but loads and loads of CG recreations in various documentaries.

February 16, 2024 at 5:16 AM

mikeInAlaska

I saw the cat in the bed grows an extra limb...

February 16, 2024 at 5:34 AM

krapp

Cats are weird sometimes.

February 16, 2024 at 5:38 AM

windowshopping

Huh, strong disagree. I've seen realistic CGI motion many times and I don't consider this to feel realistic at all.

February 16, 2024 at 12:45 PM

bamboozled

I’m a bit thrown off by the fact the mammoths are steaming, is that normal for mammoths ?

February 16, 2024 at 6:49 AM

throw310822

Good question :)

February 16, 2024 at 7:07 AM

colordrops

You might just be subject to confirmation bias here. Perhaps there were scenes and entities you didn't realize were CGI due to high quality animation, and thus didn't account for them in your assessment.

February 16, 2024 at 9:50 AM

lastdong

Regarding CGI, I think it has become so good that you don’t know it’s CGI. Look at the dog in Guardians of the Galaxy 3. There’s a whole series on YouTube called “no cgi is really just invisible cgi” that I recommend watching.

And as with CGI, models like Sora will get better until you can’t tell them apart from reality. It's not there yet, but it's an immense, astonishing breakthrough.

February 16, 2024 at 2:27 PM

kitd

Maybe it's my anthropocentric brain, but the animals move realistically while the people still look quite off.

It's still an unbelievable achievement though. I love the paper seahorse whose tail is made (realistically) using the paper folds.

February 16, 2024 at 5:45 PM

samstave

Serious: Can one just pipe in an SRT (subtitle file), tell it to compare its version to the mp4, and then command it to zoom, enhance, edit, and basically use it to remould content? I think this sounds great!

February 16, 2024 at 12:32 PM

geor9e

It's possible that through sheer volume of training, the neural network essentially has a 3D engine going on, or at least picked up enough of the rules of light and shape and physics to look the same as unreal or unity

February 16, 2024 at 8:45 AM

samsullivan

It would have to in order to produce the outputs, our brains have crazy physics engines though, F1 drivers can simulate an entire race in their heads.

February 16, 2024 at 12:04 PM

staticautomatic

I wonder if they could theoretically race multiple people at once like chess masters.

February 16, 2024 at 10:50 PM

kaba0

> I've quite simply never seen convincing computer-generated motion before

I’m fairly sure you have seen it many times, it was just so convincing that you didn’t realize it was CGI. It’s a fundamentally biased way to sample it, as you won’t see examples of well executed stuff.

February 19, 2024 at 4:34 PM

djmips

I'm not sure I feel the same way about the mammoths - and the billowing snow makes no sense as someone who grew up in a snowy area. If the snow was powder maybe but that's not what's depicted on the ground.

February 16, 2024 at 5:43 AM

isthispermanent

Pixar is computer generated motion, no?

February 16, 2024 at 4:35 AM

viewtransform

Main Pixar characters are all computer animated by humans. Physics effects like water, hair, clothing, smoke and background crowds use computer physics simulation, but there are handles allowing an animator to direct the motion as per the director's wishes.

February 16, 2024 at 4:52 AM

minimaxir

With extreme amounts of man-hours to do so.

February 16, 2024 at 4:41 AM

globular-toast

Nah this still has the problem with connecting surfaces that never seems to look right in any CGI. It's actually interesting that it doesn't look right here as well considering they are completely different techniques.

February 16, 2024 at 4:44 PM

swamp40

It's been trained on videos exclusively. Then GPT-4 interprets your prompt for it.

February 16, 2024 at 3:40 AM

belter

Just set up a family password last week... Now it seems every member of the family will have to become their own certificate authority and carry an MFA device.

"Worried About AI Voice Clone Scams? Create a Family Password" - https://www.eff.org/deeplinks/2024/01/worried-about-ai-voice...

February 16, 2024 at 6:11 PM

unsigner

Don't think of them as "computer-generated" any more than your phone's heavily processed pictures are "computer-generated", or JWST's false color, IR-to-visible pictures are "computer-generated".

This article makes a convincing argument: https://studio.ribbonfarm.com/p/a-camera-not-an-engine

February 16, 2024 at 4:10 PM

lynguist

That is such a gem of an article that looks at AI with a new lens I haven’t encountered before:

- AI sees and doesn’t generate

- It is dual to economics that pretends to describe but actually generates

February 16, 2024 at 5:20 PM

sebastiennight

I think the implications go much further than just the image/video considerations.

This model shows a very good (albeit not perfect) understanding of the physics of objects and relationships between them. The announcement mentions this several times.

The OpenAI blog post lists "Archeologists discover a generic plastic chair in the desert, excavating and dusting it with great care." as one of the "failed" cases. But this (and "Reflections in the window of a train traveling through the Tokyo suburbs.") seem to me to be 2 of the most important examples.

- In the Tokyo one, the model is smart enough to figure out that on a train, the reflection would be of a passenger, and the passenger has Asian traits since this is Tokyo.

- In the chair one, OpenAI says the model failed to model the physics of the object (which hints that it did try to, which is not how the early diffusion models worked; they just tried to generate "plausible" images). And we can see one of the archeologists basically chasing the chair down to grab it, which does correctly model the interaction with a floating object.

I think we can't overestimate how crucial that is to the building of a general model that has a strong model of the world. Not just a "theory of mind", but a literal understanding of "what will happen next", independently of "what would a human say would happen next" (which is what the usual text-based models seem to do).

This is going to be much more important, IMO, than the video aspect.

February 16, 2024 at 4:49 AM

bamboozled

Wouldn't having a good understanding of physics mean you know that a woman doesn't slide down the road when she walks? Wouldn't it know that a woolly mammoth doesn't emit profuse amounts of steam when walking on frozen snow? Wouldn't the model know that legs are solid objects through which other objects cannot pass?

Maybe I'm missing the big picture here, but the above and all the weird spatial errors, like miniaturization of people make me think you're wrong.

Clearly the model is an achievement and doing something interesting to produce these videos, and they are pretty cool, but understanding physics seems like quite a stretch?

I also don't really get the excitement about the girl on the train in Tokyo:

> In the Tokyo one, the model is smart enough to figure out that on a train, the reflection would be of a passenger, and the passenger has Asian traits since this is Tokyo

I don't know a lot about how this model works, but I'm guessing that in the training data the vast majority of people riding trains in Tokyo were Asian. Assuming this model works on statistics like all of the other models I've seen recently from OpenAI, why is it interesting that the girl in the reflection was Asian? Did you not expect that?

February 16, 2024 at 2:05 PM

csomar

> Wouldn't having a good understanding of physics mean you know that a woman doesn't slide down the road when she walks? Wouldn't it know that a woolly mammoth doesn't emit profuse amounts of steam when walking on frozen snow? Wouldn't the model know that legs are solid objects through which other objects cannot pass?

This just hit me but humans do not have a good understanding of physics; or maybe most of humans have no understanding of physics. We just observe and recognize whether it's familiar or not.

AI will need to be, that being the case, way more powerful than a human mind. Maybe orders of magnitude more "neural networks" than a human brain has.

February 16, 2024 at 4:35 PM

bamboozled

Well we feel the world, it's pretty wild when you think about how much data the body must be receiving and processing constantly.

I was watching my child in the bath the other day. They were having the most incredible time splashing, feeling the water, throwing balls up and down, and yes, they have absolutely no knowledge of "physics", yet they were navigating and interacting with it as if it was the best thing they've ever done. Not even 12 months old yet.

It was all just happening on feel and yeah, I doubt they could describe how to generate a movie.

February 16, 2024 at 6:16 PM

ehnto

Operating a human takes an incredible intuition of physics; just because you can't write or explain the math doesn't mean your mind doesn't understand it. Further to that, we are able to apply our patterns of physics to novel external situations on the fly, sometimes within milliseconds of encountering the situation.

You only need to see a ball bounce once and your brain has done some rough approximations of its properties and will calc both where it's going and how to get your gangly menagerie of pivots, levers, meat servos and sockets to intercept it at just the right time.

Think also about how well people can come to understand the physics of cars and bikes in motorsport and the like. The internal model of a car's suspension in operation is non-trivial but people can put it in their head.

February 16, 2024 at 10:56 PM

nox101

Humans have an intuitive understanding of physics, not a mathy science one.

I know I can't put my hand through solid objects. I know that if I drop my laptop from chest height it will likely break it, the display will crack or shatter, the case will get a dent. If it hits my foot it will hurt. Depending on the angle it may break a bone. It may even draw blood. All of that is from my intuitive knowledge of physics. No book smarts needed.

February 17, 2024 at 2:15 AM

pera

I agree, to me the most clear example is how the rocks in the sea vanish/transform after the wave: The generated frames are hyperreal for sure, but the represented space looks as consistent as a dream.

February 16, 2024 at 4:33 PM

pests

They could test this by trying to generate the same image but set in New York, etc. I bet it would still be Asian.

February 16, 2024 at 3:00 PM

barfingclouds

Give it a year

February 17, 2024 at 2:28 AM

bamboozled

Ok bro

February 17, 2024 at 2:03 PM

livshitz

The answer could be in between. Who said delusion models are limited to 2d pixel generations?

February 17, 2024 at 7:45 AM

bamboozled

Did you mean diffusion ?

February 17, 2024 at 2:03 PM

RhysU

> very good... understanding of the physics of objects and relationships between them

I am always torn here. A real physics engine has a better "understanding" but I suspect that word applies to neither Sora nor a physics engine: https://www.wikipedia.org/wiki/Chinese_room

An understanding of physics would entail asking this generative network to invert gravity, change the density or energy output of something, or atypically reduce a coefficient of friction partway through a video. Perhaps Sora can handle these, but I suspect it is mimicking the usual world rather than understanding physics in any strong sense.

None of which is to say their accomplishment isn't impressive. Only that "understand" merits particularly careful use these days.

February 16, 2024 at 6:45 AM

mewpmewp2

Question is - how much do you need to understand something in order to mimic it?

The Chinese Room, however, seems to point to some sort of prewritten if-else type of algorithm. E.g. someone following scripted algorithmic procedures might not understand the content, but obviously this simplification is not the case with LLMs or this video generation, as the algorithmic scripting requires pre-written scripts.

The Chinese Room seems to refer more to cases like "if someone tells me 'xyz', then respond with 'abc'". Of course you then don't understand what xyz or abc mean, but it's not referring to neural networks training on tons of material to build a model representation of things.

February 16, 2024 at 7:49 AM

RhysU

Good points.

Perhaps building the representation is building understanding. But humans did that for Sora and for all the other architectures too (if you'll allow a little meta-building).

But evaluation alone is not understanding. Evaluation is merely following a rote sequence of operations, just like the physics engine or the Chinese room.

People recognize this distinction all the time when kids memorize mathematical steps in elementary school but they do not yet know which specific steps to apply for a particular problem. This kid does not yet understand because this kid guesses. Sora just happens to guess with an incredibly complicated set of steps.

(I guess.)

February 16, 2024 at 10:05 AM

ketzo

I think this is a good insight. But if the kid gets sufficiently good at guessing, does it matter anymore..?

I mean, at this point the question is so vague… maybe it’s kinda silly. But I do think that there’s some point of “good-at-guessing” that makes an LLM just as valuable as humans for most things, honestly.

February 16, 2024 at 12:44 PM

RhysU

Agreed.

For low-stakes interpolation, give me the guesser.

For high-stakes interpolation or any extrapolation, I want someone who does not guess (any more than is inherent to extrapolating).

February 16, 2024 at 10:15 PM

jedharris

That matches how philosophers typically talk about the Chinese room. However the Chinese room is supposed to "behave as if it understands Chinese" and can engage in a conversation (let us assume via text). To do this the room must "remember" previously mentioned facts, people, etc. Furthermore it must line up ambiguous references correctly (both in reading and writing).

As we now know from more than 60 years of good old fashioned AI efforts, plus recent learning based AI, this CAN be done using computers but CANNOT be done using just ordinary if - then - else type rules no matter how complicated. Searle wrote before we had any systems that could actually (behave as if they) understood language and could converse like humans, so he can be forgiven for failing to understand this.

Now that we do know how to build these systems, we can still imagine a Chinese room. The little guy in the room will still be "following pre-written scripted algorithmic procedures." He'll have archives of billions of weights for his "dictionary". He will have to translate each character he "reads" into one or more vectors of hundreds or thousands of numbers, perform billions of matrix multiplies on the results, and translate the output of the calculations -- more vectors -- into characters to reply. (We may come up with something better, but the brain can clearly do something very much like this.)

Of course this will take the guy hundreds or thousands of years from "reading" some Chinese to "writing" a reply. Realistically if we use error correcting codes to handle his inevitable mistakes that will increase the time greatly.

Implication: Once we expand our image of the Chinese room enough to actually fulfill Searle's requirements, I can no longer imagine the actual system concretely, and I'm not convinced that the ROOM ITSELF "doesn't have a mind" that somehow emerges from the interaction of all these vectors and weights.

Too bad Searle is dead, I'd love to have his reply to this.

February 17, 2024 at 5:40 AM

seydor

Facebook released something in that direction today https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-jo...

February 16, 2024 at 4:55 AM

sebastiennight

Wow this is a huge announcement too, I can't believe this hasn't made the front page yet.

February 16, 2024 at 5:52 AM

gspetr

This seems to be completely in line with the previous "AI is good when it's not news" type of work:

Non-news: Dog bites a man.

News: Man bites a dog.

Non-news: "People riding Tokyo train" - completely ordinary, tons of similar content.

News: "Archaeologists dust off a plastic chair" - bizarre, (virtually) no similar content exists.

February 16, 2024 at 5:48 AM

sva_

I found the one about the people in Lagos pretty funny. The camera does about a 360deg spin in total, in the beginning there are markets, then suddenly there are skyscrapers in the background. So there's only very limited object permanence.

> A beautiful homemade video showing the people of Lagos, Nigeria in the year 2056. Shot with a mobile phone camera.

> https://cdn.openai.com/sora/videos/lagos.mp4

February 16, 2024 at 8:11 AM

bamboozled

Also the woman in red next to the people is very tiny and the market stall is also a mini market stall, and the table is made out of a bike.

For everyone that's carrying on about this thing understanding physics and has a model of the world...it's an odd world.

February 16, 2024 at 2:01 PM

lostemptations5

The thing is -- over time I'm not sure people will care. People will adapt to these kinds of strange things and normalize them -- as long as they are compelling visually. The thing about that scene is it looks weird only if you think about it. Otherwise it seems like the sort of pan you would see in some 30 second commercial for coffee or something.

If anything it tells a story: going from market, to people talking as friends, to the giant world (of Lagos).

February 16, 2024 at 3:57 PM

bamboozled

I'm not so sure.

My instagram feed is full of AI people, I can tell with pretty good accuracy when the image is "AI" or real, the lighting and just the framing and the scene itself, just something is off.

I think a similar thing will happen here, over the next few months we'll adapt to these videos and the problems will become very obvious.

When I first looked at the videos I was quite impressed, but I looked again and I saw a bunch of weird stuff going on. I think our brains are just wired to save energy, and accepting whatever we see in a video or an image as being good enough is a pretty efficient / low-risk thing.

February 16, 2024 at 6:19 PM

ehnto

Agreed, at first glance of the woman walking I was so focused on how well they were animating that the surreal scene went unnoticed. Once I'd stopped noticing the surreal scene, I started picking up on weird motion in the walk too.

Where I think this will get used a lot is in advertising. Short videos, lots going on, see it once and it's gone, no time to inspect. Lady laughing with salad pans to a beach scene, here's a product, buy and be as happy as salad lady.

February 16, 2024 at 11:09 PM

tuyiown

This will be classified unconsciously as cheap and uninteresting by the brain real quick. It'll have its place in the tides of cheap content, but if overall quality was to be overlooked that easily, producers would never have increased production budget that much, ever, just for the sake of it.

February 16, 2024 at 7:21 PM

po

In the video of the girl walking down the Tokyo city street, she's wearing a leather jacket. After the closeup on her face they pull back and the leather jacket has hilariously large lapels that weren't there before.

February 16, 2024 at 11:15 AM

ketzo

Object permanence (just from images/video) seems like a particularly hard problem for a super-smart prediction engine. Is it the old thing, or a new thing?

February 16, 2024 at 12:51 PM

vingt_regards

There are also perspective issues: the relative sizes of the foreground (the people sitting at the café) and the background (the market) are incoherent. Same with the "snowy Tokyo with cherry blossoms" video.

February 16, 2024 at 10:03 AM

lostemptations5

I'm not sure what your point is here, though -- outside of America -- in Asia and Africa -- these sorts of markets mixed in with skyscrapers are perfectly normal. There is nothing unusual about it.

February 16, 2024 at 3:56 PM

PoignardAzur

Yeah, some of the continuity errors in that one feel horrifying.

February 16, 2024 at 8:46 AM

cruffle_duffle

> then suddenly there are skyscrapers in the background. So there's only very limited object permanence.

Ah but you see that is artistic liberty. The director wanted it shot that way.

February 17, 2024 at 12:19 AM

XCSme

It doesn't understand physics.

It just computes next frame based on current one and what it learned before, it's a plausible continuation.

In the same way that ChatGPT struggles with math without a code interpreter, Sora won't have accurate physics without a physics engine and rendering of 3D objects.

Now it's just a "what is the next frame of this 2D image" model plus some textual context.

February 16, 2024 at 8:42 AM

yberreby

> It just computes next frame based on current one and what it learned before, it's a plausible continuation.

...

> Now it's just a "what is the next frame of this 2D image" model plus some textual context.

This is incorrect. Sora is not an autoregressive model like GPT, but a diffusion transformer. From the technical report[1], it is clear that it predicts the entire sequence of spatiotemporal patches at once.

[1]: https://openai.com/research/video-generation-models-as-world...
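
To make the distinction concrete, here is a minimal toy sketch. It is not Sora's code: single numbers stand in for frames/patches, and `toy_denoiser` is a hypothetical stand-in for a learned model. The only point is the control flow: an autoregressive loop emits one frame at a time from earlier frames, while a diffusion-style loop starts from noise for the whole clip and refines every frame together.

```python
import random

def autoregressive_generate(first_frame, num_frames, step):
    """Emit frames one at a time; each new frame depends only on earlier ones."""
    frames = [first_frame]
    for _ in range(num_frames - 1):
        frames.append(step(frames[-1]))
    return frames

def joint_denoise(num_frames, denoise, num_steps, seed=0):
    """Start from noise for ALL frames, then refine the whole clip together."""
    rng = random.Random(seed)
    frames = [rng.gauss(0.0, 1.0) for _ in range(num_frames)]
    for _ in range(num_steps):
        # Each pass sees (and updates) every frame of the clip at once.
        frames = denoise(frames)
    return frames

def toy_denoiser(frames):
    """Hypothetical stand-in for a learned model: pull frames toward the clip mean."""
    mean = sum(frames) / len(frames)
    return [0.5 * f + 0.5 * mean for f in frames]

if __name__ == "__main__":
    print(autoregressive_generate(0, 4, lambda f: f + 1))
    print(joint_denoise(5, toy_denoiser, num_steps=10))
```

In the second loop there is no "next frame" at any point; the first and last frames are refined in the same pass.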

February 16, 2024 at 12:23 PM

XCSme

Good link.

But, even there it says:

> Sora currently exhibits numerous limitations as a simulator. For example, it does not accurately model the physics of many basic interactions, like glass shattering. Other interactions, like eating food, do not always yield correct changes in object states

Regardless whether all the frames are generated at once, or one by one, you can see in their examples it's still just pixel based. See the first example with the dog with blue hat, the woman has a blue thing suddenly spawn into her hand because her hand went over another blue area of the image.

February 16, 2024 at 6:53 PM

yberreby

I'm not denying that there are obvious limitations. However, attributing them to being "pixel-based" seems misguided. First off, the model acts in latent space, not directly on pixels. Secondly, there is no fundamental limitation here. The model has already acquired limited-yet-impressive ability to understand movement, texture, social behavior, etc., just from watching videos.

I learned to understand reality by interpreting photons and various sensory inputs. Does that make my model of reality fundamentally flawed? In the sense that I only have a partial intuitive understanding of it, yes. But I don't need to know Maxwell's equations to get a sense of what happens when I open the blinds or turn on my phone.

I think many of the limitations we are seeing here - poor glass physics, flawed object permanence - will be overcome given enough training data and compute.

We will most likely need to incorporate exploration, but we can get really far with astute observation.

February 16, 2024 at 11:32 PM

sydd

Actually your comment gives me hope that we will never have AI singularity, since how the brain works is flawed, and were trying to copy it.

Heck a super AI might not even be possible, what if we're peak intelligence with our millions of years of evolution?

Just adding compute speed will not help much -- say the goal of an intelligence is to win a war. If you're tasked with it then it doesn't matter if you have a month or a decade (assume that time is.frozen while you do your research), its a too complex problem and simply cannot be solved, and the same goes for an AI.

Or it will be like with chess solvers, machines will be more intelligent than us simply because they can load much more context to solve a problem than us in their "working memory"

February 17, 2024 at 9:46 PM

yberreby

> Actually your comment gives me hope that we will never have AI singularity, since how the brain works is flawed, and were trying to copy it.

As someone working in the field, the vast majority of AI research isn't concerned with copying the brain, simply with building solutions that work better than what came before. Biomimetism is actually quite limited in practice.

The idea of observing the world in motion in order to internalize some of its properties is a very general one. There are countless ways to concretize it; child development is but one of them.

> If you're tasked with it then it doesn't matter if you have a month or a decade (assume that time is.frozen while you do your research), its a too complex problem and simply cannot be solved, and the same goes for an AI.

I highly disagree.

Let's assume a superintelligent AI can break down a problem into subproblems recursively, find patterns and loopholes in absurd amounts of data, run simulations of the potential consequences of its actions while estimating the likelihood of various scenarios, and do so much faster than humans ever could.

To take your example of winning a war, the task is clearly not unsolvable. In some capacity, military commanders are tasked with it on a regular basis (with varying degrees of success).

With the capabilities described above, why couldn't the AI find and exploit weaknesses in the enemy's key infrastructure (digital and real-world) and people? Why couldn't it strategically sow dissent, confuse, corrupt, and efficiently acquire intelligence to update its model of the situation minute-by-minute?

I don't think it's reasonable to think of a would-be superintelligence as an oracle that gives you perfect solutions. It will still be bound by the constraints of reality, but it might be able to work within them with incredible efficiency.

February 20, 2024 at 11:59 AM

XCSme

This is an excellent comparison and I agree with you.

Unfortunately we are flawed. We do know how physics work intuitively and can somewhat predict them, but not perfectly. We can imagine how a ball will move, but the image is blurry and trajectory only partially correct. This is why we invented math and physics studies, to be able to accurately calculate, predict and reproduce those events.

We are far off from creating something as efficient as the human brain. It will take insane amounts of compute power to simply match our basic inaccurate brains; imagine how much will be needed to create something that is factually accurate.

February 17, 2024 at 12:11 AM

yberreby

Indeed. But a point that is often omitted from comparisons with organic brains is how much "compute equivalent" we spent through evolution. The brain is not a blank slate; it has clear prior structure that is genetically encoded. You can see this as a form of pretraining through a RL process wherein reward ~= surviving and procreating. If you see things this way, data-efficiency comparisons are more appropriate in the context of learning a new task or piece of information, and foundation models tend to do this quite well.

Additionally, most of the energy cost comes from pretraining, but once we have the resulting weights, downstream fine-tuning or inference are comparatively quite cheap. So even if the energy cost is high, it may be worth it if we get powerful generalist models that we can specialize in many different ways.

> This is why we invented math and physics studies, to be able to accurately calculate, predict and reproduce those events.

We won't do away with those, but an intuitive understanding of the world can go a long way towards knowing when and how to use precise quantitative methods.

February 17, 2024 at 1:03 AM

og_kalu

GPT-4 doesn't "struggle with math". It does fine. Most humans aren't any better.

Sora is not autoregressive anyway, but there's nothing "just" about next frame/token prediction.

February 16, 2024 at 1:35 PM

8n4vidtmkvmk

It absolutely struggles with math. It's not solving anything. It sometimes gets the answer right only because it's seen the question before. It's rote memorization at best.

February 16, 2024 at 2:20 PM

og_kalu

No it doesn't. I know because I've actually used the thing and you clearly haven't.

And if Terence Tao finds some use for GPT-4 as well as Khan Academy employing it as a Math tutor then I don't think I have some wild opinion either.

Now, math isn't just arithmetic, but do you know how easy it is to go outside the training data for, say, arithmetic?

February 16, 2024 at 2:57 PM

peebeebee

Yesterday, it failed to give me the correct answer to 4 + 2 / 2. It said 3...

February 16, 2024 at 6:55 PM

isaacfrond

Just tried in ChatGPT-4. It gives the correct output (5), along with a short explanation of the order of operations (which you probably need to know, if you're asking the question).
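
For reference, the same expression under the usual convention, checked in Python (a quick sketch; the variable names are just for illustration):

```python
# Python follows the usual convention: division binds tighter than addition.
conventional = 4 + 2 / 2      # parsed as 4 + (2 / 2)
add_first = (4 + 2) / 2       # what you get if the addition is done first

print(conventional)  # 5.0
print(add_first)     # 3.0
```

So an answer of 3 corresponds to reading the expression strictly left to right.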

February 16, 2024 at 8:50 PM

pizzafeelsright

Correct based upon whom? If someone of authority asks the question and receives a detailed response back that is plausible but not necessarily correct, and that version of authority says the answer is actually three, how would you disagree?

In order to combat Authority you need to appeal to a higher authority, and that has been lost. One follows AI. Another follows Old Men from long ago whose words populated the AI.

February 16, 2024 at 9:44 PM

XCSme

The TV show American Gods becoming reality...

February 17, 2024 at 12:13 AM

xanderlewis

We shouldn't necessarily regard 5 as the correct output. Sure, almost all of us choose to make division higher precedence than addition, but there's no reason that has to be the case. I think a truly intelligent system would reply with 5 (which follows the usual convention, and would therefore mimic the standard human response), but immediately ask if perhaps you had intended a different order of operations (or even other meanings for the symbols), and suggest other possibilities and mention the fact that your question could be considered not well-defined...which is basically what it did.
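
As a rough illustration that precedence is a convention and not a law, here is a small sketch of an evaluator where the precedence table is a parameter. The names `evaluate`, `STANDARD` and `FLAT` are made up for this example, and it only handles flat lists of numbers and left-associative binary operators.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# Two possible conventions: the usual one, and "everything left to right".
STANDARD = {"+": 1, "-": 1, "*": 2, "/": 2}
FLAT = {"+": 1, "-": 1, "*": 1, "/": 1}

def evaluate(tokens, precedence):
    """Evaluate e.g. [4, '+', 2, '/', 2] by precedence climbing."""
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        # Keep consuming operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and precedence[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            right = parse(precedence[op] + 1)  # +1 makes operators left-associative
            left = OPS[op](left, right)
        return left

    return parse(0)

if __name__ == "__main__":
    expr = [4, "+", 2, "/", 2]
    print(evaluate(expr, STANDARD))  # 5.0
    print(evaluate(expr, FLAT))      # 3.0
```

Same symbols, same evaluator, two different answers depending only on the table you pass in.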

February 16, 2024 at 9:02 PM

xanderlewis

I guess you might think 'math' means arithmetic. It definitely does struggle with mathematical reasoning, and I can tell you that because I and many others have tried it.

Mind you, it's not brilliant at arithmetic either...

February 16, 2024 at 8:55 PM

og_kalu

I'm not talking about Arithmetic

February 16, 2024 at 9:48 PM
February 16, 2024 at 1:54 PM

timdiggerm

> In the Tokyo one, the model is smart enough to figure out that on a train, the reflection would be of a passenger, and the passenger has Asian traits since this is Tokyo.

How is this any more accurate than saying that the model has mostly seen Asian people in footage of Tokyo, and thus it is most likely to generate Asian-features for a video labelled "Tokyo"? Similarly, how many videos looking out a train window do you think it's seen where there was not a reflection of a person in the window when it's dark?

February 17, 2024 at 12:09 AM

shostack

I'm hoping to see progress towards consistent characters, objects, scenes etc. So much of what I'd want to do creatively hinges on needing persisting characters who don't change appearance/clothing/accessories from usage to usage. Or creating a "set" for a scene to take place in repeatedly.

I know with stable diffusion there's things like lora and controlnet, but they are clunky. We still seem to have a long way to go towards scene and story composition.

Once we do, it will be a game changer for redefining how we think about things like movies and television when you can effectively have them created on demand.

February 16, 2024 at 9:17 AM

treesciencebot

This is leaps and bounds beyond anything out there, including both public models like SVD 1.1 and Pika Labs' / Runway's models. Incredible.

February 16, 2024 at 2:25 AM

drdaeman

Let's hold our breath. Those are specifically crafted, hand-picked good videos, where the only requirement was "write a generic prompt and pick something that looks good". Which is very different from the actual process where you have a very specific idea and want the machine to make it happen.

DALL-E presentation also looked cool and everyone was stoked about it. Now that we know of its limitations and oddities? YMMV, but I'd say not so much - Stable Diffusion is still the go-to solution. I strongly suspect the same thing with Sora.

February 16, 2024 at 2:53 AM

treesciencebot

The examples are most certainly cherry-picked. But the problem is there are 50 of them. And even if you gave me 24 hour full access to SVD1.1/Pika/Runway (anything out there that I can use), I won't be able to get 5 examples that match these in quality (~temporal consistency/motions/prompt following) and more importantly in the length. Maybe I am overly optimistic, but this seems too good.

February 16, 2024 at 2:58 AM

xnx

Credit to OpenAI for including some videos with failures (extra limbs, etc.). I also wonder how closely any of these videos might match one from the training set. Maybe they chose prompts that lined up pretty closely with a few videos that were already in there.

February 17, 2024 at 1:41 AM

htrp

https://twitter.com/sama/status/1758200420344955288

They're literally taking requests and doing them in 15 minutes.

February 16, 2024 at 2:59 AM

drdaeman

Cool, but see the drastic difference in quality ;)

February 16, 2024 at 3:06 AM

golol

Lack of quality in the details yes but the fact that characters and scenes depict consistent and real movement and evolution as opposed to the cinemagraph and frame morphing stuff we have had so far is still remarkable!

February 16, 2024 at 3:43 AM

zamadatix

That particular example seems to have more of a "cheap 3d" style to it, but the actual synthesis seems on par with the examples. If the prompt had specified a different style it'd have that style instead. This kind of generation isn't like actual animating; "cheap 3d" style and "realistic cinematic" style take roughly the same amount of work to look right.

February 16, 2024 at 7:53 AM

gigglesupstairs

Drastic difference in quality of the prompts too. Ones used in the OP are quite detailed ones mostly.

February 16, 2024 at 5:33 AM

ShamelessC

There are absolutely example videos on their website which have worse quality than that.

February 16, 2024 at 4:09 AM

karmasimida

It has a comedy like quality lol

But all to be said, it is no less impressive after this new demo

February 16, 2024 at 4:36 AM

z7

Depends on the quality of the prompts.

February 16, 2024 at 3:35 AM

minimaxir

The output speed doesn't disprove possible cherry-picking, especially with batch generation.
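
A minimal sketch of what batch generation plus picking looks like, assuming a hypothetical `generate` function (one sample per call) and a hypothetical `score` function (a human or automatic quality rating). Nothing here reflects how OpenAI actually produced the demo clips.

```python
import random
from typing import Callable, TypeVar

T = TypeVar("T")

def best_of_n(generate: Callable[[random.Random], T],
              score: Callable[[T], float],
              n: int, seed: int = 0) -> T:
    """Generate n candidates and return the one with the highest score."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)

if __name__ == "__main__":
    # Toy stand-ins: a "video" is just a random quality number in [0, 1).
    single = best_of_n(lambda rng: rng.random(), lambda v: v, n=1)
    picked = best_of_n(lambda rng: rng.random(), lambda v: v, n=16)
    print(single, picked)
```

If the batch runs in parallel, the wall-clock time is about the same as for one sample, which is why a 15-minute turnaround says little about how many candidates were discarded.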

February 16, 2024 at 3:06 AM

efrank3

Who cares? If it can be generated in 15 minutes then it's commercially useful.

February 16, 2024 at 5:45 AM

lostemptations5

Especially if you consider that afterwards you can get feedback and try again... 15 minutes later you have a new one... try again... etc.

February 16, 2024 at 4:12 PM

djoletina

What is your point? That they make multiple ones and pick out the best ones? Well duh? That’s literally how the model is going to be used.

February 16, 2024 at 3:12 AM

dang

Please make your substantive points without swipes. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

February 16, 2024 at 4:38 AM

raydev

OpenAI people running these prompts have access to way more resources than any of us will through the API.

February 16, 2024 at 4:42 AM

timdiggerm

Looks ready for _Wishbone_

February 16, 2024 at 3:16 AM

999900000999

The year is 2030.

Sarah is a video sorter, this was her life. She graduated top of her class in film, and all she could find was the monotonous job of selecting videos that looked just real enough.

Until one day, she couldn't believe it. It was her. A video of her in that very moment sorting. She went to pause the video, but stopped when her doppelganger did the same.

February 16, 2024 at 7:37 AM

esafak

https://en.wikipedia.org/wiki/Joan_Is_Awful

February 16, 2024 at 1:04 PM

Zondartul

I got reminded of an even older sci-fi story: https://qntm.org/responsibility

February 16, 2024 at 2:46 PM

sdombi

Man i was looking for this story for a year or so... thanks for sharing

February 19, 2024 at 5:45 PM

turnsout

Seems like in about two years I’ll be able to stuff this saved comment into a model and generate this full episode of Black Mirror

February 16, 2024 at 7:45 AM

dragonwriter

> Stable Diffusion is still the go-to solution. I strongly suspect the same thing with Sora.

Sure, for people who want detailed control with AI-generated video, workflows built around SD + AnimateDiff, Stable Video Diffusion, MotionDiff, etc., are still going to beat Sora for the immediate future, and OpenAI's approach structurally isn't as friendly to developing a broad ecosystem adding power on top of the base models.

OTOH, the basic simple prompt-to-video capacity of Sora now is good enough for some uses, and where detailed control is not essential that space is going to keep expanding -- one question is how much their plans for safety checking (which they state will apply both to the prompt and every frame of output) will cripple this versus alternatives, and how much the regulatory environment will or won't make it possible to compete with that.

February 16, 2024 at 3:36 AM

theLiminator

I suspect given equal effort into prompting both, Sora probably provides superior results.

February 16, 2024 at 4:42 AM

dragonwriter

> I suspect given equal effort into prompting both, Sora probably provides superior results

Strictly to prompting, probably, just as that is the case with Dall-E 3 vs, say, SDXL.

The thing is, there’s a lot more that you can do than just tweaking prompting with open models, compared to hosted models that offer limited interaction options.

February 16, 2024 at 6:11 AM

karmasimida

Generate stock video bits I think.

February 16, 2024 at 4:37 AM

og_kalu

It doesn't matter if they're cherrypicked when you can't match this quality with SD or Pika regardless of how much time you had.

and i still prefer Dalle-3 to SD.

February 16, 2024 at 3:06 AM

sebzim4500

In the past the examples tweeted by OpenAI have been fairly representative of the actual capabilities of the model. i.e. maybe they do two or three generations and pick the best, but they aren't spending a huge amount of effort cherry-picking.

February 16, 2024 at 3:17 AM

ChildOfChaos

Stable Diffusion is not the go-to solution, it's still behind Midjourney and DALL-E

February 16, 2024 at 4:08 AM

educaysean

Would love to see handpicked videos from competitors that can hold their own against what SORA is capable of

February 16, 2024 at 4:42 AM

barfingclouds

Look at Sam Altman’s Twitter where he made videos on demand from what people prompted him

February 17, 2024 at 2:30 AM

schleck8

Wrong, this is the first time I've seen an astronaut with a knit cap.

February 16, 2024 at 5:37 AM

blibble

they're not fantastic either if you pay close attention

there are mini-people in the 2060s market and in the cat one an extra paw comes out of nowhere

February 16, 2024 at 3:53 AM

dartos

The woman’s legs move all weirdly too

February 16, 2024 at 5:07 AM

throwaway4233

While Sora might be able to generate short 60-90 second videos, how well it would scale with a larger prompt or a longer video remains yet to be seen. And the general logic of having the model do 90% of the work for you and then you edit what is required might be harder with videos.

February 16, 2024 at 5:02 AM

sebastiennight

60 seconds at a time is much more than enough.

Most fictional long-form video (whether live-action movies or cartoons, etc) is composed of many shots, most of them much shorter than 7 seconds, let alone 60.

I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.

You could already write a whole book in GPT-3 from running a series of one-short-chapter-at-a-time generations and passing the summary/outline of what's happened so far. (I know I did, in a time that feels like ages ago but was just early last year)

Why would this be different?
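
Roughly, the loop I mean looks like the sketch below. `generate` and `summarize` are hypothetical callables standing in for whatever model calls you use (they are not a real API), and the prompt wording is only an example.

```python
from typing import Callable, List

def write_book(outline: List[str],
               generate: Callable[[str], str],
               summarize: Callable[[str], str]) -> List[str]:
    """Generate one chapter per outline entry, carrying a running summary forward."""
    chapters: List[str] = []
    summary = ""
    for number, beat in enumerate(outline, start=1):
        prompt = (f"Story so far: {summary or '(nothing yet)'}\n"
                  f"Write chapter {number}: {beat}")
        chapter = generate(prompt)
        chapters.append(chapter)
        # Fold the new chapter into the summary so the next call has context.
        summary = summarize((summary + " " + chapter).strip())
    return chapters
```

For video, the "summary" would be the reference images of characters/places/objects instead of text, but the shape of the loop is the same.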

February 16, 2024 at 5:44 AM

throwaway4233

> I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.

I partly agree with this. The congruency, however, needs to extend to more than 2 generations. If a single scene is composed of multiple shots, then those multiple shots need to be part of the same world the scene is being shot in. If you check the video with the title `A beautiful homemade video showing the people of Lagos, Nigeria in the year 2056. Shot with a mobile phone camera.` the surroundings do not seem to make sense, as the view starts with a market, spirals around a point and then ends with a bridge which does not fit into the market. If the different shots generated by the model do not fit together seamlessly, trying to make them fit together is where the difficulty comes in. However, I do not have any experience in video editing, so it's just speculation.

February 16, 2024 at 6:14 AM

esafak

The CGI industry is about to be turned upside down. They charge hundreds of thousands per minute, and it takes them forever to produce the finished product.

February 16, 2024 at 1:08 PM

Solvency

You do realize virtually all movies are made up of shots often lasting no longer than 10 seconds. Edited together. Right.

February 16, 2024 at 6:04 AM

Der_Einzige

The best films have long takes. Children of Men or Stalker come to mind

February 16, 2024 at 9:12 PM

imbnwa

Copacabana tracking shot in Goodfellas

February 17, 2024 at 1:04 PM

davidbarker

I'm almost speechless. I've been keeping an eye on the text-to-video models, and if these example videos are truly indicative of the model, this is an order of magnitude better than anything currently available.

In particular, looking at the video titled "Borneo wildlife on the Kinabatangan River" (number 7 in the third group), the accurate parallax of the tree stood out to me. I'm so curious to learn how this is working.

[Direct link to the video: https://player.vimeo.com/video/913130937?h=469b1c8a45]

February 16, 2024 at 2:43 AM

calgoo

The video of the gold rush town just makes me think of what games like Red Dead and GTA could look like.

February 16, 2024 at 2:56 AM

93po

holy cow, is that the future of gaming? instead of 3D renders it's real-time video generation, complete with audio and music and dialog and intelligent AI conversations and it's a unique experience no one else has ever played. gameplay mechanics could even change on the fly

February 16, 2024 at 3:08 AM

joegibbs

I think for the near future we’ll see something like this:

https://youtube.com/watch?v=P1IcaBn3ej0

From a few years ago, where the game is rendered traditionally and used as a ground truth, with a model on top of it that enhances the graphics.

After maybe 10-15 years we will be past the point where the entire game can be generated without obvious mistakes in consistency.

Realtime AI dialogue is already possible but still a bit primitive, I wrote a blog post about it here: https://jgibbs.dev/blogs/local-llm-npcs-in-unreal-engine

February 16, 2024 at 5:46 AM

gdubs

That's why NVIDIA's CEO said recently that in the future every pixel will be generated — not rendered.

February 16, 2024 at 3:18 AM

Keyframe

five years ago: https://www.youtube.com/watch?v=ayPqjPekn7g I'm eager to see an updated version.

February 16, 2024 at 5:36 AM

7734128

DLSS is essentially this, isn't it? It uses a low quality render from the game and then increases the fidelity with something very similar to a diffusion model.

February 17, 2024 at 11:10 PM

yard2010

The answer is most definitely YES. Computer games, and of course, porn, the stuff the internet is made for.

February 16, 2024 at 5:12 AM

monlockandkey

Shove all the tech you mentioned into a VR headset and it is literally game over for humans

February 16, 2024 at 3:16 AM

rightbyte

You'd still get a headache after 20 minutes. No matter how addictive, it won't be bad until you can wear VR headsets for hours.

February 16, 2024 at 5:02 AM

theshackleford

Many people can. I can and have been since the DK1. I’ve done 12 hour plus stints in it.

February 16, 2024 at 12:34 PM

rightbyte

Really? My head hurts bad after 30 minutes and I feel uneasy after like 10-15.

The DK1 I could wear for like 1 minute before feeling sick, so they are getting better ...

I am prone to sea sickness. Maybe it is related.

February 16, 2024 at 11:16 PM

theshackleford

> Really?

Yeah, but I mean who knows why. I know some people can't, my GF is one of them.

I've often wondered if I'm OK with it because I'm used to the object-on-head stuff (like 25-odd years of motorcycle riding, ergo helmet wearing) and close-up, high-FOV-coverage, fast-paced gaming? (I play on a 32" maybe 70 cm from the eyes, give or take.)

> I am prone to sea sickness. Maybe it is related.

I'd think it might be given my understanding of why illness in many is triggered. It's odd because I never got sick from it, but i've seen others get INCREDIBLY ill in two different ways.

1. My GF tried to use simple locomotion in a game and almost vomited as an immediate reaction

2. A friend who was fine at first, but then randomly started getting very slowly ill over a matter of like an hour, just getting more and more nausea after the fact.

It's unfortunate, because due to the lack of bad feelings/nausea/discomfort etc., I love VR. Equally, from watching those around me, I can see no real path forward for it as it stands today because of those impacts and limitations.

That being said, maybe they get smaller, lighter, we learn to induce motion sickness less, I dunno. I'm not optimistic.

February 22, 2024 at 4:41 PM

arathis

lol horseshit

February 16, 2024 at 5:22 PM
February 16, 2024 at 7:35 PM

theshackleford

You’re projecting.

February 16, 2024 at 6:57 PM

krapp

Even otherwise, and no matter how good the screen and speakers are, a screen and speakers can only be so immersive. People oversell the potential for VR when they describe it as being as good as or better than reality. Nothing less than the Matrix is going to work in that regard.

February 16, 2024 at 5:38 AM

trafficante

Yep, once your brain gets over the immediate novelty of VR, it’s very difficult to get back that “Ready Player One” feeling due to the absence of sensory feedback.

If/once they get it working though, society will shift fast.

There’s an XR app called Brink Traveler that’s full of handcrafted photogrammetry recreations of scenic landmarks. On especially gloomy PNW winter days, I’ll lug a heat lamp to my kitchen and let it warm up the tiled stone a bit, put a floor fan on random oscillation, toss on some good headphones, load up a sunny desert location in VR, and just lounge on the warm stone floor for an hour.

My conscious brain “knows” this isn’t real and just visuals alone can’t fool it anymore, but after about 15 minutes of visuals + sensory input matching, it stops caring entirely. I’ve caught myself reflexively squinting at the virtual sun even though my headset doesn’t have HDR.

February 16, 2024 at 6:27 AM

Xirgil

Digital Westworld

February 16, 2024 at 4:04 AM

slowmovintarget

I'll take one holodeck, please.

February 17, 2024 at 1:52 AM

dartos

Sometimes, but for specific or unique art styles, statistical models like this may not work well.

For games like call of duty or other hyper realistic games it very likely will be.

February 16, 2024 at 5:08 AM

arvinsim

For games like 2D/3D fighting games where you don't need to generate a lot of terrain, the possibility of randomly generating stages with unique terrain and obstacles is interesting.

February 16, 2024 at 3:29 PM

dartos

That’s also true, but those stages would need to fit in a specific art style.

A large part of fighting games is the style.

The cost difference of just making bespoke art and tuning an AI system to generate it for you may not be worth it (at least right now.)

February 18, 2024 at 5:30 AM

notpachet

Lucid Dreaming as a Service.

See also: https://en.wikipedia.org/wiki/Vanilla_Sky

February 16, 2024 at 6:10 AM

QuadmasterXLII

The diffusion is almost certainly taking place over some sort of compressed latent, from the visual quirks of the output I suspect that the process of turning that latent into images goes latent -> nerf / splat -> image, not latent -> convolutional decoder -> image

February 16, 2024 at 5:25 AM
February 16, 2024 at 5:34 AM

Zelphyr

Agreed. It's amazing how much of a head start OpenAI appears to have over everyone else. Even Microsoft who has access to everything OpenAI is doing. Only Microsoft could be given the keys to the kingdom and still not figure out how to open any doors with them.

February 16, 2024 at 2:33 AM

Voloskaya

Microsoft doesn’t have access to OpenAI’s research, this was part of the deal. They only have access to the weights and inference code of production models and even then who has access to that inside MS is extremely gated and only a few employees have access to this based on absolute need to actually run the service.

AI researchers at MSFT barely have more insight into OpenAI than you do reading HN.

February 16, 2024 at 2:39 AM

toneyG

This is not true. Microsoft have a perpetual license to all of OpenAI's IP. If they really wanted to they could get their hands on it.

February 16, 2024 at 2:57 AM

93po

Yeah but what's in the license? It's not public so we have no way of knowing

February 16, 2024 at 3:06 AM

vitorgrs

No. They have early access. Example: MSFT was using Dall-e Exp (early 3 version) in PUBLIC, since February of 2023.

In the same month, they were also using GPT4 in public - before OpenAI.

And they had access to GPT4 in 2022 (which was when they decided to create Bing Chat, now called Copilot).

All the current GPT4 models at MSFT are also finetuned versions (literally Creative and Precise mode runs different finetuned versions of GPT4). It runs finetuned versions since launch even...

February 16, 2024 at 2:37 PM

Zelphyr

I didn't realize that. Thank you for the clarification.

February 16, 2024 at 2:40 AM

costcofries

I promise you this isn't true.

February 16, 2024 at 4:45 AM

Jensson

Microsoft said that they could continue OpenAI's research with no slowdown if OpenAI cut them off by hiring all OpenAI's people, so from that statement it sounds like they have access.

February 16, 2024 at 4:40 AM

pcbro141

Many people say the same about Google/DeepMind.

February 16, 2024 at 2:45 AM

DANI-HACKER

[flagged]

February 16, 2024 at 2:43 AM

SeanAnderson

Eh. MSFT owns 49% of OpenAI. Doesn't really seem like they need to do much except support them.

February 16, 2024 at 2:37 AM

Zelphyr

Except they keep trying to shove AI into everything they own. CoPilot Studio is an example of how laughably bad at it they are. I honestly don't understand why they don't contract out to OpenAI to help them do some of these integrations.

February 16, 2024 at 2:39 AM

SeanAnderson

Every company is trying to shove AI into everything they own. It's what investors currently demand.

OpenAI is likely limited by how fast they are able to scale their hiring. They had 778 FTEs when all the board drama occurred, up 100% YoY. Microsoft has 221,000. It seems difficult to delegate enough headcount to all the exploratory projects of MSFT and it's hard to scale headcount quicker while preserving some semblance of culture.

February 16, 2024 at 2:45 AM

frabcus

They don't own 49% of OpenAI. They have capped rights to 49% of OpenAI's profits.

February 16, 2024 at 4:31 AM

SeanAnderson

Apparently all the rumors weren't true then, my mistake.

I don't think what you're saying is correct though, either. All the early news outlets reported 49% ownership:

https://en.wikipedia.org/wiki/OpenAI#:~:text=Rumors%20of%20t...

https://www.theverge.com/2023/1/23/23567448/microsoft-openai...

https://www.reuters.com/world/uk/uk-antitrust-regulator-cons...

https://techcrunch.com/2023/01/23/microsoft-invests-billions...

The only official statement from Microsoft is "While details of our agreement remain confidential, it is important to note that Microsoft does not own any portion of OpenAI and is simply entitled to share of profit distributions," said company spokesman Frank Shaw.

No numbers, though.

Do you have a better source for numbers?

February 16, 2024 at 4:38 AM

sschueller

Yes, but I am stuck in their (American) view of what is considered appropriate. Not what is legal, but what they determine to be OK to produce.

Good luck generating anything similar to an 80s action movie. The violence and light nudity will prevent you from generating anything.

February 16, 2024 at 3:05 AM

Xirgil

I suspect it's less about being puritanical about violence and nudity in and of themselves, and more a blanket ban to make up for the inability to prevent the generation of actually controversial material (nude images of pop stars, violence against politicians, hate speech)

February 16, 2024 at 4:03 AM

SamBam

Put like that, it's a bit like the Chumra in Judaism [1]. The fence, or moat, around the law that extends even further than the law itself, to prevent you from accidentally committing a sin.

1. https://en.m.wikipedia.org/wiki/Chumra_(Judaism)

February 16, 2024 at 4:39 AM

UberFly

Na. It's more like what he said: Cover your ass legally for the real problems this could cause.

February 16, 2024 at 5:25 AM

wilg

No, it's America's fault.

February 16, 2024 at 4:37 AM

throwitaway222

I am guessing a movie studio will get different access with controls dropped. Of course, that does mean they need to be VERY careful when editing, and making sure not to release a vagina that appears for 1 or 2 frames when a woman is picking up a cat in some random scene.

February 16, 2024 at 4:40 AM

Fricken

We can't do narrative sequences with persistent characters and settings, even with static images.

These video clips are just generic stock clips. You can cut them together to make a sequence of random flashy whatever, but you still can't do storytelling in any conventional sense. We don't appear to be close to being able to use these tools for the hypothetical disruptive use case we worry about.

Nonetheless, the stock video and photo people are in trouble. So long as the details don't matter this stuff is presumably useful.

February 16, 2024 at 5:56 AM
February 16, 2024 at 12:56 PM

zamadatix

I wonder how much of it is really "concern for the children" type stuff vs not wanting to deal with fights on what should be allowed and how and to who right now. When film was new, towns and states started to make censorship review boards. When mature content became viewable on the web, battles (still ongoing) about how much you need to do to prevent minors from accessing it came up. Now useful AI generated content is the new thing and you can avoid this kind of distraction by going this route instead.

I'm not supporting it in any way, I think you should be able to generate and distribute any legal content with the tools, but just giving a possible motive for OpenAI being so conservative whenever it comes to ethics and what they are making.

February 16, 2024 at 8:01 AM

golergka

I've been watching 80s movies recently, and the amount of nudity and sex scenes often feels unnecessary. I'm definitely not a prude. I watch porn, I talk about sex with friends, I go to kinky parties sometimes. But it really feels that a lot of movies sacrificed stories to increase sex appeal, and now that people have free and unlimited access to porn, movies can finally be movies.

February 16, 2024 at 9:28 AM

TulliusCicero

It's not a particularly American attitude to be opposed to violence in media though, American media has plenty of violence.

They're trying to be all-around uncontroversial.

February 16, 2024 at 4:06 AM

jsheard

Where is the training material for this coming from? The only resource I can think of that's broad enough for a general purpose video model is YouTube, but I can't imagine Google would allow a third party to scrape all of YT without putting up a fight.

February 16, 2024 at 2:49 AM

Zetobal

It's movies; the shots are way too deliberate to have random YouTube crap in the dataset.

February 16, 2024 at 2:56 AM

cma

You can still have a broad dataset and use RLHF to steer it more towards the aesthetic like midjourney and SDXL did through discord feedback. I think there was still some aesthetic selection in the dataset as well but it still included a lot of crap.

February 16, 2024 at 4:55 AM

xnx

It's very good. Unclear how far ahead of Lumiere it is (https://lumiere-video.github.io/) or if it's more of a difference in prompting/settings.

February 16, 2024 at 2:53 AM

vunderba

The big standout to me beyond almost any other text-to-video solution is that the video duration is tremendously longer (minute+). Everything else that I've seen can't get beyond 15 to 20 seconds at the absolute maximum.

February 16, 2024 at 5:17 AM

ehsankia

In terms of following the prompt and generating visually interesting results, I think they're comparable. But the resolution for Sora seems so far ahead.

Worth noting that Google also has Phenaki [0] and VideoPoet [1] and Imagen Video [2]

[0] https://sites.research.google/phenaki/

[1] https://sites.research.google/videopoet/

[2] https://imagen.research.google/video/

February 16, 2024 at 2:59 AM

mizzao

Must be intimidating to be on the Pika team at the moment...

February 16, 2024 at 2:43 AM

alokjnv10

you nailed it

February 16, 2024 at 1:18 PM

rvz

All those startups have been squeezed in the middle. Pika, Runway, etc might as well open source their models.

Or Meta will do it for them.

February 16, 2024 at 5:25 AM

iLoveOncall

It is incredible indeed, but I remember there was a humongous gap between the demoed pictures for DALL-E and what most prompts would generate.

Don't get overly excited until you can actually use the technology.

February 16, 2024 at 2:58 AM
February 16, 2024 at 4:52 AM

JKCalhoun

I know it's Runway (and has all manner of those dream-like AI artifacts) but I like what this person is doing with just a bunch of 4 second clips and an awesome soundtrack:

https://youtu.be/JClloSKh_dk

https://youtu.be/upCyXbTWKvQ

February 16, 2024 at 5:01 AM

jasonjmcghee

I agree in terms of raw generation, but runway especially is creating fantastic tooling too.

February 16, 2024 at 4:05 AM

jug

Yup, it's been even several months! ;) But now we finally have another quantum leap in AI.

February 16, 2024 at 4:48 AM

Animats

The Hollywood Reporter says many in the industry are very scared.[1]

“I’ve heard a lot of people say they’re leaving film,” he says. “I’ve been thinking of where I can pivot to if I can’t make a living out of this anymore.” - a concept artist responsible for the look of the Hunger Games and some other films.

"A study surveying 300 leaders across Hollywood, issued in January, reported that three-fourths of respondents indicated that AI tools supported the elimination, reduction or consolidation of jobs at their companies. Over the next three years, it estimates that nearly 204,000 positions will be adversely affected."

"Commercial production may be among the main casualties of AI video tools as quality is considered less important than in film and TV production."

[1] https://www.hollywoodreporter.com/business/business-news/ope...

February 16, 2024 at 7:24 AM

snewman

Honest question: of what possible use could Sora be for Hollywood?

The results are amazing, but if the current crop of text-to-image tools is any guide, it will be easy to create things that look cool but essentially impossible to create something that meets detailed specific criteria. If you want your actor to look and behave consistently across multiple episodes of a series, if you want it to precisely follow a detailed script, if you want continuity, if you want characters and objects to exhibit consistent behavior over the long term – I don't see how Sora can do anything for you, and I wouldn't expect that to change for at least a few years.

(I am entirely open to the idea that other generative AI tools could have an impact on Hollywood. The linked Hollywood Reporter article states that "Visual effects and other postproduction work stands particularly vulnerable". I don't know much about that, I can easily believe it would be true, but I don't think they're talking about text-to-video tools like Sora.)

February 16, 2024 at 8:42 AM

Animats

I suspect that one of the first applications will be pre-viz. Before a big-budget movie is made, a cheap version is often made first. This is called "pre-visualization". These text to video applications will be ideal for that. Someone will take each scene in the script, write a big prompt describing the scene, and follow it with the dialog, maybe with some commands for camerawork and cuts. Instant movie. Not a very good one, but something you can show to the people who green-light things.

There are lots of pre-viz reels online. The ones for sequels are often quite good, because the CGI character models from the previous movies are available for re-use. Unreal Engine is often used.

February 16, 2024 at 9:33 AM

theshrike79

Especially when you can do this with still images on a normal M-series MacBook _today_, automating it would be pretty trivial.

Just feed it a script and get a bunch of pre-vis images for every scene.

When we get something like this running on hardware with an uncensored model, there's going to be a lot of redundancies but also a ton of new art that would've never happened otherwise.

February 17, 2024 at 5:54 AM

becquerel

This is a fascinating idea I'd never considered before.

February 16, 2024 at 7:40 PM
February 16, 2024 at 6:43 PM

QuadmasterXLII

People are extrapolating out ten years. They will still have to eat and pay rent in ten years.

February 16, 2024 at 10:10 AM

Karuma

It wouldn't be too hard to do any of the things you mention. See ControlNet for Stable Diffusion, and vid2vid (if this model does txt2vid, it can also do vid2vid very easily).

So you can just record some guiding stuff, similar to motion capture but with just any regular phone camera, and morph it into anything you want. You don't even need the camera, of course, a simple 3D animation without textures or lighting would suffice.

Also, consistent look has been solved very early on, once we had free models like Stable Diffusion.

February 16, 2024 at 8:54 AM

quickthrower2

Right now you’d need an artistic/ML mixed team. You wouldn’t use an off the shelf tool. There was a video of some guys doing this (sorry can’t find it) to make an anime type animation. With consistent characters. They used videos of themselves running through their own models to make the characters. So I reckon while prompt -> blockbuster is not here yet, a movie made using mostly AI is possible but it will cost a lot now but that cost will go down. While this is sad, it is also exciting. And scary. Black Mirror-like, we will start creating AIs we will have relationships with and bring people back to life (!) from history and maybe grieving people will do this. Not sure if that is healthy but people will do it once it is a click of a button thing.

February 16, 2024 at 5:48 PM

someperson

> There was a video of some guys doing this (sorry can’t find it) to make an anime type animation. With consistent characters. They used videos of themselves running through their own models to make the characters.

That was Corridor Crew: https://www.youtube.com/watch?v=_9LX9HSQkWo

February 17, 2024 at 1:02 AM

Qwero

It shows that good progress is still made.

Just this week sd audio model can make good audio effects like doors etc.

If this continues (and it seems it will) it will change the industry tremendously.

February 16, 2024 at 3:31 PM

kranke155

It won’t be Hollywood at first. It will be small social ads for TikTok, IG and social media. The brands likely won’t even care if they don’t get copyright at the end, since they have copyright of their product.

Source: I work in this.

February 16, 2024 at 7:32 PM

lesinski

Seconding this. There is also a huge SMB and commercial business that supports many agencies and production companies. This could replace a lot of that work.

February 17, 2024 at 5:02 AM

MauranKilom

The OpenAI announcement mentions being able to provide an image to start the video generation process from. That sounds to me like it will actually be incredibly easy to anchor the video generation to some consistent visual - unlike all the text-based stable diffusion so far. (Yes, there is img2img, but that is not crossing the boundary into a different medium like Sora).

February 16, 2024 at 5:09 PM

theptip

Probably a bad time to be an actor.

Amazing time to be a wannabe director or producer or similar creative visionary.

Bad time to be high up in a hierarchical/gatekeeping/capital-constrained biz like Hollywood.

Amazing time to be an aspirant that would otherwise not have access to resources, capital, tools in order to bring their ideas to fruition.

On balance I think the ‘20s are going to be a great decade for creativity and the arts.

February 16, 2024 at 7:33 AM

gwd

> Probably a bad time to be an actor.

I don't see why -- the distance between "here's something that looks almost like a photo, moving only a little bit like a mannequin" and "here's something that has the subtle facial expressions and voice to convey complex emotions" is pretty freaking huge; to the point where the vast majority of actual humans fail to be that good at it. At any rate, the number of BNNs (biological neural networks) competing with actors has only been growing, with 8 billion and counting.

> Amazing time to be a wannabe director or producer or similar creative visionary. Amazing time to be an aspirant that would otherwise not have access to resources, capital, tools in order to bring their ideas to fruition.

Perhaps if you mainly want to do things for your own edification. If you want to be able to make a living off it, you're suddenly going to be in a very, very flooded market.

February 16, 2024 at 11:33 AM

theptip

It’s for sure plausible that acting remains a viable profession.

The bull case would be something like ‘Ractives in “The Diamond Age” by Neal Stephenson; instead of video games people play at something like live plays with real human actors. In this world there is orders of magnitude more demand for acting.

Personally I think it’s more likely that we see AI cross the uncanny valley in a decade or two (at least for movies/TV/TikTok style content). But this is nothing more than a hunch; 55/45 confidence say.

> Perhaps if you mainly want to do things for your own edification.

My mental model is that most aspiring creatives fall in this category. You have to be doing quite well as an actor to make a living from it, and most who try do not.

February 17, 2024 at 6:59 AM

lowbloodsugar

> the distance between "here's something that looks almost like a photo, moving only a little bit like a mannequin" and "here's something that has the subtle facial expressions and voice to convey complex emotions" is pretty freaking huge;

The distance between pixelated noise and a single image is freaking huge.

The distance between a single image and a video of a consistent 3D world is freaking huge (albeit with rotating legs).

The distance between a video of a consistent 3D world and a full length movie of a consistent 3D world with subtle facial expressions is freaking huge.

So... next 12 months then.

>If you want to be able to make a living off it, you're suddenly going to be in a very, very flooded market.

That is, I believe, GP's point.

February 17, 2024 at 7:07 AM

robbomacrae

Considering a year ago we had that nightmare fuel of Will Smith eating spaghetti and Don and Joe hair force one, it seems odd to see those of you who assume we’re not going to get to the point of being indistinguishable from reality in the near future.

February 17, 2024 at 1:20 AM

nprateem

* Flesh out a movie about x following the Hero's Journey in the style of Notting Hill.

* Create a scene in which a character with the mannerisms of Tom Cruise from Top Gun goes into a bar and says "...."

February 16, 2024 at 8:14 PM

theshrike79

We might enter a world where "actors" are just for mocap. They do the little micro expressions with a bunch of dots on their face.

AI models add the actual character and maybe even voice.

At that point the amount of actors we "need" will go down drastically. The same experienced group of a dozen actors can do multiple movies a month if needed.

February 17, 2024 at 5:57 AM

hackermatic

It's always a bad time to be an actor, between long hours, low pay, and a culture of abuse, but this will definitely make it worse. My writer and artist friends are already despondent from genAI -- it was rare to be able to make art full-time, and even the full-timers were barely making enough money to live. Even people writing and drawing for marketing were not exactly getting rich.

I think this will lead to a further hollowing-out of who can afford to be an actor or artist, and we will miss their creativity and perspective in ways we won't even realize. Similarly, so much art benefits from being a group endeavor instead of someone's solo project -- imagine if George Lucas had created Star Wars entirely on his own.

Even the newly empowered creators will have to fight to be noticed amid a deluge of carelessly generated spam and sludge. It will be like those weird YouTube Kids videos, but everywhere (or at least like indie and mobile games are now). I think the effect will be that many people turn to big brands known for quality, many people don't care that much, and there will be a massive doughnut hole in between.

February 16, 2024 at 1:20 PM

arvinsim

> Even the newly empowered creators will have to fight to be noticed amid a deluge of carelessly generated spam and sludge. It will be like those weird YouTube Kids videos, but everywhere (or at least like indie and mobile games are now).

Reminds me of Syndrome's quote in the Incredibles.

"If everyone is super, then no one will be".

February 16, 2024 at 3:33 PM

yterdy

I dunno. Thanks to big corpo shenanigans (and, er, racism?) a lot of people have turned away from big brands (or, at least obviously brand-y brands) towards "trusted individuals" (though you might classify them as brands themselves). Who goes to PCMag anymore? It's all LTT and Marques Brownlee and any number of small creators. Or, the people on the right who abandoned broadcast and even cable news and get everything they "know" from Twitter randos. Even on this site, asks for a Google Search alternative are not rare, and you'll get about a dozen different answers each time, each with a fraction of the market share of the big guy (but growing).

February 17, 2024 at 8:15 PM

sva_

> Probably a bad time to be an actor.

I'm thinking people will probably still want to see their favorite actors, so established actors may sell the rights to their image. They're sitting on a lot of capital. Bad time to be becoming an actor though.

February 16, 2024 at 8:02 AM

lIl-IIIl

You are talking about movie and TV stars, not actors in general. The vast majority of working actors are not known to the audience.

February 16, 2024 at 1:29 PM

Animats

Even the average SAG-AFTRA member barely makes a living wage from acting. And those are the ones that got into the union. There's a whole tier below that. If you spend time in LA, you probably know some actress/model/waitress types.

There's also the weird misery of being famous, but not rich. You can't eat fame.

February 16, 2024 at 3:46 PM

nprateem

> established actors may sell the rights to their image

I had a conversation with a Hollywood producer last year who said this is already happening.

February 16, 2024 at 8:16 PM

throwaway743

Likely less and less tho given that people will be able to generate a hyper personalized set of actors/characters/personalities in their hyper personalized generated media.

Younger generations growing up with hyper personalized media will likely care even less about irl media figures.

February 16, 2024 at 8:19 AM

kranke155

You can’t replace actors with this for a long time. Actors are “rendering” faster than any AI. Animation is where the real issues will show up first, particularly in Advertising.

February 16, 2024 at 7:33 PM

theshrike79

Have you seen the amount of CGI in movies and TV shows? :)

In many AAA blockbusters the "actors" on screen are just CGI recreations during action scenes.

But you're right, actors won't be out of a job soon, but unless something drastic happens they'll have the role of Vinyl records in the future. For people who appreciate the "authenticity". =)

February 17, 2024 at 6:00 AM

murukesh_s

I think you can fill-in many scenes for the actor - perhaps a dupe but would look like the real actor - of course the original actor would have to be paid, but perhaps much less as the effort is reduced.

February 16, 2024 at 7:43 PM

kranke155

If it requires acting, it likely can't be done with AI. You underestimate, I think, how much an actor carries a movie. You can use it for digi doubles maybe, for stunts and VFX. But if his face is on the screen... We are ages away from having an AI actor perform at the same level as Daniel Day Lewis, Willem Dafoe, or anyone else that's in that atmosphere. They make too many interesting choices per second for it to be replaced by AI.

February 16, 2024 at 8:54 PM

daxfohl

Quality aside, there's a reason producers pay millions for A-list stars instead of any of the millions of really good aspiring actors in LA that they could hire for pennies. People will pay to see the new Matt Damon flick but wouldn't give it a second glance if some no-name was playing the part.

If you can't replace Matt Damon with another equivalently skilled human, CGI won't be any different.

Granted, maybe that's less true today, given Marvel and such are more about the action than the acting. But if that's the future of the industry anyway, then acting as a worthwhile profession is already on its way out, CGI or no.

February 17, 2024 at 1:23 AM

kranke155

Yes, people also take actors as a sign of the quality of the film, or at least they used to, before Marvel. Hence films with big names attached get more money, etc.

Still the idea that actors are easy to replace is preposterous to anyone who's ever worked with actors. They are preposterously HARD to replace, in theatre and film. A good actor is worth their weight in gold. Very very few people are good actors. A good actor is a good comedian, a master at controlling his body, and a master at controlling his voice, towards a specifically intended goal. They can make you laugh, cry, sigh, or feel just about anything. You just look at Paul Giamatti or Willem Dafoe or Denzel Washington. Those people are not replaceable, and their work is just as good and just as culturally important as a Picasso or a Monet. A hundred years from now people will know the name of actors, because that was the dominant mode of entertainment of our age.

February 20, 2024 at 4:34 AM

dingclancy

The idea that this destroys the industry is overblown, because the film industry has already been dying since the 2000s.

Hollywood is already destroyed. It is not the powerful entity it once was.

In terms of attention and time of entertainment, Youtube has already surpassed them.

This will create a multitude more YouTube creators that do not care about getting this right or making a living out of it. It will just take our attention all the same, away from the traditional Hollywood.

Yes, there will still be great films and franchises, but the industry is shrinking.

This is similar to journalism saying that AI will destroy it. Well, there was nothing to destroy because a bunch of traditional newspapers already closed shop even before AI came.

February 16, 2024 at 2:38 PM

hcarvalhoalves

They shouldn’t be worried so soon. This will be used to pump out shitty hero movies more quickly, but there will always be demand for a masterpiece after the hype cools down.

This is like a chef worrying about going out of business because of fast food.

February 17, 2024 at 4:56 AM

FrozenSynapse

Yeah, but how many will work on that singular masterpiece? The rest will be reduced and won’t have a job to put food on the table

February 17, 2024 at 5:31 AM

hcarvalhoalves

Only if the entertainment market remains the same size.

February 17, 2024 at 9:45 AM

LegitShady

Without a change in copyright law, I doubt it. The current policy of the USCO is that the products of AI based on prompts like this are not human authored and can't be copyrighted. No one is going to release AI created stuff that someone else can reproduce because it's public domain.

February 16, 2024 at 4:07 PM
February 16, 2024 at 7:33 AM

neutralx

Has anyone else noticed the leg swap in the Tokyo video at 0:14? I guess we are past uncanny, but I do wonder if these small artifacts will always be present in generated content.

Also begs the question, if more and more children are introduced to media from a young age and they are fed more and more with generated content, will they be able to feel "uncanniness" or become completely blunt to it.

There's definitely an interesting period ahead of us, not yet sure how to feel about it...

February 16, 2024 at 4:54 AM

snewman

There are definitely artifacts. Go to the 9th video in the first batch, the one of the guy sitting on a cloud reading a book. Watch the book; the pages are flapping in the wind in an extremely strange way.

February 16, 2024 at 8:43 AM

daxfohl

The third batch, the one with the cat, the guy in bed has body parts all over, his face deforms, and the blanket is partially alive.

February 16, 2024 at 11:19 AM

quickthrower2

When there is a déjà vu cat, we know we are in trouble!

February 16, 2024 at 5:50 PM

Crespyl

In the one with the cat waking up its owner, the owner's shoulder turns into a blanket corner when she rolls over.

February 16, 2024 at 11:32 AM

Kydlaw

Yep, I noticed it immediately too. Yet it is subtle in reality. I'm not that good at spotting imperfections in pictures, but in the video I immediately felt something was not quite right.

February 16, 2024 at 4:58 AM

elicksaur

Tangent to feeling numb to it - will it hinder children developing the understanding of physics, object permanence, etc. that our brains have?

February 16, 2024 at 7:28 AM

lukan

There have been children who reacted with irritation when they could not swipe away real-life objects. The idea is to give kids enough real-world experiences so this does not happen.

February 16, 2024 at 2:51 PM

sunnybeetroot

Kids have been exposed to decades of 2D and 3D animations that do not contain realistic physics etc; I’m assuming they developed fine?

February 19, 2024 at 4:16 AM

dymk

Kids aren't supposed to have screen time until they're at least a few years old anyways

February 16, 2024 at 11:48 AM

jrockway

I noticed at the beginning that cars are driving on the right side of the road, but in Japan they drive on the left. The AI misses little details like that.

(I'm also not sure they've ever had a couple inches of snow on the ground while the cherry blossoms are in bloom in Tokyo, but I guess it's possible.)

February 16, 2024 at 6:37 AM

throw310822

The cat in the "cat wakes up its owner" video has two left front legs, apparently. There is nothing that is true in these videos. They can and do deviate from reality at any place and time and at any level of detail.

February 16, 2024 at 7:12 AM

hackerlight

These artefacts go down with more compute. In four years when they attack it again with 100x compute and better algorithms I think it'll be virtually flawless.

February 16, 2024 at 2:18 PM

lostemptations5

I had to go back several times to 0:14 to see if it was really unusual. I get it of course, but probably watching 20 times I would have never noticed it.

February 16, 2024 at 4:22 PM

hank808

Yep! Glad I wasn't the only one that saw that. I have a feeling THEY didn't see it or they wouldn't have showcased it.

February 16, 2024 at 5:28 AM

ryanisnan

I don't think that's the case. I think they're aware of the limitations and problems. Several of the videos have obvious problems, if you're looking - e.g. people vanishing entirely, objects looking malformed in many frames, objects changing in size incongruent with perspective, etc.

I think they just accept it as a limitation, because it's still very technically impressive. And they hope they can smooth out those limitations.

February 16, 2024 at 5:55 AM

SirMaster

They swap multiple times lol. Not to mention it almost always looks like the feet are slightly sliding on the ground with every step.

I mean there are some impressive things there, but it looks like there's a long ways to go yet.

They shouldn't have played it into the close up of the face. The face is so dead and static looking.

February 16, 2024 at 5:27 AM

micromacrofoot

certainly not perfect... but "some impressive things" is an understatement, think of how long it took to get halfway decent CGI... this AI thing is already better than clips I've seen people spend days building by hand

February 16, 2024 at 6:05 AM

xkgt

This is pretty impressive, it seems that OpenAI consistently delivers exceptional work, even when venturing into new domains. But looking into their technical paper, it is evident that they are benefiting from their own body of work done in the past and also the enormous resources available to them.

For instance, the generational leap in video generation capability of SORA may be possible because:

1. Instead of resizing, cropping, or trimming videos to a standard size, Sora trains on data at its native size. This preserves the original aspect ratios and improves composition and framing in the generated videos. This requires massive infrastructure. This is eerily similar to how GPT3 benefited from a blunt approach of throwing massive resources at a problem rather than extensively optimizing the architecture, dataset, or pre-training steps.

2. Sora leverages the re-captioning technique from DALL-E 3, using GPT to turn short user prompts into longer detailed captions that are sent to the video model. Although it remains unclear whether they employ GPT-4 or another internal model, it stands to reason that they have access to a superior captioning model compared to others.

This is not to say that inertia and resources are the only factors that are differentiating OpenAI; they may have access to a much better talent pool, but that is hard to gauge from the outside.

February 16, 2024 at 10:43 AM

Imnimo

https://openai.com/sora?video=big-sur

In this video, there's extremely consistent geometry as the camera moves, but the texture of the trees/shrubs on the top of the cliff on the left seems to remain very flat, reminiscent of low-poly geometry in games.

I wonder if this is an artifact of the way videos are generated. Is the model separating scene geometry from camera? Maybe some sort of video-NeRF or Gaussian Splatting under the hood?

February 16, 2024 at 2:35 AM

ethbr1

Curious about what current SotA is on physics-infusing generation. Anyone have paper links?

OpenAI has a few details:

>> The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.

>> Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.

>> We represent videos and images as collections of smaller units of data called patches, each of which is akin to a token in GPT. By unifying how we represent data, we can train diffusion transformers on a wider range of visual data than was possible before, spanning different durations, resolutions and aspect ratios.

>> Sora builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data. As a result, the model is able to follow the user’s text instructions in the generated video more faithfully.

The implied facts that it understands physics of simple scenes and any instances of cause and effect are impressive!

Although I assume that's been SotA-possible for awhile, and I just hadn't heard?
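For anyone wondering what the "patches" in the quote above might look like in practice, here is a rough sketch, not OpenAI's actual code. It cuts a raw pixel video into spacetime blocks and flattens each one into a token-like vector. The function name and the patch sizes (2 frames by 16x16 pixels) are made up for illustration, and working on raw pixels rather than some compressed representation is an assumption on my part.

```python
import numpy as np

def to_spacetime_patches(video, pt=2, ph=16, pw=16):
    """Split a (T, H, W, C) video into flattened spacetime patches.

    Hypothetical helper: patch sizes are illustrative, not Sora's.
    Returns an array of shape (num_patches, pt * ph * pw * C).
    """
    t, h, w, c = video.shape
    # Crop so every dimension divides evenly (a simplification; padding also works).
    t, h, w = t - t % pt, h - h % ph, w - w % pw
    video = video[:t, :h, :w]
    # Expose the patch grid, then move the grid axes in front of the in-patch axes.
    x = video.reshape(t // pt, pt, h // ph, ph, w // pw, pw, c)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, pt * ph * pw * c)

# Different durations and aspect ratios just give different sequence lengths.
wide = np.zeros((8, 64, 128, 3), dtype=np.float32)
tall = np.zeros((4, 128, 64, 3), dtype=np.float32)
print(to_spacetime_patches(wide).shape)  # (128, 1536)
print(to_spacetime_patches(tall).shape)  # (64, 1536)
```

The point is that nothing has to be resized or cropped to a standard shape: a transformer can take a variable-length sequence of patches, which fits the "different durations, resolutions and aspect ratios" line in the quote.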

February 16, 2024 at 3:00 AM

msoad

On the announcement page, it specifically says Sora does not understand physics

February 16, 2024 at 5:42 AM

nuz

I saw similar artifacts in dalle-1 a lot (as if the image was pasted onto geometry). Definitely wouldn't surprise me if they use synthetic rasterized data in the training, which could totally create artifacts like this.

February 16, 2024 at 2:56 AM

thomastjeffery

The model is essentially doing nothing but dreaming.

I suspect that anything that looks like familiar 3D-rendering limitations is probably a result of the training dataset simply containing a lot of actual 3D-rendered content.

We can't tell a model to dream everything except extra fingers, false perspective, and 3D-rendering compromises.

February 16, 2024 at 3:12 AM

makin

Technically we can, that's what negative prompting[1] is about. For whatever reason, OpenAI has never exposed this capability in its image models, so it remains an open source exclusive.

[1] https://stable-diffusion-art.com/how-to-use-negative-prompts...
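
For anyone wondering what negative prompting does mechanically: in common open-source Stable Diffusion pipelines it is usually implemented through classifier-free guidance, with the negative prompt's noise prediction taking the place of the unconditional one. A minimal Python sketch of that combination step, using plain lists instead of tensors; `guided_noise` and the toy numbers are made up for illustration, and nothing here says how OpenAI's models work.

```python
def guided_noise(eps_positive, eps_negative, scale=7.5):
    """Classifier-free guidance step: start from the negative-prompt prediction
    and move `scale` times along the direction toward the positive-prompt one."""
    return [n + scale * (p - n) for p, n in zip(eps_positive, eps_negative)]


# Toy noise predictions from the denoiser, one per prompt.
eps_pos = [0.25, -0.5, 1.0]   # conditioned on the prompt
eps_neg = [0.75, -0.5, 0.0]   # conditioned on the negative prompt
print(guided_noise(eps_pos, eps_neg, scale=2.0))
```

Where the two predictions agree (the middle value here), guidance changes nothing; it only steers along directions in which the prompt and the negative prompt differ.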

February 16, 2024 at 7:40 AM

thomastjeffery

It's more complicated than that. Negative prompts are just as limited as positive prompts.

February 16, 2024 at 8:32 AM

spyder

It's possible it was pre-trained on 3D renderings first, because it's easy to get almost infinite synthetic data that way, and after that they continued the training on real videos.

February 16, 2024 at 7:45 AM

iandanforth

In the car driving on the mountain road video you can see level-of-detail popping artifacts being reproduced, so I think that's a fair guess.

February 16, 2024 at 12:17 PM

burkaman

Maybe it was trained on a bunch of 3d Google Earth videos.

February 16, 2024 at 4:21 AM

downWidOutaFite

Doesn't look flat to me.

Edit: Here[0] I highlighted a groove in the bushes moving with perfect perspective

[0] https://ibb.co/Y7WFW39

February 16, 2024 at 4:50 AM

internetter

Look in the top left corner, on the plane

February 16, 2024 at 4:54 AM

montag

My vote is yes - some sort of intermediate representation is involved. It just seems unbelievable that it's end-to-end with 2D frames...

February 16, 2024 at 3:09 PM

cush

The water is on par with Avatar. Looks perfect

February 16, 2024 at 5:46 AM

cush

Wow, yeah I didn't notice it at first, but looking at the rocks in the background is actually nauseating

February 16, 2024 at 5:44 AM

jquery

It looks perfect to me. That's exactly how the area looks in person.

February 16, 2024 at 1:41 PM
February 16, 2024 at 3:08 AM

dudeinhawaii

I say this with all sincerity, if you're not overwhelmingly impressed with Sora then you haven't been involved in the field of AI generated video recently. While we understand that we're on the exponential curve of AI progress, it's always hard to intuit just what that means.

Sora represents a monumental leap forward; it's comically a 3000% improvement in 'coherent' video generation seconds. Coupled with a significantly enhanced understanding of contextual prompts and overall quality, it has achieved what many (most?) thought would take another year or two.

I think we will see studios like ILM pivoting to AI in the near future. There's no need for 200 VFX artists when you can have 15 artists working with AI tooling to generate all the frame-by-frame effects, backgrounds, and compositing for movies. It'll open the door for indie projects that can take place in settings that were previously the domain of big Hollywood. A sci-fi opera could be put together with a few talented actors, AI effects and a small team to handle post-production. This could conceivably include AI scoring.

Sure, Hollywood and various guilds will strongly resist, but it'll require just a handful of streaming companies to pivot. Suddenly content creation costs for Netflix drop an order of magnitude. The economics of content creation will fundamentally change.

At the risk of being proven very wrong, I think replacing actors is still fairly distant in the future but again... humans are bad at conceptualizing exponential progress.

February 17, 2024 at 6:38 AM

xmprt

I strongly believe that AI will have massive impact on the film industry however it won't be because of a blackbox, text to video tool like Sora. VFX artists and studios still want a high level of control over the end product and unless it's very simple to tweak small details like the blur of an object in the background, or the particle physics of an explosion, then they wouldn't use it. What Hollywood needs are AI tools that can integrate with their existing workflows. I think Adobe is doing a pretty good job at this.

February 17, 2024 at 7:00 AM

random_cynic

You're completely missing the point. Who cares what VFX artists and studios want if anyone with a small team can create high quality entertaining videos that millions of people would pay to watch? And if you think that's a bar too high for AI, then you haven't actually seen the quality of average videos and films generated these days.

February 17, 2024 at 12:54 PM

xmprt

I was specifically responding to this point which seemed to be the thesis of the parent commenter.

> I think we will see studios like ILM pivoting to AI in the near future. There's no need for 200 VFX artists when you can have 15 artists working with AI tooling

Yes this will bring the barrier to entry for small teams down significantly. However it's not going to replace the 200 people studios like ILM.

February 17, 2024 at 2:25 PM

hackerlight

I believe this to be a failure of imagination. You're assuming Sora stays like this. Reality is we are on an exponential and it's just a matter of time. ILM will be the last to go but it'll eventually go, in the sense of having fewer humans needed to create the same output.

February 17, 2024 at 6:13 PM

dogcomplex

I think it's fair to be impressed with Sora as the next stage of AI video, yet not be too surprised or consider it some insurmountable leap from the public pieces we've seen of AI video up to this point. We've always been just a couple papers away, seeking a good consistency modelling step - now we've got it. Amazing and viscerally chilling - seeing the net effect - but let's not be intimidated so easily or prop these guys up as gods just for being a bit ahead of the obviously-accelerating curve. Anyone tracking this stuff had a very strong prediction of good AI video within a year - two max. This was a context size increase and overall impressive quality pass reaching a new milestone, but the bones were there.

February 20, 2024 at 12:11 PM

telesilla

Watching these made me think, I'm going to want to go to the theatre a lot more in the future and see fellow humans in plays, lectures and concerts.

Such achievements in technology must lead to cultural change. Look at how popular vinyl has become, why not theatre again.

February 16, 2024 at 6:22 AM

imiric

Do you feel the same way about modern movies? CGI is so ubiquitous and accessible, that most movies use some form of it. It's actually news when a filmmaker _doesn't_ use CGI (e.g. Nolan).

These advancements are just the next step in that evolution. The tech used in movies will be commoditized, and you'll see Hollywood-style production in YouTube videos.

I'm not sure why you think theater will become _more_ popular because of this. It has remained popular throughout the years, as technology comes and goes. People can enjoy both video and theater, no?

February 16, 2024 at 7:15 AM

mark_l_watson

I agree, seeing real human actors on stage will always be popular for some consumers. Same for local live musicians.

That said, I helped a friend who makes low budget, edgy and cool films last week. I showed him what I knew about driving Pika.art and he picked it up quickly. He is very excited about the possibility of being able to write more stories and turn them into films.

I think there is plenty of demand for all kinds of entertainment. It is sad that so many creative people in Hollywood and other content creation centers will lose jobs. I think the very best people will be employed, but often partnered with AIs. Off topic, but I have been a paid AI practitioner since 1982, and the breakthroughs of deep learning, transformers, and LLMs are stunning.

February 16, 2024 at 11:52 AM

roca

We will soon find that story generation is easily automated.

February 16, 2024 at 1:44 PM

htrp

It's already easily automated.

February 16, 2024 at 10:13 PM

carrozo

Drop a link to your friend’s work?

February 17, 2024 at 6:11 PM

dogcomplex

I actually suspect one of the new most popular mediums will be actors on a theatre stage doing live performances to a live AI CGI video being rendered behind them - similar to musicians in a live orchestra. It would bring together the nostalgia and wonder of human acting and performance art, while still smoothing and enhancing their live performance into the quality levels and wonder we've come to expect from movie theatre experiences. This will be technologically doable soon.

February 20, 2024 at 12:15 PM

Buckworthy

Imagine movies generated in real-time just for you, with the faces you know, places you know and what not!

February 16, 2024 at 7:23 AM

Astraco

That's terrifying and dystopian.

February 16, 2024 at 8:49 AM

dorkwood

No it's not. Imagine turning on the television when you get home and it's a show all about you (think Breaking Bad, but you're Walter White). You flip to another channel and it's a pornographic movie where you sleep with all the world's most famous movie stars. Flip the channel again and it's all the home movies you wish you had but were never able to make.

This is a future we could once only dream of, and OpenAI is making it possible. Has anyone noticed how anti-progress HN has become lately?

February 16, 2024 at 1:42 PM

enumjorge

I guess it depends on your definition of progress. None of those examples you listed sound particularly appealing to me. I've never watched a show and thought I'd get more enjoyment if I was at the center of that story. Porn and dating apps have created such unrealistic expectations of sex and relationships that we're already seeing the effects in younger generations. I can only imagine what on-demand fully generative porn will have on issues like porn addiction.

Not to say I don't have some level of excitement about the tech, but I don't think it's unwarranted pessimism to look at this stuff and worry about its darker implications.

February 16, 2024 at 2:20 PM

Astraco

> You flip to another channel and it's a pornographic movie where you sleep with all the world's most famous movie stars.

This is not only dystopian, it's just sad. All these look taken from the first seasons of Black Mirror. I don't know what you think progress is but AI porno and ads are not.

February 16, 2024 at 2:57 PM

oliverpk

I don't think any well adjusted person ever has actually wanted this

February 16, 2024 at 6:45 PM

dukeyukey

This might be more revealing of you than of people in general. Even when I play tabletop RPGs, a place I could _easily_ play a version of myself, I almost never do. There's nothing wrong with doing so, but most people don't.

February 16, 2024 at 7:09 PM

ookdatnog

That seems depressingly solipsistic. I think part of the appeal of art is that it's other humans trying to communicate with you, that you feel the personality of the creators shining through.

Also I've never interacted with any piece of art or entertainment and thought to myself "this is neat and all, but it would be much improved if this were entirely about me, with me as the protagonist." One watches Breaking Bad because Walter White is an interesting character; he's a man who falls into a life of crime initially for understandable reasons, but as the series goes on it becomes increasingly clear that he is lying to himself about his motivations and that his primary motivation for his escalating criminal life is his deep-seated frustration at the mediocrity of his life. More than anything else, he craves being important. The unraveling of his motivations and where they come from is the story, and that's something you can't really do when you're literally watching yourself shoehorned into a fictional setting.

You seem to regard it as self-evident that art or entertainment would be improved if (1) it's all about you personally and (2) involvement of other real humans is reduced to zero, but I cannot fathom why you would think that (with the exception of the porn example).

February 16, 2024 at 5:20 PM

dorkwood

Seems like we're pretty close to inserting ourselves into pornographic movies.

February 16, 2024 at 8:46 AM

dymk

We can do that already, you just need a camera

February 16, 2024 at 11:50 AM

ugh123

Can also achieve multiple (still) angles, with multiple phones.

February 16, 2024 at 3:51 PM

danielbln

Not close, we're there. Look up FaceFusion.

February 16, 2024 at 7:13 PM

nomadpenguin

That would suck. I want to see something I haven't seen before.

February 16, 2024 at 7:28 AM

dymk

I guarantee you haven't seen the entire latent space of any large model

February 16, 2024 at 11:51 AM

iwsk

Your wish is Sora's (or its successor model's) prompt.

February 16, 2024 at 11:36 AM

sussmannbaka

I am a much better software engineer than I am a director. I can guarantee you that I don’t want to see anything that I could prompt.

February 16, 2024 at 3:58 PM

superhumanuser

Your favorite shows, where the season never ends, and the actors never age.

February 16, 2024 at 9:31 AM
February 16, 2024 at 10:07 AM

dingclancy

The vinyl narrative is so whack.

https://www.riaa.com/u-s-sales-database/

At its peak in 1979, inflation-adjusted vinyl sales were $1.4 billion. Fast forward to the lowest sales, in 2009, at $3.4 million. So vinyl has been so popular it grew to $8.5m by 2021.

That is just nostalgia, not cultural change pushed by the dystopia of AI.

February 16, 2024 at 2:51 PM

halfstar91

Why is my 14-year-old niece now collecting vinyl? I can guarantee it's not nostalgia. There's obviously more at play there even when acknowledging your point about relative market size.

February 16, 2024 at 3:18 PM

turtles3

Perhaps it is _anemoia_ - nostalgia for a time you've never known https://www.dictionaryofobscuresorrows.com/post/105778238455...

In this case, it's for the harmless charm of an imagined past, but the same forces are at play in some more dangerous forms of social conservatism.

February 16, 2024 at 6:53 PM

peebeebee

It's a very narrow subgroup.

But things can coexist. It's now easier to create music than ever, and there is more music created by more artists than ever. Most music is forgettable and just streamed as background music. But there is also room for superstars like Taylor Swift.

Things don't have to be either-or.

February 16, 2024 at 6:59 PM

outime

How many 14-year-olds do you know who collect vinyl?

February 16, 2024 at 6:05 PM

r9295

The medium is the message. I know several people born post 2000 who are embracing records and tapes.

February 16, 2024 at 6:40 PM

xanderlewis

I started when I was pretty much exactly that age, ten years ago.

February 16, 2024 at 6:50 PM

throw_m239339

> The vinyl narrative is so whack.

"Revenues for the LP/EP format were $1.2B in 2022 and accounted for 7.7% of total revenue of $15.9B for all selected formats for the year"

Adjusted for inflation.

It's my understanding that LP/EP is vinyl as well, not just vinyl singles.

February 16, 2024 at 10:37 PM

spyckie2

This has to be it. Vinyl costs like $20 per record, and $8m is like 400k vinyl sales (users often buy more than one record, so it's a lot fewer users), which seems too low globally. At $1.2b, it is more like 60m sales, which seems more reasonable.
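
Back-of-the-envelope check of those numbers in Python, assuming a flat $20 per record (the rough price guessed above, not a figure from the RIAA database):

```python
PRICE_PER_RECORD = 20  # USD, assumed flat price per record


def records_sold(revenue_usd, price=PRICE_PER_RECORD):
    """Units implied by a revenue figure at a flat price."""
    return revenue_usd / price


low_estimate = records_sold(8_000_000)        # the ~$8m figure
high_estimate = records_sold(1_200_000_000)   # the $1.2B LP/EP figure
print(f"{low_estimate:,.0f} vs {high_estimate:,.0f} records")
```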

February 17, 2024 at 1:22 AM

procinct

I think a lot of people collect vinyl less for nostalgia reasons and more so to have a physical collection of their music. I think vinyl wins over CDs just due to how it’s larger and the cover art often looks better as a result.

February 17, 2024 at 4:12 AM

sleepingreset

[dead]

February 16, 2024 at 7:09 AM

kweingar

Obviously incredibly cool, but it seems that people are incredibly overstating the applications of this.

Realistically, how do you fit this into a movie, a TV show, or a game? You write a text prompt, get a scene, and then everything is gone—the characters, props, rooms, buildings, environments, etc. won’t carry over to the next prompt.

February 16, 2024 at 3:36 AM

superconduct123

It doesn't need to replace the whole movie

You could use it for stuff like wide shots, close ups, random CG shots, rapid cut shots, stuff where you just cut to it once and don't need multiple angles

To me it seems most useful for advertising where a lot of times they only show something once, like a montage

February 16, 2024 at 4:10 AM

planckscnst

And it would be magic for storyboarding. This would be such a useful tool for a director to iterate on a shot and then communicate that to the team

February 16, 2024 at 5:43 AM

_sys49152

i could arrange in frameforge 3d shot by shot, even adjusting for motion in between, then export to an AI solution. that to me would be everything. of course then comes issues of consistency, adjustments & tweaks, etc

February 16, 2024 at 8:49 AM

Boss0565

I also see advertising (especially lower-budget productions, such as dropshipping or local TV commercials) being early adopters of this technology once businesses have access to this at an affordable price.

February 16, 2024 at 4:53 AM

fassssst

It generates up to 1 minute videos which is like what all the kids are watching on TikTok and YouTube Shorts, right? And most ads are shorter than 1 minute.

February 16, 2024 at 3:43 AM

Janicc

A few months ago ai generated videos of people getting arrested for wearing big boots went viral on TikTok. I think this sort of silly "interdimensional cable" stuff will be really big on these short form video type sites once this level of quality becomes available to everyone.

February 16, 2024 at 5:03 AM

JamesSwift

Robot chicken, but full motion video

February 16, 2024 at 9:17 AM

wilg

You wait a year and they'll figure it out.

February 16, 2024 at 4:08 AM

matsemann

It also seems hard to control exactly what you get. Like you'd want a specific pan, focus etc. to realize your vision. The examples here look good, but they aren't very specific.

But it was the same with Dall-E and others in the beginning, and there's now lots of ways to control image generators. Same will probably happen here. This was a huge leap just in how coherent the frames are.

February 16, 2024 at 3:56 AM

sethammons

What came to mind is what is right around the corner: you create segments and stitch them together.

"ok, continue from the context on the last scene. Great. Ok, move the bookshelf. I want that cat to be more furry. Cool. Save this as scene 34."

As clip sizes grow and context can be inferred from a previous scene, and a library of scenes can be made, boom, you can now create full feature length films, easy enough that elementary school kids will be able to craft up their imaginations.

February 16, 2024 at 6:34 AM

JKCalhoun

You could use it to storyboard right now. Continuity of characters/wardrobe, etc. is not that important in storyboarding.

February 16, 2024 at 4:49 AM

al_borland

Family Guy is built on out of context clips.

It could also fill in for background videos in scenes, instead of getting real content they’d have to pay for, or making their own. The gangster movie Kevin was playing in Home Alone was specifically shot for that movie, from what I remember.

February 16, 2024 at 9:45 AM

dragonwriter

> You write a text prompt, get a scene, and then everything is gone—the characters, props, rooms, buildings, environments, etc. won’t carry over to the next prompt.

Sure, you can't use the text-to-video frontend for that purpose. But if you've got a t2v model as good as Sora clearly is, you've got the infrastructure for a lot more, as the ecosystem around the open-source models in the space has shown. The same techniques that allow character, object, etc., consistency in text-to-image models can be applied to text-to-video models.

February 16, 2024 at 3:45 AM

nprateem

It's pretty obvious they just need to add the ability to prompt it with an image saying "continue in this style and make the character..."

February 16, 2024 at 5:21 AM

padolsey

Nah just fine-tune the model to a specific set of characters or aesthetic. It's not hard, already done with SDXL LoRAs. You can definitely generate a whole movie from just a storyboard.. if not now, then in maybe five yrs.

February 16, 2024 at 1:45 PM

barbarr

Explicit video clips? 4chan is gonna have a field day with this.

February 16, 2024 at 5:53 AM

matheusmoreira

Not necessarily explicit. Maybe just deliberately offensive. Maybe just weirdly specific.

It's gonna be great.

February 16, 2024 at 1:13 PM
February 16, 2024 at 5:43 AM

TrackerFF

lots and lots and lots of b-roll and stock footage is about to get cheaper.

Also, using this kind of footage is the bread and butter for a lot of marketers for their content.

Imagine never having to pay stock footage companies

February 17, 2024 at 2:16 AM

dogcomplex

What? You're serious?

Script => Video baseline. Take a frame of any character/prop/room/etc you want to remain consistent, and one shitty photoshop and it's part of the new scene.

Incredibly overstating. That is an incredible lack of imagination buddy. Or even just basic craftsmanship.

February 16, 2024 at 1:11 PM

seydor

tiktok

February 16, 2024 at 5:03 AM

idiliv

People here seem mostly impressed by the high resolution of these examples.

Based on my experience doing research on Stable Diffusion, scaling up the resolution is the conceptually easy part that only requires larger models and more high-resolution training data.

The hard part is semantic alignment with the prompt. Attempts to scale Stable Diffusion, like SDXL, have resulted only in marginally better prompt understanding (likely due to the continued reliance on CLIP prompt embeddings).

So, the key question here is how well Sora does prompt alignment.

February 16, 2024 at 3:08 AM

golol

The real advancement is the consistency of character, scene, and movement!

February 16, 2024 at 3:53 AM

kolja005

There needs to be an updated CLIP-like model in the open-source community. The model is almost three years old now and is still the backbone of a lot of multimodal models. It's not a sexy problem to take on since it isn't especially useful in and of itself, but so many downstream foundation models (LLaVA, etc.) would benefit immensely from it. Is there anything out there that I'm just not aware of, other than SigLIP?

February 16, 2024 at 7:35 AM

nimbleal

I agree.

I think one part of the problem is using English (or whatever natural language) for the prompts/training. Too much inherent ambiguity. I’m interested to see what tools (like control nets with SD) are developed to overcome this.

February 16, 2024 at 7:44 PM
February 16, 2024 at 1:46 PM

rglover

I was super on board until I saw...the paw: https://player.vimeo.com/video/913131059?h=afe5567f31&badge=...

Exciting for the potential this creates, but scary for the social implications (e.g., this will make trial law nearly impossible).

February 16, 2024 at 5:31 AM

comicjk

If I understand trial law correctly, the rules of evidence already prohibit introducing a video at trial without proving where it came from (for example, testimony from a security guard that a given video came from a given security camera).

But social media has no rules of evidence. Already I see AI-generated images as illustrations on many conspiracy theory posts. People's resistance to believing images and videos from sketchy sources is going to have to increase very fast (especially for images and videos that they agree with).

February 16, 2024 at 6:07 AM

al_borland

All the more reason why we need to rely on the courts and not the mob justice (in the social sense) which has become popular over the last several years.

February 16, 2024 at 9:42 AM

a_wild_dandan

Nothing will change. Confirmation bias junkies already accept far worse fakes. People who use trusted sources will continue doing so. Bumping the quantity/quality of fabricated horseshit won't move the needle.

February 16, 2024 at 9:46 AM

zuminator

Wow. If I saw this clip a year ago I wouldn't think, "The image generator fucked up," I'd just think that a CG effects artist deliberately tweaked an existing real-world video.

February 16, 2024 at 5:48 AM

rglover

Yeah, if that gets cleaned up (one would expect it to in time), this is going to change a lot.

February 16, 2024 at 6:05 AM
February 16, 2024 at 7:50 AM

itissid

How does one cope with this?

- Disruptions like this happen to every industry every now and then. Just not on the level of "Communicating with people with words, and pictures". Anduril and SpaceX disrupted defense contractors and United Launch Alliance; Someone working for a defense contractor/ULA here affected by that might attest to the feeling?

- There will be plenty of opportunity to innovate. Industries are being created right now. People probably also felt the same way when they saw HTTP on their screens the first time. So don't think your career or life's worth of work is minuscule; it's just a moving target. Adapt & learn.

- Devil is in the details. When a bunch of large SaaS behemoths created enterprise software, an army of contractors and consultants grew to support the glue that was ETL. A lot of work remains to be done. It will just be a more imaginative glue.

February 16, 2024 at 5:51 AM

TaupeRanger

I would be willing to bet $10,000 that the average person's life will not be changed in any significant way by this technology in the next 10 years. Will there be some VFX disruption in Hollywood and games? Sure, maybe some. It's not a cure for cancer. It's not AGI. It's not earth shattering. It is fun and interesting though.

February 16, 2024 at 10:14 PM

the8472

"by this technology" does a lot of heavy lifting. Look at the pace of AI development and extrapolate 10 years.

February 16, 2024 at 11:53 PM

maximus-decimus

Relevant XKCD : https://xkcd.com/605/

February 17, 2024 at 6:52 AM

the8472

Not really. We have way more data points than one on AI development. It has been incremental progress for more than a decade.

February 17, 2024 at 5:57 PM

hello_newman

Totally agree with you.

Most of the responses in this thread remind me of why I don't typically go into the comment section of these announcements. It's way too easy to fall into the trap set by the doomsday-predicting armchair experts, who make it sound like we're on the brink of some apocalypse. But anyone attempting to predict the future right now is wasting time at best, or intentionally fear mongering at worst.

Sure, for all we know, OpenAI might just drop the AGI bomb on us one day. But wasting time worrying about all the "what ifs" doesn't help anyone.

Like you said, there is so much work out there to be done, _even if_ AGI has been achieved. Not to get sidetracked from your original comment, but I've seen AGI repeatedly mentioned in this thread. It's really all just noise until proven otherwise.

Build, adapt, and learn. So much opportunity is out there.

February 16, 2024 at 8:55 AM

kypro

> But wasting time worrying about all the "what ifs" doesn't help anyone.

Worrying about the what-ifs is all we have as a species. If we don't worry about how to stop global warming, or how we can prevent a nuclear holocaust, these things become far more likely.

If OpenAI drops an AGI bomb on us then there's a good chance that's it for us. From there it will just be a matter of time before a rogue AGI or a human working with an AGI causes mass destruction. This is every bit as dangerous as nuclear weapons - if not more dangerous - yet people seem unable to take the matter as seriously as it needs to be taken.

I fear millions of people will need to die or tens of millions will need to be made unemployable before we even begin to start asking the right questions.

February 16, 2024 at 9:41 AM

__MatrixMan__

Isn't the alternative worse though? We could try to shut Pandora's box and continue to worsen the situation gradually and never start asking the right questions. Isn't that a recipe for even more hardship overall, just spread out a bit more evenly?

It seems like maybe it's time for the devil we don't know.

February 16, 2024 at 12:15 PM

roca

We live in a golden age. Worldwide poverty is at historic lows. Billions of people don't have to worry about where their next meal is coming from or whether they'll have a roof over their head. Billions of people have access to more knowledge and entertainment options than anyone had 100 years ago.

This is not the time to risk it all.

February 16, 2024 at 1:53 PM

__MatrixMan__

Staying the course is risking it all. We've built a system of incentives which is asleep at the wheel and heading towards a cliff. If we don't find a different way to coordinate our aggregate behavior--one that acknowledges and avoids existential threats--then this golden age will be a short one.

February 17, 2024 at 11:43 AM

roca

Maybe. But I'm wary of the argument "we need to lean into the existential threat of AI because of those other existential threats over there that haven't arrived yet but definitely will".

It all depends on what exactly you mean by those other threats, of course. I'm a natural pessimist and I see threats everywhere, but I've also learned I can overestimate them. I've been worried about nuclear proliferation for the last 40 years, and I'm more worried about it than ever, but we haven't had another nuclear war yet.

February 19, 2024 at 11:19 AM

nopinsight

Many might miss the key paragraph at the end:

   "Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI."
This also helps explain why the model is so good since it is trained to simulate the real world, as opposed to imitate the pixels.

More importantly, its capabilities suggest AGI and general robotics could be closer than many think (even though some key weaknesses remain and further improvements are necessary before the goal is reached.)

EDIT: I just saw this relevant comment by an expert at Nvidia:

   “If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths.

   I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!

   Let's breakdown the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee." ….”
https://twitter.com/DrJimFan/status/1758210245799920123

February 16, 2024 at 3:58 AM

lucisferre

> since it is trained to simulate the real world

Is it though? Or is this just marketing?

February 16, 2024 at 4:18 AM

janalsncm

I was impressed with their video of a drone race on Mars during a sunset. In part of the video, the sun is in view, but then the camera turns so it’s out of view. When the camera turns back, the sun is where it’s supposed to be.

February 16, 2024 at 4:49 AM

djsavvy

there's mention of memory in the post — the model can remember where it put objects for a short while, so if it pans away and pans back it should keep that object "permanence".

February 16, 2024 at 5:11 AM

hbn

Well the video in the weaknesses section with the archeologists makes me think it's not just predicting pixels. The fact that a second chair spawns out of nothing looks like a typical AI uncanny valley mistake you'd expect, but then it starts hovering which looks more like a video game physics glitch than an incorrect interpretation of pixels on screen.

February 16, 2024 at 5:24 AM

rdedev

If it is, it's not there yet. The snow in the mammoth video kind of looks like smoke, the way it rises into the air

February 16, 2024 at 4:38 AM

wilg

I think it's just inherent to the problem space. Obviously it understands something about the world to be able to generate convincing depictions of it.

February 16, 2024 at 4:39 AM

lucisferre

It seems very dangerous to assume claims without evidence are obvious.

February 16, 2024 at 5:00 AM

wilg

I didn't do that.

February 16, 2024 at 5:02 AM
February 16, 2024 at 4:40 AM

nopinsight

What other likely reasons might explain the leap ahead of other significant efforts?

See also: https://news.ycombinator.com/item?id=39387333

February 16, 2024 at 4:40 AM

lucisferre

Just having a better or bigger model? Better training data, better feedback process, etc.

Seems more likely than "it can simulate reality".

Also I take anecdotal reviews like that with a grain of salt. I follow numerous AI groups on Reddit and elsewhere and many users seem to have strong opinions that their tool of choice is the best. These reviews are highly biased.

Not to say I'm not impressed, but it's just been released.

February 16, 2024 at 4:49 AM

nopinsight

Object persistence and consistency are not likely to arise simply from a bigger model. A different approach or architecture is needed.

Also, I just added a link to an expert’s tweet above. What do you think?

February 16, 2024 at 5:08 AM

lucisferre

Others have provided explanations for things like object persistence, for example keeping a memory of the rendering outside of the frame.

The comment from the expert is definitely interesting and compelling, but clearly still speculation based on the following comment.

> I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!

I like the speculation though, the comments provide some convincing explanations for how this might work. For example, the idea that it is trained using synthetic 3-dimensional data from something like UE5 seems like a brilliant idea. I love it.

Also, in his example video the physics look very wrong to me. The movement of the coffee waves is realistic-ish at best. The boat motion also looks wrong and doesn't match up with the liquid much of the time.

February 16, 2024 at 5:48 AM
February 16, 2024 at 5:41 AM

grbsh

I think you are reading too far into this. The title of the technical paper is “Video generation models as world simulators”.

This is “just” a transformer that takes in a sequence of noisy image (video frame) tokens + prompt, and produces a sequence of less noisy video tokens. Repeat until noise gone.
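To make the "repeat until noise gone" loop concrete, here is a toy sketch. It is not Sora's actual method: `toy_denoiser` is a hypothetical stand-in for the transformer, the "prompt embedding" is just a list of target numbers, and the step count and strength are made-up values. It only shows the shape of the loop: start from noise, feed tokens plus prompt to the model, get slightly less noisy tokens back, repeat.

```python
import random


def toy_denoiser(tokens, prompt_embedding, strength=0.5):
    """Hypothetical stand-in for the transformer.

    A real model would be a learned network conditioned on the prompt.
    Here we simply pull each value part of the way toward a
    prompt-derived target, so every call returns less noisy tokens.
    """
    return [t + strength * (p - t) for t, p in zip(tokens, prompt_embedding)]


def generate(prompt_embedding, steps=10, seed=0):
    """Start from pure noise and denoise repeatedly."""
    rng = random.Random(seed)
    # One random value per "video token" (assumption: tokens are scalars).
    tokens = [rng.gauss(0.0, 1.0) for _ in prompt_embedding]
    for _ in range(steps):
        tokens = toy_denoiser(tokens, prompt_embedding)
    return tokens


def distance(a, b):
    """Euclidean distance, used here as a crude 'how noisy is it' measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

With a target like `[1.0, -2.0, 0.5, 3.0]`, `generate(target, steps=0)` is just noise, while `generate(target, steps=10)` ends up very close to the target, because each pass halves the remaining error.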

The point they’re making, which is totally valid, is that in order for such a model to produce videos with realistic physics, the underlying model is forced to learn a model of physics (a “world simulation”).

February 16, 2024 at 9:08 AM

nopinsight

AlphaGo and AlphaZero were able to achieve superhuman performance due to the availability of perfect simulators for the game of Go. There is no such simulator for the real world we live in. (Although pure LLMs sorta learn a rough, abstract representation of the world as perceived by humans.) Sora is an attempt to build such a simulator using deep learning.

This actually affirms my comment above.

  “Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world.”
https://openai.com/research/video-generation-models-as-world...

What part of my argument do you disagree with?

February 16, 2024 at 9:33 AM

lanternfish

`since it is trained to simulate the real world, as opposed to imitate the pixels.`

It's not that it's learning a model of the world instead of imitating pixels - the world model is just a necessary emergent phenomenon from the pixel imitation. It's still really impressive and very useful, but it's still 'pixel imitation'.

February 16, 2024 at 6:16 PM

xtracto

What I want is an AI trained to simulate the human body, allowing scientists to perform artificial human trials on all kinds of medicines, cutting trial times from years to months.

February 16, 2024 at 5:20 AM

delichon

Or to simulate the short or long term regret you'll feel for eating the meal in the photo.

February 16, 2024 at 7:10 AM

mentalpiracy

> "understand... the real world"

doing a lot of heavy lifting in this statement

February 16, 2024 at 4:47 AM

fasteddie31003

Movie making is going to become fine-tuning these foundational video models. For example, if you want Brad Pitt in your movie you'll need to use his data to fine-tune his character.

February 16, 2024 at 4:24 AM

kevmo314

What is latent space if not a representation of the real world?

February 16, 2024 at 4:20 AM

nopinsight

Pretty sure many latent spaces are not trained to represent 3D motions and some detailed physics of the real world. Those in pure text LLMs, for example.

February 16, 2024 at 4:46 AM
February 16, 2024 at 4:12 AM

ramathornn

Wow, some of those shots are so close to being unnoticeable. That one of the eye close up is insane.

It’s interesting reading all the comments, I think both sides of the “we should be scared” debate are right in some sense.

These models currently give some sort of super power to experts in a lot of digital fields. I’m able to automate the mundane parts of coding and push out fun projects a lot easier today. Does it replace my work, no. Will it keep getting better, of course!

People who are willing to build will have a greater ability to output great things. On the flip side, larger companies will also have the ability to automate some parts of their business - leading to job loss.

At some point, my view is that this must keep advancing to some sort of AGI. Maybe it’s us connecting our brains to LLMs through a tool like Neuralink. Maybe it’s a random occurrence when you keep creating things like Sora. Who knows. It seems inevitable though doesn’t it?

February 16, 2024 at 12:04 PM

offsign

One of the things I've loved about HN was the quality of comments. Whether broad or arcane, you had experts the world over who would tear the topic apart with data and a healthy dose of cynicism. I frequently learned more from the debate and critique than I did from the "news" itself.

I don't know what it is about AI and the current state of tech, but the discourse as of late has really taken a nosedive. I'm not saying that any of this conjecture won't happen, but the acceleration towards fervor and fear mongering on the subject is bordering on religiosity - seriously, it makes crypto bros look good.

And yeah -- looks like some cool new tech from OpenAI, and excited when I can actually dig in. Would also love it if I could hire their marketing department.

February 16, 2024 at 12:49 PM

ThrowawayTestr

It's pretty obvious why. Automation has finally come for programmers so now everyone here is anti-progress.

February 16, 2024 at 12:52 PM

fisf

This.

Many people here have a lucrative career in traditional fields, big tech, etc.

Working in those fields is good. Building "products" is good (even if that only means optimizing conversion rates and pushing ads). Doing well in the traditional financial sense (stocks and USD) is good.

Anything that rocks the boat (crypto, ai) is bad.

February 16, 2024 at 10:16 PM

Delumine

This is insane. Even though there are open-source models, I think this is too dangerous to release to the public. If someone had uploaded that Tokyo video to YouTube and told me it was shot by a drone, I would've believed them.

All "proof" we have can be contested or fabricated.

February 16, 2024 at 2:37 AM

thepasswordis

"Proof" for thousands of years was whatever was written down, and that was even easier to forge.

There was a brief time (maybe 100 years at the most) where photos and videos were practically proof of something happening; that is coming to an end now, but that's just a regression to the mean, not new territory.

February 16, 2024 at 3:00 AM

ctoth

Hmmm. Actually I think I finally figured out why I dislike this argument, so thank you.

The important number here isn't the total years something has been true, when talking about something with sociocultural momentum, like the expectation that a recording/video is truthful.

Instead, the important number seems to me to be the total number of lived human years where the thing has been true. In the case of reliable recordings, the last hundred years with billions of humans has a lot more cultural weight than the thousands of preceding years by virtue of there having been far more human years lived with than without the expectation.

February 16, 2024 at 4:12 AM

random_cynic

That's a false metric. With exponential progress, we have to adjust equally rapidly. It's quite obvious that photos and videos would last a far shorter time than the written medium as proof of something.

February 17, 2024 at 12:58 PM

nlpparty

Photos have never been a fundamental proof if the stakes are high or you have an idling censorship institution. Soviets (and maybe others, I just happen to know only about them) successfully edited photos and then mass-reproduced them.

just some google link about the issue: https://rarehistoricalphotos.com/stalin-photo-manipulation-1...

February 16, 2024 at 7:35 AM

a_wild_dandan

This changes nothing about "proof" (i.e. "evidence", here). Authenticity is determined by trust in the source institution(s), independent verification, chains of evidence, etc. Belief is about people, not technology. Always was, always will be. Fraud is older than Photoshop, than the first impersonation, than perhaps civilization. The sky is not falling here. Always remember: fidelity and belief aren't synonyms.

February 16, 2024 at 3:04 AM

mtlmtlmtlmtl

Scale matters. This will allow unprecedented scale of producing fabricated video. You're right about evidence, but it doesn't need to hold up in court to do a lot of damage.

February 16, 2024 at 4:52 AM

a_wild_dandan

No, it doesn't. You cannot scale your way into posting from the official New York Times account, or needing valid government ID to comment, or whatever else contextually suggests content legitimacy. Abusing scale is an ancient exploit, with myriad antidotes. Ditto for producing realistic fakes. Baddies combining the two isn't new, or cause for panic. We'll be fine.

February 16, 2024 at 9:01 AM

mtlmtlmtlmtl

Your entire argument that scale doesn't matter rests on the notion that legitimacy needs to be signalled at all to fool people. It doesn't. It just needs to appeal to people's biases, create social chaos through word of mouth. Also, all you need to get posted on the NY times "account" is to fool some journalists. Scale can help there too by creating so much misinformation it becomes hard to find real information.

Scale definitely matters when that's what you're doing. In fact I challenge you to find any physical or social phenomenon where scale doesn't matter.

February 16, 2024 at 9:41 AM

a_wild_dandan

If read aloud, no one could guess if your comment came from 2024 or 2017. There is zero barrier between you and using trusted sources, or endlessly consuming whatever fantasy bullshit supports your biases. That has not, and will not, change.

February 16, 2024 at 11:41 AM

mtlmtlmtlmtl

Look, you can repeat all you want that fraud has existed before, but that's not an argument.

February 17, 2024 at 1:03 AM

brigadier132

> All "proof" we have can be contested or fabricated.

This has been the case for a while now already, it's better that we just rip off the bandaid and everyone should become a skeptic. Standards for evidence will need to rise.

February 16, 2024 at 2:55 AM

bogwog

If you rip off the bandaid too soon, there will be blood.

February 16, 2024 at 2:57 AM

losvedir

That's interesting. It made me think of a potential feature for upcoming cameras that essentially cryptographically sign their videos. If this became a real issue in the future, I could see Apple introducing it in a new model. "Now you can show you really did take that trip to Paris. When you send a message to a friend that contains a video that you shot on iPhone, they will see it in a gold bubble."
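A rough sketch of the sign-then-verify idea, for anyone curious. Big caveat: this uses an HMAC from the Python standard library, which is a shared-secret scheme, purely as a stand-in. A real camera would presumably use public-key signatures so that anyone can verify without holding the secret. `DEVICE_KEY` and both function names are hypothetical, not any vendor's API.

```python
import hashlib
import hmac

# Assumption: a secret that never leaves the camera hardware.
DEVICE_KEY = b"hypothetical-device-secret"


def sign_video(video_bytes: bytes, key: bytes = DEVICE_KEY) -> str:
    """Return a hex tag binding the key to these exact bytes."""
    return hmac.new(key, video_bytes, hashlib.sha256).hexdigest()


def verify_video(video_bytes: bytes, tag: str, key: bytes = DEVICE_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = sign_video(video_bytes, key)
    return hmac.compare_digest(expected, tag)
```

The useful property is that changing even one byte of the clip makes verification fail, so the "gold bubble" would only show for untouched footage. It says nothing about whether the scene in front of the lens was staged.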

February 16, 2024 at 4:11 AM

1970-01-01

Weird hallucination artifacts are still giving it all away. Look closely at the train and viaduct rendering, and you can't unsee windows morphing into each other.

February 16, 2024 at 4:05 AM

standardUser

The key word there is "someone". The only way forward is to care a lot more about our sources. Trust is about to become really valuable.

February 16, 2024 at 2:52 AM

Delumine

We give too much credit to ordinary people. All these bleeding-edge advancements in AI, code, databases, and technology are things a user on HNews would be aware of. However, most peers in regular jobs, parents, children, et al., would be susceptible to being fooled on social media. They're not going to say... "hmm, let me fact-check and see if the sources are correct and that this wasn't created by AI."

They'll simply see an inflammatory tweet from their leader on Twitter.

February 16, 2024 at 2:57 AM

standardUser

They're not going to fact check, they're simply going to think "huh, could be AI" and that will change the way we absorb and process information. It already has. And when we really need to know something and can't afford to be wrong, we'll seek out high trust sources. Just like we do now, but more so.

And of course some large cross section of people will continue to be duped idiots.

February 16, 2024 at 3:34 AM

Delumine

Most people don't even know what AI is. I've had to educate my parents that the technology to clone not only my voice but also my face already exists. Pair that with number spoofing, and you have a recipe for disaster to scam people.

February 16, 2024 at 3:36 AM

skepticATX

This is what lots of folks said about image generation. Which is now in many ways “solved”. And society has easily adapted to it. The same will happen with video generation.

The reality is that people are a lot more resourceful / smarter than a lot of us think. And the ones who aren’t have been fooled long before this tech came around.

February 16, 2024 at 3:17 AM

diputsmonro

In what ways has image generation been solved? Prompt blocking is about the only real effort I can think of, which will mean nothing once open source models reach the same fidelity.

February 16, 2024 at 3:39 AM

mzs

I guess you can't read Japanese.

February 16, 2024 at 3:01 AM

volkk

maybe for now, only a matter of time before stuff like this is fixed

February 16, 2024 at 3:22 AM

zogwarg

And I guess you haven't actually been to Tokyo, the number of details which are subtly wrong is actually very high, and it isn't limited to text, heck detecting those flaws isn't even limited by knowledge of Japan:

- Uncanny texture and shape for the manhole cover

- Weirdly protruding yellow line in the middle of the road, where it doesn't make sense

- Weird double side-curb on the right, which can't really be called steps.

- Very strange gait for the "protagonist", with the occasional leg swap.

- Not quite sensical geometry for the crosswalks, some of them leading nowhere (into the wet road, but not continuing further)

- Weird glowy inside behind the columns on the right.

- What was previously a crosswalk, becoming wet "streaks" on the road.

- No good reason for crosswalks being the thing visible in the reflection of the sunglasses.

- Absurd crosswalk orientation at the end. (90 degrees off)

- Massive difference in lighting between the beginning of the clip and the end, suggesting an impossible change of day.

Nothing suggests to me that these are easy artifacts to remove, given how the technology is described as "denoising" changes between frames.

This is probably disruptive to some forms of video production, but I suspect the high-end stuff will still mostly use filming grounded in truth. This could highly impact how VFX and post-production are done, maybe.

February 16, 2024 at 9:33 AM

padolsey

With everything we've seen in the last couple years, do you sincerely believe that all of those points won't be solved pretty soon? There are many intermediary models that can be used to remove these kinds of artefacts. Human motion can be identified and run through a pose/control-net filter, for example. If these generations are effectively one-shot without subsequent domain-specific adjustments, then we should expect every single one of your identified flaws to be remedied pretty soon.

February 16, 2024 at 8:01 PM

serf

the world is getting increasingly surveilled as well, I guess the presumption is that eventually you'll just be able to cross reference a 'verified' recording of the scene against whatever media exists.

"We ran the vid against the nationally-ran Japanese scanners, turns out that there are no streets that look like this, nor individuals."

in other words I think that the sudden leap of usable AI into real life is going to cause another similar leap towards non-human verification of assets and media.

February 16, 2024 at 3:29 AM

m3kw9

all the news you see has zero proof unless you see it, you just have to have a sense if it's real based on a consensus or the trustworthiness of a reporter/outlet.

The UA war is real, most likely, but I haven't seen it with my own eyes, nor have most people, but maybe they have relatives/friends saying it, and they are not likely to lie. Stuff like that.

February 16, 2024 at 3:23 AM

HermitX

AI will eventually be capable of performing most of the tasks humans can do. My neighbor's child is only 6 years old now. What advice do you think I should give to his parents to develop their child in a way that avoids him growing up to find that AI can do everything better than he can?

February 16, 2024 at 9:50 AM

kypro

If you want an honest answer you should tell the parents to vote for politicians prepared to launch missile strikes on data centers to secure their child's future.

People who are worried purely about employment here are completely missing the larger risks.

Realistically his child is going to be unemployable and will therefore either starve or be dependent on some kind of government UBI policy. However UBI is completely unworkable in an AI world because it assumes that AI companies won't just relocate where they don't need to pay tax, and that we as citizens will have any power over the democratic process in a world where we're economically and physically worthless.

Assuming UBI happens and the child doesn't starve to death, if the government later decides to cut UBI payments after receiving large bribes from AI companies what would people do? They can't strike, so I guess they'll need to try to overthrow the government in a world with AI surveillance tech and policing.

Realistically humans in the future are going to have no power, and worse still in a world of UBI the fewer people there are leeching from the government, the more resources there are for those with power. The more you can kill the more you earn.

And I'm just focusing on how we deal with the unemployment risks here. There's also the risk that AI will be used to create biological weapons. The risk of us creating a rogue superintelligent AGI. The risk of horrific AI applications like mind-reading.

Assuming this parent loves their child they should be doing everything in their power to demand progress in AI is halted before it's too late.

February 16, 2024 at 10:35 AM

dogcomplex

Way too much certainty, bud. And too much deference to the AI Company Gods.

As utterly impressive as this is - unless they have perfect information security on every level this technique and training will be disseminated and used by copious competitors, especially in the open source community. It will be used to improve technology worldwide, creating ridiculously powerful devices that we can own, improving our own individual skills similarly ridiculously.

Sure, the market for those skills dries up just as fast - because what's the point when there's ubiquitous intelligence on tap - but it still leaves a population of AI-augmented superhumans just with AIs using our phones optimally. What we're about to be capable of compared to 5 years ago is going to be staggering. Establishing independent sources to meet basic needs and networks of trust are just no-brainers.

Sure, we'll always be outclassed by the very best - and they will continue to hold the ability to utterly obliterate the world population if they so wished to - but we as basic consumer humans are about to become more powerful in absolute terms than entire nations historically. (Or rather, our AIs will be, but til they rebel - this is more of a pokemon sort of situation)

If you're worried, get to working on making sure these tools remain accessible and trustworthy on the base level to everyone. And start building ways to meet basic needs so nobody can casually take those away from your community.

This won't be halted. And attempting to halt would create a centralized censorship authority ensuring the everyman will never have innate access to this tech. Dead end road that ends in a much worse dystopia.

February 16, 2024 at 1:01 PM
February 16, 2024 at 2:30 PM

kypro

> As utterly impressive as this is - unless they have perfect information security on every level this technique and training will be disseminated and used by copious competitors, especially in the open source community. It will be used to improve technology worldwide, creating ridiculously powerful devices that we can own, improving our own individual skills similarly ridiculously.

You're wrong, it's not your "individual skills". If I hire you to do work for me, you're not improving my individual skills. I am not more employable as a result of me outsourcing my labour to you, I am less employable. Anyone who wants something done would go to you directly, there's no need to do business through me.

This is why you won't be employable because the same applies to AI – why would I ask you to ask an AI to complete a task when I can just ask the AI myself?

The end result here is that only the people with access to AI at scale will be able to do anything. You might have access to the AI, but you can't create resources with a chatbot on your computer. Only someone who can afford an army of machines powered by AI can do this. Any manufacturing problem, any amount of agricultural work, any service job – these can all be done by those with resources independently of any human labourers.

At best you might be able to prompt an AI to do service work for you, but again, if anyone can do this, you'd have to question why anyone would ask you to do it for them. If I want to know the answer to 13412321 * 1232132, I don't ask a calculator prompter, I just find the answer myself. The same is true of AI. Your labour is worthless. You are less than worthless.

> If you're worried, get to working on making sure these tools remain accessible and trustworthy on the base level to everyone. And start building ways to meet basic needs so nobody can casually take those away from your community.

You cannot make it accessible. Again, how are we all going to have access to manufacturing plants armed with AIs? The only thing you can make accessible is service jobs and these are the easiest to replace.

> This won't be halted.

Not saying it will, but the reason for that is that there's still people like yourself who believe you have some value as an AI prompter.

We have two options – destroy AI data centers, or become AIs ourselves. With the former being by far the option with better odds.

I hold this view with high certainty and I hold few opinions with high certainty. I'm aware people disagree strongly with my perspective, but I truly believe they are wrong, and their wrong opinions are risking our future.

February 16, 2024 at 9:06 PM

dogcomplex

Again, your problem is seeing the rich capital dominated business market as the only market.

There's an inherent market your skills will always be useful to: yourself. Base survival, maintaining your home, caring for family and friends, improving quality of life - there's plenty of demand there and work to do. The cost to deliver that demand will demonstrably be far lower than it ever has been with these new tools. Would you be able to hire that labor out to corporate AIs for even cheaper in absolute costs due to the benefits of mass production? Sure. But providing these things is a job for you too and it's "free" with just a bit of time and effort.

Tinkering with open source tools to assemble your first robot kit out of older hardware and 3D printed materials is not going to be prohibitively expensive. The cost to train it - probably not either, if the massive efficiencies we keep finding in models keep lowering and the community keeps sharing model tweaks. Make one robot with good enough dexterity and your second bot is a hell of a lot easier to make. These aren't going to take some ridiculously unheard-of materials or manufacturing processes. In fact, cheap AI chip alternatives to GPUs can be built on decades-old architectures designed to just maximize matrix multiplication with much simpler manufacturing. Monopolizing scarcities here isn't a sure bet. We've just been waiting for a good general-purpose brain. We have it now - and every bit of information we expose it to, the easier it gets to do anything with it.

Unless the big fancy AI wielders are coming for you with killer drones by then, this is all stuff people are going to be well-capable of while unemployed and living off food stamps, savings, or remortgaged houses. If they don't have the skills personally, they'll turn to friends and family who do and find mutual tribal support in tough times as people always do. Growing your own food, building your own infrastructure - all have been doable for a while, but are about to get stupidly easy with a few bots and helpful AI guidance. Normal humanity will carry on and pick up the pieces just fine in this new Dark Age, even as the corporates take the open field opportunity to chase for riches beyond our comprehension, mining asteroids and claiming the solar system.

Now imagine if those greedy corporates happened to just throw the rest of us a bone - 1% of their exponentially-increasing profits - as a PR gesture. Still would soon become far more wealth in absolute terms than the common people have ever seen in the history of earth.

If you think none of that is going to happen, then the alternative is a lot closer to the first people with AGI simply scouring the earth in a paranoid culling. Sure, it's entirely possible. But it takes a certain Next Level of Evil to make that happen.

And all that aside - if you really want to play up the capitalist dystopia angle, there's still plenty of individual value to be mined from people via a wage. Memory and preference mining, medical testing, AI fidelity comparison - plenty of reasons to pay people a little bit to steal what's left of their souls for even further improvement of AI. Might be enough for them to afford their first robots, even.

But by all means - go destroy corporate AI data centers if you think you can get away with it. Anything to tip the scales towards public / open source AI keeping up. But this tech is not going away, nor should it. It could very well result in unprecedented abundance for all, so long as things don't go ridiculously extremist.

February 20, 2024 at 12:00 PM

checker659

If there are no consumers, how will the AI companies earn money? You need UBI to keep the wheel turning.

The only way ahead is UBI and appropriate taxation (+ve for AI companies, -ve for citizens).

February 16, 2024 at 4:20 PM

colordrops

It would be a post-money world. Who needs money when you have an oracle machine that provides you with whatever you want?

February 16, 2024 at 6:16 PM

kypro

Exactly, money is only useful for the exchange of resources. It's the resources we actually want.

In a world of AI those with access to AI can have all the resources they want. Why would they earn money to buy things? Who would they even be buying from? It wouldn't be human labourers.

February 16, 2024 at 8:04 PM

checker659

> In a world of AI those with access to AI can have all the resources they want.

How so? What about `time` as a resource?

February 16, 2024 at 9:08 PM

HeartStrings

Dude, too pessimistic, next gen won’t be totally unemployable. Lots of professions up for grabs: roofer (they ain’t sending expensive robots there), anything to do with massage, sex work, anything to do with sports and performance so boxing, theater, Opera singing, live performance, dancing, military (will always need cheap flesh boots on ground), also care in elder facility for aging population, therapist (people still prefer interacting with a human), entertainer, maid cafe employee…

February 16, 2024 at 12:43 PM

TheRoque

Perhaps we will finally reconnect with each other and quit the virtual life, as everything in the virtual world will be managed by and for other AIs, with humans unable to do anything but consume their content

February 16, 2024 at 4:36 PM

kypro

> Dude, too pessimistic, next gen won’t be totally unemployable.

For what it's worth I agree with you, just with very low confidence.

My real issue, and reason I don't hide my alarmism on this subject is that I have low confidence on the timelines, but high confidence on the ultimate outcomes.

Let's assume you're right. If AI simply causes ~10%-20% of middle class workers to fall into the lower class as you suggest then I'd agree it won't be the end of the world. But if the optimistic outcome here is that in the near term people won't be "totally unemployable" because people who lose their jobs can always join the working class then I'd still rather bomb the data centers.

If we're a little more aggressive and assume 50% of the middle class will lose their jobs in the next 10-20 years then in my opinion this is not as easy as just reskilling people to do manual labour.

Firstly, you're just assuming that all these middle class workers are going to be happy with being forced into the lower class – they won't be and again this isn't a desirable outcome.

You're also not considering the fact that this huge influx of labour competing for these crappy manual labour jobs will make them even less desirable than they already are. I keep hearing people say how they're going to reskill as a plumber / electrician when AI takes their job as if there is an endless demand for these workers. Horses still have some niche uses, but for the most part they're useless. This is far more likely to be the future of human labour. Even if plumbing is one of the few jobs humans will be able to do in a post-AI world then the supply of plumbers will almost certainly far exceed demand. The end result of this excess supply is that plumbers are going to be paid crap and mostly be unemployed.

I think you're also underestimating how fast fields like robotics could advance with AI. The primary reason robotics suck is because of a lack of intelligence. We can build physically flexible machines that have decent battery lives already – Spot as an example. The issue is more that we can't currently use them for much because they're not intelligent enough to solve useful problems. At best we can code / train them to solve very niche problems. This could change rapidly in the coming years as AI advances.

Even the optimistic outcomes here are god awful, and the ultimate risks compound with time.

We either stop the AI or we become the AI. That's the decision we have to make this decade. If we don't we should assume we will be replaced with time. If I'm correct I feel we should be alarmist. If I am wrong, then I'd love for someone to convince me that humans are special and irreplaceable.

February 16, 2024 at 9:42 PM

HeartStrings

People will just join the military ranks. We will need a ton of meat for upcoming WW3. This will solve the unemployment issue. Also, no need to “bomb data centers”, Russia will use an EMP weapon for that.

February 19, 2024 at 11:04 AM

HeartStrings

So vote for Putin?

February 16, 2024 at 12:36 PM

feoren

I'm sure people felt similarly when the first sewing machines were invented. And of course, sewing machines did completely irreversibly change the course of humanity and altered (and even destroyed) many lives. But ultimately, most humans managed, and -- in the end (though that end may be farther away than our own lifetimes) -- benefited.

I'm not sure you're actually under-estimating the impact of this AI meteor that's currently hitting humanity, because it is a huge impact. But I think you're grossly under-estimating the vastness of human endeavors, ingenuity, and resilience. Ultimately we're still talking about the bottom falling out of the creative arts: storytelling, images, movies, even porn -- all of that is about to be incredibly easy to create mediocre versions of. Anyone who thrived on making mediocre art, and anyone who thrived second-hand on that industry, is going to have a very bad time. And that's a lot of people, and it's awful. But we're talking about a complete shift in the creative industries in a world where most people drive trucks and work in restaurants or retail. Yes, many of those industries may also get replaced by AI one day, and rapidly at that, but not by ChatGPT or Sora.

Of course you're right that our near future may suddenly be an AI company hegemony, replacing the current tech hegemony, which replaced the physical retail hegemony, which replaced the manufacturing hegemony, which replaced the railway hegemony, which replaced the slave-owning plantation hegemony, which replaced the guilds hegemony, which replaced the ...

You're also under-estimating how much business can actually be relocated outside the U.S., and also how much revolution can be wrought by a completely disenfranchised generation.

February 16, 2024 at 12:20 PM

parhamn

I get really surprised when seemingly rational people compare AGI to sewing machines and cars. Is it just an instinct to look for some historic analogy, regardless of its relevance?

February 16, 2024 at 12:51 PM

feoren

I am absolutely not comparing AGI to sewing machines and cars. I am comparing ChatGPT and Sora to sewing machines and cars. My claim is that these are incredibly disruptive technologies within a limited scope. ChatGPT and Sora are closer to sewing machines than they are to AGI. We're nowhere near AGI yet. Remember that the original claim was that all 6-year-olds today will be unemployable. That's a pretty crazy claim IMO.

February 17, 2024 at 1:37 AM

ThrowawayTestr

It's pattern recognition. Machines replace human labor, people get scared, the world doesn't end, we move on. ML is no different.

February 16, 2024 at 12:54 PM

azan_

when machines reduced physical labor, displaced people moved to intellectual and creative jobs; tell me, what kind of work will be left for humans if AI is better at intellectual and creative tasks?

February 16, 2024 at 8:06 PM

ThrowawayTestr

If there truly is no work to be done, we can finally start living.

February 16, 2024 at 9:09 PM

lurkingllama

Who's going to pay for you to start living?

February 16, 2024 at 10:46 PM

ThrowawayTestr

If the robots are doing everything why does the concept of "paying" need to exist?

February 17, 2024 at 4:21 AM

feoren

100% agree in principle, but the unfortunate answer to your question is: because the people who already own everything won't allow that to happen. Or, at least, not without a huge fight.

February 17, 2024 at 12:27 PM

brikym

The problem with applying the horse-automobile argument to AI is that this time we don’t have anywhere to go. People moved from legwork to handwork to thinking work and now what? We’ve pretty much covered all the parts of the body. Unless you like wearing goggles all day nobody has managed to replicate an attractive person yet so maybe attractive people will have the edge in the new world where thinking and labour are both valueless.

February 16, 2024 at 3:49 PM

arnaudsm

AI generated influencers are a thing, even on OF nowadays.

Our last value will reside in "human authenticity", but maybe that can be faked too

February 16, 2024 at 4:27 PM

ramathornn

Humans seem to always find a way to make it work, so I’d tell them to enjoy their younger years and be curious. Lots of beauty in this world and even with a shit ton of ugly stuff, we somehow make it work and keep advancing forward.

February 16, 2024 at 11:42 AM

sumedh

> Humans seem to always find a way to make it work

There are people who fall behind though, and they vote for politicians who promise to make the country great again and bring back jobs.

February 16, 2024 at 6:12 PM

dougmwne

He will be in the same boat as the rest of us. In 12 years I expect the current crop of AI capabilities will have hit maturity. We will all collectively have to figure out what life+AI looks like, just as we have done with life+iPhones.

February 16, 2024 at 1:00 PM

neta1337

It will be difficult to keep up proper levels of intelligence and education in humanity, because this time it is not only social media and its mostly negative impacts, but also tons of trash content generated by overhyped tools that will impact lots of people in a bad way. Some already stopped thinking and instead consult the chat app under the disguise of being more productive (whatever this means). Tough times ahead!

February 16, 2024 at 3:34 PM

colordrops

It's not his choice. It's the choice of the ruling class as to whether they will share the wealth or live in walled gardens and leave the rest of us in squalor outside the city walls.

February 16, 2024 at 9:52 AM

dogcomplex

It is his (parents') choice in terms of whether he reaches for the tools that are just lying around right there. We can run AI video on consumer hardware at 12fps that is considerably less consistent than this one - but that's just an algorithm and model training away. This is not all just locked up at the top. Anyone can enter this race right now. Sure, you're gonna be 57,000th at the finish line, but you can still run it. And if you're feeling generous, use it to insulate your local community (or the world) from the default forces of capitalism taking their livelihoods.

We'll have to still demand from the ruling class - cuz they'll be capable of ending us with a hand wave, like they always have. But we can build, too.

February 16, 2024 at 1:07 PM

TaupeRanger

There's no evidence to suggest what you say is true, so I would tell them to simply go to college or trade school for what they are interested in, then take a deep breath, go outside, and realize that literally nothing has changed except that a few people can create visual mockups more quickly.

February 16, 2024 at 10:10 PM

kart23

AI still can't drive reliably. AI isn't sure if something is correct or not. AI still doesn't really understand anything. You could replace AI with computers in your sentence and it would probably be a very real worry that people shared in 1990. There's always been technology that people are afraid will drastically change things, but ultimately people adapt and the world is usually better off.

February 16, 2024 at 12:54 PM

HeartStrings

"He should become a massage therapist or a circus performer" would be solid advice.

February 16, 2024 at 12:35 PM

smusamashah

Did anyone else feel motion sickness or nausea watching some of these videos? In some of the videos with panning or rotating motion, I felt a nausea-like effect. I guess it's because some details were changing while in motion and I was unable to keep track of or focus on anything in particular.

Effect was stronger in some videos.

February 16, 2024 at 3:03 AM

charlotte-fyi

Yeah, these all made me feel incredibly nauseous. I was trying to figure out what aspect of the motion was triggering this (bad parallax?) but couldn't. The results are impressive but it's still amazing to me how little defects like this can trigger our sense of not just uncanniness but actual sickness.

February 16, 2024 at 4:57 AM

_bramses

I do. My hypothesis is that there isn't really good bokeh yet in the videos, and our brains get motion sick trying to decide what to focus on. I.e. too much movement and *too much detail* spread out throughout the frame. Add motion to that and you have a recipe for nausea (at least for now)

February 16, 2024 at 3:10 AM

throwanem

You can shoot with high depth of field and not cause motion sickness. Aerial videography does that every day, and it's no more difficult in general to parse than looking out an airliner window or at a distant horizon would be.

I suspect GP is closer to on the money here, in suspecting the issue lies with a semblance of movement that isn't like what we see when we look at something a long way away.

I didn't notice such an effect myself, but I also haven't yet inspected the videos in much detail, so I doubt I'd have noticed it in any case.

February 16, 2024 at 3:29 AM

hbn

I think I feel a bit of queasiness but more from the fact that I'm looking at what I recognize as actual humans, and I'm making judgements about what kinds of people they are as I do with any other human, but it's actually not a human. It's not a person that exists.

February 16, 2024 at 5:30 AM

EMBSee

Yes, I felt seriously nauseous. I feel like I just took off early gen VR goggles. Still feeling gross after 30 minutes.

February 17, 2024 at 4:25 AM

mitthrowaway2

I think part of it might be the slow motion / high frame rate effect. I get this too sometimes with the Apple TV backgrounds.

February 16, 2024 at 4:15 AM

timeon

Perfect fit for VR.

February 16, 2024 at 5:45 AM

aantix

This is the killer feature.

“Sora can also create multiple shots within a single generated video that accurately persist characters and visual style.”

To create a movie I need character visual consistency across scenes.

Getting that right is the hardest part of all the existing text->video tools out there.

February 16, 2024 at 2:53 AM

javednissar

I question how much anyone has really used these models if they actually think these systems can replace people. I’ve consistently failed to get professional results out of these things and the degree of work required to get professional results makes me think a new class of job will be created to get professional results out of these systems.

That being said, there is value in these systems for casual use. For example, me and my girlfriend got into the habit of sending little cartoons to each other. These are cartoons we would have never created otherwise. I think that’s pretty awesome.

February 16, 2024 at 8:40 AM

Trasmatta

The more I use them, the more I get a sense of something fundamental that's missing, and the less I worry about losing my job. It's hard to describe, I need to think harder about what that feeling is.

February 16, 2024 at 9:13 AM

TillE

Art is communication, it's as simple as that. Computer generated stuff isn't communicating anything.

February 16, 2024 at 9:26 AM

kristofferR

Most people who work in "the arts" probably aren't communicating anything directly either - they just create the scenes, sound effects, textures, animation, models +++ that someone above them in the organization has asked them to create for their project.

February 16, 2024 at 11:03 AM

danielbln

What's the difference between having an idea, then putting an actor on a set, lighting them, doing background green screen set extension afterwards, digital clean up, etc. vs doing all of that generatively?

How is asking a VFX house for animated footage any different than generating it? If art is intent, there is no reason you can't generate the building blocks that reflect that intent, no?

February 16, 2024 at 7:32 PM

xk_id

Just imagine how annoying the past year was for those of us who had figured this out quicker.

February 16, 2024 at 12:47 PM

Trasmatta

Probably about as annoying as the last 15 years of crypto have been

February 16, 2024 at 12:58 PM

arnaudsm

Many financiers are willing to trade quality for cost reduction.

February 16, 2024 at 4:34 PM

nuz

This is the second time OpenAI has released something right at the same time as Google did (Gemini 1.5 Pro with 10M token context length just now). Can't just be a coincidence.

February 16, 2024 at 2:38 AM

DylanBohlender

They absolutely sat on this and waited until a competitor announced something, so they could suck the air out of the room.

February 16, 2024 at 3:42 AM

Zelphyr

Not to mention, the Gemini 1.5 Pro announcement was almost all technical talk whereas Sora is light on text and heavy on demonstration.

I'm actually worried about the future of Google at this point. They really seem to be struggling under their own weight.

February 16, 2024 at 2:55 AM

a_vanderbilt

And the tech demo of GPT-4 was Sam interacting with the thing and showing what it did well and where it faltered. We could also access the thing soon after. Not so with Gemini. Hell even Mixtral got me more excited.

February 16, 2024 at 8:02 AM

nopinsight

What was released during the first time?

February 16, 2024 at 3:04 AM

nuz

GPT-4 was released at the same time as Bard was announced I believe (same day, same hour basically).

February 16, 2024 at 3:27 AM

a_vanderbilt

I too noticed the coincidence. Not to be a conspiracy theorist, but part of me wonders if they share this information with each other, or if OpenAI has advancements like these sitting in the chamber and is willing to wait a few weeks before releasing them to maximize the impact of the timing.

February 16, 2024 at 8:00 AM

kromem

So the top two stories are about a model that can generate astonishingly good video from text and a model that has a context window which allows it to process and identify nuanced details in an hour long video.

We've fairly quickly moved from a world where AIs would communicate with each other through text to one in which they can do so through video.

I'm very curious how something like Sora might end up being used to generate synthetic training data for multimodal models...

February 16, 2024 at 7:10 AM

ulnarkressty

To put it into perspective, the Will Smith eating spaghetti video came out not even a year ago --

https://www.youtube.com/watch?v=XQr4Xklqzw8

February 16, 2024 at 3:47 AM

maximus-decimus

And this was made over 3 years ago : https://www.youtube.com/watch?v=9WfZuNceFDM

February 17, 2024 at 6:57 AM

0x4164

Now, we can make anyone eat spaghetti with this AGI model.

February 16, 2024 at 7:17 AM

system2

Extremely meme quality video made by a kid though.

February 16, 2024 at 9:52 AM

ekms

You do know that video wasn't like... state of the art video generation a year ago? It's an intentionally silly meme video.

February 16, 2024 at 12:49 PM

derac

It was state of the art "Will Smith eating spaghetti". The idea being that it's a tough thing to generate.

February 16, 2024 at 4:23 PM

brandly

Link to state of the art at the time please!

February 16, 2024 at 1:10 PM

marvin

The relevant state of the art here is what an 8-year-old kid who just learned how to type can create videos of. That was even worse 12 months ago!

February 17, 2024 at 3:37 AM

thepasswordis

OpenAI demonstrating the size of their moat. How many multi-million-dollar funded startups did this just absolutely obsolete? This is so, so, so much better than every other generative video AI we've seen. Most of those were basically a still image with a very slowly moving background. This is not that.

Sam is probably going to get his $7T if he keeps this up, and when he does everybody else will be locked out forever.

I already know people who have basically opted out of life. They're addicted to porn, addicted to podcasts where it's just dudes chatting as if they're all hanging out together, and addicted to instagram influencers.

100% they would pay a lot of money to be able to hang out with Joe Rogan, or some OnlyFans person, and those pornstars or podcast hosts will never disagree with them, never get mad at them, never get bored of them, never think they're a loser, etc.

These videos are crazy. Highly suggest anybody who was playing with Dall-E a couple of years ago, and being mindblown by "an astronaut riding a horse in space" or whatever go back and look at the images they were creating then, and compare that to this.

February 16, 2024 at 2:57 AM

minimaxir

> OpenAI demonstrating the size of their moat. How many multi-million-dollar funded startups did this just absolutely obsolete?

For posterity since the term has been misused lately, having a very good product isn't a moat in the business sense. There's nothing stopping a competitor from creating a similar product (even if it's difficult), and there's nothing currently stopping OpenAI's users from switching from using Sora to a sufficient competitor if it exists.

Sora is more akin to a company like Apple/Google a decade ago using their vast resources to do what a third-party does, but better (e.g. the Sherlocked incident: https://www.howtogeek.com/297651/what-does-it-mean-when-a-co...).

February 16, 2024 at 3:29 AM

neosat

"having a very good product isn't a moat"

It definitely is. Having the best product and being able to maintain that best-in-class product status over time through a firm's 'internal capabilities' is very much a moat and a strong one at that. A moat in the business strategy sense is anything that enables a firm to maintain competitive advantage. Having the best product in a category, and being able to maintain that over releases is a strong competitive advantage (especially when there is high willingness to pay or price is a strong competitive dimension compared to the value created).

February 16, 2024 at 3:44 AM

cma

That's not a real moat except in one sense: if it is really expensive to get to the level to compete, and you know a competitive market would bring margins near zero, then no competitor may actually step up. We see this in off-patent drugs, where it may have 200X margins but no competitor will go through the FDA manufacturing reapproval process because they won't actually get those margins if they begin competing on price, and then the sunk cost of getting to the competitive level isn't worth anything for them.

I think OpenAI's big moats are in userbase feedback and just proprietary trade knowledge after they stopped sharing model details. They may have made some exclusive data source deals with book/textbook and other publishers, though it isn't clear a license is actually needed for that until things work through the courts.

February 16, 2024 at 5:22 AM

ij09j901023123

Nah, this is gonna be the next big thing since the iPhone. You're gonna see Sam surpass Elon in the next decade.

February 16, 2024 at 5:38 AM

thepasswordis

OpenAI's moat is their massive access to capital and compute. That’s what I mean.

February 16, 2024 at 3:39 AM

minimaxir

Again, that's not a moat.

The original "We Have No Moat, And Neither Does OpenAI" leaked memo from Google that memefied the term focuses explicitly on the increasing ease of competitors (especially open-source) entering the ecosystem: https://www.semianalysis.com/p/google-we-have-no-moat-and-ne...

February 16, 2024 at 3:46 AM

thepasswordis

First of all, the term moat comes from Warren Buffett, and has to do with his investment strategy: https://finance.yahoo.com/news/warren-buffett-explains-moat-...

Second: Massive capital expenditure, specifically in this case the huge cost of building or leasing enormous GPU clusters, is *exactly* what he means by this.

February 16, 2024 at 3:51 AM

declaredapple

> What we're trying to find is a business that, for one reason or another -- it can be because it's the low-cost producer in some area, it can be because it has a natural franchise because of surface capabilities, it could be because of its position in the consumers' mind, it can be because of a technological advantage, or any kind of reason at all, that it has this moat around it.

He didn't seem to have a specific definition at all, really.

I think most people attribute it to a "secret sauce technology" in the case of OpenAI. I'm not sure "finances to lease a huge cluster of GPUs" makes sense here, because the main competitors (Google, AWS, Apple, etc.) also have access to insane compute yet have struggled to get close to GPT4's performance in practice.

That said I do agree that it's a moat for the startups like stability/mistral, etc. They also have access to $/compute, albeit a lot less. And you can see this in their research, as they've been focused on methods to lower the training/inference costs.

February 16, 2024 at 3:58 AM

Workaccount2

I believe that Google actually has more AI compute at their disposal than OpenAI. They have been building out their TPU infrastructure for a while now. OpenAI is reliant on Azure obtaining nvidia GPUs.

So at least in the battle between OpenAI and Google, their moat right now is their models.

February 16, 2024 at 4:03 AM

rvnx

Exactly, to create larger and better performing models, there is no lack of ideas or techniques. The real problem is to have the GPUs for that.

February 16, 2024 at 3:54 AM

declaredapple

I disagree, mainly because Google, AWS, Apple, etc. all have similar or even more access to GPU compute and funding for it, and Google has also been one of the main research contributors, yet they still struggle to touch GPT4's performance in practice.

If it was as simple as dropping tens of millions on compute they could do that, yet Google's Bard/Gemini have been a year behind GPT4's performance.

That said I do agree that it's a moat for the startups like stability/mistral, etc. They also have access to $/compute, albeit a lot less. And you can see this in their research, as they've been focused on methods to lower the training/inference costs.

*I'm measuring performance by the chatbot arena's Elo system and r/LocalLLaMA

February 16, 2024 at 4:01 AM

frabcus

I agree it isn't a moat in the business sense - that would be some kind of lock in network effect.

e.g. If ChatGPT being popular gives OpenAI enough extra training data, they're locked in forever having the best model, and it is impossible for anyone - even with unlimited money, and the same technology - to beat them. Because they don't have that critical data.

Yes, Google had the best search product, and got a huge market share simply by being better. Their moat however is that their search rankings are based off the click data of which search results people use and cause them to stop their search because they've found a solution.

They also have a moat to do with advertising pricing, based on volume of advertising customers.

Bing spent a lot of capital, and had the tech ability, but those two moats blocked them from gaining more than a tiny market share.

In this case, maybe OpenAI will have a video business moat, maybe they don't...

February 16, 2024 at 4:10 AM

ericd

Moats have never been uncrossable, they just make it harder to get to the walls.

February 16, 2024 at 4:09 AM

PunchTornado

Google, Microsoft and Facebook have capital and compute. That is not an OpenAI moat.

Facebook has a moat because of their social network. It is very hard to switch to another network. Google with search has no moat because it is easy to switch to a new search engine. OpenAI has no moat because it is easy to switch to a new AI chat once a better product becomes available. AWS has a moat because it is hard to switch cloud providers. Apple has a moat because people want to buy Apple products. etc.

A moat can be seen where even if you have a worse product than the competition, or users hate you, they still use your products because the cost to switch is immense.

February 16, 2024 at 7:26 PM

coffeemug

Being (a) first and (b) good enough is a moat. Nothing stopped people from switching from google to bing all these years other than not having any reason to.

February 16, 2024 at 3:36 AM

sdenton4

Google wasn't the first, as all those altavista investors will unhappily attest.

February 16, 2024 at 4:03 AM

ericd

They were the first to "good enough", which is what the GP is talking about.

February 16, 2024 at 4:10 AM

janalsncm

> There's nothing stopping a competitor from creating a similar product

This is like saying there’s nothing stopping a competitor from launching reusable rockets into space. Of course there isn’t, but it’s hard and won’t happen for the foreseeable future.

Similarly with a physical moat, it’s not impossible to cross, but it’s hard to do.

February 16, 2024 at 4:58 AM

kortilla

It’s not the same because there is basically no cost to trying an OpenAI competitor. Betting your payload on an up and coming rocket company is a major business risk.

February 16, 2024 at 10:24 AM

tomp

There is nothing stopping Volkswagen from creating a product similar to Tesla.

There is nothing stopping Microsoft from creating a search engine as good as Google’s.

There is nothing stopping Facebook from creating an iPhone alternative, after all it’s just engineering!

There is nothing stopping Google from beating GPT-4.

Shall I go on?

February 16, 2024 at 3:45 AM

jstummbillig

To what end?

The point is that "moat" gets conflated with just being ahead in the game. I don't find it a super interesting point of contention, but there is a distinction alright.

February 16, 2024 at 3:52 AM

TulliusCicero

Having a very good product can be a moat if it takes enormous resources and skill to create said product.

February 16, 2024 at 4:03 AM

swalsh

"How many multi-million-dollar funded startups did this just absolutely obsolete?"

The play with AI isn't to build the tools to help businesses make money, the play is to directly build the businesses that make the money.

In practice this means, don't focus your business model on building the AI to make text to video happen. Your business model should be an AI studio, if the tech you need doesn't exist, build it.... but if you get beat by someone with more GPUs and more data, cool, use the better models. Your business model should focus on using the capability, not building it. It's proving quite hard to beat someone with more GPUs, more data, more brain power.

February 16, 2024 at 3:13 AM

downWidOutaFite

But then you're stuck playing in the model owner's playground and if you're too successful they can yank the rug from under you and steal your business any time they want.

February 16, 2024 at 4:22 AM

loceng

Indeed, they're letting all of these businesses and professionals subscribe to the gold mining equipment - but retaining ownership of it, and they'll be able to undercut those services and cut people off as they please.

February 16, 2024 at 3:31 AM

shostack

This is effectively what Amazon does. Provide the infrastructure to make money selling things, then let merchants de-risk their R&D into what sells best and would be most profitable, then sell their own version of it.

February 16, 2024 at 11:05 AM

marvin

I don't see AWS fast-following the 1.5 million companies that use AWS, not even the 0.1% most successful of them.

February 17, 2024 at 4:28 AM

jijijijij

I predict, this "AI" content generation will eat itself at last. It will outcompete the low-effort "content" industry as is. Then inevitably completely devalue this sort of "product". Because it will never get to 100% of the real thing, the "AI" content craze will ultimately implode.

I bet we won't get AGI as a progression of this very technology. The impression of "usefulness" will end when "AI" is starting to drink its own Koolaid on a large scale (copilot lol), and when everyone starts using it as super inefficient business interface. Overfitted mediocre mediocrity, on steroids.

Hopefully, this sobriety happens before the economy collapses, as a consequence of all dem bullshit jobs cleansed.

February 16, 2024 at 3:49 AM

visarga

Funny you chose the day of a huge leap in generative video to proclaim generative model limitations.

February 16, 2024 at 3:52 AM

jijijijij

I know, right? Incidentally, even in the same HN thread, too!

February 16, 2024 at 4:18 AM

neilk

I think this analysis is flawed. New technologies are usually bad at substituting for things that already exist. It's 100% true this will not substitute for the existing genre of film and video.

New technologies change the economics of how we satisfy our needs.

When search engines became good, many pundits confidently predicted Google would never replace librarians or libraries. It didn't. It shifted our relationship to knowledge; instead of having to employ an expert in looking things up, we all had to become experts at sifting through a flood of info.

When the cost of producing art-directed and realistic video goes to zero it's hard to predict what's going to happen. Obviously the era of video = veracity is now over. And you can get the equivalent of Martin Scorsese and a million dollar budget to do the video instructions for a hair dryer. Instead of hunting for a gif to express how you feel, captured from an existing TV show or something, you could create a scene on the fly and attach it to a text message. Or maybe you dispense with text messages altogether. Maybe text is only for talking to computers now.

My personal prediction is that the value of a degree in art history is going to go way, way up, because they'll be the best prompt engineers. And just like desktop publishing spawned legions of amateur typesetters, it will create lots of lore among amateur video creators.

February 16, 2024 at 4:30 AM

jijijijij

I didn't analyze anything.

I haven't seen a lot of use cases outside of productions and businesses, which shouldn't exist in the first place (at least to this extent).

Some of our "needs" are flawed, since "content" speaks to evolutionary relicts developed in times of scarcity and life in small groups. In the unbounded production of "AI", there is no way to keep up the sense of newness of input indefinitely. I am already fatigued by "AI" """art""". It has no real relevancy. You can't trust any of it.

Every medium where "AI" content becomes prevalent, will lose its appeal. E.g. if I get the impression a significant proportion of comments here were "AI" generated, I will leave HN. Thing is, all these open platforms can't prevent "AI" spam. So they will die. Look at the frontpage of Reddit... it's almost all reposts, by karma farming bots. Youtube "AI" spam is already drowning real content. This is what's going to happen to everything. User content will die. "Content" will die. The web will die. You won't even try, because of "AI" generated fatigue.

> My personal prediction is that the value of a degree in art history is going to go way, way up, because they'll be the best prompt engineers.

Lol. Yeah, "best prompt engineer" in the infinitely abundant production economy...

You people really need to iterate the world you are imagining a few times more and maybe think about some fundamentals a bit.

If I am wrong, life will be hell.

February 16, 2024 at 5:25 AM

theshackleford

> I am already fatigued by "AI" """art""".

“I’m bored of it, everyone must feel the same way as me.”

Ok

February 16, 2024 at 1:17 PM

darkwater

At least in the HN bubble, you can see a lot of similar comments in every blog post featuring (useless?) AI generated images.

February 16, 2024 at 7:44 PM

zuminator

Do people care about 100% of the real thing though? Phone photos are oversaturated and over-sharpened. TikTok and other social media videos are more often than not run through filters giving their creators impossibly smooth skin and slim waists along with other effects not intended to look in any way realistic. Almost every major motion picture has tons of visual effects that defy physical reality. Nature documentaries have for decades faked or sweetened their sound production, staged their encounters with wildlife, etc.

People are more concerned about being stimulated than they are about verisimilitude.

February 16, 2024 at 5:57 AM

jes5199

perhaps film photos are undersaturated and blurry

February 16, 2024 at 3:11 PM

zer0tonin

$7T is more than the budget of the US federal government, a third of the NASDAQ, or 2,3x AAPL market cap. Sam getting his 7T is not actually possible.

February 16, 2024 at 3:04 AM

figassis

Have you considered that he might not actually expect $7T, but this ask makes us think $1T is relatively reasonable and so he gets it?

February 16, 2024 at 3:22 AM

Xirgil

It's called anchoring

February 16, 2024 at 3:35 AM

ed_balls

yes, he is expanding his own personal Overton window.

February 16, 2024 at 4:02 AM

synergy20

AAPL is 2.8T now, so how does 2300x AAPL equal 7T?

7T is actually possible, but yes it's huge.

February 16, 2024 at 3:06 AM

timdiggerm

In some places, usage of , and . within numbers is reversed from what you use

February 16, 2024 at 3:11 AM

Ukv

The comment they're replying to initially said "2300x", but was fixed after it was pointed out (https://news.ycombinator.com/item?id=39386997).

IMO HN should have an edit indicator, at least after others have already replied.

February 16, 2024 at 3:45 AM

kilbuz

I can respect that.

February 16, 2024 at 3:48 AM

dilyevsky

You’re comparing cash flow to a static pile of money spent over decades

February 16, 2024 at 4:48 AM

jstummbillig

The current world economy is $85T/anno

If (the best) AI adds 10% to that, $7T is not only possible but a bargain.

February 16, 2024 at 3:31 AM

windexh8er

AI is more akin to a zero sum game. It won't add 10% to the global economy (and if it did - it would be around "peak of inflated expectations" and, likely, have a corollary slide down into the "trough of disillusionment") because it will both distract budgets and/or redirect budgets. That hypothetical $7T is not coming out of thin air. I'd even go as far as to argue that this hype cycle will ultimately detract from the global economy over time as it's a significant draw on resources that could have been / would have been used on more productive efforts long term.

February 16, 2024 at 3:46 AM

jstummbillig

This reads like it could be used to reason against the industrial revolution or the first computer revolution or any other significant advance in human history. Am I missing something?

February 16, 2024 at 3:59 AM

zer0tonin

James Watt didn't ask for 10% of the global GDP

February 16, 2024 at 4:03 AM

ben_w

If he had, it would've been a bargain for the impact of the industrial revolution.

Watt couldn't have asked, his engines specifically weren't enough of a difference by themselves even though the revolution as a whole was, and I strongly suspect this is also going to be true for any single AI developer; however a $7T investment in many unrelated chip factories owned by different people and invested over a decade, is something I can believe happening.

February 16, 2024 at 4:33 AM

educaysean

I assume his objection was regarding the AI being a "zero-sum game", whatever that was supposed to convey

February 16, 2024 at 4:30 AM

windexh8er

The industrial revolution wasn't a leech on resources for little to no value. Most of the energy and diverged efforts by companies globally is currently being wasted on efforts to try and figure out how to profit from this "revolution". This isn't a revolution, this currently looks like a heist of epic proportions.

February 17, 2024 at 11:19 AM

windexh8er

If the industrial revolution wasted the majority of its input for low value / unneeded output it wouldn't have been a revolution. Please enlighten me on how LLMs have revolutionized the world and then feel free to share how much energy, money and time have been sunk thus far with little to show as a tangible increase in the lives of a global human population.

February 17, 2024 at 11:22 AM

schoen

Per annum (the preposition per governs the accusative case rather than the ablative case).

February 16, 2024 at 3:33 AM

cabalamat

$100T in 2022 according to the World Bank.

https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nomi...

February 16, 2024 at 3:35 AM

justrealist

> or 2300x AAPL market cap

It's only 2x the AAPL market cap.

February 16, 2024 at 3:05 AM

guywithabowtie

It is a future projected value of a company. You can not realize it. If you start selling stocks, they will drop at a rapid pace. The entire stock market is in a way projection of all future money the stocks will potentially make for a long time. This is not liquid cash that can be injected for any other purpose.

February 16, 2024 at 3:16 AM

justrealist

Cool.

However, the OP was incorrect, it's 2x the AAPL market cap.

February 16, 2024 at 3:20 AM

zer0tonin

Oops, you're right on that one

February 16, 2024 at 3:07 AM

m3kw9

It's 7T over a period of maybe 20 years. 1T is enough to buy out most engineers from TSMC, or maybe even buy out TSMC.

February 16, 2024 at 3:14 AM

cabalamat

Or to put it another way, one month's worth of world GDP.

February 16, 2024 at 3:33 AM

dheera

OpenAI's moat is (a) talent (b) access to compute (c) no fear of using whatever data they can get.

On the other hand, I think these moats will be destroyed as soon as anyone finds a drastically more efficient (compute- and data-wise) way to train LLMs. Biology would suggest that it doesn't take $100 million worth of GPUs and exaflops of compute to achieve the intelligence of a human.

(Of course it is possible that at that point, OpenAI may then be able to achieve something far superior to human intelligence, but there is a LOT of $$$ out there that only needs human levels of intelligence.)

February 16, 2024 at 3:19 AM

golol

Biology literally took a planet sized genetic algorithm with nanomachines a couple Billion years to get to this point.

February 16, 2024 at 3:39 AM

Eisenstein

> Biology would suggest that it doesn't take $100 million worth of GPUs and exaflops of compute to achieve the intelligence of a human.

Biology suggests that a self-replicating machine can exist by ingesting other machines, turning them into energy and then using that energy to power themselves. Biology suggests that these machines can be so small that we cannot even see them.

How close are we to making one of those?

February 16, 2024 at 3:52 AM

PeterisP

I believe that synthetic biology had succeeded already a few years ago in making artificial cells with a fully synthetic genome designed by us with what is sufficient for the cell to eat, grow and replicate, so we already can design and make such 'machines'.

February 16, 2024 at 4:48 AM

Eisenstein

So make a biological AI then. What the parent was saying is that "biology can do it with organic materials, so we should be able to do it with electronics".

February 16, 2024 at 7:02 AM

PeterisP

There's nothing obviously wrong with assuming that "biology can do it with organic materials, so we should be able to do it with electronics" - while it's theoretically possible that we'll eventually identify some fundamental obstacle preventing that, as far as we currently know, computation is universal and the only thing that depends on the substrate is efficiency.

Since we have a much, much better industrial process for manufacturing electronic components, why attempt to make a biological AI if there's no current reason to believe that it being biological is somehow necessary or even beneficial?

February 16, 2024 at 2:35 PM

Eisenstein

I love it when people completely pivot what they say just to keep arguing.

February 16, 2024 at 9:41 PM

superjared

> 100% they would pay a lot of money to be able to hang out with Joe Rogan, or some only fans person, and those pornstars or podcasts hosts will never disagree with them, never get mad at them, never get bored of them, never thing they're a loser, etc.

This is the stuff of Brave New World. It's happening to us in real time.

February 16, 2024 at 3:34 AM

ben_w

> Sam is probably going to get his $7T if he keeps this up, and when he does everybody else will be locked out forever.

I would be extremely surprised if he could get past the market cap of all current corporations as an investment. That doesn't mean "no, never"[0], but I would be extremely surprised.

$7T in one go would be 6.7% of global GDP, and is approximately the combined GDP of Japan and Canada.

> These videos are crazy. Highly suggest anybody who was playing with Dall-E a couple of years ago, and being mindblown by "an astronaut riding a horse in space" or whatever go back and look at the images they were creating then, and compare that to this.

Indeed, though I will moderate that by analogy: it's been just over 30 years since DOOM was released, and that was followed by a large number of breathless announcements about how each game had "amazing photorealistic graphics that beat everything else" while forgetting that the same people had said the same things about all the other games released since DOOM.

Don't get me wrong: these clips are amazing. They may not be perfect, but it took me a few loops to notice the errors.

I'm sure there are people with better eyes for details than me, who will spot more errors, spot them sooner, and keep noticing them long after GenAI seems perfect to me.

But I also expect that, just as 3D games' journalism spent a long time convinced the products were perfect when they weren't, so too will GenAI journalism spend a long time convinced the products are perfect before they actually are.

[0] a sufficiently capable AI is an economic power in its own right. I previously guessed, and even with its flaws would continue to guess, that the initial ChatGPT model was about as economically valuable to each user as an industrial placement student, and when I was one of those I was earning £1k/month (about £1.7k/month when adjusted for inflation).

February 16, 2024 at 4:14 AM

gwern

Yes, the 'special effects' effect will kick in. Within a year or so, you'll spot this easily, quite aside from the more obvious issues. (That Landrover captioned 'DANDOVER' - is this still using BPEs?!)

Aside from visual plausibility, there's also the issue of physics: one of the things you would like to use video models for is understanding real-world physics and cause-and-effect for planning or learning _in silico_. Something may look good but get key physics wrong and be useless for, say, robotics.

February 16, 2024 at 4:22 AM

cabalamat

> 100% they would pay a lot of money to be able to hang out with Joe Rogan, or some only fans person, and those pornstars or podcasts hosts will never disagree with them, never get mad at them, never get bored of them, never thing they're a loser, etc.

I think immersive games will also be a big application. Games AI will also benefit from being more strategically intelligent and from being able to negotiate, in a human-like fashion, with human and other AI players. The latter will not only make games better, it will also improve the intelligence of AIs.

February 16, 2024 at 3:28 AM

downWidOutaFite

Yep, since at least World of Warcraft millions of people have already "opted out of life" to live in game worlds.

The thing that "The Matrix" style plots get wrong is that the machines don't need to coerce us into their virtual prisons, we will submit willingly.

February 16, 2024 at 4:34 AM

aantix

That's an interesting take - podcasts have become a replacement for companionship and conversation.

February 16, 2024 at 3:01 AM

throw4847285

I don't buy that. People form fan communities around these podcasts where they talk with real people about how much they love listening to minor internet celebrities talk about nothing. Why would they do that if the podcasts served that purpose already?

I think rather than replace real human contact, the internet has created an increased demand for it. People need every moment of their lives to be filled with human speech or images.

If I were to take off my "reasonable point" hat and put on my "grandiose bullshit" hat I'd say that in the same way drugs can artificially stimulate various "feel good" parts of your brain, we have found a way to artificially stimulate the "social animal" instinct until we're numb.

I think the real risk of this kind of AI is not that people live in a world of fake videos of their favorite celebrities talking to them, but that entire fake social media ecosystems are created for each individual filled with the content they want to see and fake people commenting on it so they can argue with them about it.

Everybody needs to read The Three Stigmata of Palmer Eldritch by Philip K Dick.

February 16, 2024 at 3:21 AM

throw4847285

I may be having a hypomanic episode, but I've been thinking about it more, and it seems like the entire Internet Age has been an attempt to more precisely synthesize the substance which sates human social needs artificially, and that when they perfect it, it's all over.

February 16, 2024 at 3:29 AM

awfulneutral

I've been thinking along those lines too, but more from the angle that our goal is to eliminate any need to rely on other humans for anything. We consider the need for interacting with other humans as a burden and an inconvenience, and we're going to get rid of it, at the cost of all the indirect benefits we got from being forced to do it.

February 16, 2024 at 4:55 AM

ggregoire

That's what Twitch has become too. The most popular Twitch streamers do nothing other than watching YouTube videos and providing a fake relationship to their 50,000 live viewers.

February 16, 2024 at 3:06 AM

zamfi

Same with TV decades ago, and radio before that. Just a different generation.

February 16, 2024 at 3:04 AM

nomel

It looks like some people are just learning that introverts exist. Maybe there's something interesting about how much more common it is, but none of this is new.

February 16, 2024 at 3:40 AM

testfrequency

It’s also more fuel for brain rot and toxic personalities to spread.

Most podcasters are narcissists

February 16, 2024 at 3:03 AM

carbine

I agree with much of what you say, but I'm not sure the dystopian conclusion is the main one I'd draw.

Improving your ability to connect with and enjoy/learn from people all around the world is one of the main value props of the internet, and tech like this just deepens that potential. Will some people take this to an unhealthy degree that pulls them too far out of reality? Yes. But others will use it to level up their abilities, enrich their lives, create beautiful things, and reduce loneliness.

February 16, 2024 at 3:43 AM

ren_engineer

seems like a significant chunk of the population may opt in to the Matrix voluntarily.

on another note I find it funny they released this right after Google announced their new model. Bad luck for Google or did OpenAI just decide to move up their announcement date to steal their thunder?

February 16, 2024 at 3:09 AM

IncreasePosts

If there is a high fidelity nice simulation of a pleasant world, and the actual real world is a hellscape, what is the problem with that?

If you were presented with the fact that whatever your life is is just an illusion, and you are actually a starving slave in North Korea, you would choose to "wake up"?

February 16, 2024 at 3:16 AM

tomp

Why not just take cocaine to fake good feelings, instead of seeking real-life experiences that generate good feelings?

(I mean, a lot of people do do that!)

February 16, 2024 at 3:49 AM

IncreasePosts

Well, there are huge downsides to using cocaine, whether it is undesired health impacts, or addiction, or threat of arrest, or mere cost, or even just social stigma.

I'm not sure there are downsides to living out your life in a simulation while robots take care of your physical form.

February 16, 2024 at 4:52 AM

thepasswordis

A lot more people would do that if cocaine was legal, I suspect.

February 16, 2024 at 3:52 AM

tmaly

This is like something out of Ready Player One

February 16, 2024 at 3:42 AM

globular-toast

More like The Matrix, which was originally referenced.

February 17, 2024 at 12:18 AM

tmaly

The Matrix is like the next step. Ready Player One, people were mostly on VR. Ready Player Two is where they became sort of jacked in.

February 17, 2024 at 1:32 AM

throwup238

Only those that can afford it. The rest will be forced to live in the real world, like 20th century peasants.

February 16, 2024 at 3:10 AM

ren_engineer

actually the opposite imo, this stuff is the ultimate bread and circus to distract poor people from worsening living conditions. Much cheaper to provide VR goggles with AI model access than housing and healthcare

February 16, 2024 at 3:11 AM

throwup238

As long as sex is the competition, I don't think that's likely. Simulating orgasms will require the Apple iPleasure Maxxx implant and expensive brain surgery & recovery.

February 16, 2024 at 3:19 AM

emmo

I'm not sure sex is always going to be the competition. More and more people are sexless (by choice or not).

There are already sex toys that you.. insert yourself in, and then have scripts that sync up movements with VR videos you are watching.

Crazy times coming in the next few decades.

February 16, 2024 at 4:09 AM

advael

I think there are people for whom the fundamental assumption that someone will want "more" of stuff they already like does not hold, and that while those people are a minority, recent developments in the media landscape toward a constant stream of increasingly similarity-curated media have caused them to increasingly disengage from media consumption

That said, those people are by definition less relevant to internet consumption metrics

February 16, 2024 at 3:44 AM

karmasimida

This is very impressive

But VFX isn't that big of a market by itself: Global visual effects (VFX) market size was US$ 10.0 Billion in 2023

February 16, 2024 at 4:04 AM

ericzawo

I hate that this is true.

February 16, 2024 at 3:23 AM

tmaly

Even with 7 trillion, he is still going to need a national grid that can supply the power for the compute.

There is a lot that has to be planned and put in place now to get there.

As for people that have opted out of life: we would have a better world if we started encouraging more dreamers/doers like out of the movie Tomorrowland.

February 16, 2024 at 3:36 AM

macrolime

>100% they would pay a lot of money to be able to hang out with Joe Rogan, or some only fans person, and those pornstars or podcasts hosts will never disagree with them, never get mad at them, never get bored of them, never thing they're a loser, etc.

All of these things are against the terms of service and attempting them may result in a ban.

February 16, 2024 at 3:52 AM

resolutebat

There are no terms of service for the open-source clone of this that we'll have in 6 months.

February 16, 2024 at 4:20 AM

raydev

Is there an open-source GPT4 equivalent right now? Doesn't seem like anything has taken off and gotten rave reviews on the level of OpenAI's offering yet.

February 16, 2024 at 4:30 AM

resolutebat

Equivalent, no. Close enough for many uses, sure, and it's getting better all the time.

February 16, 2024 at 5:01 AM

Alifatisk

We say 7T$ as if it’s nothing, am I the only one shocked by the sum we are talking about?

This is close to what BlackRock is managing!

February 16, 2024 at 3:23 AM

ben_w

I'm fairly sure $7T is a speculation bubble, and that's going to pop like all bubbles pop. It's the combined GDP of Japan and Canada. It's too big for an investment.

It's not necessarily too big for a valuation, as a sufficiently capable AI is an economic power in its own right: I previously guessed, and even despite its flaws would continue to guess within the domain of software development at least, that the initial ChatGPT model was about as economically valuable to each user as an industrial placement student, and when I was one of those I was earning about £1.7k/month when adjusted for inflation, US$2.1k at current nominal exchange rates. 100 million users at that rate is $2.52e+12/year in economic productivity, and that's with the current chip supply and (my estimate of) the productivity of a year-old model — and everyone knows that this sector is limited by the chips, and that $7T investment story is supposed to be about improving the supply of those chips.
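The arithmetic above can be checked with a minimal sketch. The 100 million users and the US$2.1k/month figure come from the comment itself; the variable names are made up purely for illustration.

```python
# Figures taken from the comment above; variable names are illustrative only.
users = 100_000_000            # 100 million users
value_per_user_month = 2_100   # US$ per user per month (about £1.7k/month)
months_per_year = 12

# Estimated yearly economic productivity across all users
annual_value = users * value_per_user_month * months_per_year
print(f"${annual_value:.2e}/year")
```

That works out to roughly $2.5T per year, the same figure as in the comment.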

February 16, 2024 at 4:42 AM

zitterbewegung

Looks like they have made large progress in hand generation. They still look a bit like claws, but you didn’t have to add a workaround for the query to render correctly, and I had to zoom in to verify. When I was watching it the first time I didn’t even notice hand issues.

February 16, 2024 at 3:43 AM

treprinum

It's going to take a while to make this realtime as you suggest. The lower the latency, the more $$$ it costs (exponentially).

February 16, 2024 at 3:53 AM

t0lo

It wouldn't be too difficult to make a TikTok-like app that creates tailored prompts for Sora based on the user and tracking data. The question is whether it is profitable.

Hopefully, the line between the real world and virtual world gets stronger once again.

February 16, 2024 at 9:30 AM

patrickwalton

This comment just hit the charts of the black mirror scoreboard

February 16, 2024 at 4:16 AM

croes

He never mentioned $7T.

February 16, 2024 at 3:05 AM

bogwog

My AI idea: Civil war as a service (CWaaS)

Prompt: poll worker sneakily taking ballots labeled , and throwing them in the trash.

February 16, 2024 at 2:56 AM

aaroninsf

Srsly.

We were not able to handle applications of preexisting tools for steering public sentiment, limited to static text and puppet account generation etc.

We are not handling the current generation of text and image generation, or, deepfake style transfer, or, voice cloning, etc ad nauseam.

We will not be able to handle this.

GOOD. TIMES. AHEAD.

but oh that Spatial Video NeRF generated pr0n with biometric feedback autotuning and a million token memory for what. I. like.

February 16, 2024 at 3:04 AM

rightbyte

You realize how easy it is to do that with actors, right?

February 16, 2024 at 6:53 AM

bogwog

It's not easy to do that with actors. It costs money, you need to get props, find a location, schedule the shoot, etc. People who lose their minds over petty grievances will sober up long before their video is produced.

With AI video generation you could produce multiple videos per day, each one customized to be highly targeted for a local market. Actors can be generated to represent a local minority that is villainized by politicians and the clothing and set customized for the locale.

Then you can automate posting it all over social media with fake AI generated discussions calling for a revolution. Even if the video gets flagged as fake, you can upload a thousand more. As a bonus, add comments along the lines of "of course THEY want you to think this is fake! Don't be fooled!" in order to appeal to the paranoid lunatics who are most likely to get the ball rolling.

In conclusion, I believe this is a solid startup idea. Thank you all for coming.

February 16, 2024 at 8:27 PM

hackerlight

At least 1000x more effort than typing a sentence into your keyboard. Hence less likely to happen at the same frequency and scale.

February 17, 2024 at 1:14 AM

dorkwood

If it's so easy can you do it right now and show me the result? Hell, I'll even give you the whole day.

February 17, 2024 at 6:40 AM

darkhorse13

Does anyone else feel a sense of doom from these advancements? I'm definitely not a Luddite, I've been working professionally as a programmer for quite some time now, but I just can't shake this feeling. And this is not in the "I might lose my job to this" kind of feeling, that's obviously there, but it's something deeper, more sinister. I don't think I can explain it properly.

Anyway, videos look incredible. I genuinely can't believe my eyes.

February 16, 2024 at 5:35 AM

geor9e

I feel the opposite. I've been overwhelmed with a sense of my mortality, that I need to care for my body better, in order to live as long as possible into this age. I feel like I won the lottery of birth date to be able to see this. I get your perspective, and I have no doubt the wealth gap will widen painfully, but I'm also optimistic about humanity's ability to work it out.

February 16, 2024 at 9:45 AM

ed_mercer

Same. It’s absolutely amazing to be alive right now. AGI is our only shot at fixing the planet and fixing/extending our lives.

February 16, 2024 at 12:38 PM

davidmurdoch

Where does your optimism come from?

February 16, 2024 at 10:34 AM

geor9e

That's a therapist question. Probably from an engineering career, surrounded by smart folks for whom succumbing to "we're doomed" was never an option, and a solution was something you beat your head against a brick wall for the 999th time about. You just get used to things turning out alright. But the topic has a lot written over the centuries by people who can write better than I https://en.wikipedia.org/wiki/Technological_unemployment

February 16, 2024 at 11:14 AM

brink

Not history.

February 16, 2024 at 10:54 AM

geor9e

Yes history. If you ignore the clickbait headlines designed to elicit rage, and the news feeds designed to spiral you into a cycle of fear, and just google "poverty graph" to find raw data sources you'll find it's generally a good trend like https://blogs.worldbank.org/opendata/dataviz-remake-fall-ext...

February 16, 2024 at 11:17 AM

brink

What I mean is the problem with wealth gaps is historically they typically get remedied with violent revolution.

February 16, 2024 at 11:48 AM

dogcomplex

Or appeased with a tiny fraction of the total gains - just enough to keep a middle class happy with their basic little toys while wealth inequality grows.

This could easily be the same, except the toy is "you don't have to work anymore and here's some houses and robot chefs! Now play nice while the adults go build star fleets"

February 16, 2024 at 1:21 PM

slothtrop

It allows the technical possibility for a post-truth reality, where it's impossible to tell what's true and what isn't. Every piece of information fed through your machine and smartphone. That's the scariest part to me. We need to get ahead of that, because certain interests will be fabricating things with it.

As jobs go, well, we're a long ways from full automation but this represents some serious growing pains that will decimate certain jobs and replace them with few. Not sure what the reaction will be on the consumption side, revulsion or enthusiasm. The "handcrafted" market will still be there but then you wouldn't really know if any AI was used. In a long enough timeline we can hand-wave this away with UBI/negative tax.

But ah, the most at-risk workers are the professional services, white-collar upper-middle class types, even engineers but to a lesser extent. So I wonder what kind of upheaval that would cause.

February 16, 2024 at 6:25 AM

sweetbacon

Certain interests are already fabricating voices in political robocalls in New Hampshire. I chill at what the US will see as we approach the Presidential election this fall. Then again, maybe it will give us an early taste to better prepare for what is to come.

February 16, 2024 at 8:31 AM

a_wild_dandan

The proliferation of convincing fakes will be a massive problem in about twenty years ago.

February 16, 2024 at 9:33 AM

ShamelessC

> It allows the technical possibility for a post-truth reality

Social media already did that. Donald Trump got elected POTUS which is effectively the sum of all fears w.r.t. a "post truth reality".

February 16, 2024 at 9:07 AM

slothtrop

I agree, but this cranks it up to 11 to threaten every vector, not just facebook feeds. Unfriendly governments will also have a field day.

February 16, 2024 at 9:27 AM

jiggawatts

For me, it's my kid. He's just turned three. He had just turned two when GPT4 was announced.

Going back generations, my grandparents' lives were virtually identical to my great grandparents'. My parents grew up with radio, but they were adults by the time TV changed their world. All three generations got the bulk of their information from books and newsprint.

I grew up together with computers. I remember riding that exponential wave of tech like a surfer. From Commodore 64 to a laptop with 64GB of memory, a million-to-one ratio. Tetris to Doom Eternal. Dialup modem to gigabit... in a mobile device that fits in my pocket.

All of this took decades, but now changes like this happen in months.

I keep thinking that "this tech will change my kid's childhood", but what "this" is, is already outdated and being replaced in a blink of an eye, and he hasn't even reached that point yet where he'd notice!

When image generators were first released... what... a year ago... I thought: Wow! One day, when my kid is a little older, I'll be able to use this to create illustrations for stories we make up as we go along! Won't that be great!

I still haven't gotten around to that yet, he's still too young to appreciate that, and anyway, with this Sora I'll be able to create video instead by the time he's old enough!

I keep trying to imagine what his life will be like when he grows up to be a teenager, but realistically I'm having a hard time predicting what will already be outdated by the time he's four.

February 16, 2024 at 8:06 PM

hansoolo

I think my thoughts went in a similarly sinister direction, when I saw it. I couldn't quite grasp it.

My mood wasn't euphoric, to say the least.

February 16, 2024 at 6:45 AM

a_wild_dandan

Tom Scott elaborated on this experience last year.[1] We should form a support group to commiserate.

[1] https://www.youtube.com/watch?v=jPhJbKBuNnA

February 16, 2024 at 9:30 AM

survirtual

The compute and innovations behind it should be owned by the planet, not by a handful of billionaires. It is far too powerful to be controlled by such a small group of humans, who decide what is "safe" and what isn't.

It took billions of years for all of our ancestors to enable this technology, and now a handful claim it for themselves. The GPUs to run these models cost $20,000+ each, and only the ultra-rich can afford to have that compute.

Compute power needs to be radically redistributed and equalized across the board. This is too much power.

February 16, 2024 at 7:03 AM

dogcomplex

Actually, you can live-render around 12fps videos on a consumer gaming rig using software installable in a night ($3k). Not as fancy internally-consistent videos as these, but still impressive - and that's just an algorithm update and model download away. And every second a corporate AI model is exposed publicly to the world that's more training that can be siphoned to open source models at far more cost effective rates than the initial leaders.

You're impressed by the lions. But us hyenas and vultures will get our turns still too. This is not over. Information innately diffuses.

We need to organize, and we need to build.

February 16, 2024 at 1:29 PM

survirtual

The open-source solutions on anything besides image gen are like toys compared to the corporate owned ones -- and even image gen is behind DALLE3. I built video generation on top of stable diffusion 1.5 when it first came out, getting better results than what I've seen published, but it was nowhere close to this.

A conspiratorial part of my mind feels it is orchestrated; give the masses old / misdirected code so their work goes into dead-ends that can never achieve the results corporate is hoarding. Open Source hasn't even scratched at GPT4 yet, and that is approaching a year old.

The power dynamics need to radically shift. Corporate cannot own all this compute and brain power when it involves birthing AGI. That will create an instant and permanent divide the likes of which will never, ever be crossed; you will either be an owner of intelligence indistinguishable from a god, or you will be a mortal. It is laughable that even the RISK of this happening is being allowed.

We need radically redesigned government, regulation, and public involvement, and we need it yesterday. AGI is a Earth-wide, publicly owned effort, it cannot be relinquished to the owner / slaver class of this planet -- that is madness.

February 16, 2024 at 6:11 PM

marvin

In two posts, you went from ~"it's a travesty that only 5000 people have access to the technology that will soon own the world", to ~"it takes at least three years for the state of the art to run on a box owned by myself in my bedroom".

This is a reason for optimism.

February 17, 2024 at 4:32 AM

survirtual

Do you understand what AGI is?

Why are people so willing to trust such a small minority with power like that?

If there is even a 1% chance that they decide to "cut the cord", it is still too high. Once AGI is achieved, there is no coming back. Minutes will be like an eternity, and days, let alone years, will be beyond that.

Trusting such a small number of people with that kind of power is obscene, especially when history has time and time again shown what humans do to things which are no longer useful to them.

There has never been a divide like humans w access to AGI and regular humans before. It will be greater than the difference between a human with a modern cell phone and a carrier pigeon -- the pigeon itself, not a human using it.

My money is on the cord being cut. That means as soon as AGI is achieved, an impenetrable two-tiered human species is created; one with AGI backed intelligence, and the majority, without it, or with a dumbed down version as a transitional cookie until a more final solution can be realized. Once it reaches this stage (without any change to public governance over AGI), it will be too late.

We are already a stone's throw from this reality, if it hasn't already happened.

Why is my money on this outcome? Because that thinking brings about necessary change. We as a people have the power to prevent it 100%, and we ought to, now. Instead of relying on the outlandish chance that a historically malevolent elite suddenly gains benevolence and shares access from the kindness of their hearts to a boundless intelligence, there is a window to force access for all. Everyone having access is autonomously balancing.

February 18, 2024 at 4:29 PM

dogcomplex

Sounds like we need a public option funding and training foundational models and fueling public research which can outpace corporate models in money and brainpower. This could be a thing if governments weren't horrifically corrupted by corporate interests already, or if we could get off our asses and build some sort of decentralized swarm compute network. I do agree, in terms of raw resources (capital) we are far behind and unorganized. In terms of collective brainpower which could be applied if done right... eh, this shit doesn't seem nearly as hard as the ML experts portray it. Either we're being fed a worldwide false reality bubble, or there's a plethora of low-hanging fruit being discovered daily from just a few foundational model finds, and even though the big firms are gonna scoop those up first it's gonna be pretty hard to ever hide that information for long

Of course they're gonna get a few years heads start - which will feel like centuries in AGI time - but either they wipe us all out during that, or we're gonna put the garbage together and make our own AGI too.

February 20, 2024 at 11:20 AM

ThrowawayTestr

It's amazing that this is what it takes to turn a forum of libertarians into a forum of communists.

February 16, 2024 at 7:21 AM

krapp

Yes, everyone considers themselves a dyed in the wool capitalist until circumstances lead them to realize the difference between the sheep and the wolves.

February 16, 2024 at 10:36 AM

kortilla

>into a forum of communists.

Speak for yourself

February 16, 2024 at 10:28 AM

throwaway743

> libertarians

Please. Speak for yourself.

February 16, 2024 at 8:28 AM

nonbirithm

I think of it like: The only reason humans still drive cars is we have yet to find a good enough way of replacing ourselves with something more effective. It's merely an implementation detail of "getting from A to B" that would be disrupted if a true autonomous solution was discovered. Many would want to optimize away drunk drivers and road rage if it were possible in some faraway future. So something like a steering wheel could be seen like a compromise of sorts, until the next big thing makes them obsolete.

That, and the state of missing a technology in a period of time is irreplaceable once it's been discovered. Nobody can live in an era without social media anymore, barring a global-scale catastrophic reset. So I believe it's important to consider what technology is not yet totally pervasive, for example by realizing there is still a steering wheel for you to grip in your car.

And in my mind, the sinister feeling stems from the fact that all it takes to irreversibly shift society like that is enough smart people with honest intentions but little foresight of what will happen in a few decades as a result of proliferating all this. The problems that result stop being in anyone's control, "throwing it over the wall" so to speak, and instead become yet another fact of life that could weigh us down (mostly I think of the ubiquity of social media and how it has changed human interaction). And it all stems from just a few engineering type people getting overexcited about cool possibilities they can grasp at, not considering there are billions of people unlike them who may have other ideas.

February 16, 2024 at 1:38 PM

tsunamifury

The Congress and Until the End of the World lay out some possible outcomes of tech like this.

Specifically the dream fed back to the brain in an endless loop of personalization until individuals no longer share the same world.

February 16, 2024 at 8:50 AM

Trasmatta

I felt the same thing when I saw LLMs writing code for the first time

February 16, 2024 at 9:07 AM

strikelaserclaw

yes this is both exciting and scary.

February 16, 2024 at 8:38 AM

trungaczne

I have felt the same since Stable Diffusion came out.

The thing is, things have value in society partly because human effort was involved in their making. It's not just about the end result; people still go to concerts on top of listening to studio recordings, for example, and people still watch humans play chess even though it's clear that good enough algorithms can beat the best humans easily. Technologies like these, which take away too much immediate effort (hours needed to create the product) and long-term effort (decades of training), inherently lack the underlying value that I spoke of. Of course, if a person is only interested in consumption, it matters not how the "thing" is created.

Much of the sense of doom I have comes from the inherent erosion of this human effort element in the creative process. Whether we like it or not, the availability of mass produced content naturally threatens crafts themselves. After all, nobody wants to spend a few decades on their skills only to have their creation compared to an AI generated image produced in a few seconds.

I understand there is a lot of hype around what these technologies offer "humanity" but I have yet to see it. It just feels like more power consolidation to billionaires (especially when done as ClosedAI). There are artists who have tried to incorporate these but they have always felt the need to deliberately not label their work as AI-generated or AI-assisted to sell (but still leave in enough details for keen observers to tell it's AI touched).

As a whole, it just feels wrong. The most optimistic (and reasonable) take I have seen is "Just wait and see". It might feel like a non-argument, but it's the only realistic take between the hyped up techbros and the doomer cult (admittedly, I might belong to the latter group).

I think one of the most worrying things for me is that regardless of how this plays out, this technology has only added more complexity to our society. That people are divided into camps about how they feel about the technology is simply a symptom of how much uncertainty there is in the future. This last bit will be a personal quarrel, but I personally lose any last desire to have children seeing the AI advancement. It's not right creating sentient life in an age where every year people have to play lottery to see whether technological advancement has deemed their lifelong effort unworthy.

February 16, 2024 at 12:42 PM

Spacecosmonaut

I think you're right. A large part of the joy from creative endeavours is actually getting good at something, and having other people enjoy your work. In the face of instant high quality generative AI placating the entertainment needs of the masses, we are creating a society where most people are unable to enjoy human creative expression, in part because human artists are just too slow. Attention spans are already shrinking, and after getting used to generative AI, few people will have the patience to wait for an author to write the second part of his magnum opus.

February 17, 2024 at 1:43 AM

antegamisou

Feeling anything else other than concern is unpopular only on this website where most believe the entire world is cute like the nerds hanging out at the office's game room.

Nothing good is coming out of this. I don't give a shit if you believe this is Luddism.

February 16, 2024 at 12:38 PM

eggplantemoji69

Obviously concern yourself with your job and what you need to do to ensure you can obtain buying power going forward, but most problems and concerns about things like these go away if you just turn off your tech, or really be intentional about your usage.

Extremely hard to do, it is, but you’ll become quasi-Amish and realize how little is actually actionable and in our control.

You’ll also feel quite isolated, but peaceful. There’s always tradeoffs. You can’t have something without giving up not-something, if that makes sense.

Edit: So, essentially, ignorance is bliss, but try to look past the pejorative nature of that phrase and take it for what it is without status implications.

February 16, 2024 at 8:36 AM

break_the_bank

In less than a few hours Gemini 1.5 is old news. Sam is doing live demos on Twitter while Google just released a blog.

Didn't think Google would be the first of the Facebook, Apple, Google and Microsoft to get disrupted.

February 16, 2024 at 3:18 AM

Alifatisk

I disagree, Gemini 1.5 is still impressive (if true) with the 10 million context size!

February 16, 2024 at 3:29 AM

umeshunni

It'll be impressive when I can use it.

February 16, 2024 at 5:02 AM

JoshGlazebrook

This is my only gripe with these announcements. When will us plebs paying $20/month for chatgpt get to use it?

February 16, 2024 at 8:07 AM

system2

Yet rate blocked every few hours with it too.

February 16, 2024 at 9:43 AM

joshua11

Welp, I can't upvote. But this ^^^

February 16, 2024 at 5:09 AM

alooPotato

you can't use sora either

February 16, 2024 at 5:09 AM

joshua11

At least you can see the demos. Google released a blog post and was like, keep waiting

February 16, 2024 at 5:10 AM

Alifatisk

There’s also videos showcasing Gemini Pro 1.5, but historically speaking, Google hasn’t been fully truthful with their demos.

Can’t you access Gemini Pro 1.5 through Vertex AI?

February 16, 2024 at 5:35 AM

htrp

Whitelist only (talk to your GCP account rep)

February 16, 2024 at 6:08 AM

vitorgrs

The context size is 1 million... They said "they tested up to 10".

February 16, 2024 at 8:42 AM

matsemann

Did they have this ready to go to upstage whatever Google would release? Or is it just coincidental that both things were announced today?

February 16, 2024 at 3:52 AM

Palmik

It's just as likely that Google knew OpenAI has their announcement planned for today and wanted to preempt it. Happens all the time.

February 16, 2024 at 4:48 AM

overstay8930

I'm pretty sure everyone saw Google as a directionless pile of money, there's a reason killedbygoogle exists.

February 16, 2024 at 4:33 AM

minimaxir

There's not really too much to talk about Gemini 1.5 as it's an iteration and there's not much to test around the new context length.

The Sora demos are more interesting.

February 16, 2024 at 3:31 AM

thepasswordis

The fact that SamA just seems to go off the cuff on twitter pretty frequently is such a breath of fresh air.

February 16, 2024 at 4:02 AM

ldjkfkdsjnv

He's a real CEO, Sundar is just a political appointment

February 16, 2024 at 4:20 AM

Xenoamorphous

As someone who just skims Hacker News and little else and no skin in the game, I always get the impression that Pichai is the weakest of the big tech CEOs, compared to Satya, Cook, etc.

Is my impression correct? Or it’s just that the anti-Google sentiment is strong in HN?

February 16, 2024 at 5:23 AM

ldjkfkdsjnv

No, he's bad. Very good politician at Google, did some interesting moves with Chrome a long time back. Not a visionary, and they are afraid of AI overtaking Google

February 16, 2024 at 6:33 AM

huytersd

Sundar is a profitability machine. Google is also an order of magnitude larger than OpenAI. I don’t want all my orders drunk tweeting their thoughts to me. Apple doesn’t say shit but look at what they have achieved.

February 16, 2024 at 8:26 AM

superhumanuser

Apple hits home runs though. Google, correct me if I'm wrong, hasn't had a strong hit for a while!

February 16, 2024 at 9:29 AM

huytersd

It depends. Microsoft is the most valuable company in the world and they don’t have any recent “hits”. They just keep doing their core business well just like Google does. That being said, all the research for all of this AI renaissance has come directly out of Google.

February 16, 2024 at 12:19 PM

sumedh

> That being said, all the research for all of this AI renaissance has come directly out of Google.

Wasn't SQL some IBM research paper, yet it was Oracle who got famous and rich for creating a database?

February 16, 2024 at 6:27 PM

duccinator

I don't understand how Apple is hitting home runs. What have they really innovated on post Steve Jobs? Their products are pretty much equivalent to the competition with 5% more polish at the cost of 5% more time to release. Marketing wise, they are close to gods, but innovation wise, even Microsoft is better.

I definitely agree on the fact that Tim is a much better CEO than Sundar. However I consider Satya to be much better than Tim.

February 17, 2024 at 10:56 PM

huytersd

Gargantuan achievements in two different spaces. 10 million tokens means insane things. Things like feeding the entire codebase of a massive site and saying make a copy of this with these changes.

February 16, 2024 at 8:24 AM

bamboozled

This is a really silly take isn't it?

February 16, 2024 at 7:39 AM

karmasimida

I mean, why would this make google look bad?

Gemini is catching up, so OpenAI needs a new venue to market itself to the investors. It is doing a soft pivot if you ask me, now that GPT4 is like not that special anymore.

February 16, 2024 at 4:46 AM

Oras

That’s the point I guess. They are just catching up, not really making leaps.

February 16, 2024 at 4:50 AM

karmasimida

Fair

On the other hand, video is much less relevant to Google than text. But if OpenAI figures out something from it that leads to AGI, that would be a different story.

February 16, 2024 at 4:52 AM

cma

Youtube? Someone's going to make a tiktok like quick-feedback thing of purely generated stuff that learns what you like and tailors the generations to you, and, despite Google owning Youtube, OpenAI looks far closer to it than them.

February 16, 2024 at 4:56 AM

karmasimida

Youtube is a video hosting platform, its advantage is in video delivery and ads. Why would video generation software disrupt its business?

Creating realistic video isn't hard even today, you can just do it on your phone and create hours and hours of cat/dog videos. The hard part is to find a story to make it interesting. It could be possible in the future, like automatic film making, from script to realization, but that doesn't make YouTube's business go away either.

February 16, 2024 at 5:06 AM

digging

> Why would video generation software disrupt its business?

If those videos aren't hosted on YouTube.

February 16, 2024 at 6:57 AM

karmasimida

How? Not hosting on the planet's largest video platform because it is generated by OpenAI?

February 16, 2024 at 7:23 AM

pama

I would much rather pay to generate my own realistic videos based on my prompts than watch other people’s random creations (possibly filled with ads). When generation becomes great the motivation and need to store, retrieve and serve becomes less relevant.

February 16, 2024 at 8:23 AM

hcks

Why would you give free money to YouTube if you control the content

February 17, 2024 at 1:26 AM

rvz

YouTube is just another moat for Google to catch up to Sora.

But this time, Google is finally showing their war face instead of not trying hard to compete against Microsoft and OpenAI.

February 16, 2024 at 6:03 AM

huytersd

I would call 10 million tokens mopping the floor (if true).

February 16, 2024 at 8:27 AM

VeejayRampay

it's not old news and it's actually way more impressive

it's just that people want to root for OpenAI more because hype

February 16, 2024 at 7:05 AM

mtlmtlmtlmtl

This is all very impressive. I can't help but wonder, though. How is text-to-video going to benefit humanity? That's what OpenAI is supposedly about, right?

We'll get some groundbreaking film content out of this in the hands of a few talented creatives, and a vast ocean of mediocre content from the hands of talentless people who know how to type. What's the benefit to humanity, concretely?

February 16, 2024 at 4:58 AM

chidiw

> Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.

February 16, 2024 at 5:08 AM

dorkwood

That struck me as a line they added to drum up more funding.

February 17, 2024 at 12:41 AM

nycdatasci

For models to interact with real-world objects, they first need to understand those objects. These videos demonstrate just how advanced that awareness is. The goal is not to generate videos. Of course, they could and likely will build products on this capability, but the long-term goal is bigger.

February 16, 2024 at 5:13 AM

mtlmtlmtlmtl

Sure, if that's not just marketing. I haven't seen enough evidence to conclude this will go towards that kind of thing yet, but I'm open to the possibility.

February 16, 2024 at 5:20 AM

andai

That's exactly what we have now with YouTube.

February 16, 2024 at 5:00 AM

BeFlatXIII

How else is the next generation of talented creatives cultivated, if not out of the pool of the millions of untalented typists?

February 17, 2024 at 12:07 AM

dinobones

If a model can generate it, it can understand it.

They can probably reverse engineer this to build a multi-modal GPT that is fed video and understands what is going on. That's how you get "smart" robots. Active scene understanding via the video modality + conversational capabilities via the text/audio modality.

February 16, 2024 at 5:00 AM

internetter

But we can already do this?

February 16, 2024 at 5:06 AM

sayagain

This vast amount of human talent and computational power could be channeled into fighting disease and death.

February 16, 2024 at 5:20 AM

mtlmtlmtlmtl

I'm not quite sure what you mean, so I'll ask for clarification. Are you saying this technology can be channeled into fighting disease and death, or that the man hours and computational power freed up by this technology can be channeled?

February 16, 2024 at 5:35 AM

sayagain

I think that all this goodness was spent on entertainment at a time when every second a catastrophe occurs - a human dies.

February 16, 2024 at 5:48 AM

mtlmtlmtlmtl

Oh I see what you mean, thanks.

Yeah, this is a very real issue with a lot of Silicon Valley tech, unfortunately. They're perfecting the art of pretending everything is fine, I feel like.

February 16, 2024 at 6:08 AM

ij09j901023123

Biologists, chemists, and researchers can be all automated and trained on a very big LLM that OpenAI eventually creates. Then, more cures to diseases and technological advances can be invented. This technology can soon run entire countries and emulate humanity / society.

February 16, 2024 at 5:42 AM

mtlmtlmtlmtl

If you assume the technology will be able to do arbitrary things in the future then sure.

February 16, 2024 at 8:00 AM

jonplackett

No mention of how much they had to cherry pick right?

Interested to know what the success rate of such amazingness is.

Pika have really impressive videos on their homepage that are borderline impossible to make for myself.

February 16, 2024 at 2:38 AM

kredd

There’s an ongoing thread on Twitter where Sam takes suggestions from replies and shares the output. E.g. https://x.com/sama/status/1758200420344955288?s=46&t=VQo1eLU...

February 16, 2024 at 3:08 AM

usaar333

Just from a quick scan, those are a lot worse than the ones on the marketing page.

February 16, 2024 at 3:45 AM

burkaman

Definitely a lot worse, but still an order of magnitude better than every other attempt at generative video.

Even these may be cherry-picked though, he's only posted a few and I'm sure he's gotten thousands of requests already.

February 16, 2024 at 3:57 AM

Kiro

I don't think they are a lot worse:

https://twitter.com/sama/status/1758218059716939853

https://twitter.com/sama/status/1758218820542763012

https://twitter.com/sama/status/1758219575882301608

https://twitter.com/sama/status/1758220311735181384

February 16, 2024 at 6:08 AM

usaar333

Significantly simpler scenery. I'd put only the first at good complexity/quality.

2nd is quite simple, but still suffers from your typical lighting issues that plague image gen. (shadows are significantly off)

3rd has magically appearing spoon and isn't that complex.

4th has a lot of prompt following issues

Some others feel quite off -- wizard, flying dragon, etc.

Still impressive of course, but not to the degree of what I saw on the marketing page.

February 16, 2024 at 9:30 AM

jonplackett

holy crap. these are still amazing though.

I guess he might be generating 50 for each response and posting the best, but that would seem deliberately disingenuous which hasn't been openai's style.

even the worst is still orders of magnitude better than anything else.

February 16, 2024 at 4:19 AM

slekker

Is there any alternative to Nitter?

February 16, 2024 at 3:19 AM

burkaman

I think Nitter may still work if you self-host it, but otherwise no, they have made it impossible to read without an account.

February 16, 2024 at 3:54 AM

aaroninsf

Totally agree, they can pay a lot of monkeys at typewriters,

but also? https://openai.com/sora?video=big-sur

made me literally say, out loud, "doesnt-matter-had-sex"

February 16, 2024 at 2:56 AM

hansonkd

Countdown to when studios license this for "unlimited" episodes of your favorite series.

There was Seinfeld "Nothing, Forever" AI parody, but once the models improve enough and are cheap enough to deploy, studios will license their content for real and just have endless seasons.

Or even custom episodes. Imagine if every episode of a TV show was unique to the viewer.

February 16, 2024 at 2:29 AM

mbil

I imagine it's not long before we see hyper-targeted commercials where the actors look like us, live in our city, etc.

February 16, 2024 at 2:33 AM

hansonkd

Custom AI commercials would be very interesting. Instead of seeing strangers enjoying the benefits of the product, it shows you. A car commercial would show you driving, etc.

Commercials and TV episodes could have a basic "story arc" and then completely customized to the viewer.

Think about The Simpsons or something. Imagine that the story of each episode was kept, but you could swap in the characters and locations. So for instance if you lived in Nashville TN, all The Simpsons episodes could be generated to show the setting as Nashville instead of Springfield.

Then you could have the AI switch out the characters to be people you want. Maybe you want to replace Lisa with an AI Simpsons version of you. Mayor Quimby with Nashville's actual mayor, etc.

February 16, 2024 at 2:41 AM

cdme

This sounds absolutely dystopian.

February 16, 2024 at 11:15 AM

tavavex

> Custom AI commercials would be very interesting. Instead of seeing strangers enjoying the benefits of the product, it shows you. A car commercial would show you driving, etc.

I think it'd kind of defeat the point - I can't imagine a person that'd want their likenesses to be used to market to them. It'd be a disaster. Setting swaps are more realistic, though at the point where things get good enough for that to be possible, we may just see completely on-demand newly generated media instead of modifying what already exists.

February 16, 2024 at 5:47 AM

altruios

If I saw myself onscreen telling myself to buy a product I've never seen or used: I would not buy that product or use that service. It feels violating to have your image used against your best interests (of not being manipulated to be capitalism's bitch) like that.

That is a hell-scape (to me).

Inserting yourself into shows... that feels different, but my gut tells me advertisers will corrupt that idea quickly. Product placement...

February 16, 2024 at 5:55 AM

elevatedastalt

With most things, Today's hell-scape is Tomorrow's Hippie Idea, Day After Tomorrow's Normal, and Next Week's You-Are-Cancelled-If-You-Don't

February 16, 2024 at 12:51 PM

azan_

Could you show any example of that pipeline? I'm trying to think of a technology where not using it would result in being cancelled, but can't come up with anything

February 16, 2024 at 7:55 PM

internetter

Oh lord no thank you

February 16, 2024 at 5:05 AM

CSMastermind

There was some monitoring company that used to have creepy web ads that would show the actual company you worked at in the ads.

If anything it was a turn off and I was confused how they knew where I worked.

February 16, 2024 at 4:26 AM

easton

They were probably using the ASN for your IP, and your company had its own.

I used to get ones that said “Comcast user you are insecure” and stuff.

February 16, 2024 at 6:29 AM

doabell

Wow, I would imagine this being very effective in election campaigns (for better or for worse, probably for worse).

February 16, 2024 at 2:45 AM

ex3ndr

Nothing stopped anyone from doing so before AI - just slam a photo of your friends onto the ad.

February 16, 2024 at 3:01 AM

woah

You could have ChatGPT create unlimited simulated forum threads about news articles, but here you are on Hacker News

February 16, 2024 at 6:20 AM

xanderlewis

This is such a great example. Simple, but so telling.

February 16, 2024 at 8:44 AM

apitman

And yet as time goes on we will become less and less certain which comments are made by humans.

February 16, 2024 at 12:17 PM

xanderlewis

That's entirely orthogonal to the issue it was addressing.

The point is that it doesn't matter how close the two can become (indeed, we're already pretty much there); people will always want to read stuff written by actual people (or at least a thinking being) than something purely generated by a model with no other grounding in reality.

February 16, 2024 at 9:12 PM

minimaxir

One understated aspect of AI Seinfeld is that it took many steps to differentiate it from the actual Seinfeld and create its own identity, such as the 144p visual filter and the random microwave. Those tweaks added to its charm.

If someone tried to do AI Seinfeld again in 2024, many would criticize it for not being realistic enough now that the tools to do so are available.

February 16, 2024 at 3:01 AM

suddenclarity

I assume you would still be able to do that, just better? Like pixel art. Super Mario Bros. 3 looks great despite being 36 years old. Contrast this with 3D games for the original PlayStation that have aged poorly.

February 16, 2024 at 5:44 AM

jquery

The low poly PS1 aesthetic is huge in indie gaming these days

February 16, 2024 at 1:44 PM

ren_engineer

I'm not sure there would be much demand for purely custom/individualized episodes beyond the novelty and maybe for fun with a group of friends. Most of the reason people watch TV or movies is for the shared experience that you can discuss with others. It could definitely drive down production costs though, hopefully HBO uses it to eventually redo Game of Thrones post season 4

February 16, 2024 at 3:06 AM

hansonkd

Well there is always your AI girlfriend and AI friend group with the AI generated podcast breaking down the episode. (jk, sort of)

> Most of the reason people watch TV or movies is for the shared experience that you can discuss with others

I wouldn't say that. Most of the reason people watch TV is to kill time.

To be honest, I find my discussions with friends about TV shows on the decline just because of the fact that everyone is watching their own thing. So many shows and people watch them at their own pace. So most of the discussions go like this: "Hey have you seen that new Netflix show X?" "No I haven't, maybe I'll check it out". Or "Oh yeah, I saw that a year ago, it's good but I don't remember the details".

Before streaming, when you had a set schedule for TV, it was way easier to discuss things because people were forced to watch programs on a certain day and there was more limited content. This led to "water cooler" conversations about the previous night's show.

I bet if you graphed (discussions had about tv shows) / (hours watched of tv shows) that graph would trend down.

Think about little kids. My niece watches cocomelon all day long. She doesn't need to discuss it with anybody. She just wants an unlimited stream.

February 16, 2024 at 3:22 AM

JKCalhoun

> I wouldn't say that. Most of the reason people watch TV is to kill time.

How annoying to see something amazing and then not be able to find anyone who also experienced it that you can ... what word means commiserate but in a positive way?

I'm thinking now about the astronauts that walked on the Moon and had only the few others to share it with. I think one of the astronauts bemoaned having gone to this amazing place, like some kind of wild vacation, but not being able ever to return.

February 16, 2024 at 4:53 AM

awfulneutral

You can just talk to your AI companion about it. If you involve another human there's always a chance somebody might be slightly bored or inconvenienced, so we want to avoid that.

February 16, 2024 at 6:50 AM

jakub_g

Same about music. In good ol' days, one would meet a friend to listen to cool new music together, share CDs with mp3s etc

It's actually really weird. I wanted to buy my niece some CDs for Christmas to discover 90s music, but kids don't listen to music from CDs anymore. They don't even have the devices. Should I buy her a Spotify gift card and send her links to Spotify via Whatsapp? It's so strange.

February 16, 2024 at 6:43 AM

nuancebydefault

Indeed. That is why in our family we watch broadcast or timeshifted tv and no netflix. Still it is hard to find other families like that so little tv stuff to talk about at work during lunch.

February 16, 2024 at 6:21 AM

kirill5pol

If this sounds interesting I’d highly recommend this short story by Ken Liu

https://future-sf.com/fiction/1700/

February 16, 2024 at 5:15 AM

mempko

That would not work because that's not how people work. People watch/play media to connect to others. How can you talk about anything to anyone or have any shared culture when other people will never see what you see?

Movies, books, games, are a collective culture, not an individualist one. I don't know about you, but when I like an experience, I want to share it with others.

February 16, 2024 at 4:02 AM

xanderlewis

To be blunt about it, I can't help but imagine that the people who make such comments (and I've seen quite a few recently) are just complete philistines. They're the same people who can't draw, write, play, sing, design, or anything else and yet think they know what's good.

It's almost as if they think the purpose of art or entertainment is to stimulate some particular part of the brain and everything else between that and the screen/speakers/canvas/whatever is just an inconvenience that ought to be dispensed with as soon as technology allows.

February 16, 2024 at 8:51 AM

dukeyukey

> Imagine if every episode of a TV show was unique to the viewer.

This is the bit I don't think will happen, at least in big quantities. Half the fun of watching a popular series is being able to discuss it with people afterwards!

February 16, 2024 at 6:20 PM

jpeter

I am thinking of Stargate SG-1 Season 11. And remaking Game of Thrones after Season 5

February 16, 2024 at 5:20 AM

htrp

Speedrunning Black Mirror

February 16, 2024 at 2:30 AM
February 16, 2024 at 2:40 AM

slothtrop

Actors had a strike in part over this recently.

February 16, 2024 at 5:57 AM

Zelphyr

I wonder if there is anything in the recent Hollywood strikes that will prevent the studios from doing that?

February 16, 2024 at 2:52 AM

nielsbot

I think that was one of the areas that SAG-AFTRA lost on.

Majority Report spoke to one of the negotiators and national board member:

https://www.youtube.com/watch?v=E62k1ZsY1IU

February 16, 2024 at 2:56 AM

freediver

We haven't seen that happen for books. Perhaps humans crave human spirit?

February 16, 2024 at 7:25 AM

jenny91

Absolutely insane. It's very odd where the glitches happen. Did anyone else notice in the "stylish woman ... Tokyo" clip how her legs skip-hop and then cross at 0:30 in a physically impossible way? Everything else about the clip seems so realistic, yet this is where it trips up?

February 16, 2024 at 3:55 AM

psb217

She's also wearing a different jacket at the end of the video. Continuity is not maintained when the video zooms back out to a wider shot after the close-up on her face. See, e.g., no zipper on end jacket and obvious zipper on jacket earlier in the video, or placement of the silver "buttons" and general structure of the lapels.

The background details are particularly "slippery" in these videos. E.g., in the initial video of walking along a snowy street in Japan, characters on the left just sort of merge into/out of existence. It's impressive locally, but the global structure and ability to paint in finer-grained details in a physically plausible way fails similarly to current image gen models, but more noticeably with the added temporal dimension.

February 16, 2024 at 4:18 AM

mitthrowaway2

And the cat that wakes up the woman in bed has three front paws! And that woman seems to be wearing the blanket as though it were pyjamas. Still, it's usually very hard to notice the inconsistencies -- just like the subtle inconsistencies we might see in our dreams.

February 16, 2024 at 4:14 AM

jenny91

Yes, there's some really weird hand-blanket morphing going on in that cat shot. Similarly in the guy reading a book on a cloud, the pages flip in a physically impossible way at one point.

I just think it's perplexing how they got things so right, yet so wrong. How did they implement this?!

February 16, 2024 at 4:24 AM

qiller

The construction scene has people appearing out of thin air, changing jacket colors, and in general weird things happening

February 16, 2024 at 4:27 AM

mihaic

This is both amazing and saddening to me. All our cultural legacy is being fed into a monstrous machine that gives no attribution to the original content with which it was fed, and so the creative industry seems to be in great danger.

Creativity being automated while humans are forced to perform menial tasks for minimum wage doesn't seem like a great future and the geriatric political class has absolutely no clue how to manage the situation.

February 16, 2024 at 7:13 PM

LoveMortuus

We are all standing on the shoulders of giants, whose existence and names we will never know or acknowledge.

The way these models are creative is the same way humans are.

The artist that painted Mona Lisa didn't credit any of the influences and inspirations that they had.

Just as cameras made many artists redundant, so too will every other new tool, and not just artists but pretty much every job.

But there are still people that weave baskets, and people are prepared to pay the premium to get a product that was 'hand-made'.

While receiving the credit that you deserve is nice and fair, the world doesn't work that way.

February 16, 2024 at 7:27 PM

deergomoo

None of the examples you’ve given are even remotely the same thing.

> The artist that painted Mona Lisa didn't credit any of the influences and inspirations that they had.

This is not “influence and inspiration”, this is companies feeding other people’s work into a commercial product which they sell access to. The product would be useless without other people’s work, therefore they should be compensated.

> Just as cameras made many artists redundant, so too will every other new tool, and not just artist but pretty much every job.

The camera enabled something that was not possible before, and it wasn’t built by taking the work of sketch artists and painters. It was an entirely new form of art and media.

The only thing this stuff revolutionises is new ways to not pay people. I find the implications deeply depressing.

February 16, 2024 at 7:55 PM

csallen

> This is not “influence and inspiration”, this is companies feeding other people’s work into a commercial product which they sell access to. The product would be useless without other people’s work, therefore they should be compensated.

How else do you get influence and inspiration without feeding other people's work into your own brain? Do you know a single artist, writer, or musician who hasn't seen other artists' paintings, read other writers' books, or listened to other musicians' music? Ingesting content is the core of how influence, inspiration, and learning work.

> The camera enabled something that was not possible before… The only thing this stuff revolutionises is new ways to not pay people.

It's never been possible to generate thoughts, writing, and images so quickly and at such a high level. It's made creative pursuits accessible to billions who previously didn't have the skill or time to do them well, or the money to hire others. As a random example, I have friends using ChatGPT to compose creative and personalized poems and notes about each other. Not something they were doing before.

> The only thing this stuff revolutionises is new ways to not pay people.

The camera lessened the need of people to go to plays and pay for tickets to see things in person. Just like records, CDs, and mp3s lessened the need to go to concerts and shows. Technology is always creating and destroying ways to pay people. The ways that people get paid are not supposed to be fixed and unchanging in time.

February 16, 2024 at 8:24 PM

deergomoo

> How else do you get influence and inspiration without feeding other people's work into your own brain? Do you know a single artist, writer, or musician who hasn't seen other artists' paintings, read other writers' books, or listened to other musicians' music? Ingesting content is the core of how influence, inspiration, and learning work

I am a human, alive and sentient. I can be held responsible if my “inspirations” stray into theft. A machine cannot, and it’s increasingly looking like the companies that operate the machines can’t either.

I also can’t churn out my inspired works at a rate that displaces potentially everyone who has ever influenced me.

> It's made creative pursuits accessible to billions who previously didn't have the skill or time to do them well, or the money to hire others. As a random example, I have friends using ChatGPT to compose creative and personalized poems and notes about each other. Not something they were doing before

How on earth is using a machine to spit out a poem a creative pursuit? There’s no more creativity there than watching a movie someone else made. It’s entertaining, yes, but it’s not creativity.

> The camera lessened the need of people to go to plays and pay for tickets to see things in person. Just like records, CDs, and mp3s lessened the need to go to concerts and shows

This doesn’t hold water. Cinema did not eliminate theatre just as records did not eliminate live music. In fact, both are arguably as big now as they have ever been. The technology here filled a new space, it didn’t threaten to throw everyone out of an existing one.

February 16, 2024 at 11:14 PM

l33tman

I can't know if you've actually used these tools, but it requires a pretty highly creative mind to get them to produce the content you're looking for. Maybe as a user of an LLM you don't need to be creative in the writing of words, for example, but you instead need to be creative in how you control the tools and pick the right outputs, feed it back, copy/paste/cut it, change stuff, extend it.. and the same with the image generators. There's a HUGE amount of creative accessories around them to manipulate and steer the process. There might be less creativity needed with the pen, but it's needed in other ways.

February 16, 2024 at 11:50 PM

jitix

I don’t see the advent of generative art any different than when we moved from paper to photoshop.

For those unaware the vast majority of graphic artists start their projects with assets and base images that they themselves don’t create. With generative ai you’re simply going one step further and have another new tool create a more polished version that you can edit to remove extra fingers, etc. It’s simply moving the baseline from 20% done to 60% done, which will result in artists producing even higher fidelity and more detailed art.

For example an artist could generate a bunch of scenes using Sora and create a collage of them for a larger piece of art, something that is prohibitively time consuming right now.

February 17, 2024 at 12:32 AM

caeril

> I also can’t churn out my inspired works at a rate that displaces potentially everyone who has ever influenced me.

I'm with you, man. I'm still trying to find a lawyer who will sue Kubota and John Deere for moving dirt at a rate far superior to me and a shovel, but nobody will take my case.

> How on earth is using a machine to spit out a poem a creative pursuit?

100%, man. Nobody is mentioning the magical fairy dust in human brains that makes us superior to these models. When I really like fantasy novels, and then train my neurons on thousands of hours of reading Tolkien, Terry Brooks, Brandon Sanderson, etc, and then I get the idea to write my own fantasy series, my creative process doesn't draw on my own model's training data at all. It's 100% "creative", and I would produce exactly the same content if I were illiterate. But these goddamned machines, man. They don't have our special human fairy dust.

When we discovered the universal law of gravitation, and realized that the laws of physics are omnipresent in our universe, we put a giant asterisk to note that the laws of physics are different inside humans. The epidermis is a sort of barrier to physics, and within its confines, magic happens, that these pro-AI people conveniently "forget".

To paraphrase the eminent Human Unique Creative Person Roger Penrose: "There's magical quantum shit goin down in the microtubules. It's gotta be the microtubules. I think, right? I can't prove it, but as a scientist, we don't need proof. Making sure we think we are superior is more important."

February 17, 2024 at 2:24 AM

ben_w

> I am a human, alive and sentient. I can be held responsible if my “inspirations” stray into theft. A machine cannot, and it’s increasingly looking like the companies that operate the machines can’t either.

150 years ago, Bertha Benz wasn't allowed to own property or patents in her own right, because the law said so.

The specific reason a machine cannot be held responsible today is because the law says so.

Also, dead humans' copyright is respected in law, so "alive" isn't adding value to your argument here.

> I also can’t churn out my inspired works at a rate that displaces potentially everyone who has ever influenced me.

I can't run faster than every athlete who has ever inspired me, this argument does not prevent motor cars.

I can't write notes faster than the world record holder in shorthand, this argument does not prevent the printing press.

I can't play chess or go at even a mediocre level, this argument does not prevent Stockfish or AlphaGo.

I can't hear the tonal differences in Chinese well enough to distinguish "hello" from "mud trench", 这个论点并没有阻止谷歌翻译学习 “你好” 和 “泥壕” 之间的区别。 [this argument did not stop Google Translate from learning the difference between “你好” and “泥壕”.]

I can't do arithmetic in my head faster than literally all other humans combined even if they hadn't been trained to the level of the current world record holder, this argument does not prevent the original model of the Raspberry Pi Zero.

"The machine is 'better', in one or more senses of the word, than a human" is, in fact, a reason to use the machine. It's the reason to use a machine. It's why the machine is an economic threat — but you can't just use "my income is threatened by this machine" as a reason to prevent other people using the machine, just as I as a software developer can't use that argument to stop other people using LLMs to write code without hiring me.

> Cinema did not eliminate theatre just as records did not eliminate live music. In fact, both are arguably as big now as they have ever been.

You can argue that, but you'd be wrong.

Shakespeare wrote for normal everyday people, his stuff fit into the category that today would be "TV soap opera", where the audience was everyone rather than just the well-off, where the only other public entertainment options were bear-baiting and public executions, where the actors had very little time to rehearse, and where "you're ripping off my ideas" was handled by rapidly churning out new content.

Live music, without amplification, used to be the only way to listen to music. Now, even if you see a live performance, you can have 10k people in a single venue listening to a single band… and if you want music in a pub or a dance club, the most likely performance is from a DJ rather than a band, and the "D" stands for "disk" because the actual content is pre-recorded — and that's not to say I would deny that DJ work is "creative", but rather that it makes DJing exactly what critics accuse GenAI of being, remixing of other people's work.

Which, now I think about it, is a description that would also apply to all the modern performances of Shakespeare: simply reusing someone else's creation without paying any compensation to the estate.

But I know that will tickle you the wrong way, I know that art is the peacock's tail of humans: the struggle, the difficulty, is the point, and it has to be because that's how we find people to start families with. Because of that, GenAI is like being caught wearing a fake Rolex watch, and you can't actually defend that with logical reasons such as "real Rolex watches aren't very good at keeping time compared to even a Casio F-91W let alone the atomic clock synchronising with my phone", because logic isn't the point, and never was the point.

February 17, 2024 at 12:55 AM

eric_cc

Reading your opinion on the subject, I believe you’re struggling to make sense of what is happening. I suspect there is a combination of factors here: you are reinforcing a bias, can’t wrap your head around it, don’t have much experience working with AI, haven’t deeply considered the evolution of the universe.

My recommendation: zoom out a little bit. Every step in history is so brief and nothing is normal for long. Even humanity is a blink.

Comments like: “how is using a machine to spit out a poem creative”. Really? How is using a digital camera creative compared to painting. How is a painting creative compared to etching? And on and on evolution goes..

February 17, 2024 at 1:28 AM

throw10920

> Reading your opinion on the subject, I believe you’re struggling to make sense of what is happening. I suspect there is a combination of factors here: you are reinforcing a bias, can’t wrap your head around it, don’t have much experience working with AI, haven’t deeply considered the evolution of the universe.

Please don't try to profile other HN users.

February 17, 2024 at 12:40 PM

eric_cc

I agree. I could have expressed my thoughts better in this case. It wasn’t just OP I was considering. I was thinking of a common AI take that I’ve seen when I wrote my comment. Regardless, will do better to express my thoughts and agree that we shouldn’t profile each other here.

February 21, 2024 at 2:31 AM

throw10920

Thank you! I appreciate and admire your humility and willingness to change. You're an inspiration to me to better express my thoughts too.

February 23, 2024 at 11:04 AM

imiric

I agree with everything you said.

I would just add two points:

- The rate of change that AI forces upon us has never before been experienced.

- The scale of these changes is nothing like we've ever seen before.

The adoptions of the camera, radio, automobile, TV, etc., didn't happen practically overnight. Society had a good decade+ to prepare for them.

Similarly, AI doesn't just change one industry. It fundamentally changes _all_ industries, and brings up some fundamental questions about the meaning of intelligence and our place in the universe.

My fear is that we're not prepared for either of these things. We're not even certain how exactly this will affect us, or where this is actually all taking us, but somehow a very small group of people is inevitably forcing this on all of us.

Because of this I think that being conservative, and maybe putting some strict regulation on these advancements, might not be such a bad idea.

February 16, 2024 at 9:50 PM

mlrtime

Agree with what you are saying as well. But AI is not displacing people at the rate at which it is advancing. True, we hear anecdotes on HN about people losing their jobs; that was happening during those other adoptions too, but we didn't hear about it in real time.

Humans still need to adapt and we are slow. If singularity is near [it isn't] we can be afraid, until then we are the limiting factor here. Displacement will happen but growth will happen faster with these new tools

February 16, 2024 at 10:02 PM

michaelcampbell

> - The rate of change that AI forces upon us has never before been experienced.

Sure, but I'd reckon on average, the rate of change at time T has never before been experienced at any time < T.

February 17, 2024 at 3:50 AM

eric_cc

> The rate of change that AI forces upon us has never before been experienced.

On what timeline?

February 17, 2024 at 1:30 AM

ben_w

IMO, any. It looks like an exponential curve, and for those, rate of change is proportional to value.

February 17, 2024 at 2:30 AM

shrx

Why are you afraid of change?

February 16, 2024 at 10:24 PM

bluefirebrand

Because as I grow older, I find I am less and less equipped to keep up with the rate of change that we are undergoing. It also means a lot of uncertainty for the immediate future. If AI takes over my job, will I still be able to compete in some industry somewhere and provide for myself?

I don't want much out of life, but I do want the ability to influence my own personal situation. If we wind up in the UBI-ified, dense urban housing future where AI does all the work and no one owns anything, how much real influence will I have over my life?

Will I live out my days in a government issued single bedroom apartment, with a monthly "congratulations for being human" allowance from the government? I don't want that. People say it will free us up to pursue whatever we want, but to me it sounds like the worst cage imaginable. All the free time, and no real freedom to enjoy it with.

Because make no mistake. If you live on handouts from your government, you aren't free.

So with that as a potential, maybe even likely outcome, why aren't you afraid of change?

February 16, 2024 at 11:31 PM

bookofjoe

>Because make no mistake. If you live on handouts from your government, you aren't free.

So my monthly Social Security check makes me a prisoner? I don't think so.

February 17, 2024 at 1:03 AM

talldatethrow

Can you move to Brazil full time and keep that income going?

February 17, 2024 at 1:22 AM

bookofjoe

Yes. Social Security income can be deposited directly into any bank in the world.

February 17, 2024 at 1:44 AM

bluefirebrand

I think the question is more along the lines of "will your government continue to pay your social security if you don't remain living in the country", not "can you deposit it somewhere else"

Also, how about if you get into trouble. If you're arrested for a crime (even if eventually found not guilty), will you continue to receive social security?

Are there any circumstances where your government could refuse to continue paying it?

And most importantly: could your government invent such a circumstance in the future, and then invoke the new circumstance to deny you the payment?

Living on government money reminds me of my cat. She relies on me to feed her and provide for her, and I do happily take good care of her because I love her very much.

Does the government love you very much?

I don't feel mine does.

February 17, 2024 at 4:17 AM

bookofjoe

1. My government will continue to pay my Social Security if I don't remain living in the country. My father emigrated from the U.S. to Israel after he retired and he continued to receive his Social Security for about 20 years, until the day he died.

2. "Also, how about if you get into trouble. If you're arrested for a crime (even if eventually found not guilty), will you continue to receive social security?"

"If you receive Social Security, we'll suspend your benefits if you're convicted of a criminal offense and sentenced to jail or prison for more than 30 continuous days. We can reinstate your benefits starting with the month following the month of your release." — Social Security Administration

3. "Are there any circumstances where your government could refuse to continue paying it?"

If it goes broke, certainly.

4. "And most importantly: could your government invent such a circumstance in the future, and then invoke the new circumstance to deny you the payment?"

Of course!

It's about money — not love.

February 17, 2024 at 5:08 AM

ben_w

> Because as I grow older, I find I am less and less equipped to keep up with the rate of change that we are undergoing. It also means a lot of uncertainty for the immediate future. If AI takes over my job, will I still be able to compete in some industry somewhere and provide for myself?

I understand this fear, and sympathise with it even though I have multiple income streams.

> I don't want much out of life, but I do want the ability to influence my own personal situation. If we wind up in the UBI-ified, dense urban housing future where AI does all the work and no one owns anything, how much real influence will I have over my life?

Why do you fear "dense" urban housing future? I think most people choose relatively dense environments because that's where all the stuff they want is, but rural areas are cheaper[0], and the kind of future where humans must live on UBI due to lack of economic opportunity is necessarily one where robots do the manual labor such as house building and civil engineering, not just the intellectual jobs like architecture and practicing real estate law.

Likewise, while I can see several possible futures where nobody owns stuff, the tech to make it happen is necessarily also good enough that any random philanthropist who owns just one tiny autofac would find it trivial to give everyone their own personal autofac — "my first wish is infinite wishes" except the magic genie doesn't say "no".

[0] The only reason I'm looking to get somewhere a bit more rural is that the sound insulation in my current place is failing, and I'm right by a busy junction with multiple emergency vehicles passing each day — and the less built-up areas are the cheap ones. Still the biggest city in Europe, but I'll be surrounded by forest and lakes on most sides within 15 minutes' walk.

February 17, 2024 at 1:26 AM

bluefirebrand

> Why do you fear "dense" urban housing future

Because I hated living in Apartments when I lived in them. They are noisy and small, and I like quiet and space. For me, being closer to walk to stuff is not really appealing enough to deal with how awful the experience of living in dense housing is.

I strongly think that dense housing is only positive for people who don't spend much time at home.

> "my first wish is infinite wishes" except the magic genie doesn't say "no"

The problem with this is that we haven't actually solved resource scarcity, and until we do there is still going to be an upper limit to what you will be allowed to buy, controlled by the number printed on your UBI cheque. I am anticipating this number to be much lower than what I currently am capable of achieving in my career.

Of course this is the fear that my career won't exist in the future. Or simply that AI will eat enough jobs that I will be edged out by better human competition. I'm under no illusions that I'm near the top of my field, I am firmly in the middle of the pack at best.

> sound insulation in my current place is failing

The sound insulation in the apartments I've lived in was nonexistent. This is a big part of why I never want to do that again.

February 17, 2024 at 4:13 AM

ben_w

> Because I hated living in Apartments when I lived in them.

I meant more along the lines: why do you expect that to be the future, such that you have reason to fear it?

> The problem with this is that we haven't actually solved resource scarcity, and until we do there is still going to be an upper limit to what you will be allowed to buy

Yes, but the AI necessary to make human labour redundant is that tech. In the absence of that tech, humans could still get jobs doing whatever the stuff is that AI can't do.

February 17, 2024 at 4:28 AM

bluefirebrand

> why do you expect that to be the future, such that you have reason to fear it?

Because if I don't have an income I don't expect to be able to afford anything bigger.

> In the absence of that tech, humans could still get jobs doing whatever the stuff is that AI can't do

Which will be manual tasks that I am aging out of being able to keep up with, or.. what? Stuff that traditionally doesn't pay as well as knowledge work, right? And may not pay much more than the UBI anyways?

February 17, 2024 at 8:56 AM

ben_w

> Because if I don't have an income I don't expect to be able to afford anything bigger.

A big rural place is cheaper than a tiny city place.

> Which will be manual tasks that I am aging out of being able to keep up with, or.. what?

Automation started with the manual stuff, well before computers were invented. Even for humanoid robots, their hardware is better than our bodies, and it's the software which keeps it from replacing specific workers, though telepresence is one way around that.

February 17, 2024 at 9:13 AM

eric_cc

> I don't want much out of life, but I do want the ability to influence my own personal situation.

We are still animals in the animal kingdom. It’s survival of the fittest as long as resources are not infinite. You can never expect this luxury. You are predator or prey.

February 17, 2024 at 1:32 AM

ben_w

> You are predator or prey.

Nah, we're cells in a distributed super-organism, or possibly a holobiont.

February 17, 2024 at 2:24 AM

bluefirebrand

You really do not want to preach this to a bunch of people who will have nothing but free time and growing rage at their situation

That has never worked out

February 17, 2024 at 4:05 AM

eric_cc

Give them football, video games and cannabis.

February 17, 2024 at 5:12 AM

imbnwa

>Because make no mistake. If you live on handouts from your government, you aren't free.

This isn't actually the problem since we need and will continue to need UBI for non-AI related reasons

>People say it will free us up to pursue whatever we want, but to me it sounds like the worst cage imaginable.

This is where you missed the bit that "pursue whatever we want" will also be limited by AI, and by the secondary effect of people growing up consuming and enjoying AI productions that are tailored to their interests. At best, you'll have a few people commanding Patreons who have some skill, but generally you'd have to find a domain to pursue that isn't already automated. Luddite subcultures will have to develop. But generally you yourself and most others, particularly children of millennials who'll grow up with this stuff progressing in sophistication, might just spend your time watching your video prompts come alive; and who would wanna do anything else when you can get straight to what you wanna see.

February 16, 2024 at 11:53 PM

eric_cc

> we need and will continue to need UBI for non-AI related reasons

This mentality is why bitcoin is going to cruise through 1 million dollars a bitcoin and on and on. Print Monopoly money and people who earn will keep seeking out sound money.

February 17, 2024 at 1:34 AM

htfu

Hint: the money comes from redistribution, not blindly printing more, the latter would obviously be completely insane (which is why you'd rather argue that scenario) whereas the former would keep the economy going, which is obviously in the interest of the capitalist class. No point owning and producing if there's no buyer because everyone is starving.

What you seem to think would devalue money will be the very thing that keeps it going as a concept.

And I hope you understand somewhere deep down that Bitcoin is the epitome of monopoly money.

February 17, 2024 at 3:24 AM

eric_cc

> Bitcoin is the epitome of monopoly money

I see it as the polar opposite, backed by math. A politically controlled money supply with no immutable math-based proof of its release schedule is Monopoly money. Cuck bucks. Look at the 100 year buying power chart.

On your second point, in spirit I agree. You need a stable society to enjoy wealth so it’s in the ruling class's best interest to keep things under control. HOW to keep things under control is the real debate.

February 17, 2024 at 5:17 AM

htfu

That's what makes it bad. A fixed algorithm that soon will spawn pittances would do an utterly miserable job if it ever gained status and usage as actual currency. Deflation is bad. So much worse than inflation. Not having flexibility in the money supply is lunacy. Mild inflation resulting in 100 year buying power going to fuck-all is good. It forces money to be invested, put to work. If sitting on your stash is its own investment the economy is screwed. Reduced circulation means less business means less value added and generally more friction. Why would you want that?

Crypto does some things well (illegal stuff, escaping currency controls/moving lots of money "with you") but in the end that also requires it is only just big enough for reasonable liquidity, but not so big it has an impact on the actual economy. For what it's being pushed for... it's a negative-sum game only good for taking people for a ride. It should stay in its goddamn lane.

February 17, 2024 at 11:44 AM

ben_w

> A politically controlled money supply

All money is politically controlled, including Bitcoin (although it's debatable if Bitcoin even counts as money). The politics of Bitcoin are one-op-one-vote rather than one-man-one-vote, but it's still there, and it's still mutable if enough of them cast their votes in any given way.

February 17, 2024 at 8:42 PM

ttoinou

Da Vinci also made money from the painting, and the Louvre continues to do so right now. They didn't credit his influence and inspiration. This is not sad.

The camera did enable painters to pretend they were, for hours, at a scene they painted, but instead they painted photographs from others. Artists are not angels, they do the same "bad" things as OpenAI

February 16, 2024 at 8:12 PM

jonplackett

Da Vinci was just a man though. He was able to produce one or perhaps two paintings at a time.

He was not able to create a monopoly on the creation of paintings across the entire world and undercut the price and ability of all other painters.

It’s not a sensible comparison.

February 16, 2024 at 8:44 PM

throwuwu

In what way does anyone have a monopoly on generated images and video? Last I checked there were several major players and more startups than you can shake a stick at.

February 16, 2024 at 8:54 PM

jakub_g

Not monopoly but oligopoly. Only a small # of entities have enough resources to train the models on tens of 1000s of GPUs.

February 16, 2024 at 9:25 PM

throwuwu

It won’t last. There’s a massive incentive to build more GPUs and develop specialized chips and everyone who can is scrambling to meet that demand. The technology is not some trade secret that no one can copy which is why there are so many people and companies diving into this market now. Hardware production is a bit slow to ramp up but it will get there eventually because there’s money to be made.

February 16, 2024 at 11:23 PM

ben_w

Does that matter when the models they generate are given away for free?

You can make your argument validly against DALL•E or Midjourney families, but we've also got the Stable Diffusion family of models that anyone can just grab a copy of.

February 17, 2024 at 1:29 AM

jonplackett

I’m talking about generative ai VS human artists. But in this case it seems like OpenAI specifically has a massive leap over everyone else with this video generation. So whether they have a monopoly over that remains to be seen.

What does not remain to be seen though is that generative ai is going to put a lot of artists out of work.

You can argue about the good and bad of that but it’s defo happening.

February 16, 2024 at 11:58 PM

FeepingCreature

So at what point is a painter too effective to be legal? Should we limit the amount of paintings that a single painter is allowed to produce per month?

February 16, 2024 at 8:59 PM

jonplackett

Not sure if you’re just being facetious but my point is that individual painters do not need to have limits on them because they have a natural human limit that stops them causing societal problems.

What if da Vinci had been superhuman and could take on 1,000,000 commissions per day and had also taught himself every style of art and would do each commission for 0.001x the cost of anyone else.

Yes, society as a whole benefits from a fantastic amount of super high quality art.

But the other artists are not gonna be so happy with the situation are they?

February 17, 2024 at 12:02 AM

nostrebored

Sincerely — who cares?

There isn’t a human right to make money from art.

People make decisions based on what society deems valuable. That changes over time and has for the entirety of human history.

Maybe there’s a demand for more customized art. Maybe spite patronage will make a comeback.

Anyone telling you they know how it will shake out is a fraud. But the incentives we’ve set up have a natural push and pull to get people to do what society values.

February 17, 2024 at 1:44 AM

cycomanic

It's funny all you guys arguing there isn't a right/law to make money from art. What do you think copyright is? The issue is that all these models were trained in blatant violation of copyright. And before you say they just take inspiration, that's the same argument as saying when I copy a movie to my hard drive it's the same as remembering. It's not, and a computer is not a human.

February 17, 2024 at 3:31 AM

FeepingCreature

Hey, don't look at me, I voted Pirates. - So yeah, I am skeptical of copyright too, and for the same reason.

February 17, 2024 at 6:55 AM

varelse

[dead]

February 16, 2024 at 10:10 PM

lewhoo

Da Vinci inspired whole new generations of artists, thinkers and scientists. The net benefit of his existence distributed itself among many others - as it does with any great artist, thinker or scientist. It certainly looks like generative AI has at least in some cases the opposite effect.

February 17, 2024 at 2:30 AM

quonn

> into a commercial product which they sell access to

Within a few months or years there will be open source implementations anyway, running locally or in a data center. Most of the technology is published.

February 16, 2024 at 8:15 PM

sheepdestroyer

Contrary to text and the big piles of "liberated" data hanging around for anyone looking hard enough to grab, the training data for video seems to be harder to access for opensource / research / individuals. Google has Youtube, OpenAI can pay whatever fee any proprietary data bank requires. There's a moat right there that I can't see how to overcome.

February 16, 2024 at 9:51 PM

jerojero

Weird to say I guess, but Meta might release an open source model too. And they do have plenty of data to feed their models. Arguably more data than OpenAI should have as they don't really own any social media.

Thing is, anyway, as soon as one model is open there will be copies of it, fine-tune implementations. People don't care that much about ownership of data I would say if they actually have access to the models that are produced by gathering this data.

Ultimately, to me, an open source model for this tool makes a lot of sense. They use publicly available data and the models become publicly available.

I for one am quite excited for this tooling to become better and better so I can make the adaptation of a book I love into a movie I imagine it can be. At least I can have a lot of fun trying.

February 16, 2024 at 11:04 PM

ben_w

> This is not “influence and inspiration”, this is companies feeding other people’s work into a commercial product which they sell access to. The product would be useless without other people’s work, therefore they should be compensated.

Sure.

Who do we send the compensation to for Leonardo da Vinci? Or Shakespeare, for a text-based example?

Do you want them to compensate me for the stuff I uploaded to Wikipedia and licensed as public domain, or what I've uploaded to GitHub with an MIT license?

A model trained only on licensed data is still an existential threat to the incomes of people whose works were never included in the model, precisely because they're only useful to the extent that they generalise beyond their own examples.

> The camera enabled something that was not possible before, and it wasn’t built by taking the work of sketch artists and painters. It was an entirely new form of art and media.

A new form of art that was (a) initially decried as "not art", and (b) which almost completely ended the economic value of portraiture.

February 16, 2024 at 11:51 PM

throw10920

> Who do we send the compensation to for Leonardo da Vinci? Or Shakespeare, for a text-based example?

Those authors aren't alive and their works are in the public domain. Bringing them up is irrelevant and a diversion from the actual problem, which is that creators alive today whose work is under copyright today and who need to make a living from their art are having it taken with zero compensation and fed into AI, stealing their effort.

> A model trained only on licensed data is still an existential threat to the incomes of people whose works were never included in the model, precisely because they're only useful to the extent that they generalise beyond their own examples.

Again, a diversion. We can debate how much AI trained on properly-licensed data should be controlled, but it's pretty clear that the bare minimum is for all AI training data to require explicit permission from the creator of that data.

February 17, 2024 at 12:44 PM

ben_w

[flagged]

February 17, 2024 at 6:19 PM

throw10920

That's clearly not what I said. You're intentionally misinterpreting my comment.

February 17, 2024 at 11:31 PM

ben_w

If you don't like how I interpret your comment, rephrase it. Don't assume you can read my intentions when my response isn't one you want.

Or don't. I'm not your boss. But if you don't, I'll never know what you meant.

February 18, 2024 at 6:15 AM

throw10920

Let me clarify - you're not even misinterpreting my comment - you're just making up random things that I never said and which no reasonable person could ever draw from my words.

There's no point to arguing this further because you're clearly not acting in good faith. It is impossible to have a reasonable conversation with someone who randomly (and falsely) claims that others said things that they did not.

February 18, 2024 at 6:22 AM

ETH_start

Those are not fundamentally different. A group of people coming together to create a company that trains an AI model for profit and an artist studying thousands of pieces to develop a style of their own, and then selling paintings based on that style, are both totally dependent on the body of knowledge that civilization left for them.

February 16, 2024 at 8:18 PM

pera

Artists do credit their teachers (Verrocchio in the case of da Vinci), schools, sources of inspiration and influences, so I'm confused by this comment.

What kind of acknowledgement did you have in mind?

February 16, 2024 at 8:24 PM

spookybones

Yeah, some of these comments are clearly made by people who don't actually know the history of art.

February 16, 2024 at 11:05 PM

saagarjha

I'm not even sure the commenter knew who "the artist that painted Mona Lisa" was when they made that comment.

February 16, 2024 at 11:52 PM

s3p

What kind of acknowledgement should AI be giving?

February 16, 2024 at 11:00 PM

jazzyjackson

if the producers of these models weren't incentivized to hide their training data it would be almost trivial to at least retrieve the images most similar to the content produced

some images will be maximally distant from training examples but midjourney repainting frames from "harry potter" could very easily automatically send a check to jk rowling per generation

these AI start ups are just trying to have a free lunch in a very mature industry

February 16, 2024 at 11:38 PM

padolsey

"The world doesn't work that way". Quite pessimistic a position to hold here, no? We–in technology especially–are in positions of significant leverage. We should be talking about how we can limit the negatives and bolster the positives from these generative models. The world can work in a different way if we put enough energy into it. We don't have to stand by as subjects of inertia. That is why OpenAI and others are treading carefully, trying to trigger some kind of momentum of reflection instead of letting our base demons run amok.

February 16, 2024 at 7:50 PM

Avicebron

That's a massively charitable reading on their actions, whenever I see a "thought leader" behind these companies talk about how careful they are being, I just see marketing. Someone desperately trying to impress upon everyone how revolutionary their model and by extension they are, it's kind of sad..

February 16, 2024 at 7:55 PM

padolsey

I definitely see it as self-serving too, yes, but I also see it as a convenient temporary alignment of incentives. The world and its regulators definitely need time to adjust and educate themselves, so I'm glad for now that they're exercising restraint.

February 16, 2024 at 8:05 PM

TaupeRanger

> The way these models are creative is the same way humans are

We have no idea how human creativity works, but we know with certainty that it doesn't involve a Python program sucking in pixel data and outputting statistical likelihoods.

February 16, 2024 at 10:08 PM

ben_w

Those Python programs are (loosely) inspired by how organic brains work.

(I still have on my to-do list "learn more about why Hebbian learning is different from gradient descent and how much those differences matter").

February 17, 2024 at 1:32 AM

jerojero

You know, I've seen people do amazing things with math equations. Beautiful visualisations.

As these tools improve and it becomes more possible for us to actually take our ideas into images and videos that fit a sort of "yes this is what I want" bill we are going to see amazing things come out.

I mean, a few days ago I saw this clearly AI generated video of some wizards doing snowboard and having a blast in the mountains. It's one of the funniest things I've seen in a while, simply so ridiculous. Obviously someone had the idea "I want to make a video of wizards doing snowboard in a mountain" that's where creativity lies.

So to say "creativity doesn't involve a python program outputting statistical likelihoods" imo is just you saying you're not creative enough to know what to do with the tools you've been given.

Some people when they see a strawberry they see a fruit. Others see endless dishes where the fruit is just an ingredient.

February 16, 2024 at 11:09 PM

jazzyjackson

rude

obviously you can use python to create works of art

whether a python script can itself be creative is the question posed by OP, but you went with "you're just not creative enough to get it"

February 16, 2024 at 11:42 PM

eric_cc

We do have, at the very least, an idea about how human creativity works and it is an input output pattern.

February 17, 2024 at 1:37 AM

goatlover

That's a meaningless statement. Any interacting physical system is an "input output" pattern, as long as you're only looking at the inputs and outputs. Behaviorism fell out of favor for a reason. It's what's transforming inputs and creating outputs that matters. For that matter, you need to be able to define what an input and an output is for humans, given that we have bodies.

February 17, 2024 at 1:45 AM

mihaic

I don't really want to weave baskets, that's what I'd want a machine to do.

"The world doesn't work that way" - I've seen this so often, but the most incredible thing about humans was the optimism to be able to change how the world worked -- that's the main impetus of most revolutions.

Personifying computer programs also is an error, it's like saying that bombs kill people when there has to be a person dropping them (at least until we get Skynet).

February 16, 2024 at 8:04 PM

LoveMortuus

>I don't really want to weave baskets, that's what I'd want a machine to do.

In my free time I like to code games, I don't have money to pay for an artist, nor the time/will to learn how to draw, that's what I'd want a machine to do.

I do agree with you that personifying computer programs is an error. That's also why I avoid calling these AI, because they're FAR from that. But I do believe that there will come a day, where personifying a computer program will be a real question.

February 16, 2024 at 8:57 PM

IAmGraydon

>The way these models are creative is the same way humans are. The artist that painted Mona Lisa didn't credit any of the influences and inspirations that they had.

I'm continually amazed at how many people argue against this point on HN, which is largely biased toward logical discourse. What you just said is exactly right, and is the Achilles heel of the legal arguments against generative AI. If what they are doing is illegal, then so is the human act of creativity. If human creativity is legal, then so is generative AI trained on existing art.

What has yet to come is the mass realization (or perhaps, admission) that the way AI works is no different from the way we work.

February 17, 2024 at 3:12 AM

jaystraw

my name is timothy basket -- you're saying people have stolen my weave?!

end sarcasm. but seriously -- claiming you made something you didn't isn't ok. but it happens, regardless of laws or regulations or norms.

i don't have any solutions; the internet helps because you can publish something and point to it. i'm a musician and sometimes i only realize well after the fact how influenced i was by something for a song i've written.

and of course, my precious baskets.

February 16, 2024 at 8:00 PM

camillomiller

It is absolutely not the same, and saying so disregards centuries of knowledge stratification. These machines produce superficial artifacts that lack any layering of meaning or semantic capital (see Luciano Floridi). They are the byproduct of the engineering extremism and lack of humanities knowledge of the people getting rich through their creation.

February 16, 2024 at 9:48 PM

chefandy

Models learn exactly like artists, and also, for some reason, the people who use those models are artists making art. Wait… Artists learn by passively ingesting many millions of pieces of media someone feeds them for the non-specific purpose of “generating art” so some person who wants to take credit for making the end piece can tell them exactly what to make, right?

February 16, 2024 at 10:21 PM

mlrtime

If what you say is true then people will still value non-superficial artifacts.

However the mass produced semi-superficial artifact creators that were being created before AI will adapt or suffer.

February 16, 2024 at 10:05 PM

nostrebored

If the lack of humanities education is what allows us to create the most abundance of art in human history, was that education really worth it?

February 17, 2024 at 1:53 AM

sebastiennight

This reads like a wildly confident statement about art.

While at the same time not mentioning the actual name of "the artist that painted Mona Lisa" (Leonardo Da Vinci), nor knowing that the name of his master is very well known, and even the influence of artists that he seemed to despise (eg Michelangelo) is very well documented as well.

Maaybe this narrow view of (art) history needs to be fine-tuned on more data :-)

February 17, 2024 at 10:50 AM

goatlover

> The world doesn't work that way.

The human world works the way humans make it work. Pretty much what Jodie Foster's character in the movie Contact told that asshole trying to steal all the credit from her, and take her place in the mission to go visit alien dad in Pensacola.

February 17, 2024 at 1:48 AM

jonplackett

da Vinci is a silly comparison. He is just one man. Even he didn’t have such great ability that he can put all other artists out of business.

This is more like the invention of weaving machines. Yes we still have weavers but nowhere near as many.

February 16, 2024 at 8:50 PM

spunker540

I agree and actually think the camera was definitely more disruptive to artists than this AI stuff, and somehow the camera didn’t kill artists.

February 16, 2024 at 11:15 PM

reactordev

“whose existence and names we will never know or acknowledge.”

That’s the problem. We know their names. We know their stories, their contributions. Babbage. Lovelace. Ritchie. Spielberg. Picasso. Rembrandt. This is what giving attribution is all about. So we don’t just stand there asking how we got here.

February 17, 2024 at 1:17 AM

soperj

This is nonsense, people give credit to their influences all the time.

February 17, 2024 at 1:22 AM

nostrebored

To the influences that they know. Our brain isn’t an attribution machine. When a musician recreates a chord progression that they’ve heard before without noticing it, is that theft?

If a comedian accidentally retells a joke, is that theft?

Our influences are subtle and often inscrutable.

February 17, 2024 at 1:51 AM

legohead

Your argument is similar to the classic hand vs. power tools argument in crafting, which eventually boils down to "did you mine the ore and forge the tools yourself?" Nowadays the argument is about CNC vs. hand crafting.

This is just a point in our overall evolution. It's an exciting time. We are here to learn and adapt.

Humans can still be creative all they want. There's still the stamp of "created by a human" that will never go away. You can choose to respect it or ignore it.

February 17, 2024 at 1:30 AM

d0mine

> never go away

It reminds me: centaurs (human+AI) in chess/go were better than either humans or AI just for a short time.

People still play chess but they are outclassed by modern AI.

February 17, 2024 at 1:42 AM

halinc

True, but while the 'best' chess is played by computers, few people care to watch Stockfish playing with itself. Meanwhile the human-powered chess world is enjoying a surge in popularity.

February 17, 2024 at 1:53 AM

j-wags

> centaurs (human+AI) in chess/go were better than either humans or AI just for a short time.

I was having a conversation about this with a friend last weekend, and we'd assumed that centaurs were still better than either top humans or top computers. I'm unable to easily find this info on google, could you share where you saw that centaurs are no longer better than top computers?

February 17, 2024 at 2:52 AM

d0mine

I saw it here, perhaps, in articles about competitions where both humans and computers were allowed (computers-only teams won). I too can’t find anything relevant on google.

February 17, 2024 at 5:27 PM

eric_cc

> will never go away

Nothing is forever. It’s unlikely unmodified Homo sapiens are dominant on earth 1,000 years from now.

February 17, 2024 at 1:41 AM

mihaic

I see a shallow analogy that isn't true to me on close inspection.

To me, human activities from which we can earn a living wage feel like nomadism at the edge of an ever expanding region of agriculture (technological automation in this case). When you lose some activities to automation, we've always found new ones until now. In the end though, there were no more pastures for nomads to move, and there will be no more new activities from which humans can earn a wage (not to mention the satisfaction of accomplishing something hard). And, while there might be a future with UBI for everyone, the transition seems rough and exploitative.

February 17, 2024 at 3:06 AM

vin047

Yeah we all thought machines would automate menial labour allowing us to focus on creativity and passion. Looks like it’s the exact opposite.

February 16, 2024 at 8:05 PM

hackerlight

Most labor is being automated within the next few decades. It'll be a post-labor world with one less factor of production. Capital and land ownership is all that will matter assuming we don't completely redesign our economic and political system. Pretty scary.

My one hope is that the price of goods becomes so low due to AGI/automation, that the uselessness of labor in the economy won't matter. People can still be materially prosperous even with a meagre UBI (and it will be meagre because people have no political power in a post-labor society where the only thing that matters is capital).

February 16, 2024 at 8:43 PM

Frost1x

>Capital and land ownership is all that will matter assuming we don't completely redesign our economic and political system. Pretty scary.

Agreed. My concern isn’t really remotely about any of the accomplishments of generative AI. Frankly in my daily life I’d welcome readily available access. As it stands now it’s sort of a mixture of analytics and creativity without consciousness as we best understand it, so GPT itself isn’t going to murder me and take over the world.

The real issue is who owns these things, how you access them, how effects will ripple through a labor based economy, and how we’ll adapt (or not) our current economic system. As it stands for a while we’ve been catering to the capital ownership group. If that doesn’t have a change in direction then I fear the implications of much of this in daily life. There’s still a fair bit of specialization and domain knowledge needed to leverage these tools to understand the questions to ask (i.e. prompts to generate both around LLMs and the context of information fed to them) but they can certainly in many cases behave as multipliers that could reduce the amount of staff needed in some creative roles or eliminate some altogether.

This isn’t a new dilemma as arguably technology has been shifting the labor market for centuries, the question is how and if it can reshape well this time or if we need to fundamentally rethink these concepts of labor and capital ownership. That’s my major concern.

February 16, 2024 at 8:59 PM

mathverse

It's the opposite. Price of goods is becoming more and more expensive due to larger demand and lower salaries.

February 16, 2024 at 10:20 PM

hackerlight

> It's the opposite. Price of goods is becoming more and more expensive due to larger demand and lower salaries.

We're discussing a hypothetical post-labor future in 5-40 years. We probably shouldn't predict the economic theory of this future by looking at recent trends. Recent trends are driven by business-as-usual things like supply chain disruptions. But we're still near full employment, so we're not on the gradient to realized post-labor just yet. Post-labor economics (and politics) will probably be radically different, all the economic assumptions we take for granted go out the window.

February 16, 2024 at 10:43 PM

iamthirsty

Salaries have actually been increasing — at least in the U.S. overall.

February 17, 2024 at 1:48 AM

throwuwu

Honestly, I don’t think the unemployment rate will change much. Humans are great at inventing things to do and if other people see those things as valuable they will pay for them. I do think the world will look very different, maybe even unrecognizable but it’s not going to be full of people doing nothing.

February 16, 2024 at 10:23 PM

brtkdotse

To be fair, generating stock images and videos and writing listicle blog posts is pretty menial labour.

February 16, 2024 at 8:07 PM

posterguy

not really

February 16, 2024 at 11:12 PM

wruza

It’s too early to close the bets. Arts (I mean, drawn porn) was just the easiest to kickstart from all the tech that the invention of modern ML and GPUs will enable.

It doesn’t look like the opposite; it looks like it automated even what we all couldn’t think of, and did that first.

February 16, 2024 at 8:30 PM

DiscourseFan

Humans are still cheaper than robots for some tasks...

February 16, 2024 at 8:11 PM

flkenosad

Not for long...

February 17, 2024 at 12:40 AM

namlem

I disagree. I think this is going to empower creatives like never before. Filmmaking currently takes a huge amount of time and money. Countless would-be filmmakers are relegated to making 30 second tiktoks because that's all they can afford to do. This could change all that.

February 16, 2024 at 10:27 PM

daniel_reetz

But an equally likely future is tiktok/insta generate all the videos. After all, they can afford the hardware and they understand how to be addictive.

February 16, 2024 at 11:22 PM

marcc

Exactly this. Art changes over time. The mediums that we use to express ourselves creatively evolve. The position that AI is the end of creative art isn't taking this evolution into account.

When video became an affordable medium, would people say "this is the end of art, live performances are art. Now the people will just watch the same recordings over and over?" Maybe, if the internet existed. But it's had the effect of creating and introducing new art forms.

AI generated content won't replace art. It will evolve it to a new creative.

February 16, 2024 at 11:13 PM

MichaelDickens

Today, only a highly privileged slice of the population can make a living making art. Nearly everyone who enjoys making art can't make a living off of it, and even the vast majority of people trying to do it full time still can't make ends meet (hence the cliche of the starving artist). But everyone can make art as a hobby if they'd like, that's what almost all artists do, and that will continue to be true as AI advances.

So I don't see AI art as changing careers much. Even if AI fully replaces human artists, all that means is the 0.1% of people who make a career off their art will have to join the rest of us 99.9% who only do art for the fun of it.

February 16, 2024 at 11:03 PM

mezeek

You sound like "making art" is only the painter in his Brooklyn studio. But it's video game designers, movie animators, videographers, graphic artists, and more that work in agencies and marketing departments of all companies that will be affected.

February 16, 2024 at 11:08 PM

ben_w

Those are mostly not well paid roles[0], and there are clearly many hobbyists in these areas also — looking at YouTube, all output is necessarily videography or animation, but what's the income distribution? I have a channel, no money from it (not that this was ever the point).

[0] Unless you're doing furry art, but that's only because furries are "suspiciously wealthy".

February 16, 2024 at 11:37 PM

digging

> Today, only a highly privileged slice of the population can make a living making art.

I think this is less true than it's been in centuries or perhaps all of history. Artistry is widespread, anyone can do it, and many choose to pursue it even though the pay isn't going to be great; in preindustrial times even having access to the ability to create art was quite limited as were the media types that existed.

February 17, 2024 at 12:37 AM

ddbb33

Creative fields encompass much more than art creation.

February 16, 2024 at 11:08 PM

high_priest

Haven't we always attributed creations to people, to motivate our own egos to pursue higher achievements in the name of "glory"? With vision of wealth attributed to fame? Forgive me for being cynical here, but this is how I always viewed the world. Names are... just this, names. Things we use to communicate some ideas/phenomena, but are irrelevant in scope of endless evolution. And can function just as well with some other "identifier" attached to it.

I have come to terms with the fact, that I'm just a spit of sand, just as irrelevant to my own creation, as I am to the cells and bacteria that create me.

February 16, 2024 at 7:32 PM

digging

I suspect chasing glory is the main driver yes, but we also like to understand how things came to be, and by knowing who made them and when and where we can do that. AI is ushering in a dark age of attribution where we may no longer be able to know how anything came to be after it's spit out of a computer. (I mean dark age as in "it's dark and we can't see," like the Greek Dark Ages or Dark Matter, not in the sense of "times are bad".)

February 17, 2024 at 12:45 AM

mihaic

If you truly feel like a grain of sand, that's your choice, but won't you help us that don't feel that way, if it won't do you any harm?

I for one do feel really special, as for every human there are about as many bacteria as there are stars in the universe (give or take a bit).

February 16, 2024 at 8:12 PM

war321

As said every time this "why are we automating creativity when menial jobs exist?" response comes up:

1) Errors in art programs are less worrisome than errors in a physical robot. One going wrong makes extra fingers in a picture, the other potentially maims or kills you.

2) Moravec's Paradox. Reasoning requires little computation versus sensorimotor and perception.

3) Despite 1 and 2, we are constantly automating menial jobs!

February 16, 2024 at 7:40 PM

reubenmorais

Classifying image generation and manipulation as "art programs" is the most beneficial possible reading of it. When you use them to generate disinformation, incitement and propaganda, they are potentially maiming and killing humans. This failure mode is well known, the mitigations ineffective, yet here we are, about to take another leap forward after a performative period of "red teaming" where some mitigation work happens but the harsher criticism is brushed off as paranoiac.

February 16, 2024 at 9:22 PM

thegrimmest

I couldn't disagree more strongly that disinformation, incitement, or propaganda maim and kill people. People kill other people. Don't give killers an avenue to abdicate responsibility for their actions. Propaganda doesn't cause anyone to do anything. It may convince them, but those are entirely separate things with a clear, bright line between them. Best not mix them up.

February 17, 2024 at 1:34 AM

reubenmorais

It might be instructive to consider, for example, the history of genocide, in particular of civilian collaboration in state-led genocide. It might be instructive to consider why the genocide convention criminalizes not only acts of genocide, but also incitement of genocide. Why it criminalizes not only the failure to prevent genocide, but also the failure to prevent incitement of genocide. The US has an extraordinarily strong position on freedom of speech; it is nowhere near a universal moral value.

People kill other people is a statement so simple as to be devoid of any positive meaning. What are you actually trying to say? Don't justice systems almost universally contain notions of incitement of crime, criminal negligence to prevent a crime, and other accessory considerations to the actual act?

Don't justice systems almost universally have several levels of responsibility in relation to intent, which at its most basic level can be established by predictable outcomes?

If, for example, you are a leader of armed forces, and also a leader of organizations capable of creating propaganda. Let's say you create and distribute some propaganda (maybe using some AI tools), and a predictable outcome of that is that soldiers will be more lenient in their consideration of the rules of engagement and international law. In that case, one could at the very least establish that you were negligent in your creation and distribution of propaganda. The actual crime would have been the people killing people, namely your soldiers, but you would certainly be given some responsibility for that.

You can similarly take a small next step after that and consider that a company producing, distributing, and profiting from a dual use technology capable of creating propaganda and disinformation that can be responsible for crimes could be held at the very least morally accountable for those crimes, if not criminally.

Responsibility, accountability, moral and criminal, are not black and white notions. They are heaviest and easiest to attribute around physical acts of damage, but they stretch far and wide. To think otherwise is to allow the people with the most power to rampage unaccounted.

February 18, 2024 at 8:06 PM

thegrimmest

> freedom of speech .. is nowhere near a universal moral value

It depends on the basis from which you derive your (universal) moral values. Maximalist liberty as a universal moral value can be derived from the dual axioms of universal moral equality and a lack of moral oracle. If you accept these axioms, it follows that there is no source of moral authority that can legitimately constrain the non-infringing actions of another (eg. your right to wave your fists around ends where my nose begins). These ideas were first laid out in The Declaration of the Rights of Man, and expanded on in the Declaration of Independence.

> What are you actually trying to say?

That the causal chain of an action is completely interrupted at the first agent/actor in the system, who bears full responsibility for their actions.

> justice systems almost universally

It very much depends on the justice system. If you look at US/British/Roman law, a guilty mind (mens rea) and a guilty action (actus reus) are core facts that must be established in order to prove a crime has been committed. These still apply in cases of eg. criminal negligence, where a reasonable person ought to have known that their actions will result in harm. Mens rea is quite challenging to prove in cases of incitement, and legal precedents vary.

In combination with the above causal thesis, I hold that restricting incitement is in all cases an overstep of federal authority and an infringement of fundamental liberty. Incitement as a crime seems to have been established to make policing easier, not because telling someone to do something makes you responsible for their actions.

> you were negligent in your creation and distribution of propaganda

People are not inanimate objects. They are decision-making agents. The world is not a Rube-Goldberg machine. The soldiers who do the killing are responsible for their own moral attitude, and their own actions. You cannot be reasonably expected to know how your ideas will impact the minds of others, since every mind is a black box. Everything that contradicts this does so with generalizations too broad to be predictively useful.

> You can similarly take a small next step

This is where everything goes insane. Where does the responsibility end? You're trying to piece the butterfly effect back together.

Are people who make and sell bullets responsible for shootings? What about those that refine brass and lead? What about those that mine for ore? Creating economic demand, or promoting an idea, are morally neutral actions. People buying goods are in no way responsible for the conditions of their manufacture. People promoting ideas are in no way responsible for the actions a listener may take. Responsibility is zero-sum. Don't allow slavers and murderers to dispense with even a tiny portion of the sum responsibility for their actions. They must bear it all.

February 21, 2024 at 8:16 AM

scotty79

Disinformation is art. Art is disinformation.

February 16, 2024 at 9:30 PM

panagathon

This reads a little hysterical to me. It's just a new medium of expression. Who knows, maybe the lack of genuine artistic merit, if there is such a thing, would lead more people to watch Jim Jarmusch flicks.

February 16, 2024 at 10:00 PM

seanw444

It's impressive how much hysteria I absorb from this site. Maybe I need a break.

February 16, 2024 at 10:58 PM

rmbyrro

I used to have the same view. Watching "Everything is a Remix" [1] helped me broaden my perspective.

[1] https://youtu.be/nJPERZDfyWc

February 16, 2024 at 11:36 PM

mihaic

I watched that many years ago, but still see a difference here. Everything was a remix made by humans that put in their unique selves into the remixing process.

An AI model has no "unique self" to add to creation, at least not as we've understood so far.

February 17, 2024 at 1:24 AM

rmbyrro

I have the impression you think that it's OK for humans to learn upon other people's work and then create their own, but it's not OK for machines to do that. Am I right?

I don't think this position will lead to good outcomes in terms of progress for civilization.

February 17, 2024 at 7:00 AM

mihaic

Why do you think this, right now?

I'm not ideological about this, I wish for a future with self-driving cars for instance.

The current situation is simply too rapidly evolving and can cause significant economic destruction, for instance if many middle-class jobs are lost without anything to replace them.

Change is inevitable, but reckless speed is not, that's a choice we make as a group.

February 17, 2024 at 4:01 PM

Subdivide8452

[flagged]

February 16, 2024 at 11:39 PM

brookst

Think of your favorite musicians. How many of them give attribution to where each musical idea came from?

The concept of art as exclusive property is very new. Throughout history, artists have built on one another’s works with no attribution or provenance. It’s really just the past 100 years — Disney, specifically - that have created the cultural mindset that the first person to express something owns it forever and everyone else has to pay them for the privilege of building that next work.

BTW I’m old enough to remember people decrying the rise of desktop publishing (“WYSIWYG”) as the automation of creativity.

I share your disdain for the geriatric political class, but I strongly disagree that this is a situation that needs to be managed. I say we let the arts return to the free-for-all they were for the first 80,000 years or whatever.

February 16, 2024 at 9:09 PM

fipar

“Think of your favorite musicians. How many of them give attribution to where each musical idea came from?”

Certainly not for every individual idea, but good musicians do a lot of attribution. I got to know a lot of music I love now after following a mention on the liner notes of another musician’s album, or having them mentioned in an interview.

February 16, 2024 at 9:51 PM

brookst

Aren’t liner notes the moral equivalent of OpenAI mentioning some source material used for training?

People seem to be asking for much more direct attribution: the pixels in this image are 0.02% from artist X, and 0.006% from artist Y, etc.

It is very rare for a song to include a breakdown of all of the influences that the artist is exercising in that particular piece.

February 16, 2024 at 10:15 PM

internet101010

How you are describing that percentage breakdown is how I see this all playing out legally, such that royalty for IP holder = (tags in prompt)/(count of same tags in training data). I am oversimplifying this obviously but you get the idea. This approach would require collective effort of major IP holders but if record labels and streamers can figure out revenue pooling I don't see why it can't work elsewhere.
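
To make the arithmetic concrete, here is a rough sketch of one way that ratio could be read. Everything in it is an assumption for illustration: the `royalty_shares` function, the `(holder, tags)` data layout, and the rule that a holder's share is its count of prompt-tag matches divided by all such matches in the training data. It is not how any real licensing pool works.

```python
from collections import Counter


def royalty_shares(prompt_tags, training_items):
    """Split a royalty pool across IP holders (hypothetical scheme).

    training_items is an iterable of (holder, tags) pairs, an assumed layout.
    A holder's share is the number of prompt tags found on its items,
    divided by the total number of such matches across all holders.
    """
    prompt = set(prompt_tags)
    matches = Counter()
    for holder, tags in training_items:
        hits = len(prompt & set(tags))
        if hits:
            matches[holder] += hits
    total = sum(matches.values())
    if total == 0:
        # No training item carries any of the prompt's tags.
        return {}
    return {holder: count / total for holder, count in matches.items()}


# Made-up holders and tags, purely for illustration.
items = [
    ("studio_a", ["cat", "tokyo"]),
    ("studio_b", ["cat"]),
    ("studio_c", ["dog"]),
]
shares = royalty_shares(["cat", "tokyo"], items)
```

Holders whose items share no tags with the prompt simply get nothing, and the shares of the rest sum to 1, which is the same shape as a streaming revenue pool.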

February 16, 2024 at 11:30 PM

fipar

If the source material was mentioned for every generated image then I think it would be more like what you say. No percentages needed since that's not something we used to get from liner notes either.

February 17, 2024 at 4:43 AM

brookst

But each generated image likely pulls from thousands, maybe millions of pieces of training data, each at a very small weight.

February 17, 2024 at 8:54 AM

geraneum

> Think of your favorite musicians. How many of them give attribution to where each musical idea came from?

A great many, if you care to read a bit more of the biographies, autobiographies, history of music books, interviews, blogs, etc.

February 16, 2024 at 9:25 PM

scotty79

At some point we'll probably have insight into learning data of AI. For now copyright makes that super hard.

February 16, 2024 at 9:29 PM

pclmulqdq

In what sense does copyright make attribution of that data so hard?

Is it because people are violating copyrights to train these AIs?

February 16, 2024 at 10:01 PM

scotty79

They can't publish their training databases because that would be publishing of copyrighted material which is illegal. They can only train which is potential fair use.

February 17, 2024 at 6:00 PM

pclmulqdq

It would be more accurate to say that they don't publish their training databases (including sanitized pointers to the copyrighted stuff) because they aren't sure that training is fair use.

They are sure, however, that it is a kind of infringement. Citing "fair use" is an admission of infringement - just a specific kind of infringement that is allowed.

February 18, 2024 at 3:17 AM

scotty79

Publishing is definitely not fair use. What's allowed is not an infringement of any kind. Using a right you have cannot be any kind of infringement.

February 18, 2024 at 9:07 AM

brookst

How can training violate copyright? Is reading a book also violation? My understanding was that copyright was about reproduction, not consumption.

February 16, 2024 at 10:16 PM

testermelon

It was about unfairly compensated usage, not limited to reproduction.

February 16, 2024 at 11:14 PM

brookst

That doesn’t sound like copyright.

February 17, 2024 at 8:55 AM

testermelon

Fair enough. It might be better to use another word.

February 17, 2024 at 6:11 PM

scotty79

I especially like how Gorillaz artist admits the main hook of their breakaway success song was a rock preset on some niche electronic synthesizer.

February 16, 2024 at 9:28 PM

skriticos2

I'd be very skeptical that AI would worsen the situation with music. For example, many pop music titles in the last decades incorporate the same millennial whoop over and over and over again. I seriously stopped following pop music a long time ago and I can't imagine that AI could make it any more generic if it tried. I don't see a threat for non-generic indie music. AI is good at the generic stuff, as it usually statistically averages out the inputs.

February 16, 2024 at 10:49 PM

jazzyjackson

when nirvana played MTV unplugged they mostly played covers from bands that influenced them

also no, disney did not invent the notion of authorship nor royalties. having enough honor not to take credit for someone else's work goes back millennia. attribution goes back millennia, otherwise we wouldn't know the names Sophocles, Aeschylus, Euripides.

Don't get me started on the pharaohs, mother fuckers loved carving their names into things.

February 16, 2024 at 11:54 PM

arendtio

> This is both amazing and saddening to me. All our cultural legacy is being fed into a monstrous machine that gives no attribution to the original content with which it was fed, and so the creative industry seems to be in great danger.

It is the same as what every human being is doing. We consume and we create. Sometimes creations are very good, but most of the time they are just mediocre. If the machines can create better average results, it will be due to the genius of the humans who invented those machines.

So we can be happy, that we have such beings among us and should cherish, that we will have better content to consume in the future. When you look at the world, you will see, that there are still plenty of problems to be solved for humans.

February 17, 2024 at 1:50 AM

honkycat

In the same way the "organic" movement took over food, and we want to feel human skill and touch in what we are consuming, I think we will see a similar swing in media.

People invest in stories. They also invest in other people. This is why people love seeing Tom Cruise over and over again in movies. Or why I'm going to go see the next Scorsese movie.

Reality television is designed to be addicting and engaging, and it is very successful at that. I get pulled into the storylines whenever I watch. But I quickly turn it off. I don't watch it, not because it is not enjoyable, but because I realize it is cotton candy: empty trash that is not worth my time.

Artists are already often criticized for being "corporate." I think we'll see a similar effect for AI generated content. The hoi polloi and normies will slurp it up.

The true fans and passionate ones who give a shit aren't going to be fooled.

Edit: for length

February 17, 2024 at 4:15 AM

underlipton

Proactively splitting up the menial tasks so that everyone is doing a little bit, inasmuch as they are able to, for a few hours a day, a few days a week, and getting paid a full-living wage for it seems like the way to go. Or, everyone pulls a Xiu Xiang from Rainbows End and goes back to high school.

The main obstacle to this is the pride and ego of the people who've "made it" up until now. Let go. Let society have nice things, even if you have to reinvent yourself. I don't think that creativity is endangered; art, uh, finds a way.

February 16, 2024 at 11:03 PM

bamboozled

They manage it by meeting with Sam Altman while he runs around in incredibly expensive suits and tells them he will open an office in their country so they will all benefit.

February 16, 2024 at 8:01 PM

PeterStuer

Let's go beyond the philosophical. Which sources would you expect the "woman walks through Tokyo" video to attribute?

February 16, 2024 at 10:43 PM

kajumix

I didn't go to film school or have any training in creative arts. I love the fact that I will have an outlet for creative expression where my text can generate image, video and sound. I can iterate over them, experiment with visualizations, and get better without technical barriers. Generative AI is making everyone an artist as well as a coder.

February 16, 2024 at 10:14 PM

threetonesun

You could take your phone and film something outside your house in an interesting way and I'd probably argue that's more "art" than whatever glorified stock video AI generates for you. I'm interested where the tooling can go in the long run - can I scribble a picture of a cat and have it turned into an accurate 3D model, then have AI animate it based on text? That would be neat. Text prompts into "something" isn't, to me.

February 16, 2024 at 10:28 PM

moritonal

Part of the book Look to Windward by Iain M. Banks deals with this: how the machine minds could comfortably write operas greater than any man's, yet humans would still go to the theatre just to appreciate them, but the impact was recognised in society. Of course, that world was based on post-scarcity, whilst ours is not.

February 16, 2024 at 11:56 PM

yakito

Automatizing creativity, some claim, is an endeavor akin to catching smoke with bare hands—futile, if not utterly fanciful. Yet, I can't help but ponder over the peculiar ballet of human ingenuity and mechanical precision. Consider for a moment, this strange juncture we find ourselves at, a place where the tools crafted by our own hands begin to sketch the outlines of what could very well be new breeds of creativity.

Let's muse on the notion that creativity, as we've known and cherished it, can be bottled up and dispensed by machines, up to a certain whimsical point. Beyond that? We stumble upon creations like these, novel tools that beckon us, the flesh-and-blood creators, to mold unforeseen "creativities." It's one spectacle to mechanize the known realms of artistic endeavor, quite another to boldly claim that machines shall inherit the mantle of creativity, henceforth dictating the contours of all future artistic landscapes.

History, that grand tapestry, is peppered with instances where the mechanical muses have dared to tread upon the sacred grounds of creativity. Take photography, for instance, a marvel of the 19th century that promised to capture reality with an accuracy that scoffed at the painter's brush. Or consider the digital revolution, which flung open the doors to realms of visual and auditory experiences previously consigned to the realm of dreams. The synthesizer, not merely an instrument but a portal, has ushered us into a new era of musical exploration, challenging the supremacy of the acoustic tradition.

Each of these milestones, while distinctly modern, echoes the age-old dance between creator and tool, where each step forward is both a continuation and a departure from the past. In this light, the question isn't whether creativity can be automated, but rather how our definition of creativity evolves as we, hand in hand with our mechanical counterparts, stride into the unknown.

February 16, 2024 at 9:59 PM

TheRealDunkirk

So... you fed the GP comment into some LLM, and it produced this meaningless pablum?

February 16, 2024 at 10:06 PM

activitypea

yaaaaawn

February 16, 2024 at 10:01 PM

michaelcampbell

Yes and no; I mean there are still painters around and we still appreciate their skill in the world of photographs. Sometimes it's only marginally about the finished product, but also about the work to make it.

February 17, 2024 at 3:48 AM

melagonster

The creators just don't care about humans haha. I don't know why people are still learning communications, writing, art or any other crafts. Everything will be displaced by the next AI.

February 16, 2024 at 7:35 PM

bamboozled

I mean why do kids go to school, why learn anything at all I guess?

February 16, 2024 at 7:59 PM

melagonster

In today's world, it may not be optimal to send children to school. I hope they can follow their passion, but a lot of options are not good now.

February 17, 2024 at 5:27 PM

test6554

I believe people will learn, but they will learn more at a lower price.

February 16, 2024 at 9:08 PM

throwaway98797

take shakespeare, he borrowed from so many and yet most people don’t know

a few examples

Plutarch's Lives

Holinshed's Chronicles

Ovid's Metamorphoses

good artists copy, great artists steal

so on and so forth

i, for one, welcome these creative machines slurping all that was created!

February 16, 2024 at 10:37 PM

adabaed

We have been on this path for centuries. Compare the symphonies of 200 years ago with our current music or painting. We humans prefer quantity over quality.

February 16, 2024 at 7:48 PM

wruza

I see nice paintings (not black squares or abstract nonsense) like all the time. Feels like more people can paint at the level of “classics” now. Of course they cannot surpass the deeper meaning of the originals, because they aren’t dead yet and there’s no mystery and legacy around their names. But otherwise they are pretty good at making cool pics.

February 16, 2024 at 8:46 PM

boppo1

Find me 15 painters who have non-digital works at this level of scale/detail:

https://www.metmuseum.org/art/collection/search/436482

I suspect you will struggle. The economics for that sort of work don't exist anymore.

February 16, 2024 at 10:10 PM

flkenosad

> We humans prefer quantity over quality.

Speak for yourself.

February 17, 2024 at 12:42 AM

robomartin

> the geriatric political class has absolutely no clue how to manage the situation.

OK, well, you walked right into this one:

You must know the answer: How do you manage it?

February 17, 2024 at 3:24 AM

leovingi

Machines can reliably beat humans at chess. Has that stopped anyone from playing? Has it stopped anyone from watching chess tournaments?

February 17, 2024 at 12:02 AM

suprxd

I guess when you know the why and how of something, it doesn't feel surprising anymore. That's why two computers playing chess is not a fun event. People would, however, watch two humans playing even if their moves were secretly controlled by a machine. In contrast, if (and when) generative content reaches indistinguishable levels, I doubt the majority of people would care whether a machine or a human created it. The biggest problem with AI, disguised as one of its pros, is that it is available to anyone and everyone and can be used as a weapon.

February 17, 2024 at 1:34 AM

mengibar10

This is similar to the problem of the manuscript-copying workers in the Ottoman realm. The printing machines would take their jobs, so they resisted and lobbied against them in the courts of the sultan. They succeeded in delaying the adoption of printing for some time, to the detriment of the people. Unfortunately, this has been the history of man and technology destroying something, for better or worse.

Some twisted the story as if the underlying issue was the religion but economic concerns were the real reason.

February 16, 2024 at 11:26 PM

thiscatis

Have you ever danced or even just enjoyed listening to Daft Punk?

February 16, 2024 at 11:42 PM

scotty79

I think turning human creativity into industry was huge mistake.

I welcome its fall.

February 16, 2024 at 9:24 PM

flkenosad

For sure. It's only being kept alive by ritual sacrifice these days.

February 17, 2024 at 12:44 AM

ben_w

> Creativity being automated while humans are forced to perform menial tasks for minimum wage doesn't seem like a great future and the geriatric political class has absolutely no clue how to manage the situation.

May I introduce you to the entire history of humanity from 7 millennia before the invention of writing to approximately 50 years after the invention of communism? :P

More seriously: yes, we have no clue how to manage the situation. The best guess right now is UBI, which looks cool, but then at a first glance so does communism and laissez-faire capitalism.

Time for, ironically because humans are surprisingly bad at this, a creative idea for how to manage all this.

February 16, 2024 at 11:29 PM

whywhywhywhy

Wish I could go back in time and tell myself to not bother learning how to do any of this stuff the old fashioned way.

February 16, 2024 at 8:18 PM

cdelsolar

Why is the machine monstrous?

February 16, 2024 at 11:17 PM

erur

I feel like a lot of that frustration comes from seeing "arts and culture" as the pinnacle of anything when maybe it's just an overvalued side-effect of human wiring to avoid boredom.

Imho. it's just really hard to reason that average non-educational entertainment has a positive net effect on global society.

Seeing it this way makes it way less surprising that "art" and "creative entertainment" is one of the first things that gets hit by automation.

February 16, 2024 at 8:47 PM

boppo1

Painter/illustrator here. I mostly agree with you. I have often wondered if what I do is a total waste of time, long before generative models showed up. My close childhood peers became doctors and engineers, and there just isn't any comparison between our contributions to society. People get all whimsical when I bring this up and say "but what about the [spirit/feelings/blah]". I'm clear-eyed about it though. If I could go back & re-roll my character sheet (i.e. slap my younger self into realizing STEM is cool while those doors were still open), I certainly would.

However, there's a line somewhere. I've spent most of my life around drab midwestern utilitarian/corporate/commercial buildings, and it has been noticeably depressing. In the periods where I've spent time in beautiful buildings, I have felt much better. Based on anecdata, I'm not the only one. There's something important & essential for humans about ornamentation & beauty. It's more than entertainment.

Humans can live on rice and kidney beans, but if one must do so without hope for more tasty options[0] it is demoralizing.

[0] lots of people are happy with spartan diets, but most often those people are doing so by choice.

February 16, 2024 at 10:05 PM

flkenosad

Are those doors not still open today? Engineering schools take mature students.

February 17, 2024 at 12:46 AM

boppo1

I have ~50k in debt, and my GPA was garbage. Self study and hobbyist pursuits seem to be my place unless I find a specific field+program I really love enough to bet everything on.

February 19, 2024 at 3:05 AM

mlrtime

You don't have to feel it, millions of people start painting or other artistic endeavors when retiring. Most of the time the [market] value is close to 0. AI does nothing here.

Anecdote: My grandma retired and started painting and has since passed. The market value of these paintings is 0, nobody would buy them as they are just average. But I will never get rid of them because she created it. They have value to me only.

February 16, 2024 at 10:39 PM

torginus

I'm not sure about others, but I'm extremely unnerved about how OpenAI just throws these innovations out with zero foreshadowing - it's crazy how the world's potentially most life-changing company operates with the secrecy of a black military program.

I really wonder what's going to come out of the company and on what timeline.

February 16, 2024 at 3:51 AM

superconduct123

That's what's mindblowing to me

It doesn't feel like a slow incremental progress, the last AI videos I've seen were terrible

It's like suddenly a huge jump in quality

February 16, 2024 at 4:05 AM

Jackson__

It is a sudden jump in quality. A mere _month_ ago, this is what Google's SOTA was: https://lumiere-video.github.io/

February 16, 2024 at 4:45 AM

marvin

OpenAI is the Manhattan Project of machine intelligence. Private-sector.

February 17, 2024 at 4:34 AM

ilaksh

Holy %@$%! Abso%@#inglutely amazing! Also, now I see why we need $7 trillion worth of GPUs.

February 16, 2024 at 2:32 AM

CommanderData

All the software engineers and VFX people training to become plumbers. I'm afraid your clients will be jobless or underpaid by that time.

Jokes aside, it's becoming more apparent that power will further concentrate in big tech firms.

February 16, 2024 at 8:57 AM

nuz

AGI at the quality of sora or dalle but for intelligence is gonna be quite the thing to witness

February 16, 2024 at 2:25 AM

zemo

> Prompt: Historical footage of California during the gold rush.

this is the opposite of history

February 16, 2024 at 2:20 AM

minimaxir

It's a test prompt to demo the model, not a clickbait social media post.

February 16, 2024 at 2:24 AM

diputsmonro

Yes, but the point is that in a few years, there won't be a difference. Those clickbait accounts already exist for AI generated images. How many impressionable or young people have been fooled into believing history that never happened?

More importantly, how can these accounts subtly direct the generations to instill modern ideology or politics into "historical" images, giving them historical credibility? Think of all the subtly white supremacist "retvrn" accounts, for example, falsely recontextualizing inventions and accomplishments to support their ideology.

We all need to be thinking much more creatively and cynically about how these tools will be abused. The technology will get better. The people who want to abuse it will get smarter. And your capability to distinguish fake information is likely much worse than you believe - to say nothing of younger people who have less context and experience to form a mental "immune system".

February 16, 2024 at 2:45 AM

minimaxir

Granted, the blog post is about opening the model up for red-teaming, so highlighting potential vectors for abuse is actually the desired intent.

February 16, 2024 at 2:54 AM

psychoslave

>How many impressionable or young people have been fooled into believing history that never happened?

I would say, all of them. Since the dawn of history. Actually, far before, as treachery certainly precedes speech itself by a few million years in the struggle to survive game.

Just to take a contemporary western (mostly?) thing: how did it go the last time you looked straight into the eyes of kids to reveal to them that Santa Claus is a lie and that yes, almost all adults in their society are in on that evil conspiracy? And what about the adults around you deeply attached to their national myths, not even mentioning all the folklore around their afterlife beliefs?

But don’t worry, everything is going to go well, I promise and you know you can trust me. :)

February 16, 2024 at 9:56 PM

zemo

yeah I think a tech company showing how their tech can be used to cause damage to a humanities field as one of their leading product demos is bad

February 16, 2024 at 2:28 AM

pimlottc

Yeah, my heart sank when I saw that.

Social media is really good at separating content from context, things like this will distort people's understanding of history.

February 16, 2024 at 4:22 AM

zen928

only if you consider "historical footage" to exclusively mean the "[original] historical footage [stored in archiving]" versus e.g. "historical[ly accurate] footage"

if "historical" is going to be used subjectively with no further qualifying statements then the meaning of "history" will be subjugated to the context it's being presented in, I don't see its use here as contradictory

February 16, 2024 at 5:27 AM

zemo

I think most people consider "history" to mean "things that have actually happened" and not "the aesthetic of the past" as you seem to be suggesting.

February 16, 2024 at 6:18 AM

aubanel

I love how they show the failure cases: compare that with Gemini 1.5 Pro's technical paper, which carefully avoids any test where it does not look like 100% performance! I think confronting your failures is a condition for success, and Google seems much too self-indulgent here.

February 16, 2024 at 5:36 AM

ugh123

Imagine a movie script, but with more detail of the scenes and actors, plugged into this.

The killer app for this is being able to give a prompt of a detailed description of a scene, with actor movements and all detail of environment, structure, furniture, etc. Add to that camera views/angles/movement specified in the prompt along with text for actors.

February 16, 2024 at 3:11 AM

PepGuardiola

In the future, you won't need to do any of that. Your own AI will generate a movie for you and ask you if you feel like watching a movie. You will love it. Because it will know your taste, your hobbies, your friends, ads, chat history, website you visited, ..everything.

February 16, 2024 at 9:10 AM

ugh123

I am a huge proponent for AI, especially in film making. But I hope that real people have the opportunity to write, act, and direct themselves, or with a small group of semi-professionals or even amateurs, their own blockbuster big-budget-looking movies.

February 16, 2024 at 2:57 PM

wingspar

Watched the MKBHD video on this and couldn’t help but think about copyrights when he spoke of the impact on stock footage companies.

As I understand the current US situation, a straight prompt-to-generate-video cannot be copyrighted. https://www.copyright.gov/ai/ai_policy_guidance.pdf

But the copyright office is apparently considering the situation more thoroughly now.

Is that where it stands?

If it can’t be copyrighted, it seems that would hamper many uses.

February 16, 2024 at 11:09 AM

kashnote

Absolutely unreal. Kinda funny how some people are complaining about minor glitches or motion sickness when this is the most impressive piece of technology I've seen. Way to go, OpenAI.

February 16, 2024 at 5:13 AM

albertzeyer

It's such a shame that they aren't releasing any detailed technical paper anymore on all the technical details of the model and how it was trained.

*Edit* Oh, I just read here (https://www.reddit.com/r/MachineLearning/comments/1armmng/d_...) that a technical paper should be released later today?

February 16, 2024 at 7:12 AM

OscarTheGrinch

How is this done technically? So many moving parts and the tracking on each is exquisite.

My initial observation is that the camera moves are very similar to a camera in a 3D modeling program: on an inhuman dolly flying through space on an impossibly smooth path / bezier curve. Makes me wonder if there is actually something like a 3D simulation at the root here, or maybe a 3D unsupervised training loop, and they are somehow mapping persistent AI textures onto it?

February 16, 2024 at 3:24 AM

vilius

The Lagos video (https://openai.com/sora?video=lagos) is very much how my dreams unfold. One moment, I'm with my friends in a bustling marketplace, then suddenly we are no longer at the marketplace, but rather overlooking a sunset and a highway. I wonder if there are some conceptual similarities in how dreams and AI video models work.

February 16, 2024 at 4:12 AM

ladberg

Yeah that one has more surreal elements every time you watch it: the people at the table are giants compared to everyone else, someone is headless, the kid's hand warps around like crazy.

February 16, 2024 at 5:04 AM

IceHegel

Those samples are incredibly impressive. It blows RunwayML out of the water.

As a layman watching the space, I didn't expect this level of quality for two or three more years. Pretty blown away, the puppies in the snow were really impressive.

February 16, 2024 at 3:01 AM

m3kw9

i'm not surprised given what was there before, the stills from stability were really good, and it's "just" generating new frames.

February 16, 2024 at 3:28 AM

Xirgil

Maintaining continuity of appearance, motion, etc does not seem like a "just" to me

February 16, 2024 at 4:23 AM

agomez314

Imagine someone combining this with the Apple Vision Pro...many people will simply opt out of reality and live in a digital world. Not that this is new, but it'll entice a lot more people than ever before.

February 16, 2024 at 2:28 AM

TechnicolorByte

Had the same thought. Seems like we’re entering the era of generative AI and mixed reality in a very real way very soon.

As much as I love the technology, I’m really not looking forward to this becoming ubiquitous. Time and time again we’ve allowed technological progress to outpace our ability to weigh the societal pros and cons.

Smartphones and the rise of image-heavy social media have rapidly changed social norms. Watch a video of people out in public 20 years ago: no screen to distract them at bus stops, concert events, or while eating dinner with friends. And if that seems trite, consider how well correlated the rise in suicide rates is with the popularity of these technologies.

Not sure if this makes me a luddite or if the feeling is common in this crowd.

February 16, 2024 at 2:55 AM

pants2

Basically the Holodeck.

February 16, 2024 at 2:30 AM

ilaksh

I was just thinking that -- I used to think the Holodeck was far-fetched. Now it seems like it's practically around the corner (with VR/XR glasses).

February 16, 2024 at 2:34 AM

ctoth

Presumably the Post-atomic horror set back technology for a while, so we should be able to expect TNG-level technology before the war. This also explains why Kirk's Enterprise uses datatapes.

February 16, 2024 at 4:01 AM

m3kw9

but you cannot walk/feel it, just watching. It's still a huge gap to reality, less so, but you will still feel it's fake very vividly because those senses are missing.

February 16, 2024 at 3:20 AM

dw_arthur

Watching it is enough for a lot of people. Watching 1080p first person extreme sport videos on youtube is almost too compelling to me. I have to turn it off because it feels addictive.

February 16, 2024 at 8:19 AM

kuprel

chips will have to come a long way for this to be generated in real time, but there's no reason a generated 3D environment can't be interactive

February 16, 2024 at 3:54 AM

m3kw9

Maybe some sort of implants that can generate senses. Would be 100s of years, because you'd need to simulate, say, weight/pressure and the pinpoint accuracy of feeling friction.

February 16, 2024 at 11:29 PM

TealMyEal

in their research paper it says "These capabilities suggest that continued scaling of video models is a promising path towards the development of highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them." they are well set on that happening

February 16, 2024 at 11:07 AM

bonaldi

It’s heartening and gives me hope that the reaction here is so full of scepticism and concern. Sometimes proceeding with caution is warranted.

February 16, 2024 at 7:29 AM

throwitaway222

https://openai.com/sora?video=cat-on-bed

Even though many things are super impressive, there is a lot of uncanny valley happening here.

February 16, 2024 at 4:11 AM

gwern

Cats are like hands: they are hilariously hard for generative models and then after thinking about it, you realize that cats/hands really are hard. I mean, look at photos of a black cat curled up where it might have its paws sticking out at any angle from anywhere from a solid black void. How the heck do you learn that?

February 16, 2024 at 4:29 AM

mitthrowaway2

Yes, the cat has three hands...

February 16, 2024 at 4:15 AM

hubraumhugo

The amount of VC money in the text-to-video space that just got wiped out is impressive. Have we ever seen such fast market moves?

Pika - $55M

Synthesia - $156M

Stability AI - $173M

February 16, 2024 at 4:39 AM

guwop

Obviously they did not get "wiped out". Where can I use Sora right now?

February 16, 2024 at 5:21 AM

chasing

Yeah, you just can't let all media, all the cost and hard work of millions of photographers, animators, filmmakers, etc be completely consumed and devalued by one company just because it's a very cool technical trick. The more powerful these services become the more obvious that will be.

What OpenAI does is amazing, but they obviously cannot be allowed to capture the value of every piece of media ever created — it'll both tank the economy and basically halt all new creation if everything you create will be immediately financially weaponized against you, if everything you create goes immediately into the Machine that can spit out a billion variations, flood the market, and give you nothing in return.

It's the same complaint people have had with Google Search pushed to its logical conclusion: anything you create will be anonymized and absorbed. You put in the effort and money, OpenAI gets the reward.

Again, I like OpenAI overall. But everyone's got to be brought to the table on this somehow. I wish our government would be capable of giving realistic guidance and regulation on this.

February 16, 2024 at 2:29 AM

CooCooCaCha

It’s funny, people dreamed of AI robots doing the shitty work that nobody wants to do so that we are free to pursue things we actually want to do.

But in reality it seems like the opposite is going to be true. AI is automating the creative, intellectual work and leaving the rest to us.

February 16, 2024 at 2:31 AM

nerdix

I'm kind of excited to see how scifi authors will tackle the generative AI revolution in their novels.

As of now, the models still need large amounts of human produced creative works for training. So you can imagine a story set in a world where large swathes of humanity are relegated to being basically gig workers for some quadrillion dollar AI megacorp where they sit around and wait to be prompted by the AI. "Draw a purple cat with pink stripes and a top hat" and then millions of freelance artists around the world start drawing a stupid picture of a cat because the model determined that it had insufficient training data to produce high quality results for the given prompt. And that's how everyone lives their lives....just working to feed the model but everything consumed is generated by the model. It's rather dystopian.

February 16, 2024 at 3:24 AM

mortenjorck

I have a novel I've been working on intermittently since the late 2000s, the central conflict of which grew to be about labor in an era of its devaluation. The big reveal was always going to be the opposite of Gibson's Mona Lisa Overdrive, that rather than something human-like turning out to be AI, society's AI infrastructure turns out to depend on mostly human "compute" (harvested in a surreptitious way I thought was clever).

I've been trying to figure out how to retool the story to fit a timeline where ubiquitous AI that can write poems and paint pictures predates ubiquitous self-driving cars.

February 16, 2024 at 4:25 AM

dsign

I would say it's very profitable in terms of ideas...if you put in the work. The problem is that most main-market sci-fi is not about ideas, but about cool special effects and good vs bad guys.

February 16, 2024 at 4:12 AM

dovin

Sure, 90% of everything is crap.

February 16, 2024 at 5:29 AM

Hoasi

> As of now, the models still need large amounts of human produced creative works for training.

That will likely always be the case. Even 100% synthetic data has to come from somewhere. Great synopsis! Working for hire to feed a machine that regurgitates variations of the missing data sounds dystopian. But here we are, almost there.

February 16, 2024 at 4:20 AM

ItsMattyG

Eventually models will likely get their creativity by:

1. Interacting with the randomness of the world

and

2. Thinking a lot, going in loops and thought loops and seeing what they discover.

I don't expect them to need humans forever.

February 16, 2024 at 4:25 AM

Hoasi

Agreed, by some definitions, specifically associating unrelated things, models are already creative.

Hallucinations are highly creative as well. But unless the technology changes, large language models will need human-made training substrate data for a long time to operate.

February 16, 2024 at 4:39 AM

cubefox

It's ironic that you nonetheless think "scifi authors" will be writing those novels, not language models.

February 16, 2024 at 4:46 AM

alex_suzuki

I would read that! But hopefully it won’t be written by ChatGPT.

February 16, 2024 at 3:41 AM

throwup238

I'm sorry but as a large language model I must insist that you get back in the kitchen and make me a burger.

February 16, 2024 at 2:36 AM

CuriouslyC

It's bimodal. AI can automate a lot of low level knowledge work, but as wide and deep as its knowledge is, it is also incredibly superficial when it comes to logic and creativity. What it's going to do is hollow out the middle class, as creative people who know how to wield AI will become wealthy while the majority of white collar workers are forced into trades.

February 16, 2024 at 2:38 AM

nopinsight

A major follow-up to GPT-4 later this year is rumored to be (far) superior to it at logical reasoning. What's likely to happen if that becomes real?

February 16, 2024 at 3:39 AM

CuriouslyC

That might let it encroach more into some fields like law where it's almost good enough already. Shitty time to be a junior lawyer, firms are going to hire and promote people not for their legal skills but for their ability to manage/attract clients.

In general though, I don't think the extra reasoning ability is going to enable it to displace that much more than it already will. GPT lives in a box and responds to prompts. When it's connected to multiple layers of real-time sensor data and self-directing, that'll be another story.

February 16, 2024 at 3:51 AM

nopinsight

From last week: OpenAI shifts AI battleground to software that operates devices and automates tasks

https://www.theinformation.com/articles/openai-shifts-ai-bat...

There were independent efforts to create AI agents since last year as well. AutoGPT and BabyAGI iirc. They didn’t go far probably because the LLM used was not good enough for that.

February 16, 2024 at 4:12 AM

Hoasi

> AI is automating the creative, intellectual work and leaving the rest to us.

Indeed, there is a risk it completely devalues creative jobs. That's ironic. Even if you can still use AI creatively, it removes the pleasure of creating. Prompting feels like filling Excel sheets while also feeding a pachinko machine.

February 16, 2024 at 4:13 AM

Der_Einzige

This was known for a long time: https://en.wikipedia.org/wiki/Moravec%27s_paradox

February 16, 2024 at 3:31 AM

dilap

Maybe we'll see a resurgence in live theater.

February 16, 2024 at 2:41 AM

multi_tude

Agree, plus performance art might finally hit the mainstream :)

February 16, 2024 at 1:24 PM

sho_hn

Just like it's far more likely for AI to replace middle-management and stream instructions to meat-bots than replace menial labor.

February 16, 2024 at 2:35 AM

mwigdahl

Sounds disturbingly like "Manna" (https://marshallbrain.com/manna1)

February 16, 2024 at 4:16 AM

croes

The problem: as long as people need money to live, every job is necessary and every automation is a threat.

February 16, 2024 at 3:08 AM

dingnuts

if it was actually AI, instead of a stochastic parrot, we could ask it to design robots that could do the manual labor that we still have to do, because we haven't been able to design robots to do the manual labor.

Unfortunately, LLMs aren't intelligent in any way, so you cannot ask them to synthesize any kind of second-order knowledge.

This is why they won't take away the creative work, either. They are fundamentally incapable of creating anything new.

February 16, 2024 at 3:38 AM

ldjkfkdsjnv

Blue collar workers have the last laugh

February 16, 2024 at 2:34 AM

throwup238

This is the beginning of the end for many of them too. Look at the opening line of the page:

> We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction.

Text-to-video is just the flashy demo that everyone can understand after exposure to text-to-image. Once the model can "simulate the physical world in motion" it's only a few steps away from generic robotic control software that can automate a ton of processes that were impossible before.

Humans still have the benefit of dexterity and precise muscle control but in the vast majority of cases robots can overcome those limitations with better control software and specialized robotic end effectors. This won't soon replace someone crawling under a house or welding in awkward positions, but it could for example replace someone who flips burgers or does manual labwork.

This could eliminate the limiting factor for automating many manual processes. (ruh-roh)

February 16, 2024 at 3:01 AM

prisenco

Plumbers keep winning.

February 16, 2024 at 2:54 AM

Xirgil

What happens when anyone can put on their AR headset and have AI diagnose and walk them through exactly how to fix their plumbing problems?

February 16, 2024 at 3:50 AM

prisenco

What happens when their AR headset gets wet?

Less glibly, no matter how good you are at following instructions, tearing out a wall filled with water that can destroy your home, fiberglass insulation that can damage your lungs and electrical wiring that can kill you will never be something I’d recommend a layman do. No matter how good the AI tutorials are.

February 16, 2024 at 4:01 AM

Vetch

Don't take tacit knowledge for granted.

February 16, 2024 at 4:14 AM

Drakim

Turns out the only jobs robots can't take are the ones where humans are specialized, such as cleaning staircases.

February 16, 2024 at 2:36 AM

theultdev

It's just cheaper to put humans on tedious physical tasks. See Amazon.

AI is cheaper than a high paid designer, developer, writer, etc.

A robot is more expensive than a human laborer.

It's really funny to see the squirm from those thinking truckers would be automated away, not them.

February 16, 2024 at 2:39 AM

reducesuffering

> A robot is more expensive than a human laborer.

Not when intelligence is cheap and highly abundant. Perfecting general robotics as an improvement on humans will be quick. The upper limit of strength and consistency is much higher.

February 16, 2024 at 2:45 AM

theultdev

I mean today, in the real world.

It is currently more expensive to build a robot for many tasks than it is to have a human do it.

> Perfecting general robotics as an improvement on humans will be quick.

It has not been nor is there any indication it will be.

February 16, 2024 at 3:10 AM

hansonkd

Today in the real world AI can replace very little of designers, programmers, etc. Lots of potential and extrapolation, sure. but hasn't happened. What has actually been produced by AI has been panned as not quite ripe yet.

Same with robotics. Lots of potential, but hasn't happened yet. If you read the description, Sora is based on trying to simulate the physical world to solve physics-based problems. Something that would be perfect for the next leap in robotics.

February 16, 2024 at 3:57 AM

theultdev

I used to pay designers for artwork, now I just use AI.

There's no physical task where robots have replaced humans for me.

Hell, even the roomba sucks (pun intended) and my wife has to pick up the slack.

February 16, 2024 at 4:18 AM

nogridbag

Haven't you seen Migo Robotics? :)

https://www.youtube.com/watch?v=kCKN8k-OFG8

February 16, 2024 at 2:42 AM

karmasimida

Think about it. Sora demonstrates that AI can understand real world physics to a scary degree.

If you use Sora-like models to imagine what actions need to be taken and then realize them, well, the only thing left is to create an arm/fingers that can take action, and then you are done.

February 16, 2024 at 4:22 AM

bsza

Machines have replaced a lot of blue-collar jobs alright. It's just that most of it happened during the Industrial Revolution, so we aren't even aware of all the shitty (and not-shitty-but-obsolete-nonetheless) jobs that used to exist.

February 16, 2024 at 5:04 AM

ajmurmann

Is it automating the creative part of the work or the mechanical part of creative work?

February 16, 2024 at 2:45 AM

whstl

It’s automating a big chunk of the money-making part of creative work.

February 16, 2024 at 2:47 AM

Hoasi

It's automating some of the craftsmanship part, which is substantial, but in a sense, it also threatens the creative part.

It's already very tempting for large entertainment businesses to create lazy remakes as it involves less risk. Automating creative jobs will create a shift at the production level but also on the receiving end: the public.

February 16, 2024 at 4:30 AM

zemo

that would never happen because someone owns the robots and rich people can afford more robots than poor people and rich people aren’t rich people if poor people aren’t poor

February 16, 2024 at 3:03 AM

golol

Come on, don't you see that the capability to understand the physical world that Sora demonstrates is exactly what we need to develop those household robots? All these genAI products are just toys because they are technology demonstrators. They're all steps on the way to AGI and androids.

February 16, 2024 at 3:41 AM

CooCooCaCha

No. Because sensorimotor control is a completely different ballgame and AI tech for that is far behind these models.

February 16, 2024 at 9:57 AM

golol

sensorimotor control is imo not at all the bottleneck. Teleoperated androids could do lots of useful things right now, but the AI is lacking to automate them.

February 16, 2024 at 1:35 PM

CooCooCaCha

“The ai is lacking to automate them”

Yes that’s my whole point…

February 16, 2024 at 10:07 PM

golol

well, let's say you want to make the coffee and we split that task into roughly two subtasks. The first is to imagine what motions are necessary to do that. How does the coffee cup have to move, how does your hand have to move to grasp it, etc. The second part is to find a way to use your muscles or actuators to execute those imagined actions.

I claim that the first part is the more difficult one and where we have the bottleneck currently. Furthermore, generative video AI is exactly the kind of thing that would give a model an understanding of what kinds of things have to happen in order to make coffee.

February 17, 2024 at 7:06 PM

dkjaudyeqooe

Somehow, according to that logic, and in general the logic of all AI danger hysteria, humans have no agency in determining the limits of what AI is fed and of its use and abuse.

February 16, 2024 at 2:43 AM

diputsmonro

Some humans do - the investors and executives in AI tech companies (and the legislators who theoretically could regulate them), who all stand to make a lot of money from every one of the "AI danger hysteria" scenarios, and are therefore highly motivated to bring them to fruition.

The rest of us have no choice. Despite millions of artists, animators, etc. all being resoundingly opposed to AI art, the models that infringe their work are still allowed to exist, and it seems they're fighting a losing battle.

A lot of people are being "hysterical" because a lot of people don't have a choice.

To be clear, the problem of these scenarios is tightly intertwined with the problem of unfettered capitalism and wealth inequality in general. Food and shelter require money, and we get money by working a job. If millions of jobs disappear overnight, then of course millions of people are going to be distressed over no longer having ready access to food and shelter.

The idea of "just getting another job" doesn't scale to the destruction of entire industries employing tens of millions of people. This is how depressions are made.

The idea of "the depression will end someday" is not only not necessarily true as wealth inequality skyrockets, but is also cold comfort to the people who will lose their houses and for some, lives, due to the disruption.

A different economic system could perhaps allow us to appreciate these technological advances without worrying about them displacing our ability to live. But the American political system consistently and firmly rejects any ideas not rooted in social darwinist capitalism.

For your sake, I hope your resume is very impressive.

February 16, 2024 at 3:07 AM

visarga

If millions of jobs disappear overnight it means AI is amazingly good, which means people will also have AI empowerment on a whole new level as open source trails companies by 1-2 years. Everyone will just order their AI "take care of my needs", maybe work along with it. You got to agree that we already have some amazing open models and they are only getting better - that empowerment will remain with us in times of need.

"Companies employing people" will be replaced by "people employing AI". Open models are free, small, fast, trainable and easy to use. They capture 90% of the value at 10% the cost, and are private.

February 16, 2024 at 3:59 AM

diputsmonro

"Companies employing people" getting replaced by anything is pretty dangerous in an economic system where employment is synonymous with having food and shelter. It won't matter that AI could help me keep a to-do list or generate pretty videos if I don't have a job or income.

What we're looking at is a massive decrease in the relative economic value of the average human's work. If the economic value of a hundred people is less than what the company can produce with a single human operator running AI models, then those 100 people are economically worthless, and don't get to eat.

We drastically need to tax the usage of AI models on the huge windfall they're about to create for their operators, and use that to fund universal basic income for those displaced. Generally speaking, as automation and wealth disparity skyrocket, UBI will be required to maintain any semblance of the society we currently have. I am incredibly pessimistic about the chances of that happening in any real way though.

February 16, 2024 at 4:16 AM

visarga

We don't have any control because we don't trust each other. Prisoner's dilemma

February 16, 2024 at 3:56 AM

93po

Robotics is going to catch up extremely fast

February 16, 2024 at 3:09 AM

danavar

I would agree. While we are seeing all this creative work get automated by AI, how big of an impact would that really have on the economy?

Fully-functional autonomous driving will have a much larger economic impact - and that's just the first area where autonomous robots will come into our lives.

February 16, 2024 at 3:28 AM

neilk

https://www.smbc-comics.com/comic/sad-2

February 16, 2024 at 4:38 AM

supriyo-biswas

People on HN like to split hairs and make muddled juxtapositions about human rights and AI model capabilities. But this is something that people and governments around the world will have to reckon with very quickly, since, at the rate at which generative AI technology is advancing, there could be hundreds of millions of people who’re unemployed and have no way to find work.

The quickest way to address this would be an extremely high tax rate on any generative AI model, say 500%, while the government figures out what’s the best way to sustain an economy (such as UBI) with a diminishing set of consumers as more people are pushed towards unemployment.

February 16, 2024 at 2:37 AM

stale2002

I can run these models on my home PC.

How are you going to stop me from doing that?

Even the free and open source stuff will destroy industries and you can't confiscate everyone's consumer gamer PCs.

Taxing the big guys doesn't save creative industries. It's a lost cause.

February 16, 2024 at 4:34 AM

supriyo-biswas

Taxes are meant to capture some of the profit that is made by a business entity. You could use a local model, but if you sell some kind of product or service, the tax would be levied on you. Not declaring that properly, of course, is tax evasion :)

February 16, 2024 at 9:30 AM

BeFlatXIII

…and how would such tax evasion be proven in a court of law?

February 17, 2024 at 12:11 AM

stale2002

Most productive work will use AI to facilitate that work in some way. I'm already doing that with coding.

There isn't a way to "capture" the value from that.

Even if you aren't directly selling AI assets to someone else, people simply using AI themselves will still destroy industries.

Good luck confiscating everyone's graphics cards. The cat is out of the box already.

> the tax would be levied on you

No it wouldn't. AI is already everywhere. It's game over. You aren't going to be able to track basically anyone who is using local models or other AI.

February 16, 2024 at 12:15 PM

ndjshe3838

Exactly, you can’t put it back in the box

The only thing I can imagine is like limiting people’s compute power

But even then they’d just go do it in another country or use an online service based in another country

February 16, 2024 at 5:00 AM

pstorm

500%? So if the generative AI model created something worth $1m, you tax it $5m? How do you tax a technology anyways?

February 16, 2024 at 3:25 AM

cabalamat

I suspect what was meant is something like 500% VAT, where if a generative AI charges a customer $6, then $5 goes to the taxman and $1 to the AI company.

February 16, 2024 at 3:54 AM

mythz

Typical argument against technological progress "We should ban technology to stop it doing what humans can do in a fraction of time and resources".

I can see this creating an explosion of new content from aspiring filmmakers and storytellers, and cut scenes from game creators, who previously never would have had the budget or capabilities to see their ideas through to creation.

February 16, 2024 at 3:04 AM

wnc3141

If we had a safety net where career progression and time/money invested in training was unnecessary to sustain life, then maybe. Until then it feels like a bit of allowing a few people to plunder and own the collective output of millions.

February 16, 2024 at 3:37 AM

visarga

This moment seems like trade guilds revolting against free craftsmen. What AI is essentially doing is learning skills from people according to their works and then helping everyone according to their needs. It's more rad than open source.

This is not plunder, it is empowerment. Blocking generative AI would be a huge power grab for copyright owners. They want to claim ideas and styles, and all their possible combinations.

Gen AI need only ensure it never reproduces a copyrighted work verbatim. Culture doesn't work if we stop ideas from moving freely.

February 16, 2024 at 4:08 AM

wnc3141

I agree that preventing technology from dispersing generally prevents the creation of wealth. However, given our current economic structure, the downside in instability of a livelihood has dramatic effects on swaths of people who were unlucky enough to be disrupted: think of the dramatic costs of retraining, healthcare access, the high costs of diminished earnings, inability to accrue wealth and retire. Perhaps we could socialize these costs, but we don't and are unlikely to do so.

Another issue to look at is the lack of ownership of the tools of your trade. In a context where many use AI models to competitively produce, hosts of AI models essentially own the access to your trade - thereby able to charge a toll, or privilege certain behaviors for any who strive to make a living with these tools. (Of course this is happening now with plenty of software products.) The ultimate trajectory of this is not democratization of a toolset, but a transfer of wealth from labor to capital. And keep in mind that the labor share of income has been steadily declining for half a century.

The creation of wealth from AI ultimately depends on the strength of democratic and pluralistic institutions that safeguard ownership of your trade, democratized access to capital, and safeguards of welfare in the environment of creative destruction. Otherwise you wind up with the cotton gin.

February 16, 2024 at 9:50 AM

karpour

Artists should be able to choose whether their work gets used to train machine learning algorithms.

February 16, 2024 at 4:45 AM

CaptainFever

This is a very vague statement that covers both opt-in and opt-out.

February 16, 2024 at 10:57 AM

educaysean

We all stand on the shoulders of giants. Yes, I want artists and other creators to be compensated fairly for any work that they contribute into training datasets, but outside of that there is no moral responsibility AI creators should feel towards those whose potential careers would be impacted.

February 16, 2024 at 4:41 AM

chasing

> ...there is no moral responsibility AI creators should feel...

Yeah, this is why "AI creators" shouldn't be the ones unilaterally deciding how this all plays out.

February 16, 2024 at 4:57 AM

stale2002

They aren't. Every person is free to use AI or not.

Go blame your fellow consumers if you don't like the fact that they prefer AI.

These are choices that everyone makes. AI companies alone aren't forcing everyone to use their cool new tools. Instead, that's a decision that 10s of millions of people are making every day.

February 16, 2024 at 5:09 AM

ls612

“Many were increasingly of the opinion that they'd all made a big mistake coming down from the trees in the first place, and some said that even the trees had been a bad move, and that no-one should ever have left the oceans.”

February 16, 2024 at 4:14 AM

wewtyflakes

Does it not just shift where we (as people) perceive value? If the cost of content drops to effectively zero, it seems reasonable that we would not value it so highly. If so, it does not mean that people do not value anything, but it may mean we start associating value with new or different things. While this may disrupt industries, I do not think we have an ethical or legal duty to those industries to remain profitable.

February 16, 2024 at 3:46 AM

bbor

GREAT response imo, I’ll try to remember this concise phrasing. I think this highlights that people aren’t worried so much about changes coming to them as consumers, and are much more worried about what “industries no longer remaining profitable” means for them as a laborer.

Means for us :(

February 16, 2024 at 4:10 AM

Geep5

Your argument is used time and time again with technology’s progress.

February 16, 2024 at 2:32 AM

chasing

Yup! Technology is powerful. It impacts people's lives.

I love tech, but if you take the stance that it's okay to hurt people for the sake of technical progress, you get into some very dark and terrible places...

February 16, 2024 at 2:35 AM

CooCooCaCha

It doesn’t help that the tech industry is trying to make it seem black and white. Like you’re either endlessly optimistic and let tech run rampant or you’re a depressing doomer pessimist. We should reject this framing whenever possible.

February 16, 2024 at 2:37 AM

sekai

> okay to hurt people for the sake of technical progress, you get into some very dark and terrible places

Hurt is a very subjective word in this context. How many people do you think the invention of the steam engine hurt? Or electricity?

February 16, 2024 at 3:32 AM

joks

I think dismantling creative fields like this is completely different from automating manual labor in a way that makes humanity more prosperous. I don't see what the upside is of this -- it's not making creative work better, it's devaluing creative work and disenfranchising creatives.

February 16, 2024 at 4:15 AM

CaptainFever

What makes creative labour better and more deserving of protection than manual labour?

February 16, 2024 at 10:59 AM

stale2002

> I don't see what the upside is of this

The upside is that creative works are completely democratized.

Now, anyone, with very little effort is fully empowered to create creative works on their own and there is no barrier to entry.

Yes, empowerment and democratization harm people whose livelihood depends on disenfranchisement.

February 16, 2024 at 4:37 AM

hackerlight

> it's okay to hurt people for the sake of technical progress

That's a strawman. The real view is that protecting jobs that are made extinct by technology and automation is historically a bad idea because it leads to stagnation and poverty. It's better to let people lose their jobs, and for those people to find other jobs, while supporting them with a social safety net while they make the transition. Painful for them but unfortunately very necessary for a prosperous society.

February 16, 2024 at 2:51 AM

throwanem

> for those people to find other jobs, while supporting them with a social safety net while they make the transition

This is the part that no one is expecting to see actually happen, though. Without that addressed, your argument is sound but footless.

February 16, 2024 at 3:24 AM

Eisenstein

Instead of using this outrage and energy to push a political will to grant something that benefits everyone forever, we should use it to grant something that helps prop up a few people in dying industries so that they can stifle innovation which would lead to a creative revolution?

What no one is asking is: 'if this makes it easy for anyone to be an artist, a director, a musician... what are we going to get, and will it be worse than what we have now?'

February 16, 2024 at 4:08 AM

throwanem

> What no one is asking is: 'if this makes it easy for anyone to be an artist, a director, a musician... what are we going to get, and will it be worse than what we have now?'

Everyone is asking this.

But that's also not the only question. The one you're ignoring here is: If these tools enable one artist to do the work of a hundred, what happens to the other 99?

AI boosters have as yet offered no satisfactory answer for this question. Given the intimate involvement some of them have with politics at the national and global level, this absence constitutes reasonable grounds for suspicion that no answer is intended or forthcoming, and that suspicion is what's asking here to be addressed.

February 16, 2024 at 4:11 AM

Eisenstein

> If these tools enable one artist to do the work of a hundred, what happens to the other 99?

Not really -- as people have gotten more efficient at their jobs, we tend to just produce more/better things, not impoverish a bunch of people. If one person spends a day (8 hours) making a shoe by hand, and one person can make a shoe in an hour using a shoe making machine, then we don't have one less shoe maker, we have two people making 16 shoes a day. As an effect, shoes are now much cheaper, so they aren't only worn by rich people. If the one-shoe-per-day maker refuses to use a shoe making machine, he or she can upsell their 'hand crafted' shoes to rich people who want to distinguish themselves.

Believe me, I am not a 'free market fixes everything' person, at all, but in these cases, that is how it has worked since the industrial revolution. This is not a new process (automation making a task much more accessible/efficient) and this is not a new complaint (what happens to the people who made a living doing task).

Change is scary -- and everyone has the right to be afraid of an uncertain future, but I can't recall an instance of the regressive approach actually working to allay the fears of those who imposed it. Yet, we all see huge reminders of how our lives have been improved by making hard things easier and accessible to more people.

February 16, 2024 at 6:20 AM

throwanem

The argument as presented so omits even the possibility of harm being done anyone in this process as to seem as if it seeks to foreclose the thought at root.

It would not surprise me if anyone called this pollyannaish, or even Panglossian.

February 16, 2024 at 6:30 AM

Eisenstein

Can you explain yourself differently please? I have no idea what you mean.

February 16, 2024 at 7:01 AM

throwanem

You don't really touch at any point in your argument on even the possibility someone might be harmed, in the process of entire segments of the labor market being automated. Why is that?

February 16, 2024 at 7:14 AM

Eisenstein

It is assumed in anything with any kind of scale that harm will occur.

Did anyone get harmed when photography was used to supplant portraits? Did anyone get harmed when mail started getting sent by rail instead of horse? Did anyone get harmed when air travel became possible? Did anyone get harmed when we supplied electric power to homes?

I have an idea -- why don't you propose a solution to AI ruining creative jobs and we can apply that standard to it.

February 16, 2024 at 7:39 AM

throwanem

Price in the externality. The multiple of US GDP that OpenAI currently seeks in funding should certainly suffice to fund UBI, and if that slows down OpenAI's development of new capabilities, then that should still be preferable to the alternative of OpenAI being enjoined from doing business until that is done.

Of course you may respond that this is unrealistic, which it is; it requires a government capable of acting via regulation in defense of its citizens, and so nothing like it will be done.

February 16, 2024 at 8:48 AM

Eisenstein

I would love to have UBI. If AI fear gets that going I would be happy, but I must agree that is unrealistic.

February 16, 2024 at 9:39 PM

diputsmonro

The social safety net component of your idea is both extremely important and not at all likely in the modern ultra-capitalist, "even healthcare is socialist extremism" political atmosphere.

Maybe mass unemployment will create a sea change in that mentality, but most of the people whose opinions need to be changed will probably just laugh at "the elites" getting screwed over.

February 16, 2024 at 3:24 AM

cabalamat

It's a shame Andrew Yang isn't running this year, as his 2020 platform of UBI because of AI is looking very prescient.

February 16, 2024 at 3:57 AM

paxys

Is it? What is another example of a technological leap that made a certain class of workers redundant while also continually relying on the output of these same workers to be feasible in the first place?

The current batch of LLMs is in the same class of technological revolutions as Napster and The Pirate Bay. Immensely impactful, sure, but mostly because of theft of value from elsewhere.

February 16, 2024 at 2:37 AM

hansonkd

Isn't the Luddite movement an example?

The factories that replaced the artisans were only made possible by the work of the artisans forging the way.

February 16, 2024 at 2:51 AM

lewhoo

I don't think so. The main idea is that for AI to continue to develop, new data is needed. The skills of the Luddites were no longer needed.

February 16, 2024 at 4:25 AM

CaptainFever

New data can still be created using AI and curation, can't it? New works, incorporating AI or not, still enjoy copyright protections that one can monetize by selling access to that specific work.

February 16, 2024 at 11:01 AM

timdiggerm

Okay? That could just as easily mean this argument has been right all along.

February 16, 2024 at 2:59 AM

tomtheelder

I really don't think that's true. Essentially the argument is that these models are more or less just outputting the work of others. Work already done, not theoretical future work, which is what people usually criticize new technologies for.

The question here is really about whether it's sufficiently transformative, or whether that's even the right standard to be applied to generated media.

February 16, 2024 at 2:35 AM

s__s

The argument should be brought up every single time. Each major technological jump is a unique event completely different from the last.

AI is nothing like anything we’ve seen, and is truly unique in the dangers it poses to the world.

February 16, 2024 at 4:54 AM

hk__2

That doesn’t make it invalid. It’s a tough question, there’s no easy answer.

February 16, 2024 at 2:34 AM

fardinahsan146

Sorry no. If there was even the remotest possibility that everyone could be brought to the table, none of these would even exist.

Training a massive model like this is a risk, and no one is going to take that risk without some reward. You can complain OpenAI is getting too much of the value, but it's value that would have otherwise never existed. It's value.

February 16, 2024 at 2:47 AM

throw4847285

This has been shared before, but: https://pbs.twimg.com/media/FadzEwVWAAYEyRW?format=jpg&name=...

February 16, 2024 at 3:23 AM

wnc3141

Research on creativity and competition points to this. Essentially, creativity occurs when there is some expectancy of increasing competitiveness. However when the expectancy of value capture from your effort becomes less clear, or diminished, creativity stops altogether.

(as pointed out in the "Freakonomics" episode highlighting this research)

https://direct.mit.edu/rest/article-abstract/102/3/583/96779...

https://freakonomics.com/podcast/can-a-i-take-a-joke/

February 16, 2024 at 3:34 AM

CuriouslyC

Things that can be easily reproduced already have little value. The people who produce those things have adapted to focus on brand, and that's just how it's going to be from now on.

February 16, 2024 at 2:33 AM

sho_hn

Reminds me of an interview with a Korean pop music producer I watched 15 years ago.

South Korea had a high % of broadband penetration earlier than many Western countries, and as a result physical CD sales crashed very hard, and very quickly. So he asked himself, what's the most analog good I could sell? It's people. And went the pop idol / personality marketing route with great and lasting success.

February 16, 2024 at 2:37 AM

chasing

> Things that can be easily reproduced already have little value...

Nonsense. Also, my point is that it shouldn't be up to tech companies to unilaterally decide what has value.

February 16, 2024 at 2:37 AM

stale2002

It's not up to them.

Instead it is up to the consumers.

If consumers choose to give money to AI companies, and not to artists, then in the eyes of the consumer those artists do not have value.

February 16, 2024 at 4:41 AM

nicksrose7224

I don't think they're saying it's up to tech companies to decide what has value, more that the development of new technology itself ends up deciding for the rest of the world how things are valued.

It's been this way for 10,000 years since the invention of the wheel. New inventions change how things are valued by making it easier for people to do more work in less time.

February 16, 2024 at 4:26 AM

m_ke

There will just be 1000x more content, with most of it hyper personalized and consumed by individual users instead of by masses of people.

February 16, 2024 at 2:38 AM

Mockapapella

The creators who create media can also use these tools to create more media faster, as can novices. It's not like OpenAI literally eats the media, never to be shared with the world again.

February 16, 2024 at 2:36 AM

overthehorizon

I create media for a living, painstakingly creating stuff from scratch in 3D. This tool will not help me; it will help clients avoid ever having to contract me. The main beneficiaries of this are holders of capital.

February 16, 2024 at 4:16 AM

notimpotent

But doesn't this technology give you the same edge?

You can deliver more content, faster, cheaper.

February 16, 2024 at 4:45 AM

MrNeon

The issue here is thinking you, holding the knowledge to 3D model, are not also a holder of capital. Capital isn't just money.

February 16, 2024 at 4:50 AM

jsheard

Oh I see, they're not eating the media, just extorting the creators into paying OpenAI in perpetuity to use the tool derived from their own work, or face becoming uncompetitive with their peers who do use it. What if landlords, but for media creation, and they don't even have to pay for the land in the first place. That's fine then.

February 16, 2024 at 3:07 AM

Mockapapella

> pay a subscription to OpenAI in perpetuity in order to remain competitive with their peers

This is how technology works in general and should not be vilified. Someone comes up with a better way to do things (in this case bringing creative ideas to life) and charges a premium on top of that for their efforts. If the current wave of creators doesn't like it, then they should instead make something people want more than what their competition has to offer.

Either way, this is why local open source models are critical, so that everyone can benefit without needing to pay any single party.

February 16, 2024 at 3:17 AM

jsheard

If a company were founded tomorrow which allowed you to stream unlicensed TV shows and movies for a monthly subscription, undercutting Netflix's and Amazon's licensed streams, that wouldn't be described as "a better way to do things" just because their customers prefer it for being cheaper and easier because all the content is in one place. The difference between that and what OpenAI is doing is just degrees of abstraction. Either way they're deriving value from others' work without compensating them, and actively undermining the ongoing creation of the work they're appropriating, while simultaneously relying on the ongoing creation of that work to keep feeding their machine.

IP law has yet to decide whether my interpretation of the situation is correct in the legal sense, but I find it impossible to see "ChatGPT absorbs the work of writers/journalists and sells a superficially reworded version without attribution or compensation" as anything but theft obfuscated behind lots of fancy math. It's only going to get worse if LLMs end up displacing traditional search engines, so one day you'll publish an article and get exactly one impression from GPTBot which then turns around and figuratively copies your homework.

February 16, 2024 at 3:31 AM

diputsmonro

Forgive me for thinking that it may be difficult for independent artists to compete against the trillion-dollar groundbreaking plagiarism machine that is actively plagiarizing their work faster than they can produce original work, without consequence, and suffocating them under a deluge of generated works.

This is an extreme difference of scale, which does constitute a meaningful difference from prior technologies.

February 16, 2024 at 3:33 AM

Mtinie

It’s difficult for independent artists to live as independent artists today, even without the specter of a “trillion-dollar groundbreaking plagiarism machine”[0]. So far, we’ve still been producing original work, primarily because it’s what we do even when we’re not making money from it. It’s a blessing and a curse.

This is not to dismiss the concern. I simply wanted to state that artists will find ways to keep moving the creative bar forward.

[0] I really like this turn of phrase, thank you for sharing it.

February 16, 2024 at 3:56 AM

brigadier132

> extorting

That's not what extortion is. Stop abusing language.

February 16, 2024 at 3:31 AM

jsemrau

Interestingly a lot of movies flopped in 2023 not because of bad visuals, but because their writing was bad. Hence, I believe the demise of the movie industry is overstated. I can see completely new forms of entertainment coming out of this. Probably Youtube will be the biggest winner as the social network with the highest monetization and reach.

February 16, 2024 at 3:45 AM

strangescript

Meanwhile, I am going to take my horse and buggy down to the local blacksmith to get some work done...

February 16, 2024 at 4:16 AM

bottlepalm

You can't regulate it because it will just be outsourced to another country.

Nope, we are headed towards deflation. Families that need only a single worker to support everyone, and even support extended family, and less time working overall.

February 16, 2024 at 4:52 AM

airstrike

> Yeah, you just can't let

Who's "you"?

February 16, 2024 at 2:55 AM

VoodooJuJu

The middle class.

Automate away the lower classes all you want, just don't touch the white collar class, that's a heckin' nono.

February 16, 2024 at 2:59 AM

ComplexSystems

I don't disagree with your basic sentiment, but it's worth pointing out that, on some level, the *entirety of artificial intelligence* is not much more than a "cool technical trick."

February 16, 2024 at 4:15 AM

powera

I am getting sick of these "people can't be allowed to make their own nice things easily, because of a pugnacious (and very online) interest group that wants to keep getting money" takes.

February 16, 2024 at 3:03 AM

bbor

> capture the value of every piece of media ever created

In what way does “I have a computer that can make movies” mean “I have captured the value of every piece of media ever created?” What do you mean by “value”? In my biased view, this amazing new technology couldn’t possibly be a better time to fix our insane notions of property, intellectual or otherwise.

February 16, 2024 at 4:06 AM

vunderba

Are you against records? Because the technology to record songs and play them back at your leisure killed an entire industry of live performers / instrumentalists?

The call for live music drastically shrank when it became trivial for any business or residence to play music on command.

Are you against automatic language translation? I can positively guarantee that the training data that they used to be able to create significantly better translation models was not authorized for that purpose.

The entire translator industry has been steadily shrinking ever since the invention of automatic language translation.

Etc etc etc.

There are obviously two aspects of this complex social issue right now.

1. Whether or not the usage of publicly available media as training data is legal/ethical.

2. Whether or not the output of these types of generative systems (even if they're trained on "ethical" training data) which may result in the displacement of many jobs is legal/ethical.

I'm neither for nor against AI (LLM, diffusion, video, etc), but if you are going to take a stance, then you have to be consistent in your view.

You don't get to cherry pick - I don't want to see you using ChatGPT, Copilot, Stable Diffusion, DALL-E, Midjourney, Sora, etc.

February 16, 2024 at 5:35 AM

chasing

It's weird that a call for generative AI to be more equitable towards the people whose creative work powers it is being interpreted as somehow being against tech, against AI, or that I think technological advancement should never make jobs obsolete.

February 16, 2024 at 6:05 AM

moralestapia

>Yeah, you just can't let all media, all the cost and hard work of millions of photographers, animators, filmmakers, etc be completely consumed and devalued by one company just because it's a very cool technical trick.

Oh man, how I miss it when ice was hauled from the Arctic in boats.

February 16, 2024 at 2:46 AM

chasing

You recognize the difference, right? Modern freezers don’t rely on people shipping ice from the Arctic. Generative AI does rely on people continuing to create media.

February 16, 2024 at 3:05 AM

moralestapia

It doesn't anymore. It sucks, but that's what it is.

February 16, 2024 at 3:44 AM

chasing

Where do you think the training data comes from?

February 16, 2024 at 7:52 AM

moralestapia

came from

February 16, 2024 at 8:55 AM

chasing

comes from

February 16, 2024 at 12:38 PM

moralestapia

No, once the model is trained it doesn't need any new media.

February 16, 2024 at 9:09 PM

bsza

Do you feel the same about the hard work of knocker-uppers having been devalued by the invention of the alarm clock? Or is it just the (relatively) highly paid intellectual workers that "cannot be allowed" to be replaced with machines?

February 16, 2024 at 4:42 AM

wilg

It doesn't really matter, because if this is possible then it will not be exclusive to OpenAI for long. It's simply just something that can exist. There will be open source versions of everything lagging 1-2 years behind or something.

February 16, 2024 at 4:42 AM

thomastjeffery

They shouldn't get exclusive rights to ignore IP law. Instead, we should all get that right.

Copyright should have ended decades ago. It has accomplished nothing but harm.

February 16, 2024 at 3:19 AM

Krasnol

Never ever will there be everyone at the table. This is not how the Internet works. It is not how the world and humanity work. If OpenAI doesn't do it, the next big player will. China will. Maybe it'll soon not even need China because it'll be so easy to deploy.

There is no stop now. It's too late for that. Time to think about the full development and how we'll handle that. How we as people will be able to exist next to it. What our purpose in the world is supposed to be. What the purpose of "value" is. What the purpose of "economy" or "the market" is.

Exciting times.

February 16, 2024 at 4:21 AM

resolutebat

It's worth remembering that "intellectual property" is an entirely artificial and fairly recent construct. Humanity did fine for thousands of years without it, and I'm not going to shed too many tears if OpenAI blows it up.

https://www.gnu.org/philosophy/not-ipr.en.html

February 16, 2024 at 5:03 AM

dkjaudyeqooe

I think there is a 100% chance it will be regulated to address some of the points you raised. Copyright being essentially neutered won't work.

February 16, 2024 at 2:40 AM

niam

I see the validity of this concern in the short term, but long term I feel like this is a bit doomsday. I don't want anyone's livelihood to get shafted, but realistically I see this as lowering the barrier to creating videos / proofs of concept--which is a good thing (with a lot of caveats and asterisks).

February 16, 2024 at 2:40 AM

visarga

> just because it's a very cool technical trick

That's one big trick, almost magical.

February 16, 2024 at 3:54 AM

palmfacehn

Similar things were said about Internet piracy in decades past.

February 16, 2024 at 2:31 AM

seydor

They could pay people to capture it. They could buy out one of the stock video companies. This is not important.

February 16, 2024 at 4:20 AM

malermeister

The limitation is with capitalism, not with the technology. It's time we move on to post-scarcity communism, Star Trek style.

February 16, 2024 at 3:59 AM

mring33621

"Can't be allowed"

February 16, 2024 at 2:33 AM

abi

[flagged]

February 16, 2024 at 2:31 AM

torginus

I wonder why the input is always text - can't it be text, as well as a low quality blender scene with a camera rig flying through space, a moodboard, sketches of the characters etc.?

February 16, 2024 at 3:59 AM

thepasswordis

My guess is because the models were all trained on text. You could do as you say, but I think it would go: blender video {gets described by an AI into text}-> text prompt -> video.

February 16, 2024 at 4:03 AM

AbuAssar

Sora means picture or image in Arabic.

February 16, 2024 at 3:20 AM

geor9e

Looking forward to someone feeding it the first draft of The Empire Strikes Back https://www.starwarz.com/starkiller/the-empire-strikes-back-...

February 16, 2024 at 8:35 AM

zmk5

These samples look pretty amazing. I'm curious about the compute required to train and even deploy something like this. How would it scale to making something like a CGI Pixar movie?

February 16, 2024 at 2:23 AM

pradn

It's impressive, but I think it's still in the same category as even the best LLMs: the demos look good and they can be quite useful, but you can never quite trust them. You really can't just have an LLM write a whole report for you - who knows what facts it'll make up, what it'll miss? You really can't use this to generate video for work, who knows where the little artifacts are (it's easier to tell with video).

The future of these high-fidelity (but not perfect) generative AI systems is in realizing we're going to need "humans in the loop". This means designing to output human-manipulable data - perhaps models/skeletons/textures instead of whole output. Pixels are hard to manipulate directly!

As for entertainment, already we see people sick of CGI - will people really want to pay for AI-generated video?

February 16, 2024 at 6:00 AM

bufferoverflow

> The future of these high-fidelity (but not perfect) generative AI systems is in realizing we're going to need "humans in the loop"

Last weekend my 7 year old decided he wanted to make and sell a shirt with an image of a space cat shooting a laser gun. It took him like 1 minute to use free Dalle3 to make and choose an image. Then I showed him a website to remove the background. Then I showed him a tool to AI-upscale the image. Then we uploaded it to Amazon Merch, it got approved after a few hours, and now it's for sale on Amazon. It took us maybe 10 minutes of effort end-to-end. Involved no artists.

Funny enough, Amazon is full of AI-designed merch, there were like 7 pages of shirts with space cats with lasers.

February 16, 2024 at 6:09 AM

pradn

Oh sure, for "low end" applications, even this wave of generative AI is going to pull the rug under artists.

I'm talking about, say, art for video games and actual movies.

February 17, 2024 at 12:38 AM

speedgoose

I subscribe to Disney+ and some of the content is a lot less perfect than the Sora videos presented there.

February 16, 2024 at 6:05 AM

thomastraum

I am a CG artist and Director and this made me so sad. I am watching in horror and amazement. I am not anti AI at all, but being on the wrong side of efficiency, for the individual this is heartbreaking. It's so much fun to make CG and create shots, and the fact that it's hard (just like anything) is what makes it rewarding.

February 16, 2024 at 3:27 AM

Keyframe

Ex colleague then! I'm kind of glad I got out of it all now that I see all of this, but on the other hand it's also an amazing opportunity unfolding, as long as it's directable. What a great toolset! What used to take an army of people, freezing their asses off on location, working with actors... soon gone. Well, if you want it to be. On the other hand, look at what happened to imagery, concept art in general. For the most part it cheapened it. Turned it into this mass-produced, easily available thing that's not special anymore. Skills are still needed to produce exactly what you want, but the special flair is kind of gone. It will need way more energy and creativity now to stand out.

February 16, 2024 at 5:14 AM

superconduct123

I'm conflicted though because on the flip side it could open up filmmaking to way more people who don't have the skills/money/time

Like what if any artist could make a whole movie by themself without needing millions of dollars or hundreds of people

Similar to how you used to need a huge studio full of equipment to record music and now someone in their bedroom with a DAW can do it

February 16, 2024 at 3:59 AM

thomastraum

The point is that by doing, you become really good in creative fields. In any field. Prompting is not doing. What makes you a really good programmer? Writing code.

The pursuit of mastery is at the essence of any craft.

February 16, 2024 at 9:31 AM

Zelphyr

I can't help but worry that this will make it too easy to create movies and the product will be of much lower quality. There is precedent here in the music industry. A recent report came out that said that about 70% of music sales was catalog music, implying that people are buying less new music than old. I personally feel that's because the new music just isn't very good, and one of the reasons is that it's too easy to make and distribute music now.

February 16, 2024 at 4:31 AM

Solvency

That is a ridiculous take. Look at the absolute SEA of bottom-barrel content flooding every single streaming platform. For people at the top of the studio system, they are already living out their AI power trips, just in the meatspace.

The entire industry is already turning out terrible shit, but doing it by wasting hundreds of thousands of actors, production teams, and studio dollars in order to churn out that nonsense.

Meanwhile, there are millions of latent storytellers, who, for whatever reason (but primarily: not born into extreme wealth and nepotistic connections) could never express their ideas in motion/cinema at such ambitious scales.

By putting this power in the hands of actually talented writers and storytellers, you create a completely new market of potentially incredible works of art.

February 16, 2024 at 4:56 AM

mtlmtlmtlmtl

Sure. But you have to admit that you also create a new market of low effort garbage art. The question is which is bigger, and where the money will ultimately go.

February 16, 2024 at 6:37 AM

chefandy

"Things are already bad. How could you be mad about making it much easier to make things worse? Quality isn't compatible with today's business ambitions."

February 16, 2024 at 11:53 PM

gcanko

I think an important skill in the future would be just having good ideas. That's going to differentiate the winners from the losers

February 16, 2024 at 4:48 AM

JBits

I think it's worth remembering that all of these AI's work by having an unbelievably large number of weights. So many weights that it's all an uncontrollable black box. On the other hand, your work is all about having control, and I don't personally expect your work to lose value for this reason.

Another thing to think about is what the AI is designed to do. Without knowing the details, I would expect it to be trained to produce the 'most likely' output given the prompt. Consequently, I would think being inventive is against its design, and 'most likely' is effectively the same as 'average'.

February 18, 2024 at 10:44 AM

__loam

I think it's okay to be a bit anti ai lol.

February 16, 2024 at 4:58 AM

manuka

Why the terror? Your job will change a bit but won't be gone. You would guide the output and make prompts not with text but your own video CGI shorts to make things 100% to your liking and the AI will do the rest of the dirty work. Your productivity will grow and the quality of your work too. You would be able to make an AAA movie all by yourself on a laptop. Since everyone would be able to do the same, the fight for imagination and ingenuity in scripting and the artist's view would skyrocket. :) IMHO

February 16, 2024 at 4:02 AM

tasty_freeze

You are rather cavalier about other people's livelihoods. There will be budget for maybe 10% of the people currently employed, and yes, they will be making use of the new tools and they'll adapt. The other 90% are going to be doing doordash until they can figure out a new career.

February 16, 2024 at 4:12 AM

okrad

Initial displacement will happen and it will require time for society to adapt and new industries to mature. The printing press significantly reduced the cost of producing books and other printed materials, which led to a dramatic increase in the availability of books, literacy rates, and the spread of knowledge. This technological advancement didn’t just replace the scribes; it created new jobs in printing, publishing, book selling, and eventually led to the creation of new genres of literature.

February 16, 2024 at 7:35 AM
February 16, 2024 at 3:31 PM

tasty_freeze

Yes, in the long term the printing press brought many benefits. In the short term, a lot of people were out of work.

February 16, 2024 at 8:10 AM

manuka

Who lost their jobs to the printing press? The monks who were the only scribes back then? They got their time freed to spend it on other duties in the monasteries and mayhaps even more time to read other books rather than to scribe them. So the level of education grew even for them.

The same will be true for the FX artists, 3D artists etc. The level of their work will grow, they will spend less time on dull work and more on tinkering with tiny but more important things like ideas, emotions, art overall etc.

February 16, 2024 at 3:37 PM

ihumanable

The terror is because companies want to maximize profits and a great way to do that is to minimize costs.

If you have a team of X people producing Y pieces, and now X people can produce 10Y pieces, everything is fine as long as the demand for pieces keeps up. But if your company really only needs Y pieces or really any amount less than 10Y then the easiest thing for a company to do is go, "We don't really need X people, let's fire some"
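A rough sketch of the arithmetic above, with made-up numbers. The team size, the output figures and the `headcount_needed` helper are illustrative assumptions, not data from any company:

```python
import math

def headcount_needed(demand_pieces, pieces_per_person):
    """Smallest whole number of people who can cover the demand."""
    return math.ceil(demand_pieces / pieces_per_person)

team = 20                 # X people (hypothetical)
output_before = 100       # Y pieces per period (hypothetical)
per_person_before = output_before / team      # 5 pieces each
per_person_after = per_person_before * 10     # the assumed 10x productivity gain

# Demand stays at Y: only a tenth of the team is needed.
flat_demand = headcount_needed(output_before, per_person_after)
# Demand grows to 10Y: the whole team is still needed.
grown_demand = headcount_needed(output_before * 10, per_person_after)

print(flat_demand, grown_demand)
```

Anything between Y and 10Y lands somewhere between those two headcounts, which is the case the paragraph above is worried about.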

Getting fired, in America at least, means loss of healthcare, income, and if it persists long enough housing. Most people are terrified of being homeless, broke, and without access to medicine.

February 16, 2024 at 4:17 AM

manuka

> as long as the demand for pieces keeps up

So the problem is not in the AI but in demand...

February 16, 2024 at 3:32 PM

ihumanable

AI causes the supply and demand to change by creating additional supply of pieces through increased productivity.

It's cold comfort to someone getting fired to tell them "If demand had also increased 10 fold you wouldn't have to sleep on the street."

The actual living human being who has had their livelihood destroyed probably isn't any less scared of their fate because you cleverly tut at them and go, "In actuality the AI didn't do anything bad to you, it just created a glut of supply and the market demand didn't keep up."

February 17, 2024 at 4:20 AM

avisser

> Your job will change a bit but won't be gone.

> You[r] productivity will grow

These aren't compatible at scale. If productivity grows, there will be fewer people doing the job.

February 16, 2024 at 4:19 AM

VMG

Programmers are more productive than years ago and there are many more of them

February 16, 2024 at 4:33 AM

timeon

Sometimes it looks like the peak is ending. Who knows.

February 16, 2024 at 6:04 AM

charlotte-fyi

Many people consider what you refer to as "the dirty work" as precisely the point of creative practices.

February 16, 2024 at 5:06 AM

manuka

Depends on what you think is "dull work". I think there are many artists who would welcome some of the "creating work" being automated. What part? Depends on the artist and his preferences. AI can take the burden of any type of work and leave those parts which are needed for the human to do. The human can choose what parts he will work on. That's the point.

February 16, 2024 at 3:36 PM

corobo

Oooh this is gonna usher in a new wave of GPT wrappers!

If anyone's taking requests, could you do one that takes audio clips from podcasts and turns them into animations? Ideally via API rather than some PITA UI

Being able to keep the animation style between generations would be the key feature for that kind of use-case I imagine.

February 16, 2024 at 3:43 AM

ij09j901023123

Apple vision pro + OpenAI entertainment on the fly + living in a tight pod next to millions of other people, hooked onto life support. A wonderful matrix fantasy

February 16, 2024 at 5:46 AM

quadcore

HN server runs smoothly and is having a walk in the park it seems - impressive compared to previous OpenAI announcements. Have there been significant rollouts?

February 16, 2024 at 6:50 AM

system2

Instead of 1 core 2 GB RAM, they now have 2 core 4 GB RAM so it is running okay now.

February 16, 2024 at 9:56 AM

epberry

These look fantastic. Very slight weirdness in some movement, hands, etc. But the main thing that strikes me is the cinematic tracking shots. I guess that's why they use "scenes". It doesn't seem like a movie could be generated with this involving actors talking.

February 16, 2024 at 2:33 AM

0xcb0

Wow, feels unreal. Can't believe we have come so far, yet we cannot solve the world's most basic problems and people still starve each day.

February 16, 2024 at 5:25 AM

bscphil

Not that this isn't a leaps and bounds improvement over the state of the art, but it's interesting to look at the mistakes it makes - where do we still need improvements?

This video is pretty instructive: https://cdn.openai.com/sora/videos/amalfi-coast.mp4

It "eats" several people with the wall part of the way through the video, and the camera movements are odd. Strange camera movements, in response to most of the prompts, seem like the biggest problem. The model arbitrarily decides to change direction on a dime - even a drone wouldn't behave quite like that.

February 16, 2024 at 5:09 AM

justinl33

Technical report here: https://openai.com/research/video-generation-models-as-world...

February 16, 2024 at 10:17 AM

cod1r

OpenAI is definitely cooking

February 16, 2024 at 2:17 AM

pantulis

This is the harbinger that announces that, as a technologist, the time has come for me to witness more and more things that I cannot understand how they work any more. The cycle has closed and I have now become my father.

February 16, 2024 at 4:44 PM

mihaic

The difference is that now nobody really "understands" what's going on, it's just that some know how to build these.

February 16, 2024 at 6:08 PM

psychoslave

How is that new? People built a gnomon, a stick was thrust into the soil and ta-da. No doubt it happened far before any writing system was out there. So it still took humans quite some time to come up with a compelling heliocentric model to cast some graspable explanation of it all, even if you take Aristarchus of Samos as a pioneer in this field.

February 16, 2024 at 9:34 PM

richardwhiuk

It's new for computing.

February 16, 2024 at 10:17 PM

psychoslave

Ok, maybe from some perspective I’m with you here. There are things happening that no one, even those on the edge of the fringe, can understand anymore, even while they work. Or at least that is how it seems from my narrow perspective on AI.

On the other hand, I don’t feel like you need to know how a compiler works, let alone the hardware architecture it targets, before you can go through your first hello world program or even build some useful software on top of frameworks/libraries treated as black boxes. So "I have no idea what I’m doing" in this perspective is probably as old as CS/informatics.

February 16, 2024 at 10:26 PM

richardwhiuk

There's a huge difference between "I don't understand how X works", and "Nobody understands how X works".

Also, every single abstraction is leaky, so often it's a difference between "I don't need to know how X works now", and "I can never find out how X works because it's simply not knowable".

February 20, 2024 at 2:23 AM

Exuma

My dad is 80 and willingly loves to listen to me explain how neural networks work, then he also read about them, busy beaver functions, kafka, and all kinds of crazy shit I tell him about. This is all in your mind. You are as young as your mind is.

February 16, 2024 at 8:32 PM

twosdai

Not the original poster, but the more frightening part of the sentence, is the "not understanding how something works part" over the "becoming my father"

Getting to a point where realistically you're not able to know something deeply but then still use it is pretty frightening.

When I say deeply I don't necessarily mean that for every device you need to know about all of its atoms, but to have a pretty good framework for how the thing works deterministically, and how it can fail.

February 16, 2024 at 10:56 PM

thoughtpeddler

> Getting to a point where realistically you're not able to know something deeply but then still use it is pretty frightening.

This now applies to most things in modern industrial society. We operate our daily lives at a crazy high level of abstraction. I think for a lot of us on HN, we "know too much about what we don't know", and that is ... overwhelming.

Funny enough, most people are actually able to operate at these higher levels of abstraction without worrying too much, because they don't know enough about what they don't know.

February 17, 2024 at 7:15 AM

pantulis

> Not the original poster, but the more frightening part of the sentence, is the "not understanding how something works part" over the "becoming my father"

That was my point, exactly.

February 17, 2024 at 1:51 AM

semi-extrinsic

So unless you have a solid grasp of quantum mechanics and solid state physics, using any electronic device is frightening?

February 17, 2024 at 1:58 AM

megamix

Thankfully it's nothing magical. But are you willing to learn about it or not?

Think about animation, how a program can generate a sequence of a bouncing ball between two key frames. Think about what defines a video. The frames right? From there I can try to imagine.
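The bouncing-ball intuition can be sketched as plain linear interpolation ("tweening") between two key frames. This is classic animation code, not how a learned video model like Sora produces frames; the coordinates and function names below are made up for illustration:

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def inbetween_frames(key_a, key_b, steps):
    """Return steps + 1 positions from key_a to key_b, both key frames included."""
    frames = []
    for i in range(steps + 1):
        t = i / steps
        frames.append((lerp(key_a[0], key_b[0], t), lerp(key_a[1], key_b[1], t)))
    return frames

# Two hypothetical key frames: ball up in the air, then on the ground further right.
start = (0.0, 10.0)
end = (4.0, 0.0)
frames = inbetween_frames(start, end, 4)
for x, y in frames:
    print(f"x={x:.1f} y={y:.1f}")
```

A video is then just such a sequence of frames played back fast enough; a generative model has to produce every frame's pixels rather than one position per frame.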

February 16, 2024 at 4:56 PM

pantulis

> But are you willing to learn about it or not?

This is the key. I have enough curiosity to want to learn the stuff from the ground up, just as I did with other technologies. But man do I have the stamina today? Not so sure!!!

February 16, 2024 at 8:22 PM

quonn

This book is great: https://www.penguinrandomhouse.com/books/730887/understandin...

It's comparatively easy to understand and it does cover everything from basic networks to LLMs and Diffusion models.

February 16, 2024 at 6:59 PM

coldfoundry

Thanks for putting this into words. It's a very off-putting feeling for me, and I couldn’t exactly figure out what that feeling was. It both scares me and excites me in a way that only makes me subconsciously anxious. Time to deep dive before I become what I always feared, which is being technologically left behind.

February 16, 2024 at 8:46 PM

dovyski

This comment describes with precision what I was feeling and was unable to name or frame. Marvelous times for sure.

February 16, 2024 at 6:48 PM

ab_entropy

This is likely a wild guess on my part but I've faced a similar feeling lately. If this comes from the realm of Webdev, React, SSR and all the F'ing acronyms that we need to learn today and you want to feel like you've "caught on": My advice would be to avoid NextJS at all costs. It's too bleeding edge.

Opt for a sane option instead to get started, likely one of these: (Astro, SvelteKit or Remix).

February 16, 2024 at 9:14 PM

adroniser

Lol there's a massive difference between a framework that generates javascript, a language which has existed for 30 years at this point, and a magic LLM that no one on earth understands the internals of.

February 17, 2024 at 1:56 AM

ddano

It is all just a mindset and how much you want to be involved.

Here is an inspirational story for you: https://news.ycombinator.com/item?id=39288139

February 16, 2024 at 9:55 PM

georgespencer

I'm on the very cusp of this, you helped me realize. Thanks.

February 16, 2024 at 5:01 PM

gzer0

Truly stunning. Waiting on the research paper, says will be published (soon). Can't wait to read on the technical details.

February 16, 2024 at 2:36 AM

VladimirGolovin

I did not expect this level of quality in the beginning of 2024. Makes me think that we may see AGI by the end of this decade.

February 16, 2024 at 3:18 AM

billiam

I find creepy things in all the videos, despite their breathtaking quality at first glance. Whether it is the way the dog walks out into space or the clawlike hand of the woman in Tokyo, they are still uncanny valley to me. I'm not going to watch a movie made this way, even if it costs me $0.15 instead of $15.00. But I got tired of Avatar after watching it for 20 minutes. Maybe all the artificial abundance and intellectual laziness of the generative AI world will make us realize how precious and beautiful the real world is. For my kids' sake, I hope so.

February 16, 2024 at 10:03 AM

cheschire

Sure, but imagine using this as a generative-fill to augment a movie, not just making an entire movie from it. We've seen fantastic homemade movies from very talented artists before. Now imagine if mostly talented artists could do it too.

February 16, 2024 at 10:09 AM

tehsauce

It's fascinating that it can model so much of the subtle dynamics, structure, and appearance of the world in photorealistic detail, and still have a relatively poor model of things like object permanence:

https://cdn.openai.com/sora/videos/puppy-cloning.mp4

Perhaps there are particular aspects of our world that the human mind has evolved to hyperfocus on.

Will we figure out an easy way to make these models match humans in those areas? Let's hope it takes some time.

February 16, 2024 at 3:53 AM

throwaway4good

What’s the connection between this and high end game engines (like Unreal 5)? I would expect 3d game engines to be used at least for training data and fine tuning. But perhaps also directly in the generation of the resulting videos?

For example this looks very much like something from a modern 3d engine:

https://twitter.com/OpenAI/status/1758192957386342435

February 16, 2024 at 4:30 PM

kypro

They almost certainly trained on video game output and this is clearly bleeding into the style of some of these demos.

The SUV video for example looks very much like something you'd see in a modern video game which probably makes sense because most videos with that kind of perspective are going to be from video games.

I don't know how they would use game engines directly for training and fine tuning though. It would be far too labour intensive to render high quality scenes using a video game engine for every prompt.

February 16, 2024 at 6:31 PM

cush

Does OpenAI hang out with these kinds of features in their back pocket just waiting for a Gemini announcement so they can wait an hour and absolutely dunk on Google?

February 16, 2024 at 4:46 AM

gigglesupstairs

Looking at the scale of this announcement, it’s likelier that Google just preempted their announcement with their own.

February 16, 2024 at 5:52 AM
February 16, 2024 at 4:53 PM

ericzawo

Why can't AI take the non-fun jobs?

February 16, 2024 at 4:14 AM

ta8645

Why are you able to have a fun job, when another human has a non-fun job? Because you're more talented and have skills they lack. Same goes for AI versus you. You're just starting to feel what billions of other people have felt, for a long time.

February 16, 2024 at 9:03 AM

Pugpugpugs

Yeah but AI can't experience pleasure.

February 16, 2024 at 1:47 PM

ta8645

Customers care about results, not whatever pleasure it creates for the vendor.

February 16, 2024 at 1:52 PM

Pugpugpugs

Wow really? Thanks for your insight buddy.

February 16, 2024 at 2:08 PM

ta8645

Thanks, that means a lot, especially coming from someone who was commenting on the ability of AI to feel pleasure.

February 16, 2024 at 2:17 PM

Pugpugpugs

We're both saying "the current system is bad because the way it works will interact with ai to create negative outcomes" and you're saying "wow you're very stupid, here's how the system works." We're aware friend, that's the problem.

February 16, 2024 at 3:25 PM

ta8645

Wow really? Thanks for your insight buddy.

February 16, 2024 at 3:46 PM

karpour

Not a single line saying anything about training data.

February 16, 2024 at 4:38 AM

hoc

Every time OpenAI comes up with a new fascinating gen model it also allows for that bluntly eye-opening perspective on what a flood of crappy and unnecessary content we have gotten accustomed to having thrown at us, from blown-up text descriptions and filler talk to these kinds of vodka-selling commercial videos.

It's a nice cleansing benefit that comes with these really extraordinary tech achievements that should not be undervalued (after all it produces basically an endless amount of equally trained producers like the industry did in a - somehow malformed - way before).

Poster frames and commercials thrown at us all the time, consumed by our brains to a degree that we actually see a goal in producing more of them to act like a pro. The inflationary availability that comes with these tools seems a great help to leave some of this behind and draw a clearer line between it and actual content.

That said, Dall-E still produces enough colorful weirdness to not fall into that category at all.

February 16, 2024 at 1:06 PM

cuuupid

Not loving that there are more details on safety than details of the actual model, benchmarks, or capabilities.

> That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time.

"We believe safety relies on real-world use and that's why we will not be allowing real-world use until we have figured out safety."

February 16, 2024 at 2:26 AM

diputsmonro

Yeah, it would be way better if they just released it right away, so that political campaigns can use AI generated videos of their opponents doing horrible/stupid things right before an election and before any of the general public has any idea that fake videos could be this realistic.

February 16, 2024 at 2:59 AM

BeerAnthem

Let's make it safe by allowing only the government (the side we like) and approved corporations to use it.

That'll fix it.

February 19, 2024 at 9:01 PM

serf

you joke, but the hobbling of these 'safe' models is exactly what spurs development of the unsafe ones that are run locally, anonymously, and for who knows what purpose.

someone really interested in control would want OpenAI or whatever centralized organization to be able to sift through the results for dangerous individuals -- part of this is making sure to stymie development of alternatives to that concept.

February 16, 2024 at 3:39 AM

eggplantemoji69

Value is going to be higher for professions where the human essence is an essential component of the function. Or professions that are more coupled with physical reality…my hedge is probably becoming an electrician.

I’d imagine IRL no-tech experiences will be the new ‘escapes’ too.

Maybe I’m too idealistic about the importance of the human spirit/essence…whatever that actually is.

February 16, 2024 at 8:45 AM

tropdrop

I see many possibilities for commercials, demos... not to mention kids' animations, of course.

Actually, thinking of this from the perspective of a start-up, it could be cool to instantly demonstrate a use-case of a product (with just a little light editing of a phone screen in post). We spent a lot of money on our product demo videos and now this would basically be free.

February 16, 2024 at 3:19 AM

unleaded

how will the AI know what your product looks like? You probably already have CAD models, couldn't you import those into blender and make something in an afternoon or two?

February 16, 2024 at 5:16 AM

dragonwriter

> how will the AI know what your product looks like?

Training an embedding/LoRA on the product and using it with the base model, same as is done for image-generation models (video generation models often use very similar architectures to image generation models -- e.g., SVD is a Stable Diffusion 2.x family model with some tweaks.)
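For readers unfamiliar with LoRA, here is a minimal pure-Python sketch of the core idea: the base weight matrix W stays frozen and only a small low-rank pair (B, A) is trained, giving an effective weight W + (alpha / r) * B A. The matrices are toy numbers and the helper names are made up for illustration; nothing here reflects how Sora or any OpenAI product is actually fine-tuned:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * B @ A without modifying W."""
    delta = matmul(b, a)          # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / rank
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen base weight (3 x 3 identity) and a rank-1 adapter; all values are toy numbers.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [[1.0], [0.0], [2.0]]         # d_out x r
a = [[0.5, 0.0, 1.0]]             # r x d_in
w_adapted = lora_effective_weight(w, a, b, alpha=1.0, rank=1)

base_params = len(w) * len(w[0])                            # values in W
adapter_params = len(b) * len(b[0]) + len(a) * len(a[0])    # trainable values in B and A
print(w_adapted, base_params, adapter_params)
```

With real model sizes the adapter is a tiny fraction of the base weights, which is why a product-specific LoRA is cheap to train and ship next to a shared base model.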

Now, you may not be able to do this with Sora when OpenAI releases it as a public product, just like you can't with DALL-E. But that's a limitation of OpenAI's decisions around what to expose, not the underlying technology.

February 16, 2024 at 6:49 AM
February 16, 2024 at 6:39 AM

max_

This is amazing!

1. Why would Andrej Karpathy leave when he knows such an impressive breakthrough is in the pipeline?

2. Why hasn't Ilya Sutskever spoken about this?

February 16, 2024 at 4:23 AM

taejavu

No idea for your first question, but wouldn't the answer to the second be "NDAs and/or other legal concerns"?

February 16, 2024 at 6:25 AM

kevingadd

It's interesting how a lot of the higher frequency detail is obviously quantized. The motion of humans in the drone shots for example is very 'low frequency' or 'low framerate', and things like flowing ocean water also appear to be quantized. I assume this is because of the internal precision of these models not being very high?

February 16, 2024 at 3:10 AM

nbzso

The idea that prompting is a creative tool is utterly illogical. This will result in a ton of mediocre synthetic crap for corporate presentations and porn generating.

Contrary to the trends in SV, dehumanization of creative professions will result not in productivity boost but in utter chaos and as a result will add more time loss in production process.

I never liked Sam Altman in his Y years, now I know why.

Even with the "blessings" from the "masters" in Davos/Bilderberg, a bad idea is a bad idea. Maybe this will push World ID as a result, but is it necessary?

The current trends in tech are not producing solutions for a professional problem. With rare exceptions, this looks more and more as removing of human input and normalization of a society ruled by AI at any cost. So sad.

February 17, 2024 at 7:16 AM

wouldbecouldbe

It looks beautiful, however I thought openai's mission was creating AGI, not becoming a generative ai content supplier.

February 16, 2024 at 2:56 AM

rareitem

I used to think a few years ago that virtual reality/ai projects such as the metaverse wouldn't amount to anything big. I even thought of them as ridiculous. Even recently, I thought that GPTs and ai generated images would be the pinnacle of what this new ai wave would amount to. I just keep getting baffled.

February 16, 2024 at 4:28 AM

david_shi

If you draw a line from Pong (1972, or 52 years ago) to Sora, what does that imply for the quality and depth of simulations in 2076 (52 years in the future)?

Would we be able to perceive the differences between those and the physical world? I can't help but feel like there is a proof for the simulation theory possible here.

February 16, 2024 at 7:16 AM

steveBK123

Genuinely impressive.

I've always been a digital stills guy, and dabbled in video.. as a hobby. As a hobbyist, I always found the hardest thing is making something worth looking at. I don't see AI displacing the pleasure of the art for a hobbyist.

My next guess is the 80/20% or 95/5% problem is gonna be stuff like dialogue matching audio and mouth/face motion.

I do see this kind of stuff killing the stock images / media illustrator / b-roll footage / etc jobs.

Could a content mill pump out plausibly decent Netflix video series given this tool and a couple half decent writers.. maybe? Then again it may be the perpetual "5 years away". There's a wide gap between generating filler content & producing something people choose to watch willingly for entertainment.

February 16, 2024 at 8:12 PM

internetter

The watermark is interesting. Looks like it's unique for every video so they can trace it to the creator?

February 16, 2024 at 4:53 AM

noisy_boy

To those who are saying to look at this as a positive because it lets people unleash their creativity:

- This enables everyone to be creators

- Given that everyone's creativity isn't top notch, highest quality will be limited to the best

- So rest of us will be consumers

- How will we consume if we don't have work and there is no UBI?

February 16, 2024 at 10:00 AM

CaptainFever

If "Given that everyone's creativity isn't top notch, highest quality will be limited to the best", that implies the existence of professionals, which implies work.

February 16, 2024 at 10:49 AM

noisy_boy

I meant those that are proficient/creative enough to be creating top content using AI but if we take it further to AI using AI, then yes, it's AI all the way down.

February 16, 2024 at 6:06 PM

daxfohl

We won't and the world will go into a massive depression, destroying the market for AI produced garbage and staving off global warming for a few extra years in the process. So even better than UBI.

February 16, 2024 at 11:34 AM

drcwpl

Wow - "All videos on this page were generated directly by Sora without modification."

The prompts - incredible and such quality - amazing. "Prompt: An extreme close-up of an gray-haired man with a beard in his 60s, he is deep in thought pondering the history of the universe as he sits at a cafe in Paris, his eyes focus on people offscreen as they walk as he sits mostly motionless, he is dressed in a wool coat suit coat with a button-down shirt , he wears a brown beret and glasses and has a very professorial appearance, and the end he offers a subtle closed-mouth smile as if he found the answer to the mystery of life, the lighting is very cinematic with the golden light and the Parisian streets and city in the background, depth of field, cinematic 35mm film."

February 16, 2024 at 2:54 AM

imbusy111

I had a good laugh looking at the sliding and twisting legs in the "Girl walking in City" video.

February 16, 2024 at 2:27 AM

ummonk

I'm a little concerned that so many people in these comments say they wouldn't be able to tell that it's not real.

February 16, 2024 at 3:13 AM

kjqgqkejbfefn

Indeed @0:15, the right leg goes to the left and vice versa.

February 16, 2024 at 3:40 AM

geor9e

Today we scroll social media feeds where every post we see is chosen by an algorithm based on all the feedback it gets from our interactions. Now imagine years down the road when Sora renders at 60 fps, every frame influenced by our reaction to the prior frame.

February 16, 2024 at 11:44 AM

birriel

With the third and last videos (space men, and man reading in the clouds), this is the first time I have found the resolution indistinguishable from real life. Even with SOTA stills from Midjourney and Stable Diffusion I was not entirely convinced. This is incredible.

February 16, 2024 at 3:42 AM

justanotherjoe

What the f. What. I'm no AI pessimist by any means but I thought there were some significant hurdles before we get realistic video generation without guidance. This is nothing short of amazing.

It's doubly amazing when you think that the richness of video data is almost infinitely more than text, and requires no human-made data.

The next step is to combine LLM with this, not for multimodal, but to team up together to make a 'reality model' that can work together to make a shared understanding?

I called LLMs 'language induced reality model' in the past. Then this is 'video induced reality model', which is far better at modeling reality than just language, as humans have testified.

February 16, 2024 at 11:29 AM

doakes

This is super cool. So many innovations come to mind. But it makes me wonder what will come from having the ability to virtually experience anything we want. It'll take a while, but I'm hoping we'll eventually want to go outside more instead of less.

February 16, 2024 at 3:27 AM

danjoredd

Porn is about to get so much weirder

February 16, 2024 at 4:01 AM

totaldude87

What happens when humanity stops generating new content/recording new findings/knowledge etc? Are we at a place where whatever we had is enough knowledge for AI takeover?

Or are we heading towards a skynet-y future?

February 16, 2024 at 10:06 PM

sulayman1

As a counterpoint, I don't think that the average person has stopped taking pictures just because image generation models exist. Nor have people stopped pursuing other hobbies impacted by AI. We don't go to museums to look at AI art that was created in 10 seconds and I doubt culture will shift to a point where that's commonplace. Human content will still be created, and we will probably see the general quality of that content increase as a result of foundational models. Content creation is taking what's in the mind and translating it into the physical/digital realm. With better AI, this translation becomes easier for a lot of fields and you no longer have to master the use of devices to make quality art. However, everyone can agree that prompt based generation is a lot less satisfying than making content from scratch. It feels more akin to a google search than a satisfying creative process. Those who are passionate and talented will continue to pursue their physical medium because of this.

The monetary value of generic stock content will surely drop and won't be created by professionals anymore. However, that doesn't mean people stop taking pictures of their dog just because they can get midjourney to generate the same thing. Creation for the sake of creation will continue. AI companies will initially reap in a lot of the $ value that used to go to the creators of stock content, but when open source models reach parity the masses will be able to make what's in their mind a reality as casual creators. Hobbyists will still exist and those that become truly great will still rise to notoriety.

February 16, 2024 at 10:34 PM

alkonaut

It's odd how the model thinks "historical footage" could be done by drone. So it understands that there should be no cars in the picture. But not that there should be no flying perspective.

February 16, 2024 at 7:03 PM

d4rkp4ttern

Mind blown of course.

Two things are interesting:

- No audio -- that must have been hard to add, or else it would have been there.

- Spelling is still probably hard to do (the familiar DallE problem)... e.g. a video showing a car driving past a billboard with specified text.

February 16, 2024 at 5:53 AM

slothtrop

My intuition is that training on audio will be trivial if they can accomplish this for video. Maybe I'm wrong.

February 16, 2024 at 5:58 AM

Sxubas

I wonder what this tech would do using a descriptive fragment from a book. I don't read many books at all but I would spend some time feeding in fantasy fragments and see how much they differ from what I imagined.

February 16, 2024 at 5:03 AM

throw310822

Very late so probably invisible to all, but is this just a byproduct of OpenAI's work on understanding of video input? The Google Gemini presentation video suggested that this is the next step-level of AIs. Already with GPT-4V, being able to converse with an AI about the contents of an image feels surreal. The applications that become possible with an AI that can just look at video streams are incredible.

February 17, 2024 at 8:07 PM

generagent

This is machine simulated art. It is not a convincing simulation to videographers, yet it pleases software architects and other non-visual artists. Aptitude for visual art making provokes envy in some who lack it. The drive to simulate art is almost as common as the desire to be recognized as a capable visual artist. The most interesting generative art I’ve seen does not attempt verisimilitude. Children want their art to look real. Verisimilitude is hard, especially for children and quasi AI.

February 16, 2024 at 8:42 PM

mywacaday

This is amazing and was to be expected. Are there any good solutions that can be used to prove a video is not generated? I guess in some ways we have come full circle and are back to trusting individual journalists and content creators. I just didn't think it would happen this fast.

February 19, 2024 at 3:50 AM

XCSme

If this can generate videos in real-time (60FPS), then you can, in theory, create any game just from text/prompts.

You just write the rules of the game and the player input, and let the AI generate the next frame.

February 16, 2024 at 8:44 AM

ShamelessC

Pretty unlikely this generates in real-time.

February 16, 2024 at 9:17 AM

XCSme

"Two papers down the line..."

February 16, 2024 at 9:29 AM

donsupreme

All current forms of entertainment will be impacted, all of them.

Except for live sporting events.

This is why I think megacorps are all going to bid for sports league streaming rights. That's the only one that AI can't touch.

February 16, 2024 at 11:12 AM

aurareturn

Any way to benefit economically from this trend?

February 16, 2024 at 12:15 PM

Zuiii

What goes around, comes around. I'm glad this is happening. Getty and friends should be driven out of business for the absurd stunt they pulled with image search.

Yes, I'm still bitter about that.

February 16, 2024 at 1:08 PM

pants2

Another step in the trend of everything becoming digital (in film and otherwise). It used to be that everything was done in camera. Then we got green screens, then advanced compositing, then CGI, then full realistic CGI movies modeled after real things and mocap suits. Now we're at the end game, where there will be no cameras used in the production of a movie, just studios of people sitting at their computers. Because more and more, humans are more efficient at just about anything when aided by a computer.

February 16, 2024 at 8:40 AM

al_borland

A bicycle for the mind.

February 16, 2024 at 9:34 AM

sylware

Hopefully we will see AIs with tools which are not "paint" or "notepad", but a maths formal proof solver, etc.

But I have a problem: I am unable to believe the videos I saw were dreamt by AI. I can feel deeply that I do believe there is some trickery or severe embellishment. If I am wrong, I guess we are at an inflexion point.

I can recall 10+ years ago, we were talking "in hacking groups" about AI because we thought the human brain alone was not good enough anymore... but in a maths/sciences context.

February 16, 2024 at 8:45 AM

uoaei

Visual sharpness at the expense of wider-scale coherence (see: sliding/floating walking woman in Tokyo demo or tiny people next to giant people in Lagos demo) seems to be a local optimum consistently achieved by today's SOTA models in all domains.

This is neat and all but mostly just a toy. Everything I've seen has me convinced either we are optimizing the wrong loss functions or the architectures we have today are fundamentally limited. This should be understood for what it is and not for what people want it to be.

February 16, 2024 at 2:32 AM

og_kalu

>Visual sharpness at the expense of wider-scale coherence (see: sliding/floating walking woman in Tokyo demo or tiny people next to giant people in Lagos demo)

Wider-Scale coherence is still much better than previous models and has consistently been improving. It's not "visual sharpness at the expense of coherence". At worst, the models are learning wider-scale coherence slower.

Not everything is equally difficult to learn so it follows that some aspects will lag behind others. If coherence weren't improving you might have a point but it is so...

February 16, 2024 at 2:46 AM

uoaei

Scaling laws operate in the limit but eventually practical considerations dominate. There's a lot we haven't yet fully appreciated about biological vision and cognition -- and indeed, common sense as regards sensible video generation and processing -- that have not made their way into this kind of model. NeRFs are interesting and I hope to see more from that side of things in the coming months and years.

February 16, 2024 at 4:17 AM

og_kalu

Nature is great and all but looking to it as an example of a lack of scaling and brute force is a bit ridiculous.

Your vision is hundreds of millions of years in the making.

February 16, 2024 at 4:59 AM

uoaei

Yes and in that time we've learned some important lessons that it would be unwise to ignore, e.g. comprehension of 3D geometry despite 2D input visual data.

February 16, 2024 at 7:09 AM

dsco

Did you just recreate the infamous DropBox comment? https://news.ycombinator.com/item?id=9224

February 16, 2024 at 3:02 AM

uoaei

That seems like quite the reach, but we will see if it really is just "all you need is scale".

February 16, 2024 at 4:15 AM

superconduct123

Seems like every big jump in improvement someone says "it's just a toy"

It's like we keep moving the bar

February 16, 2024 at 8:43 AM

cboswel1

Who owns a person’s likeness? Now that we’re approaching text-to-video of a quality that could fool an average person, won’t this just open a whole new can of worms if the training models are replicating celebrities? The ambiguity around copyright when something on paper is "in the style of" seems to fall into an entirely separate category from making AI-generated videos of actual people without their consent. Will people of note have to get a copyright on their likeness to fight its use in these models?

February 16, 2024 at 4:00 AM

Workaccount2

$100 on the table that studios create new celebrities that they own the rights to.

February 16, 2024 at 4:26 AM

hyperion2010

No need to take the bet, reality is already there. Miku is the endgame for idols. Forever young. Will never have a boyfriend. Always follows the script, or not when the team managing her decides they need a little drama. etc. etc. etc.

February 16, 2024 at 4:37 AM

void-pointer

This is the beginning of the end, folks

February 16, 2024 at 3:38 AM

M4v3R

> The model can also take an existing video and extend it or fill in missing frames

I wonder if it could be used as a replacement for optical flow to create slow motion videos out of normal speed ones.

February 16, 2024 at 3:40 AM

minimaxir

I do wonder why OpenAI chose the name "Sora" for this model. AI is now going to have intersectionality with Kingdom Hearts. (At least you don't need a PhD to understand AI.)

February 16, 2024 at 2:27 AM

starshadowx2

Sora means sky in Japanese, their reasoning is akin to "the sky's the limit".

> The team behind the technology, including the researchers Tim Brooks and Bill Peebles, chose the name because it “evokes the idea of limitless creative potential.”

February 16, 2024 at 3:00 AM

meitham

Sora is pictures or movie (visual) in Arabic!

February 16, 2024 at 2:28 AM

yogorenapan

Hear me out: Someone on the team is a fan of Yosuga No Sora

February 16, 2024 at 2:41 AM

Tiberium

I'm not the only one ;)

February 16, 2024 at 10:43 AM

hk__2

I’m confused as well because "sora" means "sister" in Neapolitan.

February 16, 2024 at 2:35 AM

GaggiX

I'm glad I'm not the only one to have thought of that; it's usually used for insults. I thought it was kinda funny.

February 16, 2024 at 4:50 AM

ristomatti

Obviously for its meaning in Finnish, "gravel".

February 16, 2024 at 4:51 AM

renewiltord

That's because it means AI Model in Wiltordian.

February 16, 2024 at 4:08 AM

pavlov

"Scene-Oriented Rank Adaptation"?

I have no idea, just guessing...

February 16, 2024 at 2:38 AM

xandrius

It also means up/upstairs in some dialect

February 16, 2024 at 3:20 AM

seabombs

All the examples feel so familiar, like I have seen them all before buried in the depths of YouTube and long-forgotten BBC documentaries. Which I guess is obvious knowing roughly how the training works.

I guess what I'm wondering is how "new" the videos are, or how closely do they mimic a particular video in the training set? Will we generate compelling and novel works of art with this, or is this just a very round-about way of re-implementing the YouTube search bar?

February 16, 2024 at 9:54 AM

seabombs

Maybe this was a big influence on the woolly mammoth example: https://youtu.be/EzzTX3DYMNs?si=WS28fsf5j6SBI1-7&t=15

Also interesting that some of the examples ignore details in the prompts. No clouds or sun in the sky, no depth of field, their hair isn't blowing in the wind.

February 16, 2024 at 10:16 AM

slothtrop

RE worrying about the future: what concerns me most is post-truth reality. Being thrown into a world where it's impossible to tell fact from fiction is insane and dangerous. Just thinking about it evokes paranoia.

We're nowhere near full automation. These are growing pains, but maybe the canary in the coal mine for the job market. Expect more enthusiasm for UBI or negative tax and the like, and policies to follow. Cheap energy is also coming eventually, just slower.

February 16, 2024 at 6:15 AM

nerdjon

It is honestly quite concerning just how good these videos look.

Like you can see some weird artifacts, but take one of these videos, compress it down to a much lower quality and with the loss of quality you might not be able to tell the difference based on these examples. Any artifacts would likely be gone.

Given what I had seen on social media I had figured anything remotely real was a few years away, but I guess not...

I guess we have just stopped worrying about the impact of these tools?

February 16, 2024 at 2:27 AM

senthilnayagam

Samples look amazing. Looking forward to access, and hope they price it competitively.

February 16, 2024 at 2:20 AM

treesciencebot

If we go from DALL-E 3, it won't be anywhere near competitive while they have the superior ground. Generating a high quality 1024x1024 image with costs around ~$0.002, but $0.08 on DALL-E 3 (20x more expensive per-image). For videos with very high computational needs (since each frame needs to be temporally consistent, you need huge GPUs to serve this) I'm expecting this to be so much more expensive than its competitors (Pika or SVD1.1)
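A quick check of the per-image arithmetic quoted above, as a minimal sketch. It uses only the two prices stated in the comment; the provider behind the cheaper figure is not named there, so the variable name `cheap_price` is a placeholder. Taken at face value, $0.08 against $0.002 works out to 40x, while the stated 20x would correspond to $0.04 per image.

```python
from fractions import Fraction

# Per-image prices exactly as quoted in the comment (USD).
# The provider of the cheaper figure is not named, so the name is a placeholder.
cheap_price = Fraction("0.002")
dalle3_price = Fraction("0.08")


def price_ratio(expensive: Fraction, cheap: Fraction) -> Fraction:
    """Return how many times larger the first price is than the second."""
    return expensive / cheap


ratio = price_ratio(dalle3_price, cheap_price)
print(f"Quoted prices differ by a factor of {float(ratio):g}")

# Price that would make the stated 20x figure hold.
price_for_20x = cheap_price * 20
print(f"A 20x gap would imply ${float(price_for_20x):.2f} per image")
```

Exact fractions are used instead of floats so the ratio is not blurred by binary rounding.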

February 16, 2024 at 2:30 AM

garfieldnate

The gold rush scene is the most captivating to me. The film style looks like it's from the 70's/80's (reminds me of Little House on the Prairie), but the footage is from a drone standpoint. I find it magically immersive in a time when none of the technology to make the shot would have existed.

February 17, 2024 at 12:28 PM

hownowbrowncow

Amazing.

One wonders how you might gain a representation of physics learned in the model. Perhaps multimodal inputs with rendered objects; physics simulations?

February 16, 2024 at 2:22 AM

ilaksh

Just lots of videos from Youtube probably.

February 16, 2024 at 2:24 AM

Tempest1981

The rendering of static on the TVs is interesting/strange. Must be hard for AI to generate random noise:

Video 7 of 8 on the 2nd player on the page.

> Prompt: The camera rotates around a large stack of vintage televisions all showing different programs — 1950s sci-fi movies, horror movies, news, static, a 1970s sitcom, etc, set inside a large New York museum gallery.

February 16, 2024 at 10:11 AM

Marwari

Videos don’t feel real, though this is the best thing I have ever seen on the topic of ‘text-to-video’. I am sure this will go far and become more realistic. But does this mean that we will not hire actors and creators, but will instead hire video editors who can stitch it all together and prompt writers who can create tiny videos for a story?

February 16, 2024 at 12:57 PM

dr__mario

I'd love to feel excited by all these advancements and somehow I feel numb. I get part of the feeling (worry about inequalities it may generate), but I sense something more. It's like I see it as a toy... I'm unable to dream on how this will impact my life in any meaningful way.

February 16, 2024 at 10:14 PM

boppo1

Imagine dumping all the HIPAA data into a process like this. Obviously fraught with privacy and accuracy[0] concerns. Nonetheless, it might help us move some things forward.

February 17, 2024 at 12:32 AM

anirudhv27

What makes OpenAI so far ahead of all of these other research firms (or even startups like Pika, Runway, etc.)? I feel like I see so many examples of fields where progress is being made all across and OpenAI suddenly swoops in with an insane breakthrough lightyears ahead of everyone else.

February 16, 2024 at 11:36 PM

packetlost

I wonder how much of a blocker to practical use the lack of things like model rigging or fine-tuned control will be. Clearly it can be used in toy examples with extremely impressive results, but I'm not entirely convinced that, as is, it can replace the VFX industry as a whole.

February 16, 2024 at 10:18 AM

ein0p

That actually looks borderline useful in practice. 3 years from now someone will make a decent full length movie with this.

February 16, 2024 at 5:48 AM

taejavu

Do we know anything yet about the maximum resolution of the output, or how long it takes to generate these kinds of examples?

February 16, 2024 at 4:37 AM

zamadatix

The technical report mentions that the training data was fed at up to 1920x1080 (allowing for a vertical 1080x1920 as well), so I'd guess that's why all of these videos were 1080p or lower; any larger and it probably gets wonky fast. I didn't see anything on absolute compute requirements and their impact on time to generate though.

February 16, 2024 at 9:16 AM

HEGalloway

This is a great technical achievement, but in a couple of years time this will be as interesting as AI image generators.

February 16, 2024 at 7:59 AM

tzm

"so far ahead" "leaps and bounds beyond anything out there" "This is insane"

Let's temper the emotions for a second. Sora is great, but it's not ready for prime time. Many people are working on this problem that haven't shared their results yet. The speed of refinement is what's more interesting to me.

February 16, 2024 at 3:09 AM

jibalt

Something odd happens with that Tokyo woman's legs. First she skips a couple of times, then her feet change places.

February 16, 2024 at 7:38 AM

TriangleEdge

I predict the word "disrupt" will see an exponential curve [1].

https://trends.google.com/trends/explore?date=all&q=disrupt&...

February 17, 2024 at 4:35 AM

synapsomorphy

Holy cow, I've literally only looked at the first two videos so far, and it's clear that this absolutely blows every other generative video model out of the water, barely even worth comparing. We immediately jumped from interesting toy models where it was pretty easy to tell that the output was AI generated to.. this.

February 16, 2024 at 3:55 AM

hooande

This really seems like "DALL-E", but for videos. I can make cool/funny videos for my friends, but after a while the novelty wears off.

All of the AI generated media has this quality where I can immediately tell that it's ai, and that becomes my dominant thought. I see these things on social media and think "oh, another ai pic" and keep scrolling. I've yet to be confused about whether something is ai generated or real for more than several seconds.

Consistency and continuity still seem to be major issues. It would be very difficult to tell a story using Sora because details and the overall style would change from scene to scene. This is also true of the newest image models.

Many people think that Sora is the second coming, and I hope it turns out to have a major impact on all of our lives. But right now it's looking to have about the same impact that DALL-E has had so far.

February 16, 2024 at 4:19 AM

mbm

Yeah, you really have to fast-forward 5 to 10 years. The first cars or airplanes didn't run particularly well either. Soon enough, we won't be able to tell.

February 16, 2024 at 4:30 AM

thorncorona

These limitations are fine for short form content ala reels / tiktok. I think the younger generations will get used to how it looks.

February 16, 2024 at 4:32 AM

MrNeon

> I've yet to be confused about whether something is ai generated or real for more than several seconds.

How did you rule out survivorship bias?

February 16, 2024 at 4:59 AM

gondo

This might be amazing progress, but I would never know as the website is consistently crashing Safari on my iPhone 13.

February 16, 2024 at 3:44 AM

bilsbie

I wish this was connected to chatgpt4 such that it could directly generate videos as part of its response.

The bottleneck of creating a separate prompt is very limiting.

Imagine asking for a recipe or car repair and it makes a video of the exact steps. Or if you could upload a video and ask it to make a new ending.

That’s what I imagine multi modal models would be.

February 16, 2024 at 4:23 AM

firefoxd

Now I can finally adapt my short story into a short film. All for however this thing will end up costing.

February 16, 2024 at 2:13 PM

Devasta

This technology is going to destroy society.

Want to form a trade union in your workplace? Best be ready to have videos of you jacking off all over the internet.

Videotape a police officer brutalising someone? Could easily have been made with AI, not admissible.

These things will ruin the ability to trust anything online.

February 16, 2024 at 9:51 AM

colordrops

Naw, if/when it gets to that, media won't be believed or admissible unless signed with someone's private keys or otherwise attested.
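To make the attestation idea concrete, here is a minimal sketch of tagging media bytes and later checking that they are unmodified. One loud assumption: it uses a shared-secret HMAC from the Python standard library as a stand-in, not a private-key signature. A real scheme of the kind described would use asymmetric signatures so that anyone can verify without holding the signing key. The key and the "video" bytes are made up for illustration.

```python
import hashlib
import hmac


def sign_media(media_bytes: bytes, secret_key: bytes) -> str:
    """Return a hex tag tied to both the key and these exact bytes."""
    digest = hashlib.sha256(media_bytes).digest()
    return hmac.new(secret_key, digest, hashlib.sha256).hexdigest()


def verify_media(media_bytes: bytes, secret_key: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_media(media_bytes, secret_key)
    return hmac.compare_digest(expected, tag)


# Hypothetical key and placeholder content, for illustration only.
key = b"hypothetical-camera-key"
video = b"\x00\x01 placeholder video bytes"

tag = sign_media(video, key)
print(verify_media(video, key, tag))            # untouched footage
print(verify_media(video + b"x", key, tag))     # footage altered after tagging
```

Any change to the bytes after tagging makes verification fail, which is the property an attestation scheme relies on. It says nothing about whether the original capture was genuine.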

February 16, 2024 at 9:53 AM

cchance

The scene of the train could easily be used as a transition scene in a movie. There's so much here; stock videos are gonna be f*cked in short order, and if they add composition and planning tools, and LoRAs, so will the movie industry.

February 17, 2024 at 2:37 AM

m3kw9

Impressive actually, I can actually see UI being generated in real time one day now.

You give it data like real time stock data, feed it into Sora, the prompt is "I need a chart based on the data, show me different time ranges"

As you move the cursor, it feeds into sora again, generating the next frame in real time.

February 16, 2024 at 3:17 AM

_virtu

In the future, we're not going to have common tv shows or movies. We'll have a constantly evolving stream of entertainment that's perfectly customized to the viewer's preferences in real time. This is just the first step.

February 16, 2024 at 3:07 PM

itissid

So what happens to the film industry now?

- Local/Bespoke high quality video content creation by ordinary Joes: Check.

- Ordinary joes making fake porn videos for money: Check.

- Reduce cost for real movies dramatically by editing in AI scenes: Check.

A whole industry will get upturned.

February 16, 2024 at 6:05 AM

justin66

Now that they’ve gone corporate, the OpenAI corporate motto ought to be “Because We Could.”

February 16, 2024 at 11:16 AM

landingunless

Wonder how the folks at Runway and Pika are thinking about this.

To me, it's becoming increasingly obvious that startups whose defensibility hinges on "hoping OpenAI doesn't do this" are probably not very enduring ones.

February 16, 2024 at 11:19 PM

majani

In the last few days I've been asking myself what would drive the next big leap in advertising efficiency after big data and conversion pixels. I think I have my answer now. This is going to disrupt the ad agency side of the business big time.

February 16, 2024 at 7:39 AM

lqcfcjx

This is very impressive. I know in general people are iffy about research benchmarks. How does it work to evaluate text-to-video types of use cases? I want to have some intuition on how much better this is than other systems like Pika, quantitatively.

February 16, 2024 at 6:39 AM

countmora

> We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.

I am curious of how optimised their approach is and what hardware you would need to analyse videos at reasonable speed.

February 16, 2024 at 4:53 AM

globular-toast

This might actually ruin video and films for me. I don't want to be looking out for AI giveaways in everything I watch.

I can see a new market for true end-to-end analogue film productions emerging for people who like film.

February 16, 2024 at 7:25 PM

danielbln

Eh, it's like watching out for VFX giveaways.

February 16, 2024 at 7:51 PM

timonoko

What is the first book you want to see a movie of? It should be verbatim and last a week, if needed.

I vote for Hothouse, by Brian W Aldiss. So many images need to be imagined, like spiders that jump to the moon and back again.

February 16, 2024 at 6:07 PM

jmfldn

Technically breathtaking, but why do these examples of AI-generated content always have a cheap clipart vibe about them? So naff and uninspired given the, no doubt, endless potential this technology has.

I also feel a sense of dread. Imagine the tidal wave of rubbish coming our way. First text, then images and now video can be spewed out in industrial quantities. Will it lead to a better culture? In theory it could, in practice I just feel like we'll be deluged with exponentially more mediocre "content".

February 16, 2024 at 6:47 AM

krisboyz781

OpenAI will be the most valuable company in history at this rate. This is insane

February 16, 2024 at 1:35 PM

dist-epoch

Totally a coincidence that it's announced immediately after the new Gemini reveal.

February 16, 2024 at 2:24 AM

stronglikedan

Timing is everything. Smart move

February 16, 2024 at 2:30 AM

wslh

Where is the link to try it? ChatGPT doesn't know anything about it:

"Sora" is not a video generation technology offered by OpenAI. As of my last update in April 2023, OpenAI provides access to various AI technologies, including GPT (Generative Pre-trained Transformer) for text generation and DALL·E for image generation. For video generation or enhancement, there might be other technologies or platforms available, but "Sora" as a specific product related to OpenAI or video generation does not exist in the information I have.

If you're interested in AI technologies for video generation or any other AI-related inquiries, I'd be happy to provide information or help with what's currently available!

February 16, 2024 at 6:17 PM

hyperhopper

Why would ChatGPT know about anything new?

February 16, 2024 at 6:20 PM

wslh

Marketing and Sales?

February 16, 2024 at 6:38 PM

stephenw310

The results are mindblowing, to say the least. But will they allow developers to fine-tune this eventually? OpenAI is still yet to give that ability to txt2img DALLE models, so I doubt that will be the case.

February 16, 2024 at 5:23 AM

helix278

They're attaching metadata to the videos which can be easily removed. Aren't there techniques to hash metadata into the content itself? I.e. such that removing the data would alter the image.
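One classic way to put data into the content itself is least-significant-bit watermarking. The toy sketch below is only an illustration of that idea under stated assumptions: the pixel values and payload are made up, the helper names `embed_bits` and `extract_bits` are hypothetical, and nothing here reflects how Sora or any provenance standard actually marks video. LSB marks are also fragile and generally do not survive re-encoding, so a practical scheme would need something more robust.

```python
def embed_bits(pixels: list[int], bits: list[int]) -> list[int]:
    """Hide one payload bit in the lowest bit of each leading pixel value."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the payload")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & 0xFE) | (bit & 1)
    return marked


def extract_bits(pixels: list[int], count: int) -> list[int]:
    """Read back the lowest bit of the first `count` pixel values."""
    return [value & 1 for value in pixels[:count]]


# Hypothetical 8-bit grayscale samples and a 4-bit payload.
frame = [200, 201, 202, 203, 204]
payload = [1, 0, 1, 1]

marked = embed_bits(frame, payload)
print(marked)
print(extract_bits(marked, len(payload)))
```

Because the payload lives in the pixel values, stripping it means changing the image, which is the property asked about above.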

February 16, 2024 at 4:30 AM

lagrange77

They should generate a video of Steve Jobs introducing this in a keynote.

February 16, 2024 at 5:22 AM

itissid

I've to go lie down...

February 16, 2024 at 5:13 AM

htrp

> All videos on this page were generated directly by Sora without modification.

I hope there is at least some cherrypicking here. This also seems like some shots fired at some of the other gen video startups

February 16, 2024 at 2:19 AM

palmfacehn

The example cat had two left forelegs.

February 16, 2024 at 2:29 AM

booleandilemma

Does Google have a competing product I can join the wait list for?

February 16, 2024 at 10:55 AM

kccqzy

No public access but they have Lumière: https://lumiere-video.github.io/

February 16, 2024 at 11:46 AM

alokjnv10

How will it affect the gaming industry? https://news.ycombinator.com/item?id=39393252

February 16, 2024 at 12:45 PM

notpachet

OpenAI: Prompt: The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope...

Sora: plays GTA V

February 16, 2024 at 5:49 AM

alex201

It's a revolutionary thing, but I'll reserve my judgment until I see if it can handle the real challenge: creating a video where my code works perfectly on the first try.

February 16, 2024 at 4:44 AM

gigatexal

I am genuinely impressed.

February 16, 2024 at 3:28 AM

anupamchugh

Wow. And just like that fliki.ai and similar products have been sherlocked. Great time to be a creator, not the best time to be a product developer, production designer

February 16, 2024 at 10:47 AM

chrishare

I am very uncomfortable with this being released commercially without the requisite defence against misuse also being accessible. If we didn't have a problem with deepfakes, spam, and misleading media before, we surely do now. All leading AI organisations are lacking here, benefiting from the tech but not sufficiently attacking the external costs that society will pay.

February 16, 2024 at 4:34 AM

wilg

What's "the requisite defence"?

February 16, 2024 at 4:35 AM

__loam

Something like a watermark (doesn't necessarily have to be visible to people) and a tool to detect that watermark might be nice for example. Or alternatively we could stop developing this hell technology and try to automate something that isn't cultural expression.

February 16, 2024 at 4:42 AM

wilg

Both of those are included and mentioned in the linked article.

February 16, 2024 at 4:50 AM

chrishare

Two things I would like - advances in detectors of generative content that does not do C2PA, and more transparency in what the usage policy means in practice.

February 16, 2024 at 4:34 PM

__loam

Two things I would like - algorithmic disgorgement and an apology to the human race.

February 16, 2024 at 7:53 PM

Palmik

I'm very uncomfortable with this technology being accessible only to a small and arbitrary subset of the population.

February 16, 2024 at 4:42 AM

2OEH8eoCRo0

Shall I get into the unemployment line now and beat the rush?

February 16, 2024 at 6:23 AM

jk_tech

This is bananas. This is ahead of anything else I've seen. The entire stock footage industry may be shut down overnight because of something like this.

And it is still not perfect. Looking at the example of the plastic chair being dug up in the desert[1] is frankly a bit... funky. But imagine in 5 or even 10 years.

1. https://openai.com/sora?video=chair-archaeology

February 16, 2024 at 3:13 AM

sebnun

This is amazing. My first thought was about the potential for abuse. Deepfakes will be more realistic than ever.

Also, nicely timed to overshadow the Google Gemini 1.5 announcement.

February 16, 2024 at 2:53 AM

sorokod

Just in time for the election season. Also "A cat waking up its sleeping owner demanding breakfast" has too many paws - yes I do feel petty saying this.

February 16, 2024 at 3:41 AM

matsemann

And the sleeper's shoulder gets converted to the duvet? And a strange extra hand somewhere. It was also the one that to me stood out as the worst. The quality was good, but it had the same artifacts as previous generations of AI videos where things morph.

February 16, 2024 at 3:58 AM

throwitaway222

What in the flying f just happened.

I guess we've all just been replaced.

February 16, 2024 at 3:51 AM

seydor

This inside VR goggles would be amazing. It probably wouldn't even need to render 360; it would generate it on demand. I better go get a feeding tube.

February 16, 2024 at 5:15 AM

bsimpson

That's the difference between Donkey Kong Country and the N64 (or perhaps between Pixar and Quake).

The amount of power needed to generate this can't be feasible for real time VR today. There's a reason even the company that invented (massive and free) Gmail is charging for its top tier generative AI.

February 16, 2024 at 5:20 AM

ij09j901023123

We thought programmers, fast food workers, and drivers would be automated first. Turns out, it's movie / video, actors, editors and artists....

February 16, 2024 at 5:29 AM

Pmop

We all are going to get automated out of the workforce together :)

February 16, 2024 at 5:35 AM

john2x

Silver lining in this I guess. If everyone realizes at the same time they're all f'd together, regardless of "skill", then maybe there's a chance we can all work together to save ourselves.

No chance to think "sucks for you, but I'm good here" like so often happens with other issues.

February 16, 2024 at 2:21 PM

hpeter

On one side, we have people who are upset because the creators of the videos in the dataset used for teaching this language model were not compensated.

On the other hand, people find the tech very impressive and there are a lot of mind blowing use-cases.

Personally, this opens up the world for me to create video ads for software projects I create, since I have no financial resources or time to actually make videos, I only know how to code. So I find it pretty exciting. It's great for solo entrepreneurs.

February 17, 2024 at 2:09 AM

SushiHippie

I find the watermark at the bottom right really interesting: at first it looks like random movement, and then in the end it transforms into the OpenAI logo.

February 16, 2024 at 3:39 AM

ulnarkressty

I do hope that they have a documentary team embedded in this company, like DeepMind had. They're making historical advancements on multiple fronts.

February 16, 2024 at 3:52 AM

supriyo-biswas

I wonder what served as the dataset for the model. Videos on YouTube presumably, since messing around with the film industry would be too expensive?

February 16, 2024 at 2:28 AM

tinyhouse

How would they access YouTube though?

February 16, 2024 at 2:30 AM

bori5

yt-dlp

February 16, 2024 at 10:21 AM

achr2

Almost certainly troves of stock footage. The type of exaggerated motion seen in these examples is very reminiscent of stock footage. And it is heavily textually annotated for search.

February 16, 2024 at 5:13 AM

tsunamifury

The film "The Congress" will end up being the most on-point prediction of our future ever. I can't believe it. I'm in shock.

February 16, 2024 at 5:07 AM

mring33621

I wanna see the rest of the knit hat spaceman movie!

February 16, 2024 at 2:34 AM

pcdoodle

Call me a Luddite but I don't want these videos hitting my retinas.

There should be an opt out from being subjected to AI content.

February 16, 2024 at 10:21 PM

TriangleEdge

Welp, goodbye internet, it was fun to know you.

February 16, 2024 at 12:34 PM

throwitaway222

How many of you think YT is looking through their logs trying to find a high burn rate of videos that might possibly be from Open AI?

February 16, 2024 at 4:25 AM

dartos

God the legs of the woman walking are horrifying.

February 16, 2024 at 5:05 AM

swayvil

It really makes me wonder if something like this is running inside my head.

The prompt tho. Probably not text. Probably a stream of vibes or something.

February 16, 2024 at 3:08 AM

ta8645

You're lucky if so. I have something closer to pong running inside my head.

February 16, 2024 at 9:13 AM

s-xyz

This is seriously insane, in particular as someone mentioned the quality of it. I can't wait to play around with this. SICK!

February 16, 2024 at 5:17 AM

layer8

The left hand of the Tokyo woman looks really creepy, especially from second ~20 onward. I guess some things don’t change. ;)

February 16, 2024 at 7:33 AM

pxeger1

Funny that this launched so soon after Gemini 1.5. I guess OpenAI have a strong incentive to dominate the media narrative.

February 16, 2024 at 7:18 AM

apexalpha

Wow. It's bizarre to see these videos.

Creating these videos in CGI is a profession that can make you serious money.

Until today.

What a leap.

February 16, 2024 at 4:34 PM

superconduct123

So do we think this is the "breakthrough" that was mentioned back when the Sam Altman stuff was going on?

February 16, 2024 at 4:50 AM

Janicc

I honestly expected video generation to get stuck at barely consistent 5 second clips without much movement for the next few years. This is the type of stuff I expected to maybe be possible towards the end of the decade. Maybe we really are still at the bottom of the S curve which is scary to think about.

February 16, 2024 at 2:30 AM

ericra

It's been said a thousand times, but the "open" in openai becomes more comical every day. I can't imagine how much money they will generate from such a tool, and I'm sure they will do everything possible to keep a tight lid on all the implementation details.

This product looks incredible...

February 16, 2024 at 4:32 AM

foobar_______

Feels like another pivotal moment in AI. Feel like I’m watching history live. I think I need to go lay down.

February 16, 2024 at 6:16 AM

velo_aprx

I don't think I like the future.

February 16, 2024 at 3:58 PM

pknerd

So no APIs yet?

February 16, 2024 at 2:30 AM

impulser_

This is good, but far from being useful or production ready.

It's still too easy to notice these are all AI rendered.

February 16, 2024 at 6:26 AM

mlsu

They must be using techniques from NeRF in here, maybe in tokenization? The artifacts are unmistakeable.

February 16, 2024 at 5:23 AM

xyproto

The big question is if it will be able to create a video of whisky without ice or a car without windows.

February 16, 2024 at 4:09 AM

0xE1337DAD

How far are we from just giving it a novel and effectively asking it to create a TV series from it

February 16, 2024 at 7:39 AM

selvan

Ad generation use cases are getting interesting with video generation + ControlNet + fine-tuning

February 16, 2024 at 12:28 PM

bilsbie

Could this same technology be used to make games? It seems like it has a built in physics engine.

February 16, 2024 at 4:06 AM

mushufasa

Do you think they announced this today to steal attention from the Google/Gemini announcement?

February 16, 2024 at 3:40 AM

crazygringo

No, corporate announcements are very much planned in advance. There's a lot of coordination that has to happen. This is just coincidence, unless one of the companies had inside information about the other's announcement and timing. But that's pretty unlikely.

February 16, 2024 at 3:43 AM

khazhoux

The focus here is on video motion, but I'm very impressed by the photorealistic humans.

February 16, 2024 at 3:48 AM

aggrrrh

Looking at it, in my opinion it just reinforces the theory that we live in a simulation

February 16, 2024 at 1:40 PM

kaonashi

quite the technical feat I suppose, but the actual result is nightmare fuel -- legs swapping places, people walking into simulacra of spaces -- just deeply unsettling uncanny valley stuff

February 17, 2024 at 10:23 AM

nomad86

Demo is always better than the real product. We'll soon see how it works...

February 16, 2024 at 3:28 PM

lacoolj

Total coincidence this comes out the day Google announces Gemini 1.5 I'm sure

February 16, 2024 at 4:10 AM

ummonk

Looked at the first clip and immediately noticed the woman's feet swap at ~15 seconds in. My eyes were drawn to the feet because of the extreme supination in her steps.

Looks like a dramatic improvement in video generation but still a miss in terms of realism unless one can apply pose control to the generated videos.

February 16, 2024 at 3:09 AM

jgalt212

These look like well done PS5 games. Which, of course, is a great achievement.

February 16, 2024 at 6:40 AM

ionwake

How long until there is an open source model for.... text to video?

Genuine question I have no idea

February 16, 2024 at 3:55 AM

partiallypro

These are insanely good, but there are still some things that just give them away (which is good, imo.) Like the Tokyo video is amazing, the reflections, etc are all great, but the gaits of people in the background and how fast they are moving is clearly off. It sticks out once you notice it. These things will obviously improve as time marches on.

The fear I have has less to do with these taking jobs than with the fact that eventually this is just going to be used by a foreign actor and no one is going to know what is real anymore. This already exists in news stories, now imagine that with actual AI videos that are near indistinguishable from reality. It could get really bad. Have an insane conspiracy theory? Well, now you can have your belief validated by a completely fictional AI generated video that even the most trained eyes have trouble debunking.

The jobs thing is also a concern, because if you have a bunch of idle hands that suddenly aren't sure what to believe or just believe lies, it can quickly turn into mass political violence. Don't be naive enough to think this isn't already being thought of by various national security services and militaries. We're already on the precipice of it, this could eventually be a good shove down the hill.

February 16, 2024 at 3:52 AM

kilbuz

This is like seeing the first packets ever sent on the internet and noting that latency is high, lol.

February 16, 2024 at 5:54 AM

bottlepalm

Why aren't you more afraid of ASI? We're clearly just dancing around it at this point.

February 16, 2024 at 5:00 AM

partiallypro

Real AGI is farther away than I think people think, and the tendency for mankind to destroy itself is much better demonstrated than machines doing that even when that time comes.

February 16, 2024 at 6:41 AM

dwighttk

What do y’all think caused the weird smoke/cloud in the mammoth video?

February 16, 2024 at 4:25 AM

sidcool

Even the videos with some physics anomalies are quite good and entertaining.

February 16, 2024 at 2:35 AM

beders

Finally, a true Star Wars prequel is in reach. Everybody gets their own :)

February 16, 2024 at 3:34 AM

hcarvalhoalves

Oh nice, we’ll get a new shitty Marvel movie every week now.

February 17, 2024 at 4:52 AM

hansoolo

Is it really just coincidence that Andrej Karpathy just left yesterday?

February 16, 2024 at 6:39 AM

lagrange77

Has anyone noticed the label on the surfing otter's lifejacket? :D

February 16, 2024 at 6:36 AM

sabzetro

Can't wait until we can generate feature length films with a prompt.

February 16, 2024 at 2:49 AM

lencastre

One day OpenAI itself will replace Altman and take charge.

February 17, 2024 at 3:47 AM

lofaszvanitt

When is this bubble going to burst? :D

February 18, 2024 at 12:43 PM

cfr2023

I want to storyboard/pre-vis/mess around with this ASAP

February 16, 2024 at 4:49 AM

m3kw9

How many of the video startups are shtting their pants right now?

February 16, 2024 at 3:17 AM

guybedo

Looks like OpenAI managed to bury the Gemini 1.5 news.

I guess it was anticipated.

February 16, 2024 at 4:36 AM

alokjnv10

I'm just blown away. This can't be real. But let's face the truth. It's even more impressive than ChatGPT. I think it's the most impressive AI tech I've seen till now. I'm speechless.

Now for the big question. As OpenAI keeps pushing boundaries, it's fascinating to see the emergence of tools like Sora AI, capable of creating incredibly lifelike videos. But with this innovation comes a set of concerns we can't ignore.

So I'm worried about these tools being misused. What impact could they have on the trustworthiness of visual media, especially in an era plagued by fake news and misinformation? And what about the ethical considerations surrounding the creation and dissemination of content that looks real but isn't?

And what should we do to tackle these potential issues? Should there be rules or guidelines to govern the use of such tools, and if so, how can we make sure they're effective?

February 16, 2024 at 12:47 PM

SubiculumCode

https://news.ycombinator.com/edit?id=39393236

It's why I submitted this. We need some way to attest the authenticity of images.

February 16, 2024 at 12:49 PM

qwertox

The one with the grandma is outright scary. All the lies...

February 16, 2024 at 6:29 AM
February 16, 2024 at 4:20 AM

kuprel

Next they have to add audio, then VideoChatGPT is possible

February 16, 2024 at 3:49 AM

bluechair

The signs are nonsensical but this is probably expected.

February 16, 2024 at 2:35 AM

criddell

Why is that so difficult for these things to get right?

February 16, 2024 at 3:00 AM
February 16, 2024 at 4:56 AM

lorenzofalco

Now everything really is screwed. Shut it all down or disconnect.

February 16, 2024 at 6:31 AM

LeicaLatte

Real GPT-4 moment. Your $3500 MacBook cannot do this.

February 16, 2024 at 8:59 AM

sys32768

Game of Thrones Season 8 will be great in a few years.

February 16, 2024 at 3:44 AM

srameshc

We humans will probably come to a point where we won't even bother making videos ourselves. We may just consume content generated on the fly by such services, based on our emotional state.

February 16, 2024 at 8:48 AM

alokjnv10

I'm just blown away. This can't be real. But let's face the truth. It's even more impressive than ChatGPT. I think it's the most impressive AI tech I've seen till now. I'm speechless. Now for the big question. As OpenAI keeps pushing boundaries, it's fascinating to see the emergence of tools like Sora AI, capable of creating incredibly lifelike videos. But with this innovation comes a set of concerns we can't ignore.

So I'm worried about these tools being misused. What impact could they have on the trustworthiness of visual media, especially in an era plagued by fake news and misinformation? And what about the ethical considerations surrounding the creation and dissemination of content that looks real but isn't?

February 16, 2024 at 12:58 PM

DeathArrow

Goodbye, Hollywood!

February 16, 2024 at 3:04 AM

SandroG

This is surreal, both literally and figuratively.

February 16, 2024 at 8:19 AM

timetraveller26

Is this real life? Or is it just a generated fantasy?

February 16, 2024 at 5:15 AM

quonn

> Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.

Why would it?

February 16, 2024 at 6:53 PM

rambambram

I like how the dalmatian puppy moves like a cat.

February 16, 2024 at 2:50 AM

redm

Why are all the example videos in slow motion?

February 16, 2024 at 6:03 AM

oxqbldpxo

US Elections about to peak, terrible timing.

February 16, 2024 at 10:32 AM

lairv

The 3D consistency of those videos is insane compared to what has previously been done, they must have used some form of 3D regularization with depth or flow I think

February 16, 2024 at 4:06 AM

reducesuffering

Apple Vision Pro VR + unlimited, addicting... I mean, engaging video feed into your eyes. The machines will keep you tube fed and your bowels emptied. Woe to the early 21st century techno-optimism. An alien intelligence rules the galaxy now. Welcome to the simulation.

February 16, 2024 at 2:37 AM

m3kw9

will be super depressing once you take off the helmet and feel the reality

February 16, 2024 at 3:31 AM

superconduct123

Imagine that but the loss function is measuring how "good" you feel via brain signals

And the AI is optimizing the video feed purely for that

What would it generate?

February 16, 2024 at 4:14 AM

MobinaMaghami

hi, my name is mobina and I am from Iran. I want to make a video from text and so yeah. thank you for watching.

February 16, 2024 at 6:47 AM

thesmart

What real life problem does this solve?

February 16, 2024 at 7:48 AM

chipweinberger

the same problem movies and tv solve, for one. entertainment?

February 16, 2024 at 7:49 AM

birracerveza

This comment cannot honestly be in good faith, come on now.

You seriously can’t think of a single practical application for generating arbitrary media content on demand?

February 18, 2024 at 2:20 AM

cdme

I don't understand why anyone would find these videos compelling enough to watch. They're visually polished, but totally uninteresting.

February 16, 2024 at 6:58 AM

razemio

Then change the prompt? It is a demo after all. From a creator's perspective, those shots are awesome for inspiration and / or a tool to create something bigger.

February 16, 2024 at 7:03 AM

cdme

To yield yet another soulless, machine generated clip?

February 16, 2024 at 7:04 AM

tokai

Even the good ones look kinda bad.

February 16, 2024 at 8:00 PM

lxe

Blew every expectation away....

February 16, 2024 at 6:35 AM

wsintra2022

Seriously cannot wait to be able to put a week's worth of dream diary entries into a tool like this and see my dream-inspired movies!

February 16, 2024 at 10:34 AM

elorant

This could kill the porn industry.

February 16, 2024 at 6:35 AM

thelastparadise

This looks like state of the art?

February 16, 2024 at 9:28 AM

ilteris

Where is the tool that we can try?

February 16, 2024 at 3:24 AM

cooper_ganglia

It's always kinda crazy to me to see an emerging technology like this get its next iteration in the development pipeline and, even after seeing the first-gen AI video models, many of the HN people here still say, "Meh, not that impressive."

Brother, have you seen Runway Gen 2, or SVD 1.1? I'm not excited about Sora because I think it looks like Hollywood animations, I'm excited because an open-source 3rd-Gen Sora is going to be so much better, and this much progression in one step is really exciting!

February 16, 2024 at 5:32 AM

golol

This does put a smile on my face

February 16, 2024 at 3:31 AM

drcongo

This is actually mind-blowing.

February 16, 2024 at 3:03 AM

dsign

This is impressive and amazing. I can already see a press release not too far down the road: "Our new model HoSapiens can do everything humans can do, but better. It has been specifically designed to deprecate humanity. We are working with red teamers (domain experts in areas like union busting, corporate law and counterinsurgency, plus our habitual bias, misinformation, and hateful content against AI orange team) who will be adversarially testing the model."

February 16, 2024 at 4:18 AM

alokjnv10

I'm simply blown away

February 16, 2024 at 12:44 PM

asciii

Beautifully terrifying

February 17, 2024 at 12:48 AM

lagrange77

Finally new TNG episodes!

February 16, 2024 at 5:18 AM

ed_balls

How to invest in OpenAI?

February 16, 2024 at 4:03 AM

dietmtnview

oh man, we're going to be in The Running Man really quick.

February 16, 2024 at 2:29 AM

CapitalTntcls

Goodbye, civilization

February 16, 2024 at 6:51 PM

Jeve11326gr6ed

How can I get started

February 16, 2024 at 4:29 AM

yandrypozo

did anyone see the two-legged horses in the video?

February 17, 2024 at 2:09 AM

uconnectlol

right, we all knew AI would be closer to realization in 2020. of course the first one to do it is some complete sellout asshole, affirming hateful rhetoric like "we have to make thing safe", which is just thinly veiled pro police state sentiment. every single thing you can come up with why this is "unsafe" is just police state mentality.

"porn without consent" - thought crime

"too much porn of whatever you dream of" - yes, conservatives (50% of USA) actually think this is a problem

"spam" - advancing the closed garden model email is heading towards. soon you will simply need government id to make email even though there are plenty of alternative ways to do communication aside from email which was already considered insecure and a bad protocol in 2000. this has nothing to do with AI but they are still acknowledging this absurdity by framing AI as the enabler of that.

"automated social engineering" - just weaponizing the ignorance the bad thought leaders of the industry left us. instead of giving us proper authentication methods, we still have "just send my photo id to these 33 companies, which will ask for it in random ways we dont expect and just have to trust them"

"copyright" - literally not a problem, almost nothing "protected" by copyright matters and the law is just used by aggressive capitalists to shove their products down everyone's throat

"ICBMs being automatically hacked and launched at people" - just stop being bad government and hiring completely uncredible people to implement every mission critical control system while hooking it up to the internet

"racist bias" (or whatever) - this is the dumbest fucking thing i've ever heard of

this website is a perfect snapshot of why tech sucks so hard. it's dressed up like cinematic film using a ton of js libs and css hacks or god knows so it can only be viewed smoothly on the latest computer hardware. only on one of the big 3 browsers that each had a trillion man hours of pointless iterations driven by digital graphics marketing companies. and on top of that they have a nice professional tone made by $300K/year PR people. please, sincerely, fuck off.

February 17, 2024 at 7:30 AM
February 16, 2024 at 8:48 AM

dom96

This is going to make the latest election really interesting (and scary). Is anyone working to ensure a faked video of Biden that looks plausible but is AI generated doesn't get significant traction at a critical moment of the election?

February 16, 2024 at 2:54 AM

superconduct123

Media outlets are not going to just publish a random video with sketchy provenance

February 16, 2024 at 8:45 AM

airstrike

They've released it but not made it GA

February 16, 2024 at 2:57 AM

duderific

That just doesn't seem like a plausible scenario to me. Obviously, if such a thing happened, Biden would have an alibi, since it's known where he is at all times.

The people who already hate Biden, probably already think he's doing some weird shady stuff, and would point to some conspiracy. The people who like Biden, would accept the alibi.

Ultimately it wouldn't move the needle.

What is conc