SORA 2 AND THE LIMITS OF DIGITAL NARCISSISM

Published October 25, 2025

During the past few weeks, I’ve seen a proliferation of A.I.-generated video in my social-media feeds and group texts. The more impressive—or, at least, more personalized—of these have been the work of Sora 2, the updated version of OpenAI’s video-generation platform, which the company released on an invitation-only basis at the end of September. This iteration of Sora comes with a socially networked app, and it appears to be much better at integrating you and your friends, say, into a stock scene. What this means is that, when you open up Sora 2, you’ll likely see a video of someone you know winning a Nobel Prize, getting drafted into the N.B.A., or flying a bomber plane in the Second World War.

When I first started seeing these videos, my internal monologue went something like this: Whoa, that really looks like me/my friend. Ha, that’s cool. This is going to be a problem for society.

This set of thoughts, usually in this order, has been part of the A.I. industry’s pitch to consumers—and it’s worth pausing in this moment and asking if those reactions still hold the same sway as they did when ChatGPT was launched to justifiably breathless reviews and speculation about the future three years ago. Sora 2 has been met with relatively little fanfare, at least in comparison to earlier releases from OpenAI. I was impressed by the videos I saw, but I find myself feeling similarly meh. I opened the app, saw a video of my friend Max giving a TED Talk, chuckled, and then went back to watching YouTube. I haven’t opened it up since.

I think of myself as a cautious A.I. believer—if I were a Christian, I would be the type who goes to church two Sundays a month, knows the hymns, but mostly keeps up his faith as a matter of social norms and the possibility that God actually is angry. I don’t believe artificial general intelligence, A.G.I., will gut the world, but I do think a lot of us will be working in new jobs in the next decade or so. (These changes, I suspect, will mostly be for the worse.) I also have spent a good deal of time working on documentaries, which has driven home for me how much time and money typically goes into producing even a good minute of traditional film. So, what’s changed? Why do these updates feel increasingly like the unremarkable ones that periodically hit my iPhone?

The most powerful trick that A.I. pulls is to put you, or at least your digitized self, into some new dimension. It does this by, for instance, converting your family photos into Studio Ghibli animations, or writing in your voice, and, now, by grafting your face onto your favorite movie scenes. All of this tickles the vanities of the user—even if he’s the President of the United States, apparently—and creates an accessible and ultimately personal connection to the program. You might not be impressed by watching Claude, the large language model created by the A.I. startup Anthropic, write code for six straight hours because, chances are, you can’t follow what it’s doing, nor, unless you’re a coder, do you likely care too much about the possible ramifications. But, when you see yourself riding a dragon as if you’re in a copyright-gray-zoned version of “Game of Thrones,” you will probably take notice.

For the most part, we enjoy A.I. because it lets us gaze into a better mirror, at least for a little while. And by giving well-timed teases of what the A.I. future might look like, the companies behind these programs nudge us to ask if the A.I. version of our lives might not be better than the real ones. It’s worth noting that this is more or less how strip clubs work. The customers are sold a fantasy, and they keep throwing money around because they hold out hope, however dim, that the teases will turn into something else. Under the spell of such intense flattery, we all become easy marks.

The A.I. boom of the past few years has been built in the space between the second thought I had when I first saw Sora 2 videos and the third—between “Ha, that’s cool” and “This is going to be a problem for society.” Many of us have that third thought, but few of us, save for the A.I. doomers proselytizing about the existential threats posed by this technology, have sat with it long. We wonder if these cute, obsequious chatbots will someday try to kill us because that’s what happens in “The Terminator” and “Ex Machina” and “2001: A Space Odyssey.” We don’t actually have a working theory of how Claude or Grok will subjugate the human race, nor, I imagine, do we really believe that will happen.

Why, once our brains are finished being mildly impressed with the latest step in A.I. technology, do we immediately start sketching out doom scenarios? The people making these warnings often happen to be financially incentivized to make A.I. seem as world-changing and dangerous as possible. There are true believers among the doomers, but I suspect that a good portion of people who work at A.I. companies have no strong opinions about the threats of A.G.I. Some of them, given how engineering talent follows capital in Silicon Valley, may have worked previously at a cryptocurrency startup. And if they spent any amount of time in crypto, especially during the early days of apocalyptic Bitcoin philosophizing, they may recognize the similarities in the rhetoric of Bitcoin maximalists—who preached about the inevitability of deflationary currency, the coming upheaval of the world markets, and the need to use this power for good—and the A.I. Cassandras, who say that Skynet is coming for us all. When the iPhone never changes and Bitcoin just becomes an investment vehicle, the only way left to grab people’s attention is to tell them they might all die.

A.I. prompts another existential question, one that I find far more compelling than the end of the world. If our labor is replaced, will anyone miss us? For the past two years, whenever I have finished a column and sent it to my editor, I have summarized what it’s about for an L.L.M. and asked it to write a version of the piece. At first, what came out was comparable to college-newspaper work—confident in its assertions, simple in its phrasing, a bit confused about what it was trying to say. About a year ago, I noticed that Claude and ChatGPT were getting pretty close to replicating what I had written; sometimes they made arguments that I wish I had made. The language was still a bit off, and the L.L.M.s tend to be more direct in making their points—some of you might prefer their writing to mine. When I described this column to ChatGPT, the piece I got back addressed “micro-propaganda,” deepfake pornography, and “ambient shame,” a nice phrase that ChatGPT didn’t really bother to explain. Claude, for his part, suggested that we use “Waiting for Godot,” and not a strip club, as the piece’s central metaphor.

Some time in the next five years, I’m certain that Claude or Gemini or some other L.L.M. will be able to write a better column than I can. But when that day comes, will I care? Will you? A computer first beat a chess grandmaster in 1989, and chess has never been more popular than it is today. What this reveals to me is that, when it comes to activities that we take pleasure in, we don’t really care if the machines can do it better. As with the kinds of A.I. we do like, much of the pleasure comes from seeing ourselves. Take us out of the picture, and we lose interest.

The actual question that these A.I. companies might ask, at least on a philosophical level, is whether their customers will want to keep seeing better reflections, or if they’ll eventually grow tired of yet another impressive feat of narcissism and coding. Once we see our face on Tom Cruise in “Mission: Impossible,” or on an extra in the Simpsons universe, or in porn, what fantasies remain? How many more ways are there to see ourselves?
