Published November 1, 2025
During the past few weeks, I’ve seen a proliferation of A.I.-generated video in my social-media feeds and group texts. The more impressive—or, at least, more personalized—of these have been the work of Sora 2, the updated version of OpenAI’s video-generation platform, which the company released on an invitation-only basis at the end of September. This iteration of Sora comes with a socially networked app, and it appears to be much better at integrating you and your friends, say, into a stock scene. What this means is that, when you open up Sora 2, you’ll likely see a video of someone you know winning a Nobel Prize, getting drafted into the N.B.A., or flying a bomber plane in the Second World War.
When I first started seeing these videos, my internal monologue went something like this: Whoa, that really looks like me/my friend. Ha, that’s cool. This is going to be a problem for society.
This set of thoughts, usually in this order, has been part of the A.I. industry’s pitch to consumers—and it’s worth pausing in this moment and asking if those reactions still hold the same sway as they did when ChatGPT was launched to justifiably breathless reviews and speculation about the future three years ago. Sora 2 has been met with relatively little fanfare, at least in comparison to earlier releases from OpenAI. I’ve been impressed by the videos I’ve seen, but I find myself feeling similarly meh. I opened the app, saw a video of my friend Max giving a TED Talk, chuckled, and then went back to watching YouTube. I haven’t opened it up since.
I think of myself as a cautious A.I. believer—if I were a Christian, I would be the type who goes to church two Sundays a month, knows the hymns, but mostly keeps up his faith as a matter of social norms and the possibility that God actually is angry. I don’t believe artificial general intelligence, A.G.I., will gut the world, but I do think a lot of us will be working in new jobs in the next decade or so. (These changes, I suspect, will mostly be for the worse.) I also have spent a good deal of time working on documentaries, which has driven home for me how much time and money typically goes into producing even a good minute of traditional film. So, what’s changed? Why do these updates feel increasingly like the unremarkable ones that periodically hit my iPhone?
The most powerful trick that A.I. pulls is to put you, or at least your digitized self, into some new dimension. It does this by, for instance, converting your family photos into Studio Ghibli animations, or writing in your voice, and, now, by grafting your face onto your favorite movie scenes. All of this tickles the vanities of the user—even if he’s the President of the United States, apparently—and creates an accessible and ultimately personal connection to the program. You might not be impressed by watching Claude, the large language model created by the A.I. startup Anthropic, write code for six straight hours because, chances are, you can’t follow what it’s doing, nor, unless you’re a coder, do you likely care too much about the possible ramifications. But, when you see yourself riding a dragon as if you’re in a copyright-gray-zoned version of “Game of Thrones,” you will probably take notice.
For the most part, we enjoy A.I. because it lets us gaze into a better mirror, at least for a little while. And by giving well-timed teases of what the A.I. future might look like, the companies behind these programs nudge us to ask if the A.I. version of our lives might not be better than the real ones. It’s worth noting that this is more or less how strip clubs work. The customers are sold a fantasy, and they keep throwing money around because they hold out hope, however dim, that the teases will turn into something else. Under the spell of such intense flattery, we all become easy marks.
The A.I. boom of the past few years has been built in the space between the second thought I had when I first saw Sora 2 videos and the third—between “Ha, that’s cool” and “This is going to be a problem for society.” Many of us have that third thought, but few of us, save for the A.I. doomers proselytizing about the existential threats posed by this technology, have sat with it long. We wonder if these cute, obsequious chatbots will someday try to kill us because that’s what happens in “The Terminator” and “Ex Machina” and “2001: A Space Odyssey.” We don’t actually have a working theory of how Claude or Grok will subjugate the human race, nor, I imagine, do we really believe that will happen.
Why, once our brains are finished being mildly impressed with the latest step in A.I. technology, do we immediately start sketching out doom scenarios? The people issuing these warnings often happen to be financially incentivized to make A.I. seem as world-changing and dangerous as possible. There are true believers among the doomers, but I suspect that a good portion of people who work at A.I. companies have no strong opinions about the threats of A.G.I. Some of them, given how engineering talent follows capital in Silicon Valley, may have worked previously at a cryptocurrency startup. And if they spent any amount of time in crypto, especially during the early days of apocalyptic Bitcoin philosophizing, they may recognize the similarities in the rhetoric of Bitcoin maximalists—who preached about the inevitability of deflationary currency, the coming upheaval of the world markets, and the need to use this power for good—and the A.I. Cassandras, who say that Skynet is coming for us all. When the iPhone never changes and Bitcoin just becomes an investment vehicle, the only way left to grab people’s attention is to tell them they might all die.
A.I. prompts another existential question, one that I find far more compelling than the end of the world. If our labor is replaced, will anyone miss us? For the past two years, whenever I have finished a column and sent it to my editor, I have summarized what it’s about for an L.L.M. and asked it to write a version of the piece. At first, what came out was comparable to college-newspaper work—confident in its assertions, simple in its phrasing, a bit confused about what it was trying to say. About a year ago, I noticed that Claude and ChatGPT were getting pretty close to replicating what I had written; sometimes they made arguments that I wish I had made. The language was still a bit off, and the L.L.M.s tend to be more direct in making their points—some of you might prefer their writing to mine. When I described this column to ChatGPT, the piece I got back addressed “micro-propaganda,” deepfake pornography, and “ambient shame,” a nice phrase that ChatGPT didn’t really bother to explain. Claude, for his part, suggested that we use “Waiting for Godot,” and not a strip club, as the piece’s central metaphor.
Some time in the next five years, I’m certain that Claude or Gemini or some other L.L.M. will be able to write a better column than I can. But when that day comes, will I care? Will you? A computer first beat a chess grandmaster in 1989, and chess has never been more popular than it is today. What this reveals to me is that, when it comes to activities that we take pleasure in, we don’t really care if the machines can do it better. As with the kinds of A.I. we do like, much of the pleasure comes from seeing ourselves. Take us out of the picture, and we lose interest.
The actual question that these A.I. companies might ask, at least on a philosophical level, is whether their customers will want to keep seeing better reflections, or if they’ll eventually grow tired of yet another impressive feat of narcissism and coding. Once we see our face on Tom Cruise in “Mission: Impossible,” or on an extra in the Simpsons universe, or in porn, what fantasies remain? How many more ways are there to see ourselves?