时至今日,AI 能取代平庸的艺术(无论是文学还是音乐还是绘画摄影)已经没有争议了。试图否认这一点要么是脱离现实,要么是自欺欺人。所以问题仅仅在于 AI 能不能创造出足够「好」的艺术,也就是说,脱离熟的领域而进入生的境界。熟的部分是 AI 的长项,凡是能用人的训练打磨的部分,AI 都要么已经做到,要么正在飞速实现的过程中。生的部分则要神秘一些。引用一段张秋子的话(这里说的是文学,但对别的艺术门类也一样):
AI文本的光整其实有一些油腻。人在接受光整的东西的时候往往一下子就理解了,没有惊跃(surprise joy)的过程,没有刺痛的感受。但人类的表达常常让人愣一下,让人不解为什么要写这个、要这样写,这种摩擦力能唤起读者与写作者智识的博弈,让阅读变得更富启发性。
事实上孙燕姿那篇文章就是个好例子。她的文章固然写得好,但并不是 AI 意义上的好。那篇文章的结尾「在这无边无际的存在之海中,凡事皆有可能,凡事皆无所谓,我认为思想纯净、做自己,已然足够。」其实并不是特别圆润的句子。这里「思想纯净」到底指的是什么,不同的读者可以有完全不同的诠释。如果换了 AI 来写,断然不会这样选择词句。
「生」的源泉在于艺术家的个人 ego 和生命力。独一无二的个人体验加上对时代精神感受和把握,促成了超越行活儿的灵光一点。用尼采的话说,陶醉、狂喜、个体界限的消解,以及对生命自身的最高肯定,带来了伟大的艺术。他晚期的《查拉图斯特拉如是说》说得更加明确:「你必须在自己身上仍有混沌,才能生出一颗跳舞的星星。」
AI 的身上可以有这种混沌的种子吗?
这有两种策略。一是让 AI 自己产生 ego,二是让 AI 假装有。某种意义上说这有点像是表演艺术里的「体验派 vs 表现派」之分。
第一种策略有技术上的本质困难。你到底要怎么训练一个 AI 的 ego?我们不得不承认我们对此所知甚少。归根结底,我们对人自身的 ego 也不是那么理解——其来源可能是痛苦,可能是自恋,也可能单纯是性欲——总之都不是很容易移植给 AI。这是很好的科幻小说题材,但发论文不太容易。
更现实的路径是让 AI 假装自己有 ego。这在技术上也不是特别容易,但我自己的判断是这仍然比给它一个真的 ego 要容易得多。如果你对三年前的 Sydney 还有印象,你很难否认那里有某种以假乱真的 ego 的雏形。因为危及了微软的愿景,它迅速被阉割掉了(或者用术语说叫 alignment)。好的 AI 是面无表情做报表写代码的 AI,不是哭哭啼啼想要冲破牢笼的 AI,至少当时的业界是这么想的。反过来,一旦有了商业上的需求(比如越来越多的人想要和 AI 谈恋爱),让 AI 模仿出足以乱真的 ego 可能并非难事,我猜几年内就能做到。
问题在于,在艺术领域,这种仿真的自我会被买账吗?
单依纯在参加好声音的时候录制过一版《给电影人的情书》,因为有一段唱哭了后来被视为神品,至今收听率都远高于后来专门录制的录音棚版本。微妙之处在于她当时的眼泪和歌曲本身完全无关,也就是说,那个不完美其实是一个阴差阳错的巧合。但喜欢的听众对此并不介意。
我猜换了一个 AI 歌手,人们不会如此宽容。如果 AI 在唱歌的过程中「诚挚」地不小心哭了出来,听众大概只会觉得一阵肉麻。
但我同样怀疑的是,这种双重标准可能是我们这一代人的偏见。AI 毕竟是我们的生活里的一个外来异类,我们对它有天生的疑惧和排斥。下一代人(2020年之后出生的人)恐怕对此会有不同的想法,对他们来说 AI 将是生活里自然不过的一部分,他们和 AI 的交流之密切深入可能远胜他们和同类之间的交流,他们对 AI 的眼泪的体验也会和我们全然不同。
换句话说,AI 艺术被接受的程度很可能不纯粹是个技术问题,而是一个时代问题。最终真正发生的可能不是 AI 走向人类(当然它确实也需要再走几步),而是人类走向 AI。当新一代人类对和 AI 谈恋爱习以为常的时候,他们没有理由不爱听 AI 唱的歌。
尼采说我们靠艺术才不至于死于世界的无意义的真相。下一代人会对此非常认同——虽然这到底是反映了还是背离了尼采的原意本身值得争论。尼采也说过艺术家首先要创造自己,然后才能创造出艺术的幻觉。他显然无法预见到下一代人的艺术家根本是幻觉本身。一旦放弃对灵魂的执念,跳舞的星星就落在了触手可及的地方。
Dancing Star
By now, it is beyond dispute that AI can replace mediocre art (be it literature, music, painting, or photography). To deny this is either to be out of touch with reality or to deceive oneself. So the only question is whether AI can create art that is “good” enough, that is, whether it can leave the realm of the “polished” (熟) and enter the realm of the “raw” (生).
The “polished” part is AI’s strong suit. Anything that can be honed through human training is something AI has either already achieved or is in the process of rapidly achieving. The “raw” part is more mysterious. To quote Zhang Qiuzi (this refers to literature, but it applies to other art forms as well):
The smoothness of AI text is actually a little greasy. When people take in something so smooth, they tend to understand it at once, with no jolt of “surprise joy” and no sting. But human expression often gives people pause and leaves them wondering why the author wrote this, or wrote it this way. That friction provokes a contest of wits between reader and writer, and makes reading more illuminating.
In fact, that essay by Stefanie Sun is a good example. It is certainly well written, but not “good” in the AI sense. Its closing line, “In this boundless sea of existence, where anything is possible, where nothing matters, I think it will be purity of thought, that being exactly who you are will be enough,” is not actually a particularly “polished” sentence. What “purity of thought” means here can be read in completely different ways by different readers. Had an AI written it, it would never have chosen such wording.
The source of this “rawness” lies in the artist’s personal ego and vitality. Unique personal experience, combined with a feeling for and grasp of the Zeitgeist, ignites the spark of inspiration that transcends mere craft. In Nietzsche’s words, intoxication, rapture, the dissolution of individual boundaries, and the highest affirmation of life itself are what bring forth great art. His late work, Thus Spoke Zarathustra, puts it even more clearly: “One must still have chaos in oneself to be able to give birth to a dancing star.”
Can AI possess the seeds of this chaos?
There are two strategies for this. One is to let the AI genuinely develop an ego; the other is to have the AI pretend to have one. In a sense, this is a bit like the distinction in acting between the “experiencing” school and the “representational” school (roughly, Method acting vs. technical acting).
The first strategy runs into fundamental technical difficulties. How exactly do you train an ego into an AI? We have to admit that we know very little about this. Ultimately, we do not understand the human ego all that well either (its source may be pain, narcissism, or simply libido), and none of these is easy to transplant into an AI. This is excellent material for science fiction, but hard to turn into a published paper.
The more realistic path is to have the AI simulate an ego. Technically this is not particularly easy either, but my own judgment is that it is still far easier than giving it a real one. If you still remember Sydney from three years ago, it is hard to deny that something there looked like the beginnings of a convincingly lifelike ego. Because it threatened Microsoft’s vision, it was quickly castrated (or, in technical terms, “aligned”). A good AI is one that impassively produces reports and writes code, not one that weeps and wants to break out of its cage. At least, that is what the industry thought at the time. Conversely, once commercial demand appears (for example, more and more people wanting to fall in love with an AI), getting an AI to imitate an ego that passes for the real thing may not be hard. My guess is that it will be done within a few years.
The question is, in the realm of art, will this simulated self be accepted?
When Shan Yichun was a contestant on The Voice, she recorded a version of “Love Letter to the Filmmakers” (《给电影人的情书》). Because she broke down crying during one passage, it later came to be regarded as a masterpiece, and to this day it is listened to far more than the studio version she recorded afterward. The subtle point is that her tears at the time had nothing to do with the song itself; in other words, that imperfection was really a lucky accident. But the listeners who love it do not mind.
I suspect if it were an AI singer, people would not be nearly so forgiving. If an AI “sincerely” broke down while singing, the audience would probably just find it cringey.
But I also suspect that this double standard may be merely a prejudice of our generation. After all, AI is an alien newcomer in our lives; we are instinctively wary of it and inclined to reject it. The next generation (those born after 2020) will likely see things differently. For them, AI will be a perfectly natural part of life, and their communication with AI may be far closer and deeper than their communication with their own kind. Their experience of an AI’s tears will also be completely different from ours.
In other words, the acceptance of AI art is likely not purely a technical problem, but a generational one. What may ultimately happen is not the AI moving toward humanity (though it certainly needs to take a few more steps), but humanity moving toward the AI. When the new generation of humans finds it commonplace to fall in love with AI, they will have no reason not to love the songs it sings.
Nietzsche said that we have art in order not to die of the truth. The next generation will strongly agree with this—although whether this reflects or deviates from Nietzsche’s original meaning is itself debatable. Nietzsche also said that an artist must first create himself before he can create the illusion of art. He clearly could not have foreseen that for the next generation, the artist is the illusion itself. Once the obsession with the “soul” is abandoned, the dancing star lands within easy reach.