Editors’ Summary: In this post, the author considers a recent train advertisement in Norway as an example of the problem of floating motifs in LLM generated writing. She defines a floating motif as a motif appearing in AI-generated content that is out of place and detached from its original context. The new ad in Norway makes use of a line from Arnulf Øverland’s famous anti-fascist poem Du må ikke sove (“You must not sleep”). She likens this tone deaf ad to an advertisement jokingly referencing the “First they came for the communists” poem. All the LLMs she tested, including a Norwegian LLM NorMistral, struggled to make a connection between the use of Du må ikke sove in the train advertisement and the famous poem. Most LLMs are trained on data that has been cleaned of web pages containing violent and politically sensitive words, and in posttraining, the model was trained to avoid anything approaching toxic or harmful content. The author hypothesizes that lines of poetry that mean something in one culture but not others are misinterpreted by LLMs and lose their meaning.
Editors’ Choice: “Du må ikke sove”: a floating motif detached from its meaning (or: LLMs can write Norwegian but miss cultural references)