A mechanical way of thinking, an artificial kind of intelligence, has underlain the making of popular music since the invention of the form. That’s why we can talk about pop as an invention with a given form. As early as the days of hit songs written on sheet music, before the advent of recordings and radio, Charles K. Harris, writer of the first million-selling song sheet, “After the Ball,” described the trade of tunesmithing in industrial terms: a matter of applying procedures derived from past successes, calibrating them for mass consumption. He wrote an instruction manual, How to Write a Popular Song, in 1906, with a checklist of essential rules. In his words:
Watch your competitors. Note their success or failure; analyze the cause and profit thereby.
Note public demand.
If you do not feel confident to write or compose a certain style of song…adapt yourself to the others.
In the more than a century since Harris’s era, the proposition that songs should be made according to rules based on precedent and shaped by market forces has been taken as a given and, indeed, valued as a way of honoring tradition and pleasing the public in many spheres of music, from country and gospel to jazz and R&B. At the same time, talk of regimented production or subordination of the creative impulse is widely and freely employed as criticism. Music journalists and critics have few tools as piercing as the charge that a song is formulaic or mechanical, or that an artist is pandering or a sellout.
The growing use of artificial intelligence in music challenges us to consider not only songwriting but also singing and musicianship in ways that are essentially extensions of the machine-age thinking of Harris and, at the same time, startlingly new. A small but expanding group of tech innovators has been developing a range of musical applications of AI such as Boomy, which, so far, has allowed people to make more than 400,000 songs through a combination of machine learning and input from human users. Earlier this year, the start-up Endel released a series of albums of AI-generated ambient music on the major streaming services, through a partnership with Warner Music. The company is planning to release 20 albums by the end of this year, all in the vein of chill playlists, with candle-scent-like titles that signal their purpose of conjuring soothing atmospheres: “Clear Night,” “Rainy Night,” “Cloudy Afternoon,” “Cloudy Night,” and “Foggy Morning.” Wordless, nearly formless, and harmless, they’re perfectly functional background music, well-marketed to make an asset of the absence of anything warranting the listener’s attention. A more recent project from another start-up, Auxuman, is an achievement on another level and may well be a watershed in pop music history.
Auxuman (brand shorthand for auxiliary human) is the brainchild of the British Iranian interdisciplinary artist Ash Koosha (aka Ashkan Kooshenaejad). He first established himself with a couple of pleasantly atmospheric synth-based pop albums, the first of which, Guud, included a single, “I Feel That,” whose official video starred a synthetic semihumanoid, created by the digital artist Hirad Sab. Before turning to AI, Koosha made some multisensory art for VR headsets. Now Koosha and a team of programmers have created a stable of digital music acts who (I’ll use the personal pronoun for human beings in deference to the “uman” in Auxuman) will be releasing a full album of new material every month under the umbrella name Auxuman.
As Koosha has described the Auxuman process to the tech-fan site Digital Trends, the words, music, instruments, and singing voices on the tracks are generated by mining existing music on the Web and processing it to generate new songs. The synthetic artists who are the public face of the work are, in Koosha’s words, “a reflection of human life on the internet.” Their music “comes from stories we have told, ideas we have generated, and opinions we have shared.” The principles are not far removed from Harris’s rules for analyzing the music of other songwriters and factoring in public demand.
There are five digital performers in the Auxuman collective to date: Gemini, Hexe, Mony, Yona, and Zoya. Visually, in the avatar imagery that accompanies each song on YouTube, they’re a mix of racial and ethnic identities, in a few cases thoroughly and indefinitely mixed. Four (Gemini, Hexe, Mony, and Yona) are distinctly female in appearance, one (Zoya) male. The one front and center in group pictures, Yona, is a pixieish white woman who looks like what she is: a computer’s idea of a pop star.
In September, the first music attributed to the five was released as a 10-track album, Auxuman #1. Taking in these recordings, at first I tried to shake the fact that they were generated by AI, and sought to listen with no preconceptions or expectations. Within a minute, I realized that was pointless and unfair to the work and its digital creators. With Auxuman #1, we are forced to confront a genuinely new type of music that works on its own terms.
All five voices, though somewhat distinct from one another, sound of a piece and appropriately artificial—metallic in tone and rigidly staccato in their diction and phrasing. The voices carry no warmth and have no flexibility. They sound inhuman, soulless, but fascinatingly so. To expect otherwise from them would be as wrongheaded as it was for early listeners of recordings to expect pioneers of the microphone such as Billie Holiday and Frank Sinatra to belt like Al Jolson, Sophie Tucker, and others who needed to bellow to reach the rafters of big concert halls. Yona, in particular, exudes impersonal detachment and superficiality. As well as any other artist I can think of, she gives voice to the sense of isolation that chills the air of the digital world.
The lyrics are an eerie jumble of phrases, mostly trite babbling, not unlike the words of a fair number of pop songs since Charles K. Harris’s day. From time to time, though, the random juxtapositions come together in unnerving, accidental eloquence. In “Oblivious,” Yona sings:
I’ve never felt warm
I’ve never felt warm
Through the lens
Through your lens
I feel warm
We know she cannot feel anything. And she communicates only coldness. Knowing this, as I listened, I found myself projecting onto her and started to feel bad for her. Through my lens, I gave her warmth.
The music on all 10 tracks is perfectly, unsettlingly synthetic. Every note is placed precisely on beat. The harmony in every chord is correct, technically. And yet, nothing sounds quite right. There’s something profoundly but fittingly disturbing about it all. It’s utterly conventional and predictable in its formal structures and musical particulars, but wholly arbitrary, built of nothing but its own surface qualities. It sounds like what it is: code pretending to be life. I can think of nothing more relevant.