Would You Like To Hear A Song, Dave?

3:44pm Jun 05, 2015
Over time, Aether's Cone speaker uses an algorithm to learn the taste of its owner, and offers time- and location-specific recommendations.
Photo illustration / Photos via NASA and courtesy of Aether

Late one Saturday morning last December, after a couple months using my Aether Cone, the "thinking" speaker played David Bowie's "Changes." I pressed the soft button in the center of the sleek, chrome-plated player, and out came the swaggering piano and sharp blast of sax. "Oh yeah," cooed Bowie. "That'll do just fine," I thought, walking away from the wireless speaker sitting on the desk in my bedroom in order to do a few chores.

For the next half hour, the Cone played a selection of classic rock songs spanning the decades: Queen's "Somebody to Love" into The Cars' "I'm Not the One" into T.Rex's "Lean Woman Blues" into The Velvet Underground's "I'm Sticking With You" into The Move's "I Can Hear the Grass Grow" into Pink Floyd's "The Wall." I was feeling the progressively adventurous run from T.Rex through The Move, while years of classic rock radio consumption have left me comfortably numb to "The Wall." Then, out of nowhere: late-era Foo Fighters. I thought you knew me, Aether Cone.

Versions of this same story make up my personal history with music algorithm technology. Many of my experiences letting a computer tell me what I should listen to have been spent feeling like I'm simply tolerating the results. But with its promise of "thinking things," Aether intrigued me. The 50-person, San Francisco-based company has just one product now, the $399 Aether Cone wireless speaker, and they plan to keep it that way until they've perfected the music player. What makes the hardware company stand out in a marketplace littered with free algorithm software is that the physical product tracks your listening habits in relation to time, date and location in your home. If you turn up or turn down a song, Cone tracks that too. Weather is likely the next variable tracked by the speaker. After all, you shouldn't have to tell your stereo that you only want to listen to melancholy folk songs when it's raining.

I first heard about Aether Cone last summer when it was released to mostly positive feedback in the tech press. Bloggers applauded Cone's stylish design and easy-to-use controls (an app for iPhone/Android, voice commands, or a physical twist of the speaker's front panel). The Cone runs off Rdio's library of 20+ million songs, so I knew the selection would be fairly comprehensive. I didn't have an Rdio account before using Cone back in October, so the system had zero data regarding my listening behavior and preferences when I started. But ostensibly, the idea is that over time, more and more user behavior data is fed into the algorithm, thus yielding more accurate results. Over the years, as I've started using various algorithm-powered features on streaming music services, I've found myself wondering, "How long does it take for this thing to know me, or at least reach a decent ratio of maximizing what I like and minimizing what I dislike? How much do I have to interact with it? Oh god, am I headed for a future resembling the film Her?"

Like many music fans these days, I claim my tastes as wide-ranging yet specific. There are pockets of genres and eras I love, others I think I expected to enjoy but ended up disliking. Because I'm just as likely to listen to Ariana Grande and Al Green as I am The Wrens and Chopin, there are gaps in my knowledge. I expect that it shouldn't be hard for a streaming service to play something I haven't heard, but would ostensibly be interested in hearing. Yet I find myself constantly fighting the inclination to switch the song when I'm using one of these recommenders, be it Pandora, Spotify Radio, Beats Music or Rdio's You FM. It all sounds so ... same-y. There's little about it that reminds me of the playlists I make for myself and others — compilations that not only flow through styles and eras, but worldviews, varying levels of taste and obscurity and musical strengths. Appreciating one songwriter for her dedication to DIY ethics or well-crafted lyricism about thorny topics does not preclude one from liking some big, dumb EDM banger where redemption is found in a far-out choice of producer — not that a computer would likely cull the nuance of all this from some algorithm. So I'm left to wonder: Who's using computer-generated playlists engineered to mimic their own tastes on a daily basis, and to what level of satisfaction? And do these people not consider themselves musically omnivorous?

The Cone, however, made me wonder something else entirely about music algorithm technology: What if mimicking mood is more important than mimicking taste? "It's 7 a.m. on a Tuesday and I'm reading the paper. Cone knows to play something bluesy," the speaker boasts on its website. But Aether is not totally alone in curating by this belief. The Sentence is the most unique feature of Beats Music, the iTunes-acquired streaming service set for a massive re-launch across Apple products, and which boasts handmade playlists from the ex-Pitchfork staffers and other music experts who comprise Beats' curatorial staff. In a Mad Libs format, listeners tell Beats where they are, what they're doing, who they're with, and what kind of music they feel like listening to. By learning its owners' habits over time, the Cone offers a less specific version of the same thing in an automated way.

"Monday mornings, Friday nights, Saturday afternoons: They're all very different moods, and music and mood are so closely tied," says Aether co-founder Duncan Lamb.

As opposed to streaming services tied to individual users and their social media accounts, Cone is a product that expects to be used by multiple people, each with different listening habits, within one household. (Moms with young families are an important part of Cone's demographic; the most requested song by voice command, for example, is Frozen's "Let It Go.") Among hardware companies, there's really no incentive to encourage this kind of device sharing. Then again, most hardware companies aren't built on quite as specific a philosophy as Aether.

"We want real objects in our lives that we can physically interact with, directly, in a really humane way, but that use collective data to help us make really good decisions," Lamb says. "A simple question like, 'What should I listen to right now?' can be answered by a physical gesture with the whole might of the world's data behind it."

Technology that hides the tech, and behind a sleek product at that, remains one of Cone's primary strengths. You can use the Cone without interacting with a single interface. Turn it on, let it play what it thinks you want to hear based on context (i.e., what time of day and part of the year it is, where in your house you are), and simply twist the wheel on the front of the speaker if you dislike the direction it's headed. Voice commands for specific songs or artists are enabled when you press the button in the center of the speaker. (There's also a corresponding app for Apple and Android devices that doesn't even require you to be in the same room as your Cone, though it is prone to freezing up.)

"The new 'undo' function is tuning — the ability to speak back to our digital devices and say, 'A little more of this, a little less of that,'" Lamb says. "The physical dial on Cone acknowledges that the system is not going to be right all the time, and gives this very gentle way for humans to manage what could be an intensely frustrating experience with a computer."

If you're looking to minimize time spent interacting with screens, as many of us are, this approach can seem quite appealing. But the product is suited for people who think about what they want to listen to in broad strokes, homing in on basic genres to match mood and environment.

"The people we're designing for are people who, when they're in a silent room, think, 'Huh, it'd be nice if there was some music on,'" Lamb says. "They genuinely love it, but something happened and ... they got a life. It got really busy and crazy. These people are really engaged, but they're also kind of over it. They're not the kind of people who're like, 'Oh goodie, I'm gonna buy this thing and spend my Saturday configuring my router to change my playlists from across the house.'"

In researching for Cone, Aether went into people's homes to see, specifically, how they use music — in what ways socially, and on which devices. What they found was a recurring pattern of "weird stuff" in terms of technology and listening patterns. "People are listening to less music even though they have access to more," says Lamb, who describes the current state of music discovery as "much, much better, but also much more complicated."

"We saw people who were selling really high-end sound systems on Craigslist and replacing them with, like, playing music out of the speakers of their TV," Lamb says. "We'd ask, 'Why'd you do that?' and the answer would usually revolve around convenience: 'I can Airplay from my phone, my playlists are all there.'"

From Siri to Jibo (a robot marketed as a helpful member of your family), personal technology that tracks behavior and uses the data to simplify users' lives is where the larger industry's headed. Aether's just the first company to tailor contextual learning specifically for music in a physical product. However, the music discovery aspect of Aether's algorithm leaves something to be desired. The company knows that, with Lamb acknowledging that discovery is just "part of the recipe."

As Todd Kemmerling, Aether's VP of Engineering, explains to me, there are two types of algorithms commonly used in the music space, both to varying results. Cone is powered by the first kind of algorithm, as are many recommender features on streaming services.

"On one hand, there are algorithms that are based on people's past behavior," Kemmerling says. "These use things like collaborative filters and association rules. Basically, if you have two people — you and I — and I like song A, song B and song C, and you like song A and song B, there's a likelihood that you will like song C because we both liked song A and song B. That would be where the recommendation algorithm would say, 'Ah, maybe for Jill [the author], we'll give her song C.' This class of algorithms is very nuanced and detailed, but that's generally how things work at a high level."

(In addition to basic song association, streaming services using The Echo Nest and similar data collectors also integrate information from across the web, from album review scores to social media buzz, into the algorithms that power their discovery features. The Echo Nest was acquired by Spotify early last year, while TastemakerX, which collects social sharing data, was acquired by Rdio mere months later. These high-level acquisitions represent the thirst for more and better data that exists in this kind of technology: another layer of filters that acknowledge acclaim and "what the kids are talking about" in an attempt to ensure relevancy in their playlists.)

The other type of algorithm used in streaming music is a content-based algorithm; it's less common because it requires more resources. Humans set parameters around songs' musical components, which algorithms then identify and use to sort the songs into clusters. The closer two songs are in a cluster, the more similar they are. A version of this method is what powers Pandora's Music Genome Project, which — while only running off a library of around one million songs compared to the 20+ million in the catalogues of Spotify, Rdio, and Beats — remains a leader in terms of nuanced recommendations.

Pandora's on-staff musicologists assess a mere four songs per hour each on average, scoring as many as 450 individual musical characteristics — or "genes," as they call them — for each song. These include beats per minute, vocal gender and register, the vocal tone's grittiness, amount of guitar distortion, lyrical subjects, stylistic influences and more.
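
The "closer means more similar" idea behind content-based systems can be illustrated with a toy example. This is a rough sketch and not Pandora's method: the three traits, their 0-to-1 scores, the track names and the use of straight-line distance are all assumptions chosen for illustration.

```python
import math

# Hypothetical hand-scored traits on a 0-1 scale:
# (tempo, guitar distortion, vocal grit). All values are made up.
songs = {
    "track 1": (0.60, 0.10, 0.20),
    "track 2": (0.65, 0.15, 0.25),
    "track 3": (0.90, 0.95, 0.80),
}

def distance(a, b):
    """Straight-line (Euclidean) distance between two trait vectors."""
    return math.dist(a, b)

def most_similar(title, catalog):
    """Return the other song whose traits sit closest to `title`."""
    seed = catalog[title]
    others = (t for t in catalog if t != title)
    return min(others, key=lambda t: distance(seed, catalog[t]))

print(most_similar("track 1", songs))
```

With these made-up scores, track 1 and track 2 sit close together, so each would be offered as a follow-up to the other, while the heavily distorted track 3 sits far from both.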

"There are a lot of details within a song that the average non-musician wouldn't necessarily be aware of [that are analyzed by Pandora]," says Steve Hogan, one of Pandora's music analysts. "For example, the presence of a harmonic 'vamp,' which is a short, repeating sequence of chords. Many pop songs make use of this. The verse of Van Morrison's 'Moondance' is a good example, as it cycles a short sequence of four chords over and over. That's in contrast to the verse of 'Hey Jude' by The Beatles, which has a non-repetitive chord progression, or what we'd call 'through composed.'"

"Another example would be what we call a 'swung sixteenth' rhythmic feel, where the sixteenth notes are given a slightly uneven lilt," Hogan continues. "It's a relatively subtle detail that many people wouldn't be able to articulate, but it makes a huge difference in the experience of a song's groove. A Tribe Called Quest's song 'Check The Rhime' is a good example, as is 'Superstition' by Stevie Wonder."

Aether "messed around quite a bit" with content-based algorithms in their beginning stages, but ultimately decided not to go that route. Not only was it extremely expensive, it wasn't helping them solve the problem at their core: "When you first approach the Cone, we're trying to get you close to what you want to hear at that moment," Kemmerling says, adding that Cone's algorithms offer up an initial handful of songs in each session before Rdio's algorithms take over curation. "Content-based algorithms are more valuable for [song] sequencing, so it's really helpful for streaming service providers."

Just a few days ago, I used Cone for the first time in about a month. On a Sunday afternoon, in my living room, Cone opened with one of my favorite early Aphex Twin tracks ("We Are the Music Makers") before shifting to "Too Bright," the orchestral climax of Perfume Genius' 2014 album of the same name, then "Think of You" from New York electro-pop duo MS MR. This was the closest Cone had ever been to "getting" me. I must admit it put me in a good mood, even if it wasn't what I would have chosen for myself in that moment. On the surface, these songs aren't related, but the teetering between certain conflicting aesthetics — the most accessible of experimental electronic sounds — showed me something about my own musical preferences I hadn't previously given much thought, even if I hadn't "discovered" anything previously unheard. And really, isn't that something valuable for the most engaged and opinionated listeners — a new perspective on old taste, tested against millions of songs?

Before I knew it, Cone slipped over to Vampire Weekend, then Jack White, then Arctic Monkeys. Cone's weird pop bent was gone, just as I'd realized that's what I wanted to hear. I spent an hour going down that rabbit hole on my own via Spotify search, revisiting Soft Cell and other artists I hadn't thought about in years, nary a computer-generated recommendation in earshot.

Copyright 2015 NPR. To see more, visit http://www.npr.org/.