Transcript
v-9Mpe7NhkM • Gustav Soderstrom: Spotify | Lex Fridman Podcast #29
Kind: captions
Language: en
the following is a conversation with
Gustav Soderstrom he's the chief research and
development officer at Spotify leading
their product design data technology and
engineering teams as I've said before in
my research and in life in general I
love music listening to it and creating
it and using technology especially
personalization through machine learning
to enrich the music discovery and
listening experience that is what
Spotify has been doing for years
continually innovating defining how we
experience music as a society in the
digital age that's what Gustav and I
talk about among many other topics
including our shared appreciation of the
movie True Romance in my view one of the
great movies of all time this is the
artificial intelligence podcast if you
enjoy it subscribe on YouTube give it
five stars on iTunes support on patreon
or simply connect with me on twitter at
Lex Fridman spelled F-R-I-D-M-A-N and now
here's my conversation with Gustav Soderstrom
Spotify has over 50 million songs
in its catalogue so let me ask the
all-important question I feel like
you're the right person to ask what is
the definitive greatest song of all time
it varies for me personally so you can't
speak definitively for everyone
I wouldn't believe very much in machine
learning if I did right because everyone
had the same taste so for you what is
you have to pick what is the song
alright so it's it's pretty easy for me
there is this song called You're So Cool
Hans Zimmer soundtrack to True Romance
it was a movie that made a big
impression on me and it's kind of been
following me through my life actually
had it play at my wedding I sat with
the organist and helped them play it on an
organ which was a pretty pretty
interesting experience that is probably
my I would say top 3 movie of all time
yeah it's just an incredible yeah and
then it came out during my formative
years and
as I've discovered in music you shape
your music taste during those years so
it definitely affected me quite a bit
did it affect you in any other kind of
way well the movie itself affected me
back then it was a big part of culture I
didn't really adopt any characters from
the movie but it was a it was a
great story of love fantastic actors and
and you know really I didn't even know
who Hans Zimmer was at the time but
fantastic music and so um that song has
followed me and the movie actually
followed me throughout my life that was a
Quentin Tarantino actually I think
directed or produced that so
it's not Stairway to Heaven or Bohemian
Rhapsody so those are those are great
they're not my personal favorites but uh
but I realize that people have
different tastes and that's it's a big
part of what we do well for me I have to
stick with Stairway to Heaven so 35,000
years ago I looked this up on Wikipedia
flute-like instruments started being
used in caves as part of hunting rituals
then primitive cultural gatherings
things like that this is the birth of
music since then we had a few folks
Beethoven Elvis Beatles Justin Bieber
of course Drake so in your view let's
start like high level philosophical what
is the purpose of music on this planet
of ours I think music has many different
purposes I think there's there's
certainly a big purpose which is the
same as much of entertainment which is
escapism and to be able to live in
some sort of other mental state for a
while
but I also think you have the the
opposite of escaping which is to help
you focus on something you are actually
doing and so I think people use music as
a tool to to tune the brain to the
activities that they are actually doing
and it's kind of like in one sense maybe
it's the rawest signal if you if you
think about the brain as neural
networks it's maybe the most efficient hack
we can do to actually actively tune it
into some state that you want to be you
can do it in other ways you can tell
stories to put people in a certain mood
but music is probably very effective
to get to a certain mood very fast and
you know there's uh there's a social
component historically to music where
people listen to music together I was
just thinking about this that to me you
mentioned machine learning but to me
personally music is a really private
thing I'm speaking for myself I listen
to music like almost nobody knows the
kind of things I have in my library
except people who are really close to me
and they really only know a certain
percentage there's like some weird stuff
that I'm almost probably embarrassed
by right it's called the guilty
pleasures everyone has the guilty
pleasures yeah hopefully they're not too
bad but it's just for me it's
personal do you think of music as
something that's social or as something
that's personal this is a very so I
think it's the same it's the same answer
that you use it for for both we we've
thought a lot about this during these 10
years at Spotify obviously in one sense
as you said music is incredibly social
you go to concerts and so forth on the
other hand it is your your escape and
everyone has these things that are very
personal to them so what we've found is
that when it comes to to most people
claim that they have a friend or two
that they are heavily inspired by and
that they listen to so I actually think
music is very social but in a smaller
group setting it's in it's an intimate
form of of it's an intimate relationship
it's not something that you necessarily
share broadly now at concerts you can
argue you do but then you've gathered a
lot of people that you have something in
common with I think this broadcast
sharing of music is something we tried
on social networks and so forth but it
turns out that people aren't super
interested in what their friends
listen to they're interested in
understanding if they have something in
common perhaps with a friend but not you
know not just as information right that
that that's really interesting
I was just
thinking about this this morning listening to
Spotify I really have a pretty intimate
relationship with Spotify with my
playlists right I've had them for many
years now and they've grown with me
together there's there's an intimate
relationship you have with a library of
music you've developed and we'll talk
about different ways to play with that
can you do the impossible task and try
to give a history of music listening
from your perspective from before the
Internet and after the Internet and just
kind of everything leading up to
streaming Spotify I'll try it it could
be a 100 year podcast yeah I'll try to
do a brief version there are some things
that that I think are very interesting
during the history of music which is
that before recorded music you to be
able to enjoy music you actually had to
be where the music was produced because
because you couldn't you couldn't record it and
time-shift it right creation and
consumption had to happen at the same
time basically concerts and so you
either had to get to the nearest village
to listen to music and while that was
cumbersome and it severely limited the
distribution of music it also had some
different qualities which was that the
Creator could always interact with the
audience
it was always live and also there was no
time cap on the music so I think it's
not a coincidence that these early
classical works they're much longer than
the three minutes the three minutes came
in as a restriction of the first wax
disc that could only contain a three
minute song on one side right so
actually the recorded music severely
limited or put constraints I
won't say limit I mean constraints are often
good but it put very hard constraints on
the music format so you kind of said
like instead of doing these this opus
like many tens of minutes or something
now you get three and a half minutes
because then you're out of wax on this
disc but in return you get an amazing
distribution your reach will widen right
just on that point real quick without
the mass scale distribution there's a
scarcity component where you kind of
look forward
to it it's like the Netflix
versus HBO Game of Thrones you like wait
for the event because you can't really
listen to it so you like look
forward to it and then it's you derive
perhaps more pleasure because it's more
rare for you to listen to particular
piece you think there's value to that
scarcity yeah I think that that is
definitely a thing and there's always
this component of if you have something
in infinite amounts will you value it
as much probably not humanity is always
seeking something it's relative so you're
always seeking something you didn't have
and when you have it you don't
appreciate it as much I think that's
probably true but I think that's why
concerts exist so you can actually have
both but I think net if you couldn't
listen to music in your car driving that
that'd be worse that cost would be bigger
than the benefit of of the anticipation
I think that you would have so yep it
started with live concerts then it's
being able to you know the the
phonograph was invented right you start to
be able to record music exactly so then
then you got this massive distribution
that that made it possible to create two
things I think first of all cultural
phenomenons they probably need
distribution to be able to happen but it
also opened access to you know for a new
kind of artists so you started to have
these phenomenons like the Beatles and
Elvis and so forth that were really a
function of distribution I think
obviously of talent and innovation but
there was also a technical component and of
course the next big innovation that came
along was was radio broadcast radio and
I think radio is interesting because it
started not as a music medium it
started as an information medium for
for news and then radio needed to find
something to fill the time with so that
they could honestly play more ads and
make more money and music was free so so
then you had this massive distribution
we could program to people I think those
things that ecosystem is what created
the ability for for for hits but it was
also very broadcast medium so you would
tend to get these massive massive hits
but maybe not such a long
tail in terms of choice of everybody
listening to the same stuff yeah and as
you said I think there are some social
benefits to that yeah I think for
example there is there's a high
statistical chance that if I talk about
the latest episode of Game of Thrones we
have something to talk about yeah just
statistically in the age of individual
choice maybe some of that goes away so I
I do see the value of like you know
shared cultural components but I also
obviously love personalization and so
let's catch us up to the Internet
so maybe Napster well first of all
there's like mp3s exactly tape CDs there
was a digitalization of music with the CD
really it was physical distribution but
the music became digital yeah and
so they were files but basically boxed
software to use the software analogy
and then you could start downloading
these files and I think there are two
interesting things that happened back to
music used to be longer before it was
constrained by the distribution medium I
don't think that was a coincidence and
then really the only music genre to have
developed mostly after music was a file
again on the Internet is EDM and EDM is
often much longer than the traditional
music I think I think it's interesting
to think about the fact that music is no
longer constrained in minutes per song
or something it's it's a it's a legacy
of our own distribution technology and
you see some of this new music that that
breaks the format not so much as I would
have expected actually by now but but it
still happens so first of all I don't
really know what EDM is electronic dance
music yeah right you could say Avicii
was one of the biggest in this genre so
the main constraint is of time something
like three four or five minute songs
songs there were eight minutes ten
minutes and so forth because the you
know it started as a digital product
that you downloaded so you didn't have
this this constraint anymore so I think
it's something really interesting that I
don't think has fully happened yet we're
kind of jumping ahead a little bit to
where we are but I think there's there's
tons of format innovation in music that
should happen now
that couldn't happen when you needed to
really adhere to the distribution
constraints if you didn't adhere to that
you would get no distribution so Björk
for example the
Icelandic artist she made a full iPad
app as an album that's very expensive
you know even though the App Store has
great distribution she gets nowhere near
the distribution versus staying within
the 3-minute format so I think now that
music is fully digital inside these
streaming services there is there is the
opportunity to change the format again
and allow creators to be much more
creative without limiting their their
distribution ability that's interesting
that you're right it's surprising we
don't see that taken advantage of more
often it's almost like the constraints
of the distribution from the 50s and 60s
have molded the culture to where we want
the three to five minute song and not
anything else not just that we want the
song as consumers and as artists like I
cuz I write a lot of music and I never
even thought about writing something
longer than 10 minutes that's it's
really interesting that those
constraints because all your training
data has been three minutes right it's
right okay so yes digitization of data
led to then mp3s
yeah so I think you had this file then
that was distributed physically but then
you had the components of digital
distribution and then the internet
happened and there was this vacuum where
you had a format that could be digitally
shipped but there was no business model
and then all these pirate networks
happened Napster and in Sweden
Pirate Bay which was one of the biggest
and it you know I think from a consumer
point of view which which kind of leads
up to the inception of of Spotify from a
consumer point of view consumers for the
first time had this access model to
music where they could without kind of
any marginal costs they could they could
try different tracks you could use music
in in new ways there was no marginal
cost and that was a fantastic consumer
experience that
just all the music ever made I think was
fantastic but it was also horrible for
artists because there was no business
model around it so they didn't make any
money so the user need almost drove the
user interface before there was a
business model and then there were these
download stores that allowed you to
download files which was a solution but
it didn't solve the access problem there
was still a marginal cost of 99 cents to
try one more track and I think that that
heavily limits how you listen to music
the example I always give is you know in
Spotify a huge amount of people listen
to music while they sleep while they go
to sleep and while they sleep if that
cost you $0.99 per three minutes you
probably wouldn't do that and you would
be much less adventurous if there was a
real dollar cost to exploring music so
the access model is interesting in that
it changes your music behavior you can
be you can take much more risk because
there's no marginal cost to it maybe let
me linger on piracy for a second because
I I find especially coming from Russia
piracy is something that's very
interesting to me not me of course ever
but my friends who partook in piracy of
music software TV shows sporting events
and usually to me what that shows is not
that they're they can actually pay the
money and they're not trying to save
money they're choosing the best
experience so what to me piracy shows is
a business opportunity in all these
domains and that's where I think you're
right Spotify stepped in is basically
piracy was an experience where you can
explore and find music you like and
actually the interface of piracy is
horrible because it's I mean it's bad
metadata yeah bad metadata long
download times all kinds of stuff and
what Spotify does is basically
first rewards artists and second makes
the experience of exploring music much
better I mean the same is true I think
for movies and so on this piracy reveals
in the software space for example I'm a
huge user and fan of Adobe products and
the there was much more incentive to
pirate Adobe products before they went
to a monthly subscription plan and now
all of a sudden people
that used to pirate Adobe products
that I know now actually pay gladly for
the monthly subscription I think you're
right I think it's in it's a sign of an
opportunity for product development and
that sometimes the there's a product
market fit before there's a business
model fit in product development I think
that that is that's a sign of it in in
Sweden I think it was a bit of both there
was there was a culture where we even had a
political party called the Pirate Party
and this was during the time when when
people said that you know information
should be free it was somehow
wrong to charge for ones and zeros so I
think people felt that artists should
probably make some money somehow else
and you know concerts or something so at
least in Sweden there was partly real
social acceptance even at the political
level and that but that also forced
Spotify to compete with with free which
which I don't think actually could
have happened anywhere else in the world
the music industry needed to be doing
bad enough to take that risk
and Sweden was like a perfect testing
ground it had government-funded
high-bandwidth low-latency broadband
which meant that the product would work
and it was also there was no music
revenue anyway so they were kind of like
I don't think this is going to work but
why not
so this product is one that I don't
think could have happened in America
the largest music market for example
so how do you compete with free because
that's an interesting world of the
Internet where most people don't like to
pay for things so Spotify steps in and
tries to yes compete with free how do
you do it so I think two things one is
people are starting to pay for things on
the Internet
I think one way to think about it was
that advertising was the first business
model because no one would put a credit
card on the internet transactional with
Amazon was the second and maybe
subscription is the third and if you
look offline subscription is the biggest
of those so that may still happen I think
people are starting to pay but
definitely back then we needed to
compete with free and the first thing
you need to do is obviously to lower the
price to free and then you need to be
better somehow and the way that Spotify
was better was on the user experience on
the on the actual performance the
latency of you know even if even if you
had high bandwidth broadband it would
still take you thirty seconds to a
minute to download one of these tracks
so the Spotify experience of starting
within the perceptual limit of immediacy
about 250 milliseconds meant that the
the whole trick was that it felt as if you
had downloaded all of Pirate Bay it was
on your hard drive it was that fast even
though it wasn't and it was still free
but somehow you were actually still
being a legal citizen so that was the
trick that Spotify managed to
pull off so yeah I've actually heard you
say this or write this and I was
surprised I wasn't aware of it
because I just took it for granted you
know whenever an awesome thing comes
along you just like of course it has to
be this way that that's exactly right
that it felt like the entire world's
libraries at my fingertips because of
that of that latency being reduced what
was the technical challenge in reducing
latency so there was a group of really
really talented engineers one of them
called Ludvig Strigeus he wrote the
actually from Gothenburg he wrote the
initial the µTorrent client which is kind
of an interesting backstory to Spotify
you know that we have one of the top
developers from from BitTorrent clients
as well so he wrote µTorrent the world's
smallest BitTorrent client and then he
he was acquired very early by Daniel and
Martin who founded Spotify and they
actually sold the µTorrent client to
BitTorrent but kept Ludvig so Spotify
had a lot of experience within
peer-to-peer networking so the original
innovation wasn't was a distribution
innovation where Spotify built an
end-to-end media distribution system up
until only a few years ago we actually
hosted all the music ourselves so we had
both the server side and the client
and that meant that we could do things
such as having a peer-to-peer solution
to use local caching on the client-side
because back then the world was mostly
desktop but we could also do things like
hack the TCP protocol
things like Nagle's algorithm for kind
of exponential back-off or ramp up and
just go full throttle and optimize for
latency at the cost of bandwidth and all
of this end-to-end control
meant that we could do an experience
that felt like a step change these days
we actually are on on GCP we don't host
our own stuff and everyone is really
fast these days so that was the initial
competitive advantage but then obviously
you have to move on over time and that
was over 10 years ago right that
was in 2008 the product was launched in
Sweden it was in a beta I think 2007 and
it was on the desktop right so it's
desktop only there's no phone there was
no phone the iPhone came out in 2008 but
the App Store came out one year later I
think so the writing was on the wall but
there was no phone yet you've mentioned
that people would use Spotify to
discover the songs they liked and then
they would torrent those songs just so
they can copy it to their phone it's just
hilarious exactly not torrent pirate
it seriously piracy does seem to be
like a good guide for business models
video content as far as I know Spotify
doesn't have video content well we do
have music videos and we do have videos
on the on the service but the way we
think about ourselves is that we're an
audio service and we think that if you
look at the amount of time that people
spend on audio it's actually very
similar to the amount of time that
people spend on video so the opportunity
should be equally big but today it's not
at all valued video is valued much higher
so we think it's basically completely
undervalued so we think of ourselves as an
audio service but within that audio
service I think video can make a lot of
sense I think for when you're when
you're discovering an artist you
probably do want to see them and
understand who they are to understand
their identity you won't see the video
every time now 90% of the time the phone
is gonna be in your pocket
for podcasters you use video I think
that can make a ton of sense so we do
have video but we're an audio service
we think of it as what we call
internally backgroundable video video
that is helpful but isn't isn't the
driver of the narrative I think also if
we look at YouTube the way people
there's quite a few folks who listen to
music on YouTube
so in some sense YouTube was a bit of a
competitor to to Spotify which is very
strange to me that people use YouTube to
listen to music they play essentially
the music videos right but don't watch
the videos and put it in their pocket
well I think I think it's similar to to
what strange I mean it's similar to what
we were for the piracy networks you know
where YouTube for historical reasons
have a lot of music videos so you use
people use YouTube for a lot of the
discovery part of the process I think
but then it's not a really good sort of
quote unquote mp3 player because it
doesn't even background you have to
keep the app in the foreground so so
it's not a good consumption tool
but it's a decently good discovery tool I
mean I think YouTube is a fantastic
product and I use it for all kinds of
purposes so if I were to admit something
I do use YouTube a little bit for the
discovery to assist in the discovery process
of songs and then if I like it I'll add
it to Spotify that's okay that's okay
with us okay so sorry we're jumping
around a little bit so the it's kind of
incredible you look at Napster you look
at the early days of Spotify how do you
one fascinating point is how do you grow a
user base so you're there in Sweden you
have an idea I saw the initial sketches
that look terrible how do you grow user
base from all from a few folks to
millions I think there are a bunch of
tactical answers so first of all I think
you need a great product I don't think
you take a bad product and and market it
to be successful so you need a great
product but sorry to interrupt but it's
a totally new way to listen to music too
so it's not just did people realize
immediately that Spotify is a great
product
I think they did so back to the point of
piracy it was a totally new way to
listen to music legally but people had
been used to the access model in Sweden
and the rest of the world for a long
time through piracy so one way to think
about Spotify is it was just legal and fast
piracy yeah and so people have been
using it for a long time so they weren't
alien to it they didn't really
understand how it could be legal because
it seemed too fast and too good to
be true yeah which i think is a great
product proposition if you can be too
good to be true but what I saw again and
again was people showing each other
clicking the song showing how fast it
started and saying I can't believe this
yeah so I really think it was about
speed then we also had an invite
program that was really meant
for scaling because we hosted our own
servers we needed to control scaling
but that built a lot of expectation and
I don't want to say hype because hype
implies that it was that it wasn't true
excitement around the product and we
replicated that when we launched in the
in the US we also built up an invite
only program first there are lots of
tactics but I think you need a you need
a great product that solves some problem
and basically the key innovation there
was technology but on a meta level the
innovation was really the access model
versus the ownership model and that was
tricky a lot of people said that they I
mean they wanted to own their music they
would never kind of rent it or borrow it
but I think the fact that we had a free
tier which meant that you get to keep
this music for life as well
helped quite a lot so this is an
interesting psychological point maybe
you can speak to it was a big shift for
me like I get to it's almost like a I go
to therapy for this is uh I think I
would describe my early listening
experience and I think a lot of my
friends do is basically hoarding music
is you're like slowly one song by one song
or maybe albums gathering a collection
of music that you love and you own it
it's like awful especially with CDs or
tape you like physically had it and and
what Spotify what I had to come to grips
with it was kind of liberating actually
is to throw away all the music I've had
this therapy session yes people and I
think the mental trick is so actually we've
seen the user data when Spotify
started a lot of people did the exact
same thing they started hoarding as if
the music would disappear right almost
the equivalent of downloading and so you
know we had these playlists that had
limits of like a few hundred thousand
tracks which we figured no one will ever need
well they do hundreds and
hundreds of thousands of tracks and to
this day you know some people want to
actually save quote unquote playlist
the entire catalog but I think that the
therapy session goes something like
instead of throwing away your music if
you took your files and you store them
in the locker at Google it'd be a
streaming service it's just that in that
locker you have all the world's music
now for free so instead of giving away
your music you got all the music it's
yours it's a you could think of it as
having a copy of the world's catalogue
there forever so you actually got more
music instead of less it's just that you
just took that hard disk and you sent it
to to someone who stored it for you and
once you go through that mental journey
I'm like still my files they're just
over there and I just have 40 million
or 50 million or something now then
people are like okay that's good
the problem is I think because you paid
us a subscription if we hadn't had the
free tier where you would feel like even
if I don't want to pay anymore I still
get to keep them you keep your playlist
forever they don't disappear even though
you stopped paying I think that was
really important if we would have
started as you know you can put in all
this time but if you stopped paying you
lose all your work I think that would
have been a big challenge and was the
big challenge for a lot of our
competitors that's another reason why I
think the free tier is really important
that people need to feel the security
that the work they put in it will never
disappear even if they decide not to pay
I like how you put the work you put in I
actually stopped even thinking of it that way I
just actually Spotify taught me to just
enjoy music I'm sorry as opposed as
opposed to what I was doing before which
is like in an unhealthy way hoarding
music which I found that because I was
doing that I was listening to a small
selection of songs way too much to
where I was getting sick of them whereas with
Spotify the more liberating kind of
approach is I was just enjoying of course
I listened to Stairway to Heaven over
and over but because of the extra
variety I don't get as sick of them
there is an interesting statistic I saw
that so
Spotify has maybe you can correct me but
over 50 million songs tracks and over
three billion playlists so 50 million
songs and three billion playlists 60
times more playlists what do you make of
that yeah so the way I think about it is
that from a from a statistician or
machine learning point of view you have
all these if you want to think about
reinforcement learning where you have
this state space of all the tracks and
you can take different journeys through
this through this world and these I
think of these as like people helping
themselves and each other creating
interesting vectors through this space
of tracks and then it's not so
surprising that across you know many
tens of millions of kind of atomic units
there will be billions of paths that
make sense and we're probably pretty
quite far away from having found all of
them so kind of our job now is users
when Spotify started it was really a
search box that was for that time pretty
powerful and then I like to refer to
this programming language called
playlisting where if you as you
probably were pretty good at music
you knew your new releases you knew your
back catalog you knew your Stairway to
Heaven you could create a soundtrack for
yourself using this playlisting tool
that's like a meta programming language
for music to soundtrack your life and
people who were good at music it's back
to how do you scale the product for
people who are good at music that was
actually enough if you had the catalog
and a good search tool and you could create
your own sessions you could create
a really good soundtrack for your entire
life probably perfectly personalized
because you did it yourself but the
problem was most people many people
aren't that good at music they just
can't spend the time even if you're very
good at music it's gonna be hard to keep
up so what we did to try to scale this
was to essentially try to build you can
think of them as
agents there's this friend that some
people had that helped them navigate
this music catalog that's what we're
trying to do for you but also there is
something like 200 million active users
on Spotify so there it's okay so from
the machine learning perspective you
have these 200 million people plus
they're creating it's really interesting
to think of playlist as I mean I don't
know if you meant it that way but it's
almost like a programming language it's
at least a trace of exploration of
those individual agents the
listeners and you have all these new
tracks coming in so it's a fascinating
space that is ripe for machine learning
so is it possible
how can playlists be used as data in
terms of machine learning and just to
help Spotify organize the music so we
found in our data not surprisingly that
people who playlisted a lot they retained
much better they had a great experience
and so our first attempt was to playlist
for users and so we acquired this
company called Tunigo of editors and
professional playlisters and kind of
leveraged the maximum of human
intelligence to help build kind
of these vectors through the track space
for people and that broadened
the product then the obvious next and we
you know used statistical means where
they could see when they created a
playlist how did that playlist perform
you know they could see skips of the
songs they could see how the songs
perform and they manually iterated the
playlist to maximize performance for a
large group of people but there were
never enough editors to playlist for you
personally so the promise of machine
learning was to go from kind of group
personalization using editors and tools
and statistics to individualization and
then what's so interesting about the 3
billion playlists we have is and the
truth is we lucked out this was not
an a priori strategy as is often the case
it looks really smart in hindsight but it
was dumb luck we looked at these
playlists and we had some people in the
company a person named Erik Bernhardsson who
was really good at machine learning
already back then in like
2007-2008 back then it was mostly
collaborative filtering and so forth but
we realized that what what this is is
people are grouping tracks for
themselves that have some semantic
meaning to them and then they actually
label it with a playlist name as well so
in a sense people were grouping tracks
along semantic dimensions and labeling
them and so could you could you use that
information to find that that latent
embedding and so we started playing
around with collaborative filtering and
we saw tremendous success with it
basically trying to extract some of
these some of these dimensions and and
if you think about it's not surprising
at all it'd be quite surprising if
playlists were actually random if they
had no semantic meaning for most people
they group these tracks for some reason
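The idea described here, that tracks grouped into the same playlists probably share some meaning, can be illustrated with a very small item-to-item similarity sketch. This is only one simple form of collaborative filtering and not necessarily what Spotify built; the playlist names, track names and the helper functions `track_similarities` and `most_similar` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each playlist is just a set of track names.
playlists = {
    "workout": {"track_a", "track_b", "track_c"},
    "gym mix": {"track_a", "track_b"},
    "sunday piano": {"track_d", "track_e"},
    "focus": {"track_c", "track_d", "track_e"},
}


def track_similarities(playlists):
    """Cosine similarity between tracks, based only on shared playlist membership."""
    appearances = defaultdict(int)  # track -> number of playlists containing it
    together = defaultdict(int)     # (track, track) -> number of playlists containing both
    for tracks in playlists.values():
        for track in tracks:
            appearances[track] += 1
        for a, b in combinations(sorted(tracks), 2):
            together[(a, b)] += 1
    # For binary membership vectors the dot product is the co-occurrence count
    # and each vector norm is the square root of the appearance count.
    return {
        (a, b): count / sqrt(appearances[a] * appearances[b])
        for (a, b), count in together.items()
    }


def most_similar(track, similarities, top_n=3):
    """Tracks that most often share playlists with the given track."""
    scored = []
    for (a, b), score in similarities.items():
        if a == track:
            scored.append((b, score))
        elif b == track:
            scored.append((a, score))
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_n]


similarities = track_similarities(playlists)
print(most_similar("track_a", similarities))
```

In this toy data track_a and track_b always appear together, so they come out as the closest pair, without the code knowing anything about how the tracks sound.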
so we just happened across this
incredible data set where people have
taken these tens of millions of
tracks and grouped them along different
semantic vectors and the semantics being
outside the individual users it's some
kind of universal there's a universal
embedding that holds across people on
this earth yes I do think that the
embeddings you find are gonna be
reflective of the people who playlisted
so if you have a lot of indie lovers
who playlist your embedding is going to
perform better there but what we found
was that yes there were these these
latent similarities they were very
powerful and we we had them it was
interesting because I think that the
people who playlisted the most
initially were these so-called music
aficionados who were really into music
and they often had a certain their
taste was often geared
towards a certain type of music and so
what surprised us if you look at the
problem from the outside you might
expect that
the algorithms would start performing
best with mainstreamers first because it
somehow feels like an easier problem to
solve mainstream tastes than really
particular tastes it was the complete
opposite for us the recommendations
performed fantastically for people who
saw themselves as having very unique
taste that's probably because all of
them playlisted and they didn't perform
so well for mainstreamers they actually
thought they were a bit too particular
and unorthodox so we had the complete
opposite of what we expected success
with the hardest problem first and
then had to try to scale to more
mainstream recommendations so you've
also acquired The Echo Nest that analyzes
song data so in your view maybe you can
talk about so what kind of data is there
from a machine learning perspective
there's like a huge amount we're
talking about playlisting and just
user data of what people are listening
to the playlists they're constructing and so
on and then there's the the actual data
within a song what makes a song I don't
know the actual waveforms right is there
any how do you mix the two how much
value is in each to me it seems like
user data is well it's a romantic notion
that the song itself would contain
useful information but if I were to
guess user data would be much more
powerful like playlists would be much
more powerful yeah so we use both our
biggest success initially was with
playlist data without understanding
anything about the structure of the
song but when we acquired The Echo Nest
they had the inverse problem they
actually didn't have any play data they
were just they were a provider of
recommendations but they didn't actually
have any play data so they they looked
at the structure of songs sonically and
they looked at Wikipedia for cultural
references and so forth right cool and
did a lot of NLU and so forth so we got
that skill into the company and combined
kind of our user data with
their kind of content-based so you can
think of it as we were user-based and
they were content-based in their
recommendations and we combined those two
and for some cases where you have a new
song there's no play data obviously you
have to try to go by either you know who
the artist is or or the sonic
information in the song or what it's
similar to so there's definitely value
in in both and we do a lot in both but I
would say yes the user data captures
things that that have to do with culture
in the greater society that you would
never see in the in the content itself
but that said we have seen we have a
research lab in Paris where you know we
can talk more about that on kind
of machine learning on the creator side
what it can do for creators not just for
the consumers but where we looked at how
does the structure of a song actually
affect the listening behavior and it
turns out that there is a lot of we can
we can predict things like skips based
on we you know based on on the song
itself we could say that maybe you
should move that chorus a bit because
your skips are gonna go up here there
is a lot of latent structure in the
music which is not surprising because it
is some sort of mind hack so there
should be structure that's probably
what we respond to you just blew my mind
actually for from the creator
perspective so that's really interesting
topic that probably most creators aren't
taking advantage of right so there's
I've recently got to interact with a few
folks youtubers who are like obsessed
with this idea of what do I do to make
sure people keep watching the video and
then like look at the analytics of which
point people turn off and so on first
of all I don't think that's healthy
it's because you can do it a little
too much but it is a really powerful
tool for helping the creative process
you just made me realize you could do
the same thing for creation of music and
so is that something you've looked into
oh is it can you speak to how much
opportunity there is for that yeah I
think I listened to the podcast with
Siraj yeah and I thought it was
fantastic and reacted to the same
thing where he said he
posted something in the morning
yeah immediately watch the feedback
where the drop off was and then
responded to that in the afternoon
yeah which which is quite different from
how people make podcasts for example yes
exactly I mean the feedback loop is
almost non-existent it's very so if we
back out one level I think actually
both for music and podcasts which we
also do at Spotify I think there's a
tremendous opportunity just for the
creation workflow and I think it's
really interesting speaking to you who
because you're a musician a developer
and a podcaster if you think about those
three different roles if you if you make
the leap as a musician if you if you
think about it as a software tool chain
really your DAW with the stems
that's the IDE right that's what you
work in source code format with
what you're creating then you
sit around and you play with that and
when you're happy you compile that thing
into some sort of you know AAC or mp3 or
something you do that because you get
distribution there's so many runtimes
for that mp3 across the world in
car stereos and stuff so you kind of
compile this executable you ship it out
in kind of an old fashioned box
software analogy and then you hope for
the best right right but as a as a as a
software developer you'd never do that
first you go on GitHub and you
collaborate with other creators yeah and
then you know you'd think it'd be crazy to
just ship one version of your software
without doing an a/b test without any
feedback loop and then issue tracking
exactly and then you would you would
look at the feedback loops and try to
optimize that thing right so I think if
you think of it as a as a very specific
software tool chain it looks quite
arcane you know the tools that a music
creator has versus what a software
developer has so that's kind of how we
think about it and why wouldn't a why
wouldn't a music creator have something
like github you could collaborate much
more easily so we have we bought this
company called Soundtrap which has a
kind of Google Docs for music approach
where you can collaborate with other
people on the kind of source code format
with stems and I think introducing
things like AI tools there to help you
as you're creating music both in in
helping you you know
put accompaniment to your music like drums
or something help you master and mix
automatically help you understand how
this track will perform exactly what you
would expect as a software developer I
think makes a lot of sense and I think
the same goes for a podcaster I think
podcasters will expect to have the same
kind of feedback loop that Siraj has
like why wouldn't you maybe maybe it's
not healthy but sorry I wanted to
criticize the fact cuz you can overdo it
because a lot of the each and we're in a
new era of that so you can become
addicted to it and therefore what people
say you become a slave to the YouTube
algorithm or sort of it's a it's always
a danger of a new technology as opposed
to say if you're creating a song
becoming too obsessed about the intro
riff to the song that keeps people
listening versus actually the entirety
of the creation process it's a balance
absolutely but the fact that there's
zero I mean you're blowing my mind right
now because you're completely right that
there is no signal whatsoever there's no
feedback whatsoever in the creation
process in music or podcasting almost
at all and are you saying that Spotify
is hoping to help create tools to not
tools but no tools actually actually
tools for creators absolutely so we have
we've made some acquisitions the last
few years around music creation this
company called Soundtrap which is a
DAW digital audio workstation but that
is browser-based and that their focus
was really the Google Docs approach
where you can collaborate with people
much more easily than you could in
previous tools so we have some of these
tools that we're working with that we
want to make accessible and then we can
connect it with our with our consumption
data we can create this feedback loop
where we could help you understand we
could help you create and help you
understand how you will perform we also
acquired this other company within
podcasting called anchor which is one of
the biggest podcasting tools mobile
focused so really focused on simple
creation or easy access to create
but that also gives us this feedback
loop and even before that we invested in
something called Spotify for artists and
Spotify for podcasters which is an app
that you can download you can verify
that you are that creator and then you
get you get things that you know
software developers have had for years
you can see where if you look at your
podcast for example on Spotify or or a
song that you released you can see how
it's performing which cities it's
performing in and who is listening to it
what's the demographic breakdown so
similar in the sense that you can
understand how you're actually doing on
the on the platform so we we definitely
want to build tools I think you also
interviewed the head of research for
Adobe and I think back to
Photoshop that you like I think that's
an interesting analogy as well Photoshop
I think has been very innovative in
helping photographers and artists and I
think there should be the same kind of
tools for for music creators where you
could get you know AI assistance for
example as you're creating music as you
can do with Adobe where you can say I
want a sky over here and you can get
help creating that sky the really
fascinating thing is what Adobe doesn't
have is a distribution for the content
you create so you don't have the data of
if I create if I uh you know whatever
creation I'm making in Photoshop or Premiere
I can't get like immediate feedback like
I can on YouTube for example about the
way people are responding and if Spotify
is creating those tools that that's a
really exciting actually world but let's
talk a little about podcasts so I
have trouble talking to one person so
it's a bit terrifying and kind of hard
to fathom but on average sixty to a
hundred thousand people will listen to
this episode okay so it's intimidating
it's intimidating
so I host it on Blubrry I don't know if
I'm pronouncing that correctly actually
it looks like most people listen on
Apple Podcasts Castbox and Pocket Casts
and only about a thousand listen on
Spotify in just my podcast right so
do you see a time when Spotify
will dominate this so Spotify is
relatively new to this podcasting
yeah in podcasting what's
the deal with podcasting and Spotify how
serious is Spotify about podcasting do
you see a time where everybody would
listen to you know probably a huge
amount of people majority perhaps listen
to music on Spotify do you see a time
when the same is true for podcasting
well I certainly hope so
that is our mission our mission as a
company is actually to enable a million
creators to live off of their art and a
billion people to be inspired by it and what I
think is interesting about that
mission is it actually puts the creators
first even though it started as a consumer
focused company and it says to be able
to live off of their art not just make
some money off of their art as well so
it's quite an ambitious project and so
we think about creators of all kinds and
we kind of expanded our mission from
being music to being audio a while back
and that's not so much because we think
we made that decision we think that
decision was was made for us
we think the world made that decision
whether we like it or not when you put
in your headphones you're gonna make a
choice between music and a new episode
of your podcast or something else right
we're in that world whether we like it
or not
and that you know that's how radio worked
so we decided that we think it's about
audio you can see the rise of audiobooks
and so forth we think audio is this
great opportunity so we decided to enter
it and obviously Apple and Apple
Podcasts is absolutely dominating in
podcasting and we didn't have a single
podcast only like two years ago
what we did though was we we we looked
at this and said no can we bring
something to this you know we want to do
this but the back to the
Josefa we have to do something that
consumers actually value to be able to
do this and the reason we've gone from
not existing at all to being
by quite a wide margin the
second-largest podcast consumption
still wide gap to iTunes but we're
growing quite fast I think it's because
when we when we looked at the consumer
problem
people said surprisingly that they
wanted their podcasts and music in the
same in the same application so what we
did was we took a little bit of a
different approach what we said instead
of building a separate podcast app we
thought is there a consumer problem to
solve here because the others are very
successful already and we thought there
was in making a more seamless experience
where you can have your podcasts and your
music in the same application because we
think it's audio to you and that that
has been successful and that meant that
we actually had 200 million people to
offer this to instead of starting from 0
so I think we have a good chance because
we're taking a different approach than
the competition and back to the other
thing I mentioned about creators because
we're looking at the end-to-end flow I
think there's a tremendous amount of
innovation to do around podcast as a
format when we have creation tools and
consumption I think we could start
improving what podcasting is I mean
podcast is this opaque big like one to two
hour file that you're streaming which
really doesn't make that much sense in
2019 that it's not interactive there's
no feedback loops nothing like that so I
think if we're gonna win it's gonna have
to be because we build a better product
for creators and for for consumers so
we'll see but it's certainly our goal we
have a long way to go
well the creators part is really
exciting you already got me hooked
there the only stats I have
Blubrry just recently added the stats
of whether it's listened to the end or
not and that's like a huge improvement
but that's still nowhere near where you
could possibly go in terms of statistics
you just download the Spotify for
Podcasters app and verify and then then
you know where people dropped out in
this episode oh wow ok the moment I
started talking okay I might be
depressed by this but okay so one one
other question
the original Spotify for music and I
have a question about podcasting in this
line is the idea of albums I have
music aficionado friends who
are really big fans of music often
really enjoy albums listening to entire
albums of an artist and correct me if
I'm wrong but I feel like Spotify has
helped replace the idea of an album with
playlists so you create your own albums
it's kind of the way at least I've
experienced music and I have really
enjoyed it that way one of the things that
was missing in podcasting for me I don't
know if it's missing I don't know it's
an open question for me but the way I
listen to podcasts is the way I would
listen to albums so I take Joe Rogan
experience and that's an album and I
listen you know I like I put that on and
I listen one episode after the next and
there's a sequence and so on is there a
room for doing what you did for music or
doing what Spotify did for music but
creating playlists sort of this kind of
playlisting idea of breaking apart from
podcasting from individual podcasts and
creating kind of this interplay or
have you thought about that space it's a
great question so I think in in music
you're right basically you bought an
album so it was like you bought a small
catalog of like 10 tracks right it was
it was again it was actually a lot of a
lot of consumption you think it's about
what you like but it's based on the
business model right so you paid for
this 10 track yeah service and then you
listened to that for a while
and then when everything was flat priced
you tended to listen differently now so
so I think the I think the album is
still tremendously important that's why
we have it and you can save albums and
so forth and you have a huge amount of
people who really listen according to
albums and I like that because it is a
creator format you can tell a longer
story over several tracks and so some
people listen to just one track some
people actually want to hear that whole
story now in podcast I think
I think it's different you can argue
that podcasts might be more like shows
on Netflix you have like a full season
of Narcos and you're probably not
going to do like one episode of Narcos
and then one of House of Cards like you
know that there's a narrative there and
you you you love the cast and you love
the characters so I think people will
people love shows and I think they will
they will listen to those shows I do
think you follow a bunch of shows at the
same time so there's certainly an
opportunity to bring you the latest
episode of you know whatever that 5 6 10
things that that you're into but but I
think I think people are gonna listen to
specific hosts and love those hosts for
a long time because I think there's
something different with podcast where
this format of the the experience of the
of the audience is actually sitting here
right between us whereas if you look at
something on TV the audio actually would
come from
you would sit over there the audio would
come to you from both of us as if you
were watching not as if you were part of
the conversation so my experience
having listened to podcasts like yours and
Joe Rogan is I feel like I know all of
these people they they have no idea who
I am but I feel like you know so many
artists and that it's very different
from me watching a watching like a TV
show or an interview so I think you you
kind of fall in love with people and
experience in a different way so I think
I think shows and hosts are gonna be
very very important I don't think that's
gonna go away in just some sort of thing
where well you don't even know who
you're listening to I don't think that's
gonna happen what I do think is I think
there's a tremendous discovery
opportunity in podcast because the
catalog is growing quite quickly and I
think podcast is only a few like five
six hundred thousand shows right now if
you look back to YouTube as another
analogy for creators no one really
knows if you would lift the lid on
YouTube but it's probably billions of
episodes and so I think the podcast
catalog would probably grow tremendously
because the creation tools are getting
easier and then you're gonna have this
discovery opportunity that I think is
really big so so a lot of people tell me
that they love their shows
but discovering podcasts kind of sucks
it's really hard to get into a new show
they're usually quite long it's a big
time investment so I think there's plenty
of opportunity in the discovery part
yeah for sure a hundred percent in in
even the dumbest there's so many
low-hanging fruit too for example just
knowing what episode to listen to first
to try out a podcast exactly because
most podcasts don't have an order to
them they they can be listened to out
of order and sorry to say some are
better than others episodes so some
episodes of Joe Rogan are better than
others and it's nice to know which you
should listen to to try it out and
there's as far as I know almost no
information in terms of like up votes on
how good an episode is exactly so I
think part of the problem is you it's
kind of like music there isn't one
answer people use music for different
things and there's actually many
different types of music there's workout
music and there's classical piano music
and focus music and and and so forth I
think the same with podcasts some
podcasts are sequential they're supposed
to be listened to in in order it's
actually it's actually telling a
narrative some podcasts are one topic
kind of like yours but different guests
so you could jump in anywhere some podcasts
have completely different topics and
for those podcasts it might be that I
want you know we should recommend one
episode because it's about AI yeah from
someone but then they talk about
something that you're not interested in
the rest of the episode so I think
what we're spending a lot of time on now
is just first understanding the domain
and creating kind of the knowledge graph
of how do these objects relate and how
do people consume and I think we'll find
that it's gonna be it's gonna be
different I'm excited
Spotify is the first
people I'm aware of that are trying to
do this for podcasting podcasting has
been like a Wild West
up until now it's been a very we want to
be very careful though because it's been
a very good Wild West I think it's this
fragile ecosystem
and we want to make sure that you
don't barge in and say like oh we're
gonna internalize this thing and you
have to think about the the creators you
have to understand how they get
distribution today who listens to how
they make money today and you know
make sure that their business model
works that they understand the I think
it's back to doing something improving
their products like feedback loops and
distribution so jumping back into terms
of this fascinating world of recommender
systems and listening to music and using
machine learning to analyze things do
you think it's better to what currently
correct me if I'm wrong but currently
Spotify lets people pick what they
listen to for the most part there's a
discovery process but you kind of
organize playlists is it better to let
people pick what they listen to or
recommend what they should listen to
something like stations by Spotify that
I saw that you're playing around with
maybe you can tell me what's the status
of that this is a Pandora style app that
just kind of as opposed to you select
the music you listen to it kind of feeds
you music you listen to what's the
status of stations by Spotify what's its
future the story of Spotify as we have
grown has been that we made it more
accessible to different different
audiences and stations is another one of
those where the question is some people
want to be very specific they actually
want to hear stairway to heaven right
now
that needs to be very easy to do and
some people or even the same person at
some point might say I want to feel
upbeat or I want to feel happy or I want
songs to sing in the car right so they
put in the information at a
very different level and then we need to
translate that into what that means
musically so stations is a test to to
create like a consumption input vector
that is much simpler where you can just
tune in a little bit and see if that
increases the overall reach but we're
trying to kind of serve the entire gamut
of super-advanced so-called music
aficionados all the way to to people who
they love listening to music but it's
not their number-one priority in life
right they're not going to sit and
follow every new release from every new
artist they need to be able to to
influence music at a at a at a different
level so we're trying you can think of
it as different products and I think
when one of them one of the interesting
things to answer your question on if
it's better to let the user choose or to
play I think the answer is the the
challenge when you when machine learning
kind of came along there was a lot of
thinking about what does product
development mean in a machine learning
context people like Andrew Ng for
example when he went to Baidu he started
doing a lot of practical machine
learning went from academia and you know
he thought a lot about this and he had
this notion that you know product
manager designer and engineer they used to
work around this wireframe that kind of
described what the product should look
like something to talk about but when you're
doing like a chatbot or a playlist how
do you what are you gonna say like it
should be good that's not a good product
description so how do you how do you do
that he came up with this notion that
the test set is the new wireframe the
the job of the product manager is to
source a good test set that is
representative of what like if you say
like I want a playlist that is songs
to sing in the car the job of the product
manager is to go and source like a good test
set of what that means then you can work
with engineering to have algorithms to
try to produce that right so we try to
think a lot about how to structure
product development for for a machine
learning age and and what we discovered
was that a lot of it is actually in the
expectation and you can go you can go
two ways so
let's say that if you if you set the
expectation with the user that this is a
discovery product like discover weekly
you're actually setting the expectation
that most of what we show you will not be
relevant when you're in the discovery
process you're gonna accept that
actually if you find one gem every
Monday that you totally love you're
probably gonna be happy even though the
statistically one out of ten is
terrible or one out of twenty is
terrible from a user point of view
because the setting was discovery it's
fine can I get a sorry to interrupt real
quick I just actually learned about
discover weekly which is a Spotify I
don't know it's it's a feature of Spotify
that shows you cool songs to listen to
maybe I can do issue tracking I couldn't
find it in my Spotify app it's in your
library it's in the library it's in the
list of life cuz I was like whoa this is
cool I didn't know this existed and I
tried to find it but I'll show it to you
back product yeah but yeah it's a so yes
I just uh just to mention the
expectation there is basically that
you're going to discover new songs
yes so then you can be quite adventurous
in in the recommendations you do but but
if you're but we have another product
called daily mix which kind of implies
that these are only going to be your
favorites so if you have one out of ten
that is good and nine out of ten that
doesn't work for you you're gonna think
it's a horrible product so actually a
lot of the product development we
learned over the years it's about
setting the right expectation so for
daily mix you know algorithmically we
would pick among things that feel very
safe in your taste space whereas for
discover weekly we go kind of wild
because the expectation is most of this
is not gonna so a lot of the
answer to your question there of
should you let the user pick or not it
depends we have some products where the
whole point is that the user can click
play put the phone in the pocket and it
should be really good music for like an
hour we have other products where you
probably need to say like no no say no
no and it's very interactive I see that
makes sense and then the radio product
the station's product is one of these
like click Play put in your pocket for
hours that's really interesting so
you're thinking of different test sets
for different
users and create products that sort
of optimize for those test sets
that represents a specific set of users
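The point made in this exchange, that the same hit rate can be fine for a discovery product and terrible for a favorites product, can be sketched as a test set check with a different bar per product. This is only an illustration under assumptions: the threshold values, the product labels and the helper names `hit_rate` and `meets_expectation` are hypothetical and do not come from the conversation.

```python
def hit_rate(recommended, test_set):
    """Fraction of recommended tracks that appear in the curated test set."""
    if not recommended:
        return 0.0
    hits = sum(1 for track in recommended if track in test_set)
    return hits / len(recommended)


# Hypothetical minimum hit rates: a discovery product is allowed to miss
# most of the time, a favorites product is not.
EXPECTED_HIT_RATE = {"discovery": 0.1, "favorites": 0.8}


def meets_expectation(product, recommended, test_set):
    """True if the recommendations clear the bar set for this kind of product."""
    return hit_rate(recommended, test_set) >= EXPECTED_HIT_RATE[product]


# Toy example: ten recommendations, of which the listener loves exactly one.
liked_by_user = {"t1"}
recommended = [f"t{i}" for i in range(1, 11)]

print(meets_expectation("discovery", recommended, liked_by_user))
print(meets_expectation("favorites", recommended, liked_by_user))
```

The same list of ten tracks passes under the discovery expectation and fails under the favorites expectation, which is the one out of ten situation described above.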
yes I think one thing that I think is
interesting is we we invested quite
heavily in editorial in in people
creating playlists using statistical
data and that was successful for us and
then we also invested in in machine
learning and for the longest time you
know within Spotify and within the rest
of the industry there was always this
narrative of humans versus the machine
algo versus editorial and editors
would would say like well if I had that
data if I could see your playlisting
history and I made a choice for you I
would have made a better choice and they
would have because honestly they're
much smarter than these algorithms the
human is incredibly smart compared to
our algorithms they can take culture
into account and so forth the problem is
that they can't make 200 million
decisions you know per hour for every
user that logs in so the algo may be not
as sophisticated but much more efficient
so there was this there was this
contradiction but then a few years ago
we started focusing on this kind of
human-in-the-loop thinking around
machine learning and we actually coined
an internal term for it called
algotorial a combination of algorithms and
editors where if we take a concrete
example you think of the editor as
this paid expert that we have who's
really good at something like soul hip-hop
EDM something right they're a true expert
known in the industry so they have
all the cultural knowledge you think of
them as the product manager and you say
that let's say that you want to create a
you think that there's a there's a
product need in the world for something
like songs to sing in the car or songs to
sing in the shower and I'm taking
that example because it exists people
love to scream songs in the car
when they drive right yeah so you want
to create that product and you have this
product manager who is a musical expert
they come up with a concept
like I think this is a missing thing in
humanity like a playlist called songs
to sing in the car they create the
framing the image the title and they
create a test set or they create a group
of songs like a few thousand songs
out of the catalogue that they manually
curate that are known songs that are
great to sing in the car and they can
take like true romance into account they
understand things that our algorithms do
not at all so they have this huge set of
tracks then when we deliver that to you
we look at your taste vectors and you
get the 20 tracks that are songs to sing
in the car in your taste so you have you
have personalization and editorial input
in the same process if that makes sense
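[editor's sketch, not part of the conversation: the "algotorial" flow described above, where an editor curates a large pool and each listener gets the tracks from that pool closest to their taste vector, could look roughly like the following. The track names, the three-dimensional vectors and the use of cosine similarity are all assumptions made for illustration; the conversation does not say how the ranking is actually done.]

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def personalize_pool(editor_pool, taste_vector, k=20):
    """Pick the k tracks from an editor-curated pool closest to one listener's taste.

    editor_pool maps a track id to that track's embedding (hypothetical data).
    """
    ranked = sorted(
        editor_pool.items(),
        key=lambda item: cosine(item[1], taste_vector),
        reverse=True,
    )
    return [track_id for track_id, _ in ranked[:k]]


# Hypothetical pool for "songs to sing in the car", curated by an editor.
pool = {
    "power_ballad": [0.9, 0.1, 0.0],
    "pop_anthem": [0.2, 0.9, 0.1],
    "rock_classic": [0.7, 0.3, 0.2],
    "hiphop_singalong": [0.0, 0.2, 0.9],
}
listener = [0.8, 0.2, 0.1]  # hypothetical taste vector

print(personalize_pool(pool, listener, k=2))
```

[the editor decides what belongs in the pool at all, and the similarity step only decides which part of that pool a given listener sees.]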
yeah it makes total sense and I have
several questions around that this is a
this is like fascinating okay so first
it is a little bit surprising to me that
the world expert humans are
outperforming machines at specifying
songs to sing in the car so maybe you
could talk to that a little bit I don't
know if you can put it into words but
what is it how difficult is this problem
of do you really I guess what I'm
trying to ask is how difficult is
it to encode the cultural references the
the context of the song the artists all
those things together can machine
learning really not do that I mean I
think machine learning is great at
replicating patterns if you have the
patterns but if you try to write with me
a spec of what the great songs
to sing in the car definition is is it
loud does it have many choruses
should it have been in movies it quickly
gets incredibly complicated right yeah
and and a lot of it may not be in the
structure of the song or the title it
could be cultural references because you
know it was a history once so so the
definition problems quickly get and I
think that was the that was the insight
of Andrew Ng when he said the job of
the product manager is to understand
these things that that algorithms don't
and then define what that looks like and
then you have something to train towards
right then you have kind of the test set
and then so so today the editors create
this pool of tracks and then we
personalize you could easily imagine
that once you have this set you could
have some automatic exploration on the
rest of the catalogue because then you
understand what it is and then the other
side of it where machine learning
does help is this taste vector how hard
is it to construct a vector that
represents the things an individual
human likes this human preference so
you can you know music isn't like it's
not like Amazon like things you usually
buy music seems more amorphous like it's
this thing that's hard to specify like
what what is what you know if you look
at my playlist what is the music that I
love it's harder it seems to be much
more difficult to specify concretely so
how hard is it to build the taste
vector it is very hard in the sense that
you need a lot of data and I think what
we found was that it's
not a stationary problem it changes
over time and so we've gone through the
journey of if you've done a lot of
computer vision obviously I've done a
bunch of computer vision in my past and
and we started kind of with the
handcrafted heuristics for you know this
is kind of indie music this is this and
if you consume this you probably like
this so we have we started there and we
have some of that still then what was
interesting about the playlist data was
that you could find these latent things
that wouldn't necessarily even make
sense to you that could could even
capture maybe cultural references
because they co-occurred things that
wouldn't have appeared
mechanistically either in the content or
so forth so I think that I think the
core assumption is that there are
patterns in in almost everything and if
there are patterns these these embedding
techniques are getting better and better
now as everyone else we're also
using kind of deep embedding so you can
encode binary values and so forth and
what I think is interesting is this
process to try to find things that
you wouldn't necessarily
have guessed so it is very hard in
a in an engineering sense to find the
dimensions it's an incredible
scalability problem to do for hundreds
of millions of users and to update it
every day but in theory
embeddings aren't that complicated
you try to find some principal
components or something like that
dimensionality reduction and so forth the
theory I guess is easy the
practice is very very hard and it's a
it's a huge engineering challenge but
fortunately we have some amazing
research and engineering teams in this
space yeah I guess the question is
I mean it's similar I deal with it
in the autonomous vehicle space the
question is how hard is driving and here
is basically the question is of edge
cases so embedding probably works not
probably but I would imagine works well
in a lot of cases so there's a bunch of
questions that arise then so do song
preferences does your taste vector
depend on context like mood right so
there's different moods and absolutely
so how does that take in is it
possible to take that into consideration
or do you just leave that as an interface
problem that allows the user to just
control it so when I'm looking for
workout music I kind of specify it by
choosing certain playlists doing certain
searches yeah so that's a great point it's
back to the product development you
could try to spend a few years trying to
predict which mood you're in
automatically when you open Spotify or you create
a tab which is happy and sad right and
you're gonna be right 100% of the time
with one click now it's probably much
better to let the user tell you if
they're happy or sad or if they want to
work out on the other hand if your
interface becomes 2,000 tabs you're
introducing so much friction that no one
will use the product so then you have to
get better so it's this thing where I
think I don't remember who coined
it but it's called fault-tolerant UIs
right you build the UI that is tolerant
of being wrong and then you can be much
less right in your you know in your
algorithms
so we you know we've had to learn a lot
of that building the right UI that fits
where the where the machine learning is
and and a great discovery there which is
which was by the teams during one of our
hack days was this thing of taking
discovery packaging it into a playlist
and saying that these are new tracks
that we think you might like based on
this and setting the right expectation
made it made it a great product so I
think we have this benefit that for
example Tesla doesn't have that we can
change the expectation we
can build a fault-tolerant
setting it's very hard to be fault
tolerant when you're driving at you know
100 miles per hour or something and
we have the luxury of being able to
be wrong if we have the
right UI which gives us different
abilities to take more risk so I
actually think the self-driving problem
is is much harder oh yeah for sure
it's much less fun because people die
exactly and in Spotify it's just such a
more fun problem because failure I
mean failure is beautiful in a way it leads to
exploration so it's a really fun
reinforcement learning project the worst
case scenario is to get these WTF tweets
like how did I get this song which
is a lot better than the self-driving
failure so what's the feedback that a user
puts the signal that a user provides
into the system so the the you mentioned
skipping what is like the strongest
signal is uh you didn't mention clicking
like so so we have a few signals that
are important obviously playing playing
through so so one of the benefits of
music actually even compared to podcast
or or movies is the object itself is
really only about three minutes so you
get a lot of chances to recommend and
the feedback loop is is every three
minutes instead of every two hours or
something so you actually get kind of
noisy but quite fast feedback and so
you can see if people played through
which is you know the inverse of
skip really that's
an important signal on the other hand
much of the consumption happens when
your phone is in your pocket maybe
you're running or driving or you're
playing on a speaker and so you not
skipping doesn't mean that you love that
song it might be that it wasn't bad
enough that you would walk up and skip
so it's a noisy signal then then we have
the equivalent of the like which is you
saved it to your library that's a pretty
strong signal of affection and then we
have the more explicit signal of
playlisting like you took the time to create
a playlist you put it in there there's a
very small chance that if you
took all that trouble this is not a
really important track to you and then
we understand also what are the tracks
it relates to so we have we have the
playlist thing we have the like and then
we have the listening or skip and you
have to have very different approaches to
all of them because they have different levels
of noise one is very voluminous but
noisy and the others are rare but you can
you can probably trust it yeah it's
interesting I think between those
signals captures all the information
you'd want to capture I mean there's a
feeling a shallow feeling for me that
there's sometimes I'll hear songs like
yes this is you know this was the right
song for the moment but there's really
no way to express that fact except by
listening through it all the way yeah
and maybe playing it again at that time
or something yeah
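[editor's sketch, not part of the conversation: the signals discussed above are play-through and skip (frequent but noisy), saving to the library (a fairly strong signal) and adding to a playlist (the strongest). One simple way to reflect their different trust levels is to give the rare, explicit signals more weight per event. The weights and track ids below are invented for illustration and are not Spotify's values.]

```python
from collections import defaultdict

# Hypothetical per-event weights: frequent, noisy signals count for little,
# rare, deliberate ones count for a lot.
SIGNAL_WEIGHTS = {
    "skip": -0.5,
    "play_through": 0.25,
    "save": 2.0,
    "playlist_add": 4.0,
}


def track_affinity(events):
    """Sum weighted feedback events into one affinity score per track.

    events is an iterable of (track_id, signal_name) pairs.
    """
    scores = defaultdict(float)
    for track_id, signal in events:
        scores[track_id] += SIGNAL_WEIGHTS[signal]
    return dict(scores)


events = (
    [("a", "play_through")] * 4      # played through four times, phone in pocket
    + [("b", "playlist_add")]        # added to a playlist once
    + [("c", "skip"), ("c", "play_through")]
)
scores = track_affinity(events)
print(scores)
```

[in this toy data a single playlist add outweighs four passive play-throughs, which mirrors the point that not skipping does not mean you love the song.]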
is there a need for a button that says
this was the best song I could have
heard at this moment well we're playing
around with that with kind of the thumbs
up concept saying like I really like
this just kind of talking to the
algorithm it's unclear if that's the
best way for humans to interact maybe
it is maybe they should think of Spotify
as a person an agent sitting there
trying to serve you and you can say like
bad Spotify good Spotify right now
the analogy we've had is more you
shouldn't think of us we should be
invisible and the feedback is if you
save it it's kind of you work for
yourself you do a playlist because you
think is great and we can learn from
that it's kind of back to back to Tesla
how they kind of have that shadow mode
they sit and watch you drive we kind of
took the same analogy we sit and watch
you playlist and then maybe we
can offer you an autopilot where you can
take over for a while or something like
that and then back off if you say like
that's not that's not good enough
but but I think it's interesting to
figure out what your mental model is if
Spotify is an AI that you talk to which
I think might be a bit too abstract for
for many consumers or if you still think
of it as it's my music app but it's just
more helpful and depends on the device
it's running on which brings us to smart
speakers so a lot of the Spotify
listening I do is on
devices I can talk to whether it's
from Amazon Google or Apple what's the
role of Spotify on those devices how do you
think of it differently than on the
phone or on the desktop there are a few
things to say about that first of all
it's incredibly exciting they're growing
like crazy especially here in the
US and it's solving a consumer
need that I think is you can think of
it as just remote interactivity you can
control this thing from across
the room and it may feel like a
small thing but it turns out that
friction matters to consumers being able
to say play pause and so forth from
across the room is very powerful so
basically you made the living
room interactive now and what we see in
our data is that the the number one use
case for these speakers is music music
and podcasts so fortunately for us it's
been important to these companies to
have those use cases covered so they
wanted Spotify on these we have very good
relationships with them and we're
seeing tremendous success
for them what I think is
interesting about them is it's already
working with we kind of had this
epiphany many years ago back when we
started using Sonos if you went through
all the trouble of setting up your Sonos
system you had this magical experience
where you had all the music ever made in
your living room and we made
this assumption that in the home everyone
used to have a CD player at home but they
never managed to get their files working
in the home having this network attached storage
was too cumbersome for most
consumers so we made the assumption that
the home would skip from the CD all the
way to streaming boxes where you
would buy the stereo with
all the music built-in that took longer
than we thought but with the voice
speakers that was the unlocking that
made kind of the connected speaker
happen in the home so so it really it
really exploded and we saw this
engagement that we predicted would
happen what I think is interesting
though is where it's going from now
right now you think of them as voice
speakers but I think if you look at
Google i/o for example they just added a
camera to it where you know when the
alarm goes off instead of saying hey
Google stop you can just wave your hand
so I think they're gonna think more of
it as a as an agent or as an assistant
truly an assistant and an assistant that
can see you it's gonna be much more
effective than a blind assistant so
I think these things will morph and we
won't necessarily think of them as
quote-unquote voice speakers anymore
just as interactive access to the
Internet in the home but I still think
that the biggest use case for those will
be audio so for that reason
we're investing heavily in it and we've
built our own NLU stack to be able to
the challenge here is how do you
innovate in that world it
lowers friction for consumers but it's
also much more constrained you
have no pixels to play with in an
audio-only world it's really the
vocabulary that is the interface so we
started investing and playing around
quite a lot with that trying to
understand what the future will be of
you speaking and gesturing and waving at
your music and actually you're actually
nudging closer to the autonomous vehicle
space because from everything I've seen
the level of frustration people
experience upon failure of natural
language understanding is much higher
than failure in other contexts people get
frustrated really fast so if you
screw that experience up even just a
little bit they give up really quickly
yeah and I think you see that in the
data while it's tremendously
successful
the most common interactions are play
pause
and you know next the things where if you
compare it to taking up your phone
unlocking it bringing up the app and
clicking skip it
is much lower friction but then for
longer more complicated things like
can you find me that song about people
still pick up the phone and search and
then play it on their speaker so we
tried again to build a fault-tolerant UI
where for the more for the more
complicated things you can still pick up
your phone have powerful full keyboard
search and then try to optimize for
whether it's actually lower friction and
try to it's kind of like the Tesla
autopilot thing you have to be at the
level where you're helpful if you're too
smart and just get in the way people are
gonna get frustrated and first of all
I'm not obsessed with stairway to heaven
it's just a good song but let me mention
that as a use case because it's an
interesting one I've literally told one
of I don't want to say the name of the
speaker because when people are
listening to this it'll make their speaker go
off but I talk to the speaker and I say
play stairway to heaven and every time
it like not every time but a large
percent of the time plays the wrong
stairway to heaven it plays like some
cover of the and that that part of the
experience I actually wonder from a
business perspective does Spotify control
that entire experience or no it seems
like the NLU the the natural language
stuff is controlled by the speaker and
then Spotify stays at a layer below that
it is a good and complicated question
some of which is dependent on the on the
partner so it's hard to comment on the
specifics but the question is the
right one the challenge is if you can't
use any other personalization I mean we
know which stairway to heaven and and
the truth is maybe for for one person it
is exactly the cover that they want and
they will be very frustrated if it plays
I think we default to the
right version but but you actually want
to be able to do the cover for the
person that just played the cover 50
times or Spotify is just gonna seem
stupid so you want to be able to
leverage the personalization but you
have this stack where you have
the ASR and this thing called the N-best
list with the best guesses
here and then the personalization comes in
at the end you actually want the
personalization to be here when you're
guessing about what they actually meant
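[editor's sketch, not part of the conversation: the point above is that personalization should take part in choosing among the speech recognizer's N-best guesses rather than arriving after a guess has been locked in. A minimal way to picture that is to combine each hypothesis's recognizer score with the listener's own play history before picking a winner. The track ids, scores, weight and the log-based boost are all assumptions for illustration, not a description of Spotify's or any partner's stack.]

```python
import math


def rerank_n_best(n_best, play_counts, weight=0.5):
    """Re-rank ASR hypotheses using a listener's play history.

    n_best: list of (track_id, asr_score) pairs, highest recognizer score first.
    play_counts: how often this listener has played each track (hypothetical data).
    weight: how strongly history may override the recognizer (assumed value).
    """
    def combined(item):
        track_id, asr_score = item
        # log1p damps the boost so a heavy play count does not grow without bound.
        return asr_score + weight * math.log1p(play_counts.get(track_id, 0))

    return sorted(n_best, key=combined, reverse=True)


n_best = [("stairway_original", 0.90), ("stairway_cover", 0.85)]

new_listener = {}                      # no history: recognizer order stands
cover_fan = {"stairway_cover": 50}     # played the cover 50 times

print(rerank_n_best(n_best, new_listener)[0][0])
print(rerank_n_best(n_best, cover_fan)[0][0])
```

[with no history the default version wins, and for the listener who played the cover 50 times the cover moves to the top, which is the behaviour described in the conversation.]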
so we're working with these partners and
it's a complicated it's a complicated
thing where you want to you want to be
able so first of all you want to be very
careful with your users' data
you don't share your users' data without
their permission but you want to share
some data so that their experience gets
better
so that these partners can understand
enough but not too much and so forth so
it's really the trick is that it's like
a business driven relationship where
you're doing product development across
companies together yeah which is which
is really complicated but this is
exactly why we built our own NLU so that
we actually can make personalized
guesses because this is the biggest
frustration from a user point of view
they don't understand about ASRs and
N-best lists and business deals
they're like how hard can it be
I've told this thing 50 times this
version and it still plays the wrong
thing it can't be hard so we
try to take the user approach
the user's not going to understand
the complications of business we have to
solve it let's talk about sort of a
complicated subject that I myself I'm
quite torn about the idea of sort of
paying artists right I saw as of August
31st 2018 over 11 billion dollars were
paid to rights holders so and further
distributed to artists from Spotify so a
lot of money is being paid to artists
first of all the whole time as a
consumer for me when I look at Spotify
I'm not sure I'm remembering correctly
but I think you said exactly how I feel
which is this is too good to be true
like when I started using Spotify I
assumed you guys would go bankrupt in like
a month it's like this is too good a lot
of people did like this is amazing so
one question I have is sort of the
bigger question how do you make money in
this complicated world how do you deal
with a relationship with record labels
who
are complicated these big you
essentially have the task of herding
cats
but like rich and powerful cats and also
have the task of paying artists enough
and paying those labels enough and still
making money in the internet space where
people are not willing to pay hundreds
of dollars a month so how do you
navigate the space how do you know
that's a beautiful description herding
rich cats yeah that's before it is very
complicated and I think certainly
actually betting against Spotify has
been statistically a very smart thing to
do just looking at the at the line of
roadkill in music streaming services
it's it's kind of I think if I had
understood the complexity when I joined
Spotify unfortunately fortunately I
didn't know enough about the music
industry to understand the complexities
because then I would have made a more
rational guess that it wouldn't work so
you know ignorance is bliss but I think
there have been a few distinct
challenges I think as I said one of the
things that made it work at all was that
Sweden and the Nordics were a lost market
so there was you know there
was no risk for labels to try this I
don't think it would have worked if
the market was healthy so that
was the initial condition then then we
had this tremendous challenge with the
model itself so now most people were
pirating but for the people who bought a
download or CD the artists would get all
the revenue for all the future plays
then right so you got it all up front
whereas the streaming model was like
almost nothing day one almost nothing
day two and then at some point this curve
of incremental revenue would intersect
with your day one payment and that took
a long time to play out before
the music labels understood that
but on the other side it took a lot of
time to understand that actually if I
have a big hit that is gonna be played
for for many years this is a much better
model
because I get paid based on how much
people use the product not how much they
thought they would use it day one or so
forth so it was a complicated model to
get across and but time helped with that
right and now now the revenues to the
music industry actually are bigger again
then you know it's gone through this
incredible dip and now they're back up
and so we're really proud of
having been a part of that so
there have been distinct problems I
think when it comes to the to the labels
we have taken the painful approach some
of our competition at the time they kind
of looked at other
companies and said if we just
ignore the rights and we get really big
really fast we're gonna be too big for
the labels kind of too big to
fail they're not gonna kill us we didn't
take that approach we went legal from
day one and we negotiated and
negotiated and negotiated it was very slow
it was very frustrating we were angry at
seeing other companies taking shortcuts
and seeming to get away with it it was
this game theory thing where
over many rounds of playing the game
this would be the right strategy and
even though clearly there's a lot of
frustrations at times during
negotiations there is this
weird trust where we have been honest
and fair we never screwed them they never
screwed us it's ten years but there's
this trust in like they know that if
music doesn't get really big if lots of
people do not want to listen to music and
want to pay for it
Spotify has no business model so we
actually are incredibly aligned right
other companies not to be tense but
other companies have other business
models where even if they made
no money from music they'd still be
profitable companies but Spotify won't
so and I think the industry sees that we
are actually aligned business-wise so
there is this trust that allows us to
to do product development even if it's
scary you know taking risks the free
model itself was an incredible risk for
the music industry to take that they
should get credit for now some of it was
that they had nothing to lose in Sweden
but frankly a lot of the labels also
took risk and so I think we built up
that trust
with a I think herding rich cats sounds
a bit what's the word it sounds like I'm
dismissive of the cats dismissive no
everyday yeah you matter they're all
beautiful and and very important exactly
they've taken a lot of risks and
certainly it's been frustrating about it
yeah it's really like playing
it's game theory if
you play the game many times and then
you can have this statistical outcome
that you bet on and it feels very
painful when you're in the middle of
that thing I mean there's risk there's
trust there's relationships from just
having read the biography of Steve Jobs
similar kind of relationships were
discussed in iTunes the idea of selling
a song for a dollar was very
uncomfortable for labels and exactly and
there was no it was the same kind of
thing it was trust it was game theory as
is a lot of our relationships that had
to be built and it's really a
terrifyingly difficult process that
Apple could go through a little bit
because they could afford for that
process to fail for Spotify it seems
terrifying because you can't initially I
think a lot of it comes down
to honestly Daniel and his tenacity
in negotiating which seems like an
impossible task because he
was completely unknown and so forth but
maybe that was also the reason that that
it worked
but I think yeah I think game theory is
probably the best way to think about it
you could go straight for this
like Nash equilibrium that someone is
going to defect or you play it many
times you try to actually go for the top
left the cooperation cell is there any
magical reason why Spotify seems to have
won this so a lot of people have tried
to do what Spotify tried to do and Spotify
has come out well so the answer is that
there's no magical reason because I
don't believe in magic but I think there
are there are reasons and I think some
of them are that people have
misunderstood a lot of what we actually
do the actual the actual Spotify model
is very complicated they've looked at
the premium model and said it seems like
you can you can charge $9.99 for music
and people are gonna pay but that's not
what happened actually when we launched
the original mobile product everyone
said they would never pay what happened
was they started on the on the free
product and then their engagement grew
so much that eventually they said maybe
it is worth $9.99 right it's it's your
propensity to pay grows with your
engagement so we had this super
complicated business model we operate
two different business models advertising
and premium at the same time and I think
that is hard to replicate however I
struggle to think of other companies
that run large-scale advertising and
subscription products at the same time
so I think the business model is
actually much more complicated than
people think it is and and so some
people went after just the premium part
without the free part and ran into a
wall where no one wanted to pay some
people went after just music music
should be free just ads which doesn't
give you enough revenue and doesn't work
for the music industry so I think that
combination is kind of opaque from the
outside so maybe I shouldn't say it here
and reveal the secret but that
turns out to be harder to replicate than
you would think so there's
a lot of brilliant business strategy
here brilliance or luck
probably more luck but it doesn't really
matter it looks brilliant
in retrospect let's call it brilliant
yeah when the books are written it's
brilliant you've mentioned that your
philosophy is to embrace change so how
will the music streaming and music
listening world change over the next 10
years 20 years you look out into the far
future what do you think I think that
music and for that matter
audio podcasts audiobooks I think it's
one of the few core human needs I think
there is no good reason to me why it
shouldn't be at the scale of something
like messaging or social networking I
don't think it's a niche thing to listen
to music or news or something so I think
scale is obviously one of the things
that I really hope for I think I hope
that it's going to be billions of users
I hope eventually everyone in the world
gets access to all the world's music
ever made so obviously I think it's
going to be a much bigger business
otherwise we wouldn't be betting this
big now if you if you look more at how
it is consumed what I'm hoping is back
to this analogy of the software tool
chain where I think I sometimes
internally I make this analogy to
text messaging text messaging was also
based on standards in the era of
mobile carriers you had the SMS the 140
character 2020 character SMS and it was
great because everyone agreed on the
standard so as a consumer you got a lot
of distributions and interoperability
but it was a very constrained format and
and when the industry wanted to add
pictures to that format to do the MMS I
looked it up and I think it took from
the late 80s to early 2000s this is like
a 15 20 year product cycle to bring
pictures into that now once that entire
value chain of creation and consumption
got wrapped in one software stack within
something like snapchat or whatsapp
like the first week they added
disappearing messages like then two
weeks later they added stories like the
pace of innovation when you're on one
software stack and you
can affect both creation and consumption
I think is going to be rapid so with
these streaming services we now for the
first time in history have enough I hope
people on one of these services actually
whether it's Spotify or Amazon or Apple
or or YouTube and hopefully enough
creators that you can actually start
working with a format again and and that
excites me I think being able to change
these constraints from a hundred years
that could really do
something interesting I really
hope it's not just going to be an
iteration on the same thing for the
next ten to twenty years as well
yeah changing the creation of music
creation of audio creation of podcasts is
a really fascinating possibility I
myself don't understand what it is about
podcast that's so intimate it just is I
listen to a lot of podcasts I think it
touches on a deep human need
for connection that people do feel like
they're connected to when they listen I
don't understand what the psychology
of that is but in this world that's
becoming more and more disconnected it
feels like this is fulfilling a certain
kind of need and empowering the creator
as opposed to just the listener it's
really interesting so this is I'm really
excited that you're working on it yeah I
think one of the things that is
inspiring for our teams to work on
podcast is exactly that what do you
think like I like I probably do that
it's something biological about
perceiving to be in the middle of the
conversation that makes you listen in a
different way it doesn't really matter
people seem to perceive it differently
and there was this narrative for a long
time that you know if you look at video
everything kind of in the foreground
got shorter and shorter and shorter
because of financial pressures and
monetization and so forth and eventually at
the end there was only like 20 second
clips of people just screaming something
and I'm really I feel really good about
the fact that you you could have
interpreted that as people have no
attention span anymore they don't want
to listen to things they're not
interested in deeper stories like you
know people are people are getting
dumber
but then podcast came along and it's
almost like no no the need still
existed but maybe it was
the fact that you're not prepared to
look at your phone like this for two
hours but if you can drive at the same
time it seems like people really want to
dig
and they want to hear like the more
complicated version so to me that is
very inspiring that that podcast is
actually long-form it gives me a lot of
hope for for Humanity that people seem
really interested in hearing deeper more
complicated this is a I don't understand
it it's fascinating so the majority for
this podcast listen to the whole thing
this whole conversation we've been
talking for an hour and 45 minutes and
somebody will I mean most people will be
listening to these words I'm speaking
right now you wouldn't have thought
that 10 years ago with where the world
seemed to go that's very positive I
think that's really exciting and
empowering the creator and that's
really exciting last question you also
have a passion for just mobile in
general how do you see the smartphone
world this the the digital space of of
smartphones and just everything that's
on the move whether it's Internet of
Things and so on changing over the next
10 years and so on I think that one way
to think about it is that computing
might be moving out of these
multi-purpose devices the computer we
had and the phone into
specific-purpose devices and you know it
will be ambient that you know at least
in my home you just shout something at
someone and there's always like one of
these speakers close enough and so you
start behaving differently it's as if
you have the Internet
ambient ambiently around you and you can
ask it things
so I think computing will kind of get more
integrated and we won't
necessarily think of that as as
connected to a device and the same thing
in the same way that we do today I don't
know the path to that maybe we used to
have these desktop computers and then we
partially replaced that with the
laptops and left you know at home and at
work and then we got these phones and we
started leaving the laptop at home for a
while and maybe the maybe for stretches
of time you're gonna start using the
watch and you can leave your your phone
at home like for a run or something and
you know we're on this progressive path
where you I think what is happening
with voice is that you
have an interaction paradigm
that doesn't require as large physical
devices so I definitely think there's a
future where you can have your your
AirPods and your watch and you can
do a lot of computing and I don't think
it's gonna be this binary thing I think
it's gonna be like many of us still have
a laptop we just use it less and so you
shift your your consumption over and I
don't know about AR glasses and so
forth I'm excited about it I spend a lot
of time in that area but I still think
it's quite far away AR VR oh yes
VR is happening and working I think
the recent Oculus Quest is quite
impressive I think AR is further away at
least that type of AR I think but I do
think your phone or watch or glasses
understanding where you are and maybe
what you're looking at and being able to
give you audio cues about or you can
say like what is this and it tells you
what it is that I think might happen
you know you use your watch or your
glasses as a mouse pointer on
reality I think it might be a while
before I might be wrong I hope I'm wrong
I think it might be a while before we
walk around with these big like lab
glasses that project things I agree with
you there's a it's actually really
difficult when you have to understand
the physical world enough to project
onto it well I lied about the last
question
because I just thought of audio and my
favorite topic which is the movie her
mm-hmm do you think whether as part of
Spotify or not we'll have I don't know if
you've seen the movie her absolutely and
there audio is the primary form of
interaction and the connection with
another entity that you can actually
have a relationship with actually fall
in love with based on voice alone audio
alone you how far do you think that's
possible first of all based on audio
alone to fall in love with somebody
somebody or well yeah let's go with
somebody just have a relationship based
on audio alone and second question to
that can we create an artificial
intelligence system that allows one to
fall in love with it and it with
you so this is my personal
answer speaking for me as a person the
answer is quite unequivocally yes on
both I think what we just said about
podcasts and the feeling of being in the
middle of a conversation if you could
have an assistant where and we just said
that feels like a very personal setting
so if you walk around with these
headphones and this thing you're
speaking with this thing all of the time
that feels like it's in your brain I
think it's it's gonna be much easier to
fall in love with than something that
would be on your screen I think that's
entirely possible and then from that you
can probably answer this better than me but from
the concept of if it's going to be
possible to build a machine that
can achieve that I think whether you
whether you think of it as if you can
fake it the philosophical zombie that it
simulates it enough or it it somehow
actually is I think it's only a
question of time if you ask me about time I'd
have a different answer but if you say
given some half-infinite time absolutely
I think it's just atoms and arrangement
of information well I personally think
that love is a lot simpler than people
think so we started with true romance
and ended in love I don't see a better
place to end beautiful Gustav thanks so
much for talking today
thank you so much that was a lot of fun
was fun