YouTube Algorithm Basics (Cristos Goodrow, VP Engineering at Google) | AI Podcast Clips
h2SscdSVzE8 • 2020-01-26
Kind: captions
Language: en
maybe the basics of the quote-unquote
YouTube algorithm what is the YouTube
algorithm look at to make recommendations
for what to watch next from a
machine learning perspective or when you
search for a particular term how does it
know what to show you next because it
seems to at least for me do an
incredible job both well that's kind of
you to say it didn't used to do a very
good job but it's gotten better over the
years even even I observed that it's
improved quite a bit
those are two different situations like
when you search for something YouTube
uses the best technology we can get from
Google to make sure that that the
YouTube search system finds what
someone's looking for and of course the
very first things that one thinks about
is okay well does the word occur in the
title for instance you know but there
are much more sophisticated
things where we're mostly trying to do
some syntactic match or or maybe a
semantic match based on words that we
can add to the document itself for
instance you know maybe is is this video
watched a lot after this query right
that's something that we can observe and
then as a result make sure that that
that document would be retrieved for
that query now when you talk about what
kind of videos would be recommended to
watch next that's something again we've
been working on for many years and
probably the first the first real
attempt to do that well was to use
collaborative filtering so you can
describe what collaborative filtering is
sure it's just basically what we do is
we observe which videos get watched
close together by the same person and if
you observe that and if you can imagine
creating a graph
where the videos that get watched close
together by the most people are sort of
very close to one another in this graph
and videos that don't frequently get
watched close together by the
same person or the same people are far
apart then you end up with this graph
that we call the related graph that
basically represents videos that are
very similar or related in some way and
what's amazing about that is that it
puts all the videos that are in the same
language together for instance and we
didn't even have to think about language
it just does it yeah and it puts
all the videos that are about sports
together and it puts most of the music
videos together and it puts all of these
sorts of videos together just because
that's sort of the way the people using
YouTube behave so that already cleans up
a lot of the problem it takes care of
the lowest hanging fruit which happens
to be a huge one of just managing these
millions of videos that's right I
remember a few years ago I was talking
to someone who was trying to propose
that we do a research project concerning
people who who are bilingual and this
person was making this proposal based on
the idea that YouTube could not possibly
be good at recommending videos well to
people who are bilingual and so she was
telling me about this and I said well
can you give me an example of what
problem do you think we have on YouTube
with the recommendations and so she said
well I'm a researcher in in the US and
and when I'm looking for academic topics
I want to look I want to see them in
English and so she searched for one
found a video and then looked at the
watch next suggestions and they were all
in in English and so she said oh I see
YouTube must think that I speak only
English and so she said now I'm actually
originally from Turkey and sometimes
when I'm cooking let's say I want to
make some baklava I really like to watch
videos that are in Turkish and so she searched
for a video about making the baklava and
then and then selected it it was in
Turkish and the watch next
recommendations were in Turkish and she
just couldn't believe how this was
possible and and how is it that you know
that I speak both these two languages
and put all the videos together and it's
just as a sort of an outcome of this
related graph that's created through
collaborative filtering so for me one of
my huge interest is just human
psychology right and and that's such a
powerful platform on which to utilize
human psychology to discover what people
individual people want to watch next but
it's also just fascinating to me you
know Google search has the ability to
look at your own history and I've done
that before just what I've searched
through the years for many many years and
it's a fascinating picture of who I am actually
and I don't think anyone's ever
summarized that I personally would love
that a summary of who I am as a person
on the Internet to me because I think it
reveals I I think it puts a mirror to me
or to others you know that's actually
quite revealing and interesting you know
just maybe in the number of it's a joke
but not really is the number of cat
videos I've watched videos of people
falling you know it's the stuff that's
absurd that kind of stuff it's really
interesting and of course it's really
good for the machine learning aspect to
do to show to figure out what to show
next but it's interesting hey have you
just as a tangent played around with
the idea of giving a map to people sort
of as opposed to just using this
information to show what's next showing them
here are the clusters you've loved over
the years kind of thing well we do
provide the history of all the videos
that you've watched yes so you can
definitely search through that and and
look through it and search through it to
see what it is that you've been watching
on YouTube we have actually in various
times experimented with this sort
of cluster idea finding ways to
demonstrate or show people what topics
they've been interested in or what what
clusters they've watched from it's
interesting that you bring this up
because in some sense the the way the
recommendation system of YouTube sees a
user is exactly as the history of all
the videos they've watched on YouTube
and so you can think of yourself or any
user on YouTube as kind of like a DNA
strand of all your videos right that
sort of represents you you can also
think of it as maybe a vector in the
space of all the videos on YouTube and
so you know now once you think of it as
a vector in the space of all the videos
on YouTube then you can start to say
okay well you know which videos which
which other vectors are close to me and
to my vector and and that's one of the
ways that we generate some diverse
recommendations is because you're like
okay well you know these these people
seem to be close with respect to the
videos they've watched on YouTube but
you know here's a topic or a video that
one of them has watched and enjoyed but
the other one hasn't that could be an
opportunity to make a good
recommendation I can tell you I mean I
know I ask for things that are
impossible but I would love to cluster
human beings like I would love to
know who has similar trajectories as me
you probably would want to hang out
right there's a social aspect there
like actually finding some of the most
fascinating people I find on YouTube
but have like no followers and I start
following them and they create
incredible content and you know and on
that topic I just love to ask there's
some videos just blow my mind in terms
of quality and depth and just in every
regard are amazing videos and they have
like 57 views okay
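A minimal sketch of the two ideas described above: the related graph built from videos watched by the same person, and a user seen as a vector in the space of all videos. The watch histories, video ids, and function names are all hypothetical, and the sketch treats any two videos in one person's history as "watched close together", which simplifies what is described. It is an illustration under those assumptions, not YouTube's implementation.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical watch histories: user -> set of video ids (made-up data).
histories = {
    "ana":   {"baklava_tr", "borek_tr", "ml_lecture_en"},
    "bora":  {"baklava_tr", "borek_tr", "kebab_tr"},
    "carol": {"ml_lecture_en", "stats_talk_en"},
    "dave":  {"ml_lecture_en", "stats_talk_en", "python_tips_en"},
}

def related_graph(histories):
    """Edge weight = number of users whose history contains both videos."""
    edges = Counter()
    for videos in histories.values():
        for a, b in combinations(sorted(videos), 2):
            edges[(a, b)] += 1
    return edges

def cosine(u, v):
    """Cosine similarity of two users viewed as 0/1 vectors over videos."""
    if not u or not v:
        return 0.0
    return len(u & v) / math.sqrt(len(u) * len(v))

def recommend_from_neighbor(user, histories):
    """Return the closest other user and the videos they watched that `user` has not."""
    others = [name for name in histories if name != user]
    nearest = max(others, key=lambda name: cosine(histories[user], histories[name]))
    return nearest, sorted(histories[nearest] - histories[user])

graph = related_graph(histories)
neighbor, suggestions = recommend_from_neighbor("ana", histories)
```

In this toy data the Turkish cooking videos end up connected to each other and the English lectures to each other without any language label being supplied, and the closest user to "ana" is "bora", whose one unseen video becomes the suggestion.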
how do you get videos of quality to be
seen by many eyes so the measure of
quality is it just something yeah how do
you know that something is good well I
mean I think it
depends initially on what sort of video
we're talking about so in the realm of
let's say you mentioned politics and
news in that realm you know quality news
or quality journalism relies on having a
journalism department right like you you
have to have actual journalists and
fact-checkers and people like that and
so in that situation and in others maybe
science or in medicine quality has a lot
to do with the authoritativeness and
the credibility and the expertise of the
people who make the video now if you're
thinking about the other end of the
spectrum you know what is the highest
quality prank video or what is the
highest quality Minecraft video yeah
right that might be the one that people
enjoy watching the most and watch to the
end or it might be the one that when we
ask people the next day after they
watched it were they satisfied with it
and so we in in especially in the realm
of entertainment have been trying to get
at better and better measures of quality
or satisfaction or enrichment since I
came to YouTube and we started with well
you know the first approximation is the
one that gets more views but but you
know we both know that things can get a
lot of views and not really be that high
quality especially if people are
clicking on something and then
immediately realizing that it's not that
great and abandoning it and that's why
we move from views to thinking about the
amount of time people spend watching it
with the premise that like you know in
some sense the time that someone spends
watching a video is related to the value
that they get from that video it may not
be perfectly related but it has
something to say about how much value
they get but even that's not good enough
right because I myself have spent time
clicking through channels on
television late at night and ended up
watching Under Siege 2 for some reason
I don't know and if you were to ask me
the next day are you glad that you
watched that show on TV last night I'd
say yeah I wish I would have gone to
bed or read a book or almost anything
else really and so that's why some
people got the idea a few years ago to
try to survey users afterwards and so so
we get feedback data from those surveys
and then use that in the machine
learning system to try to not just
predict what you're gonna click on right
now or what you might watch for a while but
what when we ask you tomorrow you'll
give four or five stars to so just to
summarize what are the signals from a
machine learning perspective that users can
provide so there's just clicking on
the video views the time watched maybe the
relative time watched the clicking like
and dislike on the video maybe
commenting on the video and those things
all of those things and then the
one I wasn't actually quite aware of
even though I might have engaged in it
is a survey afterwards which is a
brilliant idea is there other signals
all right I mean that's already a really
rich space of signals to learn from is
there something else
well you mentioned commenting also
sharing the video if you if you think
it's worthy to be shared with someone
else you know within YouTube or outside
of YouTube as well either
let's see you mentioned like dislike
yeah like and dislike how important is
that it's very important right we want
it it's predictive of satisfaction but
it's not it's not perfectly predictive
subscribe if you subscribe to the
channel of the person who made the video
then that also is a piece of information
that signals satisfaction although over
the years we've learned that people have
a wide range of attitudes about what it
means to subscribe we would ask some
users who didn't subscribe very much
but they watched a lot from
a few channels we'd say well why didn't
you subscribe and they would say well I
can't afford to pay for anything and you
know we tried to let them understand
like actually doesn't cost anything it's
free it just helps us know that you are
very interested in this creator but then
we've asked other people who subscribed
to many things and and don't really
watch any of the videos from those
channels and we say well why did you
subscribe to this if you weren't really
interested in any more videos from that
channel and they might tell us well just
you know I thought the person did a
great job and I just want to kind of
give him a high five okay yeah and so
yeah that's where I I said I actually
subscribe to channels where I just this
person is amazing I like this person but
then I like this person I really want to
support them that that's how I click
Subscribe right even though I may never
actually want to click on their videos
when they're releasing it
I just love what they're doing and it's
maybe outside of my interest area and so
on which is probably the wrong way to
use the subscribe button but I just want
to say congrats this is a great work
well so you have to deal with all the
space of people that see the subscribe
button it's totally different that's
right and so you know we we can't just
close our eyes and say sorry you're
using it wrong you know and we're not
gonna pay attention to what you've done
we need to embrace all the ways in which
all the different people in the world
use the subscribe button or the like and
the dislike button so in terms of
signals that machine learning is using for
the search and for the recommendation
you've mentioned title so like metadata
like text data that people provide
description and title and maybe keywords
so maybe you can speak to the value of
those things in search and also this
incredible fascinating area of the
content itself so the video content
itself trying to understand what's
happening in the video so YouTube would
release a data set that you know in the
in the machine learning and computer
vision world this is just an exciting
space how much is that currently how
much are you playing with that currently how
much is your
the future of being able to analyze the
content of the video itself well we have
been working on that also since I came
to YouTube analyzing the content
analyzing the content on video right and
what I can tell you is that our ability
to do it well is still somewhat crude we
can we can tell if it's a music video we
can tell if it's a sports video we can
probably tell you that people are
playing soccer we probably can't tell
whether it's Manchester United or my
daughter's soccer team so these things
are kind of difficult and and using them
we can use them in some ways so for
instance we use that kind of information
to understand and inform these clusters
that I talked about and also maybe to
add some words like soccer for instance
to the video if if it doesn't occur in
the title or the description which is
remarkable that often it doesn't I one
of the things that I ask creators to do
is please help us out with the title
and the description for instance we were
a few years ago having a live stream
of some competition for world of
warcraft on YouTube and it was a very
important competition but if you typed
World of Warcraft in search you wouldn't
find it World of Warcraft wasn't in the
title World of Warcraft wasn't in the
title it was match 478 you
know A team versus B team and World of
Warcraft wasn't in the title just like come
on being literal being literal on the
Internet is actually very uncool which
is the problem oh is that right well I
mean in some sense well some of the
greatest videos I mean there's a humor
to just being indirect being witty and
so on and actually being you know
machine learning algorithms want you to
be you know literal right you just want
to say what's in the thing be very very
simple and in in some sense that gets
away from wit and humor so you have to
play with both right so but you're
saying that for now sort of the
content of the title the content of the
description the actual text is is one of
the best ways to for the for the
algorithm to find your video and put
them in the right cluster that's right
and and I would go further and say that
if you want people human beings to
select your video in search then it
helps to have let's say World of
Warcraft in the title because why would
a person's you know if they're looking
at a bunch they type World of Warcraft
and they have a bunch of videos all of
whom say World of Warcraft except the
one that you uploaded well even the
person is gonna think maybe this isn't
somehow search made a mistake this
isn't really about World of Warcraft so
it's important not just for the machine
learning systems but also for the people
who might be looking for this sort of
thing they get a clue that it's what
they're looking for by seeing that same
thing prominently in the title of the
video okay let me push back on that so I
think from the algorithmic perspective
yes but if they typed in World of
Warcraft and saw a video with the
title simply winning and the
thumbnail has like a sad orc or
something I don't know right like I
think that's much it's ironic it gets your
curiosity up and then if they could
trust that the algorithm was smart
enough to figure out somehow that this
is indeed a World of Warcraft video that
would have created the most beautiful
experience i I think in terms of just
the wit and the humor and the curiosity
that we human beings actually have but
you're saying I mean realistically
speaking it's really hard for the
algorithm to figure out that the content
of that video will be a World of
Warcraft video and you have to accept that
some people are gonna skip it
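The World of Warcraft example above can be sketched as a toy search index. This assumes a deliberately naive matcher in which a video is retrieved only if every query word appears among its indexed words. The helper names and the source of the added words are hypothetical; the transcript only says that words can be added to the document, for example from queries that often lead to the video being watched or from analysis of the content.

```python
def tokens(text):
    """Lowercase whitespace tokenization (a deliberately crude assumption)."""
    return set(text.lower().split())

def matches(query, indexed_words):
    """True if every word of the query appears among the video's indexed words."""
    return tokens(query) <= indexed_words

title = "Match 478 A Team vs B Team"
query = "world of warcraft"

# Index built from the title alone: the literal words are missing.
title_only = tokens(title)

# Hypothetical words attached to the video from other evidence, such as
# queries that frequently precede a watch or a content classifier.
added_words = tokens("world of warcraft")
expanded = title_only | added_words

found_by_title = matches(query, title_only)
found_after_expansion = matches(query, expanded)
```

The title-only index misses the query while the expanded one retrieves it, which is the machine side of the argument; the human side, that a viewer also wants to see the words in the title, is not something this sketch captures.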
yeah right I mean and so you're right
the people who don't skip it and select
it are gonna be delighted yeah but other
people might say yeah this is
not what I was looking for and making
stuff discoverable I think is what
you're really working on and hoping so
yeah so from your perspective to put
stuff in the description and remember
the collaborative filtering part of the
system it starts by
the same user watching videos together
right so the way that they're probably
going to do that is by searching for
them that's a fascinating aspect it's
like ant colonies that's how they find
stuff is so I mean you would agree for
collaborative filtering in general is
one curious ant one curious user
essential so just a person who is more
willing to click on random videos and
sort of explore these cluster spaces in
your sense how many people are just like
watching the same thing over and over
and over and over and how many are just
like the explorers that just kind of
click on stuff and then help the
other ants in the ant colony discover
the cool stuff do you have a sense of that
I really don't think I have a sense for
the relative sizes of those groups
but I but I would say that you know
people come to YouTube with some certain
amount of intent and as long as they to
the extent to which they they try to
satisfy that intent that certainly helps
our systems right because our systems
rely on kind of a faithful amount of
behavior right and there are
people who try to trick us right there
are people and machines that try to
associate videos together that really
don't belong together but they're trying
to get that association made because
it's profitable for them and so we have
to always be resilient to that sort of
attempt at gaming the system so speaking
to that there's a lot of people that in
a positive way perhaps I don't know I I
don't like it but like to game want to
try to game the system to get more
attention everybody creators in a
positive sense want to get attention
right so how do you how do you work in
this space when people create more and
more sort of clickbaity titles and
thumbnails so Veritasium Derek has
made a video where he basically describes that
it seems what works is to create a high
quality video really good video where
people would want to watch it once they
click on it but have clickbaity titles
and thumbnails to get them
to click on it in the first place and
he's saying I'm embracing this fact I'm
just gonna keep doing it and I hope you
forgive me for doing it and you will
enjoy my videos once you click on them
so in what sense do you see this kind of
clickbait style attempt to manipulate to
get people in the door to manipulate the
algorithm or play with the algorithm or
game the algorithm I think that that you
can look at it as an attempt to game the
algorithm but even if you were to take
the algorithm out of it and just say
okay well all these videos happen to be
lined up which the algorithm didn't make
any decision about which one to put at
the top or the bottom but they're all
lined up there which one are the people
gonna choose and and I'll tell you the
same thing that I told Derek is you know
I have a bookshelf and it has two
kinds of books on it science books I
have my math books from when I was a
student and they all look identical
except for the titles on the covers
they're all yellow they're all from
Springer and they're every single one of
them the cover is totally the same yes
right yeah on the other hand I have
other more pop science type books and
they all have very interesting covers
right and they have provocative titles
and things like that I mean I wouldn't
say that they're clickbaity because
they are indeed good books and I don't
think that they cross any line but but
you know the that's just a decision you
have to make right like the people who
who wrote Classical Recursion Theory by
Piergiorgio Odifreddi he was fine with the
yellow title and nothing more
whereas I think other people who who
wrote a more popular type book
understand that they need to have a
compelling cover and a compelling title
and and you know I don't think there's
anything really wrong with that we do we
do take steps to make sure that there is
a line that you don't cross and if you
go too far maybe your thumbnail is
especially racy or you know it's all
caps with too many exclamation points we
observe that users are kind of you know
sometimes offended by that and so so for
the users who were offended by that we
will then depress or suppress those
videos and which reminds me that there's
also another signal where users can say
I don't know if it was recently added but I
really enjoy it just saying
something like I don't want to
see this video anymore or something like
like there's certain
videos that just cut me the wrong way like
just jump out at me it's like I don't
wanna I don't want this and it feels
really good to clean that out to be like
that's not for me I
don't know I think that might have been
recently added but that's also a
really strong signal yes absolutely
right we don't want to make a
recommendation that people are unhappy
with and that makes me that particular
one makes me feel good as a user in
general and as a machine learning person
because I feel like I'm helping the
algorithm my interactions on YouTube
don't always feel like I'm helping the
algorithm like I'm not reminded of that
fact
like for example Tesla and Autopilot
you know Elon Musk create a feeling for
their customers for people that own
Teslas that they're helping the
algorithm of Tesla vehicles like they're all
like really proud they're helping
it learn I think YouTube doesn't
always remind people that you're helping
the algorithm get smarter and for me I
love that idea like we're all
collaboratively like Wikipedia gives
that sense that we're all together
creating a beautiful thing YouTube
doesn't always remind me of that
this conversation is reminding me of that
but well that's a good tip we should
keep that fact in mind when we design
these features well I I'm not sure I I
really thought about it that way but
that's a very interesting perspective
it's an interesting question of
personalization that I feel like when I
click like on a video I'm just improving
my experience it would be great it
would make me personally people are
different but make me feel great if I
was helping also
the YouTube algorithm broadly say
something you know saying like there's a
that I don't know if that's human nature
but you want the products you love and I
certainly love YouTube like you want to
help it get smarter and smarter and smarter
because there's some kind of coupling
between our lives together being better
if YouTube was better then my
life will be better and that's that kind
of reasoning I'm not sure what that is
and I'm not sure how many people share
that feeling that could be just a
machine learning feeling but on that
point how much personalization is there
in terms of next video recommendations
so is it kind of all really boiling down
to clustering like you find nearest
clusters to me and so on and that kind
of thing or just how much is personalized
to me the individual completely it's
very very personalized so your
experience will be quite a bit different
from anybody else's who's watching that
same video at least when they're logged
in and the reason is is that we found
that that users often want two different
kinds of things when they're watching a
video sometimes they want to keep
watching more on that topic or more in
that genre and other times they just are
done and they're ready to move on to
something else and so the question is
well what is this something else and one
of the first things one can imagine is
well maybe something else is the latest
video from some channel to which you've
subscribed and that's going to be very
different for you than it is for me
right and and even if it's not something
that you subscribe to it's something
that you watch a lot and again that'll
be very different on a person-by-person
basis and so even the watch next as well
as the home page of course is quite
personalized so we've mentioned some of the
signals but what does success look like
what does success look like in terms of the
algorithm of creating a great long-term
experience for a user or put another way
if you look at the videos i've watched
this month how do you know the algorithm
succeeded for me I think first of all if
you come back and watch more YouTube
then that's one indication that you
found some value from it so just the
number of hours is a powerful indicator
well I mean not the hours themselves but
the fact that you returned on another
day so that's probably the most simple
indicator people don't come back to
things that they don't find value in
right there's a lot of other things that
they could do but like I said I mean
ideally we would like everybody to feel
that YouTube enriches their lives and
that every video they watched is the
best one they've ever watched since
they've started watching YouTube and so
that's why we survey them and ask them
like is this one to five stars and so
our version of success is every time
someone takes that survey they say it's
five stars and if we ask them is this
the best video you've ever seen on
YouTube they say yes every single time
so it's hard to imagine that we would
actually achieve that maybe
asymptotically we would get there but
but that would be what we think success
is it's funny I've recently said
somewhere I don't know maybe tweeted
that Ray Dalio has this video on the
economic machine I forget what it's
called but it's a 30-minute video and I
said it's the greatest video I've
ever watched on YouTube it's like I
watched the whole thing and my mind was
blown it's a very crisp clean description
of how the at least the American
economic system works
it's a beautiful video and I was just I
wanted to click on something to say this
is the best thing this is the best thing
ever please let me I can't believe I
discovered it I mean the the views and
the likes reflect its quality but I was
almost upset that I haven't found it
earlier and wanted to find other things
like it I don't think I've ever felt
that this is the best video ever
that was that and to me the ultimate
utopia the best experience is where every
single video where I don't see any of
the videos I regret and every single
video I watch is one that actually helps
me grow helps me enjoy life be happy and
so on well so that's that's that's a
heck of uh that's uh that's one of the
most beautiful and ambitious I think
machine learning tasks so you've
mentioned kind of the the YouTube
algorithm isn't you know E equals mc
squared it's not a single equation it's
it's potentially sort of more than a
million lines of code sort of is it more
akin to what autonomous successful
autonomous vehicles today are which is
they're just basically patches on top of
patches of heuristics and human experts
really tuning the algorithm and have
some machine learning modules or is it
becoming more and more a giant machine
learning system with humans just doing a
little bit of tweaking here and there
what's your sense first of all do you
even have a sense of what is the YouTube
algorithm at this point and whichever
however much you do have a sense what
does it look like well we don't usually
think about it as the algorithm because
it's a bunch of systems that work on
different services the other thing that
I think people don't understand is that
what you might refer to as the YouTube
algorithm from outside of YouTube is
actually a you know a bunch of code and
machine learning systems and heuristics
but that's married with the behavior of
all the people who come to YouTube every
day so the people are part of the code
essentially exactly right like if there
were no people who came to YouTube
tomorrow then the algorithm
wouldn't work anymore
right right so that's a critical part of
the algorithm and so when people talk
about well the algorithm does this the
algorithm does that it's sometimes hard
to understand well you know it could be
the the viewers are doing that and the
algorithm is mostly just keeping track
of what the viewers do and then reacting
to those things
in in sort of more fine-grain situations
and i and i think that this is the way
that the recommendation system and the
search system and and probably many
machine learning systems evolve is you
know you start trying to solve a problem
and the first way to solve a problem is
often with a simple heuristic right and
and you know you want to say what are
the videos we're gonna recommend well
how about the most popular ones and
that's where you start and and over time
you collect some data and you refine
your situations so that you're making
less heuristics and you're you're
building a system that can actually
learn what to do in different situations
based on some observations of those
situations in the past and and you keep
chipping away at these heuristics over
time and so i think that just like with
diversity you know I think the first
diversity measure we took was okay not
more than three videos in a row from the
same channel right it's a pretty simple
heuristic to encourage diversity but it
worked right who needs to see four or
five six videos in a row from the same
channel and over time we try to chip
away at that and make it more fine-grain
and and basically have it
remove the heuristics in favor of
something that can react to individuals
and individual situations so how do you
you mentioned you know we we know that
something worked how do you get a sense
when decisions are the kind of A/B
testing that this idea was a good one
this was not so good what's how do you
measure that
and across which time scale across how
many users that kind of that kind of
thing
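The first diversity measure mentioned above, not more than three videos in a row from the same channel, can be sketched as a simple re-ordering pass over an already ranked list. The function name, the greedy deferral strategy, and the sample data are assumptions made for illustration; the transcript states only the rule itself, not how it was implemented.

```python
def cap_channel_runs(ranked, max_run=3):
    """Greedily re-order (video, channel) pairs so that no more than
    `max_run` consecutive videos come from the same channel. A video that
    would break the rule is deferred until another channel has been shown.
    If only rule-breaking videos remain, they are dropped."""
    result, pending = [], list(ranked)
    while pending:
        tail = [channel for _, channel in result[-max_run:]]
        for i, (_, channel) in enumerate(pending):
            run_is_full = len(tail) == max_run and all(c == channel for c in tail)
            if not run_is_full:
                result.append(pending.pop(i))
                break
        else:
            break  # nothing left that can be placed without breaking the rule
    return result

# Hypothetical ranking, e.g. by popularity, dominated by channel "A".
ranked = [("a1", "A"), ("a2", "A"), ("a3", "A"),
          ("a4", "A"), ("b1", "B"), ("a5", "A")]
reordered = [video for video, _ in cap_channel_runs(ranked)]
```

A fixed cap like this is exactly the kind of heuristic the answer describes chipping away at over time in favor of something learned per user and per situation.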
well you mentioned A/B experiments
and so just about every single change we
make to YouTube we do it only after
we've run an A/B experiment and so in
those experiments which run from one
week to months we measure hundreds
literally hundreds of different
variables and and measure changes with
confidence intervals in all of them
because we really are trying to get a
sense for ultimately does this improve
the experience for viewers that's the
question we're trying to answer and an
experiment is one way because we can see
certain things go up and down so for
instance if we notice in the
experiment people are dismissing videos
less frequently or they're saying that
they're more satisfied they're giving
more videos five stars after they watch
them then those would be indications
that the experiment is successful that
it's improving the situation for viewers
but we can also look at other things
like we might do user studies where we
invite some people in and ask them like
what do you think about this what do you
think about that how do you feel about
this and other various kinds of user
research but ultimately before we launch
something we're gonna want to run an
experiment so we get a sense for what
the impact is going to be not just to
the viewers but also to the different
channels and all of them
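Measuring a change "with confidence intervals", as described above, can be sketched for a single variable as a difference in means between a control group and a treatment group. The metric (survey stars), the sample values, and the function name are hypothetical, and the interval uses a plain normal approximation, which is rough for samples this small. A real experiment would repeat this across hundreds of variables and far more users.

```python
import math
from statistics import mean, variance

def diff_with_ci(control, treatment, z=1.96):
    """Difference in means (treatment minus control) with an approximate
    95% confidence interval based on the normal approximation."""
    diff = mean(treatment) - mean(control)
    se = math.sqrt(variance(control) / len(control)
                   + variance(treatment) / len(treatment))
    return diff, (diff - z * se, diff + z * se)

# Made-up survey ratings from users without and with the change.
control = [3, 4, 3, 4, 3, 4, 3, 4]
treatment = [4, 5, 4, 5, 4, 5, 4, 5]

diff, (low, high) = diff_with_ci(control, treatment)

# If the interval excludes zero, the change in this metric is unlikely
# to be noise under the assumptions above.
moved = low > 0 or high < 0
```

Here the interval sits entirely above zero, so this one metric would count as an improvement; whether the change launches would still depend on the other measured variables and on the effect on channels.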