Deep Learning State of the Art (2020)
0VH1Lim8gL8 • 2020-01-10
Kind: captions
Language: en
welcome to 2020 and welcome to the deep
learning lecture series let's start it
off today with a quick whirlwind tour of all the exciting things that happened in '17, '18, and '19 especially, and the amazing things we're going to see this year, in 2020. Also, as part of this series there are going to be a few talks from some of the top people in deep learning, in artificial intelligence,
after today. Of course, to start at the broad level: the celebrations, from the Turing Award to the limitations and the debates and the exciting growth. First, of course, a step back to the quote I've used before — I love it, I'll keep reusing it: "AI began not with Alan Turing or McCarthy but with the ancient wish to forge the gods," a quote from Pamela McCorduck in Machines Who Think. In that
visualization there is just three
percent of the neurons in our brain of
the thalamocortical system that magical
thing between our ears that allows us
all to see and hear and think and reason
and hope and dream and fear our eventual
mortality all of that is the thing we
wish to understand that's the dream of
artificial intelligence: to recreate versions of it, echoes of it, in the engineering of our intelligent systems —
that's the dream we should never forget
in the details of the exciting stuff I'll talk about today. That's sort of the reason why this is exciting: this mystery that's our mind. The modern human brain — the modern human as we know and love them today — is just about 300,000 years old, and the Industrial Revolution is about 300 years ago. That's 0.1 percent of the development since the early modern human, and that is when we've seen a lot of the machinery.
The machine was born not in stories but in actuality; the machine was engineered since the Industrial Revolution — the steam engine, the mechanized factory system, the machining tools. That's just 0.1 percent of the history, that's the three hundred years. Now zooming in to the 60, 70 years since the founder, the father arguably, of artificial intelligence, Alan Turing, and the dreams — you know, there's always been this dance in artificial intelligence between the dreams, the mathematical foundations, and when the dreams meet the engineering, the practice, the reality. So Alan Turing speculated that by the year 2000 the Turing test in natural language would be passed, it seems probable. He said that once the machine thinking method had started, it would not take long to outstrip our feeble powers; they would be able to converse with each other to sharpen their wits; at some stage therefore we should have to expect the machines to take control — a little shout-out to self-play there. So that's the
dream — both the father of the mathematical foundations of artificial intelligence and the father of the dreams of artificial intelligence. And that dream, again, in the early days met reality, met the practice, with the perceptron, often thought of as a single-layer neural network; but actually, what is not as well known is that Frank Rosenblatt was also the developer of the multi-layer perceptron. And that history, zooming
through, has amazed our civilization. To me one of the most inspiring things has been in the world of games: first with the great Garry Kasparov losing to IBM's Deep Blue in 1997, then Lee Sedol losing to AlphaGo in 2016 — seminal moments — and
captivating the world through the
engineering of actual real-world systems
robots on four wheels, as we'll talk about today, from Waymo to Tesla to all the autonomous vehicle companies working in the space; robots on two legs, captivating the world with what kind of actuation, what kind of manipulation can be achieved. The history of deep learning:
from 1943 the initial models from
neuroscience thinking about neural
networks how to model neural networks
mathematically to the creation as I said
of the single layer and the multi-layer
perceptron by Frank Rosenblatt in '57 and '62; to the ideas of backpropagation and recurrent neural nets in the '70s and '80s; to convolutional neural networks and LSTMs and bidirectional RNNs in the '80s and '90s; to the birth of the deep learning term and the new wave, the revolution, in 2006; to ImageNet and AlexNet, the seminal moment that captivated the imagination of the AI community with the possibility of what neural networks can do in the image and natural language space; closely followed, years after, by the development and popularization of GANs, generative adversarial networks; to AlphaGo and AlphaZero in 2016 and '17; and, as we'll talk about, language models, transformers, in '17, '18, and '19 — the last few years have been dominated by these ideas of deep learning in the space of natural language processing. Okay, celebrations: this
year the Turing award was given for deep
learning this is like deep learning has
grown up, we can finally start giving awards. Yann LeCun, Geoffrey Hinton, and Yoshua Bengio received the Turing Award for the conceptual and engineering breakthroughs that have made deep neural
networks a critical component of
computing. I would also like to add that perhaps the popularization in the face of skepticism — those who are a little bit older have known the skepticism that neural networks received throughout the '90s — continuing to push, believe, and work in this field, and popularizing it in the face of that skepticism, I think is part of the reason these three folks have received the award. But of
course the community that contributed to
deep learning is bigger much bigger than
those three
many of whom might be here today at MIT
broadly in academia in industry looking
at the early key figures: Walter Pitts and Warren McCulloch, as I mentioned, for the computational models of neural nets — these ideas that the kind of biological neural networks we have in our brain could be modeled mathematically — and then the engineering of those models into actual physical and conceptual mathematical systems by Frank Rosenblatt, in '57, again, single-layer, and multi-layer in 1962. You could say Frank Rosenblatt is the father of deep learning — the first person to really, in '62, mention the idea of multiple hidden layers in neural networks, as far as I know; somebody please correct me. But in 1965 — shout-out to the Soviet Union and Ukraine — the person who is often considered to be the father of deep learning, Alexey Ivakhnenko, with V. G. Lapa, co-author of that work, published the first learning algorithms for multi-layer perceptrons, with multiple hidden layers. Then the work on backpropagation, on automatic differentiation, in 1970; in 1979 convolutional neural networks were first introduced; and John Hopfield, looking at recurrent neural networks, what are now called Hopfield networks, a special kind of recurrent neural network. Okay, that's the
early birth of deep learning I want to
mention this because there's been a kind of contentious space, now that we can celebrate the incredible accomplishments of deep learning — much like in reinforcement learning, in academia credit assignment is a big problem, and the embodiment of that, almost to the point of a meme, is the great Jürgen Schmidhuber. I encourage people who are interested in the amazing contributions of the different people in the deep learning field to read his overview of deep learning in neural networks — it surveys all the various people who have contributed besides Yann LeCun, Geoffrey Hinton, and Yoshua Bengio.
it's a big beautiful community so full
of great ideas and full of great people
my hope for this community given the
tension as some of you might have seen
around this kind of credit assignment
problem, is that we have more — not on this slide, but love, there can never be enough love in the world — but in general respect, openness, collaboration, and credit sharing in the community; less derision, jealousy, and stubbornness, and fewer academic silos within institutions,
within disciplines also 2019 was the
first time it became cool to highlight
the limits of deep learning this is the
interesting moment in time several books
several papers have come out in the past
couple of years highlighting that deep
learning is not able to do the kind of broad spectrum of tasks that we think artificial intelligence should be able to do, like common-sense reasoning, like building knowledge bases, and so on. Rodney Brooks said that by 2020 the popular
press starts having stories that the era
of deep learning is over and certainly
there has been echoes of that through
the press through the Twittersphere and
all that kind of world and I'd like to
say that a little skepticism a little
criticism is really good always for the
community but not too much like a little
spice in the soup of progress aside from
that kind of skepticism, the growth of CVPR, ICLR, NeurIPS — all these conferences' paper submissions have grown year over year. There's been a lot of
exciting research some of which I'd like
to cover today my hope in this space of
deep learning growth celebrations the
limitations, for 2020 is that there's both less hype and less anti-hype, fewer tweets on how there's too much hype in AI, and more solid research; less criticism and more doing — but again, a little criticism, a little spice, is always good for the recipe; hybrid research; less contentious, counterproductive debate and more open-minded, interdisciplinary collaboration across
neuroscience cognitive science computer
science robotics Mathematics Physics
across all these disciplines working
together and the research topics that I
would love to see more contributions to
as we will briefly talk about in some
domains is reasoning common sense
reasoning, integrating that into the learning architectures; active learning, lifelong learning; multimodal, multitask
learning open domain conversation so
expanding the success of natural
language to dialog to open domain
dialogue and conversation and then
applications — the two most exciting, which we'll talk about, are medical and autonomous vehicles — then algorithmic
ethics in all of its forms fairness
privacy bias there's been a lot of
exciting research there I hope that
continues, taking responsibility for the flaws in our data and the flaws in our human ethics; and then robotics — in terms of deep learning applications in robotics, I'd love to see a lot of continued development in deep reinforcement learning applied to robotics and robot manipulation. By the way, there
might be a little bit time for questions
at the end if you have a really pressing
question you can ask it along the way
two questions so far thank God okay
So first, the practical: the deep learning and deep RL frameworks. This has really been a year where the frameworks have matured and converged. The two popular deep learning frameworks that people use are TensorFlow and PyTorch — TensorFlow 2.0 and PyTorch 1.3 are the most recent versions — and they've converged toward each other, taking the best features and removing the weaknesses from each other, so that competition has been really fruitful, in some sense, for the development of the community. On the TensorFlow side, eager execution — imperative programming, the kind of way you would program in Python — has been fully integrated, made easy to use, and become the default. On the PyTorch side, TorchScript allowed for a graph representation: so, do what you're used to being able to do, and what used to be the default mode of operation in TensorFlow — have this intermediate representation that's in graph form. On the TensorFlow side, also, the deep Keras integration and its promotion as the primary citizen, the default citizen, of the API, of the way you would interact with TensorFlow, allows complete beginners — anybody outside of machine learning — to use TensorFlow with just a few lines of code to train and do inference with a model. That's really exciting; they cleaned up the API, the documentation, and so on, and of course continued maturing the JavaScript in-the-browser implementation of TensorFlow, TensorFlow Lite being able to run TensorFlow on phones, on mobile, and serving — apparently this is something industry cares a lot about, being able to efficiently use models in the cloud. And PyTorch is catching up with TPU support and experimental versions of PyTorch Mobile — being able to run on a smartphone — on their side. It's exciting competition. Oh, and I almost forgot to mention: we have to say goodbye to our favorite Python 2. This is the year — on January 1st, 2020 — that support for Python 2 finally ended, and TensorFlow's and PyTorch's support for Python 2 has ended as well. So goodbye print, goodbye cruel world.
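To make the "few lines of code" point concrete, here is a minimal sketch of the TensorFlow 2.0 / Keras workflow described above (assuming TF 2.x with Keras as the default high-level API; the dataset and model are illustrative choices, not from the lecture):

```python
import tensorflow as tf

# Load a small standard dataset and normalize it.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Define, compile, train, and evaluate a model in a few lines.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)   # training
model.evaluate(x_test, y_test)          # inference / evaluation
```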
Okay, on the reinforcement learning front, we're kind of in the same space as JavaScript libraries are: there's no clear winner coming out. If you're a beginner in the space, the one I recommend is a fork of OpenAI Baselines called Stable Baselines, but there's a lot of exciting ones; some of them are built really closely on TensorFlow, some are built on PyTorch — of course from Google, from Facebook, from DeepMind: Dopamine, TF-Agents, TensorForce. Most of these I've used; if you have specific questions, I can answer them. So Stable Baselines, the OpenAI Baselines fork, as I said, implements a lot of the basic deep RL algorithms — PPO and so on, everything — with good documentation, and it allows a very simple, minimal, few-lines-of-code implementation of the basic algorithms matched with the OpenAI Gym environments. That's the one I recommend.
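For reference, a minimal sketch of that kind of few-lines-of-code usage, assuming the 2019-era, TensorFlow-based Stable Baselines package and an OpenAI Gym environment (exact APIs vary between versions):

```python
import gym
from stable_baselines import PPO2  # 2019-era, TensorFlow-based package

env = gym.make("CartPole-v1")            # any OpenAI Gym environment
model = PPO2("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)     # train with PPO

obs = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs)       # run the trained policy
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
```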
Okay, for the framework world, my hope for 2020 is framework-agnostic research. One of the things I mentioned is that PyTorch has almost overtaken TensorFlow in popularity in the research world. What I'd love to see is being able to develop an architecture in TensorFlow, or develop it in PyTorch — which you currently can — and then, once you train the model, to be able to easily transfer it to the other: from PyTorch to TensorFlow, or from TensorFlow to PyTorch. Currently that takes three, four, five hours, if you know what you're doing in both frameworks. It would be nice if there was a very easy way to do that transfer.
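There is no one-step TensorFlow-to-PyTorch converter as of this writing; as a hedged illustration of one partial route that does exist (not a recommendation from the lecture), a trained PyTorch model can be exported to the ONNX interchange format and then loaded by other runtimes:

```python
import torch
import torchvision

# Take a trained PyTorch model (a pretrained ResNet-18 stands in here).
model = torchvision.models.resnet18(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)        # example input shape

# Export to ONNX; the resulting file can be loaded by other runtimes
# (for example onnxruntime), though not directly as a TensorFlow model.
torch.onnx.export(model, dummy_input, "resnet18.onnx",
                  input_names=["input"], output_names=["output"])
```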
Then there's the maturing of the DRL frameworks: I'd love to see OpenAI step up, DeepMind step up, and really take some of these frameworks to a maturity that we can all agree on, much like OpenAI Gym has done for the environment world; and continued work on what Keras started, and many other wrappers around TensorFlow started — greater and greater abstractions allowing machine learning to be used by people outside of the machine learning field. I think the powerful thing about basic, vanilla supervised learning is that people in biology, in chemistry, in neuroscience, in physics, in astronomy can deal with the huge amount of data that they're working with without needing to learn any of the details of even Python. So I would love to see greater and greater abstractions which empower scientists outside the field. Okay: natural language
processing. 2017 and 2018 were when the transformer was developed and its power was demonstrated, most especially by BERT, achieving a lot of state-of-the-art results on a lot of language benchmarks, from sentence classification to tagging, question answering, and so on. There are hundreds of datasets and benchmarks that have emerged, most of which BERT has dominated. 2018 and 2019 were sort of the years the transformer really exploded in terms of all the different variations — again starting from BERT: XLNet (it's very cool to use BERT in the name of your new derivative transformer), RoBERTa, DistilBERT from Hugging Face, one from Salesforce, OpenAI's GPT-2, of course ALBERT, and Megatron from Nvidia, a huge transformer. A few tools have emerged: one is Hugging Face, a company and also a repository, which has implemented, in both PyTorch and TensorFlow, a lot of these transformer-based natural language models. That's really exciting, because most people here can just use them easily — those are already pre-trained models.
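As an illustration of how easy those pre-trained models are to use, here is a minimal sketch with the Hugging Face transformers library (model name and API assume a 2019/2020-era release):

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Encode a sentence and get contextual token embeddings from pre-trained BERT.
inputs = tokenizer.encode("Deep learning state of the art", return_tensors="pt")
with torch.no_grad():
    outputs = model(inputs)
print(outputs[0].shape)   # (batch, sequence_length, hidden_size)
```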
The other exciting thing is that Sebastian Ruder, a great researcher in the field of natural language processing, has put together NLP-progress, which tracks all the different benchmarks for all the different natural language tasks — sort of leaderboards of who's winning where. Okay, I'll mention a few models that stand out in the work from this year. Megatron-LM from Nvidia is basically taking, I believe, the GPT-2 transformer model and putting it on steroids: 8.3 billion versus 1.5 billion parameters, and a lot of interesting stuff there, as you would expect from Nvidia — of course it's always brilliant research, but also interesting aspects of how to train in a parallel way, model and data parallelism in the training. The first breakthrough results in terms of performance: the model that replaced BERT as king of transformers is XLNet, from CMU and Google Research. They combined the bidirectionality of BERT and the recurrence aspect of Transformer-XL — the relative positional embeddings and the recurrence mechanism of Transformer-XL — combining bidirectionality and recurrence to achieve state-of-the-art performance on 20 tasks. ALBERT is a recent addition from Google Research, and it significantly reduces the number of parameters versus BERT by doing parameter sharing across the layers, and it has achieved state-of-the-art results on 12 NLP tasks, including the difficult Stanford question answering benchmark, SQuAD 2.0; and they provide an open-source TensorFlow implementation, including a number of ready-to-use pre-trained language models. Okay, another way
for people who are completely new to this field to explore these models is through a bunch of apps. Write With Transformer is one of them, from Hugging Face — an app that allows you to explore the capabilities of these language models, and I think they're quite fascinating from a philosophical point of view. This has actually been at the core of a lot of the tension: how much do these transformers actually understand? Basically memorizing the statistics of the language, in a self-supervised way, by reading a lot of text — is that really understanding? A lot of people say no, until it impresses us, and then everybody will say it's obvious. But Write With Transformer is a really powerful way to generate text, to reveal to you how much these models really learn. Before this — yesterday, actually — I just came up with a bunch of prompts. So on the left is a prompt you give it: "the meaning of life," here, for example, "is not what I think it is, it's what I do to make it." You can do a lot of prompts of this nature; it's very profound, and some of them will be just absurd — they'll make sense statistically, but they'll be absurd, and reveal that the model really doesn't understand the fundamentals of the prompt it's being provided. But at the same time, it's incredible what kind of text it's able to generate.
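For those who prefer code to the hosted demo, a hedged sketch of prompting GPT-2 through the transformers library (the same family of models behind Write With Transformer; the generation options shown are illustrative and version-dependent):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The meaning of life is"
input_ids = tokenizer.encode(prompt, return_tensors="pt")
# Sample a continuation of the prompt from the language model.
output_ids = model.generate(input_ids, max_length=50, do_sample=True, top_k=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```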
"The limits of deep learning" — I was just having fun with this at this point — "are still in the process of being figured out": very true. I had to add this one: "the most important person in the history of deep learning is probably Andrew Ng," and I have to agree — so this model knows what it's doing. And I tried to get it to say something nice about me, and that took a lot of attempts. This is kind of funny — it finally did one: "Lex Fridman's best quality is that he's smart," it finally said, but then, "I think he gets more attention than..." — very much every Twitter comment ever, and that's very true. Okay, a nice way to reveal through this that the models are not able to do any kind of understanding of language is just to give it prompts that require understanding of concepts, being able to reason with those concepts, common-sense reasoning. Trivial ones: asking what 2 + 2 is, it says 3, 5, 6, 7; the result of the simple equation 4 + 2 + 3 is — it got it right, and then it changed its mind; 2 minus 2 is 7; and so on. You can reveal this for any kind of reasoning — you can do it with blocks, you can ask it about gravity, all those kinds of things — it shows that it doesn't understand the fundamentals of the concepts being reasoned about. I'll mention work that takes it beyond, towards that reasoning world, in the next few slides.
But I should also mention, with this GPT-2 model — if you remember, about a year ago there was a lot of thinking about this 1.5-billion-parameter model from OpenAI. The thought was it might be so powerful that it would be dangerous. And so the idea from OpenAI was: when you have an AI system that you're about to release that might turn out to be dangerous — in this case used, probably, for fake news, for misinformation, that kind of thing — the kind of thinking is, how do we release it? And I think, while it turned out that the GPT-2 model is not quite so dangerous — that humans are in fact more dangerous than AI currently — that thought experiment is very interesting. They released a report on release strategies and the social impacts of language models that didn't get as much attention as I think it should have, and it was a little bit disappointing to me how little people worried about this kind of situation. There was more of an eye-roll about "oh, these language models aren't as smart as we thought they might be." But the reality is, once they are, it's a very interesting thought experiment: how should the process go of companies and experts communicating with each other during that release? The report thinks through some of those details. My takeaway, from reading the report and from this whole year of that event, is that conversations on this topic are difficult, because we as the public seem to penalize anybody trying to have that conversation, and the model of sharing privately, confidentially, between machine learning organizations and experts is not there — there's no incentive or model or history or culture of sharing. Okay, the best paper from ACL, the main conference for language, was on a difficult task: so, we talked about language models; now there's the task, taking it a step further, of dialogue — multi-domain, task-oriented dialogue. That's sort of the next challenge for dialogue
systems and they've had a few ideas on
how to perform dialogue state tracking across domains, achieving state-of-the-art performance on MultiWOZ, which is a challenging, very difficult five-domain human-to-human dialogue dataset. There's a few ideas there; I should probably hurry up and start skipping stuff.
Then common-sense reasoning, which is really interesting — this is one of the open questions for the deep learning community, and the AI community in general: how can we have hybrid systems, whether it's symbolic AI and deep learning, or generally common-sense reasoning with learning systems? There have been a few papers in this space; my favorite is from Salesforce, building a dataset where we can start to do question answering while figuring out the concepts that are being explored in the question and the answering. Here the question is: while eating a hamburger with friends, what are people trying to do? Multiple choice: have fun, tasty, indigestion. The idea that needs to be generated there — and that's where the language model would come in — is that usually a hamburger with friends indicates a good time. So you basically take the question, generate the common-sense concept, and from that be able to determine, in the multiple choice, what's happening, what's the state of affairs in this particular question. Okay — the Alexa Prize, again, hasn't received nearly enough attention that I think it should have, perhaps because there haven't been
major breakthroughs, but it's open-domain conversation that all of us — anybody who owns an Alexa — can participate in as a provider of data. There's been a lot of amazing work from universities across the world on the Alexa Prize in the last couple of years, and a lot of interesting lessons summarized in papers and blog posts. A few lessons from the Alquist team
that I particularly like — and this kind of echoes the work on IBM Watson and the Jeopardy challenge — one of the big ones is that machine learning is not an essential tool for effective conversation, yet. So machine learning is useful for general chitchat, when you fail at deep, meaningful conversation, or for actually understanding what topic you're talking about: so throwing in chitchat
and classification sort of classifying
intent finding the entities detecting
the sentiment of the sentences that's
sort of an assistive tool, but the fundamentals of the conversation are
the following so first you have to break
it apart — conversation, you can think of it as a long dance, and the way you have fun dancing is you
break it up into a set of moves and
turns and so on and focus on that sort
of live in the moment kind of thing so
focus on small parts of the conversation
taken at a time then also have a graph
sort of conversation is also all about
tangents — so have a graph of topics and be ready to jump context from one context to the other and back (a toy sketch of this idea appears after these lessons). If you
look at some of these natural language
conversations they publish it's just all
over the place in terms of topics you
jump back and forth and that's the
beauty the humour the wit the fun of
conversations — you jump around from topic to topic. And opinions: one of the
things that natural language systems
don't seem to have much is opinions if I
learned anything one of the simplest
ways to convey intelligence is to be
very opinionated about something and confident — that's a really interesting concept. And in general there's just a lot of lessons. Oh, and finally, of course: maximize entertainment, not information. This is true for autonomous vehicles, this is true for natural language conversation — fun should be part of the objective function. Okay, lots of lessons to learn there.
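As promised above, here is a toy illustration of the "graph of topics" lesson — purely a sketch of the idea, not any Alexa Prize team's actual code: keep a small topic graph, follow the user's tangent, and pop back to the interrupted topic so the conversation can continue.

```python
# Toy topic graph: each topic lists topics it can naturally jump to.
topic_graph = {
    "travel": ["brazil"],
    "brazil": ["population", "travel"],
    "population": ["brazil"],
}

def respond(current_topic, user_topic, stack):
    """Follow the user's tangent if it is a neighbor, otherwise jump back."""
    if user_topic != current_topic and user_topic in topic_graph.get(current_topic, []):
        stack.append(current_topic)                 # remember the interrupted topic
        return user_topic, f"(answer about {user_topic})"
    if stack:                                       # tangent handled: jump back
        previous = stack.pop()
        return previous, f"Anyway, I was saying... back to {previous}."
    return current_topic, f"(keep talking about {current_topic})"

stack = []
topic, reply = respond("brazil", "population", stack)  # user asks a tangent
topic, reply = respond(topic, topic, stack)            # bot jumps back to "brazil"
```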
This is really the Loebner Prize, the Turing test, of our generation; I'm excited to see if there's anybody able to win the Alexa Prize. Again, in the Alexa Prize you're tasked with talking to a bot, and the measure of quality is the same as the Loebner Prize — just measuring how good that conversation was — but also the task is to try to continue the conversation for 20 minutes. If you try to talk to a bot today, and you have a choice to talk to a bot or go do something else, watch Netflix, you'd last probably less than 10 seconds — you'd be bored. The point is to keep you in the conversation, because you're enjoying it so much, for 20 minutes; that's a really nice benchmark for passing the spirit of what the Turing test stood for. Examples here from the Alexa Prize and the Alquist bot show the difference between two kinds of conversations. The bot says: have you been in Brazil? The user says: what is the population of Brazil? The bot says: it is about 20 million. The user says: well, okay. This is what happens a lot with multi-domain conversation — once you jump to a new domain, you stay there; once you switch context, you stay there. The reality is you want to jump back and continue jumping around, like in the second, more successful conversation: have you been in Brazil? What is the population of Brazil? It is around 20 million — anyway, I was saying, have you been in Brazil? So they're jumping back in context; that's how conversation goes — you change context and jump back quickly. There's been
a lot of sequence-to-sequence kind of work using natural language to summarize — a lot of applications. One ICLR paper that I wanted to highlight, from Technion, that I find particularly interesting, is the abstract-syntax-tree-based summarization of code: modeling computer code — in this case, sadly, Java and C# — as trees, as syntax trees, and then operating on those trees to do the summarization in text. Here's an example of a basic power-of-two function, on the bottom right, in Java; the code-to-sequence summarization says "get power of two." That's an exciting possibility for automated documentation of source code; I thought it was particularly interesting, and the future there is bright. Okay.
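A toy illustration of the idea of operating on syntax trees (using only Python's built-in ast module; the actual Technion model learns from paths through the tree rather than just the function name):

```python
import ast

source = """
def get_power_of_two(exponent):
    return 2 ** exponent
"""

# Parse the code into an abstract syntax tree and walk it.
tree = ast.parse(source)
func = tree.body[0]                      # the FunctionDef node
words = func.name.split("_")             # ["get", "power", "of", "two"]
print("summary:", " ".join(words))       # -> "summary: get power of two"
```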
My hopes for 2020 for natural language processing: that reasoning, common-sense reasoning, becomes a greater and greater part of the transformer-type language model work we see in the deep learning world; extending the context from hundreds or thousands of words to tens of thousands of words, being able to read entire stories and maintain the context — which transformers, again with XLNet and Transformer-XL, are starting to be able to do, but we're still far away from that long-term, lifelong maintenance of context; dialogue, open-domain dialogue, which ever since Alan Turing until today has been the dream of artificial intelligence, being able to pass the Turing test; and the dream of self-supervised learning — these natural language transformers are doing self-supervised learning, and the dream of Yann LeCun is for these kinds of systems, previously called unsupervised but which he now calls self-supervised learning systems, to be able to watch YouTube videos and from that start to form representations based on which you can understand the world. So the hope for 2020 and beyond is to be able to transfer some of the success of transformers to the world of visual information, the world of video, for example.
DRL and self play this has been an
exciting year continues to be an
exciting time for reinforcement learning
in games and robotics
So first, Dota 2 and OpenAI. Dota 2 is an exceptionally popular competitive esports game that people compete in for millions of dollars, so there are a lot of world-class professional players. In 2018, OpenAI Five — this is team play — tried their best at The International and lost, and said that "we're looking forward to pushing Five to the next level," which they did: in April 2019 they beat the 2018 world champions in five-on-five play. The key there was compute — eight times more training compute. Because the actual compute was already maxed out, the way they achieved the 8x is in time: simply training for longer. So the current version of OpenAI Five — which Jakub will talk about next Friday — has consumed 800 petaflop/s-days and experienced about 45,000 years of Dota self-play over 10 real-time months. Again, behind a lot of the game systems we talk about, they use self-play: they play against each other. This is one of the most exciting concepts in deep learning — systems that learn by playing each other and incrementally improving over time, starting from being terrible and getting better and better and better, and you're always being challenged by a slightly better opponent because of the natural process of self-play. That's a fascinating process. The 2019 version, the last version of OpenAI Five, has a 99.9% win rate versus the 2018 version. Okay.
Then DeepMind, also in parallel, has been working on using self-play to solve some of these multi-agent games, which is a really difficult space, where agents have to collaborate as part of the competition — it's exceptionally difficult from the reinforcement learning perspective. This is from raw pixels, solving the arena capture-the-flag game in Quake III Arena. One of the things I love, just as a side note about both OpenAI and DeepMind and reinforcement learning research in general, is that there will always be one or two paragraphs of philosophy — in this case from DeepMind: "Billions of people inhabit the planet, each with their own individual goals and actions, but still capable of coming together through teams, organizations, and societies in impressive displays of collective intelligence. This is a setting we call multi-agent learning: many individual agents must act independently, yet learn to interact and cooperate with other agents. This is an immensely difficult problem, because with co-adapting agents the world is constantly changing." The fact that we, seven billion people on Earth — people in this room, in families, in villages — can collaborate while being, for the most part, self-interested agents, is fascinating. One of my hopes, actually, for 2020 is to explore social behaviors that emerge in reinforcement learning agents, and how those are echoed in real human social systems. Okay.
Here are some visualizations: the agents automatically figure out, as you see in other games, the concepts — knowing very little, knowing nothing about the rules of the game, about the objectives, about the strategy and the behaviors, they're able to figure it out. There are t-SNE visualizations of the different states, important states and concepts in the game, that it figures out, and so on. Skipping ahead: automatic discovery of different behaviors — this happens in all the different games we talk about, from Dota to StarCraft II to Quake — the different strategies that it doesn't know about, it figures out automatically. And the really exciting work in terms of multi-agent RL on the DeepMind side was beating world-class players and achieving Grandmaster level in a game I don't know much about, which is StarCraft. In December 2018, AlphaStar beat MaNa, one of the world's strongest professional StarCraft players, but that was in a very constrained environment and with a single race, I think Protoss. And in 2019, AlphaStar reached Grandmaster level by doing what we humans do: using a camera, observing the game, and playing against other humans. So this is not an artificial setup; it's doing the exact same process humans would undertake, and it achieved Grandmaster, which is the highest level. Okay, great. I
encourage you to look at a lot of the interesting blog posts and videos of the different strategies that these agents are able to figure out. Here's a quote from one of the professional StarCraft players — and we see this with AlphaZero too, in chess: "AlphaStar is an intriguing and unorthodox player, one with the reflexes and speed of the best pros but strategies and style that are entirely its own. The way AlphaStar was trained, with agents competing against each other in a league, has resulted in gameplay that's unimaginably unusual; it really makes you question how much of StarCraft's diverse possibilities pro players have really explored." That's the really exciting thing about reinforcement learning agents — in chess, in Go, in games, and hopefully in simulated systems in the future — that they teach us, teach experts who think they understand the dynamics of a particular game, a particular simulation, new strategies, new behaviors to study. That's one of the exciting applications, from almost a psychology perspective, that I'd love to see reinforcement learning push towards. And
on the imperfect-information game side, poker: in 2018, CMU's Noam Brown and team were able to beat professionals at head-to-head No Limit Texas Hold'em, and now at six-player No Limit Texas Hold'em against professional players. Many of the same approaches were used: self-play, iterated Monte Carlo, and there's a bunch of ideas in terms of abstractions — there are so many possibilities under imperfect information that you have to form these bins of abstractions, both in the action space, in order to reduce the action space, and in the information abstraction space: the probabilities of all the different hands they could possibly have and all the different hands that the betting strategies could possibly represent. And you have to do this kind of coarse planning: they use self-play to generate a coarse blueprint strategy that, in real time, they then adjust with Monte Carlo search as they play. Again, unlike the DeepMind and OpenAI approaches, very minimal compute is required, and they're able to beat world-class players. Again — I like getting quotes from the professional players after they get beaten — Chris Ferguson, famous World Series of Poker player, said: "Pluribus" — that's the name of the agent — "is a very hard opponent to play against; it's really hard to pin him down on any kind of hand. He's also very good at making thin value bets on the river; he's very good at extracting value out of his good hands" — sort of making bets without scaring off the opponent. Darren Elias said: "Its major strength is its ability to use mixed strategies; that's the same thing that humans try to do. It's a matter of execution for humans — to do this in a perfectly random way, and to do so consistently, most people just can't."
Then in the robotics space there have been a lot of applications of reinforcement learning; one of the most exciting is manipulation — sufficient manipulation to be able to solve the Rubik's Cube. Again, this is learned through reinforcement learning; since self-play in this context is not possible, they use automatic domain randomization, ADR: they generate progressively more difficult environments for the hand. There's a giraffe head there, you see — there's a lot of perturbations to the system, so they mess with it a lot, and a lot of noise is injected into the system, to be able to teach the hand to manipulate the cube. The actual solution — figuring out how to go from a particular cube state to the solved cube — is the easy problem; the paper and this work are focused on the much more difficult problem of learning to manipulate the cube. It's really exciting. Again, a little philosophy, as you would expect from OpenAI: they have this idea of emergent meta-learning — the idea that the capacity of the neural network that's learning this manipulation is constrained, while the ADR, the automatic domain randomization, is progressively making harder and harder environments, so the capacity of the environment to be difficult is unconstrained; and because of that there's an emergent self-optimization of the neural network to learn general concepts, as opposed to memorizing particular manipulations. The
hope for me in the deep reinforcement learning space for 2020 is continued application to robotics — even legged robotics, but also robotic manipulation; human behavior — the use of multi-agent self-play, as I've mentioned, to explore naturally emerging social behaviors, constructing simulations of social behavior and seeing what kind of multi-human behavior emerges in a self-play context. I hope there'll be something like a reinforcement learning self-play psychology department one day, where you use reinforcement learning to reverse-engineer human behavior and study it that way. And again, in games — I'm not sure what big challenges remain there, but to me at least it's exciting to see learned solutions to games through self-play.
In the science of deep learning, I would say there have been a lot of really exciting developments that deserve their own lecture; I'll mention just a few here. From MIT, in early 2018 — but it sparked a lot of interest and follow-on work in 2019 — is the idea of the lottery ticket hypothesis. This work showed that small sub-networks within the larger network are the ones doing all the thinking: the same accuracy can be achieved from a small sub-network within a neural network. And they have a very simple process for arriving at that sub-network: randomly initialize a neural network (that's, I guess, the lottery ticket); train the network until it converges; then — this is an iterative process — prune the fraction of the network with low weights; reset the weights of the remaining network to the original initialization (the same lottery ticket); and then train the pruned, untrained network again, and continue this iteratively to arrive at a network that's much smaller, using the same original initialization.
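A condensed sketch of that iterative pruning procedure, assuming PyTorch; the training loop and dataset are left abstract, and the masking details are simplified relative to the paper:

```python
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
initial_state = copy.deepcopy(model.state_dict())   # the "lottery ticket" init
masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}

def prune_lowest(model, masks, fraction=0.2):
    """Zero out the lowest-magnitude fraction of the still-alive weights."""
    for name, param in model.named_parameters():
        if name not in masks:
            continue
        alive = param.data.abs()[masks[name].bool()]
        threshold = alive.sort().values[int(fraction * alive.numel())]
        masks[name] *= (param.data.abs() > threshold).float()

for _ in range(3):                         # iterative pruning rounds
    # train(model, masks)                  # train to convergence, masking pruned weights
    prune_lowest(model, masks)
    model.load_state_dict(initial_state)   # reset surviving weights to their
    for name, param in model.named_parameters():   # original initialization
        if name in masks:
            param.data *= masks[name]
```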
Within these big networks there's often a much smaller network that can achieve the same kind of accuracy — that's fascinating. Now, practically speaking, it's unclear what the big takeaways are, except the inspiring takeaway that there exist architectures that are much more efficient, so there is value in investing time in finding such networks. Then there are disentangled representations, which again deserve their own lecture, but here showing a 10-dimensional vector representation where the goal is for each part of the vector to learn one particular concept about a dataset. The dream of unsupervised learning is that you can learn compressed representations where each element is disentangled, and you can learn some fundamental concept about the underlying data that can carry from dataset to dataset — that would be the best disentangled representation. There's theoretical work — the best-paper award at ICML 2019 — showing that that's impossible: disentangled representations are impossible without inductive biases, and so the suggestion there is that the biases you use should be made explicit as much as possible. The open problem is finding good inductive biases for unsupervised model selection that work across the multiple datasets we're actually interested in. There are a lot more papers, but one
of the exciting ones is the double descent idea, which has been extended to the deep neural network context by OpenAI, to explore the phenomenon that as we increase the number of parameters in a neural network, the test error initially decreases, then increases, and, just as the model becomes able to fit the training set, undergoes a second descent — so decrease, increase, decrease — and there's this critical moment in time when the training set is just fit perfectly. OpenAI shows that this is applicable not just to model size but also to training time and dataset size. This is more of an open problem: why this happens, trying to understand it, and how to leverage it in optimizing training dynamics in neural networks — there are a lot of really interesting theoretical questions there. So my hope for the science of deep learning in 2020 is to continue exploring the fundamentals of model selection, training dynamics, the performance of training in terms of memory and speed, and representation characteristics with respect to architecture characteristics — a lot of the fundamental work of
understanding neural networks. Two areas that deserve whole sections and lectures of their own, with lots of papers, which are super exciting: my first love, graphs. Graph neural networks — graph convolutional neural networks as well — are a really exciting area of deep learning, for solving combinatorial problems, for recommendation systems, and really for any kind of problem that can fundamentally be modeled as a graph, which can then be solved, or at least aided, by graph neural networks; there's a lot of exciting work there.
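For concreteness, a minimal sketch of a single graph convolution layer — the basic propagate-and-transform idea — assuming PyTorch; real work in this area would typically use a dedicated library such as PyTorch Geometric or DGL:

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One graph convolution: normalize the adjacency, aggregate, transform."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, adjacency, features):
        a_hat = adjacency + torch.eye(adjacency.size(0))      # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        return torch.relu(a_norm @ self.linear(features))

# Toy graph: 3 nodes, 2 undirected edges, 4-dimensional node features.
adjacency = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
features = torch.randn(3, 4)
embeddings = GraphConv(4, 8)(adjacency, features)   # -> (3, 8) node embeddings
```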
Bayesian deep learning, using Bayesian neural networks, has for several years been an exciting possibility. It's very difficult to train large Bayesian networks, but in the contexts where you can — and it's useful for small datasets — providing uncertainty measurements in the predictions is an extremely powerful capability of Bayesian neural networks, as is online, incremental learning. There's a lot of really good papers there; it's exciting. Okay, autonomous
vehicles. Oh boy — let me try to use as few sentences as possible to describe this section of a few slides. It is one of the most exciting areas of application of AI and learning in the real world today, and I think it is the place where artificial intelligence systems touch human beings who don't know anything about artificial intelligence the most: hundreds of thousands, soon millions, of cars will be interacting with human beings — robots, really. So this is a really exciting area and a really difficult problem, and there are two approaches. One is level two, where the human is fundamentally responsible for the supervision of the AI system; and level four, or at least the dream of it, is where the AI system is responsible for the actions and the human does not need to be a supervisor. Okay, two companies represent each of
these approaches and are sort of leading the way. Waymo: in October 2018, ten million miles on road; today, this year, they've reached twenty million miles; in simulation, ten billion miles. I've gotten a chance to visit them out in Arizona; they're doing a lot of really exciting work, and they're obsessed with testing. The kind of testing they're doing is incredible: twenty thousand classes of structured tests, putting the system through all kinds of tests that the engineers can think through and that appear in the real world, and they've initiated testing on road with real consumers without a safety driver — which, if you don't know, means the car is truly responsible; there's no human catch. On the other side, the exciting thing is that there are seven hundred thousand, eight hundred thousand Tesla Autopilot systems — that means there are these systems that are human-supervised, using a multi-headed neural network, a multitask neural network, to perceive, predict, and act in this world. That's a really exciting real-world, large-scale deployment of neural networks as a fundamentally deep-learning system: unlike Waymo, for which deep learning is the icing on the cake, for Tesla deep learning is the cake — it's at the core of the perception and the action the system performs. They have to date done over two billion miles, estimated, and that continues to quickly grow. I'll
briefly mention what I think is a super exciting idea in all applications of machine learning in the real world, which is online, iterative learning, active learning. Andrej Karpathy, who is the head of Autopilot, calls this the data engine: it's this iterative process of having a neural network perform the task, discovering the edge cases, searching for other edge cases that are similar, annotating them, retraining the network, and continuously doing this loop. This is what every single company that's using machine learning seriously is doing. There are very few publications in this space, on active learning, but this is the fundamental problem of machine learning: it's not to create a brilliant neural network, it's to create a dumb neural network that continuously learns and improves until it's brilliant. And that process is especially interesting when you take it outside of single-task learning. Most papers are written on single-task learning: you take whatever benchmark — here, in the case of driving, object detection, landmark detection, drivable area, trajectory generation — all of those have benchmarks, and you can have separate neural networks for them; that's single-task learning. Combining them, using a single neural network that performs all those tasks together — that's the fascinating challenge, where you're reusing parts of the neural network to learn things that are coupled, and then to learn things that are completely independent, while doing the continuous active learning loop.
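A minimal sketch of the shared-backbone, multi-headed multitask idea just described (illustrative only, assuming PyTorch; this is not Tesla's architecture):

```python
import torch
import torch.nn as nn

class MultiHeadNet(nn.Module):
    """One shared backbone feeding several per-task heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(             # shared feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.object_head = nn.Linear(16, 10)       # e.g. object classes
        self.lane_head = nn.Linear(16, 4)          # e.g. lane / drivable-area outputs

    def forward(self, x):
        features = self.backbone(x)
        return self.object_head(features), self.lane_head(features)

net = MultiHeadNet()
obj_logits, lane_logits = net(torch.randn(2, 3, 64, 64))
# Per-task losses are summed, so the shared backbone learns the coupled tasks.
```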
Inside companies — in the case of Tesla and Waymo in general — it's exciting to have people, actual human beings, who are responsible for these particular tasks: they've become experts at a particular perception task, experts at a particular planning task, and so on, and the job of that expert is both to train the neural network and to discover the edge cases which maximize the improvement of the network. That's where the human expertise comes in a lot.
Okay. And there's a lot of debate — it's an open question — about which kind of approach will be successful. A fundamentally learning-based approach, as with the level-two Tesla Autopilot system, learns all the different tasks involved with driving, and as it gets better and better, less and less human supervision is required. The pro of that approach is that camera-based systems have the highest-resolution information, so it's very amenable to learning; but the con is that it requires a lot of data, a huge amount of data, and nobody knows how much data yet. The other con is human psychology — the driver behavior, the fact that the human must remain vigilant. The level-four approach, which besides cameras and radar and so on also leverages lidar and maps: the pros are that it's a much more consistent, reliable, explainable system — the accuracy of the detection and the depth estimation of different objects is much higher with less data; the cons are that it's expensive, at least for now, that it's less amenable to learning methods because of much less data, lower-resolution data, and that it requires, at least for now, some fallback, whether that's a safety driver or teleoperation. The open
questions for the deep-learning, level-two, Tesla Autopilot approach are: how hard is driving? This is actually the open question for most disciplines in artificial intelligence — how difficult is driving, how many edge cases does driving have, and can we learn to generalize over those edge cases without solving the common-sense reasoning problem, without solving the human-level artificial intelligence problem? That means perception — how hard is perception: detection, intention modeling, human mental-model modeling, trajectory prediction; then the action side, the game-theoretic action side, of balancing, like I mentioned, fun and enjoyability with the safety of the system, because these are life-critical systems; and human supervision, the vigilance side: how good can Autopilot get before vigilance decrements significantly, and people fall asleep, become distracted, start watching movies, and so on and so on — the things that people naturally do. The open question is how good can Autopilot get before that becomes a serious problem, and whether that decrement nullifies the safety benefit of the use of Autopilot,
which is