Transcript
STFcvzoxVw4 • Vladimir Vapnik: Statistical Learning | Lex Fridman Podcast #5
Kind: captions
Language: en
the following is a conversation with
Vladimir Vapnik he's the co-inventor of
support vector machines support
vector clustering VC theory and many
foundational ideas in statistical
learning he was born in the Soviet Union
and worked at the Institute of control
sciences in Moscow then in the United
States
he worked at AT&T NEC labs Facebook
research and now is a professor at Columbia
University his work has been cited over
a hundred seventy thousand times he has
some very interesting ideas about
artificial intelligence and the nature
of learning especially on limits of our
current approaches and the open problems
in the field
this conversation is part of the MIT course
on artificial general intelligence and
the artificial intelligence podcast if
you enjoy it please subscribe on youtube
or rate it on iTunes or your podcast
provider of choice or simply connect
with me on Twitter or other social
networks at Lex Fridman spelled F-R-I-D
and now here's my conversation with
Vladimir Vapnik Einstein famously said
that God doesn't play dice yeah you have
studied the world through the eyes of
statistics so let me ask you in terms of
the nature of reality fundamental nature
of reality does God play dice we don't
know some factors and because we don't know
some factors which could be important it
looks like God plays dice but we
should describe it in philosophy they
distinguish between two positions
positions of instrumentalism where you
create theory for prediction and
position of realism where you are trying to
understand what God did can you describe
instrumentalism and realism a little
bit for example if you have some
mechanical laws what is that is it a law
which is true always and everywhere or it
is a law which allows you
to predict position of moving element
what do you believe you believe
that it is God's law that God created the
world which obeys this physical law or
it is just law for predictions and which
one is instrumentalism for predictions
if you believe that this is law of
God and it is
always true everywhere that means that
you are a realist so you are trying to
really
understand God's thought so the way
you see the world is as an instrumentalist
you know I am working on some models
models of machine learning so in this
model we can see setting and we try to
solve resolve the setting to solve the
problem and you can do it in two different
ways from the point of view of
instrumentalists and that's what
everybody does now because they say
that goal of machine learning is to find
the rule for classification that is true
but it is instrument for prediction but I
can say the goal of machine learning is
to learn about conditional
probability so how God plays dice and
if he plays what is probability for
one what is probability for another
given situation
but for prediction I don't need this I
need the rule but for understanding
I need conditional probability so let me
just step back a little bit first to
talk about you mentioned which I read
last night the parts of the 1960 paper
by Eugene Wigner unreasonable
effectiveness of mathematics in the natural
sciences it's such a beautiful
paper by the way it made me feel to be
honest to confess my own work in the
past few years on deep learning heavily
applied made me feel that I was missing
out on some of the beauty of nature
in the way that math can uncover it so let
me just step away from the poetry of
that for a second how do you see the
role of math in your life is it a tool
is it poetry where does it sit and does
math for you have limits of what you can
describe some people say that math is
language which God uses so I believe
speak to God or use God
use God yeah so I believe that this
article about effectiveness unreasonable
effectiveness of math is that if you
look at mathematical structures
they know something about reality and
most scientists from natural science
they are looking at equations and trying to
understand reality so the same in
machine learning if you try very
carefully to look at all equations which
define conditional probability you can
understand something about reality
more than from your fantasy so math can
reveal the simple underlying principles
of reality perhaps you know what means
simple it is very hard to discover them
but then when you discover them and look
at them you see how beautiful they are
it is surprising why people did not see
that before you look at equations and
derive it from equations
for example I talked yesterday about
least squares method and people had a lot
of fantasy how to improve least squares
method but if you go step by
step by solving some equations you
suddenly get some term which after
thinking you understand that it describes
position of observation point in least
squares method you throw out a lot of
information we don't look at position
of points of observations we are looking
only at residuals but when you
understood that that's very simple idea
but it's not too simple to understand
and you can derive this just from
equations so some simple algebra a few
steps will take you to something
surprising when you think about it
and that is proof that human intuition is
not too rich and very primitive and it
does not see very simple situations so
let me take a step back in
general yes right but what about human
as opposed to intuition ingenuity the
moments of brilliance so do you
have to be so hard on human intuition
are there moments of brilliance in human
intuition that can leap ahead of math
and then the math will catch up
I don't think so I think that
the best human intuition it is putting
in axioms and then it is technical
to see where the axioms take you but
if they correctly take axioms but the
axioms are polished during generations of
scientists and this is integral wisdom
so that's beautifully put
but if you maybe look at it
when you think of Einstein and
special relativity what is the
role of imagination coming first there
in the moment of discovery of an idea so
there's obviously a mix of math and
out-of-the-box imagination there that I
don't know whatever I did I exclude any
imagination because whatever I saw in
machine learning that come from
imagination like features like deep
learning they are not relevant to the
problem when you are looking very
carefully at mathematical equations
you are deriving very simple theory which
goes far beyond theoretically than
whatever people can imagine because it
is not good fantasy it is just
interpretation it is just fantasy but it
is not what you need you don't need any
imagination to derive say main
principle of machine learning when you
think about learning and intelligence
maybe thinking about the human brain and
trying to describe mathematically the
process of learning that is something
like what happens in the human brain do
you think we have the tools currently do
you think we will ever have the tools to
try to describe that process of learning
you know it is not description of what's
going on
it is interpretation it is your
interpretation your vision can be wrong
you know when a guy invented microscope
Leeuwenhoek for the first time only he got
this instrument and he kept
secret about microscope but he wrote
reports to London Academy of Science
in his report when he looked at the
blood
he looked everywhere on the water on the
blood on the sperm but he described blood like
fight between queens and kings so
he saw blood cells red cells and he
imagined that it is army fighting each
other and it was his interpretation of
situation and he sent this report
to Academy of Science they very carefully
looked because they believed that
he's right he saw something yes
but he gave wrong interpretation and I
believe the same can happen with brain
well yeah the most important part you
know I believe in human language in some
proverbs there is so much wisdom
for example people say that better
than thousand days of diligent studies
is one day with great teacher but if I
will ask you what teacher does nobody
knows and that is intelligence but
we know from history and now from
math and machine learning that teacher
can do a lot
so what from a mathematical point of
view is a great teacher I don't know
that's an open question no but we can say
what teacher can do he can introduce
some invariants some predicates for
creating invariants how he is doing it I
don't know because teacher knows reality
and can describe from this reality
predicates invariants but we know that
when you are using invariants you can
decrease number of observations hundred
times so but maybe try to pull
that apart a little I think you
mentioned like a piano teacher saying to
the student play like a butterfly yeah I
play piano I play guitar for a long
time and yeah maybe
it's romantic
poetic but it feels like there's a lot
of truth in that statement like
there's a lot of instruction in that
statement so can you pull that apart what
is that the language itself may not
contain this information it's not blah blah
blah because it's not blah blah yeah it
affects you it's what it affects you it
affects your playing yes it does but it's not the
language it feels like what is the
information being exchanged there what
is the nature of information what is the
representation of that information I
believe that it is sort of predicates
but I don't know that's exactly what
what intelligence in machine learning
should be yes because the rest is just
mathematical technique I think that what
was discovered recently is that there are
two mechanisms of learning one
called strong convergence mechanism and
weak convergence mechanism before people
used only one convergence in weak
convergence mechanism you can use predicates
that's what play like butterfly and it
immediately affects your playing you know
there is English proverb great if it
looks like a duck swims like a duck and
quacks like a duck then it is probably a
duck yes but this is exactly about
predicates looks like a duck what it means
you saw many ducks that is your training
data so you have description of
how looks integral looks ducks yeah the
visual characteristics of a duck yeah
but you want and you have model for
recognition now so you would like so
that theoretical description from model
coincide with empirical description
which you saw there so about looks
like a duck it is general but what about
swims like a duck you should know that
duck swims you can say it plays chess
like a duck okay duck doesn't play chess
and it is completely legal predicate but
it is useless so how teacher can
recognize not useless predicate so up to
now we don't use this predicate in
existing machine learning and that is
why we need zillions of data but in this
English proverb they use only
three predicates looks like a duck swims
like a duck and quacks like a duck
so you can't deny the fact that swims
like a duck and quacks like a duck has
humor in it has ambiguity let's talk
about swims like a duck it does not
say jumps like a duck why
because it's not relevant but that
means you know ducks you know different
birds
you know animals and you derive from
this that it is relevant to say swims yeah so
underneath in order for us to understand
swims like a duck it feels like we need
to know millions of other little pieces
of information we pick up along the way
you don't think so there doesn't need to
be this knowledge base in those
statements carries some rich information
that helps us understand the essence of
duck
yeah how far are we from integrating
predicates you know that
when you consider complete theory of
machine learning so what it does you
have a lot of functions and then you
are talking it looks like a duck you
see your training data from training
data you recognize how expected duck
should look then you remove all
functions which do not look like you
think it should look
from training data so you decrease
amount of functions from which you pick
up one then you give a second predicate
and again decrease the
set of functions and after that you pick
up the best function you can find it is
standard machine learning so why you
need not too many examples because your
predicates are very good because every
predicate is invented to decrease
admissible set of functions so you talk
about admissible set of functions and
talk about good functions so what makes
a good function so admissible set of
functions is set of functions which has
small capacity or small diversity small
VC dimension and which contains good
function inside by the way for people
who don't know VC
you're the V in the VC so how would you
describe to a layperson what VC theory is
how would I describe this you have machine
so machine is capable to pick up one
function from the admissible set of
functions but set of admissible functions
can be big they contain all continuous
functions and it is useless you don't have so
many examples to pick up function but it
can be small small we call it capacity but
maybe better called diversity so not
very different functions in the set it is
infinite set of functions but not very
diverse so it is small VC dimension
when VC dimension is small you need
small amount of training data
so the goal is to create admissible set
of functions which have small
VC dimension and contain good function
then you will be able to
pick up the function using small amount
of observations so that is the task of
learning yeah is creating a set of
admissible functions that has a small VC
dimension and then you figure out a
clever way of picking up no that is goal of
learning which I formulated yesterday
yeah
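A minimal sketch of the "diversity" idea above, assuming two toy function sets that are not from the conversation (one-sided thresholds and closed intervals on a line). A set of points is shattered when the functions realize every possible labeling of it; the helper `vc_dimension_on` is a hypothetical name and only measures shattering on the candidate points it is given, so in general it is a lower bound on the true VC dimension.

```python
from itertools import combinations

def shatters(functions, points):
    """True if the functions realize all 2^n labelings of the points."""
    labelings = {tuple(f(x) for x in points) for f in functions}
    return len(labelings) == 2 ** len(points)

def vc_dimension_on(functions, candidate_points):
    """Largest n such that some n-subset of candidate_points is shattered."""
    best = 0
    for n in range(1, len(candidate_points) + 1):
        if any(shatters(functions, s) for s in combinations(candidate_points, n)):
            best = n
        else:
            break  # if no n-subset is shattered, no larger subset can be
    return best

# Illustrative (assumed) function sets, parameterized on a small grid.
grid = [0.5, 1.5, 2.5, 3.5]
points = [1, 2, 3]
thresholds = [lambda x, t=t: int(x >= t) for t in grid]
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a in grid for b in grid if a <= b]

print(vc_dimension_on(thresholds, points))  # less diverse set
print(vc_dimension_on(intervals, points))   # more diverse set
```

The interval set can label more point configurations than the threshold set, so it is the more diverse of the two and would need more observations to pin down one function.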
statistical learning theory does not
involve creating admissible set of
functions in classical learning theory
everywhere hundred percent in textbooks
the set of functions admissible set of
functions is given but this is science
about nothing because the most difficult
problem is to create admissible set of
functions given say a lot of functions
continuum set of functions create
admissible set of functions that means
that it has finite VC dimension small
VC dimension and contains good function
so this was out of consideration so
what's the process of doing it I mean
it's fascinating what is the process of
creating this admissible set of
functions
that is invariants that's invariants can you
describe invariants
yeah you have to think of properties of
training data and properties means that
you have some function and you just
count what is average value of
function on training data you have model
and what is the expectation of this
function on the model and they should
coincide so the problem is about how
to pick up functions it can be any
function in fact
it is true for all functions but because
when I am talking that say duck does
not jump so you don't ask question
jump like a duck because it is trivial
it does not jump and it doesn't help you to
recognize duck but you know something
which questions to ask and you are asking
it swims like a duck
but looks like a duck it is general
situation looks like say guy who has
this illness this disease it is legal
yeah so there is a general type of
predicate looks like and special
type of predicate which related to this
specific problem and that is
intelligence part of all this business
and that's where teacher is involved
incorporating the specialized predicates
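The invariant described above (the average of a predicate on the training data should coincide with its average under the model) can be sketched as follows. This is an illustration under assumptions, not the speaker's algorithm: the predicates `psi_one` and `psi_x`, the three candidate models, the tolerance `tol`, and the toy data are all hypothetical.

```python
def empirical_average(psi, xs, ys):
    """Average of psi(x) * y over the training data."""
    return sum(psi(x) * y for x, y in zip(xs, ys)) / len(xs)

def model_average(psi, xs, model):
    """Average of psi(x) * model(x), the model's version of the same quantity."""
    return sum(psi(x) * model(x) for x in xs) / len(xs)

def satisfies_invariant(psi, xs, ys, model, tol=0.05):
    """The invariant holds when the two averages (approximately) coincide."""
    return abs(empirical_average(psi, xs, ys) - model_average(psi, xs, model)) <= tol

def admissible(models, predicates, xs, ys, tol=0.05):
    """Names of the models that satisfy every predicate's invariant."""
    return [name for name, m in models.items()
            if all(satisfies_invariant(p, xs, ys, m, tol) for p in predicates)]

# Toy data (assumed): label is 1 when x >= 0.5.
xs = [i / 10 for i in range(10)]
ys = [int(x >= 0.5) for x in xs]

# Hypothetical candidate models, each returning an estimate of P(y = 1 | x).
models = {
    "step_at_0.5": lambda x: float(x >= 0.5),
    "step_at_0.2": lambda x: float(x >= 0.2),
    "constant_0.5": lambda x: 0.5,
}

psi_one = lambda x: 1.0  # matches the overall frequency of class 1
psi_x = lambda x: x      # matches where on the axis class 1 sits

print(admissible(models, [psi_one], xs, ys))
print(admissible(models, [psi_one, psi_x], xs, ys))
```

Each added predicate shrinks the set of admissible models before any function is picked, which is the role the conversation assigns to the teacher's predicates.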
okay what do you think about deep
learning as neural networks these
arbitrary architectures as helping
accomplish some of the tasks you're
thinking about their effectiveness or
lack thereof what are the
weaknesses and what are the possible
strengths you know I think that this is
fantasy everything which like deep
learning like features let me give you
this example one of the greatest books
is Churchill's book about history of
Second World War and he is starting this
book describing that in old times when
war is over so the great kings they gathered
together and almost all of them were
relatives and they discussed what should
be done how to create peace and they came to
agreement and when happened First World
War the general
public came in power and they were so greedy
that robbed Germany and it was clear for
everybody that it is not peace that
peace will last only 20 years because
they were not professionals in the same
way I see in machine learning there are
mathematicians who are looking for the problem
from very deep point of view
mathematical point of view and there are
computer scientists who mostly do not
know mathematics they just have
interpretation of that and they invented
a lot of blah blah blah interpretations
like deep learning why you need deep
learning mathematics does not know
deep learning mathematics does not know
neurons it is just function if you like
to say piecewise linear functions say
that and do it in class of piecewise
linear functions but they invent
something and then they try to prove
advantage of that through
interpretations which are mostly wrong
and when it is not enough they
appeal to brain which they know nothing
about nobody knows what is going on in
brain so I think that more reliable to work
on math this is mathematical problem
do your best to solve this problem try to
understand that there is not only one way
of convergence which is strong way of
convergence there is a weak way of
convergence which requires predicates and
if you will go through all this stuff you
will see that you don't need deep learning
even more I would say one of the theorems
which is called representer theorem it says
that optimal solution of mathematical
problem which describes
learning is on shallow network not
on deep learning and a shallow network
yes there
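For reference, one common textbook statement of the representer theorem is given below; it is an assumption that this is the form the speaker has in mind. Here $\mathcal{H}$ is a reproducing kernel Hilbert space with kernel $K$, $L$ is a loss function, $\gamma > 0$ is a regularization constant, and $(x_i, y_i)$, $i = 1, \dots, \ell$, are the training examples.

```latex
\min_{f \in \mathcal{H}} \;
  \frac{1}{\ell} \sum_{i=1}^{\ell} L\bigl(y_i, f(x_i)\bigr)
  + \gamma \, \lVert f \rVert_{\mathcal{H}}^{2}
\quad \Longrightarrow \quad
f^{*}(x) = \sum_{i=1}^{\ell} \alpha_i \, K(x_i, x)
```

The minimizer is a single weighted sum of kernel functions centered at the training points, which can be read as a network with one hidden layer.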
absolutely so in the end what you're
saying is exactly right the question is
you have no value for throwing something
on the table playing with it not math
it's like a neural network where you
said throwing something in the bucket
or the biological example of
looking at kings and queens or the cells
with a microscope you don't see value in
imagining the cells are kings and queens
and using that as inspiration and
imagination for where the math will
eventually lead you you think that
interpretation basically deceives
you in a way that's not productive I
think that if you try to analyze this
business of learning and especially
discussion about deep learning it is
discussion about interpretation not
about things about what you can say about
things that's right but aren't you
surprised by the beauty of it so the the
not mathematical beauty but the fact
that it works at all or are you
criticizing that very beauty our human
desire to interpret to find our silly
interesting interpretations in these
constructs like let me ask you this
are you surprised and does it
inspire you how do you feel about the
success of a system like AlphaGo
at beating the game of Go using neural
networks to estimate the quality of a
board and the quality of the
position that is your interpretation quality
of the board yeah yes yeah
but it works so it's not our
interpretation the fact is a neural
network system doesn't matter a learning
system that we don't I think
mathematically understand that well
beats the best human player does
something that was thought impossible
that means it's not very difficult problem
so empirically we've
discovered that
this is not a very difficult problem
yeah it's true so maybe I can't argue
so even more I would say that if they use
deep learning it is not the most effective
way of learning and usually when
people use deep learning they're using
zillions of training data yeah but you
don't need this so I describe challenge
can we do some problems which deep
learning method does well this deep net using
hundred times less training data even
more some problems deep learning cannot
solve because it's not necessarily they
create admissible set of functions
to create deep architecture means to
create admissible set of functions you
cannot say that you're creating good
admissible set of functions
it's just your fantasy it does not come
from math but it is possible to create
admissible set of functions because you
have your training data actually for
mathematicians
when you consider invariants you need to
use law of large numbers when you make
training in existing algorithms you need
uniform law of large numbers which is
much more difficult it requires VC dimension
and all that stuff but nevertheless if
you use both weak and strong way of
convergence you can decrease a lot of
training data you could do the three
the swims like a duck and quacks like a
duck but so let's step back
and think about human
intelligence in general and clearly that
has evolved
in a non-mathematical way it wasn't as
far as we know God or whoever didn't
come up with a model and place in our
brain of admissible functions it kind of
evolved I don't know maybe you have a
view on this but so Alan Turing in the
50s in his paper asked and rejected the
question can machines think it's not a
very useful question but can you briefly
entertain this useless question
can machines think so talk about
intelligence and your view of it I don't
know that I know that Turing described
imitation if computer can imitate human
being
let's call it intelligent and he
understands that it is not thinking
computer yes
he completely understands what he is doing
but he set up problem of imitation so
now we understand that the problem is not
in imitation I am not sure that
intelligence is just inside of us it may be
also outside of us
I have several observations so when I
prove some theorem it's very difficult
theorem in couple of years in several places
people prove the same theorem say Sauer
lemma after ours was done then another
guys proved the same theorem in the
history of science it happened all the
time
for example geometry it happened
simultaneously first did Lobachevsky
and then Gauss and Bolyai and other guys
and it was approximately in
ten years period of time and I saw
a lot of examples like that and
mathematicians think that when they
develop something they develop
something in general which affects
everybody so maybe our model that intelligence
is only inside of us is incorrect it's our
interpretation yeah it might be there exists
some connection with world intelligence I
don't know you're almost like plugging
in into yeah exactly and contributing to
this network into a big maybe
a neural network on the flip side of that
maybe you can comment on Big O
complexity and how you see classifying
algorithms by worst-case running time in
relation to their input so that way of
thinking about functions do you think P
equals NP do you think that's an
interesting question
yeah it is interesting question but let
me talk about complexity and about worst
case scenario there is a mathematical
setting when I came to United States in
1990 people did not know this theory
they did not know statistical learning theory
so in Russia it was published two
monographs our monographs but in America
they did not know then they learned and
somebody told me that it is worst case
theory and they will create real case
theory but till now it did not
because it is mathematical tool
you can do only what you can do using
mathematics which has clear
understanding and clear description and
for this reason we introduced complexity
and you need this because using actually
it is diversity I like this one
more VC dimension you can prove some
theorems but we also created theory for
case when you know probability measure
and that is the best case that can happen
this is entropy theory
so from mathematical point of view
you know the best possible case and the
worst possible case you
can derive different model in the middle but it's
not so interesting
you think the edges are
interesting the edges are interesting
because it is not so easy to get good
bound exact bound it's not many cases where
you have the bound is not exact but
interesting principles which discover
the math do you think it's interesting
because it's challenging and reveals
interesting principles that allow you to
get those bounds or do you think it's
interesting because it's actually very
useful for understanding the essence of
a function of an algorithm so it's like
me judging your life as a human being by
the worst thing you did and the best
thing you did versus all the stuff in
the middle it seems not productive I
don't think so because you cannot
describe situation in the middle or it
will be not general so you can describe
edge cases and it is clear it has some
model but you cannot describe model for
every new case so you'll be never
accurate when you use it but from a
statistical point of view the way you've
studied functions and and the nature of
learning in the world don't you think
that the real world has a very long tail
that the edge cases are very far away
from the mean the stuff in the middle or
no I don't know that from
my point of view if you will use formal
statistics you need uniform law of large
numbers if you will use this invariance
business you will need just law of large
numbers and there is huge
difference between uniform law of large
numbers and law of large numbers is it
useful to describe that a little more
or should we just take it no for example
when I am talking about duck I gave
three predicates and it was enough but if
you will try to do formal distinguish
you will need a lot of observations
and so that means that information
about looks like a duck contains a lot of
bits of information formal bits of
information so we don't know
how much bits of information contain
things from artificial from
intelligence and that is the subject of
analysis till now all business I
don't like how people consider
artificial intelligence they consider it as
some codes which imitate
activity of human being it is not
science it is applications you would
like to imitate go ahead it is very
useful and good problem but you need to
learn something more how people
can develop say
predicates swims like a duck or play
like butterfly or something like that
not the teacher tells you
how it came in his mind how he chooses
this image so that process
that is problem of intelligence that is the
problem of intelligence and you see that
connected to the problem of learning
absolutely because you
immediately give this predicate like
specific predicate swims like a duck or
quacks like a duck it was chosen
somehow so what is the line of work
would you say well if you were to
formulate as a set of open problems that
will take us there to play like a
butterfly we'll get a system to be able
to let's separate two stories one
mathematical story that if you have
predicates you can do something and
another story how to get predicates it
is intelligence problem and people even
did not start to understand intelligence
because to understand intelligence first
of all try to understand what
teachers do how teacher teaches why one
teacher is better than another one yeah so
you think we really even haven't started
on the journey of generating the
predicates no we don't understand we even
don't understand that this problem
exists because did you feel yeah no I
just know name yeah I want to
understand why one teacher is better than
another and how teacher affects student
it is not because he is repeating the
problem which is in textbook he makes
some remarks he makes some philosophy of
reasoning yeah that's a beautiful so it
is a formulation of a question that is
the open problem why is one teacher
better than another right
what he does better yeah
why at every level
how do they get better what
does it mean to be better the whole yeah
yeah from
whatever model I have one teacher
can give a very good predicate one
teacher can say swims like a duck and
another can say jumps like a duck and
jumps like a duck carries zero
information yeah so what is the most
exciting problem in statistical learning
you've ever worked on or are working on
now oh I just finished this invariant
story and I'm happy that I believe that
it is ultimate learning story at least I
can show that there are no another
mechanism only two mechanisms but they
separate statistical part from
intelligent part and I know nothing
about intelligent part and if we will know
this intelligent part so it will help us
a lot in teaching in learning yeah we'll know it
when we see it so for example in my talk
the last slide was the challenge so you
have say MNIST digit recognition problem
and deep learning claims that they did it
very well
say 99.5% correct answers but they use
60,000 observations can you do the same
using hundred times less but
incorporating invariants what it means you
know digit 1 2 3 but just looking
at that explain to me which invariant I
should keep to use hundred examples or
say hundred times less examples to do
the same job yeah that last slide
unfortunately your talk ended quickly
but that last slide was a powerful open
challenge and a formulation of the
essence here yeah that is exact problem
of intelligence because everybody
when machine learning started it was
developed by mathematicians they
immediately recognized that we use much
more training data than humans need but
now again we came to the same story
how to decrease that is the problem of
learning it is not like in deep learning
they use zillions of training data
because maybe zillions are not enough if you
have good invariants maybe you will
never collect some number of observations
but now it is a question to
intelligence how to do that because
statistical part is ready as soon as
you supply us with predicates we can do
good job with small amount of observations
and the very first challenge is well-known
digit recognition and you know digits
and please tell me invariants I think
about that I can say for digit three I
would introduce
concept of horizontal symmetry so the
digit three has horizontal symmetry say
more than say digit two or something like
that but as soon as I get the idea of horizontal
symmetry I can mathematically invent a
lot of measures of horizontal
symmetry or then vertical symmetry or
diagonal symmetry whatever if I have
idea of symmetry but what else looking on
digit I see that it is meta
predicate which is not shape it is
something like symmetry like how dark
is whole picture something like that
which can serve as a predicate
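A minimal sketch of how such predicates might be measured on a binary image, assuming toy 5x4 glyphs for "3" and "2" that are purely illustrative and not taken from any dataset. Here "horizontal symmetry" is read as symmetry about the horizontal midline (top half mirrors bottom half); the function names are hypothetical.

```python
def horizontal_symmetry(image):
    """Fraction of pixels equal to their mirror across the horizontal midline."""
    flipped = image[::-1]  # top-bottom flip
    total = sum(len(row) for row in image)
    matches = sum(a == b
                  for row, frow in zip(image, flipped)
                  for a, b in zip(row, frow))
    return matches / total

def vertical_symmetry(image):
    """Fraction of pixels equal to their mirror across the vertical midline."""
    total = sum(len(row) for row in image)
    matches = sum(a == b
                  for row in image
                  for a, b in zip(row, row[::-1]))  # left-right flip
    return matches / total

def darkness(image):
    """Average pixel value, a rough 'how dark is the whole picture' measure."""
    total = sum(len(row) for row in image)
    return sum(sum(row) for row in image) / total

# Toy glyphs (assumed for illustration): 1 is ink, 0 is background.
THREE = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
]
TWO = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

print(horizontal_symmetry(THREE), horizontal_symmetry(TWO))
print(darkness(THREE), darkness(TWO))
```

On these toy glyphs the "3" scores higher on horizontal symmetry than the "2", which is the kind of measurable property that could be turned into an invariant.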
you think such
predicates could rise out of something
that's not general meaning it feels like
for me to be able to understand the
difference between the two and three I
would need to have had a childhood of
ten to fifteen years playing with kids
going to school being yelled at by parents
all of that walking jumping looking at
ducks and then I would be able to
generate the right predicates for
telling the difference between two and
three or do you
think there's a more efficient way I
don't know I know for sure that you must
know something more than digits yes that's a powerful
statement yeah but maybe there are several
languages of description of these elements
of digits so I am talking about
symmetry about some
properties of geometry I'm talking about
something abstract I don't know that but
this is a problem of intelligence so in
one of our articles it is trivial to show
that every example can carry not more
than one bit of information in reality
because when you show example and you
say this is one you can remove say
functions which do not tell you one say
the best strategy if you can do it
perfectly
is to remove half of the functions but when you use one
predicate which looks like a duck you
can remove much more functions than half
and that means that it contains a lot of
bits of information from formal point of view
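A small sketch of the counting argument above, under assumptions: the function set is taken to be all 16 binary labelings of four points, and the predicate "exactly one point is positive" is a hypothetical stand-in for something like "looks like a duck". One labeled example removes half of the functions (one bit), while the predicate removes more than half in a single step.

```python
from itertools import product
from math import log2

# Assumed toy setting: every binary labeling of 4 points is a candidate function.
functions = list(product([0, 1], repeat=4))  # f[x] is the label of point x
target = (0, 0, 1, 0)                        # the function to be learned

def filter_by_example(fs, x, y):
    """Keep only functions that agree with the labeled example (x, y)."""
    return [f for f in fs if f[x] == y]

def filter_by_predicate(fs, predicate):
    """Keep only functions that satisfy the predicate."""
    return [f for f in fs if predicate(f)]

def bits_gained(before, after):
    """Information gained, measured as log2 of the shrink factor."""
    return log2(len(before) / len(after))

after_example = filter_by_example(functions, 0, target[0])
after_predicate = filter_by_predicate(functions, lambda f: sum(f) == 1)

print(len(functions), len(after_example), bits_gained(functions, after_example))
print(len(functions), len(after_predicate), bits_gained(functions, after_predicate))
```

In this toy setting a single example can never do better than one bit, while a well-chosen predicate is worth several examples at once.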
but
when you have a general picture of what
you want to recognize and general
picture of the world can you invent
this predicate and that predicate carries
a lot of information beautifully put
maybe just me but in all the math you
show in your work which is some of the
most profound mathematical work in the
field of learning ai and just math in
general I hear a lot of poetry and
philosophy you really kind of talk about
philosophy of science there's
a poetry and music to a lot of the work
you're doing and the way you're thinking
about it
so where does that come from
do you escape to poetry do you escape to
music I think that there exists ground truth
there exists ground truth yeah and that can be seen
everywhere the smart guy
philosopher sometimes I am surprised how
deep they see sometimes I see that some of
them are completely out of subject but
the ground truth I see in music music is
the ground truth yeah and in poetry
many poets they believe they take dictation
so what piece of music as a piece
of empirical evidence gave you a sense
that they are touching something
in the ground truth it is structure the
structure listening to Bach yeah you
see the structure
very clear very classic very simple
and the same in math when you have axioms
in geometry you have the same feeling
yeah and in poetry sometimes you see the same
yeah and if you look back
at your childhood you grew up in Russia you maybe
were born as a researcher in Russia
you've developed as a researcher in
Russia you came to United States and a
few places if you look back
what was some of your happiest moments
as a researcher some of the most
profound moments not in terms of their
impact on society but in terms of their
impact on how damn good you feel that
day and you remember that moment you
know every time when you found something
it is great in life
even simple things but my general
feeling is that most of my time
I was wrong you should go again and again
and again and try to be honest in front
of yourself not to make interpretation
but try to understand that it is related to
ground truth it is not my blah blah blah
interpretation and something like that
but you're allowed to get excited at the
possibility of discovery oh yeah
you have to double check
it but no but how it is related to the
other ground truth is it just temporary
or it is forever you know you
always have a feeling when you found
something how big is that so 20 years
ago when we discovered statistical learning
theory nobody believed except for one
guy Dudley from MIT and then in 20 years it
became fashion and the same with support
vector machines that is kernel
machines so with support vector
machines and learning theory when
you were working on it you had a sense
that you had a sense of
the profundity of it how this
seems to be right it seems to be
powerful
right absolutely immediately I
recognized that it will last forever and
now when I found this invariant story
I have a feeling
that it is complete because I have
proved there are no different mechanisms
you can have some say cosmetic
improvement you can do but in terms of
invariants you need both invariants and
statistical learning and they should work
together but also I am happy that we can
formulate
what is intelligence from that and to
separate from technical part and that is
completely different absolutely well thank
you so much for talking today thank you
it's an honor