Vladimir Vapnik: Predicates, Invariants, and the Essence of Intelligence

Vladimir Vapnik: Predicates, Invariants, and the Essence of Intelligence | Lex Fridman Podcast #71

bQa7hpUpMzM • 2020-02-14

the following is a conversation with
Vladimir Vapnik part 2 the second time
we spoke on the podcast
he's the co-inventor of support vector
machines support vector clustering VC
theory and many foundational ideas in
statistical learning he was born in the
Soviet Union worked at the Institute of
control sciences in Moscow then in the
u.s. worked at AT&T NEC labs Facebook AI
research and now is a professor at
Columbia University his work has been
cited over 200,000 times the first time
we spoke on the podcast was just over a
year ago one of the early episodes this
time we spoke after a lecture he gave
titled complete statistical theory of
learning as part of the MIT series of
lectures on deep learning and AI that I
organized I'll release the video of the
lecture in the next few days this
podcast and lecture are independent from
each other so you don't need one to
understand the other the lecture is
quite technical and math heavy so if you
do watch both I recommend listening to
this podcast first since the podcast is
probably a bit more accessible this is
the artificial intelligence podcast if
you enjoy it subscribe on YouTube give
it five stars on Apple Podcasts support it
on Patreon or simply connect with me on
Twitter at lexfridman spelled F-R-I-D-M-A-N
as usual I'll do one or two minutes
of ads now and never any ads in the
middle that can break the flow of the
conversation I hope that works for you
and doesn't hurt the listening
experience this show is presented by Cash App
the number one finance app in the App
Store when you get it use code LexPodcast
Cash App lets you send money to
friends buy Bitcoin and invest in the
stock market with as little as $1
brokerage services are provided by Cash
App Investing a subsidiary of Square and
member SIPC since Cash App allows you
to send and receive money digitally
peer-to-peer and security in all digital
transactions is very important let me
mention the PCI data security standard
PCI DSS level 1 that Cash App is compliant
with
I'm a big fan of standards for safety
and security and PCI DSS is a good
example of that where a bunch of
competitors got together and agreed that
there needs to be a global standard
around the security of transactions now
we just need to do the same for
autonomous vehicles and the AI systems
in general
so again if you get Cash App from the
App Store or Google Play and use the
code LexPodcast you get ten dollars and
Cash App will also donate ten dollars
to FIRST one of my favorite
organizations that is helping to advance
robotics and STEM education for young
people around the world and now here's
my conversation with vladimir vapnik you
and I talked about Alan Turing yesterday
a little bit and that he as the father
of artificial intelligence may have
instilled in our field an ethic of
engineering and not science seeking more
to build intelligence rather than to
understand it what do you think is the
difference between these two paths of
engineering intelligence and the science
of intelligence it's completely
different story engineering is
imitation of human activity you have to
make device which behave as human
being have all the functions of human it
does not matter how you do it but to
understand what is intelligence that is
quite different problem so I think I
believe that it's somehow related to
predicate we talked yesterday about
because look at Vladimir Propp's idea
he just found 31 predicates
he called it units which can explain
human behavior at least in the Russian
tales he looked at Russian tales and
derived from that
then people realized that it is more
wide than Russian tales it is in TV in
movie serials and so on and so on so you're
talking about Vladimir Propp right who
in 1928 published a book Morphology of
the Folktale describing 31 predicates
that have this kind of sequential
structure that a lot of the stories
narratives follow in Russian folklore
and in other content we'll talk about it
I'd like to talk about predicates in a
focused way but let me if you allow me
to stay zoomed out on our friend Alan
Turing and you know he inspired a
generation with the the imitation game
yes do you think if we can linger on it a
little bit longer do you think we can
learn do you think learning to imitate
intelligence can get us closer to the
science of understanding intelligence so
why do you think imitation is so far
from understanding I think that it is
different because you have different
goals so your goal is to create
something something useful and that is
great and you can see how much things
was done and I believe that it will be
done even more it's self-driving cars and
all this business it is great and it
was inspired by Turing vision but
understanding is very difficult it's more
or less philosophical category
what means understand the world I believe
in scheme which start from Plato that
there exists world of ideas I believe
that intelligence it is world of ideas
but it is world of pure ideas and
when you combine them with
reality things it creates as in my case
invariants which is very specific and
that I believe the combination of ideas
in way to constructing invariants is
intelligence but first of all predicates
if you know predicates and hopefully
there is not too much predicates exists
for example 31 predicates for human
behaviors not a lot
Vladimir Propp used 31 you can even call
them predicates 31 predicates to describe
stories narratives what do you think
human behavior how much of human
behavior how much of our world our
universe all the things that matter in
our existence can be summarized in
predicates of the kind that Propp was
working with I think that we have a
lot of form of behavior but I think the
predicate is much less because even in
these examples which I gave you
yesterday you saw that predicates can be
can construct one predicate can
construct many different invariants
depending on your data they're
applying to different data and they give
different invariants so but pure ideas
maybe not so much not so many I
don't know about that but my guess I
hope that's why challenge about digit
recognition how much you need I think
we'll talk about computer vision and 2D
images a little bit in your challenge
that's exactly about intelligence that's
exactly that's exactly about no that
hopes to be exactly about the spirit of
intelligence in the simplest possible
way absolutely you should start this
simple story of the very serial to do
well there's an open question whether
starting at the MNIST digit
recognition is a step towards
intelligence or it's an entirely
different thing I think that to beat
records using a hundred two hundred
times less examples you need
intelligence you need intelligence so
let's because you used this term and
it'll be nice and I'd like to ask simple
maybe even dumb questions let's start
with a predicate in terms of terms and
how you think about it what is a
predicate I don't know I have a feeling
that formally they exist but I believe
that predicate for 2d images one of them
is symmetry hold on a second sorry sorry
to interrupt and pull you back at the
simplest level we're not evens we're not
being profound currently a predicate is
a statement of something that is true
yes do you think of predicates as
somehow probabilistic in nature or is
this binary are these truly constraints or
logical statements about the world in my
definition the simplest predicate is
function function and you can use this
function to make inner product that is
predicate what's the input and what's the
output of the function input is
something which is input in reality so
if you consider digit recognition it is
pixel space yes input but it is
function which in pixel space but it can
be any function from pixel space and you
choose and I believe that there are
several functions which is important for
understanding of images one of them is
symmetry it's not so simple construction
as I described with Lie derivative and
other stuff but another I believe I
don't know how many
is how well structurized is picture
structurized yeah what I mean by
structurized it is formal definition
say something heavy on the left
corner not so heavy in the middle and so
on you describe in general concept of
what you see some kind of
universal concepts yeah but I don't know
how to formalize this do you so this is
the thing there's a million ways we can
talk about this I'll keep bringing it up
but we humans have such concepts when we
look at digits but it's hard to put them
just like you're saying now it's hard to
put them into words you know that this
example when critics in music trying to
describe music they use predicates and
not too many predicates but in different
combination but they have some special
words for describing music and the same
should be for images but maybe there are
critics who understand essence of what
this image is about do you think there
exists critics who can summarize the
essence of images human beings I
hope that yes but that explicitly
state them on paper this is the
fundamental question I'm asking is do
you do you think there exists a small
set of predicates that will summarize
images it feels to our mind like it does
that the concept of what makes a two and
A three and a four
no no it's not on this level what it
should not describe two three four it
describes some construction which allow
you to create invariants
invariants sorry to stick on this but
terminology invariants it is it is
property of your image say I can say
looking on my image it is more or less
symmetric and I can give you a value of
symmetry say level of symmetry using
this function which I gave yesterday
then you can describe that your image
have these characteristics exactly in
the way how musical critics describe
music so but this is invariant applied
to specific data to specific music
to something I strongly believe in
this Plato idea that there exists world of
predicate and world of reality and
predicate and reality is somehow
connected and you have to know that
let's talk about Plato a little bit so
you draw a line from Plato to Hegel to
Wigner to today yes so Plato has forms
the the theory of forms there's a world
of ideas
yeah world of things as you talk
about and there's a connection and
presumably the world of ideas is very
small
and the world of things is arbitrarily
big but they're all what Plato calls
them like the it's a shadow the real
world is a shadow from the world of yeah
you have projection projection
of world of ideas yes right and in reality
you can realize this projection
using these invariants because it is
projection on specific examples
which create specific features of
specific objects so so the essence of
intelligence is while only being able to
observe the world of things try to come
up
the world of ideas exactly like in this
music story intelligent musical critics
knows the soldiers more than favorite
feeling about Thornton I feel like
that's a contradiction intelligent music
critics but I think I think music is to
be enjoyed in all its forms the notion
of critic like a food critic no I don't
want dark mushroom that's an interesting
question
there's emotion there's a certain
elements of the human psychology of the
human experience which seem to almost
contradict intelligence and reason like
emotion like fear like like a love all
those things are those not connected in
any way to the space of ideas thus I
don't know I I just want to be
concentrate on a very simple story on
digit recognition so you don't think you
have to love and fear death in order to
recognize digits I don't know because
it's so complicated it is it involves a
lot of stuff which I never consider but
I know about digit recognition and I know
that for digit recognition to get
records from small number of
observations you need predicate but not
special predicate for this problem but
Universal predicate which understand
world of images of visual and visual yes
but on the first step they understand
say world of handwritten digits or
characters or something simple so like
he said symmetry as an interest no
that's what I think one of the
predicates is related to symmetry but
the level of symmetry ok degree of
symmetry so you know you think symmetry
at the bottom as a universal notion and
there's the
there's degrees of a single kind of
symmetry or is there many kinds of
symmetries many kinds of symmetries
there is a symmetry anti symmetry say
letter S so it has vertical anti
symmetry and it could be diagonal
symmetry vertical symmetry so when you
when you cut vertically the letter S
yeah then the upper part and lower part go
in different directions along the y axis
yeah but that's just like one example of
symmetry isn't there like right but
there is a degree of symmetry if you
play all this Lie derivative stuff
to do tangent distance whatever I
described you can do you can have a
degree of symmetry and that is
description of image it is the
same as you will describe this image
saying about digit S it has anti
symmetry digit 3 is symmetric more or less
look for symmetry do you think such
concepts like symmetry predicates like
symmetry is it a hierarchical set of
concepts or are these independent
distinct predicates that we want to
discover as some set no there is idea of
symmetry and you can this idea of
symmetry make very general like degree
of symmetry the degree of symmetry can
be zero no symmetry at all degree of
symmetry say more or less symmetrical
but you have one of this description and
symmetry can be different as I told
horizontal vertical diagonal and anti
symmetry is also concept of symmetry
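
A rough way to picture the "degree of symmetry" discussed above is sketched below. This is only an illustration and not the construction from the lecture, which the conversation says relies on tangent distance. Here the score is simply the normalized inner product of an image with a transformed copy of itself. The helper names, the tiny 5x3 glyphs, and the reading of "anti symmetry" as invariance under a 180-degree rotation are all assumptions made for the example.

```python
from typing import Callable, List

Image = List[List[float]]  # rows of pixel intensities (hypothetical toy format)


def flip_left_right(img: Image) -> Image:
    """Mirror the image about its vertical axis."""
    return [row[::-1] for row in img]


def flip_top_bottom(img: Image) -> Image:
    """Mirror the image about its horizontal axis."""
    return img[::-1]


def rotate_180(img: Image) -> Image:
    """Rotate by 180 degrees (assumed stand-in for the 'anti symmetry' of S)."""
    return [row[::-1] for row in img[::-1]]


def inner(a: Image, b: Image) -> float:
    """Pixel-wise inner product of two images of the same shape."""
    return sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def symmetry_degree(img: Image, transform: Callable[[Image], Image]) -> float:
    """Return <img, T(img)> / <img, img>: 1.0 means exactly symmetric under T."""
    energy = inner(img, img)
    if energy == 0:
        return 0.0
    return inner(img, transform(img)) / energy


# Crude 5x3 glyphs, invented for illustration only.
LETTER_S = [[1, 1, 1], [1, 0, 0], [1, 1, 1], [0, 0, 1], [1, 1, 1]]
DIGIT_3 = [[1, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 1], [1, 1, 1]]

print(round(symmetry_degree(LETTER_S, rotate_180), 3))       # 1.0
print(round(symmetry_degree(LETTER_S, flip_left_right), 3))  # 0.818
print(round(symmetry_degree(DIGIT_3, flip_top_bottom), 3))   # 1.0
```

The same function gives one number per kind of symmetry, so a single idea (symmetry) yields several measurable properties of a given image, which is the sense in which one predicate can produce different invariants on different data.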
what about shape in general I mean
symmetry is a fascinating notion but you
know I'm talking about digit I would
like to concentrate on all I would like
to know predicates for digit recognition
yes but symmetry is not enough for digit
recognition right it is not necessarily
for digit recognition it helps to create
invariant which you can use
when you will have examples for
digit recognition you have regular problem
of digit recognition you have
examples of the first class second class
plus you know that there exists
concept of symmetry and you apply when
you looking for decision rule you will
apply concept of symmetry of this level
of symmetry which you estimate from so
let's talk everything is come from
weak convergence
what is convergence what is weak
convergence what is strong convergence
so sorry I'm going to do this to you
what are we converging from and to you
converging you would like to have a
function the function which say
indicator function which indicate your
digit 5 for example a classification
task
let's talk only about classification so
classification means you will say
whether this is a 5 or not or say which
of the ten digits it is all right right
I would like to have these functions
then I have some examples
I can consider property of these
examples say symmetry and I can measure
a level of symmetry for every digit and
then I can take average from
my training data and I will consider
only functions of conditional
probability which I am looking for my
decision rule which applying to
digits will give me the same average as
I observe on training data so actually
this is different level of description
of what you want you want not just
show not one digit you show this
predicate show general property of all
digits which you have in mind if you
have in mind digit three it gives you
property of digit three and you select
as admissible set of function only
function which keeps this property you
will not consider other functions so you
immediately looking for smaller subset
of function that's what I mean by
admissible functions admissible
functions which is still a pretty
large for the number three it is pretty large
it's large but if you have one
predicate but according to theory there is
strong and weak convergence strong
convergence is convergence in functions
you're looking for the function you have one
function and you consider another
function and squared difference between them
should be small if you take difference
in any point make a square make an
integral and it should be small
that is convergence in function suppose
you have some function any function so I
would say I say that some function
converge to this function if integral
from squared difference between them is
small that's the definition of strong
convergence that definition of two
functions integral of the difference is small
it is convergence in functions yeah but
you have different convergence in
functionals you take any function you
take some function phi and take inner
product of this function with function f
0 function which you want to find and
that gives you some value so you say
that set of functions converge in inner
product to this function if this value
of inner product converge to value for f 0
that is for one phi but weak convergence
requires that it converge for any
function of Hilbert space if it converge
for any function of Hilbert space then
you will say that this is weak
convergence you can think that when you
take integral that is property
integral property of function for
example if you will take sine or cosine
it is coefficient of say Fourier
expansion so if it converge for all
coefficients of Fourier expansion so under
some condition it converge to
function you're looking for but weak
convergence means any property converges
not pointwise but integral property of
function
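
For readers who prefer symbols, the two notions described above can be written as follows. The notation is an assumption added here for illustration: $f_\ell$ stands for the sequence of candidate functions, $f_0$ for the function being sought (the "f 0" of the conversation), and $\varphi$ for an arbitrary test function from the Hilbert space $L_2$.

```latex
% Strong convergence (convergence in functions):
\lim_{\ell \to \infty} \int \bigl( f_\ell(x) - f_0(x) \bigr)^2 \, dx = 0

% Weak convergence (convergence in functionals), required for every \varphi \in L_2:
\lim_{\ell \to \infty} \int \varphi(x)\, f_\ell(x) \, dx
  = \int \varphi(x)\, f_0(x) \, dx
  \qquad \forall\, \varphi \in L_2

% Example of an "integral property": with \varphi(x) = \sin(kx) or \cos(kx),
% the inner product is (up to normalization) a Fourier coefficient of the function.
```

Strong convergence implies weak convergence by the Cauchy-Schwarz inequality, while the converse does not hold in general. In the discussion, each fixed $\varphi$ plays the role of one predicate, so choosing predicates amounts to choosing which integral properties must be matched.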
so weak convergence means integral
property of functions when I talking
about predicate I would like to
formulate which integral properties I
would like to have for convergence so
and if I will take one predicate
predicate is function which I measure
property if I will use one predicate and
say I will consider only function which
give me the same value as this
predicate I selecting set of functions
from functions which is admissible in
the sense that function which I am
looking for is in this set of functions
because I checking in training data it
gives the same yes it always has to be
connected to the training data in terms
of yeah but but property you can know
independent on training data and this
guy Propp yeah so there is formal
property 31 property and you mean
a Russian fairy tale all right but a
Russian fairy tale is not so interesting
more interesting that people apply this
to movies to theater to
different things and the same works they are
universal well so I would argue that
there's a little bit of a difference
between the kind of things that were
applied to which are essentially stories
and digit recognition it was the same
story you're saying digits there's a
story within the digit yeah so but my my
point is why I hope that it possible to
beat record using not 60,000 but a
hundred times less because since that
you will give predicates and you will
select
decision not from wide set of functions
but from set of function which keeps this
predicate but predicate is not related
just to digit recognition right so like
in blotter space do you think it's
possible to automatically discover the
predicates this so you basically said
that the essence of intelligence is the
discovery of good predicates yeah now
the natural question is you know that's
what Einstein was good at doing in
physics can we make machines do these
kinds of discovery of good predicates or
is this ultimately a human endeavor yes
I don't know I don't think that machine
can do because according to theory about
weak convergence any function from
Hilbert space can be predicate so you
have infinite number of predicates
and before you don't know which
predicate is good and which not but whatever
Propp show and why people call it
breakthrough that there is not too many
predicates which cover most of situation
happens in the world so there's a sea of
predicates and most of the only a small
amount are useful for the kinds of
things that happen in the world I think
that I would say only small part of
predicates very useful useful all of
them only very few are what we should
let's call them good predicates very
good predicates very good predicates so
can we linger on it what's your
intuition why is it hard for a machine
to discover good predicates I even in my
talk described how
to find new predicate I'm not sure
that it is very good what did you
propose in your talk no in my talk I gave
example for diabetes they belong m1 when
we achieve some percent so then we're
looking for area where some sort of
predicate which I formulate does not
keep invariant so if it doesn't keep I
retrain my data I select only function
which keeps this invariant and when I
did it I improve my performance I can
looking for this predicate I know
technically how to do that and you can
of course do it using machine but I am
not sure that it will construct the
smartest predicate but this is the allow
me linger on it because that's the
essence that's the challenge that is
artificial that's that's the human level
intelligence that we seek is the
discovery of these good predicates
you've talked about deep learning as a
way to the predicates they use and the
functions are mediocre so you can find
better ones let's talk about deep
learning sure let's do it I know only
Yann LeCun's convolutional network and
what else
I don't know and it's very simple
convolution there's not much else to know
right yes I can do it like that with
this one predicate it is convolution is
a single predicate it's single it's it's
single predicate yes because you know
exactly you take the derivative for
translation and predicate this should
be kept so that's a single predicate but
humans discovered that one or at least no
that is the risk not too many
predicates and that is big story
because Yann did it 25 years ago and
nothing so clear was added to deep
network and then I don't understand why
you should talk about deep network
instead of talking about piecewise
linear functions which keeps this
predicate well you know a counter
argument is that maybe the amount of
predicates necessary to solve general
intelligence say in space of images
doing efficient recognition of
handwritten digits is very small and so
we shouldn't be so obsessed about
finding we'll find other good predicates
like convolution for example you know
there's there has been other
advancements like if you look at the
work with attention
there's attentional mechanisms in
especially used in natural language
focusing the the network's ability to to
learn at which part of the input to look
at the thing is there's other things
besides predicates that are important
for the actual engineering mechanism of
showing how much you can really do given
such these predicates I I mean that's
essentially the work of deep learning is
constructing architectures that are able
to be given the training data to be able
to converge towards a function that can
approximate that can generalize well
this is an engineering problem oh yeah I
understand but let's talk not on
emotional level but on a mathematical
level you have set of piecewise linear
functions it is all possible neural
networks it's just piecewise linear
functions with many many pieces large
large number to specify exactly but very
large very large almost but this is
still much simpler than say
solution in reproducing kernel Hilbert
space which is a Hilbert set of functions
what's Hilbert space it's space with
infinite number of coordinates a
function Fourier expansion something so it's
much richer so and when I talk about
closed form solution I am talking
about this set of function not piecewise
linear set which is particular case
it's small so the neural networks is a
small part of the space you're talking about
it is a small say a small
set of functions let me take that
but it is fine which is fine I don't
want to discuss a small or big we take
one so you have some set of functions so
now when you're trying to create
architecture you would like to create
admissible set of function with all
your tricks to use not all functions but
some subset of the set of functions say
when you introducing convolutional net
it is way to make this subset useful for
you but from my point of view
convolution it is something you want
to keep some invariants say translation
invariance but now if you understand
this and you cannot explain on the level
of ideas what neural network does you
should agree that it is much better to
have a set of functions and say this
set of functions should be admissible it
must keep this invariant this invariant and
that invariant you know that as soon as you
incorporate new invariant set of
function becomes smaller and smaller and
smaller but all the invariants are
specified by you the human yeah
but I hope that there is a standard
predicate like Propp show that's what
I want to find for digit
recognition if we start it is
completely new area what is intelligence
about on the level
starting from Plato's idea what is
world of ideas so and I believe that is
not too many yeah but you know it is
amusing that mathematicians doing
something in neural network in
general function but people from
literature from art they use it all the
time
that's right invariants say it is
great how people describe music we
should learn from that and something on
this level but so why Vladimir Propp
who was just theoretician who studied
theoretical literature he found that you
know let me throw that right back at you
because there's a little bit of a that's
less mathematical and more emotional
philosophical Vladimir Propp I mean he
wasn't doing math no and you just said
an another emotional statement which is
you believe that this Plato world of
ideas is small I hope I hope do you do
what's your intuition no if we can
linger on it you know it is not just
small or big I know
exactly then when I introducing some
predicates I decrease set of functions
but my goal to decrease set of function
by as much as
possible
good predicate which does this
then I should choose next predicate
which decrease set as much as
possible so set of
good predicate it is such that they decrease
this amount of admissible function so if
each good predicate significantly
reduces the set of admissible functions
then there naturally should not be
that many good predicates no but but
if you reduce very well the VC dimension
of the function of admissible set of
function is small and you need not too
much training data to do well and VC
dimension by the way is a measure of
capacity of the set of function right
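
The way each predicate shrinks the admissible set can be seen in a toy setting. The sketch below is an illustration under stated assumptions, not the method from the lecture: the candidate set is a small finite family of one-dimensional threshold rules, the six training points are invented, and the two predicates (the constant function 1 and the identity function x) are hypothetical choices. The only idea taken from the conversation is the invariant itself: keep a rule only if the average of predicate times rule output over the training inputs equals the average of predicate times observed label.

```python
from typing import Callable, List, Tuple

Rule = Tuple[float, bool]  # (threshold, fires_above): hypothetical rule encoding


def apply_rule(rule: Rule, x: float) -> float:
    """Return 1.0 if the rule fires on x, else 0.0."""
    threshold, fires_above = rule
    return 1.0 if (x > threshold) == fires_above else 0.0


def predicate_average(psi: Callable[[float], float],
                      values: List[float], xs: List[float]) -> float:
    """Average of psi(x_i) * value_i over the training inputs."""
    return sum(psi(x) * v for x, v in zip(xs, values)) / len(xs)


def keep_admissible(rules: List[Rule], psi: Callable[[float], float],
                    xs: List[float], ys: List[float],
                    tol: float = 1e-9) -> List[Rule]:
    """Keep only rules whose predicate average matches the one seen on the labels."""
    target = predicate_average(psi, ys, xs)
    return [
        r for r in rules
        if abs(predicate_average(psi, [apply_rule(r, x) for x in xs], xs) - target) <= tol
    ]


# Invented toy training set: label is 1 for the three largest inputs.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

# Ten candidate rules: five thresholds, each firing above or below.
all_rules: List[Rule] = [(t, above) for t in (0.5, 1.5, 2.5, 3.5, 4.5)
                         for above in (True, False)]

after_one = keep_admissible(all_rules, lambda x: 1.0, xs, ys)  # match class frequency
after_two = keep_admissible(after_one, lambda x: x, xs, ys)    # match first moment

print(len(all_rules), len(after_one), len(after_two))  # 10 2 1
print(after_two)                                       # [(2.5, True)]
```

In this toy case two predicates are enough to cut ten candidates down to one. A predicate that every candidate already satisfies would remove nothing, which matches the later remark that some predicates are good and some are not.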
roughly speaking how many function in
this set so you're decreasing decreasing
and it makes easy for you to find
function you're looking for that the
most important part to create good
admissible set of functions and
probably there are many ways but the
good predicates such that can do
that so that for this duck you
should know a little bit about duck
because what are the what is the three
fundamental laws of ducks looks like a
duck swims like a duck and quacks like a duck
you should know something about ducks but
not necessarily looks like say horse
it's also good it's nice it generalizes yes
from the duck look like horse and make
sound like horse and something and run
like horse and moves like horse
it is general predicate
that is applied to duck but for duck
you can say play chess like duck you
cannot say play chess why not so you're
saying you can but that would not be
a good no you do not reduce a lot of you
not do yeah yeah you would not reduce the set of
function so you get the story is formal
story it is mathematical story is that
you can use any function you want as a
predicate but some of them are good
some of them are not because some of
them reduce a lot of functions from
admissible set and some of them not but the
question is I'll probably keep asking
this question but how do we find such
predicates what's your intuition for
handwritten digit recognition how do
we find the answer to your challenge
yeah yeah I understand it's like that I
understand what what what defined what
it means a new predicate yeah like
guy who understand music can say these
words which he describe when he
listen to music he understand music he
use not too many different or you can do
like Propp you can make collection what
you're talking about music about this
about that it's not too many
different situation he described because
we mentioned Vladimir Propp a bunch let me
just mention there's a sequence of 31
structural notions that are common in
stories and I think you called units
units and I think they resonate I mean
it starts just to give an example
absentation a member of the hero's
community or family leaves the security
of the home environment then it goes to
the interdiction a forbidding edict or
command is passed upon the hero don't go
there don't do this the hero is warned
against some action then step three
violation of interdiction
you know break the rules break out on
your own then reconnaissance the villain
makes an effort to attain knowledge
needed to fulfill their plot so on it
goes on like this ends ends in a wedding
number 31
happily ever after no he just gave
description of all situation he
understands this world of folk tales yeah
not folk tales but
stories and these stories not in just
folk tales the stories in detective
serials as well and probably in our
lives we probably live but is this znz
is a
they're all set this predicate is good
for different situation from movie from
what for movie for theater by the way
there's also criticism right there's an
other way to interpret narratives from
Claude Lévi-Strauss
I am NOT in this business and I know
it's theoretical literature but looking
in her eyes it's always the the
philosophy - yeah yeah but at least
there is units it's not too many units
that can describe but this guy probably
gives another units or another way
exactly another set of units
another set of predicates it does not
matter how but they exist
probably my my question is whether given
those units whether without our human
brains to interpret these units they
would still hold as much power as they
have meaning are those units enough when
we give them to the alien species let me
ask you do you understand digit
images
no I don't know no or when you
can recognize this digit images it means that you
understand you understand characters you
understand no no no no I I it's it's the
imitation versus understanding question
because I don't understand the mechanism
by which I don't know no I'm not talking
about I'm talking about predicates
you understand that it involves symmetry
maybe structure maybe something else I
cannot formulate I just was able to find
symmetries like negative symmetries
that's really good so this is a good
line I feel like I understand the basic
elements of what makes a good hand
recognition system my own like symmetry
connects with me it seems like that's a
very powerful predicate my question is
is there a lot more going on that we're
not able to introspect maybe I need to
be able to understand a huge amount in
the world of ideas
thousands of predicates millions of
predicates in order to do hand
recognition I don't think so
say you're you know both your hope and
your intuition nicely clean enough
you're using digits you're using
examples as well theory says that if you
will use all possible functions from
Hilbert space all possible predicates
you don't need training data you just
will have admissible set of functions
which contain one function yes so the
trade-off is when you're not using all
predicates you're only using a few good
predicates you need to have some training
data yes because the more the more
good predicates you have the less
training data exactly that is
intelligent but still okay I'm gonna
keep asking the same dumb question
handwritten recognition to solve the
challenge you kind of propose a
challenge that says we should be able to
get state of the art MNIST error rates
by using very few sixty maybe fewer
examples per digit what kind of predicates
do you think it is the challenge so
people who will solve this problem that
will answer your answer do you think
they'll be able to answer it in a human
explainable way no they just need to
write function that's it but so can that
function be written I guess by an
automated reasoning system whether we're
talking about a neural network learning
a particular function or another
mechanism no no I'm not against neural
network I am against admissible set of
function which creates neural network
you did it by hand you don't you don't
do it by invariance by predicate vital
by by reason but neural networks can then
do the reverse step of helping you
find a function just as the task of a
neural network is is to find a disentangled
representation for example what they
call is just define that one predicate
function as really captures some kind of
essence one not the entire essence but
one very useful essence of this
particular visual space do you think
that's possible like um listen I'm
grasping hoping there's an automated way
to find good predicates right so the
question is what are the mechanisms of
finding good predicates ideas they you
think we should pursue a younga
restlessly
I gave example so find situation where
predicates which you are suggesting don't
create invariant it's like in physics
first find situation where existing
theory cannot just explain it
find situation where the existing theory
cannot explain this to see finding
contradictions final contradiction and
then remove this contradiction but in my
case
what means contradiction you find
function which if you will use this
function you do not keep invariants
this is really the process of
discovering contradictions yeah it is
like in physics find situation where you
have contradiction for one of the
property for one of the predicate then
include this predicate making
invariants and solve again this
problem now you don't have contradiction
but it is
not the best very probably I don't know
- looking for predicates that's just one
way okay that mono it was brute force
way in the brute force way what about
the ideas of some what big umbrella term
of symbolic AI these what in 80s with
expert systems sort of logic reasoning
based systems is there hope there to
find some through sort of deductive
reasoning to find good predicates
I don't think so I think that just
logic is not enough it's kind of a
compelling notion now you know that when
smart people sit in a room and reason
through things it seems compelling and
making our machines do the same is also
compelling so everything is very simple
when you have infinite number of
predicates you can choose the function
you want you have invariance and you can
choose the function you want but you
have to have not too many
invariants to solve the problem so you
have from infinite number of functions to
select finite number and hopefully small
finite number of functions which is good
enough to extract small set of
admissible functions
so they've you be admissible it's for so
because every function just decreased
set to function and leaving admissible
but it will be small but why do you
think logic based systems don't can't
help intuition not because you you
should know
you should know life this guy like Propp
he knows something and he tried to put
in invariant his understanding that's
the human yeah see you're putting too
much value in to Vladimir Propp
knowing something no it is my decision
what means you know life what elements
you know common sense
no no you know something common sense it
is some rules you think so
common sense is simply rules common
sense is every its mortality it's no
it's it's fear of death it's love it's
spirituality it's happiness and sadness
all of it is tied up into understanding
gravity which is what we think of as
common sense I don't want to discuss
so broad I want to discuss understanding
digit recognition
you never bring up love and death you
bring it back to digit recognition okay
no you know it was durable because there
is a challenge yeah which I she have to
solve it before you have a student
concentrate on this work I do suggest
some sector so you mean handwritten
recognition yeah it's a beautifully
simple elegant yet I think that I know
invariance which will solve this do I
sing some meanness
but it is not universal it is maybe I
want some universal invariants which are
good not only for digit recognition for
image understanding so let me ask how
hard do you think is 2d image
understanding so if we can kind of
Intuit handwritten recognition how big
of a step leap journey is it from that
if I gave you good
I solved your challenge for handwritten
recognition how long would my journey
then be from that to understanding more
general natural images immediately
understandeth as soon as you make a
record because it is not for free as
soon as you will create several
invariance which will help you to get
the same performance that the best
neural net did using hundred ten maybe
more than hundred times less examples
you have to have something smart to do
that and you're saying that represent
Mario
it is predicate because you should put
some idea how to do that but okay let me
just pause maybe it's a turning point
maybe not but handwritten recognition
feels like a 2d two-dimensional problem
and it seems like how much complicated
is the fact that most images are
projection of a three-dimensional world
onto a 2d plane it feels like for a
three-dimensional world who still we
need to start understanding common sense
in order to understand an image it's no
longer visual shape and symmetry it's
having to start to understand concepts
of it understand life yeah yes yes
you're you're you're talking about sets that
are different invariants different
predicates yeah and potentially much larger
number you know might be but let's start
from simple
well yeah but you said that you know I I
cannot think about things which I
don't understand this I understand but
I'm sure that I don't understand
everything there yeah as
Einstein said do as simple as possible
but not simpler and that is exact case
with handwritten recognition yeah but
no that's the difference between you and
I
I welcome and enjoy thinking about
things I completely don't understand
because to me it's a natural extension
without having solved handwritten
recognition to wonder how how difficult
is the the the next step of
understanding 2d 3d images because
ultimately while the science of
intelligence is fascinating it's also
fascinating to see how that maps to the
engineering of intelligence and
recognizing handwritten digits is not
doesn't help you it might it may not
help you with the problem of general
intelligence we don't know it'll help
you a little bit unclear it's unclear
yeah but I would like to make a remark
yes I start not from very primitive
problem like a challenge problem I start
with very general problem with Plato
so you understand and it comes from
Plato to digit recognition so so you
basically took Plato and the world of
forms and ideas and mapped and
projecting into the clearest simplest
formulation of that big world and you
know I will say that I did not
understand Plato until recently and
until I consider the convergence and
then predicate and you know this is what
Plato told so linger on that like why
how do you think about this world of
ideas and world of things in Plato
no it is metaphor it is it's a
metaphor for sure yeah compelling it's
a poetic and a beautiful for what can
you but it is the way of you you should
try to understand have a talk I guess
since the world so from my point of view
it is very clear but it is line all the
time people looking for that
say Plato then Hegel whatever
reasonable it exists whatever exist it
is reasonable I don't know what he have
in mind reasonable right there's
philosophers again no no no no it is it
is next step of Wigner that mathematics
understand something good in reality it
is the same Plato line and then it
comes suddenly so Vladimir Propp look 31
ideas 31 units describe everything
there's abstractions ideas that
represent our world and we should always
try to reach into that yeah but what you
should make a projection on reality but
understanding is it is abstract ideas
you have in your mind
several abstract ideas which you can
apply to reality and reality in this
case sir if you look at machine learning
as days example did data okay let me let
me put you put this on you because I'm
an emotional creature I'm not a
mathematical creature like you I find
compelling the idea forget this the
space the sea of functions there's also
a sea of data in the world and I find
compelling that there might be like you
said teacher small examples of data that
are most useful for discovering good
whether it's predicates or good
functions that the selection of data may
be a powerful journey a useful mekin you
know coming up with a mechanism for
selecting good data might be useful to
do you find this idea of finding the
right data set interesting at all or do
you kind of take the data set as a given
I think that it is yeah you know my
scheme is very simple you have huge set
of functions
if you will apply and you have
not too many data if you pick up function
which describes this data you will do
not very well you know randomly yeah
you'll overfit yeah
it will be overfitting so you
should decrease set of function from
which you picking up one so you should
go somehow to admissible set of
function now this what about these
conversions so but from another point of
view to to make admissible set of
function you need just a DG just
function which you will take in inner
product which you will measure property
of your function and that is how it
works
no I get it I get understand that but do
you that the reality is let's let's look
this car let's think about examples you
have huge set of function if you have
several examples if you just trying to
keep the take function which satisfies
these examples you still will overfit you
need decrease you need admissible set
of function yeah absolutely but what say
you have more data than functions so
sort of consider though I mean maybe not
more data than functions because that's
unfortunately impossible but what I was
trying to be poetic for a second I mean
you have a huge amount of data a huge
amount of examples but the function
didn't even get bigger
I understand there's always there's a
long ago well full human space I catch
it but okay
but you don't you don't find the world
of data to be an interesting
optimization space like the the
optimization should be in a space of
functions in creating admissible set of
functions of course no you know even from
the classical VC theory from structural
risk minimization you should or you
should organize function in the way that
they will be useful for you right and
that is the way you're thinking about
useful is you're given a small small
small set of functions which contain
function by looking quo yep as looking
for based on the empirical set of small
examples yeah but that is another story
I don't touch it because I I believe I
believe that this small examples it's
not too small say sixty per class law of
large numbers works
I don't need uniform law the story is
that in statistics there are two laws law
of large numbers and uniform law of large
numbers so I want to be in situation
where I use law of large numbers but
not uniform law of large numbers right
so 60 is large enough I hope
no it still need some evaluation some
bounds so that's what idea is the
following that if you trust that say
this average gives you something close
to expectations so he you can talk about
that about this predicate and that is
basis of human intelligence right good
predicates is the discovery of good
predicate is the basis of it is
discovery of you of your understanding
world of your methodology or this type
of understanding world because you have
several function which you will apply to
reality
okay can you say that again so you're
you have several functions predicates but
they're abstract yes
then you will apply them to reality to
your data and you will create in this
very predicate which is useful for your
task but predicates are not related
specifically to your task to this
task it is abstract functions which
being applied apply to many tasks
that you might be interested it might be
many tasks freedom or different tasks
well they should be many tasks yeah
just like in Propp case it was for
fairy tales but such happened
everywhere okay so we talked about
images a little bit can we talk about
Noam Chomsky for a second verify I don't
know him personally what not personally
I don't know his ideas these ideas well
let me just say do you think language
human language is essential to
expressing ideas as Noam Chomsky
believed so like languages at the core
of our formation of predicates the human
language language and all the story of
language is very complicated I don't
understand this and I am NOT I thought
about nobody I'm not ready to work on
that because it's so huge it is not for
me and I believe not for our century
it's a 21st century not for 21st century
so you should learn something a lot of
stuff from simple tasks like digit
recognition so you think you think
digit recognition 2d image what
how would you
more abstractly define a digit
recognition it's 2d image symbol
recognition
essentially I mean I'd like I'm trying
to get a sense sort of thinking about it
now having worked with MNIST forever
how could how small of a subset is this
of the general vision recognition
problem and the general intelligence
problem is it yeah
is it a giant subset is it not and how
far away is language you know let me
refer to Einstein take the simplest
problem as simple as possible but not
simpler and this is challenge is simple
problem but it's simple by idea but
not simple to to get it when you will do
this you will find some predicate
without you oh yeah I mean with I what
Einstein you can you you look at general
relativity but that doesn't help you
with quantum mechanics that's another
story you don't have any universal
instrument yes so I'm trying to wonder
if which space we're in whether the
whether handwritten recognition is like
general relativity and then language is
like quantum mechanics so you're still
going to have to do a lot of mess to to
universalize it but I'm trying to see
one so what's your intuition why
handwritten recognition is easier than
language just I think a lot of people
would agree with that but if you could
elucidate sort of the the intuition of
why I don't know no I don't think in
this direction I just think in
direction that this problem which if
we solve it well we will create
some abstract understanding of images
maybe not all images I would like to
talk to guys who doing real images in
Columbia University what kind of images
unreal it's a real image really yeah
what there is real predicate what
can be predicate I think still symmetry will
play role in real life images in any
real life images 2d images let's talk
about 2d images because that's what
we know a neural network was created for
2d images so the people I know in vision
science for example the people study
human vision you know that they usually
go to the world of symbols and like
handwritten recognition but not really
it's other kinds of symbols to study our
visual perception system as far as I
know not much predicate type of thinking
is understood about our vision system so
do not assume conscious direction they
don't yeah they but how do you even
begin to think in that direction that's
a sorry I'd like to discuss with them
yeah because if we will be able to show
that it is what working and surely it's
caused him it's not so bad so the the
unfortunate so if we compare the
language language has like letters
finite set of letters and a finite set
of ways you can put together those
letters so it feels more amenable to
kind of analysis with natural images
there is so many pixels no no no letter
language is much much more complicated
it's involved a lot of different stuff
it's not just understanding of very
simple class of tasks I would like to
see lists of tasks where language
involved yes so there's a there's a lot
of nice benchmarks now on natural
language processing
from the very trivial like understanding
the elements of a sentence to question
answering it more much more complicated
where you talk about open domain
dialogue the natural question is whether
handwriting recognition is really the
first step
yeah of understanding visual information
all right but not but but even our
records shows that we go in the wrong
direction because we need sixty
thousand digits so even this first step
so forget about talking about the full
journey this first step should be taken
in the right or wrong direction because
60,000 is unacceptable no I'm saying
it should be taken in in the right
direction or the 60,000 is not
acceptable because you can talk great
off percent of error and hopefully the
step from doing hand recognition using
very few examples the step towards what
babies do when they crawl and understand
that I know babies will do from very
small examples yeah you will find
principles that will show the difference
from what we using it now and so let's
call it's more or less clear that means
that you here you'll use weak convergence
not just strong convergence do you think
these principles are will naturally be
human interpretable oh yeah so like when
we will be able to explain them and have
a nice presentation to show what those
principles are or are they very going to
be ver

Summary

The following is a comprehensive, structured summary of the interview transcript with Vladimir Vapnik.

---

# Exclusive Interview: Vladimir Vapnik, Predicates, and the Future of Statistical Learning Theory

### Executive Summary
This video covers an in-depth discussion between Lex Fridman and Vladimir Vapnik, one of the founding fathers of Support Vector Machines (SVM) and statistical learning theory (VC theory). The main topic is the fundamental difference between *engineering* (focused on imitation) and *science* (focused on understanding) in artificial intelligence. Vapnik argues that true intelligence does not require large amounts of data. It requires the discovery of "predicates", or pure ideas, that create invariants and let a machine learn far more efficiently, much as humans do.

### Key Takeaways
*   **Science vs. engineering:** Engineering aims to make machines imitate human behavior for practical use, whereas science aims to understand the essence of intelligence itself.
*   **Plato's philosophy and predicates:** Intelligence is rooted in the "world of ideas" (Plato), which consists of a small number of "predicates" (pure ideas) that can explain a wide range of phenomena in the real world.
*   **Data efficiency:** With the right predicates (such as symmetry or structure), a machine should be able to learn from far less data (for example, 60 examples per class) than modern *deep learning* approaches, which the interview says rely on 60,000 digits.
*   **Uniform convergence:** The most important mathematical concept in machine learning is uniform convergence, which guarantees convergence simultaneously over a whole set of functions rather than for a single function.
*   **Privileged information:** Using additional information (metadata) during training, such as describing digit images in the language of poetry, can significantly improve an algorithm's accuracy.

---

### Detailed Breakdown

#### 1. Introduction: Background and Context
Vladimir Vapnik is a co-inventor of support vector machines (SVM), support vector clustering (SVC), and VC theory. Born in the Soviet Union, he is now a professor at Columbia University and previously worked at AT&T, NEC Labs, and Facebook AI Research. The interview was recorded after Vapnik's MIT lecture titled *"Complete Statistical Theory of Learning"*. Vapnik is known as a deeply foundational scientist who is critical of AI trends that rely on data alone without deep theoretical understanding.

#### 2. Defining Intelligence: Imitation vs. Understanding
*   **Engineering (imitation):** This approach, often associated with Alan Turing as the "father of AI", focuses on building devices that behave *like* humans. The goal is utility, or practical usefulness, regardless of whether the machine "understands" what it is doing.
*   **Science (understanding):** Vapnik stresses the need to understand intelligence as a philosophical category. He refers to Plato's "vault of ideas". Intelligence is the combination of these pure ideas to create invariants (constancy) amid the diversity of data.

#### 3. The Concepts of Predicate and Invariant
*   **What is a predicate?** A predicate is a function, or pure idea, that captures an essential characteristic of an object. Examples in image recognition are symmetry (vertical, horizontal, diagonal) and "structure" (density in a particular region).
*   **Vladimir Propp and folk tales:** Vapnik offers an analogy from Vladimir Propp, who analyzed Russian folk tales and found only 31 "predicates", or structural units (such as "the hero leaves home" or "a violation occurs"), that can explain thousands of stories. This shows that the world of ideas is small while the world of things (reality) is large.
*   **World of ideas vs. world of things:** In line with Plato's theory, the world of ideas (predicates) is small and limited, while the world of things (the data we see) is only a shadow or projection of those ideas. The task of intelligence is to infer those few ideas from a vast sea of data.
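
The predicate idea in this section can be made concrete with a small sketch. The Python below is an illustration under stated assumptions only: the scoring formula in `vertical_symmetry`, the helper `invariant_gap`, and the toy images are hypothetical and do not come from the interview. The invariant it checks (the predicate-weighted average of a model's outputs should match the predicate-weighted average of the labels) is one possible reading of what it means for a predicate to "create an invariant".

```python
def vertical_symmetry(image):
    """Hypothetical predicate: 1.0 for a left-right symmetric image, 0.0 for none.

    `image` is a list of rows, each row a list of non-negative pixel values.
    """
    diff = 0.0
    total = 0.0
    for row in image:
        mirrored = row[::-1]
        diff += sum(abs(a - b) for a, b in zip(row, mirrored))
        total += 2 * sum(abs(a) for a in row)
    # An all-zero image is treated as perfectly symmetric.
    return 1.0 if total == 0 else 1.0 - diff / total


def invariant_gap(predicate, images, labels, outputs):
    """Absolute gap between predicate-weighted outputs and predicate-weighted labels.

    A candidate function whose gap is far from zero violates the invariant
    and would be excluded from the admissible set (assumed interpretation).
    """
    psi = [predicate(x) for x in images]
    n = len(images)
    lhs = sum(p * f for p, f in zip(psi, outputs)) / n
    rhs = sum(p * y for p, y in zip(psi, labels)) / n
    return abs(lhs - rhs)


# Toy data: one symmetric image (label 1) and one asymmetric image (label 0).
symmetric = [[1, 0, 1], [0, 1, 0]]
asymmetric = [[1, 0], [1, 0]]
images = [symmetric, asymmetric]
labels = [1, 0]

gap_good = invariant_gap(vertical_symmetry, images, labels, [1, 0])
gap_bad = invariant_gap(vertical_symmetry, images, labels, [0, 0])
```

A function that reproduces the labels keeps the invariant (`gap_good` is zero), while one that ignores the symmetric example does not. Each additional predicate adds one such constraint and so shrinks the set of functions that remain admissible.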

#### 4. Critique of Deep Learning and the Function-Based Approach
*   **Limitations of neural networks:** Vapnik is not against neural networks, but he objects to how they define the "admissible set of functions". Today's deep learning uses piecewise-linear functions and convolution (only one kind of predicate, translation invariance).
*   **VC dimension:** Good predicates reduce the VC dimension (the capacity of the set of functions), which means less training data is needed to reach high accuracy.
*   **The minimal-data challenge:** Vapnik challenges the AI community to reach state-of-the-art error rates in digit recognition with only 60 samples per class, something that is impossible without smart predicates.
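
To see why a smaller VC dimension translates into fewer required examples, here is a short Python sketch. It evaluates the confidence term of a common textbook form of the VC generalization bound; constants differ between presentations, so the numbers are illustrative only. The function name `vc_confidence_term` and the VC dimensions 10 and 100 are assumptions made for this example and are not values from the interview.

```python
import math


def vc_confidence_term(h, n, eta=0.05):
    """Confidence term of a textbook VC bound (assumed form).

    With probability at least 1 - eta, the expected risk is bounded by
    the empirical risk plus
        sqrt((h * (ln(2n / h) + 1) - ln(eta / 4)) / n)
    where h is the VC dimension and n is the number of training examples.
    """
    if not 0 < h <= n:
        raise ValueError("this form of the bound assumes 0 < h <= n")
    if not 0 < eta < 1:
        raise ValueError("eta must lie strictly between 0 and 1")
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


# 600 examples corresponds to 60 per class over ten digit classes.
small_capacity = vc_confidence_term(h=10, n=600)
large_capacity = vc_confidence_term(h=100, n=600)
large_capacity_more_data = vc_confidence_term(h=100, n=60000)
```

With the training set fixed at 600 examples, the term is much tighter for the lower-capacity set of functions. The high-capacity set needs far more data to reach a comparable value, which is the trade-off the challenge is meant to expose.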

#### 5. Mathematical Theory: Convergence and Hilbert Space
*   **Uniform convergence:** Vapnik calls this the most beautiful idea in learning theory. The law of large numbers ordinarily applies to a single function, but learning requires convergence for a whole set of functions at once.
*   **Strong vs. weak convergence:** Vapnik explains the difference between strong convergence (in the space of functions) and weak convergence (of functionals, that is, inner products). He argues that with the right instruments (such as an RBF kernel in Hilbert space) we can obtain a simple *closed-form* solution without messy heuristics.
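
The two laws contrasted in this section can be written out explicitly. The notation below (a sample $x_1, \dots, x_\ell$, a single function $f$, and a set of functions $\mathcal{F}$) is introduced here for illustration and does not appear in the interview.

```latex
% Law of large numbers: one fixed function f
\frac{1}{\ell}\sum_{i=1}^{\ell} f(x_i)
  \;\xrightarrow[\ell \to \infty]{P}\; \mathbb{E}\, f(x)

% Uniform law of large numbers: all functions in the set F simultaneously
\sup_{f \in \mathcal{F}}
  \left| \frac{1}{\ell}\sum_{i=1}^{\ell} f(x_i) - \mathbb{E}\, f(x) \right|
  \;\xrightarrow[\ell \to \infty]{P}\; 0
```

The second statement is the stronger one, because the worst-case deviation over the whole set must vanish. In the interview Vapnik says that with roughly 60 examples per class he wants to rely only on the first law, applied to a few predicates, and not on the uniform law.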

#### 6. Language, Vision, and Intuition
*   **The complexity of language:** Vapnik says human language is too complicated to be solved in this century. He prefers to focus on 2D symbol recognition (digits) as the "general relativity" of machine learning before moving on to its "quantum mechanics" (language).
*   **The role of intuition:** Finding good predicates requires intuition, much as a music critic articulates the essence of a piece by Bach or Chopin as abstract concepts. Machines today cannot yet find "good" predicates in an infinite space without the help of human intuition.

#### 7. Privileged Information and Art
*   **Privileged information:** Vapnik describes an experiment that used *privileged information*. He asked a professor of Russian literature to describe digit images using the vocabulary of Russian poetry. Combining the language of the images with the language of the poetic descriptions produced a more accurate algorithm.
*   **Art as a source of predicates:** Vapnik believes that art (music, literature) holds the key to predicates that can be used for visual pattern recognition. Turning art into abstract ideas is part of understanding intelligence.

#### 8. Philosophy of Life and Closing
*   **The meaning of life:** Referring to the science fiction writer Strugatsky, Vapnik says the purpose of life is the creative evolution of human society, distinguishing between "ordinary" people and "smart" people.
*   **The importance of literature:** He admires the ability of literary writers to understand human relationships and the "big blocks of life" (predicates of life). He observes that managers of large companies are often English literature graduates because they understand models of life, not just technique.
*   **Final advice:** Vapnik closes with an important piece of advice: **"Do not solve a more general problem as an intermediate step."**

### Conclusion and Closing Message
This interview makes the case that the future of artificial intelligence lies not in simply scaling up models or data, but in returning to theoretical and philosophical foundations. Vladimir Vapnik invites us to look for "predicates", simple yet powerful principles of invariance that make efficient learning possible. His final message, which stresses not solving an overly general problem when a specific solution is enough, is a sharp critique of the "one model for everything" trend that


file updated 2026-02-13 13:25:38 UTC