François Chollet: Limits of Deep Learning

François Chollet: Limits of Deep Learning | AI Podcast Clips

CycWAqivFu0 • 2019-10-10

Transcript preview

what do you think of the current limits
of deep learning if we look specifically
at these function approximators that
try to generalize from data
you've talked about local versus extreme
generalization you mentioned that neural
networks don't generalize well humans do
so there's this gap and you've also
mentioned that generalization extreme
generalization requires something like
reasoning to fill those gaps so how can
we start trying to build systems like
that all right yes so this is this is by
design right deep learning models are like
huge parametric models differentiable so
continuous that go from an input space
to an output space and they're trained
with gradient descent so they're
trained pretty much point by point they
are learning a continuous geometric
morphing from an input vector space
to an output vector space all right and
because this is done point by point a
deep neural network can only make sense
of points in experience space that are
very close to things that it has already
seen in the training data at best it can do
interpolation across points but that
means you know that means in order to
train your network you need a dense
sampling of the input cross output
space almost a point-by-point sampling
which can be very expensive if you're
dealing with complex real-world problems
like autonomous driving for instance or
robotics it's doable if you're
looking at the subset of the visual
space but even then it's still fairly
expensive you still need millions of
examples and it's only going to be able
to make sense of things that are very
close to what it has seen before and in
contrast to that well of course you have
human intelligence but even if you're
not looking at human intelligence you
can look at very simple rules algorithms
if you have a symbolic rule it can
actually apply to a very very large set
of inputs because it is abstract
it is not obtained by doing a
point-by-point mapping right for
instance if you try to learn a sorting
algorithm using a deep neural network
well you're very much limited to
learning point by point
what the sorted representation of this
specific list is like but instead you
could have a very simple sorting
algorithm written in a few lines maybe
it's just you know two nested loops and
it can process any list at all because
it is abstract because it is a set of
rules so deep learning is really like
point by point geometric morphings
trained with gradient descent
and meanwhile abstract rules can
generalize much better and I think the
future is really to combine the two so how
do we do you think combine the two how
do we combine good point by point
functions with programs which is what
symbolic AI type systems are at which
levels does the combination happen and you
know obviously we're jumping into the
realm of where there's no good answers
it's just kind of ideas and intuitions and
so on well if you look at the really
successful AI systems today I think they
are already hybrid systems that are
combining symbolic AI with deep
learning for instance successful robotics
systems are already mostly model-based
rule-based things like planning
algorithms and so on at the same time
they're using deep learning as
perception modules sometimes they're
using deep learning as a way to inject a
fuzzy intuition into a rule-based
process if you look at a system like
a self-driving car it's not just one big
end-to-end neural network you know that
wouldn't work at all precisely because
in order to train that you need a dense
sampling of experience space when it
comes to driving which is completely
unrealistic obviously
instead the self-driving car is mostly
symbolic you know it's software it's
programmed by hand it's mostly based on
explicit models in this case mostly 3d
models of the of the environment around
the car but it's interfacing with the
real world using deep learning modules
right so the deep learning there serves
as a way to convert the raw sensory
information to something usable by
symbolic systems okay well let's
linger on that a little more so dense
sampling from input to output you said
it's obviously very difficult is it
possible in the case of self-driving you mean
let's say self-driving right self-driving
for many people but let's not
even talk about self-driving let's talk
about steering so staying inside the
lane lane following yeah it's
definitely a problem you can solve with an
end-to-end deep learning model but that's
like one small subset hold on a second yeah I
don't like your jumping from the extreme
so easily because I disagree with you on
that I think well it's it's not obvious
to me that you can solve lane following
no it's not it's not obvious I
think it's doable I think in general you
know there is no hard limitations to
what you can learn with a deep neural
network as long as the search space
is rich enough is flexible enough
and as long as you have this dense
sampling of the input cross output space
the problem is that you know this dense
sampling could mean anything from 10,000
examples to like trillions and trillions
so that's that's my question so what's
your intuition and if you could just
give it a chance
and think what kind of problems can be
solved by getting huge amounts of data
and thereby creating a dense mapping so
let's think about natural language
dialogue the Turing test do you think
the Turing test can be solved with a
neural network alone well the Turing
test is all about tricking people into
believing they're talking to a human
and I don't think that's actually very difficult
because it's more about exploiting human
perception and not so much about
intelligence there's a big difference
between mimicking intelligent behavior and
actual intelligent behavior so okay
let's look at maybe the Alexa Prize
and so on the different formulations of
the natural language conversation that
are less about mimicking and more about
maintaining a fun conversation that
lasts for 20 minutes mm-hmm that's a
little less about mimicking and that's
more about I mean it's still mimicking
but it's more about being able to carry
forward a conversation with all the
tangents that happen in dialogue and so
on do you think that problem is
learnable with this kind of neural
network that does the point-to-point
mapping so I think it would be very very
challenging to do this with deep
learning I don't think it's out of the
question
either I wouldn't rule it out the space of
problems that can be solved with a large
neural network what's your sense about
the space of those problems so useful
problems for us in theory it's it's
infinite right you can solve any problem
in practice well deep learning is a great fit
for perception problems in general any
problem which is not naturally amenable to
explicit handcrafted rules or rules that
you can generate via exhaustive
search over some program space so
perception artificial intuition as long
as you have a sufficient training dataset
and that's the question I mean
perception there's interpretation and
understanding of the scene yeah which
seems to be outside the reach of current
perceptual systems so do you think
larger networks will be able to start to
understand the physics and the physics
of the scene the three-dimensional
structure and relationships of objects in
the scene and so on or really that's
where symbolic AI has to step in well it's
it's always possible
to solve these problems with deep
learning it's just extremely inefficient a
model would be an explicit rule-based
abstract model would be a
far better and more compressed
representation of physics than learning
just this mapping between in this
situation this thing happens if you
change the situation like slightly then
this other thing happens and so on do
you think it's possible to automatically
generate the programs that would require
that kind of reasoning or does it have
to so the way expert systems failed
there's so many facts about the world
had to be hand-coded in do you think it's possible
to learn those logical statements that
are true about the world and their
relationships do you think I mean that's
kind of what theorem proving at a basic
level is trying to do right yeah except
it's it's much harder to formulate
statements about the world compared to
formulating mathematical statements
statements about the world you know tend
to be subjective so can you can you
learn rule-based
models yes yes definitely that's the
this is the field of program synthesis
however today we just don't really know
how to do it so it's very much a graph
search or tree search problem and so we are
limited to you know the sort of
tree search and graph search
algorithms that we have today personally
I think genetic algorithms are very
promising so almost like genetic
programming genetic programming exactly
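
The closing point, that learning rule-based models is a search over a space of programs, can be illustrated with a minimal sketch. It uses plain exhaustive enumeration, which is a simpler stand-in for the tree search or genetic search mentioned in the conversation, and the four-operation DSL (`inc`, `double`, `square`, `neg`) is a hypothetical toy invented for this example.

```python
from itertools import product

# A tiny, hypothetical DSL of unary integer operations (an assumption for
# illustration, not something described in the conversation).
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}


def run(program, x):
    """Apply the named primitives to x from left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(examples, max_length=3):
    """Return the shortest program consistent with all (input, output) pairs.

    Programs are enumerated exhaustively, shortest first; None is returned
    if nothing up to max_length fits.
    """
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


examples = [(1, 4), (2, 6), (3, 8)]
program = synthesize(examples)
print(program)           # the discovered sequence of primitive names
print(run(program, 10))  # the same rule applied to an input never seen
```

The number of candidate programs grows exponentially with `max_length`, which is one way to see why the search itself is the hard part once the DSL is richer than this toy.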

Summary


# Dissecting the Limits of Deep Learning: Why Hybrid AI Is the Key to Intelligent Generalization

### Executive Summary
This video examines the limits of *deep learning* (DL) as a geometric approach that depends on very dense data sampling, and contrasts it with symbolic AI, which relies on abstract rules. Chollet explains that DL works by point-by-point interpolation, which makes it inefficient for problems that demand broad generalization without massive amounts of data. The way forward, in his view, is hybrid systems that combine DL's strength in perception with the logical reasoning of symbolic AI.

### Key Takeaways
*   **The nature of deep learning:** DL models are large, differentiable, continuous parametric models; they are trained with *gradient descent* to learn a point-by-point geometric mapping from an input space to an output space.
*   **The interpolation problem:** DL's main limitation is that it can only make sense of points close to its training data (interpolation), so it needs very dense and expensive *sampling*.
*   **The advantage of symbolic AI:** Unlike DL, rule-based or symbolic AI uses abstractions (such as a sorting algorithm) that apply to a very large set of inputs without any point-by-point mapping.
*   **Hybrid systems are the future:** Today's successful AI systems (such as robotics and self-driving cars) are already hybrids, with a model-based or rule-based core and DL serving as the perception module.
*   **The ideal role for DL:** Deep learning is most effective on *perception* problems where explicit rules are hard to write, acting as "artificial intuition" for symbolic systems.

### Detailed Breakdown

#### 1. Characteristics and Limits of Deep Learning
Chollet describes *deep learning* as a *function approximator* that learns a continuous geometric morphing from an input space to an output space, point by point. Because the model is continuous and trained with *gradient descent*, at best it interpolates. In other words, it can only make sensible predictions for new points that lie close to training data it has already seen.

#### 2. Dense Data Requirements vs. Symbolic Efficiency
Because it can only interpolate, DL needs a very dense *sampling* of the whole input-cross-output space. Chollet uses self-driving cars as the example: training a fully *end-to-end* neural network to drive would require an unrealistic amount of driving experience. By contrast, symbolic or rule-based AI uses programs with abstract structure (such as nested *loops*) that can handle an unbounded variety of inputs without ever seeing each possible data point explicitly.
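
The sorting contrast can be sketched as follows. The two nested loops are an ordinary bubble sort; the lookup table is a deliberate caricature of point-by-point learning (a real network interpolates between examples rather than memorizing them exactly), and both function names are hypothetical.

```python
def nested_loop_sort(values):
    """Sort a list with two nested loops (a plain bubble sort)."""
    result = list(values)
    n = len(result)
    for i in range(n):
        # After pass i, the last i + 1 positions already hold their final values.
        for j in range(n - 1 - i):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
    return result


# A caricature of point-by-point learning: only the examples seen so far.
memorized = {
    (3, 1, 2): [1, 2, 3],
    (2, 1): [1, 2],
}


def memorized_sort(values):
    """Return the stored answer, or None for a list never seen before."""
    return memorized.get(tuple(values))


seen = [3, 1, 2]
unseen = [9, -4, 7, 0, 7]

print(nested_loop_sort(seen), memorized_sort(seen))
print(nested_loop_sort(unseen), memorized_sort(unseen))
```

The rule works on any list of comparable values, including ones of a length it has never encountered, while the table has nothing to say about an input outside its examples.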

#### 3. How Hybrid Systems Are Built Today
Asked how the two approaches can be combined, Chollet points out that today's advanced AI systems are in fact already hybrid.
*   **Structure:** These systems are dominated by hand-written software, planning algorithms, and 3D models of the environment (rule-based and model-based components).
*   **Role of DL:** Deep learning is used as an add-on module, mainly for perception, to convert raw sensor data into something symbolic systems can use, or to inject fuzzy "intuition" into an otherwise rigid rule-based process.
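
The division of labor described above might look like the sketch below. Everything in it is hypothetical: `perceive` is a stub standing in for a trained perception network, and the thresholds, action names, and `SceneState` fields are invented for illustration rather than taken from any real driving stack.

```python
from dataclasses import dataclass


@dataclass
class SceneState:
    """Symbolic description of the scene that is handed to the planner."""
    obstacle_ahead: bool
    distance_m: float


def perceive(raw_depth_readings):
    """Stand-in for a learned perception module (hypothetical).

    A real system would run a trained network here; this stub just takes
    the nearest depth reading so the example stays runnable.
    """
    nearest = min(raw_depth_readings)
    return SceneState(obstacle_ahead=nearest < 30.0, distance_m=nearest)


def plan(state):
    """Hand-written, rule-based planner operating on the symbolic state."""
    if state.obstacle_ahead and state.distance_m < 10.0:
        return "brake"
    if state.obstacle_ahead:
        return "slow_down"
    return "keep_speed"


def drive_step(raw_depth_readings):
    """One control step: raw sensor data -> symbolic state -> rule-based action."""
    return plan(perceive(raw_depth_readings))


print(drive_step([55.0, 48.2, 61.7]))
print(drive_step([25.0, 40.0]))
print(drive_step([6.5, 12.0]))
```

The planner never touches raw sensor values, so the learned component could be swapped or retrained without changing the rules, as long as it keeps producing the same symbolic state.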

#### 4. Opportunities and Challenges for Pure Deep Learning
The conversation turns to whether pure DL can handle seemingly simple tasks such as lane following. Chollet considers this doable: in principle there is no hard limit on what a deep network can learn, provided the search space is rich enough and the sampling is dense enough. The catch is that "dense enough" could mean anything from ten thousand examples to trillions.
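
A rough piece of arithmetic shows how the same notion of density can span that range. The sketch assumes a uniform grid with ten sample points per input dimension, which is a simplification: real data rarely fills its input space uniformly, and the function name is hypothetical.

```python
def grid_samples(points_per_axis, dimensions):
    """Number of points in a uniform grid over a `dimensions`-dimensional cube."""
    return points_per_axis ** dimensions


# Same density per axis, very different totals as dimensions are added.
for dimensions in (1, 2, 4, 8, 12):
    print(f"{dimensions:>2} dimensions -> {grid_samples(10, dimensions):,} samples")
```

With ten points per axis, four dimensions already need ten thousand samples and twelve dimensions need a trillion, so the cost of dense sampling is driven far more by the dimensionality of the problem than by the resolution per axis.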

On the *Turing test* and natural dialogue, Chollet argues that the test is more about exploiting human perception than about measuring real intelligence. Sustaining a 20-minute conversation with all its tangents would be very challenging for deep learning, but he does not rule it out.

#### 5. Where Deep Learning Fits Best
Deep learning is a strong fit for problems where explicit rules are hard to write by hand or to find by exhaustive search over a program space, above all perception, and provided there is enough training data. In that role, DL acts as "artificial intuition" that handles the raw sensory input before the result is processed further by a logical system.

### Conclusion and Closing Message
The main conclusion is that *deep learning* is unlikely to suffice on its own as the single route to general intelligence (AGI), because it generalizes poorly without massive amounts of data. The more promising direction is to build hybrid systems in which DL handles sensory perception while logical reasoning and planning are left to symbolic AI. The combination echoes the way humans pair intuition with logical thinking.

file updated 2026-02-13 13:22:20 UTC