Transcript
i3ZnDRrmFjg • Neural networks learning spirals
Kind: captions
Language: en
let's use TensorFlow Playground to see
what kind of neural network
can learn to partition the space for the
binary classification problem
between the blue and the orange dots
first is an easier binary classification
problem
with a circle and a ring distribution
around it
second is a more difficult binary
classification problem
of two dueling spirals this little
visualization tool on
playground.tensorflow.org
is really useful for getting an
intuition about how the size of the
network
and the various hyperparameters affect
what kind of representations that
network is able to learn the input to
the network is the position of the point
in the 2d plane
and the output of the network is the
classification of whether it's an orange
or a blue dot
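
The two data sets described above can be approximated in a few lines of NumPy. This is only a sketch: the function names, the radius, the number of turns, and the sampling scheme are assumptions for illustration, not the playground's actual generator. Each point is an (x1, x2) position and each label is +1 or -1, standing in for the two dot colors.

```python
import numpy as np

def make_circle(n_per_class=100, radius=5.0, seed=0):
    """Inner disc (label +1) surrounded by a ring (label -1)."""
    rng = np.random.default_rng(seed)
    r_inner = rng.uniform(0.0, 0.5 * radius, n_per_class)
    r_outer = rng.uniform(0.7 * radius, radius, n_per_class)
    r = np.concatenate([r_inner, r_outer])
    theta = rng.uniform(0.0, 2.0 * np.pi, 2 * n_per_class)
    points = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
    labels = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return points, labels

def make_spirals(n_per_class=100, turns=1.75, radius=5.0, noise=0.0, seed=0):
    """Two interleaved spiral arms; the second is the first rotated by 180 degrees."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_per_class)
    r = t * radius
    angle = turns * 2.0 * np.pi * t
    arm = np.column_stack([r * np.cos(angle), r * np.sin(angle)])
    points = np.concatenate([arm, -arm])
    points = points + rng.normal(0.0, noise, points.shape)
    labels = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return points, labels
```

A linear boundary cannot separate either data set in the raw coordinates, which is why the hidden layers (or extra input features) matter in what follows.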
we'll hold all the hyperparameters
constant for this little experiment
and just vary the number of neurons and
hidden layers
the hyperparameters are a batch size of
one a learning rate of 0.03
the activation function is ReLU and
L1 regularization with a rate of 0.001
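
To make those settings concrete, here is a minimal NumPy sketch of a one-hidden-layer network trained one sample at a time (batch size of one) with a learning rate of 0.03, ReLU hidden units, and an L1 penalty of 0.001 on the weights. Several details are assumptions of this sketch rather than something stated in the video: the tanh output, the squared-error loss, the weight initialization, and all function names.

```python
import numpy as np

LEARNING_RATE = 0.03  # from the transcript
L1_RATE = 0.001       # from the transcript

def init_network(n_inputs, n_hidden, seed=0):
    """Small random weights; the initialization scheme is an assumption."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (n_inputs, n_hidden)),
        "b1": np.full(n_hidden, 0.1),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, x):
    """x has shape (n_inputs,); returns pre-activations, hidden values, output."""
    z1 = x @ params["W1"] + params["b1"]
    h = np.maximum(z1, 0.0)                          # ReLU
    out = np.tanh(h @ params["W2"] + params["b2"])   # assumed tanh output in [-1, 1]
    return z1, h, out

def loss(params, x, y):
    """Assumed squared-error loss for a single sample."""
    return 0.5 * float((forward(params, x)[2][0] - y) ** 2)

def gradients(params, x, y):
    """Backpropagated gradients of the data loss (no regularization term)."""
    z1, h, out = forward(params, x)
    d_a = (out - y) * (1.0 - out ** 2)        # through the tanh output
    d_z1 = (params["W2"] @ d_a) * (z1 > 0)    # through the ReLU
    return {"W1": np.outer(x, d_z1), "b1": d_z1,
            "W2": np.outer(h, d_a), "b2": d_a}

def sgd_step(params, x, y):
    """One update on one sample; L1 adds sign(w) to weight gradients only."""
    grads = gradients(params, x, y)
    for name in params:
        step = grads[name]
        if name.startswith("W"):
            step = step + L1_RATE * np.sign(params[name])
        params[name] -= LEARNING_RATE * step

def train(params, points, labels, epochs=10, seed=0):
    """Visit the samples in a fresh random order each epoch, one at a time."""
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(len(points)):
            sgd_step(params, points[i], labels[i])
    return params
```

Changing `n_hidden` in `init_network` corresponds to adding neurons to the single hidden layer in the experiment; deeper networks would need one more weight matrix per layer.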
so let's start with one hidden layer and
one neuron and gradually increase the
size of the network to see what kind of
representation it's able to learn
keep your eye on the right side of the
screen that shows the test loss and the
training loss
and the plot that shows sample points
from the two distributions
and then the shading in the background
of the plot shows the partitioning
function that the neural network is
learning
so a successful function is able to
separate the orange and the blue dots
one hidden layer with one neuron
two neurons
three neurons
four neurons
eight neurons
now let's take a look at the trickier
spiral data set keeping most of the
hyperparameters the same
but decreasing the learning rate to 0.01
and adding extra features to the input of the neural
network
not just the coordinates
of the point
but also the squares of the coordinates
their product
and the sine of each coordinate let's
start with one hidden layer one neuron
two neurons
four neurons
six neurons
eight neurons
two hidden layers two neurons on the
second layer
four neurons
six neurons
eight neurons
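
The extra input features used for the spiral runs (the coordinates, their squares, their product, and the sine of each coordinate) can be sketched as a simple feature expansion. The function name and the column order are assumptions for illustration.

```python
import numpy as np

def expand_features(points):
    """Map (x1, x2) rows to [x1, x2, x1^2, x2^2, x1*x2, sin(x1), sin(x2)]."""
    x1, x2 = points[:, 0], points[:, 1]
    return np.column_stack([
        x1, x2,                   # raw coordinates
        x1 ** 2, x2 ** 2,         # squares of the coordinates
        x1 * x2,                  # product of the coordinates
        np.sin(x1), np.sin(x2),   # sine of each coordinate
    ])
```

With these seven inputs the first layer already sees nonlinear functions of the position, so a smaller network has a better chance of fitting the spiral than it would on the raw coordinates alone.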
there you go that's a basic illustration
with playground.tensorflow.org
which i recommend you try that shows the
connection between neural network
architecture
data set characteristics and different
training hyperparameters
it's important to note that the
initialization of the neural network has
a big impact
in many of the cases but the purpose of
this video is not to show the minimal
neural network architecture that's able
to represent the spiral
data set but rather to provide a visual
intuition about which kind of networks
are able to
learn which kinds of data sets there you
go i hope you enjoy these quick little
videos
whether they make you think give you a
new kind of insight
or are just fun and inspiring see you next
time
and remember try to challenge yourself
and learn something new
every day