Transcript
GVsUOuSjvcg • Future Computers Will Be Radically Different (Analog Computing)
Kind: captions
Language: en
For hundreds of years, analog computers were the most powerful computers on Earth, predicting eclipses and tides and guiding anti-aircraft guns. Then, with the advent of solid-state transistors, digital computers took off. Now virtually every computer we use is digital. But today, a perfect storm of factors is setting the scene for a resurgence of analog technology.

This is an analog computer, and by connecting these wires in particular ways, I can program it to solve a whole range of differential equations. For example, this setup allows me to simulate a damped mass oscillating on a spring. On the oscilloscope, you can actually see the position of the mass over time, and I can vary the damping, or the spring constant, or the mass, and we can see how the amplitude and duration of the oscillations change.

Now, what makes this an analog computer is that there are no zeros and ones in here. Instead, there's actually a voltage that oscillates up and down exactly like a mass on a spring. The electrical circuitry is an analog for the physical problem; it just takes place much faster.

If I change the electrical connections, I can program this computer to solve other differential equations, like the Lorenz system, which is a basic model of convection in the atmosphere. The Lorenz system is famous because it was one of the first discovered examples of chaos, and here you can see the Lorenz attractor with its beautiful butterfly shape. On this analog computer, I can change the parameters and see their effects in real time.

These examples illustrate some of the advantages of analog computers. They are incredibly powerful computing devices that can complete a lot of computations fast, and they don't take much power to do it.
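The damped mass on a spring that the machine simulates obeys the differential equation m·x″ + c·x′ + k·x = 0. For comparison, here is a minimal sketch of how a digital computer would have to integrate the same equation step by step; all parameter values are invented for illustration.

```python
# Digitally integrate m*x'' + c*x' + k*x = 0 with semi-implicit Euler.
# Parameter values are arbitrary, chosen only for illustration.
m, c, k = 1.0, 0.4, 9.0   # mass, damping, spring constant
x, v = 1.0, 0.0           # initial position and velocity
dt = 0.001                # time step (seconds)

trace = []
for step in range(10_000):            # simulate 10 seconds
    a = (-c * v - k * x) / m          # acceleration from the ODE
    v += a * dt                       # update velocity first...
    x += v * dt                       # ...then position (semi-implicit Euler)
    trace.append(x)

# The oscillation decays: later peaks are smaller than earlier ones.
print(max(trace[:2000]), max(trace[4000:6000]))
```

The analog circuit integrates this continuously; the digital version has to grind through thousands of discrete steps to approximate the same curve.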
With a digital computer, if you want to add two 8-bit numbers, you need around 50 transistors, whereas with an analog computer you can add two currents just by connecting two wires. With a digital computer, to multiply two numbers you need on the order of a thousand transistors, all switching zeros and ones, whereas with an analog computer you can pass a current through a resistor, and then the voltage across this resistor will be I times R. So effectively, you have multiplied two numbers together.

But analog computers also have their drawbacks. For one thing, they are not general-purpose computing devices; I mean, you're not going to run Microsoft Word on this thing. Also, since the inputs and outputs are continuous, I can't input exact values, so if I try to repeat the same calculation, I'm never going to get the exact same answer. Plus, think about manufacturing analog computers: there's always going to be some variation in the exact value of components like resistors or capacitors, so as a general rule of thumb, you can expect about a 1% error.

So when you think of analog computers, you can think powerful, fast, and energy-efficient, but also single-purpose, non-repeatable, and inexact. And if those sound like deal-breakers, it's because they probably are. I think these are the major reasons why analog computers fell out of favor as soon as digital computers became viable.

Now, here is why analog computers may be making a comeback. It all starts with artificial intelligence. "A machine has been programmed to see and to move objects." AI isn't new. The term was coined back in 1956. In 1958, Cornell University psychologist Frank Rosenblatt built the perceptron, designed to mimic how neurons fire in our brains.

Here's a basic model of how neurons in our brains work. An individual neuron can either fire or not, so its level of activation can be represented as a one or a zero. The input to one neuron is the output from a bunch of other neurons, but the strength of the connections between neurons varies, so each one can be given a different weight. Some connections are excitatory, so they have positive weights, while others are inhibitory, so they have negative weights. The way to figure out whether a particular neuron fires is to take the activation of each input neuron, multiply it by its weight, and then add these all together. If the sum is greater than some number called the bias, the neuron fires; if it's less than that, the neuron doesn't fire.

As input, Rosenblatt's perceptron had 400 photocells arranged in a square grid to capture a 20x20 pixel image. You can think of each pixel as an input neuron, with its activation being the brightness of the pixel. Although strictly speaking the activation should be either zero or one, we can let it take any value between zero and one. All of these neurons are connected to a single output neuron, each via its own adjustable weight. So to see if the output neuron will fire, you multiply the activation of each neuron by its weight and add them together; this is essentially a vector dot product. If the answer is larger than the bias, the neuron fires, and if not, it doesn't.

Now, the goal of the perceptron was to reliably distinguish between two images, like a rectangle and a circle. For example, the output neuron could always fire when presented with a circle but never when presented with a rectangle. To achieve this, the perceptron had to be trained, that is, shown a series of different circles and rectangles and have its weights adjusted accordingly. We can visualize the weights as an image, since there's a unique weight for each pixel. Initially, Rosenblatt set all the weights to zero. If the perceptron's output is correct (for example, here it's shown a rectangle and the output neuron doesn't fire), no change is made to the weights. But if it's wrong, then the weights are adjusted. The algorithm for updating the weights is remarkably simple. Here, the output neuron didn't fire when it was supposed to, because it was shown a circle, so to modify the weights, you simply add the input activations to the weights. If the output neuron fires when it shouldn't, like here when shown a rectangle, then you subtract the input activations from the weights. And you keep doing this until the perceptron correctly identifies all the training images. It was shown that this algorithm will always converge, so long as it's possible to separate the two categories into distinct groups.
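The update rule just described is simple enough to state in a few lines of code. Here is a minimal sketch; the tiny 3x3 "images" are invented stand-ins for the real 20x20 photocell grid.

```python
# Minimal perceptron following the rule described above:
# fire if sum(activation * weight) > bias; on a mistake, add or
# subtract the input activations from the weights.

def fires(weights, image, bias=0.0):
    return sum(w * a for w, a in zip(weights, image)) > bias

# label 1 = should fire ("circle"), label 0 = should not ("rectangle")
training_set = [
    ([0, 1, 0, 1, 0, 1, 0, 1, 0], 1),   # rough "circle" outline
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], 1),   # ring
    ([1, 1, 1, 0, 0, 0, 0, 0, 0], 0),   # horizontal bar
    ([0, 0, 0, 1, 1, 1, 0, 0, 0], 0),   # horizontal bar
]

weights = [0.0] * 9                      # Rosenblatt started from zero
for _ in range(100):                     # epochs; converges if separable
    mistakes = 0
    for image, label in training_set:
        out = fires(weights, image)
        if out and label == 0:           # fired when it shouldn't: subtract
            weights = [w - a for w, a in zip(weights, image)]
            mistakes += 1
        elif not out and label == 1:     # failed to fire: add
            weights = [w + a for w, a in zip(weights, image)]
            mistakes += 1
    if mistakes == 0:                    # every training image correct
        break

print([fires(weights, img) for img, _ in training_set])
# → [True, True, False, False]
```

Because these two toy classes are linearly separable, the loop reaches zero mistakes after a few epochs, which is exactly the convergence guarantee mentioned above.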
The perceptron was capable of distinguishing between different shapes, like rectangles and triangles, or between different letters, and according to Rosenblatt, it could even tell the difference between cats and dogs. He said the machine was capable of what amounts to original thought, and the media lapped it up. The New York Times called the perceptron "the embryo of an electronic computer that the Navy expects will be able to walk, talk, see, write, reproduce itself, and be conscious of its existence." "After training on lots of examples, it's given new faces it has never seen, and is able to successfully distinguish male from female. It has learned."

In reality, the perceptron was pretty limited in what it could do. It could not, in fact, tell apart dogs from cats. This and other critiques were raised in a book by MIT giants Minsky and Papert in 1969, and that led to a bust period for artificial neural networks and AI in general, known as the first AI winter. Rosenblatt did not survive this winter; he drowned while sailing in Chesapeake Bay on his 43rd birthday.

"The NavLab is a roadworthy truck, modified so that researchers or computers can control the vehicle as occasion demands." In the 1980s, there was an AI resurgence when researchers at Carnegie Mellon created one of the first self-driving cars. The vehicle was steered by an artificial neural network called ALVINN. It was similar to the perceptron, except it had a hidden layer of artificial neurons between the input and output. As input, ALVINN received 30x32 pixel images of the road ahead (here I'm showing them as 60x64 pixels). Each of these input neurons was connected via an adjustable weight to a hidden layer of four neurons, and these were each connected to 32 output neurons. So to go from one layer of the network to the next, you perform a matrix multiplication: the input activation times the weights. The output neuron with the greatest activation determines the steering angle.

To train the neural net, a human drove the vehicle, providing the correct steering angle for a given input image. All the weights in the neural network were adjusted through the training so that ALVINN's output better matched that of the human driver.
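Going from layer to layer is, as described, just a matrix multiplication followed by picking the most active output. A toy sketch using ALVINN's layer sizes (the weights here are random placeholders, not the trained network, and activation functions are omitted for brevity):

```python
import random

random.seed(0)

# ALVINN-style shapes: 30x32 input pixels -> 4 hidden -> 32 outputs.
N_IN, N_HID, N_OUT = 30 * 32, 4, 32

def matvec(weights, vec):
    """One layer: matrix-vector product (input activations times weights)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

w1 = [[random.uniform(-0.1, 0.1) for _ in range(N_IN)] for _ in range(N_HID)]
w2 = [[random.uniform(-0.1, 0.1) for _ in range(N_HID)] for _ in range(N_OUT)]

image = [random.random() for _ in range(N_IN)]   # stand-in road image

hidden = matvec(w1, image)    # input layer -> hidden layer
output = matvec(w2, hidden)   # hidden layer -> output layer

# The most active output neuron determines the steering angle.
steering_unit = output.index(max(output))
print(steering_unit)          # an index from 0 to 31
```

Counting the multiplies in those two layers (960 x 4 plus 4 x 32) makes it clear why a 1980s CPU doing this for every camera frame could only drive at walking pace.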
The method for adjusting the weights is called backpropagation, which I won't go into here, but Welch Labs has a great series on this, which I'll link to in the description. Again, you can visualize the weights for the four hidden neurons as images. The weights are initially set to be random, but as training progresses, the computer learns to pick up on certain patterns: you can see the road markings emerge in the weights. Simultaneously, the output steering angle coalesces onto the human steering angle. The computer drove the vehicle at a top speed of around one or two kilometers per hour. It was limited by the speed at which the computer could perform matrix multiplication.

Despite these advances, artificial neural networks still struggled with seemingly simple tasks, like telling apart cats and dogs, and no one knew whether hardware or software was the weak link. I mean, did we have a good model of intelligence and just need more computing power, or did we have the wrong idea about how to make intelligent systems altogether? So artificial intelligence experienced another lull in the
1990s. By the mid-2000s, most AI researchers were focused on improving algorithms, but one researcher, Fei-Fei Li, thought maybe there was a different problem: maybe these artificial neural networks just needed more data to train on. So she planned to map out the entire world of objects. From 2006 to 2009, she created ImageNet, a database of 1.2 million human-labeled images, which at the time was the largest labeled image dataset ever constructed. And from 2010 to 2017, ImageNet ran an annual contest, the ImageNet Large Scale Visual Recognition Challenge, where software programs competed to correctly detect and classify images. Images were classified into a thousand different categories, including 90 different dog breeds. A neural network competing in this competition would have an output layer of a thousand neurons, each corresponding to a category of object that could appear in the image. If the image contains, say, a German Shepherd, then the output neuron corresponding to German Shepherd should have the highest activation.

Unsurprisingly, it turned out to be a tough challenge. One way to judge the performance of an AI is to see how often the five highest neuron activations do not include the correct category; this is the so-called top-five error rate. In 2010, the best performer had a top-five error rate of 28.2%, meaning that nearly a third of the time, the correct answer was not among its top five guesses. In 2011, the error rate of the best performer was 25.8%, a substantial improvement. But the next year, an artificial neural network from the University of Toronto called AlexNet blew away the competition with a top-five error rate of just 16.4%.

What set AlexNet apart was its size and depth. The network consisted of eight layers and, in total, 500,000 neurons. To train AlexNet, 60 million weights and biases had to be carefully adjusted using the training database. Because of all the big matrix multiplications, processing a single image required 700 million individual math operations, so training was computationally intensive. The team managed it by pioneering the use of GPUs, graphics processing units, which are traditionally used for driving display screens, so they're specialized for fast parallel computations. The AlexNet paper describing their research is a blockbuster; it's now been cited over 100,000 times, and it identifies the scale of the neural network as key to its success: it takes a lot of computation to train and run the network, but the improvement in performance is worth it. With others following their lead, the top-five error rate on the ImageNet competition plummeted in the years that followed, down to 3.6% in
2015. That is better than human performance. The neural network that achieved this had 100 layers of neurons.

So the future is clear: we will see ever-increasing demand for ever-larger neural networks, and this is a problem for several reasons. One is energy consumption: training a neural network requires an amount of electricity similar to the yearly consumption of three households. Another issue is the so-called von Neumann bottleneck. Virtually every modern digital computer stores data in memory and then accesses it as needed over a bus, so when performing the huge matrix multiplications required by deep neural networks, most of the time and energy goes into fetching those weight values rather than actually doing the computation. And finally, there are the limitations of Moore's law: for decades, the number of transistors on a chip has been doubling approximately every two years, but now the size of a transistor is approaching the size of an atom, so there are some fundamental physical challenges to further miniaturization.

So this is the perfect storm for analog computers: digital computers are reaching their limits, meanwhile neural networks are exploding in popularity, and a lot of what they do boils down to a single task, matrix multiplication. Best of all, neural networks don't need the precision of digital computers. Whether the neural net is 96% or 98% confident the image contains a chicken, it doesn't really matter; it's still a chicken. So slight variability in components or conditions can be
tolerated. I went to an analog computing startup in Texas called Mythic AI. Here they're creating analog chips to run neural networks, and they demonstrated several AI algorithms for me.

"Oh, there you go, see, it's getting you." "Yeah, that's fascinating." "The biggest use case is augmented and virtual reality. You know, if your friend is in a different place, they're at their house and you're at your house, you can actually render each other in the virtual world. So it needs to really quickly capture your pose and then render it in the VR world." "So hang on, is this for the metaverse, then?" "This is, yeah, this is a very metaverse application. This is depth estimation from just a single webcam. It's just taking this scene and doing like a heat map: if it's bright, it means it's close; if it's far away, it makes it black."

Now, all these algorithms can be run on digital computers, but here the matrix multiplication is actually taking place in the analog domain. To make this possible, Mythic has repurposed digital flash storage cells.
Normally, these are used as memory to store either a one or a zero. If you apply a large positive voltage to the control gate, electrons tunnel up through an insulating barrier and become trapped on the floating gate. Remove the voltage, and the electrons can remain on the floating gate for decades, preventing the cell from conducting current. That's how you can store a one or a zero. You can read out the stored value by applying a small voltage: if there are electrons on the floating gate, no current flows, so that's a zero; if there aren't electrons, then current does flow, and that's a one.

Now, Mythic's idea is to use these cells not as on-off switches but as variable resistors. They do this by putting a specific number of electrons on each floating gate, instead of all or nothing. The greater the number of electrons, the higher the resistance of the channel. When you later apply a small voltage, the current that flows is equal to V divided by R, but you can also think of this as voltage times conductance, where conductance is just the reciprocal of resistance. So a single flash cell can be used to multiply two values together: voltage times conductance.

To use this to run an artificial neural network, they first write all the weights to the flash cells as each cell's conductance. Then they input the activation values as voltages on the cells, and the resulting current is the product of voltage times conductance, which is activation times weight. The cells are wired together in such a way that the currents from the multiplications add together, completing the matrix
multiplication. "So this is our first product. This can do 25 trillion math operations per second." "25 trillion?" "Yep, 25 trillion math operations per second, in this little chip here, burning about three watts of power." "How does it compare to a digital chip?" "The newer digital systems can do anywhere from 25 to 100 trillion operations per second, but they are big, thousand-dollar systems that are spitting out, you know, 50 to 100 watts of power." "Obviously this isn't like an apples-to-apples comparison, right?" "It's not apples to apples. I mean, to train those algorithms you need big hardware like this, and you can just do all sorts of stuff on the GPU. But if you specifically are doing AI workloads and you want to deploy them, you could use this instead."
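The scheme Mythic describes, weights stored as conductances, activations applied as voltages, and currents summed on a shared wire, amounts to one dot product per output line. A numerical sketch, with all values invented for illustration:

```python
# Simulate the flash-cell trick: each cell multiplies by Ohm's law
# (I = V * G, with conductance G = 1/R), and wiring a row of cells
# to a shared line sums their currents, yielding one dot product
# per output. All numbers here are made up for illustration.

weights = [[0.2, -0.5, 0.1],      # "conductances", one row of cells
           [0.7,  0.3, -0.2]]     # per output line
activations = [1.0, 0.5, 0.8]     # applied as voltages

currents = []
for row in weights:
    # each cell contributes I = V * G; the shared wire sums them
    total_current = sum(v * g for v, g in zip(activations, row))
    currents.append(total_current)

print(currents)   # the matrix-vector product, read out as currents
```

One caveat: a physical conductance can't be negative, so real chips typically represent signed weights with pairs of cells and subtract the two currents; that detail is omitted here.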
"You can imagine them in security cameras, autonomous systems, inspection equipment for manufacturing. You know, every time they make, like, a Frito-Lay chip, they inspect it with a camera, and the bad Fritos get blown off of the conveyor belt. They're using artificial intelligence to spot which Fritos are good and bad."

Some have proposed using analog circuitry in smart home speakers solely to listen for the wake word, like "Alexa" or "Siri." They would use a lot less power and be able to quickly and reliably turn on the digital circuitry of the device. But you still have to deal with the challenges of analog. "So for one of the popular networks, there would be 50 sequences of matrix multiplies that you're doing. Now, if you did that entirely in the analog domain, by the time it gets to the output, it's just so distorted that you don't have any result at all. So you convert it from the analog domain back to the digital domain, send it to the next processing block, and then you convert it into the analog domain again, and that allows you to preserve the signal."

You know, when Rosenblatt was first setting up his perceptron, he used a digital IBM computer. Finding it too slow, he built a custom analog computer, complete with variable resistors and little motors to drive them. Ultimately, his idea of neural networks turned out to be right. Maybe he was right about analog, too. Now, I can't say whether analog
computers will take off the way digital did last century, but they do seem to be better suited to a lot of the tasks that we want computers to perform today, which is a little bit funny, because I always thought of digital as the optimal way of processing information. You know, everything from music to pictures to video has gone digital in the last 50 years. But maybe in a hundred years, we will look back on digital not as the end point of information technology, but as a starting point. Our brains are digital, in that a neuron either fires or it doesn't, but they're also analog, in that thinking takes place everywhere, all at once. So maybe what we need to achieve true artificial intelligence, machines that think like us, is the power of analog.

Hey, I learned a lot while making this video, much of it by playing with an actual analog computer. You know, trying things out for yourself is really the best way to learn, and you can do that with this video's sponsor, Brilliant. Brilliant is a website and app that gets you thinking deeply by engaging you in problem solving. They have a great course on neural networks where you can test how it works for yourself. It gives you an excellent intuition about how neural networks can recognize numbers and shapes, and it also allows you to experience the importance of good training data and hidden layers, to understand why more sophisticated neural networks work better. What I love about Brilliant is that it tests your knowledge as you go. The lessons are highly interactive, and they get progressively harder as you go on, and if you get stuck, there are always helpful hints. For viewers of this video, Brilliant is offering the first 200 people 20% off an annual premium subscription; just go to brilliant.org/veritasium. I will put that link down in the description. So I want to thank Brilliant for supporting Veritasium, and I want to thank you for watching.