David Patterson: Computer Architecture and Data Storage

David Patterson: Computer Architecture and Data Storage | Lex Fridman Podcast #104

naed4C4hfAg • 2020-06-27

Kind: captions
Language: en
the following is a conversation with
David Patterson Turing Award winner and
professor of computer science at
Berkeley he's known for pioneering
contributions to RISC processor
architecture used by 99% of new chips
today and for co-creating RAID storage
the impact that these two lines of
research and development have had in our
world is immeasurable he's also one of
the great educators of computer science
in the world his book with John Hennessy
is how I first learned about and was
humbled by the inner workings of
machines at the lowest level quick
summary of the ads to sponsors the
Jordan Harbinger show and cash app
please consider supporting the podcast
by going to jordanharbinger.com/lex and
downloading Cash App and using code
LexPodcast click on the links buy the
stuff it's the best way to support this
podcast and in general the journey I'm
on in my research and startup this is
the artificial intelligence podcast if
you enjoy it
subscribe on YouTube review it with five
stars on Apple Podcasts support it on
Patreon or connect with me on Twitter
at lexfridman spelled without the e
just f-r-i-d-m-a-n as usual I'll do a few
minutes of ads now and never any ads in
the middle that can break the flow of
the conversation
this episode is supported by the Jordan
Harbinger show go to
jordanharbinger.com/lex
it's how he knows I sent you on that page
there's links to subscribe to it on
Apple Podcasts Spotify and everywhere
else I've been binging on this podcast
it's amazing Jordan is a great human
being he gets the best out of his guests
dives deep calls them out when it's needed it
makes the whole thing fun to listen to
he's interviewed Kobe Bryant Mark Cuban
and Neil deGrasse Tyson and Garry
Kasparov and many more I recently
listened to his conversation with Frank
Abagnale author of Catch Me If You Can
one of the world's most famous con men
perfect podcast length and topic for a
recent long distance run that I did
go to jordanharbinger.com/lex to give
him my love and to support this podcast
subscribe also on Apple podcast Spotify
and everywhere else this show is
presented by cash app the greatest
sponsor of this podcast ever and the
number one finance app in the App Store
when you get it use code LexPodcast
Cash App lets you send money to friends
buy bitcoin invest in the stock market
with as little as one dollar since Cash
App allows you to buy bitcoin let me
mention that cryptocurrency in the
context of the history of money is
fascinating I recommend The Ascent of
Money as a great book on this history
also the audio book is amazing debits
and credits on ledgers started around
30,000 years ago the US dollar created
over two hundred years ago and the first
decentralized cryptocurrency released
just over ten years ago so given that
history cryptocurrencies still very much
in its early days of development but
it's still aiming to and just might
redefine the nature of money so again if
you get Cash App from the App Store or
Google Play and use the code LexPodcast
you get ten dollars and Cash App will
also donate ten dollars to FIRST an
organization that is helping to advance
robotics and STEM education for young
people around the world and now here's
my conversation with David Patterson
let's start with the big historical
question how have computers changed in
the past 50 years at both the
fundamental architectural level and in
general in your eyes well the biggest
thing that happened was the invention of
the microprocessor so computers that
used to fill up several rooms could fit
inside your cell phone and not only
did they get smaller they got a lot
faster so they're a million times faster
than they were 50 years ago and they're
much cheaper and they're ubiquitous
you know there's seven point eight
billion people on this planet probably
half of them have cell phones but you
know just remarkable
it's probably more microprocessors than
there are people sure I don't know what
the ratio is but I'm sure it's above one
maybe it's ten to one or some number
like that what is a microprocessor so a
way to say what a microprocessor is to
tell you what's inside a computer so a
computer forever has classically had
five pieces there's input and output
which kind of naturally as you'd expect
is input is like speech or typing and
output is displays there's a memory and
like the name sounds it it remembers
things so it's integrated circuits whose
job is you put information in and when
you ask for it it comes back out that's
memory and the third part is the
processor where the term microprocessor
comes from and that has two pieces as
well and that is the control which is
kind of the brain of the processor and
the what's called the arithmetic unit
it's kind of the Brawn of the computer
so if you think of the as a human body
the arithmetic unit the thing that does
the number crunching is the is the body
and the control is the brain so those
five pieces input/output memory
arithmetic unit and control have
been in computers since the very dawn and
the last two are considered the
processor so a microprocessor simply
means a processor that fits on a
microchip and that was invented at about
you know 40 years ago was the first
microprocessor it's interesting that you
refer to the arithmetic unit as the like
you connected it to the body and the
control is the brain so I guess I
never thought of it that way it's a nice way
to think of it because most of the
actions the microprocessor does in terms
of literally sort of computation with
the microprocessor does computation it
processes information and most of the
thing it does is basically
arithmetic operations what what are the
operations by the way it's a lot like a
calculator you know so there are add
instructions subtract instructions
multiply and divide and
kind of the brilliance of the invention
of the computer or the microprocessor is
that it performs very trivial operations
but it just performs billions of them
per second and what we're capable of
doing is writing software that can take
these very trivial instructions and have
them create tasks that can do things
better than human beings can do today
just looking back through your career
did you anticipate the kind of how good
we would be able to get at doing these
small basic operations were there any
surprises along the way where you just
kind of sat back and said wow I didn't
expect it to go this fast this good well
the the fundamental driving force is
what's called Moore's law which was named
after Gordon Moore who's a Berkeley
alumnus and he made this observation
very early in what are called
semiconductors and semiconductors are these
ideas you can build these very simple
switches and you can put them on these
microchips and he made his observation
over 50 years ago he looked at a few
years and said I think what's going to
happen is the number of these little
switches called transistors is going to
double every year for the next decade
and he said this in 1965 and in 1975 he
said well maybe it's gonna double every
two years and that what other people
since named Moore's Law guided the
industry and when Gordon Moore makes
that prediction he he wrote a paper back
in I think in the in the 70s and said
not only did this going to happen he
wrote what would be the implications of
that and in this article from 1965 he he
shows ideas like computers being in cars
and computers being in something that
you would buy in the grocery store and
stuff like that so he kind of not only
called his shot he called the
implications of it so if you were in in
the computing field and if you believed
Moore's prediction he kind of said what
the what would be happening in the
future
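The two forms of the prediction described above (doubling every year, later revised to doubling every two years) compound very differently over a decade. A minimal sketch of that arithmetic follows; the function name `transistor_growth` is made up for illustration and is not from the conversation.

```python
def transistor_growth(years, doubling_period_years):
    """Growth factor after `years` if the count doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


# 1965 form of the prediction: doubling every year for a decade.
yearly = transistor_growth(10, 1)
# 1975 revision: doubling every two years.
biennial = transistor_growth(10, 2)

print(f"doubling every year for 10 years:      {yearly:.0f}x")
print(f"doubling every two years for 10 years: {biennial:.0f}x")
```

Over ten years the first form gives roughly a thousandfold increase and the second a thirty-two-fold increase, which is why the doubling period matters so much to anyone planning a chip design.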
so so it's not kind of
it's at one sense this is what was
predicted and you could imagine it was
easily believed that Moore's law was
going to continue and so this would be
the implications on the other side there
are these shocking events in your life
like I remember driving in Marin across
the bay in San Francisco and seeing a
bulletin board at a local Civic Center
and it had a URL on it and it was like
for the people
at the time these first URLs and that's
the you know www slash stuff with the
http people thought it looked like
alien writing right they'd see
these advertisements and commercials or
bulletin boards that had this alien
writing on it so for the lay people is
like what the hell is going on here and
for those people in the industry it's oh my
god this stuff is getting so popular
it's actually leaking out of our nerdy
world and into the real world so that I
mean there is events like that I think
another one was I remember in the
early days of the personal computer when
we started seeing advertisements in
magazines for personal computers like
it's so popular that it's it made the
newspapers so on one hand you know
Gordon Moore predicted it and you kind
of expected it to happen but when it
really hit and you saw it affecting
society it was it was shocking so maybe
taking a step back and looking both the
engineering and philosophical
perspective what what do you see as the
layers of abstraction in the computer do
you see a computer as a set of layers of
abstractions yeah and I think that's one
of the things that computer science
fundamentals is that these things are
really complicated and the way we cope
with complicated software and
complicated hardware is these layers of
abstraction and that simply means that
we you know suspend disbelief and
pretend that the only thing you know is
that layer and you don't know anything
about the layer below it and that's the
way we can make very complicated things
and probably it started with hardware
that's the way it was done but it's been
proven extremely useful and
you know I would think in a modern
computer today there might be 10 or 20
layers of abstraction and they're all
trying to kind of enforce this contract
is all you know is this interface
there's a set of commands that you can
allow to use and you stick to those
commands that we will faithfully execute
that and it's like peeling the
layers of an onion you get down
there's a new set of layers and so forth
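The contract described above, where each layer knows only the interface of the layer directly below it, can be sketched in a few lines. This is a toy illustration under stated assumptions: the class names `Hardware`, `Runtime`, and `Application` and their methods are invented for the example and do not correspond to any real system.

```python
class Hardware:
    """Bottom layer: the only command it offers is adding two numbers."""

    def add(self, a, b):
        return a + b


class Runtime:
    """Middle layer: builds multiplication using only the add command below it."""

    def __init__(self, hardware):
        self._hardware = hardware

    def multiply(self, value, times):
        total = 0
        for _ in range(times):
            total = self._hardware.add(total, value)
        return total


class Application:
    """Top layer: uses only the runtime's interface, never the hardware directly."""

    def __init__(self, runtime):
        self._runtime = runtime

    def area(self, width, height):
        return self._runtime.multiply(width, height)


app = Application(Runtime(Hardware()))
print(app.area(6, 7))
```

Because `Application` never touches `Hardware`, either lower layer could be swapped for a faster implementation without changing the code above it, which is the point of sticking to the agreed set of commands.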
so for people who want to study computer
science the exciting part about it is
you can keep peeling those layers you
you take your first course and you might
learn to program in Python and then you
can take a follow-on course and you can
get it down to a lower level language
like C and you know you can go and you
can if you want to you can start getting
into the hardware layers and you keep
getting down all the way to that
transistor that I talked about that
Gordon Moore predicted and you can
understand all those layers all the way
up to the highest level application
software so it's it's a very kind of
magnetic field if you're interested you
can go into any depth and keep going in
particular what's happening right now or
it's happened in software last twenty
years and recently in hardware there's
getting to be open sourced versions of
all of these things so what open source
means is what the engineer the
programmer designs it's not secret
belonging to a company it's up there on
the World Wide Web so you can see it so
you can look at for lots of pieces of
software that you use you can see
exactly what the programmer does if you
want to get involved that used to stop
at the hardware recently there's been an
effort to make open-source hardware and
those interfaces open so you can see
that so instead of before you had to
stop at the hardware you can now start
going layer by layer below that and see
what's inside there so it's it's a
remarkable time that for the interested
individual can really see in great depth
what's really going on and the computers
that power everything that we see around
us are you thinking also
when you say open source at the hardware
level is this going to the design
architecture instruction set level or is
it going to literally the the you know
the manufacture of the of the actual
hardware of the actual chips whether
that's ASICs specialized to a particular
domain or the general yeah so let's talk
about that a little bit so when you get
down to the bottom layer of software the
way software talks to hardware is in a
vocabulary and what we call that
vocabulary we call that the words of
that vocabulary are called instructions and
the technical term for the vocabulary is
instruction set
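To make the idea of an instruction set as a vocabulary concrete, here is a minimal sketch of a toy machine whose whole vocabulary is the arithmetic operations mentioned in this conversation (add, subtract, multiply, divide) plus load and store for moving data between memory and registers. Everything about the encoding is an assumption made for illustration: the tuple format, the register names, and the use of integer division are not taken from any real instruction set.

```python
import operator

# Hypothetical arithmetic vocabulary; the mnemonics are illustrative only.
ARITHMETIC = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.floordiv,  # integer division is an assumption of this sketch
}


def run(program, memory):
    """Execute a list of instruction tuples against a dict used as memory."""
    registers = {}
    for op, *args in program:
        if op == "load":        # load dest_register, address
            dest, address = args
            registers[dest] = memory[address]
        elif op == "store":     # store source_register, address
            source, address = args
            memory[address] = registers[source]
        elif op in ARITHMETIC:  # op dest_register, left_register, right_register
            dest, left, right = args
            registers[dest] = ARITHMETIC[op](registers[left], registers[right])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"a": 6, "b": 7, "c": 2, "result": 0}
program = [
    ("load", "r1", "a"),
    ("load", "r2", "b"),
    ("load", "r3", "c"),
    ("mul", "r4", "r1", "r2"),   # r4 = a * b
    ("sub", "r4", "r4", "r3"),   # r4 = a * b - c
    ("store", "r4", "result"),
]
run(program, memory)
print(memory["result"])
```

Six trivial instructions compute `a * b - c`; software at higher layers is built by composing very long sequences of steps like these.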
so those instructions are like we
talked about earlier that can be
instructions like add subtract and
multiply divide there's instructions to
put data into memory which is called a
store instruction and to get data back
which is called the load instructions
and those simple instructions go back to
the very dawn of computing in you know
in 1950 the commercial
computer had these instructions so
that's the instruction set that we're
talking about so up until I'd say ten
years ago these instruction sets are all
proprietary so a very popular one is
owned by Intel the one that's in the
cloud and in all the PCs in the world
the Intel owns that instruction set it's
referred to as the x86 there have been a
sequence of ones that the first number
was called 8086
and since then there's been a lot of
numbers but they all end in 86 so
there's then that kind of family of
instruction sets and that's proprietary
the other one
that's very popular is from ARM that
kind of powers all the cell
phones in the world all the iPads in the
world and a lot of things that are
so-called Internet of Things devices
and that one is also proprietary ARM
will license it to people for a fee but
they own that so the new idea that got
started at Berkeley kind of
unintentionally ten years ago is
in early in my career we pioneered a way
to do these vocabularies instruction
sets that was very controversial at the
time at the time in the 1980s
conventional wisdom was these
vocabularies instruction sets should
have you know powerful instructions so
polysyllabic kind of words you can think
of that and and so that instead of just
add subtract and multiply they would
have polynomial divide or sort a list and
the hope was of those powerful
vocabularies that make it easier for
software so we thought that didn't make
sense for microprocessors there were people
at Berkeley and Stanford and IBM who
argued the opposite and what we
called that was a reduced instruction
set computer and the abbreviation was
RISC and typical for computer people we
use the abbreviation and start pronouncing
it so RISC was the name so we said for
microprocessors which with Moore's
law are changing really fast we think
it's better to have a pretty simple set
of instructions a reduced set of
instructions that that would be a better
way to build microprocessors since
they're going to be changing so fast due
to Moore's law and then we'll just use
standard software to
generate more of those simple
instructions and one of the pieces of
software that it's in a software stack
going between these layers of
abstractions is called a compiler and it
basically translates it's a translator
between levels we said the translator
will handle it so the technical question
was well since there are these reduced
instructions you have to execute more of
them yeah that's right but maybe they
execute them faster yeah that's right
they're simpler so they could go faster
but you have to do more of them so
what's what's that trade-off look like
and it ended up that we ended up
executing maybe 50 percent more
instructions maybe 1/3 more instructions
but they ran four times faster so so
these controversial RISC ideas
proved to be maybe factors of three or
four better I love that this idea was
controversial and
almost kind of like rebellious so that's
in the context of what was more
conventional the complex instruction
set computing so how'd you pronounce
that CISC CISC RISC vs. CISC and and
believe it or not this sounds very very
you know who cares about this right it
was it was violently debated at several
conferences it's like what's the
right way to go and people thought
RISC was you know was de-evolution we're
gonna make software worse by making
the instructions simpler and there were
fierce debates at several conferences in
the 1980s and then later in the eighties
that kind of settled to these benefits
it's not completely intuitive to me why
RISC has for the most part won yes so
why did that happen yeah yeah and maybe I
can sort of say a bunch of dumb things
that could lay the land for further
commentary so to me and this is a this
is kind of interesting thing if you look
at C++ versus just C with modern
compilers you really could write faster
code with C++ so relying on the compiler
to reduce your complicated code into
something simple and fast so to me
comparing RISC maybe this is a dumb
question but why is it that focusing the
definition the design of the instruction
set on very few simple instructions in
the long run provide faster execution
versus coming up with like I said a ton
of complicated instructions then over
time you know years maybe decades you
come up with compilers that can reduce
those into simple instructions for you
yeah so let's try and split that into
two pieces so if the compiler can do
that for you if the compiler can take you
know a complicated program and produce
simpler instructions then the programmer
doesn't care right programmer yeah yeah
I don't care just how how fast is the
computer I'm using how much does it cost
and so what happened
kind of in the software industry is
right around before the 1980s critical
pieces of software were still written
not in languages like C or C++ they were
written in what's called assembly
language where there's this kind of
humans writing exactly the
instructions at the level that a
computer can understand so they were
writing add subtract multiply you know
instructions it's very tedious but the
belief was to write this lowest level of
software that the people use which are
called operating systems they had to be
written in assembly language because
these high-level languages were just too
inefficient they were too slow or the
the programs would be too big so that
changed with a famous operating system
called UNIX which is kind of the
grandfather of all the operating systems
today so the UNIX demonstrated that you
could write something as complicated as
an operating system in a language like C
so once that was true then that meant we
could hide the instruction set from the
programmer and so that meant then it
didn't really matter the programmer
didn't have to write lots of these
simple instructions so that was up to
the compiler so that was part of our
arguments for RISC is if you were still
writing assembly language there's maybe a
better case for CISC instructions but if
the compiler can do that it's gonna be
you know that's done once the computer
translates it once and then every time
you run the program it runs
these potentially simpler instructions
and so that that was the debate right is
because and people would acknowledge
that these simpler instructions could
lead to a faster computer you can think
of monosyllabic instructions you could
say them you know if you think of
reading you probably read them faster or
say them faster than long instructions
the same thing that analogy works pretty
well for hardware and as long as you
didn't have to read a lot more of those
instructions you could win so that's
that's kind of that's the basic idea for
RISC but it's interesting that in
that discussion of UNIX and C that
there's only one step
of levels of abstraction from the code
that's really the closest to the machine
to the code that's written by human it's
uh at least to me again perhaps a dumb
intuition but it feels like there might
have been more layers sort of different
kinds of humans stacked on top of each
other so what's true and not true about
what you said is several of the layers
of software like so the if you hear two
layers would be suppose we just talked
about two layers that would be the
operating system like you get from from
Microsoft or from Apple like iOS or the
Windows operating system and let's say
applications that run on top of it like
Word or Excel so both the operating
system could be written in C and the
application could be written in C so but
you could construct those two layers and
the applications absolutely do call upon
the operating system and the change was
that both of them could be written in
higher-level languages so it's one step
of a translation but you can still build
many layers of abstraction of software
on top of that and that's how how things
are done today so still today many of
the layers that you'll deal
with you may deal with debuggers you may
deal with linkers there's libraries many
of those today will be written in c++
say even though that language is pretty
ancient and even the Python interpreter
is probably written in C or C++ so lots
of layers there are probably written in
these some old fashioned efficient
languages that still take one step to
produce these instructions produce RISC
instructions but they're composed each
layer of software invokes one another
through these interfaces and you can get
ten layers of software that way so in
general RISC was developed here at
Berkeley it was kind of the three places
that were
these radicals that advocated for this
against the rest of the community were IBM
Berkeley and Stanford you're one of
these radicals and how radical did you
feel how confident did you feel how
doubtful were you that RISC might be the
right approach because you can
also intuit that it's kind of taking a
step back into simplicity not forward
into simplicity yeah no it was easy to
make yeah it was easy to make the
argument against it well this was my
colleague John Hennessy at Stanford and
I we were both assistant professors and
for me I just believed in the power of
our ideas I thought what we were saying
made sense Moore's Law is going to move
fast the other thing that I didn't
mention is one of the surprises of these
complex instruction sets you could
certainly write these complex
instructions if the programmer is
writing them in themselves it turned out
to be kind of difficult for the compiler
to generate those complex instructions
kind of ironically you'd have to find
the right circumstances that that just
exactly fit this complex instruction it
was actually easier for the compiler to
generate these simple instructions so
not only did these complex instructions
make the hardware more difficult to
build often the compiler wouldn't even
use them and so it's harder to build the
compiler doesn't use them that much the
simple instructions go better with
Moore's Law that's you know the number
of transistors is doubling every every
two years so we're gonna have you know
the you want to reduce the time to
design the microprocessor that may be
more important than the number of
instructions so I think we believed in
the that we were right that this was the
best idea then the question became in
these debates well yeah that's a good
technical idea but in the business world
this doesn't matter there's other things
that matter it's like arguing that if
there's a standard with the railroad
tracks and you've come up with a better
width but the whole world has covered
railroad tracks so your ideas
have no chance of success
commercial success it was technically
right but commercially it'll be
insignificant yeah this it's kind of sad
that this world the history of human
civilization is full of good ideas that
lost because somebody else came along
first with a worse idea and it's good
that in the computing world at least
some of these have well well you could
argue I mean there's probably still CISC
people that say yeah there still are but
what happened was what was interesting
Intel a bunch of the system companies
with CISC instruction sets of vocabulary
they gave up but not Intel what Intel
did to its credit because Intel's
vocabulary was in the in the personal
computer and so that was a very valuable
vocabulary because the way we distribute
software is in those actual instructions
it's in the instructions of that
instruction set so they then you don't
get that source code what the
programmers wrote you get after it's
been translated into the last level
that's if you were to get a floppy disk
or download software it's in the
instructions that instruction set so the
x86 instruction set was very valuable so
what Intel did cleverly and amazingly is
they had their chips in hardware do a
translation step they would take these
complex instructions and translate them
into essentially RISC instructions in
Hardware on the fly you know at at
gigahertz clock speeds and then any good
idea that RISC people had they could use
and they could still be compatible with
this really valuable PC
software base which also had very
high volumes you know a hundred million
personal computers per year so the CISC
architecture in the business world
actually won in this PC era so just
going back to the the time of designing
RISC when you design an instruction set
architecture do you think like a
programmer do you think like a
microprocessor engineer do you think
like an artist a philosopher do you think
in software and hardware I mean is it
art is it science yeah I'd say I think
designing a good instruction set is an
art and I think you're trying to balance
the the simplicity and speed of
execution with how easy it will be
for compilers to use it alright you're
trying to create an instruction set that
everything in there can be used by
compilers there's not things that are
missing
that'll make it difficult for the
program to run efficiently but
you want it to be easy to build as well
so it's that kind of so you're thinking
I'd say you're thinking hardware
trying to find a hardware software
compromise that'll work well and and
it's you know it's you know it's a
matter of taste right it's it's kind of
fun to build instruction sets it's not
that hard to build an instruction set
but to build one that catches on and
people use you know you have to be you
know fortunate to be the right place at
the right time or have a design that
people really like are you using metrics
is it quantifiable because you kind
of have to anticipate the kind of
programs that people will write
ahead of time so is that can you use
numbers can you use metrics can you quantify
something ahead of time or is this again
that's the art part where you're kind of
no it's a big a big change kind of
what happened I think from Hennessy's
and my perspective in the 1980s what
happened was going from kind of really
you know taste and hunches to
quantifiable in in fact he and I wrote a
textbook at the end of the 1980s called
Computer Architecture: A Quantitative
Approach I heard of that and and it's
it's the thing it it had a pretty big
big impact in the field because we went
from textbooks that kind of listed so
here's what this computer does and
here's the pros and cons and here's what
this computer does and pros and cons to
something where there were formulas
and equations where you could measure
things so specifically for instruction
sets what we do and some other fields do
is we agree upon a set of programs which
we call benchmarks and a suite of
programs and then you develop both the
hardware and the compiler and you get
numbers on how well your your computer
does given its instruction set and how
well you implemented it in your
microprocessor and how good your
compilers are and in computer
architecture we you know using
professors terms we grade on a curve
rather than grade on an absolute scale
so when you say you know this these
programs run this fast well that's kind
of interesting but how do you know it's
better well you compare it to other
computers at the same time so the best
way we know how to turn it into a
kind of more science and experimental
and quantitative is to compare yourself
to other computers of the same era that
have the same access the same kind of
technology on commonly agreed benchmark
programs so maybe to toss up two
possible directions we can go one is
what are the different trade-offs in
designing architectures we've been already
talking about CISC and RISC but maybe a
little bit more detail in terms of
specific features that you were thinking
about and the other side is what are the
metrics that you're thinking about when
looking at these trade-offs yeah well
let's talk about the metrics so during
these debates we actually had kind of a
hard time explaining convincing people
the ideas and partly we didn't have a
formula to explain it and a few years
into it we hit upon the formula that
helped explain what was going on and I
think if we can do this see how it works
orally so let's see if I can
do a formula orally so the so
fundamentally the way you measure
performance is how long does it take a
program to run a program if you have ten
programs and typically these benchmarks
were a suite because you'd want to have
ten programs so they could represent
lots of different applications so for
these ten programs how long they take to
run
now when you're trying to explain why it
took so long you could factor how long
it takes a program to run into three
factors one of the first one is how many
instructions did it take to execute so
that's the that's the what we've been
talking about you know the instructions
of the vocabulary
how many did it take all right the next
question is how long did each
instruction take to run on average so
you multiply the number of instructions
times how long it took to run and that
gets you the time okay so that's but now
let's look at this metric of how long
did it take the instruction to run well
it turns out the way we could build
computers today is they all have a clock
and you've seen this when you if you buy
a microprocessor it'll say 3.1 gigahertz
or 2.5 gigahertz and more gigahertz is
good well what that is is the speed of
the clock so 2.5 gigahertz turns out to
be 4 billionths of a second or 4
nanoseconds so that's the clock cycle
time but there's another factor which is
what's the average number of clock
cycles it takes per instruction so
it's number of instructions average
number of clock cycles and the clock
cycle time
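Written out, the three-factor breakdown described orally above is the following product. The wording of each factor follows the conversation; the layout as a single equation is added here for readability.

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{instructions}}{\text{program}}
  \times \frac{\text{clock cycles}}{\text{instruction}}
  \times \frac{\text{time}}{\text{clock cycle}}
\]
```

The units cancel from left to right, leaving time per program, which is why the three factors can be argued about separately.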
so in these RISC-CISC debates
they would concentrate on but RISC
needs to take more instructions
and we'd argue well maybe the clock
cycle is faster but what the real big
difference was was the number of clock
cycles per instruction that's
fascinating what about the mess the
beautiful mess of parallelism in the
whole picture parallelism which has to
do with say how many instructions could
execute in parallel and things like that
you could think of that as affecting the
clock cycles per instruction because
it's the average clock cycles per
instruction so when you're running a
program if it took a hundred billion
instructions and on average it took two
clock cycles per instruction and they
were four nanoseconds you could multiply
that out and see how long it took to run
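The multiplication suggested above can be carried out directly. This is a small sketch using the numbers as spoken (a hundred billion instructions, two clock cycles per instruction, four nanoseconds per cycle); the function names are invented for the example. For reference, a clock period in nanoseconds is one divided by the frequency in gigahertz, so a 2.5 GHz clock has a period of 0.4 ns, and the 4 ns period used in the spoken example corresponds to a 250 MHz clock.

```python
def execution_time_seconds(instruction_count, cycles_per_instruction, cycle_time_ns):
    """Time per program = instructions x cycles per instruction x time per cycle."""
    total_ns = instruction_count * cycles_per_instruction * cycle_time_ns
    return total_ns / 1e9  # convert nanoseconds to seconds


def cycle_time_ns(frequency_ghz):
    """Clock period in nanoseconds for a clock rate given in gigahertz."""
    return 1 / frequency_ghz


# Numbers as spoken in the conversation.
spoken_example = execution_time_seconds(100_000_000_000, 2, 4)
print(f"spoken example: {spoken_example:.0f} seconds")

# Same program and cycles per instruction, with the period of a 2.5 GHz clock.
at_2_5_ghz = execution_time_seconds(100_000_000_000, 2, cycle_time_ns(2.5))
print(f"with a 2.5 GHz clock: {at_2_5_ghz:.0f} seconds")
```

Either way the structure of the calculation is the same: changing any one of the three factors scales the running time proportionally.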
and there's all kinds of tricks to try
and reduce the number of clock cycles
per instruction but it turned out that
the way they would do these complex
instructions is they would actually
build what we would call an interpreter
in a simpler a very simple hardware
interpreter but it turned out that for
CISC instructions if you had to use one
of those interpreters it would be like
10 clock cycles per instruction where
the RISC instructions could be two so
there'd be this factor of five advantage
in clock cycles per instruction we'd have
to execute say 25 or 50 percent more
instructions so that's where the win
would come and then you could make an
argument whether the clock cycle times
are the same or not but pointing out
that we could divide the benchmark
results time per program into three
factors and the biggest difference
between RISC and CISC was the clock
cycles per instruction you execute a few more
instructions but the clock cycles per
instruction is much less and that was
what this debate once we made that
argument then people say okay I get it
and so we went from it was outrageously
controversial in you know 1982 to
maybe probably by 1984 or so people said oh
yeah
technically they've got a good argument
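
The argument above can be worked through with the numbers quoted in the conversation: roughly 10 clock cycles per instruction for interpreted CISC instructions, 2 for RISC, and 25 to 50 percent more instructions executed on RISC. This is a minimal sketch; the function name is hypothetical, and the default of equal clock cycle times is an assumption, since the conversation leaves that factor open.

```python
def risc_speedup(extra_instruction_fraction, cisc_cpi=10.0, risc_cpi=2.0, cycle_time_ratio=1.0):
    """Return how many times faster the RISC machine runs the same program.

    extra_instruction_fraction: 0.25 means RISC executes 25% more instructions.
    cycle_time_ratio: RISC cycle time divided by CISC cycle time (1.0 = same clock).
    """
    cisc_time = 1.0 * cisc_cpi * 1.0
    risc_time = (1.0 + extra_instruction_fraction) * risc_cpi * cycle_time_ratio
    return cisc_time / risc_time


for extra in (0.25, 0.50):
    print(f"{extra:.0%} more instructions: RISC is {risc_speedup(extra):.2f}x faster")
```

Under these assumptions the factor-of-five advantage in cycles per instruction shrinks to a net gain of about 4x with 25 percent more instructions and about 3.3x with 50 percent more.
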
what are the instructions in the RISC
instruction set just to get an intuition
okay
in 1995 I was asked by Scientific American
to predict the future of the microprocessor
so I and that well I'd seen these predictions and
usually people predict something
outrageous just to be entertaining right
and so my prediction for 2020 was you
know things are gonna be pretty much
they're gonna look very familiar to what
they are and they are and if you were to
read the article you know the things I
said are pretty much true the
instructions that have been around
forever are kind of the same and that's
the outrageous prediction actually yeah
given how fast computers and well you
know Moore's law was gonna go on we
thought for 25 more years you know who
knows but kind of the surprising thing
in fact you know Hennessy and I you know
won the ACM A.M. Turing Award for
both the RISC instruction set
contributions and for that textbook I
mentioned but you know we are surprised
that here we are 35 40 years later after
we did our work and the the conventional
wisdom of the best way to do instruction
sets is still those RISC instruction
sets that look very similar to what we
look like you know we did in the 1980s
so surprisingly there hasn't been
some radical new idea even though we
have you know a million times as many
transistors as we had back then but what
are the basic instructions and how did
they change over the years so we're
talking about addition subtraction these
are the specific so the the to get so
the things that are in a calculator
are in a computer so any of the buttons
that are in the calculator are in the computer
so the little button so if there's a
memory function key and like I said
those turn into putting something
in memory is called a store bringing
something back is called a load just as a
quick tangent when you say memory what
does memory mean well I told you there
were five pieces of a computer and if
you remember in a calculator there's a
memory key so you you want to have
intermediate calculation and bring it
back later
so you'd hit the memory plus key M plus
maybe and it would put that into memory
and then you'd hit an REM like return
instruction and it bring it back in the
display so you don't have to type it you
don't have to write it down bring it
back again so that's exactly what memory
is if you can put things into it as
temporary storage and bring it back when
you need it later
so that's memory and loads and stores
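
The calculator analogy for loads and stores can be sketched in a few lines. This is an illustrative model only: the class name, the single working register, and the four-slot memory are assumptions made for the example, and `store` simply copies the value rather than accumulating as a real M+ key would.

```python
class TinyMachine:
    """A calculator-like machine: one working register plus a small memory."""

    def __init__(self, memory_size=4):
        self.register = 0                 # plays the role of the calculator display
        self.memory = [0] * memory_size   # temporary storage slots

    def store(self, address):
        """Store: put the current value into memory (like pressing a memory key)."""
        self.memory[address] = self.register

    def load(self, address):
        """Load: bring a saved value back into the register (like recalling memory)."""
        self.register = self.memory[address]


machine = TinyMachine()
machine.register = 3 + 4      # an intermediate calculation
machine.store(0)              # save it so it does not have to be written down
machine.register = 5 * 2      # do some other work in the meantime
machine.load(0)               # bring the saved value back
print(machine.register)
```

After the final load the register holds the saved intermediate value again, which is the whole point of memory as temporary storage.
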
but the big thing the difference between
a computer and a calculator is that the
computer can make decisions and
amazingly the decisions are as simple as
is this value less than zero or is this
value bigger than that value so there's
and those instructions which are called
conditional branch instructions are what
give computers all their power if you were
in the early days of computing before
the what's called the general-purpose
microprocessor people would write these
instructions kind of in hardware and but
it couldn't make decisions it would just
it would do the same thing over and over
again with the power of having branch
instructions that can look at things and
make decisions automatically and it can
make these decisions you know billions
of times per second and amazingly enough
we can get you know thanks to advances
in machine learning we can we can create
programs that can do something smarter
than human beings can do but if you go
down to that very basic level it's the
instructions are the keys on the
calculator plus the ability to make
decisions of these conditional branch
instructions you know and all decisions
fundamentally can be reduced down to these
instructions yeah so in fact and so
you know going way back in the stack back
to you know we did four RISC projects at
Berkeley in the 1980s they did a couple
at Stanford in the 1980s in 2010 we
decided we wanted to do a new
instruction set learning from the
mistakes of those RISC architectures of
the 1980s and that was done here at Berkeley
almost exactly 10 years ago and the
people who did it I participated but
others Krste Asanović and others
drove it
they called it RISC-V to honor those
RISC the four RISC projects of the 1980s so
what does RISC-V involve so RISC-V is
another instruction set a vocabulary
it's learned from the mistakes of the
past but it still has if you look at the
there's a core set of instructions it's
very similar to the simplest
architectures from the 1980s and the big
difference about RISC-V is it's open so
I talked earlier about proprietary versus
open and kind of open source software so this
is an instruction set so it's a
vocabulary it's not it's not hardware
but by having an open instruction set we
can have open source implementations
open source processors that people can
use where do you see that going
it's the really exciting possibilities
but just like in the Scientific
American if you were to predict 10 20 30
years from now that kind of ability to
utilize open source instruction set
architectures like RISC-V what kind of
possibilities might that unlock yeah and
so just to make it clear because this is
confusing the specification of RISC-V is
something that's like in a text book
there's books about it so that's what
that's kind of defining an interface
there's also the way you build hardware
is you write it in languages that are
kind of like C but they're specialized
for hardware that gets translated into
hardware and so these implementations of
this specification are what are the open
source so they're written in something
that's called Verilog or VHDL but it's
put up on the web
like that you can see the C++ code for
Linux on the web so that's the open
instruction set enables open source
implementations of RISC-V so you can
literally build a processor using this
instruction set people are and people
are so what happened to us that the
story was this was developed here for
our use to do our research and we made
it we licensed under the Berkeley
Software Distribution license like a lot
of things get licensed here so other
academics use it they wouldn't be afraid
to use it and then about 2014 we started
getting complaints that we were using it
in our research in our courses and we
got complaints from people in industry
why did you change your instruction set
between the fall and the spring semester
and well we get complaints of additional
time why the hell do you care what we do
with our instruction set and then when
we talked to them we found out there was
this thirst for this idea of an open
instruction set architecture and they
had been looking for one they stumbled
upon ours at Berkeley thought it was boy
this looks great we should use this one
and so once we realize there is this
need for an open instruction set
architecture we thought that's a great
idea and then we started supporting it
and tried to make it happen so this was
you know kind of we accidentally stumbled
into this need and our timing
was good and so it's really taking off
there's a you know universities are good
at starting things but they're not good at
sustaining things so like Linux has the
Linux Foundation there's a RISC-V
Foundation that we started there's
there's an annual conference and the
first one was done I think January 2015
and the one that was just last December
in it you know it had 50 people at it
and the last one last December had kind
of 1,700 people were at it and the
companies excited all over the world
so if predicting into the future you
know if we were doing 25 years I would
predict that RISC-V will be you know
possibly the most popular instruction
set architecture out there because it's
a pretty good instruction set
architecture and it's open and free and
there's no reason
lots of people shouldn't use it and
there's benefits just like Linux is so
popular today compared to 20 years ago I
and you know the fact that you can get
access to it for free you can modify it
you can improve it for all those same
arguments and so people collaborate to
make it a better system for all
everybody to use and that works in
software and I expect the same thing
will happen in hardware so if you look
at ARM Intel MIPS if you look at just
the lay of the land and what do you
think just for me because I'm not
familiar how difficult this kind of
transition would how much challenges
this kind of transition would entail do
you see let me ask my dumb question
another one no that's I know where
you're headed well there's a budget I
think the thing you point out there's
there's these proprietary popular
proprietary instruction sets the x86 and
so how do we move to RISC-V
potentially in sort of in the span of
five 10 20 years a kind of a unification
and given that the devices the kind of
way we use devices IOT mobile devices
and and the cloud keeps changing well
part of it a big piece of it is the
software stack and what right now
looking forward there seem to be three
important markets there's the cloud and
then the cloud is simply companies like
Alibaba and Amazon and Google Microsoft
having these giant data centers with
tens of thousands of servers and
maybe a hundred of these data
centers all over the world and that's
what the cloud is so the computer that
dominates the cloud is the x86
instruction set so the instruction
sets used in the
cloud are the x86 almost 100% of
that today is x86 the other big thing
are cell phones and laptops those are
the big things today I mean the PC is
also dominated by the x86 instruction
set but those
sales are dwindling you know there's
maybe 200 million PCs a year and there's
I think one and a half billion phones a
year there's numbers like that so for
the phones that's dominated by ARM and
now and a reason that I talked about the
software stacks and then the third
category is Internet of Things which is
basically embedded devices things in
your cars and your microwaves everywhere
so what's different about those three
categories is for the cloud the software
that runs in the cloud is determined by
these companies Alibaba Amazon Google
Microsoft so that they control that
software stack for the cell phones
there's both for Android and Apple the
software they supply but both of them
have marketplaces where anybody in the
world can build software and that
software is translated or you know
compiled down and shipped in the
vocabulary of ARM so that's the the
what's referred to as binary compatible
because the actual it's the instructions
are turned into numbers binary numbers
and shipped around the world so and
just a quick interruption so ARM
what is ARM is ARM an instruction set
like a RISC-based yeah it's a RISC-based
instruction set a proprietary one ARM
stands for Advanced RISC Machine ARM is
the name of the company so it's a
proprietary RISC architecture so and
it's been around for a while and you
know it's surely the most popular
instruction set in the world right now
every year billions of chips are
using the ARM design in this post PC era
was it one of the early RISC
adopters of the RISC yeah yeah the first
ARM goes back I don't know 86 or so so
Berkeley and Stanford did their work in the
early 80s the ARM guys needed an
instruction set and they read our papers
and it heavily influenced them so
getting back to my story what about
Internet of Things well software's not
shipped in Internet of Things it's the
the embedded device people control that
software stack so the
opportunities
for RISC-V everybody thinks are
in the internet of things embedded
things because there's no dominant
player like there is in the cloud or the
smartphones and you know it's it's
doesn't have a lot of licenses
associated with and you can enhance the
instruction set if you want and it's a
in and people have looked at instruction
sets and think it's a very good
instruction set so it appears to be very
popular there it's possible that in the
cloud people those companies control
their software stacks so that it's
possible that they would decide to use
RISC-V if we're talking about ten
and twenty years in the future the one
that would be harder would be the cell
phones since people ship software in the
ARM instruction set that you'd think would be
the more difficult one but if RISC-V
really catches on and you know you
could in a period of a decade you can
imagine that changing over to give a
sense why RISC-V or ARM has dominated
you mentioned these three categories why
has why did ARM dominate why does it
dominate the mobile device space and
maybe the my naive intuition is that
there are some aspects of power
efficiency that are important yeah that
somehow come along with risk well part
of it is for these old CISC instruction sets
like in the x86 it was more
expensive to these for the you know
they're older so they have disadvantages
in them because they were designed forty
years ago but also they have to
translate in hardware from CISC
instructions to RISC instructions on
the fly and that costs both silicon area
that the chips are bigger to be able to
do that and it uses more power so ARM
which has you know followed this
RISC philosophy is seen to be much more
energy-efficient and in today's computer
world both in the cloud and cell phones
and you know things it isn't the
limiting resource isn't the number of
transistors you can fit in the chip it's
what how much power can you dissipate
for your application so by having a
reduced instruction set it's
possible to have
a simpler hardware which is more energy
efficient and energy efficiency is
incredibly important in the cloud when
you have tens of thousands of computers
in a datacenter you want to have the
most energy-efficient ones there as well
and of course for embedded things
running off of batteries you want those
to be energy efficient in the cell
phones too so it I think it's believed
that there's an energy disadvantage of
using these more complex instruction set
architectures so the other aspect of
this is if we look at Apple Qualcomm
Samsung Huawei all use the ARM
architecture and yet the performance of
the systems varies I mean I don't know
whose opinion you take on but you know
Apple for some reason seems to perform
better in terms of these implementations of the
architecture so where's the magic and
how does that happen yeah so what ARM
pioneered was a new business model as
they said well here's our proprietary
instruction set and we'll give you two
ways to do it either we'll give you
one of these implementations written in
things like C called Verilog and you can
just use ours well you have to pay money
for that not only pay we'll give you the
you know we'll license you to do that or
you could design your own and so we're
talking about numbers like tens of
millions of dollars to have the right to
design your own since they it's the
instruction set belongs to them so Apple
got one of those the right to build
their own most of the other people who
build like Android phones just get one
of the designs from ARM and don't do it
themselves
so Apple developed a really good
microprocessor design team they you know
acquired a very good team that was
building other microprocessors and
brought them into the company to build
their designs so the instruction sets
are the same the specifications are the
same but their hardware design is much
more efficient than I think everybody
else's and that's given Apple an
advantage in the marketplace and that
the iPhones tend to be faster than
most everybody else's phones that are
they
it'd be nice to be able to jump around
and kind of explore different little
sides of this but let me ask one sort of
romanticized question what to you is the
most beautiful aspect or idea of RISC
instruction set or instruction sets for
this you know what I think that you know
I I'm you know I I was always attracted
to the idea of you know small is
beautiful where the temptation in
engineering it's kind of easy to make
things more complicated it's harder to
come up with a it's more difficult
surprisingly to come up with a simple
elegant solution and I think that
there's a bunch of small features of of
RISC in general that you know where you
can see this examples of keeping it
simpler makes it more elegant
specifically in RISC-V which you know
I'm I was kind of the mentor in the
program but it was really driven by
Krste Asanović and two grad
students Andrew Waterman and Yunsup Lee
is they hit upon this idea of having a
subset of instructions a nice simple
instruction subset instructions like
40-ish instructions that all software
the software stack for RISC-V can run just on
those forty instructions and then they
provide optional features that could
accelerate the performance instructions
that if you needed them could be very
helpful but you don't need to have them
and that that's a new really a new idea
so RISC-V has right now maybe five
optional subsets that you can pull in
but the software runs without them if
you just want to build the just the core
forty instructions that's fine you can
do that so this is fantastic
educationally is so you can explain
computers you only have to explain forty
instructions and not thousands of them
also if you invent some wild and crazy
new technology like you know biological
computing you'd like a nice simple
instruction set and you can with RISC-V if
you implement those core instructions
you can run you know really interesting
programs on top of that so this idea of
a co

Summary

# The Evolution of Computing: From RISC to RISC-V and the End of the Moore's Law Era with David Patterson

### Executive Summary
This video is an in-depth interview with David Patterson, Turing Award winner and professor of computer science at UC Berkeley, on the evolution of computer architecture over the past 50 years. The discussion covers the history of major innovations such as RISC and RAID, the industry's transition toward open-source hardware (RISC-V), the impact of the slowdown of Moore's Law, and the future of computing with domain-specific architectures for artificial intelligence (AI). Patterson also shares philosophical insights on the synergy between teaching and research, and on the importance of maintaining human relationships over the course of a career.

### Key Takeaways
*   **The RISC Revolution**: The paradigm shift from *Complex Instruction Set Computers* (CISC) to *Reduced Instruction Set Computers* (RISC) in the 1980s changed the industry landscape; today 99% of new chips use a RISC architecture.
*   **The Slowdown of Moore's Law**: The era of exponential gains in computer speed has ended; the focus of innovation has shifted to power efficiency and specialized accelerators (such as those for machine learning).
*   **The Rise of RISC-V**: RISC-V, an open instruction set developed at Berkeley, is free to use and can be extended. Patterson predicts that within 25 years it could become the most popular instruction set architecture.
*   **The Importance of Benchmarks**: Quantitative measurement and standard benchmarks (such as MLPerf) are crucial for avoiding misleading marketing claims and for driving real innovation in the industry.
*   **Career Philosophy**: Patterson stresses that success in research and success in teaching support each other, and that the measure of a successful life is not material wealth but the positive impact on other people and the relationships built along the way.

---

### Detailed Breakdown

#### 1. The History of Computing and the Basics of Architecture
*   **Guest Profile**: David Patterson is a Turing Award winner and a professor at Berkeley. He is known for pioneering the RISC processor architecture (used in 99% of new chips), for co-creating RAID storage, and for the textbook he co-wrote with John Hennessy.
*   **The Evolution of the Microprocessor**: The biggest change of the past 50 years is the invention of the microprocessor. Computers went from filling a room to fitting in a pocket (phones), while becoming millions of times faster, cheaper, and more plentiful.
*   **Moore's Law**: Gordon Moore (a Berkeley graduate) predicted that the number of transistors on a chip would double every two years. That prediction drove the industry to put computers everywhere (cars, grocery stores).
*   **Layers of Abstraction**: Computer science uses layers of abstraction (like an onion) to manage complexity. Developers only need to understand the interface at their own layer, without knowing the details beneath it.
*   **Open Source**: The open-source movement, which began in software, is now spreading to hardware, allowing anyone to study and modify chip designs.

#### 2. RISC vs. CISC and Instruction Set Design
*   **The RISC vs. CISC Debate**: In the 1980s, conventional wisdom favored CISC (*Complex Instruction Set Computers*) with complicated instructions. Patterson and his colleagues at Berkeley and Stanford advocated RISC (*Reduced Instruction Set Computers*), which uses simple instructions that execute faster.
*   **The Role of C and UNIX**: The UNIX operating system was written in C, showing that a *compiler* could handle software complexity, so the hardware could be simplified (the RISC philosophy).
*   **Intel's Strategy**: Although CISC was considered to have lost on technical grounds, Intel won the PC market by preserving compatibility with its x86 *software*. Its chips translate CISC instructions into RISC-like instructions in hardware on the fly.
*   **The Quantitative Approach**: The RISC vs. CISC debate was settled through a quantitative approach using the performance equation (instruction count, clock cycles per instruction, clock cycle time) and *benchmarks*, moving the discussion from subjective "taste" to objective data.

#### 3. RISC-V: The Future of Open-Source Hardware
*   **Origins**: RISC-V was developed at Berkeley around 2010 for research and teaching. Its name honors the four Berkeley RISC projects of the 1980s (RISC I, RISC II, SOAR, and SPUR), making it the fifth.
*   **Advantages**: Unlike x86 (Intel/AMD) and ARM, which are *proprietary* and require paid licenses, RISC-V is an open *instruction set* that anyone can use for free.
*   **Industry Adoption**: Originally intended for academic use, RISC-V is now being adopted by industry because of the demand for an open standard. The RISC-V conference grew from about 50 attendees at the first event in 2015 to about 1,700 within a few years.
*   **Market Segments**:
    *   *Cloud*: Dominated by x86, but the large companies (Alibaba, Amazon, Google, Microsoft) could switch because they control their own *software stack*.
    *   *Mobile*: Dominated by ARM (a proprietary RISC architecture). Hard to displace because apps are shipped as ARM binaries.
    *   *IoT*: The most promising area for RISC-V because there is no dominant player, ARM licensing is expensive, and there is a need for hardware flexibility.

#### 4. Energy Efficiency and ARM's Dominance
*   **Why ARM Won in Phones**: ARM dominates the mobile market because of its power efficiency. x86 spends silicon area and power translating its older CISC instructions into RISC-like instructions, whereas ARM uses simple instructions directly.
*   **The Power Limit**: The main limit in modern computing is no longer the number of transistors but power dissipation (heat). Energy efficiency is crucial for *data centers* and battery-powered devices.
*   **Apple's Strategy**: Apple leads in the performance of its ARM-based chips, as seen in the iPhone, because it designs its own *hardware* implementation (paying for the right to do so, at a cost on the order of tens of millions of dollars), while most competitors simply use designs licensed from ARM.

#### 5. The Slowdown of Moore's Law and the Accelerator Era
*   **The End of Free Speedups**: Moore's Law has slowed dramatically. CPU performance now improves by only a few percent per year, rather than doubling as it used to.
*   **Domain-Specific Architecture (DSA)**: To make up for the lost general-purpose speedups, the industry is turning to specialized accelerators, especially for machine learning (ML).
*   **The Machine Learning Revolution**: ML relies on matrix multiplication.

## Conclusion & Closing Message
This interview makes the case that the computing landscape has shifted fundamentally, from chasing speed through more transistors toward efficiency through specialized architectures and open-source collaboration. David Patterson argues that continued innovation requires a quantitative approach and open standards such as RISC-V as the Moore's Law era ends. Beyond the technical material, he is a reminder that real success is measured not only by intellectual achievement but also by the positive impact we have on others through teaching and through the relationships we build.


file updated 2026-02-13 13:24:58 UTC