Guido van Rossum: Python and the Future of Programming | Lex Fridman Podcast #341
-DVyjdw4t9I • 2022-11-26
Kind: captions
Language: en
can you imagine possible features
that python 4.0 might have that would
necessitate the creation of the new 4.0
given the amount of
pain and joy
suffering and Triumph that was involved
in the move between version 2 and
version 3.
the following is a conversation with
Guido van Rossum his second time on this
podcast he is the creator of the Python
programming language and is Python's
emeritus BDFL benevolent dictator for
life this is the Lex Fridman podcast to
support it please check out our sponsors
in the description and now dear friends
here's Guido van Rossum
python 3.11 is coming out very soon in it
CPython is claimed to be 10 to 60 percent
faster how did you pull that off and
what's CPython CPython is the last
python implementation standing also the
first one that was ever created the
original python implementation that I
started over 30 years ago so what does
it mean that python the programming
language is implemented in another
programming language called C what kind
of audience do you have in mind here
people who know programming no there's
somebody on a boat that's into fishing
and have never heard about programming
but also some world-class programmers
you're gonna have to speak to both
imagine a boat with two people one of
them has not heard about programming is
really into fishing and the other one
is like uh an incredible Silicon Valley
programmer that's programmed in
everything C C++ python rust
Java he knows the entire history of
programming languages so you're gonna
have to speak to both I imagine that
boat in the middle of the ocean yes I'm
gonna please the guy who knows how to
fish first yes please
he seems like the most useful in the
middle of the ocean you got you gotta
make him mad I'm sure he has a cell
phone so uh he's probably very
suspicious about what goes on in that
cell phone but he must have heard that
inside his cell phone is a tiny computer
and a programming language is computer
code that tells the computer what to do
it's a very low level
language it's zeros and ones and then
there's assembly and then oh yeah we
don't talk about these really low levels
because those just confuse people I mean
when we're talking about human language
we're not usually talking about vocal
tracts and how you position your tongue
I was talking yesterday about how when
you have a Chinese person and they speak
English uh this is a bit of a stereotype
they often don't know or they can't
seem to make the difference well
between an l and an r and I have a
theory about that and I've never checked
this with linguists
uh that it probably has to do with the
fact that in Chinese there is not really
a difference and it could be that there
are regional variations in how
native Chinese speakers pronounce that
one sound that
sounds like L to some of
them like R to others so it's both the
sounds you produce with your mouth
throughout the history of your life and
what you're used to listening to I mean
every language has that Russian has
exactly the Slavic languages have sounds
like the letters
like uh Americans or English speakers
don't seem to know the sounds
they seemed uncomfortable with that
sound yeah so I'm sure oh yes okay so
we're not we're not going to the shapes
of tongues and the sounds that the mouth
can make fine words similarly we're not
going into the ones and zeros or machine
language I would say a programming
language is a list of instructions like
a cookbook recipe
that sort of tells you how to do a
certain thing like make a sandwich well
acquire a loaf of bread cut it in slices
uh take two slices uh put mustard on one
put the jelly on the other or something
then add the meat then add the cheese
I've heard that science teachers can
actually uh do great stuff with recipes
like that and trying to interpret their
students instructions incorrectly until
the students are completely unambiguous
about it
with language see that's the difference
between
natural languages and programming
languages I think ambiguity is a feature
not a bug in human spoken languages like
uh
that's the dance of communication
between humans
well for lawyers ambiguity certainly is
a feature
uh for plenty of other cases uh the
ambiguity is is not much of a feature
but we work around it of course well
what's more important is context
so with context the Precision of the
statement becomes more and more concrete
right but you know when you say I love
you to a person that matters a lot to
you the person doesn't try to compile
that statement and return an error
saying please define love
right no but I imagine that my wife and
my son uh interpret it very differently
yes even though it's the same three
words but imprecisely still
oh for sure
lawyers never had a lot of follow-up
questions for you nevertheless the
context is already different different
in that case yes fair enough so that's
that's a programming language is uh
ability to unambiguously State a recipe
actually let's go back let's go to PEP 8
you go through in PEP 8 the style guide
for python code some ideas of what this
language should look like
feel like read like and the big idea
there is that code readability counts
what does that mean to you and how do we
achieve it so this recipe should be
readable that's a thing between
programmers
because on the one hand we always
explain the concept of programming
language as computers need instructions
and computers are very dumb and they
need very precise instructions because
they don't have much context in in fact
they have lots of context but their
their context is very different
but what we've seen emerge during the
development of software starting in the
probably in the late 40s
is that software is a very social
activity a software developer is not a
mad scientist who sits alone in his lab
writing brilliant code
software is developed by teams of people
uh even the mad scientist sitting alone
in his lab can't type fast enough to
produce enough code so that by the time
he's done with his coding he still
remembers what the first few lines he
wrote mean so even the mad scientist
coding alone in his lab would would be
sort of wise to
adopt conventions on how to format
the instructions that he gives to the
computer so that the thing is there is a
difference between a cookbook recipe and
a computer program
the cookbook recipe the author of
the cookbook writes it once
and then it is printed in 100,000 copies
and then lots of people in their
kitchens try to recreate that recipe
that that particular
pie or dish from the recipe and so there
the
the goal of the cookbook author is to
make it clear
to the human reader of the recipe the
human amateur Chef in most cases
when you're writing a computer program
you have two audiences at once
it needs to
tell the computer what to do
but it also is useful if that program is
readable by other programmers
because computer software unlike the
typical recipe for a cherry pie is so
complex
that you don't get all of it right at
once
you end up with the activity of
debugging and you end up with the
activity of so debugging is
trying to figure out why your code
doesn't run the way you thought it
should run that means there could be
stupid little errors or it could be big
logical errors it could be anything
spiritual yeah it could be anything from
a typo to uh a wrong choice of algorithm
to
building something that does what you
tell it to do but that's not useful
yeah it seems to work really well 99 percent of
the time but does weird things one
percent of the time on some edge cases
that's pretty much all software nowadays
all good software right well yeah for
for bad software that 99 percent goes down
a lot so but it's not just about the
complexity of the program it's like you
said it is a social endeavor
in that you're constantly improving that
recipe for the cherry pie but you're
sort of you're in a group of people
improving that recipe or the mad
scientist is improving the recipe that
he created a year ago and making it
better
or adding adding something he decides
that he wants a I don't know he wants
some decoration on his pie or icing or
so there's broad philosophical things
and there's specific advice on style so
first of all the thing that people first
experience when they look up python
there is a it is very readable
but there's also like a spatial
structure to it
can you explain the indentation style of
python and what is the magic to it
spaces are important for readability of
any kind of text
if you take a cookbook recipe
and you remove all the sort of
all the bullets and other markup
and you just crunch all the text
together maybe you leave the spaces
between the words but that's all you
leave
when you're in the kitchen trying to
figure out oh what are the ingredients
and what are the steps
and where does this step end and the
next step begin you're going to have a
hard time if it's if it's just one solid
block of text
on the other hand what what a typical
cookbook does if the paper is not too
expensive
each recipe starts on its own page maybe
there's a picture next to it the list of
ingredients comes first
uh there's a standard notation the
there's there's shortcuts so that you
don't have to sort of write two
sentences on how you have to cut the
onion because there are only three ways
that people ever cut onions in a kitchen
small medium and in slices or something
like that
right none of my examples make any sense
to real Cooks of course but yeah
we're talking to programmers with a
metaphor of cooking I love it
um but there is a strictness to the
spacing that python defines so there's
some
a looser thing some stricter things but
the four spaces for the
for the indentation is really
interesting it it really
um it really defines what the language
looks and feels like because indentation
sort of taking a block of text and then
having inside that block of text
a smaller block of text that is indented
further as sort of a group it's it's
it's like you have a a bulleted list
in a complex business document and
inside some of the bullets are other
bulleted lists you will indent those too
if each bulleted list is indented
several inches then at two levels deep
there's no no space left on the page to
put any of the words of the text so you
can't indent too far on the other hand
if you don't indent at all you can't tell
whether something is a top level bullet
or a second level bullet or a third
level bullet so you have to have
some compromise and uh based on Ancient
conventions
and the sort of the typical width of a
computer screen in the 80s
uh and all sorts of things sort of
we we came up with sort of four spaces
as a compromise I mean there there are
groups there are large groups of people
who code with uh two spaces per indent
level for example the Google style guide
uh all the Google python code and I
think also all the Google C++
code is indented with only two spaces
per block if you're not used to that
it's harder to at a glance
understand the code because the the sort
of the the high level structure is
determined by the indentation on the
other hand there are there are other
programming languages where the
indentation is uh eight spaces or a
whole tab stop in in sort of classic
Unix and to me that looks weird because
you you sort of after three indent
levels you've you've got no room left
well there's some languages where the
indentation is a recommendation
it's a stylistic one the code compiles
even without any indentation
and then in python indentation is really a
fundamental part of the language right
it doesn't have to be four spaces so you
you can code python with two spaces per
block or six spaces or 12 if you
really want to go wild but
sort of everything
that belongs to the same block needs to
be indented the same way
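A minimal sketch of that rule, using made-up function names and source strings that are not from the conversation: Python accepts any indent width as long as it is consistent within a block, and rejects a block whose lines do not line up.

```python
# Hypothetical snippets, kept as strings so the bad one can be compiled safely.
two_spaces = "def f(x):\n  y = x + 1\n  return y\n"
four_spaces = "def g(x):\n    y = x + 1\n    return y\n"
# The return line is indented differently from the rest of its block.
mismatched = "def h(x):\n    y = x + 1\n  return y\n"


def compiles(source):
    """Return True if Python accepts the source, False if it is rejected."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


print(compiles(two_spaces), compiles(four_spaces), compiles(mismatched))
# prints: True True False
```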
in practice in most other languages
people recommend doing that anyway if
you look at
C or rust or C++
all those languages Java don't have a
requirement of indentation
but except in extreme cases
they're just as anal about having their
code properly indented so any IDE
with syntax highlighting that works with
Java or C++ will yell at you
aggressively if you don't do proper
indentation they'd suggest the proper
indentation for you like uh in C you
type a few words and then you type a
curly brace which is their notion
of sort of beginning an indented block
uh then you hit return and then it
automatically indents four or eight
spaces depending on uh your your style
preferences or how your editor is
configured was there a possible Universe
in which you considered having braces in
Python absolutely yeah well there's a 60
40 70 30 in your head uh uh what was the
trade-off for a long time I was actually
convinced that the indentation was just
better
uh
without context I would still claim that
indentation is better
uh it reduces clutter however
as I started to say earlier context is
almost everything
and in the context of coding
most programmers are familiar with
multiple languages even if they're only
good at one or two
and apart from Python and maybe Fortran
I don't know how that's written these
days anymore but all the other languages
Java rust C C++ JavaScript
typescript Perl are all using curly
braces uh to sort of indicate blocks and
so python is the odd one out so it's a
radical idea do you still as a radical
Renegade revolutionary do you still
stand behind this idea of space of uh
indentation versus braces
like what what can you dig into it a
little bit more
why you still stand behind indentation
because context is not the whole story
history in in a sense provides more
context so for python
there's no chance that we can switch
python is using curly braces for
something else dictionaries mostly
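As a small illustration of that point (the variable and function names here are invented for the example), curly braces in Python already build dictionary and set values, while blocks are marked only by a colon and indentation:

```python
# Curly braces are literal syntax for dictionaries and sets.
ages = {"alice": 30, "bob": 25}   # a dictionary
primes = {2, 3, 5}                # a set
empty = {}                        # an empty dictionary, not an empty set


def describe(n):
    # The function body and the if body are delimited by indentation alone.
    if n in primes:
        return "prime"
    return "not listed"


print(type(empty).__name__, describe(3), describe(4))
# prints: dict prime not listed
```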
we would get in trouble if we wanted to
switch just like you couldn't redefine C
to use indentation
even if you agree that it that
indentation sort of
in a Greenfield environment would be
better
you can't change that kind of thing in a
language yeah it's hard enough to reach
agreement over over much more Minor
Details maybe I mean in the past in
Python we did have a big debate about
tabs versus spaces and four spaces
versus fewer or more
and we sort of came up with
a recommended standard and sort of
options for people who want to be
different
but yes I guess the thought experiment
I'd like you to consider is if you could
travel back through time when the when
the compatibility is not an issue and you
started python all over again
can you make the case for uh indentation
still
well it frees up a pair of
matched brackets of which there are
never enough in the world
uh for other purposes
really makes the language slightly
sort of
easier to grasp for people who don't
already know
another programming language
because the sort of one of the things
and I I mostly got this from my mentors
who
taught me programming language design in
the early 80s when you're teaching
programming
for for the the total newbie who has not
coded before not in any other
language
uh a whole bunch of Concepts in
programming are very alien or
sort of
new and and maybe very interesting but
also distracting and confusing and there
are many different things you have to
learn you have to sort of
in a typical
13-week programming course you have to
if it's like really
learning to program from scratch you
have to cover algorithms you have to
cover data structures you have to cover
syntax you have to cover variables Loops
functions recursion classes
Expressions operators there are so many
Concepts if you you sort of
if you can spend a little less time
having to worry about the syntax
the the classic example was often
oh the compiler complains every time I
put a semicolon in the wrong place or I
forget to put a semicolon
uh python doesn't have semicolons in
that sense so you can't forget them and
you're also not
sort of misled into putting them where
they don't belong because you don't
learn about them in the first place
the flip side of that is forcing the
strictness onto the beginning programmer
to teach them that programming
values attention to details you don't
get to just write the way you write in
English they have other details that
they have to pay attention to so I think
they'll they'll still get the message
about uh
paying attention to details the
interesting design choice so I still
program quite a bit in PHP and I'm sure
there's other languages like this but
the dollar sign before a variable
that was always an annoying thing for me
it didn't quite fit into my
understanding of why this is good for a
programming language I'm not sure if you
ever thought about that one
that is a historical thing there is a
whole lineage of programming languages
PHP is one Perl was one
the Unix shell
uh is one of the oldest or or all the
different shells
the dollar was invented for that purpose
because the very earliest shells had a
notion of scripting but they did not
have a notion of parameterizing the
scripting
right and so a script is just a few
lines of text
where each line of text is a command
that is read by a very primitive command
processor that then sort of takes the
first word on the line as the name of a
program and passes all the all the rest
of the line as text into the program for
the program to figure out what to do
with as arguments
and so by the time scripting was
slightly more mature than the very first
script
there was a convention that just like
the first word of line is uh the name of
the program the following words
uh could be names of files
input.txt
output.html things like that
the next thing that happens is oh it
would actually be really nice if we
could have variables and especially
parameters for scripts parameters are
usually what starts this process
but now you have a problem because you
can't just
say the parameters are x y and z
and so now we say let's say x is
the input file and Y is the output file
and let's forget about Z for now I have
my program
and I write program X Y well that
already has a meaning because that
presumably means
X itself is the file
it's a file name it's not a variable
name
uh and so
the inventors of things like the
Unix shell and I'm sure job command
language at IBM before that
uh
had to use
something that made it clear to the
script processor
here is an X that is not actually the
name of a file which you just pass
through to the to the program you're
running here is an X that is the name of
a variable yeah and
when you're writing a script processor
you try to keep it as simple as possible
because certainly in the 50s and
60s
uh the thing that interprets the script
was itself a very had to be a very small
program because it had to fit in a very
small part of memory and so saying oh
just look at each character and if you
see a dollar sign you jump to another
section of the code and then you gobble
up characters say until the next
space or something and you say that's
the variable name
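The scanning loop described here can be sketched in a few lines. This is a toy illustration, not how any real shell is implemented: the function name `expand`, the variable table, and the choice to replace unknown names with an empty string are all assumptions made for the example.

```python
def expand(line, variables):
    """Copy a command line, replacing each $name with its value.

    Assumption for this sketch: a name runs from the dollar sign to the
    next space, and an unknown name expands to an empty string.
    """
    out = []
    i = 0
    while i < len(line):
        if line[i] == "$":
            # Saw a dollar sign: gobble up characters until the next space.
            j = i + 1
            while j < len(line) and line[j] != " ":
                j += 1
            name = line[i + 1:j]
            out.append(variables.get(name, ""))
            i = j
        else:
            # Anything else is fixed text and passes through unchanged.
            out.append(line[i])
            i += 1
    return "".join(out)


params = {"x": "input.txt", "y": "output.html"}
print(expand("program $x $y", params))  # prints: program input.txt output.html
print(expand("program x y", params))    # prints: program x y
```

Without the dollar sign the words x and y stay literal file names, which is the ambiguity the marker was introduced to resolve.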
and so it was was sort of
invented as
a clever way to make parsing of things
that contain both variable
and fixed parts
very easy in a very simple script
processor it also helps even then it
also helps the human
author and the human reader of the
the script
to quickly see oh
20 lines down in the script I see a
reference to x y z Oh it has a dollar in
front of it so now we know that x y z
must be one of the parameters of the
script well this is fascinating several
things to say which is
the leftovers from the simple script
processor languages are now in code
bases like behind Facebook or behind
most of the back end I think PHP
probably still runs most of the back end
of the internet oh yeah yeah I think
there's a lot of it in Wikipedia too for
example yeah it's funny that those
decisions are not funny it's fascinating
that those decisions permeate Through
Time
just like biological systems right
I mean that the sort of the inner
workings of DNA
have been stable for well I don't know
how long it was like 300 million years
half a billion years yeah and there
there are all sorts of weird quirks
there
that don't make a lot of sense if you
were to design
a system like self-replicating molecules
from scratch but that system has a lot
of interesting resilience it has
redundancy that results like it messes
up in interesting ways that still is
resilient when you look at the system
level of the organism code doesn't
necessarily have that a program a
computer programming code you'd be
surprised
how much resilience modern code has
I mean if you if you look at the number
of bugs per line of code
even in in very well tested
code that in practice works just fine
there are actually lots of things that
don't work fine
and there are error correcting or
self-correcting mechanisms at many
levels including probably the user of
the code well in the end the user who
sort of is told well you got to reboot
your your PC is part of that system and
a slightly uh less drastic thing is
reload the page which we all know how to
do without thinking about it when
something weird happens you you try to
reload a few times before you say oh
there's something really weird okay or
try to click the button again if the
first time didn't work
well yeah that we should all have
learned not to do that because that's
probably just gonna turn the light back
off yeah true so do it three times
that's the that's the right lesson so uh
and I wonder how many people actually
like the dollar sign like you said it is
documentation so to me it's whatever the
opposite of syntactic sugar is syntactic
poison to me it is such a pain in the
ass that I have to type in a dollar sign
also super error prone
so it's not self-documenting it's it's
like a bug generating thing it is a kind
of documentation that's the pro and the
con is it's a source of a lot of bugs
but actually I have to ask you um
this is a really interesting idea of
bugs per line of code
if you look at all the computer systems
out there from the code that runs
nuclear weapons to the code that runs
all the amazing companies that you've
been involved with to the code that
runs Twitter and Facebook and Dropbox
and Google and Microsoft Windows and so
on
and we like laid out
wouldn't that be a cool like table bugs
per line of code and what would that
let's let's put like actual companies
aside do you think we'd be surprised by
the number we see there for all these
companies
that depends on whether you've ever read
about research that's been done in this
area before
and
I didn't know the the re the the last
time I
I saw some research like that that was
probably in the 90s and the research
might have been done in the 80s but the
the conclusion was across a wide range
of different software different
languages
different companies
different development styles
the number of bugs is always
I think it's in the order of about one
bug per thousand lines in sort of
mature software that that is considered
interesting as good as it gets can I
give you some facts here there's a lot
of good papers so you said mature
software right so here's uh
a report from a uh like programming
analytics company
now this is from a developer perspective
let me just say what it says because
this is very weird and surprising on
average a developer creates 70 bugs per
1000 lines of code
15 bugs per 1000 lines of code find
their way to the customers
but this is in the software they've oh I
was I was wrong by an order okay there
fixing a bug takes 30 times longer than
writing a line of code
that I can believe yeah 75 percent of a
developer's time is spent on debugging
um that's for an average developer that
they Analyze This 15. argue
1500 hours a year in US alone
113 billion dollars is spent annually on
identifying and fixing bugs
imagine this is marketing literature for
someone who claims to have a golden
bullet or a silver bullet that makes all
that investment in fixing bugs go away
but that that is usually yeah not going
to yeah that's not gonna happen well
they're uh I mean they're referencing a
lot of stuff of course but it is a page
uh that is you know there's a contact us
button at the bottom presumably if you
just spend a little bit less than 100
billion dollars we're willing to solve
the problem for you
right and there's also a report on Stack
Exchange Stack Overflow on the exact
same topic but when I open it up at the
moment the page says Stack Overflow is
currently offline for maintenance oh
it's ironic yes uh by the way their
error page is awesome anyway
I mean can you believe that number of
bugs oh absolutely isn't that scary that
70 bugs per 1000 lines of code so even
10 bugs per thousand lines well that's
about one bug after every 15 lines and
that's when you're first typing it in
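The arithmetic behind that estimate can be checked quickly. This is only a back-of-the-envelope sketch using the figures quoted in the conversation; the helper name `lines_per_bug` is made up for the example.

```python
def lines_per_bug(bugs_per_thousand_lines):
    """Average number of lines of code per bug at a given bug rate."""
    return 1000 / bugs_per_thousand_lines


# Rates quoted in the conversation, in bugs per 1000 lines of code.
print(round(lines_per_bug(70), 1))  # as first typed: 14.3 lines per bug
print(round(lines_per_bug(15)))     # reaching customers: 67 lines per bug
print(round(lines_per_bug(1)))      # the remembered figure for mature software: 1000
```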
yeah from a developer but like how many
bugs are going to be found
if you're if you're typing well the
development process is extremely
iterative yeah typically you don't make
a plan for what software you're going to
release a year from now yeah uh and work
out all the details because actually all
the details uh themselves consist
they sort of compose a program
and that's that
being a program all your plans will have
bugs in them too and inaccuracies
uh but what what you actually do is
you do a bunch of typing and I'm I'm
actually really I'm a really bad typist
that just I've never learned to type
with 10 fingers
how many do you use
well I could use all 10 of them but not
very well
but I never took a typing
class and I never sort of corrected that
so the first time I I seriously learned
I had to learn the layout of a qwerty
keyboard
was actually in college in my first
programming classes where we used Punch
Cards
and so
with my two fingers I sort of pecked out
my code
watch anyone
give you a little coding demonstration
they'll have to produce like four lines
of code
and now see how many times they use the
backspace key yeah because they made a
mistake and and
and some people especially when when
someone else is looking
will will backspace over 20 30 40
characters to fix a typo earlier in a
line if you're
if you're slightly more experienced of
course you use your arrow buttons to go
or your mouse to but the mouse is
usually slower than uh than the arrows
but a lot of people when they type a 20
character word which is not unusual and
they realize they made a mistake
at the start of the word they backspace
over the whole thing
and then retype it and sometimes it
takes three four times to get it right
so
I don't know what your definition of bug
is arguably mistyping a word and then
correcting it immediately is not a bug
on the other hand you you already
do sort of lose time and every once in a
while there's sort of a typo that you
don't get in that process
and now you've you've typed like 10
lines of code
uh and somewhere in the middle of it you
don't know where yet is a typo or maybe
a thinko where you you forgot that you
had to initialize a variable or
something but those are two different
things and I would say yes you have to
actually run the code to discover that
typo but forgetting to initialize a
variable is a fundamentally different
thing because that thing can go
undiscovered uh that depends on the
language in Python it will not right in
sort of modern compilers are usually
pretty good at catching that even
even for C so for that specific thing
but actually deeper
it might there might be another variable
that has initialized but logically
speaking the one you meant related yep
it's like name the same but it's a
different thing and you forgot to
initialize uh whatever some counter or
some some basic variable they're using I
can tell that you've coded yes by the
way I should mention that I use the
Kinesis keyboard
which has the backspace under the thumb
and one of the biggest reasons I use
that keyboard is because you realize in
order to use the backspace on a usual
keyboard you have to stretch your pinky
out
and like the the for most normal
keyboards the backspace is under the pinky
and so I don't know if people realize
the pain they go through in their life
because of the backspace key being so
far away so with the Kinesis it's right
under the thumb so you don't have to
actually move your hands for the backspace
and the delete what do you do if
you're ever not with your own keyboard
and you have to use someone else's PC
keyboard that has a standard layout so
first of all it turns out that you can
actually go your whole life always
having the keyboard
with you so this well except for that
that little tablet that you're using so
we're note taking right now right uh
yeah so it's very inefficient
note-taking but I'm not I'm just looking
stuff up but in most cases I would be
actually using the keyboard here right
now I just don't anticipate you have to
calculate how much typing do you
anticipate if I anticipate quite a bit
then I'll just I have a keyboard
and the same same with I mean
the embarrassing
I've accepted being the weirdo that I am
but you know when I go on an airplane
and I anticipate to do programming or a
lot of typing I will have a laptop that
will put pull out a Kinesis keyboard in
addition to the laptop and it's just who
I am you have to you have to accept who
you are
um but also it's a you know for a lot of
people
for me certainly there's a comfort space
where there's a certain kind of setups
that maximize productivity and
um it's like some people have a warm
blanket that they like
when they watch a movie I like the
Kinesis keyboard takes me to uh a place
of focus and I still mostly I I'm trying
to make sure I use the state-of-the-art
IDEs for everything but my comfort place
just like the Kinesis keyboard is still
emacs
so
I still use I still I mean that's one of
some of the debates I have with myself
about everything from a technology
perspective
is how much to hold on to the tools
you're comfortable with versus how much
to invest in using modern tools and the
signal that the communities provide you
with is the noisy one because a lot of
people year to year get excited about
new tools and you have to make a
prediction are these tools defining a
new generation or something that will
transform programming or is this just a
fad that will pass certainly with
JavaScript frameworks and front end and the
back end of the web there's a lot of
different styles that came and went I
remember learning um what was it called
ActionScript I remember for Flash
um you know learning how to program in
Flash uh learning how to design doing
graphic animation all that kind of stuff
in Flash same with Java applets I
remember creating quite a lot of Java
applets thinking that this potentially
defines the future of the web and it did
not well you know in most cases like
that the particular technology
eventually gets replaced
but
many of the concepts that the technology
introduced or made accessible first
are preserved of course
because yeah we're not using Java
applets anymore but the notion of
reactive web pages
that sort of contain little bits of code
that respond directly to
something you do like pressing a button
or a link or hovering even
uh is has certainly not gone away
and that those animations that were made
painfully
complicated with Flash
I mean flash was an innovation when it
first came up
and when it was replaced by JavaScript
equivalent
stuff
it was a somewhat better way to do
animations but those animations are
still there not all of them
but but sort of
again there is an evolution and often so
often with technology
the the sort of the technology that was
eventually thrown away or replaced
was still essential to to sort of
get started there wouldn't be jet planes
without propeller planes
I bet you but from a user perspective
yes from the feature set yes but I from
a programmer perspective it feels like
all the time I've spent
with actionscript all the time I spent
with Java on the applet side for the GUI
development I well no Java I have to
push back that was useful that because
it transfers but the Flash doesn't
transfer so some things you learn and
invest time in what yeah what what you
learned this the skill you picked up
learning action script yeah
was sort of it was perhaps
a super valuable skill at the time you
picked it up if if you if you learned
action script early enough but
that skill is no longer
in demand well that's the calculation
you have to make when you're learning
new things like today people start
learning programming today I'm trying to
to see what are the new languages to try
what are the new uh systems to try that
what are the new IDEs to try to to keep
keep improving because that's why we
start when we're young right
but that seems very true to me that that
when you're young you have your whole
life ahead of you and your you're
allowed to make mistakes in fact you
should you should feel encouraged to to
do a bit of stupid stuff yeah try not to
get yourself killed or seriously maimed
but try stuff that
deviate from from what everybody else is
doing
and like nine out of ten times you'll
just learn why everybody else is not
doing that or why everybody else is
doing it some other way and one out of
ten times you sort of
you discover something that's better or
that's that somehow works I mean there
are all sorts of crazy things that were
invented
uh by accident by people trying trying
stuff together
that's great advice to try random stuff
make a lot of mistakes once you're
married with kids you're probably going
to uh be a little more risk-averse
because now there's more at stake and
you've already hopefully had some time
where you where you were experimenting
with crazy shit I like how marriage and
kids solidifies their choice of
programming language how does that the
Robert Frost poem with The Road Less
taken which I think is misinterpreted by
most people but anyway I I feel like the
choices you make early on
especially if you go all in they're
going to define the rest of your life's
trajectory in a way that
like you basically are picking a camp so
uh you know there's if you invest a lot
in PHP if you invest a lot in .net if you
invest a lot in JavaScript
you're going to stick there
you that's that's your life Journey
only as far as that technology remains
relevant yes yes I mean if if at age 16
you learn coding in C
and by the time you're 26 C is like a
dead language
then there's still time to switch
there's probably some kind of Survivor
bias or whatever it's called in in sort
of your observation that that you pick a
camp because there are many different
camps to pick and if you pick dot net
then then you can Coast for the rest of
your life because that technology is now
so ubiquitous of course that it's even
if it's if it's bound to die it's going
to take a very long time well for me
personally
I had a very difficult in my own head
Brave leap that I had to take relevant
to our discussion which is most of my
life I programmed in C and C plus plus
and so uh having that hammer everything
looked like a nail
so I would literally even do scripting
in C plus plus like I would create
programs that do script like things and
uh when I first came to Google and and
before then it became already before
tensorflow before all of that there was
a growing realization that C plus plus is not
the right tool for machine learning we
could talk about why that is it's
unclear why that is a lot of things
has to do with community and culture and
how it emerges and stuff like that but
for me I decided to take the leap to
python like all out basically switched
completely from C plus plus except for a
highly performant robotics applications
there were still uh
there's still a culture of C plus plus
in in the space of robotics
that was a big leap
like I had to you know like like people
have like existential crises or midlife
crises or whatever you have to realize
almost like walking away from uh from a
person you love
um because I was sure that C plus plus would
have to be a lifelong companion for a
lot of problems I would want to solve C
plus plus would be there and it was a
question to say well that might not be
the case because C plus plus is still one
of the most popular languages in the
world one of the most used one of the
most depended on it's also still
evolving quite a bit I mean
that that is not a sort of a fossilizing
community yes they they are doing great
Innovative work actually a lot but yet
the sort of their Innovations are hard
to follow if you're not already a
hardcore C plus plus user well this was
the thing it pulls you in it's a rabbit
hole I was a hardcore the all meta
programming template programming like I
I would start using the modern C plus
plus as it developed right not just the
not just the shared pointer and the
garbage collection that makes it easier
for you to work with some of the flaws
but the detail like The Meta programming
the the crazy stuff that's that's coming
out there but then you have to just
empirically look and step back and say
what language am I more productive in
sorry to say what language do I enjoy my
life with more
and uh readability and able to think
through and all that kind of stuff that
those questions are harder to ask when
you already have
a loved one which in my case was C plus
plus and then there's python uh like
that Meme was is the the grass is
greener on the other side am I just
infatuated with a new fad new cool thing
or is this actually going to make my
life better and I think a lot of people
face that kind of decision it was a
difficult decision for me
um when I made it at this time it's an
obvious switch if you're into machine
learning but at that time it wasn't
quite yet so obvious so it was a risk
and you know you have the same kind of
stuff with um
I still because of my connection to
Wordpress
I still do a lot of back-end programming
in PHP uh
and the question is you know node.js
python do you switch to do you switch
back into any of those
programming there's the case for node.js
for me well more and more and more of
the front end it runs in JavaScript
um and fascinating cool stuff is done
in JavaScript maybe use the same
programming language for the back end as
well
uh the case for python for the back end
is well you're doing so much programming
outside of the web in Python so maybe
use Python for the back end and then the
case for PHP well most of the web still
runs in PHP
you have a lot of experience with PHP
why uh fix something that's not broken
those are my own personal struggles but
I think they reflect the struggles of a
lot of people and with different
programming languages with different
problems they're trying to solve it's a
weird one and there there's not a single
answer right because depending on how
much time you have to learn new stuff
where you are in your life what what
you're currently working on who you want
to work with what communities you like
yeah there's not one right choice
maybe if you if you sort of
if you can look back 20 years you can
say well that whole detour through
action script was a waste of time
but
nobody could know that
so you can't you can't beat yourself up over
that
uh you just need to accept that not
every choice you make
is going to be perfect maybe sort of
keep Plan B in the back of your mind
uh but don't don't overthink it don't
don't try to sort of don't don't create
a spreadsheet with like where you're
trying to estimate well if I learn this
language I expect to make x million
dollars in a lifetime and if I learn
that language I expect to make y
million dollars in a lifetime and which
Which is higher and what which has more
risk and where's the chance that it's
like picking picking a stock
kind of kind of but uh
I think with stocks you can do
diversifying your investment as good
with productivity in life
boy that spreadsheet is possible to
construct
like if you actually carefully analyze
what your interests in life are where you
think you can maximally impact the world
there really is better and worse choices
for a programming language that are not
just about the syntax but about the
community about where you predict the
community's headed
what large systems are programmed in
that but can you create that spreadsheet
because that sort of you're mentioning a
whole bunch of inputs that go into that
spreadsheet where you have to estimate
things that are very hard to measure and
even harder I mean they're they're hard
to measure
retroactively and they're even harder to
predict like what is the better
community
well better is is one of those
incredibly difficult words what's better
for you is not better for someone else
no but we're not doing a public speech
about what's better we're doing a
personal spiritual journey I can
determine a circle of friends
circle circle one and circle two and I
can have a bunch of parties with one and
a bunch of parties with two and then
write down or take a mental note of
what made me happier right and that you
know you have if you're a machine
learning person you want to say Okay I
want to build a large company that does
that is grounded in machine learning but
also has a sexy interface that has a
large impact on the world what languages
do I use you look at what Facebook is
using you look at what Twitter is using
then you look at performant more newer
languages like rust or you look at
languages that have taken that most the
community uses in the machine learning
space that's Python and you can like
think through you can hang out and think
through it and it's it's always a invest
and the the level of activity of the
community is also really interesting
like you said C plus plus and python are
super active in terms of the development
of the language itself
but do you think that you can make
objective choices there no no but
there's a gut you build up like don't
you don't you believe in that gut
feeling everything is very subjective
and yes you most certainly can have a
gut feeling and your gut can also be
wrong that's why there are billions of
people because they're not all right I
mean clearly there are more people
living in the Bay Area who have plans to
sort of create a Google sized company
than there's room in the world for
Google sized companies and they're gonna
have to duke it out in the
marketplace and there's many more choices than
just the programming language speaking
of which let's go back to the boat with
the with the fisherman who's tuned out
long ago I talked to the programmer
let's jump around and go back to C
python that we tried to Define as the
reference implementation and one of the
big things that's coming out in 3.11
what's the right way we tend to say 3.11
because it really was like we went 3.8
3.9 3.10 3.11 and we're planning to go
up to 3.99 99 what happens after 99
probably just 3.100 what if I make it
there okay
and go all the way to 420. I got it
forever python V3 we'll talk about four
but more for fun
so 3.11 is coming out one of the big
sexy things in it is it'll be much
faster so how did you beyond hiring a
great team or working with a great team
make it faster what are some ideas
uh that may makes it faster
it has to do with Simplicity of software
versus performance
and so even though C is known to be a
low-level language which is
great for writing sort of
a high performance language interpreter
when I originally started python or C
python
I
didn't expect there would be
great success and fame in my future
uh so I
I tried to get something working
and useful
uh in about three months
and so I I sort of I cut corners
I borrowed ideas left and right when it
comes to language design as well as
implementation
uh I also wrote much of the code as
simple as it could be
and
they're they're like
there are many things that you can code
more efficiently by adding more code
it's a bit of a sort of a time space
trade-off
where you can compute a certain thing
from a small number of inputs
uh and every time you get presented with
new input
uh you do the whole computation from the
top
that can be simple looking code it's
easy to understand it's easy to reason
about that you can you can tell quickly
that it's correct in at least in the
sort of mathematical sense of correct
uh because it's implemented in C maybe
it performs relatively well
but over time as sort of
as the requirements for that code and
the need for performance
go up
you might be able to rewrite that same
algorithm
using more memory maybe remember
previous results
so you don't have to recompute
everything from scratch like the the
classic example is Computing prime
numbers
like
is 10 a prime number
well you sort of is it divisible by two
is it divisible by three is it divisible
by four and we go all the way to is it
divisible by 9. and it is not well
actually 10 is divisible by two so there
we stop but say 11. is it divisible by
ten the answer is nine is no ten times
in a row so now we know 11 is a prime
number
on the other hand if we already know
that 2 3 5 and 7 are prime numbers and
you know a little bit about the
mathematics of how prime numbers work
you know that if you have a rough
estimate for the square root of 11 you
don't actually have to check is it
divisible by four or is it divisible by
five you all you have to check in the
case of 11 is is it divisible by 2 is it
divisible by three
because take 12.
if it's divisible by 4 well 12 divided
by 4 is 3 so you you should have come
across the question is it divisible by 3
first
so if you know basically nothing about
prime numbers except the definition
maybe you go for X from 2
through n minus 1 is n divisible by X
and then at the end if you got uh all
no's uh for every single one of those
questions you know oh it must be a prime
number well the first thing is you can
stop iterating when you find a yes
answer
and the second is you can also stop
iterating when you have have reached
the square root of n because you know
that if it has a divisor larger than
the square root it must also have a
divisor smaller than the square root
then you say oh except for two we don't
need to bother with checking for even
numbers because all even numbers are
divisible by two so if it's divisible by
four
we would already have come across the
question is it divisible by two and so
now you go special case check is it
divisible by two and then you just check
three five seven eleven
uh and so now you've you've sort of
reduced your search space by 50 percent again by
skipping all the even numbers except
for two
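
The refinements described so far can be sketched in Python. This is only an illustration of the spoken steps, not code from CPython; the function names are made up for this example. Note that the loop has to run up to and including the integer square root, so that perfect squares such as 9 are caught.

```python
import math


def is_prime_naive(n):
    """Definition only: ask 'is n divisible by x' for every x from 2 through n - 1."""
    if n < 2:
        return False
    answers = [n % x == 0 for x in range(2, n)]  # asks every question, never stops early
    return not any(answers)


def is_prime_early_exit(n):
    """Same questions, but stop iterating at the first 'yes' answer."""
    if n < 2:
        return False
    for x in range(2, n):
        if n % x == 0:
            return False
    return True


def is_prime_sqrt(n):
    """Stop once x passes the square root of n."""
    if n < 2:
        return False
    for x in range(2, math.isqrt(n) + 1):
        if n % x == 0:
            return False
    return True


def is_prime_odd_only(n):
    """Special-case 2, then only try odd candidates up to the square root."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for x in range(3, math.isqrt(n) + 1, 2):
        if n % x == 0:
            return False
    return True


print(is_prime_odd_only(10), is_prime_odd_only(11), is_prime_odd_only(12))
```

Each version gives the same answers as the one before it; the later ones just ask fewer questions, at the cost of a little more code.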
if you think a bit more about it or you
just
read in your book about the history of
math one of the first algorithms ever
written down
all you have to do is check is it
divisible by any of the previous prime
numbers that are smaller than the square
root
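
A minimal sketch of that last idea, dividing only by primes already found, is shown below. It assumes we build the list of primes from 2 upward, so every prime up to the square root of n is already known by the time n is tested; `primes_up_to` is a hypothetical name chosen for this example. Keeping the list around is the extra memory being traded for less computation.

```python
import math


def primes_up_to(limit):
    """Collect primes from 2 through limit, testing each n only against earlier primes."""
    primes = []  # remembered results: the memory side of the time-space trade-off
    for n in range(2, limit + 1):
        root = math.isqrt(n)
        is_prime = True
        for p in primes:
            if p > root:
                break  # no divisor found up to the square root, so n is prime
            if n % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes


print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```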
and before you get to a better algorithm
than that
you have to have several phds in in
discrete math so that's as much as I
know so of course that same story
applies to a lot of other algorithms
string matching is a good example
of uh how to come up with an efficient
algorithm and sometimes yeah the more
efficient algorithm is not so much more
complex than the inefficient one but
that's an art and it's not always the
case in the general cases the more
performant the algorithm the more
complex it's going to be there's a
there's a kind of trade-off the simpler
algorithms are also the ones that people
invent first
because when you're looking for a
solution
you look at the simplest way to get
there first
and so if there is a simple solution
even if it's not the best solution not
the fastest or the memory most memory
efficient or whatever
a a simple solution and simple is is
fairly subjective but mathematicians
have also thought about sort of what is
a good definition for simple in the case
of algorithms
uh but the simpler the simpler Solutions
tend to be
easier to follow for other programmers
who haven't made a study of a particular
field and when I when I started with
python I I was a good programmer in
general I knew sort of basic data
structures I knew the C language pretty
well
but there were many areas where I was
only somewhat familiar with the state of
the art
and so I I picked
in many cases the simplest way I could
solve a particular sub problem because
when you when you're designing and
implementing a language you have to like
you've many hundreds of little problems
to solve
and you have to have solutions for every
one of them before you can can sort of
say I've invented a programming language
first of all so see python what kind of
things does it do it's an interpreter it
takes in this readable language that we
talked about that is python what is it
supposed to do The Interpreter basically
it's it's sort of a recipe for
understanding recipes
so instead of a recipe that says bake me
a cake we have a recipe for
well given
the text of a program
how do we run that program and and that
is sort of the recipe for building a
computer the recipe for the Baker and
the chef yeah what are the
algorithmically tricky things that
happen to be low-hanging fruit that
could be improved on maybe throughout
the history of python but also now how
is it possible that 3.11 in year 2022
it's possible to get such a big
performance Improvement
we focused
on a few areas where we we still felt
there was low hanging fruit
the biggest one is actually The
Interpreter itself
and this has to do with details of Pi
how python is defined so I didn't know
if the fisherman is going to follow this
story he already he already jumped off
the boat his uh he's he's this yeah
stupid python is actually even though
it's always called an interpreted
language it's there's also a compiler in
there it just doesn't compile to machine
code it compiles to bytecode which is
sort of code for an imaginary computer
that is called the python interpreter so
it's compiling code that is more easily
digestible by The Interpreter or is
digestible at all it is the code that is
digested by The Interpreter that's the
compiler we tweaked very minor bi
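
The compile-to-bytecode step described here can be observed from Python itself with the built-in `compile` function and the standard-library `dis` module. This is a small sketch; the source string and variable names are made up for the example, and the exact instruction names printed differ between Python versions, so no specific listing is shown.

```python
import dis

# A one-line program, given to the compiler as text.
source = "total = price * quantity"

# compile() runs the compiler only; nothing is executed yet.
code_object = compile(source, "<example>", "exec")

# The bytecode is a bytes object stored on the code object.
print(type(code_object.co_code))

# dis decodes those bytes into the instructions the interpreter will run.
for instruction in dis.get_instructions(code_object):
    print(instruction.opname, instruction.argrepr)
```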