Guido van Rossum: Python and the Future of Programming | Lex Fridman Podcast #341
-DVyjdw4t9I • 2022-11-26
can you imagine possible features that python 4.0 might have that would necessitate the creation of the new 4.0 given the amount of pain and joy suffering and triumph that was involved in the move between version 2 and version 3. the following is a conversation with Guido van Rossum his second time on this podcast he is the creator of the Python programming language and is Python's emeritus BDFL benevolent dictator for life this is the Lex Fridman podcast to support it please check out our sponsors in the description and now dear friends here's Guido van Rossum python 3.11 is coming out very soon in it CPython is claimed to be 10 to 60 percent faster how did you pull that off and what's CPython CPython is the last python implementation standing also the first one that was ever created the original python implementation that I started over 30 years ago so what does it mean that python the programming language is implemented in another programming language called C what kind of audience do you have in mind here people who know programming no there's somebody on a boat that's into fishing and have never heard about programming but also some world-class programmers you're gonna have to speak to both imagine a boat with two people one of them has not heard about programming is really into fishing and the other one is like uh an incredible Silicon Valley programmer that's programmed in everything C C plus plus python rust Java he knows the entire history of programming languages so you're gonna have to speak to both I imagine that boat in the middle of the ocean yes I'm gonna please the guy who knows how to fish first yes please he seems like the most useful in the middle of the ocean you got you gotta make him mad I'm sure he has a cell phone so uh he's probably very suspicious about what goes on in that cell phone but he must have heard that inside his cell phone is a tiny computer and a programming language is computer code that tells the
computer what to do it's a very low level language it's zeros and ones and then there's assembly and then oh yeah we don't talk about these really low levels because those just confuse people I mean when we're talking about human language we're not usually talking about vocal tracts and how you position your tongue I was talking yesterday about how when you have a Chinese person and they speak English uh this is a bit of a stereotype they often don't know or they they can't can't seem to make the difference well between an l and an r and I have a theory about that and I've never checked this with linguists uh that it probably has to do with the fact that in Chinese there is not really a difference and it could be that there are Regional variations in how China native Chinese speakers pronounce that one sound that sounds to L to some like L to some of them like R to others so it's both the sounds you produce with your mouth throughout the history of your life and what you're used to listening to I mean every language has that Russian has exactly the Slavic languages have sounds like the letters like uh Americans or English speakers don't seem to know the sounds they seemed uncomfortable with that sound yeah so I'm sure oh yes okay so we're not we're not going to the shapes of tongues and the sounds that the mouth can make fine words similarly we're not going into the ones and zeros or machine language I would say a programming language is a list of instructions like a cookbook recipe that sort of tells you how to do a certain thing like make a sandwich well acquire a loaf of bread cut it in slices uh take two slices uh put mustard on one put the jelly on the other or something then add the meat then add the cheese I've heard that science teachers can actually uh do great stuff with recipes like that and trying to interpret their students instructions incorrectly until the students are completely unambiguous about it with language see that's the difference between 
natural languages and programming languages I think ambiguity is a feature not a bug in human spoken languages like uh that's the dance of communication between humans well for lawyers ambiguity certainly is a feature uh for plenty of other cases uh the ambiguity is is not much of a feature but we work around it of course well what's more important is context so with context the precision of the statement becomes more and more concrete right but you know when you say I love you to a person that matters a lot to you the person doesn't try to compile that statement and return an error saying please define love right no but I imagine that my wife and my son uh interpret it very differently yes even though it's the same three words but imprecisely still oh for sure lawyers never had a lot of follow-up questions for you nevertheless the context is already different different in that case yes fair enough so that's that's a programming language is uh ability to unambiguously state a recipe actually let's go back let's go to PEP 8 you go through in PEP 8 the style guide for python code some ideas of what this language should look like feel like read like and the big idea there is that code readability counts what does that mean to you and how do we achieve it so this recipe should be readable that's a thing between programmers because on the one hand we always explain the concept of programming language as computers need instructions and computers are very dumb and they need very precise instructions because they don't have much context in in fact they have lots of context but their their context is very different but what we've seen emerge during the development of software starting in the probably in the late 40s is that software is a very social activity a software developer is not a mad scientist who sits alone in his lab writing brilliant code software is developed by teams of people uh even the mad scientist sitting alone in his lab can't type fast enough to produce
enough code so that by the time he's done with his coding he still remembers what the first few lines he wrote mean so even the mad scientist coding alone in his lab would would be sort of wise to adopt conventions on how to format the instructions that he gives to the computer so that the thing is there is a difference between a cookbook recipe and a computer program the cookbook recipe the the author of the cookbook writes it once and then is printed in 100 000 copies and then lots of people in their kitchens try to recreate that recipe that that particular pie or dish from the recipe and so there the the goal of the cookbook author is to make it clear to the human reader of the recipe the human amateur chef in most cases when you're writing a computer program you have two audiences at once it needs to tell the computer what to do but it also is useful if that program is readable by other programmers because computer software unlike the typical recipe for a cherry pie is so complex that you don't get all of it right at once you end up with the activity of debugging and you end up with the activity of so debugging is trying to figure out why your code doesn't run the way you thought it should run that means there could be stupid little errors or it could be big logical errors it could be anything spiritual yeah it could be anything from a typo to uh a wrong choice of algorithm to building something that does what you tell it to do but that's not useful yeah it seems to work really well 99 percent of the time but does weird things one percent of the time on some edge cases that's pretty much all software nowadays all good software right well yeah for for bad software that that 99 percent goes down a lot so but it's not just about the complexity of the program it's like you said it is a social endeavor in that you're constantly improving that recipe for the cherry pie but you're sort of you're in a group of people improving that recipe or the mad scientist is improving the recipe
that he created a year ago and making it better or adding adding something he decides that he wants a I don't know he wants some decoration on his pie or icing or so there's broad philosophical things and there's specific advice on style so first of all the thing that people first experience when they look at python there is a it is very readable but there's also like a spatial structure to it can you explain the indentation style of python and what is the magic to it spaces are important for readability of any kind of text if you take a cookbook recipe and you remove all the sort of all the bullets and other markup and you just crunch all the text together maybe you leave the spaces between the words but that's all you leave when you're in the kitchen trying to figure out oh what are the ingredients and what are the steps and where does this step end and the next step begin you're going to have a hard time if it's if it's just one solid block of text on the other hand what what a typical cookbook does if the paper is not too expensive each recipe starts on its own page maybe there's a picture next to it the list of ingredients comes first uh there's a standard notation the there's there's shortcuts so that you don't have to sort of write two sentences on how you have to cut the onion because there are only three ways that people ever cut onions in a kitchen small medium and in slices or something like that right none of my examples make any sense to real cooks of course but yeah we're talking to programmers with a metaphor of cooking I love it um but there is a strictness to the spacing that python defines so there's some a looser thing some stricter things but the four spaces for the for the indentation is really interesting it it really um it really defines what the language looks and feels like because indentation sort of taking a block of text and then having inside that block of text a smaller block of text that is indented further as sort of a group
it's it's it's like you have a a bulleted list in a complex business document and inside some of the bullets are other bulleted lists you will indent those too if each bulleted list is indented several inches then at two levels deep there's no no space left on the page to put any of the words of the text so you can't indent too far on the other hand if you don't indent at all you can't tell whether something is a top level bullet or a second level bullet or a third level bullet so you have to have some compromise and uh based on ancient conventions and the sort of the typical width of a computer screen in the 80s uh and all sorts of things sort of we we came up with sort of four spaces as a compromise I mean there there are groups there are large groups of people who code with uh two spaces per indent level for example the Google style guide uh all the Google python code and I think also all the Google C plus plus code is indented with only two spaces per block if you're not used to that it's harder to at a glance understand the code because the the sort of the the high level structure is determined by the indentation on the other hand there are there are other programming languages where the indentation is uh eight spaces or a whole tab stop in in sort of classic Unix and to me that looks weird because you you sort of after three indent levels you've you've got no room left well there's some languages where the indentation is a recommendation it's a stylistic one the code compiles even without any indentation and then in python really indentation is a fundamental part of the language right it doesn't have to be four spaces so you you can code python with two spaces per block or or six spaces or 12 if you really want to go wild but sort of everything that belongs to the same block needs to be indented the same way in practice in most other languages people recommend doing that anyway if you look at C or rust or C plus plus all those languages Java don't have a requirement
of indentation but except in extreme cases they're just as anal about having their code properly indented so any IDE that the syntax highlighting that works with Java or C plus plus they will yell at you aggressively if you don't do proper indentation they suggest the proper indentation for you like uh in C you type a few words and then you type a curly brace which is their notion of sort of begin and an indented block uh then you hit return and then it automatically indents four or eight spaces depending on uh your your style preferences or how your editor is configured was there a possible universe in which you considered having braces in Python absolutely yeah well there's a 60 40 70 30 in your head uh uh what was the trade-off for a long time I was actually convinced that the indentation was just better uh without context I would still claim that indentation is better uh it reduces clutter however as I started to say earlier context is almost everything and in the context of coding most programmers are familiar with multiple languages even if they're only good at one or two and apart from Python and maybe Fortran I don't know how that's written these days anymore but all the other languages Java rust C C plus plus JavaScript typescript Perl are all using curly braces uh to sort of indicate blocks and so python is the odd one out so it's a radical idea do you still as a radical renegade revolutionary do you still stand behind this idea of space of uh indentation versus braces like what what can you dig into it a little bit more why you still stand behind indentation because context is not the whole story history in in a sense provides more context so for python there's no chance that we can switch python is using curly braces for something else dictionaries mostly we would get in trouble if we wanted to switch just like you couldn't redefine C to use indentation even if you agree that it that indentation sort of in a greenfield environment would be better you
can't change that kind of thing in a language yeah it's hard enough to reach agreement over over much more minor details maybe I mean in the past in Python we did have a big debate about tabs versus spaces and four spaces versus fewer or more and we sort of came up with a recommended standard and sort of options for people who want to be different but yes I guess the thought experiment I'd like you to consider is if you could travel back through time when the when the compatibility is not an issue and you started python all over again can you make the case for uh indentation still well it frees up a pair of matched brackets of which there are never enough in the world uh for other purposes really makes the language slightly sort of easier to grasp for people who don't already know another programming language because the sort of one of the things and I I mostly got this from my mentors who taught me programming language design in the early 80s when you're teaching programming for for the the total newbie who has not coded before in not in any other language uh a whole bunch of concepts in programming are very alien or sort of new and and maybe very interesting but also distracting and confusing and there are many different things you have to learn you have to sort of in a typical 13-week programming course you have to if it's like really learning to program from scratch you have to cover algorithms you have to cover data structures you have to cover syntax you have to cover variables loops functions recursion classes expressions operators there are so many concepts if you you sort of if you can spend a little less time having to worry about the syntax the the classic example was often oh the compiler complains every time I put a semicolon in the wrong place or I forget to put a semicolon uh python doesn't have semicolons in that sense so you can't forget them and you're also not sort of misled into putting them where they don't belong because you don't learn
about them in the first place the flip side of that is forcing the strictness onto the beginning programmer to teach them that programming values attention to details you don't get to just write the way you write in English they have other details that they have to pay attention to so I think they'll they'll still get the message about uh paying attention to details the interesting design choice so I still program quite a bit in PHP and I'm sure there's other languages like this but the dollar sign before a variable that was always an annoying thing for me it didn't quite fit into my understanding of why this is good for a programming language I'm not sure if you ever thought about that one that is a historical thing there is a whole lineage of programming languages PHP is one Perl was one and the Unix shell uh is one of the oldest or or all the different shells the dollar was invented for that purpose because the very earliest shells had a notion of scripting but they did not have a notion of parameterizing the scripting right and so a script is just a few lines of text where each line of text is a command that is read by a very primitive command processor that then sort of takes the first word on the line as the name of a program and passes all the all the rest of the line as text into the program for the program to figure out what to do with as arguments and so by the time scripting was slightly more mature than the very first script there was a convention that just like the first word of the line is uh the name of the program the following words uh could be names of files input.text output.html things like that the next thing that happens is oh it would actually be really nice if we could have variables and especially parameters for scripts parameters are usually what starts this process but now you have a problem because you can't just say the parameters are x y and z and so now we we call say let's say x is the input file and Y is the output file and let's
forget about Z for now I have my program and I write program X Y well that already has a meaning because that presumably means X itself is the file it's a file name it's not a variable name uh and so the inventors of of things like the Unix shell and I'm sure job command language in at IBM before that uh had to use something that made it clear to the script processor here is an X that is not actually the name of a file which you just pass through to the to the program you're running here is an X that is the name of a variable yeah and when you're writing a script processor you try to keep it as simple as possible because at as certainly in the 50s and 60s uh the thing that interprets the script was itself a very had to be a very small program because it had to fit in a very small part of memory and so saying oh just look at each character and if you see a dollar sign you jump to another section of the code and then you gobble up characters or say until the next space or something and you say that's the variable name and so it was was sort of invented as a clever way to make parsing of things that contain but contain both variable and fixed parts very easy in a very simple script processor it also helps even then it also helps the human author and the human reader of the the script to quickly see oh 20 lines down in the script I see a reference to x y z oh it has a dollar in front of it so now we know that x y z must be one of the parameters of the script well this is fascinating several things to say which is the leftovers from the simple script processor languages are now in code bases like behind Facebook or behind most of the back end I think PHP probably still runs most of the back end of the internet oh yeah yeah I think there's a lot of it in Wikipedia too for example yeah it's funny that those decisions are not funny it's fascinating that those decisions permeate through time just like biological systems right I mean that the sort of the inner workings
of DNA have been stable for well I don't know how long it was like 300 million years half a billion years yeah and there there are all sorts of weird quirks there that don't make a lot of sense if you were to design a system like self-replicating molecules from scratch but that system has a lot of interesting resilience it has redundancy that results like it messes up in interesting ways that still is resilient when you look at the system level of the organism code doesn't necessarily have that a program a computer programming code you'd be surprised how much resilience modern code has I mean if you if you look at the number of bugs per line of code even in in very well tested code that in practice works just fine there are actually lots of things that don't work fine and there are error correcting or self-correcting mechanisms at many levels including probably the user of the code well in the end the user who sort of is told well you got to reboot your your PC is part of that system and a slightly uh less drastic thing is reload the page which we all know how to do without thinking about it when something weird happens you you try to reload a few times before you say oh there's something really weird okay or try to click the button again if the first time didn't work well yeah that we should all have learned not to do that because that's probably just gonna turn the light back off yeah true so do it three times that's the that's the right lesson so uh and I wonder how many people actually like the dollar sign like you said it is documentation so to me it's whatever the opposite of syntactic sugar is syntactic poison to me it is such a pain in the ass that I have to type in a dollar sign also super error prone so it's not self-documenting it's it's like a bug generating thing it is a kind of documentation that's the pro and the con is it's a source of a lot of bugs but actually I have to ask you um this is a really interesting idea of bugs per line of code if you
look at all the computer systems out there from the code that runs nuclear weapons to the code that runs all the amazing companies that you've been involved with and not the code that runs Twitter and Facebook and Dropbox and Google and Microsoft Windows and so on and we like laid out wouldn't that be a cool like table bugs per line of code and what would that let's let's put like actual companies aside do you think we'd be surprised by the number we see there for all these companies that depends on whether you've ever read about research that's been done in this area before and I didn't know the the re the the last time I I saw some research like that that was probably in the 90s and the research might have been done in the 80s but the the conclusion was across a wide range of different software different languages different companies different development styles the number of bugs is always I think it's in the order of about one bug per thousand lines in sort of mature software that that is considered interesting as good as it gets can I give you some facts here there's a lot of good papers so you said mature software right so here's uh a report from a uh like programming analytics company now this is from a developer perspective let me just say what it says because this is very weird and surprising on average a developer creates 70 bugs per 1000 lines of code 15 bugs per 1000 lines of code find their way to the customers but this is in the software they've oh I was I was wrong by an order okay there fixing a bug takes 30 times longer than writing a line of code that I can believe yeah 75 of a developers time is spent on debugging um that's for an average developer that they Analyze This 15. 
argue 1500 hours a year in the US alone 113 billion dollars is spent annually on identifying and fixing bugs imagine this is marketing literature for someone who claims to have a golden bullet or a silver bullet that makes all that investment in fixing bugs go away but that that is usually yeah not going to yeah that's not gonna happen well they're uh I mean they're referencing a lot of stuff of course but it is a page uh that is you know there's a contact us button at the bottom presumably if you just spend a little bit less than 100 billion dollars we're willing to solve the problem for you right and there's also a report on Stack Exchange's Stack Overflow on the exact same topic but when I open it up at the moment the page says Stack Overflow is currently offline for maintenance oh it's ironic yes uh by the way their error page is awesome anyway I mean can you believe that number of bugs oh absolutely isn't that scary that 70 bugs per 1000 lines of code so even 10 bugs per thousand lines well that's about one bug after every 15 lines and that's when you're first typing it in yeah from a developer but like how many bugs are going to be found if you're if you're typing well the development process is extremely iterative yeah typically you don't make a plan for what software you're going to release a year from now yeah uh and work out all the details because actually all the details uh themselves consist they're sort of compose a program and that's that being a program all your plans will have bugs in them too and inaccuracies uh but what what you actually do is you do a bunch of typing and I'm I'm actually really I'm a really bad typist that just I've never learned to type with 10 fingers how many do you use well I could use all 10 of them but not very well but I I never I never took a typing class and I never sort of corrected that so the first time I I seriously learned I had to learn the layout of a qwerty keyboard was actually in college in my first programming
classes where we used punch cards and so with my two fingers I sort of pecked out my code watch anyone give you a little coding demonstration they'll have to produce like four lines of code and now see how many times they use the backspace key yeah because they made a mistake and and and some people especially when when someone else is looking will will backspace over 20 30 40 characters to fix a typo earlier in a line if you're if you're slightly more experienced of course you use your arrow buttons to go or your mouse to but the mouse is usually slower than uh than the arrows but a lot of people when they type a 20 character word which is not unusual and they realize they made a mistake at the start of the word they backspace over the whole thing and then retype it and sometimes it takes three four times to get it right so I don't know what your definition of bug is arguably mistyping a word and then correcting it immediately is not a bug on the other hand you you already do sort of lose time and every once in a while there's sort of a typo that you don't get in that process and now you've you've typed like 10 lines of code uh and somewhere in the middle of it you don't know where yet is a typo or maybe a thinko where you you forgot that you had to initialize a variable or something but those are two different things and I would say yes you have to actually run the code to discover that typo but forgetting to initialize a variable is a fundamentally different thing because that thing can go undiscovered uh that depends on the language in Python it will not right in sort of modern compilers are usually pretty good at catching that even even for C so for that specific thing but actually deeper it might there might be another variable that is initialized but logically speaking the one you meant related yep it's like named the same but it's a different thing and you forgot to initialize uh whatever some counter or some some basic variable they're using I can
tell that you've coded yes by the way I should mention that I use the Kinesis keyboard which has the backspace under the thumb and one of the biggest reasons I use that keyboard is because you realize in order to use the backspace on a usual keyboard you have to stretch your pinky out and like the the for most normal keyboards the backspace is under the pinky and so I don't know if people realize the pain they go through in their life because of the backspace key being so far away so with the Kinesis it's right under the thumb so you don't have to actually move your hands the backspace and the delete what do you do if you're ever not with your own keyboard and you have to use someone else's PC keyboard that has a standard layout so first of all it turns out that you can actually go your whole life always having the keyboard with you so this well except for that that little tablet that you're using so we're note taking right now right uh yeah so it's very inefficient note-taking but I'm not I'm just looking stuff up but in most cases I would be actually using the keyboard here right now I just don't anticipate you have to calculate how much typing do you anticipate if I anticipate quite a bit then I'll just I have a keyboard and the same same with I mean the embarrassing of accepted being the weirdo that I am but you know when I go on an airplane and I anticipate to do programming or a lot of typing I will have a laptop that will pull out a Kinesis keyboard in addition to the laptop and it's just who I am you have to you have to accept who you are um but also it's a you know for a lot of people for me certainly there's a comfort space where there's a certain kind of setups that maximize productivity and um it's like some people have a warm blanket that they like when they watch a movie I like the Kinesis keyboard takes me to uh a place of focus and I still mostly I I'm trying to make sure I use the state-of-the-art IDEs for everything but my comfort place
just like the Kinesis keyboard is still emacs so I still use I still I mean that's one of some of the debates I have with myself about everything from a technology perspective is how much to hold on to the tools you're comfortable with versus how much to invest in using modern tools and the signal that the communities provide you with is the noisy one because a lot of people year to year get excited about new tools and you have to make a prediction are these tools defining a new generation or something that will transform programming or is this just a fad that will pass certainly with JavaScript frameworks and front and the back end of the web there's a lot of different styles that came and went I remember learning um what was it called the ActionScript I remember for Flash um you know learning how to program in Flash uh learning how to design doing graphic animation all that kind of stuff in Flash same with Java applets I remember creating quite a lot of Java applets thinking that this potentially defines the future of the web and it did not well you know in most cases like that the particular technology eventually gets replaced but many of the concepts that the technology introduced or made accessible first are preserved of course because yeah we're not using Java applets anymore but the notion of reactive web pages that sort of contain little bits of code that respond directly to something you do like pressing a button or a link or hovering even uh is has certainly not gone away and that those animations that were made painfully complicated with Flash I mean Flash was an innovation when it first came up and when it was replaced by JavaScript equivalent stuff it was a somewhat better way to do animations but those animations are still there not all of them but but sort of again there is an evolution and often so often with technology the the sort of the technology that was eventually thrown away or replaced was still essential to to sort of get started there
wouldn't be jet planes without propeller planes I bet you but from a user perspective yes from the feature set yes but I from a programmer perspective it feels like all the time I've spent with ActionScript all the time I spent with Java on the applet side for the GUI development I well no Java I have to push back that was useful that because it transfers but the Flash doesn't transfer so some things you learn and invest time in what yeah what what you learned this the skill you picked up learning ActionScript yeah was sort of it was perhaps a super valuable skill at the time you picked it up if if you if you learned ActionScript early enough but that skill is no longer in demand well that's the calculation you have to make when you're learning new things like today people start learning programming today I'm trying to to see what are the new languages to try what are the new uh systems to try that what are the new IDEs to try to to keep keep improving because that's why we start when we're young right but that seems very true to me that that when you're young you have your whole life ahead of you and your you're allowed to make mistakes in fact you should you should feel encouraged to to do a bit of stupid stuff yeah try not to get yourself killed or seriously maimed but try stuff that deviates from from what everybody else is doing and like nine out of ten times you'll just learn why everybody else is not doing that or why everybody else is doing it some other way and one out of ten times you sort of you discover something that's better or that's that somehow works I mean there are all sorts of crazy things that were invented uh by accident by people trying trying stuff together that's great advice to try random stuff make a lot of mistakes once you're married with kids you're probably going to uh be a little more risk-averse because now there's more at stake and you've already hopefully had some time where you where you were experimenting with crazy shit I like
how marriage and kids solidifies their choice of programming language. How does that... the Robert Frost poem with the road less taken, which I think is misinterpreted by most people. But anyway, I feel like the choices you make early on, especially if you go all in, they're going to define the rest of your life's trajectory, in a way that you basically are picking a camp. So, you know, if you invest a lot in PHP, if you invest a lot in .NET, if you invest a lot in JavaScript, you're going to stick there. That's your life journey. Only as far as that technology remains relevant. Yes. Yes, I mean, if at age 16 you learn coding in C and by the time you're 26 C is like a dead language, then there's still time to switch. There's probably some kind of survivor bias, or whatever it's called, in your observation that you pick a camp, because there are many different camps to pick, and if you pick .NET then you can coast for the rest of your life, because that technology is now so ubiquitous that, of course, even if it's bound to die, it's going to take a very long time. Well, for me personally, I had a very difficult, in my own head, brave leap that I had to take, relevant to our discussion, which is: most of my life I programmed in C and C++, and so, having that hammer, everything looked like a nail. So I would literally even do scripting in C++, like I would create programs that do script-like things. And when I first came to Google, and before then it became already, before TensorFlow, before all of that, there was a growing realization that C++ is not the right tool for machine learning. We could talk about why that is. It's unclear why that is. A lot of it has to do with community and culture and how it emerges and stuff like that. But for me, to decide to take the leap to Python, like all out, basically switch completely from C++, except for highly performant robotics applications, there were still,
there's still a culture of C++ in the space of robotics, that was a big leap. Like I had to, you know, like people have existential crises or midlife crises or whatever, you have to realize, almost like walking away from a person you love, because I was sure that C++ would have to be a lifelong companion. For a lot of problems I would want to solve, C++ would be there, and it was a question to say, well, that might not be the case. Because C++ is still one of the most popular languages in the world, one of the most used, one of the most depended on. It's also still evolving quite a bit. I mean, that is not a fossilizing community. Yes, they are doing great innovative work, actually, a lot. But yet their innovations are hard to follow if you're not already a hardcore C++ user. Well, this was the thing. It pulls you in. It's a rabbit hole. I was a hardcore, all the metaprogramming, template programming, like I would start using the modern C++ as it developed, right? Not just the shared pointer and the garbage collection that makes it easier for you to work with some of the flaws, but the detail, like the metaprogramming, the crazy stuff that's coming out there. But then you have to just empirically look and step back and say, what language am I more productive in? Sorry to say, what language do I enjoy my life with more? And readability, and being able to think through, and all that kind of stuff. Those questions are harder to ask when you already have a loved one, which in my case was C++, and then there's Python. Like that meme: is the grass greener on the other side? Am I just infatuated with a new fad, a new cool thing, or is this actually going to make my life better? And I think a lot of people face that kind of decision. It was a difficult decision for me when I made it. At this time, it's an obvious switch if you're into machine learning, but at that time it wasn't
quite yet so obvious, so it was a risk. And, you know, you have the same kind of stuff with... because of my connection to WordPress, I still do a lot of back-end programming in PHP, and the question is, you know, Node.js, Python, do you switch the back end to any of those programming languages? There's the case for Node.js for me: well, more and more of the front end runs in JavaScript, and fascinating, cool stuff is done in JavaScript, so maybe use the same programming language for the back end as well. The case for Python for the back end is: well, you're doing so much programming outside of the web in Python, so maybe use Python for the back end. And then the case for PHP: well, most of the web still runs in PHP, you have a lot of experience with PHP, why fix something that's not broken? Those are my own personal struggles, but I think they reflect the struggles of a lot of people, with different programming languages, with different problems they're trying to solve. It's a weird one, and there's not a single answer, right? Because it depends on how much time you have to learn new stuff, where you are in your life, what you're currently working on, who you want to work with, what communities you like. Yeah, there's not one right choice. Maybe if you can look back 20 years, you can say, well, that whole detour through ActionScript was a waste of time, but nobody could know that, so you can't beat yourself up over that. You just need to accept that not every choice you make is going to be perfect. Maybe keep plan B in the back of your mind, but don't overthink it. Don't create a spreadsheet where you're trying to estimate, well, if I learn this language I expect to make x million dollars in a lifetime, and if I learn that language I expect to make y million dollars in a lifetime, and which is higher, and which has more risk, and where's the chance that... It's like
picking a stock. Kind of, kind of, but I think with stocks you can do... diversifying your investment is good. With productivity in life, boy, that spreadsheet is possible to construct. Like, if you actually carefully analyze what your interests in life are, where you think you can maximally impact the world, there really are better and worse choices for a programming language that are not just about the syntax but about the community, about where you predict the community's headed, what large systems are programmed in that. But can you create that spreadsheet? Because you're mentioning a whole bunch of inputs that go into that spreadsheet where you have to estimate things that are very hard to measure. I mean, they're hard to measure retroactively and they're even harder to predict. Like, what is the better community? Well, better is one of those incredibly difficult words. What's better for you is not better for someone else. No, but we're not doing a public speech about what's better, we're doing a personal spiritual journey. I can determine a circle of friends, circle one and circle two, and I can have a bunch of parties with one and a bunch of parties with two, and then write down or take a mental note of what made me happier, right? And, you know, if you're a machine learning person, you want to say, okay, I want to build a large company that is grounded in machine learning but also has a sexy interface, that has a large impact on the world. What languages do I use? You look at what Facebook is using, you look at what Twitter is using, then you look at performant, newer languages like Rust, or you look at languages that most of the community uses in the machine learning space, that's Python. And you can think through, you can hang out and think through it, and it's always an investment. And the level of activity of the community is also really interesting. Like you said, C++ and Python are
super active in terms of the development of the language itself. But do you think that you can make objective choices there? No, no, but there's a gut you build up. Don't you believe in that gut feeling? Everything is very subjective, and yes, you most certainly can have a gut feeling, and your gut can also be wrong. That's why there are billions of people, because they're not all right. I mean, clearly there are more people living in the Bay Area who have plans to create a Google-sized company than there's room in the world for Google-sized companies, and they're going to have to duke it out in the marketplace. And there's many more choices than just the programming language. Speaking of which, let's go back to the boat with the fisherman, who's tuned out long ago, and talk to the programmer. Let's jump around and go back to CPython, which we tried to define as the reference implementation, and one of the big things that's coming out in 3.11. What's the right way to say it? We tend to say 3.11, because it really was like we went 3.8, 3.9, 3.10, 3.11, and we're planning to go up to 3.99. What happens after 99? Probably just 3.100, if I make it there. Okay, and go all the way to 420.
I got it. Forever Python v3. We'll talk about four, but more for fun. So 3.11 is coming out. One of the big sexy things in it is it'll be much faster. So how did you, beyond hiring a great team or working with a great team, make it faster? What are some ideas that make it faster? It has to do with simplicity of software versus performance. And so even though C is known to be a low-level language, which is great for writing a high-performance language interpreter, when I originally started Python, or CPython, I didn't expect there would be great success and fame in my future. So I tried to get something working and useful in about three months, and so I cut corners. I borrowed ideas left and right when it comes to language design as well as implementation. I also wrote much of the code as simply as it could be. And there are many things that you can code more efficiently by adding more code. It's a bit of a time-space trade-off, where you can compute a certain thing from a small number of inputs, and every time you get presented with new input, you do the whole computation from the top. That can be simple-looking code. It's easy to understand, it's easy to reason about, you can tell quickly that it's correct, at least in the mathematical sense of correct. Because it's implemented in C, maybe it performs relatively well. But over time, as the requirements for that code and the need for performance go up, you might be able to rewrite that same algorithm using more memory, maybe remember previous results so you don't have to recompute everything from scratch. The classic example is computing prime numbers. Like, is 10 a prime number? Well, is it divisible by two? Is it divisible by three? Is it divisible by four? And we go all the way to is it divisible by 9, and it is not. Well, actually, 10 is divisible by two, so there we stop. But say 11.
Is it divisible by anything from 2 up to 10? The answer is no nine times in a row, so now we know 11 is a prime number. On the other hand, if we already know that 2, 3, 5 and 7 are prime numbers, and you know a little bit about the mathematics of how prime numbers work, you know that if you have a rough estimate for the square root of 11, you don't actually have to check is it divisible by four or is it divisible by five. All you have to check in the case of 11 is: is it divisible by 2, is it divisible by 3. Because take 12. If it's divisible by 4, well, 12 divided by 4 is 3, so you should have come across the question is it divisible by 3 first. So if you know basically nothing about prime numbers except the definition, maybe you go, for x from 2 through n minus 1, is n divisible by x? And then at the end, if you got all no's for every single one of those questions, you know, oh, it must be a prime number. Well, the first thing is you can stop iterating when you find a yes answer. And the second is you can also stop iterating when you have reached the square root of n, because you know that if it has a divisor larger than the square root, it must also have a divisor smaller than the square root. Then you say, oh, except for two, we don't need to bother with checking for even numbers, because all even numbers are divisible by two, so if it's divisible by four we would already have come across the question is it divisible by two. And so now you special-case the check is it divisible by two, and then you just check three, five, seven, eleven, and so now you've reduced your search space by 50 percent again by skipping all the even numbers except for two. If you think a bit more about it, or you just read in your book about the history of math one of the first algorithms ever written down, all you have to do is check is it divisible by any of the previous prime numbers that are smaller than the square root. And before you get to a better algorithm than that, you have to have several PhDs
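The refinements walked through here (stop at the first divisor, stop at the square root, skip even numbers, test only against earlier primes) can be sketched in Python. This is a minimal illustration, not code from the conversation or from CPython; all function names are hypothetical.

```python
import math


def is_prime_naive(n: int) -> bool:
    """Ask 'is n divisible by x?' for every x from 2 through n - 1."""
    if n < 2:
        return False
    for x in range(2, n):
        if n % x == 0:
            return False  # stop iterating at the first "yes" answer
    return True


def is_prime_sqrt(n: int) -> bool:
    """Same question, but stop once x passes the square root of n."""
    if n < 2:
        return False
    for x in range(2, math.isqrt(n) + 1):
        if n % x == 0:
            return False
    return True


def is_prime_odd_only(n: int) -> bool:
    """Special-case 2, then try only odd candidates up to the square root."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for x in range(3, math.isqrt(n) + 1, 2):
        if n % x == 0:
            return False
    return True


def primes_up_to(limit: int) -> list[int]:
    """Remember earlier primes and test each candidate only against
    those that are no larger than its square root."""
    primes: list[int] = []
    for n in range(2, limit + 1):
        root = math.isqrt(n)
        is_prime = True
        for p in primes:
            if p > root:
                break  # no remembered prime up to the root divides n
            if n % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes
```

Each version gives the same answers as the one before it; the later ones ask fewer questions, and the last one spends memory on remembered primes to save time, which is the time-space trade-off mentioned earlier.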
in discrete math. So that's as much as I know. So of course that same story applies to a lot of other algorithms. String matching is a good example of how to come up with an efficient algorithm, and sometimes, yeah, the more efficient algorithm is not so much more complex than the inefficient one, but that's an art, and it's not always the case. In the general case, the more performant the algorithm, the more complex it's going to be. There's a kind of trade-off. The simpler algorithms are also the ones that people invent first, because when you're looking for a solution, you look at the simplest way to get there first. And so if there is a simple solution, even if it's not the best solution, not the fastest or the most memory efficient or whatever, a simple solution, and simple is fairly subjective, but mathematicians have also thought about what is a good definition for simple in the case of algorithms, but the simpler solutions tend to be easier to follow for other programmers who haven't made a study of a particular field. And when I started with Python, I was a good programmer in general. I knew basic data structures, I knew the C language pretty well, but there were many areas where I was only somewhat familiar with the state of the art, and so I picked in many cases the simplest way I could solve a particular subproblem. Because when you're designing and implementing a language, you have many hundreds of little problems to solve, and you have to have solutions for every one of them before you can say, I've invented a programming language. First of all, so CPython, what kind of things does it do? It's an interpreter. It takes in this readable language that we talked about, that is Python. What is it supposed to do? The interpreter, basically, it's sort of a recipe for understanding recipes. So instead of a recipe that says bake me a cake, we have a recipe for, well, given the text of a program,
how do we run that program? And that is sort of the recipe for building a computer. The recipe for the baker and the chef. Yeah. What are the algorithmically tricky things that happen to be low-hanging fruit that could be improved on, maybe throughout the history of Python, but also now? How is it possible that in 3.11, in the year 2022, it's possible to get such a big performance improvement? We focused on a few areas where we still felt there was low-hanging fruit. The biggest one is actually the interpreter itself, and this has to do with details of how Python is defined. So I don't know if the fisherman is going to follow this story. He already jumped off the boat. He's, this... yeah, stupid. Python is actually, even though it's always called an interpreted language, there's also a compiler in there. It just doesn't compile to machine code, it compiles to bytecode, which is sort of code for an imaginary computer that is called the Python interpreter. So it's compiling code that is more easily digestible by the interpreter, or is digestible at all. It is the code that is digested by the interpreter. That's the compiler. We tweaked very minor bi
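The point that CPython first compiles source to bytecode, which the interpreter then executes, can be observed with the standard library's `dis` module. This is a small sketch; the function `add` is a made-up example, and the exact instruction names and layout differ between Python versions, so no particular listing is shown.

```python
import dis


def add(a, b):
    return a + b


# Every function carries a code object produced by CPython's compiler.
bytecode = add.__code__.co_code  # the raw bytecode, as a bytes object

# Decode the bytecode into named instructions for the interpreter.
instructions = [ins.opname for ins in dis.get_instructions(add)]

# Print a human-readable listing of the same instructions.
dis.dis(add)
```

Running this on two different Python releases and comparing the listings is one way to see that the bytecode is an internal detail of the interpreter and not part of the language itself.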