Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224
gFEE3w7F0ww • 2021-09-23
The following is a conversation with Travis Oliphant, one of the most impactful programmers and data scientists ever. He created NumPy, SciPy, and Anaconda. NumPy formed the foundation of tensor-based machine learning in Python, SciPy formed the foundation of scientific programming in Python, and Anaconda, specifically with conda, made Python more accessible to a much larger audience. Travis's life work across a large number of programming and entrepreneurial efforts has had, and will continue to have, immeasurable impact on millions of lives by empowering scientists and engineers in big companies, small companies, and open source communities to take on difficult problems and solve them with the power of programming. Plus, he's a truly kind human being, which is something that, when combined with vision and ambition, makes for a great leader and a great person to chat with. To support this podcast, please check out our sponsors in the description. This is the Lex Fridman Podcast, and here is my conversation with Travis Oliphant.

What was the first computer program you've ever written? Do you remember?

Whoa, that's a good question. I think it was in fourth grade, just a simple loop in BASIC on an Atari 400, I think, or maybe it was an Atari 800. It was part of a class, and we were just writing basic loops to print things out.

Did you use goto statements?

Yes, yes, we used goto statements.

I remember in the early days, that's when I first realized there are principles to programming, when I was told: don't use goto statements, that's bad software engineering, it goes against what great, beautiful code is. I was like, oh, okay, there are rules to this game.

I didn't see that until high school, when I took an AP computer science course. I did a lot of other kinds of programming on the TI, but finally, when I took an AP computer science course in Pascal...

Wow, that was Pascal?

That's when I realized, oh, there are these principles.

Not C or C++?

No, I didn't take C until the next year, in college. I had a course in C. But I haven't done much in Pascal, just that AP computer science course.

Now, sorry for the romanticized question, but when did you first fall in love with programming?

Oh man, good question. I think actually when I was 10. My dad got us a Timex Sinclair, and he was excited about the spreadsheet capability, but I made him get the BASIC add-on so we could actually program in BASIC. Just being able to write instructions and have the computer do something. Then we got a TI-99/4A when I was about 12, and it had sprites and graphics and music; you could actually program it to do music. That's when I really fell in love with programming.

So this is a real computer, with memory and storage, processors? So, what, not that type of TI?

Yeah. The Timex Sinclair was one of the very first. It was cheap; well, it was still expensive, but it had 2K of memory, and we got the 16K add-on pack. But yeah, it had memory and you could program it. In order to store your programs you had to attach a tape drive. Remember that old sound that would play when the modems would convert digital bits to audio on a tape drive? I still remember that sound. But that was the storage.

And what was the programming language, do you remember?

It was BASIC. And then they had VisiCalc, so a little bit of spreadsheet programming in VisiCalc, but mostly just some BASIC.

Do you remember what kind of things drew you to programming? Was it working with data, was it video games?

Video games, mathy stuff, yeah. I've always loved math. A lot of people think they don't like math because, I think, when they're exposed to it early, it's about memory. When you're exposed to math early, if you have a good short-term memory, you remember times tables, and I do have a reasonably, I mean not
perfect, but a reasonably long short-term memory buffer, and so I did great at times tables. They said, oh, you're good at math. But I started to really like math for the problem-solving aspect, and computing was problem solving applied. So that's always been the draw, coupled with the mathematics.

Did you ever see the computer as an extension of your mind, like something able to achieve...

Not till later.

Okay.

Yeah, no, not then. It was just like a little set of puzzles that you can play with, math puzzles. It was too rudimentary early on. It was a lot of work to take a thought you'd have and actually get it implemented. That's still work, but it's getting easier. I would say that's definitely what attracted me to Python: that was more real. I could think in Python. It's like speaking a foreign language. I only speak one other language fluently besides English, which is Spanish, and I remember the day when I would dream in Spanish. You start to think in that language, and I do definitely believe that language limits or expands your thinking. There are some languages that actually lead you to certain thought processes.

Yeah. So I speak Russian fluently, and that's certainly a language that leads you down certain thoughts. Well, yeah, I mean, there's a history of the two world wars, of millions of people starving to death or near to death throughout its history, of suffering, of injustice, like this promise sold to the people and then the carpet, or whatever, swept from under them. It's like broken promises, and all of that pain and melancholy is in the language: the sad songs, the sad hopeful songs, the over-romanticized "I love you, I hate you", the swings between all the various spectrums of emotion. So that's all within the language, the way it's twisted. There's a
strong culture of rhyming poetry, so like the bards. There's a musicality to the language too.

Did Dostoevsky write in Russian?

Yeah. [Laughter]

All the ones that I know about are translated, and I'm curious how the translations...

So Dostoevsky did not use the musicality of the language too much, so he actually translates pretty well, because it's so philosophically dense that the story does a lot of the work. But there's a bunch of things that are untranslatable. Certainly the poetry is not translatable. I actually have a few conversations coming up, offline and also on this podcast, with people who've translated Dostoevsky, and people who work in this field know how difficult that is. Sometimes you can spend months thinking about a single sentence in context, because there's just the magic captured by that sentence, and how do you translate it in just the right way? Because those words can be really powerful. There's a famous line, "Beauty will save the world," from Dostoevsky. There are so many ways to translate that. And you're right: the language gives you the tools with which to tell the story, but it also leads your mind down certain trajectories and paths, to where, over time, as you think in that language, you become a different human being.

Yes. Yeah, that's a fascinating reality. I know people have explored that, but it just gets rediscovered.

Well, we live in our own little pockets. This is the sad thing: I feel like, unfortunately, given time and getting older, I'll never know the Chinese world, because I don't truly know the language. Same with Japanese; I don't truly know Japanese. And Portuguese, and Brazil, that whole South American continent. Yeah, I'll go to Brazil and Argentina, but will I truly understand the people if I don't understand the language? It's sad, because I wonder how many geniuses
we're missing because so much of the scientific world, so much of the technical world, is in English, and so much of it might be lost because we just don't have the common language.

I completely agree. I'm very much in that vein: there's a lot of genius out there that we miss, and it's sort of fortunate when it bubbles up into something that we can understand or process. There's a lot we miss. So I tend to lean towards really loving democratization, or things that empower people. I'm very resistant to authoritarian structures, fundamentally for that reason. Well, several reasons, but it just hurts us. We're worse off.

So, speaking of languages that empower you: Python was the first language for me that I really enjoyed thinking in.

Yeah.

As you said; sounds like you shared my experience too. Do you remember when you first connected with Python, maybe even fell in love with Python?

It's a good question. It was a process; it took about a year. I first encountered Python in 1997. I was a graduate student studying biomedical engineering at the Mayo Clinic. Previously I'd been involved in taking information from satellites. I was an electrical engineering student, used to taking information and trying to get something out of it, doing some data processing. I'd done that in MATLAB, I'd done that in Perl, I'd done that in scripting on VMS. There was actually a VAX/VMS system, and they had their own little scripting tools around Fortran. I'd done a lot of that. Then, as a graduate student, I was looking for something and encountered Python. Python had two things that made me not filter it away, because I was filtering a bunch of stuff. I looked at Yorick, I looked at a few other languages that were out there at the time, in 1997. But it had arrays: there was a library called Numeric that had just been written in '95,
not too much earlier, by an MIT alum, Jim Hugunin. I went back and read the mailing list to see the history of how it grew. It's fascinating to do that, actually, to see how this emergent, unstructured cooperation happens in the open source world, which led to a lot of this collective programming. That's something we might get into a little later.

What gap did Numeric fill?

Numeric filled the gap of having an array object. There was no array. There was a one-dimensional byte concept, but there was no N-dimensional, two-, three-, four-dimensional... tensor, they call it now. I'm still in the camp that a tensor is another thing, and it's just an ND array we should call it. But yeah, I've kind of lost that battle.

There are many battles in this world, some of which we'll win, some we lose.

That's exactly right. And it had no math to it. So Numeric had math and a basic way to think in arrays. I was looking for that. And it had complex numbers. A lot of programming languages don't, and you can see why: if you're just a computer scientist, you think, ah, a complex number is just two floats, so people can build that on. But in practice, the complex numbers are one of the significant algebras that help connect a lot of physical and mathematical ideas, particularly the FFT for an electrical engineer. It's a really important concept, and not having it means you have to develop it several times, and those times may not share an approach. One of the things programming enables is abstractions, but when you have shared abstractions it's even better. It sort of gets to the level of language: we all think of this the same way. Which is both powerful and dangerous. Powerful in that we can now quickly make bigger and higher-level things on top of those abstractions; dangerous because it also limits us as to the
things we maybe left behind in producing that abstraction, which is at the heart of programming today and of actually building in the programming world. So I think it's a fascinating philosophical topic.

Yeah, and it will continue for many years, I think, as it builds more and more abstractions.

Yes. I often think about how we have a world that's built on these abstractions. Were they the only ones possible? Certainly not. But now it's very hard to do it differently. There's an inertia that's very hard to push away from. That has implications for things like the Julia language, which you've heard of, I'm sure. I've met the creators and I like Julia; it's a really cool language. But they've struggled against the tide of this inertia of people using Python. There are strategies to approach that, but nonetheless it's a phenomenon.

So I love complex numbers and I love arrays, so I looked at Python. Then I had the experience: I did some stuff in Python while I was doing my PhD. I was actually doing a combination of MRI and ultrasound, looking at a phenomenon called elastography, which is where you push waves into the body and observe those waves; you can actually measure them, and then you do a mathematical inversion to see what the elasticity is. So that's the problem I was solving: how to do that with both ultrasound and MRI. I needed some tool to do that with. So I was starting to use Python in 1997. In '98 I went back, looked at what I'd written, and realized I could still understand it, which is not the experience I'd had when doing Perl in '95.
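As a small illustration of the earlier point about complex numbers being part of the language rather than something each programmer rebuilds from "two floats", here is a minimal sketch using only Python's standard library. The dft function and the test signal are hypothetical examples, not anything from the conversation, and this naive O(n^2) transform stands in for a real FFT purely for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform built on Python's native complex type."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

# One full cycle of a cosine sampled at 8 points (illustrative data).
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = dft(signal)

# Bins whose magnitude is clearly above floating-point noise.
peaks = [k for k, c in enumerate(spectrum) if abs(c) > 1e-9]
```

Because the complex type, the `2j` literal, and `abs()` are shared by everyone, two people writing this would likely land on compatible code, which is the "shared abstraction" idea in practice.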
Right, I'd done the same thing, and then I looked back and I'd forgotten what it was even saying. Now, you know, I'm not saying... so that means, hey, this may work. I like this. This is something I can retain without becoming an expert per se. And that led me to go, I'm going to push more into this. So '98 was kind of when I started to fall in love with Python, I would say.

There are a few peculiar things about Python compared to Perl, compared to some of the other languages. There are no braces. Space, indentation I should say, is used as part of the language.

Yeah, right.

So, I mean, that's quite a leap. Were you comfortable with that leap, or were you just very open-minded?

Good question. I was open-minded. I was cognizant of the concern, and it definitely has specific challenges. Cutting and pasting, for example: you're cutting and pasting code, and if your editors aren't supportive of that, or if you're pasting into a terminal, particularly in the past when terminals didn't necessarily have the intelligence to manage it. Now IPython and Jupyter notebooks handle it just fine, so there's really no problem, but in the past it created some formatting challenges. Also mixed tabs and spaces: if your editors weren't clear on what was happening, you would have these issues. So there were really concrete concerns about it that I heard and understood. I never really encountered a problem with it personally. There were occasional annoyances, but I really liked the fact that it didn't have all these extra characters, that these extra characters didn't show up in my visual field when I was just trying to process and understand a snippet of code.

Yeah, there's a cleanness to it. But I mean, the idea is supposed to be that Perl also has a cleanness to it, because of the minimalism of how many characters it takes to express a certain thing.

Yeah.

So it's very compact.

Yeah.

But what you realize with that
compactness is that there's a culture that prizes compactness, and so the code gets more and more compact and less and less readable, to a point where, to be a good programmer in Perl, you write code that's basically unreadable.

Right, there's a culture like that.

And you're proud of it.

Yeah, you're proud of it, exactly. It feels good, and it's really selective: it means you have to be an expert in Perl to understand it. Whereas Python allowed you not to have to be an expert. You didn't have to spend all this brain energy. You could leverage your English language center, which you're using all the time. I've wondered about other languages, particularly non-Latin-based languages. With Latin-based languages, where the characters are at least similar, I think people have an easier time, but I don't know what it's like to be a Japanese or a Chinese person trying to learn a different syntax. What would computer programming look like in that? I haven't looked at that at all. Leveraging your Chinese language center, I'm not sure Python or any programming language does that. But that was a big deal, the fact that it was accessible. I could be a scientist. What I really liked is this: many programming languages really demand a lot of you, and you can do a lot if you learn them, but Python enables you to do a lot without demanding a lot of you. There's nuance to that statement, but it's certainly more accessible, so more people could use it. As a scientist or engineer who was trying to solve another problem besides programming, I could still use this language and get things done and be happy about it. I was also comfortable in C at that time, and MATLAB.

You did a little MATLAB?

I did a lot before that, exactly. So those three languages were really the tools I used during my studies and schooling.
But to your point about language helping you think, one of the big things about MATLAB, and APL before it... I don't know if you remember APL. APL is actually the predecessor of array-based programming, which I think is really underappreciated. If I talk to people who are just steeped in computer programming, computer science, like most of the people Microsoft has hired in the past, for example... Microsoft as a company generally did not understand array-based programming. Culturally they didn't understand it, so they kept missing the boat, kept missing the understanding of what this was. They've gotten better, but there's still a whole culture of folks for whom programming is systems programming or web programming, or lists and maps. What about an N-dimensional array? Oh yeah, that's just an implementation detail. Well, you can think that, but if you have that as a construct, you actually think differently. APL was the first language to understand that. It was in the '60s. The challenge of APL is that it was very dense, and not only that, it had glyphs, new characters. They even had a new keyboard to produce those glyphs. This is back in the early days of computing, when the QWERTY keyboard maybe wasn't as established, so it was like, we could have a new keyboard, no big deal. But it was a big deal, and it didn't catch on. And the language APL is very much like Perl, in that people would pride themselves on how much they could write: the Game of Life in 30 characters of APL. APL has characters that mean summation, and they have adjectives and these things called adverbs, which are like methods. Reduction would be an adverb on an add operator. Using these tools you could construct things, and then you start to think at that level. You think in N dimensions, is something I like to say, and you start to think differently about data at that
point. It really helps.

Yeah. I mean, outside of programming, if you really internalize linear algebra as a course, it philosophically allows you to think of the world differently.

Yes.

It's almost liberating. You don't have to think about the individual numbers in the N-dimensional array; you can think of it as an object in itself, and all of a sudden this world can open up. Now, you're saying MATLAB and APL were the early... I don't know if many languages got that right, ever.

No, no, they didn't. Even still, I would say. I mean, NumPy is an inheritor of the traditions of, I would say, APL. J was another version; what it did was not have the glyphs, just short characters that a Latin keyboard could still type. And then Numeric inherited from that, in terms of: let's add arrays, plus broadcasting, plus methods, reduction. Even some of the language, like rank, is a concept that was in Python, it's still in Python, for the number of dimensions. That's different from, say, the rank of a matrix, which people think of as well. So it came from that tradition, but NumPy is a very pragmatic, practical tool. NumPy inherited from Numeric, and we can get to where NumPy came from, which is the current array, at least current as of 2016-17. Now there's a ton of them, over the past two or three years, but we can get into that too.

So if we just linger on the early days: what was your favorite feature of Python, do you remember? It's so interesting to linger on what really makes you connect with the language. I'm not sure it's obvious to introspect that.

No, it isn't, and I've thought about that at some length. I think definitely the fact that I could read it later, that I could use it productively without becoming an expert. With the other languages I had to put more effort into it.

Right, that's like an empirical observation. You're not analyzing any one aspect of the
language; it just seems that, time after time, you look back and it's somehow readable.

It's somewhat readable. And then it was sort of, I could take executable English and translate it to Python more easily. There was no translation layer. As an engineer or as a scientist, I could think about what I wanted to do, and the syntax wasn't that far behind it. Now, there were some warts, and there still are; it wasn't perfect. There are some areas where I'm like, ah, it'd be better if this were different. Some of those things got worked out of the language too. I was really grateful for some of the early pioneers in the Python ecosystem. Python got written in '91, that's when the first version came out, but Guido was very open to users, and one of the sets of users were people like Jim Hugunin and David Ascher and Paul Dubois and Konrad Hinsen. These were people who were on the mailing list, and they were just asking for things, like, hey, we really should have complex numbers in this language.

There's a j, there's a 1j, right? And the fact that they went with the engineering j for the root is interesting. I don't think that's entirely to favor engineers; I think it's because i is so often used as the index of a for loop.

I think that's actually probably right. I mean, there's a pragmatic aspect. The complex numbers were there, and I loved that. The fact that I could write ND array constructs, and that reduction was there, so it was very simple to write summations, and broadcasting was there, so I could do addition of whole arrays. That was cool. Those were things I loved about it.

I don't know where to start talking to you, because you've created so many incredible projects that basically changed the whole landscape of programming. But okay, let's go chronologically, with SciPy. You created SciPy over two decades ago now?

Yes, right. Yeah, I'd love to talk about SciPy.
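As a minimal sketch of the features mentioned above (the 1j complex literal, N-dimensional arrays, reduction, and broadcasting), the following uses present-day NumPy rather than the original Numeric library, whose API is not shown in the conversation. The variable names and array values are made up for illustration.

```python
import numpy as np

# Complex literals use the engineering suffix j, leaving i free for loop indices.
z = 3 + 4j
magnitude = abs(z)

# An N-dimensional array; ndim is the number of dimensions (the APL-style "rank").
grid = np.arange(6).reshape(2, 3)          # [[0, 1, 2], [3, 4, 5]]

# Reduction: apply the add operator along axis 0 (equivalent to grid.sum(axis=0)).
column_sums = np.add.reduce(grid, axis=0)

# Broadcasting: the length-3 row is applied to every row of the 2x3 array.
shifted = grid + np.array([10, 20, 30])

# Complex values work element-wise across whole arrays as well.
phasors = np.exp(1j * np.pi * np.array([0.0, 0.5, 1.0]))
```

Note that `grid.ndim` here is 2, which is the "rank" in the array-language sense and is unrelated to the linear-algebra rank of a matrix.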
SciPy was really my baby.

What was its goal? What is its goal? How does it work?

Yeah, fantastic. So SciPy was effectively this: here I am using Python to do stuff that I previously used MATLAB to do, and I was using Numeric, which is an array library that made a lot of it possible, but there were things that were missing. I didn't have an ordinary differential equation solver I could just call. I didn't have integration; hey, I want to integrate this function, okay, well, I don't have just a function I can call to do that. These are things I remember being critical things that I was missing. Optimization: I just want to pass a function to an optimizer and have it tell me what the optimum value is. So it was like, well, why don't we just write a library that adds these tools? And I started to post on the mailing list. Previously people had discussed it; I remember Konrad Hinsen saying, wouldn't it be great if we had this optimizer library, or David Ascher would say this stuff. And I'm, you know, ambitious is the wrong word, but eager, with probably more time than sense. I was a poor graduate student. My wife thinks I'm working on my PhD, and I am, but part of a PhD that I loved was the fact that it's exploratory. You're not just taking orders, fulfilling a list of things to do; you're trying to figure out what to do. And so I thought, well, I'm writing tools for my own use in a PhD, so I'll just start this project. So '98, '99 was when I first started to write libraries for Python. When I fell in love with Python in '98, I thought, well, there are just a few things missing. Like, oh, I need a reader to read DICOM files. I was in medical imaging, and DICOM was a format that I wanted to be able to load into Python. Okay, how do I write a reader for that? So I wrote something called, it was an I/O package. That was my very first extension module, which is C. So I wrote C code to extend Python so that,
in Python, I could write things more easily. That combination kind of hooked me. It was the idea that here's this powerful tool I can use as a scripting language and a high-level language to think in, but that I can extend easily in C. Easily for me, because I knew enough C. And then Guido had written a... I mean, the only hard part of extending Python was the way memory management works: you have to do reference counting. There's a tracking of reference counts you have to do manually, and if you don't, you have memory leaks. So that's hard. Plus, with C, you just have to put more effort into it. Now I have to think about pointers, I have to think about stuff that is different. It's like putting a new cartridge in your brain. Okay, I'm thinking about MRI; now I'm thinking about programming. There are distinct modules you end up having to think about, so it's harder. When I was just in Python, I could just think about MRI and high-level writing. But I could do that, and I liked it; I found it to be enjoyable and fun. And so I ended up going, oh, well, let me just add a bunch of stuff to Python to do integration. And the cool thing is, with the power of the internet, just looking around I found, oh, there's this Netlib, which has hundreds of Fortran routines that people wrote in the '60s and the '70s and the '80s, in Fortran 77. Fortunately it wasn't Fortran 66; they'd been ported to Fortran 77. And Fortran 77 is actually a really great language. Fortran 90 is probably my favorite Fortran, because it's got complex numbers, it's got arrays, and it's pretty high level. Now, the problem with it is you'd never want to write a program in Fortran 90 or Fortran 77, but it's totally fine to write a subroutine in. And then Fortran kind of got a little off course when they tried to compete with C++, but at
the time I just wanted libraries to do something. Like, oh, here's an ordinary differential equation solver, here's integration, here's Runge-Kutta integration, already done. I don't have to think about that algorithm. You could, but it's nice to have somebody who's already done one and tested it. And so I sort of started this journey in '98. Really, if you look back at the mailing list, there's sort of this productive era of me writing an extension module to connect Runge-Kutta integration to Python and making an ordinary differential equation solver, and then releasing that as a package. So, odepack, I think I called it, then quadpack. I just made these packages, and eventually that became multipack, because they were originally modular; you could install them separately. But a massive problem in Python was actually just getting your stuff installed. At the time, releasing software, for me... today people think, what does that mean? Well, then it meant some poorly written web page. I had some bad web page up, and I put up a tarball, just a gzipped tarball of source code. That was the release.

But okay, can we just stay on that? Because the community aspect of creating the package and sharing it, that's rare, to have both at that time. That was pretty early. Well, maybe not rare; maybe you can correct me on this, but it seems like, in the scientific community, so many people... you were basically solving the problems you needed to solve to process the data for your particular application, and to also have the mindset that "I'm going to make this usable for others", that's...

I would say I'd been inspired by Linux. I'd been inspired by Linus and him making his code available. I was starting to use Linux at the time, and I went, this is cool. So I had kind of been previously primed that way. And generally, I was into science because I liked the sharing notion. I liked the idea that if, collectively, we build knowledge and share
it, we can all be better off.

Okay, so you were energized by that already.

Yeah, right. I can't deny that I liked that part of science, that part of sharing. And then all of a sudden, oh wait, here's something I could do. And then I slowly, over years, learned how to share better, so that you could actually engage more people faster. One of the key things was actually giving people a binary they could install, so that it wasn't just "here's the source code, good luck compiling it"; it was compiled and ready to install. In fact, a lot of the journey from '98 even through 2012, when I started Anaconda, was about that. It's really the key as to why the scientist with dreams of doing MRI research ended up starting a software company that installs software.

I work with a few folks now who don't program, on the creative side, the video side, the audio side, and because my whole life is running on scripts, I now have the task of teaching them enough Python to run the scripts. So I've been actually facing this, whether it's Anaconda or something else: the task of how do I minimally explain, basically to my mom, how to write a Python script. It's an interesting challenge. It's a to-do item for me to figure out: what is the minimal amount of information I have to teach? What are the tools you use so that, one, you enjoy it, and two, you're effective at it? Those are two related questions. And then the debugging, the iterative process of running the script to figure out what the error is, maybe even, for some people, doing the fix yourself. So do you compile it? How do you distribute that code to them? It's interesting, because I think it's exactly what you're talking about: if you increase the circle of empathy, the circle of people that are able to use your programs, you increase its
effectiveness and its power. And so, yeah, you have to think: can I write scripts, can I write programs, that can be used by biomedical engineers, by all kinds of people that don't know programming, and actually maybe plant a seed, have them catch the bug of programming so that they start on their journey? That's a huge responsibility, and ultimately it has to do with the Amazon one-click buy: how frictionless can you make the early steps?

Frictionless is actually really key to growing any community. At any friction point, you're just going to lose some people. Now, sometimes you may want to do that intentionally. If you're early enough on, you need a lot of help, you need people who have the skills, so it's actually helpful not to have too many users as opposed to contributors, if you're early on. Anyway, SciPy started in '98, but it really emerged as this collection of modules that I was just putting on the net, and people were downloading them. I think I got 100 users by the end of that year. But the fact that I got 100 users, and more than that, that people started to email me with fixes, that was actually intoxicating. Here I was writing papers and giving talks at conferences, and people would say hello, or "yeah, good job", but mostly you're being reviewed, and it's competitive. You publish a paper and people are like, oh, it wasn't my paper. I was starting to see that side of academic life. I'd thought it was a cooperative effort, but it seemed like we were here just to one-up each other. It's not true across the board, but a lot of that's there. But here, in this world, I was getting responses from people all over the world. I remember Pearu Peterson in Estonia was one of the first people, and he sent me back this makefile, because the
first thing is yeah your build thing stinks and here's a better makefile now it was a complex makefile i think i never understood that makefile actually but it worked and it did a lot more and so then thanks this is cool and that was my first kind of engagement with community development but you know the process was he sent me a patch file i had to upload a new tarball and i just found i really loved that and the style back then was here's a mailing list it was very it certainly weren't the tools that are available today it was very early on but i really started that's the whole year i i think i did about seven packages that year right and then by the end of the year i collected them into a thing called multipack so 99 there was this thing called multipack and that's when a high school student he was a high school student at the time a guy named robert kern took that package and made a windows installer right and then of course a massive increase of usage so by the way most of this development was under linux yes yes it was on linux i was a linux developer doing it on a linux box i mean at the time i was actually getting into i had a new hard drive i did some kernel programming to to make the hard drive work i mean not programming but modification to the kernel so i could actually get the hard drive working i i love that aspect of it i was also in you know at school i was building a cluster i took mac computers like uh and put yellow dog linux on them uh they were at the mayo clinic they were just they're all these macs that were older they were just getting rid of and so i kind of got permission to go grab them together i put about 24 of them together in a cluster and a cabinet and put yellow dog linux on them all and i wrote a c++ um program to do mri simulation that was what i was doing at the same time for my day job so to speak so i was loving the whole process at the same time i was oh i need ordinary differential equations that's why
ordinary differential equations were key was because that's the heart of a bloch equation for simulated mri is an ode solver and so that's but i actually did that it doesn't happen at the same time that's why it kind of what you're working on and what you're interested in they're coinciding i was definitely scratching my own itch in terms of building stuff and uh which helped in the sense that i was using it for me so at least had one user yeah i had one person who's like well i know this is better i like this interface better and i had the experience of matlab to guide some of what those apis might look like but you know you're just doing yourself you're building all this stuff but with the windows installer it was the first time i realized oh yeah the binary installer really helps people and so that led to spending more time on that side of things so around 2000 so i graduated my phd in 2000 end of year 2000. so 99 doing a lot of work there 98 do a lot of work there 99 kind of spending more time on my phd you know helping people use the tools thinking about where i want to go from here there was a company there's a guy actually eric jones and travis vaught they were two friends who founded a company called enthought it's here in austin still here and they eric contacted me at the time when i was a uh i was a graduate student still and he said hey why don't you come down we want to build a company you know we want we're thinking of you know a scientific company and we want to take what you're doing and kind of add it to some stuff that he'd done he'd written some tools and then pearu peterson had done f2py let's come together and build pull this all together and call it scipy so that's the origin of the scipy brand it came from you know multipack and a whole bunch of modules i'd written plus a few things from some other folks and then pulled together in a single installer scipy was really a distribution of python masquerading as a library how did you think about scipy
in context of python in context of numeric like what we saw scipy as a way to make an r&d environment for python like use python uh dependent on numeric so numeric was the array library we depended on and then from there extend it with a bunch of modules that allowed for and at the time the original vision of scipy was to have plotting was to have you know a repl you know the repl environment and kind of a whole really a whole data environment um that you could then install and get going with and that was kind of the thinking it didn't really evolve that way right it sort of had a but one it's really hard to do massive scale projects in a with with open source collectives actually there's a there's sort of an intrinsic uh cooperation limit as to which you know too many cooks in the kitchen you know you can do amazing infrastructure work when it comes down to bringing it all together into a single deliverable that actually requires a little more a little more product management that is not it doesn't really emerge from the same dynamic so it struggled you know struggled to get almost too many voices it's hard to have everybody agree you know consensus doesn't really work at that scale you end up with politics you know with the same kind of things that's happened in large organizations trying to decide on what to do together um so consensus building was still was was challenging at scale as more people came in right early on it's fine because there's nobody there and so it works but then as you get more successful the more people use it all of a sudden oh there's this this scale at which this doesn't work anymore and we have to come up with different approaches so scipy came out officially in 2001 was the first release most the time i remember the days of getting that release ready it was a windows installer and there was there were bugs on how you know the windows compiler handled complex numbers and you were you're chasing segmentation faults and it was it's
a lot of work there's a lot of effort had nothing to do with my area of study at the same time i just got an offer so he wondered if i wanted to come down and help him start that you know start that company with his friend and i at the time i was like i was intrigued but i was pursuing a path an academic path and i just got an offer to go and teach at my alma mater so i took that tenure track position and scipy and kind of then i started work on scipy as a professor too okay so that that's i left the mayo clinic graduated wrote my thesis using scipy wrote you know there's there's images that were created now the plotting tool i used was something from yorick actually it was a plotting a plt kind of a plotting language that i used yorick is a programming language it was a programming language had a plotting tool dislin we had integration to dislin i ended up using dislin plus some some of the plotting from yorick linked to from python anyway it was a people don't plot that way now but this is before and scipy was trying to add plotting yeah right it didn't have much success really the success of plotting came from john hunter who had a similar experience to my experience my kind of maverick experience as a person just trying to get stuff done and kind of having more time than than money maybe right and john hunter created what matplotlib he's the creator of matplotlib yeah so john hunter was uh you know he wasn't a student at the time but he was an actor he was working in quant field and he said we need better plotting so he just went out and said cool i'll make a new project and we'll call it matplotlib and he released in 2001.
about the same time that scipy came out and it was separate library separate install used numeric scipy used numeric and so scipy you know 2001 released scipy and then enthought created a conference called scipy which was brought people together to talk about the space and that conference is still ongoing it's one of the favorite conferences of a lot of people because it's it's changed over the years but early on it was you know a collection of 50 people who care about scientists mostly practicing scientists who want to care about coding and doing it well and not using matlab i remember being driven by you know i like matlab but i didn't like the fact that like so i'm not opposed to proprietary software i'm actually not an open source zealot i love open source for what it brings but i also see the role for proprietary software what i didn't like was the fact that i would develop code and publish it and then effectively telling somebody here to run my code you have to have this proprietary software right and there's also culture around matlab as much because i've talked to a few folks mathworks great it's my life yeah i mean there's just a culture they try really hard but it's just there's this corporate ibm style culture that's like or whatever i don't don't want to say negative things about ibm or whatever but there's a no it's it's really that connection something i'm in the middle of right now is is the business of open source and how do you connect the ethos of cooperative development with the necessity of of creating profits right and like right now today you know i'm still i'm still in the middle of that that's actually the early days of of me exploring this question because i was writing scipy i mean as an aside i also had so i had three kids at the time i have six kids now i got married early wanted a family uh i had three kids and i remember reading i remember reading richard stallman's post and i was i was a fan of stallman i would read his work i liked this
collective ideas he would have certainly the ideas on ip law i read a lot of stuff but then he said you know okay well how do i make money with this how do i make a living how do i pay for my kids all this stuff was in my mind a young graduate student making no money thinking i got to get a job and he said well you know i think just be like me and don't have kids right that's just don't don't that's his take on this that was just that was that was the what what he said in that moment right that's the thing i read and i went okay this is a train i can't get on yeah there has to be a way to preserve the culture of open source and still be able to make sufficient money to feed you yes exactly there's got to be well so that actually led me to a study of economics because at the time i was ignorant and it really was i'm actually i'm embarrassed for the educational system that they could let me and i was valedictorian in my high school class and i did super well in college and like academically i did great right but the fact that i could do that and then be clueless about this key part of life it led me to go there's a problem like i should i should have learned this in fifth grade i should have learned this in eighth grade like everybody should come out with a basic knowledge of economics you're an interesting example because you've created tools that uh change the lives of probably millions of people and the fact that you don't understand at the time of the creation of those tools the basic economics of how like to build up a giant system is a problem yeah it's a problem and so i during my phd at the same time this is actually in 98 99 at the same time i was in the library i was reading books on capitalism i was reading books on marxism i was reading books on you know what is this thing what does it what does it mean yeah and i encountered a basically what i encountered a set of writings from people that said they were the inheritors of adam smith but adam smith for the first time
right which is the wealth of nations and kind of this notion of emergent emergent uh societies and realized oh there's this whole world out here of people and the challenge is economics is also political like because economics you know people different parties running for office they'll they want their economic friends they want their economist to back them up right or to to be there to be their magicians like the magicians in pharaoh's court right the people that are going to say hey this is you should listen to me because i've got the expert who says this and so it gets really muddled right but i was looking at it as a scientist as a scientist going what is this space what does this mean how do people how does paris get fed how does how what is money how does it work i found a lot of writings i really loved i found some things that i really loved and i learned from that it was writings from people like von mises he wrote a pre-order paper in 1920 that still should be read more than it is it's got i mean it was the economic calculation problem of the socialist commonwealth it's basically in response to the bolshevik revolution in 1917.
and his basic argument was it's not going to work to not have private property you're not going to be able to come up with prices the bureaucrats aren't going to be able to determine how to allocate resources without a price system and a price system emerges from people making trades and they can only make trades if they have authority over the thing they're trading and that that that creates information flow that you just don't have if you try to top down it right right it's like huh that's a really good point yeah the prices have a signal that's used and it's important to have that signal when you're trying to build a community of productive people like you would in the software engineering yeah the prices are actually an important signaling mechanism yeah right and that money is just a bartering tool right so this is the first time i've encountered any of these concepts right and the fact that oh this is actually really critical like it's so critical to our prosperity and that we're dangerously not learning about this not teaching our children about this you know so you had the three kids you had to make some money right i had to figure it out but i didn't really care i mean i was never i've never been driven by money just need it right right to eat so what how did that resolve itself in terms of scipy so i would say it didn't really resolve itself it sort of started a journey that i'm continuing on i'm still on i would say i don't think it resolved itself but i will say i i went in eyes wide open like i knew that there were problems with you know um giving stuff away and creating uh the the market externalities the fact that yeah people might use it and i might not get paid for it and i'll have to figure something else out to get paid like at least i can say i'm not bitter that a lot of people have used stuff that i've written and i haven't necessarily benefited economically from it like yeah i've heard other people be you know
bitter about that when they write or they talk like oh i should have got more value out of this and i'm also i want to create systems that let people like me who might have these desires to do things let them benefit so it actually creates more of the same not to turn on your bitterness module but there's some aspect i wish there was mechanisms for me to reward whoever created scipy and numpy because it brought so much joy to my life i appreciate that i mean the tip jar notion was there i appreciate that and i think but there should be a very there's surely a mechanism i totally agree i would love to talk about some of the ideas i have because i actually came across i think i've come up with some interesting notions that could work but they'll require you know anything that will work takes time to emerge right like things don't just turn overnight that's definitely one thing i've also understood and learned is any fixes that's why it's kind of funny we often give credit to you know oh this president gets elected and oh look how great things have done and i saw that when when i had a transition at anaconda when a new ceo came in right and it's like the success that's happening there's an inertia there yeah right and sometimes the decision you made like 10 years before is the reason why the success is seen right exactly so we're sort of just running around taking credit for stuff credit assignment has like a delay to it yes that this makes the credit assignment basica