Transcript
gFEE3w7F0ww • Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224
Kind: captions Language: en

The following is a conversation with Travis Oliphant, one of the most impactful programmers and data scientists ever. He created NumPy, SciPy, and Anaconda. NumPy formed the foundation of tensor-based machine learning in Python, SciPy formed the foundation of scientific programming in Python, and Anaconda, specifically with conda, made Python more accessible to a much larger audience. Travis's life work across a large number of programming and entrepreneurial efforts has had, and will continue to have, immeasurable impact on millions of lives by empowering scientists and engineers in big companies, small companies, and open source communities to take on difficult problems and solve them with the power of programming. Plus, he's a truly kind human being, which is something that, when combined with vision and ambition, makes for a great leader and a great person to chat with. To support this podcast, please check out our sponsors in the description. This is the Lex Fridman Podcast, and here is my conversation with Travis Oliphant.

What was the first computer program you've ever written? Do you remember?

Whoa, that's a good question. I think it was in fourth grade, just a simple loop in BASIC on an Atari 800. Atari 400, I think, or maybe it was an Atari 800. It was part of a class, and we just wrote basic loops to print things out.

Did you use GOTO statements?

Yes, yes, we used GOTO statements.

I remember in the early days, that's when I first realized there are principles to programming, when I was told, "Don't use GOTO statements, those are bad software engineering." It goes against what great, beautiful code is. I was like, oh, okay, there are rules to this game.

I didn't see that until high school, when I took an AP computer science course. I did a lot of other kinds of programming on TI, but finally, when I took an AP computer science course in Pascal,

Wow, that was Pascal?

that's when I realized, oh, there are these principles.

Not C or C
plus plus?

No, I didn't take C until the next year in college. I had a course in C. But I haven't done much in Pascal, just that AP computer science course.

Now, sorry for the romanticized question, but when did you first fall in love with programming?

Oh man, good question. I think actually when I was 10. My dad got us a Timex Sinclair, and he was excited about the spreadsheet capability, but I made him get the BASIC add-on so we could actually program in BASIC, and just being able to write instructions and have the computer do something. Then we got a TI-99/4A when I was about 12, and it had sprites and graphics and music. You could actually program it to do music. That's when I really sort of fell in love with programming.

So this is a real computer, with memory and storage, processors and whatnot?

Yeah. The Timex Sinclair was one of the very first. It was cheap. Well, it was still expensive, but it had 2K of memory. We got the 16K add-on pack. But yeah, it had memory and you could program it. In order to store your programs, you had to attach a tape drive. Remember that old sound that would play when the modems would convert digital bits to audio on a tape drive? I still remember that sound. But that was the storage.

And what was the programming language, do you remember?

It was BASIC. And then they had a VisiCalc, so a little bit of spreadsheet programming, but mostly just some BASIC.

Do you remember what kind of things drew you to programming? Was it working with data, was it video games, mathy stuff?

Yeah, I've always loved math. A lot of people think they don't like math because, I think, when they're exposed to it early, it's about memory. When you're exposed to math early, if you have a good short-term memory, you remember times tables. And I do have a reasonably, I mean not
perfect, but a reasonably long short-term memory buffer, and so I did great at times tables. People said, "Oh, you're good at math." But I started to really like math, just the problem solving aspect, and computing was problem solving applied. So that's always kind of been the draw, coupled with the mathematics.

Did you ever see the computer as an extension of your mind, like something able to achieve

Not till later.

Okay.

Yeah, no, not then. It was just like a little set of puzzles that you can play with, and you can play with math puzzles. It was too rudimentary early on. It was a lot of work to actually take a thought you'd have and get it implemented. That's still work, but it's getting easier. So I would say that's definitely what attracted me to Python, that it was more real, right? I could think in Python. It's like speaking a foreign language. I only speak one other language fluently besides English, which is Spanish, and I remember the day when I would dream in Spanish, and you start to think in that language. I do definitely believe that language limits or expands your thinking. There are some languages that actually lead you to certain thought processes.

Yeah. So I speak Russian fluently, and that's certainly a language that leads you down certain thoughts. I mean, there's a history of the two world wars, of millions of people starving to death or near to death, throughout its history of suffering, of injustice, like this promise sold to the people and then the carpet, or whatever, swept from under them. It's like broken promises, and all of that pain and melancholy is in the language. The sad songs, the sad hopeful songs, the over-romanticized "I love you, I hate you," the swings between all the various spectrums of emotion. So that's all within the language, the way it's twisted. Poetry, there's a, there's a
there's a strong culture of rhyming poetry, so like the bards. There's a musicality to the language too.

Did Dostoevsky write in Russian?

Yeah. [Laughter]

All the ones that I know about, which are translated, and I'm curious how the translations

So Dostoevsky did not use the musicality of the language too much, so it actually translates pretty well, because it's so philosophically dense that the story does a lot of the work. But there's a bunch of things that are untranslatable. Certainly the poetry is not translatable. I actually have a few conversations coming up, offline and also in this podcast, with people who've translated Dostoevsky, and people who work in this field know how difficult that is. Sometimes you can spend months thinking about a single sentence in context, because there's just the magic captured by that sentence, and how do you translate it in just the right way? Because those words can be really powerful. There's a famous line, "Beauty will save the world," from Dostoevsky. There are so many ways to translate that. And you're right, the language gives you the tools with which to tell the story, but it also leads your mind down certain trajectories and paths, to where over time, as you think in that language, you become a different human being.

Yes, yeah. That's a fascinating reality, I think. I know people have explored that, but it's just rediscovered.

Well, we live in our own little pockets. This is the sad thing. I feel like, unfortunately, given time and given getting older, I'll never know the Chinese world, because I don't truly know the language. Same with Japanese, I don't truly know Japanese. And Portuguese and Brazil, that whole South American continent. Yeah, I'll go to Brazil and Argentina, but will I truly understand the people if I don't understand the language? It's sad, because I wonder how many geniuses
we're missing, because so much of the scientific world, so much of the technical world, is in English, and so much of it might be lost because we don't have the common language.

I completely agree. I'm very much in that vein of: there's a lot of genius out there that we miss, and it's sort of fortunate when it bubbles up into something that we can understand or process. There's a lot we miss. So I tend to lean towards really loving democratization, or things that empower people. I'm very resistant to authoritarian structures, fundamentally for that reason. Well, several reasons, but it just hurts us. We're worse off.

So, speaking of languages that empower you: Python was the first language for me that I really enjoyed thinking in. As you said, it sounds like you share my experience too. So do you remember when you first connected with Python, maybe even fell in love with Python?

It's a good question. It was a process. It took about a year. I first encountered Python in 1997. I was a graduate student studying biomedical engineering at the Mayo Clinic, and previously I'd been involved in taking information from satellites. I was an electrical engineering student, used to taking information and trying to get something out of it, doing some data processing. I'd done that in MATLAB, I'd done that in Perl, I'd done that in scripting on VMS. There was actually a VAX VMS system, and they had their own little scripting tools around Fortran. I'd done a lot of that. And then as a graduate student I was looking for something and encountered Python. Python had two things that made me not filter it away, because I was filtering a bunch of stuff. There was Yorick, I looked at Yorick, I looked at a few other languages that were out there at the time in 1997. But it had arrays. There's a library called Numeric that had just been written in '95, like
not too much earlier, by an MIT alum, Jim Hugunin. I went back and read the mailing list to see the history of how it grew. It's fascinating to do that, actually, to see how this emergent, unstructured cooperation happens in the open source world, which led to a lot of this collective programming. That's something we might get into a little later, what that looks like.

What gap did Numeric fill?

Numeric filled the gap of having an array object. There was no array. There was a one-dimensional byte concept, but there was no N-dimensional, two-, three-, four-dimensional tensor, as they call it now. I'm still in the category that a tensor is another thing, and it's just an ND array we should call it. But yeah, I kind of lost that battle.

There are many battles in this world, some of which we win, some we lose.

That's exactly right. But it had no math to it. So Numeric had math and a basic way to think in arrays. I was looking for that. And it had complex numbers. A lot of programming languages don't, and you can see why: if you're just a computer scientist, you think, ah, a complex number is just two floats, so people can build that on. But in practice, the complex numbers are one of the significant algebras that help connect a lot of physical and mathematical ideas, particularly the FFT for an electrical engineer. It's a really important concept, and not having it means you have to develop it several times, and those times may not share an approach. One of the things programming enables is abstractions, but when you have shared abstractions, it's even better. It sort of gets to the level of language, of: we all think of this the same way. Which is both powerful and dangerous, right? Powerful in that we now can quickly make bigger and higher-level things on top of those abstractions. Dangerous because it also limits us as to the
things we maybe left behind in producing that abstraction.

Which is at the heart of programming today, and of actually building around the programming world. So I think it's a fascinating philosophical topic.

Yeah, it will continue for many years, I think, as we build more and more abstractions.

Yes. I often think about how we have a world that's built on these abstractions. Were they the only ones possible? Certainly not. But now it's very hard to do it differently. There's an inertia that's very hard to push away from. That has implications for things like the Julia language, which you have heard of, I'm sure. I've met the creators, and I like Julia, it's a really cool language, but they've struggled against the tide of this inertia of people using Python. There are strategies to approach that, but nonetheless it's a phenomenon.

So I love complex numbers and I love arrays, so I looked at Python. And then I had the experience: I did some stuff in Python while I was doing my PhD. I was actually doing a combination of MRI and ultrasound, looking at a phenomenon called elastography, which is where you push waves into the body and observe those waves. You can actually measure them, and then you do mathematical inversion to see what the elasticity is. So that's the problem I was solving, how to do that with both ultrasound and MRI. I needed some tool to do that with. So I was starting to use Python in 1997. In '98 I went back, looked at what I'd written, and realized I could still understand it, which is not the experience I'd had when doing Perl in '95.
Right? I'd done the same thing, and then I looked back and I'd forgotten what it was even saying. So that meant: hey, this may work. I like this. This is something I can retain without becoming an expert per se. And so that led me to go, I'm going to push more into this. So '98 was kind of when I started to fall in love with Python, I would say.

A few peculiar things about Python, maybe compared to Perl, compared to some of the other languages. There are no braces. Space is used, indentation I should say, is used as part of the language. That's quite a leap. Were you comfortable with that leap, or were you just very open-minded?

Good question. I was open-minded. I was cognizant of the concern, and it definitely has specific challenges. Cutting and pasting, for example: you're cutting and pasting code, and if your editors aren't supportive of that, or if you're pasting into a terminal, particularly in the past when terminals didn't necessarily have the intelligence to manage it. Now IPython and Jupyter notebooks handle it just fine, so there's really no problem. But in the past it created some formatting challenges. Also mixed tabs and spaces: if editors weren't clear on what was happening, you would have these issues. So there were really concrete reasons about it that I heard and understood. I never really encountered a problem with it personally. There were occasional annoyances, but I really liked the fact that it didn't have all these extra characters, that these extra characters didn't show up in my visual field when I was just trying to process and understand a snippet of code.

Yeah, there's a cleanness to it. But I mean, the idea is supposed to be that Perl also has a cleanness to it because of the minimalism of how many characters it takes to express a certain thing. So it's very compact. But what you realize with that
compactness comes a culture that prizes compactness, and so the code gets more and more compact and less and less readable, to a point where, to be a good programmer in Perl, you write code that's basically unreadable. There's a culture like that, and you're proud of it.

Correct. Yeah, you're proud of it, right, exactly.

And it feels good.

And it's really selective. It means you have to be an expert in Perl to understand it, whereas Python allowed you not to have to be an expert. You didn't have to spend all this brain energy. As I said, you could leverage your English language center, which you're using all the time. I've wondered about other languages, particularly non-Latin-based languages. With Latin-based languages, where the characters are at least similar, I think people have an easier time, but I don't know what it's like to be a Japanese or a Chinese person trying to learn a different syntax. What would computer programming look like in that? I haven't looked at that at all, but leveraging your Chinese language center, I'm not sure Python or any programming language does that. But that was a big deal, the fact that it was accessible. I could be a scientist. What I really liked is: many programming languages really demand a lot of you, and you can do a lot if you learn them, but Python enables you to do a lot without demanding a lot of you. There's nuance to that statement, but it's certainly more accessible, so more people could actually use it. As a scientist or engineer who was trying to solve another problem besides programming, I could still use this language and get things done and be happy about it. I was also comfortable in C at that time, and MATLAB.

You did a little MATLAB?

I did a lot before that, exactly. So those three languages were really the tools I used during my studies and schooling.
But to your point about language helping you think, one of the big things about MATLAB, and APL before it, is this. I don't know if you remember APL. APL is actually the predecessor of array-based programming, which I think is really underappreciated. If I talk to people who are just steeped in computer programming, computer science, like most of the people that Microsoft has hired in the past, for example, they don't see it. Microsoft as a company generally did not understand array-based programming. Culturally they didn't understand it, so they kept missing the boat, kept missing the understanding of what this was. They've gotten better, but there's still a whole culture of folks for whom programming is systems programming, or web programming, or lists and maps. And what about an N-dimensional array? "Oh yeah, that's just an implementation detail." Well, you can think that, but if you have that as a construct, you actually think differently. APL was the first language to understand that, and it was in the '60s. The challenge of APL is that it was very dense. Not only glyphs, like new characters, new glyphs; they even had a new keyboard to produce those glyphs. This is back in the early days of computing, when the QWERTY keyboard maybe wasn't as established, like, "We could have a new keyboard, no big deal." But it was a big deal, and it didn't catch on. And the language APL is very much like Perl, in that people would pride themselves on how much they could write: the Game of Life in 30 characters of APL. APL has characters that mean summation, and they have adverbs. They have adjectives and these things called adverbs, which are like methods. Reduction would be an adverb on an add operator, right? Using these tools you could construct things, and then you start to think at that level. You think in N dimensions, is something I like to say, and you start to think differently about data at that
point. It really helps.

Yeah. I mean, outside of programming, if you really internalize linear algebra as a course, it philosophically allows you to think of the world differently.

Yes.

It's almost liberating. You don't have to think about the individual numbers in the N-dimensional array. You can think of it as an object in itself, and all of a sudden this world can open up. Now, you're saying MATLAB and APL were the early ones. I don't know if many languages got that right, ever.

No, no, they didn't. Even still, I would say. I mean, NumPy is an inheritor of the traditions of, I would say, APL. J was another version; what it did was not have the glyphs, just short characters that a Latin keyboard could still type. And then Numeric inherited from that in terms of: let's add arrays, plus broadcasting, plus methods, reduction. Even some of the language, like "rank," is a concept that was in Python, and it's still in Python, for the number of dimensions. That's different than, say, the rank of a matrix, which people think of as well. So it came from that tradition, but NumPy is a very pragmatic, practical tool. NumPy inherited from Numeric, and we can get to where NumPy came from, which is the current array, at least current as of 2016-17. Now there's a ton of them over the past two or three years, but we can get into that too.

So if we just linger on the early days: what was your favorite feature of Python? Do you remember? It's so interesting to linger on what really makes you connect with the language. I'm not sure it's obvious to introspect that.

No, it isn't, and I've thought about that at some length. I think definitely the fact that I could read it later, that I could use it productively without becoming an expert. With other languages, I had to put more effort in.

Right. That's like an empirical observation. You're not analyzing any one aspect of the
language. It just seems that time after time you look back and it's somehow readable.

It's somewhat readable. Then it was sort of that I could take executable English and translate it to Python more easily. There was no translation layer. As an engineer or as a scientist, I could think about what I wanted to do, and the syntax wasn't that far behind it. Now, there are some warts there still. It wasn't perfect. There are some areas where I'm like, ah, it'd be better if this were different. Some of those things got out of the language too. I was really grateful for some of the early pioneers in the Python ecosystem, because Python got written in '91, that's when the first version came out, but Guido was very open to users. One of the sets of users were people like Jim Hugunin and David Ascher and Paul Dubois and Konrad Hinsen. These were people that were on the mailing list, and they were just asking for things like, hey, we really should have complex numbers in this language. So there's a 1j, right?

And the fact that they went the engineering route of j is interesting.

I don't think that's entirely to favor engineers. I think it's because i is so often used as the index of a for loop.

I think that's actually probably right.

I mean, there's a pragmatic aspect. The complex numbers were there, I love that. The fact that I could write ND array constructs, and that reduction was there, so it was very simple to write summations, and broadcasting was there, so I could do addition of whole arrays. So that was cool. Those were things I loved about it.

I don't know where to start talking to you, because you've created so many incredible projects that basically changed the whole landscape of programming. But okay, let's go chronologically with SciPy. You created SciPy over two decades ago now?

Yes, right. Yeah, so I'd love to talk about SciPy.
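
[Editor's sketch: the features named above (the `1j` complex literal, reduction on an add operator, and broadcasting) can be illustrated roughly as follows. This assumes present-day NumPy rather than the original Numeric library discussed in the conversation, and the array values and variable names are invented for the example.]

```python
import numpy as np

# Python's built-in complex literal uses j rather than i.
z = 3 + 4j
magnitude = abs(z)                    # modulus of the complex number

# A small 2-D array; ndim is the number of dimensions
# (what the conversation refers to as "rank" in the array sense).
a = np.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]
n_dims = a.ndim

# Reduction: apply the add operator along an axis to get row sums.
row_sums = np.add.reduce(a, axis=1)

# Broadcasting: a length-3 vector is added to every row of the 2x3 array.
offsets = np.array([10, 20, 30])
shifted = a + offsets

print(magnitude, n_dims, row_sums, shifted, sep="\n")
```

Here `np.add.reduce(a, axis=1)` is the explicit spelling of what `a.sum(axis=1)` does, chosen only because it mirrors the "reduction on an add operator" phrasing.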
SciPy was really my baby.

What is it? What was its goal, what is its goal, how does it work?

Yeah, fantastic. So SciPy was effectively: here I am using Python to do stuff that I previously used MATLAB to do, and I was using Numeric, which is an array library that made a lot of it possible. But there were things that were missing. I didn't have an ordinary differential equation solver I could just call. I didn't have integration: hey, I want to integrate this function, okay, well, I don't have a function I can call to do that. These are things I remember being critical things that I was missing. Optimization: I just want to pass a function to an optimizer and have it tell me what the optimum value is. So it was like, well, why don't we just write a library that adds these tools? And I started to post on the mailing list. Previously people had discussed it. I remember Konrad Hinsen saying, wouldn't it be great if we had this optimizer library, or David Ascher would say this stuff. And I'm, you know, ambitious is the wrong word, but eager, and I probably had more time than sense. I was a poor graduate student. My wife thinks I'm working on my PhD, and I am, but part of a PhD that I loved was the fact that it's exploratory. You're not just taking orders, fulfilling a list of things to do. You're trying to figure out what to do. So I thought, well, I'm writing tools for my own use in a PhD, so I'll just start this project. So '98, '99 was when I first started to write libraries for Python. When I fell in love with Python in '98, I thought, well, there are just a few things missing. Like, oh, I need a reader to read DICOM files. I was in medical imaging, and DICOM was a format that I wanted to be able to load into Python. Okay, how do I write a reader for that? So I wrote something, it was an I/O package, and that was my very first extension module, which is C. So I wrote C code to extend Python so that the pos
in Python I could write things more easily. That combination kind of hooked me. It was the idea that here's this powerful tool I can use as a scripting language and a high-level language to think in, but that I can extend easily in C. Easily for me, because I knew enough C. I mean, the only hard part of extending Python was the way memory management works: you have to do reference counting. There's a tracking of reference counts you have to do manually, and if you don't, you have memory leaks. So that's hard. Plus, with C, you have to put more effort into it. I have to now think about pointers and think about stuff that is different. It's like putting a new cartridge in your brain: okay, I'm thinking about MRI, now I'm thinking about programming, and they are distinct modules you end up having to think about. So it's harder. When I was just in Python, I could just think about MRI and high-level writing. But I could do that, and I liked it. I found it enjoyable and fun. And so I ended up going, oh well, let me just add a bunch of stuff to Python to do integration. And the cool thing is, with the power of the internet, just looking around, I found there's this Netlib, which has hundreds of Fortran routines that people wrote in the '60s and the '70s and the '80s, in Fortran 77. Fortunately it wasn't Fortran 66; it had been ported to Fortran 77. And Fortran 77 is actually a really great language. Fortran 90 is probably my favorite Fortran, because it's got complex numbers, it's got arrays, and it's pretty high-level. Now, the problem with it is you'd never want to write a program in Fortran 90 or Fortran 77, but it's totally fine to write a subroutine in. And then Fortran kind of got a little off course when they tried to compete with C++. But at
the time I just wanted libraries to do something. Like, oh, here's an ordinary differential equation solver, here's integration, here's Runge-Kutta integration already done. I don't have to think about that algorithm. And you could, but it's nice to have somebody who's already done one and tested it. So I sort of started this journey in '98. Really, if you look back at the mailing list, there's this productive era of me writing an extension module to connect Runge-Kutta integration to Python, and making an ordinary differential equation solver, and then releasing that as a package. I think I called it ODEPACK, then QUADPACK, and I just made these packages. Eventually that became Multipack, because they were originally modular, you could install them separately. But a massive problem in Python was actually just getting your stuff installed. As for releasing software at the time, today people think, what does that mean? Well, then it meant I had some bad web page up and I put up a tarball, just a gzipped tarball of source code. That was the release.

But okay, can we just stay on that? Because the community aspect of creating the package and sharing it, that's rare to have at that time. That was pretty early. Well, maybe not rare, maybe you can correct me on this, but it seems like in the scientific community so many people were basically solving the problems they needed to solve to process the particular data that they needed. To also have the mind that "I'm going to make this usable for others," that's something else.

I would say I'd been inspired by Linux. I'd been inspired by Linus and him making his code available. I was starting to use Linux at the time, and I went, this is cool. So I had kind of been previously primed that way. And generally I was into science because I liked the sharing notion. I liked the idea of: hey, if collectively we build knowledge and share
it, we can all be better off.

Okay, so you were energized by that already.

Yeah, right, and I can't deny that. I liked that part of science, that part of sharing. And then all of a sudden, oh wait, here's something I could do. And then slowly, over years, I learned how to share better, so that you could actually engage more people faster. One of the key things was actually giving people a binary they could install, so that it wasn't just "here's your source code, good luck compiling this," but something compiled and ready to install. In fact, a lot of the journey from '98 even through 2012, when I started Anaconda, was about that. It's really the key as to why the scientist with dreams of doing MRI research ended up starting a software company that installs software.

I work with a few folks now that don't program, on the creative side, the video side, the audio side, and because my whole life is running on scripts, I now have the task of teaching them enough Python to run the scripts. So I've actually been facing this, whether it's Anaconda or something else, the task of how do I minimally explain, basically to my mom, how to write a Python script. It's an interesting challenge. It's a to-do item for me to figure out what is the minimal amount of information I have to teach, what are the tools you use, so that, one, you enjoy it, and two, you're effective at it. Those are two related questions. And then the debugging, the iterative process of running the script to figure out what the error is, maybe even, for some people, doing the fix yourself. So do you compile it? How do you distribute that code to them? It's interesting, because I think it's exactly what you're talking about. If you increase the circle of empathy, the circle of people that are able to use your programs, you increase its
effectiveness and its power. So you have to think: can I write scripts, can I write programs that can be used by biomedical engineers, by all kinds of people that don't know programming, and actually maybe plant a seed, have them catch the bug of programming so that they start on their journey? That's a huge responsibility, and ultimately it has to do with the Amazon one-click buy: how frictionless can you make the early steps?

Frictionless is actually really key to growing any community. At any friction point you're just going to lose some people. Now, sometimes you may want to intentionally do that. If you're early enough on, you need a lot of help, you need people who have the skills. It's actually helpful if you don't have too many users as opposed to contributors, if you're early on. Anyway, SciPy started in '98, but it really emerged as this collection of modules that I was just putting on the net. People were downloading them, and I think I got 100 users by the end of that year. But the fact that I got 100 users, and more than that, people started to email me with fixes, that was actually intoxicating. Here I was writing papers and giving talks at conferences, and people would say hello, or "Yeah, good job," but mostly it was

You're reviewed. It's competitive.

Yeah, right. You publish a paper and people were like, oh, it wasn't my paper. I was starting to see that side of academic life. I thought it was a cooperative effort, but it sounds like we're here just to one-up each other. It's not true across the board, but a lot of that's there. But here in this world, I was getting responses from people all over the world. I remember Pearu Peterson in Estonia was one of the first people, and he sent me back this makefile, because the
first thing it is yeah your build thing stinks and here's a better makefile now it was a complex makefile i think i never understood that makefile actually but it worked and it did a lot more and so then thanks this is cool and that was my first kind of engagement with community development but you know the process was he sent me a patch file i had to upload a new tarball and i just found i really loved that and the style back then was here's a mailing list it was very it wasn't as it certainly weren't the tools that are available today it was very early on but i really started that's the whole year i i think i did about seven packages that year right and then by the end of the year i collected them into a thing called multipack so 99 there was this thing called multipack and that's when a high school student who was a high school student at the time a guy named robert kern took that package and made a windows installer right and then of course a massive increase of usage so by the way most of this development was under linux yes yes it was on linux i was a linux developer doing it on a linux box i mean at the time i was actually getting into i had a new hard drive i did some kernel programming to to make the hard drive work i mean not programming but modification to the kernel so i could actually get the hard drive working i i love that aspect of it i was also in you know at school i was building a cluster i took mac computers like uh and you put yellow dog linux on them uh they were at the mayo clinic they were just they're all these macs that were older they were just getting rid of and so i kind of got permission to go grab them together i put about 24 of them together in a cluster and a cabinet and put yellow dog linux on them all and i wrote a c plus plus um program to do mri simulation that was what i was doing at the same time for my day job so to speak so i was loving the whole process at the same time i was oh i need an ordinary differential equation that's why 
ordinary differential equations were key was because that's the heart of a bloch equation for simulated mri is an ode solver and so that's but i actually did that it doesn't happen at the same time that's why it kind of what you're working on and what you're interested in they're coinciding i was definitely scratching my own itch in terms of building stuff and uh which helped in the sense that i was using it for me so at least had one user yeah i had one person who's like well i know this is better i like this interface better and i had the experience of matlab to guide some of what those apis might look like but you know you're just doing it yourself you're building all this stuff but with the windows installer it was the first time i realized oh yeah the binary installer really helps people and so that led to spending more time on that side of things so around 2000 so i graduated my phd in 2000 end of year 2000 so 99 doing a lot of work there 98 do a lot of work there 99 kind of spending more time on my phd you know helping people use the tools thinking about what i want to go from here there was a company there's a guy actually eric jones and travis vaught they were two friends who founded a company called enthought it's here in austin still here and they eric contacted me at the time when i was a uh i was a graduate student still and he said hey why don't you come down we want to build a company you know we want we're thinking of you know a scientific company and we want to take what you're doing and kind of add it to some stuff that he'd done he'd written some tools and then pearu peterson had done f2py let's come together and build pull this all together and call it scipy so that's the origin of the scipy brand it came from you know multipack and a whole bunch of modules i'd written plus a few things from some other folks and then pulled together in a single installer scipy was really a distribution of python masquerading as a library how did you think about scipy 
in context of python in context of numeric like what we saw scipy as a way to make an r&d environment for python like use python uh dependent on numeric so numeric was the array library we depended on and then from there extend it with a bunch of modules that allowed for and at the time the original vision of scipy was to have plotting was to have you know a repl you know the repl environment and kind of a whole really a whole data environment um that you could then install and get going with and that was kind of the thinking it didn't really evolve that way right it sort of had a but one it's really hard to do massive scale projects in a with with open source collectives actually there's a there's sort of an intrinsic uh cooperation limit as to which you know too many cooks in the kitchen you know you can do amazing infrastructure work when it comes down to bringing it all together into a single deliverable that actually requires a little more a little more product management that is not it doesn't really emerge from the same dynamic so it struggled you know struggled to get almost too many voices it's hard to have everybody agree you know consensus doesn't really work at that scale you end up with politics you know with the same kind of things that's happened in large organizations trying to decide on what to do together um so consensus building was still was was challenging at scale as more people came in right early on it's fine because there's nobody there and so it works but then as you get more successful the more people use it all of a sudden oh there's this this scale at which this doesn't work anymore and we have to come up with different approaches so scipy came out officially in 2001 was the first release most the time i remember the days of getting that release ready it was a windows installer and there was there were bugs on how you know the windows compiler handled complex numbers and you were you're chasing segmentation faults and it was it's 
a lot of work there's a lot of effort had nothing to do with my area of study at the same time i just got an offer so he wondered if i wanted to come down and help him start that you know start that company with his friend and i at the time i was like i was intrigued but i was squaring a path an academic path and i just got an offer to go and teach at my alma mater so i took that tenure track position and scipy and kind of then i started work on scipy as a professor too okay so that that's i left i've got the mayo clinic graduated wrote my thesis using scipy wrote you know there's there's images that were created now the plotting tool i used was something from yorick actually it was a plotting a plt kind of a plotting language that i used yorick is a programming language it was a programming language had a plotting tool dislin we had integration to dislin i ended up using dislin plus some some of the plotting from yorick linked to from python anyway it was a people don't plot that way now but this is before and scipy was trying to add plotting yeah right it didn't have much success really the success of plotting came from john hunter who had a similar experience to my experience my kind of maverick experience as a person just trying to get stuff done and kind of having more time than than money maybe right and john hunter created matplotlib he's the creator of matplotlib yeah so john hunter was uh you know he wasn't a student at the time but he was an actor he was working in quant field and he said we need better plotting so he just went out and said cool i'll make a new project and we'll call it matplotlib and he released in 2001. 
about the same time that scipy came out and it was separate library separate install used numeric scipy used numeric and so scipy you know 2001 released scipy and then enthought created a conference called scipy which was brought people together to talk about the space another conference is still ongoing it's one of the favorite conferences of a lot of people because it's it's changed over the years but early on it was you know a collection of 50 people who care about scientists mostly practicing scientists who want to care about coding and doing it well and not using matlab i remember being driven by you know i like matlab but i didn't like the fact that like so i'm not opposed to proprietary software i'm actually not an open source zealot i love open source for the what it brings but i also see the role for proprietary software what i didn't like was the fact that i would develop code and publish it and then effectively telling somebody here to run my code you have to have this proprietary software right and there's also culture around matlab as much because i've talked to a few folks math works great it's my life yeah i mean there's just a culture they try really hard but it's just there's this corporate ibm style culture that's like or whatever i don't don't want to say negative things about ibm or whatever but there's a no it's it's really that connection something i'm in the middle of right now is is the business of open source and how do you connect the ethos of cooperative development with the necessity of of creating profits right and like right now today you know i'm still i'm still in the middle of that that's actually the early days of of me exploring this question because i was writing scipy i mean as an aside i also had so i had three kids at the time i have six kids now i got married early wanted a family uh i had three kids and i remember reading i remember reading richard stallman's post and i was i was a fan of stallman i would read his work i liked this 
collective ideas he would have certainly the ideas on ip law i read a lot of stuff but then he said you know okay well how do i make money with this how do i make a living how do i pay for my kids all this stuff was in my mind a young graduate student making no money thinking i got to get a job and he said well you know i think just be like me and don't have kids right that's just don't don't that's his take on this that was just that was that was the what what he said in that moment right that's the thing i read and i went okay this is a train i can't get out yeah there has to be a way to preserve the culture of open source and still be able to make sufficient money to feed you yes exactly there's got to be well so that actually led me to a study of economics because at the time i was ignorant and it really was i'm actually i'm embarrassed for educational system that they could let me and i was valedictorian in my high school class and i did super well in college and like academically i did great right but the fact that i could do that and then be clueless about this key part of life it led me to go there's a problem like i should i should have learned this in fifth grade i should learn this in eighth grade like everybody should come out with a basic knowledge of economics you're an interesting example because you've created tools that uh change the lives of probably millions of people and the fact that you don't understand at the time of the creation of those tools the basics economics of how like to build up giant system is a problem yeah it's a problem and so i during my phd at the same time this is actually in 98 99 at the same time i was in the library i was reading books on capitalism i was reading books on marxism i was reading books on you know what is this thing what does it what does it mean yeah and i encountered a basically what i encountered a set of writings from people that said they were the inheritors adam smith but adam smith for the first time 
right which is the wealth of nations and kind of this notion of emergent emergent uh societies and realized oh there's this whole world out here of people and in the challenge the economics is also political like because economics you know people different parties running for office they'll they want their economic friends they want their economist to back them up right or to to be there to be their magicians like the magicians in pharaoh's court right the people that are going to say hey this is you should listen to me because i've got the expert who says this and so it gets really muddled right but i was looking at from as a scientist as a scientist going what is this space what does this mean how do people how does paris get fed how does how what is money how does it work i found a lot of writings i really loved i found some things that i really loved and i learned from that it was writings from people like von mises he wrote a pre-order paper in 1920 that still should be read more than it is it's got i mean it was the economic calculation problem of the socialist commonwealth it's basically in response to the bolshevik revolution in 1917. 
and his basic argument was it's not going to work to not have private property you're not going to be able to come up with prices the bureaucrats aren't going to be able to determine how to allocate resources without a price system and a price system emerges from people making trades and they can only make trades if they have authority over the thing they're trading and that that that creates information flow that you just don't have if you try to top down it right right it's like huh that's a really good point yeah the prices have a signal that's used and it's important to have that signal when you're trying to build a community of productive people like you would in the software engineering yeah the prices are actually an important signaling mechanism yeah right and that money is just a bartering tool right so this is the first time i've encountered any of this concept right and the fact that oh this is actually really critical like it's so critical to our prosperity and that we're dangerously not learning about this not teaching our children about this you know so you had the three kids you had to make some stuff how to make some money right i had to figure it out but i didn't really care i mean i was never i've never been driven by money just need it right right to eat so what how did that resolve itself in terms of sci-fi so i would say it didn't really resolve itself it sort of started a journey that i'm continuing on i'm still on i would say i don't think it resolved itself but i will say i i went in wide eyes wide open like i knew that there were problems with you know um giving stuff away and creating uh the the ex market externalities the fact that yeah people might use it and i might not get paid for it and i'll have to figure something else out to get paid like at least i can say i'm not bitter that a lot of people have used stuff that i've written and i haven't necessarily benefited economically from it like yeah i've heard other people be you know 
bitter about that when they write or they talk like oh i should have got more value out of this and i'm also i want to create systems that let people like me who might have these desires to do things let them benefit so it actually creates more of the same not to turn on your bitterness module but there's some aspect i wish there was mechanisms for me to reward whoever created scipy and numpy because it brought so much joy to my life i appreciate that i mean the tip jar notion was there i appreciate that and i think but there should be a very there's surely mechanism mechanism i totally agree i would love to talk about some of the ideas i have because i actually came across i think i've come up with some interesting notions that could work but they'll require you know anything that will work takes time to emerge right like things don't just turn overnight that's definitely one thing i've also understood and learned is any fixes that's why it's kind of funny we often give credit to you know oh this president gets elected and oh look how great things have done and i saw that when when i had a transition at anaconda when a new ceo came in right and it's like the success that's happening there's an inertia there yeah right and sometimes the decision you made like 10 years before is the reason why the successes see right exactly so we're sort of just running around taking credit for stuff credit assignment has like a delay to it yes that this makes the credit assignment basically wrong more than right wrong more than right exactly and so i'm like oh this is you know that's the stuff i would i would read a ton about you know early on so i don't i feel like i'm with you like i want the same thing i want to be able to and honestly not for personally i've been happy i've been i've been happy i feel like i don't have any i mean we've been done reasonably okay but i've had to pursue it like that's that's really what started my um trajectory from academia is reading that stuff 
letting me say oh entrepreneurship matters so i love software but entre but we need more entrepreneurs and i want to understand that better so once i kind of had that that virus infect my brain it even though i was on a trajectory to go to a tenure track position at a university and i was there for six years i was kind of already out the door when i started and we can get into that but yeah um what can i just ask a quick question on is there some design principles that were in your mind around scipy like is there some key ideas that were just like sticking to you that this is this is the fundamental ideas yeah i would say so i would think it's basically accessibility to scientists like give them give scientists and engineers tools they don't have to think a lot about programming so give them really good building blocks give them functions that they want to call and sort of just the right length of spelling you know there's a one tradition in programming where it's like you know make very very long names right and you can see it in some programming languages where the names get you know take half the screen and i in in the fortran world characters would have to be six six letters early on right and that's way too too much too too little but i was like i like to have names that were informative but short so even though python well this is a different conversation but documentation is doing some work there so when you look at great scientific libraries and functions there's there's a richness of documentation that helps you get into the details the first glance that a function gives you the intuition of all it needs to do by looking at the headers and so on but to get the depths of all the complexities involved all the options involved documentation does something else documentation is essential yeah so that was actually a so we thought about several things one is we wanted plotting we wanted interactive environment we wanted good documentation these are things we knew 
we wanted the reality is those took about 10 years to evolve right given the fact that we didn't have a big budget it was all volunteer labor it was sort of um when enthought got created and they started to you know try to find projects people would pay for pieces and they were able to fund some of it not nearly enough to keep up with what was necessary no criticism just simply the reality i mean it's it's hard to start a business and then do consulting and also promote an open source project that's still fairly new scipy is fairly niche we stayed connected all while i was a student sorry a professor i went to byu and started to teach electrical engineering all the applied math courses i loved teaching signal processing probability theory electromagnetism i was the if you look at rate my professor which my kids love to do i wasn't like i got some bad reviews because people what was the criticism um i would speak too high too high of a level like i definitely had a calibration problem coming out of uh graduate work where i hate to be condescending to people like i really have a ton of respect for people fundamentally like my fundamental thing is i respect people sometimes that can lead to a i was i was thinking they were they had more knowledge than they did and so i would just speak at a very high level yeah assume they got it but they need to rise to the standard that you set i mean that's one of the some of the greatest teachers do that yeah and i agree and that was kind of what was inspiring me but but you know you also have to i i cannot say i was uh i was articulate to some of the greatest teachers right i was you know like one one classic example when i first taught at byu my very first class it was overheads transparencies overheads before projectors are really that common i saw transparencies i'm writing my notes out i go in room's half dark i just blaring through these transparencies here it is here it is here it is and i gave a quiz after two weeks nobody 
knew anything nothing i had gotten anywhere and i realized okay i'm not this is not working so i took put away the transparencies and i turned around just started using the chalkboard and what it did is it slowed me down right the chalkboard just slowed me down and gave people time to process and to think and then that made me focus my writing wasn't great on their chalkboard but i really love that part of like the teaching so that that entered scipy's world in terms of we always understood that there's a didactic aspect of scipy kind of how do you take the knowledge and then produce it the challenge we had was the scope like ultimately scipy was everything right and so 2001 when it first came out people were starting to use it no this is cool this is a tool we actually use at the same time 2001 time frame there was a little bit of um like the hubble space telescope the folks at hubble has started saying hey python we're gonna use python for processing images from hubble and so perry greenfield was a good friend and running that program and he had called me before i left to byu and said you know we want to do this but numeric actually has some challenges in terms of you know it's not the array doesn't have enough types uh we need more operations you know broadcast needs to be a little more settled uh they wanted record arrays they wanted you know record arrays are like a data frame but a little bit different but they wanted more structured data so he had called me even early on then and they said yeah hey would you want to work on something to make this work i said yeah i'm interested but i'm going here and i you know we'll see if i have time so in the meantime while i was teaching and scipy was emerging and i had a student i was constantly while i was teaching trying a way to fund this stuff so i had a graduate student uh my only graduate student a chinese fellow luhongza is his name great guy he wrote a bunch of stuff for iterative iterative linear algebra like 
got into writing some of the iterative linear algebra uh tools that are currently there in scipy and they've gotten better since but this is in 2005. kept working on scipy but perry had started working on a replacement to numeric called numarray and in 2004 a package called ndimage it was an image processing library that was written for numarray and it had in it a morphology tool i don't know what morphology is it's open dilations close you know there's sort of this as a medical imaging student i knew what it was because it was used in segmentation a lot and in fact i wanted to do something like that in python in scipy but just had never gotten around to it so when it came out that it worked only on numarray and scipy needed numeric and so we effectively had the beginning of this split and numeric and numarray didn't share data they were just two so you could have a gigabyte of numarray data and gigabyte of numeric data and they wouldn't share it yeah and so you have these any of these scientific libraries written on top i got really bugged by that yeah i got really like oh man this is not good we're not cooperating now we're not we're sort of redoing each other's work and we're just this young community so that's what led me even though i knew it was risky because my you know i had i was on a tenure track position 2004 i got reviewed they said hey things are going okay you're doing well paper's coming out but you're kind of spending a lot of time in this open source stuff maybe do a little less of that and a little more of the paper writing and grant writing which was naive but it was definitely the time you know the thinking it still goes on still goes on you're basically creating a thing which enables science in the 21st century right um maybe don't emphasize that so much in your for your tenure right it illustrates some of the challenges yeah it does it's and it's people mean well yeah like but we've gotten broken in a bunch of ways certain 
things programming understanding the role of software engineering and programming exactly society is a little bit less exactly no i was in electrical engineering position right so it was even worse there yeah it was very they were very focused and so you know good people and i had a great time i loved my time i loved my teaching i loved all the things i did there the problem was this split was happening in this community i loved i saw people and i go oh my gosh this is going to be this is not great and so i happened you know fate i had a class i had signed up for it it's a i was trying to build an mri system so i just i had a kind of a radio a instead of a radio a digital radio class as a digital mri class uh and i had people sign up two people signed up then they dropped and so i had nobody in this class so and i didn't have any other courses to teach and i thought oh i've got some time and i'll just write i'll just write a merger of numeric and numarray like i'll basically take the numeric code base add the features numarray was adding and then kind of come up with a single array library that everybody can use so that's where numpy came from was my thinking hey i can do this and who else is going to because at that point i'd been around the community long enough and i'd written enough c code i knew i knew the structures and i in fact my first contribution to numeric had been writing the c api documentation that went in the first documentation for numpy for numeric sorry this is paul dubois david ascher konrad hinsen and myself i got credit because i wrote this chapter which is all the c api of numeric all the c stuff so i said ah i'm probably the one to do it you know nobody else is going to do this so it's sort of out of a sense of duty and passion knowing that i don't think my academic advice i don't think the department here is going to appreciate this but i it's the right thing to do i was like can we just linger on that moment because the importance of 
the way you thought and the action you took i feel is is understated and is rare and i would love to see so much more of it because what happens as the tools become more popular uh there's a split that happens and it's a truly heroic and impactful action to in those early in that early split to step up and you it's like great leaders throughout history like get what is the braveheart like get on a horse and rally the troops because i think that can have make a big difference we have tensorflow versus pytorch in the machine learning we have the same problem today yes i wonder it's actually bigger i wonder if it's possible to in the early days to rally the troops it is possible especially in the early days the longer it goes the harder right the more energy and the factions the harder but in the early days it is possible and it's extremely helpful and there's a willingness there yeah but but but the challenge is there's usually not a willingness to fund it yeah there's not a willingness to you know like i was literally walking into a field saying i'm going to do this and here i am like you know i have five kids at home now pressure builds sometimes my wife hears these stories and she's like you did what i thought we were gonna i thought you were actually on a path to make sure we had resources and money but oh wow but again there's a there's an aspect i'm a very hopeful person i'm an optimistic person by nature i love people i learned that about myself later on uh part of my my religious beliefs actually lead to that and it's why i hold them dear because it's actually how i feel about that's why it's what leads me to this these attitudes sort of this hopefulness and this sense of yeah it may not it may not work out for me financially or maybe but that's not the ultimate gain like that's a thing but it's not that's not the score card uh for me and so i just wanted to be helpful and i knew and partly because these scipy conferences because the mailing-list 
conversations i knew there was a lot of need for this right so i had this it wasn't like i was alone in terms of no feedback i had these people who knew but it was crazy like people at the time said yeah we didn't think you'd be able to do it yeah we thought it was crazy and also instructive like practically speaking that you had a cool feature that you were chasing the morphology like the yes like it's it it's not either the end result it's not some visionary thing i'm going to unite the community you were like correct you were actually practically this is what one person actually could do uh and actually build because that is important because you can get over your skis yeah you can definitely get over your skis and i had in fact this almost got me over my skis right i would say well in retrospect i hate looking back we i can tell you all the flaws with numpy right you want to go into it there's lots of stuff that i'm like oh man that's embarrassing that was wrong i wish i had somebody stop me with a wet fish there yeah like i needed in fact what i wished i'd had was somebody with more experience and certainly library writing an array library there's like i wish i had me i could go back in time and go do this do that there's important being cause there's things we did that are still there that are problematic that created challenges for later and and i didn't know it at the time didn't understand how important that was and in many cases didn't know what to do like there was pieces of the design of numpy i didn't know what to do until five years ago now i know what they should have been but i didn't know at the time and nobody and i couldn't get the help anyway so i wrote it it took about it took four months to write the first version then about 14 months to make it usable but it was it wasn't it was that first four months of intense writing coding getting something out the door that worked that was it was it was definitely challenging and then the big thing 
i did was create a new type object called dtype that was probably the contribution and the fact that i added uh broad not just broadcasting but advanced indexing so that you could do um mask indexing and indirect indexing instead of just slicing in so for people who don't know maybe you can elaborate yeah numpy i guess the vision in the narrowest sense is to have this object that represents n-dimensional arrays and like at any level of abstraction you want but basically it could be a black box that you can investigate in ways that you would naturally want to investigate yes such objects yes exactly so you could do math on it easily math on it easily so you had so it had an associated library of math operations and effectively scipy became an even larger upright set of math operations so the key for me was i was going to write numpy and then move scipy to depend on numpy in fact early on one of the initial proposals was that we would just write scipy and it would have the numeric object inside of it and it would be scipy.array or something that turned out to be problematic because numeric already had a little mini library of linear algebra and some functions and it had enough momentum enough users that nobody wanted to they wanted backward compatibility one of the big challenges of numpy was i had to be backward compatible with both numeric and numarray in order to allow both of those communities to come together there was a ton of work in creating that backward compatibility that also create echoes in today's object like some of the complexity in today's object is actually from that goal of backward compatibility these other communities which if you didn't have that you'd do something different which is instructive because a lot of things are there you know what is that there for it's like well it was it was a remnant it's an artifact of its historical existence um by the way i love the empathy and the lack of ego behind that because i feel you see that in the 
split in the javascript frameworks for example the arbitrary branching right is you i think in order to unite people you have to kind of put your ego aside and truly listen to others like you do what do you love about numarray what do you love about numeric like actually get a sense we're talking about languages earlier sort of empathize to the culture of the people that love something about this particular api some some the the naming style or the the the use the actual usage patterns and like truly understand them and so that you can like create that same draw in the united i completely agree and you have to also have enough passion that you'll do it it can't be just like a perfunctory yeah oh yes i'll listen i'm really i listen to you and then i'm not really excited about it you so it really is an aspect it's a it's a philosophical like there's a philia there's a love of esteeming of others it's actually at the heart of what it's sort of a life philosophy for me right that i'm constantly pursuing and that helped absolutely helped it makes me wonder in a philosophical like looking at human civilization as one object it makes me wonder how we can copy and paste travis's in the circle well in some aspects maybe some aspects right right exactly well i it's a good question how do we teach this how do we how do we encourage it how we lift it because so much of the software world it's it's giant communities right but it seems like so much is moved by like little individuals you talk about like linus torvalds it's like can you could you have not could you have had linux without him could you it's like guido and python you know python i mean the scipy community particularly it's like i said we wanted to build this big thing but ultimately we didn't what happened is we had uh mavericks and champions like john hunter created matplotlib we had fernando perez who created ipython and so we sort of inspired each other but and then it kind of there's sort of a culture of of 
this selfless give the stewardship mentality as opposed to ownership mentality with stewardship and and and community um focused community focused but intentional work like not not waiting for everybody else to do the work but you're doing it for the benefit of others and not you're not worried about what you're going to get you know you're not worried about the credit you're not worried about we're going to get you're worried about i later realized that i have to worry a little about credit not because i want the credit because i want people to understand what led to the results like i don't it's not it's not about me it's i want to understand this is what led to the result so let's like i think doing and this is what had no impact on the result like let's let's promote this just like you said i want to promote the attributes so let that help make us better off how do we make more of wes wes mckinney like wes mckinney was critical to the success of python because of his creation of pandas which is the roots of that were all the way back in uh numeric and numarray and numpy where numpy created an array of records wes started to use that almost like a data frame except it's an array of records and data frame the challenge is okay if you want to augment it add another column you have to insert you have to do all this memory movement to insert a column whereas data frames became oh i'm going to have a loose collection of arrays so it's a record of arrays that is the heart of a data frame and we thought about that back in the numarray days but wes ended up doing the work to build it and then then also the operations that were relevant for data processing what i noticed is just the each of these little things creates just another tick another up so numpy ultimately took a little while about six months in people started to join me you know francesc alted robert kern charles harris and these these people are many of the unsung heroes i would say people who are you know they 
don't they sometimes don't get the credit they deserve because they were critical both to support like you know it's it's hard and you want you need some support people need support and i needed just encouragement and they were helping encouraged by contributing and and once the big thing for me was when john hunter he had previously done kind of a simple thing called numerix to kind of you know between numeric and numarray he had a little high level tool that would just select each one from matplotlib in 2006 he finally said we're going to just make numpy the dependency of matplotlib as soon as he did that and i remember specifically when he did that i said this okay we've done it like that was when i knew we had succeeded and before then it was still you know unsure but that kind of started a roller coaster and then 2006 to 2009 and then i've been floored by the by what it's done like i had i knew it would help i had no idea how much it would help right so and it has to do with again the language thing that just people started to think in terms of numpy like yes and that opened up a whole new way of thinking and part of the story that you kind of mentioned but maybe you can elaborate is it seems like at some point in this story python took over science and data science yeah and uh not bigger than that the scientific community started to think like programmers or started to utilize the tools of computers to do like at a scale that wasn't done with fortran like at this gigantic scale they started to opening their heart and then python was the thing i mean there's a few other competitors i guess but python i think really really took over i agree there's a lot of stories here that are kind of during this journey because this is sort of the start of this journey in 2006. so my tenure committee i applied for tenure in 2006 2007. 
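The dtype object, mask and indirect indexing, and the array-of-records versus record-of-arrays distinction discussed above can be sketched in a few lines. This is a minimal illustration, not code from the conversation: the field names, the sample values, and the variable names (`record_dtype`, `records`, `columns`) are all assumptions made up for the example, and a plain dict stands in for a data frame.

```python
import numpy as np

# Array of records: one contiguous block where each element holds every field.
# The field names and values here are hypothetical.
record_dtype = np.dtype([("name", "U8"), ("price", "f8")])
records = np.array([("a", 1.5), ("b", 20.0), ("c", 3.0)], dtype=record_dtype)

# Mask indexing: a boolean array selects the rows where the condition holds.
mask = records["price"] > 2.0
expensive = records[mask]

# Indirect indexing: a list of integer positions selects and reorders rows.
picked = records[[2, 0]]

# Adding a column to an array of records means building a wider dtype,
# allocating a new array, and copying every existing field across.
wider_dtype = np.dtype(record_dtype.descr + [("qty", "i8")])
wider = np.zeros(records.shape, dtype=wider_dtype)
for field in record_dtype.names:
    wider[field] = records[field]
wider["qty"] = [10, 20, 30]

# Record of arrays: a loose collection of independent columns (a dict here,
# standing in for a data frame). Adding a column touches no existing data.
columns = {"name": records["name"].copy(), "price": records["price"].copy()}
columns["qty"] = np.array([10, 20, 30])
```

The copy loop is the memory movement described for the array-of-records layout; the dict assignment on the last line is the cheap column insert that the record-of-arrays layout allows.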
it came back i split the department i was very polarizing i had some huge fans and then some people said no way right so it was very i was a polarizing figure in the department it went all the way up to the university president ultimately my department chair had the had this sway and they didn't say no they said come back in two years and do it again and i went at that point i was like i said i i i mean i had this interest in entrepreneurship this interest in in not the academic circles not the like how do how do we make industry work so i do have to give credit to that expert that exploration of economics because that led me oh i had a lot of opinions i was i was actually very libertarian at the time and i'm still i have some libertarian trends but i'm more of a i understand i'm more of a collectivist libertarian so you value broadly philosophically freedom i value broadly philosophy freedom but i also understand the the power of communities like the power of of collective behavior and and so what's that balance right that makes sense um so by the time i was just i got to go out and explore this entrepreneur world so i left academia i said no thanks called my friend eric here who had his company was going i said hey could i join you and start this trend and and he would at that time they were using sci-fi a lot they were trying to get clients and so i came down to texas and in texas where i sort of it's my entrepreneurial world right i left academia and went to entrepreneur world in in 2007. 
so moved here in 2007 kind of took a leap knew nothing really about business knew nothing about a lot of stuff there there's you know for a long time i've kept some connections to a lot of academics because i still value it i still love the scientific tradition i still value the the essence and the soul and the heart of what is possible don't like a lot of the administration and the kind of we can go into detail about why and where and how this happens what are those challenges i mean i i don't know but i'm with you so well i'm still affiliated with mit i still love mit because there's magic there yeah there's people i talk to like researchers faculty in those conversations and the white board and and just the conversation that's magic there all the other stuff the administration all that kind of stuff seems to um you don't you don't want to say too harshly criticize sort of bureaucracies but there's a lag that seems to get in the way of the magic yeah and i don't i'm still have a lot of hope that that can change because i don't often see that particular type of magic elsewhere in industry so like we need that and we need that flame going and um it's the same thing as exactly as you said it has the same kind of elements like the open source community does and but then if you like the reason i stepped away the reason i'm here just like you did in austin is like if i want to build one robot i'll stay at mit but if i want to build millions and make money enough to where i can explore the magic of that then you can't and i think that dance is um that translational dance has been lost a bit yeah right and there's a lot of reasons for that i'm certainly not an expert on this stuff like an opinion like anybody else but i i realized that i wanted to explore entrepreneurship which i knew and really figure out and it's been a driving passion for 20 years 20 25 years how do we connect capital markets and company um because again i fell in love with the notion that oh profit 
seeking on its own is not a bad thing it's actually a coordination mechanism for allocating resources that you know not in an emergent way right that respects everybody's opinions right so this is actually powerful so so i i say all the time when i make a company and we do something that makes profit what we're saying is hey we're collecting of the world's resources and voluntarily people are asking us to do something they like and that's a huge deal and so i really like that energy so that's what i came to do and to learn and to try to figure out and that's what i've been kind of stumbling through since for the past 14 years 2007 2007. so you were still so no i was just emerging just right one thing i've done i've done it's worth mentioning because it emphasized the exploratory nature of my thinking at the time i said well i don't know how to fund this thing i've got a graduate student i'm paying for and i got no funding for him and i had done some fundraising from the public to try to get public fundraiser for my lab i didn't really want to go out and just do the fundraising circuit the way it's traditionally done so i wrote a book and i said i'm going to write a book and i'm going to charge for it it was called guide to numpy and so ultimately numpy became documentation driven development because i basically wrote the book and made sure the stuff worked to the book would work so it really helped actually make numpy become a thing so that doc writing that book and it was not a i mean it's not a page turner i mean kind of not a book you pick up and go oh this is great over the fire but it was it's where you could find the details like how'd all this work and a lot of people love that book and so a lot of people end up so i but i said look i i need to so i'm going to charge for it uh and i got some flack for that not that much just just probably five angry messages people you know yelling at me saying i was you know bad guy for for charging for this book one of 
them richard no just kidding no i i haven't really had any interaction with him personally uh like i said um but but there were a few but but actually surprisingly not there was actually a lot of people like no it's fine you know you can charge for a book that's no big deal we know that's a way you can you can try to make money around open source so so what i did what i i did in an interesting way i said well you know kind of my ideas around around ip law and stuff i love the idea you can share something you can spread it like once it's the fact that you have a thing and copying is free but the creation is not free so how do we how do you fund the creation and allow the copying right and then software it's a little more complicated than that because creation is actually a continuous thing you know it's not like you build a widget that's done it's sort of a process of emerging and continuing to to create but i wrote the book and had this market determined price thing i said look i need i think i said 250 000. if i make 250 000 from this book it's it'll make it free so as soon as i get that much money or i said five years right so there's a time limit like that's forever cool i didn't know the story yeah i i released it on this and it's actually interesting because one of the people who also thought that was interesting ended up being chris white who was the director of darpa project that we got funding through at anaconda and the reason he even called us back is because he remembered my name from this book and he thought that was interesting and so even though we hadn't gone to the demo days we applied and the people said yeah nobody ever gets this without coming to the demo day first that's the first time i've seen it but it's because i knew you know chris had done this and had this interaction so it did have impact i was actually really really pleased by the result i mean i ended up i ended up in three years i mean 90 000. 
so i sold 30 000 copies by myself i just put it up on you know use paypal and sold it uh made and those are my first taste of kind of okay this can work to some degree and and i you know all over the world right from germany to japan to it was actually it did work and so i appreciated the fact that paypal existed and had a way to make to get the money the distribution was simple this is pre-amazon book stuff so it was just published in a website it was the popularity of scipy emerging and getting company usage i ended up not letting it go the five years and not trying to make the full amount because you know a year and a half later i was at enthought i had left academia was at enthought and i kind of had a full-time job and then actually what happened is the documentation people there's a group that said hey we wanna do documentation for scipy as a collective and they're essentially needing the stuff in the book right and and so they kind of ask hey can we just use the stuff in your book and at that point said yeah i'll just open it up so that's but it has served its purpose in the money that i made actually funded my grad student like it was actually you know i paid him 25 000 a year uh out of that money the funny thing is if you do very similar kind of experiment now with numpy or something like it you could probably make a lot more it's probably true because of the tooling and the community building yeah i agree like the and social media there there's just a virality to that kind of idea i agree there'd be things to do i've thought about that but and really had thought about a couple of books or a couple of things that could be done there and i just haven't right even i tried to hire a ghostwriter this week this year too to speak effect would help but it it didn't part of my problem is this i've been so excited by a number of things that steps in from that like so i came here worked at enthought for four years uh graciously you know eric made me president and we 
started to work closely together we actually helped him buy out his partner um it didn't end great like unfortunately eric and i aren't real aren't friends now um i still respect him i have a lot i mean i wish we were but uh he didn't like the fact that i that peter and i started anaconda right that was not i mean um so i'm there's two sides of that story so i'm not gonna go into it right sure um but you as human beings and you wish you still could be friends i do i do it saddens me i mean that that's um that's the story of great minds building great companies yeah somehow it's sad that um yeah when there's that kind of and i i i hold him in esteem i'm grateful for him i think he's they're doing you know enthought still exists they're doing great work uh helping scientists they still run the scipy conference they're in they have an r d platform they're selling now that's a a tool that you can go get today right so um enthought has played a role in the scipy in in supporting the community around scipy i would say they ended up not being able to they ended up building a tool suite to write gui applications like that's where they could actually make that the business could work and so this supporting scipy and numpy itself wasn't as possible like they didn't they tried i mean it was not just because it was just because the business aspect so and i wanted to build a company that could do that could get venture funding right for better or worse i mean that's a longer story we could talk a lot about that but and that's that's where anaconda came in so let me let me ask you it's it's a little bit for fun because you built this amazing thing and so let's let's talk about uh like an old warrior looking over old battles um you've you know there's a sad letter in 2012 that you wrote uh to the numpy mailing list announcing that you're leaving numpy yeah and what some of the things you've listed some some of the things you regret or not regret 
necessarily but some things to think about if you could go back and you could fix stuff about numpy or both sort of in a personal level but also like looking forward what kind of things would you like to see changed good question so i think there's technical questions and social questions right there um first of all you know i wrote numpy as a service and i spent a lot of time doing it and then other people came to help make it happen numpy succeeded because of the work of a lot of people right so it's important to understand that i'm grateful for the opportunity the role i had i could play and grateful that things i did had an impact but they only had the impact they had because the other people that came to the story and so they were essential but the way data types were handled the way data types we had array scalars for example that that are really just an um a substitute for a type concept right so we had array scalars are actual python objects so that there's for every for a 32-bit float or a 16-bit float or a 16-bit integer python doesn't have a natural it just has one integer there's one float well what about these lower precision types these larger precision types so we had them in numpy so that you could have a collection of them but then have an object in python that was one of them and there's questions about like in retrospect i wouldn't have created those i'd have improved the type system and like made the type system actually a python type system as opposed to currently it's a python one level type system i don't know if you know the difference between python one python two it's kind of technical kind of depth but python two one of its big things that guido did it was really brilliant it was he actually in python one all classes new objects were one if you as a user wrote a class it was an instance of a single python type called the ob called the class type right in python 2 he used a meta typing hook to actually go oh we can extend this and have users 
write classes that are new types so it's able to have your user classes be actual types and the python type system got a lot more rich i barely understood that at the time that numpy was written and so i essentially in python numpy created a type system that was python one era it was every every dtype is an instance of the same type as opposed to having new dtypes be really just python types with additional metadata what's the cost of that is it efficiency is it usability uh it's usability primarily the cost isn't really efficiency it's it's it's the fact that it's clumsy to create new types uh it's hard and then one of the challenges you want to create new types you want a quaternion type or you want to uh add a new you know posit type or you want to um so it's hard now in the and and now if we had done that well when numba came on the scene where we could actually compile python code it would integrate with that type system much cleaner and now all of a sudden you could do gradual typing more easily you could actually have python when you add numba plus better typing could actually be a uh you'd smooth out a lot of rough edges but there's already there's like but are you talking about from the perspective of developers within numpy or users of numpy because developers have new not really users of numpy so much it's the development of numpy so you're thinking about like how to design numpy so that it's contributors yeah the contributors are it's easier it's easier it's less work to make it better and to keep it maintained and and where that's impacted things for example is the gpu like all of a sudden gpus start getting added and we don't have them in numpy like numpy should just work on gpus right the fact that we have to have to download a whole other object called cupy to have arrays on gpus is just an artifact of history because there's no fundamental reason for it well that's really interesting if we could sort of go on that tangent briefly is you have 
pytorch and other libraries like tensorflow that basically tried to mimic uh yeah like you've created a sort of platonic form basically yeah exactly well the problem was they didn't realize that yeah the platonic form has a lot of edges they're like well we should cut those out before we present it so i i wonder if you can comment is there like a difference between their implementations do you wish that they were all using numpy over like in this abstraction yeah and sorry to interrupt that there's gpus asics there might be other neuromorphic computing there might be other kind of or the aliens will come with a new kind of computer like an abstraction that numpy should just operate nicely over the things that are more and more and smarter and smarter with uh with this multi-dimensional arrays yeah yeah i have there's several comments there we are working on something now called data-apis.org you can go there today and it's it's our answer it's my answer you know it's not just me it's me and ralf and and athan and aaron and a lot of companies are helping us at quansight labs uh it's not unifying all the arrays it's creating an api that is unified um so we do care about this and trying to try to work through it actually i had the chance to go and meet with the tensorflow team and the pytorch team and talk to them after uh after exiting anaconda just talking about because the first year after leaving anaconda i became deeply aware of this and realized that oh this split in the array community that exists today makes what i was concerned about in 2005 pretty parochial it's a lot worse right now there's a lot more people so the perhaps the industry can sustain more stacks right there's a lot of money but it makes it a lot less efficient i mean this but i've also learned to appreciate it's okay to have some competition it's okay to have different implementations but it's better if you can at least refactor some parts i mean you're going to be more efficient if you 
can refactor parts it's uh it's nice to have competition over things where they're innovative competition right they're innovative yeah innovative and then maybe on the infrastructure right uh whatever however you define infrastructure right maybe it's nice to have controversial exactly i agree and i think but it was interesting to hear the stories i mean tensorflow came out of the c++ library uh jeff dean wrote i think that was uh basically uh how they were doing inference right and then they realized oh we could do this tensorflow thing that close library then what was interesting to me was the fact that both google and facebook did not it's not like they supported python or numpy initially they just realized they had to they they came to this world and then all the users like hey where's the numpy interface oh and then they kind of came late to it and then they had these bolt-ons tensorflow's bolt on i don't mean to offend but it was so bad yeah it's the first time that i i i'm usually so i mean one of the challenges i have is i don't criticize enough because in the sense that i don't give people input enough you know if um i think it's universally agreed upon that the bolt-ons on tensorflow right but i went through it there was a talk given at a mallorca in in spain and it got a great guy i came and gave a talk i said you should never show that api again at a pydata conference like that that's terrible like you're taking this beautiful system you've created and like you're corrupting all these poor python people forcing them to write code like that or thinking they should uh fortunately you know they adopted keras as their and that's keras is better and so keras tensorflow is fine is reasonable but um they bolted it on facebook did too like facebook had their own c++ library for doing inference and they also had the same you know reaction they had to do this one big difference is facebook maybe because the way it's situated in the in part of 
fair part of the research library tensorflow is definitely used and you know they have to make they couldn't just open it up and let the community you know change what that is because i guess they were worried about disrupting their operations facebook's been much more open to having community input on the structure itself whereas google and tensorflow they're really eager to have user community users people use it and build the infrastructure but it's much more wild like it's harder to become a contributor to tensorflow and it's also this is very difficult question to answer and i don't mean to be throwing shade at anybody but you have to wonder it's the microsoft question of when you have a tool like pytorch or tensorflow how much are you tending to the hackers and how much are you tending to the big corporate clients correct and so correct like the ones that so do you tend to the millions of people that are giving you almost no money or do you tend to the few that are giving you a ton of money i tend to um stand with the people right because i feel like if you uh nurture the hackers you will make the right decisions in the long term that will make the companies happy i lean that way too totally but then you have to find the right date but it's a balance yeah it's because you can lean to the hackers and run out of money yeah exactly exactly which has been some of the challenge i've faced yes in the sense that like i like i would look at some of the experiments like numpy the fact that we have the split is a factor of i wasn't able to collect more money towards numpy development yeah right i mean i didn't succeed at in the early days of getting enough financial contribution to numpy so maybe i could work on it right i couldn't work on it full-time i had to just catch an hour here an hour there and i've basically not liked that like i've wanted to be able to do something about that for a long time and trying to figure out how well there's lots of ways i 
mean possibly one could say you know we had an offer from microsoft at early days of anaconda uh the 2014 they offered to come buy us right the problem was the right people at microsoft didn't offer to buy us and they were still they were it was really uh we were like a second they had really bought they just bought r the r company called um it was not rstudio but it was another r company that was emergent and it was kind of a well we should also get a python play but they were really doubling down on r right and so it was like it was where you would go to die so it's not it wasn't it was before satya was there satya had just started just started right and if the and the offer was coming from someone two levels down from him gotcha right and if it had come from scott guthrie so i got a chance to meet scott guthrie great guy i like him if the offer had come from him probably would be at microsoft right now that'd be fascinating that would be really nice actually especially given uh what microsoft has since done for the open source community yes i think they're doing well i really like some of the stuff they've been doing they're still working and they've you know they've hired guido now and they've hired a lot of python developers he retired then he came out of retirement and he's working out i was just talking to him and he didn't mention this person well i should i should have been further because i know he loved dropbox but i wasn't sure what he was doing who he was up to well he was kind of saying he'd retire but uh and it's it's literally been five years since i last sat down and really talked to guido right um guido is a technology uh expert right he's a so i i came i was excited because i'd finally figure out the type system for numpy i wanted to kind of talk about that with him and i kind of overwhelmed him could you stay in that mo just for a brief moment because you're a fascinating person in the history of programming and he is a fascinating person what have 
you learned from guido about programming about life yeah yeah uh a lot actually i've been a fan of guido's you know we have a chance to talk some i wouldn't say you know we talk all the time not only at all he may um but we talked enough to i respect his back when i first started numpy one of the first things i did was i had a i asked guido for a meeting with him and paul dubois in san mateo and i went and met him for lunch and basically to say maybe we can actually part of the strategy for numpy was to get it into python 3 and maybe be part of python so we talked about that that's cool about that approach right i would have loved to be a fly on the wall that was a that was good and over over the years for guido i learned so he was open like he was willing to listen to people's ideas right and know over the years now generally you know i'm not saying universally that's been true but but generally that's been true so he's willing to listen he's willing to defer like on the scientific side he would just kind of defer he didn't really always understand what we were doing like and he'd defer one place where he didn't enough was we missed a matrix multiply operator like that finally got added to python but about 10 years later than it should have but the reason was because nobody it took it takes a lot of effort and i learned this while i was writing numpy i also wrote tools to give a python dev and i added some pieces of python um like the memoryview object i wanted the structure of numpy into python so we didn't get numpy into python but we got the basic structure of it in the python like so you could build on it nobody did for a while but eventually database authors started to and it was it's a lot better they did and also antoine pitrou and stefan krah actually fixed the memoryview object because i wrote the underlying infrastructure in c but the python exposure was terrible until they came in and fixed it partly because i was writing numpy and numpy was the python 
exposure i didn't really care about if you didn't have numpy installed anyway guido open to ideas technology you know brilliant like really i really got a lot of respect for him when i saw what he did with the this type class merger thing that was actually tricky right and then and then willing to share willing to share his ideas so the other thing early on in 1998 i said i wrote my first extension module the reason i could is because he wrote this blog post on how to do reference counting right and without it i would have been lost right but he was he was willing to at least try to write this post and so he's been motive he's been motivated early on with python it was a computer science for everybody we kind of have this early on desire to oh maybe we should be pushing programming to more people so he had this populist notion i guess or populist sense um so i learned that there's a certain skill i've seen it in other people too of engaging with contributors sufficiently to because when somebody engages with you and wants to contribute to you if you ignore them they go away so building that early contributor base requires real engagement with other people and he would do that can you also comment on this tragic uh stepping down from his position as the benevolent dictator for life over the wars uh uh you know the walrus operator the walrus operator was the last battle i don't know if that's the cause of it but uh this there's this for people who don't know you can look up there's the walrus operator which is uh looks like a colon and an equal sign yeah and equal sign and it actually does maybe the thing that you that an equal sign should be doing yeah maybe right exactly uh yeah but it's just historically equal sign means something else it just means assignment so he stepped down over this what do you think about the pressure of leadership some of the you mentioned the letter i wrote to numpy at the time that was a hard time actually i mean 
you know there's been really hard times it was hard you know you get criticized right and you get pushed and you get um not everybody loves what you do like anytime you do anything that has impact at all you're not universally loved right you get some real critics and that's an important energy because it's impossible if you do everything right you need people to be pushing but sometimes people can get mean yeah people can i i prefer to get people to benefit the doubt i don't immediately assume they have bad intentions and maybe for other you know maybe other maybe that doesn't happen for everybody they for whatever reason their past their experience with people they they sometimes have bad and they so they immediately attribute to you bad intentions so you're like where this come from i mean i definitely open the criticism but i think you're misinterpreting the whole point uh because i i would get that you know certainly when i started anaconda you know i've been sometimes i say to people uh i know i'm i care enough about entrepreneurship to make some open source people uncomfortable and i care enough about open source to make investors uncomfortable so i sort of you know create you create kind of doubters on both sides so when you have and this is just a a plea to the listener and the public i've noticed this too that there's a tendency and social media makes this worse when you don't have perfect information about the situation you tend to fill the gaps with the the worst possible or at least a bad uh story that fills those gaps and i think it's good to live life uh maybe not fully naively but filling in the gaps with the with the with the good with the best with the positive with the with the hopeful explanation of why you see this so if you see somebody like you trying to make money on a book about numpy there's a million stories around that that are positive and those are good to think about to project positive intent on to people because for many reasons 
usually because people are good and they do have good intent and also when you project that positive intent people step up to that too yes so like it's it has a great point it has this kind of viral nature to it and of course what twitter early on figured out on facebook is that they can make a lot of money and engagement from the negative yes so like there's this we're fighting this mechanism i agree it's just challenging it's like easier it's just easier to be to be negative and then for some reason something in our minds really enjoys sharing that and getting getting all excited about the negativity we do yeah but but the protective mechanism perhaps that we're we're going to eat and if we don't exactly for us to be effective as a group of people in a software engineering project you have to project positive intent i think i totally agree totally agree and i think that's very so that that happens in this in the space but python has done a reasonable job in the past but here's a situation where i think it's it's starting to get this pressure where it didn't i was i really didn't i didn't know enough about what happened i've you know talked to several people about it and i know i think most of the steering committee members today uh one one person nominated me for that role but it's the wrong role for me right now right um i have a lot of respect for the python developer space and the python developers i also understand the gap between computer science python developers and array programming developers or science developers in fact python succeeds in the array space the more it has people in that boundary and there's often very few like i was playing a role in that boundary and you know working like everything to try to keep up with the with the what even what guido was saying like i'm a c programmer but not a computer scientist like i was a engineer and physicist and mathematician and i don't i didn't always understand what they were talking about and why they 
would have opinions the way they did so you have to listen and try to understand then you also have to explain your point of view in a way they can understand and that takes a lot of work and that that communication is always the challenge and it's just what we're describing here about the negativity is just another form of that like how do we come together and it does appear we're wired anyway to at least have a there's a part of us that will enemy you know friend enemy and and we see yeah it's like why are we wiring on the enemy front yeah so so why are we pushing that why are we promoting that so deeply let's assume friend until proven otherwise yes yeah so because you have such a fascinating mind and all this let me just ask you these questions so one interesting side on the python history is the move from python 2 to python 3. you mentioned move from python 1 to python 2 but the move from python 2 to python 3 is a little bit interesting because it took a very long time it uh it broke in a quite a small way backward compatibility but even that small way seemed to have been very painful for people is there lessons tons of lessons from uh from how long it took and how painful it seemed to be yeah tons of lessons well i mentioned here earlier that numpy was written in 2005. it was in 2005 that i actually went to guido to talk about getting numpy into python 3. like my strategy was to oh we're moving to python 3. let's have that be and it seems funny in retrospect because like wait python 3 that was in 20 2020 right when we finally ended support for python 2 or at least 2017. the reason it took a long time a lot of time i think it was because one of the things is there wasn't much to like about python 3. 
3.0 3.1 it really wasn't until 3.3 like i consider python 3.3 to be python 3.0 but it wasn't until python 3.3 that i thought there's enough stuff in it to make it worth anybody using it right and then 3.4 started to be oh yeah i want that and then 3.5 has the matrix multiply operator and now it's like okay we gotta use that plus the libraries that started leveraging some of the features of python exactly yeah so really the challenge was but it also illustrated a truism that when you have inertia when you have a group of people using something it's really hard to move them away from it you can't just change the world on them and python 3 you know i think it fixed some things guido had always hated he didn't like the fact print was a statement he wanted to make it a function but in some sense there's a bit of gratuitous change to the language and you could argue and people have but one of the challenges was there weren't enough features and too many changes without features and so that empathy for the end user as to why they would switch wasn't there i think also it illustrated just the funding realities like python wasn't funded it was also a project with a bunch of volunteer labor right it had more people so more volunteer labor but it was still it was funded in the sense that at least guido had a job and i've learned some of the behind the scenes on that now since talking to people who lived through it and maybe not on air we can talk about something but it's interesting to see guido had a job but his full-time job wasn't just to work on python yeah like he had other things to do it's just wild it is wild isn't it how few people are funded yes how much impact they have yes maybe that's a feature not a bug i don't know maybe yes exactly at least early on like it's sort of i know yeah it's like olympic athletes are often 
severely underfunded but maybe that's what brings out the greatness perhaps yes correct no exactly maybe this is an essential part of it because i do think about that in terms of i currently have an incubator for open source startups like what i'm trying to do right now is create the environment i wish had existed when i was leaving academia with numpy and trying to figure out what to do i'm trying to create those opportunities and environments and that's what drives me still is how do i make the world easier for the open source entrepreneur so let me stay i mean i could probably stay on numpy for a long time but this is a fun question so andrej karpathy leads the tesla autopilot team and he's also one of the most like legit programmers i know he builds stuff from scratch a lot and that's how he builds intuition about how a problem works he just builds it from scratch and i always love that and the primary language he uses is python for the intuition building but he posted something on twitter saying that they got a significant improvement on some aspect of their like data loading i think by switching away from np.sqrt so numpy's implementation of square root to math.sqrt and then somebody else commented that you can get an even much greater improvement by using the vanilla python square root which is like power 0.5 and it's fascinating to me i just wanted to so that absolutely i mean that was some shade throwing at some no no but also we're talking about it's a good way to ask about the trade-off between usability and efficiency broadly in numpy but also on these like specific weird quirks of like a single function yep so on that point if you use a numpy math function on a scalar it's going to be slower than using a python function on that scalar yeah because the math object in numpy is more complicated right because you can also call that math object on an array and so 
effectively it goes through a similar machinery there aren't enough of the checks and fast paths which you could do so yeah if you're basically running over a list in fact for problems that are less than a thousand even maybe 10 000 elements it's probably fine to stay in python if you're going more than 10 000 that's where you definitely need to be using arrays but if you're less than that and if you're doing a reading process and essentially it's not compute bound it's i/o bound and so you're really taking lists of a thousand at a time and doing work on them yeah you could be faster just using straight up python see but also and then sorry to introduce this there's the fundamental question when you look at the long arc of history it's very possible that np.sqrt is much faster it could be so in terms of like don't worry about it it's the evils of over optimization or whatever all the different quotes are on that sometimes obsessing about this particular little quirk is not efficient like if you're trying to optimize your path i mean i agree premature optimization creates all kinds of challenges right but you may have to do it i believe the quote is it's the root of all evil right that's knuth i think or attribute that to somebody else well don knuth is kind of like mark twain people just attribute things to him it doesn't matter and it's fine because he's brilliant no i was a tex user myself and so i have a lot of respect and he did more than that of course but yeah someone i really appreciate in the computer science space i think that's appropriate there's a lot of little things like that where if you understood it you go yeah of course that's the case and the other part i didn't mention numba was a thing we wrote early on and i was really excited by numba because it's 
something we wanted it was a compiler for python syntax and i wanted it from the beginning of writing numpy because of this function question like the power of arrays is really that you can write functions using all of it it has implicit looping right so you don't worry about this n-dimensional for loop with you know four loops four for statements you just say oh big four-dimensional array i'm gonna do this operation this plus this minus this reduction and you get this it's called vectorization in other areas but you can basically think at a high level and get massive amounts of computation done with the added benefit of oh it can be parallelized easily it can be put in parallel you don't have to think about that in fact it's worse to write the for loops and then try to infer parallelism from the for loops that's actually a harder problem than to take the array problem and just automatically parallelize that problem and so functions in numpy are called universal functions ufuncs so square root is an example of a ufunc there are others sine cosine add subtract in fact one of the first libraries in scipy was something called special where i added bessel functions and like all these special functions that come up in physics and i added them as ufuncs so they could work on arrays so i understood ufuncs very very well from day one inside of numeric that was one of the things we tried to make better in numpy was how do they work can they do broadcasting what does broadcasting mean but one of the problems is okay what do i do with a python scalar so what happens is the python scalar gets broadcast to a zero dimensional array and then it goes through the whole same machinery as if it were a ten thousand dimensional array and then it kind of unpacks the element and then does the addition that's not to mention the function it calls in the case of square root is just the c lib square root right in some cases like 
python's power there's some optimizations they're doing that could be faster than just calling the c lib square root in the interpreter no in the c code in the python runtime so they really optimize it and they have the freedom to do that because they don't have to worry about it's just a scalar it's just a scalar right they don't have to worry about the fact that oh this could be an object with many you know many pieces the ufunc machinery is also generic in the sense of type casting and broadcasting broadcasting is the idea of i have a zero dimensional array i have a scalar with a four dimensional array and i add them oh i have to just kind of coerce the shape of this guy to make it work against the whole four-dimensional array so it's the idea that i can do a one-dimensional array against a two-dimensional array and have it make sense well that's what numpy does is it challenges you to reformulate rethink your problem yes as a multi-dimensional array problem versus like move away from scalars completely right exactly yeah exactly in fact that's where some of the edge cases boundaries are they're still there and this is where array scalars are particularly bad in the sense that they were written so that you could optimize the math on them but that hasn't happened right and so their default is to coerce the array scalars to a zero dimensional array and then use the numpy machinery and you could specialize but it doesn't happen all the time so in fact when we first wrote numba we'd do comparisons and say look it's a thousand x speed up we're lying a little bit in the sense that well first there's the 40x slowdown of using array scalars inside of a loop because if you just used python scalars you'd already be 10 times faster yeah but then we would get 100 times faster over that using just compilation 
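A minimal sketch of the scalar-versus-ufunc point above, assuming NumPy is installed. The variable names (`grid`, `row`, and so on) are made up for illustration, and the timings are rough and machine dependent, so no specific speed ratio is claimed here.

```python
import math
import timeit

import numpy as np

x = 2.0

# The three spellings agree on the value for a plain Python float.
root_math = math.sqrt(x)
root_pow = x ** 0.5
root_np = np.sqrt(x)  # a ufunc call, even though the input is a scalar
assert math.isclose(root_math, root_pow)
assert math.isclose(root_math, float(root_np))

# The ufunc hands back a NumPy scalar rather than a built-in float.
print(type(root_math).__name__, type(root_np).__name__)

# Broadcasting: a scalar or a 1-D array is stretched to fit a 2-D array.
grid = np.ones((2, 3))
row = np.array([10.0, 20.0, 30.0])
plus_scalar = grid + 1.0  # the scalar is applied to every element
plus_row = grid + row     # the row is applied to every row of grid

# Rough timings only; the ratios depend on the machine and NumPy version.
namespace = {"math": math, "np": np, "x": x}
for stmt in ("math.sqrt(x)", "x ** 0.5", "np.sqrt(x)"):
    seconds = timeit.timeit(stmt, globals=namespace, number=20_000)
    print(f"{stmt:14s} {seconds:.4f} s")
```

On a single scalar the ufunc call usually comes out slowest of the three, which matches the explanation in the conversation; on large arrays the comparison reverses because the loop runs in compiled code.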
but what we do is compile the loop out of the interpreter to machine code and that's always been the power of python is this extensibility so people say oh python's so slow well sure if you do all your logic in the runtime of the python interpreter yeah but the power is that you don't have to you write all the logic which you do in the high level is just high level logic and the actual calls you're making could be on gigabyte arrays of data and that's all done at compiled speeds and the fact that that integration one can happen but two is separable other languages like julia say we're gonna be all in one you can do all of it together and the jury's out on whether that's possible i tend to think there are separate concerns there generally you will want to pre-compile some of your loops like scipy has a compilation step to install scipy it takes about two hours if you have many machines maybe you can get it down to one hour but to compile those libraries takes a while you don't want to do that at runtime you don't want to do that all the time you want to have this precompiled binary available that you're then just linking into so there's real questions about the whole you know source code running binary code is more than source code it's the created object code it's the linker it's the loader it's how that gets interpreted inside the virtual memory space there's a lot of details there that actually i didn't understand for a long time until i you know read books on the topic and the more you know the better off you are and you can do more but sometimes it helps with abstractions too well the problem as we mentioned earlier with abstractions is you kind of sometimes assume that whoever implemented this thing had your case in mind and found the optimal solution yes or like you assume certain things i mean there's a lot of correct one of the 
really powerful things to me early on i mean it sounds silly to say but with python probably one of the reasons i fell in love with it is dictionaries yes so obviously probably most languages have some mapping concept but it felt like it was a first class citizen and my brain was able to think in dictionaries but then there's the thing that i guess i still use to this day is ordered dictionaries because that seems like a more natural way to construct dictionaries yeah and from a computer science perspective the running time cost is not that significant but there's a lot of things to understand about dictionaries that the abstraction kind of doesn't necessarily incentivize you to understand right do you really understand the notion of a hashmap and how that dictionary is implemented but you're right dictionaries are a good example of an abstraction that's powerful and i agree with you i love dictionaries too it took me a while to understand that but once you do you realize oh they're everywhere and python uses them everywhere too like it's actually constructed so that one of the foundational things is dictionaries and it does everything with dictionaries yeah so it's powerful ordered dictionaries came later but it is very very powerful it took me a little while coming from just array programming entirely to understand these other objects like dictionaries and lists and tuples and binary trees like i said i wasn't a computer scientist so i studied arrays first and so i was very array-centric and you realize oh these others do have purposes and value actually i agree there's a friendliness about like one way to think about arrays is arrays are just full of numbers but to make them accessible to humans and make them less error-prone to human users sometimes you want to attach names human interpretable names that are sticky to those arrays so yeah that's how you start to think about dictionaries yes 
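A small sketch of the idea of attaching sticky, human-readable names to array axes with a dictionary, using only the standard library. The axis names and the image shape are hypothetical values chosen for illustration, not anything taken from the conversation.

```python
from collections import OrderedDict

# Plain dicts keep insertion order (a language guarantee since Python 3.7).
axis_index = {"x": 0, "y": 1, "channel": 2}  # hypothetical axis names
names_in_order = list(axis_index)

# Use the names instead of bare positions to describe a shape.
shape = (640, 480, 3)  # hypothetical image shape
labeled_shape = {name: shape[axis] for name, axis in axis_index.items()}
print(labeled_shape["channel"])

# OrderedDict still differs from dict in one way: equality is order sensitive.
same_as_dict = {"a": 1, "b": 2} == {"b": 2, "a": 1}
same_as_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
print(same_as_dict, same_as_ordered)
```

Looking up `labeled_shape["channel"]` reads more safely a year later than `shape[2]`, which is the kind of protection from positional mistakes being described.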
you start to convert numbers into something that's human interpretable and that's actually the tension i've had correct with numpy because correct i've built so much tooling around human interpretability and also protecting me from making mistakes a year later i wanted to force myself to use english versus numbers yes so there's a project called labeled arrays like very early it was recognized that oh when we're indexing numpy we're just using numbers for all the columns and particularly the dimensions i mean if you have an image you don't necessarily need to label each column or row but if you have a lot of images so you have another dimension you'd at least like to label the dimension as this is x this is y z or give it some human meaning or some domain-specific meaning that was one of the impetuses for pandas actually was just oh we do need to label these things and labeled array was an attempt to add a lighter weight version of that and that's an example of something i think could be added to numpy but one of the challenges again is how do you fund this like i said one of the tragedies i think is that i was never paid to work on numpy right so i've always just done it in my spare time always taken from one thing taken from another thing to do it and today it would be the wrong time like paying me to work on numpy now would not be a good use of effort but we are finally at quansight labs i'm actually paying people to work on numpy and scipy which i'm thrilled with i'm excited by i've wanted to do that it's what i wanted to do from day one it just took me a while to figure out a mechanism to do that even like in the university setting respecting that from like pushing students young minds the younger graduate students to contribute and then figuring out financial mechanisms that enable them to contribute and then 
sort of reward them for their um innovative scientific journey that that would be nice but then also just the better allocation of resources well you know it's 20-year anniversary since 9 11 and i was just looking we spent over six trillion dollars in in the middle east after 9 11 in the various efforts there and sort of to put politics and all that aside it's just you think about the education system all the other ways we could have possibly allocated that money to me yeah to take it back the amount of impact you would have by allocating a little bit of money to uh the programmers yeah that build the tools that run the world is fascinating i mean it is i it i don't know i think uh again there is some aspect to uh being broke as somewhat of a feature not a bug that you make sure that you manage that right now i know i it's so i but i don't think that's a big part so it's like i think you can you can have enough money and actually be wealthy while maintaining your values agreed i think agreed there's an old adage that you know nations that trade together don't go to war together yeah right i i've often thought about you know nations that code together [Laughter] one thing i love about open source is it's global it's multinational like there aren't national boundaries one of the challenges with business and open source is the fact that business is national like businesses are entities that are recognized in legal jurisdictions right and have laws that are respected in those jurisdictions and hiring and yet the open source ecosystem is not it's not it's not there like currently one of the problems we're solving is hiring people all over the world right because we it's a global effort and i've had the chance to work and i've loved the chance i've never been to uh like a iran but i once had a conference i was able to talk to people there right and talk to folks in uh pakistan never been there but we had a a call where there are people there like just scientists and 
normal people and you know there's a certain amount of humanizing right that gets away from the like we often get the memes of society that bubble up and get discussed but the memes are not even an accurate reflection of the reality of what people are well if you look at the major power centers that are leading to something like cyber war in the next few decades it's the united states it's russia and china right and those three countries in particular have incredible developers so if they work together yeah i think that's one way the politicians can do their stupid bickering but like there's a layer of infrastructure of humanity yeah if they collaborate together that i think can prevent major military conflict which would i think most likely happen at the cyber level versus the actual hot war level you're right no i think that's a good prediction nations that code together don't go to war yeah they don't go to war together that's a hope right that's one of the philosophical hopes but yeah so you mentioned the project of numba which is fascinating so from the early days there was kind of a push back on python that it's not fast you know you use c if you want to write something that's fast you use c or c++ if you want to write something that's usable and friendly but slow you use python and so what is numba what is its goal how does it work great yeah yes that was the argument and the reality was people would write high-level code and use compiled code but there are still use cases where you want to write python but then have it still be fast you still need to write a for loop like before numba it was always don't write a for loop you know write it in a vectorized way put it in an array and often that can make a memory trade off like quite often you can do it but then you maybe use more memory because you have to build this array of data that you don't necessarily 
need all the time so numba started from a desire to have a vectorize that worked vectorize was a tool in numpy you give it a python function and it gave you a universal function a ufunc that would work on arrays so you write the function that just worked on a scalar like the classic case was a simple function with an if then statement in it so a sine x over x function the sinc function if x equals zero return one otherwise do sine x over x the challenge is you don't want that loop to happen in python so you want a compiled version of that but the vectorize in numpy would just give you a python function so it would take the array of numbers and at every element call back into python so it was very slow it gave you the appearance of a ufunc but it was very slow so i always wanted a vectorize that would take that python scalar function and produce a ufunc working in native binary code so in fact i had somebody work on that with pypy to see if pypy could be used to produce a ufunc like that early on in 2009 or something like that 2010. 
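The sinc example described above can be sketched as follows, assuming NumPy is installed. `np.vectorize` is the real NumPy helper being discussed; the function and variable names are illustrative, and the `np.where` version is just one possible array-native alternative, not the compiled ufunc that the speaker wanted.

```python
import math

import numpy as np


def sinc(x):
    """Unnormalized sinc on a single scalar: sin(x) / x, with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(x) / x


# np.vectorize gives the scalar function an array interface, but it still
# calls back into Python once per element, so it is a convenience, not a speedup.
vsinc = np.vectorize(sinc)

xs = np.array([0.0, 0.5, 1.0, 2.0])
looped = vsinc(xs)

# An array-native formulation of the same branch, evaluated in compiled loops.
safe = np.where(xs == 0.0, 1.0, xs)  # avoid dividing by zero
native = np.where(xs == 0.0, 1.0, np.sin(safe) / safe)

print(looped)
print(native)
```

Both versions give the same values; the difference is where the loop runs. A compiled vectorize, which is what motivated the project discussed next, would keep the scalar-style source while running the loop in native code.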
they didn't work that well it was kind of pretty bulky but in 2012 peter and i had just started anaconda i'd learned to raise money that's a different topic but i've learned to you know raise money from friends family and fools as they say that's a good line but you know we were trying to do something we were trying to change the world peter and i are super ambitious we wanted to make array computing and we had ideas for really what's still the energy right now how do you do data science at scale we had a bunch of ideas there but for one of them i'd just talked to people about llvm and i was like there's a way to do this i'd heard about my friend dave beazley's compiler course so i was looking at compilers and i realized oh this is what you do and so i wrote a version of numba that just basically mapped python byte code to llvm nice right so the first version is like this works and it produces code that's fast this is cool for you know obviously a reduced subset of python i didn't support all of the python language there had been efforts to speed up python in the past but those efforts were i would say not from the array computing perspective not from the perspective of wanting to produce a vectorized improvement they were from the perspective of speeding up the runtime of python which is fundamentally hard because python allows for some constructs that you can't speed up like it's this generic you know what is this variable so i from the start did not try to replicate python's semantics entirely i said i'm going to take a subset of the python syntax and let people write syntax in python but it's kind of a new language really so it's almost like focusing on for loops scalar arithmetic you know a typed language a typed subset that was the key but we wanted to add inference of types so you didn't have to spell all the types out because when you call a 
function so python is typed it's just dynamically typed so you don't tell it what the types are but when it runs every time an object runs there's a type for the variables you know what it is and so the design goals of numba were to make it possible to write functions that could be compiled and have them used with numpy arrays like it needed to support numpy arrays and so how does it work do you have a comment within python that tells it what to do like how do you help out the compiler yeah so there isn't much actually it's kind of magical in the sense that it just looks at the type of the objects and then does type inference to determine any other variables it needs and then it was also because we had a use case that could work early like one of the challenges of any kind of new development is if you have something that's going to take you a long time to make it work it's really hard to get off the ground if you have a project where there's some incremental story it can start working today and solve a problem then you can start getting it out there getting feedback because numba today you know numba is nine years old today right the first two or three versions were not great right but they solved a problem and some people could try it we could get some feedback on it not great in that it was very focused very fragile the subset it would actually compile was small and so the way it worked is you write a function and you say @jit it uses decorators so decorators are just these little constructs that let you decorate code with an at sign and then a name the @jit would take your python function and actually just compile it and replace the python function with another function that interacts with this compiled function got it and it would just do that and you know we went from python bytecode then we went to ast i mean writing a compiler is actually i learned a lot about why computer science is 
taught the way it is because compilers can be hard to write they use tree structures they use all the concepts of computer science that are needed and it's actually easy to write a compiler and have it be spaghetti code like the passes become challenging and we ended up with three versions of numba right numba got written three times what programming language is numba written in python okay yeah python so really yeah it's fascinating yeah so python but then the whole goal of numba is to translate python byte code to llvm and so llvm actually does the code generation in fact a lot of times they'd say yeah it's super easy to write a compiler if you're not writing the parser nor the code generator right so for people who don't know llvm is the compiler itself yeah it's really badly named low level virtual machine that part of it is not used it really doesn't mean that i love chris but the name implies that the virtual machine is what it's all about it's actually the ir and the library the code generation that's the real beauty of it what i love about llvm was the fact that it was a platform you could collaborate on right instead of the internals of gcc or the internals of the intel compiler like how do i extend that and it was a place we could collaborate and we were early i mean people had started before it's a slow compiler like it's not a fast compiler so for some kinds of jits like jits are common in languages every browser has a javascript jit it does real-time compilation of the javascript to machine code for people who don't know jit is just in time compilation thank you yeah just in time compilation they're actually really sophisticated in fact i got jealous of how much effort was put into the javascript jits yes well it's kind of incredible what they've done yes i completely agree i'm very impressed but you know numba wasn't 
it was it was an effort to make that happen with python and so we used some of the money raised for anaconda to do it and then we also applied for this darpa grant and used some of that money to continue the development and then we used proceeds from service projects we would do we'd get consulting projects that we would then use some of the profits to invest in numba so we ended up with a team of two or three people working on numba it was fits and starts right and ultimately the fact that we had a commercial version of it also we were writing so part of the way i was trying to fund numba was to say well let's do the free numba and then we'll have a commercial version of numba called numbapro and what numbapro did is it targeted gpus so we had the very first cuda jit and the very first jit compiler that in 2013 you could run not just a ufunc on cpu but a ufunc on gpus and it was awesome automatically parallelize it and get 1000x speedups and that's a that's an interesting funding mechanism because you know large companies or larger companies care about speed exactly in just this way so it's exactly a really good way yeah there's been a couple things you know people will pay for one they'll pay for really good user interfaces right and so i'm always looking for what are the things people will pay for that you can actually adapt to the open source infrastructure one is definitely user interfaces the second is speed yeah like a better run time faster run time and then when you say people you mean like a small number of people pay a lot of money but then there's also this other mechanism that that's true a ton of people pay that's true a little bit first i gotta we mentioned anaconda we mentioned uh friends family and fools so uh anaconda is yet another so there's a company but there's also a project correct that is exceptionally impactful uh in terms of uh for many reasons but one of which is bringing a lot more people into
the um into the community of folks who use python so what is anaconda what is its goals yeah maybe what is conda versus anaconda yeah i'll tell you a little bit of the history of that because anaconda we we wanted to do uh we wanted to scale python because we you know that was the peter and i had the goal of when we started on anaconda we actually started as continuum analytics was the name of the company that started it got renamed to anaconda in 2015. but we uh we said we want to scale analytics numpy's great pandas is emerging but these need to run at scale with lots of machines the other thing we wanted to do was make user interfaces that were web we wanted to make sure the web did not pass by the python community that we had a ways to translate your data science to the web so those are the two kind of technical areas we thought oh we'll build products in this space and that was the idea very quickly in but of course the thing i knew how to do was to do consulting to make money and to make sure my family and friends and fools that it invested didn't lose their money so it's a little different than if you take money from a venture fund if you take money from a venture fund the venture fund they want you to go big or go home and they're kind of like expecting 9 out of 10 to fail or 99 out of 100 to fail it's different i was i was out of barbell strategy i was like i can't fail i mean i may not do super well but i cannot lose their money so i'm going to do something i know can return a profit but i want to have exposure to an upside so that's what happened in anaconda we didn't there was lots of things we did not well in terms of that structure and i've learned from since and have it better but we've uh we did a really good job of kind of attracting the interest around the area to get good people working and then get funneled some money on some interesting projects super excited about what came out of our energy there like a lot did so what are some of the 
interesting things dask numba bokeh conda uh there was datashader panel holoviews um these are all tools that are extremely relevant in terms of helping you build applications build tools build you know faster code um there's a couple of others jupyterlab jupyterlab came out of this too fascinating yeah okay so uh well bokeh does plotting is that okay is plotting so bokeh was one of the foundational things to say i want to do plotting in python but have the things show up in a web right that's right that's right that's right so plotting to me still with all due respect to matplotlib and bokeh feels like still an unsolved problem not it's a big problem right because you're i mean i don't know it's a visualization broadly yes right i think we've got a pretty good api story around certain use cases of python plotting yeah but there's a difference between static plots versus interactive plots versus i'm an end user i just want to write a simple you know pandas started the idea of here's a data frame with a dot plot i'm just going to attach plot as a method to my object which was a little bit controversial right but works pretty well actually because there's a lot less you have to pass in right you can just say here's my object you know what you are you tell the visualization what to do so that and there's things like that that have not been you know super well developed entirely but bokeh was focused on interactive plotting so you could it's a short path between interactive plotting and application dashboard application and there's some incredible work that got done there right and it was a hard project because then you're basically doing javascript and python so we wanted to tackle some of these hard problems and just go after them we got some darpa funding to help and it was super helpful it's a funny story there we actually did two darpa proposals but one we were five minutes late for and darpa has a very strict cutoff window and so i we had two
proposals one for the bokeh and one for actually numba and the other work which one were you late for the foundational numerical work so fortunately chris let us use some of the money to fund still some of the other foundational work but it wasn't as yeah his hands were tied he couldn't do anything about it uh that was a whole interesting story so one of the incredible projects that you worked on is conda yes so what is that about yeah conda it was early on like i said with scipy scipy was a distribution masquerading as a library and i found myself talking about compiler issues and trying to get the stuff shipped and the fact that people can use your libraries if they have it so for a long time we've understood the packaging problem in python and one of the first things we did at continuum analytics which became anaconda was organize the pydata ecosystem in conjunction with numfocus we actually started numfocus uh with some other folks in the community the same year we started anaconda i said we're going to build a corporation but also got to reify the community aspect and build a nonprofit so we do both of those can we pause real quick and uh can you say what is pypi the python package index like this whole story yeah of packaging in python yeah that's what i'm going to get to actually this is exactly the journey i'm on is to sort of explain packaging in python i think it's best expressed by the conversation i had with guido at a conference where i said so you know yeah packaging is kind of a problem and guido said i don't ever care about packaging i don't use it i don't install new libraries i'm like i guess if you're the language creator and if you need something you just put it in the distribution maybe you don't worry about packaging but guido has never really cared about packaging right and never really cared about the problem of distribution it's somebody else's problem and that's a fair position to take i think as a language creator in fact
there's a philosophical question about should you have different development packaging managers should you have a package manager per language is that really the right approach i think there are some answers of it is appropriate to have development tools and there's an aspect of development tool that is related to packaging and every language should have some story there to help their developers create so you should have language specific development tools that relate to package managers but then there's a very specific user story around package management that those language specific package managers have to interact with and currently aren't doing a good job of that that was one of the challenges of not seeing that difference and it still exists in the difference today conda always was a user i'm going to use python to do data science i'm going to use python to do something how do i get this installed it was always focused on that so it didn't have like a develop you know classic example is pip has a pip develop it's like i want to install this into my current development environment today now conda doesn't have that concept because it's not part of the story for people who don't know pip is a uh python specific package manager right that's exceptionally popular that's probably like the default thing it's the default user yeah and so the story there emerged because what happened is in 2012 we had this meeting at the googleplex and guido was there to come talk about what we're going to do how we're going to make things work better and wes mckinney me peter peter has a great photo of me talking to guido and he pretends we're talking about this story maybe we were maybe before but we did at that meeting talk about it and asked you know we need to fix packaging in python like people can't get the stuff and he said go fix it yourself i don't think we're gonna do it all right the origin story right
there all right you said okay you said to do this ourselves so at the same time people did start to work on the packaging story in python it just took a little longer so in 2012 kind of motivated by our training courses we were teaching like how to very similar to what you just mentioned about your mother like it was motivated by the same purpose like how do we get this into people's hands and it's this big long process it's too expensive it was actually hurting numpy development because i would hear people saying don't make that change to numpy because i just spent a week getting my python environment and if you change numpy i have to reinstall everything and reinstalling is such a pain don't do it i'm like wait okay so now we're not making changes to a library because of the installation problem that it'll cause for end users okay there's a problem with installation we've got to fix this so we said we're going to make a distribution of python and we'd previously done that at enthought i wanted to make one that we'd give away for free everyone could just get it like that was critical that we just get it you know it wasn't tied to a product it was just you could get it and then we had constantly thought about well do we just leverage rpm do we but the challenge had always been we want a package manager that works on windows mac os x and linux the same right and it wasn't there like you don't have anything like that you have and for people who don't know rpm is red hat operating system specific packaging correct it's operating system specific yes so do you create the the design uh questions do you create an umbrella package manager then yes cross operating system yes that was the decision and a neighboring design question is do you also create a package manager that spans multiple programming languages correct exactly that was the world we faced and we decided to go multiple operating systems multiple
and programming language independent because even for python in particular what was important was scipy has a bunch of fortran in it right and scikit-learn has links to a bunch of c++ there's a lot of compiled code and the python package managers especially early on didn't even support that so we released anaconda which was just a distribution of libraries but we started work on conda in 2012 first version of conda came out in early 2013 summer of 2013 and it was a package manager so you could say conda install scikit-learn in fact scikit-learn was a fantastic project that emerged kind of it was the classic example of the scikits i talked earlier about scipy being too big to be a single library well what the community had done is said let's make scikits and there's scikit-image there's scikit-learn there's a lot of scikits and it was a fantastic move you know the community did i didn't do it i was like okay that's a good idea i didn't like the name i didn't like the fact you typed scikits image i was like that's going to be simpler sklearn we got to make this smaller i don't like typing all this stuff from imports so i was kind of a pressure that way but i love the energy i love the fact that they went out and they did it and those people jarrod millman and then of course gael and there's people i'm not even naming that scikit-learn really emerged as this fantastic project and the documentation around that is also incredible this was incredible exactly i don't know who did that but they did a great job a lot of people in inria a lot of people a lot of european contributors um andreas there's some andreas uh in the u.s there's a lot of just people i just adore i think are amazing people um awesome use of scipy right i love the fact that they were using scipy effectively to do something i love which is machine learning um but i couldn't install it because there's so many pieces involved so many dependencies right
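[editor's sketch] The point about "so many pieces involved so many dependencies" can be illustrated with a small, hedged example. It assumes a hypothetical and heavily simplified dependency graph (the package and runtime names below are illustrative, not real package metadata) and uses the standard-library graphlib module (Python 3.9+) to compute an order in which every dependency is installed before whatever needs it. It is not how conda actually resolves packages; a real package manager also has to reconcile versions, platforms, and compiled build variants.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified dependency graph: each key maps to the set of
# things that must be installed before it. The non-Python entries
# ("fortran-runtime", "c++-runtime") stand in for compiled dependencies
# that a Python-only package manager would not know about.
DEPENDENCIES = {
    "scikit-learn": {"numpy", "scipy", "c++-runtime"},
    "scipy": {"numpy", "fortran-runtime"},
    "numpy": set(),
    "fortran-runtime": set(),
    "c++-runtime": set(),
}


def install_order(dependencies):
    """Return names ordered so every dependency precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for name in install_order(DEPENDENCIES):
        print(name)
```

The exact order among independent entries may vary, but numpy always comes before scipy, and scipy before scikit-learn. Removing the two runtime entries from the graph mimics a tool that stops at the python boundary: the ordering still succeeds, but nothing guarantees the compiled pieces are present.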
yes so our use case for conda was conda install scikit-learn right and it was the best way to install scikit-learn from 2013 to really 2017 2018 pip finally caught up i still think you should conda install scikit-learn rather than pip install scikit-learn but you can pip install scikit-learn the issue is the package they created was wheels and pip does not handle the multi-vendor approach they don't handle the fact you have c++ libraries you're depending on they just stop at the python boundary and so what you have to do in the real world is you have to vendor you have to take all the binaries and vendor them now if a change happens in an underlying dependency you have to redo the whole wheel so tensorflow is a good example you should not pip install tensorflow it's a terrible idea people do it because of the popularity of pip many people think of course that's how i install everything python yeah this is one of the big challenges you know you take a github repository or just a basic blog post the number of times pip is mentioned over conda is like 100x to one correct correct so they just haven't that was increasing it wasn't true early because pip didn't exist like conda came first so but that's like the long tail of the internet documentation user generated so that like you think how do i install you google how do i install tensorflow you're just not going to see conda in that first page correct exactly and that today you would have in 2017 and it's sad because conda solves a lot of usability issues correct like especially a super challenging thing i don't know one of the big pain points for me was uh just on the computer vision side uh opencv yeah installation that's a perfect example yes i think is i don't know if conda solved that conda has an opencv package i don't know i certainly know pip has not solved i mean there's complexities there because right i actually don't know i should probably know a good answer for this
but you know if you compile opencv with certain dependencies you'll be able to do certain things so there's this kind of flexibility of what you like what options you compile with yes and i don't think it's trivial to do that with conda or so conda has a notion of variants of a package you can actually have different compilation versions of a package so not just the version's different but oh this is compiled with these optimizations so conda does have an answer to those flavors that's flavors basically pip as far as i know does not no no pip generally hasn't thought deeply about the binary dependency problem right and that's why fundamentally it doesn't work for the scipy ecosystem it barely you can sort of paper over it and duct tape and it kind of works until it doesn't it falls apart entirely so it's been a mixed bag like and i've been having lots of conversations with people over the years because again it's an area where if you understand some things but not all the things but they've done a great job of community appeal this is an area where i think anaconda uh as a company needed to do some things in order to make conda more community-centric right and this is a i talk about this all the time there's a balance between you have every project starts as what i call company backed open source even if the company is yourself it's just one person just you know doing business as but ultimately for products to succeed virally and become massive influencers they have to get community people on board they have to get other people on board so it has to become community driven and a big part of that is engagement with those people empowering people governance around it and what happened with conda in the early days pip emerged and we did do some good things conda-forge the conda-forge community is sort of the community recipe creation community mm-hmm conda itself i still believe and
and you know peter is ceo of anaconda he's my co-founder i ran anaconda until 2017 2018 is peter still at peter's still at anaconda right and we're still great friends we talk all the time i love him to death there's a long story there about like why and how that we could cover in some other podcast perhaps yeah sort of a more maybe a more business focused one but um this is one area where i think conda should be more community driven like he should be pushing more to get more community contributors to conda and let like anaconda shouldn't be fighting this battle yeah right it's actually uh it's really the developers like you said like help the developers yeah and then they'll actually move us the right direction but that was the problem i have is many of the cool kids i know don't use conda and that to me is confusing it is confusing and it's really a matter of conda has some challenges first of all conda still needs to be improved there's lots of improvements we made and that it's that aspect of wait who's doing this and the fact that then the pypa really stepped up like they were not solving the problem at all and now they kind of got to where they're solving it for the most part and then effectively you could get like conda solved a problem that was there and it still does and it's still you know there's still great things it can do but um and we still use it all the time at quansight and with other clients but uh you can kind of do similar things with pip and docker right so especially with the web development community that part of it again is this is the there's a lot of different kind of developers in the python ecosystem and there's still a lack of some clear understanding i go to the python conference all the time and there's only a few people in the pypa who get it and then others who are just massively trumpeting the power of pip but just do not understand the problem yeah so one of the obvious
things to me from a mom from a non-programmer perspective is the across operating system usability that's much more natural so yeah they use windows and just it seems much easier to uh to recommend conda there but then it you should also recommend it across the board so i'll i'll definitely surf but what i recommend now is a hybrid i do i mean i have no problem is it possible to use oh it is it is what i like build the environment with pip with conda build an environment with conda and then pip install on top of that that's fine be careful about pip installing opencv or tensorflow or because if somebody's allowed that it's going to be most surely done in a way that can't be updated that easily so install like the big packages the infrastructure yeah and then the weirdos yeah the the like the weird like implementation for some uh i had a there's a cool library i used that based on your location and time of day and date tells you the exact position of the sun relative to the earth and it's just like a simple library but yeah it's very precise and i was like all right but that was that was uh in this episode the thing they did really well is python developers who want to get their stuff published they you have to have a pip recipe yeah right i mean even if it's you know the challenge is there's a key thing that needs to be added to pip just simply add the pip the ability to defer to a system package manager like because it's you know recognize you're not going to solve all the dependency problem so let like give up and allow the allow system packager to work that way anaconda's installed and it has pip it would default to conda to install this stuff but red hat rpm would default to rpm to install but it's all more things like that's the that's a key not difficult but somewhat where some work feature needs to be added that's an example of something like i've known we need to do it i mean it's where i wish i had more money i wish i was more successful in the in the 
business side trying to get there but i wish my you know my family friends and fools community that i know was larger and had more money because i know tons of things to do effectively with more resources but you know i have not yet been successful at channeling tons of it you know some you know i'm happy with what we've done we've created again at quansight what we created to get anaconda started we created continuum analytics and kind of done it again with quansight super excited by that by the way it took three years to do it what is quansight what is its mission yeah we've talked a few times about different fascinating aspects of it but let's like big picture what is big picture quansight quansight is uh its mission is to connect data to an open economy so it's basically consulting in the pydata ecosystem right it's a consulting company and what i've said when i started it was we're trying to create products people and technology so it's divided into two groups and a third one as well the two groups are a consulting services company that just helps people do data science and data engineering and data management uh better and more efficiently full stack like full stack science full thing we'll help you build an infrastructure if you're using jupyter we do staff augmentation need more programmers help use dask more effectively help use gpus more effectively just basically a lot of people need help so we do training as well to help people you know both uh immediate help and then learn from somebody uh we've added a bunch of stuff too we've kind of separated some of these other things into another company called open teams that we currently started one thing i loved that we did at anaconda was creating a community innovation team and so i want to replicate that this time we did a lot of innovation at anaconda i wanted to do innovation but also contribute to the projects that existed like create a place where maintainers so the scipy and numpy and
numba and all these projects we already started can pay people to work on them and keep them going so that's labs quansight labs is a separate organization it's a non-profit mission the profits of quansight help fund it in fact every project that we have at quansight a portion of the money goes directly to quansight labs to help keep it funded so we've got several mechanisms by which we keep quansight labs funded and currently so i'm really excited about labs because it's been a mission for a long time what kind of projects are within labs so labs is working to make the software better like make numpy better make scipy better it only works on open source so you know if somebody wants to so you know companies do we have a thing called the community work order we call it if a company says i want to make spyder better okay cool um you can pay for a month of a developer of spyder or developer of numpy or developer of scipy you can't tell them what you want them to do you can give them your priorities and things you wish existed and they'll work on those priorities with the community to get what the community wants and what emerges is what the community wants is there some aspect on the consulting side that is helping as we were talking about morphology and so on is there specific applications that are particularly like driving sort of inspiring the need for updates to scipy correct absolutely absolutely gpus are absolutely one of those and new hardware beyond gpus i mean tesla's dojo chip i'm hoping we'll have a chance to work on that perhaps um things like that are definitely driving it the other thing that's driving is scalable like speed and scale uh how do i write numpy code or numpy like code if i want it to run across a cluster you know that's dask or maybe it's ray i mean there's sort of ways to do that now or there's modin and there's so pandas code numpy code scipy code scikit-learn code that i want to scale so that's one big area have you
gotten a chance to chat with andrej and elon about partic because like no i would love to by the way i have not but i'd love to i just saw their tesla ai day uh video yeah super exciting so that's one of the you know i love great engineering software engineering teams and engineering teams in general and they're doing a lot of incredible stuff with python they're like they are revolutionizing so many aspects of the machine learning pipeline i agree that's operating in the real world and so much of that is python like you said the guy running you know andrej karpathy running autopilot is tweeting about optimization of in fact we have at quansight we've been fortunate enough to work with facebook on pytorch directly so we have about 13 developers at quansight some of them are in labs working directly on pytorch on torch right so basically when i started quansight i went to both tensorflow and pytorch and said hey i want to help connect what you're doing to the broader scipy ecosystem because i see what you're doing we have this bigger mission we want to make sure we don't you know lose energy here so uh and facebook responded really positively and i didn't get the same reaction not yet not yet i love the folks so i really love the folks at tensorflow too they're fantastic i think it's the just how it integrates with their business i mean like i said there's a lot of reasons just the timing the integration with their business what they're looking for they're probably looking for more users and i was looking to kind of have some development effort and they couldn't receive that as easily i think so i'm hoping i'm really hopeful uh and love love the people there what's the idea behind open teams so open teams i'm super excited about open teams because it's one of the i mentioned my idea for investing directly in open source so that's a concept called faiross but one of the things when we started quansight we knew we would do is we developed products and ideas and new companies
might come out at anaconda this was clear right anaconda we did so much innovation that like five or six companies could have come out of that and we just didn't structure it so they could but in fact they have you look at dask there's two companies coming out of dask you know bokeh could be a company there's like lots of companies that could exist off the work we did there and so i thought oh here's a recipe for an incubation a concept that we could actually spawn new companies and new innovations and the idea has always been well money they earn should come back to fund the open source project so labs is you know i think there should be a lot of things like quansight labs i think this concept is one that scales you could have a lot of open source research labs along the way so in 2018 when the bigger idea came how to make open source investors i said oh i need to create a venture fund so we created a venture fund called quansight initiate at the same time it's an angel fund really it's you know we started to learn that process how do we actually do this how do we get lps how do we actually go in this direction and build a fund and i'm like every venture fund should have an associated open source research lab there's just no reason like our venture fund the carried interest portion of it goes to the lab it directly will fund the lab that's fascinating brother so you use the power of the organic formation of teams in the open source community and then like naturally that leads to a business that can make there are so many yeah correct it always maintains and loops back to the open it loops back to open source exactly it's a natural fit there's absolutely a repeatable pattern there and it's also beneficial because oh i have natural connections to the open source if i have an open source research lab like they'll always they'll be out there talking to people and so we've had a chance to talk to a lot of
early stage companies and we in our fund focus on the early stages so quansight has the services the lab the fund right in that process a lot of stuff started to happen like oh you know we started to do recruiting and support and training and i was starting to build a bigger sales team and marketing team and people besides just developers and one of the challenges with that is you end up with different cultural aspects you know developers you know in any company you go to you kind of go look is this a business led company a developer led company do they kind of co-exist how are they what's the interface between them there's always a bit of tension there like we were talking about before you know what is the tension there with open teams i thought wait a minute we can actually just create like this concept of quansight plus labs well while it's specific to the pydata ecosystem the concept is general for all open source so open teams emerged as oh we can create a business development company for many many quansights like thousands of quansights and it can be a marketplace to connect essentially be the enterprise software company of the future if you look at what enterprise software wants from the customer side and during this journey i've had the chance to work and sell to lots of companies exxon and shell and jp morgan bank of america like the fortune 100 and talk to a lot of people in procurement and see what are they buying and why are they buying so you know i don't know everything but i've learned a lot about oh what are they really looking for and they're looking for solutions they're constantly given products from the enterprise software here's open source lead enterprise software now i buy it and they have to stitch it together into a solution open source is fantastic for gluing those solutions together so whereas they keep getting new platforms they're trying to buy what most enterprises want is tools
that they can customize that that are as inexpensive as they can yeah and so you almost want to maintain the connection to the open source because that's yes so open teams about solving enterprise software problems brilliant brilliant idea by the way with a connect but we do it honoring the topology we don't hire all the people we are a network connecting the sales energy and the in the procurement energy and we we were on the business side get the deals closed and then have a network of partners like quansite and others who we hand the deals to right to actually do the work and then we off we then we have to maintain i feel like we have to maintain some level of quality control so the client can rely on open teams to ensure the deliveries it's not just here's a lead go figure that out but no we're going to make sure you get what you need right by the way it's such a skill and i don't know if i have the patience i will have the patience to talk to the business people or more specific i mean there's all kinds of flavors of business people or they're like yeah marketing people there's a challenge i hear what you're saying because i've had the same challenge yeah and it's true there's sometimes you think okay this is wait this is way overwrought yeah so you have to become an adult you have to because the the companies have needs they have ways to make money and they and they also want to learn and grow and yet it's your job to kind of educate them on the best way like the value of open source for example right and i'm really grateful for all my experiences over the past 14 years understanding that side of it and still learning for sure but not just understanding from companies but also dealing with marketing professionals and sales professionals and people that make a career out of that understanding what they're thinking about and also understanding what let's make this better like we can really make a place like open teams i see as the transmission layer between 
companies and open source communities producing enterprise software solutions like eventually we want to like today we're taking on sas and matlab and tools that we know we can replace for folks really anytime you have a software tool an organization where you have to do a lot of customization or make it work for you because now you're just buying this thing off the shelf and it works it's like okay you buy the system then you customize a lot usually with expensive consultants to actually make it work for you all of those should be replaced by open source foundations with the same customers such important work such important work in these giant organizations they're doing exactly that taking some proprietary software and hiring a huge team of consultants that customize it and then that whole thing gets outdated quick correct and so i mean that's brilliant right the one solution to that is kind of what tesla's doing a little bit of which is basically build up a software engineering team yeah like build a team from scratch from scratch and companies are doing it well that's what they're doing right now yeah right exactly that's okay and you're creating a topology for some of that like right you just don't have to do it that's not the only answer yeah right and so other companies can access this new more flexible we really say open teams is the future of enterprise software um it's we're still early like this idea just percolated over the past year as we've kind of grown quansight and realize the extensibility of it uh we just finished in our seed round uh the work to help you know get more sales people and then push the messaging correctly and there's lots of tools we're building to make this easier like we want to automate the processes we feel like a lot of the power is the efficiency of the sales process there's a lot of wasted energy in small teams and the sales energy to get into large companies and make 
a deal there's a lot of money spent on that stuff creating the tools and processes makes that super seamless so a single company can go oh i've got my contract with open teams we've got a subscription they can make that procurement seamless and then the fact they have access to the entire open source ecosystem and we have a you know so we have a part of our work that's embracing open source ecosystems and making sure we're doing things useful for them we're serving them and then companies making sure they're getting solutions they care about and then figuring out which targets we have you know yeah we're not taking on all of open source all of enterprise software yet but we're well this feels like the future the idea and the vision is brilliant uh can i ask you uh why do you think microsoft bought github and what do you think is the future great point i thought it was a brilliant move i think they did because microsoft has always had a developer centric culture like they always have like one thing microsoft's always done well is understand that their power is developers right ballmer didn't necessarily make a good meme about how he approached that but they're broadening that i think that's why because they recognize github is where developers are at right and so but do they have a vision like open teams type of situation right so yeah are they just basically throwing money at developers to show their support i think so without uh a topology like you put it like a way to leverage that like to give developers actual money right i don't think so i think they're still it's an enterprise software company and they make a bunch of money they make a bunch of games they have a big they're a big company they sell products i think part of it is they know there's opportunity to make money from github right there's definitely a business there you know uh to sell to developers or to sell to people using development i think 
there's part of that i think part of it is also they had definitely wanted to recognize that you need to value open source to get great developers which is an important concept that was emerging over the past 10 years that you know pydata we were able to convince jp morgan to support pydata because of that fact right that was where the money for them putting a couple hundred thousand into supporting pydata for uh several conferences was they want developers and they realized that developers want to participate in open source so enterprise software folks don't always understand how their software gets used having spent a lot of time on the floors at jp morgan at shell at exxon mobil you see oh these companies have large development teams and then they're kind of dealing with what's being delivered to them so i really feel kind of a privilege that i had a chance to learn from some of these people and see what they're doing and even work alongside them uh you know as a consultant uh using open source and trying how do we make this work inside of our large organization some of it is actually for a large organization some of it is messaging to the world that you care about developers and you're the cool yep you care like for example like if ford because i talked to them like car companies right they want to attract you know you want to take on tesla and autopilot you want to take that right and so what do you do there you show that you're cool like you try to show off that you care about developers and they have a lot of trouble doing that and like one way i think like ford should have bought github just to show off better yeah yeah like these old school companies and it's in a lot of good point a lot of different industries there's probably different ways it's probably an article that you care about developers and the developers it's exactly what you like for example just spitballing 
here but like ford or somebody like that could give a hundred million dollars to the development of numpy and uh like literally look at like the top most popular projects in python and just say we're just gonna give money right like that's gonna immediately make you cool they could actually yeah and in fact they set up numfocus to make it easy yeah but the challenge was is also you have to have some business development like it's a bit of a seeding problem right and you look at how i've talked to the folks at linux foundation know how they're doing it i know how in starting numfocus because we had two babies in 2012 one was anaconda one was numfocus right and they were both important efforts they had distinct journeys and super grateful that both existed and still grateful both exist um but there's different energies in getting donations as there is getting um this is important to my business like i'm selling you something that this is not a it's a salt is i'm gonna make money this way if you can tie it if you can tie the message to an roi for the company it becomes more effective it's much more effective right so and there are rational arguments to make i've tried to have conversations with especially marketing departments like very early on it was clear to me that oh you could just take a fraction of your marketing budget and just spend it on open source development and you get better results from your marketing like because how do those can i sorry i'm gonna try not to go here what have you learned from the interaction with the marketing folks on that kind of because you gave a great example of something that will obviously be much better investment in terms of marketing is supporting open source projects the challenge is not dissimilar from the challenge you have in academia at the different colleges right knowledge gets very specific and very channeled right and so people get they get a lot of learning in the thing they know about 
and it's hard then to bridge that and to get them to think differently enough to have a sense that you might have something to offer because it's different it's like well how do i implement that how do i what do i do with that like do i which budget do i take from do i slow down my spend on google ads or am i spent on facebook ads or do i not hire a content creator and say like like there's an operational aspect to that that some that you have to be the cmo right or the ceo you have to get the right level so you have to hire a high position level right because they care about this in this right or they won't know how right right and and because you can also do it very clumsily yeah right and i've seen it because you can you actually have to honor and recognize the the the people you're going to and the fact that if you just throw money at them it could actually create more problems can i just say this is not you saying can i just because i just need i need i need to say this i've been very surprised how often marketing people are terrible at marketing i feel like the best marketing is doing something novel and unique that anticipates the future it feels like so much of the marketing practice is like what they took in school or maybe they're studying for what was the best thing that was done in the past decade and they're just repeating that over and over as opposed to innovating like taking the risk to me marketing a great point is taking the big risk that's a great being the first one to risk yeah and there's an aspect of data observation from that risk right that's that's that's you i think because shared what they're doing already but it absolutely it's it's about i think it's content like there's this whole world on content marketing that you could almost say well yeah it can get over you can get you can get inundated with stuff that's not relevant to you whereas what you're saying would be highly relevant and highly useful and highly highly beneficial yeah but 
it's risky i mean that's why sort of uh there's a lot of innovative ways of doing that tesla is an example of people that basically don't do marketing uh they do marketing in a very like it's like elon hired a person who's just good at twitter for running tesla's twitter account you know right right that's exactly what you want to be doing you want to be constantly innovating in uh right there's an aspect of telling i mean i've definitely seen people doing great work where you're not talking about it like i would say that's actually a problem i have right now with quansight labs quansight labs has been doing amazing work really excited about it we have not been talking about it enough we haven't been and there's different ways to talk about it there's different ways to there's different channels through which to communicate right there's also like i'll just throw some um shade at companies i love uh so for example irobot i just had a conversation with them they make roombas sure and uh they i think i love the incredible robots but like every time they do commercial like advertisements not advertisement but like marketing type stuff it just looks so corporate and to me the incredible maybe wrong in the case of irobot i don't know but to me when you're talking about engineering systems it's really nice to show off the magic of the engineering and the software and all the geniuses behind this product and the tinkering and like the raw authenticity of what it takes to build that system versus uh the marketing people who want to have like pretty people like standing there all pretty with the robots like moving perfectly so to me there's some aspect it's like speaking to the hackers you have to throw some uh bones some care towards the engineers the developers because there's some aspect one for the hiring but two there's an authenticity to that kind of communication that's really inspiring to the end user as well like if they know that brilliant 
people the best in the world are working at your company they start to believe that that product that you're creating is really it's interesting because your initial reaction would be wait there's different users here why would you do that to you know my wife bought a roomba but she and she you know loves developers she loves me but she doesn't care about that yeah so essentially what you said is actually the authenticity because everyone has a friend everyone knows people there's word of mouth i mean word of mouth is so so yeah exactly because i think it's the lack of that realization there's this halo effect right and also influences your general marketing interesting for some stupid reason i do have a platform and it seems that the reason i have a platform many others like me millions of others is like the authenticity and like we get excited naturally about stuff yeah and like i don't want to get excited about that irobot video because it's as boring as marketing as corporate as opposed to i wanted to do some fun this is me like a shout out to irobot is they're not letting me get into the robot yeah well there's an aspect of that they could be benefiting from a culture of modern modularity like add-ons and right like that could actually dramatically help if you've seen that over history i mean apple is an example of a company like that or the like i can see what your point is is that you have something that needs to be it needs to be adopted broadly the concept needs to be adopted broadly and if you want to go beyond this one device you need to engage this community yeah and connecting to the open source as you said i got to ask you you're a programmer uh one of the most impactful programmers ever you've led many programmers you lead many programmers what are some from a programmer perspective what makes a good programmer what makes a productive programmer is there advice you can give to be a great programmer 
great great question and there are times in my life i'd probably answer this even better than i hope maybe give an answer today because i've thought about this at numerous times like right now i've spent on so much time recently hiring sales people that your mind is like something else on something else but i i reflected on the past and also uh you know i have some really the only way i can do this i have some really great programmers that i work with who lead the teams that they they lead and my goal is to inspire them and and hopefully help them encourage them and be uh help them encourage with their teams i would say there's a number of things couple things one is um curiosity like you i think a programmer without curiosity is uh mundane like you'll lose interest you don't do your best work so it's sort of it's an affect it's sort of are you you have some curiosity about things i think two don't try to do everything at once recognize that you're you know we're limited as humans you're limited as a human and each one of us are limited in different ways you know we all have our different strengths and skills so it's adapting the art of programming to your skills one of the things that always works is to limit what you're trying to solve right so um if you're part of a team usually maybe somebody else has put the architecture together and they've gotten given a portion for you if you if you're young if you're not part of a team is sort of breaking down the problem into smaller parts is essential for you to make progress it's very easy to take on a big project and try to do it all at once and you get lost and then you do it badly and so thinking about you know um very concretely what you're doing to find you know defining the inputs and outputs to finding what you want to get done um even just talking about that in like writing down before you write code just what are you trying to accomplish and being very specific about it uh really really helps i think um using 
other people's work right don't be afraid that somehow you're like you should do it all like nobody does stand on the shoulders of giants but don't just copy and paste that's particularly relevant in the era of codex and the uh you know the auto-generated code which is essentially i see as an indexing of stack overflow right exactly secondly it's like it's a search engine it's a search engine over stack overflow basically so it's not i mean we've had this for a while yeah but really you want to cut and paste but not blindly like absolutely i've cut and paste to understand but then you understand oh this is what this means oh this is what it's doing and understanding as much as you can so it's critical that's where the curiosity comes in if you're just blindly cutting and pasting you're not going to understand and so understand and then you know be uh be sensitive to hype cycles right every so often there's always a oh test driven development is the answer oh object oriented is the answer oh there's always an answer you know agile is the answer be cautious of jumping onto a hype cycle like likely there's signal like there's a thing there that's actually valuable you can learn from but it's almost certainly not the answer to everything you need what lessons do you draw from you having created numpy and scipy like in service of sort of answering the question of what it takes to be a great programmer and giving advice to people yeah how can you be the next person to create a scipy yeah so one is listen to listen to who uh to uh to people that have a problem right which is everybody right but listen and listen to many and try to uh then do like don't you you're gonna have to do an experiment you know do fall down don't be afraid to fall down don't be afraid the first thing you do is probably going to suck and that's okay right it's honestly i think iteration is the key to innovation and it's that it's 
almost that psychological um hesitation we have to just uh iterate like yeah we know you you know it's not great but next i want to be better i mean just keep learning and keep breathing and keep improving so it's an attitude um and then it does take intense concentration right good things don't happen just it's not quite like tick tock or like facebook you know you can't scroll your way to good programming right there are you know sincere like hours of deep don't be afraid of the deep problem like often people will run away from something because oh i can't solve this and you might be right but give it an hour give it a couple of hours and see and you know just um five minutes not gonna give you that was it lonely when you were building scipy and numpy hugely yeah absolutely lonely in the sense of you have to have an inner drive and that inner drive for me always comes from i have to see that this is right in some angle i have to believe it that this is the right approach the right thing to do with scipy it was like oh yeah the world needs libraries in python clearly python is popular enough with enough influential people that to start and it needs more libraries so that is a good in itself so i'm going to go do that good so find a good find a thing that you know is good and just work on it um so that has to happen and it is and you kind of have to have enough realization of your mission to be okay with the naysayers or the fact that nobody joins you up front in fact one thing i've talked to people a lot i've seen a lot of projects come and some fail like not everything i've done has actually worked perfectly i've tried a bunch of stuff that okay that didn't really work or this isn't working and why but you see the patterns and one of the key things is you can't even know for six months i say 18 months right now if you're starting a new project you got to give it a good 18 month run before you even know if the 
feedback's there like you're not gonna know in six months you might have the perfect thing but six months from now it's still kind of still emerging so give it time because you're dealing with humans and humans have an inertial inertia energy that just doesn't change that quickly so let me ask a silly question but uh you know like you said you're focused on the sales side of things currently but you know back when you're actively programming maybe in the 90s you talked about ides what's your setup that you have that brings you joy keyboard number of screens yeah well linux i do still like to program sometimes not as much as i used to i have two projects i'm super interested in trying to find funding for them trying to figure out some good teams for them but i could talk about those um but what i yeah i'm an emacs guy great thank you the superior editor everybody i've got i don't often delete tweets but one of the tweets i deleted when i said emacs was better than vim and then the hate i got it is i was like i'm walking away from this i do too i don't push it i mean i'm not i'm just joking of course yeah exactly it's kind of like but people do take the editor seriously they take it i did it with your life it is but uh there's something beautiful to me about emacs but for people that love vim there's something beautiful to them i mean i do use vim for quick editing like command line if i say quick editing i will still sometimes use it but not much like it's simple corrective correctness single edited character so when you were developing scipy you were using emacs yep scipy and numpy were all written in emacs on a linux box and uh cvs and then svn version control git came later like git has i love distributed branch stuff i think git is pretty complicated but i love the concept and uh also of course github is uh and then gitlab make git definitely consumable um but that came later did you ever touch lisp at all like what 
were you what yeah emotional feelings about the parenthesis great question so i find myself appreciating lisp today much more than i did early because when i came to programming i knew programming but i was a domain expert right and to me the parentheses were in the way it's like wow it's just all this like it just gets in the way of my thinking about what i'm doing so why would i have all these right um that was my initial reaction to it uh you know now as i appreciate kind of the structure that kind of naturally maps to the logical thinking about a program i can appreciate them right and why it's actually you could create editors that make it not so problematic right honestly um yeah so i actually have a much more appreciation of lisp and things like clojure and there's hy which is a python you know a lisp that compiles to python bytecode um i think it's challenging like typically these languages are you know i even saw a whole data science programming system in lisp that somebody created which is you know cool but again i think it's the lack of recognition of the fact that there exists what i call occasional programmers yes people are never going to be programmers for a living they don't want to have all this the cuteness in their head they want just it's why basic you know microsoft had the right idea with basic in terms of having that be the language of visual basic the language of excel and um sql server they should have converted that to python 10 years ago the world would be a better place if they had but there's also uh there's a beauty and a magic to the history behind a language and lisp you know some of the most interesting people in the history of computer science and artificial intelligence have used lisp so yes you feel well it's back to that language when you have a language you can think in it yeah and it helps you think about it attracts a certain kind of people that think a certain kind of way 
and then that's there okay so what about like small laptop with a tiny keyboard or is there like a screen you know good question i've never gotten into the many screens to be honest i mean and maybe it's because in my head i kind of just i just swap between windows like partly because i guess i really can't process three screens at once anyway like i just i'm looking at one and i just flip you know i flip an application open so where it's really helpful is actually when i'm trying to you know here's data and i want to input it from here right that's the only time i really need another screen so now because you're both a developer lead developers but then there's also these businesses and there's sales people you're working with large companies operations people hiring people yeah the whole thing which operating system is your favorite though at this point so linux was the earliest so yeah i love linux as a server side and it was early days i had my own linux desktop um i've been on mac laptops for 10 years now yeah this is what leadership looks like as you switch to mac okay great yeah pretty much i mean just the fact that i had to do powerpoints i had to do presentations and you know plug in i just couldn't mess with plugging in laptops it wouldn't project and so uh you mentioned also quansight labs and things like that uh can you give advice on how to hire great programmers and great people yeah i would say produce an open source project yeah get people contributing to it and hire those people yeah i mean you're doing it sort of uh you may be perhaps a little biased but that's probably 100 really good advice i find it hard to hire i still find it hard to hire like in terms of i don't think like it's not hard to hire if i've worked with somebody for a couple of weeks but an hour or two of interviews i have no idea so that instinct that radar of knowing if you're good or not that you've found that you're still not able 
to it's really hard i mean the resume can help but again the resume is like a presentation of the things they want you to see not the reality of of and there's also um you know you have to understand what you're hiring for there are different stages and different kinds of skills and so it isn't just one of the things i talk a lot about internally at my companies is that the whole idea of measuring ourselves against the unit a single axis is flawed because we're not it's a multi-dimensional space and how do you order a multi-dimensional space there isn't one ordering so this whole idea you you immediately have projected into a thing when you're talking about hiring or best or worst or better or not better so what is the thing you're actually needing and you can even hire for that there is such a thing generally i really value people who have the affect they care about open source like so in some cases their affinity to open source is simply a kind of a filter of an affect however i have found this interesting dichotomy between open source contributors and product creation there's i don't know if it's fully true but there does seem to be the more uh the more experience the more affect somebody has an open source community the less ability to actually produce product that they have but the other one's kind of true too the more product focused star i find a lot of people talk to a lot of people who produce really great products and they they have a they're looking over the open source communities kind of wanting to participate and play but they've played here and they do a great job here and then they don't necessarily have some of the same i don't think that i don't think that's entirely necessary i think part of it is cultural how that's how they've emerged except because one of the things that open source communities often lack is great product management like some product management energy that's brilliant but you want both of those energies in a second place 
together yes you really do and so it's a lot of it's creating these teams of people that have these needed skills and attributes that are hard and so one of the big things i look for is somebody that fundamentally recognizes their need to learn like one of the values that we have in all of the things we do is learning like if somebody thinks they know it all they're going to struggle and some of that is just there's more basic things like humility just being humble in the face of all the things you don't know and that's like step one of learning that's step one of learning right and you know i've spent a lot of time learning right other people spend a lot more time but i've spent a lot of time learning i mean my whole goal was to get a phd because i loved school and i wanted to be a scientist and then what i found is what's been written about elsewhere as well is the more i learned the more i didn't know the more i realized man i know about this but this is such a tiny thing in the global scope of what i might want to know about so i need to be listening a whole lot better than i am just talking that's changed a little bit actually my wife says that i used to be a better listener now that i have i'm so full of all these ideas i want to do she kind of says you got to give people time to talk so you've uh succeeded on multiple dimensions so one is the tenure-track faculty uh the other is just creating all these products then building up the businesses then working with businesses uh do you have advice for young people today in high school in college of how to live a life as non-linear and as successful as yours a life that they could be proud of well like that's a super compliment i'm humbled by that actually i would say a life they can be proud of honestly one thing i've said to people is first um find people you love and care about them like family matters to me a lot and family means people you love and have 
committed to right so it can be whatever you you mean by that but it's it you need to have a foundation uh so find people you love and want to commit to and do that um because it anchors you in a way that nothing else can right and then and then you find other things and then kind of from out there you find other kinds of things you can commit to whether it's ideas or or people or groups of people um so you know especially in high school i would say don't settle on what you think you know right give yourself 10 years to think about the world like there's i see a lot of high school students who who seem to know everything already i i think i did too i think it's maybe natural but but recognize that the things you care about you might change your perspective over time i certainly have over time as senator you know i was really passionate about one specific thing and i was kind of softened you know i was a big um i didn't like the federal reserve right and there's still we can have a longer conversation about monetary policy and finances but but i'm a little more uh nuanced in my in my perspective at this point um but you know that's that's one area where you learn about someone going i want to attack it you know build don't destroy like build like someone so often the tendency is to not like something they want to go attack it build something build some to replace it yeah build up you know attract people to your new thing you'll get you'll be far more far better right you don't need to destroy something to build something else um so that's i guess generally uh and then you know definitely uh like curiosity you know follow your curiosity and and let it um don't just follow the money and all of that like you said is grounded in um family friendship and ultimately love yes which is uh a great way to end it travis you're one of the most impactful people in the engineering computer science in the human world so i truly appreciate everything you've done and i really 
appreciate that you would spend your valuable time with me it was an honor it was a real pleasure for me i appreciate that thanks for listening to this conversation with travis oliphant to support this podcast please check out our sponsors in the description and now let me leave you with something that in the programming world is called hodgson's law every sufficiently advanced lisp application will eventually be re-implemented in python thank you for listening and hope to see you next time