Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming | Lex Fridman Podcast #224
gFEE3w7F0ww • 2021-09-23
Kind: captions
Language: en
the following is a conversation with
travis oliphant
one of the most impactful programmers
and data scientists ever
he created numpy
scipy and anaconda
numpy formed the foundation of
tensor-based machine learning in python
scipy formed the foundation of
scientific programming in python and
anaconda specifically with conda made
python more accessible to a much larger
audience
travis's life work across a large number
of programming and entrepreneurial
efforts has and will continue to have
immeasurable impact on millions of lives
by empowering scientists and engineers
in big companies small companies and
open source communities to take on
difficult problems and solve them with
the power of programming
plus he's a truly kind human being which
is something that when combined with
vision and ambition makes for a great
leader and a great person to chat with
to support this podcast please check out
our sponsors in the description
this is the lex fridman podcast and
here is my conversation with travis
oliphant
what was the first computer program
you've ever written do you remember whoa
that's a good question i think it was in
fourth grade just a simple uh loop in
basic on an atari 800 or atari
400 i think or maybe it was an atari
800 it was part of a class and we
just wrote basic loops
to print things out
did you use go to statements um yes yes
we used go to statements
i remember in the early days that's when
i first realized there's like principles
to programming when i was told that
don't use go to statements those are bad
software engineering principles like it goes
against
what great beautiful code is i was like
oh okay there's rules to this game
i didn't see that until high school when
i took an ap computer science course
right i did a lot of other kinds of just
programming on ti but finally when i
took an ap computer science course in
pascal
wow that's that was pascal that's when i
oh there are these principles not c or c
plus plus no i didn't take c until uh
the next year in college i had a course
in c um but i haven't done much in
pascal just that ap computer science
course
now sorry for the romanticized question
but when did you first fall in love with
programming oh man good question i think
actually when i was 10 you know my dad
got us a timex sinclair
and uh he was excited about the
spreadsheet capability and then but i
made him get the basic the add-on so we
could actually program in basic and just
being able to write
instructions and have the computer do
something then we got a ti-99
a ti-99/4a when i was about 12 and i would
just it had sprites and graphics and
music you could actually program to do
music
that's when i really sort of fell in
love with programming so this is a full
like a real computer with like uh
with memory and storage yeah processors
so what not the type of ti yeah the
timex sinclair was one of the very first
it was a cheap cheap like i think it was
well it was still expensive but it was
2k of memory we got the 16k add-on pack
but yeah it had memory and you could
program it you had the in order to store
your programs you had to attach a tape
drive remember that old sound that
would play when you converted
the modems would convert
digital bits to audio on a tape
drive still remember that sound
but that was the storage and what was
the programming language do you remember
it was basic it was basic and then they
had visicalc and so a little bit of
spreadsheet programming in visicalc but
mostly just some basic do you remember
what kind of things drew you to
programming was it uh
working with data was it video games and
video games
mathy stuff yeah i've i've always loved
math and
a lot of people think they don't like
math because i think when they're
exposed to it early they uh it's about
memory
you know when you're exposed to math
early if you have a good short term
memory you remember times tables
and i i do have a reasonably i mean not
perfect but a reasonably long um little
short-term memory buffer
and so i did great at times tables they
said oh you're good at math but i
started to really like math just the the
problem solving aspect and so computing
was
problem solving
applied
and so that's always kind of been the
draw kind of coupled with the
mathematics
did you ever see the computer as like an
extension
of your mind like something able to
achieve not till later okay yeah no not
then it's just like a little set of
puzzles that you can play with and you
can you can play with math puzzles and
yeah it was it was too rudimentary early
on like it was sort of
yeah it was too it was a lot of work to
actually take
a thought you'd have and actually get it
implemented and that's still work but
it's getting easier and
so yeah i would say that's definitely
what's attracted me to python is that
that was more real
right
i could think in python
speaking a foreign language i only speak
another language fluently besides
english which is spanish and i remember
the day when i would dream in spanish
and you start to think in that language
and then you actually i do definitely
believe that language
limits or expands your thinking
uh there are some languages that
actually lead you to certain
thought processes yeah like uh
so i speak russian fluently and that's
certainly uh
a language that leads you down certain
thoughts
well yeah i mean there's a um
there's a history of
the two world wars right of the
of millions of people starving to death
or near to death throughout its history
of suffering of
injustice like this promise sold to the
people and then
the the carpet or whatever swept from
under them it's like broken promises and
all of that pain and melancholy is in
the language the sad songs the sad
hopeful songs the over romanticized like
i love you i hate you the the sort of
the swings between all the various
uh spectrums of emotion so that's all
within the language the way it's twisted
uh poetry there's a there's a
strong culture of rhyming poetry so like
the bards like this this thing there's a
musicality to the language too did
dostoevsky write in russian
yeah so like
yes
all the uh
[Laughter]
all the ones that i know about which are
translated and curious how the
translations so dostoevsky
did not use
the musicality of the language too much
so it actually translates pretty well
because it's so philosophically dense
that the story does a lot of the work
but there's a bunch of things that are
untranslatable certainly the poetry is
not translatable
i actually have a few
conversations coming up offline and also
in this podcast with people who've
translated dostoevsky and that's for
people who worked
who work in this field know how
difficult that is sometimes you can
spend
you know months thinking about a single
sentence right in in the context like
because there's just the magic captured
by that sentence and how do you
translate it just in the right way
because those words can be um
can be really powerful there's a famous
line
beauty will save the world from
dostoevsky
you know there's so many ways to
translate that and you're right the
language gives you the tools with which
to tell the story but it also leads your
mind down certain trajectories and paths
to where over time as you think in that
language you become a different human
being yes yeah yeah that's a fascinating
reality i think that i know people have
explored that but it's just rediscovered
well we don't we live in our own like
little pockets like this is the sad
thing
is
i feel like unfortunately given time and
given getting older i'll never know
the uh china the chinese world because i
don't truly know the language same with
japanese i don't truly know japanese and
portuguese and brazil that whole south
american continent like yeah i'll go to
brazil and argentina but will i truly
understand the people
if i don't understand the language it's
it's sad because um
i wonder how much how many geniuses we're
missing because uh because so much of
the scientific world so much of the
technical world is in english
and so much of it might be lost because
they're they just we don't have the
common language i completely agree i'm
very much in that vein of
there's a lot of genius out there that
we miss and it's sort of sort of
fortunate when it when it bubbles up
into something that we can understand or
process there's a lot we miss
so i tend to lean towards really loving
uh democratization or things that
empower people or you know i'm
very resistant to sort of authoritarian
structures
fundamentally for that reason it well
several reasons but it just hurts us
yeah we're worse off
so speaking of languages that empower
you so
python was the first language for me
that um that i could i really enjoyed
thinking in yeah as you said sounds like
you shared my experience too so when did
you first do you remember when you first
kind of connected with python maybe even
fell in love with python it's a good
question it was a process it took about
a year i first encountered python in
1997 i was a graduate student studying
biomedical engineering at the mayo
clinic
and i had previously i've been involved
in
taking information from satellites i was
an electrical engineering student
used to taking information and trying to
get something out of it doing some data
processing information out of it and i'd
done that in matlab i'd done that in
perl i've done that in you know
scripting of on a vms there's actually a
vax vms system and they had their own
little scripting tools uh around fortran
done a lot of that and then
as a graduate student i was looking for
something
and encountered python and because python
had an array had two things that made me
not filter it away because i was
filtering a bunch of stuff there was yorick i
looked at yorick i looked at a few other
languages that were out there at the time
in 1997 but it it had arrays there's a
library called numeric that had just
been written in 95 like not very
not not too much earlier by an mit alum
uh jim hugunin
you know and i went back and read the
mailing list to see the history of how
it grew and there was a very interesting
it's fascinating to do that actually to
see how this emergent
cooperation unstructured cooperation
happens in the open source world that
led to a lot of
this collective uh programming which is
something maybe we might get into a
little later but what that looks like
what gap did numeric fill numeric filled
the gap of having an array object so
there was no array there was no
array there was a one dimensional
byte concept but there was no
uh n-dimensional two three
four-dimensional tensor they call it now
i'm still in the category that a tensor
is another thing and it's just an nd
array we should call it but yeah kind of
lost that battle
there's many battles in this world some
of which we'll win some we lose that's
exactly right
so and but it was uh it had no math to
it so numeric had math and a basic way
to think in arrays so i was looking for
that and it had complex numbers
a lot of programming languages and you
can see it because you know if you're
just a computer scientist you think ah
a complex number's just two floats so you
can people can build that on
but in practice a complex number is
one of the significant algebras that
helps connect a lot of physical and
mathematical ideas particularly fft for
an electrical engineer
and and it's a really important concept
and not having it means you have to
develop it several times
and those times may not share an
approach one of the common things in
programming one of the things programming
enables is abstractions
but when you have shared abstractions
it's even better it sort of gets the
level of language of actually we all
think of this the same way which is both
powerful and dangerous
right because
powerful in that we now can quickly make
bigger and higher level things on top of
those abstractions dangerous because it
also limits us as to the things we left
maybe left behind in producing that
abstraction which is at the heart of
programming today and actually building
around the programming world so i think
it's a fascinating philosophical topic
yeah it will continue for many years i
think
as it builds more and more and more
abstractions yes i often think about you
know we have we have a world that's
built on these abstractions that were
they the only ones possible yeah
certainly not but they led to
now it's very hard to to do it
differently yeah like there's an inertia
that's very hard to you know
push out push away from there's that has
implications for things like you know
the julia language which you have heard
of i'm sure and
i've met the creators and i like julia
it's a really cool language but they've
struggled to kind of
against the just the tide of like this
inertia of people using python and
and you know there's strategies to
approach that but nonetheless it's a
it's a phenomena and sometimes so i love
complex numbers and i love arrays so i
looked at python
and then i had the experience i did some
stuff in python and i was just doing my
phd so i was out my focus was on
i was actually doing a combination of
mri and ultrasound and looking at a
phenomena called elastography which is
you push waves into the body
and observe those waves like you can
actually measure them
and then you
do mathematical inversion to see what
the elasticity is and so that's the
problem i was solving is how to do that
with both ultrasound and mri i needed
some tool to do that with so i was
starting to use python in 1997 in 98 i
went back looked at what i'd written and
realized i could still understand it
which is not the experience i'd had when
doing perl in 95 right i'd done the
same thing and then i looked back and i'd
forgotten what it was even saying
now you know i'm not saying it so i that
that means hey this may work i like this
this is something i can
retain without becoming an expert per se
and so that led me to go i'm going to
push more into this and
then that 98 was kind of the when i
started to fall in love with python i
would say
a few peculiar things about python so
maybe compared to perl compared to some
of the other languages so there's no
braces yeah
so you
space is used indentation i should say
is used as part of the language yeah
right uh so
did you
i mean that that's quite a leap uh were
you comfortable with that leap or were
you just very open-minded good question
i was open-minded so it i was
cognizant of the concern and it
definitely has
it has specific
challenges you know cut and pasting for
example you're cutting pasting code and
if your editors aren't supportive of
that if you're pasting into a terminal
and particularly in the past when
terminals didn't necessarily have the
intelligence to manage it now now now
ipython and jupyter notebooks handle it
just fine so there's really no problem
but in the past it created some
challenges formatting challenges also
mixed tabs and spaces if you're if
editors weren't you weren't clear on
what was happening you would have these
issues so there were really concrete
reasons about it that i heard and
understood i never really encountered a
problem with it
personally like it was occasional
annoyances but
i really like the fact that it didn't
have all these extra characters right
that
these extra characters didn't show up in
my visual field when i was just trying
to process understanding a snippet of
code yeah there's a cleanness to it but
i mean the idea is supposed to be that
perl also has a cleanness to it because
of the minimalism of like how many
characters it takes to express a certain
thing yeah so it's very compact yeah but
what you realize with that compactness
comes
there's a culture that uh prizes
compactness and so the code gets more
and more compact and less and less
readable to a point where it's like
uh like to be a good programmer in perl
you write code that's basically
unreadable right there's a culture like
correct and you're proud of it yeah
you're proud of it
right exactly and it's like feels good
and it's really selective like it means
you have to be an expert in perl to
understand it yeah whereas python
allowed you not to have to be an expert
you didn't have to take all this brain energy
you could leverage what i said you could
leverage your english language center
which you're using all the time i've
wondered about other languages
particularly non-uh latin based
languages you know
latin-based languages where the
characters are at least similar i think
people have an easier time but i don't
know what it's like to be a japanese or
a chinese person trying to
learn a different um
different syntax like what would
computer programming look like in a in
that i haven't looked at that at all but
it certainly doesn't you know leveraging
your your chinese language center i'm
not sure python or any programming does
that
but that was a big deal the fact that it
was accessible i could be a scientist
what i really liked is
many programming languages really demand
a lot of you and you can get a lot you
know you do a lot if you learn it but
python enables you to do a lot without
demanding a lot of you
there's a there's nuance to that
statement but it certainly was it's more
accessible so more people could actually
as a as a scientist as somebody or
engineer
who was trying to solve another problem
besides pure programming i could still
use this language and get things done
and and be happy about it i was also
comfortable in c
at that time and matlab you did a little
matlab i did a lot before that exactly
so i was comfortable in
those three languages were really the
tools i used during my studies and
schooling
um but to your point about language
helping you think one of the big things
about matlab is it was and apl before it
i don't know if you're a you remember
apl
apl is uh actually the predecessor of
array-based programming which i think is
really an underappreciated
if i talk to people who are just steeped
in computer programming computer science
like most the people that microsoft has
hired in the past for example
microsoft as a company generally did not
understand array-based programming like
culturally they didn't understand it so
they kept missing the boat kept missing
the understanding of what this what this
was
they've gotten better but there's still
a whole culture of folks that doesn't
programming that's yeah you know that's
that's systems programming or web
programming or lists and maps and you
know what about an n-dimensional array
oh yeah that's just an implementation
detail well
you can think that but then actually if
you have that as a construct you
actually think differently apl was the
first language to understand that it was
in the 60s
right the challenge of apl is apl had
very dense not only glyphs like new
characters new glyphs they even had a
new keyboard because to produce those
glyphs this is back in the early days of
computing when you know the
qwerty keyboard maybe wasn't as
established like what we could have a
new keyboard no big deal
but it was a big deal and it didn't
catch on and the language apl
very much like perl as people would
pride themselves on how how much could
they write the game of life in
30 characters of apl apl has characters
that mean
uh summation and uh they have adverbs
you know they have adjectives and these
things called adverbs which are like
methods like reduction reduction it
would be an adverb on an add operator
right so
but doing using these tools you could
construct and then you start to think at
that level you think in n dimensions
it's something i like to say and you
start to think differently about data at
that point you know now you're
it really helps yeah i mean
outside of programming if you really
internalize linear algebra as a course
i mean it philosophically allows you to
think of the world differently yes it's
almost like liberating you don't have to
you don't have to think about the
individual numbers in the n-dimensional
array you can think of it as an object
in itself and all of a sudden this world
can open up
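A small illustration of the "adverb on an operator" idea described above, written with modern NumPy rather than APL. This is only a sketch: the array values are made up for the example, and `np.add.reduce` is NumPy's present-day spelling of a reduction applied to the add operator, not something quoted from the conversation.

```python
import numpy as np

# A 2 x 3 array treated as a single object rather than six separate numbers.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# "Reduction" applied to the add operator: collapse one axis by summing.
col_sums = np.add.reduce(a, axis=0)   # sum down each column
row_sums = np.add.reduce(a, axis=1)   # sum across each row

# The same modifier works on other operators, for example multiply.
row_products = np.multiply.reduce(a, axis=1)

# A running variant of the same idea: cumulative sums along each row.
running = np.add.accumulate(a, axis=1)

print(col_sums, row_sums, row_products)
print(running)
```

The operator (`add`, `multiply`) and the modifier (`reduce`, `accumulate`) are chosen independently, which is roughly the operator-plus-adverb structure attributed to APL above.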
now you're saying matlab and apl were
like the early c
i don't know if many languages got that
right ever no no no they didn't still
even still i would say i mean
numpy is a as a inheritor of the
traditions that i would say apl j
was a another version that was what it
did is not have the glyphs just have
short characters but still a latin
keyboard could type them and then
numeric inherited from that in terms of
let's add arrays plus broadcasting plus
methods reduction even some of the
language like rank is a concept that's
in that was in python it's still in
python
for the number of dimensions right
that's that's different than say the
rank of a matrix which people think of
as well so it's it came from that
tradition but numpy
is a very pragmatic practical tool uh
numpy inherited from numeric and we can
get to where numpy came from which is
the current array
at least current as of
2016-17 now there's a ton of them over
the past two or three years but we can
get into that too so if we just sort of
linger on the early days of
what was your favorite feature of python
do you remember like what yeah
it's so interesting
to linger on like the
what
what really makes you connect with the
language i'm not sure it's
obvious to introspect that no it isn't
and i've thought about that at some
length i'm not i think definitely the
fact that i could read it later yeah
that i could use it productively without
becoming an expert and the other
language i had to put more effort into
right
that's like an empirical observation
like you're not analyzing any one aspect
of the language it just seems
time after time you look back it's
somehow readable it's somewhat readable
then it was sort of i could take
uh executable english yeah and translate
it to python more easily like i didn't
have to go there was no translation
layer
as an engineer or as a scientist i could
think about what i wanted to do and then
the syntax wasn't that far behind it
yeah right now there was some there have
some there's some warts there still it
wasn't perfect like there's some areas
where i'm like ah it'd be better if this
were different or if this were different
some of those things got out of the
language too i was really grateful for
some of the early pioneers in the python
ecosystem back because python got
written in 91 is when the first version
came out but guido was very open to
users and one of the sets of users were
people like jim hugunin and david ascher
and paul dubois
and
konrad hinsen these were people that
were on the mailing list and they were just
asking for things like hey we really
should have complex numbers in this
language so let's you know there's a j
there's a 1j right and the fact they
went the engineering route of j is
interesting
i don't i don't think that's entirely
favoring engineers i think it's because
i is so often used as the index of a for
loop
so i think that's actually
probably right i mean there's there's a
pragmatic aspect like the complex
numbers were there i love that the fact
that i could write nd array constructs
and that reduction was there very simple
to write summations and and and
broadcasting was there i could do
addition of whole arrays
um so that was cool those were something
i loved about it
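The features listed above (the `1j` literal, whole-array addition, broadcasting, and summation as a reduction) can be sketched in a few lines. The example uses today's NumPy, which the conversation describes as the successor to Numeric; the specific arrays are invented for illustration.

```python
import numpy as np

# Complex numbers are built into Python; j marks the imaginary part.
z = 3 + 4j
magnitude = abs(z)

# Adding whole arrays at once, with no explicit loop.
x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])
total = x + y

# Broadcasting: a (3, 1) column combined with a (3,) row gives a (3, 3) grid.
grid = x[:, np.newaxis] * y

# Reduction: summation over the whole array or along one axis.
grand_sum = grid.sum()
per_row = grid.sum(axis=1)

# Arrays can hold complex values too, here three points on the unit circle.
phasors = np.exp(1j * np.pi * np.array([0.0, 0.5, 1.0]))

print(magnitude, total, grand_sum, per_row)
```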
i don't know what to start talking to
you about because you've been you've
created so many incredible projects that
basically changed the whole landscape of
programming but okay let's start with uh
let's go chronologically
with scipy you created scipy over two
decades ago now yes right yeah i said
i'd love to talk about scipy scipy was
really my baby
what is it
uh what was its goal what is its goal
how does it work yeah fantastic so scipy
was effectively here i'm using python to
do
stuff that i previously used matlab to
do and i was using numeric which is an
array library that made a lot of it
possible but there's things that were
missing like i didn't have an ordinary
differential equation solver i could
just call right i didn't have
integration hey i wanted to integrate
this function okay well i don't have
just a function i can call to do that
um these are things i remember being
critical things that i was missing
optimization i just want to pass a
function to an optimizer and have it
tell me what the optimum value is
uh those are things like well why don't
we just write a library that adds these
tools
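For a sense of what "just a function i can call" looks like, here is a sketch using current SciPy. The function names (`quad`, `solve_ivp`, `minimize`) are today's API, not necessarily what the 1998-era packages exposed, and the test problems are chosen only because their answers are known in closed form.

```python
import numpy as np
from scipy import integrate, optimize

# Integration: the integral of sin(x) from 0 to pi is exactly 2.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Ordinary differential equation: dy/dt = -2*y with y(0) = 1,
# whose exact solution is exp(-2*t).
def decay(t, y):
    return -2.0 * y

solution = integrate.solve_ivp(decay, (0.0, 1.0), [1.0], rtol=1e-8, atol=1e-10)
y_end = solution.y[0, -1]

# Optimization: pass a function to an optimizer and read back the optimum.
def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

result = optimize.minimize(bowl, x0=[0.0, 0.0])

print(area, y_end, result.x)
```

In each case the caller supplies only a plain Python function plus a range or starting point, and the numerical algorithm stays inside the library.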
and i started to post on the mailing list
and there'd previously been you know
people had discussed i remember konrad
hinsen saying wouldn't it be great if we
had this optimizer library or david ascher
would say this stuff and and i'm you
know i'm a
ambitious i am this is the wrong word
and eager
and
uh
probably more time than sense i was you
know poor graduate student
uh my wife thinks i'm working on my phd
and i am but part of a phd that i loved
was the fact that it's exploratory
you're not just you know
taking orders fulfilling a list of
things to do you're trying to figure out
what to do and so i thought well you
know i'm writing tools for my own use
and a phd so
i'll just start this project and so in
99 98 was when i first started to write
libraries for python particularly when i
fell in love with python 98 i thought
well there's just a few things missing
like oh i need a reader to read dicom
files i was in medical imaging dicom was a
format that i want to be able to load
that into python okay how do i write a
reader for that so i wrote something
called
it was an i o package right and that was
my very first extension module which is
c so i wrote c code to extend python so
that the pos in python i could write
things more easily that that combination
kind of hooked me it was the idea that i
could here's this powerful tool i can
use as a scripting language and a high
level language to think about but that i
can extend easily
easily in c that easily for me because i
knew enough c right and then guido had
written a link i mean the only the hard
part of extending python was something
called the way memory management works
and you have to do reference counting and
so there's there's a tracking of
reference counting you have to do
manually
and if you don't you have you have
memory leaks and uh so that's hard plus
then c you know it's just much more you
have to put more effort into it it's not
just i have to now think about pointers
and have to think about stuff that
is different i have to kind of you're
like putting a new cartridge in your
brain like you're okay i'm thinking
about mri now i'm thinking about
programming and there are distinct
modules you end up having to think about
so it's harder when i was just in python
i could just think about mri and
high-level writing
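The reference counting mentioned above is done automatically when you stay in Python; in a C extension the same bookkeeping has to be written by hand. A rough way to watch the counts from the Python side, assuming the CPython interpreter (`sys.getrefcount` is CPython-specific and its absolute values vary by version, so only the differences are meaningful):

```python
import sys

data = [1.0, 2.0, 3.0]

# Baseline count for the list object.
before = sys.getrefcount(data)

alias = data                  # a second name bound to the same list
holder = {"payload": data}    # a container holding another reference
during = sys.getrefcount(data)

# Dropping the extra references returns the count to the baseline.
del alias
del holder
after = sys.getrefcount(data)

print(during - before, after - before)
```

In C, forgetting the step that corresponds to the two `del` lines is what produces the memory leaks described above.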
but i could do that and that kind of i
liked it i found that to be enjoyable
and fun and so i ended up oh well let me
just add a bunch of stuff to python to
do integration
well and the cool thing is is that you
know the power of the internet i just
looking around and i found oh there's
this netlib which has
hundreds of fortran routines that people
had written in the 60s and the 70s and the
80s in fortran 77 fortunately it wasn't
fortran 66 it had been ported to
fortran 77
and fortran 77 is actually a really great
language
fortran 90 probably is my favorite fortran
because it's also it's got complex
numbers got arrays and it's pretty high
level now the problem with it is you'd
never want to write a program in fortran
90 or fortran 77 but it's totally fine
to write a subroutine in
right and so and then fortran kind of got
a little off course when they tried to
compete with c plus plus but at the time
i just want libraries to do something
like oh here's an ordinary differential
equation here's integration here's runge
kutta integration
already done i don't have to think about
that algorithm and you could but it's
nice to have somebody who's already done
one and tested it and so i sort of
started this journey in 98 really if you
look back at the mailing list there's sort
of this this productive era of me
writing an extension module to connect
runge-kutta integration to python
and making an ordinary differential equation
solver and then
releasing that as a package so we could
call odepack i think i called it then
quadpack and then i just made these
packages eventually that became
multipack because they were originally
modular you can install them separately
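To make "runge-kutta integration" concrete, here is a minimal pure-Python sketch of the classical fourth-order method. It is not the netlib routine that was wrapped; the function names `rk4_step` and `rk4_integrate` are made up for this example, and the test equation is picked because its exact solution is known.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h (classical fourth-order Runge-Kutta)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def rk4_integrate(f, t0, y0, t1, steps):
    """Integrate from t0 to t1 using a fixed number of equal steps."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# dy/dt = -2*y with y(0) = 1 has the exact solution exp(-2*t).
approx = rk4_integrate(lambda t, y: -2.0 * y, 0.0, 1.0, 1.0, 100)
exact = math.exp(-2.0)
print(approx, exact)
```

A tested library routine adds things this sketch leaves out, such as adaptive step sizes and error estimates, which is the appeal of reusing one that somebody has already written and tested.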
but a massive problem in python was
actually just getting your stuff
installed
at the time releasing software for me
like today it's people think what does
that mean well then it meant
some poorly written web page i had some
bad web page up and i put a tarball just
a gzipped tarball of source code that was
the release
but okay can we just stay on that because
that the community aspect of creating
the package and sharing that yes
that's rare
that to have to both have
at that time so like that was pretty
early yeah so well not not rare maybe
maybe you can uh correct me on this but
it seems like in the scientific
community so many people you were
basically solving the problems you
needed to solve
to process the particular application uh
the data that you need and
to also have the mind that i'm going to
make this usable for others
that's um i would say i was inspired i'd
been inspired by linux i've been
inspired by you know linus linus and him
making his code available and i was
starting to use linux at the time and i
went this is cool so i had kind of been
previously primed that way and generally
i was i was into science because i liked
the sharing notion i like the idea of
hey let's if collectively we build
knowledge and share it we can all be
better off okay so you weren't energized
by that so it's energized value already
yeah right and i can't deny that i was
i'm sort of uh had this very
i liked that part of science that part
of sharing and then all of a sudden oh
wait here's something and here's
something i could do
and then i slowly over years learned how
to share better so that you could
actually engage more people faster one
of the key things was actually giving
people a binary they could install
right so that wasn't just your source
code good luck
compile this and then get it compiled
ready to install you just you know so in
fact a lot of the journey from 98 even
through 2012 when i started
anaconda was about that like it's why uh
you know it's really the key as to why
the scientists with dreams of doing mri
research ended up starting a software
company that installs
software i work with
a few folks now that don't program
like on the creative side and the video
side the audio side and because my whole
life is running on scripts i have to try
to get them to i now have the task of
teaching them how to do python enough
yeah to run the scripts and so i've been
actually facing this whether it's
anaconda or some
with the task of how do i minimally
explain basically to my mom how to write
a python script
and it's an interesting challenge
it's a to-do item for me to figure out
like what is the minimal amount of
information i have to teach what are the
tools you use so that one you enjoy it two
you're effective at it those are two
related questions and then the debugging
like the the iterative process of
running the script to figure out what
the error is maybe even for some people
to do the fix yourself yeah so do you
compile it do this like how do you
distribute that code to them and it's
interesting because i think
it's exactly what you're talking about
if you increase the circle
of empathy that the circle of people
that are able to use your programs
you increase its like effectiveness
and its power and so yeah you have to
think
you know can i write scripts can i write
programs that can be used by biomedical
engineers by all kinds of
people that don't know programming and
actually maybe plant a seed
have them catch the bug of programming
so that they start on their journey
that's a huge responsibility and
ultimately has to do with the amazon
one-click buy
like how how frictionless can you make
the early steps frictionless is actually
really key to growing any community is
every any friction point you're just
going to lose you're going to lose some
people yeah right now sometimes you may
want to
intentionally do that if you're early
enough on you need
a lot of help you need people who have
the skills you might actually it's
helpful you don't necessarily want too
many users as opposed to
contributors if you're early
on
anyway there's uh uh scipy started in
98 but it really emerged as this
collection of modules that i was just
putting on the net people were
downloading and they you know
i think i got 100 users right by the end
of that year but there but the fact that
i got 100 users and more than that
people started to email me with fixes
like and that was actually intoxicating
right that was the
that was the you know here i'm writing
papers and i'm giving talks at conferences and
people would say hello but yeah good
job but mostly it was you're reviewed
with
it it's competitive yeah right you
publish a paper and people were like oh
it wasn't my paper you know
i was starting to see that sense of
academic
life where it was so much i thought
there was a cooperative effort but it
sounds like we're here just to
one-up each other right and
you know it's not it's not true across
the board but a lot of that's there but
here in this world i was
getting responses from people all over
the world
uh you know i remember pearu peterson
in estonia right was one of the first
people and he sent me back this makefile
because the first thing he said is yeah
your build thing stinks and here's a
better makefile now it was a complex
makefile i think i never understood
that makefile actually but it worked
and it did a lot more and so then thanks
this is cool and that was my first kind
of engagement with community
development but you know the process was
he sent me a patch file i had to upload
a new tarball and i just found i really
loved that and the style back then was
here's a mailing list it was very it
certainly weren't the tools that are
available today it was very early on but
i really started that's the whole year i
i think i did about seven packages that
year right and then by the end of the
year i collected them into a thing
called multipack so 99 there was this
thing called multipack and that's when
a high school student he was a high
school student at the time a guy named
robert kern
took
that package and made a windows
installer
right and then of course a massive
increase of usage so by the way most of
this development was under linux yes yes
it was on linux i was a linux developer
doing it on a linux box i mean at the time
i was actually getting into i had a new
hard drive i did some kernel
programming to make the hard drive
work i mean not programming but
modification to the kernel so i could
actually get the hard drive working
i i love that aspect of it i was also in
you know at school i was building a
cluster i took mac computers like uh and
you put yellow dog linux on them uh they
were at the mayo clinic they were just
they're all these macs that were older
they were just getting rid of and so i
kind of got permission to go grab them
together i put about 24 of them together
in a cluster and a cabinet
and put yellow dog linux on them all and
i wrote a c plus plus um
program to do mri simulation that was
what i was doing
at the same time for my day job so to
speak so i was loving the whole process
at the same time i was oh i need an
ordinary differential equation solver that's
why ordinary differential equations were
key was because that's the heart of a
bloch equation for simulated mri is an
ode solver and so that's
but i actually did that
it just happened at the same time
that's where kind of what you're
working on and what you're interested in
they're coinciding i was definitely
scratching my own itch
in terms of building stuff and uh which
helped in the sense that i was using it
for me so at least had one user yeah i
had one person who's like well i know
this is better i like this interface
better and i had the experience of
matlab to guide some of what those apis
might look like but you know you're just
doing it yourself you're building all this
stuff but with the windows installer it
was the first time i realized oh yeah
the binary installer really helps people
and so
that led to spending more time on that
side of things so around 2000 so i
graduated my phd in 2000 end of year
2000. so
99 doing a lot of work there 98 doing a lot
of work there 99 kind of spending more
time on my phd you know helping people
use the tools thinking about where i want
to go from here there was a company
there's a guy actually eric jones and
travis vaught they were two friends who
founded a company called enthought it's here
in austin still here
and they
eric contacted me at the time when i was
a uh
i was a graduate student still and he
said hey why don't you come down we want
to build a company
you know we want we're thinking of you
know a scientific
company and we want to take what you're
doing and kind of add it to some stuff
that he'd done he'd written some tools
and then pearu peterson had done f2py
let's come together and build pull this
all together and call it scipy
so that's the origin of the scipy brand
it came from you know multipack and a
whole bunch of modules i'd written plus
a few things from some other folks and
then pulled together in a single
installer
scipy was really a distribution of
python masquerading as a library
how did you think about scipy in
context of python in context of numeric
like what we saw scipy as a way to make
an r&d environment for python like use
python uh dependent on numeric so
numeric was the array library we
depended on and then from there
extend it with a bunch of modules that
allowed for and at the time the original
vision of scipy was to have
plotting was to have you know a repl
you know the repl environment and kind
of a whole really a whole data
environment
um that you could then install and get
going with and that was kind of the
thinking
it didn't really evolve that way right
it sort of had a but one
it's really hard to do massive scale
projects in a
with with open source collectives
actually there's a there's sort of an
intrinsic uh cooperation limit
as to which you know too many cooks in
the kitchen you know you can do amazing
infrastructure work when it comes down
to bringing it all together into a
single deliverable that actually
requires a little more
a little more product management that is
not
it doesn't really emerge from the same
dynamic so it struggled you know
struggled to get
almost too many voices it's hard to have
everybody agree you know consensus
doesn't really work at that scale you
end up with politics you know with the
same kind of things that happen in
large organizations trying to decide on
what to do together um
so consensus building was still was was
challenging at scale as more people came
in right early on it's fine because
there's nobody there and so it works but
then as you get more successful the more
people use it all of a sudden oh there's
this this
scale at which this doesn't work anymore
and we have to come up with different
approaches so scipy came out officially
in 2001 was the first release most of the
time i remember the days of getting that
release ready it was a windows installer
and there was there were bugs on how you
know the windows compiler handled
complex numbers and you were you're
chasing segmentation faults and it was
it's a lot of work there's a lot of
effort had nothing to do with my
area of study at the same time i just
got an offer so he wondered if i wanted
to come down and help him start that you
know start that company with his friend
and i at the time i was like i was
intrigued but i was pursuing a path an
academic path and i just got an offer to
go and teach at my alma mater so i took
that tenure track position
and scipy and kind of then i started
work on scipy as a professor too
okay so that that's i left the
mayo clinic graduated wrote my thesis
using scipy wrote you know there's
there's images that were created
now the plotting tool i used was
something from yorick actually it was a
plotting a plt kind of a plotting
language that i used yorick is a
programming language it was a
programming language had a plotting tool
dislin
we had integration to dislin i ended
up using dislin plus some some of the
plotting from yorick
linked to from python anyway it was a
people don't plot that way now but this
is before and scipy was trying to add
plotting yeah right
it didn't have much success really the
success of plotting came from john
hunter
who had a similar experience to my
experience my kind of maverick
experience as a person just trying to
get stuff done and kind of having more
time than than money maybe right and
john hunter created what matplotlib
he's the creator of matplotlib yeah so john
hunter was uh you know he wasn't a
student at the time but he was actually
he was working in the quant field and he
said we need better plotting so he just
went out and said cool i'll make a new
project and we'll call it matplotlib and
he released in 2001. about the same time
that scipy came out and it was separate
library separate install
used numeric scipy used numeric
and so scipy you know 2001 released
scipy and then enthought created a
conference called scipy which
brought people together to talk
about the space and that conference is
still ongoing it's one of the favorite
conferences of a lot of people because
it's
it's changed over the years but early on
it was you know a collection of 50
people who care about
scientists mostly
practicing scientists who want to care
about
coding and doing it well and not using
matlab
i remember being driven by you know i
like matlab but i didn't like the fact
that like so i'm not opposed to proprietary
software i'm actually not an open source
zealot i love open source for what
it brings but i also see the role for
proprietary software what i didn't like
was the fact that i would develop code
and publish it and then effectively
telling somebody here to run my code you
have to have this proprietary software
right and there's also culture around
matlab
as much because i've talked to a few
folks
mathworks creates matlab yeah
i mean there's just a culture they try
really hard but it's just there's this
corporate ibm style culture that's like
or whatever i don't don't want to say
negative things about ibm or whatever
but there's a
no it's it's really that connection
something i'm in the middle of right now
is is the business of open source and
how do you connect the ethos of
cooperative development with the
necessity of
of creating profits
right and like right now today you know
i'm still i'm still in the middle of
that that's actually the early days of
of me exploring this question
because i was writing scipy i mean as
an aside i also had so i had three kids
at the time i have six kids now i got
married early wanted a family uh i had
three kids and i remember reading i
remember reading richard stallman's post
and i was i was a fan of stallman i
would read his work i liked the
collective ideas he would have certainly
the ideas on ip law i read a lot of
stuff but then he said you know
okay
well how do i make money with this how
do i make a living how do i pay for my
kids all this stuff was in my mind a
young graduate student making no money
thinking i got to get a job and he said
well you know i think just be like me
and don't have kids right that's just
don't don't that's his take on this that
was just that was that was the what what
he said in that moment right that's the
thing i read and i went
okay this is a train i can't get on
yeah
there has to be a way to preserve the
culture of open source and still be able
to make sufficient money to feed you yes
exactly there's got to be well so that
actually led me to a study of economics
because at the time i was ignorant and
it really was i'm actually i'm
embarrassed for our educational system that
they could let me and i was
valedictorian in my high school class
and i did super well in college and like
academically i did great
right but the fact that i could do that
and then be clueless about this key part
of life
it led me to go there's a problem like i
should i should have learned this in
fifth grade i should learn this in
eighth grade like everybody should come
out with a basic knowledge of economics
you're an interesting example because
you've created tools that uh change the
lives of probably millions of people and
the fact that you don't understand at
the time of the creation of those tools
the basic economics of how like to
build up a giant system is a problem yeah
it's a problem and so i during my phd at
the same time this is actually in 98 99
at the same time i was in the library i
was reading books on capitalism i was
reading books on marxism i was reading
books on you know what is this thing
what does it what does it mean yeah and
i encountered a basically what i
encountered a set of writings from
people that said they were the
inheritors of adam smith but adam smith for
the first time right which is the wealth
of nations and kind of this notion of
emergent
emergent uh societies and realized oh
there's this whole world out here of
people
and the challenge is the economics is
also political
like because economics you know
people
different parties running for office
they'll
they want their economic friends they
want their economist to back them up
right or to to be there
to be their magicians like the magicians
in pharaoh's court right the people that
are going to say hey this is you should
listen to me because i've got the expert
who says this
and so it gets really muddled right but
i was looking at it as a scientist as
a scientist going what is this space
what does this mean how do people how
does paris get fed how does how what is
money how does it work i found a lot of
writings i really loved i found some
things that i really loved and i learned
from that it was writings from people
like von mises he wrote a pre-order
paper in 1920 that still should be read
more than it is it's got i mean it was
the economic calculation problem of the
socialist commonwealth it's basically in
response to the bolshevik revolution in
1917. and his basic argument was it's
not going to work to not have private
property you're not going to be able to
come up with prices the bureaucrats
aren't going to be able to determine how
to allocate resources without a price
system and a price system emerges from
people making trades and they can only
make trades if they have authority over
the thing they're trading
and that that that creates information
flow that you just don't have
if you try to top down it right right
it's like huh that's a really good point
yeah the prices have a signal that's
used and it's important to have that
signal
when you're trying to build a community
of productive people like you would in
the software engineering yeah the prices
are actually an important
signaling mechanism yeah right and that
money
is just a bartering tool right so this
is the first time i've encountered any
of this concept right and the fact that
oh this is actually
really critical like it's so critical to
our prosperity and
that
we're dangerously
not learning about this not teaching our
children about this you know so you had
the three kids you had to make some
you had to make some money right i had
to figure it out but i didn't really
care i mean i was never i've never been
driven by money just need it right right
to eat so what how did that resolve
itself in terms of scipy
so i would say it didn't really resolve
itself it sort of started a journey that
i'm continuing on i'm still on i would
say i don't think it resolved itself but
i will say
i went in eyes wide open like i
knew that there were problems with you
know um giving stuff away and creating
uh the the ex market externalities the
fact that yeah people might use it and i
might not get paid for it and i'll have
to figure something else out to get paid
like at least i can say i'm not bitter
that a lot of people have used stuff
that i've written and i haven't
necessarily benefited economically from
it like yeah i've heard other people be
you know bitter about that when they
write or they talk like oh i should have
got more value out of this and i'm also
i want to create systems that let people
like me who might have these desires to
do things let them benefit so it
actually creates more of the same
not to turn on your bitterness module
but
there's some aspect i wish there was
mechanisms for me to reward whoever
created scipy and numpy because it
brought so much joy to my life i
appreciate that i mean the tip jar
notion was there i appreciate that and i
think but there should be a there's
surely a mechanism i totally
agree i would love to talk about some of
the ideas i have because i actually came
across i think i've come up with some
interesting notions that could work but
they'll require
you know anything that will work takes
time to emerge right like things don't
just turn overnight that's definitely
one thing i've also understood and
learned
is any fixes
that's why it's kind of funny we often
give credit to you know oh this
president gets elected and oh look how
great things have done
and
i saw that when i had a transition
at anaconda when a new ceo came in right
and it's like the success that's
happening there's an inertia there yeah
right and sometimes the decision you
made like 10 years before is the reason
why the success is seen right exactly so
we're sort of just running around taking
credit for stuff credit assignment has
like a delay to it yes
that this makes the credit assignment
basica