Stephen Wolfram: ChatGPT and the Nature of Truth, Reality & Computation | Lex Fridman Podcast #376
PdE-waSx-d8 • 2023-05-09
I can tell ChatGPT to create a piece of code and then just run it on my computer, and that sort of personalizes for me the "what could possibly go wrong," so to speak.
Was that exciting or scary, that possibility?
It was a little bit scary, actually, because if you do that, what is the sandboxing that you should have? And that's a version of that question for the world: as soon as you put the AIs in charge of things, how many constraints should there be on these systems before you put the AIs in charge of all the weapons and all these different kinds of systems?
Well, here's the fun part about sandboxes: the AI knows about them, and it has the tools to crack them.
The following is a conversation with Stephen Wolfram, his fourth time on this podcast. He's a computer scientist, mathematician, theoretical physicist, and the founder of Wolfram Research, the company behind Mathematica, Wolfram|Alpha, Wolfram Language, and the Wolfram Physics and Metamathematics projects. He has been a pioneer in exploring the computational nature of reality, and so he's the perfect person to explore together the new, quickly evolving landscape of large language models as human civilization journeys toward building superintelligent AGI.
This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Stephen Wolfram.
You've announced the integration of ChatGPT with Wolfram|Alpha and Wolfram Language, so let's talk about that integration. What are the key differences, at the high philosophical level and maybe the technical level, between the capabilities of, broadly speaking, the two kinds of systems: large language models, and this gigantic computational system and infrastructure that is Wolfram|Alpha?
So what does something like ChatGPT do? It's mostly focused on making language like the language that humans have made and put on the web. Its primary underlying technical thing is: you've given it a prompt, and it's trying to continue that prompt in a way that's somehow typical of what it's seen, based on a trillion words of text that humans have written on the web. And the way it's doing that is with something probably quite similar to the way we humans do the first stages of that, using a neural net: given this piece of text, let's ripple through the neural net and get one word of output at a time. It's a shallow computation over a large amount of training data, namely what we humans have put on the web.
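As a rough sketch of the one-word-at-a-time generation loop being described here, in Wolfram Language; the nextWordWeights function is a hypothetical placeholder for a trained model, not anything from ChatGPT or Wolfram Language itself:

  (* Autoregressive generation sketch: repeatedly sample one next word and append it.   *)
  (* nextWordWeights[tokens] is a hypothetical stand-in for a neural net that returns   *)
  (* weights -> candidate words for continuing the given token list.                    *)
  generate[prompt_List, n_Integer] :=
    Nest[Append[#, RandomChoice[nextWordWeights[#]]] &, prompt, n]

  (* e.g. generate[{"the", "cat", "sat"}, 10] would extend the prompt by ten words *)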
That's a different thing from the computational stack that I've spent the last, I don't know, 40 years or so building, which has to do with what you can compute in many steps, potentially a very deep computation. It's not taking the statistics of what we humans have produced and trying to continue things based on those statistics. Instead it's trying to take the formal structure that we've created in our civilization, whether from mathematics or from systematic knowledge of all kinds, and use that to do arbitrarily deep computations, to figure out things that aren't just "let's match what's already been said on the web," but "let's potentially compute something new and different that's never been computed before."
As a practical matter, our goal is to have made as much of the world computable as possible, in the sense that if there's a question that is in principle answerable from some expert knowledge that's been accumulated, we can compute the answer to that question, and we can do it in a reliable way, the best one can do given the expertise our civilization has accumulated. It's much more labor-intensive on the side of creating the computational system to do that. Obviously, in the ChatGPT world it's more like: take things that were produced for quite other purposes, namely all the things we've written on the web, and forage from that things which are like what's been written on the web. So from a practical point of view, I view the ChatGPT thing as being wide and shallow, and what we're trying to do with building out computation as being also broad, but most importantly deep.
Another way to think about this: if you go back in human history, I don't know, a thousand years or something, and you ask what the typical person is going to figure out, the answer is there are certain kinds of things we humans can quickly figure out. That's what our neural architecture and the kinds of things we learn in our lives let us do. But then there's this whole layer of formalization that got developed, which is the whole story of intellectual history, and that formalization turned into things like logic, mathematics, science, and so on. That's the kind of thing that allows one to build these towers of things you work out. It's not just "I can immediately figure this out"; it's "I can use this formalism to go step by step and work out something that was not immediately obvious to me." And that's the story of what we're trying to do computationally: to be able to build those tall towers of what implies what implies what, and so on.
As opposed to the "yes, I can immediately figure it out, it's just like something I saw somewhere else, something I heard or remembered."
What can you say about that kind of formal structure, the kind of foundation you can build such a formal structure on, about the kinds of things you would start with in order to build these deep, computable knowledge trees?
So the question is how you think about computation, and there are a couple of points here. One is what computation intrinsically is like, and the other is what aspects of computation we humans, with our minds and with the kinds of things we've learned, can relate to in that computational universe. If we start with what computation can be like, something I've spent a big chunk of my life studying: usually we write programs where we know what we want the program to do, we carefully write many lines of code, and we hope the program does what we intended it to do. But the thing I've been interested in is the natural science of programs. You just say, I'm going to make this program, a really tiny program, maybe I even pick the pieces of the program at random, but it's really tiny, by which I mean less than a line of code. You ask what this program does, and you run it. The big discovery I made in the early '80s is that even extremely simple programs, when you run them, can do really complicated things. That really surprised me; it took me several years to realize that that was a thing, so to speak. But that realization, that even very simple programs can do incredibly complicated things that we very much don't expect, that discovery: I realized that's very much, I think, how nature works. Nature has simple rules but yet does all sorts of complicated things that we might not expect. A big thing of the last few years has been understanding that that's how the whole universe and physics works, but that's a quite separate topic. So there's this whole world of programs and what they do, very rich, sophisticated things that these programs can do. But when we look at many of these programs, we look at them and say, well, I don't really know what that's doing; it's not a very human kind of thing.
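The canonical illustration of this discovery is rule 30, a cellular automaton specified by a single small rule whose evolution from one black cell produces an intricate, seemingly random pattern. A minimal Wolfram Language sketch:

  (* Rule 30, run for 200 steps from a single black cell on a white background. *)
  (* The left edge shows regular structure; the right side looks effectively random. *)
  ArrayPlot[CellularAutomaton[30, {{1}, 0}, 200]]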
So on the one hand we have what's possible in the computational universe; on the other hand we have the kinds of things that we humans think about, the kinds of things developed in our intellectual history. And really the challenge of making things computational is to connect what's computationally possible out in the computational universe with the things that we humans typically think about with our minds. Now, that's a complicated, moving target, because the things that we think about change over time. We've learned more stuff, we've invented mathematics, we've invented various kinds of ideas and structures, and so on. So it's gradually expanding; we're gradually colonizing more and more of this intellectual space of possibilities. But the real challenge is: how do you encapsulate the kinds of things that we think about in a way that plugs into what's computationally possible?
And actually the big idea there is this idea of symbolic programming, symbolic representations of things. The question is, when you look at everything in the world, you take some visual scene you're looking at, and you say, well, how do I turn that into something that I can stuff into my mind? There are lots of pixels in my visual scene, but the things I remembered from that visual scene are, you know, there's a chair in this place. That's a kind of symbolic representation of the visual scene: there are two chairs and a table, or something, rather than there are all these pixels arranged in all these detailed ways. So the question is how you take all the things in the world and make some kind of representation that corresponds to the types of ways that we think about things. Human language is one form of representation that we have; we talk about chairs, that's a word in human language, and so on. But human language is not, in and of itself, something that plugs in very well to computation. It's not something from which you can immediately compute consequences. So you have to find a way to take the stuff we understand from human language and make it more precise, and that's really the story of symbolic programming. What that turns into is something which, I didn't know at the time it was going to work as well as it has, but back in 1979 or so I was trying to build my first big computer system and trying to figure out how I should represent computations at a high level, and I invented this idea of using symbolic expressions, structured as something like a function and a bunch of arguments, but that function doesn't necessarily evaluate to anything. It's just a thing that sits there representing a structure. And it's turned out that structure is a good match for the way that we humans conceptualize higher-level things. It's been, I don't know, 45 years or something, and it's served me remarkably well, building up that structure using this kind of symbolic representation.
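A minimal sketch of what such a symbolic expression looks like in Wolfram Language (the symbols f, g, and x here are just placeholders with no definitions):

  (* f has no definition, so f[2, x + 1] doesn't evaluate to anything; it just sits *)
  (* there as a structure that can be inspected and transformed.                     *)
  expr = f[2, x + 1];
  Head[expr]            (* -> f *)
  expr /. f -> g        (* -> g[2, 1 + x]: purely structural rewriting *)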
But what can you say about abstractions here? Because you could just start with your physics project, you could start at a hypergraph at a very, very low level, and build up everything from there. But you don't; you take shortcuts, right? You take the highest level of abstraction, the kind of abstraction that's convertible to something computable using symbolic representation, and then that's your new foundation for that little piece of knowledge. And somehow all of that is integrated, right?
So a very important phenomenon, one of these things I've realized is going to become more and more important in the future of kind of everything, is this phenomenon of computational irreducibility. The question is: if you know the rules for something, you have a program, you're going to run it, you might say, I know the rules, great, I know everything about what's going to happen. Well, in principle you do, because you can just run those rules out and see what they do. You might run them a million steps and see what happens. The question is, can you immediately jump ahead and say, "I know what's going to happen after a million steps, and the answer is 13," or something? One of the very critical things to realize is that if you could reduce that computation, there is in a sense no point in doing the computation. The place where you really get value out of doing a computation is when you had to do the computation to find out the answer. And this phenomenon, that you have to do the computation to find out the answer, this phenomenon of computational irreducibility, seems to be tremendously important for thinking about lots of kinds of things.
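A small Wolfram Language sketch of the contrast being drawn here, assuming rule 30's center column really has no shortcut (it is conjectured, not proven, to be irreducible):

  (* Reducible: the value after n doublings of 1 has a closed form, so you can *)
  (* "jump ahead" to step one million without iterating.                        *)
  afterDoublings[n_] := 2^n
  afterDoublings[10^6];

  (* Apparently irreducible: to get cell n of rule 30's center column, the only *)
  (* known method is to actually run all n steps.                               *)
  center30[n_] := CellularAutomaton[30, {{1}, 0}, n][[-1, n + 1]]
  center30[1000]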
So one of the things that happens is, okay, you've got a model of the universe at the low level, in terms of atoms of space and hypergraphs and the rewriting of hypergraphs and so on, and it's happening, say, 10 to the 100 times every second. Well, you say, great, then we've nailed it, we know how the universe works. Well, the problem is, the universe can figure out what it's going to do; it does those 10 to the 100 steps. But for us to work out what it's going to do, we have no way to reduce that computation. The only way to see the result of the computation is to do it, and if we're operating within the universe, there's no opportunity to do that, because the universe is doing it as fast as the universe can do it.
So what we're trying to do, and a lot of the story of science, a lot of other kinds of things, is finding pockets of reducibility. That is, you could have a situation where everything in the world is full of computational irreducibility, and we never know what's going to happen next; the only way to figure out what's going to happen next is just to let the system run and see what happens. So in a sense, the story of most kinds of science, inventions, a lot of kinds of things, is the story of finding these places where we can locally jump ahead. And one of the features of computational irreducibility is that there are always pockets of reducibility, always an infinite number of places where you can jump ahead. There's no way you can jump completely ahead, but there are little patches, little places where you can jump ahead a bit. We can talk about the physics project and so on, but I think the thing we realize is that we exist in a slice of all the possible computational irreducibility in the universe, a slice where there's a reasonable amount of predictability. In a sense, as we try to construct these higher levels of abstraction, symbolic representations and so on, what we're doing is finding these lumps of reducibility that we can attach ourselves to, and about which we can have fairly simple narrative things to say. Because in principle, if I ask what's going to happen in the next few seconds, well, there are these molecules moving around in the air in this room, and oh gosh, it's an incredibly complicated story, a whole computationally irreducible thing, most of which I don't care about. Most of it is, well, the air is still going to be here and nothing much is going to be different about it. And that's a reducible fact about what is ultimately an underlying computationally irreducible process.
And life would not be possible if we didn't have a large number of such reducible pockets, pockets amenable to reduction into something symbolic.
Yes, I think so. I mean, life in the way that we experience it, depending on what we mean by life, so to speak: the experience we have of consistent things happening in the world. The idea of space, for example, where we can just say you're here, you move there, and it's kind of the same thing, it's still you in that different place, even though you're made of different atoms of space and so on. This idea that there's this level of predictability in what's going on, that's us finding a slice of reducibility in what is underneath a computationally irreducible kind of system. And I think that's actually my favorite discovery of the last few years: the realization that it is the interaction between this underlying computational irreducibility and our nature as observers, who have to key into computational reducibility, that leads to the main laws of physics that we discovered throughout the past century. We can talk about this in more detail, but to me it's our nature as observers: the fact that we are computationally bounded observers, we don't get to follow all those little pieces of computational irreducibility. To stuff what is out there in the world into our minds requires that we are looking at things that are reducible; we are compressing, extracting just some essence, some kind of symbolic essence, of the detail of what's going on in the world. That, together with one other condition that at first seems trivial but isn't, which is that we believe we are persistent in time.
That is, some sense of causality.
Here's the thing: at every moment, according to our theory, we're made of different atoms of space. At every moment, the microscopic detail of what the universe is made of is being rewritten. In fact, the very fact that there's coherence between different parts of space is a consequence of the fact that there are all these little processes going on that knit together the structure of space. It's like if you wanted to have a fluid with a bunch of molecules in it: if those molecules weren't interacting, you wouldn't have a fluid that could pour and do all these kinds of things, you'd just have a free-floating collection of molecules. Similarly with space: space is knitted together as a consequence of all this activity in space. And what we consist of is this series of, you know, we're continually being rewritten. The question is why it's the case that we think of ourselves as being the same "us" through time. That's a key assumption, and I think it's a key aspect of what we see as our consciousness, so to speak: that we have this consistent thread of experience.
isn't that just another
limitation
of our mind that we want to reduce
reality into some that kind of temporal
yeah consistency is just a nice
narrative right tell ourselves well the
fact is I think it's critical to the way
we humans typically operate is that we
have a single thread of experience you
know if you if you imagine sort of a
mind where you have you know maybe
that's what's happening in various kinds
of Minds that aren't working the same
way other minds work is that you're
splitting into multiple threads of
experience it's also it's also something
where you know when you look at I don't
know Quantum Mechanics for example in
the insides of quantum mechanics it's
splitting into many threads of
experience but in order for us humans to
interact with it you kind of have to
have to knit all those different threads
together so that we say oh yeah a
definite thing happened and now the next
definite thing happens and so on and I
think you know sort of inside uh it's
it's sort of interesting to try and
imagine what's it like to have kind of
these uh fundamentally multiple threads
of experience going on I mean right now
different human Minds have different
threads of experience we just have a
bunch of Minds that are interacting with
each other but we don't have a you know
within each mind there's a single thread
and that's a that is indeed a
simplification I think it's a it's a
thing you know the general computational
system does not have that simplification
and um it's one of the things you know I
I people often seem to think that you
know Consciousness is the highest level
of kind of things that can happen in the
universe so to speak but I think that's
not true I think it's actually a a
specialization in which among other
things you have this idea of a single
threat of experience which is not a
general feature of anything that could
kind of computationally happen in the
universe so it's a feature of a
computationally limited system that's
only able to
observe
reducible Pockets so yeah so I mean this
So this word "observer": it means something in quantum mechanics, it means something in a lot of places, and it means something to us humans as conscious beings. So what is the observer, and what's the importance of the observer in the computational universe?
This question of what an observer is, what the general idea of an observer is, is actually one of my next projects, which got somewhat derailed by the current sort of AI mania.
Is there a connection there? Do you think the observer is primarily a physics phenomenon, or is it related to the whole AI thing?
Yes, it is related. So one question is, what is a general observer? We have an idea of what a general computational system is: we think about Turing machines, we think about other models of computation. There's a question of what a general model of an observer is. And there are observers like us, which are the observers we're interested in. We could imagine an alien observer that deals with computational irreducibility and has a mind that's utterly different from ours and completely incoherent with what we're like. But if we're talking about observers like us, one of the key things is this idea of taking all the detail of the world and being able to stuff it into a mind, being able to take all the detail and extract out of it a smaller set of degrees of freedom, a smaller number of elements that will fit in our minds. So I've been interested in trying to characterize what the general observer is. Let me give an example: you have a gas, it's got a bunch of molecules bouncing around, and the thing you're measuring about the gas is its pressure. The only thing you as an observer care about is pressure. That means you have a piston on the side of this box, and the piston is being pushed by the gas, and there are many, many different ways that molecules can hit that piston, but all that matters is the aggregate of all those molecular impacts, because that's what determines pressure. So there's a huge number of different configurations of the gas which are all equivalent. I think one key aspect of observers is this equivalencing of many different configurations of a system, saying all I care about is this aggregate feature, this overall thing. That's one aspect, and we see it in lots of different places; it's the same story over and over again: there's a lot of detail in the world, but what we are extracting from it is a thin summary of that detail.
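A tiny Wolfram Language illustration of that equivalencing: two completely different random microstates give essentially the same aggregate observable (the mean squared speed here is just a stand-in for a pressure-like quantity):

  (* Two different microscopic configurations of 10^5 molecular velocities... *)
  micro1 = RandomVariate[NormalDistribution[0, 1], 10^5];
  micro2 = RandomVariate[NormalDistribution[0, 1], 10^5];

  (* ...are equivalent to an observer who only measures the aggregate. *)
  {Mean[micro1^2], Mean[micro2^2]}   (* both very close to 1 *)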
Is that thin summary nevertheless true? Or can it be a crappy approximation that on average is correct? I mean, if we look at the observer that's the human mind, it seems like there's a lot of really crappy approximation, as represented by natural language for example, and maybe that's a feature of it.
Sure, with this ambiguity, right. It could be the case that you're just measuring the aggregate impacts of these molecules, but there is some tiny, tiny probability that the molecules will arrange themselves in some really funky way, and then just measuring that average isn't going to be the main point. By the way, an awful lot of science is very confused about this, because you look at papers and people are really keen: they draw this curve, they have these bars on the curve, and it's just this curve, this one thing, and it's supposed to represent some system that has all kinds of details in it. This is a way that lots of science has gone wrong. I remember years ago I was studying snowflake growth. You have the snowflake, and it's growing, it has all these arms, it's doing complicated things. There was a literature on this, and it talked about the rate of snowflake growth, and it got pretty good answers for the rate of growth of the snowflake; they had these nice curves of snowflake growth rates and so on. I looked at it more carefully and realized that according to their models, the snowflake would be spherical. So they got the growth rate right, but the detail was just utterly wrong. And not only the detail: the whole thing was capturing an aspect of the system that was, in a sense, missing the main point of what was going on.
And what is the geometric shape of a snowflake?
Snowflakes start in the phase of water that's relevant to the formation of snowflakes, a phase of ice which starts with a hexagonal arrangement of water molecules, and so it starts off growing as a hexagonal plate. And then what happens is the plate...
Oh, versus a sphere.
Well, no, it's much more than that. I mean, snowflakes are fluffy.
Typical snowflakes have these little dendritic arms. And what actually happens is kind of cool, because you can make very simple discrete models with cellular automata and things that figure this out. You start off with this hexagonal thing, and then in places it starts to grow little arms. Every time a little piece of ice adds itself to the snowflake, the fact that the ice condensed from water vapor heats the snowflake up locally, and so it makes it less likely for another piece of ice to accumulate right nearby. This leads to a kind of growth inhibition: you grow an arm, and it's a separated arm, because right around the arm it got a little bit hot and didn't add more ice. So what happens is, you have a hexagon, it grows out arms, the arms grow arms, and then those arms grow arms, and eventually, and it's kind of cool, it actually fills in another, bigger hexagon. When I first looked at this, we had a very simple model for it, and I realized that when it fills in that hexagon, it actually leaves some holes behind. So I thought, well, is that really right? I looked at pictures of snowflakes, and sure enough, they have these little holes in them that are kind of scars of the way those arms grow out.
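A rough Wolfram Language sketch of the growth-inhibition idea. This toy version uses a square grid for brevity (the models he's describing live on a hexagonal lattice): a cell freezes only if exactly one of its neighbors is already frozen, so fresh ice suppresses growth right next to it, and the pattern grows arms and leaves holes rather than filling in a solid disk.

  (* Count frozen von Neumann neighbors, treating everything beyond the grid as unfrozen. *)
  neighborCount[grid_] := Module[{p = ArrayPad[grid, 1]},
    p[[1 ;; -3, 2 ;; -2]] + p[[3 ;; , 2 ;; -2]] +
    p[[2 ;; -2, 1 ;; -3]] + p[[2 ;; -2, 3 ;;]]]

  (* A cell freezes iff exactly one neighbor is frozen (the inhibition rule);  *)
  (* already-frozen cells stay frozen.                                          *)
  freezeStep[grid_] :=
    Clip[grid + Map[If[# == 1, 1, 0] &, neighborCount[grid], {2}], {0, 1}]

  init = ReplacePart[ConstantArray[0, {81, 81}], {41, 41} -> 1];
  ArrayPlot[Nest[freezeStep, init, 35]]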
So you can't backfill the holes.
Yeah, they don't backfill.
And presumably there's a limit to how big it can get; it can't grow arbitrarily.
I'm not sure. I mean, the thing falls and hits the ground at some point. I think you can grow pretty big ones in the lab, many iterations of this: it goes from a hexagon, it grows out arms, it turns back, it fills back into a hexagon, it grows more arms again.
In 3D?
No, it's flat, usually.
Why is it flat? Why doesn't it span out? Okay, wait a minute, you said it's fluffy, and fluffy is a three-dimensional property. We're really in it now. Is it that multiple snowflakes together become fluffy, and a single snowflake is not fluffy?
No, no, a single snowflake is fluffy. What happens is, if you have snow that's just pure hexagons, they fit together pretty well, the snow doesn't have a lot of air in it, and they can also slide against each other pretty easily. I think avalanches sometimes happen when the things tend to be these hexagonal plates and it all slides. But when the thing has all these arms that have grown out, they don't fit together very well, and that's why the snow has lots of air in it. If you catch one of these snowflakes, you'll see it has these little arms. People often say no two snowflakes are alike. That's mostly because, as snowflakes grow, they do grow pretty consistently with these different arms and so on, but you capture them at different times: they fell through the air in different ways, so you'll catch this one at this stage and that one at another stage, and at different stages they look really different. So it looks like no two snowflakes are alike because you caught them at different times.
So the rules under which they grow are the same; it's just the timing that's different.
Yes.
Okay, so the point is that science is not able to describe the full complexity of snowflake growth.
Well, if you do what people might often do and say, okay, let's make it scientific, let's turn it into one number, and that one number is the growth rate of the arms or some such thing, that fails to capture the detail of what's going on inside the system. And that's in a sense a big challenge for science: how do you extract from the natural world those aspects of it that you're interested in talking about? Now, you might just say, I don't really care about the fluffiness of the snowflakes, all I care about is the growth rate of the arms, in which case you can have a good model without knowing anything about the fluffiness. But as a practical matter, if you ask what the most obvious feature of a snowflake is, oh, it has this complicated shape, well, then you've got a different story about what you model. This is one of the features of modeling in science: what is a model? A model is some way of reducing the actuality of the world to something where you can readily give a narrative for what's happening, where you can make some kind of abstraction of what's happening and answer the questions that you care about answering. If you want to answer all possible questions about the system, you'd have to have the whole system, because you might care about this particular molecule, where did it go, and your model, which is some big abstraction of that, has nothing to say about it. So one of the things that's often confusing in science is somebody says, I've got a model, and somebody else says, I don't believe your model because it doesn't capture the feature of the system that I care about. There's always this controversy about whether it is a correct model. Well, no model, except for the actual system itself, is correct in the sense that it captures everything. The question is: does it capture what you care about capturing? Sometimes that's ultimately defined by what you're going to build technology out of, things like this. The one counterexample is if you think you're modeling the whole universe all the way down; then there is a notion of a correct model. Even that is more complicated, because it depends on how observers sample things and so on, but that's a separate story. At least at the first level, this thing about "oh, it's an approximation, you're capturing one aspect, you're not capturing other aspects": when you really think you have a complete model for the whole universe, you'd better ultimately be capturing everything, even though actually running that model is impossible because of computational irreducibility. The only thing that successfully runs that model is the actual running of the universe, the universe itself.
care about
is an interesting concept so that's a
that's a human concept so that's what
you're doing with uh wolf from Alpha and
Wolfram language is you trying to come
up with symbolic representations yes as
simple as possible
uh so a model that's as simple as
possible that fully captures stuff we
care about yes so I mean for example you
know we could we'll have a thing about
you know data about movies let's say we
could be describing every individual
pixel in every movie and so on but
that's not the level that people care
about and it's yes this is a I mean and
and that level that people care about is
somewhat related to what's described in
natural language but what what we're
trying to do is to find a way to sort of
represent precisely so you can compute
things see see one thing when you say
you give a piece of natural language
question is you feed it to a computer
you say does the computer understand
this natural language
well you know the computer process it in
some way it does this maybe it can make
a continuation of the natural language
you know maybe it can go on from The
Prompt and say what it's going to say
you say does it really understand it
hard to know but for in this kind of
computational world there is a very
definite definition of does it
understand which is could it be turned
into this symbolic computational thing
from which you can compute all kinds of
consequences and that's the that's the
sense in which one has sort of a target
for the understanding of natural
language and that's kind of our goal is
to have as much as possible about the
world that can be computed in a in a
reasonable way so to speak be able to be
sort of captured by this kind of
computational language that's that's
kind of the goal and and I think for us
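As a concrete illustration of that definition of "understanding," Wolfram Language has a built-in route from natural language to a symbolic expression; the exact output shown in the comment is indicative rather than guaranteed:

  (* SemanticInterpretation uses Wolfram|Alpha-style linguistics to turn a phrase *)
  (* into a computable symbolic expression.                                        *)
  SemanticInterpretation["distance from Chicago to London"]
  (* likely something like GeoDistance[Entity["City", ...], Entity["City", ...]] *)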
And I think for us humans, the main thing that's important is that as we formalize what we're talking about, it gives us a way of building a structure where we can build this tower of consequences of things. If we just say, well, let's talk about it in natural language, it doesn't really give us a hard foundation that lets us build step by step to work something out. It's kind of like what happens in math: if we were just vaguely talking about math but didn't have the full structure of math, we wouldn't be able to build this big tower of consequences. So in a sense, what we're trying to do with the whole computational language effort is to make a formalism for describing the world that makes it possible to build this tower of consequences.
Can you talk about this dance between natural language and Wolfram Language? There's this gigantic thing called the internet, where people post memes and diary-type thoughts and very important-sounding articles, and all of that makes up the training data set for GPT. And then there's Wolfram Language. How can you map from the natural language of the internet to Wolfram Language? Is there a manual way, is there an automated way of doing that, as we look into the future?
Well, what Wolfram|Alpha does, its front end, is turn natural language into computational language.
What do you mean by that? There's a prompt, you ask a question, what is the capital of...
Yeah. You ask, what's the distance between Chicago and London, or something, and that will turn into GeoDistance of Entity "City" Chicago, et cetera, et cetera. Each one of those things is very well defined. Given that it's the entity City Chicago, Illinois, United States, we know its geolocation, we know its population, we know all kinds of things about it, data which we have curated so that we know it with some degree of certainty, so to speak. And then we can compute things from this. That's the idea.
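Spelled out, the kind of Wolfram Language expression that query becomes (the exact entity qualifiers come from the knowledge base and may be canonicalized slightly differently):

  (* "what's the distance between Chicago and London" as a computable expression *)
  GeoDistance[
    Entity["City", {"Chicago", "Illinois", "UnitedStates"}],
    Entity["City", {"London", "GreaterLondon", "UnitedKingdom"}]]
  (* returns a Quantity, e.g. in miles or kilometers *)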
But then does something like GPT, large language models, allow you to make that conversion much more powerfully?
Okay, so that's an interesting thing, which we still don't know everything about. This question of going from natural language to computational language: Wolfram|Alpha has now been out and about for, what, 13 and a half years, and we've achieved, I don't know what it is, 98, 99 percent success on queries that get put into it. Now, obviously there's a sort of feedback loop, because the things that work are the things people go on putting into it. But we've gotten to a very high success rate on the little fragments of natural language that people put in: questions, math calculations, chemistry calculations, whatever it is. We do very well at turning those things into computational language. Now, from the very beginning of Wolfram|Alpha I thought about, for example, writing code with natural language. In fact, I was just looking at this recently: I had a post that I wrote in 2010 or 2011 called something like "Programming with Natural Language Is Actually Going to Work." We had done a bunch of experiments using methods that were, some of them, a little bit machine-learning-like, but certainly not the same idea of vast training data and so on that is the story of large language models. Actually, a piece of utter trivia: Steve Jobs forwarded that post around to all kinds of people at Apple. That was because he never really liked programming languages, so he was very happy to see the idea that you could get rid of this layer of engineering-like structure. He would have liked what's happening now, because it really is the case that this idea that you have to learn how the computer works in order to use a programming language is, I think, a thing with a limited time horizon, just like you once had to learn the details of the opcodes to know how assembly language worked.
So this idea of how elaborate you can make the prompt, how elaborate you can make the natural language and abstract computational language from it, is a very interesting question, and what ChatGPT, GPT-4 and so on, can do is pretty good. It's a very interesting process; I'm still trying to understand this workflow, and we've been working out a lot of tooling around this workflow from natural language to computational language.
Right, and the process, especially if it's conversational, like a dialogue, it's multiple queries, that kind of thing.
Right. There are so many things there that are really interesting. So the first thing is, can you just walk up to the computer and expect to specify a computation? What one realizes is that humans have to have some idea of this way of thinking about things computationally; without that, you're kind of out of luck, because you just have no idea what to say to a computer. I should tell a silly story about myself. The very first computer I saw, when I was 10 years old, was a big mainframe computer, and I didn't really understand what computers did. Somebody's showing me this computer, and I ask, can the computer work out the weight of a dinosaur? That isn't a sensible thing to ask; that's not what computers do. I mean, in Wolfram|Alpha, for example, you could say "what's the typical weight of a Stegosaurus" and we'll give you some answer, but that's a very different kind of thing from what one usually thinks of computers as doing.
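That kind of knowledge query is available programmatically too; a minimal sketch (the WolframAlpha function calls the Wolfram|Alpha API from Wolfram Language, and the exact form of what comes back depends on the query):

  (* Ask Wolfram|Alpha from within Wolfram Language; "Result" requests just the answer. *)
  WolframAlpha["typical weight of a Stegosaurus", "Result"]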
And so the first thing is that people have to have an idea of what computation is about. For education, I think that is the key thing: not computer science, not the details of programming, but this idea of how you think about the world computationally. Thinking about the world computationally is a formal way of thinking about the world. We've had other ones: logic was a formal way of abstracting and formalizing some aspects of the world, and mathematics is another one. Computation is this very broad way of formalizing the way we think about the world, and the thing that's cool about computation is that if we can successfully formalize things in terms of computation, computers can help us figure out what the consequences are. It's not like formalizing things with math, where that's nice, but if you're not using a computer to do the math, you have to go work out a bunch of stuff yourself. So, we're talking about natural language and its relationship to computational language. The typical workflow, I think, is: first, the human has to have some kind of idea of what they're trying to do, if it's something they want to build a tower of capabilities on, something they want to formalize and make computational. Then the human can type something into some LLM system and say vaguely what they want in computational terms, and it does pretty well at synthesizing Wolfram Language code. It'll probably do better in the future, because we've got a huge number of examples of natural language input together with the Wolfram Language translation of that, and extrapolating from all those examples makes it easier to do that.
So the person doing the prompting could also kind of debug the Wolfram Language code? Or is your hope to not do that debugging?
No, no. I mean, there are many steps here. So the first thing is, you type natural language and it generates Wolfram Language.
Give examples, by the way. There's the dinosaur example; do you have an example that jumps to mind that we should be thinking about?
Some dumb example: it's like, take my heart rate data and, you know, make a moving average every seven days or something, and make a plot of the result. Okay, so that's a thing which is about two-thirds of a line of Wolfram Language code. It's ListPlot of MovingAverage of some databin or something, of the data, and then you'll get the result.
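Roughly the fragment he's describing, assuming the measurements are already in a list called heartRateData (the variable name is just a placeholder; the mention of a "databin" suggests the data might instead come from a Wolfram Data Drop databin):

  (* Smooth daily heart-rate values with a 7-day moving average and plot them. *)
  ListPlot[MovingAverage[heartRateData, 7]]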
And the vague thing I was just saying in natural language would almost certainly correctly turn into that very simple piece of Wolfram Language code.
So you start mumbling about heart rate...
Yeah, and you arrive at the moving average kind of idea. You say "average over seven days," and maybe it'll figure out that that can be encapsulated as this moving-average idea, I'm not sure. But the typical workflow I'm seeing is: you generate this piece of Wolfram Language code, and it's pretty small usually, and if it isn't small, it probably isn't right. One of the ideas of Wolfram Language is that it's a language humans can read. Programming languages tend to be this one-way story: humans write them and computers execute from them. Wolfram Language is intended to be something which is sort of like math notation, something where humans write it and humans are also supposed to read it. So the workflow that's emerging is: the human mumbles some things, the large language model produces a fragment of Wolfram Language code, then you look at that and say, yeah, that looks right. Typically you just run it first and see whether it produces the right thing. You look at what it produces, you might say, that's obviously crazy, you look at the code, you see why it's crazy, you fix it. If you really care about the result, if you really want to make sure it's right, you'd better look at that code and understand it, because that's the way you have this checkpoint of "did it really do what I expected it to do." Now, you can go beyond that. What we find is, for example, let's say the code does the wrong thing. Then you can often say to the large language model, can you adjust this to do that, and it's pretty good at doing it.
Interesting, so you're using the output of the code to give you hints about the function of the code. You're debugging based on the output of the code itself.
Right. The plugin that we have for ChatGPT does that routinely. It will send the thing in, it will get a result, the LLM will discover itself that the result is not plausible, and it will go back and say, "oh, I'm sorry," it's very polite, and it says, "I'll rewrite that piece of code," and then it will try again and get the result. The other thing that's pretty interesting is when you're just running code. One of the new concepts we have: we invented this whole idea of notebooks back 36 years ago now, and so now there's the question of how you combine this idea of notebooks, where you have text and code and output, with the notion of chat and so on. There are some really interesting things there. For example, a very typical thing now is we have these notebooks where, as soon as the thing produces errors, if you run this code and it produces messages and so on, the LLM automatically not only looks at those messages, it can also see all kinds of internal information, stack traces and things like this, and then it does a remarkably good job of guessing what's wrong and telling you.
So in other words, it's kind of a typical AI-ish thing: it's able to have more sensory data than we humans are able to have, because it can look at a bunch of stuff that we humans would glaze over looking at, and it's then able to come up with, oh, this is the explanation of what's happening.
And what is that data? The stack trace, the code you've written previously, the natural language you've written?
Yeah. Also, for example, when there are these messages, there's documentation about those messages, there are examples of where the messages have occurred, all these kinds of things. The other thing that's really amusing is when it makes a mistake: one of the things that's in our prompt, for when the code doesn't work, is "read the documentation," and we have another piece of the plugin that lets it read documentation. That again is very, very useful, because sometimes it'll make up the name of some option for some function, an option that doesn't really exist, or it'll have some wrong structure for the function, and so on, and reading the documentation helps with that. That's a powerful thing. The thing I've realized is that we built this language over the course of all these years to be nice and coherent and consistent, so that it's easy for humans to understand. It turns out there was a side effect that I didn't anticipate, which is that it makes it easy for AIs to understand.
So it's almost like another natural language.
Yeah.
So a formal language is a kind of foreign language? You'd have a lineup: English, French, Japanese, Wolfram Language, and then, I don't know, Spanish, and the system is not going to notice.
Well, yes, maybe. That's an interesting question, because it really depends on what I see as an important piece of fundamental science that basically just jumped out at us with ChatGPT. I think the real question is why ChatGPT works: how is it possible to successfully reproduce all these kinds of things in natural language with a comparatively small, he says, couple hundred billion weights of neural net, and so on. I think that relates to a fundamental fact about language, and the main thing is that I think there's a structure to language that we haven't really explored very well, the semantic grammar of language that I'm talking about. I mean, we kind of know that when we...