Stephen Wolfram: ChatGPT and the Nature of Truth, Reality & Computation | Lex Fridman Podcast #376
PdE-waSx-d8 • 2023-05-09
Transcript
You know, I can tell ChatGPT "create a piece of code" and then just run it on my computer — and that sort of personalizes for me the "what could possibly go wrong," so to speak.

Was that exciting or scary, that possibility?

It was a little bit scary, actually, because it's like: if you do that, what is the sandboxing that you should have? And that's a version of that question for the world: as soon as you put the AIs in charge of things, how many constraints should there be on these systems before you put the AIs in charge of all the weapons and all these different kinds of systems?

Well, here's the fun part about sandboxes: the AI knows about them — it has the tools to crack them.

The following is a conversation with Stephen Wolfram, his fourth time on this podcast. He's a computer scientist, mathematician, theoretical physicist, and the founder of Wolfram Research, the company behind Mathematica, Wolfram Alpha, Wolfram Language, and the Wolfram Physics and Metamathematics projects. He has been a pioneer in exploring the computational nature of reality, and so he's the perfect person to explore together the new, quickly evolving landscape of large language models, as human civilization journeys toward building superintelligent AGI. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Stephen Wolfram.

You've announced the integration of ChatGPT with Wolfram Alpha and Wolfram Language, so let's talk about that integration. What are the key differences — at the high, philosophical level, and maybe the technical level — between the capabilities of, broadly speaking, the two kinds of systems: large language models, and this gigantic computational infrastructure that is Wolfram Alpha?

So what does something like ChatGPT do? It's mostly focused on "make language like the language that humans have made and put on the web." Its primary underlying technical thing is: you've given it a prompt, and it's trying to continue that prompt in a way that's somehow typical of what it's seen, based on a trillion words of text that humans have written on the web. The way it's doing that is probably quite similar to the way we humans do the first stages of that — using a neural net — and just saying: given this piece of text, let's ripple through the neural net and get one word of output at a time. It's a shallow computation on a large amount of training data, namely what we humans have put on the web. That's a different thing from the computational stack that I've spent the last 40 years or so building, which has to do with what you can compute in many steps — potentially a very deep computation. It's not taking the statistics of what we humans have produced and trying to continue things based on those statistics; instead it's trying to take the formal structure that we've created in our civilization — whether from mathematics or from systematic knowledge of all kinds — and use that to do arbitrarily deep computations: to figure out things that aren't just "let's match what's already been said on the web," but to potentially compute something new and different that's never been computed before.
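(For illustration — the following sketch is not from the conversation: the "one word at a time" continuation loop can be written schematically in Wolfram Language, with nextWordProbabilities as a hypothetical stand-in for whatever the trained neural net actually computes.)

    (* Autoregressive generation sketch: repeatedly ask a model for next-word       *)
    (* probabilities and append the most likely word to the growing prompt.         *)
    (* nextWordProbabilities is a hypothetical placeholder that would return an     *)
    (* Association of word -> probability; in ChatGPT a neural net plays that role. *)
    continuePrompt[prompt_List, steps_Integer] :=
      Nest[Append[#, First@Keys@TakeLargest[nextWordProbabilities[#], 1]] &, prompt, steps]

    (* Hypothetical usage: continuePrompt[{"the", "cat", "sat"}, 10] *)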
So as a practical matter, our goal is to have made as much as possible of the world computable, in the sense that if there's a question that in principle is answerable from some expert knowledge that's been accumulated, we can compute the answer to that question — and do it in a reliable way that's the best one can do given the expertise our civilization has accumulated. It's a much more labor-intensive thing on the side of creating the computational system to do that. In the ChatGPT world, by contrast, you take things that were produced for quite other purposes — namely everything we've written out on the web — and forage from that things that are like what's been written on the web. So from a practical point of view, I view the ChatGPT thing as being wide and shallow, and what we're trying to do in building out computation as being deep — also broad, but most importantly deep.

Another way to think about this: if you go back in human history, say a thousand years, and ask what the typical person is going to figure out — well, there are certain kinds of things that we humans can quickly figure out; that's what our neural architecture and the kinds of things we learn in our lives let us do. But then there's this whole layer of formalization that got developed — the whole story of intellectual history, the whole depth of learning. That formalization turned into things like logic, mathematics, science, and so on, and that's what allows one to build these towers of things you work out. It's not just "I can immediately figure this out"; it's "I can use this formalism to go step by step and work out something which was not immediately obvious to me." That's the story of what we're trying to do computationally: to build those tall towers of what implies what implies what — as opposed to "yes, I can immediately figure it out; it's just like something I saw somewhere else, something I heard or remembered."

What can you say about the kind of formal structure — the kind of foundation you can build such a formal structure on — the kinds of things you would start with in order to build these deep, computable knowledge trees?

So the question is how you think about computation, and there are a couple of points here. One is what computation intrinsically is like; the other is what aspects of computation we humans — with our minds and the kinds of things we've learned — can relate to in that computational universe. Start with what computation can be like; it's something I've spent a big chunk of my life studying. Usually we write programs where we know what we want the program to do: we carefully write many lines of code and hope the program does what we intended. But the thing I've been interested in is the natural science of programs. You say: I'm going to make this program, and it's a really tiny program — maybe I even pick the pieces of the program at random — and by really tiny I mean less than a line of code. You ask what this program does, and you run it. The big discovery I made in the early '80s is that even extremely simple programs, when you run them, can do really complicated things.
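(A concrete one-line instance of such a program — rule 30, Wolfram's own canonical example, though it isn't named at this point in the conversation — in Wolfram Language:)

    (* An elementary cellular automaton: a rule far shorter than a line of code, run *)
    (* for 200 steps from a single black cell; the resulting pattern is famously complex. *)
    ArrayPlot[CellularAutomaton[30, {{1}, 0}, 200]]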
That really surprised me; it took me several years to realize that it was a thing, so to speak. But that realization — that even very simple programs can do incredibly complicated things that we very much don't expect — I realized is very much how nature works: nature has simple rules, yet does all sorts of complicated things that we might not expect. A big theme of the last few years has been understanding that that's how the whole universe and physics works — but that's a quite separate topic. So there's this whole world of programs and what they do, and very rich, sophisticated things these programs can do. But when we look at many of these programs, we say: I don't really know what that's doing; it's not a very human kind of thing. So on the one hand we have what's possible in the computational universe; on the other hand we have the kinds of things we humans think about, the kinds of things developed in our intellectual history. The real challenge in making things computational is to connect what's computationally possible out in the computational universe with the things that we humans typically think about with our minds. That's a complicated, moving target, because the things we think about change over time — we've learned more stuff, we've invented mathematics, we've invented various ideas and structures, and so on. It's gradually expanding; we're gradually colonizing more and more of this intellectual space of possibilities. But the real challenge is: how do you encapsulate the kinds of things we think about in a way that plugs into what's computationally possible? The big idea there is symbolic programming — symbolic representations of things.

The question is, when you look at everything in the world — say you take some visual scene you're looking at — how do you turn that into something you can stuff into your mind? There are lots of pixels in the visual scene, but the things you remember from it are things like "there's a chair in this place." It's a symbolic representation of the visual scene — "there are two chairs and a table" — rather than "there are all these pixels arranged in all these detailed ways." So the question is how you take all the things in the world and make some kind of representation that corresponds to the ways we think about things. Human language is one form of representation we have — we talk about chairs; that's a word in human language. But human language is not, in and of itself, something that plugs in very well to computation; it's not something from which you can immediately compute consequences. So you have to find a way to take the stuff we understand from human language and make it more precise.
And that's really the story of symbolic programming. What it turns into is something I didn't know at the time would work as well as it has. Back in 1979 or so, I was trying to build my first big computer system and to figure out how I should represent computations at a high level, and I invented this idea of using symbolic expressions — structured rather like a function and a bunch of arguments, except that the function doesn't necessarily evaluate to anything; it's just a thing that sits there representing a structure. It has turned out that that structure is a good match for the way we humans seem to conceptualize higher-level things, and over the last 45 years or so it has served me remarkably well — building up that structure using this kind of symbolic representation.
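(A small illustration of that idea, using hypothetical undefined symbols f, g and x: nothing evaluates, so the expression just sits there as a structure that can be inspected and transformed.)

    expr = f[g[1, 2], x];   (* f, g, x have no definitions; this is pure structure       *)
    Head[expr]              (* -> f : the "function" at the head of the expression        *)
    expr /. x -> 10         (* -> f[g[1, 2], 10] : structural rewriting, nothing evaluated *)
    TreeForm[expr]          (* displays the expression as a tree of head and arguments    *)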
But what can you say about abstractions here? Because you could just start with your physics project — start with a hypergraph at a very, very low level and build up everything from there — but you don't; you take shortcuts, right? You take the highest level of abstraction, convert it — the kind of abstraction that's convertible to something computable — using symbolic representation, and then that's your new foundation for that little piece of knowledge. Somehow all of that is integrated.

Right. A very important phenomenon here — one of those things that, in the future of just about everything, is going to become more and more important — is computational irreducibility. The question is: if you know the rules for something — you have a program, you're going to run it — you might say, "I know the rules; great, I know everything about what's going to happen." In principle you do, because you can just run those rules out and see what they do: run them a million steps and see what happens. But the question is, can you immediately jump ahead and say "I know what's going to happen after a million steps, and the answer is 13," or whatever? One of the critical things to realize is that if you could reduce that computation, there would in a sense be no point in doing the computation. The place where you really get value out of doing a computation is when you had to do the computation to find out the answer. And this phenomenon — that you have to do the computation to find out the answer, computational irreducibility — seems to be tremendously important for thinking about lots of kinds of things.

So one of the things that happens is: okay, you've got a model of the universe at the low level, in terms of atoms of space and hypergraphs and the rewriting of hypergraphs, and it's happening, say, 10^100 times every second. You say: great, we've nailed it, we know how the universe works. Well, the problem is that while the universe can figure out what it's going to do — it does those 10^100 steps — for us to work out what it's going to do, we have no way to reduce that computation. The only way to see the result of the computation is to do it, and if we're operating within the universe, there's no opportunity to do that, because the universe is doing it as fast as the universe can do it. So what we're trying to do — and a lot of the story of science, and of other kinds of things — is find pockets of reducibility. You could imagine a situation where everything in the world is full of computational irreducibility: we never know what's going to happen next, and the only way to figure it out is to let the system run and see. So in a sense the story of most kinds of science, of invention, of a lot of kinds of things, is the story of finding the places where we can locally jump ahead. And one of the features of computational irreducibility is that there are always pockets of reducibility — always an infinite number of places where you can jump ahead. There's no way to jump completely ahead, but there are little patches, little places, where you can jump ahead a bit.
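(A tiny concrete instance of such jumping ahead — an added illustration, not from the conversation: for an additive rule like rule 90, the pattern after t steps can be written down directly from binomial coefficients mod 2, without running the intervening steps, whereas for a rule like rule 30 no such shortcut is known.)

    (* Pocket of reducibility: row 8 of rule 90 grown from a single cell matches       *)
    (* Pascal's triangle mod 2, so it can be "jumped to" without simulating steps 1-7. *)
    evolution = CellularAutomaton[90, {{1}, 0}, 8];
    evolution[[9, 1 ;; 17 ;; 2]] === Mod[Table[Binomial[8, k], {k, 0, 8}], 2]   (* True *)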
We can talk about the physics project separately, but I think the thing we realize is that we exist in a slice of all the possible computational irreducibility in the universe — a slice where there's a reasonable amount of predictability. In a sense, as we construct these higher levels of abstraction, symbolic representations and so on, what we're doing is finding lumps of reducibility that we can attach ourselves to, and about which we can have fairly simple narrative things to say. Because in principle, if I ask what's going to happen in the next few seconds — well, there are these molecules moving around in the air in this room, and gosh, it's an incredibly complicated story — that's a whole computationally irreducible thing, most of which I don't care about. Mostly, the air is still going to be here and nothing much will be different about it, and that's a reducible fact about what is ultimately an underlying computationally irreducible process.

And life would not be possible if we didn't have a large number of such reducible pockets — pockets amenable to reduction into something symbolic?

Yes, I think so — life in the way that we experience it, depending on what we mean by life: the experience we have of consistent things happening in the world. The idea of space, for example, where we can just say "you're here, you move there, and it's still you in that different place," even though you're made of different atoms of space. This idea that there's a level of predictability in what's going on — that's us finding a slice of reducibility in what is, underneath, a computationally irreducible system. And that's actually my favorite discovery of the last few years: the realization that it's the interaction between this underlying computational irreducibility and our nature as observers — who have to key into computational reducibility — that leads to the main laws of physics we've discovered over this past century. We've talked about this in more detail before, but to me it comes down to our nature as observers. The fact that we are computationally bounded observers — we don't get to follow all those little pieces of computational irreducibility — means that stuffing what's out there in the world into our minds requires that we look at things that are reducible. We are compressing; we're extracting some essence, some kind of symbolic essence, of the detail of what's going on in the world. That, together with one other condition that at first seems trivial but isn't: that we believe we are persistent in time.

So some sense of causality.

Here's the thing: at every moment, according to our theory, we're made of different atoms of space; at every moment the microscopic detail of what the universe is made of is being rewritten. In fact, the very fact that there's coherence between different parts of space is a consequence of all these little processes going on that knit together the structure of space. It's like a fluid with a bunch of molecules in it: if those molecules weren't interacting, you wouldn't have a fluid that pours and does all those kinds of things; it would just be a free-floating collection of molecules. Similarly with space: the fact that space is knitted together is a consequence of all this activity in space. And what we consist of is this series of — we're continually being rewritten — so the question is why we think of ourselves as being the same "us" through time. That's a key assumption, and I think it's a key aspect of what we see as our consciousness: that we have this consistent thread of experience.

Well, isn't that just another limitation of our mind — that we want to reduce reality into some kind of temporal consistency? It's just a nice narrative we tell ourselves.

Well, the fact is, I think it's critical to the way we humans typically operate that we have a single thread of experience. You can imagine a mind — maybe that's what's happening in various kinds of minds that don't work the way other minds do — where you're splitting into multiple threads of experience. It's also something where, when you look at quantum mechanics, for example, the insides of quantum mechanics are splitting into many threads of experience, but in order for us humans to interact with it, you have to knit all those different threads together, so that we can say "a definite thing happened, and now the next definite thing happens," and so on. It's interesting to try to imagine what it's like to have fundamentally multiple threads of experience going on. Right now, different human minds have different threads of experience — we have a bunch of minds interacting with each other — but within each mind there's a single thread, and that is indeed a simplification. The general computational system does not have that simplification. People often seem to think that consciousness is the highest level of thing that can happen in the universe, so to speak, but I think that's not true. I think it's actually a specialization, in which, among other things, you have this idea of a single thread of experience — which is not a general feature of everything that could computationally happen in the universe.
So it's a feature of a computationally limited system, one that's only able to observe reducible pockets. So this word "observer" — it means something in quantum mechanics, it means something in a lot of places, and it means something to us humans as conscious beings. What is the observer, and what's the importance of the observer in the computational universe?

This question of what an observer is — the general idea of an observer — is actually one of my next projects, which got somewhat derailed by the current AI mania.

Is there a connection there? Or do you think the observer is primarily a physics phenomenon — is it related to the whole AI thing?

Yes, it is related. One question is: what is a general observer? We have an idea of what a general computational system is — we think about Turing machines and other models of computation. The question is what a general model of an observer is. There are observers like us — which are the observers we're interested in; we could imagine an alien observer that deals with computational irreducibility and has a mind utterly different from ours, completely incoherent with what we're like. But if we're talking about observers like us, one of the key things is this idea of taking all the detail of the world and being able to stuff it into a mind — taking all the detail and extracting from it a smaller set of degrees of freedom, a smaller number of elements, that will fit in our minds. So I've been interested in trying to characterize what the general observer is. Let me give an example. You have a gas — a bunch of molecules bouncing around — and the thing you're measuring about the gas is its pressure; the only thing you as an observer care about is pressure. That means you have a piston on the side of the box, and the piston is being pushed by the gas. There are many, many different ways molecules can hit that piston, but all that matters is the aggregate of all those molecular impacts, because that's what determines the pressure. So there's a huge number of different configurations of the gas which are all equivalent. I think one key aspect of observers is this equivalencing of many different configurations of a system — saying "all I care about is this aggregate feature, this overall thing." And we see that over and over again: there's a lot of detail in the world, but what we're extracting from it is a thin summary of that detail.

Is that thin summary nevertheless true? Can it be a crappy approximation that on average is correct? If we look at the observer that is the human mind — as represented by natural language, for example — there seems to be a lot of really crappy approximation. And maybe that's a feature of it, this ambiguity.

Right — you don't know. It could be the case that you're just measuring the aggregate impacts of these molecules, but there's some tiny, tiny probability that the molecules will arrange themselves in some really funky way, and then just measuring that average isn't going to give you the main point.
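(A toy illustration of that equivalencing, added here for concreteness: three completely different microscopic configurations give essentially the same aggregate number, which is all a pressure-like observer ever sees.)

    (* Three different random micro-configurations of a million "molecular impacts":  *)
    (* the details differ everywhere, but the aggregate summary is nearly identical.  *)
    configurations = Table[RandomReal[{0, 1}, 10^6], {3}];
    Mean /@ configurations   (* three values all very close to 0.5 *)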
Yeah. And by the way, an awful lot of science is very confused about this. You look at papers, and people are really keen: they draw this curve, with these bars on it, and it's just this one curve, this one thing, and it's supposed to represent a system that has all kinds of detail in it. This is a way lots of science has gone wrong. I remember years ago I was studying snowflake growth. You have the snowflake, it's growing, it has all these arms, it's doing complicated things. There was a literature on this stuff, and it talked about the rate of snowflake growth, and it got pretty good answers for that rate — they had these nice curves of snowflake growth rates and so on. But I looked at it more carefully and realized that, according to their models, the snowflake would be spherical. So they got the growth rate right, but the detail was just utterly wrong — and not only the detail: the whole thing was capturing one aspect of the system while, in a sense, missing the main point of what was going on.

And what is the actual geometric shape of a snowflake?

Snowflakes start — the phase of water relevant to the formation of snowflakes is a phase of ice that starts with a hexagonal arrangement of water molecules — so a snowflake starts off growing as a hexagonal plate, and then what happens is the plate—

Oh — hexagon versus sphere.

Well, no, it's much more than that. Snowflakes are fluffy: typical snowflakes have these little dendritic arms. What actually happens is kind of cool, because you can make very simple discrete models — with cellular automata and things — that figure this out. You start off with this hexagonal thing, and then in places it starts to grow little arms. Every time a little piece of ice adds itself to the snowflake, the fact that that ice condensed from the water vapor heats the snowflake up locally, which makes it less likely for another piece of ice to accumulate right nearby. So this leads to a kind of growth inhibition: you grow an arm, and it's a separated arm, because right around the arm it got a little bit hot and didn't add more ice there. So you have a hexagon, it grows out arms, the arms grow arms, those arms grow arms — and eventually, which is kind of cool, it actually fills in another, bigger hexagon. When I first looked at this — we had a very simple model for it — I realized that when it fills in that hexagon, it leaves some holes behind. So I thought, is that really right? I looked at pictures of snowflakes, and sure enough, they have these little holes in them that are scars of the way those arms grew out.

So it can't backfill the holes.

Yeah — they don't backfill.

And presumably there's a limit to how big it can grow; it can't grow arbitrarily.

I'm not sure. The thing falls through the air — it hits the ground at some point. But I think you can grow pretty big ones in the lab, through many iterations of this: it goes from a hexagon, it grows out arms, it turns back and fills back in to a hexagon, it grows more arms again.
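(A minimal sketch of that kind of discrete model — the exact rule in Wolfram's own study isn't spelled out here; this uses the standard toy rule in which a cell freezes when exactly one of its six hexagonal neighbors is already frozen, which captures the growth-inhibition idea. The hexagonal lattice is represented in axial coordinates, so the plot appears sheared.)

    (* Six hexagonal neighbors in axial coordinates, as a convolution kernel.         *)
    hexNeighbors = {{0, 1, 1}, {1, 0, 1}, {1, 1, 0}};
    (* A cell becomes ice if exactly one neighbor is ice (growth inhibition);         *)
    (* cells that are already ice stay ice.                                           *)
    step[grid_] := Unitize[grid +
       Map[Boole[# == 1] &, ListConvolve[hexNeighbors, grid, {2, 2}, 0], {2}]];
    seed = Normal[SparseArray[{{31, 31} -> 1}, {61, 61}]];   (* one frozen seed cell  *)
    ArrayPlot[Nest[step, seed, 25]]   (* arms grow, fill back in, and grow out again  *)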
In 3D?

No — it's flat, usually.

Why is it flat? Why doesn't it span out? Okay, wait a minute: you said it's fluffy, and fluffy is a three-dimensional property, no?

Okay, now we're really in it. Multiple snowflakes together become fluffy; a single snowflake is not fluffy.

No single snowflake is fluffy.

Right. What happens is: if you have snow that's just pure hexagons, they fit together pretty well — it doesn't have a lot of air in it — and they can slide against each other pretty easily. I think avalanches sometimes happen when the flakes tend to be these hexagonal plates and the snow slides. But when the flakes have all these arms grown out, they don't fit together very well, and that's why the snow has lots of air in it. And if you catch one of these snowflakes, you'll see it has these little arms. People often say no two snowflakes are alike. That's mostly because, as snowflakes grow, they actually grow pretty consistently through these stages of different arms and so on — but you capture them at different times. They fell through the air in different ways: you catch this one at this stage, that one at another, and at different stages they look really different. So it looks like no two snowflakes are alike largely because you caught them at different times.

So the rules under which they grow are the same; it's just the timing that differs.

Yes.

Okay, so the point is: science is not able to describe the full complexity of snowflake growth?

Well, if you do what people might often do and say "okay, let's make it scientific, let's turn it into one number," and that one number is the growth rate of the arms or some such thing, that fails to capture the detail of what's going on inside the system. And that's, in a sense, a big challenge for science: how do you extract from the natural world the aspects of it that you're interested in talking about? Now, you might just say "I don't care about the fluffiness of the snowflakes; all I care about is the growth rate of the arms," in which case you can have a good model without knowing anything about the fluffiness. But as a practical matter, if you ask what the most obvious feature of a snowflake is — oh, that it has this complicated shape — then you've got a different story about what you model. This is one of the features of modeling in science. What is a model? A model is some way of reducing the actuality of the world to something for which you can readily give a narrative of what's happening — where you can make some abstraction of what's happening and answer the questions you care about answering. If you want to answer all possible questions about the system, you'd have to have the whole system, because you might care about this particular molecule — where did it go? — and your model, which is some big abstraction, has nothing to say about that. So one of the things that's often confusing in science is that people will say "I've got a model," and somebody else will say "I don't believe your model, because it doesn't capture the feature of the system that I care about."
There's always this controversy about whether it's the "correct" model. Well, no model — except the actual system itself — is a correct model in the sense of capturing everything. The question is: does it capture what you care about capturing? Sometimes that's ultimately defined by what you're going to build technology out of, things like that. The one counterexample is if you think you're modeling the whole universe all the way down — then there is a notion of a correct model. Even that is more complicated, because it depends on how observers sample things and so on, but that's a separate story. At the first level, though, the usual situation is "it's an approximation; you're capturing one aspect and not others" — whereas when you really think you have a complete model of the whole universe, you'd better ultimately be capturing everything, even though actually running that model is impossible because of computational irreducibility. The only thing that successfully runs that model is the actual running of the universe — the universe itself.

Okay, so "what you care about" is an interesting concept — it's a human concept. Is that what you're doing with Wolfram Alpha and Wolfram Language: trying to come up with symbolic representations that are as simple as possible — a model as simple as possible that fully captures the stuff we care about?

Yes. For example, we'll have data about movies, let's say. We could be describing every individual pixel in every movie, but that's not the level people care about. And the level people care about is somewhat related to what's described in natural language, but what we're trying to do is find a way to represent it precisely, so you can compute things. See, when you give a piece of natural language to a computer, the question is: does the computer understand that natural language? The computer processes it in some way; maybe it can produce a continuation of the natural language, go on from the prompt and say what comes next — but does it really understand it? Hard to know. In this computational world, though, there is a very definite definition of "does it understand": could it be turned into this symbolic, computational thing from which you can compute all kinds of consequences? That's the sense in which one has a target for the understanding of natural language, and that's our goal: to have as much as possible about the world that can reasonably be computed be capturable by this kind of computational language. And I think for us humans the important thing is that, as we formalize what we're talking about, it gives us a way of building a structure where we can build a tower of consequences. If we're just talking about something vaguely in natural language, it doesn't give us the hard foundation that lets us build, step by step, to work something out. It's like what happens in math: if we were just vaguely talking about math, but didn't have the full structure of math, we wouldn't be able to build this big tower of consequences. So in a sense, what we're trying to do with the whole computational language effort is to make a formalism for describing the world that makes it possible to build this tower of consequences.
Well, can you talk about this dance between natural language and Wolfram Language? There's this gigantic thing called the internet, where people post memes and diary-type thoughts and very important-sounding articles, and all of that makes up the training data set for GPT. And then there's Wolfram Language. How can you map from the natural language of the internet to Wolfram Language? Is there a manual way, is there an automated way of doing that, as we look into the future?

Well, what Wolfram Alpha does — its front end is turning natural language into computational language.

What do you mean by that? There's a prompt, you ask a question — "what is the capital of some country"...

Yes — say the question is "what's the distance between Chicago and London." That will turn into GeoDistance of the entity "City" Chicago, and so on. Each of those pieces is very well defined: given that it's the entity City Chicago, Illinois, United States, we know its geolocation, we know its population, we know all kinds of things about it — we have curated that data so as to know it with some degree of certainty, so to speak. And then we can compute things from that. That's the idea.
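(Illustratively, in Wolfram Language — an approximate reconstruction of the translation he describes; the exact entity specifications and symbolic form Wolfram Alpha generates may differ in detail:)

    (* Natural language: "what's the distance between Chicago and London"              *)
    (* Computational language: a fully specified, computable expression over entities. *)
    GeoDistance[
      Entity["City", {"Chicago", "Illinois", "UnitedStates"}],
      Entity["City", {"London", "GreaterLondon", "UnitedKingdom"}]]

Because each Entity is unambiguous and carries curated data (geolocation, population, and so on), further consequences can be computed directly from the result.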
But then do large language models like GPT allow you to make that conversion much more powerful?

That's an interesting thing which we still don't know everything about. On this question of going from natural language to computational language: Wolfram Alpha has been out and about for, what, thirteen and a half years now, and we've achieved something like 98 or 99 percent success on the queries that get put into it. Obviously there's a feedback loop, because the things that work are the things people go on putting in. But we've reached a very high success rate on the little fragments of natural language that people put in — questions, math calculations, chemistry calculations, whatever it is — at turning those into computational language. Now, from the very beginning of Wolfram Alpha I thought about, for example, writing code with natural language. In fact — I was just looking at this recently — I wrote a post in 2010 or 2011 called something like "Programming with natural language is actually going to work." We had done a bunch of experiments using methods that were a little bit machine-learning-like, some of them, but certainly not the same idea of vast training data that is the story of large language models. Actually — a piece of utter trivia — Steve Jobs forwarded that post around to all kinds of people at Apple. That was because he never really liked programming languages, so he was very happy to see the idea that you could get rid of that layer of engineering-like structure. He would have liked, I think, what's happening now. Because it really is the case that this idea that you have to learn how the computer works in order to use a programming language — just as you once had to learn the details of the opcodes to know how assembly language worked — is a thing with a limited time horizon. But the question of how elaborate you can make the prompt — how elaborate you can make the natural language and abstract from it computational language — is very interesting, and what ChatGPT, GPT-4 and so on can do there is pretty good. It's a very interesting process; I'm still trying to understand the workflow, and we've been building a lot of tooling around it.

The natural-language-to-computational-language workflow, right — and the process, especially if it's conversational, dialogue-like, with multiple queries.

Right. There are so many things here that are really interesting. The first thing is: can you just walk up to the computer and expect to specify a computation? What one realizes is that humans have to have some idea of this way of thinking about things computationally; without that, you're out of luck, because you have no idea what you're going to walk up to the computer and ask. I should tell a silly story about myself. The very first computer I saw — I was 10 years old, and it was a big mainframe computer — I didn't really understand what computers did, and somebody's showing me this computer, and I ask: can the computer work out the weight of a dinosaur? That isn't a sensible thing to ask — that's not what computers do. (In Wolfram Alpha, as it happens, you could say "what's the typical weight of a Stegosaurus" and we'll give you some answer, but that's a very different kind of thing from what one thinks of computers as doing.) So the first thing is that people have to have an idea of what computation is about. I think for education that is the key thing — not computer science, not the details of programming, but this idea of how you think about the world computationally. Thinking about the world computationally is a formal way of thinking about the world. We've had others: logic was a formal way of abstracting and formalizing some aspects of the world; mathematics is another. Computation is this very broad way of formalizing the way we think about the world, and the thing that's cool about computation is that if we can successfully formalize things in terms of computation, computers can help us figure out the consequences. It's not like formalizing things with math, where that's nice, but if you're not using a computer to do the math, you have to go work out a bunch of stuff yourself. So — we're talking about natural language and its relationship to computational language — the typical workflow, I think, is that first the human has to have some kind of idea of what they're trying to do, if it's something they want to build a tower of capabilities on, something they want to formalize and make computational. Then the human can type something into some LLM system
and say, vaguely, what they want in computational terms. The LLM then does pretty well at synthesizing Wolfram Language code — and it will probably do better in the future, because we've got a huge number of examples of natural language input together with the Wolfram Language translation of it, and extrapolating from all those examples makes this easier.

Could the prompter also debug the Wolfram Language code, or is your hope that that debugging won't be needed?

Well, there are many steps here. The first thing is: you type natural language, and it generates Wolfram Language.

Give examples, by the way — you gave the dinosaur example; do you have an example that jumps to mind that we should be thinking about?

Some dumb example — it's like: "take my heart-rate data, make a moving average over every seven days, and make a plot of the result." That's a thing which is about two-thirds of a line of Wolfram Language code — it's ListPlot of MovingAverage of some data bin, or something, of the data — and then you'll get the result. And the vague thing I was just saying in natural language would almost certainly correctly turn into that very simple piece of Wolfram Language code.

So you start mumbling about heart rate, and you arrive at the moving-average kind of idea — but you said "average over seven days"; maybe it'll figure out that that can be encapsulated as this MovingAverage idea.

I'm not sure.
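(That example would come out as roughly the following, where heartRateData is a hypothetical stand-in for the actual daily readings:)

    heartRateData = RandomReal[{55, 75}, 365];        (* placeholder data, one value per day *)
    ListPlot[MovingAverage[heartRateData, 7],
      PlotLabel -> "7-day moving average of heart rate"]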
But the typical workflow I'm seeing is: you generate this piece of Wolfram Language code — it's pretty small, usually; if it isn't small, it probably isn't right. And one of the ideas of Wolfram Language is that it's a language humans can read. Programming languages tend to be a one-way story: humans write them and computers execute them. Wolfram Language is intended to be something more like math notation — something humans write and humans are also supposed to read. So the workflow that's emerging is: the human mumbles some things, the large language model produces a fragment of Wolfram Language code, and then you look at that. Typically you just run it first and see whether it produces the right thing. You look at what it produces; you might say "that's obviously crazy," look at the code, see why it's crazy, and fix it. If you really care about the result — if you really want to make sure it's right — you'd better look at that code and understand it, because that's the checkpoint of "did it really do what I expected it to do." Going beyond that, what we find is that if the code does the wrong thing, you can often say to the large language model "can you adjust this to do this," and it's pretty good at doing that.

Interesting — so you're using the output of the code to give you hints about the function of the code. You're debugging based on the output of the code itself.

Right. The plug-in we have for ChatGPT does that routinely: it will send the thing in, it will get a result, and the LLM will discover by itself that the result is not plausible, and it will go back and say — "oh, I'm sorry" (it's very polite) — "I'll rewrite that piece of code," and then try again and get the result. The other thing that's pretty interesting happens when you're just running code. We invented this whole idea of notebooks back 36 years ago, and now there's the question of how you combine notebooks — where you have text and code and output — with the notion of chat and so on. There are some really interesting things there. For example, a very typical thing now is that we have these notebooks where, if you run code and it produces errors, produces messages and so on, the LLM automatically not only looks at those messages, it can also see all kinds of internal information — stack traces and things like this — and it then does a remarkably good job of guessing what's wrong and telling you. It's a typical AI-ish thing: it's able to take in more sensory data than we humans do, because it can look at a bunch of stuff we would just glaze over, and it can come up with "oh, this is the explanation of what's happening."

And what is the data here — the stack trace, the code you've written previously, the natural language you've written?

Yeah — and also, for example, when there are these messages, there's documentation about the messages, there are examples of where the messages have occurred, all these kinds of things. The other thing that's really amusing is what happens when it makes a mistake. One of the things in our prompt, when the code doesn't work, is "read the documentation," and we have another piece of the plugin that lets it read documentation. That again is very useful, because sometimes it will make up the name of some option for a function that doesn't really exist — read the documentation — or it will have some wrong structure for the function, and so on. It's a powerful thing. And the thing I've realized is: we built this language over the course of all these years to be nice and coherent and consistent, so that it's easy for humans to understand. It turns out there was a side effect I didn't anticipate, which is that it makes it easy for AIs to understand.

So it's almost like another natural language.

Yes.

So the formal language is a kind of foreign language? You have a lineup — English, French, Japanese, Wolfram Language, and then, I don't know, Spanish — and the system is not going to notice?

Well, maybe. That's an interesting question, because it really depends on what I see as an important piece of fundamental science that basically just jumped out at us with ChatGPT. The real question is: why does ChatGPT work? How is it possible to successfully reproduce all these kinds of things in natural language with a comparatively small — so to speak — couple hundred billion weights of neural net? I think that relates to a fundamental fact about language.
The main thing is that I think there's a structure to language that we haven't really explored very well — what I'm calling the semantic grammar of language. I mean, we kind of know that when we...