Transcript
F3Jd9GI6XqE • Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs | Lex Fridman Podcast #426
/home/itcorpmy/itcorp.my.id/harry/yt_channel/out/lexfridman/.shards/text-0001.zst#text/0779_F3Jd9GI6XqE.txt
Kind: captions Language: en

Edward Gibson: Naively, I certainly thought that all humans would have words for exact counting, and the Pirahã don't. They don't have any words for even "one". There's no word for "one" in their language, so there's certainly no word for "two", "three", or "four". That kind of blows people's minds, often.

Lex Fridman: Yeah, that's blowing my mind. That's pretty weird. How are you going to ask, "I want two of those"?

Edward Gibson: You just don't. That's just not a thing you can ask in Pirahã. It's not possible; there are no words for that.

Lex Fridman: The following is a conversation with Edward Gibson, or Ted, as everybody calls him. He is a psycholinguistics professor at MIT. He heads the MIT Language Lab, which investigates why human languages look the way they do, and the relationship between culture, language, and how people represent, process, and learn language. Also, he has a book titled Syntax: A Cognitive Approach, published by MIT Press, coming out this fall, so look out for that. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Edward Gibson.

When did you first become fascinated with human language?

Edward Gibson: As a kid in school, when we had to structure sentences in English grammar. I found that process interesting. I found it confusing as to what I was being told to do; I didn't understand what the theory was behind it, but I found it very interesting.

Lex Fridman: So when you look at grammar, you're almost thinking about it like a puzzle, almost like a mathematical puzzle?

Edward Gibson: Yeah, I think that's right. I didn't know I was going to work on this at all at that point. I was really kind of a math geek, a computer science person; I really liked computer science. And then I found language to be a neat puzzle to work on from an engineering perspective, actually. I sort of accidentally... well, I decided after I finished my undergraduate degree, which was computer science and math in Canada, at Queen's University, to go to grad school. That's what I always thought I would do. I went to Cambridge, where they had a master's program in computational linguistics. I hadn't taken a single language class before; all I had taken was computer science and math classes as an undergrad. I just thought this was an interesting thing to do for a year, because it was a one-year program, and then I ended up spending my whole life doing it.

Lex Fridman: So fundamentally your journey through life was one of a mathematician and a computer scientist, and then you discovered the puzzle, the problem, of language, and approached it from that angle, almost like a mathematician or maybe even an engineer.

Edward Gibson: As an engineer, I'd say. To be frank, I had taken an AI class, I guess in '83 or '84, somewhere in there, a long time ago, and there was a natural language section in it, and it didn't impress me. I thought there must be more interesting things we can do. It seemed like just a bunch of hacks to me; it didn't seem like a real theory of things in any way. So I thought this seemed like an interesting area where there wasn't enough good work.

Lex Fridman: Did you ever come across the philosophy angle of logic? If you think about the '80s in AI, the expert systems, where you try to maybe sidestep the poetry of language, the syntax and the grammar and all that kind of stuff, and go to the underlying meaning that language is trying to communicate, and try to somehow compress that in a computer-representable way. Did you ever come across that in your studies?

Edward Gibson: I probably did, but I wasn't as interested in it. I was trying to do the easier problems first, the ones I thought might be handleable. The syntax seemed easier: it's just the forms, as opposed to the meaning. When you start
talking about the meaning, that's a very hard problem, and it still is a really, really hard problem. But the forms are easier, and so I thought at least figuring out the forms of human language, which sounds really hard, is actually maybe more tractable.

Lex Fridman: It's interesting that you think there's a big divide, a gap, a distance between form and meaning, because that's a question you have discussed a lot with regard to LLMs.

Edward Gibson: They're damn good at form. That's what they're good at: form. And that's why they're good, because they can do form. Meaning is hard.

Lex Fridman: And it's an open question, right, how close form and meaning are. We'll discuss it, but to me, studying form, maybe it's a romantic notion, but form is like the shadow of the bigger meaning thing underlying language. Language is how we communicate ideas; we communicate with each other using language. So in understanding the structure of that communication, I think you start to understand the structure of thought and the structure of meaning behind those thoughts and that communication. But to you, big gap. What do you find most beautiful about human language, maybe the form of human language, the expression of human language?

Edward Gibson: What I find beautiful about human language is some of the generalizations that happen within and across languages. Let me give you an example of something I find kind of remarkable. Say a language has a word order such that the verbs tend to come before their objects. English does that: the subject comes first in a simple sentence, so I say "the dog chased the cat" or "Mary kicked the ball". The subject's first, then there's the verb, and then we have the objects; all these things come after in English. Most of the stuff that we want to say comes after the subject. And there are a lot of languages like that: about 40% of the languages of the world look like that. They're subject-verb-object languages. These languages tend to have prepositions, little markers on the nouns that connect nouns to other nouns or nouns to verbs, like "in" or "on" or "of" or "about". I say "I talk about something", and the something is the object of that preposition. These little markers, just like verbs, come before their nouns.

Then we look at other languages like Japanese or Hindi. These are so-called verb-final languages, maybe a little more than 40%, maybe 45% of the world's languages or more; around half of the world's languages are verb-final. Those tend to have postpositions. They have the same kinds of markers as we do in English, but they put them after the noun. So instead of "talk about a book", you say, in effect, "book about", the opposite order, and the "talk" comes at the end; the verb comes at the end as well. Instead of "Mary kicked the ball", it's "Mary ball kicked", and for "Mary kicked the ball to John", it's "John to": the little marker, a preposition in English, is a postposition in these languages. The fascinating thing to me is that within a language, this order aligns; it's harmonic. If a language is verb-initial, it has prepositions, and if it's verb-final, it has postpositions, and that holds across the languages we can look at. There are around 7,000 languages on Earth right now, but we have information about, say, word order for around a thousand of those, a pretty decent amount, and of those thousand, about 95% fit that pattern. It's about half and half: half verb-initial like English, and half verb-final like Japanese.

Lex Fridman: Just to clarify, verb-initial is subject-verb-object?

Edward Gibson: That's correct.

Lex Fridman: And verb-final is still subject-object-verb?

Edward Gibson: That's correct. The subject is generally first.

Lex Fridman: That's so fascinating. "I ate an apple", or "I apple ate". And it's fascinating that there's a pretty even division in the world amongst those, 40 to 45% each.

Edward Gibson: Yeah, it's pretty even, and those two are by far the most common word orders; the subject tends to be first. There are so many interesting things, but the thing I find so fascinating is that there are these generalizations within and across languages. And there's actually a simple explanation, I think, for a lot of that, and it's that you're trying to minimize dependencies between words. That's basically the story, I think, behind a lot of why word order looks the way it does. I'm talking to you in sentences, you're talking to me in sentences; these are sequences of words which are connected, and the connections are dependencies between the words. It turns out that what we're trying to do in a language is minimize those dependency links. It's easier for me to say things if the words that are connected for their meaning are close together, and it's easier for you in understanding if that's also true; if they're far away, it's hard to produce and hard to understand. And the languages of the world, within a language and across languages, fit that generalization. It turns out that
having verbs initial along with prepositions ends up making dependencies shorter, and having verbs final along with postpositions ends up making dependencies shorter, than if you cross them. If you cross them, it's possible, you can do it within a language; you just end up with longer dependencies than if you didn't. So languages tend to go that way. They call it harmonic. It was observed a long time ago, without the explanation, by a guy called Joseph Greenberg, a famous typologist from Stanford. He observed a lot of generalizations about how word order works, and these are some of the harmonic generalizations he observed about word order.

Lex Fridman: There are so many things I want to ask you. Okay, let me ask about some basics first. You mentioned dependencies a few times. What do you mean by dependencies?

Edward Gibson: In language there are kind of three components to the structure. One is the sounds: "cat" is C-A-T in English. I'm not talking about that part. Then there are two meaning parts. There are the words, and you were talking about meaning earlier: words have a form and a meaning associated with them, so "cat" is a form in English, and it has a meaning associated with whatever a cat is. And then there are the combinations of words; that's what I'll call grammar or syntax. That's when I have a combination like "the cat" or "two cats": I take two different words, put them together, and get a compositional meaning from putting those two words together. That's the syntax. In any sentence or utterance, whether I'm talking to you or you're talking to me, we have a bunch of words that we're putting together in a sequence, and it turns out they are connected, so that every word is connected to just one other word in that sentence. You end up with what's technically called a tree, a tree structure: there's a root of that utterance, of that sentence, and then there's a bunch of dependents, like branches from that root, that go down to the words. The words are the leaves in this metaphor of a tree.

Lex Fridman: A tree is also sort of a mathematical construct, a graph-theoretical thing. It's fascinating that you can break down a sentence into a tree, and then every word is hanging on to another, depending on it. And everyone agrees on that? All linguists agree with that? It's not controversial?

Edward Gibson: That is not controversial.

Lex Fridman: There's nobody sitting here mad at you? No linguist sitting there mad at this?

Edward Gibson: I don't think so. I think everyone agrees that all sentences are trees at some level.

Lex Fridman: Can I pause on that? To me, just as a layman, it's surprising that you can break down sentences in most, all, languages into a tree.

Edward Gibson: All languages, I think.

Lex Fridman: That's weird.

Edward Gibson: I've never heard of anyone disagreeing with that. The details of the trees are what people disagree about.

Lex Fridman: Well, okay. So what's at the root? How do you construct a tree? How hard is it? What is the process of constructing a tree from a sentence?

Edward Gibson: Well, this is where it depends on your theoretical notions. I'm going to say the simplest thing, dependency grammar. A bunch of people invented this; Tesnière was the first, a French guy. The paper was published in 1959, but he was working on it in the '30s. And it goes back to the philologist Pāṇini, who was doing something like this in ancient India. The simplest thing we can think of is that there are just connections between the words that make up the utterance. Say I have "two dogs entered a room"; here's a sentence. We're connecting "two" and "dogs" together: there's some dependency between those words that makes some bigger meaning. Then we're connecting "dogs" to "entered", and we connect "a room" somehow to "entered": "a" connects to "room", and "room" connects back to "entered". That's the tree. The root is "entered"; the thing is an entering event, that's what we're saying here. And the subject, which is "two dogs", connects back: from "entered" to "dogs", and from "dogs" down to "two". That's my tree. It starts at "entered", goes to "dogs", down to "two", and on the other side, after the verb, the object: it goes to "room", and "room" goes back to the determiner, or article, whatever you want to call that word. So there are a bunch of categories of words we're noticing here. There are verbs, which typically refer to events and states in the world, and there are nouns, which typically refer to people, places, and things, as people say, though they can also refer to events themselves. They're marked by how they get used in language: what the category, the part of speech, of a word is comes from how it gets used, not from its meaning.

Lex Fridman: What's usually the root? Is it going to be the verb that defines the event?

Edward Gibson: Usually, yes. If I don't say a verb, then there won't be a verb, and so the root will be something else.

Lex Fridman: What if you're messing with it? Are we talking about correct language? What if you're doing poetry and messing with stuff? Do the rules go out the window then?

Edward Gibson: No, you're still constrained by whatever language you're dealing with, though you probably have other constraints in poetry on top of that.
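The tree Gibson walks through for "two dogs entered a room" can be written down directly. The Python sketch below hand-codes the head of each word exactly as he describes it (it is an illustration of the structure, not a parser) and computes the total dependency length, the quantity that, on his account, languages tend to minimize.

```python
# Dependency tree for "two dogs entered a room", as described above:
# the root is the verb "entered"; "dogs" depends on "entered",
# "two" on "dogs", "room" on "entered", and "a" on "room".

words = ["two", "dogs", "entered", "a", "room"]

# heads[i] = index of the word that word i depends on (None marks the root).
heads = {0: 1, 1: 2, 2: None, 3: 4, 4: 2}

def total_dependency_length(heads):
    """Sum of linear distances between each word and its head."""
    return sum(abs(dep - head) for dep, head in heads.items() if head is not None)

print(total_dependency_length(heads))  # 1 + 1 + 1 + 2 = 5
```

Running the same computation on alternative orderings of the words is one way to compare word orders by total dependency length, which is the comparison behind the harmonic-order observation discussed earlier.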
Edward Gibson: Usually in poetry there are multiple constraints you want to satisfy: you usually want to convey multiple meanings, that's the idea, and maybe you have a rhythm or a rhyming structure as well. But you're usually constrained by the rules of your language for the most part, and you don't violate those too much. You can violate them somewhat, but not too much; it has to be recognizable as your language. In English, I can't say "dogs two entered a room" when I mean "two dogs entered a room". I can't mess with the order of the articles and the nouns; you just can't do that. In some languages you can mess around with the order of words much more. You speak Russian; Russian has a much freer word order than English. I told you that English has subject-verb-object word order, and so does Russian, but Russian is much freer than English, so you can actually mess around with the word order.

Lex Fridman: So probably Russian poetry is going to be quite different from English poetry, because the word order is much less constrained. There's a much more extensive culture of poetry throughout the history of the last hundred years in Russia, and I always wondered why that is. But it seems that there's more flexibility in the way the language is used; you're morphing the language more easily by altering the order of the words, messing with it.

Edward Gibson: Well, you can mess with different things in each language. In Russian you have case markers: endings on the nouns which tell you how each noun connects to the verb. We don't have that in English, so when I say "Mary kissed John", I don't know who the agent or the patient is except by the order of the words. In Russian you actually have a marker on the end. If you're using a Russian name, each of those names will also say whether it's the agent, with the nominative, which marks the subject, or the accusative, which marks the object. And you could put them in the reverse order: you could put the accusative first, the patient first, then the verb, then the subject, and that would be a perfectly good Russian sentence. It would still mean "Mary kissed John"; I could say "John kissed Mary" meaning "Mary kissed John", as long as I use the case markers in the right way. You can't do that in English.

Lex Fridman: I love the terminology of agent and patient, and the other ones you used. Those are linguistic terms?

Edward Gibson: Those are terms for meaning. Subject and object are generally used for position: the subject is the thing that comes before the verb, and the object is the one that comes after the verb. The agent is the thing doing the action; that's kind of what it means. The subject is often the person doing the action.

Lex Fridman: Okay, this is fascinating. So how hard is it to form a tree in general? Is there a procedure to it? If you look at different languages, is it automatable, or is there some human genius involved?

Edward Gibson: I think it's pretty automatable at this point. People can figure out the words, and they can figure out the morphemes. Technically, morphemes are the minimal meaning units within a language. So when you say "eats" or "drinks", it actually has two morphemes in English: there's the root, which is the verb, and then there's an ending on it which tells you it's third-person singular.

Lex Fridman: What are morphemes?

Edward Gibson: Morphemes are just the minimal meaning units within a language, and a word is just the thing we put spaces between in English. Words have a little bit more: they have the morphology as well, the inflectional morphology, the endings on the roots.

Lex Fridman: They modify something about the word, add additional meaning?

Edward Gibson: Yeah, and we have a little bit of that in English, very little; much more in Russian, for instance. We have a little on the nouns: you can say whether it's singular or plural. And the same kind of thing for verbs, like simple past tense, for example. Notice in English we say "he drinks", but everyone else is unmarked in a way: "I drink", "you drink", "we drink". And in the past tense it's just "drank" for everyone; there's no person marking at all. There is morphology marking past tense, but it's irregular now, "drink" to "drank"; it's not even a regular form. In most verbs, many verbs, there's an "-ed" we add, so "walk" to "walked"; we add that to say it's the past tense. I just happened to choose an irregular, because it's a high-frequency word, and the high-frequency words tend to be irregular in English.

Lex Fridman: What's an irregular?

Edward Gibson: An irregular is just one where there isn't a rule. "Drink" to "drank" is an irregular.

Lex Fridman: "Drink", "drank", okay. As opposed to "walk", "walked", "talk", "talked".

Edward Gibson: And there are a lot of irregulars in English. The frequent ones, the common words, tend to be irregular. There are many, many more low-frequency words, and those tend to be the regular ones.

Lex Fridman: The evolution of the irregulars is fascinating. It's essentially slang that's sticky: you're breaking the rules, and then everybody uses it and doesn't follow the rules, and they say screw it to the rules. It's fascinating. So you mentioned morphemes; lots of questions. Morphology is the study of morphemes?

Edward Gibson: Morphology is the connections between the morphemes and the roots. In English we mostly have suffixes, endings on the words, not very much, but a little bit.
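The split Gibson describes, memorized irregular forms for high-frequency verbs and a productive "-ed" rule for everything else, can be captured in a toy lookup. The sketch below is purely illustrative: the tiny irregular table is hand-picked, and the rule ignores real spelling adjustments such as "carry" to "carried".

```python
# Toy English past-tense rule: look up memorized irregular forms first,
# otherwise apply the regular "-ed" suffix.
# Illustrative only: the table is tiny and spelling rules are ignored.
IRREGULAR_PAST = {"drink": "drank", "eat": "ate", "go": "went", "sing": "sang"}

def past_tense(verb):
    return IRREGULAR_PAST.get(verb, verb + "ed")

print(past_tense("walk"))   # walked (regular rule)
print(past_tense("drink"))  # drank (memorized irregular)
```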
Edward Gibson: That's as opposed to prefixes. Some words, depending on your language, can have mostly prefixes, mostly suffixes, or both, and several languages have things called infixes, where you have a general form for the root and you put stuff in the middle; you change the vowels.

Lex Fridman: That's fascinating. So wait, in general, how many morphemes per word? One or two, or three?

Edward Gibson: In English it tends to be one or two. There can be more, and in other languages, a language like Finnish, which has a very elaborate morphology, there may be ten morphemes on the end of a root, and there may be millions of forms of a given word.

Lex Fridman: I will ask the same question over and over, but sometimes, to understand things like morphemes, it's nice to just ask how these kinds of things evolve. You have a great book studying the cognitive processing of language, how language is used for communication, the mathematical notion of how effective language is for communication, and what role that plays in the evolution of language. But just at a high level: how does a language evolve to where English has one or two morphemes per word and Finnish has a near-infinity of forms? How does that happen?

Edward Gibson: That's a really good question: why do languages have more morphology versus less morphology? And I don't think we know the answer to this. I think there are just a lot of good solutions to the problem of communication. I believe, as you hinted, that language is an invented system by humans for communicating their ideas, and I think it comes down to this: we label the things we want to talk about. Those are the morphemes and words, the things we want to talk about in the world, and we invent those things and then put them together in ways that are easy for us to convey, to process. But that's a naive view, and I mean, I think it's probably right.

Lex Fridman: It's naive and probably right.

Edward Gibson: Well, I don't know if it's naive. I think it's simple.

Lex Fridman: Simple, yeah. Naive is an indication that it's somehow incorrect, trivial, too simple. I think it could very well be correct. But it's interesting how sticky it feels. It just feels like once you figure out certain aspects of a language, that becomes sticky, and the tribe forms around that language. Or maybe the tribe forms first and then the language evolves, and then you just kind of agree and stick to whatever that is.

Edward Gibson: These are very interesting questions. We don't know very much about how even words get invented. Assuming they get invented, we don't really know how that process works and how these things evolve. What we have is a current picture: a few thousand languages, a few thousand instances. We don't have any pictures of how these things are really evolving. And then the evolution is massively confused by contact. As soon as one language group runs into another... humans are smart, and they take on whatever is useful in the other group. Any kind of contrast you're talking about which I find useful, I'm going to start using as well. I've worked a little bit in specific areas of words: in number words and in color words. In English we have around 11 words for colors that everyone knows, and many more if you happen to be interested in color for some reason or other; if you're a fashion designer or an artist, you may have many, many more words. But we can see millions of distinctions in color: if you have normal trichromatic color vision, you can see millions of distinctions. We don't have millions of words; the most detailed color vocabulary would have over a million terms to distinguish all the different colors that we can see, but of course we don't have that. Somehow it's been useful for English to have evolved to the point where there are 11 terms that people find useful to talk about: black, white, red, blue, green, yellow, purple, gray, pink, and I probably missed something there; anyway, there are 11 that everyone knows. But you go to different cultures, especially the non-industrialized cultures, and there will be many fewer. Some cultures will have only two, believe it or not. The Dani in Papua New Guinea have only two labels that the group uses for color, and those are roughly black and white: very, very dark and very, very light. You might think, oh, they're dividing the whole color space into light and dark or something, and that's not really true. They mostly just label the black and the white things; they just don't talk about the colors of the other ones. And then there are other groups. I've worked with a group called the Tsimane', down in Bolivia, in South America, and they have three words that everyone knows, but there are a few others that many people know. So depending on how you count, they have between three and seven words that the group knows. And again, black and white, everyone knows those, and red; red tends to be the third word that cultures bring in. If there's a third color word, it's always red. After that, it's kind of all bets are off about what they bring in.
they bring in a sort of a big blue green Spa gr gr they have one for that and then they have uh and then you know different people have different words that they'll use for other parts of the space and so anyway it's probably related to what they want to talk what they not what they not what they see because they see the same colors as we see so it's not like they have they don't they have a a weak a low color palette and the things they're looking at they're looking at a lot of beautiful scenery okay a lot of different colored uh flowers and berries and things and you know and so there's lots of things of very bright colors but they just don't label the color in those cases and the reason probably we we don't know this but we think probably what's going on here is that what you do why you label something is you need to talk to someone else about it and and why do I need to talk about a color well if I have two things which are identical and I want you to give me the one that's different and and the only way it varies is color then I invent a word which tells you uh you know this is the one I want so I want the red sweater off the rack not the not the green sweater right there's two and and so those those things will be identical ex because these are things we made and they're died and there there's nothing different about them and so in in industrialized Society we have you know everything everything we've got is pretty much arbitrarily colored uh but you go to non-industrialized group that's not true and so they don't re Sly they're not interested in color you you bring bright colored things to them they like them just like we like them bright colors are great they're beautiful they are but they just don't need to don't need to talk about them they don't have so probably color words is a good example of how language evolves from sort of function when you need to communicate the use of something I think so then then you kind of invent different variations and uh 
and basically you can imagine that the evolution of a language has to do with what the early tribe is doing like what what they want what what kind of problems they're facing them and they're quickly figuring out how to efficiently communicate uh the solution to those problems whether it's aesthetic or functional all that kind of stuff running away from a mammoth or whatever um but you know it's so so I think what you're pointing to is that we don't have data on the evolution of language because many languages have formed a long time ago so you don't get the chatter we have a little bit of like Old English to Modern English because there was a writing system and we can see how how old English looked so the word order changed for instance in Old English to Middle English to Modern English and so it you know we can see things like that but most languages don't even have a writing system so of the 7,000 only you know a small subset of those have a writing system and even if they have a writing system they it's not a very modern writing system and so they don't have it so we just basically have for Mandarin for Chinese we have a lot of a lot of evidence from from for a long time and for English and not for much else not for in German a little bit but not for a whole lot of like long-term um language Evolution we don't have a lot we just have snapshots is what we've got of current languages yeah I you get an inkling of that from the rapid communication on certain platforms like on Reddit there's different communities and they'll come up with different slang usually from my perspective during by a little bit of humor um or maybe mockery or whatever it's you know just talking and different kinds of ways and uh you could see the evolution of language there because um I think a lot of things on the internet you don't want to be the boring mainstream so you like want to deviate from the proper way of talking MH and so you get a lot of deviation like rapid deviation then when 
communities collide, you get — just like you said — humans adapting to it, and you can see it through a lot of humor. I mean, it's very difficult to study, but you can imagine that 100 years from now, if there's a new language born, for example, we'll get really high-resolution data on it. I mean, English is changing; English changes all the time; all languages change all the time. There's the famous result about the Queen's English. If you look at the Queen's vowels — the Queen's English is supposed to be the proper way to talk, originally defined by however the Queen or the King, whoever was in charge, talked — and if you look at how her vowels changed from when she first became Queen in 1952 or '53, when she was crowned — that's Queen Elizabeth, who died recently, of course — until 50 years later, her vowels shifted a lot. So even in the sounds of British English, the way she was talking was changing; the vowels were changing slightly. That's just in the sounds — there's change. I don't know what's driving any of these changes; we're all interested in what's driving them. The word order of English changed a lot over a thousand years, right? It used to look like German. It used to be a verb-final language with case marking, and it shifted to a verb-medial language — a lot of contact, a lot of contact with French — and it became a verb-medial language with no case marking. So that totally evolved. It may not evolve very much in 20 years, which is maybe what you're talking about, but over 50 and 100 years, things change a lot, and I think we will now have good data on it, which is great. That's for sure. Can you talk about what syntax is and what grammar is? You wrote a book on syntax. I did. You were asking
me before about how I figure out what a dependency structure is. I'd say dependency structures aren't that hard, generally. I think there's a lot of agreement about what they are for almost any sentence in most languages; people will agree on a lot of that. There are other parameters in the mix, such that some people think there's a more complicated grammar than just a dependency structure. So, Noam Chomsky — he's the most famous linguist ever — is famous for proposing a slightly more complicated syntax: he invented phrase structure grammar. He's well known for many, many things, but in the late '50s and early '60s he was basically figuring out what's called formal language theory. He figured out a framework for determining how complicated a certain type of language — a so-called phrase structure grammar of a language — might be. His idea was that maybe we can think about the complexity of a language by how complicated the rules are. And the rules look like this: they have a left-hand side and a right-hand side, and the thing on the left-hand side expands to the thing on the right-hand side. So we start with an S, which is the root — a sentence — and it expands to things like a noun phrase and a verb phrase, for instance. "An S goes to an NP and a VP" is a phrase structure rule. Then we figure out what an NP is — an NP is a determiner and a noun, for instance — and a verb phrase is a verb and another noun phrase, another NP, for instance. Those are the rules of a very simple phrase structure grammar. So he proposed phrase structure grammar as a way to cover human languages, and then he actually figured out that, depending on the formalization of those grammars, you
might get more complicated or less complicated languages. He said these are things called context-free languages — he thought human languages tend to be what he called context-free languages — but there are simpler languages, so-called regular languages, which have a more constrained form to the rules of the phrase structure. So he basically discovered, and kind of invented, ways to describe the phrase structure of a human language, and he was mostly interested in English initially, in his work in the '50s. So, quick questions around all this. Formal language theory is the big field of studying language formally? Yes, and it doesn't have to be human language. We have computer languages — any kind of system which is generating some set of expressions in a language, and those could be the statements in a computer language, for example. So it could be that, or it could be human language. So technically you can study programming languages? Yes, and they have been heavily studied using this formalism; there's a big field of programming languages within formal language theory. Okay, and then phrase structure grammar is this idea that you can break down language into this S, NP, VP? It's a particular formalism for describing language, and Chomsky was the first one; he's the one who figured that stuff out back in the '50s. And that's equivalent, actually — the context-free grammar is kind of equivalent, in the sense that it generates the same sentences as a dependency grammar would. The dependency grammar is a little simpler in some ways: you just have a root, the rules are implicit, I guess, and we just have connections between words.
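The kind of phrase-structure rules described here can be sketched as a toy generator. The grammar and vocabulary below are invented for illustration (echoing the "two dogs entered the room" example from earlier in the conversation), not any grammar from Gibson's or Chomsky's actual work:

```python
import random

# A toy phrase-structure grammar: each left-hand category expands to one
# of the alternatives on the right, until only words remain.
RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"], ["two"]],
    "N":   [["dog"], ["dogs"], ["room"]],
    "V":   [["entered"], ["chased"]],
}

def generate(symbol="S"):
    """Expand a category top-down; symbols not in RULES are words."""
    if symbol not in RULES:
        return [symbol]
    words = []
    for part in random.choice(RULES[symbol]):
        words.extend(generate(part))
    return words

print(" ".join(generate()))  # e.g. "two dogs entered the room"
```

Every sentence this grammar produces has the shape Det N V Det N, which is exactly what the rules S → NP VP, NP → Det N, VP → V NP dictate.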
The phrase structure grammar is a kind of different way to think about the dependency grammar. It's slightly more complicated, but it's kind of the same in some ways. So to clarify: dependency grammar is the framework under which you see language, and you make the case that this is a good way to describe language. That's correct. And Noam Chomsky is watching this, very upset right now — I'm just kidding. But what's the difference, where's the place of disagreement, between phrase structure grammar and dependency grammar? They're very close. Phrase structure grammar and dependency grammar aren't that far apart. I like dependency grammar because it's more perspicuous, more transparent about representing the connections between the words; it's just a little harder to see in phrase structure grammar. The place where Chomsky sort of went off from this is that he also thought there was something called movement. And that's where we disagree — that's the place where I would say we disagree, and maybe we'll get into that later. Do you want me to explain that now? I would love it — can you explain movement? Movement. Okay, so you're saying so many interesting things. Movement is: Chomsky basically sees English and says, okay, we had that sentence earlier, like "two dogs entered the room." Change it a little bit: "two dogs will enter the room." And he notices that, hey, in English, if I want to make a yes/no question from that same sentence, instead of "two dogs will enter the room," I say "will two dogs enter the room." There's a different way to say the same idea, and the auxiliary verb — that "will" thing — is at the front as opposed to in the middle. And if you look at English, you see that that's true for all those modal verbs and for
other kinds of auxiliary verbs in English. You always do that; you always put an auxiliary verb at the front. So if I say "I can win this bet," it's "can I win this bet," right? I move the "can" to the front. Actually, that's a theory — I just gave you a theory there. He talks about it as movement: the declarative is the root, the sort of default way to think about the sentence, and you move the auxiliary verb to the front. That's a movement theory. And he just thought that was so obviously true that there's nothing more to say about it — this is how auxiliary verbs work in English. There's a movement rule such that, to get from the declarative to the interrogative, you move the auxiliary to the front. It's a little more complicated as soon as you go to the simple present and simple past, because if I say "John slept," you have to say "did John sleep," not "slept John," right? So you have to somehow get an auxiliary verb in there, and I guess underlyingly "slept" involves one — it's a little more complicated than that, but that's his idea: there's movement. So he proposed this theory of grammar which has movement, and there are other places where he thought there's movement, not just auxiliary verbs but things like the passive in English, and wh-questions — a bunch of places where he thought there's also movement going on. And in each one of those, there are words — well, phrases and words — moving around from one structure to another, what he called deep structure to surface structure; there are two different structures in his theory. There's a different way to think about this, which is that there's no movement at all. There's a lexical copying rule, such that the word "will" or the
word "can" — these auxiliary verbs — just have two forms. One of them is the declarative and one of them is the interrogative, and you basically have the declarative one and, oh, I form the interrogative from it, or I can form one from the other; it doesn't matter which direction you go. I just have a new entry which has the same meaning but a slightly different argument structure — argument structure is just a fancy word for the ordering of the words. So if I say "two dogs can" or "will enter the room," there are two forms of "will." One is the declarative "will": I've got my subject to the left — it comes before me — and the verb comes after me. And then the interrogative "will": oh, I go first — interrogative "will" is first, then the subject immediately after, and then the verb after that. So you can just generate from one of those words another word with a slightly different argument structure, with different ordering, and these are just lexical copies. They're not moving from one to another; there's no movement. There's a romantic notion that you have one main way to use a word and then you can move it around, which is essentially what movement is implying. Right, but the lexical copying story is similar: we do lexical copying for that same idea — maybe the declarative is the source, and then we can copy it. And there are multiple advantages to the lexical copying story. It's not my story; this is Ivan Sag — a bunch of linguists have been proposing these stories as well, in tandem with the movement story. Ivan Sag died a while ago, but he was one of the proponents of the non-movement, lexical copying story. A great advantage is — well, Chomsky really famously in 1971 showed that the movement story leads to learnability problems.
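The two-entries idea can be made concrete with a tiny sketch. This is only an illustration of the lexical-copying intuition, not Sag's actual formalism; the lexicon, slot names, and `realize` function are all invented for the example, including the assumption that "ought" lacks an interrogative entry for many speakers (a point taken up just below):

```python
# Each auxiliary has one or more lexical entries with the same meaning
# but different word orders (argument structures). No movement rule:
# the interrogative is just another stored entry.
LEXICON = {
    "will": [
        {"form": "declarative",   "order": ["SUBJ", "will", "VERB"]},
        {"form": "interrogative", "order": ["will", "SUBJ", "VERB"]},
    ],
    # For many speakers, "ought" works only in the declarative -- the
    # copying story just stores one entry; a general movement rule
    # would have to be specially blocked here.
    "ought": [
        {"form": "declarative", "order": ["SUBJ", "ought to", "VERB"]},
    ],
}

def realize(aux, form, subj, verb):
    """Linearize subject and verb around the auxiliary per its entry."""
    for entry in LEXICON[aux]:
        if entry["form"] == form:
            filled = {"SUBJ": subj, "VERB": verb}
            return " ".join(filled.get(slot, slot) for slot in entry["order"])
    raise ValueError(f"{aux!r} has no {form} entry")

print(realize("will", "declarative", "two dogs", "enter the room"))
# -> two dogs will enter the room
print(realize("will", "interrogative", "two dogs", "enter the room"))
# -> will two dogs enter the room
```

Asking for the missing interrogative "ought" entry simply fails, which is the word-specific behavior the lexical story predicts.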
It leads to problems for how language is learned. It's really, really hard to figure out what the underlying structure of a language is if you have both phrase structure and movement — it's really hard to figure out what came from what; there are a lot of possibilities there. If you don't have that, the learning problem gets a lot easier, if there are just lexical copies. And when we say the learning problem, do you mean humans learning a new language? Yeah, just learning English. So a baby is lying in the crib listening to me talk — how are they learning English? Or maybe it's a two-year-old who's learning interrogatives and stuff. How are they doing that? Are they figuring it out? Chomsky said it's impossible to figure it out — actually, he said it's impossible, not hard but impossible. And therefore that's where universal grammar comes from: it has to be built in. So what they're learning is — there's something built in. Movement is built in, in his story; it's absolutely part of your language module. And then you're just setting parameters: English is just a variant of the universal grammar, and you're figuring out, oh, which orders does English do these things in. The non-movement story doesn't have this. It's much more bottom-up: you're learning rules one by one — oh, this word is connected to that word. A great advantage: it's learnable. Another advantage is that it predicts that not all auxiliaries might move — it might depend on the word — and that turns out to be true. There are words that don't really work as auxiliaries in both forms; they work in the declarative and not in the interrogative, or vice versa. I'll give you the opposite first: so
I can say "aren't I invited to the party," and that's an interrogative form, but it's not from "I aren't invited to the party." There is no "I aren't," right? So that's interrogative-only. And then we also have forms like "ought": "I ought to do this," and I guess some old British people can say "ought I." Exactly — it doesn't sound right, does it? For me it sounds ridiculous. I don't even think "ought" is great, but I totally recognize "I ought to" — that's not too bad, actually. I can say "I ought to do this"; that sounds fine if I'm trying to sound sophisticated, maybe, I don't know. "Ought I" just sounds completely out to me. Anyway, there are variants here, and a lot of these words just work in one form versus the other. And that's fine under the lexical copying story: you just learn the usage — whatever the usage is, is what you do with this word. But it's a little bit harder in the movement story. That's an advantage, I think, of lexical copying: in all these different places there are these usage variants, which make the movement story a little bit harder to make work. So one of the main divisions here is the movement story versus the lexical copying story, and that has to do with the auxiliary verbs and so on. But if we rewind to phrase structure grammar versus dependency grammar — those are equivalent in some sense, in that for any dependency grammar I can generate a phrase structure grammar which generates exactly the same sentences. I just like the dependency grammar formalism because it makes something really salient, which is the lengths of dependencies between words. That isn't so obvious in the phrase structure — it's in there, it's just very opaque, kind of hard to see. Technically, I think phrase structure grammar is mappable to dependency grammar. And vice versa. And vice versa, yeah. There are these
little labels, S, NP, VP. Yeah. For a particular dependency grammar you can make a phrase structure grammar which generates exactly those same sentences, and vice versa. But there are many phrase structure grammars for which you can't really make a dependency grammar. You can do a lot more in a phrase structure grammar — you get many more of these extra nodes; you can have more structure in there. Some people like that, and maybe there's value to that; I don't like it. Well, for you — so we should clarify — in dependency grammar, one word depends on only one other word, and you form these trees, and that really puts priority on those dependencies. As a tree, you can then measure the distance of the dependency from one word to the other, which can then map to the cognitive processing of these sentences — how easy they are to understand and all that kind of stuff. So it just puts the focus on the mathematical distance of dependencies between words. It's just a different focus. Absolutely. Just to continue on the thread of Chomsky, because it's really interesting — as you're discussing disagreement, to the degree there's disagreement, you're also telling the history of the study of language, which is really awesome. You mentioned context-free versus regular. Does that distinction come into play for dependency grammars? No, not at all. I mean, regular languages are too simple for human languages. They're part of the hierarchy, but human languages, in the phrase structure world, are at least context-free, maybe a little bit harder than that. There's something called context-sensitive as well, where you can have — this is just the formal language description. This is a bunch of formal language theory we're doing here. I love it. Okay, so
in a context-free grammar, you have a left-hand side category and you're expanding to anything on the right. That's context-free: the idea is that the category on the left expands, independent of context, to those things on the right, whatever they are. A context-sensitive grammar says, okay, I actually have more than one thing on the left; I can tell you only in this context — maybe I have a left and a right context, or just a left context or a right context — two or more things on the left that tell you how to expand in that context. So that's context-sensitive. A regular language is just more constrained: it doesn't allow just anything on the right; it's basically one very constrained kind of rule. It doesn't allow, say, long-distance dependencies; it doesn't allow recursion, for instance. There's no recursion? Yeah — recursion is something human languages have; they have embedding. Well, more precisely, a regular language doesn't allow center-embedded recursion, which human languages have. Which is what — center-embedded recursion within a sentence? Within a sentence, yeah. We're going to get to that. But the formal language stuff is a little aside. Chomsky wasn't even proposing it for human languages; he was just pointing out that human languages are context-free. That was the kind of stuff we did for formal languages, and what he was most interested in was human language, and the movement is where he sort of set off on — I would say — a very interesting but wrong foot. I agree, it's a very interesting history. So he proposed multiple theories, in '57 and then '65. They all have this framework, though: phrase structure plus movement.
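To make the formal-language aside concrete: the textbook language aⁿbⁿ — n a's followed by n matched b's, the bare skeleton of center embedding — can be generated by a context-free grammar (S → a S b | ε) but by no regular language, because recognizing it requires tracking unbounded nesting. A minimal recognizer, written for illustration only:

```python
def is_anbn(s):
    """Recognize a^n b^n (n >= 0): peel one matched a...b pair per step.
    This matched, nested pairing is exactly what a regular language
    (a finite-state machine with no stack) cannot keep track of."""
    if s == "":
        return True
    return s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])

print(is_anbn("aabb"))   # True  -- two nested pairs
print(is_anbn("aab"))    # False -- an unmatched "a"
print(is_anbn("abab"))   # False -- pairs are crossed, not nested
```

The recursive call mirrors the context-free rule S → a S b: each level of the recursion is one level of embedding.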
There were different versions of the phrase structure and the movement; the '57 ones are the most famous, original bits of Chomsky's work. And then '71 is when he figured out that those lead to learning problems — that there are cases where a kid could never figure out which set of rules was intended. And so then he said, well, that means it's innate. It's kind of interesting: he just really thought the movement was so obviously true that he didn't even entertain giving it up. It's just obvious; that's obviously right. And it was later that people figured out that there are all these subtle ways in which things that look like generalizations aren't generalizations across the category — they're word-specific. They kind of work, but they don't work across various other words in the category, and so it's easier to just think of these things as lexical copies. I think he was very obsessed — I'm guessing that he really wanted the story to be simple in some sense, and language is a little more complicated in some sense. He didn't like words. He never talks about words; he likes to talk about combinations of words. And words — look in a dictionary: there are 50 senses for a common word, right? The word "take" will have 30 or 40 senses in it. There will be many different senses for common words, and he just doesn't think about that, or doesn't think that's language. I think he thinks words are distinct from combinations of words; I think they're the same. If you look at my brain in the scanner while I'm listening to a language I understand — I can localize my language network in a few minutes, in like 15 minutes. What you do is: I listen to a language I know, I listen to maybe some language I don't know, I listen to muffled speech, or I read sentences and I read non-words. I do anything like this — anything that's sort of
really like English, and anything that's not very much like English — so I've got something like it, and I've got a control. And the voxels — which are just the 3D pixels in my brain — that are responding most are a language area, and that's this left-lateralized area in my head. And wherever I look in that network, if you look for the combinations versus the words, it's there. It's everywhere. It's the same. That's fascinating. And so it's hard to find — there are no areas that we know of... I mean, that's a little overstated right now. At this point the technology isn't great — it's not bad. The best way to figure out what's going on in my brain when I'm listening to or reading language is to use fMRI, functional magnetic resonance imaging, and that's a very good localization method. I can figure out where exactly these signals are coming from, down to millimeters — cubic millimeters or smaller. We can figure those out very well. The problem is the "when": it's measuring oxygen, and oxygen takes a little while to get to those cells, so it takes on the order of seconds. I talk fast, I probably listen fast, and I can probably understand things really fast — a lot of stuff happens in two seconds. So to say that we know what's going on with the words right now in that network — our best guess is that the whole network is doing something similar, but maybe different parts of that network are doing different things, and that's probably the case. We just don't have very good methods to figure that out at this moment. So, since we're kind of talking about the history of the study of language: what other interesting disagreements — and you're both at MIT, or were for a long time — what kinds of interesting disagreements or tensions of ideas are there between you and Noam Chomsky? And we should say that Noam was in the linguistics department, and you're, I
guess, for a time affiliated there, but primarily in the brain and cognitive sciences department — just another way of studying language. And you've been talking about fMRI. Is there something else interesting to bring to the surface about the disagreements between the two of you, or other people in this field? Yeah. I mean, I've been at MIT for 31 years, since 1993, and Chomsky's been there much longer. I met him, I knew him — I met him when I first got there, I guess, and we would interact every now and then. I'd say our biggest difference is our methods. That's the biggest difference between me and Noam: I gather data from people. I do experiments with people, and I gather corpus data — whatever corpus data is available — and we use quantitative methods to evaluate any kind of hypothesis we have. He just doesn't do that. He has never once been associated with any experiment or corpus work, ever. It's all thought experiments, his own intuitions. I just don't think that's the way to do things. That's an across-the-street difference — they're across the street from us — between brain and cognitive sciences and linguistics. I mean, not all linguists: some of the linguists, depending on what they do — the more speech-oriented ones — do more quantitative stuff. But on the meaning side — words and combinations of words, syntax, semantics — they tend not to do experiments and corpus analyses. But the method is a symptom of a bigger approach: sort of a psychology-philosophy side on Noam's part, and for you it's more data-driven, almost like a mathematical approach. Yeah, I mean, I'm a psychologist, so I would say we're in psychology. Brain and cognitive sciences is MIT's old psychology department; it was the psychology department up until 1985, and it became the brain and cognitive sciences department. My training
is in psych — I mean, my training is in math and computer science, but I'm a psychologist. I don't know what I am. A data-driven psychologist, you are. I am what I am, but I'm happy to be called a linguist, I'm happy to be called a computer scientist, I'm happy to be called a psychologist — any of those things. And how that manifests itself outside the methodology is these differences, these subtle differences, about the movement story versus the lexical copy story. Those are theories, right. But I think the reason we differ, in part, is because of how we evaluate the theories: I evaluate theories quantitatively, and Noam doesn't. Got it. Okay, well, let's explore the theories that you explore in your book. Let's return to this dependency grammar framework of looking at language. What's a good justification for why the dependency grammar framework is a good way to explain language? What's your intuition? So the reason I like dependency grammar, as I've said before, is that it's very transparent about its representation of distance between words. All it is is: you've got a bunch of words, and you connect them together to make a sentence. And a really neat insight, which turns out to be true, is that the further apart the pair of words you're connecting are, the harder the production and the comprehension are. It's harder to produce, it's harder to understand when the words are far apart; when they're close together, it's easy to produce and easy to comprehend. Let me give you an example. In any language, we have mostly local connections between words, but they're abstract — the connections are between categories of words — and you can always make things further apart if you add modification, for example, after a noun. A noun in English comes before a verb — the subject noun comes before a verb — and then there's an
object after, for example. So I can say what I said before — "the dog entered the room" or something like that — and I can modify "dog." If I say something more about "dog" after it, then what I'm doing is indirectly lengthening the dependency between "dog" and "entered" by adding more stuff to it. Let me just make it explicit. If I say "the boy who the cat scratched cried" — we're going to have a mean cat here — what I've got is: "the boy cried" would be a very short, simple sentence, and I just told you something more about the boy: it was the boy who the cat scratched. So "scratched" at the end of the relative clause is connected to "the boy" at the beginning, right? And that's a perfectly fine English sentence. I can also say "the cat which the dog chased ran away" or something — I can do that. But it gets really hard if, in "the boy who the cat scratched," I now try to modify "cat": "the boy who the cat which the dog chased scratched cried." Oh my God, that's hard, right? I'm sort of just working through in my head how to produce that, and it's really just horrendous to understand. It's not so bad when I at least have intonation to mark the boundaries and stuff, but that's really complicated. And that's sort of English, in a way — I mean, it follows the rules of English. What's interesting is that what I'm doing is nesting dependencies. I've got a subject connected to a verb, and then I'm modifying that with a clause — another clause which happens to have a subject-verb relation — and then I'm trying to do that again inside the second one. And what that does is lengthen out the dependencies; multiple dependencies actually get lengthened out there.
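The lengthening can be made concrete by counting. In the sketch below, dependency distance is measured in words (one candidate metric; the metric question comes up again just after this), and the head annotations are hand-made for illustration rather than drawn from any treebank — annotation conventions differ across schemes:

```python
# heads[i] is the index of the word that word i depends on; -1 marks the root.
def total_dependency_length(heads):
    """Sum of linear distances between each word and its head."""
    return sum(abs(i - h) for i, h in enumerate(heads) if h >= 0)

# "the boy cried": the -> boy, boy -> cried, cried = root
simple_heads = [1, 2, -1]

# "the boy who the cat scratched cried": the relative clause pushes
# "boy" five words away from its verb "cried"
# words:  the boy who the cat scratched cried
# index:   0   1   2   3   4     5        6
embedded_heads = [1, 6, 5, 4, 5, 1, -1]

print(total_dependency_length(simple_heads))    # -> 2
print(total_dependency_length(embedded_heads))  # -> 15
```

Same subject-verb relation, but center embedding more than septuples the summed dependency length, which is the kind of quantity the processing account ties to difficulty.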
The outside dependencies get long, and even the ones in between get kind of long. So what's fascinating is that that's bad — that's really horrendous in English — but it's horrendous in any language. No matter what language you look at, if you build some structure where there's modification following some head which is connected to some later head, and you do it again, it won't be good. Guaranteed, like 100%: that will be uninterpretable in that language, in the same way it was uninterpretable in English. Just to clarify the distance of the dependencies: in "the boy cried" there's a dependency between words — are you counting the number of, what, morphemes between them? That's a good question. I just say words. Words or morphemes between — we don't know, actually; that's a very good question. What is the distance metric? But let's just say it's words, sure. Okay. And you're saying the longer the distance of that dependency, the worse — no matter the language, except Legalese. Even Legalese. Okay, okay, we'll talk about that. But the people who speak that language will be very upset. Not upset, but they'll either not understand it, or their brain will be working overtime. They'll have a hard time either producing or comprehending it. They might even tell you that's not their language. It's sort of their language — they'll agree with each of those pieces as part of their language — but somehow that combination will be very, very difficult to produce and understand. Is there a chicken-or-egg issue here? Well, I'm giving you an explanation — I mean, I'm giving you two kinds of explanations. I'm telling you that center embedding — that's nesting; those are synonyms for the same concept here — is always hard, and I'm giving you an explanation for why
it might be hard, which is long-distance connections. When you do center embedding, when you do nesting, you always have long-distance connections between the dependents. Now, that's not necessarily the right explanation — I can go through reasons why it's probably a good one — and it's not really just about one of them; it's probably a pair of these dependencies getting long that drives you to be really confused in that case. And the behavioral consequences — this is kind of methods, how we get at this — you can try to do experiments to get people to produce these things, and they're going to have a hard time producing them. You can do experiments to see how well they understand them — can they understand them? Another method is to give people partial materials and ask them to complete them — those center-embedded materials — and they'll fail. I've done all these kinds of things. Wait a minute. So center embedding means you take a normal sentence like "the boy cried" and inject a bunch of stuff in the middle that separates "the boy" and "cried"? Yes. Okay, that's center embedding, and nesting is on top of that? No, no — nesting is the same thing. Center embedding and nesting are totally equivalent terms; I'm sorry, I sometimes use one and sometimes the other. Got it. And then you're saying there's a bunch of different kinds of experiments you can do. I like the understanding one: with more center embedding, is it easier or harder to understand? But then you have to measure the level of understanding, I guess. Yeah, you could. There are multiple ways to do that. The simplest way is just to ask people how good it sounds, how natural it sounds. That's a very blunt but very good measure; it's very reliable — people will do the
same thing. So, I don't know exactly what it means, but it's measuring something about the confusion, the difficulty, associated with those structures; those measures are giving you a signal, and that's why you can say that. What about the completion task with center embedding? If I give you a partial sentence, say "the book which the author who," and ask you to finish it off for me, and say it's written in front of you and you can type and take as much time as you want, even though that one's not too hard, right? "The book which the author who I met wrote was good" is a very simple completion. If I put that prompt on a crowdsourcing platform and ask people to complete it, they will miss off a verb very regularly, like half the time, maybe two-thirds of the time; they'll just leave off one of those verb phrases, even with something that simple. So for "the book which the author who," you need three verbs, right? "Who I met wrote was good," and they'll give me two. They'll say "who was famous was good" or something like that; they'll just give me two, and that'll happen about 60% of the time. So maybe 30 or 40% of the time they'll do it correctly, correctly meaning with three verb phrases, though I don't know what's correct or not; this is a hard task. Yeah, I'm actually struggling with it in my head. Well, it's a little easier when you stare at it; listening is pretty tough, because there's no trace of it, you have to remember the words that I'm saying, which is very hard auditorily. We wouldn't do it that way; we'd do it written, so you can look at it and figure it out. It's easier in many dimensions, in some
ways, depending on the person. It's easier to gather written data. I work in psycholinguistics, the psychology of language, and a lot of our work is based on written materials, because it's so easy to gather data from people doing written tasks. Spoken tasks are just more complicated to administer and analyze, because people do weird things when they speak and it's harder to analyze what they do, but they generally point to the same kinds of things. Okay, so the universal theory of language by Ted Gibson is that you can form dependency trees from any sentence, you can measure the distance in some way of those dependencies, and then you can say that most languages have very short dependencies. All languages. All languages have short dependencies, and you can actually measure that. An ex-student of mine, now at the University of California, Irvine, Richard Futrell, did a study a bunch of years ago where he looked at all the languages we could look at, which was about 40 initially, and now I think there are about 60, for which there are dependency structures: meaning there's a big corpus of text which has been parsed for its dependency structures, and there are about 60 languages that have been parsed that way. For all of those, what he did was take any sentence in one of those languages, take its dependency structure, and start at the root (we're talking about dependency structures, so that's pretty easy now), and figure out, as a control, other ways you might say the same sentence in that language. So there's a root; say the sentence is, let's go back to "two dogs entered the room." "Entered" is the root, and "entered" has two dependents: it's got "dogs" and it has "room." And what he does is scramble
that order: those three things, the head and the two dependents, into some random order, and then do that for all the dependents down the tree; so now do it for "the" and "two" within "dogs," and for "the" within "room." That's a very short sentence; when sentences get longer and you have more dependents, there's more scrambling that's possible. So that gives you one scrambling for that sentence; he did it like 100 times for every sentence in every one of these corpora, and then he compared the dependency lengths in those random scramblings to what actually happened, what the English or the French or the German or the Chinese was in the original language, across these 60 languages. And the dependency lengths are always shorter in the real language compared to this kind of control. There's also a more rigid version of his control: the way I described it, by scrambling that way, you could have crossed dependencies, since you could scramble in any way at all. Languages don't do that; they tend not to cross dependencies very much in the dependency structure, they tend to keep things non-crossed. There's a technical term for that, "projective," but projective is just non-crossed. And if you constrain the scrambling so that it only gives you projective, non-crossed orders, the same thing holds: human languages are still much shorter than this kind of control. What it means is that in every language we're trying to put things close together, relative to this kind of control. It doesn't matter about the word order: some of these languages are verb-final, some are verb-medial like English, and some are even verb-initial.
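The comparison just described can be sketched roughly as follows. This is a toy version under stated assumptions: the real study used parsed corpora in about 60 languages and also ran the projective-only control, which this sketch omits; the function names and the tiny example sentence are mine, not from the study.

```python
import random

def dep_length(heads, positions):
    """Total dependency length under a given linear order.
    heads[i] is word i's head index (None for the root);
    positions[i] is word i's position in the linear order."""
    return sum(abs(positions[i] - positions[h])
               for i, h in enumerate(heads) if h is not None)

def random_baseline(heads, n_samples=2000, seed=0):
    """Mean dependency length over fully random reorderings
    (the unconstrained scrambling control)."""
    rng = random.Random(seed)
    n = len(heads)
    total = 0
    for _ in range(n_samples):
        positions = list(range(n))
        rng.shuffle(positions)
        total += dep_length(heads, positions)
    return total / n_samples

# "two dogs entered the room":
# two -> dogs, dogs -> entered (root), the -> room, room -> entered
heads = [1, 2, None, 4, 2]
observed = dep_length(heads, list(range(len(heads))))  # attested order: 5
baseline = random_baseline(heads)                      # roughly 8 on average

# The attested order beats the random scramblings, which is the
# cross-linguistic pattern the study reports.
assert observed < baseline
```

Even on this five-word sentence the attested order is shorter than the scrambled average; over whole corpora, that gap is the dependency-length-minimization result.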
There are a few languages of the world that have VSO word order, verb-subject-object languages; we haven't talked about those, and it's like 10% of the world's languages. Even in those languages, it's still short dependencies; short dependencies rule. Okay, so what are some possible explanations for why languages have evolved that way? That's one of the disagreements, I suppose, you might have with Chomsky: you consider the evolution of language in terms of information theory, and for you the purpose of language is ease of communication, right, and processing? That's right. The story here is just about communication, really about production: it's about ease of production. When you say production, you mean? I just mean ease of language production. Whenever I'm talking to you, I'm somehow formulating some idea in my head and putting these words together, and it's easier for me to say something where the words are closely connected in a dependency, as opposed to separated by putting something in between, over and over again; it's just hard for me to keep that in my head. That's the whole story, basically. The dependency grammar sort of gives that to you: long is bad, short is good, because it's easier to keep in mind. You have to keep it in mind, probably, for production? It matters in comprehension as well; it's on both sides, production and comprehension. But I would guess it probably evolved for production: what's easier for me to say ends up being easier for you also. It's very hard to disentangle this idea of who it's for. Is it for me, the speaker, or for you, the listener? I mean, part of my language is for you: the way I talk to you is going to be different
from how I talk to different people, so I'm definitely angling what I'm saying to whoever I'm saying it to. It's not like I'm talking the same way to every single person, so I am sensitive to my audience. But how does that work itself out in the dependency-length differences? I don't know; maybe that's just about the words, which words I select. My initial intuition is that you optimize language for the audience. But it's both, and it's just kind of messing with my head a little bit to say that the primary objective of the optimization might be ease of production. We have different senses, I guess: I'm very selfish and you're not. I think it's all about me; I'm just doing what's easiest for me. Of course, I have to choose the words that I think you're going to know; I'm not going to choose words you don't know, in fact I'm going to fix that. So that part is about you, but maybe for the syntax, for the combinations, it's just about me. I don't know, though. Wait, the purpose of communication is to be understood, to convince others and so on, so the selfish thing is to be understood. It's circular there too, then. Right, the ease of production helps me be understood, then; I don't think it's circular. I think the primary objective is about the listener, because otherwise, if you're optimizing for ease of production, you're not going to have any of the interesting complexity of language. Let's control for what it is I want to say: I'm saying, let's control for the thing, the message. Control for the message? The message needs to be understood; that's the goal. Oh, but that's the meaning; I'm still talking about the form, just the form of the meaning. How do I frame
the form of the meaning is all I'm talking about. You're talking about a harder thing, I think. Let's keep the meaning constant: if you keep the meaning constant, how can I phrase whatever it is I need to say? I've got to pick the right words, and I'm going to pick the order, so that it's easy for me. That's what I think. I think I'm still tying meaning and form together in my head, but you're saying, if you keep the meaning of what you're saying constant, the primary objective of that optimization could be ease of production. That's interesting. I'm struggling to keep the meaning constant; I'm a human, right, so for me, without having introspected on this, the form and the meaning are tied together deeply, because I haven't thought about language in a rigorous way, about the form of language. Look, for any event there's an unbounded, I don't want to say infinite, but sort of unbounded number of ways I might communicate that same event. "Two dogs entered a room": I can say that in many, many different ways. "Hey, there's two dogs; they entered the room." "Hey, the room was entered by something; the thing that was entered was two dogs." It's kind of awkward and weird, but those are all similar messages with different forms, different ways I might frame it. And of course I used the same words there every time; I could have referred to the dogs as a Dalmatian and a poodle, I could have been more specific or less specific about what they are, and I could have been more abstract about the number. So I'm trying to keep the meaning, which is this event, constant, and then, how am I going to describe it to get it to you? It kind of depends on what you need
to know, right, and what I think you need to know. But let's control for all that stuff; I'm doing something simpler than what you're describing, which is just forms, just words. So specifying the species, the breed of dog, and whether they're cute or not is changing the meaning? Yeah, that would be changing the meaning, for sure. But even if we keep that constant, we can still talk about what's easy or hard for me: which phrase structures I use, which combinations. This is so fascinating, a really powerful window into human language, but I still wonder how vast the gap is between meaning and form. I have this maybe romanticized notion that they're close together, that they evolved hand in hand, that you can't simply optimize for one without the other being in the room with us. Well, it's kind of like an iceberg: form is the tip of the iceberg, and the meaning is the rest of the iceberg. But I think that's why these large language models are so successful: they're good at form, and form isn't that hard in some sense, while meaning is tough still, and that's why they don't understand what they're doing. We're going to talk about that later, maybe, but forget about large language models: in humans we can distinguish between language, which is a communication system, and thinking, which is meaning. Language is a communication system for the meaning; it's not the meaning itself. And there's a lot of interesting evidence we can talk about relevant to that. Well, that's a really interesting question: what is the
difference between language, written or communicated, versus thought? Well, you or anyone has to think of a task which you think is a good thinking task, and there are lots and lots of tasks which should be good thinking tasks: playing chess, playing some game, doing complex puzzles, maybe remembering some digits, maybe just listening to music. There are a lot of different tasks we might think of as thinking. There's this woman in my department, Ev Fedorenko, and she's done a lot of work on this question about the connection between language and thought. She uses, as I was referring to earlier, fMRI; that's her primary method, and she has been really fascinated by this question of what language is. As I mentioned earlier, you can localize my language area, your language area, in a few minutes, like 15 minutes: I can listen to language, listen to non-language or backward speech or something, and we'll find a left-lateralized network in my head which is very sensitive to language, as opposed to whatever that control was. Can you specify what you mean by language? Like communicated language? Just sentences: I'm listening to English of any kind, a story, or I can read sentences, anything at all that I understand. If I understand it, it'll activate my language network. Right now my language network is going like crazy while I'm talking and while I'm listening to you, because we're communicating. And that's pretty stable? Yeah, it's incredibly stable. I happen to be married to this woman, Ev Fedorenko, so I've been scanned by her over and over since 2007 or 2006 or something, and my language network is exactly the same, you know, a month ago as
it was back in 2007. It's amazingly stable; it's astounding. It's a really fundamentally cool thing: my language network is like my face, okay, it's not changing much over time inside my head. Can I ask a quick question, a small tangent: at which point, as you grow up from baby to adult, does it stabilize? We don't know; that's a very hard question. They're working on that right now, because of the problem of scanning little kids, trying to do the localization on little children in the scanner. You're lying in the fMRI scanner, which is the best way to figure out where something's going on inside our brains, and the scanner is loud, and you're in this tiny little claustrophobic area. It doesn't bother me at all, I can go to sleep in there, but some people are bothered by it, and little kids don't really like it, and they don't like to lie still. And you have to be really still, because if you move around, that messes up the coordinates of where everything is. So your question is how and when language develops, how this left-lateralized system comes into play, and it's really hard to get a two-year-old to do this task. But you can get three-, four- and five-year-olds to do it for short periods, and it looks like it's there pretty early. So clearly, in the lead-up to a baby's first words, there's a lot of fascinating turmoil going on: figuring out what these people are saying, trying to make sense of how it connects to the world, and all that kind of stuff. That might be fascinating development happening there, hard to introspect. Anyway, we're back to the scanner, and I can find my network in 15 minutes, and now we can ask: find my network, find yours, find
20 other people's, have them do this task, and then we can do some other tasks, anything else you think is thinking. I can do a spatial memory task, I can do a music perception task, I can do a programming task, if I program, where I understand computer programs. And none of those tasks taps the language network at all. Like, at all: there's no overlap. They're highly activated in other parts of the brain. There's a bilateral network, which she tends to call the multiple demand network, which does anything kind of hard: anything that's difficult in some way will activate that multiple demand network. And music will be in some music area; there are music-specific kinds of areas. But none of them activates the language area at all, unless there are words: if you have music and there's a song and you can hear the words, then you get the language area. We're talking about speaking and listening, but are we also talking about reading? This is all comprehension of any kind, and that is fascinating: it makes no difference to this network whether it's written or spoken. The thing that Fedorenko calls the language network is high-level language: it's not about spoken language specifically, and it's not about written language specifically; it's about either one of them. When you do speech, you listen to speech and you subtract away some language you don't understand, or you subtract away backward speech, which sounds like speech but isn't, so you take away the sound part. And if you do written, you get exactly the same network: for reading the language versus reading sort of nonsense words, you'll find exactly the same network. So this is about the high-level
comprehension of language, yeah, in this case. And the same thing happens in production; production is a little harder to run in the scanner, because you have to figure out a task such that you're doing some kind of production, and I can't remember exactly what they've done, a bunch of different kinds of tasks where you get people to produce things, but the same network goes on there, exactly the same place. Wait, so what if you read random words, like gibberish? Like Lewis Carroll's "'Twas brillig," the Jabberwocky? Right, they call that Jabberwocky speech. The network doesn't get activated as much; there are words in there, function words and stuff, so it's lower activation. Fascinating. Yeah, so basically, the more language-like it is, the higher the activation in the language network. And that network is there from as soon as you learn language, and if you speak multiple languages, the same network is going for all of them: if you speak English and you speak Russian, both of them are hitting that same network, if you're fluent in those languages. But programming, not at all. Isn't that amazing? Even if you're a really good programmer, that is not a human language; it's just not conveying the same information, and so it is not in the language network. Is that as mind-blowing as I think it is? That's weird. It's amazing. So that's one set of data. Hers shows that what you might think of as thinking is not language; language is just this conventionalized system that we've worked out in human languages. Another fascinating tidbit is that even these constructed languages, like Klingon, or, I don't know, the languages from Game of Thrones, I'm sorry, I don't remember those languages,
maybe a lot of people are offended right now; there are people that speak those languages, they really speak them. Because the people who wrote the languages for the shows did an amazing job of constructing something like a human language, those light up the language area, because speakers can express pretty much arbitrary thoughts in them. It's a constructed human language, and probably it's related to human languages, because the people constructing them were making them like human languages in various ways, but it also activates the same network, which is pretty cool. Sorry to go to a place that may be a little bit philosophical, but is it possible that this area of the brain is doing some kind of translation into a deeper set of almost-concepts? It has to be doing that. It's doing communication, right: it is translating from thought, whatever that is, which is more abstract; that is kind of what it is doing. It's kind of a meaning network, I guess, like a translation network. But I wonder what's at the core, at the bottom of it. What are thoughts? Are thoughts and words neighbors, or is it one turtle sitting on top of the other? Meaning, is there a deep set of concepts? Well, there are connections between what these things mean, and then there are probably other parts of the brain for what these things mean, so when I'm talking about whatever it is I want to talk about, it'll be represented somewhere else; that knowledge of whatever it is will be represented somewhere else. I wonder if there's some stable, nicely compressed encoding of meanings that's separate from language. I guess the implication here is that we don't think in language. That's
correct. Isn't that cool? And that's so interesting. This is hard to do experiments on, but there is this idea of an inner voice, and a lot of people have an inner voice. If you do a poll on the internet and ask whether you hear yourself talking when you're just thinking, about 70 or 80% of people will say yes; most people have an inner voice. I don't, and so I always find this strange. When people talk about an inner voice, I always thought it was a metaphor, and I know most of you listening to this think I'm crazy now, because I don't have an inner voice and I just don't know what you're listening to. It sounds kind of annoying to me, to have this voice going on while you're thinking, but I guess most people have that, and I don't, and we don't really know what it connects to. I wonder if the inner voice activates that same network. I wonder; I don't know. It could be speechy, right? Do you have an inner voice? I don't think so. Oh. A lot of people have this sense that they hear themselves, and say they read someone's email: I've heard people tell me that they hear that other person's voice when they read their emails, and I'm like, wow, that sounds so disruptive. I do think I vocalize what I'm reading, but I don't think I hear a voice. Well, then you probably don't have an inner voice. People have this strong percept of hearing sound in their heads when they're just thinking. I refuse to believe that's the majority of people. The majority, absolutely: it's like two-thirds or three-quarters. Whenever I ask my class, and when I went on the internet, they always say that; so you're in the minority. It could be a self-report flaw. You know, when I'm reading, inside my head I'm kind of saying
the words, which is probably the wrong way to read, but I don't hear a voice; there's no percept of a voice. I refuse to believe the majority of people have one. Anyway, it's fascinating; the human brain is fascinating. But it still blew my mind that language, comprehension, does appear to be separate from thinking. So that's one set of data from Fedorenko's group: no matter what task you do, if it doesn't have words and combinations of words in it, it won't light up the language network; it'll be active somewhere else, but not there. And then there's this other piece of evidence relevant to that question. It turns out there's a group of people who've had a massive stroke on the left side which wiped out their language network, and as long as it didn't wipe out everything on the right as well, in which case they wouldn't be cognitively functional, if it just wiped out language, which is pretty tough to do because it's very expansive on the left, then there are these patients, so-called global aphasics, who can do any task just fine, but not language. You can't talk to them: they don't understand you, they can't speak, they can't write, they can't read. But they can play chess, they can drive their cars, they can do math, all kinds of other stuff. Math is not in the language area, for instance: you do arithmetic and such, and that's not the language area; it's got symbols, so people sort of confuse symbolic processing with language, but symbolic processing is not the same. There are symbols, and they have meaning, but it's not language; it's not a conventionalized language system. So math isn't there, and these patients can do math: they do just as well as their age-matched controls on all these tasks. This is Rosemary Varley, over at University College
London, who has a bunch of patients for whom she's shown this. That combination suggests that language isn't necessary for thinking. It doesn't mean you can't think in language; you could think in language, because language allows a lot of expression, but you don't need it for thinking. It suggests that language is a separate system. This is kind of blowing my mind right now; I'm trying to load that in, because it has implications for large language models. It sure does, and they've been working on that. Well, let's take a stroll there. You wrote that the best current theories of human language are arguably large language models. So this has to do with form. It's kind of a big theory, but the reason it's arguably the best is that it does the best at predicting what's English, for instance. It's incredibly good, better than any other theory. But there's not enough detail; it's opaque, you don't know what's going on, it's another black box. Still, I think it is a theory. What's your definition of a theory? Because it's a gigantic black box with a very large number of parameters controlling it; to me, a theory usually requires simplicity, right? Maybe I'm just being loose there. I think it's not a great theory, but it's a theory. It's a good theory in one sense, in that it covers all the data: anything you want to say in English, it handles, and that's how it's arguably the best. No other theory is as good as a large language model at predicting exactly what's good and what's bad in English. Now, is it a good theory? Well, probably not, because I want a smaller theory than that; it's too big. I agree.
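The sense in which a language model is a "theory that predicts what's English" is that it assigns higher probability to attested word orders than to scrambled ones. A toy bigram model makes the shape of that idea concrete; this is purely my illustration of the predict-the-form notion, not a claim about how large language models work internally, and the tiny "corpus" is invented.

```python
import math
from collections import Counter

# Tiny stand-in for training data (illustrative only).
corpus = ("two dogs entered the room . the boy cried . "
          "the dogs entered the room .").split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
vocab = len(unigrams)

def log_prob(sentence):
    """Add-one-smoothed bigram log-probability: a crude 'theory of
    English' that scores word orders it has seen above ones it hasn't."""
    words = sentence.split()
    score = 0.0
    for prev, word in zip(words, words[1:]):
        score += math.log((bigrams[(prev, word)] + 1) /
                          (unigrams[prev] + vocab))
    return score

good = log_prob("the dogs entered the room")
bad = log_prob("room the entered dogs the")
assert good > bad  # the attested order is scored as more "English"
```

A large language model does vastly better at this kind of scoring, which is what makes it "arguably the best" predictor of the form, while remaining a very large, opaque theory.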
Could you construct a mechanism by which it would generate a simple explanation of a particular language, like a set of rules? It could generate a dependency grammar for a language, right? Yes. You could probably just ask it about itself. Well, that presumes, and there's some evidence for this, that some large language models are implementing something like dependency grammar inside them. There's work from a guy called Chris Manning and colleagues over at Stanford in natural language processing. They looked at, I don't know how many large language model types, but certainly BERT and some others, and you do some kind of fancy math to figure out exactly what kinds of abstractions or representations are going on, and they were saying it does look like dependency structure is what the models are constructing; it's actually a very, very good map. So they are constructing something like that. Does it mean that they're using it for meaning? Probably, but we don't know. You wrote that the kinds of theories of language that LLMs are closest to are called construction-based theories. Can you explain what construction-based theories are? It's just a general theory of language such that there's a form and meaning pair for lots of pieces of the language. Construction grammar is primarily usage-based: it's trying to deal with the things that people actually say and actually write. And what's a construction? A construction is either a simple word, so a morpheme plus its meaning, or a combination of words; it's basically combinations of words, like the rules. But it's unspecified as to what the form of the grammar is underlyingly, and I would argue that dependency grammar is maybe the right form to use for those types of construction
grammar. Construction grammar typically isn't quite formalized, so maybe a formalization of it would be dependency grammar. I would think so, but it's up to other researchers in that area whether they agree or not. So do you think that large language models understand language, or are they mimicking language? I guess the deeper question is whether they're just understanding the surface form or they understand something deeper about the meaning that then generates the form. I would argue they're doing the form, and doing it really, really well. Are they doing the meaning? No, probably not. There are lots of examples from various groups showing that they can be tricked in all kinds of ways; they really don't understand the meaning of what's going on, and there are a lot of examples those groups have given which show that. So the Monty Hall problem is this silly problem, right? "Let's Make a Deal" is an old game show: there are three doors, and there's a prize behind one, and there are junk prizes behind the other two, and you're trying to select one. Monty Hall knows where the good prize is; he knows everything that's back there, and he gives you a choice. You choose one of the three, and then he opens one of the other doors, and it's some junk prize, and the question is, should you trade to get the other one? And the answer is yes, you should trade, because he knew which ones he could turn around, and so now the odds are two-thirds. And then, if you just change that a little bit: the large language model has seen that explanation so many times that if you change the story a little bit, so that it sounds like the Monty Hall problem but it's not,
you just say: there are three doors, and there's a good prize behind one and two bad doors, and I happen to know the good prize, the car, is behind door number one, so I'm going to choose door number one. Monty Hall opens door number three and shows me nothing's there. Should I trade for door number two, even though I know the good prize is behind door number one? And the large language model will say yes, you should trade, because it just goes through the forms it's seen so many times before: yes, you should trade, because your odds have shifted from one in three to two in three. It doesn't have any way to register that you actually have 100% certainty about door number one. That's not part of the scheme it's seen hundreds and hundreds of times before, and even if you try to explain to it that it's wrong, it'll just keep giving you back the same answer. But it's also possible that a larger language model would be aware of the fact that there's sometimes over-representation of a particular kind of formulation, and that it's easy to get tricked by that, so you could see larger and larger models being a little more skeptical of over-representation. It just feels like training on form can go really far in terms of being able to generate things that look like the thing understands, deeply, the underlying world model, the kind of mathematical, physical, psychological world that would generate these kinds of sentences. It feels like you're creeping close to the meaning part. Easily fooled, all this kind of stuff, but that's humans too. It just seems really impressive how often it seems like it understands concepts. You don't have to convince me of that; I am very, very impressed.
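The contrast is easier to see against the standard version of the puzzle, which can be checked by brute force. This is a minimal sketch (the function names and trial count are mine, not from the conversation); the two-thirds advantage only holds because the host knowingly opens a junk door:

```python
import random

def monty_hall_trial(switch: bool) -> bool:
    """One round of Let's Make a Deal; returns True if we win the car."""
    doors = [0, 1, 2]
    prize = random.choice(doors)
    pick = random.choice(doors)
    # Monty, who knows where the prize is, opens a junk door
    # that is neither our pick nor the prize.
    opened = random.choice([d for d in doors if d not in (pick, prize)])
    if switch:
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == prize

n = 100_000
stay = sum(monty_hall_trial(False) for _ in range(n)) / n
switch = sum(monty_hall_trial(True) for _ in range(n)) / n
print(f"stay: {stay:.2f}  switch: {switch:.2f}")  # roughly 0.33 vs 0.67
```

In Gibson's variant the player already knows the prize is behind door one, so none of the assumptions above apply; the model's failure is reciting this memorized pattern in a situation where it no longer holds.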
I mean, you're giving a possible world where maybe someone's going to train some other version such that it'll somehow abstract away from types of forms. I don't think that's happened. No, no, I'm not saying that. I think when you just look at anecdotal examples, just showing a large number of cases where it doesn't seem to understand, where it's easily fooled, that does not seem like a scientific, data-driven analysis of how many places it's damn impressive in terms of meaning and understanding versus how many places it's easily fooled. That's not the inference. Yeah, and I don't want to make that inference. The inference I'm trying to push is just: is it like humans here? It's probably not like humans here; it's different. Humans don't make that error. If you explain that to them, they're not going to make that error. So that's something it's doing differently from what humans do in that case. What's the mechanism by which humans figure out that it's an error? I'm just saying the error there is: if I explain to you that there's a 100% chance the car is behind this door, do you want to trade? People say no, but this thing will say yes, because it's so wound up on the form. That's an error a human doesn't make, which is kind of interesting. Is less likely to make, I should say. Yeah, less likely, because humans are... Oh yeah, you're asking a system to understand 100%, you're asking it about some mathematical concepts. Look, in the places where large language models shine, the form is amazing. So let's go back to nested structures, center-embedded structures. If you ask a human to complete those, they can't do it. Neither can a large language model. They're just like humans in that if
you ask. If I ask a large language model to complete one... That's fascinating, by the way, that it struggles with center embedding exactly like humans do, in exactly the same way, and that's not something it was trained to do. So that is a similarity. But that's not meaning, right? That's form. When we get into meaning, that's where they get messed up: when you start asking what's behind this door, the thing I want. Humans don't mess that up as much. The form, the match on form, is amazing and similar without being trained to do that. I mean, it's trained in the sense that it's getting lots of data, just like human data, but it's not being trained on bad sentences and being told what's bad. It just can't do those. It'll actually say things like, those are too hard for me to complete, which is kind of interesting, actually. How does it know that? I don't know. But it doesn't just complete sentences; it very often says stuff that's true, and sometimes says stuff that's not true, and almost always the form is great. It's still very surprising that with really great form it's able to generate a lot of things that are true, based on what it's trained on. So it's not just form that it's generating; it's mimicking true statements. That's right, from the internet. I guess the underlying idea there is that on the internet, truth is over-represented versus falsehoods. I think that's probably right. But the fundamental thing it's trained on, you're saying, is just form. I think so, yeah. Well, to me that's still a little bit of an open question. I probably lean toward agreeing with you, especially now that you've blown my mind that there's a separate module in the brain for language versus thinking. Maybe there's
a fundamental part missing from the large-language-model approach that lacks the thinking, the reasoning capability. Yeah, that's what this group argues. The same group, Fedorenko's group, has a recent paper arguing exactly that. There's a guy called Kyle Mahowald, who's here in Austin, Texas, actually. He's an old student of mine, in linguistics at Texas, and he was the first author on that. That's fascinating, but still, to me, an open question. Yeah. What are the interesting limits of LLMs? I don't see any limits to their form. Their form is impressive. Perfect? Yeah, it's close to being... Well, you said the ability to complete center embeddings. It's just the same as humans, it seems. But that's not perfect, right? It should be better. No, I want it to be like humans. I'm trying to build a model of humans. Oh wait, wait, so perfect performance is as close to humans as possible. I got it. But if you're not human, if you're superhuman, you should be able to complete center-embedded sentences, right? I think it's really interesting that it can't, because it's potentially underlyingly modeling something like the way humans process the form of human language, how humans process and generate language. I think that's plausible. That's fascinating. So in that sense they're perfect. If we can just linger on the center-embedding thing: it's hard for LLMs to produce, and that seems really impressive, because it's hard for humans to produce. How does that connect to the thing we've been talking about before, the dependency-grammar framework in which you view language, and the finding that short dependencies seem to be a universal property of language? So why is it hard to complete center embeddings?
What I like about dependency grammar is that it makes the cognitive cost associated with longer-distance connections very transparent. It turns out there is a cost associated with producing and comprehending connections between words that are not beside each other: the further apart they are, the worse it is. We can measure that. Can you linger on what you mean by cognitive cost and how you measure it? You can measure it in a lot of ways. The simplest is just asking people how good a sentence sounds. That's one way, and you try to triangulate across sentences and across structures to figure out the source of the difficulty. You can look at reading times in controlled materials, where we can measure the dependency distances. And there's a recent study, since we're talking about the brain here: you can look at the language network, and at how big the activation in the language network is depending on the length of the dependencies. It turns out that in just random sentences, with people listening to stories, the longer the dependency, the stronger the activation in the language network. So there are a bunch of different measures we can use; that's a neat one, actually, actual activation in the brain. So in different ways you can convert it to a number. I wonder if there's a beautiful equation connecting cognitive cost and length of dependency, an E = mc² kind of thing. It's complicated, but it's probably doable. I would guess it's doable. You
know, I tried to do that a while ago and was reasonably successful, but for some reason I stopped working on it. I agree with you that it would be nice to figure out the cost function. It's complicated. Another issue you raised before was how you measure distance: is it words? It probably isn't. Part of the problem is that some words matter more than others. Nouns might matter more, and it may depend on which kind of noun: is it a noun that's already been mentioned? Is it a pronoun versus a name? All these things probably matter, so probably the simplest thing to do is forget about all that and just count words or morphemes. But there might be some insight in the kind of function that fits the data. Like a quadratic? I think it's an exponential, such that the longer the distance, the less each additional word matters, and then the total cost is the sum of those. That was our best guess a while ago: if you've got a bunch of dependencies being connected at some point, at the ends of those, the cost is some exponential function of the distances. The reason it's probably an exponential is that it's not just the distance between two words, because I can make a very, very long subject-verb dependency by adding lots and lots of noun phrases and prepositional phrases, and it doesn't matter too much. It's when you do nesting, when I have multiple of these, that things really go south. Probably somehow connected to working memory or something like that? Yeah, it's probably a function of memory. It's the access, trying to retrieve those earlier things, and it's kind of hard to figure out what was referred to earlier.
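Gibson's guess, a summed cost that grows but saturates exponentially with each dependency's length, can be sketched in a few lines. Everything concrete here (the decay rate, measuring distance in word positions, the example arcs) is an illustrative assumption, not his published model:

```python
import math

def sentence_cost(arcs, alpha=0.3):
    """Sum a saturating cost over dependency arcs.

    arcs: (head_position, dependent_position) pairs, word-counted.
    Each arc costs 1 - exp(-alpha * distance): extra distance matters
    less and less, so one long flat dependency stays fairly cheap,
    while several nested long arcs add up.
    """
    return sum(1 - math.exp(-alpha * abs(h - d)) for h, d in arcs)

# "The reporter who the senator attacked admitted the error."
#   1     2      3    4     5      6       7       8    9
nested = [(2, 7), (5, 6), (6, 2), (7, 9)]  # subject-verb arc spans the clause
# "The reporter admitted the error after the senator attacked him."
flat = [(2, 3), (3, 5), (8, 9), (9, 3)]
print(sentence_cost(nested) > sentence_cost(flat))  # True
```

With a cost like this, the center-embedded version always comes out more expensive than a paraphrase that keeps each head next to its dependent, which is the intuition being described.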
Those connections: that's the sort of notion of retrieval as opposed to a storage thing, trying to retrieve those earlier words depending on what was in between, and then we're talking about interference from similar things in between. The right theory probably has that kind of notion in it, interference from similar items. I'm dealing with an abstraction over the right theory, which is just counting words. It's not right, but it's close. And maybe you're right that there's some sort of exponential or something on top to figure out the total, so we could figure out a cost for any given sentence in any given language. It's funny, people haven't done that much, which I do think is... I'm interested that you find that interesting. I really find it interesting, and a lot of people haven't, and I don't know why I haven't gotten people to want to work on it. I really like that too. No, that's a beautiful idea, and the underlying idea is beautiful: that there's a cognitive cost that correlates with the length of the dependency. It feels deep. I mean, language is so fundamental to the human experience, and this is a nice, clean theory of language, where, wow, okay, we like our words close together, the dependent words close together. That's why I like it too. It's so simple. Yeah, the simplicity of it, and yet it explains some very complicated phenomena. If I write these very complicated sentences, it's kind of hard to know why they're so hard, and you can nail it down: I can give you a math formula for why each one of them is bad, and where, and that's kind of cool. I think that's very neat. Have you gone through the process? Like, you take a piece of text, there's an average length of dependency, and then you reduce it and see
comprehension improve, on an entire text, not just a single sentence, like going from James Joyce to Hemingway or something? No, the simple answer is no. There are probably things you can do in that kind of direction. That's fun. We're going to talk about legalese at some point, so maybe we'll talk about that kind of thinking applied to legalese. Well, let's talk about legalese, because you mentioned it as an exception. We're just taking tangent upon tangent. That's an interesting one you've given as an exception. It's an exception: you say that most natural languages, as we've been talking about, have local dependencies, with one exception, legalese. That's right. So what is legalese, first of all? Well, legalese is what you think it is. It's just legal language. I actually know very little about the kind of language that lawyers use with each other; I'm just talking about the language in laws and in contracts. Got it, so the stuff we have to run into every day, and skip over because it reads poorly. Or, you know, partly it's just long. There's a lot of text there that we don't really want to know about. The thing I'm interested in: I've been working with this guy called Eric Martinez, a lawyer who was taking my class. I was teaching a psycholinguistics lab class, which I've taught for a long time at MIT. He was a law student at Harvard, and he took the class because he had done some linguistics as an undergrad, and he was interested in the problem of why legalese sounds hard to understand. Why is it hard to understand, and why do they write that way if it is so hard to understand? It seems apparent that it's hard to understand; the question is why. We didn't know, so we did an evaluation of a bunch of contracts. We just took a bunch of random contracts, because, I
don't know, contracts and laws might not be exactly the same, but contracts are kind of the things most people have to deal with most of the time. That's the most common legal text that adults in our industrialized society have to deal with a lot, so that's what we pulled. We didn't know what was hard about them, but it turns out that the way they're written is very center-embedded; it has nested structures in it. It has low-frequency words as well. That's not surprising, lots of texts have low-frequency words, but it has surprisingly low-frequency words even compared to other kinds of control texts, even academic text. Legalese is even worse; it is the worst that we were able to find. You just revealed the game that lawyers are playing. They're optimizing a different... Well, now you're getting at why, and you're saying they're doing it intentionally. I don't think they're doing it intentionally, but let's... It's an emergent phenomenon? Okay, yeah, we'll get to that. But we wanted to see what was hard first, because it turns out we're not the first to observe that legalese is weird. Back in 1970 Nixon had a plain-language act, and Obama had one, and a lot of presidents have said, oh, we've got to simplify legal language, we must simplify it. But if you don't know how it's complicated, it's not easy to simplify it. You need to know what it is you're supposed to do before you can fix it. You need a psycholinguist to analyze the text and see what's wrong with it before you can fix it. How am I supposed to fix something if I don't know what's wrong with it? So that's what we did. We took a bunch of contracts, and
we encoded them for a bunch of features. One of them was center embedding: basically, how often a clause would intervene between a subject and a verb, for example. That's one kind of center embedding of a clause. It turns out they're massively center-embedded. In random contracts and in random laws, I think about 70% of sentences have a center-embedded clause, which is insanely high. If you go to any other text, it's down to 20% or something. It's so much higher than any control you can think of. People think, oh, technical academic text, but no, people don't write center-embedded sentences in technical academic text. They do a little bit, but it's in the 20% to 30% range, as opposed to 70%. So there's that, and there are the low-frequency words, and then people say, maybe it's the passive. People don't like the passive; the passive voice in English has a bad rap, and I'm not really sure where that comes from. And there is much more passive voice in legalese than in control texts. Does passive voice account for some of the low-frequency words? No, those are separate features. So passive voice sucks, low-frequency words suck... Suck is different; drop the judgment. These are just features that are frequent in legal texts. Then the dependent measure is how well you understand texts with those features. And it turns out the passive makes no difference: it has zero effect on your comprehension ability, on your recall ability, nothing at all. The words matter a little bit; low-frequency words hurt you in recall and understanding. But what really hurts is the center embedding.
That kills you. That slows people down, gives them very poor understanding, and they can't recall what was said nearly as well. And we did this not only on lay people; we ran it on about 100 lawyers, recruited from a wide range of levels of law firms, and they show the same pattern. When we did this, I did not know what would happen. I thought maybe, being used to legalese, they'd process it just as well as if it were normal. No. They're much better than lay people overall, with much better recall and much better understanding, but they show exactly the same main effects as lay people. They also much prefer the non-center-embedded versions. We constructed non-center-embedded versions of each of these, we constructed versions with higher-frequency words in those places, and we un-passivized them, turned them into active versions. The passive-to-active change made no difference, the words made a little difference, and the un-center-embedding makes big differences in all the populations. How hard is it, by the way, to detect center embedding? Easy, easy to detect. Are you just looking at long dependencies? There are automatic parsers for English which are pretty good, and they can detect center embeddings, or I guess nested structures. Pretty much, yeah. So you're not just looking for long dependencies, you're literally looking for center embeddings. Yeah, in these cases, but they're highly correlated. A center embedding is a big bomb you throw inside a sentence; it just blows it up.
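Detecting a center embedding from a dependency parse really is as mechanical as Gibson suggests. A toy sketch over hand-annotated, Universal-Dependencies-style triples (a real pipeline would get these from an automatic parser; the sentence and labels here are my own example):

```python
def has_center_embedding(tokens):
    """tokens: list of (index, head_index, deprel) triples.

    Flags a sentence if one subject-verb arc is nested strictly inside
    another, i.e. a clause intervenes between a subject and its verb.
    """
    subj_arcs = [(i, h) for i, h, rel in tokens if rel == "nsubj"]
    for i, h in subj_arcs:
        lo, hi = sorted((i, h))
        for j, g in subj_arcs:
            if (j, g) != (i, h) and lo < j < hi and lo < g < hi:
                return True
    return False

# "The lawyer [the client hired] signed the contract."
#   1    2      3    4      5     6      7    8
toks = [
    (1, 2, "det"), (2, 6, "nsubj"),   # lawyer -> signed (outer arc)
    (3, 4, "det"), (4, 5, "nsubj"),   # client -> hired  (nested arc)
    (5, 2, "acl:relcl"),
    (6, 0, "root"), (7, 8, "det"), (8, 6, "obj"),
]
print(has_center_embedding(toks))  # True
```

On a flat sentence with a single subject-verb arc, the same function returns False, which is the 20%-versus-70% contrast the contract study was counting.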
Can I read a sentence for you from these things? This is just one of them. My eyes might glaze over mid-sentence. No, I understand; legalese is hard. Here it goes: "In the event that any payment or benefit by the company (all such payments and benefits, including the payments and benefits under Section 3(a) hereof, being hereinafter referred to as the Total Payments) would be subject to the excise tax, then the cash severance payments shall be reduced." That's something we pulled from a regular contract. Wow. And the center-embedded bit there: for some reason they throw the definition of what payments and benefits are in between the subject and the verb. How about don't do that? How about putting the definition somewhere else, as opposed to in the middle of the sentence? That's very common, by the way. That's what happens: you use a couple of words, then you define them, and then you continue the sentence. Just don't write like that. And so then we asked lawyers; we thought maybe lawyers like this. They don't. They don't want to write like this. We asked them to rate materials with the same meaning, un-center-embedded and center-embedded, and they much preferred the un-center-embedded versions, on the comprehension side, on the reading side. And we asked them, would you hire someone who writes like this or like this? We asked them all kinds of questions, and they always preferred the less complicated version, all of them. So I don't even think they want it this way. But how did it happen? That's a very good question, and the answer is, I still don't know, but I have some theories. Our best theory at the moment is that there's actually some kind of performative meaning in the center-embedded style, which tells you it's legalese. We think
that's a reasonable guess: it's a style which tells you it's legalese. It's like a magic spell, so we call this the magic-spell hypothesis. When you tell someone to put a magic spell on someone, what do they do? People know what a magic spell is: they do a lot of rhyming, some kind of poetry thing. Abracadabra type of thing. Exactly. And maybe there's a syntactic reflex of a magic spell here, which is center embedding. It's trying to tell you: this is something which is true, which is what the goal of law is, right? It's telling you something we want you to believe is certainly true. That's what legal contracts are trying to enforce on you. So maybe center embedding is a very abstract form which has that meaning associated with it. Well, don't you think there's an incentive for lawyers to generate things that are hard to understand? That was one of our working hypotheses; we just couldn't find any evidence of it. Lawyers also don't understand it. But you're creating a space... I mean, if you ask individual members in the Soviet Union, their self-reports are not going to correctly reflect what is broken about the gigantic bureaucracy that leads to Chernobyl or something like that. The incentives under which you operate are not always transparent to the members within the system. It just feels like a strange coincidence that there is a benefit, if you zoom out and look at the system as opposed to asking individual lawyers, to making something hard to understand: it's going to make a lot of people money. You're going to need a
lawyer to figure it out. I guess from the perspective of the individual it could be the performative thing, as opposed to incentive-driven complexity. It could be performative in the sense that we lawyers speak in this sophisticated way, and you regular humans don't understand it, so you need to hire a lawyer. I don't know which one it is, but it's suspicious that it's hard to understand, that everybody's eyes glaze over and they don't read it. I'm suspicious as well. I'm still suspicious, and I hear what you're saying. It could be that no individual, and even no average of individuals, is responsible; it could be a few influential bad apples driving the effect in some way, central figures that everybody looks up to. But it is interesting that among our 100 lawyers, they did not want this; they really didn't like it. They weren't better than regular people at comprehending it? They were better on average, but they showed the same difference, exactly the same difference. And they wanted it fixed, which gave us hope, because it actually isn't very hard to construct a material which is un-center-embedded and has the same meaning. In that particular example, you're just putting the definitions outside the subject-verb relation, and that's pretty general: what they're doing is throwing stuff in there that didn't have to go in there. There are typically a few extra words involved; you may need a few extra words to refer to the things you're defining outside in some way, because if you only use a term in that one sentence, there's no reason to introduce extra terms. So we might have a few more words, but it'll be easier to understand. So I
have hope that maybe we can make legalese less convoluted. So maybe the next president of the United States, instead of saying generic things, can say exactly: I ban center embeddings, and make Ted the language czar. Eric Martinez is the guy you should really put in there. But center embeddings are the bad thing to have, so if you get rid of those, that fixes a lot of it. That is so fascinating, on many fronts: that humans are just not able to deal with this kind of thing, and that language, because of that, evolved the way it did. So one of the mathematical formulations you use when talking about language as communication is this idea of noisy channels. What's a noisy channel? That's about communication, and it goes back to Shannon. Claude Shannon was a student at MIT in the 40s, and he wrote this very influential piece of work about communication theory, or information theory. He was actually interested in human language. He was interested in this problem of communication, of getting a message from my head to your head, and in what a robust way to do that would be. Assuming we both speak the same language, we both already speak English, what is a way I can encode the message so that the signal I want is most likely to get to you? The problem in communication, the noisy channel, is that there's a lot of noise in the system. I don't speak perfectly; I make errors. That's noise. There's background noise, literal background noise, like white noise in the background, or some other speech going on, or
you're at a party. That's background noise: you're trying to hear someone, and it's hard to understand them because of all the other stuff going on in the background. And then there's noise on the receiver side: you may have some problem understanding me for reasons that are internal to you in some way. Maybe you had too much to drink; who knows why you're not able to pay attention to the signal. So that's the noisy channel, and if language is a communication system, we are trying to optimize, in some sense, the passing of the message from one side to the other. One idea is that aspects of language, word order for example, might have been optimized in some way to make the message easier to pass from speaker to listener. Shannon's the guy who did this stuff way back in the 40s. It's very interesting historically: he was interested in working on linguistics. He was at MIT, and he did this as his master's thesis, of all things. It's crazy how much he did for his master's thesis, in 1948 I think, or '49. He wanted to keep working on language, but communication as a source of explanation for what language is just wasn't popular at the time. Chomsky was moving in, and Shannon just wasn't able to get a handle there, I think. So he moved to Bell Labs and worked on communication from a mathematical point of view, and did all kinds of amazing work. So he was more on the signal side versus the language side. Yeah. It would have been interesting to see if he had pursued the language side. He was interested in that: his examples in the 40s are very language-like things.
Yeah, we can kind of show that there's a noisy-channel process going on when you're listening to me: you can often guess what I meant given what I said. And with respect to why language looks the way it does, as I alluded to, there might be ways in which word order is somewhat optimized because of the noisy channel. That's really cool to model: if you don't hear certain parts of a sentence, or have some probability of missing a part, how do you construct a language that's resilient to that, somewhat robust to that? That's the idea. And then you're saying the word order, the syntax of the language, the dependency lengths, are all helpful. Well, the dependency-length part is really about memory, about what's easier or harder to produce and comprehend, and these other ideas are about robustness of communication, the potential loss of signal due to noise. There may be aspects of word order which are somewhat optimized for that, and we have one guess in that direction. These are kind of just-so stories, I have to be pretty frank. I can't show this is true. All we can do is look at the current languages of the world. We can't really see how languages change, because we've just got these snapshots of a few hundred or a few thousand languages, and we can't do the right kinds of modifications to test these things experimentally. So take this with a grain of salt. The dependency stuff I'm much more solid on: here's what the lengths are, here's what's hard, here's what's easy, and this is a reasonable structure. I think I'm on pretty reasonable ground there.
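The listener-side inference just described, guessing what was meant from what was heard, is usually modeled as Bayesian noisy-channel decoding: pick the intended sentence maximizing the prior times the likelihood of the noise. The sentences, probabilities, and edit-based noise model below are toy assumptions for illustration, not the details of any published model:

```python
# Prior over plausible intended sentences (toy language-model numbers).
prior = {
    "the mother gave the candle to the daughter": 0.99,
    "the mother gave the daughter to the candle": 0.01,
}

def likelihood(heard, intended):
    """Toy noise model: each word-position mismatch halves the probability."""
    edits = sum(a != b for a, b in zip(heard.split(), intended.split()))
    return 0.5 ** edits

heard = "the mother gave the daughter to the candle"
posterior = {s: prior[s] * likelihood(heard, s) for s in prior}
best = max(posterior, key=posterior.get)
print(best)  # the plausible reading wins despite two word "errors"
```

With a flatter prior the literal parse would win instead, which is the noisy-channel prediction: implausible sentences get silently "corrected" by the listener only when the prior against them is strong enough.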
Why does word order look the way it does? There we're into shakier territory, but it's kind of cool. Just to be clear, we're talking about maybe just the sounds of communication: you and I are sitting in a bar, it's very loud, and you model the noise with a noisy channel. We have the signal coming across that, and you're saying word order might have something to do with optimizing it in the presence of noise? Yes. It's really interesting to me how much you can load into the noisy channel, how much you can bake in. You mentioned cognitive load on the receiver end. We think there are at least three different kinds of things going on there, and we probably don't want to treat them all as the same. A better model of a noisy channel would have three different sources of noise: background noise, speaker-inherent noise, and listener-inherent noise. Those are all different things. Sure, but then underneath each there's a million other subsets. That's true; on the receiving side I just mentioned cognitive load, and on both sides there are things like speech impediments, or worldview, and then meaning will start to creep in, since we have different worldviews. Well, how about just form, still: what language you know, how well you know it, whether it's a second language for you versus a first language, and maybe what other languages you know. These are still just form things, and they're potentially very informative. And how old you are probably matters too, right? A child learning language has a noisy representation of English grammar, depending on how old they
are; so maybe by six it's perfectly formed. But you mentioned that one way to measure a language is against its learning problems. So what's the correlation between everything we've been talking about and how easy a language is to learn? Are short dependencies, or dependency grammar, connected to learnability? Well, as far as we know, none of the world's languages is any better than any other with respect to optimizing dependency lengths, for example; they all keep them low. I think of every human language as a solution to a complex optimization problem, this communication problem. They've all solved it; they're just noisy solutions to the problem of communication, and there are so many ways you can do it. So they're not optimized for learning? They're probably optimized for communication and for learning, so yes, learning is one of the factors, and learning is messing this up a bit. For example, if it were just about minimizing dependency lengths, and that was all that mattered, then we might find grammars which didn't have regularity in their rules; but languages always have regularity in their rules. What I mean by that is: if all that mattered to me were keeping the dependencies as close together as possible, then I would have a very lax set of phrase-structure rules, or dependency rules; I wouldn't have very many of those. I would just put the things that are connected right beside each other. But we don't do that; there are word-order
rules, and depending on the language they're more or less strict. You speak Russian; its rules are less strict than English's. English has very rigid word-order rules: we order things in a very particular way. Why do we do that? That's probably not about communication; that's probably about learning. It's probably easier to learn regular things, things which are very predictable. So that's probably about learning, is our guess, because it can't be about communication. Can it just be noise, the messiness of the development of a language? Well, if it were just communication, then we should have languages with very free word order, and we don't have that; we have freer, but not free. No, but what I mean by noise is cultural, sticky cultural things: the way you communicate has a stickiness to it; it's imperfect, it's stochastic. Yeah, the function over which you're optimizing is very noisy. Because it feels weird to say that learning is part of the objective function, since some languages are way harder to learn than others, right? Or is that not true? That's the public perception, and it's true for a second language, but that depends on what you started with: it really depends on how close the second language is to the first language you've got. So yes, it's very hard to learn Arabic if you started with English, and it's hard to learn Japanese or Chinese; the Defense Language Institute in the United States has a list of how hard it is to learn various languages starting from English, and I think Chinese is the worst. But that's second language. You're saying babies don't care? No, there's no evidence
that any language is harder or easier for a baby to learn: by three or four, they speak that language. There's no evidence of anything harder or easier about any human language; they're all kind of equal. So, returning to Chomsky a little bit, to what degree is language innate? You said that Chomsky used the idea that some aspects of language are innate to explain away certain observed things. How much are we born with language at the core of our mind, our brain? The answer is I don't know, of course, but I'm an engineer at heart, I guess, and I think it's fine to postulate that a lot of it's learned, so I'm guessing that a lot of it's learned. I think the reason Chomsky went with innateness is because he hypothesized movement in his grammar; he was interested in grammar, and movement is hard to learn. I think he's right that movement is a hard thing to learn, those two things together and how they interact; there are a lot of ways in which you might generate exactly the same sentences, and it's really hard. So he said: I guess it's not learned; it's innate. But if you just throw out the movement and think about it in a different way, then you get some messiness, but the messiness is human language, which actually fits better. That messiness isn't a problem; it's a valuable asset of the theory. So I don't really see a reason to postulate much innate structure, and that's kind of why I think these large language models are learning so well: I think you can learn the forms of human language from the input. I think that's likely to be true. So the part of the brain that lights up when you're doing all the comprehension, that could be learned? It could; you don't need
innateness for that; it doesn't have to be innate. Lots of stuff that's modular in the brain is learned. There's something called the visual word form area, in the back of your head near the visual cortex, and it's a very specialized brain area which does visual word processing, if you're a reader. If you don't read, you don't have it. Guess what: you spend some time learning to read, and you develop that brain area, which does exactly that. So modularization is not evidence for innateness. The modularization of a language area doesn't mean we're born with it; we could easily have learned it. We might have been born with it; we just don't know at this point. We might very well have been born with this left-lateralized area. There are a lot of other interesting features of this kind of argument. Some people have a stroke, or something goes really wrong, on the left side, where the language area would be, and when that isn't available, language develops just fine in the right hemisphere. So it's not about the left; it just usually goes to the left. This is a very interesting question: why are any of the brain areas the way they are, and how did they come to be that way? There are these natural experiments which happen, where people have strange events in their brains at very young ages which wipe out sections of their brain, and they behave totally normally, and no one knows anything was wrong. We find out later because they happen to be accidentally scanned for some reason: what's happened to your left hemisphere? It's missing! There aren't many people who are missing their whole left hemisphere, but they'll be missing some other section of their left or their right, and they behave
absolutely normally, and you would never know. That's very interesting current research. This is another project that Ev Fedorenko is working on: she's got all these people contacting her, because she had scanned some people who are missing sections of their brains. One person missing a section of her brain was scanned in her lab, and she happened to be a writer for the New York Times, and there was an article in the New York Times about the scanning procedure and about what might be learned from the general process of MRI and language. Because she wrote for the New York Times, all these people started writing to her who also have similar kinds of deficits: they'd been accidentally scanned for some reason, found out they're missing some section, and they volunteer to be scanned. These are natural experiments; they're kind of messy, but natural experiments are kind of cool. She calls them interesting brains. The first few hours, days, months of human life are fascinating, and inside the womb too: that development, that machinery, whatever it is, seems to create powerful humans that are able to speak, comprehend, think, all that kind of stuff; not no matter what, but robust to the different ways the brain might be damaged, and so on. That's really interesting. But what would Chomsky say about what you're saying now, that language seems to be happening separately from thought? Because, as far as I understand, maybe you can correct me, he thought that language underpins thought. He does; I don't know what he'd say, but he would be surprised, because for him the idea is that language is the foundation of thought. That's right, absolutely. And it's pretty mind-blowing to think that it could be completely separate from
thought. That's right. But he's basically a philosopher of language, in a way, thinking about these things. It's a fine thought, but you can't test it with his methods; you can't do a thought experiment to figure that out. You need a scanner, you need brain-damaged patients, you need ways to measure it, and that's what fMRI offers. Patients are a little messier; fMRI is pretty unambiguous, I'd say. There's no way to claim that the language network is doing any of these other tasks. You should look at those data: there's no chance you can say those networks are overlapping. They're not overlapping; they're completely different. And sure, with the patients it's only two or four people, and maybe there's something special about them we don't know; but these are just random people, and with lots of them you always find the same effects, and it's very robust. That's a fascinating effect. You mentioned Bolivia: what's the connection between culture and language? You've also mentioned that much of our study of language comes from WEIRD people: Western, educated, industrialized, rich, and democratic. When you study remote cultures, such as around the Amazon jungle, what can you learn about language? That term WEIRD is from Joe Henrich; he's a Harvard evolutionary biologist who works on lots of different topics, and he was basically pushing the observation that we should be careful about the inferences we want to make about humans, in psychology mostly, if we're only talking about undergrads at MIT and Harvard. Those aren't the same thing. So if you want to make inferences about language, for instance, you should remember that
there are a lot of other kinds of languages in the world besides English and French and Chinese. For language, we care about culture, because cultures can be very different: of course English and Chinese cultures are different, but hunter-gatherer cultures are much more different in some ways. So if culture has an effect on what language is, then we want to look there as well. It's not that the industrialized cultures aren't interesting, of course they are, but we want to look at non-industrialized cultures too. I've worked with two groups, both in the Amazon. The Tsimane' are in Bolivia; they're so-called farmer-foragers, which is not hunter-gatherers, it's one step up from hunter-gatherers, in that they do a little bit of farming as well as a lot of hunting. The kind of farming they do is the kind I might do if I ever were to grow tomatoes or something in my backyard: it's not big-field farming, it's farming for a family, a few things. The other group I've worked with are the Pirahã, also in the Amazon, but in Brazil. That work is with a guy called Dan Everett, a linguist and anthropologist who actually lived and worked there; he was a missionary, initially, back in the '70s, working on translating languages so they could teach people the Bible, teach them Christianity. What can you say about those languages? The two groups I've worked with, the Tsimane' and the Pirahã, both speak isolate languages, meaning there are no known connected languages at all; they're on their own. There are a lot of those, and most of the isolates occur in the Amazon or in Papua New Guinea, places where the world has
sort of stayed still for long enough. There are no earthquakes, certainly not in the Amazon jungle, and the climate isn't bad, so you don't have droughts. In Africa, you get a lot of movement of people because of drought problems, and so you get a lot of language contact: when you have to move because you've got no water, you've got to get going, and then you run into other tribes, other groups. In the Amazon that's not the case, so people can stay there for hundreds and probably thousands of years. These groups, the Tsimane' and the Pirahã, are both isolates in that sense: I guess they've just lived there for ages with minimal contact with outside groups. I'm interested in them because, in these cases, I'm interested in their words. I would love to study their syntax, their word orders, but I'm mostly interested in how languages are connected to their cultures. With the Pirahã, the most interesting thing I was working on was number information. The basic idea, what I get from the words here, is that I think language is invented; we talked about color earlier, and it's the same idea. What you need to talk about with someone else is what you're going to invent words for. We invent labels for the colors we need: not the ones we can see, but the things we need to tell each other about, so that I can get the right objects from you, or get you to give me the right objects. I just don't need a word for teal or aquamarine in the Amazon jungle, for the most part, because I don't have two things which differ only on those colors. And so numbers are really another fascinating
source of information here. Naively, I certainly thought that all humans would have words for exact counting, and the Pirahã don't. They don't have a word even for one; there's no word for one in their language, so there's certainly no word for two, three, or four. That kind of blows people's minds. Yeah, that's blowing my mind. That's pretty weird: how are you going to ask for two of those? You just don't; that's not a thing you can possibly ask in Pirahã. It's not possible; there are no words for it. Here's how we found this out. It was thought to be a "one-two-many" language: there are three quantifier words for sets, and people had thought those meant one, two, and many. But what they really mean is few, some, and many; "many" is correct, but it's few, some, and many. The way we figured this out, and this is kind of cool, is that we had a set of objects; these happened to be spools of thread, but it doesn't really matter what they are, just identical objects. I start off and give you one of those and say, what's that? You're a Pirahã speaker, and you tell me what it is. Then I give you two and say, what's that? Nothing's changing in the set except the number, and I just ask you to label these things. We did this with a bunch of different people, and frankly, as a task it's a little bit weird. They say the word we thought meant "one", which is "few", for the first one; then maybe they say "few" or maybe "some" for the second; and then for the third or fourth they start using the word "many" for the set. Then at five, six, seven, eight, all the way to ten, it's always the same word, and they look at me like I'm stupid, because they've already told me what the word was for six, seven, eight, and I'm going
to continue asking at nine and ten. They understand that I want to know their language, that's the point of the task, so that's okay, but it does seem like I'm a little slow, because they already told me the word for "many" at five, six, seven, and I keep asking. So it's a little funny to do this task over and over. We did this with Dan as our translator; he's the only one who really speaks Pirahã fluently, a good bilingual in a bunch of languages, including English and Pirahã. A guy called Mike Frank, who was also a student with me down there, did these things with me. Everyone does the same thing: we asked about ten people, and they all do exactly the same labeling going up. Then we do the same thing going down; actually, we do some of them up first and some down first. So instead of one to ten, we go from ten down to one: I give them ten, nine, eight, and they start saying the word for "some", and when you get down to four, everyone is saying the word for "few", which we thought meant one. So the context determined which quantifier they used. They're not count words; they're just approximate words. And they're going to be noisy when you interview a bunch of people: the definition of "few" will have a threshold that depends on the context. Yeah, and I don't know what it means exactly; it's going to depend on context. I think that's true in English too: if you ask an English speaker what "a few" is, it depends completely on the context. It might actually be hard to discover at first, because for a lot of people the jump from one to two will count as "few". Yeah, it might still be there. It's fascinating that
numbers don't present themselves. Yeah, the words aren't there. So then we did these other things: if they don't have the words, can they do exact-matching kinds of tasks? Can they even do those tasks? The answer is sort of yes and no. Yes, they can do some of them. Here are the tasks we did. We put out those spools of thread again, say three of them, and we gave them some objects; those happened to be uninflated red balloons. It doesn't really matter what they are; it's just a bunch of identical things that were easy to put down right next to the spools of thread. So I put out three spools, and your task is just to put one balloon against each of my three things, and they could do that perfectly. It was a very easy task to explain, because I did this with Mike Frank: I'd be the experimenter, showing him what to do, and then we'd just say "do what he did", copy him. I didn't have to speak Pirahã except to be able to say "copy him, do what he did", and then they would do it perfectly. We'd move up through some random number of items, up to ten, and they basically do perfectly on that; they never get it wrong. It's not a counting task; it's just matching. You put one against each; you don't need to know how many there are to do it correctly. They would make mistakes, but very few, and no more than MIT undergrads would; these are low-stakes tasks, so you make the occasional mistake. Counting is not required to complete the matching task. That's right, not at all. So that's our control. A guy had gone down there before and said they couldn't do this task, but I just don't know what he did wrong, because they can do
this task perfectly well. I could train my dog to do this task, so of course they can do it; it's not a hard task. The other tasks were more interesting: we did a bunch of tasks where you need some way to encode the set. In one of them, I put down a set of these things and then put an opaque sheet in front of them, so you can't see them anymore, and I tell you to do the same thing you were doing before. It's easy if it's two or three, but if you don't have words for eight, it's a little harder. Maybe with practice? Well, no: for us it's easy, because we just count them, it's so easy to count them. But they can't count them, because they don't count; they don't have words for this. So they would approximate; it's totally fascinating. They would get the sets approximately right after four or five, because up to three or four you always get it right, that's something we can apprehend visually, but after that you have only approximate number. There were a bunch of tasks like this, and on all of them they, I don't want to say failed, they did approximately after five. It kind of shows that you need the words to be able to do these kinds of tasks. There's a bit of a chicken-and-egg thing there, because if you don't have the words, then maybe that limits you: a little baby Einstein there wouldn't be able to come up with a counting system. The ability to count enables you to come up with interesting things, probably. So yes, you develop counting because you need it, but then once you have counting, you can probably come up with a bunch of different inventions, like, how to, I
don't know, building some kind of hut or something; they do matching really well for building purposes. So it's interesting that language is a limiter on what you're able to do. Here, language is just the words: the words for exact counting are the limiting factor; they just don't have them. And that limit is also a limit on what the society is able to build. That's probably true. This is one of the problems with having only a snapshot of current languages: we don't know what causes a culture to discover or invent a counting system. But the hypothesis, the guess out there, is that it has something to do with farming. If you have a bunch of goats and you want to keep track of them, say you have 17 goats, and you go to bed at night and get up in the morning, boy, it's easier to have a count system to do that. It's an abstraction over the set. People often ask me, when I tell them about this work: don't the Pirahã have kids, don't they have a lot of children? Yes, they do; they often have families of three or four or five kids. Well, don't they need the numbers to keep track of their kids? And I always ask the person who says this, do you have children? And the answer is always no, because that's not how you keep track of your kids: you care about their identities. It's very important to me which five children I have; it doesn't just matter that there are five. If you replaced one with someone else, I would care; a goat, maybe not, right? That's the point: it's an abstraction. Something that looks very similar to the one replaced wouldn't matter to me, probably. But if you care about goats, you're going to know them actually
individually, too. Yeah, you will. Cows, goats, if they're a source of food and milk and all that kind of stuff, you're absolutely right. But I'm saying it's an abstraction, such that you don't have to care about their identities to do this thing fast. That's the hypothesis from anthropologists who are guessing about where words for counting came from: maybe from farming. Do you have a sense of why universal languages, like Esperanto, have not taken off? Why do we have all these different languages? My guess is that the function of a language is to do something in a community, and unless there's some function for that language in the community, it's not going to survive, it's not going to be used. Here's a great example: language death is super common. Languages are dying all around the world, and here's why they're dying. It's not happening right now with either the Tsimane' or the Pirahã, but it probably will. There's a neighboring group called the Mosetén; I said Tsimane' is an isolate, but actually it's a dual isolate: there are two closely related languages, Mosetén and Tsimane', which are unrelated to anything else. Mosetén is unlike Tsimane' in that it has a lot of contact with Spanish, and it's dying. The reason it's dying is that there's not a lot of value for the local people in their native language; there's much more value in knowing Spanish, because they want to feed their families, and how do you feed your family? You learn Spanish, so you can make money and get a job, and then you can get the things you want. So Mosetén is in danger and dying, and that's normal. Basically, the reason we learn language is to communicate, and we need to
use it to make money and to feed our families, and if that's not happening, a language won't take off; it's not a game. Why is English so popular? It's not because it's an easy language to learn; maybe it is, I don't really know, but that's not why it's popular. It's because the United States is a gigantic economy. It's big economies that do this; it's all about money. So there's a motivation to learn Mandarin, a motivation to learn Spanish, a motivation to learn English: these languages are very valuable to know, because there are so many speakers all over the world. That's fascinating; there's less value economically in the others, so that's kind of what drives this. It's not just for fun. There are groups that want to learn languages just for language's sake, and there's something to that, but those are rarities, small groups; most people don't do that. Well, if that were the primary driver, then everybody would be speaking English, speaking one language. There's a tension there. Well, we are moving towards fewer and fewer languages. Exactly; maybe it's slow, but maybe that's where we're moving. But there is a tension, you're saying, at the fringes. If you look at geopolitics and superpowers, it does seem that there's another thing in tension, which is that a language is sometimes a national identity. Oh yeah, for certain nations. I mean, the war in Ukraine: the Ukrainian language is a symbol of that war in many ways, a country fighting for its own identity. So it's not merely convenience; those two things are in tension: the convenience of trade and economics, being able to communicate
with neighboring countries and trade more efficiently with them, all that kind of stuff, but also the identity of the group. I completely agree, because language is that for every community; dialects that emerge are a kind of identity for people, and sometimes a way for people to say "f you" to the more powerful group. That's interesting; in that way, language can be used as a tool. I completely agree, and there's a lot of work to try to create that identity, so that people want to keep speaking their language. As a cognitive scientist and language expert, I hope that continues, because I don't want languages to die; I want languages to survive, because they're so interesting for so many reasons. I find them fascinating just for the language part, but there are a lot of connections to culture as well, which is also very important. Do you have hope for machine translation that can break down the barriers of language, so that all these diverse languages can still exist? There are many ways of asking this question, but basically: how hard is it to translate, in an automated way, from one language to another? There are going to be cases where it's really hard. There are concepts that are in one language and not in another; the most extreme cases are these cases of number information. Good luck translating a lot of English into Pirahã: it's just impossible, there's no way to do it, because there are no words for the concepts we're talking about. And there's probably a flip side: there's probably stuff in Pirahã which is going to be hard to translate into English, and I just don't know what those concepts are. Their world, their space, is a little different from my world, and so the things they talk about are going to have to do with their
life, as opposed to my industrial life, which is going to be different, and so there are going to be problems like that always. Maybe it's not so bad in some of these spaces, and it's going to be harder in others. It's pretty bad, extreme I'd say, in the exact number space, but in the color dimension it's not so bad. Still, it's a problem when you don't have ways to talk about a concept, and there might be entire concepts that are missing. So to you it's more about the space of concepts versus the space of form; form you can probably map. Yes, but you were talking earlier about translation, and about how there are good and bad translations. Now we're talking about translations of form. What makes writing good? It's not just the content; it's how it's written, and translating that sounds difficult. We should say that there is, I hesitate to say meaning, but there's a music and a rhythm to the form. When you look at the broad picture, the difference between Dostoevsky and Tolstoy, or Hemingway, Bukowski, James Joyce, like I mentioned, there's a beat to it, there's an edge to it that is in the form. We can probably get measures of those. I'm optimistic that we could get measures of those things, but I don't know; I have not worked on that. I would find that fascinating to see across different authors. I mean, Hemingway is probably the lowest; the average per-sentence dependency length for Hemingway is probably the shortest. That's your sense, huh? Simple sentences, short. Yeah. I mean, if you have really long sentences, even if they don't have center embedding,
they can have longer connections. They don't have to; you can have a long sentence with a bunch of local connections. But with long sentences it is much more likely to have the possibility of long dependencies. I met a guy named Aza Raskin, who does a lot of cool stuff, really brilliant, works with Tristan Harris on a bunch of things, and he was talking to me about communicating with animals. He co-founded the Earth Species Project, where they're trying to find the common language between whales, crows, and humans. He was saying that there's a lot of promising work showing that even though the signals are very different, if you have embeddings of the languages, they're actually trying to communicate similar types of things. Is there something you can comment on that? Is there promise there, in everything you've seen in different cultures, especially remote cultures? Is it a possibility that we can talk to whales? I would say yes. I think it's not crazy at all; I think it's quite reasonable. There's this sort of odd view that human language is somehow special. I mean, maybe it is. We can certainly do more than any of the other species, and maybe our language system is part of that; it's possible. But people have often talked about how, Chomsky in fact has talked about how, only human language has this compositionality thing that he thinks is key in language. The problem with that argument is that he doesn't speak whale, and he doesn't speak crow, and he doesn't speak monkey. They say things like, well, they're making a bunch of grunts and squeaks, and that's bad reasoning. I'm pretty sure if you asked a whale what we're
saying, they'd say, well, they're making a bunch of weird noises. Exactly, and so it's a very odd bit of reasoning to claim that human language is special because we're the only ones who have human language. We don't know what those others have; we just can't talk to them yet. There's probably a signal in there, and it might very well be something complicated like human language. Sure, with a small brain, in lower species there's probably not a very good communication system, but in these higher species, where you have what seem to be abilities to communicate something, there might very well be a lot more signal there than we might have otherwise thought. But also, if we have a lot of intellectual humility here: somebody formerly from MIT, Neri Oxman, who I admire very much, has worked on communicating with plants. Yes, the signal there is even smaller, but it's not out of the realm of possibility that all of nature has a way of communicating. It's a very different language, but plants do develop a kind of language through chemistry, through some way of communicating with each other, and if you have enough humility about that possibility, I think it would be very interesting, in a few decades, maybe centuries, hopefully not, to have the humbling possibility of being able to communicate not just between humans effectively, but between all living things on Earth. Well, I think some of them are not going to have much interesting to say, but some of them will. We don't know; we certainly don't know. I think if we were humble, there could be some interesting trees out there. Well, they're probably talking to other trees; they're not talking to us. To the extent they're talking, they're saying something interesting to some other conspecific, as opposed to us. So there probably is, there may be some
signal there. There are people out there, and it's actually pretty common, who say that human language is special and different from any other animal communication system, and I just don't think the evidence is there for that claim. I think it's not obvious. We just don't know, because we don't speak these other communication systems, until we get better at them. There are people working on that, as you pointed out, people working on whale speech for instance. That's really fascinating. Let me ask you a wild, out-there sci-fi question. If we make contact with an intelligent alien civilization and you get to meet them, how surprised would you be about their way of communicating? Do you think it would be recognizable? Maybe there are some parallels here to when you go to the remote tribes. I would want Dan Everett with me. He is amazing at learning foreign languages, and this is an amazing feat, to be able to go to a language, Pirahã, which had no translators before him. Well, there was a guy who had been there before, but he wasn't very good, and so Everett learned the language far better than anyone had learned it before him. He's just a very social person; I think that's a big part of it, being able to interact. So I don't know; it kind of depends on how much this species from outer space wants to talk to us. Is there something you can say about the process he follows? How do you show up to a tribe and socialize? I guess colors and counting are some of the most basic things to figure out. You actually start with objects. You just throw a stick down and say stick, and then you say, what do you call this, and do this a few times, and then they'll say the word, whatever it is. And a standard thing to do is to throw
down two sticks, and then he learned pretty quickly that there weren't any count words in this language, because they didn't find this interesting; it was kind of weird, they'd say the same word, some or something, over and over again. But that is a standard thing; you just try it. You have to be pretty out there socially, willing to talk to random people, and these are very different people from you, and he was; he's very social. I think that's a big part of how a lot of people come to know a lot of languages: they're willing to talk to other people. That's a tough one, where you just show up knowing nothing. Yeah, oh God. It's beautiful that humans are able to connect in that way. Yeah. You've had an incredible career exploring this fascinating topic. What advice would you give to young people about how to have a career like that, or a life that they can be proud of? When you see something interesting, just go and do it. That's something I do which is kind of unusual for most people. When I saw that the Pirahã were available to go and visit, I was like, yes, I'll go. And then when we couldn't go back, we had some trouble with the Brazilian government, there are some corrupt people there, and it was very difficult to get back in, so I was like, all right, I've got to find another group. We searched around and were able to find the Tsimane, because I wanted to keep working on this kind of problem, and so we found the Tsimane and just went there. We didn't really have contact; we had a little bit of contact, and brought someone, and we just tried things. A lot of that is just ambition. Just try to do something that other people haven't done; just give it a shot, is what I say. I do that all the time. I love it, and I love the fact that your
pursuit of fun has landed you here talking to me. This was an incredible conversation, and you're just a fascinating human being. Thank you for taking a journey through human language with me today. This was awesome. Thank you very much, Lex; it's been a pleasure. Thanks for listening to this conversation with Edward Gibson. To support this podcast, please check out our sponsors in the description. And now, let me leave you with some words from Wittgenstein: the limits of my language mean the limits of my world. Thank you for listening, and hope to see you next time.
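The per-sentence dependency length measure that came up in the conversation can be sketched in a few lines of code. This is a minimal illustration, not Gibson's actual methodology: it assumes a sentence has already been dependency-parsed, represented here simply as a list of head indices (with -1 marking the root), and computes the average linear distance between each word and its syntactic head.

```python
def avg_dependency_length(heads):
    """Average linear distance between each word and its syntactic head.

    heads[i] is the index of word i's head in the sentence, or -1 if
    word i is the root (the root contributes no dependency arc).
    """
    lengths = [abs(i - h) for i, h in enumerate(heads) if h != -1]
    return sum(lengths) / len(lengths)

# "The old man sat": "The" -> "man", "old" -> "man", "man" -> "sat", "sat" = root
print(avg_dependency_length([2, 2, 3, -1]))  # arc lengths 2, 1, 1 -> average 4/3
```

On this view, a writer like Hemingway, with short sentences and mostly local connections, would tend to score low, since long dependencies only become likely as sentences grow longer.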