MIT AGI: Cognitive Architecture (Nate Derbinsky)
bfO4EkoGh40 • 2018-03-20
So today we have Nate Derbinsky. He's a professor at Northeastern University working on various aspects of computational agents that exhibit human-level intelligence. Please give Nate a warm welcome. Thanks a lot, and thanks for having me here. So the title that was on the page was cognitive modeling; I'll kind of get there, but I wanted to put it in context. The bigger theme here is I want to talk about what's called cognitive architecture, and if you've never heard about that before, that's great. I wanted to contextualize that: how is that one approach to get us to AGI? I'll say what my view of AGI is, and put up a whole bunch of TV and movie characters that I grew up with that inspire me, and that will lead us into what is this thing called cognitive architecture. It's a whole research field that crosses neuroscience, psychology, cognitive science, and all the way into AI, so I'll try to give you kind of the historical big-picture view of it, and what some of the actual systems are out there that might be of interest to you, and then we'll kind of zoom in on one of them that I've done a good amount of work with, called Soar. What I'll try to do is tell a story, a research story, of how we started with kind of a core research question, looked to how humans operate, understood that phenomenon, and then took it and saw really interesting results from it. At the end, if this field is of interest, there are a few pointers for you to go read more and go experience more of cognitive architecture. So, a rough definition of AGI, given this is an AGI class: depending on the direction that you're coming from, it might be kind of understanding intelligence, or maybe developing intelligent systems that are operating at the level of human intelligence. The typical differences between this and other sorts of AI and machine-learning systems: we want systems that are going to persist for a long period of time, we want them robust to different
conditions, we want them learning over time, and, here's the crux of it, working on different tasks, and in a lot of cases tasks they didn't know were coming ahead of time. I got into this because I clearly watched too much TV and too many movies, and then I looked back at this and I realized I think I'm covering the 70s, 80s, 90s, the aughts I guess it is, and today. So this is what I wanted out of AI, and this is what I wanted to work with, and then there's the reality that we have today. So who's watched Knight Rider, for instance? I don't think that exists yet, but maybe we're getting there. In particular, for fun, during the Amazon sale day I got myself an Alexa, and I could just see myself at some point saying, Alexa, please write me an rsync script, you know, to sync my class. And if you have an Alexa, you probably know the following phrase, and this just always hurts me inside, which is: sorry, I don't know that one. Which is okay, right? A lot of people have no idea what I'm asking, let alone how to do that. What I want Alexa to respond with after that is: do you have time to teach me? And to provide some sort of interface by which, back and forth, we can kind of talk through this. We aren't there yet, to say the least, but I'll talk later about some work on a system called Rosie that's working in that direction; we're starting to see some ideas about being able to teach systems how to do work. So folks who are in this field, I think, generally fall into these three categories. There are folks who are just curious: they want to learn new things, generate knowledge, work on hard problems, great. I think there are folks who are in kind of that middle cognitive-modeling realm, and I'll use this term a lot; it's really understanding how humans think, how humans operate, human intelligence at multiple levels. If you can do that, one, there's just knowledge in and of itself of how we operate, but there are a lot of really important applications that you can think of if we were able to
not only understand but predict how humans would respond and react in various tasks. Medicine is an easy one. There's some work in HCI, or HRI, I'll get to later, where if you can predict how humans would respond to a task, you can iterate tightly and develop better interfaces. It's already being used in the realm of simulation and in defense industries. I happen to fall into the latter group, or the bottom group, which is systems development, which is to say just the desire to build systems for various tasks, tasks that kind of current AI and machine learning can't operate on. And I think when you're working at this level, or on any system that nobody's really achieved before, what do you do? You kind of look to the examples that you have, which in this case, that we know of, is just humans, right? Irrespective of your motivation, when you have kind of an intent that you want to achieve in your research, you kind of let that drive your approach. So I often show my AI students this. The Turing test, which you might have heard of, or variants of it that have come before: these were folks who were trying to create systems that acted in a certain way, that acted intelligently, and the kind of line that they drew, the benchmark that they used, was to say: let's make systems that operate like humans do. Cognitive modelers will fit up into this top point here, to say it's not enough to act that way, but by some definition of thinking we want the system to do what humans do, or at least be able to make predictions about it. So that might be things like: what errors would the human make on this task, or how long would it take them to perform this task, or what emotion would be produced in this task? There are folks who are still thinking about how the computer is operating but are trying to apply kind of rational rules to it. So a logician, for instance, would say if you have A, and A gives you B, and B gives you C, then A should definitely give you C; that's just what's rational, and so
these folks operate in that direction. And then, if you go to an intro AI class anywhere in the country, particularly Berkeley, because they have graphic designers that I get to steal from, the benchmark would be what the system produces in terms of action, and the benchmark is some sort of optimal rational bound. Irrespective of where you work in the space, there's kind of a common outcome that arises when you research these areas, which is that you can learn individual bits and pieces, and it can be hard to bring them together to build a system that either predicts or acts on different tasks. This is part of the transfer learning problem, but it's also part of having distinct theories that are hard to combine together. So I'm going to give an example that comes out of cognitive modeling, or perhaps three examples. If you were in an HCI class or some psychology classes, one of the first things you'll learn about is Fitts's law, which provides you the ability to predict the difficulty of basically a human pointing from where they start to a particular place. It turns out that you can learn some parameters and model this based upon just the distance from where you are to the target and the size of the target. So moving a long distance will take a while, but also, if you're aiming for a very small point, that can take longer than if there's a large area that you just kind of have to get yourself to, and this has held true for many humans. So let's say we've learned this, and then we move on to the next task, and we learn about what's called the power law of practice, which has been shown true in a number of different tasks. What I'm showing here is one of them, where you're going to draw a line through a sequential set of circles, starting at 1, going to 2, and so forth, not making a mistake, or at least trying not to, and trying to do this as fast as possible. For a particular person we would fit the a, b, and c parameters, and we'd see a power law: as you perform this
task more, you're going to see a decrease in the amount of reaction time required to complete the task. Great, we've learned two things about humans; let's add some more in. For those who might have done some reinforcement learning, TD learning, temporal difference learning, is one of those approaches, and it has had some evidence of similar sorts of processes in the dopamine centers of the brain. It basically says: in a sequential learning task, you perform the task, you get some sort of reward; how are you going to update your representation of what to do in the future so as to maximize the expectation of future reward? There are various models of how that changes over time, and you can build up value functions that allow you to perform better and better given trial and error. Great, so we've learned three interesting models here that hold true over multiple people and multiple tasks. And so my question is, if we take these together and add them up, how do we start to understand a task as quote-unquote simple as chess? Which is to say, we could ask questions like: how long would it take for a person to play, what mistakes would they make, and after they've played a few games, how would they adapt? Or what if we want to develop a system that ends up being good at chess, or at least learning to become better at chess? There doesn't seem to be a clear way to take these very, very individual theories, kind of smash them together, and get a reasonable answer for how to play chess, or how humans play chess. The gentleman in this slide is Allen Newell, one of the founders of AI, who did incredible work in psychology and other fields. He gave a series of lectures at Harvard in 1987, and they were published in 1990 as Unified Theories of Cognition, and his argument to the psychology community at that point was the argument on the prior slide: they had many individual studies, many individual results, and so the question was, how do you bring them together?
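The three regularities just described can each be written down in a few lines. Here is a minimal Python sketch; all parameter values are invented for illustration, not fit to any real data from the lecture:

```python
import math

# Illustrative sketch of three human-performance regularities.
# Parameter values (a, b, c, alpha, gamma) are made up for the example;
# in practice they are fit to data from a particular person and task.

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Fitts's law: pointing time grows with the 'index of difficulty'
    log2(2D/W); farther and smaller targets take longer."""
    index_of_difficulty = math.log2(2.0 * distance / width)
    return a + b * index_of_difficulty

def power_law_of_practice(trial, a=0.4, b=2.0, c=0.5):
    """Power law of practice: reaction time falls off as a power
    of the number of trials performed."""
    return a + b * trial ** -c

def td_update(value, reward, next_value, alpha=0.1, gamma=0.9):
    """One temporal-difference update: nudge a state's value estimate
    toward the reward plus the discounted value of the successor."""
    td_error = reward + gamma * next_value - value
    return value + alpha * td_error

# Practice makes you faster ...
assert power_law_of_practice(100) < power_law_of_practice(1)
# ... small, distant targets are harder to hit ...
assert fitts_movement_time(200, 10) > fitts_movement_time(50, 40)
# ... and a surprisingly good outcome raises a state's estimated value.
assert td_update(0.0, 1.0, 0.0) > 0.0
```

Each function is simple in isolation; the lecture's point is exactly that nothing here tells you how to compose the three into a model of a whole task like chess.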
This overall theory, this way to make forward progress, was his proposal: unified theories of cognition, which became known as cognitive architecture. Which is to say, bring together your core assumptions, your core beliefs about what the fixed mechanisms and processes are that intelligent agents would use across tasks (the representations, the learning mechanisms, the memory systems), bring them together, implement them in a theory, and use that across tasks. The core idea is that when you actually have to implement this and see how it's going to work across different tasks, the interconnections between these different processes and representations add constraint, and over time the constraints start limiting the design space of what is necessary and what is possible in terms of building intelligent systems. The overall goal from there was to understand and exhibit human-level intelligence using these cognitive architectures. A natural question to ask is: okay, we've gone from a methodology of science that we understand how to operate in, where we make a hypothesis, we construct a study, we gather our data, we evaluate that data, and we falsify or do not falsify the original hypothesis, and we can do that over and over again, and we know that we're making progress scientifically. If I've now taken that model and changed it into: I have a piece of software, and it's representing my theories, and to some extent I can configure that software in different ways to work on different tasks, how do I know that I'm making progress? There's a form of science called Lakatosian science, and it's kind of shown pictorially here, where you start with your core of what your beliefs are about what is necessary for achieving the goal that you have, and around that you'll have kind of ephemeral hypotheses and assumptions that over time may grow and shrink. So you're trying out different things, trying out different things, and if an assumption is around there long
enough, it becomes part of that core. And so as you work on more tasks and learn more, either by your own work or by data coming in from someone else, the core is growing larger and larger, you've got more constraints, and you've made more progress. So what I wanted to look at is, in this community, what are some of the core assumptions that are driving forward scientific progress? One of them actually came out of those lectures; they're referred to as Newell's time scales of human action. Off on the left, the left two columns are both time units, just expressed somewhat differently, with the second from the left being maybe more useful to a lot of us in understanding daily life. One step over from there would be kind of at what level processes are occurring: the lowest three are down at the substrate, the neuronal level, and we're building up to deliberate acts that occur in the brain, and tasks that are operating on the order of ten seconds. Some of these might occur in the psychology laboratory, but probably a step up, into minutes and hours, and then above that it really becomes interactions between agents over time. So if we start with that, the thing to take away is the hypothesis that regularities will occur at these different time scales and that they're useful. Those who operate at that lowest time scale might be considering neuroscience and cognitive neuroscience; when you shift up to the next couple of levels, the areas of science that deal with that would be psychology and cognitive science; and then we shift up a level and we're talking about sociology and economics and the interplay between agents over time. What we'll find with cognitive architecture is that most of them will tend to sit at the deliberate act: we're trying to take knowledge of a situation and make a single decision, and then sequences of decisions over time will build to tasks, and tasks over time will build to more interesting phenomena. I'm actually
going to show that that isn't strictly true, that there are folks working in this field who actually do operate one level below. Some other assumptions: this is Herb Simon receiving the Nobel Prize in Economics, and part of what he received that award for was the idea of bounded rationality. In various fields we tend to model humans as rational, and his argument was, let's consider that human beings are operating under various kinds of constraints, and so model them as rational with respect to, and bounded by: how complex the problem is that they're working on, how big is that search space that they have to conquer; cognitive limitations, so speed of operations, amount of memory, short-term as well as long-term, as well as other aspects of our computing infrastructure that are going to keep us from being able to solve arbitrarily complex problems; as well as how much time is available to make that decision. And this is actually a phrase that came out of his speech when he received the Nobel Prize: decision-makers can satisfice either by finding optimum solutions for a simplified world, which is to say, take your big problem, simplify it in some way, and then solve that, or by finding satisfactory solutions for a more realistic world, take the world in all its complexity, take the problem in all its complexity, and try to find something that works; neither approach in general dominates the other, and both have continued to co-exist. So what you're actually going to see throughout the cognitive architecture community is this understanding that there are some problems you're not going to be able to get an optimal solution to, if you consider, for instance, a bounded amount of computation, bounded time, and the need to be reactive to a changing environment. In some sense we can decompose problems that come up over and over again into simpler problems and solve those near-optimally or optimally, but for more general problems we might have to satisfice. There's also
the idea of the physical symbol system hypothesis. This is Allen Newell and Herb Simon; they're considering how a computer could play the game of chess. The physical symbol system hypothesis talks about the idea of taking some signal, abstractly referred to as a symbol, combining symbols in some ways to form expressions, and then having operations that produce new expressions, and it makes the claim that symbol systems are necessary and sufficient for intelligent systems. A very weak way of talking about it is the claim that there's nothing unique about the neuronal infrastructure that we have, but that if we got the software right, we could implement it in the bits, bytes, RAM, and processor that make up modern computers; that's kind of the weakest way to look at this, that we can do it with silicon and not carbon. A stronger way that this used to be looked at was more of a logical standpoint, which is to say: if we can encode rules of logic, and these tend to line up if we think intuitively of planning and problem solving, and if we can just get that right and get enough facts in there, eventually we can get to the point of intelligence; that's what you need for intelligence. That was a starting point that lasted for a while. I think by now most folks in this field would agree that that's necessary, to be able to operate logically, but that there are going to be representations and processes that will benefit from non-symbolic representation, particularly perceptual processing, visual and auditory, processing things in a more kind of standard machine-learning sort of way, as well as taking advantage of statistical representations. So we're getting closer to actually looking at cognitive architectures. I did want to go back to the idea that different researchers are coming at this with different research foci, and we'll start off with kind of the lowest level.
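As a toy illustration of the physical symbol system idea just mentioned (symbols combined into expressions, plus processes that produce new expressions), here is a hedged sketch; the little `implies` logic is invented for the example and is not from the lecture or any particular architecture:

```python
# A minimal illustration of the physical symbol system idea:
# symbols are combined into structured expressions, and processes
# take existing expressions and produce new ones.

def derive_transitive(expressions):
    """One symbolic process: from (implies, X, Y) and (implies, Y, Z),
    produce the new expression (implies, X, Z)."""
    new = set()
    for (op1, x, y1) in expressions:
        for (op2, y2, z) in expressions:
            if op1 == op2 == "implies" and y1 == y2:
                new.add(("implies", x, z))
    # Return only expressions we did not already know.
    return new - set(expressions)

# "A gives you B, B gives you C, so A should give you C."
known = [("implies", "A", "B"), ("implies", "B", "C")]
print(derive_transitive(known))  # {('implies', 'A', 'C')}
```

The point of the hypothesis is that, in the weak reading, nothing about this kind of manipulation requires carbon rather than silicon.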
Understanding biological modeling: Leabra and Spaun both try to model different degrees of low-level detail, parameters, firing rates, connectivity between different levels of neuronal representation; they build that up and then try to build tasks above that layer, always being very careful about staying true to human biological processes. At a layer above there would be psychological modeling, which is to say trying to build systems that are true in some sense to areas of the brain and interactions in the brain, and being able to predict errors made and timing produced by the human mind; there I'll talk a little bit about ACT-R. This final level down here, these are systems that are focused mainly on producing functional systems that exhibit really cool artifacts and solve really cool problems, and so I'll spend most of the time talking about Soar, but I want to point out a relative newcomer in the game called Sigma. To talk about Spaun a little bit, we'll see if the sound works in here; I'm going to let the creator take this one, or not, we'll see how the AV system likes this. [A video clip plays; its captions are largely unintelligible. In it, Spaun's creator describes the Spaun model, a simulation of roughly two and a half million individual neurons that can view images of numbers and perform a variety of tasks, and discusses how the flow of information through different parts of the model is of interest both to neuroscience and to artificial intelligence.] I'll provide a pointer at the end; he's got a really cool book called How to Build a Brain, and if you Google Spaun you can find a toolkit where you can kind of construct circuits that will approximate functions that you're interested in, connect them together, set certain properties that you would want at a low level, and build them up, and actually work on tasks at the level of vision and robotic actuation. So that's a really cool system. As we move into architectures that sit above that biological level, I wanted to give you an overall sense of what they're going to look like, what a prototypical architecture looks like. They're going to have some ability to do perception; the modalities typically are more digital and symbolic, but they will, depending on the architecture, be able to handle vision, audition, and various sensory inputs. These get represented in some sort of short-term memory, whatever the state representation for the particular system is. It's typical to have a representation of the knowledge of what tasks can be performed, when they should be performed, and how they should be controlled, and these are typically both actions that take place internally, that manage the internal state of the system and perform internal computations, but also external actuation, where external might be a digital system, a game AI, but might also be some sort of robotic actuation in the real world. There's typically some sort of mechanism by which to select from the available actions in a particular situation, and there's typically some way to
augment this procedural information, which is to say, learn about new actions and possibly modify existing ones. There's typically some semblance of what's called declarative memory. Whereas procedural, at least in humans: if I asked you to describe how to ride a bike, you might be able to say get on the seat and pedal, but in terms of keeping your balance you'd have a pretty hard time describing it declaratively; so that's kind of the procedural side, the implicit representation of knowledge. Declarative would include facts, geography, math, but it also includes experiences that the agent has had, a more episodic representation of declarative memory. They'll typically have some way of learning this information and amending it over time, and then finally some way of taking actions in the world. And they'll all have some sort of cycle, which is: perception comes in; knowledge that the agent has is brought to bear on that; an action is selected; knowledge that knows to condition on that action acts accordingly, both with internal processes as well as eventually taking external action; and then rinse and repeat. So when we talk about an agent in this context, in an AI system, that would be the fixed representation, which is whatever architecture we're talking about, plus a set of knowledge that is typically specific to the task but might be more general. Oftentimes these systems can incorporate a more general knowledge base of facts, of linguistic facts, of geographic facts (let's take Wikipedia and just stick it in the brain of the system), which would be more task-general, but then also knowledge of whatever it is you're doing right now and how you should proceed in that. And then it's typical to see this processing cycle, and going back to the prior assumption, the idea is that these primitive cycles allow the agent to be reactive to its environment. So if new things come in, it has to react: if the lion is sitting over there, I'd better run, and maybe not do my calculus homework, right?
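That perceive-decide-act loop can be caricatured in a few lines. This is purely illustrative: the rules and the `decision_cycle` helper are invented here, and real architectures like Soar or ACT-R are far richer than this sketch.

```python
# A toy sketch of the prototypical perceive-decide-act cycle, with
# procedural knowledge as if-then (production) rules. Illustrative only.

# Each rule: (name, condition on short-term memory, action to propose).
RULES = [
    ("flee",  lambda stm: stm.get("lion-nearby"),  "run-away"),
    ("study", lambda stm: stm.get("homework-due"), "do-calculus"),
    ("idle",  lambda stm: True,                    "wait"),
]

def decision_cycle(stm, percepts):
    # 1. Perception: new input lands in short-term memory.
    stm.update(percepts)
    # 2. Match: bring procedural knowledge to bear on the current state.
    proposed = [action for _name, cond, action in RULES if cond(stm)]
    # 3. Select: some mechanism picks among the proposed actions
    #    (here, trivially, rule order stands in for a real selection scheme).
    selected = proposed[0]
    # 4. Act: the selected action would change the world and/or internal
    #    state; then the cycle repeats.
    return selected

stm = {"homework-due": True}
print(decision_cycle(stm, {}))                     # do-calculus
print(decision_cycle(stm, {"lion-nearby": True}))  # run-away
```

The reactivity argument shows up in the last two calls: the same agent abandons the calculus homework as soon as the lion percept arrives, because knowledge is re-matched against short-term memory on every cycle.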
As long as this cycle is going, I'm reactive, but at the same time, as multiple actions are taken over time, I'm able to get complex behavior over the long term. So this is the ACT-R cognitive architecture. It has many of the core pieces that I talked about before; let's see if the mouse is useful up there. Yes, we have the procedural module here; short-term memory is going to be these buffers that are on the outside; the procedural memory is encoded as what are called production rules, or if-then rules: if this is the state of my short-term memory, then this is what I think should happen as a result. You have a selection of the appropriate rule to fire and an execution, and you're seeing associated parts of the brain represented here. A cool thing that has been done over time in the ACT-R community is to make predictions about brain areas and then perform fMRIs, gather that data, and correlate it. So when you use the system, you will get predictions about things like timing of operations, errors that will occur, and probabilities that something is learned, but you'll also get predictions, to the degree that they can, about the brain areas that are going to light up, if you want that. It's actively being developed at Carnegie Mellon. To the left is John Anderson, who developed this cognitive architecture thirty-ish years ago, and until about the last five years he was the primary researcher and developer behind it, with Christian; recently he's decided to spend more time on cognitive tutoring systems, and so Christian has become the primary developer. There is an annual ACT-R workshop, and there's a summer school where, if you're thinking about modeling a particular task, you can bring your task to them, bring your data, and they teach you how to use the system and try to get that study going right there on the spot. To give you a sense of what kinds of tasks this could be applied to, this is representative of a certain class of tasks, certainly not the only one. Let's try
this again; I think PowerPoint is going to want a restart every time. Okay, so we're getting predictions about basically where the eye is going to move; what you're not seeing is that it's actually processing things like text and colors and making predictions about what to do, how to represent the information, and how to process the graph as a whole. I alluded to this earlier: there's work by Bonnie John that's very similar, making predictions about how humans would use computer interfaces. At the time she got hired away by IBM, and they wanted the ability to have software that you can put in front of software designers, and when they think they have a good interface, they press a button, this model of human cognition tries to perform the tasks it has been told to do, and it makes predictions about how long they would take, so you can have this tight feedback loop with designers saying, here's how good your particular interface is. So ACT-R as a whole is very prevalent in this community; I went to their web page and counted up just the papers that they knew about, and it was over 1,100 papers over time. If you're interested in it, the main distribution is in Lisp, but many people have used this and wanted to apply it to systems that need a little more processing power, so the NRL has a Java port of it that they use in robotics; the Air Force Research Lab in Dayton has implemented it in Erlang for parallel processing of large declarative knowledge bases, and they're trying to do service-oriented architectures with it, because they want what it has to say but they don't want to wait around for it to have to figure that stuff out. So that's the two minutes about ACT-R. Sigma is a relative newcomer, and it's developed out at the University of Southern California by a man named Paul Rosenbloom, whom I'll mention a couple more times because he was one of the prime developers of Soar at Carnegie Mellon, so he knows a lot about how Soar works, and he's worked on it over the years. I think originally, and I'm
gonna speak for him here and he'll probably say I was wrong, I think originally it was kind of a mental exercise: can I reproduce Soar using a uniform substrate? I'll talk about Soar in a little bit; it's thirty years of research code, and if anybody has dealt with research code, it's thirty years of C and C++ with dozens of graduate students over time. It's not pretty at all, and theoretically it's got these boxes sitting out here. So he reimplemented the core functionality of Soar all using factor graphs and message-passing algorithms under the hood. He got to that point and then said, there's nothing stopping me from going further, and so now it can do all sorts of modern machine learning, vision, and optimization sorts of things that would take some time to integrate well in any other architecture. So it's been an interesting experience; it's now going to be the basis for the virtual human project out at the Institute for Creative Technologies, the institute associated with the University of Southern California. Until recently you couldn't really get your hands on it, but in the last couple of years he's done some tutorials on it, and he's got a public release with documentation, so that's something interesting to keep an eye on. But I'm going to spend all the remaining time on the Soar cognitive architecture, and you see it looks quite a bit like the prototypical architecture; I'll give a sense again about how this all operates, and a sense of the people involved. We already talked about Allen Newell; both John Laird, who was my advisor, and Paul Rosenbloom were students of Allen Newell. John's thesis project was related to the chunking mechanism in Soar, which learns new rules based upon sub-goal reasoning. He finished that, I believe, the year I was born, and so he's one of the few researchers you'll find who's still actively working on their thesis project. Beyond that, I think about ten years ago he founded Soar Technology, which is a company up in Ann Arbor, Michigan; while it's called
Soar Technology, it doesn't do exclusively Soar, but that's a part of the portfolio: general intelligent-systems stuff, a lot of defense work. Some notes on what's going to make Soar different from the other architectures that fall into this functional-architecture category. A big thing is a focus on efficiency: John wants to be able to run Soar on just about anything. We just got, on the Soar mailing list, a desire to run it on a real-time processor, and our answer, while we had never done it before, was: probably, it'll work. Every release there are timing tests, and what we look at is this: in a bunch of different domains, for a bunch of different reasons that relate to human processing, there's this magic number that comes out, which is 50 milliseconds, which is to say, in terms of responding to tasks, if you're above that time, humans will sense a delay, and you don't want that to happen. Now, if we're working in a robotics task and you're dramatically above that 50 milliseconds, you just fell off the curb, or worse, you just hit somebody with a car, right? So we're trying to keep that as low as possible, and for most agents it doesn't even register; it's below 1 millisecond, fractions of a millisecond. But I'll come back to this, because a lot of the work that I was doing was computer science, AI, and a lot of efficient algorithms and data structures, and 50 milliseconds was that very high upper bound. It's also one of the projects that has a public distribution; you can get it on all sorts of operating systems. We use something called SWIG that allows you to interface with it in a bunch of different languages: we write kind of a meta description and you are able to basically generate bindings on different platforms. The core is C++; there was a team at SoarTech that said, we don't like C++, it gets messy, so they actually did a port to pure Java, in case that appeals to you. There's an annual Soar workshop that takes place in Ann Arbor; typically it's free, you can go
there, get a Soar tutorial, and talk to folks who are working on Soar, and it's fun. I've been there every year but one in the last decade; it's just fun to see the people around the world that are using the system in all sorts of interesting ways. To give you a sense of the diversity of the applications: one of the first was R1-Soar, which was back in the days when it was an actual challenge to configure a computer, which is to say that your choice of certain components would have radical implications for other parts of the computer. It wasn't just the Dell website where you say I want this much RAM, I want this much CPU; there was a lot of thinking that went behind it, and then physical labor that went into constructing your computer, and so it was making that process a lot better. There are folks that applied it to natural language processing. Soar 7 was the core of the Virtual Humans project for a long time. HCI tasks. TacAir-Soar was one of the largest rule-based systems, tens of thousands of rules; over 48 hours it was a very large-scale defense simulation. Lots of games it's been applied to, for various reasons, and then in the last few years, porting it onto mobile robotics platforms. This is Edwin Olson's SplinterBot, an early version of the robot that went on to win the MAGIC competition. Then I went on to put Soar on the web, and if after this talk you're really interested in a dice game that I'm going to talk about, you can actually go to the iOS App Store and download it. It's called Michigan Liars Dice; it's free, you don't have to pay for it, but you can actually play liar's dice with Soar and even set the difficulty level. It's pretty good; it beats me on a regular basis. I wanted to give you a couple other just kind of really weird-feeling and really cool applications. The first one is out of Georgia Tech. LuminAI is a dome-based interactive art installation in which participants can engage in collaborative movement improvisation with each other
and virtual dance partners. The installation creates a hybrid space in which virtual and real bodies meet; the line between human and non-human is blurred, inviting participants to examine their relationship with technology. The installation ultimately examines how humans and machines can co-create experiences, and it does so in a playful environment. The dome creates a social space that encourages human-human interaction and collective dance experiences, allowing participants to create and explore movement while having fun. The development of LuminAI has been a combined exploration in art forms of theater and dance as well as research in artificial intelligence and cognitive science. LuminAI draws inspiration from the ancient art form of shadow theater; the original two-dimensional version of the installation led to the conceptualization of the dome as a liminal space in which human silhouettes and a virtual character dance together on the projection surface. Rather than relying on a predefined library of movement responses, the virtual dancer learns movements from its partners and utilizes Viewpoints movement theory to systematically reason about them and craft improvisational responses in the moment. Viewpoints theory is based in dance and theater and analyzes performance along the dimensions of tempo, duration, repetition, kinesthetic response, shape, spatial relationship, gesture, architecture, and topography. The virtual dancer is able to use several different strategies to respond to human movements; these include mimicry of a movement, transformation of the movement along Viewpoints dimensions, and recalling a similar or complementary movement from memory, drawing on movement patterns the agent has learned while dancing with its human partner. The reason we did this: this is part of a larger effort in our lab toward understanding the relationship between computation, cognition, and creativity, where a large amount of our
efforts go into understanding human creativity and how we make things together and are creative together, as a way to understand how we can build co-creative AI that serves the same purpose, that can be a colleague and collaborate with us and create things with us. So Brian was a graduate student in John Laird's lab as well. Before I start this, I alluded to this earlier: we're getting closer to Rosie saying "can you teach me?" So let me give you some introduction to this. In the lower left you're seeing the view of a Kinect camera onto a flat surface. There's a robotic arm, mainly 3D-printed parts, a few servos. Above that you're seeing an interpretation of the scene. We're giving it associations of the four areas with semantic titles, like one is the table, one is the garbage, just semantic terms for areas, but other than that the agent doesn't actually know all that much. And it's going to operate in two modalities: one is, we'll call it natural-ish language, a restricted subset of English, as well as some quote-unquote pointing, so you're going to see some mouse pointers in the upper left saying "I'll talk about this," and this is just a way to indicate location. So starting off, we're going to say things like "pick up the blue block," and it's going to be like, "I don't know, what is blue?" We say, "oh, well, that's a color." Okay. "Go get the green thing." "What's green?" "Oh, it's a color." Okay. "Move the blue thing to a particular location." "Where's that?" Point to it. Okay. "What is moving?" It really has to start from the beginning, and it's described, and then you say okay, now you've finished. And once we got to that point, now I can say "move the green thing over here," and it's got everything that it needs to be able to then reproduce the task given new parameters, and it's learned that ability. So let me give it a little bit of time so you can look a little bit at the top left. In terms of the pointers, you're going to see some text commands being entered. So: "what kind of
attribute is blue?" We're going to say it's a color, and so it can map it then to a particular sensory modality. "This is green," with the pointing: "what kind of thing is green?" Okay, color, so now it knows how to understand blue and green as colors with respect to the visual scene. "Move rectangle to the table." "What is rectangle?" Okay, now I can map that onto our understanding of parts of the world. "Is this the blue rectangle?" So the arm is actually pointing itself, to get confirmation from the instructor. And then we're trying to understand, in general, when you say move something, what is the goal of this operation? And so it also has a declarative representation of the idea of this task; not only has it completed it, it can look back on having completed the task and understand what were the steps that led to achieving a particular goal. So in order to move it, you're going to have to pick it up; it knows which one the blue thing is, great; now onto the table, so that's a particular location; and at this point we can say "you're done, you have accomplished moving the blue rectangle to the table." And so it can understand what that very simple kind of process is like and associate that with the verb to move. And now we can say "move the green object over to the garbage," and without any further interaction, based upon everything that it learned up till that point, it can successfully complete that task. So this is work of Shiwali Mohan and others at the Soar group at the University of Michigan on the Rosie project, and they're extending this to playing games and learning the rules of games through text-based descriptions and multimodal experience. So, in order to build up to that, here's a story. I wanted to give you a sense of how research occurs in the group. There's this back and forth that occurs over time: there's this piece of software called Soar, we want to make this thing better and give it new capabilities, so all our agents are going to become better, and we always have to keep in mind, and
you'll see this as I go further, that it has to be useful to a wide variety of agents, it has to be task-independent, and it has to be efficient for us to do anything in the architecture; all of those have to hold true. So we do something cool in the architecture, and then we say okay, let's solve a cool problem, so let's build some agents to do this, and this ends up testing what are the limitations, what are the issues that arise in a particular mechanism, as well as integration with others, and we get to solve interesting problems. We usually find there was something missing, and then we can go back to the architecture, and rinse and repeat. Just to give you an idea again of how Soar works: the working memory is actually a directed connected graph. Perception is just a subset of that graph, and so there's going to be a symbolic representation of most of the world; there is a visual subsystem in which you can provide a scene graph, I'm just not showing it here. Actions are also a subset of that graph, and the procedural knowledge, which is production rules, can modify sections of the input, modify sections of the output, as well as arbitrary parts of the graph, to take actions. The decision procedure says: of all the things that I know how to do, ranked according to various preferences, what single thing should I do? There's semantic memory for facts, and there's episodic memory: the agent is always storing every experience it's ever had over time in episodic memory, and it has the ability to get back to that. And so, in the cycle we saw before, we get input in this perception called the input link; rules fire all in parallel and say here's everything I know about the situation, here's all the things I could do; the decision procedure says here's what we're going to do; and based upon the selected operator, all sorts of things could happen with respect to memories providing input, rules firing to perform computations, as well as potentially output in the world.
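The perceive-decide-act cycle just described can be sketched in a few lines. This is an illustrative toy, not Soar's actual API; the rule bodies, operator names, and preference values are all made up for the example:

```python
# Toy sketch of a perceive-decide-act cycle: working memory is one
# structure, perception and action are just subsets of it, rules fire
# in parallel to propose operators, and a decision procedure picks one.

working_memory = {"input": {}, "output": {}}

def propose(wm):
    """All rules 'fire in parallel': each match contributes a candidate
    operator with a preference value (made up for this example)."""
    candidates = []
    if "obstacle" in wm["input"]:
        candidates.append(("turn", 0.9))
    if wm["input"].get("battery", 1.0) < 0.2:
        candidates.append(("recharge", 1.0))
    candidates.append(("move-forward", 0.5))   # default behavior
    return candidates

def decide(candidates):
    """Decision procedure: of everything proposed, commit to a single operator."""
    return max(candidates, key=lambda c: c[1])[0]

def step(wm, percept):
    wm["input"] = percept                      # perception updates the graph
    op = decide(propose(wm))
    wm["output"] = {"action": op}              # action is also just part of the graph
    return op

print(step(working_memory, {"obstacle": True}))   # -> turn
print(step(working_memory, {"battery": 0.1}))     # -> recharge
```

The structural points this is meant to show: proposals happen in parallel and are cheap, while exactly one operator is selected per cycle, which is why the whole cycle can be held to a fixed time budget.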
And remember, agent reactivity is required: we want the system to be able to react to things in the world at a very quick pace, so the overall cycle, with anything that happens in it, has to be under 50 milliseconds at max, and that's going to be a constraint we hold ourselves to. And so the story I'll be telling is how we got to a point where we started actually forgetting things. We're an architecture that doesn't have to be like humans, we want to create cool systems, but what we realized was that something we humans do probably has some benefit to it, and we actually put it into our system, and it led to good outputs. So here's the research path I'm going to walk down. We had a simple problem, which was: we have these memory systems, and sometimes they're going to get a cue that could relate to multiple memories, and the question is, if you have a fixed mechanism, what should you return, in a task-independent way? Which one of these many memories should you return? That was our question. And we looked to some human data on this, something called the rational analysis of memory, done by John Anderson, and realized that in human language there are recency and frequency effects, and that maybe those would be useful. And so we actually did an analysis and found that not only does this occur, but it's useful in what are called word sense disambiguation tasks; I'll get to what that means in a second. We developed some algorithms to scale this really well, and it turned out to work out well: not only in the original task, but when we looked at two other completely different ones, the same underlying mechanism ended up producing some really interesting outputs. So let me talk about word sense disambiguation real quick; this is a core problem in natural language processing, if you haven't heard of it before. Let's say we have an agent, and for some reason it needs to understand the verb to run. It looks to its memory and finds that it could, you know, run in the park, it could be running a fever, it could run an
election, it could run a program, and the question is: what should a task-independent memory mechanism return, if all you've been given is the verb to run? The rational analysis of memory looked through multiple text corpora, and what they found was that if a particular word had been used recently, it's very likely to be reused again, along with a frequency effect. In the expression here, each t is the time since a past use, and you sum over those uses with an exponential decay. And so what it looks like, if time is going to the right and higher activation is better: as you get these individual usages you get these little bumps, and then eventually it drops down. If we had just one usage of a word, the red curve would be what the decay looks like. And so the core problem here is, if we're at a particular point and we want to select between the blue thing or the red thing, blue would have a higher activation, and so maybe that's useful. This is how things are modeled with human memory, but is it useful in general for tasks? So we looked at common corpora used in word sense disambiguation, and said: if we just go through a corpus twice, and we use prior answers (I ask the question "what is the sense of this word," I take a guess, I get told the right answer), and I use that recency and frequency information in my task-independent memory, would that be useful? And somewhat of a surprise, but maybe not: it actually performed really well across multiple corpora. So we said okay, this seems like a reasonable mechanism, let's look at implementing this efficiently in the architecture. And the problem was this term right here, which said that for every memory, for every time step, you're having to pay; that doesn't sound like a recipe for efficiency if you're talking about lots and lots of knowledge over long periods of time. So we made use of a nice approximation that Petrov had come up with to approximate the tail of that sum.
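The recency-and-frequency score being described is the base-level activation equation from the rational analysis of memory: each past use of an item contributes (time since that use) raised to the power -d, the contributions are summed, and the log is taken. A minimal sketch, using the conventional decay rate d = 0.5; the access times are made up for illustration:

```python
import math

def base_level_activation(access_times, now, d=0.5):
    """Each past use contributes (now - t)^(-d): recent uses contribute
    a lot, old uses decay away, and many uses add up (frequency)."""
    return math.log(sum((now - t) ** (-d) for t in access_times))

# a word sense used a few times, a while ago...
old = base_level_activation([1, 5, 9], now=20)
# ...versus the same history plus one very recent use
recent = base_level_activation([1, 5, 9, 19], now=20)
print(old, recent)  # the extra recent access strictly raises activation
```

Given a cue matching several memories, returning the one with the highest activation is exactly the task-independent tie-breaking policy described above.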
Accesses that happened long, long ago, we could basically approximate their effect on the overall sum, so now we had a fixed set of values. And what we basically said is: since these activations are always decreasing, and all we care about is relative order, let's only recompute when something gets a new access. It's a guess, it's a heuristic, an approximation, but we looked at how this worked on the same set of corpora, and in terms of query time, if we made these approximations, we were well under our 50 milliseconds. The effect on task performance was negligible; in fact on a couple of these it got ever so slightly better in terms of accuracy. And actually, if we looked at the individual decisions being made, making these sorts of approximations led to at least 90 percent of the decisions being identical to having done the true, full calculation. So I said this is great, and we implemented this, and it worked really well. And then we started working on what seemed like completely unrelated problems. One was in mobile robotics: we had a mobile robot, I'll show a picture of it in a little while, roaming around the halls performing all sorts of tasks, and what we were finding was, if you have a system that's remembering everything, your short-term memory gets really, really big. I don't know about you, my short-term memory feels really, really small; I would love it to be big. But if you make your memory really big and you try to remember something, you're now having to pull lots and lots of information into your short-term memory. So the system was actually getting slower, simply because it had a large short-term memory representation of the overall map it was looking up. So: large working memory, a problem. Liar's dice, a game you play with dice: we were doing reinforcement learning in our RL-based system on this, and it turned out there's a really, really big value function; we were having to store lots of data, and we didn't know which stuff we had to keep around to keep performance up.
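The tail approximation mentioned above can be sketched as follows. A common scheme due to Petrov, and the flavor of what is being described, is to keep the k most recent access times exactly and approximate the remaining older accesses as if they were spread evenly over their interval, so the per-memory cost is O(k) instead of growing with the full access history. The ages and k here are made up, and the constants in the actual implementation may differ:

```python
import math

def bla_exact(ages, d=0.5):
    # ages[i] = time since the i-th past access; cost grows with history
    return math.log(sum(t ** (-d) for t in ages))

def bla_hybrid(ages, k=3, d=0.5):
    """Keep the k most recent accesses exactly; replace the older ones
    with an integral-style estimate (Petrov-style hybrid)."""
    ages = sorted(ages)                       # smallest age = most recent
    total = sum(t ** (-d) for t in ages[:k])
    older = ages[k:]
    if older:
        n, t_k, t_n = len(older), ages[k - 1], ages[-1]
        total += n * (t_n ** (1 - d) - t_k ** (1 - d)) / ((1 - d) * (t_n - t_k))
    return math.log(total)

ages = [1, 4, 9, 30, 80, 200, 500]
print(bla_exact(ages), bla_hybrid(ages))   # close, but hybrid is O(k) per query
```

Combined with the recompute-only-on-access heuristic (between accesses activations only decay, so relative order is stable), this is the kind of trick that kept queries well under the 50 millisecond budget.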
So we had a hypothesis that forgetting was actually going to be a beneficial thing: maybe the problem we have with our own memories, this forgetting thing that we really, really dislike, maybe it's actually useful. And so we experimented with the following policy. We said let's forget a memory if, one, it's not predicted to be useful by this base-level activation: we haven't used it recently, we haven't used it frequently, maybe it's not worth it; and two, we feel confident that we could approximately reconstruct it if we absolutely had to. If those two things held, we could forget something. So it's the same basic algorithm, but instead of ranking memories, we set a threshold for base-level activation, find when it is that a memory is going to pass that threshold, and try to forget based upon that, in a way that's efficient and isn't going to scale really, really poorly. We were able to come up with an efficient way to implement this using an approximation that ended up, for most memories, being exactly correct, and otherwise a fairly close approximation; I'm happy to go over the details if anybody's interested later. Compared to a completely accurate search for the value, it ended up being somewhere between 15 to 20 times faster. And so when we looked at our mobile robot here, sorry, let me get this back: our little robot is actually going around the third floor of the computer science building at the University of Michigan; it's going around, it's building a map, and again, the idea was this map is getting too big. So here was the basic idea: as the robot's going around, it's going to need this map information about rooms; the color there is describing the strength of the memory, and as the robot gets farther and farther away, and it hasn't used part of the map for planning or other purposes, basically let it decay away, so that by the time it gets to the bottom it's
forgotten about the top, but we had the belief that we could reconstruct it if we needed it.
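The threshold-based forgetting just described can be sketched as: given an activation threshold theta, predict the future cycle at which a memory's base-level activation will cross it, and schedule the removal for then instead of rechecking every memory on every cycle. This toy version finds the crossing by doubling plus bisection; the actual work used a closed-form approximation, and theta, d, and the access times here are made up:

```python
import math

def activation(access_times, now, d=0.5):
    return math.log(sum((now - t) ** (-d) for t in access_times))

def forget_time(access_times, now, theta, d=0.5, horizon=1_000_000):
    """Earliest future time at which activation drops below theta.
    Activation only decays between accesses, so the crossing can be
    computed once and cached until the memory is touched again."""
    hi = now + 1
    while activation(access_times, hi, d) > theta:
        hi *= 2
        if hi > horizon:
            return None               # effectively never worth forgetting
    lo = now + 1
    while hi - lo > 1:                # bisect to the exact integer cycle
        mid = (lo + hi) // 2
        if activation(access_times, mid, d) > theta:
            lo = mid
        else:
            hi = mid
    return hi

print(forget_time([0, 3, 7], now=10, theta=-1.5))
```

A memory is then dropped at its scheduled time only if the second condition also holds, that is, the agent believes it could approximately reconstruct the memory from other knowledge if it turned out to be needed.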