Michael Kearns: Algorithmic Fairness, Privacy & Ethics | Lex Fridman Podcast #50
AzdxbzHtjgs • 2019-11-19
The following is a conversation with Michael Kearns. He's a professor at the University of Pennsylvania and a co-author of the new book The Ethical Algorithm, which is the focus of much of this conversation. It includes algorithmic fairness, bias, privacy, and ethics in general, but that is just one of many fields in which Michael is a world-class researcher, some of which we'll touch on quickly, including learning theory, or the theoretical foundations of machine learning, game theory, quantitative finance, computational social science, and much more.

On a personal note, when I was an undergrad, early on I worked with Michael on an algorithmic trading project and competition that he led. That's when I first fell in love with algorithmic game theory. While most of my research life has been in machine learning and human-robot interaction, the systematic way that game theory reveals the beautiful structure in our competitive and cooperative world of humans has been a continued source of inspiration to me. So for that and other things, I'm deeply thankful to Michael, and I really enjoyed having this conversation again, in person, after so many years.

This is the Artificial Intelligence podcast. If you enjoy it, subscribe on YouTube, give it five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at lexfridman, spelled F-R-I-D-M-A-N.

This episode is supported by an amazing podcast called Pessimists Archive. Jason, the host of the show, reached out to me looking to support this podcast, and so I listened to it to check it out. And I listened, I mean, I went through it Netflix-binge style, at least five episodes in a row. It's now one of my favorite podcasts, and I think it should be one of the top podcasts in the world, frankly. It's a history show about why people resist new things. Each episode looks at a moment in history when something new was introduced, something that today we think of as commonplace, like recorded music, umbrellas, bicycles, cars, chess, coffee, the elevator, and the show explores why it freaked everyone out. The latest episode, on mirrors and vanity, still stays with me as I think about vanity in the modern day of the Twitter world. That's the fascinating thing about the show: the stuff that happened long ago, especially in terms of our fear of new things, repeats itself in the modern day, and so it has many lessons for us to think about in terms of human psychology and the role of technology in our society. Anyway, you should subscribe to and listen to Pessimists Archive. I highly recommend it.

And now, here's my conversation with Michael Kearns.

You mentioned reading Fear and Loathing in Las Vegas in high school, and having a bit more of a literary mind. So what books, non-technical, non-computer-science, would you say had the biggest impact on your life, either intellectually or emotionally?

You've dug deep into my history, I see. Well, my favorite novel is Infinite Jest by David Foster Wallace, which, coincidentally, much of it takes place in the halls of buildings right around us here at MIT, so that certainly had a big influence on me. And as you noticed, when I was in high school, and I actually even started college as an English major, I was very influenced by the gonzo genre of journalism at the time and thought I wanted to be a writer. Then I realized that an English major teaches you how to read, but it doesn't teach you how to write, and I became interested in math and computer science instead.

In your new book, The Ethical Algorithm, you kind of sneak up from an algorithmic perspective on these deep, profound philosophical questions of fairness and of privacy.
In thinking about these topics, how often do you return to that literary mind?

I'd like to claim there was a deeper connection, but I think both Aaron and I came at these topics first and foremost from a technical angle. I consider myself primarily, and originally, a machine learning researcher, and I think, as we watched, like the rest of society, the field technically advance, and then quickly on the heels of that the buzzkill of all the antisocial behavior by algorithms, we just realized there was an opportunity for us to do something about it from a research perspective.

More to the point of your question: I do have an uncle who is literally a moral philosopher, and so in the early days of our technical work on fairness topics I would occasionally run ideas by him. I remember an early email I sent to him in which I said, here's a specific definition of algorithmic fairness that we think is some sort of variant of Rawlsian fairness; what do you think? I thought I was asking a yes-or-no question, and I got back the kind of classic philosopher's response: well, it depends; if you look at it this way, then you might conclude this. That's when I realized there was a real rift between the way philosophers and others had thought about things like fairness, from sort of a humanitarian perspective, and the way you needed to think about it as a computer scientist if you were going to implement actual algorithmic solutions.

But I would say the algorithmic solutions take care of some of the low-hanging fruit. The problem is that a lot of algorithms, when they don't consider fairness, are just terribly unfair, and when they don't consider privacy, they terribly violate privacy. The algorithmic approach fixes big problems, but when you start pushing into the gray area, that's when you start getting into this philosophy of what it means to be fair, starting from Plato's "what is justice?" kind of questions.

Yeah, I think that's right, and I wouldn't even go as far as to say that the algorithmic work in these areas is solving the biggest problems. We discuss in the book the fact that there's a sense in which we're kind of looking where the light is. For example, if police are racist in who they decide to stop and frisk, and that goes into the data, there's no undoing that damage by clever algorithmic methods. And I think especially in fairness (less so in privacy, where we feel like the community really has settled on the right definition, which is differential privacy), if you just look at the algorithmic fairness literature, you can already see it's going to be much more of a mess. You've got these theorems saying: here are three entirely reasonable, desirable notions of fairness, and here's a proof that you cannot simultaneously have all three of them. So I think we know that algorithmic fairness, compared to algorithmic privacy, is going to be a harder problem, and it will have to revisit things that have been thought about by many generations of scholars before us. It's very early days for fairness, I think.
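Since differential privacy is named above as the definition the privacy community has settled on, a minimal sketch of its most common building block, the Laplace mechanism, may help anchor the term. This is an illustrative rendering, not something from the conversation; the function name and the numbers are hypothetical.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release a numeric statistic with epsilon-differential privacy by
    adding Laplace noise whose scale grows with the statistic's
    sensitivity (how much one person's data can change it) and shrinks
    as the privacy budget epsilon grows."""
    rng = np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: privately release a count. Adding or removing one person
# changes a count by at most 1, so sensitivity = 1.
private_count = laplace_mechanism(true_value=412, sensitivity=1.0, epsilon=0.5)
```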
So before we get into the details of differential privacy, and then the fairness side, let me linger on the philosophy. Do you think most people are fundamentally good, or do most of us have both the capacity for good and evil within us?

I'm an optimist. I tend to think that most people are good and want to do right, and that deviations from that are usually due to circumstance, not to people being bad at heart.

What about people with power: people at the heads of governments, people at the heads of companies, people at the heads of, say, financial markets? Do you think the distribution there is also that most people are good and have good intent?

Yeah, I do. I mean, my statement wasn't qualified to people not in positions of power. You know the cliche, absolute power corrupts absolutely, but I think even short of that, having spent a lot of time on Wall Street, and also in arenas very, very different from Wall Street, like academia, one of the things I've benefited from by moving between two very different worlds is that you become aware that these worlds kind of develop their own social norms, and their own rationales for behavior that might look unusual to outsiders but, when you're in that world, doesn't feel unusual at all. I think this is true of a lot of professional cultures. So then, and maybe "slippery slope" is too strong a phrase, you're in some world where you're mainly around other people with the same kind of viewpoints and training and worldview as you, and I think that's more of a source of abuses of power than there being good people and evil people, with the evil people somehow being the ones that rise to power.

That's really interesting. So within the social norms constructed by that particular group of people, you're all trying to do good, but because it's a group, you might drift into something that, for the broader population, does not align with the values of society.

Yeah, or not even that you drift, but that things which don't make sense to the outside world don't seem unusual to you. It's not sort of a good or a bad thing. For instance, in the world of finance, there are a lot of complicated types of activity where, if you are not immersed in that world, you cannot see why that activity exists at all; it just seems completely useless, people pushing money around. When you're in that world and you learn more, your view does become more nuanced. You realize, okay, there actually is a function to this activity, and in some cases you would conclude that if, magically, we could eradicate this activity tomorrow, it would come back, because it actually is serving some useful purpose; it's just a purpose that's very difficult for outsiders to see. So I think lots of professional work environments, or cultures as I might put it, have these social norms that don't make sense to the outside world. Academia is the same, right? Lots of people look at academia and say: what the hell are all of you people doing? Why are you paid so much, in some cases at taxpayer expense, to publish papers that nobody reads?
But when you're in that world, you come to see the value of it, even though you might not be able to explain it to the person in the street.

And in the case of the financial sector, tools like credit might not make sense to people. Is that a good example of something that does seem to pop up and be useful, or is it just the power of markets and capitalism in general?

In finance, the primary example I would give is leverage: being allowed to borrow, to sort of use ten times as much money as you've actually got. That's an example of something that, before I had any experience in financial markets, I might have looked at and said, well, what is the purpose of that? That just seems very dangerous. And it is dangerous, and it has proven dangerous. But the fact of the matter is that if, on some particular time scale, you are holding positions that are very unlikely to lose much of their value (your value at risk, your variance, is like 1 or 5 percent), then it kind of makes sense that you would be allowed to use a bit more than you have, because you have some confidence that you're not going to lose it all in a single day. Now, of course, when that happens, we've seen what happens, and not too long ago. But the idea that it serves no useful economic purpose under any circumstances is definitely not true.

We'll return to the other coast, Silicon Valley, and the problems there, as we talk about privacy and as we talk about fairness. At the high level, I'll ask some basic questions with the hope of getting at the fundamental nature of reality. From a very high level: what is an ethical algorithm? I can say that an algorithm has a running time of, using big-O notation, n log n. I can say that a machine learning algorithm classifies cat versus dog with 97% accuracy. Do you think there will one day be a way to measure, in the same compelling way as big-O notation, that this algorithm is 97% ethical?

First of all, let me riff for a second on your specific n log n example. Early in the book, when we're just trying to describe algorithms, period, we ask: what's an example of an algorithmic problem? Sorting, right? You have a bunch of index cards with numbers on them and you want to sort them. And we describe an algorithm that sweeps all the way through, finds the smallest number, puts it at the front, then sweeps through again, finds the second-smallest number, and so on. We make the point that this is an algorithm, and it's also a bad algorithm, in the sense that it's quadratic rather than n log n, which we know is optimal for sorting. And we make the point that even within the confines of a very precisely specified problem, there might be many, many different algorithms for the same problem with different properties: some might be faster in terms of running time, some might use less memory, some might have better distributed implementations. So the point is that we're already used to, in computer science, thinking about trade-offs between different types of quantities and resources, and there being better and worse algorithms. Our book is about that part of algorithmic ethics that we know how to put on that same kind of quantitative footing right now.
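The sorting procedure Kearns describes is selection sort; here is a minimal sketch (my rendering, not code from the book):

```python
def selection_sort(cards):
    """Repeatedly sweep the unsorted remainder, find the smallest value,
    and move it to the front. Simple, but O(n^2) comparisons, versus the
    O(n log n) that optimal sorting algorithms achieve."""
    cards = list(cards)
    for i in range(len(cards)):
        smallest = i
        for j in range(i + 1, len(cards)):
            if cards[j] < cards[smallest]:
                smallest = j
        cards[i], cards[smallest] = cards[smallest], cards[i]
    return cards

print(selection_sort([7, 2, 9, 4]))  # [2, 4, 7, 9]
```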
Now, just to say something that our book is not about: it is not about broad, fuzzy notions of fairness. It's about very specific notions of fairness. There's more than one of them, and there are tensions between them, but if you pick one of them, you can do something akin to saying that this algorithm is 97% ethical. You can say, for instance, that for this lending model, the difference in the false rejection rate on black people and white people is within 3 percent. So we might call that a 97% ethical algorithm, and a 100% ethical algorithm would mean that that difference is 0%.

In that case, fairness is specified when two groups, however they're defined, are given to you.

That's right.

And then you can sort of mathematically start describing the algorithm. But nevertheless, in the part where the two groups are given to you: unlike with running time, we don't in computer science talk about how fast an algorithm feels when it runs.

True.

Whereas measuring how ethical an algorithm is starts getting into feelings. For example, if an algorithm runs in the background and doesn't disturb the performance of my system, it'll feel nice and I'll be okay with it, but if it overloads the system, it will feel unpleasant. In that same way, with ethics there's a feeling of how socially acceptable it is, how it represents the moral standards of our society today. So in that sense, and sorry to linger on these high-level philosophical questions, do you have a sense that we'll be able to measure how ethical an algorithm is?

First of all, I certainly didn't mean to give the impression that there's a complete mapping from things like memory/speed trade-offs onto fairness and accuracy, for instance. In the type of fairness definitions that are largely the objects of study today, and that are starting to be deployed, you, as the user of the definitions, need to make some hard decisions before you even get to the point of designing fair algorithms. One of them is deciding who it is that you're worried about protecting, who you're worried about being harmed by some notion of discrimination or unfairness. And then you need to decide what constitutes harm. For instance, in a lending application, maybe you decide that falsely rejecting a creditworthy individual, a false negative, is the real harm, and that false positives, i.e., people that are not creditworthy or not going to repay the loan but get a loan anyway, you might think of as lucky, so that's not a harm. Although it's not clear that, if you don't have the means to repay a loan, being given a loan is not also a harm. So the literature is so far quite limited, in that you need to say who you want to protect and what would constitute harm to that group.

And when you ask questions like, will algorithms feel ethical? One way in which they won't, under the definitions I'm describing, is that if you are an individual who is incorrectly denied a loan, all of these definitions basically say: well, your compensation is the knowledge that we are also falsely denying loans to other groups at the same rate that we're doing it to you. So there is actually this interesting, even technical, tension in the field right now between these sort of group notions of fairness and notions of fairness that might actually feel like real fairness to individuals, where they might really feel like their particular interests are being protected, or thought about, by the algorithm, rather than just the groups they happen to be members of.
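To make the "97% ethical" arithmetic above concrete, here is a sketch of how such a group-fairness gap could be measured, assuming numpy arrays of true labels, model decisions, and a group attribute; the names are illustrative, not an implementation from the book:

```python
import numpy as np

def false_rejection_rate(y_true, y_pred):
    """Fraction of truly creditworthy applicants (y_true == 1)
    that the model rejects (y_pred == 0)."""
    creditworthy = y_true == 1
    return np.mean(y_pred[creditworthy] == 0)

def fairness_gap(y_true, y_pred, group):
    """Largest difference in false rejection rates across groups.
    By the rough accounting in the conversation, a gap of 0.03
    would make the model '97% ethical'."""
    rates = [false_rejection_rate(y_true[group == g], y_pred[group == g])
             for g in np.unique(group)]
    return max(rates) - min(rates)
```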
Right. Are there parallels to the big-O notion of worst-case analysis? That is, is it important to look at the worst violation of fairness for an individual, and to minimize it for that one individual? Is worst-case analysis something you think about?

I think we're not even at the point where we can sensibly think about that. First of all, we're talking here both about fairness applied at the group level, which is a relatively weak thing, but better than nothing, and also the more ambitious thing of trying to give some individual promises. But even that doesn't incorporate something you're hinting at here, which is what I might call subjective fairness. A lot of the definitions, in fact all of the definitions in the algorithmic fairness literature, are what I would call received-wisdom definitions. Somebody like me sits around and thinks: okay, here's a technical definition of fairness that I think people should want, or should think of as some notion of fairness; maybe not the only one, maybe not the best one, maybe not the last one. But we really don't know, from a subjective standpoint, what people actually think is fair. We've just started doing a little bit of work in our group on actual human-subject experiments, in which we ask people questions about fairness. We survey them; we show them pairs of individuals in, let's say, a criminal recidivism prediction setting, and we ask them: do you think these two individuals should be treated the same as a matter of fairness? To my knowledge there's not a large literature in which ordinary people have their notions of subjective fairness elicited from them. It's mainly scholars who think about fairness making up their own definitions, and I think this needs to change, actually, for many social norms, not just for fairness. There's a lot of discussion these days in the AI community about interpretable or understandable AI, and as far as I can tell, everybody agrees that deep learning, or at least the outputs of deep learning, are not very understandable, and people might agree that sparse linear models with integer coefficients are more understandable. But there's very little literature on showing people models and asking them whether they understand what the model is doing. I think that in all these topics, as these fields mature, we need to start doing more behavioral work.

Yeah. Psychology is one of my deep passions, and I always thought computer scientists would be the best future psychologists, in the sense that data, especially in this modern world, is a really powerful way to understand and study human behavior. And you've explored that with your game theory side of work as well.

I'd like to think that what you say is true about computer scientists and psychology. From my own limited wandering into human-subject experiments, we have a great deal to learn. Not just computer science, but AI and machine learning more specifically, I kind of think of as imperialist research communities,
in that, kind of like physicists in an earlier generation, computer scientists don't think of any scientific topic as off-limits to them. They will freely wander into areas that others have been thinking about for decades or longer, and we usually tend to embarrass ourselves in those efforts for some amount of time. I think reinforcement learning is a good example. For a lot of the early work in reinforcement learning, I have complete sympathy for the control theorists who looked at it and said: okay, you are reinventing stuff that we've known since the '40s. But in my view, eventually computer scientists made significant contributions to that field, even though we kind of embarrassed ourselves for the first decade. So if computer scientists are going to start engaging in psychology and human-subjects types of research, we should expect to be embarrassing ourselves for a good ten years or so, and then hope it turns out as well as some other areas we've waded into.

You kind of mentioned this, so let me linger on the idea of an ethical algorithm, the idea of group thinking versus individual thinking. One of the amazing things about algorithms, and your book, and this field of study, is that converting these ideas into algorithms forces us to ask questions of ourselves as a human civilization. There are a lot of people now in public discourse doing group thinking: there are particular sets of groups that we don't want to discriminate against, and so on. And then there are individuals, with their individual life stories and the struggles they went through. In philosophy, it's easier to do group thinking, because it's very hard to think about individuals; there's so much variability. But with data, you can start to actually say: you know what, group thinking is too crude; you're actually doing more discrimination by thinking in terms of groups than individuals. Can you linger on that idea of group versus individual in ethics, and is it good to continue thinking in terms of groups in algorithms?

Let me start by answering a very good high-level question with a slightly narrow technical response. These group definitions of fairness say: here are a few groups, different racial groups, maybe gender groups, maybe age, and let's make sure that none of these groups has a false negative rate much higher than any other one of these groups. These are classic group, aggregate notions of fairness. But at the end of the day, an individual can be thought of as a combination of all of their attributes: they're a member of a racial group, they have a gender, they have an age, and many other demographic properties that are not biological but are still very strong determinants of outcome and personality and the like. So one useful spectrum to think about is the array between the group and the individual, and to realize that, in some ways, asking for fairness at the individual level is to ask for group fairness simultaneously for all possible combinations of groups.
So in particular, if I build a predictive model that meets some definition of fairness by race, by gender, by age, by what-have-you, marginally (to get slightly technical: independently), I shouldn't expect that model not to discriminate against, say, disabled Hispanic women over age 55 making less than fifty thousand dollars a year, even though I might have protected each one of those attributes marginally.

That's a fascinating way to put it. So one way to approach optimizing fairness for individuals is just to add more and more definitions of groups.

That's right. At the end of the day, we could think of all of ourselves as groups of size one, because eventually there's some attribute that separates you from me and everybody else in the world. So it is possible to put these incredibly coarse ways of thinking about fairness and these very individualistic, specific ways on a common scale. And one of the things we've worked on from a research perspective is this: in relative terms, we know how to provide fairness guarantees at the coarsest end of the scale; we don't know how to provide sensible, tractable, realistic fairness guarantees at the individual level; but maybe we can start creeping toward that by dealing with more refined subgroups. We gave a name to the phenomenon where you enforce some definition of fairness for a bunch of marginal attributes or features, but then find yourself discriminating against a combination of them: we call it fairness gerrymandering, because, like political gerrymandering, you're giving some guarantee at the aggregate level, but when you look in a more granular way at what's going on, you realize you're achieving that aggregate guarantee by favoring some groups and discriminating against others. It's early days, but there are algorithmic approaches that let you start creeping toward the individual end of the spectrum.
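A minimal sketch of what auditing for fairness gerrymandering could look like: checking the false rejection gap not just per marginal attribute but over every intersection of them. The function name and the choice of metric are illustrative assumptions, not the algorithms from Kearns's research:

```python
import itertools
import numpy as np

def subgroup_gaps(y_true, y_pred, attrs):
    """For every combination of protected attributes, compute the largest
    difference in false rejection rates across the joint subgroups it
    defines. A model can look fair on each attribute alone (r == 1)
    while large gaps hide in the intersections (r >= 2).
    attrs: dict mapping attribute name -> array of per-example values."""
    names = list(attrs)
    gaps = {}
    for r in range(1, len(names) + 1):
        for combo in itertools.combinations(names, r):
            keys = list(zip(*(attrs[n] for n in combo)))
            rates = []
            for key in set(keys):
                in_subgroup = np.array([k == key for k in keys])
                creditworthy = in_subgroup & (y_true == 1)
                if creditworthy.any():
                    rates.append(np.mean(y_pred[creditworthy] == 0))
            if len(rates) > 1:
                gaps[combo] = max(rates) - min(rates)
    return gaps
```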
Does there need to be human input in the form of weighing the value, the importance, of each kind of group? For example, crudely speaking, gender being male and female, and then different races: are we as humans supposed to put a value on it, saying gender is 0.6 and race is 0.4 in the big optimization of achieving fairness?

I don't need to tell you that, of course, technically one could incorporate such weights into a definition of fairness if one wanted to. But fairness is an interesting topic in this respect. The book being about fairness, privacy, and many other social norms, fairness is the much, much more loaded topic. With privacy, people want privacy, people don't like violations of privacy; violations of privacy cause damage, angst, and bad publicity for the companies that are victims of them. But basically everybody agrees that more data privacy would be better than less data privacy, and the discussions don't become politicized along other dimensions, like race and gender. With fairness, you quickly find yourself revisiting topics that have been debated, unresolved, forever, like affirmative action. Some people will say: why are you protecting this particular racial group? And others will say: we need to do that as a matter of retribution. Other people will say it's a matter of economic opportunity. I don't know which of these is the right answer, but fairness is special in that, as soon as you start talking about it, you inevitably have to participate in debates about fair to whom, and at what expense to who else. Even in criminal justice, where people talk about fairness in criminal sentencing, or predicting failures to appear, or making parole decisions, they'll point out: well, these definitions of fairness are all about fairness for the criminals; what about fairness for the victims? So when I say something like, the false incarceration rate for black people and white people needs to be roughly the same, there's no mention of potential victims of criminals in such a fairness definition.

And that's in the realm of public discourse. I just listened to Intelligence Squared debates, the US edition. They have this Oxford-style structure, two versus two, and they debated affirmative action. It was incredibly interesting that there are still really good points on every side of this issue, which is fascinating to listen to.

Yeah, I agree. And so it's interesting to be a researcher trying to do, for the most part, technical algorithmic work, but Aaron and I both quickly learned that you cannot do that, and then go out and talk about it and expect people to take it seriously, if you're unwilling to engage in these broader debates that are entirely extra-algorithmic. They're not about algorithms or making algorithms better; they're about, as you said, what society should be protecting in the first place.

When you discuss an algorithm that achieves fairness, whether in the constraints or the objective function, there's an immediate kind of analysis you can perform, which says: if you care about fairness in gender, this is the amount you have to pay for it in terms of the performance of the system. Is there a role for statements like that in a table in a paper, or do you prefer not to touch that?

Oh, we want to touch that, and we do touch it. Just to make sure I'm not promising your viewers more than we know how to provide: if you pick a definition of fairness, like "I'm worried about gender discrimination," and you pick a notion of harm, like false rejection for a loan, and you give me a model, I can definitely audit that model. It's easy for me to go from data to saying: okay, your false rejection rate on women is this much higher than it is on men. But once you also put the fairness into your objective function, the table you're talking about is what we would call the Pareto curve. You can literally trace it out, and we give examples of such plots on real datasets in the book. You have two axes: on the x-axis is your error; on the y-axis is unfairness, which might be the disparity in false rejection rates between two groups. And your algorithm now has a knob that basically says: how strongly do I want to enforce fairness?
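Here is a sketch of how such a Pareto curve could be traced, assuming the fairness knob enters the training objective as a penalty weight. `train_model`, `error`, and `unfairness` are placeholder names for whatever learner and fairness measure (for example, the false-rejection gap sketched earlier) are in play; this is illustrative, not the book's algorithm:

```python
import numpy as np

def trace_pareto_curve(train_model, error, unfairness,
                       knobs=np.linspace(0.0, 10.0, 21)):
    """Re-train with an objective of the form  error + knob * unfairness
    for a range of knob settings, recording the (error, unfairness) each
    model achieves. Plotting the pairs traces the trade-off frontier."""
    curve = []
    for knob in knobs:
        model = train_model(fairness_weight=knob)  # hypothetical trainer
        curve.append((knob, error(model), unfairness(model)))
    return curve
```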
If the two axes are error and unfairness, we'd like to be at (0, 0): zero error and zero unfairness simultaneously. Anybody who works in machine learning knows that you're generally not going to get to zero error, period, without any fairness constraint whatsoever, so that's not going to happen. But in general you'll get some kind of convex curve that specifies the numerical trade-off you face: if I want to go from 17% error down to 16% error, what will be the increase in unfairness I experience as a result? This curve specifies the undominated models; models off the curve can be strictly improved in one or both dimensions, meaning you can make the error better, or the unfairness better, or both. Our view is that these Pareto curves, efficient frontiers as you might call them, are not only valuable scientific objects; I actually think that in the near term they might need to be the interface between researchers working in the field and stakeholders in given problems. You could really imagine telling a criminal jurisdiction: look, you're concerned about racial fairness, but you're also concerned about accuracy. You want to release on parole people that are not going to recommit a violent crime, and you don't want to release the ones who will; that's accuracy. But if you also care about the mistakes you make not falling disproportionately on one racial group or another, you can show them this curve. I'm hoping that in the near future it will be possible to explain these curves to non-technical people, because they're the ones who have to make the decision about where we want to be on the curve, about the relative merits of lower error versus lower unfairness. That's not something computer scientists should be deciding for society. The people in the field, so to speak, the policymakers, the regulators: that's who should be making these decisions. But I think, and hope, that they can be made to understand that these trade-offs generally exist, and that you need to pick a point. Ignoring the trade-off means you're implicitly picking a point anyway; you just don't know it and you're not admitting it.

Just to linger on the point of trade-offs, because I think that's a really important thing to think about: when we start to optimize for fairness, is there almost always, in most systems, going to be a trade-off? Just to clarify, since there have been some technical terms thrown around: in a perfectly fair world, why would somebody be upset about that?

The specific trade-off I talked about, just to make things very concrete, was between numerical error and some numerical measure of unfairness.

And what is numerical error in this case?

Just predictive error: the probability or frequency with which you release somebody on parole who then goes on to recommit a violent crime, or keep incarcerated somebody who would not have recommitted one.

So in the case of granting somebody parole, letting them out, you don't want them to recommit a crime; your system failed in its prediction if they go on to commit a crime. Okay, so that's the performance; that's one axis.
And what's the fairness axis?

The fairness axis might be the difference between racial groups in false positive predictions: namely, people that I kept incarcerated, predicting that they would recommit a violent crime, when in fact they wouldn't have.

And the unfairness of that, just to linger on it and allow me, however ineloquently, to try to describe why that's unfair: the unfairness you want to get rid of is the bias in the judge's mind from having been brought up in society, the slight racial bias, the racism that exists in society. You want to remove that from the system. Another way it's been debated is equality of opportunity versus equality of outcome, and there's a weird dance there that's really difficult to get right. Affirmative action is, in a way, exploring that space.

Right. And this also quickly bleeds into questions like: well, maybe if one group really does recommit crimes at a higher rate, the reason is that at some earlier point in the pipeline, or earlier in their lives, they didn't receive the same resources that the other group did. In fairness discussions, there's always the possibility that the real injustice came earlier: earlier in this individual's life, earlier in this group's history, and so on.

And so a lot of the fairness discussion is almost about a corrective mechanism, to account for the injustice earlier in life, by some definitions or theories of fairness.

Yeah, and others would say: look, it's not to correct that injustice; it's just to level the playing field right now, and not falsely incarcerate more people of one group than another. But it might be helpful just to demystify a little bit how bias or unfairness can come into algorithms, especially in the machine learning era. Many of your viewers have probably heard these examples before, but let's say I'm building a face recognition system. I'm gathering lots of images of faces and trying to train the system to recognize new faces of those individuals from a training set of images. It shouldn't surprise anybody, certainly not anybody in the field of machine learning, that if my training dataset is primarily white males, and I'm training the model to maximize overall accuracy on my training dataset, the model can reduce its error most by getting things right on the white males who constitute the majority of the dataset, even if that means it will be less accurate on other groups. Now, there are a bunch of ways you could think about addressing this. One is to deliberately put into the objective of the algorithm a term that says: don't optimize the error at the expense of this discrimination, and then you're back in the land of these two-dimensional numerical trade-offs. A valid counter-argument is to say: no, the notion of a tension between error and accuracy here is a false one; you could instead just go out and get much more data on these other groups that are in the minority and equalize your dataset, or you could train a separate model on those subgroups and have multiple models.
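One cheap version of the "put it in the objective" fix is to reweight training examples so that no group dominates the average loss. A minimal sketch, assuming a scikit-learn-style learner that accepts per-example sample weights; everything here is illustrative rather than a method from the book:

```python
import numpy as np

def balancing_weights(group):
    """Weight each example inversely to its group's frequency, so every
    group contributes equally to the training objective instead of the
    majority group dominating it."""
    values, counts = np.unique(group, return_counts=True)
    weight_of = {v: len(group) / (len(values) * c)
                 for v, c in zip(values, counts)}
    return np.array([weight_of[g] for g in group])

# Typical use with a scikit-learn-style estimator:
#   model.fit(X, y, sample_weight=balancing_weights(group))
```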
The point we try to make in the book is that those things have costs too. Going out and gathering more data on groups that are relatively rare compared to your majority group may not cost you in the accuracy of the model, but it's going to cost the company developing the model more money, and it also costs more money to build separate predictive models and to implement and deploy them. So even if you can find a way to avoid the tension between error and accuracy in training a model, you might just be pushing the cost somewhere else: money, development time, research time, and the like.

There are fundamentally difficult philosophical questions in fairness, and we live in a very divisive political climate, an outrage culture. There are alt-right folks on 4chan, trolls; there are social justice warriors on Twitter; there are very divisive, outraged folks on all sides of every kind of system. How do we as engineers build ethical algorithms in such a divisive culture? Can they be disjoint, where the human injects their values and then you optimize over those values? Because in our times, when you start actually applying these systems, things get a little bit challenging for the public discourse. How do you think we can proceed?

For the most part in the book, a point we take some pains to make is that we don't view ourselves, or people like us, as being in the position of deciding for society what the right social norms are, what the right definitions of fairness are. Our main point is to show that if society, or the relevant stakeholders in a particular domain, can come to agreement on those sorts of things, there's a way of encoding that into algorithms, in many cases, though not in all cases. One misconception we hopefully dispel: sometimes people read the title of the book and, not unnaturally, fear that what we're suggesting is that the algorithms themselves should decide what those social norms are, and develop their own notions of fairness and privacy or ethics. We're definitely not suggesting that.

The title of the book is The Ethical Algorithm, by the way. I hadn't thought of that interpretation of the title; that's interesting.

Yeah, especially these days, when people are concerned about the robots becoming our overlords, the idea that the robots would also develop their own social norms is just one step away from that. But I do think, despite the disclaimer that people like us shouldn't be making those decisions for society, we are living in a world where, in many ways, computer scientists have made decisions that have fundamentally changed the nature of our society, and democracy, and civil discourse and deliberation, in ways that I think most people generally feel are bad these days.

But they had to make those decisions, right? If we look at the people at the heads of companies and so on, there had to be decisions. There are two options: either you put your head in the sand, don't think about these things, and just let the system do what it does, or you make decisions about what you value; you openly inject moral values into it.

Look, I never mean to be an apologist for the tech industry, but I think it's a little too far to say that explicit decisions were made about these things. Let's, for instance, take social media platforms.
Like many inventions in technology and computer science, a lot of the platforms we now use regularly started as curiosities. I remember when things like Facebook came out, and its predecessors, like Friendster, which nobody even remembers now; people really wondered why anybody would want to spend time doing that. Even the web, when it first came out, when it wasn't populated with much content and was largely hobbyists building their own ramshackle websites, a lot of people looked at it and said: what is the purpose of this thing? Why is this interesting? Who would want to do this? So even with things like Facebook and Twitter, yes, technical decisions were made by engineers, by scientists, by executives, in the design of those platforms. But I don't think anyone ten years ago anticipated that those platforms, for instance, might acquire undue influence on political discourse or on the outcomes of elections. I think the scrutiny these companies are getting now is entirely appropriate, but I think it's a little too harsh to look at history and say: oh, you should have been able to anticipate that this would happen with your platform. In the gaming chapter of the book, one of the points we make is that these platforms don't operate in isolation. Unlike the other topics we're discussing, like fairness and privacy, which are really cases where algorithms can operate on your data and make decisions about you without you even being aware of it, things like Facebook and Twitter are social systems, and their evolution, even their technical evolution, because machine learning is involved, is driven in no small part by the behavior of the users themselves: how the users decide to adopt them and how to use them. Until we saw it happen, who knew that these things might be able to influence the outcome of elections? Who knew that they might polarize political discourse, because of the ability to decide who you interact with on the platform, and because the platform naturally uses machine learning to optimize for your own interests, so that they would further isolate us from each other and feed us all, basically, just the stuff we already agreed with? We've come to that outcome, I think, largely as something we all learned together, including the companies, as these things happened.

Now, you ask: are there algorithmic remedies to these kinds of things? Again, these are big problems that are not going to be solved by somebody going in and changing a few lines of code somewhere in a social media platform. But I do think in many ways there are definitely ways of making things better. An obvious recommendation we make at some point in the book: to the extent that we think machine learning applied for personalization purposes, in things like the newsfeed or other platforms, has led to polarization and intolerance of opposing viewpoints, these algorithms have models, right? They place people in some kind of metric space, and they place content in that space, and they sort of know the extent to which I have an affinity for a particular type of content. By the same token, that same model probably gives you a good idea of the stuff I'm likely to violently disagree with or be offended by. So in this case there really is a knob you could tune that says: instead of showing people only what they like and what they want, let's show them some stuff that we think they don't like, or that's a little bit further away. And you could even imagine users being able to control this: everybody gets a slider, and that slider says, how much stuff do you want to see that you might disagree with, or that is at least further from your interests?

It's almost like an exploration button.
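A toy sketch of what that slider could look like over a model's affinity scores. The blending rule, and the use of low affinity as a proxy for "further away," are my illustrative assumptions, not a description of any platform's actual ranking:

```python
import numpy as np

def rank_with_slider(affinity, slider):
    """Rank items by a blend of predicted affinity and 'distance' from
    the user's usual tastes. slider = 0.0 is pure engagement ranking;
    slider = 1.0 maximally favors content the model predicts is far
    from what the user normally consumes."""
    affinity = np.asarray(affinity, dtype=float)   # scores in [0, 1]
    novelty = 1.0 - affinity                       # crude 'further away' proxy
    score = (1.0 - slider) * affinity + slider * novelty
    return np.argsort(-score)                      # item indices, best first

# slider=0.2: mostly familiar content, with some challenging items mixed in.
order = rank_with_slider([0.9, 0.7, 0.2, 0.05], slider=0.2)
```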
So, to get at your intuition: do you think ideas of fairness won't emerge if we just optimize for engagement, for you staying on the platform, you staying engaged? How bad is it to just optimize for engagement? Do you think we'll run into big trouble if we're just optimizing for how much you love the platform?

Well, optimizing for engagement kind of got us where we are.

So do you, one, have faith that it's possible to do better, and two, if it is, how do we do better?

It's definitely possible to do different. And again, it's not as if I think that doing something different than optimizing for engagement won't cost these companies in real ways, including revenue and profitability, potentially.

In the short term, at least.

In the short term, right. And if I worked at these companies, I'm sure it would have seemed like the most natural thing in the world to want to optimize engagement. That's good for users in some sense: you want them to be vested in the platform, enjoying it, finding it useful, interesting, and productive. But my point is that the idea that it's somehow out of their hands, as you said, or that there's nothing to be done about it (never say never), strikes me as implausible as a machine learning person. These companies are driven by machine learning, and this optimization of engagement is essentially driven by machine learning: not just machine learning, but very, very large-scale A/B experimentation, where you tweak some element of the user interface, or some component of an algorithm, or some feature of your click-through prediction model. My point is that any time you know how to optimize for something, almost by definition that solution tells you how not to optimize for it, or how to do something different.

Engagement can be measured, though, whereas optimizing for, say, minimizing divisiveness, or maximizing intellectual growth over the lifetime of a human being: it's very difficult to measure that.

That's right. So I'm not claiming that doing something different will immediately make it apparent that it's a good thing for society. One way of thinking about where we are on some of these social media platforms is that it feels a bit like we're in a bad equilibrium: these systems are helping us all optimize something myopically and selfishly for ourselves. Of course, from an individual standpoint, at any given moment, why would I want to see things in my newsfeed that I find irrelevant, offensive, or the like?
But maybe, by all of us having these platforms myopically optimize in our own interests, we have reached a collective outcome as a society that we're unhappy with in different ways, say with respect to things like political discourse and tolerance of opposing viewpoints. And if Mark Zuckerberg gave you a call and said, I'm