tv Stuart Russell Human Compatible CSPAN December 22, 2019 1:15am-2:26am EST
1:15 am
love affair. [laughter] >> A t-shirt. >> But truly, this is a night I will not forget, and you will not forget. >> Thank you so much. [applause] >> I so wanted this to be a night that, for me, for both of us, and for all of you, makes our lives a little bit more just and fun and outrageous and full of companionship than they would have been if we had not been here. [applause] [cheering] >> Thank you, everyone, for coming tonight. [applause]
1:16 am
1:17 am
changes in the global power dynamic and in governance models. It is meaningful to support the development of AI with a view toward the long horizon, and our goal is to help decision-makers identify the steps they can take today that will have an outsized impact on the future trajectory of AI around the world. This supports the broader mission of the Center for Long-Term Cybersecurity, which is to help individuals and organizations address tomorrow's information security challenges and amplify the upside of the digital revolution. The Center for Human-Compatible AI is a research lab based at UC Berkeley aiming to reorient the field of AI toward beneficial systems through technical AI safety research. The center's faculty, researchers, and students pioneer technical research on topics that include cooperative inverse reinforcement learning, misspecified objectives, human-
1:18 am
robot cooperation, value and preference alignment, multi-agent systems, and theories of rationality, among other topics. Researchers use insights from computer science, machine learning, game theory, statistics, and the social sciences. We are thrilled to have the center's founder and director here with us to talk about his new book, Human Compatible: Artificial Intelligence and the Problem of Control. This book has been called "the most important book on AI so far," "the most important book I've read in quite some time," "a must-read," and "the book we have all been waiting for." Stuart Russell is known to many of you and has been a faculty member here for 33 years in computer science and cognitive science. He is also an honorary fellow in
1:19 am
Oxford. He is a co-author of Artificial Intelligence: A Modern Approach, which is the standard textbook in AI, used in over 1,400 universities in 128 countries. He currently holds a senior fellowship that is one of the most prestigious awards in the humanities and social sciences. Last but not least, he served for several years as an adjunct professor of neurological surgery. He does have a license to operate. Also joining us for the discussion is Richard Waters, the Financial Times' West Coast editor. He is based in San Francisco, where he leads a team of writers focused on technology and Silicon Valley. He writes about the tech industry and the uses of technology; current areas of interest include artificial intelligence and the growing power of the
1:20 am
tech platforms. His previous positions at the Financial Times include various roles in London, New York bureau chief, and telecom editor based in New York. Professor Russell and Mr. Waters will discuss recent and expected AI developments, including the expectation that AI capabilities will eventually exceed those of humans across a range of decision-making scenarios. We will hear about steps to ensure this is not just the stuff of science fiction but a new world that will benefit us all. We will hear from them for half an hour and then open it up for questions from the audience. After that we will break for a reception; food and drinks will be available, and the book Human Compatible will be available for purchase. Professor Russell has agreed to sign copies for those interested. I will now turn it over to Professor Stuart Russell and Richard Waters.
1:21 am
[applause] >> Thank you very much. Welcome, and thank you for joining us. If you don't know the book, buy it after the discussion. I recommend it. We will dig into as much of it as we can, but we might hold back some secrets so you actually have to pay for it as well. As a journalist, one of the things that I find fascinating about the AI debate is the complete schism among people who know what they're talking about. On the one hand, we have people saying we will never get to human-level intelligence and these machines are perfectly safe. On the other hand, we have the Elon Musk tendency, and it's a shame that, as much as we admire him, he has run away with the sci-fi end of this debate, which needs to be anchored to something more
1:22 am
serious. I'm very glad to have this discussion, because what you have done is make us aware of the potential and the risks while anchoring them in a real-life, solid understanding of the science and of where we start from. I think this is a really good place to start, given the awful schism that we have right now. Since I'm a journalist, I will dive straight in. So, we're here in Berkeley. I know you're from a sunnier place. >> The other place. >> So, you've seen the One Hundred Year Study on AI, the landmark effort to map what is happening in AI
1:23 am
and anchor the debate in some reality. They essentially said that, unlike in the movies, there is no superintelligent AI on the horizon, and it is probably not even possible. That would be denying that AGI is even coming. So how do you answer that? >> Is this working? Can you all hear me? I don't think I could keep my voice at a high level long enough otherwise. So, interestingly, for the 70-year
1:24 am
history of AI, AI researchers have been the ones saying AI is possible. Usually it's philosophers who say it's impossible: for whatever reason, perhaps because they don't have the right microtubules, our AI systems can never become conscious, or whatever it might be. And usually those claims of impossibility have fallen by the wayside, one after another. But as far as I know, AI researchers had never said AI is impossible, until now. What could have prompted them? Imagine the One Hundred Year Study: 20 distinguished AI researchers giving their considered consensus opinion on
1:25 am
what is happening and what's going to happen in AI. Imagine if biologists did a similar summary of the state of the field of cancer research, and they said a cure for cancer is not on the horizon and probably not even possible. You would think, what on earth would make them say that? We have given them $500 billion of taxpayer money over the last few decades, and now they tell us the whole thing was a con? I don't understand what justification there could be for researchers saying AI is not possible, except a kind of denial, which is just saying: I don't want to think about the consequences of success, it is too scary, so I will find any argument that I can to avoid
1:26 am
having to think about that. And I have a long list. I used to give talks about AI and the risks, and then list the arguments for why we should ignore the risks. After I got to 28 arguments, kind of like the impeachment, 28 reasons why you cannot impeach Donald Trump, I just gave up, because it was taking too much time, and I did not want to take up too much time today. You get the usual: there is no reason to worry, we can always switch it off. And that's the classic pair: it will never happen, and anyway we can switch it off. There are other ones that I won't mention because they are too embarrassing.
1:27 am
>> We will get to what the machines might do to us later; first, let's focus on: will we get there? You say this is amazing, that after three decades of AI winters with nothing much happening, we're in a period of amazing progress, and yet some say it will never happen. Nonetheless, we're at a point with massive limitations in today's deep learning models, and for all the promise we can all see there's a huge gulf to get from here to there. You say it will take big conceptual breakthroughs to get there. Isn't this the point, that we don't even know what those breakthroughs are? What gives you the confidence that we will make them, why do you think that will
1:28 am
happen? >> I can tell you the breakthroughs that I think we need. You're right that after we make all the breakthroughs, we might find it's still not intelligent, and we may not even be sure why. But there are clear places where we can say: we don't know how to do this, but if we did, that would be a big step forward. There have already been dozens of breakthroughs over the history of AI, actually going back much further. You could say Aristotle was doing AI, even though he did not have a computer or electricity to do it with; he was thinking about the mechanical processes of human thought: decision-making, planning, and so on. In my textbook we quote a Greek text which describes a simple algorithm he talks about for how you can reach a
1:29 am
decision about what to do. So the ideas have been there, and steps have been taken, including the development of logic, which started in ancient Greece and ancient India and revived itself in the mid-19th century. Logic is overlooked these days by the machine learning community, but it is the mathematics of things. The world has things in it, so if you want systems that are intelligent in a world that contains things, you need mathematics that incorporates things as first-class citizens, and logic is that mathematics. Whatever shape a superintelligent system eventually takes, it will incorporate, in some
1:30 am
form, logical reasoning and the kinds of expressive languages that go along with it. So let me give a couple of examples of clearly needed breakthroughs. One is the ability to extract complex content from natural language text. Imagine being able to read a physics book and then use that knowledge to design a better telescope. That, at the moment, is not even close to being feasible, but there are people working on being able to read physics books and pass exams. The sad thing is, it turns out most exams that we give students, especially multiple-choice exams, can be passed with no understanding whatsoever of the content. [laughter] So my friend, who is a Japanese
1:31 am
researcher, has been building software to pass the University of Tokyo entrance exam, which is like getting into Harvard or MIT, or maybe Berkeley. Her program is now up around the passing mark for the University of Tokyo, and it does not understand anything about anything. It has just learned a whole lot of tricks for how to do well on the exam questions. This is a perennial problem that the media often overlook: they run the big headline, "AI system gets into the University of Tokyo," but not the underlying truth, which is that it still does not understand. So being able to understand a book, extract complex content from it, and do design with that content would be a big step
1:32 am
forward. I think there is a failure of imagination when we think about AI systems, because we think that if we try really hard, maybe they'll get to be as smart as us. But if a machine can read a physics book and do that, then that same morning it will read everything the human race has ever written; to do that, it does not even need more hardware than we already have. Machines will not be like humans in any way, shape, or form. This is an important thing to understand. Obviously machines already far exceed human capabilities in arithmetic, in Go, in video games, and so on, but these are much broader corridors of capability, and when
1:33 am
we reach human-level text understanding, machines will immediately blow past human beings in their ability to absorb knowledge. That gives them access to everything that we know, in every language, at any time in history. Another important capability is the ability to make plans successfully in the real world. Look at a very impressive achievement, AlphaGo, the program that beat the human Go champion: sometimes, when it is thinking about what move to make, it looks 50 or 100 moves into the future, which is superhuman. Human beings don't have the memory capacity to remember that many moves. But if you take the same program and apply it to a real
1:34 am
embodied robot that actually has to get around in the world, pick up the kids from school, lay the table for dinner, perhaps landscape the garden, then 50 to 100 moves gets you one tenth of a second into the future. In the physical world, with a physical robot, it simply does not help at all. You might think it's superhuman at looking into the future, but it is completely useless when you take it off the Go board and put it into a real robot. Humans manage to make plans at the millisecond timescale: your brain generates, and then downloads into the muscles, an enormously complex motor-control plan that allows you to speak, and that's thousands of motor-control commands sent to
1:35 am
your tongue and lips and vocal cords and mouth and everything. Your brain has special structures to store those long sequences of instructions and spit them out at high speed so your body can function. But also, as I was saying earlier about my daughter's decision to do a PhD, which she just finished in biology at Berkeley: it took six years. We make decisions on that timescale too. "I'm going to do a PhD at Berkeley": six years is a trillion motor-control commands. We operate at every scale between the decade and the millisecond, and we do it completely seamlessly. Somehow we always have motor-control commands ready to go; we don't freeze in the middle of doing something and wait for 72 minutes for the motor controls to be computed before moving.
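As a rough plausibility check of that "trillion commands over a six-year PhD" figure, here is a back-of-the-envelope sketch; the commands-per-second rate is an assumption chosen for illustration, not a measured value:

```python
# Back-of-the-envelope check of "a trillion motor-control commands
# over a six-year PhD". The per-second rate is an illustrative
# assumption based on the "thousands of commands" figure in the talk.
seconds_per_year = 365 * 24 * 3600        # about 3.15e7 seconds
phd_years = 6
commands_per_second = 5_000               # assumed: "thousands" per second

total = phd_years * seconds_per_year * commands_per_second
print(f"{total:.2e}")                     # about 9.5e11 -- order of a trillion
```

So under these assumed rates the figure comes out at roughly 10^12, consistent with "a trillion" as an order of magnitude.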
1:36 am
So we always have motor-control commands ready to go, but we also have stuff ready to go at the level of the minute, the hour, the month, the year. It's all seamless, and the capacity comes from our civilization, which over the millennia has accumulated higher and higher-level abstract actions that we learn about through language and culture, and that allows us to make these plans. That ability to construct levels of abstraction and manage our activities over long timescales is not something we know how to do in AI. That would be the one big breakthrough that would allow machines to function effectively in the real world. There are dozens of groups working on it, and there is progress toward a solution; some results we've seen recently in games like StarCraft illustrate
1:37 am
this, because where Go is a few-hundred-move game, these are 20,000- or 100,000-move games, yet the AI is playing at a superhuman level. >> Let's leap ahead. These problems are being tackled, so let's say we get to that point, to human-level intelligence. This could be heaven: you note at one point in your book that it took 119 years for world GDP per capita to rise tenfold, and with this technology we could do it again in one generation, or however long it takes to roll it out. So, nonetheless, what could go wrong? I think the interesting point is that it's not the technology, it's how we
1:38 am
design it at a fundamental level that you seem most concerned about. We could talk more about that. >> Right. I think The Economist put it this way: introducing a second intelligent species onto the Earth, what could possibly go wrong? [laughter] If you put it that way and note that intelligence is what gives us power over the world, then if we make things more intelligent and more powerful than us, how will we have power over more powerful entities, forever? When you put it like that, it's a good point. We should think about it. So that's what I try to do. The first thing to think about is why things go wrong. People have known this is a problem for a long time. Alan Turing said we would have
1:39 am
to expect the machines to take control. He was completely matter-of-fact and resigned about that future. So it's not a new thing that Elon Musk invented, and I don't think anyone would say Turing was not expert enough to have an opinion about AI or its design. The same goes for Marvin Minsky, a cofounder of the field itself, and other people. So it does not seem to give you a choice: if the answer is that we lose, and the machines will take control at the end of the human era, there's only one choice, which is to say we had better stop doing AI. Turing considered that choice. He referred to Samuel Butler's novel Erewhon, a
1:40 am
science fiction story about a society that has developed sophisticated machines and decides it does not want control of its world taken by the machines. So they ban machines; they destroy all the machines in a terrible civil war between the pro-machine and anti-machine factions, and afterward machines exist only in museums. But that is completely infeasible, for the reason that Richard mentioned. If we have superintelligent AI and use it well, that tenfold increase in GDP is conservative, and it means giving everyone on Earth access to the same level of technology and quality
1:41 am
of life that we have in Berkeley. Not sci-fi, not eternal life or faster-than-light travel: that tenfold increase in GDP is just bringing everyone up to a decent standard of living, and it's worth about 10 to 20 quadrillion dollars. That is the momentum behind AI, and saying we will ban AI is completely infeasible in the face of it. Not to mention that, unlike nuclear energy or CRISPR babies, AI proceeds by people writing formulas on whiteboards, and you cannot ban the writing of formulas on whiteboards. So it is really hard to do much about it. Instead, we have to ask what can go wrong: what would make better AI a bad thing? The reason is that the way we have designed AI technology from the
1:42 am
beginning has the property that the smarter you make your AI systems, the worse it is for humanity. Why? Because the way we build AI systems, and always have, is essentially a copy of how we thought about human intelligence. Human intelligence is the capability to take actions that you can expect will achieve your objectives. This is the economic and philosophical notion of the rational agent, and that is how we've always built AI: we build machines that receive an objective from us and take actions that they can expect will achieve that objective. The problem is, as we've known for thousands of years, that we are unable to specify objectives completely and correctly. This is the fundamental problem, and this is why the third wish
1:43 am
that you ask of the genie is always "please undo the first two wishes, because I ruined everything." But we may not get a third wish. If you create a system that is more intelligent and more powerful than human beings and give it an incorrectly specified objective, it will achieve that objective, and you have basically created a chess match between us and the machine, and we lose that chess match. The downside of losing it is very bad. This is the fundamental design error that we made very early on, and not just in AI: control theory, economics, operations research, and statistics all operate on the principle that we specify an objective and some machinery will optimize it. So corporations are already
1:44 am
destroying the world, and we don't need to wait to see how superintelligent AI messes things up; you can see it happening already. Corporations are machines that maximize incorrectly specified objectives, and they are making a mess of the world, and we're powerless to stop them. They have outmaneuvered us. They have been doing this for 50 years, and that's why we're unable to fix our climate problem, despite the fact that we even know what the solutions are. >> So, to sum up, we have to design AI systems in a different way, if we are going to be able to live with our own creation successfully. >> A different way from all of these optimizing organizations, in many ways, because we haven't had anything this powerful, and that's the
1:45 am
last thing we want. >> Sometimes corporations took us at our word: we set them up to maximize shareholder returns, and that's what they did. And that is a problem, because, as economists note, sometimes you can fix such problems with taxes or with regulations, but sometimes, as with social media messing up democracy and society, you cannot; there's no way to tax that effect out of a social media platform. That is an early example where the algorithms, quite simple algorithms, manipulate human beings to make them more predictable sources of revenue. That is all they care about, but because they are operating on platforms and interacting with everyone for hours every day, they become a
1:46 am
superpowerful force, and their objective of maximizing engagement is another misspecified objective that keeps messing things up. >> We will have plenty of time for questions, but before we do, we should not hold back the punchline of your book, which is that there is an answer. >> I guess we can do the punchline. The answer is in the first chapter, and the narrative runs through the book. I don't want to leave everyone thinking I'm a doomsayer predicting the end of the world; we have enough of those books already. I cannot help being optimistic,
1:47 am
because I think every problem has a solution; if it does not have a solution, then it's a fact of life and not a problem. So I'm proposing a way of thinking about AI that is different in the following way. If we are unable to specify objectives completely and correctly, capturing what we do and don't want our machines to do, then it follows that the machine should not assume that it knows what the objective is. All the AI systems in every textbook chapter are based on the assumption that the machine has the correct objective. That cannot be the case in real life. So we need machines that know that they don't know what the true objective is. The true objective is the satisfaction of human preferences over the future: what each of us wants the future to be like, and what we don't want it to be like.
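A minimal sketch of that contrast, a fixed-objective agent versus one that is uncertain about part of the objective. All rewards and probabilities here are invented for illustration; this is not code from the book, just the shape of the idea:

```python
# Toy contrast between the "standard model" (fixed, fully-known objective)
# and an agent uncertain about part of the objective. Numbers are made up.

# Actions: "fast" gets the coffee quickest but breaks a vase on the way;
#          "careful" is slightly slower and leaves the vase intact.
proxy_reward = {"fast": 1.0, "careful": 0.9}   # the coffee-only objective we wrote down
vase_breaks  = {"fast": True, "careful": False}

# Standard agent: optimizes the stated objective and nothing else.
standard_choice = max(proxy_reward, key=proxy_reward.get)

# Uncertain agent: knows the vase has some unknown value v to the human,
# believed here to be 0 or -2 with equal probability.
expected_vase_value = 0.5 * 0 + 0.5 * (-2)     # E[v] = -1
expected_utility = {
    a: proxy_reward[a] + (expected_vase_value if vase_breaks[a] else 0.0)
    for a in proxy_reward
}
uncertain_choice = max(expected_utility, key=expected_utility.get)

print(standard_choice)    # "fast" -- breaks the vase to shave a little time
print(uncertain_choice)   # "careful" -- hedges against the unknown value
```

The design point is that nothing in the proxy reward mentions the vase; the safer behavior comes entirely from the agent's knowledge that its objective is incomplete.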
1:48 am
That's what the machine should help us with, given what it knows and what it does not know about our preferences. This is a kind of machine we are, in some ways, quite familiar with. How many people here have been to a restaurant? We all go to restaurants, and does the restaurant know what you want to eat? Not usually, unless you go there a lot. My Japanese place across the road just brings my lunch; they don't ask what I want. But generally speaking, restaurants have a menu, because that way they can learn what you want. They know that they don't know what you want, and they have a process, a protocol, to find out more. They're not finding out in complete detail exactly how many grains of rice you want on your plate and exactly where you want the grill marks on your
1:49 am
burger. They're getting a very rough sense: if there are 60 items on the menu, that's only about six bits of information for your main course. But that's a protocol where the restaurant, like the AI system, knows that it doesn't know what you want, and has a way to learn enough to make you happy. That's the general idea, except this will be much more radical: not just what you want for dinner, but what you want for the whole future, and what everyone on Earth wants for the whole future. We can show two important properties of these systems. Number one, they will not mess with parts of the world whose value they do not know about. In the book I use this example: suppose you have a
1:50 am
robot that is supposed to be looking after your kids, because you're late home from work, and it's supposed to cook dinner, but there's nothing in the fridge. So what does it do? It looks around the house, spots the cat, calculates the nutritional value of the cat, and cooks the cat for dinner, because it does not know about the sentimental value of the cat. A system that knows that it doesn't know the value of everything would say: the cat may have value from being alive that I don't know about, so cooking the cat may not be a good option. At least it would ask permission; it would call me up on my cell phone and say, "Is it okay if we cook the cat for dinner?" and I would say no. "Is it okay if we turn the ocean into acid to reduce the carbon dioxide level?" No, don't do that.
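The "ask permission" behavior can be written as a tiny expected-utility comparison. All utilities and the probability below are invented; this is a sketch of the idea, not the book's formal model:

```python
# Toy "ask permission" calculation: should the robot cook the cat,
# do nothing, or check with the human first? Numbers are made up.
p_ok = 0.5           # robot's belief that the human would approve
u_good_dinner = 1.0  # utility if dinner is cooked and the human approves
u_disaster = -10.0   # utility if it cooks the cat and the human objects
ask_cost = 0.1       # small cost of interrupting the human

act = p_ok * u_good_dinner + (1 - p_ok) * u_disaster   # cook without asking
do_nothing = 0.0
# Asking reveals the answer first: cook only if the human says yes.
ask = p_ok * u_good_dinner + (1 - p_ok) * 0.0 - ask_cost

best = max(("act", act), ("do_nothing", do_nothing), ("ask", ask),
           key=lambda t: t[1])
print(best[0])       # "ask" -- under uncertainty, checking with the human wins
```

This is the same structure as the off-switch argument that follows: as long as the robot is genuinely uncertain about what the human values, deferring to the human has higher expected utility than overriding her.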
1:51 am
That's point number one: you get minimally invasive behavior. The machine can still do things, as long as it understands your preferences in a particular direction. If I want a cup of coffee, and it can get it without messing up the rest of the world, it is happy to do it. The second point is that the machine will allow itself to be switched off, and this is like the one-plus-one-equals-two of safe AI: everyone worries about the AI you cannot switch off. Why will this machine allow itself to be switched off? Because it does not want to do whatever it is that would cause us to want to switch it off. By allowing itself to be switched off, it avoids those consequences, even though it does not know what they are; it does not know why I'm angry or why I want to switch it off, but it wants to prevent whatever that is from happening, so it lets me switch it off. This is a mathematical theorem:
1:52 am
we can prove that as long as the machine is uncertain about human preferences, it has an incentive to allow itself to be switched off. As the uncertainty goes away, that safety goes away: machines that believe they have complete and correct knowledge of the objective will not allow themselves to be switched off, because that would prevent them from achieving the objective. That is the core of the solution. It's a very different kind of system, and it requires rebuilding all of the AI technology we have, because, as I said, all of that technology is based on this incorrect assumption. We haven't noticed because AI systems have been stupid and constrained to the lab. "Constrained to the lab" is going away: AI is out in the real world messing
1:53 am
things up. And "stupid" is also going away. So we have to solve the problem and rebuild the technology from the foundations up before the systems get too powerful and too intelligent. >> I have one more question. Let's say the machines do not kill us and they give us what we want. Then we have to look at what we want, not just individually but in the aggregate, and this is going to be a phenomenal problem for humanity. You invoke the image of the movie WALL-E, which I'm sure many of us have seen: humanity sitting back, being fed by robots, a kind of end of the world. So how on earth will we
1:54 am
cope when the machines give us everything we want? >> I think this is the one problem that I don't have a good solution for. It's not technological; it's a social and cultural problem of how we maintain our vitality when, in fact, we no longer need to do most of what constitutes running a civilization. Think about education. Why do we educate? Because if we don't, civilization will collapse: the next generation will not be able to run it. Human cultures, and animal species too, have figured out that they have to pass their knowledge on to the next generation, otherwise it is lost. If you add it up over history, something like a trillion person-years of effort have gone
1:55 am
into just passing our civilization on to the next generation, because we have no choice. We can put it on paper, but the paper will not run the world; it has to get into the brains of the next generation. But what happens when that is no longer true? What happens when, instead of going through the long, painful process of educating humans, we could just put the knowledge into the machine and it takes care of everything for us? This is the story that E. M. Forster wrote, and if you want one takeaway, if you can't bring yourself to buy my book, you can download his, because it is no longer in copyright. "The Machine Stops" was published in 1909, and in the story everyone is looked after by machines, 24/7, and we
1:56 am
spend most of our time on the internet, doing videoconferencing on our iPads, listening to lectures or giving lectures to each other. We are all a little bit obese, and we don't like face-to-face contact, so it's a lot like today, but written in 1909. Of course, the problem is that nobody knows how to run the machines anymore; we have turned over the management of our own civilization to the machine. It's a modern version of that story. >> So what do we need to do? >> I'm reminded of the culture of the Spartans.
1:57 am
Sparta took a very serious cultural attitude to the survival of the city-state. Typical life in those days seemed to be that every couple of years you were invaded by a neighboring civilization or city-state, and they would carry off the women and kill the men. So Sparta decided it needed very serious civil defense capabilities, and its education, I was reading about it in another book, called A World Without Work, which describes it as 20 years of PE classes, was designed to prepare the citizens, both male and female, to fight. It was a military boot camp that ran from before you could walk until you were old enough to carry weapons, and that's how they fought. It was a cultural decision to
1:58 am
create that capability, and it was bound into the culture. I'm not recommending we do that exactly, but some notion of agency, and of knowledge and capability, has to become, not an economic necessity the way it is now, but a cultural necessity: you're not a valuable human being, and I don't want to date you, unless you know a lot and are capable of skinning a rabbit and catching your own fish and fixing the cord on this, that, and the other. So it's a cultural change, and it has to become a matter of your own self-esteem: you don't feel like a whole human unless you're capable of doing these things and not dependent on the machines to help you. I cannot see any other kind of solution for this problem. The machines will tell us,
1:59 am
basically, as you may have done with your children: it is time for you to tie your own shoelaces. But, like your children, we'll say: no, no, we have to leave for school, I can't do it now, I'll do it tomorrow. That's what the human race will do; we will say we'll get around to the agency stuff tomorrow, but for now the machines have to help us do everything. That's a slippery slope, and pretty dangerous, and we have to work against it. >> I think this is a great point to leave off, here in an educational institution where maybe they won't need to learn anymore. But let's open it up to questions. We will pass around the microphones; we have one here and another over there. Don't feel constrained by the book, ask anything on
2:00 am