
tv   Weapons of Math Destruction  CSPAN  December 23, 2016 4:11am-5:23am EST

4:11 am
4:12 am
"Weapons of Math Destruction" is about the misuse of big data and mathematical algorithms to make decisions that she argues should be made by thoughtful human beings. She talked about her book at an event in Washington, D.C. It's one hour and ten minutes. [inaudible conversations] >> Thank you for coming out tonight. My name is David.
4:13 am
I have a couple of housekeeping notes. If you could take a moment to silence your cell phones or any other devices. I also want to mention that we will have a question-and-answer session after the conversation; raise your hand and I will bring the microphone around, so that all of the questions are recorded and everyone can hear. If you would like to purchase a book [inaudible]
4:14 am
If you look at the calendar you can find where these events are happening. [inaudible] It has gone way beyond that, and we live in the age of the algorithm. Increasingly, the decisions that affect our lives, whether we get into a school or get a car loan, are made by mathematical models, and in theory everyone is judged according to the same rules.
4:15 am
But the opposite is true. O'Neil is an academic mathematician [inaudible] turned data scientist at a startup company, and is one of the strongest voices speaking out about how algorithms influence our lives and how they are implemented. It is an unusual book about something that governs so many aspects of our lives. She has a Ph.D. from Harvard and switched over to the private sector; she left finance in 2011 and
4:16 am
started working in 2013 on data journalism. She's a weekly podcast guest, and you may know her from her blog [inaudible]. Please join me in welcoming Cathy O'Neil. [applause] >> [inaudible] Maybe we can jump right in, get some ideas out there, and talk about the examples in your book. We were chatting in the back as you all came in, and I was saying one of the things we've done
4:17 am
is to take all of your data from Twitter or Facebook and use that. So how many of you have taken this sort of personality test? It turns out a lot of businesses want you to take those, except that you don't have to take them anymore, because the answers can be found automatically from your data and then used in ways that you may or may not like. We are going to talk about what a weapon of math destruction is, and we thought we might start with an example and draw the characteristics out of it. There are some algorithms I care about and some I don't care about. The ones I care about, the weapons of math destruction, are characterized by three characteristics, so we can start with an example to make it real.
4:18 am
The example, if you don't mind me starting there: there was this guy, a college student near Atlanta, who wanted to get a part-time job at a Kroger grocery store. His friend was leaving the job, so he said, you just have to fill out the paperwork online. He started to fill out the paperwork, and like 60% of job applicants in the country, he was required to take a personality test online before he got an interview. So he did. It is standard procedure for lots of minimum wage jobs. He failed it, and he was lucky to find out, because most people never find out; they just never get a callback. He found out he got a red light,
4:19 am
and he was unusual in a second way: his father is a lawyer. When people apply for minimum wage jobs they don't usually have a big powerful lawyer behind them. So he talked to his father, and his father asked him what kinds of questions were on this test, because, he said, you are very qualified to get a job as a grocery bagger at the store. He's going to a competitive college, he got straight A's in school, so what's the problem? He said the questions were a lot like the ones I got at the hospital when I was being treated for bipolar disorder. >> It asks things like: are you an extrovert or an introvert? Are you anxious? Do you get angry easily, or are you super mellow and nothing bothers you?
4:20 am
It's easy to find that out on a test, but it turns out it also shows up in other data that we leave behind. >> That is so frightening. So what happened: he talked to his father about it, and his father, who is a wonderful man, said, that is illegal. You cannot make job applicants take a health exam, including a mental health exam, under the Americans with Disabilities Act, which exists to prevent exactly this sort of creation of an underclass of people with disabilities and mental health problems. So his father is pursuing a lawsuit on behalf of anyone who ever took this test. And I should add, by the way, that he tried to get a job at six other large companies in the Atlanta area and got the same red light at
4:21 am
all of them. He was systematically prevented from getting a job in his area. So that's the first example I wanted to talk about. >> That takes us toward the definitional question. A lot of these insights about people come from models built on algorithms and a whole bunch of data, and the fact that they can harm people is, on the surface, the nature of the weapons of math destruction concept. >> First, it is widespread. It's not like the ones I build in my basement, because that's what I do; I play with this stuff all the time.
4:22 am
We start caring, and should start caring, when it affects a lot of people, and this is an example, because this test was widely used. Kronos, the company behind this algorithm, is located in Boston and sold it to all sorts of companies. So it is widespread. Second, it is secret: he didn't understand how he was being scored, and most people who took this test didn't even understand that they were being scored. Finally, it is destructive. It destroyed his chances of getting a job, but it is destructive in a larger sense, in that it actually creates and reinforces inequality, and here the larger feedback loop is systematically
4:23 am
refusing jobs to people with certain mental health disorders. >> One of the examples you open with, and I just want to underline the secretive, black-box nature of these algorithms, including some really crappy ways of validating them, is the one that got schoolteachers fired, so maybe we should talk about that. >> Second example. So, the schools chancellor in Washington, D.C. instituted this policy whereby some teachers would get fired if they had bad teacher assessments and some would get a bonus if they got good teacher assessments. The assessments are complicated, but the short version is that most of the ways a teacher is assessed don't have a lot of spread, so most people get "acceptable" or "good."
4:24 am
People who want to distinguish between good and bad teachers get frustrated by that. They want more spread, like terrible, fair, good, better, very good, and very few of the ways we now assess teachers give that. So they instituted this assessment called the growth score, or the value-added score, of a teacher. I won't go into it technically, but the very broad way of thinking about it is the idea that a teacher is on the hook for the difference between what the students should have gotten versus what they actually got. So there is this underlying model that estimates what each student in the class should get. So if I'm teaching you, you are
4:25 am
all fifth graders right now, and at the end of fourth grade you got a score on your standardized tests. Let's say you got a 75 out of 100. You would be expected to get something like a 75 at the end of fifth grade. Let's say you got an 80: you got more than you were expected to get. Does that make sense? Now, how the expected score is determined per student is actually really complicated, and it is one source of uncertainty. The score you actually got at the end of fifth grade is another, because kids get different scores depending on whether they had breakfast, that sort of thing. So there is uncertainty on both sides, and the teacher, again to remind you, the
4:26 am
teacher is on the hook for the difference between those two things. Now, if you think about it statistically, the difference between the expected score and the actual score is called the error term. It's also called noise. So teachers are being assessed based on noise. If you're not a statistician, I don't want you to be confused; all I'm saying is that we would call this stuff pretty meaningless. And in fact the scores, the way they came out, were almost meaningless, and I say that not because I have hard data on it, because it is a secret score. The only way I got my hands on any of the data, besides the stories: I talked to one teacher who got a 6 out of 100 in 2010 and a 96
4:27 am
out of 100 in 2011; he was from New York City. In general, the secrecy of the system is another source of trouble. So anyway, going back to Washington, D.C., I interviewed a woman who actually got fired because her overall teacher assessment score was too low. It wasn't entirely due to her value-added score; that was 50% of it, but the other 50% were measures with little spread, so most of the discriminating information was this terrible value-added score. One more thing to tell you about her case: a lot of her incoming students got good scores at the end of fourth grade. The kids coming into her fifth-grade class had excellent scores on paper, but they couldn't read and they
4:28 am
couldn't write. So she was suspicious of their scores, and she had every reason to believe the fourth-grade teachers had cheated on the end-of-year tests in order to get better value-added scores for themselves, because their students scored better than expected, which set her up to score worse than expected. Does that make sense? So, every reason to believe it, but it never really got systematically investigated. In any case, she got fired. Let's go back over the three characteristics. It's widespread: value-added models are being used in more than
4:29 am
half the states, mostly in urban school districts. It's secret: when she appealed, she was told, it's mathematical, and they would not explain the scoring system. And the final thing, that it is destructive, will not surprise you if you know about the sort of nationwide war on teachers going on.
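To make the "expected versus actual" mechanics she describes concrete, here is a minimal sketch. The class size, score ranges, and noise level are all hypothetical, not the actual D.C. or New York formula, which is far more complex and, as she notes, secret.

```python
import random

random.seed(0)

def value_added(expected, actual):
    """A teacher's value-added score: the average of the per-student
    error terms (actual score minus expected score)."""
    residuals = [a - e for e, a in zip(expected, actual)]
    return sum(residuals) / len(residuals)

# Hypothetical class of 25 students. Each student's expected
# 5th-grade score is roughly their 4th-grade score; the actual
# score adds test-day noise (breakfast, sleep, luck) of about
# 10 points either way.
expected = [random.uniform(50, 95) for _ in range(25)]
actual = [e + random.gauss(0, 10) for e in expected]

print(round(value_added(expected, actual), 2))
```

Even though this simulated teacher adds nothing at all, the score is pure error term; rerun it without the fixed seed and it swings by several points from class to class, which is the "noise" she says teachers were fired over.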
4:30 am
We can talk about whether that even makes sense, but in any case the stated goal was to get rid of the bad teachers, and the larger feedback loop is that it is actually getting rid of good teachers, because good teachers retire early or move to schools that don't have this scoring system, and right now we have a shortage of teachers. So I would argue that this regime, this value-added model, is very bad. >> I think something that is especially insidious about this: when we talk about "the algorithms," they can sound very foreign, like things we don't interact with. But a lot of the algorithms we are talking about here have a lot of similarities to, and are nearly identical to, the way Netflix or Amazon recommends shows and books for
4:31 am
you. We totally know how to handle those. Amazon is like, you bought a Stephen King book, here are more Stephen King books. It's right, but not helpful. [laughter] Sometimes it is totally wrong, and you're just, where did this come from? And sometimes you get these beautiful insights, like the wonderful day it recommended I buy a knife set; I never would have thought of that. But you don't just buy everything Amazon recommends or watch everything Netflix says to watch. With the algorithms we are talking about, people don't treat them as one data point in the decision process. They treat them as an oracle of truth, even when they are poorly validated, and that makes them a lot more destructive, because it overrides all of the intuition with this little bit of math.
4:32 am
>> I would argue it is worse than what you just said. >> Let me give you two examples. You know the Facebook trending stories? Guess what: we notice and give feedback to Facebook, saying, that isn't a real story. That is something that doesn't happen for the teacher value-added model. There is no ground truth for the teachers. There is no secondary assessment to compare the scores to. They were just told, this is your score. When that teacher got a 6 out of 100 he was ashamed. The next year he got a 96 out of 100, and he was like, what does that mean? I went from being a terrible teacher to a great teacher? There is nothing to compare it to, no feedback. The teacher can't say, I'm actually a really good teacher, so please update your model. That's the equivalent of
4:33 am
saying, I don't want to see that movie because it is awful; please don't recommend anything like that to me again. We have ways to tell Netflix that, but the teachers don't have access to anything like it. >> Nor do parents or administrators. They couldn't go in and override it. >> Exactly. So the other thing I want to agree with is that people really do trust the scores because they are mathematical. The way I came across this algorithm: a friend of mine, a school principal, started complaining to me about her teachers getting scored, and I said, can you tell me how they are being scored, like, what is the algorithm? She said, I asked the Department of Education and they told me, you wouldn't understand it. I said, that isn't good enough.
4:34 am
Mathematics is supposed to clarify, not confuse. That isn't something you do with math; that is hiding behind math. So I kept pushing through each layer. She finally got this white paper that wasn't readable, even to me; I could not understand it. So I tried to get the source code for the algorithm. I should say, the reason I thought I could is that the New York Post had filed a freedom of information act request and successfully gotten all of the teachers' scores, and they published
4:35 am
them. But if they could get the scores, it made sense that I could get the formula. I was denied, and then I contacted somebody at the institute that built the score, and they said I would never get it, because they had a contract with the city of New York. The formula was to stay secret even from officials in the Department of Education, so those officials cannot explain to teachers how they are being scored. >> Part of the book is the argument, and I thought maybe you could give us an overview of it, that you were building these models for a while and then became an enemy of finance. >> Joining Occupy Wall Street is
4:36 am
probably not something your hedge fund would have smiled upon. I joined my hedge fund in early 2007 and headed straight into the crisis. I became disillusioned very quickly, because the people I thought were experts really didn't seem to know much about what was going on. I understood perfectly well that they didn't see it that much better than I did. And at the heart of it, by the way, this is why I fell in love with math, like, how beautiful is the Rubik's cube: I had this idea of math being this pure, beautiful, almost
4:37 am
artistic thing. And then at the heart of the financial crisis was this mathematical lie, the opposite of the beauty of the Rubik's cube: the AAA ratings of the mortgage-backed securities. They were, you know, mathematics. They were promises that people who were good at math, with Ph.D.s, had crunched the numbers and were promising these mortgages were not going to go bad. They were not doing that. They were creating a lie and selling it for money. People believed the AAA ratings so much that the scale of the market became very large, which is one of the reasons it was such a big deal. Back to the thing I realized: it was the weaponization of people trusting
4:38 am
mathematics, and the math itself wasn't the problem. It was that people were basically corrupt and shielding themselves by calling it math and saying, don't look here, it's math, you have to trust it. So that's the thing I realized. Then I was like, I'm doing the same thing. That's what we data scientists did: we used historical data to predict people. Some of it was predicting personality traits, and that didn't seem so bad, until I came across this teacher value-added model that I mentioned. Then, I think, the moment I
4:39 am
decided to quit my job and write this book was one interaction I had at the startup, where a venture capitalist came to visit. He was thinking of investing in the company, so we all sat and listened to him talk. I was working on advertising, which is relatively benign; you get opportunities shown to some people and not others, so it's a way of segregating society, but I still don't think it was particularly evil. He gave us this idea of what his
4:40 am
future dream for the world of the internet was, and he was like, I have this dream that someday... it didn't sound like Martin Luther King, let me take that back. This is what I hope to see, he said: I get offers for trips to Aruba, but I also get offers that are not meant for people like me, and someday I won't see those. Everyone around me laughed. I was like, whatever. But that is what the internet did: those who score high get
4:41 am
opportunities, and we have toys and things we like to play with, and people on the other side of the spectrum get preyed upon. How can I make money off of them? Payday loans, for-profit universities: if you are in the underclass, I can make money exploiting you. >> Think about the way the advertising works. From the eyes of whoever it's aimed at, it's suspicious, because of how much is known about the person. Again, if you're talking about someone like me, they are good at showing me yarn ads, because I'm very vulnerable to yarn; they have my number, and that's okay, because sometimes I'm like, I love that yarn. But on the other end, again, for some people the most profitable thing is to get them to go
4:42 am
to a for-profit college that saddles them with debt and doesn't give them an actual education. We have seen some closed down recently, which is the way it should be. It was an enormous industry, but until that venture capitalist came and talked to my company, I had never seen the for-profit university ads or known what that world was, honestly. I looked it up and saw that the parent company of one for-profit university was the single biggest buyer of ads that quarter.
4:43 am
>> One of the things that you talk about is the people who build these algorithms, and certainly for people like us, it's right to make them uncomfortable. So training is a good start, but if there is money to be
4:44 am
made by exploiting people... we thought it was bad with mortgage-backed securities, and we thought that was Enron, if you remember those traders who shut down the power grid of California just to make money on futures; they talked about the old people who were screwed, basically, because they had to pay enormous rates for their electricity, but the market was set up that way. Today's algorithms do things that are basically within the rules. What are your thoughts on how to change that part of the system? >> I don't expect all corruption to end, but I do want one claim to be challenged: the claim that because it is an algorithm
4:45 am
it is objective and fair. It is just not true. The people who create those algorithms get to decide what the morals are. And I have a very specific reason for defining a weapon of math destruction the way I do: I am performing triage. It tells me where to focus the attention, on the ones that are terrible. Because people are greedy, for the ones that rise to the level
4:46 am
of a weapon of math destruction there should be a law against it, because the free market will not solve this problem. If something actually makes money, you cannot just say, please stop being discriminatory because it is not great for society; that will not happen. That is why we have laws around fair lending, the Fair Credit Reporting Act, and the anti-discrimination laws written in the '70s. They came up with a framework. It is not perfect, but it is a whole lot better than what we have for big data.
4:47 am
So we need to push back and demand accountability. You should know how you are being assessed at your job, but there also have to be rules and regulations. >> So you say very clearly in the book that you can have an algorithm that seems neutral but in fact is discriminating, simply because the people who built it did not consider that possibility. When search results start surfacing holocaust denial, it's because they didn't
4:48 am
think about it; they really should think about minorities and women. If you have an industry made up of people who are greedy, or just ignorant, they can be ignorant of the problems that their algorithms create. We need to think about these problems and know about them, not just optimize the output. >> There are a lot of recent, publicized examples of spectacular failures, even from Twitter or Facebook, and that is when we get to see them. But the algorithms ruining people's lives are never seen by the public, because they are business to business, like the personality test for the
4:49 am
supermarket. So if it is that bad when it is under public scrutiny, imagine when it isn't. [applause] >> This is incredibly interesting. Talking about the algorithms as a device for controlling, or for establishing and retaining control, so that a decision is presented as valid: that reminds me of other things that accomplish
4:50 am
the same function, things that nobody really understands but that legitimize a decision. Sometimes there is a priestly class, the people these algorithms come from. What do they think about this? I don't know any of these people. >> Much to my shame, many people prefer it that way. It is pretty flattering to be told your work is magical because you are a mathematician. When I
4:51 am
quit my job to write this book, I was panicking, because I didn't see anybody around me worried about this. But now, four years later, there is an entire community of people, including sociologists, anthropologists, even economists, who are super concerned. And I tend in particular to hang out with people who are interested in developing tools specifically to look into the black box. Think of the classic sociological experiment: if you want to show that hiring is racist, you send through a bunch of applications with similar
4:52 am
qualifications, and if the white-sounding names get more callbacks for interviews, you have shown it. You can do that with an algorithm too. It is crude, a first generation, but that is what we can start doing, and we do have work to do as technologists who work with data. For those who want to think about this, the field still has to be developed; I think in 20 years it will be a conscious field, but as far as I know, not even the journals are interested in publishing this yet. And I should have said at the beginning of this: if you
4:53 am
look at the journalism on recidivism models, the risk models that judges use to decide whether to send people to jail, journalists audited one and found it was racist. So there is work to be done. >> This is wonderful, but two areas where this might have something to do with medicine. I am a physician, and the
4:54 am
medical and business communities are moving to electronic medical records, so there is this promised insight into taking care of people and their health. I am skeptical. Second, there is another big push for individual physicians to take on risk, so that if a patient requires more care than you anticipated, you absorb it, and we are not an insurance company. And the insurance companies, even though they are now supposed to take everyone regardless of pre-existing conditions, have the information available to figure
4:55 am
out who is risky and who is not. >> Great question. I am not an expert in your field, but I worry, as you do, about all the hype around big data in the medical field, so I would focus on accountability and testing procedures. Let me explain what I mean. Compared to what? How do we know it does any good? How well correlated is it with what it claims to predict? That is how we
4:56 am
demand accountability. If a Martian showed up and said, I have a new way to do this, you would say, I don't trust you, convince me; and you would have to be convinced in a real way. People just say "big data" and throw that out there. You need to be shown why and how it agrees with the things you know are true. But having said that, there is an irony about algorithms I want to throw out there. Just imagine we had a really good
4:57 am
big data algorithm, one that really could predict the future well. That is not necessarily a good thing, because it depends whose hands it is in. In the hands of your doctor, great. In the hands of an insurance company who can charge you more? That is terrible. And finally, imagine it in the hands of Walmart: should an employer get to decide whether to hire you because of your future insurance costs? That would be very bad. So there is no guarantee of a good result even if the algorithm is accurate.
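Her "compared to what?" demand is the standard check of a model against a trivial baseline. A minimal sketch with invented numbers, not from any real medical system:

```python
def accuracy(predictions, outcomes):
    """Fraction of cases where the prediction matched what happened."""
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)

# Hypothetical risk model: 1 means "high risk", 0 means "low risk".
outcomes    = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 80% of patients turn out low risk
model_preds = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # a vendor's "big data" model
baseline    = [0] * len(outcomes)             # dumb rule: call everyone low risk

print(accuracy(model_preds, outcomes))  # 0.7
print(accuracy(baseline, outcomes))     # 0.8
```

A model that sounds impressive in isolation can still lose to the dumbest possible rule, which is why "convince me, compared to what" is the right demand before trusting a score.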
4:58 am
>> Those algorithms terrify me. We just published a paper looking at people who are going to Alcoholics Anonymous. There are clear profiles from before they start going to AA, and you can predict with 85 percent accuracy whether they will stay sober. Looking at women and their Twitter accounts prior to giving birth, researchers could predict whether they would develop postpartum depression, and they said, we need to monitor this. That is scary. There is also a community looking at
4:59 am
conditions like PTSD. As science it is interesting that we can do this; on the other hand, the implication, given that there is no regulation, I think is terrifying. I spend time talking to big companies to say, you need to be careful, because it is a scary space at this point. >> Now we are all terrified. >> I did not know you could even do that with Twitter. That is awful. Talk of the medical profession reminds me of HIPAA
5:00 am
and also of this personality testing. Why isn't that kind of information treated as being as sensitive as medical information? >> Thank you for sharing that. There is a chapter on that in my book. [laughter] I don't have a perfect algorithm for picking everybody in the audience; I am trying to scatter around. >> I am a reporter covering higher education and defense issues. I was hoping you could address the benefits in
5:01 am
these cases. With the teachers, it seems it was not just the use of the information but the wrong analysis, and in the instance of cheating in education, it isn't really the algorithm but the way it is used. And in terms of the personality test, it is a matter of saving time on the front end. I'm sure you anticipated this, so I'm curious as to your response. >> Take the teacher value-added model first. The
5:02 am
structure of it has too many bad incentives, and that is what causes cheating. I am not an education expert, so I don't have a solution, but what I want to point out, and it is the low bar for all of my examples, is that it is just a bad model. Let me explain my evidence. Besides the computer errors, the New York Post scores made it possible for a
5:03 am
high-school math teacher to do something clever. He couldn't actually get the model, but he took teachers who taught both seventh-grade and eighth-grade math, whose two scores ought to be consistent, since it is the same teacher overall. He plotted one score against the other and found close to a uniform distribution: it was as common to get a zero on one and a high score on the other as anything else, almost a random number generator. So there is no reason to think the score measures anything,
5:04 am
which is to say you could not possibly take this and say, this is your fault, you're fired. It is not strong enough evidence. With a real correlation, maybe at the district level, you could use that kind of information about test scores, but I don't think so. In terms of hiring practices and the personality test, my complaint is that it is illegal and secret; essentially it is a mental health assessment.
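The consistency check attributed above to the New York teacher is a simple correlation audit. Here is a sketch with simulated scores, since the real numbers came from the Post's records: if the score measured a stable quality of the teacher, the two columns should correlate strongly, and a correlation near zero means the score behaves like a random number generator.

```python
import random

random.seed(1)

def correlation(xs, ys):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Simulate 200 teachers who each get two value-added scores in the
# same year (say, 7th- and 8th-grade math). If the scores are mostly
# noise, the two scores are independent draws from 0..100.
first = [random.uniform(0, 100) for _ in range(200)]
second = [random.uniform(0, 100) for _ in range(200)]

print(round(correlation(first, second), 2))  # close to 0: the score is noise
```

Run against real paired scores, a correlation well above zero would rebut the "random number generator" charge; the reported plot showed essentially none.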
5:05 am
Assuming that is true, and we cannot figure it out directly, the problem is that it is against the law. There are a lot of laws about hiring: you cannot discriminate when you hire. I don't have a problem with using big data to help you filter a résumé, but it should be transparent, so people understand how they are being assessed. >> So you paint a picture where the average person has no idea what the algorithm is, and the people who are most affected by the failures
5:06 am
still don't know an algorithm had anything to do with it. That seems important, because people still don't have any idea how algorithms affect their lives. >> That is why I wrote the book. I feel that people don't understand; they did not realize that there was an algorithm there, and were not even aware of it. Take the discussion of the trending news stories: people don't see everything, and some populations have figured out tricky things. If you post something, your friends may
5:07 am
not see it, but if you put a brand name into the post, it gets pushed up in the rankings so your friends will see it. It is gaming the algorithm toward a private goal rather than a public one. >> One interesting point: there was a representative study of concern about this, and the impact is clearly correlated with economic status. Those who are poorer are less worried about the algorithms and also less aware. We talk about people being exploited by these algorithms, and the people who are
5:08 am
most likely to be exploited are also less worried and less aware. So it is a cycle: you are not worried anything bad will happen, and yet you are the target. So it is more complicated. Of course it doesn't solve the problem, but your book is a great way to get this out there. I'd like to talk about whether we need a massive education campaign, because this is so impactful. This is the literacy we need: people need to at least know that the algorithms are there, even if we don't know the details of how they work. >> It is not enough to say everyone should be aware and protect themselves. The reason I wrote the book is that I
5:09 am
saw this time and time again as a class issue, of people at the top controlling those at the bottom. This is why we need rules about this: they cannot govern themselves, and awareness is not enough. >> I am a data scientist working with health care payments, so a lot of the data modeling is what you were talking about. Building on your point that it is okay to use these algorithms in things like hiring practices,
5:10 am
what do you think is the solution within the algorithms themselves, to deal with the inequity that arises? >> As data scientists, we could always be thinking about fairness when we build these algorithms, from the ground up. The thing about the promise of big data is that it still exists, the idea that we can make things more fair. I am not holding my breath, but it is not just a matter of recognizing the problem and being done. There are tools to ask, how do
5:11 am
we fix this or make it better? Here is the good news: compare an algorithm with a company that has discriminatory hiring practices. The people will lie, or maybe they don't even realize they are being discriminatory or biased, but the algorithm will not lie, so you can check it. If someone says it is successful, that is good only if you have sufficient trust to know it is doing a good job. >> My partner and I did a
5:12 am
fair amount of work, and this is me expressing frustration, with the Bernie Sanders campaign, which used data from one company. What we found was that the data they were sending us for canvassing did not match up with the demographics that supported Bernie. I found it frustrating that it was not working for us and that the higher-ups did not see that, did not understand that this was not the best demographic to target. I am also a bit concerned about the fact that campaigns contact a certain type of voter and not another: we were funneled into the wealthy neighborhoods and not the less affluent ones, so some people are
5:13 am
being reminded to vote and others are not. In terms of where the algorithm lives, is it really in the best interest of society that a private company owns the data on voters and lends it out to campaigns? Who is behind that? >> Great question. I do have a chapter on politics and data; the subtitle of the book is "How Big Data Increases Inequality and Threatens Democracy." Actually, I think that at some point... some people have difficult lives and they
5:14 am
don't have time to be engaged, and that is one thing. But the threat to democracy that I see is a different one, and I don't have time to go into a lot of it. Campaigns only target special people: the ones who can be counted on for money, or the swing voters in the swing states. And there is this weird feedback loop for the people they think will not vote. Even if that belief gets corrected the
5:15 am
next time around, some people are continually ignored. So it is very socio-economic. With politics and data, what is sufficient for campaigns is insufficient for democracy. Everybody here gets sent the message the campaign wants you to hear, and that is not what is good for us as a group. What is good for us is broad public discussion of the various issues, and that is not what is happening. A friend, Paul, could target me with an ad about breaking up the big banks, and
5:16 am
if I visited his page he could follow me with a cookie that says, show her that. So campaigns control the information that I have about them, when what we want is candidates who are open, where we have more information about them than they have about us. And that is not happening. >> Let me freak you out some more. [laughter] In the last election, Facebook, which knows these things with something like 97 percent accuracy, showed people: here are all of your friends who have voted.
5:17 am
>> They could get half a percent more people to vote by showing that than if they didn't. So if they showed it to the people they think will vote for Donald Trump, but didn't show it to somebody who will vote for somebody else, they would have a significant impact on the election itself. I won't cast any aspersions, and I'm not calling out Facebook specifically, but for any company with that reach, understanding that it is a real possibility with the technology today, whether or not it happens, is one of the concerning things politically today.
5:18 am
>> And because there are already more Democratic voters on Facebook, that would have more impact on a democratic election. >> One more question. >> A lot of advertisers have gotten skittish about the American political right, which they see as having
5:19 am
ignoble intentions; they say, we don't sell t-shirts to that market. But whenever a mainstream advertiser pulls out of the market, somebody else replaces it. Everybody who pulls out gets replaced by our own online advertisers, so there is a segregation from the mainstream advertisers that is steadily increasing. And that is better than
5:20 am
selling them ads for really big knives. There are hundreds of thousands of sites governed by those ad algorithms, so can you comment on this phenomenon? >> I think it is larger than the book, and I don't know how to address it: the growing partisan divide. I agree that the way we talk to each other, with the echo chambers, is part of it. Also... I don't know.
5:21 am
>> Thank you so much for joining us and for coming out. It was a great talk with great conversation, and she will also be here signing books. [applause] [inaudible conversations]
5:22 am
