tv   [untitled]    December 17, 2014 10:00am-10:31am PST

10:00 am
of this data and so we really want to thank them both, for everything, so far. is it on? >> yeah. >> okay. >> all right. so this is a small piece and this is just for this lecture and so what we have done is basically focused on the form 460, and what you are going to see is the data is actually broken up into... (inaudible) and one is all of the campaign finance for the board of supervisors election, and the ballot matters, and so i will just walk you through quickly what we have and so for the propositions we just have all of the (inaudible) listed out. and if you go to each one, you can see that there is a for and an against listed. and then, if you click on any individual one, you can see the
10:01 am
total raised and then this is the, and this gives you the full visualization of the summary by date, and so if you just, rollover it, you can just sort of see the date. and coming down, this is a visualization that tells you about where the money is coming from. so there is a schedule a of 460, so if you hover over them, it sort of goes around, and who made the donation. and there is a little bit of sorting on that. and this, tells you, and this is schedule e. and which tells you, the expenditures. and we have a little bit of a grouping, if you drag and drop one of the circles, it tells you who that is, and how much it gives you the break down of what it is. >> is there any way that you
10:02 am
can make it any darker? the text is very faint. >> yeah, i think that it is because mainly of the resolution. >> yeah. that would not surprise me. >> okay. >> yeah, if you can't, that is fine. >> steve, how do we? url, also, we will be sure that you can access it, it is on-line and you can access it any time. so, yeah, so yeah, so we have the same thing for candidates. so it is broken down just by the districts that are currently up for election, and again, this is the same thing and we have the incumbents and all of the other people and
10:03 am
again, if you click on any particular individual, it gives you the same data. and it gives you the money raised and the money spent and the balance by date, and then, it will also tell you all of the (inaudible) for that particular individual. you can group by different types of entities that are making those contributions, whether it is individuals, or recipient committee and then it also gives you a grouping by size that shows you how much, in this case it is 500, was given at 500 and above, and how much below 500, and so you can just get a rough sense of like, it is about a third, that is less than 500. and the rest is mostly, everybody's maxing out on 500. so, the same thing. and going down, and you can see
10:04 am
what they are spending their money on. we created something, for the incumbents, and one of the great things, that we have accessible through the sf gov site is data on lobbying, so who is lobbying the board of supervisors, when they are lobbied and what contributions were made. and so we were able to take both of those pieces of data and put it together. and you can see right here, that it is combining the data of lobbying and contributions. and this is sort of a heat map
10:05 am
that tells you some activity and so you can see for scott wiener on the 27th. he got six contributions. yeah. from the san francisco association of realtors. so it is the san francisco association of realtors in the contacts and you can see what it is for. and what the issue was that they lobbied for. and so, you can see that it is ellis act evictions or illegal in-law units and so all of those details are in there. and i will turn it over to peter and we will talk a little bit about the challenges that we had with the data. >> all right, so i did some
10:06 am
work with the propositions, and one of the things that we found out as we did work with the data is that all of these things that we decided to actually display like just total money raised, who was giving money, a lot of these things were not necessarily, we need to actually go and find them. and so, for people like our group, or for other people in san francisco, that want to just understand how some of these elections are funded, and it was just, it was great to have this information and we just needed a way to actually make sense of it all and for me
10:07 am
personally, it was... i can actually understand, now with this website, some of how the funding actually works. and i think that it is really beneficial to be able to have this information that the city provides on-line, that anyone can actually use, and then, i hope that it will continue to be available for people like us, or for people who are in journalism, and anyone who wants to be able to use this data, and then, i know steven will talk about what our future ideas are for making some of this data a little more consumable for just normal
10:08 am
citizens like us, or for like i mentioned journalists too. and some of the things that like austin mentioned we had some challenges working through the data and for example, steve has helped us a lot with just some of the reporting in the data that was inconsistent, and so there were some committees that when they made different filings, and i am sure that this is familiar, would have different variations of their name when they reported. so, for a human, looking through it, it is easy to see when a committee just has a slight difference in a name but for a computer of course, it is something that is a little harder to pick out and so thankfully, because he works with the data every day, he has
10:09 am
had this challenge as well and he was able to help us out. but, at some point, in the future, maybe this data will be a little more consistent and it may be all reported electronically, and be able to have some of those inconsistencies worked out so that it is all very clear and, whether or not things are misreported, they are all easily attributed to the same group. let's see. i think that that is all that i have to talk about. did you want to talk? >> sure. >> hi, i'm steven, again. yeah, so i wanted to talk to you about a part of the project
10:10 am
that i spent some time working on, thinking about. and both peter and (inaudible) already mentioned the sort of organizing and understanding of the data. and for a visualization/computing project, the data format and structures are very important, and there are a number of sort of technical issues that i can touch on i guess. so one of the first things that we tried to do with the data, is simply draw a correlation between things and so you saw ash's visualization, that drew a correlation between lobbying events and contributions. and we were interested in doing the same thing for lobbying events and voting records.
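As a rough illustration of the lobbying-versus-contributions comparison described here, the following is a minimal sketch and not the project's actual code. It assumes each record has already been reduced to a person and a date; the field names (`official`, `recipient`, `date`) and the sample records are made up for the example.

```python
from collections import Counter
from datetime import date


def daily_counts(records, who_key, date_key):
    """Count how many records fall on each (person, day) pair."""
    return Counter((r[who_key], r[date_key]) for r in records)


def overlap(lobbying, contributions):
    """Return {(person, day): (contacts, contributions)} for days that have both."""
    contacts = daily_counts(lobbying, "official", "date")
    gifts = daily_counts(contributions, "recipient", "date")
    shared = contacts.keys() & gifts.keys()
    return {key: (contacts[key], gifts[key]) for key in shared}


# Made-up sample records, only to show the shape of the inputs.
lobbying = [
    {"official": "Supervisor A", "date": date(2014, 10, 27)},
    {"official": "Supervisor A", "date": date(2014, 10, 27)},
    {"official": "Supervisor B", "date": date(2014, 10, 28)},
]
contributions = [
    {"recipient": "Supervisor A", "date": date(2014, 10, 27)},
    {"recipient": "Supervisor B", "date": date(2014, 10, 30)},
]

result = overlap(lobbying, contributions)
```

Each cell of a heat map like the one shown could be fed from one entry of `result`; matching on the exact day is a simplifying assumption, and a real comparison might use a window of several days.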
10:11 am
and you know, for every variable in the data set there is a corresponding correlation that can be made against all of the other variables in the data set. and so we were also interested in connecting this data set up with other ones and so beyond the 460, what can we connect it to. and so in the computing space, the term for what we are trying to accomplish is a mash up. and so it is basically a combination of multiple different data sources. often, you know, the people who create the sources are not connected to each other, but they are publishing information independently and so it could be twitter, and the 460 and seeing if there is any overlap. so for our purposes, we were working with mash ups between different parts of the government for the most part. and so, we found a
10:12 am
number of challenges in making things sort of connect to each other and so joined together and aligned, so that we can make comparisons. and we found that there are a number of difficulties in doing that. so, going forward, we have plans in place to make that sort of process easier to carry out, so that we can join with other data sets. and you know, make other more interesting visualizations, reduce the amount of sort of noise that we have to sift through, because there is a lot of data and only a small amount of it is relevant and only a small amount of it has interesting leads and stories to tell. so, really, finding ways to zero in on what matters is difficult.
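One common way to make records line up for a join, given the committee-name variations mentioned earlier, is to reduce each name to a normalized key before grouping. The sketch below is an assumption about how that could be done, not the group's method; the `committee` and `amount` fields, the `NOISE_WORDS` list, and the sample filings are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of words that often differ between filings of the same committee.
NOISE_WORDS = {"the", "inc", "llc"}


def normalize_name(name):
    """Lowercase, strip punctuation, and drop noise words to get a join key."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in NOISE_WORDS]
    return " ".join(tokens)


def group_filings(filings):
    """Group filings whose committee names normalize to the same key."""
    groups = defaultdict(list)
    for filing in filings:
        groups[normalize_name(filing["committee"])].append(filing)
    return dict(groups)


# Made-up filings that a person would read as two committees, not three.
filings = [
    {"committee": "Yes on A, Inc.", "amount": 500},
    {"committee": "YES ON A INC", "amount": 250},
    {"committee": "No on A", "amount": 100},
]

grouped = group_filings(filings)
totals = {key: sum(f["amount"] for f in group) for key, group in grouped.items()}
```

A rule-based key like this only catches simple variations such as case and punctuation; misspellings or reordered words would still need manual review or fuzzy matching.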
10:13 am
and in doing so, we need a place to put the data. and as a government website, sfdata, or datasf, is not able to host community data sets and so as we make, you know, modifications, or filter things a certain way, join up things in a certain way, we need a place to put that data, and so, one of the things that we are doing, going forward, is we have been working with an organization called ckan that has a similar, and i don't know if that is, and that is something that is, but, basically it is a place for us to put processed data sets, so it is like a downstream utility for, you know, when we pull the data in from datasf. and it gets processed and put into our data store. one of the things that we have
10:14 am
done for this project, in terms of processing data, is adding geographic enrichment for the data and so the data from datasf comes with lat and long associated with it, a latitude and longitude coordinate, and so for a contributor, you can see where the contribution was made and that is useful, but we might want to know from a lat long, what state was that lat long from? what sf district was it from? what california county did the money come from? and you use that, you know, enrichment, to show further correlations between things, do we find that certain kinds of activities are associated with certain geographic regions. and you know, to answer that kind of question requires some sort of processing. and processing a data set for that geo enrichment can take
10:15 am
hours to days to find the positions within these geographic regions. and so, there is a lot of processing that needs to be done, and there is a lot of sort of intermediate data that needs to be stored, and going forward we are going to be looking at more of that, and seeing if we can make more out of sort of this foundation, that has been provided to us. and so, yeah, we are very excited and one of the interesting things that i wanted to mention about the 460, is that since it is a statewide form, we can use these tools that we are building on top of the 460 for other organizations who work with it and so there is a branch of code for america over in oakland and one in sacramento and they are sort of all over the state that do similar things to what we are doing. and we can share the tools that we are using to analyze and enrich and structure the data, so that we don't have any repetition of work and so that
10:16 am
we can all sort of stand on each other's shoulders and make progress with these things. and, yeah, so, those, maybe, we rambled a little bit. but, there are a number of sort of technical, social issues with these different brigades going forward. and we are going to continue working with the ethics commission, as much as possible. and for, you know, the foreseeable future, continue working on this project, which is really just a sort of start. so, yeah. thank you. >> for the public, can you tell us the name of the website? it is hard to see on the screen. >> yeah, the website is transparent voting.com. this is the url that we have chosen for this particular visualization. we have the github repo which is the place
10:17 am
where we store our code, and we can put a link to it on this page, after we go. >> great, so the people want to go and play with this site, they can go to transparent voting.com? >> correct. >> voting. >> questions from the commissioners? >> commissioner andrews? >> you know, first of all, thank you for the presentation, i did my absolute best to keep up with all of the terms after the url, i can kind of drift and thank you, and thank you for what you are providing on a volunteer basis and, so some of these are just the fire questions that you can answer, quickly and they tend to run around the organization. so, code for america is a national organization and it is funded, and it is a non-profit and it receives the funding itself, where are the headquarters? >> washington, d.c., i believe. >> and then it has a loose
10:18 am
franchise model of them and then at least the latitude to define your own name and those types of things, is the ultimate goal, and you are all volunteers, and thank you for your service and i suspect that you have other means to keep a roof over your head and keep yourselves fed. would you be... (inaudible) >> you know, they are working in civic tech and they typically work with smaller local governments. and you know, but i think that the governments actually write to them and they have projects that they are interested in and would like someone to come in and work with them and so they actually have these fellows that actually work with the
10:19 am
local city governments to help them with the processes. so that is the main part of code for america. and then there are the local, sort of chapters, they are called the brigades, which is volunteers from within the community itself, who come in and we meet, every wednesday there. and then, we have the different projects and everybody gets to choose a project that they would like to become an ongoing part of. >> i see, at least the chapter aspiration is not necessarily to get to a funded level, or a contract, or a series of contracts that you would be providing this particular service, but just, this is your... >> this is a very different (inaudible). >> we appreciate it. >> it is a very different ethos, and i think that it is not... it is part of what you would probably have heard of as hacker culture. which is very much deeply about open data, and open
10:20 am
government, and open source, and so this particular project is open source, which means that anybody else could take what we have built and reuse it somewhere else. and so, it is part of that ethos. it is not necessarily about like getting out there and making a bunch of money, although, code for america, does have an incubator where they help the start ups but that is not the goal. >> thank you. >> no questions? commissioner hayon? >> my first question, is this a permanent project that you will essentially supervise? >> it is an ongoing project, part of code for america. and so, code for america will hold on to everything, we build. and you know, everything will be live, they do have some not... so we have some nominal
10:21 am
sort of fees that we have to pay for the hosting and that is not a huge amount of money and these, and whatever we built will be there and it is always going to be there. and it really is the question of how it is going to progress. right? >> and who is going to continue to provide. >> and input the data, and that changes from day-to-day and year to year, and election to election. and then, i assume that we will have a link to that, and we will have a permanent link to that website mr. st. croix? >> sure. >> well, we have, because i mean it is very exciting and it is a wonderful project and i would love to already go on the site and play around with it as i am sure that anybody who has an interest in these numbers and this data, would want to do. and to make our own correlations. now, that is my second question.
10:22 am
is you showed the correlations that you have drawn or gave us examples of. and so if i were to go on your site, could i create my own correlations and you know, ask the website to put those correlations together, or does that have to be created by you? >> we have to create it. >> you have to create it? >> yeah, this is not quite at that level. >> but it could be some day. >> yeah i mean that it is one of the many possibilities as peter and steven explained, these are some of the challenges technical challenges that we face, and we really came into this very super idea, was big and we are going to do exactly what you said, but the reality is, these are human beings filling out forms, and so, it is full of errors and lots of issues, and it takes a lot of time and a lot of work. >> reality is like that. >> yeah. >> okay. >> thank you.
10:23 am
>> but it is a great work, thank you so much. >> i can elaborate on the contribution piece a little bit. >> one of the things going forward, we were working on, is a system that enables outside contributions, and that is one of the reasons why we decided to find our own content, what is called a content management system for hosting our own data sets, so that if you ended up importing this into excel, and did your own analysis, you could give us the data and we could put this on the internet. you know? and sort of under our name and in our system as something that can be, you know, built on that is not necessarily endorsed by the government, but it is part of our organization's effort to make the data more understandable and usable and so we are working on some very, you know, sort of basic tools for doing that. but, it is early stages, but it is a focus of ours. >> commissioner king?
10:24 am
>> you have done a good job in allowing the general public to get into the weeds of things like a general election and in regards to funding an election and who is involved, and the whole aspect of following the money which is or tells so much about government, and the transparency and the lack of transparency of government and that is enormously valuable and i want to ask you a question which may be unfair and if it is, feel free not to answer it. since you have gotten into the weeds and you have seen, how elections and balloting works and where the money is coming from and where it is coming from. have you formed any kind of opinions relating to one way or the other? in regard to the whole process itself? that you would like to share with us? >> i think that we had, some
10:25 am
ideas, just about in the future, being able to track future lobbying money and i think that some of those conclusions will come after the elections and just, after yeah, after we have some more information, just from lobbying and what politicians have done. then, i personally might have some more conclusions on that. but it has been interesting to see where money has been coming from so far, but in some ways, i am curious, how that will influence some of the candidates, for example, for the board of supervisors. >> i hope that you do follow up on that. and i would love to hear your conclusions on those things.
10:26 am
>> yeah. >> thanks very much. >> thanks. >> commissioner hayon? >> i have another question, you know, a lot of journalistic entities follow the money and there are organizations that devote themselves to that and i am just wondering have you partnered with any of these groups such as propublica and even nate silver and the work that he does, and you know, vox, and any of the other number of sites that are out there. that do this kind of reporting on money and lobbying and you know, the influence that it has and then there is all of that money that is very difficult to track and that can't be tracked. but i am just wondering if you formed any partnerships on that basis, and then, after that, i have a question for you steven. >> so, the main thing is that a lot of the organizations that you are talking about are largely focused at the federal
10:27 am
level. that is a major difference and so i did personally work for another organization working in that same space, called vitocracy and the great thing about the things on the federal level is that you have many, many organizations that clean up the data for you, and it is very easy to go in there and build stuff on top of it. the challenge at the state and local level, is that we actually don't have those kinds of institutions that actually do the same level of work. so that is why we are going to be working with the other groups, similar brigades who are working on the same thing together and hopefully we can create something, the great thing being the 460, as a form, and that is for the entire state and so that actually helps us build something. we will be reaching out, to journalists within san francisco, with this project, and then, you know, whatever feedback we get from them we will incorporate that into the next.
10:28 am
>> thank you. >> and steven, my question for you, is now that you have worked with this group, with code for america, and the san francisco brigade, what are your hopes for the future collaboration and how we at the ethics commission can use this and take advantage of it and make it available very clearly to all of the public that has an interest in this? >> yeah, i think that these kinds of technologies. >> actually a question was for the other steven. >> oh, >> my question is for you. >> okay. >> we are both named steven. >> i am sorry. you are both steven. you may have thoughts on this too, but for our expert at the commission. >> well, you know this is the first group like this that we really connected with and i
10:29 am
think that it is because we started to post this information on the city's open data system and because it is so accessible and so i hope not only will this relationship continue, because for this group, i mean it has been a major learning curve since the summer to produce this site and so now they have a little bit of experience with this and so i am hoping for this group that this will continue and then i am also hoping that it will help us with other cities as well. because, i have worked a little bit with the group in oakland, and i know that there is a group in san jose and again in sacramento, and so there are a lot of different projects that are going on independently and i hope that going forward they can start to work together more to produce sort of one more unified site, and another sort of exciting development, which i have not had a chance to talk to them about, is that in the last week, netfile, which is our vendor for our electronic filing system and is also the vendor for
10:30 am
all of the electronic filings in the other cities, posted a website and, the programmers like this group, can actually tap into the data in any city in california, and then they also took the state's data and included it as well. and so you can actually tap into all of the data state wide and so you could build something like this now and again, this is only, it has developed in the past week, so a lot of stuff is moving at the moment. >> thank you. >> any other questions from the commissioners? >> well, thank you very much for the presentation. we appreciate it, and thank you for your hard volunteer work. it is really great, and i think that the public is going to greatly benefit from it. [ applause ] public comment, on agenda item three? >> i too would like to compliment these volunteers, the website looks terrific. i think journalists