
ReadWriteWeb


Facebook has cut a deal with political website Politico that gives the independent site machine access to Facebook users' messages, both public and private, when a Republican Presidential candidate is mentioned by name. The data is being collected and analyzed for sentiment by Facebook's data team, then delivered to Politico to serve as the basis of data-driven political analysis and journalism.

The move is being widely condemned in the press as a violation of privacy, but if Facebook did this right, it could be a huge win for everyone. Facebook could be the biggest, most dynamic census of human opinion and interaction in history. Unfortunately, failure to talk prominently about privacy protections, failure to make this opt-in (or even opt-out!) and the inclusion of private messages are all things that put at risk any remaining shreds of trust in Facebook that could have served as the foundation of a new era of social self-awareness.


We, ok I, have long argued here at ReadWriteWeb that aggregate analysis of Facebook data is an idea with world-changing potential. The analogy from history that I think of is real estate redlining. Back in the middle of the last century, when US Census data and housing mortgage loan data were both made available for computer analysis and cross-referencing for the first time, early data scientists were able to prove a pattern of racial discrimination by banks against people of color who wanted to buy houses in certain neighborhoods. The data illuminated the problem and made it undeniable, thus leading to legislation to prohibit such discrimination.

I believe that there are probably patterns of interaction and communication of comparable historic importance that could be illuminated by effective analysis of Facebook user data. Good news and bad news could no doubt be found there, if critical thinking eyes could take a look.

"Assuming you had permission, you could use a semantic tool to investigate what issues the users are discussing, what weight those issues have in relation to everything else they are saying and get some insights into the relationships between those issues," writes systemic innovation researcher Haydn Shaughnessy in a comment on Forbes privacy writer Kashmir Hill's coverage of the Politico deal. "As far as I can see people use sentiment analysis because it is low overhead; the quickest, cheapest way to reflect something of the viewpoints, however fallible the technique. Properly mined though you could really understand what those demographics care about."

Several years ago I had the privilege to sit with Mark Zuckerberg and make this argument to him, but it doesn't feel like the company has seized the world-changing opportunity in front of it.

Facebook does regularly analyze its own data, of course. And sometimes it publishes what it finds. For example, two years ago the company cross-referenced the body of its users' names with US Census data that tied last names to ethnicity. Facebook's conclusion was that the site used to be disproportionately made up of White people - but now it's as ethnically diverse as the rest of America. Good news!

But why do we only hear the good news? That millions of people are talking about Republican Presidential candidates might be considered bad news, but the new deal remains a very limited instance of Facebook treating its user data like the platform that it could be.

It could be just a sign of what's to come, though. "This is especially interesting in terms of the business relationships--who's allowed to analyze Facebook data across all users?" asks Nathan Gilliatt, principal at research firm Social Target and co-founder of AnalyticsCamp. "To my knowledge, they haven't let other companies analyze user data beyond publicly shared stuff and what people can access with their own accounts' authorization. This says to me that Facebook understands the value of that data. It will be interesting to see what else they do with it."

I've been told that Facebook used to let tech giant HP informally hack at their data years ago, back when the site was small and the world's tech privacy lawyers were as yet unaroused. That kind of arrangement would have been unheard of for the past several years, though. Two years ago, social graph hacker Pete Warden pulled down Facebook data from hundreds of millions of users, analyzing it for interesting connections before planning on releasing it to the academic research community. Facebook's response was assertive and came from the legal department. Warden decided not to give the data to researchers after all. (Disclosure: I am writing this post from Warden's couch.)

"Like a lot of Facebook's studies, this collaboration with Politico is fascinating research, it's just a real shame they can't make the data publicly available, largely due to privacy concerns," bemoans Warden. "Without reproducibility, it loses a lot of its scientific impact. With a traditional opinion poll, anyone with enough money can call up a similar number of people and test a survey's conclusions. That's not the case with Facebook data."

"Everyone is going 'gaga' over the potential for Facebook," says Kaliya Hamlin, Executive Director of a trade and advocacy group called the Personal Data Ecosystem Consortium.

"The potential exists only because they have this massive lead (monopoly) so it seems like they should be the ones to do this.

"Yes we should be doing deeper sentiment analysis of people's real opinions. But in a way that they are choosing to participate - so that the entities that aggregate such information are trusted and accountable.

"If I had my own personal data store/service and I chose to share say my music listening habits with a ratings service like Nielsen - voluntarily join a panel. I have full trust and confidence that they are not going to turn on me and do something else with my data - it will just go in a pool.

"Next thing you know Facebook is going to be selling to the candidate the ability to access people who make positive or negative comments in private messages. Where does it end? How are they accountable and how do we have choice?"

Not everyone is as concerned about this from a privacy perspective. "There are many things in the online world that give me willies for Fourth-Amendment-like reasons," says Curt Monash of data analyst firm Monash Research. "This isn't one of them, because the data collectors and users aren't proposing to even come close to singling out individual people for surveillance."

Monash's primary concern is in the quality of the data. "There's a limit as to how useful this can be," he says. "Online polls and similar popularity contests are rife with what amounts to ballot box stuffing. This will be just another example. It is regrettable that you can now stuff an online ballot box by spamming your friends in private conversation."

It doesn't just have to be about messages, though. Social connections, Likes and more all offer a lot of potential for analysis, if it's done appropriately.

"We need trust and accountability frameworks that work for people to allow analysis AND not allow creepiness," says Hamlin.

Two years ago social news site Reddit began giving its users an option to "donate your data to science" by opting in to have activity data made available for download. Massive programming question-and-answer site Stack Overflow has long made available periodic dumps of its users' data for analysis. "You never know what's going to come out of it," Stack Overflow co-founder Joel Spolsky says about analysis of aggregate user data.

The unknown potential is indicative not just of how valuable Facebook data is, but potentially of the relationship between data and knowledge generally in the emerging data-rich world.

That's the thesis of author David Weinberger's new book, Too Big to Know. "It's not simply that there are too many brickfacts [datapoints] and not enough edifice-theories," he writes. "Rather, the creation of data galaxies has led us to science that sometimes is too rich and complex for reduction into theories. As science has gotten too big to know, we've adopted different ideas about what it means to know at all."

The world's largest social network, rich with far more signal than any of us could wrap our heads around, could help illuminate emergent qualities of the human experience that are only visible on the network level.

Please don't mess up our chance to learn those things, Mr. Zuckerberg.




Showing 16 comments

  • Chef Leslie Davio
    well you know i think that all politicians are nothing but bold face liars and cheats all they are looking at is the dollar signs and the power
  • Joseph Martins, Managing Director of DMG LLC
    Does anyone know if deleted public and private messages would be included in the analyses? Monash suggests that spamming would become the digital equivalent of ballot box stuffing. However, he's assuming either people keep the spam, or that FB includes deleted messages in its analyses.
  • Imota Dinaroid
    So, people are STILL using Facebook?
    Not for long.
  • lakawak
    I love it when idiots like you say that. Yes...people are still using Facebook. With more page views than Google. (Not Google+, mind you...google) So yeah...I would say they are. 

    As for not for long? Why? Are you suggesting that everyone is as stupid as you and wouldn't know how to simply NOT OPT IN for this?
  • Owe Billy
    I love it when idiots like you blast someone without even reading the article. There is no opting in or out. The only opt-out is to stop using FB. I believe you owe someone an apology.
  • AntiSocialismFighter
    If they can data mine for good then they can data mine for bad. The sad thing is how desensitized and "sleeping giant" everyone is until something bad happens and then that becomes the norm and people cannot do anything about it. And, round and round we go towards Socialism. Get a f'n clue yourself and demand that they stop mining, demand they stop going into "private emails". Even if there is mining from an external point - leave my personal emails out of it. OkCupid uses mined data for their own use...not Government statistical information. Facebook seems to be going into traitor type business.

    If you're not part of the solution, you're part of the problem and you need to go.
  • lakawak
    You can demand it...simply by not opting in.
  • HarrisonPainter As EVP of lwi I spend my time between Indy & Los Angeles working in Digital Relations and Promotions. Co-founder of the Smash Network!
    I wish the focus was a little less on whether or not a free service you voluntarily joined is mining your data and more on having a voice to stop SOPA!
  • Jeff Downer Indianapolis IN Bail bondsman and owner of Jeff Downer Bail Bonds in Indianapolis Indiana.
    No more private communications?  Good night FB.
  • rflulling
    You're on Facebook. You're on the internet. What exactly about the internet screams privacy? Really? You want privacy? DON'T PUT IT ON THE INTERNET. It's like your credit card number. Don't want some guy in Nigeria to access your bank account? Keep it in your pocket where it belongs.

    That being said, when companies data mine your profiles, it's done in the blind. The systems gather trivial statistical data. No one gives a rat's arse who you are and what you care about. What they care about is frequency of mention and other data. That can be combined with other data you have already given countless websites and applications public access to. It is likely that the very application that's mining your data is one you already gave permission to long ago. In fact, what's happening here in this story is nothing compared to the data that you have already lost, to the data that has been gathered with your permission and entered into massive databases around the world for the very purpose of advertising to you, and advertising to your friends in your name. Think it's not happening? You're quite clueless.

    A website called OkCupid used similar mining to harvest data from its users' profiles, not to be sold but to gather statistical information that the authors of the website used to create some really awesome graphs and reports. They used their mined data to end old stereotypes and verify some urban legends that no one really had any proof of. In exchange for us having the survey questions, user interactions, and other statistical data used to create these reports, we were all granted access to one of the largest free social singles sites on the planet.
    Everything has a cost, especially FREE. Are you really prepared to pay it?

    If you fear what might happen to information you post online, then either you don't post it, or you turn off your computer, cut the wires and go outside.

    Stop harassing everyone about your lack of privacy because you don't really seem to have a clue as to what privacy is.
  • anothercultland
    I agree with your basic premise -  but the operative phrase here is "if Facebook would do this right."

    The first committed users of any technology are usually those who would abuse it.

    Facebook's track record is very, very spotty - at best - and the lack of user opting & the inclusion of private messaging are not good signs.

    I'm afraid we'll probably have to file this in the "what could have been" file.
  • M. Edward (Ed) Borasky, Top 50 Media Inactivist, Thought Follower, Sit-Down Comic, Social Media Analytics Researcher, Former Boy Genius, Linux Capacity Planner, R Hacker, Mathematician
    Perhaps "What should not have been" is more fitting.
  • M. Edward (Ed) Borasky
    Pardon me for being a curmudgeon, but I see nothing but risks here - risks to the democratic process, risks to the sanctity of the secret ballot, and, of course, the risks to privacy. The "equal time" provision of the FCC, if it even still exists, is in shambles already because of social media. The Supreme Court decision popularly known as "Citizens United" has allowed an influx of money into such scurrilousness as the 30-minute "documentary" about Mitt Romney's days at Bain Capital being virally circulated, and now this. 

    Color me skeptical and a bit frightened. And yes, I am back on Google+
  • Judah Richardson
    Google reads your emails for ads, buddy. If that isn't an invasion of privacy for the sake of profit, I don't know what is. Don't be a hypocrite.

    That said, I'm an avid user of many Google services and Facebook. I don't mind their methods.
  • M. Edward (Ed) Borasky
    Yes, I know Google reads my emails. And *everybody* who operates in the USA must deliver my data to law enforcement on receipt of a court order. Politico is *not* a court or law enforcement, they're a media organization.

    We're talking basic Bill of Rights law here, suitably interpreted for today's real-time many-to-many communications platforms and sophisticated data mining and knowledge discovery techniques. Something like this needs to be tested in the courts, and when you're talking Facebook, only the Federal government has the resources to get it there.

    In a way, I feel sorry for Congress, having to deal with basic budget matters on a more-or-less full-time basis while the corporate world is busy slugging it out on the Internet. That's capitalism, I guess.