“The Largest Social Network Ever Analyzed”

FACT: According to ComScore data cited in a story in Monday’s Financial Times, “Facebook, the fast-growing social network, has taken a significant lead over MySpace in visitor numbers for the first time… Facebook attracted more than 123 million unique visitors in May, an increase of 162 per cent over the same period last year… That compared with 114.6 million unique visitors at MySpace, Facebook’s leading rival, whose traffic grew just 5 per cent during the same period… The findings mark the first time that Facebook, launched in 2004, has taken a significant lead in unique visitors, [and] come at a time of change inside Facebook, as the one-time upstart attempts to transform itself into a leading media company.”

ANALYSIS:  This week several members of the Microsoft Institute met in Redmond with a visiting friend from government, and among other talks we had a very interesting discussion with Eric Horvitz, a Microsoft Research principal researcher and manager.  Eric’s well known for his work in artificial intelligence and currently serves as president of the Association for the Advancement of Artificial Intelligence (AAAI).

We talked about one of Eric’s recent projects for quite a while: “Planetary-Scale Views on a Large Instant-Messaging Network,” a project which has been described by his co-author as “the largest social network ever analyzed.” 

One interesting facet is the story of that co-author, Jure Leskovec, who collaborated on the work while a grad-student intern at Microsoft Research; here’s his student page at Carnegie Mellon, where he’s finishing his PhD in Computer Science this summer.  That page doesn’t yet note that he’s taking a teaching position at Stanford University — Go Stanford, Beat Cal!

The Leskovec-Horvitz study took just one month’s worth of anonymized data capturing high-level communication activities within the whole of the Microsoft Messenger instant-messaging system, the Internet’s most popular IM environment.  Since they were examining the patterns and collective dynamics of communications among large numbers of people, not individual conversations, the dataset contained “summary properties” of 30 billion conversations among 240 million people over the course of that month.

The communication graph constructed includes 180 million nodes and 1.3 billion undirected edges, “creating the largest social network constructed and analyzed to date.”  Check out the “map” on the left, of the geo-located conversations – to the eye it actually constructs an understandable physical map of the world’s landmass.  Another view, a “communications heat map” of the data, brings to mind Thomas Friedman’s influential best-seller “The World is Flat.”

Some of the study’s findings may have been intuitive, some not so much:

  • “We find that the graph is well-connected and robust to node removal.
  • We investigate on a planetary-scale the oft-cited report that people are separated by “six degrees of separation” and find that the average path length among Messenger users is 6.6.
  • We also find that people tend to communicate more with each other when they have similar age, language, and location, and that cross-gender conversations are both more frequent and of longer duration than conversations with the same gender.”
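
For readers who want to see what “average path length” means in practice, here is a minimal Python sketch. It is not the authors’ code or method: the tiny four-person graph and the function names are made up for illustration, and it simply runs a breadth-first search from every node and averages the hop counts over all connected pairs. At 180 million nodes an exhaustive all-pairs pass like this would be impractical, so treat it only as an illustration of the definition.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the hop count from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_path_length(graph):
    """Average shortest-path length over all ordered pairs of connected nodes."""
    total_hops = 0
    pair_count = 0
    for source in graph:
        for target, hops in bfs_distances(graph, source).items():
            if target != source:
                total_hops += hops
                pair_count += 1
    return total_hops / pair_count if pair_count else 0.0


# Hypothetical toy network: four people chatting in a chain (undirected edges).
toy_graph = {
    "ana": ["ben"],
    "ben": ["ana", "cho"],
    "cho": ["ben", "dev"],
    "dev": ["cho"],
}

print(round(average_path_length(toy_graph), 2))
```

In the toy chain the two people at the ends are three hops apart and everyone else is closer, so the average comes out well under two; the study’s 6.6 figure is the same kind of average taken over the Messenger graph.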

You know, I can’t help highlighting the “Kevin Bacon” aspect noted by the authors: “To our knowledge, this is the first time a planetary-scale social network has been available to validate the well-known “6 degrees of separation” finding by Travers and Milgram [a 1969 study]. The earlier work employed a sample of 64 people and found that the average number of hops for a letter to travel from Nebraska to Boston was 6.2 (mode 5, median 5), which is popularly known as the “6 degrees of separation” among people. We used a population sample that is more than two million times larger than the group studied earlier and confirmed the classic finding.”

I won’t include many of the other interesting findings, but it’s a fascinating study and is now leading to other hypotheses and research.  It was also presented a couple of months ago at the Beijing WWW 2008 Conference.

Some more general points: Eric’s “Adaptive Systems and Interaction Group” within MSR includes work on machine learning and decision making, search and retrieval, sensor fusion, human-computer interaction, ecommerce, hardware devices, computational theory, and cryptography.   Here’s Eric’s own page on the Microsoft Research (MSR) site, with links to many projects and published papers, and where he describes some of his focus areas:

“I’m interested in computational foundations of intelligent sensing, reasoning, and action, with a particular focus on methods for grappling with uncertainty about environments or situations. I’m also interested in models of human cognition, and in developing computational systems that leverage insights about cognition to help people to achieve their goals. Much of my work makes use of probability and decision theory, decision analysis, and, in particular, Bayesian and decision-theoretic principles. My research spans both theoretical issues and concrete, real-world applications. I’m interested in information triage and alerting that takes human attention into consideration, spanning work on notification systems, surprise modeling, multitasking, and psychological studies of interruption and recovery. Other interests include principles of mixed-initiative interaction that can support fluid, efficient collaborations between people and computing systems, methods for guiding computer actions in accordance with the preferences of people, search and information retrieval, and collaboration.” – from the home page of Eric Horvitz

Every time I talk with Eric I learn something new and unexpected, this time including something about how to learn new and unexpected things. I suppose that’s inevitable given his interest in surprise-modeling.

By the way, when he says above that his “research spans both theoretical issues and concrete, real-world applications,” he means it.  A couple of months ago Microsoft got pretty wide coverage in TechMeme and elsewhere (see New York Times story, or WIRED magazine) for the release of the ClearFlow traffic system, which uses Eric’s team’s AI research for “a smarter way to keep you out of snarls” (WIRED’s phrase) with “an ambitious effort to add AI machine-learning techniques to the complex problem of predicting traffic congestion” (again WIRED).  The system uses predictive algorithms based on years of traffic data correlated with many other events and variables (time of day, weather, holidays, sporting events), together of course with the normal live traffic data from networks of highway sensors.  You can use that system today at Live Maps.


6 Responses

  1. for prediction markets check out several of these:


    the blog is not bad also


  2. I agreed with you


  3. […] story covers the same research which I blogged about a couple of months ago, as “The Largest Social Network Ever Analyzed,” and I’m not surprised at the popularity on news sites.  In fact, when my blog post […]


  4. […] “The Largest Social Network Ever Analyzed” […]


  5. Hi! This is my 1st comment here so I just wanted to give
    a quick shout out and tell you I really enjoy reading your articles.
    Can you suggest any other blogs/websites/forums that go over the same topics?
    Thanks a ton!

