There are a number of interesting variables that you could correlate for analysis, but everyone loves a good game every once in a while. This section lays the foundation for a simple game you can play to see how well you know your friends, by grouping them such that their hometowns and current locations are juxtaposed. As always, the full source is available online at http://github.com/pt...nd_hometowns.py.
The FQL query we’ll run to get the names, current locations, and hometowns is simple and should look similar to previous FQL queries:
q = """select name, current_location, hometown_location from user where uid in
(select target_id from connection where source_id = me())"""
results = fql(query=q)

Example 9-17 shows the final format that feeds the tree widget once you’ve invested the sweat equity in massaging the query results into shape, and Example 9-18 shows the Python code that generates it.
{
    "items": [
        {
            "name": " Alabama (2)",
            "children": [
                {
                    "state": " Alabama",
                    "children": [
                        {
                            "state": " Tennessee",
                            "name": "Nashville, Tennessee (1)",
                            "children": [
                                {
                                    "name": "Joe B."
                                }
                            ]
                        }
                    ],
                    "name": "Prattville, Alabama (1)",
                    "num_from_hometown": 1
                }
            ]
        },
        {
            "name": " Alberta (1)",
            "children": [
                {
                    "state": " Alberta",
                    "children": [
                        {
                            "state": " Alberta",
                            "name": "Edmonton, Alberta (1)",
                            "children": [
                                {
                                    "name": "Gina F."
                                }
                            ]
                        }
                    ],
                    "name": "Edmonton, Alberta (1)",
                    "num_from_hometown": 1
                }
            ]
        },
        ...
    ],
    "label": "name"
}

The final widget ends up looking like Figure 9-8, a hierarchical display that groups your friends first by their hometowns and then by where they are currently located. In Figure 9-8, Jess C. is currently living in Tuscaloosa, AL but grew up in Princeton, WV. Although we’re correlating two harmless variables here, this exercise helps you quickly determine where most of your friends are located and gain insight into who has migrated from their hometown and who has stayed put. It’s not hard to imagine variations that are more interesting or faceted displays that introduce additional variables, such as college attended, professional affiliation, or marital status.
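
If you would rather read the "who moved and who stayed put" insight out of the data than out of the widget, you can walk the tree directly. The following is a minimal sketch that assumes the layout of Example 9-17 (state, then hometown, then current location, then friend). The helper names `strip_count` and `movers_and_stayers` are hypothetical and are not part of the book's source.

```python
import re

# A trimmed copy of the structure shown in Example 9-17.
tree = {
    "items": [
        {"name": " Alabama (2)", "children": [
            {"name": "Prattville, Alabama (1)", "state": " Alabama",
             "num_from_hometown": 1,
             "children": [
                 {"name": "Nashville, Tennessee (1)", "state": " Tennessee",
                  "children": [{"name": "Joe B."}]}]}]},
        {"name": " Alberta (1)", "children": [
            {"name": "Edmonton, Alberta (1)", "state": " Alberta",
             "num_from_hometown": 1,
             "children": [
                 {"name": "Edmonton, Alberta (1)", "state": " Alberta",
                  "children": [{"name": "Gina F."}]}]}]},
    ],
    "label": "name",
}


def strip_count(label):
    """Drop the trailing ' (N)' count that the tree labels carry."""
    return re.sub(r"\s*\(\d+\)$", "", label).strip()


def movers_and_stayers(tree):
    """Return (movers, stayers) as lists of (friend, hometown, current)."""
    movers, stayers = [], []
    for state_node in tree["items"]:
        for hometown_node in state_node["children"]:
            hometown = strip_count(hometown_node["name"])
            for current_node in hometown_node["children"]:
                current = strip_count(current_node["name"])
                for friend in current_node["children"]:
                    record = (friend["name"], hometown, current)
                    if hometown == current:
                        stayers.append(record)
                    else:
                        movers.append(record)
    return movers, stayers


movers, stayers = movers_and_stayers(tree)
for name, hometown, current in movers:
    print("%s moved from %s to %s" % (name, hometown, current))
for name, hometown, current in stayers:
    print("%s stayed in %s" % (name, hometown))
```

Note that a friend with only one of the two locations filled in would be labeled 'Unknown' on one side and so would land in the movers list. You may want to filter those records out before drawing conclusions.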

A simple FQL query is all that it took to fetch the essential data, but there’s a little work involved in rolling up data items to populate the hierarchical tree widget. A fun improvement to the user experience might be integrating Google Maps with the widget so that you can quickly bring up locations you’re unfamiliar with on a map. Adding age and gender information into this display could also be interesting if you want to dig deeper or take another approach to clustering. Emitting some KML and visualizing it in Google Earth might be another possibility worth considering, depending on your objective.
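
If you want to try the Google Earth idea, the KML itself is easy to produce once you have coordinates for each place. The sketch below is only an illustration under stated assumptions: the `places` list and its rough coordinates are made up for the example, and `to_kml` is a hypothetical helper. Real coordinates would have to come from a geocoder or from whatever location fields your data source exposes.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical input: (label, latitude, longitude). The coordinates are
# rough values for illustration only.
places = [
    ("Prattville, Alabama (1)", 32.46, -86.46),
    ("Edmonton, Alberta (1)", 53.55, -113.49),
]


def to_kml(places):
    """Build a KML document with one Placemark per (label, lat, lon)."""
    # Serialize the KML namespace as the default (unprefixed) namespace.
    ET.register_namespace("", KML_NS)
    kml = ET.Element("{%s}kml" % KML_NS)
    doc = ET.SubElement(kml, "{%s}Document" % KML_NS)
    for label, lat, lon in places:
        placemark = ET.SubElement(doc, "{%s}Placemark" % KML_NS)
        ET.SubElement(placemark, "{%s}name" % KML_NS).text = label
        point = ET.SubElement(placemark, "{%s}Point" % KML_NS)
        # KML orders coordinates as longitude,latitude.
        coords = ET.SubElement(point, "{%s}coordinates" % KML_NS)
        coords.text = "%s,%s" % (lon, lat)
    return ET.tostring(kml, encoding="unicode")


kml_text = to_kml(places)
print(kml_text)
```

Saving `kml_text` to a file with a .kml extension should be enough for Google Earth to open it and show one pin per place.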

import sys
import json
import facebook
from facebook__fql_query import FQL
from facebook__login import login

try:
    ACCESS_TOKEN = open("facebook.access_token").read()
except IOError:
    try:
        # If you pass in the access token from the Facebook app as a command-line
        # parameter, be sure to wrap it in single quotes so that the shell
        # doesn't interpret any characters in it. You may also need to escape
        # the # character.
        ACCESS_TOKEN = sys.argv[1]
    except IndexError:
        sys.stderr.write("Could not find an access token in "
                         "'facebook.access_token' or parse args. "
                         "Logging in...\n")
        ACCESS_TOKEN = login()
# Process the results of the following FQL query to create JSON output suitable for
# consumption by a simple hierarchical tree widget:
fql = FQL(ACCESS_TOKEN)
q = \
"""select name, current_location, hometown_location from user where uid in
(select target_id from connection where source_id = me() and target_type =
'user')"""
results = fql.query(q)
# First, read over the raw FQL query results and create a hierarchical map that
# groups people by their hometowns and, within each hometown, by where they live
# now. We'll simply tabulate frequencies, but you could easily grab additional
# data in the FQL query and use it for many creative situations.
current_by_hometown = {}
for r in results:
    if r['current_location'] is not None:
        current_location = r['current_location']['city'] + ', ' \
            + r['current_location']['state']
    else:
        current_location = 'Unknown'
    if r['hometown_location'] is not None:
        hometown_location = r['hometown_location']['city'] + ', ' \
            + r['hometown_location']['state']
    else:
        hometown_location = 'Unknown'
    if hometown_location in current_by_hometown:
        if current_location in current_by_hometown[hometown_location]:
            current_by_hometown[hometown_location][current_location] += \
                [r['name']]
        else:
            current_by_hometown[hometown_location][current_location] = \
                [r['name']]
    else:
        current_by_hometown[hometown_location] = {}
        current_by_hometown[hometown_location][current_location] = \
            [r['name']]
# There are a lot of different ways you could slice and dice the data now that
# it's in a reasonable data structure. Let's create a hierarchical
# structure that lends itself to being displayed as a tree.
items = []
for hometown in current_by_hometown:
    num_from_hometown = sum([len(current_by_hometown[hometown][current])
                             for current in current_by_hometown[hometown]])
    name = '%s (%s)' % (hometown, num_from_hometown)
    try:
        hometown_state = hometown.split(',')[1]
    except IndexError:
        hometown_state = hometown
    item = {'name': name, 'state': hometown_state,
            'num_from_hometown': num_from_hometown}
    item['children'] = []
    for current in current_by_hometown[hometown]:
        try:
            current_state = current.split(',')[1]
        except IndexError:
            current_state = current
        item['children'].append({
            'name': '%s (%s)' % (current,
                                 len(current_by_hometown[hometown][current])),
            'state': current_state,
            'children': [{'name': f[:f.find(' ') + 2] + '.'}
                         for f in current_by_hometown[hometown][current]],
        })
    # Sort items alphabetically by state. Further roll-up by state could
    # be done here if desired.
    item['children'] = sorted(item['children'], key=lambda i: i['state'])
    items.append(item)
# Optionally, roll up outer-level items by state to create a better user experience
# in the display. Alternatively, you could just pass the current value of items in
# the final statement that creates the JSON output for smaller data sets.
items = sorted(items, key=lambda i: i['state'])
all_items_by_state = []
grouped_items = []
current_state = items[0]['state']
# The loop below visits items[0] as well, so the tally starts at zero to
# keep the first hometown from being counted twice.
num_from_state = 0
for item in items:
    if item['state'] == current_state:
        num_from_state += item['num_from_hometown']
        grouped_items.append(item)
    else:
        all_items_by_state.append({'name': '%s (%s)' % (current_state,
                                   num_from_state), 'children': grouped_items})
        current_state = item['state']
        num_from_state = item['num_from_hometown']
        grouped_items = [item]
all_items_by_state.append({'name': '%s (%s)' % (current_state,
                           num_from_state), 'children': grouped_items})
# Finally, emit output suitable for consumption by a hierarchical tree widget
print(json.dumps({'items': all_items_by_state, 'label': 'name'},
                 indent=4))
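
The grouping and roll-up steps in the listing above can also be written more compactly with the standard library. What follows is a sketch, not the book's implementation. It runs on a few made-up records shaped like the FQL results (`sample_results` and the helper functions are hypothetical), builds the nested map with `collections.defaultdict`, and rolls hometowns up by state with `itertools.groupby`. Unlike the listing, it strips the whitespace around state names.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical records shaped like the rows the FQL query returns.
sample_results = [
    {"name": "Joe Black",
     "hometown_location": {"city": "Prattville", "state": "Alabama"},
     "current_location": {"city": "Nashville", "state": "Tennessee"}},
    {"name": "Gina Ford",
     "hometown_location": {"city": "Edmonton", "state": "Alberta"},
     "current_location": {"city": "Edmonton", "state": "Alberta"}},
    {"name": "Ann Lee",
     "hometown_location": {"city": "Mobile", "state": "Alabama"},
     "current_location": None},
]


def place(location):
    """Format a location dict as 'City, State', or 'Unknown' if missing."""
    if location is None:
        return "Unknown"
    return "%s, %s" % (location["city"], location["state"])


def state_of(place_name):
    """Return the state part of 'City, State' (or the whole name)."""
    return place_name.split(",")[-1].strip()


def short_name(full_name):
    """Abbreviate 'Joe Black' to 'Joe B.', as the listing above does."""
    return full_name[:full_name.find(" ") + 2] + "."


# hometown -> current location -> [friend names]
current_by_hometown = defaultdict(lambda: defaultdict(list))
for r in sample_results:
    hometown = place(r["hometown_location"])
    current = place(r["current_location"])
    current_by_hometown[hometown][current].append(r["name"])

# Build one item per hometown.
items = []
for hometown, by_current in current_by_hometown.items():
    num_from_hometown = sum(len(friends) for friends in by_current.values())
    items.append({
        "name": "%s (%d)" % (hometown, num_from_hometown),
        "state": state_of(hometown),
        "num_from_hometown": num_from_hometown,
        "children": [
            {"name": "%s (%d)" % (current, len(friends)),
             "state": state_of(current),
             "children": [{"name": short_name(f)} for f in friends]}
            for current, friends in sorted(by_current.items())
        ],
    })

# groupby only merges adjacent items, so sort by state first.
items.sort(key=lambda i: i["state"])
all_items_by_state = []
for state, group in groupby(items, key=lambda i: i["state"]):
    group = list(group)
    total = sum(i["num_from_hometown"] for i in group)
    all_items_by_state.append(
        {"name": "%s (%d)" % (state, total), "children": group})

print([s["name"] for s in all_items_by_state])
```

The sort before `groupby` matters: without it, two Alabama hometowns separated by an Alberta one would produce two separate Alabama groups.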
Learn more about this topic from Mining the Social Web.
Popular social networks such as Facebook and Twitter generate a tremendous amount of valuable data on topics and use patterns. Who’s talking to whom? What are they talking about? How often are they talking? This concise and practical book shows you how to answer these questions and more by harvesting and analyzing data using social web APIs, Python tools, GitHub, HTML5, and JavaScript.
