Lerman Twitter 2010 Dataset
Kristina Lerman

twitter (3 files)
active_follower_real_sql.zip 194.05MB
distinct_users_from_search_table_real_map.csv 28.08MB
link_status_search_with_ordering_real_csv.zip 70.04MB
Type: Dataset
Tags: twitter

title= {Lerman Twitter 2010 Dataset},
journal= {},
author= {Kristina Lerman },
year= {2010},
license= {This data is made available to the community for research purposes only},
url= {http://www.isi.edu/~lerman/downloads/twitter/twitter2010.html},
abstract= {The Twitter_2010 data set contains tweets with URLs that were posted on Twitter during October 2010. In addition to the tweets, we also collected the followee links of tweeting users, allowing us to reconstruct the follower graph of active (tweeting) users.
URLs	66,059
tweets	2,859,764
users	736,930
links	36,743,448

Table (in csv format) link_status_search_with_ordering_real_csv contains tweets with the following information

link: URL within the text of the tweet
id: tweet id
create_at: date added to the db
inreplyto_screen_name: screen name of user this tweet is replying to
inreplyto_user_id: user id of user this tweet is replying to
source: device from which the tweet originated
bad_user_id: alternate user id
user_screen_name: tweeting user screen name
order_of_users: tweet's index within sequence of tweets of the same URL
user_id: user id

Table (in csv format) distinct_users_from_search_table_real_map contains names of tweeting users, and the following information for each user:

user_id: user id
user_screen_name: user name
indegree: number of followers
outdegree: number of friends/followees
bad_user_id: alternate user id

Follower graph

File active_follower_real_sql contains zipped SQL dump of links between tweeting users in the form:

user_id: user id
follower_id: user id of the follower

An empirical characterization of this data is given in:
Kristina Lerman, Rumi Ghosh, Tawan Surachawala (2012) "Social Contagion: An Empirical Study of Information Spread on Digg and Twitter Follower Graphs." This data is made available to the community for research purposes only. If you use the data in a publication, please cite the above paper.},
keywords= {twitter},
terms= {}
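
The column descriptions in the abstract suggest how the tweet table and the follower links might be combined. The sketch below is illustrative only: it groups tweets by link, orders each group by order_of_users, and flags users who tweeted a URL after someone they follow had already tweeted it. The sample rows are made up, only a subset of the documented columns is used, and the presence of a header row and comma delimiter in the real CSV is an assumption. The function names (build_cascades, build_followees, exposed_adopters) are hypothetical, and loading the SQL dump into (user_id, follower_id) pairs is left out.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up sample. The real file is far larger; a header row and a
# comma delimiter are assumptions, not facts stated by the dataset page.
TWEETS_CSV = """link,id,user_id,order_of_users
http://example.com/a,101,1,1
http://example.com/a,102,2,2
http://example.com/a,103,3,3
http://example.com/b,104,3,1
"""

# (user_id, follower_id) pairs, following the column description given
# for active_follower_real_sql. These values are invented for the demo.
FOLLOWER_PAIRS = [(1, 2), (2, 3), (1, 4)]


def build_cascades(tweets_file):
    """Map each URL to its tweeting user ids, sorted by order_of_users."""
    rows_by_link = defaultdict(list)
    for row in csv.DictReader(tweets_file):
        rows_by_link[row["link"]].append(
            (int(row["order_of_users"]), int(row["user_id"]))
        )
    return {
        link: [user for _, user in sorted(rows)]
        for link, rows in rows_by_link.items()
    }


def build_followees(pairs):
    """Map each follower to the set of users they follow."""
    followees = defaultdict(set)
    for user_id, follower_id in pairs:
        followees[follower_id].add(user_id)
    return followees


def exposed_adopters(cascade, followees):
    """Users who tweeted a URL after someone they follow had tweeted it."""
    seen = set()
    exposed = []
    for user in cascade:
        # Intersect this user's followees with everyone earlier in the cascade.
        if followees.get(user, set()) & seen:
            exposed.append(user)
        seen.add(user)
    return exposed


cascades = build_cascades(io.StringIO(TWEETS_CSV))
followees = build_followees(FOLLOWER_PAIRS)
for link, cascade in sorted(cascades.items()):
    print(link, cascade, exposed_adopters(cascade, followees))
```

For real use, the in-memory sample would be replaced by an open handle on the unzipped tweet CSV, and the pairs by rows read from the imported SQL table. A user who tweets the same URL more than once is not treated specially in this sketch.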
