Hariton Efstathiades, a PhD candidate in the Department of Computer Science of the University of Cyprus, received the Best Student Paper Award for his work on Twitter graph analysis, entitled “Online Social Network Evolution: Revisiting the Twitter Graph”. The paper was presented at the 2016 International Conference on Big Data, which took place in Washington, DC, USA, on the 7th of December 2016. It was co-authored with Demetris Antoniades, George Pallis and Marios D. Dikaiakos from the University of Cyprus, and Zoltán Szlávik and Robert-Jan Sips from IBM (Netherlands).
Twitter is one of the most popular Online Social Networks (OSNs) to date. It first appeared in 2006 and has been receiving growing attention ever since. As of 2015, the platform had more than 500 million users, of which 316 million were considered active, i.e. users who log into the service at least once a month. Twitter allows its users to publish short messages of up to 140 characters (including videos, pictures and URLs) in order to communicate their ideas, products and emotional state to their followers. Over the years Twitter has been used in a variety of situations, e.g. allowing protesters to communicate during the Arab Spring. The extensive usage of Twitter enables researchers to analyze the generated information for several applications, such as event detection, user location analysis, health care, recommendation and early warning systems, temporal trends and information diffusion.
Hariton’s work examines the Twitter network as it appeared in 2009 in the first comprehensive study of Twitter, “What is Twitter, a social network or a news media?” by Kwak et al., and re-collects the full characteristics of those users as of late 2015.
In total, the study by H. Efstathiades retrieved 34.66M users connected by 2.06B social connections. It performs a comprehensive comparison of the 2009 and 2015 social graph snapshots and presents results for various metrics of the social graph's topology.
Specifically, Hariton compared the two network snapshots and studied the distributions of followers and followings, the relation between followers and tweets, reciprocity, degrees of separation, connected components, and differences in newly created and removed edges. The results showed a denser network with increased reciprocity but lower connectivity, as shown by the decrease in the size of the network’s largest strongly connected component. The average shortest path of the network also decreased slightly, to 4.05 hops. Hariton then examined the influential users of the network, as defined by their number of followers and by PageRank. The results revealed a significant change in this set of users between the two years. Having access to the entire 2009 Twittersphere, he identified the users who no longer belong to it and investigated the reasons behind their disappearance. He grouped the removed users based on the reason they left the network and presented a detailed comparison of their topological characteristics. The study showed that removed users differed significantly from the remaining users in their degree distributions, their participation in Weakly and Strongly Connected Components (WCC and SCC), and their influence in the social graph as measured by PageRank rankings. The results suggest that users who were banned from Twitter had different degree distributions than the other categories, and that their participation in the WCC and SCC was much lower than that of the rest of the users.
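For readers unfamiliar with these metrics, the following is a minimal sketch of how reciprocity, the largest strongly connected component and PageRank can be computed on a small directed graph. It is not the code used in the study: the four-node graph, the function names and the parameter values (damping factor 0.85, 50 iterations) are hypothetical, and an edge (u, v) is assumed to mean "u follows v".

```python
from collections import defaultdict

# Hypothetical toy follower graph; an edge (u, v) is assumed to mean "u follows v".
NODES = ["a", "b", "c", "d"]
EDGES = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "a"), ("d", "a")]


def reciprocity(edges):
    """Fraction of directed edges whose reverse edge also exists."""
    edge_set = set(edges)
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def largest_scc_size(nodes, edges):
    """Size of the largest strongly connected component (iterative Kosaraju)."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # First pass: record nodes in order of DFS finishing time.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Second pass: explore the reversed graph in decreasing finishing time.
    seen.clear()
    best = 0
    for start in reversed(order):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in rev[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; rank flows from a follower to the followed user."""
    nodes = list(nodes)
    n = len(nodes)
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out[u])
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in out[u]:
                new[v] += damping * rank[u] / len(out[u])
        rank = new
    return rank
```

On the toy graph, `reciprocity(EDGES)` counts the mutual pair a/b, `largest_scc_size(NODES, EDGES)` finds the cycle a, b, c, and `pagerank(NODES, EDGES)` ranks the most-followed user "a" highest. A graph with billions of edges, as in the study, would require a distributed or out-of-core implementation rather than in-memory dictionaries.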
To the best of the team’s knowledge, this work is the first quantitative study of the entire Twittersphere that compares the evolution of the network at such a large scale. Hariton’s work also introduces the study of removed users, grouping them into different categories and investigating their position in the social graph before their disappearance.