Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage

Alexander Filitz
Zürich, Switzerland
Student ID: 07-713-928

Supervisors: Guilherme Sperb Machado, Prof. Dr. Burkhard Stiller
Date of Submission: January 17, 2014

University of Zurich
Department of Informatics (IFI)
Binzmühlestrasse 14, CH-8050 Zurich, Switzerland

Master Thesis, Communication Systems Group, Prof. Dr. Burkhard Stiller


Master Thesis
Communication Systems Group (CSG)
Department of Informatics (IFI)
University of Zurich
Binzmühlestrasse 14, CH-8050 Zürich, Switzerland
URL: http://www.csg.uzh.ch/


Abstract

Online Social Networks (OSNs) have gained great interest from people all around the world, and OSN interactions have become a trend in people’s everyday life. Many applications have adapted to this trend and integrated OSN capabilities into the core of their solutions, which enables powerful customization of the user experience by exploiting the social graph of an OSN. This thesis designs, implements, and evaluates OSN capabilities for a Platform-Independent Cloud Storage System for Multi-Usage (PiCsMu). At present, PiCsMu users cannot trust other PiCsMu users’ identities when sharing files, nor can they expand their social graph. Therefore, OSN social capabilities are required to bridge these gaps. In particular, two social recommendation methods (interaction-based and location-based) were designed, calibrated, and evaluated in an exploratory study to overcome this problem. The social recommendation methods differ from related work in that they are designed and implemented for a general OSN data model, which makes them applicable to different OSNs, and in that they rely on both public and private user data. The results of the exploratory study show that the interaction-based social recommendation method is able to recommend, on average, 2 out of the 5 OSN friends that the OSN user also perceives as the ones he/she interacts with most. For the location-based social recommendation method, the results show that it is possible to recommend, on average, 1.3 out of the 5 OSN friends that the OSN user also perceives as the ones he/she is geographically closest to.


Zusammenfassung (Summary)

Internet-based social networks have attracted the interest of many people around the world, and interactions within them have become an integral part of everyday life. Many applications have followed this trend and integrate the capabilities of Internet-based social networks, which enable a tailored user experience by exploring the social graph of such a network. This master thesis designs, implements, and evaluates capabilities of Internet-based social networks for PiCsMu, a novel cloud storage application. In particular, two social recommendation methods (based on interactions and on locations) are designed, calibrated, and evaluated. Compared with other social recommendation methods, these methods differ in that they were not designed for one specific social network only but rely on a general data model, which makes it possible to apply them to arbitrary social networks. The results of the evaluation show that, on average, 2 out of 5 friends in the social network are recommended whom the user also perceives as those interacting with him/her the most. The results also show that, on average, 1.3 out of 5 friends in the social network are recommended whom the user perceives as the geographically closest.


Acknowledgments

First of all, I would like to thank Professor Dr. Burkhard Stiller and the Communication Systems Group, who gave me the opportunity to work on this interesting topic and to be part of their research. It allowed me to become familiar with the interesting world of online social networks, current research, and leading-edge technology. Many thanks to my supervisor Guilherme Sperb Machado, who provided me with excellent support in all areas of this topic and who always guided me in the right direction. I would also like to thank my family and especially my parents, Cornelia and Gerhard Filitz, who kept me motivated until the end of the thesis.


Contents

Abstract
Zusammenfassung (Summary)
Acknowledgments

1 Introduction
  1.1 Description of Work and Thesis Goals
  1.2 Thesis Outline

2 Terminology and Related Work
  2.1 Terminology
  2.2 Online Social Networks
  2.3 Social Recommendation
  2.4 SocIoS Project

3 Design
  3.1 System Overview
  3.2 OSN Authentication
  3.3 OSN Meta API
  3.4 PiCsMu Social Network
  3.5 Social Recommendation
    3.5.1 Interaction-Based Recommendation
    3.5.2 Location-Based Recommendation

4 Implementation
  4.1 Technology Overview
  4.2 PiCsMu Communication
  4.3 OSN Meta API
  4.4 PiCsMu Social Interface
  4.5 Asynchronous Communication
  4.6 Parallel Computing

5 Evaluation
  5.1 Method
  5.2 Data Collection
    5.2.1 Survey
    5.2.2 OSN Data
  5.3 Method Calibration
  5.4 Result Evaluation
  5.5 Threats to Validity

6 Summary and Conclusions

Bibliography
Abbreviations
Glossary
List of Figures
List of Tables
List of Listings

A Installation Guidelines
B Contents of the CD


Chapter 1

Introduction

The emergence and rapid growth of OSNs over the past ten years has changed the way information spreads on the Internet in a major way. Online social interactions in OSNs have become a daily part of people’s professional and private lives. The recent Initial Public Offerings (IPOs) of Facebook [9, 5] and Twitter [33, 6] have demonstrated the importance and market value (multiple billions) of these OSNs at the present time. The core of each OSN lies in its social graph, which connects users with other users and items. Therefore, each OSN constantly aims to deepen and enlarge its social graph (cf. Section 2.2).

Through the adoption of OSNs, user-generated content is spreading and being consumed at a higher rate by different individuals around the globe. Attracted by this spreading and consumption of content, several applications and systems have adopted the OSN trend by integrating OSN capabilities into the core of their solutions. For example, the Instagram application [13] started with an integration of Facebook and Twitter, which enabled it to advertise its brand, spread content to these OSNs, and build a collaborative system. Thus, OSN capabilities attached to an application enable a powerful customization of the user experience by exploiting the social graph of an OSN. In general, an application or system can benefit from OSN capabilities as follows: (1) expand the user base (i.e., attract more users to the application/system and ultimately build its own social graph), (2) build trust (i.e., credibility among users), (3) enable interactions among users (e.g., share information, send direct messages), and (4) enable social recommendation (e.g., help to establish new relationships among users based on interactivity, locations, age, interests, etc.).

This thesis designs, implements, and evaluates OSN capabilities for PiCsMu [20], which is a hybrid storage application developed at the Communication Systems Group (CSG), University of Zurich (UZH). PiCsMu introduces a new file-sharing approach based on a secure cloud storage overlay. It combines the benefits of centralized and decentralized services into one application. The centralized part consists of several services such as file encoding, file upload/download to/from cloud providers, cryptography, and an identity provider. The decentralized part consists of a Peer-to-Peer (P2P) network, which stores index information and also enables sharing of files among users of the system. A PiCsMu user can choose between two sharing methods: private sharing and public sharing. Private sharing means that the file will only be accessible by a specific PiCsMu user, whereas in public sharing the file is accessible by all users of the PiCsMu system [53].

In the current state of PiCsMu, users can only share files privately with other PiCsMu users if they know the exact PiCsMu identity (username) to share with. Even though usernames are unique in the PiCsMu system, PiCsMu users have to rely on an external medium to get to know new or other PiCsMu users, since PiCsMu does not provide means to search for users (by design, due to anonymity). Moreover, when sharing content with specific PiCsMu users, there is no guarantee that PiCsMu users really are who they claim to be, since anyone can create a PiCsMu identity without providing real-life information. In summary, PiCsMu users cannot trust other PiCsMu users’ identities when receiving a share notification, nor can they expand their social graph. Therefore, OSN social capabilities are required to bridge these gaps.

As indicated by related work (cf. Section 2.3), social recommendation methods are able to determine the most trusted friends of an OSN user. Thus, this thesis focuses on building OSN capabilities by means of social recommendation, with the goal of enhancing the user experience of PiCsMu users. Social recommendation for PiCsMu not only enables trust among users, but also helps the PiCsMu system to increase its user base and consequently to enlarge the social graph of each PiCsMu user.

In particular, two social recommendation methods (interaction-based and location-based) were designed, calibrated, and evaluated by means of an exploratory study. The work in this thesis differs from related work (cf. Chapter 2) as follows. The social recommendation methods developed in this thesis are based on a general OSN data model and meta API, which makes them applicable to different OSNs. Related work, in contrast, focuses only on specific OSNs using particular social recommendation methods. In addition, the social recommendation methods developed in this thesis do not include user content in the recommendation calculation, which differs from most related work, which includes user content and its semantic meaning. Also, related work is based only on public OSN data, while this thesis is based on private and public OSN data. Lastly, the evaluation performed in this thesis is based on a data set from real OSN users, who agreed to provide private and public user data.

An exploratory study was conducted in order to calibrate and evaluate the social recommendation methods. The study consists of three parts: first, the data set is obtained with a web-based survey; second, the two social recommendation methods are calibrated on a random half of the collected data set; third, the calibrated social recommendation methods are evaluated against the remaining half of the data set.
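The study procedure can be illustrated with a minimal sketch. It assumes that each survey record pairs a participant's OSN data with the list of friends the participant named; the function names (split_halves, hits_at_k, mean_hits) and the record layout are hypothetical and are not taken from the thesis implementation.

```python
import random


def split_halves(participants, seed=0):
    """Shuffle a copy of the participants and split it into two halves."""
    shuffled = list(participants)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    # First half: calibration set; second half: evaluation set.
    return shuffled[:mid], shuffled[mid:]


def hits_at_k(recommended, perceived, k=5):
    """Count how many of the top-k recommended friends the user also named."""
    return len(set(recommended[:k]) & set(perceived[:k]))


def mean_hits(records, recommend, k=5):
    """Average hits@k over records of the form (osn_data, perceived_friends)."""
    scores = [hits_at_k(recommend(data), perceived, k)
              for data, perceived in records]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical usage: ten survey participants split into two random halves.
participants = [f"p{i}" for i in range(10)]
calibration, evaluation = split_halves(participants, seed=42)
```

Under these assumptions, a result such as "2 out of 5" corresponds to mean_hits returning 2.0 for k=5 on the evaluation half.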


1.1 Description of Work and Thesis Goals

This thesis encompasses the design, implementation, and integration of the open-source software that makes up the PiCsMu OSN capabilities. The components that make up the OSN capabilities introduced to PiCsMu are the following: (1) authentication to existing OSNs, (2) OSN meta API, (3) PiCsMu social network, and (4) social recommendation. Components (1), (2), and (3) are mandatory steps to enable the social recommendation. The methods for the social recommendation are designed, calibrated, and evaluated.

The contributions of this thesis can be summarized as follows:

• Design, implementation, and integration of OSN authentication for PiCsMu.

• Design, implementation, and integration of a general OSN data model for PiCsMu.

• Design, implementation, and integration of an OSN meta API.

• Design, implementation, and integration of PiCsMu’s own social network.

• Design, implementation, calibration, and evaluation of social recommendation methods.

1.2 Thesis Outline

The remainder of this thesis is structured as follows. Chapter 2 presents the terminology and related work in the field of OSNs and social recommendation. Chapter 3 provides an overview of the extended PiCsMu system and the detailed design of each newly introduced component. Chapter 4 covers the technical side and explains the key concepts of the implementation. Chapter 5 evaluates and discusses the results obtained from the calibration of the social recommendation methods introduced by this thesis. Finally, Chapter 6 summarizes and concludes the outcome and presents future work.


Chapter 2

Terminology and Related Work

This chapter clarifies the definitions of important terms used throughout the thesis. Related work is presented in the field of OSNs and social recommendation, and Section 2.4 discusses a research project that has built a prototype with similar OSN capabilities integrated.

2.1 Terminology

A user is an entity that is authenticated and authorized to use a certain application or service, which in the scope of this thesis is the PiCsMu system or any OSN.

A friend is the counterpart in a positively connoted social relationship of a human being. In this thesis, the term friend is used solely in the scope of an OSN. A friend is also a user. In contrast to real-life friends, it is assumed that OSN friends are less important to an OSN user, because usually only one click is necessary to establish a friend relationship between two OSN users, which is clearly a much more complicated process in real life.

Trust can be defined in different ways. This thesis follows the definition proposed by Xiu-Quan et al. [58], stating that trust is generated between people by two main factors: familiarity and similarity. It also follows the definition used by Podobnik et al. [56], stating that trust is the expectancy of a user to be able to rely on recommendations of his/her friends within an OSN. Chao et al. [42] showed that trust can be measured by the intimacy of a user A to a user B inside an OSN. The intimacy is uni-directional, i.e., user A can trust user B more than user B trusts user A.

The term social graph refers to the mathematical graph model behind each OSN. Users and items are represented as nodes, while the relationships between users and items are represented as edges.
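As an illustration of this definition, the following minimal sketch models a social graph with typed nodes (users and items) and labeled edges. The class name SocialGraph, the relation labels, and the sample identifiers are assumptions made for this example only; they do not describe the data model of any particular OSN or of PiCsMu.

```python
from collections import defaultdict


class SocialGraph:
    """Toy social graph: nodes are users or items, edges are labeled relations."""

    def __init__(self):
        self.nodes = {}                # node id -> kind ("user" or "item")
        self.edges = defaultdict(set)  # node id -> set of (relation, target id)

    def add_node(self, node_id, kind):
        self.nodes[node_id] = kind

    def add_edge(self, source, relation, target, mutual=False):
        # Edges are directed; a mutual relation (e.g., friendship) adds both directions.
        self.edges[source].add((relation, target))
        if mutual:
            self.edges[target].add((relation, source))

    def neighbors(self, node_id, relation):
        """Return the sorted targets reachable from node_id via the given relation."""
        return sorted(t for r, t in self.edges.get(node_id, ()) if r == relation)


# Hypothetical sample data.
graph = SocialGraph()
graph.add_node("alice", "user")
graph.add_node("bob", "user")
graph.add_node("photo1", "item")
graph.add_edge("alice", "friend", "bob", mutual=True)
graph.add_edge("bob", "likes", "photo1")
```

Directed edges are used here because, as noted above, measures such as intimacy are uni-directional; a mutual friendship is simply stored as two directed edges.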


2.2 Online Social Networks

In the past ten years many OSNs have emerged on the Internet. They focus on different areas, e.g., professional relationships, personal relationships, pictures, videos, and short messages (tweets) [48]. In terms of Social Network Analysis (SNA), existing OSNs show small-world characteristics such as power-law scaling and high clustering coefficients [39, 37]. There are also decentralized OSN solutions based on P2P technology. Their focus is mainly on different aspects, e.g., security and privacy of data, Distributed Hash Table (DHT) routing, and physical trust measurement [45, 40, 51].

Table 2.1 lists some of the most important OSNs today, based on total registered and monthly active users. The numerical values in the table are estimated averages of the figures collected from different websites [10, 27, 19, 35, 29, 30, 16, 14]. Currently, Facebook is the largest OSN in terms of total registered and monthly active users. Like other OSNs, Facebook offers a multitude of services to its users. The focus lies on personal relationships, and there are many different ways for a user to interact with other users. For example, users can post something on another user’s profile; they can like, comment on, and share posts; and they can send private messages to other users. Google+ [12] and Badoo [4] also focus on personal relationships and are structured similarly to Facebook. Twitter, the third largest OSN, focuses on short messages, so-called “tweets”, and became popular through the tweets of famous artists and politicians around the world. Tumblr [31] focuses on user blogs, which can be easily created and managed by its users. Instagram [13] is about pictures; its specialty is the way a user can create and edit pictures with just a few clicks. LinkedIn [15] is the largest OSN focusing on professional relationships.

Third-party applications embedded in OSNs have gained a lot of interest from OSN users, e.g., applications related to games, media streaming, and newspapers. First, an OSN user has to authorize and grant access to the third-party application; then, the user is able to use the third-party application inside or outside the OSN. However, depending on the granted permissions, third-party applications can have full access to the user’s OSN data, which is the subject of controversial discussions regarding user privacy and data protection. These third-party applications use the Application Programming Interface (API) of an OSN to query and retrieve data from it. OSN data is often separated into public and private data. Public data is visible to anyone, while private data is visible only to the OSN user.

Online Social Network

                      Facebook  Twitter         Google+   Tumblr  LinkedIn      Badoo     Instagram
Total Users           ~1200     ~800            ~1000     ~260    ~270          ~170      N/A
Monthly Active Users  ~1000+    ~280            ~550      ~60     ~230          N/A       ~150
Public API            ✓         ✓               ✓         ✓       ✓             ✗         ✓
Focus                 personal  short messages  personal  blogs   professional  personal  pictures

Table 2.1: A comparison of OSNs. The numerical values are given in millions.


2.3 Social Recommendation

Recommendation of an item reduces the cost of finding information for the user and may attract new users to an item in a system or to the system itself. Commercial websites such as Amazon or eBay make substantial use of recommendation systems to attract more customers to an item. Zhang et al. [61] group recommendation methods into the following categories: content-based [38], collaborative filtering [50][44], clustering model [41], graph model [36], and association rule graph [52]. Social recommendation is a new category in the field of recommendation methods and aims to recommend OSN items and users to an OSN user [56].

With the emergence of OSNs, user-generated content is more available than ever, and social recommendation is an important piece when building a larger social graph. Traditional recommendation methods have only one type of information as input to the recommendation, e.g., a laptop bought by user A on eBay is the input for the recommendation for user A. In contrast, social recommendation methods have to deal with heterogeneous information. The heterogeneity stems from the different elements that current OSNs introduce, e.g., users, pages, likes, posts, tags, locations, and the complex relationships between all of them [54]. The heterogeneous information in OSNs raises new challenges in social recommendation and is therefore a current topic of research. The research questions of social recommendation are:

(i) How to measure and identify the most important friends of a user inside an OSN?

(ii) How to measure and identify other users inside an OSN with whom the user is not yet connected?

The former question is related to the scenario of trust measurement among OSN friends, and the latter is related to friend recommendation. The main scenario for social recommendation is friend recommendation, where the recommendation aims to present new possible friends to an OSN user. Friend recommendation is important for both the user and the OSN itself. The more connections a user has in the social graph, the more items OSN services are able to display to the user. Therefore, the user is more likely to visit the OSN again and use its services. The main goal of OSNs is to densify and enlarge their social graph [55].
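To make the friend recommendation scenario concrete, the sketch below ranks users who are not yet connected to a given user by the number of mutual friends. Counting mutual friends is a common baseline chosen here purely for illustration; it is not one of the methods developed in this thesis, and the function name and sample data are hypothetical.

```python
from collections import Counter


def recommend_by_mutual_friends(friends, user, top_n=3):
    """Rank non-friends of `user` by how many friends they share with `user`.

    `friends` maps each user id to the set of that user's friend ids.
    """
    direct = friends.get(user, set())
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the user and anyone the user is already connected with.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    # Sort by descending mutual-friend count, then by id for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [candidate for candidate, _ in ranked[:top_n]]


# Hypothetical friendship data.
friendships = {
    "a": {"b", "c"},
    "b": {"a", "d", "e"},
    "c": {"a", "d"},
    "d": {"b", "c"},
    "e": {"b"},
}
suggestions = recommend_by_mutual_friends(friendships, "a")
```

In this sample, "d" shares two friends with "a" and "e" shares one, so "d" is ranked ahead of "e".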

Comparison of Social Recommendation

In current research the two scenarios (trust and friend recommendation) overlap, because friend recommendation is mostly based on trust measurements. Therefore, this comparison considers work related to the friend recommendation scenario.

A concrete approach was developed by Naruchitparames et al. [54]. They combined a network topology recommendation with genetic algorithms to recommend friends and showed that the combination performs better than each method alone. Yang et al. [59] introduced an improved algorithm for personalized collaborative filtering in social networks and showed that it yields a higher quality of recommendation than a traditional collaborative filtering approach. Chao et al. [42] introduced the concept of user intimacy as user interaction and created a hybrid recommendation method that combines user interaction, content-based, and collaborative filtering. Another friend recommendation method based on social interaction was developed by Nia et al. [55], who identified different types of interaction with different weights. Their results show that Facebook users like an item (e.g., posts, videos, pictures) five times more often than they comment on it. Zhang et al. [61] treat the recommendation as a ranking problem. Using a random walk method and a pair-wise learning algorithm, their experimental results show improvements in recommendation compared to the baseline methods. Chu et al. [43] built a hybrid recommendation that combines similar interests of OSN users with location similarity. Yu et al. [60] also base their friend recommendation on location data and social network topology. In order to crawl an unbiased sample of OSN data, Gjoka et al. [49] showed that the random walk method delivers representative (unbiased) samples of Facebook users. This method is state of the art in crawling OSN data and is used in most of the related work mentioned above.
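The idea of weighting different interaction types, as in the interaction-based work cited above, can be sketched as follows. The weights are hypothetical placeholders chosen only for this example (a rarer interaction type is assumed to count more); they are not the values used by Nia et al. or the calibrated values of this thesis.

```python
from collections import defaultdict

# Hypothetical weights per interaction type (assumption for illustration only).
EXAMPLE_WEIGHTS = {"like": 1.0, "comment": 5.0, "message": 8.0}


def rank_friends(interactions, weights=EXAMPLE_WEIGHTS, top_n=5):
    """Rank friends by the weighted sum of their interactions with the user.

    `interactions` is an iterable of (friend_id, interaction_type) pairs.
    Interaction types without a configured weight contribute nothing.
    """
    scores = defaultdict(float)
    for friend, kind in interactions:
        scores[friend] += weights.get(kind, 0.0)
    # Highest score first; ties are broken by friend id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [friend for friend, _ in ranked[:top_n]]


# Hypothetical interaction log of one OSN user.
log = [
    ("bob", "like"), ("bob", "like"), ("bob", "like"),
    ("carol", "comment"),
    ("dave", "message"),
]
top_friends = rank_friends(log)
```

With these placeholder weights, three likes from "bob" (score 3.0) rank below one comment from "carol" (5.0) and one message from "dave" (8.0); calibrating such weights against survey data is what makes the ranking meaningful.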

Table 2.2 compares related work on social recommendation. Method refers to the recommendation method used, either one of the traditional recommendation methods or a new approach. A check mark in the Hybrid row indicates that a hybrid approach was used, i.e., one that combines two or more different recommendation methods. A check mark in the Interaction or Location row indicates that interaction or location data was used as input to the recommendation method. Data Set indicates the privacy level of the data used; public refers to public OSN user data. As far as indicated in the compared related work, none of them used a data set consisting of both public and private OSN user data.

Social Recommendation

             Naruchitparames  Yang et al.    Chao et al.     Nia et al.  Zhang et al.  Chu et al.          Yu et al.
             et al.
Method       genetic          collaborative  content-based,  ranking     random walk,  affinity matrices,  random walk,
             algorithms       filtering      collaborative               pair-wise     interest            ranking
                                             filtering                   learning      similarity
Hybrid       ✗                ✗              ✓               ✗           ✗             ✓                   ✓
Interaction  ✗                ✗              ✓               ✓           ✗             ✗                   ✗
Location     ✗                ✗              ✗               ✗           ✗             ✓                   ✓
Data Set     public           public         public          public      public        public              public
Published    2011             2010           2012            2013        2008          2013                2011

Table 2.2: A comparison of social recommendation.


2.4 SocIoS Project

The SocIoS Project [26] started in September 2010 and lasted until February 2013. Its slogan was “Exploiting social networks for building the future internet of services”. By exploiting social networks, the project refers to user-created context, the social graph, and a framework that allows these two to be combined from different OSNs into a workflow that presents media items based on a search query. The project consortium consisted of a mix of research, industry, and media partners, led by the Institute of Communication and Computer Systems of the National Technical University of Athens. This project is mentioned as related work because its system architecture and design are similar to the work done in the scope of this thesis. Figure 2.1 shows the SocIoS framework with its core components: the Core API, the Auxiliary Services, and the Front-End. The Core API is responsible for the communication with the OSN and is based on a general OSN model that enables the same functionality at the method level independent of the underlying OSN. The Auxiliary Services consist of several modules that compute more complex tasks, such as recommendations or media item rankings. The Front-End displays the results of each workflow to the users of the SocIoS framework. The code of the SocIoS project is open source and can be accessed on GitHub [25].

Figure 2.1: The SocIoS system architecture.
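The idea of a general OSN model that offers the same methods regardless of the underlying OSN can be sketched with a small adapter interface. This is an illustrative assumption only: the names OsnUser, OsnAdapter, InMemoryAdapter, and all_friends are hypothetical and reflect neither the SocIoS Core API nor the PiCsMu OSN meta API, and the in-memory adapter stands in for real OSN API calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class OsnUser:
    """OSN-independent representation of a user."""
    osn: str
    user_id: str
    name: str


class OsnAdapter(ABC):
    """Common interface that every OSN-specific adapter implements."""

    osn_name: str

    @abstractmethod
    def get_friends(self, user_id: str) -> List[OsnUser]:
        """Return the friends of the given user in the general model."""


class InMemoryAdapter(OsnAdapter):
    """Stand-in adapter backed by a dictionary instead of a real OSN API."""

    def __init__(self, osn_name: str, friend_table: Dict[str, List[Tuple[str, str]]]):
        self.osn_name = osn_name
        self.friend_table = friend_table

    def get_friends(self, user_id: str) -> List[OsnUser]:
        return [OsnUser(self.osn_name, fid, name)
                for fid, name in self.friend_table.get(user_id, [])]


def all_friends(adapters: List[OsnAdapter], user_ids: Dict[str, str]) -> List[OsnUser]:
    """Collect friends across OSNs through the same method call on each adapter."""
    result: List[OsnUser] = []
    for adapter in adapters:
        user_id = user_ids.get(adapter.osn_name)
        if user_id is not None:
            result.extend(adapter.get_friends(user_id))
    return result
```

Code that consumes OsnUser objects, such as a recommendation method, then does not need to know which OSN the data came from.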


Chapter 3

Design

This chapter explains the OSN capabilities introduced to the PiCsMu system. The general concept and design of each new capability are explained in detail. The chapter starts with a high-level system overview and then explains each element in a top-down approach.

3.1 System Overview

The existing PiCsMu system is built in a highly modularized and distributed way. In the scope of this thesis the system is extended with new distributed services and additional components for the main application. The following components and services make up the PiCsMu system:

Existing

• PiCsMu Application

• P2P Network

• Cloud Services

• PiCsMu Identity Provider

Newly introduced

• OSN Meta API

• Social Recommendation

• PiCsMu Social Network

• OSN Authentication

• Online Social Network

The existing system consists of a desktop application, which is the entry point for each PiCsMu user; a centralized identity provider, which takes care of user authentication; and a P2P overlay, which is formed by all running PiCsMu desktop applications. The P2P overlay stores encrypted information about shared files and also enables a search functionality for publicly shared files. The newly introduced components and services are explained in the subsequent sections of this chapter. Figure 3.1 shows the high-level system overview of the PiCsMu system with the existing and the newly introduced components and services.


[Figure 3.1 shows the following entities: PiCsMu User, OSN User, PiCsMu Application, PiCsMu Core, PiCsMu Social, PiCsMu P2P Network, Cloud Services, PiCsMu Identity Provider, OSN Authentication, Online Social Network, PiCsMu Social Network, OSN Meta API, and Social Recommendation.]

Figure 3.1: The PiCsMu system overview.

3.2 OSN Authentication

The first step toward building OSN capabilities into the PiCsMu system is a service that handles the process of OSN authentication for a specific OSN user. As mentioned in Section 2.2, most OSNs have API endpoints, which can be queried for information. These API endpoints are accessible only to authorized OSN applications. Such applications can also act on behalf of an OSN user and query private user data, if the user has granted the required permissions. The OSN authentication flow for OSN applications is based on the OAuth 2.0/1.0 protocol, standardized in RFCs 6749 and 5849 [24, 23].

In order to authenticate an OSN application with the OAuth protocol, three things are necessary: first, the OSN application credentials, which consist of an identification number and a secret; second, a human being (in this case, the OSN user) who grants permission to the application in an Internet browser; third, a callback URL, which is called by the OSN after the authentication flow is invoked. The result of a successful OAuth authentication flow is a token generated by the OSN, which needs to be passed in each API call so that the OSN knows that the application is authorized to use the endpoint. The tokens also have an expiration time. For Facebook and Google+, these tokens are initially short-lived and expire after a number of hours, but it is possible to generate, out of the short-lived token, a longer-lived token that is valid for many days. Twitter has only one type of token, which expires only if the OSN user removes the permission for the application. To enable PiCsMu to integrate OSN authentication, three OSN applications were created, one for each OSN: Facebook, Twitter, and Google+. The configuration and management of each application is done in the OSN web interface itself, e.g., editing the callback URL or changing the application secret.
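The different token lifetimes described above can be illustrated with a small value class. The following is a minimal sketch under stated assumptions, not the JSocialLib implementation: the class name TokenSketch and its methods are hypothetical, and a token without expiration (as described for Twitter) is modeled with a null expiry.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical illustration of an OAuth access token with an optional expiry. */
final class TokenSketch {
    private final String value;
    private final Instant expiresAt; // null means the token has no expiration time

    private TokenSketch(String value, Instant expiresAt) {
        this.value = value;
        this.expiresAt = expiresAt;
    }

    /** A token that expires after the given lifetime (short- or long-lived). */
    static TokenSketch expiring(String value, Instant issuedAt, Duration lifetime) {
        return new TokenSketch(value, issuedAt.plus(lifetime));
    }

    /** A token that stays valid until the user revokes the permission. */
    static TokenSketch nonExpiring(String value) {
        return new TokenSketch(value, null);
    }

    /** True if the token has an expiry and that moment has been reached. */
    boolean isExpired(Instant now) {
        return expiresAt != null && !now.isBefore(expiresAt);
    }

    String value() {
        return value;
    }
}
```

A caller would check isExpired before each API call and restart the authentication flow (or exchange the token) when it returns true.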

As explained, the authentication process involves several steps, including human interaction. Since PiCsMu is a desktop application, a browser needs to be opened for the user, in which he/she completes the authentication process. This call is handled asynchronously, such that the user is still able to use the PiCsMu desktop application in the meantime. The browser is pointed to the OSN authentication server, and after completion of the authentication process, the token and additional information are forwarded to the PiCsMu social network server. The PiCsMu desktop application checks in a separate loop whether the data has been stored or an error occurred. Figure 3.2 shows the communication flow of the authentication process with a specific OSN, Facebook, and the PiCsMu high-level entities involved. This design leaves the opportunity for a PiCsMu user to exchange the OSN authentication server with a third-party service, e.g., oauth.io [18].
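The separate check loop of the desktop application can be sketched as a bounded polling routine. This is an illustrative sketch only: the class CompletionPoller, the attempt limit, and the delay are assumptions, and the BooleanSupplier stands in for the actual request to the PiCsMu social network server.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper for the "check if the process is completed" loop. */
final class CompletionPoller {

    /**
     * Calls isDone up to maxAttempts times and sleeps delayMillis between attempts.
     * Returns true as soon as isDone reports completion, false on timeout or interrupt.
     */
    static boolean waitUntilDone(BooleanSupplier isDone, int maxAttempts, long delayMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return false;
    }
}
```

Running this routine on a background thread keeps the user interface responsive while the user completes the login in the browser.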

[Figure 3.2 (sequence diagram) involves the PiCsMu Desktop Application, PiCsMu OSN Authentication, Facebook, and the PiCsMu Social Network. The messages shown are: Open Browser, Login Page, User Credentials, App Permissions, Token, and Persist Data, together with a loop labeled “Check if the process is completed”.]

Figure 3.2: The OSN authentication communication flow.


3.3 OSN Meta API

In order to integrate social capabilities into PiCsMu, a piece of software is required that handles the communication and requests to and from the OSN after the OSN authentication. In this thesis, an OSN meta API library called JSocialLib is developed. JSocialLib is referred to as a meta API since it is built on top of existing libraries that communicate with OSNs. The social capabilities of the PiCsMu system are designed to be as independent as possible of the underlying OSN used by the user. Therefore, a common data model is needed that supports most existing OSNs while being flexible and extensible enough to adopt new concepts from different OSNs. The data model forms the basis of JSocialLib and enables the abstraction of a specific OSN to a more general concept. In addition to this data model, an API interface is defined with methods to gather and analyze data from an OSN. The three currently largest OSNs, Facebook, Twitter, and Google+, are considered in the design of the data model and meta API. The data model supports the most common user attributes among these three OSNs. Table 3.1 summarizes the user attributes of the meta API data model and shows whether each attribute is supported by Facebook, Twitter, or Google+.

OSN Meta API         Online Social Network
attribute            Facebook   Twitter   Google+
age                  yes        no        yes
birthday             yes        no        yes
current location     yes        no        yes
emails               yes        no        yes
first name           yes        yes       yes
full name            yes        yes       yes
gender               yes        no        yes
hometown             yes        no        no
id                   yes        yes       yes
languages            yes        yes       yes
last name            yes        yes       yes
timezone             yes        yes       no
username             yes        yes       yes

(yes = supported, no = not supported)

Table 3.1: The OSN meta API user attributes.

Figure 3.3 shows the detailed UML diagram of the OSN Meta API. The right-hand side shows the data model and the left-hand side shows the API. The data model is built around the representation of an OSN user, called OSNIdentity. The API is integrated in the representation of the OSN itself, called OSNProvider. Each OSNProvider always has exactly one OSNIdentity, the authenticated OSN user.


Credentials

There are two types of credentials in the data model: OSNUserCredentials and OSNAppCredentials. An OSNIdentity always has one OSNUserCredentials attached, which holds the token (cf. Section 3.2) and, depending on the underlying OSN, some additional information. An OSNProvider, on the other hand, has zero or one OSNAppCredentials attached, which consists of the application ID and secret (cf. Section 3.2).

Content

The model also supports user-generated content on an OSN, represented as OSNContent. Since content created by users varies among different OSNs (e.g., videos, pictures, web links, text, and more), the decision was made that content is always stored in a textual representation, and it is up to the library user to interpret the data. The only refinement of OSNContent is the distinction between OSNPost and OSNMessage. The OSNPost refers to all the content that is publicly created by an OSNIdentity, e.g., a status update on Facebook, a post to a friend's wall on Facebook, or a tweet on Twitter. The OSNMessage refers to private direct messages, which are supported by most OSNs, from one OSN user to one or several other OSN users.

Relationships

A key characteristic of OSNs is the ability for their users to build social relationships with other users; the nomenclature used varies between OSNs. E.g., in Facebook a bidirectional relationship between two users is called a friend, while Twitter distinguishes between followers, the users that subscribe to the updates of a user, and followees, the users whose updates a user has subscribed to. In the model, a social relationship is represented as OSNRelation and points to an OSNIdentity. The type of the relationship (bidirectional, follower, or subscription) can also be specified. Bidirectional means that, using the nomenclature of Facebook, both users are friends with each other, or, using the nomenclature of Twitter, both users follow each other. Follower means that the OSNIdentity in the OSNRelation follows the user, and the type Subscription refers to the opposite direction, meaning that the OSNIdentity in the OSNRelation is being followed by the user.
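The three relationship types can be derived from two directed "follows" facts. The sketch below is illustrative and assumes a hypothetical enum RelationTypeSketch; only the three constant names are taken from the data model, while the factory method and its parameters are assumptions.

```java
import java.util.Optional;

/** Hypothetical counterpart of the relationship types in the data model. */
enum RelationTypeSketch {
    BIDIRECTIONAL, FOLLOWER, SUBSCRIPTION;

    /**
     * Derives the type of a relation between the user and another identity.
     * Returns an empty Optional if neither follows the other.
     */
    static Optional<RelationTypeSketch> of(boolean userFollowsOther, boolean otherFollowsUser) {
        if (userFollowsOther && otherFollowsUser) {
            return Optional.of(BIDIRECTIONAL); // e.g., Facebook friends
        }
        if (otherFollowsUser) {
            return Optional.of(FOLLOWER);      // the other identity follows the user
        }
        if (userFollowsOther) {
            return Optional.of(SUBSCRIPTION);  // the user follows the other identity
        }
        return Optional.empty();
    }
}
```

On Facebook, every friendship would map to BIDIRECTIONAL, whereas Twitter relations can produce all three types.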

Providers

Each provider handles the communication with a specific OSN and depends on the meta API interface called OSNProvider. The interface is designed to gather the relevant information and compute the analysis needed for the last part of the social capabilities: the social recommendation (cf. Section 3.5). The interface defines methods to gather information about relationships, identities, content, and locations. Table 3.2 summarizes the interface methods and their intended use.


Method                                  Category      Intended Use
countPrivMsg                            Relationship  Total number of private messages.
countRelations                          Relationship  Total number of relations.
getContentBetweenDate                   Content       Content (post and message) between a time-span.
getFemaleRelations                      Relationship  Female relations.
getIdentitiesWithSameCurrentLocation    Relationship  Relations living in the same location.
getIdentitiesInAgeRange                 Relationship  Relations in specified age range.
getIdentitiesNotInAgeRange              Relationship  Relations not in specified age range.
getIdentitiesOthersMostInteractToUser   Relationship  A ranking of relations based on the interaction of the relations to the user.
getIdentitiesUserMostInteractsTo        Relationship  A ranking of relations based on the interaction of the user to his/her relations.
getIdentitiesWithSameHometown           Relationship  Relations with the same hometown.
getIdentity                             Identity      A user relation with an exact match of the specified first-name and last-name.
getLocations                            Location      Most recent locations of the user.
getMaleRelations                        Relationship  Male relations.
getRelations                            Relationship  All user relations or by type.
getRelationsMadeAfter                   Relationship  Relations started after a specified date.
getRelationsMadeBefore                  Relationship  Relations started before a specified date.
getRelationsWithUnresolvedGender        Relationship  Relations that have no gender specified.
searchForIdentities                     Identity      Search the OSN for users with the specified name.

Table 3.2: The meta API interface methods and their intended use.

3.4 PiCsMu Social Network

The third part of the social capabilities for the PiCsMu system is a dedicated social network. Since the PiCsMu system already has the notion of PiCsMu identities (provided by a centralized identity provider), the social network must comply with the existing data model. The PiCsMu social network differs from typical OSNs and only supports basic concepts of social relationships: a PiCsMu user can add other PiCsMu users as friends, but adding a friend does not require an acknowledgment. Therefore, the PiCsMu system uses the follower relationship type. The decision was made to design the PiCsMu social network in a centralized manner, similar to the identity provider, but still decoupled from the main system and exchangeable at any time. Its purpose is to form the basis for all social capabilities of the PiCsMu system and to enable the system to be extended with even more social capabilities in the future. The main high-level tasks of the PiCsMu Social Network are summarized as follows:

• Match PiCsMu identities with OSN identities. Each PiCsMu user can connect his/her PiCsMu identity with his/her OSN identity. The matching keeps track of PiCsMu users who attached an OSN identity to their PiCsMu identity.

• Enable relationships among PiCsMu users. PiCsMu users can build relationships with other users, meaning they extend their PiCsMu social graph by becoming friends with each other. E.g., user Alice becomes friends with user Bob, and now Alice is able to share a file privately with Bob.

• List all relationships of a PiCsMu identity. This is a requirement for the user interface and eases the use of the PiCsMu application.

• Manage and store OSN information from the OSN authentication process (cf. Section 3.2).

• Enable social recommendation (cf. Section 3.5).
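The follower-style friendship, in which adding a friend needs no acknowledgment, can be sketched as a directed graph. This is a minimal in-memory illustration; the class FollowerGraphSketch and its method names are hypothetical, and the thesis implements this functionality as a centralized service rather than a local data structure.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of one-directional PiCsMu friendships. */
final class FollowerGraphSketch {
    // Maps a PiCsMu user ID to the IDs of the users he/she added as friends.
    private final Map<String, Set<String>> friends = new HashMap<>();

    /** Adds a friend without requiring any acknowledgment by the other user. */
    void addFriend(String userId, String friendId) {
        friends.computeIfAbsent(userId, k -> new LinkedHashSet<>()).add(friendId);
    }

    /** Lists all relationships of a PiCsMu identity in insertion order. */
    Set<String> friendsOf(String userId) {
        return Collections.unmodifiableSet(
                friends.getOrDefault(userId, Collections.<String>emptySet()));
    }
}
```

After addFriend("alice", "bob"), Bob appears in Alice's list, while Bob's own list stays unchanged.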

3.5 Social Recommendation

The fourth and last social capability introduced to the PiCsMu system is the ability to perform social recommendation. The main goal of social recommendation is to extend the social graph of the social network, as explained in Section 2.3. The work done in this thesis follows the same high-level goal, but with the focus on extending the social graph of each PiCsMu user in the social network of PiCsMu. Therefore, in contrast to related work, which only operates on the OSN itself, the OSN data is used to build social recommendation for the PiCsMu social network and not for an OSN. This thesis considers the following use cases for social recommendation:

(UC1) Recommend new friends to a PiCsMu user.

(UC2) Recommend the PiCsMu system to OSN friends of a PiCsMu user.

UC1’s goal is to recommend new possible friends to a PiCsMu user inside the social network of PiCsMu. In a first step, OSN user data of the corresponding OSN identity is gathered and analyzed. In a second step, the information produced in the first step is used to recommend new possible friends to the PiCsMu user. UC2 is similar to UC1, but in the opposite direction, where the PiCsMu system is recommended to OSN friends of the corresponding OSN identity. There are two requirements for UC1:

(RE1) A successfully completed OSN authentication of a PiCsMu user.

(RE2) At least one of the OSN friends of the corresponding OSN identity is using the PiCsMu system and also successfully completed the OSN authentication.

UC2 only needs RE1 to be satisfied. The following examples should clarify these use cases:

Use Case 1

Alice and Bob have known each other for a long time, but nowadays live far away from each other. Alice and Bob are also friends on Facebook to stay in contact from time to time. Bob is using PiCsMu to share files with other people he knows, and Alice started to use PiCsMu too, but does not know that Bob is currently using it. Both Alice and Bob have completed the OSN authentication with their Facebook identity, associating it with the PiCsMu system (RE1, RE2). Alice wants to share some private files with Bob, but does not trust Facebook as a medium. Social recommendation in PiCsMu then recommends Bob as a new PiCsMu friend to Alice, so that Alice is able to share her files privately with Bob.

Use Case 2

Alice is using Facebook and PiCsMu and associated her Facebook identity with the PiCsMu system (RE1). Unfortunately, none of her Facebook friends is using PiCsMu. Therefore, social recommendation can recommend the PiCsMu system to the Facebook friends of Alice.

A simple social recommendation approach would recommend in UC1 all of Alice's Facebook friends who satisfy RE2 as new PiCsMu friends. In UC2, a simple approach would recommend PiCsMu to all of Alice's Facebook friends. In this thesis, the idea is to refine this simplistic approach to arrive at a more robust and precise recommendation system. It is desirable to recommend only to or from a selected set of OSN friends in both use cases, meaning that only the most important or most trusted friends are considered in the recommendation.
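The simple approach for UC1 amounts to filtering the OSN friend list by the existence of a PiCsMu mapping (RE2). The sketch below is a minimal illustration of that baseline, not the refined method of this thesis; the class name, the method name, and the use of plain string IDs are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical baseline for UC1: recommend every OSN friend who uses PiCsMu. */
final class SimpleRecommendationSketch {

    /**
     * @param osnFriendIds  OSN user IDs of all friends of the authenticated user
     * @param osnToPicsmuId mapping from OSN user ID to PiCsMu user ID for users
     *                      who completed the OSN authentication (RE2)
     * @return PiCsMu user IDs to recommend, in the order of the friend list
     */
    static List<String> recommendFriends(List<String> osnFriendIds,
                                         Map<String, String> osnToPicsmuId) {
        List<String> recommended = new ArrayList<>();
        for (String osnId : osnFriendIds) {
            String picsmuId = osnToPicsmuId.get(osnId);
            if (picsmuId != null) {
                recommended.add(picsmuId);
            }
        }
        return recommended;
    }
}
```

The refined methods described next replace this unconditional filter with a ranking, so that only the highest-scoring friends are recommended.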

The social recommendation methods used in this thesis identify and measure the most trusted OSN friends of a PiCsMu user in a first step and rank them based on a set of criteria (trust scenario). In a second step, only a set of friends filtered by a set of criteria is used for the actual recommendation (friend scenario). Related work showed that trust among OSN friends can be measured by interaction. The designed social recommendation methods follow the same principle. Two different social recommendation methods were designed: interaction-based and location-based. The former measures intimacy by the interaction on different levels between two OSN friends. The latter follows a new approach that measures intimacy by the location proximity of two OSN friends.

It is expected that the recommendation results of the two methods reveal two different sets of trusted OSN friends. This assumption is based on the hypothesis that OSN users tend to interact more with OSN friends who are not their closest friends in real life, because, e.g., they have physical contact with their closest friends on a daily basis or interact with them through more direct media channels. The evaluation (cf. Chapter 5) investigates this hypothesis in more detail.

The final result of both methods is a weighted, undirected graph with the shape of a star. The central node is the OSN identity the social recommendation is made for, and the surrounding nodes represent the OSN friends of that identity. The weights of the edges are the calculated scores, and all the edges start or end in the central node. Figure 3.4 illustrates a fictitious final recommendation result. In this example, the nodes marked in green (B, D, I) achieved the highest scores and are selected for the final result.
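Since every edge of the star graph connects the central node with one friend, the graph can be represented as a map from friend to score, and the final selection becomes a top-k query. The following sketch illustrates this under stated assumptions: the class StarGraphTopKSketch is hypothetical, and ties are broken alphabetically, which the thesis does not specify.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical top-k selection over the edge weights of the star graph. */
final class StarGraphTopKSketch {

    /** Returns the k friends with the highest scores, best first. */
    static List<String> topK(Map<String, Double> scores, int k) {
        return scores.entrySet().stream()
                // Highest score first; equal scores are ordered by friend name (assumption).
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

With illustrative scores in which B, D, and I carry the three largest weights, topK(scores, 3) selects exactly these three nodes.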


[Figure 3.4 shows a star graph with the nodes A to I and the edge weights 250, 230, 150, 60, 55, 40, 30, and 15.]

Figure 3.4: The star graph recommendation result.

3.5.1 Interaction-Based Recommendation

In a first step, this method measures the interaction between an OSN user and his/her OSN friends in both directions. Interaction is classified into four categories, which are based on the OSN data model (cf. Section 3.3). Table 3.3 explains the four interaction categories in more detail.

Interaction Category             Explanation
Public Post (PP)                 A public post is any kind of public interaction that user A has with user B,
                                 e.g., A writing some text on B's user profile.
Public Post Replied (PPR)        An extension of the public post category. If user B replies to the public
                                 post by A, the interaction is counted in this separate category.
Private Message (PM)             A private message is a direct message from user A to user B, which can only
                                 be accessed by the sender and the receiver.
Private Message Replied (PMR)    An extension of the private message category. If user B replies to the private
                                 message by A, the interaction is counted in this separate category, which is
                                 considered to be the highest degree of interaction.

Table 3.3: The interaction categories for the interaction-based social recommendation method.

Simple interaction counting does not account for spamming activities. Therefore, a ratio between the private messages sent and received is calculated for an OSN user and each friend. If the calculated ratio lies between a lower and an upper threshold, private messages are counted as PMR, otherwise as PM. In a second step, weights are applied to the interactions found and a score is calculated for each OSN friend. Within this design, weight values are not defined and need to be calibrated first. Chapter 5 explains how the weights are calibrated in the scope of this thesis. The actual content of the interactions is not considered within this method. It is assumed that the frequency of interactions is a better measure of the intimacy from a user A to a user B. The following formulas show how the score is calculated:

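$I_{A \to B}$ = Interaction value from A to B

$PP$ = Count of public posts

$PPR$ = Count of public posts replied

$PM$ = Count of private messages

$PMR$ = Count of private messages replied

$\omega_{pp}$ = Weight for PP

$\omega_{ppr}$ = Weight for PPR

$\omega_{pm}$ = Weight for PM

$\omega_{pmr}$ = Weight for PMR

$S_{A,B}$ = Score between A and B

\[ S_{A,B} = I_{A \to B} + I_{B \to A} \]

\[ I_{A \to B} = (PP_{A \to B} \cdot \omega_{pp}) + (PPR_{A \to B} \cdot \omega_{ppr}) + (PM_{A \to B} \cdot \omega_{pm}) + (PMR_{A \to B} \cdot \omega_{pmr}) \]

\[ I_{B \to A} = (PP_{B \to A} \cdot \omega_{pp}) + (PPR_{B \to A} \cdot \omega_{ppr}) + (PM_{B \to A} \cdot \omega_{pm}) + (PMR_{B \to A} \cdot \omega_{pmr}) \]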
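The interaction score can be computed directly from the four category counts per direction. The sketch below is illustrative only: the class and method names are hypothetical, the weights passed in are placeholders (the thesis calibrates them in Chapter 5), and treating a friend with zero received messages as "not replied" is an assumption.

```java
/** Hypothetical computation of the interaction-based score between A and B. */
final class InteractionScoreSketch {

    /** Counts of the four interaction categories in one direction. */
    static final class Counts {
        final int pp, ppr, pm, pmr;
        Counts(int pp, int ppr, int pm, int pmr) {
            this.pp = pp; this.ppr = ppr; this.pm = pm; this.pmr = pmr;
        }
    }

    /** Weights for the four categories; the values are not fixed by the design. */
    static final class Weights {
        final double pp, ppr, pm, pmr;
        Weights(double pp, double ppr, double pm, double pmr) {
            this.pp = pp; this.ppr = ppr; this.pm = pm; this.pmr = pmr;
        }
    }

    /** Interaction value in one direction: the weighted sum of the four counts. */
    static double directed(Counts c, Weights w) {
        return c.pp * w.pp + c.ppr * w.ppr + c.pm * w.pm + c.pmr * w.pmr;
    }

    /** Score between A and B: the sum of both directed interaction values. */
    static double score(Counts aToB, Counts bToA, Weights w) {
        return directed(aToB, w) + directed(bToA, w);
    }

    /** Ratio check: private messages count as replied only within the bounds. */
    static boolean countsAsReplied(int sent, int received, double lower, double upper) {
        if (received == 0) {
            return false; // assumption: no received messages means no reply
        }
        double ratio = (double) sent / received;
        return ratio >= lower && ratio <= upper;
    }
}
```

For example, with the placeholder weights (1, 2, 3, 4), the counts (2, 1, 3, 4) from A to B and (1, 0, 0, 2) from B to A yield the directed values 29 and 9 and thus a score of 38.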

3.5.2 Location-Based Recommendation

The second social recommendation method is based on location data of the underlying OSN. Location data is found either in the OSN user attributes (e.g., hometown, current location) or in OSN content with location data attached. This method focuses on OSN content with location data attached, but also takes user attributes into account.

In a first step, an algorithm counts matches for each location found for the OSN user and the corresponding OSN friend. The location data in current OSNs is based on GPS coordinates, and therefore it is unlikely to find an exact match. To overcome this problem, the distance between two locations is calculated with the Haversine formula [47], and if the distance is below a specified threshold, it counts as a match. The time period between two matching locations is also taken into account, meaning that the longer the time period, the less the match counts toward the final score. If more than one match is found for a specific location, the match with the smallest time period is counted. Figure 3.5 shows an example situation. The social recommendation is done for user A. User A was at location X on day D. User B was at location X on D − 10 and at another matching location Y on D − 5. User C was at location X on D + 1. The algorithm then gives the match between A and C a higher score than the match between A and B. For B, location Y is counted as the match, since the visit to Y (D − 5) is closer in time to D than the visit to X (D − 10). This process is repeated for all the locations of user A.
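The distance-based matching can be illustrated with a standard implementation of the Haversine formula. This is a minimal sketch: the class name, the mean Earth radius of 6371 km, and the threshold parameter are assumptions, since the thesis does not state the values it uses.

```java
/** Hypothetical helper for matching two GPS locations by great-circle distance. */
final class HaversineSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius (assumption)

    /** Great-circle distance in kilometers between two points given in degrees. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding errors for antipodal points.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Two locations match if their distance is below the given threshold. */
    static boolean isMatch(double lat1, double lon1, double lat2, double lon2,
                           double thresholdKm) {
        return distanceKm(lat1, lon1, lat2, lon2) < thresholdKm;
    }
}
```

A small threshold treats only check-ins at practically the same place as a match, whereas a larger threshold would also match locations within the same city.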


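[Figure 3.5 shows a timeline with the following entries: User B at location X at D - 10, User B at location Y at D - 5, User A at location X at D, and User C at location X at D + 1. The matches with user A are marked.]

Figure 3.5: A location matching example.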

The following formula shows how the score is calculated:

$M_{cl,A,B}$ = Number of user attribute location matches between A and B

$M_{l,A,B}$ = Number of content location matches between A and B

$\omega_{cl}$ = Weight for user attribute location match

$\omega_{l}$ = Weight for content location match

$D$ = Number of days (time period) between two locations

$S_{A,B}$ = Score between A and B

\[ S_{A,B} = (M_{cl,A,B} \cdot \omega_{cl}) + \sum_{k=1}^{M_{l,A,B}} \left( \omega_{l} \cdot \frac{1}{D} \right) \]
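The location score can be evaluated with a simple loop over the content location matches. The sketch below is an illustration under stated assumptions: the class and parameter names are hypothetical, the weights are placeholders, D is taken per match, and a time period of less than one day is treated as D = 1 to avoid a division by zero, which the formula itself does not specify.

```java
/** Hypothetical computation of the location-based score between A and B. */
final class LocationScoreSketch {

    /**
     * @param attributeMatches    number of user attribute location matches
     * @param daysBetweenPerMatch time period in days for each content location match
     * @param weightAttribute     weight for a user attribute location match
     * @param weightContent       weight for a content location match
     */
    static double score(int attributeMatches, long[] daysBetweenPerMatch,
                        double weightAttribute, double weightContent) {
        double score = attributeMatches * weightAttribute;
        for (long days : daysBetweenPerMatch) {
            long d = Math.max(1L, days); // assumption: same-day matches count as D = 1
            score += weightContent * (1.0 / d);
        }
        return score;
    }
}
```

In the example of Figure 3.5, user C (D = 1) contributes the full content weight, whereas user B (D = 5) contributes only one fifth of it.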


[Figure 3.3 (UML class diagram, rotated in the original) contains the enumeration OSNRelationType (BIDIRECTIONAL, FOLLOWER, SUBSCRIPTION); the class OSNProvider with the interface methods of Table 3.2 together with FacebookProvider, TwitterProvider, and GooglePlusProvider; and the data model classes OSNUser, OSNIdentity, OSNRelation, OSNContent, OSNPost, OSNMessage, OSNLocation, OSNWorkPlace, OSNUserCredentials, FacebookUserCreds, TwitterUserCreds, GooglePlusUserCreds, and OSNAppCredentials.]

Figure 3.3: The OSN meta API data model.


Chapter 4

Implementation

This chapter explains key points of the implementation for the designed social capabilities. Common problems and bottlenecks are also discussed.

4.1 Technology Overview

All code related to the PiCsMu system is written in Java, which enables PiCsMu to be installed on any platform. Eclipse was used as the development environment. Source code organization and integration is managed by Apache Maven [1]. Apache Tomcat 7 [2] is used as the application Web server, and MySQL 5.5 server [17] is the database. All Java libraries and server software used in this work have an open-source license. In total, 12’891 lines of code (without comments and white space) were written in the scope of this thesis.

The design of all social capabilities is based on the general OSN data model, which abstracts the functionality from the underlying OSN, and the implementations of the social capabilities are based on this model as well. In the scope of this thesis, two concrete OSNProvider classes were implemented: FacebookOSNProvider and TwitterOSNProvider. Therefore, the OSNs Facebook and Twitter are fully supported by the PiCsMu system. JSocialLib uses existing third-party OSN client libraries to handle the low-level communication between PiCsMu and Facebook or Twitter. The client library RestFB [21] was used to access the Facebook API and Twitter4J [34] to access the Twitter API. Google+ was not considered, because the existing Java client libraries are still in beta state and the desired functionality would require a manual implementation, which was not feasible within the scope of this thesis.

4.2 PiCsMu Communication

The authentication for the PiCsMu system is handled by the PiCsMu identity provider. Both the PiCsMu identity provider and the social network expose a RESTful web service [46] to communicate with the PiCsMu desktop application and the OSN authentication server. The web services only accept requests from authenticated users. The same token mechanism that current OSNs use for their APIs (cf. Section 3.2) was implemented. This means that for each authenticated PiCsMu user, a token is generated in the identity provider and returned to the PiCsMu application. The token is then passed in each request to the web services, stored in the HTTP header field X-Auth-Token. Data is transferred in the JSON format [22], which is also the standard for current OSN API data representations. For security reasons, all PiCsMu communication is routed through an encrypted HTTPS connection, which also applies to the communication between PiCsMu and OSNs.
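A request to one of the web services carries the token in the X-Auth-Token header and a JSON body. The sketch below only illustrates this request shape: it uses the java.net.http client of Java 11, which postdates this thesis and is not the HTTP client used in PiCsMu, and the class name and any URL passed in are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for an authenticated JSON request to a PiCsMu web service. */
final class AuthRequestSketch {

    /** Builds a POST request that carries the token in the X-Auth-Token header. */
    static HttpRequest jsonPost(String url, String token, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("X-Auth-Token", token)              // token issued by the identity provider
                .header("Content-Type", "application/json") // data is transferred as JSON
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The request is only constructed here; sending it over HTTPS would be done with an HttpClient instance.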

4.3 OSN Meta API

At the beginning of the JSocialLib implementation, problems came up while testing the Facebook and Twitter APIs. E.g., Facebook and Twitter limit the API requests to a certain rate per user token and also per OSN application. Facebook limits the API requests per user token to 600 calls per 10 minutes for all endpoints. There is also an application limit regarding CPU and memory consumption. The exact limit per application is not publicly disclosed by Facebook, which makes it very difficult to respect this limitation. What can be said is that the resource consumption limit is a percentage of the monthly active application users. If the user token limit or the application limit is reached, the API requests are blocked for at least one hour. Twitter, on the other hand, only limits per user token and has well-documented rate limits for each API endpoint. The most used endpoints have a limit of 180 calls per 15 minutes (specified in JSocialLib).
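A client-side guard can prevent a library from exceeding such limits, e.g., 180 calls per 15 minutes. The sketch below is a generic sliding-window limiter and not the mechanism used in JSocialLib; the class name is hypothetical, and the OSNs may count calls in fixed rather than sliding windows, in which case this guard is merely more conservative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical client-side guard for a "maxCalls per window" rate limit. */
final class SlidingWindowLimiter {
    private final int maxCalls;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>(); // times of recorded calls

    SlidingWindowLimiter(int maxCalls, long windowMillis) {
        this.maxCalls = maxCalls;
        this.windowMillis = windowMillis;
    }

    /** Records a call at nowMillis and returns true if it stays within the limit. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Drop calls that have left the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxCalls) {
            return false; // limit reached, the caller should wait
        }
        timestamps.addLast(nowMillis);
        return true;
    }
}
```

For the Twitter limit mentioned above, the limiter would be created as new SlidingWindowLimiter(180, 15 * 60 * 1000L) and queried with System.currentTimeMillis() before each API call.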

A third limitation is the amount of data returned by each call. Facebook, e.g., limits the stream API to data of 30 days or 50 posts, whichever is greater. However, it was observed that this limit is not strictly enforced and that it is possible to retrieve even more data with certain queries. Twitter, on the other hand, follows exactly the documented limits. A complete list of all rate limitations on Twitter can be found in its API documentation [32].

How one deals with the rate and data limits is a key point in the development of such a library. It involves a considerable amount of manual testing to check whether method calls succeed or fail. Finding the right queries and tuning the parameters is an especially complicated and time-consuming task. Moreover, in some cases, the data returned by Facebook is not consistent, e.g., executing the same query (related to interaction) on two different days revealed different data, even though the Facebook data for the specific user did not change. In addition, there are bugs in the OSN APIs that should be considered. During the JSocialLib implementation phase, and in the scope of this thesis, a bug related to Facebook location data was discovered and formally reported. However, the bug had not been resolved by the conclusion of this thesis [7].

Listing 4.1 shows an example of how Facebook is queried with the RestFB client library. The query in this example addresses the graph API endpoint and returns all friends of the authenticated user, together with the user attributes of each friend that are specified with the “fields” parameter.


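// Requires the RestFB library on the classpath
import com.restfb.Connection;
import com.restfb.DefaultFacebookClient;
import com.restfb.FacebookClient;
import com.restfb.Parameter;
import com.restfb.types.User;

FacebookClient fb = new DefaultFacebookClient(oauthToken);
// Fetch all friends with the attributes listed in the "fields" parameter
Connection<User> myFriends = fb.fetchConnection(
    "me/friends",
    User.class,
    Parameter.with("fields", "id, name, first_name, "
        + "last_name, middle_name, gender, locale, languages, "
        + "username, age_range, timezone, bio, birthday, education, "
        + "email, hometown, location, work"));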

Listing 4.1: A sample Facebook query with the RestFB library.

Listing 4.2 shows an example of how Twitter is queried with the Twitter4J client library. The query in this example addresses the search API endpoint and returns all Twitter identities that match the given first name and last name. In addition, a rate limit check is performed before the execution of the query. Fortunately, Twitter has an API endpoint to check the rate limitations for each user token and API endpoint, which Facebook does not provide.

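// Requires the Twitter4J library on the classpath
import twitter4j.ResponseList;
import twitter4j.Twitter;
import twitter4j.TwitterFactory;
import twitter4j.User;
import twitter4j.conf.ConfigurationBuilder;

ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey(appId)
    .setOAuthConsumerSecret(appSecret)
    .setOAuthAccessToken(oauthToken)
    .setOAuthAccessTokenSecret(oauthSecret);
TwitterFactory tf = new TwitterFactory(cb.build());
Twitter twitter = tf.getInstance();
// Query only if the rate limit for this endpoint is not exhausted
if (getRemainingCalls("users", "/search") > 0) {
    ResponseList<User> result = twitter.searchUsers(firstname + " " + lastname, 1);
}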

Listing 4.2: A sample Twitter query with the Twitter4J library.

4.4 PiCsMu Social Interface

The PiCsMu social component resides in the PiCsMu desktop application and exposes two interfaces to the PiCsMu core component: Social Network Service and Social Recommendation Service. The social component connects the social capabilities with the existing PiCsMu application.

Listing 4.3 lists the interface methods of the social network service. The method getOSNProviderTypes returns the supported OSNs of the PiCsMu social network. The method getOSNMappings returns a list of OSNMapping for the specified PiCsMu user. An OSNMapping maps a PiCsMu user ID to the corresponding OSN user ID. The method getOSNProviderStates returns all the OSN authentication information for the specified PiCsMu user. The method createOSNProviderState starts the asynchronous process of OSN authentication explained in Section 3.2. Lastly, the method deleteOSNProviderState removes the OSN authentication information for the specified PiCsMu user and OSN.

// Social Network Service Interface
public List<OSNProviderType> getOSNProviderTypes() throws Exception;
public List<OSNMapping> getOSNMappings(PicsMuAccessToken token,
    PicsMuUserBean user) throws Exception;
public List<OSNProviderState> getOSNProviderStates(PicsMuAccessToken token,
    PicsMuUserBean user) throws Exception;
public Future<OSNProviderState> createOSNProviderState(
    AddCredentialCallback cb);
public boolean deleteOSNProviderState(PicsMuAccessToken token,
    PicsMuUserBean user, OSNProviderState state) throws Exception;

Listing 4.3: The social network service interface.

Listing 4.4 lists the interface methods of the social recommendation service. The method createRecommendationContext computes in a preliminary step which social recommendation use cases (cf. Section 3.5) are possible for the specified PiCsMu user. The method getRecommendationResult returns the actual social recommendation result based on a recommendation context.

// Recommendation Service Interface
public Future<RecommendationContext> createRecommendationContext(
    ContextCallback cb);
public Future<RecommendationResult> getRecommendationResult(
    RecommendationCallback cb);

Listing 4.4: The recommendation service interface.

4.5 Asynchronous Communication

The concept of asynchronous communication allows non-blocking program execution. This is a crucial concept for the social capabilities, since some of them are long-running processes (several minutes), such as OSN authentication and social recommendation. These processes should run in the background without blocking the PiCsMu application, since blocking would result in an inconvenient user experience. For this communication pattern, the interfaces in the java.util.concurrent package were used in combination with the well-known observer pattern. Listing 4.5 shows an example of how a non-blocking asynchronous communication flow is invoked to compute a social recommendation result. First, a RecommendationCallback is created, which registers a RecommendationObserver. The statement service.getRecommendationResult(cb) immediately returns a Future<RecommendationResult> object and the execution continues with subsequent tasks. In the background, the RecommendationCallback is executed and, once it finishes, the registered RecommendationObserver is notified and the Future<RecommendationResult> is filled with the actual recommendation result.

// initialize asynchronous communication
RecommendationContext rc = new RecommendationContext();
RecommendationCallback cb = new RecommendationCallback(rc);
RecommendationObserver ro = new RecommendationObserver();
cb.addObserver(ro);
// start asynchronous communication
Future<RecommendationResult> future = service.getRecommendationResult(cb);
// do other tasks
...

Listing 4.5: An example of an asynchronous communication flow.
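The flow of Listing 4.5 can be reproduced with standard library classes only. The following is a minimal, self-contained sketch and not the PiCsMu code: `ResultListener`, `NotifyingTask`, and `AsyncDemo` are hypothetical stand-ins for the observer, the callback, and the service, and the long-running recommendation is replaced by a placeholder string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical listener, standing in for the RecommendationObserver. */
interface ResultListener {
    void onResult(String result);
}

/** Hypothetical task, standing in for the RecommendationCallback. */
class NotifyingTask implements Callable<String> {
    private final List<ResultListener> listeners = new ArrayList<>();

    void addListener(ResultListener listener) {
        listeners.add(listener);
    }

    @Override
    public String call() {
        String result = "recommendation";  // placeholder for the long-running work
        for (ResultListener listener : listeners) {
            listener.onResult(result);     // notify observers once the work is done
        }
        return result;                     // this value fills the Future
    }
}

class AsyncDemo {
    /** Runs one task in the background and returns "futureValue|observedValue". */
    static String run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            final StringBuffer observed = new StringBuffer();
            NotifyingTask task = new NotifyingTask();
            task.addListener(r -> observed.append(r));
            Future<String> future = executor.submit(task);  // returns immediately
            // ... other tasks could run here without being blocked ...
            String value = future.get();   // blocks only when the result is needed
            return value + "|" + observed;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The caller can either react to the listener notification or poll the `Future`; the sketch does both only to show that the two paths deliver the same value.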

4.6 Parallel Computing

When communicating with an OSN, single-threaded solutions perform poorly in terms of total run time (due to the lack of parallelism). Therefore, the JSocialLib and the social recommendation software were optimized for parallel execution. The same operation/query has to be executed many times, once for each friend of an OSN user. Thus, the implementation supports parallelization to collect data for each friend of an OSN user in a different thread. Listing 4.6 shows an example of how an executor service (ScheduledThreadPoolExecutor class) is instantiated with a fixed thread pool size, which, for the prototypical implementation, is set to 50. After the ScheduledThreadPoolExecutor instantiation, all tasks (RecommendationCallback class) are submitted to the executor service. Since the pool size is set to 50, the executor service runs up to 50 tasks in parallel until the internal task queue is empty.

import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// thread pool
ScheduledThreadPoolExecutor threadPool = new ScheduledThreadPoolExecutor(50);
for (RecommendationContext rc : contextToDo) {
    RecommendationCallback cb = new RecommendationCallback(rc);
    RecommendationObserver ro = new RecommendationObserver();
    cb.addObserver(ro);
    // submit task to the pool declared above
    Future<RecommendationResult> future = threadPool.submit(cb);
}
// wait for completion
...
// do other tasks

Listing 4.6: An example of parallel computing.
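A self-contained variant of the same idea is sketched below. It is an illustration under assumptions rather than the thesis implementation: `ParallelCollect` is a hypothetical name, the per-friend OSN query is replaced by a trivial placeholder, and `invokeAll` is used to make the "wait for completion" step explicit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCollect {
    /** Collects one placeholder value per friend using a fixed-size thread pool. */
    static List<Integer> collect(List<String> friendIds, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (final String id : friendIds) {
                tasks.add(() -> id.length());  // placeholder for a per-friend OSN query
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and preserves the task order
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> friends = new ArrayList<>();
        friends.add("alice");
        friends.add("bob");
        System.out.println(collect(friends, 50));
    }
}
```

With a pool size of 50, at most 50 of the submitted tasks run at the same time; the remaining ones wait in the internal queue of the executor.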


Chapter 5

Evaluation

This chapter describes the evaluation of the social recommendation for the PiCsMu system. An exploratory study is conducted and the findings are presented and discussed in detail. First, the study calibrates the introduced social recommendation methods with respect to real OSN user data (collected by means of a survey), and second, the calibrated social recommendation methods are evaluated. In addition, the study aims to provide a better understanding of how OSN users perceive their OSN relationships.

5.1 Method

The primary goal of this study is to calibrate the parameter sets and to evaluate the social recommendation methods using the calibrated parameter set (cf. Section 5.3). Different data sets were used to calibrate the parameters and to evaluate the social recommendation methods. The calibration is based on OSN user perception and measures how close the social recommendation results come to the user perception. A brief manual analysis of experimental social recommendation results (based on a small Facebook user sample) showed that the results roughly represent what the two social recommendation methods aim for. Both methods, interaction-based and location-based, performed as expected for five selected OSN users who could verify the results. Based on the findings of these experimental results, the following hypotheses are defined:

(H1) Both social recommendation methods are able to recommend at least 2 out of 5 friends that reflect the user perception.

(H2) The two social recommendation methods reveal a different result set, or in other words, the friends with the highest scores from the interaction-based method differ from the friends with the highest scores of the location-based method.

H2 focuses on the observation that OSN users tend to interact more with friends who are not their closest friends in real life. In addition, it is assumed that OSN users are more likely to interact with friends who are geographically far away than with their closest friends in real life. This assumption leads to the third hypothesis:


(H3) In order to be able to recommend the most trusted friends of an OSN user, it is hypothesized that a hybrid approach composed of both methods will perform better than each individual method.

Since Facebook has the largest number of monthly active users, it was chosen as the underlying OSN for this evaluation. Within the scope of this thesis it was not possible to evaluate the social recommendation methods with a different OSN, e.g., Twitter or Google+.

Procedure

The evaluation procedure consists of three main steps, which can be summarized as follows: data collection, method calibration, and result evaluation. In the first step, all the necessary data is collected. The second step calibrates the social recommendation methods to determine the best parameter set. In the last step, the calibrated social recommendation methods are executed on a different data set and evaluated. The following descriptions give a brief overview of the tasks each step consists of:

Data Collection

Two different data sets are collected to conduct this study. The first is the OSN user perception related to the two social recommendation methods, and the second is the corresponding OSN user data. It was decided to collect the user perception with a web-based survey, which is explained in Section 5.2.1. The survey is accessible after a successful OSN authentication, which enables the collection of OSN user data for the authenticated OSN user. The details of OSN user data collection are described in Section 5.2.2.

Method Calibration

After the collection of the data sets, the social recommendation methods are calibrated. Different parameter sets are generated and statistics are computed for each of the methods to check which ones perform best. This step is explained in more detail in Section 5.3.

Result Evaluation

The calibrated social recommendation methods are evaluated against a different data set. Various statistics are computed with the aim to answer the hypotheses H1, H2, and H3. Section 5.4 explains the result evaluation in more detail.

5.2 Data Collection

The data collection is composed of a web-based survey and the OSN data collection, which is performed for each user who answered the survey.

5.2.1 Survey

It was decided to conduct a web-based survey, which is self-evident, since answers from OSN users are required. The advantages are that it scales well for a multitude of participants and that it requires the least effort from a participant. The disadvantage is a much lower response rate, due to the fact that visitors tend to open the survey and browse it for a quick view, but in the end decide not to participate in it. The alternative would be to invite people to participate in a controlled environment, where participants who accept to take part in the survey really answer it. This approach was not feasible in the scope of this thesis (due to time constraints), but it is assumed to have a much higher response rate, since possible participants agree to the invitation and make the effort to visit the controlled environment.

The survey website [28] is written in a combination of JavaScript and PHP to be able to integrate the Facebook SDK library [8], which is used to handle the communication between the web server and Facebook. Moreover, a Facebook application was created for the survey itself. In order to store the survey answers and the OSN data, a MySQL database [17] was used. The website is hosted on an Apache 2.0 web server [3]. The survey website was deployed on a virtual Linux machine, which leaves the possibility to increase the hardware resources under heavy load. Performance stress tests were performed to check whether the infrastructure would remain stable. These tests showed that the setup is able to handle up to 1000 website visitors in parallel. The survey website is optimized for both desktop browsers and mobile devices. The survey submission is restricted to the Facebook account of each user, meaning that each authenticated user is only able to submit the survey once. The only two requirements for the survey website are that a visitor has cookies enabled and that no pop-up blocker software is active.

Content

The survey consists of three parts. The first part holds general information, advice on how to use the elements in the survey, and a privacy agreement, which the participants have to confirm. The following privacy statement was presented in the survey:

• No personal information will be disclosed to any third-party entities by any means.

• For scientific publications, the ethics and code of conduct for studies will be followed, and only fully anonymized and aggregated results will be used.

The second part is the OSN authentication, where each participant authenticates his/her Facebook identity and accepts the permissions for the survey Facebook application. The first two parts are displayed to every visitor, but the last part is only shown after a successful OSN authentication. The last part consists of the questions to be answered. There are two main questions, each with three sub-questions. Table 5.1 summarizes the survey questions. At the end, the survey participants are given the opportunity to submit their email address, so that they can be informed about their personal social recommendation results and the outcome of this study.

Number  Question
1.1     How often do you interact with Facebook friends?
1.2     Who are the top 5 Facebook friends you most interacted with during the past 6 months?
1.3     How do you rank those selected Facebook friends based on the amount of interaction in a descending order?
2.1     How often do you specify or attach your current location when posting on Facebook?
2.2     Who are the top 5 Facebook friends you were the geographically closest to during the past 6 months?
2.3     How do you rank those selected Facebook friends based on geographical closeness in a descending order?

Table 5.1: The questions of the survey.

The first three questions (1.1, 1.2, 1.3) are related to the interaction-based social recommendation method, and the last three questions (2.1, 2.2, 2.3) are related to the location-based social recommendation method. Questions 1.1 and 2.1 measure how frequently the participant interacts or uses locations on Facebook. Those questions have a fixed answer set, and a participant can choose between frequently, occasionally, or never as an answer. In questions 1.2 and 2.2, the actual user perception is obtained by selecting Facebook friends. In question 1.2, the user perception regarding his/her Facebook situation is collected, and in question 2.2, the user perception regarding his/her real-life situation is collected. A friend selector was implemented to navigate through all Facebook friends and select them quickly. Figure 5.1 shows a screenshot of the friend selector element. In questions 1.3 and 2.3, the participant has to rank his/her selected friends in descending order, meaning that the first position holds the friend he/she either most interacted with on Facebook or was geographically closest to. Figure 5.2 shows a screenshot of the ranking element. The user can drag and drop the friends to the right position.

Figure 5.1: The friend selector survey element.


Figure 5.2: The friend ranking survey element

Sample Statistics

The survey website link was spread over different channels. First, Facebook friends were directly contacted. Second, the link was sent to the mailing list of all UZH students. Third, the survey website was optimized for web search engines, e.g., Google. In addition, the Google Analytics service [11] was enabled to collect visitor statistics. In total, 290 Complete Surveys (CS) were collected during 30 days. The demographics of all website visitors who did the OSN authentication step are shown in Figures 5.3 and 5.4. These figures are obtained from the insights of the Facebook application, which was specifically created for the purpose of the survey.

Figure 5.3: Survey visitor demographics: gender and age.

In total, the survey had 1577 First-Time Visits (FV) and a total visit count of 1898. The average visit duration is one minute and 25 seconds. The bounce rate (percentage of visitors who left the entry page without any interaction) is around 80%. The response rate for the survey in a worst-case scenario is computed with the following formula:

\[ r_{\text{response}} = \frac{CS \cdot 100}{FV} = \frac{290 \cdot 100}{1577} \approx 18.38\% \]

Figure 5.4: Survey visitor demographics: countries, cities and language.

The response rate in the worst-case scenario is very low, and there are two possible explanations for this. First, it is assumed that many people did not complete the survey because they were deterred by the permissions the Facebook application requires. For example, in order to read direct messages of a user, the Facebook application requires the permission to read a user’s message inbox, and this probably raised privacy concerns. Second, it is assumed that some visitors did not even have a Facebook account and therefore were unable to participate. The best-case scenario would be the response rate based on the total number of users who did the OSN authentication, which can be obtained from the insights of the Facebook application. The resulting response rate would be much higher:

\[ r_{\text{response}} = \frac{290 \cdot 100}{358} \approx 81.00\% \]

The best-case scenario does not include the visitors who do have a Facebook account but did not do the OSN authentication. Therefore, the true response rate lies between 18.38% and 81.00%.

5.2.2 OSN Data

The survey questions consider a time period of six months, which results in a considerable amount of data for each user, even though the actual content of an interaction or of an item with a location attached is not cached. To handle this amount of data, each single Facebook query response is cached in a relational database. The database consists of three tables: relation, interaction, and location. The table relation caches the user’s own attributes and the corresponding attributes of each of his/her friends. The table interaction caches all the interactions of each user in the past six months, counting back from the day the survey was submitted. Finally, the table location caches all the locations found for each user in the same time period. It is necessary to cache all the Facebook data, because it is the only way to prevent the survey Facebook application from being blocked by Facebook (due to API call rate limits, cf. Section 4.3). Moreover, the caching makes it possible to calibrate the social recommendation with many more parameter sets (cf. Section 5.3), since the results are computed locally, without the need to always query Facebook. Table 5.2 gives a summary of the amount of cached OSN data and the number of rows for each table in the database.

Table Size Rows

relation 108.3 MByte 122,439

interaction 63.7 MByte 123,454

location 102.8 MByte 244,885

Table 5.2: Summary of the cached Facebook data.

5.3 Method Calibration

Each social recommendation method depends on a set of parameters, such as weight, distance, and spam ratio (cf. Section 3.5). The calibration of the social recommendation methods aims to find the best values for these parameters. The idea is to test as many parameter sets for each method as possible. Table 5.3 summarizes the parameters needed for both methods. It was decided to generate parameter sets in a brute-force manner. The boundaries for the weight and distance parameters are set to 1-9. The value of the spam ratio parameter in the interaction-based method is restricted to 2, 3, and 4, meaning that each weight combination in the interaction-based method is combined with each spam ratio value. This results in three times more parameter sets than for the location-based method. Table 5.4 shows an example of six generated parameter sets for the interaction-based method. The same procedure applies to the parameters generated for the location-based method, but without adding spam ratio values.

Parameter  Description                                       Boundary  Method
ωpp        Weight for a public post.                         1-9       interaction-based
ωppr       Weight for a public post replied.                 1-9       interaction-based
ωpm        Weight for a private message.                     1-9       interaction-based
ωpmr       Weight for a private message replied.             1-9       interaction-based
r          Ratio to filter spam.                             2, 3, 4   interaction-based
ωcl        Weight for a current location match.              1-9       location-based
ωl         Weight for a location match.                      1-9       location-based
dcl        Distance threshold for a current location match.  1-9       location-based
dl         Distance threshold for a location match.          1-9       location-based

Table 5.3: Complete parameter set for both social recommendation methods.

The brute-force approach is tuned and split into two parts. First, the generation of parameter sets is started at the boundary value 1 with an increment of 2 units, and then it is started at the boundary value 2 with an increment of 2 units. Thus, the parameter values are {1, 3, 5, 7, 9} and {2, 4, 6, 8}, i.e., odd and even parameter values, respectively. This approach generates $(5^4 + 4^4) \cdot 3 = 2643$ parameter sets for the interaction-based method and $5^4 + 4^4 = 881$ parameter sets for the location-based method.

ωpp  ωppr  ωpm  ωpmr  r
1    3     2    4     2
1    3     2    4     3
1    3     2    4     4
5    7     1    2     2
5    7     1    2     3
5    7     1    2     4

Table 5.4: An example of six generated parameter sets for the interaction-based method.
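The stated counts can be checked with a short enumeration. The following sketch is an assumption about how the tuned brute-force generation could look, not the thesis code: the class `ParameterSets` is hypothetical, and it assumes that odd and even values are never mixed within one parameter set, which is what the counts 881 and 2643 imply.

```java
import java.util.ArrayList;
import java.util.List;

class ParameterSets {
    static final int[] ODD = {1, 3, 5, 7, 9};
    static final int[] EVEN = {2, 4, 6, 8};

    /** All 4-tuples over the given values (values are not mixed across arrays). */
    static List<int[]> tuples(int[] values) {
        List<int[]> out = new ArrayList<>();
        for (int a : values)
            for (int b : values)
                for (int c : values)
                    for (int d : values)
                        out.add(new int[] {a, b, c, d});
        return out;
    }

    /** Location-based: two weights and two distances, odd tuples plus even tuples. */
    static List<int[]> locationBased() {
        List<int[]> out = tuples(ODD);
        out.addAll(tuples(EVEN));
        return out;
    }

    /** Interaction-based: four weights, each tuple combined with spam ratio 2, 3, 4. */
    static List<int[]> interactionBased() {
        List<int[]> out = new ArrayList<>();
        for (int[] w : locationBased()) {  // same 4-tuple space, reused for the weights
            for (int r = 2; r <= 4; r++) {
                out.add(new int[] {w[0], w[1], w[2], w[3], r});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(locationBased().size());     // prints 881
        System.out.println(interactionBased().size());  // prints 2643
    }
}
```

Compared to a full enumeration over 1-9, which would yield $9^4 = 6561$ tuples per method, the split into odd and even values reduces the number of runs considerably.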

The social recommendation methods are calibrated on a random half of the collected data set. The remaining half is used to evaluate the calibrated methods in a second step (cf. Section 5.4). The procedure of randomly dividing the survey data set was chosen since, to the best of the author’s knowledge, no Facebook data set that includes private user data with interactions and locations is publicly available. A possible explanation for the lack of such a public data set is the difficulty of fully anonymizing it. Each social recommendation method is run with all corresponding parameter sets for the randomly chosen half of the data set. This process results in 383,235 interaction-based method runs and 127,745 location-based method runs. The calibration is run on a high-end server machine with 24 CPU cores, SSD hard disks, and 64 GByte RAM. With this hardware it is possible to complete 1400 interaction-based or 700 location-based runs per minute. The estimated total time is 4.5 hours for all interaction-based method runs and 3 hours for all location-based method runs. Table 5.5 summarizes the total number of calibration runs for both social recommendation methods.

Method Parameter Sets Total Runs Runs / Minute Total Run Time

interaction-based 2643 383,235 1400 4.5 hours

location-based 881 127,745 700 3 hours

Table 5.5: Total calibration runs for both social recommendation methods.
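The random division of the data set can be sketched as follows. This is an illustrative assumption, since the thesis does not state how the split was implemented; the class `DataSplit`, the method `splitInHalf`, and the use of a fixed seed are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class DataSplit {
    /** Shuffles a copy of the items and returns [calibrationHalf, evaluationHalf]. */
    static <T> List<List<T>> splitInHalf(List<T> items, long seed) {
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, new Random(seed));  // seed makes the split repeatable
        int mid = shuffled.size() / 2;
        List<List<T>> halves = new ArrayList<>();
        halves.add(new ArrayList<>(shuffled.subList(0, mid)));
        halves.add(new ArrayList<>(shuffled.subList(mid, shuffled.size())));
        return halves;
    }

    public static void main(String[] args) {
        List<Integer> surveyIds = new ArrayList<>();
        for (int i = 0; i < 290; i++) {
            surveyIds.add(i);
        }
        List<List<Integer>> halves = splitInHalf(surveyIds, 42L);
        System.out.println(halves.get(0).size() + " / " + halves.get(1).size());
    }
}
```

With 290 complete surveys, each half contains 145 users, which matches the run totals in Table 5.5: 2643 · 145 = 383,235 and 881 · 145 = 127,745.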

Every social recommendation run with a parameter set results in a weighted, undirected graph as explained in Section 3.5. Each Facebook friend (node) is then ranked based on his/her score (edge) in descending order, meaning that the friend with the highest score is at the top of the list. In order to compare the results of each parameter set, the following measurements are defined: rank difference, match count, and cosine similarity. Each of these measurements compares the user perception answered in the survey with the corresponding social recommendation method result of a specific parameter set. After all social recommendation results for one parameter set are completed, the mean value and the standard deviation are calculated in order to compare the parameter sets.

Rank Difference

For this measurement, the total difference between the ranking positions of the user perception answered in the survey and the ranking positions of the social recommendation method result related to one parameter set is calculated. This procedure is repeated for all survey answers and, at the end, the rank difference mean is calculated. Table 5.6 shows an example of how the total difference of one social recommendation result is calculated.

User Perception  User Perception Ranking  Recommendation Ranking  Difference
Friend A         1                        53                      52
Friend B         2                        8                       6
Friend C         3                        1                       2
Friend D         4                        15                      11
Friend E         5                        3                       2

Total Difference = 73

Table 5.6: An example of the rank difference measurement.
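The calculation in Table 5.6 amounts to a sum of absolute rank differences. A minimal sketch follows; the class and method names are hypothetical, and the use of the absolute value is inferred from the table (Friend C and Friend E have a positive difference although the recommendation rank is lower).

```java
class RankDifference {
    /** Sum of absolute differences between perceived and recommended rank positions. */
    static int total(int[] perceivedRanks, int[] recommendedRanks) {
        int sum = 0;
        for (int i = 0; i < perceivedRanks.length; i++) {
            sum += Math.abs(perceivedRanks[i] - recommendedRanks[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Values of Friend A to Friend E from Table 5.6
        int[] perceived = {1, 2, 3, 4, 5};
        int[] recommended = {53, 8, 1, 15, 3};
        System.out.println(total(perceived, recommended));  // prints 73
    }
}
```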

Match Count

This measurement counts the matches between friends in the social recommendation result and the user perception (top5 match). If the rank is the same in both the user perception and the social recommendation result, it is counted as an exact match. Table 5.7 shows an example of the match count approach.

User Perception  User Perception Ranking  Recommendation Ranking
Friend A         1                        20
Friend B         2                        2
Friend C         3                        5
Friend D         4                        15
Friend E         5                        3

Top5 Match Count = 3
Exact Match Count = 1

Table 5.7: An example of the match count measurement.
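Both counts of Table 5.7 can be derived from the two rank columns. The sketch below uses hypothetical names and assumes that a top5 match means that a friend selected by the user is ranked within the first five positions by the recommendation method.

```java
class MatchCount {
    /** Number of user-selected friends that the method ranks within its top five. */
    static int top5(int[] recommendedRanks) {
        int count = 0;
        for (int rank : recommendedRanks) {
            if (rank <= 5) {
                count++;
            }
        }
        return count;
    }

    /** Number of friends whose perceived and recommended rank are identical. */
    static int exact(int[] perceivedRanks, int[] recommendedRanks) {
        int count = 0;
        for (int i = 0; i < perceivedRanks.length; i++) {
            if (perceivedRanks[i] == recommendedRanks[i]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Values of Friend A to Friend E from Table 5.7
        int[] perceived = {1, 2, 3, 4, 5};
        int[] recommended = {20, 2, 5, 15, 3};
        System.out.println(top5(recommended));              // prints 3
        System.out.println(exact(perceived, recommended));  // prints 1
    }
}
```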

Cosine Similarity

The third measurement is the calculation of the cosine similarity [57] between the user perception and the social recommendation result. The cosine measures the similarity of two document vectors. The same principle is applied here, and two input vectors are computed from the user perception and the recommendation result. A cosine similarity of 1 refers to a complete similarity, meaning that the user perception and the social recommendation result contain the same friends in the top five ranking. The following example clarifies the cosine calculation:

Friends - User Perception (UP): A, B, C, D, E
Friends - Recommendation Result (RR): A, Z, X, E, Y
Total Distinct Friends: A, B, C, D, E, X, Y, Z

Input Vectors:

Friend  Count UP  Count RR
A       1         1
B       1         0
C       1         0
D       1         0
E       1         1
X       0         1
Y       0         1
Z       0         1

Resulting Cosine Similarity: 0.4
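For 0/1 vectors such as the ones in this example, the cosine reduces to the number of common friends divided by the square root of the product of both set sizes, here 2 / √(5 · 5) = 0.4. The following sketch illustrates this under the assumption that only set membership (not the rank) enters the vectors; the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CosineSimilarity {
    /** Cosine similarity of two friend sets, each encoded as a 0/1 vector. */
    static double ofSets(List<String> a, List<String> b) {
        Set<String> setA = new HashSet<>(a);
        Set<String> setB = new HashSet<>(b);
        if (setA.isEmpty() || setB.isEmpty()) {
            return 0.0;  // the cosine is undefined for a zero vector, 0 is used here
        }
        Set<String> common = new HashSet<>(setA);
        common.retainAll(setB);
        // dot product = |A ∩ B|, product of the norms = sqrt(|A| * |B|)
        return common.size() / Math.sqrt((double) setA.size() * setB.size());
    }

    public static void main(String[] args) {
        List<String> up = Arrays.asList("A", "B", "C", "D", "E");
        List<String> rr = Arrays.asList("A", "Z", "X", "E", "Y");
        System.out.println(ofSets(up, rr));  // prints 0.4
    }
}
```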

For each parameter set, the statistics are calculated based on the following three variations: frequency type, filter method, and threshold size.

Frequency Type

The frequency type refers to survey question 1.1, where the participants answer how frequently they interact, and question 2.1, where the participants answer how frequently they attach locations on Facebook. Computing statistics for each frequency type limits the social recommendation results for each parameter set to either the results with the answered frequency (never, occasionally, or frequently) or all results (all frequencies).

Filter Method

The filter method is applied to the social recommendation results in order to compute more accurate statistical results for each parameter set. There are three different filter methods: failed completely, zero score, and threshold. Failed completely filters out a complete result in a parameter set when the social recommendation method could not compute anything. This is the case when the survey participant removed the permissions for the Facebook application right after he/she submitted the survey, and the social recommendation method was not able to cache any data. Zero score filtering filters out friends in a result who have a score value of exactly zero. There are three possible causes for a zero score value: the survey participant sabotaged his/her answers, the user perception is completely wrong (e.g., the user thought that he/she interacted with X, Y, or Z, but actually did not send messages, posts, or any OSN information to them), or Facebook returned an error while collecting data for that specific friend. Threshold filtering goes a step further than zero score filtering and filters out friends with a rank difference over a certain threshold, because friends with a big rank difference but a score value above zero can also fall into the category of wrong user perception. This filtering approach is rather experimental, and there is the possibility that valid friends are filtered out as well. It is important to note that the three filter methods are cumulative, meaning that failed completely filtering is also included in zero score and threshold filtering, and zero score filtering is also included in threshold filtering.

Threshold Size

The threshold size defines which threshold is used if the threshold filter method is chosen. The threshold size is fixed to the values two, three, and four.
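The two per-friend filters can be sketched as follows. The sketch is an assumption about the filtering logic based on the description above, not the thesis code: `FriendResult` and `ResultFilters` are hypothetical names, and failed completely filtering, which removes whole results rather than single friends, is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** One friend selected in the survey (hypothetical data holder). */
class FriendResult {
    final int perceivedRank;
    final int recommendedRank;
    final double score;

    FriendResult(int perceivedRank, int recommendedRank, double score) {
        this.perceivedRank = perceivedRank;
        this.recommendedRank = recommendedRank;
        this.score = score;
    }
}

class ResultFilters {
    /** Zero score filtering: drops friends whose score is exactly zero. */
    static List<FriendResult> zeroScore(List<FriendResult> friends) {
        List<FriendResult> kept = new ArrayList<>();
        for (FriendResult f : friends) {
            if (f.score != 0.0) {
                kept.add(f);
            }
        }
        return kept;
    }

    /** Threshold filtering: cumulative, so zero score filtering is applied first. */
    static List<FriendResult> threshold(List<FriendResult> friends, int thresholdSize) {
        List<FriendResult> kept = new ArrayList<>();
        for (FriendResult f : zeroScore(friends)) {
            int rankDifference = Math.abs(f.perceivedRank - f.recommendedRank);
            if (rankDifference <= thresholdSize) {  // friends over the threshold are dropped
                kept.add(f);
            }
        }
        return kept;
    }
}
```

Calling `threshold` with the sizes two, three, and four yields the three threshold variations mentioned above.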


To determine the best parameter set for each social recommendation method, all statistical results are ordered by the rank difference mean in descending order. Iterating through this ordered list, the top5 match count means of the statistical results are compared, and the result with the highest top5 match count mean and standard deviation is selected. With this approach, the best parameter set is selected by prioritizing the top5 match count mean, followed by the standard deviation and the rank difference mean.

The calibration of the two social recommendation methods revealed the best parameter sets: parameter set identifier 1277 for the interaction-based method, and parameter set identifier 424 for the location-based method. Table 5.8 shows the parameter sets found by the calibration step, including the weight and ratio values.

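Parameter   Value   Parameter Set   Method
ωpp         3       1277            interaction-based
ωppr        9       1277            interaction-based
ωpm         1       1277            interaction-based
ωpmr        1       1277            interaction-based
r           3       1277            interaction-based
ωcl         3       424             location-based
ωl          7       424             location-based
dcl         9       424             location-based
dl          7       424             location-based

Table 5.8: Optimized parameter sets found from the social recommendation methods calibration step.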

5.4 Result Evaluation

In order to evaluate the parameters found in the calibration step, the calibrated social recommendation methods are applied to the remaining half of the data set obtained by the survey and data collection. The statistics are generated again for the results of this second half and compared to the statistics from the calibration step. Since failed completely filtering only removes users who are completely invalid due to removed permissions of the survey Facebook application, only zero score filtering is considered in the evaluation of the parameter sets and social recommendation methods. As explained in Section 5.3, zero score filtering only removes true negatives and therefore does not falsify the evaluation. The cosine similarity correlates with the top5 match count measurement: the higher the top5 match count, the higher the cosine similarity. Therefore, the cosine similarity measurement is redundant and not shown explicitly in the following figures.
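The relation between the two measurements can be made concrete with a small sketch. It assumes that each top-5 list is encoded as a binary membership vector over all friends, which is an assumption for illustration since the exact vector encoding is not restated in this section. Under that assumption, the dot product equals the number of common friends, so for two complete top-5 lists the cosine similarity is simply the match count divided by 5.

```python
import math
from typing import Sequence


def top5_match_count(perceived: Sequence[str], recommended: Sequence[str]) -> int:
    """Number of friends that appear in both top-5 lists."""
    return len(set(perceived[:5]) & set(recommended[:5]))


def cosine_similarity(perceived: Sequence[str], recommended: Sequence[str]) -> float:
    """Cosine similarity of the two top-5 lists as binary membership vectors.

    With binary vectors, the dot product is the size of the intersection and
    each norm is the square root of the list size.
    """
    a, b = set(perceived[:5]), set(recommended[:5])
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))
```

With the hypothetical lists `["a", "b", "c", "d", "e"]` and `["a", "c", "x", "y", "z"]`, the match count is 2 and the cosine similarity is 2/5, which shows why the two measurements move together.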


Interaction-Based Social Recommendation Method
For the statistics based on zero score filtering including all users, 137 calibration and 139 evaluation recommendation results, out of 145 users each, are considered in total. Five users were discarded in both data sets due to failed completely filtering. Due to zero score filtering, 1 and 3 users were discarded for the calibration and evaluation results, respectively. Moreover, on average 1.66 and 1.88 friends were removed by zero score filtering from the calibration and evaluation results of each user, respectively.

Figure 5.5 shows the comparison of the interaction-based method for the rank difference measurement with the filtering methods failed completely and zero score. The values represented by the bars indicate the rank difference mean. The lower the rank difference mean, the better the social recommendation method performed. The best results were achieved for the users who answered “occasionally” as their interaction frequency. The rank difference mean for zero score filtering is basically the same for the calibration and evaluation data sets. For the users who selected “never” as interaction frequency, the evaluation results are worse than the calibration results. There are two explanations for this result. First, users who answered “never” in the survey most likely do not have good knowledge about the friends they have chosen to compose the ranking. Second, only few users selected “never” as their interaction frequency (7 users in total, 4 in the calibration and 3 in the evaluation data set). Therefore, the significance of the mean value for the “never” frequency type is very low and can be ignored.

Figure 5.5: Interaction-based method: comparison of the rank difference mean for the filtering methods failed completely and zero score.

Figure 5.6 shows the comparison of the interaction-based method for the top5 match count measurement. The top5 match count measurement shows the same values for all filtering methods, because the filtering does not affect this measurement. Therefore, for simplicity, only one statistic is shown for each data set. The values represented by the bars indicate the top5 match count mean, and the error bars represent the corresponding standard deviation. The higher the top5 match count mean and the lower the standard deviation, the better the social recommendation method performed. The best results were achieved for the users who answered “occasionally” as their interaction frequency. For the top5 match count measurement, too, the interaction-based social recommendation method achieves basically the same results for both data sets. As already mentioned, the measurement results for users who answered “never” vary considerably due to the low confidence in the mean and can be ignored.

Figure 5.6: Interaction-based method: comparison of the top5 match count mean.

Based on these evaluation results, there is evidence that the calibrated interaction-based social recommendation method is a valid approach to determine, on average, at least 2 friends (out of 5) that the OSN user also perceives as the ones he/she interacts most with. An interesting observation concerns the parameter set chosen by the calibration, which shows that the Facebook users who answered the survey perceive public interaction as more important than private interaction (i.e., the public post weights are three and nine times higher than the private message weights). It is possible that Facebook users either interact more with their friends through public posts than through private messages, or that they are more likely to remember with whom they interacted publicly than privately.
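To illustrate what the calibrated weights imply, the following sketch scores a friend as a plain weighted sum of interaction counts. This linear form is an assumption for illustration only. The actual scoring formula is defined earlier in the thesis and may differ, and the ratio parameter r is deliberately left out because this section does not state how it enters the score. Only the four weight values are taken from Table 5.8.

```python
from typing import Dict

# Calibrated weights of parameter set 1277 (Table 5.8):
# public post, public post replied, private message, private message replied.
WEIGHTS: Dict[str, int] = {"pp": 3, "ppr": 9, "pm": 1, "pmr": 1}


def interaction_score(counts: Dict[str, int],
                      weights: Dict[str, int] = WEIGHTS) -> int:
    """Weighted sum of interaction counts per category (assumed linear form).

    Categories missing from `counts` are treated as zero interactions.
    """
    return sum(weight * counts.get(category, 0)
               for category, weight in weights.items())
```

Under this assumed form, a friend with two public posts and one replied public post scores 3·2 + 9·1 = 15, whereas a friend with two private messages and one replied private message scores 1·2 + 1·1 = 3, so the same amount of activity counts five times as much when it is public.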

Location-Based Social Recommendation Method
For the statistics based on zero score filtering including all results, 92 calibration and 99 evaluation recommendation results, out of 145 users each, are considered in total. Due to failed completely filtering, 6 calibration and 5 evaluation users were discarded. A further 40 and 39 users, for the calibration and evaluation respectively, were discarded because they had no own location data at all, so that the location-based social recommendation method is not able to produce meaningful results for them. Moreover, 7 and 2 users, for the calibration and evaluation respectively, were discarded due to zero score filtering. On average, 1.99 and 2.27 friends were removed from the calibration and evaluation results, respectively, due to zero score filtering.
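The user counts reported for the location-based method can be checked with a short calculation. All numbers are taken from the paragraph above. The dictionary keys and the helper name `remaining_results` are chosen for illustration.

```python
TOTAL_USERS_PER_DATA_SET = 145

# Users discarded per data set and reason, as reported in the text.
DISCARDED = {
    "calibration": {"failed_completely": 6, "no_own_location": 40, "zero_score": 7},
    "evaluation": {"failed_completely": 5, "no_own_location": 39, "zero_score": 2},
}


def remaining_results(data_set: str) -> int:
    """Recommendation results left after all discards for one data set."""
    return TOTAL_USERS_PER_DATA_SET - sum(DISCARDED[data_set].values())
```

The calculation gives 145 - 6 - 40 - 7 = 92 results for the calibration data set and 145 - 5 - 39 - 2 = 99 for the evaluation data set, which matches the reported totals.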

Figure 5.7 shows the comparison of the location-based method for the rank difference measurement with the filtering methods failed completely and zero score. The values represented by the bars indicate the rank difference mean. For all results, the evaluation rank difference mean is higher than the calibration rank difference mean. Taking into account that the rank difference scale is also higher compared to the interaction-based method (309.46/344.97 for the location-based method compared to 47.1/48.67 for the interaction-based method), the evaluation of both the location- and interaction-based methods performed almost equally.


Nevertheless, in terms of the rank difference measurement, the location-based method performs worse than the interaction-based method. An interesting observation is that the lowest rank difference mean is achieved for the users who answered that they “never” attach locations to their posts on Facebook, meaning that the social recommendation method performs best for this particular case. In contrast to the interaction-based method, where only 7 users in total answered “never” as interaction frequency, 71 and 65 users, for the calibration and evaluation respectively, answered that they never attach location data to their Facebook posts. These users should therefore have been filtered out (due to the lack of own locations), but in fact the location-based social recommendation method found user location data for 35 and 37 of these users, for the calibration and evaluation respectively.

Figure 5.7: Location-based method: comparison of the rank difference mean for the filtering methods failed completely and zero score.

Figure 5.8 shows the comparison of the location-based method for the top5 match count measurement. The values represented by the bars indicate the top5 match count mean, and the error bars represent the corresponding standard deviation. For the top5 match count measurement, too, the location-based social recommendation method achieves similar results for both data sets (calibration and evaluation).

Figure 5.8: Location-based method: comparison of the top5 match count mean.


Based on these evaluation results, it can be assumed that the calibrated location-based social recommendation method is a valid approach to determine, on average, at least 1.3 friends (out of 5) that the OSN user perceives as geographically closest to him/her. This number may seem low, but it was observed that Facebook users have much more interaction data than location data, since they tend to avoid attaching locations to their posts. Therefore, the location-based social recommendation method performs worse in both measurements (rank difference and top5 match count). The evaluated parameter set with identifier 424 shows that Facebook users consider friends within a radius of 7 kilometers of their current location as their geographically closest friends.
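The 7 kilometer radius can be illustrated with a great-circle distance check. The haversine formula is used here as a common choice for distances between coordinates. Whether the thesis implementation computes distances this way is not stated in this section, so the formula, the function names, and the example coordinates are assumptions for illustration.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_geographically_close(user: Tuple[float, float],
                            friend: Tuple[float, float],
                            radius_km: float = 7.0) -> bool:
    """True if the friend's location lies within the radius of the user's.

    The default radius corresponds to the calibrated value of 7 kilometers.
    """
    return haversine_km(*user, *friend) <= radius_km
```

As a usage example with approximate coordinates, a user in central Zurich (47.3779, 8.5403) and a friend in the north of the city (47.4143, 8.5497) are only a few kilometers apart and count as close, whereas a friend in Bern (46.9480, 7.4474) lies far outside the radius.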

The cosine similarity between the friends selected by the OSN user for questions 1.2 and 2.2, over both the calibration and evaluation data sets, is on average 0.372 (considering all 290 users). The cosine similarity between the results of the two social recommendation methods is on average 0.09 for the calibration data set and 0.08 for the evaluation data set. The average cosine similarity is thus close to zero for both data sets, which means that the results of the two methods are not similar at all. These findings support hypothesis H2, which states that the social recommendation results of both methods differ in the top 5 friends they reveal.

5.5 Threats to Validity

There are several threats to the validity of the exploratory study results. The calibration and evaluation data samples each consist of 145 users. The possibility remains that, with this sample size, the findings cannot be generalized. In defense, the data samples also include private user data, which allows a more accurate calibration and evaluation than public user data alone. Also, at present, no public data set is known that could be used to calibrate and evaluate the introduced social recommendation methods. The collected user data only covers a time span of six months, due to Facebook limitations. The social recommendation methods were calibrated with six months of data for each user, and a longer time span could invalidate the evaluation.

Furthermore, the brute-force generation of parameter sets was tuned and limited, and therefore the possibility remains that there exist parameter sets which would calibrate the social recommendation methods better. However, this would only improve the results of the evaluation, not degrade them.
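A brute-force generation over a limited value grid can be sketched as a Cartesian product. The parameter names and the grid of candidate values below are hypothetical. The actual ranges used for the calibration are defined elsewhere in the thesis and are not reproduced here.

```python
from itertools import product
from typing import Dict, Iterator, Sequence


def generate_parameter_sets(names: Sequence[str],
                            values: Sequence[int]) -> Iterator[Dict[str, int]]:
    """Yield every combination of the candidate values for the given names."""
    for combination in product(values, repeat=len(names)):
        yield dict(zip(names, combination))


# Hypothetical names for the interaction-based parameters and a limited grid.
INTERACTION_PARAMETERS = ("w_pp", "w_ppr", "w_pm", "w_pmr", "r")
CANDIDATE_VALUES = (1, 3, 5, 7, 9)
```

With five parameters and five candidate values, the grid contains 5^5 = 3125 parameter sets. Any parameter set outside such a limited grid is never evaluated, which is exactly the limitation described above.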

It was observed that users either falsified the survey answers (e.g., by selecting any friend without considering the question semantics) or had a wrong perception of the selected friends. This is indicated by the average number of friends removed out of the 5 selected friends, which is 1.66 and 1.99 for the interaction- and location-based methods, respectively.

Lastly, there is the uncertainty of implementation bugs, which could also falsify the results, since the software produced did not undergo a public code review. However, all methods were thoroughly tested and all known bugs were resolved.


Chapter 6

Summary and Conclusions

This thesis designs, implements, and evaluates OSN capabilities for PiCsMu. In particular, the following components, which compose the OSN capabilities, were introduced to PiCsMu: (1) an independent OSN authentication mechanism, which authenticates OSN users, (2) an OSN meta API (JSocialLib), which consists of a general OSN data model and an API, (3) a PiCsMu social network with basic social-relationship functionality, and (4) social recommendation methods. Component (2) provides means to collect and analyze OSN user data, based on a general OSN data model, independent of the underlying OSN. Components (1), (2), and (3) form the basis for component (4), the social recommendation.

In particular, two social recommendation methods (interaction-based and location-based) were designed, calibrated, and evaluated by means of an exploratory study. The study conducted a web-based survey that was used to collect answers from 290 Facebook users. The survey asked about the 5 friends that the user interacted most with, and the 5 friends that were geographically closest, in the past six months. Since the users agreed to collaborate and provide their public and private data, a data collection sample was cached. Based on the collected data, it was possible to calibrate and evaluate both social recommendation methods. The evaluation of the calibrated social recommendation methods shows that, e.g., Facebook users (i) perceive public post interactions as at least three times more important than private message interactions, and (ii) perceive friends within a radius of 7 kilometers as geographically close. Furthermore, the interaction-based social recommendation method is able to recommend, on average, 2 out of the 5 friends that the OSN user also perceives as the ones he/she interacted most with. The location-based social recommendation method is able to recommend, on average, 1.3 out of the 5 friends that the OSN user also perceives as the ones he/she is geographically closest to. Hypothesis H1 is partially confirmed, since H1 holds true only for the interaction-based social recommendation method. Hypothesis H2 is confirmed for both social recommendation methods using the cosine similarity comparison. Unfortunately, in the scope of this thesis, hypothesis H3 could not be answered due to the design of the survey, and is subject to future work. It is assumed that the combination of both social recommendation methods leads to even more accurate trusted friend recommendations.


This thesis evaluates the introduced social recommendation methods for Facebook only. Future work should address this constraint and consider other OSNs to check whether the findings can be generalized. The data set obtained by the study contains public and private Facebook user data and is of high value for the research community. Therefore, the collected data set can be anonymized and published. Related to the implementation, new OSNs (e.g., Google+) should be integrated into the OSN meta API (JSocialLib) to enable social recommendation based on more OSNs.

Related to the calibration of the social recommendation methods, a logistic regression analysis could reveal an even better parameter set, if one exists, which would lead to better evaluation results. Lastly, the recommendation system should be evaluated within an application, e.g., PiCsMu, in order to determine the acceptance rate of the recommendations provided to users.

Page 57: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

Bibliography

[1] Apache maven website. Available at: http://maven.apache.org. Last visited on:02.01.2014.

[2] Apache tomcat website. Available at: http://tomcat.apache.org. Last visited on:02.01.2014.

[3] Apache web server website. Available at: http://httpd.apache.org/. Last visitedon: 02.01.2014.

[4] Badoo website. Available at: http://www.badoo.com. Last visited on: 26.12.2013.

[5] Cnbc facebook ipo. Available at: http://www.cnbc.com/id/47043815. Last visitedon: 26.12.2013.

[6] Cnbc twitter ipo. Available at: http://www.cnbc.com/id/101192368. Last visitedon: 26.12.2013.

[7] Facebook bug report. Available at: https://developers.facebook.com/x/bugs/694034083948939/. Last visited on: 02.01.2014.

[8] Facebook developer documentation. Available at: https://developers.facebook.com/docs/. Last visited on: 02.01.2014.

[9] Facebook website. Available at: http://www.facebook.com. Last visited on:16.12.2013.

[10] Findthebest website. Available at: http://social-networking.findthebest.com.Last visited on: 16.12.2013.

[11] Google analytics website. Available at: https://www.google.com/analytics/. Lastvisited on: 02.01.2014.

[12] Google+ website. Available at: https://plus.google.com. Last visited on:16.12.2013.

[13] Instagram website. Available at: http://www.instagram.com. Last visited on:26.12.2013.

[14] Instagram website. Available at: http://instagram.com/press/. Last visited on:16.12.2013.

47

Page 58: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

48 BIBLIOGRAPHY

[15] Linkedin website. Available at: http://www.linkedin.com. Last visited on:26.12.2013.

[16] Mediabistro website. Available at: http://www.mediabistro.com/alltwitter/

linkedin-more-users-than-twitter_b51089. Last visited on: 16.12.2013.

[17] Mysql website. Available at: http://www.mysql.com/. Last visited on: 02.01.2014.

[18] Oauth integration service. Available at: https://oauth.io/. Last visited on:26.12.2013.

[19] Openzap website. Available at: http://openzap.com/

top-10-social-networking-sites. Last visited on: 16.12.2013.

[20] Picsmu website. Available at: http://www.pics.mu. Last visited on: 16.12.2013.

[21] Restfb website. Available at: http://restfb.com. Last visited on: 02.01.2014.

[22] Rfc 4627: Java script object notation. Available at: http://tools.ietf.org/html/rfc4627. Last visited on: 26.12.2013.

[23] Rfc 5849: The oauth 1.0 protocol. Available at: http://tools.ietf.org/html/

rfc5849. Last visited on: 26.12.2013.

[24] Rfc 6749: The oauth 2.0 authorization framework. Available at: http://tools.

ietf.org/html/rfc6749. Last visited on: 26.12.2013.

[25] Socios project source code. Available at: https://github.com/SocIoSEUProject/SocIoS. Last visited on: 16.12.2013.

[26] Socios project website. Available at: http://www.sociosproject.eu. Last visitedon: 16.12.2013.

[27] Statisticbrain website. Available at: http://www.statisticbrain.com/

twitter-statistics. Last visited on: 16.12.2013.

[28] Survey website. Available at: http://social.csg.uzh.ch. Last visited on:26.12.2013.

[29] Techcrunch website. Available at: http://techcrunch.com/2013/05/01/

facebook-sees-26-year-over-year-growth-in-daus-23-in-maus-mobile-54.Last visited on: 16.12.2013.

[30] Thenextweb website. Available at: http://thenextweb.com/facebook/2013/10/

30/facebook-passes-1-19-billion. Last visited on: 16.12.2013.

[31] Tumblr website. Available at: http://www.tumblr.com. Last visited on: 26.12.2013.

[32] Twitter api rate limits. Available at: https://dev.twitter.com/docs/

rate-limiting/1.1/limits. Last visited on: 02.01.2014.

[33] Twitter website. Available at: https://www.twitter.com. Last visited on:16.12.2013.

Page 59: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

BIBLIOGRAPHY 49

[34] Twitter4j website. Available at: http://twitter4j.org. Last visited on: 02.01.2014.

[35] Usa today website. Available at: http://www.usatoday.com/story/tech/2013/10/29/google-plus/3296017. Last visited on: 16.12.2013.

[36] Charu C. Aggarwal, Joel L. Wolf, Kun-Lung Wu, and Philip S. Yu. Horting hatchesan egg: A new graph-theoretic approach to collaborative filtering. In Proceedingsof the Fifth ACM SIGKDD International Conference on Knowledge Discovery andData Mining, KDD ’99, pages 201–212, New York, NY, USA, 1999. ACM.

[37] L. A. N. Amaral, A. Scala, M. Barthelemy, and H. E. Stanley. Classes of small-worldnetworks. Proceedings of the National Academy of Sciences, 97(21):11149–11152,October 2000.

[38] Marko Balabanovic and Yoav Shoham. Fab: Content-based, collaborative recom-mendation. Communications of the ACM, 40:66–72, 1997.

[39] A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science,286:509–512, 1999.

[40] A. Bleicher. The anti-facebook. Spectrum, IEEE, 48(6):54–82, 2011.

[41] John S. Breese, David Heckerman, and Carl Kadie. Empirical analysis of predictivealgorithms for collaborative filtering. In Proceedings of the Fourteenth Conference onUncertainty in Artificial Intelligence, UAI’98, pages 43–52, San Francisco, CA, USA,1998. Morgan Kaufmann Publishers Inc.

[42] Li Chao, Yu Jian, Li Xiang, and Chen Jia Hui. A social network system orientedhybrid recommendation model. In Computer Science and Network Technology (ICC-SNT), 2012 2nd International Conference on, pages 901–906, 2012.

[43] Cheng-Hao Chu, Wan-Chuen Wu, Cheng-Chi Wang, Tzung-Shi Chen, and Jen-JeeChen. Friend recommendation for location-based mobile social networks. In Inno-vative Mobile and Internet Services in Ubiquitous Computing (IMIS), 2013 SeventhInternational Conference on, pages 365–370, 2013.

[44] Mukund Deshpande and George Karypis. Item-based top-n recommendation algo-rithms. ACM Trans. Inf. Syst., 22(1):143–177, January 2004.

[45] M. Durr, M. Maier, and F. Dorfmeister. Vegas – a secure and privacy-preserving peer-to-peer online social network. In Privacy, Security, Risk and Trust (PASSAT), 2012International Conference on and 2012 International Confernece on Social Computing(SocialCom), pages 868–874, 2012.

[46] Roy Thomas Fielding. Architectural Styles and the Design of Network-based SoftwareArchitectures. Phd thesis, University of California, 2000.

[47] W. Gellert and Van Nostrand Reinhold Company. The VNR concise encyclopedia ofmathematics. Van Nostrand Reinhold Co., 1977.

Page 60: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

50 BIBLIOGRAPHY

[48] M. Gjoka, M. Kurant, C.T. Butts, and A. Markopoulou. Walking in facebook: Acase study of unbiased sampling of osns. In INFOCOM, 2010 Proceedings IEEE,pages 1–9, 2010.

[49] M. Gjoka, M. Kurant, C.T. Butts, and A. Markopoulou. Walking in facebook: Acase study of unbiased sampling of osns. In INFOCOM, 2010 Proceedings IEEE,pages 1–9, 2010.

[50] David Goldberg, David Nichols, Brian M. Oki, and Douglas Terry. Using collab-orative filtering to weave an information tapestry. Commun. ACM, 35(12):61–70,December 1992.

[51] Ze Li and Haiying Shen. Social-p2p: Social network-based p2p file sharing system.In Network Protocols (ICNP), 2012 20th IEEE International Conference on, pages1–10, 2012.

[52] Weiyang Lin, Sergio A. Alvarez, and Carolina Ruiz. Collaborative recommendationvia adaptive association rule mining. In Data Mining and Knowledge Discovery, 2000.

[53] G.S. Machado, F.V. Hecht, M. Waldburger, and B. Stiller. Bypassing cloud providers’data validation to store arbitrary data. In Integrated Network Management (IM2013), 2013 IFIP/IEEE International Symposium on, pages 1–8, 2013.

[54] J. Naruchitparames, M.H. Gunes, and S.J. Louis. Friend recommendations in socialnetworks using genetic algorithms and network topology. In Evolutionary Computa-tion (CEC), 2011 IEEE Congress on, pages 2207–2214, 2011.

[55] Roozbeh Nia, Fredrik Erlandsson, Henric Johnson, and S.Felix Wu. Leveraging so-cial interactions to suggest friends. In Distributed Computing Systems Workshops(ICDCSW), 2013 IEEE 33rd International Conference on, pages 386–391, 2013.

[56] V. Podobnik, D. Striga, A. Jandras, and I. Lovrek. How to calculate trust betweensocial network users? In Software, Telecommunications and Computer Networks(SoftCOM), 2012 20th International Conference on, pages 1–6, 2012.

[57] Amit Singhal. Modern information retrieval: a brief overview. BULLETIN OFTHE IEEE COMPUTER SOCIETY TECHNICAL COMMITTEE ON DATA EN-GINEERING, 24:2001, 2001.

[58] Q. Xiu-Quan, Y. Chun, L. Xiao-Feng, and C. Jun-Liang. A trust calculating algo-rithm based on social networking service users’ context. 2011 Chinese Journal ofComputers, 34(12):2403.

[59] Changchun Yang, Jing Sun, and Ziyi Zhao. Personalized recommendation based oncollaborative filtering in social network. In Progress in Informatics and Computing(PIC), 2010 IEEE International Conference on, volume 1, pages 670–673, 2010.

[60] Xiao Yu, Ang Pan, Lu-An Tang, Zhenhui Li, and Jiawei Han. Geo-friends recommen-dation in gps-based cyber-physical social network. In Advances in Social NetworksAnalysis and Mining (ASONAM), 2011 International Conference on, pages 361–368,2011.

Page 61: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

BIBLIOGRAPHY 51

[61] Jing Zhang, Jie Tang, Bangyong Liang, Zi Yang, Sijie Wang, Jingjing Zuo, and JuanziLi. Recommendation over a heterogeneous social network. In Web-Age InformationManagement, 2008. WAIM ’08. The Ninth International Conference on, pages 309–316, 2008.

Page 62: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

52 BIBLIOGRAPHY

Page 63: Online Social Network Capabilities for a Platform …Online Social Network Capabilities for a Platform-Independent Cloud Storage System for Multi-Usage Alexander Filitz Zürich, Switzerland

Abbreviations

API Application Programming Interface
CPU Central Processing Unit
CS Complete Surveys
CSG Communication Systems Group
DHT Distributed Hash Table
FV First-Time Visits
GByte Gigabyte
H1 Hypothesis 1
H2 Hypothesis 2
H3 Hypothesis 3
IPO Initial Public Offering
MByte Megabyte
OSN Online Social Network
P2P Peer-to-Peer
PiCsMu Platform-independent Cloud Storage System for Multi-Usage
PM Private Message
PMR Private Message Replied
PP Public Post
PPR Public Post Replied
RAM Random Access Memory
RFC Request For Comments
RE1 Requirement 1
RE2 Requirement 2
RR Recommendation Result
SNA Social Network Analysis
SSD Solid State Drive
UC1 Use Case 1
UC2 Use Case 2
URL Uniform Resource Locator
UML Unified Modeling Language
UP User Perception
UZH University of Zurich


Glossary

Application Programming Interface
An API specifies how software components should interact with each other.

Distributed Hash Table
A DHT is used in distributed systems to provide a lookup service by storing key/value pairs similar to a hash table.

File Sharing Systems
Centralized or distributed applications that provide access to digitally stored data, e.g., documents, audio, and video.

Hypothesis
A hypothesis is a proposed explanation for a phenomenon. Scientific hypotheses require that one can test them.

Peer-to-Peer System
A P2P system is a self-organizing system of equal, autonomous entities (peers) which aims for the shared usage of distributed resources in a networked environment avoiding central services.

Software Library
A software library is a reusable software component with well-defined interfaces, e.g., the libraries used for the OSN meta API (RestFB, Twitter4J).


List of Figures

2.1 The SocIoS system architecture.

3.1 The PiCsMu system overview.

3.2 The OSN authentication communication flow.

3.4 The star graph recommendation result.

3.5 A location matching example.

3.3 The OSN meta API data model.

5.1 The friend selector survey element.

5.2 The friend ranking survey element

5.3 Survey visitor demographics: gender and age.

5.4 Survey visitor demographics: countries, cities and language.

5.5 Interaction-based method: comparison of the rank difference mean for the filtering methods failed completely and zero score.

5.6 Interaction-based method: comparison of the top5 match count mean.

5.7 Location-based method: comparison of the rank difference mean for the filtering methods failed completely and zero score.

5.8 Location-based method: comparison of the top5 match count mean.


List of Tables

2.1 A comparison of OSNs. The numerical values are represented in millions. . 6

2.2 A comparison of social recommendation. . . . . . . . . . . . . . . . . . . . 8

3.1 The OSN meta API user attributes. . . . . . . . . . . . . . . . . . . . . . . 14

3.2 The meta API interface methods and their intended use. . . . . . . . . . . 16

3.3 The interaction categories for the interaction-based social recommendationmethod. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

5.1 The questions of the survey. . . . . . . . . . . . . . . . . . . . . . . . . . . 32

5.2 Summary of the cached Facebook data. . . . . . . . . . . . . . . . . . . . . 35

5.3 Complete parameter set for both social recommendation methods. . . . . . 35

5.4 An example of six generated parameter sets for the interaction-based method. 36

5.5 Total calibration runs for both social recommendation methods. . . . . . . 36

5.6 An example of the rank difference measurement. . . . . . . . . . . . . . . . 37

5.7 An example of the match count measurement. . . . . . . . . . . . . . . . . 37

5.8 Optimized parameter sets found from the social recommendation methods calibration step. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

List of Listings

4.1 A sample Facebook query with the RestFB library. . . . . . . . . . . . . . 25

4.2 A sample Twitter query with the Twitter4J library. . . . . . . . . . . . . . 25

4.3 The social network service interface. . . . . . . . . . . . . . . . . . . . . . . 26

4.4 The recommendation service interface. . . . . . . . . . . . . . . . . . . . . 26

4.5 An example of an asynchronous communication flow. . . . . . . . . . . . . 27

4.6 An example of parallel computing. . . . . . . . . . . . . . . . . . . . . . . 27

Appendix A

Installation Guidelines

The latest project configuration is available as a Maven repository. In addition, the source code can be accessed through Subversion (SVN) upon request to the author.

Repositories:

Maven: http://www.pics.mu/maven/

SVN: https://www.pics.mu/svn/picsmu-social

System Requirements: Java version 1.6 or later (JDK or JRE)

Appendix B

Contents of the CD

picsmu-social Folder including all source code related to this thesis.

picsmu-social-documents Folder including all documents related to this thesis.

Masterthesis.pdf This thesis as PDF.

Abstract.txt Abstract in English.

Zusfsg.txt Abstract in German.
