Network Structure For Social Network


Literature Survey to discuss topographical structure of social networks and information propagation

Sathe, Vaibhav

Indian Institute of Management Lucknow

IIM Campus, Prabandh Nagar, Off Sitapur Road, Lucknow, Uttar Pradesh – 226013, INDIA [email protected]

I. INTRODUCTION

Facebook’s user base, currently 800 million and still growing, together with the increasing time users spend on the site, has attracted a lot of attention from researchers in various fields. Recently Facebook has been used as a platform for organizing mass protests in countries of the Middle East. Events in India, such as the rise of India Against Corruption and its Facebook following of 500,000 people, have also underscored the rising power of social media. This has resulted in clashes with governments that seek to curtail the power of social networks and their users to spread messages without restriction. In our research, we want to model this censorship activity. This literature survey supports that research by establishing the network concepts required for modelling social networks, primarily the structure of the network and how a message spreads.

We review some well-cited papers published in top Information Systems journals to identify the various dimensions required for the modelling exercise.

II. PROBLEM DEFINITION

The objectives of this literature review are as follows.

(1) Structure of Social Networks:

In order to model a social network, we need to determine which model from network science applies to it. The probable options are the small-world, random and scale-free network models. Different social networks may display different structures because of fundamental differences between them. From the point of view of censorship, we focus on social networks like Facebook. Facebook holds the greatest interest because it has the largest user base, which gives it the capability to influence the behaviour of the actors involved in a censorship-related study.

(2) Information Propagation Pattern:

In order to identify the parameters that model the user interactions leading to information diffusion on a social network, we need to understand how information spreads on networks and which factors affect it.

III. LITERATURE SEARCH

The literature surveyed is divided into the following sections.

A. Structure of Social Networks

The following articles contribute to the first objective, determining the structure of social networks. Detailed references are given in the References section.

Sr. | Article/Paper | Journal/Publisher
1 | Measurement and Analysis of Online Social Networks | ACM
2 | Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network | IEEE
3 | A fast algorithm for simulating scale-free networks | ICCTA (IEEE)
4 | Social Search in “Small-World” Experiments | World Wide Web Consortium
5 | Reciprocity in evolving social networks | Journal of Evolutionary Economics

B. Information Propagation

The following articles contribute to the second objective, determining the patterns in information spread. Detailed references are given in the References section.

Sr. | Article/Paper | Journal/Publisher
1 | Network Effects and Personal Influences: The Diffusion of an Online Social Network | Journal of Marketing Research
2 | Forward or delete: What drives peer-to-peer message propagation across social networks? | Journal of Consumer Behaviour
3 | User Interactions in Social Networks and their Implications | EuroSys’09, ACM
4 | Online organization of offline Protest: From Social to Traditional Media and Back | HICSS 2011
5 | Information propagation analysis in a social network site | IEEE
6 | Detecting and Characterizing Social Spam Campaigns | IMC’10, ACM

IV. TERMINOLOGIES

Let’s look at some of the terminology required to understand the concepts discussed in this review.

Power Law:

When the frequency of an event varies inversely with a power of its quantifiable size, the relationship is said to follow a power law. One characteristic of such a distribution is a large difference between its mean and its median.
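
The gap between mean and median can be illustrated numerically. The following is a minimal sketch, assuming a continuous power law p(x) ~ x^(-alpha) for x >= x_min; the exponent 2.5, the sample size and the function name `sample_power_law` are illustrative choices, not values taken from the surveyed papers.

```python
import random
import statistics

def sample_power_law(n, alpha=2.5, x_min=1.0, seed=42):
    """Draw n samples from p(x) ~ x^(-alpha), x >= x_min (inverse-transform sampling)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the base of the power is never zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

samples = sample_power_law(100_000)
mean = statistics.mean(samples)      # pulled upwards by a few very large values
median = statistics.median(samples)  # stays close to the typical (small) value
print(f"mean={mean:.2f} median={median:.2f}")
```

A few very large samples dominate the mean, while the median stays near the smallest values. Degree distributions with hubs show the same signature.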

Types of networks:

A. Random Networks

Random networks are unstructured networks with low clustering. They do not occur in nature. They are studied theoretically to provide a baseline for the study of more structured networks such as small-world and scale-free networks.

B. Small World Network

Small-world networks have a small average path length, owing to a large number of interconnections, and a high clustering coefficient.

C. Scale-Free Network

Scale-free networks are those whose degree distribution follows a power law, i.e. the network consists of a small number of highly connected users and a large number of less connected users.

Terms related to networks:

(1) Network Diameter: The maximum internode distance, i.e. the longest of the shortest paths between any two nodes, is called the diameter of the network.

(2) Indegree: The number of inward connections of a given user.

(3) Outdegree: The number of outward connections of a given user. Indegree and outdegree are valid measures when the network is a directed graph. Networks like Facebook and Orkut are symmetric networks, i.e. for any user, indegree and outdegree are equal.

(4) Assortativity: A measure of the likelihood that a node in the network establishes links with other nodes that are similar to it on some parameter.
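
The first three terms can be made concrete on a tiny graph. The sketch below assumes a hypothetical follower-style edge list in which (a, b) means "a follows b"; the user names and the helper names `in_out_degrees` and `diameter` are illustrative only.

```python
from collections import deque

def in_out_degrees(edges):
    """Return (indegree, outdegree) dicts for a list of directed (src, dst) edges."""
    indeg, outdeg = {}, {}
    for src, dst in edges:
        for node in (src, dst):
            indeg.setdefault(node, 0)
            outdeg.setdefault(node, 0)
        outdeg[src] += 1
        indeg[dst] += 1
    return indeg, outdeg

def diameter(edges):
    """Longest shortest path, ignoring edge direction; assumes a connected graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    longest = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        longest = max(longest, max(dist.values()))
    return longest

# Hypothetical asymmetric (follower-style) network.
follows = [("ann", "bob"), ("cat", "bob"), ("bob", "dan"), ("dan", "eve")]
indeg, outdeg = in_out_degrees(follows)

# A symmetric (friendship-style) network stores every link in both directions.
friends = follows + [(b, a) for a, b in follows]
sym_in, sym_out = in_out_degrees(friends)
```

In the follower-style list, `bob` has indegree 2 and outdegree 1. In the symmetric version every user's indegree equals their outdegree, as stated for Facebook and Orkut above.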

Information shared on Facebook:

The information that is created and shared on Facebook comes from various sources. These are as follows:

(1) Status Messages: Users can share a text message as their status. It is visible to other users (friends or others) on the user’s wall. The message also appears in the news feed of users who are friends of the user and/or subscribed to the user’s updates.

(2) Hyperlink: A hyperlink to some other location on the Internet, typically news of interest, is another source of shared information. Friends can like, share and comment on such links.

(3) Photo: Photographs, typically taken by the user, are frequently shared, liked and commented on.

(4) Community/Group: Facebook has groups dedicated to various topics. A message posted by or on the community is typically shared by a user so that his subscribers, who may not have access to the community, can view it.

(5) Person: Famous people like Bill Gates have their own personal pages, which are distinct from groups. These are used to send personal images and links to thousands of subscribers, in the same way these personalities use Twitter today. This is one-way communication.

(6) Event Invitations: Users can create events and invite people. Users can also forward event invites.

V. DATA EVALUATION

This section is split into the subsections below.

A. Social Networks

Before starting, let’s look at what social networks are and how online social networks differ.

The social network concept applies to naturally formed networks such as communities, family ties and relationships. For example, people in a town know each other within one residential area. They also know some more people at their workplace. They also tend to want to know more people, and try to gain access to a larger set of contacts through a person they think is well connected. The information exchange may be intentional or unintentional. The study of social networks focuses on critical issues such as the spread of disease, the spread of news, riots, fads and social awareness.

Online social networks demonstrate similar characteristics, with the exception that users are not in physical contact with each other. Examples of online social networks include Facebook, Twitter, Flickr, YouTube and any other site that facilitates interaction between users. The interaction can be one-to-one (Google Talk), one-to-many (Facebook) or many-to-many (forums), depending on the nature of the site.

B. Structure of Social Networks

Which graph structure social networks follow has been a very interesting topic for researchers, as determining it is a fundamental step in any modelling or simulation of the network.

Mislove et al [2], in their paper on the measurement and analysis of online social networks, try to identify various characteristics of social networks. In their experiment they collected data on over 11.3 million users of Orkut, YouTube, Flickr and LiveJournal. Network analysis showed that each of these networks follows a power law in its degree distribution. In addition, they identified that these social networks display scale-free and small-world properties. All the networks show high clustering.

The authors identify an interesting parameter: whether consent is required from the second party for the first party to establish a connection. An example is Twitter, where anyone can follow you and you need not follow them back. On Facebook, on the other hand, someone who wants to be friends with you needs to send a request, and only when you approve do you both become friends. Twitter is an example of an asymmetric network, in which a user’s indegree and outdegree can differ. Facebook is an example of a symmetric network, in which each user has identical indegree and outdegree. The characteristics of a network vary with this parameter. Symmetric networks have more connections among users and hence form stronger clusters, thereby reducing network diameter. Hence, they display the characteristics of a small-world network. Among the examples analysed by the authors, we need to focus on Orkut, as it is the most closely related to Facebook.

To understand the limitations, we need to note the complex structure of Facebook. Although friendship is one of the prime ways in which Facebook disseminates information, we need to consider other ways, such as groups and pages to which a user subscribes, thereby creating a directed or asymmetric relationship. Facebook now also allows users to subscribe to status updates from other users without explicit consent. This has turned Facebook into a hybrid network with different types of nodes.

With regard to cluster formation, the authors state that online social networks score high on degree assortativity: users of high degree establish relations with other users of high degree, while users of low degree establish relations with other users of low degree. This appears to conflict with scale-free properties, under which low-degree users tend to attach to high-degree users, forming a hub-and-spoke model.
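
The tension between assortative mixing and a hub-and-spoke structure can be seen numerically. The sketch below is an illustrative calculation, not the procedure used by Mislove et al. It computes the Pearson correlation between the degrees at the two ends of every link, which is the usual degree assortativity coefficient. The two example graphs and the function name `degree_assortativity` are hypothetical.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at both ends of each undirected link."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    deg = {node: len(nbrs) for node, nbrs in adj.items()}
    xs, ys = [], []
    for a in adj:            # every link is visited once in each direction,
        for b in adj[a]:     # so xs and ys have the same mean and variance
            xs.append(deg[a])
            ys.append(deg[b])
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    # The coefficient is undefined when all degrees are equal; return 0.0 then.
    return cov / var if var else 0.0

# Hub and spoke: one hub linked to four low-degree users.
star = [("hub", "u1"), ("hub", "u2"), ("hub", "u3"), ("hub", "u4")]

# Like links to like: an isolated pair (degree 1) and a triangle (degree 2).
like_with_like = [("a", "b"), ("x", "y"), ("y", "z"), ("z", "x")]

r_star = degree_assortativity(star)
r_like = degree_assortativity(like_with_like)
```

The star graph gives a coefficient of -1 (fully disassortative), while the second graph gives +1 (fully assortative). Real networks fall between these extremes.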

Social networks are examples of very large-scale networks, and they are not random. In contrast to the random graphs studied by Erdos and Renyi [6], networks like social networks evolve with particular patterns and have a definite structure.

Wei Ren and Jianping Li’s paper [4] proposes the RX algorithm for simulating scale-free networks, which they claim performs better than the popular Barabasi-Albert (BA) algorithm. The authors state that, as the number of nodes increases, the time required by RX is much less than that taken by BA. They conclude that networks that expand continuously exhibit the characteristics of scale-free networks. Since social networks are both very large and continuously expanding, scale-free characteristics apply to them. The same is true of an online social network like Facebook, which currently has 800 million users and is growing very rapidly in both total users and average number of friends.
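
For orientation, the baseline BA growth process can be sketched in a few lines. This is a minimal illustration of preferential attachment only. It is not the RX algorithm of Ren and Li, whose details are not reproduced in this survey, and the function name, seed and network size are illustrative assumptions.

```python
import random
import statistics
from collections import Counter

def barabasi_albert(n, m, seed=0):
    """Grow an undirected graph to n nodes; each new node links to m distinct
    existing nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    edges = []
    endpoints = []            # each node appears once per incident link
    targets = list(range(m))  # the first new node links to all m seed nodes
    for new in range(m, n):
        edges.extend((new, t) for t in targets)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # uniform draw from endpoints = degree-biased draw
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges

edges = barabasi_albert(n=2000, m=2)
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

max_degree = max(degree.values())
median_degree = statistics.median(degree.values())
print(f"max degree={max_degree}, median degree={median_degree}")
```

Most nodes keep a degree close to m, while a few early nodes grow into hubs. This is the small number of highly connected users described in the definition of scale-free networks.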

Yixiao Li et al [3] make the important observation that the social network model exhibits community structure. Their paper correctly bases its clustering method on the principle that “birds of a feather flock together”: users who have something in common tend to form clusters or groups with many interconnections among them. This does not agree with the statement in the paper of Mislove et al [2] that users of high degree tend to connect to other users of high degree, and vice versa. The paper further establishes that communities develop into scale-free networks as they keep expanding.

One more factor discussed in the literature is the user’s intention. Goel et al [7] distinguish, from the standpoint of physical social networks, between topological connection and algorithmic connection (the intention to connect), using the example of the spread of disease in a social network. The paper thus differentiates network structure based on the intention of the user. The next paper extends this concept by considering how such intentions evolve, which makes the network very dynamic.

The paper by Jun and Sethi [8] discusses how social network structure develops in a dynamic and continuously evolving environment. Changes in the network are modelled as random rewiring. To a certain extent, some old links are also severed over time. In physical as well as online social networks, this is due to changes in one’s lifestyle in terms of location, community memberships and so on. Changes may also occur in the intention factor, which is treated as conditional cooperation. Over time, a user’s reasons to connect can evolve, e.g. looking for a relationship, friendship or professional networking. Another important observation by the authors concerns the increasing degree of the network. With increasing degree, clustering increases, as the neighbours of one node are likely to be neighbours of each other. A social network like Facebook follows the same phenomenon. Hence, the diameter of the network reduces. The paper identifies, as scope for future research, the influence of the behaviour of non-neighbours on a given user. This is also a valid scenario considering the features of Facebook: user A may receive updates about the interaction of a particular friend B with B’s friend C, who is not a friend of user A. We discuss this propagation in the next section.
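
The statement that neighbours of one node are likely to be neighbours of each other is what the local clustering coefficient measures. The sketch below uses a hypothetical four-user friendship graph, and the function name `local_clustering` is illustrative.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # with fewer than two neighbours there are no pairs to link
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * linked / (k * (k - 1))

# Hypothetical friendship graph: A, B and C are mutual friends; D knows only A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

c_a = local_clustering(adj, "A")  # 1 linked pair out of 3 possible
c_b = local_clustering(adj, "B")  # both neighbours know each other
```

Averaging this value over all users gives a network-level clustering measure of the kind referred to throughout this section.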

C. Information Propagation

Harvey et al [9], in their paper on viral marketing on the Internet, researched how users forward or delete a particular message on a social network like YouTube. From our research point of view, the observations on this forwarding behaviour are important, as they also apply to user behaviour on a social network like Facebook. The authors identified that the likelihood of a video being forwarded is closely correlated with sender involvement, sender tie strength and the amount of online communication across ties. We explain these factors briefly. Sender involvement, as explained by Norman [10], is the relation of the subject to the person’s needs. Sender tie strength means how close the user is to the sender of the message. The third factor is the amount of communication that the sender has with the probable recipient to whom he would forward the message. The authors reject the hypothesis that knowledge of how to forward a given message is correlated with the likelihood of forwarding.

Skoric et al [12] discuss the parameter of trust, which is similar to the tie with the sender discussed in the previous paper. The authors say that, in general, users trust their friends over any other person, such as a political leader or an advertiser. This means that when a friend forwards or shares a message, they treat it as a serious message, which improves the likelihood that they forward it. The research also identifies groups, events and status messages as the tools on Facebook by which users can reach their immediate and extended friends in a fast, easily accessible and cost-effective way. One important contribution of this paper is the finding that the spread of such messages will be limited to individuals who are mostly similar and fall in one category of politically engaged and socially active people. This is typically because such messages spread only through friendship networks, which are formed with intentions other than spreading such messages. Friends generally have similar thought processes and hence are similar on the above parameters.

Katona et al [1] bring out some critical points about the sender’s influence. First, they discuss that as the number of contacts of a recipient increases, the influence that any particular individual has on him is diluted accordingly. The second factor is that of brokers. We have already seen that social networks demonstrate characteristics of scale-free and small-world networks. This means that between different clusters of users there are a few users in common, who form prominent nodes linking the clusters. As shown empirically, since they control a large amount of information, they have higher influential power.

Another very interesting observation is made by Wilson et al [11]. The authors say that links or connections on a social network like Facebook are not indicators of interaction between the linked users. This is primarily due to the time constraints that users face. So, not all friendships are equally meaningful. The authors therefore propose the interaction graph as a more valid indicator of social connectivity than friendship links. An interesting observation they make is that such an interaction graph does not exhibit small-world characteristics. The authors therefore favour the scale-free network pattern when it comes to the interactions that happen among users.
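
The idea of an interaction graph can be sketched as a filter over the friendship graph. This is a simplified illustration under stated assumptions, not the construction used by Wilson et al: the interaction counts, the threshold parameter `min_interactions` and the function name are hypothetical.

```python
def interaction_graph(friend_links, interaction_counts, min_interactions=1):
    """Keep only friendship links with at least min_interactions recorded interactions."""
    kept = []
    for a, b in friend_links:
        # Counts are keyed by the unordered pair of users.
        count = interaction_counts.get(frozenset((a, b)), 0)
        if count >= min_interactions:
            kept.append((a, b))
    return kept

# Hypothetical data: four friendship links, only two of which carry interactions
# (for example wall posts or comments exchanged during some observation window).
friends = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
counts = {frozenset(("A", "B")): 12, frozenset(("B", "C")): 1}

active = interaction_graph(friends, counts, min_interactions=2)
any_contact = interaction_graph(friends, counts)
```

With a threshold of 2 only the A-B link survives, so each user's degree in the interaction graph is lower than in the friendship graph. This is the sparsening effect discussed above.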

In the paper by Magnani et al [13], the authors identify some important dimensions of discussion. The average lifetime of a post or message is the time for which it is available on the news feeds of users. It varies inversely with the number of friends the user has and their frequency of activity on Facebook. Overall, the authors found that the lifetime of posts also follows a power law. Their empirical analysis found that 50% of entries survive for around one hour, 85% survive for a day, and so on. The authors also identified a specific time trend in content generation. Since users in a given cluster have some parameters in common, any temporal factor affecting those parameters will affect the activity of all those users simultaneously.

One important issue that needs attention is the increasing quantity of spam. The paper by Gao et al [14] quantifies and characterizes spam campaigns launched from online social network accounts. An important observation from this empirical study of 3.5 million Facebook users is that over 97% of the accounts sending spam are compromised accounts and only the rest are fake accounts. Another observation is that spamming activity is generally higher in the early morning hours of the users’ local time.

VI. ANALYSIS AND INTERPRETATION

A. Network Structure

Based on the review of the articles on network structure above, we find that the article by Mislove et al [2] develops many of the concepts required for understanding how this structure develops. With the help of the community example from Yixiao Li et al [3], we can also get an idea of how social networks evolve. This helps in understanding why social networks display characteristics of both small-world and scale-free networks.

Initially a group of individuals with something in common, such as belonging to the same school, come together on a network like Facebook. They add each other as links, thereby establishing a community structure. This is a cluster of users tightly coupled with each other, and it behaves like a small-world network because of its short diameter. As time progresses, an individual from this cluster may get exposed to a different group or set of users. This user then becomes the connection between the two clusters and, as a result, has a much higher degree than his earlier cluster peers. This develops into a hub-and-spoke model and thereby into a scale-free network. The degree distribution follows a power law, as there are few users connected across clusters, who therefore have a high degree, and a large number of users connected only within a cluster, who therefore have a low degree.

Another parameter that affects the expansion of social networks is how users can search for other users in order to connect to them. Networks like LinkedIn allow users to search only within certain levels of their neighbourhood. This limits the capability of less connected users to connect to a large number of users, and gives a user an incentive to connect to another user who is highly connected. This simple behaviour contradicts the concept given in the paper of Mislove et al [2] that users of similar degree are more likely to connect to each other.

The scenario of unintentional linking is not applicable to an online social network like Facebook, as there is no reason to believe that two users connect to each other unless they have some intention to do so. At least one user will have some reason to connect to the other, although the second user may approve the request unknowingly. It should also be noted that the intentions of users connecting to each other may differ. That is, user A may intend to connect to user B for reason X while user B wants to connect to user A for reason Y, and they can still establish a connection as long as both agree. But if B has no reason Y to connect to A, the link will not be established. However, we could not locate any literature that models the network taking such heterogeneous intentions into account.

B. Information Propagation

As the literature explains, several factors define the pattern of information propagation. However, we need to alter some conditions when we apply them to our research on how a message spreads over a social network like Facebook, fundamentally because the characteristics of Facebook differ in several ways from those of the social networks considered in the empirical research surveyed.

In contrast to the preferential forwarding discussed by Harvey et al [9], a Facebook user forwards, i.e. shares, a message that he likes to all of his friends and to those who are subscribed to his updates. Only rarely would he share such a message with one particular Facebook user. However, we need to note that he can single out some users, based on the relevance he sees, while sharing the message with the larger audience. The ways to do this are tagging a person or posting the link or image on the wall of the intended user.

We also agree with the finding of Harvey et al that a user’s knowledge has little to do with forwarding likelihood. Looking at this observation from Facebook’s point of view, we cannot think of any reason to believe that a Facebook user would not know how to share a message that he or she is reading, if he or she wants to do so.

As we have seen in the structure of social networks, users of similar nature come together and form clusters. This creates strong bonds between similar people and weaker bonds between dissimilar people. Moreover, we saw that while friendship networks are formed based on consent, the user gives such consent on criteria other than spreading a particular message. This effectively reduces the velocity of message spread, as the message does not reach dissimilar users with equal intensity.

Wilson et al [11] found that small-world clustering does not exist in their interaction graph, which differs from the friendship link graph, because of its low degree of connection. This is because users regularly interact with only a small portion of their friends. As the degree of links per user from the interaction point of view decreases, the clustering index reduces, and the network thereby becomes more scale-free and less small-world.

As described by Katona et al [1], dilution of influence occurs as the number of contacts increases. This is very logical. As the number of friends on Facebook increases, the frequency of updates in the news feed also increases proportionately. As pointed out by Wilson et al, every user has limited time on Facebook. Hence, the likelihood that a particular update is visible in the portion of the news feed that the user scrolls through at a given time reduces with an increasing number of contacts. This weakens the level of influence and hence the interaction that we are looking for.

The paper by Magnani et al [13] discusses the lifetime of a post, during which it is active and accessible to friends. Overall it indicates a short lifespan for a message. We also need to note that as clustering increases on Facebook, with more user activity and more friends, the average lifespan of a particular message would decrease further. This further underlines the point in the paper of Wilson et al that, because time is constrained, interaction networks, which are scale-free in nature, are more important for modelling than connection networks.

Regarding the spread of spam content, the important factor from the point of view of our study is that compromised accounts contribute 97% of spam and fake accounts only 3%. This further highlights that users trust their friends. A message coming from an unknown user is identified as spam more easily than one coming from a friend with whom the user has closer ties. Regarding the timing of spam generation, we do not find any relevance to our study of the spread of information.

But the time of content generation has a critical role to play in determining the lifetime for which a message remains active in the news feed of a user. There is significant evidence that, as per the clustering of users, most friends are geographically collocated. Hence, if a message is created or shared at peak time for the local user, there will be higher activity in the entire cluster. This further reduces the lifetime of the message in the news feed, but simultaneously increases the likelihood that a user sees the message, because he or she is actively viewing the news feed.

Another important point is that not all content that is frequently shared is genuine. Unfortunately we could not find any conclusive literature or empirical research on user behaviour in which users knowingly forward or share spam or incorrect information simply for amusement. This typically includes random so-called “confidential” information about some political leader, or forged images. If users share this information unknowingly, the behaviour can be considered under trust in ties, which we have just discussed. But many a time the user is completely aware of its fraudulent nature. Still, either for amusement or out of political or ideological conflict with the person or event in question, they find it worthwhile to share such material. It should also be noted that users who are aware of spam do not indulge in such activity if they think it may be harmful to them. But when it comes to purely static spam content, which they are sure will not compromise their profiles, they have no objection to sharing or commenting on it. If we look at censorship proposals from governments, we may find that they are largely interested in controlling such content.

VII. LIMITATIONS

Facebook is continuously updating its features. The literature suggests that new features have a significant impact on user behaviour. The newly introduced timeline feature, which allows users to view past important interactions with ease, is likely to have considerable significance for user interactivity. But we could not locate any literature discussing the impact of the timeline. Also, we could not find literature conclusively quantifying Facebook events and their impact on social events. Nor did we locate any literature that explains user bias in knowingly sharing fake information. We understand that the social networking phenomenon is relatively new, and hence not enough research has been done on every aspect of the impact of social networks on our real-time interactions.

VIII. CONCLUSION

In this literature survey, we have identified factors that need to be accounted for while modelling information spread on social networks. For simplicity, we have avoided going into the mathematical details supporting the conclusions. We have linked the various papers available on this topic to arrive at the following conclusions.

On the network structure side, we conclude that a social network, from the friendship perspective, demonstrates characteristics of both scale-free and small-world networks. But since interactions between users, which are time constrained, display only scale-free characteristics, we need to model the social network as a scale-free network for the purposes of our research.

We conclude that the following factors, which affect the likelihood and velocity of message spread, should be taken into account by our model.

(1) The influence that a friend has on a user is inversely proportional to the user’s number of friends.

(2) The lifetime for which a message remains active in a user’s news feed is inversely proportional to the user’s number of friends.

(3) The likelihood of spreading a message is directly proportional to the average amount of time the user spends on Facebook.

(4) The likelihood of spreading a message further is directly proportional to the strength of the bond with the sender.

(5) The more clustering there is in a user’s network, the lower the velocity of message spread, primarily because duplicated messages remain confined to the same cluster.

(6) A message shared at peak time will have a shorter lifetime on the news feed but a higher likelihood of being replicated, owing to high activity in the entire cluster.

(7) If users perceive a particular message as not harmful to them, there is a higher likelihood that it will be spread or shared, irrespective of the users’ analysis of the message’s authenticity. This is typical of messages shared for amusement or out of political conflict.
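
To show how some of these factors might enter a model, the following toy score combines factors (1), (3), (4) and (7). It is only a sketch: every functional form and constant in it (the function name, `half_influence_friends`, `saturation_minutes`, the 0.2 penalty) is an illustrative assumption and is not taken from the surveyed literature.

```python
def forward_likelihood(num_friends, tie_strength, minutes_online, perceived_harmless,
                       half_influence_friends=150, saturation_minutes=60):
    """Toy score in [0, 1] for the chance that a user shares a received message.

    tie_strength is assumed to lie in [0, 1]. All functional forms are
    illustrative assumptions, not results from the surveyed papers.
    """
    # Factor (1): a sender's influence is diluted as the friend count grows.
    influence = 1.0 / (1.0 + num_friends / half_influence_friends)
    # Factor (3): more time online raises the chance of seeing and sharing.
    attention = min(minutes_online / saturation_minutes, 1.0)
    # Factor (7): messages perceived as harmful are shared far less often.
    harmless = 1.0 if perceived_harmless else 0.2
    # Factor (4): a stronger bond with the sender scales the score directly.
    return influence * tie_strength * attention * harmless

few_friends = forward_likelihood(100, 0.8, 30, True)
many_friends = forward_likelihood(500, 0.8, 30, True)
weak_tie = forward_likelihood(100, 0.2, 30, True)
```

Factors (2), (5) and (6) concern message lifetime and cluster-level activity, so they would need a time-stepped simulation over an explicit network rather than a per-user score.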

REFERENCES

[1] Katona Z., Zubcsek P., Sarvary M., Network Effects and Personal Influences: The Diffusion of an Online Social Network, Journal of Marketing Research, Vol. XLVIII (June 2011), 425-443, American Marketing Association.

[2] Mislove A., Marcon M., Gummadi K., Druschel P., Bhattacharjee B., Measurement and Analysis of Online Social Networks, proceedings of IMC’07, ACM.

[3] Yixiao Li, Xiaogang Jin, Fansheng Kong and Jiming Li, Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network, IEEE symposium on digital object identifier, 2009.

[4] Wei Ren, Jianping Li, A fast algorithm for simulating scale-free networks, proceedings of ICCTA 2009.

[5] Ted G. Lewis, Network Science: Theory and Practice, John Wiley & Sons, Inc., 2009.

[6] P. Erdos, A. Renyi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., vol. 5, pp. 17-60, 1959.

[7] Goel S., Muhamad R., Watts D., Social Search in “Small-World” Experiments, proc. WWW 2009, ACM.

[8] Jun T., Sethi R., Reciprocity in evolving social networks, Journal of Evolutionary Economics, June 2009.

[9] Harvey C., Stewart D., Ewing M., Forward or delete: What drives peer-to-peer message propagation across social networks?, Journal of Consumer Behavior, Vol. 10, 2011, Published by Wiley.

[10] Norman AT, Russell CA. 2006. The Pass-Along Effect: Investigating Word-of-Mouth Effects on Online Survey Procedures. Journal of Computer-Mediated Communication 11(4): 1085–1103.

[11] Wilson C., Boe B., Sala A., Puttaswamy P., Zhao B., User Interactions in Social Networks and their Implications, Proceedings of EuroSys 2009, ACM.

[12] Skoric M., Poor N., Liao Y., Wei S., Online Organization of an Offline Protest: From Social to Traditional Media and Back, proceedings of HICSS 2011, retrieved from IEEE.

[13] Magnani M., Montesi D., Rossi L., Information propagation analysis in a social network site, proceedings of International Conference on Advances in Social Networks Analysis and Mining, 2010, IEEE.

[14] Gao H., Hu J., Wilson C., Li Z., Chen Y., Zhao B., Detecting and Characterizing Social Spam Campaigns, proceedings of IMC’10, ACM.