Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

36
Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008

Transcript of Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

Page 1: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

Social Networking, Security and Privacy

Dr. Bhavani Thuraisingham

April 21, 2008

Page 2: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-204/20/23 18:01

Social Networkshttp://www.flairandsquare.com/archives/167

0 A social network site allows people who share interests to build a ‘trusted’ network/ online community. A social network site will usually provide various ways for users to interact, such as IM (chat/ instant messaging), email, video sharing, file sharing, blogging, discussion groups, etc.

0 The main types of social networking sites have a ‘theme’, they allow users to connect through image or video collections online (like Flicker or You Tube) or music (like My Space, lastfm). Most contain libraries/ directories of some categories, such as former classmates, old work colleagues, and so on (like Face book, friends reunited, Linked in, etc). They provide a means to connect with friends (by allowing users to create a detailed profile page), and recommender systems linked to trust.

Page 3: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-304/20/23 18:01

Popular Social Networks

0 Face book - A social networking website. Initially the membership was restricted to

students of Harvard University. It was originally based on what first-year students were

given called the “face book” which was a way to get to know other students on campus.

As of July 2007, there over 34 million active members worldwide. From September 2006

to September 2007 it increased its ranking from 60 to 6th most visited web site, and was

the number one site for photos in the United States.

0 Twitter- A free social networking and micro-blogging service that allows users to send

“updates” (text-based posts, up to 140 characters long) via SMS, instant messaging,

email, to the Twitter website, or an application/ widget within a space of your choice, like

MySpace, Facebook, a blog, an RSS Aggregator/reader.

0 My Space - A popular social networking website offering an interactive, user-submitted

network of friends, personal profiles, blogs, groups, photos, music and videos

internationally. According to AlexaInternet, MySpace is currently the world’s sixth most

popular English-language website and the sixth most popular website in any language,

and the third most popular website in the United States, though it has topped the chart

on various weeks. As of September 7, 2007, there are over 200 million accounts.

Page 4: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-404/20/23 18:01

Social Networks: More formal definition

0 A structural approach to understanding social interaction.

0 Networks consist of Actors and the Ties between them.

0 We represent social networks as graphs whose vertices are the actors and whose edges are the ties.

0 Edges are usually weighted to show the strength of the tie.

0 In the simplest networks, an Actor is an individual person.

0 A tie might be “is acquainted with”. Or it might represent the amount of email exchanged between persons A and B.

Page 5: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-504/20/23 18:01

Social Network Examples

0 Effects of urbanization on individual well-being

0 World political and economic system

0 Community elite decision-making

0 Social support, Group problem solving

0 Diffusion and adoption of innovations

0 Belief systems, Social influence

0 Markets, Sociology of science

0 Exchange and power

0 Email, Instant messaging, Newsgroups

0 Co-authorship, Citation, Co-citation

0 SocNet software, Friendster

0 Blogs and diaries, Blog quotes and links

Page 6: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-604/20/23 18:01

History

0 “Sociograms” were invented in 1933 by Moreno.

0 In a sociogram, the actors are represented as points in a two-dimensional space. The location of each actor is significant. E.g. a “central actor” is plotted in the center, and others are placed in concentric rings according to “distance” from this actor.

0 Actors are joined with lines representing ties, as in a social network. In other words a social network is a graph, and a sociogram is a particular 2D embedding of it.

0 These days, sociograms are rarely used (most examples on the web are not sociograms at all, but networks). But methods like MDS (Multi-Dimensional Scaling) can be used to lay out Actors, given a vector of attributes about them.

0 Social Networks were studied early by researchers in graph theory (Harary et al. 1950s). Some social network properties can be computed directly from the graph.

0 Others depend on an adjacency matrix representation (Actors index rows and columns of a matrix, matrix elements represent the tie strength between them).

Page 7: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-704/20/23 18:01

Social Networks Basic Questions

0 Balance: important in exchange networks

0 In a two-person network (dyad), exchange of goods, services and cash should be balanced.

0 More generally, exchanges of “favors” or “support” are likely to be quite balanced.

0 Role: what role does the actor perform in the network?

0 Role is defined in terms of Actors’ neighborhoods.0 The neighborhood is the set of ties and actors connected directly to

the current actor. 0 Actors with similar or identical neighborhoods are assigned the same

role.

0 What is the related idea from semiotics?

0 Paradigm: interchangability. Actors with the same role areinterchangable in the network.

Page 8: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-804/20/23 18:01

Social Networks Basic Questions

0 Prestige: How important is the actor in the network?

0 Related notions are status and centrality.

0 Centrality reifies the notion of “peripheral vs. central participation” from communities of practice.

0 Key notions of centrality were developed in the 1970’s, e.g. “eigenvalue centrality” by Bonacich.

0 Most of these measures were rediscovered as quality measures for web pages:

- Indegree

- Pagerank = eigenvalue centrality

- HITS ?= two-mode eigenvalue centrality

Page 9: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-904/20/23 18:01

Social Network Concepts

0 Actor

- An “actor” is a basic component for SNs. Actors can be:

- Individual people, Corporations, Nation-States, Social groups

0 Modes

- If all the actors are of the same type, the network is called a one-mode network. If there are two groups of actor then it is a two-mode network.

- E.g. an affiliation network is a two-mode network. One mode is individuals, the other is groups to which they belong. Ties represent the relation: person A is a member of group B.

0 Ties

- A tie is the relation between two actors. Common types of ties include:

= Friendship, Amount of communication, Goods exchanged, Familial relation (kinship), Institutional relations

Page 10: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1004/20/23 18:01

Practical issues: Boundaries and Samples

0 Because human relations are rich and unbounded, drawing meaningful boundaries for network analysis is a challenge.

0 There are two main approaches:

0 Realist: boundaries perceived by actors themselves, e.g. gang members or ACM members.

0 Nominalist: Boundaries created by researcher: e.g. people who publish in ACM CHI.

0 To deal with large networks, sampling is necessary. Unfortunately, randomly sampled graphs will typically have completely different structure. Why?

0 One approach to this is “snowballing”. You start with a random sample. Then extend with all actors connected by a tie. Then extend with all actors connected to the previous set by a tie…

Page 11: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1104/20/23 18:01

Social Network Analysis of 9/11 Terrorists (www.orgnet.com)

Early in 2000, the CIA was informed of two terrorist suspects linked to al-Qaeda. Nawaf Alhazmi and Khalid Almihdhar were photographed attending a meeting of known terrorists in Malaysia. After the meeting they returned to Los Angeles, where they had already set up residence in late 1999.

Page 12: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1204/20/23 18:01

Social Network Analysis of 9/11 Terrorists

What do you do with these suspects? Arrest or deport them immediately? No, we need to use them to discover more of the al-Qaeda network.

Once suspects have been discovered, we can use their daily activities to uncloak their network. Just like they used our technology against us, we can use their planning process against them. Watch them, and listen to their conversations to see...

•who they call / email •who visits with them locally and in other cities •where their money comes from

The structure of their extended network begins to emerge as data is discovered via surveillance.

Page 13: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1304/20/23 18:01

Social Network Analysis of 9/11 Terrorists

A suspect being monitored may have many contacts -- both accidental and intentional. We must always be wary of 'guilt by association'. Accidental contacts, like the mail delivery person, the grocery store clerk, and neighbor may not be viewed with investigative interest.

Intentional contacts are like the late afternoon visitor, whose car license plate is traced back to a rental company at the airport, where we discover he arrived from Toronto (got to notify the Canadians) and his name matches a cell phone number (with a Buffalo, NY area code) that our suspect calls regularly. This intentional contact is added to our map and we start tracking his interactions -- where do they lead? As data comes in, a picture of the terrorist organization slowly comes into focus.

How do investigators know whether they are on to something big? Often they don't. Yet in this case there was another strong clue that Alhazmi and Almihdhar were up to no good -- the attack on the USS Cole in October of 2000. One of the chief suspects in the Cole bombing [Khallad] was also present [along with Alhazmi and Almihdhar] at the terrorist meeting in Malaysia in January 2000.

Once we have their direct links, the next step is to find their indirect ties -- the 'connections of their connections'. Discovering the nodes and links within two steps of the suspects usually starts to reveal much about their network. Key individuals in the local network begin to stand out. In viewing the network map in Figure 2, most of us will focus on Mohammed Atta because we now know his history. The investigator uncloaking this network would not be aware of Atta's eventual importance. At this point he is just another node to be investigated.

Page 14: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1404/20/23 18:01

Social Network Analysis of 9/11 Terrorists

Figure 2 shows the two suspects and

Page 15: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1504/20/23 18:01

Social Network Analysis of 9/11 Terrorists

Figure 2 shows the two suspects andAtta's eventual importance. At this point he is just another node to be investigated.

                                                                                                                                                                                                                

Figure 3 shows the direct

Page 16: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1604/20/23 18:01

Social Network Analysis of 9/11 Terrorists

We now have enough data for two key conclusions: • All 19 hijackers were within 2 steps of the two original suspects uncovered in 2000! • Social network metrics reveal Mohammed Atta emerging as the local leader

With hindsight, we have now mapped enough of the 9-11 conspiracy to stop it. Again, the investigators are never sure they have uncovered enough information while they are in the process of uncloaking the covert organization. They also have to contend with superfluous data. This data was gathered after the event, so the investigators knew exactly what to look for. Before an event it is not so easy.

As the network structure emerges, a key dynamic that needs to be closely monitored is the activity within the network. Network activity spikes when a planned event approaches. Is there an increase of flow across known links? Are new links rapidly emerging between known nodes? Are money flows suddenly going in the opposite direction? When activity reaches a certain pattern and threshold, it is time to stop monitoring the network, and time to start removing nodes.

The author argues that this bottom-up approach of uncloaking a network is more effective than a top down search for the terrorist needle in the public haystack -- and it is less invasive of the general population, resulting in far fewer "false positives".

Page 17: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1704/20/23 18:01

Social Network Analysis of Steroid Usage in Baseball (www.orgnet.com)

Figure 2 shows the two suspects and

When the Mitchell Report on steroid use in Major League Baseball [MLB], was published, people were surprised at who and how many players were mentioned. The diagram below shows a human network created from data found in the Mitchell Report. Baseball players are shown as green nodes. Those who were found to be providers of steroids and other illegal performance enhancing substances appear as red nodes. The links reveal the flow of chemicals -- from provider to player.

Page 18: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1804/20/23 18:01

Knowledge Management Examples

0 Managing the 21st Century Organization

0 Networks of Adaptive/Agile Organizations

0 Best Practice: Organizational Network Mapping

0 Discovering Communities of Practice

0 Data-Mining E-mail

0 Finding Leaders on your Team

0 Post-Merger Integration

0 Knowledge Sharing in Organizations

0 Innovation happens at the Intersections

0 Partnerships and Alliances in Industry

0 Decision-Making in Organizations

0 New Organizational Structures

Page 19: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-1904/20/23 18:01

Knowledge Sharing in Organizations: Finding Experts

Figure 2 shows the two suspects and

Page 20: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2004/20/23 18:01

Knowledge Sharing Network: Finding Experts (www.orgnet.com)

Figure 2 shows the two suspects and

Organizational leaders are preparing for the potential loss of expertise and knowledge flow due to turnover, downsizing, outsourcing, and the coming retirements of the baby boom generation. The model network (previous chart) is used to illustrate the knowledge continuity analysis process.

Each node in this sample network (previous chart) represents a person that works in a knowledge domain. Some people have more / different knowledge than others. Employees who will retire in 2 years or less have their nodes colored red. Those who will retire in 3-4 years are colored yellow. Those retiring in 5 years or later are colored green.

A gray, directed line is drawn from the seeker of knowledge to the source of expertise. A-->B indicates that A seeks expertise / advice from B. Those with many arrows pointing to them are sought often for assistance.

The top subject matter experts -- SMEs -- in this group are nodes 29, 46, 100, 41, 36 and 55. The SMEs were discovered using a network metric in InFlow that is similar to how the Google search engine ranks web pages -- using both direct and indirect links.

Of the top six SMEs in this group, half are colored red[100] or yellow[46, 55]. The loss of person 46 has the greatest potential for knowledge loss. 90% of the network is within 3 steps of accessing this key knowledge source.

Page 21: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2104/20/23 18:01

Social Networks: Security and Privacy Issues: European Network and Information Security Agency

0 The European Network and Information Security Agency (ENISA) has released its first

issue paper “Security Issues and Recomendations for Online Social Networks".

0 http://www.enisa.europa.eu/doc/pdf/deliverables/enisa_pp_social_networks.pdf

0 Four groups of threats: privacy related threats, variants of traditional network and

information security threats, identity related threats, social threats.

0 Recommendations are given for governments (oversight and adaption of existing data

protection legislation), companies that run such networks, technology developers, and

research and standardisation bodies.

0 Some concenrs: recommnendation to use automated filters against "offensive, litigious or

illegal content". This brings potential freedom of speech issues. European Digital Rights

has started a campaign against a similar recommendation by the Council of Europe.

Issue of portability of profiles social graphs are also addressed. However what is missing

is that “Information about social links is not about only one user, but also the others

which he is linked to. They have to agree if this information is moved to different

platforms”.

Page 22: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2204/20/23 18:01

Social Networks: Security and Privacy Issues: Microsoft Recommendations http://www.microsoft.com/protect/yourself/personal/communities.mspx

0 Online communities require you to provide personal information. Profiles are public.

Comments you post are permanently recorded on the community site.You might even

mention when you plan to be out of town.

0 E-mail and phishing scammers count on the appealing sense of trust that is often

fostered in online communities to steal your personal information. The more you reveal

in profiles and posts, the more vulnerable you are to scams, spam, and identity theft.

0 Here are some features to look for when you're considering joining an online community:

- •Privacy policies that explain exactly what information the service will collect and

how it might be used.• User guidelines that outline a basic code of conduct for users

on their sites. Sites have the option to penalize reported violators with account

suspension or termination.•Special provisions for children and their parents, such

as family-friendly options geared towards protecting children under a certain

age.•Password protection to help keep your account secure..•E-mail address hiding,

which lets you display only part of your e-mail address on the site's membership

lists. Filtering options: Offered on blogging sites, these tools let you to choose

which subscribers can see what you've written.

Page 23: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2304/20/23 18:01

Appendix:Social Networks and

Surveillance: Evaluating Suspicion by Association

Ryan Layfield, PhD StudentProf. Bhavani Thuraisingham

September 2006

Page 24: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2404/20/23 18:01

Overview

0 Introduction

- Our Goal

- System Design

- Social Networks

- Threat Detection

- Correlation Analysis0 The Experiment

- Setup

- Current Results

- Issues

- Future Work0 Directions

Page 25: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2504/20/23 18:01

Introduction

0 Automated message surveillance is essential to communication monitoring

- Widespread use of electronic communication

- Exponential data growth

- Impossible to sift through all ‘by hand’

0 Going beyond basic surveillance

- Identifying groups rather than individuals

- Monitoring conversations rather than messages

Page 26: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2604/20/23 18:01

Our Approach0 Design new techniques and apply existing algorithms to…

- Create a machine-understandable model of existing social networks; Identify abnormal conversations and behavior; Monitor a given communications system in real-time; Continuously learn and adapt to a dynamic environment

0 System Design: Three major components:

- Social Network Modeler; Initial Activity Detector; Correlated Activity Investigator

0 Assumptions

- Individuals engaged in suspicious or undesirable behavior rarely act alone

- We can infer than those associated with a person positively identified as suspicious have a high probability of being either:

=Accomplices (participants in suspicious activity)

=Witnesses (observers of suspicious activity)

- Making these assumptions, we create a context of association between users of a communication network

Page 27: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2704/20/23 18:01

Social Networks

0 Within our model:

- Every node is a unique user

- Every message creates or strengthens a link between nodes0 Over time, the network changes

- Frequent communication leads to stronger links

- Intermittent messaging implies weakening social ties0 The strength of the link implies how strong an association between

individuals is0 From this data, we can theoretically identify

- Hubs

- Groups

- Liaisons

Page 28: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2804/20/23 18:01

Social Networks

Page 29: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-2904/20/23 18:01

Threat Detection and Correlation Analysis0 Every message sent is scrutinized in the interest of identifying

suspicious communication- Keywords analysis; Prior context (i.e. previous message

content)0 When a detection algorithm yields a strong result, a token is created

- The token is created at the origin and passed to the recipients); Existing tokens, if any, are cloned instead

0 The result is a web that potentially reflects the dissemination of suspicious information activity

0 Future messages with similar suspicious topics are not always identifiable with the same ‘initial’ techniques

- Quick replies; Pronoun use; Assumption that recipient is aware of topic

0 If a token is present at the sender when a message is sent:- Message token is associated with and new message are

analyzed; If analysis yields a strong match, the token is further cloned and passed to recipient

Page 30: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3004/20/23 18:01

The Experiment

0 A rare set of words shared between two or more messages are candidates for keyword analysis, but they are not always easily sifted from ‘noise’

0 Noise within text-based messages comes in a variety of forms

- Misspelled words

- Unusual word choice

- Incompatible variations of the same language (i.e. British vs. American English)

- Unexpected language0 However, we do not want to eliminate potential keywords

- Document names

- Terminology specific to a subject

- ‘Buzz’ words

0 We proposed an experiment that attempts to eliminate false positives due to noisy data while strengthening and expanding our correlation techniques

Page 31: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3104/20/23 18:01

Setup

0 Tools

- Running word ‘rank’ database

- Implementation of word set theory infrastructure

- JAMA Matrix Library

=Singular Value Decomposition

0 Our Approach

- Apply SVD noise filtering based on 100 messages

- Analyze word frequency correlation between current message and prior suspicious messages

- Generate a score based on the results

Page 32: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3204/20/23 18:01

Setup

0 Construct a matrix based on the last 100 messages

Ww

MMMW

mwcountc

i

t

jiji

...

),(

21w

ords

messages

More common

Less common

Page 33: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3304/20/23 18:01

Setup

0 Decompose and rebuild

U VTA

Eliminate ‘weak’ singular values

Page 34: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3404/20/23 18:01

Setup

Pulled from messages j and k

)(

),(),()(

i

kijii wrank

mwcountmwcountwscore

‘Raw’ total score for word wi

Pulled from ‘running’ word database

kji WWw

iwscore )(Counts only intersection of words Predefined fixed

threshold

Page 35: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3504/20/23 18:01

Current Results

1000 messages evaluated, first 100 used to seed word ranks.

Method is not currently accurate Large fluctuations

Correlation easily swayed by plethora of common wordsUncommon words not given enough weight

Accuracy of Results over 900 Messages

3%12%

59%

26%

True Positives

False Positives

True Negatives

False Negatives

Page 36: Social Networking, Security and Privacy Dr. Bhavani Thuraisingham April 21, 2008.

1-3604/20/23 18:01

Issues and Directions

0 Word frequencies fluctuate wildly during beginning of experiment (0.0 – 10.0+)

0 Extreme cost for current construction methods and computation

0 Filtering context limited to recent global history0 Affected by large bodies of text0 Future Directions include

- Tap potential of existing matrix for further analysis- Adaptive filtering feedback algorithms- Speed improvements to accommodate real-time streams- Flexible communication platform monitoring- Addition of pipe architecture for modular threat detection

and correlation