Social Networks and Surveillance: Evaluating Suspicion by Association

Ryan P. Layfield

Dr. Bhavani Thuraisingham

Dr. Latifur Khan

Dr. Murat Kantarcioglu

The University of Texas at Dallas

{layfield, bxt043000, lkhan, muratk}@utdallas.edu


Overview

Introduction
►Our Goal
►System Design
►Social Networks
►Threat Detection
►Correlation Analysis

The Experiment
►Setup
►Current Results
►Issues
►Future Work


Introduction

Automated message surveillance is essential to communication monitoring
►Widespread use of electronic communication
►Exponential data growth
►Impossible to sift through all ‘by hand’

Going beyond basic surveillance
►Identifying groups rather than individuals
►Monitoring conversations rather than messages


Our Goal

Design new techniques and apply existing algorithms to…
►Create a machine-understandable model of existing social networks
►Identify abnormal conversations and behavior
►Monitor a given communications system in real-time
►Continuously learn and adapt to a dynamic environment


System Design

Three major components:
►Social Network Modeler
►Initial Activity Detector
►Correlated Activity Investigator


Social Networks

Individuals engaged in suspicious or undesirable behavior rarely act alone

We can infer that those associated with a person positively identified as suspicious have a high probability of being either:
►Accomplices (participants in suspicious activity)
►Witnesses (observers of suspicious activity)

Making these assumptions, we create a context of association between users of a communication network


Social Networks

Within our model:
►Every node is a unique user
►Every message creates or strengthens a link between nodes

Over time, the network changes
►Frequent communication leads to stronger links
►Intermittent messaging implies weakening social ties

The strength of a link indicates how strongly two individuals are associated

From this data, we can theoretically identify
►Hubs
►Groups
►Liaisons
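
The link model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `SocialNetwork`, the unit increment per message, and the multiplicative `decay` factor are all assumptions chosen for the example.

```python
from collections import defaultdict


class SocialNetwork:
    """Toy weighted graph: nodes are users, link weights track message volume."""

    def __init__(self, decay=0.9):
        # decay is a hypothetical per-step weakening factor (assumption).
        self.decay = decay
        self.links = defaultdict(float)  # frozenset({a, b}) -> strength

    def record_message(self, sender, recipient):
        # Every message creates or strengthens the link between two users.
        self.links[frozenset((sender, recipient))] += 1.0

    def tick(self):
        # Called once per time step: every link weakens, so only links
        # refreshed by new messages stay strong.
        for pair in self.links:
            self.links[pair] *= self.decay

    def strength(self, a, b):
        return self.links.get(frozenset((a, b)), 0.0)

    def degree(self, user):
        # Number of distinct contacts; a high value suggests a hub.
        return sum(1 for pair in self.links if user in pair)


net = SocialNetwork(decay=0.5)
net.record_message("alice", "bob")
net.record_message("alice", "bob")
net.record_message("alice", "carol")
net.tick()
```

Here `alice` ends up with two contacts and a stronger tie to `bob` than to `carol`, which is the kind of signal a hub or group detector could start from.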


Threat Detection

Every message sent is scrutinized in the interest of identifying suspicious communication
►Keyword analysis
►Prior context (i.e. previous message content)

When a detection algorithm yields a strong result, a token is created
►The token is created at the origin and passed to the recipient(s)
►Existing tokens, if any, are cloned instead

The result is a web that potentially reflects the dissemination of suspicious information
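
A rough sketch of the token mechanism follows. The keyword watch list, the `Token` fields, and the function names `detect` and `deliver` are hypothetical; the slides do not specify how a token is represented or what counts as a "strong result", so a simple keyword hit stands in for the detector.

```python
import itertools
from dataclasses import dataclass

SUSPICIOUS_KEYWORDS = {"payload", "dropoff"}  # hypothetical watch list
_token_ids = itertools.count(1)


@dataclass(frozen=True)
class Token:
    token_id: int
    origin: str          # user whose message triggered the detection
    keywords: frozenset  # what the detector matched


def detect(text):
    """Stand-in detector: returns the watched keywords found in the text."""
    return frozenset(text.lower().split()) & SUSPICIOUS_KEYWORDS


def deliver(sender, recipients, text, held):
    """Update `held` (user -> set of tokens) for one message."""
    hits = detect(text)
    if not hits:
        return
    # Existing tokens at the sender are cloned (tokens are immutable values).
    tokens = set(held.get(sender, set()))
    if not tokens:
        # No existing token: create one at the origin.
        tokens = {Token(next(_token_ids), sender, hits)}
        held.setdefault(sender, set()).update(tokens)
    for user in recipients:
        held.setdefault(user, set()).update(tokens)


held = {}
deliver("alice", ["bob"], "the payload is ready", held)
deliver("bob", ["carol"], "payload moved", held)
deliver("dave", ["erin"], "lunch at noon", held)
```

After these three messages, `carol` holds a clone of the token that originated with `alice`, while `dave` and `erin` hold nothing. The `held` mapping is the "web" the slide refers to.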


Correlation Analysis

Future messages with similar suspicious topics are not always identifiable with the same ‘initial’ techniques
►Quick replies
►Pronoun use
►Assumption that recipient is aware of topic

If a token is present at the sender when a message is sent:
►The message the token is associated with and the new message are analyzed together
►If analysis yields a strong match, the token is further cloned and passed to the recipient
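
The correlation step can be illustrated as follows. The slides do not say which similarity measure is used at this stage, so Jaccard overlap of word sets is swapped in purely as a placeholder; the `threshold` value and the function names are likewise assumptions.

```python
def word_set(text):
    return set(text.lower().split())


def overlap(a, b):
    """Jaccard overlap of two messages' word sets (a stand-in measure)."""
    wa, wb = word_set(a), word_set(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def correlate(sender, recipient, text, tokens, threshold=0.25):
    """tokens: user -> list of message texts that earned that user a token.

    If the new message strongly matches a message behind one of the
    sender's tokens, that token is cloned to the recipient.
    """
    passed = []
    # Iterate over a copy so a self-addressed message cannot loop forever.
    for flagged_text in list(tokens.get(sender, [])):
        if overlap(flagged_text, text) >= threshold:
            tokens.setdefault(recipient, []).append(flagged_text)
            passed.append(flagged_text)
    return passed


tokens = {"bob": ["move the package to the dock tonight"]}
hit = correlate("bob", "carol", "is the package at the dock", tokens)
miss = correlate("bob", "dave", "see you at lunch", tokens)
```

The follow-up to `carol` contains no watch-list keyword, yet it shares enough vocabulary with the flagged message for the token to travel onward. The unrelated message to `dave` does not.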


The Experiment

Rare words shared between two or more messages are candidates for keyword analysis, but they are not always easily separated from ‘noise’

Noise within text-based messages comes in a variety of forms
►Misspelled words
►Unusual word choice
►Incompatible variations of the same language (e.g. British vs. American English)
►Unexpected language

However, we do not want to eliminate potential keywords
►Document names
►Terminology specific to a subject
►‘Buzz’ words


The Experiment

We proposed an experiment that attempts to eliminate false positives due to noisy data while strengthening and expanding our correlation techniques


Setup

Tools
►Running word ‘rank’ database
►Implementation of word set theory infrastructure
►JAMA Matrix Library (Singular Value Decomposition)

Our Approach
►Apply SVD noise filtering based on 100 messages
►Analyze word frequency correlation between current message and prior suspicious messages
►Generate a score based on the results


Setup

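Construct a matrix based on the last 100 messages

$$c_{i,j} = \mathrm{count}(w_i, m_j), \qquad w_i \in W, \qquad W = M_1 \cup M_2 \cup \dots \cup M_t$$

[Figure: count matrix with axes labelled "words" and "messages"; entries ordered from "More common" to "Less common".]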
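
A small sketch of building such a count matrix is shown below. It assumes whitespace tokenization, words as rows, messages as columns, and rows ordered from more common to less common; none of these layout details are confirmed by the slides, and `build_count_matrix` is a hypothetical helper name. A three-message window stands in for the 100-message window.

```python
from collections import Counter


def build_count_matrix(messages):
    """Return (words, matrix) with matrix[i][j] = count(words[i], messages[j])."""
    counts = [Counter(m.lower().split()) for m in messages]
    totals = Counter()
    for c in counts:
        totals.update(c)
    # Order rows from more common to less common words (ties alphabetical).
    words = sorted(totals, key=lambda w: (-totals[w], w))
    matrix = [[c[w] for c in counts] for w in words]
    return words, matrix


window = ["the plan is set", "the plan changed", "lunch is at noon"]
words, matrix = build_count_matrix(window)
plan_row = matrix[words.index("plan")]
```

Each row of `matrix` is one word's count across the window, so `plan_row` records that "plan" occurs once in each of the first two messages and not in the third.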


Setup

Decompose and rebuild

$$A = U \Sigma V^{T}$$

Eliminate ‘weak’ singular values
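
Written out, the decompose-and-rebuild step is a standard truncated SVD. The cutoff $k$ (how many singular values are kept as 'strong') is not given in the slides and appears here only as a symbol.

$$A = U \Sigma V^{T} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{T}, \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$$

$$A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T}, \qquad k < r$$

Here $u_i$ and $v_i$ are the columns of $U$ and $V$, and $r$ is the rank of $A$. Dropping the terms with $i > k$ gives the best rank-$k$ approximation of $A$ in the Frobenius norm, which is why discarding small singular values acts as a noise filter on the count matrix.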


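Setup

Each word $w_i$ receives a ‘raw’ total score, $\mathrm{score}(w_i)$, built from:

►$\mathrm{count}(w_i, m_j)$ and $\mathrm{count}(w_i, m_k)$: pulled from messages j and k
►$\mathrm{rank}(w_i)$: pulled from the ‘running’ word database

The scores $\mathrm{score}(w_i)$ are combined over $w_i \in W_j \cap W_k$ (counts only the intersection of words) and compared against a predefined fixed threshold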
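
The scoring step might look like the sketch below. The operators in the slide's formula did not survive in this transcript, so the specific form used here is an assumption: the two counts are summed, divided by the word's rank, totalled over the shared words, and compared with `>` against a made-up `THRESHOLD`. The rank values are invented for illustration.

```python
def word_score(word, counts_j, counts_k, rank):
    """Assumed form: summed counts divided by the word's running rank."""
    return (counts_j[word] + counts_k[word]) / rank[word]


def message_score(counts_j, counts_k, rank):
    # Only words present in both messages contribute (the intersection).
    shared = counts_j.keys() & counts_k.keys()
    return sum(word_score(w, counts_j, counts_k, rank) for w in shared)


THRESHOLD = 1.0  # hypothetical fixed threshold

# Hypothetical running ranks: larger means more common.
rank = {"the": 8.0, "package": 0.5, "dock": 1.0, "move": 2.0}
m_j = {"the": 2, "package": 1, "move": 1}
m_k = {"the": 1, "package": 1, "dock": 1}

total = message_score(m_j, m_k, rank)
flagged = total > THRESHOLD
```

Under these assumptions the shared rare word "package" contributes 4.0 and the shared common word "the" contributes 0.375, so the pair is flagged mainly on the rare word.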


Current Results

Method is not currently accurate

Large fluctuations
►Correlation easily swayed by plethora of common words
►Uncommon words not given enough weight


Current Results

[Figure: pie chart, "Accuracy of Results over 900 Messages". Legend: True Positives, False Positives, True Negatives, False Negatives. Slice labels: 3%, 12%, 59%, 26%.]

1000 messages evaluated, first 100 used to seed word ranks.


Issues

Word frequencies fluctuate wildly at the beginning of the experiment (0.0 – 10.0+)

Extreme cost for current construction methods and computation

Filtering context limited to recent global history

Affected by large bodies of text


Future Work

Tap potential of existing matrix for further analysis

Adaptive filtering feedback algorithms

Speed improvements to accommodate real-time streams

Flexible communication platform monitoring

Addition of pipe architecture for modular threat detection and correlation