An Introduction to Social Network Analysis and Its Application in Software Engineering

Sabrina Marczak, Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software, Manaus, November 2012

description

This is a short tutorial for beginners on social network analysis applied to software engineering. The main social network analysis concepts and measures are presented along with examples of their application from the literature. Recommended reading is provided. This material was presented at the Workshop on Agile Methods for Distributed Teams, organized by Prof. Tayana Conte, UFAM, Manaus, Brazil, in late November 2012.

Transcript of An Introduction to Social Network Analysis and Its Application in Software Engineering

Page 1: An Introduction to Social Network Analysis and Its Application in Software Engineering

An Introduction to Social Network Analysis and its application in Software Engineering

Sabrina Marczak

Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software, Manaus, November 2012

Page 2: An Introduction to Social Network Analysis and Its Application in Software Engineering

Software Development

Page 3: An Introduction to Social Network Analysis and Its Application in Software Engineering

Software Development

[Figure: role labels (R. Analyst, Architect, P. Manager, Developer, Developer, Tester) and phase labels (Conception, Planning, Design, Development, Testing, Deployment); a second group of role labels (R. Analyst, P. Manager, Architect, Developer, Tester) appears with the label Requirement]

Page 5: An Introduction to Social Network Analysis and Its Application in Software Engineering

Software Development

• Collaboration

• Coordination

• Communication

Goals

Tasks

Dependencies

Deadlines

Page 6: An Introduction to Social Network Analysis and Its Application in Software Engineering

Software Development

• Who talks with whom?

• Who receives help from whom?

• Who is aware of whom?

• Who are the experts?

• Who are the most active contributors?

Page 7: An Introduction to Social Network Analysis and Its Application in Software Engineering

Software Development

• Are the team members following the organizational structure?

• Are the team members coordinating with those on whom their work depends?

• Are the next builds going to fail?

Page 8: An Introduction to Social Network Analysis and Its Application in Software Engineering

Software Development

• How can these questions be answered?

Social Network Analysis

Page 10: An Introduction to Social Network Analysis and Its Application in Software Engineering

Social network analysis

• It provides techniques to examine the structure of social relationships in a group to uncover patterns of behavior and interaction among people [Mitchell, 1969]

Page 11: An Introduction to Social Network Analysis and Its Application in Software Engineering

Agenda

• Introduction to social network analysis

• My research on collaboration using SNA

• Recommended reading

Page 12: An Introduction to Social Network Analysis and Its Application in Software Engineering

> Introduction to SNA

• Terminology

• Representation

• Measures

• Data collection

• Tools

Page 13: An Introduction to Social Network Analysis and Its Application in Software Engineering

Terminology

Actor

Actor = Node = Vertex

[Sociogram of the 12 actors: Andrew, Bob, Charles, David, Emma, Fynn, Greg, Hannah, Iris, John, Kevin, and Lucas]

Page 14: An Introduction to Social Network Analysis and Its Application in Software Engineering

Terminology

Tie

Tie = Link = Edge

[Sociogram of the 12 actors, Andrew through Lucas]

Page 15: An Introduction to Social Network Analysis and Its Application in Software Engineering

Terminology

Dyad

[Sociogram of the 12 actors, Andrew through Lucas]

Page 16: An Introduction to Social Network Analysis and Its Application in Software Engineering

Terminology

Triad

[Sociogram of the 12 actors, Andrew through Lucas]

Page 17: An Introduction to Social Network Analysis and Its Application in Software Engineering

Representation

• Sociogram

[Sociogram of the 12 actors: Andrew, Bob, Charles, David, Emma, Fynn, Greg, Hannah, Iris, John, Kevin, and Lucas]

Page 18: An Introduction to Social Network Analysis and Its Application in Software Engineering

Representation

• Matrix representation of network data

[Figure: network data matrix; legend: Absent, Present]
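As a minimal sketch of the matrix representation, the snippet below builds a binary actor-by-actor matrix in which 1 marks a present tie and 0 an absent one. The four actor names come from the sociogram, but the ties listed are hypothetical, chosen only for illustration, and the relation is assumed to be undirected.

```python
actors = ["Andrew", "Bob", "Charles", "David"]

# Hypothetical undirected ties, for illustration only.
ties = [("Andrew", "Bob"), ("Andrew", "David"), ("Bob", "Charles")]

# Map each actor to a row/column position in the matrix.
index = {name: i for i, name in enumerate(actors)}

# Start with all ties absent (0), then mark the present ones (1).
matrix = [[0] * len(actors) for _ in actors]
for a, b in ties:
    matrix[index[a]][index[b]] = 1
    matrix[index[b]][index[a]] = 1  # undirected: mirror the tie

for name, row in zip(actors, matrix):
    print(f"{name:<8}", *row)
```

For a directional relation, the mirrored assignment would be dropped, so that row i, column j records only a tie from actor i to actor j.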

Page 19: An Introduction to Social Network Analysis and Its Application in Software Engineering

Representation

• Actors’ attributes

Actor     Role  Country  Work exp.
Andrew    1     1        3
Bob       1     2        3
Charles   1     1        1
David     1     2        2
Emma      1     1        1
Fynn      2     1        3
Greg      2     1        1
Hannah    2     1        1
Iris      2     1        2
John      2     2        3
Kevin     2     2        2
Lucas     2     1        2

Role: 1. Tester, 2. Developer

Country: 1. Canada, 2. Ireland

Work experience: 1. 1-6 months, 2. 6-12 months, 3. 18+ months

Page 20: An Introduction to Social Network Analysis and Its Application in Software Engineering

Representation

• Sociogram with actors’ attributes

[Sociogram of the 12 actors, Andrew through Lucas. Legend: role (Developer, Tester), country (Canada, Ireland), work experience (1-6 months, 6-12 months, 18+ months)]

Page 22: An Introduction to Social Network Analysis and Its Application in Software Engineering

Representation

• Tie weight

• Strength

• Frequency

• Etc...

Page 23: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Overall network characterization

• Network size

• Network density

• Ties statistics

Page 24: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Network size: the number of actors in the social network

[Sociogram of the 12 actors, Andrew through Lucas]

Size: 12 actors

Size can be larger or smaller than the team size

Herbsleb and Mockus (2003) found that distributed communication networks are significantly smaller than same-site networks

Page 28: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Network density: the proportion of ties that exist in the network out of the total possible ties. It can vary from 0 to 1.

[Sociogram of the 12 actors, Andrew through Lucas]

Possible ties: 12 × (12 - 1) / 2 = 66

Density: 20 / 66 = 0.30

Hinds and McGrath (2006) found that geographic distribution is associated with less dense work ties and less dense information sharing, suggesting that social ties are not particularly important in distributed as compared with collocated teams as a means of coordinating work and improving performance
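The density arithmetic on this slide can be sketched in a few lines. The function names are illustrative, and the example assumes undirected ties, as the n(n-1)/2 count on the slide implies; for directional ties the number of possible ties would be n(n-1).

```python
def possible_ties(n_actors, directed=False):
    """Number of distinct ties that could exist among n_actors."""
    ordered_pairs = n_actors * (n_actors - 1)
    return ordered_pairs if directed else ordered_pairs // 2


def density(n_actors, n_ties, directed=False):
    """Proportion of existing ties out of all possible ties (0 to 1)."""
    return n_ties / possible_ties(n_actors, directed)


# Values from the slide: 12 actors and 20 ties.
print(possible_ties(12))            # 66
print(round(density(12, 20), 2))    # 0.3
```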

Page 32: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Ties statistics: uses the actors’ attributes to reveal overall network characteristics

[Sociogram of the 12 actors, Andrew through Lucas]

E.g.: 5 testers and 7 developers

By counting the number of ties within and across sites, Herbsleb and Mockus (2003) found that there is much more frequent communication with local colleagues in a distributed project
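A minimal sketch of such attribute-based counts follows. The role and country of each actor are decoded from the attribute table in the slides; the tie list is hypothetical, since the actual ties of the sociogram are not recoverable from the transcript, and the helper name `tie_counts` is illustrative.

```python
from collections import Counter

# Role and country per actor, decoded from the attribute table in the slides.
role = {
    "Andrew": "Tester", "Bob": "Tester", "Charles": "Tester",
    "David": "Tester", "Emma": "Tester", "Fynn": "Developer",
    "Greg": "Developer", "Hannah": "Developer", "Iris": "Developer",
    "John": "Developer", "Kevin": "Developer", "Lucas": "Developer",
}
country = {
    "Andrew": "Canada", "Bob": "Ireland", "Charles": "Canada",
    "David": "Ireland", "Emma": "Canada", "Fynn": "Canada",
    "Greg": "Canada", "Hannah": "Canada", "Iris": "Canada",
    "John": "Ireland", "Kevin": "Ireland", "Lucas": "Canada",
}

# Hypothetical undirected ties, for illustration only.
ties = [
    ("Andrew", "Bob"), ("Andrew", "Charles"), ("Bob", "David"),
    ("Emma", "Fynn"), ("John", "Kevin"), ("Iris", "John"),
]


def tie_counts(ties, attribute):
    """Count ties whose two actors share an attribute value and those that do not."""
    within = sum(1 for a, b in ties if attribute[a] == attribute[b])
    return {"within": within, "across": len(ties) - within}


role_counts = Counter(role.values())
site_counts = tie_counts(ties, country)
print(dict(role_counts))   # actors per role
print(site_counts)         # ties within a site vs. across sites
```

Passing `role` instead of `country` to `tie_counts` gives the same breakdown for ties within and across roles.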

Page 35: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Network structure

• Network centralization

• Core-periphery

• Ties reciprocity

• Clique

Page 36: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Network centralization: quantifies the variation in the number of ties per actor as the sum of the differences between the largest number of ties held by any actor and the number held by each other actor, divided by the maximum possible sum of differences. A centralized network structure (index = 1) has many of its ties concentrated around one or a few actors, while a decentralized network structure (index = 0) is one in which there is little variation in the number of ties each actor possesses [Freeman, 1978].

[Sociogram of the 12 actors, Andrew through Lucas]

Centralization index = 0.39

Tsai (2002) found that a formal hierarchical structure in the form of centralization has a significant negative effect on knowledge sharing among organizational units
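The index can be sketched as follows for an undirected network without self-ties. This is a minimal sketch of Freeman's degree centralization, assuming the usual normalization by (n-1)(n-2), the sum of differences reached by a star network; the two toy networks are hypothetical and serve only to show the two extremes.

```python
def degree_centralization(adjacency):
    """Freeman degree centralization of an undirected network.

    adjacency maps each actor to the set of actors it is tied to.
    Returns 1.0 for a star network and 0.0 when all degrees are equal.
    """
    n = len(adjacency)
    if n < 3:
        raise ValueError("centralization needs at least 3 actors")
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    max_degree = max(degrees)
    sum_of_differences = sum(max_degree - d for d in degrees)
    return sum_of_differences / ((n - 1) * (n - 2))


# Hypothetical star: every tie involves the hub.
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}
# Hypothetical ring: every actor has exactly two ties.
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}

print(degree_centralization(star))
print(degree_centralization(ring))
```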

Page 39: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Core-periphery: indicates the extent to which the structure of a network consists of two classes of actors: a cohesive subnetwork, the core, in which the actors are connected to each other in some maximal sense; and a class of actors that are more loosely connected to the cohesive subnetwork but lack any maximal cohesion with the core, the peripheral actors. A high core value (close to 1) indicates a strong core-periphery structure [Borgatti and Everett, 1999].

[Sociogram of the 12 actors, Andrew through Lucas]

Core-periphery index = 0.47

Hinds and McGrath (2006) found that communication networks with a strong core-periphery structure lead to fewer coordination problems than loosely connected networks

Page 42: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Ties reciprocity: when the relationship is directional (e.g., friendship, trust), the reciprocity index can be calculated using the dyad method, the ratio of the number of pairs of actors with a reciprocated tie to the number of pairs with any tie between the actors; or the arc method, the ratio of the number of ties involved in reciprocal relationships to the total number of actual ties [Hanneman and Riddle, 2005].

Dyad method index = 0.85

[Sociogram of the 12 actors, Andrew through Lucas]

The higher the index of reciprocal ties, the more stable or equal the network structure is [Rao and Bandyopadhyay, 1987].

A higher reciprocity index suggests a more horizontal structure, while the opposite suggests a more hierarchical network [Hanneman and Riddle, 2005].
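The two ways of computing the reciprocity index can be sketched side by side. The function name and the arcs below are hypothetical; each arc is a directed tie written as (sender, receiver), and self-ties are assumed to be absent.

```python
def reciprocity(arcs):
    """Return (dyad_index, arc_index) for a collection of directed ties."""
    arcs = set(arcs)
    if not arcs:
        raise ValueError("reciprocity is undefined for a network without ties")
    # Arcs whose reverse arc also exists.
    reciprocated_arcs = {(a, b) for (a, b) in arcs if (b, a) in arcs}
    # Unordered pairs of actors joined by at least one arc.
    connected_pairs = {frozenset(arc) for arc in arcs}
    reciprocated_pairs = {frozenset(arc) for arc in reciprocated_arcs}
    dyad_index = len(reciprocated_pairs) / len(connected_pairs)
    arc_index = len(reciprocated_arcs) / len(arcs)
    return dyad_index, arc_index


# Hypothetical directed ties, for illustration only.
arcs = [
    ("Andrew", "Bob"), ("Bob", "Andrew"),      # reciprocated pair
    ("Charles", "David"), ("David", "Charles"),  # reciprocated pair
    ("Andrew", "Charles"), ("Bob", "David"),   # one-way ties
]
dyad_index, arc_index = reciprocity(arcs)
print(dyad_index)            # 2 reciprocated pairs out of 4 connected pairs
print(round(arc_index, 2))   # 4 reciprocated arcs out of 6 arcs
```

The arc index is never lower than the dyad index, because each reciprocated pair contributes two arcs while each one-way pair contributes only one.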

Page 46: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Clique: consists of a subset of at least 3 actors in which every possible pair of actors is directly connected by a tie and there are no other actors that are also directly connected to all members of the clique [Wasserman and Faust, 1994].

[Sociogram of the 12 actors, Andrew through Lucas]

- Andrew, Bob, Charles, and David

- Andrew, David, and Emma

- Fynn, Iris, John, and Kevin

- Fynn, John, Kevin, and Lucas

Cain and colleagues (1996) found, in the communication networks of a development team, 3 large cliques of team members carrying out 3 major activities: architecture design, code development, and code review
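Maximal cliques like the four listed on the slide can be enumerated with the Bron-Kerbosch algorithm, sketched below without pivoting. The tie list is an assumption: it contains only the ties implied by those four cliques, so Greg, Hannah, and any other ties of the original sociogram are left out because the transcript does not record them.

```python
from itertools import combinations

# Cliques as listed on the slide; used here only to derive a tie list.
slide_cliques = [
    ["Andrew", "Bob", "Charles", "David"],
    ["Andrew", "David", "Emma"],
    ["Fynn", "Iris", "John", "Kevin"],
    ["Fynn", "John", "Kevin", "Lucas"],
]

# Assumed undirected network: every pair inside a listed clique is tied.
adjacency = {}
for group in slide_cliques:
    for a, b in combinations(group, 2):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)


def maximal_cliques(adjacency, min_size=3):
    """Enumerate maximal cliques with at least min_size actors."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            # No actor can extend this clique, so it is maximal.
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for actor in list(candidates):
            expand(clique | {actor},
                   candidates & adjacency[actor],
                   excluded & adjacency[actor])
            candidates = candidates - {actor}
            excluded = excluded | {actor}

    expand(set(), set(adjacency), set())
    return sorted(found)


cliques = maximal_cliques(adjacency)
for clique in cliques:
    print(", ".join(clique))
```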

Page 49: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Information exchange

• Reachability

• Component

• Degree centrality

Page 50: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Reachability: one actor is reachable by another actor if there exists any set of ties that connects both actors, regardless of how many others fall in between them [Wasserman and Faust, 1994].

[Sociogram of the 12 actors, Andrew through Lucas]

All actors are reachable

If some actors cannot reach others, there is a potential division in the network and thus information cannot reach everyone
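Reachability can be checked with a breadth-first search over the ties. The sketch below uses a small hypothetical undirected network that, unlike the one on the slide, is deliberately divided so that some actors cannot reach others.

```python
from collections import deque


def reachable_from(adjacency, start):
    """Return the set of actors reachable from start through any chain of ties."""
    seen = {start}
    queue = deque([start])
    while queue:
        actor = queue.popleft()
        for neighbour in adjacency.get(actor, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {start}


# Hypothetical ties: two groups with no tie between them.
adjacency = {
    "Andrew": {"Bob"},
    "Bob": {"Andrew", "Charles"},
    "Charles": {"Bob"},
    "David": {"Emma"},
    "Emma": {"David"},
}

reachable = reachable_from(adjacency, "Andrew")
print(sorted(reachable))        # Charles is reached through Bob
print("David" in reachable)     # False: information from Andrew cannot reach David
```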

Page 53: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Component: indicates whether a social network is connected. A network is connected if there is a path between every pair of actors; otherwise it is disconnected. The actors in a disconnected network may be partitioned into subsets called components [Wasserman and Faust, 1994].

[Sociogram of the 12 actors, Andrew through Lucas]

One component

The component test indicates whether there is a group of people connected to each other and disconnected from the rest, while the clique test indicates whether a subset of actors is completely connected
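One way to partition actors into components is a union-find structure that merges the groups of any two tied actors. This is only a sketch: the function name is illustrative and the ties are hypothetical, chosen so that the result has two components rather than the single component of the slide's network.

```python
def components(actors, ties):
    """Partition actors into components of an undirected network."""
    parent = {actor: actor for actor in actors}

    def find(actor):
        # Follow parent links to the group representative, shortening the path.
        while parent[actor] != actor:
            parent[actor] = parent[parent[actor]]
            actor = parent[actor]
        return actor

    for a, b in ties:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for actor in actors:
        groups.setdefault(find(actor), []).append(actor)
    return sorted(sorted(group) for group in groups.values())


actors = ["Andrew", "Bob", "Charles", "David", "Emma"]
# Hypothetical ties, for illustration only.
ties = [("Andrew", "Bob"), ("Bob", "Charles"), ("David", "Emma")]

parts = components(actors, ties)
print(parts)
print("connected" if len(parts) == 1 else "disconnected")
```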

Page 56: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Degree centrality: indicates the number of ties of an actor and is indicative of activity. When the ties are directional, we distinguish out-degree, the ties from a certain actor to others, and in-degree, the ties from others to a certain actor [Freeman and colleagues, 1979].

Fynn is the member with the highest out- and in-degree

[Sociogram of the 12 actors, Andrew through Lucas]

Hossain and colleagues (2006) found that highly centralized members coordinate better than others

Bird and colleagues (2006) found that degree centrality indicated that developers who actually committed changes played much more significant roles in the email community than non-developers
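Out-degree and in-degree can be counted directly from a list of directed ties. The arcs below are hypothetical, written as (sender, receiver) and chosen so that Fynn comes out highest on both counts as on the slide; they are not the ties of the original sociogram.

```python
from collections import Counter


def degrees(arcs):
    """Return (out_degree, in_degree) counters for directed ties."""
    out_degree = Counter(sender for sender, _ in arcs)
    in_degree = Counter(receiver for _, receiver in arcs)
    return out_degree, in_degree


# Hypothetical directed ties, for illustration only.
arcs = [
    ("Fynn", "Iris"), ("Fynn", "John"), ("Fynn", "Kevin"),
    ("Iris", "Fynn"), ("John", "Fynn"), ("Kevin", "Fynn"),
]

out_degree, in_degree = degrees(arcs)
print(out_degree.most_common(1))  # actor sending the most ties
print(in_degree.most_common(1))   # actor receiving the most ties
```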

Page 60: An Introduction to Social Network Analysis and Its Application in Software Engineering

Measures

• Brokerage: indicates when an actor, named broker, connects two otherwise unconnected actors or subgroups. Brokerage occurs when, in a triad of actors A, B, and C, A has a tie to B, B has a tie to C, but A has no tie to C. A needs B to reach C, therefore B is a broker. The actors need to be partitioned into subgroups per attribute [Gould and Fernandez, 1989].

Fynn brokers information among his developer colleagues

[Sociogram of the 12 actors, Andrew through Lucas]

Hinds and McGrath (2006) found that brokers effectively disseminate information between distributed sites when maintaining direct relationships is not practical

Page 63: An Introduction to Social Network Analysis and Its Application in Software Engineering

Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software Sabrina Marczak - Manaus, Novembro 2012

Measures

• Brokerage: occurs when an actor, called a broker, connects two otherwise unconnected actors or subgroups. In a triad of actors A, B, and C, brokerage occurs when A has a tie to B, B has a tie to C, but A has no tie to C. A needs B to reach C, so B is a broker. To identify brokerage, the actors need to be partitioned into subgroups by an attribute [Gould and Fernandez, 1989].

Fynn brokers information among his developer colleagues

[Figure: example network with actors Andrew, Bob, Charles, David, Emma, Fynn, Greg, Hannah, Iris, John, Kevin, and Lucas]

Hinds and McGrath (2006) found that brokers effectively disseminate information between distributed sites when maintaining direct relationships is not practical

Ehrlich and colleagues (2008) found that brokers are usually the most knowledgeable members of a team regardless of geographical location
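The triad rule above can be sketched as a search for every A → B → C path with no direct A → C tie, followed by a classification of B's role from the subgroups of the three actors. The role names follow the usual reading of Gould and Fernandez (coordinator, consultant, gatekeeper, representative, liaison). The ties, the role attribute, and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical subgroup attribute (role) and directed ties; NOT the slide's data.
group = {"Emma": "developer", "Fynn": "developer", "David": "developer",
         "Andrew": "manager", "Bob": "tester"}
ties = [("Emma", "Fynn"), ("Fynn", "David"), ("Andrew", "Fynn"), ("Fynn", "Bob")]


def classify(group_a, group_b, group_c):
    """Name the role of broker B in the triad A -> B -> C."""
    if group_a == group_b == group_c:
        return "coordinator"      # all three in the same subgroup
    if group_a == group_c:
        return "consultant"       # B mediates between members of another subgroup
    if group_b == group_c:
        return "gatekeeper"       # B lets outside information into B's subgroup
    if group_a == group_b:
        return "representative"   # B carries information out of B's subgroup
    return "liaison"              # all three in different subgroups


def brokerage_roles(ties, group):
    """Count brokerage roles per actor over all open triads A -> B -> C."""
    tie_set = set(ties)
    senders_to, receivers_from = {}, {}
    for source, target in tie_set:
        receivers_from.setdefault(source, set()).add(target)
        senders_to.setdefault(target, set()).add(source)
    roles = {}
    # Only actors with both incoming and outgoing ties can broker.
    for broker in senders_to.keys() & receivers_from.keys():
        for a in senders_to[broker]:
            for c in receivers_from[broker]:
                # Skip degenerate triads and triads where A reaches C directly.
                if len({a, broker, c}) < 3 or (a, c) in tie_set:
                    continue
                role = classify(group[a], group[broker], group[c])
                roles.setdefault(broker, Counter())[role] += 1
    return roles


roles = brokerage_roles(ties, group)
for actor, counts in roles.items():
    print(actor, dict(counts))
```

In this made-up data only Fynn brokers, and the Emma → Fynn → David triad is the "coordinator" case: a developer brokering between two developer colleagues.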

Measures

• Cutpoint: an actor who is a weak point in the network. If this actor were removed along with their ties, the network would split into unconnected parts. A set of cutpoints is called a cutset.

Andrew and Fynn are the cutset

[Figure: example network with actors Andrew, Bob, Charles, David, Emma, Fynn, Greg, Hannah, Iris, John, Kevin, and Lucas]

In communication networks, a cutpoint indicates where information flow can be disrupted
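A simple way to check the cutpoint definition is brute force: remove each actor in turn and see whether the number of connected components grows. The sketch below does this for a small, hypothetical undirected network; the ties and function names are assumptions for illustration, and larger networks would normally use a linear-time articulation-point algorithm instead.

```python
# Hypothetical undirected ties; NOT the network from the slide.
ties = [
    ("Andrew", "Bob"), ("Andrew", "Charles"), ("Bob", "Charles"),
    ("Andrew", "Fynn"),
    ("Fynn", "Emma"), ("Fynn", "David"), ("Emma", "David"),
    ("Fynn", "Greg"),
]


def count_components(nodes, neighbors):
    """Count connected components among `nodes`, ignoring all other actors."""
    seen = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt in nodes and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return count


def cutpoints(ties):
    """Return the actors whose removal splits the network into more parts."""
    neighbors = {}
    for a, b in ties:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    all_nodes = set(neighbors)
    baseline = count_components(all_nodes, neighbors)
    return sorted(
        node for node in all_nodes
        if count_components(all_nodes - {node}, neighbors) > baseline
    )


cutset = cutpoints(ties)
print(cutset)
```

With these made-up ties the cutset is Andrew and Fynn: removing either one leaves at least two groups that can no longer reach each other.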

Data collection

• Manual

    • Survey

    • Work diary

    • Observation

• Automatic

    • Mining software repositories

        • E.g., source code, bug trackers
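To illustrate automatic collection, the sketch below derives undirected ties from bug-tracker records under one possible convention: two people are tied if they commented on the same issue. The records, the issue identifiers, and the function name `ties_from_comments` are hypothetical; a real study would export such records from its own tracker and choose a tie definition that fits its research question.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (issue id, commenter) records exported from a bug tracker.
comments = [
    ("BUG-1", "Emma"), ("BUG-1", "Fynn"), ("BUG-1", "Emma"),
    ("BUG-2", "Fynn"), ("BUG-2", "David"), ("BUG-2", "Greg"),
    ("BUG-3", "Andrew"),
]


def ties_from_comments(comments):
    """Tie every pair of people who commented on the same issue (assumed rule)."""
    participants = defaultdict(set)
    for issue_id, person in comments:
        participants[issue_id].add(person)
    ties = set()
    for people in participants.values():
        # Sorting gives each undirected tie a single canonical (a, b) form.
        for a, b in combinations(sorted(people), 2):
            ties.add((a, b))
    return ties


mined_ties = ties_from_comments(comments)
for tie in sorted(mined_ties):
    print(tie)
```

The resulting tie list can then be fed to any of the measures above or imported into an analysis tool.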

Tools

• UCINet

https://sites.google.com/site/ucinetsoftware/home

• NetMiner

http://www.netminer.com

• Gephi

https://gephi.org/

> My research

• RE ’07: Collaboration patterns and impact of distance on awareness

• RE ’08: Brokerage

    • Brokerage is predominant in certain types of communication

    • Distance didn’t matter

    • Knowledge and experience are determinants of brokerage

• Book ch. ’10: RDC (Requirements-Driven Collaboration) framework

• RE ’11: Roles and communication structures

• ICSE ’13: Domain knowledge and hierarchical control structures in coordination

    • Communication ties that do not follow task assignments but instead follow the hierarchical structure

> Recommended reading

Rob Cross and Andrew Parker. The Hidden Power of Social Networks: Understanding How Work Really Gets Done in Organizations. Harvard Business School Press, Boston, United States, June 2004.

John Scott. Social Network Analysis: A Handbook. Sage Publications, London, England, 2nd edition, March 2000.

Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, United Kingdom, 1994.

• Kate Ehrlich and Klarissa Chang. Leveraging Expertise in Global Software Teams: Going Outside Boundaries. In IEEE Proc. of the International Conference on Global Software Engineering, 149–158, Florianópolis, Brazil, October 2006.

• Marcelo Cataldo, Patrick Wagstrom, James Herbsleb, and Kathleen Carley. Identification of Coordination Requirements: Implications for the Design of Collaboration and Awareness Tools. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 353–362, Banff, Canada, November 2006.

> References

[Mitchell, 1969] J. Clyde Mitchell. Social Networks in Urban Situations: Analyses of Personal Relationships in Central African Towns. Manchester University Press, Manchester, United Kingdom, November 1969.

[Herbsleb and Mockus, 2003] James Herbsleb and Audris Mockus. An Empirical Study of Speed and Communication in Globally Distributed Software Development. IEEE Transactions on Software Engineering, 29(6): 481–494, June 2003.

[Hinds and McGrath, 2006] Pamela Hinds and Cathleen McGrath. Structures that Work: Social Structure, Work Structure and Coordination Ease in Geographically Distributed Teams. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 343–352, Banff, Canada, November 2006.

[Freeman, 1978] Linton Freeman. Centrality in Social Networks: Conceptual Clarification. Social Networks, 1(3): 215–239, 1978/1979.

[Tsai, 2002] Wenpin Tsai. Social Structure of “Coopetition” Within a Multiunit Organization: Coordination, Competition, and Intraorganizational Knowledge Sharing. Organization Science, 13(2): 179–190, March 2002.

[Borgatti and Everett, 1999] Stephen Borgatti and Martin Everett. Models of Core/Periphery Structures. Social Networks, 21(4): 375–395, October 1999.

[Hanneman and Riddle, 2005] Robert Hanneman and Mark Riddle. Introduction to Social Network Methods. University of California, Riverside, United States, 2005.

[Rao and Bandyopadhyay, 1987] Ramachandra Rao and Sura Bandyopadhyay. Measures of Reciprocity in a Social Network. Sankhya: The Indian Journal of Statistics, Series A, 49(2): 141–188, June 1987.

[Wasserman and Faust, 1994] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, United Kingdom, 1994.

[Cain and colleagues, 1996] Brendan Cain, James Coplien, and Neil Harrison. Social Patterns in Productive Software Development Organizations. Annals of Software Engineering, 2(1): 259–286, 1996.

[Freeman and colleagues, 1979] Linton Freeman, Douglas Roeder, and Robert Mulholland. Centrality in Social Networks: II. Experimental Results. Social Networks, 2(2):119–141, 1979/1980.

[Hossain and colleagues, 2006] Liaquat Hossain, Andre Wu, and Kenneth Chung. Actor Centrality Correlates to Project Based Coordination. In ACM Proc. of the Conference on Computer Supported Cooperative Work, 363–372, Banff, Canada, November 2006.

[Bird and colleagues, 2006] Christian Bird, Alex Gourley, Premkumar Devanbu, Michael Gertz, and Anand Swaminathan. Mining Email Social Networks. In ACM Proc. of the Int’l Workshop on Mining Software Repositories, 137–143, Shanghai, China, May 2006.

[Ehrlich and colleagues, 2008] Kate Ehrlich, Mary Helander, Giuseppe Valetto, Stephen Davies, and Clay Williams. An Analysis of Congruence Gaps and Their Effect on Distributed Software Development. In Workshop on Socio-Technical Congruence, in conj. with the Int’l Conference on Software Engineering, Leipzig, Germany, May 2008. ACM.

[RE ’07] Daniela Damian, Sabrina Marczak, and Irwin Kwan, “Collaboration Patterns and the Impact of Distance on Awareness in Requirements-Centred Social Networks”, In: IEEE Proc. International Requirements Engineering Conference, New Delhi, India, 59–68, 2007.

[RE ’08] Sabrina Marczak, Daniela Damian, Ulrike Stege, and Adrian Schroeter, “Information Brokers in Requirements-Dependency Social Networks”, In: IEEE Proc. International Requirements Engineering Conference, Barcelona, Spain, 53–62, September 2008.

[Book ch. ’10] Daniela Damian, Irwin Kwan, and Sabrina Marczak, Requirements-Driven Collaboration: Leveraging the Invisible Relationships between Requirements and People, Collaborative Software Engineering, Mistrik, I., Grundy, J., van der Hoek, A., Whitehead, J. (Eds.), Chapter 3, pages 57–76, Springer-Verlag, London, England, March 2010.

[RE ’11] Sabrina Marczak and Daniela Damian, “How Interaction Between Roles Shapes the Communication Structure in Requirements-Driven Collaboration”, In: IEEE Proc. International Requirements Engineering Conference, Trento, Italy, 47–56, 2011.

[ICSE ’13] Daniela Damian, Remko Helms, Irwin Kwan, Sabrina Marczak, and Benjamin Koelewijn, “The Role of Domain Knowledge and Hierarchical Control Structures in Socio-Technical Coordination”, In: IEEE International Conference on Software Engineering, San Francisco, USA, May 2013 (To appear).

Sabrina Marczak

www.inf.pucrs.br/sabrina.marczak/

Workshop de Métodos Ágeis para Desenvolvimento Distribuído de Software Manaus, Novembro 2012

Thank you for your attention!

Questions?