Pajek instructions - Quick reference sheet



Pajek – Quick Reference sheet v2 – Sept 2009

1.0 Downloading and installing Pajek

1.1 Navigate to http://pajek.imfm.si/doku.php?id=download

1.2 Click on “download [Pajek Wiki]”.

1.3 On the left-hand side of this webpage there is a navigation bar. Click on “download”; this brings you to the “download programs” page.

1.4 Click on “Download the latest version of Pajek installation pack”. To install it, run “pajek” and follow the installer messages.

1.5 Click on “Run” to install it, or on “Save” to keep a copy for later installation.

1.6 During installation a popup box will prompt you to click “Next”. Do so, and a licence agreement will pop up asking that you not use this free software commercially (if it is for commercial use, read the agreement further).

1.7 Another popup box will ask whether you want to start the Pajek installation. Click “Start”.

1.8 Accept the default folder, or browse for the folder in which you wish to put the program, and hit Enter.

1.9 If you have had an older version of Pajek on your computer, you will be asked whether to overwrite the previous Pajek files. Click “Yes”.

1.10 A popup box will say that Pajek has been installed successfully. If you go to the installation folder, you can copy the Pajek icon onto your desktop for easy access.

There are other downloads available from this page:

• Pajek Manual & history
• Sunbelt (network) conference slides
• Excel2Pajek and/or Text2Pajek


2.0 Getting your data into Pajek format from Excel

2.1 Go back to http://pajek.imfm.si/doku.php?id=download

2.2 Click on “Download Excel2Pajek and/or Text2Pajek” and download both of these files. This guide only discusses the Excel to Pajek conversion, but the same principles apply to converting a text file to a Pajek file.

2.3 For both files, scroll to the bottom of the page and click on “Download program”. A window will open asking whether you want to run or save the file. It is better to create a folder on your desktop and save both files to it. Both are “zip” files, so make sure you have a program on your computer that can unzip them.

2.4 Extract the contact tracing data into Excel from iPHIS or another database.

a. Make the first column contain the case to which the contact is connected.
b. Make the second column contain the contact who is named by the case.

Don’t worry if some pairs appear more than once; Pajek will ignore the duplicates unless you specifically instruct it otherwise. Likewise, do not worry about people who name each other reciprocally; Pajek will also ignore these unless you specify otherwise. Your data should look like this (open the attached Excel file “Training example for Excel conversion.xls”):

Open the software program ‘Excel2Pajek’ to convert the Excel file into a format readable by Pajek.


Click on ‘Input file’, browse for the correct Excel file (‘Training example for Excel conversion.xls’), click on it, and then click on ‘Open’. This automatically enters the file in the “Input file” box, and the rest of the boxes (output file, worksheet, 1st column, 2nd column, lines from top to ignore) fill in automatically. Note: if you type in the name of the file rather than browsing for it, the rest of the boxes do not fill in automatically.

2.5 Modify the output file name if you need to. For this example, do the following steps:

2.6 Enter 1 in the “Line(s) from top to ignore” box if the first row of your sheet holds column titles (cases, contacts).

2.7 Select whether this is a 1-mode network (just people) or a 2-mode network (people and places). When you select ‘1-mode network’, another box pops up.

2.8 Select “undirected (Edges)”.

2.9 Click on ‘Create Pajek file’.


Finally you end up with this. Click on ‘OK’.

The new file is saved in the same folder on your hard drive as the original Excel file. You can now close the Excel2Pajek program. The new file has the extension .net and looks like this in your file folder:

Training example for excel conversion.net

You can open the file directly from Pajek. If you want to look at its contents, you can open it in Notepad or WordPad, where it looks like this:
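
For readers who would like to see the idea behind the conversion, here is a minimal Python sketch that builds the text of a Pajek .net file from case and contact pairs. It is not the Excel2Pajek program: the function name `pairs_to_pajek_net` and the sample pairs are hypothetical, and the layout (a `*Vertices` line, one numbered and quoted label per vertex, then an `*Edges` list) is the basic undirected form of the format, so the real output may carry extra details. Duplicate and reciprocal pairs collapse into one undirected edge, in line with the description in 2.4.

```python
def pairs_to_pajek_net(pairs):
    """Build the text of a basic undirected Pajek .net file.

    `pairs` is a list of (case, contact) labels. Duplicate pairs and
    reciprocal pairs (A-B and B-A) become a single undirected edge.
    """
    labels = []      # vertex labels in order of first appearance
    number = {}      # label -> 1-based vertex number
    edges = []       # unique undirected edges as (low, high) numbers
    seen = set()
    for case, contact in pairs:
        for name in (case, contact):
            if name not in number:
                labels.append(name)
                number[name] = len(labels)
        a, b = sorted((number[case], number[contact]))
        if (a, b) not in seen:
            seen.add((a, b))
            edges.append((a, b))

    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{name}"' for i, name in enumerate(labels, start=1)]
    lines.append("*Edges")
    lines += [f"{a} {b}" for a, b in edges]
    return "\n".join(lines) + "\n"


# Hypothetical pairs, NOT the contents of the training file.
example_pairs = [
    ("Case_1", "Contact_A"),
    ("Case_1", "Contact_B"),
    ("Contact_A", "Case_1"),   # reciprocal naming, collapses
    ("Case_1", "Contact_B"),   # exact duplicate, collapses
    ("Case_2", "Contact_B"),
]
net_text = pairs_to_pajek_net(example_pairs)
print(net_text)
```

Writing `net_text` to a file with a .net extension would give something that can be inspected in Notepad in the same way as the converted training file.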


3.0 Starting Pajek

3.1 Click on the Pajek icon (hopefully located on your desktop). This opens the Pajek program.

3.2 Now you need to read in the file you created using the ‘Excel2Pajek’ program.

3.3 Click on ‘File’; a drop-down box appears. Click on ‘Network’, then ‘Read’ (File > Network > Read). This opens the folders on your hard drive. Browse until you find the file you created, ‘Training example for excel conversion.net’, click on it, and then click on ‘Open’.

This is what you should see:

1. A smaller box called ‘Report’, showing ‘28 lines read’.
2. The larger box (the Pajek program), where the first empty space (Networks) shows the name and path of the file and, at the end in brackets, the number of nodes (11).
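
As a rough cross-check outside Pajek, the short Python sketch below counts the non-empty lines, the declared number of vertices and the edge lines in the text of a .net file. The function name `summarize_pajek_net` and the sample text are hypothetical, and it assumes the basic layout (`*Vertices n`, then labels, then `*Edges` or `*Arcs`); how Pajek itself arrives at its "lines read" figure is not something this sketch claims to reproduce.

```python
def summarize_pajek_net(text):
    """Count non-empty lines, declared vertices and edge lines in .net text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_vertices = 0
    n_edges = 0
    section = None
    for ln in lines:
        if ln.startswith("*"):
            # Section keywords such as *Vertices, *Edges, *Arcs
            section = ln.split()[0].lower()
            if section == "*vertices":
                n_vertices = int(ln.split()[1])
        elif section in ("*edges", "*arcs"):
            n_edges += 1
    return {"lines": len(lines), "vertices": n_vertices, "edges": n_edges}


# Hypothetical file contents, not the training example.
sample_net = """*Vertices 4
1 "Case_1"
2 "Contact_A"
3 "Contact_B"
4 "Case_2"
*Edges
1 2
1 3
3 4
"""
summary = summarize_pajek_net(sample_net)
print(summary)
```

To try it on a real file, read the file into a string first and pass that string to the function.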


4.0 The network we have entered into Pajek is an undirected network, which means there is no direction on the lines. We start with undirected graphs in infectious disease because very often we don’t know which person transmitted to whom, so it would be inaccurate to place a direction on the edges.

4.1 To look at your network, click on ‘Draw’; a drop-down menu appears. Click on ‘Draw’ again.


This is what you should see.

This initial drawing of the network does not lay the nodes out in a very helpful way: the 11-node network is presented in one big circle.


4.2 Now you want to visualize your network better. In this screen, click on Layout; a drop-down menu appears. Click on Energy ► Kamada-Kawai ► Free. This is what you should see.

This command moves the vertices/nodes to locations that minimize the variation in line length, and produces an aesthetically pleasing static layout. Under the Kamada-Kawai command you can let the algorithm distribute the nodes and their edges “freely” in visual space, fix the first and last nodes, or identify a node you would like to appear in the middle of the drawing (e.g., the most central node).


4.3 You may want to visualize your network using another command. In this screen, click on Layout; a drop-down menu appears. Click on Energy ► Fruchterman-Reingold ► 2D. Your drawing should look something like this.

The Fruchterman-Reingold algorithm is similar to the Kamada-Kawai algorithm: it tries to minimize the “energy” of the network. It can also distribute points in both two-dimensional and three-dimensional space (it tends to separate the network better).

It is important to note that there is no unique “solution” for drawing a network with either of these algorithms. Every time we use them, Pajek will draw the network differently, although repeated drawings of the same network data tend to resemble one another. It is generally a good idea to run the energy commands more than once. Results depend on the starting positions of the vertices, so different starting positions may (and often do) yield different results. A second run of an energy command may well give a clearer drawing, because it starts from positions that are no longer random and already reflect, to some extent, the relationships between the nodes.
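
To illustrate why the drawing depends on the starting positions, here is a minimal Python sketch of a basic Fruchterman-Reingold style layout. It is a textbook-style simplification and not Pajek's implementation: the function name, the parameter defaults (100 iterations, a unit-square frame, a starting step limit of 0.1) and the five-vertex example are all assumptions made for illustration. Vertices repel each other, edges pull their endpoints together, and the allowed movement shrinks each round.

```python
import math
import random


def fruchterman_reingold(n, edges, iterations=100, seed=0, start=None):
    """Place vertices 0..n-1 (n >= 1) in the unit square.

    `start` may hold positions from an earlier run; otherwise the
    starting positions are random and controlled by `seed`.
    """
    rng = random.Random(seed)
    if start is not None:
        pos = [list(p) for p in start]
    else:
        pos = [[rng.random(), rng.random()] for _ in range(n)]
    k = math.sqrt(1.0 / n)                  # ideal distance for a unit-area frame
    temperature = 0.1                       # largest move allowed in one round
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n)]
        # Repulsion between every pair of vertices, strength k^2 / d.
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = max(math.hypot(dx, dy), 1e-9)
                push = k * k / d
                disp[i][0] += dx / d * push
                disp[i][1] += dy / d * push
                disp[j][0] -= dx / d * push
                disp[j][1] -= dy / d * push
        # Attraction along each edge, strength d^2 / k.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = max(math.hypot(dx, dy), 1e-9)
            pull = d * d / k
            disp[a][0] -= dx / d * pull
            disp[a][1] -= dy / d * pull
            disp[b][0] += dx / d * pull
            disp[b][1] += dy / d * pull
        # Move each vertex by at most `temperature`, staying inside the frame.
        for i in range(n):
            length = max(math.hypot(disp[i][0], disp[i][1]), 1e-9)
            step = min(length, temperature)
            pos[i][0] = min(1.0, max(0.0, pos[i][0] + disp[i][0] / length * step))
            pos[i][1] = min(1.0, max(0.0, pos[i][1] + disp[i][1] / length * step))
        temperature -= cooling
    return [tuple(p) for p in pos]


# Hypothetical five-vertex network (vertices numbered from 0 here).
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
layout_a = fruchterman_reingold(5, edges, seed=1)
layout_b = fruchterman_reingold(5, edges, seed=2)            # different random start
refined = fruchterman_reingold(5, edges, start=layout_a)     # second run from the first result
```

Running with two different seeds mimics drawing the same network twice from random starting positions, while passing `start=layout_a` mimics applying the energy command a second time to an existing drawing.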


4.0 Preparing the network for analysis

4.1 Read in the text file called ‘SN workshops WITHOUT locations.txt’ using the same commands as in section 3.3 of this guide. In Pajek, click on ‘File’; a drop-down box appears. Click on ‘Network’, then ‘Read’ (File > Network > Read). This opens the folders on your hard drive. Browse until you find the file ‘SN workshops WITHOUT locations.txt’, click on it, and then click on ‘Open’.

This is what you should see:

1. A smaller box called ‘Report’, showing ‘73 lines read’.
2. The larger box (the Pajek program), where the first empty space (Networks) shows the name and path of the file and, at the end in brackets, the number of nodes (36).

Usually everyone wants a quick visualization of their network. To draw the network, click on ‘Draw’; a drop-down menu appears. Click on ‘Draw’ again.

To visualize the network better, in this screen click on Layout; a drop-down menu appears. Click on Energy ► Kamada-Kawai ► Free. Then try Energy ► Fruchterman-Reingold ► 2D. What differences do you see between the 2 network drawings? Minimize the drawing.

Usually one of the first things you want to find out is how many components there are in the graph, that is, how many subnetworks of people are connected directly or indirectly by a path of length one or more. In infectious disease most networks are undirected. We start with undirected graphs because very often we don’t know which person transmitted to whom, so it would be inaccurate to place a direction on the edges.

5.0 Components

5.1 To find out how many components there are in your graph, and of what size, click on Net > Components > Weak.

A popup box will appear as below. If you want to see all the components, leave the default of 1. If you wish to see only larger components, say of four people or more, type in 4 and click on OK. The report window will then appear as the task is completed. Minimize or close it.
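
The idea behind this command can be sketched in a few lines of Python. This is an illustration of finding components by breadth-first search, not Pajek's own code; the function name `weak_components`, the `min_size` parameter (standing in for the minimum size typed into the popup box) and the six-vertex example are hypothetical.

```python
from collections import deque


def weak_components(n, edges, min_size=1):
    """Return the components of vertices 1..n that have at least min_size members."""
    neighbours = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)      # direction is ignored for weak components
    seen = set()
    components = []
    for start in range(1, n + 1):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            v = queue.popleft()
            members.append(v)
            for w in neighbours[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if len(members) >= min_size:
            components.append(sorted(members))
    return components


# Hypothetical six-vertex network; vertex 6 has no connections.
edges = [(1, 2), (2, 3), (4, 5)]
all_components = weak_components(6, edges)               # [[1, 2, 3], [4, 5], [6]]
larger_only = weak_components(6, edges, min_size=2)      # [[1, 2, 3], [4, 5]]
print(all_components)
print(larger_only)
```

Raising `min_size` simply hides the small components, which is the effect described for typing a larger number into the popup box.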


6.0 Draw

To draw your component, click on Draw, then Layout ► Energy ► Kamada-Kawai ► Free. Then try Layout ► Energy ► Fruchterman-Reingold ► 2D. What differences do you see between the 2 network drawings? Staying with the drawing generated by the ‘Fruchterman-Reingold ► 2D’ algorithm, it should look something like this.

You can change the position of nodes by clicking on them and moving them. Try it.


6.1 Under the Draw menu you can change the look of your graph.

You can mark your nodes using labels or numbers:
Options > Mark vertices using > numbers
Options > Mark vertices using > labels

You can change the size of the vertices:
Options > Size > of vertices
A pop-up box will appear. Enter a new size of 10. It should look something like this.

Change the size of the vertices to 100 (wow!), then back to the original size of 6.

Change the colour of the vertices:
Options > colour > of vertices > select
A pop-up box will appear. Pick a colour.

Change the colour of the vertices back to the original, as dictated by your input file:
Options > colour > vertices > as defined on input file

Open your original text file in Notepad. You can see that the colours are defined in the file. You could change the colours directly in this file, but you would then need to read it into Pajek again.

It is advisable to change the background to white instead of grey so that your graph is easily seen:
Options > Colours > background, and click on white.


Let’s try a few more things:
Change the colour of the edges and then back to the original.
Change the colour of the vertex borders and then back to the original.
Change the size of the lines and then back to the original.

6.3 To see a frequency table of the components and the numbers of people in them, click on Info > Partition. A popup window will appear; leave the default settings as they are and press OK (twice). The number and size of all components, called “clusters”, will appear in the report window as below.

The report tells you there are 3 components and 36 nodes in this network. It also describes the frequency distribution of the components: component 1 has 14 nodes and represents 39% of the network.
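
The percentages in such a report are simple proportions. The Python sketch below shows the calculation on a small hypothetical partition (the function name and the six-vertex partition are made up for illustration), and then checks the one figure quoted above: 14 of 36 nodes.

```python
from collections import Counter


def component_frequency(partition):
    """partition[i] is the component (cluster) number of vertex i + 1.

    Returns a list of (cluster, count, percent) rows, one per cluster.
    """
    counts = Counter(partition)
    total = len(partition)
    return [(c, counts[c], round(100 * counts[c] / total, 1)) for c in sorted(counts)]


# Hypothetical partition of six vertices into three components.
table = component_frequency([1, 1, 1, 2, 2, 3])
for cluster, count, percent in table:
    print(cluster, count, percent)

# The figure from the report above: 14 of 36 nodes, rounded to a whole percent.
share_of_component_1 = round(100 * 14 / 36)   # 39
```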


6.4 Extracting components

To look at components in more detail, click on Operations > Extract from network > Partition. A window will pop up; type in the number of the cluster you wish to extract (which you can see in the report of the components in 6.3 above). Enter 1. You can then draw a new network of only that component. To draw your component, click on Draw (you will see the component as it was in the original network drawing), then Layout ► Energy ► Fruchterman-Reingold ► 2D. It should look something like this.

You can save all the components as separate files by clicking on File > Network > Save. This opens the ‘Save as’ window; type in the name of the new file. Note that all Pajek *.net files are plain text files and can be read in WordPad or Notepad.

6.5 Under the Draw menu you can also save your drawing to include it in a presentation. Click on Export > 2D (two-dimensional) > BMP for a bitmap file. This opens the ‘Save as’ box; save the file where you want it.

Before going on to the next section, go to the main Pajek window and, under both Networks and Partitions, click on the ▼ on the right-hand side. A drop-down menu appears, from which you can return to your original network.


7.0 Degree

7.1 To obtain basic characteristics of your nodes, such as degree, click on Net > Partitions > Degree > All (assuming an undirected graph).

Then click on Info > Partition and leave the popup window at its default setting, since you wish to see all values of the degree. The report will show you all of the nodes classified by their degree.

This report tells you there are 36 nodes in this network, that the lowest degree (the number of direct connections a node has) is 1, and that the highest degree is 12. You can also see that the majority of nodes (n = 30, 83%) have a degree of 1; in other words, they have only one connection.
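
Degree is easy to compute by hand for a small network, and the following Python sketch shows the calculation. The function names and the five-vertex edge list are hypothetical, and the sketch assumes an undirected network in which each connection is listed once.

```python
from collections import Counter


def degree_partition(n, edges):
    """Return the degree of vertices 1..n as a list (position 0 is vertex 1)."""
    degree = [0] * (n + 1)          # index 0 is unused
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree[1:]


def degree_distribution(degrees):
    """Map each degree value to (number of vertices, percent of vertices)."""
    counts = Counter(degrees)
    total = len(degrees)
    return {d: (counts[d], round(100 * counts[d] / total, 1)) for d in sorted(counts)}


# Hypothetical network: vertex 1 names three contacts, vertex 4 names one more.
edges = [(1, 2), (1, 3), (1, 4), (4, 5)]
degrees = degree_partition(5, edges)              # [3, 1, 1, 2, 1]
distribution = degree_distribution(degrees)
print(degrees)
print(distribution)
```

In this toy network three of the five vertices have degree 1, the same kind of pattern as the 30 of 36 nodes in the report above.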


To see the nodes colour coded by their degrees, click on Draw > Partition, then Layout ► Energy ► Fruchterman-Reingold ► 2D. If you click on the Partitions box on the left-hand side of the main Pajek window, you can save the partition and load it again after you have loaded the network file.

In this drawing the nodes are colour coded by their degree. The yellow nodes (30) all have a degree of 1 (1 direct connection), the blue nodes have a degree of 4, the pink nodes a degree of 5, the purple nodes a degree of 8 and the grey nodes a degree of 12.

If you were the field epidemiologist, the public health nurse or the MOH, which person (node) would you be most interested in following up with a contact interview? In component 1, can you name the ‘bridge’? Can you name the cutpoint?
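
The sheet does not define these two terms, so the sketch below assumes the usual graph-theory meanings: a cutpoint is a node whose removal splits its component into more pieces, and a bridge is a line whose removal does the same. The Python code checks this by brute force, removing one node or one line at a time and recounting the components. The function names and the five-vertex example are hypothetical, and this is not how Pajek works internally.

```python
def count_components(vertices, edges):
    """Number of connected components among `vertices`, using undirected `edges`."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # shorten the path as we go
            v = parent[v]
        return v

    for a, b in edges:
        if a in parent and b in parent:
            parent[find(a)] = find(b)
    return len({find(v) for v in vertices})


def cutpoints(vertices, edges):
    """Vertices whose removal increases the number of components."""
    base = count_components(vertices, edges)
    result = []
    for v in vertices:
        rest = [u for u in vertices if u != v]
        kept = [(a, b) for a, b in edges if v not in (a, b)]
        if count_components(rest, kept) > base:
            result.append(v)
    return result


def bridges(vertices, edges):
    """Edges whose removal increases the number of components."""
    base = count_components(vertices, edges)
    return [e for i, e in enumerate(edges)
            if count_components(vertices, edges[:i] + edges[i + 1:]) > base]


# Hypothetical network: a triangle 1-2-3 with a chain 3-4-5 attached.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
found_cutpoints = cutpoints(vertices, edges)   # [3, 4]
found_bridges = bridges(vertices, edges)       # [(3, 4), (4, 5)]
print(found_cutpoints, found_bridges)
```

In the toy network, nodes 3 and 4 hold the chain onto the triangle, which is why they come out as cutpoints; the lines inside the triangle are not bridges because each has an alternative route around it.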


7.2 To obtain a list of all nodes and the components they belong to, double left click on the second blank space next to Partitions, which says “1. Weak components..”. (You may have to go to the drop-down list under Partitions to select the original view, ‘1. Weak components…’.)

A text file will pop up with the list. It can be merged with your data using the unique positive number from the actor list, and can be saved as a text file by clicking on “File” in the top left-hand corner. It looks like this.

You can see from this report that Case_1, Case_2, Case_5 & Case_6 all belong to the same component, component 1.
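
The merge by vertex number can be sketched in Python as follows. The sketch assumes two plain-text inputs: the .net file, whose `*Vertices` section gives each vertex a number and a quoted label, and a saved partition file with a `*Vertices n` line followed by one cluster number per vertex, in vertex order. The function names and the sample texts are hypothetical, and a real saved file may differ in detail.

```python
def read_labels(net_text):
    """Map vertex number -> label from the *Vertices section of .net text."""
    labels = {}
    in_vertices = False
    for line in net_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*"):
            in_vertices = line.lower().startswith("*vertices")
            continue
        if in_vertices:
            number, rest = line.split(None, 1)
            # Labels are normally quoted; fall back to the first word if not.
            labels[int(number)] = rest.split('"')[1] if '"' in rest else rest.split()[0]
    return labels


def read_partition(clu_text):
    """Map vertex number -> cluster number from partition text."""
    values = [ln.strip() for ln in clu_text.splitlines()
              if ln.strip() and not ln.strip().startswith("*")]
    return {i: int(v) for i, v in enumerate(values, start=1)}


# Hypothetical file contents.
net_text = '*Vertices 3\n1 "Case_1"\n2 "Case_2"\n3 "Contact_A"\n*Edges\n1 3\n'
clu_text = "*Vertices 3\n1\n2\n1\n"

labels = read_labels(net_text)
clusters = read_partition(clu_text)
merged = {labels[v]: clusters[v] for v in sorted(labels)}
print(merged)
```

The resulting label-to-component mapping could then be joined onto the original case and contact records.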


8.0 Simplify graphs

Another useful thing to do with large graphs is to simplify them, for example by removing all nodes with degree one. In reality these would be the transmission links that come to a stop, either because you find no more positives after that person or because you have not yet had a chance to follow them up.

To do this, select the network you want to work with using the ▼ drop-down menu to the right.

Then click on Net > Partitions > Degree > All. Look at the report and identify the nodes with degree 1, 2 or so that you wish to remove. Click on Operations > Extract from Network > Partition; a popup window will ask which clusters you wish to retain. Type in 2 - 100 so that only nodes of degree 2 or higher (up to 100) are kept.

Then click on Draw > Partition, then Layout ► Energy ► Fruchterman-Reingold ► 2D. You can view the simplified graph with the colours denoting the degree of the nodes.
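
The extraction step can be pictured with a short Python sketch: compute each node's degree in the full network, keep the nodes whose degree falls in the chosen range, and keep only the lines between kept nodes. The function name `extract_by_degree` and the six-vertex example are hypothetical, and unlike an extracted network in Pajek the sketch keeps the original vertex numbers.

```python
def extract_by_degree(n, edges, low=2, high=100):
    """Keep vertices 1..n whose degree is between low and high (inclusive)."""
    degree = {v: 0 for v in range(1, n + 1)}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    keep = {v for v, d in degree.items() if low <= d <= high}
    kept_edges = [(a, b) for a, b in edges if a in keep and b in keep]
    return sorted(keep), kept_edges


# Hypothetical network: vertices 2, 3 and 5 each have a single connection.
edges = [(1, 2), (1, 3), (1, 4), (4, 5), (4, 6), (1, 6)]
kept_vertices, kept_edges = extract_by_degree(6, edges, low=2, high=100)
print(kept_vertices)   # [1, 4, 6]
print(kept_edges)      # [(1, 4), (4, 6), (1, 6)]
```

Note that the degrees are taken from the full network. After the extraction some remaining nodes may themselves be left with a single connection, so recomputing the degree partition on the simplified network can give different values.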


9.0 Now close the Pajek program and then reopen it to start a fresh example on your own. This next file has both people and places in it. As you go through this example, try to note any differences in results (frequencies and drawings) between this file and the previous file.

• Read in the text file called ‘SN workshops WITH locations.txt’ using the same commands as in section 3.3 of this guide.
• Look at the report and notice the number of lines read. How many nodes are in this network?
• Draw your network.
• Find the number of components in this network.
• Draw your components. Do you notice any differences between the numbers of components using this file compared to the other file?
• Change the position of your nodes on your drawing.
• Change the look of your graph (if you want). You could try marking your nodes using labels or numbers, changing the size of the nodes, or changing the colour of the vertices.
• Run a frequency of the components and the numbers of people in them. How many components are there? How many nodes are in this cluster?
• Obtain the degrees for each node. What is the highest degree? What is the lowest degree? What degree do the majority of nodes have?
• Draw the network colour coded by degree.
• If you were the field epidemiologist, the public health nurse or the MOH, what node(s) would you be interested in from a disease transmission perspective and/or a follow-up contact tracing perspective?
• In this network, can you name the ‘bridge’? Can you name the cutpoint?
• Simplify your network drawing by removing all the nodes with degree 1. Draw the simplified network.
• Simplify your network drawing again by removing all the nodes with degree 1 and degree 2. Draw the simplified network.


Do you think these 2 networks are similar in any way?