Network basics with R and igraph (part II of III)

The first part of this little series of posts went over a some basic network modeling and plotting tools. In this part I want to go over how to get some of the simple properties of these networks using two packages, igraph and NetIndices.


###############################################################
###############################################################
####
####            PART II: NETWORK PROPERTIES
####
###############################################################
###############################################################
# Again I will be using the igraph package
# to analyze various properties of networks
library(igraph)
# Additionally I will use the NetIndices package,
# since its function "GenInd()" outputs several network properties
library(NetIndices)
# First I'll create a network to analyze using
# the preferential attachment model (power=.5)

test.graph<-barabasi.game(100,power=.5,m=2)
# In this case we have set m=2, meaning that
# for each new node 2 new links are created

par(mar=c(.1,.1,.1,.1))
plot.igraph(test.graph,
layout=layout.fruchterman.reingold,
vertex.size=7,
vertex.label.cex=.5,
edge.arrow.size=.5)

Preferential attachment graph with m=2, power=0.5

Preferential attachment graph with m=2, power=0.5


# How large is the network (I know I set this when we made the network,
# but what if I had not?)

test.graph      # Tells me that it is an IGRAPH object with 100 nodes and 197 links,
# made with the Barabasi algorithm
V(test.graph)   # gives the vertex sequence
E(test.graph)   # gives the edge sequence (edge list)

# The "GenInd()" function requires an input of an adjacency matrix
test.graph.adj<-get.adjacency(test.graph,sparse=F)
# in older versions of igraph the default was sparse=F,
# but now you must specify, other wise you get a matrix of 1s and .s

test.graph.properties<-GenInd(test.graph.adj)

# The function output consists of 10 network properties.
# I will consider five of them here:

test.graph.properties$N            #number of nodes

test.graph.properties$Ltot        #number of links

test.graph.properties$LD        #link density (average # of links per node)

test.graph.properties$C            #the connectance of the graph
# This function measures connectance as L/(N*(N-1)) where L is links, and N is nodes
# Connectance can also be calculated as L/(N^2)

# The degree of a node refers to the number of links associated with a node.
# Degree can be measured as the links going in ("in degree"), out ("out degree"), or both.
# The degree() function takes a graph input and gives the degree of specified nodes.
# With the argument "v=V(graph)" you tell the function to give the degree of all nodes in the graph,
# while the "mode" argument specifies in, out, or both.

in.deg.testgraph<-degree(test.graph,v=V(test.graph),mode="in")
out.deg.testgraph<-degree(test.graph,v=V(test.graph),mode="out")
all.deg.testgraph<-degree(test.graph,v=V(test.graph),mode="all")

# Degree distribution is the cumulative frequency of nodes with a given degree
# this, like degree() can be specified as "in", "out", or "all"
deg.distr<-degree.distribution(test.graph,cumulative=T,mode="all")

# Using the power.law.fit() function I can fit a power law to the degree distribution
power<-power.law.fit(all.deg.testgraph)

# The output of the power.law.fit() function tells me what the exponent of the power law is ($alpha)
# and the log-likelihood of the parameters used to fit the power law distribution ($logLik)
# Also, it performs a Kolmogov-Smirnov test to test whether the given degree distribution could have
# been drawn from the fitted power law distribution.
# The function thus gives me the test statistic ($KS.stat) and p-vaule ($KS.p) for that test

# Then I can plot the degree distribution
plot(deg.distr,log="xy",
ylim=c(.01,10),
bg="black",pch=21,
xlab="Degree",
ylab="Cumulative Frequency")

# And the expected power law distribution
lines(1:20,10*(1:20)^((-power$alpha)+1))

# Graphs typically have a Poisson distribution (if they are random),
# power law (preferential attachment), or truncated power law (many real networks) degree distribution

Cumulative degree distribution with power law fitted

Cumulative degree distribution with power law fitted


# Diameter is essentially the longest path between two vertices
diameter(test.graph)
# Gives me the length of the diameter while

nodes.diameter<-get.diameter(test.graph)
# Gives me the labels for each node that participates in the diameter

# I can look at the diameter graphically also
# First I will define the node and edge attributes
V(test.graph)$color<-"skyblue"
# I want all the nodes to be skyblue
V(test.graph)$size<-7
# I want all the nodes to be size=7
V(test.graph)[nodes.diameter]$color<-"darkgreen"
V(test.graph)[nodes.diameter]$size<-10
V(test.graph)[nodes.diameter]$label.color<-"white"
# but the nodes in the diameter should be darkgreen and larger than the rest
# with a white label instead of black
# this will make the diameter pop out of the larger network
E(test.graph)$color<-"grey"
# all non-diameter edges will be grey
E(test.graph,path=nodes.diameter)$color<-"darkgreen"
E(test.graph,path=nodes.diameter)$width<-2
# Edges in the diameter will be darkgreen and a little extra wide

# If you do not set the attributes of all of the nodes and edges then it will
# default such that you only see what you have defined

# Now when I plot the diameter will be larger than everything else, and darkgreen instead
# of grey/blue
par(mar=c(.1,.1,.1,.1))
plot.igraph(test.graph,
layout=layout.fruchterman.reingold,
vertex.label.cex=.5,
edge.arrow.size=.5)

test.graph with diameter highlighted

test.graph with diameter highlighted


# Clustering coefficient is the proportion of
# a nodes neighbors that can be reached by other neighbors
# in igraph this property is apparently called "transitivity"

transitivity(test.graph)
# gives the clustering coefficient of the whole network

transitivity(test.graph,type="local")
# gives the clustering coefficient of each node

# Betweenness is the number of shortest paths between two nodes that go through each node of interest

graph.betweenness<-betweenness(test.graph,v=V(test.graph))
graph.edge.betweenness<-edge.betweenness(test.graph,e=E(test.graph))

# Closeness refers to how connected a node is to its neighbors

graph.closeness<-closeness(test.graph,vids=V(test.graph))

# Clustering coefficient, betweenness, and closeness
# all describe the small world properties of the network.
# A network with small world properties is one in which
# it takes a relatively short path to get from one node to the next
# (e.g., six degrees of separation)

# Every graph can be decomposed into its component n-node subgraphs.
# In particular there are 13 unique ways to arrange 3 nodes in directed graphs.
# Here are the adjacency matrices for each of the 13 subgraphs
s1<-matrix(c(0,1,0,0,0,1,0,0,0),nrow=3,ncol=3)
s2<-matrix(c(0,1,1,0,0,1,0,0,0),nrow=3,ncol=3)
s3<-matrix(c(0,1,0,0,0,1,1,0,0),nrow=3,ncol=3)
s4<-matrix(c(0,0,1,0,0,1,0,0,0),nrow=3,ncol=3)
s5<-matrix(c(0,1,1,0,0,0,0,0,0),nrow=3,ncol=3)
d2<-matrix(c(0,1,1,1,0,1,0,0,0),nrow=3,ncol=3)
d1<-matrix(c(0,1,1,0,0,1,0,1,0),nrow=3,ncol=3)
d3<-matrix(c(0,0,1,1,0,0,1,0,0),nrow=3,ncol=3)
d4<-matrix(c(0,0,0,1,0,1,0,1,0),nrow=3,ncol=3)
d5<-matrix(c(0,1,1,0,0,1,1,0,0),nrow=3,ncol=3)
d6<-matrix(c(0,1,1,1,0,1,1,1,0),nrow=3,ncol=3)
d7<-matrix(c(0,1,1,1,0,1,1,0,0),nrow=3,ncol=3)
d8<-matrix(c(0,1,1,1,0,0,1,0,0),nrow=3,ncol=3)

# I then make the 13 matrices into a list
subgraph3.mat<-list(s1,s2,s3,s4,s5,d1,d2,d3,d4,d5,d6,d7,d8)
# And convert the matrices into graph objects
subgraph3.graph<-lapply(subgraph3.mat,graph.adjacency)

# Here I have created a simple for loop to go through the list of subgraphs
# and count how many times that subgraph appears in the larger test.graph
subgraph.count<-c()
for(i in 1:13){
subgraph.count[i]<-
graph.count.subisomorphisms.vf2(test.graph,subgraph3.graph[[i]])
}

Subgraph1

3 node subgraphs with only single links considered

Subgraph2

3 node subgraphs with double links considered

Advertisements
This entry was posted in Rbasics and tagged , , , , , . Bookmark the permalink.

15 Responses to Network basics with R and igraph (part II of III)

  1. Tony says:

    Part 3 still to come?

  2. Luiz Felipe Freitas says:

    Thank you very much! Those two posts have been very helpful.
    Looking forward to Part 3.

    • Jon Borrelli says:

      I am glad you found them helpful! I am planning the third part for sometime next week, and it will be focused on getting information out of a food web dataset, so it will be less of a general network overview and more specific to ecology.

  3. Pingback: Blogroll: Assembling my Network | Scientific Gems

  4. swarup says:

    heii in my system degree() is not working………..
    can someone help me?

    it is giving this error :
    > in.deg.testgraph<-degree(test.graph,v=V(test.graph),mode="in")
    Error in degree(test.graph, v = V(test.graph), mode = "in") :
    unused arguments (v = V(test.graph), mode = "in")

    • Jon Borrelli says:

      I am not sure what the problem is here, it looks like the code is correct. Does degree(test.graph) work properly?

      • swarup says:

        Hi Jon,
        The problem solved. degree(test.graph) work properly. Actually there was a standard bug corresponding to degree.distribution() function in igraph. I have used $density instead of $intensities inside the function and then your code runs smoothly.
        Thanks.

  5. --Nick/ says:

    Thanks for your initiative in providing this quick-start intro to igraph, Jon.

    It’s really very useful, well organized, and well written.

  6. Raphael says:

    Hi there!!
    I’m biology undergrad and I’m coping with systems biology. Also new to R. I have a specific problem, which you are probably able to help me with.
    I’m typing:
    deg <- degree(net, mode="all")
    and I get this respond: unused argument (mode="all").

    I think everything's set considering the packages installed and loaded, as well as my network which I defined as "net".
    No clue what is going on… .HELP! :p
    Thanks in advance!

    • Jon Borrelli says:

      I wonder if you have any network analysis packages loaded other than igraph (e.g., sna)? Maybe try `igraph::degree(net, mode=”all”)` to tell R to specifically use the degree function from igraph.

  7. Anan says:

    Hi Jon,
    Thank you for the tutorial, I’m new to R and this has been very helpful.

    I was wondering, when you fit the expected power law distribution on the curve, what does the value 10 in the equation:
    lines(1:20,10*(1:20)^((-power$alpha)+1)) signify?

    Is it an arbitrary constant? I could not find any reference to a variable (except that the upper limit for the y-axis is 10!)

    Thank you for your reply,
    Anan

    • Jon Borrelli says:

      Glad to hear this was helpful. TO be perfectly honest it has been a while since I wrote this and I cannot remember why I did it this way. Looking back it does seem like it was just an arbitrary way to plot a line. I am (very slowly) working on updating and fixing this tutorial, so I will eventually come back around and figure out perhaps a better way to illustrate this.

  8. raj says:

    Hi Jon,
    I have a generic issue w.r.t igraph outputs. However, I will ask you w.r.t clustering coefficient (CC).
    igraph dumps the output of local CC values without telling which value belongs to whom ? However, I would like to get something like node_ID , Local CC value and then I can dump all this information into a csv file for all the nodes.
    P.S: Nodes Ids do not necessarily have to follow any order.
    Can you please let me know how can we get this information ?
    Thanks in advance !

    • Jon Borrelli says:

      So the transitivity function outputs the local CC in the order of the nodes, so for a given graph, e.g., erg <- erdos.renyi.game(20, .3, "gnp"), I think you can get what you need with: data.frame(CC = transitivity(erg, typ = "local"), nodeID = V(erg))

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s