This demonstration introduces working with network data in the R programming language and RStudio, an integrated development environment (IDE) for R.
We will examine data gathered from employees at a small company that produces wines and spirits in Spain. As the company grew, it became concerned about inter-departmental relationships among employees and about how to foster a common organizational culture.
38 employees in the company were asked how energizing their interactions with each of their co-workers were. These responses can be converted into a network, in which a tie from node i to node j indicates that employee i found their interactions with employee j to be energizing.
Data taken from: Casciaro, Tiziana, Vincent Dessain, and Elena Corsi. “Moët Hennessy España.” Harvard Business School Case 408-108, February 2008.
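The conversion from survey responses to a network can be sketched in base R. The snippet below is a minimal illustration, not the procedure used to prepare the case data: the `responses` data frame, its column names, and the four-employee size are all made up for the example.

```r
# Hypothetical survey responses: each row says that `rater` found
# interactions with `coworker` energizing (made-up values, not case data)
responses <- data.frame(
  rater    = c(1, 1, 3, 4),
  coworker = c(2, 3, 2, 1)
)
n <- 4  # number of employees in this toy example

# Start from an empty n x n matrix, then set cell [i, j] to 1
# for every tie from employee i to employee j
toyMatrix <- matrix(0, nrow = n, ncol = n)
toyMatrix[cbind(responses$rater, responses$coworker)] <- 1
toyMatrix
```

Row i lists the co-workers that employee i found energizing, so the matrix does not have to be symmetric: here employee 1 names employee 2, but not the other way around.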
# Set up the igraph package for network analysis and plotting
# Check whether igraph is installed. If not, install it automatically
if (!requireNamespace("igraph", quietly = TRUE)) install.packages("igraph")
# Load the igraph package, so that we can use functions from it for network analysis
library(igraph)
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
# Pull network data from the web, in the form of an adjacency matrix of "energizing" ties
energizingMatrix <- as.matrix(read.csv("https://sites.northwestern.edu/nearfutureofwork2021/files/2021/04/Energizing.csv", header = FALSE))
# Preview the first ten rows and columns of the data
energizingMatrix[1:10,1:10]
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
## [1,] 0 0 0 0 0 0 0 0 0 0
## [2,] 0 0 0 1 0 0 0 0 0 0
## [3,] 0 0 0 1 0 0 0 0 0 0
## [4,] 0 0 0 0 0 0 0 0 0 0
## [5,] 0 0 0 0 0 0 0 0 0 0
## [6,] 0 0 0 0 0 0 0 0 0 0
## [7,] 0 0 0 0 0 0 0 0 0 0
## [8,] 0 0 0 0 0 1 0 0 0 0
## [9,] 0 0 0 0 0 0 0 0 0 0
## [10,] 0 0 0 0 0 0 0 0 0 0
# Visualize the entire adjacency matrix using the sna package
if (!requireNamespace("sna", quietly = TRUE)) install.packages("sna")
sna::plot.sociomatrix(energizingMatrix)
# Pull node attributes from the web
node_att <- read.csv("https://sites.northwestern.edu/nearfutureofwork2021/files/2021/04/Moet_NodeAttributes.csv")
# Preview the first 5 rows of the data - each node's department
node_att[1:5,]
# Create a network object in R
energizingNetwork <- graph_from_adjacency_matrix(energizingMatrix)
# Store the department of each node (vertex)
V(energizingNetwork)$department <- node_att$Department
# Find the number of nodes (vertices)
vcount(energizingNetwork)
## [1] 38
# Find the number of ties (edges)
ecount(energizingNetwork)
## [1] 86
# Compute the density of the network
edge_density(energizingNetwork) # Sanity check: does this equal numEdges / (numVertices * (numVertices - 1))?
## [1] 0.06116643
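The sanity check suggested in the comment above can be worked through by hand. This sketch simply reuses the counts printed by `vcount()` and `ecount()`; the variable names are illustrative, and it assumes self-ties are not counted as possible ties.

```r
# Counts reported above by vcount() and ecount()
numVertices <- 38
numEdges <- 86

# A directed network without self-ties has n * (n - 1) possible ties
possibleTies <- numVertices * (numVertices - 1)

# Density is the share of possible ties that are actually present
manualDensity <- numEdges / possibleTies
manualDensity
```

Here `possibleTies` is 38 * 37 = 1406, and 86 / 1406 rounds to 0.06116643, which agrees with the density reported above.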
# Reduce the plot margins for current and future plots
par(mar=c(0,0,0,0))
# Plot the network
plot(
  energizingNetwork,
  vertex.label = NA,     # NA hides the node labels
  vertex.size = 6,       # Size of the nodes (vertices)
  edge.arrow.size = 0.5  # Size of the arrows
)
par(mar=c(0,0,0,0))
# Generate colors for each department that employees are in
colbar <- rainbow(4) # Select 4 colors
# Store the colors as a node (vertex) attribute
V(energizingNetwork)$color <- colbar[as.factor(V(energizingNetwork)$department)]
# Incorporate the colors into the plot
plot(
  energizingNetwork,
  vertex.label = NA,
  vertex.size = 6,
  edge.arrow.size = 0.5,
  vertex.color = V(energizingNetwork)$color # Colors nodes based on department
)
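The color assignment above relies on a base R behavior: indexing a vector with a factor uses the factor's integer codes. A small sketch with made-up department labels (hypothetical names, not the case data) shows the mapping:

```r
# Hypothetical department labels for five employees (not the case data)
department <- c("Sales", "Finance", "Sales", "Marketing", "Finance")
toyColors <- rainbow(3)  # one color per distinct department

# as.factor() orders the distinct labels and gives each an integer code
deptFactor <- as.factor(department)
levels(deptFactor)      # the distinct departments, in sorted order
as.integer(deptFactor)  # the code assigned to each employee

# Indexing by the factor picks the color that matches each employee's code
nodeColors <- toyColors[deptFactor]
nodeColors
```

Employees in the same department receive the same color, and which color a department gets depends only on where its label falls in the sorted levels.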
par(mar=c(0,0,0,0))
# Change the size of nodes based on their indegree
V(energizingNetwork)$indegree <- degree(energizingNetwork, mode = "in") # Compute indegree
# Plot, sizing nodes by indegree
plot(
  energizingNetwork,
  vertex.label = NA,
  vertex.size = V(energizingNetwork)$indegree + 5,
  edge.arrow.size = 0.5,
  vertex.color = V(energizingNetwork)$color
)
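Indegree can also be read directly off an adjacency matrix. Because a 1 in cell [i, j] is a tie from node i to node j, the ties a node receives are in its column. The sketch below uses a made-up three-node matrix, not the case data.

```r
# Toy adjacency matrix: a 1 in cell [i, j] is a tie from node i to node j
toyAdjacency <- matrix(c(0, 1, 1,
                         0, 0, 1,
                         0, 0, 0), nrow = 3, byrow = TRUE)

# Column sums count ties received (indegree);
# row sums count ties sent (outdegree)
toyIndegree <- colSums(toyAdjacency)
toyOutdegree <- rowSums(toyAdjacency)
toyIndegree
toyOutdegree
```

In this network, a node with a high indegree is one that many co-workers named as energizing, which is why indegree is used to size the nodes.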
Fruchterman-Reingold algorithm: Fruchterman, T.M.J. and Reingold, E.M. (1991). Graph Drawing by Force-directed Placement. Software - Practice and Experience, 21(11):1129-1164.
Davidson and Harel algorithm: Davidson, R. and Harel, D. (1996). Drawing Graphs Nicely Using Simulated Annealing. ACM Transactions on Graphics, 15(4):301-331.
Kamada-Kawai (spring) algorithm: Kamada, T. and Kawai, S. (1989). An Algorithm for Drawing General Undirected Graphs. Information Processing Letters, 31(1):7-15.
# Different algorithms can generate different layouts for the network
# There is no single best layout algorithm for networks; choosing one is more art than science
par(mar=c(0,0,0,0))
# Fruchterman-Reingold layout algorithm:
plot(
  energizingNetwork,
  vertex.label = NA,
  vertex.size = V(energizingNetwork)$indegree + 5,
  edge.arrow.size = 0.5,
  vertex.color = V(energizingNetwork)$color,
  layout = layout_with_fr(energizingNetwork)
)
# Davidson and Harel layout algorithm:
plot(
  energizingNetwork,
  vertex.label = NA,
  vertex.size = V(energizingNetwork)$indegree + 5,
  edge.arrow.size = 0.5,
  vertex.color = V(energizingNetwork)$color,
  layout = layout_with_dh(energizingNetwork)
)
# Kamada-Kawai (spring) layout algorithm:
plot(
  energizingNetwork,
  vertex.label = NA,
  vertex.size = V(energizingNetwork)$indegree + 5,
  edge.arrow.size = 0.5,
  vertex.color = V(energizingNetwork)$color,
  layout = layout_with_kk(energizingNetwork)
)
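Some of these layouts (Fruchterman-Reingold, for instance) start from random node positions, so the same call can draw the network differently each time. The sketch below shows one way to make a layout repeatable. It assumes igraph's layout functions draw on R's random number generator, so that `set.seed()` fixes the result, and it uses a small ring network purely as a stand-in for the case data.

```r
library(igraph)

# Small stand-in network (a 10-node ring), used only for illustration
toyNetwork <- make_ring(10)

# Fixing the seed before each call should give the same coordinates
set.seed(42)
layoutA <- layout_with_fr(toyNetwork)
set.seed(42)
layoutB <- layout_with_fr(toyNetwork)

# Compare the two coordinate matrices (one row per node, columns x and y)
all.equal(layoutA, layoutB)
```

Another option is to compute the coordinates once, store the matrix, and pass it to `layout =` in every `plot()` call, which keeps node positions fixed while other plot settings change.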