Introduction

Bayesian networks are mathematical models that represent dependencies and uncertainties using probability theory and graph structures. A Bayesian network is a directed acyclic graph (DAG) in which nodes represent random variables and edges represent direct dependencies between those variables.
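
The DAG structure is what keeps the model manageable. By the standard factorization property of Bayesian networks, the joint distribution splits into one local term per node, each conditioned only on that node's parents. In the expression below, X_1, ..., X_n stand for the node variables and pa(X_i) for the parents of X_i in the graph; these symbols are introduced here for illustration only.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
```

For a two-node graph with a single edge from A to F, this reduces to P(A, F) = P(A) P(F | A).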

This document explores Bayesian networks for project risk analysis and decision making.

Project Risk Analysis

Consider a simple roadway project consisting of eight tasks. Each task has a single-letter ID, which is reused later as its node name in the Bayesian network:

roadway_tasks <- data.frame(
  ID = c("L", "M", "N", "O", "P", "Q", "R", "S"),
  Label = c(
    "Task-1",
    "Task-2",
    "Task-3",
    "Task-4",
    "Task-5", 
    "Task-6",
    "Task-7",
    "Task-8"
  ),
  Task = c(
    "Survey and Site Assessment",
    "Design and Planning",
    "Permitting and Approvals",
    "Excavation and Grading",
    "Pavement Installation",
    "Drainage and Utilities Installation",
    "Signage and Markings",
    "Final Inspection and Handover"
  ), 
  Project_ID = rep("P", 8)
)

knitr::kable(roadway_tasks, caption = "Roadway Tasks")
Roadway Tasks

| ID | Label  | Task                                | Project_ID |
|----|--------|-------------------------------------|------------|
| L  | Task-1 | Survey and Site Assessment          | P          |
| M  | Task-2 | Design and Planning                 | P          |
| N  | Task-3 | Permitting and Approvals            | P          |
| O  | Task-4 | Excavation and Grading              | P          |
| P  | Task-5 | Pavement Installation               | P          |
| Q  | Task-6 | Drainage and Utilities Installation | P          |
| R  | Task-7 | Signage and Markings                | P          |
| S  | Task-8 | Final Inspection and Handover       | P          |

Resources

The project requires various resources to complete the tasks: surveyors, engineers, regulatory support, heavy machinery, pavement and related machinery, drainage material and equipment, painters, traffic signs, road markers, inspectors, and quality control support. Resources are allocated to specific tasks based on their expertise and availability. In the table below, each resource serves one task, and the Mean and SD columns give the mean and standard deviation of the resource's cost.

roadway_resources <- data.frame(
  ID = c("D", "E", "F", "G", "H", "I", "J", "K"),
  Label = c(
    "Resource-1",
    "Resource-2",
    "Resource-3",
    "Resource-4",
    "Resource-5",
    "Resource-6",
    "Resource-7",
    "Resource-8"
  ),
  Resource = c(
    "Surveyor",
    "Engineer",
    "Regulatory Support",
    "Heavy Machinery",
    "Pavement and Related Machinery",
    "Drainage Material and Equipment",
    "Painters, Traffic Signs, Road Markers",
    "Inspectors and Quality Control Support"
  ),
  Task_ID = c("L", "M", "N", "O", "P", "Q", "R", "S"),
  Task = c(
    "Survey and Site Assessment",
    "Design and Planning",
    "Permitting and Approvals",
    "Excavation and Grading",
    "Pavement Installation",
    "Drainage and Utilities Installation",
    "Signage and Markings",
    "Final Inspection and Handover"
  ),
  Mean = c(
    10000,
    20000,
    3500,
    35000,
    100000,
    25000,
    6500,
    2000
  ), 
  SD = c(
    2000,
    5000,
    1000,
    10000,
    20000,
    5000,
    1500,
    500
  )
)

knitr::kable(roadway_resources, caption = "Roadway Resources")
Roadway Resources

| ID | Label      | Resource                               | Task_ID | Task                                | Mean   | SD    |
|----|------------|----------------------------------------|---------|-------------------------------------|--------|-------|
| D  | Resource-1 | Surveyor                               | L       | Survey and Site Assessment          | 10000  | 2000  |
| E  | Resource-2 | Engineer                               | M       | Design and Planning                 | 20000  | 5000  |
| F  | Resource-3 | Regulatory Support                     | N       | Permitting and Approvals            | 3500   | 1000  |
| G  | Resource-4 | Heavy Machinery                        | O       | Excavation and Grading              | 35000  | 10000 |
| H  | Resource-5 | Pavement and Related Machinery         | P       | Pavement Installation               | 100000 | 20000 |
| I  | Resource-6 | Drainage Material and Equipment        | Q       | Drainage and Utilities Installation | 25000  | 5000  |
| J  | Resource-7 | Painters, Traffic Signs, Road Markers  | R       | Signage and Markings                | 6500   | 1500  |
| K  | Resource-8 | Inspectors and Quality Control Support | S       | Final Inspection and Handover       | 2000   | 500   |

Risks

The project is subject to various risks that may affect its cost, duration, and quality, such as delays in permitting and approvals, unforeseen site conditions, material price fluctuations, labor shortages, weather disruptions, equipment breakdowns, design changes, and regulatory changes. This example models the first three. Each risk event has a probability of occurrence and affects one resource; the Mean and SD columns give the mean and standard deviation of that resource's cost when the risk occurs.

roadway_risks <- data.frame(
  Risk_ID = c("A", "B", "C"),
  Name = c(
    "Risk-1",
    "Risk-2",
    "Risk-3"
  ),
  Risk = c(
    "Delays in Permitting and Approvals",
    "Unforeseen Site Conditions",
    "Material Price Fluctuations"
  ),
  Probability = c(
    0.9,
    0.95,
    0.8
  ),
  Resource_ID = c("F", "G", "H"),
  Resource_Impacted = c(
    "Regulatory Support",
    "Heavy Machinery",
    "Pavement and Related Machinery"
  ),
  Mean = c(
    7000,
    70000,
    200000
  ),
  SD = c(
    2000,
    20000,
    40000
  )
)

knitr::kable(roadway_risks, caption = "Roadway Risks")
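Roadway Risks

| Risk_ID | Name   | Risk                               | Probability | Resource_ID | Resource_Impacted              | Mean  | SD    |
|---------|--------|------------------------------------|-------------|-------------|--------------------------------|-------|-------|
| A       | Risk-1 | Delays in Permitting and Approvals | 0.90        | F           | Regulatory Support             | 7e+03 | 2000  |
| B       | Risk-2 | Unforeseen Site Conditions         | 0.95        | G           | Heavy Machinery                | 7e+04 | 20000 |
| C       | Risk-3 | Material Price Fluctuations        | 0.80        | H           | Pavement and Related Machinery | 2e+05 | 40000 |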

Bayesian Network

To model the project risks and dependencies, we can create a Bayesian network. The Bayesian network will represent the relationships between tasks, resources, and risks in the project. The network will help us analyze the impact of risks on the project outcomes and make informed decisions.

First, we define the nodes of the Bayesian network. There is one node for each risk (A to C), each resource (D to K), and each task (L to S), plus a node T for the project as a whole.

nodes <- data.frame(
  id = c(
    "A", "B", "C", "D", "E", "F", "G", "H", "I", "J",
    "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T"
  ),
  label = c(
    "Risk-1",
    "Risk-2",
    "Risk-3",
    "Resource-1",
    "Resource-2",
    "Resource-3",
    "Resource-4",
    "Resource-5",
    "Resource-6",
    "Resource-7",
    "Resource-8",
    "Task-1",
    "Task-2",
    "Task-3",
    "Task-4",
    "Task-5", 
    "Task-6",
    "Task-7",
    "Task-8",
    "Project"
  ),
  stringsAsFactors = FALSE
)

Next, we define the edges, which represent the dependencies between the nodes. Each risk points to the resource it affects, each resource points to the task it serves, and every task points to the project node T.

links <- data.frame(
  source = c(
    "A", "B", "C",                           # risks
    "D", "E", "F", "G", "H", "I", "J", "K",  # resources
    "L", "M", "N", "O", "P", "Q", "R", "S"   # tasks
  ),
  target = c(
    "F", "G", "H",                           # resource affected by each risk
    "L", "M", "N", "O", "P", "Q", "R", "S",  # task served by each resource
    "T", "T", "T", "T", "T", "T", "T", "T"   # every task feeds the project node
  ),
  stringsAsFactors = FALSE
)
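
Because a Bayesian network must be acyclic, it can be worth checking the edge list before building the network. The following is a minimal base R sketch using Kahn's algorithm for topological sorting. The helper `topo_order` is a hypothetical name introduced here for illustration and is not part of the PRA package.

```r
# Edge list copied from the links data frame above
links <- data.frame(
  source = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K",
             "L", "M", "N", "O", "P", "Q", "R", "S"),
  target = c("F", "G", "H", "L", "M", "N", "O", "P", "Q", "R", "S",
             rep("T", 8)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: Kahn's algorithm for a topological ordering.
# Stops with an error if the edge list contains a cycle.
topo_order <- function(links) {
  nodes <- unique(c(links$source, links$target))
  # Number of incoming edges per node
  indegree <- setNames(integer(length(nodes)), nodes)
  incoming <- table(links$target)
  indegree[names(incoming)] <- as.integer(incoming)

  ordered <- character(0)
  queue <- names(indegree)[indegree == 0]
  while (length(queue) > 0) {
    node <- queue[1]
    queue <- queue[-1]
    ordered <- c(ordered, node)
    # "Remove" the node by decrementing the in-degree of each child
    for (child in links$target[links$source == node]) {
      indegree[child] <- indegree[child] - 1L
      if (indegree[child] == 0) queue <- c(queue, child)
    }
  }
  if (length(ordered) < length(nodes)) {
    stop("the edge list contains a cycle, so it is not a DAG")
  }
  ordered
}

node_order <- topo_order(links)
node_order
```

A topological ordering lists every node after all of its parents, so risks come before the resources they affect and the project node T comes last. This is also the order in which nodes can be sampled one at a time.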

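Then, we define a distribution for each node. Risk nodes are discrete, taking the value 1 (the risk occurs) or 0 (it does not) with the probabilities from the risks table. Resource nodes have normally distributed costs; a resource affected by a risk is conditional on that risk, drawing from true_dist when the risk occurs and from false_dist otherwise. Task nodes and the project node aggregate the nodes listed in their nodes field.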

distributions <- list(
  A = list(
    type = "discrete",
    values = c(1, 0),
    probs = c(0.9, 0.1)
  ),
  B = list(
    type = "discrete",
    values = c(1, 0),
    probs = c(0.95, 0.05)
  ),
  C = list(
    type = "discrete",
    values = c(1, 0),
    probs = c(0.8, 0.2)
  ),
  D = list(
    type = "normal",
    mean = 10000,
    sd = 2000
  ),
  E = list(
    type = "normal",
    mean = 20000,
    sd = 5000
  ),
  F = list(
    type = "conditional", condition = "A",
    true_dist = list(
      type = "normal",
      mean = 7000,
      sd = 2000
    ),
    false_dist = list(
      type = "normal",
      mean = 3500,
      sd = 1000
    )
  ),
  G = list(
    type = "conditional", condition = "B",
    true_dist = list(
      type = "normal",
      mean = 70000,
      sd = 20000
    ),
    false_dist = list(
      type = "normal",
      mean = 35000,
      sd = 10000
    )
  ),
  H = list(
    type = "conditional", condition = "C",
    true_dist = list(
      type = "normal",
      mean = 200000,
      sd = 40000
    ),
    false_dist = list(
      type = "normal",
      mean = 100000,
      sd = 20000
    )
  ),
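  # I, J, and K use the Mean and SD of Resource-6, Resource-7, and Resource-8
  # from the Roadway Resources table
  I = list(
    type = "normal",
    mean = 25000,
    sd = 5000
  ),
  J = list(
    type = "normal",
    mean = 6500,
    sd = 1500
  ),
  K = list(
    type = "normal",
    mean = 2000,
    sd = 500
  ),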
  L = list(
    type = "aggregate",
    nodes = c("D")
  ),
  M = list(
    type = "aggregate",
    nodes = c("E")
  ),
  N = list(
    type = "aggregate",
    nodes = c("F")
  ),
  O = list(
    type = "aggregate",
    nodes = c("G")
  ),
  P = list(
    type = "aggregate",
    nodes = c("H")
  ),
  Q = list(
    type = "aggregate",
    nodes = c("I")
  ),
  R = list(
    type = "aggregate",
    nodes = c("J")
  ),
  S = list(
    type = "aggregate",
    nodes = c("K")
  ),
  T = list(
    type = "aggregate",
    nodes = c("L", "M", "N", "O", "P", "Q", "R", "S")
  )
)
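
To make the conditional entries concrete, here is a minimal base R sketch of what sampling node F amounts to: first draw the risk A, then draw the cost from true_dist or false_dist depending on the outcome. It uses only the parameters listed for A and F above; the seed and sample size are arbitrary choices, and the sketch is not meant to describe how the PRA package samples internally.

```r
set.seed(123)  # arbitrary seed, for reproducibility only
n <- 10000

# Risk A: 1 = the risk occurs (probability 0.9), 0 = it does not
a <- sample(c(1, 0), size = n, replace = TRUE, prob = c(0.9, 0.1))

# Resource F: draw from true_dist when A == 1, otherwise from false_dist
f <- ifelse(a == 1,
            rnorm(n, mean = 7000, sd = 2000),
            rnorm(n, mean = 3500, sd = 1000))

# The law of total expectation gives the mean the sample should approach
expected_f <- 0.9 * 7000 + 0.1 * 3500

c(simulated = mean(f), expected = expected_f)
```

The resulting distribution of F is a mixture of two normal distributions, which is why the histograms of risk-affected resources can look skewed or bimodal rather than bell-shaped.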

Finally, we create the Bayesian network using the nodes, edges, and distributions defined above.

library(PRA)
graph <- prob_net(nodes, links, distributions = distributions)

To plot the Bayesian network, we can use the igraph package. The igraph package provides functions for creating and analyzing graph structures.

library(igraph)
g <- graph_from_data_frame(graph$links, vertices = graph$nodes, directed = TRUE)
plot(g, main = "Bayesian Network", vertex.label = graph$nodes$label,
     vertex.size = 30, vertex.color = "lightblue", edge.arrow.size = 0.5,
     edge.color = "gray", layout = layout_with_fr)

Inference

To analyze the Bayesian network, we draw samples from it. The call below generates 1,000 samples, which we can use to estimate the probabilities of different outcomes, assess the impact of risks on the project, and make informed decisions.

simulation_results <- prob_net_sim(graph, num_samples = 1000)

We can use these results to estimate the total project cost and assess the impact of risks on the project outcomes.

# Store the histogram under a name that does not mask base::hist()
cost_hist <- hist(simulation_results$T, breaks = 50, plot = FALSE)
plot(cost_hist, main = "Total Project Cost", xlab = "Project Cost", col = "skyblue", border = "white")
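
As a sanity check on where the histogram should be centered, the expected total cost can be worked out by hand from the Roadway Resources and Roadway Risks tables. The sketch below assumes that the project node simply adds up the eight resource costs, which is an assumption about what "aggregate" means here. The variable names are illustrative.

```r
# Baseline mean cost per resource (Mean column of the Roadway Resources table)
base_mean <- c(D = 10000, E = 20000, F = 3500, G = 35000,
               H = 100000, I = 25000, J = 6500, K = 2000)

# Risk probability and mean cost of the affected resource if the risk occurs
# (Probability and Mean columns of the Roadway Risks table)
risks <- data.frame(
  resource = c("F", "G", "H"),
  prob = c(0.9, 0.95, 0.8),
  mean_if_risk = c(7000, 70000, 200000),
  stringsAsFactors = FALSE
)

# Law of total expectation for each risk-affected resource
expected_mean <- base_mean
expected_mean[risks$resource] <-
  risks$prob * risks$mean_if_risk + (1 - risks$prob) * base_mean[risks$resource]

# Assumption: the project node adds up the eight resource costs
expected_total <- sum(expected_mean)
expected_total
```

With the table values this comes to about 318,400, so the bulk of the simulated totals should sit near that figure. A large discrepancy would suggest that a node's distribution does not match the tables.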

Learning

We can also update the probabilities of the risk events based on new information or expert judgment. The updated probabilities can help us refine the project risk analysis and make better decisions.

For example, if we learn that Risk 3 (material price fluctuations) did not occur, we can pass that observation to the network and draw a new set of samples.

# C = 0 encodes "Risk-3 did not occur", matching values = c(1, 0) in the distribution for C
updated_results <- prob_net_learn(graph, observations = list(C = 0),
                                  num_samples = 1000)
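
The effect of such an observation can be illustrated in base R by sampling the risk and the resource together and then keeping only the samples that agree with the evidence. This is a sketch of conditioning by filtering, using the parameters given for C and H above. It is not a description of how prob_net_learn works internally, and the seed and sample size are arbitrary.

```r
set.seed(2024)  # arbitrary seed, for reproducibility only
n <- 20000

# Risk C: 1 = the risk occurs (probability 0.8), 0 = it does not
c_risk <- sample(c(1, 0), size = n, replace = TRUE, prob = c(0.8, 0.2))

# Resource H depends on C: true_dist if the risk occurs, false_dist otherwise
h <- ifelse(c_risk == 1,
            rnorm(n, mean = 200000, sd = 40000),
            rnorm(n, mean = 100000, sd = 20000))

# Condition on the observation C = 0 by keeping only the matching samples
h_given_no_risk <- h[c_risk == 0]

c(mean_before = mean(h), mean_after = mean(h_given_no_risk))
```

Once the risk is known not to have occurred, only the false_dist component remains, so the mean pavement cost drops from roughly 180,000 to roughly 100,000.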

We can compare the updated results with the original results to see how the changes in the risk probabilities affect the project outcomes.

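hist_original <- hist(simulation_results$H, breaks = 50, plot = FALSE)
hist_updated <- hist(updated_results$H, breaks = 50, plot = FALSE)
# Use shared axis limits so that neither histogram is clipped
plot(hist_original, main = "Pavement Resource Cost", xlab = "Resource Cost",
     col = "skyblue", border = "white",
     xlim = range(hist_original$breaks, hist_updated$breaks),
     ylim = c(0, max(hist_original$counts, hist_updated$counts)))
plot(hist_updated, col = "blue", border = "white", add = TRUE)
legend("topright", legend = c("Original", "Updated"), fill = c("skyblue", "blue"))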

Updating

We can also update the structure of the Bayesian network by adding or removing arcs between nodes. This can help us refine the project risk analysis and make better decisions.

For example, if we learn that Risk 1 (delays in permitting and approvals) is no longer a concern, we can remove the arc between Risk 1 and Resource 3 (Regulatory Support).

remove_links <- data.frame(
  source = c("A"),
  target = c("F"),
  stringsAsFactors = FALSE
)
update_distributions <- list(
  F = list(
    type = "normal",
    mean = 3500,
    sd = 1000
  )
)
updated_graph <- prob_net_update(graph, remove_links = remove_links,
                                 update_distributions = update_distributions)
updated_results <- prob_net_sim(updated_graph, num_samples = 1000)
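
Structurally, removing an arc amounts to dropping one row from the edge list. The base R sketch below shows that step on a three-edge excerpt of the network. The helper `edge_key` and the variable `pruned_links` are hypothetical names used for illustration, and the sketch makes no claim about how prob_net_update is implemented.

```r
# Excerpt of the edge list: risk A -> resource F -> task N -> project T
links <- data.frame(
  source = c("A", "F", "N"),
  target = c("F", "N", "T"),
  stringsAsFactors = FALSE
)
remove_links <- data.frame(source = "A", target = "F", stringsAsFactors = FALSE)

# Hypothetical helper: one "source->target" key per edge
edge_key <- function(df) paste(df$source, df$target, sep = "->")

# Keep only the edges whose key is not listed for removal
pruned_links <- links[!edge_key(links) %in% edge_key(remove_links), , drop = FALSE]
pruned_links
```

After the arc is removed, F no longer has a parent, so a conditional distribution on A would have nothing to condition on. That is why the update above also replaces F's distribution with a plain normal distribution.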

We can compare the updated results with the original results to see how the changes in the network structure affect the project outcomes.

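hist_original <- hist(simulation_results$F, breaks = 50, plot = FALSE)
hist_updated <- hist(updated_results$F, breaks = 50, plot = FALSE)
# Use shared axis limits so that neither histogram is clipped
plot(hist_original, main = "Regulatory Support Resource Cost", xlab = "Resource Cost",
     col = "skyblue", border = "white",
     xlim = range(hist_original$breaks, hist_updated$breaks),
     ylim = c(0, max(hist_original$counts, hist_updated$counts)))
plot(hist_updated, col = "blue", border = "white", add = TRUE)
legend("topright", legend = c("Original", "Updated"), fill = c("skyblue", "blue"))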

Conclusion

Bayesian networks are powerful tools for project risk analysis and decision making. By modeling the dependencies and uncertainties in a project, Bayesian networks can help project managers assess the impact of risks on project outcomes and make informed decisions. The Bayesian network created in this document represents the relationships between tasks, resources, and risks in a roadway project. The network can be used to analyze the impact of risks on the project outcomes and refine the risk analysis based on new information or expert judgment.