Introduction
Bayesian networks are a type of mathematical model that represent dependencies and uncertainties using probability theory and graph structures. A Bayesian network is a directed acyclic graph (DAG) where nodes represent random variables and edges represent dependencies between the variables.
This document explores Bayesian networks for project risk analysis and decision making.
Project
Tasks
Suppose there is a simple roadway project. The project consists of 8 tasks, each with a specific duration and cost. The tasks are as follows:
roadway_tasks <- data.frame(
ID = c("L", "M", "N", "O", "P", "Q", "R", "S"),
Label = c(
"Task-1",
"Task-2",
"Task-3",
"Task-4",
"Task-5",
"Task-6",
"Task-7",
"Task-8"
),
Task = c(
"Survey and Site Assessment",
"Design and Planning",
"Permitting and Approvals",
"Excavation and Grading",
"Pavement Installation",
"Drainage and Utilities Installation",
"Signage and Markings",
"Final Inspection and Handover"
),
Project_ID = rep("P", 8)
)
knitr::kable(roadway_tasks, caption = "Roadway Tasks")
ID | Label | Task | Project_ID |
---|---|---|---|
L | Task-1 | Survey and Site Assessment | P |
M | Task-2 | Design and Planning | P |
N | Task-3 | Permitting and Approvals | P |
O | Task-4 | Excavation and Grading | P |
P | Task-5 | Pavement Installation | P |
Q | Task-6 | Drainage and Utilities Installation | P |
R | Task-7 | Signage and Markings | P |
S | Task-8 | Final Inspection and Handover | P |
Resources
The project requires various resources to complete the tasks. The resources include surveyors, engineers, regulatory support, heavy machinery, pavement and related machinery, drainage material and equipment, painters, traffic signs, road markers, inspectors, and quality control support. The resources are allocated to specific tasks based on their need and availability.
roadway_resources <- data.frame(
ID = c("D", "E", "F", "G", "H", "I", "J", "K"),
Label = c(
"Resource-1",
"Resource-2",
"Resource-3",
"Resource-4",
"Resource-5",
"Resource-6",
"Resource-7",
"Resource-8"
),
Resource = c(
"Surveyer",
"Engineer",
"Regulatory Support",
"Heavy Machinery",
"Pavement and Related Machinery",
"Drainage Material and Equipment",
"Painters, Traffic Signs, Road Markers",
"Inspectors and Quality Control Support"
),
Task_ID = c("L", "M", "N", "O", "P", "Q", "R", "S"),
Task = c(
"Survey and Site Assessment",
"Design and Planning",
"Permitting and Approvals",
"Excavation and Grading",
"Pavement Installation",
"Drainage and Utilities Installation",
"Signage and Markings",
"Final Inspection and Handover"
),
Mean = c(
10000,
20000,
3500,
35000,
100000,
25000,
6500,
2000
),
SD = c(
2000,
5000,
1000,
10000,
20000,
5000,
1500,
500
)
)
knitr::kable(roadway_resources, caption = "Roadway Resources")
ID | Label | Resource | Task_ID | Task | Mean | SD |
---|---|---|---|---|---|---|
D | Resource-1 | Surveyer | L | Survey and Site Assessment | 10000 | 2000 |
E | Resource-2 | Engineer | M | Design and Planning | 20000 | 5000 |
F | Resource-3 | Regulatory Support | N | Permitting and Approvals | 3500 | 1000 |
G | Resource-4 | Heavy Machinery | O | Excavation and Grading | 35000 | 10000 |
H | Resource-5 | Pavement and Related Machinery | P | Pavement Installation | 100000 | 20000 |
I | Resource-6 | Drainage Material and Equipment | Q | Drainage and Utilities Installation | 25000 | 5000 |
J | Resource-7 | Painters, Traffic Signs, Road Markers | R | Signage and Markings | 6500 | 1500 |
K | Resource-8 | Inspectors and Quality Control Support | S | Final Inspection and Handover | 2000 | 500 |
Risks
The project is subject to various risks that may impact the cost, duration, and quality of the project. The risks include delays in permitting and approvals, unforeseen site conditions, material price fluctuations, labor shortages, weather disruptions, equipment breakdowns, design changes, and regulatory changes. Each risk event has a probability of occurrence and an impact on the project.
roadway_risks <- data.frame(
Risk_ID = c("A", "B", "C"),
Name = c(
"Risk-1",
"Risk-2",
"Risk-3"
),
Risk = c(
"Delays in Permitting and Approvals",
"Unforeseen Site Conditions",
"Material Price Fluctuations"
),
Probability = c(
0.9,
0.95,
0.8
),
Resource_ID = c("F", "G", "H"),
Resource_Impacted = c(
"Regulatory Support",
"Heavy Machinery",
"Pavement and Related Machinery"
),
Mean = c(
7000,
70000,
200000
),
SD = c(
2000,
20000,
40000
)
)
knitr::kable(roadway_risks, caption = "Roadway Risks")
Risk_ID | Name | Risk | Probability | Resource_ID | Resource_Impacted | Mean | SD |
---|---|---|---|---|---|---|---|
A | Risk-1 | Delays in Permitting and Approvals | 0.90 | F | Regulatory Support | 7e+03 | 2000 |
B | Risk-2 | Unforeseen Site Conditions | 0.95 | G | Heavy Machinery | 7e+04 | 20000 |
C | Risk-3 | Material Price Fluctuations | 0.80 | H | Pavement and Related Machinery | 2e+05 | 40000 |
Bayesian Network
A Bayesian network can be used to model the relationships between tasks, resources, and risks in the project. The network can help in analyzing the impact of risks on the project outcomes and in making informed decisions.
Nodes
First, define the nodes and edges of the Bayesian network. The nodes represent the tasks, resources, and risks in the project.
nodes <- data.frame(
id = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T"),
label = c(
"Risk-1",
"Risk-2",
"Risk-3",
"Resource-1",
"Resource-2",
"Resource-3",
"Resource-4",
"Resource-5",
"Resource-6",
"Resource-7",
"Resource-8",
"Task-1",
"Task-2",
"Task-3",
"Task-4",
"Task-5",
"Task-6",
"Task-7",
"Task-8",
"Project"
),
group = c(
"Risk",
"Risk",
"Risk",
"Resource",
"Resource",
"Resource",
"Resource",
"Resource",
"Resource",
"Resource",
"Resource",
"Task",
"Task",
"Task",
"Task",
"Task",
"Task",
"Task",
"Task",
"Project"
),
stringsAsFactors = FALSE
)
Edges
Next, define the edges between the nodes in the Bayesian network. The edges represent the dependencies between the nodes.
links <- data.frame(
source = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S"
),
target = c("F", "G", "H", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "T", "T", "T", "T", "T", "T", "T"
),
value = rep(1, 19),
stringsAsFactors = FALSE
)
Then, define the distributions for the nodes in the Bayesian network. The distributions represent the probabilities of the outcomes for each node.
distributions <- list(
A = list(
type = "discrete",
values = c(1, 0),
probs = c(0.9, 0.1)
),
B = list(
type = "discrete",
values = c(1, 0),
probs = c(0.95, 0.05)
),
C = list(
type = "discrete",
values = c(1, 0),
probs = c(0.8, 0.2)
),
D = list(
type = "normal",
mean = 10000,
sd = 2000
),
E = list(
type = "normal",
mean = 20000,
sd = 5000
),
F = list(
type = "conditional", condition = "A",
true_dist = list(
type = "normal",
mean = 7000,
sd = 2000
),
false_dist = list(
type = "normal",
mean = 3500,
sd = 1000
)
),
G = list(
type = "conditional", condition = "B",
true_dist = list(
type = "normal",
mean = 70000,
sd = 20000
),
false_dist = list(
type = "normal",
mean = 35000,
sd = 10000
)
),
H = list(
type = "conditional", condition = "C",
true_dist = list(
type = "normal",
mean = 200000,
sd = 40000
),
false_dist = list(
type = "normal",
mean = 100000,
sd = 20000
)
),
I = list(
type = "normal",
mean = 100000,
sd = 20000
),
J = list(
type = "normal",
mean = 25000,
sd = 5000
),
K = list(
type = "normal",
mean = 6500,
sd = 1500
),
L = list(
type = "aggregate",
nodes = c("D")
),
M = list(
type = "aggregate",
nodes = c("E")
),
N = list(
type = "aggregate",
nodes = c("F")
),
O = list(
type = "aggregate",
nodes = c("G")
),
P = list(
type = "aggregate",
nodes = c("H")
),
Q = list(
type = "aggregate",
nodes = c("I")
),
R = list(
type = "aggregate",
nodes = c("J")
),
S = list(
type = "aggregate",
nodes = c("K")
),
T = list(
type = "aggregate",
nodes = c("L", "M", "N", "O", "P", "Q", "R", "S")
)
)
Finally, create the Bayesian network using the nodes, edges, and distributions defined above.
Graph
The Bayesian network can be visualized using the igraph and networkD3 packages. The igraph package provides functions for creating and analyzing graph structures, and the networkD3 package provides functions for creating interactive network visualizations.
library(igraph)
library(networkD3)
g <- graph_from_data_frame(graph$links, vertices = graph$nodes, directed = TRUE)
d3g <- igraph_to_networkD3(g, group = graph$nodes$group)
forceNetwork(Links = d3g$links, Nodes = d3g$nodes, NodeID = "name", Group = "group", Value = "value",
zoom = TRUE, legend = TRUE, arrows = TRUE, opacity = 0.8, fontSize = 14)
Inference
To analyze the Bayesian network, use probabilistic inference to calculate the probabilities of different outcomes. The probabilities can help in assessing the impact of risks on the project outcomes and in making informed decisions.
simulation_results <- prob_net_sim(graph, num_samples = 1000)
The simulation results can provide estimates of the total project cost, duration, and other outcomes based on the probabilities of the risk events.
hist <- hist(simulation_results$T, breaks = 50, plot = FALSE)
plot(hist, main = "Total Project Cost", xlab = "Project Cost", col = "skyblue", border = "white")
Learning
The prob_net_learn()
function can be used to update the
probabilities of the risk events based on new information or expert
judgment. The updated probabilities can help in refining the project
risk analysis and in making better decisions.
For example, if Risk 3 (material price fluctuations) did not occur, the Bayesian network can be updated with the new probability.
updated_results <- prob_net_learn(graph, observations = list(C = "No"),
num_samples = 1000)
The updated results can be compared with the original results to see how the changes in the risk probabilities affect the project outcomes.
hist <- hist(simulation_results$H, breaks = 50, plot = FALSE)
hist2 <- hist(updated_results$H, breaks = 50, plot = FALSE)
plot(hist, main = "Pavement Cost", xlab = "Resource Cost", col = "skyblue",
border = "white", ylim = c(0, max(hist$counts, hist2$counts)))
plot(hist2, col = "blue", border = "white", add = TRUE)
legend("topright", legend = c("Original", "Updated"), fill = c("skyblue", "blue"))
Updating
Similarly, the prob_net_update()
function can be used to
update the structure of the Bayesian network by adding or removing arcs
between nodes. This can help in refining the project risk analysis and
in making better decisions.
For example, if Risk 1 (delays in permitting and approvals) is no longer a concern, the arc between Risk 1 and Resource 3 (Regulatory Support) can be removed.
remove_links <- data.frame(
source = c("A"),
target = c("F"),
stringsAsFactors = FALSE
)
update_distributions <- list(
F = list(
type = "normal",
mean = 3500,
sd = 1000
)
)
updated_graph <- prob_net_update(graph, remove_links = remove_links,
update_distributions = update_distributions)
updated_results <- prob_net_sim(updated_graph, num_samples = 1000)
Just as before the updated results can be compared with the original results to see how the changes in the network structure affect the project outcomes.
hist <- hist(simulation_results$F, breaks = 50, plot = FALSE)
hist2 <- hist(updated_results$F, breaks = 50, plot = FALSE)
plot(hist, main = "Regulatory Support Cost", xlab = "Resource Cost",
col = "skyblue", border = "white", ylim = c(0, max(hist$counts, hist2$counts)))
plot(hist2, col = "blue", border = "white", add = TRUE)
legend("topright", legend = c("Original", "Updated"), fill = c("skyblue", "blue"))
Conclusion
Bayesian networks are powerful tools for project risk analysis and decision making. By modeling the dependencies and uncertainties in a project, Bayesian networks can help project managers assess the impact of risks on project outcomes and make informed decisions. The Bayesian network created in this document represents the relationships between tasks, resources, and risks in a roadway project. The network can be used to analyze the impact of risks on the project outcomes and refine the risk analysis based on new information or expert judgment.