For some research questions, there might be expectations in regards to the directions of the edges. For example, in symptom networks, all relations are often hypothesized to be positive (i.e., positive manifold). In turn, any negative relations are thought to be spurious.
In GGMncv, it is possible to estimate the conditional dependence structure, given that all edges in the graph are positive (sign restriction).
library(GGMncv)
library(corrplot)
The ptsd
dataset includes 20 post-traumatic stress symptoms. The following visualizes the correlation matrix:
::corrplot(cor(ptsd), method = "shade") corrplot
Notice that all of the correlations are positive.
Here are the partial correlations:
<- -cov2cor(solve(cor(ptsd))) + diag(ncol(ptsd))
pcors
::corrplot(pcors,
corrplotmethod = "shade")
Notice that some relations went to essentially zero (white), whereas other changed direction altogether.
Here the conditional dependence structure is selected via ggmncv
:
# fit model
<- GGMncv::ggmncv(cor(ptsd),
fit n = nrow(ptsd),
progress = FALSE,
penalty = "atan")
# plot graph
plot(GGMncv::get_graph(fit),
edge_magnify = 10,
node_names = colnames(ptsd))
Notice a few negatives are included in the graph.
Here the graph is re-estimated, with the constraint that all of negative edges in the above plot are actually zero.
# set negatives to zero (sign restriction)
<- ifelse(fit$P <= 0, 0, 1)
adj_new
<- TRUE
check_zeros
# track trys
<- 0
iter
# iterate until all positive
while(check_zeros){
<- iter + 1
iter <- GGMncv::constrained(cor(ptsd), adj = adj_new)
fit_new <- any(fit_new$wadj < 0)
check_zeros <- ifelse(fit_new$wadj <= 0, 0, 1)
adj_new
}
# make graph object
<- list(P = fit_new$wadj,
new_graph adj = adj_new)
class(new_graph) <- "graph"
# plot graph
plot(new_graph,
edge_magnify = 10,
node_names = colnames(ptsd))
The graph now only includes positive edges. Note this is not the same as simply removing the negative relations, as, in this case, this is the maximum likelihood estimate for the inverse covariance matrix.
Note also new_graph
is making the graph class so that it can be plotted with plot
.
The above essentially takes the selected graph, and then re-estimates it with the constraint that the negative edges are zero. Perhaps a more sophisticated approach is to select the graph with those constraints.
This can be implemented with:
<- cor(ptsd)
R <- nrow(ptsd)
n <- ncol(ptsd)
p
# store fitted models
<- ggmncv(R = R,
fit n = n,
progress = FALSE,
store = TRUE,
n_lambda = 50)
# all fitted models
# sol: solution
<- fit$fitted_models
sol_path
# storage
<- NA
bics <- list()
Thetas
for(i in seq_along(sol_path)){
# positive in wi is a negative partial
<- ifelse(sol_path[[i]]$wi >= 0, 0, 1)
adj_new
<- TRUE
check_zeros
# track trys
<- 0
iter
# iterate until all positive
while(check_zeros){
<- iter + 1
iter <- GGMncv::constrained(R, adj = adj_new)
fit_new <- any(fit_new$wadj < 0)
check_zeros <- ifelse(fit_new$wadj <= 0, 0, 1)
adj_new
}
<- GGMncv:::gic_helper(
bics[i] Theta = fit_new$Theta,
R = R,
n = n,
p = p,
type = "bic",
edges = sum(fit_new$Theta[upper.tri(fit_new$Theta)] != 0)
)
<- fit_new$Theta
Thetas[[i]]
}
# select via minimizing bic
# (then convert to partial correlatons)
<- -(cov2cor(Thetas[[which.min(bics)]]) - diag(p))
pcors
# make graph class
<- list(P = pcors,
new_graph adj = ifelse(pcors == 0, 0, 1))
class(new_graph) <- "graph"
# plot graph
plot(new_graph,
edge_magnify = 10,
node_names = colnames(ptsd))