The Bioinformateachers

Learn data, learn life.


Bayesian Networks: a super-elementary overview

Brief theoretical background

Bayesian Networks are probabilistic graphical models and are described by:

  1. a DAG (directed acyclic graph) \(G = (V,E)\), with nodes \(v_{i} \in V\) (V for “vertex”) that represent random variables \(X_{i}\), and connections \(e_{ij} = (v_i, v_j)\) between nodes (E from “edges”)
  2. a joint probability distribution of the \(X\) variables that factorizes into conditional probabilities along the graph: \(P(X_1 = x_1, \ldots, X_n = x_n) = \prod_{i=1}^{n} p(x_{i} \mid x_{\mathrm{parents}(i)})\)

A DAG contains nodes (vertices) and directed edges, with the constraint that there are no directed cycles: following the arrows, you can never return to the node you started from.

The objective of a Bayesian Network is to compute conditional probabilities between variables: inferences, and predictions, can be made about any variable in the graph, with no distinction between dependent (target) and explanatory (predictor) variables.

Unlike many other statistical learning methods, Bayesian Networks can handle missing data naturally, by marginalizing over the unobserved variables. Additionally, they should scale well and allow for complex inferential relationships. When the graph is given a causal interpretation, Bayesian Networks make causal paths explicit and highlight the presence of mediator variables, colliders, confounders etc.

Bayesian Networks can be thought of as a generalization of Naive Bayes: we drop the naive assumption that the predictor variables (\(x_i\)) are independent given the class, and replace the joint probability, a simple product of all predictor densities, with a product of conditional probabilities that follows the structure of the graph.
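To see the connection, note that Naive Bayes is itself a Bayesian Network: a star-shaped DAG in which the class node is the only parent of every feature, so the BN factorization collapses to \(P(y)\,\prod_i p(x_i \mid y)\). A minimal sketch, with invented class names and probabilities:

```python
# Naive Bayes as a Bayesian Network: the class is the sole parent of each
# feature, so the factorization is P(y) * prod_i P(x_i | y).
# All names and numbers below are made up for illustration.
p_class = {"spam": 0.3, "ham": 0.7}
p_word = {                                   # P(word present | class)
    "offer":   {"spam": 0.6,  "ham": 0.1},
    "meeting": {"spam": 0.05, "ham": 0.4},
}

def nb_joint(y, x):
    """BN factorization for the star graph: P(y) * prod P(x_i | y)."""
    p = p_class[y]
    for word, present in x.items():
        pw = p_word[word][y]
        p *= pw if present else 1 - pw
    return p

def posterior(x):
    """P(class | features): normalize the joint over the classes."""
    scores = {y: nb_joint(y, x) for y in p_class}
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}
```

Adding an edge between two feature nodes would turn one of the factors \(p(x_i \mid y)\) into \(p(x_i \mid y, x_j)\): that is exactly the move from Naive Bayes to a general Bayesian Network.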

Applications
