BAMBE
Bayesian Analysis in Molecular Biology and Evolution
Version 2.03 beta, January 2001
© Copyright 2000, 2001, Donald Simon & Bret Larget,
Department of Mathematics and Computer Science, Duquesne University.
There are two main tree proposal algorithms in BAMBE,
each with clock and non-clock versions.
There are different algorithms for updating model parameters.
Complete descriptions are in the manuscript
Larget, B. and D. Simon (1999).
Markov chain Monte Carlo algorithms
for the Bayesian analysis of phylogenetic trees.
Molecular Biology and Evolution 16:750-759.
Both versions of GLOBAL manipulate the tree through direct
manipulation of the tree representation described in Larget and Simon (1999).
For any given collection of left/right orientations for subtrees at each internal node,
a graph of distance from the root in a depth first traversal of the tree
results in a graph with peaks and valleys.
With a molecular clock the peaks are all the same height above the root level.
The peaks correspond to leaves (taxa) and the valleys to internal nodes.
The internal nodes are some distance below the level of the peaks
and are ordered by the traversal ordering.
The tree is parameterized by a permutation of the taxa
as read from left to right in the representation
and the valley depths from left to right.
A tree with s leaves has 2s-1
different possible representations.
GLOBAL with a molecular clock does these steps:
- Choose a tree representation at random
by choosing a left/right orientation for the subtrees
at each internal node with equal probability.
- Perturb each valley depth independently by a small uniform amount,
reflecting excess back into range if a depth would rise above the leaf level.
- Evaluate the resultant tree and accept or reject by Metropolis-Hastings.
Every branch length is modified by this proposal,
and the tree topology may change.
When there is no molecular clock,
the peaks may all be different distances from the root in the representation.
Each valley has two depths, the depths to its left and right peaks.
The tree is now parameterized by a permutation of the taxa
and an array of 2(s-1) valley depths.
Because the likelihood models do not not distinguish between different
rootings of the same rooted tree,
we occasionally change the rooting of our rooted tree.
GLOBAL without a molecular clock is described as:
- Choose to reroot with probability p or update tree otherwise.
- If reroot tree,
- Choose a point in the unrooted tree for the new root uniformly at random.
- Otherwise
- Perturb each valley depth (two for each valley) independently by a small uniform amount,
reflecting excess back into range if a depth becomes negative.
- Evaluate the resultant tree and accept or reject by Metropolis-Hastings.
Every branch length is modified by this proposal,
and the tree topology may change.
LOCAL with a molecular clock acts on a rooted tree
in only a small neighborhood.
- Pick uniformly at random an internal branch of the tree.
- If this branch does not adjoin the root:
- Define the local neighborhood to be the two nodes of the selected branch,
u and v,
and nodes adjacent to these.
The children of u are a and b,
the children of v are u and c,
and v's parent is w.
- The heights of a, b, and c above w
are sorted and labeled
h1 < h2 < h3.
- Choose two positions above w independently,
one uniformly at random between 0 and h1
and the other uniformly at random between 0 and h2.
If the option
use-beta
is true,
these two positions are selected according to a scaled Beta distribution
whose mean is the current position.
- The position farther from w is the proposed position of u.
The position closer to w is the proposed position of v.
- If the proposed position of u is a distance less than h1
from w:
- Randomly choose one of the three children nodes to connect to v
and connect the others to u.
- Otherwise the tree topology is forced and the lowest child is connected to v.
- Else, if the randomly chosen branch does adjoin the root:
- Define the local neighborhood to be the two nodes of the selected branch,
u and v where v is the root of the tree
and nodes adjacent to these.
The children of u are a and b,
and the children of v are u and c.
- The heights of a, b, and c above v
are sorted and labeled
h1 < h2 < h3.
- The proposed distance between the root v and the nearest child
among a, b, and c
is a random multiple of h1.
- The proposed location of u is a height above the proposed
location of v chosen randomly between 0 and h2.
If
use-beta
is true,
the location is selected according to a scaled Beta distribution
with mean at the current location.
- If the proposed position of u is closer to the proposed location of v
- Randomly choose one of the three children nodes to connect to v
and connect the others to u.
- Otherwise the tree topology is forced and the lowest child is connected to v.
- Accept or reject the tree by Metropolis-Hastings.
LOCAL without a molecular clock acts on the unrooted tree.
- Pick uniformly at random an edge of the unrooted tree.
- Define the local neighborhood to be the two nodes of the selected branch,
u and v,
and nodes adjacent to these.
The neighbors of u are randomly labeled a and b
with equal probability
and independently
the neighbors of v are randomly labeled c and d
with equal probability.
- The length of the path from a to c is multiplied by a random factor.
- One node from u and v is randomly chosen
with equal probability and moved with its corresponding subtree
to a location on the path from a to c
chosen uniformly at random.
- Accept or reject the tree by Metropolis-Hastings.
In all models, the parameters theta, pi_A, pi_G, pi_C, and pi_T
may be updated within each category.
In HKY85 or F84,
there is an additional parameter kappa within each category.
In TN93, there are two additional parameters, ttp and gamma, within each category.
We restrict the kappa, ttp, and gamma values to be positive,
the weighted average of the theta values
(weighted by the number of sites in the category)
to be 1,
and the sum of the pi values within each category to be 1.
Unconstrained parameters kappa, ttp, and gamma
are updated by adding a random uniform perturbation,
reflecting the excess back into range should the proposed value become negative.
Parameters whose sum or weighted sum is fixed
are proposed from a Dirichlet distribution
with expected values at the current values.
Back to the table of contents.
This page was most recently updated on January 19, 2001.
bambe@mathcs.duq.edu