This command sets the priors for the phylogenetic model. Remember that in a Bayesian analysis, you must specify a prior probability distribution for the parameters of the likelihood model. The prior distribution represents your prior beliefs about the parameter before observation of the data. This command allows you to tailor your prior assumptions to a large extent.
Options:
Applyto -- This option allows you to apply the prset commands to specific partitions. This option should be the first in the list of options specified in prset. Moreover, it only makes sense to use this option if the data have been partitioned. A default partition is set on execution of a matrix. If the data are homogeneous (i.e., all of the same data type), then this partition will not subdivide the characters. Up to 30 other partitions can be defined, and you can switch among them using "set partition=<partition name>". Now, you may want to specify different priors for different partitions of the data. Applyto allows you to do this. For example, say you have partitioned the data by codon position, and you want to fix the state frequencies to equal for the first two partitions but apply a flat Dirichlet prior to the state frequencies of the last. This could be implemented in two uses of prset:
prset applyto=(1,2) statefreqpr=fixed(equal)
prset applyto=(3) statefreqpr=dirichlet(1,1,1,1)
The first applies the parameters after "applyto" to the first and second partitions. The second prset applies a flat Dirichlet to the third partition. You can also use applyto=(all), which attempts to apply the parameter settings to all of the data partitions. Importantly, if the option is not consistent with the data in the partition, the program will not apply the prset option to that partition.
Tratiopr -- This parameter sets the prior for the transition/transversion rate ratio (tratio). The options are:
prset tratiopr = beta(<number>, <number>)
prset tratiopr = fixed(<number>)
The program assumes that the transition and transversion rates are independent gamma-distributed random variables with the same scale parameter when beta is selected. If you want a diffuse prior that puts equal emphasis on transition/transversion rate ratios above 1.0 and below 1.0, then use a flat Beta, beta(1,1), which is the default. If you wish to concentrate this distribution more in the equal-rates region, then use a prior of the type beta(x,x), where the magnitude of x determines how much the prior is concentrated in the equal rates region. For instance, a beta(20,20) puts more probability on rate ratios close to 1.0 than a beta(1,1). If you think it is likely that the transition/transversion rate ratio is 2.0, you can use a prior of the type beta(2x,x), where x determines how strongly the prior is concentrated on tratio values near 2.0. For instance, a beta(2,1) is much more diffuse than a beta(80,40) but both have the expected tratio 2.0 in the absence of data. The parameters of the Beta can be interpreted as counts: if you have observed x transitions and y transversions, then a beta(x+1,y+1) is a good representation of this information. The fixed option allows you to fix the tratio to a particular value.
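The effect of the Beta parameters can be checked with a little arithmetic. The sketch below is an illustration only, not MrBayes code. It assumes that the Beta distribution describes the proportion p of the total rate that is due to transitions (which is what two independent gamma rates with a common scale imply), so that the rate ratio is p/(1-p). The function names are hypothetical.

```python
def beta_summary(a, b):
    """Mean and variance of p ~ Beta(a, b), the assumed transition proportion."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def ratio_at_mean(a, b):
    """Rate ratio p / (1 - p) evaluated at the prior mean of p (equals a / b)."""
    mean, _ = beta_summary(a, b)
    return mean / (1 - mean)


# Compare the priors mentioned in the text above.
for a, b in [(1, 1), (20, 20), (2, 1), (80, 40)]:
    mean, var = beta_summary(a, b)
    print(f"beta({a},{b}): mean p = {mean:.4f}, var p = {var:.5f}, "
          f"ratio at mean = {ratio_at_mean(a, b):.2f}")
```

Under this assumption, beta(2,1) and beta(80,40) are centered on the same ratio, but the variance of p is far smaller for beta(80,40), which is the sense in which it is the more concentrated prior.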
Revmatpr -- This parameter sets the prior for the substitution rates of the GTR model for nucleotide data. The options are:
prset revmatpr = dirichlet(<number>,<number>,...,<number>)
prset revmatpr = fixed(<number>,<number>,...,<number>)
The program assumes that the six substitution rates are independent gamma-distributed random variables with the same scale parameter when dirichlet is selected. The six numbers in brackets each corresponds to a particular substitution type. Together, they determine the shape of the prior. The six rates are in the order A<->C, A<->G, A<->T, C<->G, C<->T, and G<->T. If you want an uninformative prior you can use dirichlet(1,1,1,1,1,1), also referred to as a 'flat' Dirichlet. This is the default setting. If you wish a prior where the C<->T rate is 5 times and the A<->G rate 2 times higher, on average, than the transversion rates, which are all the same, then you should use a prior of the form dirichlet(x,2x, x,x,5x,x), where x determines how much the prior is focused on these particular rates. For more info, see tratiopr. The fixed option allows you to fix the substitution rates to particular values.
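To see how x controls the focus of a prior of the form dirichlet(x,2x,x,x,5x,x), you can compute the mean and variance of each component. The sketch below is a stand-alone illustration that uses the standard Dirichlet formulas; the helper names are hypothetical and nothing here is taken from the MrBayes source.

```python
# Order of the six substitution types, as given in the text above.
RATE_ORDER = ["A<->C", "A<->G", "A<->T", "C<->G", "C<->T", "G<->T"]


def dirichlet_mean_var(alpha):
    """Per-component mean and variance of a Dirichlet(alpha) distribution."""
    total = sum(alpha)
    means = [a / total for a in alpha]
    variances = [a * (total - a) / (total ** 2 * (total + 1)) for a in alpha]
    return means, variances


def focused_prior(x):
    """Parameters of dirichlet(x,2x,x,x,5x,x) for a given x."""
    return [x, 2 * x, x, x, 5 * x, x]


for x in (1, 10):
    means, variances = dirichlet_mean_var(focused_prior(x))
    print(f"x = {x}")
    for name, m, v in zip(RATE_ORDER, means, variances):
        print(f"  {name}: mean = {m:.4f}, variance = {v:.5f}")
```

The means are the same for every x (the C<->T mean is 5 times, and the A<->G mean 2 times, the mean of each transversion rate), while the variances shrink as x grows.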
Aamodelpr -- This parameter sets the rate matrix for amino acid data. You can fix the model by specifying aamodelpr=fixed(<model name>), where <model name> is 'poisson' (a glorified Jukes-Cantor model), 'jones', 'dayhoff', 'mtrev', 'mtmam', 'wag', 'rtrev', 'cprev', 'vt', 'blosum', 'equalin' (a glorified Felsenstein 1981 model), or 'gtr'. Alternatively, you can average over the first ten models by specifying aamodelpr=mixed. If you do so, the Markov chain will sample each model according to its probability. The sampled model is reported as an index: poisson(0), jones(1), dayhoff(2), mtrev(3), mtmam(4), wag(5), rtrev(6), cprev(7), vt(8), or blosum(9). The 'Sump' command summarizes the MCMC samples and calculates the posterior probability estimate for each of these models.
Aarevmatpr -- This parameter sets the prior for the substitution rates of the GTR model for amino acid data. The options are:
prset aarevmatpr = dirichlet(<number>,<number>,...,<number>)
prset aarevmatpr = fixed(<number>,<number>,...,<number>)
The options are the same as those for 'Revmatpr' except that they are defined over the 190 rates of the time-reversible GTR model for amino acids instead of over the 6 rates of the GTR model for nucleotides. The rates are in the order A<->R, A<->N, etc to Y<->V. In other words, amino acids are listed in alphabetic order based on their full name. The first amino acid (Alanine) is then combined in turn with all amino acids following it in the list, starting with amino acid 2 (Arginine) and finishing with amino acid 20 (Valine). The second amino acid (Arginine) is then combined in turn with all amino acids following it, starting with amino acid 3 (Asparagine) and finishing with amino acid 20 (Valine), and so on.
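The ordering of the 190 rates described above can be reproduced with a short script, which is handy when you need to find the position of a particular rate in a long dirichlet or fixed list. This is only a sketch: the one-letter string and the helper name are choices made for this illustration, and the returned position is 0-based.

```python
from itertools import combinations

# One-letter codes in alphabetic order of the full amino acid names
# (Alanine, Arginine, Asparagine, ..., Tyrosine, Valine).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Each amino acid is paired in turn with every amino acid that follows it.
PAIRS = list(combinations(AMINO_ACIDS, 2))


def rate_index(first, second):
    """0-based position of the first<->second rate in the list of 190 rates."""
    i, j = sorted((AMINO_ACIDS.index(first), AMINO_ACIDS.index(second)))
    return PAIRS.index((AMINO_ACIDS[i], AMINO_ACIDS[j]))


print(len(PAIRS))            # number of rates
print(PAIRS[0], PAIRS[-1])   # first and last pair
print(rate_index("V", "Y"))  # the order of the two arguments does not matter
```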
Omegapr -- This parameter specifies the prior on the nonsynonymous/synonymous rate ratio. The options are:
prset omegapr = uniform(<number>,<number>)
prset omegapr = exponential(<number>)
prset omegapr = fixed(<number>)
This parameter is only in effect if the nucleotide substitution model is set to codon using the lset command (lset nucmodel=codon). Moreover, it only applies to the case when there is no variation in omega across sites (i.e., "lset omegavar=equal").
Ny98omega1pr -- This parameter specifies the prior on the nonsynonymous/synonymous rate ratio for sites under purifying selection. The options are:
prset Ny98omega1pr = beta(<number>,<number>)
prset Ny98omega1pr = fixed(<number>)
This parameter is only in effect if the nucleotide substitution model is set to codon using the lset command (lset nucmodel=codon). Moreover, it only applies to the case where omega varies across sites using the model of Nielsen and Yang (1998) (i.e., "lset omegavar=ny98"). If fixing the parameter, you must specify a number between 0 and 1.
Ny98omega3pr -- This parameter specifies the prior on the nonsynonymous/synonymous rate ratio for positively selected sites. The options are:
prset Ny98omega3pr = uniform(<number>,<number>)
prset Ny98omega3pr = exponential(<number>)
prset Ny98omega3pr = fixed(<number>)
This parameter is only in effect if the nucleotide substitution model is set to codon using the lset command (lset nucmodel=codon). Moreover, it only applies to the case where omega varies across sites according to the NY98 model. Note that under the NY98 model this parameter must be greater than 1, so you should not specify a uniform(0,10) prior, for example.
M3omegapr -- This parameter specifies the prior on the nonsynonymous/synonymous rate ratios for all three classes of sites for the M3 model. The options are:
prset M3omegapr = exponential
prset M3omegapr = fixed(<number>,<number>,<number>)
This parameter is only in effect if the nucleotide substitution model is set to codon using the lset command (lset nucmodel=codon). Moreover, it only applies to the case where omega varies across sites using the M3 model of Yang et al. (2000) (i.e., "lset omegavar=M3"). Under the exponential prior, the four rates (dN1, dN2, dN3, and dS) are all considered to be independent draws from the same exponential distribution (the parameter of the exponential does not matter, and so you don't need to specify it). The rates dN1, dN2, and dN3 are taken to be the order statistics with dN1 < dN2 < dN3. These three rates are all scaled to the same synonymous rate, dS. The other option is to simply fix the three rate ratios to some values.
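The construction of the exponential prior for the M3 model can be mimicked in a few lines. The sketch below is not the MrBayes implementation; it simply follows the description above (four draws from one exponential, three of them sorted and divided by the fourth) and uses hypothetical function names. It also illustrates why the parameter of the exponential does not matter: changing it rescales all four draws by the same factor, which cancels in the ratios.

```python
import math
import random


def draw_exponential(rng, rate):
    """Inverse-transform draw from an exponential distribution with the given rate."""
    return -math.log(1.0 - rng.random()) / rate


def draw_m3_omegas(rng, rate=1.0):
    """Draw (omega1, omega2, omega3) as ordered dN rates scaled by dS."""
    dn = sorted(draw_exponential(rng, rate) for _ in range(3))  # dN1 < dN2 < dN3
    ds = draw_exponential(rng, rate)                            # synonymous rate
    return [r / ds for r in dn]


# Same random numbers, different exponential parameter: the ratios agree.
omegas_a = draw_m3_omegas(random.Random(1), rate=1.0)
omegas_b = draw_m3_omegas(random.Random(1), rate=25.0)
print(omegas_a)
print(omegas_b)
```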
Codoncatfreqs -- This parameter specifies the prior on frequencies of sites under purifying, neutral, and positive selection. The options are:
prset codoncatfreqs = dirichlet(<num>,<num>,<num>)
prset codoncatfreqs = fixed(<number>,<number>,<number>)
This parameter is only in effect if the nucleotide substitution model is set to codon using the lset command (lset nucmodel=codon). Moreover, it only applies to the case where omega varies across sites using the models of Nielsen and Yang (1998) (i.e., "lset omegavar=ny98") or Yang et al. (2000) (i.e., "lset omegavar=M3"). Note that the sum of the three frequencies must be 1.
Statefreqpr -- This parameter specifies the prior on the state frequencies. The options are:
prset statefreqpr = dirichlet(<number>)
prset statefreqpr = dirichlet(<number>,...,<number>)
prset statefreqpr = fixed(equal)
prset statefreqpr = fixed(empirical)
prset statefreqpr = fixed(<number>,...,<number>)
For the dirichlet, you can specify either a single number or as many numbers as there are states. If you specify a single number, then the prior has all states equally probable with a variance related to the single parameter passed in.
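How the variance depends on the single number can be worked out directly. The sketch below assumes that a single number is used as the Dirichlet parameter for every state (an interpretation made for this illustration, not a statement about the program's internals) and applies the standard formula for a symmetric Dirichlet.

```python
def symmetric_dirichlet(alpha, n_states):
    """Mean and variance of one state frequency under Dirichlet(alpha, ..., alpha)."""
    total = alpha * n_states
    mean = 1.0 / n_states
    var = alpha * (total - alpha) / (total ** 2 * (total + 1))
    return mean, var


# Four states, as for nucleotide data.
for alpha in (0.5, 1, 10, 100):
    mean, var = symmetric_dirichlet(alpha, 4)
    print(f"dirichlet({alpha}): mean = {mean:.2f}, variance = {var:.5f}")
```

Larger values keep the equal expected frequencies but leave less room for the frequencies to depart from them.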
Shapepr -- This parameter specifies the prior for the gamma shape parameter for among-site rate variation. The options are:
prset shapepr = uniform(<number>,<number>)
prset shapepr = exponential(<number>)
prset shapepr = fixed(<number>)
Pinvarpr -- This parameter specifies the prior for the proportion of invariable sites. The options are:
prset pinvarpr = uniform(<number>,<number>)
prset pinvarpr = fixed(<number>)
Note that the valid range for the parameter is between 0 and 1. Hence, "prset pinvarpr=uniform(0,0.8)" is valid while "prset pinvarpr=uniform(0,10)" is not. The default setting is "prset pinvarpr=uniform(0,1)".
Ratecorrpr -- This parameter specifies the prior for the autocorrelation parameter of the autocorrelated gamma distribution for among-site rate variation. The options are:
prset ratecorrpr = uniform(<number>,<number>)
prset ratecorrpr = fixed(<number>)
Note that the valid range for the parameter is between -1 and 1. Hence, "prset ratecorrpr=uniform(-1,1)" is valid while "prset ratecorrpr=uniform(-11,10)" is not. The default setting is "prset ratecorrpr=uniform(-1,1)".
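The range restrictions given for pinvarpr and ratecorrpr amount to a simple bounds check, sketched below. This is a stand-alone illustration with hypothetical names, not the check that the program itself performs; in particular, requiring the lower bound to be strictly smaller than the upper bound is an assumption.

```python
# Valid parameter ranges as stated in the text above.
VALID_RANGE = {
    "pinvarpr": (0.0, 1.0),
    "ratecorrpr": (-1.0, 1.0),
}


def uniform_prior_is_valid(param, lower, upper):
    """True if uniform(lower, upper) lies inside the valid range for param."""
    lo, hi = VALID_RANGE[param]
    return lo <= lower < upper <= hi


print(uniform_prior_is_valid("pinvarpr", 0, 0.8))
print(uniform_prior_is_valid("pinvarpr", 0, 10))
print(uniform_prior_is_valid("ratecorrpr", -1, 1))
print(uniform_prior_is_valid("ratecorrpr", -11, 10))
```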
Covswitchpr -- This option sets the prior for the covarion switching rates. The options are:
prset covswitchpr = uniform(<number>,<number>)
prset covswitchpr = exponential(<number>)
prset covswitchpr = fixed(<number>,<number>)
The covarion model has two rates: a rate from on to off and a rate from off to on. The rates are assumed to have independent priors that individually are either uniformly or exponentially distributed. The other option is to fix the switching rates, in which case you must specify both rates. (The first number is off->on and the second is on->off).
Symdirihyperpr -- This option sets the prior for the stationary frequencies of the states for morphological (standard) data. There can be as many as 10 states for standard data. However, the labelling of the states is somewhat arbitrary. For example, the state "1" for different characters does not have the same meaning. This is not true for DNA characters, for example, where a "G" has the same meaning across characters. The fact that the labelling of morphological characters is arbitrary makes it difficult to allow unequal character-state frequencies. MrBayes gets around this problem by assuming that the states have a dirichlet prior, with all states having equal frequency. The variation in the dirichlet can be controlled by this parameter--symdirihyperpr. Symdirihyperpr specifies the distribution on the variance parameter of the dirichlet. The valid options are:
prset Symdirihyperpr = uniform(<number>,<number>)
prset Symdirihyperpr = exponential(<number>)
prset Symdirihyperpr = fixed(<number>)
prset Symdirihyperpr = fixed(infinity)
If "fixed(infinity)" is chosen, the dirichlet prior is fixed such that all character states have equal frequency.
Topologypr -- This parameter specifies the prior probabilities of phylogenies. The options are:
prset topologypr = uniform
prset topologypr = constraints(<list>)
If the prior is selected to be "uniform", the default, then all possible trees are considered a priori equally probable. The constraints option allows you to specify complicated prior probabilities on trees (constraints are discussed more fully in "help constraint"). Note that you must specify a list of the constraints that you wish to be obeyed. The constraints in the list can be given either by name or by number. Also, note that the constraints simply tell you how much more (or less) probable individual trees are that possess the constraint than trees not possessing the constraint.
Brlenspr -- This parameter specifies the prior probability distribution on branch lengths. The options are:
prset brlenspr = unconstrained:uniform(<num>,<num>)
prset brlenspr = unconstrained:exponential(<number>)
prset brlenspr = clock:uniform
prset brlenspr = clock:birthdeath
prset brlenspr = clock:coalescence
Trees with unconstrained branch lengths are unrooted whereas clock-constrained trees are rooted. The option after the colon specifies the details of the probability density of branch lengths. If you choose a birth-death or coalescence prior, you may want to modify the details of the parameters of those processes.
Treeheightpr -- This parameter specifies the prior probability distribution on the tree height, when a clock model is specified. The options are:
prset treeheightpr = Gamma(<num>,<num>)
prset treeheightpr = Exponential(<number>)
(And, yes, we know the exponential is a special case of the gamma distribution.) The tree height is the expected number of substitutions on a single branch that extends from the root of the tree to the tips. This parameter does not come into play for the coalescence prior. It ensures that the prior probability distribution for unconstrained and birth-death models is proper.
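The remark about the exponential being a special case of the gamma can be verified numerically. The sketch below assumes a (shape, rate) parameterization of the gamma density; the text above does not say how the two numbers of Gamma(<num>,<num>) are interpreted, so treat that as an assumption of this illustration.

```python
import math


def gamma_pdf(x, shape, rate):
    """Gamma density in the (shape, rate) parameterization (assumed here)."""
    return rate ** shape * x ** (shape - 1) * math.exp(-rate * x) / math.gamma(shape)


def exponential_pdf(x, rate):
    """Exponential density with the given rate."""
    return rate * math.exp(-rate * x)


# With shape 1 the gamma density coincides with the exponential density.
for x in (0.1, 1.0, 5.0):
    print(x, gamma_pdf(x, 1.0, 2.0), exponential_pdf(x, 2.0))
```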
Ratepr -- This parameter allows you to specify the site specific rates model. First, you must have defined a partition of the characters. For example, you may define a partition that divides the characters by codon position, if you have DNA data. Second, you must make that partition the active one using the set command. For example, if your partition is called "by_codon", then you make that the active partition using "set partition=by_codon". Now that you have defined and activated a partition, you can specify the rate multipliers for the various partitions. The options are:
prset ratepr = fixed
prset ratepr = variable
prset ratepr = dirichlet(<number>,<number>,...,<number>)
If you specify "fixed", then the rate multiplier for that partition is set to 1 (i.e., the rate is fixed to the average rate across partitions). On the other hand, if you specify "variable", then the rate is allowed to vary across partitions subject to the constraint that the average rate of substitution across the partitions is 1. You must specify a variable rate prior for at least two partitions, otherwise the option is not activated when calculating likelihoods. The variable option automatically associates the partition rates with a dirichlet(1,...,1) prior. The dirichlet option is an alternative way of setting a partition rate to be variable, and also gives accurate control of the shape of the prior. The parameters of the Dirichlet are listed in the order of the partitions that the ratepr is applied to. For instance, "prset applyto=(1,3,4) ratepr = dirichlet(10,40,15)" would set the Dirichlet parameter 10 to partition 1, 40 to partition 3, and 15 to partition 4.
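The way the Dirichlet parameters map onto partitions can be illustrated with the example above. The sketch below computes prior mean rate multipliers under a simplifying assumption: the multipliers are scaled so that their plain (unweighted) average across the listed partitions is 1. The program may weight partitions differently, for instance by their number of characters, so the numbers are only indicative. The function name is hypothetical.

```python
def prior_mean_multipliers(partitions, alpha):
    """Prior mean rate multiplier per partition, scaled to an unweighted average of 1."""
    if len(partitions) != len(alpha):
        raise ValueError("need one Dirichlet parameter per partition")
    total = sum(alpha)
    k = len(alpha)
    # The Dirichlet mean proportion is a / total; multiply by k so the average is 1.
    return {p: k * a / total for p, a in zip(partitions, alpha)}


# prset applyto=(1,3,4) ratepr = dirichlet(10,40,15)
multipliers = prior_mean_multipliers([1, 3, 4], [10, 40, 15])
for partition, m in multipliers.items():
    print(f"partition {partition}: prior mean multiplier = {m:.3f}")
```

The parameters are matched to partitions by position, so partition 3 receives the parameter 40 and therefore the highest prior mean rate.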
Speciationpr -- This parameter sets the prior on the speciation rate. The options are:
prset speciationpr = uniform(<number>,<number>)
prset speciationpr = exponential(<number>)
prset speciationpr = fixed(<number>)
This parameter is only relevant if the birth-death process is selected as the prior on branch lengths.
Extinctionpr -- This parameter sets the prior on the extinction rate. The options are:
prset extinctionpr = uniform(<number>,<number>)
prset extinctionpr = exponential(<number>)
prset extinctionpr = fixed(<number>)
This parameter is only relevant if the birth-death process is selected as the prior on branch lengths.
Sampleprob -- This parameter sets the fraction of species that are sampled in the analysis. This is used with the birth-death prior on trees (see Yang and Rannala, 1997).
Thetapr -- This parameter sets the prior on the coalescence parameter. The options are:
prset thetapr = uniform(<number>,<number>)
prset thetapr = exponential(<number>)
prset thetapr = fixed(<number>)
This parameter is only relevant if the coalescence process is selected as the prior on branch lengths.
Default model settings:
Parameter        Options                      Current Setting
------------------------------------------------------------------
Tratiopr         Beta/Fixed                   Beta(1.0,1.0)
Revmatpr         Dirichlet/Fixed              Dirichlet(1.0,1.0,1.0,1.0,1.0,1.0)
Aamodelpr        Fixed/Mixed                  Fixed(Poisson)
Aarevmatpr       Dirichlet/Fixed              Dirichlet(1.0,1.0,...)
Omegapr          Dirichlet/Fixed              Dirichlet(1.0,1.0)
Ny98omega1pr     Beta/Fixed                   Beta(1.0,1.0)
Ny98omega3pr     Uniform/Exponential/Fixed    Exponential(1.0)
M3omegapr        Exponential/Fixed            Exponential
Codoncatfreqs    Dirichlet/Fixed              Dirichlet(1.0,1.0,1.0)
Statefreqpr      Dirichlet/Fixed              Dirichlet
Ratepr           Fixed/Variable=Dirichlet     Fixed
Shapepr          Uniform/Exponential/Fixed    Uniform(0.0,50.0)
Ratecorrpr       Uniform/Fixed                Uniform(-1.0,1.0)
Pinvarpr         Uniform/Fixed                Uniform(0.0,1.0)
Covswitchpr      Uniform/Exponential/Fixed    Uniform(0.0,100.0)
Symdirihyperpr   Uniform/Exponential/Fixed    Fixed(Infinity)
Topologypr       Uniform/Constraints          Uniform
Brlenspr         Unconstrained/Clock          Unconstrained:Exp(10.0)
Speciationpr     Uniform/Exponential/Fixed    Uniform(0.0,10.0)
Extinctionpr     Uniform/Exponential/Fixed    Uniform(0.0,10.0)
Sampleprob       <number>                     1.00
Thetapr          Uniform/Exponential/Fixed    Uniform(0.0,10.0)
Growthpr         Uniform/Exponential/
                 Fixed/Normal                 Fixed(0.0)
------------------------------------------------------------------