Parameters

By setting these parameters, you help weighbor do a better job of giving appropriate weights to the long and short branches. In the UNIX version you should use -L to specify the length of the sequences used to create the distance matrix. Thus use

weighbor -v -L 1000 < distfile > treefile

if your sequences had length 1000. (The menu driven version on Macs and PCs will prompt you for the input values.) If your model of evolution assumes that 20% of the sites are actually invariable (will never change), you should reflect this in L by reducing it by 20%, to 800.
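The adjustment for invariable sites is simple arithmetic. The following is a minimal sketch of it; the function name effective_length is only an illustration and is not part of weighbor.

```python
def effective_length(seq_length: int, invariable_fraction: float) -> int:
    """Return the value to pass to -L after discounting invariable sites.

    seq_length is the alignment length used to build the distance matrix;
    invariable_fraction is the assumed share of sites that never change.
    """
    if not 0.0 <= invariable_fraction < 1.0:
        raise ValueError("invariable_fraction must be in [0, 1)")
    # Only the variable sites count toward the effective length.
    return round(seq_length * (1.0 - invariable_fraction))


if __name__ == "__main__":
    # Length 1000 with 20% invariable sites
    print(effective_length(1000, 0.20))
```

For sequences of length 1000 with 20% invariable sites this gives 800, the value used with -L above.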

Another parameter you can change is the "effective alphabet size", which we call b. The default value is b=4.0, i.e., the Jukes-Cantor model. If nucleotide usage is biased or if some sites are under partial selection pressure, you may want to reduce b (but b should be larger than 2.0). To account for significantly biased nucleotide usage, you could set (b-1)/b to the sequence dissimilarity expected for two random sequences (equivalently, 1/b to the expected similarity). This results in the basic Tajima and Nei (1984, MBE) model:
b = 1 / (1 - sum_i f_i (1 - f_i))
where f_i are the nucleotide frequencies. If the frequencies are 0.3, 0.3, 0.2, 0.2, we find b=3.85. In practice a value this small or smaller is probably appropriate most of the time, owing to selection.
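The formula can be checked numerically. The sketch below is only an illustration (the function name effective_alphabet_size is hypothetical and not part of weighbor); it assumes the frequencies are given as fractions that sum to 1.

```python
def effective_alphabet_size(freqs):
    """Compute b = 1 / (1 - sum_i f_i (1 - f_i)) from character frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("frequencies must sum to 1")
    # Probability that two independently drawn characters differ.
    mismatch = sum(f * (1.0 - f) for f in freqs)
    return 1.0 / (1.0 - mismatch)


if __name__ == "__main__":
    # Uniform usage recovers the Jukes-Cantor default of 4.0.
    print(effective_alphabet_size([0.25, 0.25, 0.25, 0.25]))
    # Biased usage from the example in the text.
    print(round(effective_alphabet_size([0.3, 0.3, 0.2, 0.2]), 2))
```

With uniform frequencies the result is the default b=4.0; with frequencies 0.3, 0.3, 0.2, 0.2 it rounds to 3.85.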

If you are using amino acid sequences you may want to increase b to 20.0, although again you will want to reduce it below that owing to the non-uniform usage of amino acids. Swofford and Olsen (in Hillis and Moritz's Molecular Systematics, 1990) have suggested using a value of about 14 for this reason.

To run weighbor on sequences of length 1000 with 20% invariable sites, and with b=3.85 characters allowed (on average) at the variable sites, use

weighbor -b 3.85 -L 800 < distfile > treefile
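If you drive weighbor from a script, the same command can be assembled programmatically. This is a rough sketch, assuming the UNIX version is installed as weighbor on your PATH and reads the distance matrix from standard input as in the command above; the helper names weighbor_args and run_weighbor are made up for this example, and only the -b and -L options described here are used.

```python
import subprocess


def weighbor_args(length, b=4.0):
    """Build the argument list for: weighbor -b <b> -L <length>."""
    return ["weighbor", "-b", str(b), "-L", str(length)]


def run_weighbor(distfile, treefile, length, b=4.0):
    """Run weighbor with distfile on stdin and write its output to treefile.

    Assumes a 'weighbor' executable is available on PATH.
    """
    with open(distfile) as src, open(treefile, "w") as dst:
        subprocess.run(weighbor_args(length, b), stdin=src, stdout=dst, check=True)


if __name__ == "__main__":
    # Same settings as the command line above: b=3.85, effective length 800.
    print(" ".join(weighbor_args(800, 3.85)))
```

Calling run_weighbor("distfile", "treefile", 800, 3.85) is then equivalent to the shell command shown above.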

Also note that you should use similar parameters when creating your distance matrix. Using a good distance matrix is probably much more important than tuning the parameters in weighbor.
