weighbor -v -L 1000 < distfile > treefile
if your sequences had length 1000. (The menu-driven version on Macs and PCs will prompt you for the input values.) If your model of evolution assumes that 20% of the sites are invariable (will never change), you should reflect this in L by reducing it by 20%, to 800.
Another parameter you can change is the "effective alphabet size"
which we call b. The default value is b=4.0, i.e., the
Jukes-Cantor model. If nucleotide usage is biased or if some
sites are under partial selection pressure, you may want to
reduce b (but b should be larger than 2.0). To account for
significantly biased nucleotide usage, you could
set (b-1)/b to the sequence dissimilarity expected for two random sequences (equivalently, 1/b to the expected similarity).
This results in the basic Tajima and Nei (1984, MBE) model:
    (b-1)/b = 1 - sum_i fi^2
where fi are the nucleotide frequencies.
If the frequencies are 0.3, 0.3, 0.2, 0.2, we find b=3.85. In practice
a value this small or smaller is probably appropriate most of the time,
owing to selection.
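As a quick check of the arithmetic, the relation between b and the nucleotide frequencies can be computed directly. The following is only a minimal sketch: the function name effective_alphabet_size is made up for illustration and is not part of weighbor, and it assumes the Tajima-Nei relation described above, which rearranges to b = 1 / sum_i fi^2.

```python
def effective_alphabet_size(freqs):
    """Return b such that (b-1)/b = 1 - sum(f_i^2), i.e. b = 1 / sum(f_i^2).

    Hypothetical helper for illustration; not part of weighbor itself.
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("frequencies must sum to 1")
    return 1.0 / sum(f * f for f in freqs)


# Frequencies from the text; the result rounds to 3.85.
print(round(effective_alphabet_size([0.3, 0.3, 0.2, 0.2]), 2))

# Uniform usage recovers the Jukes-Cantor default of b = 4.
print(effective_alphabet_size([0.25, 0.25, 0.25, 0.25]))
```

Since selection is not captured by the frequencies alone, a value computed this way is probably best treated as an upper bound on b.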
If you are using amino acid sequences you may want to increase b to 20.0, although again you will want to reduce it below that owing to the non-uniform usage of amino acids. Swofford and Olsen (in Hillis and Moritz's Molecular Systematics, 1990) have suggested using a value of about 14 for this reason.
To use weighbor with a length of 1000 but with 20% invariable sites, and b=3.85 characters allowed (on average) in the variable sites, run
weighbor -b 3.85 -L 800 < distfile > treefile
Also note that you should use similar parameters when creating your distance matrix. Using a good distance matrix is probably much more important than tuning the parameters in weighbor.
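The adjustment of L for invariable sites and the choice of b can be scripted when many runs are needed. The sketch below only assembles the command line shown above; the helper name weighbor_command is hypothetical, it uses only the -b and -L options mentioned in this document, and rounding the effective length to a whole number is an assumption made here for tidiness.

```python
def weighbor_command(seq_length, invariable_fraction=0.0, b=4.0):
    """Build the weighbor argument list for a given alignment.

    Hypothetical helper: L is reduced by the fraction of invariable sites,
    and b is the effective alphabet size (default 4.0, Jukes-Cantor).
    """
    if not 0.0 <= invariable_fraction < 1.0:
        raise ValueError("invariable_fraction must be in [0, 1)")
    # Assumption: round the effective length to the nearest integer.
    effective_length = round(seq_length * (1.0 - invariable_fraction))
    return ["weighbor", "-b", f"{b:g}", "-L", str(effective_length)]


# Length 1000, 20% invariable sites, b = 3.85, as in the example above.
args = weighbor_command(1000, invariable_fraction=0.20, b=3.85)
print(" ".join(args) + " < distfile > treefile")
```

The list form can be passed to a process launcher with distfile attached to standard input and treefile to standard output, mirroring the shell redirection.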