[Bioperl-l] PopGen modules
bli1 at bcm.tmc.edu
Thu Nov 10 01:15:43 EST 2005
I recently started to play with the PopGen modules, but I am confused by the
difference between "number of individuals" and "sample size". My
understanding is that sample size is the number of haploid genomes
(chromosomes) sampled, and number of individuals is the number of diploid
individuals. For example, if 100 humans are genotyped, the sample size
should be 200 and the number of individuals 100. Am I right? I could be
completely wrong, but assume I am right for now.
I constructed a population object (named $pop) from prettybase-format data. Then:
use Bio::PopGen::Statistics;
my $stats = Bio::PopGen::Statistics->new();
my $number_individuals = $pop->get_number_individuals();
my $seg_sites = $stats->segregating_sites_count($pop);
# theta computed three ways: from the population object, then with explicit sample sizes
my $theta1 = $stats->theta($pop);
my $theta2 = $stats->theta($number_individuals, $seg_sites);      # n = individuals
my $theta3 = $stats->theta($number_individuals * 2, $seg_sites);  # n = chromosomes
With the above code, $theta1 == $theta2 != $theta3, so theta($pop) appears
to use the number of individuals as the sample size. I think $theta3
should be the correct answer.
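
To make the size of the discrepancy concrete, here is a rough worked example. It assumes theta() computes Watterson's estimator, which the code above does not itself confirm, and it uses the 100 individuals / 200 chromosomes case from the example earlier. S is the number of segregating sites and n is the sample size; the numerical values are rounded.

```latex
% Watterson's estimator: S segregating sites in a sample of size n
\theta_W = \frac{S}{a_n}, \qquad a_n = \sum_{i=1}^{n-1} \frac{1}{i}

% n counted as individuals (n = 100) versus chromosomes (n = 200)
a_{100} = \sum_{i=1}^{99} \frac{1}{i} \approx 5.177, \qquad
a_{200} = \sum_{i=1}^{199} \frac{1}{i} \approx 5.873

% For the same S, the two estimates differ by a constant factor
\frac{\theta_W(n = 200)}{\theta_W(n = 100)} = \frac{a_{100}}{a_{200}} \approx 0.88
```

Under that assumption, passing the number of individuals instead of the number of chromosomes would overestimate theta by roughly 13% in this case, independent of S.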
I used Hudson's "ms" program to simulate 200 chromosomes, and passing 200
as the sample size gives the correct answers (double confirmed with other
Please let me know if I am too naive about this.