Inbreeding within a subpopulation is caused by nonrandom mating among its members: mating between closely related individuals occurs more often than expected by chance alone. Because closely related individuals share a large proportion of their alleles through common descent, their offspring will have a higher level of homozygosity and, conversely, a lower level of heterozygosity than expected. A within-subpopulation F-statistic, $F_{IS}$, can be estimated from the ratio of observed to expected heterozygosity,

$$F_{IS} = 1 - \frac{H_I}{H_S}$$

where $H_S$ is the average expected heterozygosity estimated from each subpopulation by,

$$H_S = \frac{1}{k}\sum_{j=1}^{k}\left(1 - \sum_i p_{ij}^2\right)$$

and $H_I$ is the average observed heterozygosity,

$$H_I = \frac{1}{k}\sum_{j=1}^{k} H_j$$

for k subpopulations, with $p_{ij}$ the frequency of the ith allele and $H_j$ the observed proportion of heterozygotes in the jth subpopulation.
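The within-subpopulation calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual heterozygosity-ratio form $F_{IS} = 1 - H_I/H_S$ with no sample-size correction; the function names and the example frequencies are hypothetical and are not taken from any particular program.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one subpopulation."""
    return 1.0 - sum(p * p for p in freqs)


def f_is(allele_freqs, observed_het):
    """F_IS = 1 - H_I / H_S, each averaged over the k subpopulations.

    allele_freqs: one list of allele frequencies per subpopulation
    observed_het: observed proportion of heterozygotes per subpopulation
    """
    k = len(allele_freqs)
    h_s = sum(expected_heterozygosity(f) for f in allele_freqs) / k
    h_i = sum(observed_het) / k
    return 1.0 - h_i / h_s


# Hypothetical data: two subpopulations, one locus with two alleles.
example_freqs = [[0.5, 0.5], [0.5, 0.5]]
example_het = [0.4, 0.4]  # fewer heterozygotes than the expected 0.5
print(f_is(example_freqs, example_het))
```

A positive value indicates a deficit of heterozygotes relative to expectation, and a value of zero indicates that observed and expected heterozygosity agree.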
Population substructure will also lead to inbreeding-like effects, i.e. a reduction in observed heterozygosity compared with that expected. This is known as the Wahlund effect. As allele frequencies in two subpopulations diverge, the average expected heterozygosity within those subpopulations, $H_S$, will always be less than that expected from the pooled allele frequencies, $H_T$. An among-subpopulation F-statistic, $F_{ST}$, can be estimated from this ratio,

$$F_{ST} = 1 - \frac{H_S}{H_T}$$

where

$$H_T = 1 - \sum_i \bar{p}_i^2$$

and $\bar{p}_i$ is the frequency of the ith allele averaged over all subpopulations. It should be noted that as allele frequencies diverge, the difference between $H_S$ and $H_T$ will increase, so $F_{ST}$ will also serve as a measure of genetic distance among subpopulations.
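The Wahlund effect is easy to see numerically. The sketch below assumes the uncorrected forms $H_S$ (mean within-subpopulation expected heterozygosity), $H_T = 1 - \sum_i \bar{p}_i^2$ and $F_{ST} = 1 - H_S/H_T$, with equal weighting of subpopulations; the function name and the example frequencies are hypothetical.

```python
def f_st(allele_freqs):
    """Return (H_S, H_T, F_ST) for one locus.

    allele_freqs: one list of allele frequencies per subpopulation,
    with alleles in the same order in every subpopulation.
    """
    k = len(allele_freqs)
    n_alleles = len(allele_freqs[0])
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(1.0 - sum(p * p for p in f) for f in allele_freqs) / k
    # Unweighted mean frequency of each allele across subpopulations.
    p_bar = [sum(f[i] for f in allele_freqs) / k for i in range(n_alleles)]
    # Expected heterozygosity from the pooled frequencies.
    h_t = 1.0 - sum(p * p for p in p_bar)
    return h_s, h_t, 1.0 - h_s / h_t


# Hypothetical example: identical versus divergent subpopulations.
print(f_st([[0.5, 0.5], [0.5, 0.5]]))
print(f_st([[0.2, 0.8], [0.8, 0.2]]))
```

With identical frequencies $H_S$ equals $H_T$ and the statistic is zero; once the frequencies diverge, $H_S$ falls below $H_T$ and the statistic becomes positive.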
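The correlation of alleles for the entire population, $F_{IT}$, thus combines both the within- and among-subpopulation effects, and can be estimated from,

$$F_{IT} = 1 - \frac{H_I}{H_T}$$

or, equivalently, from $(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})$, where $H_I$ is the average observed heterozygosity, $H_T$ is the heterozygosity expected from the pooled allele frequencies, and $F_{IS}$ and $F_{ST}$ are the within- and among-subpopulation statistics.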
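The way the three statistics combine can be checked directly. The following sketch assumes the heterozygosity-ratio definitions $F_{IS} = 1 - H_I/H_S$, $F_{ST} = 1 - H_S/H_T$ and $F_{IT} = 1 - H_I/H_T$; the function name and the three input values are hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from the three heterozygosities."""
    f_is = 1.0 - h_i / h_s  # within subpopulations
    f_st = 1.0 - h_s / h_t  # among subpopulations
    f_it = 1.0 - h_i / h_t  # entire population
    return f_is, f_st, f_it


# Hypothetical heterozygosities: observed, within-subpopulation, total.
f_is, f_st, f_it = f_statistics(h_i=0.24, h_s=0.32, h_t=0.50)

# The total effect is the product of the two component effects:
# (1 - F_IT) = (1 - F_IS)(1 - F_ST)
print(1.0 - f_it, (1.0 - f_is) * (1.0 - f_st))
```

The two printed values agree up to floating-point rounding, because $H_I/H_T = (H_I/H_S)(H_S/H_T)$.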
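Nei (1987) further developed $F_{ST}$ (often written $G_{ST}$ in this multilocus form) so that data from many loci could be combined. This estimate is calculated from

$$F_{ST} = \frac{\bar{H}_T - \bar{H}_S}{\bar{H}_T}$$

where $H_S$ and $H_T$ are first averaged across all loci, giving $\bar{H}_S$ and $\bar{H}_T$, and then used to estimate $F_{ST}$. For each locus,

$$H_S = 1 - \frac{1}{k}\sum_{j=1}^{k}\sum_i p_{ij}^2, \qquad H_T = 1 - \sum_i \Big(\frac{1}{k}\sum_{j=1}^{k} p_{ij}\Big)^2$$

where $p_{ij}$ is the frequency of the ith allele in the jth population.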
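Combining loci can be sketched as follows. This is a simplified illustration that averages the within-subpopulation and total heterozygosities across loci before taking the ratio $(\bar{H}_T - \bar{H}_S)/\bar{H}_T$; it assumes equal weighting of subpopulations and omits any sample-size correction, and the function names and data are hypothetical.

```python
def locus_heterozygosities(allele_freqs):
    """Return (H_S, H_T) for one locus from per-subpopulation frequencies."""
    k = len(allele_freqs)
    n_alleles = len(allele_freqs[0])
    h_s = 1.0 - sum(p * p for f in allele_freqs for p in f) / k
    p_bar = [sum(f[i] for f in allele_freqs) / k for i in range(n_alleles)]
    h_t = 1.0 - sum(p * p for p in p_bar)
    return h_s, h_t


def multilocus_f_st(loci):
    """Average H_S and H_T over loci first, then form the ratio."""
    per_locus = [locus_heterozygosities(freqs) for freqs in loci]
    mean_h_s = sum(h_s for h_s, _ in per_locus) / len(per_locus)
    mean_h_t = sum(h_t for _, h_t in per_locus) / len(per_locus)
    return (mean_h_t - mean_h_s) / mean_h_t


# Hypothetical data: two loci, two subpopulations, two alleles each.
loci = [
    [[0.2, 0.8], [0.8, 0.2]],  # locus 1: divergent frequencies
    [[0.5, 0.5], [0.5, 0.5]],  # locus 2: identical frequencies
]
print(multilocus_f_st(loci))
```

Taking the ratio of the averages, rather than averaging per-locus ratios, keeps loci with very low total heterozygosity from dominating the estimate.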
Weir and Cockerham (1984) developed a variance-based method for the estimation of F-statistics. $F_{ST}$ can be thought of as the correlation of pairs of alleles between individuals within a subpopulation. If there is population structure, then alleles found within a subpopulation should be correlated (found together more often than expected) relative to all the alleles found in the entire population. Weir and Cockerham (1984) describe a measure, $\theta$, which estimates this correlation by partitioning the variance of allele frequency.
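The total variance of allele frequency within a population is equal to the sum of its components: the between-subpopulation variance in allele frequency, $\sigma_a^2$; the between-individuals-within-subpopulation variance in allele frequency, $\sigma_b^2$; and the between-gametes-within-individuals variance in allele frequency, $\sigma_c^2$, i.e.

$$\sigma_T^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2$$

Given this, the among-subpopulation correlation $\theta$ (the analogue of $F_{ST}$) can be estimated from

$$\hat{\theta} = \frac{\sum_u \sum_i \sigma_{a(iu)}^2}{\sum_u \sum_i \left(\sigma_{a(iu)}^2 + \sigma_{b(iu)}^2 + \sigma_{c(iu)}^2\right)}$$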
where the variances in allele frequency are summed over all alleles i and all loci u. The precise formulae for estimating the components of variance can be found in Weir and Cockerham (1984). In the special case when both the sample sizes n and the number of subpopulations sampled, k, are very large, the estimate of the among-subpopulation statistic, $\hat{\theta}$, reduces for a single allele to

$$\hat{\theta} = \frac{\sum_{j=1}^{k} n_j (p_j - \bar{p})^2 \,/\, \left[(k-1)\,\bar{n}\right]}{\bar{p}\,(1 - \bar{p})}$$

where $p_j$ and $n_j$ are the observed allele frequency and sample size of the jth population, and $\bar{p}$ and $\bar{n}$ are the average allele frequency and sample size for the entire population. From this equation it can be seen that as the allele frequencies in the subpopulations diverge, the value of the numerator will increase, and the value of $\hat{\theta}$ will approach 1.
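The large-sample special case can be sketched for a single allele as the sample variance of allele frequency among subpopulations divided by $\bar{p}(1 - \bar{p})$. This is only that approximation, not the full Weir and Cockerham (1984) estimator; the function name is hypothetical, and the sketch assumes that the average allele frequency is weighted by sample size.

```python
def theta_large_sample(freqs, sizes):
    """Approximate theta for one allele when n and k are both large.

    freqs: observed frequency of the allele in each subpopulation (p_j)
    sizes: sample size of each subpopulation (n_j)
    """
    k = len(freqs)
    n_bar = sum(sizes) / k
    # Assumption: the average frequency is weighted by sample size.
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / sum(sizes)
    # Weighted variance of allele frequency among subpopulations.
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs))
    s2 /= (k - 1) * n_bar
    denom = p_bar * (1.0 - p_bar)
    if denom == 0.0:
        raise ValueError("allele is fixed or absent in the whole sample")
    return s2 / denom


# Hypothetical data: four subpopulations of 50 individuals each.
sizes = [50, 50, 50, 50]
print(theta_large_sample([0.5, 0.5, 0.5, 0.5], sizes))  # no divergence
print(theta_large_sample([0.3, 0.4, 0.6, 0.7], sizes))  # some divergence
print(theta_large_sample([0.1, 0.2, 0.8, 0.9], sizes))  # strong divergence
```

With only a few subpopulations the $(k - 1)$ divisor can push this approximation above 1 when frequencies are near fixation, which is one reason the full variance-component estimator is preferred for real data.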