Estimating and testing the effect of allelic recombination on the correlation between genotypic relatives

This paper provides estimates of the correlation between genotypic relatives and of the effect of allelic recombination on that correlation, assuming random mating. It is shown that the correlation is a non-negative quantity and that allelic recombination has the effect of reducing total variation and doubling the correlation between genotypic relatives with respect to measurements on the character of interest. The significance of the correlation coefficient and of the fitted regression model is assessed using the analysis of variance method.


INTRODUCTION
Genetic recombination is an effective means of combining individual traits of two parents, permitting the comparison of one expression of a character with another expression of the same trait (Burns, 1976; Alberts, 1994; Maloy, 1994).
Although much work has been done on the correlation between relatives for various physical characteristics, starting with the pioneering work of Fisher (1918), very little has been written on the effect of genetic recombination on these correlations (Ewens, 1979; Oyeka and Oyeka, 1988). These writers also did not provide a statistic for testing the significance of the estimated correlation coefficient. In this paper, the work of Oyeka and Oyeka (1988) is modified to include a test statistic for the estimated correlation coefficient and a test of whether the hypothesised model fits.

CORRELATION BETWEEN RELATIVES
We assume that a certain population has a gene locus with two possible alleles, A and a. We also assume that the probability of occurrence of allele A in the population of interest is $p$ and that the corresponding probability of allele a is $q = 1 - p$. We further assume that a certain characteristic or factor $V$ of the population of interest is completely determined by the genotype at this locus, in such a way that all individuals of genotype AA have a measurement $V = x$ for the character of interest, all individuals of genotype Aa have a measurement $V = y$, and all individuals of genotype aa have a measurement $V = z$. Finally, we assume that the population obeys the Hardy-Weinberg law of random mating (Stein, 1943; Clavel et al., 1989).

*Corresponding author. E-mail: chinwe_uzuke@yahoo.com.
Suppose that $n$ pairs of relatives are studied, and let $R_1$ and $R_2$ be a pair of relatives who are known to have at least one allele in common. Under the law of random mating, the alleles A and a occur independently within a genotype. Hence the probabilities of occurrence of the genotypes AA, Aa (or aA), and aa are, respectively,

$$P(AA) = p^2, \qquad P(Aa) = 2pq, \qquad P(aa) = q^2.$$

We first derive an estimate of the correlation between the measurements on the character for genotypic relatives. To do this, we find the conditional probabilities that $R_2$, say, is of a certain genotype given the genotype of $R_1$, and then derive the joint probability distribution of the genotypes of $R_1$ and $R_2$.

If $R_1$ is of genotype AA, then $R_2$ must have allele A in common with $R_1$. Since the second allele in $R_2$ occurs independently of the known allele A, it is either A with probability $p$ or a with probability $q = 1 - p$. Because the relatives must have at least one allele in common, $R_2$ cannot be of genotype aa if $R_1$ is of genotype AA. Hence the required conditional probabilities are

$$P(R_2 = AA \mid R_1 = AA) = p, \qquad P(R_2 = Aa \mid R_1 = AA) = q, \qquad P(R_2 = aa \mid R_1 = AA) = 0.$$

If $R_1$ is of genotype Aa, then $R_2$ must have either allele A or allele a in common with $R_1$. Taking each of these to be the shared allele with probability 1/2 (the weighting required for the three conditional probabilities to sum to one), and noting that the second allele in $R_2$ occurs independently of the shared allele and is A with probability $p$ or a with probability $q$, we find that $R_2$ is of genotype AA with probability $p/2$, of genotype Aa (or aA) with probability $(q + p)/2 = 1/2$, and of genotype aa with probability $q/2$. Hence,

$$P(R_2 = AA \mid R_1 = Aa) = \tfrac{1}{2}p, \qquad P(R_2 = Aa \mid R_1 = Aa) = \tfrac{1}{2}, \qquad P(R_2 = aa \mid R_1 = Aa) = \tfrac{1}{2}q.$$

Similarly,

$$P(R_2 = AA \mid R_1 = aa) = 0, \qquad P(R_2 = Aa \mid R_1 = aa) = p, \qquad P(R_2 = aa \mid R_1 = aa) = q.$$

To find the joint probability distribution of $R_1$ and $R_2$, we apply the multiplication law of probability (Miller, 1996), which states that for any two events $x$ and $y$,

$$P(x \cap y) = P(x)\,P(y \mid x).$$

Hence,

$$P(R_1 = AA,\, R_2 = AA) = p\,(p \cdot p) = p^3,$$

since the alleles in the genotype occur independently.

Oyeka et al. 7933

Similarly, the other probabilities are:

$$P(AA, Aa) = p^2 q, \quad P(AA, aa) = 0, \quad P(Aa, AA) = p^2 q, \quad P(Aa, Aa) = pq,$$
$$P(Aa, aa) = p\,q^2, \quad P(aa, AA) = 0, \quad P(aa, Aa) = p\,q^2, \quad P(aa, aa) = q^3.$$

These calculations yield the results of Table 1, which shows the joint probability distribution of $R_1$ and $R_2$, the marginal probability distributions, and the corresponding measurements on the character of interest in the population.
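The construction of the joint distribution can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and it assumes that when $R_1$ is of genotype Aa the shared allele is A or a with probability 1/2 each, which is what makes the conditional probabilities sum to one.

```python
from fractions import Fraction

GENOTYPES = ("AA", "Aa", "aa")


def conditional(p):
    """P(R2 = inner key | R1 = outer key) for relatives sharing an allele.

    Assumption: for R1 = Aa, the shared allele is A or a with probability 1/2.
    """
    q = 1 - p
    half = Fraction(1, 2)
    return {
        "AA": {"AA": p, "Aa": q, "aa": 0},
        "Aa": {"AA": half * p, "Aa": half, "aa": half * q},
        "aa": {"AA": 0, "Aa": p, "aa": q},
    }


def joint_distribution(p):
    """Joint P(R1, R2) via the multiplication law P(x and y) = P(x) P(y | x)."""
    q = 1 - p
    marginal = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}  # Hardy-Weinberg
    cond = conditional(p)
    return {
        (g1, g2): marginal[g1] * cond[g1][g2]
        for g1 in GENOTYPES
        for g2 in GENOTYPES
    }


p = Fraction(1, 4)  # illustrative allele frequency, not from the paper
joint = joint_distribution(p)
for g1 in GENOTYPES:
    print(g1, [str(joint[(g1, g2)]) for g2 in GENOTYPES])
```

Exact fractions are used so that identities such as the marginals of $R_2$ matching the Hardy-Weinberg proportions can be verified without rounding error.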
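Hence, from Table 1,

$$E(R_1) = p^2 x + 2pq\,y + q^2 z$$

and

$$E(R_2) = p^2 x + 2pq\,y + q^2 z. \tag{1}$$

Therefore, the expectation of $R_1$ is equal to that of $R_2$. The variance of the measurement on $R_1$, which is the same as that of the measurement on $R_2$, is given as

$$\mathrm{Var}(R_1) = p^2 x^2 + 2pq\,y^2 + q^2 z^2 - m^2, \tag{2}$$

where $m = E(R_1) = p^2 x + 2pq\,y + q^2 z$ from Equation (1).

The covariance between $R_1$ and $R_2$ is also calculated in the usual way from the table (Uche, 2004) as

$$S_{12} = p^3 x^2 + 2p^2 q\,xy + pq\,y^2 + 2p q^2\,yz + q^3 z^2 - m^2, \tag{3}$$

where $m = E(R_1) = E(R_2)$.

The correlation, $r$, between $R_1$ and $R_2$ is found by dividing Equation (3) by Equation (2), since the variance of $R_1$ is the same as that of $R_2$. Thus,

$$r = \frac{S_{12}}{\sqrt{\mathrm{Var}(R_1)\,\mathrm{Var}(R_2)}} = \frac{S_{12}}{\mathrm{Var}(R_1)}, \quad \text{since } \mathrm{Var}(R_1) = \mathrm{Var}(R_2).$$

Then, on simplifying Equation (3),

$$r = \frac{pq\,[\,p(x - y) + q(y - z)\,]^2}{p^2 x^2 + 2pq\,y^2 + q^2 z^2 - m^2}. \tag{4}$$

Note that since $p \ge 0$ and $q \ge 0$, the covariance $S_{12}$, and hence the correlation $r$, is a non-negative quantity and, for $0 < p < 1$, has the value zero only when

$$p = \frac{z - y}{x - 2y + z}, \tag{5}$$

provided $x$ and $z$ are both greater than $y$ or $x$ and $z$ are both less than $y$.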
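The moments above can be computed directly from the joint distribution. The sketch below is illustrative only: the function name `moments` and the numerical values of $p$, $x$, $y$, and $z$ are assumptions, and the joint probabilities are those implied by the derivation of Table 1 (with the 1/2 weighting for a heterozygous $R_1$). It assumes the variance is non-zero, that is, that $x$, $y$, and $z$ are not all equal.

```python
from fractions import Fraction


def moments(p, x, y, z):
    """Return (mean, variance, covariance, correlation) for relatives R1, R2."""
    q = 1 - p
    values = (x, y, z)  # measurements for genotypes AA, Aa, aa
    # Joint probabilities P(R1 = row, R2 = column), rows/columns: AA, Aa, aa.
    joint = (
        (p ** 3, p ** 2 * q, 0),
        (p ** 2 * q, p * q, p * q ** 2),
        (0, p * q ** 2, q ** 3),
    )
    marginal = [sum(row) for row in joint]
    m = sum(w * v for w, v in zip(marginal, values))
    var = sum(w * v * v for w, v in zip(marginal, values)) - m * m
    cov = sum(
        joint[i][j] * values[i] * values[j]
        for i in range(3)
        for j in range(3)
    ) - m * m
    return m, var, cov, cov / var


# Purely additive example: each A allele adds one unit to the measurement.
print(moments(Fraction(1, 2), 2, 1, 0))

# x and z both exceed y, so a critical p = (z - y) / (x - 2y + z) exists.
x, y, z = 3, 1, 2
p_critical = Fraction(z - y, x - 2 * y + z)
print(p_critical, moments(p_critical, x, y, z)[3])
```

In the second call the correlation vanishes at the critical allele frequency, which illustrates the condition under which $r = 0$ for $0 < p < 1$.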

EFFECT OF ALLELIC RECOMBINATION
Let us now examine what would happen to the measurements of the character of interest, and hence to the correlation, if we recombine the alleles by replacing one allele of a genotype by another allele. Specifically, and without loss of generality, suppose we replace an allele A by an allele a in a genotype determining a certain character in an individual. Under this allelic replacement model, the original individual must possess an A allele and hence must be of genotype AA or of genotype Aa. If the replacement is made in an AA individual, the individual becomes Aa and the measurement on the character of interest changes from $x$ to $y$, that is, by $y - x$; if the replacement is made in an Aa individual, the individual becomes aa and the effect is $z - y$.
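Interest is now in finding the differential effect of this allelic recombination on the measurement of the character concerned, and its significance. We propose to do this using the method of least squares. Let us, therefore, find the best estimates, in the least-squares sense, of the parameters $\mu$, $\alpha$, and $\beta$ that minimize the expected sum of squared deviations of $x$, $y$, and $z$ from $\mu + 2\alpha$, $\mu + \alpha + \beta$, and $\mu + 2\beta$, respectively, assuming random mating and subject to the constraint

$$p\alpha + q\beta = 0, \tag{6}$$

where $\alpha$ = the effect of allele A on the character of interest and $\beta$ = the effect of allele a on the character of interest.

The expected sum of squared deviations of the observed measurements from their values under the model, taken over the marginal probability distribution of Table 1, is

$$p^2 (x - \mu - 2\alpha)^2 + 2pq\,(y - \mu - \alpha - \beta)^2 + q^2 (z - \mu - 2\beta)^2, \tag{7}$$

where, from Equation (6), $\beta = -p\alpha/q$.

Differentiating Equation (7) with respect to $\mu$ and setting the derivative to zero, we have that

$$\hat{\mu} = p^2 x + 2pq\,y + q^2 z = m. \tag{8}$$

Also, differentiating with respect to $\alpha$ and setting the derivative to zero yields

$$\hat{\alpha} = q\,[\,p(x - y) + q(y - z)\,],$$

so that, from Equation (6), $\hat{\beta} = -p\,[\,p(x - y) + q(y - z)\,]$.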

Let $R^2$ denote the usual coefficient of determination in regression parlance, that is, the proportion of the total variance accounted for by the fitted model. From Equation (4), we have that

$$R^2 = 2r. \tag{17}$$

In other words, the proportion of total variance in the measurements on the character of interest that is accounted for by the effect of manipulation of the alleles is equal to twice the correlation between genotypic relatives in the absence of allelic recombination.
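The relation between the coefficient of determination and the correlation can be illustrated numerically. The sketch below is a hedged reconstruction rather than the authors' procedure: it assumes an additive model in which genotypes AA, Aa, and aa are fitted by $\mu + 2\alpha$, $\mu + \alpha + \beta$, and $\mu + 2\beta$ under the constraint $p\alpha + q\beta = 0$, and all function names and numerical values are hypothetical.

```python
from fractions import Fraction


def expected_ss(p, values, mu, alpha):
    """Expected squared deviation of (x, y, z) from the assumed additive fit."""
    q = 1 - p
    beta = -p * alpha / q  # enforces the constraint p*alpha + q*beta = 0
    weights = (p * p, 2 * p * q, q * q)  # Hardy-Weinberg marginal probabilities
    fits = (mu + 2 * alpha, mu + alpha + beta, mu + 2 * beta)
    return sum(w * (v - f) ** 2 for w, v, f in zip(weights, values, fits))


def least_squares_estimates(p, values):
    """Closed-form minimisers (mu, alpha, beta) of expected_ss."""
    x, y, z = values
    q = 1 - p
    mu = p * p * x + 2 * p * q * y + q * q * z
    a = p * (x - y) + q * (y - z)
    return mu, q * a, -p * a


def r_squared_and_r(p, values):
    """Coefficient of determination of the fit and correlation of relatives."""
    x, y, z = values
    q = 1 - p
    mu, alpha, _ = least_squares_estimates(p, values)
    total_var = p * p * x * x + 2 * p * q * y * y + q * q * z * z - mu * mu
    residual = expected_ss(p, values, mu, alpha)
    r_squared = 1 - residual / total_var
    # Covariance between relatives in closed form (assumed joint distribution).
    cov = p * q * (p * (x - y) + q * (y - z)) ** 2
    return r_squared, cov / total_var


p, values = Fraction(1, 4), (5, 3, 2)  # illustrative inputs only
r_squared, r = r_squared_and_r(p, values)
print(r_squared, r, r_squared == 2 * r)
```

Because the arithmetic is exact, the final comparison tests the identity itself rather than agreement up to floating-point tolerance. The sketch assumes the total variance is non-zero.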

TESTING THE SIGNIFICANCE OF THE FIT OF THE REGRESSION MODEL AND THE CORRELATION BETWEEN GENOTYPIC RELATIVES
One may be interested in testing the hypothesis that the regression model fits, that is, testing whether the differential effects of the allelic recombination on the character of interest are statistically different from zero. To test the null hypothesis $H_0$ of no differential effects, we may use the analysis of variance method (Montgomery and Peck, 1992). The three sums of squares, their associated degrees of freedom, their mean squares, and the resulting F-ratio are summarised in Table 2.
The F-ratio, the ratio of the regression mean square to the error mean square,

$$F = \frac{\text{regression mean square}}{\text{error mean square}}, \tag{18}$$

has an F-distribution with $(2, n - 3)$ degrees of freedom and may be compared, at significance level $\alpha$, with the tabulated critical F-value to test whether the regression model fits. If

$$F > F_{1-\alpha;\,(2,\,n-3)},$$

we reject the null hypothesis of no differential effects of allelic recombination on the genotypic relatives.

We may also wish to test the null hypothesis that allelic recombination has no significant effect on the correlation between genotypic relatives, that is, that the population correlation coefficient $\rho$ due to allelic recombination of genotypic relatives is zero, against the alternative hypothesis that $\rho$ is different from zero. Note that testing that the regression model fits is equivalent to testing the hypothesis that, in the sampled population, $Q = R^2 = 2r \neq 0$. The significance of the sample estimates of these population parameters is tested using the usual F-test of Equation (18). Rejection of the null hypothesis implies that, in the population, $Q = R^2 \neq 0$ or, equivalently, $Q = 2r \neq 0$, so that $r$ is not equal to zero in the population sampled. Hence the usual F-test provides a test statistic for testing the null hypothesis that the correlation between genotypic relatives is zero against the alternative hypothesis that it is different from zero. The null hypothesis is rejected at an appropriately chosen significance level $\alpha$.
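The decision rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' computation: the function names are hypothetical, the sums of squares and sample size are made-up numbers, and the critical value is supplied by the user from a tabulated F-distribution, as the text describes.

```python
def anova_f_ratio(ss_regression, ss_error, n):
    """F-ratio for the fitted model with (2, n - 3) degrees of freedom."""
    if n <= 3:
        raise ValueError("n must exceed 3 so that the error df is positive")
    df_regression = 2
    df_error = n - 3
    ms_regression = ss_regression / df_regression  # regression mean square
    ms_error = ss_error / df_error                 # error mean square
    return ms_regression / ms_error, df_regression, df_error


def reject_null(f_ratio, critical_value):
    """Reject H0 when the F-ratio exceeds the tabulated critical value."""
    return f_ratio > critical_value


# Hypothetical sums of squares and sample size, for illustration only.
f_ratio, df1, df2 = anova_f_ratio(ss_regression=40.0, ss_error=60.0, n=33)

# Critical value read from an F table for level 0.05 with (2, 30) degrees
# of freedom, rounded to two decimals.
critical_value = 3.32
print(f_ratio, (df1, df2), reject_null(f_ratio, critical_value))
```

Keeping the critical value as an input avoids tying the sketch to any particular statistics library; a library quantile function for the F-distribution could be substituted where one is available.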

CONCLUSION
This paper provided estimates of the correlation between genotypic relatives both in the presence and in the absence of allelic recombination. It is shown that the correlation between genotypic relatives in the absence of allelic recombination is double the correlation between genotypic relatives in the presence of allelic recombination. The correlation obtained is a non-negative quantity and, except in trivial cases ($p = 0$ or $1$), assumes the value zero only at a critical value of $p$. The significance of the correlation obtained and of the regression model fitted is tested using the analysis of variance technique.
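Replacing $\mu$, $\alpha$, and $\beta$ in the expected sum of squared deviations by their estimates, respectively, and simplifying yields the minimum value $\mathrm{Var}(R_1) - 2pq\,[\,p(x - y) + q(y - z)\,]^2$. The right-hand side of this expression reduces to $p^2 q^2 (x - 2y + z)^2$.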

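Table 1. Joint probability distribution of R1 and R2, with marginal probabilities and the corresponding measurements.

R1 \ R2      AA (x)     Aa (y)     aa (z)     Marginal
AA (x)       p^3        p^2 q      0          p^2
Aa (y)       p^2 q      pq         p q^2      2pq
aa (z)       0          p q^2      q^3        q^2
Marginal     p^2        2pq        q^2        1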

Table 2. Analysis of variance table for the hypothesised regression model.