Estimation of genetic parameters for first-lactation test-day milk yield in Holstein Friesian cows fitting random regression models

In this study, the effects of varying order of Legendre polynomials (LP) for permanent environmental (Pe) variance structure on the estimates of genetic parameters for first-lactation milk yield were evaluated fitting random regression (RR) test-day animal models. The total data set included 6850 test-day milk yield records from 800 first-lactation Holstein Friesian cows that calved between 1997 and 2013 and the pedigree file with total of 1779 animals. Four different random regression models (RR1, RR2, RR3 and RR4) all with second order LP for additive genetic effects but with varying order (intercept, 1 st , 2 nd and 3 rd , respectively) of LP for modelling the Pe variances structure were tested for estimation of variance components and corresponding genetic parameters for milk yield. Variance components were estimated by average information restricted maximum likelihood method. The performances of competing RR models in the estimation of variance components were compared using estimates of log-likelihoods and the size of residual variances. Results showed that the estimates of log-likelihoods were higher and residual variances were lower for models that fitted second (RR3) and third order (RR4) LP for Pe effects. Heritability (h 2 ) and genetic correlations from RR3 and RR4 models ranged from 0.13 to 0.29 and 0.45 to 0.98, respectively. Models with lower order fits (RR1 and RR2) with either a constant or a linear term for Pe resulted in oscillatory trend for variance components and highly erratic h 2 estimates ranged from 0.18 to 0.52. Genetic correlations from these models were also implausible biologically indicating that models with lower order fits for the Pe effects were not robust enough to accurately model the variance structure at different stages of lactation. It is therefore suggested that at least a second or higher order polynomial fits are needed to model the Pe variance structure for the accurate estimation of genetic parameters for milk yield in first-lactation Ethiopian Holsteins.


INTRODUCTION
Milk yield is one of the most important traits in dairy cattle. Genetic improvement in livestock is a particular cost effective technology, producing permanent and cumulative changes in performance (Wall et al., 2010).
Improving milk yield through genetic selection is therefore a sustainable means of increasing farm profitability. Today, test-day models have been used for the analysis of longitudinal data or repeated records of production traits worldwide in the evaluation of the genetic merit of dairy cows (Jensen, 2001;Interbull, 2012). Random regression model (RRM) is one of the test-day models used widely for genetic analysis of production traits in dairy cattle. The use of test-day records fitting RRM has several benefits including ability to account for environmental factors that affect cows at different stages of lactation at the time of test (Jamrozik and Schaeffer, 1997;Swalve, 2000;Jensen, 2001) and account for individual differences in the shape of lactation curves, which includes the persistency of the lactation (Reents, 1996;Pool et al., 1999). The RRMs also permit the use of incomplete lactations and the subsequent inclusion of a large amount of data from the same animal in genetic evaluations improving the accuracy of estimation of animal's genetic merit (Bignardi et al., 2011). In RRMs, there are different functions that can be used to model genetic and non-genetic animal variations across the lactation trajectory. Among those the most widely used are parametric functions and non-parametric functions like Legendre polynomials. Kirkpatrick et al. (1994) demonstrated the use of Legendre polynomials in modeling the covariance structure of test-day records along time trajectories for the random effects. Legendre polynomial function has the computational advantage of reducing the correlation of estimated coefficients, thus facilitating convergence (Schaeffer and Jamrozik, 2008).
Legendre polynomials of varying orders could be used to model the trajectories of random animal genetic and permanent environmental variations. Studies conducted fitting permanent environmental and additive genetic effects with varying order of Legendre polynomials at a time (Pool et al., 2000;Cobuci et al., 2005;Liu et al., 2006;El Faro et al., 2008;Torshizi et al., 2013) indicated that modelling the permanent environment is variable and dependant on type of data and the population in question. For instance, Pool et al. (2000), El Faro et al. (2008 and Torshizi et al. (2014) indicated that higher order of permanent environmental regression coefficients over the additive genetic regression coefficient was best for data from Dutch Holstein Friesian, Caracu and Iranian Holstein cow data, respectively. Cobuci et al. (2005) emphasized the importance of the including random regression coefficients in the model for the description of the variation in the permanent environmental effects. Jamrozik and Schaeffer (2002) and Strabel et al. (2003) reported that RRM with orthogonal polynomials outperform models with lactation curve functions having the same number of parameters for additive genetic and permanent environmental effects. Currently, however, there is little information on how to model the additive genetic and permanent environmental variations using the Legendre polynomials functions particularly for dairy cattle data from tropical environments. Therefore, the objectives of this study were to evaluate varying order of Legendre polynomials for modelling additive genetic and permanent environmental effects and to estimate genetic parameters for test-day milk yield based on data from Holstein Friesian cows in Ethiopia.

Data
In this study, the data set contained 6,850 test-day milk yield records from 800 first-lactation Holstein Friesian cows which were daughters of 149 sires that calved between 1997 and 2013 and belonging to two herds. The pedigree file had 1779 animals of which 800 animals had their own records. The data were obtained from the Ethiopian national dairy cattle milk recording database. Only test-day records between 5 and 305 days in milk (DIM) were used for the estimation of variance components. In addition, records of cows with age at calving less than 20 months or greater than 54 months were excluded. Age at calving was grouped into five classes (in months). These included cows less than 27, 28 to 33, 34 to 39, 40 to 45 and above 45 months of age at first calving. The data was edited to ensure that cows have at least five test-day records and the calving months were divided into three seasons (October-February, March-May and June -September).

Models
Four different RRMs were identified for the estimation of variance components and genetic parameters for test-day records of firstlactation Holstein Friesian cows. The RR1 model was a RRM where second order Legendre polynomials was used for modelling the additive genetic variance and only the intercept term was used to model common permanent environment associated with all test-day yields of a cow.
The RR2, RR3 and RR4 were RRMs where Legendre polynomials of second order were used for additive genetic effects whilst first, second and third order was used for modeling the permanent environmental effects, respectively. The models for RR2, RR3 and RR4 can be described as follows:  = vector with fixed regressions coefficient specific to calving season subclass j and measured on DIM (d); = fixed effect of age at calving ; = random sire*calving year interaction; ℎ = random herd test month effect; = random permanent environmental effects for RR1 models; = vector with random permanent environmental random regression coefficients specific effects of cow n; = vector with additive genetic random regression coefficients specific to the animal effect of cow m and = residual effect.
The lactation curve was modeled with fixed regression by fitting a combination of Legendre polynomials and Wilmink functions and it was the same for all models: Where, C0 C1C2 C3 represents coefficients of the third order orthogonal Legendre polynomials at DIM d plus a Wilmink function (Wilmink, 1987). In the Wilmink function, the exponential term (w) is generally considered as a constant and in this study a value of -0.05 was used. The use of Wilmink function with the combination of Legendre polynomials for modelling lactation curves for milk yield traits has been reported by different authors (Lidauer and Mäntysaari, 1999;Lidauer et al., 2003;Negussie et al., 2008;Santos et al., 2013) in several different dairy cow populations.

Estimation of genetic parameters
Variance components were estimated by average information restricted maximum likelihood method using DMU program (Madsen and Jensen, 2013). Convergence of solutions was Meseret et al. 27 determined when the difference between the right hand side and left hand side was less than 10 -6 . Genetic parameters were estimated using the estimated variance components for the different models. The performances of the different competing models were compared using the estimates of the log-likelihoods and size of residual variances. The additive genetic and permanent environmental variances were estimated as: and Where, G and P = matrices of additive genetic and permanent environment random regression coefficients, respectively; = (co) variables related to a specific test-day measured during DIM d.
Heritability for a particular DIM d in lactation was calculated by dividing the estimated genetic variance by the sum of permanent environmental variances ( , genetic variances ( and residual variances ( for that particular DIM.
Genetic correlations: The genetic correlation between two days in lactation and was calculated by dividing the additive genetic covariance between days and by the product of the square root of the genetic variances of the days and .

Overall mean
Phenotypic means, standard deviations and coefficient of variation for test-day milk yield are given in Table 1. The overall average test-day milk yield was 11.1 kg. Milk yield peaked at around 35 DIM and then declined as the lactation progressed. The peak milk yield was reached earlier at about 3 to 4 weeks in early lactation which subsequently decreased as lactation progressed.

Comparison of model performances
Different models with the same fixed and random additive genetic effects but fitting either intercept, first, second or third order of Legendre polynomials for permanent environment were compared based on estimates of loglikelihoods and residual variances ( Table 2). The result showed that with the increase in the order of Legendre polynomials for modelling the permanent environmental  Table 2. Estimates of log-likelihoods and residual variances for RR1, RR2, RR3 and RR4 models. effects there was a decrease in the residual variances and an increase in the log-likelihood estimates. The RR4 model fitting third order of Legendre polynomials for the permanent environmental effects had the lowest residual variances and highest log-likelihood estimates. This is the most preferred model as compared to the others and the least performing model was RR1 that considered a common permanent environmental effect for all test-day milk yields across the lactation trajectory. Similarly, Rekaya et al. (1999) and Cobuci et al. (2005) worked on first-lactation Holstein cows fitting a common permanent environmental effect that resulted in very large heritability estimates and negative genetic correlations between testday records which they concluded as one of the least performing model for estimation of genetic parameters. In addition, Liu et al. (2000) and Torshizi et al. (2013) comparing different RRMs indicated that fitting higher order of random regression coefficients for the permanent environment than to the additive genetic was selected as the best for modelling milk yield in first-lactation dairy cows.

Estimates of variance components
Estimates of additive genetic and permanent environmental variances for milk yield during firstlactation are in Figure 1 and 2. The trend of additive genetic variances from RR1 and RR2 models were not very clear and it looks that they were slightly over estimated. Estimates of additive genetic variances from the simplest RR1 model ranged between 1.92 to 6.27 kg 2 and it was much higher than those from the other models. Marked decrease in the course of the genetic variance curve was observed for the first 30 days of lactation, suggesting that the models were less robust to describe the genetic variance in milk yield during the early lactation period (Cobuci et al., 2005). Rekaya et al. (1999) and Cobuci et al. (2005) have also reported that additive genetic variances estimated by the RRM when the permanent environmental effect was fitted only the intercept term were greater than that estimated by a model with equal random regression coefficients for both additive genetic and permanent environmental effects. The genetic variances estimated from RR2 model ranged  from 1.20 to 2.84 kg 2 and this was also higher than the estimates from RR3 (1.21 to 1.74 kg 2 ) and RR4 (1.21 to 1.78 kg 2 ) models.
On the other hand, the estimates from the RR3 and RR4 models followed the same trend with slight ifferences observed in the magnitude of additive genetic variances. The additive genetic variances from RR3 and RR4 models were highest in mid lactation (between 185 to 245 DIM) and estimates were generally lower during the early stages of lactation. A much more similar additive genetic variance trend has also been reported by Torshizi et al. (2013). This is, however, on the contrary to Cobuci et al. (2005) who reported a higher additive genetic variance towards the beginning and end of lactation. The highest estimates of permanent environmental variances were observed in RR4 model at the beginning of lactation. The estimates from RR3 followed closely the estimates from RR4. On the other hand, the permanent environmental variance estimates from RR1 model were higher than all the other models along the entire course of lactation except the early lactation.

Heritability
Estimates of heritabilities for the four RRMs are in Table  3 and Figure 3. The estimates of daily heritabilities showed that varying the combinations of order of Legendre polynomials for modelling additive genetic and permanent environmental clearly influenced the resulting parameter estimates. The inclusion of either the intercept term (RR1) or first order Legendre polynomials (RR2) for modelling the permanent environment variation resulted in wiggly heritability estimates along the trajectory with the highest estimates at the beginning and end of lactation. Similarly, Jamrozik and Schaeffer (1997) reported that daily heritability estimates with higher values at both beginning and end of lactations (border or wave effect) was the features of a RRM, in which the permanent Meseret et al. 29 environmental effect was constant during lactation. Similar results have also been reported by Rekaya et al. (1999) and Cobuci et al. (2005) for Holstein Friesian cattle population. In general, the assumption of constant permanent environment and modelling permanent environment with order of Legendre polynomials less than additive genetic effects across lactation did not permit the differentiation between variance estimates correctly. Therefore, genetic variances were overestimated and resulted in heritability estimates that are higher than those from the other models.
On the other hand, heritabilities estimated from the RR3 and RR4 models had a similar trend with slightly increasing heritability estimates followed by a slight decline towards the end of lactation which is quite expected. In these models heritability estimates along the lactation trajectory had a similar shape to the genetic variation but were less extreme at the beginning and end of the trajectory because of higher permanent environmental variances. The heritability estimates from RR3 and RR4 models were consistent and more plausible biologically. Various studies have also reported similar trends for the heritability curve with lower values at the beginning and an increase across lactation followed by slight decline towards the end of lactation (Liu et al., 2000;Druet et al., 2003;Negussie et al., 2008, Abdullahpour et al., 2010Cobuci et al., 2011). On the contrary, Strabel et al. (2003) working in polish black and white cows reported that models with different order polynomials for additive genetic and permanent environmental effects resulted undesired shape of heritability curves.

Genetic correlations
Estimated genetic correlations between milk yields at different test-days from the different RRMs are given in Figure 4. In the RR1 model, genetic correlations ranged from -0.06 to 0.96 and in RR2 from -0.02 to 0.90. These estimates from the RR1 and RR2 seems not reasonable biologically and are in contrast to literature reports by Lidauer et al. (2003), El Faro et al. (2008, Bignardi et al. (2011) and Cobuci et al. (2011). On the other hand, the model with equal order (RR3) or higher order (RR4) of Legendre polynomials for fitting permanent environment than for additive genetic effects gave biologically reasonable genetic correlations estimates. Genetic correlations between test-days from RR3 and RR4 in general ranged from 0.37 to 0.96 and 0.38 to 0.96, respectively. Cobuci et al. (2005), Santos et al. (2013) and Torshizi et al. (2013) reported genetic correlations between test-days that were higher than 0.06, 0.47 and 0.51, respectively. In our study, genetic correlation estimates from RR3 and RR4 models were in general positive between different test-days and were higher for adjacent test-days and decreased with increasing distance between test-days. In all cases, towards the extreme part of the lactation, a slight increase in the genetic correlations was observed. This might be due to lower number of observations at the extreme parts of the lactation and the assumption of homogenous residual variances across different stages of lactation. Overall, the RR4 model gave better estimates of genetic correlations especially at the extreme periods of lactation and the unexpected slight increment towards the end of lactation was not pronounced. Santos et al. (2013) has also reported slight increment of genetic correlations at the ends of lactation.

Conclusions
Determining the order of Legendre polynomial fits in testday models is crucial for the accurate estimation of genetic parameters. The results from this study clearly showed that a constant or lower order Legendre polynomial fits for permanent environment than for the additive genetic effects resulted in highly erratic variance component estimates, particularly at both extremes of the lactation curve. As a result, RR1 and RR2 models, predicted basically different shape of variance components implying that they are less robust in describing genetic and environmental variances across the different stages of lactation. This could be due to the inability of such models to differentiate between the variance estimates correctly across different stages of lactation resulting in unrealistic heritability estimates. Therefore, the indications from this study are that at least a second or higher order Legendre polynomial fits are needed to model the permanent environmental variance structure for the accurate estimation of genetic parameters for milk yield in first-lactation Ethiopian Holstein Friesian.