The aim of this study was to compare iterative and direct solvers for estimating marker effects in genomic selection. One iterative and two direct methods were used: Gauss-Seidel with Residual Update, Cholesky decomposition and Gentleman-Givens rotations. To mimic different scenarios with respect to the number of markers and of genotyped animals, a simulated data set divided into 25 subsets was used. The number of markers ranged from 1,200 to 5,925 and the number of animals from 1,200 to 5,865. The methods were also applied to real data comprising 3,081 individuals genotyped for 45,181 SNPs. Results from simulated data showed that the iterative solver was substantially faster than the direct methods for larger numbers of markers. A direct solver, however, may allow (co)variances of SNP effects to be computed. When applied to real data, the performance of the iterative method varied substantially, depending on how ill-conditioned the coefficient matrix was. Judging from the results with real data, Gentleman-Givens rotations would be the method of choice in this particular application, as it provided an exact solution within a reasonable time frame (less than two hours). It would be the preferred method whenever computer resources allow its use.
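As an illustration of the iterative solver named above, the following is a minimal sketch of Gauss-Seidel with Residual Update for a ridge-type marker model, y = Xg + e, solving (X'X + lambda I)g = X'y without ever forming X'X. The function name `gsru`, the shrinkage parameter `lam`, the convergence rule, and the omission of an overall mean are assumptions made for this example; the abstract does not specify the model or the implementation used in the study.

```python
def gsru(X, y, lam, max_iter=1000, tol=1e-20):
    """Gauss-Seidel with Residual Update (illustrative sketch).

    X   : list of rows, one per animal; each row holds marker covariates
    y   : list of phenotypes
    lam : assumed ridge/shrinkage parameter added to the diagonal of X'X
    Returns the marker-effect estimates and the number of sweeps performed.
    """
    n, m = len(y), len(X[0])
    # Store markers column-wise so each update touches a single column.
    cols = [[X[i][j] for i in range(n)] for j in range(m)]
    xtx = [sum(v * v for v in c) for c in cols]  # diagonal of X'X
    g = [0.0] * m          # marker effects, started at zero
    e = list(y)            # residuals y - Xg for the starting values
    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        change = 0.0
        for j in range(m):
            c = cols[j]
            # Right-hand side for marker j, rebuilt from current residuals.
            rhs = sum(ci * ei for ci, ei in zip(c, e)) + xtx[j] * g[j]
            new = rhs / (xtx[j] + lam)
            delta = new - g[j]
            # Residual update: keep e equal to y - Xg after the change.
            for i in range(n):
                e[i] -= c[i] * delta
            g[j] = new
            change += delta * delta
        if change < tol:   # stop when a full sweep barely moves g
            break
    return g, sweeps
```

For a toy case with `X = [[1, 0], [0, 1], [1, 1]]`, `y = [1, 2, 3]` and `lam = 1`, the normal equations are 3a + b = 4 and a + 3b = 5, so the sketch should converge to roughly a = 0.875 and b = 1.375. Each sweep costs on the order of (animals x markers) operations, which is consistent with the abstract's observation that the iterative solver scales better as markers increase, while convergence speed depends on the conditioning of the coefficient matrix.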
The problem of multicollinearity in regression analysis was studied. Ridge regression (RR) techniques were used to estimate parameters affecting the performance of crossbred calves raised in tropical and subtropical regions, using a model that included additive, dominance, joint additive (or "profit heterosis") and epistatic effects, together with their interactions with latitude, in an attempt to model genotype by environment interactions. Software was developed in Fortran 77 to perform five variants of RR: the originally proposed method; the method implemented by SAS; and three methods of weighting the RR parameter lambda. Three mathematical criteria for choosing the value of lambda were tested: the sum of the absolute Student t-values, the harmonic mean of the absolute Student t-values, and the value of lambda at which all variance inflation factors (VIF) fell below 300. Prediction surfaces obtained from the estimated coefficients were used to compare the five methods and three criteria. It was concluded that RR could be a good alternative for overcoming multicollinearity problems. For all the methods tested, acceptable prediction surfaces could be obtained when the VIF criterion was employed. This mathematical criterion is thus recommended as an auxiliary tool for choosing lambda.
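The VIF criterion described above can be sketched as a simple grid search. The example below assumes the ridge VIFs are the diagonal elements of (R + lambda I)^-1 R (R + lambda I)^-1, where R is the correlation matrix of the predictors (the definition commonly attributed to Marquardt); the abstract does not state which formula or which grid of lambda values was used, so the function names, the grid, and this definition are all assumptions for illustration.

```python
def invert(A):
    """Invert a small square matrix by Gauss-Jordan with partial pivoting."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def ridge_vifs(R, lam):
    """Ridge VIFs: diagonal of (R + lam*I)^-1 R (R + lam*I)^-1 (assumed form)."""
    n = len(R)
    shifted = [[R[i][j] + (lam if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    inv = invert(shifted)
    V = matmul(matmul(inv, R), inv)
    return [V[i][i] for i in range(n)]


def smallest_lambda(R, grid, threshold=300.0):
    """Return the smallest lambda in `grid` with every VIF below `threshold`."""
    for lam in sorted(grid):
        if max(ridge_vifs(R, lam)) < threshold:
            return lam
    return None  # no value on the grid satisfies the criterion
```

With two predictors correlated at 0.999, for instance, the ordinary VIF at lambda = 0 is about 500, and a small amount of shrinkage is enough to bring it under the threshold of 300. The threshold is exposed as a parameter because the value 300 comes from this particular study rather than from a general rule.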
The objectives of this work were to compare estimates of genetic parameters obtained with two models, one containing only additive and dominance effects and another that also included joint-additive (complementarity) and epistatic effects, and to test alternative objective criteria for determining the lambda coefficient when applying ridge regression. The results showed that the choice of a criterion for determining lambda in ridge regression depends not only on the data set and the model used but, above all, on prior knowledge of the phenomenon studied and on the practical meaning and interpretation of the estimated parameters. With more complete models for evaluating genetic effects in beef cattle, the contribution of joint-additive and epistatic effects, which are embedded in the heterosis effect estimated by simpler models, can be identified. Ridge regression is a tool that makes it possible to obtain these estimates even in the presence of strong multicollinearity.
The purpose of this study was to compare estimates of genetic effects obtained using the additive-dominance model with those from a model that also included parameters for joint-additive (complementarity) and epistatic effects, and to evaluate alternative objective criteria for choosing the lambda coefficient in ridge regression. The results indicated that the criterion used to choose lambda depends not only on the data set and the model used, but also on prior knowledge of the phenomenon under study and on the practical interpretation of the estimated coefficients. When genetic effects are evaluated with a model that considers effects other than additive and dominance, it may be possible to identify and separate joint-additive and epistatic effects, which are usually embedded in the heterotic effect estimated by the additive-dominance model. Ridge regression can make such analyses possible even under strong multicollinearity.