This article presents an application of differential item functioning (DIF) analysis to results from the placement test developed by Jorge Tadeo Lozano University. Students take the test so that they can be classified by level of mathematical knowledge and assigned an academic route according to the cognitive status the test reveals. The analysis estimates the difficulty and other characteristics of the items, as well as the skills and level of the students, using the Rasch model, the one-parameter model of item response theory (IRT), on a sample of 1623 students who took a test composed of 61 items. The article analyzes the statistical performance of the individual items in terms of item-test correlation, misfit (infit and outfit), and discrimination, and the behavior of the item set as a whole in terms of construct validity or dimensionality, reliability, internal consistency, and separation. A method is then shown for identifying items that display DIF, that is, items whose behavior is associated with the students' background rather than with their academic ability, which could bias the test results. The methods employed estimate the relative difficulty of each item for students of similar ability who belong to different groups, defined by the four variables studied: sex, age, intended major, and whether the high school of origin is public or private. The difference in relative difficulty between groups is mapped to a level of DIF, which indicates whether the item is biased and which group the bias favors. 
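For reference, under the Rasch model the probability of a correct response is a logistic function of the gap between person ability and item difficulty, both expressed in logits. The following is a minimal Python sketch; the function name and the example values are illustrative and are not taken from the study.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model.

    Both arguments are in logits. The probability depends only on
    the difference (ability - difficulty).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Illustrative values only (not estimates from the test analyzed here).
p_matched = rasch_probability(ability=0.0, difficulty=0.0)  # ability equals difficulty
p_harder = rasch_probability(ability=0.0, difficulty=1.0)   # item one logit harder
```

When ability equals difficulty the probability is 0.5, and it falls as the item becomes harder relative to the person. DIF analysis asks whether the difficulty term differs between groups for persons of the same ability.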
The difference in relative difficulty is graded for severity according to three categories proposed by the Educational Testing Service: (1) moderate to large, if the difference in relative difficulty between groups (for students of similar ability) is greater than or equal to .64 logits; (2) small to moderate, if the difference is greater than or equal to .43 and less than .64 logits; and (3) not significant, if the difference is less than .43 logits. To validate the detection of DIF, the calculations are performed with three techniques. Two are taken from the literature, and the third is proposed by the authors of this article to account for the size of the error in the estimated difficulty difference. The three techniques are: (1) the difference between the central values of the difficulty intervals, ignoring the estimation error; (2) the difference between the nearest extremes of the difficulty intervals, taking the estimation error into account; and (3) the Mantel-Haenszel statistical test. Two decisions were made when building the databases for the analysis: (1) response strings with missing data, or belonging to groups too small to allow an effective and reliable comparison, were omitted; and (2) random samples were drawn with uniform probability to create groups of equal size for each study variable. The technique based on the difference between central values showed that two items (34 and 59) displayed DIF of moderate to large severity: item 34 for the age variable, and item 59 for the intended major and high school of origin variables. The technique based on the difference between the nearest extremes confirmed DIF of moderate to large severity only for item 34, with respect to the age variable. 
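The first two techniques and the severity grading can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each difficulty interval is taken to be the estimate plus or minus one reported error (the abstract does not state the interval width), and all function names and example values are hypothetical.

```python
import math


def ets_category(contrast: float) -> str:
    """Grade a difficulty difference (in logits) with the thresholds cited above."""
    size = abs(contrast)
    if size >= 0.64:
        return "moderate to large"
    if size >= 0.43:
        return "small to moderate"
    return "not significant"


def center_contrast(d1: float, d2: float) -> float:
    """Technique 1: difference between central values, ignoring estimation error."""
    return d1 - d2


def nearest_extremes_contrast(d1: float, e1: float, d2: float, e2: float) -> float:
    """Technique 2: signed gap between the nearest ends of the two intervals.

    Assumption: each interval is [d - e, d + e]. Overlapping intervals
    give a contrast of zero.
    """
    gap = max(abs(d1 - d2) - (e1 + e2), 0.0)
    return math.copysign(gap, d1 - d2)


# Hypothetical estimates for one item in two groups (not data from the study).
by_centers = ets_category(center_contrast(1.20, 0.30))
by_extremes = ets_category(nearest_extremes_contrast(1.20, 0.10, 0.30, 0.12))
```

Because the second technique subtracts both estimation errors from the gap, it never reports a larger contrast than the first, which is consistent with it confirming fewer items.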
The Mantel-Haenszel test detected DIF with moderate to large severity for items 13, 20, 34, and 61 for the age variable, and for items 4, 30, 36, 43, and 59 for the major variable.
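As an illustration of the third technique, the Mantel-Haenszel common odds ratio can be computed from one 2x2 table per ability stratum, and its natural logarithm is on the logit scale used by the thresholds above. The Python sketch below is a simplified illustration, not the authors' implementation; the stratum layout, the function name, and the example counts are assumptions.

```python
import math


def mantel_haenszel_log_odds(strata):
    """Mantel-Haenszel common log-odds ratio across ability strata.

    `strata` is an iterable of tuples
    (ref_correct, ref_wrong, focal_correct, focal_wrong),
    one per stratum of examinees with similar ability.
    A positive result means the item is easier for the reference group.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        num += a * d / n
        den += b * c / n
    if num == 0.0 or den == 0.0:
        raise ValueError("odds ratio is undefined for these counts")
    return math.log(num / den)


# Hypothetical counts for two ability strata (not data from the study).
example_strata = [(30, 10, 20, 20), (40, 5, 35, 10)]
log_odds = mantel_haenszel_log_odds(example_strata)
```

Stratifying by ability is what lets the statistic compare students of similar ability across groups, so that a real difference in ability between groups is not mistaken for item bias.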
Jorge Tadeo Lozano University administers the Basic Mathematics Placement Test (Examen de Clasificación en Matemáticas Básicas) as a diagnostic assessment to applicants and to students arriving through internal or external transfers whose curriculum requires basic knowledge of arithmetic and elementary algebra. The test supports the analysis of the academic conditions of admitted students and allows the University to offer options appropriate to each particular case, while giving examinees the opportunity to recognize their level of mastery of the required conceptual domains. Given the decisive nature of the Basic Mathematics Placement Test, the items used were examined for differential functioning; that is, the study analyzed whether the difference in ability among examinees could be attributable to the selected context variables: sex, age, legal status (public or private) of the high school of origin, and the school to which the applicant is seeking admission. To this end, 1,623 response strings for 61 items were processed, obtained from the tests administered between the third academic term of 2011 and the first term of 2012. The methodology involved three techniques: DIF contrast (difference between the centers of the difficulties), DIF contrast (difference between the nearest extremes of the difficulty intervals), and the Mantel-Haenszel statistical test. Taken together, these techniques identified one item with differential functioning in the moderate to large category, for the age variable. Finally, the statistical parameters and the characteristic curve of this item, as estimated in the calibration, are presented.