Abstract: Establishing learning outcomes and the system for monitoring and assessing their achievement is an essential aspect of planning and organising the teaching-learning process, and also a crucial function of university teaching staff. In addition, it is a key activity for giving coherence to curriculum design in higher education based on constructive alignment. This study presents an analysis and assessment of the descriptions of the following curricular elements in the university master’s degree programmes: learning outcomes and assessment methods and instruments. Employing a textual and content analysis, 9419 descriptions of learning outcomes and 6729 descriptions of assessment methods and instruments have been analysed, which correspond to 89 master’s programmes in the branch of Social Sciences and Law taught in six Spanish universities in different autonomous regions. Textual analysis was performed with the Xplortext software. For the content analysis, firstly, an ad hoc evaluation instrument (ANVALDOC) was designed and, secondly, a computer tool (CORAMeval) was developed to implement and use the scale. The results show the association between the language used and the university of origin or the discipline in which the degree is contextualised. Likewise, there is a clear difference between universities and disciplines in terms of the quality of the learning outcome descriptions, assessed in terms of correctness, verifiability, authenticity, or underlying cognitive process. Moreover, these differences are maintained in the correctness and authenticity of the assessment methods and instruments.
Keywords: higher education, learning outcomes, educational assessment, performance assessment.
Resumen: Determinar los resultados de aprendizaje y el sistema para el seguimiento y evaluación de la consecución de estos constituye uno de los aspectos fundamentales de la planificación y organización del proceso de enseñanza-aprendizaje, siendo igualmente una de las funciones esenciales que desempeña el profesorado universitario. Así mismo, se trata de una actividad básica para dar coherencia en la educación superior a un diseño curricular basado en el alineamiento constructivo. En este estudio se presenta un análisis y valoración de las descripciones realizadas en las memorias de verificación de títulos universitarios de máster de los siguientes elementos curriculares: resultados de aprendizaje y medios e instrumentos de evaluación. Mediante un análisis textual y de contenido se han analizado 9419 descripciones de resultados de aprendizaje y 6729 de medios e instrumentos de evaluación, que se corresponden con las memorias de 89 títulos de máster de la rama de ciencias sociales y jurídicas impartidos en seis universidades españolas de diferentes regiones autónomas. El análisis textual se ha realizado con el software Xplortext. Para el análisis de contenido se ha diseñado, en primer lugar, un instrumento de evaluación ad hoc (ANVALDOC) y, en segundo lugar, se ha desarrollado una herramienta informática (CORAMeval) para la implementación y uso de la citada escala. Los resultados muestran la asociación existente entre el lenguaje utilizado y la universidad de procedencia o el ámbito de conocimiento en el que se contextualiza el título. Así mismo, se evidencia una clara diferencia según las universidades y ámbitos en cuanto a la calidad de las descripciones de los resultados de aprendizaje, valorada en términos de corrección, verificabilidad, autenticidad o proceso cognitivo subyacente. Igualmente, estas diferencias se mantienen en la corrección y autenticidad de los medios e instrumentos de evaluación.
Palabras clave: educación superior, resultados de aprendizaje, evaluación formativa, evaluación sumativa.
Estudios
The challenge to design and assess learning outcomes in higher education
El reto del diseño de los resultados de aprendizaje y su evaluación en educación superior
Received: 04 September 2023
Accepted: 08 April 2024
Published: 07 January 2025
One critical and essential role of universities is to design programmes and, consequently, the content within each programme. Among several possible approaches to curriculum design, Biggs et al. (2022) propose constructive alignment, which emphasises the need for coherence between intended learning outcomes (ILO), teaching-learning activities and assessment tasks. This approach represents a paradigm shift as it focuses attention on student learning, an aspect highlighted by the European Higher Education Area (Barboyon Combey & Gargallo López, 2022).
Constructive alignment proposes curriculum design based on four basic activities (Biggs, 2014): 1) determine the ILO that the students should achieve by specifying the action to be performed; 2) create a learning environment using teaching-learning activities that make the students get involved in achieving the intended outcomes; 3) design and use assessment tasks to evaluate ILO achievement; and 4) turn these judgements into final scores.
Although this approach highlights the ILO, qualifications designed in Spain have focused on skills as an essential part of the programmes. However, Royal Decree 822/2021, on organising university teaching and the quality assurance procedure, put learning outcomes centre stage, turning them into «the key element to define study plans and harmonise higher education systems» (ANECA, 2022, p. 5), which causes some confusion from a curriculum point of view and represents a further challenge for university teachers.
This change of direction, together with the limited evidence on the use of ILO by academics (Dobbins et al., 2016), supports the need to analyse master’s programmes in order to understand how the ILO are being designed and which assessment methods and instruments are being proposed to evaluate how well the ILO are achieved. Such an analysis will make it possible to offer improvement guidelines for effectively redesigning master’s degrees in line with current international trends.
Various authors have defended the importance of re-focusing subject or content design and planning this from the student’s perspective, in other words, taking assessment as a starting point, since it is the focus of interest from which students approach their activity (Biggs et al., 2022; Ibarra-Sáiz & Rodríguez-Gómez, 2022a) and so determines how they learn (Ajjawi et al., 2022; Boud, 2020). This requires coherence between the expected ILO and the assessment tasks which will demonstrate how far the ILO have been achieved (Ibarra-Sáiz & Rodríguez-Gómez, 2022a). In short, assessment tasks should explicitly align with the ILO (Coates, 2016) and they should use the appropriate assessment methods and instruments.
We conceive the ILO as «declarations that provide information on what a learner is expected to know, understand, use, perform, demonstrate or apply and prove by performances or achievements in a specific context with determined levels of achievement at the end of the learning process» (Rodríguez-Gómez & Ibarra-Sáiz, 2022, 0m37s). The ILO offer greater transparency and clarity as they take what students are supposed to achieve during their university training and make it clearer and easier to understand. These learning outcomes thereby become a very useful course design tool. Figure 1 represents this relationship between these curricular elements, beginning with the ILO, as drivers of the assessment tasks and the teaching and learning activities (Boud, 2020). In short, the aim is to establish coherence between the ILO, the assessment tasks and the students’ learning when performing the various activities (Ajjawi et al., 2022).

When specifying the ILO, two fundamental aspects should be considered: the level of specification and its constitutive parts.
Approaching curriculum design from constructive alignment is considered a fundamental principle for the university-level teaching-learning process (Ajjawi et al., 2023; Barboyon Combey & Gargallo López, 2022), not only in the subjects/content but also at an institutional level. Biggs et al. (2022) therefore propose three levels of ILO (institutional, programme and unit), which should be coherent with each other when rolled out. It should also be considered that RD 822/2021 states that the ILO must be in line with the QF-EHEA master’s degree level in the European Higher Education Area and be coherent with the designation of the qualification, its discipline and the graduate profile, which inevitably requires considering various levels or standards and a benchmark teaching excellence model for the roll-out (Figure 2).

At an operative level, an ILO must include a series of components to be considered properly formulated. In particular, an ILO statement should specify an action verb which informs the learner what they are expected to be capable of doing, and this action must also appear in the assessment task(s) which, in turn, will provide the backbone of the teaching-learning activities (Biggs et al., 2022).
Table 1 presents the components that various authors and institutions consider should be included in an ILO declaration.


Great similarity is seen among them all. Rodríguez-Gómez and Ibarra-Sáiz (2022) additionally specify the chosen performance level, an aspect related to the levels or standards, although these authors highlight the complementary nature of the latter two components.
As represented in Figure 3, specifying these components makes it easier to specify other curricular elements such as the assessment methods and, therefore, the type of assessment instrument likely to be used, consistent with the intended performance level.
Outcome-oriented higher education programmes have introduced long-term changes in assessment, particularly in OECD countries (Zlatkin-Troitschanskaia et al., 2016). However, despite contributions from various international and local initiatives to assess the ILO, assessment today is still the same as it was a century ago (Coates, 2020), and the time has come to look into updating it by designing innovative registering, assessing and certifying systems (Ibarra-Sáiz & Rodríguez-Gómez, 2022b).
Following Coates et al. (2021) in their new-generation assessment proposal, and in line with the constructive alignment approach, we advocate an evidence-based assessment design. This means that the assessment tasks must explicitly align with the ILO and guarantee that there is sufficient valid evidence to consistently assess how far the ILO have been achieved.
In this respect, monitoring and assessing the ILO requires assessment methods (products and actions by the students) which can be used to collect information on the assessment object, and assessment instruments that make it possible to pass judgement based on clear, known criteria to assess the level of achievement attained (Ibarra-Sáiz et al., 2023).
Regarding the ILO approach in the university curriculum, we found some experiences of curriculum redefinition, methodology and evaluation (Astigarraga Echeverría et al., 2020) and others which provide content and textual analysis of the programmes and teaching guides (Schoepp, 2019; Soares et al., 2020). These demonstrate weaknesses in the design and planning of the subject material, but the topic still lacks attention in terms of curriculum specification (Gamboa Solano et al., 2021).
From these prior considerations, the aim of this study was to analyse the design of the learning outcomes and the assessment methods and instruments declared in the university master’s degree programmes, to answer three research questions: what type of ILO are specified in the master’s degree programmes; which assessment methods and instruments are specified for monitoring and assessing the ILO; and whether the characterisation of the ILO and of the assessment methods and instruments differs depending on the university or the discipline.
This study was performed in the context of the FLOASS Project (http://floass.uca.es) using a mixed-methods approach (Creswell & Creswell, 2022), specifically following a multiple convergent design (Figure 4).

To make it easier to describe the sample, and the subsequent presentation of results and conclusions, Table 2 outlines the acronyms used and Table 3 presents the acronyms for the participating universities.
The project focused on analysing qualifications given in the universities, classified as level 3 in the Spanish Framework of Higher Education Qualifications (master’s degree) due to the specialisation and variability of these courses between various universities. Furthermore, a selection was made from the Social and Legal Sciences area, due to the project’s limited human and time resources, which meant that only Social Sciences qualifications taught at each university were analysed (See Appendix I). A total of 89 master’s degrees were analysed (Table 3): 38.20% were from the discipline of Education, 51.69% from Economics and Business Studies and 10.11% from Communication, specifically understanding these as the disciplines for this study.


The programmes for these 89 qualifications were used to extract descriptions of the ILO and the assessment methods and instruments (AMI) specified in each of them, which meant analysing 9,419 ILO and 6,729 AMI (Table 4).

To collect, organise and simplify the information to be extracted from the master’s programmes, a database was set up in Excel format (Register of master’s degrees in social sciences) adding the following data: university, discipline, qualification, subject, skills, learning outcomes and assessment methods and instruments.
The ANVALDOC scale (Ibarra-Sáiz et al., 2022) was defined to analyse the content of the ILO definitions and the AMI descriptions. Researchers used this scale to assess the ILO definitions according to the criteria of correctness, verifiability, authenticity and cognitive level. The AMI were assessed for correctness and authenticity. The CORAMeval computer tool (Balderas et al., 2021) was developed as a support for the assessment process, helping to run the assessments quickly and easily.
The descriptions of the ILO and the AMI proposed in the master’s programmes constitute two textual corpora which can be analysed using multidimensional statistical methods to explore their form and structure and their lexical content. This textual analysis was performed using several functions from the Xplortext (Bécue-Bertaut et al., 2022) package in the RStudio (RStudio Team, 2022) environment. Specifically, the TextData function was used to build the textual and contextual tables, the LexCA function to perform the correspondence analysis from the lexical tables, and the LexChar function to determine the characteristic words from the documents.
The subsequent content analysis of the judges’ assessments was performed using descriptive statistics and non-parametric tests, as these are ordinal measurements that do not follow a normal distribution (K-S test, p<.001). IBM SPSS (IBM Corp., 2017) and R (R Core Team, 2022) were used for these analyses.
Exploratory textual analysis
In the case of the ILO, a total of 9419 definitions were analysed, using 5642 different words. Table 5 presents a dictionary of the 30 words which are used 400 times or more, and the number of universities and disciplines where they appear. For example, the most frequent word ‘conocer’ (know) is used in 1625 definitions, found in all six universities and in the three disciplines. The word ‘analizar’ (analyse) is used 674 times and it is present in five universities and all three disciplines.

On the other hand, a total of 6729 definitions referring to AMI were analysed (Table 6), which used 1224 different words. Only ten words pass the threshold of being used 400+ times, and out of those ‘trabajo’ (assignment) and ‘pruebas’ (tests) were the most used.

Contextual association with the university and the disciplines
The fundamental aim of the correspondence analysis from the lexical table (documents by words) is to study and visualise the proximities between documents, the proximities between words and the association between documents and words (Bécue-Bertaut, 2018). Nouns and verbs used 400+ times were used in the correspondence analysis.
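As an illustration only, the following Python sketch shows how a lexical table (documents by words) with a minimum-frequency threshold might be assembled. The study itself used the Xplortext package in R; the function name `lexical_table`, the toy descriptions and the stopword set are hypothetical and are not the authors' code or data. The sketch drops stopwords instead of selecting nouns and verbs by part of speech, which is a simplification.

```python
from collections import Counter

def lexical_table(docs, min_freq, stopwords=frozenset()):
    """Build a {group: {word: count}} table from (group, text) pairs.

    Only words whose overall frequency is at least `min_freq` are kept,
    mirroring the idea of a frequency threshold (400+ uses in the study).
    """
    overall = Counter()
    by_group = {}
    for group, text in docs:
        words = [w for w in text.lower().split() if w not in stopwords]
        overall.update(words)
        by_group.setdefault(group, Counter()).update(words)
    kept = sorted(w for w, n in overall.items() if n >= min_freq)
    return {g: {w: counts[w] for w in kept} for g, counts in by_group.items()}

# Hypothetical toy descriptions, NOT taken from the analysed programmes
toy_docs = [
    ("UCA", "conocer y aplicar técnicas"),
    ("UV", "conocer y analizar datos"),
    ("UV", "aplicar técnicas"),
]
table = lexical_table(toy_docs, min_freq=2, stopwords={"y"})
for group, row in table.items():
    print(group, row)
```

Each row of the resulting table is the word-count profile of one aggregated document (here, a university), which is the kind of input a correspondence analysis works on.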
By comparing the row/column profiles, we can test the model of independence between the documents and the vocabulary. Significant Chi-squared values were attained in both the case of the universities and the disciplines (Table 7), which make it possible to reject the hypothesis of independence, clearly showing an association between documents and vocabulary, between the various universities and the vocabulary that they use in each case, as well as between the various disciplines and the language used in each of them.
Cramér’s V values are equal to or higher than 0.2 in all cases except the ILO by university, where the association is 0.18. Values between 0.2 and 0.6 can be interpreted as a moderate association, according to the rule that treats this range as moderate. The first two axes of each factorial analysis account for over 80% of the total inertia in all cases.
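For readers less familiar with this measure, the sketch below computes the Pearson Chi-squared statistic and Cramér's V for a contingency table in plain Python. The counts are hypothetical and are not the values in Table 7; the helper names `chi_squared`, `cramers_v` and `is_moderate` are illustrative, and the 0.2 to 0.6 band simply encodes the rule of thumb quoted in the text.

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic and grand total for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    chi2, n = chi_squared(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

def is_moderate(v):
    """Rule of thumb from the text: values between 0.2 and 0.6 count as moderate."""
    return 0.2 <= v <= 0.6

# Hypothetical counts: rows are documents (e.g. universities), columns are words
toy_table = [[30, 10], [10, 30]]
v = cramers_v(toy_table)
print(round(v, 2), is_moderate(v))
```

Under this rule, the value of 0.18 reported for the ILO by university would fall just below the moderate band.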

This association relationship is presented as a graph using the factorial planes shown in Figure 5.

Textual characterisation according to the university and the disciplines
To demonstrate these associations more clearly, the results after identifying the characteristic words are presented below.
Characterisation by university
Figure 6 presents the over-represented (blue) and under-represented (red) words in the ILO descriptions depending on the university. For example, at the UCA, the word ‘conocimiento’ (knowledge) is over-represented as it is used 102 times, and this represents 0.36% of use as opposed to 0.21% of use in all the universities as a whole.

Along the same line, Figure 7 shows the characteristic words referring to the AMI. The words ‘actividades’ (activities), ‘aula’ (classroom) and ‘participación’ (participation) are characteristic of the UCA. Participation appears over-represented in three universities as overall it is used in 1.57% of the descriptions, and the use in these universities represents 2% (UDC), 1.8% (UCA) and 2.1% (UPV/EHU).

The most characteristic textual segments for each university are presented in Table 8.

Characterisation according to the disciplines
In the description of the ILO, the words ‘técnicas’ (techniques) and ‘análisis’ (analysis) are presented as characteristic of the COM disciplines (Figure 8). The ECO discipline is characterised by terms such as ‘conocer’ (know), ‘análisis’ (analysis), ‘saber’ (find out), ‘identificar’ (identify), ‘aplicar’ (apply), ‘analizar’ (analyse), ‘técnicas’ (techniques) and ‘información’ (information). The EDU discipline presents a higher quantity of characteristic words, with outstanding use of terms such as ‘aprendizaje’ (learning), ‘enseñanza’ (teaching), ‘educación’ (education), ‘evaluación’ (evaluation), ‘lengua’ (language), ‘formación’ (training), ‘procesos’ (processes) or ‘alumnado’ (students).


Finally, Figure 9 shows that, in the descriptions of the assessment methods and instruments, the words ‘prácticos’ (practical), ‘trabajos’ (assignments), ‘participación’ (participation) and ‘evaluación’ (assessment) are characteristic of the COM disciplines. ‘Pruebas’ (tests), ‘examen’ (exam), ‘prácticas’ (practices), ‘evaluación’ (assessment) and ‘prácticos’ (practical) characterise the ECO discipline and ‘aula’ (classroom) and ‘actividades’ (activities) feature most in EDU.
Table 9 presents the characteristic textual segments depending on the discipline.

Characterisation of the learning outcomes
Out of the 9419 ILO definitions analysed, 20.2% (1898) were scored as correctly defined (maximum score, 2), 42.4% (3995) had limitations in their definition (score=1) and 37.4% (3526) were not defined correctly (score=0). We can thereby see in Table 10 that the correctness average is 0.83 (out of a maximum score of 2). In the case of the universities, this ranges between 0.22 at the UDC and 1.28 at the UV; and regarding the discipline, it ranges between 0.40 from Communication and 0.95 in Education.
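The reported correctness average can be checked directly from the score counts given in the paragraph above. The short Python sketch below does this; the function name `summarise` is illustrative, and only the three counts come from the text.

```python
# Score counts reported in the text for the 9419 ILO definitions
# (2 = correctly defined, 1 = defined with limitations, 0 = not correctly defined)
ilo_counts = {2: 1898, 1: 3995, 0: 3526}

def summarise(counts):
    """Return the total, the weighted mean score and the percentage share per score."""
    total = sum(counts.values())
    mean = sum(score * n for score, n in counts.items()) / total
    shares = {score: 100 * n / total for score, n in counts.items()}
    return total, mean, shares

total, mean, shares = summarise(ilo_counts)
print(total, round(mean, 2), {s: round(p, 1) for s, p in shares.items()})
```

The weighted mean is (2 × 1898 + 1 × 3995 + 0 × 3526) / 9419, which rounds to the 0.83 reported.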

Focusing on the 5893 ILO defined correctly or with limitations, 22.7% (1337) are considered to be entirely observable, measurable or assessable. We find a verifiability average of 3.41 (on a scale of 1 to 5), ranging between 2.62 from the URV and 3.88 from the UCA, and 3.39 for Education compared to 4.15 for Communication.
Among these ILO, 33.6% (1981) are assessed as authentic, to the extent that their definitions are focused on the action and the professional context. This produces an authenticity average of 4.01 (on a scale from 1 to 5), ranging between an average of 3.81 from UNIOVI and 4.77 from the UDC, and 3.97 from Education compared to 4.26 from Communication.
Finally, referring to the cognitive processes determined by Anderson et al. (2001), it is seen that 20.3% (1194) attain the maximum level (creation), obtaining an average score of 3.86 (on a scale of 1 to 6). The majority of the ILO (50.6%) are scored between levels 3 and 4 (apply and analyse), 15.1% in levels 1 and 2 (remember and understand) and 34.4% between levels 5 and 6 (evaluate and create). Table 10 shows that in this case, the averages from the universities lie between 3.86 from the UV and 4.73 from the UDC and in the disciplines 3.86 from Education and 4.39 from Communication.
Characterisation of assessment methods and instruments
Out of the 6729 AMI analysed, 47.3% (3182) were evaluated as correctly defined, 39.1% (2629) had limitations in their definition as the product or action was not properly explained and 13.6% (918) lacked information on the product or action. The correctness average for the AMI (Table 11) is 1.34 (out of a maximum score of 2), ranging between 0.88 from the URV and 1.91 from the UNIOVI, and 0.98 from the discipline of Communication compared to 1.40 in the Education discipline.

Out of the 5811 assessment methods and instruments defined correctly or with limitations, 22.4% (1301) are scored as authentic, as they are focused on the action and the professional context. A further 39.9% (2318) are at an intermediate level, with an average authenticity of 3.57. The averages range between 3.14 from the URV and 3.87 from the UV; regarding the discipline, they range between 3.25 from Communication and 3.60 from Education.
The university and the discipline as differentiation factors
The Kruskal-Wallis H-test was performed to check the significance of the differences described above, and its results are presented in Table 12, alongside the effect sizes (ηH²) and the confidence intervals (CI). The differences between the evaluations carried out according to the university and disciplines are statistically significant (p<.05). In the case of the universities, the effect size varies, although correctness provides the greatest effect size both in the ILO (0.23) and in the AMI (0.27), which can both be considered large. In terms of verifiability, it is moderate (0.10), and regarding authenticity and the cognitive processes, the effect size is small in the ILO (0.04 and 0.02 respectively) and in the authenticity of the AMI (0.08). Regarding the disciplines, the effect sizes are small or very small.
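The text does not state how ηH² was calculated. One commonly used formula derives it from the H statistic as (H − k + 1) / (n − k), where k is the number of groups and n the total number of observations. The Python sketch below applies that formula, assuming it matches the authors' calculation; the function name `eta_squared_h` and the input values are hypothetical and do not come from Table 12.

```python
def eta_squared_h(h_statistic, n_groups, n_observations):
    """Eta-squared based on the Kruskal-Wallis H statistic.

    Uses the common formula (H - k + 1) / (n - k); whether the study
    used exactly this formula is an assumption.
    """
    if n_observations <= n_groups:
        raise ValueError("n_observations must exceed n_groups")
    return (h_statistic - n_groups + 1) / (n_observations - n_groups)

# Hypothetical values: H = 105 across 6 groups (e.g. universities), 506 observations
effect = eta_squared_h(105.0, 6, 506)
print(round(effect, 2))
```

With large samples such as those analysed here, even small values of ηH² can accompany a statistically significant H, which is why the effect sizes are reported alongside the p-values.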

The aim of this study was to analyse the design of the learning outcomes and the assessment methods and instruments declared in the university master’s degree programmes for Social Sciences.
To answer the first question raised on what type of ILO are specified in the master’s degree programmes, the textual analysis demonstrated that the most frequent word is ‘conocer’ (know), found in all six universities and all three disciplines. In the same way, content analysis confirmed that 49.3% of the ILO correspond to the lowest levels (remember, understand or apply) of the taxonomy by Anderson et al. (2001). These results coincide with contributions from Boud (2020) on the emphasis placed on low-level knowledge during assessment, as with other studies where the majority of the ILO were classified at the lowest level (Bone & Ross, 2021). This situation might be the consequence of Spanish regulations which allude to the student ‘knowing’ when they refer to the ILO (RD 1027/2011) and also because, as mentioned by Jiménez Hernández et al. (2020), the teacher-centred teaching model is still present. Furthermore, analysis of the ILO definitions evaluated as correctly formulated or with some limitations concluded that the majority are neither verifiable (observable, measurable or assessable) nor authentic (focused on the action or the professional context).
However, regarding the cognitive level, a little over one third of the ILO (34.4%) are assessed at the highest levels (evaluate and create) of the taxonomy by Anderson et al. (2001), which gives a more encouraging picture in comparison with the studies mentioned by Boud (2020) and Bone & Ross (2021), although this is still insufficient given that these are master’s degrees corresponding to level 3 of the Spanish Framework of Higher Education Qualifications.
On the other hand, it is worth mentioning that the definitions are limited for a high percentage of the ILO described, as they do not contain all the components in Table 1 that both agencies such as AQU (2022) and authors such as Biggs et al. (2022), Rodríguez-Gómez & Ibarra-Sáiz (2022) or Soares et al. (2020) consider necessary for proper formulation. This confirms what Astigarraga Echeverría et al. (2020) describe as a great difficulty in the design and conceptualisation of curriculum change, as teachers are not sure how to identify and describe the ILO and confuse them with skills.
Regarding the second research question, referring to which AMI are specified for monitoring and assessment of the ILO, the textual analysis tells us that the most-used terms are: assignments, tests, participation, activities, group, practices, classroom, exam, practical and assessment. The AMI are clearly diverse, which fits with the study by Ibarra-Sáiz et al. (2023) and reveals a more innovative evolution compared with prior contributions by Panadero et al. (2019), which highlight more traditional practice.
The content analysis shows that more than half the AMI either have limitations in their formulation or do not provide information on the product or specific action that must be performed or completed by the students. This might be due to confusion around their meaning, with methods and instruments understood to be one and the same (Ibarra-Sáiz et al., 2023). On the other hand, only a very small number of AMI stand out for their authenticity. This fact contrasts with what happens in other university contexts where there is an increase in the use of tasks, assessment processes and AMI which are more in line with professional practice (Boud, 2020), through which teaching staff can get students involved in important learning for employability (Ajjawi et al., 2022).
Finally, regarding the third question on possible differences in characterisation of the ILO and the AMI depending on the university or the discipline, the results demonstrate divergences regarding the university of origin, although fewer regarding the disciplines. Some of the difference found between universities might be due to each university analysing its own master’s degrees with a team of its own researchers, thereby giving a scoring discrepancy that might be considered usual in this type of inter-judge process. However, the variability of the different contexts (greater between universities than between disciplines) leads us to consider the possible influence of both the university’s own organisational culture and the specific nature of each of the disciplines.
A series of limitations should be considered in this study. Firstly, although the sample originates from various universities, sufficiently diverse and large enough to draw conclusions, it is exclusively centred on three disciplines of Social Sciences (Communication, Education and Economics and Business). It could therefore be widened to other disciplines to generalise the result more effectively. Secondly, the results are only obtained through documentary analysis of programmes. Although this method is considered to be appropriate to find out about the current state of the ILO (Schoepp, 2019), future research is suggested to contrast the results obtained by other collection techniques and information sources such as interviews with the coordinators of the actual master’s degrees being analysed, a questionnaire sent to teachers on their ILO assessment practice (Ibarra-Sáiz et al., 2023) and focus groups which collect information from students. This will provide a better understanding and an overall perspective of the ILO and AMI, by including viewpoints from everyone involved.
The findings of this study demonstrate the challenge represented by designing ILO to respond to a reform that focuses on them as the central axis of the curriculum design (Gamboa Solano et al., 2021; García-Olalla et al., 2022). Only analysis, reflection, review and assessment of the ILO can bring about real change in educational practice (Bone & Ross, 2021), in an attempt to bring the majority of the results into line with internationally-accepted best practices (Schoepp, 2019). However, as Biggs (1996) reminded us, a university is a holistic, interactive system managed by many procedures with specific functional uses that determine the teaching and evaluation processes and that, in turn, affect students’ perceptions and experiences regarding what and how they will learn. Consequently, it is not enough to let teachers individually juggle as best they can with the conflicting bureaucratic demands imposed by quality assurance systems. Each higher education institution must have an assessment policy and guidelines which provide a coherent set of principles and procedural knowledge grounded in the teaching excellence model that has been chosen by each institution independently. This requires training and professional development for its teaching staff to bring about a change in their conceptions and a reflection that allows them to identify and specify the ILO so that the curriculum design is definitively focused on the students’ learning (Biggs, 2014). Constructive alignment is a suitable framework to achieve this (Astigarraga Echeverría et al., 2020).

How to reference this article: Rodríguez-Gómez, G., Cubero-Ibáñez, J., Sánchez-Calleja, L., González-Elorza, A., & Ibarra-Sáiz, M. S. (2025). The challenge to design and assess learning outcomes in higher education. Educación XX1, 28(1), 179-211. https://doi.org/10.5944/educxx1.38233
This paper was made possible by the FLOASS Project – Learning Outcomes and Learning Analytics in Higher Education: An Action Framework from Sustainable Assessment, funded by the Spanish Ministry of Science, Innovation and Universities in the R+D+i State Programme Focused on the Challenges for Society, the State Research Agency and the European Regional Development Fund (Ref. RTI2018-093630-B-I00) and support from the UNESCO Chair on Evaluation and Assessment, Innovation and Excellence in Education at the University of Cadiz.
We would also like to thank Ramón Álvarez Esteban for his comments and guidance on how to use the Xplortext package.





















