Typing woes - All About XCRI

How do we use XCRI-CAP to enable feed consumers to filter out the course records they want from those they don't want? This fundamental question was asked when we first started to design XCRI back in the day. This post, and its reiteration of that question as a starting point, was stimulated by Qamar Zaman's blog post "XCRI and Qualification" (https://atiqam.wordpress.com/xcri-and-qualification/).

XCRI-CAP 1.2 has many features that permit filtering in theory. These include at course level: subject, type and credit >> level; at presentation level: age, duration, studyMode, attendanceMode, attendancePattern; at qualification level: educationLevel and type. These elements were deliberately included in order to help with filtering, both for consumption of feeds and for search.

However, XCRI-CAP is primarily a structural specification - it specifies, for example, that if you have a course title, this is where you should put it. The spec itself doesn't prescribe the content of elements, except for some suggestions (not binding) for studyMode, attendanceMode and attendancePattern. That's why we have a Data Definitions document for the Course Data Programme, and even that is to an extent loosely specified - and could do with tightening up, once we have agreement on the content. For machine-readability this is not ideal, but it has helped to enable many organisations to produce an XCRI-CAP feed, and we already have some aggregation taking place, and some services.

When XCRI-CAP was designed, there were very few generally accepted vocabularies for key information items that enable divvying up of the data. Therefore, the designers were loath to include them in the spec, as that could easily have restricted its take-up by negating potential use cases. On the other hand, producers of XCRI-CAP feeds need to know what XCRI-CAP feed consumers want inside many of these elements, so that the data can be filtered and consumed with a minimum of unnecessary intervention. This is the reason why the emergence of communities of practice within the Course Data Programme (for example, around Graduate Prospects, and the Creative Assembly, and later UCAS) has been so encouraging and so important.

As I've mentioned, XCRI-CAP 1.2 does include several data elements that can help, if populated with agreed vocabularies. Some are fairly well specified, while others (I'm looking at type in all its forms as an example) are less so. We can enumerate qualification type through various well-established frameworks (for example NQF or QCF). We have JACS and other vocabularies for subjects, and we have suggested vocabularies for studyMode, attendanceMode and attendancePattern.

The purpose of the type element in course is to provide a filtering mechanism not already covered by elements such as studyMode, subject, qualification or educationLevel. An archetypal "type" is 'continuing professional education' courses; these cannot be readily extracted using existing elements, because they typically carry no credit or level, and you cannot pick them out with just a subject vocabulary, duration or other easy descriptor without analysing free-text descriptions. It also seems to me not unreasonable that an aggregator might want to pull out CPD courses (in fact we already have two specific cases of this). This is not an isolated use case. Consider perhaps 'Open Learning' courses, or 'Continuing Education', or even 'Undergraduate' (under-specified for level in most frameworks), or 'Postgraduate Taught' and 'Postgraduate Research'. The current state of XCRI-CAP design does not permit, without more vocabularies, these groups of courses to be filtered easily.
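To make the filtering idea concrete, here is a minimal sketch of how a feed consumer might pull out courses by a type term. It is an illustration under stated assumptions, not a reference implementation: the sample feed and its course titles are invented, the helper name `courses_of_type` is hypothetical, and I am assuming the course-level type is carried in a Dublin Core `dc:type` element with the namespace URIs shown, which you should check against the feed you actually consume.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as I understand them for XCRI-CAP 1.2 and Dublin Core
# (an assumption; verify against the feed you are consuming).
NS = {
    "xcri": "http://xcri.org/profiles/1.2/catalog",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical two-course feed, invented purely for illustration.
SAMPLE_FEED = """\
<catalog xmlns="http://xcri.org/profiles/1.2/catalog"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <provider>
    <dc:title>Example University</dc:title>
    <course>
      <dc:title>Leadership for Clinicians</dc:title>
      <dc:type>Continuing Professional Development</dc:type>
    </course>
    <course>
      <dc:title>BSc Physics</dc:title>
      <dc:type>Undergraduate</dc:type>
    </course>
  </provider>
</catalog>
"""


def courses_of_type(feed_xml, wanted):
    """Return the titles of courses that carry a dc:type equal to `wanted`."""
    root = ET.fromstring(feed_xml)
    titles = []
    for course in root.iterfind(".//xcri:course", NS):
        # A course may carry several type elements, so collect them all.
        terms = [t.text.strip() for t in course.findall("dc:type", NS) if t.text]
        if wanted in terms:
            titles.append(course.findtext("dc:title", default="", namespaces=NS))
    return titles


print(courses_of_type(SAMPLE_FEED, "Continuing Professional Development"))
```

The exact string match only works if producers and consumers have agreed the terms in advance, which is the whole point of a shared vocabulary.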

I think we legitimately have several axes (pl. of axis and pl. of axe!) here with which to slice up course provision, independent of educationLevel, qualification abbreviation, studyMode, subject and others explicitly defined:

qualification type: For example - 'GCSE or Equivalent', 'Foundation Degree', 'Postgraduate Qualification' [As an aside, importantly, I note that there is an error in the Data Definitions: there *should* be an element for qualification type; it's in the schemas but not the data definitions. This may help, as there are some useful qualtype vocabularies around that don't necessarily equate simply to 'level'.]
course type (inter-institution context): For example 'Continuing Professional Development', 'Open Learning', 'Continuing Education', 'Undergraduate', 'Postgraduate Taught', 'Postgraduate Research', 'Summer School', 'Researcher Training'.
course type (structural component type within an HEI's offerings): For example - 'module', 'programme', 'stage'; as used in the HEAR XML specification
course type (community practice): see CPD community practice below.
module / programme relations: could use hasPart / isPartOf, coupled with the structural component type (again as in the HEAR XML specification)

A typical usage in the CPD community might be:

This example is already implemented in the Course Data Programme schema, and the vocabulary is published as a VDEX file here: https://xcri.co.uk/vocabularies/courseTypeCPD1_0.xml.

My view is that communities of practice will have specific requirements for these vocabularies, which will be different in different communities. For example, Graduate Prospects may well wish to split up PG courses into different types, not necessarily linked directly to educationLevel; some "course type" terms might look like educationLevel terms, but they are being used in a different course-based context. For PG courses, you might want to identify CPD, Taught, and Research courses and perhaps researcher training, and these terms might be sufficient in the type element. [Bearing in mind that multiple type elements are permitted, and we can use xsi:type to prescribe and validate vocabularies.]
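As a rough sketch of how a consumer could tell several type elements apart, the snippet below groups a course's type terms by the value of their xsi:type attribute. Everything specific here is an assumption made for illustration: the course fragment is invented, the `ex:` prefix and the vocabulary names `courseTypeGeneral` and `componentType` are hypothetical and not taken from the Course Data Programme schema, and `types_by_vocabulary` is just a helper name I have made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Clark-notation names for the element and attribute we inspect.
DC_TYPE = "{http://purl.org/dc/elements/1.1/}type"
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical course fragment; the "ex:" vocabulary names are invented.
SAMPLE_COURSE = """\
<course xmlns="http://xcri.org/profiles/1.2/catalog"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:ex="http://example.org/vocab">
  <dc:title>Research Methods for Doctoral Students</dc:title>
  <dc:type xsi:type="ex:courseTypeGeneral">Postgraduate Research</dc:type>
  <dc:type xsi:type="ex:courseTypeGeneral">Researcher Training</dc:type>
  <dc:type xsi:type="ex:componentType">Module</dc:type>
</course>
"""


def types_by_vocabulary(course_xml):
    """Group a course's type terms by the vocabulary named in xsi:type."""
    course = ET.fromstring(course_xml)
    grouped = defaultdict(list)
    for el in course.findall(DC_TYPE):
        # ElementTree leaves the attribute value as the literal prefixed
        # string (e.g. "ex:componentType"); it does not resolve the prefix.
        vocab = el.get(XSI_TYPE, "(untyped)")
        grouped[vocab].append((el.text or "").strip())
    return dict(grouped)


print(types_by_vocabulary(SAMPLE_COURSE))
```

Because the terms are grouped per vocabulary, a course can be 'Postgraduate Research' in one vocabulary and a 'Module' in another without the two readings colliding.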

So there are many ways to slice and dice chunks of course provision, and XCRI-CAP 1.2 has elements that can enable this. We collectively need to determine what chunks need identification over and above the reasonably well specified stuff like subject, educationLevel and study mode. We can implement multiple vocabulary elements if required - a type vocab doesn't have to have mutually exclusive terms. And in my experience, starting with agreement on a small number of terms is better than trying to get to an all-encompassing vocabulary before using it.

Here are three, for a small start.

'Continuing Professional Development'
'Undergraduate'
'Postgraduate'

Implement with the following XML:

And also some components:

'Programme'
'Pathway'
'Stage'
'Year'
'Module'

Implement with the following XML:

These will validate against the Course Data Programme schema, but should be considered a pilot implementation.