Thoughts on programme title cleaning

The most frustrating thing for me about working with Student Record data has always been the free-text Course Title field. It slows analysis down and makes it less reliable, and faster, better analysis means better decision making.

Across three years of data, there are almost 59,000 different first degree and PGT programme titles, with misspellings, random codes and hundreds of different ways to write “with placement” making powerful programme-level analysis more difficult.

With UniViz, you can receive insightful analysis more quickly as I have made three key additions that improve programme titles:

1) The UniViz Programme Title strips out all of the unnecessary information, leaving only clean, consistent titles. This means programme searching and cleaning takes minutes, not days.
2) Awards have been removed from programme titles and given their own column in the data. This makes trends more obvious, as there is no need to spend time matching a BSc in one year to a title without an award in others.
3) The UniViz Provider field improves how satellite campuses and partner institutions are treated. As far as possible, they are now separated out, so you can benchmark against your peers themselves, not the students they award degrees to in other parts of the country.
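
As a rough illustration of the first two additions, here is a minimal Python sketch. It is not the UniViz implementation: the award list, the "with placement" and course-code patterns, and the function name `split_title` are all assumptions made up for this example, and real data would need far more rules.

```python
import re
import string
from typing import Optional, Tuple

# Hypothetical award list; a real one would be much longer.
AWARDS = ["BSc", "BA", "BEng", "MSc", "MA", "MBA", "LLB", "LLM"]
_CANONICAL = {a.lower(): a for a in AWARDS}

# An award, optionally followed by "(Hons)", in any letter case.
_AWARD_RE = re.compile(
    r"\b(" + "|".join(AWARDS) + r")\b(?:\s*\(Hons\))?", re.IGNORECASE
)

# A few assumed spellings of "with placement", optionally in brackets.
_PLACEMENT_RE = re.compile(
    r"\(?\b(?:with|w/|including|incl?\.?)\s+(?:an?\s+)?"
    r"(?:industrial\s+|professional\s+)?placement(?:\s+year)?\)?",
    re.IGNORECASE,
)

# Assumed shape of a stray course code: 1-3 capitals then 3+ digits.
_CODE_RE = re.compile(r"\b[A-Z]{1,3}\d{3,}\b")


def split_title(raw: str) -> Tuple[str, Optional[str]]:
    """Return (clean_title, award) for one free-text course title."""
    match = _AWARD_RE.search(raw)
    award = _CANONICAL[match.group(1).lower()] if match else None

    text = raw
    for pattern in (_AWARD_RE, _PLACEMENT_RE, _CODE_RE):
        text = pattern.sub(" ", text)

    # Collapse whitespace, trim leftover punctuation, normalise case.
    text = re.sub(r"\s+", " ", text).strip(" -,:;()")
    return string.capwords(text), award


samples = [
    "BSc (Hons) Computer Science with Placement",
    "COMPUTER SCIENCE (W/ INDUSTRIAL PLACEMENT YEAR) G400",
    "computer science MSc",
]
for title in samples:
    print(split_title(title))
```

All three sample titles reduce to the same clean title, with the award (where one is present) returned separately so it can sit in its own column.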

I now work with almost 45% fewer titles, freeing my time to focus on what actually matters: deep-diving into your programme’s performance and making recommendations to boost its market position.

Clean data isn’t just a nice-to-have; it is vital to improving the recommendations we can make from it.

Please get in touch today if you have any projects which could benefit from insightful programme-level analysis.