A little while back, I had a conversation with Lystrialle about how Vocaloid’s UI is a terrible, terrible thing and how the humble freeware program UTAU, which was written in Visual Basic 6 and looks the part, has a better interface. I wondered to myself, “How could a program that was written by professionals and costs a not-insignificant amount of money have such a bad interface?” Well, as it turns out, Vocaloid doesn’t even have the worst UI of any singing synth program, because most pre-Vocaloid singing synthesizers were not commercial products and thus did not concern themselves with things like “usability” or “user experience.”
Witness MUSSE DIG, an academic singing synth project developed in 1989 by researchers at KTH, the Swedish Royal Institute of Technology, in Stockholm. It is described in “The KTH Rule System for Singing Synthesis,” an article by Gunilla Berndtsson that, much as the title suggests, outlines the rules for singing synthesis MUSSE DIG used. Unlike modern synths, MUSSE DIG, as far as I can tell, did not use actual human samples; that didn’t seem to be a thing until later in the ’90s. Getting back to the point, this is how “score files,” roughly equivalent to a VSQ in Vocaloid or a UST in UTAU, were written:
“The lyrics and the corresponding notes are first typed into a score file. The lyrics are written in a type of phonetic transcription containing information on vowel length, etc. The metronome value is given, and the notes are specified in terms of pitch name, octave number, and nominal duration [Berndtsson 1996].”
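To make that description a bit more concrete, here is a rough Python sketch of the kind of information such a score file carries. To be clear, the line syntax below (`tempo 120`, `C4 1/4 la:`) is something I made up for illustration and is not MUSSE DIG’s actual format. Only the fields (metronome value, pitch name, octave number, nominal duration, phonetic lyric) come from the quote above, and the names `Note`, `parse_score`, and `seconds` are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction
import re


@dataclass
class Note:
    pitch: str          # pitch name, e.g. "C" or "F#"
    octave: int         # octave number
    duration: Fraction  # nominal duration as a fraction of a whole note
    lyric: str          # phonetic transcription (notation invented here)


# Hypothetical line syntax: "<pitch><octave> <num>/<den> <phonemes>"
NOTE_RE = re.compile(r"^([A-G][#b]?)(-?\d+)\s+(\d+)/(\d+)\s+(\S+)$")


def parse_score(text):
    """Parse the made-up score format into (tempo_bpm, [Note, ...])."""
    tempo = None
    notes = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("tempo"):
            tempo = int(line.split()[1])
            continue
        m = NOTE_RE.match(line)
        if m is None:
            raise ValueError(f"cannot parse line: {line!r}")
        pitch, octave, num, den, lyric = m.groups()
        notes.append(Note(pitch, int(octave), Fraction(int(num), int(den)), lyric))
    if tempo is None:
        raise ValueError("score has no tempo line")
    return tempo, notes


def seconds(note, tempo_bpm):
    """Nominal length in seconds, assuming the metronome counts quarter notes."""
    beats = note.duration / Fraction(1, 4)
    return float(beats * 60 / tempo_bpm)


EXAMPLE = """
tempo 120
C4 1/4 la:
E4 1/8 la
G4 1/2 la:
"""

tempo, notes = parse_score(EXAMPLE)
for n in notes:
    print(n.pitch, n.octave, n.lyric, seconds(n, tempo))
```

The assumption that the metronome value counts quarter notes is mine as well; the quote only says that a metronome value is given.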
In other words, no piano roll, no visual way of doing pitch correction, and no GUI, period. Everything was typed up as good old plain text. Here is a small excerpt of a score file the article provides:
Of course, if you’ve ever seen samples of the code researchers write, it’s no surprise that they don’t often think about user-friendliness when creating programs. Ideas of good programming standards tend to go out the window, especially in the face of deadlines. This is understandable. None of this was meant for any kind of general consumer base; it was mostly meant as a study of how singing works. In fact, I wish I could see some of this code myself out of academic curiosity.
Vocaloid doesn’t have this excuse.