What is CV?

CV (consonant-vowel) was the first recording method used for UTAU voicebanks. It is also known as 単独音 (tandokuon) by Japanese users, or as Diphones, because each sound is made up of two phonemes. This recording/oto method is extremely easy. It was the first to be implemented in UTAU back in 2008, and was the only method until late 2009. It is highly recommended for newcomers to record, configure, and use.

Using CV

Using CV is extremely simple and needs practically no setup: for the most part you can simply open a .ust file and press play. However, I recommend fitting the .ust to the UTAU, which you can read more about here. First, we'll open up our .ust file. I'll be using a .ust file for Frog Song/カエルの歌 (kaeru no uta).

We can tell this is a CV .ust by looking at the notes and seeing how each sound is placed. In CV format, you have one recording for each syllable, so "kaeru no uta ga" is entered one syllable per note, as "ka e ru no u ta ga". Since the .ust is already in CV format, we can simply fit the .ust to our UTAU and press play.
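If you want to see how a phrase breaks down into CV notes, here is a minimal Python sketch. It assumes the lyrics are written in Hepburn-style romaji; the `split_cv` function and its pattern are my own illustration, not something built into UTAU, and the pattern is not a complete romaji parser.

```python
import re

# Assumption: Hepburn-style romaji. Each match is an optional consonant
# (ch, sh, ts, or a single letter, optionally followed by y) plus a vowel,
# or a standalone "n".
_CV_PATTERN = re.compile(r"(?:ch|sh|ts|[kgsztdnhbpmyrwfjlv]y?)?[aiueo]|n")


def split_cv(lyrics: str) -> list[str]:
    """Split a romaji phrase into CV note lyrics, one entry per syllable."""
    notes = []
    for word in lyrics.lower().split():
        notes.extend(_CV_PATTERN.findall(word))
    return notes


print(" ".join(split_cv("kaeru no uta ga")))  # ka e ru no u ta ga
```

Each item in the returned list corresponds to one note in the .ust.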


It sounds alright. However, we can make CV voicebanks sound smoother by using an option in UTAU called "Crossfade". Crossfade overlaps the envelope of a vowel sound with the envelope of the preceding sound, which makes vowels much smoother. To crossfade, select the notes in your .ust, go to Tools, then Built in Tools, select Crossfade, and press OK.
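To give a rough idea of what crossfading does to the audio, here is a generic linear crossfade in Python. This is not UTAU's actual code: the `crossfade` function, the plain lists of samples, and the linear fade shape are all assumptions I made for illustration.

```python
def crossfade(prev_note: list[float], next_note: list[float], overlap: int) -> list[float]:
    """Join two sample lists, blending `overlap` samples where they meet."""
    # Never overlap more samples than either note actually has.
    overlap = min(overlap, len(prev_note), len(next_note))
    if overlap <= 0:
        return prev_note + next_note  # nothing to blend, just butt-join

    start = len(prev_note) - overlap
    blended = []
    for i in range(overlap):
        t = (i + 1) / (overlap + 1)  # fade position, strictly between 0 and 1
        # The previous note fades out while the next note fades in.
        blended.append(prev_note[start + i] * (1 - t) + next_note[i] * t)

    return prev_note[:start] + blended + next_note[overlap:]
```

Without the overlap, the previous note would stop dead and the next would start at full volume, which is the abrupt jump you hear before crossfading.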

Recording CV

Recording CV is easy and quick. Because CV is diphonic, you record only one sample for each syllable, which keeps the number of recordings small.

example: ( ka )

In this example, we can see the "ka" sound recorded by itself, with no other sounds before or after it. Because CV is diphonic, this sample will be used for every "ka" sound the voicebank encounters in .ust files.

The same thing goes for every other sound in CV, which is why CV voicebanks are small and quick to record. Keep in mind that recordings should be 1-2 seconds long each, so that the resampler doesn't stretch the sample too much.
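If you'd like to double-check your recordings against the 1-2 second guideline, a small Python sketch like the one below can help. It only uses the standard `wave` module; the function names and the idea of checking files this way are my own assumptions, not part of UTAU.

```python
import wave

MIN_SECONDS = 1.0  # lower bound of the guideline above
MAX_SECONDS = 2.0  # upper bound of the guideline above


def wav_duration(path: str) -> float:
    """Return the length of a .wav file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_length(seconds: float) -> bool:
    """True if a recording falls inside the 1-2 second guideline."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

For example, `is_good_length(wav_duration("ka.wav"))` would tell you whether a hypothetical `ka.wav` in your voicebank folder is within range.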

OTO'ing CV

OTO'ing CV is generally a quick process, since there are few recordings and only one oto per recording. With CV, we must OTO different sounds/phonetics differently depending on their consonant. Listed below is an image guide for various consonants and how to OTO them, but please refer to the OTO'ing page if you need more information about OTO'ing.

Vowels (a,i,u,e,o,n; this oto also applies to w and y sounds)

Hard consonants (k,t,ch,ts,p)

Semi-hard consonants (j,g,d,r,b)

Soft consonants (m,n,h,sh,s,f,l,v,z)
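The four groups above can be sketched as a small Python lookup, in case you want to sort your samples before OTO'ing them. The `oto_group` function and the "unknown" fallback are my own assumptions for illustration; the consonant lists are taken straight from the groups above.

```python
VOWELS = {"a", "i", "u", "e", "o"}

# Consonant groups exactly as listed above.
GROUPS = {
    "hard": ("k", "t", "ch", "ts", "p"),
    "semi-hard": ("j", "g", "d", "r", "b"),
    "soft": ("m", "n", "h", "sh", "s", "f", "l", "v", "z"),
}


def oto_group(alias: str) -> str:
    """Return which oto group a CV sample such as "ka" belongs to."""
    alias = alias.lower()
    # Plain vowels, standalone "n", and w/y sounds all use the vowel oto.
    if alias in VOWELS or alias == "n" or alias[:1] in ("w", "y"):
        return "vowel"
    onset = alias.rstrip("aiueo")  # "chi" -> "ch", "ka" -> "k"
    for group, consonants in GROUPS.items():
        if onset in consonants:
            return group
    return "unknown"  # assumption: anything not listed above is left unsorted


print(oto_group("ka"), oto_group("ga"), oto_group("shi"), oto_group("wa"))
# hard semi-hard soft vowel
```

Note that "n" shows up twice in the lists: on its own it is treated like a vowel, while sounds such as "na" fall under the soft consonants.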

All in all, CV is a great starting point for new UTAU users, and is easy to use, record, and configure. It has to be pretty easy, considering it was the first recording/voicebank method out there.