UTAU Terminology

UTAU: Japanese word for “sing” and the name of the program. UTAU is a freeware vocal synthesis program in the same vein as Vocaloid.

Vocaloid: A commercial Vocal Synthesis program, more popular than UTAU. There’s a lot of crossover between Vocaloid Music and UTAU, as UTAUloids often cover Vocaloid songs.

Vocal synthesis: Used for UTAU and Vocaloid. The act of creating a musical instrument from a real voice. By taking samples of a real voice and programming them a certain way, we can “make a computer program sing.”

Voicebank: The folder of .WAV or .aiff files containing the recordings of the UTAUloid. Usually paired with an avatar/character of some kind. Often shortened to just “Bank.”

.WAV: The file format recognized within the UTAU program and used for Recording an UTAUloid. UTAU won’t read a Voicebank full of .mp3 files.

UTAUloid: A term derived from “Vocaloid” (which itself is a contraction of the words “Vocal + Android”). It is used to refer both to the Voicebank and to the avatar/character associated with the Voicebank. Many UTAU users treat the Voicebanks they’ve created like OCs.
Ex. “This is my UTAUloid, I voiced her myself!”

Vipperloid(s): A Series of Popular Japanese UTAUloids originating from vip@2ch.

.UST: UTAU Sequence Text files. The file format that the UTAU program uses. Similar in style to sheet music or a MIDI file.
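
As a rough illustration of the "sheet music" idea, here is a minimal Python sketch that reads notes out of UST-style text. It assumes the commonly seen INI-like layout with bracketed note sections and `Length`, `Lyric`, and `NoteNum` keys; the sample text and the `parse_notes` helper are made up for illustration, and real files carry more fields than shown.

```python
# Hypothetical, trimmed-down UST-style text; real files contain more fields.
SAMPLE_UST = """\
[#SETTING]
Tempo=120.00
[#0000]
Length=480
Lyric=ka
NoteNum=60
[#0001]
Length=240
Lyric=sa
NoteNum=62
[#TRACKEND]
"""


def parse_notes(text):
    """Return a list of dicts, one per numbered note section."""
    notes = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[#") and line.endswith("]"):
            name = line[2:-1]
            # Only sections named with digits (e.g. 0000) are treated as notes.
            current = {} if name.isdigit() else None
            if current is not None:
                notes.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return notes


notes = parse_notes(SAMPLE_UST)
for note in notes:
    print(note["Lyric"], note["NoteNum"], note["Length"])
```

Each note carries a lyric, a pitch (as a note number), and a length, which is roughly the same information a line of sheet music or a MIDI track would hold.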

MIDI: (Musical Instrument Digital Interface) A file format often used in the creation of UST files.

Flags: Voice modifiers written into the properties of a UST. Can make a voice sound sweeter, softer, more muffled, more breathy, etc., depending on which flags are used. (More on what flags do here)

OTO: The configuration file for making a Voicebank sing correctly (with the right “Timing”).

Alias: Alternate names given to the WAV recordings in the OTO file. Multiple aliases can be given to a single WAV recording. Used heavily with VCV Voicebanks.
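
To make the OTO and alias entries above more concrete, here is a small Python sketch that parses OTO-style lines. It assumes the commonly documented layout of `file.wav=alias,offset,consonant,cutoff,preutterance,overlap`; the file name, the numbers, and the `OtoEntry`/`parse_oto_line` names are invented for illustration and are not taken from a real Voicebank.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    wav: str
    alias: str
    offset: float
    consonant: float
    cutoff: float
    preutterance: float
    overlap: float


def parse_oto_line(line):
    """Split one OTO-style line into the WAV name, alias, and five timing values."""
    wav, params = line.strip().split("=", 1)
    alias, *numbers = params.split(",")
    offset, consonant, cutoff, preutterance, overlap = (float(n) for n in numbers)
    return OtoEntry(wav, alias, offset, consonant, cutoff, preutterance, overlap)


# Hypothetical lines: two aliases pointing at the same recording.
lines = [
    "_akasa.wav=- a,100,80,-300,40,10",
    "_akasa.wav=a ka,450,90,-320,60,30",
]
entries = [parse_oto_line(line) for line in lines]
aliases_for_wav = [e.alias for e in entries if e.wav == "_akasa.wav"]
print(aliases_for_wav)
```

The two entries share one WAV file but have different aliases and timing values, which is how a single recording can supply several sounds.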

CV: “Consonant Vowel” Style Voicebank. The smallest and simplest style of Voicebank. (see: CV)

Rentan: A style of CV Voicebank where sets of CV samples are recorded all at once, instead of split up into individual files. Works identically to a standard CV voicebank within the program once configured.

VCV: “Vowel Consonant Vowel” Style Voicebank. The most popular form of Voicebank. While it is much larger and takes more recordings to create than a CV bank, it sounds much smoother as a result. (see: VCV)
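
The difference between CV and VCV lyrics can be sketched in a few lines of Python. This assumes romaji lyrics and the common "previous vowel, space, syllable" alias convention, with "-" marking the start of a phrase; the `cv_to_vcv` helper is hypothetical and ignores edge cases such as rests and long vowels.

```python
VOWEL_ENDINGS = "aiueon"  # the five vowels, plus the standalone "n"


def cv_to_vcv(lyrics):
    """Turn a list of CV syllables into VCV-style aliases."""
    result = []
    prev = "-"  # "-" marks the start of a phrase
    for syllable in lyrics:
        result.append(f"{prev} {syllable}")
        last = syllable[-1]
        # The next alias starts with the sound this syllable ends on.
        prev = last if last in VOWEL_ENDINGS else "-"
    return result


print(cv_to_vcv(["sa", "ku", "ra"]))  # ['- sa', 'a ku', 'u ra']
```

Because each alias includes the vowel that came before it, the bank needs a recording for every vowel-to-syllable transition, which is why VCV banks are larger than CV banks.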

Mora: The number of syllables in a recording string.

Lite VCV: A shorter/smaller version of a VCV Voicebank with most of the same smoothness as a full VCV bank. Recommended to those who find VCV lists too daunting to record.

CVVC: A CV Voicebank with added “VC”s. At its best, nearly indistinguishable from VCV with more flexibility; at its worst, a little better than a CV bank with a strange tone to it. (see: CVVC)

VCCV: A newer style of Voicebank, developed by Cz, used primarily for English-language UTAUloids. (More on how VCCV works in the VCCV article)

Multipitch: A Voicebank where multiple single banks are put together to allow for a much greater range.

Kire: A popular recording style of Multipitch Voicebank where the tone of the voice gets stronger in the higher pitches, as opposed to keeping a more consistent tone throughout.

Prefix map: The file that dictates which pitches of the Multipitch Voicebank sing on which notes. Can be changed by the user freely.
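
The idea behind a prefix map can be sketched in Python as a lookup from note to pitch. This does not reproduce the actual file format; the three-pitch bank, the suffixes, the range boundaries, and the function names below are all hypothetical, and notes are given as MIDI note numbers with 60 taken as C4.

```python
# Hypothetical three-pitch bank: each suffix covers notes up to an upper bound.
PITCH_RANGES = [
    (59, "_G3"),   # notes up to B3 use the low recordings
    (66, "_D4"),   # C4 through F#4 use the middle recordings
    (127, "_A4"),  # everything above uses the high recordings
]


def suffix_for_note(note_num):
    """Pick the suffix of the pitch that should sing this note."""
    for upper, suffix in PITCH_RANGES:
        if note_num <= upper:
            return suffix
    return PITCH_RANGES[-1][1]


def resolve_alias(lyric, note_num):
    """Attach the chosen pitch suffix to the lyric."""
    return lyric + suffix_for_note(note_num)


print(resolve_alias("ka", 57))  # ka_G3
print(resolve_alias("ka", 60))  # ka_D4
print(resolve_alias("ka", 72))  # ka_A4
```

Changing the boundaries in `PITCH_RANGES` is the sketch's equivalent of a user editing the prefix map to move where one pitch hands over to the next.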

Appends: Voicebanks recorded with a particular tone in mind, such as “Soft,” “Dark,” or “Strong.” Another term derived from Vocaloid. Can be a standalone bank, or part of a Multipitch bank.

Tuning: The practice of manually adjusting the pitch so that the Voicebank sings the song in a more pleasing way.

Consonant Velocity: A tool that changes the speed at which the consonants of samples are played. Useful when a bank is singing very fast, or for making a bank “slur” its singing.

Resampler: The engine that reads the WAV files in the Voicebank folder so that they can be played back in the program.

FRQ: The resampler doesn’t read or change the pitch of the WAV files in the folder directly; FRQs are the files the resampler generates and then uses to play a Voicebank’s files in the UTAU program.

Mixing: The process of taking the vocals you’ve rendered in the UTAU program and blending them with the off-vocal/karaoke track of the song.

FL Studio: A DAW (Digital Audio Workstation) often used to mix songs.

Reaper: Another DAW that is often used to mix songs.

Nico Nico Douga/ニコニコ動画: A popular video sharing website in Japan, sort of a Japanese YouTube. UTAU and Vocaloid covers/originals are frequently uploaded here.

Nikokara/ニコカラ: A service within Nico Nico Douga where the lyrics of songs are displayed across the video screen, written in Hiragana/Furigana. Useful for making UST files.

Audacity: A DAW often used to record voicebanks and to mix songs.

OREMO: Another free program used for recording. Created specifically for recording Voicebanks. (see: OREMO and setParam)

setParam: A Voicebank configuration program outside of UTAU. Popular for generating/editing OTO files. Comes alongside OREMO. (see: OREMO and setParam)