Vocal Synthesizer Wiki

TIGER, also known in Chinese as Xuan Mao (炫猫; Xuàn Māo), is a vocal synthesizer character created by tigermeat. He was released as an AI singing model trained in English and Japanese for Diff-SVC in December 2022, and as a second AI singing model, known as "TIGER DS", trained in English, Spanish, French, Japanese, and Mandarin Chinese for DiffSinger in November 2023. He also had a singing voice library in development for ENUNU, but it was never released, as well as speech voicebanks in English, Japanese, and French for the discontinued voice synthesizer NeuTalk.

His voice is provided by the American producer tigermeat.

Concept

TIGER is an energetic and happy-go-lucky tiger-man in his twenties. He's constantly lost in his head, thinking about what songs he's going to skate to at his roller-rink and what games he's going to play on his Sega Dreamcast. He has a small companion named "Tigrito", a virtual pet who can pop out of his Tamagotchi-like device!

Etymology

"TIGER" is named after the large cat he is based on.

His name is rendered in kana as "タイガー".

Appearance

WIP

Relations

  • TRITON - romantic partner. TIGER lives with TRITON on a manatee farm off the coast of a beach.
  • Tigrito - virtual pet and companion.

History

TIGER was originally conceived by tigermeat in 2021 as an AI singer. His design was created by tigermeat & static oceans, and his official key illustrations were drawn by Aquiboni. TIGER was initially planned as a commercial product, but no commercial version of his voicebank was ever released. Instead, his original set of voice data, recorded in 2021 in English and Japanese, was reused for the SVS engine Diff-SVC.

tigermeat began developing TIGER for ENUNU in March 2022, reusing TIGER's original dataset recorded the year prior. The Japanese database was teased in April 2022.[1] TIGER for ENUNU was cancelled soon after due to dissatisfaction with ENUNU's low-quality output at the time and a lack of interest. In August 2022, tigermeat gave the first public demonstration of NeuTalk, a text-to-speech program that ran on the TalkNet engine. On NeuTalk, TIGER could speak in English, Japanese, and French. NeuTalk's development has since been discontinued. In December 2022, tigermeat teased and later released TIGER's Diff-SVC voice model.

TIGER for DiffSinger began development in Spring 2023. It used much of the original dataset; however, supplemental data was recorded to support voice modes, namely "Disco", "Electric", "Vinyl", and "Cetera".

On November 23, 2023, TIGER for DiffSinger was released as version b03. This was labelled a beta release and had many issues, such as vocal modes not working properly and a very low-quality pitch model. TIGER has since been updated many times, with new vocal modes added and others removed or changed. A few weeks later, version b08 was released, which included the new vocal modes "Mystic" (a very soft, whispery tone) and "Falsetto". In b08, TIGER's voice modes were properly configured and worked as intended, but the release still used the pitch model from b03.

Near the end of December, TIGER's first stable release, named v100, came out. The only change in this model was a high-quality, heavily tested pitch model. On January 13, 2024, TIGER v101 was released. It was trained on experimental hyperparameters, which gave slightly higher-quality output at the cost of more computational power and a much larger file size. This voicebank was quickly replaced by v102 on January 22. v102 removed TIGER's Mandarin Chinese data but improved the Japanese and Spanish data. It was also trained on standard hyperparameters and had a much higher-quality pitch model.

On February 6, version v103 was released. This update removed the "Cetera" voice mode, with most of its data moved into "Fresh", since Cross-Language Synthesis was quickly improving in quality. It also added the voice mode "Glam", a very powerful, sometimes strained belting voice mode, and renamed "Falsetto" to "Royal". TIGER was then equipped with three different pitch models, internally called "Standard" (Fresh, Vinyl & Falsetto), "Soft" (Disco & Mystic), and "Power" (Electric & Glam). This was also the first update in which TIGER could sing in French, thanks to Petit Millefeuille.[2]

On April 23, 2024, TIGER v106 was released. This update allowed TIGER to sing in Korean with XLS, and was trained on the "reflow" backend. On September 15, tigermeat showed TIGER v107 Beta with PT-BR support, thanking Team BRAPA for making this possible.[3] On September 25, TIGER v106 was shown using the new vocoder, which tigermeat tentatively called v107. He noted that TIGER sounded clearer.[4]

On October 28, tigermeat posted that he had started to go through and vet TIGER's labels, noting that they needed work.[5] The labels contained errors, and the improvements tigermeat wanted to see were: pronunciation closer to the source when not editing phonetics (fixing cases such as "ay" labelled as "aa"); better vowel timings; removing bad recordings; more consistency in how he labels; fixing [tr] and [dr] timings; removing all of the "ENUNU-isms" (such as [aa aa ay] for a long [ay]); refactoring some data for different Voice Modes (such as moving "Disco" data to "Vinyl" until it is rerecorded, because it contains some belting despite the voice mode being meant to be mellow and calm); and no longer adding a diacritic after unaspirated plosives at the end of a word to modify them. Together, these changes were expected to make TIGER generally sound better without phoneme editing.[6] On November 15, tigermeat announced that he had finished fixing TIGER's labels and planned to train the new voicebank. He still needed to update the pitch model, after which he could release his next round of DiffSinger voice libraries.[7] He also posted that he had done so much work on his DiffSinger corpus that he decided to skip v107 entirely, since he had trained it over a month earlier. The public release would therefore move from v106 straight to v108.[8] On November 16, TIGER's public SVS Corpus was updated to include new labels, six more minutes of English data, and ten more minutes of Japanese data.[9] On December 21, TIGER was updated to v108, which gave him access to the XLS/Multi-Dict update in addition to support for Thai, Russian, and Brazilian Portuguese.[10][11]

On January 9, 2025, tigermeat explored the idea of adding Tension in the next DiffSinger update, but said he might reconsider after many experiments in which Tension, despite sounding good, negatively affected the dynamics the voice libraries have by default.[12] On January 27, 2025, tigermeat announced what the v110 update would consist of for all of the vocals he managed:

  • The public update would move from v108 to v110, skipping v109 because it had been finished for weeks and he was already adding many new things.
  • German would be supported due to the Marzipan DiffSinger corpus.
  • tigermeat discussed the previously tentative Tension, which he sadly could not add after testing it. He was not satisfied with how it sounded on voice libraries with Voice Modes (including TIGER).
  • He noted he was training with TPSE, which v109 had in the beta version, though the initial versions of v109 didn't have it. Users had varying opinions on it, but tigermeat committed to adding it as he liked how it made the voices sound.
  • There were new training parameters for all three main models:
    • Pitch model was trained using a config similar to Pix.
    • The duration model was trained with slightly boosted model parameters and a cyclical learning rate, which he found to be a large upgrade.
    • Acoustic model parameters were a little different, but nothing too drastic. tigermeat still trained on WaveNet, and with TPSE, the difference was drastic.
  • The vocoder was updated. There was a large influx of new data that the previous vocoder did not account for. The details were as follows:
    • It was fine-tuned from OpenVPI's nsf-hifigan model. The current plan was 1M steps.
    • The vocoder corpus was audited and its duration tallied per speaker, and tigermeat confirmed all data was used with proper consent. Some of the datasets he had been using previously were not approved for commercial use, so they were fully removed.
    • Vocoder data stats for tigermeat voice libraries:
      • Total data: 84:30:55
      • Song: 57:56:18
      • Speech: 6:08:07
      • Rap: 0:31:41
      • String (UTAU/DV recordings): 17:51:44
    • Vocoder would be released as an oudep for others to use in their voice libraries for personal use only.
  • TIGER received additional recordings. By v111, TIGER was expected to have a new Voice Mode, and some older Voice Modes were expected to be updated.[13]
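
The vocoder data stats in the list above are durations, which appear to be in H:MM:SS form. As a quick arithmetic check, here is a minimal Python sketch that tallies the four listed categories and compares them with the stated total. The helper names are hypothetical and not taken from tigermeat's tooling, and the H:MM:SS reading is an assumption. Under that reading, the four categories add up to 82:27:50, which is 2:03:05 less than the stated total of 84:30:55. The announcement as summarized here does not say what accounts for the remainder.

```python
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string (assumed format) to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a number of seconds back to an H:MM:SS string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Figures copied from the list above.
categories = {
    "Song": "57:56:18",
    "Speech": "6:08:07",
    "Rap": "0:31:41",
    "String (UTAU/DV recordings)": "17:51:44",
}
stated_total = "84:30:55"

category_sum = sum(to_seconds(value) for value in categories.values())
gap = to_seconds(stated_total) - category_sum

print("Sum of listed categories:", to_hms(category_sum))
print("Stated total:", stated_total)
print("Unaccounted for:", to_hms(gap))
```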

On February 5, 2025, TIGER was updated to v110, which included German support and new training parameters intended to make the voice libraries produced by tigermeat sound more realistic.[14][15] On July 30, it was announced that TIGER's physical boxes would be discontinued.[16]

Voicebanks

TIGER
The singing voicebank of TIGER.
  • TIGER (Diff-SVC), December 2022
  • TIGER (ENUNU), cancelled
  • TIGER DS (DiffSinger), November 24, 2023
TIGER Talk
The talk voicebank of TIGER.
  • TIGER (NeuTalk), cancelled


Reputation

Trivia

  • According to tigermeat's website, TIGER was recorded using the following microphones: Neumann TLM 103, AT4040, and Rode NT1 Signature.

References
