Vocal Synthesizer Wiki

TRITON is a vocal synthesizer character created by tigermeat. He was released as an AI singing model trained in English for DiffSinger, known as "TRITON DS", in June 2024.

His voice is provided by Ryan M.

Concept

TRITON is a fun and somewhat aloof human and mermaid hybrid in his 20s (officially listed as around 25-29 years old). He spends time tending to his manatee farm off the coast of a beach that he and TIGER, his partner, live on. His nationality is listed as an American Merman and his occupation is listed as a Manatee Farmer.

Etymology

While no official information has been released regarding why his name was chosen or what it may mean, "TRITON" may refer to the mythical Greek god of the sea, who was the son of Poseidon and Amphitrite.[1] Triton is also the largest moon of the planet Neptune.

Appearance

WIP

Relations

  • TIGER - romantic partner. TRITON lives with TIGER on a manatee farm off the coast of a beach.
  • Sheriff Spaghetti - companion. A rowdy manatee considered to be TRITON's right-hand man. TRITON and Sheriff Spaghetti love to tend to the manatee farm together.

History

On April 2, 2024, tigermeat showed a demo featuring TRITON, who was voiced by his fiancé Ryan M. and illustrated by julieraptor, though the character was shown only as a silhouette at the time. He was noted to be releasing soon.[2] On June 23, TRITON's illustration was revealed and it was confirmed that he would be released on June 30. He was described as an English-focused DiffSinger voice library able to sing in other languages through Cross-Language Synthesis, including Japanese, French (noted to be trained with "Millefeuille"), Spanish, Italian, Mandarin Chinese, and Korean. His voice type was described as "baritenor", or somewhere in the middle range, and his focus was on indie styles of music with very loose pronunciation. Four Voice Modes were produced for him: "Gale" (the standard voice), "Deluge" (a soft and calm voice), "Tempest" (a solid and powerful voice), and "Flurry" (a light and airy voice focused on the falsetto range). At the time of this announcement, TRITON was listed as v106 and had 54 minutes of data.[3] He was released on June 30 as expected. At some point, the official website was revised to state that TRITON's recording data totaled approximately 1 hour.[4]

On October 28, tigermeat posted that he had started to go through and vet TIGER's labels, noting that they needed work; TRITON would be worked on afterwards.[5] On November 15, tigermeat posted that he had done so much work on his DiffSinger corpus that he decided to skip v107 entirely, since he had trained it over a month earlier. The public update would therefore move from v106 straight to v108.[6] On December 21, TRITON was updated to v108, which gave him access to the XLS/Multi-Dict update in addition to support for Thai, Russian, and Brazilian Portuguese.[7][8]

On January 9, 2025, tigermeat explored the idea of adding Tension in the next DiffSinger update, but said he might reconsider after many experiments showed that Tension, despite sounding good, negatively affected the dynamics the voice libraries had by default.[9] On January 27, 2025, tigermeat announced what the v110 update would consist of for all of the vocals he managed:

  • The public update would move from v108 to v110, skipping v109 because it had been finished for weeks and he was already adding many new things.
  • German would be supported thanks to the Marzipan DiffSinger corpus.
  • tigermeat discussed the previously tentative Tension, which he sadly could not add after testing it. He was not satisfied with how it sounded on voice libraries with Voice Modes (including TRITON).
  • He noted he was training with TPSE, which v109 had in the beta version, though the initial versions of v109 didn't have it. Users had varying opinions on it, but tigermeat committed to adding it as he liked how it made the voices sound.
  • There were new training parameters for all three main models:
    • The pitch model was trained using a config similar to Pix.
    • The duration model was trained with slightly boosted model parameters and a cyclical learning rate (lr), which he noted to be a large upgrade.
    • The acoustic model parameters were a little different, but nothing too drastic. tigermeat still trained on WaveNet, and with TPSE, the difference was drastic.
  • The vocoder was updated. There was a large influx of new data that the previous vocoder did not account for. The details were as follows:
    • Fine-tuned off OpenVPI's nsf-hifigan model. The current plan was 1M steps.
    • Vocoder corpus was audited and timed out by speaker, and tigermeat confirmed all data was used with proper consent. Some of the datasets he was using previously were not approved for commercial use, so they were fully removed.
    • Vocoder data stats for tigermeat voice libraries:
      • Total data: 84:30:55
      • Song: 57:56:18
      • Speech: 6:08:07
      • Rap: 0:31:41
      • String (UTAU/DV recordings): 17:51:44
    • The vocoder would be released as an oudep for others to use in their voice libraries for personal use only.[10]
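
The vocoder duration figures listed above can be cross-checked with a short script. The following is a minimal sketch, assuming the figures are in H:MM:SS format; the helper names `parse_hms` and `format_hms` are illustrative and do not come from tigermeat's tooling. The four listed categories add up to 82:27:50, which is 2:03:05 short of the reported total; the announcement does not say how the remainder is categorized.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'H:MM:SS' string (hours may exceed 24) into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta: timedelta) -> str:
    """Format a non-negative timedelta back into 'H:MM:SS'."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Figures as listed in the v110 announcement summary.
reported_total = parse_hms("84:30:55")
categories = {
    "Song": parse_hms("57:56:18"),
    "Speech": parse_hms("6:08:07"),
    "Rap": parse_hms("0:31:41"),
    "String (UTAU/DV recordings)": parse_hms("17:51:44"),
}

# Sum of the listed categories and the gap to the reported total.
category_sum = sum(categories.values(), timedelta())
unaccounted = reported_total - category_sum

# Share of the reported total taken by each listed category.
shares = {name: dur / reported_total for name, dur in categories.items()}

if __name__ == "__main__":
    for name, dur in categories.items():
        print(f"{name}: {format_hms(dur)} ({shares[name]:.1%})")
    print(f"Sum of listed categories: {format_hms(category_sum)}")
    print(f"Not covered by listed categories: {format_hms(unaccounted)}")
```

Song data makes up the majority of the reported total, with string (UTAU/DV) recordings as the second-largest category.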

On February 5, 2025, TRITON was updated to v110, which included German support and new training parameters intended to make every voice library produced by tigermeat (including TRITON) sound more realistic.[11][12]

Voicebanks

Reputation

Trivia

  • According to tigermeat's website, TIGER was recorded using the AT4040 microphone.

References
