Vocal Synthesizer Wiki


VOICEVOX is an open-source, deep learning-based reading/text-to-speech synthesizer developed by Hiho. Singing synthesis features were added on January 31, 2024.

Etymology

VOICEVOX was named out of a desire to make software that could hold many voices, as if in a box. The "VOX" part of the name comes from the Latin word for "voice", but is also a homonym for "box".[1]

VOICEVOX's main type of synthesis, text-to-speech, was rebranded as VOICEVOX Talk to differentiate it from VOICEVOX Song and Humming, the new type of synthesis introduced in January 2024. According to Hiho, "Song" was chosen over "Sing" because a noun sounded better than a verb for branding purposes, a problem "Talk" did not run into since the word is used as both a noun and a verb.[2]

Product information

Demonstrations

Generation 1 demo YouTube
Generation 2 demo YouTube
Generation 3 demo YouTube
Generation 4 demo YouTube

Requirements

  • OS:
    • Windows: Windows 10 / Windows 11
    • Mac: macOS Catalina or later
    • Linux: Ubuntu 18.04 / Ubuntu 20.04
  • GPU: Nvidia

VOICEVOX Talk

VOICEVOX Talk (VOICEVOX トーク) is VOICEVOX's text-to-speech feature.

Releases

1st generation

2nd generation

3rd generation

4th generation

5th generation

6th generation

7th generation

8th generation


Upcoming voicebanks


VOICEVOX Song

VOICEVOX Song (VOICEVOX ソング) is VOICEVOX's singing synthesis feature. It was first revealed alongside the Humming feature on December 28, 2022.[3] A prototype release for Song and Humming was slated for late January 2024. The prototype was to include only the bare minimum of features, with pitch editing and other features planned for after release. The user interface was likewise a prototype, to be improved based on user feedback. The editor and API were planned to be open source, and the editor for Humming was to be integrated into the base VOICEVOX software.[4] The prototype was released on January 31, 2024, and formally introduced as VOICEVOX Song.[5]

Humming

Humming (ハミング) is a feature that allows VOICEVOX voices to sing using their TTS voicebanks. Compatibility is planned by generation, with generations 1 and 2 in development at the time of announcement.[6] In theory, all voices and emotion styles should be compatible with the Humming feature. However, VOICEVOX must consult the IP owners on whether the output is up to par; if it is not, those voicebanks will not have the Humming feature available.

Releases

Humming Compatibility


Upcoming voicebanks



VOICEVOX Nemo

VOICEVOX Nemo is VOICEVOX's effort to provide voicebanks that have no associated characters and simpler licensing. It was first announced on March 18, 2022.[7] The voices were selected through an audition that opened on the day of the announcement, and the winners were revealed on April 1, 2022.[8] A demo was showcased on January 2, 2023,[9] but the voices themselves were not released until November 17,[10] with an additional female voice included.

Releases


References

External links

Official

Unofficial

Articles
