
HIVE

Macielos
Hey,
We are using ElevenLabs (Generative AI Text to Speech & Voice Cloning). It's pretty simple to use, for a small monthly fee. To paraphrase Milton Friedman: there's no such thing as a free (and good) service ;).

We just used voiceline compilations extracted from other games, mostly WoW and HOTS, as samples to generate a voice for a specific character. The results... varied. Sometimes 2 minutes of voicelines were enough to get a realistic-sounding voice, but we mostly used 10-20-minute compilations. Generally you'll get better results with samples that are longer and free of background noise like environment sounds or electronic beeps.

When generating a particular voice line, usually 3-5 attempts were enough to get good enough results, but there were a few lines and some characters that indeed took us much longer. We mostly manipulated one parameter, stability: lower stability made the results more variable, so they were more often bad, but after more attempts you could get some true pearls. For longer texts it's obviously better to use higher stability, so you get fewer surprises where the character says everything fine and then screws up the last word.
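If you wanted to script that workflow instead of clicking through it, a rough sketch could look like the one below. Everything in it is an assumption for illustration: `synthesize` is a hypothetical stand-in for whatever TTS call you use (it is not the ElevenLabs API), and the thresholds and stability values are made-up numbers, not the ones we used.

```python
def pick_stability(text, long_text_chars=200, low=0.3, high=0.75):
    """Pick higher stability for long texts, lower for short ones.

    All three numbers are hypothetical defaults chosen for the example.
    """
    return high if len(text) > long_text_chars else low


def generate_takes(text, synthesize, attempts=5):
    """Generate several takes of one line so a human can pick the best.

    `synthesize` is a hypothetical callable taking (text, stability)
    and returning one generated take.
    """
    stability = pick_stability(text)
    return [synthesize(text, stability) for _ in range(attempts)]


if __name__ == "__main__":
    # Dummy synthesizer so the sketch runs without any TTS service.
    def fake_synthesize(text, stability):
        return {"text": text, "stability": stability}

    for take in generate_takes("For the Alliance!", fake_synthesize, attempts=3):
        print(take)
```

Picking the best take still has to be done by ear; the loop only saves you the repeated clicking.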
Macielos
In the end we did some postprocessing in Audacity: applying the "normalize volume" filter (I believe our magic number was -0.17 or something like that) so that everything has similar loudness, and sometimes also slightly changing the speed.
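For anyone curious what that normalization step does to the audio, here is a minimal sketch of peak normalization in plain Python. It assumes the magic number is a peak level in dB relative to full scale and that samples are floats in the range -1.0 to 1.0; the function name `normalize_peak` is made up for this example and is not Audacity's code.

```python
def normalize_peak(samples, target_db=-0.17):
    """Scale samples so the loudest peak sits at target_db (dB full scale)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence (or empty input): nothing to scale.
        return list(samples)
    # Convert the dB target to a linear amplitude: 0 dB -> 1.0.
    target_amplitude = 10 ** (target_db / 20)
    gain = target_amplitude / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    quiet_line = [0.1, -0.5, 0.25]
    louder = normalize_peak(quiet_line)
    print(max(abs(s) for s in louder))
```

Since every line gets scaled to the same peak, clips recorded at different volumes end up at a similar level, though peak level is only a rough proxy for perceived loudness.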