
Does using AI-generated voice for cinematics ruin the game?

Status
Not open for further replies.
Level 2
Joined
Dec 26, 2020
Messages
4
I tried adding computer-generated voice lines (specifically Google TTS) to my cinematics, to do justice to the speaking unit, give it some life, and make the speech clearer.

The problem is that the AI voice is quite emotionless and sometimes glitchy/robotic, and I know I can't do anything about it. But leaving cinematics voiceless feels so empty (and boring).

Should I use these AI voices or should I just leave the cinematics with subtitles only?
 
Level 20
Joined
Feb 23, 2014
Messages
1,264
A bad voiceover can be really detrimental to the user experience, and modern text-to-speech solutions, as decent as they are in some applications, cannot produce a good voiceover. So I second what has been said above: it's better to have nothing than to use TTS.
 
Level 4
Joined
Feb 21, 2021
Messages
40
I mean, wouldn't it be more fun to ask the community to help you voice the cutscene? There are a lot of us, and I think you'd find people willing to voice whatever you want if you give them context, a time limit, and scripts, or even better, if you send the silent clips.
 
Level 13
Joined
Oct 18, 2013
Messages
690
I like the idea, and the concept's been posted here before. This is someone using a neural network to synthesize voice lines in the style of a target voice:
 

pyf

Level 32
Joined
Mar 21, 2016
Messages
2,985
[...] Should I use these AI voices or should I just leave the cinematics with subtitles only?
TTS voices built from digitized samples and labelled 'Natural Voices' sound the best overall, but their reading engines are mainly designed to imitate someone reading aloud, not an actor performing.

Imho, the overall quality of speech synthesis depends heavily on the digitized voice samples themselves, as well as on the options for tweaking the pauses right after punctuation so that the sentences flow naturally. For the record, some professional software manages to replicate the subtle breathing sounds of a person talking, and sometimes some intonation as well.


With this site, you can experiment with the pitch and thereby create 'funny' voices:

This other one has fewer features, but it lets you compare with Microsoft's Zira (which was designed to sound good on low-bandwidth laptop speakers, with a minimum file size for its digitized samples, and was powered by the MS TTS 5.0 engine iirc).

Try to experiment first with anything that is available for free online.
 