> xVAsynth voice generation, Let's swap tips!
Lena Wolf
post Nov 29 2022, 05:32 PM
Post #1


Mouth
Joined: 18-May 21
From: Bravil



Let's have a whole new thread about voice generation with xVAsynth! This is for all the different games that it supports. There are differences of course, but I think there are more similarities and hopefully we can help each other.

I'll start - this is for Oblivion, a reprint of my post in the Wolf Mods thread.

I had another go at voice generation in batch mode, and I now have a better idea of what's going on. I wrote it up here, if anyone is interested.

It's a huge pain at the moment to export quest dialogue. There is an xEdit script for Skyrim but not for Oblivion, and Skyrim quest records are too different - voice types are independent from race in Skyrim. I looked at that script... complicated. I know it's "only" a Pascal program, but it's yet another API to learn... mmm... Does anyone have a similar script for Oblivion? :D


--------------------
"What is life's greatest illusion?"
"Innocence, my brother."

Renee
post Nov 29 2022, 09:38 PM
Post #2


Councilor
Joined: 19-March 13
From: Ellicott City, Maryland



Hee, I'm not even sure what this is! So I Google'd.

xVASynth is an AI tool for generating high-quality voice acting lines using voices from video games. The app supports hundreds of voices, across dozens of games, and provides pitch, duration, and energy control at per-letter granularity.

So if I understand this correctly, you could take some voice files from, let's say, The Witcher, and use them in some other game?
Lena Wolf
post Nov 29 2022, 10:32 PM
Post #3


Mouth
Joined: 18-May 21
From: Bravil



Ok, sorry Renee, what you read there, it is... well... advertising. ;) Not technically wrong, but not quite up to the everyday reality either. :D

What xVAsynth does is generate voice files from the text lines that you type into your quest window, and it works for a wide range of games: certainly Morrowind, Oblivion, Skyrim and Fallout, to name a few, which are the ones we care about here. Here it is on Nexus.

For it to work, it requires what is called "voice models" - data files based on actual voice lines recorded by actors, such as all the different voice lines for each race in Oblivion. There is a different tool that allows you to create these models - to train them, as it is called. So if you wanted to, say, take Geralt's voice from The Witcher and make him say new lines for your mod, you would first need to train a "Geralt" model and then use it to generate your new lines.

That's too complicated for me though. :) I just use models that someone else already created. Previously those models were not very good, so the newly generated voice lines sounded very mechanical. But recently both the engine and the models for Oblivion got revamped and improved, and can now generate quite acceptable voice files. I believe Skyrim voice models are even better, and that's what Ghastley is using to voice his mods.

But to start with, you need those text lines exported from your quest window into a very specific format that the voice synthesizer can read. And that's a job and a half already!

It is also possible to hand-craft each line and tune the parameters until you're happy with the way it comes out. I don't have that kind of time! But Ghastley does. :D

I think Zelazko is also using it, so that's three people already, and I figured we might want to exchange tips. Hence this thread.



Renee
post Nov 29 2022, 11:05 PM
Post #4


Councilor
Joined: 19-March 13
From: Ellicott City, Maryland



Wow, that sounds really neat. I'm in shock. I mean, a similar sort of technology exists. mALX has a program which types whatever she orates, for instance. But this is the first time I've heard of text-turning-into-voice for a videogame. :O
Lena Wolf
post Nov 29 2022, 11:30 PM
Post #5


Mouth
Joined: 18-May 21
From: Bravil



This is how Morroblivion is voiced. No more walls of text, they are actually saying it! :O The files are a bit old and it all sounds a bit mechanical, yet I'll take it over a wall of text any day (and am using it). But these new models are so much better! Only it's a big job to get from the text lines in a quest window to the actual voice files that play in-game...

This post has been edited by Lena Wolf: Nov 29 2022, 11:31 PM



Renee
post Nov 30 2022, 08:28 PM
Post #6


Councilor
Joined: 19-March 13
From: Ellicott City, Maryland



Again, it sounds really awesome. Sorry I can't be of any help on the subject. What a neat program, though.

Edit: is xVAsynth similar to Microsoft Sam?

This post has been edited by Renee: Nov 30 2022, 08:30 PM
Lena Wolf
post Nov 30 2022, 09:50 PM
Post #7


Mouth
Joined: 18-May 21
From: Bravil



QUOTE(Renee @ Nov 30 2022, 07:28 PM)

Is xVAsynth similar to Microsoft Sam?

I never used MS Sam, but by the looks of it, it is similar technology. Except that of course Sam only talks like Sam, whereas xVAsynth allows you to choose from various models. You know, so that your Nords don't sound like your Imperials - that would be awful! ;)




Lena Wolf
post Nov 30 2022, 10:34 PM
Post #8


Mouth
Joined: 18-May 21
From: Bravil



I spent two days straight up generating voice lines for TWMP Northern Realms - my nickname for the conglomerate of TWMP Hammerfell, High Rock (empty as it is), Skyrim, Stirk and Chain Islands. This covers the following mods:

- TWMP Hammerfell
- TWMP High Rock (nothing there yet, but let's not leave it out)
- TWMP Skyrim Improved
- TWMP Locations
- TWMP Skyrim Alive

Yesterday the whole day (!) was spent exporting dialogue from these mods. It should be possible to do it faster, but I couldn't find another way, so I was sitting here at my PC all day clicking the "Export dialogue" button on a per-quest basis. Each quest took anywhere between 1 and 10 minutes to export. Not fast enough to keep my attention, yet not long enough to be able to focus on something else in between... Infuriating.

Once that was done, the files had to be converted into the input format that xVAsynth expects and voice IDs had to be filled in for each line. That's 21,895 lines, thank you very much.

So this morning was spent cleaning the data. Nords and Orcs for example speak with the same voice, so once you convert the race+sex combo into a voice ID, you find a lot of duplicate lines. Delete them because the Synth is not smart enough to preprocess your data for you. Still, I was left with 5,697 lines.
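For anyone who wants to script this cleaning step rather than do it by hand, here is a minimal Python sketch of the idea: map each race+sex combo to a voice ID, then drop lines that end up identical for the same voice. The field names (`race`, `sex`, `text`, `voice_id`) and the voice IDs in the table are made up for illustration; the real ones depend on your export and on the voice models you have installed.

```python
# Hypothetical (race, sex) -> voice ID table. The IDs are placeholders,
# not real xVASynth voice IDs; fill in the ones from your own models.
VOICE_IDS = {
    ("Nord", "Male"): "nord_male",
    ("Orc", "Male"): "nord_male",  # Nords and Orcs speak with the same voice
    ("Imperial", "Male"): "imperial_male",
}


def to_voice_lines(rows):
    """Convert exported rows to voice lines and drop duplicates.

    Each row is a dict with "race", "sex" and "text" keys (assumed names).
    Two rows are duplicates when they share a voice ID and the same text.
    The first occurrence is kept, in its original order.
    """
    seen = set()
    result = []
    for row in rows:
        voice_id = VOICE_IDS[(row["race"], row["sex"])]
        key = (voice_id, row["text"].strip())
        if key not in seen:
            seen.add(key)
            result.append({"voice_id": key[0], "text": key[1]})
    return result
```

A race+sex combo missing from the table raises a `KeyError`, which is deliberate here: it is better to find out about an unmapped voice before synthesis than after.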

Tried loading that into the Synth: it would start synthesizing, but after some 200-300 lines it would crash, not even making a dent in it. Turns out the default settings seem to be meant for a high-end PC, or maybe just a modern PC, not a 12-year-old thing like mine. Turned down the settings and enabled GPU and VRAM usage - that helped enormously. Still, I found it necessary to split up the big file per voice - trouble seems to start when the Synth tries to do the smart thing and group the data... Don't.
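Splitting the big file per voice is also easy to script. The sketch below is only an illustration, not the layout xVAsynth itself requires: it assumes each line is a dict with `voice_id` and `text` keys (made-up names) and writes one CSV per voice. Adjust the columns to whatever your batch input actually needs.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_voice(lines, out_dir):
    """Write one CSV file per voice ID and return {voice_id: path}.

    `lines` is a list of dicts with "voice_id" and "text" keys (assumed
    column names, not confirmed against the xVASynth batch format).
    """
    groups = defaultdict(list)
    for line in lines:
        groups[line["voice_id"]].append(line)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for voice_id, group in groups.items():
        path = out_dir / f"{voice_id}.csv"
        # newline="" lets the csv module control line endings itself
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(
                f, fieldnames=["voice_id", "text"], extrasaction="ignore"
            )
            writer.writeheader()
            writer.writerows(group)
        written[voice_id] = path
    return written
```

Each resulting file can then be loaded into the Synth on its own, so a crash only costs you one voice's worth of lines.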

After that it didn't take too long to generate all files, but still it was all afternoon babysitting it. Again, can't get away and can't really focus on anything else.

The next step was lip sync generation. This is done with the CS (or at least I don't know another way to do it). Fortunately, Vorians took the heat on that topic and ShadeMe even made some changes in the CSE especially for that - which I also hijacked. :D With the latest development build, lip sync file generation can actually be done in batch mode from the Character menu, and it works! Another couple of hours and you've got it. How many hours is it already altogether? Too many.




