What hardware do you need for a text-to-speech (TTS) system?
During text-to-speech synthesis, most computations are performed on the graphics card (GPU). However, adding more GPU cards to the machine does not increase synthesis speed. Therefore, InteliWISE recommends the Nvidia RTX 2080 Ti.
The CPU has a smaller impact on these computations.
The table below shows our declared synthesis speed per CPU.
Declared number of parallel synthesis channels with audio stream per GPU card