Voices & Music

The Voices page is the standalone audio studio in AI Suite. It bundles three modes — Voice Cloning, Text-to-Speech, and Music — into a tabbed interface so you can produce dialogue, narration, and soundtracks without leaving the page.

Tabs

Voice Cloning

Clone a voice from a short audio sample. Once cloned, the voice can be used for text-to-speech generation or assigned to dialogue lines in the Video Generator Node for automatic lip-sync.

Use Case	Cost
TTS voice clone	1,500 credits (one-time)
Video lip-sync clone	7 credits (one-time)

Cloned voices are stored at the workspace level and can be reused across all your projects.

Text-to-Speech

Generate spoken audio from a text script using a system voice or one of your cloned voices.

Setting	Description
Voice	Choose from 27 system voices or any voice you've cloned
Script	The text to be spoken (free-form)
Model	Speech-02-HD

Cost: 10 credits per 100 characters.
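
To budget a script before generating it, the rate above can be turned into a quick estimate. The sketch below is illustrative only: the function name `estimate_tts_credits` is hypothetical, and it assumes that a partial block of 100 characters is billed as a full block, which this page does not state.

```python
import math

CREDITS_PER_BLOCK = 10  # from the pricing line above
BLOCK_SIZE = 100        # characters per billed block


def estimate_tts_credits(script: str) -> int:
    """Estimate Text-to-Speech credits for a script.

    Assumption: partial blocks round up to a full 100-character block.
    The actual rounding rule is not documented on this page.
    """
    if not script:
        return 0
    blocks = math.ceil(len(script) / BLOCK_SIZE)
    return blocks * CREDITS_PER_BLOCK


if __name__ == "__main__":
    sample = "Welcome to the show. " * 12  # 252 characters
    print(len(sample), "characters ->", estimate_tts_credits(sample), "credits")
```

If billing turns out to be proportional rather than per block, replacing the `math.ceil` step with a plain division gives the prorated figure.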

Music

Generate full music tracks from a text prompt. Two tiers are available.

Tier	Model	Cost
Standard	ACE-Step	~0.2 credits/second (minimum 1)
Premium	ElevenLabs	800 credits/minute
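
The two tiers are priced in different units, so a small helper makes them easier to compare for a given track length. This is a rough sketch, not the billing logic: `estimate_music_credits` is a hypothetical name, the Standard rate is approximate (the table says "~0.2"), "minimum 1" is read here as a 1-credit floor, and Premium is assumed to be prorated by the second.

```python
STANDARD_RATE_PER_SECOND = 0.2   # approximate, per the table ("~0.2")
STANDARD_MINIMUM_CREDITS = 1     # assumption: "minimum 1" means 1 credit
PREMIUM_RATE_PER_MINUTE = 800    # from the table


def estimate_music_credits(duration_seconds: float, tier: str = "standard") -> float:
    """Estimate credits for a music track of the given length.

    Assumptions: Premium is prorated per second, and the Standard
    minimum is a floor of 1 credit. Neither is confirmed by this page.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    if tier == "standard":
        return max(STANDARD_MINIMUM_CREDITS, duration_seconds * STANDARD_RATE_PER_SECOND)
    if tier == "premium":
        return duration_seconds / 60 * PREMIUM_RATE_PER_MINUTE
    raise ValueError(f"unknown tier: {tier!r}")


if __name__ == "__main__":
    for tier in ("standard", "premium"):
        print(tier, estimate_music_credits(90, tier))
```

Under these assumptions, a 90-second track is far cheaper on Standard than on Premium, so the tier choice matters more than track length for most budgets.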

The Music tab uses the same engine as the Audio Generator Node in Flow Studio. Use the AI Suite version for one-off generations and the node version when you need music inside a workflow.

Note: Music generation occasionally needs a cold start on the underlying provider. The Voices page automatically retries up to 3 times with a short backoff, so an initial wait of a few seconds is normal.
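
The retry behavior described above follows a common pattern that can be sketched as below. This is not the Voices page's implementation: `ColdStartError` and `call_with_retries` are hypothetical names, and the exponential delay schedule is an assumption, since this page only says "short backoff".

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ColdStartError(Exception):
    """Hypothetical error raised while the provider is still warming up."""


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 3,          # matches "up to 3 times" above
    base_delay: float = 1.0,       # assumption: the real delay is not documented
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying on ColdStartError with exponential backoff."""
    attempt = 0
    while True:
        try:
            return request()
        except ColdStartError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s by default
            attempt += 1
```

With these defaults, a request makes at most four attempts in total: the initial call plus three retries. The `sleep` parameter is injectable so the schedule can be checked without actually waiting.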

When to Use Voices vs the Audio Generator Node

Use Case	Recommended Surface
One-off voice clone for a single video	Voices page
Quick TTS take to test a script	Voices page
Background track for a single export	Voices page
Repeated audio generation as part of a pipeline	Audio Generator Node
Pairing audio with generated video via Video Combiner	Audio Generator Node

Related Pages

Audio Generator Node: equivalent capabilities inside Flow Studio
Video Generator Node: assign cloned voices to dialogue with lip-sync
Credits System: full audio pricing tables
