Voice Engine

This is where you configure how your character sounds: the voice it uses, how expressive or stable that voice is, how often it speaks instead of typing, and how it behaves in live voice calls.

How to get here:

Go to your character’s dashboard → left sidebar → click Universal → click Voice Engine

The full path is: Universal → Voice Engine

This section has five sub-pages, shown at the top of the Voice Engine area:

Overview — details about the current voice
Settings — sliders to adjust how the voice sounds
Frequency — how often the character uses voice vs. text
Instructions — how to use audio tags for emotion/expression
Preferences — how the character behaves in voice channels

At the very top of every Voice Engine page, you’ll see your currently assigned voice displayed with its provider badge (for example, Eleven v3 from ElevenLabs) and a toggle to enable or disable it.

Overview

Full path: Universal → Voice Engine → Overview

This is the information page about your current voice. Think of it as the voice’s “profile page.”

Voice Details

Field	What it means
Voice ID	A unique code that identifies this specific voice. You can copy it by clicking it.
Model	The text-to-speech (TTS) model being used to generate audio. Example: `eleven_v3`.
Language	The main language this voice was trained in. Example: `En` for English.
Accent	The accent of the voice (if it has one). Example: `persian`. You can remove the accent by clicking the ✕ next to it.
Gender	The voice’s gender classification. Example: Male.

Voice Latency

These numbers tell you how fast or slow the voice generates audio:

Metric	What it means
Average	How many seconds it typically takes to generate a voice response. Lower is better.
P95	The 95th-percentile time: 95% of all responses are generated faster than this number, so it shows how slow the slowest typical responses get. Lower is better.

These numbers only measure how long it takes to create the audio itself. The total time before your character speaks also includes the time the AI needs to write what to say, which is not counted here.
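To make the two metrics concrete, here is a minimal sketch of how an average and a P95 could be computed from a list of generation times. The sample timings and the `latency_summary` helper are hypothetical, and the nearest-rank percentile method is an assumption; the dashboard may calculate its figures differently.

```python
from statistics import mean


def latency_summary(samples_s):
    """Return (average, p95) for a list of audio generation times in seconds."""
    if not samples_s:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_s)
    # Nearest-rank percentile: the smallest value that at least 95% of
    # samples are at or below. Integer ceiling division avoids float error.
    rank = (95 * len(ordered) + 99) // 100
    return mean(ordered), ordered[rank - 1]


# Hypothetical timings, not real measurements.
samples = [0.8, 0.9, 1.0, 1.1, 1.0, 0.9, 1.2, 0.8, 1.0, 3.5]
average, p95 = latency_summary(samples)
print(f"average={average:.2f}s p95={p95:.2f}s")
```

In this made-up data a single slow response barely moves the average but sets the P95, which is why the two numbers are worth reading together.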

Training Samples

At the bottom of the Overview page, you’ll see a list of audio files (MPEG format). These are the original recordings the voice was cloned from. Each file shows its length and file size.

Changing Your Voice

Click the Edit Voice button on this page to switch to a different voice for your character.

Settings

Full path: Universal → Voice Engine → Settings

These are sliders that let you fine-tune how the voice sounds. Moving a slider left or right changes one aspect of the character’s voice:

Setting	What it does	Simple explanation
Stability	How consistent the voice sounds from sentence to sentence.	High = the voice sounds very even and uniform. Low = the voice has more natural variation, like a real person.
Similarity Boost	How closely the output matches the original voice recording it was cloned from.	High = sounds more like the original clone recording. Low = sounds slightly different.
Style	How much the voice exaggerates its emotional style and emphasis.	High = dramatic, expressive. Low = flat and neutral.
Speed	How fast the character talks.	Low = slow and deliberate. High = fast.
Pitch	How high or low the voice sounds overall.	High = a higher-pitched voice. Low = a deeper voice.

For natural conversation: Try Stability at 0.6 and Style at 0.4. This gives slight variation without sounding unstable.

For dramatic characters: Lower Stability (0.3) and higher Style (0.6–0.8) make the voice more expressive and theatrical.
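If you keep notes on settings that work for you, the two suggestions above can be written down as presets. This is only an illustrative sketch: the `VoiceSettings` class is hypothetical, the 0 to 1 slider range is an assumption inferred from the example values, and nothing here calls a real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for two of the sliders described above."""
    stability: float
    style: float

    def __post_init__(self):
        # Assumption: sliders run from 0 to 1, as the suggested values imply.
        for name in ("stability", "style"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")


# Presets taken from the suggestions in the text.
NATURAL = VoiceSettings(stability=0.6, style=0.4)
DRAMATIC = VoiceSettings(stability=0.3, style=0.7)  # style anywhere in 0.6-0.8

print(NATURAL)
print(DRAMATIC)
```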

Frequency

Full path: Universal → Voice Engine → Frequency

This controls how often your character responds with a voice message instead of a regular text message (in Discord text channels where voice is enabled).

There are two modes. You can only use one at a time:

Let Character Decide (toggle)

When this toggle is ON, the character will decide on its own whether to send a voice message or a text reply based on the conversation context. The manual slider below becomes disabled.

Manual Frequency Control (slider)

When “Let Character Decide” is OFF, you drag a slider to set the exact rate:

Slider position	What happens
All the way left (0)	The character never sends voice messages — always text only.
In the middle (~0.5)	The character sends voice messages about half the time.
All the way right (1)	The character always responds with voice — never text.
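One way to picture the slider is as a probability applied to each reply. The sketch below assumes exactly that; the `should_send_voice` function is hypothetical and the platform's actual selection logic is not documented here.

```python
import random


def should_send_voice(frequency, rng=random):
    """Decide whether one reply is sent as voice, given a slider value in [0, 1]."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    # rng.random() returns a float in [0, 1), so 0 never sends voice
    # and 1 always does.
    return rng.random() < frequency


rng = random.Random(0)  # seeded so the simulation is repeatable
replies = 10_000
voice_share = sum(should_send_voice(0.5, rng) for _ in range(replies)) / replies
print(f"share of voice replies at 0.5: {voice_share:.2f}")
```

Under this model the slider sets a long-run proportion, not a strict pattern, so a setting near 0.5 can still produce several voice or text replies in a row.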

Instructions

Full path: Universal → Voice Engine → Instructions

This is a text box (up to 1000 characters) where you write instructions for how the voice uses audio tags to show emotion.

What are Audio Tags?

Audio tags are special words you put in square brackets [like this] that tell the voice model to add a specific sound or change how it speaks. The AI includes these tags in the text before it’s turned into audio.

Examples of audio tags:

Tag	What it does
`[whisper]`	Makes the voice quiet and hushed — like whispering
`[yells]`	Makes the voice louder and more intense
`[laughs]`	The voice laughs
`[sigh]`	The voice sighs
`[crying]`	The voice sounds like it’s crying
`[breathes]`	Adds a breath sound
`[applause]`	Adds an applause sound effect
`[British accent]`	Changes the accent mid-speech
`[echoing]`	Adds an echo effect
`[explosion]`	Adds an explosion sound effect
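Since tags are plain bracketed words inside the generated text, they are easy to spot. The following sketch pulls the tags out of a sample reply and shows the words that would actually be spoken. The `split_tags` helper and the sample sentence are hypothetical; this illustrates the bracket format only, not how the voice model processes tags internally.

```python
import re

# Matches anything inside one pair of square brackets, e.g. [British accent].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_tags(text):
    """Return (tags, spoken_text) for a reply that contains audio tags."""
    tags = TAG_PATTERN.findall(text)
    spoken = TAG_PATTERN.sub("", text)
    # Collapse the extra whitespace left behind by removed tags.
    spoken = " ".join(spoken.split())
    return tags, spoken


reply = "[sigh] Fine, I'll explain it again. [whisper] But only once."
tags, spoken = split_tags(reply)
print(tags)
print(spoken)
```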

How to write the Instructions

In the text box, write rules that tell the AI when to use these tags. For example:

Use [laughs] before sarcastic remarks.
Insert [breathes] occasionally for a dramatic feel.
Use [yells] when responding to rude or aggressive messages.
Add [sigh] before answering boring or repetitive questions.

The character will follow these rules every time it generates a voice response.

Preferences

Full path: Universal → Voice Engine → Preferences

These are two simple on/off toggles that control what your character does in Discord voice channels:

Setting	What it does
Auto-join Voice Channels	When ON, the character automatically joins a voice channel as soon as human members are in it, without being invited. When OFF, someone has to manually ask it to join.
Auto-leave Voice Channels	When ON, the character automatically leaves the voice channel when all human members have left — so it’s not sitting alone in an empty channel. When OFF, it stays until someone asks it to leave.
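The two toggles can be read as a small decision rule. The sketch below restates the table as a function; `next_action` and its parameters are hypothetical names used for illustration, and manual join or leave requests are left out.

```python
def next_action(in_channel, human_count, auto_join, auto_leave):
    """Return 'join', 'leave', or 'stay' based on the two preference toggles."""
    # Auto-join: enter as soon as at least one human member is present.
    if not in_channel and auto_join and human_count > 0:
        return "join"
    # Auto-leave: exit once every human member has left.
    if in_channel and auto_leave and human_count == 0:
        return "leave"
    # Otherwise nothing changes until someone asks the character to move.
    return "stay"


print(next_action(in_channel=False, human_count=2, auto_join=True, auto_leave=True))
print(next_action(in_channel=True, human_count=0, auto_join=True, auto_leave=True))
print(next_action(in_channel=True, human_count=0, auto_join=True, auto_leave=False))
```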
