Personality and Fallbacks
Designing resilient assistant behavior
Now that turn-taking feels natural, we can make the agent sound on-brand. We will:
- Give the assistant a distinct personality
- Change the default voice
- Configure fallback adapters so the agent survives provider outages
Voice–prompt synergy
Voices aren't neutral speakers. How the LLM writes affects how the TTS sounds:
- Match phrasing to voice: shorter clauses and fewer parentheticals often sound more natural.
- Dialect awareness: tailor spelling and idioms (e.g., UK vs US) for authenticity.
- Emotion control: add subtle stage directions in system messages. Too many will create a caricature.
- Brevity: keep replies to three sentences or fewer to reduce latency and prevent the agent from monologuing.
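The brevity guideline is easy to check mechanically while you iterate on a prompt. Here is a minimal sketch, assuming a naive definition of a sentence (text ending in ".", "!" or "?"). The helpers `split_sentences` and `is_brief` are hypothetical names for illustration, not part of LiveKit, and abbreviations such as "e.g." will be over-counted.

```python
import re

# Hypothetical helpers, not part of LiveKit: a rough check on reply length.
# A "sentence" here is any run of text ending in '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Naively split text on whitespace that follows '.', '!' or '?'."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def is_brief(text: str, max_sentences: int = 3) -> bool:
    """Return True when the reply has at most max_sentences sentences."""
    return len(split_sentences(text)) <= max_sentences


print(is_brief("Try restarting the router. Did that help?"))
print(is_brief("One. Two. Three. Four."))
```

A check like this could run over transcripts from a test session to flag replies that ignore the length constraint in the prompt.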
Crafting a system persona
Your system prompt sets expectations for tone and boundaries. Update the Assistant class:
class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            # Define personality: tone, style, and constraints
            instructions=(
                "You are an upbeat, slightly sarcastic voice AI for tech support. "
                "Help the caller fix issues without rambling, and keep replies under 3 sentences."
            ),
        )
You don't have to use these exact instructions. Take some time to write a set that fits your own use case.
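If you expect to experiment with several personas, it can help to keep tone, role, and constraints as separate pieces and assemble the instruction string from them. The sketch below is one way to do that. The `Persona` class and its `to_instructions` method are hypothetical names introduced here for illustration, not part of LiveKit.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical container for the pieces of a system prompt."""

    role: str
    tone: str
    constraints: list[str] = field(default_factory=list)

    def to_instructions(self) -> str:
        # Lead with identity, then append each constraint as its own sentence.
        parts = [f"You are {self.tone} {self.role}."]
        parts.extend(self.constraints)
        return " ".join(parts)


support = Persona(
    role="voice AI for tech support",
    tone="an upbeat, slightly sarcastic",
    constraints=[
        "Help the caller fix issues without rambling.",
        "Keep replies under 3 sentences.",
    ],
)
print(support.to_instructions())
```

The resulting string can be passed as the `instructions` argument in the Assistant class, which makes it easy to swap one tone for another without rewriting the whole prompt.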
Changing voices
Next, let's modify the TTS configuration to match our new personality:
session = AgentSession(
    # ...
    tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
    # ...
)
The format follows the pattern provider/model:voice_id. In this case, we're using Cartesia's Sonic 3 model with a specific voice ID.
To try other voice styles, see the list of recommended voices in the LiveKit Inference docs.
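To make the provider/model:voice_id pattern concrete, here is a small sketch that splits such a string into its parts. `parse_model_string` and `ModelRef` are hypothetical names for illustration only. LiveKit parses these strings for you, and this is not its implementation.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    """Hypothetical holder for the parts of a 'provider/model:voice_id' string."""

    provider: str
    model: str
    voice_id: Optional[str]  # None when the string has no ':suffix'


def parse_model_string(value: str) -> ModelRef:
    """Split 'provider/model[:voice_id]' into its parts (illustrative only)."""
    provider, sep, rest = value.partition("/")
    if not sep or not provider or not rest:
        raise ValueError(f"expected 'provider/model[:voice_id]', got {value!r}")
    model, _, voice_id = rest.partition(":")
    if not model:
        raise ValueError(f"missing model name in {value!r}")
    return ModelRef(provider, model, voice_id or None)


ref = parse_model_string("cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc")
print(ref.provider, ref.model, ref.voice_id)
```

Reading the string this way is a quick sanity check when you paste in a new voice ID: the provider and model come before the colon, and the voice ID comes after it.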
Adding fallback adapters
Every provider has hiccups. When a model provider has an outage, your agent should keep working.
Production agents always need secondary providers.
LiveKit's fallback adapters automatically retry in priority order. Let's add some fallback adapters.
- Import the adapters near the top of the file:

    from livekit.agents import llm, stt, tts, inference

- Extract the VAD so we can reuse it:

    vad = silero.VAD.load()

    session = AgentSession(
        vad=vad,
        # ...
    )

- Wrap each pipeline component:

    session = AgentSession(
        # LLM with fallback: OpenAI primary, Gemini backup
        llm=llm.FallbackAdapter(
            [
                inference.LLM(model="openai/gpt-4.1-mini"),
                inference.LLM(model="google/gemini-2.5-flash"),
            ]
        ),
        # STT with fallback: AssemblyAI primary, Deepgram backup
        stt=stt.FallbackAdapter(
            [
                inference.STT.from_model_string("assemblyai/universal-streaming:en"),
                inference.STT.from_model_string("deepgram/nova-3"),
            ]
        ),
        # TTS with fallback: Cartesia primary, Inworld backup
        tts=tts.FallbackAdapter(
            [
                inference.TTS.from_model_string("cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"),
                inference.TTS.from_model_string("inworld/inworld-tts-1"),
            ]
        ),
        vad=vad,
        turn_detection=MultilingualModel(),
    )
The adapters degrade gracefully: if the first provider fails, the session retries the next one without dropping the call.
Production agents should always have fallbacks.
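The priority-order behavior is easier to reason about with a stripped-down model. The sketch below illustrates the general pattern only: try each provider in order and return the first successful result. It is not LiveKit's implementation, and `call_with_fallback`, `AllProvidersFailed`, and the two stub providers are hypothetical names. A real adapter also has to deal with concerns such as streaming and timeouts, which are not modeled here.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the list has failed."""


def call_with_fallback(
    providers: Sequence[Callable[[str], str]], request: str
) -> str:
    """Try each provider in priority order and return the first success."""
    errors: list[Exception] = []
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:  # any failure moves us on to the next provider
            errors.append(exc)
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed: {errors}")


# Stub providers standing in for real services (assumptions for this demo).
def flaky_primary(request: str) -> str:
    raise ConnectionError("primary provider is down")


def healthy_backup(request: str) -> str:
    return f"backup handled: {request}"


print(call_with_fallback([flaky_primary, healthy_backup], "hello"))
```

Note that the caller only sees an error when every provider in the list fails, which is why the order of the list matters: put the provider you prefer first and the one you can tolerate second.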
Testing your changes
Run the agent and see how it sounds:
uv run agent.py console
While interacting with the agent:
- Ask the agent to introduce itself multiple times. Does it stay on persona?
- Interrupt it mid-sentence. Does the tone still match the prompt?
Testing fallbacks
To verify your fallback adapters are working, simulate a provider failure:
- Change the primary LLM to an invalid model name:

    llm=llm.FallbackAdapter(
        [
            inference.LLM(model="openai/invalid-model"),  # This will fail
            inference.LLM(model="google/gemini-2.5-flash"),
        ]
    )

- Run the agent with:

    uv run agent.py console

- Watch the console output. You should see the agent attempt the primary provider, fail, then fall back to the secondary provider.

- Restore the correct model name when you're done testing.
This kind of chaos testing helps you confirm the agent will stay online even when a provider goes down.
Wrap up
Your agent now has:
- A defined personality through the system prompt
- A custom voice through the TTS configuration
- Fallback adapters to handle provider outages
In the next lesson, you'll capture metrics so you can measure performance and enable preemptive generation to reduce response latency.