Inbox & follow-ups · Beginner · 5 min read

Voice reply and language translation

Dictate quick replies via your browser's mic, then let the AI polish and translate them into 30 languages including Pidgin, Twi, Yoruba, and Patois.

Two small pills sit next to the AI Suggested Reply in every inbox case — a microphone button labeled Voice reply and a globe-icon pill showing the current reply language. Together they let you compose faster and respond in your customer's preferred language without retyping anything. This guide explains what each one does and when to reach for both.

1. What each feature does

Voice reply lets you speak your reply out loud. Yesoma captures your words using your browser's built-in speech recognition, then sends the transcript plus the inquiry context to Claude, which rewrites it into a polished, on-brand message — matching the tone in your Business Brain and the thread you're working in.

Language translation regenerates the existing AI Suggested Reply in a different language. It is not a word-for-word mechanical translation — Claude adapts the tone to fit the target language (for example, choosing the appropriate formal or informal register in Spanish or French) so the message reads naturally rather than like it came out of a translation tool.

2. Voice reply

Where to find it

Open any inbox case. Below the Suggested Reply card, before the Regenerate and Edit buttons, you will see a small Voice reply pill with a microphone icon.

How to use it

  1. Click Voice reply. The pill turns coral and shows a Stop button. Your browser may prompt for microphone access the first time — grant it.
  2. Speak naturally. As you talk, a live preview of the transcript appears next to the button in italics. You do not need to choose words carefully — "yeah we're open Tuesday afternoon, two works" is exactly the kind of thing to say.
  3. Click Stop (or pause speaking — recognition ends automatically after a brief silence). The pill briefly shows a "Polishing..." state with an animated wand icon.
  4. The polished reply replaces the visible Suggested Reply text. You can read it, edit it further, or send it as-is.

If polishing fails for any reason, Yesoma hands you the raw transcript in the reply field anyway, so you are never left with nothing to work from.
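
The fallback described above can be pictured as a try/catch around the polish call. The TypeScript below is a minimal sketch, not Yesoma's actual code: `polishOrFallback`, the `Polisher` type, and the `DraftResult` shape are hypothetical names chosen for illustration, and treating a blank response as a failure is an assumption.

```typescript
// Hypothetical type: any async function that turns a raw transcript into a polished reply.
type Polisher = (transcript: string) => Promise<string>;

// Hypothetical result shape: the text for the reply field, plus whether polishing succeeded.
interface DraftResult {
  text: string;
  polished: boolean;
}

// Try to polish the transcript. If the polisher rejects, or returns only
// whitespace (an assumption), fall back to the raw transcript so the reply
// field is never left empty.
async function polishOrFallback(
  transcript: string,
  polish: Polisher
): Promise<DraftResult> {
  try {
    const polishedText = (await polish(transcript)).trim();
    if (polishedText.length > 0) {
      return { text: polishedText, polished: true };
    }
  } catch {
    // Assumption: every kind of failure (network, timeout, model error) is handled the same way.
  }
  return { text: transcript, polished: false };
}
```

Returning a `polished` flag alongside the text would let the interface tell the user that they are looking at the raw transcript rather than a rewritten reply.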

How the AI polish works

Your spoken transcript and a short excerpt of the inquiry are sent to Claude together. Claude rewrites the transcript into a complete, professional reply that fits the customer's question and your saved brand voice. The language pill controls which language the polished reply comes out in: if the pill is set to Spanish, you can dictate in English and receive a Spanish reply (see section 4 for that workflow).
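
As a rough illustration of what travels to the AI, the sketch below assembles the three inputs this section describes: the transcript, a short inquiry excerpt, and the output language from the pill. Every name here (`PolishRequest`, `buildPolishRequest`, the 500-character excerpt limit) is a hypothetical stand-in. Yesoma's real request format is not documented in this guide.

```typescript
// Hypothetical payload shape; field names are illustrative only.
interface PolishRequest {
  transcript: string;     // what the user dictated, in any language
  inquiryExcerpt: string; // short slice of the customer's message, for context
  outputLanguage: string; // taken from the language pill, e.g. "Spanish" or "Auto"
}

// Assumed limit for "a short excerpt"; the real cutoff is not stated in this guide.
const MAX_EXCERPT_CHARS = 500;

function buildPolishRequest(
  transcript: string,
  inquiryText: string,
  pillLanguage: string
): PolishRequest {
  return {
    transcript: transcript.trim(),
    inquiryExcerpt: inquiryText.trim().slice(0, MAX_EXCERPT_CHARS),
    outputLanguage: pillLanguage,
  };
}

// Dictated in English with the pill set to Spanish: the output language follows the pill.
const examplePolishRequest = buildPolishRequest(
  "yes available Saturday, deposit is 20 percent",
  "Hi, do you have any openings this weekend?",
  "Spanish"
);
```

The point of the sketch is that the dictation language is not part of the request at all. Only the pill decides the language of the reply.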

When voice reply is most useful

It shines when typing on a phone is slow, when you are replying to a simple availability or pricing question, or when you know what you want to say but need it to come out polished. "Yes available Saturday, deposit is 20 percent, here's the link" becomes a complete, warm reply in a few seconds.

3. Language translation

Where to find it

The language pill sits just to the right of the Voice reply button in the Suggested Reply row. It displays a globe icon and the name of the currently active language — for example "Auto", "English", or "Spanish". When set to Auto, Yesoma uses the language detected from the customer's message.
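
The Auto behavior can be summarized as a small resolution rule. This is a sketch under stated assumptions: `resolveReplyLanguage` is a hypothetical function name, and the English fallback for the case where no language was detected is an assumption, since this guide does not say what happens then.

```typescript
// Resolve the language the reply should be written in.
// pillSetting: the label on the language pill ("Auto" or a language name).
// detectedLanguage: the language detected from the customer's message, if any.
// fallbackLanguage: assumption, used only when Auto has nothing to go on.
function resolveReplyLanguage(
  pillSetting: string,
  detectedLanguage: string | null,
  fallbackLanguage: string = "English"
): string {
  if (pillSetting !== "Auto") {
    return pillSetting; // an explicit choice on the pill always wins
  }
  return detectedLanguage ?? fallbackLanguage;
}
```

For example, `resolveReplyLanguage("Auto", "Spanish")` yields `"Spanish"`, while `resolveReplyLanguage("French", "Spanish")` yields `"French"`.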

How to switch languages

  1. Click the language pill. A dropdown opens showing the full list of supported languages.
  2. Click any language. The pill label updates immediately (optimistic), and Yesoma re-runs the AI to produce a new suggested reply in your chosen language. A toast confirms when it is ready, and the Suggested Reply card refreshes.
  3. The dropdown header notes: "Regenerates this reply only — unless you save." To make the chosen language the default for all future inquiries in your workspace, check the Save as default box at the bottom of the dropdown before picking.
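
The difference between a one-off switch and a saved default can be sketched as a small state update. The names below (`LanguageState`, `applyLanguagePick`) are hypothetical and only model the behavior described in the steps above. They are not Yesoma's data model.

```typescript
// Hypothetical state: the workspace default plus a per-case reply language.
interface LanguageState {
  workspaceDefault: string;
  caseLanguages: Record<string, string>; // caseId -> language of that case's reply
}

// Apply a pick from the dropdown. The current case always changes;
// the workspace default changes only when "Save as default" was checked.
// Other cases are left exactly as they were.
function applyLanguagePick(
  state: LanguageState,
  caseId: string,
  language: string,
  saveAsDefault: boolean
): LanguageState {
  return {
    workspaceDefault: saveAsDefault ? language : state.workspaceDefault,
    caseLanguages: { ...state.caseLanguages, [caseId]: language },
  };
}
```

Returning a new object instead of mutating the old one also fits the optimistic update mentioned in step 2: the interface can show the new label immediately while the regenerated reply is still on its way.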

What languages are supported

The dropdown is the source of truth, but the current list includes:

  • English, English (British)
  • Spanish, French, German, Italian, Dutch
  • Portuguese (Brazil), Portuguese (Portugal)
  • Arabic, Hebrew, Turkish, Hindi
  • Mandarin, Vietnamese, Indonesian, Tagalog
  • Swahili, Amharic, Hausa, Zulu, Lingala, Wolof
  • Twi, Yoruba, Igbo
  • Ghanaian Pidgin, Nigerian Pidgin, Jamaican Patois

If you need a language not in the list, email support@getyesoma.com — it is a support request, not something you can add yourself.

How translation handles tone

Yesoma passes the Business Brain context — your services, voice, and saved notes — into the translation prompt. This means the translated reply should feel like your business wrote it in that language, not like a translated version of an English reply. For languages with formal and informal registers (French, Spanish, German, and others), Claude picks the appropriate one based on context. If the customer's message was casual and friendly, the reply matches that energy.

4. Using both together

The most practical combined scenario: a customer who prefers Spanish messages you in English because that is where they found your website. You want to reply in Spanish.

  1. Click Voice reply and dictate your reply in English — it is faster for you.
  2. While the polishing runs (or after), click the language pill and select Spanish.
  3. Yesoma produces a polished Spanish reply. The language pill controls the output, not the input — you can always dictate in whatever language you speak.

You can also chain them in the opposite order: let the AI generate a Suggested Reply first, switch the language pill to French, then click Voice reply to replace the suggestion with your own dictated words. The polish step will still produce a French result because the pill setting is still active.

5. Privacy and processing

Voice reply uses the Web Speech API built into your browser. Audio is processed by your browser and the speech recognition engine it uses (typically provided by the OS or browser vendor such as Chrome, Safari, or Edge). No audio recording is sent to Yesoma's servers — only the text transcript arrives on Yesoma's side. That transcript is then passed to Claude alongside the inquiry context.
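
To make the text-only boundary concrete, here is a sketch of how a page might turn Web Speech API results into plain strings. The local interfaces mirror the parts of the standard `SpeechRecognitionResultList` that the sketch reads (`isFinal`, indexed alternatives, `transcript`). The function name `splitTranscript` is hypothetical, and nothing here is taken from Yesoma's code.

```typescript
// Minimal structural types mirroring the parts of the Web Speech API's
// result list that this sketch reads. Declared locally so the example
// compiles without browser typings.
interface AlternativeLike {
  transcript: string;
}
interface ResultLike {
  isFinal: boolean;
  readonly length: number;
  [index: number]: AlternativeLike;
}
interface ResultListLike {
  readonly length: number;
  [index: number]: ResultLike;
}

// Split recognition results into finalized text and still-changing interim text.
// Only strings come out of this function; it never has audio in hand.
function splitTranscript(
  results: ResultListLike
): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    const best = result ? result[0] : undefined; // first alternative is the top-ranked one
    if (!result || !best) continue;
    if (result.isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}
```

In a setup like this, the interim text would feed a live preview while the user is speaking, and only the final text would be sent on for polishing.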

Language translation happens entirely via Anthropic Claude using the same Business Brain context the rest of the inbox uses. No third-party translation service is involved.

Neither feature stores audio on Yesoma's infrastructure. Transcripts and translated replies are retained only as part of your normal inquiry history.

6. Browser permissions for voice reply

Voice reply requires microphone access. The first time you click the button in a given browser, the browser will show a permission prompt.

  • Click Allow to grant access. You should only need to do this once per browser.
  • If you accidentally clicked Block, or the permission is denied: click the lock or tune icon in your browser's URL bar, find the Microphone setting, and change it to Allow. Then reload the page and try again.

If the Voice reply button does not appear at all, your browser does not support the Web Speech API. Firefox does not support it. Safari on macOS and iOS (inside the Yesoma PWA) does support it. Chrome and Edge on desktop and Android support it.
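
A support check like this is usually done by feature detection rather than by browser name. The sketch below looks for the standard `SpeechRecognition` constructor and the prefixed `webkitSpeechRecognition` that some browsers expose instead. The helper names are hypothetical, and whether Yesoma performs the check exactly this way is an assumption.

```typescript
// Look up the speech recognition constructor on a global-like object.
// Browsers expose it as `SpeechRecognition` or the prefixed
// `webkitSpeechRecognition`. Taking the scope as a parameter keeps the
// check testable outside a browser; in a page you would pass `window`.
function findSpeechRecognition(scope: object): unknown {
  const bag = scope as Record<string, unknown>;
  const ctor = bag["SpeechRecognition"] ?? bag["webkitSpeechRecognition"];
  return typeof ctor === "function" ? ctor : null;
}

// Hypothetical helper: decide whether to render the Voice reply pill at all.
function shouldShowVoiceReply(scope: object): boolean {
  return findSpeechRecognition(scope) !== null;
}
```

With a check like this, a browser that lacks both constructors simply never shows the button, which matches the behavior described above.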

7. Common questions

Can I dictate in a different language than the customer? Yes. The language pill controls the output language of the polished reply, completely independently of what language you speak into the microphone. Dictate in English, set the pill to Spanish, and the customer reads Spanish.

Does voice reply work on mobile? Yes. On iPhone and iPad, it works inside the Yesoma PWA installed to your home screen (tap Share → Add to Home Screen in Safari, then open from there). On Android, Chrome supports it in the regular browser — no install required.

What if I switch the language pill but then use Voice reply — which language wins? The language pill setting always applies to the polished output. If the pill is set to French and you click Voice reply, your dictation gets polished into French regardless of what language you spoke.

Can I undo a language switch? Switch the pill back to Auto (or any other language) and the AI regenerates the reply again. Because each switch re-runs the AI, there is no single "undo" — but you can always switch back.

Does "Save as default" affect existing cases? No. It updates your workspace's default reply language going forward. Existing cases keep whatever language is already on their Suggested Reply; you would switch them manually if needed.

The Voice reply button is not showing up — why? The button only renders if your browser supports the Web Speech API. If you are using Firefox, that is expected. Try Chrome, Edge, or Safari instead (on iPhone and iPad, open Yesoma from the installed PWA).
