Tap the floating button, speak, tap again, and your text is inserted into the currently focused field.
No keyboard switching. Local or cloud transcription.
In Dev mode, say “command mode” and then either describe the command you want or literally spell it. The spoken text and the inserted text do not have to be the same.
command mode show files in current dir
→ ls -l .
command mode git commit minus m description
→ git commit -m "Clean up overlay timing"
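The source doesn't spell out how command mode maps speech to a command, and a free-form description clearly needs more than string rules. Purely as a toy illustration of the spoken-text-versus-inserted-text gap (hypothetical function, not the app's code), spoken flag words like "minus m" can be normalized before insertion:

```python
def normalize_spoken_flags(spoken: str) -> str:
    """Toy sketch: turn a spoken flag word into shell flag syntax.

    Hypothetical helper for illustration only -- the real app also
    accepts a free-form description, which no string rule covers.
    """
    # "minus m" spoken aloud becomes the flag "-m".
    return spoken.replace(" minus ", " -")

print(normalize_spoken_flags("git commit minus m description"))
# → git commit -m description
```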
A small floating button lives on top of your apps. Tap it once to start recording.
Dictate naturally. The button pulses red while it's listening.
The button turns gray while the transcription runs.
Transcription happens on-device using local models, or your audio is sent to OpenAI Whisper with your own API key.
The transcribed text goes into the currently focused field when the app exposes a standard Android input field. If insertion fails, it falls back to the clipboard.
Both modes use your own hardware or your own API key. I run no backend.
Phone Whisper uses the Android Accessibility Service for one narrow reason: inserting dictated text into the currently focused text field across apps.
It does not replace your keyboard. It does not run background automation. It only acts after you explicitly tap the overlay button.
The app is open source. You can read exactly what it does before granting the permission.
Read the source code on GitHub
No audio leaves your phone. Transcription runs entirely on your hardware.
Audio is sent from your device straight to the OpenAI API using your own key. I don't operate a relay server.
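For reference, OpenAI's transcription endpoint is a plain multipart POST, which is why no relay server is needed. A minimal sketch of the request shape using only the Python standard library (endpoint and field names per OpenAI's public Audio API; this is not the app's actual client code):

```python
import io
import urllib.request
import uuid

OPENAI_URL = "https://api.openai.com/v1/audio/transcriptions"

def build_transcription_request(audio_wav: bytes, api_key: str,
                                model: str = "whisper-1") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data request for Whisper."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    # Plain form field carrying the model name.
    body.write((f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="model"\r\n\r\n'
                f"{model}\r\n").encode())
    # File field carrying the recorded audio clip.
    body.write((f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="file"; filename="clip.wav"\r\n'
                "Content-Type: audio/wav\r\n\r\n").encode())
    body.write(audio_wav)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return urllib.request.Request(
        OPENAI_URL,
        data=body.getvalue(),
        headers={
            # Your own key goes straight to OpenAI; nothing in between.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Sending the built request with `urllib.request.urlopen` returns JSON containing the transcribed text on success.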
The full source code is on GitHub. No hidden behavior.
A custom keyboard means replacing your existing keyboard entirely. That's a lot of friction. Phone Whisper leaves your keyboard alone and works on top of any app, any keyboard.
Android doesn't have a standard API for inserting text into another app's focused field. The Accessibility Service is the sanctioned way to do that cross-app. Phone Whisper uses it for exactly one thing: inserting the transcribed text.
In local mode, yes. In cloud mode, audio is sent directly to OpenAI's API from your device using your own API key. I don't have a backend and never see your audio.
It works in most apps with standard text fields. Some apps restrict text injection for security reasons, and others use custom text surfaces instead of normal input fields. In those cases, the transcribed text is copied to your clipboard as a fallback.
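The insert-or-clipboard decision above is simple to model. A toy sketch of that flow (the argument names are hypothetical stand-ins for the platform calls, not the app's Kotlin code):

```python
def deliver(text, insert_into_field, copy_to_clipboard):
    """Toy model of the delivery flow: try direct insertion into the
    focused field; if the app restricts injection or exposes no standard
    input field, fall back to copying the text to the clipboard."""
    try:
        insert_into_field(text)
        return "inserted"
    except RuntimeError:
        copy_to_clipboard(text)
        return "clipboard"
```

Either way the dictated text ends up one paste away at worst, which is what keeps the tool usable even in restrictive apps.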
Termux's main terminal area is not a standard Android text field, so direct insertion may not work there. Swipe the extra keys row left or right to switch to Termux's native text input box, then dictate there.
Not yet. I'm shipping the APK directly first, tightening the experience, and deciding later whether a Play Store release makes sense.
Not really. It's an early but working MVP. The core loop already works well enough to use every day, and I'm improving it in public.