How to Implement Audio Output Switching During a Call in an Android App

Key takeaways

setCommunicationDevice() is the 2026 default. Every new Android 12+ call feature should route through AudioManager.setCommunicationDevice and AudioDeviceCallback. The legacy startBluetoothSco / setSpeakerphoneOn pair is deprecated and should live only inside an SDK-level fallback.

OEM fragmentation is the real budget eater. Pixel, Samsung, Xiaomi, OPPO and Huawei all behave slightly differently on Bluetooth SCO and audio focus. Plan 2–3 weeks of real-device testing, not 2 hours of emulator time.

Android 14 / 15 tightened the reins. Foreground service type phoneCall is mandatory, BLUETOOTH_CONNECT must be granted before touching the adapter, and LE Audio + hearing-aid routing land as new AudioDeviceInfo types you must handle in your UI.

If you use WebRTC, LiveKit or Agora, most of this is already solved. A custom AudioDeviceModule belongs inside your SDK boundary, not in every feature team. Match the hire to the layer.

Fora Soft has shipped this module more times than we can count. Virtual classrooms, telemedicine, consumer calls — we’ll share the Kotlin wrapper we reuse across clients, plus the OEM test matrix.

Why Fora Soft wrote this playbook

Audio routing on Android looks like a 40-line AudioManager wrapper. Then you test on a real Xiaomi device, a fresh Samsung, a user on Android 11 in Brazil and a hearing-aid owner on a Pixel 9, and you discover the 40 lines are 400. This guide is the Kotlin playbook we use on every voice/video Android build at Fora Soft — tuned to the 2026 reality of the API 31+ surface, foreground-service-type enforcement, Bluetooth LE Audio and a tighter permission model.

Since 2005 we’ve shipped 99+ voice, video and AI products with a 98% five-star Upwork rating. Our BrainCert virtual classroom ships a native Android client that routes hundreds of thousands of in-call minutes per week across headsets, speakers, car systems and hearing aids. CirrusMED’s HIPAA-grade telemedicine app handles the same routing in a regulated setting. The recommendations below come from those builds.

Planning an Android call app?

Book a 30-minute call with our Android lead. We’ll share our AudioDeviceManager wrapper, the OEM test matrix, and a realistic delivery window for your use case.

Book a 30-min call → WhatsApp → Email us →

The problem space in 2026

In a real user session your app may need to flip the active audio output mid-call between the earpiece, the built-in speaker, wired 3.5mm or USB-C headsets, Bluetooth SCO headsets, Bluetooth LE Audio devices, hearing aids, and a paired car system — often without user action (“headphones plugged in”) and often while the OS is itself in the middle of a connect/disconnect animation.

Three forces make this harder than it looks. First, Android’s public API surface changed twice between 2021 and 2025: API 31 introduced setCommunicationDevice() and deprecated a handful of old calls; API 33 added LE Audio; API 34–35 tightened the service and permission models. Second, OEM audio stacks (MIUI, One UI, ColorOS, EMUI) quietly alter SCO timing, audio focus and ringtone behavior. Third, modern call apps are almost always WebRTC apps, which bring their own AudioDeviceModule and its own interaction rules with the OS stack.

Why this matters for the business, not just the engineer

Android still powers roughly 70% of the global smartphone market. For any voice or video product that isn’t iOS-only, audio routing quality is one of the five variables that decide review scores and retention — alongside latency, crash rate, camera quality and call-setup time. We see 1-star reviews referencing “no sound on Bluetooth” or “speaker doesn’t switch” on almost every new Android build that didn’t budget time for the OEM test matrix.

In practical terms: an audio-routing bug costs you about as much user goodwill as a hard crash, but with lower visibility in your monitoring stack. The 2–3-week testing investment we recommend below isn’t engineering perfectionism — it’s a churn reduction project.

The 2026 API landscape at a glance

| API level | Preferred API | Key additions | Gotchas |
|---|---|---|---|
| 21–30 (legacy) | setSpeakerphoneOn + startBluetoothSco | AudioDeviceCallback (API 23+) | SCO connect timing unpredictable; no unified device-selection API |
| 31–32 (Android 12) | setCommunicationDevice() | Unified device selection; BLUETOOTH_CONNECT runtime permission | Old APIs deprecated but still respected on many OEMs |
| 33 (Android 13) | setCommunicationDevice() + LE Audio types | TYPE_BLE_HEADSET, TYPE_BLE_SPEAKER (TYPE_HEARING_AID exists since API 28) | LE Audio rollout uneven across vendors |
| 34 (Android 14) | Same + phoneCall foreground service type | Mandatory foregroundServiceType; stricter BT checks | Missing type throws InvalidForegroundServiceTypeException |
| 35 (Android 15) | Same + Audio Sharing / LE Audio groups | CTA-2075 loudness; shared LE Audio groups | Broadcast / group routing still experimental |

The modern pattern — setCommunicationDevice() end to end

For API 31+ the canonical flow is: set the mode → request audio focus → register a device callback → enumerate available devices → call setCommunicationDevice(chosen) → update your UI reactively when the callback fires → clear the device on tear-down.

import android.content.Context
import android.media.AudioDeviceCallback
import android.media.AudioDeviceInfo
import android.media.AudioManager
import android.os.Handler
import android.os.Looper
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow

// API 31+ path: setCommunicationDevice / availableCommunicationDevices.
class AudioDeviceManager(
    private val context: Context
) {
    private val audioManager =
        context.getSystemService(Context.AUDIO_SERVICE) as AudioManager

    private val _devices = MutableStateFlow<List<AudioDeviceInfo>>(emptyList())
    val devices: StateFlow<List<AudioDeviceInfo>> = _devices

    private val _selected = MutableStateFlow<AudioDeviceInfo?>(null)
    val selected: StateFlow<AudioDeviceInfo?> = _selected

    private val callback = object : AudioDeviceCallback() {
        override fun onAudioDevicesAdded(added: Array<AudioDeviceInfo>) = refresh()
        override fun onAudioDevicesRemoved(removed: Array<AudioDeviceInfo>) = refresh()
    }

    fun startCall() {
        audioManager.mode = AudioManager.MODE_IN_COMMUNICATION
        audioManager.registerAudioDeviceCallback(callback, Handler(Looper.getMainLooper()))
        refresh()
    }

    fun select(device: AudioDeviceInfo) {
        val ok = audioManager.setCommunicationDevice(device)
        if (ok) _selected.value = device
    }

    fun endCall() {
        audioManager.clearCommunicationDevice()
        audioManager.unregisterAudioDeviceCallback(callback)
        audioManager.mode = AudioManager.MODE_NORMAL
    }

    private fun refresh() {
        _devices.value = audioManager.availableCommunicationDevices
        _selected.value = audioManager.communicationDevice
    }
}

The whole component sits behind a Kotlin StateFlow interface, which means your Jetpack Compose call screen gets re-rendered cleanly when the user plugs in a headset or the car connects mid-drive. That reactive shape is what makes the rest of your routing code boring — which is exactly what you want.

Legacy fallback for API 21–30

If you still support Android 11 or below — often the case for consumer apps in emerging markets — you need the old calls as a second path. Keep them behind an if (Build.VERSION.SDK_INT < 31) branch so they don’t pollute the modern code path.

// Legacy path: call only when Build.VERSION.SDK_INT < Build.VERSION_CODES.S.
@Suppress("DEPRECATION")
fun legacySetBluetoothSco(enable: Boolean) {
    if (enable) {
        audioManager.startBluetoothSco()
        audioManager.isBluetoothScoOn = true
        audioManager.isSpeakerphoneOn = false
    } else {
        audioManager.isBluetoothScoOn = false
        audioManager.stopBluetoothSco()
    }
}

Bluetooth SCO, LE Audio and hearing aids

SCO is the narrow-band voice link used by traditional Bluetooth headsets and most car kits. On Pixel devices it typically connects within 1–2 seconds; on Xiaomi Redmi we’ve seen 5–8 second delays before the audio actually routes. Always implement a timeout that falls back to the built-in speaker if the callback hasn’t confirmed the route within ~8 seconds — users will blame the app, not the phone.
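One way to implement that fallback, as a sketch for the API 31+ path (the `ScoTimeoutFallback` class name and the 8-second default are our assumptions — tune the timeout per OEM):

```kotlin
import android.media.AudioDeviceInfo
import android.media.AudioManager
import android.os.Handler
import android.os.Looper

// Attempt a Bluetooth route; if the active communication device still isn't
// the requested one when the timer fires, fall back to the built-in speaker.
class ScoTimeoutFallback(
    private val audioManager: AudioManager,
    private val timeoutMs: Long = 8_000L
) {
    private val handler = Handler(Looper.getMainLooper())

    fun selectBluetoothWithFallback(bt: AudioDeviceInfo) {
        audioManager.setCommunicationDevice(bt)
        handler.postDelayed({
            val active = audioManager.communicationDevice
            if (active?.id != bt.id) {
                // Route never confirmed: rescue the call on the speaker.
                audioManager.availableCommunicationDevices
                    .firstOrNull { it.type == AudioDeviceInfo.TYPE_BUILTIN_SPEAKER }
                    ?.let { audioManager.setCommunicationDevice(it) }
            }
        }, timeoutMs)
    }

    // Call when the AudioDeviceCallback confirms the route, or on tear-down.
    fun cancel() = handler.removeCallbacksAndMessages(null)
}
```

Cancel the pending fallback as soon as your AudioDeviceCallback confirms the Bluetooth route, otherwise a late-but-successful SCO connect gets yanked back to the speaker.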

LE Audio (API 33+) appears as TYPE_BLE_HEADSET / TYPE_BLE_SPEAKER. Stereo, lower latency, better battery. It’s the future but coverage is uneven — treat it as available-when-available and keep SCO as the baseline.

Hearing aids use ASHA (Google’s pre-LE Audio protocol) or the new LE Audio hearing-aid profile. TYPE_HEARING_AID should always be treated as a first-class routing option, not an afterthought — our AI-powered multimedia e-learning guide covers the accessibility expectations we bake in by default.

Reach for an SCO timeout path when: any of your target devices are Xiaomi, OPPO or Realme. A single 8-second postDelayed fallback to the speaker eliminates the vast majority of “no sound on Bluetooth” tickets.

Wired 3.5mm, USB-C and accessory audio

Wired audio feels like the simple case but has its own fragility. Flagship Samsung and Pixel phones haven’t shipped a 3.5mm jack in years, so most wired headsets arrive via USB-C or a DAC adapter. The modern API exposes them as TYPE_USB_HEADSET, TYPE_WIRED_HEADSET, or TYPE_WIRED_HEADPHONES (the last meaning “no microphone” — treat this as an accessibility signal).

Never rely on the deprecated isWiredHeadsetOn or the old Intent.ACTION_HEADSET_PLUG broadcast. Both are flaky on modern OEM ROMs. Route all detection through AudioDeviceCallback.
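A small classifier over the callback's device list makes the mic-less case explicit (a sketch; `describeWired` is a hypothetical helper name):

```kotlin
import android.media.AudioDeviceInfo

// Classify wired outputs from an AudioDeviceCallback refresh.
// TYPE_WIRED_HEADPHONES carries no microphone, so keep capturing from the
// built-in mic while playing out over the wire.
fun describeWired(device: AudioDeviceInfo): String? = when (device.type) {
    AudioDeviceInfo.TYPE_USB_HEADSET -> "USB-C headset (mic + output)"
    AudioDeviceInfo.TYPE_WIRED_HEADSET -> "3.5mm headset (mic + output)"
    AudioDeviceInfo.TYPE_WIRED_HEADPHONES -> "Headphones (output only)"
    else -> null // not a wired device
}
```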

Permissions, foreground services and the 2024–2026 tightening

BLUETOOTH_CONNECT is a runtime permission since API 31. Missing the grant means setCommunicationDevice() on a BT device silently fails on some OEMs and throws on others. Gate your Bluetooth UI behind the permission request, then re-enumerate devices.
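A minimal gate check looks like this (a sketch; `hasBluetoothConnect` is a hypothetical helper name, and the permission only exists as a runtime grant from API 31):

```kotlin
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import android.os.Build
import androidx.core.content.ContextCompat

// True when Bluetooth routing UI may be shown: pre-31 devices don't have the
// runtime permission, 31+ devices need an explicit grant.
fun hasBluetoothConnect(context: Context): Boolean =
    Build.VERSION.SDK_INT < Build.VERSION_CODES.S ||
        ContextCompat.checkSelfPermission(
            context, Manifest.permission.BLUETOOTH_CONNECT
        ) == PackageManager.PERMISSION_GRANTED
```

Request the grant through `ActivityResultContracts.RequestPermission`, then call your device `refresh()` so freshly visible Bluetooth routes appear in the picker.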

Foreground service types became mandatory in API 34. A call app must declare:

<service
    android:name=".VoIPService"
    android:foregroundServiceType="phoneCall"
    android:permission="android.permission.FOREGROUND_SERVICE_PHONE_CALL" />

Missing foregroundServiceType="phoneCall" throws InvalidForegroundServiceTypeException on API 34+ and blocks the service from starting. The FOREGROUND_SERVICE_PHONE_CALL permission must also be declared.
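On the service side, pass the type explicitly when promoting to foreground (a sketch; `buildCallNotification` and the notification ID are hypothetical placeholders for your own ongoing-call notification):

```kotlin
import android.app.Notification
import android.app.Service
import android.content.Intent
import android.content.pm.ServiceInfo
import android.os.Build
import android.os.IBinder

class VoIPService : Service() {
    override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
        val notification: Notification = buildCallNotification()
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
            // The three-argument overload pins the type at start time.
            startForeground(
                NOTIFICATION_ID,
                notification,
                ServiceInfo.FOREGROUND_SERVICE_TYPE_PHONE_CALL
            )
        } else {
            startForeground(NOTIFICATION_ID, notification)
        }
        return START_STICKY
    }

    override fun onBind(intent: Intent?): IBinder? = null

    private fun buildCallNotification(): Notification =
        TODO("build your ongoing-call notification with a channel")

    companion object { private const val NOTIFICATION_ID = 42 }
}
```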

Reach for ConnectionService / TelecomManager when: your app handles actual phone-style calls (incoming/outgoing, ringer integration, heads-up call UI). Telecom wires routing into the system UI and saves you a whole class of edge cases — for VoIP SaaS it’s overkill, for a consumer call app it’s usually the right path.

Audio focus: the invisible requirement

Every call app needs to own audio focus for the duration of the call. Failing to request it cleanly leaves music apps fighting for the output and produces the classic “stuck under Spotify” bug on Xiaomi MIUI devices. Use AUDIOFOCUS_GAIN_TRANSIENT_EXCLUSIVE for short calls that should pause other audio, or AUDIOFOCUS_GAIN for long-duration sessions.

val req = AudioFocusRequest.Builder(AudioManager.AUDIOFOCUS_GAIN_TRANSIENT_EXCLUSIVE)
    .setAudioAttributes(
        AudioAttributes.Builder()
            .setUsage(AudioAttributes.USAGE_VOICE_COMMUNICATION)
            .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
            .build()
    )
    .setOnAudioFocusChangeListener { /* pause/duck/restore */ }
    .build()
audioManager.requestAudioFocus(req)

WebRTC, LiveKit and Agora — where the routing really lives

If you’re using WebRTC (directly or via a wrapper SDK), your audio routing belongs inside a custom JavaAudioDeviceModule or the SDK’s equivalent. The public AudioManager sets up the system-level path; the SDK’s AudioDeviceModule owns how the WebRTC engine opens input and output streams. Get these two layers out of sync and you’ll see half-muted audio, echo, or no sound at all.
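With the plain WebRTC Android artifact, keeping both layers aligned starts with building the factory around a single `JavaAudioDeviceModule` (a sketch assuming the `org.webrtc` package from the prebuilt WebRTC Android libraries):

```kotlin
import android.content.Context
import org.webrtc.JavaAudioDeviceModule
import org.webrtc.PeerConnectionFactory

// One AudioDeviceModule, created once, so the WebRTC engine and your
// AudioDeviceManager act on the same routing decision.
fun createFactory(appContext: Context): PeerConnectionFactory {
    PeerConnectionFactory.initialize(
        PeerConnectionFactory.InitializationOptions.builder(appContext)
            .createInitializationOptions()
    )
    val adm = JavaAudioDeviceModule.builder(appContext)
        .setUseHardwareAcousticEchoCanceler(true)
        .setUseHardwareNoiseSuppressor(true)
        .createAudioDeviceModule()
    return PeerConnectionFactory.builder()
        .setAudioDeviceModule(adm)
        .createPeerConnectionFactory()
}
```

The system-level route still goes through `AudioManager` as shown earlier; the module only governs how the engine opens its input and output streams.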

LiveKit Android SDK ships an AudioSwitchHandler that delegates to AudioManager correctly out of the box. Our Agora vs WebRTC piece covers the trade-offs.

Twilio’s AudioSwitch library is open source and the clearest reference for a hand-rolled routing layer. Agora’s SDK hides routing behind setEnableSpeakerphone and friends but handles OEM fragmentation internally — which is part of why you pay for it.

Need this working across 20 OEM devices by next sprint?

We keep a standing device lab with Pixel, Samsung, Xiaomi, OPPO, OnePlus and Huawei phones covering API 29–35. Send us your build and we’ll run the whole audio-routing matrix in under a week.

Book a 30-min call → WhatsApp → Email us →

OEM fragmentation — what your test matrix should cover

The single highest-leverage investment after you’ve written the routing code is a real-device matrix covering the vendors your users actually hold. The worst issues in 2025–2026 cluster around MIUI, ColorOS and One UI customizations.

| OEM / skin | Typical SCO delay | Audio-focus quirks | Our recommendation |
|---|---|---|---|
| Google Pixel (stock) | ~1–2 s | Predictable | Baseline device for CI |
| Samsung One UI | ~2–3 s | Usually fine | Test Galaxy A + S flagship |
| Xiaomi MIUI / HyperOS | ~5–8 s | Music apps steal focus | SCO timeout + explicit focus |
| OPPO / Realme ColorOS | ~4–6 s | Aggressive battery saver kills callbacks | Foreground service + SCO timeout |
| OnePlus OxygenOS | ~2–4 s | Mostly fine | Standard |
| Huawei EMUI / HarmonyOS | ~3–6 s | No Google Play — AppGallery only | Test if you ship in China/MEA |

Reach for a managed SDK when: your team has no prior WebRTC experience, your launch is under 10 weeks, or voice/video is a feature rather than the product. LiveKit, Agora and Twilio have already paid the OEM fragmentation tax on your behalf.

Reference architecture — MVVM + StateFlow + Jetpack Compose

Sketch of how the pieces fit together in a modern Android call app:

[ Jetpack Compose UI ]
        |   selectedDevice, availableDevices (StateFlow)
        v
[ CallViewModel (Hilt-injected) ]
        |
        v
[ AudioDeviceManager ]
        |   registerAudioDeviceCallback / setCommunicationDevice
        v
[ AudioManager (system service) ]
        |
        +--> Built-in earpiece / speaker
        +--> Wired USB-C / 3.5mm headset
        +--> Bluetooth SCO / LE Audio
        +--> Hearing aids (ASHA / LE Audio)

[ WebRTC / LiveKit / Agora AudioDeviceModule ]
        ^
        |   mirrors selected device, configures engine I/O
        +---- CallViewModel (same source of truth)

Two principles we never compromise on. One source of truth for the selected device (the ViewModel); one reactive stream for the available devices (the StateFlow). Push both into every consumer, including the WebRTC AudioDeviceModule, and the mid-call routing bugs stop happening.

Reach for Hilt dependency injection when: your AudioDeviceManager needs to be testable with Robolectric or swappable per build flavor (prod vs staging audio mocks). It’s a 30-minute setup that pays back across every unit test.
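The Hilt wiring is short (a sketch assuming the `AudioDeviceManager` class shown earlier; the module name is our own):

```kotlin
import android.content.Context
import dagger.Module
import dagger.Provides
import dagger.hilt.InstallIn
import dagger.hilt.android.qualifiers.ApplicationContext
import dagger.hilt.components.SingletonComponent
import javax.inject.Singleton

// One app-wide AudioDeviceManager; swap the @Provides body per build flavor
// to return a mock implementation in staging or Robolectric tests.
@Module
@InstallIn(SingletonComponent::class)
object AudioModule {
    @Provides
    @Singleton
    fun provideAudioDeviceManager(
        @ApplicationContext context: Context
    ): AudioDeviceManager = AudioDeviceManager(context)
}
```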

Testing — unit, instrumented and device farm

Unit tests with Robolectric cover the state-machine logic: “when the callback fires with a new BT headset, does the StateFlow emit the right value?” Fast, cheap, catch 60% of regressions.

Instrumented tests on Android emulators are weak for audio because emulators fake most of the stack. Use them only for smoke tests; don’t trust them for routing behavior.

Real-device regression. This is where most of the budget goes. We keep an internal matrix covering 6 OEM skins × 3 Android versions × 3 Bluetooth accessories (cheap headset, AirPods-class BT, car speakerphone). A full pass is ~4 hours of manual testing per major release.

Device-farm services (Firebase Test Lab, BrowserStack App Live) close the gap for quick sanity checks. They don’t replace a real-person test on a real Bluetooth headset, but they catch crashes and permission misconfigurations well before you ship.

Mini case — rebuilding the audio layer in 3 weeks

An EU telemedicine client came to us after shipping a WebRTC call app with a 2.7-star Android rating. The complaints clustered around Bluetooth handovers — calls starting on the speaker, switching to the car when the driver got in, and never routing back when the driver left. They also had a spike in support tickets from users with hearing aids.

Our three-week rebuild replaced a homegrown Broadcast-Receiver approach with a Kotlin AudioDeviceManager built on setCommunicationDevice() and AudioDeviceCallback, added an 8-second SCO fallback, wired TYPE_HEARING_AID into the in-call UI, declared foregroundServiceType="phoneCall", and kept the legacy SCO path behind a version check for Android 11 users. QA ran 6 hours of manual tests on 14 devices. Result: rating climbed to 4.3 over the next quarter, audio-routing support tickets dropped about 78% week-over-week. Want a similar assessment for your stack? Book 30 minutes and we’ll sketch it.

A decision framework — pick your audio-routing path in five questions

Q1. Are you using a WebRTC SDK? If yes (LiveKit, Agora, Twilio), let the SDK own the routing, wire your UI to its audio-device API, and only reach for raw AudioManager for gaps.

Q2. What’s your min-SDK target? API 31 lets you write a single modern code path; below that you need the legacy fallback behind a version check. Set your min-SDK deliberately.

Q3. Is this a “true” call app or an in-app voice session? True call apps should plug into ConnectionService / TelecomManager. Voice sessions in a SaaS (support calls, meetings) can stay on AudioManager with a foreground service.

Q4. Accessibility requirements? Healthcare, education and government tilt towards hearing-aid and LE Audio support from day one. If the answer is yes, budget device testing with real hearing aids, not just headsets.

Q5. Which OEMs matter? Test on the vendors your users actually carry. “Android” isn’t a device; 60% of your users might be on MIUI or ColorOS and the code that works on Pixel may not work there.

Five pitfalls we see every quarter

1. Relying on ACTION_HEADSET_PLUG or isWiredHeadsetOn. Both are deprecated and inconsistent. AudioDeviceCallback is the answer on every API from 23 up.

2. Forgetting BLUETOOTH_CONNECT. The permission is silent on success and noisy on failure depending on OEM. Gate your Bluetooth UI on the grant and re-enumerate after acceptance.

3. Missing foregroundServiceType="phoneCall". API 34+ throws on startForeground. It’s a two-line fix; skipping it buys you a weekend of crash triage.

4. No SCO timeout fallback. On Xiaomi and OPPO, 5–8 seconds of silence feels like an app bug even if SCO eventually connects. Always fall back to the built-in speaker after ~8 seconds.

5. Mismatched AudioManager and WebRTC AudioDeviceModule state. Two systems owning the same route produce ghost echo and half-muted audio. Single source of truth, propagated to both.

KPIs: how to measure a healthy audio layer

Quality KPIs. Median and P95 time from “BT device connected” to “audio actually routed” (target P95 <4 s on all OEMs), share of calls where user changed the audio device at least once (healthy baseline 20–40%), and audio-routing-related crash rate (<0.1% of sessions).

Business KPIs. Support-ticket volume tagged “no sound / sound issues”, Play Store reviews with audio-related keywords, retention delta between users with and without Bluetooth paired (should trend flat; a gap usually means your routing is broken for that segment).

Reliability KPIs. Rate of AudioDeviceCallback events per 1,000 call-minutes (under-counting suggests OS killing your service), rate of audio-focus loss transitions you failed to handle, and rate of unexpected MODE_NORMAL transitions during a call.

When not to build this yourself

Skip a custom audio layer and lean on a managed SDK (LiveKit, Agora, Twilio) if (a) your team has no prior WebRTC experience, (b) call volume is low enough that the SDK per-minute fee is trivial, (c) you’re shipping a SaaS feature rather than a dedicated call product, or (d) your user base looks like every OEM on the planet and you can’t realistically maintain a device-test matrix. The SDKs have paid for their fragmentation tax in a way you don’t want to replicate unless you have to.

FAQ

Is startBluetoothSco() still safe to use in 2026?

It’s deprecated from API 31 but still respected on most OEMs. Keep it only as a legacy fallback inside a Build.VERSION.SDK_INT < 31 branch. For anything new, use setCommunicationDevice().

How do I detect LE Audio support at runtime?

Check API 33+ and enumerate availableCommunicationDevices for TYPE_BLE_HEADSET or TYPE_BLE_SPEAKER. If neither appears, the device or its firmware doesn’t support LE Audio for communication and you should fall back to SCO.
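As a sketch, that check condenses to one function (`hasLeAudioRoute` is a hypothetical helper name):

```kotlin
import android.media.AudioDeviceInfo
import android.media.AudioManager
import android.os.Build

// True when at least one LE Audio route is currently available for
// communication; returns false below API 33, where the caller should
// stay on the SCO baseline.
fun hasLeAudioRoute(audioManager: AudioManager): Boolean =
    Build.VERSION.SDK_INT >= Build.VERSION_CODES.TIRAMISU &&
        audioManager.availableCommunicationDevices.any {
            it.type == AudioDeviceInfo.TYPE_BLE_HEADSET ||
                it.type == AudioDeviceInfo.TYPE_BLE_SPEAKER
        }
```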

Do I need ConnectionService?

Only if your app behaves like a phone — incoming/outgoing calls that should integrate with the system dialer, ringer, Bluetooth accept/reject. For typical in-app VoIP or meetings, a foreground service with AudioManager is enough and simpler.

How long does a production audio-routing module take to build?

A proof of concept on a single OEM is 2–5 days. A production-grade implementation with OEM testing, legacy fallback, permission UX, foreground-service wiring and WebRTC integration is 2–4 weeks for a senior Android engineer. A full in-call audio UI plus monitoring adds another 1–2 weeks.

Why is my app silent on Bluetooth on Xiaomi specifically?

Usually SCO connect delay or audio focus being held by a music app. Add an 8-second SCO timeout fallback to the speaker, request AUDIOFOCUS_GAIN_TRANSIENT_EXCLUSIVE, and re-check that MIUI’s aggressive battery saver isn’t killing your foreground service.

Should I use LiveKit/Agora/Twilio or write it myself?

If voice/video is your core product and you have senior Android + WebRTC talent, owning the stack gives you control and eliminates per-minute fees at scale. If voice is a feature in a larger app, or if you want to ship in 6–10 weeks with a smaller team, lean on a managed SDK — they’ve already solved the OEM fragmentation you would otherwise inherit. Our piece on hiring LiveKit developers is a good place to start.

How do I make sure hearing-aid users get a good experience?

Surface TYPE_HEARING_AID in the audio-device picker with a clear label, test with a real ASHA or LE-Audio hearing aid, avoid assuming the device is the user’s default — they may have paired both a headset and hearing aids — and follow our accessibility checklist for broader coverage.

What about iOS feature parity?

iOS exposes similar concepts via AVAudioSession and is typically less fragmented. A single Android routing bug often takes longer to fix than an entire iOS audio layer — budget accordingly. For cross-platform Flutter or React Native apps, bridge the platform layer into a common Kotlin/Swift surface rather than trying to abstract both into JS.

Hiring

How to hire LiveKit developers

The WebRTC + voice AI skill matrix, interview questions and rate benchmarks for 2026.

WebRTC

Building a video call app with Agora SDK in 2026

Where a managed SDK takes over and where you still own the audio routing.

Video conferencing

How to build custom video conferencing solutions

End-to-end view of what goes into a production conferencing stack.

Android distribution

Distributing Android apps beyond Google Play

Alt-store distribution strategies once your call app is ready to ship.

Voice UX

Enhancing UX with speech recognition and NLP

The companion piece on voice UX patterns that depend on stable audio routing.

Ready to ship clean audio on Android?

Audio output switching on Android is a narrow problem with a wide surface. The API converged in 2022, the regulatory and service models tightened in 2024–2025, and LE Audio is the 2026 feature that shouldn’t surprise your team. Use setCommunicationDevice() as the default path, wrap it in a Kotlin StateFlow, keep the legacy SCO calls behind a version check, declare your foreground-service type, and budget two to three weeks of real-device testing. The teams that ship a premium-feeling Android call app are the ones that treat this module as a first-class product surface, not an infrastructure detail.

If you’d rather not learn the OEM test matrix the hard way, Fora Soft has a reusable AudioDeviceManager and a standing device lab we can bring to your project. Our agent-engineering workflow shaves around a quarter off the calendar compared with a traditional build, which is usually the difference between making and missing a launch window.

Let’s scope your Android call app

Book a 30-minute scoping call: we’ll walk through your audio-routing requirements, OEM footprint and SDK choice, then sketch a timeline tailored to your release window.

Book a 30-min scoping call → WhatsApp → Email us →
