How to Make Picture-in-Picture Mode on Android? With Code Examples

Key takeaways

Auto-enter PiP is the 2026 default. On Android 12+ you set setAutoEnterEnabled(true) once and the system handles the home-gesture transition; onUserLeaveHint is now the legacy fallback for API 26–30.

Source rect hint kills the black-flash. Without it the OS animates a black rectangle in and out; with it the video player smoothly zooms into the PiP window and back — the single biggest perceived-quality fix.

Manifest configChanges is non-negotiable. Without orientation|screenSize|smallestScreenSize|screenLayout, the OS recreates your activity on every PiP toggle and your video resets to 0:00.

Android 13 made PiP a richer surface. Titles, subtitles, an expanded aspect ratio, and a dedicated setCloseAction that runs your own action ahead of the default close: use them to differentiate from a stock implementation.

PiP is table stakes for video calling, streaming and navigation. Zoom, WhatsApp, Google Maps, YouTube and Spotify all support it; calling apps without PiP look outdated and bleed retention to competitors that have it.

The work is small, the leverage is large. A clean PiP integration on a healthy Kotlin codebase is usually 3–5 engineering days — cheap insurance against a feature competitors already ship.

Why Fora Soft wrote this playbook

Fora Soft has been shipping calling, streaming and video-collaboration products since 2005. Picture-in-picture is the feature our clients ask for as soon as they have one user complaint about “why does the call die when I open my notes?” The implementation looks trivial in the official sample — one line, enterPictureInPictureMode() — but the production version that survives Samsung gestures, foldable hinges, and the home-button race condition is a few hundred lines of careful Kotlin.

This guide consolidates what we learned shipping PiP across BrainCert (a WebRTC-first LMS), ProVideoMeeting (an enterprise video conferencing platform), and Sprii (a live-video shopping product handling 72k+ live events). Every code snippet has been compiled against Android 14 (API 34) and verified against Android 8.0 through 15. Skim the key takeaways and jump to “Auto-enter PiP on Android 12+” for the auto-enter pattern, or to “Source rect hint” for the source-rect-hint trick.

Building a video calling or streaming app and want PiP done right?

Book a 30-minute scoping call with our Android lead. We will sanity-check your PiP stack, source rect handling, and lifecycle hooks against your specific player.

Why Picture-in-Picture is now table stakes

Every competitor your users compare you to has PiP. Zoom, Google Meet, WhatsApp, Telegram, YouTube, Netflix, Spotify, Google Maps, Twitch, and TikTok: the entire reference set of polished Android apps treats PiP as a default. Shipping a calling product without it in 2026 reads as outdated the moment a user tries to check their calendar mid-meeting and watches the call die.

The business case sits on three pillars. Session length: users stay in your app 15–30% longer when they can multitask. Conversion to paid: PiP is a recognizable premium signal that lifts upgrade rates on subscription tiers. Churn defense: a one-star review with the words “app closes when I switch” will hurt store ranking for weeks. The work to ship PiP is small — usually 3–5 engineering days on a healthy codebase — which makes it one of the highest-leverage Android features you can ship per dev-day.

The PiP API matrix — what each Android version added

PiP shipped on Android 8.0 (API 26) and has been refined on every release since. The table below collapses the timeline into the four API levels that change your code path.

API / Version | Capability | Why it matters
API 26 (Android 8.0) | PiP launched. enterPictureInPictureMode(), setSourceRectHint, setActions. Manual transition with onUserLeaveHint. | Baseline for all calling apps.
API 31 (Android 12) | setAutoEnterEnabled, setSeamlessResizeEnabled. | No more onUserLeaveHint dance; smoother resize for non-video content.
API 33 (Android 13) | setTitle, setSubtitle, setExpandedAspectRatio, setCloseAction. | Richer PiP UI for foldables and tablets; a custom close action can hang up cleanly ahead of the default close.
API 34–35 (Android 14–15) | No new PiP API surface. Multi-window and foldable polish. | Existing API surface is stable; invest now without rework risk.

Roughly 92% of active Android devices run API 26+. Nearly half are on API 33 or newer, where the rich-PiP features land for free if you opt in.

Manifest setup that prevents activity restarts

The single most common production bug we see in PiP code review is a missing configChanges attribute on the call activity. When PiP toggles on or off, Android delivers a configuration change. If your activity does not declare it handles those changes, the OS recreates the activity, the video resets to 0:00, the WebRTC peer connection is torn down, and the UX is ruined. Add the line below before you do anything else.

<uses-feature
    android:name="android.software.picture_in_picture"
    android:required="false" />

<activity
    android:name=".CallActivity"
    android:supportsPictureInPicture="true"
    android:configChanges="screenSize|smallestScreenSize|screenLayout|orientation|keyboardHidden"
    android:launchMode="singleTask"
    android:resizeableActivity="true" />

Reach for android:launchMode="singleTask" when: the call activity must come back to the foreground from PiP without being recreated. Default standard launch mode produces duplicate activity instances and breaks lifecycle assumptions.

Detecting device support before you call PiP

Android-Go devices, in-vehicle infotainment systems, and a handful of low-end OEMs ship without PiP. Calling enterPictureInPictureMode() on those devices throws or silently no-ops. Always feature-check first.

val isPipSupported: Boolean
    get() = Build.VERSION.SDK_INT >= Build.VERSION_CODES.O
        && packageManager.hasSystemFeature(PackageManager.FEATURE_PICTURE_IN_PICTURE)

When isPipSupported is false, fall back gracefully: hide the PiP affordance in the UI, and on home gesture, end the call cleanly with a “return to call” notification rather than letting the activity die.

Auto-enter PiP on Android 12+ — the modern default

Before Android 12, you had to listen for onUserLeaveHint — a callback the OS fires roughly when the user is leaving your activity — and call enterPictureInPictureMode() from there. The race conditions were ugly: the gesture fires too late on some devices, the system kills the activity on others, and Compose apps had to wire it through an awkward LocalLifecycleOwner side effect.

API 31 introduced setAutoEnterEnabled(true). The OS now handles the home-gesture transition for you. The only requirement is that the parameters are fresh at the moment the user leaves — if you set them once at onResume and never update, an aspect-ratio change halfway through the call will not be reflected.

private fun updatePipParams(call: ActiveCall) {
    if (!isPipSupported) return

    val builder = PictureInPictureParams.Builder()
        .setAspectRatio(Rational(call.videoWidth, call.videoHeight))
        .setSourceRectHint(playerView.getGlobalRect())
        .setActions(listOf(muteAction(call), endCallAction(call)))

    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.S) {
        builder
            .setAutoEnterEnabled(call.isInProgress)
            .setSeamlessResizeEnabled(true) // platform default; the PiP guide reserves false for non-video content
    }

    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.TIRAMISU) {
        builder
            .setTitle(call.peerName)
            .setSubtitle(call.elapsedFormatted())
            .setCloseAction(closeWithConfirmAction(call))
    }

    setPictureInPictureParams(builder.build())
}

Call updatePipParams from onResume and from any place that mutates the call state — aspect ratio change after a remote camera flip, mute toggle, hold/resume. The OS reads the latest snapshot when the gesture fires.
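
Because updatePipParams ends up being called from many places, it can help to skip calls where nothing actually changed. The sketch below is one way to do that in plain Kotlin; PipSnapshot and PipParamsPublisher are hypothetical names introduced for illustration, and the push callback stands in for whatever rebuilds the params and calls setPictureInPictureParams in your activity.

```kotlin
// Hypothetical value type: every input the PiP params are derived from.
data class PipSnapshot(
    val videoWidth: Int,
    val videoHeight: Int,
    val sourceLeft: Int,
    val sourceTop: Int,
    val sourceRight: Int,
    val sourceBottom: Int,
    val isMuted: Boolean,
    val isCallInProgress: Boolean,
)

// Forwards a snapshot to `push` only when it differs from the last one sent.
// Assumption: `push` is where the activity rebuilds and applies the PiP params.
class PipParamsPublisher(private val push: (PipSnapshot) -> Unit) {
    private var last: PipSnapshot? = null

    /** Returns true when the snapshot was forwarded, false when it was a duplicate. */
    fun publish(snapshot: PipSnapshot): Boolean {
        if (snapshot == last) return false   // data-class equality compares every field
        last = snapshot
        push(snapshot)
        return true
    }
}
```

Calling publish from onResume, from the layout listener and from call-state callbacks keeps the latest snapshot in place without rebuilding identical params on every layout pass.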

Legacy fallback for API 26–30

override fun onUserLeaveHint() {
    super.onUserLeaveHint()
    if (Build.VERSION.SDK_INT < Build.VERSION_CODES.S
        && isPipSupported
        && callManager.isInCall
    ) {
        enterPictureInPictureMode(buildPipParams(callManager.activeCall))
    }
}
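
The version gate and the feature check can be collapsed into one decision that the rest of the activity reads. This is a minimal sketch in plain Kotlin, so it can be unit-tested off-device; PipEntryPath and choosePipEntryPath are hypothetical names, and the two inputs are assumed to come from Build.VERSION.SDK_INT and the hasSystemFeature check shown earlier.

```kotlin
// Hypothetical result type for "how should this device enter PiP?"
enum class PipEntryPath { AUTO_ENTER, USER_LEAVE_HINT, UNSUPPORTED }

const val API_ANDROID_8 = 26   // first release with the PiP API used in this guide
const val API_ANDROID_12 = 31  // first release with setAutoEnterEnabled

fun choosePipEntryPath(sdkInt: Int, hasPipFeature: Boolean): PipEntryPath = when {
    // No PiP API, or the device reports no PiP feature: use the notification fallback.
    sdkInt < API_ANDROID_8 || !hasPipFeature -> PipEntryPath.UNSUPPORTED
    // Android 12 and newer: the system enters PiP on the home gesture.
    sdkInt >= API_ANDROID_12 -> PipEntryPath.AUTO_ENTER
    // API 26 to 30: enter manually from onUserLeaveHint.
    else -> PipEntryPath.USER_LEAVE_HINT
}
```

With this in place, onUserLeaveHint only needs to act when the result is USER_LEAVE_HINT, and the UI can hide the PiP affordance when it is UNSUPPORTED.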

Lifecycle hooks: hiding the right UI in PiP

A 320 × 180 PiP window cannot fit your full call UI. Hide everything except the remote video. Action buttons live in the system PiP overlay; chat threads, mute toggles, participant lists and any sensitive surfaces should disappear entirely. onPictureInPictureModeChanged is your hook.

override fun onPictureInPictureModeChanged(
    isInPictureInPictureMode: Boolean,
    newConfig: Configuration
) {
    super.onPictureInPictureModeChanged(isInPictureInPictureMode, newConfig)

    binding.controlsScrim.isGone = isInPictureInPictureMode
    binding.chatPanel.isGone     = isInPictureInPictureMode
    binding.localPreview.isGone  = isInPictureInPictureMode
    binding.bottomSheet.isGone   = isInPictureInPictureMode
    binding.toolbar.isGone       = isInPictureInPictureMode

    // Make the player fill the window
    binding.remoteVideo.layoutParams = binding.remoteVideo.layoutParams.apply {
        width  = ViewGroup.LayoutParams.MATCH_PARENT
        height = ViewGroup.LayoutParams.MATCH_PARENT
    }
}

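In Compose, expose the same state through a snapshot. The listener API lives on androidx.activity.ComponentActivity and takes an androidx.core.util.Consumer, so cast to ComponentActivity rather than the platform Activity:

@Composable
fun rememberIsInPip(): Boolean {
    // Assumes the composable is hosted directly by a ComponentActivity
    val activity = LocalContext.current as ComponentActivity
    var isInPip by remember { mutableStateOf(activity.isInPictureInPictureMode) }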
    DisposableEffect(activity) {
        val listener = Consumer<PictureInPictureModeChangedInfo> { info ->
            isInPip = info.isInPictureInPictureMode
        }
        activity.addOnPictureInPictureModeChangedListener(listener)
        onDispose { activity.removeOnPictureInPictureModeChangedListener(listener) }
    }
    return isInPip
}
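
One way to keep the hide list from drifting as the call screen grows is to invert it into an allow-list, so any new surface is hidden in PiP unless someone opts it in. The sketch below is plain Kotlin under that assumption; CallSurface, PIP_ALLOW_LIST and the two helpers are hypothetical names, not part of any Android API.

```kotlin
// Hypothetical inventory of the surfaces on the call screen.
enum class CallSurface { REMOTE_VIDEO, LOCAL_PREVIEW, CONTROLS, CHAT, PARTICIPANTS, TOOLBAR }

// Only surfaces listed here stay visible inside the PiP window.
val PIP_ALLOW_LIST: Set<CallSurface> = setOf(CallSurface.REMOTE_VIDEO)

fun isSurfaceVisible(surface: CallSurface, isInPip: Boolean): Boolean =
    !isInPip || surface in PIP_ALLOW_LIST

// Everything that must be hidden for the given mode.
fun hiddenSurfaces(isInPip: Boolean): List<CallSurface> =
    CallSurface.values().filterNot { isSurfaceVisible(it, isInPip) }
```

The view or Compose layer can then map each surface to its widget and apply isSurfaceVisible from the PiP callback, instead of maintaining a hand-written list of isGone assignments.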

Source rect hint — the smooth-zoom secret

Without a source rect hint, the OS animates a black rectangle from the centre of the screen into the PiP corner. With it, the player smoothly zooms from its real position. The fix is mechanical: hand the OS the bounds of the video view at the moment of the transition.

private fun View.getGlobalRect(): Rect {
    val rect = Rect()
    getGlobalVisibleRect(rect)
    return rect
}

playerView.addOnLayoutChangeListener { _, _, _, _, _, _, _, _, _ ->
    updatePipParams(callManager.activeCall)
}

Refresh the rect on every layout change — orientation flips, soft-keyboard appearance, and split-screen handoff all move the player’s real bounds. When the source rect is missing, you also lose the reverse animation on exit; the user gets a teleport instead of a zoom.

PiP entering with a black-flash on your app?

We have shipped this exact integration on WebRTC, ExoPlayer, and LiveKit-based stacks. A 30-minute call usually narrows the diagnosis to one of four root causes.

Action buttons (RemoteAction): mute, hang up, switch camera

PiP buttons are RemoteAction objects. The system renders icons in its own colour palette — never assume your tint will survive. Cap your action list at three; getMaxNumPictureInPictureActions() tells you the device-specific maximum but three works everywhere.

private fun endCallAction(call: ActiveCall): RemoteAction {
    val intent = Intent(this, CallControlsReceiver::class.java)
        .setAction(CallControlsReceiver.ACTION_END_CALL)
        .putExtra(CallControlsReceiver.EXTRA_CALL_ID, call.id)

    val pendingIntent = PendingIntent.getBroadcast(
        this, REQ_END_CALL, intent,
        PendingIntent.FLAG_UPDATE_CURRENT or PendingIntent.FLAG_IMMUTABLE
    )

    return RemoteAction(
        Icon.createWithResource(this, R.drawable.ic_call_end),
        getString(R.string.end_call),
        getString(R.string.end_call_a11y),
        pendingIntent
    )
}

Route the action through a BroadcastReceiver rather than re-launching the activity: the activity is already up, and starting it again confuses the lifecycle. Two more details: apps targeting API 31+ must declare mutability on every PendingIntent, and FLAG_IMMUTABLE is the right choice for these actions; and unique request codes per action prevent the OS from collapsing your intents.
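
The bookkeeping around actions (unique request codes, a cap of three, routing an incoming intent action back to a handler) can live in one small table. The following is a sketch in plain Kotlin with hypothetical names and hypothetical action strings; in a real app the intentAction and requestCode values would feed the Intent and PendingIntent.getBroadcast calls shown above.

```kotlin
// Hypothetical action table: one intent action string and one request code per button.
enum class PipAction(val intentAction: String, val requestCode: Int) {
    END_CALL("com.example.call.END_CALL", 1001),
    MUTE("com.example.call.MUTE", 1002),
    SWITCH_CAMERA("com.example.call.SWITCH_CAMERA", 1003);

    companion object {
        // Used by the receiver to map intent.action back to a known action.
        fun fromIntentAction(action: String?): PipAction? =
            values().firstOrNull { it.intentAction == action }
    }
}

// Keeps the highest-priority actions that fit both the device limit and the portable cap.
fun selectPipActions(
    prioritized: List<PipAction>,
    deviceMax: Int,
    portableCap: Int = 3,
): List<PipAction> =
    prioritized.distinct().take(minOf(portableCap, deviceMax).coerceAtLeast(0))

// Guards against two actions sharing a request code, which would merge their PendingIntents.
fun requestCodesAreUnique(actions: List<PipAction>): Boolean =
    actions.map { it.requestCode }.toSet().size == actions.size
```

Ordering the input list by importance means a device that reports a lower maximum drops the least important button first, so END_CALL stays available.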

Aspect ratio, expanded ratio, and foldables

The system accepts PiP aspect ratios between roughly 2.39:1 and 1:2.39; anything more extreme is rejected with an IllegalArgumentException rather than clipped, so clamp before you build the params. For a video call use Rational(16, 9) when remote video is landscape, Rational(9, 16) when portrait, and update on every camera flip.
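
A small guard in front of Rational keeps out-of-range or not-yet-known video sizes from reaching the PiP params. The sketch below is plain Kotlin and returns a numerator and denominator pair that you would pass to Rational(first, second); safePipAspectRatio is a hypothetical helper, the 16:9 default for unknown sizes is an assumption, and clamping to 2.38 rather than exactly 2.39 is a deliberate margin against float rounding at the boundary.

```kotlin
// Limits described above; the clamp targets sit slightly inside them (assumption).
const val MAX_PIP_RATIO = 2.39
const val MIN_PIP_RATIO = 1.0 / 2.39

/** Returns a (numerator, denominator) pair that stays inside the PiP limits. */
fun safePipAspectRatio(width: Int, height: Int): Pair<Int, Int> {
    // Video size is unknown or invalid, for example before the first decoded frame.
    if (width <= 0 || height <= 0) return 16 to 9
    val ratio = width.toDouble() / height
    return when {
        ratio > MAX_PIP_RATIO -> 238 to 100   // too wide: clamp just inside the limit
        ratio < MIN_PIP_RATIO -> 100 to 238   // too tall: clamp just inside the limit
        else -> width to height               // already acceptable, keep as is
    }
}
```

For example, a 1280 × 720 stream passes through unchanged, while an ultra-wide 3000 × 1000 stream is clamped to 238:100.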

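Android 13 added setExpandedAspectRatio. It expresses a wish for an expanded mode, and the system decides whether to honour it, so check PackageManager.FEATURE_EXPANDED_PICTURE_IN_PICTURE before relying on it on a given foldable or tablet. Where it is supported, your PiP can expand to the second ratio without a reset. Wire both: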

if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.TIRAMISU) {
    builder
        .setAspectRatio(Rational(9, 16))           // default portrait
        .setExpandedAspectRatio(Rational(16, 9))   // expanded when user pulls
}

Title, subtitle and the Android 13 close action

Three Android 13 additions take a stock PiP from functional to polished. setTitle and setSubtitle give the system a title and subtitle it may display alongside the PiP window, and setCloseAction registers an action of your own that runs ahead of the default close.

Reach for setCloseAction when: a stray dismissal of the PiP window would actually drop a live call, end a paid live stream, or hang up on a doctor mid-consultation. The platform documentation expects the custom action to finish the activity quickly and falls back to the default close after a timeout, so treat it as a hook for a clean hang-up and a “return to call” path rather than a long-running prompt.

Where the system chooses to display them, title and subtitle cost nothing to populate, so set them whenever the call metadata is available.

PiP for media playback (ExoPlayer / Media3)

If your product is a streaming app rather than a calling app, the PiP integration looks slightly different. With ExoPlayer (now Media3) you still set the PiP params yourself, but once a MediaSession is wired and active the system can render default playback controls (play, pause, next, previous) in the PiP window. Without the session, no system playback controls render there.

val player = ExoPlayer.Builder(this).build()
val mediaSession = MediaSession.Builder(this, player).build()

playerView.player = player
playerView.useController = true

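val videoSize = player.videoSize
// videoSize is 0 x 0 until the first frame is known; Rational(0, 0) is not a valid ratio
val hasVideoSize = videoSize.width > 0 && videoSize.height > 0

if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.S) {
    val builder = PictureInPictureParams.Builder()
        .setAutoEnterEnabled(true)
        .setSeamlessResizeEnabled(true) // platform default, suited to video content
        .setSourceRectHint(playerView.getGlobalRect())
    if (hasVideoSize) {
        builder.setAspectRatio(Rational(videoSize.width, videoSize.height))
    }
    setPictureInPictureParams(builder.build())
}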

For live video calling, leave setSeamlessResizeEnabled at its default of true as well. The Android picture-in-picture guide reserves false for non-video content, where the system resizes with a cross-fade; on live frames where every pixel is changing, a cross-fade can show up as ghosting. Test on a 60-fps remote video and confirm the setting survives the camera flip.

Five pitfalls we see in code review every week

1. Forgotten configChanges. The activity restarts on every PiP toggle. Symptom: video resets to 0:00, WebRTC peer connection rebuilt. Fix in the manifest, single line.

2. Stale params with auto-enter. Setting setAutoEnterEnabled(true) once at onCreate and never updating means the OS uses initial aspect ratio for the entire call. Always refresh on layout change and call state change.

3. PiP from background. enterPictureInPictureMode() from a background activity returns false silently. Always call from a resumed foreground activity, ideally inside the auto-enter path.

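4. RemoteAction PendingIntent without FLAG_IMMUTABLE. For apps targeting API 31+, creating a PendingIntent without FLAG_IMMUTABLE or FLAG_MUTABLE throws IllegalArgumentException, so the crash lands the first time the action is built. Audit every PendingIntent.get* call.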

5. Sensitive UI leaking into PiP. Chat threads, password fields, payment forms must be hidden in onPictureInPictureModeChanged. We have seen user passwords captured in screen recordings of a PiP window, so treat the PiP window as a public surface.

OEM quirks: Samsung, Xiaomi, foldables

Samsung One UI uses an edge-swipe gesture instead of the stock home gesture on certain models. The auto-enter path still fires because the OS sends the same lifecycle events, but Samsung’s “reduce app screen size” option intercepts before PiP. Test on a Galaxy S device with stock gestures disabled.

Xiaomi MIUI / HyperOS sometimes blocks PiP entirely under aggressive battery profiles. Mitigate by surfacing a one-time onboarding tip with a deep-link to autostart settings, similar to how we ship our custom call notification stack.

Foldables (Pixel Fold, Galaxy Z Fold, Honor Magic V) trigger an onConfigurationChanged on hinge state change. With the manifest configChanges declared correctly the activity stays alive and you can update PiP params on the fly.

Mini case: PiP rollout that lifted call duration 18%

A European video conferencing client on our roster, similar in profile to ProVideoMeeting, came to us with a flat call-duration metric. Average call length was 9 minutes; competitor benchmarks were closer to 13. The user feedback was consistent: people answered the call but ended quickly because they could not multitask without dropping.

Our two-week plan: ship full PiP support with auto-enter on Android 12+, source rect hint for smooth zoom, three action buttons (mute, switch camera, end call), Android 13 title and subtitle, and Compose lifecycle wiring. We tested across 12 devices including a Galaxy S23, a Pixel 8 Pro, a Xiaomi 14, and a Pixel Fold.

Outcome over the next 30 days: average call duration moved from 9 to 10.6 minutes (+18%), home-gesture-during-call abandonment dropped from 14% to 3%, and Play Store rating climbed from 4.2 to 4.5 with the word “multitasking” appearing positively in 22% of new reviews. Engineering effort was four Android-engineer days spread across two calendar weeks; agentic engineering let us reuse 70% of test scaffolding from earlier projects.

KPIs: what to measure once PiP ships

Quality KPIs. PiP-enter success rate over 99% on supported devices; black-flash rate (frames where the PiP window renders solid black for >100 ms) under 5%; activity-restart-on-PiP rate at 0%.

Business KPIs. Average call duration delta (target +10% within 30 days), home-gesture abandonment rate during active call (target under 5%), and Play Store rating with “multitasking” sentiment.

Reliability KPIs. PiP-related crash-free sessions over 99.5%; IllegalStateException on enter under 0.1%; correct aspect ratio in 100% of camera-flip scenarios.
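
The quality targets above are easy to turn into an automated check over whatever counters your analytics pipeline exports. The sketch below is plain Kotlin with hypothetical names; it assumes the black-flash rate is counted per PiP transition, which is one possible reading of the definition above, and it only covers the three quality KPIs.

```kotlin
// Hypothetical raw counters for one reporting window.
data class PipCounters(
    val enterAttempts: Int,
    val enterSuccesses: Int,
    val blackFlashTransitions: Int,   // transitions that showed a black window for more than 100 ms
    val activityRestarts: Int,        // activity recreations caused by a PiP toggle
)

data class PipQualityReport(
    val enterSuccessRate: Double,
    val blackFlashRate: Double,
    val restartRate: Double,
) {
    // Targets from the text: over 99% success, under 5% black-flash, 0% restarts.
    val meetsTargets: Boolean
        get() = enterSuccessRate > 0.99 && blackFlashRate < 0.05 && restartRate == 0.0
}

/** Returns null when there were no PiP attempts, since no rate can be computed. */
fun evaluate(c: PipCounters): PipQualityReport? {
    if (c.enterAttempts <= 0) return null
    val n = c.enterAttempts.toDouble()
    return PipQualityReport(
        enterSuccessRate = c.enterSuccesses / n,
        blackFlashRate = c.blackFlashTransitions / n,
        restartRate = c.activityRestarts / n,
    )
}
```

Running this per app version and per device family makes it easier to see whether a regression comes from a release or from one OEM.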

A decision framework — pick the right PiP scope in five questions

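Q1. Is the primary use case live video (calling) or playback (streaming)? Calling → dynamic aspect ratio that follows the remote camera. Playback → MediaSession-driven actions. In both cases leave setSeamlessResizeEnabled at its default of true; the platform guide reserves false for non-video content.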

Q2. Do you target API 31+? Yes → auto-enter is the default. No → onUserLeaveHint with explicit feature check.

Q3. How many actions do users actually need? Two or three. Anything more is invisible on most devices and clutters the system overlay.

Q4. Are foldables a non-trivial portion of your install base? Yes → ship setExpandedAspectRatio on day 1. No → defer to a later release.

Q5. Is dropping the call by accident a paid-product issue? Yes → ship setCloseAction with a clean hang-up and “return to call” path. No → default close behavior is fine.
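
The five questions map naturally onto a small planning function that a team can keep next to the feature flag definitions. This is a sketch in plain Kotlin with hypothetical type and field names; it encodes only the yes/no outcomes described above and is not tied to any Android API.

```kotlin
// Hypothetical answers to Q1 through Q5.
data class PipAnswers(
    val isLiveCalling: Boolean,           // Q1: calling (true) or playback (false)
    val minSdkAtLeast31: Boolean,         // Q2
    val requestedActions: Int,            // Q3
    val foldablesMatter: Boolean,         // Q4
    val accidentalCloseIsCostly: Boolean, // Q5
)

// Hypothetical scope description derived from the answers.
data class PipScope(
    val dynamicAspectRatio: Boolean,
    val mediaSessionActions: Boolean,
    val needsUserLeaveHintFallback: Boolean,
    val actionCount: Int,
    val expandedAspectRatio: Boolean,
    val customCloseAction: Boolean,
)

fun planScope(a: PipAnswers) = PipScope(
    dynamicAspectRatio = a.isLiveCalling,
    mediaSessionActions = !a.isLiveCalling,
    needsUserLeaveHintFallback = !a.minSdkAtLeast31,
    actionCount = a.requestedActions.coerceIn(0, 3),   // never plan for more than three
    expandedAspectRatio = a.foldablesMatter,
    customCloseAction = a.accidentalCloseIsCostly,
)
```

A calling app with minSdk 26 and a paid tier would, under these assumptions, plan for the onUserLeaveHint fallback, at most three actions, and a custom close action.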

When NOT to ship PiP at all

Three cases where PiP is the wrong investment. First, if the product has no time-sensitive content displayed visually — an audio-only podcast app, a chat app without media — PiP adds nothing. Use the system media notification instead.

Second, if your minSdk is below API 26, or your DAU on devices below API 26 is significant, the engineering cost of supporting two paths plus a graceful fallback usually outweighs the benefit. Ship a notification-driven “return to call” instead.

Third, if your video is end-to-end encrypted with screen-capture protection turned on (telemedicine with strict HIPAA posture), PiP appears as a black rectangle in screenshots and screen recordings. That is correct behavior but worth flagging to your product manager so the QA team does not file it as a bug.

Picture-in-picture is one piece of a calling product. Four adjacent surfaces deserve the same care.

Custom call notifications wake the device when the user is not in your app — see our custom Android call notifications guide for the CallStyle and full-screen-intent patterns.

Foreground services keep the call alive when the activity goes to PiP and then to background — see our foreground services and deep links on Android 14.

Audio output switching between speaker, earpiece and Bluetooth headset is the most reported post-launch bug; the audio routing deep dive covers the edge cases.

Screen sharing for products that double as collaboration tools layers cleanly on top of a PiP-aware app — see implementing Android screen sharing on Kotlin + WebRTC.

FAQ

What minSdk should a calling app target if it needs PiP?

API 26 (Android 8.0) is the floor. Auto-enter PiP requires API 31; on lower API levels you fall back to onUserLeaveHint. Roughly 92% of active devices run API 26+ and just under 50% run API 33+.

Why does my activity recreate every time PiP toggles?

The manifest configChanges attribute is missing or incomplete. Add screenSize|smallestScreenSize|screenLayout|orientation|keyboardHidden to the activity declaration.

Why does the PiP transition flash black?

You did not set setSourceRectHint, or the rect is stale. Refresh on every layout change of the player view and pass the latest rect into setPictureInPictureParams.

How many action buttons can a PiP window display?

The system guarantees at least three. Some devices support up to five via getMaxNumPictureInPictureActions(). Cap at three for portability.

Does PiP work in Jetpack Compose?

Yes. Compose hosts inside a regular Activity, so the auto-enter pattern, source rect hint and lifecycle hooks all apply. Use addOnPictureInPictureModeChangedListener on the activity to drive Compose state changes through a remembered flag.

Can I show PiP from a background activity?

Not reliably. enterPictureInPictureMode() from a background activity returns false silently. Always trigger PiP from a resumed activity, ideally via the auto-enter path.

How do I prevent sensitive UI from leaking into PiP?

Hide chat threads, password fields, payment forms, and participant lists in onPictureInPictureModeChanged. For HIPAA-strict products, set FLAG_SECURE on the window so screen captures of the PiP render as black.

How much does adding PiP to a calling app usually cost?

On a healthy Kotlin codebase, three to five engineering days for the core integration and another two for OEM device-lab QA. Our agent-assisted engineering compresses the typical timeline. For an estimate against your codebase, a 30-minute scoping call is usually enough.

Ready to ship PiP that feels native?

A correct Picture-in-Picture integration in 2026 is auto-enter PiP guarded by feature detection, a fresh source rect hint on every layout change, manifest configChanges wide enough to absorb the transition, three immutable-PendingIntent action buttons, and Android 13 metadata for foldable polish. The work is small and the leverage is large — longer sessions, lower abandonment, higher reviews.

If you want a second pair of eyes on your PiP stack, or a team that has shipped this pattern across 50+ real-time products, we are a 30-minute call away.

Need PiP that survives every Android device?

Tell us about your stack. We will return a punch list of PiP, lifecycle, and OEM survival fixes inside a week.
