Audio latency

Latency is the time it takes for a signal to travel through a system. These are the common types of latency related to audio apps:

  • Audio output latency is the time between an audio sample being generated by an app and the sample being played through the headphone jack or built-in speaker.
  • Audio input latency is the time between an audio signal being received by a device’s audio input, such as the microphone, and that same audio data being available to an app.
  • Round-trip latency is the sum of input latency, app processing time, and output latency.
  • Touch latency is the time between a user touching the screen and that touch event being received by an app.
  • Warmup latency is the time it takes to start up the audio pipeline the first time data is enqueued in a buffer.

This page describes how to develop your audio app with low-latency input and output, and how to avoid warmup latency.

Measure latency

It is difficult to measure audio input and output latency in isolation, since doing so requires knowing exactly when the first sample is sent into the audio path (although this can be done using a light testing circuit and an oscilloscope). If you know the round-trip audio latency, you can apply a rough rule of thumb: over paths without signal processing, input latency and output latency are each approximately half of the round-trip latency. For example, a 20 ms round trip implies roughly 10 ms each of input and output latency.

Round-trip audio latency varies greatly depending on device model and Android build. You can get a rough idea of round-trip latency for Nexus devices by reading the published measurements.

You can measure round-trip audio latency by creating an app that generates an audio signal, listens for that signal, and measures the time between sending it and receiving it.

Since the lowest latency is achieved over audio paths with minimal signal processing, you may also want to use an Audio Loopback Dongle, which allows the test to be run over the headset connector.
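As a rough illustration, here is a minimal Kotlin sketch of such a measurement app. It plays a single full-scale impulse while recording, then counts the recorded frames that arrive before the impulse is detected. It assumes the RECORD_AUDIO permission is granted, that a loopback path exists (a loopback dongle, or the speaker coupling into the microphone), and the threshold and timeout values are arbitrary choices, not calibrated constants:

Kotlin

import android.media.AudioFormat
import android.media.AudioManager
import android.media.AudioRecord
import android.media.AudioTrack
import android.media.MediaRecorder
import kotlin.math.abs

// Hypothetical sketch, not a production measurement tool.
fun measureRoundTripLatencyMs(sampleRate: Int = 48000): Double {
    val recBufSize = AudioRecord.getMinBufferSize(
            sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT)
    val record = AudioRecord(
            MediaRecorder.AudioSource.VOICE_RECOGNITION, // minimal input processing
            sampleRate, AudioFormat.CHANNEL_IN_MONO,
            AudioFormat.ENCODING_PCM_16BIT, recBufSize)

    // 100 ms of audio: one full-scale sample followed by silence.
    val impulse = ShortArray(sampleRate / 10)
    impulse[0] = Short.MAX_VALUE
    val track = AudioTrack(
            AudioManager.STREAM_MUSIC, sampleRate, AudioFormat.CHANNEL_OUT_MONO,
            AudioFormat.ENCODING_PCM_16BIT, impulse.size * 2, AudioTrack.MODE_STATIC)
    track.write(impulse, 0, impulse.size) // MODE_STATIC: write before play()

    record.startRecording()
    track.play()

    // Count recorded frames until the impulse crosses a detection threshold.
    val chunk = ShortArray(recBufSize / 2)
    var frames = 0
    loop@ while (frames < sampleRate) { // give up after about one second
        val n = record.read(chunk, 0, chunk.size)
        if (n <= 0) break // read error; abandon the measurement
        for (i in 0 until n) {
            if (abs(chunk[i].toInt()) > 8000) break@loop
            frames++
        }
    }
    record.stop(); record.release()
    track.stop(); track.release()

    // Slightly overestimates, because recording starts before play() is called.
    return frames * 1000.0 / sampleRate
}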

Best practices to minimize latency

Validate audio performance

The Android Compatibility Definition Document (CDD) enumerates the hardware and software requirements of a compatible Android device. See Android Compatibility for more information on the overall compatibility program, and CDD for the actual CDD document.

In the CDD, round-trip latency is specified as 20 ms or lower (even though musicians generally require 10 ms). This is because there are important use cases that are enabled by 20 ms.

There is currently no API to determine audio latency over any path on an Android device at runtime. You can, however, use the following hardware feature flags to find out whether the device makes any guarantees for latency:

  • android.hardware.audio.low_latency (FEATURE_AUDIO_LOW_LATENCY)
  • android.hardware.audio.pro (FEATURE_AUDIO_PRO)

The criteria for reporting these flags are defined in the CDD in sections 5.6 Audio Latency and 5.10 Professional Audio.

Here’s how to check for these features in Kotlin and Java:

Kotlin

val hasLowLatencyFeature: Boolean =
        packageManager.hasSystemFeature(PackageManager.FEATURE_AUDIO_LOW_LATENCY)

val hasProFeature: Boolean =
        packageManager.hasSystemFeature(PackageManager.FEATURE_AUDIO_PRO)

Java

boolean hasLowLatencyFeature =
    getPackageManager().hasSystemFeature(PackageManager.FEATURE_AUDIO_LOW_LATENCY);

boolean hasProFeature =
    getPackageManager().hasSystemFeature(PackageManager.FEATURE_AUDIO_PRO);

Regarding the relationship of audio features, the android.hardware.audio.low_latency feature is a prerequisite for android.hardware.audio.pro. A device can implement android.hardware.audio.low_latency and not android.hardware.audio.pro, but not vice-versa.
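Because of this ordering, an app can probe from the strongest guarantee downward. Here is a minimal Kotlin sketch; the tier names are illustrative, not part of any Android API:

Kotlin

import android.content.pm.PackageManager

// Runs inside a Context (for example, an Activity). Since audio.pro implies
// audio.low_latency, checking the pro flag first yields the best tier.
val audioTier: String = when {
    packageManager.hasSystemFeature(PackageManager.FEATURE_AUDIO_PRO) -> "pro"
    packageManager.hasSystemFeature(PackageManager.FEATURE_AUDIO_LOW_LATENCY) -> "low-latency"
    else -> "basic"
}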

Make no assumptions about audio performance

To help avoid latency issues, beware of the following assumptions:

  • Don’t assume that the speakers and microphones used in mobile devices have good acoustics. Because these components are small, the acoustics are generally poor, so signal processing is added to improve the sound quality. This signal processing introduces latency.
  • Don't assume that your input and output callbacks are synchronized. For simultaneous input and output, separate buffer queue completion handlers are used for each side. There is no guarantee of the relative order of these callbacks or the synchronization of the audio clocks, even when both sides use the same sample rate. Your application should buffer the data with proper buffer synchronization.
  • Don't assume that the actual sample rate exactly matches the nominal sample rate. For example, if the nominal sample rate is 48,000 Hz, it is normal for the audio clock to advance at a slightly different rate than the operating system CLOCK_MONOTONIC. This is because the audio and system clocks may derive from different crystals.
  • Don't assume that the actual playback sample rate exactly matches the actual capture sample rate, especially if the endpoints are on separate paths. For example, if you are capturing from the on-device microphone at 48,000 Hz nominal sample rate, and playing on USB audio at 48,000 Hz nominal sample rate, the actual sample rates are likely to be slightly different from each other.

A consequence of potentially independent audio clocks is the need for asynchronous sample rate conversion. A simple (though not ideal for audio quality) technique for asynchronous sample rate conversion is to duplicate or drop samples as needed near a zero-crossing point. More sophisticated conversions are possible.
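As an illustration, here is a minimal Kotlin sketch of the drop-a-sample variant; dropSampleNearZeroCrossing is a hypothetical helper, and a real converter would interpolate rather than discard samples outright. The duplicate-a-sample case, for when the consumer runs ahead, is symmetric.

Kotlin

import kotlin.math.abs

// Crude asynchronous rate matching: when the producer clock runs one sample
// ahead of the consumer per buffer, drop the sample closest to a zero
// crossing, where the discontinuity is least audible.
fun dropSampleNearZeroCrossing(input: ShortArray): ShortArray {
    if (input.isEmpty()) return input
    // Find the index of the sample with the smallest amplitude.
    var dropIndex = 0
    for (i in 1 until input.size) {
        if (abs(input[i].toInt()) < abs(input[dropIndex].toInt())) dropIndex = i
    }
    // Copy everything except that one sample.
    val output = ShortArray(input.size - 1)
    var j = 0
    for (i in input.indices) {
        if (i != dropIndex) output[j++] = input[i]
    }
    return output
}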

Minimize input latency

This section provides suggestions to help you reduce audio input latency when recording with a built-in microphone or an external headset microphone.

  • If your app is monitoring the input, suggest that your users use a headset (for example, by displaying a Best with headphones screen on first run). Note that just using the headset doesn’t guarantee the lowest possible latency. You may need to perform other steps to remove any unwanted signal processing from the audio path, such as using the VOICE_RECOGNITION preset when recording (see the sketch after this list).
  • Be prepared to handle nominal sample rates of 44,100 and 48,000 Hz as reported by getProperty(String) for PROPERTY_OUTPUT_SAMPLE_RATE. Other sample rates are possible, but rare.
  • Be prepared to handle the buffer size reported by getProperty(String) for PROPERTY_OUTPUT_FRAMES_PER_BUFFER. Typical buffer sizes include 96, 128, 160, 192, 240, 256, or 512 frames, but other values are possible.
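Here is a minimal Kotlin sketch of opening the input with the VOICE_RECOGNITION preset, which requests a path with minimal input processing. The 48,000 Hz rate is a placeholder; use the rate the device reports, and note that the RECORD_AUDIO permission must already be granted:

Kotlin

import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder

val sampleRate = 48000 // placeholder; query the device's preferred rate
val minBufSize = AudioRecord.getMinBufferSize(
        sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT)
val record = AudioRecord(
        MediaRecorder.AudioSource.VOICE_RECOGNITION, // avoids most input effects
        sampleRate, AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT, minBufSize)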

Minimize output latency

Use the optimal sample rate when you create your audio player

To obtain the lowest latency, you must supply audio data that matches the device's optimal sample rate and buffer size. For more information, see Design For Reduced Latency.

You can obtain the optimal sample rate from AudioManager, as shown in the following code example:

Kotlin

val am = getSystemService(Context.AUDIO_SERVICE) as AudioManager
val sampleRateStr: String? = am.getProperty(AudioManager.PROPERTY_OUTPUT_SAMPLE_RATE)
var sampleRate: Int = sampleRateStr?.let { str ->
    Integer.parseInt(str).takeUnless { it == 0 }
} ?: 44100 // Use a default value if property not found

Java

AudioManager am = (AudioManager) getSystemService(Context.AUDIO_SERVICE);
String sampleRateStr = am.getProperty(AudioManager.PROPERTY_OUTPUT_SAMPLE_RATE);
int sampleRate = (sampleRateStr != null) ? Integer.parseInt(sampleRateStr) : 0;
if (sampleRate == 0) sampleRate = 44100; // Use a default value if property not found

Once you know the optimal sample rate, you can supply it when creating your player. This example uses OpenSL ES:

// create buffer queue audio player
void Java_com_example_audio_generatetone_MainActivity_createBufferQueueAudioPlayer
        (JNIEnv* env, jclass clazz, jint sampleRate, jint framesPerBuffer)
{
   ...
   // specify the audio source format
   SLDataFormat_PCM format_pcm;
   format_pcm.numChannels = 2;
   format_pcm.samplesPerSec = (SLuint32) sampleRate * 1000;
   ...
}

Note: samplesPerSec refers to the sample rate per channel in millihertz (1 Hz = 1000 mHz). For example, a 48,000 Hz sample rate is expressed as 48,000,000 mHz.

Use the optimal buffer size to enqueue audio data

You can obtain the optimal buffer size in a similar way to the optimal sample rate, using the AudioManager API:

Kotlin

val am = getSystemService(Context.AUDIO_SERVICE) as AudioManager
val framesPerBuffer: String? = am.getProperty(AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER)
var framesPerBufferInt: Int = framesPerBuffer?.let { str ->
    Integer.parseInt(str).takeUnless { it == 0 }
} ?: 256 // Use default

Java

AudioManager am = (AudioManager) getSystemService(Context.AUDIO_SERVICE);
String framesPerBuffer = am.getProperty(AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER);
int framesPerBufferInt = (framesPerBuffer != null) ? Integer.parseInt(framesPerBuffer) : 0;
if (framesPerBufferInt == 0) framesPerBufferInt = 256; // Use default

The PROPERTY_OUTPUT_FRAMES_PER_BUFFER property indicates the number of audio frames that the HAL (Hardware Abstraction Layer) buffer can hold. You should construct your audio buffers so that they contain an exact multiple of this number. If you use the correct number of audio frames, your callbacks occur at regular intervals, which reduces jitter.

It is important to use the API to determine buffer size rather than using a hardcoded value, because HAL buffer sizes differ across devices and across Android builds.
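For example, a minimal Kotlin sketch; burstsPerBuffer is an arbitrary tuning choice, not an API value:

Kotlin

// Size app buffers as an exact multiple of the HAL buffer size so that
// callbacks fire at regular intervals. framesPerBufferInt comes from the
// query above.
val burstsPerBuffer = 2 // more bursts adds safety margin, but also latency
val appBufferSizeInFrames = framesPerBufferInt * burstsPerBuffer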

Don't add output interfaces that involve signal processing

Only these interfaces are supported by the fast mixer:

  • SL_IID_ANDROIDSIMPLEBUFFERQUEUE
  • SL_IID_VOLUME
  • SL_IID_MUTESOLO

These interfaces are not allowed because they involve signal processing and will cause your request for a fast track to be rejected:

  • SL_IID_BASSBOOST
  • SL_IID_EFFECTSEND
  • SL_IID_ENVIRONMENTALREVERB
  • SL_IID_EQUALIZER
  • SL_IID_PLAYBACKRATE
  • SL_IID_PRESETREVERB
  • SL_IID_VIRTUALIZER
  • SL_IID_ANDROIDEFFECT
  • SL_IID_ANDROIDEFFECTSEND

When you create your player, make sure you only add fast interfaces, as shown in the following example:

const SLInterfaceID interface_ids[2] = { SL_IID_ANDROIDSIMPLEBUFFERQUEUE, SL_IID_VOLUME };
const SLboolean interface_required[2] = { SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE }; // companion flags passed to CreateAudioPlayer

Verify you're using a low-latency track

Complete these steps to verify that you have successfully obtained a low-latency track:

  1. Launch your app and then run the following command:

     adb shell ps | grep your_app_name

  2. Make a note of your app's process ID.
  3. Now, play some audio from your app. You have approximately three seconds to run the following command from the terminal:

     adb shell dumpsys media.audio_flinger

  4. Scan for your process ID. If you see an F in the Name column, it's on a low-latency track (the F stands for fast track).

Minimize warmup latency

When you enqueue audio data for the first time, it takes a small, but still significant, amount of time for the device audio circuit to warm up. To avoid this warmup latency, you can enqueue buffers of audio data containing silence, as shown in the following code example:

#define CHANNELS 1
static short* silenceBuffer;

// Fill a heap-allocated buffer with zero-valued samples (silence).
int numSamples = frames * CHANNELS;
silenceBuffer = malloc(sizeof(*silenceBuffer) * numSamples);
for (int i = 0; i < numSamples; i++) {
    silenceBuffer[i] = 0;
}

At the point when audio should be produced, you can switch to enqueuing buffers containing real audio data.
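If your output path uses AudioTrack rather than OpenSL ES, a comparable warmup can be sketched as follows; warmUp is a hypothetical helper and the buffer count is an arbitrary choice:

Kotlin

import android.media.AudioTrack

// Prime a streaming-mode AudioTrack with zero-filled buffers so the output
// path is warm before real audio starts.
fun warmUp(track: AudioTrack, framesPerBuffer: Int, channels: Int) {
    val silence = ShortArray(framesPerBuffer * channels) // zero-initialized
    track.play()
    repeat(4) { // a few buffers of silence; count is arbitrary
        track.write(silence, 0, silence.size) // blocks until consumed
    }
}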

Note: Constantly outputting audio incurs significant power consumption. Ensure that you stop the output in the onPause() method. Also consider pausing the silent output after some period of user inactivity.

Additional sample code

To download a sample app showcasing audio latency, see NDK Samples.

For more information

  1. Audio Latency for App Developers
  2. Contributors to Audio Latency
  3. Measuring Audio Latency
  4. Audio Warmup
  5. Latency (audio)
  6. Round-trip delay time