This topic helps you to identify and fix key performance issues in your app.
Key performance issues
There are many problems that can contribute to bad performance, but here are some common situations to look out for in your app:
- Scroll Jank
- "Jank" is the term used to describe the visual hiccup that occurs when the system is not able to build and provide frames in time for them to be drawn to the screen at the requested cadence (60Hz or higher). Jank is most apparent when scrolling: what should be a smoothly animated flow instead stutters, with the movement pausing for one or more frames whenever the app takes longer to render content than the duration of a single frame on the system.
- Apps should target 90Hz refresh rates. Traditional rendering rates have been 60Hz, but many newer devices operate in 90Hz mode during user interactions like scrolling, and some devices support even higher rates, up to 120Hz.
- To see what refresh rate a device is using at a given time, enable an overlay using Developer Options > Show refresh rate in the Debugging section.
- Startup latency
- Startup latency is the amount of time it takes between tapping on the app icon, notification, or other entry point and the user's data being shown on the screen.
You should aim for these two startup goals in your apps:
Cold start < 500ms: A "cold start" happens when the app being launched is not present in the system's memory. This happens when it is the app's first launch since reboot, or since the app process was killed by either the user or the system.
In contrast, a "warm start" occurs when the app is already running in the background. A cold start requires the most work from the system as it has to load everything from storage and initialize the app. Aim for a goal of cold start taking 500ms or less.
P95/P99 latencies very close to the median latency: When the app sometimes takes a very long time to start, user trust is eroded. IPCs and unnecessary I/O during the critical path of app startup can experience lock contention and introduce these inconsistencies. A sketch after this list shows one way to mark the point at which your content is fully drawn, so startup latency is measured against what users actually see.
Transitions that are not smooth
- These concerns arise during interactions such as switching between tabs or loading a new activity. These types of transitions should have smooth animations and not include delays or visual flicker.
Power inefficiencies
- Doing work costs battery, and doing unnecessary work reduces battery life.
Memory allocations, which come from creating new objects in code, can be the cause of significant work in the system. This is because not only do the allocations themselves require effort from the Android Runtime, but freeing those objects later ("garbage collection") also requires time and effort. Both allocation and collection are much faster and more efficient than they used to be, especially for temporary objects. So where the guidance used to be to avoid allocating objects whenever possible, the recommendation now is to do what makes the most sense for your app and your architecture; saving on allocations at the risk of unmaintainable code is not the right choice given what ART is capable of.
However, there is still effort involved, so it is worth keeping in mind whether you are allocating many objects in your inner loop, which could contribute to performance problems.
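To make startup latency reflect visible data rather than just the first frame, an activity can call reportFullyDrawn() once its content is on screen. The following is a minimal sketch; the activity name, layout, and loading code are hypothetical stand-ins, and it assumes the AndroidX appcompat, lifecycle, and coroutine libraries.

```kotlin
import android.os.Bundle
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity
import androidx.lifecycle.lifecycleScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Hypothetical startup screen that loads data before declaring itself fully drawn.
class FeedActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        val contentView = TextView(this)
        setContentView(contentView)

        lifecycleScope.launch {
            // Simulated load; in a real app this would hit a repository or the network.
            val items = withContext(Dispatchers.IO) { listOf("first", "second", "third") }
            contentView.text = items.joinToString("\n")

            // Signal that meaningful content is now on screen. This timestamp appears
            // in logcat ("Fully drawn") and feeds startup metrics that track
            // time-to-full-display rather than just time-to-first-frame.
            reportFullyDrawn()
        }
    }
}
```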
Identifying issues
The recommended workflow to identify and remedy performance issues is as follows:
- Identify critical user journeys to inspect. These may include:
- Common startup flows, including from launcher and notification.
- Any screens where the user scrolls through data.
- Transitions between screens.
- Long-running flows, like navigation or music playback.
- Inspect what is happening during those flows using debugging tools:
- Systrace or Perfetto: Allows you to see exactly what is happening across the entire device with precise timing data.
- Memory profiler: Allows you to see what memory allocations are happening on the heap.
- Simpleperf: View a flamegraph of what function calls are taking up the most CPU during a certain period of time. When you identify something that's taking a long time in systrace, but you don't know why, simpleperf can provide additional information.
Manual debugging of individual test runs is critical for understanding and debugging these performance issues. The above steps cannot be replaced by analyzing aggregated data. However, setting up metrics collection in automated testing as well as in the field is also important to understand what users are actually seeing and identify when regressions may occur:
- Startup flows
- Field metrics: Play Console startup time
- Lab tests: Jetpack Macrobenchmark: Startup
- Jank
- Field metrics
- Play Console frame vitals: Note that within the Play Console, it's not possible to narrow down metrics to a specific user journey, since all that is reported is overall jank throughout the app.
- Custom measurement with FrameMetricsAggregator: You can use FrameMetricsAggregator to record jank metrics during a particular workflow (a sketch follows this list).
- Lab tests
- Jetpack Macrobenchmark: Scrolling
- Macrobenchmark collects frame timing using dumpsys gfxinfo commands that bracket a single user journey. This is a reasonable way to understand variation in jank over a specific user journey. The RenderTime metrics, which highlight how long frames are taking to draw, are more important than the count of janky frames for identifying regressions or improvements.
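As a rough illustration of the custom-measurement approach above, the following sketch brackets a single user journey with FrameMetricsAggregator and summarizes frame durations. The helper functions, log tag, and 16ms threshold (a ~60Hz budget) are assumptions, and FrameMetrics data is only collected on API 24 and higher.

```kotlin
import android.app.Activity
import android.util.Log
import androidx.core.app.FrameMetricsAggregator

// Hypothetical helpers: start recording before a journey (for example, a scroll)
// and stop afterwards to summarize how many frames exceeded the frame budget.
private val aggregator = FrameMetricsAggregator()

fun startJankRecording(activity: Activity) {
    aggregator.add(activity) // begin collecting FrameMetrics for this activity's window
}

fun stopJankRecording() {
    val metrics = aggregator.stop() ?: return
    val totals = metrics[FrameMetricsAggregator.TOTAL_INDEX] ?: return
    var frames = 0
    var slowFrames = 0
    for (i in 0 until totals.size()) {
        val durationMs = totals.keyAt(i)   // frame duration bucket, in ms
        val count = totals.valueAt(i)      // number of frames in that bucket
        frames += count
        if (durationMs > 16) slowFrames += count // assumed 60Hz frame budget (~16.7ms)
    }
    Log.d("JankRecording", "$slowFrames of $frames frames exceeded 16ms")
}
```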
Set up your app for performance analysis
Proper setup is essential for getting accurate, repeatable, actionable benchmarks from an application. Test on a system that is as close to production as possible, while suppressing sources of noise. In the following sections are a number of APK- and system-specific steps you can take to prepare a test setup, some of which are use-case-specific.
Tracepoints
Applications can instrument their code with custom trace events.
While traces are being captured, tracing does incur a small overhead (roughly 5μs) per section, so don't put it around every method. Just tracing larger chunks of work (>0.1ms) can give significant insights into bottlenecks.
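For example, here is a minimal sketch using the platform Trace API to make a chunk of work visible as a named slice; the function name and label are placeholders.

```kotlin
import android.os.Trace

// Wrap a meaningful chunk of work (well over 0.1ms) in a custom trace section
// so it appears as a labeled slice in Systrace/Perfetto.
fun processFeed(items: List<String>) {
    Trace.beginSection("processFeed") // keep labels short; they show up in the trace UI
    try {
        items.forEach { item ->
            // parse, transform, cache, etc.
        }
    } finally {
        Trace.endSection() // always end the section, even if an exception is thrown
    }
}
```

If you prefer a Kotlin-friendly wrapper, the androidx.tracing library provides an equivalent trace("label") { ... } helper.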
APK considerations
Caution: Do not measure performance on a debug build. Debug variants can be helpful for troubleshooting and symbolizing stack samples, but they have severe non-linear impacts on performance. Devices running Android 10 (API level 29) and higher can include <profileable android:shell="true" /> in their manifest to enable profiling in release builds.
Use your production-grade code shrinking configuration. Depending on the resources your application uses, this can have a substantial impact on performance. Note that some ProGuard configurations remove tracepoints, so consider removing those rules for the configuration you're running tests on.
Compilation
Compile your application on-device to a known state (generally speed or speed-profile). Background JIT activity can have a significant performance overhead, and you will hit it often if you are reinstalling the APK between test runs. The command to do this is:
adb shell cmd package compile -m speed -f com.google.packagename
The 'speed' compilation mode will compile the app completely; the 'speed-profile' mode will compile the app according to a profile of the utilized code paths that is collected during app usage. It can be difficult to collect profiles consistently and correctly, so if you decide to use them, confirm they are collecting what you expect. The profiles are located here:
/data/misc/profiles/ref/[package-name]/primary.prof
Note that Macrobenchmark allows you to directly specify compilation mode.
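For instance, a minimal Macrobenchmark sketch that pins the compilation state for a startup measurement might look like the following; the package name is a placeholder, and the exact CompilationMode options available depend on your Macrobenchmark library version.

```kotlin
import androidx.benchmark.macro.CompilationMode
import androidx.benchmark.macro.StartupMode
import androidx.benchmark.macro.StartupTimingMetric
import androidx.benchmark.macro.junit4.MacrobenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class StartupBenchmark {
    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun coldStartupFullyCompiled() = benchmarkRule.measureRepeated(
        packageName = "com.example.myapp",        // placeholder package name
        metrics = listOf(StartupTimingMetric()),
        compilationMode = CompilationMode.Full(), // roughly equivalent to "speed"
        startupMode = StartupMode.COLD,
        iterations = 5
    ) {
        pressHome()
        startActivityAndWait()
    }
}
```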
System considerations
For low-level, high-fidelity measurements, calibrate your devices. Run A/B comparisons across the same device and same OS version; there can be significant variations in performance, even across the same device type.
On rooted devices, consider using a lockClocks script for microbenchmarks. Among other things, these scripts do the following:
- Place CPUs at a fixed frequency.
- Disable little cores and configure the GPU.
- Disable thermal throttling.
This isn't recommended for user-experience focused tests (such as app launch, DoU testing, and jank testing), but can be essential for cutting down noise in micro benchmark tests.
When possible, consider using a testing framework like Macrobenchmark, which can reduce noise in your measurements and prevent measurement inaccuracy.
Slow app startup: unnecessary trampoline activity
A trampoline activity can extend app startup time unnecessarily, and it's important to be aware if your app is doing it. In a trace of this behavior, one activityStart is immediately followed by another activityStart without any frames being drawn by the first activity.
This can happen both in a notification entrypoint and a regular app startup entrypoint, and can often be addressed by refactoring. For example, if you're using that activity to perform setup before another activity runs, factor that code out into a reusable component or library.
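For illustration, here is what the anti-pattern and the refactor might look like; every class and function name below is hypothetical.

```kotlin
import android.app.Activity
import android.content.Context
import android.content.Intent
import android.os.Bundle

// Hypothetical destination screen.
class FeedActivity : Activity()

// Hypothetical one-time setup that used to justify the trampoline.
object AppInitializer {
    fun ensureInitialized(context: Context) { /* one-time setup */ }
}

// Anti-pattern: a "trampoline" whose only job is to forward to the real destination,
// adding an extra activity start (and no drawn frames) to every launch.
class LauncherTrampolineActivity : Activity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        AppInitializer.ensureInitialized(applicationContext)
        startActivity(Intent(this, FeedActivity::class.java))
        finish() // this activity never draws a frame
    }
}

// Preferred: point the launcher or notification intent directly at FeedActivity and
// call AppInitializer.ensureInitialized() from there (or from a reusable startup
// component), so only one activity starts on the critical path.
```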
Unnecessary allocations triggering frequent GCs
You may notice in a systrace that garbage collections (GCs) are happening more frequently than you expect. For example, a GC every 10 seconds during a long-running operation is an indicator that your app might be allocating unnecessarily, but consistently, over time.
Or, you may notice that a specific callstack is making the vast majority of the allocations when using the Memory Profiler. You don't need to eliminate all allocations aggressively, as this can make code harder to maintain. Start instead by working on hotspots of allocations.
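As a concrete (and hypothetical) example of such a hotspot, allocating objects inside a per-frame callback such as onDraw() creates steady GC pressure; hoisting the allocation out of the hot path removes it.

```kotlin
import android.content.Context
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.view.View

// Hypothetical custom view: onDraw() runs for every frame while the view is visible,
// so per-frame allocations there add up quickly.
class ChartView(context: Context) : View(context) {

    // Reuse one Paint for the lifetime of the view instead of allocating per frame.
    private val linePaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
        color = Color.BLUE
        strokeWidth = 4f
    }

    override fun onDraw(canvas: Canvas) {
        super.onDraw(canvas)
        // Avoid: val paint = Paint() here — a new object on every frame feeds the GC.
        canvas.drawLine(0f, height / 2f, width.toFloat(), height / 2f, linePaint)
    }
}
```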
Janky frames
The graphics pipeline is relatively complicated, and there can be some nuance involved in determining whether a user ultimately may have seen a dropped frame; in some cases, the platform can "rescue" a frame using buffering. However, you can ignore most of that nuance to easily identify problematic frames from your app's perspective.
When frames are being drawn with little work required from the app, the Choreographer.doFrame() tracepoints occur on a 16.7ms cadence (assuming a 60 FPS device).
If you zoom out and navigate through the trace, you'll sometimes see frames take a little longer to complete, but that's still okay because they're not taking more than their allotted 16.7ms time.
When you actually see a disruption to that regular cadence, that will be a janky frame.
With a little practice, you'll be able to see them easily.
In some cases, you'll need to zoom into that tracepoint for more information about which views are being inflated or what RecyclerView is doing. In other cases, you may have to inspect further.
For more information about identifying janky frames and debugging their causes, see Slow rendering.
Common RecyclerView mistakes
- Invalidating the entire RecyclerView's backing data unnecessarily. This can lead to long frame rendering times, and hence jank. Instead, invalidate only the data that has changed, to minimize the number of views that need to update.
- See Presenting dynamic data for ways to avoid costly notifyDataSetChanged() calls, which cause content to be updated rather than replaced entirely.
- Failing to support nested RecyclerViews properly, causing the internal RecyclerView to be completely re-created every time.
- Every nested, inner RecyclerView should have a RecycledViewPool set to ensure views can be recycled between inner RecyclerViews (see the sketch after this list).
- Not prefetching enough data, or not prefetching in a timely manner. It can be jarring to quickly hit the bottom of a scrolling list and need to wait for more data from the server. While this isn't technically "jank", as no frame deadlines are being missed, it can be a significant UX improvement to modify the timing and quantity of prefetching so that the user doesn't have to wait for data.
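The following sketch shows one way to share a RecycledViewPool across nested lists; the adapter structure and data types are hypothetical, and the inner adapter is omitted for brevity.

```kotlin
import android.view.ViewGroup
import androidx.recyclerview.widget.LinearLayoutManager
import androidx.recyclerview.widget.RecyclerView

// Hypothetical outer adapter whose rows each contain a horizontal inner RecyclerView.
class OuterAdapter(
    private val rows: List<List<String>>  // one inner list of items per row
) : RecyclerView.Adapter<OuterAdapter.RowHolder>() {

    // One pool shared by every inner RecyclerView, so item views are reused across
    // rows instead of being re-created each time a row binds.
    private val sharedPool = RecyclerView.RecycledViewPool()

    class RowHolder(val innerList: RecyclerView) : RecyclerView.ViewHolder(innerList)

    override fun onCreateViewHolder(parent: ViewGroup, viewType: Int): RowHolder {
        val inner = RecyclerView(parent.context).apply {
            layoutManager = LinearLayoutManager(context, RecyclerView.HORIZONTAL, false)
            setRecycledViewPool(sharedPool) // recycle item views between inner lists
            layoutParams = ViewGroup.LayoutParams(
                ViewGroup.LayoutParams.MATCH_PARENT,
                ViewGroup.LayoutParams.WRAP_CONTENT
            )
        }
        return RowHolder(inner)
    }

    override fun onBindViewHolder(holder: RowHolder, position: Int) {
        // Set or update this row's inner adapter with rows[position] (omitted here).
    }

    override fun getItemCount(): Int = rows.size
}
```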