Types of CI automation

The following are some typical forms of automation that you might like to use in your CI system.

Basic jobs

Build: By building a project from scratch, you make sure that the new changes compile correctly and that all libraries and tools are compatible with each other.
Lint or style checks: This is an optional but recommended step. When you enforce style rules and perform static analysis, code reviews can be more concise and focused.
Local, or host-side tests: These run on the machine that performs the build. For Android projects they usually run on the host's JVM, so they're fast and reliable. Robolectric tests belong to this category as well.

Instrumented tests

Tests that run on emulators or physical devices require provisioning, waiting for devices to boot or connect, and other operations that add complexity.

There are multiple options to run instrumented tests on CI:

Gradle Managed Devices let you define the devices to use (for example, "Pixel 2 emulator on API 27") and handle device provisioning for you.
Most CI systems offer third-party plugins (also called "actions", "integrations", or "steps") that handle Android emulators.
Device farms such as Firebase Test Lab can run your instrumented tests for you. Device farms are used for their high reliability, and they can run tests on emulators or physical devices.

Performance regression tests

To monitor app performance, we recommend using the benchmark libraries. Automating performance tests during development requires physical devices, because they give consistent and realistic test results.

Running benchmarks can take a long time, especially when your benchmarks cover a large share of your code and user journeys. Instead of running all benchmarks for every merged feature or commit, consider running them as part of a regularly scheduled maintenance build, such as a nightly build.

Monitoring performance

You can monitor performance regressions using step fitting. Step fitting defines a rolling window of previous build results that you compare against the current build. This approach combines several benchmark results into one regression-specific metric, which reduces noise during regression testing.

This reduces false positives, which can occur when benchmark times are slow for a single build and then return to normal.
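The idea can be illustrated with a minimal Python sketch. It assumes benchmark timings are available as a list ordered from the oldest to the newest build. The function name `is_step_regression`, the window `width`, and the `threshold` are illustrative assumptions rather than part of any benchmark library, and this variant compares the mean of the most recent window against the mean of the window before it.

```python
from statistics import mean


def is_step_regression(results, width=5, threshold=0.25):
    """Report whether the latest `width` results are slower than the
    `width` results before them by more than `threshold` (a fraction).

    `results` holds one timing per build, oldest first. All parameter
    names and defaults are hypothetical and should be tuned per project.
    """
    if len(results) < 2 * width:
        return False  # Not enough history to fill both windows.

    before = mean(results[-2 * width:-width])  # Older window.
    after = mean(results[-width:])             # Window ending at the current build.

    if before <= 0:
        return False  # Avoid dividing by zero on degenerate data.
    return (after - before) / before > threshold


# One slow build that normalizes again is averaged out by the window.
single_spike = [100, 101, 99, 100, 100, 150, 100, 101, 99, 100]
print(is_step_regression(single_spike))  # False

# A slowdown that persists across the whole recent window is reported.
sustained = [100, 101, 99, 100, 100, 150, 151, 149, 150, 150]
print(is_step_regression(sustained))  # True
```

A wider window suppresses more noise but needs more builds after a real slowdown before the check reacts, so the width and threshold are a trade-off between sensitivity and stability.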

Test coverage regression checks

Test coverage is a metric that can help you and your team decide if tests sufficiently cover a change. However, it shouldn't be the only indicator. It is common practice to set up a regression check that fails or shows a warning when the coverage goes down relative to the base branch.
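Such a check can be sketched in a few lines of Python. This example assumes the CI job has already extracted the coverage percentage for the base branch and for the change from your coverage tool's reports. The function name `check_coverage` and both thresholds are hypothetical values chosen for illustration.

```python
def check_coverage(base_pct, head_pct, warn_drop=0.0, fail_drop=1.0):
    """Compare coverage of a change against the base branch.

    Returns "fail" when coverage drops by more than `fail_drop`
    percentage points, "warn" when it drops by more than `warn_drop`,
    and "pass" otherwise. Thresholds are illustrative assumptions.
    """
    drop = base_pct - head_pct  # Positive when coverage went down.
    if drop > fail_drop:
        return "fail"
    if drop > warn_drop:
        return "warn"
    return "pass"


print(check_coverage(80.0, 80.5))  # pass
print(check_coverage(80.0, 79.5))  # warn
print(check_coverage(80.0, 78.0))  # fail
```

Separating the warning and failure thresholds lets a team surface small drops in review without blocking the merge, in line with treating coverage as one indicator among several.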