RenderScript Overview

RenderScript is a framework for running computationally intensive tasks at high performance on Android. RenderScript is primarily oriented toward data-parallel computation, although serial workloads can benefit as well. The RenderScript runtime parallelizes work across the processors available on a device, such as multi-core CPUs and GPUs. This allows you to focus on expressing algorithms rather than scheduling work. RenderScript is especially useful for applications performing image processing, computational photography, or computer vision.

To begin with RenderScript, there are two main concepts you should understand:

  • The language itself is a C99-derived language for writing high-performance compute code. Writing a RenderScript Kernel describes how to use it to write compute kernels.
  • The control API is used for managing the lifetime of RenderScript resources and controlling kernel execution. It is available in three different languages: Java, C++ in the Android NDK, and the C99-derived kernel language itself. Using RenderScript from Java Code and Single-Source RenderScript describe the first and the third options, respectively.

Writing a RenderScript Kernel

A RenderScript kernel typically resides in a .rs file in the <project_root>/src/rs directory; each .rs file is called a script. Every script contains its own set of kernels, functions, and variables. A script can contain:

  • A pragma declaration (#pragma version(1)) that declares the version of the RenderScript kernel language used in this script. Currently, 1 is the only valid value.
  • A pragma declaration (#pragma rs java_package_name(com.example.app)) that declares the package name of the Java classes reflected from this script. Note that your .rs file must be part of your application package, and not in a library project.
  • Zero or more invokable functions. An invokable function is a single-threaded RenderScript function that you can call from your Java code with arbitrary arguments. These are often useful for initial setup or serial computations within a larger processing pipeline.
  • Zero or more script globals. A script global is similar to a global variable in C. You can access script globals from Java code, and these are often used for parameter passing to RenderScript kernels. Script globals are explained in more detail here.

  • Zero or more compute kernels. A compute kernel is a function or collection of functions that you can direct the RenderScript runtime to execute in parallel across a collection of data. There are two kinds of compute kernels: mapping kernels (also called foreach kernels) and reduction kernels.

    A mapping kernel is a parallel function that operates on a collection of Allocations of the same dimensions. By default, it executes once for every coordinate in those dimensions. It is typically (but not exclusively) used to transform a collection of input Allocations to an output Allocation one Element at a time.

    • Here is an example of a simple mapping kernel:

      uchar4 RS_KERNEL invert(uchar4 in, uint32_t x, uint32_t y) {
        uchar4 out = in;
        out.r = 255 - in.r;
        out.g = 255 - in.g;
        out.b = 255 - in.b;
        return out;
      }

      In most respects, this is identical to a standard C function. The RS_KERNEL property applied to the function prototype specifies that the function is a RenderScript mapping kernel instead of an invokable function. The in argument is automatically filled in based on the input Allocation passed to the kernel launch. The arguments x and y are discussed below. The value returned from the kernel is automatically written to the appropriate location in the output Allocation. By default, this kernel is run across its entire input Allocation, with one execution of the kernel function per Element in the Allocation.
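The "one execution per Element" behavior can be modeled in plain C. The following is a minimal sketch, not RenderScript code: uchar4_t is a stand-in for the built-in uchar4 type, and launch_invert_serial is a hypothetical helper that visits every coordinate serially, whereas the real runtime is free to distribute these calls across processors.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain-C stand-in for RenderScript's uchar4 vector type (assumption of this sketch). */
typedef struct { uint8_t r, g, b, a; } uchar4_t;

/* Same logic as the invert kernel, written as an ordinary C function. */
static uchar4_t invert(uchar4_t in, uint32_t x, uint32_t y) {
    (void)x; (void)y;                 /* coordinates are unused by this kernel */
    uchar4_t out = in;                /* alpha is copied through unchanged */
    out.r = (uint8_t)(255 - in.r);
    out.g = (uint8_t)(255 - in.g);
    out.b = (uint8_t)(255 - in.b);
    return out;
}

/* Hypothetical helper: a serial model of launching the kernel over a
 * width x height Allocation stored in row-major order. */
static void launch_invert_serial(const uchar4_t *in, uchar4_t *out,
                                 uint32_t width, uint32_t height) {
    for (uint32_t y = 0; y < height; ++y) {
        for (uint32_t x = 0; x < width; ++x) {
            size_t i = (size_t)y * width + x;
            out[i] = invert(in[i], x, y);   /* return value lands at the same coordinate */
        }
    }
}
```

Each call reads one input Element and writes one output Element at the same coordinate, which is why the input and output Allocations must have the same dimensions.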

      A mapping kernel may have one or more input Allocations, a single output Allocation, or both. The RenderScript runtime checks to ensure that all input and output Allocations have the same dimensions, and that the Element types of the input and output Allocations match the kernel's prototype; if either of these checks fails, RenderScript throws an exception.

      NOTE: Before Android 6.0 (API level 23), a mapping kernel may not have more than one input Allocation.

      If you need more input or output Allocations than the kernel has, those objects should be bound to rs_allocation script globals and accessed from a kernel or invokable function via rsGetElementAt_type() or rsSetElementAt_type().

      NOTE: RS_KERNEL is a macro defined automatically by RenderScript for your convenience:

      #define RS_KERNEL __attribute__((kernel))
      

    A reduction kernel is a family of functions that operates on a collection of input Allocations of the same dimensions. By default, its accumulator function executes once for every coordinate in those dimensions. It is typically (but not exclusively) used to "reduce" a collection of input Allocations to a single value.

    • Here is an example of a simple reduction kernel that adds up the Elements of its input:

      #pragma rs reduce(addint) accumulator(addintAccum)
      
      static void addintAccum(int *accum, int val) {
        *accum += val;
      }

      A reduction kernel consists of one or more user-written functions. #pragma rs reduce is used to define the kernel by specifying its name (addint, in this example) and the names and roles of the functions that make up the kernel (an accumulator function addintAccum, in this example). All such functions must be static. A reduction kernel always requires an accumulator function; it may also have other functions, depending on what you want the kernel to do.

      A reduction kernel accumulator function must return void and must have at least two arguments. The first argument (accum, in this example) is a pointer to an accumulator data item and the second (val, in this example) is automatically filled in based on the input Allocation passed to the kernel launch. The accumulator data item is created by the RenderScript runtime; by default, it is initialized to zero. By default, this kernel is run across its entire input Allocation, with one execution of the accumulator function per Element in the Allocation. By default, the final value of the accumulator data item is treated as the result of the reduction, and is returned to Java. The RenderScript runtime checks to ensure that the Element type of the input Allocation matches the accumulator function's prototype; if it does not match, RenderScript throws an exception.
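The default behavior described above can be modeled in plain C. This is a minimal sketch under stated assumptions: reduce_addint_serial is a hypothetical helper, and it uses a single accumulator data item visited serially, whereas the real runtime may organize the work differently.

```c
#include <stddef.h>

/* The accumulator function from the example above. */
static void addintAccum(int *accum, int val) {
    *accum += val;
}

/* Hypothetical helper: a serial model of launching the addint reduction
 * over a one-dimensional input of `count` Elements. */
static int reduce_addint_serial(const int *input, size_t count) {
    int accum = 0;                        /* accumulator data item, zero-initialized by default */
    for (size_t i = 0; i < count; ++i) {
        addintAccum(&accum, input[i]);    /* one call per input Element */
    }
    return accum;                         /* final accumulator value is the result */
}
```

For an input of {1, 2, 3, 4}, the model returns 10; for an empty input, it returns the initial value 0.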

      A reduction kernel has one or more input Allocations but no output Allocations.

      Reduction kernels are explained in more detail here.

      Reduction kernels are supported in Android 7.0 (API level 24) and later.

    A mapping kernel function or a reduction kernel accumulator function may access the coordinates of the current execution using the special arguments x, y, and z, which must be of type int or uint32_t. These arguments are optional.
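As an illustration of the coordinate arguments, here is a minimal plain-C sketch of an output-only kernel whose result depends only on x and y. The names checker and launch_checker_serial are hypothetical, and the serial loop only models how the runtime supplies coordinates for each execution.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel body: produces a checkerboard pattern from the
 * coordinates alone, with no input Element. */
static uint8_t checker(uint32_t x, uint32_t y) {
    return ((x + y) % 2u == 0u) ? 255 : 0;
}

/* Hypothetical helper: a serial model of launching the kernel over a
 * width x height output stored in row-major order. */
static void launch_checker_serial(uint8_t *out, uint32_t width, uint32_t height) {
    for (uint32_t y = 0; y < height; ++y) {
        for (uint32_t x = 0; x < width; ++x) {
            out[(size_t)y * width + x] = checker(x, y);
        }
    }
}
```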

    A mapping kernel function or a reduction kernel accumulator function may also take the optional special argument context of type rs_kernel_context. It is needed by a family of runtime APIs that are used to query certain properties of the current execution, for example rsGetDimX. (The context argument is available in Android 6.0 (API level 23) and later.)

  • An optional init() function. The init() function is a special type of invokable function that RenderScript runs when the script is first instantiated. This allows for some computation to occur automatically at script creation.
  • Zero or more static script globals and functions. A static script global is equivalent to a script global except that it cannot be accessed from Java code. A static function is a standard C function that can be called from any kernel or invokable function in the script but is not exposed to the Java API. If a script global or function does not need to be accessed from Java code, it is highly recommended that it be declared static.

Setting floating point precision

You can control the required level of floating point precision in a script. This is useful if full compliance with the IEEE 754-2008 standard (used by default) is not required. The following pragmas can set a different level of floating point precision:

  • #pragma rs_fp_full (default if nothing is specified): For apps that require floating point precision as outlined by the IEEE 754-2008 standard.
  • #pragma rs_fp_relaxed: For apps that don’t require strict IEEE 754-2008 compliance and can tolerate less precision. This mode enables flush-to-zero for denorms and round-towards-zero.
  • #pragma rs_fp_imprecise: For apps that don’t have stringent precision requirements. This mode enables everything in rs_fp_relaxed along with the following:
    • Operations resulting in -0.0 can return +0.0 instead.
    • Operations on INF and NAN are undefined.

Most applications can use rs_fp_relaxed without any side effects. This may be very beneficial on some architectures due to additional optimizations only available with relaxed precision (such as SIMD CPU instructions).
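To make the flush-to-zero behavior concrete, here is a minimal plain-C sketch of what it means for a single value. The helper flush_denorm is hypothetical and only models the effect in software. Preserving the sign of the flushed zero is an assumption of this sketch, and round-towards-zero is not modeled.

```c
#include <float.h>

/* Hypothetical helper: models flush-to-zero for one float value.
 * Denormal (subnormal) values have a magnitude below FLT_MIN but are
 * not zero; this model replaces them with zero. */
static float flush_denorm(float v) {
    float mag = (v < 0.0f) ? -v : v;
    if (mag != 0.0f && mag < FLT_MIN) {
        return (v < 0.0f) ? -0.0f : 0.0f;   /* keeping the sign is an assumption */
    }
    return v;                                /* normal values, zero, INF, and NaN pass through */
}
```

A script whose results depend on values this small, such as long-running accumulations of tiny increments, is the kind of case where rs_fp_full may still be needed.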

Accessing RenderScript APIs from Java

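When developing an Android application that uses RenderScript, you can access its API from Java in one of two ways:

  • android.renderscript: The native APIs, available on devices running Android 3.0 (API level 11) and higher.
  • android.support.v8.renderscript: The Support Library APIs, which let you use RenderScript on devices running Android 2.3 (API level 9) and higher.

Here are the tradeoffs: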

  • If you use the Support Library APIs, the RenderScript portion of your application will be compatible with devices running Android 2.3 (API level 9) and higher, regardless of which RenderScript features you use. This allows your application to work on more devices than if you use the native (android.renderscript) APIs.
  • Certain RenderScript features are not available through the Support Library APIs.
  • If you use the Support Library APIs, you will get (possibly significantly) larger APKs than if you use the native (android.renderscript) APIs.

Using the RenderScript Support Library APIs

To use the Support Library RenderScript APIs, you must configure your development environment to access them. The following Android SDK tools are required:

  • Android SDK Tools revision 22.2 or higher
  • Android SDK Build-tools revision 18.1.0 or higher

Note that starting from Android SDK Build-tools 24.0.0, Android 2.2 (API level 8) is no longer supported.

You can check and update the installed version of these tools in the Android SDK Manager.

To use the Support Library RenderScript APIs:

  1. Make sure you have the required Android SDK version installed.
  2. Update the settings for the Android build process to include the RenderScript settings:
    • Open the build.gradle file in the app folder of your application module.
    • Add the following RenderScript settings to the file: