To access the Gemini Pro and Flash models, we recommend Android developers to use Gemini Developer API using Firebase AI Logic. It lets you to get started without requiring a credit card, and provides a generous free tier. Once you validate your integration with a small user base, you can scale by switching to the paid tier.
Getting started
Before you interact with the Gemini API directly from your app, you'll need to do a few things first, including getting familiar with prompting as well as setting up Firebase and your app to use the SDK.
Experiment with prompts
Experimenting with prompts can help you find the best phrasing, content, and format for your Android app. Google AI Studio is an IDE that you can use to prototype and design prompts for your app's use cases.
Creating the right prompt for your use-case is more art than science, which makes experimentation critical. You can learn more about prompting in the Firebase documentation.
Once you are happy with your prompt, click the "<>" button to get code snippets that you can add to your code.
Set up a Firebase project and connect your app to Firebase
Once you're ready to call the API from your app, follow the instructions in the "Step 1" of the Firebase AI Logic getting started guide to set up Firebase and the SDK in your app.
Add the Gradle dependency
Add the following Gradle dependency to your app module:
Kotlin
dependencies {
// ... other androidx dependencies
// Import the BoM for the Firebase platform
implementation(platform("com.google.firebase:firebase-bom:33.13.0"))
// Add the dependency for the Firebase AI Logic library When using the BoM,
// you don't specify versions in Firebase library dependencies
implementation("com.google.firebase:firebase-ai")
}
Java
dependencies {
// Import the BoM for the Firebase platform
implementation(platform("com.google.firebase:firebase-bom:33.13.0"))
// Add the dependency for the Firebase AI Logic library When using the BoM,
// you don't specify versions in Firebase library dependencies
implementation("com.google.firebase:firebase-ai")
// Required for one-shot operations (to use `ListenableFuture` from Guava
// Android)
implementation("com.google.guava:guava:31.0.1-android")
// Required for streaming operations (to use `Publisher` from Reactive
// Streams)
implementation("org.reactivestreams:reactive-streams:1.0.4")
}
Initialize the generative model
Start by instantiating a GenerativeModel
and specifying the model name:
Kotlin
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
.generativeModel("gemini-2.0-flash")
Java
GenerativeModel firebaseAI = FirebaseAI.getInstance(GenerativeBackend.googleAI())
.generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(firebaseAI);
Learn more about the available models for use with the Gemini Developer API. You can also learn more about configuring model parameters.
Interact with the Gemini Developer API from your app
Now that you've set up Firebase and your app to use the SDK, you're ready to interact with the Gemini Developer API from your app.
Generate text
To generate a text response, call generateContent()
with your prompt.
Kotlin
scope.launch {
val response = model.generateContent("Write a story about a magic backpack.")
}
Java
Content prompt = new Content.Builder()
.addText("Write a story about a magic backpack.")
.build();
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
@Override
public void onSuccess(GenerateContentResponse result) {
String resultText = result.getText();
[...]
}
@Override
public void onFailure(Throwable t) {
t.printStackTrace();
}
}, executor);
Generate text from images and other media
You can also generate text from a prompt that includes text plus images or other
media. When you call generateContent()
, you can pass the media as inline data.
For example, to use a bitmap, use the image
content type:
Kotlin
scope.launch {
val response = model.generateContent(
content {
image(bitmap)
text("what is the object in the picture?")
}
)
}
Java
Content content = new Content.Builder()
.addImage(bitmap)
.addText("what is the object in the picture?")
.build();
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
@Override
public void onSuccess(GenerateContentResponse result) {
String resultText = result.getText();
[...]
}
@Override
public void onFailure(Throwable t) {
t.printStackTrace();
}
}, executor);
To pass an audio file, use the inlineData
content type:
Kotlin
val contentResolver = applicationContext.contentResolver
val inputStream = contentResolver.openInputStream(audioUri).use { stream ->
stream?.let {
val bytes = stream.readBytes()
val prompt = content {
inlineData(bytes, "audio/mpeg") // Specify the appropriate audio MIME type
text("Transcribe this audio recording.")
}
val response = model.generateContent(prompt)
}
}
Java
ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(audioUri)) {
File audioFile = new File(new URI(audioUri.toString()));
int audioSize = (int) audioFile.length();
byte audioBytes = new byte[audioSize];
if (stream != null) {
stream.read(audioBytes, 0, audioBytes.length);
stream.close();
// Provide a prompt that includes audio specified earlier and text
Content prompt = new Content.Builder()
.addInlineData(audioBytes, "audio/mpeg") // Specify the appropriate audio MIME type
.addText("Transcribe what's said in this audio recording.")
.build();
// To generate text output, call `generateContent` with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
@Override
public void onSuccess(GenerateContentResponse result) {
String text = result.getText();
Log.d(TAG, (text == null) ? "" : text);
}
@Override
public void onFailure(Throwable t) {
Log.e(TAG, "Failed to generate a response", t);
}
}, executor);
} else {
Log.e(TAG, "Error getting input stream for file.");
// Handle the error appropriately
}
} catch (IOException e) {
Log.e(TAG, "Failed to read the audio file", e);
} catch (URISyntaxException e) {
Log.e(TAG, "Invalid audio file", e);
}
And to provide a video file, continue using the inlineData
content type:
Kotlin
val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
stream?.let {
val bytes = stream.readBytes()
val prompt = content {
inlineData(bytes, "video/mp4") // Specify the appropriate video MIME type
text("Describe the content of this video")
}
val response = model.generateContent(prompt)
}
}
Java
ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
File videoFile = new File(new URI(videoUri.toString()));
int videoSize = (int) videoFile.length();
byte[] videoBytes = new byte[videoSize];
if (stream != null) {
stream.read(videoBytes, 0, videoBytes.length);
stream.close();
// Provide a prompt that includes video specified earlier and text
Content prompt = new Content.Builder()
.addInlineData(videoBytes, "video/mp4")
.addText("Describe the content of this video")
.build();
// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
@Override
public void onSuccess(GenerateContentResponse result) {
String resultText = result.getText();
System.out.println(resultText);
}
@Override
public void onFailure(Throwable t) {
t.printStackTrace();
}
}, executor);
}
} catch (IOException e) {
e.printStackTrace();
} catch (URISyntaxException e) {
e.printStackTrace();
}
Similarly you can also pass PDF (application/pdf
) and plain text
(text/plain
) documents passing their respective MIME Type as a parameter.
Multi-turn chat
You can also support multi-turn conversations. Initialize a chat with the
startChat()
function. You can optionally provide the model with a message
history. Then call the sendMessage()
function to send chat messages.
Kotlin
val chat = model.startChat(
history = listOf(
content(role = "user") { text("Hello, I have 2 dogs in my house.") },
content(role = "model") { text("Great to meet you. What would you like to know?") }
)
)
scope.launch {
val response = chat.sendMessage("How many paws are in my house?")
}
Java
Content.Builder userContentBuilder = new Content.Builder();
userContentBuilder.setRole("user");
userContentBuilder.addText("Hello, I have 2 dogs in my house.");
Content userContent = userContentBuilder.build();
Content.Builder modelContentBuilder = new Content.Builder();
modelContentBuilder.setRole("model");
modelContentBuilder.addText("Great to meet you. What would you like to know?");
Content modelContent = userContentBuilder.build();
List<Content> history = Arrays.asList(userContent, modelContent);
// Initialize the chat
ChatFutures chat = model.startChat(history);
// Create a new user message
Content.Builder messageBuilder = new Content.Builder();
messageBuilder.setRole("user");
messageBuilder.addText("How many paws are in my house?");
Content message = messageBuilder.build();
// Send the message
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(message);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
@Override
public void onSuccess(GenerateContentResponse result) {
String resultText = result.getText();
System.out.println(resultText);
}
@Override
public void onFailure(Throwable t) {
t.printStackTrace();
}
}, executor);
See the Firebase documentation for more details.
Next steps
- Review the Android Quickstart Firebase sample app and the Android AI Sample Catalog on GitHub.
- Prepare your app for production, including setting up Firebase App Check to protect the Gemini API from abuse by unauthorized clients.
- Learn more about Firebase AI Logic in the Firebase documentation.