Google Vertex AI SDK for Java
Note: The com.google.cloud.vertexai.generativeai package and its classes are deprecated as of June 24, 2025 and will be removed on June 24, 2026. Please use the Google Gen AI SDK to access GenAI features. See the migration guide for details.
Note: The com.google.cloud.vertexai.genai package and its classes are experimental, and may change in future versions.
Java idiomatic SDK for Vertex AI.
Add dependency
<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-vertexai</artifactId>
  <version>1.51.0</version>
</dependency>
If you are using Gradle without BOM, add this to your dependencies:
implementation 'com.google.cloud:google-cloud-vertexai:1.51.0'
If you are using SBT, add this to your dependencies:
libraryDependencies += "com.google.cloud" % "google-cloud-vertexai" % "1.51.0"
Authentication
To learn how to authenticate to the API, see Authentication.
Authorization
When a client application makes a call to the Vertex AI API, the application must be granted the authorization scopes that are required for the API. Additionally, the authenticated principal must have the IAM role(s) that are required to access the Google Cloud resources being called.
Getting Started
Follow the instructions in this section to get started using the Vertex AI SDK for Java.
Prerequisites
To use the Vertex AI SDK for Java, you must have completed the following:
- Enable the Vertex AI API for your project.
- Enable billing for your project.
- Install the Google Cloud Command Line Interface and run the following commands on the command line:
gcloud auth login && gcloud config set project PROJECT_ID
To acquire user credentials to use for Application Default Credentials, run gcloud auth application-default login.
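If requests later fail with a credentials error, it can help to check where Application Default Credentials would be read from. The following is a minimal sketch that uses only the JDK, not the SDK. AdcLocator is a hypothetical helper name. The sketch assumes the documented lookup order (the GOOGLE_APPLICATION_CREDENTIALS environment variable first, then the file written by gcloud auth application-default login), falls back to the home directory if APPDATA is unset on Windows, and ignores other overrides.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of the SDK): locates the ADC credentials file. */
final class AdcLocator {

  /** Returns the file that Application Default Credentials would consult first. */
  static Path expectedCredentialsFile(Map<String, String> env, String osName, String userHome) {
    // 1. An explicit GOOGLE_APPLICATION_CREDENTIALS path takes precedence.
    String explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS");
    if (explicit != null && !explicit.isEmpty()) {
      return Paths.get(explicit);
    }
    // 2. Otherwise, the file written by `gcloud auth application-default login`.
    if (osName.toLowerCase(Locale.ROOT).startsWith("windows")) {
      // Assumption: fall back to the home directory when APPDATA is not set.
      String appData = env.getOrDefault("APPDATA", userHome);
      return Paths.get(appData, "gcloud", "application_default_credentials.json");
    }
    return Paths.get(userHome, ".config", "gcloud", "application_default_credentials.json");
  }

  public static void main(String[] args) {
    Path file =
        expectedCredentialsFile(
            System.getenv(), System.getProperty("os.name"), System.getProperty("user.home"));
    System.out.println(file + (Files.isReadable(file) ? " (found)" : " (missing)"));
  }
}
```

A "missing" result usually means that gcloud auth application-default login has not been run for the current user.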
Install and set up the SDK
You must install the google-cloud-vertexai library. See the
Add Dependency section
to learn how to add google-cloud-vertexai as a dependency in your code.
Use the Vertex AI SDK for Java
The following sections show you how to perform common tasks by using the Vertex AI SDK for Java.
Basic Text Generation
The Vertex AI SDK lets you access the service programmatically. The following code snippet shows the most basic usage of the SDK:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      GenerateContentResponse response = model.generateContent("How are you?");
      // Do something with the response
    }
  }
}
Stream generated output
To get a streamed output, you can use the generateContentStream method:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseStream;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      ResponseStream<GenerateContentResponse> responseStream =
          model.generateContentStream("How are you?");
      // Do something with the ResponseStream, which is an iterable.
    }
  }
}
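One common way to "do something" with an iterable stream is to concatenate the text of each chunk as it arrives. The sketch below shows that pattern with JDK types only: a list of strings stands in for the ResponseStream, and StreamCollector is a hypothetical helper name. With the SDK, the extractor argument could plausibly be ResponseHandler::getText, the helper used elsewhere in this document.

```java
import java.util.Arrays;
import java.util.function.Function;

/** Hypothetical helper (not part of the SDK): joins the text of streamed chunks. */
final class StreamCollector {

  /** Iterates over the chunks once and appends the text extracted from each. */
  static <T> String collectText(Iterable<T> chunks, Function<T, String> toText) {
    StringBuilder text = new StringBuilder();
    for (T chunk : chunks) {
      text.append(toText.apply(chunk));
    }
    return text.toString();
  }

  public static void main(String[] args) {
    // Stand-in for a ResponseStream: each element plays the role of one streamed chunk.
    Iterable<String> fakeStream = Arrays.asList("I'm doing ", "well, ", "thank you!");
    System.out.println(collectText(fakeStream, chunk -> chunk));
  }
}
```

Iterating to the end also consumes the stream, which matters for ChatSession, where a stream must be consumed before the next message is sent.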
Text Generation with Async
To get a future response, you can use the generateContentAsync method:
package <your package name>;

import com.google.api.core.ApiFuture;
import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  // future.get() can throw ExecutionException and InterruptedException.
  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      ApiFuture<GenerateContentResponse> future = model.generateContentAsync("How are you?");
      // Do something else.

      // Get the response from the future.
      GenerateContentResponse response = future.get();
      // Do something with the response.
    }
  }
}
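Calling get() with no arguments blocks until the response arrives. Because ApiFuture extends java.util.concurrent.Future, a bounded wait is also possible. The sketch below illustrates that idea with a CompletableFuture standing in for the future returned by generateContentAsync. BoundedWait and the fallback value are hypothetical, and returning a fallback is only one of several reasonable error-handling choices.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of the SDK): waits for a future with a time limit. */
final class BoundedWait {

  /** Waits up to timeoutMillis for the result; returns the fallback on timeout or failure. */
  static <T> T getOrFallback(Future<T> future, long timeoutMillis, T fallback) {
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // Give up on the pending call.
      return fallback;
    } catch (ExecutionException e) {
      return fallback; // The call itself failed; e.getCause() holds the reason.
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // Preserve the interrupt status.
      return fallback;
    }
  }

  public static void main(String[] args) {
    // Stand-in for a request that has already completed.
    Future<String> done = CompletableFuture.completedFuture("I'm fine, thanks!");
    System.out.println(getOrFallback(done, 100, "(no answer)"));

    // Stand-in for a request that does not complete in time.
    Future<String> pending = new CompletableFuture<>();
    System.out.println(getOrFallback(pending, 50, "(no answer)"));
  }
}
```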
Generate text from multi-modal input
To generate text from a prompt that contains multiple modalities of data, use
ContentMaker to make a Content:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;
import com.google.cloud.vertexai.generativeai.ResponseStream;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";
  private static final String IMAGE_URI = "<gcs uri to your image>";

  public static void main(String[] args) throws Exception {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      // Vision model must be used for multi-modal input
      GenerativeModel model = new GenerativeModel("gemini-pro-vision", vertexAi);
      ResponseStream<GenerateContentResponse> stream =
          model.generateContentStream(
              ContentMaker.fromMultiModalData(
                  "Please describe this image",
                  PartMaker.fromMimeTypeAndData("image/jpeg", IMAGE_URI)));
      // Do something with the ResponseStream, which is an iterable.
    }
  }
}
Role Change for Multi-turn Conversation
For a multi-turn conversation, build a list of Content objects that represents the whole conversation between the two roles, "user" and "model".
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.Content;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseHandler;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";
  private static final String MODEL_NAME = "gemini-pro";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      GenerativeModel model = new GenerativeModel(MODEL_NAME, vertexAi);

      // Put all the contents in a Content list
      List<Content> contents =
          Arrays.asList(
              ContentMaker.fromString("Hi!"),
              ContentMaker.forRole("model").fromString("Hello! How may I assist you?"),
              ContentMaker.fromString(
                  "Can you explain quantum mechanics as well in only a few sentences?"));

      // Generate the result
      GenerateContentResponse response = model.generateContent(contents);

      // ResponseHandler.getText is a helper function to retrieve the text part of the answer.
      System.out.println("\nPrint response: ");
      System.out.println(ResponseHandler.getText(response));
      System.out.println("\n");
    }
  }
}
Use ChatSession for multi-turn chat
The Vertex AI SDK for Java provides a ChatSession class that lets you easily
chat with the model:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.Content;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ChatSession;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseStream;
import java.io.IOException;
import java.util.List;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      ChatSession chat = model.startChat();

      // Send the first message.
      // ChatSession also has two versions of sendMessage, stream and non-stream
      ResponseStream<GenerateContentResponse> response = chat.sendMessageStream("Hi!");
      // Do something with the output stream, possibly with ResponseHandler

      // Now send another message. The history will be remembered by the ChatSession.
      // Note: the stream needs to be consumed before you send another message
      // or fetch the history.
      ResponseStream<GenerateContentResponse> anotherResponse =
          chat.sendMessageStream("Can you explain quantum mechanics as well in a few sentences?");
      // Do something with the second response

      // See the whole history. Make sure you have consumed the stream.
      List<Content> history = chat.getHistory();
    }
  }
}
Update configurations
The Vertex AI SDK for Java provides configurations for customizing content generation. You can configure options like GenerationConfig, SafetySetting, and system instructions, or add Tool for function calling.
You can choose between two configuration approaches: 1) set configurations during model instantiation for consistency across all text generations, or 2) adjust them on a per-request basis for fine-grained control.
Model level configurations
Below is an example of configuring GenerationConfig for a GenerativeModel:
package <PACKAGE_NAME>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.api.GenerationConfig;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseHandler;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<PROJECT_ID>";
  private static final String LOCATION = "<LOCATION>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      // Build a GenerationConfig instance.
      GenerationConfig generationConfig =
          GenerationConfig.newBuilder().setMaxOutputTokens(50).build();

      // Use the builder to instantiate the model with the configuration.
      GenerativeModel model =
          new GenerativeModel.Builder()
              .setModelName("gemini-pro")
              .setVertexAi(vertexAi)
              .setGenerationConfig(generationConfig)
              .build();

      // Generate the response.
      GenerateContentResponse response = model.generateContent("Please explain LLM?");
      // Do something with the response.
    }
  }
}
The following example configures a preamble as a system instruction:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      GenerativeModel model =
          new GenerativeModel.Builder()
              .setModelName("gemini-pro")
              .setVertexAi(vertexAi)
              .setSystemInstruction(
                  ContentMaker.fromString(
                      "You're a helpful assistant that starts all its answers with: \"COOL\""))
              .build();

      GenerateContentResponse response = model.generateContent("How are you?");
      // Do something with the response
    }
  }
}
Update request-level configurations
The Vertex AI SDK for Java provides fluent APIs to control request-level configurations.
package <PACKAGE_NAME>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.api.HarmCategory;
import com.google.cloud.vertexai.api.SafetySetting;
import com.google.cloud.vertexai.api.SafetySetting.HarmBlockThreshold;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;
import java.util.Arrays;

public class Main {
  private static final String PROJECT_ID = "<PROJECT_ID>";
  private static final String LOCATION = "<LOCATION>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      // Instantiate the model that the request-level settings are applied to.
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);

      // Build a SafetySetting instance.
      SafetySetting safetySetting =
          SafetySetting.newBuilder()
              .setCategory(HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT)
              .setThreshold(HarmBlockThreshold.BLOCK_LOW_AND_ABOVE)
              .build();

      // Generate the response with the fluent API `withSafetySettings`.
      GenerateContentResponse response =
          model
              .withSafetySettings(Arrays.asList(safetySetting))
              .generateContent("Please explain LLM?");
      // Do something with the response.
    }
  }
}
Configurations for ChatSession
When a chat session is started (ChatSession chat = model.startChat()),
it inherits all configurations from the model. You can also use fluent APIs
to update these settings during the chat.
package <PACKAGE_NAME>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.api.GenerationConfig;
import com.google.cloud.vertexai.generativeai.ChatSession;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseHandler;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<PROJECT_ID>";
  private static final String LOCATION = "<LOCATION>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      // Instantiate a model with GenerationConfig
      GenerationConfig generationConfig =
          GenerationConfig.newBuilder().setMaxOutputTokens(50).build();
      GenerativeModel model =
          new GenerativeModel.Builder()
              .setModelName("gemini-pro")
              .setVertexAi(vertexAi)
              .setGenerationConfig(generationConfig)
              .build();

      // Start a chat session
      ChatSession chat = model.startChat();

      // Send a message. The model level GenerationConfig will be applied here
      GenerateContentResponse response = chat.sendMessage("Please explain LLM?");
      // Do something with the response

      // Send another message, using Fluent API to update the GenerationConfig
      response =
          chat.withGenerationConfig(GenerationConfig.getDefaultInstance())
              .sendMessage("Tell me more about what you can do.");
      // Do something with the response
    }
  }
}
Use ChatSession for function calling
You can perform a function call in a ChatSession as follows:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.Content;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.api.Tool;
import com.google.cloud.vertexai.generativeai.ChatSession;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.FunctionDeclarationMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;
import com.google.cloud.vertexai.generativeai.ResponseHandler;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;

public class Main {
  private static final String PROJECT_ID = "<your project>";
  private static final String LOCATION = "<location>";
  private static final String MODEL_NAME = "gemini-pro";
  private static final String TEXT = "What's the weather in Vancouver?";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi = new VertexAI(PROJECT_ID, LOCATION)) {
      // Declare a function to be used in a request.
      // We construct a jsonString that corresponds to the following function
      // declaration.
      // {
      //   "name": "getCurrentWeather",
      //   "description": "Get the current weather in a given location",
      //   "parameters": {
      //     "type": "OBJECT",
      //     "properties": {
      //       "location": {
      //         "type": "STRING",
      //         "description": "location"
      //       }
      //     }
      //   }
      // }
      // With JDK 15 and above, you can do
      //
      // String jsonString = """
      //   {
      //     "name": "getCurrentWeather",
      //     "description": "Get the current weather in a given location",
      //     "parameters": {
      //       "type": "OBJECT",
      //       "properties": {
      //         "location": {
      //           "type": "STRING",
      //           "description": "location"
      //         }
      //       }
      //     }
      //   }
      // """;
      String jsonString =
          "{\n"
              + "  \"name\": \"getCurrentWeather\",\n"
              + "  \"description\": \"Get the current weather in a given location\",\n"
              + "  \"parameters\": {\n"
              + "    \"type\": \"OBJECT\",\n"
              + "    \"properties\": {\n"
              + "      \"location\": {\n"
              + "        \"type\": \"STRING\",\n"
              + "        \"description\": \"location\"\n"
              + "      }\n"
              + "    }\n"
              + "  }\n"
              + "}";
      Tool tool =
          Tool.newBuilder()
              .addFunctionDeclarations(FunctionDeclarationMaker.fromJsonString(jsonString))
              .build();

      // Start a chat session from a model, with the use of the declared
      // function.
      GenerativeModel model =
          new GenerativeModel.Builder()
              .setModelName(MODEL_NAME)
              .setVertexAi(vertexAi)
              .setTools(Arrays.asList(tool))
              .build();
      ChatSession chat = model.startChat();

      System.out.println(String.format("Ask the question: %s", TEXT));
      GenerateContentResponse response = chat.sendMessage(TEXT);

      // The model will most likely return a function call to the declared
      // function `getCurrentWeather` with "Vancouver" as the value for the
      // argument `location`.
      System.out.println("\nPrint response: ");
      System.out.println(ResponseHandler.getContent(response));
      System.out.println("\n");

      // Provide an answer to the model so that it knows what the result of a
      // "function call" is.
      Content content =
          ContentMaker.fromMultiModalData(
              PartMaker.fromFunctionResponse(
                  "getCurrentWeather", Collections.singletonMap("currentWeather", "snowing")));
      System.out.println("Provide the function response: ");
      System.out.println(content);
      System.out.println("\n");
      response = chat.sendMessage(content);

      // See what the model replies now
      System.out.println("\nPrint response: ");
      System.out.println(ResponseHandler.getText(response));
      System.out.println("\n");
    }
  }
}
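The example above hard-codes the function result ("snowing"). In an application, the function name and arguments returned by the model would typically be routed to local code that computes the result. The following sketch shows one way to organize that routing with JDK types only. FunctionRegistry, its methods, and the string-to-string argument maps are hypothetical simplifications, not part of the SDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper (not part of the SDK): routes a function call to local code. */
final class FunctionRegistry {
  private final Map<String, Function<Map<String, String>, Map<String, String>>> handlers =
      new HashMap<>();

  /** Registers the local implementation for a declared function name. */
  void register(String name, Function<Map<String, String>, Map<String, String>> handler) {
    handlers.put(name, handler);
  }

  /** Runs the handler registered for the given name with the given arguments. */
  Map<String, String> dispatch(String name, Map<String, String> args) {
    Function<Map<String, String>, Map<String, String>> handler = handlers.get(name);
    if (handler == null) {
      throw new IllegalArgumentException("No handler registered for function: " + name);
    }
    return handler.apply(args);
  }

  public static void main(String[] args) {
    FunctionRegistry registry = new FunctionRegistry();
    // Stand-in weather lookup; a real handler would call a weather service.
    registry.register(
        "getCurrentWeather",
        callArgs ->
            Collections.singletonMap(
                "currentWeather", "snowing in " + callArgs.get("location")));

    // Simulates the call the model is expected to request in the example above.
    Map<String, String> result =
        registry.dispatch("getCurrentWeather", Collections.singletonMap("location", "Vancouver"));
    System.out.println(result);
  }
}
```

The returned map has the same shape as the one passed to PartMaker.fromFunctionResponse in the example above, so it could be sent back to the model in the same way.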
See the Vertex AI SDK docs to learn more about how to use this Vertex AI SDK in more advanced ways.
Troubleshooting
To get help, follow the instructions in the shared Troubleshooting document.
Other Configurations
Vertex-scoped Configurations
Transport
Vertex AI supports gRPC and REST for the transport layer and uses gRPC by default. To use REST, pass Transport.REST to the VertexAI builder, as in the following example:
package <your package name>;

import com.google.cloud.vertexai.Transport;
import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi =
        new VertexAI.Builder()
            .setProjectId(PROJECT_ID)
            .setLocation(LOCATION)
            .setTransport(Transport.REST)
            .build()) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      GenerateContentResponse response = model.generateContent("How are you?");
      // Do something with the response
    }
  }
}
Change API endpoints
To use a different API endpoint, specify the endpoint that you want to use when
you instantiate VertexAI:
package <your package name>;

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;

public class Main {
  private static final String PROJECT_ID = "<your project id>";
  private static final String LOCATION = "<location>";

  public static void main(String[] args) throws IOException {
    try (VertexAI vertexAi =
        new VertexAI.Builder()
            .setProjectId(PROJECT_ID)
            .setLocation(LOCATION)
            .setApiEndpoint("<new_endpoint>")
            .build()) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      GenerateContentResponse response = model.generateContent("How are you?");
      // Do something with the response
    }
  }
}
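For reference, Vertex AI service endpoints conventionally take the form LOCATION-aiplatform.googleapis.com, with aiplatform.googleapis.com for the global location. The sketch below builds such a host name from a location string using the JDK only. VertexEndpoints is a hypothetical helper name, and whether a given location or custom endpoint is valid for your project is an assumption to verify against the Vertex AI documentation.

```java
import java.util.Locale;

/** Hypothetical helper (not part of the SDK): builds a conventional Vertex AI host name. */
final class VertexEndpoints {

  /** Returns LOCATION-aiplatform.googleapis.com, or the global host for "global". */
  static String endpointFor(String location) {
    if (location == null || location.trim().isEmpty()) {
      throw new IllegalArgumentException("location must not be empty");
    }
    String normalized = location.trim().toLowerCase(Locale.ROOT);
    if (normalized.equals("global")) {
      return "aiplatform.googleapis.com";
    }
    return normalized + "-aiplatform.googleapis.com";
  }

  public static void main(String[] args) {
    System.out.println(endpointFor("us-central1"));
    System.out.println(endpointFor("global"));
  }
}
```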
Supported Java Versions
Java 8 or above is required for using this client.
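An application can check this requirement at startup. The following is a minimal sketch that reads the standard java.specification.version system property, which is "1.8" on Java 8 and a plain number such as "11" or "17" on later releases. JavaVersionCheck is a hypothetical helper name, and failing fast with an exception is just one possible policy.

```java
/** Hypothetical helper (not part of the SDK): checks the running Java feature version. */
final class JavaVersionCheck {

  /** Maps a specification version to its feature number, for example "1.8" to 8. */
  static int featureVersion(String specVersion) {
    String[] parts = specVersion.trim().split("\\.");
    int first = Integer.parseInt(parts[0]);
    // Versions before Java 9 use the legacy "1.x" scheme.
    return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
  }

  public static void main(String[] args) {
    int version = featureVersion(System.getProperty("java.specification.version"));
    if (version < 8) {
      throw new IllegalStateException("Java 8 or above is required, found Java " + version);
    }
    System.out.println("Running on Java " + version);
  }
}
```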
Google's Java client libraries, Google Cloud Client Libraries and Google Cloud API Libraries, follow the Oracle Java SE support roadmap (see the Oracle Java SE Product Releases section).
For new development
In general, new feature development occurs with support for the lowest Java LTS version covered by Oracle's Premier Support (which typically lasts 5 years from initial General Availability). If the minimum required JVM for a given library is changed, it is accompanied by a semver major release.
Java 11 and (in September 2021) Java 17 are the best choices for new development.
Keeping production systems current
Google tests its client libraries with all current LTS versions covered by Oracle's Extended Support (which typically lasts 8 years from initial General Availability).
Legacy support
Google's client libraries support legacy versions of Java runtimes with long-term stable libraries that don't receive feature updates. This support is provided on a best-efforts basis, as it may not be possible to backport all patches.
Google provides updates on a best efforts basis to apps that continue to use Java 7, though apps might need to upgrade to current versions of the library that supports their JVM.
Where to find specific information
The latest versions and the supported Java versions are identified on
the individual GitHub repository github.com/GoogleAPIs/java-SERVICENAME
and on google-cloud-java.
Versioning
This library follows Semantic Versioning.
Contribute to this library
Contributions to this library are always welcome and highly encouraged.
See CONTRIBUTING for more information on how to get started.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. See Code of Conduct for more information.
License
Apache 2.0 - See LICENSE for more information.
CI Status
| Java Version | Status |
|---|---|
| Java 8 | |
| Java 8 OSX | |
| Java 8 Windows | |
| Java 11 | |
Java is a registered trademark of Oracle and/or its affiliates.