本頁面由 Cloud Translation API 翻譯而成。

從文字實體擷取模型取得預測結果
透過集合功能整理內容你可以依據偏好儲存及分類內容。

本頁面說明如何使用 Google Cloud 控制台或 Vertex AI API，從文字實體擷取模型取得線上 (即時) 預測和批次預測。

線上預測和批次預測的差異

「線上預測」是對模型端點發出的同步要求。如要依據應用程式輸入內容發出要求，或是需要及時進行推論，您可以選用「線上預測」模式。

「批次預測」為非同步要求。您可以直接從模型資源要求批次預測，而不需要將模型部署至端點。如果是文字資料，如果您不需要立即取得回應，並想透過單一要求處理累積的資料，就適合使用批次預測功能。

取得線上預測

將模型部署至端點

您必須先將模型部署至端點，才能使用模型進行線上預測。部署模型時，系統會將實體資源與模型建立關聯，讓模型以低延遲的方式提供線上預測結果。

您可以將多個模型部署至同一個端點，也可以將模型部署至多個端點。如要進一步瞭解部署模型的選項和用途，請參閱「關於部署模型」。

請使用下列其中一種方法部署模型：

Google Cloud 控制台

在 Google Cloud 控制台的 Vertex AI 專區中，前往「Models」頁面。

前往「Models」(模型) 頁面
按一下要部署的模型名稱，開啟模型詳細資料頁面。
選取「Deploy & Test」分頁標籤。

如果模型已部署至任何端點，這些端點會列在「Deploy your model」部分。
按一下「Deploy to endpoint」。
如要將模型部署至新端點，請選取「Create new endpoint」(建立新端點)，然後為新端點提供名稱。如要將模型部署至現有端點，請選取「Add to existing endpoint」，然後從下拉式清單中選取端點。

您可以在端點中加入多個模型，也可以在多個端點中加入模型。瞭解詳情。
如果您將模型部署至已部署一或多個模型的現有端點，則必須更新您要部署的模型和已部署模型的流量拆分百分比，讓所有百分比加總為 100%。
選取「AutoML Text」，然後按照下列步驟進行設定：
1. 如果您要將模型部署至新端點，請接受「流量分配」為 100。否則，請調整端點上所有模型的流量拆分值，使其相加結果為 100。
2. 按一下模型的「完成」，然後在所有流量分配百分比正確無誤後，按一下「繼續」。
  系統會顯示模型部署的區域。這個地區必須是您建立模型的地區。
3. 按一下「Deploy」，將模型部署至端點。

API

使用 Vertex AI API 部署模型時，您必須完成下列步驟：

視需要建立端點。
取得端點 ID。
將模型部署至端點。

建立端點

如果您要將模型部署至現有端點，可以略過這個步驟。

gcloud

以下範例使用 gcloud ai endpoints create 指令：

gcloud ai endpoints create \
  --region=LOCATION \
  --display-name=ENDPOINT_NAME

更改下列內容：

LOCATION_ID：您使用 Vertex AI 的區域。
ENDPOINT_NAME：端點的顯示名稱。

Google Cloud CLI 工具可能需要幾秒鐘的時間才能建立端點。

REST

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：您的區域。
PROJECT_ID：您的專案 ID。
ENDPOINT_NAME：端點的顯示名稱。

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

JSON 要求主體：

{
  "display_name": "ENDPOINT_NAME"
}

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

注意：以下指令假設您已使用使用者帳戶登入 gcloud CLI，方法是執行 gcloud init 或 gcloud auth login，或是使用 Cloud Shell，後者會自動登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints"

PowerShell (Windows)

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}

"done": true

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

擷取端點 ID

您需要端點 ID 才能部署模型。

gcloud

以下範例使用 gcloud ai endpoints list 指令：

gcloud ai endpoints list \
  --region=LOCATION \
  --filter=display_name=ENDPOINT_NAME

更改下列內容：

LOCATION_ID：您使用 Vertex AI 的區��。
ENDPOINT_NAME：端點的顯示名稱。

請注意「ENDPOINT_ID」欄中的數字。在下一個步驟中使用這個 ID。

REST

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：您使用 Vertex AI 的區域。
PROJECT_ID：您的專案 ID。
ENDPOINT_NAME：端點的顯示名稱。

HTTP 方法和網址：

GET https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

執行下列指令：

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME"

PowerShell (Windows)

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "endpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID",
      "displayName": "ENDPOINT_NAME",
      "etag": "AMEw9yPz5pf4PwBHbRWOGh0PcAxUdjbdX2Jm3QO_amguy3DbZGP5Oi_YUKRywIE-BtLx",
      "createTime": "2020-04-17T18:31:11.585169Z",
      "updateTime": "2020-04-17T18:35:08.568959Z"
    }
  ]
}

ENDPOINT_ID

部署模型

請選取下方對應您語言或環境的分頁：

gcloud

以下範例使用 gcloud ai endpoints deploy-model 指令。

以下範例會將 Model 部署至 Endpoint，但不會在多個 DeployedModel 資源之間分割流量：

使用下列任何指令資料之前，請先替換以下項目：

ENDPOINT_ID：端點的 ID。
LOCATION_ID：您使用 Vertex AI 的區域。
MODEL_ID：要部署的模型 ID。
DEPLOYED_MODEL_NAME：DeployedModel 的名稱。您也可以使用 Model 的顯示名稱來命名 DeployedModel。
MIN_REPLICA_COUNT：此部署作業的節點數量下限。節點數量可視預測負載需求增加或減少，但不得超過節點數量上限，也不能少於這個數量。
MAX_REPLICA_COUNT：此部署作業的節點數量上限。節點數量可視預測負載需求增加或減少，但不得超過這個數量，也不能少於最小節點數量。如果省略 --max-replica-count 標記，節點數量上限會設為 --min-replica-count 的值。

執行 gcloud ai endpoints deploy-model 指令：

Linux、macOS 或 Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --traffic-split=0=100

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --traffic-split=0=100

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --traffic-split=0=100

流量分配

上述範例中的 --traffic-split=0=100 標記會將 Endpoint 收到的預測流量 100% 傳送至新的 DeployedModel，並以臨時 ID 0 表示。如果您的 Endpoint 已包含其他 DeployedModel 資源，您可以將流量分配給新 DeployedModel 和舊 DeployedModel。例如，如要將 20% 的流量傳送至新的 DeployedModel，並將 80% 的流量傳送至較舊的 DeployedModel，請執行下列指令。

使用下列任何指令資料之前，請先替換以下項目：

OLD_DEPLOYED_MODEL_ID：現有 DeployedModel 的 ID。

執行 gcloud ai endpoints deploy-model 指令：

Linux、macOS 或 Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

REST

部署模型。

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：您使用 Vertex AI 的區域。
PROJECT_ID：您的專案 ID。
ENDPOINT_ID：端點的 ID。
MODEL_ID：要部署的模型 ID。
DEPLOYED_MODEL_NAME：DeployedModel 的名稱。您也可以使用 Model 的顯示名稱來命名 DeployedModel。
TRAFFIC_SPLIT_THIS_MODEL：傳送至此端點的預測流量百分比，會路由至透過此作業部署的模型。預設值為 100。所有流量百分比的總和必須為 100。進一步瞭解流量分配。
DEPLOYED_MODEL_ID_N：選用。如果其他模型已部署至這個端點，您必須更新其流量分配百分比，讓所有百分比加總為 100。
TRAFFIC_SPLIT_MODEL_N：已部署模型 ID 鍵的流量分配百分比值。
PROJECT_NUMBER：系統自動產生的專案編號

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel

JSON 要求主體：

{
  "deployedModel": {
    "model": "projects/PROJECT_ID/locations/us-central1/models/MODEL_ID",
    "displayName": "DEPLOYED_MODEL_NAME",
    "automaticResources": {
     }
  },
  "trafficSplit": {
    "0": TRAFFIC_SPLIT_THIS_MODEL,
    "DEPLOYED_MODEL_ID_1": TRAFFIC_SPLIT_MODEL_1,
    "DEPLOYED_MODEL_ID_2": TRAFFIC_SPLIT_MODEL_2
  },
}

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel"

PowerShell (Windows)

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    }
  }
}

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.api.gax.longrunning.OperationFuture;
import com.google.api.gax.longrunning.OperationTimedPollAlgorithm;
import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.aiplatform.v1.AutomaticResources;
import com.google.cloud.aiplatform.v1.DedicatedResources;
import com.google.cloud.aiplatform.v1.DeployModelOperationMetadata;
import com.google.cloud.aiplatform.v1.DeployModelResponse;
import com.google.cloud.aiplatform.v1.DeployedModel;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.ModelName;
import com.google.cloud.aiplatform.v1.stub.EndpointServiceStubSettings;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.threeten.bp.Duration;

public class DeployModelSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String deployedModelDisplayName = "YOUR_DEPLOYED_MODEL_DISPLAY_NAME";
    String endpointId = "YOUR_ENDPOINT_NAME";
    String modelId = "YOUR_MODEL_ID";
    int timeout = 900;
    deployModelSample(project, deployedModelDisplayName, endpointId, modelId, timeout);
  }

  static void deployModelSample(
      String project,
      String deployedModelDisplayName,
      String endpointId,
      String modelId,
      int timeout)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {

    // Set long-running operations (LROs) timeout
    final OperationTimedPollAlgorithm operationTimedPollAlgorithm =
        OperationTimedPollAlgorithm.create(
            RetrySettings.newBuilder()
                .setInitialRetryDelay(Duration.ofMillis(5000L))
                .setRetryDelayMultiplier(1.5)
                .setMaxRetryDelay(Duration.ofMillis(45000L))
                .setInitialRpcTimeout(Duration.ZERO)
                .setRpcTimeoutMultiplier(1.0)
                .setMaxRpcTimeout(Duration.ZERO)
                .setTotalTimeout(Duration.ofSeconds(timeout))
                .build());

    EndpointServiceStubSettings.Builder endpointServiceStubSettingsBuilder =
        EndpointServiceStubSettings.newBuilder();
    endpointServiceStubSettingsBuilder
        .deployModelOperationSettings()
        .setPollingAlgorithm(operationTimedPollAlgorithm);
    EndpointServiceStubSettings endpointStubSettings = endpointServiceStubSettingsBuilder.build();
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.create(endpointStubSettings);
    endpointServiceSettings =
        endpointServiceSettings.toBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);
      // key '0' assigns traffic for the newly deployed model
      // Traffic percentage values must add up to 100
      // Leave dictionary empty if endpoint should not accept any traffic
      Map<String, Integer> trafficSplit = new HashMap<>();
      trafficSplit.put("0", 100);
      ModelName modelName = ModelName.of(project, location, modelId);
      AutomaticResources automaticResourcesInput =
          AutomaticResources.newBuilder().setMinReplicaCount(1).setMaxReplicaCount(1).build();
      DeployedModel deployedModelInput =
          DeployedModel.newBuilder()
              .setModel(modelName.toString())
              .setDisplayName(deployedModelDisplayName)
              .setAutomaticResources(automaticResourcesInput)
              .build();

      OperationFuture<DeployModelResponse, DeployModelOperationMetadata> deployModelResponseFuture =
          endpointServiceClient.deployModelAsync(endpointName, deployedModelInput, trafficSplit);
      System.out.format(
          "Operation name: %s\n", deployModelResponseFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      DeployModelResponse deployModelResponse = deployModelResponseFuture.get(20, TimeUnit.MINUTES);

      System.out.println("Deploy Model Response");
      DeployedModel deployedModel = deployModelResponse.getDeployedModel();
      System.out.println("\tDeployed Model");
      System.out.format("\t\tid: %s\n", deployedModel.getId());
      System.out.format("\t\tmodel: %s\n", deployedModel.getModel());
      System.out.format("\t\tDisplay Name: %s\n", deployedModel.getDisplayName());
      System.out.format("\t\tCreate Time: %s\n", deployedModel.getCreateTime());

      DedicatedResources dedicatedResources = deployedModel.getDedicatedResources();
      System.out.println("\t\tDedicated Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", dedicatedResources.getMinReplicaCount());

      MachineSpec machineSpec = dedicatedResources.getMachineSpec();
      System.out.println("\t\t\tMachine Spec");
      System.out.format("\t\t\t\tMachine Type: %s\n", machineSpec.getMachineType());
      System.out.format("\t\t\t\tAccelerator Type: %s\n", machineSpec.getAcceleratorType());
      System.out.format("\t\t\t\tAccelerator Count: %s\n", machineSpec.getAcceleratorCount());

      AutomaticResources automaticResources = deployedModel.getAutomaticResources();
      System.out.println("\t\tAutomatic Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", automaticResources.getMinReplicaCount());
      System.out.format("\t\t\tMax Replica Count: %s\n", automaticResources.getMaxReplicaCount());
    }
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const modelId = "YOUR_MODEL_ID";
// const endpointId = 'YOUR_ENDPOINT_ID';
// const deployedModelDisplayName = 'YOUR_DEPLOYED_MODEL_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

const modelName = `projects/${project}/locations/${location}/models/${modelId}`;
const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint:
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function deployModel() {
  // Configure the parent resource
  // key '0' assigns traffic for the newly deployed model
  // Traffic percentage values must add up to 100
  // Leave dictionary empty if endpoint should not accept any traffic
  const trafficSplit = {0: 100};
  const deployedModel = {
    // format: 'projects/{project}/locations/{location}/models/{model}'
    model: modelName,
    displayName: deployedModelDisplayName,
    automaticResources: {minReplicaCount: 1, maxReplicaCount: 1},
  };
  const request = {
    endpoint,
    deployedModel,
    trafficSplit,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.deployModel(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Deploy model response');
  const modelDeployed = result.deployedModel;
  console.log('\tDeployed model');
  if (!modelDeployed) {
    console.log('\t\tId : {}');
    console.log('\t\tModel : {}');
    console.log('\t\tDisplay name : {}');
    console.log('\t\tCreate time : {}');

    console.log('\t\tDedicated resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMachine spec {}');
    console.log('\t\t\t\tMachine type : {}');
    console.log('\t\t\t\tAccelerator type : {}');
    console.log('\t\t\t\tAccelerator count : {}');

    console.log('\t\tAutomatic resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMax replica count : {}');
  } else {
    console.log(`\t\tId : ${modelDeployed.id}`);
    console.log(`\t\tModel : ${modelDeployed.model}`);
    console.log(`\t\tDisplay name : ${modelDeployed.displayName}`);
    console.log(`\t\tCreate time : ${modelDeployed.createTime}`);

    const dedicatedResources = modelDeployed.dedicatedResources;
    console.log('\t\tDedicated resources');
    if (!dedicatedResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMachine spec {}');
      console.log('\t\t\t\tMachine type : {}');
      console.log('\t\t\t\tAccelerator type : {}');
      console.log('\t\t\t\tAccelerator count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${dedicatedResources.minReplicaCount}`
      );
      const machineSpec = dedicatedResources.machineSpec;
      console.log('\t\t\tMachine spec');
      console.log(`\t\t\t\tMachine type : ${machineSpec.machineType}`);
      console.log(
        `\t\t\t\tAccelerator type : ${machineSpec.acceleratorType}`
      );
      console.log(
        `\t\t\t\tAccelerator count : ${machineSpec.acceleratorCount}`
      );
    }

    const automaticResources = modelDeployed.automaticResources;
    console.log('\t\tAutomatic resources');
    if (!automaticResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMax replica count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${automaticResources.minReplicaCount}`
      );
      console.log(
        `\t\t\tMax replica count : \
          ${automaticResources.maxReplicaCount}`
      );
    }
  }
}
deployModel();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def deploy_model_with_automatic_resources_sample(
    project,
    location,
    model_name: str,
    endpoint: Optional[aiplatform.Endpoint] = None,
    deployed_model_display_name: Optional[str] = None,
    traffic_percentage: Optional[int] = 0,
    traffic_split: Optional[Dict[str, int]] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
):
    """
    model_name: A fully-qualified model resource name or model ID.
          Example: "projects/123/locations/us-central1/models/456" or
          "456" when project and location are initialized or passed.
    """

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)

    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=deployed_model_display_name,
        traffic_percentage=traffic_percentage,
        traffic_split=traffic_split,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
        metadata=metadata,
        sync=sync,
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

取得作業狀態

部分要求會啟動需要時間才能完成的長時間作業。這些要求會傳回作業名稱，您可以使用該名稱查看作業狀態或取消作業。Vertex AI 提供輔助方法，可針對長時間執行的作業進行呼叫。詳情請參閱「處理長時間執行作業」。

使用已部署的模型進行線上預測

如要進行線上預測，請將一或多個測試項目提交至模型進行分析，模型會根據模型的目標傳回結果。如要進一步瞭解預測結果，請參閱「解讀結果」頁面。

主控台

請使用 Google Cloud 控制台要求線上預測。模型必須部署至端點。

在 Google Cloud 控制台的 Vertex AI 專區中，前往「Models」頁面。

前往「Models」(模型) 頁面
在模型清單中，按一下要要求預測結果的模型名稱。
選取「Deploy & test」分頁標籤。
在「Test your model」部分下方，新增測試項目來要求預測。

針對文字目標的 AutoML 模型，您必須在文字欄位中輸入內容，然後按一下「預測」。

如要瞭解本機地圖特徵的重要性，請參閱「取得說明」。

預測完成後，Vertex AI 會在控制台中傳回結果。

API

使用 Vertex AI API 要求線上預測。模型必須部署至端點。

gcloud

建立名為 request.json 的檔案，並在當中加入下列內容：
```
{
  "instances": [{
    "mimeType": "text/plain",
    "content": "CONTENT"
  }]
}
```
更改下列內容：
- CONTENT：用於預測的文字片段。
執行下列指令：
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
更改下列內容：
- ENDPOINT_ID：端點的 ID。
- LOCATION_ID：您使用 Vertex AI 的區域。

REST

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：端點所在的區域。例如：us-central1。
PROJECT_ID：您的專案 ID
ENDPOINT_ID：端點 ID
CONTENT：用於預測的文字片段。
DEPLOYED_MODEL_ID：用於預測的已部署模型 ID。

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

JSON 要求主體：

{
  "instances": [{
    "mimeType": "text/plain",
    "content": "CONTENT"
  }]
}

如要傳送要求，請選擇以下其中一個選項：

curl

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "predictions": {
    "ids": [
      "1234567890123456789",
      "2234567890123456789",
      "3234567890123456789"
    ],
    "displayNames": [
      "SpecificDisease",
      "DiseaseClass",
      "SpecificDisease"
    ],
    "textSegmentStartOffsets":  [13, 40, 57],
    "textSegmentEndOffsets": [29, 51, 75],
    "confidences": [
      0.99959725141525269,
      0.99912621492484128,
      0.99935531616210938
    ]
  },
  "deployedModelId": "0123456789012345678"
}

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.instance.TextExtractionPredictionInstance;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.TextExtractionPredictionResult;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictTextEntityExtractionSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String content = "YOUR_TEXT_CONTENT";
    String endpointId = "YOUR_ENDPOINT_ID";

    predictTextEntityExtraction(project, content, endpointId);
  }

  static void predictTextEntityExtraction(String project, String content, String endpointId)
      throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      String jsonString = "{\"content\": \"" + content + "\"}";

      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      TextExtractionPredictionInstance instance =
          TextExtractionPredictionInstance.newBuilder().setContent(content).build();

      List<Value> instances = new ArrayList<>();
      instances.add(ValueConverter.toValue(instance));

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, ValueConverter.EMPTY_VALUE);
      System.out.println("Predict Text Entity Extraction Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());

      System.out.println("Predictions");
      for (Value prediction : predictResponse.getPredictionsList()) {
        TextExtractionPredictionResult.Builder resultBuilder =
            TextExtractionPredictionResult.newBuilder();

        TextExtractionPredictionResult result =
            (TextExtractionPredictionResult) ValueConverter.fromValue(resultBuilder, prediction);

        for (int i = 0; i < result.getIdsCount(); i++) {
          long textStartOffset = result.getTextSegmentStartOffsets(i);
          long textEndOffset = result.getTextSegmentEndOffsets(i);
          String entity = content.substring((int) textStartOffset, (int) textEndOffset);

          System.out.format("\tEntity: %s\n", entity);
          System.out.format("\tEntity type: %s\n", result.getDisplayNames(i));
          System.out.format("\tConfidences: %f\n", result.getConfidences(i));
          System.out.format("\tIDs: %d\n", result.getIds(i));
        }
      }
    }
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const text = "YOUR_PREDICTION_TEXT";
// const endpointId = "YOUR_ENDPOINT_ID";
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {instance, prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Model Service Client library
const {PredictionServiceClient} = aiplatform.v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTextEntityExtraction() {
  // Configure the endpoint resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;

  const instanceObj = new instance.TextExtractionPredictionInstance({
    content: text,
  });
  const instanceVal = instanceObj.toValue();
  const instances = [instanceVal];

  const request = {
    endpoint,
    instances,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict text entity extraction response :');
  console.log(`\tDeployed model id : ${response.deployedModelId}`);

  console.log('\nPredictions :');
  for (const predictionResultValue of response.predictions) {
    const predictionResult =
      prediction.TextExtractionPredictionResult.fromValue(
        predictionResultValue
      );

    for (const [i, label] of predictionResult.displayNames.entries()) {
      const textStartOffset = parseInt(
        predictionResult.textSegmentStartOffsets[i]
      );
      const textEndOffset = parseInt(
        predictionResult.textSegmentEndOffsets[i]
      );
      const entity = text.substring(textStartOffset, textEndOffset);
      console.log(`\tEntity: ${entity}`);
      console.log(`\tEntity type: ${label}`);
      console.log(`\tConfidences: ${predictionResult.confidences[i]}`);
      console.log(`\tIDs: ${predictionResult.ids[i]}\n\n`);
    }
  }
}
predictTextEntityExtraction();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def predict_text_entity_extraction_sample(project, location, endpoint_id, content):

    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_id)

    response = endpoint.predict(instances=[{"content": content}], parameters={})

    for prediction_ in response.predictions:
        print(prediction_)

取得批次預測

如要提出批次預測要求，您必須指定輸入來源和輸出格式，Vertex AI 會在這些位置儲存預測結果。

輸入資料規定

批次要求的輸入內容會指定要傳送至模型進行預測的項目。針對實體擷取，您可以加入內嵌文字或 Cloud Storage 值區中的文件參照。您也可以為每份文件在輸入內容中新增 key 欄位。

通常，批次預測結果會使用 instance 欄位 (包含 content 和 mimeType 欄位) 對應輸入和輸出內容。如果您在輸入內容中使用 key 欄位，批次預測輸出內容會將 instance 欄位替換為 key 欄位。舉例來說，如果輸入內容包含大型文字摘要，這項功能有助於簡化批次預測輸出內容。

以下範例顯示 JSON 列檔案，其中包含參照文件和內嵌文字片段的參照，以及是否包含 key 欄位。

{"content": "gs://sourcebucket/datasets/texts/source\_text.txt", "mimeType": "text/plain"}
{"content": "gs://bucket/sample.txt", "mimeType": "text/plain", "key": "sample-file"}
{"content": "Text snippet", "mimeType": "text/plain"}
{"content": "Sample text snippet", "mimeType": "text/plain", "key": "sample-snippet"}

要求批次預測

如要提出批次預測要求，您可以使用 Google Cloud 控制台或 Vertex AI API。視您提交的輸入項目數量而定，批次預測工作可能需要一些時間才能完成。

Google Cloud 控制台

使用 Google Cloud 控制台要求批次預測。

在 Google Cloud 控制台的 Vertex AI 專區中，前往「批次預測」頁面。

前往「批次預測」頁面
按一下「Create」，開啟「New batch prediction」視窗，然後完成下列步驟：
1. 輸入批次預測的名稱。
2. 針對「Model name」(模型名稱)，選取要用於此批次預測的模型名稱。
3. 在「Source path」中，指定 JSON Lines 輸入檔案所在的 Cloud Storage 位置。
4. 在「Destination path」中，指定批次預測結果的儲存位置。輸出格式取決於模型的目標。針對文字目標的 AutoML 模型會輸出 JSON Lines 檔案。

API

使用 Vertex AI API 傳送批次預測要求。

REST

使用任何要求資料之前，請先替換以下項目：

LOCATION_IS：模型儲存及執行批次預測工作的區域。例如：us-central1。
PROJECT_ID：您的專案 ID
BATCH_JOB_NAME：批次工作顯示名稱
MODEL_ID：模型的 ID，用於進行預測
URI：輸入 JSON Lines 檔案的 Cloud Storage URI。
BUCKET：您的 Cloud Storage 值區
PROJECT_NUMBER：系統自動產生的專案編號

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs

JSON 要求主體：

{
    "displayName": "BATCH_JOB_NAME",
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {
            "uris": ["URI"]
        }
    },
    "outputConfig": {
        "predictionsFormat": "jsonl",
        "gcsDestination": {
            "outputUriPrefix": "OUTPUT_BUCKET"
        }
    }
}

如要傳送要求，請選擇以下其中一個選項：

curl

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs"

PowerShell

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/batchPredictionJobs/BATCH_JOB_ID",
  "displayName": "BATCH_JOB_NAME",
  "model": "projects/PROJECT_NUMBER/locations/LOCATION/models/MODEL_ID",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": [
        "CONTENT"
      ]
    }
  },
  "outputConfig": {
    "predictionsFormat": "jsonl",
    "gcsDestination": {
      "outputUriPrefix": "BUCKET"
    }
  },
  "state": "JOB_STATE_PENDING",
  "completionStats": {
    "incompleteCount": "-1"
  },
  "createTime": "2022-12-19T20:33:48.906074Z",
  "updateTime": "2022-12-19T20:33:48.906074Z",
  "modelVersionId": "1"
}

您可以使用 BATCH_JOB_ID 輪詢批次工作狀態，直到工作 state 為 JOB_STATE_SUCCEEDED 為止。

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

import com.google.api.gax.rpc.ApiException;
import com.google.cloud.aiplatform.v1.BatchPredictionJob;
import com.google.cloud.aiplatform.v1.GcsDestination;
import com.google.cloud.aiplatform.v1.GcsSource;
import com.google.cloud.aiplatform.v1.JobServiceClient;
import com.google.cloud.aiplatform.v1.JobServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.ModelName;
import java.io.IOException;

public class CreateBatchPredictionJobTextEntityExtractionSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String location = "us-central1";
    String displayName = "DISPLAY_NAME";
    String modelId = "MODEL_ID";
    String gcsSourceUri = "GCS_SOURCE_URI";
    String gcsDestinationOutputUriPrefix = "GCS_DESTINATION_OUTPUT_URI_PREFIX";
    createBatchPredictionJobTextEntityExtractionSample(
        project, location, displayName, modelId, gcsSourceUri, gcsDestinationOutputUriPrefix);
  }

  static void createBatchPredictionJobTextEntityExtractionSample(
      String project,
      String location,
      String displayName,
      String modelId,
      String gcsSourceUri,
      String gcsDestinationOutputUriPrefix)
      throws IOException {
    // The AI Platform services require regional API endpoints.
    JobServiceSettings settings =
        JobServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (JobServiceClient client = JobServiceClient.create(settings)) {
      try {
        String modelName = ModelName.of(project, location, modelId).toString();
        GcsSource gcsSource = GcsSource.newBuilder().addUris(gcsSourceUri).build();
        BatchPredictionJob.InputConfig inputConfig =
            BatchPredictionJob.InputConfig.newBuilder()
                .setInstancesFormat("jsonl")
                .setGcsSource(gcsSource)
                .build();
        GcsDestination gcsDestination =
            GcsDestination.newBuilder().setOutputUriPrefix(gcsDestinationOutputUriPrefix).build();
        BatchPredictionJob.OutputConfig outputConfig =
            BatchPredictionJob.OutputConfig.newBuilder()
                .setPredictionsFormat("jsonl")
                .setGcsDestination(gcsDestination)
                .build();
        BatchPredictionJob batchPredictionJob =
            BatchPredictionJob.newBuilder()
                .setDisplayName(displayName)
                .setModel(modelName)
                .setInputConfig(inputConfig)
                .setOutputConfig(outputConfig)
                .build();
        LocationName parent = LocationName.of(project, location);
        BatchPredictionJob response = client.createBatchPredictionJob(parent, batchPredictionJob);
        System.out.format("response: %s\n", response);
        System.out.format("\tname:%s\n", response.getName());
      } catch (ApiException ex) {
        System.out.format("Exception: %s\n", ex.getLocalizedMessage());
      }
    }
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const batchPredictionDisplayName = 'YOUR_BATCH_PREDICTION_DISPLAY_NAME';
// const modelId = 'YOUR_MODEL_ID';
// const gcsSourceUri = 'YOUR_GCS_SOURCE_URI';
// const gcsDestinationOutputUriPrefix = 'YOUR_GCS_DEST_OUTPUT_URI_PREFIX';
//    eg. "gs://<your-gcs-bucket>/destination_path"
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Job Service Client library
const {JobServiceClient} = require('@google-cloud/aiplatform').v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const jobServiceClient = new JobServiceClient(clientOptions);

async function createBatchPredictionJobTextEntityExtraction() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const modelName = `projects/${project}/locations/${location}/models/${modelId}`;

  const inputConfig = {
    instancesFormat: 'jsonl',
    gcsSource: {uris: [gcsSourceUri]},
  };
  const outputConfig = {
    predictionsFormat: 'jsonl',
    gcsDestination: {outputUriPrefix: gcsDestinationOutputUriPrefix},
  };
  const batchPredictionJob = {
    displayName: batchPredictionDisplayName,
    model: modelName,
    inputConfig,
    outputConfig,
  };
  const request = {
    parent,
    batchPredictionJob,
  };

  // Create batch prediction job request
  const [response] = await jobServiceClient.createBatchPredictionJob(request);

  console.log('Create batch prediction job text entity extraction response');
  console.log(`Name : ${response.name}`);
  console.log('Raw response:');
  console.log(JSON.stringify(response, null, 2));
}
createBatchPredictionJobTextEntityExtraction();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def create_batch_prediction_job_sample(
    project: str,
    location: str,
    model_resource_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    sync: bool = True,
):
    aiplatform.init(project=project, location=location)

    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(
        job_display_name=job_display_name,
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination,
        sync=sync,
    )

    batch_prediction_job.wait()

    print(batch_prediction_job.display_name)
    print(batch_prediction_job.resource_name)
    print(batch_prediction_job.state)
    return batch_prediction_job

擷取批次預測結果

批次預測工作完成後，預測的輸出會儲存在您在要求中指定的 Cloud Storage 值區。

批次預測結果範例

以下是文字實體擷取模型的批次預測結果範例。

{
  "key": 1,
  "predictions": {
    "ids": [
      "1234567890123456789",
      "2234567890123456789",
      "3234567890123456789"
    ],
    "displayNames": [
      "SpecificDisease",
      "DiseaseClass",
      "SpecificDisease"
    ],
    "textSegmentStartOffsets":  [13, 40, 57],
    "textSegmentEndOffsets": [29, 51, 75],
    "confidences": [
      0.99959725141525269,
      0.99912621492484128,
      0.99935531616210938
    ]
  }
}

評估模型

解讀結果

從文字實體擷取模型取得預測結果 透過集合功能整理內容 你可以依據偏好儲存及分類內容。

線上預測和批次預測的差異

取得線上預測

將模型部署至端點

Google Cloud 控制台

API

建立端點

gcloud

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

Java

Node.js

Python 適用的 Vertex AI SDK

擷取端點 ID

gcloud

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

部署模型

gcloud

Linux、macOS 或 Cloud Shell

Windows (PowerShell)

Windows (cmd.exe)

流量分配

Linux、macOS 或 Cloud Shell

Windows (PowerShell)

Windows (cmd.exe)

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

Java

Node.js

Python 適用的 Vertex AI SDK

取得作業狀態

使用已部署的模型進行線上預測

主控台

API

gcloud

REST

curl

PowerShell

Java

Node.js

Python 適用的 Vertex AI SDK

取得批次預測

輸入資料規定

要求批次預測

Google Cloud 控制台

API

REST

curl

PowerShell

Java

Node.js

Python 適用的 Vertex AI SDK

擷取批次預測結果

批次預測結果範例

從文字實體擷取模型取得預測結果
透過集合功能整理內容你可以依據偏好儲存及分類內容。