從文字實體擷取模型取得預測結果

本頁面說明如何使用 Google Cloud 控制台或 Vertex AI API,從文字實體擷取模型取得線上 (即時) 預測和批次預測

線上預測和批次預測的差異

「線上預測」是對模型端點發出的同步要求。如要依據應用程式輸入內容發出要求,或是需要及時進行推論,您可以選用「線上預測」模式。

「批次預測」為非同步要求。您可以直接從模型資源要求批次預測,而不需要將模型部署至端點。如果是文字資料,如果您不需要立即取得回應,並想透過單一要求處理累積的資料,就適合使用批次預測功能。

取得線上預測

將模型部署至端點

您必須先將模型部署至端點,才能使用模型進行線上預測。部署模型時,系統會將實體資源與模型建立關聯,讓模型以低延遲的方式提供線上預測結果。

您可以將多個模型部署至同一個端點,也可以將模型部署至多個端點。如要進一步瞭解部署模型的選項和用途,請參閱「關於部署模型」。

請使用下列其中一種方法部署模型:

Google Cloud 控制台

  1. 在 Google Cloud 控制台的 Vertex AI 專區中,前往「Models」頁面。

    前往「Models」(模型) 頁面

  2. 按一下要部署的模型名稱,開啟模型詳細資料頁面。

  3. 選取「Deploy & Test」分頁標籤。

    如果模型已部署至任何端點,這些端點會列在「Deploy your model」部分。

  4. 按一下「Deploy to endpoint」

  5. 如要將模型部署至新端點,請選取 「Create new endpoint」(建立新端點),然後為新端點提供名稱。如要將模型部署至現有端點,請選取 「Add to existing endpoint」,然後從下拉式清單中選取端點。

    您可以在端點中加入多個模型,也可以在多個端點中加入模型。瞭解詳情

  6. 如果您將模型部署至已部署一或多個模型的現有端點,則必須更新您要部署的模型和已部署模型的流量拆分百分比,讓所有百分比加總為 100%。

  7. 選取「AutoML Text」,然後按照下列步驟進行設定:

    1. 如果您要將模型部署至新端點,請接受「流量分配」為 100。否則,請調整端點上所有模型的流量拆分值,使其相加結果為 100。

    2. 按一下模型的「完成」,然後在所有流量分配百分比正確無誤後,按一下「繼續」

      系統會顯示模型部署的區域。這個地區必須是您建立模型的地區。

    3. 按一下「Deploy」,將模型部署至端點。

API

使用 Vertex AI API 部署模型時,您必須完成下列步驟:

  1. 視需要建立端點。
  2. 取得端點 ID。
  3. 將模型部署至端點。

建立端點

如果您要將模型部署至現有端點,可以略過這個步驟。

gcloud

以下範例使用 gcloud ai endpoints create 指令

gcloud ai endpoints create \
  --region=LOCATION \
  --display-name=ENDPOINT_NAME

更改下列內容:

  • LOCATION_ID:您使用 Vertex AI 的區域。
  • ENDPOINT_NAME:端點的顯示名稱。

Google Cloud CLI 工具可能需要幾秒鐘的時間才能建立端點。

REST

使用任何要求資料之前,請先替換以下項目:

  • LOCATION_ID:您的區域。
  • PROJECT_ID:您的專案 ID
  • ENDPOINT_NAME:端點的顯示名稱。

HTTP 方法和網址:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

JSON 要求主體:

{
  "display_name": "ENDPOINT_NAME"
}

如要傳送要求,請展開以下其中一個選項:

您應該會收到如下的 JSON 回應:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}
"done": true

Java

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK,請參閱「安裝 Python 適用的 Vertex AI SDK」。 詳情請參閱 Vertex AI SDK for Python API 參考說明文件

def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

擷取端點 ID

您需要端點 ID 才能部署模型。

gcloud

以下範例使用 gcloud ai endpoints list 指令

gcloud ai endpoints list \
  --region=LOCATION \
  --filter=display_name=ENDPOINT_NAME

更改下列內容:

  • LOCATION_ID:您使用 Vertex AI 的區��。
  • ENDPOINT_NAME:端點的顯示名稱。

請注意「ENDPOINT_ID」欄中的數字。在下一個步驟中使用這個 ID。

REST

使用任何要求資料之前,請先替換以下項目:

  • LOCATION_ID:您使用 Vertex AI 的區域。
  • PROJECT_ID:您的專案 ID
  • ENDPOINT_NAME:端點的顯示名稱。

HTTP 方法和網址:

GET https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME

如要傳送要求,請展開以下其中一個選項:

您應該會收到如下的 JSON 回應:

{
  "endpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID",
      "displayName": "ENDPOINT_NAME",
      "etag": "AMEw9yPz5pf4PwBHbRWOGh0PcAxUdjbdX2Jm3QO_amguy3DbZGP5Oi_YUKRywIE-BtLx",
      "createTime": "2020-04-17T18:31:11.585169Z",
      "updateTime": "2020-04-17T18:35:08.568959Z"
    }
  ]
}
ENDPOINT_ID

部署模型

請選取下方對應您語言或環境的分頁:

gcloud

以下範例使用 gcloud ai endpoints deploy-model 指令

以下範例會將 Model 部署至 Endpoint,但不會在多個 DeployedModel 資源之間分割流量:

使用下列任何指令資料之前,請先替換以下項目:

  • ENDPOINT_ID:端點的 ID。
  • LOCATION_ID:您使用 Vertex AI 的區域。
  • MODEL_ID:要部署的模型 ID。
  • DEPLOYED_MODEL_NAMEDeployedModel 的名稱。您也可以使用 Model 的顯示名稱來命名 DeployedModel
  • MIN_REPLICA_COUNT:此部署作業的節點數量下限。節點數量可視預測負載需求增加或減少,但不得超過節點數量上限,也不能少於這個數量。
  • MAX_REPLICA_COUNT:此部署作業的節點數量上限。節點數量可視預測負載需求增加或減少,但不得超過這個數量,也不能少於最小節點數量。如果省略 --max-replica-count 標記,節點數量上限會設為 --min-replica-count 的值。

執行 gcloud ai endpoints deploy-model 指令:

Linux、macOS 或 Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --traffic-split=0=100

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --traffic-split=0=100

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --traffic-split=0=100
 

流量分配

上述範例中的 --traffic-split=0=100 標記會將 Endpoint 收到的預測流量 100% 傳送至新的 DeployedModel,並以臨時 ID 0 表示。如果您的 Endpoint 已包含其他 DeployedModel 資源,您可以將流量分配給新 DeployedModel 和舊 DeployedModel。例如,如要將 20% 的流量傳送至新的 DeployedModel,並將 80% 的流量傳送至較舊的 DeployedModel,請執行下列指令。

使用下列任何指令資料之前,請先替換以下項目:

  • OLD_DEPLOYED_MODEL_ID:現有 DeployedModel 的 ID。

執行 gcloud ai endpoints deploy-model 指令:

Linux、macOS 或 Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80
 

REST

部署模型。

使用任何要求資料之前,請先替換以下項目:

  • LOCATION_ID:您使用 Vertex AI 的區域。
  • PROJECT_ID:您的專案 ID
  • ENDPOINT_ID:端點的 ID。
  • MODEL_ID:要部署的模型 ID。
  • DEPLOYED_MODEL_NAMEDeployedModel 的名稱。您也可以使用 Model 的顯示名稱來命名 DeployedModel
  • TRAFFIC_SPLIT_THIS_MODEL:傳送至此端點的預測流量百分比,會路由至透過此作業部署的模型。預設值為 100。所有流量百分比的總和必須為 100。進一步瞭解流量分配
  • DEPLOYED_MODEL_ID_N:選用。如果其他模型已部署至這個端點,您必須更新其流量分配百分比,讓所有百分比加總為 100。
  • TRAFFIC_SPLIT_MODEL_N:已部署模型 ID 鍵的流量分配百分比值。
  • PROJECT_NUMBER:系統自動產生的專案編號

HTTP 方法和網址:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel

JSON 要求主體:

{
  "deployedModel": {
    "model": "projects/PROJECT_ID/locations/us-central1/models/MODEL_ID",
    "displayName": "DEPLOYED_MODEL_NAME",
    "automaticResources": {
     }
  },
  "trafficSplit": {
    "0": TRAFFIC_SPLIT_THIS_MODEL,
    "DEPLOYED_MODEL_ID_1": TRAFFIC_SPLIT_MODEL_1,
    "DEPLOYED_MODEL_ID_2": TRAFFIC_SPLIT_MODEL_2
  },
}

如要傳送要求,請展開以下其中一個選項:

您應該會收到如下的 JSON 回應:

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    }
  }
}

Java

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.api.gax.longrunning.OperationFuture;
import com.google.api.gax.longrunning.OperationTimedPollAlgorithm;
import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.aiplatform.v1.AutomaticResources;
import com.google.cloud.aiplatform.v1.DedicatedResources;
import com.google.cloud.aiplatform.v1.DeployModelOperationMetadata;
import com.google.cloud.aiplatform.v1.DeployModelResponse;
import com.google.cloud.aiplatform.v1.DeployedModel;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.ModelName;
import com.google.cloud.aiplatform.v1.stub.EndpointServiceStubSettings;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.threeten.bp.Duration;

public class DeployModelSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String deployedModelDisplayName = "YOUR_DEPLOYED_MODEL_DISPLAY_NAME";
    String endpointId = "YOUR_ENDPOINT_NAME";
    String modelId = "YOUR_MODEL_ID";
    int timeout = 900;
    deployModelSample(project, deployedModelDisplayName, endpointId, modelId, timeout);
  }

  static void deployModelSample(
      String project,
      String deployedModelDisplayName,
      String endpointId,
      String modelId,
      int timeout)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {

    // Set long-running operations (LROs) timeout
    final OperationTimedPollAlgorithm operationTimedPollAlgorithm =
        OperationTimedPollAlgorithm.create(
            RetrySettings.newBuilder()
                .setInitialRetryDelay(Duration.ofMillis(5000L))
                .setRetryDelayMultiplier(1.5)
                .setMaxRetryDelay(Duration.ofMillis(45000L))
                .setInitialRpcTimeout(Duration.ZERO)
                .setRpcTimeoutMultiplier(1.0)
                .setMaxRpcTimeout(Duration.ZERO)
                .setTotalTimeout(Duration.ofSeconds(timeout))
                .build());

    EndpointServiceStubSettings.Builder endpointServiceStubSettingsBuilder =
        EndpointServiceStubSettings.newBuilder();
    endpointServiceStubSettingsBuilder
        .deployModelOperationSettings()
        .setPollingAlgorithm(operationTimedPollAlgorithm);
    EndpointServiceStubSettings endpointStubSettings = endpointServiceStubSettingsBuilder.build();
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.create(endpointStubSettings);
    endpointServiceSettings =
        endpointServiceSettings.toBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);
      // key '0' assigns traffic for the newly deployed model
      // Traffic percentage values must add up to 100
      // Leave dictionary empty if endpoint should not accept any traffic
      Map<String, Integer> trafficSplit = new HashMap<>();
      trafficSplit.put("0", 100);
      ModelName modelName = ModelName.of(project, location, modelId);
      AutomaticResources automaticResourcesInput =
          AutomaticResources.newBuilder().setMinReplicaCount(1).setMaxReplicaCount(1).build();
      DeployedModel deployedModelInput =
          DeployedModel.newBuilder()
              .setModel(modelName.toString())
              .setDisplayName(deployedModelDisplayName)
              .setAutomaticResources(automaticResourcesInput)
              .build();

      OperationFuture<DeployModelResponse, DeployModelOperationMetadata> deployModelResponseFuture =
          endpointServiceClient.deployModelAsync(endpointName, deployedModelInput, trafficSplit);
      System.out.format(
          "Operation name: %s\n", deployModelResponseFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      DeployModelResponse deployModelResponse = deployModelResponseFuture.get(20, TimeUnit.MINUTES);

      System.out.println("Deploy Model Response");
      DeployedModel deployedModel = deployModelResponse.getDeployedModel();
      System.out.println("\tDeployed Model");
      System.out.format("\t\tid: %s\n", deployedModel.getId());
      System.out.format("\t\tmodel: %s\n", deployedModel.getModel());
      System.out.format("\t\tDisplay Name: %s\n", deployedModel.getDisplayName());
      System.out.format("\t\tCreate Time: %s\n", deployedModel.getCreateTime());

      DedicatedResources dedicatedResources = deployedModel.getDedicatedResources();
      System.out.println("\t\tDedicated Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", dedicatedResources.getMinReplicaCount());

      MachineSpec machineSpec = dedicatedResources.getMachineSpec();
      System.out.println("\t\t\tMachine Spec");
      System.out.format("\t\t\t\tMachine Type: %s\n", machineSpec.getMachineType());
      System.out.format("\t\t\t\tAccelerator Type: %s\n", machineSpec.getAcceleratorType());
      System.out.format("\t\t\t\tAccelerator Count: %s\n", machineSpec.getAcceleratorCount());

      AutomaticResources automaticResources = deployedModel.getAutomaticResources();
      System.out.println("\t\tAutomatic Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", automaticResources.getMinReplicaCount());
      System.out.format("\t\t\tMax Replica Count: %s\n", automaticResources.getMaxReplicaCount());
    }
  }
}

Node.js

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const modelId = "YOUR_MODEL_ID";
// const endpointId = 'YOUR_ENDPOINT_ID';
// const deployedModelDisplayName = 'YOUR_DEPLOYED_MODEL_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

const modelName = `projects/${project}/locations/${location}/models/${modelId}`;
const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint:
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function deployModel() {
  // Configure the parent resource
  // key '0' assigns traffic for the newly deployed model
  // Traffic percentage values must add up to 100
  // Leave dictionary empty if endpoint should not accept any traffic
  const trafficSplit = {0: 100};
  const deployedModel = {
    // format: 'projects/{project}/locations/{location}/models/{model}'
    model: modelName,
    displayName: deployedModelDisplayName,
    automaticResources: {minReplicaCount: 1, maxReplicaCount: 1},
  };
  const request = {
    endpoint,
    deployedModel,
    trafficSplit,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.deployModel(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Deploy model response');
  const modelDeployed = result.deployedModel;
  console.log('\tDeployed model');
  if (!modelDeployed) {
    console.log('\t\tId : {}');
    console.log('\t\tModel : {}');
    console.log('\t\tDisplay name : {}');
    console.log('\t\tCreate time : {}');

    console.log('\t\tDedicated resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMachine spec {}');
    console.log('\t\t\t\tMachine type : {}');
    console.log('\t\t\t\tAccelerator type : {}');
    console.log('\t\t\t\tAccelerator count : {}');

    console.log('\t\tAutomatic resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMax replica count : {}');
  } else {
    console.log(`\t\tId : ${modelDeployed.id}`);
    console.log(`\t\tModel : ${modelDeployed.model}`);
    console.log(`\t\tDisplay name : ${modelDeployed.displayName}`);
    console.log(`\t\tCreate time : ${modelDeployed.createTime}`);

    const dedicatedResources = modelDeployed.dedicatedResources;
    console.log('\t\tDedicated resources');
    if (!dedicatedResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMachine spec {}');
      console.log('\t\t\t\tMachine type : {}');
      console.log('\t\t\t\tAccelerator type : {}');
      console.log('\t\t\t\tAccelerator count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${dedicatedResources.minReplicaCount}`
      );
      const machineSpec = dedicatedResources.machineSpec;
      console.log('\t\t\tMachine spec');
      console.log(`\t\t\t\tMachine type : ${machineSpec.machineType}`);
      console.log(
        `\t\t\t\tAccelerator type : ${machineSpec.acceleratorType}`
      );
      console.log(
        `\t\t\t\tAccelerator count : ${machineSpec.acceleratorCount}`
      );
    }

    const automaticResources = modelDeployed.automaticResources;
    console.log('\t\tAutomatic resources');
    if (!automaticResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMax replica count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${automaticResources.minReplicaCount}`
      );
      console.log(
        `\t\t\tMax replica count : \
          ${automaticResources.maxReplicaCount}`
      );
    }
  }
}
deployModel();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK,請參閱「安裝 Python 適用的 Vertex AI SDK」。 詳情請參閱 Vertex AI SDK for Python API 參考說明文件

def deploy_model_with_automatic_resources_sample(
    project,
    location,
    model_name: str,
    endpoint: Optional[aiplatform.Endpoint] = None,
    deployed_model_display_name: Optional[str] = None,
    traffic_percentage: Optional[int] = 0,
    traffic_split: Optional[Dict[str, int]] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
):
    """
    model_name: A fully-qualified model resource name or model ID.
          Example: "projects/123/locations/us-central1/models/456" or
          "456" when project and location are initialized or passed.
    """

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)

    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=deployed_model_display_name,
        traffic_percentage=traffic_percentage,
        traffic_split=traffic_split,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
        metadata=metadata,
        sync=sync,
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

取得作業狀態

部分要求會啟動需要時間才能完成的長時間作業。這些要求會傳回作業名稱,您可以使用該名稱查看作業狀態或取消作業。Vertex AI 提供輔助方法,可針對長時間執行的作業進行呼叫。詳情請參閱「處理長時間執行作業」。

使用已部署的模型進行線上預測

如要進行線上預測,請將一或多個測試項目提交至模型進行分析,模型會根據模型的目標傳回結果。如要進一步瞭解預測結果,請參閱「解讀結果」頁面。

主控台

請使用 Google Cloud 控制台要求線上預測。模型必須部署至端點。

  1. 在 Google Cloud 控制台的 Vertex AI 專區中,前往「Models」頁面。

    前往「Models」(模型) 頁面

  2. 在模型清單中,按一下要要求預測結果的模型名稱。

  3. 選取「Deploy & test」分頁標籤。

  4. 在「Test your model」部分下方,新增測試項目來要求預測。

    針對文字目標的 AutoML 模型,您必須在文字欄位中輸入內容,然後按一下「預測」

    如要瞭解本機地圖特徵的重要性,請參閱「取得說明」。

    預測完成後,Vertex AI 會在控制台中傳回結果。

API

使用 Vertex AI API 要求線上預測。模型必須部署至端點。

gcloud

  1. 建立名為 request.json 的檔案,並在當中加入下列內容:

    {
      "instances": [{
        "mimeType": "text/plain",
        "content": "CONTENT"
      }]
    }
    

    更改下列內容:

    • CONTENT:用於預測的文字片段。
  2. 執行下列指令:

    gcloud ai endpoints predict ENDPOINT_ID \
      --region=LOCATION_ID \
      --json-request=request.json

    更改下列內容:

    • ENDPOINT_ID:端點的 ID。
    • LOCATION_ID:您使用 Vertex AI 的區域。

REST

使用任何要求資料之前,請先替換以下項目:

  • LOCATION_ID:端點所在的區域。例如:us-central1
  • PROJECT_ID:您的專案 ID
  • ENDPOINT_ID:端點 ID
  • CONTENT:用於預測的文字片段。
  • DEPLOYED_MODEL_ID:用於預測的已部署模型 ID。

HTTP 方法和網址:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

JSON 要求主體:

{
  "instances": [{
    "mimeType": "text/plain",
    "content": "CONTENT"
  }]
}

如要傳送要求,請選擇以下其中一個選項:

curl

將要求主體儲存在名為 request.json 的檔案中,然後執行下列指令:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

將要求主體儲存在名為 request.json 的檔案中,然後執行下列指令:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應:

{
  "predictions": {
    "ids": [
      "1234567890123456789",
      "2234567890123456789",
      "3234567890123456789"
    ],
    "displayNames": [
      "SpecificDisease",
      "DiseaseClass",
      "SpecificDisease"
    ],
    "textSegmentStartOffsets":  [13, 40, 57],
    "textSegmentEndOffsets": [29, 51, 75],
    "confidences": [
      0.99959725141525269,
      0.99912621492484128,
      0.99935531616210938
    ]
  },
  "deployedModelId": "0123456789012345678"
}

Java

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.instance.TextExtractionPredictionInstance;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.TextExtractionPredictionResult;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictTextEntityExtractionSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String content = "YOUR_TEXT_CONTENT";
    String endpointId = "YOUR_ENDPOINT_ID";

    predictTextEntityExtraction(project, content, endpointId);
  }

  static void predictTextEntityExtraction(String project, String content, String endpointId)
      throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      String jsonString = "{\"content\": \"" + content + "\"}";

      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      TextExtractionPredictionInstance instance =
          TextExtractionPredictionInstance.newBuilder().setContent(content).build();

      List<Value> instances = new ArrayList<>();
      instances.add(ValueConverter.toValue(instance));

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, ValueConverter.EMPTY_VALUE);
      System.out.println("Predict Text Entity Extraction Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());

      System.out.println("Predictions");
      for (Value prediction : predictResponse.getPredictionsList()) {
        TextExtractionPredictionResult.Builder resultBuilder =
            TextExtractionPredictionResult.newBuilder();

        TextExtractionPredictionResult result =
            (TextExtractionPredictionResult) ValueConverter.fromValue(resultBuilder, prediction);

        for (int i = 0; i < result.getIdsCount(); i++) {
          long textStartOffset = result.getTextSegmentStartOffsets(i);
          long textEndOffset = result.getTextSegmentEndOffsets(i);
          String entity = content.substring((int) textStartOffset, (int) textEndOffset);

          System.out.format("\tEntity: %s\n", entity);
          System.out.format("\tEntity type: %s\n", result.getDisplayNames(i));
          System.out.format("\tConfidences: %f\n", result.getConfidences(i));
          System.out.format("\tIDs: %d\n", result.getIds(i));
        }
      }
    }
  }
}

Node.js

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const text = "YOUR_PREDICTION_TEXT";
// const endpointId = "YOUR_ENDPOINT_ID";
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {instance, prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Model Service Client library
const {PredictionServiceClient} = aiplatform.v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTextEntityExtraction() {
  // Configure the endpoint resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;

  const instanceObj = new instance.TextExtractionPredictionInstance({
    content: text,
  });
  const instanceVal = instanceObj.toValue();
  const instances = [instanceVal];

  const request = {
    endpoint,
    instances,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict text entity extraction response :');
  console.log(`\tDeployed model id : ${response.deployedModelId}`);

  console.log('\nPredictions :');
  for (const predictionResultValue of response.predictions) {
    const predictionResult =
      prediction.TextExtractionPredictionResult.fromValue(
        predictionResultValue
      );

    for (const [i, label] of predictionResult.displayNames.entries()) {
      const textStartOffset = parseInt(
        predictionResult.textSegmentStartOffsets[i]
      );
      const textEndOffset = parseInt(
        predictionResult.textSegmentEndOffsets[i]
      );
      const entity = text.substring(textStartOffset, textEndOffset);
      console.log(`\tEntity: ${entity}`);
      console.log(`\tEntity type: ${label}`);
      console.log(`\tConfidences: ${predictionResult.confidences[i]}`);
      console.log(`\tIDs: ${predictionResult.ids[i]}\n\n`);
    }
  }
}
predictTextEntityExtraction();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK,請參閱「安裝 Python 適用的 Vertex AI SDK」。 詳情請參閱 Vertex AI SDK for Python API 參考說明文件

def predict_text_entity_extraction_sample(project, location, endpoint_id, content):

    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_id)

    response = endpoint.predict(instances=[{"content": content}], parameters={})

    for prediction_ in response.predictions:
        print(prediction_)

取得批次預測

如要提出批次預測要求,您必須指定 輸入來源輸出格式,Vertex AI 會在這些位置儲存預測結果。

輸入資料規定

批次要求的輸入內容會指定要傳送至模型進行預測的項目。針對實體擷取,您可以加入內嵌文字或 Cloud Storage 值區中的文件參照。您也可以為每份文件在輸入內容中新增 key 欄位。

通常,批次預測結果會使用 instance 欄位 (包含 contentmimeType 欄位) 對應輸入和輸出內容。如果您在輸入內容中使用 key 欄位,批次預測輸出內容會將 instance 欄位替換為 key 欄位。舉例來說,如果輸入內容包含大型文字摘要,這項功能有助於簡化批次預測輸出內容。

以下範例顯示 JSON 列檔案,其中包含參照文件和內嵌文字片段的參照,以及是否包含 key 欄位。

{"content": "gs://sourcebucket/datasets/texts/source\_text.txt", "mimeType": "text/plain"}
{"content": "gs://bucket/sample.txt", "mimeType": "text/plain", "key": "sample-file"}
{"content": "Text snippet", "mimeType": "text/plain"}
{"content": "Sample text snippet", "mimeType": "text/plain", "key": "sample-snippet"}

要求批次預測

如要提出批次預測要求,您可以使用 Google Cloud 控制台或 Vertex AI API。視您提交的輸入項目數量而定,批次預測工作可能需要一些時間才能完成。

Google Cloud 控制台

使用 Google Cloud 控制台要求批次預測。

  1. 在 Google Cloud 控制台的 Vertex AI 專區中,前往「批次預測」頁面。

    前往「批次預測」頁面

  2. 按一下「Create」,開啟「New batch prediction」視窗,然後完成下列步驟:

    1. 輸入批次預測的名稱。
    2. 針對「Model name」(模型名稱),選取要用於此批次預測的模型名稱。
    3. 在「Source path」中,指定 JSON Lines 輸入檔案所在的 Cloud Storage 位置。
    4. 在「Destination path」中,指定批次預測結果的儲存位置。輸出格式取決於模型的目標。針對文字目標的 AutoML 模型會輸出 JSON Lines 檔案。

API

使用 Vertex AI API 傳送批次預測要求。

REST

使用任何要求資料之前,請先替換以下項目:

  • LOCATION_IS:模型儲存及執行批次預測工作的區域。例如:us-central1
  • PROJECT_ID:您的專案 ID
  • BATCH_JOB_NAME:批次工作顯示名稱
  • MODEL_ID:模型的 ID,用於進行預測
  • URI:輸入 JSON Lines 檔案的 Cloud Storage URI。
  • BUCKET:您的 Cloud Storage 值區
  • PROJECT_NUMBER:系統自動產生的專案編號

HTTP 方法和網址:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs

JSON 要求主體:

{
    "displayName": "BATCH_JOB_NAME",
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {
            "uris": ["URI"]
        }
    },
    "outputConfig": {
        "predictionsFormat": "jsonl",
        "gcsDestination": {
            "outputUriPrefix": "OUTPUT_BUCKET"
        }
    }
}

如要傳送要求,請選擇以下其中一個選項:

curl

將要求主體儲存在名為 request.json 的檔案中,然後執行下列指令:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs"

PowerShell

將要求主體儲存在名為 request.json 的檔案中,然後執行下列指令:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/batchPredictionJobs/BATCH_JOB_ID",
  "displayName": "BATCH_JOB_NAME",
  "model": "projects/PROJECT_NUMBER/locations/LOCATION/models/MODEL_ID",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": [
        "CONTENT"
      ]
    }
  },
  "outputConfig": {
    "predictionsFormat": "jsonl",
    "gcsDestination": {
      "outputUriPrefix": "BUCKET"
    }
  },
  "state": "JOB_STATE_PENDING",
  "completionStats": {
    "incompleteCount": "-1"
  },
  "createTime": "2022-12-19T20:33:48.906074Z",
  "updateTime": "2022-12-19T20:33:48.906074Z",
  "modelVersionId": "1"
}

您可以使用 BATCH_JOB_ID 輪詢批次工作狀態,直到工作 stateJOB_STATE_SUCCEEDED 為止。

Java

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

import com.google.api.gax.rpc.ApiException;
import com.google.cloud.aiplatform.v1.BatchPredictionJob;
import com.google.cloud.aiplatform.v1.GcsDestination;
import com.google.cloud.aiplatform.v1.GcsSource;
import com.google.cloud.aiplatform.v1.JobServiceClient;
import com.google.cloud.aiplatform.v1.JobServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.ModelName;
import java.io.IOException;

public class CreateBatchPredictionJobTextEntityExtractionSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String location = "us-central1";
    String displayName = "DISPLAY_NAME";
    String modelId = "MODEL_ID";
    String gcsSourceUri = "GCS_SOURCE_URI";
    String gcsDestinationOutputUriPrefix = "GCS_DESTINATION_OUTPUT_URI_PREFIX";
    createBatchPredictionJobTextEntityExtractionSample(
        project, location, displayName, modelId, gcsSourceUri, gcsDestinationOutputUriPrefix);
  }

  static void createBatchPredictionJobTextEntityExtractionSample(
      String project,
      String location,
      String displayName,
      String modelId,
      String gcsSourceUri,
      String gcsDestinationOutputUriPrefix)
      throws IOException {
    // The AI Platform services require regional API endpoints.
    JobServiceSettings settings =
        JobServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (JobServiceClient client = JobServiceClient.create(settings)) {
      try {
        String modelName = ModelName.of(project, location, modelId).toString();
        GcsSource gcsSource = GcsSource.newBuilder().addUris(gcsSourceUri).build();
        BatchPredictionJob.InputConfig inputConfig =
            BatchPredictionJob.InputConfig.newBuilder()
                .setInstancesFormat("jsonl")
                .setGcsSource(gcsSource)
                .build();
        GcsDestination gcsDestination =
            GcsDestination.newBuilder().setOutputUriPrefix(gcsDestinationOutputUriPrefix).build();
        BatchPredictionJob.OutputConfig outputConfig =
            BatchPredictionJob.OutputConfig.newBuilder()
                .setPredictionsFormat("jsonl")
                .setGcsDestination(gcsDestination)
                .build();
        BatchPredictionJob batchPredictionJob =
            BatchPredictionJob.newBuilder()
                .setDisplayName(displayName)
                .setModel(modelName)
                .setInputConfig(inputConfig)
                .setOutputConfig(outputConfig)
                .build();
        LocationName parent = LocationName.of(project, location);
        BatchPredictionJob response = client.createBatchPredictionJob(parent, batchPredictionJob);
        System.out.format("response: %s\n", response);
        System.out.format("\tname:%s\n", response.getName());
      } catch (ApiException ex) {
        System.out.format("Exception: %s\n", ex.getLocalizedMessage());
      }
    }
  }
}

Node.js

在試用這個範例之前,請先按照 Vertex AI 快速入門:使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件

如要向 Vertex AI 進行驗證,請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const batchPredictionDisplayName = 'YOUR_BATCH_PREDICTION_DISPLAY_NAME';
// const modelId = 'YOUR_MODEL_ID';
// const gcsSourceUri = 'YOUR_GCS_SOURCE_URI';
// const gcsDestinationOutputUriPrefix = 'YOUR_GCS_DEST_OUTPUT_URI_PREFIX';
//    eg. "gs://<your-gcs-bucket>/destination_path"
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Job Service Client library
const {JobServiceClient} = require('@google-cloud/aiplatform').v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const jobServiceClient = new JobServiceClient(clientOptions);

async function createBatchPredictionJobTextEntityExtraction() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const modelName = `projects/${project}/locations/${location}/models/${modelId}`;

  const inputConfig = {
    instancesFormat: 'jsonl',
    gcsSource: {uris: [gcsSourceUri]},
  };
  const outputConfig = {
    predictionsFormat: 'jsonl',
    gcsDestination: {outputUriPrefix: gcsDestinationOutputUriPrefix},
  };
  const batchPredictionJob = {
    displayName: batchPredictionDisplayName,
    model: modelName,
    inputConfig,
    outputConfig,
  };
  const request = {
    parent,
    batchPredictionJob,
  };

  // Create batch prediction job request
  const [response] = await jobServiceClient.createBatchPredictionJob(request);

  console.log('Create batch prediction job text entity extraction response');
  console.log(`Name : ${response.name}`);
  console.log('Raw response:');
  console.log(JSON.stringify(response, null, 2));
}
createBatchPredictionJobTextEntityExtraction();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK,請參閱「安裝 Python 適用的 Vertex AI SDK」。 詳情請參閱 Vertex AI SDK for Python API 參考說明文件

def create_batch_prediction_job_sample(
    project: str,
    location: str,
    model_resource_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    sync: bool = True,
):
    aiplatform.init(project=project, location=location)

    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(
        job_display_name=job_display_name,
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination,
        sync=sync,
    )

    batch_prediction_job.wait()

    print(batch_prediction_job.display_name)
    print(batch_prediction_job.resource_name)
    print(batch_prediction_job.state)
    return batch_prediction_job

擷取批次預測結果

批次預測工作完成後,預測的輸出會儲存在您在要求中指定的 Cloud Storage 值區。

批次預測結果範例

以下是文字實體擷取模型的批次預測結果範例。

{
  "key": 1,
  "predictions": {
    "ids": [
      "1234567890123456789",
      "2234567890123456789",
      "3234567890123456789"
    ],
    "displayNames": [
      "SpecificDisease",
      "DiseaseClass",
      "SpecificDisease"
    ],
    "textSegmentStartOffsets":  [13, 40, 57],
    "textSegmentEndOffsets": [29, 51, 75],
    "confidences": [
      0.99959725141525269,
      0.99912621492484128,
      0.99935531616210938
    ]
  }
}