Get support

Google's primary support objective is to resolve production incidents as quickly as possible. We do this by understanding your configuration, analyzing logs and metrics, and collaborating with partners.

Cloud Customer Care offers various support packages to accommodate your support needs. All Cloud Customer Care packages include support for Google Kubernetes Engine (GKE) Enterprise edition and Google Distributed Cloud. If you have an existing Cloud Customer Care support package, then you already have support for GKE Enterprise and Google Distributed Cloud.

For more information, see the Cloud Customer Care hub.

Requirements for Google Distributed Cloud support

To effectively troubleshoot business-critical incidents, you must do the following:

Support tools

To troubleshoot a Google Distributed Cloud incident, Cloud Customer Care relies on three pieces of information:

Your environment configuration

When you open a support case, provide key information about your cluster setup by running the following commands:

  • For all your cluster types, capture information about Kubernetes and your nodes by running the bmctl check cluster --snapshot command. Attach the resulting tar file to the support case.

  • For admin, hybrid, and standalone clusters, check the health status of the cluster and nodes by running the bmctl check cluster command. Attach the resulting logs to the support case. The logs should exist under the bmctl-workspace/[CLUSTER_NAME]/log/check-cluster-[TIMESTAMP] directory.

  • For user clusters, first create a health check YAML file with the cluster name and namespace, and then apply the file in the appropriate admin cluster:

    1. Create a YAML file, such as healthcheck-user1.yaml, with the following HealthCheck properties. The following is sample content for a cluster named user1 in the cluster-user1 namespace:

      apiVersion: baremetal.cluster.gke.io/v1
      kind: HealthCheck
      metadata:
        generateName: healthcheck-
        namespace: cluster-user1
      spec:
        clusterName: user1
      
    2. After you create the YAML file, apply the custom resource in the admin cluster that's managing the user cluster by using the kubectl command. The following is a sample command that uses the YAML file created in the previous step. In the sample, the ADMIN_KUBECONFIG variable specifies the path to the admin cluster's kubeconfig file:

      kubectl --kubeconfig ADMIN_KUBECONFIG create -f healthcheck-user1.yaml
      

      The command returns the following response:

      healthcheck.baremetal.cluster.gke.io/healthcheck-7c4qf created
      
    3. Wait for the health check job to finish reconciling. In the preceding example, the health check job name is healthcheck.baremetal.cluster.gke.io/healthcheck-7c4qf. The following sample kubectl command waits up to 30 minutes for the health check job to complete:

      kubectl --kubeconfig ADMIN_KUBECONFIG wait healthcheck healthcheck-7c4qf \
          -n cluster-user1 --for=condition=Reconciling=False --timeout=30m
      

      When the health check job is complete, the preceding kubectl command returns:

      healthcheck.baremetal.cluster.gke.io/healthcheck-7c4qf condition met
      

      You can see the health check job results with the following command:

      kubectl --kubeconfig ADMIN_KUBECONFIG get healthcheck healthcheck-7c4qf \
          -n cluster-user1
      

      The command returns the following result:

      NAME                PASS   AGE
      healthcheck-7c4qf   true   17m
      
    4. Gather all the logs for the health check job Pod into a local file by using the kubectl command. The following is an example that uses the previous sample health check job:

      kubectl --kubeconfig ADMIN_KUBECONFIG logs -n cluster-user1 \
          -l baremetal.cluster.gke.io/check-name=healthcheck-7c4qf --tail=-1 > \
          healthcheck-7c4qf.log
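
The user cluster steps above can be combined into a single script. The following is a minimal sketch, not an official tool. The function names (healthcheck_manifest, healthcheck_name, run_user_cluster_healthcheck) are hypothetical, and the sketch assumes that kubectl is on your PATH and that the create command prints a single line in the form shown in step 2. Instead of saving the manifest to a file, it pipes the manifest to kubectl create -f - on standard input.

```shell
# Print a HealthCheck manifest. $1: user cluster name, $2: cluster namespace.
healthcheck_manifest() (
  cat <<EOF
apiVersion: baremetal.cluster.gke.io/v1
kind: HealthCheck
metadata:
  generateName: healthcheck-
  namespace: $2
spec:
  clusterName: $1
EOF
)

# Extract the resource name from "kubectl create" output such as
# "healthcheck.baremetal.cluster.gke.io/healthcheck-7c4qf created".
healthcheck_name() (
  name="${1#*/}"                # drop everything up to the first "/"
  printf '%s\n' "${name%% *}"   # drop the trailing " created"
)

# Run steps 1-4. $1: admin kubeconfig path, $2: user cluster name,
# $3: cluster namespace. Assumes kubectl is installed (not checked here).
run_user_cluster_healthcheck() (
  admin_kubeconfig="$1"; cluster_name="$2"; namespace="$3"

  # Steps 1 and 2: create the HealthCheck resource in the admin cluster.
  created="$(healthcheck_manifest "$cluster_name" "$namespace" |
    kubectl --kubeconfig "$admin_kubeconfig" create -f -)" || exit 1
  name="$(healthcheck_name "$created")"

  # Step 3: wait up to 30 minutes, then show the result.
  kubectl --kubeconfig "$admin_kubeconfig" wait healthcheck "$name" \
    -n "$namespace" --for=condition=Reconciling=False --timeout=30m || exit 1
  kubectl --kubeconfig "$admin_kubeconfig" get healthcheck "$name" \
    -n "$namespace"

  # Step 4: gather the health check job Pod logs into a local file.
  kubectl --kubeconfig "$admin_kubeconfig" logs -n "$namespace" \
    -l "baremetal.cluster.gke.io/check-name=${name}" --tail=-1 > "${name}.log"
)
```

For the sample cluster, you would call run_user_cluster_healthcheck ADMIN_KUBECONFIG user1 cluster-user1 and then attach the resulting log file to the support case.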
      

Cluster logs

When you create a new Google Distributed Cloud cluster, Cloud Logging agents are enabled by default and scoped only to system-level components. The agents replicate system-level logs into the Google Cloud project associated with the cluster. System-level logs are from Kubernetes Pods in the following namespaces:

  • kube-system
  • gke-system
  • gke-connect
  • istio-system
  • config-management-system
  • gatekeeper-system
  • cnrm-system
  • knative-serving

You can query logs from the Logs Explorer.
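
As an illustration, a filter that covers the system namespaces listed above could be built as follows. This is a sketch under assumptions: the function name system_logs_filter is hypothetical, and it assumes that these Pod logs are stored under the k8s_container resource type with cluster_name and namespace_name labels. Verify the filter against the entries that you see in your own project.

```shell
# Build a Logging query filter for container logs from the system
# namespaces. $1: cluster name.
# Assumption: logs use the k8s_container resource type and its labels.
system_logs_filter() (
  namespaces="kube-system gke-system gke-connect istio-system \
config-management-system gatekeeper-system cnrm-system knative-serving"
  clause=""
  for ns in $namespaces; do
    if [ -z "$clause" ]; then
      clause="\"$ns\""
    else
      clause="$clause OR \"$ns\""
    fi
  done
  printf 'resource.type="k8s_container" AND resource.labels.cluster_name="%s" AND resource.labels.namespace_name=(%s)\n' \
    "$1" "$clause"
)
```

You can paste the printed filter into the Logs Explorer query pane, or pass it to the gcloud CLI, for example: gcloud logging read "$(system_logs_filter user1)" --project=PROJECT_ID --limit=20, where user1 and PROJECT_ID are placeholders.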

For more details, see Configure logging and monitoring.

Google Cloud CLI and remote cluster access

If you open a support case, Cloud Customer Care might ask you for remote read-only access to your clusters to help diagnose and resolve issues more effectively. For Cloud Customer Care to have sufficient access to troubleshoot your cluster issue remotely, ensure that you've installed and updated to the latest version of the Google Cloud CLI. The Google Cloud CLI must be at version 401.0.0 or higher to give Cloud Customer Care the needed permissions. We recommend that you update Google Cloud CLI regularly to pick up added permissions and other enhancements.

To install the latest components of the gcloud CLI, use the gcloud components update command. For more information about giving Cloud Customer Care remote read-only access to your clusters, see Remote GKE Enterprise Cluster Support.
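
To confirm that an installed gcloud CLI meets the 401.0.0 minimum before you open a case, you could use a check like the following. This is a sketch: the function names are hypothetical, and it assumes that the first line of gcloud version output has the form "Google Cloud SDK X.Y.Z".

```shell
# Return 0 if dotted version $1 is greater than or equal to $2.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    n = split(have, h, "."); m = split(want, w, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      a = h[i] + 0; b = w[i] + 0   # missing components count as 0
      if (a > b) exit 0
      if (a < b) exit 1
    }
    exit 0
  }'
}

# Print the installed gcloud CLI version.
# Assumption: the output contains a line "Google Cloud SDK X.Y.Z".
installed_gcloud_version() {
  gcloud version 2>/dev/null | awk '/^Google Cloud SDK/ { print $4; exit }'
}

# Report whether the installed version meets the documented minimum.
check_gcloud_version() {
  installed="$(installed_gcloud_version)"
  if [ -n "$installed" ] && version_at_least "$installed" 401.0.0; then
    echo "gcloud CLI $installed meets the 401.0.0 minimum"
  else
    echo "gcloud CLI is missing or older than 401.0.0; run: gcloud components update" >&2
    return 1
  fi
}
```

Running check_gcloud_version prints a confirmation, or exits with a nonzero status and a reminder to update.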

Cluster metrics

Alongside the logging agents, the Cloud Monitoring agent captures metrics and replicates system-level metrics into the Google Cloud project associated with the cluster. System-level metrics are from Kubernetes Pods that run in the same namespaces that are listed in the Cluster logs section.

For more details, see Configure logging and monitoring.

How we troubleshoot your environment

Here is an example of a typical support incident:

  1. The cluster administrator opens a support case in Google Cloud console with Cloud Customer Care. They then select Google Kubernetes Engine (GKE) Enterprise edition and Google Distributed Cloud as Category and Component, respectively. They enter the required information and attach the output of relevant bmctl commands to the case.

  2. The support case is routed to a Technical Support Engineer who specializes in Google Distributed Cloud.

  3. The support engineer examines the contents of the snapshot to gain context of the environment.

  4. The support engineer examines the logs and metrics in the Google Cloud project. They enter the support case ID as the business justification, which is logged internally.

  5. The support engineer responds to the case with an assessment and recommendation. The support engineer and the user continue troubleshooting until they come to a resolution.

What does Google support?

Generally, Cloud Customer Care supports all software components shipped as part of Google Distributed Cloud, as well as Cloud Service Mesh, Policy Controller, Config Sync, and Config Controller. The following table provides a more complete list of what is and isn't supported:

Google Cloud supported                                   | Not supported
---------------------------------------------------------+---------------------------------------------------------
Kubernetes and the container runtime                     | Customer choice of load balancer (manual load balancing)
Connect and the Connect Agent                            | Customer code (see Developer Support)
Google Cloud operations, Monitoring, Logging, and agents | Customer choice of operating system
Bundled load balancer                                    | Physical or virtual server, storage, and network
Ingress controller                                       | External DNS, DHCP, and identity systems
GKE Identity Service                                     |
Cloud Service Mesh                                       |
Policy Controller                                        |
Config Sync                                              |
Config Controller                                        |

Version support policy

Support for Google Distributed Cloud follows the GKE Enterprise Support Policy. Google supports each Google Distributed Cloud minor version for the later of:

  • 12 months after the initial release of the minor version.
  • The release of the third subsequent minor version.

For the most recent minor release dates for GKE Enterprise and the earliest end-of-life dates, see GKE Enterprise support period.
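
The "later of" rule can be worked through with a small calculation. The following sketch is illustrative only: the function name support_end is hypothetical, the dates in the usage note are made up rather than real release dates, and it assumes ISO dates (YYYY-MM-DD) and ignores the February 29 edge case.

```shell
# Print the end-of-support date for a minor version.
# $1: release date of the minor version (YYYY-MM-DD)
# $2: release date of the third subsequent minor version (YYYY-MM-DD)
support_end() (
  release="$1"; third="$2"
  year="${release%%-*}"    # YYYY
  rest="${release#*-}"     # MM-DD
  twelve_months="$((year + 1))-${rest}"   # 12 months after the release
  # ISO dates sort chronologically as plain text, so the last line
  # of the sorted pair is the later date.
  printf '%s\n%s\n' "$twelve_months" "$third" | sort | tail -n 1
)
```

For example, with made-up dates, support_end 2024-04-15 2025-02-10 prints 2025-04-15, because the 12-month mark falls after the third subsequent release. support_end 2024-04-15 2025-08-01 prints 2025-08-01, because the third subsequent release is the later date.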

For lists of the supported and unsupported Google Distributed Cloud versions, see Versioning.

For version information that's related to cluster upgrades, see Version rules.

Shared responsibility model

Running a business-critical production application on Google Distributed Cloud requires multiple parties to carry different responsibilities. The following sections list the roles and responsibilities of each party, although the lists aren't exhaustive.

Google responsibilities

  • Maintenance and distribution of the Google Distributed Cloud software package.
  • Notifying users of available upgrades for Google Distributed Cloud, and producing upgrade scripts for the previous version.

    Google Distributed Cloud supports sequential cluster upgrades only (example: 1.30 → 1.31 → 1.32, but not 1.30 → 1.32). When you upgrade node pools, in some cases, you can skip a minor version. For more information about upgrade logic, see Version rules.

  • Operating the Connect and Cloud Operations services.

  • Troubleshooting, providing workarounds, and correcting the root cause of any issues related to Google-provided components.
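
To plan a cluster upgrade under the sequential-only rule described above, you can list every minor version between the current and target versions. This is a sketch: the function name upgrade_path is hypothetical, it accepts MAJOR.MINOR versions only, and it doesn't model the node pool cases where a minor version can be skipped.

```shell
# Print the sequence of minor versions to step through, in order.
# $1: current version, $2: target version, both as MAJOR.MINOR (e.g. 1.30).
upgrade_path() (
  major="${1%%.*}"
  from_minor="${1#*.}"
  to_minor="${2#*.}"
  # Only same-major, forward upgrades are handled in this sketch.
  if [ "${2%%.*}" != "$major" ] || [ "$to_minor" -lt "$from_minor" ]; then
    echo "unsupported upgrade: $1 to $2" >&2
    exit 1
  fi
  path="$1"
  i=$((from_minor + 1))
  while [ "$i" -le "$to_minor" ]; do
    path="$path ${major}.${i}"   # one minor version per step
    i=$((i + 1))
  done
  printf '%s\n' "$path"
)
```

For example, upgrade_path 1.30 1.32 prints 1.30 1.31 1.32, which matches the sequence shown in the example above.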

User responsibilities

  • Overall system administration for on-premises clusters.
  • Maintaining any application workload deployed on the cluster.
  • Running, maintaining, and patching the data center infrastructure, including networking, servers, operating system, storage, and connectivity to Google Cloud.
  • Running, maintaining, and patching network load balancers if the manual load balancing option is chosen.
  • Upgrading Google Distributed Cloud versions regularly.
  • Monitoring of the cluster and applications, and responding to any incidents.
  • Ensuring Cloud Operations agents are deployed to clusters.
  • Providing Google with environmental details for troubleshooting purposes.

Developer support

Google doesn't provide support specifically for your application workloads. However, we do provide best-effort developer support to ensure your developers can run applications on Google Distributed Cloud. We believe that engaging earlier during development can prevent critical incidents later in the deployment.

This best-effort developer support is available to customers with any paid support package and is treated as a P3 priority for an issue blocking a launch, or a P4 priority for general consultation. In this classification, priority level 0 is the highest priority.