Google Cloud Platform (GCP) Knowledge

Kip Landergren

(Updated: )

My Google Cloud Platform knowledge base explaining how to make use of its cloud-based compute, storage, and networking resources.

Contents

Overview

Google Cloud Platform (GCP) is Google’s cloud computing platform, making available compute, storage, and networking resources along with a host of managed offerings built on top of them.

The individual datacenters powering GCP are abstracted underneath the concept of geographical regions and, within regions, zones. Applications may be deployed to multiple regions and zones to mitigate failure risks.

The two main interfaces to GCP are the Cloud Console web interface and the gcloud command-line tool.

Client applications are organized into “Projects” that may employ multiple GCP products to attain functionality. Each project may define a set of users that in turn are assigned to roles in accordance with a role-based access control policy.

Core Idea

Bundle Google’s compute, storage, and network infrastructure into products available for rent and use by external customers. Secure organizations and projects through role-based access control.

Key Concepts

Role-Based Access Control (RBAC)

An overview is available in the RBAC knowledge document.

Identity and Access Management (IAM)

IAM encompasses the suite of tools and policies GCP makes available to authorize who can take action on which specific cloud resources. Control and auditing are built-in.

Principals are the identity entities that interface with GCP and operate—directly or indirectly—on behalf of users. They may be user accounts—tied 1:1 with real people—or service accounts which are typically created for specialized, automated tasks involving a restricted permission set.

Access—meaning the permission granted by GCP to a principal—to perform these tasks is managed through the binding of roles to specific principals and resources. Resources are organized hierarchically within an organization. The organization is at the top level and itself a resource, followed by its projects (also resources), and within projects individual APIs and compute resources like VMs and IPs. Permissions cascade downward, meaning that if you are an “Admin” at the organizational level, you are also an “Admin” for every project.
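The cascade can be seen in where a binding is attached: a role bound at the organization level applies to every project beneath it, while a project-level binding is scoped to that project alone. A sketch with gcloud (the organization ID, project ID, and members are hypothetical):

```shell
# Bind a role at the organization level; it cascades to every project
# in the organization. "123456789012" stands in for a real org ID.
gcloud organizations add-iam-policy-binding 123456789012 \
  --member="user:admin@example.com" \
  --role="roles/resourcemanager.organizationAdmin"

# Bind a role at the project level; it applies only to resources
# within this project. "website-project" is a hypothetical project ID.
gcloud projects add-iam-policy-binding website-project \
  --member="user:jdoe@example.com" \
  --role="roles/viewer"
```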

For example:

The “Acme” organization has a single project “Website”. John Doe is an employee who has a user account jdoe@example.com. The organization administrator creates a role binding granting John a DNS administration role on the project.

This grants John the use of the cloud console web interface to create and manage DNS zones. It also allows him to use gcloud locally with the same permission structure to manage DNS resources.
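As a sketch, such a binding could be created with gcloud, assuming a project ID of website-project (hypothetical) and GCP's predefined DNS Administrator role:

```shell
# Bind the predefined DNS Administrator role to John's user account
# at the project level. Project ID is illustrative.
gcloud projects add-iam-policy-binding website-project \
  --member="user:jdoe@example.com" \
  --role="roles/dns.admin"
```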

Now let’s imagine he wants to run a VM under project “Website” that executes code which needs DNS read access for the project’s zones. John’s user account has permission to accomplish this task, and he could try to configure the code to run as his user. But this would be convoluted, would distribute his credentials beyond his local machine, and would not follow the principle of least privilege.

Rather than go down that path, he and the administrator decide to use a service account with a more granular permission set: read-only access to the project’s DNS resources.

This ensures that the code has the minimum access it needs to accomplish its goal and John’s user credentials are not shared widely.
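A sketch of that setup with gcloud, assuming the predefined read-only DNS role (roles/dns.reader); the project, account, and VM names are hypothetical:

```shell
# Create a dedicated service account for the workload.
gcloud iam service-accounts create dns-reader \
  --display-name="DNS reader for Website"

# Grant it read-only DNS access at the project level.
gcloud projects add-iam-policy-binding website-project \
  --member="serviceAccount:dns-reader@website-project.iam.gserviceaccount.com" \
  --role="roles/dns.reader"

# Attach the service account at VM creation so code on the instance
# authenticates as it via the metadata server, not as John's user.
gcloud compute instances create website-vm \
  --zone=us-west1-a \
  --service-account="dns-reader@website-project.iam.gserviceaccount.com" \
  --scopes="https://www.googleapis.com/auth/cloud-platform"
```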

Products

Google Compute Engine (GCE)

Virtual Machines (VMs)

VMs are run by a Google-managed hypervisor. Creating a VM involves specifying its virtual hardware composition: CPU, memory, storage, and networking.

CPU and memory are straightforward to reason about: you select a machine type suited to your workload and the hypervisor presents the corresponding virtual CPUs (vCPUs) and memory to the guest.

Storage is more involved because GCP presents several distinct products, ranging from network-attached block storage to physically attached local solid state drives (SSDs), with many options in between. Review the latest offerings to understand the right tradeoffs for your workload.
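As a sketch, a VM creation specifying machine type (vCPUs and memory) and a boot disk might look like the following; the name, zone, sizes, and image family are illustrative:

```shell
# Create a VM: machine type covers vCPU/memory, the boot-disk flags
# cover storage. Values here are examples, not recommendations.
gcloud compute instances create example-vm \
  --zone=us-west1-a \
  --machine-type=e2-medium \
  --image-family=debian-12 \
  --image-project=debian-cloud \
  --boot-disk-size=20GB \
  --boot-disk-type=pd-balanced
```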

A VM’s instance configuration is established at creation. Some changes—e.g. machine type, disk attachments, and others—can be applied without recreating the VM (but often only while the VM is stopped). Certain changes—especially those related to the initial configuration—require that the VM be recreated. (I ran into this with the expiration of Microsoft’s Secure Boot certificate). These latter changes are independent of what is going on at the guest OS level.

So what exactly does “recreate” mean? It means provisioning a new VM to pick up the necessary configuration changes (on Google’s side, not on the guest OS’s side). Stopping and starting will likely not be sufficient.

Note: I believe you could provision a VM with Debian 12, manually upgrade it to Debian 13, and still see Debian 12 reflected in the VM metadata (e.g. via gcloud or the cloud console). This is because the metadata refers to the machine image used to create the VM, not its current state.

The lifecycle operations of a VM involve:

suspend: memory written to a persistent disk and execution paused (e.g. closing your laptop lid)
stop: memory cleared and persistent disk preserved (e.g. a shutdown)
reset: an immediate power cycle without a clean shutdown
delete: removal of the VM; data persistence depends on configuration
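The lifecycle operations above map to gcloud subcommands; the VM name and zone here are illustrative:

```shell
# Lifecycle operations, with resume/start as the counterparts to
# suspend/stop.
gcloud compute instances suspend example-vm --zone=us-west1-a
gcloud compute instances resume  example-vm --zone=us-west1-a
gcloud compute instances stop    example-vm --zone=us-west1-a
gcloud compute instances start   example-vm --zone=us-west1-a
gcloud compute instances reset   example-vm --zone=us-west1-a
gcloud compute instances delete  example-vm --zone=us-west1-a
```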

Tip: Consider whether application data should be stored on a separate volume attached to the VM (does decoupling from the guest OS make sense?). Treat VMs as replaceable and ask yourself: “how am I going to get my data off this VM when I need to rebuild it?”.

VM Instances
Instance Templates

Storage

Disks
Snapshots

Snapshots operate at the disk level, not the VM level. For VM-level snapshots, look at Machine Images.

Virtual Private Cloud (VPC) Network

A new project is initialized with a default VPC network named default. This is an auto mode VPC: the subnets are automatically created and managed by Google. It is private to the project, exists as a global resource, and is automatically populated with one /20 subnet in every region from within the 10.128.0.0/9 range. This is in contrast to a custom mode VPC where you are responsible for the subnet management.

To be clear: default is “global” in the sense that the network itself is not regional, but the IP space that makes it usable is partitioned into subnets which are regional resources. So e.g. a VM created in us-west1-a attaches to the subnet in the us-west1 region and its internal IP—ephemeral by default—is pulled from that subnet’s CIDR range.
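You can see this partitioning directly by listing the auto-created subnets of the default network:

```shell
# List the default VPC's subnets; each row is a regional resource
# with its own /20 CIDR range carved from 10.128.0.0/9.
gcloud compute networks subnets list --network=default
```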

Static internal IP addresses are regional resources within the specified VPC and subnet. The reserved address may be assigned to a resource—across any zone—within the subnet’s region. It cannot be assigned to a different region’s resources.

A private Cloud DNS managed zone, authorized for the VPC, can hold an A record mapping a hostname to the reserved address allowing internal services to refer to it by name.
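A sketch of both steps with gcloud; the address, region, zone name, and hostname are hypothetical, and the managed zone is assumed to already exist and be authorized for the VPC:

```shell
# Reserve a static internal address in a specific region and subnet.
gcloud compute addresses create internal-svc-ip \
  --region=us-west1 \
  --subnet=default \
  --addresses=10.138.0.50

# Point an A record in the private managed zone at the reserved IP.
gcloud dns record-sets create svc.internal.example. \
  --zone=internal-zone \
  --type=A \
  --ttl=300 \
  --rrdatas=10.138.0.50
```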

Resources

forwarding-rules

Means of directing traffic matching an IP address to some other target, like a load balancer. Can be externally accessible or internal-only. Allows you to maintain an IP address as underlying resources change.

target-pools

A group of instances that receive traffic from forwarding rules.

If a forwarding-rule is used to point to a target-pool, the instance chosen is based on a hash of the source and destination. More detail is available in the GCP load balancing documentation.
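A sketch of wiring the two together with gcloud; the names, region, and zone are illustrative:

```shell
# Create a target pool and add an existing instance to it.
gcloud compute target-pools create web-pool --region=us-west1

gcloud compute target-pools add-instances web-pool \
  --instances=example-vm \
  --instances-zone=us-west1-a

# Create a forwarding rule directing port 80 traffic to the pool; the
# rule's IP stays stable as pool membership changes underneath it.
gcloud compute forwarding-rules create web-rule \
  --region=us-west1 \
  --ports=80 \
  --target-pool=web-pool
```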

addresses

IP addresses that may be ephemeral or reserved. Service characteristics change based on network tier (PREMIUM or STANDARD) and whether designated as regional or global.

Network Services

Cloud DNS

Private managed zones may be created for one or more VPC networks (and then authorized so their records are resolvable within those networks). By creating A records pointing to reserved IPs within managed zones, internal resources can use stable, meaningful hostnames rather than raw IP addresses or the verbose auto-generated DNS names based on the provisioned VM hostname.
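A sketch of creating such a zone, authorized for the default VPC; the zone name and DNS name are hypothetical:

```shell
# Create a private managed zone resolvable only from the listed VPC
# network(s).
gcloud dns managed-zones create internal-zone \
  --dns-name="internal.example." \
  --description="Private zone for internal services" \
  --visibility=private \
  --networks=default
```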

Google Container Registry (GCR)

Overview

Google Container Registry (GCR) is Google’s Docker registry offering, built on top of Google Cloud Storage. Images are scoped within a GCP project, stored in a Cloud Storage bucket, and may have configurable access levels.

Additional features including deployment capabilities and vulnerability scanning are available.

Core Idea

Provide a Docker registry fully integrated into the GCP ecosystem.

Configuration

gcloud is bundled with a credential helper to configure docker for communication with gcr.io as an image registry. More info on this in the GCR Reference.
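A sketch of the setup and a push; the image and project names are hypothetical:

```shell
# Register gcloud's credential helper so docker can authenticate
# against gcr.io.
gcloud auth configure-docker

# Tag a local image into the project's registry namespace and push it.
docker tag my-image gcr.io/website-project/my-image:latest
docker push gcr.io/website-project/my-image:latest
```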

The underlying buckets where images are stored may be configured with role-based access controls or bucket-level permissions, via the Cloud Storage console or gcloud.

Google Kubernetes Engine (GKE)

Overview

Google Kubernetes Engine (GKE) is Google’s managed Kubernetes offering built on top of Google Cloud Platform (GCP). Multiple versions of Kubernetes customized for GKE deployment are available, with customizations targeted mainly at enterprise workloads requiring an improved SLA or better reliability guarantees. Additional configuration options, like HTTP Load Balancing via Google Cloud Load Balancer, incorporate the use of other Google Cloud products.

Notably, the control plane is entirely managed by Google, removing the need for configuration and maintenance of an HA master node setup.

Core Idea

Provide the ability to easily provision, manage, and scale a Kubernetes cluster built on top of Google Cloud Platform.