FLEDGE has been renamed to Protected Audience API. To learn more about the name change, see the blog post.
Authors:
Daniel Kocoj, Google Privacy Sandbox
This document proposes a cloud architecture for Protected Audience Bidding and Auction services on Google Cloud Platform (GCP). The goal of this document is to enable server operators to gain familiarity with the methods of and requirements for running Bidding and Auction services in GCP. This document will be updated as more features are added and the system evolves.
To learn more about Protected Audience services and Bidding and Auction services, read the following:
- Protected Audience Services Overview
- Bidding and Auction Services High Level Design and API
- Bidding and Auction Services System Design
To compare the GCP implementation to the AWS implementation, refer to the Bidding and Auction services AWS cloud support and deployment guide.
The Protected Audience Bidding and Auction services implementations on GCP and AWS are functionally similar. One key difference is that while AWS uses an internal load balancer to route traffic to the Bidding and Auction services, GCP uses a service mesh. Another key difference is that the GCP packaging pipeline creates a Docker image, while the AWS packaging pipeline creates an Amazon Machine Image.
The Protected Audience Bidding and Auction services will be open-sourced in June 2023. In addition to the code for each service, we will open source Terraform configurations and scripts that allow ad techs to easily deploy the services proposed in this document.
The seller operates a SellerFrontEnd service and an Auction service. These services are responsible for orchestrating the auction and communicate with a seller-operated Key/Value service and partner-operated BuyerFrontEnd services. Learn more about how these services work.
A trusted execution environment (TEE) provides a level of assurance for data integrity, data confidentiality, and code integrity using hardware-backed techniques for increased security guarantees for code execution and data protection. The TEE-based SellerFrontEnd service receives its requests from a Seller Ad service. These requests can be either HTTP or gRPC. See the Envoy component to learn how HTTP requests are translated to gRPC. The SellerFrontEnd then sends network requests to all configured BuyerFrontEnd services and Key/Value services. The following diagram provides an overall view of the system.
Figure 1. Seller GCP Architecture

A buy-side ad tech is responsible for operating a BuyerFrontEnd service, a Bidding service, and a Key/Value service. A request begins with a gRPC message from a SellerFrontEnd service. The major subsequent steps include fetching data from the buyer-operated Key/Value service and generating bids for the ads that are present in the request. Learn more about the demand-side platform system. The following diagram provides an overall view of the system.
Figure 2. Buyer GCP Architecture

The Bidding and Auction services are regional services, where a ‘cloud region' refers to a particular geographical location as defined by GCP. We'll open source Terraform configurations to support deployment in multiple regions in different geographies. Ad techs can deploy services in any region supported by GCP. The Terraform configurations include parameters that an ad tech can update before deploying to a different cloud region. Note that even though the services are regional, the Terraform configurations show how to make use of global load balancers and service meshes to achieve multi-region coverage.
Identity and Access Management (IAM) is used to securely control who has access to your GCP resources. You can use IAM to create and manage users, groups, and permissions. You can also use IAM to audit access to your resources.
The default Terraform configuration requires a service account for each GCE instance. The service account is bound to an IAM role, which the GCE instance inherits.
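For illustration, a minimal gcloud sketch of this kind of binding follows; the project ID, service account name, and role are hypothetical placeholders, and in practice the provided Terraform configuration creates and binds the account for you.

```bash
# Hypothetical names; the Terraform configuration normally creates these resources.
gcloud iam service-accounts create bidding-auction-runtime \
    --project=my-adtech-project \
    --display-name="Bidding and Auction runtime service account"

# Bind a role to the service account so that GCE instances running as this
# account inherit its permissions (the role here is only an example).
gcloud projects add-iam-policy-binding my-adtech-project \
    --member="serviceAccount:bidding-auction-runtime@my-adtech-project.iam.gserviceaccount.com" \
    --role="roles/logging.logWriter"
```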
A virtual private cloud (VPC) is an isolated resource within a public cloud. Sellers and buyers should start with 1 global VPC. A VPC is critical for security purposes and provides configurable firewall rules (which can be found in the Terraform modules).
Each buyer and seller should have 1 private subnet per region. By hosting most components in the private subnet, the service has extra protection from internet traffic. However, the buyer and seller's load balancers must face the public internet. Like VPCs, subnets rely on firewall rules for network security.
Firewall rules are used to control which types of traffic flow to which ports in the VPC. Key security group rules allow for egress and ingress traffic to flow from the load balancers to GCE Instances, and GCE instances to send network requests to external services.
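As a rough sketch of such a rule (not the exact rules shipped in the Terraform modules), the following gcloud command allows ingress on port 443 from Google's load balancer and health check ranges to instances carrying a hypothetical network tag:

```bash
# Hypothetical project, network, and tag names; the actual rules live in the
# Terraform modules. 130.211.0.0/22 and 35.191.0.0/16 are Google's load
# balancer / health check source ranges.
gcloud compute firewall-rules create allow-lb-to-frontends \
    --project=my-adtech-project \
    --network=my-vpc \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:443 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=bidding-auction
```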
Each region uses a managed instance group to host its GCE instances. The managed instance group uses an instance template to create or destroy instances based on health checks and autoscaling requirements. Autoscaling is based on CPU utilization and is configurable in the terraform variables, including specifying a minimum and maximum number of instances per instance group. Health checks are configured such that managed instance groups will detect mal-performing VM instances and auto-heal.
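The gcloud equivalent of such an autoscaling policy looks roughly like the sketch below; the group name, region, and thresholds are placeholders, and the Terraform variables are the intended way to set the real values.

```bash
# Hypothetical managed instance group name, region, and thresholds; in
# practice the Terraform variables configure these.
gcloud compute instance-groups managed set-autoscaling sfe-mig-us-central1 \
    --project=my-adtech-project \
    --region=us-central1 \
    --min-num-replicas=1 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.8
```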
A GCP Backend Service is a global resource consisting of all regional managed instance groups for a particular service. Backend services work in tandem with load balancers and are responsible for routing a request to the correct instance.
The buyer and seller frontend services both use a Global External Load Balancer.
The SellerFrontEnd load balancer accepts both HTTP and gRPC traffic over TLS, while the BuyerFrontEnd load balancer accepts gRPC traffic only. Both services' load balancers accept internet traffic. Load balancers terminate TLS, then start a new TLS session with the GCE instance provided by the load balancer's GCP Backend Service. Note that root certificates for the Load Balancer and Backend Service are not verified.
The SellerFrontEnd and BuyerFrontEnd both communicate with the Auction and Bidding services, respectively, via the Traffic Director gRPC proxyless service mesh. This bypasses any need for an internal load balancer and saves on dedicated load balancer costs. Specifically, the SellerFrontEnd and BuyerFrontEnd use gRPC's built-in xDS capabilities to query the Traffic Director control plane service and find routes to available backend services. Each front end relies on the Traffic Director to distribute its requests to the appropriate backend service, based on utilization and regional availability. While the service mesh itself is a global resource, it automatically routes requests based on region. Because the requests are internal to the VPC and subnet, they are sent as plaintext gRPC.
The GCE Host is also known as an ‘instance.' The instance runs in a Confidential Computing Space, which is created via a Docker image containing the service binary and startup scripts. The instance's open ports are identical to those exposed by the Docker image and are the only ports on which incoming connections are supported; connections initiated from inside the Confidential Space workload support both ingress and egress. Individual GCE instances belong to a Managed Instance Group.
In order to support incoming connections, the minimum Confidential Space image version is `230600`, under the `confidential-space-images` project and in the `confidential-space` and `confidential-space-debug` families.
Inside the seller's front-end service GCE Host Confidential Space, we provide an instance of the open source Envoy proxy. This is solely used to convert HTTP traffic to gRPC that the SellerFrontEnd service can consume. Envoy terminates TLS and then forwards the plaintext request to the seller's port. The envoy configuration is included in the TEE and can only be modified in a limited way by the operator (such as by providing TLS keys and certificates).
Cloud Domains and Cloud DNS are supported for use with global external load balancers. The load balancers can automatically use Google-managed TLS certificates and don't require operators to provide regional domains. The Terraform configurations require a GCP Domain, DNS Zone, and certificate resource ID.
By default, ad techs are free to use any instance type that supports Confidential Compute (N2D or C2D) and that meets the ad tech's performance requirements. A recommended starter instance for functionality is `n2d-highcpu-128`. Take note of the regional availability limitations for Confidential Computing.
The GCE host uses the `tee-container-log-redirect` metadata variable to redirect all stdout and stderr output to Cloud Logging in both production and debugging environments. This allows service operators to use Logs Explorer to view logs across all of their services. The production Bidding and Auction service binaries will be built with `VLOG=0`, so only limited information will be logged compared to debugging binaries (which can be built with any `VLOG` level).
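Logs redirected this way can also be read from the command line with a sketch like the following; the filter and project ID are placeholders:

```bash
# Hypothetical project ID and filter; reads recent GCE instance logs so you
# can inspect service output redirected by tee-container-log-redirect.
gcloud logging read 'resource.type="gce_instance"' \
    --project=my-adtech-project \
    --limit=50 \
    --format=json
```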
GCP Secret Manager is a fully managed service that makes it easy for you to store and retrieve configuration data (including secrets, such as TLS information, and runtime flags, such as ports) in a central location. The default Terraform configurations store the servers' runtime flags in Secret Manager and fetch the flags on server startup. An ad tech can modify the secret and restart the server for the new flag value to take effect.
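As a sketch of that workflow (the secret name and value below are hypothetical; the Terraform configuration defines the actual runtime-flag secrets):

```bash
# Add a new version of a runtime-flag secret (hypothetical name and value).
echo -n "50051" | gcloud secrets versions add sfe-port-flag --data-file=-

# Confirm the latest value; restart the server for the change to take effect.
gcloud secrets versions access latest --secret=sfe-port-flag
```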
Google Cloud Storage (GCS, a cloud object storage service) buckets are used to store the ad tech's proprietary code modules that are required for Bidding and Auction services. The Bidding and Auction services communicate with GCS via a Private Service Connect endpoint to fetch the code modules. Each bucket must be configured to allow READ access to the GCE host IAM role. Due to relaxed code module fetch latency requirements, the ad tech can host its code in a GCS bucket in a single region if desired.
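Uploading a code module is a standard GCS object upload; the bucket and file names below are hypothetical examples:

```bash
# Hypothetical bucket and module names; upload the code module so the
# services can fetch it through the Private Service Connect endpoint.
gcloud storage cp generate_bid.js gs://my-adtech-code-modules/
```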
We will provide Terraform configurations for the Bidding and Auction services. Terraform is used to describe the cloud resource composition (via Infrastructure as Code) that is required for a fully functional bidding and auction system, and the provided Terraform configurations can be modified by the ad tech without limitation. In fact, the Bidding and Auction services are configured via Terraform, so the ad tech is expected to interact with Terraform throughout the deployment process.
This section documents the packaging and deployment process for the Bidding and Auction services. The goal is for technical users to gain an understanding of how to deploy a functioning cloud environment with the service(s) of their choice. After the Bidding and Auction services code is open sourced, much more detail will be added to this section so that it can serve as a complete how-to guide.
In order to create a functioning service in GCP, there are two major steps:
- Packaging: The creation of the Docker image containing the service's code.
- Deployment: Running Terraform to bring up the individual cloud components, including the image from step 1.
Use a Linux-based operating system to follow these instructions. Other systems have not been tested.
- Install `git`.
- Download the source code from the GitHub repository.
- Run `git submodule update --init`.
This command and all suggested commands in this document should be run from the project root directory.
- Install Docker, which is required to:
  - Build the code. NOTE: the code relies on the Bazel build system, which is included in the default Docker images used by the build scripts.
  - Build the production images with attestable hashes.
  - Run tests and binaries locally.
To verify that Docker is installed and runs, try building the code using one of the tools installed by Docker. Running `builders/tools/bazel-debian info workspace` should print the location of the Bazel workspace. Make sure that `bazel info workspace` and `bazel-debian info workspace` have different outputs; otherwise, your local builds will conflict with the packaging toolchain.
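For example, the comparison described above can be run from the project root as follows:

```bash
# Should print the path of the containerized Bazel workspace.
builders/tools/bazel-debian info workspace

# Should print a different path; if the two match, local builds will
# conflict with the packaging toolchain.
bazel info workspace
```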
- Create a symbolic link at `/opt/bin/python3` that points to your `python3` installation.
- Install gcloud and initialize your environment (see the example commands after this list).
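A sketch of these last two steps is shown below; the python3 path is whatever `which python3` resolves to on your machine, and the gcloud commands are the standard initialization flow:

```bash
# Symlink /opt/bin/python3 to the python3 on your PATH.
sudo mkdir -p /opt/bin
sudo ln -sf "$(which python3)" /opt/bin/python3

# Initialize the gcloud CLI and set up credentials.
gcloud init
gcloud auth login
gcloud auth application-default login
```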
After installing the prerequisites, you should be able to test the server. To run the server locally:
- Run `builders/tools/bazel-debian build` to build each server (see the example commands after this list).
- Start the server with the artifact returned by Bazel.
- Test the server following these steps.
- Optional: Run the built binary with the `--helpfull` flag to inspect the required flags.
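A sketch of this flow is shown below; the Bazel target path and binary location are assumptions and may differ from the actual targets in the released code:

```bash
# Build a server (target path is an assumption; adjust to the real target).
builders/tools/bazel-debian build //services/bidding_service:server

# Inspect the flags the binary expects before starting it.
bazel-bin/services/bidding_service/server --helpfull
```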
Startup Scripts
The scripts in `tools/debug/start_*` contain example startup commands and can be run directly. See the README for more detail.
- Create a billable project. This may belong to an organization but can also be standalone. All work will proceed within a single project.
- Register a domain name via Cloud Domains. Both seller and buyer services require a dedicated domain.
- In your project, create an Artifact Registry repository, then authenticate your account (see the example commands after this list). You will build and upload the Docker images used for testing.
- Install Terraform and follow the remainder of the project setup using Terraform.
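For the Artifact Registry step, the commands look roughly like the sketch below; the repository name, region, and project ID are hypothetical:

```bash
# Create a Docker-format Artifact Registry repository (hypothetical names).
gcloud artifacts repositories create bidding-auction-repo \
    --project=my-adtech-project \
    --repository-format=docker \
    --location=us-central1

# Allow your local Docker client to push to the repository's registry host.
gcloud auth configure-docker us-central1-docker.pkg.dev
```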
The file `config.bzl` presents a flag for non_prod (non-attestable) builds, `non_prod_build`. You may modify the value of the `GLOG_v` key to increase your log level for more verbose logs.
To build a seller front end service, you may want to modify the `envoy.yaml` configuration file to expose whichever ports you need via the `socket_address` fields. The `gRPC_cluster` port must match the port passed via the `<Service>_PORT` flag.
Buy-side ad techs only need to deploy the BuyerFrontEnd service and the Bidding service in production, while sell-side ad techs only need to deploy the SellerFrontEnd service and the Auction service. However, when testing, ad techs may want to deploy all of the servers to better understand the message flows and structures.
To deploy to GCP for testing, we suggest building a Docker image for each service. A script to do so can be found at:

```
production/packaging/build_and_test_all_in_docker
```
This script takes flags to specify which service and which region to build. For example:
```bash
production/packaging/build_and_test_all_in_docker --service-path <SERVICE_NAME>_service --instance gcp --platform gcp --gcp-image-tag <DEPLOYMENT ENVIRONMENT> --gcp-image-repo <REGION>-docker.pkg.dev/<PROJECT_ID>/<REPO_NAME> --build-flavor <prod (for attestation) or non_prod (for debug logging)> --no-tests --no-precommit
```
Note:
- Switch `prod` to `non_prod` for a debugging build that turns on all vlog.
- `<DEPLOYMENT ENVIRONMENT>` must match `environment` in the Terraform deployment (see Step 2).
The script uploads the Docker image for the service (configured via the `service-path` flag), tagged with the `gcp-image-tag` flag, to the Artifact Registry repository provided by the `gcp-image-repo` flag. The GCE managed instance group template Terraform resources then take an image path as input, which you can provide as a string of the following format: `<gcp-image-repo>/<service-path>:<gcp-image-tag>`.
Install Terraform, following these instructions.
The Terraform configuration is split across three main folders within the `production/deploy/gcp/terraform` directory:
```
.
├── environment
│   └── demo
├── modules
│   ├── buyer
│   └── seller
└── services
    ├── autoscaling
    ├── load_balancing
    ├── networking
    └── security
```
The `services/` directory contains all of the individual components of a full stack: networking, load balancing, and so on.
The `modules/` directory contains the seller and buyer modules, which compose the objects found in `services/` and apply defaults. Take note of the variable descriptions and defaults.
The `environment/` directory contains example setups of sellers and buyers. Subdirectories of the `environment` directory (such as `setup_1`) are where you should run `terraform apply`. As an ad tech, this is where you write (or reuse) `.tf` files. Review `setup_1/multi-region.tf` as an example. This file contains all of the ad tech-specific details such as runtime flags, region, and domain addresses. The Terraform variable descriptions in the buyer and seller `service_vars.tf` files contain the complete details of each variable.
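A typical invocation from an environment subdirectory looks like the following; the directory name is illustrative and should be replaced with the setup you created or copied:

```bash
cd production/deploy/gcp/terraform/environment/demo
terraform init
terraform plan
terraform apply
```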
For recommended configurations, please see here.
Terraform variables are split into two major categories:
- Those for the seller (definitions and defaults found in `production/deploy/gcp/terraform/modules/seller/service_vars.tf`).
- Those for the buyer (definitions and defaults found in `production/deploy/gcp/terraform/modules/buyer/service_vars.tf`).
The seller module brings up a SellerFrontEnd and Auction service, while the buyer module brings up a BuyerFrontEnd and Bidding service. You can have multiple buyers for every seller.
After modifying the provided implementations to your desired parameters (including updating all defaults in `modules/*/service_vars.tf`), run the following in your desired environment or setup directory to bring the entire system up:

```bash
terraform apply
```
You can see the output of each GCE instance via the serial logging console.
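For example, the serial output of a single instance can be fetched with a command like this (instance name, zone, and project are hypothetical):

```bash
gcloud compute instances get-serial-port-output my-sfe-instance-abcd \
    --project=my-adtech-project \
    --zone=us-central1-a
```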
Ad techs must use a GCS bucket to host proprietary code modules. The bucket name is required by the Terraform configuration so that a bucket and Private Service Connect Endpoint can be created. Bidding and Auction services automatically fetch updates from the bucket, but it is the ad tech's responsibility to upload their code modules to the bucket. Note that to upload to the bucket, the ad tech must modify the bucket permissions to allow their own proprietary endpoints WRITE access. This is most easily done through IAM permissions. See the GCP GCS permission guide for details. The Terraform configuration allows the VPC's instances READ access to the bucket by default.
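Granting WRITE access to your own principals can be done with a bucket-level IAM binding along these lines; the bucket name, member, and role below are hypothetical examples:

```bash
# Allow a deployer account to upload code modules to the bucket.
gcloud storage buckets add-iam-policy-binding gs://my-adtech-code-modules \
    --member="user:deployer@my-adtech.example" \
    --role="roles/storage.objectAdmin"
```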
Instead of using a bucket, during alpha and early beta testing server operators may specify an arbitrary code module endpoint to fetch (via an HTTPS call) in the Terraform configuration. Only a single code module is supported. Later in beta testing, ad techs will be able to host multiple different code modules in a single bucket and specify the module to use at the individual request level.
Please see the secure_invoke README. This tool is bundled with the Bidding and Auction services.
Use grpcurl to send a gRPC request to the load balancer address you configured in the Terraform. Requests must be addressed to port 443 so that the load balancer can terminate the TLS connection. When testing locally running services, disable the `TLS_INGRESS` flags to bypass TLS requirements.
Note: if providing a `sample_request.json`, keep in mind that the `SelectAdRequest` will still require a `protected_audience_ciphertext` (see `secure_invoke` in Option 1 for instructions on how to generate a ciphertext payload). Additionally, grpcurl will not be able to decrypt the `AuctionResult` ciphertext.
| Task | Command |
| --- | --- |
| Local service: List gRPC endpoints | `grpcurl -plaintext localhost:<PORT> list` |
| Local service: Send query | `grpcurl -plaintext -d '@' localhost:<PORT> privacy_sandbox.bidding_auction_servers.<SERVICE>/<METHOD> < sample_request.json` |
| GCP service: List gRPC endpoints | `grpcurl dns:///<DOMAIN>:443 list` |
| GCP service: Send query | `grpcurl -d '@' dns:///<DOMAIN>:443 privacy_sandbox.bidding_auction_servers.<SERVICE>/<METHOD> < sample_request.json` |