The following information applies only to in-VPC deployments of Unstructured Enterprise. For dedicated instance deployments of Unstructured Enterprise, contact your Unstructured sales representative,
or email Unstructured Sales at sales@unstructured.io.
For in-VPC deployments, choose one of the following options:
- Do it all for me: Have Unstructured set up the required infrastructure in your GCP account and then deploy the Unstructured UI and API into that newly created infrastructure.
- Bring my own infrastructure: Set up the required infrastructure yourself in your GCP account, and then have Unstructured deploy the Unstructured UI and API into your existing infrastructure.
Questions? Need help?
If you have questions or need help as you go, contact your Unstructured sales representative or technical enablement contact. If you do not know who they are, email Unstructured Sales at sales@unstructured.io, and a member of the Unstructured sales or technical enablement teams will get back to you as soon as possible.
Do it all for me
If you want Unstructured to set up the required infrastructure for you in your GCP account and then deploy the Unstructured UI and API into that newly created infrastructure, provide your Unstructured sales representative or technical enablement contact with the access credentials for an IAM user or service account in your GCP account that has the following required permissions:
Core networking permissions
- VPC/subnet management:
  - compute.networks.create
  - compute.subnetworks.create
  - compute.routers.create (for Cloud NAT)
  - compute.addresses.create (for NAT IPs)
  - compute.firewalls.create (for intra-cluster traffic rules)
- Host and service projects:
  - compute.organizations.admin (for the host project)
  - compute.networks.use (for the service project)
GKE cluster permissions
- Control plane:
  - container.clusters.create
  - container.clusters.update (for private cluster settings)
  - compute.networks.useExternalIp (for public endpoint access)
- Node pool:
  - compute.instances.create
  - compute.disks.create (for node disks)
  - compute.instanceGroups.create (for autoscaling)
- Service account roles:
  - For the GKE cluster service account: roles/container.hostServiceAgentUser
  - For the node service account: roles/container.nodeServiceAccount
  - For the workload identity service account: roles/iam.workloadIdentityUser
Storage and database
- GCS buckets:
  - storage.buckets.create
  - storage.objects.create (for versioning)
  - storage.buckets.update (for encryption/lifecycle rules)
- Cloud SQL (Postgres):
  - cloudsql.instances.create
  - cloudsql.instances.connect (for private IPs)
  - vpcaccess.connectors.use (if using Serverless VPC Access)
- Persistent disks:
  - compute.disks.create (for pd.csi.storage.gke.io)
  - compute.subnetworks.use (for regional disks)
Advanced configurations
- Workload identity:
  - iam.serviceAccounts.getAccessToken (for federated access)
  - iam.serviceAccounts.setIamPolicy (to bind Kubernetes SAs to GCP SAs)
- Cloud NAT:
  - compute.routers.update (for NAT configuration)
  - compute.addresses.use (for NAT IP allocation)
- SSH access:
  - compute.projects.setCommonInstanceMetadata (for SSH key upload)
  - compute.instances.osAdminLogin
Minimum required roles
- Project level:
  - roles/editor (broad access, or scope with custom roles)
- Service level:
  - roles/compute.networkAdmin (for VPC and subnets)
  - roles/container.admin (for GKE)
  - roles/storage.admin (for GCS)
  - roles/cloudsql.admin (for Postgres)
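As one possible way to prepare such an identity, the following sketch creates a service account and grants it the four narrower roles listed above (roles/compute.networkAdmin, roles/container.admin, roles/storage.admin, and roles/cloudsql.admin). The project ID and service account name are placeholders rather than values supplied by Unstructured, and the script only prints the gcloud commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions, not values from Unstructured): replace with your own.
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"
SA_NAME="${SA_NAME:-unstructured-deployer}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the service account, then bind each role to it at the project level.
grant_roles() {
  run gcloud iam service-accounts create "$SA_NAME" --project="$PROJECT_ID"
  for role in roles/compute.networkAdmin roles/container.admin \
    roles/storage.admin roles/cloudsql.admin; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="serviceAccount:${SA_EMAIL}" --role="$role"
  done
}

grant_roles
```

Review the printed commands first, then rerun with DRY_RUN=0 once the placeholders match your project.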
Bring my own infrastructure
If you want to set up the required infrastructure yourself, configure the following resources in your GCP account so that Unstructured can deploy the Unstructured UI and API into them. You must also provide your Unstructured sales representative or technical enablement contact with the access credentials for an IAM user or service account in your GCP account that has access to the target Google Kubernetes Engine (GKE) cluster.
VPC and networking (GCP equivalent)
- VPC Network
  - Name: u10d-platform
  - Subnet Mode: Custom
  - CIDR: 10.0.0.0/16
  - DNS: Internal DNS supported by default
- Internet Gateway
  - GCP provides implicit internet access via the default internet gateway (no need to explicitly create one)
- Public Subnet
  - Subnet: public-subnet (10.0.0.0/24)
  - Region: ${region}
  - Enable external IPs on VM instances for internet access
- NAT Gateway
  - Use Cloud NAT attached to a Cloud Router in the public subnet
  - Needed to provide egress internet access to private subnet instances
- Private Subnets (x2)
  - private-subnet-a: 10.0.1.0/24, region ${region}-a
  - private-subnet-b: 10.0.2.0/24, region ${region}-b
- Routes
  - Public subnet: default route 0.0.0.0/0 to Internet Gateway (via external IPs)
  - Private subnets: route 0.0.0.0/0 via Cloud NAT
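The network layout above could be created with gcloud roughly as follows. This is a sketch under assumptions: the region default, the router name, and the NAT name are hypothetical, the commands run against whatever project gcloud is currently configured for, and all three subnets use a single REGION variable because gcloud takes a region for each subnet. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): the region, router name, and NAT name are hypothetical.
REGION="${REGION:-us-central1}"
NETWORK="u10d-platform"
ROUTER="${ROUTER:-u10d-router}"
NAT="${NAT:-u10d-nat}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = subnet name, $2 = CIDR range
create_subnet() {
  run gcloud compute networks subnets create "$1" \
    --network="$NETWORK" --region="$REGION" --range="$2"
}

create_network() {
  run gcloud compute networks create "$NETWORK" --subnet-mode=custom
  create_subnet public-subnet 10.0.0.0/24
  create_subnet private-subnet-a 10.0.1.0/24
  create_subnet private-subnet-b 10.0.2.0/24
  # Cloud NAT on a Cloud Router gives the private subnets egress access.
  run gcloud compute routers create "$ROUTER" \
    --network="$NETWORK" --region="$REGION"
  run gcloud compute routers nats create "$NAT" \
    --router="$ROUTER" --region="$REGION" \
    --auto-allocate-nat-external-ips \
    --nat-custom-subnet-ip-ranges=private-subnet-a,private-subnet-b
}

create_network
```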
IAM roles and policies
- GKE Cluster IAM Service Account
  - Grant roles:
    - roles/container.clusterAdmin
    - roles/compute.networkAdmin
- GKE Node IAM Service Account
  - Grant roles:
    - roles/container.nodeServiceAccount
    - roles/compute.viewer
    - roles/storage.objectViewer
- Workload Identity IAM Bindings (x3)
  - Namespaces: recommender, etl-operator, data-broker
  - Use Workload Identity Federation
  - Bind GCP IAM Service Accounts to Kubernetes service accounts
  - Grant roles/storage.objectAdmin for access to GCS buckets
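The three Workload Identity bindings could be set up along these lines. The sketch assumes that one GCP service account per namespace already exists under a hypothetical "namespace-gsa" naming scheme, and that the Kubernetes service account name (KSA_NAME) is a placeholder; neither name comes from Unstructured. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): the project ID, the Kubernetes service account
# name, and the "<namespace>-gsa" naming scheme are hypothetical.
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"
KSA_NAME="${KSA_NAME:-default}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

bind_workload_identity() {
  for ns in recommender etl-operator data-broker; do
    gsa_email="${ns}-gsa@${PROJECT_ID}.iam.gserviceaccount.com"
    # Give the GCP service account access to GCS objects.
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="serviceAccount:${gsa_email}" --role=roles/storage.objectAdmin
    # Allow the Kubernetes service account to act as the GCP service account.
    run gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
      --role=roles/iam.workloadIdentityUser \
      --member="serviceAccount:${PROJECT_ID}.svc.id.goog[${ns}/${KSA_NAME}]"
    # Point the Kubernetes service account at the GCP service account.
    run kubectl annotate serviceaccount "$KSA_NAME" --namespace="$ns" \
      "iam.gke.io/gcp-service-account=${gsa_email}"
  done
}

bind_workload_identity
```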
GKE cluster
- Control Plane
  - Version: 1.31 or higher
  - Private Cluster: Enabled
  - Master Authorized Networks: your IP(s)
  - Enable Public Endpoint Access: Yes
- Node Pool
  - Machine type: n2-standard-16
  - Disk: 100GB, SSD (default boot disk)
  - Node count: min 2, max 5, autoscaling enabled
  - SSH access: via OS Login + SSH keys
  - SSH key: Add public key to instance metadata
- Firewall Rules
  - Allow:
    - Internal: 10.0.0.0/16
    - Egress: all
  - Kubernetes master access to nodes
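A cluster with the settings above might be created as sketched below. Several values are assumptions that do not come from Unstructured: the project ID, zone, cluster name, authorized CIDR (a documentation-only range here), control plane CIDR, firewall rule name, and the choice of private-subnet-a for the nodes. OS Login and SSH key setup are left out. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): every value here except the network name, version,
# machine type, disk, node counts, and internal range is hypothetical.
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"
ZONE="${ZONE:-us-central1-a}"
CLUSTER="${CLUSTER:-u10d-cluster}"
AUTHORIZED_CIDR="${AUTHORIZED_CIDR:-203.0.113.0/24}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_cluster() {
  # A zonal cluster keeps the autoscaling limits at 2 to 5 nodes in total;
  # with --region, gcloud applies the limits per zone.
  run gcloud container clusters create "$CLUSTER" \
    --project="$PROJECT_ID" --zone="$ZONE" \
    --network=u10d-platform --subnetwork=private-subnet-a \
    --cluster-version=1.31 \
    --enable-ip-alias --enable-private-nodes \
    --master-ipv4-cidr=172.16.0.0/28 \
    --enable-master-authorized-networks \
    --master-authorized-networks="$AUTHORIZED_CIDR" \
    --workload-pool="${PROJECT_ID}.svc.id.goog" \
    --machine-type=n2-standard-16 \
    --disk-type=pd-ssd --disk-size=100 \
    --enable-autoscaling --min-nodes=2 --max-nodes=5 --num-nodes=2
  # Allow traffic between resources inside the VPC range.
  run gcloud compute firewall-rules create u10d-allow-internal \
    --project="$PROJECT_ID" --network=u10d-platform \
    --allow=tcp,udp,icmp --source-ranges=10.0.0.0/16
}

create_cluster
```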
Kubernetes add-ons (installed via kubectl or Helm)
- Workload Identity Config
- Metrics Server
  - Deployed manually (same version: v0.7.2)
- GCP CSI Driver
  - Provisioner: pd.csi.storage.gke.io
  - Role binding needed for controller SA
Storage class
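The add-on list above names the Metrics Server version and the CSI provisioner but not the storage class settings. As an illustration only, the following sketch writes a StorageClass manifest for that provisioner and applies it together with the Metrics Server v0.7.2 release manifest. The file name, class name (u10d-pd-ssd), disk type, and binding mode are assumptions, not Unstructured requirements. GKE clusters commonly include a managed Metrics Server already, so confirm whether the manual install applies to your cluster. The kubectl commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): the file name, class name, and disk type are hypothetical.
MANIFEST="${MANIFEST:-storage-class.yaml}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Write a StorageClass manifest that uses the provisioner named above.
cat > "$MANIFEST" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: u10d-pd-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
EOF

# Apply the Metrics Server v0.7.2 release manifest, then the storage class.
run kubectl apply -f \
  https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.2/components.yaml
run kubectl apply -f "$MANIFEST"
```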
Cloud SQL (Postgres)
- Private IP-enabled Cloud SQL instance
  - Engine: Postgres 16
  - Size: db-f1-micro (or db-custom-1-3840)
  - Storage: 20GB
  - Credentials: Username/password
  - Private network: Use the private VPC
  - Cloud SQL Auth Proxy or private VPC peering to connect from GKE
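An instance matching the list above might be provisioned as follows. This is a sketch under assumptions: the project ID, region, instance name, and database name are hypothetical, the larger of the two listed sizes is used, and the enterprise edition is set explicitly on the assumption that custom machine tiers need it for Postgres 16. User creation is omitted so that no password appears on the command line. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): project, region, instance, and database names are hypothetical.
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"
REGION="${REGION:-us-central1}"
INSTANCE="${INSTANCE:-u10d-postgres}"
DB_NAME="${DB_NAME:-unstructured}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_database() {
  # Private IP requires private services access to be configured on the VPC first.
  run gcloud sql instances create "$INSTANCE" \
    --project="$PROJECT_ID" --region="$REGION" \
    --database-version=POSTGRES_16 \
    --edition=enterprise --tier=db-custom-1-3840 \
    --storage-size=20GB \
    --network="projects/${PROJECT_ID}/global/networks/u10d-platform" \
    --no-assign-ip
  run gcloud sql databases create "$DB_NAME" \
    --instance="$INSTANCE" --project="$PROJECT_ID"
}

create_database
```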
GCS Buckets
- Buckets:
  - u10d-{stack_name}-etl-blob-cache
  - u10d-{stack_name}-etl-job-db
  - u10d-{stack_name}-etl-job-status
  - u10d-{stack_name}-job-files
- Config:
  - Versioning: Enabled
  - Encryption: Default (Google-managed key or CMEK if needed)
  - Lifecycle rule: Auto-delete / force destroy if needed (optional)
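The four buckets could be created with versioning enabled as sketched here. The stack name and location defaults are placeholders, encryption is left at the Google-managed default, and the optional lifecycle rule is not configured. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): the stack name and location are hypothetical.
STACK_NAME="${STACK_NAME:-demo}"
REGION="${REGION:-us-central1}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode (the default), print each command instead of executing it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_buckets() {
  for suffix in etl-blob-cache etl-job-db etl-job-status job-files; do
    bucket="gs://u10d-${STACK_NAME}-${suffix}"
    run gcloud storage buckets create "$bucket" --location="$REGION"
    # Turn on object versioning for the new bucket.
    run gcloud storage buckets update "$bucket" --versioning
  done
}

create_buckets
```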
Keys
- SSH Key Pair
  - Generate manually (ssh-keygen -t rsa -b 4096)
  - Upload public key to project metadata or OS Login
  - Export private key as PEM for automation
Secrets and ConfigMaps
After your infrastructure is set up, but before Unstructured can deploy the Unstructured UI and API into your infrastructure, Unstructured will need to know the values of the following Secrets and configuration mappings (also known as ConfigMaps). The Secrets are as follows.
Blob storage credentials
- BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON
- BLOB_STORAGE_ADAPTER_REGION_NAME
Database credentials
- DB_USERNAME
- DB_PASSWORD
- DB_HOST
- DB_NAME
- DB_DATABASE (used in platform-api only)
Authentication
- JWT_SECRET_KEY
- AUTH_STRATEGY (sometimes encoded, sometimes not)
- SESSION_SECRET
- SHARED_SECRET
- KEYCLOAK_CLIENT_SECRET
- KEYCLOAK_ADMIN_SECRET
- KEYCLOAK_ADMIN
- KEYCLOAK_ADMIN_PASSWORD
- API_BEARER_TOKEN
The ConfigMaps are as follows.
Blob storage settings
- BLOB_STORAGE_ADAPTER_TYPE (always gcp for GCP)
- BLOB_STORAGE_ADAPTER_BUCKET
- ETL_BLOB_CACHE_BUCKET_NAME
- ETL_API_BLOB_STORAGE_ADAPTER_BUCKET
- ETL_API_BLOB_STORAGE_ADAPTER_TYPE
- ETL_API_DB_REMOTE_BUCKET_NAME
- ETL_API_JOB_STATUS_DEST_BUCKET_NAME
- JOB_STATUS_BUCKET_NAME
- JOB_DB_BUCKET_NAME
Environment
- ENV
- ENVIRONMENT
- JOB_ENV
- JOB_ENVIRONMENT
Observability and OpenTelemetry (OTel)
- JOB_OTEL_EXPORTER_OTLP_ENDPOINT
- JOB_OTEL_METRICS_EXPORTER
- JOB_OTEL_TRACES_EXPORTER
- OTEL_EXPORTER_OTLP_ENDPOINT
- OTEL_METRICS_EXPORTER
- OTEL_TRACES_EXPORTER
Unstructured API and authentication
- UNSTRUCTURED_API_URL
- JWKS_URL
- JWT_ISSUER
- JWT_AUDIENCE
- SINGLE_PLANE_DEPLOYMENT
Front end and dashboard
- API_BASE_URL
- API_CLIENT_BASE_URL
- API_URL
- APM_SERVICE_NAME
- APM_SERVICE_NAME_CLIENT
- AUTH_STRATEGY
- FRONTEND_BASE_URL
- KEYCLOAK_CALLBACK_URL
- KEYCLOAK_CLIENT_ID
- KEYCLOAK_DOMAIN
- KEYCLOAK_REALM
- KEYCLOAK_SSL_ENABLED
- KEYCLOAK_TRUST_ISSUER
- PUBLIC_BASE_URL
- PUBLIC_RELEASE_CHANNEL
Sentry & Feature Flags
- SENTRY_DSN
- SENTRY_SAMPLE_RATE
- WORKFLOW_NODE_EDITOR_FF_REQUEST_FORM
- CUSTOM_WORKFLOW_FF_REQUEST_FORM
Redis
- REDIS_DSN
Other
- IMAGE_PULL_SECRETS
- PRIVATE_KEY_SECRETS_ADAPTER_TYPE
- PRIVATE_KEY_SECRETS_ADAPTER_GCP_REGION
- SECRETS_ADAPTER_TYPE
- SECRETS_ADAPTER_GCP_REGION
The Secrets and ConfigMaps are organized into the following files.

| File name | Type | Resource name | Namespace | Data keys |
|---|---|---|---|---|
| data-broker-env-cm.yaml | ConfigMap | data-broker-env | api | JOB_STATUS_BUCKET_NAME, JOB_DB_BUCKET_NAME, BLOB_STORAGE_ADAPTER_TYPE |
| data-broker-env-secret.yaml | Secret | data-broker-env | api | BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON, BLOB_STORAGE_ADAPTER_REGION_NAME |
| dataplane-api-env-cm.yaml | Secret | dataplane-api-env | api | DB_PASSWORD, DB_USERNAME, DB_HOST, DB_NAME |
| etl-operator-env-cm.yaml | ConfigMap | etl-operator-env | etl-operator | BLOB_STORAGE_ADAPTER_BUCKET, JOB_STATUS_BUCKET_NAME, JOB_DB_BUCKET_NAME, BLOB_STORAGE_ADAPTER_TYPE, ENV, ENVIRONMENT, REDIS_DSN, ETL_API_BLOB_STORAGE_ADAPTER_BUCKET, ETL_API_BLOB_STORAGE_ADAPTER_TYPE, ETL_API_DB_REMOTE_BUCKET_NAME, ETL_API_JOB_STATUS_DEST_BUCKET_NAME (x2), ETL_BLOB_CACHE_BUCKET_NAME, IMAGE_PULL_SECRETS, JOB_ENV, JOB_ENVIRONMENT, JOB_OTEL_EXPORTER_OTLP_ENDPOINT, JOB_OTEL_METRICS_EXPORTER, JOB_OTEL_TRACES_EXPORTER, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_METRICS_EXPORTER, OTEL_TRACES_EXPORTER, UNSTRUCTURED_API_URL |
| etl-operator-env-secret.yaml | Secret | etl-operator-env | etl-operator | BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON, BLOB_STORAGE_ADAPTER_REGION_NAME |
| frontend-env-cm.yaml | ConfigMap | frontend-env | www | API_BASE_URL, API_CLIENT_BASE_URL, API_URL, APM_SERVICE_NAME, APM_SERVICE_NAME_CLIENT, AUTH_STRATEGY, ENV, FRONTEND_BASE_URL, KEYCLOAK_CALLBACK_URL, KEYCLOAK_CLIENT_ID, KEYCLOAK_DOMAIN, KEYCLOAK_REALM, KEYCLOAK_SSL_ENABLED, KEYCLOAK_TRUST_ISSUER, PUBLIC_BASE_URL, PUBLIC_RELEASE_CHANNEL, SENTRY_DSN, SENTRY_SAMPLE_RATE, WORKFLOW_NODE_EDITOR_FF_REQUEST_FORM, CUSTOM_WORKFLOW_FF_REQUEST_FORM |
| frontend-env-secret.yaml | Secret | frontend-env | www | API_BEARER_TOKEN, KEYCLOAK_ADMIN_SECRET, KEYCLOAK_CLIENT_SECRET, SESSION_SECRET, SHARED_SECRET |
| keycloak-secret.yaml | Secret | phasetwo-keycloak-env | www | KEYCLOAK_ADMIN, KEYCLOAK_ADMIN_PASSWORD |
| platform-api-env-cm.yaml | ConfigMap | platform-api-env | api | JWKS_URL, JWT_ISSUER, JWT_AUDIENCE, SINGLE_PLANE_DEPLOYMENT |
| platform-api-env-secret.yaml | Secret | platform-api-env | api | DB_PASSWORD, DB_USERNAME, DB_HOST, DB_NAME, DB_DATABASE, JWT_SECRET_KEY, AUTH_STRATEGY |
| recommender-env-cm.yaml | ConfigMap | recommender-env | recommender | BLOB_STORAGE_ADAPTER_TYPE, ETL_BLOB_CACHE_BUCKET_NAME |
| recommender-env-secret.yaml | Secret | recommender-env | recommender | BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON, BLOB_STORAGE_ADAPTER_REGION_NAME |
| secret-provider-api-env-cm.yaml | ConfigMap | secrets-provider-api-env | secrets | ENV, ENVIRONMENT, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_METRICS_EXPORTER, OTEL_TRACES_EXPORTER, PRIVATE_KEY_SECRETS_ADAPTER_GCP_REGION, PRIVATE_KEY_SECRETS_ADAPTER_TYPE, SECRETS_ADAPTER_GCP_REGION, SECRETS_ADAPTER_TYPE |
| secret-provider-api-env-secret.yaml | Secret | secrets-provider-api-env | secrets | BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON, BLOB_STORAGE_ADAPTER_REGION_NAME |
| usage-collector-env-secret.yaml | Secret | usage-collector-env | api | DB_PASSWORD, DB_USERNAME, DB_HOST, DB_NAME, BLOB_STORAGE_ADAPTER_TYPE |
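For example, the data-broker-env-cm.yaml file defines a ConfigMap named data-broker-env in the api namespace, with the data keys JOB_STATUS_BUCKET_NAME, JOB_DB_BUCKET_NAME, and BLOB_STORAGE_ADAPTER_TYPE.
The data-broker-env-secret.yaml file defines a Secret with the same resource name and namespace, with the data keys BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON and BLOB_STORAGE_ADAPTER_REGION_NAME.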
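To make the file layout concrete, the following sketch writes the two data-broker files using the resource names, namespace, and data keys from the table. The values are assumptions: the mapping of bucket names to keys is a guess based on the bucket naming pattern, the stack name and region are placeholders, and the service account key is left as a marker to fill in. The Secret uses stringData so that values can be written without base64 encoding.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): the stack name, region, and output directory are hypothetical.
STACK_NAME="${STACK_NAME:-demo}"
REGION="${REGION:-us-central1}"
OUT_DIR="${OUT_DIR:-.}"

# ConfigMap: non-sensitive settings for the data broker.
cat > "${OUT_DIR}/data-broker-env-cm.yaml" <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: data-broker-env
  namespace: api
data:
  JOB_STATUS_BUCKET_NAME: "u10d-${STACK_NAME}-etl-job-status"
  JOB_DB_BUCKET_NAME: "u10d-${STACK_NAME}-etl-job-db"
  BLOB_STORAGE_ADAPTER_TYPE: "gcp"
EOF

# Secret: credentials for blob storage. Replace the marker with the real key JSON.
cat > "${OUT_DIR}/data-broker-env-secret.yaml" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: data-broker-env
  namespace: api
type: Opaque
stringData:
  BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON: "REPLACE_WITH_SERVICE_ACCOUNT_KEY_JSON"
  BLOB_STORAGE_ADAPTER_REGION_NAME: "${REGION}"
EOF
```

The other files in the table would follow the same shape, with their own resource names, namespaces, and data keys.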

