Configure a cluster-wide proxy in an Azure Red Hat OpenShift (ARO) cluster

This article describes the process for enabling a cluster-wide proxy on an Azure Red Hat OpenShift cluster. This feature allows production environments to deny direct access to the internet and instead have an HTTP or HTTPS proxy available. This article details the specific configuration steps necessary for an Azure Red Hat OpenShift cluster. For more information about how the cluster-wide proxy feature works for the OpenShift Container Platform, see the Red Hat documentation.

When configuring a cluster-wide proxy, it's important to understand the following impacts:

  • Node reboot: Enabling the proxy causes nodes to reboot in a rolling fashion, similar to a cluster update, because the proxy settings are applied through new machine configurations.
  • Service disruptions: To avoid service disruptions during this process, it's crucial to prepare the noProxy list as described in this article.

Important

Failure to adhere to the instructions outlined in this article might result in improper routing of cluster network traffic. This could lead to workload issues, such as image pull failures.

Scope of cluster-wide proxy configuration

  • OpenShift workloads: The instructions in this article only apply to OpenShift workloads. Proxying application workloads is out of scope for this article.
  • OpenShift Container Platform versions: Cluster-wide proxy is supported on OpenShift Container Platform versions outlined in the Azure Red Hat OpenShift support policy.

Following the instructions in this article and preparing the noProxy list will minimize disruptions and ensure a smooth transition when enabling the proxy.

Prerequisites and disclaimer

  • Review the OpenShift documentation for Configuring the cluster-wide proxy for more information.
  • Proxy server and certificates: You're expected to have a proxy server and certificates already in place.
  • Azure Red Hat OpenShift SRE doesn't provide support for your proxy server or certificates.

Overview

  1. Gather the required endpoint values for use in the noProxy list.
  2. Enable the cluster-wide proxy using the gathered data for noProxy.
  3. Verify that the noProxy list and the cluster-wide proxy were successfully configured.

Gather the required data for noProxy

  1. Verify the cluster-wide proxy status by running the following command:

    oc get proxy cluster -o yaml
    

    The spec and status fields should be empty, indicating that the proxy isn't enabled, as in the following example. If they aren't empty, the proxy might have been configured previously.

    apiVersion: config.openshift.io/v1
    kind: Proxy
    metadata:
      creationTimestamp: "xxxx-xx-xxTxx:xx:xxZ"
      generation:
      name: cluster
      resourceVersion:
      uid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    spec:
      trustedCA:
        name: ""
    status: {}
    
  2. Note the Azure Instance Metadata Service (IMDS) IP: 169.254.169.254

  3. If you aren't using custom DNS, note the Azure DNS IP: 168.63.129.16
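
    These platform IPs must bypass the proxy. As an optional sanity check (a sketch, assuming shell access to a node, for example via oc debug node/<NODE_NAME>), you can confirm that IMDS is reachable without a proxy:

    # IMDS requires the Metadata header and rejects proxied requests,
    # so the call must bypass any proxy (--noproxy "*").
    curl -s -H "Metadata: true" --noproxy "*" \
      "http://169.254.169.254/metadata/instance?api-version=2021-02-01" | head -c 200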

  4. Note the localhost and service domains:

    • localhost
    • 127.0.0.1
    • .svc
    • .cluster.local
  5. Retrieve the gatewayDomains by running the following command:

    oc get cluster cluster -o jsonpath='{.spec.gatewayDomains}'
    

    See the following example output:

    [
        "agentimagestorews01.blob.core.windows.net",
        "agentimagestorecus01.blob.core.windows.net",
        "agentimagestoreeus01.blob.core.windows.net",
        "agentimagestoreeus01.blob.core.windows.net",
        "agentimagestoreeas01.blob.core.windows.net",
        "eastus-shared.prod.warm.ingest.monitor.core.windows.net",
        "...", // Many other endpoints
    ]
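
    If you have jq available (an assumption; it isn't required by this article), you can flatten this array into comma-separated entries ready for the noProxy list:

    # Requires jq; joins the gatewayDomains array with commas.
    oc get cluster cluster -o json | jq -r '.spec.gatewayDomains | join(",")'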
    
  6. Get your cluster domain URLs.

    Create the cluster-specific URLs for the API and application domains.

    a. Obtain the applications domain by running the following command:

    az aro show -n <CLUSTER_NAME> -g <RESOURCE_GROUP_NAME> --query "consoleProfile.url" -o tsv
    

    See the following example output:

    https://console-openshift-console.apps.xxxxxxxx.westus2.aroapp.io/
    

    Keep only the part starting with .apps.xxxxxxxx for use in the noProxy list. Don't include the trailing "/".

    See the following example:

    .apps.xxxxxxxx.westus2.aroapp.io
    

    b. Obtain the API domains.

    Using the output of the previous command, replace .apps with api and with api-int to create the two API domain entries for the noProxy list.

    See the following example:

    api.xxxxxxxx.westus2.aroapp.io
    api-int.xxxxxxxx.westus2.aroapp.io
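
    If you prefer to derive all three entries in a shell (a sketch built on the same az query shown above, with the same placeholders), the following prints them directly:

    # Strip the scheme, the console host prefix, and the trailing slash,
    # then derive the .apps, api, and api-int entries.
    CONSOLE_URL=$(az aro show -n <CLUSTER_NAME> -g <RESOURCE_GROUP_NAME> --query "consoleProfile.url" -o tsv)
    DOMAIN=$(echo "$CONSOLE_URL" | sed -e 's|^https://console-openshift-console\.apps\.||' -e 's|/$||')
    echo ".apps.$DOMAIN"
    echo "api.$DOMAIN"
    echo "api-int.$DOMAIN"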
    
  7. Get the CIDR ranges.

    a. Get the addressPrefix from the worker profile subnets by running the following command:

    SUBNET_ID=$(az aro show -n <CLUSTER_NAME> -g <RESOURCE_GROUP_NAME> --query "workerProfiles[].subnetId" -o tsv)
    az network vnet subnet show --ids "$SUBNET_ID" --query "addressPrefix || [].addressPrefix" -o tsv
    

    Example output:

    10.0.1.0/24
    

    b. Get the addressPrefix from the master profile subnet by running the following command:

    SUBNET_ID=$(az aro show -n <CLUSTER_NAME> -g <RESOURCE_GROUP_NAME> --query "masterProfile.subnetId" -o tsv)
    az network vnet subnet show --ids "$SUBNET_ID" --query "addressPrefix" -o tsv
    

    Example output:

    10.0.0.0/24
    

    c. Get the podCidr by running the following command:

    az aro show -n <CLUSTER_NAME> -g <RESOURCE_GROUP_NAME> --query "networkProfile.podCidr" -o tsv
    

    Example output:

    10.128.0.0/14
    

    d. Get the serviceCidr by running the following command:

    az aro show -n <CLUSTER_NAME> -g <RESOURCE_GROUP_NAME> --query "networkProfile.serviceCidr" -o tsv
    

    Example output:

    172.30.0.0/16
    
  8. Combine the gathered data into your noProxy list, which you'll use to update the proxy cluster object in the next section; a sketch follows.
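
    The following is a minimal sketch of this step using the example values gathered above (the variable names and jq usage are illustrative assumptions; substitute the values you collected):

    # Illustrative only: assemble the gathered endpoints into one comma-separated list.
    GATEWAY_DOMAINS=$(oc get cluster cluster -o json | jq -r '.spec.gatewayDomains | join(",")')
    NO_PROXY="localhost,127.0.0.1,.svc,.cluster.local,169.254.169.254,168.63.129.16"
    NO_PROXY="$NO_PROXY,$GATEWAY_DOMAINS"
    NO_PROXY="$NO_PROXY,.apps.xxxxxxxx.westus2.aroapp.io,api.xxxxxxxx.westus2.aroapp.io,api-int.xxxxxxxx.westus2.aroapp.io"
    NO_PROXY="$NO_PROXY,10.0.0.0/24,10.0.1.0/24,10.128.0.0/14,172.30.0.0/16"
    echo "$NO_PROXY"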

Enable the cluster-wide proxy

  1. Create the user-ca-bundle ConfigMap in the openshift-config namespace so that the cluster trusts your proxy's certificate.

    a. Create a file called user-ca-bundle.yaml with the following contents, and provide the values of your PEM-encoded certificates:

    apiVersion: v1
    data:
      ca-bundle.crt: |
        <MY_PEM_ENCODED_CERTS>
    kind: ConfigMap
    metadata:
      name: user-ca-bundle
      namespace: openshift-config
    
    • data.ca-bundle.crt: This data key must be named ca-bundle.crt.
    • data.ca-bundle.crt | <MY_PEM_ENCODED_CERTS>: One or more PEM-encoded X.509 certificates used to sign the proxy’s identity certificate.
    • metadata.name: The config map name referenced from the proxy object.
    • metadata.namespace: The config map must be in the openshift-config namespace.
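
    Alternatively (a sketch, assuming your PEM bundle is in a local file named ca-bundle.pem), you can create the same ConfigMap directly from the file and skip writing a manifest:

    # Equivalent one-liner; the key name ca-bundle.crt is required.
    oc create configmap user-ca-bundle -n openshift-config --from-file=ca-bundle.crt=ca-bundle.pem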

    b. Create the ConfigMap by running the following command:

    oc create -f user-ca-bundle.yaml
    

    c. Confirm the creation of the user-ca-bundle ConfigMap by running the following command:

    oc get cm -n openshift-config user-ca-bundle -o yaml
    

    See the following example output:

    apiVersion: v1
    data:
      ca-bundle.crt: |
         -----BEGIN CERTIFICATE-----
         <CERTIFICATE_DATA>
         -----END CERTIFICATE-----
    kind: ConfigMap
    metadata:
      creationTimestamp: "xxxx-xx-xxTxx:xx:xxZ"
      name: user-ca-bundle
      namespace: openshift-config
      resourceVersion: "xxxxxx"
      uid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    
  2. Update the proxy cluster object using oc edit, then configure the proxy object using the previously gathered information.

    a. Run the following command:

    oc edit proxy/cluster
    

    Update or add the following fields:

    • spec.httpProxy: A proxy URL to use for creating HTTP connections outside the cluster. The URL scheme must be http.
    • spec.httpsProxy: A proxy URL to use for creating HTTPS connections outside the cluster.
    • spec.noProxy: The comma-separated list of endpoints that you gathered in the Gather the required data for noProxy steps above.
    • spec.trustedCA: A reference to the config map in the openshift-config namespace that contains the additional CA certificates required for proxying HTTPS connections. The config map must exist before it can be referenced here; in this case, it's the user-ca-bundle config map created earlier.
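
    If you prefer a non-interactive update instead of oc edit (a sketch that reuses the example proxy endpoints from this article; substitute your own URLs and the full noProxy list you assembled), oc patch achieves the same result:

    # Merge-patches the cluster proxy object in one step.
    oc patch proxy/cluster --type=merge -p '{
      "spec": {
        "httpProxy": "http://10.0.0.15:3128",
        "httpsProxy": "https://10.0.0.15:3129",
        "noProxy": "<YOUR_NO_PROXY_LIST>",
        "trustedCA": {"name": "user-ca-bundle"}
      }
    }'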

    b. Confirm the configuration by running the following command:

    oc get proxy cluster -o yaml
    

    See the following example output:

    apiVersion: config.openshift.io/v1
    kind: Proxy
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: |
          {"apiVersion":"config.openshift.io/v1","kind":"Proxy","metadata":{"annotations":{},"name":"cluster"},"spec":{"httpProxy":"http://10.0.0.15:3128","httpsProxy":"https://10.0.0.15:3129","noProxy":"agentimagestorecus01.blob.core.windows.net,agentimagestoreeus01.blob.core.windows.net,agentimagestorewus01.blob.core.windows.net,agentimagestoreweu01.blob.core.windows.net,agentimagestoreeas01.blob.core.windows.net,australiaeast-shared.prod.warm.ingest.monitor.core.windows.net,gcs.prod.monitoring.core.windows.net,gsm1130809042eh.servicebus.windows.net,gsm1130809042xt.blob.core.windows.net,gsm119650579eh.servicebus.windows.net,gsm119650579xt.blob.core.windows.net,gsm810972145eh.servicebus.windows.net,gsm810972145xt.blob.core.windows.net,maupdateaccount.blob.core.windows.net,maupdateaccount2.blob.core.windows.net,maupdateaccount3.blob.core.windows.net,maupdateaccount4.blob.core.windows.net,production.diagnostics.monitoring.core.windows.net,qos.prod.warm.ingest.monitor.core.windows.net,login.microsoftonline.com,management.azure.com,arosvc.azurecr.io,arosvc.australiaeast.data.azurecr.io,imageregistryvmxx7.blob.core.windows.net,.cluster.local,.svc,api-int.vlsi41ah.australiaeast.aroapp.io,localhost,10.0.0.0/8","trustedCA":{"name":"user-ca-bundle"}}}
      creationTimestamp: "xxxx-xx-xxTxx:xx:xxZ"
      generation: 17
      name: cluster
      resourceVersion: "xxxxxxx"
      uid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    spec:
      httpProxy: http://10.0.0.15:3128
      httpsProxy: https://10.0.0.15:3129
      noProxy: agentimagestorecus01.blob.core.windows.net,agentimagestoreeus01.blob.core.windows.net,agentimagestorewus01.blob.core.windows.net,agentimagestoreweu01.blob.core.windows.net,agentimagestoreeas01.blob.core.windows.net,australiaeast-shared.prod.warm.ingest.monitor.core.windows.net,gcs.prod.monitoring.core.windows.net,gsm1130809042eh.servicebus.windows.net,gsm1130809042xt.blob.core.windows.net,gsm119650579eh.servicebus.windows.net,gsm119650579xt.blob.core.windows.net,gsm810972145eh.servicebus.windows.net,gsm810972145xt.blob.core.windows.net,maupdateaccount.blob.core.windows.net,maupdateaccount2.blob.core.windows.net,maupdateaccount3.blob.core.windows.net,maupdateaccount4.blob.core.windows.net,production.diagnostics.monitoring.core.windows.net,qos.prod.warm.ingest.monitor.core.windows.net,login.microsoftonline.com,management.azure.com,arosvc.azurecr.io,arosvc.australiaeast.data.azurecr.io,imageregistryvmxx7.blob.core.windows.net,.cluster.local,.svc,api-int.vlsi41ah.australiaeast.aroapp.io,localhost,10.0.0.0/8
      trustedCA:
        name: user-ca-bundle
    status:
      httpProxy: http://10.0.0.15:3128
      httpsProxy: https://10.0.0.15:3129
      noProxy: .cluster.local,.svc,10.0.0.0/8,10.128.0.0/14,127.0.0.0/8,127.0.0.1,169.254.169.254,172.30.0.0/16,agentimagestorecus01.blob.core.windows.net,agentimagestoreeas01.blob.core.windows.net,agentimagestoreeus01.blob.core.windows.net,agentimagestoreweu01.blob.core.windows.net,agentimagestorewus01.blob.core.windows.net,api-int.vlsi41ah.australiaeast.aroapp.io,arosvc.australiaeast.data.azurecr.io,arosvc.azurecr.io,australiaeast-shared.prod.warm.ingest.monitor.core.windows.net,gcs.prod.monitoring.core.windows.net,gsm1130809042eh.servicebus.windows.net,gsm1130809042xt.blob.core.windows.net,gsm119650579eh.servicebus.windows.net,gsm119650579xt.blob.core.windows.net,gsm810972145eh.servicebus.windows.net,gsm810972145xt.blob.core.windows.net,imageregistryvmxx7.blob.core.windows.net,localhost,login.microsoftonline.com,management.azure.com,maupdateaccount.blob.core.windows.net,maupdateaccount2.blob.core.windows.net,maupdateaccount3.blob.core.windows.net,maupdateaccount4.blob.core.windows.net,production.diagnostics.monitoring.core.windows.net,qos.prod.warm.ingest.monitor.core.windows.net
    
  3. Wait for the new machine-config to be rolled out to all the nodes and for the cluster operators to report healthy.

    a. Confirm node health by running the following command:

    oc get nodes
    

    See the following example output:

    NAME                                         STATUS   ROLES    AGE   VERSION
    mycluster-master-0                           Ready    master   10d   v1.xx.xx+xxxxxxx
    mycluster-master-1                           Ready    master   10d   v1.xx.xx+xxxxxxx
    mycluster-master-2                           Ready    master   10d   v1.xx.xx+xxxxxxx
    mycluster-worker-australiaeast1-mvzqr        Ready    worker   10d   v1.xx.xx+xxxxxxx
    mycluster-worker-australiaeast2-l9fgj        Ready    worker   10d   v1.xx.xx+xxxxxxx
    mycluster-worker-australiaeast3-pz9rw        Ready    worker   10d   v1.xx.xx+xxxxxxx
    

    b. Confirm cluster operator health by running the following command:

    oc get co
    

    See the following example output:

    NAME                                VERSION        AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    aro                                 vxxxxxxxx      True        False         False      10d
    authentication                      4.xx.xx        True        False         False      8m25s
    cloud-controller-manager            4.xx.xx        True        False         False      10d
    cloud-credential                    4.xx.xx        True        False         False      10d
    cluster-autoscaler                  4.xx.xx        True        False         False      10d
    ... (Many other components) ...
    storage                             4.xx.xx        True        False         False      10d
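
    You can also track the machine-config rollout itself by watching the MachineConfigPools; the rollout is complete when both pools report UPDATED as True and UPDATING as False:

    oc get machineconfigpool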
    

    Note

    If you need the user-ca-bundle, it's located at the following path on each node (but it's not required for this process):

    /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt

Verify noProxy configuration

To verify your proxy configuration, check the health status of the cluster operators. If the noProxy field is misconfigured, multiple cluster operators might enter a Degraded: True state. This can result from various issues, including, but not limited to, ImagePullBackOff errors, invalid certificates, or general connectivity problems. Additionally, some operators might remain in a Progressing: True state due to similar underlying causes.
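
A quick way to surface only the unhealthy operators (a sketch that assumes the default oc get co column layout, where AVAILABLE, PROGRESSING, and DEGRADED are the third, fourth, and fifth columns):

    # Print only operators that are unavailable, progressing, or degraded.
    oc get co --no-headers | awk '$3 != "True" || $4 != "False" || $5 != "False"'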

  1. Check the status of the cluster operators by running the following command:

    oc get co
    
  2. Interpreting the output (healthy state): If the noProxy field is correctly configured, the output should resemble the following example:

    NAME                                       VERSION        AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    aro                                        vxxxxxxxx.xx   True        False         False      15d
    authentication                             4.xx.xx        True        False         False      15d
    cloud-controller-manager                   4.xx.xx        True        False         False      15d
    cloud-credential                           4.xx.xx        True        False         False      15d
    

    Note

    The number and type of cluster operators may vary. The truncated example shown is provided to illustrate a healthy state for ARO-supported operators.

  3. Interpreting the output (misconfigured): If the noProxy field is misconfigured, the output might resemble the following example:

    NAME                         VERSION        AVAILABLE  PROGRESSING  DEGRADED  SINCE    MESSAGE
    aro                          vxxxxxxxx.xx   True       False        False     45h
    authentication               4.xx.xx        False      True         True      24h      OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.mm6osebam6b03b9df3.eastus2euap.aroapp.io/healthz": Not Found
    control-plane-machine-set    4.xx.xx        True       False        False     46h      SyncLoopRefreshProgressing: Working toward version 4.15.35, 1 replicas available
    image-registry               4.xx.xx        True       True         False     45h      NodeCADaemonProgressing: The daemon set node-ca is deployed Progressing: The deployment has not completed
    ingress                      4.xx.xx        True       True         True      83m      The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
    machine-config               4.xx.xx        False      False        True      43h      Cluster not available for [{operator 4.15.35}]: error during waitForControllerConfigToBeCompleted: [context deadline exceeded, controllerconfig is not completed: status for ControllerConfig machine-config-controller is being reported for 6, expecting it for 13]
    storage                      4.xx.xx        True       True         False     45h      AzureFileCSIDriverOperatorCRProgressing: AzureFileDriverControllerServiceControllerProgressing: Waiting for Deployment to deploy pods AzureFileCSIDriverOperatorCRProgressing: AzureFileDriverNodeServiceControllerProgressing: Waiting for DaemonSet to deploy node pods
    

    Note

    This is only a truncated sample of the output. Other cluster operators might also report a Degraded: True state with different errors resulting from the misconfiguration of noProxy.
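
    To inspect the full condition messages for any single operator from the output above (for example, ingress), describe it:

    oc describe co ingress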

Remove cluster-wide proxy

For information about removing the cluster-wide proxy, see the Red Hat OpenShift documentation.