Resource propagation failure: ClusterResourcePlacementRolloutStarted is false

This article describes how to troubleshoot ClusterResourcePlacementRolloutStarted issues when you propagate resources by using the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager.

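Symptoms

When you use the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager to propagate resources, the selected resources aren't rolled out in all scheduled clusters, and the ClusterResourcePlacementRolloutStarted condition status shows as False.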

Note

To get more information about why the rollout doesn't start, you can check the rollout controller logs.

Cause

The ClusterResourcePlacement rollout strategy is blocked because the RollingUpdate configuration is too strict.

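Troubleshooting steps

  1. In the ClusterResourcePlacement status section, check placementStatuses to identify the clusters whose RolloutStarted status is set to False.
  2. Locate the corresponding ClusterResourceBinding for each identified cluster. For more information, see How can I find the latest ClusterResourceBinding resource? This resource should indicate the Work status (whether it was created or updated).
  3. Verify the values of maxUnavailable and maxSurge to make sure that they align with your expectations.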
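
Step 1 can also be done programmatically once the ClusterResourcePlacement status has been loaded as a dictionary (for example, from JSON output). The following Python sketch is illustrative only: the function name clusters_with_rollout_not_started and the trimmed sample data are hypothetical, while the field names (placementStatuses, clusterName, conditions, type, status) are the ones that appear in the status output in this article.

```python
def clusters_with_rollout_not_started(status):
    """Return the names of clusters whose RolloutStarted condition is "False"."""
    blocked = []
    for placement in status.get("placementStatuses", []):
        for condition in placement.get("conditions", []):
            if condition.get("type") == "RolloutStarted" and condition.get("status") == "False":
                blocked.append(placement["clusterName"])
                break  # one match per cluster is enough
    return blocked


# Trimmed, hypothetical sample that mirrors the shape of a placement status.
sample_status = {
    "placementStatuses": [
        {
            "clusterName": "kind-cluster-2",
            "conditions": [
                {"type": "Scheduled", "status": "True"},
                {"type": "RolloutStarted", "status": "False"},
            ],
        },
        {
            "clusterName": "kind-cluster-1",
            "conditions": [
                {"type": "Scheduled", "status": "True"},
                {"type": "RolloutStarted", "status": "False"},
            ],
        },
    ]
}

print(clusters_with_rollout_not_started(sample_status))
```

The clusters that this returns are the ones whose ClusterResourceBinding you would inspect in step 2.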

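Case study

In the following example, the ClusterResourcePlacement tries to propagate a namespace to three member clusters. However, during the initial creation of the ClusterResourcePlacement, the namespace doesn't exist on the hub cluster, and the fleet currently contains only two member clusters, named kind-cluster-1 and kind-cluster-2.

ClusterResourcePlacement spec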

spec:
  policy:
    numberOfClusters: 3
    placementType: PickN
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: test-ns
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate

ClusterResourcePlacement status

status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All 2 cluster(s) start rolling out the latest resource
    observedGeneration: 1
    reason: RolloutStarted
    status: "True"
    type: ClusterResourcePlacementRolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 1
    reason: NoOverrideSpecified
    status: "True"
    type: ClusterResourcePlacementOverridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: Works(s) are successfully created or updated in the 2 target clusters'
      namespaces
    observedGeneration: 1
    reason: WorkSynchronized
    status: "True"
    type: ClusterResourcePlacementWorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources are successfully applied to 2 clusters
    observedGeneration: 1
    reason: ApplySucceeded
    status: "True"
    type: ClusterResourcePlacementApplied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources in 2 cluster are available now
    observedGeneration: 1
    reason: ResourceAvailable
    status: "True"
    type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available

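The previous output indicates that the test-ns namespace didn't yet exist on the hub cluster, and it shows the following ClusterResourcePlacement condition statuses:

  • The ClusterResourcePlacementScheduled condition status shows as False because the specified policy aims to pick three clusters, but the scheduler can place resources in only the two clusters that are currently available and joined.
  • The ClusterResourcePlacementRolloutStarted condition status shows as True because the rollout process has started for the two selected clusters.
  • The ClusterResourcePlacementOverridden condition status shows as True because no override rules are configured for the selected resources.
  • The ClusterResourcePlacementWorkSynchronized condition status shows as True.
  • The ClusterResourcePlacementApplied condition status shows as True.
  • The ClusterResourcePlacementAvailable condition status shows as True.

To make sure that the namespace is propagated to the relevant clusters, create the test-ns namespace on the hub cluster.

ClusterResourcePlacement status after the test-ns namespace is created on the hub cluster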

status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The rollout is being blocked by the rollout strategy in 2 cluster(s)
    observedGeneration: 1
    reason: RolloutNotStartedYet
    status: "False"
    type: ClusterResourcePlacementRolloutStarted
  observedResourceIndex: "1"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  selectedResources:
  - kind: Namespace
    name: test-ns
    version: v1

In the previous output, the ClusterResourcePlacementScheduled condition status shows as False. The ClusterResourcePlacementRolloutStarted condition status also shows as False, with the message: The rollout is being blocked by the rollout strategy in 2 cluster(s)

Check the latest ClusterResourceSnapshot by running the command in How can I find the latest ClusterResourceBinding resource?

Latest ClusterResourceSnapshot

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 72344be6e268bc7af29d75b7f0aad588d341c228801aab50d6f9f5fc33dd9c7c
  creationTimestamp: "2024-05-07T23:13:51Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: crp-3
    kubernetes-fleet.io/resource-index: "1"
  name: crp-3-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: crp-3
    uid: b4f31b9a-971a-480d-93ac-93f093ee661f
  resourceVersion: "14434"
  uid: 85ee0e81-92c9-4362-932b-b0bf57d78e3f
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-ns
      name: test-ns
    spec:
      finalizers:
      - kubernetes

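In the ClusterResourceSnapshot spec, the selectedResources section now shows the test-ns namespace.

Check whether the ClusterResourceBinding for kind-cluster-1 was updated after the test-ns namespace was created. For more information, see How can I find the latest ClusterResourceBinding resource?

ClusterResourceBinding for kind-cluster-1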

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceBinding
metadata:
  creationTimestamp: "2024-05-07T23:08:53Z"
  finalizers:
  - kubernetes-fleet.io/work-cleanup
  generation: 2
  labels:
    kubernetes-fleet.io/parent-CRP: crp-3
  name: crp-3-kind-cluster-1-7114c253
  resourceVersion: "14438"
  uid: 0db4e480-8599-4b40-a1cc-f33bcb24b1a7
spec:
  applyStrategy:
    type: ClientSideApply
  clusterDecision:
    clusterName: kind-cluster-1
    clusterScore:
      affinityScore: 0
      priorityScore: 0
    reason: picked by scheduling policy
    selected: true
  resourceSnapshotName: crp-3-0-snapshot
  schedulingPolicySnapshotName: crp-3-0
  state: Bound
  targetCluster: kind-cluster-1
status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The resources cannot be updated to the latest because of the rollout
      strategy
    observedGeneration: 2
    reason: RolloutNotStartedYet
    status: "False"
    type: RolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 2
    reason: NoOverrideSpecified
    status: "True"
    type: Overridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All of the works are synchronized to the latest
    observedGeneration: 2
    reason: AllWorkSynced
    status: "True"
    type: WorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are applied
    observedGeneration: 2
    reason: AllWorkHaveBeenApplied
    status: "True"
    type: Applied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are available
    observedGeneration: 2
    reason: AllWorkAreAvailable
    status: "True"
    type: Available

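The ClusterResourceBinding remains unchanged. In the ClusterResourceBinding spec, resourceSnapshotName still references the old ClusterResourceSnapshot name (crp-3-0-snapshot instead of crp-3-1-snapshot). This issue occurs when the user provides no explicit RollingUpdate input, because the following default values are applied:

  • maxUnavailable is set to 25% × 3 (the desired number of clusters), rounded to 1
  • maxSurge is set to 25% × 3 (the desired number of clusters), rounded to 1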
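
The default arithmetic above can be reproduced in a few lines. This is a sketch, not the Fleet implementation: the helper name scaled_value is hypothetical, and rounding up is an assumption (this article states only that 25% × 3 is rounded to 1).

```python
def scaled_value(value, total):
    """Resolve an absolute count or a percentage string against a total.

    Percentages are rounded up (an assumption made for this sketch).
    """
    if isinstance(value, int):
        return value
    percent = int(value.rstrip("%"))
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-percent * total // 100)


desired_clusters = 3  # numberOfClusters in the example spec
max_unavailable = scaled_value("25%", desired_clusters)
max_surge = scaled_value("25%", desired_clusters)
print(max_unavailable, max_surge)
```

With three desired clusters, 25% works out to 0.75, so both values resolve to 1.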

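Why ClusterResourceBinding isn't updated

Initially, when the ClusterResourcePlacement was created, two ClusterResourceBindings were generated. Because the RollingUpdate constraints didn't block this initial phase, the ClusterResourcePlacementRolloutStarted condition was set to True.

When the test-ns namespace was created on the hub cluster, the rollout controller tried to update the two existing ClusterResourceBindings. However, because one member cluster is missing, the maxUnavailable value of 1 makes the RollingUpdate configuration too strict.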

Note

During the update, if one of the bindings fails to apply, that also violates the RollingUpdate configuration because maxUnavailable is set to 1.
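
To see why neither binding can be updated, it can help to model the rolling update budget. The following sketch assumes a simplified rule that this article doesn't confirm: at least (desired clusters minus maxUnavailable) bindings must stay available at all times, so the number of bindings that can be updated at once is the number currently available minus that minimum. The function name updatable_bindings is hypothetical.

```python
def updatable_bindings(desired_clusters, max_unavailable, available_bindings):
    """Number of available bindings that may be updated at once (simplified model)."""
    min_available = desired_clusters - max_unavailable
    return max(0, available_bindings - min_available)


# Case study: 3 desired clusters, only 2 joined, default maxUnavailable of 1.
blocked = updatable_bindings(desired_clusters=3, max_unavailable=1, available_bindings=2)

# Same fleet with maxUnavailable relaxed to 2.
relaxed = updatable_bindings(desired_clusters=3, max_unavailable=2, available_bindings=2)

# Default setting, but with a third cluster joined and its binding available.
third_cluster = updatable_bindings(desired_clusters=3, max_unavailable=1, available_bindings=3)

print(blocked, relaxed, third_cluster)
```

Under this model, the default setting leaves no room to update either binding, which is consistent with the RolloutNotStartedYet reason in the status output. Raising maxUnavailable or joining a third cluster each frees one binding at a time.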

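Resolution

In this situation, to resolve the issue, consider manually setting maxUnavailable to a value greater than 1 to relax the RollingUpdate configuration. Alternatively, you can join a third member cluster.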
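
One way to apply the first option is a merge patch on the ClusterResourcePlacement. The following sketch only builds the patch body. It rests on several assumptions: the setting is assumed to live at spec.strategy.rollingUpdate.maxUnavailable, the placement name crp-3 is taken from the labels in the outputs in this article, the value 2 is just one example of "greater than 1", and the kubectl invocation in the comment is one possible way to submit the patch.

```python
import json

# Merge-patch body that relaxes the rolling update setting (assumed field path).
patch = {
    "spec": {
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxUnavailable": 2},
        }
    }
}

patch_json = json.dumps(patch)
print(patch_json)

# Possible use (assumed), substituting the printed JSON for <patch_json>:
#   kubectl patch clusterresourceplacement crp-3 --type merge -p '<patch_json>'
```

After the change is applied, check the ClusterResourcePlacementRolloutStarted condition again to confirm that the rollout is no longer blocked.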

Contact us for help

If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to the Azure feedback community.