Resource propagation failure: ClusterResourcePlacementRolloutStarted is false
This article describes how to troubleshoot ClusterResourcePlacementRolloutStarted issues when you propagate resources by using the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager.
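Symptoms
When you use the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager to propagate resources, the selected resources aren't rolled out in all scheduled clusters, and the ClusterResourcePlacementRolloutStarted condition status is shown as False.
Note
To get more information about why the rollout didn't start, you can check the rollout controller log.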
Cause
The ClusterResourcePlacement rollout strategy is blocked because the RollingUpdate configuration is too strict.
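Troubleshooting steps
- In the status section of the ClusterResourcePlacement, check placementStatuses to identify the clusters whose RolloutStarted status is set to False.
- Locate the corresponding ClusterResourceBinding for each identified cluster. For more information, see "How can I find the latest ClusterResourceBinding resource?" This resource should indicate the Work status (whether it was created or updated).
- Verify the values of maxUnavailable and maxSurge to make sure that they match your expectations.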
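The first step can be automated once the ClusterResourcePlacement object is available as JSON. The following is a minimal sketch, assuming the object has been loaded into a Python dict (for example, from the JSON output of kubectl). The helper name clusters_with_blocked_rollout and the trimmed sample data are hypothetical and exist only for illustration.

```python
# Hypothetical helper: list the clusters whose RolloutStarted condition is "False".
# `crp` is assumed to be a ClusterResourcePlacement object loaded as a dict.

def clusters_with_blocked_rollout(crp):
    blocked = []
    for placement in crp.get("status", {}).get("placementStatuses", []):
        for condition in placement.get("conditions", []):
            if (condition.get("type") == "RolloutStarted"
                    and condition.get("status") == "False"):
                blocked.append(placement.get("clusterName"))
    return blocked


# Trimmed, illustrative sample shaped like a ClusterResourcePlacement status.
sample_crp = {
    "status": {
        "placementStatuses": [
            {
                "clusterName": "kind-cluster-2",
                "conditions": [
                    {"type": "Scheduled", "status": "True"},
                    {"type": "RolloutStarted", "status": "False"},
                ],
            },
            {
                "clusterName": "kind-cluster-1",
                "conditions": [
                    {"type": "Scheduled", "status": "True"},
                    {"type": "RolloutStarted", "status": "False"},
                ],
            },
        ]
    }
}

print(clusters_with_blocked_rollout(sample_crp))
# prints ['kind-cluster-2', 'kind-cluster-1']
```

Each cluster name that the helper returns is a candidate for the second step, where you inspect its ClusterResourceBinding.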
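Case study
In the following example, the ClusterResourcePlacement tries to propagate a namespace to three member clusters. However, when the ClusterResourcePlacement is initially created, the namespace doesn't exist on the hub cluster, and the fleet currently contains only two member clusters, named kind-cluster-1 and kind-cluster-2.
ClusterResourcePlacement specification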
spec:
  policy:
    numberOfClusters: 3
    placementType: PickN
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: test-ns
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
ClusterResourcePlacement status
status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All 2 cluster(s) start rolling out the latest resource
    observedGeneration: 1
    reason: RolloutStarted
    status: "True"
    type: ClusterResourcePlacementRolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 1
    reason: NoOverrideSpecified
    status: "True"
    type: ClusterResourcePlacementOverridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: Works(s) are successfully created or updated in the 2 target clusters'
      namespaces
    observedGeneration: 1
    reason: WorkSynchronized
    status: "True"
    type: ClusterResourcePlacementWorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources are successfully applied to 2 clusters
    observedGeneration: 1
    reason: ApplySucceeded
    status: "True"
    type: ClusterResourcePlacementApplied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources in 2 cluster are available now
    observedGeneration: 1
    reason: ResourceAvailable
    status: "True"
    type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available
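The previous output indicates that the test-ns namespace never existed on the hub cluster, and it shows the following ClusterResourcePlacement condition statuses:
- The ClusterResourcePlacementScheduled condition status is shown as False because the specified policy aims to pick three clusters, but the scheduler can place resources in only the two clusters that are currently available and joined.
- The ClusterResourcePlacementRolloutStarted condition status is shown as True because the rollout process has started for the two selected clusters.
- The ClusterResourcePlacementOverridden condition status is shown as True because no override rules are configured for the selected resources.
- The ClusterResourcePlacementWorkSynchronized condition status is shown as True.
- The ClusterResourcePlacementApplied condition status is shown as True.
- The ClusterResourcePlacementAvailable condition status is shown as True.
To make sure that the namespace is propagated seamlessly across the relevant clusters, create the test-ns namespace on the hub cluster.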
ClusterResourcePlacement status after the "test-ns" namespace is created on the hub cluster
status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The rollout is being blocked by the rollout strategy in 2 cluster(s)
    observedGeneration: 1
    reason: RolloutNotStartedYet
    status: "False"
    type: ClusterResourcePlacementRolloutStarted
  observedResourceIndex: "1"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  selectedResources:
  - kind: Namespace
    name: test-ns
    version: v1
In the previous output, the ClusterResourcePlacementScheduled condition status is shown as False. The ClusterResourcePlacementRolloutStarted status is also shown as False, with the message: The rollout is being blocked by the rollout strategy in 2 cluster(s).
Check the latest ClusterResourceSnapshot by running the command in "How can I find the latest ClusterResourceSnapshot resource?"
Latest ClusterResourceSnapshot
apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 72344be6e268bc7af29d75b7f0aad588d341c228801aab50d6f9f5fc33dd9c7c
  creationTimestamp: "2024-05-07T23:13:51Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: crp-3
    kubernetes-fleet.io/resource-index: "1"
  name: crp-3-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: crp-3
    uid: b4f31b9a-971a-480d-93ac-93f093ee661f
  resourceVersion: "14434"
  uid: 85ee0e81-92c9-4362-932b-b0bf57d78e3f
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-ns
      name: test-ns
    spec:
      finalizers:
      - kubernetes
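In the ClusterResourceSnapshot specification, the selectedResources section now shows the test-ns namespace.
Check the ClusterResourceBinding for kind-cluster-1 to see whether it was updated after the test-ns namespace was created. For more information, see "How can I find the latest ClusterResourceBinding resource?"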
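One way to make this check mechanical is to compare the snapshot name that the binding references with the name of the latest snapshot. The following is a minimal sketch, assuming both objects have been loaded into Python dicts. The helper name binding_is_stale is hypothetical, and the sample values are copied from the case study objects (crp-3-0-snapshot in the binding, crp-3-1-snapshot as the latest snapshot).

```python
# Hypothetical helper: a binding is considered stale when it still references
# a resource snapshot other than the latest one.

def binding_is_stale(binding, latest_snapshot):
    bound_name = binding["spec"]["resourceSnapshotName"]
    latest_name = latest_snapshot["metadata"]["name"]
    return bound_name != latest_name


# Trimmed objects that use the names from this case study.
binding = {
    "metadata": {"name": "crp-3-kind-cluster-1-7114c253"},
    "spec": {"resourceSnapshotName": "crp-3-0-snapshot"},
}
latest_snapshot = {
    "metadata": {
        "name": "crp-3-1-snapshot",
        "labels": {"kubernetes-fleet.io/is-latest-snapshot": "true"},
    }
}

print(binding_is_stale(binding, latest_snapshot))
# prints True
```

A result of True means that the binding hasn't been moved to the latest snapshot, which matches a rollout that hasn't started for that cluster.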
ClusterResourceBinding for kind-cluster-1
apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceBinding
metadata:
  creationTimestamp: "2024-05-07T23:08:53Z"
  finalizers:
  - kubernetes-fleet.io/work-cleanup
  generation: 2
  labels:
    kubernetes-fleet.io/parent-CRP: crp-3
  name: crp-3-kind-cluster-1-7114c253
  resourceVersion: "14438"
  uid: 0db4e480-8599-4b40-a1cc-f33bcb24b1a7
spec:
  applyStrategy:
    type: ClientSideApply
  clusterDecision:
    clusterName: kind-cluster-1
    clusterScore:
      affinityScore: 0
      priorityScore: 0
    reason: picked by scheduling policy
    selected: true
  resourceSnapshotName: crp-3-0-snapshot
  schedulingPolicySnapshotName: crp-3-0
  state: Bound
  targetCluster: kind-cluster-1
status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The resources cannot be updated to the latest because of the rollout
      strategy
    observedGeneration: 2
    reason: RolloutNotStartedYet
    status: "False"
    type: RolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 2
    reason: NoOverrideSpecified
    status: "True"
    type: Overridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All of the works are synchronized to the latest
    observedGeneration: 2
    reason: AllWorkSynced
    status: "True"
    type: WorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are applied
    observedGeneration: 2
    reason: AllWorkHaveBeenApplied
    status: "True"
    type: Applied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are available
    observedGeneration: 2
    reason: AllWorkAreAvailable
    status: "True"
    type: Available
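The ClusterResourceBinding remains unchanged. In the ClusterResourceBinding specification, resourceSnapshotName still references the old ClusterResourceSnapshot name. This issue occurs when the user provides no explicit RollingUpdate input, because the default values are applied:
- maxUnavailable is set to 25% × 3 (the desired number of clusters), rounded to 1.
- maxSurge is set to 25% × 3 (the desired number of clusters), rounded to 1.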
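The arithmetic behind these defaults can be reproduced in a few lines. The following is a minimal sketch. The helper name resolve_budget is hypothetical, and the choice to round fractional results up is an assumption (this article states only that 25% × 3 is rounded to 1, which ordinary rounding also produces).

```python
import math


def resolve_budget(value, desired_clusters):
    """Convert an absolute number or a percentage string into a cluster count."""
    if isinstance(value, str) and value.endswith("%"):
        fraction = float(value[:-1]) / 100
        # Assumption: fractional results are rounded up (0.75 becomes 1).
        return math.ceil(fraction * desired_clusters)
    return int(value)


desired_clusters = 3  # numberOfClusters in the case study specification
default_max_unavailable = resolve_budget("25%", desired_clusters)
default_max_surge = resolve_budget("25%", desired_clusters)

print(default_max_unavailable, default_max_surge)
# prints 1 1
```

With three desired clusters, both defaults resolve to 1, so at most one cluster can be unavailable at any time during a rolling update.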
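Why the ClusterResourceBinding isn't updated
Initially, when the ClusterResourcePlacement was created, two ClusterResourceBindings were generated. However, because the rollout strategy doesn't apply to the initial phase, the ClusterResourcePlacementRolloutStarted condition was set to True.
When the test-ns namespace was created on the hub cluster, the rollout controller tried to update the two existing ClusterResourceBindings. However, maxUnavailable is set to 1 and the fleet is one member cluster short of the desired three, so the RollingUpdate configuration is too strict to allow the update. In effect, the missing third cluster already uses up the budget of one unavailable cluster, and neither existing binding can be updated.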
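The blocking condition can be illustrated with a small calculation. The following is a simplified model, not the rollout controller's actual implementation. The function and parameter names are hypothetical, and the model assumes that the number of clusters that must stay available equals the desired number minus maxUnavailable.

```python
def updatable_bindings(desired_clusters, available_bindings, max_unavailable):
    """Return how many bindings can be updated at once in this simplified model."""
    # Clusters that must remain available throughout the rollout.
    min_available = desired_clusters - max_unavailable
    # Only the bindings above that floor can be taken down for an update.
    return max(0, available_bindings - min_available)


# Case study: 3 desired clusters, 2 joined clusters, default maxUnavailable of 1.
print(updatable_bindings(3, 2, 1))  # prints 0

# Same fleet with maxUnavailable relaxed to 2.
print(updatable_bindings(3, 2, 2))  # prints 1

# Default maxUnavailable with a third cluster joined and available.
print(updatable_bindings(3, 3, 1))  # prints 1
```

In this model, the case study leaves no binding that can be updated, which is consistent with the RolloutNotStartedYet reason reported for both clusters.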
Note
During the update, if one of the bindings fails to apply, that failure also violates the RollingUpdate configuration, because maxUnavailable is set to 1.
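Resolution
To resolve this issue in this scenario, consider manually setting maxUnavailable to a value greater than 1 to relax the RollingUpdate configuration. Alternatively, you can join a third member cluster.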
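As an illustration of the first option, the following sketch builds a merge patch that sets maxUnavailable to 2 and prints a kubectl patch command for it. The value 2 is only an example of "greater than 1". The placement name crp-3 is taken from the labels in the case study. The field path spec.strategy.rollingUpdate.maxUnavailable is an assumption based on the RollingUpdate strategy shown in the specification, so verify it against the API version installed on your hub cluster before you apply it.

```python
import json

# Placement name from the case study labels (kubernetes-fleet.io/parent-CRP).
crp_name = "crp-3"

# Assumed field path: spec.strategy.rollingUpdate.maxUnavailable.
patch = {
    "spec": {
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxUnavailable": 2},  # illustrative value > 1
        }
    }
}

patch_json = json.dumps(patch)

# Print the command instead of running it, so that it can be reviewed first.
print(
    f"kubectl patch clusterresourceplacement {crp_name} "
    f"--type merge -p '{patch_json}'"
)
```

Printing the command keeps the sketch free of side effects. Run the printed command against the hub cluster only after you confirm that the field names match your environment.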
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to the Azure feedback community.