资源传播失败:ClusterResourcePlacementApplied 为 False
本文讨论如何在 Microsoft azure Kubernetes Fleet Manager 中使用ClusterResourcePlacement
对象 API 传播资源时排查ClusterResourcePlacementApplied
问题。
现象
使用 ClusterResourcePlacement
Azure Kubernetes Fleet Manager 中的 API 对象传播资源时,部署将失败。 状态 ClusterResourcePlacementApplied
显示为 False
。
原因
此问题可能是由于以下原因之一导致的:
- 该资源已存在于群集上,并且不受机群控制器管理。 若要解决此问题,请
ClusterResourcePlacement
更新清单 YAML 文件以在内部ApplyStrategy
使用AllowCoOwnership
,以允许机群控制器管理资源。 - 另一个
ClusterResourcePlacement
部署已使用不同的应用策略来管理所选群集的资源。 - 由于语法错误或资源配置无效,部署
ClusterResourcePlacement
不会应用清单。 如果资源通过信封对象传播,则也可能发生这种情况。
疑难解答步骤
ClusterResourcePlacement
查看状态并找到placementStatuses
分区。 检查该值placementStatuses
以标识哪些群集已ResourceApplied
将False
条件设置为,并记下其clusterName
值。- 在
Work
中心群集中找到对象。 使用标识clusterName
查找Work
与成员群集关联的对象。 有关详细信息,请参阅 如何查找与ClusterResourcePlacement
该资源关联的正确工作资源。 - 检查对象的状态
Work
,了解阻止成功资源应用程序的特定问题。
案例研究
在以下示例中, ClusterResourcePlacement
尝试将包含部署的命名空间传播到两个成员群集。 但是,命名空间已存在于一个成员群集上,具体而言 kind-cluster-1
。
ClusterResourcePlacement 规范
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
name: crp
spec:
policy:
clusterNames:
- kind-cluster-1
- kind-cluster-2
placementType: PickFixed
resourceSelectors:
- group: ""
kind: Namespace
name: test-ns
version: v1
revisionHistoryLimit: 10
strategy:
type: RollingUpdate
ClusterResourcePlacement 状态
status:
conditions:
- lastTransitionTime: "2024-05-07T23:32:40Z"
message:couldn't find all the clusters needed as specified by the scheduling
policy
observedGeneration: 1
reason: SchedulingPolicyUnfulfilled
status: "False"
type: ClusterResourcePlacementScheduled
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: All 2 cluster(s) start rolling out the latest resource
observedGeneration: 1
reason: RolloutStarted
status: "True"
type: ClusterResourcePlacementRolloutStarted
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: No override rules are configured for the selected resources
observedGeneration: 1
reason: NoOverrideSpecified
status: "True"
type: ClusterResourcePlacementOverridden
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Works(s) are succcesfully created or updated in the 2 target clusters'
namespaces
observedGeneration: 1
reason: WorkSynchronized
status: "True"
type: ClusterResourcePlacementWorkSynchronized
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Failed to apply resources to 1 clusters, please check the `failedPlacements`
status
observedGeneration: 1
reason: ApplyFailed
status: "False"
type: ClusterResourcePlacementApplied
observedResourceIndex: "0"
placementStatuses:
- clusterName: kind-cluster-2
conditions:
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
score: 0, topology spread score: 0): picked by scheduling policy'
observedGeneration: 1
reason: Scheduled
status: "True"
type: Scheduled
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Detected the new changes on the resources and started the rollout process
observedGeneration: 1
reason: RolloutStarted
status: "True"
type: RolloutStarted
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: No override rules are configured for the selected resources
observedGeneration: 1
reason: NoOverrideSpecified
status: "True"
type: Overridden
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: All of the works are synchronized to the latest
observedGeneration: 1
reason: AllWorkSynced
status: "True"
type: WorkSynchronized
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: All corresponding work objects are applied
observedGeneration: 1
reason: AllWorkHaveBeenApplied
status: "True"
type: Applied
- lastTransitionTime: "2024-05-07T23:32:49Z"
message: The availability of work object crp-4-work isn't trackable
observedGeneration: 1
reason: WorkNotTrackable
status: "True"
type: Available
- clusterName: kind-cluster-1
conditions:
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
score: 0, topology spread score: 0): picked by scheduling policy'
observedGeneration: 1
reason: Scheduled
status: "True"
type: Scheduled
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Detected the new changes on the resources and started the rollout process
observedGeneration: 1
reason: RolloutStarted
status: "True"
type: RolloutStarted
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: No override rules are configured for the selected resources
observedGeneration: 1
reason: NoOverrideSpecified
status: "True"
type: Overridden
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: All of the works are synchronized to the latest
observedGeneration: 1
reason: AllWorkSynced
status: "True"
type: WorkSynchronized
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Work object crp-4-work isn't applied
observedGeneration: 1
reason: NotAllWorkHaveBeenApplied
status: "False"
type: Applied
failedPlacements:
- condition:
lastTransitionTime: "2024-05-07T23:32:40Z"
message: 'Failed to apply manifest: failed to process the request due to a
client error: resource exists and isn't managed by the fleet controller
and co-ownernship is disallowed'
reason: ManifestsAlreadyOwnedByOthers
status: "False"
type: Applied
kind: Namespace
name: test-ns
version: v1
selectedResources:
- kind: Namespace
name: test-ns
version: v1
- group: apps
kind: Deployment
name: test-nginx
namespace: test-ns
version: v1
在 failedPlacements
部分中 kind-cluster-1
,这些 message
字段解释了为何未在成员群集上应用资源。 在上conditions
一部分中,条件Applied
kind-cluster-1
标记为false
并显示NotAllWorkHaveBeenApplied
原因。 这表示 Work
未应用针对成员群集 kind-cluster-1
的对象。 有关详细信息,请参阅 如何查找与 ClusterResourcePlacement
该资源关联的正确工作资源。
类型群集-1 的工作状态
status:
conditions:
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: 'Apply manifest {Ordinal:0 Group: Version:v1 Kind:Namespace Resource:namespaces
Namespace: Name:test-ns} failed'
observedGeneration: 1
reason: WorkAppliedFailed
status: "False"
type: Applied
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: ""
observedGeneration: 1
reason: WorkAppliedFailed
status: Unknown
type: Available
manifestConditions:
- conditions:
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: 'Failed to apply manifest: failed to process the request due to a client
error: resource exists and isn't managed by the fleet controller and co-ownernship
is disallowed'
reason: ManifestsAlreadyOwnedByOthers
status: "False"
type: Applied
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Manifest isn't applied yet
reason: ManifestApplyFailed
status: Unknown
type: Available
identifier:
kind: Namespace
name: test-ns
ordinal: 0
resource: namespaces
version: v1
- conditions:
- lastTransitionTime: "2024-05-07T23:32:40Z"
message: Manifest is already up to date
observedGeneration: 1
reason: ManifestAlreadyUpToDate
status: "True"
type: Applied
- lastTransitionTime: "2024-05-07T23:32:51Z"
message: Manifest is trackable and available now
observedGeneration: 1
reason: ManifestAvailable
status: "True"
type: Available
identifier:
group: apps
kind: Deployment
name: test-nginx
namespace: test-ns
ordinal: 1
resource: deployments
version: v1
Work
检查状态,尤其是manifestConditions
分区。 可以看到无法应用命名空间,但命名空间中的部署已从中心传播到成员群集。
解决方法
在这种情况下,潜在的解决方案是在 ApplyStrategy 策略中设置 AllowCoOwnership
该策略 true
。 但是,请务必注意,此决定应由用户做出,因为资源可能不共享。
此外,还可以查看“应用工作控制器”的日志,以更深入地了解资源不可用的原因。
联系我们寻求帮助
如果你有任何疑问或需要帮助,请创建支持请求或联系 Azure 社区支持。 你还可以将产品反馈提交到 Azure 反馈社区。