Resource propagation failure: ClusterResourcePlacementRolloutStarted is False

08/06/2024

This article describes how to troubleshoot ClusterResourcePlacementRolloutStarted issues when you propagate resources by using the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager.

Symptoms

When you use the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager to propagate resources, the selected resources aren't rolled out to all scheduled clusters, and the ClusterResourcePlacementRolloutStarted condition status is shown as False.

Note

To get more information about why the rollout doesn't start, you can check the rollout controller logs.

Cause

The rollout strategy of the cluster resource placement is blocked because the RollingUpdate configuration is too strict.

Troubleshooting steps

In the ClusterResourcePlacement status section, check placementStatuses to identify the clusters whose RolloutStarted status is set to False.
Locate the corresponding ClusterResourceBinding for the identified cluster. For more information, see How can I find the latest ClusterResourceBinding resource? This resource should indicate the Work status (whether it was created or updated).
Check the values of maxUnavailable and maxSurge to make sure that they match your expectations.

Case study

In the following example, the ClusterResourcePlacement tries to propagate a namespace to three member clusters. However, when the ClusterResourcePlacement was first created, the namespace didn't exist on the hub cluster, and the fleet currently consists of two member clusters named kind-cluster-1 and kind-cluster-2.

ClusterResourcePlacement spec

spec:
  policy:
    numberOfClusters: 3
    placementType: PickN
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: test-ns
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate

ClusterResourcePlacement status

status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All 2 cluster(s) start rolling out the latest resource
    observedGeneration: 1
    reason: RolloutStarted
    status: "True"
    type: ClusterResourcePlacementRolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 1
    reason: NoOverrideSpecified
    status: "True"
    type: ClusterResourcePlacementOverridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: Works(s) are successfully created or updated in the 2 target clusters'
      namespaces
    observedGeneration: 1
    reason: WorkSynchronized
    status: "True"
    type: ClusterResourcePlacementWorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources are successfully applied to 2 clusters
    observedGeneration: 1
    reason: ApplySucceeded
    status: "True"
    type: ClusterResourcePlacementApplied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources in 2 cluster are available now
    observedGeneration: 1
    reason: ResourceAvailable
    status: "True"
    type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available

The preceding output indicates that the namespace test-ns never existed on the hub cluster, and it shows the following ClusterResourcePlacement condition statuses:

The ClusterResourcePlacementScheduled condition status is shown as False because the specified policy aims to pick three clusters, but the scheduler can place resources on only the two clusters that are currently available and joined.
The ClusterResourcePlacementRolloutStarted condition status is shown as True because the rollout process started with the two selected clusters.
The ClusterResourcePlacementOverridden condition status is shown as True because no override rules are configured for the selected resources.
The ClusterResourcePlacementWorkSynchronized condition status is shown as True.
The ClusterResourcePlacementApplied condition status is shown as True.
The ClusterResourcePlacementAvailable condition status is shown as True.

To make sure that the namespace is propagated seamlessly across the relevant clusters, create the test-ns namespace on the hub cluster.

ClusterResourcePlacement status after the test-ns namespace is created on the hub cluster

status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The rollout is being blocked by the rollout strategy in 2 cluster(s)
    observedGeneration: 1
    reason: RolloutNotStartedYet
    status: "False"
    type: ClusterResourcePlacementRolloutStarted
  observedResourceIndex: "1"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  selectedResources:
  - kind: Namespace
    name: test-ns
    version: v1

In the previous output, the ClusterResourcePlacementScheduled condition status is shown as False. The ClusterResourcePlacementRolloutStarted status is also shown as False, with the message: The rollout is being blocked by the rollout strategy in 2 cluster(s).

Check the latest ClusterResourceSnapshot by running the command described in "How can I find the latest ClusterResourceBinding resource?"

Latest ClusterResourceSnapshot

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 72344be6e268bc7af29d75b7f0aad588d341c228801aab50d6f9f5fc33dd9c7c
  creationTimestamp: "2024-05-07T23:13:51Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: crp-3
    kubernetes-fleet.io/resource-index: "1"
  name: crp-3-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: crp-3
    uid: b4f31b9a-971a-480d-93ac-93f093ee661f
  resourceVersion: "14434"
  uid: 85ee0e81-92c9-4362-932b-b0bf57d78e3f
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-ns
      name: test-ns
    spec:
      finalizers:
      - kubernetes

In the ClusterResourceSnapshot spec, the selectedResources section now shows the namespace test-ns.

Check the ClusterResourceBinding for kind-cluster-1 to see whether it was updated after the namespace test-ns was created. For more information, see How can I find the latest ClusterResourceBinding resource?

ClusterResourceBinding for kind-cluster-1

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceBinding
metadata:
  creationTimestamp: "2024-05-07T23:08:53Z"
  finalizers:
  - kubernetes-fleet.io/work-cleanup
  generation: 2
  labels:
    kubernetes-fleet.io/parent-CRP: crp-3
  name: crp-3-kind-cluster-1-7114c253
  resourceVersion: "14438"
  uid: 0db4e480-8599-4b40-a1cc-f33bcb24b1a7
spec:
  applyStrategy:
    type: ClientSideApply
  clusterDecision:
    clusterName: kind-cluster-1
    clusterScore:
      affinityScore: 0
      priorityScore: 0
    reason: picked by scheduling policy
    selected: true
  resourceSnapshotName: crp-3-0-snapshot
  schedulingPolicySnapshotName: crp-3-0
  state: Bound
  targetCluster: kind-cluster-1
status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The resources cannot be updated to the latest because of the rollout
      strategy
    observedGeneration: 2
    reason: RolloutNotStartedYet
    status: "False"
    type: RolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 2
    reason: NoOverrideSpecified
    status: "True"
    type: Overridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All of the works are synchronized to the latest
    observedGeneration: 2
    reason: AllWorkSynced
    status: "True"
    type: WorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are applied
    observedGeneration: 2
    reason: AllWorkHaveBeenApplied
    status: "True"
    type: Applied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are available
    observedGeneration: 2
    reason: AllWorkAreAvailable
    status: "True"
    type: Available

The ClusterResourceBinding remains unchanged. In the ClusterResourceBinding spec, resourceSnapshotName still references the old ClusterResourceSnapshot name (crp-3-0-snapshot instead of crp-3-1-snapshot). This issue occurs when the user provides no explicit RollingUpdate input, because the default values are applied:

The maxUnavailable value is configured as 25% × 3 (the desired number), rounded to 1.
The maxSurge value is configured as 25% × 3 (the desired number), rounded to 1.
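
For illustration, the defaults above correspond roughly to the following explicit strategy block. This is a sketch only: it assumes that the rollingUpdate settings sit under spec.strategy and accept percentage strings. The case-study spec doesn't spell these fields out, so check the field names against the API version that you use.

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # Assumed explicit form of the defaults; not present in the case-study spec.
      maxUnavailable: 25%   # 25% x 3 desired clusters, rounded to 1
      maxSurge: 25%         # 25% x 3 desired clusters, rounded to 1
```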

Why ClusterResourceBinding isn't updated

Initially, when the ClusterResourcePlacement was created, two ClusterResourceBindings were generated. However, because the rollout strategy didn't apply to this initial phase, the ClusterResourcePlacementRolloutStarted condition was set to True.

When the test-ns namespace was created on the hub cluster, the rollout controller tried to update the two existing ClusterResourceBindings. However, because of the lack of member clusters, maxUnavailable was set to 1, which made the RollingUpdate configuration too strict.

Note

During the update, if one of the bindings can't be applied, it also violates the RollingUpdate configuration, in which maxUnavailable is set to 1.

Resolution

To resolve this issue in this situation, manually set maxUnavailable to a value that's greater than 1 to relax the RollingUpdate configuration. Alternatively, you can join a third member cluster.
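
A minimal sketch of the first option follows. It extends the ClusterResourcePlacement spec from the case study and assumes that the rollingUpdate settings sit under spec.strategy. The value 2 is only an illustrative choice of a value greater than 1, not a recommendation.

```yaml
spec:
  policy:
    numberOfClusters: 3
    placementType: PickN
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # Illustrative value: anything greater than 1 relaxes the default of 1.
      maxUnavailable: 2
```

A larger maxUnavailable lets more clusters be updated at the same time, so choose a value that fits how much simultaneous unavailability your workloads can tolerate.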

Contact us for help

If you have questions or need help, create a support request with Azure support, or ask Azure community support. You can also submit product feedback to the Azure feedback community.
