Monitor ingress-nginx controller metrics in the application routing add-on with Prometheus in Grafana (preview)

Article
08/02/2024

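The ingress-nginx controller in the application routing add-on exposes many metrics for requests, the nginx process, and the controller itself. These metrics can help you analyze the performance and usage of your application.

The application routing add-on exposes a Prometheus metrics endpoint at /metrics on port 10254.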

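Important

AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and they are excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer support on a best-effort basis. As such, these features are not meant for production use. For more information, see the following support articles: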

Prerequisites

An Azure Kubernetes Service (AKS) cluster with the application routing add-on enabled.
A Prometheus instance, such as Azure Monitor managed service for Prometheus.
A Grafana instance, such as Azure Managed Grafana.

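Validating the metrics endpoint

To validate that metrics are being collected, you can set up a port forward to one of the ingress-nginx controller pods. First, list the pods in the app-routing-system namespace: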

kubectl get pods -n app-routing-system

NAME                            READY   STATUS    RESTARTS   AGE
external-dns-667d54c44b-jmsxm   1/1     Running   0          4d6h
nginx-657bb8cdcf-qllmx          1/1     Running   0          4d6h
nginx-657bb8cdcf-wgcr7          1/1     Running   0          4d6h

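Now forward a local port to port 10254 on one of the nginx pods. Leaving the local port empty before the colon lets kubectl pick a free local port.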

kubectl port-forward nginx-657bb8cdcf-qllmx -n app-routing-system :10254

Forwarding from 127.0.0.1:43307 -> 10254
Forwarding from [::1]:43307 -> 10254

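Note the local port (43307 in this example) and open http://localhost:43307/metrics in your browser. The page should display the ingress-nginx controller metrics.

Screenshot of the Prometheus metrics in the browser.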
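
As a command-line alternative to the browser check, the sketch below lists the metric families that the endpoint returns. The helper name list_metric_families is hypothetical (it is not part of kubectl or the add-on); it relies only on the "# TYPE <name> <type>" lines of the Prometheus text exposition format.

```shell
# Hypothetical helper: list the distinct metric family names found in
# Prometheus text exposition output read from stdin.
list_metric_families() {
  # Each metric family is declared by a line of the form "# TYPE <name> <type>".
  awk '$1 == "#" && $2 == "TYPE" { print $3 }' | sort -u
}
```

While the port forward is still running, curl -s http://localhost:43307/metrics | list_metric_families should print one family name per line. Metrics specific to the controller typically start with nginx_ingress_controller_.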

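You can now terminate the port-forward process to close the forwarding.

Configuring Azure Monitor managed service for Prometheus and Azure Managed Grafana with Container insights

Azure Monitor managed service for Prometheus is a fully managed, Prometheus-compatible service that supports industry-standard features such as PromQL, Grafana dashboards, and Prometheus alerts. This service requires configuring the metrics add-on for the Azure Monitor agent, which sends data to Prometheus. If your cluster isn't configured with the add-on, you can follow the article on configuring your Azure Kubernetes Service (AKS) cluster to send data to Azure Monitor managed service for Prometheus and to send the collected metrics to an Azure Managed Grafana instance.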

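Enable pod annotation based scraping

Once your cluster is updated with the Azure Monitor agent, you need to configure the agent to enable scraping based on pod annotations, which are added to the ingress-nginx pods. One way to configure this setting is in the ama-metrics-settings-configmap ConfigMap in the kube-system namespace.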
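
To see which scrape annotations a pod carries, one option is to filter the output of kubectl describe pod. The sketch below assumes that the pods use the conventional prometheus.io/ annotation prefix (for example prometheus.io/scrape and prometheus.io/port); the exact annotations set by the add-on are not listed in this article, and the helper name prometheus_annotations is hypothetical.

```shell
# Hypothetical helper: print only the "prometheus.io/<key>: <value>" pairs
# found in `kubectl describe pod` output read from stdin.
# Assumes the conventional prometheus.io/ annotation prefix.
prometheus_annotations() {
  grep -o 'prometheus\.io/[^ ]*: *[^ ]*'
}
```

For example, kubectl describe pod nginx-657bb8cdcf-qllmx -n app-routing-system | prometheus_annotations prints the matching annotations, or nothing if the pod has none.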

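Note

The configuration below replaces your existing ama-metrics-settings-configmap ConfigMap in kube-system. If you already have a configuration, you may want to back it up or merge it with this configuration.

You can back up an existing ama-metrics-settings-configmap ConfigMap, if one exists, by running kubectl get configmap ama-metrics-settings-configmap -n kube-system -o yaml > ama-metrics-settings-configmap-backup.yaml.

The following configuration sets the podannotationnamespaceregex parameter to .* to scrape all namespaces.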

kubectl apply -f - <<EOF
kind: ConfigMap
apiVersion: v1
metadata:
  name: ama-metrics-settings-configmap
  namespace: kube-system
data:
  schema-version:
    #string.used by agent to parse config. supported versions are {v1}. Configs with other schema versions will be rejected by the agent.
    v1
  config-version:
    #string.used by customer to keep track of this config file's version in their source control/repository (max allowed 10 chars, other chars will be truncated)
    ver1
  prometheus-collector-settings: |-
    cluster_alias = ""
  default-scrape-settings-enabled: |-
    kubelet = true
    coredns = false
    cadvisor = true
    kubeproxy = false
    apiserver = false
    kubestate = true
    nodeexporter = true
    windowsexporter = false
    windowskubeproxy = false
    kappiebasic = true
    prometheuscollectorhealth = false
  # Regex for which namespaces to scrape through pod annotation based scraping.
  # This is none by default. Use '.*' to scrape all namespaces of annotated pods.
  pod-annotation-based-scraping: |-
    podannotationnamespaceregex = ".*"
  default-targets-metrics-keep-list: |-
    kubelet = ""
    coredns = ""
    cadvisor = ""
    kubeproxy = ""
    apiserver = ""
    kubestate = ""
    nodeexporter = ""
    windowsexporter = ""
    windowskubeproxy = ""
    podannotations = ""
    kappiebasic = ""
    minimalingestionprofile = true
  default-targets-scrape-interval-settings: |-
    kubelet = "30s"
    coredns = "30s"
    cadvisor = "30s"
    kubeproxy = "30s"
    apiserver = "30s"
    kubestate = "30s"
    nodeexporter = "30s"
    windowsexporter = "30s"
    windowskubeproxy = "30s"
    kappiebasic = "30s"
    prometheuscollectorhealth = "30s"
    podannotations = "30s"
  debug-mode: |-
    enabled = false
EOF

After a few minutes, the ama-metrics pods in the kube-system namespace should restart and pick up the new configuration.
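
To confirm that the restart has completed, you can check that no ama-metrics pod is left in a state other than Running. The sketch below filters kubectl get pods output. The helper name pods_not_running is hypothetical, and it assumes the default column layout (NAME, READY, STATUS, ...) and that the agent pod names start with ama-metrics.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# names of pods that start with the given prefix and whose STATUS column is
# not "Running". Prints nothing when all matching pods are Running.
pods_not_running() {
  awk -v prefix="$1" 'NR > 1 && index($1, prefix) == 1 && $3 != "Running" { print $1 }'
}
```

Running kubectl get pods -n kube-system | pods_not_running ama-metrics and getting no output suggests that every matching pod is back in the Running state.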

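Review visualization of metrics in Azure Managed Grafana

Now that Azure Monitor managed service for Prometheus and Azure Managed Grafana are configured, access your Managed Grafana instance.

There are two official ingress-nginx dashboards that you can download and import into your Grafana instance:

Ingress-nginx controller dashboard
Request handling performance dashboard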

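Ingress-nginx controller dashboard

This dashboard gives you visibility into request volume, connections, success rates, config reloads, and configs out of sync. You can also use it to view the network IO pressure, memory, and CPU use of the ingress controller. Finally, it shows the P50, P95, and P99 percentile response times of your ingresses and their throughput.

You can download this dashboard from GitHub.

Screenshot of a browser showing the ingress-nginx dashboard on Grafana.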

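Request handling performance dashboard

This dashboard gives you visibility into the request handling performance of the different ingress upstream destinations, which are the application endpoints that the ingress controller forwards traffic to. It shows the P50, P95, and P99 percentiles of total request and upstream response times. You can also view aggregates of request errors and latency. Use this dashboard to review and improve the performance and scalability of your applications.

You can download this dashboard from GitHub.

Screenshot of a browser showing the ingress-nginx request handling performance dashboard on Grafana.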

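Importing a dashboard

To import a Grafana dashboard, expand the left menu and click "Import" under "Dashboards".

Screenshot of a browser showing the Grafana instance with "Import" under Dashboards highlighted.

Then upload the desired dashboard file and click "Load".

Screenshot of a browser showing the Grafana instance dashboard import dialog.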

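Next steps

You can configure scaling of your workloads using ingress metrics scraped with Prometheus through Kubernetes Event-driven Autoscaling (KEDA). Learn more about integrating KEDA with AKS.
Create and run a load test with Azure Load Testing to test workload performance and optimize the scalability of your applications.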