Ingest data from Telegraf into Azure Data Explorer

Article
10/13/2023

Important

This connector can be used in Real-Time Intelligence in Microsoft Fabric. Use the instructions in this article with the following exceptions:

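If required, create databases using the instructions in Create a KQL database.
If required, create tables using the instructions in Create an empty table.
Get query or ingestion URIs using the instructions in Copy URI.
Run queries in a KQL queryset.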

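Azure Data Explorer supports data ingestion from Telegraf. Telegraf is an open-source, lightweight agent with a minimal memory footprint for collecting, processing, and writing telemetry data, including logs, metrics, and IoT data. Telegraf supports hundreds of input and output plug-ins. It's widely used and well supported by the open-source community. The Azure Data Explorer output plug-in serves as the connector from Telegraf and supports ingestion of data from many types of input plug-ins into Azure Data Explorer.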

Prerequisites

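An Azure subscription. Create a free Azure account.
An Azure Data Explorer cluster and database. Create a cluster and database.
Telegraf. Host Telegraf in a virtual machine (VM) or container. Telegraf can be hosted locally, where the app or service being monitored is deployed, or remotely on a dedicated monitoring compute/container.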

Supported authentication methods

The plug-in supports the following authentication methods:

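Microsoft Entra applications with app keys or certificates
- For information on how to create and register an app in Microsoft Entra ID, see Register an application.
- For information about service principals, see Application and service principal objects in Microsoft Entra ID.
Microsoft Entra user tokens
- Allows the plug-in to authenticate like a user. We recommend using this method for development purposes only.
Azure Managed Service Identity (MSI) token
- This is the preferred authentication method if you're running Telegraf in a supporting Azure environment, such as Azure Virtual Machines.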

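Whichever method you use, the designated principal must be assigned the Database User role in Azure Data Explorer. This role allows the plug-in to create the tables required for ingesting data. If the plug-in is configured with create_tables=false, the designated principal must have at least the Database Ingestor role.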

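Configure authentication method

The plug-in checks for specific configurations of environment variables to determine which authentication method to use. The configurations are assessed in the specified order, and the first detected configuration is used. If a valid configuration isn't detected, the plug-in fails to authenticate.

To configure authentication for the plug-in, set the appropriate environment variables for your chosen authentication method: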

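Client credentials (Microsoft Entra application tokens): Microsoft Entra application ID and secret.
- AZURE_TENANT_ID: The Microsoft Entra tenant ID used for authentication.
- AZURE_CLIENT_ID: The client (application) ID of an app registration in the tenant.
- AZURE_CLIENT_SECRET: The client secret that was generated for the app registration.
Client certificate (Microsoft Entra application tokens): Microsoft Entra application ID and an X.509 certificate.
- AZURE_TENANT_ID: The Microsoft Entra tenant ID used for authentication.
- AZURE_CERTIFICATE_PATH: A path to a certificate and private key pair in PEM or PFX format, which can authenticate the app registration.
- AZURE_CERTIFICATE_PASSWORD: The password that was set for the certificate.
Resource owner password (Microsoft Entra user tokens): Microsoft Entra user and password. We don't recommend using this grant type. If you need an interactive sign-in, use device login.
- AZURE_TENANT_ID: The Microsoft Entra tenant ID used for authentication.
- AZURE_CLIENT_ID: The client (application) ID of an app registration in the tenant.
- AZURE_USERNAME: The username, also known as the UPN, of a Microsoft Entra user account.
- AZURE_PASSWORD: The password of the Microsoft Entra user account. Accounts with MFA enabled aren't supported.
Azure Managed Service Identity: Delegate management of credentials to the platform. This method requires that the code runs in Azure, for example, on a VM. All configuration is handled by Azure. For more information, see Azure Managed Service Identity. This method is only available when using Azure Resource Manager.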
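
The evaluation order described above can be pictured with a small Python sketch. It is only an illustration of "first detected configuration wins", not the plug-in's actual implementation. The function name `resolve_auth_method`, the returned labels, and the choice of which variables count as mandatory for each method are assumptions made for this example.

```python
from typing import Mapping


def resolve_auth_method(env: Mapping[str, str]) -> str:
    """Return a label for the first authentication configuration found in env.

    Hypothetical helper: it mirrors the order of the list above and treats
    the certificate password as optional, which is an assumption.
    """

    def has(*names: str) -> bool:
        # A variable counts as set only when it is present and non-empty.
        return all(env.get(name) for name in names)

    if has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"):
        return "client_credentials"
    if has("AZURE_TENANT_ID", "AZURE_CERTIFICATE_PATH"):
        return "client_certificate"
    if has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_USERNAME", "AZURE_PASSWORD"):
        return "resource_owner_password"
    # Nothing is read from the environment for a managed identity.
    # Whether this fallback succeeds depends on running inside Azure.
    return "managed_identity"
```

Calling `resolve_auth_method(os.environ)` before starting Telegraf is one way to check which set of variables would take precedence when several are defined at once.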

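Configure Telegraf

Telegraf is a configuration-driven agent. To get started, you must install Telegraf and configure the required input and output plug-ins. The default location of the configuration file is as follows:

For Windows: C:\Program Files\Telegraf\telegraf.conf
For Linux: /etc/telegraf/telegraf.conf

To enable the Azure Data Explorer output plug-in, you must uncomment the following section in the automatically generated configuration file: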

```
[[outputs.azure_data_explorer]]
  ## The URI property of the Azure Data Explorer resource on Azure
  ## ex: https://myadxresource.australiasoutheast.kusto.windows.net
  # endpoint_url = ""

  ## The Azure Data Explorer database that the metrics will be ingested into.
  ## The plugin will NOT generate this database automatically, it's expected that this database already exists before ingestion.
  ## ex: "exampledatabase"
  # database = ""

  ## Timeout for Azure Data Explorer operations, default value is 20 seconds
  # timeout = "20s"

  ## Type of metrics grouping used when ingesting to Azure Data Explorer
  ## Default value is "TablePerMetric" which means there will be one table for each metric
  # metrics_grouping_type = "TablePerMetric"

  ## Name of the single table to store all the metrics (Only needed if metrics_grouping_type is "SingleTable").
  # table_name = ""

  ## Creates tables and relevant mapping if set to true(default).
  ## Skips table and mapping creation if set to false, this is useful for running telegraf with the least possible access permissions i.e. table ingestor role.
  # create_tables = true
```

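Supported ingestion types

The plug-in supports managed (streaming) and queued (batching) ingestion. The default ingestion type is queued.

Important

To use managed ingestion, you must enable streaming ingestion on your cluster.

To configure the ingestion type for the plug-in, modify the automatically generated configuration file as follows: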

```
  ##  Ingestion method to use.
  ##  Available options are
  ##    - managed  --  streaming ingestion with fallback to batched ingestion or the "queued" method below
  ##    - queued   --  queue up metrics data and process sequentially
  # ingestion_type = "queued"
```
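
The constraints stated in the configuration comments can be checked before Telegraf is restarted. The following Python sketch is one possible pre-flight check, assuming the `[[outputs.azure_data_explorer]]` settings have already been loaded into a dictionary. The function name `check_adx_output` and the wording of the messages are made up for this example, and the accepted values are limited to the ones named in the sample configuration.

```python
from typing import Any, Dict, List


def check_adx_output(settings: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the output plug-in settings.

    Hypothetical helper: defaults mirror the commented-out sample values.
    """
    problems: List[str] = []

    # The plug-in does not create the database, so both values must be set.
    if not settings.get("endpoint_url"):
        problems.append("endpoint_url is empty")
    if not settings.get("database"):
        problems.append("database is empty")

    grouping = settings.get("metrics_grouping_type", "TablePerMetric")
    if grouping not in ("TablePerMetric", "SingleTable"):
        problems.append("unknown metrics_grouping_type: %s" % grouping)
    # table_name is only needed when all metrics go into a single table.
    if grouping == "SingleTable" and not settings.get("table_name"):
        problems.append("table_name is required for SingleTable grouping")

    ingestion = settings.get("ingestion_type", "queued")
    if ingestion not in ("managed", "queued"):
        problems.append("unknown ingestion_type: %s" % ingestion)

    return problems
```

An empty list means that none of these checks failed. It doesn't confirm that the cluster is reachable or that streaming ingestion is enabled for the managed type.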

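Query ingested data

The following are examples of data collected using the SQL and syslog input plug-ins along with the Azure Data Explorer output plug-in. For each input method, there's an example of how to use data transformations and queries in Azure Data Explorer.

SQL input plug-in

The following table shows sample metrics data collected by the SQL input plug-in: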

name	tags	timestamp	fields
sqlserver_database_io	{"database_name":"azure-sql-db2","file_type":"DATA","host":"adx-vm","logical_filename":"tempdev","measurement_db_type":"AzureSQLDB","physical_filename":"tempdb.mdf","replica_updateability":"READ_WRITE","sql_instance":"adx-sql-server"}	2021-09-09T13:51:20Z	{"current_size_mb":16,"database_id":2,"file_id":1,"read_bytes":2965504,"read_latency_ms":68,"reads":47,"rg_read_stall_ms":42,"rg_write_stall_ms":0,"space_used_mb":0,"write_bytes":1220608,"write_latency_ms":103,"writes":149}
sqlserver_waitstats	{"database_name":"azure-sql-db2","host":"adx-vm","measurement_db_type":"AzureSQLDB","replica_updateability":"READ_WRITE","sql_instance":"adx-sql-server","wait_category":"Worker Thread","wait_type":"THREADPOOL"}	2021-09-09T13:51:20Z	{"max_wait_time_ms":15,"resource_wait_ms":4469,"signal_wait_time_ms":0,"wait_time_ms":4469,"waiting_tasks_count":1464}

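Because the collected metrics object is a complex type, the fields and tags columns are stored as dynamic data types. There are many ways to query this data, for example:

Query JSON attributes directly: You can query JSON data in raw format without parsing it.

Example 1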
```
Tablename
| where name == "sqlserver_azure_db_resource_stats" and todouble(fields.avg_cpu_percent) > 7
```
Example 2
```
Tablename
| distinct tostring(tags.database_name)
```
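Note

This approach could affect performance when you work with large volumes of data. In such cases, use the update policy approach.

Use an update policy: Transform dynamic data type columns by using an update policy. We recommend this approach for querying large volumes of data.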

```
// Function to transform data
.create-or-alter function Transform_TargetTableName() {
  SourceTableName
  | mv-apply fields on (extend key = tostring(bag_keys(fields)[0]))
  | project fieldname=key, value=todouble(fields[key]), name, tags, timestamp
}

// Create destination table with above query's results schema (if it doesn't exist already)
.set-or-append TargetTableName <| Transform_TargetTableName() | take 0

// Apply update policy on destination table
.alter table TargetTableName policy update
@'[{"IsEnabled": true, "Source": "SourceTableName", "Query": "Transform_TargetTableName()", "IsTransactional": true, "PropagateIngestionProperties": false}]'
```
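
To make the shape of the transformed table concrete, the following Python sketch imitates what the transform function does to a single metric row: it emits one output row per key in the fields bag, with the value converted to a double. This is a client-side illustration only, not something the update policy runs. The function name `expand_fields` is an assumption, and the sample row reuses the sqlserver_waitstats values from the table above.

```python
from typing import Any, Dict, Iterator


def expand_fields(row: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one record per entry in row["fields"], similar to mv-apply."""
    for key, raw in row["fields"].items():
        try:
            value = float(raw)  # rough analogue of todouble()
        except (TypeError, ValueError):
            value = None  # values that can't be converted become null
        yield {
            "fieldname": key,
            "value": value,
            "name": row["name"],
            "tags": row["tags"],
            "timestamp": row["timestamp"],
        }


sample = {
    "name": "sqlserver_waitstats",
    "tags": {"wait_category": "Worker Thread", "wait_type": "THREADPOOL"},
    "timestamp": "2021-09-09T13:51:20Z",
    "fields": {
        "max_wait_time_ms": 15,
        "resource_wait_ms": 4469,
        "signal_wait_time_ms": 0,
        "wait_time_ms": 4469,
        "waiting_tasks_count": 1464,
    },
}

for record in expand_fields(sample):
    print(record["fieldname"], record["value"])
```

The sample row has five fields, so it becomes five rows in the target table, each carrying the same name, tags, and timestamp.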

Syslog input plug-in

The following table shows sample metrics data collected by the Syslog input plug-in:

name	tags	timestamp	fields
syslog	{"appname":"azsecmond","facility":"user","host":"adx-linux-vm","hostname":"adx-linux-vm","severity":"info"}	2021-09-20T14:36:44Z	{"facility_code":1,"message":" 2021/09/20 14:36:44.890110 Failed to connect to mdsd: dial unix /var/run/mdsd/default_djson.socket: connect: no such file or directory","procid":"2184","severity_code":6,"timestamp":"1632148604890477000","version":1}
syslog	{"appname":"CRON","facility":"authpriv","host":"adx-linux-vm","hostname":"adx-linux-vm","severity":"info"}	2021-09-20T14:37:01Z	{"facility_code":10,"message":" pam_unix(cron:session): session opened for user root by (uid=0)","procid":"26446","severity_code":6,"timestamp":"1632148621120781000","version":1}

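There are multiple ways to flatten dynamic columns, by using the extend operator or the bag_unpack() plug-in. You can use either of them in the update policy Transform_TargetTableName() function.

Use the extend operator: We recommend this approach because it's faster and more robust. Even if the schema changes, it doesn't break queries or dashboards.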

```
Tablename
| extend facility_code=toint(fields.facility_code), message=tostring(fields.message), procid= tolong(fields.procid), severity_code=toint(fields.severity_code),
SysLogTimestamp=unixtime_nanoseconds_todatetime(tolong(fields.timestamp)), version= todouble(fields.version),
appname= tostring(tags.appname), facility= tostring(tags.facility),host= tostring(tags.host), hostname=tostring(tags.hostname), severity=tostring(tags.severity)
| project-away fields, tags
```
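
For readers who post-process exported rows outside Azure Data Explorer, the explicit column list in the extend query can be mirrored roughly as follows. This Python sketch is an analogue under stated assumptions, not part of the plug-in or of KQL: the helper names `ns_to_datetime` and `flatten_syslog` are invented here, and Python's datetime keeps microseconds only, so the nanosecond timestamp is truncated.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def ns_to_datetime(ns: int) -> datetime:
    """Convert Unix nanoseconds to a UTC datetime using integer arithmetic."""
    secs, nanos = divmod(ns, 1_000_000_000)
    return datetime.fromtimestamp(secs, tz=timezone.utc) + timedelta(
        microseconds=nanos // 1000
    )


def flatten_syslog(row: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten the fields and tags bags into explicitly typed columns."""
    fields, tags = row["fields"], row["tags"]
    return {
        "name": row["name"],
        "timestamp": row["timestamp"],
        "facility_code": int(fields["facility_code"]),
        "message": str(fields["message"]),
        "procid": int(fields["procid"]),
        "severity_code": int(fields["severity_code"]),
        "SysLogTimestamp": ns_to_datetime(int(fields["timestamp"])),
        "version": float(fields["version"]),
        "appname": str(tags["appname"]),
        "facility": str(tags["facility"]),
        "host": str(tags["host"]),
        "hostname": str(tags["hostname"]),
        "severity": str(tags["severity"]),
    }


cron_row = {
    "name": "syslog",
    "tags": {"appname": "CRON", "facility": "authpriv", "host": "adx-linux-vm",
             "hostname": "adx-linux-vm", "severity": "info"},
    "timestamp": "2021-09-20T14:37:01Z",
    "fields": {"facility_code": 10,
               "message": " pam_unix(cron:session): session opened for user root by (uid=0)",
               "procid": "26446", "severity_code": 6,
               "timestamp": "1632148621120781000", "version": 1},
}

flat = flatten_syslog(cron_row)
print(flat["SysLogTimestamp"].isoformat())  # 2021-09-20T14:37:01.120781+00:00
```

Because every output column is named explicitly, a new key appearing in fields or tags is ignored rather than changing the output schema, which is the same property that makes the extend approach stable for dashboards.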

Use the bag_unpack() plug-in: This approach automatically unpacks dynamic type columns. Changing the source schema can cause issues when columns are expanded dynamically.
```
Tablename
| evaluate bag_unpack(tags, columnsConflict='replace_source')
| evaluate bag_unpack(fields, columnsConflict='replace_source')
```
