telegraf - exec plugin - aws ec2 ebs volume info - metric parsing error, reason: [missing fields] or Errors encountered: [ invalid number]


Problem description

Machine - CentOS 7.2 or Ubuntu 14.04/16.xx

Telegraf version: 1.0.1

Python version: 2.7.5

Telegraf supports an input plugin named exec. First, please see Example 2 in the plugin's README. I can't use the JSON data format, as it only consumes numeric values for metrics. As per the docs:

If using JSON, only numeric values are parsed and turned into floats. Booleans and strings will be ignored.
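To make that limitation concrete, here is a minimal Python 3 sketch that mimics the documented behavior (it is not Telegraf's own code). The helper name numeric_fields and the sample JSON object are made up for illustration.

```python
import json


def numeric_fields(json_text):
    """Keep only the key/value pairs whose values are plain numbers.

    This imitates the rule quoted above: numbers become floats,
    booleans and strings are dropped.
    """
    data = json.loads(json_text)
    kept = {}
    for key, value in data.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            kept[key] = float(value)
    return kept


# Hypothetical JSON output for one EBS volume.
sample = '{"size": 8, "iops": 100, "state": "in-use", "encrypted": false}'
print(numeric_fields(sample))
```

Only size and iops survive here; state and encrypted are lost, which is why the JSON data format does not fit this use case.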

So the idea is simple: you specify a script in the exec plugin section, and that script prints some meaningful info (in either JSON or influx data format; influx in my case, as some of my metrics contain non-numeric values) that you want to capture and show in a dashboard, for example a Wavefront dashboard.

Basically, one can use these metrics, their tags, and the sources they come from to find out various info about memory, CPU, disk, networking, and so on, and also to create alerts if something unwanted happens.

OK, I came up with this python script available here:

#!/usr/bin/python

# sudo pip install boto3 if you don't have it on your machine.
import boto3


def generate(key, value):
    """
    Creates a nicely formatted Key(Value) item for output
    """
    return '{}="{}"'.format(key, value)
    #return '{}={}'.format(key, value)


def main():
    ec2 = boto3.resource('ec2', region_name="us-west-2")
    volumes = ec2.volumes.all()

    for vol in volumes:
        # You don't need to wrap everything in `str` unless it is not a string
        # By default most things will come back as a string 
        # unless they are very obviously not (complex, date time, etc)
        # but since we are printing these (and formatting them into strings)
        # the cast to string will be implicit and we don't need to make it 
        # explicit


        # vol is already a fully returned volume you are essentially DOUBLING
        # your API calls when you do this
        #iv = ec2.Volume(vol.id)
        output_parts = [
            # Volume level details
            generate('create_time', vol.create_time),
            generate('availability_zone', vol.availability_zone),
            generate('volume_id', vol.volume_id),
            generate('volume_type', vol.volume_type),
            generate('state', vol.state),
            generate('size', vol.size),
            generate('iops', vol.iops),
            generate('encrypted', vol.encrypted),
            generate('snapshot_id', vol.snapshot_id),
            generate('kms_key_id', vol.kms_key_id),
        ]

        for _ in vol.attachments:
            # Will get any attachments and since it is a list
            # we should write this to handle MULTIPLE attachments
            output_parts.extend([
                generate('InstanceId', _.get('InstanceId')),
                generate('InstanceVolumeState', _.get('State')),
                generate('DeleteOnTermination', _.get('DeleteOnTermination')),
                generate('Device', _.get('Device')),
            ])

        # only process when there are tags to process        
        if vol.tags:
            for _ in vol.tags:
                # Get all of the tags
                output_parts.extend([
                    generate(_.get('Key'), _.get('Value')),
                ])

        # output everything at once.. 
        print ','.join(output_parts)


if __name__ == '__main__':
    main()

This script talks to AWS EC2, reads every EBS volume, and outputs all the values it can find (usually what you see in the AWS EC2 EBS volume console), formatted as comma-separated key=value pairs, which I redirect to a .csv file. We don't want to run the Python script all the time (AWS API limits / cost factor).

So, once the .csv file is created, Telegraf only needs this small shell script, which I set in the exec plugin section.

Shell script /tmp/aws-vol-info.sh set in Telegraf exec plugin is:

#!/bin/bash

cat /tmp/aws-vol-info.csv

Telegraf configuration file created using exec plugin (/etc/telegraf/telegraf.d/exec-plugin-aws-info.conf):

#--- https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec

[[inputs.exec]]
  commands = ["/tmp/aws-vol-info.sh"]

  ## Timeout for each command to complete.
  timeout = "5s"

  # Data format to consume.
  # NOTE json only reads numerical measurements, strings and booleans are ignored.
  data_format = "influx"

  name_suffix = "_telegraf_execplugin"

I tweaked the generate function in the .py script to produce the following output formats (.csv file), and wanted to test how Telegraf would handle this data before enabling the config file (/etc/telegraf/telegraf.d/catch-aws-ebs-info.conf) and restarting the telegraf service.


Format 1: (every value wrapped in " double quotes)

create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",volume_id="vol-058e1d47dgh721121",volume_type="gp2",state="in-use",size="8",iops="100",encrypted="False",snapshot_id="snap-06h1h1b91bh662avn",kms_key_id="None",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",DeleteOnTermination="True",Device="/dev/sda1",Name="[company-2b-app90] secondary",hostname="company-2b-app90-i-0jjb1boop26f42f50",high_availability="1",mirror="secondary",cluster="company",autoscale="true",role="app"

Testing the Telegraf configuration gives me the following error.

Command: $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec

[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 00:37:48 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T00:37:48Z E! Errors encountered: [ metric parsing error, reason: [invalid field format], buffer: [create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",volume_id="vol-058e1d47dgh721121",volume_type="gp2",state="in-use",size="8",iops="100",encrypted="False",snapshot_id="snap-06h1h1b91bh662avn",kms_key_id="None",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",DeleteOnTermination="True",Device="/dev/sda1",Name="[company-2b-app90] secondary",hostname="company-2b-app90-i-0jjb1boop26f42f50",high_availability="1",mirror="secondary",cluster="company",autoscale="true",role="app"], index: [372]]
[vagrant@myvagrant ~] $

Format 2: (without any " double quotes)

create_time=2017-01-09 23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90] secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app

Testing Telegraf's configuration for the exec plugin fails again, this time with reason [invalid value]:

2017/03/10 00:45:01 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T00:45:01Z E! Errors encountered: [ metric parsing error, reason: [invalid value], buffer: [create_time=2017-01-09 23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90] secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app], index: [63]]

Format 3: (no " double quotes and no space characters in the values; spaces replaced with the _ character)

create_time=2017-01-09_23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90]_secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app

Still didn't work; this time the reason is [missing fields]:

[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 00:50:30 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T00:50:30Z E! Errors encountered: [ metric parsing error, reason: [missing fields], buffer: [create_time=2017-01-09_23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90]_secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app], index: [476]]

Format 4: If I follow influx line protocol as per this page: https://docs.influxdata.com/influxdb/v1.2/write_protocols/line_protocol_tutorial/

awsebs,Name=[company-2b-app90]_secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app create_time=2017-01-09_23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1

I'm getting this error:

[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 02:34:30 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T02:34:30Z E! Errors encountered: [ invalid number]

HOW can I get rid of this error and get telegraf to work with exec plugin (which runs the .sh script)?


Other Info:

The Python script will run once or twice per day (via cron), and Telegraf will run every minute (running the exec plugin, which runs the .sh script, which cats the .csv file so that Telegraf can consume it in influx data format).

https://galaxy.ansible.com/wavefrontHQ/wavefront-ansible/

https://github.com/influxdata/telegraf/issues/2525

Solution

It seems the rules are very strict; I should have looked more closely.

The output of any program that you want Telegraf to consume MUST follow the INFLUX LINE PROTOCOL format shown below, along with all the RULES that come with it.

For example:

weather,location=us-midwest temperature=82 1465839830100400200
  |    -------------------- --------------  |
  |             |             |             |
  |             |             |             |
+-----------+--------+-+---------+-+---------+
|measurement|,tag_set| |field_set| |timestamp|
+-----------+--------+-+---------+-+---------+

You can read more about what the measurement, tags, fields and the optional timestamp are here: https://docs.influxdata.com/influxdb/v1.2/write_protocols/line_protocol_tutorial/
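As a rough illustration of that layout, the Python 3 sketch below splits the weather example into its four parts. The function name split_line is made up, and the code deliberately assumes a line with no escaped or quoted spaces or commas, so it is not a full line protocol parser.

```python
def split_line(line):
    """Split one line protocol line into its four parts.

    Simplification: assumes no escaped or quoted spaces/commas.
    """
    parts = line.split(" ")
    head, field_set = parts[0], parts[1]
    # The timestamp is optional, so it may be absent.
    timestamp = parts[2] if len(parts) > 2 else None
    # The first comma separates the measurement from the tag set.
    measurement, _, tag_set = head.partition(",")
    return measurement, tag_set, field_set, timestamp


example = "weather,location=us-midwest temperature=82 1465839830100400200"
print(split_line(example))
```

The first space ends the measurement and tag set, the second space ends the field set, and whatever follows is the timestamp.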

Important rules are:

1) There must be a , and no space between measurement and tag set.

2) There must be a space between tag set and field set.

3) To escape a character in a measurement name, tag key, tag value, or field key, always use a backslash character \.

4) You can't escape \ with \

5) Line Protocol handles emojis with no problem :)

6) TAG / TAG set (tags, comma separated) is OPTIONAL.

7) FIELD / FIELD set (fields, comma separated) - At least ONE is required per line.

8) TIMESTAMP (last value shown in the format) is OPTIONAL.



9) VERY IMPORTANT QUOTING rules are below:

a) Never double or single quote the timestamp. It's not valid Line Protocol. '123123131312313' or "1231313213131" won't work, even if the number itself is a valid timestamp.

b) Never single quote field values (even if they’re strings!). It’s also not valid Line Protocol. i.e. fieldname='giga' won't work.

c) Do not double or single quote measurement names, tag keys, tag values, and field keys. NOTE: this does include tag values, so be careful.

d) Do not double quote field values that are floats, integers, or booleans; otherwise InfluxDB will assume that those values are strings.

e) Do double quote field values that are strings.

f) AND the MOST IMPORTANT one (which will save you from going BALD): a FIELD value written without double quotes is parsed as a number (or boolean). So if a field that you think of as an integer or float (for example size or iops) holds a non-numeric value (i.e. a string) on any line of the file that Telegraf reads through the exec plugin, you'll get the following error message: Errors encountered: [ invalid number].

So to fix it, the RULE is: if any possible FIELD value for a FIELD key is a string, then you MUST wrap it in " on every line. It doesn't matter that the value is 1, 200 or 1.5 on some lines (for example, iops can be 1 or 5) if on other lines it is a string (iops can be None).

Error message: Errors encountered: [ invalid number

[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 11:13:18 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T11:13:18Z E! Errors encountered: [ invalid number metric parsing error, reason: [invalid field format], buffer: [awsebsvol,host=myvagrant ], index: [25]]
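One way to apply this rule in the generator script is sketched below for Python 3. The helper name field and its may_be_string flag are hypothetical; the idea is simply that a key which can ever hold a non-numeric value gets quoted on every line, while a key that is always numeric can stay unquoted.

```python
def field(key, value, may_be_string=False):
    """Format one field as key=value for the influx data format.

    may_be_string is a hypothetical flag: set it for any key whose value
    can be non-numeric on some line, so that it is quoted on every line.
    """
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and not may_be_string:
        return "{}={}".format(key, value)
    # String field values are double quoted; escape \ and " inside them.
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '{}="{}"'.format(key, text)


# Two made-up volumes: iops is a number on one and None on the other.
rows = [{"size": 8, "iops": 100}, {"size": 500, "iops": None}]
for row in rows:
    print(",".join([field("size", row["size"]),
                    field("iops", row["iops"], may_be_string=True)]))
```

Because iops is quoted on both lines, the None case no longer reaches the number parser.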

So, after all this learning, it's clear that first I was missing the Influx Line protocol format and ALSO the RULES!!

Now, the output that I want my Python script to generate should look like this (according to the INFLUX LINE PROTOCOL). You can just change the .sh file and use sed "s/^/awsec2ebs,/", or also do sed "s/^/awsec2ebs,sourcehost=$(hostname) /" (note the space before the closing sed / character), and then you can have " around the value of any key=value pair in the field set. I later changed the .py file so that size is emitted without " (iops stays quoted, because it can be None).

Anyway, the output now looks like this:

awsec2ebs,volume_id=vol-058e1d47dgh721121 create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",volume_type="gp2",state="in-use",size="8",iops="100",encrypted="False",snapshot_id="snap-06h1h1b91bh662avn",kms_key_id="None",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",DeleteOnTermination="True",Device="/dev/sda1",Name="[company-2b-app90] secondary",hostname="company-2b-app90-i-0jjb1boop26f42f50",high_availability="1",mirror="secondary",cluster="company",autoscale="true",role="app"

In the above final working solution, I created a measurement named awsec2ebs, put a , between this measurement and the tag key volume_id, and did NOT use any ' or " quotes for the tag value. Then I put a single space character between the tag set and the field set. (I only wanted one tag for now; you can have more tags by separating them with commas and following the rules.)
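The same assembly can be sketched in Python 3 instead of sed. The names escape_tag and build_line are made up for this example, every field is written as a quoted string for simplicity, and the field values are assumed to contain no " or \ characters.

```python
def escape_tag(text):
    """Backslash-escape the characters that are special in tag keys/values."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def build_line(measurement, tags, fields):
    """Join measurement, tag set and field set into one line.

    tags and fields are lists of (key, value) string pairs.
    """
    tag_set = ",".join(
        "{}={}".format(escape_tag(k), escape_tag(v)) for k, v in tags
    )
    field_set = ",".join('{}="{}"'.format(k, v) for k, v in fields)
    # Comma (no space) before the tag set, one space before the field set.
    head = measurement + ("," + tag_set if tag_set else "")
    return head + " " + field_set


line = build_line(
    "awsec2ebs",
    [("volume_id", "vol-058e1d47dgh721121")],
    [("state", "in-use"), ("volume_type", "gp2")],
)
print(line)
```

If a value such as Name were promoted to a tag, escape_tag would turn the space in "[company-2b-app90] secondary" into "\ " rather than quoting it.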

Finally, I ran the command, which worked like a charm:

$ telegraf --config-directory=/etc/telegraf --test --input-filter=exec

2017/03/10 03:33:54 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
> awsec2ebs_telegraf_execplugin,volume_id=vol-058e1d47dgh721121,host=myvagrant volume_type="gp2",iops="100",kms_key_id="None",role="app",size="8",encrypted="False",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",Name="[company-2b-app90] secondary",snapshot_id="snap-06h1h1b91bh662avn",DeleteOnTermination="True",mirror="secondary",cluster="company",autoscale="true",high_availability="1",create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",state="in-use",Device="/dev/sda1",hostname="company-2b-app90-i-0jjb1boop26f42f50" 1489116835000000000
[vagrant@myvagrant ~] $ echo $?
0

In the example above, size is the only field that is always numeric, so it does not need to be wrapped in ", although you may quote it if you prefer. Recall the MOST IMPORTANT rule above and the error that breaking it generates.
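
The quoting decision can be made mechanical by choosing the representation from the Python type of the value. The sketch below is an assumption about how one might do it, not the script used in this answer: `quoted` and `format_field_value` are hypothetical names, and `force_string` is there for fields such as iops that may be a number on one volume and None on another. As far as I know, a number without the `i` suffix is read as a float, and InfluxDB rejects a write whose field type conflicts with what is already stored, so a field should keep one type across all lines.

```python
import numbers


def quoted(value):
    """Double-quote a value, escaping backslashes and embedded quotes."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + text + '"'


def format_field_value(value, force_string=False):
    """Return the line protocol text for a single field value."""
    if force_string:
        return quoted(value)
    # bool must be tested before integers: bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, numbers.Integral):
        return "{}i".format(value)  # trailing i marks an integer
    if isinstance(value, float):
        return repr(value)
    # None, strings, datetimes and anything else become quoted strings.
    return quoted(value)


print(format_field_value(8))                          # 8i
print(format_field_value(None, force_string=True))    # "None"
print(format_field_value(100, force_string=True))     # "100"
print(format_field_value(False))                      # false
print(format_field_value("gp2"))                      # "gp2"
```

With a helper like this, `size` would be emitted as an integer while `iops` stays a string on every line, whether the API returned a number or None.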

So the final Python file is:

#!/usr/bin/python

#Do `sudo pip install boto3` first
import boto3

def generate(key, value, qs, qe):
    """
    Creates a nicely formatted Key(Value) item for output
    """
    return '{}={}{}{}'.format(key, qs, value, qe)

def main():
    ec2 = boto3.resource('ec2', region_name="us-west-2")
    volumes = ec2.volumes.all()

    for vol in volumes:
        # You don't need to wrap everything in `str` unless it is not a string
        # By default most things will come back as a string
        # unless they are very obviously not (complex, date time, etc)
        # but since we are printing these (and formatting them into strings)
        # the cast to string will be implicit and we don't need to make it
        # explicit

        # vol is already a fully returned volume you are essentially DOUBLING
        # your API calls when you do this
        #iv = ec2.Volume(vol.id)
        output_parts = [
            # Volume level details
            generate('volume_id', vol.volume_id, '"', '"'),
            generate('create_time', vol.create_time, '"', '"'),
            generate('availability_zone', vol.availability_zone, '"', '"'),
            generate('volume_type', vol.volume_type, '"', '"'),
            generate('state', vol.state, '"', '"'),
            generate('size', vol.size, '', ''),
            # vol.iops can be a number or None, so it must be wrapped in
            # double quotes; otherwise an "invalid number" error occurs.
            generate('iops', vol.iops, '"', '"'),
            generate('encrypted', vol.encrypted, '"', '"'),
            generate('snapshot_id', vol.snapshot_id, '"', '"'),
            generate('kms_key_id', vol.kms_key_id, '"', '"'),
        ]

        for attachment in vol.attachments:
            # vol.attachments is a list, so this loop handles
            # MULTIPLE attachments
            output_parts.extend([
                generate('InstanceId', attachment.get('InstanceId'), '"', '"'),
                generate('InstanceVolumeState', attachment.get('State'), '"', '"'),
                generate('DeleteOnTermination', attachment.get('DeleteOnTermination'), '"', '"'),
                generate('Device', attachment.get('Device'), '"', '"'),
            ])

        # only process when there are tags to process
        if vol.tags:
            for tag in vol.tags:
                # Emit every volume tag as a quoted string field
                output_parts.append(
                    generate(tag.get('Key'), tag.get('Value'), '"', '"')
                )

        # Output everything at once; the parenthesized form works on
        # both Python 2 and Python 3.
        print(','.join(output_parts))

if __name__ == '__main__':
    main()

Final aws-vol-info.sh is:

#!/bin/bash

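# Prefix every line with the measurement name and a host tag;
# whitespace in the hostname is collapsed to underscores.
host_tag=$(hostname | head -1 | sed "s/[ \t][ \t]*/_/g")
sed "s/^/awsebsvol,host=${host_tag} /" aws-vol-info.csv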
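
If you would rather keep everything in Python, the sed prefixing done in aws-vol-info.sh can be sketched as follows. This is a hedged equivalent, not what I actually ran: `host_tag_value` and `prefix_lines` are hypothetical helpers, and unlike sed this version skips blank lines instead of prefixing them.

```python
import re
import socket


def host_tag_value(hostname):
    """Collapse each run of spaces or tabs into one underscore,
    mirroring the inner sed expression of the shell script."""
    return re.sub(r"[ \t]+", "_", hostname)


def prefix_lines(lines, measurement, hostname):
    """Prepend 'measurement,host=<hostname> ' to every non-blank line."""
    prefix = "{},host={} ".format(measurement, host_tag_value(hostname))
    return [prefix + line.rstrip("\n") for line in lines if line.strip()]


# In-memory sample standing in for the contents of aws-vol-info.csv.
sample = ['volume_id="vol-058e1d47dgh721121",size=8\n', "\n"]
for out in prefix_lines(sample, "awsebsvol", socket.gethostname()):
    print(out)
```

In real use, the `sample` list would be replaced by the open file object for aws-vol-info.csv, since iterating a file yields its lines.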

The final telegraf exec plugin config file is /etc/telegraf/telegraf.d/exec-plugin-aws-info.conf (any file name ending in .conf will do):

#--- https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec

[[inputs.exec]]
  commands = ["/some/valid/path/where/csvfileexists/aws-vol-info.sh"]

  ## Timeout for each command to complete.
  timeout = "5s"

  # Data format to consume.
  # NOTE json only reads numerical measurements, strings and booleans are ignored.
  data_format = "influx"

  name_suffix = "_telegraf_exec"

Run the following command and everything will work now!

$ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
