Kinesis agent not parsing the file

Problem description

I have the following in my agent.json:

{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [
    {
      "filePattern": "/home/ec2-user/ETLdata/contracts/Delta.csv",
      "kinesisStream": "ETL-rawdata-stream",
      "partitionKeyOption": "RANDOM",
      "dataProcessingOptions": [
        {
          "optionName": "CSVTOJSON",
          "customFieldNames": ["field1", "field2"],
          "delimiter": ","
        }
      ]
    }
  ]
}
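
For readers unfamiliar with the CSVTOJSON option in the config above, here is a rough Python sketch of the kind of transformation it describes: each delimited line is mapped onto the names in customFieldNames and emitted as JSON. The function name `csv_line_to_json` is made up for illustration, and this is only an approximation; the agent's own handling of quoting and of extra or missing columns is not modeled here.

```python
import csv
import io
import json


def csv_line_to_json(line, field_names, delimiter=","):
    """Map one delimited line onto the given field names and return a JSON string."""
    # csv.reader handles quoted values; the agent's own parsing rules may differ.
    values = next(csv.reader(io.StringIO(line), delimiter=delimiter))
    # zip() stops at the shorter of the two sequences (an assumption of this sketch).
    return json.dumps(dict(zip(field_names, values)))


print(csv_line_to_json("abc,123", ["field1", "field2"]))
```

With the field names from the config, a line such as `abc,123` becomes a JSON object with keys field1 and field2.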

When I add the specified file to the folder, nothing happens at all. I only see the log output below. Why is the agent not parsing the file? Does anyone have any idea?

Update: it works when I set the file pattern to /tmp/delta.csv. It looks like a permission issue, but there are no errors in the logs.

Tailer Progress: Tailer has parsed 0 records (0 bytes), transformed 0 records, skipped 0 records, and has successfully sent 0 records to destination.
2017-06-22 18:12:03.671+0000 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 0 records parsed (0 bytes), and 0 records sent successfully to destinations. Uptime: 300020ms
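
The permission suspicion in the update can be checked directly. The agent normally runs as its own system user (typically aws-kinesis-agent-user) rather than as ec2-user, and home directories are often created with mode 700, so a file under /home/ec2-user may simply be invisible to it. Below is a minimal Python sketch, assuming the agent's user is neither the owner nor in the owner's group, so only the "others" permission bits apply. The function name `components_blocking_other_users` is hypothetical.

```python
import stat
from pathlib import Path


def components_blocking_other_users(file_path):
    """Return path components that a user outside the owner's group could not use."""
    path = Path(file_path).resolve()
    blocked = []
    # Walk from the root down to the file's directory.
    for directory in list(path.parents)[::-1]:
        mode = directory.stat().st_mode
        # Every ancestor needs the execute (search) bit to be traversed.
        needs = stat.S_IXOTH
        if directory == path.parent:
            # Listing the containing directory additionally needs the read bit.
            needs |= stat.S_IROTH
        if mode & needs != needs:
            blocked.append(str(directory))
    # The file itself needs the read bit.
    if not path.stat().st_mode & stat.S_IROTH:
        blocked.append(str(path))
    return blocked
```

Calling `components_blocking_other_users("/home/ec2-user/ETLdata/contracts/Delta.csv")` on the instance would list any directory or file that needs its permissions loosened. An empty list suggests the cause lies elsewhere.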

Recommended answer

I had a similar issue and was able to solve it by doing the following:

  1. Move the data to be sent to the Kinesis Firehose stream (a bunch of CSV files) from ~/ec2-user/out-data to another directory:

mv *.csv /tmp/out-data

  • Edit the agent.json file so that the agent starts reading at the beginning of the file. Here is my agent.json file:

    {
      "cloudwatch.emitMetrics": true,
      "firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
      "flows": [
        {
          "filePattern": "/tmp/out-data/trx_headers_2017*",
          "deliveryStream": "TestDeliveryStream",
          "initialPosition": "START_OF_FILE"
        }
      ]
    }
    

  • My guess is that your Delta.csv file is being written to, so the Kinesis agent is checking the end of the file and finding no new records. If you add "initialPosition" : "START_OF_FILE", it will start parsing at the beginning of the file.
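
To illustrate the guess above: the agent's default initialPosition is END_OF_FILE, so a file that already contains all of its data when the agent first sees it yields no records. The Python sketch below mimics only that first pass. It is not the agent's implementation, and `read_new_records` is a made-up name.

```python
import os


def read_new_records(path, initial_position="END_OF_FILE"):
    """Return the lines a tailer would see on its first pass over an existing file."""
    with open(path, "r", encoding="utf-8") as handle:
        if initial_position == "END_OF_FILE":
            # Skip everything already in the file; only later appends would be seen.
            handle.seek(0, os.SEEK_END)
        return handle.read().splitlines()
```

For a file that already holds two lines, `read_new_records(path)` returns an empty list, while `read_new_records(path, "START_OF_FILE")` returns both lines. That matches the "0 records parsed" symptom in the question.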
