Access a JSON element and write it to a text file using the Python ExecuteScript processor


Problem description

I am new to Python and NiFi.

My flow is GetFile --> ExecuteScript.

In the script, for each JSON file, I want to access a particular element and write it to a text file, line by line.

I tried the following:

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class ModJSON(StreamCallback):
  def __init__(self):
    pass
  def process(self, inputStream, outputStream):
  text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
  json_content = json.loads(text)
  try:
     body = json_content['id']['body']
     body_encoded = body.encode('utf-8')
  except (KeyError,TypeError,ValueError):
     body_encoded = ''

  text_file = open ('/tmp/test/testFile.txt', 'w')   
  text_file.write("%s"%body_encoded)
  text_file.close()
  outputStream.write(bytearray(json.dumps(body, indent=4).encode('utf-8')))

flowFile = session.get()
if (flowFile != None):
    flowFile = session.write(flowFile, ModJSON())
    flowFile = session.putAttribute(flowFile, "filename", flowFile.getAttribute('filename').split('.')[0]+'_translated.json')
session.transfer(flowFile, REL_SUCCESS)

But the accessed body is not being written to testFile.txt.

What am I missing here?

Recommended answer

The body of your Python class is not indented, and neither is the body of the process method. Indent every line from the def __init__ line through the outputStream.write line by one level, then indent the lines from text = IOUtils.toString through outputStream.write by one further level. That should give you a valid StreamCallback class and let the script work correctly.
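The indentation described above can be sketched as follows. This is only an illustration that runs in plain Python 3 outside NiFi: the empty StreamCallback class and the io.BytesIO streams are stand-ins (assumptions) for the NiFi interface and for the streams that session.write would supply, and inputStream.read() replaces the IOUtils.toString call. The fallback also assigns body directly, because in the question's script body would be undefined at the final write whenever the key lookup fails.

```python
import io
import json


class StreamCallback(object):
    """Stand-in (assumption) for org.apache.nifi.processor.io.StreamCallback."""


class ModJSON(StreamCallback):
    # Class body: indented one level under "class".
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        # Method body: indented one more level under "def process".
        text = inputStream.read().decode('utf-8')
        json_content = json.loads(text)
        try:
            body = json_content['id']['body']
        except (KeyError, TypeError, ValueError):
            # Keep "body" defined so the write below cannot hit a NameError.
            body = ''
        outputStream.write(json.dumps(body, indent=4).encode('utf-8'))


# Stand-in streams; inside NiFi these come from session.write(flowFile, ModJSON()).
source = io.BytesIO(b'{"id": {"body": "hello"}}')
sink = io.BytesIO()
ModJSON().process(source, sink)
result = sink.getvalue().decode('utf-8')
print(result)
```

Inside ExecuteScript the class would keep the original imports and the IOUtils.toString call; only the indentation pattern carries over from this sketch.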

Also, you do not need to call session.commit(); it is called for you when the script completes.

EDIT (due to the OP's edit; see comments): The script above is still not indented correctly; the body of the process() method needs to be indented. Are you getting errors or bulletins on the ExecuteScript processor? If incoming flow files are queuing up in front of ExecuteScript, then either "flowFile = session.get()" is not being executed, or the processor should be throwing an error and posting a bulletin (a red box in the upper-right corner).

Also, since you intend to send the same content out of the processor in a flow file, you shouldn't need the "text_file" code. I assume that's for debugging?
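If the text file is actually wanted rather than being a debugging aid, note that opening it with mode 'w' truncates it on every call, so only the most recent value would remain. A minimal sketch of line-by-line output in plain Python 3, assuming append mode is acceptable; extract_body and append_line are hypothetical helper names, and a temporary directory stands in for /tmp/test:

```python
import json
import os
import tempfile


def extract_body(text):
    """Return json['id']['body'] as a string, or '' if it is missing or invalid."""
    try:
        return str(json.loads(text)['id']['body'])
    except (KeyError, TypeError, ValueError):
        return ''


def append_line(path, line):
    # Mode 'a' appends to the file; mode 'w' would overwrite it each time.
    with open(path, 'a', encoding='utf-8') as handle:
        handle.write(line + '\n')


# Temporary location used for illustration only.
out_path = os.path.join(tempfile.mkdtemp(), 'testFile.txt')
documents = ['{"id": {"body": "first"}}', '{"id": {"body": "second"}}']
for doc in documents:
    append_line(out_path, extract_body(doc))

with open(out_path, encoding='utf-8') as handle:
    lines = handle.read().splitlines()
print(lines)
```

Each processed document adds one line, so the file accumulates values across calls instead of being replaced.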
