Access a JSON element and write it to a text file using a Python ExecuteScript processor

Question

I am new to Python and NiFi.

My flow is GetFile --> ExecuteScript.

In the script, for each JSON document, I want to access a particular element and write it to a text file, line by line.

I tried the following:

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class ModJSON(StreamCallback):
  def __init__(self):
    pass
  def process(self, inputStream, outputStream):
  text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
  json_content = json.loads(text)
  try:
     body = json_content['id']['body']
     body_encoded = body.encode('utf-8')
  except (KeyError,TypeError,ValueError):
     body_encoded = ''

  text_file = open ('/tmp/test/testFile.txt', 'w')   
  text_file.write("%s"%body_encoded)
  text_file.close()
  outputStream.write(bytearray(json.dumps(body, indent=4).encode('utf-8')))

flowFile = session.get()
if (flowFile != None):
    flowFile = session.write(flowFile, ModJSON())
    flowFile = session.putAttribute(flowFile, "filename", flowFile.getAttribute('filename').split('.')[0]+'_translated.json')
session.transfer(flowFile, REL_SUCCESS)

However, the accessed body is not being written to testFile.txt.

What am I missing here?

Recommended answer

The body of your Python class is not indented, and neither is the body of the process method. Indent everything from the def __init__ line through the outputStream.write line by one level, then indent the lines from text = IOUtils.toString through outputStream.write by one more level. That should give you a valid StreamCallback class and let the script run correctly.
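To make the indentation levels concrete, here is a minimal sketch of the corrected structure. It is written for plain Python 3 so it can be run outside NiFi, which means it makes some assumptions: in-memory io.BytesIO objects stand in for NiFi's Java input and output streams, the class does not extend StreamCallback, and the sample JSON is made up. Inside ExecuteScript, the class would still extend StreamCallback and read the content with IOUtils.toString as in the question.

```python
import io
import json


class ModJSON(object):
    """Stand-in for the question's StreamCallback subclass (assumption: no NiFi here)."""

    # Level 1: method definitions are indented inside the class.
    def __init__(self):
        pass

    def process(self, input_stream, output_stream):
        # Level 2: the whole method body is indented inside process().
        text = input_stream.read().decode('utf-8')
        json_content = json.loads(text)
        try:
            body = json_content['id']['body']
        except (KeyError, TypeError):
            # Fall back to an empty string so 'body' is always defined below.
            body = ''
        output_stream.write(json.dumps(body, indent=4).encode('utf-8'))


# Usage with in-memory streams and a hypothetical sample document.
source = io.BytesIO(b'{"id": {"body": "hello"}}')
sink = io.BytesIO()
ModJSON().process(source, sink)
result = sink.getvalue()
```

Note that the fallback assigns body itself rather than only body_encoded. In the question's script, a failed lookup leaves body undefined, so the later json.dumps(body, ...) call would raise a NameError.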

Also, you do not need to call session.commit(); it is called for you when the script completes.

EDIT (in response to the OP's edit; see comments): The script above is still not indented correctly: the body of the process() method needs to be indented. Are you getting errors or bulletins on the ExecuteScript processor? If incoming flow files are queuing up before ExecuteScript, then either "flowFile = session.get()" is not being executed, or the processor should be throwing an error and posting a bulletin (a red box in the upper right corner).

Also, since you intend to send the same content out of the processor in a flow file, you shouldn't need the "text_file" code. I assume that's for debugging?
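If the side file is wanted after all, one thing worth checking is the file mode: opening with 'w' truncates the file each time, so at most the last body would remain after several flow files. The following is a minimal sketch in plain Python 3, not NiFi code. The helper names extract_body and append_line, the sample documents, and the temporary output path are all hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def extract_body(text):
    """Return json['id']['body'] as a string, or '' if it cannot be read."""
    try:
        body = json.loads(text)['id']['body']
    except (KeyError, TypeError, ValueError):
        return ''
    # Non-string values are serialized so each entry still fits on one line.
    return body if isinstance(body, str) else json.dumps(body)


def append_line(path, line):
    # 'a' appends to the file; 'w' would truncate it on every call.
    with open(path, 'a', encoding='utf-8') as handle:
        handle.write(line + '\n')


# Hypothetical sample documents, written to a temporary file
# rather than the question's /tmp/test/testFile.txt.
out_path = os.path.join(tempfile.mkdtemp(), 'bodies.txt')
for doc in ['{"id": {"body": "first"}}', '{"id": {"body": "second"}}']:
    append_line(out_path, extract_body(doc))

with open(out_path, encoding='utf-8') as handle:
    lines = handle.read().splitlines()
```

With append mode, each processed document adds one line, which matches the "line by line" goal stated in the question.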
