Yahoo Pipes: filter items in a feed based on words in a text file


Question

I have a pipe that filters an RSS feed and removes any item containing one of the "stopwords" I've chosen. Currently I've manually created a filter rule for each stopword in the pipe editor, but it would be more logical to read the stopwords from a file. I've figured out how to read the stopwords out of the text file, but how do I apply the filter operator to the feed once for every stopword?

The documentation states explicitly that operators can't be applied within the loop construct, but hopefully I'm missing something here.

Recommended answer

You're not missing anything: the filter operator can't go in a loop.

Your best bet might be to generate a single regex out of the stopwords and filter using that, e.g. generate a string like (word1|word2|word3|...|wordN).

You may have to escape any characters in the stopwords that have a special meaning in a regex. Also, I'm not sure how long a regex can be, so you might have to split the alternation across multiple filter rules.
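The idea above can be sketched outside of Pipes, for example in a small script that prepares the regex strings to paste into the filter rules. This is only a rough illustration in Python, not something Yahoo Pipes provides: the function names build_stopword_patterns and filter_items and the max_len limit are hypothetical (the real length limit, if any, is unknown), and re.escape follows Python's regex syntax, which may not match the Pipes regex dialect exactly.

```python
import re


def build_stopword_patterns(stopwords, max_len=1000):
    """Build one or more "(w1|w2|...)" patterns from a list of stopwords.

    max_len is an assumed, hypothetical cap on pattern length; a new
    pattern is started whenever adding a word would exceed it.
    """
    patterns = []
    current = []      # escaped words in the pattern being built
    current_len = 2   # length so far, counting the two parentheses
    for word in stopwords:
        word = word.strip()
        if not word:
            continue  # skip blank lines from the text file
        escaped = re.escape(word)  # neutralise regex metacharacters
        extra = len(escaped) + (1 if current else 0)  # +1 for the "|"
        if current and current_len + extra > max_len:
            patterns.append("(" + "|".join(current) + ")")
            current = []
            current_len = 2
            extra = len(escaped)
        current.append(escaped)
        current_len += extra
    if current:
        patterns.append("(" + "|".join(current) + ")")
    return patterns


def filter_items(titles, patterns):
    """Keep only the titles that match none of the stopword patterns."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [t for t in titles if not any(c.search(t) for c in compiled)]


if __name__ == "__main__":
    words = ["spam", "c++", "  casino "]
    pats = build_stopword_patterns(words, max_len=12)
    print(pats)
    print(filter_items(["Learn C++ today", "Python news"], pats))
```

Each returned string would correspond to one filter rule. Note that this matches substrings, so a stopword like "spam" would also block "spammer"; adding word boundaries is a possible refinement if the regex dialect supports them.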
