OpenRefine-在单元格之间填充,但不在列表末尾 [英] OpenRefine - Fill between cells but not at the end of the list

查看:98
本文介绍了OpenRefine-在单元格之间填充,但不在列表末尾的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些股票的股价清单.由于周末,节假日和其他可能原因,某些值丢失了.

I have a list of stock prices for several stocks. Some of the values are missing due to weekends, holidays and probably other reasons.

差距不一致.有的是两天,有的还不止于此.

The gaps are not consistent. Some are two days and some are more than that.

我想用最后一个已知值填补空白,而不是在列表的末尾.

I want to fill the gaps with the last known value but not at the end of the list.

我已经尝试在Excel中测试下面的一些单元格,如果现在为空,则进行填充.问题是由于间隙的不一致,要在所有情况下更改功能都是一项繁琐的任务.

I have tried in Excel to test a few cells below and if it's now empty, do the fill. The problem is that due to the inconsistency of the gaps, it's a tedious task to change the function for all the cases.

有没有一种方法可以测试列表的结尾?

Is there a way to test for the end of a list?

更新-添加了屏幕截图.

UPDATE - added a screenshot.

请参阅此屏幕截图.我要填充蓝点所在的位置.红点在列表的末尾,我不想填充这些单元格.

See this screenshot. I want to fill where the blue dots are. The red dots are at the end of the list and I don't want to fill those cells.

我正在寻找一种检测列表结尾并在检测到结尾时停止填充的方法.

I am looking for a way to detect the end of the list and stop the filling when the end is detected.

推荐答案

我认为这在OpenRefine中非常困难,可能使用其他工具会更好.主要问题是OpenRefine无法提供轻松地跨行工作的功能,因此汇总列"(或列的一部分)非常棘手-在

I think this is pretty difficult in OpenRefine and probably a different tool would work better. The main issue is that OpenRefine does not offer the ability to easily work across rows so 'summing a column' (or part of a column) is tricky - this is mentioned in https://github.com/OpenRefine/OpenRefine/issues/200

但是,您可以通过在整个项目包含单个记录的情况下在记录模式下强制OpenRefine来执行此操作.完成此操作后,您可以使用以下语法访问列中的所有值:

However, you can do this by forcing OpenRefine in Record mode with the whole project containing a single record. Once you've done this you can access all values in a column using syntax like:

row.record.cells["Column name"].value

这给出了列中所有非空白值的数组.由于此操作会忽略空白值,因此为了在列中具有真实的值视图,您必须在空白单元格中填入一个值.

This gives an array of all the non-blank values in the column. Since this ignores blank values, in order to have a true view of the values in the column you have to fill in blank cells with a value.

所以我认为您可能可以实现以下目标:

So I think you could probably achieve what you want as follows:

  1. 对于要使用的每一列,请执行单元格转换,以在空单元格中放置一个虚拟值-例如if(isBlank(value),"null",value)
  2. 在项目的开头创建一个新列,并在该列的第一个单元格中输入一个值
  3. 切换到记录模式

此时,您应该在项目中有一个记录",例如

At this point you should have a single 'Record' in your project - e.g.

现在,您可以使用row.record.cells ["Column 1"].value之类的语法访问列中的所有单元格.您可以将其与"forRange"结合使用,以row.index作为当前行的标记来遍历此数组的内容.

You can now access all cells in a column using syntax like row.record.cells["Column 1"].value. You can combine this with 'forRange' to iterate through the contents of this array, using the row.index as the marker for the current row.

  1. 我使用以下公式将新列添加到项目中:

  1. I used the following formula to add a new column to the project:

with(row.record.cells ["Column 1"].value,w,if(forRange(row.index,w.length(),1,i,w [i] .toNumber()).sum ()> 0,"a","b"))

with(row.record.cells["Column 1"].value,w,if(forRange(row.index,w.length(),1,i,w[i].toNumber()).sum()>0,"a","b"))

然后...

  1. 更改回行"模式
  2. 从原始列中删除空"占位符

  1. 在填充过滤器"列上创建一个构面

  1. 就我而言,我过滤为"a"
  2. 使用填充"选项
  3. 删除过滤器

  1. In my case I filter to 'a'
  2. Use the 'fill down' option
  3. Remove the filter

并删除记录"列

至少可以说这是一个漫长的过程,但是到目前为止,在不离开OpenRefine的情况下,我还没有找到更好的方法.我猜测您可能会将步骤5-11压缩为一个步骤或更少的步骤.

Rather a long winded way of doing it to say the least, but so far I've not been able to find anything better while not going outside OpenRefine. I'm guessing you could probably compress steps 5-11 into a single step or smaller number of steps.

如果要按照iMitwe的建议使用Jython访问单元格值数组,则需要使用:

If you want to access the array of cell values using Jython as suggested by iMitwe you need to use:

row["record"]["cells"]["Column 1"]["value"]

代替

row.record.cells["Column 1"].value

(第5步)

这篇关于OpenRefine-在单元格之间填充,但不在列表末尾的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆