How to replace a custom missing data signifier so all missing values in a column can be replaced with the mean for that column?
Problem description
I have an uploaded csv where in each of the many columns missing data is signified by *. I can't seem to find a way, for each column, to replace this * with the average for the column. Is this possible in Azure Machine Learning Studio please?
I know I could clean the data before import, but I am interested in whether it is possible within the Studio tools. So far I have been unable to find a way using Convert To Dataset and Clean Missing Data. I can use a custom substitution of blank for *, but then that step of the workflow won't run. Replacing * with an outlier integer, e.g. -999999, doesn't work for me either, as it is not picked up as a missing value when using Clean Missing Data.
Any suggestions gratefully received.
Regards,
Quentin
QHarri
Recommended answer
An option would be to write your own R/Python code if the 'Clean Missing Data' module does not fix your issue.
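As an illustration of the Python route, here is a minimal sketch using pandas. It assumes the script runs in an Execute Python Script module, whose entry point in Studio (classic) is a function named azureml_main that receives the dataset as a pandas DataFrame and returns a tuple of DataFrames. The helper name replace_marker_with_mean and the choice to leave non-numeric columns untouched are assumptions made for this example, not part of the original answer.

```python
import pandas as pd


def replace_marker_with_mean(df, marker="*"):
    """Return a copy of df with `marker` cells replaced by the column mean.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    for col in out.columns:
        # Treat the custom marker as a missing value.
        masked = out[col].mask(out[col] == marker)
        # A column containing "*" is read as text, so convert it to numbers.
        numeric = pd.to_numeric(masked, errors="coerce")
        # If coercion created extra NaNs, the column holds other text:
        # leave it unchanged rather than destroy its values.
        if numeric.isna().sum() > masked.isna().sum():
            continue
        # The mean skips NaN by default, so it covers only the real values.
        out[col] = numeric.fillna(numeric.mean())
    return out


def azureml_main(dataframe1=None, dataframe2=None):
    # Assumed entry point of the Execute Python Script module.
    return replace_marker_with_mean(dataframe1, marker="*"),
```

Connecting the uploaded dataset to the first input port of the module should then yield a dataset in which each * has been replaced by its column's mean. A column made up entirely of * has no mean, so it would come out as all missing values.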
Regards,
Jaya