Speed problem


Problem description


Hi, I am a newbie to Perl. Below is the script I have worked on, which gives me exactly what I want. The only problem I am facing is speed. The script maps AB, BA, IS, RT and ET from file1.csv and puts them into columns. file2.txt contains the first column of file1.csv, which I cut out with a Unix command (I would like to avoid this step and do the mapping from the script itself). My question: is there any way to make this script faster? file1.csv has 1.6 million lines and the grep command goes through them 6 times, which makes the script really slow (it takes 6 hours to map), although it does give the output I want.
Please help me.

Thanks in advance.
John


Recommended answer






It looks like file1 and file2 have the same order:

112311
321211
432342

Is file1 already in the order you want the output to be in, or does the first column need to be sorted somehow? I assume file2 is not important and you only use it temporarily to get the first-column values, unless file1 is not in the correct order to begin with.

Have you posted this question on other forums, and do you already have a solution? Because if you do, I don't want to waste my time figuring it out.







Kevin, thanks for the fast response. No, I have not posted this question anywhere else, because my search showed that this is the only forum that is active and fast.
About file2.txt: it contains the unique first-column values from file1.csv, and I am currently producing it manually with a Unix command (I want to avoid the manual step and do it in Perl). The output file, as above, has new columns that come out of file1, using that column as the match. So in the output I don't want any duplicates, and I want to create new columns from the rows in column 2.

Yes, this script does give a solution, but as I mentioned it takes 6 to 7 hours to generate the results, because it goes through 1.6 million lines. That's why I need help; I am stuck. Any help will be appreciated.

Thank you in advance.


What are the relationships between the three columns? If you are only storing the unique data from column1, what are you doing with the data from columns 2 and 3?

--Kevin
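
For readers with the same problem, here is a rough sketch of how the repeated grep passes might be replaced by a single pass that collects everything into a hash. It is not the original script, which is not shown in the thread, and it rests on assumptions about the data: that each line of file1.csv has the form `id,label,value`, that the label in column 2 is one of AB, BA, IS, RT or ET, and that the output should have one row per unique id in first-seen order. The sub name `pivot_rows` and the sample values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION (not confirmed in the thread): every input line looks like
#   id,label,value
# and label is one of the five codes below.
my @labels = qw(AB BA IS RT ET);

# Read the input once. Returns the ids in first-seen order and a
# hash of id => { label => value }.
sub pivot_rows {
    my ($fh) = @_;
    my (%row_for, @order);
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;                 # skip blank lines
        my ($id, $label, $value) = split /,/, $line, 3;
        next unless defined $label;               # skip malformed lines
        # Remembering each id the first time it appears takes over the
        # job of the separately cut file2.txt.
        push @order, $id unless exists $row_for{$id};
        $row_for{$id}{$label} = $value // '';
    }
    return (\@order, \%row_for);
}

# Demo on a small in-memory sample (hypothetical values). For the real
# file, open it instead:
#   open my $fh, '<', 'file1.csv' or die "file1.csv: $!";
my $sample = join "\n",
    '112311,AB,x1', '112311,IS,x2', '321211,BA,y1',
    '432342,ET,z1', '321211,RT,y2', '';
open my $fh, '<', \$sample or die "cannot open sample: $!";
my ($order, $row_for) = pivot_rows($fh);
close $fh;

# One output row per unique id, one column per label; missing labels
# are left empty.
print join(',', 'id', @labels), "\n";
for my $id (@$order) {
    print join(',', $id, map { $row_for->{$id}{$_} // '' } @labels), "\n";
}
```

Because each hash lookup takes roughly constant time, the file is read once instead of once per label, and no separate file2.txt is needed. The plain `split` does not handle quoted fields that contain commas; if the real data has those, a CSV parser such as Text::CSV would be the safer choice. If an id can carry the same label more than once, the last value wins in this sketch, which may or may not be what is wanted.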

