Improving performance of a function in Python


Question

I have a text file of several GB in this format:

0 274 593869.99 6734999.96 121.83 1,
0 273 593869.51 6734999.92 121.57 1,
0 273 593869.15 6734999.89 121.57 1,
0 273 593868.79 6734999.86 121.65 1,
0 273 593868.44 6734999.84 121.65 1,
0 273 593869.00 6734999.94 124.21 1,
0 273 593868.68 6734999.92 124.32 1,
0 273 593868.39 6734999.90 124.44 1,
0 273 593866.94 6734999.71 121.37 1,
0 273 593868.73 6734999.99 127.28 1,

I have a simple filter function in Python 2.7 on Windows. The function reads the entire file, selects the lines with the same idtile (first and second columns), and returns the list of points (x, y, z, and label) together with the idtile.

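import numpy as np

# ny = number of rows, nx = number of columns (defined elsewhere)
tiles_id = [j for j in np.ndindex(ny, nx)]
idtile = tiles_id[0]

def file_filter(name, idtile):
    # Collect the x, y, z and label fields of every line whose
    # first two columns match idtile.
    lst = []
    with open(name, mode="r") as f:
        for line in f:
            element = line.split()
            if (int(element[0]), int(element[1])) == idtile:
                lst.append(element[2:])
    # Taken from idtile so they are defined even when no line matches
    dy, dx = idtile
    return (lst, dy, dx)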

The file is larger than 32 GB and the bottleneck is reading it. I am looking for suggestions or examples to speed up my function (e.g. parallel computing or other approaches).

My current solution is to split the text file into tiles (using the x and y location). That solution is not elegant, and I am looking for a more efficient approach.
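As a rough illustration of the tile-splitting idea above, here is a minimal sketch that reads the file once and appends each line to a per-tile file, so that later lookups only touch one small file instead of the full 32 GB. It assumes Python 3; the function name split_into_tile_files, the out_dir parameter and the tile_<row>_<col>.txt naming are hypothetical and not from the original post. It keeps one open handle per tile, which may hit the operating system's open-file limit if there are very many tiles.

```python
import os


def split_into_tile_files(name, out_dir):
    """Read the file once and write each line's point fields to a per-tile file.

    Returns a dict mapping (row, col) tile ids to the path written for that tile.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    try:
        with open(name, "r") as src:
            for line in src:
                # Split off the two id columns; keep the rest of the line as-is
                element = line.split(None, 2)
                if len(element) < 3:
                    continue  # skip blank or malformed lines
                key = (int(element[0]), int(element[1]))
                if key not in handles:
                    path = os.path.join(out_dir, "tile_%d_%d.txt" % key)
                    handles[key] = open(path, "w")
                handles[key].write(element[2])
    finally:
        for h in handles.values():
            h.close()
    return {key: h.name for key, h in handles.items()}
```

The single pass costs about as much as one call to file_filter, but it produces the data for every tile at once rather than rescanning the file per tile.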

Recommended answer

Maybe the best and fastest way to solve your problem is to use a map-reduce algorithm on a (massively) parallel system.
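A minimal single-machine sketch of that map-reduce idea, assuming Python 3 and the standard multiprocessing module rather than a cluster framework. The file is cut into byte ranges, each worker filters the lines that start inside its range (map), and the partial lists are concatenated (reduce). The names chunk_ranges, filter_chunk and parallel_filter and the n_workers parameter are hypothetical. Whether this actually helps depends on whether the disk or the line parsing is the limiting factor.

```python
import os
from multiprocessing import Pool


def chunk_ranges(path, n_chunks):
    """Split the file into at most n_chunks contiguous (start, end) byte ranges."""
    size = os.path.getsize(path)
    step = max(1, size // n_chunks)
    starts = list(range(0, size, step))[:n_chunks]
    return [(s, starts[i + 1] if i + 1 < len(starts) else size)
            for i, s in enumerate(starts)]


def filter_chunk(args):
    """Map step: return matching rows for lines that start inside [start, end)."""
    path, start, end, idtile = args
    rows = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard through the next newline, so a
            # line is skipped only if it began before `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            element = line.split()
            if len(element) >= 2 and (int(element[0]), int(element[1])) == idtile:
                rows.append([e.decode("ascii") for e in element[2:]])
    return rows


def parallel_filter(path, idtile, n_workers=4):
    """Run filter_chunk over all byte ranges in a process pool and merge results."""
    tasks = [(path, s, e, idtile) for s, e in chunk_ranges(path, n_workers)]
    with Pool(n_workers) as pool:
        parts = pool.map(filter_chunk, tasks)
    # Reduce step: concatenate the per-chunk lists in file order
    return [row for part in parts for row in part]
```

On Windows, parallel_filter should be called from code guarded by `if __name__ == "__main__":`, because worker processes are started by re-importing the main module. A line that straddles a chunk boundary is handled by the chunk in which it starts, so no line is read twice or dropped.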
