How can I handle the code to avoid being killed?


Problem description

After running this code, the process gets Killed.

The first part of the code is

import sys

def load_data(distance_file):
    distance = {}
    min_dis, max_dis = sys.float_info.max, 0.0
    num = 0
    # The with statement closes the file automatically.
    with open(distance_file, 'r', encoding='utf-8') as infile:
        for line in infile:
            content = line.strip().split()
            assert len(content) == 3
            idx1, idx2, dis = int(content[0]), int(content[1]), float(content[2])
            num = max(num, idx1, idx2)
            min_dis = min(min_dis, dis)
            max_dis = max(max_dis, dis)
            # Store both directions so lookups work for (i, j) and (j, i).
            distance[(idx1, idx2)] = dis
            distance[(idx2, idx1)] = dis
    # Every point is at distance 0 from itself.
    for i in range(1, num + 1):
        distance[(i, i)] = 0.0
    return distance, num, max_dis, min_dis

Edit: I tried this solution

        bigfile = open(folder, 'r')
        tmp_lines = bigfile.readlines(1024)
        while tmp_lines:
            for line in tmp_lines:
                tmp_lines = bigfile.readlines(1024)

                i, j, dis = line.strip().split()
                i, j, dis = int(i), int(j), float(dis)
                distance[(i, j)] = dis
                distance[(j, i)] = dis
                max_pt = max(i, j, max_pt)
            for num in range(1, max_pt + 1):
                distance[(num, num)] = 0
        return distance, max_pt

but got this error

   gap = distance[(i, j)] - threshold
KeyError: (1, 2)

from this method

def CutOff(self, distance, max_id, threshold):
        '''
        :rtype: list with Cut-off kernel values by desc
        '''
        cut_off = dict()
        for i in range(1, max_id + 1):
            tmp = 0
            for j in range(1, max_id + 1):
                gap = distance[(i, j)] - threshold
                print(gap)
                tmp += 0 if gap >= 0 else 1
            cut_off[i] = tmp
        sorted_cutoff = sorted(cut_off.items(), key=lambda k:k[1], reverse=True)
        return sorted_cutoff

I used print(gap) to find out why this problem appeared and got the value -0.3.

The rest of the code is here

I have a file containing 20,000 lines, and the code stopped at

['2686', '13856', '64.176689']
Killed

How can I change the code so that it accepts more lines? Can I increase the available memory, and if so how, or does the code itself need to change, for example by storing the data in a file instead of in variables?

I ran dmesg and got

Out of memory: Killed process 24502 (python) total-vm:19568804kB, anon-rss:14542148kB, file-rss:4kB, shmem-rss:0kB, UID:1000 pgtables:31232kB oom_score_adj:0

[  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents o

 1000       24502    4892200          3585991    33763328     579936   

Recommended answer

The CutOff function checks every (i, j) pair, for i and j from 1 to max_id.

def CutOff(self, distance, max_id, threshold):
    for i in range(1, max_id + 1):
        for j in range(1, max_id + 1):

The sample data file provided in the GitHub link contains a distance value for every pair of IDs from 1 to 2000 (so it has about 2M lines for the 2K IDs).
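As a quick check of that figure, here is a minimal sketch that assumes the file lists each unordered pair of IDs exactly once, so n IDs give n(n-1)/2 lines. The helper name pair_count is hypothetical and not part of the original code.

```python
import math


def pair_count(n_ids):
    """Number of unordered ID pairs, assuming each pair appears once in the file."""
    return math.comb(n_ids, 2)


# 2,000 IDs give roughly 2M lines, matching the sample file.
print(pair_count(2000))   # 1999000
# A complete file for IDs up to 13856 would need tens of millions of lines.
print(pair_count(13856))  # 95987440
```

Under that assumption, a file that really covered every pair up to ID 13856 would be far larger than 20,000 lines.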

However, your data seems to be very sparse: it has only 20,000 lines, yet it contains large ID numbers such as 2686 and 13856. The error message 'KeyError: (1, 2)' means that there is no distance value between IDs 1 and 2.
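One possible way to cope with sparse data is to count over the pairs that are actually stored instead of indexing every (i, j) combination. The sketch below assumes that a missing pair should simply not count as a neighbour, which may or may not match what the original algorithm intends. The function name cut_off_sparse is hypothetical.

```python
def cut_off_sparse(distance, max_id, threshold):
    """Count, for each ID, how many stored distances fall below threshold.

    Assumption: a pair absent from `distance` is treated as "not a neighbour".
    """
    counts = {i: 0 for i in range(1, max_id + 1)}
    # Visit only the pairs that exist, not all max_id ** 2 combinations.
    for (i, j), dis in distance.items():
        if dis - threshold < 0:
            counts[i] += 1
    # Same ordering as the original CutOff: highest count first.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```

When the dictionary holds every pair, this should return the same result as the nested loops in CutOff, but it never raises KeyError and its running time depends on the number of stored pairs.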

Finally, it does not make sense to me that code loading only 20,000 lines of data (probably a few megabytes) would raise an out-of-memory error. I guess your data is much larger, or the OOM error came from another part of your code.
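If the data does turn out to be larger than expected, the dictionary can be made roughly half as large by storing each unordered pair once and leaving out the diagonal. The sketch below assumes the same three-column, whitespace-separated format; load_pairs and get_distance are hypothetical names. It reads the file line by line, because in the chunked loader from the question readlines(1024) is called inside the for loop, so most chunks are read and then discarded without being processed, which may also contribute to missing keys.

```python
def load_pairs(path):
    """Stream an 'id1 id2 distance' file, keeping one entry per unordered pair."""
    distance = {}
    max_id = 0
    with open(path, "r", encoding="utf-8") as infile:
        for line in infile:  # a file object already yields one line at a time
            parts = line.split()
            if len(parts) != 3:
                continue  # assumption: skip blank or malformed lines
            i, j, dis = int(parts[0]), int(parts[1]), float(parts[2])
            key = (i, j) if i <= j else (j, i)
            distance[key] = dis
            max_id = max(max_id, i, j)
    return distance, max_id


def get_distance(distance, i, j, default=None):
    """Look up a distance in either order; the diagonal is 0 by definition."""
    if i == j:
        return 0.0
    key = (i, j) if i <= j else (j, i)
    return distance.get(key, default)
```

This only reduces the constant factor. If the file holds tens of millions of pairs, a plain Python dict of tuples may still not fit in memory, and a more compact representation would be needed.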
