How to open more than 19 files in parallel (Python)?


Question

I have a project that needs to read data and then, depending on each line, write to more than 23 CSV files in parallel. For example, if the line is about temperature, we should write to temperature.csv; if it is about humidity, to humid.csv; and so on.

I tried the following:

with open('Results\\GHCN_Daily\\MetLocations.csv', 'wb+') as locations, \
            open('Results\\GHCN_Daily\\Tmax.csv', 'wb+') as tmax_d, \
            open('Results\\GHCN_Daily\\Tmin.csv', 'wb+') as tmin_d, \
            open('Results\\GHCN_Daily\\Snow.csv', 'wb+') as snow_d, \
            # ... total of 23 'open' statements ...
            open('Results\\GHCN_Daily\\SnowDepth.csv', 'wb+') as snwd_d, \
            open('Results\\GHCN_Daily\\Cloud.csv', 'wb+') as cloud_d, \
            open('Results\\GHCN_Daily\\Evap.csv', 'wb+') as evap_d, \

I got the following error:

SystemError: too many statically nested blocks

I searched for this error and found this post, which says:

You will encounter this error when you nest blocks more than 20. This is a design decision of the Python interpreter to restrict it to 20.

But the open statements I wrote open the files in parallel, not nested.
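The comma-separated form is only syntactic sugar: the compiler counts every context manager as one more statically nested block, exactly as if the `with` statements were written one inside the other, so CPython's 20-block compile-time limit applies. A minimal sketch that reproduces the limit (the exception type varies: SystemError on older interpreters, SyntaxError on newer ones):

```python
# Build source code containing 25 nested 'with' blocks and try to
# compile it; CPython's compiler caps static block nesting at 20.
depth = 25
src = "".join("%swith open('f'):\n" % ("    " * i) for i in range(depth))
src += "    " * depth + "pass\n"

try:
    compile(src, "<nested>", "exec")
    raised = None
except (SyntaxError, SystemError) as exc:  # exception type varies by version
    raised = exc

print(raised)
```

The error fires at compile time, before any file is opened, which is why restructuring the code rather than the I/O is what fixes it.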

What am I doing wrong, and how can I solve this problem?

Thanks in advance.

Solution

Each open is a nested context; Python's syntax merely lets you put them in a comma-separated list. contextlib.ExitStack is a context container that lets you push as many contexts as you like onto a stack and exits each of them when you are done. So you could do:

import contextlib

files_to_process = (
    ('Results\\GHCN_Daily\\MetLocations.csv', 'locations'),
    ('Results\\GHCN_Daily\\Tmax.csv', 'tmax_d'),
    ('Results\\GHCN_Daily\\Tmin.csv', 'tmin_d'),
    # ...
)

with contextlib.ExitStack() as stack:
    # open for writing ('wb+', as in the question); file objects
    # have write(), not writeline()
    files = {varname: stack.enter_context(open(filename, 'wb+'))
             for filename, varname in files_to_process}
    # and for instance...
    files['locations'].write(b'my location\n')
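Tying this back to the question, routing each input line to the right file is then a dictionary lookup. A minimal sketch, with made-up category names and a temporary directory standing in for the real Results folder:

```python
import contextlib
import os
import tempfile

# Illustrative input rows whose first field names the target file;
# the categories here are invented, not from the original data set.
rows = [
    'temp,2015-01-01,25.3',
    'humid,2015-01-01,0.61',
    'temp,2015-01-02,24.8',
]

out_dir = tempfile.mkdtemp()
files_to_process = (
    (os.path.join(out_dir, 'Temperature.csv'), 'temp'),
    (os.path.join(out_dir, 'Humidity.csv'), 'humid'),
)

with contextlib.ExitStack() as stack:
    # Open every output file once; ExitStack closes them all on exit.
    files = {key: stack.enter_context(open(filename, 'w'))
             for filename, key in files_to_process}
    for row in rows:
        # Route the row to the file registered for its category field.
        files[row.split(',', 1)[0]].write(row + '\n')
```

Every file is opened exactly once and closed in one place, no matter how many categories you add.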

If you find dict access less tidy than attribute access, you could create a simple container class:

class SimpleNamespace:

    def __init__(self, name_val_pairs):
        # expose each (name, value) pair as an attribute
        self.__dict__.update(name_val_pairs)

with contextlib.ExitStack() as stack:
    files = SimpleNamespace(
        (varname, stack.enter_context(open(filename, 'wb+')))
        for filename, varname in files_to_process)
    # and for instance...
    files.locations.write(b'my location\n')
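On Python 3.3 and later the standard library already ships an equivalent container, types.SimpleNamespace. It takes keyword arguments rather than an iterable of pairs, so unpack the dict; the sketch below again uses a temporary directory in place of the real Results paths:

```python
import contextlib
import os
import tempfile
import types

out_dir = tempfile.mkdtemp()
files_to_process = (
    (os.path.join(out_dir, 'MetLocations.csv'), 'locations'),
    (os.path.join(out_dir, 'Tmax.csv'), 'tmax_d'),
)

with contextlib.ExitStack() as stack:
    # types.SimpleNamespace accepts keyword arguments, so unpack the
    # dict of opened files into it for attribute-style access.
    files = types.SimpleNamespace(**{
        key: stack.enter_context(open(filename, 'w'))
        for filename, key in files_to_process})
    files.locations.write('my location\n')
```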

