Add to existing spreadsheet?

Problem description

I have a spreadsheet with fieldnames:

['name', 'occupation', 'company', 'address', 'address_2','city', 'state', 'zip', 'phone,' 'fax', 'email', 'website', 'description']

and I would like to add data from other spreadsheets that contain fewer fieldnames (although all of the others' fieldnames are included in this spreadsheet).

I'm getting a bizarre error:

Samuel-Finegolds-MacBook-Pro:~ samuelfinegold$ /var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup\ At\ Startup/merge-395698810.980.py.command ; exit;
['name', 'occupation', 'company', 'address', 'address_2', 'city', 'state', 'zip', 'phone,fax', 'email', 'website', 'description']
Traceback (most recent call last):
  File "/Users/samuelfinegold/Documents/noodle/merge.py", line 14, in <module>
    gc_all_dict.writerow(row)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 148, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 144, in _dict_to_list
    ", ".join(wrong_fields))
TypeError: sequence item 0: expected string, NoneType found
logout

[Process completed]

This happens when I run the following:

import csv

# compile master spreadsheet
with(open('gc_all.txt','w')) as gc_all:

    fieldnames = ['name', 'occupation', 'company', 'address', 'address_2','city', 'state', 'zip', 'phone,' 'fax', 'email', 'website', 'description']
    gc_all_dict = csv.DictWriter(gc_all, fieldnames = fieldnames, delimiter = '\t')
    print gc_all_dict.fieldnames

    with(open('/Users/samuelfinegold/Documents/noodle/aicep/aicep_scrape_output.txt', 'rU')) as aicep:
        aicep_dict = csv.DictReader(aicep, fieldnames = fieldnames, delimiter = '\t')
        for row in aicep_dict:
#             print row
            gc_all_dict.writerow(row)


    for row in gc_all:
        print row

Fake data:

name    occupation  company address address_2   city    state   zip phone   fax email   website description
Rob Er      Step Up 123 Road Dr     New York    NY  10011   1234567891  1234567891  a@b.com www.stepUp.com  A great counselor
Bob B. Bob      For Your Rights 12 2nd Ave      San Francisco   CA  94109   1234567891  1234567891  c@d.com     
Snob Job        Marley Inc. 12 1st Ave      Denver  CO  80231   1234567891  1234567891  g@h.com     What a counselor!

Recommended answer

The real problem here is that, despite what you claim in your question, not all of the others' fieldnames are included in this spreadsheet.

You can tell by looking at the line above the one that raised the exception. DictWriter._dict_to_list looks like this:

def _dict_to_list(self, rowdict):
    if self.extrasaction == "raise":
        wrong_fields = [k for k in rowdict if k not in self.fieldnames]
        if wrong_fields:
            raise ValueError("dict contains fields not in fieldnames: " +
                             ", ".join(wrong_fields))
    return [rowdict.get(key, self.restval) for key in self.fieldnames]

So, it found a field that isn't in your DictWriter's fieldnames.

But why is it raising that weird error while trying to create the real error? Because the offending field is named None, and str.join accepts only strings. The DictWriter code isn't built to handle that. So, that's problem #2.
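The secondary failure can be reproduced in isolation. This is a minimal sketch, not the csv module's own code: it only assumes, as the traceback shows, that the error message is built by joining the unexpected keys, one of which is None. It is written for Python 3, where the TypeError wording differs slightly from the Python 2.7 message in the question.

```python
# Keys that are not in fieldnames; None stands in for the key that
# DictReader assigns to surplus columns.
wrong_fields = [None]

try:
    message = "dict contains fields not in fieldnames: " + ", ".join(wrong_fields)
    raised = False
except TypeError as exc:
    # str.join accepts only strings, so building the message fails
    # before the intended ValueError can be raised.
    raised = True
    print("TypeError:", exc)
```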

And why is the field named None? Because that's what a DictReader produces whenever it finds a column that doesn't fit into the fieldnames that you gave it. You can see this with print row: one of the elements of the dict will be something like None: ['foo']. So, that's problem #3.
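A small sketch of that behaviour, written for Python 3 with an in-memory file. The two-name fieldnames list and the sample row are made up for illustration; only csv.DictReader and its default restkey of None come from the standard library.

```python
import csv
import io

# Three tab-separated columns, but only two fieldnames (illustrative data).
source = io.StringIO("Rob Er\t1234567891\t1234567891\n")
reader = csv.DictReader(source, fieldnames=["name", "phone,fax"], delimiter="\t")

row = next(reader)
# The surplus column is collected in a list under the key None.
print(row[None])
print(None in row)
```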

So, how do you fix it?

Well, the obvious thing to do is make your claim true: make the fields in your target a superset of the fields in your source.
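In the question's code the mismatch appears to come from a misplaced comma: 'phone,' 'fax' is two adjacent string literals, which Python concatenates into the single name 'phone,fax' (the fieldnames printed in the question's output show exactly that). A sketch of the comparison, assuming the intended names were 'phone' and 'fax':

```python
# As written in the question: the comma sits inside the first literal,
# so the two literals are joined into one string.
broken = ['name', 'occupation', 'company', 'address', 'address_2', 'city',
          'state', 'zip', 'phone,' 'fax', 'email', 'website', 'description']

# Assumed intent: the comma separates two list items.
fixed = ['name', 'occupation', 'company', 'address', 'address_2', 'city',
         'state', 'zip', 'phone', 'fax', 'email', 'website', 'description']

print(len(broken), len(fixed))
print('phone,fax' in broken)
```

With the shorter list, a 13-column data row has one column left over, which is the surplus field that ends up under None.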

Alternatively, you can tell your DictReader to skip extra fields, or your DictWriter to ignore them instead of raising. For example, just add extrasaction='ignore' to your DictWriter constructor, and the problem will go away.
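For reference, this is roughly what the extrasaction='ignore' variant looks like. It is a sketch for Python 3 using an in-memory buffer and made-up fieldnames; the row dict mimics what a DictReader yields when a line has one column too many.

```python
import csv
import io

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "phone"],
                        delimiter="\t", extrasaction="ignore")

# Row shaped like DictReader output with a surplus column under None.
row = {"name": "Rob Er", "phone": "1234567891", None: ["1234567891"]}
writer.writerow(row)  # the None key is dropped without complaint

print(repr(out.getvalue()))
```

Note that the surplus value is discarded without any warning, so this hides the mismatch rather than fixing it.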

But really, you shouldn't be doing that. The default extrasaction='raise' caught a legitimate bug for you here; it just didn't do so with a very useful error message.
