Only first character of unicode strings getting written to csv

Posted: 2017/2/24 20:33:05 | Tags: python, csv, unicode, pyodbc


Problem description

 
In a nutshell, my script cannot write complete unicode strings (retrieved from a db) to a csv; only the first character of each string is written to the file. For example:
U,1423.0,831,1,139
Where the output should be:
University of Washington Students,1423.0,831,1,139
Some background: I'm connecting to an MSSQL database using pyodbc. I have my odbc config file set up for unicode, and connect to the db as follows:
p.connect("DSN=myserver;UID=username;PWD=password;DATABASE=mydb;CHARSET=utf-8")
I can get data no problem, but the issue arises when I try to save query results to the csv file. I've tried using csv.writer, the UnicodeWriter solution in the official docs, and most recently, the unicodecsv module I found on github. Each method yields the same results.

The weird thing is I can print the strings in the python console no problem. Yet, if I take that same string and write it to csv, the problem emerges. See my test code & results below:

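Code to highlight issue:
# Imports assumed from the names used below (Python 2)
from StringIO import StringIO
import unicodecsv

print "'Raw' string from database:"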
print "\tencoding:\t" + whatisthis(report.data[1][0])
print "\tprint string:\t" + report.data[1][0]
print "\tstring len:\t" + str(len(report.data[1][0]))

f = StringIO()
w = unicodecsv.writer(f, encoding='utf-8')
w.writerows(report.data)
f.seek(0)
r = unicodecsv.reader(f)
row = r.next()
row = r.next()

print "Write/Read from csv file:"
print "\tencoding:\t" + whatisthis(row[0])
print "\tprint string:\t" + row[0]
print "\tstring len:\t" + str(len(row[0]))
Output from test:
'Raw' string from database:
    encoding: unicode string
    print string: University of Washington Students
    string len: 66
Write/Read from csv file:
    encoding: unicode string
    print string: U
    string len: 1
What could be the reason for this issue and how might I resolve it? Thanks!

EDIT: the whatisthis function is just to check the string format, taken from this post
def whatisthis(s):
    # Return the label (instead of printing it) so callers can concatenate it
    if isinstance(s, str):
        return "ordinary string"
    elif isinstance(s, unicode):
        return "unicode string"
    else:
        return "not a string"
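
A type check alone may not tell the whole story here. The reported length of 66 is exactly twice the 33 visible characters of "University of Washington Students", which hints that the string may contain characters that print does not show. Below is a minimal sketch of how one might check. It assumes, without confirmation from the question, that the value carries a NUL character after every visible character, as happens when UTF-16 data is decoded one byte at a time. The names `describe`, `EXPECTED`, and `suspect` are hypothetical, and the value is simulated rather than read from a database.

```python
# -*- coding: utf-8 -*-
# Hypothetical diagnostic helper; not part of the original question.

EXPECTED = u"University of Washington Students"

# Assumption: simulate a value whose UTF-16-LE bytes were decoded one byte
# per character, which leaves a NUL after every visible character.
suspect = EXPECTED.encode("utf-16-le").decode("latin-1")


def describe(value):
    """Summarize a string in a way that exposes characters print hides."""
    return {
        "type": type(value).__name__,
        "length": len(value),
        "nul_count": value.count(u"\x00"),
        "repr": repr(value),
    }


info = describe(suspect)
print(info["length"])      # 66, twice the visible length
print(info["nul_count"])   # 33, one NUL per visible character
print(info["repr"][:40])   # the escaped form shows the \x00 characters
```

Running `describe(report.data[1][0])` on the real value would show whether this assumption holds. If `nul_count` is 0, the cause lies elsewhere.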

Solution
import StringIO as sio
import unicodecsv as ucsv

class Report(object):
    def __init__(self, data):
        self.data = data

report = Report( 
  [
     ["University of Washington Students", 1, 2, 3],
     ["UCLA", 5, 6, 7]
  ]
)



print report.data
print report.data[0][0]

print "*" * 20

f = sio.StringIO()
writer = ucsv.writer(f, encoding='utf-8')
writer.writerows(report.data)

print f.getvalue()
print "-" * 20

f.seek(0)

reader = ucsv.reader(f)
row = reader.next()

print row
print row[0]



--output:--
[['University of Washington Students', 1, 2, 3], ['UCLA', 5, 6, 7]]
University of Washington Students
********************
University of Washington Students,1,2,3
UCLA,5,6,7

--------------------
[u'University of Washington Students', u'1', u'2', u'3']
University of Washington Students
Who knows what mischief your whatisthis() function is up to.
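
One detail in the question may point somewhere more specific: the raw string reports a length of 66, twice its 33 visible characters. That is consistent with, though it does not prove, a value containing embedded NUL characters. The Python 2 csv documentation warns that the module has issues with ASCII NUL characters, which would fit a field being cut off after its first character. The sketch below is an illustration under that assumption, not a confirmed diagnosis. It is written for Python 3 using only the standard library, `repair_cell` is a hypothetical helper, and the mangled value is simulated.

```python
import csv
import io


def repair_cell(value):
    """Undo the assumed mis-decoding (UTF-16-LE bytes read one byte per
    character). Values that are not strings, or contain no NUL, pass through."""
    if isinstance(value, str) and "\x00" in value:
        try:
            return value.encode("latin-1").decode("utf-16-le")
        except (UnicodeEncodeError, UnicodeDecodeError):
            return value  # not the pattern assumed here; leave untouched
    return value


# Simulated row: the first cell has a NUL after every visible character.
mangled = "University of Washington Students".encode("utf-16-le").decode("latin-1")
rows = [[mangled, 1423.0, 831, 1, 139]]

clean_rows = [[repair_cell(cell) for cell in row] for row in rows]

# Round-trip through an in-memory CSV, as in the answer above.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(clean_rows)
buffer.seek(0)
first_row = next(csv.reader(buffer))

print(first_row)
```

If inspecting the real data with `repr()` confirms embedded NULs, the more durable fix is probably in the driver or connection encoding settings rather than in post-processing each cell.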
