Pretty printing newlines inside a string in a Pandas DataFrame

Question

I have a Pandas DataFrame in which one of the columns contains string elements, and those string elements contain new lines that I would like to print literally. But they just appear as \n in the output.

That is, I want to print this:

  pos     bidder
0   1
1   2
2   3  <- alice
       <- bob
3   4

But this is what I get:

  pos            bidder
0   1
1   2
2   3  <- alice\n<- bob
3   4

How can I accomplish what I want? Can I use a DataFrame, or will I have to revert to manually printing padded columns one row at a time?

Here's what I have so far:

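import pandas as pd

n = 4
output = pd.DataFrame({
    'pos': range(1, n+1),
    'bidder': [''] * n
})
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    row = pos - 1  # positions are 1-based, index labels are 0-based
    if pos in used_pos:
        # Position already has a bidder: append on a new line in the same cell
        arrow = output.loc[row, 'bidder']
        output.loc[row, 'bidder'] = arrow + "\n<- %s" % bidder
    else:
        output.loc[row, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)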

Recommended answer

From the pandas.DataFrame documentation:

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure

So you can't have a row without an index. A newline "\n" inside a cell is stored as part of the string, but the DataFrame's text output shows it as an escaped \n instead of breaking the cell across two printed lines.
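To make the behavior concrete, here is a minimal sketch (the frame and the variable names `df`, `cell` and `rendered` are made up for illustration). It suggests that the newline is present in the stored string and is only escaped when the frame is rendered as a table:

```python
import pandas as pd

# Hypothetical three-row frame; the last cell contains a real newline.
df = pd.DataFrame({"pos": [1, 2, 3],
                   "bidder": ["", "", "<- alice\n<- bob"]})

# Printing the cell on its own breaks the line as expected.
cell = df.loc[2, "bidder"]
print(cell)

# The tabular rendering shows the newline as a backslash followed by "n".
rendered = df.to_string()
print(rendered)
```

So the data itself is intact; only the table formatter refuses to spread one row over several printed lines.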

You could overwrite 'pos' with an empty value, and output the next 'bidder' on the next row. But then index and 'pos' would be offset every time you do that. Like:

  pos    bidder
0   1          
1   2          
2   3  <- alice
3        <- bob
4   5   

So if a bidder called 'frank' had 4 as value, it would overwrite 'bob'. This would cause problems as you add more. It is probably possible to use DataFrame and write code to work around this issue, but probably worth looking into other solutions.

Here is the code to produce the output structure above.

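import pandas as pd

n = 5
output = pd.DataFrame({'pos': range(1, n + 1),
                      'bidder': [''] * n},
                      columns=['pos', 'bidder'])
# Allow blank strings in 'pos' alongside the integers.
output['pos'] = output['pos'].astype(object)
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    if pos in used_pos:
        # Position already taken: use the next row and blank its 'pos'.
        output.loc[pos, 'bidder'] = "<- %s" % bidder
        output.loc[pos, 'pos'] = ''
    else:
        # .ix was removed from pandas; .loc does the same label lookup here.
        output.loc[pos - 1, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)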
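The fallback mentioned in the question, printing padded columns by hand, can also be sketched without changing the data. The helper below, `to_multiline_string`, is a hypothetical name and not part of pandas. It assumes left-aligned columns and a two-space gap, so its layout is close to the desired output but does not reproduce pandas' own alignment:

```python
import pandas as pd


def to_multiline_string(df):
    """Render df as text, spreading cells that contain newlines over several lines."""
    header = [""] + [str(col) for col in df.columns]
    # Each row becomes a list of cells; each cell is a list of its text lines.
    rows = [
        [[str(idx)]] + [str(value).split("\n") for value in row]
        for idx, row in zip(df.index, df.itertuples(index=False))
    ]
    # A column is as wide as the longest single line found in it.
    widths = [len(h) for h in header]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], *(len(line) for line in cell))

    def fmt(cells):
        # Left-align every cell and drop trailing padding.
        return "  ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()

    lines = [fmt(header)]
    for row in rows:
        height = max(len(cell) for cell in row)
        for k in range(height):
            lines.append(fmt([cell[k] if k < len(cell) else "" for cell in row]))
    return "\n".join(lines)


output = pd.DataFrame({"pos": [1, 2, 3, 4],
                       "bidder": ["", "", "<- alice\n<- bob", ""]})
text = to_multiline_string(output)
print(text)
```

Because the DataFrame keeps one row per position, a later bidder at position 4 would not collide with 'bob'; only the printed form gains an extra line.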

Another option is to restructure the data and output. You could have pos as columns, and create a new row for each key/person in the data. In the code example below it prints the DataFrame with NaN values replaced with an empty string.

import pandas as pd

data = {'johnny\nnewline': 2, 'alice': 3, 'bob': 3,
        'frank': 4, 'lisa': 1, 'tom': 8}
n = range(1, max(data.values()) + 1)

# Create DataFrame with columns = pos
output = pd.DataFrame(columns=n, index=[])

# Populate DataFrame with rows
for index, (bidder, pos) in enumerate(data.items()):
    output.loc[index, pos] = bidder

# Print the DataFrame and remove NaN to make it easier to read.
print(output.fillna(''))

# Fetch and print every non-empty element in column 2
for value in output[2].dropna():
    print(value)
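A different restructuring, not the one used in this answer, keeps 'pos' as a column and gives every bidder a row of its own, so positions repeat instead of being blanked. This is only a sketch: it assumes the newline-joined strings from the question and a pandas version that provides `DataFrame.explode` (0.25 or later), and the name `long_form` is made up:

```python
import pandas as pd

output = pd.DataFrame({"pos": [1, 2, 3, 4],
                       "bidder": ["", "", "<- alice\n<- bob", ""]})

# Split each cell on newlines, then give every piece its own row.
long_form = (
    output.assign(bidder=output["bidder"].str.split("\n"))
          .explode("bidder")
          .reset_index(drop=True)
)
print(long_form)
```

Here 'pos' 3 appears twice, once per bidder, which avoids the offset between index and 'pos' described earlier.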

It depends what you want to do with the data though. Good luck :)
