Pretty printing newlines inside a string in a Pandas DataFrame


Question

I have a Pandas DataFrame in which one of the columns contains string elements, and those string elements contain newlines that I would like printed as actual line breaks. Instead, they appear as the escape sequence \n in the output.

That is, I want to print this:

  pos     bidder
0   1
1   2
2   3  <- alice
       <- bob
3   4

But this is what I get:

  pos            bidder
0   1
1   2
2   3  <- alice\n<- bob
3   4

How can I accomplish what I want? Can I use a DataFrame, or will I have to revert to manually printing padded columns one row at a time?

This is what I have so far:

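import pandas as pd

n = 4
output = pd.DataFrame({
    'pos': range(1, n+1),
    'bidder': [''] * n
})
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    row = pos - 1  # index labels start at 0, positions start at 1
    if pos in used_pos:
        # Position already taken: append the next bidder on a new line
        arrow = output.loc[row, 'bidder']
        output.loc[row, 'bidder'] = arrow + "\n<- %s" % bidder
    else:
        output.loc[row, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)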

Recommended answer

From the pandas.DataFrame documentation:

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

So you can't have a row without an index, and a newline "\n" inside a cell will not be rendered as a line break when the DataFrame is printed.
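To make the point above concrete, here is a minimal sketch (the one-row frame is made up for illustration). It suggests that the newline is stored in the cell unchanged, and that it is the tabular display that escapes it:

```python
import pandas as pd

# Illustrative one-row frame; the string really contains a newline character.
df = pd.DataFrame({'pos': [3], 'bidder': ['<- alice\n<- bob']})

cell = df.loc[0, 'bidder']     # the raw Python string stored in the cell
table_text = df.to_string()    # the same text that print(df) would show

print(table_text)  # table view: the newline appears as the two characters \n
print(cell)        # printing the cell itself emits a real line break
```

So the data is intact; only the rendering of the table keeps each row on a single line.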

You could overwrite 'pos' with an empty value and output the next 'bidder' on the following row. But then the index and 'pos' would be offset every time you do that, like this:

  pos    bidder
0   1          
1   2          
2   3  <- alice
3        <- bob
4   5   

So if a bidder called 'frank' had 4 as its value, it would overwrite 'bob'. This would cause more problems as you add more bidders. It is probably possible to use a DataFrame and write code to work around this issue, but it is probably worth looking into other solutions.
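One possible workaround along those lines, as a rough sketch rather than anything from the original answer: keep the newlines in the data and build a display-only copy in which every line of a multi-line cell gets its own row. The helper name expand_multiline is hypothetical, and the sketch assumes the frame's index labels are unique.

```python
import pandas as pd

def expand_multiline(df, column):
    """Return a display-only copy where each line of a multi-line string
    in `column` gets its own row (hypothetical helper, assumes a unique index)."""
    out = df.copy()
    # Split each cell on newlines, then give every piece its own row.
    out[column] = out[column].str.split('\n')
    out = out.explode(column)
    # Rows created by explode repeat the original index label.
    continuation = out.index.duplicated(keep='first')
    other_cols = [c for c in out.columns if c != column]
    # Use object dtype so the other columns can hold '' next to numbers.
    out = out.astype({c: object for c in other_cols})
    out.loc[continuation, other_cols] = ''
    # Blank the repeated index labels on continuation rows as well.
    out.index = ['' if dup else label
                 for label, dup in zip(out.index, continuation)]
    return out

df = pd.DataFrame({'pos': [1, 2, 3, 4],
                   'bidder': ['', '', '<- alice\n<- bob', '']})
result = expand_multiline(df, 'bidder')
print(result)
```

Because the original frame is left untouched, 'frank' at position 4 would not collide with 'bob'; the extra rows exist only in the copy used for printing.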

Here is the code that produces the output structure above:

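import pandas as pd

n = 5
output = pd.DataFrame({'pos': range(1, n + 1),
                       'bidder': [''] * n},
                      columns=['pos', 'bidder'])
# 'pos' will hold both ints and '' below, so store it as object dtype.
output['pos'] = output['pos'].astype(object)
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    if pos in used_pos:
        # Position already taken: spill into the following row and blank its 'pos'.
        output.loc[pos, 'bidder'] = "<- %s" % bidder
        output.loc[pos, 'pos'] = ''
    else:
        # Index labels start at 0, so position `pos` lives in row pos - 1.
        output.loc[pos - 1, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)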

Another option is to restructure the data and the output. You could use the pos values as columns and create a new row for each key/person in the data. The code example below prints the DataFrame with NaN values replaced by empty strings.

import pandas as pd

data = {'johnny\nnewline': 2, 'alice': 3, 'bob': 3,
        'frank': 4, 'lisa': 1, 'tom': 8}
n = range(1, max(data.values()) + 1)

# Create DataFrame with columns = pos
output = pd.DataFrame(columns=n, index=[])

# Populate DataFrame with rows
for index, (bidder, pos) in enumerate(data.items()):
    output.loc[index, pos] = bidder

# Print the DataFrame and remove NaN to make it easier to read.
print(output.fillna(''))

# Fetch and print every element in column 2; printing a cell directly
# renders its newline as a real line break.
for index in output.index:
    print(output.loc[index, 2])
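A related restructuring, offered here only as a sketch and not taken from the original answer, is to keep one row per (bidder, pos) pair. Several bidders on the same position then never need to share a cell, so no newline is required at all. The names long_df and per_pos and the sample bids are illustrative.

```python
import pandas as pd

# Illustrative sample data: bidder name -> position.
bids = {'alice': 3, 'bob': 3, 'frank': 4}

# Long format: one row per bidder, sorted by position.
long_df = (pd.DataFrame({'bidder': list(bids.keys()),
                         'pos': list(bids.values())})
           .sort_values('pos', kind='stable')
           .reset_index(drop=True))

# Collect all bidders that share a position into a list.
per_pos = long_df.groupby('pos')['bidder'].agg(list)

# Print one "<- name" line per bidder, grouped by position.
for pos, names in per_pos.items():
    for name in names:
        print(pos, '<-', name)
```

Whether this or the column-per-position layout is preferable probably depends on how the table is consumed afterwards.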

It depends on what you want to do with the data, though. Good luck :)
