pandas display: truncate column display rather than wrapping

Question

With lengthy column names, DataFrames display in a very messy form, seemingly no matter what options are set.

Info: I'm using Jupyter QtConsole with pandas 0.20.1, and the following relevant options are set at startup:

import pandas as pd

pd.set_option('display.max_colwidth', 20)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.max_rows', 25)

Question: how can I truncate the DataFrame when necessary, rather than wrapping the columns onto the next line, while keeping expand_frame_repr=False?

Here's an example. Again, the issue doesn't depend on the number of columns but on the length of the column names.

This does not cause a problem:

import numpy as np

df = pd.DataFrame(np.random.randn(1000, 1000),
                  columns=['col' + str(i) for i in range(1000)])

The output is perfectly readable.

The same DataFrame with long column names causes the issue I'm describing:

df = pd.DataFrame(np.random.randn(1000, 1000),
                  columns=['very_long_col_name_' + str(i)
                           for i in range(1000)])

Is there an option I'm missing that would make the second output look like the first? (I mean by specifying an option, not by using .iloc every time I want to view the frame.)

Recommended answer

It looks like this will need an enhancement. The relevant code in the repr function appears to be here:

    max_rows = get_option("display.max_rows")
    max_cols = get_option("display.max_columns")
    show_dimensions = get_option("display.show_dimensions")
    if get_option("display.expand_frame_repr"):
        width, _ = console.get_console_size()
    else:
        width = None
    self.to_string(buf=buf, max_rows=max_rows, max_cols=max_cols,
                   line_width=width, show_dimensions=show_dimensions)

So either expand_frame_repr=True and the output wraps at the console width, or expand_frame_repr=False and it should not wrap. But it looks like there is a bug in the code (this should be pandas 0.20.3, if I recall correctly):

In pd.io.formats.format.DataFrameFormatter:

def _chk_truncate(self):
    """
    Checks whether the frame should be truncated. If so, slices
    the frame up.
    """
    from pandas.core.reshape.concat import concat

    # Column of which first element is used to determine width of a dot col
    self.tr_size_col = -1

    # Cut the data to the information actually printed
    max_cols = self.max_cols
    max_rows = self.max_rows

    if max_cols == 0 or max_rows == 0:  # assume we are in the terminal
                                        # (why else = 0)
        (w, h) = get_terminal_size()
        self.w = w
        self.h = h
        if self.max_rows == 0:
            dot_row = 1
            prompt_row = 1
            if self.show_dimensions:
                show_dimension_rows = 3
            n_add_rows = (self.header + dot_row + show_dimension_rows +
                          prompt_row)
            # rows available to fill with actual data
            max_rows_adj = self.h - n_add_rows
            self.max_rows_adj = max_rows_adj

        # Format only rows and columns that could potentially fit the
        # screen
        if max_cols == 0 and len(self.frame.columns) > w:
            max_cols = w
        if max_rows == 0 and len(self.frame) > h:
            max_rows = h

It looks like this code was intended to do what you want, but it was left unfinished. It checks max_cols against the number of columns, not against the total width of the columns.
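To make the mismatch concrete, here is a minimal sketch. It assumes a fixed console width of 80 characters and approximates each column's rendered width as its header length plus a two-space separator; both are illustrative assumptions, not values taken from pandas.

```python
import numpy as np
import pandas as pd

# Assumption: an 80-character console, chosen only for illustration.
console_width = 80

df = pd.DataFrame(np.random.randn(5, 50),
                  columns=['very_long_col_name_' + str(i) for i in range(50)])

# The check in the excerpt compares the *number* of columns to the width.
count_check = len(df.columns) > console_width

# Rough estimate of the rendered width: header length plus a two-space
# separator per column (an approximation, not what pandas computes).
approx_width = sum(len(name) + 2 for name in df.columns)
width_check = approx_width > console_width

print(count_check, width_check)
```

With 50 long column names, the count-based check does not trigger even though the estimated width is far larger than the console.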

So you could either write a show_df function that calculates the correct number of columns and displays the frame inside an option_context, like pi2Squared's answer, or fix it here (and maybe submit a patch if you need the fix distributed).
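One possible shape for such a helper is sketched below. The name show_df comes from the answer, but the body is an assumption rather than the code from pi2Squared's answer: it lowers display.max_columns step by step until every line of the repr fits within the given width. The width parameter and its fallback to shutil.get_terminal_size() are also illustrative choices.

```python
import shutil

import numpy as np
import pandas as pd


def show_df(df, width=None):
    """Print df with as many columns as fit in `width` characters.

    A sketch only: it re-renders the frame with fewer and fewer columns
    until no line of the repr is wider than `width`.
    """
    if width is None:
        # Assumption: the terminal size is a reasonable default width.
        width = shutil.get_terminal_size().columns

    # Each displayed column needs at least one character plus a separator,
    # so width // 2 is a safe upper bound on the number of columns.
    upper = max(1, min(len(df.columns), width // 2))

    for n_cols in range(upper, 0, -1):
        with pd.option_context('display.max_columns', n_cols,
                               'display.expand_frame_repr', False):
            text = repr(df)
        if max(len(line) for line in text.splitlines()) <= width:
            break

    # If even a single column is too wide, the narrowest attempt is printed.
    print(text)
    return text


df = pd.DataFrame(np.random.randn(100, 200),
                  columns=['very_long_col_name_' + str(i) for i in range(200)])
text = show_df(df, width=80)
```

Rendering the frame repeatedly is wasteful, but each attempt only formats the truncated frame, so the cost stays small. The global options are left untouched because the changes are confined to the option_context block.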
