Provide a reproducible copy of the DataFrame with to_clipboard()


Problem description

2018-09-18_reproducible_dataframe.ipynb

This was marked as a duplicate. However, the other question and its answers do not cover to_clipboard, whereas this question specifically covers .to_clipboard and is more succinct.

This may seem like an obvious question. However, many of the users asking questions about pandas are new and inexperienced. A critical part of asking a question is knowing how to create a Minimal, Complete, and Verifiable example, which explains "what" and "why", but not "how".

For example, as the OP, I may have the following dataframe:

  • For this example, I've created synthetic data, which is one option for creating a reproducible dataset, but it is outside the scope of this question.
    • Think of it as if you've loaded a file and only need to share a small part of it to reproduce the error.
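      import pandas as pd
      import numpy as np
      
      np.random.seed(365)  # fixed seed so columns 'a' and 'b' are the same on every run
      data = {'a': [np.random.randint(10) for _ in range(15)],
              'b': [np.random.randint(10) for _ in range(15)],
              # pd.datetime was removed in pandas 2.0; pd.Timestamp.today() replaces it
              'date': pd.bdate_range(pd.Timestamp.today(), periods=15).tolist()}
      
      df = pd.DataFrame(data)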
      
          a  b       date
      0   2  0 2019-11-06
      1   4  8 2019-11-07
      2   1  4 2019-11-08
      3   5  3 2019-11-11
      4   2  2 2019-11-12
      5   2  6 2019-11-13
      6   9  2 2019-11-14
      7   8  6 2019-11-15
      8   4  8 2019-11-18
      9   0  9 2019-11-19
      10  3  6 2019-11-20
      11  3  1 2019-11-21
      12  7  6 2019-11-22
      13  7  5 2019-11-25
      14  7  7 2019-11-26
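
The example above seeds NumPy but starts the date range at today's date, so the `date` column changes from day to day. Below is a minimal sketch of a fully repeatable variant, assuming a fixed start date is acceptable for your question. The helper name `make_sample`, the start date `2019-11-06` and the use of `np.random.default_rng` are illustrative choices, not part of the original post, and the values in `a` and `b` will differ from the table above because a different generator is used.

```python
import numpy as np
import pandas as pd


def make_sample(n_rows=15, seed=365, start="2019-11-06"):
    """Build a small synthetic DataFrame that is identical on every run."""
    rng = np.random.default_rng(seed)  # local generator; global NumPy state is untouched
    return pd.DataFrame({
        "a": rng.integers(0, 10, size=n_rows),
        "b": rng.integers(0, 10, size=n_rows),
        # a fixed start date keeps the business-day range stable over time
        "date": pd.bdate_range(start, periods=n_rows),
    })


df = make_sample()
print(df.head())
```

Anyone who runs this gets the same frame, which makes it safer to quote expected results in a question.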
      

      The dataframe could be followed by some other code that produces an error or doesn't produce the desired outcome.

      Things that should be provided when asking a question on Stack Overflow:

      • A well-written, coherent question
      • The code that produces the error
      • The error stack
      • Potentially, the expected outcome of some code
      • The data, in an easily usable form

      Recommended answer

      The quickest way to provide sample data from a pandas DataFrame

      There is more than one way to answer this question. However, this answer isn't meant to provide an exhaustive solution. It provides the simplest method possible. For the curious, there are other, more verbose solutions on Stack Overflow.

      1. Provide a link to a shareable dataset (maybe on GitHub or a shared file on Google). This is particularly useful if it's a large dataset and the objective is to optimize some method. The drawback is that the data may no longer be available in the future, which reduces the benefit of the post.
        • Data must be provided in the question, but can be accompanied by a link to a more extensive dataset.
        • Do not post only a link or an image of the data.

      Code:

      Provide the output of pandas.DataFrame.to_clipboard

      df.head(10).to_clipboard(sep=',', index=False)
      

      • If you have a multi-index DataFrame or an index other than 0...n, use index=True and add a note to your question saying which column(s) are the index.
      • Note: when the previous line of code is executed, no output will appear. The result of the code is now on the clipboard.
      • Paste the clipboard into a code block in your Stack Overflow question:
          a,b,date
          2,0,2019-11-06
          4,8,2019-11-07
          1,4,2019-11-08
          5,3,2019-11-11
          2,2,2019-11-12
          2,6,2019-11-13
          9,2,2019-11-14
          8,6,2019-11-15
          4,8,2019-11-18
          0,9,2019-11-19
          

          • This can be copied to the clipboard by someone trying to answer your question, and then read with:
            df = pd.read_clipboard(sep=',')
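
Before posting, it can help to check that the pasted text really rebuilds the DataFrame you meant to share. The clipboard is awkward to script and may be unavailable on a headless machine, so this sketch uses `to_csv` and `read_csv` on an in-memory string as a stand-in, on the assumption that `to_clipboard(sep=',')` and `read_clipboard(sep=',')` write and parse the same CSV text. The three sample rows are taken from the output above; `parse_dates` and `index_col` are ordinary `read_csv` keyword arguments, which `read_clipboard` passes through.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({
    "a": [2, 4, 1],
    "b": [0, 8, 4],
    "date": pd.to_datetime(["2019-11-06", "2019-11-07", "2019-11-08"]),
})

# Default 0...n index: drop it, mirroring to_clipboard(sep=',', index=False).
text = df.to_csv(index=False)
restored = pd.read_csv(StringIO(text), parse_dates=["date"])

# Meaningful index: keep it (index=True) and say which column it is when reading back.
indexed = df.set_index("date")
text_indexed = indexed.to_csv(index=True)
restored_indexed = pd.read_csv(StringIO(text_indexed), index_col="date", parse_dates=True)

print(text)
print(restored.dtypes)
print(restored_indexed)
```

Without `parse_dates`, the `date` column comes back as plain strings, so it is worth telling answerers which columns hold dates. With the real clipboard the same keyword arguments apply, for example `pd.read_clipboard(sep=',', parse_dates=['date'])`.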
            
