How to provide a copy of your DataFrame with to_clipboard()
Question
2018-09-18_reproducible_dataframe.ipynb
This was marked as a duplicate of How to make good reproducible pandas examples. The other question and associated answers cover how to create a reproducible dataframe. They do not cover how to copy an existing dataframe with to_clipboard, while this question specifically covers .to_clipboard and is more succinct.
This may seem like an obvious question. However, many of the users asking questions about Pandas are new and inexperienced. A critical component of asking a question is How to create a Minimal, Complete, and Verifiable example, which explains "what" and "why", but not "how".
For example, as the OP, I may have the following dataframe:
- For this example, I've created synthetic data, which is one option for creating a reproducible dataset, but is not within the scope of this question.
- Think of this as if you've loaded a file and only need to share a bit of it to reproduce the error.
    import pandas as pd
    import numpy as np
    from datetime import datetime

    np.random.seed(365)
    data = {'a': [np.random.randint(10) for _ in range(15)],
            'b': [np.random.randint(10) for _ in range(15)],
            'date': pd.bdate_range(datetime.today(), periods=15).tolist()}
    df = pd.DataFrame(data)

        a  b       date
    0   2  0 2019-11-06
    1   4  8 2019-11-07
    2   1  4 2019-11-08
    3   5  3 2019-11-11
    4   2  2 2019-11-12
    5   2  6 2019-11-13
    6   9  2 2019-11-14
    7   8  6 2019-11-15
    8   4  8 2019-11-18
    9   0  9 2019-11-19
    10  3  6 2019-11-20
    11  3  1 2019-11-21
    12  7  6 2019-11-22
    13  7  5 2019-11-25
    14  7  7 2019-11-26
The dataframe could be followed by some other code that produces an error or doesn't produce the desired outcome.
Things that should be provided when asking a question on Stack Overflow.
- A well-written, coherent question
- The code that produces the error
- The error stack
- Potentially, the expected outcome of some code
- The data, in an easily usable form
Recommended answer
The quickest way to provide sample data from a pandas DataFrame
There is more than one way to answer this question. However, this answer isn't meant to provide an exhaustive solution. It provides the simplest method possible. For the curious, there are other more verbose solutions provided on Stack Overflow.
- Provide a link to a shareable dataset (maybe on GitHub or a shared file on Google). This is particularly useful if it's a large dataset and the objective is to optimize some method. The drawback is that the data may no longer be available in the future, which reduces the benefit of the post.
- Data must be provided in the question, but can be accompanied by a link to a more extensive dataset.
- Do not post only a link or an image of the data.
Code:
Provide the output of pandas.DataFrame.to_clipboard
df.head(10).to_clipboard(sep=',', index=False)
- If you have a multi-index DataFrame or an index other than 0...n, use index=True and provide a note in your question as to which column(s) are the index.
- Note: when the previous line of code is executed, no output will appear. The result of the code is now in the clipboard.
- Paste the clipboard into a code block in your Stack Overflow question.
    a,b,date
    2,0,2019-11-06
    4,8,2019-11-07
    1,4,2019-11-08
    5,3,2019-11-11
    2,2,2019-11-12
    2,6,2019-11-13
    9,2,2019-11-14
    8,6,2019-11-15
    4,8,2019-11-18
    0,9,2019-11-19
- This can be copied to the clipboard by someone trying to answer your question, and then read with:
df = pd.read_clipboard(sep=',')
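The round trip above can be sketched as follows. With sep=',', to_clipboard places comma-separated text on the clipboard, and pd.read_clipboard(sep=',') parses it back. Because a clipboard isn't available in every environment, this sketch substitutes to_csv and io.StringIO for the copy and paste steps. The small dataframe and the variable names are illustrative assumptions, and parse_dates is added here only to restore the date dtype.

```python
import io
import pandas as pd

# Illustrative data (an assumption, not the question's exact dataframe)
df = pd.DataFrame({
    'a': [2, 4, 1],
    'b': [0, 8, 4],
    'date': pd.to_datetime(['2019-11-06', '2019-11-07', '2019-11-08']),
})

# Stand-in for df.head(10).to_clipboard(sep=',', index=False):
# produce the same kind of comma-separated text, but as a string
csv_text = df.head(10).to_csv(index=False)
print(csv_text)

# Stand-in for pd.read_clipboard(sep=','): parse the pasted text
restored = pd.read_csv(io.StringIO(csv_text), sep=',', parse_dates=['date'])
print(restored.dtypes)
```

Without parse_dates, the date column comes back as plain text, which is worth mentioning in the question if the dtype matters to the problem.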
Locations of the dataframe other than .head(10)
- Specify a section of the dataframe with the .iloc property.
- The following example selects rows 3 - 11 and all the columns.
df.iloc[3:12, :].to_clipboard(sep=',')
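A short sketch of what this slice selects, assuming the default 0...n index: the end position of .iloc[3:12, :] is exclusive, so it returns rows 3 through 11. Since to_clipboard(sep=',') keeps the index unless index=False is passed, whoever reads the data back can restore it with index_col=0. As before, to_csv and io.StringIO stand in for the clipboard, and the dataframe is illustrative rather than the one from the question.

```python
import io
import pandas as pd

# Illustrative dataframe with 15 rows and the default 0...14 index
df = pd.DataFrame({'a': range(15), 'b': range(15, 30)})

# Positional slice: the end position is exclusive, so rows 3 through 11
section = df.iloc[3:12, :]
print(section.index.tolist())

# Stand-in for section.to_clipboard(sep=','), which keeps the index
csv_text = section.to_csv()

# Stand-in for pd.read_clipboard(sep=',', index_col=0)
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(restored)
```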
Google Colab Users
- .to_clipboard() won't work.
- Do the following:
    # if you have a datetime column, convert it to a str
    df['date'] = df['date'].astype('str')

    # output to a dict
    df.head(10).to_dict()

    # paste into a code block on SO

    # convert datetime column back
    df['date'] = pd.to_datetime(df['date'])
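For the person answering, rebuilding the dataframe from the pasted dict might look like the sketch below. The shortened dataframe and the names as_dict and rebuilt are illustrative assumptions. pd.DataFrame accepts the nested dict that to_dict() produces by default, and pd.to_datetime restores the date column.

```python
import pandas as pd

# Illustrative data (an assumption, not the question's exact dataframe)
df = pd.DataFrame({
    'a': [2, 4, 1],
    'b': [0, 8, 4],
    'date': pd.to_datetime(['2019-11-06', '2019-11-07', '2019-11-08']),
})

# Convert the datetime column to str so the dict is plain text
df['date'] = df['date'].astype('str')
as_dict = df.head(10).to_dict()
print(as_dict)

# The person answering pastes the dict and rebuilds the dataframe
rebuilt = pd.DataFrame(as_dict)
rebuilt['date'] = pd.to_datetime(rebuilt['date'])
print(rebuilt.dtypes)
```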