Convert datetime to another format without changing dtype

Question

I'm teaching myself Pandas and have run into a couple of problems.

  1. In a DataFrame read from a csv file, one column contains dates in different formats (such as '%m/%d/%Y' and '%Y-%m-%d', and some values may be blank), and I want to unify the format of this column. I don't know whether other formats are present as well. When I use pd.to_datetime(), it raises errors such as the format not matching and the data not being time-like. How can I unify the format of this column?

  2. I have converted part of that column to a datetime dtype, and it displays in YYYY-mm-dd format. Can I keep the datetime dtype and change the format to '%m/%d/%Y'? I have tried Series.dt.strftime(); it changes the format, but it also changes the dtype to str instead of keeping the datetime dtype.

Recommended answer

When I use pd.to_datetime(), it raises errors such as the format not matching and the data not being time-like. How can I unify the format of this column?

Use the errors='coerce' option to return NaT (Not a Time) for values that cannot be converted. Also note that the format argument is not required. Omitting it lets Pandas try multiple formats, and any value it still cannot parse becomes NaT [1]. For example:

df['datetime'] = pd.to_datetime(df['datetime'], errors='coerce')

Beware: mixed formats may be interpreted incorrectly. For example, how would the parser know whether 05/06/2018 is 5th June or 6th May? A default order of conventions is applied (month first, unless you pass dayfirst=True), and if you need greater control you will have to apply a customised ordering yourself.
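One way to apply such a customised ordering is to try each expected format explicitly and keep the first one that parses. The sketch below is only an illustration: the helper name parse_with_formats and the sample values are made up for this example, and it assumes you can list the formats you expect. Passing format explicitly also sidesteps differences between pandas versions in how an omitted format is inferred.

```python
import pandas as pd


def parse_with_formats(values, formats):
    """Hypothetical helper: try each format in order, keep the first successful parse."""
    # Values that do not match the first format become NaT.
    result = pd.to_datetime(values, format=formats[0], errors="coerce")
    for fmt in formats[1:]:
        # Fill only the remaining NaT slots with the next format's result.
        result = result.fillna(pd.to_datetime(values, format=fmt, errors="coerce"))
    return result


# Sample data (made up): two formats, a blank, junk text and a missing value.
raw = pd.Series(["05/06/2018", "2018-07-04", "", "not a date", None])

# Month-first is tried before ISO dates; reorder the list to change precedence.
unified = parse_with_formats(raw, ["%m/%d/%Y", "%Y-%m-%d"])
print(unified)
```

Anything that matches none of the listed formats stays NaT, so unified.isna() shows which rows still need attention.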

Can I keep the datetime dtype and change the format to '%m/%d/%Y'?

No, you cannot. A datetime series is stored internally as integers. Any human-readable date is just that, a representation, not the underlying integer. To get custom formatting, use the methods available in Pandas. You can even store such a text representation in a pd.Series variable:

formatted_dates = df['datetime'].dt.strftime('%m/%d/%Y')

The dtype of formatted_dates will be object, which indicates that the elements of your series point to arbitrary Python types. In this case, those arbitrary types happen to all be strings.
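To make the distinction concrete, here is a small sketch with made-up sample values. It shows that the same date written in two formats parses to the same stored value, and that strftime produces strings rather than a differently formatted datetime.

```python
import pandas as pd

# The same date written two ways (sample values made up for illustration).
iso = pd.to_datetime(pd.Series(["2018-05-06"]), format="%Y-%m-%d")
us = pd.to_datetime(pd.Series(["05/06/2018"]), format="%m/%d/%Y")

# Once parsed, the original text format is gone: both hold the same instant.
same_instant = bool((iso == us).all())

# Timestamp.value exposes the integer count of nanoseconds since 1970-01-01.
epoch_ns = iso.iloc[0].value

# strftime returns text, so the result is no longer a datetime series.
formatted = iso.dt.strftime("%m/%d/%Y")
still_datetime = pd.api.types.is_datetime64_any_dtype(formatted)

# Parsing the text again recovers the datetime series.
round_trip = pd.to_datetime(formatted, format="%m/%d/%Y")

print(same_instant, epoch_ns, formatted.iloc[0], still_datetime)
```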

Lastly, I strongly recommend you do not convert a datetime series to strings until the very last step in your workflow. This is because as soon as you do so, you will no longer be able to use efficient, vectorised operations on such a series.
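As a rough sketch of that workflow (the column names and sample dates are made up), you can keep the column as datetime for all calculations and apply the display format only when writing output, for instance through the date_format argument of to_csv:

```python
import pandas as pd

# Sample frame (made up) with a real datetime column.
df = pd.DataFrame({"datetime": pd.to_datetime(["2018-05-06", "2018-07-04"])})

# Vectorised operations are available while the column is still datetime.
df["month"] = df["datetime"].dt.month
df["plus_week"] = df["datetime"] + pd.Timedelta(days=7)

# Format only at output time; the DataFrame itself keeps its datetime dtype.
csv_text = df[["datetime"]].to_csv(index=False, date_format="%m/%d/%Y")
print(csv_text)
```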

[1] This sacrifices performance and contrasts with datetime.strptime, which requires the format to be specified. Internally, Pandas uses the dateutil library, as indicated in the docs.
