How to remove unconverted data from a Python datetime object
Problem description
I have a database of mostly correct datetimes, but a few are broken like so: Sat Dec 22 12:34:08 PST 20102015
Without the invalid year, this was working for me:
end_date = soup('tr')[4].contents[1].renderContents()
end_date = time.strptime(end_date,"%a %b %d %H:%M:%S %Z %Y")
end_date = datetime.fromtimestamp(time.mktime(end_date))
But once I hit an object with an invalid year I get ValueError: unconverted data remains: 2, which is great, but I'm not sure how best to strip the bad characters out of the year. They range from 2 to 6 unconverted characters.
Any pointers? I would just slice end_date, but I'm hoping there is a datetime-safe strategy.
Recommended answer
Yeah, I'd just chop off the extra digits. Assuming they are always appended to the end of the date string, something like this would work:
end_date = end_date.split(" ")
end_date[-1] = end_date[-1][:4]
end_date = " ".join(end_date)
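As a rough sketch, the slicing above can be wrapped in a small helper that also does the parsing. The function name `parse_truncating_year` and the `fmt` parameter are made up for illustration, and it uses `datetime.strptime` directly instead of the `time.strptime` plus `time.mktime` round trip from the question. One caveat worth knowing: `%Z` only recognizes `UTC`, `GMT`, and the local machine's own timezone names, so a string containing `PST` will only parse where the local zone is Pacific time.

```python
from datetime import datetime

# Format taken from the question; the helper name below is hypothetical.
FMT = "%a %b %d %H:%M:%S %Z %Y"


def parse_truncating_year(raw, fmt=FMT):
    """Keep only the first four characters of the last field, then parse.

    Assumes the junk digits are always appended to the year, which is
    the last space-separated field of the date string.
    """
    parts = raw.split(" ")
    parts[-1] = parts[-1][:4]  # "20102015" -> "2010"
    return datetime.strptime(" ".join(parts), fmt)


# Example call (UTC is used here because %Z accepts it on any machine):
# parse_truncating_year("Sat Dec 22 12:34:08 UTC 20102015")
```

The result is a naive datetime, since `%Z` on its own does not attach any timezone information to the parsed value.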
I was going to try to get the number of excess digits from the exception, but on my installed versions of Python (2.6.6 and 3.1.2) that information isn't actually there; it just says that the data does not match the format. Of course, you could just keep lopping off digits one at a time and re-parsing until you don't get an exception.
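The "lop off one digit and re-parse" idea might look something like the sketch below. The helper name `parse_dropping_trailing` and the `max_drop` limit are assumptions for illustration; the default of 6 comes from the question's remark that there are between 2 and 6 unconverted characters.

```python
from datetime import datetime


def parse_dropping_trailing(raw, fmt, max_drop=6):
    """Retry parsing, removing one trailing character after each failure.

    Hypothetical helper: gives up with ValueError once max_drop
    characters have been removed without a successful parse.
    """
    candidate = raw
    for _ in range(max_drop + 1):
        try:
            return datetime.strptime(candidate, fmt)
        except ValueError:
            candidate = candidate[:-1]  # drop the last character and retry
    raise ValueError(
        "could not parse %r after dropping up to %d characters" % (raw, max_drop)
    )


# Example call (UTC is used because %Z accepts it on any machine):
# parse_dropping_trailing("Sat Dec 22 12:34:08 UTC 20102015",
#                         "%a %b %d %H:%M:%S %Z %Y")
```

Unlike plain slicing, this does not need to know where the year sits in the string, but it does cost one failed parse per extra character.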
You could also write a regex that matches only valid dates, including the right number of digits in the year, but that seems like overkill.
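For completeness, here is one way the regex idea could be sketched. The pattern and the name `clean_date_string` are assumptions based only on the single sample string in the question; the pattern checks the shape of the string (not whether the date itself is valid), keeps a four-digit year, and discards any digits that follow it.

```python
import re

# Shape of "Sat Dec 22 12:34:08 PST 2010", followed by optional junk digits.
# Hypothetical pattern, derived from the one example in the question.
DATE_RE = re.compile(
    r"([A-Za-z]{3} [A-Za-z]{3} \d{1,2} \d{2}:\d{2}:\d{2} [A-Za-z]+ \d{4})\d*"
)


def clean_date_string(raw):
    """Return the date string with a four-digit year, or None if the
    input does not have the expected shape."""
    match = DATE_RE.fullmatch(raw)
    return match.group(1) if match else None


# The cleaned string can then be handed to strptime as in the question:
# clean_date_string("Sat Dec 22 12:34:08 PST 20102015")
```

Returning `None` for anything that does not fit lets the caller decide whether to skip the row or log it, rather than guessing.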