Decoding \u200e to string

Problem description

In Python3, I receive the following error message:

ValueError: time data '\u200e07-30-200702:38 PM' does not match format '%m-%d-%Y%I:%M %p'

from datetime import datetime

dateRegistered = '\u200e07-30-200702:38 PM'
# dateRegistered = '07-30-200702:38 PM'
dateRegistered = datetime.strptime(dateRegistered, '%m-%d-%Y%I:%M %p')
print(dateRegistered)

The code above reproduces the issue. It works if I uncomment the commented-out line, which assigns the same string without the leading \u200e. It seems the string I am receiving is encoded, but I could not find out which encoding it uses. Or do I have a non-printable character in my string?

print ('\u200e07-30-200702:38 PM')
>>>> 07-30-200702:38 PM

Recommended answer

You have a U+200E LEFT-TO-RIGHT MARK character in your input. It's a non-printing typesetting directive, instructing anything that is displaying the text to switch to left-to-right mode. The string, when printed to a console that is already set to display from left-to-right (e.g. the vast majority of terminals in the western world), will not look any different from one printed without the marker.
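One way to confirm that a string carries an invisible character like this is to print its escaped form and look up the Unicode name of anything in the "format" category (Cf), which is the category U+200E belongs to. This is a minimal sketch; the variable names are illustrative and not part of the original question.

```python
import unicodedata

date_registered = '\u200e07-30-200702:38 PM'

# ascii() escapes every non-ASCII character, so the hidden mark becomes visible.
print(ascii(date_registered))

# Collect (index, code point, official name) for each character in category Cf.
hidden = [
    (i, f'U+{ord(ch):04X}', unicodedata.name(ch, 'UNKNOWN'))
    for i, ch in enumerate(date_registered)
    if unicodedata.category(ch) == 'Cf'
]
print(hidden)
```

The list shows where the mark sits in the string, which helps decide whether trimming the ends is enough.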

Since it is not part of the date, you could just strip such characters:

datetime.strptime(dateRegistered.strip('\u200e'), '%m-%d-%Y%I:%M %p')

Or, if it is always present, explicitly add it to the format you are parsing, just like the -, : and space characters that are already part of your format:

datetime.strptime(dateRegistered, '\u200e%m-%d-%Y%I:%M %p')

Demo:

>>> from datetime import datetime
>>> dateRegistered = '\u200e07-30-200702:38 PM'
>>> datetime.strptime(dateRegistered.strip('\u200e'), '%m-%d-%Y%I:%M %p')
datetime.datetime(2007, 7, 30, 14, 38)
>>> datetime.strptime(dateRegistered, '\u200e%m-%d-%Y%I:%M %p')
datetime.datetime(2007, 7, 30, 14, 38)
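Note that str.strip only removes characters from the two ends of the string. If the mark could also appear in the middle of the input, one possible approach is to drop every character in Unicode category Cf before parsing. The sketch below assumes that is acceptable for your data (Cf also covers characters such as the zero-width joiner, which matter in some scripts but not in a date string). The helper name remove_format_chars and the sample input are hypothetical.

```python
import unicodedata
from datetime import datetime


def remove_format_chars(text):
    """Drop every character in Unicode category Cf (invisible formatting marks)."""
    return ''.join(ch for ch in text if unicodedata.category(ch) != 'Cf')


# Hypothetical input with a mark at the start and another mid-string.
raw = '\u200e07-30-2007\u200e02:38 PM'
cleaned = remove_format_chars(raw)
parsed = datetime.strptime(cleaned, '%m-%d-%Y%I:%M %p')
print(parsed)
```

Compared with strip('\u200e'), this also handles related marks such as U+200F RIGHT-TO-LEFT MARK, at the cost of being less explicit about which character is expected.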
