Pythonic way to implement a tokenizer
Question
I'm going to implement a tokenizer in Python, and I was wondering if you could offer some style advice.
I've implemented a tokenizer before in C and in Java, so I'm fine with the theory. I'd just like to make sure I'm following Pythonic style and best practices.
Listing Token Types:
In Java, for example, I would have a list of fields like so:
public static final int TOKEN_INTEGER = 0;
But, as far as I know, there's no way to declare a constant in Python. I could just replace these with normal variable assignments, but that doesn't strike me as a great solution, since the values could be altered later.
Returning Tokens From The Tokenizer:
Is there a better alternative to simply returning a list of tuples, e.g.
[ (TOKEN_INTEGER, 17), (TOKEN_STRING, "Sixteen")]?
Cheers,
Pete
Recommended answer
Python takes a "we're all consenting adults" approach to information hiding. It's OK to use variables as though they were constants (by convention, module-level names written in UPPER_CASE) and trust that users of your code won't do something stupid like reassign them.
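As a rough illustration of that convention, here is a minimal sketch. It assumes a toy grammar of integers and alphabetic words only; the names `Token`, `tokenize`, and `_TOKEN_RE` are hypothetical and not part of the original question. It also shows one possible alternative to bare tuples, `collections.namedtuple`, which still behaves like a tuple but gives the fields names.

```python
import re
from collections import namedtuple

# UPPER_CASE module-level names are constants by convention only;
# the language does not prevent reassignment.
TOKEN_INTEGER = 0
TOKEN_STRING = 1

# Hypothetical token record: a tuple whose fields can be read by name.
Token = namedtuple("Token", ["type", "value"])

# Illustrative grammar (an assumption): runs of digits or runs of letters.
_TOKEN_RE = re.compile(r"(?P<integer>\d+)|(?P<string>[A-Za-z]+)")


def tokenize(text):
    """Yield a Token for each integer or word found in text."""
    for match in _TOKEN_RE.finditer(text):
        # lastgroup holds the name of the alternative that matched.
        if match.lastgroup == "integer":
            yield Token(TOKEN_INTEGER, int(match.group()))
        else:
            yield Token(TOKEN_STRING, match.group())


tokens = list(tokenize("17 Sixteen"))
```

Because a namedtuple compares equal to a plain tuple with the same contents, code that expects `(TOKEN_INTEGER, 17)` keeps working, while new code can write `tokens[0].type` and `tokens[0].value` instead of indexing by position. Writing `tokenize` as a generator is a design choice here: callers can wrap it in `list()` if they need the whole sequence at once.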