Pythonic way to implement a tokenizer

Question

I'm going to implement a tokenizer in Python, and I was wondering if you could offer some style advice.

I've implemented a tokenizer before in C and in Java, so I'm fine with the theory. I'd just like to ensure I'm following Pythonic style and best practices.

Listing Token Types:

In Java, for example, I would have a list of fields like so:

public static final int TOKEN_INTEGER = 0;

But there's no way (I think) to declare a constant variable in Python. I could just replace this with normal variable declarations, but that doesn't strike me as a great solution, since the values could be altered.

Returning Tokens From The Tokenizer:

Is there a better alternative to simply returning a list of tuples, e.g.

[ (TOKEN_INTEGER, 17), (TOKEN_STRING, "Sixteen")]?

Cheers,

Pete

Recommended answer

Python takes a "we're all consenting adults" approach to information hiding. It's OK to use variables as though they were constants, and trust that users of your code won't do something stupid.
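To make that concrete, here is a minimal sketch of how the two parts of the question might look in Python. It is one possible approach, not the answerer's own code. The names TokenType, Token, tokenize and the toy grammar (unsigned integers and double-quoted strings) are assumptions made up for illustration. It shows constants by naming convention, an Enum as a stricter alternative, and a NamedTuple yielded from a generator in place of a list of bare tuples.

```python
import re
from enum import Enum
from typing import Iterator, NamedTuple

# Constants by convention: upper-case module-level names signal "do not rebind".
# Nothing enforces this; readers of the code are trusted to respect it.
TOKEN_INTEGER = 0
TOKEN_STRING = 1


class TokenType(Enum):
    """Stricter alternative to bare ints: Enum members cannot be reassigned."""
    INTEGER = "INTEGER"
    STRING = "STRING"


class Token(NamedTuple):
    """Still a tuple, but the fields are also readable by name."""
    kind: TokenType
    value: object


# Hypothetical toy grammar: unsigned integers and double-quoted strings,
# optionally preceded by whitespace.
_TOKEN_RE = re.compile(r'\s*(?:(?P<INTEGER>\d+)|"(?P<STRING>[^"]*)")')


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens lazily instead of building the whole list up front."""
    text = text.rstrip()  # so trailing whitespace is not reported as an error
    pos = 0
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        if match.lastgroup == "INTEGER":
            yield Token(TokenType.INTEGER, int(match.group("INTEGER")))
        else:
            yield Token(TokenType.STRING, match.group("STRING"))
        pos = match.end()
```

Calling list(tokenize('17 "Sixteen"')) gives two Token objects that still compare equal to plain tuples such as (TokenType.INTEGER, 17), so code that expects the list-of-tuples shape from the question keeps working while new code can use token.kind and token.value.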
