Removing comments with a sliding window without nested while loops
Problem description
I'm trying to remove comments and strings from a C source file. I'll just stick to comments for the examples. I have a sliding window, so I only have characters n and n-1 at any given moment. I'm trying to figure out an algorithm that does not use nested while loops if possible, but I will need one while loop to getchar through the input. My first thought was to loop until n = '*' and (n-1) = '/', then loop until n = '/' and (n-1) = '*', but since this uses nested while loops I feel it is inefficient. I can do it this way if I have to, but I was wondering if anyone had a better solution.
Recommended answer
Doing this correctly is more complicated than one may at first think, as ably pointed out by the other comments here. I would strongly recommend writing a table-driven FSM, using a state transition diagram to get the transitions right. Trying to do anything more than a few states with case statements is horribly error-prone IMO.
Here's a diagram in dot/graphviz format from which you could probably directly code a state table. Note that I haven't tested this at all, so YMMV.
The semantics of the diagram are that <ch> is a fall-through: it is taken when none of the other inputs listed for that state match. End of file is an error in any state except S0, and so is any character that is neither explicitly listed nor covered by <ch>. Every character scanned is printed, except when in a comment (S4 and S5) and when detecting a start comment (S1). You will have to buffer characters when detecting a start comment, then print them if it's a false start, or throw them away once it's certain that it really is a comment.
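The print-and-buffer rules above can be sketched in C with a single loop and one state variable. This is only an illustration under stated assumptions, not the answerer's code: it models just the comment states (S0, S1, S4, S5, S7), ignores string and character literals entirely, works on an in-memory string rather than getchar, and keeps the newline that ends a single-line comment. The names strip_comments and the enum constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* State names follow the diagram; only the comment-related subset is modelled. */
enum state { S0_INIT, S1_BEGIN_CMT, S4_IN_SLC, S5_IN_MLC, S7_END_MLC };

/* Copies `in` to `out` without comments, using one loop and one state
 * variable. `out` must have room for strlen(in) + 1 bytes.
 * Assumption: string and character literals are NOT handled here. */
void strip_comments(const char *in, char *out)
{
    enum state st = S0_INIT;
    size_t o = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0'; in++) {
        char c = *in;
        switch (st) {
        case S0_INIT:
            if (c == '/') st = S1_BEGIN_CMT;   /* hold the '/' back */
            else out[o++] = c;
            break;
        case S1_BEGIN_CMT:
            if (c == '/') st = S4_IN_SLC;      /* held '/' is discarded */
            else if (c == '*') st = S5_IN_MLC; /* held '/' is discarded */
            else {                             /* false start: release it */
                out[o++] = '/';
                out[o++] = c;
                st = S0_INIT;
            }
            break;
        case S4_IN_SLC:
            if (c == '\n') {                   /* assumption: keep the newline */
                out[o++] = c;
                st = S0_INIT;
            }
            break;
        case S5_IN_MLC:
            if (c == '*') st = S7_END_MLC;
            break;
        case S7_END_MLC:
            if (c == '/') st = S0_INIT;
            else if (c != '*') st = S5_IN_MLC;
            break;
        }
    }
    if (st == S1_BEGIN_CMT)
        out[o++] = '/';                        /* input ended after a lone '/' */
    out[o] = '\0';
}
```

Replacing the string walk with a getchar loop and the stores into out with putchar should give the streaming form the question asks for; the only buffered character is the single held '/'.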
In the dot diagram, sq is a single quote ' and dq is a double quote ".
digraph state_machine {
rankdir=LR;
size="8,5";
node [shape=doublecircle]; S0 /* init */;
node [shape=circle];
S0 /* init */ -> S1 /* begin_cmt */ [label = "'/'"];
S0 /* init */ -> S2 /* in_str */ [label = dq];
S0 /* init */ -> S3 /* in_ch */ [label = sq];
S0 /* init */ -> S0 /* init */ [label = "<ch>"];
S1 /* begin_cmt */ -> S4 /* in_slc */ [label = "'/'"];
S1 /* begin_cmt */ -> S5 /* in_mlc */ [label = "'*'"];
S1 /* begin_cmt */ -> S0 /* init */ [label = "<ch>"];
S1 /* begin_cmt */ -> S1 /* begin_cmt */ [label = "'\\n'"]; // handle "/\n/" and "/\n*"
S2 /* in_str */ -> S0 /* init */ [label = dq];
S2 /* in_str */ -> S6 /* str_esc */ [label = "'\\'"];
S2 /* in_str */ -> S2 /* in_str */ [label = "<ch>"];
S3 /* in_ch */ -> S0 /* init */ [label = sq];
S4 /* in_slc */ -> S4 /* in_slc */ [label = "<ch>"];
S4 /* in_slc */ -> S0 /* init */ [label = "'\\n'"];
S5 /* in_mlc */ -> S7 /* end_mlc */ [label = "'*'"];
S5 /* in_mlc */ -> S5 /* in_mlc */ [label = "<ch>"];
S7 /* end_mlc */ -> S7 /* end_mlc */ [label = "'*'|'\\n'"];
S7 /* end_mlc */ -> S0 /* init */ [label = "'/'"];
S7 /* end_mlc */ -> S5 /* in_mlc */ [label = "<ch>"];
S6 /* str_esc */ -> S8 /* oct */ [label = "[0-3]"];
S6 /* str_esc */ -> S9 /* hex */ [label = "'x'"];
S6 /* str_esc */ -> S2 /* in_str */ [label = "<ch>"];
S8 /* oct */ -> S10 /* o1 */ [label = "[0-7]"];
S10 /* o1 */ -> S2 /* in_str */ [label = "[0-7]"];
S9 /* hex */ -> S11 /* h1 */ [label = hex];
S11 /* h1 */ -> S2 /* in_str */ [label = hex];
S3 /* in_ch */ -> S12 /* ch_esc */ [label = "'\\'"];
S3 /* in_ch */ -> S13 /* out_ch */ [label = "<ch>"];
S13 /* out_ch */ -> S0 /* init */ [label = sq];
S12 /* ch_esc */ -> S3 /* in_ch */ [label = sq];
S12 /* ch_esc */ -> S12 /* ch_esc */ [label = "<ch>"];
}
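One way such a state table might look in C is sketched below. It is a simplified reading of the diagram, not a transcription, and the differences are assumptions made for brevity: the octal and hex escape states (S8 to S11) are folded into "backslash plus any one character", the two character-literal states S3 and S13 are merged, a quote seen right after a false-start '/' opens a literal instead of returning to S0, and the '\n' self-loops on S1 and S7 are not modelled. All identifiers (strip_comments_fsm, next_state, classify) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum state { INIT, BEGIN_CMT, IN_SLC, IN_MLC, END_MLC,
             IN_STR, STR_ESC, IN_CH, CH_ESC, NSTATES };
enum cls { C_SLASH, C_STAR, C_NL, C_DQ, C_SQ, C_BSLASH, C_OTHER, NCLASSES };

/* Map each input character to a column of the table. */
static enum cls classify(char c)
{
    switch (c) {
    case '/':  return C_SLASH;
    case '*':  return C_STAR;
    case '\n': return C_NL;
    case '"':  return C_DQ;
    case '\'': return C_SQ;
    case '\\': return C_BSLASH;
    default:   return C_OTHER;   /* plays the role of <ch> */
    }
}

/* Columns:        '/'        '*'      '\n'    '"'     '\''    '\\'     other */
static const unsigned char next_state[NSTATES][NCLASSES] = {
    [INIT]      = { BEGIN_CMT, INIT,    INIT,   IN_STR, IN_CH,  INIT,    INIT   },
    [BEGIN_CMT] = { IN_SLC,    IN_MLC,  INIT,   IN_STR, IN_CH,  INIT,    INIT   },
    [IN_SLC]    = { IN_SLC,    IN_SLC,  INIT,   IN_SLC, IN_SLC, IN_SLC,  IN_SLC },
    [IN_MLC]    = { IN_MLC,    END_MLC, IN_MLC, IN_MLC, IN_MLC, IN_MLC,  IN_MLC },
    [END_MLC]   = { INIT,      END_MLC, IN_MLC, IN_MLC, IN_MLC, IN_MLC,  IN_MLC },
    [IN_STR]    = { IN_STR,    IN_STR,  IN_STR, INIT,   IN_STR, STR_ESC, IN_STR },
    [STR_ESC]   = { IN_STR,    IN_STR,  IN_STR, IN_STR, IN_STR, IN_STR,  IN_STR },
    [IN_CH]     = { IN_CH,     IN_CH,   IN_CH,  IN_CH,  INIT,   CH_ESC,  IN_CH  },
    [CH_ESC]    = { IN_CH,     IN_CH,   IN_CH,  IN_CH,  IN_CH,  IN_CH,   IN_CH  },
};

/* Copies `in` to `out` with comments removed; literals are copied verbatim.
 * `out` must have room for strlen(in) + 1 bytes. */
void strip_comments_fsm(const char *in, char *out)
{
    enum state st = INIT;
    size_t o = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0'; in++) {
        char c = *in;
        enum state nx = (enum state)next_state[st][classify(c)];
        int in_comment = (st == IN_SLC || st == IN_MLC || st == END_MLC);
        int opens_comment = (st == BEGIN_CMT && (nx == IN_SLC || nx == IN_MLC));

        if (st == BEGIN_CMT && !opens_comment)
            out[o++] = '/';                  /* false start: release held '/' */
        if (nx == BEGIN_CMT || opens_comment)
            ;                                /* hold or discard this character */
        else if (!in_comment || (st == IN_SLC && c == '\n'))
            out[o++] = c;                    /* newline after // is kept */
        st = nx;
    }
    if (st == BEGIN_CMT)
        out[o++] = '/';                      /* input ended after a lone '/' */
    out[o] = '\0';
}
```

The loop body never changes when states are added; only the table grows, which is the main appeal over a long switch. Unlike the diagram, this sketch does not report errors such as end of file inside a comment or literal.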