创建Brainfuck解析器,解析循环运算符的最佳方法是什么? [英] Creating a Brainfuck parser, whats the best method of parsing loop operators?
问题描述
我正在创建一个Brainfuck解析器(使用BASIC方言),最终创建了一个解释器,但是我意识到它并不像我最初想象的那样简单.我的问题是我需要一种方法来准确解析Brainfuck程序中的匹配循环运算符.这是一个示例程序:
I'm creating a Brainfuck parser (in a BASIC dialect) ultimately to create an interpreter but i've realise it's not as straight forward as i first thought. My problem is that i need a way to accurately parse the matching loop operators within a Brainfuck program. This is an example program:
,>,>++++++++[<------<------>>-]
<<[>[>+>+<<-]>>[<<+>>-]<<<-]
>>>++++++[<++++++++>-],<.>.
'['=循环开始
']'=循环结束
我需要记录每个匹配循环运算符的起点和终点,以便我可以根据需要在源代码周围跳转.有些循环是单独的,有些则是嵌套的.
I need to record the start and end point of each matching loop operator so i can jump around the source as needed. Some loops are alone, some are nested.
解析此问题的最佳方法是什么?我在想,也许在源文件中移动以创建一个2D数组(或类似的记录),以记录每个匹配运算符的开始和结束位置,但这似乎在源中进行了很多往返"操作.这是最好的方法吗?
What would be the best way to parse this? I was thinking maybe move through the source file creating a 2D array (or such like) recording the start and end positions of each matching operator, but this seems like a lot of 'to'ing and fro'ing' through the source. Is this the best way to do it?
更多信息: Brainfuck主页
任何语言的示例代码都将受到赞赏.
Sample code in any language greatly appreciated.
推荐答案
您是否考虑过使用Stack数据结构来记录跳转点"(即指令指针的位置).
Have you considered using a Stack data structure to record "jump points" (i.e. the location of the instruction pointer).
因此,基本上,每次遇到"["时,都将指令指针的当前位置压入该堆栈.每当遇到]"时,都将指令指针重置为堆栈顶部当前的值.循环完成后,将其从堆栈中弹出.
So basically, every time you encounter a "[" you push the current location of the instruction pointer on this stack. Whenever you encounter a "]" you reset the instruction pointer to the value that's currently on the top of the stack. When a loop is complete, you pop it off the stack.
这里是一个具有100个存储单元的C ++示例.该代码以递归方式处理嵌套循环,尽管未进行完善,但应说明这些概念.
Here is an example in C++ with 100 memory cells. The code handles nested loops recursively and although it is not refined it should illustrate the concepts..
char cells[100] = {0}; // define 100 memory cells
char* cell = cells; // set memory pointer to first cell
char* ip = 0; // define variable used as "instruction pointer"
void interpret(static char* program, int* stack, int sp)
{
int tmp;
if(ip == 0) // if the instruction pointer hasn't been initialized
ip = program; // now would be a good time
while(*ip) // this runs for as long as there is valid brainF**k 'code'
{
if(*ip == ',')
*cell = getch();
else if(*ip == '.')
putch(*cell);
else if(*ip == '>')
cell++;
else if(*ip == '<')
cell--;
else if(*ip == '+')
*cell = *cell + 1;
else if(*ip == '-')
*cell = *cell - 1;
else if(*ip == '[')
{
stack[sp+1] = ip - program;
*ip++;
while(*cell != 0)
{
interpret(program, stack, sp + 1);
}
tmp = sp + 1;
while((tmp >= (sp + 1)) || *ip != ']')
{
*ip++;
if(*ip == '[')
stack[++tmp] = ip - program;
else if(*ip == ']')
tmp--;
}
}
else if(*ip == ']')
{
ip = program + stack[sp] + 1;
break;
}
*ip++; // advance instruction
}
}
int _tmain(int argc, _TCHAR* argv[])
{
int stack[100] = {0}; // use a stack of 100 levels, modeled using a simple array
interpret(",>,>++++++++[<------<------>>-]<<[>[>+>+<<-]>>[<<+>>-]<<<-]>>>++++++[<++++++++>-],<.>.", stack, 0);
return 0;
}
编辑 我只是再次浏览了一下代码,我意识到while循环中存在一个错误,如果指针的值为0,它将跳过"已解析的循环.这是我进行更改的地方:
EDIT I just went over the code again and I realized there was a bug in the while loop that would 'skip' parsed loops if the value of the pointer is 0. This is where I made the change:
while((tmp >= (sp + 1)) || *ip != ']') // the bug was tmp > (sp + 1)
{
ip++;
if(*ip == '[')
stack[++tmp] = ip - program;
else if(*ip == ']')
tmp--;
}
以下是相同解析器的实现,但不使用递归:
Below is an implementation of the same parser but without using recursion:
char cells[100] = {0};
void interpret(static char* program)
{
int cnt; // cnt is a counter that is going to be used
// only when parsing 0-loops
int stack[100] = {0}; // create a stack, 100 levels deep - modeled
// using a simple array - and initialized to 0
int sp = 0; // sp is going to be used as a 'stack pointer'
char* ip = program; // ip is going to be used as instruction pointer
// and it is initialized at the beginning or program
char* cell = cells; // cell is the pointer to the 'current' memory cell
// and as such, it is initialized to the first
// memory cell
while(*ip) // as long as ip point to 'valid code' keep going
{
if(*ip == ',')
*cell = getch();
else if(*ip == '.')
putch(*cell);
else if(*ip == '>')
cell++;
else if(*ip == '<')
cell--;
else if(*ip == '+')
*cell = *cell + 1;
else if(*ip == '-')
*cell = *cell - 1;
else if(*ip == '[')
{
if(stack[sp] != ip - program)
stack[++sp] = ip - program;
*ip++;
if(*cell != 0)
continue;
else
{
cnt = 1;
while((cnt > 0) || *ip != ']')
{
*ip++;
if(*ip == '[')
cnt++;
else if(*ip == ']')
cnt--;
}
sp--;
}
}else if(*ip == ']')
{
ip = program + stack[sp];
continue;
}
*ip++;
}
}
int _tmain(int argc, _TCHAR* argv[])
{
// define our program code here..
char *prg = ",>++++++[<-------->-],[<+>-]<.";
interpret(prg);
return 0;
}
这篇关于创建Brainfuck解析器,解析循环运算符的最佳方法是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!