Flex, continuous scanning stream (from socket). Did I miss something using yywrap()?

Problem description

I'm working on a socket-based scanner (continuous stream) using Flex for pattern recognition. Flex doesn't find a match that overlaps 'array boundaries'. So I implemented yywrap() to set up new array content as soon as yylex() detects <<EOF>> (it will call yywrap). No success so far.

Basically (for pin-pointing my problem) this is my code:

%{

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define BUFFERSIZE 26
                     /*   0123456789012345678901234 */
char cbuf1[BUFFERSIZE] = "Hello everybody, lex is su";  // Warning, no '\0'
char cbuf2[BUFFERSIZE] = "per cool. Thanks!         ";
char recvBuffer[BUFFERSIZE];

int packetCnt = 0;

YY_BUFFER_STATE bufferState1, bufferState2;

%}

%option nounput
%option noinput

%%

"super"                 { ECHO; }
.                       { printf( "%c", yytext[0] );}

%%

int yywrap()
{

  int retval = 1;   

  printf(">> yywrap()\n");

  if( packetCnt <= 0 )    // Stop after 2
  {
    // Copy cbuf2 into recvBuffer
    memcpy(recvBuffer, cbuf2, BUFFERSIZE);

    //
    yyrestart(NULL); // ?? has no effect

    // Feed new data to flex
    bufferState2 = yy_scan_bytes(recvBuffer, BUFFERSIZE); 

    //
    packetCnt++;

    // Tell flex to resume scanning
    retval = 0;   
  }

  return(retval); 
}

int main(void)
{
  printf("Lenght: %d\n", (int)sizeof(recvBuffer)) ;

  // Copy cbuf1 into recvBuffer
  memcpy(recvBuffer, cbuf1, BUFFERSIZE);

  //
  packetCnt = 0;

  //
  bufferState1 = yy_scan_bytes(recvBuffer, BUFFERSIZE);

  //
  yylex();

  yy_delete_buffer(bufferState1);
  yy_delete_buffer(bufferState2);

  return 0;
}

This is my output:

dkmbpro:test dkroeske$ ./text 
Lenght: 26
Hello everybody, lex is su>> yywrap()
per cool. Thanks!         >> yywrap()

So there is no match on 'super'. According to the documentation, the lexer is not 'reset' between yywrap calls. What am I missing? Thanks.

Recommended answer

The mechanism for providing a stream of input to flex is to define the YY_INPUT macro (http://flex.sourceforge.net/manual/Generated-Scanner.html#index-YY_005fINPUT_002c-overriding-151), which is called every time flex needs to refill its buffer [note 1]. The macro is called with three arguments, roughly like this:

YY_INPUT(buffer, bytes_read, max_bytes)

The macro is expected to read up to max_bytes into buffer, and to set bytes_read to the actual number of bytes read. Because YY_INPUT is a macro, bytes_read is an lvalue that the macro assigns to directly, not a pointer. If there is no more input in this stream, YY_INPUT should set bytes_read to YY_NULL (which is 0). There is no way to flag an input error other than setting the end-of-file condition. Do not set bytes_read to a negative value.

Note that YY_INPUT does not provide an indication of where to read the input from or any sort of userdata argument. The only provided mechanism is the global yyin, which is a FILE*. (You could create a FILE* from a file/socket descriptor with fdopen and get the descriptor back with fileno. Other workarounds are beyond the scope of this answer.)
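The fdopen/fileno round trip mentioned above could look roughly like this. It is only a sketch, assuming a POSIX system; stream_from_descriptor and descriptor_of are hypothetical helper names, not part of flex. In a scanner, the FILE* returned by the first helper would be assigned to yyin before calling yylex().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>   /* FILE, fdopen, fileno (both POSIX) */

/* Hypothetical helper: wrap an already-open file or socket descriptor in a
 * read-only FILE*. Returns NULL on failure (fdopen sets errno). */
FILE *stream_from_descriptor(int fd)
{
    return fdopen(fd, "r");
}

/* Hypothetical helper: recover the descriptor again, for example inside a
 * custom YY_INPUT that wants to call recv() on fileno(yyin). */
int descriptor_of(FILE *stream)
{
    return fileno(stream);
}
```

The FILE* is then only a carrier for the descriptor: a custom YY_INPUT can ignore stdio buffering entirely and read from descriptor_of(yyin) itself.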

When the scanner encounters the end of a stream, as indicated by YY_INPUT returning 0, it finishes the current token [note 2], and then calls yywrap to decide whether there is another stream to process. As the manual indicates, it does not reset the parser state (that is, which start condition it happens to be in; the current line number if line counting is enabled, etc.). However, it does not allow tokens to span two streams.

The yywrap mechanism is most commonly used when a parser/scanner is applied to a number of different files specified on the command line. In that use case, it would be a bit odd if a token could start in one file and continue into another one; most language implementations prefer their files to be somewhat self-contained. (Consider multi-line string literals, for example.) Normally, you actually want to reset more of the parser state as well (the line number, certainly, and sometimes the start condition), but that is the responsibility of yywrap. [note 3]
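As an illustration of that common use case, here is a minimal sketch of a yywrap that steps through a list of file names. The names set_input_files, pending_files and pending_count are hypothetical; the yyin stand-in exists only so the fragment compiles outside a generated scanner, where flex declares yyin itself.

```c
#include <stddef.h>  /* size_t */
#include <stdio.h>   /* FILE, fopen, fclose, perror */

/* In a real scanner, flex's generated code declares yyin. The stand-in
 * below only exists so that this sketch compiles on its own. */
#ifndef FLEX_SCANNER
static FILE *yyin = NULL;
#endif

/* Hypothetical bookkeeping: the files still waiting to be scanned. */
static char *const *pending_files = NULL;
static size_t pending_count = 0;

void set_input_files(char *const *files, size_t count)
{
    pending_files = files;
    pending_count = count;
}

/* Called by the scanner at end of input. Returning 0 means "yyin now
 * points at more input, keep scanning"; returning 1 means "no more input". */
int yywrap(void)
{
    if (yyin != NULL && yyin != stdin) {
        fclose(yyin);
    }
    yyin = NULL;

    while (pending_count > 0) {
        const char *name = *pending_files++;
        pending_count--;
        yyin = fopen(name, "r");
        if (yyin != NULL) {
            /* Per-file state (for example a line counter) would be
             * reset here, since flex does not do it for you. */
            return 0;
        }
        perror(name);  /* skip files that cannot be opened */
    }
    return 1;
}
```

A typical call would be set_input_files(argv + 1, (size_t)(argc - 1)) before the first yylex(). Any token still in progress is finished before yywrap runs, so nothing carries over from one file to the next.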

For lexing from a socket, you'll probably want to call recv from your YY_INPUT implementation. But for experimentation purposes, here's a simple YY_INPUT which just returns data from a memory buffer:

/* Globals which describe the input buffer. */
const char* my_in_buffer = NULL;
const char* my_in_pointer = NULL;
const char* my_in_limit = NULL;
void my_set_buffer(const char* buffer, size_t buflen) {
  my_in_buffer = my_in_pointer = buffer;
  my_in_limit = my_in_buffer + buflen;
}

/* For debugging, limit the number of bytes YY_INPUT will
 * return.
 */
#define MY_MAXREAD 26

/* This is technically incorrect because it returns 0
 * on EOF, assuming that YY_NULL is 0.
 */
#define YY_INPUT(buf, ret, maxlen) do {                       \
   size_t avail = (size_t)(my_in_limit - my_in_pointer);      \
   size_t toread = (maxlen);                                  \
   if (toread > avail) toread = avail;                        \
   if (toread > MY_MAXREAD) toread = MY_MAXREAD;              \
   /* YY_INPUT is a macro: ret is an lvalue, not a pointer */ \
   (ret) = toread;                                            \
   memcpy((buf), my_in_pointer, toread);                      \
   my_in_pointer += toread;                                   \
} while (0)
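For the socket case mentioned earlier, a recv-based variant might look like the sketch below. It assumes a POSIX system and a connected stream socket; my_set_socket, my_socket_read, my_socket_fd and my_socket_eof are hypothetical names. Errors are reported as end of stream, since YY_INPUT has no separate error channel, and the end-of-stream result is sticky, as note 2 recommends.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>      /* size_t */
#include <sys/types.h>   /* ssize_t */
#include <sys/socket.h>  /* recv */

/* Hypothetical globals: the connected socket and a sticky EOF flag. */
static int my_socket_fd = -1;
static int my_socket_eof = 0;

void my_set_socket(int fd)
{
    my_socket_fd = fd;
    my_socket_eof = 0;
}

/* Read up to max_size bytes into buf. Returns the number of bytes read,
 * or 0 at end of stream or on error. Once 0 has been returned, every
 * later call also returns 0 without touching the socket. */
size_t my_socket_read(char *buf, size_t max_size)
{
    ssize_t n;

    if (my_socket_eof || my_socket_fd < 0) {
        return 0;
    }
    do {
        n = recv(my_socket_fd, buf, max_size, 0);  /* blocks until data arrives */
    } while (n < 0 && errno == EINTR);             /* retry if interrupted */

    if (n <= 0) {
        my_socket_eof = 1;
        return 0;
    }
    return (size_t)n;
}

/* Goes in the definitions section of the flex file. `result` is assigned
 * directly because YY_INPUT is a macro. */
#define YY_INPUT(buf, result, max_size) \
    ((result) = (int)my_socket_read((buf), (size_t)(max_size)))
```

With input supplied this way, flex asks for more bytes whenever it reaches the end of its buffer in the middle of a token, so a token such as "super" that arrives split across two recv calls should still be matched as one token.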



  1. This is not quite true; the buffer state includes a flag which indicates whether the buffer can be refilled. If you use yy_scan_bytes, the buffer state created is marked as non-refillable.

  2. It's actually a bit more complicated than that, because flex scanners sometimes need to look ahead in order to decide which token has been matched, and the end-of-stream indication might occur during the lookahead. After the scanner backs up to the end of the recognized token, it still has to rescan the lookahead characters, which may contain several more tokens. To handle this, it sets a flag in the buffer state which indicates that end-of-stream has been reached, which prevents YY_INPUT from being called each time the scanner hits the end of the buffer. Despite this, it's probably a good idea to make sure that your YY_INPUT implementation will continue to return end-of-stream in case it is called again after an end-of-stream return.

  3. For another concrete example, suppose you wanted to implement some kind of #include mechanism. flex provides the yypush_buffer_state/yypop_buffer_state mechanism, which allows you to implement an include stack. (The similarly named yy_push_state/yy_pop_state functions manage the start-condition stack instead.) You'd call yypush_buffer_state once the include directive has been scanned, but yypop_buffer_state needs to be called when the included file ends, from yywrap or from an <<EOF>> rule. Again, very few languages would allow a token to start in the included source file and continue following the include directive.
