How do I read input character-by-character in Java?

Question

I am used to the C-style getchar(), but there seems to be nothing comparable in Java. I am building a lexical analyzer, and I need to read the input character by character.

I know I can use a Scanner to read a token or a line and then walk through it char by char, but that seems unwieldy for strings spanning multiple lines. Is there a way to just get the next character from the input buffer in Java, or should I just plug away with the Scanner class?

The input is a file, not the keyboard.

Recommended answer

Use Reader.read(). A return value of -1 means end of stream; else, cast to char.

This code reads character data from a list of file arguments:

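import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

public class CharacterHandler {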
    //Java 7 source level
    public static void main(String[] args) throws IOException {
        // replace this with a known encoding if possible
        Charset encoding = Charset.defaultCharset();
        for (String filename : args) {
            File file = new File(filename);
            handleFile(file, encoding);
        }
    }

    private static void handleFile(File file, Charset encoding)
            throws IOException {
        try (InputStream in = new FileInputStream(file);
             Reader reader = new InputStreamReader(in, encoding);
             // buffer for efficiency
             Reader buffer = new BufferedReader(reader)) {
            handleCharacters(buffer);
        }
    }

    private static void handleCharacters(Reader reader)
            throws IOException {
        int r;
        while ((r = reader.read()) != -1) {
            char ch = (char) r;
            System.out.println("Do something with " + ch);
        }
    }
}

The bad thing about the above code is that it uses the system's default character set. Wherever possible, prefer a known encoding (ideally, a Unicode encoding if you have a choice). See the Charset class for more. (If you feel masochistic, you can read this guide to character encoding.)
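As a minimal sketch of the "known encoding" advice, the variant below assumes the input files are UTF-8; that is an assumption about your data, not something the code above requires. The class and method names (Utf8CharacterReader, readAll, readFile) are made up for this illustration. It uses Files.newBufferedReader, which takes the charset explicitly and returns an already-buffered reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class; the names are illustrative only.
class Utf8CharacterReader {

    // Reads the reader one char at a time until read() returns -1.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int r;
        while ((r = reader.read()) != -1) {
            sb.append((char) r);
        }
        return sb.toString();
    }

    // Opens the file with an explicit charset (assumed here to be UTF-8)
    // instead of relying on the platform default.
    static String readFile(Path path) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            return readAll(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String filename : args) {
            System.out.print(readFile(Paths.get(filename)));
        }
    }
}
```

If the files are in some other encoding, pass the matching Charset instead; the character-by-character loop itself does not change.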

(One thing you might want to look out for is supplementary Unicode characters: those that require two char values to store. See the Character class for more details; this is an edge case that probably won't apply to homework.)
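If supplementary characters do matter for your lexer, one possible approach is to combine surrogate pairs into code points as you read. The sketch below is not part of the original answer; the class and method names (CodePointReader, readCodePoint, readAllCodePoints) are hypothetical. It takes a BufferedReader because it relies on mark() and reset() to put back a char that turns out not to be the second half of a pair.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the names are illustrative only.
class CodePointReader {

    // Returns the next Unicode code point, or -1 at end of stream.
    // An unpaired surrogate is returned unchanged as its own value.
    static int readCodePoint(BufferedReader reader) throws IOException {
        int first = reader.read();
        if (first == -1) {
            return -1;
        }
        char high = (char) first;
        if (!Character.isHighSurrogate(high)) {
            return first; // ordinary char: one char value is the whole code point
        }
        reader.mark(1); // remember the position in case the next char is not a low surrogate
        int second = reader.read();
        if (second == -1) {
            return first; // stream ended after a lone high surrogate
        }
        if (Character.isLowSurrogate((char) second)) {
            return Character.toCodePoint(high, (char) second);
        }
        reader.reset(); // put the unrelated char back for the next call
        return first;
    }

    // Collects all code points from the reader into a list.
    static List<Integer> readAllCodePoints(BufferedReader reader) throws IOException {
        List<Integer> codePoints = new ArrayList<>();
        int cp;
        while ((cp = readCodePoint(reader)) != -1) {
            codePoints.add(cp);
        }
        return codePoints;
    }

    public static void main(String[] args) throws IOException {
        // "a", then U+1F600 (stored as a surrogate pair), then "b"
        String sample = "a\uD83D\uDE00b";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (int cp : readAllCodePoints(reader)) {
                System.out.println(String.format("U+%04X", cp));
            }
        }
    }
}
```

For input that only contains characters from the Basic Multilingual Plane, this behaves the same as casting the result of read() to char.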
