NumberFormatException while selecting random elements from a big file
Problem description
I have a very large file that contains user IDs like the ones below. Each line in the file is one user ID.
149905320
1165665384
66969324
886633368
1145241312
286585320
1008665352
The file contains around 30 million user IDs. I am trying to select random user IDs from it, but at some point my program always throws the following exception, and I am not sure why:
Exception in thread "main" java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:59)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.parseInt(Integer.java:510)
at com.host.bulls.service.lnp.RandomReadFromFile.main(RandomReadFromFile.java:65)
Here is my program:
public static void main(String[] args) throws Exception {
    File f = new File("D:/abc.txt");
    RandomAccessFile file;
    try {
        file = new RandomAccessFile(f, "r");
        long file_size = file.length();
        // Let's start
        long chosen_byte = (long)(Math.random() * (file_size - 1));
        long cur_byte = chosen_byte;
        // Goto starting position
        file.seek(cur_byte);
        String s_LR = "";
        char a_char;
        // Get left hand chars
        for (;;)
        {
            a_char = (char)file.readByte();
            if (cur_byte < 0 || a_char == '\n' || a_char == '\r' || a_char == -1) break;
            else
            {
                s_LR = a_char + s_LR;
                --cur_byte;
                if (cur_byte >= 0) file.seek(cur_byte);
                else break;
            }
        }
        // Get right hand chars
        cur_byte = chosen_byte + 1;
        file.seek(cur_byte);
        for (;;)
        {
            a_char = (char)file.readByte();
            if (cur_byte >= file_size || a_char == '\n' || a_char == '\r' || a_char == -1) break;
            else
            {
                s_LR += a_char;
                ++cur_byte;
            }
        }
        // Parse ID
        if (cur_byte < file_size)
        {
            int chosen_id = Integer.parseInt(s_LR);
            System.out.println("Chosen id : " + chosen_id);
        }
        else
        {
            throw new Exception("Ran out of bounds..");
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Is there a problem in my code above?
Recommended answer
I tried running your code and found one additional error: you have to check cur_byte before reading, as follows:
if (cur_byte < file_size) {
    a_char = (char) file.readByte();
}
Otherwise you will get an EOFException.
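As a rough sketch of that guard, one option is to call read(), which returns -1 at end of file, instead of readByte(), which throws EOFException. This swaps in a different call than the check shown above, and the names EofSafeScan and readToLineEnd are made up for illustration:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class EofSafeScan {

    // Hypothetical helper: collects bytes from 'start' up to the next
    // line terminator or the end of the file, whichever comes first.
    static String readToLineEnd(RandomAccessFile file, long start) throws IOException {
        StringBuilder sb = new StringBuilder();
        file.seek(start);
        int b;
        // read() returns -1 at end of file, so no EOFException is thrown here
        while ((b = file.read()) != -1 && b != '\n' && b != '\r') {
            sb.append((char) b);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Small throwaway file whose last line has no trailing newline
        File tmp = File.createTempFile("ids", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "149905320\n66969324".getBytes(StandardCharsets.US_ASCII));
        try (RandomAccessFile file = new RandomAccessFile(tmp, "r")) {
            System.out.println(readToLineEnd(file, 10));
        }
    }
}
```

In your loops the a_char == -1 test can never be true: a char is unsigned, and readByte() signals end of file by throwing rather than by returning -1.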
With your sample abc.txt I don't get the java.lang.NumberFormatException: For input string: "" exception.
But if I add empty lines to abc.txt, I get this exception sooner or later. So the problem is empty lines somewhere in abc.txt.
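One possible way to tolerate such lines is to retry whenever the chosen position lands on a blank line. The sketch below is a rewrite under my own assumptions, not your original program: RandomIdPicker, pickRandomId and MAX_TRIES are hypothetical names. It relies on RandomAccessFile.readLine(), which treats \n, \r and \r\n as line terminators. As far as I can tell, landing on the \r of a \r\n pair in your code would also leave s_LR empty, so Windows line endings may be worth checking too.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Random;

public class RandomIdPicker {

    // Assumed retry limit, chosen arbitrarily for this sketch
    private static final int MAX_TRIES = 1000;

    // Picks a random byte offset, backs up to the start of that line,
    // and parses the line; blank lines are skipped by trying again.
    static long pickRandomId(File f, Random rng) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "r")) {
            long size = file.length();
            if (size == 0) throw new IOException("File is empty");
            for (int attempt = 0; attempt < MAX_TRIES; attempt++) {
                long pos = Math.min(size - 1, (long) (rng.nextDouble() * size));
                // Walk left until the previous byte is a line terminator
                while (pos > 0) {
                    file.seek(pos - 1);
                    int b = file.read();
                    if (b == '\n' || b == '\r') break;
                    pos--;
                }
                file.seek(pos);
                String line = file.readLine();
                if (line != null && !line.trim().isEmpty()) {
                    return Long.parseLong(line.trim());
                }
            }
            throw new IOException("No non-blank line found after " + MAX_TRIES + " tries");
        }
    }

    public static void main(String[] args) throws IOException {
        File f;
        if (args.length > 0) {
            f = new File(args[0]);
        } else {
            // Throwaway sample containing a blank line and mixed line endings
            f = File.createTempFile("ids", ".txt");
            f.deleteOnExit();
            Files.write(f.toPath(),
                    "149905320\r\n\r\n1165665384\n66969324".getBytes(StandardCharsets.US_ASCII));
        }
        System.out.println("Chosen id : " + pickRandomId(f, new Random()));
    }
}
```

Like your approach, this picks a random byte rather than a random line, so longer IDs are slightly more likely to be chosen than shorter ones. A line that is neither blank nor a valid number would still throw NumberFormatException.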