Create a random txt file of approximate size in Java


Question

OK, I found the following code somewhere that generates a random txt file. Basically, I want random words separated by whitespace so that I can run MapReduce word-counting simulations.

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;

public class MainClass {

    public static void main(String[] args) {
        try {
            PrintWriter writer = new PrintWriter("bigfile.txt", "UTF-8");

            Random random = new Random();
            for (int i = 0; i < 23695522; i++) {
                // Words of length 3 through 10. (1 and 2 letter words are boring.)
                char[] word = new char[random.nextInt(8) + 3];
                for (int j = 0; j < word.length; j++) {
                    word[j] = (char) ('a' + random.nextInt(26));
                }
                writer.print(new String(word) + ' ');

                // Break the line after every 10th word.
                if (i % 10 == 0) {
                    writer.println();
                }
            }

            writer.close();
        } catch (IOException e) {
            // do something
        }
    }
}

Now I want to alter this code a bit so that it runs as many iterations as needed for the file to have approximately a predefined size. Every iteration produces about 6.5 characters on average (the word length is chosen uniformly from 3 to 10), each of 2 bytes. So I divide the desired file size in bytes by (6.5*2), use the result as the number of for-loop iterations, and get a file much smaller than I expect.

Recommended answer

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;

public class MainClass {

    public static void main(String[] args) {
        // Running total of bytes written. Every character here is ASCII,
        // so each one takes exactly 1 byte in UTF-8.
        long count = 0;
        // println() writes the platform line separator:
        // 2 bytes ("\r\n") on Windows, 1 byte ("\n") on Linux and macOS.
        int newlineBytes = System.lineSeparator().length();

        File file = new File("bigfile.txt");
        try (PrintWriter writer = new PrintWriter(file, "UTF-8")) {
            Random random = new Random();
            for (int i = 0; i < 23695522; i++) {
                // Words of length 3 through 10. (1 and 2 letter words are boring.)
                char[] word = new char[random.nextInt(8) + 3];
                count += word.length;
                for (int j = 0; j < word.length; j++) {
                    word[j] = (char) ('a' + random.nextInt(26));
                }
                writer.print(new String(word) + ' ');
                count += 1; // the trailing space
                if (i % 10 == 0) {
                    writer.println();
                    count += newlineBytes;
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Should match the size of bigfile.txt in bytes.
        System.out.println(count);
    }
}

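Try this one. It counts the bytes as they are written. In UTF-8 the letters and the space take 1 byte each, not 2. A newline written by println() is the platform line separator, which is 2 bytes on Windows (\r\n) and 1 byte on Linux and macOS (\n).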
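The question asks for a file of roughly a predefined size, while the code above only reports the byte count after a fixed number of iterations. A minimal sketch of one way to close that gap is to keep the same byte counter and stop the loop once it reaches the target. The class name SizedRandomTextFile and the methods generate and estimateIterations are hypothetical names introduced here for illustration, and the 10 MiB target is an arbitrary example. The sketch assumes UTF-8 output containing only ASCII characters, so one character is one byte. Under that assumption an iteration writes on average 6.5 letters, 1 space and a tenth of a line separator, which is about 7.6 bytes with "\n" or 7.7 bytes with "\r\n", not the 13 bytes used in the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

// Hypothetical helper class, not part of the original question or answer.
class SizedRandomTextFile {

    /**
     * Writes random lowercase words to the file until at least targetBytes
     * bytes have been written, and returns the number of bytes counted.
     */
    static long generate(Path file, long targetBytes, Random random) throws IOException {
        // "\r\n" on Windows, "\n" on Linux and macOS.
        String newline = System.lineSeparator();
        long written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (long i = 0; written < targetBytes; i++) {
                int length = random.nextInt(8) + 3; // word length 3 through 10
                StringBuilder word = new StringBuilder(length + 1);
                for (int j = 0; j < length; j++) {
                    word.append((char) ('a' + random.nextInt(26)));
                }
                word.append(' ');
                writer.write(word.toString());
                written += word.length(); // ASCII only: 1 byte per char in UTF-8
                if (i % 10 == 0) {
                    writer.write(newline);
                    written += newline.length();
                }
            }
        }
        return written;
    }

    /** Rough iteration count for a fixed-length loop like the one in the question. */
    static long estimateIterations(long targetBytes) {
        // 6.5 letters + 1 space + one line separator every 10 words.
        double bytesPerIteration = 6.5 + 1 + System.lineSeparator().length() / 10.0;
        return Math.round(targetBytes / bytesPerIteration);
    }

    public static void main(String[] args) throws IOException {
        long target = 10L * 1024 * 1024; // 10 MiB, an arbitrary example size
        Path file = Paths.get("bigfile.txt");
        long written = generate(file, target, new Random());
        System.out.println("counted " + written + " bytes, on disk " + Files.size(file));
    }
}
```

Because the loop checks the counter before each word, the file overshoots the target by at most one word, its space and one line separator. If a fixed iteration count is preferred, estimateIterations gives the number to plug into the original for loop, with the result being approximate because the word lengths are random.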
