Create a Big Two-Dimensional Array


Question

A simple question: how can I use a huge two-dimensional array in C#? What I want to do is the following:

int[] Nodes = new int[1146445];

int[,] Relations = new int[Nodes.Length, Nodes.Length];

As expected, this just gives me an out-of-memory error.

Is there a chance to work with such big data in memory? (4 GB RAM and a 6-core CPU)

The integers I want to store in the two-dimensional array are small, roughly from 0 to 1000.

Update: I tried to store the relations in a Dictionary<KeyValuePair<int, int>, int>. It works for a number of add iterations. Here is the class which should create the graph. The CreateGraph instance gets its data from an XML stream reader.

Main (C# backgroundWorker_DoWork)

ReadXML Reader = new ReadXML(tBOpenFile.Text);
CreateGraph Creater = new CreateGraph();

int WordsCount = (int)nUDLimit.Value;
if (nUDLimit.Value == 0) WordsCount = Reader.CountWords();

// word loop
for (int Position = 0; Position < WordsCount; Position++)
{
    // reading and parsing
    Reader.ReadNextWord();

    // add to graph builder
    Creater.AddWord(Reader.CurrentWord, Reader.GetRelations(Reader.CurrentText));
}

string[] Words = Creater.GetWords();
Dictionary<KeyValuePair<int, int>, int> Relations = Creater.GetRelations();

ReadXML

class ReadXML
{
    private string Path;
    private XmlReader Reader;
    protected int Word;
    public string CurrentWord;
    public string CurrentText;

    public ReadXML(string FilePath)
    {
        Path = FilePath;
        LoadFile();
        Word = 0;
    }

    public int CountWords()
    {
        // caching
        if(Path.Contains("filename") == true) return 1000;

        int Words = 0;
        while (Reader.Read())
        {
            if (Reader.NodeType == XmlNodeType.Element && Reader.Name == "word")
            {
                Words++;
            }
        }

        LoadFile();

        return Words;
    }

    public void ReadNextWord()
    {
        while(Reader.Read())
        {
            if (Reader.NodeType == XmlNodeType.Element && Reader.Name == "word")
            {
                while (Reader.Read())
                {
                    if (Reader.NodeType == XmlNodeType.Element && Reader.Name == "name")
                    {
                        XElement Title = XElement.ReadFrom(Reader) as XElement;
                        CurrentWord = Title.Value;

                        break;
                    }
                }
                while(Reader.Read())
                {
                    if (Reader.NodeType == XmlNodeType.Element && Reader.Name == "rels")
                    {
                        XElement Text = XElement.ReadFrom(Reader) as XElement;
                        CurrentText = Text.Value;

                        break;
                    }
                }
                break;
            }
        }
    }

    public Dictionary<string, int> GetRelations(string Text)
    {
        Dictionary<string, int> Relations = new Dictionary<string,int>();

        string[] RelationStrings = Text.Split(';');

        foreach (string RelationString in RelationStrings)
        {
            string[] SplitString = RelationString.Split(':');

            if (SplitString.Length == 2)
            {
                string RelationName = SplitString[0];
                int RelationWeight = Convert.ToInt32(SplitString[1]);

                Relations.Add(RelationName, RelationWeight);
            }
        }

        return Relations;
    }

    private void LoadFile()
    {
        Reader = XmlReader.Create(Path);
        Reader.MoveToContent();
    }
}

CreateGraph


class CreateGraph
{
    private Dictionary<string, int> CollectedWords = new Dictionary<string, int>();
    private Dictionary<KeyValuePair<int, int>, int> CollectedRelations = new Dictionary<KeyValuePair<int, int>, int>();

    public void AddWord(string Word, Dictionary<string, int> Relations)
    {
        int SourceNode = GetIdCreate(Word);
        foreach (KeyValuePair<string, int> Relation in Relations)
        {
            int TargetNode = GetIdCreate(Relation.Key);
            CollectedRelations.Add(new KeyValuePair<int,int>(SourceNode, TargetNode), Relation.Value);  // the error occurs here
        }
    }

    public string[] GetWords()
    {
        string[] Words = new string[CollectedWords.Count];

        foreach (KeyValuePair<string, int> CollectedWord in CollectedWords)
        {
            Words[CollectedWord.Value] = CollectedWord.Key;
        }

        return Words;
    }

    public Dictionary<KeyValuePair<int,int>,int> GetRelations()
    {
        return CollectedRelations;
    }

    private int WordsIndex = 0;
    private int GetIdCreate(string Word)
    {
        if (!CollectedWords.ContainsKey(Word))
        {
            CollectedWords.Add(Word, WordsIndex);
            WordsIndex++;
        }
        return CollectedWords[Word];
    }

}

Now I get another error: "An element with the same key already exists." (At the Add call in the CreateGraph class.)

Recommended answer

You'll have a better chance if you set Relations up as a jagged array (an array of arrays), because each row is then a separate allocation rather than part of one huge contiguous block:

//int[,] Relations = new int[Nodes.Length,Nodes.Length];
int[][] Relations = new int[Nodes.Length][];
for (int i = 0; i < Relations.Length; i++)
    Relations[i] = new int[Nodes.Length];

Even then you still need 10k * 10k * sizeof(int) = 400 MB, which should be possible even when running as a 32-bit process.

With the new number it is roughly 1M * 1M * 4 bytes = 4 TB, and that's not going to work. Using short instead of int only brings it down to 2 TB.
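The arithmetic can be checked with a few lines of C#. This is only a sketch: `DenseBytes` and `nodeCount` are names made up for illustration, and it uses top-level statements, which assumes C# 9 or later. With the exact node count from the question the dense matrix comes out at a bit over 5 TB for int, slightly above the rounded 4 TB estimate.

```csharp
using System;

// Bytes needed for a dense n-by-n matrix with the given element size.
// long arithmetic is used because the result does not fit in an int.
long DenseBytes(long n, int elementSize) => n * n * elementSize;

const long nodeCount = 1146445; // node count from the question

Console.WriteLine($"int:   {DenseBytes(nodeCount, sizeof(int)) / 1e12:F2} TB");
Console.WriteLine($"short: {DenseBytes(nodeCount, sizeof(short)) / 1e12:F2} TB");
Console.WriteLine($"10k nodes, int: {DenseBytes(10000, sizeof(int)) / 1e6:F0} MB");
```

Either way the result is orders of magnitude beyond 4 GB of RAM, so a smaller element type does not rescue the dense layout.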

Since you seem to need to assign weights to (sparse) connections between nodes, you should see whether something like this could work:

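struct WeightedRelation
{
    public readonly int node1;
    public readonly int node2;
    public readonly int weight;

    // readonly fields can only be assigned in a constructor
    public WeightedRelation(int node1, int node2, int weight)
    {
        this.node1 = node1;
        this.node2 = node2;
        this.weight = weight;
    }
}

int[] Nodes = new int[1146445];

List<WeightedRelation> Relations = new List<WeightedRelation>();
Relations.Add(new WeightedRelation(1, 2, 10));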
...

This is just the basic idea; you may need a double dictionary (a dictionary of dictionaries) for fast lookups. But your memory use would be proportional to the number of actual (non-zero) relations.
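One way the double dictionary could look is sketched below. It is an illustration, not the answerer's code: `relations`, `SetWeight`, and `GetWeight` are hypothetical names, a missing pair is assumed to mean weight 0, and top-level statements assume C# 9 or later. Writing through the indexer instead of `Add` also avoids the "same key already exists" exception from the question, at the cost of silently overwriting an earlier weight.

```csharp
using System;
using System.Collections.Generic;

// Outer key: source node id. Inner key: target node id. Value: weight.
// Only node pairs that actually have a relation take up memory.
var relations = new Dictionary<int, Dictionary<int, int>>();

// Store a weight. The indexer overwrites an existing entry, whereas
// Dictionary.Add throws when the key is already present.
void SetWeight(int source, int target, int weight)
{
    if (!relations.TryGetValue(source, out var targets))
    {
        targets = new Dictionary<int, int>();
        relations[source] = targets;
    }
    targets[target] = weight;
}

// Look up a weight; a missing pair is treated as 0 (no relation).
int GetWeight(int source, int target)
{
    return (relations.TryGetValue(source, out var targets)
            && targets.TryGetValue(target, out var weight)) ? weight : 0;
}

SetWeight(1, 2, 10);
SetWeight(1, 2, 12); // same pair again: overwrites instead of throwing
SetWeight(3, 1, 7);

Console.WriteLine(GetWeight(1, 2)); // prints 12
Console.WriteLine(GetWeight(2, 1)); // prints 0
```

If repeated pairs should accumulate rather than overwrite, the last line of `SetWeight` could add to the existing value instead; which behaviour is right depends on what a duplicate relation means in the input data.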
