Why do I get duplicated values during parallel task execution using an iterator?

Question

I want to use the Task Parallel Library (TPL).

I use XPathNavigator to iterate through XML and retrieve customer IDs. I'm using an iterator with Parallel.ForEach(..) for task parallelism.

For some reason I retrieve duplicated customer IDs. It almost seems as if the iterator keeps track of previous reads/iterations. I expected a new iterator on each pass through the loop.

I have tried a number of ways, still with no luck. If someone can point me in the right direction, it would be greatly appreciated.

(The complete code sample I tried is below.)

Some simple XML:

    private static string Xml()
    {
        return "<persons>" +
               "<person><id>1</id></person>" +
               "<person><id>2</id></person>" +
               "<person><id>3</id></person>" +
               "<person><id>4</id></person>" +
               "<person><id>5</id></person>" +
               "</persons>";
    }


    static void Main(string[] args)
    {
        var navigator = XmlHelper.CreateNavigator(Xml());

        string xpath = "/persons/person";
        var exp = navigator.Compile(xpath);
        var iterator = navigator.Select(exp);

        //Parallel Task scenario returns duplicated customer Ids
        Parallel.ForEach(Iterate(iterator), (a) =>
        {
            string xpathId = "/person/id";
            var value = XmlHelper.SelectString(a.Current, xpathId);
            Console.WriteLine("person id: " + value);
        });
        /*
         * Sample output can be: (notice the duplicated values!)
         * person id: 2
         * person id: 2
         * person id: 4
         * person id: 4
         * person id: 3
         * person id: 1
         * 
         */

        //Sequential scenario displays unique values:       
        //while (iterator.MoveNext())
        //{
        //    string xpathId = "/person/id";
        //    var value = XmlHelper.SelectString(iterator.Current, xpathId);
        //    Console.WriteLine("person id: " + value);
        //}

        Console.ReadLine();
    }

    private static IEnumerable<XPathNodeIterator> 
             Iterate(XPathNodeIterator iterator)
    {
        while (iterator.MoveNext())
        {
            yield return iterator;
        }
    }


    public static class XmlHelper
    {
        public static string SelectString(XPathNavigator navigator, string xpath)
        {
            return SelectString(navigator, xpath, null);
        }

        public static string SelectString
              (XPathNavigator navigator, string xpath, string defaultVal)
        {
            XPathExpression exp = navigator.Compile(xpath);
            XPathNodeIterator it = navigator.Select(exp);
            it.MoveNext();
            return it.Current.Value;
        }

        public static XPathNavigator CreateNavigator(string input)
        {
            XPathDocument doc;

            using (var reader = new StringReader(input))
            {
                doc = new XPathDocument(reader);
            }

            return doc.CreateNavigator();
        }
    }

Note that I have also tried the approach taken in this article, still with no luck. Any help is greatly appreciated.

Recommended answer

Thanks @Nitram and @Paddy!

Both answers pointed me in the right direction. I think @Nitram's answer was more accurate, as he explained the problem I had in the first place.
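For context, here is a minimal sketch of what appears to go wrong in the question's `Iterate` method. It is not taken from either answer. The class name `SharedIteratorDemo` and the helper `AllItemsAreSameInstance` are hypothetical; the sketch only shows that `yield return iterator` hands every consumer the same `XPathNodeIterator` object, so all parallel workers read whatever position that one object happens to be at.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.XPath;

public static class SharedIteratorDemo
{
    // Same shape as the Iterate method in the question: yields the iterator itself.
    private static IEnumerable<XPathNodeIterator> Iterate(XPathNodeIterator iterator)
    {
        while (iterator.MoveNext())
        {
            yield return iterator;
        }
    }

    // Returns true when every yielded item is the very same object.
    public static bool AllItemsAreSameInstance(string xml, string xpath)
    {
        XPathNavigator navigator;
        using (var reader = new StringReader(xml))
        {
            navigator = new XPathDocument(reader).CreateNavigator();
        }

        List<XPathNodeIterator> items = Iterate(navigator.Select(xpath)).ToList();
        return items.Count > 0 && items.All(item => ReferenceEquals(item, items[0]));
    }

    public static void Main()
    {
        const string xml =
            "<persons><person><id>1</id></person><person><id>2</id></person></persons>";

        // Every element of the sequence is one shared, mutable iterator.
        Console.WriteLine(AllItemsAreSameInstance(xml, "/persons/person")); // prints "True"
    }
}
```

Because the items are one shared object, a worker that reads `a.Current` after another `MoveNext()` has run sees the newer position, which would explain the repeated IDs.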

It seems that when running in parallel, the code below was still causing some duplicates. This is not obvious for smaller collections, but as the number of items grows, values tend to repeat in multithreaded environments.

    private static IEnumerable<XPathNavigator> Iterate(XPathNodeIterator iterator)
    {
        while (iterator.MoveNext())
        {
            yield return iterator.Current;
        }
    }

I believe this is why @Paddy mentioned that the iterator is not thread safe.
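A commonly suggested alternative, which is not the fix adopted in this answer, is to yield a clone of the current navigator. The sketch below assumes that `iterator.Current` may return a navigator the iterator repositions on the next `MoveNext()`, which matches the duplicates described above. `CloneIterationSketch` and `ReadIds` are hypothetical names, and the relative XPath `"id"` is an assumption that fits the sample XML in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.XPath;

public static class CloneIterationSketch
{
    // Yields an independent copy of the navigator for each selected node,
    // so a later MoveNext() cannot reposition what a consumer already holds.
    public static IEnumerable<XPathNavigator> Iterate(XPathNodeIterator iterator)
    {
        while (iterator.MoveNext())
        {
            yield return iterator.Current.Clone();
        }
    }

    public static List<string> ReadIds(string xml)
    {
        XPathNavigator root;
        using (var reader = new StringReader(xml))
        {
            root = new XPathDocument(reader).CreateNavigator();
        }

        // Materialize first, then read: each clone still points at its own <person>.
        List<XPathNavigator> people = Iterate(root.Select("/persons/person")).ToList();
        return people
            .Select(person => person.SelectSingleNode("id").Value) // "id" is relative to <person>
            .ToList();
    }
}
```

Calling `CloneIterationSketch.ReadIds(xml)` on the five-person sample from the question should yield the IDs 1 through 5 in document order, even though every navigator is read only after iteration has finished.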

@Nitram mentioned:

Parallel.ForEach is easily able to handle Enumerables..

Based on this, I changed the Iterate method to return the iterator's nodes as an enumerable of XPathNavigator:

    private static IEnumerable<XPathNavigator> Iterate(XPathNodeIterator iterator)
    {
        var items = iterator.Cast<XPathNavigator>();
        return items;
    }

This solved the problem I had, and it worked effectively with the number of items I expected to parallelize.
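As a rough end-to-end sketch of this approach (not the exact code used above), the node set can be snapshotted into a list on the calling thread before `Parallel.ForEach` starts. The names `ParallelIdsSketch` and `ReadIdsInParallel` are hypothetical, the extra `Clone()` is a defensive assumption rather than something the answer requires, and the sketch assumes that concurrent reads through separate navigators over one `XPathDocument` are acceptable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using System.Xml.XPath;

public static class ParallelIdsSketch
{
    public static List<string> ReadIdsInParallel(string xml)
    {
        XPathNavigator root;
        using (var reader = new StringReader(xml))
        {
            root = new XPathDocument(reader).CreateNavigator();
        }

        // Snapshot the node set sequentially, before any worker thread starts.
        // Clone() is defensive: each list entry is its own navigator.
        List<XPathNavigator> people = root.Select("/persons/person")
                                          .Cast<XPathNavigator>()
                                          .Select(node => node.Clone())
                                          .ToList();

        // ConcurrentBag is safe to add to from several threads at once.
        var ids = new ConcurrentBag<string>();
        Parallel.ForEach(people, person =>
        {
            ids.Add(person.SelectSingleNode("id").Value); // "id" is relative to <person>
        });

        // Parallel completion order is arbitrary, so sort for a stable result.
        return ids.OrderBy(id => id, StringComparer.Ordinal).ToList();
    }

    public static void Main()
    {
        const string xml =
            "<persons>" +
            "<person><id>1</id></person>" +
            "<person><id>2</id></person>" +
            "<person><id>3</id></person>" +
            "</persons>";

        foreach (string id in ReadIdsInParallel(xml))
        {
            Console.WriteLine("person id: " + id);
        }
    }
}
```

Materializing with `ToList()` trades some memory for the guarantee that no worker touches the iterator itself; for very large node sets, streaming clones may be preferable.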
