Need algorithm to make simple program (sentence permutations)


Problem description


I really can't figure out how to write a simple algorithm in C# to solve my problem. We have a sentence like this:

{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}.

So my program should generate a lot of sentences that look like this:

Hello my mate.
Hello my m8.
Hello my friend.
Hello my friends.
Hi my mate.
...
Hi-Hi my friends.

I know there are a lot of programs that can do this, but I'd like to write it myself. Of course, it should work with this too:

{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.

Solution

Update: I wasn't too happy about using regexes to parse such simple input, yet I disliked the manual index-manipulation jungle found in other answers.

So I replaced the tokenizing with an enumerator-based scanner with two alternating token states. This is better justified by the complexity of the input, and has a 'Linqy' feel to it (although it really isn't Linq). I have kept the original regex-based parser at the end of my post for interested readers.
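To make the "two alternating token states" idea concrete, here is a minimal sketch of the same scan written as a plain loop over the characters. It is not the code from this answer: `ScannerSketch`, `Split` and the `insideBraces` flag are hypothetical names used only for illustration, and the sketch assumes braces are never nested.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class ScannerSketch
{
    // Hypothetical helper: splits a template into "pockets".
    // A run of literal text becomes a one-element pocket;
    // a {a|b|c} group becomes a pocket holding its alternatives.
    public static List<string[]> Split(string template)
    {
        var pockets = new List<string[]>();
        var sb = new StringBuilder();
        bool insideBraces = false; // the two alternating states

        foreach (char c in template)
        {
            if (!insideBraces && c == '{')
            {
                pockets.Add(new[] { sb.ToString() });  // close the literal run
                sb.Length = 0;
                insideBraces = true;
            }
            else if (insideBraces && c == '}')
            {
                pockets.Add(sb.ToString().Split('|')); // close the alternatives
                sb.Length = 0;
                insideBraces = false;
            }
            else
            {
                sb.Append(c);
            }
        }

        // Flush whatever is left, according to the state the input ended in.
        pockets.Add(insideBraces ? sb.ToString().Split('|') : new[] { sb.ToString() });
        return pockets;
    }

    public static void Main()
    {
        foreach (string[] pocket in Split("{Hello|Hi} my {mate|m8}."))
            Console.WriteLine("[" + string.Join(", ", pocket) + "]");
    }
}
```

Note that a template starting with `{` yields an empty literal pocket first. That is harmless, since an empty string contributes nothing when the parts are joined.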


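This just had to be solved using Eric Lippert's/IanG's CartesianProduct Linq extension method (http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx), with which the core of the program becomes: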

public static void Main(string[] args)
{
    const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
    var pockets = Tokenize(data.GetEnumerator());

    foreach (var result in CartesianProduct(pockets))
        Console.WriteLine(string.Join("", result.ToArray()));
}
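
If the Linq version feels opaque, the same enumeration can be sketched with a recursive iterator. This is a swapped-in alternative, not the CartesianProduct method used in this answer; `RecursiveProductSketch` and `Expand` are hypothetical names, and the pockets are assumed to be already parsed.

```csharp
using System;
using System.Collections.Generic;

static class RecursiveProductSketch
{
    // Hypothetical alternative to the Linq-based CartesianProduct:
    // pick one alternative from pockets[index], then recurse on the rest.
    public static IEnumerable<string> Expand(IList<string[]> pockets, int index, string prefix)
    {
        if (index == pockets.Count)
        {
            yield return prefix; // every pocket has contributed one piece
            yield break;
        }

        foreach (string alternative in pockets[index])
            foreach (string sentence in Expand(pockets, index + 1, prefix + alternative))
                yield return sentence;
    }

    public static void Main()
    {
        var pockets = new List<string[]>
        {
            new[] { "Hello", "Hi" },
            new[] { " my " },
            new[] { "mate", "m8" },
            new[] { "." }
        };

        // Prints:
        // Hello my mate.
        // Hello my m8.
        // Hi my mate.
        // Hi my m8.
        foreach (string sentence in Expand(pockets, 0, ""))
            Console.WriteLine(sentence);
    }
}
```

Both approaches visit the combinations in the same order (the leftmost pocket varies slowest); the Linq version simply avoids writing the recursion by hand.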

Once the input is parsed into 'pockets' (originally with just two regexes, chunks and legs; now with the scanner), it becomes a matter of writing the CartesianProduct to the console :) Here is the full working code (.NET 3.5+):

using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Linq;
using System.Collections.Generic;

namespace X 
{ 
    static class Y 
    {
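        // Reads characters into a buffer until one of stopChars is found or the input ends.
        // Returns true if a stop character was found, false at end of input.
        // The action always receives the buffer (via finally), so trailing text is not lost.
        private static bool ReadTill(this IEnumerator<char> input, string stopChars, Action<StringBuilder> action)
        {
            var sb = new StringBuilder();

            try
            {
                while (input.MoveNext())
                {
                    if (stopChars.Contains(input.Current))
                        return true;

                    sb.Append(input.Current);
                }
            }
            finally
            {
                action(sb);
            }

            return false;
        }


        // Alternates between two states: literal text (up to '{') and alternatives (up to '}').
        // Each literal run becomes a one-element pocket; each {a|b|c} group becomes a pocket of its alternatives.
        private static IEnumerable<IEnumerable<string>> Tokenize(IEnumerator<char> input)
        {
            var result = new List<IEnumerable<string>>();

            while (input.ReadTill("{", sb => result.Add(new[] { sb.ToString() })) &&
                   input.ReadTill("}", sb => result.Add(sb.ToString().Split('|'))))
            {
                // Console.WriteLine("Expected cumulative results: " + result.Select(a => a.Count()).Aggregate(1, (i,j) => i*j));
            }

            return result;
        }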

        public static void Main(string[] args)
        {
            const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
            var pockets = Tokenize(data.GetEnumerator());

            foreach (var result in CartesianProduct(pockets))
                Console.WriteLine(string.Join("", result.ToArray()));
        }

        static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences) 
        { 
            IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() }; 
            return sequences.Aggregate( 
                    emptyProduct, 
                    (accumulator, sequence) =>  
                    from accseq in accumulator  
                    from item in sequence  
                    select accseq.Concat(new[] {item}));                
        }
    }
}
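
A quick sanity check on the output size: the number of generated sentences is the product of the number of alternatives in each `{...}` group. The sketch below computes that for the sample sentence; `CountSketch` and `ExpectedCount` are hypothetical names, and it assumes well-formed, non-nested braces.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class CountSketch
{
    // Hypothetical helper: multiplies the number of alternatives of every {...} group.
    public static long ExpectedCount(string template)
    {
        return Regex.Matches(template, @"\{([^{}]*)\}")
                    .Cast<Match>()
                    .Select(m => (long)m.Groups[1].Value.Split('|').Length)
                    .Aggregate(1L, (a, b) => a * b);
    }

    public static void Main()
    {
        const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";

        // 3 * 4 * 2 * 2 * 3 = 144
        Console.WriteLine(ExpectedCount(data));
    }
}
```

Since the count grows multiplicatively with every group, it may be worth checking before printing everything for long templates.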


Old Regex based parsing:

static readonly Regex chunks = new Regex(@"^(?<chunk>{.*?}|.*?(?={|$))+$", RegexOptions.Compiled);
static readonly Regex legs = new Regex(@"^{((?<alternative>.*?)[\|}])+(?<=})$", RegexOptions.Compiled);

private static IEnumerable<String> All(this Regex regex, string text, string group)
{
    return !regex.IsMatch(text) 
                ? new [] { text } 
                : regex.Match(text).Groups[group].Captures.Cast<Capture>().Select(c => c.Value);
}

public static void Main(string[] args)
{
    const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
    var pockets = chunks.All(data, "chunk").Select(v => legs.All(v, "alternative"));

The rest of Main is unchanged.
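
The regex variant relies on a .NET-specific feature: a group that matches repeatedly keeps every match in its `Captures` collection, not just the last one. Here is a small sketch of that behaviour on a single chunk. The pattern is a simplified version with escaped braces rather than the exact `legs` regex above, and `CapturesSketch` and `Alternatives` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class CapturesSketch
{
    // Simplified pattern (an assumption, not the exact regex from the answer):
    // one or more "alternative" pieces, each followed by '|' or the closing '}'.
    static readonly Regex Legs = new Regex(@"^\{((?<alternative>.*?)[|}])+(?<=\})$");

    // Returns the alternatives of a {a|b|c} chunk, or the chunk itself if it is plain text.
    public static List<string> Alternatives(string chunk)
    {
        Match m = Legs.Match(chunk);
        if (!m.Success)
            return new List<string> { chunk };

        // Captures holds every repetition of the named group, in order.
        return m.Groups["alternative"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Alternatives("{Hello|Hi|Hi-Hi}").ToArray())); // Hello, Hi, Hi-Hi
        Console.WriteLine(string.Join(", ", Alternatives(" my ").ToArray()));
    }
}
```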
