How to do a "REAL" string compare?


Question

Here's the deal... I have an application where I have a list (as in a Windows
list control, but that's not important) displayed to the user. I sort this
list based on the list control's sort function (again, it's not important that
it's Windows) which ends up calling a compare function in my code:

int CompareFunc(char* str1, char* str2)
{
}

This function returns -1, 0 or 1, which gets passed on to the internal quick
sort algorithm. No problem, it all works fine.

Now I have a user for whom this list displays "multi-part" items. You can
guess where this is headed :), the list ends up like this:

Item (1/100)
Item (11/100)
Item (2/100)

Now while that is a "correct" string sort, it's kind of lame. I could force
the user to zero-pad or zero-pad myself, but both seem kind of hokey as I am
either putting requirements on the user or changing his item text. I'd much
rather end up with:

Item (1/100)
Item (2/100)
..
..
..
Item (11/100)

As it should. Now keep in mind that this could end up in dozens of formats:
brackets, parens, dashes, asterisks, etc., or any endless supply of cutesy
characters a user might enter. Even the forward slash may not be the part
separator and there may be stuff after the part #s.

I've seen some applications do this in the past, but never saw the source
for them. How can this be sorted properly without requiring the user to
enter it in a very specific format? I could never handle every possible
format in my code. There must be some kind of cool generic way to do this.

Answers


Prefixing single digits with 0 is a fix (I assume this is what you mean by
"zero-pad").

Also prefixing with space does the same job:

#include <string>
#include <algorithm>
#include <iostream>
#include <list>

int main()
{
    using namespace std;

    list<string> somelist;

    // Single-digit part numbers are padded with a space by hand.
    somelist.push_back("Item ( 2/100)");
    somelist.push_back("Item ( 1/100)");
    somelist.push_back("Item (11/100)");

    // Plain lexicographic sort; a space orders before any digit.
    somelist.sort();

    for (list<string>::const_iterator p = somelist.begin();
         p != somelist.end(); ++p)
        cout << *p << "\n";
}

C:\c>temp
Item ( 1/100)
Item ( 2/100)
Item (11/100)

C:\c>

In summary just make sure all strings share the same length.

I am interested in a cleaner approach myself too.

--
Ioannis Vranos

http://www23.brinkster.com/noicys

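The padding idea can also be applied to a temporary sort key only, so the text shown to the user is never changed. The following is a minimal sketch of that variation, not code from the thread: the names `paddedKey` and `lessByPaddedKey` and the default width of 10 are hypothetical, and only ASCII digits are considered.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Build a comparison key in which every run of digits is left-padded with
// zeros to a fixed width. The input string itself is not modified.
// The width is an arbitrary assumption; longer digit runs are copied unpadded.
std::string paddedKey(const std::string& s, std::size_t width = 10)
{
    std::string key;
    std::size_t i = 0;
    while (i < s.size()) {
        if (std::isdigit(static_cast<unsigned char>(s[i]))) {
            // Find the end of this digit run.
            std::size_t j = i;
            while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j])))
                ++j;
            const std::size_t len = j - i;
            if (len < width)
                key.append(width - len, '0');
            key.append(s, i, len);
            i = j;
        } else {
            key += s[i++];
        }
    }
    return key;
}

// Strict-weak-ordering predicate: compares the padded keys, not the strings.
bool lessByPaddedKey(const std::string& a, const std::string& b)
{
    return paddedKey(a) < paddedKey(b);
}
```

With the list from the example above, `somelist.sort(lessByPaddedKey);` should order the unpadded strings "Item (1/100)", "Item (2/100)", "Item (11/100)" numerically while leaving each string as the user typed it. Rebuilding the key on every comparison is wasteful for long lists; caching the keys would be the obvious refinement.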

"Ioannis Vranos" <iv*@remove.this.grad.com> wrote in message
news:1110659832.315731@athnrd02...
Prefixing single digits with 0 is a fix (I assume this is what you mean by
"zero-pad").

Also prefixing with space does the same job:

#include <string>
#include <algorithm>
#include <iostream>
#include <list>

int main()
{
    using namespace std;

    list<string> somelist;

    // Single-digit part numbers are padded with a space by hand.
    somelist.push_back("Item ( 2/100)");
    somelist.push_back("Item ( 1/100)");
    somelist.push_back("Item (11/100)");

    // Plain lexicographic sort; a space orders before any digit.
    somelist.sort();

    for (list<string>::const_iterator p = somelist.begin();
         p != somelist.end(); ++p)
        cout << *p << "\n";
}

C:\c>temp
Item ( 1/100)
Item ( 2/100)
Item (11/100)

C:\c>

In summary just make sure all strings share the same length.

I am interested in a cleaner approach myself too.





Well, as I said, I don't want to change the text the user entered. Space
padding is just as bad as zero padding :). Just finding the part number in
the string could be tough since you can have:

Item #1 of 100
Item 1/100
Item (1/100)
Item 1-100
Item 1 / 100
Item 1/ 100
Item *1/100*
Item <1/100>
Item [1/100]

etc.

You could never handle every case... and what if someone wants to be cute
and do something like:

Item [1/100] -= by Fred =-

or

Item Part # 1-100 [1/100]
Item Part # 1-100 [2/100]

etc.

Some users would follow format requirements, but I guess a huge majority
wouldn't. And a lot of cutesy users would rather put the cutesy crap in the
description than have it properly sorted.
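One format-agnostic approach that seems to fit these examples is a "natural order" comparison: walk both strings in parallel, compare ordinary characters as usual, and whenever both strings are at a digit compare the whole digit runs by numeric value. The sketch below is an illustration under assumptions, not the method used by any particular application. It keeps the `CompareFunc` name from the question but takes `const char*` instead of `char*`, looks only at ASCII digits, and treats leading zeros as insignificant.

```cpp
#include <cctype>

// Natural-order comparison: digit runs compare by numeric value, all other
// characters compare one by one. Returns -1, 0 or 1.
int CompareFunc(const char* str1, const char* str2)
{
    const unsigned char* a = reinterpret_cast<const unsigned char*>(str1);
    const unsigned char* b = reinterpret_cast<const unsigned char*>(str2);

    while (*a && *b) {
        if (std::isdigit(*a) && std::isdigit(*b)) {
            // Skip leading zeros so "007" and "7" carry the same value.
            while (*a == '0') ++a;
            while (*b == '0') ++b;
            // Find the end of both digit runs.
            const unsigned char* ea = a;
            const unsigned char* eb = b;
            while (std::isdigit(*ea)) ++ea;
            while (std::isdigit(*eb)) ++eb;
            // Without leading zeros, the longer run is the larger number.
            if (ea - a != eb - b)
                return (ea - a < eb - b) ? -1 : 1;
            // Same length: the first differing digit decides.
            for (; a != ea; ++a, ++b)
                if (*a != *b)
                    return (*a < *b) ? -1 : 1;
            // Equal numbers: continue after both runs.
        } else {
            if (*a != *b)
                return (*a < *b) ? -1 : 1;
            ++a;
            ++b;
        }
    }
    if (*a) return 1;   // str1 still has characters left
    if (*b) return -1;  // str2 still has characters left
    return 0;
}
```

Because no digit run is ever converted to an integer, arbitrarily long numbers cannot overflow. With this sketch, "Item (2/100)" sorts before "Item (11/100)", and "Item Part # 1-100 [1/100]" sorts before "Item Part # 1-100 [2/100]", without the code knowing anything about brackets or slashes. It does not help when one list mixes formats (for example "Item 1/100" next to "Item (2/100)"), since the differing punctuation is compared before the numbers are reached.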



There are infinitely many criteria for sorting strings. There's no way to answer
your question without getting more of an idea of the range of possible inputs,
what they represent, and why you think various algorithms are lame.

One thing you might do is count any non-alphanumeric character as a separator,
and then lexicographically sort the resulting sequences with individual components
ordered numerically. OTOH, maybe this, too, is lame.

Jonathan
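The separator idea in the reply above might be sketched as follows. This is one possible reading of the suggestion, not Jonathan's code: the function names are hypothetical, only ASCII alphanumerics are recognised, and a component counts as numeric only if it consists entirely of digits.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split a string into runs of alphanumeric characters; every other
// character acts as a separator and is discarded.
std::vector<std::string> splitComponents(const std::string& s)
{
    std::vector<std::string> parts;
    std::string current;
    for (unsigned char c : s) {
        if (std::isalnum(c)) {
            current += static_cast<char>(c);
        } else if (!current.empty()) {
            parts.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())
        parts.push_back(current);
    return parts;
}

bool allDigits(const std::string& s)
{
    return !s.empty() && std::all_of(s.begin(), s.end(),
        [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Compare two components: numerically if both are all digits, as text otherwise.
int compareComponent(const std::string& a, const std::string& b)
{
    if (allDigits(a) && allDigits(b)) {
        // Strip leading zeros; then the longer number is the larger one.
        const std::string x = a.substr(std::min(a.find_first_not_of('0'), a.size()));
        const std::string y = b.substr(std::min(b.find_first_not_of('0'), b.size()));
        if (x.size() != y.size())
            return x.size() < y.size() ? -1 : 1;
        const int c = x.compare(y);
        return (c > 0) - (c < 0);
    }
    const int c = a.compare(b);
    return (c > 0) - (c < 0);
}

// Lexicographic comparison of the component sequences. Returns -1, 0 or 1.
int compareByComponents(const std::string& s1, const std::string& s2)
{
    const std::vector<std::string> p1 = splitComponents(s1);
    const std::vector<std::string> p2 = splitComponents(s2);
    const std::size_t n = std::min(p1.size(), p2.size());
    for (std::size_t i = 0; i < n; ++i) {
        const int c = compareComponent(p1[i], p2[i]);
        if (c != 0)
            return c;
    }
    if (p1.size() != p2.size())
        return p1.size() < p2.size() ? -1 : 1;
    return 0;
}
```

A side effect of discarding separators is that "Item [1/100]" and "Item <1/100>" compare as equal, which may or may not be what a given list wants. Splitting both strings on every comparison is also fairly costly, so a real sort would probably split each item once up front.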

