Mapping arbitrary strings to RGB values

Question

I have a huge set of arbitrary natural-language strings. For my tool to analyze them, I need to convert each string to a unique color value (RGB or other). I need the color contrast to depend on string similarity: the more two strings differ, the more their respective colors should differ. It would be perfect if the same string always produced the same color value.

Any advice on how to approach this problem?

I probably need "similarity" defined as a Levenshtein-like distance. No natural language parsing is required.

That is:

"I am going to the store" and 
"We are going to the store"

Similar.

"I am going to the store" and 
"I am going to the store today"

Similar as well (but slightly less).

"I am going to the store" and 
"J bn hpjoh up uif tupsf"

Not similar at all.

(Thanks, Welbog! http://stackoverflow.com/questions/495662/mapping-arbitrary-strings-to-rgb-values/495717#495717)
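For reference, here is a minimal sketch of the plain Levenshtein distance applied to the three example pairs above. It is only one possible reading of "Levenshtein-like"; the function name `levenshtein` is chosen for illustration and is not from the original post.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


base = "I am going to the store"
for other in ("We are going to the store",
              "I am going to the store today",
              "J bn hpjoh up uif tupsf"):
    print(levenshtein(base, other), repr(other))
```

With this definition the first pair is 4 edits apart and the second is 6, which matches the ordering described above; the shifted-letter pair scores far higher.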

I will probably know exactly which distance function I need only once I see the program's output, so let's start with simpler things.

I've removed my own suggestion to split the task into two parts, absolute distance calculation and color distribution. That would not work well, because it first reduces multi-dimensional information to a single dimension and then tries to expand it back to three dimensions.

Recommended answer

You need to elaborate more on what you mean by "similar strings" in order to come up with an appropriate conversion function. Are the strings

 "I am going to the store" and 
"We are going to the store"

considered similar? What about the strings

 "I am going to the store" and 
"J bn hpjoh up uif tupsf"

(every letter of the original shifted by one), or

 "I am going to the store" and 
"I am going to the store today"

? Depending on what you mean by "similar", you might consider different functions.

If the difference can be based solely on the values of the characters (in Unicode or whatever space they come from), then you can try summing the values and using the result as a hue in HSV space. If a longer string should make the colours more different, you might consider weighting characters by their position in the string.
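A minimal sketch of that idea, assuming Python's standard `colorsys` module. The function name `string_to_rgb`, the modulus of 360, the fixed saturation and value of 1.0, and the `(i + 1)` position weight are all illustrative choices, not something the answer specifies.

```python
import colorsys
from typing import Tuple


def string_to_rgb(text: str, position_weighted: bool = False) -> Tuple[int, int, int]:
    """Map a string to an RGB triple by using a character-value sum as the hue."""
    if position_weighted:
        # Assumption: weight each code point by its 1-based position.
        total = sum(ord(ch) * (i + 1) for i, ch in enumerate(text))
    else:
        total = sum(ord(ch) for ch in text)
    hue = (total % 360) / 360.0  # wrap the sum onto the hue circle
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


print(string_to_rgb("I am going to the store"))
print(string_to_rgb("I am going to the store", position_weighted=True))
```

The mapping is deterministic, so the same string always gets the same colour. Note that the unweighted sum gives anagrams identical colours, and because the sum wraps around the hue circle, nearby colours do not guarantee similar strings.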

If the difference is more complex, such as the occurrence of certain letters or words, then you need to identify this. Maybe you can decide the red, green and blue values based on the number of Es, Ss and Rs in a string, if your domain has a lot of these. Or pick a hue based on the ratio of vowels to consonants, or of words to syllables.
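As a rough sketch of the letter-count variant: each channel below is the share of one letter in the string, scaled and clamped to 0..255. The function name `letter_count_rgb`, the normalisation by string length, and the `gain` factor are assumptions made for illustration; the answer only suggests counting the letters.

```python
from typing import Tuple


def letter_count_rgb(text: str, letters: str = "esr", gain: float = 4.0) -> Tuple[int, ...]:
    """Derive one colour channel per letter in `letters` from its frequency."""
    lowered = text.lower()
    length = max(len(lowered), 1)  # avoid dividing by zero for empty strings
    channels = []
    for letter in letters:
        share = lowered.count(letter) / length
        # Hypothetical scaling: `gain` stretches small shares, then clamp.
        channels.append(min(255, round(255 * share * gain)))
    return tuple(channels)


print(letter_count_rgb("I am going to the store"))
print(letter_count_rgb("We are going to the store"))
```

Strings with similar letter distributions land close together in RGB space, but so do unrelated strings that happen to share those counts, so the letters would need to be chosen to suit the domain.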

There are many, many different ways to approach this, but the best one really depends on what you mean by "similar" strings.
