Genetic algorithm program using Java


Question

I am working on a text summarization project, and I understand that a genetic algorithm (GA) can be used for summarization.

The project works as follows: a news article is taken as input and pre-processed, and each sentence in the pre-processed text is then given a score. I have finished up to that point. I now want to apply a GA to the sentence scores to produce a summary of a single document, but I have no idea how a GA can be applied to text. If anybody knows, please explain with an example that uses text rather than genes.

I know that in a GA the fitness is calculated first. After that, I don't know how to apply the mutation and crossover concepts to text. If you have a program, please help me...

Recommended answer

Well, this is not a simple question... (since the field you are digging into is still very much a research topic).

I would recommend:

1. Read current articles about automatic text summarization and the application of genetic algorithms.

For this you can search on scholar.google.com or look at, e.g., the proceedings of AI conferences such as ECAI or IJCAI.

2. Understand the data:

I.e. you have a sentence from which you model some of the information as a sequence of values, which then "kind of" form your "gene sequence". The problem here is to find significant information from which you can conclude that a specific word needs to be in the summary, e.g. the word's frequency in the text (but be careful here: stop words like "no", "yes", etc. should be filtered out). Otherwise you will need so much data to train your model that it will simply be infeasible.

3. Check the standard preprocessing methods in computational linguistics, such as stemming, stop-word filtering, taggers for phrase types, etc. These will all reduce your search space and simplify your search for the "gene" sequence ;)

4. To me this seems to be an extremely ambitious project (but I've been out of the field for a few years now...)
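To make the "gene sequence" idea a bit more concrete, here is a minimal Java sketch of one common way to set it up. It is only an illustration under stated assumptions, not a method taken from the question or from any particular paper. It assumes each sentence already has a non-negative score (as the question describes), and it encodes a candidate summary as a bit mask with one gene per sentence. The class name `SummaryGa`, the constants, the fitness function (sum of selected scores, with a penalty for overly long summaries) and the sample scores are all hypothetical choices.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative GA sketch: a chromosome is a bit mask over sentences (true = keep). */
final class SummaryGa {
    // Assumed parameters; they would need tuning on real data.
    static final int POPULATION = 30;
    static final int GENERATIONS = 100;
    static final double MUTATION_RATE = 0.05;

    /** Sum of the selected sentence scores; negative if the summary is too long. */
    static double fitness(boolean[] genes, double[] scores, int maxSentences) {
        double total = 0;
        int selected = 0;
        for (int i = 0; i < genes.length; i++) {
            if (genes[i]) { total += scores[i]; selected++; }
        }
        return selected > maxSentences ? -(selected - maxSentences) : total;
    }

    /** Single-point crossover: genes before 'cut' come from a, the rest from b. */
    static boolean[] crossover(boolean[] a, boolean[] b, int cut) {
        boolean[] child = Arrays.copyOf(a, a.length);
        System.arraycopy(b, cut, child, cut, b.length - cut);
        return child;
    }

    /** Bit-flip mutation: each sentence is toggled in or out with probability 'rate'. */
    static void mutate(boolean[] genes, double rate, Random rng) {
        for (int i = 0; i < genes.length; i++) {
            if (rng.nextDouble() < rate) genes[i] = !genes[i];
        }
    }

    /** Tournament selection of size 2: the fitter of two random individuals. */
    static boolean[] select(boolean[][] pop, double[] scores, int max, Random rng) {
        boolean[] x = pop[rng.nextInt(pop.length)];
        boolean[] y = pop[rng.nextInt(pop.length)];
        return fitness(x, scores, max) >= fitness(y, scores, max) ? x : y;
    }

    /** Returns the fittest individual of the population. */
    static boolean[] best(boolean[][] pop, double[] scores, int max) {
        boolean[] best = pop[0];
        for (boolean[] g : pop) {
            if (fitness(g, scores, max) > fitness(best, scores, max)) best = g;
        }
        return best;
    }

    /** Runs the GA and returns a mask with at most maxSentences sentences selected. */
    static boolean[] evolve(double[] scores, int maxSentences, Random rng) {
        int n = scores.length;
        boolean[][] pop = new boolean[POPULATION][n];
        for (boolean[] g : pop) {
            // Start from valid summaries: at most maxSentences random picks each.
            for (int k = 0; k < maxSentences; k++) g[rng.nextInt(n)] = true;
        }
        for (int gen = 0; gen < GENERATIONS; gen++) {
            boolean[][] next = new boolean[POPULATION][];
            next[0] = best(pop, scores, maxSentences).clone(); // elitism keeps the best
            for (int i = 1; i < POPULATION; i++) {
                boolean[] child = crossover(select(pop, scores, maxSentences, rng),
                                            select(pop, scores, maxSentences, rng),
                                            rng.nextInt(n));
                mutate(child, MUTATION_RATE, rng);
                next[i] = child;
            }
            pop = next;
        }
        return best(pop, scores, maxSentences);
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.1, 0.8, 0.2, 0.7}; // hypothetical sentence scores
        boolean[] summary = evolve(scores, 2, new Random(42));
        for (int i = 0; i < summary.length; i++) {
            if (summary[i]) System.out.println("keep sentence " + i);
        }
    }
}
```

In this encoding, crossover mixes two candidate summaries and mutation adds or drops a single sentence, so neither operator ever has to touch the words themselves. A more realistic fitness function would probably also reward coverage and penalise redundancy between the selected sentences, which this sketch does not attempt.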

Hope this helps a bit,
Cheers, Arndt

