Bert fine-tuned for semantic similarity


Question

I would like to apply fine-tuned BERT to calculate the semantic similarity between sentences. I have searched a lot of websites, but have found almost nothing about this downstream task.

I just found the STS benchmark. I wonder if I can use the STS benchmark dataset to train a fine-tuned BERT model and apply it to my task. Is that reasonable?

As far as I know, there are many methods for calculating similarity, including cosine similarity, Pearson correlation, Manhattan distance, etc. How do I choose among them for semantic similarity?

Answer

As a general remark ahead, I want to stress that this kind of question might not be considered on-topic on Stackoverflow; see How to ask. There are, however, related sites that might be better suited to these kinds of questions (no code, theoretical PoV), namely AI Stackexchange or Cross Validated.

If you look at a rather popular paper in the field by Mueller and Thyagarajan, which is concerned with learning sentence similarity with LSTMs, they use a closely related dataset (the SICK dataset), which is also hosted by the SemEval competition and ran alongside the STS benchmark in 2014.

Either one of those should be a reasonable dataset to fine-tune on, but STS has run over multiple years, so the amount of available training data might be larger.
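
To make this concrete, here is a minimal sketch of what fine-tuning on the STS benchmark could look like with the sentence-transformers library. The checkpoint name, the toy sentence pairs, and the hyperparameters below are placeholder assumptions, not part of the original answer; in practice you would load the full STS training split.

```python
# Minimal sketch: fine-tuning a BERT-based sentence encoder on STS-style data
# with the sentence-transformers library (pip install sentence-transformers).
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Start from a pretrained BERT checkpoint wrapped as a sentence encoder.
model = SentenceTransformer("bert-base-uncased")

# STS gold scores range from 0 to 5; normalize them to [0, 1] for
# CosineSimilarityLoss. Replace these toy pairs with the real STS data.
train_examples = [
    InputExample(texts=["A man is playing a guitar.",
                        "Someone is playing an instrument."], label=4.0 / 5.0),
    InputExample(texts=["A man is playing a guitar.",
                        "A woman is slicing an onion."], label=0.2 / 5.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# The loss pushes the cosine similarity of the two sentence embeddings
# toward the normalized gold score.
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1, warmup_steps=100)
```

After training, `model.encode(sentences)` yields embeddings whose cosine similarity can be used directly as a semantic similarity score for your own task.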

As a great primer on the topic, I can also highly recommend the Medium article by Adrien Sieg (see here), which comes with an accompanying GitHub reference.

For semantic similarity, I would estimate that you are better off fine-tuning (or training) a neural network, as most of the classical similarity measures you mentioned focus more on token similarity (and thus syntactic similarity, although not even that necessarily). Semantic meaning, on the other hand, can sometimes differ wildly on a single word (maybe a negation, or the swapped sentence positions of two words), which is difficult to capture or evaluate with static methods.
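
As an illustration, the sketch below applies the measures from the question to sentence embeddings rather than to raw tokens; it assumes a pretrained sentence-transformers model (the model name is an assumption), using a pair where a single negation flips the meaning even though the tokens are nearly identical.

```python
# Sketch: the classical measures from the question, applied to sentence
# embeddings. Any encoder returning fixed-size vectors would do.
import numpy as np
from scipy.stats import pearsonr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model name

# One word ("not") flips the meaning despite near-identical tokens.
a, b = model.encode(["The service was good.",
                     "The service was not good."])

cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
pearson = pearsonr(a, b)[0]      # correlation across embedding dimensions
manhattan = np.abs(a - b).sum()  # L1 distance (lower = more similar)

print(f"cosine={cosine:.3f}  pearson={pearson:.3f}  manhattan={manhattan:.1f}")
```

Note that the choice of measure matters less than the quality of the embeddings: a token-overlap metric would rate this pair as almost identical, whereas an encoder fine-tuned on a task like STS has at least the chance to reflect the semantic difference.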
