How does GitHub figure out a project's language?
Problem description
I was recently working on a GitHub project in both JavaScript and C++, and noticed that GitHub tagged the project as C++. If you have to pick a single language, this is probably the correct designation, since the C++ code is compiled as a JavaScript library, but it made me wonder: how does GitHub figure out which language to tag each project with?
Update April 2013, by nuclearsandwich (GitHub support team or "supportocat"):
The help page "My repository is marked as the wrong language" now mentions that the Linguist library is used to determine file language for syntax highlighting and repository statistics. Linguist excludes certain file names and paths from the statistics, such as certain vendor files and directories.
The help page "Why isn't my favorite language recognized?" adds:
If your desired language is not receiving syntax highlighting you can contribute to the Linguist library to add it.
(Original answer, Oct. 2012)
This thread on GitHub support explains it:
It just sums up file sizes for each extension. Largest one "wins".
We'd like to avoid opening files up and parsing their content, as both would slow down the process... but that might be the only method of resolving conflicts like this one.
Since this is not 100% accurate, it has led some to add:
I, too, would vote for a simple manual-override switch for the cases where the guess is wrong.
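The size-summing heuristic quoted above can be sketched roughly as follows. This is only an illustration of "sum file sizes per extension, largest wins", not GitHub's actual code; the function names and the sample file listing are hypothetical.

```python
import os
from collections import defaultdict


def bytes_per_extension(files):
    """Sum file sizes per extension.

    `files` is an iterable of (path, size_in_bytes) pairs.
    Files without an extension are ignored in this sketch.
    """
    totals = defaultdict(int)
    for path, size in files:
        ext = os.path.splitext(path)[1].lower()
        if ext:
            totals[ext] += size
    return dict(totals)


def guess_primary_extension(files):
    """Return the extension with the largest total size, or None if there is none."""
    totals = bytes_per_extension(files)
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical repository listing: mostly C++ with a smaller JavaScript part.
sample_files = [
    ("src/core.cpp", 48_000),
    ("src/core.h", 6_000),
    ("bindings/api.js", 12_000),
    ("README", 2_000),
]

primary = guess_primary_extension(sample_files)
print(primary)
```

With a listing like this one, the C++ sources outweigh the JavaScript ones, which matches the behavior described in the question. The sketch also shows why the approach can misfire: it never looks at file contents, so one large file with a misleading extension can decide the result.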
Note: as Mark Rushakoff mentions in his answer (upvoted), the guessing has improved since then thanks to the Linguist project (open-sourced in June 2011).
You can see there are still issues though: GitHub Linguist Issues.
See here for more details:
Once the language has been detected, it is passed to Albino, a Pygments wrapper, which does the actual syntax highlighting.
You can also add Linguist directives to a .gitattributes file.
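As an illustration, a .gitattributes file using such directives might look like the fragment below. The paths are hypothetical, and the attribute names are assumed to be those described in the Linguist README (`linguist-language`, `linguist-vendored`, `linguist-documentation`); check the current Linguist documentation before relying on them.

```
# Treat header files as C++ rather than C (hypothetical pattern)
*.h linguist-language=C++

# Mark third-party code as vendored so it is left out of the language statistics
third_party/* linguist-vendored

# Mark a docs directory as documentation so it is also left out
docs/* linguist-documentation
```

This provides the kind of manual override requested in the support thread: the per-file guess stays automatic, but the repository owner can correct it where it is wrong.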