How does code completion work?

Question

Lots of editors and IDEs have code completion. Some of them are very "intelligent"; others are not. I am interested in the more intelligent type. For example, I have seen IDEs that only offer a function if a) it is available in the current scope and b) its return value is valid. (For example, after "5 + foo[tab]" it only offers functions that return something that can be added to an integer, or variable names of the correct type.) I have also seen that they place the more often used or longest option at the top of the list.

I realize you need to parse the code. But while editing, the current code is usually invalid: there are syntax errors in it. How do you parse something when it is incomplete and contains errors?

There is also a time constraint. The completion is useless if it takes seconds to come up with a list. Sometimes the completion algorithm deals with thousands of classes.

What are good algorithms and data structures for this?

Recommended answer

The IntelliSense engine in my UnrealScript language service product is complicated, but I'll give the best overview here that I can. The C# language service in VS2008 SP1 is my performance goal (for good reason). It's not there yet, but it's fast and accurate enough that I can safely offer suggestions after a single character is typed, without waiting for ctrl+space or the user typing a . (dot). The more information people [working on language services] get about this subject, the better the end-user experience I get should I ever use their products. There are a number of products I've had the unfortunate experience of working with that didn't pay such close attention to details, and as a result I was fighting with the IDE more than I was coding.

In my language service, it's laid out like the following:

  1. Get the expression at the cursor. This walks from the beginning of the member access expression to the end of the identifier the cursor is over. The member access expression is generally in the form aa.bb.cc, but can also contain method calls as in aa.bb(3+2).cc.
  2. Get the context surrounding the cursor. This is very tricky, because it doesn't always follow the same rules as the compiler (long story), but for here assume it does. Generally this means get the cached information about the method/class the cursor is within.
  3. Say the context object implements IDeclarationProvider, where you can call GetDeclarations() to get an IEnumerable<IDeclaration> of all items visible in the scope. In my case, this list contains the locals/parameters (if in a method), members (fields and methods, static only unless in an instance method, and no private members of base types), globals (types and constants for the language I'm working on), and keywords. In this list will be an item with the name aa. As a first step in evaluating the expression in #1, we select the item from the context enumeration with the name aa, giving us an IDeclaration for the next step.
  4. Next, I apply the operator to the IDeclaration representing aa to get another IEnumerable<IDeclaration> containing the "members" (in some sense) of aa. Since the . operator is different from the -> operator, I call declaration.GetMembers(".") and expect the IDeclaration object to correctly apply the listed operator.
  5. This continues until I hit cc, where the declaration list may or may not contain an object with the name cc. As I'm sure you're aware, if multiple items begin with cc, they should appear as well. I solve this by taking the final enumeration and passing it through my documented algorithm to provide the user with the most helpful information possible.
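
A minimal sketch of the five steps above, written in Python rather than the C# the answer describes. The `Declaration` class stands in for `IDeclaration`, and `expression_at_cursor`, `split_chain`, and `complete` are hypothetical helper names. The context lookup of step 2 is assumed to have already produced the `scope` list, and a method's members are assumed to be the members of its return type.

```python
from dataclasses import dataclass, field


@dataclass
class Declaration:
    """A named item plus the members each operator exposes on it."""
    name: str
    members: dict = field(default_factory=dict)  # operator -> list of Declaration

    def get_members(self, operator):
        # Stand-in for declaration.GetMembers("."): the declaration decides
        # what the given operator makes visible.
        return self.members.get(operator, [])


def expression_at_cursor(text, cursor):
    """Step 1: walk left from the cursor over identifiers, dots and balanced calls."""
    start, depth = cursor, 0
    while start > 0:
        ch = text[start - 1]
        if ch == ")":
            depth += 1
        elif ch == "(":
            if depth == 0:
                break  # an opening paren we are inside of, not part of the chain
            depth -= 1
        elif depth == 0 and not (ch.isalnum() or ch in "_."):
            break
        start -= 1
    return text[start:cursor]


def split_chain(expression):
    """Split 'aa.bb(3+2).cc' into ['aa', 'bb', 'cc'], dropping call arguments."""
    segments, current, depth = [], "", 0
    for ch in expression:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0:
            if ch == ".":
                segments.append(current)
                current = ""
            else:
                current += ch
    segments.append(current)
    return segments


def complete(scope, expression):
    """Steps 3-5: resolve every segment but the last, then prefix-filter the last."""
    *path, prefix = split_chain(expression)
    candidates = scope
    for name in path:
        match = next((d for d in candidates if d.name == name), None)
        if match is None:
            return []  # unknown name: nothing sensible to offer
        candidates = match.get_members(".")
    return sorted(d.name for d in candidates if d.name.startswith(prefix))


# Hypothetical declarations for the aa.bb(3+2).cc example.
bb = Declaration("bb", {".": [Declaration("cc"), Declaration("ccount"), Declaration("other")]})
aa = Declaration("aa", {".": [bb]})
scope = [aa, Declaration("local")]

text = "x = aa.bb(3+2).cc"
print(complete(scope, expression_at_cursor(text, len(text))))  # ['cc', 'ccount']
```

The final prefix filter is only a placeholder: the answer passes the last enumeration through a separate ranking algorithm that is not described here.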

Here are some additional notes for the IntelliSense backend:

  • I make extensive use of LINQ's lazy evaluation mechanisms in implementing GetMembers. Each object in my cache is able to provide a functor that evaluates to its members, so performing complicated actions with the tree is near trivial.
  • Instead of each object keeping a List<IDeclaration> of its members, I keep a List<Name>, where Name is a struct containing the hash of a specially-formatted string describing the member. There's an enormous cache that maps names to objects. This way, when I re-parse a file, I can remove all items declared in the file from the cache and repopulate it with the updated members. Due to the way the functors are configured, all expressions immediately evaluate to the new items.
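
The name-keyed cache can be sketched roughly as follows, again in Python instead of C#. Everything here is an illustrative assumption: `DeclarationCache`, `replace_file`, and `members_of` are made-up names, plain strings stand in for the hashed `Name` struct, and a generator stands in for the LINQ-based functors.

```python
from dataclasses import dataclass, field


@dataclass
class Declaration:
    name: str          # cache key, e.g. "A.foo" (the answer uses a hash of such a string)
    source_file: str
    member_names: list = field(default_factory=list)  # names only, no object references


class DeclarationCache:
    """Maps names to declarations so no object holds a stale reference to another."""

    def __init__(self):
        self._by_name = {}

    def replace_file(self, source_file, declarations):
        # Drop everything the file declared before, then add the fresh parse result.
        self._by_name = {name: decl for name, decl in self._by_name.items()
                         if decl.source_file != source_file}
        for decl in declarations:
            self._by_name[decl.name] = decl

    def members_of(self, name):
        # A generator: nothing is looked up until the caller iterates, so the
        # result always reflects the cache contents at iteration time.
        owner = self._by_name.get(name)
        if owner is None:
            return
        for member_name in owner.member_names:
            member = self._by_name.get(member_name)
            if member is not None:
                yield member


cache = DeclarationCache()
cache.replace_file("A.uc", [Declaration("A", "A.uc", ["A.x"]),
                            Declaration("A.x", "A.uc")])

lazy_members = cache.members_of("A")  # created before the re-parse

# Re-parsing the file adds a new field y.
cache.replace_file("A.uc", [Declaration("A", "A.uc", ["A.x", "A.y"]),
                            Declaration("A.x", "A.uc"),
                            Declaration("A.y", "A.uc")])

print([m.name for m in lazy_members])  # ['A.x', 'A.y']
```

Because members are stored as names and resolved only on iteration, the lazily created lookup picks up the re-parsed declarations without any explicit invalidation step.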

IntelliSense "frontend"

As the user types, the file is syntactically incorrect more often than it is correct. As such, I don't want to haphazardly remove sections of the cache when the user types. I have a large number of special-case rules in place to handle incremental updates as quickly as possible. The incremental cache is only kept local to an open file and helps ensure the user doesn't realize that their typing is causing the backend cache to hold incorrect line/column information for things like each method in the file.

  • One redeeming factor is that my parser is fast. It can handle a full cache update of a 20000 line source file in 150ms while operating self-contained on a low priority background thread. Whenever this parser completes a pass on an open file successfully (syntactically), the current state of the file is moved into the global cache.
  • If the file is not syntactically correct, I use an ANTLR filter parser (sorry about the link - most info is on the mailing list or gathered from reading the source) to reparse the file looking for:
    • Variable/field declarations.
    • The signature for class/struct definitions.
    • The signature for method definitions.
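  • Making sure I can identify the correct context of the cursor, since a method can and does move within the file between full parses.
  • Making sure Go To Declaration/Definition/Reference locates items correctly in open files.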

Code snippet for the previous section:

      class A
      {
          int x; // linked to A
      
          void foo() // linked to A
          {
              int local; // linked to foo()
      
          // foo() ends here because bar() is starting
          void bar() // linked to A
          {
              int local2; // linked to bar()
          }
      
          int y; // linked again to A
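
One way to read the comments in the snippet above is that each declaration is linked to the nearest element that is still open, and that a method is treated as finished as soon as the next method signature starts. The sketch below implements that reading in Python. It is an inference from the snippet, not the author's implementation: the answer uses an ANTLR filter parser, while this swaps in three simple regular expressions that only cover the declaration shapes appearing in the snippet. `link_declarations` and `owner_name` are hypothetical names.

```python
import re

CLASS = re.compile(r"^\s*class\s+(\w+)")
METHOD = re.compile(r"^\s*\w+\s+(\w+)\s*\(\s*\)\s*$")   # e.g. "void foo()"
VARIABLE = re.compile(r"^\s*\w+\s+(\w+)\s*;")           # e.g. "int x;"

SNIPPET = """\
class A
{
    int x; // linked to A

    void foo() // linked to A
    {
        int local; // linked to foo()

    // foo() ends here because bar() is starting
    void bar() // linked to A
    {
        int local2; // linked to bar()
    }

    int y; // linked again to A
"""


def owner_name(stack):
    return stack[-1][0] if stack else None


def link_declarations(source):
    """Return (name, owner) pairs for a file whose braces may be unbalanced."""
    stack = []   # unclosed elements as (name, kind, brace depth when opened)
    links = []
    depth = 0
    for line in source.splitlines():
        code = line.split("//")[0]  # ignore line comments
        class_match = CLASS.match(code)
        method_match = METHOD.match(code)
        variable_match = VARIABLE.match(code)
        if class_match:
            stack.append((class_match.group(1), "class", depth))
        elif method_match:
            # Methods do not nest, so a new signature ends an unclosed method
            # and restores the brace depth that method was opened at.
            if stack and stack[-1][1] == "method":
                depth = stack.pop()[2]
            links.append((method_match.group(1), owner_name(stack)))
            stack.append((method_match.group(1), "method", depth))
        elif variable_match:
            links.append((variable_match.group(1), owner_name(stack)))
        depth += code.count("{")
        for _ in range(code.count("}")):
            depth -= 1
            # A brace returning to an element's opening depth closes it.
            if stack and stack[-1][2] == depth:
                stack.pop()
    return links


for name, owner in link_declarations(SNIPPET):
    print(f"{name} -> {owner}")
```

On the snippet this links x, foo, bar and y to A, local to foo, and local2 to bar, matching the comments. A real implementation would need proper tokenizing; for instance, the `VARIABLE` pattern here would also match a statement such as `return x;`.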
      

I figured I'd add a list of the IntelliSense features I've implemented with this layout. Pictures of each are located here.

  • Auto-complete
  • Tool tips
  • Method tips
  • Class View
  • Code Definition Window
  • Call Browser (VS 2010 finally adds this to C#)
  • Semantically correct Find All References
