Parsing text in Ruby

Question

I'm working on a script that imports component information into SketchUp. A very helpful person on their help page helped me write one that works on an "edited", line-by-line text file. Now I'm ready to take it to the next level: importing directly from the original file created by FreePCB.

The portion of the file I wish to use, "sample_1.txt", is below:

[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: IC1
  ref_text: 1270000 177800 270 2540000 2286000 1
  package: "DIP-8-3X"
  value: "JRC 4558" 1270000 177800 270 10668000 508000 0
  shape: "DIP-8-3"
  pos: 2540000 27940000 0 90 0

part: R1
  ref_text: 1270000 127000 0 3380000 -600000 1
  package: "RES-CF-1/4W-4X"
  value: "470" 1270000 127000 0 2180000 -2900000 0
  shape: "RES-CF-1/4W-4"
  pos: 15240000 20320000 0 270 0

The word [parts], in brackets, is just a section heading. The information I wish to extract is the reference designator, shape, position, and rotation. I already have code that does this for a reformatted text file, using IO.readlines(file).each{ |line| data = line.split(" ");.

My current method uses a text file reformatted like this, "sample_2.txt":

C1 CAP-AX-10X18-7 10160000 10160000 0 0 0
IC1 DIP-8-3 2540000 27940000 0 90 0
R1 RES-CF-1/4W-4 15240000 20320000 0 270 0

I then index into the array to extract data[0], data[1], data[2], data[3], and data[5]. An additional step appends ".skp" to the end of the package name, so that the script can insert the component that has the same name as the package.

I would like to extract the information from the first example without having to reformat the file, as I do for the second example. In other words, I know how to pull information from a single string split by spaces. How do I do it when the text for one array is spread over more than one line?

Thanks in advance for any help ;-)

Below is the full code that parses "sample_2.txt", which was reformatted before running the script.

    # import.rb - extracts component info from text file

    # Launch file browser
    file=UI.openpanel "Open Text File", "c:\\", "*.txt"

    # Do for each line, what appears in braces {}
    IO.readlines(file).each{ |line| data = line.split(" ");

    # Append second element in array "data[1]", with SketchUp file extension
    data[1] += ".skp"

    # Search for component with same name as data[1], and insert in component browser
    component_path = Sketchup.find_support_file data[1] ,"Components"
    component_def = Sketchup.active_model.definitions.load component_path

    # Create transformation from "origin" to point "location", convert data[] to float
    location = [data[2].to_f, data[3].to_f, 0]
    translation = Geom::Transformation.new location

    # Convert rotation "data[5]" to radians, and into float
    angle = data[5].to_f*Math::PI/180.to_f
    rotation = Geom::Transformation.rotation [0,0,0], [0,0,1], angle

    # Insert an instance of component in model, and apply transformation
    instance = Sketchup.active_model.entities.add_instance component_def, translation*rotation

    # Rename component 
    instance.name=data[0]

    # Ending brace for "IO.readlines(file).each{"
    }

Running "import.rb" on "sample_2.txt" produces the following output.

    C1 CAP-AX-10X18-7 10160000 10160000 0
    IC1 DIP-8-3 2540000 27940000 90
    R1 RES-CF-1/4W-4 15240000 20320000 270

I am trying to get the same results from the unedited original file "sample_1.txt", without the extra step of stripping information out of the file in Notepad to produce "sample_2.txt". The keywords followed by a colon (part, shape, pos) appear only in this part of the document and nowhere else. However, the document is rather lengthy, and I need the script to ignore everything that appears before and after the [parts] section.

Recommended answer

Your question is not clear, but this:

text.scan(/^\s+shape: "(.*?)"\s+pos: (\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)

will give you:

[["CAP-AX-10X18-7", "10160000", "10160000", "0", "0", "0"],
 ["DIP-8-3", "2540000", "27940000", "0", "90", "0"],
 ["RES-CF-1/4W-4", "15240000", "20320000", "0", "270", "0"]]
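As a quick check, the pattern can be run against the sample from the question. This is only a minimal sketch: `text` is assumed to hold the file contents (here a heredoc with the first two parts), and the variable names `shape_pos` and `rows` are made up for illustration.

```ruby
# Sample copied from the question's "sample_1.txt" (first two parts only).
text = <<'EOS'
[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: IC1
  ref_text: 1270000 177800 270 2540000 2286000 1
  package: "DIP-8-3X"
  value: "JRC 4558" 1270000 177800 270 10668000 508000 0
  shape: "DIP-8-3"
  pos: 2540000 27940000 0 90 0
EOS

# \s+ also matches newlines, so "shape:" and the "pos:" line that
# follows it are matched as one unit even though they sit on two lines.
shape_pos = /^\s+shape: "(.*?)"\s+pos: (\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/

# scan returns one array of captured strings per match.
rows = text.scan(shape_pos)

rows.each do |shape, x, y, _third, angle, _fifth|
  puts "#{shape} x=#{x} y=#{y} angle=#{angle}"
end
```

Every captured value is a string, so the numbers still need `to_f` (or `to_i`) before they are used as coordinates or angles.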

Added after the question was changed

This:

text.scan(/^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/m)

will give you:

[["C1", "CAP-AX-10X18-7", "10160000", "10160000", "0", "0", "0"],
 ["IC1", "DIP-8-3", "2540000", "27940000", "0", "90", "0"],
 ["R1", "RES-CF-1/4W-4", "15240000", "20320000", "0", "270", "0"]]
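Each row now has the same fields as a line of "sample_2.txt", with one extra column, so it can be mapped onto the values the import script uses. The sketch below is one possible way to do that. The hash keys and variable names are assumptions chosen for illustration, and it stops short of calling the SketchUp API.

```ruby
# One part copied from the question's "sample_1.txt".
text = <<'EOS'
part: R1
  ref_text: 1270000 127000 0 3380000 -600000 1
  package: "RES-CF-1/4W-4X"
  value: "470" 1270000 127000 0 2180000 -2900000 0
  shape: "RES-CF-1/4W-4"
  pos: 15240000 20320000 0 270 0
EOS

# The /m flag lets "." match newlines, so ".*?" can skip the lines
# between "part:" and "shape:".
part_pattern = /^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/m

parts = text.scan(part_pattern).map do |ref, shape, x, y, _third, angle, _fifth|
  {
    name:  ref.strip,                     # data[0] in the original script
    file:  shape + ".skp",                # data[1] with the extension appended
    x:     x.to_f,                        # data[2]
    y:     y.to_f,                        # data[3]
    angle: angle.to_f * Math::PI / 180    # data[5], degrees to radians
  }
end

parts.each { |part| puts part.inspect }
```

One caveat with the multi-line pattern: if a part had no shape line, the lazy `.*?` would run on into the next part and pair the wrong reference with a shape, so it relies on every part having both lines.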

Added after the second change to the question

This:

text.scan(/^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/m)

will let you capture numbers even if they are negative.
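Since the question also asks to ignore everything outside the [parts] section, one option is to cut that section out first and scan only that. The following is a rough sketch under some assumptions: every section is taken to start with a bracketed heading on its own line, the `[some_other_section]` headings and their contents are invented, the helper name `parts_section` is hypothetical, and the x position of C1 has been made negative purely to exercise `-?\d+`.

```ruby
# Made-up file layout: only the C1 block comes from the question,
# and its x position was made negative for illustration.
text = <<'EOS'
[some_other_section]

shape: "NOT-A-PART"
  pos: 1 2 3 4 5

[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: -10160000 10160000 0 0 0

[another_section]

part: NOT-A-PART
EOS

# Returns the text between the "[parts]" heading and the next line that
# starts with "[", or the end of the file. Returns "" if there is no
# "[parts]" heading.
def parts_section(text)
  text[/^\[parts\][^\n]*\n(.*?)(?=^\[|\z)/m, 1] || ""
end

part_pattern = /^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/m

rows = parts_section(text).scan(part_pattern)
puts rows.inspect
```

Scanning only the extracted section keeps stray `part:` or `shape:` lines elsewhere in the file from producing false matches, at the cost of depending on the assumed heading format.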
