Looking for a terminal command to parse MacOSX dictionary data file

Problem

MacOSX comes with dictionaries stored in /Library/Dictionaries. I would like to parse them to obtain dictionary results programmatically (via Terminal, AppleScript, or Automator). The dictionaries are MacOSX packages, and each has a Contents folder that contains a file called Body.data. I would like to search that file for a UTF-8 string (possibly multi-byte Chinese characters) and return the lines where the string is found.

I've tried the following, which is not returning any results:

find . -name 'Body.data' -exec grep -li '我' {} \;

When I search through the dictionary using the app interface I can find the appropriate text. My objective is to create a workflow service to translate selected Chinese text into the pinyin equivalents which are stored in the system/user dictionaries.

Update

The following worked for me based on the accepted answer:

Created and Archived a command line utility called rdef using Xcode with this code:

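#import <Foundation/Foundation.h>
#import <CoreServices/CoreServices.h> // declares DCSCopyTextDefinition

int main(int argc, const char * argv[])
{
    @autoreleasepool {

        if (argc < 2)
        {
            fprintf(stderr, "Usage: rdef <word to define>\n");
            return 1;
        }

        // The argument is assumed to be UTF-8, as in a default Terminal session.
        NSString *search = [NSString stringWithUTF8String:argv[1]];
        if (search == nil)
        {
            fprintf(stderr, "rdef: argument is not valid UTF-8\n");
            return 1;
        }

        // Passing NULL searches all active dictionaries.
        // CFBridgingRelease hands the copied string to ARC so it is not leaked.
        NSString *def = CFBridgingRelease(
            DCSCopyTextDefinition(NULL,
                                  (__bridge CFStringRef)search,
                                  CFRangeMake(0, [search length])));
        if (def == nil)
        {
            fprintf(stderr, "rdef: no definition found for %s\n", argv[1]);
            return 1;
        }

        printf("Definition of <%s>: %s\n", argv[1], [def UTF8String]);
    }
    return 0;
}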

Added the frameworks the code needs to the project (DCSCopyTextDefinition is provided by CoreServices.framework).

Performed a Build and then deployed manually using the steps below.

To deploy:

Right-clicked the Archived package and chose Show in Finder. Then chose Show Package Contents, drilled down into the Products folder, and copied the executable to /usr/local/bin. Now from a command prompt I can run the utility like so:

rdef 我|awk -F '\|' '{ gsub(/^ +| +$/, "", $2); print $2 }'
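
The manual copy step can also be done from a shell. The following is a minimal sketch, not part of the original workflow: `install_tool` is a hypothetical helper name, and the source path in the usage comment is a placeholder for wherever Xcode put the built executable.

```shell
#!/bin/sh
# install_tool SRC DESTDIR (hypothetical helper):
# copy an executable into DESTDIR and mark it runnable.
install_tool() {
    src=$1
    destdir=$2
    # Refuse to continue if the source file does not exist.
    [ -f "$src" ] || { echo "install_tool: $src not found" >&2; return 1; }
    mkdir -p "$destdir" || return 1
    cp "$src" "$destdir/" || return 1
    chmod 755 "$destdir/$(basename "$src")"
}

# Example (placeholder source path; writing to /usr/local/bin usually needs sudo):
#   install_tool /path/to/rdef /usr/local/bin
```

After that, `command -v rdef` should print the installed path, provided /usr/local/bin is on your PATH.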

Please see the accepted answer below for extended references.

NB: The github for the utility can be found at https://github.com/mingsai/rdef.git

Next I will just create a Service to call the utility from Automator against selected text.

Service Solution

To pay it forward for the folks who've helped, especially @mklement0: here is the Solution for taking the command utility and converting it to a MacOSX service that can be used to translate Chinese characters to pinyin.

Create a new Automator Service file and make sure to select "Output replaces selected text".

Automator Script details

PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin
export PATH
LC_CTYPE=UTF-8
export LC_CTYPE
x=$1

for ((i=0;i<${#x};i++)); do rdef "${x:i:1}" | awk -F '\|' 'BEGIN {ORS=" "}{ gsub(/^ +| +$/, "", $2); if (length($2) > 0) print $2 ; exit}'; done
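
The joining step of that loop can be tried on its own, without rdef or Automator. Below is a minimal sketch under stated assumptions: `join_pinyin` is a hypothetical name, the input is one rdef-style line per character in the `Definition of <X>: | pinyin | meaning` shape shown in this thread, and lines without a pinyin field are skipped.

```shell
#!/bin/sh
# join_pinyin (hypothetical helper): read rdef-style lines on stdin and
# print the pinyin field of each line, separated by single spaces.
join_pinyin() {
    awk -F ' *[|] *' '
        # Only lines that actually carry a second, non-empty field contribute.
        NF >= 2 && $2 != "" { printf "%s%s", sep, $2; sep = " " }
        # Finish with a newline if anything was printed.
        END { if (sep != "") print "" }
    '
}

# Example with illustrative input lines:
#   printf '%s\n' 'Definition of <我>: | wǒ | I me my' \
#                 'Definition of <好>: | hǎo | good' | join_pinyin
```

Unlike `ORS=" "`, this variant does not leave a trailing space after the last syllable, which matters when the output replaces the selected text.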

To make the Service "live", just delete the "Ask for Text" action and save the service with a name of your choice (e.g. Convert to Pinyin).

To use the service, highlight any Chinese characters, right-click to open the context menu, and select "Convert to Pinyin" under the Services menu at the bottom.

Hope that helps anyone with this problem.

Solution

grep operates on text files, but the Body.data files are not text files, unfortunately.
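
One quick way to check that claim on your own machine is to test a file for NUL bytes, which plain text does not normally contain. This is a minimal sketch: `has_nul_bytes` is a hypothetical helper name, and it only shows that a file is binary, not how the data inside it is encoded.

```shell
#!/bin/sh
# has_nul_bytes FILE (hypothetical helper): succeeds when FILE contains at
# least one NUL byte, the usual sign of binary rather than plain-text data.
has_nul_bytes() {
    # Strip NUL bytes and compare with the original file;
    # any difference means NUL bytes were present.
    ! tr -d '\000' < "$1" | cmp -s - "$1"
}

# Example, using the location mentioned in the question:
#   find /Library/Dictionaries -name 'Body.data' | while IFS= read -r f; do
#       has_nul_bytes "$f" && echo "binary: $f"
#   done
```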

Your best bet is probably to create your own command-line utility in Xcode, as suggested here (sample code): https://discussions.apple.com/thread/2679911

Here's Apple's dictionary API documentation: https://developer.apple.com/library/mac/documentation/UserExperience/Conceptual/DictionaryServicesProgGuide/access/access.html#//apple_ref/doc/uid/TP40006152-CH5-SW1

Update:

Assuming you've created a utility named rdef that returns something like 'Definition of <我>: | wǒ | I me my', use the following awk command to parse out the pinyin:

rdef "我" | awk -F ' *[|] *' '{ print $2 }'
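
If the word is not in any dictionary, the output will not contain `|` separators, so a slightly more defensive wrapper can help in scripts. A minimal sketch, assuming the output shape shown above; `extract_pinyin` is a hypothetical name.

```shell
#!/bin/sh
# extract_pinyin (hypothetical helper): read rdef-style output on stdin and
# print the pinyin field of the first line that has one; print nothing otherwise.
extract_pinyin() {
    awk -F ' *[|] *' 'NF >= 2 && $2 != "" { print $2; exit }'
}

# Example:
#   rdef "我" | extract_pinyin
```

Because the field separator swallows the spaces around each `|`, no separate trimming step is needed.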


Alternatively, if an online-based solution is an option, you could try a Google Translate-based solution.

At least in interactive use you get a pinyin transcription below the input field.

For instance, your example symbol is transcribed as "Wǒ":

http://translate.google.com/?text=%E6%88%91#zh-CN/en/%E6%88%91
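
The `%E6%88%91` part of that URL is simply the UTF-8 bytes of 我, percent-encoded. A minimal sketch of building such a URL for other text follows; `percent_encode` and `translate_url` are hypothetical names, the URL shape is copied from the example above, and whether the site still accepts it is not something this sketch verifies.

```shell
#!/bin/sh
# percent_encode TEXT (hypothetical helper): print every byte of TEXT as %XX.
# Encoding ASCII bytes too is more than strictly needed, but still valid.
percent_encode() {
    printf '%s' "$1" |
        od -An -v -tx1 |         # dump bytes as two-digit hex, no offsets
        tr -d ' \t\n' |          # join the hex digits into one string
        sed 's/../%&/g' |        # prefix each byte with %
        tr 'a-f' 'A-F'           # upper-case the hex digits
}

# translate_url TEXT (hypothetical helper): build a URL in the form quoted above.
translate_url() {
    enc=$(percent_encode "$1")
    printf 'http://translate.google.com/?text=%s#zh-CN/en/%s\n' "$enc" "$enc"
}

# Example:
#   translate_url '我'
```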
