Localizing an HTML document (hindsight)


Problem description


I am building a web application in PHP, and I have decided (far along in the process) to make it available in different languages.

My question is this:

I do not want to wade through all the HTML code in the template files to look for the "words" that I need to replace with dynamically generated lang variables.

Is there a tool that can highlight the "words" used in the HTML to make my task easier, so that when I scroll down the HTML doc I can easily see where the language "words" are?

Normally when I create an app, I add comments as I code, like below:

 <label><!--lang-->Full Name</label>
 <input type="submit" value="<!--lang-->Save Changes" name="submit">

That way, when I am done, I can run through and easily identify the bits I need to add dynamic variables to. Unfortunately I am almost through with the app (lots of HTML template files) and I had not done so.

I use a template engine (tinybutstrong) so my HTML is pretty clean (i.e. with no PHP in it)

Solution

You can do this relatively easily: use DOMDocument to parse the markup and DOMXPath to query for all the comment nodes, then access each comment's parent node, read its textContent, and list those values as "strings to translate":

$dom = new DOMDocument;
$dom->loadHTMLFile($file);//load() expects well-formed XML; use loadHTML in case you're working with HTML strings
$xpath = new DOMXPath($dom);//get XPath
$comments = $xpath->query('//comment()');//get all comment nodes
//this array will contain all to-translate texts
$toTranslate = array();
foreach ($comments as $comment)
{
    if (trim($comment->nodeValue) == 'lang')
    {//trim, avoid spaces, use stristr !== false if you need case-insensitive matching
        $parent = $comment->parentNode;//get parent node
        $toTranslate[] = $parent->textContent;//get parent node's text content
    }
}
var_dump($toTranslate);

Note that this can't handle comments used in tag attributes. Using this simple script, you will be able to extract those strings that need to be translated in the "regular" markup. After that, you can write a script that looks for <!--lang--> in tag attributes... I'll have a look if there isn't a way to do this using XPath, too. For now, this should help you to get started, though.
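
One possible way to cover the attribute case with XPath as well is to select attribute nodes whose value contains the marker text. The sketch below is only an illustration: it assumes the template is parsed with loadHTML, which keeps the marker as literal text inside the attribute value, and the sample markup and variable names ($html, $marker, $attributeStrings) are made up for the example.

```php
<?php
// Hypothetical sample markup: the marker sits inside an attribute value.
$html = '<form><input type="submit" value="<!--lang-->Save Changes" name="submit"></form>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($dom);
$marker = '<!--lang-->';
$attributeStrings = array();

// @* selects every attribute node; contains() keeps those holding the marker
foreach ($xpath->query('//@*[contains(., "<!--lang-->")]') as $attr) {
    // strip the marker so only the translatable text remains
    $attributeStrings[] = trim(str_replace($marker, '', $attr->value));
}
var_dump($attributeStrings);
```

Since $attr is a DOMAttr, $attr->ownerElement gives the tag it belongs to if you also want to record where each string was found.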

If you have no comments other than <!--lang--> in your markup, then you could simply use an xpath expression that selects the parents of those comment nodes directly:

$commentsAndInput = $xpath->query('(//input|//option)[@value]|//comment()/..');
foreach ($commentsAndInput as $node)
{
    if ($node->tagName !== 'input' && $node->tagName !== 'option')
    {//get the textContent of the node
        $toTranslate[] = $node->textContent;
    }
    else
    {//get value attribute's value:
        $toTranslate[] = $node->getAttribute('value');
    }
}

The xpath expression explained:

  • //: tells xpath to search for nodes that match the rest of the criteria anywhere in the DOM
  • input: literal tag name: //input looks for input tags anywhere in the DOM tree
  • [@value]: the mentioned tag only matches if it has a @value attribute
  • |: OR. //a|//input[@type="button"] matches links OR buttons
  • //option[@value]: same as above: options with value attributes are matched
  • (//input|//option): groups both expressions, the [@value] applies to all matches in this selection
  • //comment(): selects comments anywhere in the dom
  • /..: selects the parent of the current node, so //comment()/.. matches the parent, containing the selected comment node.
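
To see which elements the expression picks up, it can be run against a small fragment. This is a minimal sketch, assuming a made-up form as input; the markup and the $matched array are illustrative only and not part of the original templates.

```php
<?php
// Hypothetical template fragment used only to demonstrate the expression
$html = '<form>'
      . '<label><!--lang-->Full Name</label>'        // parent of a comment: matched
      . '<input type="text" name="fullname">'        // no value attribute: not matched
      . '<input type="submit" value="Save Changes">' // input with value: matched
      . '<select name="l"><option value="en">English</option></select>'
      . '</form>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath   = new DOMXPath($dom);
$matched = array();
foreach ($xpath->query('(//input|//option)[@value]|//comment()/..') as $node) {
    $matched[] = $node->tagName; // record which element was selected
}
var_dump($matched);
```

The text input is skipped because it has no value attribute, while the label is selected only because it contains a comment. Note that for the option the loop above reads the value attribute ("en"), not the visible text ("English"), so the expression may need adjusting depending on which of the two you want to translate.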

Keep working at the XPath expression until it selects all of the content you need to translate.
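
For templates that were written without any <!--lang--> markers, one direction to take the expression is to select every non-empty text node directly, plus the value of button-like inputs. The following is a rough sketch under those assumptions, not a complete solution: the sample markup, the exclusion of script and style content, and the choice of input types are all illustrative guesses that would need tuning for real templates.

```php
<?php
// Hypothetical markup without any marker comments
$html = '<div><h1>Profile</h1><script>var a = 1;</script>'
      . '<label>Full Name</label><input type="submit" value="Save Changes"></div>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// non-blank text nodes outside script/style, plus button captions
$query = '//text()[normalize-space() and not(ancestor::script or ancestor::style)]'
       . '|//input[@type="submit" or @type="button"]/@value';

$candidates = array();
foreach ($xpath->query($query) as $node) {
    // nodeValue works for both text nodes and attribute nodes
    $candidates[] = trim($node->nodeValue);
}
var_dump($candidates);
```

The resulting list is only a set of candidates; template placeholders and other non-translatable text will show up in it too and have to be filtered out by hand or with further predicates.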

Proof of concept
