HTML Reader from PHPWord doesn't work with tables?


Problem description



When I use the HTML reader to convert my HTML into DOCX, the reader cuts off my table.

PHP example:

$reader = IOFactory::createReader('HTML');
$phpWord = $reader->load($this->getReportDir() . '/' . $fileName);
$writer = IOFactory::createWriter($phpWord);
$writer->save($this->getReportDir() . '/' . $fileName);

Table example:

<table>
    <tr>
        <td>№ п/п</td>
        <td>Общие показатели результатов прохождения проверочных листов</td>
        <td>Количество пройденных проверок</td>
        <td>% от общего количества пройденных проверок</td>
    </tr>
</table>

Solution

The current HTML class from PHPWord is very limited. The issue you are getting is a known issue (see https://github.com/PHPOffice/PHPWord/issues/324).

I'm working on a project that needs to convert some HTML tables to DOC, so I improved the HTML class a little. It is only lightly tested, and I have tested DOC conversion only.
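
The central idea of the modified class is a stack of parent styles that child elements inherit from. The following is a minimal standalone sketch of that idea, not the class itself: `inheritedStyles()` is a hypothetical helper, and the list of relevant style names mirrors the `$textStyles` list used in `getInheritedStyles()`.

```php
<?php
// Hypothetical standalone sketch; none of this is part of PHPWord.
// Each stack entry holds the styles declared on one ancestor element.
$stylesStack = [];

// Style names that matter for a text node (assumed to mirror $textStyles).
$textStyles = ['color', 'bold', 'italic'];

/**
 * Merge all stack entries from outermost to innermost, keeping only the
 * style names that apply to the requested kind of node.
 */
function inheritedStyles(array $stack, array $relevant): array
{
    $result = [];
    foreach ($stack as $styles) {
        foreach ($styles as $name => $value) {
            if (in_array($name, $relevant, true)) {
                $result[$name] = $value; // inner declarations override outer ones
            }
        }
    }
    return $result;
}

// Entering <tr style="color: #FFFFFF; font-weight: bold; background-color: #FF0000;">
$stylesStack[] = ['color' => 'FFFFFF', 'bold' => true, 'bgColor' => 'FF0000'];
// Entering <th> without a style attribute: an empty entry keeps pushes and pops balanced
$stylesStack[] = [];

// Styles a text node inside that <th> would receive
$forText = inheritedStyles($stylesStack, $textStyles);
print_r($forText);

array_pop($stylesStack); // leaving <th>
array_pop($stylesStack); // leaving <tr>
```

Because every visited node pushes exactly one entry, even an empty one, a single pop after its children are processed is enough to restore the parent's state.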

My version is able to convert the following HTML:

<table style="width: 50%; border: 6px #0000FF solid;">
    <thead>
        <tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">
             <th>a</th>
             <th>b</th>
             <th>c</th>
        </tr>
    </thead>
    <tbody>
        <tr><td>1</td><td colspan="2">2</td></tr>
        <tr><td>4</td><td>5</td><td>6</td></tr>
    </tbody>
</table>
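
To make the mapping concrete, here is a simplified sketch of how the inline `style` string on the `<table>` element above could be turned into PHPWord style keys. `sketchParseStyle()` is a hypothetical helper written for illustration only; it loosely follows what `parseStyle()`, `parseWidth()` and `parseBorder()` do in the class, but handles just three properties and omits the style stack.

```php
<?php
// Hypothetical helper for illustration only; it is not a PHPWord function.
function sketchParseStyle(string $stylesStr): array
{
    $styles = [];
    foreach (explode(';', $stylesStr) as $property) {
        // Skip empty fragments and declarations without a colon.
        if (strpos($property, ':') === false) {
            continue;
        }
        list($key, $value) = array_map('trim', explode(':', $property, 2));
        switch ($key) {
            case 'width':
                if (preg_match('/^([0-9]+)%$/', $value, $m)) {
                    // Percent widths are stored in fiftieths of a percent.
                    $styles['width'] = (int) $m[1] * 50;
                    $styles['unit'] = 'pct';
                }
                break;
            case 'border':
                // Only the "<n>px #<hex> solid" order is recognised here.
                if (preg_match('/^([0-9]+)px\s+(#[a-fA-F0-9]+)\s+solid$/', $value, $m)) {
                    $styles['borderSize'] = (int) $m[1];
                    $styles['borderColor'] = $m[2];
                }
                break;
            case 'background-color':
                $styles['bgColor'] = trim($value, '#');
                break;
        }
    }
    return $styles;
}

$tableStyles = sketchParseStyle('width: 50%; border: 6px #0000FF solid;');
var_export($tableStyles);
```

Anything the switch does not recognise is silently ignored, which is also how the class treats unsupported CSS properties.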

This generates the following DOC table:

It uses PHPWord version 0.13:

<?php
/**
 * This file is part of PHPWord - A pure PHP library for reading and writing
 * word processing documents.
 *
 * PHPWord is free software distributed under the terms of the GNU Lesser
 * General Public License version 3 as published by the Free Software Foundation.
 *
 * For the full copyright and license information, please read the LICENSE
 * file that was distributed with this source code. For the full list of
 * contributors, visit https://github.com/PHPOffice/PHPWord/contributors.
 *
 * @link        https://github.com/PHPOffice/PHPWord
 * @copyright   2010-2016 PHPWord contributors
 * @license     http://www.gnu.org/licenses/lgpl.txt LGPL version 3
 */

namespace PhpOffice\PhpWord\Shared;

use PhpOffice\PhpWord\Element\AbstractContainer;
use PhpOffice\PhpWord\Element\Table;
use PhpOffice\PhpWord\Element\Row;

/**
 * Common Html functions
 *
 * @SuppressWarnings(PHPMD.UnusedPrivateMethod) For readWPNode
 */
class Html
{
    //public static $phpWord=null;

    /**
    *  Holds styles from parent elements,
    *  allowing child elements to inherit attributes.
    *  So if you want your table row to have a bold font
    *  you can do:
    *     <tr style="font-weight: bold; ">
    *  instead of
    *     <tr>
    *       <td>
    *           <p style="font-weight: bold;">
    *       ...
    *
    *  Before the children of a DOM element are processed,
    *  the parent DOM element's styles are added to the stack.
    *  The styles for each child element are composed of
    *  its own styles plus the parent styles.
    */
    public static $stylesStack=null;

    /**
     * Add HTML parts.
     *
     * Note: $stylesheet parameter is removed to avoid PHPMD error for unused parameter
     *
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element Where the parts need to be added
     * @param string $html The code to parse
     * @param bool $fullHTML If it's a full HTML, no need to add 'body' tag
     * @return void
     */
    public static function addHtml($element, $html, $fullHTML = false)
    {
        /*
         * @todo parse $stylesheet for default styles.  Should result in an array based on id, class and element,
         * which could be applied when such an element occurs in the parseNode function.
         */

        // Preprocess: remove all line ends, decode HTML entity,
        // fix ampersand and angle brackets and add body tag for HTML fragments
        $html = str_replace(array("\n", "\r"), '', $html);
        $html = str_replace(array('&lt;', '&gt;', '&amp;'), array('_lt_', '_gt_', '_amp_'), $html);
        $html = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
        $html = str_replace('&', '&amp;', $html);
        $html = str_replace(array('_lt_', '_gt_', '_amp_'), array('&lt;', '&gt;', '&amp;'), $html);

        if (false === $fullHTML) {
            $html = '<body>' . $html . '</body>';
        }

        // Load DOM
        $dom = new \DOMDocument();
        $dom->preserveWhiteSpace = true;
        $dom->loadXML($html);
        $node = $dom->getElementsByTagName('body');

        //self::$phpWord = $element->getPhpWord();
        self::$stylesStack = array();

        self::parseNode($node->item(0), $element);
    }

    /**
     * Parse the inline style of a node
     *
     * @param \DOMNode $node Node to check for attributes and to compile a style array from
     * @param array $styles If supplied, the inline style attributes are added to the already existing styles
     * @return array
     */
    protected static function parseInlineStyle($node, $styles = array())
    {
        if (XML_ELEMENT_NODE == $node->nodeType) {
            $stylesStr = $node->getAttribute('style');
            $styles = self::parseStyle($node, $stylesStr, $styles);
        }
        else
        {
            // Just to balance the stack
            // (number of pushes = number of pops).
            self::pushStyles(array());
        } 

        return $styles;
    }

    /**
     * Parse a node and add a corresponding element to the parent element.
     *
     * @param \DOMNode $node node to parse
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element object to add an element corresponding with the node
     * @param array $styles Array with all styles
     * @param array $data Array to transport data to a next level in the DOM tree, for example level of listitems
     * @return void
     */
    protected static function parseNode($node, $element, $styles = array(), $data = array())
    {
        // Populate styles array
        $styleTypes = array('font', 'paragraph', 'list', 'table', 'row', 'cell'); //@change
        foreach ($styleTypes as $styleType) {
            if (!isset($styles[$styleType])) {
                $styles[$styleType] = array();
            }
        }

        // Node mapping table
        $nodes = array(
                              // $method        $node   $element    $styles     $data   $argument1      $argument2
            'p'         => array('Paragraph',   $node,  $element,   $styles,    null,   null,           null),
            'h1'        => array('Heading',     null,   $element,   $styles,    null,   'Heading1',     null),
            'h2'        => array('Heading',     null,   $element,   $styles,    null,   'Heading2',     null),
            'h3'        => array('Heading',     null,   $element,   $styles,    null,   'Heading3',     null),
            'h4'        => array('Heading',     null,   $element,   $styles,    null,   'Heading4',     null),
            'h5'        => array('Heading',     null,   $element,   $styles,    null,   'Heading5',     null),
            'h6'        => array('Heading',     null,   $element,   $styles,    null,   'Heading6',     null),
            '#text'     => array('Text',        $node,  $element,   $styles,    null,   null,           null),
            'strong'    => array('Property',    null,   null,       $styles,    null,   'bold',         true),
            'em'        => array('Property',    null,   null,       $styles,    null,   'italic',       true),
            'sup'       => array('Property',    null,   null,       $styles,    null,   'superScript',  true),
            'sub'       => array('Property',    null,   null,       $styles,    null,   'subScript',    true),
            // @change
            //'table'     => array('Table',       $node,  $element,   $styles,    null,   'addTable',     true),
            //'tr'        => array('Table',       $node,  $element,   $styles,    null,   'addRow',       true),
            //'td'        => array('Table',       $node,  $element,   $styles,    null,   'addCell',      true),
            'table'     => array('Table' ,       $node,  $element,   $styles,    null,   null,     true),
            'tr'        => array('Row'   ,       $node,  $element,   $styles,    null,   null,       true),
            'td'        => array('Cell'  ,       $node,  $element,   $styles,    null,   null,      true),
            'th'        => array('Cell'  ,       $node,  $element,   $styles,    null,   null,      true),
            'ul'        => array('List',        null,   null,       $styles,    $data,  3,              null),
            'ol'        => array('List',        null,   null,       $styles,    $data,  7,              null),
            'li'        => array('ListItem',    $node,  $element,   $styles,    $data,  null,           null),
        );

        $newElement = null;
        $keys = array('node', 'element', 'styles', 'data', 'argument1', 'argument2');

        if (isset($nodes[$node->nodeName])) {
            // Execute method based on node mapping table and return $newElement or null
            // Arguments are passed by reference
            $arguments = array();
            $args = array();
            list($method, $args[0], $args[1], $args[2], $args[3], $args[4], $args[5]) = $nodes[$node->nodeName];
            for ($i = 0; $i <= 5; $i++) {
                if ($args[$i] !== null) {
                    $arguments[$keys[$i]] = &$args[$i];
                }
            }
            $method = "parse{$method}";
            $newElement = call_user_func_array(array('PhpOffice\PhpWord\Shared\Html', $method), $arguments);

            // Retrieve back variables from arguments
            foreach ($keys as $key) {
                if (array_key_exists($key, $arguments)) {
                    $$key = $arguments[$key];
                }
            }
        }
        else
        {
            // Just to balance the stack
            // (number of pushes = number of pops).
            self::pushStyles(array());
        }

        if ($newElement === null) {
            $newElement = $element;
        }

        self::parseChildNodes($node, $newElement, $styles, $data);

        // After the parent element has been processed,
        // its styles are removed from the stack.
        self::popStyles();
    }

    /**
     * Parse child nodes.
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array $styles
     * @param array $data
     * @return void
     */
    private static function parseChildNodes($node, $element, $styles, $data)
    {
        if ('li' != $node->nodeName) {
            $cNodes = $node->childNodes;
            if (count($cNodes) > 0) {
                foreach ($cNodes as $cNode) {
                    if (($element instanceof AbstractContainer) or ($element instanceof Table) or ($element instanceof Row)) { // @change
                        self::parseNode($cNode, $element, $styles, $data);
                    }
                }
            }
        }
    }

    /**
     * Parse paragraph node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @return \PhpOffice\PhpWord\Element\TextRun
     */
    private static function parseParagraph($node, $element, &$styles)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['paragraph']);

        $newElement = $element->addTextRun($elementStyles);

        return $newElement;
    }

    /**
     * Parse heading node
     *
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param string $argument1 Name of heading style
     * @return \PhpOffice\PhpWord\Element\TextRun
     *
     * @todo Think of a clever way of defining header styles, now it is only based on the assumption, that
     * Heading1 - Heading6 are already defined somewhere
     */
    private static function parseHeading($element, &$styles, $argument1)
    {
        $elementStyles = $argument1;

        $newElement = $element->addTextRun($elementStyles);

        return $newElement;
    }

    /**
     * Parse text node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @return null
     */
    private static function parseText($node, $element, &$styles)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['font']);

        $textStyles = self::getInheritedTextStyles();
        $paragraphStyles = self::getInheritedParagraphStyles();

        // Commented out as the source of bug #257. `method_exists` doesn't seem to work properly in this case.
        // @todo Find better error checking for this one
        // if (method_exists($element, 'addText')) {
            $element->addText($node->nodeValue, $textStyles, $paragraphStyles);
        // }

        return null;
    }

    /**
     * Parse property node
     *
     * @param array &$styles
     * @param string $argument1 Style name
     * @param string $argument2 Style value
     * @return null
     */
    private static function parseProperty(&$styles, $argument1, $argument2)
    {
        $styles['font'][$argument1] = $argument2;

        return null;
    }

    /**
     * Parse table node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param string $argument1 Method name
     * @return \PhpOffice\PhpWord\Element\AbstractContainer $element
     *
     * @todo As soon as TableItem, RowItem and CellItem support relative width and height
     */
    private static function parseTable($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['table']);

        $newElement = $element->addTable($elementStyles);

        // $attributes = $node->attributes;
        // if ($attributes->getNamedItem('width') !== null) {
            // $newElement->setWidth($attributes->getNamedItem('width')->value);
        // }

        // if ($attributes->getNamedItem('height') !== null) {
            // $newElement->setHeight($attributes->getNamedItem('height')->value);
        // }
        // if ($attributes->getNamedItem('width') !== null) {
            // $newElement=$element->addCell($width=$attributes->getNamedItem('width')->value);
        // }

        return $newElement;
    }

    private static function parseRow($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['row']);

        $newElement = $element->addRow(null, $elementStyles);

        return $newElement;
    }


    private static function parseCell($node, $element, &$styles, $argument1)
    {        
        $elementStyles = self::parseInlineStyle($node, $styles['cell']);

        $colspan = $node->getAttribute('colspan');
        if (!empty($colspan)) {
            // Cast explicitly instead of relying on "$colspan - 0"
            $elementStyles['gridSpan'] = (int) $colspan;
        }

        $newElement = $element->addCell(null, $elementStyles);
        return $newElement;
    }

    /**
     * Parse list node
     *
     * @param array &$styles
     * @param array &$data
     * @param string $argument1 List type
     * @return null
     */
    private static function parseList(&$styles, &$data, $argument1)
    {
        if (isset($data['listdepth'])) {
            $data['listdepth']++;
        } else {
            $data['listdepth'] = 0;
        }
        $styles['list']['listType'] = $argument1;

        return null;
    }

    /**
     * Parse list item node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param array $data
     * @return null
     *
     * @todo This function is almost the same as `parseChildNodes`. Merge them?
     * @todo As soon as ListItem inherits from AbstractContainer or TextRun delete parsing part of childNodes
     */
    private static function parseListItem($node, $element, &$styles, $data)
    {
        $cNodes = $node->childNodes;
        if ($cNodes->length > 0) {
            $text = '';
            foreach ($cNodes as $cNode) {
                if ($cNode->nodeName == '#text') {
                    $text = $cNode->nodeValue;
                }
            }
            $element->addListItem($text, $data['listdepth'], $styles['font'], $styles['list'], $styles['paragraph']);
        }

        return null;
    }

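    /**
     * Parse the inline style string of a node, push the result onto the
     * styles stack and merge the inherited styles into the defaults.
     *
     * @param \DOMNode $node
     * @param string $stylesStr Raw content of the "style" attribute
     * @param array $styles Default styles for this element type
     * @return array
     */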
    private static function parseStyle($node, $stylesStr, $styles)
    {
        // Parses element styles.
        $newStyles = array();

        if (!empty($stylesStr))
        {
            $properties = explode(';', trim($stylesStr, " \t\n\r\0\x0B;"));
            foreach ($properties as $property) {
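                // Skip malformed declarations that lack a "property: value" pair.
                if (strpos($property, ':') === false) {
                    continue;
                }
                list($cKey, $cValue) = explode(':', $property, 2);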
                $cValue = trim($cValue);
                switch (trim($cKey)) {
                    case 'text-decoration':
                        switch ($cValue) {
                            case 'underline':
                                $newStyles['underline'] = 'single';
                                break;
                            case 'line-through':
                                $newStyles['strikethrough'] = true;
                                break;
                        }
                        break;                
                    case 'text-align':
                        $newStyles['alignment'] = $cValue; // todo: any mapping?
                        break;
                    case 'color':
                        $newStyles['color'] = trim($cValue, "#");
                        break;
                    case 'background-color':
                        $newStyles['bgColor'] = trim($cValue, "#");
                        break;

                    // @change: "colspan" is not CSS, but is accepted here for convenience.
                    case 'colspan':
                        $newStyles['gridSpan'] = (int) $cValue;
                        break;
                    case 'font-weight':
                        if ($cValue == 'bold') {
                            $newStyles['bold'] = true;
                        }
                        break;
                    case 'width':
                        $newStyles = self::parseWidth($newStyles, $cValue);
                        break;
                    case 'border-width':
                        $newStyles = self::parseBorderWidth($newStyles, $cValue);
                        break;
                    case 'border-color':
                        $newStyles = self::parseBorderColor($newStyles, $cValue);
                        break;
                    case 'border':
                        $newStyles = self::parseBorder($newStyles, $cValue);
                        break;                    
                }
            }
        }

        // Add styles to stack.
        self::pushStyles($newStyles);

        // Inherit parent styles (including itself).
        $inheritedStyles = self::getInheritedStyles($node->nodeName);

        // Override default styles with the inherited ones.
        $styles = array_merge($styles, $inheritedStyles);       

        /* DEBUG
        if ($node->nodeName=='th')
        {
            echo '<pre>';
            print_r(self::$stylesStack);
            print_r($styles);
            //print_r($elementStyles);
            echo '</pre>';
        }
        */

        return $styles;
    }

    /**
    *  Parses the "width" style attribute, adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseWidth($styles, $cValue)
    {
        if (preg_match('/([0-9]+)px/', $cValue, $matches)) {
            // "dxa" is measured in twips; at 96 dpi, 1px = 15 twips.
            $styles['width'] = $matches[1] * 15;
            $styles['unit'] = 'dxa';
        } elseif (preg_match('/([0-9]+)%/', $cValue, $matches)) {
            // "pct" is measured in fiftieths of a percent.
            $styles['width'] = $matches[1] * 50;
            $styles['unit'] = 'pct';
        } elseif (preg_match('/([0-9]+)/', $cValue, $matches)) {
            $styles['width'] = $matches[1];
            $styles['unit'] = 'auto';
        }

        $styles['alignment'] = \PhpOffice\PhpWord\SimpleType\JcTable::START;

        return $styles;
    }

    /**
    *  Parses the "border-width" style attribute, adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseBorderWidth($styles, $cValue)
    {
        // border-width: 2px;
        if (preg_match('/([0-9]+)px/', $cValue, $matches))
            $styles['borderSize'] = $matches[1];

        return $styles;
    }

    /**
    *  Parses the "border-color" style attribute, adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseBorderColor($styles, $cValue)
    {
        // border-color: #FFAACC; stored without the leading "#", like 'color'.
        $styles['borderColor'] = trim($cValue, '#');

        return $styles;
    }    

    /**
    *  Parses the "border" style attribute, adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseBorder($styles, $cValue)
    {
        // Only the "<width>px #<hex> solid" order is recognized, e.g. "6px #0000FF solid".
        if (preg_match('/([0-9]+)px\s+\#([a-fA-F0-9]+)\s+solid/', $cValue, $matches)) {
            $styles['borderSize'] = $matches[1];
            $styles['borderColor'] = $matches[2];
        }

        return $styles;
    }

    /**
    *  Return the inherited styles for text elements,
    *  considering current stack state.
    */
    public static function getInheritedTextStyles()
    {
        return self::getInheritedStyles('#text');
    }

    /**
    *  Return the inherited styles for paragraph elements,
    *  considering current stack state.
    */
    public static function getInheritedParagraphStyles()
    {
        return self::getInheritedStyles('p');
    }

    /**
    *  Return the inherited styles for a given nodeType,
    *  considering current stack state.
    */
    public static function getInheritedStyles($nodeType)
    {
        $textStyles = array('color', 'bold', 'italic');
        $paragraphStyles = array('color', 'bold', 'italic', 'alignment');

        // List of PHPWord styles relevant to each element type.
        $stylesMapping = array(
            'p'         => $paragraphStyles,
            'h1'        => $textStyles,
            'h2'        => $textStyles,
            'h3'        => $textStyles,
            'h4'        => $textStyles,
            'h5'        => $textStyles,
            'h6'        => $textStyles,
            '#text'     => $textStyles,
            'strong'    => $textStyles,
            'em'        => $textStyles,
            'sup'       => $textStyles,
            'sub'       => $textStyles,
            'table'     => array('width', 'borderSize', 'borderColor', 'unit'),
            'tr'        => array('bgColor', 'alignment'),
            'td'        => array('bgColor', 'alignment'),
            'th'        => array('bgColor', 'alignment'),
            'ul'        => $textStyles,
            'ol'        => $textStyles,
            'li'        => $textStyles,
        );

        $result = array();

        if (isset($stylesMapping[$nodeType]))
        {
            $nodeStyles = $stylesMapping[$nodeType];

            // Loop through the styles stack, applying styles in
            // the right order.
            foreach (self::$stylesStack as $styles)
            {
                // Loop through all styles, applying only those relevant to
                // that node type.
                foreach ($styles as $name => $value)
                {
                    if (in_array($name, $nodeStyles))
                    {
                        $result[$name] = $value;
                    }
                }
            }
        }

        return $result;
    }


    /**
    *  Add the parent styles to the stack, allowing
    *  child elements to inherit from them.
    */
    public static function pushStyles($styles)
    {
        self::$stylesStack[] = $styles;
    }

    /**
    *  Remove parent styles at end of recursion.
    */
    public static function popStyles()
    {
        array_pop(self::$stylesStack);
    }
}
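
The unit arithmetic behind a width conversion such as parseWidth() can be checked in isolation. The following is only a rough sketch: the helper name cssWidthToWord() is hypothetical and not part of PHPWord, and the pixel branch assumes 96 dpi (1px = 15 twips). The "pct" unit counts fiftieths of a percent, which is why percentages are multiplied by 50.

```php
<?php
/**
 * Hypothetical helper (not part of PHPWord): convert a CSS width into a
 * width/unit pair expressed in the units used by Word table styles.
 */
function cssWidthToWord($cssWidth)
{
    if (preg_match('/^\s*([0-9]+)\s*px\s*$/', $cssWidth, $m)) {
        // Assumption: 96 dpi, so 1px = 0.75pt = 15 twips ("dxa").
        return array('width' => (int) $m[1] * 15, 'unit' => 'dxa');
    }
    if (preg_match('/^\s*([0-9]+)\s*%\s*$/', $cssWidth, $m)) {
        // "pct" widths are expressed in fiftieths of a percent.
        return array('width' => (int) $m[1] * 50, 'unit' => 'pct');
    }
    // Anything else (em, pt, bare words): let Word size the table itself.
    return array('width' => 0, 'unit' => 'auto');
}

$half  = cssWidthToWord('50%');
$fixed = cssWidthToWord('100px');
$other = cssWidthToWord('10em');
```

With these assumptions, a 50% table becomes 2500 "pct" units and a 100px table becomes 1500 twips; a different screen resolution would need a different pixel factor.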

With this new structure, it is easy to add support for new styles: you just need to edit the parseStyle() method and the $stylesMapping variable in the getInheritedStyles() method. Hope it helps.
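
As an illustration of that extension point, here is a stripped-down, standalone sketch of the stack mechanism, not the patched class itself. The class name StyleStackSketch and its method names are hypothetical. It shows how supporting one more property (font-style: italic, chosen only as an example) might take a single extra case in the parser plus an entry in the mapping that decides which node types inherit it.

```php
<?php
// Hypothetical stand-in for the patched Html class; it does not use PHPWord.
final class StyleStackSketch
{
    /** @var array[] One entry per open ancestor element. */
    private static $stack = array();

    /** Which stacked properties each node type is allowed to inherit. */
    private static $mapping = array(
        '#text' => array('color', 'bold', 'italic'),
        'td'    => array('bgColor', 'alignment'),
    );

    /** Translate a CSS declaration list into PHPWord-style keys. */
    public static function parse($css)
    {
        $styles = array();
        foreach (explode(';', trim($css, " \t\n\r\0\x0B;")) as $declaration) {
            if (strpos($declaration, ':') === false) {
                continue; // ignore empty or malformed declarations
            }
            list($key, $value) = array_map('trim', explode(':', $declaration, 2));
            switch ($key) {
                case 'color':
                    $styles['color'] = ltrim($value, '#');
                    break;
                case 'background-color':
                    $styles['bgColor'] = ltrim($value, '#');
                    break;
                // The newly supported property: one extra case is enough here.
                case 'font-style':
                    if ($value === 'italic') {
                        $styles['italic'] = true;
                    }
                    break;
            }
        }
        return $styles;
    }

    public static function push(array $styles)
    {
        self::$stack[] = $styles;
    }

    public static function pop()
    {
        array_pop(self::$stack);
    }

    /** Merge the stacked styles that are relevant to the given node type. */
    public static function inherited($nodeType)
    {
        $result = array();
        if (!isset(self::$mapping[$nodeType])) {
            return $result;
        }
        foreach (self::$stack as $styles) {
            foreach ($styles as $name => $value) {
                if (in_array($name, self::$mapping[$nodeType], true)) {
                    $result[$name] = $value;
                }
            }
        }
        return $result;
    }
}

// Simulate <tr style="..."><td>text</td></tr>: push the row, then the cell.
StyleStackSketch::push(StyleStackSketch::parse(
    'background-color: #FF0000; color: #FFFFFF; font-style: italic'
));
StyleStackSketch::push(StyleStackSketch::parse(''));
$textStyles = StyleStackSketch::inherited('#text');
$cellStyles = StyleStackSketch::inherited('td');
StyleStackSketch::pop();
StyleStackSketch::pop();
```

The text node picks up the row's color and italic flag, while the cell receives only the background color, because each node type filters the stack through its own list in the mapping.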

Example of use:

<?php
include_once 'Sample_Header.php';

// New Word Document
echo date('H:i:s') , ' Create new PhpWord object' , EOL;
$phpWord = new \PhpOffice\PhpWord\PhpWord();

$section = $phpWord->addSection();
$html  = '<table style="width: 50%; border: 6px #0000FF solid;">'.
            '<thead>'.
                '<tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">'.
                    '<th>a</th>'.
                    '<th>b</th>'.
                    '<th>c</th>'.
                '</tr>'.
            '</thead>'.
            '<tbody>'.
                '<tr><td>1</td><td colspan="2">2</td></tr>'.
                '<tr><td>4</td><td>5</td><td>6</td></tr>'.
            '</tbody>'.
         '</table>';


\PhpOffice\PhpWord\Shared\Html::addHtml($section, $html);

// Save file
echo write($phpWord, basename(__FILE__, '.php'), $writers);
if (!CLI) {
    include_once 'Sample_Footer.php';
}
