HTML Reader from PHPWord doesn't work with tables?
Problem description
When I use the HTML reader to convert my HTML into docx, the reader cuts off my table.
PHP example:
$reader = IOFactory::createReader('HTML');
$phpWord = $reader->load($this->getReportDir() . '/' . $fileName);
$writer = IOFactory::createWriter($phpWord);
$writer->save($this->getReportDir() . '/' . $fileName);
Table example:
<table>
<tr>
<td>№ п/п</td>
<td>Общие показатели результатов прохождения проверочных листов</td>
<td>Количество пройденных проверок</td>
<td>% от общего количества пройденных проверок</td>
</tr>
</table>
Solution
The current HTML class in PHPWord is very limited. The problem you are hitting is a known issue (see https://github.com/PHPOffice/PHPWord/issues/324).
I'm working on a project that needs to convert some HTML tables to DOC, so I improved the HTML class a little. It has had very little testing, and I have only tested DOC conversion.
My version is able to convert the following HTML:
<table style="width: 50%; border: 6px #0000FF solid;">
<thead>
<tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">
<th>a</th>
<th>b</th>
<th>c</th>
</tr>
</thead>
<tbody>
<tr><td>1</td><td colspan="2">2</td></tr>
<tr><td>4</td><td>5</td><td>6</td></tr>
</tbody>
</table>
It generates the corresponding table in the DOC output.
The class is based on PHPWord version 0.13:
<?php
/**
* This file is part of PHPWord - A pure PHP library for reading and writing
* word processing documents.
*
* PHPWord is free software distributed under the terms of the GNU Lesser
* General Public License version 3 as published by the Free Software Foundation.
*
* For the full copyright and license information, please read the LICENSE
* file that was distributed with this source code. For the full list of
* contributors, visit https://github.com/PHPOffice/PHPWord/contributors.
*
* @link https://github.com/PHPOffice/PHPWord
* @copyright 2010-2016 PHPWord contributors
* @license http://www.gnu.org/licenses/lgpl.txt LGPL version 3
*/
namespace PhpOffice\PhpWord\Shared;
use PhpOffice\PhpWord\Element\AbstractContainer;
use PhpOffice\PhpWord\Element\Table;
use PhpOffice\PhpWord\Element\Row;
/**
* Common Html functions
*
* @SuppressWarnings(PHPMD.UnusedPrivateMethod) For readWPNode
*/
class Html
{
//public static $phpWord=null;
/**
* Holds styles from parent elements,
* allowing child elements to inherit attributes.
* So if you want your table row to have a bold font
* you can do:
* <tr style="font-weight: bold; ">
* instead of
* <tr>
* <td>
* <p style="font-weight: bold;">
* ...
*
* Before the children of a DOM element are processed,
* the parent DOM element styles are added to the stack.
* The styles for each child element are composed of
* its own styles plus the parent styles.
*/
public static $stylesStack=null;
/**
* Add HTML parts.
*
* Note: $stylesheet parameter is removed to avoid PHPMD error for unused parameter
*
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element Where the parts need to be added
* @param string $html The code to parse
* @param bool $fullHTML If it's a full HTML, no need to add 'body' tag
* @return void
*/
public static function addHtml($element, $html, $fullHTML = false)
{
/*
* @todo parse $stylesheet for default styles. Should result in an array based on id, class and element,
* which could be applied when such an element occurs in the parseNode function.
*/
// Preprocess: remove all line ends, decode HTML entity,
// fix ampersand and angle brackets and add body tag for HTML fragments
$html = str_replace(array("\n", "\r"), '', $html);
$html = str_replace(array('&lt;', '&gt;', '&amp;'), array('_lt_', '_gt_', '_amp_'), $html);
$html = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
$html = str_replace('&', '&amp;', $html);
$html = str_replace(array('_lt_', '_gt_', '_amp_'), array('&lt;', '&gt;', '&amp;'), $html);
if (false === $fullHTML) {
$html = '<body>' . $html . '</body>';
}
// Load DOM
$dom = new \DOMDocument();
$dom->preserveWhiteSpace = true;
$dom->loadXML($html);
$node = $dom->getElementsByTagName('body');
//self::$phpWord = $element->getPhpWord();
self::$stylesStack = array();
self::parseNode($node->item(0), $element);
}
/**
* Parse inline style of a node
*
* @param \DOMNode $node Node to check on attributes and to compile a style array
* @param array $styles If supplied, the inline style attributes are added to the already existing style
* @return array
*/
protected static function parseInlineStyle($node, $styles = array())
{
if (XML_ELEMENT_NODE == $node->nodeType) {
$stylesStr = $node->getAttribute('style');
$styles = self::parseStyle($node, $stylesStr, $styles);
}
else
{
// Just to balance the stack
// (make the number of pushes equal the number of pops).
self::pushStyles(array());
}
return $styles;
}
/**
* Parse a node and add a corresponding element to the parent element.
*
* @param \DOMNode $node node to parse
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element object to add an element corresponding with the node
* @param array $styles Array with all styles
* @param array $data Array to transport data to a next level in the DOM tree, for example level of listitems
* @return void
*/
protected static function parseNode($node, $element, $styles = array(), $data = array())
{
// Populate styles array
$styleTypes = array('font', 'paragraph', 'list', 'table', 'row', 'cell'); //@change
foreach ($styleTypes as $styleType) {
if (!isset($styles[$styleType])) {
$styles[$styleType] = array();
}
}
// Node mapping table
$nodes = array(
// $method $node $element $styles $data $argument1 $argument2
'p' => array('Paragraph', $node, $element, $styles, null, null, null),
'h1' => array('Heading', null, $element, $styles, null, 'Heading1', null),
'h2' => array('Heading', null, $element, $styles, null, 'Heading2', null),
'h3' => array('Heading', null, $element, $styles, null, 'Heading3', null),
'h4' => array('Heading', null, $element, $styles, null, 'Heading4', null),
'h5' => array('Heading', null, $element, $styles, null, 'Heading5', null),
'h6' => array('Heading', null, $element, $styles, null, 'Heading6', null),
'#text' => array('Text', $node, $element, $styles, null, null, null),
'strong' => array('Property', null, null, $styles, null, 'bold', true),
'em' => array('Property', null, null, $styles, null, 'italic', true),
'sup' => array('Property', null, null, $styles, null, 'superScript', true),
'sub' => array('Property', null, null, $styles, null, 'subScript', true),
// @change
//'table' => array('Table', $node, $element, $styles, null, 'addTable', true),
//'tr' => array('Table', $node, $element, $styles, null, 'addRow', true),
//'td' => array('Table', $node, $element, $styles, null, 'addCell', true),
'table' => array('Table' , $node, $element, $styles, null, null, true),
'tr' => array('Row' , $node, $element, $styles, null, null, true),
'td' => array('Cell' , $node, $element, $styles, null, null, true),
'th' => array('Cell' , $node, $element, $styles, null, null, true),
'ul' => array('List', null, null, $styles, $data, 3, null),
'ol' => array('List', null, null, $styles, $data, 7, null),
'li' => array('ListItem', $node, $element, $styles, $data, null, null),
);
$newElement = null;
$keys = array('node', 'element', 'styles', 'data', 'argument1', 'argument2');
if (isset($nodes[$node->nodeName])) {
// Execute method based on node mapping table and return $newElement or null
// Arguments are passed by reference
$arguments = array();
$args = array();
list($method, $args[0], $args[1], $args[2], $args[3], $args[4], $args[5]) = $nodes[$node->nodeName];
for ($i = 0; $i <= 5; $i++) {
if ($args[$i] !== null) {
$arguments[$keys[$i]] = &$args[$i];
}
}
$method = "parse{$method}";
$newElement = call_user_func_array(array('PhpOffice\PhpWord\Shared\Html', $method), $arguments);
// Retrieve back variables from arguments
foreach ($keys as $key) {
if (array_key_exists($key, $arguments)) {
$$key = $arguments[$key];
}
}
}
else
{
// Just to balance the stack:
// number of pushes = number of pops.
self::pushStyles(array());
}
if ($newElement === null) {
$newElement = $element;
}
self::parseChildNodes($node, $newElement, $styles, $data);
// After the parent element has been processed,
// its styles are removed from the stack.
self::popStyles();
}
/**
* Parse child nodes.
*
* @param \DOMNode $node
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element
* @param array $styles
* @param array $data
* @return void
*/
private static function parseChildNodes($node, $element, $styles, $data)
{
if ('li' != $node->nodeName) {
$cNodes = $node->childNodes;
if (count($cNodes) > 0) {
foreach ($cNodes as $cNode) {
if (($element instanceof AbstractContainer) or ($element instanceof Table) or ($element instanceof Row)) { // @change
self::parseNode($cNode, $element, $styles, $data);
}
}
}
}
}
/**
* Parse paragraph node
*
* @param \DOMNode $node
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element
* @param array &$styles
* @return \PhpOffice\PhpWord\Element\TextRun
*/
private static function parseParagraph($node, $element, &$styles)
{
$elementStyles = self::parseInlineStyle($node, $styles['paragraph']);
$newElement = $element->addTextRun($elementStyles);
return $newElement;
}
/**
* Parse heading node
*
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element
* @param array &$styles
* @param string $argument1 Name of heading style
* @return \PhpOffice\PhpWord\Element\TextRun
*
* @todo Think of a clever way of defining header styles, now it is only based on the assumption, that
* Heading1 - Heading6 are already defined somewhere
*/
private static function parseHeading($element, &$styles, $argument1)
{
$elementStyles = $argument1;
$newElement = $element->addTextRun($elementStyles);
return $newElement;
}
/**
* Parse text node
*
* @param \DOMNode $node
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element
* @param array &$styles
* @return null
*/
private static function parseText($node, $element, &$styles)
{
$elementStyles = self::parseInlineStyle($node, $styles['font']);
$textStyles = self::getInheritedTextStyles();
$paragraphStyles = self::getInheritedParagraphStyles();
// Commented out as the source of bug #257. `method_exists` doesn't seem to work properly in this case.
// @todo Find better error checking for this one
// if (method_exists($element, 'addText')) {
$element->addText($node->nodeValue, $textStyles, $paragraphStyles);
// }
return null;
}
/**
* Parse property node
*
* @param array &$styles
* @param string $argument1 Style name
* @param string $argument2 Style value
* @return null
*/
private static function parseProperty(&$styles, $argument1, $argument2)
{
$styles['font'][$argument1] = $argument2;
return null;
}
/**
* Parse table node
*
* @param \DOMNode $node
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element
* @param array &$styles
* @param string $argument1 Method name
* @return \PhpOffice\PhpWord\Element\AbstractContainer $element
*
* @todo As soon as TableItem, RowItem and CellItem support relative width and height
*/
private static function parseTable($node, $element, &$styles, $argument1)
{
$elementStyles = self::parseInlineStyle($node, $styles['table']);
$newElement = $element->addTable($elementStyles);
// $attributes = $node->attributes;
// if ($attributes->getNamedItem('width') !== null) {
// $newElement->setWidth($attributes->getNamedItem('width')->value);
// }
// if ($attributes->getNamedItem('height') !== null) {
// $newElement->setHeight($attributes->getNamedItem('height')->value);
// }
// if ($attributes->getNamedItem('width') !== null) {
// $newElement=$element->addCell($width=$attributes->getNamedItem('width')->value);
// }
return $newElement;
}
private static function parseRow($node, $element, &$styles, $argument1)
{
$elementStyles = self::parseInlineStyle($node, $styles['row']);
$newElement = $element->addRow(null, $elementStyles);
return $newElement;
}
private static function parseCell($node, $element, &$styles, $argument1)
{
$elementStyles = self::parseInlineStyle($node, $styles['cell']);
$colspan = $node->getAttribute('colspan');
if (!empty($colspan))
$elementStyles['gridSpan'] = $colspan-0;
$newElement = $element->addCell(null, $elementStyles);
return $newElement;
}
/**
* Parse list node
*
* @param array &$styles
* @param array &$data
* @param string $argument1 List type
* @return null
*/
private static function parseList(&$styles, &$data, $argument1)
{
if (isset($data['listdepth'])) {
$data['listdepth']++;
} else {
$data['listdepth'] = 0;
}
$styles['list']['listType'] = $argument1;
return null;
}
/**
* Parse list item node
*
* @param \DOMNode $node
* @param \PhpOffice\PhpWord\Element\AbstractContainer $element
* @param array &$styles
* @param array $data
* @return null
*
* @todo This function is almost the same as `parseChildNodes`. Merge?
* @todo As soon as ListItem inherits from AbstractContainer or TextRun delete parsing part of childNodes
*/
private static function parseListItem($node, $element, &$styles, $data)
{
$cNodes = $node->childNodes;
if (count($cNodes) > 0) {
$text = '';
foreach ($cNodes as $cNode) {
if ($cNode->nodeName == '#text') {
$text = $cNode->nodeValue;
}
}
$element->addListItem($text, $data['listdepth'], $styles['font'], $styles['list'], $styles['paragraph']);
}
return null;
}
/**
* Parse style
*
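* @param \DOMNode $node Node whose style attribute is being parsed
* @param string $stylesStr Content of the node's style attribute
* @param array $styles Default styles, overridden by the inherited ones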
* @return array
*/
private static function parseStyle($node, $stylesStr, $styles)
{
// Parses element styles.
$newStyles = array();
if (!empty($stylesStr))
{
$properties = explode(';', trim($stylesStr, " \t\n\r\0\x0B;"));
foreach ($properties as $property) {
list($cKey, $cValue) = explode(':', $property, 2);
$cValue = trim($cValue);
switch (trim($cKey)) {
case 'text-decoration':
switch ($cValue) {
case 'underline':
$newStyles['underline'] = 'single';
break;
case 'line-through':
$newStyles['strikethrough'] = true;
break;
}
break;
case 'text-align':
$newStyles['alignment'] = $cValue; // todo: any mapping?
break;
case 'color':
$newStyles['color'] = trim($cValue, "#");
break;
case 'background-color':
$newStyles['bgColor'] = trim($cValue, "#");
break;
// @change
case 'colspan':
$newStyles['gridSpan'] = $cValue-0;
break;
case 'font-weight':
if ($cValue=='bold')
$newStyles['bold'] = true;
break;
case 'width':
$newStyles = self::parseWidth($newStyles, $cValue);
break;
case 'border-width':
$newStyles = self::parseBorderWidth($newStyles, $cValue);
break;
case 'border-color':
$newStyles = self::parseBorderColor($newStyles, $cValue);
break;
case 'border':
$newStyles = self::parseBorder($newStyles, $cValue);
break;
}
}
}
// Add styles to stack.
self::pushStyles($newStyles);
// Inherit parent styles (including itself).
$inheritedStyles = self::getInheritedStyles($node->nodeName);
// Override default styles with the inherited ones.
$styles = array_merge($styles, $inheritedStyles);
/* DEBUG
if ($node->nodeName=='th')
{
echo '<pre>';
print_r(self::$stylesStack);
print_r($styles);
//print_r($elementStyles);
echo '</pre>';
}
*/
return $styles;
}
/**
* Parses the "width" style attribute, adding to styles
* array the corresponding PHPWORD attributes.
*/
public static function parseWidth($styles, $cValue)
{
if (preg_match('/([0-9]+)px/', $cValue, $matches))
{
$styles['width'] = $matches[1];
$styles['unit'] = 'dxa';
}
else if (preg_match('/([0-9]+)%/', $cValue, $matches))
{
$styles['width'] = $matches[1]*50;
$styles['unit'] = 'pct';
}
else if (preg_match('/([0-9]+)/', $cValue, $matches))
{
$styles['width'] = $matches[1];
$styles['unit'] = 'auto';
}
$styles['alignment'] = \PhpOffice\PhpWord\SimpleType\JcTable::START;
return $styles;
}
/**
* Parses the "border-width" style attribute, adding to styles
* array the corresponding PHPWORD attributes.
*/
public static function parseBorderWidth($styles, $cValue)
{
// border-width: 2px;
if (preg_match('/([0-9]+)px/', $cValue, $matches))
$styles['borderSize'] = $matches[1];
return $styles;
}
/**
* Parses the "border-color" style attribute, adding to styles
* array the corresponding PHPWORD attributes.
*/
public static function parseBorderColor($styles, $cValue)
{
// border-color: #FFAACC;
$styles['borderColor'] = $cValue;
return $styles;
}
/**
* Parses the "border" style attribute, adding to styles
* array the corresponding PHPWORD attributes.
*/
public static function parseBorder($styles, $cValue)
{
if (preg_match('/([0-9]+)px\s+(\#[a-fA-F0-9]+)\s+solid+/', $cValue, $matches))
{
$styles['borderSize'] = $matches[1];
$styles['borderColor'] = $matches[2];
}
return $styles;
}
/**
* Return the inherited styles for text elements,
* considering current stack state.
*/
public static function getInheritedTextStyles()
{
return self::getInheritedStyles('#text');
}
/**
* Return the inherited styles for paragraph elements,
* considering current stack state.
*/
public static function getInheritedParagraphStyles()
{
return self::getInheritedStyles('p');
}
/**
* Return the inherited styles for a given nodeType,
* considering current stack state.
*/
public static function getInheritedStyles($nodeType)
{
$textStyles = array('color', 'bold', 'italic');
$paragraphStyles = array('color', 'bold', 'italic', 'alignment');
// List of phpword styles relevant for each element types.
$stylesMapping = array(
'p' => $paragraphStyles,
'h1' => $textStyles,
'h2' => $textStyles,
'h3' => $textStyles,
'h4' => $textStyles,
'h5' => $textStyles,
'h6' => $textStyles,
'#text' => $textStyles,
'strong' => $textStyles,
'em' => $textStyles,
'sup' => $textStyles,
'sub' => $textStyles,
'table' => array('width', 'borderSize', 'borderColor', 'unit'),
'tr' => array('bgColor', 'alignment'),
'td' => array('bgColor', 'alignment'),
'th' => array('bgColor', 'alignment'),
'ul' => $textStyles,
'ol' => $textStyles,
'li' => $textStyles,
);
$result = array();
if (isset($stylesMapping[$nodeType]))
{
$nodeStyles = $stylesMapping[$nodeType];
// Loop through the styles stack, applying styles in
// the right order.
foreach (self::$stylesStack as $styles)
{
// Loop through all styles, applying only the ones relevant for
// that node type.
foreach ($styles as $name => $value)
{
if (in_array($name, $nodeStyles))
{
$result[$name] = $value;
}
}
}
}
return $result;
}
/**
* Add the parent styles to the stack, allowing
* child elements to inherit from them.
*/
public static function pushStyles($styles)
{
self::$stylesStack[] = $styles;
}
/**
* Remove parent styles at end of recursion.
*/
public static function popStyles()
{
array_pop(self::$stylesStack);
}
}
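The inheritance mechanism used by the class above can be hard to see inside the full listing, so here is a minimal, library-free sketch of the same idea. It is not part of the class: the names `$stack`, `$relevant` and `inheritedStyles()` are hypothetical and only mirror what `$stylesStack`, `$stylesMapping` and `getInheritedStyles()` do. Each ancestor pushes a frame of styles, and a child collects only the keys that are relevant for its node type.

```php
<?php
// Library-free sketch of a style stack; all names here are illustrative.

/** Styles pushed by each ancestor, outermost first. */
$stack = [];

/** Which style keys each node type may inherit (small illustrative subset). */
$relevant = [
    '#text' => ['color', 'bold', 'italic'],
    'td'    => ['bgColor', 'alignment'],
];

/**
 * Walk the stack from outermost to innermost frame and keep only the keys
 * relevant for $nodeType. Inner frames override outer ones.
 */
function inheritedStyles(array $stack, array $relevant, string $nodeType): array
{
    $result = [];
    foreach ($stack as $styles) {
        foreach ($styles as $name => $value) {
            if (in_array($name, $relevant[$nodeType] ?? [], true)) {
                $result[$name] = $value;
            }
        }
    }
    return $result;
}

// Entering <tr style="background-color:#FF0000; color:#FFFFFF; font-weight:bold">
$stack[] = ['bgColor' => 'FF0000', 'color' => 'FFFFFF', 'bold' => true];
// Entering <td> without inline styles: an empty frame keeps pushes and pops balanced.
$stack[] = [];

$cellStyles = inheritedStyles($stack, $relevant, 'td');
$textStyles = inheritedStyles($stack, $relevant, '#text');

array_pop($stack); // leaving <td>
array_pop($stack); // leaving <tr>

print_r($cellStyles);
print_r($textStyles);
```

In this sketch the cell picks up only the background colour from the row, while the text inside it picks up the font colour and the bold flag, which is the behaviour the docblock of `$stylesStack` describes.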
With this new structure it is easy to add support for new styles. You just need to edit the parseStyle() method and the $stylesMapping variable in the getInheritedStyles() method. Hope it helps.
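As an illustration of those two steps, the sketch below adds support for `font-style: italic`. It is a stripped-down stand-in, not the class itself: `parseDeclarations()` is a hypothetical helper that plays the role of the `switch` in `parseStyle()`, and the local `$stylesMapping` array only imitates the one in `getInheritedStyles()`. In the class above, `italic` is already listed in `$textStyles`, so for text nodes only the first step would be needed.

```php
<?php
// Sketch only: a miniature of the parseStyle() switch, with assumed names.

/**
 * Convert a CSS declaration string into PHPWord-style keys.
 * Handles font-weight (as in the class) and font-style (the new case).
 */
function parseDeclarations(string $css): array
{
    $styles = [];
    foreach (explode(';', trim($css, " \t\n\r\0\x0B;")) as $declaration) {
        if (strpos($declaration, ':') === false) {
            continue; // skip empty or malformed declarations
        }
        list($key, $value) = explode(':', $declaration, 2);
        $value = trim($value);
        switch (trim($key)) {
            case 'font-weight':
                if ($value === 'bold') {
                    $styles['bold'] = true;
                }
                break;
            case 'font-style': // step 1: the new case
                if ($value === 'italic') {
                    $styles['italic'] = true;
                }
                break;
        }
    }
    return $styles;
}

// Step 2: the style key must be listed for every node type that should inherit it.
$stylesMapping = [
    '#text' => ['color', 'bold', 'italic'],
];

$parsed  = parseDeclarations('font-weight: bold; font-style: italic;');
$applied = array_intersect_key($parsed, array_flip($stylesMapping['#text']));

print_r($applied);
```

A key that is parsed but missing from the mapping is silently dropped for that node type, so both places have to agree on the key name.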
Example of use:
<?php
include_once 'Sample_Header.php';
// New Word Document
echo date('H:i:s') , ' Create new PhpWord object' , EOL;
$phpWord = new \PhpOffice\PhpWord\PhpWord();
$section = $phpWord->addSection();
$html = '<table style="width: 50%; border: 6px #0000FF solid;">'.
'<thead>'.
'<tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">'.
'<th>a</th>'.
'<th>b</th>'.
'<th>c</th>'.
'</tr>'.
'</thead>'.
'<tbody>'.
'<tr><td>1</td><td colspan="2">2</td></tr>'.
'<tr><td>4</td><td>5</td><td>6</td></tr>'.
'</tbody>'.
'</table>';
\PhpOffice\PhpWord\Shared\Html::addHtml($section, $html);
// Save file
echo write($phpWord, basename(__FILE__, '.php'), $writers);
if (!CLI) {
include_once 'Sample_Footer.php';
}
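One practical caveat: `addHtml()` in the class above loads the markup with `DOMDocument::loadXML()`, so the fragment has to be well-formed XML (every tag closed, attributes quoted). The following sketch shows one way to check a fragment before passing it on. `xmlErrors()` is a hypothetical helper, not part of PHPWord, and it checks the raw fragment only: it does not repeat the entity preprocessing that `addHtml()` performs, so named HTML entities such as `&nbsp;` would probably be reported here even though `addHtml()` decodes them first.

```php
<?php
/**
 * Return the libxml error messages produced when $fragment is wrapped in
 * <body> and parsed as XML. An empty array means the fragment is well-formed.
 */
function xmlErrors(string $fragment): array
{
    $previous = libxml_use_internal_errors(true); // collect errors instead of emitting warnings
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadXML('<body>' . $fragment . '</body>');

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting
    return $messages;
}

$good = '<table><tr><td>1</td><td>2</td></tr></table>';
$bad  = '<table><tr><td>1<td>2</td></tr></table>'; // the first <td> is never closed

var_dump(xmlErrors($good) === []); // well-formed fragment
var_dump(xmlErrors($bad) !== []);  // malformed fragment reports at least one error
```

Running a check like this first makes it easier to tell a malformed input apart from a style or element that the class simply does not support yet.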