How do I convert fairly unstructured text to XML

Question

I have a Word file that stores data. There is a heading, then usually at least three lines that have the same descriptor, but often there are many more lines of data. The lines are too disparate for Find and Replace to work well. I wish to avoid manually editing the document to turn it into XML, but I see little choice. Is there an alternative way?

Recommended answer

There has to be some structure to the data; otherwise, you're going to have to do it by hand.

If a heading is denoted by a certain style (like Heading1), you can look for that style and treat everything between one occurrence of it and the next as belonging to that heading.

Without structure in the original file, how would you suggest writing code to create that structure?

There is always a way, but you have to be able to tell the program how to pull out the information.
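
As a rough illustration of the heading-style idea, here is a minimal Python sketch. It assumes the paragraphs have already been read out of the Word file as (style name, text) pairs; with the python-docx library, for instance, each paragraph exposes `.style.name` and `.text`. The element names `document`, `section` and `line`, the function name `paragraphs_to_xml` and the sample data are all made up for the example, and the heading style name may differ in your file (Word's built-in style is often reported as "Heading 1", with a space).

```python
import xml.etree.ElementTree as ET


def paragraphs_to_xml(paragraphs, heading_style="Heading1"):
    """Group (style, text) pairs into one <section> element per heading."""
    root = ET.Element("document")  # hypothetical root element name
    section = None
    for style, text in paragraphs:
        text = text.strip()
        if not text:
            continue  # skip empty paragraphs
        if style == heading_style:
            # A heading paragraph starts a new section.
            section = ET.SubElement(root, "section", title=text)
        elif section is not None:
            # Any other paragraph belongs to the most recent heading.
            ET.SubElement(section, "line").text = text
        # Paragraphs before the first heading are ignored in this sketch.
    return root


# Made-up sample data standing in for paragraphs read from the Word file.
sample = [
    ("Heading1", "Widgets"),
    ("Normal", "size: 10"),
    ("Normal", "colour: red"),
    ("Heading1", "Gadgets"),
    ("Normal", "size: 3"),
]

xml_text = ET.tostring(paragraphs_to_xml(sample), encoding="unicode")
print(xml_text)
```

This only works if the headings really are marked consistently. If the data lines themselves follow a pattern (for example "descriptor: value"), they could be split further inside the loop, but that depends on what structure the file actually has.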


Thanks. This is as I thought, but I asked the question just in case. I think it is a case of manually copying and pasting.

