How can I populate C# classes from an XML document that has some embedded data?
Question
I have an API that has returned this:
http://services.aonaware.com/DictService/DictService.asmx?op=DefineInDict
<?xml version="1.0" encoding="utf-8"?>
<WordDefinition xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://services.aonaware.com/webservices/">
<Word>abandon</Word>
<Definitions>
<Definition>
<Word>abandon</Word>
<Dictionary>
<Id>wn</Id>
<Name>WordNet (r) 2.0</Name>
</Dictionary>
<WordDefinition>abandon
n 1: the trait of lacking restraint or control; freedom from
inhibition or worry; "she danced with abandon" [syn: {wantonness},
{unconstraint}]
2: a feeling of extreme emotional intensity; "the wildness of
his anger" [syn: {wildness}]
v 1: forsake, leave behind; "We abandoned the old car in the
empty parking lot"
2: stop maintaining or insisting on; of ideas, claims, etc.;
"He abandoned the thought of asking for her hand in
marriage"; "Both sides have to give up some calims in
these negociations" [syn: {give up}]
3: give up with the intent of never claiming again; "Abandon
your life to God"; "She gave up her children to her
ex-husband when she moved to Tahiti"; "We gave the
drowning victim up for dead" [syn: {give up}]
4: leave behind empty; move out of; "You must vacate your
office by tonight" [syn: {vacate}, {empty}]
5: leave someone who needs or counts on you; leave in the
lurch; "The mother deserted her children" [syn: {forsake},
{desolate}, {desert}]
</WordDefinition>
</Definition>
</Definitions>
</WordDefinition>
Here is the code that I used to retrieve the XML data:
WebRequest request = WebRequest.Create("http://services.aonaware.com/DictService/DictService.asmx/DefineInDict");
request.Method = "POST";
string postData = "dictId=wn&word=abandon";
byte[] byteArray = Encoding.UTF8.GetBytes(postData);
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream dataStream = request.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
WebResponse response = request.GetResponse();
Console.WriteLine(((HttpWebResponse)response).StatusDescription);
dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
string responseFromServer = reader.ReadToEnd();
Console.WriteLine(responseFromServer);
reader.Close();
dataStream.Close();
response.Close();
I would like to extract the data from the XML into a List<Definition>, where the Definition class looks like:
public class Def
{
    public string text { get; set; }
    public List<string> synonym { get; set; }
}

public class Definition
{
    public string type { get; set; } // single character: n or v or a
    public List<Def> Def { get; set; }
}
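Can someone give me some suggestions on how I could do this, and show what options are available to me for picking the class elements out of the XML and putting them into classes?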
As I think this question could be helpful to many other people, I'll open a large bounty, so hopefully someone can take the time to come up with a good example.
Update:
Sorry, I made a mistake with Synonym and have changed it now; I hope it makes more sense. The synonyms are just a List<string>. I have also put in bold what I need, as the two answers so far don't seem to answer the question at all. Thank you.
Recommended answer
I created a simple parser for the word definition (pretty sure there's room for improvements here):
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

class ParseyMcParseface
{
    /// <summary>
    /// Word definition lines
    /// </summary>
    private string[] _text;

    /// <summary>
    /// Constructor (takes the InnerText of the WordDefinition tag as input)
    /// </summary>
    /// <param name="text">InnerText of the WordDefinition</param>
    public ParseyMcParseface(string text)
    {
        _text = text.Split(new [] {'\n'}, StringSplitOptions.RemoveEmptyEntries)
            .Skip(1) // Skip the first line where the word is mentioned
            .ToArray();
    }

    /// <summary>
    /// Convert from single letter type to full human readable type
    /// </summary>
    /// <param name="c"></param>
    /// <returns></returns>
    private string CharToType(char c)
    {
        switch (c)
        {
            case 'a':
                return "Adjective";
            case 'n':
                return "Noun";
            case 'v':
                return "Verb";
            default:
                return "Unknown";
        }
    }

    /// <summary>
    /// Reorganize the data for easier parsing
    /// </summary>
    /// <param name="text">Lines of text</param>
    /// <returns></returns>
    private static List<List<string>> MakeLists(IEnumerable<string> text)
    {
        List<List<string>> types = new List<List<string>>();
        int i = -1;
        int j = 0;
        foreach (var line in text)
        {
            // New type (Noun, Verb, Adj.)
            if (Regex.IsMatch(line.Trim(), "^[avn]{1}\\ \\d+"))
            {
                types.Add(new List<string> { line.Trim() });
                i++;
                j = 0;
            }
            // New definition in the previous type
            else if (Regex.IsMatch(line.Trim(), "^\\d+"))
            {
                j++;
                types[i].Add(line.Trim());
            }
            // New line of the same definition
            else
            {
                types[i][j] = types[i][j] + " " + line.Trim();
            }
        }
        return types;
    }

    public List<Definition> Parse()
    {
        var definitionsLines = MakeLists(_text);
        List<Definition> definitions = new List<Definition>();
        foreach (var type in definitionsLines)
        {
            var defs = new List<Def>();
            foreach (var def in type)
            {
                var match = Regex.Match(def.Trim(), "(?:\\:\\ )(\\w|\\ |;|\"|,|\\.|-)*[\\[]{0,1}");
                MatchCollection syns = Regex.Matches(def.Trim(), "\\{(\\w|\\ )+\\}");
                List<string> synonymes = new List<string>();
                foreach (Match syn in syns)
                {
                    synonymes.Add(syn.Value.Trim('{', '}'));
                }
                defs.Add(new Def()
                {
                    text = match.Value.Trim(':', '[', ' '),
                    synonym = synonymes
                });
            }
            definitions.Add(new Definition
            {
                type = CharToType(type[0][0]),
                Def = defs
            });
        }
        return definitions;
    }
}
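To make the synonym step easier to follow in isolation, here is a minimal sketch of pulling the {braced} words out of one joined definition line. The class name SynonymSketch and the method ExtractSynonyms are hypothetical and not part of the parser above, and the pattern is a slightly looser variant of the one used there (it accepts any characters between the braces, so hyphenated words are kept). Like the parser above, it does not distinguish a [syn: ...] block from any other braced text.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, shown only to illustrate the synonym extraction step.
static class SynonymSketch
{
    // Matches one {word} entry and captures the text between the braces.
    static readonly Regex Braced = new Regex(@"\{([^{}]+)\}");

    public static List<string> ExtractSynonyms(string definitionLine)
    {
        return Braced.Matches(definitionLine)
                     .Cast<Match>()                   // MatchCollection is non-generic on older frameworks
                     .Select(m => m.Groups[1].Value)  // group 1 = text without the braces
                     .ToList();
    }
}
```

Calling ExtractSynonyms on the fifth verb sense from the question (the line ending in [syn: {forsake}, {desolate}, {desert}]) should give a list with the three words forsake, desolate and desert.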
And here is a usage example:
WebRequest request =
WebRequest.Create("http://services.aonaware.com/DictService/DictService.asmx/DefineInDict");
request.Method = "POST";
string postData = "dictId=wn&word=abandon";
byte[] byteArray = Encoding.UTF8.GetBytes(postData);
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream dataStream = request.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
WebResponse response = request.GetResponse();
Console.WriteLine(((HttpWebResponse)response).StatusDescription);
dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
string responseFromServer = reader.ReadToEnd();
var doc = new XmlDocument();
doc.LoadXml(responseFromServer);
// Two elements are named WordDefinition: el[0] is the document root,
// el[1] is the nested element that holds the definition text.
var el = doc.GetElementsByTagName("WordDefinition");
ParseyMcParseface parseyMcParseface = new ParseyMcParseface(el[1].InnerText);
var parsingResult = parseyMcParseface.Parse();
// parsingResult will contain a list of Definitions
// per the format specified in the question.
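As an alternative to indexing into GetElementsByTagName, the nested element can also be selected by its position in the tree. The following is only a sketch using LINQ to XML; the class name WordDefinitionXml and the method ExtractDefinitionTexts are made up for illustration, while the namespace URI and element names are taken from the response shown in the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the answer's code.
static class WordDefinitionXml
{
    // Default namespace declared on the root element of the service response.
    static readonly XNamespace Ns = "http://services.aonaware.com/webservices/";

    // Returns the free-text body of every <WordDefinition> that is a direct
    // child of a <Definition>, which skips the root element of the same name.
    public static string[] ExtractDefinitionTexts(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants(Ns + "Definition")
                  .Elements(Ns + "WordDefinition")
                  .Select(e => e.Value)
                  .ToArray();
    }
}
```

Each returned string could then be handed to the ParseyMcParseface constructor in place of el[1].InnerText. Because the elements live in a default namespace, the element names have to be qualified with it; an unqualified "Definition" would match nothing.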
And here's a live demo: https://dotnetfiddle.net/24IQ67
You can also avoid manually retrieving and then parsing the XML by adding a reference to that web service.
I've made a little app that does that then parses the definition. It is hosted here on GitHub (it's too big to post here on StackOverflow):
public enum WordTypes
{
    Noun,
    Verb,
    Adjective,
    Adverb,
    Unknown
}

public class Definition
{
    public Definition()
    {
        Synonyms = new List<string>();
        Anotnyms = new List<string>();
    }
    public WordTypes WordType { get; set; }
    public string DefinitionText { get; set; }
    public List<string> Synonyms { get; set; }
    public List<string> Anotnyms { get; set; }
}
static class DefinitionParser
{
    public static List<Definition> Parse(string wordDefinition)
    {
        var wordDefinitionLines = wordDefinition.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Skip(1)
            .Select(x => x.Trim())
            .ToList();
        var flatenedList = MakeLists(wordDefinitionLines).SelectMany(x => x).ToList();
        var result = new List<Definition>();
        foreach (var wd in flatenedList)
        {
            var foundMatch = Regex.Match(wd, @"^(?<matchType>adv|adj|v|n){0,1}\s*(\d*): (?<definition>[\w\s;""',\.\(\)\!\-]+)(?<extraInfoSyns>\[syn: ((?<wordSyn>\{[\w\s\-]+\})|(?:[,\ ]))*\]){0,1}\s*(?<extraInfoAnts>\[ant: ((?<wordAnt>\{[\w\s-]+\})|(?:[,\ ]))*\]){0,1}");
            var def = new Definition();
            if (foundMatch.Groups["matchType"].Success)
            {
                var matchType = foundMatch.Groups["matchType"];
                def.WordType = DefinitionTypeToEnum(matchType.Value);
            }
            if (foundMatch.Groups["definition"].Success)
            {
                var definition = foundMatch.Groups["definition"];
                def.DefinitionText = definition.Value;
            }
            if (foundMatch.Groups["extraInfoSyns"].Success && foundMatch.Groups["wordSyn"].Success)
            {
                foreach (Capture capture in foundMatch.Groups["wordSyn"].Captures)
                {
                    def.Synonyms.Add(capture.Value.Trim('{','}'));
                }
            }
            if (foundMatch.Groups["extraInfoAnts"].Success && foundMatch.Groups["wordAnt"].Success)
            {
                foreach (Capture capture in foundMatch.Groups["wordAnt"].Captures)
                {
                    def.Anotnyms.Add(capture.Value.Trim('{', '}'));
                }
            }
            result.Add(def);
        }
        return result;
    }

    private static List<List<string>> MakeLists(IEnumerable<string> text)
    {
        List<List<string>> types = new List<List<string>>();
        int i = -1;
        int j = 0;
        foreach (var line in text)
        {
            // New type (Noun, Verb, Adj.)
            if (Regex.IsMatch(line, "^(adj|v|n|adv){1}\\s\\d*"))
            {
                types.Add(new List<string> { line });
                i++;
                j = 0;
            }
            // New definition in the previous type
            else if (Regex.IsMatch(line, "^\\d+"))
            {
                j++;
                types[i].Add(line);
            }
            // New line of the same definition
            else
            {
                types[i][j] = types[i][j] + " " + line;
            }
        }
        return types;
    }

    private static WordTypes DefinitionTypeToEnum(string input)
    {
        switch (input)
        {
            case "adj":
                return WordTypes.Adjective;
            case "adv":
                return WordTypes.Adverb;
            case "n":
                return WordTypes.Noun;
            case "v":
                return WordTypes.Verb;
            default:
                return WordTypes.Unknown;
        }
    }
}
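Notes:
- This should work as expected.
- Parsing free text is unreliable.
- You should import the service reference (as the other answer said) instead of parsing the XML by hand.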
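On the last note: when importing the service reference is not an option, one way to avoid walking the XML by hand is to let XmlSerializer map the envelope onto classes. The sketch below is an assumption-laden illustration, not the generated proxy: the class names WordDefinitionResponse, DefinitionEntry, DictionaryInfo and DictXml are hypothetical, and only the element names and namespace come from the response in the question. The free text inside the inner WordDefinition still has to be parsed separately.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes mirroring the element layout of the response.
[XmlRoot("WordDefinition", Namespace = DictXml.Ns)]
public class WordDefinitionResponse
{
    [XmlElement("Word")]
    public string Word { get; set; }

    [XmlArray("Definitions")]
    [XmlArrayItem("Definition")]
    public List<DefinitionEntry> Definitions { get; set; }
}

[XmlType(Namespace = DictXml.Ns)]
public class DefinitionEntry
{
    [XmlElement("Word")]
    public string Word { get; set; }

    [XmlElement("Dictionary")]
    public DictionaryInfo Dictionary { get; set; }

    // Free-text body; structured parsing of this string is a separate step.
    [XmlElement("WordDefinition")]
    public string WordDefinition { get; set; }
}

[XmlType(Namespace = DictXml.Ns)]
public class DictionaryInfo
{
    [XmlElement("Id")]
    public string Id { get; set; }

    [XmlElement("Name")]
    public string Name { get; set; }
}

public static class DictXml
{
    // Default namespace declared on the root element of the response.
    public const string Ns = "http://services.aonaware.com/webservices/";

    public static WordDefinitionResponse Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(WordDefinitionResponse));
        using (var reader = new StringReader(xml))
        {
            return (WordDefinitionResponse)serializer.Deserialize(reader);
        }
    }
}
```

With this in place, DictXml.Deserialize(responseFromServer).Definitions[0].WordDefinition would hold the same text that the earlier usage example reads from el[1].InnerText.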