How to deserialize dodgy JSON (with improperly quoted strings, and missing brackets)?
I am having to parse (and ultimately reserialize) some dodgy JSON. It looks like this:
{
name: "xyz",
id: "29573f59-85fb-4d06-9905-01a3acb2cdbd",
status: "astatus",
color: colors["Open"]
},
{
name: "abc",
id: "29573f59-85fb-4d06-9905-01a3acb2cdbd",
status: "astatus",
color: colors["Open"]
}
There are a number of problems here, starting with the most severe.
1. color: colors["Open"]
   WTF even is that? If I drop 'colors' then I can get an array of strings out, but I can't tweak it to work out of the box.
2. It is an array without square brackets. I can fix this by wrapping it in them. But is there a way to support this out of the box?
3. Properties have no quotes. Deserializing is fine for these, but reserializing is just no dice.
Any suggestions for handling this structure both on the way in and on the way out?
Answering your questions #1 - #3 in order:
Json.NET does not support reading dodgy property values in the form colors["Open"] (which, as you correctly note, violates the JSON standard). Instead, you will need to manually fix these values, e.g. through some sort of Regex:

var regex = new Regex(@"(colors\[)(.*)(\])");
var fixedJsonString = regex.Replace(jsonString,
    m => string.Format(@"""{0}{1}{2}""",
        m.Groups[1].Value,
        m.Groups[2].Value.Replace("\"", "\\\""),
        m.Groups[3].Value));

This changes the color property values into properly escaped JSON strings:

color: "colors[\"Open\"]"
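One thing to watch with the pattern above: `.*` is greedy, so if several objects happen to sit on the same line it would match from the first `colors[` to the last `]` on that line. The following is only a sketch of a lazier variant, under the assumption that the bracketed value never contains a `]` itself. The names `DodgyJsonFixer` and `QuoteColorValues` are made up for illustration and are not part of Json.NET.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Json.NET): wraps every colors[...] value
// in quotes so that a standard JSON parser can read it as a string.
public static class DodgyJsonFixer
{
    // Lazy quantifier: stop at the first ']' after "colors[" rather than the
    // last one on the line. Assumes the value itself contains no ']'.
    private static readonly Regex ColorsRegex = new Regex(@"colors\[(.*?)\]");

    public static string QuoteColorValues(string json)
    {
        return ColorsRegex.Replace(json, m =>
            // Escape the inner quotes, then surround the whole value with quotes.
            "\"colors[" + m.Groups[1].Value.Replace("\"", "\\\"") + "]\"");
    }
}
```

For example, `DodgyJsonFixer.QuoteColorValues("{ color: colors[\"Open\"] }")` returns `{ color: "colors[\"Open\"]" }`. Note that this would also rewrite any legitimate string value that happens to contain `colors[...]`, so it is only appropriate if that cannot occur in your data.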
Json.NET does, however, have the capability to write dodgy property values by calling JsonWriter.WriteRawValue() from within a custom JsonConverter. Define the following converter:

public class RawStringConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(string);
    }

    public override bool CanRead { get { return false; } }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var s = (string)value;
        writer.WriteRawValue(s);
    }
}

Then define your RootObject as follows:

public class RootObject
{
    public string name { get; set; }
    public string id { get; set; }
    public string status { get; set; }

    [JsonConverter(typeof(RawStringConverter))]
    public string color { get; set; }
}

Then, when re-serialized, you will get the original dodgy values in your JSON.
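To make the round trip concrete, here is a small sketch for a single item. It repeats the converter only so that the block compiles on its own, trims RootObject down to two properties for brevity, and assumes the input has already had its colors[...] value quoted as described earlier. `RoundTripDemo` and `RoundTrip` are illustrative names, not Json.NET APIs.

```csharp
using System;
using Newtonsoft.Json;

// Same converter as in the answer, repeated only so this sketch is self-contained.
public class RawStringConverter : JsonConverter
{
    public override bool CanConvert(Type objectType) { return objectType == typeof(string); }

    // Reading falls back to Json.NET's default string handling.
    public override bool CanRead { get { return false; } }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        // Emit the string verbatim, without quotes or escaping.
        writer.WriteRawValue((string)value);
    }
}

// Trimmed-down version of RootObject, for illustration only.
public class RootObject
{
    public string name { get; set; }

    [JsonConverter(typeof(RawStringConverter))]
    public string color { get; set; }
}

// Hypothetical demo helper.
public static class RoundTripDemo
{
    public static string RoundTrip(string fixedJson)
    {
        var item = JsonConvert.DeserializeObject<RootObject>(fixedJson);
        return JsonConvert.SerializeObject(item);
    }
}
```

Passing in `{ name: "xyz", color: "colors[\"Open\"]" }` gives back `{"name":"xyz","color":colors["Open"]}`: the color value is raw again, while the property names are still quoted because that is controlled separately by the writer.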
Support for deserializing comma-delimited JSON without outer brackets will be in the next release of Json.NET after 10.0.3. See Issue 1396 and Issue 1355 for details. You will need to set JsonTextReader.SupportMultipleContent = true to make it work.

In the meantime, as a workaround, you could grab ChainedTextReader and public static TextReader Extensions.Concat(this TextReader first, TextReader second) from the answer to "How to string multiple TextReaders together?" by Rex M and surround your JSON with brackets [ and ]. Thus you would deserialize your JSON as follows:

List<RootObject> list;
using (var reader = new StringReader("[").Concat(new StringReader(fixedJsonString)).Concat(new StringReader("]")))
using (var jsonReader = new JsonTextReader(reader))
{
    list = JsonSerializer.CreateDefault().Deserialize<List<RootObject>>(jsonReader);
}

(Or you could just manually surround your JSON string with [ and ], but I prefer solutions that don't involve copying possibly large strings.)

Re-serializing a root collection without outer brackets is possible if you serialize each item individually using its own JsonTextWriter with CloseOutput = false. You can also manually write a , between each serialized item to the underlying TextWriter shared by every JsonTextWriter.

Serializing JSON property names without a surrounding quote character is possible if you set JsonTextWriter.QuoteName = false.

Thus, to re-serialize your List<RootObject> without quoted property names or outer brackets, do:

var sb = new StringBuilder();
bool first = true;
using (var textWriter = new StringWriter(sb))
{
    foreach (var item in list)
    {
        if (!first)
        {
            textWriter.WriteLine(",");
        }
        first = false;
        using (var jsonWriter = new JsonTextWriter(textWriter) { QuoteName = false, Formatting = Formatting.Indented, CloseOutput = false })
        {
            JsonSerializer.CreateDefault().Serialize(jsonWriter, item);
        }
    }
}
var reserializedJson = sb.ToString();
Sample .Net fiddle showing all this in action.