How to deserialize dodgy JSON (with improperly quoted strings, and missing brackets)?


Problem description

I have to parse (and ultimately re-serialize) some dodgy JSON. It looks like this:

{
  name: "xyz",
  id: "29573f59-85fb-4d06-9905-01a3acb2cdbd",
  status: "astatus",
  color: colors["Open"]
},
{
  name: "abc",
  id: "29573f59-85fb-4d06-9905-01a3acb2cdbd",
  status: "astatus",
  color: colors["Open"]
}

There are a number of problems here - starting with the most severe.

  1. color: colors["Open"]

    WTF even is that? If I drop 'colors' then I can get an array of strings out, but I can't tweak it to work out of the box.

  2. It is an array without square brackets. I can fix this by wrapping it in them. But is there a way to support this out of the box?

  3. Properties have no quotes. Deserializing is fine for these, but re-serializing is just no dice.

Any suggestions for handling this structure, both reading it in and writing it back out?

Solution

Answering your questions #1 - #3 in order:

  1. Json.NET does not support reading dodgy property values in the form colors["Open"] (which, as you correctly note, violates the JSON standard).

    Instead, you will need to manually fix these values, e.g. through some sort of Regex:

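    // Requires: using System.Text.RegularExpressions;
    // Wrap each colors[...] value in double quotes and escape the quotes inside it.
    var regex = new Regex(@"(colors\[)(.*)(\])");
    var fixedJsonString = regex.Replace(jsonString, 
        m => string.Format(@"""{0}{1}{2}""", m.Groups[1].Value, m.Groups[2].Value.Replace("\"", "\\\""), m.Groups[3].Value));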
    

    This changes the color property values into properly escaped JSON strings:

    color: "colors[\"Open\"]"
    

    Json.NET does, however, have the capability to write dodgy property values by calling JsonWriter.WriteRawValue() from within a custom JsonConverter.

    Define the following converter:

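    // Requires: using System; using Newtonsoft.Json;
    // Writes a string property verbatim (unquoted, unescaped) when serializing.
    public class RawStringConverter : JsonConverter
    {
        public override bool CanConvert(Type objectType)
        {
            return objectType == typeof(string);
        }
    
        // Reading is left to the default string handling, so ReadJson is never called.
        public override bool CanRead { get { return false; } }
    
        public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
        {
            throw new NotImplementedException();
        }
    
        public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
        {
            // Emit the stored text as-is, e.g. colors["Open"] rather than "colors[\"Open\"]".
            var s = (string)value;
            writer.WriteRawValue(s);
        }
    }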
    

    Then define your RootObject as follows:

    public class RootObject
    {
        public string name { get; set; }
        public string id { get; set; }
        public string status { get; set; }
    
        [JsonConverter(typeof(RawStringConverter))]
        public string color { get; set; }
    }
    

    Then, when re-serialized, you will get the original dodgy values in your JSON.

  2. Support for deserializing comma-delimited JSON without outer brackets will be in the next release of Json.NET after 10.0.3. See Issue 1396 and Issue 1355 for details. You will need to set JsonTextReader.SupportMultipleContent = true to make it work.

    In the meantime, as a workaround, you could grab ChainedTextReader and public static TextReader Extensions.Concat(this TextReader first, TextReader second) from the answer to How to string multiple TextReaders together? by Rex M and surround your JSON with brackets [ and ].

    Thus you would deserialize your JSON as follows:

    List<RootObject> list;
    using (var reader = new StringReader("[").Concat(new StringReader(fixedJsonString)).Concat(new StringReader("]")))
    using (var jsonReader = new JsonTextReader(reader))
    {
        list = JsonSerializer.CreateDefault().Deserialize<List<RootObject>>(jsonReader);
    }
    

    (Or you could just manually surround your JSON string with [ and ], but I prefer solutions that don't involve copying possibly large strings.)

    Re-serializing a root collection without outer brackets is possible if you serialize each item individually using its own JsonTextWriter with CloseOutput = false. You can then manually write a comma (,) between each serialized item to the underlying TextWriter shared by every JsonTextWriter.

  3. Serializing JSON property names without a surrounding quote character is possible if you set JsonTextWriter.QuoteName = false.

    Thus, to re-serialize your List<RootObject> without quoted property names or outer brackets, do:

    var sb = new StringBuilder();
    bool first = true;
    using (var textWriter = new StringWriter(sb))
    {
        foreach (var item in list)
        {
            if (!first)
            {
                textWriter.WriteLine(",");
            }
            first = false;
            using (var jsonWriter = new JsonTextWriter(textWriter) { QuoteName = false, Formatting = Formatting.Indented, CloseOutput = false })
            {
                JsonSerializer.CreateDefault().Serialize(jsonWriter, item);
            }
        }
    }
    
    var reserializedJson = sb.ToString();
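
For point 2, here is a rough sketch of how reading could look once you are on a Json.NET release that includes the comma-delimited support mentioned above. The class name MultipleContentReader and the method ReadAll are made up for illustration; only JsonTextReader.SupportMultipleContent, JsonReader.Read() and JsonSerializer.Deserialize<T>() are Json.NET APIs. Whether commas between the root objects are accepted depends on the installed version, so treat that part as an assumption to verify.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper, not part of Json.NET.
public static class MultipleContentReader
{
    // Reads every root-level JSON value from the text into a list.
    public static List<T> ReadAll<T>(string json)
    {
        var list = new List<T>();
        var serializer = JsonSerializer.CreateDefault();
        using (var textReader = new StringReader(json))
        using (var jsonReader = new JsonTextReader(textReader) { SupportMultipleContent = true })
        {
            // Each Read() advances to the start of the next root value;
            // it returns false once the input is exhausted.
            while (jsonReader.Read())
            {
                list.Add(serializer.Deserialize<T>(jsonReader));
            }
        }
        return list;
    }
}
```

With that in place, something like MultipleContentReader.ReadAll<RootObject>(fixedJsonString) would replace the bracket-wrapping workaround.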
    

Sample .NET fiddle showing all this in action.
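
The ChainedTextReader and Concat helper referenced in point 2 live in a separate answer and are not reproduced here. As a minimal sketch of what such a helper might look like (this is my own illustrative implementation, not Rex M's code; only the type and method names are taken from the text above):

```csharp
using System.Collections.Generic;
using System.IO;

// Illustrative implementation: presents several TextReaders as one continuous stream.
public sealed class ChainedTextReader : TextReader
{
    private readonly Queue<TextReader> readers;

    public ChainedTextReader(IEnumerable<TextReader> readers)
    {
        this.readers = new Queue<TextReader>(readers);
    }

    public override int Peek()
    {
        while (readers.Count > 0)
        {
            int c = readers.Peek().Peek();
            if (c >= 0) return c;
            readers.Dequeue().Dispose(); // current reader is exhausted, move on
        }
        return -1;
    }

    public override int Read()
    {
        while (readers.Count > 0)
        {
            int c = readers.Peek().Read();
            if (c >= 0) return c;
            readers.Dequeue().Dispose();
        }
        return -1;
    }

    public override int Read(char[] buffer, int index, int count)
    {
        if (count == 0) return 0;
        while (readers.Count > 0)
        {
            int n = readers.Peek().Read(buffer, index, count);
            if (n > 0) return n;
            readers.Dequeue().Dispose();
        }
        return 0;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            while (readers.Count > 0) readers.Dequeue().Dispose();
        }
        base.Dispose(disposing);
    }
}

public static class Extensions
{
    // Returns a reader that yields all of 'first', then all of 'second'.
    public static TextReader Concat(this TextReader first, TextReader second)
    {
        return new ChainedTextReader(new[] { first, second });
    }
}
```

Because each piece is read lazily from its own reader, the large JSON string is never copied into a new bracketed string, which is the reason for preferring this over plain string concatenation.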
