How to deserialize "NaN" as `nan` with serde_json?

Question

I have a datatype which looks like this:

#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Matrix {
    #[serde(rename = "numColumns")]
    pub num_cols: usize,
    #[serde(rename = "numRows")]
    pub num_rows: usize,
    pub data: Vec<f64>,
}

My JSON bodies look something like this:

{
    "numRows": 2,
    "numColumns": 1,
    "data": [1.0, "NaN"]
}

This is the serialization produced by Jackson (from a Java server we use), and it is valid JSON. Unfortunately, if we call serde_json::from_str(&blob) we get an error:

Error("invalid type: string "NaN", expected f64", [snip]

I understand there are subtleties around floating point numbers and people get very opinionated about the way things ought to be. I respect that. Rust in particular likes to be very opinionated, and I like that.

However, at the end of the day these JSON blobs are what I'm going to receive. I need that "NaN" string to deserialize to some f64 value for which is_nan() is true, and which serializes back to the string "NaN", because the rest of the ecosystem uses Jackson and this is fine there.
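One practical consequence of this requirement when checking a round trip: NaN never compares equal to itself, so the check has to go through is_nan() rather than ==. A small std-only sketch; `same_float` and `same_floats` are hypothetical helper names.

```rust
/// Hypothetical helper: two floats count as "the same" if they are equal,
/// or if both are NaN (which `==` alone would report as different).
pub fn same_float(a: f64, b: f64) -> bool {
    a == b || (a.is_nan() && b.is_nan())
}

/// Element-wise comparison of two slices using `same_float`.
pub fn same_floats(a: &[f64], b: &[f64]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| same_float(*x, *y))
}
```

A plain `assert_eq!` on two `Vec<f64>` values that contain NaN would fail even after a correct round trip, which is why a helper of this kind is useful in tests.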

Can this be achieved in a reasonable way?

The suggested linked questions talk about overriding the derived deserializer, but they do not explain how to deserialize floats specifically.

Recommended answer

It actually seems that using a custom deserializer inside a Vec (or Map, etc.) is an open issue on serde, and has been for a little over a year (as of time of writing): https://github.com/serde-rs/serde/issues/723

I believe the solution is to write a custom deserializer for f64 (which is fine), as well as for everything that contains f64 (e.g. Vec<f64>, HashMap<K, f64>, etc.). Unfortunately these things do not seem to be composable, because implementations of these methods look like

fn deserialize<'de, D>(deserializer: D) -> Result<Vec<f64>, D::Error>
where D: Deserializer<'de> { /* snip */ }

and once you have a Deserializer you can only interact with it through visitors.

Long story short, I eventually got it working, but it seems like a lot of code that shouldn't be necessary. I'm posting it here in the hope that either (a) someone knows how to clean this up, or (b) this is really how it should be done, and this answer will be useful to someone. I spent a whole day fervently reading docs and making trial-and-error guesses, so maybe this will save someone else the effort. The functions (de)serialize_float(s) should be used with an appropriate #[serde( (de)serialize_with="etc." )] attribute above the field name.

use serde::de::{self, SeqAccess, Visitor};
use serde::ser::SerializeSeq;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

type Float = f64;

const NAN: Float = f64::NAN;

struct NiceFloat(Float);

impl Serialize for NiceFloat {
    #[inline]
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        serialize_float(&self.0, serializer)
    }
}

pub fn serialize_float<S>(x: &Float, serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer,
{
    if x.is_nan() {
        serializer.serialize_str("NaN")
    } else {
        serializer.serialize_f64(*x)
    }
}

pub fn serialize_floats<S>(floats: &[Float], serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer,
{
    let mut seq = serializer.serialize_seq(Some(floats.len()))?;

    for f in floats {
        seq.serialize_element(&NiceFloat(*f))?;
    }

    seq.end()
}

struct FloatDeserializeVisitor;

impl<'de> Visitor<'de> for FloatDeserializeVisitor {
    type Value = Float;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a float or the string \"NaN\"")
    }

    fn visit_i32<E>(self, v: i32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_i64<E>(self, v: i64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_u32<E>(self, v: u32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_f32<E>(self, v: f32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_f64<E>(self, v: f64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        if v == "NaN" {
            Ok(NAN)
        } else {
            Err(E::invalid_value(de::Unexpected::Str(v), &self))
        }
    }
}

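// Note: nothing below uses this visitor. The `Deserialize` impl for
// `NiceFloat` goes through `deserialize_float` (and therefore
// `FloatDeserializeVisitor`) instead, so this type could be dropped.
struct NiceFloatDeserializeVisitor;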

impl<'de> Visitor<'de> for NiceFloatDeserializeVisitor {
    type Value = NiceFloat;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a float or the string \"NaN\"")
    }

    fn visit_f32<E>(self, v: f32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(NiceFloat(v as Float))
    }

    fn visit_f64<E>(self, v: f64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(NiceFloat(v as Float))
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        if v == "NaN" {
            Ok(NiceFloat(NAN))
        } else {
            Err(E::invalid_value(de::Unexpected::Str(v), &self))
        }
    }
}

pub fn deserialize_float<'de, D>(deserializer: D) -> Result<Float, D::Error>
where
    D: Deserializer<'de>,
{
    deserializer.deserialize_any(FloatDeserializeVisitor)
}

impl<'de> Deserialize<'de> for NiceFloat {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        let raw = deserialize_float(deserializer)?;
        Ok(NiceFloat(raw))
    }
}

pub struct VecDeserializeVisitor<T>(std::marker::PhantomData<T>);

impl<'de, T> Visitor<'de> for VecDeserializeVisitor<T>
where
    T: Deserialize<'de> + Sized,
{
    type Value = Vec<T>;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("A sequence of floats or \"NaN\" string values")
    }

    fn visit_seq<S>(self, mut seq: S) -> Result<Self::Value, S::Error>
    where
        S: SeqAccess<'de>,
    {
        let mut out = Vec::with_capacity(seq.size_hint().unwrap_or(0));

        while let Some(value) = seq.next_element()? {
            out.push(value);
        }

        Ok(out)
    }
}

pub fn deserialize_floats<'de, D>(deserializer: D) -> Result<Vec<Float>, D::Error>
where
    D: Deserializer<'de>,
{
    let visitor: VecDeserializeVisitor<NiceFloat> = VecDeserializeVisitor(std::marker::PhantomData);

    let seq: Vec<NiceFloat> = deserializer.deserialize_seq(visitor)?;

    let raw: Vec<Float> = seq.into_iter().map(|nf| nf.0).collect::<Vec<Float>>();

    Ok(raw)
}
