An Introduction to How RawMessage Works in Go's JSON Library



Main Text

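As a general-purpose encoding protocol, JSON is more readable than protocols such as Thrift and Protobuf, and its encoded size is smaller than that of protocols like XML, so it is used very widely. In many services, JSON serialization and deserialization is in fact the largest cost on our production instances, which is why many Gophers put so much effort into optimizing this process.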

Today we look at a classic capability provided by Go's official JSON library: RawMessage.

What Is Serialization

First, let's think about what serialization actually means.

Consider the definitions of the two interfaces Marshaler and Unmarshaler in the json package:

// Marshaler is the interface implemented by types that
// can marshal themselves into valid JSON.
type Marshaler interface {
    MarshalJSON() ([]byte, error)
}

Serialization, that is, marshaling, converts a value of some type into a byte slice, which is the []byte returned by the interface method above.

// Unmarshaler is the interface implemented by types
// that can unmarshal a JSON description of themselves.
// The input can be assumed to be a valid encoding of
// a JSON value. UnmarshalJSON must copy the JSON data
// if it wishes to retain the data after returning.
//
// By convention, to approximate the behavior of Unmarshal itself,
// Unmarshalers implement UnmarshalJSON([]byte("null")) as a no-op.
type Unmarshaler interface {
    UnmarshalJSON([]byte) error
}

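Deserialization is the inverse of serialization: it takes a byte slice and converts it into a value of the target type.

In fact, if your custom type implements the two interfaces above, the json package's json.Marshal and json.Unmarshal functions will call your implementation.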

In short, serialization is essentially the process of converting an object into a byte slice, that is, a []byte.
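
To make the two interfaces concrete, here is a minimal sketch of a custom type that implements both. The names Level, Config, encodeConfig and decodeConfig are hypothetical and exist only for this illustration; they are not part of the json package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Level is a hypothetical type used only for this illustration.
// It is an int in memory but a string such as "high" in JSON.
type Level int

const (
	Low Level = iota
	High
)

// MarshalJSON implements json.Marshaler: Level -> []byte.
func (l Level) MarshalJSON() ([]byte, error) {
	switch l {
	case Low:
		return []byte(`"low"`), nil
	case High:
		return []byte(`"high"`), nil
	}
	return nil, fmt.Errorf("unknown level %d", int(l))
}

// UnmarshalJSON implements json.Unmarshaler: []byte -> Level.
func (l *Level) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // by convention, null is a no-op
	}
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	switch s {
	case "low":
		*l = Low
	case "high":
		*l = High
	default:
		return fmt.Errorf("unknown level %q", s)
	}
	return nil
}

// Config is a hypothetical struct that contains a Level field.
type Config struct {
	Level Level `json:"level"`
}

// encodeConfig marshals c; json.Marshal calls Level.MarshalJSON for the field.
func encodeConfig(c Config) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

// decodeConfig unmarshals s; json.Unmarshal calls (*Level).UnmarshalJSON.
func decodeConfig(s string) (Config, error) {
	var c Config
	err := json.Unmarshal([]byte(s), &c)
	return c, err
}

func main() {
	s, err := encodeConfig(Config{Level: High})
	fmt.Println(s, err) // {"level":"high"} <nil>

	c, err := decodeConfig(`{"level":"low"}`)
	fmt.Println(c.Level == Low, err) // true <nil>
}
```

Neither helper calls the custom methods directly; json.Marshal and json.Unmarshal discover them through the interfaces.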

RawMessage

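RawMessage is a raw encoded JSON value. It implements Marshaler and Unmarshaler and can be used to delay JSON decoding or precompute a JSON encoding.

Concretely, RawMessage is a type defined in the json package. It implements both the Marshaler and the Unmarshaler interface, which is how it supports serialization. Note the description quoted above from the official documentation. Let's look directly at the implementation in the source: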

// RawMessage is a raw encoded JSON value.
// It implements Marshaler and Unmarshaler and can
// be used to delay JSON decoding or precompute a JSON encoding.
type RawMessage []byte

// MarshalJSON returns m as the JSON encoding of m.
func (m RawMessage) MarshalJSON() ([]byte, error) {
    if m == nil {
        return []byte("null"), nil
    }
    return m, nil
}

// UnmarshalJSON sets *m to a copy of data.
func (m *RawMessage) UnmarshalJSON(data []byte) error {
    if m == nil {
        return errors.New("json.RawMessage: UnmarshalJSON on nil pointer")
    }
    *m = append((*m)[0:0], data...)
    return nil
}

var _ Marshaler = (*RawMessage)(nil)
var _ Unmarshaler = (*RawMessage)(nil)

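It is very direct: underneath, RawMessage is just a []byte. When marshaled, it simply returns itself. When unmarshaled, it copies the input []byte into its own memory.

This is interesting. As mentioned in the previous section, the product of serialization is already a []byte, so why define a dedicated RawMessage type at all? What is it for?

RawMessage lives up to its name: it represents a final state. What does that mean? It is already a byte slice, so serializing it requires no conversion; the bytes are simply taken as they are. Deserializing into it is just as simple: it keeps the original bytes.

That is what raw means: whatever it was before is what it is now. It is taken over unchanged.

Here is the classic explanation from the article "Using Go's json.RawMessage":

We can think of the raw message as a piece of information that we decide to ignore at the moment. The information is still there but we choose to keep it in its raw form: a byte array.

In other words, a RawMessage is a piece of information that can be set aside for now and parsed further later. Until then we keep it in its original form, which is still just a byte slice.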

Use Cases

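In software development we often say not to over-engineer. Good code should have a clear use case and solve a class of problems efficiently, rather than being a castle in the air built on assumptions and concepts that have never been validated.
So is RawMessage such a castle in the air? It is not.
We can treat it as a placeholder. Suppose we define a general-purpose model for some business scenario, and part of its data has to map to different structs in different situations. How do we marshal it into bytes, store it in a database, read it back, and restore the model?
We can declare the variable field as json.RawMessage and rely on its ability to hold any JSON value for both reading and writing.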
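
As a rough sketch of the placeholder idea, the following code stores different payload structs in one generic model and restores them. Record, TextData, ImageData, pack and unpack are all hypothetical names chosen for this illustration, and the database is left out: pack produces the bytes that would be stored, and unpack restores a model from them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Record is a hypothetical generic model. Kind is common to every
// record; Data is the part whose structure varies by Kind.
type Record struct {
	Kind string          `json:"kind"`
	Data json.RawMessage `json:"data"`
}

// TextData and ImageData are two hypothetical payload shapes.
type TextData struct {
	Content string `json:"content"`
}

type ImageData struct {
	URL   string `json:"url"`
	Width int    `json:"width"`
}

// pack marshals the payload first, then embeds the resulting bytes
// in a Record as a RawMessage.
func pack(kind string, data any) ([]byte, error) {
	raw, err := json.Marshal(data)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Record{Kind: kind, Data: raw})
}

// unpack decodes the common fields, then picks the concrete payload
// type based on Kind and decodes the raw bytes into it.
func unpack(b []byte) (any, error) {
	var r Record
	if err := json.Unmarshal(b, &r); err != nil {
		return nil, err
	}
	switch r.Kind {
	case "text":
		var d TextData
		err := json.Unmarshal(r.Data, &d)
		return d, err
	case "image":
		var d ImageData
		err := json.Unmarshal(r.Data, &d)
		return d, err
	default:
		return nil, fmt.Errorf("unknown kind %q", r.Kind)
	}
}

func main() {
	b, err := pack("text", TextData{Content: "hi"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(b)) // {"kind":"text","data":{"content":"hi"}}

	v, err := unpack(b)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", v) // {Content:hi}
}
```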

Reusing a Precomputed JSON Value

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

func main() {
    // h holds JSON that was computed ahead of time.
    h := json.RawMessage(`{"precomputed": true}`)

    c := struct {
        Header *json.RawMessage `json:"header"`
        Body   string           `json:"body"`
    }{Header: &h, Body: "hello gophers!"}

    b, err := json.MarshalIndent(&c, "", "\t")
    if err != nil {
        fmt.Println("error:", err)
    }
    os.Stdout.Write(b)
}

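Here c is a struct we define on the spot. Body is known to be a string, while Header is the variable part.

Remember that RawMessage is essentially a []byte, so we can use

json.RawMessage(`{"precomputed": true}`)

to convert a string into a RawMessage. Marshaling it then produces the following output: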

{
   "header": {
       "precomputed": true
   },
   "body": "hello gophers!"
}

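Did you notice?

Here "precomputed": true is exactly the same as the RawMessage we constructed, which corresponds to the first capability: using a precomputed JSON value during serialization.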
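
One detail worth knowing: "precomputed" does not mean "unchecked". As far as the standard library's current behavior goes, json.Marshal still validates and compacts whatever a Marshaler returns, so an invalid RawMessage makes the whole Marshal call fail. The following sketch illustrates this; embed is a hypothetical helper written for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embed places header, unchanged, into a message as a RawMessage
// and returns the marshaled result.
func embed(header string) (string, error) {
	msg := struct {
		Header json.RawMessage `json:"header"`
		Body   string          `json:"body"`
	}{Header: json.RawMessage(header), Body: "hello gophers!"}

	b, err := json.Marshal(msg)
	return string(b), err
}

func main() {
	// Valid JSON is embedded as-is (insignificant whitespace is removed).
	s, err := embed(`{"precomputed": true}`)
	fmt.Println(s, err) // {"header":{"precomputed":true},"body":"hello gophers!"} <nil>

	// Truncated JSON is rejected when the outer value is marshaled.
	_, err = embed(`{"precomputed": tru`)
	fmt.Println(err != nil) // true
}
```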

Delaying the Parsing of a JSON Structure

package main

import (
    "encoding/json"
    "fmt"
    "log"
)

func main() {
    type Color struct {
        Space string
        Point json.RawMessage // delay parsing until we know the color space
    }
    type RGB struct {
        R uint8
        G uint8
        B uint8
    }
    type YCbCr struct {
        Y  uint8
        Cb int8
        Cr int8
    }

    // JSON keys are matched to the exported fields case-insensitively.
    var j = []byte(`[
    {"space": "ycbcr", "point": {"y": 255, "cb": 0, "cr": -10}},
    {"space": "rgb",   "point": {"r": 98, "g": 218, "b": 255}}
]`)
    var colors []Color
    err := json.Unmarshal(j, &colors)
    if err != nil {
        log.Fatalln("error:", err)
    }

    for _, c := range colors {
        var dst any
        switch c.Space {
        case "rgb":
            dst = new(RGB)
        case "ycbcr":
            dst = new(YCbCr)
        }
        err := json.Unmarshal(c.Point, dst)
        if err != nil {
            log.Fatalln("error:", err)
        }
        fmt.Println(c.Space, dst)
    }
}

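This example is even more typical. The Point in a Color may be described by one of two structures, RGB or YCbCr, yet we want both to share the same underlying storage format, which is a very common requirement.
So a two-stage deserialization strategy is used here:

Stage one: parse the common fields, using json.RawMessage to delay parsing of the fields that differ.

Stage two: based on the fields already parsed (usually something with type-like semantics), decide which struct to use for the second deserialization, then unmarshal the json.RawMessage again to obtain the final data.

The example above prints the following: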

ycbcr &{255 0 -10}
rgb &{98 218 255}
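
One possible variation, sketched here as an assumption rather than as something from the official example, is to hide both stages inside a custom UnmarshalJSON so that callers receive fully decoded values from a single json.Unmarshal call. The Point field typed as any, the JSON tags, the error for unknown spaces, and the decodeColors helper are all choices made for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type RGB struct {
	R uint8 `json:"r"`
	G uint8 `json:"g"`
	B uint8 `json:"b"`
}

type YCbCr struct {
	Y  uint8 `json:"y"`
	Cb int8  `json:"cb"`
	Cr int8  `json:"cr"`
}

// Color holds the decoded point as *RGB or *YCbCr.
type Color struct {
	Space string
	Point any
}

// UnmarshalJSON performs both stages: it decodes the common fields
// into an auxiliary struct, then decodes the raw point by Space.
func (c *Color) UnmarshalJSON(data []byte) error {
	// aux is a distinct type, so this call does not recurse.
	var aux struct {
		Space string          `json:"space"`
		Point json.RawMessage `json:"point"`
	}
	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}

	var dst any
	switch aux.Space {
	case "rgb":
		dst = new(RGB)
	case "ycbcr":
		dst = new(YCbCr)
	default:
		return fmt.Errorf("unknown color space %q", aux.Space)
	}
	if err := json.Unmarshal(aux.Point, dst); err != nil {
		return err
	}
	c.Space, c.Point = aux.Space, dst
	return nil
}

// decodeColors is a small helper for this sketch.
func decodeColors(s string) ([]Color, error) {
	var colors []Color
	err := json.Unmarshal([]byte(s), &colors)
	return colors, err
}

func main() {
	colors, err := decodeColors(`[
		{"space": "ycbcr", "point": {"y": 255, "cb": 0, "cr": -10}},
		{"space": "rgb",   "point": {"r": 98, "g": 218, "b": 255}}
	]`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range colors {
		fmt.Println(c.Space, c.Point)
	}
}
```

Unlike a switch without a default branch, this version reports an unknown color space explicitly instead of passing a nil destination to json.Unmarshal.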

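Summary

The RawMessage type provided by the json package directly exposes the underlying []byte as the medium of exchange, and it can be embedded in all kinds of structs. It serves as a placeholder for a field whose type is not fixed, so that parsing can be delayed. Compared with holding the JSON in a string field, it is more efficient, because the bytes are embedded as they are instead of being escaped and quoted as a JSON string. Its implementation is very simple, just a thin wrapper around a byte slice, so you can use it with confidence.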
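
The difference between a string field and a RawMessage field can be seen in a short sketch. This is only one way to read the comparison with string; asString and asRaw are hypothetical helpers written for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// asString stores the payload in a string field. The encoder treats it
// as text, so it is quoted and its inner quotes are escaped.
func asString(payload string) (string, error) {
	b, err := json.Marshal(struct {
		Data string `json:"data"`
	}{Data: payload})
	return string(b), err
}

// asRaw stores the same payload in a RawMessage field. The encoder
// embeds it as a nested JSON value.
func asRaw(payload string) (string, error) {
	b, err := json.Marshal(struct {
		Data json.RawMessage `json:"data"`
	}{Data: json.RawMessage(payload)})
	return string(b), err
}

func main() {
	s, _ := asString(`{"a":1}`)
	fmt.Println(s) // {"data":"{\"a\":1}"}

	r, _ := asRaw(`{"a":1}`)
	fmt.Println(r) // {"data":{"a":1}}
}
```

With the string field, a consumer has to decode twice, first the outer document and then the escaped inner text; with RawMessage the payload is already part of the document.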

