在Go中实现Merkle-tree数据结构 [英] Implementing a Merkle-tree data structure in Go

查看:106
本文介绍了在Go中实现Merkle-tree数据结构的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在Go中实现一个merkle-tree数据结构。基本上,我的最终目标是存储一小组结构化数据(最大10MB),并允许这个数据库与通过网络分发的其他节点轻松同步(参见相关的)。
我已经在Node中有效地实现了这一点,因为没有类型检查。这就是Go的问题,我想使用Go的编译时类型检查,虽然我也想要一个可以与任何提供的树一起使用的库。



<简而言之,我想使用structs作为merkle节点,我想要一个嵌入在所有类型中的一个 Merkle.Update()方法。我试图避免为每个结构编写一个 Update()(虽然我知道这可能是唯一/最好的方式)。



我的想法是使用嵌入式类型:

  // library 
type Merkle struct {
初始化bool
容器接口{} //在例子中这个引用foo
Fields [] reflect.Type
// ...其他merkle状态
}
// Merkle方法...更新()...等...

// userland
type Foo struct {
Merkle
A int
B bool
C string
D map [string] * Bazz
E [] * Bar
}

类型Bazz struct {
Merkle
S int
T int
U int
}

类型Bar struct {
Merkle
X int
Y int
Z int
}

在这个例子中, Foo 将是根,它将包含 Bazz s和 Bar s。这种关系可以通过反映类型来推断。问题是使用情况:

  foo:=& Foo {
A:42,
B :
C:foo,
D:map [string] * Bazz {
b1:& Bazz {},
b2:& Bazz {},
},
E:[] * Bar {
& Bar {},
& Bar {},
& Bar {} ,
},
}

merkle.Init(foo)
foo.Hash //初始哈希=> abc ...

foo.A = 35
foo.E = append(foo.E,& Bar {})

foo.Update()
foo.Hash //更新哈希=>我想我们需要 merkle.Init(foo)...



<
因为 foo.Init()实际上是 foo.Merkle.Init()和将无法反映在 foo 上。未初始化的 Bazz 可以由父级 foo.Update检测并初始化()。一些反思是可以接受的,因为正确性比目前的表现更重要。
另一个问题是,当我们 Update()一个节点时,所有的结构域(子节点)都需要 Update() d(rehashed),因为我们不知道发生了什么变化。我们可以执行 foo.SetInt(A,35)来实现自动更新,但是我们丢失编译时类型检查。



这会被认为是惯用的Go吗?如果没有,怎么可以改善?任何人都可以想到使用简洁的数据集比较(用于通过网络进行有效的增量传输)来将数据集存储在内存中的替代方法(快速读取)?
编辑:还有一个元问题:哪里最好的地方问这样的问题,StackOverflow,Reddit还是坚果?最初发布在 reddit ,没有回答:(

解决方案

有些目标如下:




  • 哈希任何东西 - 通过散列许多东西,使其易于使用

  • 缓存哈希 - 让更新只是重新刷新他们需要

  • 成语 - 适合其他Go代码



我认为你可以按照内置的 encoding / gob encoding / json do,这是三管齐下:如果类型实现它(对于JSON为 MarshalJSON ),请使用特殊方法,使用类型开关作为基本类型,并回到使用反射的讨厌的默认情况,这是一个API草图,它为哈希缓存提供了一个帮助,并允许类型为impl c code code code code code code

$ $ b类型HashVal uint64

const MissingHash HashVal = 0

// Hasher为类型提供了自定义哈希实现。不是
//所有需要实现的,但这样做可以加快
//更新速度。
类型Hasher接口{
Hash()HashVal
}

// HashCacher是缓存哈希值的项目的接口。
//通常嵌入HashCache实现。
类型HashCacher接口{
CachedHash()* HashVal
}

// HashCache实现HashCacher;这意味着嵌入到您的
//结构体中,以使更新哈希树更有效率。
type HashCache struct {
h HashVal
}

// CachedHash实现HashCacher。
func(h * HashCache)CachedHash()* HashVal {
return& hh
}

//哈希返回一些散列,使用缓存哈希或散列()方法如果
//可用。
func Hash(i interface {})HashVal {
如果hashCacher,ok:= i。(HashCacher); ok {
如果缓存:= * hashCacher.CachedHash();缓存!= MissingHash {
return cached
}
}

switch i:= i。(type){
case Hasher:
return i.Hash()
case uint64:
return HashVal(i * 8675309)//或者你知道,使用一个真正的哈希
case [] byte:
// CRC的字节,说
返回0xdeadbeef
默认值:
返回0xdeadbeef
//可怕的缓慢递归案例使用反射
// like:iterate fields using reflect,then hash每个
}

//而不是panic()在这里,你可以生活在一个
//危险的,并声明更改为unhashable
//类型'b无效树
panic(unhashable类型传递给Hash())
}

// Item是Merkle树中的一个节点,必须知道如何找到它的
//父项(根节点应该返回nil),通常应该是
//嵌入HashCache以进行有效的更新。为了避免使用反射,
//项目可能会受益于Hashers。
类型项目接口{
父()项
HashCacher
}

//更新更新i和根之间的项目链,给定
//可能已更改的叶节点。
func更新(i Item){
for i!= nil {
cached:= i.CachedHash()
* cached = MissingHash // invalidate
* cached =哈希(i)
i = i.Parent()
}
}


I'm currently attempting to implement a merkle-tree data structure in Go. Basically, my end goal is to store a small set of structured data (10MB max) and allow this "database" to be easily synchronised with other nodes distributed over the network (see related ). I've implemented this reasonably effectively in Node as there are no type-checks. Herein lies the problem with Go, I'd like to make use of Go's compile-time type checks, though I also want to have one library which works with any provided tree.

In short, I'd like to use structs as merkle nodes and I'd like to have one Merkle.Update() method which is embedded in all types. I'm trying to avoid writing an Update() for every struct (though I'm aware this might be the only/best way).

My idea was to use embedded types:

//library
type Merkle struct {
    Initialised bool
    Container interface{} //in example this references foo
    Fields []reflect.Type
    //... other merkle state
}
//Merkle methods... Update()... etc...

//userland
type Foo struct {
    Merkle
    A int
    B bool
    C string
    D map[string]*Bazz
    E []*Bar
}

type Bazz struct {
    Merkle
    S int
    T int
    U int
}

type Bar struct {
    Merkle
    X int
    Y int
    Z int
}

In this example, Foo will be the root, which will contain Bazzs and Bars. This relationship could be inferred by reflecting on the types. The problem is the usage:

foo := &Foo{
    A: 42,
    B: true,
    C: "foo",
    D: map[string]*Bazz{
        "b1": &Bazz{},
        "b2": &Bazz{},
    },
    E: []*Bar{
        &Bar{},
        &Bar{},
        &Bar{},
    },
}

merkle.Init(foo)
foo.Hash //Initial hash => abc...

foo.A = 35
foo.E = append(foo.E, &Bar{})

foo.Update()
foo.Hash //Updated hash => def...

I think we need to merkle.Init(foo) since foo.Init() would actually be foo.Merkle.Init() and would not be able to reflect on foo. The uninitialised Bars and Bazzs could be detected and initialised by the parent foo.Update(). Some reflection is acceptable as correctness is more important than performance at the moment. Another problem is, when we Update() a node, all struct fields (child nodes) would need to be Update()d as well (rehashed) since we aren't sure what was changed. We could do foo.SetInt("A", 35) to implement an auto-update, though then we lose compile time type-checks.

Would this be considered idiomatic Go? If not, how could this be improved? Can anyone think of an alternative way to store a dataset in memory (for fast reads) with concise dataset comparison (for efficient delta transfers over the network)? Edit: And also a meta-question: Where is the best place to ask this kind of question, StackOverflow, Reddit or go-nuts? Originally posted on reddit with no answer :(

解决方案

Some goals seem like:

  • Hash anything -- make it easy to use by hashing lots of things out of the box
  • Cache hashes -- make updates just rehash what they need to
  • Be idiomatic -- fit in well among other Go code

I think you can attack hashing anything roughly the way that serialization tools like the built-in encoding/gob or encoding/json do, which is three-pronged: use a special method if the type implements it (for JSON that's MarshalJSON), use a type switch for basic types, and fall back to a nasty default case using reflection. Here's an API sketch that provides a helper for hash caching and lets types either implement Hash or not:

package merkle

type HashVal uint64

const MissingHash HashVal = 0

// Hasher provides a custom hash implementation for a type. Not
// everything needs to implement it, but doing so can speed
// updates.
type Hasher interface {
    Hash() HashVal
}

// HashCacher is the interface for items that cache a hash value.
// Normally implemented by embedding HashCache.
type HashCacher interface {
    CachedHash() *HashVal
}

// HashCache implements HashCacher; it's meant to be embedded in your
// structs to make updating hash trees more efficient.
type HashCache struct {
    h HashVal
}

// CachedHash implements HashCacher.
func (h *HashCache) CachedHash() *HashVal {
    return &h.h
}

// Hash returns something's hash, using a cached hash or Hash() method if
// available.
func Hash(i interface{}) HashVal {
    if hashCacher, ok := i.(HashCacher); ok {
        if cached := *hashCacher.CachedHash(); cached != MissingHash {
            return cached
        }
    }

    switch i := i.(type) {
    case Hasher:
        return i.Hash()
    case uint64:
        return HashVal(i * 8675309) // or, you know, use a real hash
    case []byte:
        // CRC the bytes, say
        return 0xdeadbeef
    default:
        return 0xdeadbeef
        // terrible slow recursive case using reflection
        // like: iterate fields using reflect, then hash each
    }

    // instead of panic()ing here, you could live a little
    // dangerously and declare that changes to unhashable
    // types don't invalidate the tree
    panic("unhashable type passed to Hash()")
}

// Item is a node in the Merkle tree, which must know how to find its
// parent Item (the root node should return nil) and should usually
// embed HashCache for efficient updates. To avoid using reflection,
// Items might benefit from being Hashers as well.
type Item interface {
    Parent() Item
    HashCacher
}

// Update updates the chain of items between i and the root, given the
// leaf node that may have been changed.
func Update(i Item) {
    for i != nil {
        cached := i.CachedHash()
        *cached = MissingHash // invalidate
        *cached = Hash(i)
        i = i.Parent()
    }
}

这篇关于在Go中实现Merkle-tree数据结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆