Create spec from data
Question
I am trying to create a spec purely from data. My data structure is very complex: it consists entirely of nested maps.
{:contexts
({:importer.datamodel/global-id "01b4e69f86e5dd1d816e91da27edc08e",
:importer.datamodel/type "province",
:name "a1",
:importer.datamodel/part-of "8cda1baed04b668a167d4ca28e3cef36"}
{:importer.datamodel/global-id "8cda1baed04b668a167d4ca28e3cef36",
:importer.datamodel/type "country",
:name "AAA"}
{:importer.datamodel/global-id "c78e5478e19f2d7c1b02088e53e8d8a4",
:importer.datamodel/type "location",
:importer.datamodel/center ["36." "2."],
:importer.datamodel/part-of "01b4e69f86e5dd1d816e91da27edc08e"}
{:importer.datamodel/global-id "88844f94f79c75acfcb957bb41386149",
:importer.datamodel/type "organisation",
:name "C"}
{:importer.datamodel/global-id "102e96468e5d13058ab85c734aa4a949",
:importer.datamodel/type "organisation",
:name "A"}),
:datasources
({:importer.datamodel/global-id "Source;ACLED",
:name "ACLED",
:url "https://www.acleddata.com"}),
:iois
({:importer.datamodel/global-id "item-set;ACLED",
:importer.datamodel/type "event",
:datasource "Source;ACLED",
:features
({:importer.datamodel/global-id
"c74257292f584502f9be02c98829d9fda532a492e7dd41e06c31bbccc76a7ba0",
:date "1997-01-04",
:fulltext
{:importer.datamodel/global-id "df5c7d6d075df3a7719ebdd39c6d4c7f",
:text "bla"},
:location-meanings
({:importer.datamodel/global-id
"e5611219971164a15f06e07228fb7b51",
:location "8cda1baed04b668a167d4ca28e3cef36",
:contexts (),
:importer.datamodel/type "position"}
{:importer.datamodel/global-id
"af36461d27ec1d8d28fd7f4a70ab7ce2",
:location "c78e5478e19f2d7c1b02088e53e8d8a4",
:contexts (),
:importer.datamodel/type "position"}),
:interaction-name "Violence",
:importer.datamodel/type "description",
:has-contexts
({:context "102e96468e5d13058ab85c734aa4a949",
:context-association-type "actor",
:context-association-name "actor-1",
:priority "none"}
{:context "88844f94f79c75acfcb957bb41386149",
:context-association-type "actor",
:context-association-name "actor-2",
:priority "none"}),
:facts
({:importer.datamodel/global-id
"c46802ce6dcf33ca02ce113ffd9a855e",
:importer.datamodel/type "integer",
:name "fatalities",
:value "16"}),
:attributes
({:name "description",
:importer.datamodel/type "string",
:value "Violence"})}),
:attributes (),
:ioi-slice "per-item"})}
What tool can create the spec for such a structure? I am trying to use this tool: https://github.com/stathissideris/spec-provider, but it gives me:
(spec/def :importer.datamodel/data
(clojure.spec.alpha/coll-of
(clojure.spec.alpha/or
:collection
(clojure.spec.alpha/coll-of
(clojure.spec.alpha/keys
:req
[:importer.datamodel/global-id]
:opt
[:importer.datamodel/center
:importer.datamodel/part-of
:importer.datamodel/type]
:opt-un
[:importer.datamodel/attributes
:importer.datamodel/datasource
:importer.datamodel/features
:importer.datamodel/ioi-slice
:importer.datamodel/name
:importer.datamodel/url]))
:simple
clojure.core/keyword?)))
which is not a complete solution.
I use (sp/pprint-specs (sp/infer-specs data :importer.datamodel/data) 'data 's)
What tool can create the spec for such a structure?
Recommended answer
spec-provider isn't giving you the desired result because your data is a complex nested/recursive structure. Some of those maps would be best spec'd with multi-specs, but spec-provider won't do that; one of the caveats in its docs says: "There is no attempt to infer multi-spec."
The only way to properly spec some of these maps is with multi-specs, because their spec depends on their :importer.datamodel/type value.
First, let's look at the top-level keys (assuming the map is in a binding named data):
(keys data) => (:contexts :datasources :iois)
Create an s/keys spec for the outermost map:
(s/def ::my-map
(s/keys :req-un [::contexts ::datasources ::iois]))
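As a quick check of how :req-un behaves, here's a minimal sketch. The spec name ::top-level-sketch is hypothetical (chosen so it doesn't clash with ::my-map), and no value specs are registered, so only the presence of the keys is checked:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec name, used only for this illustration.
;; The keywords are qualified in the spec, but :req-un matches
;; their unqualified counterparts in the map.
(s/def ::top-level-sketch
  (s/keys :req-un [::contexts ::datasources ::iois]))

;; All three unqualified keys are present.
(s/valid? ::top-level-sketch {:contexts [] :datasources [] :iois []})
;; => true

;; :iois is missing, so the map is rejected.
(s/valid? ::top-level-sketch {:contexts [] :datasources []})
;; => false
```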
The top-level keys are unqualified, but we must use qualified keywords with :req-un to spec them. We can use the REPL to look at the shapes of the nested maps and their relationship to :importer.datamodel/type by walking the nested structure and collecting data:
(require 'clojure.walk)

;; Collect every distinct [type keys] pair seen in any nested map.
(let [keysets (atom #{})]
  (clojure.walk/postwalk
    (fn [v]
      (when (map? v)
        (swap! keysets conj [(:importer.datamodel/type v) (keys v)]))
      v)
    data)
  @keysets)
=>
#{...
["organisation" (:importer.datamodel/global-id :importer.datamodel/type :name)]
[nil (:context :context-association-type :context-association-name :priority)]
["description"
(:importer.datamodel/global-id :date :fulltext :location-meanings
:interaction-name :importer.datamodel/type :has-contexts :facts :attributes)]
["event" (:importer.datamodel/global-id :importer.datamodel/type :datasource :features :attributes :ioi-slice)]
...}
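If several maps share a type but differ in their keys, it can help to see which keys always appear and which only sometimes do. Here's a rough sketch of that idea; keys-by-type is a hypothetical helper of mine, not part of spec or spec-provider:

```clojure
(require '[clojure.set :as set]
         '[clojure.walk :as walk])

(defn keys-by-type
  "Hypothetical helper: walks `data` and returns a map of
   type -> {:always #{...} :sometimes #{...}}, where :always holds the
   keys present in every map of that type and :sometimes the rest."
  [data]
  (let [seen (atom {})]
    (walk/postwalk
     (fn [v]
       (when (map? v)
         ;; Record the key set of each map under its type (nil if untyped).
         (swap! seen update (:importer.datamodel/type v)
                (fnil conj []) (set (keys v))))
       v)
     data)
    (into {}
          (map (fn [[t keysets]]
                 (let [always (apply set/intersection keysets)
                       all    (apply set/union keysets)]
                   [t {:always    always
                       :sometimes (set/difference all always)}])))
          @seen)))

(def sample
  {:contexts [{:importer.datamodel/type "organisation" :name "A"}
              {:importer.datamodel/type "organisation" :name "B" :url "x"}]})

(get (keys-by-type sample) "organisation")
;; => {:always #{:importer.datamodel/type :name}, :sometimes #{:url}}
```

The :always keys are candidates for :req/:req-un and the :sometimes keys for :opt/:opt-un, but that is only as reliable as the sample data.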
(An upcoming spec alpha should make it easier to define specs programmatically from this data.)
We can see that some map shapes don't have a :importer.datamodel/type, but we can write multi-specs for the ones that do. First, define a multimethod that dispatches on the type key:
(defmulti type-spec :importer.datamodel/type)
Then write a defmethod for each :importer.datamodel/type value. Here are a few examples:
(defmethod type-spec :default [_] (s/keys))
(defmethod type-spec "organisation" [_]
(s/keys :req [:importer.datamodel/global-id]
:req-un [::name]))
(defmethod type-spec "description" [_]
(s/keys :req [:importer.datamodel/global-id]
:req-un [::date ::fulltext ::location-meanings ::interaction-name
::has-contexts ::facts ::attributes]))
(defmethod type-spec "event" [_]
(s/keys :req-un [::features]))
Then define the s/multi-spec:
(s/def ::datamodel
(s/multi-spec type-spec :importer.datamodel/type))
Now any map we conform to ::datamodel will resolve a spec based on its :importer.datamodel/type value. We can assign that spec to the keywords spec will use to conform the maps, e.g. one of the outermost keys:
(s/def ::contexts (s/coll-of ::datamodel))
Now if you remove a required key from one of the maps we spec'd under :contexts, spec can tell you what's wrong. For example, after removing the :name key from an "organisation" map:
(s/explain ::my-map data)
In: [:contexts 3]
val: #:importer.datamodel{:global-id "88844f94f79c75acfcb957bb41386149",
:type "organisation"}
fails spec: :playground.so/datamodel
at: [:contexts "organisation"]
predicate: (contains? % :name)
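To try the dispatch in isolation, here's a small self-contained sketch. The string? value specs for :importer.datamodel/global-id and ::name are my assumptions about the data, and only the "organisation" and :default methods are included:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumed value specs; the sample data only shows strings here.
(s/def :importer.datamodel/global-id string?)
(s/def ::name string?)

(defmulti type-spec :importer.datamodel/type)
;; Any type without its own method falls back to a bare (s/keys).
(defmethod type-spec :default [_] (s/keys))
(defmethod type-spec "organisation" [_]
  (s/keys :req [:importer.datamodel/global-id]
          :req-un [::name]))

(s/def ::datamodel
  (s/multi-spec type-spec :importer.datamodel/type))

(def good-org
  {:importer.datamodel/global-id "88844f94f79c75acfcb957bb41386149"
   :importer.datamodel/type      "organisation"
   :name                         "C"})

(s/valid? ::datamodel good-org)                 ; => true
(s/valid? ::datamodel (dissoc good-org :name))  ; => false

;; explain-data returns the problems as data instead of printing them;
;; it returns nil when the value is valid.
(s/explain-data ::datamodel (dissoc good-org :name))
```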
Other specs
For the maps that don't have a :importer.datamodel/type, you should be able to define a key spec. For example, the nested :has-contexts key holds a collection of maps without a :importer.datamodel/type, but if we can assume they'll all be similar, we can write this spec:
(s/def ::has-contexts
(s/coll-of (s/keys :req-un [::context ::context-association-type
::context-association-name ::priority])))
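The s/keys spec above only checks that the keys are present. If you also want the values checked, you can register specs for the individual keys. Here's a sketch that assumes every value is a string, which is all the sample data shows:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumption: all four values are strings, as in the sample data.
(s/def ::context string?)
(s/def ::context-association-type string?)
(s/def ::context-association-name string?)
(s/def ::priority string?)

(s/def ::has-contexts
  (s/coll-of (s/keys :req-un [::context ::context-association-type
                              ::context-association-name ::priority])))

(def sample-contexts
  '({:context "102e96468e5d13058ab85c734aa4a949"
     :context-association-type "actor"
     :context-association-name "actor-1"
     :priority "none"}))

(s/valid? ::has-contexts sample-contexts)
;; => true

;; A non-string :context now fails, because ::context is registered.
(s/valid? ::has-contexts (map #(assoc % :context 42) sample-contexts))
;; => false
```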
:has-contexts sits in a map we've already covered with the multi-spec above, and simply registering a spec for this key will make spec conform its values. The outermost key that contains this spec is :iois, so we can spec that key too:
(s/def ::iois (s/coll-of ::datamodel))
Now, conforming an input to the ::my-map spec will automatically cover more of the data.
What tool can create the spec for such a structure?
As you can see, writing a full spec for this structure is non-trivial but possible. I don't know of any existing tool that could automatically infer a complete, "correct" spec for this structure. It would have had to intuit that :importer.datamodel/type is a key that could be used to dispatch to different s/keys specs, and it would still be making a potentially invalid assumption. I think tool-assisted spec generation is more realistic and practical in this case.
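As a rough illustration of what "tool-assisted" could look like, here's a sketch that turns an observed [type keys] pair into a defmethod form you can paste and then edit by hand. The helpers keys-form and defmethod-form and the namespace "my.specs" are hypothetical, and the sketch assumes every observed key is required, which may well be wrong for your data:

```clojure
(defn keys-form
  "Hypothetical helper: builds an (s/keys ...) form from observed keys.
   Qualified keys go under :req; unqualified keys go under :req-un,
   re-qualified into `spec-ns`."
  [spec-ns ks]
  (let [{qualified true unqualified false} (group-by (comp some? namespace) ks)
        req    (vec (sort qualified))
        req-un (vec (sort (map #(keyword spec-ns (name %)) unqualified)))]
    (apply list
           (concat ['s/keys]
                   (when (seq req) [:req req])
                   (when (seq req-un) [:req-un req-un])))))

(defn defmethod-form
  "Hypothetical helper: builds a (defmethod type-spec ...) form for one
   [type-value key-collection] pair."
  [spec-ns [type-val ks]]
  (list 'defmethod 'type-spec type-val '[_] (keys-form spec-ns ks)))

(defmethod-form "my.specs"
  ["organisation" [:importer.datamodel/global-id
                   :importer.datamodel/type
                   :name]])
;; => (defmethod type-spec "organisation" [_]
;;      (s/keys :req [:importer.datamodel/global-id :importer.datamodel/type]
;;              :req-un [:my.specs/name]))
```

The output is plain data, so you still decide which keys are really required and which predicates the values need.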