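How to set up local file references in python-jsonschema document?

I have a set of jsonschema-compliant documents. Some documents contain references to other documents (via the $ref attribute). I do not want to host these documents so that they are accessible at an HTTP URI. As such, all references are relative. All documents live in a local folder structure.

How can I make python-jsonschema use my local file system to load the referenced documents?


For instance, suppose I have a document named defs.json that contains some definitions, and I try to load a different document that references it, like: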

{
  "allOf": [
    {"$ref":"defs.json#/definitions/basic_event"},
    {
      "type": "object",
      "properties": {
        "action": {
          "type": "string",
          "enum": ["page_load"]
        }
      },
      "required": ["action"]
    }
  ]
}

I get the error RefResolutionError: <urlopen error [Errno 2] No such file or directory: '/defs.json'>

It may be relevant that I'm on a Linux box.


(I'm writing this as a Q&A because I had a hard time figuring this out and observed other folks having trouble too.)

Source: Stack Overflow, "How to set up local file references in python-jsonschema document?"

Answers:


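I had the hardest time figuring out how to resolve against a set of schemas that $ref each other (I am new to JSON Schema). It turns out the key is to create the RefResolver with a store, which is a dict that maps from URL to schema.

```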
import json

from jsonschema import RefResolver, Draft7Validator

base = """
{
"$id": "base.schema.json",
"type": "object",
"properties": {
"prop": {
"type": "string"
}
},
"required": ["prop"]
}
"""

extend = """
{
"$id": "extend.schema.json",
"allOf": [
{"$ref": "base.schema.json#"},
{
"properties": {
"extra": {
"type": "boolean"
}
},
"required": ["extra"]
}
]
}
"""

extend_extend = """
{
"$id": "extend_extend.schema.json",
"allOf": [
{"$ref": "extend.schema.json#"},
{
"properties": {
"extra2": {
"type": "boolean"
}
},
"required": ["extra2"]
}
]
}
"""

data = """
{
"prop": "This is the property string",
"extra": true,
"extra2": false
}
"""

schema = json.loads(base)
extendedSchema = json.loads(extend)
extendedExtendSchema = json.loads(extend_extend)
schema_store = {
schema['$id'] : schema,
extendedSchema['$id'] : extendedSchema,
extendedExtendSchema['$id'] : extendedExtendSchema,
}

resolver = RefResolver.from_schema(schema, store=schema_store)
validator = Draft7Validator(extendedExtendSchema, resolver=resolver)

jsonData = json.loads(data)
validator.validate(jsonData)



```

The above was built with `jsonschema==3.2.0`.
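
If the store is missing a schema, the problem typically only surfaces at validation time as a RefResolutionError. A minimal sketch of a pre-flight check using only the standard library follows. `iter_refs` and `find_unresolved_refs` are hypothetical helper names (not part of jsonschema), and the check assumes every `$ref` is written exactly like the store key it targets, as in the example above.

```python
from urllib.parse import urldefrag


def iter_refs(node):
    """Yield every "$ref" string found anywhere inside a parsed schema."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            yield ref
        for value in node.values():
            yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def find_unresolved_refs(schema_store):
    """Return {store key: [refs whose document part is not a store key]}."""
    missing = {}
    for schema_id, schema in schema_store.items():
        for ref in iter_refs(schema):
            document, _fragment = urldefrag(ref)
            # An empty document part ("#/definitions/x") points into the same schema.
            if document and document not in schema_store:
                missing.setdefault(schema_id, []).append(ref)
    return missing
```

Calling `find_unresolved_refs(schema_store)` on the store built above should return an empty dict, since every referenced document is present.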

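You must build a custom jsonschema.RefResolver for each schema that uses a relative reference, and ensure that your resolver knows where on the filesystem the given schema lives.

Such as...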

import json
from jsonschema import Draft4Validator, RefResolver  # We prefer Draft7, but jsonschema 3.0 is still in alpha as of this writing

abs_path_to_schema = '/path/to/schema-doc-foobar.json'
with open(abs_path_to_schema, 'r') as fp:
  schema = json.load(fp)

resolver = RefResolver(
  # The key part is here where we build a custom RefResolver
  # and tell it where *this* schema lives in the filesystem.
  # RefResolver's first two parameters are named `base_uri` and `referrer`.
  # Note that `file:` is for unix systems
  base_uri='file:{}'.format(abs_path_to_schema),
  referrer=schema
)
Draft4Validator.check_schema(schema)  # Unnecessary but a good idea
validator = Draft4Validator(schema, resolver=resolver, format_checker=None)

# Then you can...
data_to_validate = {...}  # placeholder for the instance to validate
validator.validate(data_to_validate)
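
Why the base URI matters can be illustrated with the standard library alone, since resolving a relative `$ref` is ordinary URI joining. The sketch below is not jsonschema's own code. The directory layout and the `resolve_ref` helper are hypothetical, chosen only to show where a relative reference ends up.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical location of the schema that contains the relative "$ref".
# pathlib.Path(...).resolve().as_uri() produces a URI of this form.
base_uri = "file:///home/user/schemas/events/page_load.json"


def resolve_ref(base_uri, ref):
    """Join a "$ref" onto a base URI and split off the JSON-pointer fragment."""
    document_uri, fragment = urldefrag(urljoin(base_uri, ref))
    return document_uri, fragment


# A sibling file resolves into the same directory as the referring schema.
print(resolve_ref(base_uri, "defs.json#/definitions/basic_event"))
# "../" segments are honoured as well.
print(resolve_ref(base_uri, "../common/defs.json#/definitions/basic_event"))
```

Without a base URI that names the schema's real location, there is no directory for `defs.json` to be resolved against.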

EDIT-1

Fixed a wrong reference ($ref) to the base schema. Updated the example to use the one from the docs: https://json-schema.org/understanding-json-schema/structuring.html

EDIT-2

As pointed out in the comments, the code below uses the following imports:

from jsonschema import validate, RefResolver 
from jsonschema.validators import validator_for

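This is just another version of @Daniel's answer, which was the correct one for me. Basically, I decided to define the $schema in a base schema. That frees the other schemas from declaring it and makes for a clear call when instantiating the resolver.

  • It was not clear to me, given that RefResolver.from_schema() takes (1) some schema and (2) a schema store, whether the order matters and which "some" schema should be passed. Hence the structure you see below.

I have the following: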

base.schema.json

{
  "$schema": "http://json-schema.org/draft-07/schema#"
}

definitions.schema.json

{
  "type": "object",
  "properties": {
    "street_address": { "type": "string" },
    "city":           { "type": "string" },
    "state":          { "type": "string" }
  },
  "required": ["street_address", "city", "state"]
}

address.schema.json

{
  "type": "object",

  "properties": {
    "billing_address": { "$ref": "definitions.schema.json#" },
    "shipping_address": { "$ref": "definitions.schema.json#" }
  }
}

I like this setup for two reasons:

  1. It makes for a cleaner call to RefResolver.from_schema():

    base = json.loads(open('base.schema.json').read())
    definitions = json.loads(open('definitions.schema.json').read())
    schema = json.loads(open('address.schema.json').read())
    
    schema_store = {
      base.get('$id','base.schema.json') : base,
      definitions.get('$id','definitions.schema.json') : definitions,
      schema.get('$id','address.schema.json') : schema,
    }
    
    resolver = RefResolver.from_schema(base, store=schema_store)
    
  2. I then benefit from the handy tool the library provides, validator_for, which gives you the best validator for your schema (according to its $schema key):

    Validator = validator_for(base)
    
  3. Then put them together to instantiate the validator:

    validator = Validator(schema, resolver=resolver)
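
The three `open(...).read()` calls and the `$id` fallback can be folded into a small helper that also closes each file. This is a sketch only, assuming UTF-8 schema files. `store_entry` and `build_store` are hypothetical names, not jsonschema functions.

```python
import json
from pathlib import Path


def store_entry(path):
    """Load one schema file and return (key, schema).

    The key is the schema's "$id" if present, otherwise the bare file name,
    mirroring the `.get('$id', ...)` fallback used above.
    """
    path = Path(path)
    with path.open(encoding="utf-8") as fp:
        schema = json.load(fp)
    return schema.get("$id", path.name), schema


def build_store(*paths):
    """Build a schema store dict from any number of schema file paths."""
    return dict(store_entry(path) for path in paths)
```

With the three files above, `build_store('base.schema.json', 'definitions.schema.json', 'address.schema.json')` would yield a dict usable as the `store=` argument.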
    

Finally, you validate your data:

data = {
  "shipping_address": {
    "street_address": "1600 Pennsylvania Avenue NW",
    "city": "Washington",
    "state": "DC"   
  },
  "billing_address": {
    "street_address": "1st Street SE",
    "city": "Washington",
    "state": 32
  }
}
  • This one fails validation (it raises a ValidationError) because "state" is 32:

    
    >>> validator.validate(data)

ValidationError: 32 is not of type 'string'

Failed validating 'type' in schema['properties']['billing_address']['properties']['state']:
{'type': 'string'}

On instance['billing_address']['state']:
32



Change that value to "DC" and the data will validate.
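
Calling `validate()` on a validator raises a single ValidationError. To list every failure in one pass, the validator's `iter_errors()` method can be used instead. A minimal sketch follows. `describe_errors` is a hypothetical helper that relies only on `iter_errors()` and on the `absolute_path` and `message` attributes of jsonschema's ValidationError.

```python
def describe_errors(validator, instance):
    """Return one "location: message" line per validation error, sorted by location."""
    errors = sorted(
        validator.iter_errors(instance),
        # Path elements may be strings (keys) or ints (array indexes),
        # so compare them as strings to keep the sort well defined.
        key=lambda error: [str(part) for part in error.absolute_path],
    )
    lines = []
    for error in errors:
        location = "/".join(str(part) for part in error.absolute_path) or "<root>"
        lines.append(f"{location}: {error.message}")
    return lines
```

For the failing data above, `describe_errors(validator, data)` would be expected to report the `billing_address/state` location together with the same message shown in the traceback.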

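Following up on the answer @chris-w provided, I wanted to do the same thing with jsonschema 3.2.0, but his answer didn't quite cover it. I hope this answer helps those who still come to this question for help but are using a more recent version of the package.

To extend a JSON schema using the library, do the following:

  1. Create the base schema: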

    
    base.schema.json
    {
    "$id": "base.schema.json",
    "type": "object",
    "properties": {
    "prop": {
      "type": "string"
    }
    },
    "required": ["prop"]
    }

2. Create the extension schema:

extend.schema.json
{
"allOf": [
{"$ref": "base.schema.json"},
{
"properties": {
"extra": {
"type": "boolean"
}
},
"required": ["extra"]
}
]
}


3. Create the JSON file you want to test against the schema:

data.json
{
"prop": "This is the property",
"extra": true
}


4. Create your RefResolver and Validator for the base schema and use them to check the data:

import json
from jsonschema import Draft7Validator, RefResolver

# baseSchemaJSON, relativeJSON and dataJSON hold the text of the three files above

# Set up schema, resolver, and validator on the base schema
baseSchema = json.loads(baseSchemaJSON) # Create a schema dictionary from the base JSON file
relativeSchema = json.loads(relativeJSON) # Create a schema dictionary from the relative JSON file
resolver = RefResolver.from_schema(baseSchema) # Creates your resolver, uses the "$id" element
validator = Draft7Validator(relativeSchema, resolver=resolver) # Create a validator against the extended schema (but resolving to the base schema!)

# Check validation!
data = json.loads(dataJSON) # Create a dictionary from the data JSON file
validator.validate(data)


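You may need to make a few adjustments to the entries above, such as not using the Draft7Validator. This should work for single-level references (children extending a base); you will need to be careful with your schemas and with how you set up the `RefResolver` and `Validator` objects.

P.S. Here is a snippet that exercises the above. Try modifying the `data` string to remove one of the required attributes: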

import json

from jsonschema import RefResolver, Draft7Validator

base = """
{
"$id": "base.schema.json",
"type": "object",
"properties": {
"prop": {
"type": "string"
}
},
"required": ["prop"]
}
"""

extend = """
{
"allOf": [
{"$ref": "base.schema.json"},
{
"properties": {
"extra": {
"type": "boolean"
}
},
"required": ["extra"]
}
]
}
"""

data = """
{
"prop": "This is the property string",
"extra": true
}
"""

schema = json.loads(base)
extendedSchema = json.loads(extend)
resolver = RefResolver.from_schema(schema)
validator = Draft7Validator(extendedSchema, resolver=resolver)

jsonData = json.loads(data)
validator.validate(jsonData)


My approach is to preload all schema fragments into the RefResolver cache. I created a gist that illustrates this: https://gist.github.com/mrtj/d59812a981da17fbaa67b7de98ac3d4b


This is what I have used to dynamically generate a schema_store from all the schemas in a given directory:

base.schema.json

{
  "$id": "base.schema.json",
  "type": "object",
  "properties": {
    "prop": {
      "type": "string"
    }
  },
  "required": ["prop"]
}

extend.schema.json

{  
  "$id": "extend.schema.json",
  "allOf": [
    {"$ref": "base.schema.json"},
    {
      "properties": {
        "extra": {
          "type": "boolean"
        }
      },
    "required": ["extra"]
    }
  ]
}

instance.json

{
  "prop": "This is the property string",
  "extra": true
}

validator.py

import json

from pathlib import Path

from jsonschema import Draft7Validator, RefResolver
from jsonschema.exceptions import RefResolutionError

schemas = (json.load(open(source)) for source in Path("schema/dir").iterdir())
schema_store = {schema["$id"]: schema for schema in schemas}

schema = json.load(open("schema/dir/extend.schema.json"))
instance = json.load(open("instance/dir/instance.json"))
resolver = RefResolver.from_schema(schema, store=schema_store)
validator = Draft7Validator(schema, resolver=resolver)

try:
    errors = sorted(validator.iter_errors(instance), key=lambda e: e.path)
except RefResolutionError as e:
    print(e)
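
The generator expression in validator.py opens every entry in the directory, so a subdirectory or a stray non-JSON file raises an error, and a schema without "$id" raises KeyError. A slightly more defensive loader might look like the sketch below. `load_schema_store` is a hypothetical helper name, and skipping schemas that lack "$id" is an assumption about the desired behaviour rather than something the answer specifies.

```python
import json
from pathlib import Path


def load_schema_store(schema_dir):
    """Map "$id" -> parsed schema for every *.json file directly inside schema_dir."""
    store = {}
    for source in sorted(Path(schema_dir).glob("*.json")):
        with source.open(encoding="utf-8") as fp:
            schema = json.load(fp)
        schema_id = schema.get("$id") if isinstance(schema, dict) else None
        if schema_id is None:
            continue  # no "$id", so there is nothing to key this schema by
        if schema_id in store:
            # Two files claiming the same "$id" would silently shadow each other.
            raise ValueError(f"duplicate $id {schema_id!r} in {source}")
        store[schema_id] = schema
    return store
```

The returned dict can be passed as `store=` to `RefResolver.from_schema(...)` in the same way as `schema_store` in validator.py.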