Overview

Documentation for version: v1.0b2

Note

These docs refer to Version 1 of pydantic, which is as yet unreleased. v0.32 docs are available here.

Data validation and settings management using python type hinting.

Define how data should be in pure, canonical python; validate it with pydantic.

PEP 484 introduced type hinting in python 3.5; PEP 526 extended it with syntax for variable annotations in python 3.6.

pydantic uses those annotations to validate that untrusted data takes the form you want.

There's also support for an extension to dataclasses where the input data is validated.
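As a minimal sketch of the dataclass support, assuming the decorator is importable from pydantic.dataclasses (the Visitor class and its fields are illustrative names, not part of pydantic):

```python
from datetime import datetime
from typing import Optional

# Assumption: pydantic's drop-in replacement for dataclasses.dataclass
from pydantic.dataclasses import dataclass


@dataclass
class Visitor:  # hypothetical example class
    id: int
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None


# Input values are validated and coerced when the instance is created
visitor = Visitor(id='42', signup_ts='2019-06-01 12:22')
print(visitor.id)
#> 42
print(repr(visitor.signup_ts))
#> datetime.datetime(2019, 6, 1, 12, 22)
```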

Example

from datetime import datetime
from typing import List
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None
    friends: List[int] = []

external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, '2', 3.1415]
}
user = User(**external_data)
print(user.id)
#> 123
print(repr(user.signup_ts))
#> datetime.datetime(2019, 6, 1, 12, 22)
print(user.friends)
#> [1, 2, 3]
print(user.dict())
#> {
#>     'id': 123,
#>     'signup_ts': datetime.datetime(2019, 6, 1, 12, 22),
#>     'friends': [1, 2, 3],
#>     'name': 'John Doe'
#> }
print(user.json())
#> {"id": 123, "signup_ts": "2019-06-01T12:22:00", ...

(This script is complete; it should run "as is".)

What's going on here:

  • id is of type int; the annotation-only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible; otherwise an exception is raised.
  • name is inferred as a string from the default; it is not required because it has a default.
  • signup_ts is a datetime field which is not required (it is None if not supplied). pydantic will process either a unix timestamp int (e.g. 1496498400) or a string representing the date and time.
  • friends uses python's typing system; its value must be a list of integers. As with id, integer-like objects will be converted to integers.
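The points above can be sketched as follows. This is an illustrative variant of the User model with explicit annotations on every field (an assumption made here to keep the sketch self-contained), not a replacement for the example above:

```python
from datetime import datetime, timezone
from typing import List, Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int                                # required: no default
    name: str = 'John Doe'                 # optional: has a default
    signup_ts: Optional[datetime] = None   # optional: None if not supplied
    friends: List[int] = []


# A unix timestamp int is accepted for a datetime field
from_timestamp = User(id='123', signup_ts=1496498400)
print(from_timestamp.signup_ts == datetime.fromtimestamp(1496498400, tz=timezone.utc))

# So is a string representing the date and time
from_string = User(id=123, signup_ts='2019-06-01 12:22')
print(repr(from_string.signup_ts))
#> datetime.datetime(2019, 6, 1, 12, 22)

# Fields with defaults can be omitted entirely
defaults = User(id=1)
print(defaults.name, defaults.signup_ts, defaults.friends)
#> John Doe None []
```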

If validation fails, pydantic will raise an error with a breakdown of what was wrong:

from pydantic import ValidationError

try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    print(e.json())

outputs:

[
  {
    "loc": [
      "id"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": [
      "signup_ts"
    ],
    "msg": "invalid datetime format",
    "type": "type_error.datetime"
  },
  {
    "loc": [
      "friends",
      2
    ],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  }
]
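The same breakdown is also available as python objects. A minimal sketch, assuming ValidationError exposes an errors() method returning one dict per problem with the loc, msg and type keys shown in the JSON above (the User model is repeated with explicit annotations so the snippet stands alone):

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []


failed_fields = []
try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    for error in e.errors():
        # 'loc' is the path to the offending value, e.g. ('friends', 2)
        failed_fields.append(error['loc'][0])
        print(error['loc'], error['msg'])

print(sorted(failed_fields))
#> ['friends', 'id', 'signup_ts']
```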

Rationale

So pydantic uses some cool new language features, but why should I actually go and use it?

no brainfuck
no new schema definition micro-language to learn. If you know python (and perhaps skim the type hinting docs) you know how to use pydantic.
plays nicely with your IDE/linter/brain
pydantic data structures are just instances of classes you define, so auto-completion, linting, mypy, IDEs (especially PyCharm) and your intuition should all work properly with your validated data.
dual use
pydantic's BaseSettings class allows it to be used in both a "validate this request data" context and a "load my system settings" context. The main difference is that system settings can have defaults changed by environment variables, and more complex objects like DSNs and python objects are often required.
fast
In benchmarks pydantic is faster than all other tested libraries.
validate complex structures
recursive pydantic models, typing's standard types (e.g. List, Tuple, Dict) and validators allow complex data schemas to be clearly and easily defined, then validated and parsed.
extensible
pydantic allows custom data types to be defined, or you can extend validation with methods on a model decorated with the validator decorator.
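To make the last two points concrete, here is a hedged sketch combining a nested model with a method decorated with validator. The Address and Customer classes, their fields and the postcode rule are hypothetical names chosen for illustration:

```python
from typing import List

from pydantic import BaseModel, ValidationError, validator


class Address(BaseModel):  # hypothetical nested model
    city: str
    postcode: str

    @validator('postcode')
    def postcode_must_not_be_blank(cls, value):
        # Runs after the standard str validation for this field
        if not value.strip():
            raise ValueError('postcode must not be blank')
        return value.strip()


class Customer(BaseModel):  # hypothetical outer model
    id: int
    addresses: List[Address] = []


# Nested dicts are parsed into Address instances
customer = Customer(id='7', addresses=[{'city': 'Leeds', 'postcode': ' LS1 4AP '}])
print(customer.addresses[0].postcode)
#> LS1 4AP

error_locations = []
try:
    Customer(id=7, addresses=[{'city': 'Leeds', 'postcode': '   '}])
except ValidationError as e:
    # Each location is the path through the nested structure
    error_locations = [tuple(error['loc']) for error in e.errors()]

print(error_locations)
#> [('addresses', 0, 'postcode')]
```

The validator returns the cleaned value, so it both checks and normalises the input; raising ValueError is what turns a failed check into an entry in the error breakdown.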

Using Pydantic

Hundreds of organisations and packages are using pydantic, including:

FastAPI
a high performance API framework, easy to learn, fast to code and ready for production, based on pydantic and Starlette.
Project Jupyter
developers of the Jupyter notebook are using pydantic for subprojects.
Microsoft
are using pydantic (via FastAPI) for numerous services, some of which are "getting integrated into the core Windows product and some Office products."
Amazon Web Services
are using pydantic in gluon-ts, an open-source probabilistic time series modeling library.
The NSA
are using pydantic in WALKOFF, an open-source automation framework.
Uber
are using pydantic in Ludwig, an open-source TensorFlow wrapper.
Cuenca
a Mexican neobank that uses pydantic for several internal tools (including API validation) and for open source projects like stpmex, which is used to process real-time, 24/7, inter-bank transfers in Mexico.

For a more comprehensive list of open-source projects using pydantic, see dependents on GitHub.