Models

The primary means of defining objects in pydantic is via models (models are simply classes which inherit from BaseModel).

You can think of models as similar to types in strictly typed languages or the requirements of a single endpoint in an API.

Untrusted data can be passed to a model; after parsing and validation, pydantic guarantees that the fields of the resultant model instance will conform to the field types defined on the model.

Note

pydantic is primarily a parsing library, not a validation library. Validation is a means to an end - building a model which conforms to the types and constraints provided.

In other words, pydantic guarantees the types and constraints of the output model, not the input data.

This might sound like an esoteric distinction, but it is not - you should read about Data Conversion if you're unsure what this means or how it might affect your usage.

Basic model usage🔗

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'Jane Doe'

User here is a model with two fields: id, which is an integer and is required, and name, which is a string and is not required (it has a default value). The type of name is inferred from the default value, so a type annotation is not required (however, note this warning about field order when some fields do not have type annotations).

user = User(id='123')

user here is an instance of User. Initialisation of the object will perform all parsing and validation; if no ValidationError is raised, you know the resulting model instance is valid.

assert user.id == 123

Fields of a model can be accessed as normal attributes of the user object. The string '123' has been cast to an int as per the field type.

assert user.name == 'Jane Doe'

name wasn't set when user was initialised, so it has the default value

assert user.__fields_set__ == {'id'}

the fields which were supplied when user was initialised

assert user.dict() == dict(user) == {'id': 123, 'name': 'Jane Doe'}

Either .dict() or dict(user) will provide a dict of fields, but .dict() can also take numerous other arguments.
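For example, .dict() accepts include and exclude arguments to select which fields are exported. A small self-contained sketch (these are real .dict() keyword arguments in pydantic v1; the model mirrors the User example above, with name annotated explicitly):

```python
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'Jane Doe'

user = User(id=123)

# include/exclude take sets of field names
assert user.dict(include={'id'}) == {'id': 123}
assert user.dict(exclude={'id'}) == {'name': 'Jane Doe'}
```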

user.id = 321
assert user.id == 321

This model is mutable, so field values can be changed.

Model properties🔗

The example above only shows the tip of the iceberg of what models can do. Models possess the following methods and attributes:

dict()
returns a dictionary of the model's fields and values, see exporting models for more details
json()
returns a JSON string representation of dict(), see exporting models for more details
copy()
returns a deep copy of the model, see exporting models for more details
parse_obj()
utility for loading any object into a model with error handling if the object is not a dictionary, see helper function below
parse_raw()
utility for loading strings of numerous formats, see helper function below
parse_file()
like parse_raw() but for files, see helper function below
from_orm()
for loading data from arbitrary classes, see ORM mode below
schema()
returns a dictionary representing the model as JSON Schema, see Schema
schema_json()
returns a JSON string representation of schema(), see Schema
__fields__
a dictionary of the model class's fields
__config__
the configuration class for this model, see model config
__fields_set__
Set of names of fields which were set when the model instance was initialised
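A quick self-contained sketch exercising a few of these methods (using a model like the User example above, with name annotated explicitly):

```python
import json
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'Jane Doe'

user = User(id=123)

# dict() and json() export the field values
assert user.dict() == {'id': 123, 'name': 'Jane Doe'}
assert json.loads(user.json()) == {'id': 123, 'name': 'Jane Doe'}

# copy() returns a new instance with the same field values
clone = user.copy()
assert clone.dict() == user.dict() and clone is not user

# schema() describes the model as JSON Schema
assert user.schema()['title'] == 'User'
```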

Recursive Models🔗

More complex hierarchical data structures can be defined using models as types in annotations themselves.

from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    count: int
    size: float = None

class Bar(BaseModel):
    apple = 'x'
    banana = 'y'

class Spam(BaseModel):
    foo: Foo
    bars: List[Bar]

m = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])
print(m)
#> Spam foo=<Foo count=4 size=None>
#>      bars=[<Bar apple='x1' banana='y'>, <Bar apple='x2' banana='y'>]
print(m.dict())
#> {
#>     'foo': {'count': 4, 'size': None},
#>     'bars': [
#>         {'apple': 'x1', 'banana': 'y'},
#>         {'apple': 'x2', 'banana': 'y'}
#>     ]
#> }

(This script is complete, it should run "as is")

For self-referencing models, see postponed annotations.
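As a brief preview of that section, a self-referencing model uses a string forward reference for its own name, which is then resolved with update_forward_refs():

```python
from typing import List
from pydantic import BaseModel

class Node(BaseModel):
    value: int
    # forward reference to the class currently being defined
    children: List['Node'] = []

# resolve the 'Node' forward reference now that the class exists
Node.update_forward_refs()

tree = Node(value=1, children=[{'value': 2}])
assert tree.children[0].value == 2
```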

ORM Mode (aka Arbitrary Class Instances)🔗

Pydantic models can be created from arbitrary class instances to support models that map to ORM objects.

To do this:

1. The Config property orm_mode must be set to True.
2. The special constructor from_orm must be used to create the model instance.

The example here uses SQLAlchemy but the same approach should work for any ORM.

from typing import List
from sqlalchemy import Column, Integer, String
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
from pydantic import BaseModel, constr

Base = declarative_base()

class CompanyOrm(Base):
    __tablename__ = 'companies'
    id = Column(Integer, primary_key=True, nullable=False)
    public_key = Column(String(20), index=True, nullable=False, unique=True)
    name = Column(String(63), unique=True)
    domains = Column(ARRAY(String(255)))

class CompanyModel(BaseModel):
    id: int
    public_key: constr(max_length=20)
    name: constr(max_length=63)
    domains: List[constr(max_length=255)]

    class Config:
        orm_mode = True

co_orm = CompanyOrm(
    id=123,
    public_key='foobar',
    name='Testing',
    domains=['example.com', 'foobar.com']
)
print(co_orm)
#> <__main__.CompanyOrm object at 0x7ff4bf918278>
co_model = CompanyModel.from_orm(co_orm)
print(co_model)
#> CompanyModel id=123
#>              public_key='foobar'
#>              name='Testing'
#>              domains=['example.com', 'foobar.com']

(This script is complete, it should run "as is")

ORM instances will be parsed with from_orm recursively as well as at the top level.

Here a vanilla class is used to demonstrate the principle, but any ORM could be used instead.

from typing import List
from pydantic import BaseModel

class PetCls:
    def __init__(self, *, name: str, species: str):
        self.name = name
        self.species = species

class PersonCls:
    def __init__(self, *, name: str, age: float = None, pets: List[PetCls]):
        self.name = name
        self.age = age
        self.pets = pets

class Pet(BaseModel):
    name: str
    species: str

    class Config:
        orm_mode = True

class Person(BaseModel):
    name: str
    age: float = None
    pets: List[Pet]

    class Config:
        orm_mode = True

bones = PetCls(name='Bones', species='dog')
orion = PetCls(name='Orion', species='cat')
anna = PersonCls(name='Anna', age=20, pets=[bones, orion])
anna_model = Person.from_orm(anna)
print(anna_model)
#> Person name='Anna'
#>        pets=[
#>          <Pet name='Bones' species='dog'>,
#>          <Pet name='Orion' species='cat'>
#>        ]
#>        age=20.0

(This script is complete, it should run "as is")

Arbitrary classes are processed by pydantic using the GetterDict class (see utils.py), which attempts to provide a dictionary-like interface to any class. You can customise how this works by setting your own sub-class of GetterDict as Config.getter_dict (see config).

You can also customise class validation using root validators with pre=True; in this case your validator function will be passed a GetterDict instance which you may copy and modify.

Error Handling🔗

pydantic will raise ValidationError whenever it finds an error in the data it's validating.

Note

Validation code should not raise ValidationError itself, but rather raise ValueError, TypeError or AssertionError (or subclasses of ValueError or TypeError) which will be caught and used to populate ValidationError.

Only one exception will be raised, regardless of the number of errors found; that ValidationError will contain information about all the errors and how they happened.

You can access these errors in several ways:

e.errors()
returns a list of errors found in the input data.
e.json()
returns a JSON representation of the errors.
str(e)
returns a human-readable representation of the errors.

Each error object contains:

loc
the error's location as a list. The first item in the list will be the field where the error occurred; subsequent items will represent the fields where the error occurred in any sub-models involved.
type
a computer-readable identifier of the error.
msg
a human readable explanation of the error.
ctx
an optional object which contains values required to render the error message.

To demonstrate that:

from typing import List
from pydantic import BaseModel, ValidationError, conint

class Location(BaseModel):
    lat = 0.1
    lng = 10.1

class Model(BaseModel):
    is_required: float
    gt_int: conint(gt=42)
    list_of_ints: List[int] = None
    a_float: float = None
    recursive_model: Location = None

data = dict(
    list_of_ints=['1', 2, 'bad'],
    a_float='not a float',
    recursive_model={'lat': 4.2, 'lng': 'New York'},
    gt_int=21,
)

try:
    Model(**data)
except ValidationError as e:
    print(e)
"""
5 validation errors
list_of_ints -> 2
  value is not a valid integer (type=type_error.integer)
a_float
  value is not a valid float (type=type_error.float)
is_required
  field required (type=value_error.missing)
recursive_model -> lng
  value is not a valid float (type=type_error.float)
gt_int
  ensure this value is greater than 42 (type=value_error.number.gt; limit_value=42)
"""

try:
    Model(**data)
except ValidationError as e:
    print(e.json())

"""
[
  {
    "loc": ["is_required"],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": ["gt_int"],
    "msg": "ensure this value is greater than 42",
    "type": "value_error.number.gt",
    "ctx": {
      "limit_value": 42
    }
  },
  {
    "loc": ["list_of_ints", 2],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  },
  {
    "loc": ["a_float"],
    "msg": "value is not a valid float",
    "type": "type_error.float"
  },
  {
    "loc": ["recursive_model", "lng"],
    "msg": "value is not a valid float",
    "type": "type_error.float"
  }
]
"""

(This script is complete, it should run "as is". json() has indent=2 set by default, but I've tweaked the JSON here and below to make it slightly more concise.)

Custom Errors🔗

In your custom data types or validators you should use ValueError, TypeError or AssertionError to raise errors.

See validators for more details on use of the @validator decorator.

from pydantic import BaseModel, ValidationError, validator

class Model(BaseModel):
    foo: str

    @validator('foo')
    def value_must_be_bar(cls, v):
        if v != 'bar':
            raise ValueError('value must be "bar"')

        return v

try:
    Model(foo='ber')
except ValidationError as e:
    print(e.errors())

"""
[
    {
        'loc': ('foo',),
        'msg': 'value must be "bar"',
        'type': 'value_error',
    },
]
"""

(This script is complete, it should run "as is")

You can also define your own error class, which can specify a custom error code, message template, and context:

from pydantic import BaseModel, PydanticValueError, ValidationError, validator

class NotABarError(PydanticValueError):
    code = 'not_a_bar'
    msg_template = 'value is not "bar", got "{wrong_value}"'

class Model(BaseModel):
    foo: str

    @validator('foo')
    def value_must_be_bar(cls, v):
        if v != 'bar':
            raise NotABarError(wrong_value=v)
        return v

try:
    Model(foo='ber')
except ValidationError as e:
    print(e.json())
"""
[
  {
    "loc": ["foo"],
    "msg": "value is not \"bar\", got \"ber\"",
    "type": "value_error.not_a_bar",
    "ctx": {
      "wrong_value": "ber"
    }
  }
]
"""

(This script is complete, it should run "as is")

Helper Functions🔗

Pydantic provides three classmethod helper functions on models for parsing data:

  • parse_obj: this is almost identical to the __init__ method of the model, except that if the object passed is not a dict a ValidationError will be raised (rather than python raising a TypeError).
  • parse_raw: takes a str or bytes and parses it as json or pickle data, then passes the result to parse_obj. The data type is inferred from the content_type argument; otherwise json is assumed.
  • parse_file: reads a file and passes the contents to parse_raw; if content_type is omitted, it is inferred from the file's extension.

import pickle
from datetime import datetime
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None

m = User.parse_obj({'id': 123, 'name': 'James'})
print(m)
# > User id=123 name='James' signup_ts=None

try:
    User.parse_obj(['not', 'a', 'dict'])
except ValidationError as e:
    print(e)
# > error validating input
# > User expected dict not list (error_type=TypeError)

m = User.parse_raw('{"id": 123, "name": "James"}')  # assumes json as no content type passed
print(m)
# > User id=123 name='James' signup_ts=None

pickle_data = pickle.dumps({'id': 123, 'name': 'James', 'signup_ts': datetime(2017, 7, 14)})
m = User.parse_raw(pickle_data, content_type='application/pickle', allow_pickle=True)
print(m)
# > User id=123 name='James' signup_ts=datetime.datetime(2017, 7, 14, 0, 0)

(This script is complete, it should run "as is")

Note

Since pickle allows complex objects to be encoded, to use it you need to explicitly pass allow_pickle to the parsing function.

Generic Models🔗

Note

New in version v0.29.

This feature requires Python 3.7+.

Pydantic supports the creation of generic models to make it easier to reuse a common model structure.

In order to declare a generic model, you perform the following steps:

  • Declare one or more typing.TypeVar instances to use to parameterize your model.
  • Declare a pydantic model that inherits from pydantic.generics.GenericModel and typing.Generic, where you pass the TypeVar instances as parameters to typing.Generic.
  • Use the TypeVar instances as annotations where you will want to replace them with other types or pydantic models.

Here is an example using GenericModel to create an easily-reused HTTP response payload wrapper:

from typing import Generic, TypeVar, Optional, List

from pydantic import BaseModel, validator, ValidationError
from pydantic.generics import GenericModel


DataT = TypeVar('DataT')


class Error(BaseModel):
    code: int
    message: str


class DataModel(BaseModel):
    numbers: List[int]
    people: List[str]


class Response(GenericModel, Generic[DataT]):
    data: Optional[DataT]
    error: Optional[Error]

    @validator('error', always=True)
    def check_consistency(cls, v, values):
        if v is not None and values['data'] is not None:
            raise ValueError('must not provide both data and error')
        if v is None and values.get('data') is None:
            raise ValueError('must provide data or error')
        return v


data = DataModel(numbers=[1, 2, 3], people=[])
error = Error(code=404, message='Not found')

print(Response[int](data=1))
# > Response[int] data=1 error=None
print(Response[str](data='value'))
# > Response[str] data='value' error=None
print(Response[str](data='value').dict())
# > {'data': 'value', 'error': None}
print(Response[DataModel](data=data).dict())
# > {'data': {'numbers': [1, 2, 3], 'people': []}, 'error': None}
print(Response[DataModel](error=error).dict())
# > {'data': None, 'error': {'code': 404, 'message': 'Not found'}}

try:
    Response[int](data='value')
except ValidationError as e:
    print(e)
"""
4 validation errors
data
  value is not a valid integer (type=type_error.integer)
data
  value is not none (type=type_error.none.allowed)
error
  value is not a valid dict (type=type_error.dict)
error
  must provide data or error (type=value_error)
"""

(This script is complete, it should run "as is")

If you set Config or make use of validator in your generic model definition, it is applied to concrete subclasses in the same way as when inheriting from BaseModel. Any methods defined on your generic class will also be inherited.
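For example, a method defined on the generic class is available on every concrete model (Wrapped is an illustrative name; the concrete class name, e.g. Wrapped[int], is generated by pydantic):

```python
from typing import Generic, TypeVar

from pydantic.generics import GenericModel

DataT = TypeVar('DataT')

class Wrapped(GenericModel, Generic[DataT]):
    data: DataT

    def summary(self) -> str:
        # inherited by concrete models such as Wrapped[int]
        return f'{self.__class__.__name__}: {self.data!r}'

w = Wrapped[int](data=1)
assert w.summary().endswith(': 1')
```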

Pydantic's generics also integrate properly with mypy, so you get all the type checking you would expect mypy to provide if you were to declare the type without using GenericModel.

Note

Internally, pydantic uses create_model to generate a (cached) concrete BaseModel at runtime, so there is essentially zero overhead introduced by making use of GenericModel.

If the name of the concrete subclasses is important, you can also override the default behaviour:

from typing import Generic, TypeVar, Type, Any, Tuple

from pydantic.generics import GenericModel

DataT = TypeVar('DataT')

class Response(GenericModel, Generic[DataT]):
    data: DataT

    @classmethod
    def __concrete_name__(cls: Type[Any], params: Tuple[Type[Any], ...]) -> str:
        return f'{params[0].__name__.title()}Response'

print(Response[int](data=1))
# IntResponse data=1
print(Response[str](data='a'))
# StrResponse data='a'

(This script is complete, it should run "as is")

Dynamic model creation🔗

There are some occasions where the shape of a model is not known until runtime. For this, pydantic provides the create_model method to allow models to be created on the fly.

from pydantic import BaseModel, create_model

DynamicFoobarModel = create_model('DynamicFoobarModel', foo=(str, ...), bar=123)


class StaticFoobarModel(BaseModel):
    foo: str
    bar: int = 123

Here StaticFoobarModel and DynamicFoobarModel are identical.

Fields are defined by either a tuple of the form (<type>, <default value>) or just a default value. The special keyword arguments __config__ and __base__ can be used to customise the new model. This includes extending a base model with extra fields.

from pydantic import BaseModel, create_model


class FooModel(BaseModel):
    foo: str
    bar: int = 123


BarModel = create_model('BarModel', apple='russet', banana='yellow', __base__=FooModel)
print(BarModel)
# > <class 'pydantic.main.BarModel'>
print(', '.join(BarModel.__fields__.keys()))
# > foo, bar, apple, banana
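Recent pydantic v1 releases also accept a __validators__ keyword argument to create_model: a dict mapping method names to validators built with the validator decorator. A sketch (assuming a release new enough to support __validators__ and the allow_reuse flag, added in v1.2):

```python
from pydantic import ValidationError, create_model, validator

def username_alphanumeric(cls, v):
    assert v.isalnum(), 'must be alphanumeric'
    return v

validators = {
    'username_validator': validator('username', allow_reuse=True)(username_alphanumeric)
}

UserModel = create_model('UserModel', username=(str, ...), __validators__=validators)

user = UserModel(username='scolvin')
assert user.username == 'scolvin'

try:
    UserModel(username='scolvi%n')
except ValidationError as e:
    # the assertion message is used as the error message
    print(e.errors()[0]['msg'])
```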

Custom Root Types🔗

Pydantic models which do not represent a dict ("object" in JSON parlance) can have a custom root type defined via the __root__ field. The root type can be of any type: list, float, int, etc.

The root type is defined via the type hint on the __root__ field. The root value can be passed to the model __init__ via the __root__ keyword argument, or as the first and only argument to parse_obj.

from typing import List
import json
from pydantic import BaseModel
from pydantic.schema import schema

class Pets(BaseModel):
    __root__: List[str]

print(Pets(__root__=['dog', 'cat']))
# > Pets __root__=['dog', 'cat']

print(Pets(__root__=['dog', 'cat']).json())
# ["dog", "cat"]

print(Pets.parse_obj(['dog', 'cat']))
# > Pets __root__=['dog', 'cat']

print(Pets.schema())
# > {'title': 'Pets', 'type': 'array', 'items': {'type': 'string'}}

pets_schema = schema([Pets])
print(json.dumps(pets_schema, indent=2))
"""
{
  "definitions": {
    "Pets": {
      "title": "Pets",
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}
"""

Faux Immutability🔗

Models can be configured to be immutable via allow_mutation = False. This will prevent changing attributes of a model. See model config for more details on Config.

Warning

Immutability in python is never strict. If developers are determined enough, they can always modify a so-called "immutable" object.

from pydantic import BaseModel


class FooBarModel(BaseModel):
    a: str
    b: dict

    class Config:
        allow_mutation = False


foobar = FooBarModel(a='hello', b={'apple': 'pear'})

try:
    foobar.a = 'different'
except TypeError as e:
    print(e)
    # > "FooBarModel" is immutable and does not support item assignment

print(foobar.a)
# > hello

print(foobar.b)
# > {'apple': 'pear'}

foobar.b['apple'] = 'grape'
print(foobar.b)
# > {'apple': 'grape'}

Trying to change a caused an error, and it remains unchanged. However, the dict b is mutable, and the immutability of foobar doesn't stop b from being changed.

Abstract Base Classes🔗

Pydantic models can be used alongside Python's Abstract Base Classes (ABCs).

import abc
from pydantic import BaseModel


class FooBarModel(BaseModel, abc.ABC):
    a: str
    b: int

    @abc.abstractmethod
    def my_abstract_method(self):
        pass

(This script is complete, it should run "as is")

Field Ordering🔗

Field order is important in models for the following reason:

As of v1.0 all fields with annotations (whether annotation-only or annotation with a default value) come first, followed by fields with no annotation. Within each group, fields remain in the order they were defined.

from pydantic import BaseModel, ValidationError

class Model(BaseModel):
    a: int
    b = 2
    c: int = 1
    d = 0
    e: float

print(Model.__fields__.keys())
#> dict_keys(['a', 'c', 'e', 'b', 'd'])
m = Model(e=2, a=1)
print(m.dict())
#> {'a': 1, 'c': 1, 'e': 2.0, 'b': 2, 'd': 0}

try:
    Model(a='x', b='x', c='x', d='x', e='x')
except ValidationError as exc:
    error_locations = [e['loc'] for e in exc.errors()]

print(error_locations)
#> [('a',), ('c',), ('e',), ('b',), ('d',)]

(This script is complete, it should run "as is")

Warning

Note here that field order is not obvious at first glance when both annotated and un-annotated fields are used.

In general, therefore, it's preferable to add type annotations to all fields, even when a default value would determine the type.

Required fields🔗

In addition to using annotation-only fields to denote required fields, an ellipsis (...) can be used as the value:

from pydantic import BaseModel

class Model(BaseModel):
    a: int
    b: int = ...

Here both a and b are required. Use of the ellipsis for required fields doesn't work well with mypy, so it should generally be avoided.
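If you want required fields to be explicit while keeping type checkers happier, the Field function (available since pydantic v1.0) can be used instead of a bare ellipsis:

```python
from pydantic import BaseModel, Field, ValidationError

class Model(BaseModel):
    a: int = Field(...)

assert Model(a=1).a == 1

try:
    Model()
except ValidationError as e:
    # 'a' is still reported as required
    assert e.errors()[0]['loc'] == ('a',)
```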

Data Conversion🔗

pydantic may cast input data to force it to conform to the model field types; in some cases this may result in information being lost. Take the following example:

from pydantic import BaseModel

class Model(BaseModel):
    a: int
    b: float
    c: str

print(Model(a=3.1415, b=' 2.72 ', c=123).dict())
#> {'a': 3, 'b': 2.72, 'c': '123'}

(This script is complete, it should run "as is")

This is a deliberate decision of pydantic, and it is generally the most useful approach; see here for a longer discussion of the subject.