Annotating fields

Field types are specified via Annotated type hints. A field is serialized only if it carries a pure_protobuf.annotations.Field annotation; fields without one are ignored by BaseMessage. On older Python versions, use typing_extensions.Annotated.

Supported types

Built-in types

| Type | .proto type | Notes |
| --- | --- | --- |
| bool | bool | Encoded normally as int |
| bytes, bytearray, memoryview, ByteString | bytes | Always deserialized as bytes |
| float | float | 32-bit floating-point number. Use the additional double type for a 64-bit number |
| int | int32, int64, uint32, uint64 | Variable-length integer. Negative values are encoded in two's complement. See also the additional uint and ZigZagInt |
| enum.IntEnum | enum, int32, int64 | Supports subclasses of IntEnum (see enumerations) |
| str | string | |
| urllib.parse.ParseResult | string | Parsed URL, represented as a string |
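To make the note on the int row concrete, here is a minimal standard-library sketch of variable-length integer encoding as defined by the Protocol Buffers wire format. It does not use pure-protobuf; the helper name encode_varint is made up for this illustration, and the 64-bit mask is an assumption that matches the int64 case.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a base-128 varint (illustrative helper, not library API)."""
    # Negative values are first reinterpreted as 64-bit two's complement.
    value &= 0xFFFFFFFFFFFFFFFF
    out = bytearray()
    while True:
        byte = value & 0x7F  # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


small = encode_varint(150)    # two bytes
negative = encode_varint(-1)  # ten bytes, because all 64 bits are set
```

A negative int therefore always costs ten bytes on the wire, which is why ZigZagInt is usually the better choice for values that are often negative.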

Additional types

The pure_protobuf.annotations module provides additional NewTypes to support different representations of the singular types:

| pure_protobuf.annotations type | .proto type | Python value type | Notes |
| --- | --- | --- | --- |
| double | double | float | 64-bit floating-point number |
| fixed32 | fixed32 | int | 32-bit unsigned integer |
| fixed64 | fixed64 | int | 64-bit unsigned integer |
| sfixed32 | sfixed32 | int | 32-bit signed integer |
| sfixed64 | sfixed64 | int | 64-bit signed integer |
| uint | uint32, uint64 | int | Unsigned variable-length integer |
| ZigZagInt | sint32, sint64 | int | ZigZag-encoded integer |
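The representations in this table can be illustrated with the standard library alone. The sketch below follows the Protocol Buffers wire format, not pure-protobuf's internals. The names zigzag_encode and zigzag_decode are hypothetical helpers, and the 64-bit width is an assumption matching sint64.

```python
import struct


def zigzag_encode(value: int) -> int:
    """Map signed to unsigned so small magnitudes stay small: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((value << 1) ^ (value >> 63)) & 0xFFFFFFFFFFFFFFFF


def zigzag_decode(value: int) -> int:
    """Inverse of zigzag_encode."""
    return (value >> 1) ^ -(value & 1)


# Fixed-width types are plain little-endian values.
fixed32_one = struct.pack("<I", 1)         # fixed32: unsigned 32-bit
sfixed32_minus_one = struct.pack("<i", -1)  # sfixed32: signed 32-bit
double_one = struct.pack("<d", 1.0)        # double: 64-bit float
float_one = struct.pack("<f", 1.0)         # float: 32-bit float
```

The ZigZag-mapped value is then written as an ordinary varint, so -1 takes one byte instead of ten.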

Repeated fields

typing.List, typing.Iterable, and collections.abc.Iterable annotations are automatically converted to repeated fields. Repeated fields of scalar numeric types use packed encoding by default:

test_repeated.py
from dataclasses import dataclass, field
from typing import List
from typing_extensions import Annotated

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


@dataclass
class Message(BaseMessage):
    foo: Annotated[List[int], Field(1)] = field(default_factory=list)


assert bytes(Message(foo=[1, 2])) == b"\x0A\x02\x01\x02"

If unpacked encoding is explicitly wanted, specify packed=False:

test_repeated_unpacked.py
from dataclasses import dataclass, field
from typing import List
from typing_extensions import Annotated

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


@dataclass
class Message(BaseMessage):
    foo: Annotated[List[int], Field(1, packed=False)] = field(
        default_factory=list,
    )


assert bytes(Message(foo=[1, 2])) == b"\x08\x01\x08\x02"
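The byte strings in the two assertions above can be reproduced by hand, which shows where packed and unpacked encodings differ. This is a standard-library sketch of the wire format only; encode_varint, make_tag, encode_packed and encode_unpacked are hypothetical helpers, not pure-protobuf functions.

```python
from typing import Iterable


def encode_varint(value: int) -> bytes:
    """Base-128 varint; negative values use 64-bit two's complement."""
    value &= 0xFFFFFFFFFFFFFFFF
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def make_tag(field_number: int, wire_type: int) -> bytes:
    """A tag is the field number shifted left by 3, combined with the wire type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_packed(field_number: int, values: Iterable[int]) -> bytes:
    """One length-delimited record (wire type 2) holding all elements."""
    payload = b"".join(encode_varint(v) for v in values)
    return make_tag(field_number, 2) + encode_varint(len(payload)) + payload


def encode_unpacked(field_number: int, values: Iterable[int]) -> bytes:
    """One varint record (wire type 0) per element, each with its own tag."""
    return b"".join(make_tag(field_number, 0) + encode_varint(v) for v in values)


packed = encode_packed(1, [1, 2])
unpacked = encode_unpacked(1, [1, 2])
```

Packed encoding pays for the tag once, so it is more compact for long lists; unpacked encoding repeats the tag for every element.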

Required fields

Required fields are deprecated in proto2 and not supported in proto3, so fields in pure-protobuf are always optional. The Optional annotation is accepted for type hinting but has no functional meaning for BaseMessage.

Default values

In pure-protobuf it is the developer's responsibility to take care of default values. If an encoded message does not contain a particular element, the corresponding field is simply not passed to the message class, so the class's own default applies:

test_default_values.py
from dataclasses import dataclass
from io import BytesIO
from typing import Optional
from typing_extensions import Annotated

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


@dataclass
class Foo(BaseMessage):
    bar: Annotated[int, Field(1)] = 42
    qux: Annotated[Optional[int], Field(2)] = None


assert bytes(Foo()) == b"\x08\x2A"
assert Foo.read_from(BytesIO()) == Foo(bar=42)

Make sure to set defaults for non-required fields

pure-protobuf makes no assumptions about how a message class's __init__() handles missing keyword arguments. So, if you expect a field to be optional, you must specify a default value explicitly, just as you normally do with pydantic or dataclasses.

Otherwise, a missing record would cause a missing argument error:

test_missing_default.py
from dataclasses import dataclass
from io import BytesIO
from typing import Optional

from pytest import raises
from typing_extensions import Annotated

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


@dataclass
class Foo(BaseMessage):
    foo: Annotated[Optional[int], Field(1)]


with raises(TypeError):
    Foo.read_from(BytesIO())
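The error comes from the class constructor rather than from the decoder. The sketch below imitates that with plain dataclasses. It assumes, as the text above suggests, that decoding ends by calling the class with the decoded fields as keyword arguments. The class names and the decoded_fields dictionary are hypothetical stand-ins, not library internals.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class WithoutDefault:
    foo: Optional[int]  # Optional is only a type hint, not a default


@dataclass
class WithDefault:
    foo: Optional[int] = None  # explicit default makes the field truly optional


# Stands in for the fields decoded from an empty message.
decoded_fields: Dict[str, Any] = {}

with_default = WithDefault(**decoded_fields)  # works, foo is None

try:
    WithoutDefault(**decoded_fields)
except TypeError as error:
    failure = str(error)  # __init__() complains about the missing 'foo' argument
else:
    failure = None
```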

Enumerations

Subclasses of the standard IntEnum class are supported. Their values are encoded as ordinary integers:

test_enum.py
from dataclasses import dataclass
from enum import IntEnum
from io import BytesIO
from typing_extensions import Annotated

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


class TestEnum(IntEnum):
    BAR = 1


@dataclass
class Test(BaseMessage):
    foo: Annotated[TestEnum, Field(1)]


assert bytes(Test(foo=TestEnum.BAR)) == b"\x08\x01"
assert Test.read_from(BytesIO(b"\x08\x01")) == Test(foo=TestEnum.BAR)
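Only the integer value travels on the wire, so the round trip amounts to a conversion between the enum member and its int. The standard-library sketch below shows that conversion in isolation. How pure-protobuf itself treats an integer with no matching member is not stated here; the last part only shows what a plain IntEnum does.

```python
from enum import IntEnum


class TestEnum(IntEnum):
    BAR = 1


wire_value = int(TestEnum.BAR)   # what gets varint-encoded
restored = TestEnum(wire_value)  # what a decoder can rebuild from the integer

try:
    TestEnum(2)  # no member has the value 2
except ValueError:
    unknown_rejected = True
else:
    unknown_rejected = False
```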

Embedded messages

test_embedded_message.py
from dataclasses import dataclass, field
from typing_extensions import Annotated

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


@dataclass
class Test1(BaseMessage):
    a: Annotated[int, Field(1)] = 0


@dataclass
class Test3(BaseMessage):
    c: Annotated[Test1, Field(3)] = field(default_factory=Test1)


assert bytes(Test3(c=Test1(a=150))) == b"\x1A\x03\x08\x96\x01"
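The encoded bytes in the assertion above can be taken apart by hand to see how the inner message is nested. This is a standard-library sketch of the wire format; read_varint and read_field are hypothetical helpers that handle only the two wire types appearing in this example.

```python
from typing import Any, Tuple


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte
            return result, pos
        shift += 7


def read_field(data: bytes, pos: int) -> Tuple[int, int, Any, int]:
    """Return (field number, wire type, value, next position)."""
    key, pos = read_varint(data, pos)
    field_number, wire_type = key >> 3, key & 0x07
    if wire_type == 0:  # varint
        value, pos = read_varint(data, pos)
    elif wire_type == 2:  # length-delimited: bytes, strings, embedded messages
        length, pos = read_varint(data, pos)
        value, pos = data[pos:pos + length], pos + length
    else:
        raise ValueError(f"wire type {wire_type} is not handled in this sketch")
    return field_number, wire_type, value, pos


encoded = b"\x1A\x03\x08\x96\x01"
outer = read_field(encoded, 0)   # field 3, length-delimited, 3-byte payload
inner = read_field(outer[2], 0)  # the payload is itself a message: field 1 = 150
```

An embedded message is thus just a length-prefixed byte string whose content is another encoded message.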

Self-referencing messages

Use typing.Self (or typing_extensions.Self on older Python versions) to reference the message class itself:

test_self.py
from dataclasses import dataclass
from typing import Optional

from typing_extensions import Annotated, Self

from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage


@dataclass
class RecursiveMessage(BaseMessage):
    payload: Annotated[int, Field(1)]
    inner: Annotated[Optional[Self], Field(2)] = None

Messages with circular dependencies are not supported

The following example does not work at the moment:

class A(BaseMessage):
    b: Annotated[B, ...]

class B(BaseMessage):
    a: Annotated[A, ...]

Tracking issue: #108.

Oneof

test_one_of.py
from typing import ClassVar, Optional

from pydantic import BaseModel
from pure_protobuf.annotations import Field
from pure_protobuf.message import BaseMessage
from pure_protobuf.one_of import OneOf
from typing_extensions import Annotated


class Message(BaseMessage, BaseModel):
    foo_or_bar: ClassVar[OneOf] = OneOf()  # (1)
    which_one = foo_or_bar.which_one_of_getter()  # (2)

    foo: Annotated[Optional[int], Field(1, one_of=foo_or_bar)] = None
    bar: Annotated[Optional[int], Field(2, one_of=foo_or_bar)] = None


message = Message()
message.foo = 42
message.bar = 43

assert message.foo_or_bar == 43
assert message.foo is None
assert message.bar == 43
assert message.which_one() == "bar"
  1. ClassVar is needed here because this is a descriptor and not a real attribute.
  2. Since foo_or_bar returns the value itself, we need an extra attribute for the which_one() getter.

Limitations

  • When assigning a one-of member, BaseMessage resets the other fields to None, regardless of any defaults defined by, for example, dataclasses.field.
  • The OneOf descriptor iterates over its members to find the assigned one-of value, so a lookup takes time linear in the number of members.
  • It's impossible to set a value via a OneOf descriptor. Assign the value to the specific member attribute instead.
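The limitations above can be pictured with a toy descriptor. The sketch below is not pure-protobuf's OneOf implementation. OneOfSketch and SketchMessage are hypothetical classes, written with the standard library only, that mimic the three behaviours described: resetting sibling members, linear lookup, and read-only access through the group.

```python
from typing import Any, Optional, Tuple


class OneOfSketch:
    """Hypothetical stand-in for a one-of group (not the library's OneOf)."""

    def __init__(self, *members: str) -> None:
        self.members: Tuple[str, ...] = members

    def __get__(self, instance: Any, owner: Optional[type] = None) -> Any:
        if instance is None:
            return self
        # Linear scan: check each member until an assigned one is found.
        for name in self.members:
            value = instance.__dict__.get(name)
            if value is not None:
                return value
        return None

    def __set__(self, instance: Any, value: Any) -> None:
        # Setting through the group is ambiguous, so it is rejected.
        raise AttributeError("assign to a specific member attribute instead")


class SketchMessage:
    foo_or_bar = OneOfSketch("foo", "bar")

    def __init__(self) -> None:
        self.foo: Optional[int] = None
        self.bar: Optional[int] = None

    def __setattr__(self, name: str, value: Any) -> None:
        group = type(self).foo_or_bar
        if name in group.members and value is not None:
            # Assigning one member resets the others to None.
            for other in group.members:
                if other != name:
                    object.__setattr__(self, other, None)
        object.__setattr__(self, name, value)


message = SketchMessage()
message.foo = 42
message.bar = 43  # resets foo to None
```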