Consider external serialization libraries #180
Much better:
https://www.attrs.org/en/stable/why.html

https://threeofwands.com/why-i-use-attrs-instead-of-pydantic/

This actually convinced me that we want Pydantic, rather than attrs, since we want to validate runcards. And if someone saves me the burden of writing the validator (or part of it), I'm only grateful.
Hi, @alecandido

This is a good idea, especially if you're interested in serialization of variadic generics from the new PEP 646. It has a lot of edge cases, but I managed to cope with them in mashumaro, which I would suggest you try even if you have more common use cases.
@Fatal1ty another rather nasty type that requires some special care here is `np.ndarray`.
For non-standard types like this one, a custom universal serialization strategy can be registered:

```python
import io
from dataclasses import dataclass

import numpy as np
import numpy.typing as npt
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.types import SerializationStrategy


class NDArraySerializationStrategy(SerializationStrategy):
    def serialize(self, value: np.ndarray) -> str:
        # dump the array in .npy format and encode the buffer as a hex string
        tmp_io = io.BytesIO()
        np.save(tmp_io, value, allow_pickle=False)
        return tmp_io.getvalue().hex()

    def deserialize(self, value: str) -> np.ndarray:
        # decode the hex string back into a buffer and reload the array
        tmp_io = io.BytesIO(bytes.fromhex(value))
        return np.load(tmp_io, allow_pickle=False)


@dataclass
class C(DataClassDictMixin):
    x: npt.NDArray[np.float64]

    class Config(BaseConfig):
        serialization_strategy = {
            npt.NDArray[np.float64]: NDArraySerializationStrategy(),
        }
```

This could be unhandy when you have a lot of scalar variations, so I'm thinking about allowing a strategy to be set for the origin type (np.ndarray in this case).
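The core trick in the strategy above is independent of mashumaro: a binary payload is dumped into an in-memory buffer and encoded as a hex string, which any text-based format (JSON, YAML) can carry. A minimal stdlib-only sketch of that round trip, using plain `bytes` in place of `np.ndarray` (the function names here are illustrative, not part of mashumaro's API):

```python
import io


def serialize(payload: bytes) -> str:
    buf = io.BytesIO()
    buf.write(payload)           # stands in for np.save(buf, value)
    return buf.getvalue().hex()  # binary buffer -> hex string


def deserialize(value: str) -> bytes:
    buf = io.BytesIO(bytes.fromhex(value))
    return buf.read()            # stands in for np.load(buf)


data = b"\x00\x01binary payload"
assert deserialize(serialize(data)) == data
```

Hex doubles the payload size; base64 (via `base64.b64encode`) would be more compact, but hex keeps the example trivially reversible.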
A commit from an earlier attempt is c6c2fa6.
At the moment, I wrote in #172 a mini-serialization library based on `dataclass`es; it is currently called `DictLike`.

Now, `dataclass` already natively provides `.asdict()` and a `.from_dict()` analogue (i.e. `MyDataClass(**mydict)`). This is the initial motivation to go down this way. It is optimal, since it allows us to go through serialization while preserving type information, since it lives in the runtime structure, and this is the new incarnation of the "input layer" currently implemented in yadism. The idea is to keep improving input-checking by relying on custom type definitions, such that the internal library can make assumptions on those values.
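A hedged sketch of that native round trip, with a hypothetical `Runcard` type standing in for an actual runcard (flat fields only; nested dataclasses need extra care, which is exactly what `DictLike` and the libraries below address):

```python
from dataclasses import asdict, dataclass


@dataclass
class Runcard:
    # hypothetical runcard fields, for illustration only
    order: int
    scheme: str


card = Runcard(order=2, scheme="FFNS")
d = asdict(card)         # serialization: {'order': 2, 'scheme': 'FFNS'}
restored = Runcard(**d)  # the .from_dict() analogue
assert restored == card
```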
Unfortunately, type hints and runtime classes are not exactly the same thing, and proper serialization with generic type hints is complex (consider that a union is a type hint, not a class, including `MyType | None`, formerly `typing.Optional[MyType]`, and also `list[int]`, formerly `typing.List[int]`). So, taking care of internal types ourselves does not come at zero cost, especially because many typing features were introduced in later releases, and we need compatibility with py3.8 at the time of writing. This reflects the fact that types in Python are rather recent, together with all their ecosystem (of which `dataclass` is part).

In order to reduce the maintenance burden of the serialization part, it is worth considering external libraries, especially if they are popular enough.
A good example would be lidatong/dataclasses-json, which provides a `@dataclass_json` decorator that is the equivalent of my `DictLike` base class. It is also compatible with py3.7 and py3.6 (with the backport of `dataclasses`, which was introduced in the standard library in py3.7). Unfortunately, it doesn't look so lively.
Other options are welcome.