traitschema
Create serializable, type-checked schema using traits and Numpy. A typical use case involves saving several Numpy arrays of varying shape and type.
Defining schema
Note
The following assumes a basic familiarity with the traits package. See
its documentation for details.
In order to be able to properly serialize data, non-scalar traits should be
declared as a traits.api.Array type. Example:
import numpy as np
from traits.api import Array, String
from traitschema import Schema

class NamedMatrix(Schema):
    name = String()
    data = Array(dtype=np.float64)

matrix = NamedMatrix(name="name", data=np.random.random((8, 8)))
For other demos, see the demos directory.
Saving and loading
Data can be stored in the following formats:
- HDF5 via h5py
- JSON via the standard library json module
- Numpy npz format
Multiple schema can be saved at once to a zip file via
traitschema.bundle_schema() and loaded with
traitschema.load_bundle().
Reference
- class traitschema.Schema(**kwargs)
Extension to HasTraits to add methods for automatically saving and loading typed data.
Examples
Create a new data class:
import numpy as np
from traits.api import Array
from traitschema import Schema

class Matrix(Schema):
    data = Array(dtype=np.float64)

matrix = Matrix(data=np.random.random((8, 8)))
Serialize to HDF5 using h5py:
matrix.to_hdf("out.h5")
Load from HDF5:
matrix_copy = Matrix.from_hdf("out.h5")
- classmethod from_hdf(filename, decode_string_arrays=True, encoding='utf-8')
Deserialize from HDF5 using h5py.
Parameters:
- filename (str)
- decode_string_arrays (bool) – If True, arrays of bytes are decoded into strings.
- encoding (str) – Encoding scheme to use for decoding.
Returns: Deserialized instance.
- classmethod from_json(data)
Deserialize from a JSON string or file.
Parameters: data (str or file-like)
Returns: Deserialized instance.
- classmethod from_npz(filename)
Load data from numpy’s npz format.
Parameters: filename (str)
- save(filename)
Serialize using the type determined by the file extension.
Parameters: filename (str) – Full output path.
Notes
Only default saving options are used, so this method is less flexible than using the to_xyz methods instead.
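The idea of choosing a serializer from the file extension can be sketched as follows. This is only an illustration: the extension-to-method mapping, the helper name method_for, and the use of ValueError are assumptions made for the example and are not taken from the traitschema source.

```python
import os

# Assumed mapping, for illustration only; the extensions that
# traitschema actually recognizes are not listed in this document.
EXTENSION_TO_METHOD = {
    ".h5": "to_hdf",
    ".json": "to_json",
    ".npz": "to_npz",
}


def method_for(filename):
    """Return the name of the to_* method matching the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    try:
        return EXTENSION_TO_METHOD[extension]
    except KeyError:
        # Hypothetical error handling: the library may raise something else.
        raise ValueError("unsupported extension: %r" % extension) from None


print(method_for("results/out.h5"))
print(method_for("out.NPZ"))
```

Because a dispatcher like this only receives a filename, it has no place to accept per-format options such as compression, which is why calling the specific to_* method directly is the more flexible route.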
- to_hdf(filename, mode='w', compression=None, compression_opts=None, encode_string_arrays=True, encoding='utf8')
Serialize to HDF5 using h5py.
Parameters:
- filename (str) – Path to save HDF5 file to.
- mode (str) – Default: 'w'
- compression (str or None) – Compression to use with arrays (see h5py documentation for valid choices).
- compression_opts (int or None) – Compression options, generally a number specifying compression level (see h5py documentation for details).
- encode_string_arrays (bool) – When True, force encoding of arrays of unicode strings using the encoding keyword argument. Not setting this will result in errors if using arrays of unicode strings. Default: True.
- encoding (str) – Encoding to use when forcing encoding of unicode string arrays. Default: 'utf8'.
Notes
Each stored dataset will also have a desc attribute which uses the desc attribute of each trait.
The root node also has attributes:
- classname - the class name of the instance being serialized
- python_module - the Python module in which the class is defined
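The encode_string_arrays and decode_string_arrays options exist because h5py does not store NumPy's fixed-width unicode dtype directly. A minimal sketch of the round trip, using only NumPy, might look like this. It illustrates the general technique with np.char.encode and np.char.decode and is not necessarily how traitschema implements it; the array contents are made up.

```python
import numpy as np

# An array of unicode strings (dtype kind "U"), which h5py cannot
# write as-is.
names = np.array(["alpha", "beta", "gämma"])

# Encode each element to bytes before writing (dtype kind "S").
encoded = np.char.encode(names, "utf-8")

# Decode back to unicode strings after reading.
decoded = np.char.decode(encoded, "utf-8")

print(names.dtype.kind, encoded.dtype.kind, decoded.dtype.kind)
print(np.array_equal(names, decoded))
```

The same encoding must be used in both directions, which is why to_hdf and from_hdf each take an encoding argument.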
- to_json(json_kwargs={})
Serialize to JSON.
Parameters: json_kwargs (dict) – Keyword arguments to pass to json.dumps().
Returns: JSON string.
Notes
This uses a custom JSON encoder to handle numpy arrays but could conceivably lose precision. If this is important, please consider serializing in HDF5 format instead. As a consequence of using a custom encoder, the cls keyword argument, if passed, will be ignored.
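To make the note about the custom encoder concrete, here is a minimal sketch of a JSON encoder that handles NumPy arrays and of a wrapper that overrides any cls the caller supplies. The names ArrayEncoder and dumps_with_arrays are hypothetical and chosen for this example; traitschema's own encoder may differ in detail.

```python
import json

import numpy as np


class ArrayEncoder(json.JSONEncoder):
    """Hypothetical encoder: convert NumPy values to plain Python ones."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()  # nested lists of Python scalars
        if isinstance(obj, np.generic):
            return obj.item()  # single NumPy scalar
        return super().default(obj)


def dumps_with_arrays(payload, json_kwargs=None):
    """Serialize payload; a caller-supplied cls is replaced by ArrayEncoder."""
    kwargs = dict(json_kwargs or {})
    kwargs["cls"] = ArrayEncoder
    return json.dumps(payload, **kwargs)


text = dumps_with_arrays(
    {"name": "m", "data": np.array([[1.0, 2.0], [3.0, 4.0]])},
    json_kwargs={"sort_keys": True, "cls": json.JSONEncoder},
)
restored = json.loads(text)
print(text)
```

Note that plain json.loads returns nested lists rather than arrays, so dtype and shape information has to be restored separately when deserializing.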
- to_npz(filename, compress=False)
Save in numpy’s npz archive format.
Parameters:
- filename (str)
- compress (bool) – Save as a compressed archive (default: False)
Notes
To ensure loading of scalar values works as expected, casting traits should be used (e.g., CStr instead of String or Str). See the traits documentation for details.
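The reason casting traits help is that scalars saved in an npz archive come back as zero-dimensional arrays rather than plain Python values. The following sketch shows this with NumPy alone, using an in-memory buffer instead of a file; the variable names are illustrative.

```python
import io

import numpy as np

# Save a string scalar and an array into an in-memory npz archive.
buffer = io.BytesIO()
np.savez(buffer, name="abc", data=np.arange(3))
buffer.seek(0)

loaded = np.load(buffer)
name = loaded["name"]

# The scalar is returned as a 0-d array, not as a str.
print(type(name).__name__, name.shape)

# An explicit cast recovers the original Python value, which is
# the kind of conversion a casting trait performs on assignment.
print(str(name))
```

A non-casting string trait would reject the zero-dimensional array, whereas a casting trait such as CStr converts it on assignment.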
- exception traitschema.io.UnsupportedArchiveFormat
Raised when a file extension doesn’t match up with a supported archive format.
- traitschema.io.bundle_schema(outfile, schema, format='npz')
Bundle several Schema objects into a single archive.
Parameters:
- outfile (str) – Output bundle filename. Only zip archives are supported.
- schema (Dict[str, Schema]) – Dictionary of Schema objects to bundle together. Keys are names to give each schema and are used when loading a bundle.
- format (str) – Format to save individual schema as (default: 'npz').
Notes
Default options are used with all saving functions (e.g., no compression is used for individual serialized schema).
- traitschema.io.load_bundle(filename)
Loads a bundle of schema saved with bundle_schema().
Parameters: filename (str) – Path to bundled schema archive.
Returns: schema – A dictionary of stored schema where the keys are the keys used when bundling. Additionally, a __meta__ key will contain other info that was stored when saved (e.g., bundling format version number).
Return type: dict
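The bundling idea, several named payloads plus a metadata entry in one zip archive, can be sketched with the standard library alone. This is a simplified illustration and not traitschema's on-disk layout: the helper names, the __meta__.json member, and the version field are all assumptions, and plain dictionaries stand in for Schema objects.

```python
import io
import json
import zipfile


def bundle_json(outfile, payloads, version=1):
    """Write each named payload as a JSON member of a zip archive."""
    with zipfile.ZipFile(outfile, "w") as archive:
        # Hypothetical metadata record kept alongside the payloads.
        meta = {"version": version, "names": sorted(payloads)}
        archive.writestr("__meta__.json", json.dumps(meta))
        for name, payload in payloads.items():
            archive.writestr(name + ".json", json.dumps(payload))


def load_json_bundle(infile):
    """Read the archive back into a dict keyed by the bundling names."""
    with zipfile.ZipFile(infile) as archive:
        meta = json.loads(archive.read("__meta__.json"))
        result = {
            name: json.loads(archive.read(name + ".json"))
            for name in meta["names"]
        }
    result["__meta__"] = meta
    return result


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
bundle_json(buffer, {"a": {"data": [1, 2]}, "b": {"data": [3]}})
buffer.seek(0)
bundle = load_json_bundle(buffer)
print(sorted(bundle))
```

Keeping the metadata under a reserved key mirrors the __meta__ entry that load_bundle() returns, so callers should avoid using that name for one of their own schema.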