traitschema

Create serializable, type-checked schema using traits and Numpy. A typical use case involves saving several Numpy arrays of varying shape and type.

Defining schema

Note

The following assumes a basic familiarity with the traits package. See its documentation for details.

To serialize data properly, declare non-scalar traits as traits.api.Array. For example:

import numpy as np
from traits.api import Array, String
from traitschema import Schema

class NamedMatrix(Schema):
    name = String()
    data = Array(dtype=np.float64)

matrix = NamedMatrix(name="name", data=np.random.random((8, 8)))

For other demos, see the demos directory.

Saving and loading

Data can be stored in the following formats:

HDF5 via h5py
JSON via the standard library json module
Numpy npz format

Multiple schema can be saved at once to a zip file via traitschema.bundle_schema and loaded with traitschema.load_bundle.

Reference

class traitschema.Schema(**kwargs)

Extension to HasTraits to add methods for automatically saving and loading typed data.

Examples

Create a new data class:

import numpy as np
from traits.api import Array
from traitschema import Schema

class Matrix(Schema):
    data = Array(dtype=np.float64)

matrix = Matrix(data=np.random.random((8, 8)))

Serialize to HDF5 using h5py:

matrix.to_hdf("out.h5")

Load from HDF5:

matrix_copy = Matrix.from_hdf("out.h5")

classmethod from_hdf(filename, decode_string_arrays=True, encoding='utf-8')

Deserialize from HDF5 using h5py.

Parameters:

filename (str) – Path to the HDF5 file to read.
decode_string_arrays (bool) – When True, decode arrays of bytes into strings.
encoding (str) – Encoding scheme to use for decoding.

Returns:	Deserialized instance

classmethod from_json(data)

Deserialize from a JSON string or file.

Parameters:	data (str or file-like) – JSON string or file to read from.
Returns:	Deserialized instance

classmethod from_npz(filename)

Load data from numpy’s npz format.

Parameters:	filename (str) – Path to the npz file to read.

classmethod load(filename): Counterpart to save().

save(filename)

Serialize using the type determined by the file extension.

Parameters:	filename (str) – Full output path.

Notes

Only default saving options are used, so this method is less flexible than calling the format-specific methods (to_hdf, to_json, to_npz) directly.
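As a rough illustration of choosing a serializer from the file extension, here is a minimal sketch. The extension table, the pick_format helper, and the UnsupportedFormat exception are all hypothetical names invented for this example; the extensions that traitschema actually recognizes may differ.

```python
import os

# Hypothetical extension table; the mapping used by traitschema may differ.
FORMATS = {".h5": "hdf", ".json": "json", ".npz": "npz"}


class UnsupportedFormat(Exception):
    """Hypothetical error for an unrecognized file extension."""


def pick_format(filename):
    """Return the format name implied by the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return FORMATS[ext]
    except KeyError:
        raise UnsupportedFormat(f"unsupported extension: {ext!r}") from None


print(pick_format("out.h5"))
print(pick_format("data/out.NPZ"))
```

A dispatcher like this has no place to pass per-format options such as compression, which is why the format-specific methods are the more flexible choice.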

to_dict(): Return all visible traits as a dictionary.

to_hdf(filename, mode='w', compression=None, compression_opts=None, encode_string_arrays=True, encoding='utf8')

Serialize to HDF5 using h5py.

Parameters:

filename (str) – Path to save HDF5 file to.
mode (str) – Default: 'w'
compression (str or None) – Compression to use with arrays (see h5py documentation for valid choices).
compression_opts (int or None) – Compression options, generally a number specifying compression level (see h5py documentation for details).
encode_string_arrays (bool) – When True, force encoding of arrays of unicode strings using the encoding keyword argument. Setting this to False will result in errors if the schema contains arrays of unicode strings. Default: True.
encoding (str) – Encoding to use when forcing encoding of unicode string arrays. Default: 'utf8'.
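To show what encoding an array of unicode strings means in practice, the sketch below uses plain numpy rather than traitschema itself. It illustrates the kind of conversion that encode_string_arrays and decode_string_arrays describe; how the library performs the conversion internally is an assumption here.

```python
import numpy as np

# A numpy array of unicode strings (dtype kind "U").
names = np.array(["alpha", "β", "gamma"])

# Encode to fixed-width byte strings (dtype kind "S") before storing.
encoded = np.char.encode(names, "utf8")

# Decode back to unicode after loading.
decoded = np.char.decode(encoded, "utf8")

print(names.dtype.kind, encoded.dtype.kind, decoded.dtype.kind)
```

The same encoding must be used in both directions, which is why to_hdf and from_hdf each take an encoding argument.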

Notes

Each stored dataset will also have a desc attribute which uses the desc attribute of each trait.

The root node also has attributes:

classname - the class name of the instance being serialized
python_module - the Python module in which the class is defined

to_json(json_kwargs={})

Serialize to JSON.

Parameters:	json_kwargs (dict) – Keyword arguments to pass to `json.dumps()`.
Returns:	JSON string.

Notes

This uses a custom JSON encoder to handle numpy arrays, which could conceivably lose precision. If precision is important, consider serializing to HDF5 instead. Because a custom encoder is used, the cls keyword argument, if passed, will be ignored.
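For readers unfamiliar with custom JSON encoders, the following is a minimal sketch of one that handles numpy arrays. ArrayEncoder is a hypothetical class written for this example and is not the encoder traitschema uses; it only shows why the cls argument to json.dumps is already taken when a custom encoder is in play.

```python
import json

import numpy as np


class ArrayEncoder(json.JSONEncoder):
    """Hypothetical encoder that converts numpy arrays to nested lists."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Fall back to the standard behavior (raises TypeError).
        return super().default(obj)


payload = {"name": "m", "data": np.arange(4, dtype=np.float64).reshape(2, 2)}

# The encoder is supplied through the cls keyword of json.dumps.
text = json.dumps(payload, cls=ArrayEncoder)
print(text)

# The nested lists keep the shape, but the dtype must be given again.
restored = np.array(json.loads(text)["data"], dtype=np.float64)
```

In this sketch the array's dtype is not recorded in the JSON text, so it has to be supplied when rebuilding the array.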

to_npz(filename, compress=False)

Save in numpy’s npz archive format.

Parameters:

filename (str) – Path to save the npz file to.
compress (bool) – Save as a compressed archive (default: False)

Notes

To ensure loading of scalar values works as expected, casting traits should be used (e.g., CStr instead of String or Str). See the traits documentation for details.
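The reason casting traits help can be seen with numpy alone: a scalar written to an npz archive comes back as a zero-dimensional array rather than a plain Python value. The sketch below uses an in-memory buffer in place of a file and does not involve traitschema.

```python
import io

import numpy as np

# Write a scalar string and an array to an in-memory npz archive.
buffer = io.BytesIO()
np.savez(buffer, name="abc", data=np.zeros((2, 2)))
buffer.seek(0)

with np.load(buffer) as archive:
    name = archive["name"]

# The string comes back as a 0-d numpy array, not a str.
print(type(name).__name__, name.shape)

# Explicit conversion recovers the Python string.
as_str = name.item()
```

A casting trait such as CStr coerces the assigned value to a string, whereas a non-casting trait may reject the zero-dimensional array.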

exception traitschema.io.UnsupportedArchiveFormat: Raised when a file extension doesn’t match up with a supported archive format.

traitschema.io.bundle_schema(outfile, schema, format='npz')

Bundle several Schema objects into a single archive.

Parameters:

outfile (str) – Output bundle filename. Only zip archives are supported.
schema (Dict[str, Schema]) – Dictionary of `Schema` objects to bundle together. Keys are names to give each schema and are used when loading a bundle.
format (str) – Format to save individual schema as (default: `'npz'`).

Notes

Default options are used with all saving functions (e.g., no compression is used for individual serialized schema).

traitschema.io.load_bundle(filename)

Loads a bundle of schema saved with bundle_schema().

Parameters:	filename (str) – Path to bundled schema archive.
Returns:	schema – A dictionary of stored schema where the keys are the keys used when bundling. Additionally, a `__meta__` key will contain other info that was stored when saved (e.g., bundling format version number).
Return type:	dict
