traitschema
Create serializable, type-checked schema using traits and Numpy. A typical use case involves saving several Numpy arrays of varying shape and type.
Defining schema
Note
The following assumes a basic familiarity with the traits package. See its documentation for details.
In order to be able to properly serialize data, non-scalar traits should be declared as a traits.api.Array type. Example:
    import numpy as np
    from traits.api import Array, String
    from traitschema import Schema

    class NamedMatrix(Schema):
        name = String()
        data = Array(dtype=np.float64)

    matrix = NamedMatrix(name="name", data=np.random.random((8, 8)))
For other demos, see the demos directory.
Saving and loading
Data can be stored in the following formats:
- HDF5 via h5py
- JSON via the standard library json module
- Numpy npz format
Multiple schema can be saved at once to a zip file via traitschema.bundle_schema and loaded with traitschema.load_bundle.
Reference
- class traitschema.Schema(**kwargs)
  Extension to HasTraits to add methods for automatically saving and loading typed data.
  Examples
  Create a new data class:

      import numpy as np
      from traits.api import Array
      from traitschema import Schema

      class Matrix(Schema):
          data = Array(dtype=np.float64)

      matrix = Matrix(data=np.random.random((8, 8)))

  Serialize to HDF5 using h5py:

      matrix.to_hdf("out.h5")

  Load from HDF5:

      matrix_copy = Matrix.from_hdf("out.h5")
- classmethod from_hdf(filename, decode_string_arrays=True, encoding='utf-8')
  Deserialize from HDF5 using h5py.
  Parameters:
  - filename (str)
  - decode_string_arrays (bool) – When True, arrays of bytes are decoded into strings.
  - encoding (str) – Encoding scheme to use for decoding.
  Returns: Deserialized instance
- classmethod from_json(data)
  Deserialize from a JSON string or file.
  Parameters: data (str or file-like)
  Returns: Deserialized instance
- classmethod from_npz(filename)
  Load data from numpy's npz format.
  Parameters: filename (str)
- save(filename)
  Serialize using the type determined by the file extension.
  Parameters: filename (str) – Full output path.
  Notes
  Only default saving options are used, so this method is less flexible than calling the individual to_xyz methods (to_hdf, to_json, to_npz) directly.
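Choosing a format from the file extension can be pictured with the short sketch below. It is not traitschema's code: the extension table, the `format_for` helper, and the local `UnsupportedFormat` exception are all hypothetical stand-ins, and the reference above does not say which extensions `save` accepts or which exception it raises for an unknown one.

```python
import os


class UnsupportedFormat(Exception):
    """Local stand-in for an 'unknown extension' error (hypothetical)."""


# Assumed extension-to-format table, for illustration only.
EXTENSION_TO_FORMAT = {
    ".h5": "hdf",
    ".json": "json",
    ".npz": "npz",
}


def format_for(filename):
    """Return the format name implied by the extension of filename."""
    extension = os.path.splitext(filename)[1].lower()
    try:
        return EXTENSION_TO_FORMAT[extension]
    except KeyError:
        raise UnsupportedFormat(f"unsupported extension: {extension!r}") from None


chosen = format_for("results/out.h5")
```

A dispatcher like this has nowhere to accept per-format options such as compression, which is consistent with the note that only default saving options are used.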
- to_hdf(filename, mode='w', compression=None, compression_opts=None, encode_string_arrays=True, encoding='utf8')
  Serialize to HDF5 using h5py.
  Parameters:
  - filename (str) – Path to save HDF5 file to.
  - mode (str) – Default: 'w'
  - compression (str or None) – Compression to use with arrays (see h5py documentation for valid choices).
  - compression_opts (int or None) – Compression options, generally a number specifying compression level (see h5py documentation for details).
  - encode_string_arrays (bool) – When True, force encoding of arrays of unicode strings using the encoding keyword argument. Setting this to False will result in errors if the data contains arrays of unicode strings. Default: True.
  - encoding (str) – Encoding to use when forcing encoding of unicode string arrays. Default: 'utf8'.
  Notes
  Each stored dataset will also have a desc attribute which uses the desc attribute of each trait.
  The root node also has attributes:
  - classname - the class name of the instance being serialized
  - python_module - the Python module in which the class is defined
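The encode_string_arrays and decode_string_arrays options concern converting between arrays of unicode strings and arrays of bytes. The sketch below shows that conversion with NumPy alone, using np.char.encode and np.char.decode. Whether traitschema uses these exact functions internally is an assumption; the example only illustrates the round trip.

```python
import numpy as np

# A unicode string array (dtype kind "U"), including a non-ASCII label.
labels = np.array(["alpha", "β-band", "gamma"])

# Encode to a fixed-width bytes array (dtype kind "S") before storage.
encoded = np.char.encode(labels, "utf8")

# Decode back to unicode after loading.
decoded = np.char.decode(encoded, "utf8")
```

Non-ASCII characters take more than one byte in UTF-8, so the bytes array can have a wider item size than the character count of the original strings.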
- to_json(json_kwargs={})
  Serialize to JSON.
  Parameters: json_kwargs (dict) – Keyword arguments to pass to json.dumps().
  Returns: JSON string.
  Notes
  This uses a custom JSON encoder to handle numpy arrays but could conceivably lose precision. If this is important, please consider serializing in HDF5 format instead. As a consequence of using a custom encoder, the cls keyword argument, if passed, will be ignored.
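To make the note about the custom encoder concrete, here is a minimal sketch of how an encoder for numpy arrays can be written with the standard json module, and why a caller-supplied cls ends up ignored. `NumpyEncoder` and `dumps_with_numpy` are hypothetical names for this example, not traitschema's own encoder.

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Hypothetical encoder that turns numpy values into plain Python objects."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if isinstance(obj, np.generic):
            return obj.item()
        return super().default(obj)


def dumps_with_numpy(payload, json_kwargs=None):
    """Serialize payload with NumpyEncoder, overriding any cls the caller passed."""
    kwargs = dict(json_kwargs or {})
    kwargs["cls"] = NumpyEncoder  # a caller-supplied cls is replaced here
    return json.dumps(payload, **kwargs)


payload = {"name": "example", "data": np.arange(6, dtype=np.float64).reshape(2, 3)}
text = dumps_with_numpy(payload, {"indent": 2, "cls": json.JSONEncoder})

# The JSON text holds nested lists only; the dtype must be reapplied on load.
restored = json.loads(text)
restored_data = np.asarray(restored["data"], dtype=np.float64)
```

In this sketch the array's dtype is not recorded in the JSON output, so the loader has to know it in advance. That is one reason a binary format such as HDF5 can be the safer choice when exact types matter.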
- to_npz(filename, compress=False)
  Save in numpy's npz archive format.
  Parameters:
  - filename (str)
  - compress (bool) – Save as a compressed archive (default: False)
  Notes
  To ensure loading of scalar values works as expected, casting traits should be used (e.g., CStr instead of String or Str). See the traits documentation for details.
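The reason scalars need care can be seen with NumPy alone: a scalar written with np.savez comes back from np.load as a zero-dimensional array, not as the original Python object. The sketch below demonstrates this with an in-memory buffer. That casting traits such as CStr then coerce the loaded value on assignment is the behavior the note above relies on, and it is not exercised here.

```python
import io

import numpy as np

# Save a scalar string next to an array, using an in-memory buffer.
buffer = io.BytesIO()
np.savez(buffer, name="example", data=np.zeros((2, 2)))
buffer.seek(0)

with np.load(buffer) as archive:
    raw_name = archive["name"]  # a 0-d numpy array, not a str
    data = archive["data"]

# An explicit conversion recovers the plain Python string.
name = raw_name.item()
```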
- exception traitschema.io.UnsupportedArchiveFormat
  Raised when a file extension doesn't match up with a supported archive format.
- traitschema.io.bundle_schema(outfile, schema, format='npz')
  Bundle several Schema objects into a single archive.
  Parameters:
  - outfile (str) – Output bundle filename. Only zip archives are supported.
  - schema (Dict[str, Schema]) – Dictionary of Schema objects to bundle together. Keys are names to give each schema and are used when loading a bundle.
  - format (str) – Format to save individual schema as (default: 'npz').
  Notes
  Default options are used with all saving functions (e.g., no compression is used for individual serialized schema).
- traitschema.io.load_bundle(filename)
  Loads a bundle of schema saved with bundle_schema().
  Parameters: filename (str) – Path to bundled schema archive.
  Returns: schema – A dictionary of stored schema where the keys are the keys used when bundling. Additionally, a __meta__ key will contain other info that was stored when saved (e.g., bundling format version number).
  Return type: dict
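As a rough picture of what bundling into a zip archive can look like, the sketch below stores several groups of arrays as npz members of one zip file, plus a metadata entry, and reads them back into a dictionary. It is not traitschema's implementation: the helper names, the member naming scheme, the `__meta__.json` file, and the `version` field are all assumptions made for illustration, and it works on plain dictionaries of arrays, not Schema objects.

```python
import io
import json
import zipfile

import numpy as np


def bundle_arrays(outfile, groups):
    """Write each group of arrays as an npz member of a single zip archive."""
    with zipfile.ZipFile(outfile, "w") as archive:
        for key, arrays in groups.items():
            member = io.BytesIO()
            np.savez(member, **arrays)
            archive.writestr(f"{key}.npz", member.getvalue())
        # Hypothetical metadata entry recording a bundle format version.
        archive.writestr("__meta__.json", json.dumps({"version": 1}))


def load_arrays(infile):
    """Read a bundle back into a dict keyed by the names used when bundling."""
    loaded = {}
    with zipfile.ZipFile(infile) as archive:
        for member in archive.namelist():
            payload = archive.read(member)
            if member == "__meta__.json":
                loaded["__meta__"] = json.loads(payload)
            else:
                key = member[: -len(".npz")]
                with np.load(io.BytesIO(payload)) as npz:
                    loaded[key] = {name: npz[name] for name in npz.files}
    return loaded


bundle = io.BytesIO()
bundle_arrays(bundle, {"first": {"data": np.ones((2, 2))}, "second": {"data": np.arange(3)}})
loaded = load_arrays(bundle)
```

The keys of the input dictionary become the keys of the loaded dictionary, mirroring how the keys passed to bundle_schema are described as being used when loading a bundle.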