Storage Services¶
The HDF5 Storage Service¶
- class pypet.storageservice.HDF5StorageService(filename=None, file_title='Experiment')[source]¶
Storage Service to handle the storage of a trajectory/parameters/results into hdf5 files.
Normally you do not interact with the storage service directly but via the trajectory, see pypet.trajectory.Trajectory.f_store() and pypet.trajectory.Trajectory.f_load().
The service is not thread safe. For multiprocessing the service needs to be wrapped either by the LockWrapper or with a combination of QueueStorageServiceSender and QueueStorageServiceWriter.
The storage service supports two operations store and load.
Requests for these two are always passed as msg, what_to_store_or_load, *args, **kwargs
For example:
>>> HDF5StorageService.load(pypetconstants.LEAF, myresult, load_only=['spiketimes','nspikes'])
For a list of supported items see store() and load().
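For routine use the service is driven indirectly through the trajectory. Below is a minimal, hedged sketch; the trajectory name, filename, and parameter are placeholders:

from pypet.trajectory import Trajectory

# Create a trajectory backed by an HDF5 file; f_store() delegates
# to the HDF5StorageService under the hood.
traj = Trajectory(name='example', filename='experiment.hdf5')
traj.f_add_parameter('x', 42, comment='A placeholder parameter')
traj.f_store()

# In a fresh session, load it back:
traj2 = Trajectory(name='example', filename='experiment.hdf5')
traj2.f_load(load_parameters=2)  # 2 = LOAD_DATA, see load() below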
- ADD_ROW = 'ADD'¶
Adds a row to an overview table
- REMOVE_ROW = 'REMOVE'¶
Removes a row from an overview table
- MODIFY_ROW = 'MODIFY'¶
Changes a row of an overview table
- COLL_TYPE = 'COLL_TYPE'¶
Type of a container stored to hdf5, like list, tuple, dict, etc.
Must be stored in order to allow perfect reconstructions.
- COLL_LIST = 'COLL_LIST'¶
Container was a list
- COLL_TUPLE = 'COLL_TUPLE'¶
Container was a tuple
- COLL_NDARRAY = 'COLL_NDARRAY'¶
Container was a numpy array
- COLL_MATRIX = 'COLL_MATRIX'¶
Container was a numpy matrix
- COLL_DICT = 'COLL_DICT'¶
Container was a dictionary
- COLL_SCALAR = 'COLL_SCALAR'¶
No container, but the thing to store was a scalar
- SCALAR_TYPE = 'SCALAR_TYPE'¶
Type of scalars stored into a container
- NAME_TABLE_MAPPING = {'_overview_explored_parameters': 'explored_parameters', '_overview_parameters': 'parameters', '_overview_derived_parameters_trajectory': 'derived_parameters_trajectory', '_overview_config': 'config', '_overview_results_runs': 'results_runs', '_overview_results_trajectory': 'results_trajectory', '_overview_derived_parameters_runs_summary': 'derived_parameters_runs_summary', '_overview_derived_parameters_runs': 'derived_parameters_runs', '_overview_results_runs_summary': 'results_runs_summary'}¶
Mapping of trajectory config names to the tables
- PR_ATTR_NAME_MAPPING = {'_derived_parameters_per_run': 'derived_parameters_per_run', '_purge_duplicate_comments': 'purge_duplicate_comments', '_overview_explored_parameters_runs': 'explored_parameters_runs', '_results_per_run': 'results_per_run'}¶
Mapping of Attribute names for hdf5_settings table
- ATTR_LIST = ['complevel', 'complib', 'shuffle', 'fletcher32', 'pandas_format', 'pandas_append']¶
List of HDF5StorageService Attributes that have to be stored into the hdf5_settings table
- STORAGE_TYPE = 'SRVC_STORE'¶
Flag describing how data was stored
- DICT = 'DICT'¶
Stored as dict.
In fact, it is stored as a pytables table, but the dictionary will be reconstructed.
- SERIES = 'SERIES'¶
Store data as pandas Series
- PANEL = 'PANEL'¶
Store data as pandas Panel or Panel4D
- TYPE_FLAG_MAPPING = {<type 'list'>: 'ARRAY', <type 'tuple'>: 'ARRAY', <type 'dict'>: 'DICT', <class 'pypet.parameter.ObjectTable'>: 'TABLE', <class 'DataFrame'>: 'FRAME', <class 'Series'>: 'SERIES', <class 'Panel'>: 'PANEL', <class 'Panel4D'>: 'PANEL', ...}¶
Mapping from object type to storage flag. Numpy scalar types likewise map to ‘ARRAY’, and numpy arrays and matrices map to ‘CARRAY’ (elided above).
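The mapping can be queried directly to see which flag a given type receives; a small sketch using only entries shown above:

from pypet.storageservice import HDF5StorageService
from pypet.parameter import ObjectTable

# Plain containers and pypet classes map to their flags:
assert HDF5StorageService.TYPE_FLAG_MAPPING[dict] == 'DICT'
assert HDF5StorageService.TYPE_FLAG_MAPPING[list] == 'ARRAY'
assert HDF5StorageService.TYPE_FLAG_MAPPING[ObjectTable] == 'TABLE'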
- FORMATTED_COLUMN_PREFIX = 'SRVC_COLUMN_%s_'¶
Stores data type of a specific pytables column for perfect reconstruction
- DATA_PREFIX = 'SRVC_DATA_'¶
Stores data type of a pytables carray or array for perfect reconstruction
- ANNOTATION_PREFIX = 'SRVC_AN_'¶
Prefix to store annotations as node attributes
- ANNOTATED = 'SRVC_ANNOTATED'¶
Whether an item was annotated
- INIT_PREFIX = 'SRVC_INIT_'¶
Hdf5 attribute prefix to store class name of parameter or result
- CLASS_NAME = 'SRVC_INIT_CLASS_NAME'¶
Name of a parameter or result class, is converted to a constructor
- COMMENT = 'SRVC_INIT_COMMENT'¶
Comment of parameter or result
- LENGTH = 'SRVC_INIT_LENGTH'¶
Length of a parameter if it is explored
- LEAF = 'SRVC_LEAF'¶
Whether an hdf5 node is a leaf node
- pandas_format[source]¶
Format of pandas data. Applicable formats are ‘table’ (or ‘t’) and ‘fixed’ (or ‘f’)
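A minimal sketch of adjusting the format, assuming the service is created directly and the property is settable; the filename is a placeholder:

from pypet.storageservice import HDF5StorageService

service = HDF5StorageService(filename='experiment.hdf5',
                             file_title='Experiment')
service.pandas_format = 'table'  # or 't'; use 'fixed'/'f' for fixed format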
- load(msg, stuff_to_load, *args, **kwargs)[source]¶
Loads a particular item from disk.
The storage service always accepts these parameters:
Parameters: - trajectory_name – Name of current trajectory and name of top node in hdf5 file.
- trajectory_index – If no trajectory_name is provided, you can specify an integer index. The trajectory at the index position in the hdf5 file is considered to be loaded. Negative indices are also possible for reverse indexing.
- filename – Name of the hdf5 file
The following messages (first argument msg) are understood and the following arguments can be provided in combination with the message:
pypet.pypetconstants.TRAJECTORY (‘TRAJECTORY’)
Loads a trajectory.
param stuff_to_load: The trajectory
param as_new: Whether to load the trajectory as new
param load_parameters: How to load parameters and config
param load_derived_parameters: How to load derived parameters
param load_results: How to load results
param force: Force load in case there is a pypet version mismatch
You can specify how to load the parameters, derived parameters and results as follows:
pypet.pypetconstants.LOAD_NOTHING: (0)
Nothing is loaded
pypet.pypetconstants.LOAD_SKELETON: (1)
The skeleton including annotations is loaded, i.e. the items are empty. Non-empty items in RAM are left untouched.
pypet.pypetconstants.LOAD_DATA: (2)
The whole data is loaded. Only empty instances, or instances that do not yet exist in RAM, are filled with the data found on disk.
pypet.pypetconstants.OVERWRITE_DATA: (3)
The whole data is loaded. If items that are to be loaded are already in RAM and not empty, they are emptied and new data is loaded from disk.
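The modes can be mixed per item category; a hedged sketch using the trajectory interface (names and the file are placeholders):

from pypet import pypetconstants
from pypet.trajectory import Trajectory

traj = Trajectory(name='example', filename='experiment.hdf5')
traj.f_load(load_parameters=pypetconstants.LOAD_DATA,              # 2
            load_derived_parameters=pypetconstants.LOAD_SKELETON,  # 1
            load_results=pypetconstants.LOAD_NOTHING)              # 0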
pypet.pypetconstants.LEAF (‘LEAF’)
Loads a parameter or result.
param stuff_to_load: The item to be loaded
param load_only: If you load a result, you can partially load it and ignore the rest of the data. Just specify the name of the data you want to load. You can also provide a list, for example load_only='spikes' or load_only=['spikes', 'membrane_potential'].
Issues a warning if items cannot be found.
param load_except: If you load a result, you can partially load it and specify items that should NOT be loaded here. You cannot use load_except and load_only at the same time.
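A hedged sketch of partial loading, mirroring the example at the top of this section ('myresult' stands for a result instance already known to the service, and the data names are placeholders):

from pypet import pypetconstants

# Load only selected parts of a result:
service.load(pypetconstants.LEAF, myresult,
             load_only=['spikes', 'membrane_potential'])

# Or load everything except certain parts (mutually exclusive
# with load_only):
service.load(pypetconstants.LEAF, myresult,
             load_except=['membrane_potential'])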
pypet.pypetconstants.TREE (‘TREE’)
Loads a whole subtree
param stuff_to_load: The parent node (!), not the node where loading starts
param child_name: Name of the child node that should be loaded
param recursive: Whether to recursively load the subtree below the child
param load_data: How to load the data, accepted values as above for loading the trajectory
param trajectory: The trajectory object
pypet.pypetconstants.LIST (‘LIST’)
Analogous to storing lists
Raises: NoSuchServiceError if message or data is not understood
DataNotInStorageError if data to be loaded cannot be found on disk
- store(msg, stuff_to_store, *args, **kwargs)[source]¶
Stores a particular item to disk.
The storage service always accepts these parameters:
Parameters: - trajectory_name – Name of the current trajectory and name of top node in hdf5 file
- filename – Name of the hdf5 file
- file_title – If file needs to be created, assigns a title to the file.
The following messages (first argument msg) are understood and the following arguments can be provided in combination with the message:
pypet.pypetconstants.PREPARE_MERGE (‘PREPARE_MERGE’):
Called to prepare a trajectory for merging, see also ‘MERGE’ below.
Will also be called if merging cannot happen within the same hdf5 file. Stores already enlarged parameters and updates meta information.
param stuff_to_store: Trajectory that is about to be extended by another one
param changed_parameters: List containing all parameters that were enlarged due to merging
pypet.pypetconstants.MERGE (‘MERGE’)
Note that before merging within an HDF5 file, the storage service will first be called with msg=‘PREPARE_MERGE’, see above.
Raises a ValueError if the two trajectories are not stored within the very same hdf5 file; in that case the current trajectory needs to perform the merge slowly, item by item.
Merges two trajectories, parameters are:
param stuff_to_store: The trajectory the data is merged into
param other_trajectory_name: Name of the other trajectory
param rename_dict: Dictionary containing the old result and derived parameter names in the other trajectory and their new names in the current trajectory
param move_nodes: Whether to move the nodes from the other trajectory to the current one
param delete_trajectory: Whether to delete the other trajectory after merging
pypet.pypetconstants.BACKUP (‘BACKUP’)
param stuff_to_store: Trajectory to be backed up
param backup_filename: Name of the file where to store the backup. If None, the backup file will be placed in the same folder as your hdf5 file and named ‘backup_XXXXX.hdf5’, where ‘XXXXX’ is the name of your current trajectory.
pypet.pypetconstants.TRAJECTORY (‘TRAJECTORY’)
Stores the whole trajectory
param stuff_to_store: The trajectory to be stored
param only_init: Whether you only want to initialise the storage. If so, only meta information about the trajectory is stored and none of the nodes/leaves within the trajectory.
pypet.pypetconstants.SINGLE_RUN (‘SINGLE_RUN’)
param stuff_to_store: The single run to be stored
param store_data: Whether all data below run_XXXXXXXX should be stored
param store_final: Whether final meta info should be stored
pypet.pypetconstants.LEAF (‘LEAF’)
Stores a parameter or result.
Modification of results is not supported (yet). Everything stored to disk is set in stone!
Note that everything that is supported by the storage service and that is stored to disk will be perfectly recovered. For instance, if you store a tuple of numpy 32 bit integers, you will get a tuple of numpy 32 bit integers after loading, independent of the platform!
param stuff_to_store: Result or parameter to store
In order to determine what to store, the function ‘_store’ of the parameter or result is called. This function returns a dictionary with name keys and data to store as values. In order to determine how to store the data, the storage flags are considered, see below.
The function ‘_store’ has to return a dictionary containing values only from the following objects:
- python natives (int, long, str, bool, float, complex),
- numpy natives, arrays and matrices of type np.int8-64, np.uint8-64, np.float32-64, np.complex, np.str
- python lists and tuples of the previous types (python natives + numpy natives and arrays). Lists and tuples must not be nested and must be homogeneous, i.e. only contain data of one particular type, e.g. only integers or only floats.
- python dictionaries of the previous types (not nested!), data can be heterogeneous, keys must be strings. For example, one key-value-pair of string and int and one key-value pair of string and float, and so on.
- pandas DataFrames
- ObjectTable
The keys from the ‘_store’ dictionaries determine how the data will be named in the hdf5 file.
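A hedged sketch of what such a ‘_store’ dictionary may look like; the keys are illustrative and become the names in the hdf5 file:

import numpy as np
import pandas as pd

def _store(self):
    return {
        'x': 42,                                     # python native
        'arr': np.array([1, 2, 3], dtype=np.int32),  # numpy array
        'homogeneous': (1.0, 2.0, 3.0),              # flat, homogeneous tuple
        'mapping': {'a': 1, 'b': 2.0},               # flat dict with str keys
        'frame': pd.DataFrame({'col': [1, 2]}),      # pandas DataFrame
    }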
param store_flags: Flags describing how to store data.
ARRAY (‘ARRAY’)
Store stuff as array
CARRAY (‘CARRAY’)
Store stuff as carray
TABLE (‘TABLE’)
Store stuff as pytable
DICT (‘DICT’)
Store stuff as pytable but reconstructs it later as dictionary on loading
FRAME (‘FRAME’)
Store stuff as pandas data frame
Storage flags can also be provided by the parameters and results themselves if they implement a function ‘_store_flags’ that returns a dictionary with the names of the data to store as keys and the flags as values.
If no storage flags are provided, they are automatically inferred from the data. See pypet.HDF5StorageService.TYPE_FLAG_MAPPING for the mapping from type to flag.
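A minimal sketch of passing explicit flags, assuming store_flags takes a name-to-flag dictionary in the same form that ‘_store_flags’ would return (‘myresult’ and the data names are placeholders):

from pypet import pypetconstants

service.store(pypetconstants.LEAF, myresult,
              store_flags={'spikes': 'CARRAY',  # compressed array
                           'meta': 'TABLE'})    # pytables table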
param overwrite: Can be used if parts of a leaf should be replaced. Either a list of HDF5 names to replace or True to replace all of them.
pypet.pypetconstants.DELETE (‘DELETE’)
Removes an item from disk. Empty group nodes, results and non-explored parameters can be removed.
param stuff_to_store: The item to be removed
param remove_empty_groups: Whether to also remove groups that become empty due to the removal; default is False
param delete_only: Optional list of parts of a leaf node that should be deleted
param remove_from_item: If delete_only is used, whether the deleted parts should also be erased from the leaf nodes themselves
pypet.pypetconstants.GROUP (‘GROUP’)
param stuff_to_store: The group to store
pypet.pypetconstants.TREE (‘TREE’)
Stores a single node or a full subtree.
param stuff_to_store: Node to store
param recursive: Whether to recursively store the whole subtree below the node
pypet.pypetconstants.LIST (‘LIST’)
Stores several items at once.
param stuff_to_store: Iterable whose items are to be stored. Iterable must contain tuples, for example [(msg1,item1,arg1,kwargs1),(msg2,item2,arg2,kwargs2),...]
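A hedged sketch of batching requests in this tuple form (‘result_a’ and ‘result_b’ are placeholder leaves):

from pypet import pypetconstants

requests = [(pypetconstants.LEAF, result_a),
            (pypetconstants.LEAF, result_b)]
service.store(pypetconstants.LIST, requests)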
Raises: NoSuchServiceError if message or data is not understood
The Multiprocessing Wrappers¶
- class pypet.storageservice.LockWrapper(storage_service, lock)[source]¶
For multiprocessing in WRAP_MODE_LOCK mode, augments a storage service with a lock.
The lock is acquired before storage or loading and released afterwards.
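A minimal sketch of wrapping a service, assuming a manager-based lock shared between processes (the filename is a placeholder):

import multiprocessing as mp
from pypet.storageservice import HDF5StorageService, LockWrapper

manager = mp.Manager()
lock = manager.RLock()
service = HDF5StorageService(filename='experiment.hdf5')
wrapped = LockWrapper(service, lock)
# 'wrapped' exposes the same store/load interface as the raw service.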
- class pypet.storageservice.QueueStorageServiceSender[source]¶
For multiprocessing with WRAP_MODE_QUEUE, replaces the original storage service.
All storage requests are sent over a queue to the process running the QueueStorageServiceWriter.
Does not support loading of data!