Storage Services¶

The HDF5 Storage Service¶

class pypet.storageservice.HDF5StorageService(filename=None, file_title=None, overwrite_file=False, encoding='utf8', complevel=9, complib='zlib', shuffle=True, fletcher32=False, pandas_format='fixed', purge_duplicate_comments=True, summary_tables=True, small_overview_tables=True, large_overview_tables=False, results_per_run=0, derived_parameters_per_run=0, display_time=20, trajectory=None)[source]¶

Storage Service to handle the storage of a trajectory/parameters/results into hdf5 files.

Normally you do not interact with the storage service directly but via the trajectory, see pypet.trajectory.Trajectory.f_store() and pypet.trajectory.Trajectory.f_load().

The service is not thread safe. For multiprocessing the service needs to be wrapped either by the LockWrapper or with a combination of QueueStorageServiceSender and QueueStorageServiceWriter.

The storage service supports two operations store and load.

Requests for these two are always passed as msg, what_to_store_or_load, *args, **kwargs

For example:

>>> HDF5StorageService.load(pypetconstants.LEAF, myresult, load_only=['spikestimes','nspikes'])

For a list of supported items see store() and load().

The service accepts the following parameters

Parameters:

Parameters:	filename – The name of the hdf5 file. If none is specified the default ./hdf5/the_name_of_your_trajectory.hdf5 is chosen. If filename contains only a path like filename=’./myfolder/’, it is changed to `filename=’./myfolder/the_name_of_your_trajectory.hdf5’. file_title – Title of the hdf5 file (only important if file is created new) overwrite_file – If the file already exists it will be overwritten. Otherwise the trajectory will simply be added to the file and already existing trajectories are not deleted. encoding – Format to encode and decode unicode strings stored to disk. The default `'utf8'` is highly recommended. complevel – If you use HDF5, you can specify your compression level. 0 means no compression and 9 is the highest compression level. See PyTables Compression for a detailed description. complib – The library used for compression. Choose between zlib, blosc, and lzo. Note that ‘blosc’ and ‘lzo’ are usually faster than ‘zlib’ but it may be the case that you can no longer open your hdf5 files with third-party applications that do not rely on PyTables. shuffle – Whether or not to use the shuffle filters in the HDF5 library. This normally improves the compression ratio. fletcher32 – Whether or not to use the Fletcher32 filter in the HDF5 library. This is used to add a checksum on hdf5 data. pandas_format – How to store pandas data frames. Either in ‘fixed’ (‘f’) or ‘table’ (‘t’) format. Fixed format allows fast reading and writing but disables querying the hdf5 data and appending to the store (with other 3rd party software other than pypet). purge_duplicate_comments – If you add a result via `f_add_result()` or a derived parameter `f_add_derived_parameter()` and you set a comment, normally that comment would be attached to each and every instance. This can produce a lot of unnecessary overhead if the comment is the same for every instance over all runs. If purge_duplicate_comments=1 than only the comment of the first result or derived parameter instance created in a run is stored or comments that differ from this first comment. For instance, during a single run you call traj.f_add_result(‘my_result,42, comment=’Mostly harmless!’)` and the result will be renamed to results.run_00000000.my_result. After storage in the node associated with this result in your hdf5 file, you will find the comment ‘Mostly harmless!’ there. If you call traj.f_add_result(‘my_result’,-43, comment=’Mostly harmless!’) in another run again, let’s say run 00000001, the name will be mapped to results.run_00000001.my_result. But this time the comment will not be saved to disk since ‘Mostly harmless!’ is already part of the very first result with the name ‘results.run_00000000.my_result’. Note that the comments will be compared and storage will only be discarded if the strings are exactly the same. If you use multiprocessing, the storage service will take care that the comment for the result or derived parameter with the lowest run index will be considered regardless of the order of the finishing of your runs. Note that this only works properly if all comments are the same. Otherwise the comment in the overview table might not be the one with the lowest run index. You need summary tables (see below) to be able to purge duplicate comments. This feature only works for comments in leaf nodes (aka Results and Parameters). So try to avoid to add comments in group nodes within single runs. summary_tables – Whether the summary tables should be created, i.e. the ‘derived_parameters_runs_summary’, and the results_runs_summary. The ‘XXXXXX_summary’ tables give a summary about all results or derived parameters. It is assumed that results and derived parameters with equal names in individual runs are similar and only the first result or derived parameter that was created is shown as an example. The summary table can be used in combination with purge_duplicate_comments to only store a single comment for every result with the same name in each run, see above. small_overview_tables – Whether the small overview tables should be created. Small tables are giving overview about ‘config’,’parameters’, ‘derived_parameters_trajectory’, , ‘results_trajectory’,’results_runs_summary’. Note that these tables create some overhead. If you want very small hdf5 files set small_overview_tables to False. large_overview_tables – Whether to add large overview tables. This encompasses information about every derived parameter, result, and the explored parameter in every single run. If you want small hdf5 files, this is the first option to set to false. results_per_run – Expected results you store per run. If you give a good/correct estimate storage to hdf5 file is much faster in case you store LARGE overview tables. Default is 0, i.e. the number of results is not estimated! derived_parameters_per_run – Analogous to the above. display_time – How often status messages about loading and storing time should be displayed. Interval in seconds. trajectory – A trajectory container, the storage service will add the used parameter to the trajectory container.

filename – The name of the hdf5 file. If none is specified the default ./hdf5/the_name_of_your_trajectory.hdf5 is chosen. If filename contains only a path like filename=’./myfolder/’, it is changed to `filename=’./myfolder/the_name_of_your_trajectory.hdf5’.
file_title – Title of the hdf5 file (only important if file is created new)
overwrite_file – If the file already exists it will be overwritten. Otherwise the trajectory will simply be added to the file and already existing trajectories are not deleted.
encoding – Format to encode and decode unicode strings stored to disk. The default 'utf8' is highly recommended.
complevel –
If you use HDF5, you can specify your compression level. 0 means no compression and 9 is the highest compression level. See PyTables Compression for a detailed description.
complib – The library used for compression. Choose between zlib, blosc, and lzo. Note that ‘blosc’ and ‘lzo’ are usually faster than ‘zlib’ but it may be the case that you can no longer open your hdf5 files with third-party applications that do not rely on PyTables.
shuffle – Whether or not to use the shuffle filters in the HDF5 library. This normally improves the compression ratio.
fletcher32 – Whether or not to use the Fletcher32 filter in the HDF5 library. This is used to add a checksum on hdf5 data.
pandas_format – How to store pandas data frames. Either in ‘fixed’ (‘f’) or ‘table’ (‘t’) format. Fixed format allows fast reading and writing but disables querying the hdf5 data and appending to the store (with other 3rd party software other than pypet).
purge_duplicate_comments –
If you add a result via f_add_result() or a derived parameter f_add_derived_parameter() and you set a comment, normally that comment would be attached to each and every instance. This can produce a lot of unnecessary overhead if the comment is the same for every instance over all runs. If purge_duplicate_comments=1 than only the comment of the first result or derived parameter instance created in a run is stored or comments that differ from this first comment.

For instance, during a single run you call traj.f_add_result(‘my_result,42, comment=’Mostly harmless!’)` and the result will be renamed to results.run_00000000.my_result. After storage in the node associated with this result in your hdf5 file, you will find the comment ‘Mostly harmless!’ there. If you call traj.f_add_result(‘my_result’,-43, comment=’Mostly harmless!’) in another run again, let’s say run 00000001, the name will be mapped to results.run_00000001.my_result. But this time the comment will not be saved to disk since ‘Mostly harmless!’ is already part of the very first result with the name ‘results.run_00000000.my_result’. Note that the comments will be compared and storage will only be discarded if the strings are exactly the same.

If you use multiprocessing, the storage service will take care that the comment for the result or derived parameter with the lowest run index will be considered regardless of the order of the finishing of your runs. Note that this only works properly if all comments are the same. Otherwise the comment in the overview table might not be the one with the lowest run index.

You need summary tables (see below) to be able to purge duplicate comments.

This feature only works for comments in leaf nodes (aka Results and Parameters). So try to avoid to add comments in group nodes within single runs.
summary_tables –
Whether the summary tables should be created, i.e. the ‘derived_parameters_runs_summary’, and the results_runs_summary.

The ‘XXXXXX_summary’ tables give a summary about all results or derived parameters. It is assumed that results and derived parameters with equal names in individual runs are similar and only the first result or derived parameter that was created is shown as an example.

The summary table can be used in combination with purge_duplicate_comments to only store a single comment for every result with the same name in each run, see above.
small_overview_tables –
Whether the small overview tables should be created. Small tables are giving overview about ‘config’,’parameters’, ‘derived_parameters_trajectory’, , ‘results_trajectory’,’results_runs_summary’.

Note that these tables create some overhead. If you want very small hdf5 files set small_overview_tables to False.
large_overview_tables – Whether to add large overview tables. This encompasses information about every derived parameter, result, and the explored parameter in every single run. If you want small hdf5 files, this is the first option to set to false.
results_per_run –
Expected results you store per run. If you give a good/correct estimate storage to hdf5 file is much faster in case you store LARGE overview tables.

Default is 0, i.e. the number of results is not estimated!
derived_parameters_per_run – Analogous to the above.
display_time – How often status messages about loading and storing time should be displayed. Interval in seconds.
trajectory – A trajectory container, the storage service will add the used parameter to the trajectory container.

ADD_ROW = 'ADD'¶: Adds a row to an overview table

REMOVE_ROW = 'REMOVE'¶: Removes a row from an overview table

MODIFY_ROW = 'MODIFY'¶: Changes a row of an overview table

COLL_TYPE = 'COLL_TYPE'¶

Type of a container stored to hdf5, like list,tuple,dict,etc

Must be stored in order to allow perfect reconstructions.

COLL_LIST = 'COLL_LIST'¶: Container was a list

COLL_TUPLE = 'COLL_TUPLE'¶: Container was a tuple

COLL_NDARRAY = 'COLL_NDARRAY'¶: Container was a numpy array

COLL_MATRIX = 'COLL_MATRIX'¶: Container was a numpy matrix

COLL_DICT = 'COLL_DICT'¶: Container was a dictionary

COLL_EMPTY_DICT = 'COLL_EMPTY_DICT'¶: Container was an empty dictionary

COLL_SCALAR = 'COLL_SCALAR'¶: No container, but the thing to store was a scalar

SCALAR_TYPE = 'SCALAR_TYPE'¶: Type of scalars stored into a container

NAME_TABLE_MAPPING = {'_overview_config': 'config_overview', '_overview_derived_parameters': 'derived_parameters_overview', '_overview_derived_parameters_summary': 'derived_parameters_summary', '_overview_explored_parameters': 'explored_parameters_overview', '_overview_parameters': 'parameters_overview', '_overview_results': 'results_overview', '_overview_results_summary': 'results_summary'}¶: Mapping of trajectory config names to the tables

PR_ATTR_NAME_MAPPING = {'_derived_parameters_per_run': 'derived_parameters_per_run', '_purge_duplicate_comments': 'purge_duplicate_comments', '_results_per_run': 'results_per_run'}¶: Mapping of Attribute names for hdf5_settings table

ATTR_LIST = ['complevel', 'complib', 'shuffle', 'fletcher32', 'pandas_format', 'encoding']¶: List of HDF5StorageService Attributes that have to be stored into the hdf5_settings table

STORAGE_TYPE = 'SRVC_STORE'¶: Flag, how data was stored

ARRAY = 'ARRAY'¶: Stored as array

CARRAY = 'CARRAY'¶: Stored as carray

EARRAY = 'EARRAY'¶: Stored as earray_e.

VLARRAY = 'VLARRAY'¶: Stored as vlarray

TABLE = 'TABLE'¶: Stored as pytable

DICT = 'DICT'¶

Stored as dict.

In fact, stored as pytable, but the dictionary wil be reconstructed.

FRAME = 'FRAME'¶: Stored as pandas DataFrame

SERIES = 'SERIES'¶: Store data as pandas Series

SPLIT_TABLE = 'SPLIT_TABLE'¶: If a table was split due to too many columns

DATATYPE_TABLE = 'DATATYPE_TABLE'¶: If a table contains the data types instead of the attrs

SHARED_DATA = 'SHARED_DATA_'¶: An HDF5 data object for direct interaction

NESTED_GROUP = 'NESTED_GROUP'¶: An HDF5 data object containing nested data

TYPE_FLAG_MAPPING = {<class 'pypet.parameter.ObjectTable'>: 'TABLE', <class 'list'>: 'ARRAY', <class 'tuple'>: 'ARRAY', <class 'dict'>: 'DICT', <Mock object>: 'CARRAY', <Mock object>: 'CARRAY', <class 'DataFrame'>: 'FRAME', <class 'Series'>: 'SERIES', <class 'pypet.shareddata.SharedTable'>: 'SHARED_DATA_', <class 'pypet.shareddata.SharedArray'>: 'SHARED_DATA_', <class 'pypet.shareddata.SharedPandasFrame'>: 'SHARED_DATA_', <class 'pypet.shareddata.SharedCArray'>: 'SHARED_DATA_', <class 'pypet.shareddata.SharedEArray'>: 'SHARED_DATA_', <class 'pypet.shareddata.SharedVLArray'>: 'SHARED_DATA_', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <Mock object>: 'ARRAY', <class 'str'>: 'ARRAY', <class 'bytes'>: 'ARRAY'}¶: Mapping from object type to storage flag

FORMATTED_COLUMN_PREFIX = 'SRVC_COLUMN_%s_'¶: Stores data type of a specific pytables column for perfect reconstruction

DATA_PREFIX = 'SRVC_DATA_'¶: Stores data type of a pytables carray or array for perfect reconstruction

ANNOTATION_PREFIX = 'SRVC_AN_'¶: Prefix to store annotations as node attributes

ANNOTATED = 'SRVC_ANNOTATED'¶: Whether an item was annotated

INIT_PREFIX = 'SRVC_INIT_'¶: Hdf5 attribute prefix to store class name of parameter or result

CLASS_NAME = 'SRVC_INIT_CLASS_NAME'¶: Name of a parameter or result class, is converted to a constructor

COMMENT = 'SRVC_INIT_COMMENT'¶: Comment of parameter or result

LENGTH = 'SRVC_INIT_LENGTH'¶: Length of a parameter if it is explored, no longer in use, only for backwards compatibility

LEAF = 'SRVC_LEAF'¶: Whether an hdf5 node is a leaf node

is_open¶

Normally the file is opened and closed after each insertion.

However, the storage service may provide the option to keep the store open and signals this via this property.

encoding¶: How unicode strings are encoded

display_time¶: Time interval in seconds, when to display the storage or loading of nodes

complib¶: Compression library used

complevel¶: Compression level used

fletcher32¶: Whether fletcher 32 should be used

shuffle¶: Whether shuffle filtering should be used

pandas_format¶: Format of pandas data. Applicable formats are ‘table’ (or ‘t’) and ‘fixed’ (or ‘f’)

filename¶: The name and path of the underlying hdf5 file.

load(msg, stuff_to_load, *args, **kwargs)[source]¶

Loads a particular item from disk.

The storage service always accepts these parameters:

Parameters:	trajectory_name – Name of current trajectory and name of top node in hdf5 file. trajectory_index – If no trajectory_name is provided, you can specify an integer index. The trajectory at the index position in the hdf5 file is considered to loaded. Negative indices are also possible for reverse indexing. filename – Name of the hdf5 file

The following messages (first argument msg) are understood and the following arguments can be provided in combination with the message:

pypet.pypetconstants.TRAJECTORY (‘TRAJECTORY’)

Loads a trajectory.

param stuff_to_load:

The trajectory

param as_new: Whether to load trajectory as new

param load_parameters:

How to load parameters and config

param load_derived_parameters:

How to load derived parameters

param load_results:

How to load results

param force: Force load in case there is a pypet version mismatch

You can specify how to load the parameters, derived parameters and results as follows:

pypet.pypetconstants.LOAD_NOTHING: (0)

Nothing is loaded

pypet.pypetconstants.LOAD_SKELETON: (1)

The skeleton including annotations are loaded, i.e. the items are empty. Non-empty items in RAM are left untouched.

pypet.pypetconstants.LOAD_DATA: (2)

The whole data is loaded. Only empty or in RAM non-existing instance are filled with the data found on disk.

pypet.pypetconstants.OVERWRITE_DATA: (3)

The whole data is loaded. If items that are to be loaded are already in RAM and not empty, they are emptied and new data is loaded from disk.

pypet.pypetconstants.LEAF (‘LEAF’)

Loads a parameter or result.

param stuff_to_load:

The item to be loaded

param load_data:

How to load data

param load_only:

If you load a result, you can partially load it and ignore the rest of the data. Just specify the name of the data you want to load. You can also provide a list, for example load_only=’spikes’, load_only=[‘spikes’,’membrane_potential’].

Issues a warning if items cannot be found.

param load_except:

If you load a result you can partially load in and specify items that should NOT be loaded here. You cannot use load_except and load_only at the same time.

pypet.pyetconstants.GROUP

Loads a group a node (comment and annotations)

param recursive:

Recursively loads everything below

param load_data:

How to load stuff if recursive=True accepted values as above for loading the trajectory

param max_depth:

Maximum depth in case of recursion. None for no limit.

pypet.pypetconstants.TREE (‘TREE’)

Loads a whole subtree

param stuff_to_load:

The parent node (!) not the one where loading starts!

param child_name:

Name of child node that should be loaded

param recursive:

Whether to load recursively the subtree below child

param load_data:

How to load stuff, accepted values as above for loading the trajectory

param max_depth:

Maximum depth in case of recursion. None for no limit.

param trajectory:

The trajectory object

pypet.pypetconstants.LIST (‘LIST’)

Analogous to storing lists

param stuff_to_load:
	The trajectory
param as_new:	Whether to load trajectory as new
param load_parameters:
	How to load parameters and config
param load_derived_parameters:
	How to load derived parameters
param load_results:
	How to load results
param force:	Force load in case there is a pypet version mismatch

param stuff_to_load:
	The item to be loaded
param load_data:
	How to load data
param load_only:
	If you load a result, you can partially load it and ignore the rest of the data. Just specify the name of the data you want to load. You can also provide a list, for example load_only=’spikes’, load_only=[‘spikes’,’membrane_potential’]. Issues a warning if items cannot be found.
param load_except:
	If you load a result you can partially load in and specify items that should NOT be loaded here. You cannot use load_except and load_only at the same time.

param recursive:
	Recursively loads everything below
param load_data:
	How to load stuff if `recursive=True` accepted values as above for loading the trajectory
param max_depth:
	Maximum depth in case of recursion. None for no limit.

param stuff_to_load:
	The parent node (!) not the one where loading starts!
param child_name:
	Name of child node that should be loaded
param recursive:
	Whether to load recursively the subtree below child
param load_data:
	How to load stuff, accepted values as above for loading the trajectory
param max_depth:
	Maximum depth in case of recursion. None for no limit.
param trajectory:
	The trajectory object

Raises:

Raises:	NoSuchServiceError if message or data is not understood DataNotInStorageError if data to be loaded cannot be found on disk

NoSuchServiceError if message or data is not understood

DataNotInStorageError if data to be loaded cannot be found on disk

store(msg, stuff_to_store, *args, **kwargs)[source]¶

Stores a particular item to disk.

The storage service always accepts these parameters:

Parameters:	trajectory_name – Name or current trajectory and name of top node in hdf5 file filename – Name of the hdf5 file file_title – If file needs to be created, assigns a title to the file.

The following messages (first argument msg) are understood and the following arguments can be provided in combination with the message:

pypet.pypetconstants.PREPARE_MERGE (‘PREPARE_MERGE’):

Called to prepare a trajectory for merging, see also ‘MERGE’ below.

Will also be called if merging cannot happen within the same hdf5 file. Stores already enlarged parameters and updates meta information.

param stuff_to_store:

Trajectory that is about to be extended by another one

param changed_parameters:

List containing all parameters that were enlarged due to merging

param old_length:

Old length of trajectory before merge

pypet.pypetconstants.MERGE (‘MERGE’)

Note that before merging within HDF5 file, the storage service will be called with msg=’PREPARE_MERGE’ before, see above.

Raises a ValueError if the two trajectories are not stored within the very same hdf5 file. Then the current trajectory needs to perform the merge slowly item by item.

Merges two trajectories, parameters are:

param stuff_to_store:

The trajectory data is merged into

param other_trajectory_name:

Name of the other trajectory

param rename_dict:

Dictionary containing the old result and derived parameter names in the other trajectory and their new names in the current trajectory.

param move_nodes:

Whether to move the nodes from the other to the current trajectory

param delete_trajectory:

Whether to delete the other trajectory after merging.

pypet.pypetconstants.BACKUP (‘BACKUP’)

param stuff_to_store:

Trajectory to be backed up

param backup_filename:

Name of file where to store the backup. If None the backup file will be in the same folder as your hdf5 file and named ‘backup_XXXXX.hdf5’ where ‘XXXXX’ is the name of your current trajectory.

pypet.pypetconstants.TRAJECTORY (‘TRAJECTORY’)

Stores the whole trajectory

param stuff_to_store:

The trajectory to be stored

param only_init:

If you just want to initialise the store. If yes, only meta information about the trajectory is stored and none of the nodes/leaves within the trajectory.

param store_data:

How to store data, the following settings are understood:

pypet.pypetconstants.STORE_NOTHING: (0)

Nothing is stored

pypet.pypetconstants.STORE_DATA_SKIPPING: (1)

Data of not already stored nodes is stored

pypet.pypetconstants.STORE_DATA: (2)

Data of all nodes is stored. However, existing data on disk is left untouched.

pypet.pypetconstants.OVERWRITE_DATA: (3)

Data of all nodes is stored and data on disk is overwritten. May lead to fragmentation of the HDF5 file. The user is adviced to recompress the file manually later on.

pypet.pypetconstants.SINGLE_RUN (‘SINGLE_RUN’)

param stuff_to_store:

The trajectory

param store_data:

How to store data see above

param store_final:

If final meta info should be stored

pypet.pypetconstants.LEAF

Stores a parameter or result

Note that everything that is supported by the storage service and that is stored to disk will be perfectly recovered. For instance, you store a tuple of numpy 32 bit integers, you will get a tuple of numpy 32 bit integers after loading independent of the platform!

param stuff_to_sore:

Result or parameter to store

In order to determine what to store, the function ‘_store’ of the parameter or result is called. This function returns a dictionary with name keys and data to store as values. In order to determine how to store the data, the storage flags are considered, see below.

The function ‘_store’ has to return a dictionary containing values only from the following objects:

python natives (int, long, str, bool, float, complex),

numpy natives, arrays and matrices of type np.int8-64, np.uint8-64, np.float32-64, np.complex, np.str

python lists and tuples of the previous types (python natives + numpy natives and arrays) Lists and tuples are not allowed to be nested and must be homogeneous, i.e. only contain data of one particular type. Only integers, or only floats, etc.

python dictionaries of the previous types (not nested!), data can be heterogeneous, keys must be strings. For example, one key-value-pair of string and int and one key-value pair of string and float, and so on.

pandas DataFrames

ObjectTable

The keys from the ‘_store’ dictionaries determine how the data will be named in the hdf5 file.

param store_data:

How to store the data, see above for a descitpion.

param store_flags:

Flags describing how to store data.

ARRAY (‘ARRAY’)

Store stuff as array

CARRAY (‘CARRAY’)

Store stuff as carray

TABLE (‘TABLE’)

Store stuff as pytable

DICT (‘DICT’)

Store stuff as pytable but reconstructs it later as dictionary on loading

FRAME (‘FRAME’)

Store stuff as pandas data frame

Storage flags can also be provided by the parameters and results themselves if they implement a function ‘_store_flags’ that returns a dictionary with the names of the data to store as keys and the flags as values.

If no storage flags are provided, they are automatically inferred from the data. See pypet.HDF5StorageService.TYPE_FLAG_MAPPING for the mapping from type to flag.

param overwrite:

Can be used if parts of a leaf should be replaced. Either a list of HDF5 names or True if this should account for all.

pypet.pypetconstants.DELETE (‘DELETE’)

Removes an item from disk. Empty group nodes, results and non-explored parameters can be removed.

param stuff_to_store:

The item to be removed.

param delete_only:

Potential list of parts of a leaf node that should be deleted.

param remove_from_item:

If delete_only is used, whether deleted nodes should also be erased from the leaf nodes themseleves.

param recursive:

If you want to delete a group node you can recursively delete all its children.

pypet.pypetconstants.GROUP (‘GROUP’)

param stuff_to_store:

The group to store

param store_data:

How to store data

param recursive:

To recursively load everything below.

param max_depth:

Maximum depth in case of recursion. None for no limit.

pypet.pypetconstants.TREE

Stores a single node or a full subtree

param stuff_to_store:

Node to store

param store_data:

How to store data

param recursive:

Whether to store recursively the whole sub-tree

param max_depth:

Maximum depth in case of recursion. None for no limit.

pypet.pypetconstants.DELETE_LINK

Deletes a link from hard drive

param name: The full colon separated name of the link

pypet.pypetconstants.LIST

Stores several items at once

param stuff_to_store:

Iterable whose items are to be stored. Iterable must contain tuples, for example [(msg1,item1,arg1,kwargs1),(msg2,item2,arg2,kwargs2),…]

pypet.pypetconstants.ACCESS_DATA

Requests and manipulates data within the storage. Storage must be open.

param stuff_to_store:

A colon separated name to the data path

param item_name:

The name of the data item to interact with

param request: A functional request in form of a string

param args: Positional arguments passed to the reques

param kwargs: Keyword arguments passed to the request

pypet.pypetconstants.OPEN_FILE

Opens the HDF5 file and keeps it open

param stuff_to_store:

None

pypet.pypetconstants.CLOSE_FILE

Closes an HDF5 file that was kept open, must be open before.

param stuff_to_store:

None

pypet.pypetconstants.FLUSH

Flushes an open file, must be open before.

param stuff_to_store:

None

param stuff_to_store:
	Trajectory that is about to be extended by another one
param changed_parameters:
	List containing all parameters that were enlarged due to merging
param old_length:
	Old length of trajectory before merge

param stuff_to_store:
	The trajectory data is merged into
param other_trajectory_name:
	Name of the other trajectory
param rename_dict:
	Dictionary containing the old result and derived parameter names in the other trajectory and their new names in the current trajectory.
param move_nodes:
	Whether to move the nodes from the other to the current trajectory
param delete_trajectory:
	Whether to delete the other trajectory after merging.

param stuff_to_store:
	Trajectory to be backed up
param backup_filename:
	Name of file where to store the backup. If None the backup file will be in the same folder as your hdf5 file and named ‘backup_XXXXX.hdf5’ where ‘XXXXX’ is the name of your current trajectory.

param stuff_to_store:
	The trajectory to be stored
param only_init:
	If you just want to initialise the store. If yes, only meta information about the trajectory is stored and none of the nodes/leaves within the trajectory.
param store_data:
	How to store data, the following settings are understood: `pypet.pypetconstants.STORE_NOTHING`: (0) Nothing is stored `pypet.pypetconstants.STORE_DATA_SKIPPING`: (1) Data of not already stored nodes is stored `pypet.pypetconstants.STORE_DATA`: (2) Data of all nodes is stored. However, existing data on disk is left untouched. `pypet.pypetconstants.OVERWRITE_DATA`: (3) Data of all nodes is stored and data on disk is overwritten. May lead to fragmentation of the HDF5 file. The user is adviced to recompress the file manually later on.

param stuff_to_store:
	The trajectory
param store_data:
	How to store data see above
param store_final:
	If final meta info should be stored

param stuff_to_sore:
	Result or parameter to store In order to determine what to store, the function ‘_store’ of the parameter or result is called. This function returns a dictionary with name keys and data to store as values. In order to determine how to store the data, the storage flags are considered, see below. The function ‘_store’ has to return a dictionary containing values only from the following objects: python natives (int, long, str, bool, float, complex), numpy natives, arrays and matrices of type np.int8-64, np.uint8-64, np.float32-64, np.complex, np.str python lists and tuples of the previous types (python natives + numpy natives and arrays) Lists and tuples are not allowed to be nested and must be homogeneous, i.e. only contain data of one particular type. Only integers, or only floats, etc. python dictionaries of the previous types (not nested!), data can be heterogeneous, keys must be strings. For example, one key-value-pair of string and int and one key-value pair of string and float, and so on. pandas DataFrames `ObjectTable` The keys from the ‘_store’ dictionaries determine how the data will be named in the hdf5 file.
param store_data:
	How to store the data, see above for a descitpion.
param store_flags:
	Flags describing how to store data. `ARRAY` (‘ARRAY’) Store stuff as array `CARRAY` (‘CARRAY’) Store stuff as carray `TABLE` (‘TABLE’) Store stuff as pytable `DICT` (‘DICT’) Store stuff as pytable but reconstructs it later as dictionary on loading `FRAME` (‘FRAME’) Store stuff as pandas data frame Storage flags can also be provided by the parameters and results themselves if they implement a function ‘_store_flags’ that returns a dictionary with the names of the data to store as keys and the flags as values. If no storage flags are provided, they are automatically inferred from the data. See `pypet.HDF5StorageService.TYPE_FLAG_MAPPING` for the mapping from type to flag.
param overwrite:
	Can be used if parts of a leaf should be replaced. Either a list of HDF5 names or True if this should account for all.

param stuff_to_store:
	The item to be removed.
param delete_only:
	Potential list of parts of a leaf node that should be deleted.
param remove_from_item:
	If delete_only is used, whether deleted nodes should also be erased from the leaf nodes themseleves.
param recursive:
	If you want to delete a group node you can recursively delete all its children.

param stuff_to_store:
	The group to store
param store_data:
	How to store data
param recursive:
	To recursively load everything below.
param max_depth:
	Maximum depth in case of recursion. None for no limit.

param stuff_to_store:
	Node to store
param store_data:
	How to store data
param recursive:
	Whether to store recursively the whole sub-tree
param max_depth:
	Maximum depth in case of recursion. None for no limit.

param name:	The full colon separated name of the link

param stuff_to_store:
	Iterable whose items are to be stored. Iterable must contain tuples, for example [(msg1,item1,arg1,kwargs1),(msg2,item2,arg2,kwargs2),…]

param stuff_to_store:
	A colon separated name to the data path
param item_name:
	The name of the data item to interact with
param request:	A functional request in form of a string
param args:	Positional arguments passed to the reques
param kwargs:	Keyword arguments passed to the request

param stuff_to_store:
	`None`

param stuff_to_store:
	`None`

param stuff_to_store:
	`None`

Raises:	NoSuchServiceError if message or data is not understood

item¶: alias of builtins.bytes

Empty Storage Service for Debugging¶

class pypet.storageservice.LazyStorageService(*args, **kwargs)[source]¶

This lazy guy does nothing! Only for debugging purposes.

Ignores all storage and loading requests and simply executes pass instead.

load(*args, **kwargs)[source]¶: Nope, I won’t care, dude!

store(*args, **kwargs)[source]¶: Do whatever you want, I won’t store anything!

The Multiprocessing Wrappers¶

class pypet.utils.mpwrappers.LockWrapper(storage_service, lock=None)[source]¶

For multiprocessing in WRAP_MODE_LOCK mode, augments a storage service with a lock.

The lock is acquired before storage or loading and released afterwards.

is_open¶

Normally the file is opened and closed after each insertion.

However, the storage service may provide the option to keep the store open and signals this via this property.

load(*args, **kwargs)[source]¶: Acquires a lock before loading and releases it afterwards.

multiproc_safe¶: Usually storage services are not supposed to be multiprocessing safe

store(*args, **kwargs)[source]¶: Acquires a lock before storage and releases it afterwards.

class pypet.utils.mpwrappers.QueueStorageServiceSender(storage_queue=None)[source]¶

For multiprocessing with WRAP_MODE_QUEUE, replaces the original storage service.

All storage requests are send over a queue to the process running the QdebugueueStorageServiceWriter.

Does not support loading of data!

send_done()[source]¶: Signals the writer that it can stop listening to the queue

store(*args, **kwargs)[source]¶

Puts data to store on queue.

Note that the queue will no longer be pickled if the Sender is pickled.

class pypet.utils.mpwrappers.QueueStorageServiceWriter(storage_service, storage_queue, gc_interval=None)[source]¶: Wrapper class that listens to the queue and stores queue items via the storage service.

class pypet.utils.mpwrappers.PipeStorageServiceSender(storage_connection=None, lock=None)[source]¶

send_done()[source]¶: Signals the writer that it can stop listening to the queue

store(*args, **kwargs)[source]¶

Puts data to store on queue.

Note that the queue will no longer be pickled if the Sender is pickled.

class pypet.utils.mpwrappers.PipeStorageServiceWriter(storage_service, storage_connection, max_buffer_size=10, gc_interval=None)[source]¶: Wrapper class that listens to the queue and stores queue items via the storage service.

class pypet.utils.mpwrappers.ReferenceWrapper[source]¶

Wrapper that just keeps references to data to be stored.

load(*args, **kwargs)[source]¶: Not implemented

store(msg, stuff_to_store, *args, **kwargs)[source]¶: Simply keeps a reference to the stored data

class pypet.utils.mpwrappers.ReferenceStore(storage_service, gc_interval=None)[source]¶

Class that can store references

store_references(references)[source]¶: Stores references to disk and may collect garbage.

class pypet.utils.mpwrappers.LockerServer(url='tcp://127.0.0.1:7777')[source]¶

Manages a database of locks

run()[source]¶: Runs server

class pypet.utils.mpwrappers.LockerClient(url='tcp://127.0.0.1:7777', lock_name='_DEFAULT_')[source]¶

Implements a Lock by requesting lock information from LockServer

acquire()[source]¶

Acquires lock and returns True

Blocks until lock is available.

release()[source]¶: Releases lock

start(test_connection=True)[source]¶

Starts connection to server if not existent.

NO-OP if connection is already established. Makes ping-pong test as well if desired.

class pypet.utils.mpwrappers.TimeOutLockerServer(url, timeout)[source]¶: Lock Server where each lock is valid only for a fixed period of time.

class pypet.utils.mpwrappers.ForkAwareLockerClient(url='tcp://127.0.0.1:7777', lock_name='_DEFAULT_')[source]¶

Locker Client that can detect forking of processes.

In this case the context and socket are restarted.

start(test_connection=True)[source]¶: Checks for forking and starts/restarts if desired

Storage Services¶

The HDF5 Storage Service¶

Empty Storage Service for Debugging¶

The Multiprocessing Wrappers¶

Table of Contents

Search