Optimization Tips

Group your Results into Buckets/Sets

HDF5 has a hard time managing nodes with more than 20,000 children. Accordingly, file I/O and reading or writing data can become very inefficient if one of your trajectory groups has more than 20,000 children. For instance, this may happen to you if you explore many runs.

Suppose in every run you add the following result:

>>> traj.f_add_result('some_group.$.z', 42, comment='Universal answer.')

If this line is executed in each of your, let’s say 100,000 runs, the node some_group will have at least 100k children. Hence, storage and loading becomes extremely slow.

The simplest way around this problem is to group your results into buckets using the '$set' wildcard, see also More on Wildcards. Accordingly, your result addition becomes:

>>> traj.f_add_result('some_group.$set.$.z', 42, comment='Universal answer.')

Hence, even running 100k runs, some_group has only 100 children, each having only 1000 children themselves.

Huge Explorations

Yet, this approach will still fall short in case you have parameter exploration of more than 1,000,000 runs, because loading meta-data of your trajectory may already take more than a minute. And this can be annoying. In case of such huge explorations, I would advise you to tailor your parameter space and split it among several individual trajectories.

Collect Small Results

In case you compute only small results during your runs, like a single value, but you do this quite often (100k+), it might be more convenient to return the result instead of storing it into the trajectory directly. As a consequence, you can collect these single values later on during the post-processing phase and store all of them together into a single result. This has also been done for the estimated firing rate in the Tutorial.

Many and Fast Single Runs

In case you perform many single runs and milliseconds matter, use a pool (use_pool=True) in combination with a queue (wrap_mode='QUEUE', see Multiprocessing) or the even faster - but potentially unreliable - method of using a shared pipe (wrape_mode='PIPE'). Moreover, to avoid re-pickling of unnecessary data of your trajectory, store and remove all data that is not needed during single runs.

For instance, if you don’t really need config data during the runs, use the following before calling the environment’s run() function:

traj.f_store()
traj.config.f_remove(recursive=True)

This may save a couple of milliseconds each run because the config data no longer needs to be pickled and send over the queue for storage.

Moreover, you can further avoid unnecessary pickling for the pool and SCOOP by setting freeze_input=True. Accordingly, the trajectory, your target function, and all additional arguments are passed to each pool or SCOOP process at initialisation and not for each run individually. However, in order to use this feature, you must make sure that neither your target function nor the additional arguments are mutated over the course of your runs. In case you use SCOOP, try do avoid running many batches of experiments in one go with freeze_input=True because memory consumption of all the SCOOP workers may increase with every batch, see also Multiprocessing with a Cluster or a Multi-Server Framework.