Runaway uses profiles (YAML files stored in your ~/.orchestra folder) to connect to remote hosts and execute jobs. Since most resources are either shared without scheduled access or managed with slurm, you should be able to make it work with the current installers. If you have a particular need that can’t be met, writing your own profile should not be too much of a hassle: read through the execution model page of the documentation and start building from there.
Depending on your needs, Runaway is expected to help you in a few different ways:
Now, why use it if you already have your own super-cool tools?
Developing tools to automate experiments is old hat in many research teams. Every now and then, PhD students and postdocs take it upon themselves to develop such tools. It can be a fun engineering project, but depending on your knowledge it can take time, and it adds no value to your research. Plus, if you are a researcher, you’ll rarely have the time and motivation to turn it into a tool others can use. All in all, the fact that people are still developing their own scripts and tools is evidence enough of the need.
In some cases, doing things on your own may still be your best option:
While designing this program, we discussed it with administrators of the plafrim and curta platforms. Those discussions influenced the design, which currently appears flexible enough to fit most platforms. Time will tell whether this was a good design decision, but judging from the experience of those administrators, it should work on most platforms we could use.
Since Runaway schedules jobs on top of the platform’s scheduler, it has to be running somewhere. The standard way to use it is to run it from your laptop and keep the command going. The problem is that if the program loses its connection, or is paused (for example when your laptop hibernates), it won’t be able to keep scheduling!
Fortunately, there is a simple solution: you can install Runaway directly on a cluster and derive a working profile by just changing the ssh target to localhost. This way, you only have to log on to the cluster, start a runaway command in a tmux session, and leave. When you are back, you just check whether the command is done. Another solution is to do the same on another computer in the lab that stays up while you are gone. Either way, you keep most of Runaway’s benefits while letting the jobs run while you are away.
Also note that if you only disconnect for a few hours, Runaway will be able to resume your work (basically as long as the ssh session is kept alive on the server side, which is usually a matter of hours).
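Deriving a localhost profile from an existing one can be scripted. The snippet below is only a minimal sketch: it assumes the profile stores the ssh target on a simple `key: value` line, and the key name `ssh_target`, the host name and the sample content are all hypothetical, since the real profile schema is described in the execution model page and may differ.

```python
import re


def retarget_profile(profile_text: str, key: str, new_target: str = "localhost") -> str:
    """Return a copy of a YAML profile with the value of `key` replaced.

    Only simple `key: value` lines are handled. The key name is an
    assumption made for this sketch, not Runaway's documented schema.
    """
    # Capture the "key: " prefix so the original indentation is preserved.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + new_target, profile_text)
    if count == 0:
        raise KeyError(f"no '{key}' entry found in profile")
    return new_text


# Hypothetical profile content; adapt the key to your actual profile.
example = "name: my-cluster\nssh_target: user@cluster.example.org\n"
derived = retarget_profile(example, key="ssh_target")
print(derived)
```

The derived text would then be saved as a new YAML file in the ~/.orchestra folder on the cluster, next to the original profile.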
Thanks to the asynchronous programming used in its core, Runaway scales to larger experimental campaigns pretty well. Still, since this program is in its infancy, you may experience a few bugs, as with any other program. Don’t hesitate to file an issue!