# Python Algorithm Alchemists Machinery (pyalma)

**Disclaimer:** This project is at a very early stage of development and should therefore be considered experimental.

The pyalma package provides a framework to plan, run, pause and continue file-based experiments (or workloads in general). It is especially helpful if you have to perform multiple tasks on big batches of files over numerous iterations. Its strength comes from the ability to plan lengthy experiments (tasks) in advance, which makes it possible to track progress and to pause and continue long-running tasks.
## Example

Everything starts with grouping the files of interest into one or more [batches](./docs/batch.md). Let's say we have a batch file `example.batch`. It could look as follows.

###### `example.batch`

```json
{
    "instances":
    [
        "file1",
        "file2",
        .
        .
        .
        "file100"
    ]
}
```
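A batch file like this does not have to be written by hand. The following is a minimal sketch that lists every regular file in a directory and writes the result into an `instances` list. It assumes the batch format is exactly the JSON structure shown above; the helper name `write_batch` is hypothetical and not part of pyalma, and this section does not say whether pyalma expects absolute or relative paths.

```python
import json
from pathlib import Path


def write_batch(directory, batch_path):
    """Write a batch file listing every regular file in `directory`.

    Hypothetical helper, not part of pyalma. Assumes a batch is a JSON
    object with a single "instances" list of file paths.
    """
    # Sort the paths so the batch file is stable across runs.
    instances = sorted(
        str(path) for path in Path(directory).iterdir() if path.is_file()
    )
    with open(batch_path, "w", encoding="utf-8") as handle:
        json.dump({"instances": instances}, handle, indent=4)
    return instances
```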
Then we need to specify our [experiment / task](./docs/experiment.md). Like the batch description, it is completely file based.

###### `example.experiment`

```json
{
    "load": "example.py",
    "workers": 3,
    "iterations": 100,
    "batch": "example.batch"
}
```
The experiment description in `example.experiment` roughly translates to: run the algorithm loaded from `example.py` 100 times on every file in `example.batch`, using 3 worker threads. Let's have a look at `example.py` (see [Run Module](./docs/run_module.md) for more information).

###### `example.py`

```python
def run(instance, save_callback, state):
    # do some stuff on "instance"
    pass
```
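As a slightly more concrete illustration, a run module might count the lines of the file it is handed. This sketch uses only the `instance` parameter, treated as a file path. `save_callback` and `state` are left untouched because their semantics are described in the Run Module documentation rather than here, and returning the count is an assumption made for illustration, since this section does not say what the framework does with a return value.

```python
def run(instance, save_callback, state):
    # "instance" is the path to one file of the batch.
    # "save_callback" and "state" are deliberately unused in this sketch.
    with open(instance, "r", encoding="utf-8") as handle:
        line_count = sum(1 for _ in handle)
    # Returning a value is an assumption made for illustration only.
    return line_count
```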
The `run` function is where the magic happens. For every file in our batch, the ***pyalma*** framework calls `run(...)` exactly `iterations` times. The `instance` parameter is the path to one file from `example.batch`.
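Because `run(...)` is called `iterations` times per file, the total amount of work is the product of both numbers: 100 files and 100 iterations in the example amount to 10,000 calls. The following sketch computes this for any experiment description. The function name `planned_calls` is hypothetical, and resolving the `batch` entry relative to the experiment file is an assumption, since this section does not state how pyalma resolves it.

```python
import json
from pathlib import Path


def planned_calls(experiment_path):
    """Return how many run(...) calls an experiment description implies.

    Hypothetical helper, not part of pyalma. Assumes the "batch" entry
    is a path relative to the experiment file.
    """
    experiment_path = Path(experiment_path)
    experiment = json.loads(experiment_path.read_text(encoding="utf-8"))
    batch_path = experiment_path.parent / experiment["batch"]
    batch = json.loads(batch_path.read_text(encoding="utf-8"))
    # One call per file and iteration.
    return experiment["iterations"] * len(batch["instances"])
```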
Now that we have specified everything, we can start executing our experiment.

```python
>>> import alma.experiment
>>> dispatcher = alma.experiment.load("example.experiment")
>>> dispatcher.start()
```
The call `dispatcher.start()` starts the concurrent, non-blocking execution of our experiment. This means the dispatcher stays responsive and we can pause or stop the execution at any time.

```python
>>> dispatcher.stop()
```
During execution, the `dispatcher` continuously keeps track of which files it still needs to call `run(...)` on and how many iterations are left. It does so by saving the current state of the execution in a file. When loading an experiment (`alma.experiment.load(...)`), the framework first looks for such a save file; if one exists, the execution picks up at the point where we called `dispatcher.stop()`. To resume the experiment we can run:

```python
>>> dispatcher = alma.experiment.load("example.experiment")
>>> dispatcher.start()
```
**Note:** The granularity of the save file is at the file level, meaning that every active `run(...)` call is aborted when `dispatcher.stop()` is called. If we continue the execution at a later time, these `run(...)` calls are started again.
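The effect of this granularity can be illustrated with a small, self-contained simulation. It is not pyalma's actual implementation or save-file format: the `Checkpoint` class, its JSON layout and its method names are assumptions made for this sketch. Progress is written only when a `run(...)` call completes, so a call that is interrupted leaves no trace in the save file and is repeated after reloading.

```python
import json
import os


class Checkpoint:
    """Toy save file: maps each instance to its remaining iterations."""

    def __init__(self, path, instances, iterations):
        self.path = path
        if os.path.exists(path):
            # A save file exists: pick up where the last execution stopped.
            with open(path, "r", encoding="utf-8") as handle:
                self.remaining = json.load(handle)
        else:
            self.remaining = {name: iterations for name in instances}

    def mark_done(self, instance):
        # Called only after a run has completed; aborted runs never get here.
        self.remaining[instance] -= 1
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.remaining, handle)

    def pending(self):
        # Instances that still have iterations left.
        return {k: v for k, v in self.remaining.items() if v > 0}
```

Creating a second `Checkpoint` with the same path after some `mark_done(...)` calls restores the recorded counts, while any run that never reached `mark_done(...)` is still counted as outstanding.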
## Install

First clone the repository, then switch to its root directory and run

```bash
$ pip install -e .
```

This installs the **pyalma** framework locally on your system.