dtControl

Represent controllers as decision trees. Improve memory footprint, boost explainability while preserving guarantees.

Artifact Evaluation Instructions: TACAS 2021

Here we explain how to reproduce the results of the TACAS 2021 paper dtControl 2.0: Explainable Strategy Representation via Decision Tree Learning Steered by Experts. More concretely, we show how to

  1. Prepare your machine in order to run dtControl
  2. Reproduce all figures and tables from the paper using pre-written scripts
  3. Use the tool in general (Robustness testing)

More detailed instructions follow. Note that we always assume you are inside our unzipped artifact folder and give paths relative to it. E.g., just executing ls should show the folders examples, installation and results, as well as 9 Python scripts.

This artifact has passed the TACAS 2021 artifact evaluation and is presented here for archival purposes.

Paper abstract

Recent advances have shown how decision trees are apt data structures for concisely representing strategies (or controllers) satisfying various objectives. Moreover, they also make the strategy more explainable. The recent tool dtControl had provided pipelines with tools supporting strategy synthesis for hybrid systems, such as SCOTS and Uppaal Stratego. We present dtControl 2.0, a new version with several fundamentally novel features. Most importantly, the user can now provide domain knowledge to be exploited in the decision tree learning process and can also interactively steer the process based on the dynamically provided information. To this end, we also provide a graphical user interface. It allows for inspection and re-computation of parts of the result, suggesting as well as receiving advice on predicates, and visual simulation of the decision-making process. Besides, we interface model checkers of probabilistic systems, namely STORM and PRISM and provide dedicated support for categorical enumeration-type state variables. Consequently, the controllers are more explainable and smaller.

Preparation

Contents of this artifact

This artifact contains the following files and folders:

  1. examples: the case studies (subfolders cps, storm, prism and csv), mostly provided as zip archives
  2. installation: the Python packages needed to install dtControl and its dependencies
  3. results: the folder collecting the result files; it also contains the helper script process-results.py
  4. 9 Python scripts: quickcheck.py and the eight evaluation scripts described in Section “Reproducing the results of the paper”

Technical setup

We tested the artifact on a PC with a 64-bit Windows 10 operating system, 24 GB RAM and a 3.00 GHz AMD Phenom II X6 1075T processor. We ran the virtual machine with 12 GB of RAM and 2 cores of the processor, using VirtualBox 6.0.16. The case studies have a size of ~4.5 GB, so the experiments require ~5 GB of hard drive space. The experiments in the paper were conducted on an Ubuntu 19.10 server with considerably more power, namely 250 GB RAM and a 2.2 GHz Intel Xeon E5-2630 v4 CPU, with multiple experiments running in parallel. More concrete details on the scripts are given in Section “Reproducing the results of the paper”.

To run dtControl, you need Python 3.7.9 or higher (which is installed automatically on the TACAS 21 VM) as well as several libraries which are included in this artifact. The web-based graphical user interface was tested on Firefox 80, which comes bundled with the VM.

Installing dtControl

We assume that you start on a clean version of the TACAS 2021 VM and use it only for this evaluation, hence we do not bother with virtual environments. We assume that python3.8 and pip3 are already installed.
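
If you wish to double-check this, the following standard commands (not specific to dtControl) print the installed versions:

$ python3 --version
$ pip3 --version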

First, we have to add ~/.local/bin to the path so that dtcontrol can easily be found after installation

$ echo "export PATH=\"$PATH:/home/tacas21/.local/bin\"" >> ~/.bashrc
$ source ~/.bashrc

To install dtControl with all its dependencies, go to the installation folder of our artifact and execute

$ cd installation && pip3 install *

This installs dtControl in the .local/bin directory. You can check that it worked by typing dtcontrol, which should print the help text of dtcontrol.
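
If the dtcontrol command is not found, you can check whether the executables were actually placed in the expected directory (the path assumes the default TACAS VM user and the PATH entry added above):

$ ls ~/.local/bin/dtcontrol*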

Unpacking the case studies

The case studies are located in folder examples. To unpack them, first

$ cd examples

then run

$ find . -name "*.zip" | while read filename; do unzip -o -d "`dirname "$filename"`" "$filename"; done;

to extract the zipped examples within all the subfolders (examples/cps and examples/storm). The examples in the prism and csv folders are not zipped, as they are very small.
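
As a quick sanity check from inside the examples folder, you can list a few of the extracted controller files (this assumes the CPS examples unpack to .scs files, as referenced later in these instructions):

$ ls cps/*.scs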

Checking that everything worked

We want to verify that three things work properly: the command line interface (CLI), the graphical user interface (GUI), and the Python bindings, which we use for the larger scripts. To do so, first execute

$ dtcontrol

which should print the help text. If it doesn’t, make sure you have set the path correctly. Then execute

$ dtcontrol-frontend

and then open Firefox and go to http://127.0.0.1:5000. You should see the GUI (and not a 404 Not Found error). Finally, in the top level of our artifact folder, execute the script quickcheck.py by typing

$ python quickcheck.py

This executes 3 different algorithms on 2 case studies and finally opens a results page in Firefox.

Reproducing the results of the paper

Scripts for convenience

We offer eight scripts (using the dtControl Python bindings) to reproduce the two tables of the paper. Every script is named after the schema <models>-<representation>-<set>.py, where

  1. <models> is either ‘cps’ or ‘mdp’. CPS (cyber-physical systems) are the case studies for Table 1 and MDP (Markov decision processes) are the case studies for Table 2.
  2. <representation> is either ‘dt’ (decision trees) or ‘bdd’ (binary decision diagrams).
  3. <set> is either ‘subset’ (the suggested subset of experiments) or ‘all’ (all experiments from the paper).

Note for the interested: You can modify the scripts yourself to change the subset of experiments, e.g. to include a large case study for BDDs but not repeat it 20 times. For this, here is a quick description of the two important parameters in the scripts:

  1. There is always a line suite.add_datasets(..., include=[...]). In this ‘include’ list, all case studies are explicitly listed, and you can remove or add them as you deem fit.
  2. Similarly, there is the list classifiers = [...] where all algorithms for the representation are listed. By commenting out some of the classifiers, you can omit them from the computation (for an example, compare the cps-bdd-subset script to the cps-bdd-all script: two case studies were excluded and all but two classifiers have been commented out).
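
To illustrate, here is a hypothetical sketch of such a modification; the path argument, dataset names and classifier identifiers below are placeholders and not necessarily the identifiers used in the actual scripts.

# Hypothetical excerpt from an evaluation script such as cps-dt-subset.py;
# the path, dataset names and classifier names are placeholders.
suite.add_datasets(
    "examples/cps",
    include=[
        "cartpole",
        "10rooms",
        # "some_large_case_study",  # commented out: skip this large case study
    ],
)

classifiers = [
    classifier_a,
    # classifier_b,  # commented out: omit this algorithm from the run
]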

By modifying the scripts, you can reproduce as many of the results as your computational resources and time allow.

Here we report the time and memory requirements of the different scripts. Note that we are not benchmarking the memory or time requirements, and hence these are ballpark figures to help the reviewer choose the experiments to reproduce.

We mark some suggested experiments in bold. OOM indicates that the script ran out of memory on our VM.

| Script name        | Max RAM | Time on TACAS VM | Theoretical maximum |
|--------------------|---------|------------------|---------------------|
| quickcheck         | 100 MB  | 20 sec           | 6 min               |
| **cps-dt-subset**  | ~1 GB   | 89 min           | 3.5 h               |
| **cps-bdd-subset** | ~8 GB   | 63 min           | 2.5 h               |
| cps-dt-all         | 12 GB+  | OOM              | 3.5 d               |
| cps-bdd-all        | 12 GB+  | OOM              | 17.5 d              |
| **mdp-dt-subset**  | ~1.4 GB | 37 min           | 11 h                |
| **mdp-bdd-subset** | ~3.4 GB | 38 min           | 16 h                |
| mdp-dt-all         | ~5.8 GB | 5 hours          | 9.5 d               |
| mdp-bdd-all        | 12 GB+  | OOM              | 47.5 d              |

Provided that you have a sufficiently powerful machine (in particular, enough RAM), you can also run the scripts in parallel to speed things up (as we did when running them for the paper).
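
For example, assuming sufficient RAM, two of the suggested scripts could be run concurrently, either in separate terminals or in the background:

$ python cps-dt-subset.py &
$ python mdp-dt-subset.py &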

Expected error: We have several times observed the following behaviour of the BDD scripts: after completing several instances, the script terminates with an error related to garbage collection. In all such cases, it was sufficient to simply restart the script; it then continued without the error, picking up from the point where it failed.

Table 1 - Cyber-physical systems (CPS)

To reproduce this table, we suggest running the scripts cps-dt-subset.py and cps-bdd-subset.py.

$ python cps-dt-subset.py
$ python cps-bdd-subset.py

The results are saved in multiple formats, namely in the folders decision_trees, saved_classifiers and results. The former two are not relevant for reproducing the table. In the results folder, you will find an HTML and a JSON file reporting all statistics for every combination of algorithm and case study. You may open the results/DT-CPS.html and results/BDD-CPS.html files in a browser to compare the results with the paper. In the paper, we report the smallest dtControl1, dtControl2 and BDD results, taken over the different DT algorithms and over the multiple BDD repetitions, respectively.
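
For instance, on the TACAS VM both overview pages can be opened in Firefox directly from the command line:

$ firefox results/DT-CPS.html results/BDD-CPS.html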

Table 2 - Probabilistic model checking (concretely Markov decision processes, MDP)

To reproduce this table, we suggest running the scripts mdp-dt-subset.py and mdp-bdd-subset.py.

$ python mdp-dt-subset.py
$ python mdp-bdd-subset.py

The results of Table 2 may be compiled as described for Table 1, from the results/DT-MDP.html file and the results/BDD-MDP.html file generated by the above commands.

Helper script - Extracting minimum values for Tables 1 and 2

As previously mentioned, in the paper we only report the minimum-size results of the different algorithms. To automatically extract the minimum values given in Table 1 and Table 2, we provide a helper script that reads the corresponding JSON files of Table 1 and Table 2. The helper script process-results.py can be found in the results/ folder (where the files DT-CPS.json, BDD-CPS.json, DT-MDP.json and BDD-MDP.json are generated).

The helper script process-results.py can be executed by running

$ python process-results.py

A CSV file will be created for each of them, which can be opened using standard spreadsheet software (OpenOffice/LibreOffice, Excel, Google Sheets).
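
For example, assuming LibreOffice is available on the VM, one of the generated files could be opened with (the filename is a placeholder for any of the generated CSV files):

$ libreoffice --calc <generated-csv-file>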

Figures 1, 4 and 5

These figures show small illustrative examples that need not be reproduced via the tool.

Figure 3

This figure shows a screenshot of the new GUI. To start the GUI, run dtcontrol-frontend. The sidebar on the left allows the selection of the case study and the preset. By clicking the browse button, you can select the case study; 10rooms.scs and cartpole.scs are located in <our-artifact-folder>/examples/cps and firewire_abst.prism is in <our-artifact-folder>/examples/prism. The dropdown menu allows you to select the preset, i.e. the exact DT construction algorithm. Every preset induces a choice of hyper-parameters. These hyper-parameters can also be tuned by hand, but to recreate the screenshot, you do not need to use the advanced options.

Adding the case studies with their respective presets as in the screenshot and then clicking the run button on the right for the first two recreates the figure.

Figure 2

This figure shows the workflow and software architecture of dtControl 2.0. We suggest reading the description of the figure in the paper before continuing here.

Robustness testing

You can play around with the CLI and GUI, trying any combination of hyper-parameters on any case study. You can also download further case studies from the Quantitative Verification Benchmark Set or from our examples repository. You can explore DTs in the GUI and try to get a better understanding, or see whether you can come up with useful predicates.

Single case-study: Command line interface

To run dtControl on a single case study, execute the following (assuming that you have installed dtControl and unzipped the examples):

$ dtcontrol --input examples/<location_of_controller_file> --use-preset <preset>

where <location_of_controller_file> is the path to an example, e.g. cps/cartpole.scs or storm/wlan_dl.0.80.deadline.storm.json, and <preset> is one of the available presets (run dtcontrol preset --list to list all available presets, we recommend avg for MDPs and mlentropy for the CPS benchmarks).

Note that, for the avg preset, a <controller_name>_config.json file containing the metadata needs to be present in the same location as the controller file.
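
For instance, assuming the examples have been unzipped as described above and you are in the top-level artifact folder, concrete invocations could look like this (the second command assumes the corresponding _config.json metadata file is present next to the controller, as noted above):

$ dtcontrol --input examples/cps/cartpole.scs --use-preset mlentropy
$ dtcontrol --input examples/storm/wlan_dl.0.80.deadline.storm.json --use-preset avg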

The output tree (in JSON, C and DOT formats) can be found inside the decision_trees/<preset_name>/<controller_name>/ folder, and a summary will be available in the benchmarks.html file in the folder from which dtcontrol was run.

Interactive DT building in the GUI

To reproduce Figure 9 of the Appendix, one may follow these steps

  1. Run dtcontrol-frontend to start the GUI and navigate to http://127.0.0.1:5000 in the browser.
  2. Upload the examples/cps/10rooms.scs controller along with the metadata file examples/cps/10rooms_config.json.
  3. Select the mlentropy preset and run the experiment.
  4. Open the interactive DT viewer by pressing the eye icon in the results table.
  5. In the top left area, click the Edit mode button.
  6. Click on the x[4] <= 20.625 predicate in the tree and press the “Start interactive tree builder …” button in the left sidebar.
  7. Click on the Add predicate button in the Predicate Collection table, enter x_0 + 3*x_1 <= c_0, and wait until the Instantiated Predicate table reloads.

In case of malfunction

In case of problems with the evaluation scripts, you may try

  1. Deleting the *.json files in the results folder.
  2. Making sure that all the examples are unzipped.
  3. Deleting the .benchmark_suite hidden folder inside each of the examples folders.

In case the dtControl CLI is not working properly, you may try

  1. Deleting the *.json files (a cache) in the folder from which dtcontrol is run.
  2. Deleting the .benchmark_suite hidden folder inside each of the examples folders.
  3. Running dtcontrol --help to verify that the installation itself still works.
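
For convenience, here is a hedged sketch of shell commands for the two clean-up steps, assuming you run them from the folder in which dtcontrol was executed (e.g. the top-level artifact folder):

$ rm -f *.json    # delete the cached *.json files
$ find examples -type d -name ".benchmark_suite" -prune -exec rm -rf {} +    # delete the hidden cache folders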

In case the dtControl GUI is malfunctioning, you may try

  1. Looking at the server logs displayed in the terminal in which dtcontrol-frontend is run.
  2. Looking at the browser console by pressing the F12 button (in Firefox).
  3. If there are any errors, you may restart the server (press Ctrl+C in the terminal to kill dtcontrol-frontend) and try again.

User and Developer Manuals

More information on dtControl and its functionality can be found in the user and developer manuals at dtcontrol.readthedocs.io.

Source Code

dtControl is developed as open-source software, with the source code available on LRZ GitLab.