Merge pull request #1 from philippaltmann/v0.2.0

V0.2.0
philippaltmann · Jul 29, 2024 · 030acc2
2 parents e00b710 + 099993d
commit 030acc2

Showing 45 changed files with 1,976 additions and 569 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,5 +1,4 @@
 TODO*
-test*
 results*
 
 # Created by https://www.toptal.com/developers/gitignore/api/python,macos,direnv,visualstudiocode

diff --git a/CITATION.cff b/CITATION.cff
@@ -1,5 +1,5 @@
 cff-version: 1.2.0
-title: "Quantum Circuit Designer"
+title: "qcd-gym"
 message: >-
   If you use this software, please cite it using the
   metadata from this file.
@@ -8,12 +8,14 @@ authors:
   - given-names: Philipp
     family-names: Altmann
     email: [email protected]
-    orcid: 'https://orcid.org/0000-0003-1134-176X'
 repository-code: 'https://github.com/philippaltmann/qcd/'
-abstract: A gymnasium-based set of environments for benchmarking reinforcement learning for quantum circuit design.
+url: 'https://github.com/philippaltmann/qcd/'
+abstract: >-
+  A gymnasium-based set of environments for benchmarking 
+  reinforcement learning for quantum circuit design.
 keywords:
-  - benchmark
-  - reinforcement-learning
-  - quantum-computing
-  -circuit-design
-license: MIT
+  - Reinforcement Learning
+  - Quantum Computing
+  - Circuit Optimization
+  - Architecture Search
+license: MIT
diff --git a/QCD.png b/QCD.png
diff --git a/README.md b/README.md
@@ -1,22 +1,23 @@
 # Quantum Circuit Designer
+[![arXiv](https://img.shields.io/badge/arXiv-2312.11337-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2312.11337)
+[![GitHub Release](https://img.shields.io/github/v/release/philippaltmann/qcd?logo=github&logoColor=white&label=GitHub)](http://github.com/philippaltmann/qcd)
+[![PyPI Version](https://img.shields.io/pypi/v/qcd-gym?logo=pypi&logoColor=white)](https://pypi.org/p/qcd-gym/)
 
-[![arXiv](https://img.shields.io/badge/arXiv-2312.11337-b31b1b.svg)](https://arxiv.org/abs/2312.11337)
-[![PyPI version](https://badge.fury.io/py/qcd-gym.svg)](https://badge.fury.io/py/qcd-gym)
-![alt text](QCD.png)
+![QCD Overview](QCD.png)
 
 ## **Description**
 
-This repository contains the Quantum Circuit Designer, a generic [gymnasium](https://github.com/Farama-Foundation/Gymnasium) environment to build quantum circuits gate-by-gate using [pennylane](https://github.com/PennyLaneAI/pennylane), revealing current challenges regarding:
+This repository contains `qcd-gym`, a generic [gymnasium](https://github.com/Farama-Foundation/Gymnasium) environment to build quantum circuits gate-by-gate using [qiskit](https://github.com/Qiskit/qiskit), revealing current challenges regarding:
 
 - [State Preparation (SP)](#state-preparation): Find a gate sequence that turns some initial state into the target quantum state.
 - [Unitary Composition (UC)](#unitary-composition): Find a gate sequence that constructs an arbitrary quantum operator.
 
 
 ## Observations
 
-The observation is defined by the full complex vector representation of the state of the current circuit: $s = \ket{\boldsymbol{\Psi}}\in\mathbb{C}^{2^\eta}$. 
+The observation comprises the state of the current circuit, given either as the full complex state vector $\ket{\Psi}$ or as the unitary operator $\boldsymbol{V}(\Sigma_t)$ resulting from the current sequence of operations $\Sigma_t$, together with the intended target. 
 While this information is only efficiently available in quantum circuit simulators (on real hardware, $\mathcal{O}(2^\eta)$ measurements would be needed), it provides a starting point for RL from which future work should extract a sufficient, efficiently obtainable, subset of information.
-This $2^\eta$-dimensional state representation is sufficient for the definition of an MDP-compliant environment, as operations on this state are required to be reversible. 
+This state representation is sufficient for the definition of an MDP-compliant environment, as operations on this state are required to be reversible. 
 
 ## Actions
 
@@ -33,12 +34,11 @@ The operations $\Gamma$ are defined as:
 
 | o | Operation    | Condition  | Type                 | Arguments  | Comments                      |
 | - | ------------ | ---------- | -------------------- | ---------- | :---------------------------- |
-| 0 | $\mathbb{M}$ |            | Meassurement         | $q$        | Control and Parameter omitted |
-| 1 | $\mathbb{Z}$ | $q = c$    | PhaseShift           | $q,\Phi$   | Control omitted               |
-| 1 | $\mathbb{Z}$ | $q \neq c$ | ControlledPhaseShift | $q,c,\Phi$ | -                             |
-| 2 | $\mathbb{X}$ | $q = c$    | X-Rotation           | $q,\Phi$   | Control omitted               |
-| 2 | $\mathbb{X}$ | $q \neq c$ | CNOT                 | $q,c$      | Parameter omitted             |
-| 3 | $\mathbb{T}$ |            | Terminate            |            | All agruments omitted         |
+| 0 | $\mathbb{Z}$ | $q = c$    | PhaseShift           | $q,\Phi$   | Control omitted               |
+| 0 | $\mathbb{Z}$ | $q \neq c$ | ControlledPhaseShift | $q,c,\Phi$ | -                             |
+| 1 | $\mathbb{X}$ | $q = c$    | X-Rotation           | $q,\Phi$   | Control omitted               |
+| 1 | $\mathbb{X}$ | $q \neq c$ | CNOT                 | $q,c$      | Parameter omitted             |
+| 2 | $\mathbb{T}$ |            | Terminate            |            | All arguments omitted         |
 
 With operations according to the following universal gate set:
 
@@ -53,13 +53,13 @@ The reward is kept $0$ until the end of an episode is reached (either by truncat
 To incentivize the use of few operations, a step-cost $\mathcal{C}_t$ is applied when exceeding two-thirds of the available operations $\sigma$:
 $$\mathcal{C}_t=\max\left(0,\frac{3}{2\sigma}\left(t-\frac{\sigma}{3}\right)\right)$$
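
As a rough illustration, the step-cost can be evaluated directly from the formula above. This is only a sketch: the function name `step_cost` is hypothetical and is not taken from the repository.

```python
def step_cost(t: int, sigma: int) -> float:
    """Evaluate C_t = max(0, 3 / (2 * sigma) * (t - sigma / 3)) as written above.

    `t` is the current step and `sigma` the number of available operations.
    Both the name and the signature are assumptions made for illustration.
    """
    return max(0.0, 3.0 / (2.0 * sigma) * (t - sigma / 3.0))


if __name__ == "__main__":
    sigma = 30
    for t in (5, 10, 20, 30):
        print(t, step_cost(t, sigma))
```

Evaluated as written, the expression stays at zero up to $t = \sigma/3$ and grows linearly to $1$ at $t = \sigma$.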
 
-Suitable task reward functions $\mathcal{R}^{\*}\in[0,1]$ are defined, s.t.: $\mathcal{R}=\mathcal{R}^{\*}(s_t,a_t)-C_t$ if $t$ is terminal, according to the following challenges:
+Suitable task reward functions $\mathcal{R}^{\ast}\in[0,1]$ are defined, s.t.: $\mathcal{R}=\mathcal{R}^{\ast}(s_t,a_t)-\mathcal{C}_t$ if $t$ is terminal, according to the following objectives:
 
-## Challenges
+## Objectives
 
 ### **State Preparation**
 
-The objective of this challenge is to construct a quantum circuit that generates a desired quantum state.
+The task of this objective is to construct a quantum circuit that generates a desired quantum state.
 The reward is based on the *fidelity* between the target and the final state:
 $$\mathcal{R}^{SP}(s_t,a_t) = F(s_t, \Psi) = |\braket{\psi_{\text{env}}|\psi_{\text{target}}}|^2 \in [0,1]$$
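
The fidelity above can be sketched in plain Python for small state vectors. The helper name `fidelity` and the example vectors are assumptions for illustration (the Bell state is taken to be the standard $(\ket{00}+\ket{11})/\sqrt{2}$); the environment itself may compute this differently.

```python
import math


def fidelity(psi_env, psi_target):
    """Return |<psi_env|psi_target>|^2 for two normalized state vectors.

    Hypothetical helper for illustration; the vectors are plain sequences
    of real or complex amplitudes of equal length.
    """
    overlap = sum(a.conjugate() * b for a, b in zip(psi_env, psi_target))
    return abs(overlap) ** 2


if __name__ == "__main__":
    # Assumed two-qubit example states: |00> and (|00> + |11>) / sqrt(2).
    zero = [1, 0, 0, 0]
    bell = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]
    print(fidelity(bell, bell))  # identical states
    print(fidelity(zero, bell))  # initial state vs. target
```
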
 Currently, the following states are defined:
@@ -69,33 +69,32 @@ Currently, the following states are defined:
 
 ### **Unitary Composition**
 
-The objective of this challenge is to construct a quantum circuit that implements a desired unitary operation.
+The task of this objective is to construct a quantum circuit that implements a desired unitary operation.
 The reward is based on the ***Frobenius norm*** $D = |U - V(\Sigma_t)|_2$ between the target unitary $U$ and the final unitary $V$ based on the sequence of operations $\Sigma_t = \langle a_0, \dots, a_t \rangle$: 
 
 $$ \mathcal{R}^{UC}(s_t,a_t) = 1 - \arctan (D)$$
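
A minimal sketch of this reward for matrices given as nested lists follows. The name `uc_reward` is hypothetical, and the code only mirrors the formula as stated here rather than the repository's implementation.

```python
import math


def uc_reward(u, v):
    """Return 1 - arctan(D), where D is the Frobenius norm of U - V.

    Hypothetical helper for illustration; `u` and `v` are square matrices
    given as nested sequences of real or complex entries.
    """
    d = math.sqrt(
        sum(abs(a - b) ** 2 for row_u, row_v in zip(u, v) for a, b in zip(row_u, row_v))
    )
    return 1 - math.atan(d)


if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    hadamard = [[s, s], [s, -s]]
    identity = [[1, 0], [0, 1]]
    print(uc_reward(hadamard, hadamard))  # target reached, D = 0
    print(uc_reward(identity, hadamard))  # empty circuit vs. Hadamard target
```

The reward equals $1$ when the composed unitary matches the target exactly and decreases monotonically as the distance $D$ grows.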
 
 <!-- For the reward function, an 1-arctan mapping of the ***Frobenius norm*** $|U_{\text{env}} - U_{\text{target}}|_2$ to the interval $[0,1]$ is chosen.  -->
-The following unitaries are currently available for this challenge:
+The following unitaries are currently available for this objective:
 
 - `'UC-random'` (a random unitary operation on *max_qubits* )
 - `'UC-hadamard'` (the single qubit Hadamard gate)
 - `'UC-toffoli'` (the 3-qubit Toffoli gate)
 
-See [Outlook](#outlook-and-todos) for more challenges to come.
 
 ### *Further Objectives*
 
-The goal of this implementation is to not only construct any circuit that fulfills a specific challenge but to also make this circuit optimal, that is to give the environment further objectives, such as optimizing:
+The goal of this implementation is not only to construct any circuit that fulfills a specific objective but also to make this circuit optimal, i.e., to give the environment further objectives, such as optimizing:
 
 - Circuit Depth
 - Qubit Count
-- Gate Count (or: 2-qubit Gate Count)
+- Gate Count
 - Parameter Count
 - Qubit-Connectivity
 
 These circuit optimization objectives can be switched on by the parameter `punish` when initializing a new environment.
 
-Currently, the only further objective implemented in this environment is the **circuit depth**, as this is one of the most important features to restrict for NISQ (noisy, intermediate-scale, quantum) devices. This metric already includes gate count and parameter count to some extent. However, further objectives can easily be added within the `Reward` class of this environment (see [Outlook](#outlook)).
+Currently, the only further objective implemented in this environment is the **circuit depth**, as this is one of the most important features to restrict for NISQ (noisy, intermediate-scale, quantum) devices. This metric already includes gate count and parameter count to some extent. However, further objectives can easily be added within the `Reward` class of this environment.
 
 
 ## **Setup**
@@ -111,7 +110,7 @@ The environment can be set up as:
 ```python
 import gymnasium as gym
 
-env = gym.make("CircuitDesigner-v0", max_qubits=2, max_depth=10, challenge='SP-bell', render_mode='text', verbose=True)
+env = gym.make("CircuitDesigner-v0", max_qubits=2, max_depth=10, objective='SP-bell', render_mode='text')
 observation, info = env.reset(seed=42); env.action_space.seed(42)
 
 for _ in range(9):
@@ -128,7 +127,7 @@ The relevant parameters for setting up the environment are:
 | :----------------- | ------ | ------------------------------------------------------------ |
 | max_qubits $\eta$  | `int`  | maximal number of qubits available                           |
 | max_depth $\delta$ | `int`  | maximal circuit depth allowed (= truncation criterion)       |
-| challenge          | `str`  | RL challenge for which the circuit is to be built (see [Challenges](#challenges)) |
+| objective          | `str`  | RL objective for which the circuit is to be built (see [Objectives](#objectives)) |
 | punish             | `bool` | specifier for turning on multi-objectives (see [Further Objectives](#further-objectives)) |
 
 
@@ -141,31 +140,28 @@ git clone https://github.com/philippaltmann/QCD.git
 pip install -e '.[all]'
 ```
 
-Specify the intended \<Challenge\> as: "`challenge`-q`max_qubits`-d`max_depth`":
+Specify the intended \<Task\> as: "`objective`-q`max_qubits`-d`max_depth`":
 
 ```sh
-# Run a specific algoritm and challenge (requires `pip install -e '.[train]'`)
-python -m train [A2C | PPO | SAC | TD3] -e <Challenge>
+# Run a specific algorithm and task (requires `pip install -e '.[train]'`)
+python -m train [A2C | PPO | SAC | TD3] -e <Task>
 
-# Generate plots from the `results` folder (requires `pip install -e '.[plot]'`)
-python -m plot results
+# Generate plots from the `results` folder (requires `pip install -e '.[plot]'`) 
+python -m plot results -b # plot all runs in `results`, add random and evo baselines
 
 # To train the provided baseline algorithms, use (pip install -e .[all])
-./run
+./run.sh
 
 # Test the circuit designer (requires `pip install -e '.[test]'`)
-python -m circuit_designer.test
+python -m test
 ```
 
-## Results 
+## Results
+
+![Results](Results.png)
 
-![alt text](Results.png)
 
 
 ## Acknowledgements
 
 The research is part of the [Munich Quantum Valley](https://www.munich-quantum-valley.de), which is supported by the Bavarian state government with funds from the [Hightech Agenda Bayern Plus](https://www.hightechagenda.de).
-
-
-
-
diff --git a/Results.png b/Results.png