Gradient circuits 5/n: refactor base Differentiator (tensorflow#537)
* update to conform to new interface

* formatting

* increase sample number to reduce flaking

* trigger build

* trigger build

* trigger build

* Refactored into base class

* formatting

* format

* upgrade tests to compare against analytic diff

* doc string

* update build

* lint

* explicit tolerance

* other tolerance

* removed superfluous test

* remove lint

* revert sample count in noise test

* updated differentiator docstrings

* updated gradients tutorial

* formatting

* formatting

* formatting

* formatting

* formatting

* formatting

* formatting

* formatting

* formatting

* extraneous output removed

* formatting

* trigger build

* trigger build

* add back decorator

* Update tutorial language.
Antonio Martinez authored Apr 26, 2021
1 parent 0c8f03c commit 00f14d3
Showing 7 changed files with 117 additions and 832 deletions.
141 changes: 40 additions & 101 deletions docs/tutorials/gradients.ipynb
@@ -603,10 +603,20 @@
},
"source": [
"## 4. Advanced usage\n",
"Here you will learn how to define your own custom differentiation routines for quantum circuits.\n",
"All differentiators that exist inside of TensorFlow Quantum subclass `tfq.differentiators.Differentiator`. A differentiator must implement `differentiate_analytic` and `differentiate_sampled`.\n",
"All differentiators that exist inside of TensorFlow Quantum subclass `tfq.differentiators.Differentiator`. To implement a differentiator, a user must implement one of two interfaces. The standard is to implement `get_gradient_circuits`, which tells the base class which circuits to measure to obtain an estimate of the gradient. Alternatively, you can overload `differentiate_analytic` and `differentiate_sampled`; the class `tfq.differentiators.Adjoint` takes this route.\n",
"\n",
"The following uses TensorFlow Quantum constructs to implement the closed form solution from the first part of this tutorial."
"The following uses TensorFlow Quantum to implement the gradient of a circuit. You will use a small example of parameter shifting."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "J1xN6Ln5mB9N"
},
"source": [
"Recall the circuit you defined above, $|\\alpha⟩ = Y^{\\alpha}|0⟩$. As before, you can define a function as the expectation value of this circuit against the $X$ observable, $f(\\alpha) = ⟨\\alpha|X|\\alpha⟩$. Using [parameter shift rules](https://pennylane.ai/qml/glossary/parameter_shift.html), for this circuit, you can find that the derivative is\n",
"$$\\frac{\\partial}{\\partial \\alpha} f(\\alpha) = \\frac{\\pi}{2} f\\left(\\alpha + \\frac{1}{2}\\right) - \\frac{ \\pi}{2} f\\left(\\alpha - \\frac{1}{2}\\right)$$\n",
"The `get_gradient_circuits` function returns the components of this derivative."
]
},
{
@@ -625,107 +635,36 @@
" def __init__(self):\n",
" pass\n",
"\n",
" @tf.function\n",
" def get_gradient_circuits(self, programs, symbol_names, symbol_values):\n",
" \"\"\"Return circuits to compute gradients for given forward pass circuits.\n",
" \n",
" When implementing a gradient, it is often useful to describe the\n",
" intermediate computations in terms of transformed versions of the input\n",
" circuits. The details are beyond the scope of this tutorial, but interested\n",
" users should check out the differentiator implementations in the TFQ library\n",
" for examples.\n",
" \"\"\"\n",
" raise NotImplementedError(\n",
" \"Gradient circuits are not implemented in this tutorial.\")\n",
"\n",
" @tf.function\n",
" def _compute_gradient(self, symbol_values):\n",
" \"\"\"Compute the gradient based on symbol_values.\"\"\"\n",
"\n",
" # f(x) = sin(pi * x)\n",
" # f'(x) = pi * cos(pi * x)\n",
" return tf.cast(tf.cos(symbol_values * np.pi) * np.pi, tf.float32)\n",
"\n",
" @tf.function\n",
" def differentiate_analytic(self, programs, symbol_names, symbol_values,\n",
" pauli_sums, forward_pass_vals, grad):\n",
" \"\"\"Specify how to differentiate a circuit with analytical expectation.\n",
"\n",
" This is called at graph runtime by TensorFlow. `differentiate_analytic`\n",
" should calculate the gradient of a batch of circuits and return it\n",
" formatted as indicated below. See\n",
" `tfq.differentiators.ForwardDifference` for an example.\n",
"\n",
" Args:\n",
" programs: `tf.Tensor` of strings with shape [batch_size] containing\n",
" the string representations of the circuits to be executed.\n",
" symbol_names: `tf.Tensor` of strings with shape [n_params], which\n",
" is used to specify the order in which the values in\n",
" `symbol_values` should be placed inside of the circuits in\n",
" `programs`.\n",
" symbol_values: `tf.Tensor` of real numbers with shape\n",
" [batch_size, n_params] specifying parameter values to resolve\n",
" into the circuits specified by programs, following the ordering\n",
" dictated by `symbol_names`.\n",
" pauli_sums: `tf.Tensor` of strings with shape [batch_size, n_ops]\n",
" containing the string representation of the operators that will\n",
" be used on all of the circuits in the expectation calculations.\n",
" forward_pass_vals: `tf.Tensor` of real numbers with shape\n",
" [batch_size, n_ops] containing the output of the forward pass\n",
" through the op you are differentiating.\n",
" grad: `tf.Tensor` of real numbers with shape [batch_size, n_ops]\n",
" representing the gradient backpropagated to the output of the\n",
" op you are differentiating through.\n",
"\n",
" Returns:\n",
" A `tf.Tensor` with the same shape as `symbol_values` representing\n",
" the gradient backpropagated to the `symbol_values` input of the op\n",
" you are differentiating through.\n",
" Every gradient on a quantum computer can be computed via measurements\n",
" of transformed quantum circuits. Here, you implement a custom gradient\n",
" for a specific circuit. For a real differentiator, you will need to\n",
" implement this function in a more general way. See the differentiator\n",
" implementations in the TFQ library for examples.\n",
" \"\"\"\n",
"\n",
" # Computing gradients just based off of symbol_values.\n",
" return self._compute_gradient(symbol_values) * grad\n",
"\n",
" @tf.function\n",
" def differentiate_sampled(self, programs, symbol_names, symbol_values,\n",
" pauli_sums, num_samples, forward_pass_vals, grad):\n",
" \"\"\"Specify how to differentiate a circuit with sampled expectation.\n",
"\n",
" This is called at graph runtime by TensorFlow. `differentiate_sampled`\n",
" should calculate the gradient of a batch of circuits and return it\n",
" formatted as indicated below. See\n",
" `tfq.differentiators.ForwardDifference` for an example.\n",
"\n",
" Args:\n",
" programs: `tf.Tensor` of strings with shape [batch_size] containing\n",
" the string representations of the circuits to be executed.\n",
" symbol_names: `tf.Tensor` of strings with shape [n_params], which\n",
" is used to specify the order in which the values in\n",
" `symbol_values` should be placed inside of the circuits in\n",
" `programs`.\n",
" symbol_values: `tf.Tensor` of real numbers with shape\n",
" [batch_size, n_params] specifying parameter values to resolve\n",
" into the circuits specified by programs, following the ordering\n",
" dictated by `symbol_names`.\n",
" pauli_sums: `tf.Tensor` of strings with shape [batch_size, n_ops]\n",
" containing the string representation of the operators that will\n",
" be used on all of the circuits in the expectation calculations.\n",
" num_samples: `tf.Tensor` of positive integers representing the\n",
" number of samples per term in each term of pauli_sums used\n",
" during the forward pass.\n",
" forward_pass_vals: `tf.Tensor` of real numbers with shape\n",
" [batch_size, n_ops] containing the output of the forward pass\n",
" through the op you are differentiating.\n",
" grad: `tf.Tensor` of real numbers with shape [batch_size, n_ops]\n",
" representing the gradient backpropagated to the output of the\n",
" op you are differentiating through.\n",
"\n",
" Returns:\n",
" A `tf.Tensor` with the same shape as `symbol_values` representing\n",
" the gradient backpropagated to the `symbol_values` input of the op\n",
" you are differentiating through.\n",
" \"\"\"\n",
" return self._compute_gradient(symbol_values) * grad"
" # The two terms in the derivative are the same circuit...\n",
" batch_programs = tf.stack([programs, programs], axis=1)\n",
"\n",
" # ... with shifted parameter values.\n",
" shift = tf.constant(1/2)\n",
" forward = symbol_values + shift\n",
" backward = symbol_values - shift\n",
" batch_symbol_values = tf.stack([forward, backward], axis=1)\n",
" \n",
" # Weights are the coefficients of the terms in the derivative.\n",
" num_program_copies = tf.shape(batch_programs)[0]\n",
" batch_weights = tf.tile(tf.constant([[[np.pi/2, -np.pi/2]]]),\n",
" [num_program_copies, 1, 1])\n",
"\n",
" # The index map simply says which weights go with which circuits.\n",
" batch_mapper = tf.tile(\n",
" tf.constant([[[0, 1]]]), [num_program_copies, 1, 1])\n",
"\n",
" return (batch_programs, symbol_names, batch_symbol_values,\n",
" batch_weights, batch_mapper)"
]
},
{
@@ -735,7 +674,7 @@
"id": "bvEgw2m6NUAI"
},
"source": [
"This new differentiator can now be used with existing `tfq.layer` objects:"
"The `Differentiator` base class uses the components returned from `get_gradient_circuits` to calculate the derivative, as in the parameter shift formula you saw above. This new differentiator can now be used with existing `tfq.layer` objects:"
]
},
{
@@ -818,7 +757,7 @@
"circuit_tensor = tfq.convert_to_tensor([my_circuit])\n",
"op_tensor = tfq.convert_to_tensor([[pauli_x]])\n",
"single_value = tf.convert_to_tensor([[my_alpha]])\n",
"num_samples_tensor = tf.convert_to_tensor([[1000]])\n",
"num_samples_tensor = tf.convert_to_tensor([[5000]])\n",
"\n",
"with tf.GradientTape() as g:\n",
" g.watch(single_value)\n",
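For reference, a minimal end-to-end sketch of the refactored tutorial's differentiator in use (not part of the diff; it assumes the notebook's setup, with `MyDifferentiator` as defined in the cell above, and checks the result against the closed form d/dα f(α) = π cos(πα)):

# Sketch (not part of the diff): exercising the tutorial's custom
# parameter-shift differentiator end to end. Assumes the notebook's
# setup; `MyDifferentiator` is the class defined in the cell above.
import cirq
import numpy as np
import sympy
import tensorflow as tf
import tensorflow_quantum as tfq

alpha = sympy.Symbol('alpha')
qubit = cirq.GridQubit(0, 0)
my_circuit = cirq.Circuit(cirq.Y(qubit)**alpha)
pauli_x = cirq.X(qubit)

# Attach the custom differentiator to an analytic expectation layer.
expectation = tfq.layers.Expectation(differentiator=MyDifferentiator())

my_alpha = 0.3
values = tf.convert_to_tensor([[my_alpha]])
with tf.GradientTape() as g:
    g.watch(values)
    forward = expectation(my_circuit,
                          operators=pauli_x,
                          symbol_names=[alpha],
                          symbol_values=values)
gradient = g.gradient(forward, values)

# For this circuit f(alpha) = sin(pi * alpha), so the two should agree.
print(gradient.numpy(), np.pi * np.cos(np.pi * my_alpha))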
89 changes: 77 additions & 12 deletions tensorflow_quantum/python/differentiators/differentiator.py
@@ -218,6 +218,15 @@ def get_gradient_circuits(self, programs, symbol_names, symbol_values):
`tf.Tensor` objects give all necessary information to recreate the
internal logic of the differentiator.
This base class defines the standard way to use the outputs of this
function to obtain either analytic gradients or sample gradients.
Below is code copied directly from the `differentiate_analytic`
default implementation, which is then compared to how one could
get this gradient automatically. The point is that the derivatives of
some functions cannot be calculated via the available auto-diff (such
as when the function is not expressible efficiently as a PauliSum),
in which case one needs to use `get_gradient_circuits` manually.
Suppose we have some inputs `programs`, `symbol_names`, and
`symbol_values`. To get the derivative of the expectation values of a
tensor of PauliSums `pauli_sums` with respect to these inputs, do:
@@ -226,7 +235,7 @@
>>> diff = <some differentiator>()
>>> (
... batch_programs, new_symbol_names, batch_symbol_values,
... batch_mapper
... batch_weights, batch_mapper
... ) = diff.get_gradient_circuits(
... programs, symbol_names, symbol_values)
>>> exp_layer = tfq.layers.Expectation()
@@ -315,15 +324,19 @@ def get_gradient_circuits(self, programs, symbol_names, symbol_values):
the output `batch_weights`.
"""

@abc.abstractmethod
@catch_empty_inputs
@tf.function
def differentiate_analytic(self, programs, symbol_names, symbol_values,
pauli_sums, forward_pass_vals, grad):
"""Specify how to differentiate a circuit with analytical expectation.
"""Differentiate a circuit with analytical expectation.
This is called at graph runtime by TensorFlow. `differentiate_analytic`
should calculate the gradient of a batch of circuits and return it
formatted as indicated below. See
`tfq.differentiators.ForwardDifference` for an example.
calls the inheriting differentiator's `get_gradient_circuits` and uses
those components to construct the gradient.
Note: the default implementation does not use `forward_pass_vals`; the
inheriting differentiator is free to override the default implementation
and use this argument if desired.
Args:
programs: `tf.Tensor` of strings with shape [batch_size] containing
@@ -351,16 +364,43 @@ def differentiate_sampled(self, programs, symbol_names, symbol_values,
the gradient backpropagated to the `symbol_values` input of the op
you are differentiating through.
"""

@abc.abstractmethod
(batch_programs, new_symbol_names, batch_symbol_values, batch_weights,
batch_mapper) = self.get_gradient_circuits(programs, symbol_names,
symbol_values)
m_i = tf.shape(batch_programs)[1]
batch_pauli_sums = tf.tile(tf.expand_dims(pauli_sums, 1), [1, m_i, 1])
n_batch_programs = tf.reduce_prod(tf.shape(batch_programs))
n_symbols = tf.shape(new_symbol_names)[0]
n_ops = tf.shape(pauli_sums)[1]
batch_expectations = self.expectation_op(
tf.reshape(batch_programs, [n_batch_programs]), new_symbol_names,
tf.reshape(batch_symbol_values, [n_batch_programs, n_symbols]),
tf.reshape(batch_pauli_sums, [n_batch_programs, n_ops]))
batch_expectations = tf.reshape(batch_expectations,
tf.shape(batch_pauli_sums))

# has shape [n_programs, n_symbols, n_ops]
batch_jacobian = tf.map_fn(
lambda x: tf.einsum('sm,smo->so', x[0], tf.gather(x[1], x[2])),
(batch_weights, batch_expectations, batch_mapper),
fn_output_signature=tf.float32)

# now apply the chain rule
return tf.einsum('pso,po->ps', batch_jacobian, grad)

@catch_empty_inputs
@tf.function
def differentiate_sampled(self, programs, symbol_names, symbol_values,
pauli_sums, num_samples, forward_pass_vals, grad):
"""Specify how to differentiate a circuit with sampled expectation.
"""Differentiate a circuit with sampled expectation.
This is called at graph runtime by TensorFlow. `differentiate_sampled`
should calculate the gradient of a batch of circuits and return it
formatted as indicated below. See
`tfq.differentiators.ForwardDifference` for an example.
calls the inheriting differentiator's `get_gradient_circuits` and uses
those components to construct the gradient.
Note: the default implementation does not use `forward_pass_vals`; the
inheriting differentiator is free to override the default implementation
and use this argument if desired.
Args:
programs: `tf.Tensor` of strings with shape [batch_size] containing
@@ -391,3 +431,28 @@ def differentiate_sampled(self, programs, symbol_names, symbol_values,
the gradient backpropagated to the `symbol_values` input of the op
you are differentiating through.
"""
(batch_programs, new_symbol_names, batch_symbol_values, batch_weights,
batch_mapper) = self.get_gradient_circuits(programs, symbol_names,
symbol_values)
m_i = tf.shape(batch_programs)[1]
batch_pauli_sums = tf.tile(tf.expand_dims(pauli_sums, 1), [1, m_i, 1])
batch_num_samples = tf.tile(tf.expand_dims(num_samples, 1), [1, m_i, 1])
n_batch_programs = tf.reduce_prod(tf.shape(batch_programs))
n_symbols = tf.shape(new_symbol_names)[0]
n_ops = tf.shape(pauli_sums)[1]
batch_expectations = self.expectation_op(
tf.reshape(batch_programs, [n_batch_programs]), new_symbol_names,
tf.reshape(batch_symbol_values, [n_batch_programs, n_symbols]),
tf.reshape(batch_pauli_sums, [n_batch_programs, n_ops]),
tf.reshape(batch_num_samples, [n_batch_programs, n_ops]))
batch_expectations = tf.reshape(batch_expectations,
tf.shape(batch_pauli_sums))

# has shape [n_programs, n_symbols, n_ops]
batch_jacobian = tf.map_fn(
lambda x: tf.einsum('sm,smo->so', x[0], tf.gather(x[1], x[2])),
(batch_weights, batch_expectations, batch_mapper),
fn_output_signature=tf.float32)

# now apply the chain rule
return tf.einsum('pso,po->ps', batch_jacobian, grad)
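
The contraction above is compact, so here is a small NumPy sketch of the same arithmetic for the one-symbol parameter-shift case (not library code; the expectation values are made up for illustration):

# Sketch (not library code): the batched jacobian contraction used by
# the default differentiate_analytic/differentiate_sampled, in NumPy.
# Shapes follow the docstrings: batch_weights and batch_mapper are
# [n_programs, n_symbols, m_i]; batch_expectations is
# [n_programs, m_i, n_ops]; grad is [n_programs, n_ops].
import numpy as np

# One program, one symbol, two shifted circuits (m_i = 2), one op.
batch_expectations = np.array([[[0.8], [-0.2]]])       # forward, backward
batch_weights = np.array([[[np.pi / 2, -np.pi / 2]]])  # shift coefficients
batch_mapper = np.array([[[0, 1]]])                    # weight -> circuit
grad = np.array([[1.0]])                               # incoming gradient

# Equivalent of tf.map_fn over programs with
# tf.einsum('sm,smo->so', weights, gather(expectations, mapper)).
jacobian = np.stack([
    np.einsum('sm,smo->so', w, e[m])
    for w, e, m in zip(batch_weights, batch_expectations, batch_mapper)
])  # shape [n_programs, n_symbols, n_ops]

# Chain rule, equivalent of tf.einsum('pso,po->ps', jacobian, grad).
# Here: pi/2 * (0.8 - (-0.2)) = pi/2.
print(np.einsum('pso,po->ps', jacobian, grad))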
tensorflow_quantum/python/differentiators/differentiator_test.py
@@ -23,14 +23,6 @@ class WorkingDifferentiator(differentiator.Differentiator):
def get_gradient_circuits(self, programs, symbol_names, symbol_values):
"""test."""

def differentiate_analytic(self, programs, symbol_names, symbol_values,
pauli_sums, forward_pass_vals, grad):
"""test."""

def differentiate_sampled(self, programs, symbol_names, symbol_values,
num_samples, pauli_sums, forward_pass_vals, grad):
"""test."""


class DifferentiatorTest(tf.test.TestCase):
"""Test that we can properly subclass differentiator."""
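
With the abstract method stubs deleted, the subclassing contract is just `get_gradient_circuits`: the base class now derives both `differentiate_analytic` and `differentiate_sampled` from its output. A minimal sketch of a conforming subclass (illustrative only, not part of the library or its tests; its zero weights yield an identically zero gradient):

# Sketch: after this refactor, a Differentiator subclass only has to
# provide get_gradient_circuits; the base class supplies both
# differentiate_analytic and differentiate_sampled.
import tensorflow as tf
import tensorflow_quantum as tfq


class TrivialDifferentiator(tfq.differentiators.Differentiator):
    """Hypothetical subclass whose gradient circuits are zero-weighted."""

    def get_gradient_circuits(self, programs, symbol_names, symbol_values):
        n_programs = tf.shape(programs)[0]
        n_symbols = tf.shape(symbol_names)[0]
        # One copy of each program (m_i = 1), with unshifted values.
        batch_programs = tf.expand_dims(programs, 1)
        batch_symbol_values = tf.expand_dims(symbol_values, 1)
        # Zero weights make the resulting gradient identically zero.
        batch_weights = tf.zeros([n_programs, n_symbols, 1])
        batch_mapper = tf.zeros([n_programs, n_symbols, 1], dtype=tf.int32)
        return (batch_programs, symbol_names, batch_symbol_values,
                batch_weights, batch_mapper)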