Sparse observables with extended alphabets #74

Merged
merged 8 commits into Qiskit:master on Jul 4, 2024

Conversation

jakelishman (Member):

Summary

This describes a new SparseObservable object that would be added to qiskit.quantum_info for use in the Estimator interface, to address two major problems:

  • the Pauli alphabet of SparsePauliOp makes it impossible to efficiently represent all operators that can be efficiently measured by hardware.
  • as device qubit counts scale up, the number of non-identity single-qubit terms within each observable term does not necessarily grow with them, making the SparsePauliOp representation (which stores every qubit of every term explicitly) inefficient.
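
As a concrete illustration of the first point (a hedged sketch using the existing SparsePauliOp API, not the proposed class; the same projector example comes up later in this thread): the all-zeros projector can be measured with a single all-Z basis measurement, yet its Pauli expansion needs 2^n terms.

```python
from itertools import product

from qiskit.quantum_info import SparsePauliOp

# |0...0><0...0| = tensor_k (I + Z)/2 expands into a sum over every I/Z string,
# so SparsePauliOp needs 2**n terms even though the measurement itself only
# requires counts in the single all-Z basis.
n = 10
labels = ["".join(bits) for bits in product("IZ", repeat=n)]
projector = SparsePauliOp.from_list([(label, 1 / 2**n) for label in labels])
print(projector.size)  # 1024 terms for just 10 qubits
```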

Scope

This design doc is only about what the new class will look like. A key component of that is making sure it can be used by Estimator implementations, but the exact mechanism by which the public Estimator interface evolves to accept it is out of scope for this document.

jakelishman and others added 6 commits June 20, 2024 12:17
The only con (lack of no-copy deserialisation / CSR-array construction)
can be alleviated by supplying an unsafe uninitialised Numpy view object
onto an object created with a `with_capacity` constructor from Rust
space.  This way, the buffers are owned by Rust space, and the
deserialisation / raw data can be written directly into the buffers.
* More complex code is needed from Rust space to handle mathematical manipulations efficiently in the happy path of "no Python-space parametrisation".

Jake: personally I'd avoid this unless we have a really strong compelling use-case for giving it first-class support.
A user can always work around this simply by splitting the terms of their sum into different broadcast axes in the estimator pub, then calculating the sums themselves, which gives them far more freedom.
Contributor:

I haven't seen strong use cases requiring parameterized observables, so I am fine leaving this out.

Member:

There are some use-cases around parameterized observables but I agree that I would not classify them as "strong", especially since there are alternative means by which to achieve the same end-result.

That said, I wonder if it will be possible to have some Python-level utilities (or maybe alternative classes) that could make working with parameterized observables a little bit easier. I am not sure what exactly that would look like, but am wondering if @jakelishman could comment on the feasibility of something like this.
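
For reference, a minimal sketch of the workaround described above, using the existing EstimatorV2 pub broadcasting with SparsePauliOp (SparseObservable itself is only proposed here): submit the parameter-free terms along one broadcast axis, then take the parametrised linear combination of the expectation values in Python space.

```python
from qiskit import QuantumCircuit
from qiskit.primitives import StatevectorEstimator
from qiskit.quantum_info import SparsePauliOp

# "Parametrised observable" a*ZZ + XX, with the coefficient a held in Python space.
a = 0.75
qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)

# Submit the two parameter-free terms as one broadcast axis of a single pub...
estimator = StatevectorEstimator()
pub_result = estimator.run([(qc, [SparsePauliOp("ZZ"), SparsePauliOp("XX")])]).result()[0]

# ...and combine the results client-side, where the parameter lives.
evs = pub_result.data.evs  # shape (2,): <ZZ>, <XX>
print(a * evs[0] + evs[1])
```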

@blakejohnson (Contributor):
Looks like a strong proposal.

####-sparse-observable.md (outdated review thread, resolved)

A core function of `Estimator` is to group terms that can be measured within the same execution.
There are many ways to do this, and we do not want to tie the observable to one particular implementation.
Qiskit will provide a function `SparseObservable.measurement_bases(*observables)` that takes an arbitrary number of `SparseObservable` instances and returns a set of the measurement bases needed to measure all terms.
Contributor:

This method would then just give all required measurement bases without any kind of optimization, which is left to other algorithms, right?

Member Author:

Yeah, the purpose is to separate "grouping of measurement bases" (which primitives / users will likely want to configure) from "what are the measurement bases?".
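
A hedged sketch of that separation using today's objects: the proposed SparseObservable.measurement_bases would hand back something like a PauliList of required bases, while grouping remains a separate, configurable step (here the existing PauliList.group_qubit_wise_commuting; the labels below are purely illustrative).

```python
from qiskit.quantum_info import PauliList

# Stand-in for what SparseObservable.measurement_bases(*observables) would
# return: one Pauli string per distinct measurement basis.
bases = PauliList(["ZZI", "ZIZ", "XXI", "IYY"])

# Grouping is a separate concern; primitives (or users) choose the strategy.
for group in bases.group_qubit_wise_commuting():
    print(group)
```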


```python
class SparseObservableView:
    num_bits: int
```
Contributor:

Maybe too small a comment for an RFC, but should this be `num_qubits` for consistency with the other observables & circuits?

Member Author:

The names and whatnot can be bikeshed, but this discrepancy was purposeful in the RFC, at least: there's interest from the primitives in not having this tied to qubits because of how they (eventually) plan to work dynamic circuits into the mixture.

That said, Ian and I talked about that fairly early on in the process, and it may well be something that's better done just by primitives-side documentation, keeping num_qubits as the term here, as you say, for consistency.

Contributor:

Calling them qubits is fine with me. The idea with dynamic circuits would be to attach observables to slices of measurement registers rather than to the entire circuit, with measurement terms prescribing instructions to insert before those registers. Therefore multiple qubits in a SparseObservable could in principle correspond to a single qubit, at the quantum programmer's discretion.

Since we are not there yet, and even when we get there it would be a point of minor concern, I'm not fussed.

Contributor:

If such a workflow is possible, it would certainly make sense to rename it! Until then IMO it might be a bit clearer to keep the existing names 😄

Member Author:

We can call this qubits in the actual implementation, no trouble.

There are several others that might be possible too.
These operations could all fairly easily be supported:

* Evolution of `SparseObservable` by another: this is completely doable, just naturally has quadratic complexity.
Contributor:

It comes up sometimes to compute the expectation value of the Hamiltonian squared in algos & applications (even in the estimator for the variance, maybe?), so this would seem like an important feature.

Member Author:

Adding the expectation value computation for a given state is certainly easy enough, the trick is probably just finding some sensible representation of the statevector; we don't have the concept of a "sparse statevector" in Qiskit at the moment, and this operator is intended for numbers of qubits that dense statevectors can't represent.

Contributor:

Hmm but if we square a Hamiltonian we might have to measure in new bases, e.g. if H = X -> H^2 = I we'd have to measure in Z-basis for the expval of H^2 (ok maybe a bit basic this example... 😄). Or maybe I didn't correctly understand your proposal 🤔

Member Author:

Oh sorry, I missed the word "squared" in your answer.

If we have some representation of a state, I think the expectation value of the square of a Hamiltonian might shake out neater in an API if we cast that problem to "evolve the (sparse) state by the operator, then take the inner product with itself"?
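
For reference, the identity behind that recasting (assuming a Hermitian $H$ and a state $|\psi\rangle$ that can be represented and evolved):

$$\langle\psi| H^2 |\psi\rangle = \langle\psi| H^\dagger H |\psi\rangle = \langle H\psi \,|\, H\psi\rangle,$$

so, for a state held in memory, the squared expectation value reduces to evolving the state once by the operator and taking a single inner product.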

Contributor:

Hm I'm not sure you can do that since you'd actually have to measure in different bases.. but we can also discuss this later 🙂

@mrossinek (Member) left a comment:

I like the proposal a lot 👍

Just left two comments on aspects that stood out to me as noteworthy.


These operations could all fairly easily be supported:

* Evolution of `SparseObservable` by another: this is completely doable, just naturally has quadratic complexity.
* Tidy-up structural compaction of the operator: summing all terms that share the same abstract operator, removing zeros at some specified tolerance.
Member:

I think this would be somewhat crucial, especially if we consider construction of operators to happen in an "iterative" manner by means of a sequence of mathematical operations.
The existing SparsePauliOp already has part of its .simplify routine implemented in Rust, but performance still leaves room for improvement. I wonder how much this new operator, implemented completely in Rust, would benefit purely from the language advantage? It might be difficult to predict though 🤔

At the same time, could the CSR-like structure allow an alternative (read: more performant) approach for an implementation of .simplify?

Just thinking out loud here...

Member Author:

For background, the high-level algorithm of SparsePauliOp.simplify is basically:

  1. start with an empty hashmap of pauli: coeff
  2. for each term of the sum, add the term into the hashmap, summing the coeffs if there's a match
  3. create a new SparsePauliOp with each term in the hashmap, if the coeff is sufficiently far from 0

The way we do that has a fair amount of fiddling around the edges that's making it less efficient than it could be, but asymptotically, the runtime complexity is already the best I can think of.

That said, the complexity still scales as $\mathcal O(\text{qubits}\times\text{terms})$. The CSR-like form would be effectively the same, and so have the same asymptotic complexity. As a rule of thumb, though, it would be faster when there's less memory used (i.e. individually sparse terms), simply because there's no real mathematical structure / algorithmic trickery going on here, and the limit is mostly the iteration speed over all stored data.

For operator construction, one thing (mostly unrelated to this comment) to highlight: the way I've written SparseObservable here, it's growable in place, which means some operations (+ being a notable one) can be done in place while growing, which is very nice for iterative construction. evolve is harder to do in-place, though.
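
A minimal Python sketch of that three-step accumulation (the real routine operates on packed binary-symplectic data in Rust; string labels and the tolerance handling are simplified here for illustration):

```python
def simplify_terms(terms, atol=1e-8):
    """Accumulate duplicate Pauli terms and drop near-zero coefficients.

    ``terms`` is an iterable of ``(pauli_label, coefficient)`` pairs; the
    asymptotic cost is O(qubits * terms), dominated by hashing the labels.
    """
    accumulated = {}
    for label, coeff in terms:
        accumulated[label] = accumulated.get(label, 0.0) + coeff
    return [(label, coeff) for label, coeff in accumulated.items() if abs(coeff) > atol]


print(simplify_terms([("XZ", 1.0), ("XZ", -1.0), ("IZ", 0.5j)]))  # [('IZ', 0.5j)]
```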

@pedrorrivero (Member) left a comment:

I really like this (much needed) RFC, awesome work @jakelishman @ihincks !

Comment on lines 127 to 134 of the RFC:
`SparseObservable` will support some set of mathematical operations.
At a minimum, the following will be supported:

* addition of two `SparseObservable`s
* tensor product of two `SparseObservable`s
* evolution of one `SparseObservable` by a Pauli string ($A' = P A P^\dagger$ for `SparseObservable` $A$ and Pauli $P$)
* multiplication by complex scalars
* structural equality of two `SparseObservable`s (structural not mathematical; it's highly inefficient to detect equality if the abstract terms form an over-complete spanning set).
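
For orientation, the analogous calls on the existing SparsePauliOp look like the sketch below; the SparseObservable surface proposed here is assumed to mirror them, but the exact API is not final.

```python
from qiskit.quantum_info import SparsePauliOp

a = SparsePauliOp.from_list([("XZ", 1.0), ("IZ", 0.5)])
b = SparsePauliOp.from_list([("ZI", 2.0)])

added = a + b            # addition of two operators
tensored = a.tensor(b)   # tensor product
scaled = 2.5j * a        # multiplication by a complex scalar
print(added, tensored, scaled, sep="\n")
```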
Member:

Would it make sense to add the following?

  1. Evolution over (closed group) basis rotations $H$, $S$ and $S^\dagger$
  2. Qubit permutations (to easily/efficiently account for layout and routing during circuit transpilation)

Member Author:

Sure, apply_layout is easy enough to do. Longer term, the story around compilation of observables will probably change a bit in Qiskit to be more streamlined, but for now we'll keep consistency with SparsePauliOp on that.

Evolution is mathematically sound for many operators; the tricks are mostly around finding nice representations of more complex objects for the API surface.
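
On the qubit-permutation point, a hedged sketch of the SparsePauliOp precedent as it exists today (the proposed class is assumed to gain an analogous apply_layout):

```python
from qiskit.quantum_info import SparsePauliOp

op = SparsePauliOp.from_list([("XZ", 1.0)])

# Map virtual qubits [0, 1] onto physical qubits [2, 0] of a 3-qubit target,
# padding the remaining physical qubit with identity.
permuted = op.apply_layout([2, 0], num_qubits=3)
print(permuted.paulis)
```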

Comment on lines 155 to 160 of the RFC:
Qiskit will provide a function `SparseObservable.measurement_bases(*observables)` that takes an arbitrary number of `SparseObservable` instances and returns a set of the measurement bases needed to measure all terms.
A measurement basis is a Pauli string.

Since the number of measurement bases will be (non-strictly) smaller than the total number of terms across all observables, and because only 2 bits of information per qubit are necessary to define the basis, there are not expected to be any immediate memory concerns with a representation of this.
Qiskit already has `PauliList` that can serve this purpose; it _could_ be bit-packed to use 8x less memory, but this can be done as a follow-up optimisation if it becomes a bottleneck.
From this point, we can continue to use `PauliList.group_qubitwise_commuting`, or any other future grouping function.
Member:

I really like the design; however, I don't fully see (from a quick reading) how we avoid falling back into memory bottlenecks if we end up producing Pauli/PauliLists as the measurement bases.

Is it just from the fact that the observables will be transmitted in the new format and measurement bases only generated server side? What am I missing? 🤔

Member Author:

The only part of SparsePauliOp that causes serious immediate memory problems is that some things that are efficient to measure are not efficient to represent. For example, the all-zeros projector state takes linear complexity/space to measure (it's an all-Z basis) but needs $2^n$ terms in SparsePauliOp to represent. Since, for the measurement, you just need to know which basis to take your counts in, the information about "what to measure" (including mitigation) can be stored efficiently by PauliList, and you reconstruct the requested observables later from the original SparseObservable.

Bit-packing PauliList can reduce its memory usage by a factor of about 8, but that's just a scaling factor. We already know we must be able to do operations that are linear in the number of qubits (or how would we twirl?), and PauliList can represent all the measurement bases we realistically care about for error-mitigation purposes for the next while.

All that said, how a primitives implementation actually manages its error mitigation is entirely up to it, and this isn't fixed by any public interface. So any primitive can use whatever they like in the backend to do this task; the point about PauliList is mostly just showing that we don't immediately need any new object.

Member:

Thanks @jakelishman! I was thinking more on all the design decisions to avoid explicit identities, but I see your point that this is not the biggest concern 🙂

@1ucian0 merged commit b04d53b into Qiskit:master on Jul 4, 2024 (1 check passed).
@jakelishman deleted the sparse-observable branch on July 4, 2024 at 15:16.