Understanding Grover's Algorithm in Quantum Computing

Last week I put up a video introducing quantum computing, and in the final section we were stepping through something known as Grover's algorithm. And based on the comments that I saw, I think there was a very common point of confusion that reveals I clearly could have done a better job explaining a core piece of it. Right here I wanted to throw together a very quick supplement in the hopes of adding a bit of clarity. The premise was to have a function which is somehow triggered by a unique value out of a big bag of options, and the puzzle is basically to figure out how do you find that unique value just by applying the function on various inputs. Now in a classical setting, it's not a very interesting question, the best you can do is guess and check, but what we walked through was a completely different approach that you can take that becomes possible in the setting of a quantum computer. When we did this, there was a certain key step where, if I'm understanding the comments correctly, it looked to a lot of people like in order to apply this key step, you would have to already know the value that you're searching for, which would of course defeat the whole purpose of the algorithm. More specifically, we had this very high dimensional vector space, and one of the axes in that space corresponded to the value that we're searching for, and this step of the algorithm looked like flipping along that axis, multiplying any component of a vector in that direction by negative one. Now viewers were essentially asking, whoa whoa whoa, how could you know how to do that without already knowing which axis you're searching for? I genuinely tried to forestall that objection, but I think I failed, so backing up, I think the whole discussion might be clearer if we focus on a very concrete example, something like solving a sudoku. On your normal classical computer, it's not hard to write a function that checks whether a proposed solution follows all of the sudoku rules and solves the puzzle. You know, it would check the rows, the columns, squares for duplicates, things like that. If you have written this function, just because you know how to verify a solution, it's not at all obvious what the solution is in the first place. This is, after all, why a sudoku is a puzzle. The rules alone don't reveal the answer. There are other situations where this is actually a much stronger assumption. The function SHA-256, for example, is what's called a cryptographic hash function. That basically means if you want to find what input gives you a particular output, it is strongly believed that you really can't gain much insight by looking at how the function is implemented. If someone could reverse engineer it, they would be mining all of the bitcoin in the world and breaking numerous other cryptographic schemes. But it's believed that the best thing you can do when you're searching for a particular output is guess and check. So it's not that the key value is like hiding inside the function behind some curtain, it's more of a difficult-to-find emergent phenomenon of the function itself. Now the idea with Grover's algorithm is that if you have this sort of verifier function for some hard problem and you translate it into the language of quantum computing, there is a method for sifting out valid solutions which requires fewer steps than simple guessing and checking over all the possibilities. Now to be clear, it is not dramatically faster, it's only a quadratic speedup, and given the overheads for making quantum computing work, this frankly has questionable utility. In fact, let's talk a little bit more about that at the end. First, to the clarification, if you have this Sudoku verifying function, it's not like you can just run it on a quantum computer. After all, quantum computers speak an entirely different language, it's a totally different framework for computing that looks a lot more like vector manipulation. The first step to port over this verification function into the new context is to imagine that we've kind of compiled your verifier into a bunch of logic gates, things like AND, OR, and NOT. So for any proposed Sudoku solution, you would represent it all in binary, all of those bits would be processed by your web of logic gates, and the output would be a 1 if it's a valid Sudoku solution, and a 0 for all the invalid ones. And again, being able to assemble these logic gates does not require knowing ahead of time which input solves the puzzle. The logic gates distill the rules of the game, but not the strategy. Now I'm assuming everyone has watched the main video, in particular the core section about the fundamentals of the state vector, but as a quick recap, the upshot is that you think of every possible bit string as a unit vector along a coordinate axis in some high dimensional space. In the language of linear algebra, you would call these the basis vectors of your coordinate system. For example, with a 2-qubit quantum computer, you would have 4 possible bit strings, and these would all look like basis directions in some 4-dimensional space. These state vectors get very big very fast. If you have a k-qubit quantum computer, that gives you 2 to the k possible bit strings, and you think of each one of them as being a coordinate direction in some very high dimensional vector space. Now operations on a quantum computer don't spit out true or false the way you can see on a classical computer. Instead they take in a vector and spit out a new vector, both of which live in the same space. And like I said last video, you often think about them as somehow flipping or rotating the vectors in that space. Now here's the crux of the confusion. I mentioned how if you have this classical verifier function, something like a Sudoku checker that spits out a 1 or a 0, it is possible to translate it into an operation on a quantum computer that has the following behavior. If a bit string returns 1 for true up in the classical case, then down in the quantum case, the corresponding basis vector gets multiplied by negative 1, effectively flipping 180 degrees. And then if in the classical case a bit string outputs 0 for false, then in the quantum translation, the corresponding basis vector is unchanged. Now I can see three reasons that this step might have caused some confusion. First of all, I didn't explain how it actually works. I didn't step through the translation. Now my hope with that video was that it's just not too huge a leap to have you accept that in principle there exists this correspondence between returning true and false up in the classical world and multiplying by negative 1 or positive 1 down in the quantum world with vectors. Now maybe that is a leap. I could preview for you what that translation looks like. The relevant search term here is quantum compilation, but I'm going to be honest, I don't think it would add much clarity, in the same way that knowing the logic gates that implement addition don't really teach you much about how to add two numbers. Now it's not exactly like this, but loosely speaking, every time you see an AND gate, you translate it into a quantum operation that looks kind of like an AND. Every time you see a NOT gate, there's a quantum analogue that does something kind of like a NOT. I suspect the real cause of confusion originates not from a lack of low-level detail, but from how I had framed the entire setup. I opened that video by having you imagine that there was some mystery number that we're searching for, and as one commenter helpfully pointed out, this made it seem like the computer kind of knows the answer ahead of time, it's just hiding it from us. And this is almost certainly exacerbated by me briefly flashing an example function that just checks if the input is 12, and saying that we were going to treat the function as a black box. That's on me. That's a misleading way to open things. What I wanted to foreshadow is how, with Grover's algorithm, the only way you use the new quantum function is by trying it out on inputs, as opposed to maybe like reverse engineering it. So in that sense, it's treated as a black box. But to be clear, in order to translate the classical verifier into the quantum version, you absolutely need to get into the guts of the function. And if this is going to be a compelling example, it would be very silly if the only thing the function did was just check if the input equals some hidden number. The Sudoku example is much better, and a cryptographic hash like SHA-256 would be much better still. In these contexts, there is one value that will trigger the function, and we don't know what it is, but the computer also doesn't know what it is. It's not like the key value is just hiding in the source code. Whether we're up here in the classical setting, where triggering the function means returning true, or down in the quantum setting, where triggering the function means multiplying by negative one, which specific key input does this is a difficult to find and emergent property of those logic gates. It's not something that's baked in ahead of time. The other potential source of confusion, I suspect, is that I didn't appropriately emphasize the idea of linearity. In fact, this is a central enough feature of quantum computing and quantum mechanics that half of my reason for making this whole follow-up video is as an excuse to talk about it. So most vectors don't look like a pure basis direction, they look like some weighted sum of all the different basis vectors. One way you can represent this is with a column vector, where we think of each component as being associated with one of the possible bitstrings. The more common convention among physicists is to write general vectors as an explicit weighted sum of all the basis directions, each one represented with a ket. When the state vector for a computer looks like this, you say that it's in superposition, meaning it has some non-zero component associated with multiple distinct bitstrings. It's a lot like saying if someone is walking northeast, their velocity is a superposition of north and east, they're travelling both directions at the same time. A core idea from the last video is that you never actually see the coordinates of a state vector in superposition like this. When you read out from the computer, all you see is one of the bitstrings at random, and the probability of seeing it is equal to the square of the magnitude of the component of the state vector associated with that value. I'm saying magnitude here with the absolute value signs, because in general, these components can be complex numbers, but for simplicity, I'm only going to be showing real values. When I say that operations in quantum computers are linear, what I mean is that if you pass in one of these weighted sums of the different basis directions, that is to say, a superposition, then the output looks like the same weighted sum, but of the transformed versions of each vector. So here's a very small example. On a single qubit, there's an operation that we call a z-gate. What it does is it leaves the 0 direction unchanged, but it multiplies that vertical 1 direction by negative 1. These are only two out of the infinitely many possible state vectors, but they're all you need to know. If you pass in a superposition of those two, something that has a little bit of 0 plus a little bit of 1, what you do is look at what the z-gate does to each part separately, and then add those together. Again, with the same components. In this case, that means flipping the sign associated with the 1 component. Geometrically, when you draw this vector in a 2D space, the action of a z-gate looks like flipping around the x-axis. The z-gate is simple enough that just by looking at the definition, you can clearly see which direction gets flipped. But keep in mind, for more complicated functions, the definition alone might not so easily reveal how it behaves. Take a look back at the Sudoku verification function and its translation onto a quantum computer, and then say that the state of your computer is not one of those clean basis directions, but it is a combination of all the basis vectors, a superposition of every possible bit string in this context representing every possible solution to the Sudoku. To figure out what our verifier does to this new vector, what you do is look at what it would do to each basis separately, and then the output is going to be the same scaled sum of the result. In this case, most parts of that sum stay unchanged, but one of them, the one that's associated with the key input that represents a Sudoku solution, that part is going to have it sign flipped. When you do this, it's very tempting to look at it and say that the function is acting on every possible basis vector at once in parallel, and then adding the results. Now that might be true, but I invite you to reflect on whether that's necessarily a fair way to summarize it. As an analogy, if a hiker is walking northeast, and you tell him to rotate 90 degrees, that rotation is a linear operation. In other words, the final direction is the same as what you would get by rotating the north vector 90 degrees, rotating the east vector 90 degrees, and adding the two results. But that doesn't mean that you have to perform two separate rotations in parallel in order to move the hiker. The linearity is a property of the transformation, it's not necessarily a set of instructions on how to do it. It's very similar over here. The effect of the quantum translation for our verifier looks like adding together what its effect would be on all of the basis vectors, but I will leave it to your interpretation whether this means that it is necessarily acting on all 2 to the k possible bitstrings at once. Here, I've been writing the vector the physicist way, but if I write it with a column vector instead, what this operation looks like is taking one component of that vector and multiplying it by negative 1, specifically whichever component corresponds to the sudoku solution. So stepping back, hopefully this helps clear up some of the confusion there was in that last video. The starting point of Grover's algorithm is to assume that you're given a function that flips one component of a vector like this, though you don't know which one. The puzzle is to somehow figure out which direction is getting flipped, where all you're allowed to do is apply this function to some well-chosen set of inputs, and in Grover's case, it involves interleaving it with a certain other operation in the toolkit of quantum computing. I get it, that is a really weird place to start from. And it doesn't help that this is a famously confusing topic. If you find it weird to work with a state vector whose components you never actually observe and which instead acts like a kind of square root of a probability distribution, you're not alone. Everyone finds that very weird. At this point I genuinely can't tell if I'm overexplaining things or still under-explaining them, but there is one final aspect of the explanation from last time that may have added to this confusion. When we visualized the algorithm, we chose to view everything on a certain two-dimensional slice of the enormous n-dimensional vector space where all these state vectors live, and this slice, by definition, included the axis associated with that mystery value we're searching for. Now, in case that left anyone with the impression that part of the algorithm was to choose that slice, let me be very clear, it's not. The algorithm is just doing what it's going to do. It interleaves two operations that go back and forth. It doesn't have a care in the world how you and I choose to visualize it. The fact that the state vector stays confined to this particular plane, that's a happy emergent property of the algorithm. It is in no way part of the instruction set that we're giving to the computer. One very final thing that I do think deserves some added reflection is how Grover's algorithm, while very thought-provoking, is maybe just not really useful. Take the Sudoku example. The number of possible solutions will depend on how many of those 81 squares start off blank, but let's call it something like 9 to the power 60. If you tried to use your classical verifier function to find a solution by brute force, there are way, way too many possibilities to check. But even if you had a fully functioning quantum computer, sitting on your desk right now with ample qubits and no issues maintaining coherence and all that, using Grover's algorithm, this would still take around 9 to the 30 steps, which is a much smaller number, but it's still enormous. In this case, there's obviously many smarter ways to solve a Sudoku than a brute force search, but if you take something like SHA-256, inverting it by brute force takes 2 to the 256 steps on a classical computer. Using Grover's algorithm, that becomes 2 to the 128, at least up to that pi-fourths constant that we saw last video. So even in some sci-fi future where quantum computers are as far along as today's classical computers, that is still an infeasibly large number of steps. The way some people write about quantum computing, it makes it sound like the moment they arrive, everything is going to change and all of cryptography will break. And it's true that there are specific problems that have exponential speedups, and especially with RSA, some of those are relevant to cryptography, but that's not true in general. This quadratic speedup is much more representative of what you get for most problems. It's really cool, and the math is just beautiful, and there continues to be lots of interesting research in the field, but one of my hopes with this whole project is that you now have enough background to maybe see through some of the hyperbole that certain outlets are so fond of. Thank you.

Transcript for:Understanding Grover's Algorithm in Quantum Computing

Transcript for:
Understanding Grover's Algorithm in Quantum Computing