*What could be the performance of a 512-qubit machine? By what metrics could it be measured, both on its own and in comparison with conventional computers?*

That’s a very difficult question, because in the HPC world you run Linpack to compare computers, and in our case we don’t. Our machine returns 10,000 solutions per second, but each run explores a solution space of about 1.34e+154 (that is, 2^512) possible states. It’s quite different. Probably the best example to give is what Google did before they bought the machine.

I should say first that most algorithms either can’t run on the machine yet or, if they can, may not run very fast. What this machine does natively is solve optimization problems, like finding the lowest spot in the Alps. You can also map machine learning problems onto it, and it looks like it might be good for Monte Carlo methods, although we don’t know that for sure yet. Optimization and machine learning are the two types of problems the machine seems best suited for so far.

What Google said before buying it was that they wanted to run tests on an optimization problem and a machine learning problem. They created a set of synthetic optimization benchmarks, about 400-500 different ones, meant to exercise the system and help determine its performance envelope. The benchmarks didn’t represent a real problem: they were made up specifically for the occasion. So they asked us to run these benchmarks, then ran the same benchmarks on their Intel platform with IBM’s optimization package CPLEX and another optimization package called Tabu, both running on traditional computers. They found three things. First, from small problems up to problems that would use all 500 qubits on our machine, the performance of our machine was constant: no matter the size of the problem, we returned 10,000 solutions a second. Second, for problems that used fewer than about 100 of our qubits, the D-Wave machine was slower, but as the problems got larger, it took longer and longer for CPLEX and Tabu to solve them. And third, for the largest problems we were able to run at that time, at about 500 qubits, we were about 10,000 times faster.

That’s the promise of this machine: if you can map a problem onto it and the problem fits well, there is the opportunity for great performance. That was one thing Google gave us, a comparison of a solution to an optimization problem on our machine versus on a traditional machine. 99 times out of 100, we either can’t run a problem or we’re slower, but every once in a while there’s one like this where we’re much, much faster. That’s the promise of the future with this machine. And it will only get better as we go to 1,000 qubits, 2,000 qubits, and so on.

*Did you get other comparable performance results between conventional supercomputers and the D-Wave machine?*

The other thing Google wanted to do, as I mentioned, was to try a machine learning problem. They gave us a set of several hundred images that included cars and tried to use our machine to recognize an automobile. On traditional computers they were about 84% effective. After six months of work figuring out how to create an algorithm to recognize a car in an image, they achieved about 94% effectiveness on the D-Wave machine, about 10 percentage points better. They were then able to take the machine learning classifiers trained to recognize a car and put them back on the traditional machines, and it turned out that, since our classifiers were better, they also took fewer CPU cycles there. In fact, they used only 30-40% of the CPU cycles. The result was more effective at recognizing a car by 10 percentage points and used fewer CPU cycles, so that was a big win for them. The program at Google that purchased our machine is Google Glass, which, as you know, is a battery-operated computer, so using fewer CPU cycles is a huge benefit.

*Given the state of the technology, what are the main challenges D-Wave and the research community are facing today?*

On the hardware side, we’re pushing to get from 500 to 1,000 qubits, then 2,000 and so on, and we can see a path to do that. We also would like to have better connectivity between qubits to be able to solve a broader set of problems, so we’re looking at network connectivity and topology.

*Can you elaborate?*

It’s a problem of physical real estate on the chip. Qubits are currently laid out in such a way that they overlap one another, so each qubit only connects to its nearest neighbors. To connect a qubit to others, we’re looking at different ways of laying out the chips in order to achieve more connectivity. Some of that includes three-dimensional layouts, which we have not done yet. Much of it comes down to physical layout, and manufacturing is the challenge there.

Secondly, running on this machine today is like it must have been on traditional computers in the 1950s. In our case, the equivalent of FORTRAN hasn’t been invented yet. There is no equivalent of a FORTRAN compiler, no math libraries, no graphics libraries, no sine function, no cosine function, and so on. We have a lot of work to do to create software tools that make this machine easier to use for more people. That’s the second challenge.

The third challenge is to find more applications that map onto it. As we make the architecture a little bit more general purpose and get better software tools, then more applications or algorithms will be able to execute on the machine. Today, I wouldn’t say it’s a research machine, but it is a special-purpose machine. It does this one thing, it does it very fast and it’s unique in the world in doing it.

*What about error-correction mechanisms? Are we there yet? Can we expect a generic model applicable to any problem?*

No. Our machine doesn’t do memory or logic error correction in the traditional high performance computing sense. When our machine solves these three-dimensional problems, errors happen. That’s why people run 1,000 solutions or iterations: some of the numbers they get back may be clearly wrong. Rather than putting error detection and correction hardware in the machine, today at least, given its current state, it’s better to run your problem 100 times. Because it is a statistical machine, and a very fast one, a statistical approach works best.
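This statistical approach can be illustrated with a small sketch. It is not D-Wave code: the `energy` landscape, the sample range, and the sample count are invented for illustration. The idea is simply to draw many candidate solutions and keep the lowest-energy one, so the occasional clearly wrong reading loses out automatically.

```python
import random

def energy(x):
    """Toy one-dimensional energy landscape standing in for the
    function being minimized (an assumption for illustration).
    It has two valleys, near x = -2 and x = 3."""
    return (x - 3) ** 2 * (x + 2) ** 2 + x

def best_of_samples(n_samples=10_000, lo=-5.0, hi=5.0):
    """Draw many candidate solutions and keep the best one,
    mimicking the 'run it many times, take the best' strategy."""
    best_x, best_e = None, float("inf")
    for _ in range(n_samples):
        x = random.uniform(lo, hi)
        e = energy(x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e
```

With enough samples, the returned point lands in the deepest valley (near x = -2 here) even though no individual sample is trusted on its own.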

*Ok but then, do you envision the integration of such mechanisms sometime in the future?*

Probably. Again, it depends on the architecture you’re using. I probably should have said this at the start, but there are two different architectures for quantum computers. One is the gate-model computer, which would work much like today’s machines. Most of the research work of the last 20 years has gone into trying to create devices for those machines. As you know, they’re up to maybe 4 or 7 qubits or something like that, and they run for maybe a second.

The other approach is ours. It’s a special-purpose architecture, but we’re already at 500 qubits and it’s in production. The gate-model quantum machine, being more like today’s machines, would be more familiar to IT people and more applications would go onto it, but realistically it’s probably 10 to 20 years away. That one will require error correction. Our special-purpose architecture doesn’t require it, because you can look at the function you’re trying to minimize and say “this answer is clearly wrong”. Now, the more accurate you can make the machine, the better, so over time we’ll figure out how to do error correction as well.

*You said there is no FORTRAN language equivalent for your current machine. With regard to optimization, how do you actually map an application onto it? And when you say optimization, what does that actually mean?*

You can program the machine at three, almost four, levels. The first is the equivalent of what you and I would know as machine language. You create that one machine-language instruction (the one I talked about earlier) that has 2,000 variables, send it directly to the machine, and get the answer back. We do that sometimes and some of our customers do it too, but it’s a lot of work.
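The single instruction described here, a set of variables with individual biases and pairwise couplings whose lowest-energy assignment is the answer, resembles a quadratic binary optimization. A minimal sketch, using a tiny hypothetical 3-variable instance (the matrix `Q`, the function names, and the brute-force solver are all invented for illustration and are not the real instruction format):

```python
import itertools

def qubo_energy(Q, x):
    """Energy of a binary assignment x under an upper-triangular
    matrix Q: diagonal entries are per-variable biases, off-diagonal
    entries are pairwise couplings."""
    n = len(x)
    e = 0.0
    for i in range(n):
        e += Q[i][i] * x[i]
        for j in range(i + 1, n):
            e += Q[i][j] * x[i] * x[j]
    return e

def brute_force_minimum(Q):
    """On the real hardware the whole problem is one 'instruction';
    here we simply enumerate all assignments of a tiny instance."""
    n = len(Q)
    return min(itertools.product((0, 1), repeat=n),
               key=lambda x: qubo_energy(Q, x))

# Hypothetical instance: each variable wants to be 1 (bias -1),
# but adjacent variables are penalized for both being 1 (coupling +2).
Q = [[-1.0,  2.0,  0.0],
     [ 0.0, -1.0,  2.0],
     [ 0.0,  0.0, -1.0]]
```

For this instance the minimum-energy assignment is `(1, 0, 1)`: the two end variables turn on, and the middle one stays off to avoid the coupling penalties.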

At the next level, we have interfaces to C, C++, FORTRAN, MATLAB and, I think, Python as well. If people have an optimization problem they’re trying to solve and can describe that three-dimensional energy landscape in C, there’s a way to turn that landscape into a machine instruction in C or C++ and send it to our machine.

Then we have a couple of experimental tools. One is a kind of interpreter for a constraint specification language. We’re just starting to explore it, but if you have an optimization problem, you could say, for example, that ‘A has to be greater than B, and B has to be less than C’. The little language we created lets you specify, in a more natural way, what end up as constraints in an optimization problem. We have another tool at a slightly higher level, oriented toward these optimizations, which can launch searches over solution spaces on our machine. It’s also at a very early stage.
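One common way such stated constraints can be turned into an optimization problem is to add a penalty term for each constraint that is zero when it holds and positive when it is violated. A toy sketch for binary variables, loosely inspired by the ‘A greater than B, B less than C’ example above but relaxed to ‘a >= b’ and ‘b <= c’ so it makes sense for 0/1 values (the function name and penalty weight are invented for illustration):

```python
import itertools

PENALTY = 10.0

def constraint_energy(a, b, c):
    """Penalty encoding of 'a >= b' and 'b <= c' for binary a, b, c.
    Each term is zero when its constraint holds and PENALTY otherwise."""
    e = 0.0
    e += PENALTY * (1 - a) * b   # 'a >= b' violated only when a=0, b=1
    e += PENALTY * b * (1 - c)   # 'b <= c' violated only when b=1, c=0
    return e

# The zero-penalty assignments are exactly those satisfying
# both constraints; a minimizer would be steered toward them.
feasible = [x for x in itertools.product((0, 1), repeat=3)
            if constraint_energy(*x) == 0.0]
```

An interpreter for a constraint language could, in principle, translate each declared constraint into a penalty term like these and add them to the objective being minimized.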

People who are using our machine are mostly mathematicians and computer scientists, and you know how mathematics is, so I can imagine a future interface to our machine where you express your solution much as you do in mathematics. You write your equations, the optimization problem you’re trying to solve, and the interface creates an instruction and sends it off to the machine. But that’s sometime in the future. Today the tools are, as I said, machine language, instructions in C, C++ or FORTRAN, the interpreter, and a sort of rudimentary optimization compiler.

© HPC Today 2021 - All rights reserved.