A non-mathematical introduction to Zero Knowledge Proofs

This article will attempt to teach you what a zero knowledge proof is, taking the perspective that it is a tool - this means there is barely any mathematics involved; and nothing to be afraid of.

POSTED ON: 05.09.2022

Errol Drummond

Exlusive content by Zero Knowledge Natives

A non-mathematical introduction to ZKPs

This article will attempt to teach you what a zero knowledge proof is, taking the perspective that it is a tool - this means there is barely any mathematics involved and nothing to be afraid of. This will serve to prepare builders to make use of ZKP systems, and also help business people figure out whether ZKPs can be applied to solve their problem.

Common analogies

When explaining ZKPs for the first time, people often use one of a few analogies; let’s recap what these were.

ID at a bar

When you go to a bar and ask for a drink, sometimes they request your ID to check that you are of drinking age (which is 18 in civilised countries). But your ID doesn’t just prove that you are (or are not) of drinking age. It reveals your exact age and date of birth, it reveals your full name, and it may reveal other things like your address. ZKPs will provide a way to provide the minimal knowledge required, and nothing more - so you will be able to prove that you are allowed to drink, and not give away any more info than that.

So this analogy is nice in that it is extremely straightforward, and it highlights that ZKPs will be some sort of way to prove something and not give away unnecessary info. But it does nothing in the way of giving you understanding about how a ZKP works or how it can be used to solve a problem you have.

Where’s Wally

The great game Where’s Wally (or Where’s Waldo, depending on where you’re from) involves looking at a page in book depicting a very crowded and colourful environment filled with people and strange happenings. The point of the game is to find Wally on the page.

node101 A non-mathematical introduction to Zero Knowledge Proofs

You can create a very straightforward ZKP to prove that you know where Wally is, without revealing where he actually is. Crazy, right?! What you will do is have the people you are proving this fact to turn around, and then you will get your phone and take a close up photo of Wally on the page, such that your photo will only show Wally’s face and a tiny bit of the surrounding.

When you show this photo to the people you are proving to, they will recognise that the only way you would have been able to take that photo is if you actually knew where Wally was.

This analogy is really nice because it provides people with a legit ZKP scheme that is fun and memorable. But it is limited. It doesn’t provide any actual understanding of what the ZKP systems that we use do, and neither does it let you figure out whether ZKPs can be used to solve your problem

Ali Baba’s cave

Only read this analogy if you are very new and the above 2 didn’t set the scene enough for you. If you do read it, can you then explain the limits of this analogy?

What do ZKPs actually do?

Now we want to lay the foundations of what ZKPs really do. With this scaffolding, you will be able to reason whether ZKPs can be applied to your problem in a very effective manner, and you will also be able to begin your journey of learning to create ZKPs if you are a developer.

Caveat: I will not be explaining all generalities of ZKP systems (that is one of the problems with current intros I believe). I will be explaining one specific part of a proof system called PLONK. The reason for this is that it is the pedagogically simplest to learn and will give you everything you need to get started in the realm of applying ZKPs to create value. No deep mathematics needed. Furthermore, ZKP systems are very modular. So once you have a simple understanding of this one, it will be far easier to start to swap out modular components for others and see how they work.

Circuit Satisfiability

Circuit satisfiability is what ZKPs actually are. The proofs that we create are actually bits of data that can be used to confirm that we actually filled out some computational circuit (usually referred to as a zero knowledge circuit) in a valid way. In order to understand this, let’s first dig into what we mean by a circuit.

The above statement requires some context for you to decipher what it really means. We're going to dig into that now.

Circuits

A good starting place here is normal computers. You may or may not be aware of this, but computers you make use of conduct all their computation through chips that contain tonnes and tonnes of NAND gates.

A NAND gate is a Not AND gate. How does an AND gate work? Well, you take in 2 binary inputs, and the output is 1 if both of those inputs is 1, and zero otherwise.

So the Not aspect of the NAND gate means you take the output of the AND gate and just invert, turning 0s into 1s and vice versa.

Normal computers just have loads of these gates set up in various constructions to allow them to do various types of logic and arithmetic. If you’re interested in the basic building blocks of a computer I would really recommend this fantastic and simple book

So a normal computer is just a bunch of NAND gates with 0s and 1s going around, and it can do all this amazing computation and stuff for us.

So what is a circuit?

A circuit is just a finite number of these NAND gates connected together in whatever way we want. Makes sense, right? For example, we could just take a single NAND gate, that would be a circuit. Or we could take k NAND gates and put them together by connecting the output of one NAND gate as an input to the next NAND gate. Then we would get a circuit that would be a bit like a zig-zagging snake.

And zero knowledge circuits?

Well these are going to be very similar. Instead of NAND gates, plonk gives us addition and multiplication gates (we can imagine all of these as having 2 input wires and 1 output wire). We can place these gates and connect up their wires like before to create a circuit.

Previously we had 0s and 1s going through the wires, what do we have now? Okay, this is the only bit of real maths we need in this article - I promise you that with a little bit of bravery you will see that this maths really isn’t that hard, you do it every day probably without realising it.

Finite fields

We will have whole numbers between 0 and p (including 0 but not including p). For example let’s say p is 13. What if we put two numbers through a gate where the output is more than 12? Well if we do 8+9, the result is 17. But 17 is out side of our range. So what we can do, just like we do with the 24 hour clock, is we can just take away the largest multiple, which in our case is 13 (in a clock we would take away 12). So in fact 8+9=17≡4. Notice the second equals sign is not an equals sign, it is an equivalent to sign and has three lines rather than two. We use this because in our setting, the two numbers on each side really are indistinguishable, even though usually the numbers are different. But we are working in something called a finite field, and therefore we end up with these equivalences.

How was that; not too much maths, right? There is a little more to dig into regarding these things called finite fields, but I promise you that the basic idea of them is pretty straightforward, and that to get an idea of how to use ZKPs you don’t need to go any deeper.

You may also have questioned why I chose p to be 13 rather than 12 considering I said it was similar to working with a clock. The reason is that if we want our range (0 to p) to be a finite field (rather than just a mathematical object called a ring), we need to ensure that p is prime.

In our ZKP systems we need to choose p to be reallyyyy large for security reasons; p can be around 2^256, which is around 10^80. For reference, 10^80 is around about the number of atoms in the universe. That’s a pretty big number.

ZK circuits

Plonk lets us place addition and multiplication gates, and connect output and input wires as we please to create a circuit. The values going through the wires are elements from a chosen finite field (p was set to a specific value, let’s now call it p_c). So every value in a wire is in the range [0,p_c) (the closed brace [ means we include 0, and the parentheses we don’t include p_c, but only go up to the number before it, p_c-1).

Zero knowledge proofs

Now that we know what is meant by a zero knowledge circuit, what is meant by a zero knowledge proof? The proof is some piece of data that proves that you filled in a zk circuit in a valid way. By valid we mean that wires around a gate really do correspond, for example if you put "a" and "b" into an addition gate, the output wire can only be "a+b".

How do you verify?

In order to verify a proof, you of course need the proof, but you also need to know what zk circuit the proof is claiming was filled in correctly by you.

Where is the zero knowledge?

The proof only tells people you filled in a zk circuit correctly, it doesn’t say anything about what the values of any of the wires are, that’s the zero knowledge!

But if I don’t know what any of the wire values are, why would I care about your proof? …. you wouldn’t!

The value of ZKPs

The value of ZKPs derive from how the computation that was proved to have been done correctly relates to some info we have in the outside, real world. If somebody gives you a proof, tells you what the circuit was, but doesn’t reveal any of the wire values, you will not care about their proof. It has no relation to the outside world. Let’s look at some examples of potentially valuable proofs.

RSA factoring challenge

It’s quite possible you will recognise the acronym RSA. RSA is the encryption mechanism that has been used to conduct secure online messaging for a long time, it is how our banks communicate with us in a safe manner.

The way RSA works is by taking two huge prime numbers "p" and "q", and multiplying them to get "pq". Now you can communicate with you bank securely online by encrypting your messages with "pq". The only people who can decrypt your messages and your banks’ messages are people who know what p and q are. And only you and your bank will know these values because you went through something called the Diffie-Hellman key exchange protocol.

Why is this safe? Can’t people just work out "p" and "q" by dividing? Well, technically yes, but actually there is no known efficient way to do that. People have been trying for thousands of years to get an algorithm that does this quickly, but it turns out nobody has found a method that is much better than just dividing by all primes one by one. It is suspected that a faster algorithm does not exist, though nobody has managed to prove this yet.

Even so, can’t they brute force it? This is why we chose really large primes. It will take an attacker millions of years with the world’s best supercomputers to break this encryption. Safe enough, right?

So the RSA factoring challenge is a way to know where our limits are in terms of factorisation. RSA Laboratories put out a bunch of RSA numbers and asked people if they could figure out which 2 primes were used to create them. Feel free to read a bit more on the wikipedia page .

For example, RSA-896 is a number that is around 10^270. It’s hard to begin to describe how large this number is. It’s like having our universe, but each atom is another universe, and each atom in those universes is another universe. So our universe cubed. Woah.

If you figure out which 2 primes were used to create this number, you will get a cash prize of $75,000 dollars! That’s a decent amount of money. But what if you were worried that if you shared your result somebody would steal your prize? Enter ZKPs!

You could create a ZKP that proves this. This is a great example, because it is the smallest circuit somebody can make that is non empty. What does the circuit look like? Well it would just consist of a single multiplication gate.

Of course for the input wires you would put "p" in one of them and "q" in the other one, and then of course the output wire would be "pq", which in this scenario would be the number called RSA-896. So when you make your proof, and you tell people what your circuit was, and you reveal the output wire, people will have to conclude what it means.

Well, you made a zero knowledge proof, so that means that you filled in the multiplication gate correctly, and since the prime factorisation of this number only has 2 primes, the conclusion can only be that you know what "p" and "q" are; i.e. you solved the factoring challenge and will get the cash prize!

Okay, that's not quite true

So actually I lied slightly. The above example can be broken; i.e. there is a way you can lie - can you figure out how?

Hint: how many ways are there to factor RSA-896?

Can you figure out what additional constraints we might need to make the above example work?

One way we can do it is to add a gadget that will check that the items you claim to be p and q are not in fact 1. So we can add a gadget that checks that neither of your inputs is actually 1. By a gadget we mean some sub-circuit that serves a particular purpose, for example if you laid down a bunch of gates such that it conducted a hash function you could keep that as a sub-circuit called a gadget and use it in other circuits you want to make too.

The is_not_one() gadget will take p as an input and we will ensure that the result of this gadget is 1 (it is a binary output, 1 meaning the entered value is not 1, and 0 that it is 1).

This allows us to mention another important aspect of plonk - how does plonk ensure that the value entered into the is_not_one() gadget is the same as one of the values entered into the main multiplication? Well, plonk has something called copy constraints.

Earlier I said that you can join wires of gates - this is sort of true. The way you join a wire is by telling the proof system 2 wire values should be the same. So in the proof system you actually lay a bunch of gates, and then there is a list of pairs indicating which wires need to have the same value - i.e. which ones are connected.

For example:

(left input of multiplication, input of first is_not_zero),

(right input of multiplication, input of second is_not_zero),

….

(things that go here would for example include connections between wires in the is_not_zero gadget)

Satoshi’s public key

Another fun example. Let’s say you are Satoshi and you want to prove it by using the secret key. Now of course you could just sign a message with the secret key, and that’s a valid way to do it. But you can also do it with a ZKP. Can you figure out what the circuit could be?

One way to do it would be to create a circuit that consists of the hash function that turns a bitcoin secret key into a public key. This hash function is of course just a computation, and it can be represented by addition and multiplication gates. We can make a mock sketch of this circuit as below.

Now if you place Satoshi’s secret key at the starting wire and fill out the rest of the wire values based on that, the output wire should be Satoshi’s public key. Then if you make a proof, tell people what the circuit was, and reveal the output wire, people will have to make a conclusion.

From the perspective of somebody else, you ran a number through a hash function and got out Satoshi’s public key. Conclusion? You are Satoshi!

Okay that’s not quite true, it only proves that you know Satoshi’s secret key. But that’s still more evidence than anybody else has.

Trust?

We often talk about trust in these situations, and lots of people say things are trustless. That is sort of true, but also sort of not. We usually don’t remove trust, we often end up removing some trust aspects, and all trust is left to be placed on something in the outside world rather than the proof.

For example in the Satoshi example above, you’re going to need to trust that the hash function is secure and hasn’t been broken if you want to conclude that the creator of the proof is actually Satoshi. You’re also going to need to trust that the secret key wasn’t stolen.

In all ZKP applications there will still be elements of trust that are needed in order for people to have the conclusions you want.

Hopefully you now feel like the foundations of what ZKPs actually do are beginning to elucidate themselves to you. This is really all the starting info you need, and the next steps consist of adding more detail via more examples of what can be built, with analysis of what circuit would solve a particular problem, and what trust assumptions of the outside world are needed to ensure the conclusion you want.

Conclusion

In any case, you should be ready to attempt to see how ZKPs can solve your problem, and also to start building basic circuits such as in the Halo2 book.

See you in the next article, where we will be diving into applying ZKPs to voting.

Want to start building?

If you're really eager to get going, check out our walkthrough of the Ethereum Foundation's PSE team's implementation of Halo2

You might like to follow ZK Natives on Twitter

Privacy I - Why we need privacy

The Great Pilgrimage, a Proof of Personhood

Errol Drummond

Exlusive content by Zero Knowledge Natives

node101

Learn more on node101