1 00:00:00,940 --> 00:00:05,310 The following content is provided under a Creative Commons license. Your support will 2 00:00:05,310 --> 00:00:10,900 help MIT OpenCourseWare continue to offer high-quality educational resources for free. 3 00:00:10,900 --> 00:00:18,440 To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare 4 00:00:18,440 --> 00:00:20,190 at ocw.mit.edu. 5 00:00:20,190 --> 00:00:27,950 TADGE DRYJA: OK, so today I'll talk about CoinJoin, signature aggregation, well before 6 00:00:27,950 --> 00:00:33,649 that privacy, but privacy, CoinJoin, CoinShuffle, signature aggregation, and how these all connect. 7 00:00:33,649 --> 00:00:38,280 It's a little bit of a leap to put these both in the same class, but not really. They're 8 00:00:38,280 --> 00:00:45,089 connected. But they are somewhat separate things. Today, I'll talk about privacy, CoinJoin, 9 00:00:45,089 --> 00:00:50,899 and various ways to do that, and then aggregate signatures, so Schnorr multi-signatures, and 10 00:00:50,899 --> 00:00:54,699 then aggregation, and attacks on the system. 11 00:00:54,699 --> 00:01:02,600 OK, so the idea is privacy. And there's a bunch of terms we can use here, like anonymity 12 00:01:02,600 --> 00:01:07,710 and fungibility. And whatever the term for this, I don't need any, because I have nothing 13 00:01:07,710 --> 00:01:08,720 to hide. 14 00:01:08,720 --> 00:01:12,789 So I don't need any privacy. I don't need any anonymity. I'm a good person. I don't 15 00:01:12,789 --> 00:01:16,580 break the law. I'm a boring guy. I don't do anything crazy. 16 00:01:16,580 --> 00:01:22,640 I'm joking, but-- no, I'm not joking about being boring and not doing anything crazy. 17 00:01:22,640 --> 00:01:27,070 That is actually pretty true. I mostly just work on this stuff. 18 00:01:27,070 --> 00:01:36,210 But, yeah, that's a fairly common thing that people say. 
And the sort of clearest example 19 00:01:36,210 --> 00:01:40,670 is if you don't have anything to hide, you don't have any bitcoin, because literally, 20 00:01:40,670 --> 00:01:45,550 if you don't hide your private key, someone will just immediately take your bitcoin. So 21 00:01:45,550 --> 00:01:48,030 you do have secrets. 22 00:01:48,030 --> 00:01:54,060 Your secret is your private key and your password. And if bitcoin and these types of systems 23 00:01:54,060 --> 00:01:59,420 really take off, potentially most of your money and a lot of your wealth could be tied 24 00:01:59,420 --> 00:02:05,161 up in what is a secret. So if you're not able to keep your secrets you could lose a lot 25 00:02:05,161 --> 00:02:12,319 of money. And larger, I sort of think that going forward-- this is very thought leader, 26 00:02:12,319 --> 00:02:13,770 future kind of thing. 27 00:02:13,770 --> 00:02:19,730 But going forward, it seems like sort of all you have is your secrets. To the extent that 28 00:02:19,730 --> 00:02:23,510 you can own anything, the thing that you really have control of, you're like, OK, I got these 29 00:02:23,510 --> 00:02:29,990 things in my head that no one else knows. And maybe I own my car. But that's also sort 30 00:02:29,990 --> 00:02:33,500 of property rights, and legal systems, and maybe you can get repossessed. 31 00:02:33,500 --> 00:02:38,000 Maybe I own this land. But, again, that's sort of this thing with the state. But I definitely 32 00:02:38,000 --> 00:02:44,390 really have the stuff in my head. And I can at least try to keep that. 33 00:02:44,390 --> 00:02:49,410 So even if you think you have nothing to hide, you do have your private keys to hide. And 34 00:02:49,410 --> 00:02:56,480 that's sort of the obvious one. But larger, you want privacy because, in general, you 35 00:02:56,480 --> 00:03:00,410 don't want to reveal stuff about your coins. This is specific to bitcoin. 
36 00:03:00,410 --> 00:03:06,760 But my philosophy is I generally don't want to reveal stuff ever. And it's a lot of times 37 00:03:06,760 --> 00:03:11,840 at stores. I remember like Radio Shack, when it was still a thing when I was in high school, 38 00:03:11,840 --> 00:03:15,780 they would always ask you your phone number when you bought anything. And when I was little, 39 00:03:15,780 --> 00:03:20,020 I would just tell them my parents phone number at my house, because they asked. 40 00:03:20,020 --> 00:03:22,930 But then once I was in college, I was like, wait, no I'm just going to make something 41 00:03:22,930 --> 00:03:29,060 up. And now I do that all the time. And it sometimes gets annoying for other people, 42 00:03:29,060 --> 00:03:34,040 because I would like go to a restaurant. They ask to put my name down. I'll say I'm Fred. 43 00:03:34,040 --> 00:03:39,020 And then my other friend comes. And she was like, wait, you're not listed. And then she 44 00:03:39,020 --> 00:03:45,890 put her name in. I'm like, no, I'm already there. I'm Fred, and confusing things like 45 00:03:45,890 --> 00:03:46,890 that. 46 00:03:46,890 --> 00:03:55,680 And I hope, and I sort of think that we're starting to see this kind of thing. Like, 47 00:03:55,680 --> 00:03:59,510 there's like Facebook hearings now. And people are all shocked that the company Facebook 48 00:03:59,510 --> 00:04:04,120 can read everyone's messages and see everything. Well, yeah, that's how it works. You're giving 49 00:04:04,120 --> 00:04:07,450 it all to their servers. 50 00:04:07,450 --> 00:04:11,190 So I think it's starting to be, like where before companies would all be like, we just 51 00:04:11,190 --> 00:04:15,069 want to get all the user data possible, hopefully going forward data will be seen more as a 52 00:04:15,069 --> 00:04:21,470 liability and less as an asset. 
We're not there yet, because the companies who get sort 53 00:04:21,470 --> 00:04:27,820 of hacked and lose all this user data don't really get punished. So the Equifax thing, 54 00:04:27,820 --> 00:04:31,690 and all these different data breaches, there's not a ton of consequences yet. 55 00:04:31,690 --> 00:04:34,840 Maybe there will be in the future. And maybe companies will start to not want all this 56 00:04:34,840 --> 00:04:42,750 data. But it's not as clean cut as, well, I didn't do anything wrong, so I don't want 57 00:04:42,750 --> 00:04:43,750 to hide anything. 58 00:04:43,750 --> 00:04:49,440 Sort of default hide is my stance. And there's so many conflicting interests where every 59 00:04:49,440 --> 00:04:53,310 time you're on the websites, it's like, this website wants your location. And you're like, 60 00:04:53,310 --> 00:04:59,570 wait, how do I always disable that? I don't know the settings. But everyone wants this 61 00:04:59,570 --> 00:05:00,570 stuff. 62 00:05:00,570 --> 00:05:08,030 So another reason, specifically in the case of money, is what's called fungibility. And 63 00:05:08,030 --> 00:05:12,880 it sounds like a weird word. It just means that every bitcoin is the same, or every dollar 64 00:05:12,880 --> 00:05:14,220 is the same. 65 00:05:14,220 --> 00:05:21,000 So dollars are fungible, in that, physically, this $20 bill and this $20 bill are worth 66 00:05:21,000 --> 00:05:27,120 the same. And different denominations, so I will give you a 20 for two 10's. And anyone 67 00:05:27,120 --> 00:05:29,370 will do that. They're worth the same. 68 00:05:29,370 --> 00:05:35,070 You don't have, well, this dollar is not worth quite as much. And sometimes that happens, 69 00:05:35,070 --> 00:05:41,220 even with dollar bills. Like in other countries, I think in the Philippines when the new $100 70 00:05:41,220 --> 00:05:45,669 bills were issued that have the like little blue shiny thing, people didn't like them. 
71 00:05:45,669 --> 00:05:49,760 And people were like, no, I won't trade my old $100 bill for your new $100 bill. I don't 72 00:05:49,760 --> 00:05:52,540 trust this new one. Eventually, they got used to it. 73 00:05:52,540 --> 00:05:57,480 But there are cases where even money can be not fungible. So the classic example is, OK, 74 00:05:57,480 --> 00:06:03,150 gold is fungible. Diamonds are not. Diamonds-- I have little experience with diamonds, they 75 00:06:03,150 --> 00:06:06,960 seem kind of silly-- but they're all unique. 76 00:06:06,960 --> 00:06:10,490 They've got different grades, and different cuts, and different sizes, and carats, and 77 00:06:10,490 --> 00:06:15,800 little inclusions, and imperfections. And you've got this whole industry, where people 78 00:06:15,800 --> 00:06:19,790 with these little eye things look at the diamonds, and figure out, oh, this is a good one. And 79 00:06:19,790 --> 00:06:21,479 this one is so-so. 80 00:06:21,479 --> 00:06:27,280 Whereas gold sort of also has assayers, and good delivery standards, and all these 81 00:06:27,280 --> 00:06:31,419 things for the gold bars. But it's much more of a standard. There are no judgment calls. 82 00:06:31,419 --> 00:06:37,020 It's like, OK, this is gold. It's 99.95% pure. It weighs this much. This is how much it's 83 00:06:37,020 --> 00:06:43,580 worth. And you can chop gold up and-- sorry, divisibility is different than fungibility. 84 00:06:43,580 --> 00:06:48,680 So if diamonds were all identical but not divisible, they could still maybe function 85 00:06:48,680 --> 00:06:54,430 as money. You would just need a lot of little diamonds. But the fact that gold is also divisible 86 00:06:54,430 --> 00:06:58,090 really is a bonus. But the fungibility is really what's nice about it, and what makes 87 00:06:58,090 --> 00:07:01,680 it sort of used as money. 88 00:07:01,680 --> 00:07:09,470 Currency's fungibility is actually-- I'm not a lawyer, but I hang out with them. 
It's enforced 89 00:07:09,470 --> 00:07:15,610 by the law itself. So there's an interesting case, Crawford v. The Royal Bank, in Scotland 90 00:07:15,610 --> 00:07:19,010 a couple hundred years ago, where there was a guy. 91 00:07:19,010 --> 00:07:24,259 And he wrote his name on a 20-pound note. It's like, Crawford. This is his money. And 92 00:07:24,259 --> 00:07:28,010 I think he was trying to mail it to someone. He lost it. 93 00:07:28,010 --> 00:07:31,759 Years later, the note shows up at a bank. And he's like, no, see, that was my money. 94 00:07:31,759 --> 00:07:35,110 I lost it. And he demands the money back. 95 00:07:35,110 --> 00:07:39,389 I lost that money. This is my property. Give it back. It's got my name on it. I can prove 96 00:07:39,389 --> 00:07:40,449 it. 97 00:07:40,449 --> 00:07:44,550 And then the court says no, no, no, that's not how money works. The bank has the money. 98 00:07:44,550 --> 00:07:51,260 You can't just write your name on money and then have it be yours. 99 00:07:51,260 --> 00:07:57,639 And this is very different from property. So if you steal a bicycle, and then you go 100 00:07:57,639 --> 00:08:02,860 sell it on-- sorry, if someone steals a bicycle, sells it on Craigslist, and you buy 101 00:08:02,860 --> 00:08:06,830 it, and you didn't know, it was just, whatever, some guy's selling a bike on Craigslist. The 102 00:08:06,830 --> 00:08:09,630 police show up, and say, hey, that bicycle was stolen. We're taking it back. 103 00:08:09,630 --> 00:08:14,360 And you're like, well, I didn't know. I just bought it on Craigslist. I paid a couple hundred 104 00:08:14,360 --> 00:08:17,981 bucks. Police are like, sorry, you're not in trouble. We're not going to charge you 105 00:08:17,981 --> 00:08:18,981 with a crime. 106 00:08:18,981 --> 00:08:23,600 But these are stolen goods. We're taking them back, giving them to the rightful owner. And 107 00:08:23,600 --> 00:08:28,160 that's how the law works. 
You can try to complain about it, but they're taking it. 108 00:08:28,160 --> 00:08:33,349 Money is different. If you're running a pizza store, someone comes in, gives you a 20, and 109 00:08:33,349 --> 00:08:38,120 buys a pizza, and then presumably eats it or whatever, and then a couple hours later 110 00:08:38,120 --> 00:08:43,709 the police come in and say, hey, that guy stole that $20 bill. And we're taking it back, 111 00:08:43,709 --> 00:08:45,230 and giving it to its rightful owner. 112 00:08:45,230 --> 00:08:49,840 As the shopkeeper, you're like, I don't even know which $20 bill. Even though there are 113 00:08:49,840 --> 00:08:53,820 serial numbers, and the police could say, it was this $20 bill, we're taking it back 114 00:08:53,820 --> 00:09:02,750 and returning it to the person he stole it from, legally they can't do that. 115 00:09:02,750 --> 00:09:07,480 So this is from talking to lawyer people. So money's different. Money is not property, 116 00:09:07,480 --> 00:09:12,190 in the same sense. And the fungibility is enforced by the state. 117 00:09:12,190 --> 00:09:21,400 And the divisibility and fungibility are enforced by the banks. The US dollar isn't worth anything. 118 00:09:21,400 --> 00:09:25,310 The government and the banks don't have to give you anything for $1. 119 00:09:25,310 --> 00:09:30,090 They used to sort of have this agreement, where, OK, you give us $35. And we'll give 120 00:09:30,090 --> 00:09:35,500 you an ounce of gold. That agreement is no longer in place. But they do have to give you 121 00:09:35,500 --> 00:09:36,580 change. 122 00:09:36,580 --> 00:09:41,480 So they do have a duty where if you have a bunch of old, beat-up $20 bills, the bank sort 123 00:09:41,480 --> 00:09:45,490 of does have to honor those and give you new $20 bills. And that's one of the things that 124 00:09:45,490 --> 00:09:49,300 the Bureau of Engraving and Printing, and the Fed, they all have this system. 
So they 125 00:09:49,300 --> 00:09:53,050 maintain the currency, in that sense. 126 00:09:53,050 --> 00:09:56,730 So there are all these things that you don't really think about in how money 127 00:09:56,730 --> 00:10:01,800 operates. But fungibility, divisibility, those are really important. And most of it is enforced 128 00:10:01,800 --> 00:10:03,850 by the legal system. 129 00:10:03,850 --> 00:10:09,980 Bitcoin does not have these legal protections. We want bitcoin to work, even where it's not 130 00:10:09,980 --> 00:10:17,471 legally recognized as being money, and our goal is to make it like money. So if there are 131 00:10:17,471 --> 00:10:22,600 stolen bitcoins, and you can trace them, law enforcement can say, no, those bitcoins 132 00:10:22,600 --> 00:10:25,640 are stolen. We're recovering them. 133 00:10:25,640 --> 00:10:33,740 So it's not treated as a currency. It's also not controlled as fungible. So in the absence 134 00:10:33,740 --> 00:10:38,710 of these sort of protections from the legal system, the software needs to help enforce 135 00:10:38,710 --> 00:10:44,190 the fungibility and divisibility of these things. So you can sort of treat the bitcoin 136 00:10:44,190 --> 00:10:48,500 software and ruleset as sort of a legal-- it's not a legal system. 137 00:10:48,500 --> 00:10:53,360 But it's a system of rules that governs how the bitcoin transactions work. And so the 138 00:10:53,360 --> 00:10:59,030 fungibility has got to be in the system. Right? So gold, for example, they don't have-- I 139 00:10:59,030 --> 00:11:02,560 don't want to say they don't have rules about gold. There's probably all sorts of old laws 140 00:11:02,560 --> 00:11:03,580 about gold. 141 00:11:03,580 --> 00:11:10,110 But you don't really need a law enforcing the fungibility of gold, in that you can sort 142 00:11:10,110 --> 00:11:14,930 of melt it down, and it's untraceable. 
If you have this old gold coin from this empire, 143 00:11:14,930 --> 00:11:19,830 and this other gold coin from this empire, you melt the gold, you put it into a new coin, 144 00:11:19,830 --> 00:11:26,060 no one can tell. So, similarly, bitcoin needs to enforce fungibility that way. 145 00:11:26,060 --> 00:11:31,670 OK, so any questions so far about these definitions or objections? I don't think bitcoin should 146 00:11:31,670 --> 00:11:32,880 be fungible. 147 00:11:32,880 --> 00:11:44,720 AUDIENCE: So bitcoin is more like gold than diamonds. But, in a sense, right after your 148 00:11:44,720 --> 00:11:50,100 transaction happens, the probability that it-- say it's only one block deep. I don't 149 00:11:50,100 --> 00:11:52,570 think it's actually treated this way. 150 00:11:52,570 --> 00:12:01,113 But someone could potentially, oh, I'll just consider that 90% of one bitcoin, because 151 00:12:01,113 --> 00:12:02,550 the probability or whatever-- 152 00:12:02,550 --> 00:12:06,810 TADGE DRYJA: They're not valued the same, because they haven't been fully confirmed. 153 00:12:06,810 --> 00:12:12,030 Yeah, that's an interesting way to look at it. I'm not talking about sort of the time-based 154 00:12:12,030 --> 00:12:17,020 things here, because I guess in many cases you say, OK, I'm going to wait a day. 155 00:12:17,020 --> 00:12:21,121 If it's got 100 confirmations, it's good. I'll consider it full value. But, yeah, there 156 00:12:21,121 --> 00:12:25,800 could be things where you look at the probability of a double spend, and then sort of assign 157 00:12:25,800 --> 00:12:32,030 value to it, increasing as time passes or something. That's sort of a unique thing to 158 00:12:32,030 --> 00:12:37,529 bitcoin, where the sort of finality of the transfer increases with time, which is not 159 00:12:37,529 --> 00:12:41,040 really the case with gold or dollars, where you're like, oh, I got it. 
160 00:12:41,040 --> 00:12:46,050 The fact that I've had it for one minute or 10 minutes doesn't really change that I've 161 00:12:46,050 --> 00:12:51,920 got it. So that's an interesting point. I'm not really addressing it here. 162 00:12:51,920 --> 00:12:56,360 In this conversation, just assume you wait a day. And then all the probability of it 163 00:12:56,360 --> 00:13:02,690 going back is sort of negligible. But it's more the fungibility in that you can trace 164 00:13:02,690 --> 00:13:10,590 where it came from, and so assign values to different coins based on that. 165 00:13:10,590 --> 00:13:15,310 So there's a real world example of this. I don't want to name specific names. But customer 166 00:13:15,310 --> 00:13:18,850 buys some coins. You go to exchange. You send them a couple hundred bucks. 167 00:13:18,850 --> 00:13:23,440 You say, I want to buy some bitcoins. And the customer buys the bitcoins, and then transfers 168 00:13:23,440 --> 00:13:29,350 the coins to a betting shop, a betting company in the UK, and bets on a soccer game, for 169 00:13:29,350 --> 00:13:30,930 example. And he wins. 170 00:13:30,930 --> 00:13:37,260 Great, so he bought a coin, sent it over to this UK gambling company, bet on a soccer 171 00:13:37,260 --> 00:13:43,320 game, or football I guess they call it, wins. Now, he has two coins. Transfers those two 172 00:13:43,320 --> 00:13:46,389 coins back to the US exchange. 173 00:13:46,389 --> 00:13:51,130 And he wants to sell them. Before he can click sell, his account is closed. And the exchange 174 00:13:51,130 --> 00:13:55,019 website says, no, you violated our terms of service. We're closing your account. 175 00:13:55,019 --> 00:14:02,320 Take your bitcoins. We've sent all your dollars back to your bank. Withdraw your bitcoins. 176 00:14:02,320 --> 00:14:04,720 We're out. 177 00:14:04,720 --> 00:14:14,660 This happens. 
So whether you agree with gambling being legal-- so in the UK, however, there's 178 00:14:14,660 --> 00:14:20,050 no law violated. And as a United States-- so this is sort of the gray area. I'm pretty 179 00:14:20,050 --> 00:14:26,829 sure that if I'm a US citizen, I take a plane over to London, I go to a-- I don't know the 180 00:14:26,829 --> 00:14:33,260 lingo, punters shop, betting shop? And then I bet on something, and I win. 181 00:14:33,260 --> 00:14:39,070 Well, I'm pretty sure I have to pay capital gains on those winnings to the US, IRS. But 182 00:14:39,070 --> 00:14:45,019 I haven't broken a law, because I'm not in the US when I'm doing this thing. So, similarly, 183 00:14:45,019 --> 00:14:49,730 if casino gambling is illegal in Massachusetts, sort of, I can go to Las Vegas. I can gamble. 184 00:14:49,730 --> 00:14:54,200 And I can come back to Massachusetts, and it's OK. 185 00:14:54,200 --> 00:14:58,610 So this is probably a gray area. I don't know if there's actual court cases about this, 186 00:14:58,610 --> 00:15:02,710 where you're still sitting in Massachusetts at your computer. But you've sent your money 187 00:15:02,710 --> 00:15:10,339 to the United Kingdom. I don't know. I don't know of anyone who's gotten in trouble for 188 00:15:10,339 --> 00:15:11,339 this. 189 00:15:11,339 --> 00:15:15,720 I don't know if there's been prosecution against UK betting shops for this kind of thing. But 190 00:15:15,720 --> 00:15:21,180 this happens. A lot of companies run betting shops like this. 191 00:15:21,180 --> 00:15:24,800 But I have seen that US exchanges will shut down accounts because of this. And they say, 192 00:15:24,800 --> 00:15:28,950 look, you can buy bitcoins. But if you're using these to gamble, we don't want to get 193 00:15:28,950 --> 00:15:32,300 involved. It's just more risk for us. 194 00:15:32,300 --> 00:15:39,389 And so they close the accounts. So the problem here is-- what did I say next? 
From the perspective 195 00:15:39,389 --> 00:15:43,540 of the legal system, for the user it's obviously a problem, because it's annoying. 196 00:15:43,540 --> 00:15:49,080 What's the problem for bitcoin here? Why should bitcoin care about this, like as people working 197 00:15:49,080 --> 00:15:53,180 on bitcoin? Why do I care? 198 00:15:53,180 --> 00:16:00,470 AUDIENCE: Two reasons. It's the usability of bitcoin, that bitcoin is not usable that's 199 00:16:00,470 --> 00:16:09,040 public and private. So that's the different between fiat currencies, that can be like 200 00:16:09,040 --> 00:16:15,240 government, state. But, two, is then what happens to these two bitcoins? Do they just 201 00:16:15,240 --> 00:16:16,240 get burned? 202 00:16:16,240 --> 00:16:20,029 They're suspended until there's some decision as to what to happen. They're not owned by 203 00:16:20,029 --> 00:16:21,029 the exchange. 204 00:16:21,029 --> 00:16:25,040 TADGE DRYJA: They gave them back. But, yeah, the real thing is these two bitcoins are sort 205 00:16:25,040 --> 00:16:30,140 of worth less than two other bitcoins. Now, you've got a sort of different value for these, 206 00:16:30,140 --> 00:16:35,330 where these two bitcoins are hot. And I need to get rid of them, and launder them, or something. 207 00:16:35,330 --> 00:16:36,330 AUDIENCE: That happens with $100 bills too. 208 00:16:36,330 --> 00:16:37,330 TADGE DRYJA: Really? OK, well. 209 00:16:37,330 --> 00:16:41,430 AUDIENCE: I had a suitcase full of them in Bogota. 210 00:16:41,430 --> 00:16:52,740 TADGE DRYJA: You had a suitcase full of hundreds, and you just walk to the Bank of America, 211 00:16:52,740 --> 00:16:58,699 and say, here, I want to deposit this, probably you've got some problems. 212 00:16:58,699 --> 00:17:05,849 AUDIENCE: [INAUDIBLE] 1% into the banking system, to move it from fiat to digital, digital 213 00:17:05,849 --> 00:17:06,849 fiat. 
214 00:17:06,849 --> 00:17:07,970 TADGE DRYJA: Yeah, like actual physical currency notes. 215 00:17:07,970 --> 00:17:10,770 AUDIENCE: That's not money laundering. 216 00:17:10,770 --> 00:17:16,740 TADGE DRYJA: So, yeah, so now you've got the sort of money laundering that now people are 217 00:17:16,740 --> 00:17:23,608 going to try to do using bitcoin. It would be preferable if-- so the way that they can 218 00:17:23,608 --> 00:17:29,950 tell about this, is that people reuse addresses. Bitcoin is not inherently as fungible as, 219 00:17:29,950 --> 00:17:31,909 say, gold. 220 00:17:31,909 --> 00:17:36,149 Dollar bills have these serial numbers. And I don't know to what extent people track these 221 00:17:36,149 --> 00:17:40,820 things. But bitcoins are much more like the dollar bills with serial numbers. And it's 222 00:17:40,820 --> 00:17:44,830 all on the internet. It's all on the computer. So it's really easy to track things this way. 223 00:17:44,830 --> 00:17:51,193 AUDIENCE: [INAUDIBLE] could just simply transfer them to a new address, and no one would know 224 00:17:51,193 --> 00:17:54,549 it was still him until it was spent. 225 00:17:54,549 --> 00:18:01,909 TADGE DRYJA: Well, yes and no. So you say, OK, I'm at the exchange. They've got a giant 226 00:18:01,909 --> 00:18:07,529 pool of addresses. I transfer to the casino. 227 00:18:07,529 --> 00:18:12,669 And then the casino, in many of these cases, reuses addresses extensively. So it's really 228 00:18:12,669 --> 00:18:18,539 easy to see that this is the casino. And then from the casino, goes to the exchange again. 229 00:18:18,539 --> 00:18:21,020 Exchange flags that and says, hey, we know where this came from. 230 00:18:21,020 --> 00:18:27,489 Instead, you say, OK, I go to User A. And then I go to the exchange. In some cases, 231 00:18:27,489 --> 00:18:31,799 that might-- if their algorithm is, hey, just look at where it came from, sure. 
But if you 232 00:18:31,799 --> 00:18:36,950 say, wait, there was one output. This transaction is one input, one output. 233 00:18:36,950 --> 00:18:41,789 So we know exactly where it came from. You can trace it back that way. So it depends. 234 00:18:41,789 --> 00:18:45,850 Would the coins be worth less? This is un-money-like. 235 00:18:45,850 --> 00:18:50,259 And legal tender is a whole other thing. And I'm pretty sure bitcoin will never be 236 00:18:50,259 --> 00:18:56,289 legal tender anywhere, in terms of how debts can be settled. But at least fungibility, we want. 237 00:18:56,289 --> 00:19:01,330 So we want to make bitcoin more money-like. How do we fix this? The first, and simplest, 238 00:19:01,330 --> 00:19:07,309 and possibly the biggest issue is address reuse. So if the casino keeps using the same address 239 00:19:07,309 --> 00:19:10,580 over and over, it's really obvious. 240 00:19:10,580 --> 00:19:15,119 And so things like vanity addresses, so a lot of them will use vanity addresses. A vanity 241 00:19:15,119 --> 00:19:21,989 address is when you continually perform computations to try to get a human-readable address. The 242 00:19:21,989 --> 00:19:23,169 addresses are random numbers. 243 00:19:23,169 --> 00:19:32,710 So for example, I ain't rich. So Greg Maxwell has this address, which is 1gMaxwellbo8. So 244 00:19:32,710 --> 00:19:36,899 you can see that the first part of that address is gMaxwell. 245 00:19:36,899 --> 00:19:41,399 And you get that just by continually attempting millions and millions, or billions, of different 246 00:19:41,399 --> 00:19:48,159 keys, and seeing which turn into the address you want. So people do that. People spend 247 00:19:48,159 --> 00:19:50,090 a lot of resources on making cool addresses. 248 00:19:50,090 --> 00:19:54,889 And then the casino can say, OK, this is my address. 
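[Editor's sketch] The brute-force search behind a vanity address can be illustrated with a toy model. This is only a sketch under loud assumptions: `toy_address`, `find_vanity`, and `base58` are hypothetical helper names, and the "address" here is just a Base58 rendering of a SHA-256 hash of a counter, not real Bitcoin key derivation (which uses secp256k1 public keys, RIPEMD-160, and a checksum). It only shows the try-keys-until-the-prefix-matches loop.

```python
import hashlib

# Base58 alphabet as used by Bitcoin addresses (no 0, O, I, l).
BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58(data: bytes) -> str:
    """Encode bytes as a Base58 string (toy version, no checksum)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, r = divmod(n, 58)
        out = BASE58[r] + out
    return out or "1"


def toy_address(secret: int) -> str:
    """ASSUMPTION: stand-in for real address derivation, for illustration only."""
    digest = hashlib.sha256(secret.to_bytes(32, "big")).digest()
    return "1" + base58(digest[:20])


def find_vanity(prefix: str, max_tries: int = 1_000_000):
    """Try candidate secrets one by one until the address starts with prefix."""
    for secret in range(1, max_tries + 1):
        addr = toy_address(secret)
        if addr.startswith(prefix):
            return secret, addr
    return None  # gave up; longer prefixes need far more tries


if __name__ == "__main__":
    print(find_vanity("1d"))
```

Each extra character in the prefix multiplies the expected work by roughly 58, which is why long vanity prefixes take billions of attempts.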
But that really hurts the fungibility 249 00:19:54,889 --> 00:20:01,450 and privacy, because now it's clear when a customer sends to that address, everyone in 250 00:20:01,450 --> 00:20:03,230 the world can see. Yeah? 251 00:20:03,230 --> 00:20:11,419 AUDIENCE: Does the casino just do this because it's cool? Because it's not like it's easier 252 00:20:11,419 --> 00:20:13,169 to type in, because-- 253 00:20:13,169 --> 00:20:18,489 TADGE DRYJA: You still have the stuff. It's branding, vanity address, it's cool, I guess. 254 00:20:18,489 --> 00:20:25,739 But, yeah, Satoshi Dice does this too, right? Or did. 255 00:20:25,739 --> 00:20:39,059 So the Satoshi Dice address is-- yeah, it always starts with 1dice. So branding, 256 00:20:39,059 --> 00:20:43,169 I guess, makes it easier for-- it does make it easier for people to recognize. If they 257 00:20:43,169 --> 00:20:46,320 have a list of the addresses, they're like, oh, that's the casino address. I want to deposit 258 00:20:46,320 --> 00:20:50,049 to the casino, and just copy and paste that in there. 259 00:20:50,049 --> 00:20:55,669 So it does help usability. And that's one of the issues in bitcoin. It's like, having 260 00:20:55,669 --> 00:21:01,659 these giant, ugly addresses that you have to get right, and not send to the wrong one, it's 261 00:21:01,659 --> 00:21:06,999 not a great user experience kind of thing. So I can see that vanity addresses do improve 262 00:21:06,999 --> 00:21:08,720 the user experience a little bit. 263 00:21:08,720 --> 00:21:15,340 But it really hurts the privacy of the system. And address reuse is a problem, because people 264 00:21:15,340 --> 00:21:27,429 keep using it. It looks cool. Also web explorers, so if you look 265 00:21:27,429 --> 00:21:35,599 at blockchain.info. This is advanced mode, enable, disable. 266 00:21:35,599 --> 00:21:39,460 Even with advanced mode, it'll give you a link to the output. So no one is going to 267 00:21:39,460 --> 00:21:44,940 click that. 
Anyway, so the idea is this shows, oh, here's a transaction. It's coming from 268 00:21:44,940 --> 00:21:49,029 two addresses, sending to two addresses. 269 00:21:49,029 --> 00:21:55,869 And it even gives you a helpful, hey, this is Greg Maxwell. And I'll give you a link 270 00:21:55,869 --> 00:22:07,059 to-- where does this link go? You're being redirected. OK, well, go for it. Gmax-- OK, 271 00:22:07,059 --> 00:22:16,389 so it's Greg's account on bitcoin talk. Interesting. 272 00:22:16,389 --> 00:22:21,580 So you guys know enough about the system now to know that this is not true. If you click on an 273 00:22:21,580 --> 00:22:26,249 address, it gives you all the transactions, way too many, that were involved in this. 274 00:22:26,249 --> 00:22:33,879 But if you click on a transaction, it's not spending from addresses and to addresses. 275 00:22:33,879 --> 00:22:39,480 It's spending from transaction outputs, and spending to addresses that can later be consumed. 276 00:22:39,480 --> 00:22:45,039 So the way that web explorers-- and it'll show a balance. If I click on an address, 277 00:22:45,039 --> 00:22:49,369 it says, hey, here's the number of transactions involved. Here's the final balance. And you 278 00:22:49,369 --> 00:22:53,350 can look at the current balance, and things like that. 279 00:22:53,350 --> 00:22:57,519 But that's not actually how the system works. And it hurts privacy and fungibility to 280 00:22:57,519 --> 00:23:01,520 have this idea of, oh, this is sort of my account. And it's got a balance that can increase 281 00:23:01,520 --> 00:23:07,269 and decrease, because then that implies that I'm going to keep using it. Whereas, really, 282 00:23:07,269 --> 00:23:11,990 if they're one-time use, that really makes it harder to trace things. 283 00:23:11,990 --> 00:23:17,009 So another aspect that hurts this, if you want to trace things-- so if it's one input, 284 00:23:17,009 --> 00:23:23,549 one output, right here, it's real easy. 
But even if it's not one input, one output-- so, 285 00:23:23,549 --> 00:23:27,700 for example, imagine a transaction where the input has 10 coins in it. There's output a, 286 00:23:27,700 --> 00:23:32,549 which is one coin, output b, which is 8.9997 coins. 287 00:23:32,549 --> 00:23:44,159 Which do you think is the change address? They're all different addresses. But OK, whoever 288 00:23:44,159 --> 00:23:50,129 is doing this, they're sending one coin. And the remainder, minus the fee, is this. 289 00:23:50,129 --> 00:23:56,379 So it's often pretty clear, even though looking at it, all the addresses are different. OK, 290 00:23:56,379 --> 00:24:00,960 are any of these change addresses? Is it just a sending to b and c? Is it a sending to b, 291 00:24:00,960 --> 00:24:04,409 and back to a? Often, it's pretty easy to figure out. 292 00:24:04,409 --> 00:24:09,809 And you're guessing. But you can get the guessing pretty good. OK, so what we're going to talk 293 00:24:09,809 --> 00:24:12,640 about now-- anonymity sets. 294 00:24:12,640 --> 00:24:17,519 So bitcoin is not actually anonymous. There are sort of identities attached to these things, 295 00:24:17,519 --> 00:24:20,629 in that you have addresses. The addresses are not your name. 296 00:24:20,629 --> 00:24:24,399 But you can think of them as a pseudonym. And you can create a bunch of pseudonyms. 297 00:24:24,399 --> 00:24:29,889 But there are these keys. There are these publicly known addresses. 298 00:24:29,889 --> 00:24:36,690 And so we want to expand an anonymity set. And so the idea of an anonymity set is, how 299 00:24:36,690 --> 00:24:46,109 many possible different identities could be the owner of these coins? And the idea of 300 00:24:46,109 --> 00:24:49,639 expanding your anonymity set-- so even if bitcoin were perfectly anonymous, in terms 301 00:24:49,639 --> 00:24:53,950 of the anonymity set was everyone who had Bitcoin, that's still actually not that many 302 00:24:53,950 --> 00:24:54,950 people. 
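[Editor's sketch] The change-output guess described above (a 10-coin input with output a of 1 coin and output b of 8.9997 coins) can be written as a simple heuristic: payments tend to be round amounts, so the output with the fewest trailing zeros is probably the change. This is a minimal sketch, assuming amounts are given in satoshis; `guess_change` and `trailing_zeros` are hypothetical names, and real chain-analysis tools combine several heuristics rather than relying on this one alone.

```python
from typing import Dict, Optional

COIN = 100_000_000  # satoshis per bitcoin


def trailing_zeros(amount: int) -> int:
    """Count trailing decimal zeros; round payment amounts have many."""
    count = 0
    while amount > 0 and amount % 10 == 0:
        amount //= 10
        count += 1
    return count


def guess_change(outputs: Dict[str, int]) -> Optional[str]:
    """Return the label of the least-round output, or None if it is a tie."""
    if len(outputs) < 2:
        return None
    ranked = sorted(outputs, key=lambda label: trailing_zeros(outputs[label]))
    first, second = ranked[0], ranked[1]
    if trailing_zeros(outputs[first]) == trailing_zeros(outputs[second]):
        return None  # ambiguous: no output stands out as less round
    return first


# The example from the lecture: 10 coins in, 1 coin and 8.9997 coins out.
input_value = 10 * COIN
outputs = {"a": 1 * COIN, "b": 899_970_000}
fee = input_value - sum(outputs.values())

if __name__ == "__main__":
    print(guess_change(outputs), fee)
```

Here the heuristic picks output b as the change, and the remaining 30,000 satoshis (0.0003 coins) is the fee. It is a guess, as the lecture says, but often a good one.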
303 00:24:54,950 --> 00:24:58,789 So if you see a bitcoin, you're like, well, I don't know who owns it. But I know that 304 00:24:58,789 --> 00:25:02,249 the person who owns it, owns bitcoin. And, actually, there's not that many people who 305 00:25:02,249 --> 00:25:08,340 own bitcoin. So I've just eliminated 99% of the people, my suspects, just by doing that. 306 00:25:08,340 --> 00:25:14,649 So just having more people using bitcoin makes it more anonymous, in that sense. If it's 307 00:25:14,649 --> 00:25:19,969 very niche kind of thing, and then the police say, OK, well whoever did this crime, they're 308 00:25:19,969 --> 00:25:24,599 a bitcoin user. Well, now you can find-- there's not that many bitcoin users. So you want to 309 00:25:24,599 --> 00:25:28,649 try to increase your anonymity set for a specific transaction. 310 00:25:28,649 --> 00:25:33,879 So the traditional-- I don't want to say traditional-- the way you do this is basically money laundering. 311 00:25:33,879 --> 00:25:40,580 It's a bitcoin mixer. And these mixers still exist. I'm pretty sure, right? They're still 312 00:25:40,580 --> 00:25:41,580 around? 313 00:25:41,580 --> 00:25:46,899 They're still around. I don't know why. But a lot of times they use Tor. I mean, I know 314 00:25:46,899 --> 00:25:50,149 why. But it's like, there's so many better ways to do this. 315 00:25:50,149 --> 00:25:53,820 So you've got coins at address A. And you say, OK, I'm going to send 10 coins to the 316 00:25:53,820 --> 00:25:58,799 mixer, which has address me. And then later, some different, not from that output, but 317 00:25:58,799 --> 00:26:03,840 somewhere else in this sort of giant mixer account, four coins get sent to address C, 318 00:26:03,840 --> 00:26:09,859 and six coins to address D. So you basically pool all the coins into this mixer, which 319 00:26:09,859 --> 00:26:15,899 uses lots of different addresses, and then split it up over time, different amounts. 
320 00:26:15,899 --> 00:26:19,789 And it gets really hard to figure out where the coins came from. 321 00:26:19,789 --> 00:26:26,000 So your anonymity set is bigger. Mixers can work well. The potential anonymity set 322 00:26:26,000 --> 00:26:30,340 is all the other users of the mixer, if it's well designed. The problem is, the mixers disappear 323 00:26:30,340 --> 00:26:33,059 with everyone's money, very consistently. 324 00:26:33,059 --> 00:26:40,529 The mixers are certainly not publicly regulated companies. I'm pretty sure you couldn't do 325 00:26:40,529 --> 00:26:45,299 that. So they're just sort of these anonymous, like, hey, I'm a mixer, bitcoin cloud, or 326 00:26:45,299 --> 00:26:49,759 bitcoin fog, or whatever. And a lot of times they're on Tor. So you don't even know where 327 00:26:49,759 --> 00:26:54,369 the mixer exists. And you sort of hope for the best and send your money. Yeah? 328 00:26:54,369 --> 00:26:56,640 AUDIENCE: Do they take a transaction fee? 329 00:26:56,640 --> 00:26:59,299 TADGE DRYJA: Yes, some of them take fees. The big fee is when they keep all your money and don't 330 00:26:59,299 --> 00:27:07,419 give it back. But a lot of times, they will take a small cut. But it's not actually hard. 331 00:27:07,419 --> 00:27:12,190 The cost is pretty minimal. You need some kind of Tor service. And then you just have some 332 00:27:12,190 --> 00:27:16,590 software that just runs this, and allows deposits and withdrawals. 333 00:27:16,590 --> 00:27:22,119 At the conference in Puerto Rico, Financial Crypto, there was a talk about a-- what was the word 334 00:27:22,119 --> 00:27:35,460 they used? Not traceable, a mixer that you could prove defrauded you. So you could have 335 00:27:35,460 --> 00:27:40,580 these proofs that, oh, I can prove that they stole my coins, which I thought was kind 336 00:27:40,580 --> 00:27:42,979 of useless, because the whole idea is they're anonymous.
337 00:27:42,979 --> 00:27:47,909 And maybe you can prove that they ripped you off. But they still ripped you off. OK, so 338 00:27:47,909 --> 00:27:56,229 then the better idea than a mixer is I taint rich was a blog post from Greg. That's why 339 00:27:56,229 --> 00:27:57,610 I've got this up. 340 00:27:57,610 --> 00:28:03,080 In 2013, and it was kind of fun. Ever since I was a wee lad, I had a dream, a dream of 341 00:28:03,080 --> 00:28:07,010 being incorrectly assessed as impossibly rich by a braindead automated analysis. Now, with 342 00:28:07,010 --> 00:28:08,580 your help, I can be. 343 00:28:08,580 --> 00:28:16,609 So he wanted to mix inputs from different people within the same transaction. So you 344 00:28:16,609 --> 00:28:20,289 could have two different people in the same transaction. And in bitcoin, this is secure, 345 00:28:20,289 --> 00:28:23,929 because the signature signs the whole transaction. 346 00:28:23,929 --> 00:28:31,250 So you say, OK, I'm user A. I have my 10 coin input. I'll match up with user B, who's got 347 00:28:31,250 --> 00:28:36,479 his two coin input. And we both sign this whole transaction, which sends two coins to 348 00:28:36,479 --> 00:28:45,019 C, and 10 coins to D. So now, you can't trace where these coins went right. Right? Well, 349 00:28:45,019 --> 00:28:47,110 you sort of can. 350 00:28:47,110 --> 00:28:50,769 But if you just look at the graph of transactions, it's not as obvious. So the first transaction 351 00:28:50,769 --> 00:28:55,429 is really fun. I remember seeing this. 352 00:28:55,429 --> 00:29:03,399 So there's three inputs, 40,000 bitcoins, 0.1337 bitcoins, and 1 bitcoin from Greg, 353 00:29:03,399 --> 00:29:12,830 and then 40,000, and 31337, and then 0.82. For some reason, it was even more-- yeah, 354 00:29:12,830 --> 00:29:18,981 40,000 bitcoins is pretty good today. 
It was no less impressive at the time in 2013, even 355 00:29:18,981 --> 00:29:23,710 though I guess it was only worth about half a million dollars at that time. 356 00:29:23,710 --> 00:29:27,549 But it was like-- it was pretty cool. And then Greg was like, wow, I've never seen that 357 00:29:27,549 --> 00:29:37,639 much money. Yeah, so the guy who posted the 40,000 coins, he's called loaded. And nobody 358 00:29:37,639 --> 00:29:44,139 knows who he is. But he shows up from time to time, and has a lot of money. 359 00:29:44,139 --> 00:29:49,321 So, yeah, then Greg made a transaction with Loaded. I've handled pricey assets before, 360 00:29:49,321 --> 00:29:57,700 the most I've ever moved on a single key press. So it's pretty cool. This was very manual. 361 00:29:57,700 --> 00:30:02,549 That was on a message board, sort of goofing around. But there's no risk. You're not sending 362 00:30:02,549 --> 00:30:07,489 money to someone, and then getting it back. You're just saying, I'll sign off on this 363 00:30:07,489 --> 00:30:09,809 transaction. And then transactions are atomic. 364 00:30:09,809 --> 00:30:13,599 You can't say, oh, I'm going to cut off this bottom part, and only have 10 coins going 365 00:30:13,599 --> 00:30:18,529 in, and two coins come up. The whole transaction is the thing that gets signed. OK, so what's 366 00:30:18,529 --> 00:30:20,229 the problem with this model? 367 00:30:20,229 --> 00:30:28,029 Any way to get a mapping from C and D, to A and B? There's one really obvious one on 368 00:30:28,029 --> 00:30:35,950 this screen. It's an x, right? Yeah, well, gee, I think it's A goes to D, and B goes 369 00:30:35,950 --> 00:30:41,009 to C, because the amounts are completely different. And, actually, we'll talk about amounts next 370 00:30:41,009 --> 00:30:42,009 week. 371 00:30:42,009 --> 00:30:47,139 But how about this? Well, you've got 10 coins coming in, two coins coming in here. 
And I've 372 00:30:47,139 --> 00:30:55,370 got address C, D, E, and F, 1, 7, 1, and 3. Better, maybe? No, nice try. You can still 373 00:30:55,370 --> 00:30:56,370 map them easily. 374 00:30:56,370 --> 00:31:04,150 The 7 and the 3 go to the 10, and the 1 and 1 go to the 2. I don't think there's any way it could 375 00:31:04,150 --> 00:31:08,119 be anything else. How about this? 376 00:31:08,119 --> 00:31:11,669 Address C has two coins. Address D has two coins. Address E has eight coins. Well, now, 377 00:31:11,669 --> 00:31:19,399 that actually works, right? It's not clear if C is from A or B, same with D. 378 00:31:19,399 --> 00:31:25,799 These two have some anonymity, in that you're not sure which user it's from. E, on the other 379 00:31:25,799 --> 00:31:34,600 hand, is obviously from A. But B's coins are now sort of-- the anonymity set has doubled. 380 00:31:34,600 --> 00:31:38,809 You don't know whether it's C or D. B's address is now unclear. 381 00:31:38,809 --> 00:31:45,669 So that's kind of cool. How do we scale this? Well, have more users, and a bigger anonymity 382 00:31:45,669 --> 00:31:51,989 set. So one issue with this is, as you scale to more users-- let's say you do 10 different 383 00:31:51,989 --> 00:31:54,659 users, where they all put in their inputs. 384 00:31:54,659 --> 00:32:00,099 They all put in their set of outputs. And now, you've got this big transaction, where 385 00:32:00,099 --> 00:32:04,789 it's hard to tell what the mapping is. The problem is, as you gain numbers of inputs, numbers 386 00:32:04,789 --> 00:32:10,009 of users, the users themselves know the mapping, because someone's actually doing the, I put 387 00:32:10,009 --> 00:32:12,440 in my thing, you put in yours. 388 00:32:12,440 --> 00:32:16,239 So there's a user that knows the mapping. And they can leak that info. And that hurts 389 00:32:16,239 --> 00:32:26,470 the anonymity. So maybe just the transaction graph itself won't tell you, but somebody 390 00:32:26,470 --> 00:32:27,470 knows.
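The amount-matching argument above can be checked mechanically. What follows is a minimal sketch, not a real chain-analysis tool: it ignores fees, assumes each output belongs to exactly one input owner, and the function name `consistent_assignments` is made up for illustration. It simply tries every way of assigning outputs to owners and keeps the ones where the sums line up.

```python
from itertools import product


def consistent_assignments(inputs, outputs):
    """Return every output->owner mapping whose sums match the input amounts.

    inputs:  dict of owner name -> coins put in (fees ignored in this sketch)
    outputs: dict of output name -> coins received
    """
    owners = list(inputs)
    names = list(outputs)
    found = []
    # Brute force: try each way of handing every output to one owner.
    for choice in product(owners, repeat=len(names)):
        totals = {owner: 0 for owner in owners}
        for name, owner in zip(names, choice):
            totals[owner] += outputs[name]
        if totals == inputs:
            found.append(dict(zip(names, choice)))
    return found


ins = {"A": 10, "B": 2}
# Outputs of 1, 7, 1, 3: only one mapping is possible, so nothing is hidden.
print(consistent_assignments(ins, {"C": 1, "D": 7, "E": 1, "F": 3}))
# Outputs of 2, 2, 8: the 8 is clearly A's, but B's output could be C or D.
print(consistent_assignments(ins, {"C": 2, "D": 2, "E": 8}))
```

The number of mappings returned is a rough stand-in for how much the join actually hides: one mapping means an observer learns everything, two means B's output is hidden among two candidates.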
391 00:32:27,470 --> 00:32:33,769 And they can reveal that at a later date. So that's not good. So there's a really cool 392 00:32:33,769 --> 00:32:40,029 protocol called CoinShuffle, so pre-CoinJoin messaging to shuffle the inputs and outputs. 393 00:32:40,029 --> 00:32:46,129 And this allows you to have 10, 20, 30 different people doing it. And if at least two participants 394 00:32:46,129 --> 00:32:53,159 are honest, then the mapping cannot be determined. The way it works-- this is super quick, because 395 00:32:53,159 --> 00:32:56,059 I don't have time. 396 00:32:56,059 --> 00:33:00,289 Everyone has their inputs that they want to put into the transaction. And they also have 397 00:33:00,289 --> 00:33:05,120 the output addresses that they want as their outputs. So they make a new set of public 398 00:33:05,120 --> 00:33:07,859 keys that they're not going to use on bitcoin at all. 399 00:33:07,859 --> 00:33:14,639 They're just making public keys for encryption purposes for this game. And they also tell 400 00:33:14,639 --> 00:33:20,389 everyone, here's my inputs. My input is A. My input is B. So the inputs are known to 401 00:33:20,389 --> 00:33:21,389 the people participating. 402 00:33:21,389 --> 00:33:27,320 And then the idea is, OK, I know everyone's publicly that they've given. So I encrypt 403 00:33:27,320 --> 00:33:31,869 my output. So I've got an input I've told everyone. I've got my output. 404 00:33:31,869 --> 00:33:37,289 And I encrypt that, the address itself and the amount, with everyone's public keys sequentially. 405 00:33:37,289 --> 00:33:42,519 So, for example, I use encryption on key C to encrypt the thing on encryption on key 406 00:33:42,519 --> 00:33:49,349 B, encrypt the encryption on key A of my output. And then I hand this to user A. 407 00:33:49,349 --> 00:33:53,750 So this is like onion routing, sort of onion encryption, where you take a plaintext, encrypt 408 00:33:53,750 --> 00:34:03,170 it, encrypt it again, encrypt it again. 
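The layered encryption described above can be sketched in a few lines. This is a toy under loud assumptions: it uses a shared-secret stream cipher built from SHA-256 plus an HMAC tag as a stand-in for real public-key encryption, the key values and the helper names `add_layer`, `peel_layer`, and `onion_wrap` are invented for illustration, and it is not the actual CoinShuffle wire format. The only point is the ordering: the layer for the last person to decrypt goes on first, so the first person to receive the blob can remove the outermost layer.

```python
import hashlib
import hmac


def _keystream(key, length):
    """Derive `length` pseudo-random bytes from `key` (toy stream cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def add_layer(key, data):
    """Encrypt `data` under `key` and prepend a 32-byte tag."""
    body = bytes(x ^ y for x, y in zip(data, _keystream(key, len(data))))
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def peel_layer(key, blob):
    """Remove one layer; raises ValueError if `key` is not the outermost key."""
    tag, body = blob[:32], blob[32:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key for this layer")
    return bytes(x ^ y for x, y in zip(body, _keystream(key, len(body))))


def onion_wrap(keys_in_peel_order, plaintext):
    """Wrap so that keys_in_peel_order[0] removes the outermost layer."""
    blob = plaintext
    for key in reversed(keys_in_peel_order):
        blob = add_layer(key, blob)
    return blob


# Hypothetical per-participant keys for users A, B, C (peeled in that order).
keys = [b"key-A", b"key-B", b"key-C"]
blob = onion_wrap(keys, b"my output address, 2 coins")
for key in keys:
    blob = peel_layer(key, blob)  # each user strips exactly one layer
print(blob)
```

In the real protocol each participant would also shuffle the whole batch of blobs before passing it on, which is what breaks the link between inputs and outputs.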
So I receive these encrypted outputs. I shuffle 409 00:34:03,170 --> 00:34:06,669 them. I receive one from A, one from B, one from C. 410 00:34:06,669 --> 00:34:13,010 Actually, I receive the whole set. I shuffle the order. And then I use my key to decrypt 411 00:34:13,010 --> 00:34:18,369 one layer. It's still going to be encrypted, because I just see, here's a bunch of encrypted 412 00:34:18,369 --> 00:34:19,369 data. 413 00:34:19,369 --> 00:34:23,359 I decrypt it. It's still encrypted with the next person's key. And then I hand it to them, 414 00:34:23,359 --> 00:34:28,239 and they shuffle, and decrypt. So the final user gets the outputs. 415 00:34:28,239 --> 00:34:34,339 And the final user-- so in this case, user C, can decrypt. And now, he's got my output. 416 00:34:34,339 --> 00:34:40,250 But it's been shuffled around with everyone else's output, at every step of the way. 417 00:34:40,250 --> 00:34:45,090 So they can't tell whose went to whom. And then they have this final transaction, which 418 00:34:45,090 --> 00:34:49,510 has all the outputs in some random order, because everyone contributed to randomizing 419 00:34:49,510 --> 00:34:53,050 the order of the outputs. So this is really cool. 420 00:34:53,050 --> 00:34:57,940 As long as two parties are honest-- if there's only one honest party, it doesn't work, because 421 00:34:57,940 --> 00:35:04,829 then every dishonest party colludes, and says, well, we know everything but the honest party's 422 00:35:04,829 --> 00:35:12,769 output. So we can figure out which one it is. But if there's two users actually doing 423 00:35:12,769 --> 00:35:14,700 this, then you can't determine the order. 424 00:35:14,700 --> 00:35:19,289 So that's pretty cool. I think it's being used for JoinMarket. Do they use something 425 00:35:19,289 --> 00:35:22,040 like this? I don't know. 426 00:35:22,040 --> 00:35:27,790 So there's cool techniques for this kind of thing.
OK, real world, though, some people 427 00:35:27,790 --> 00:35:32,279 use this. JoinMarket exists. I don't know how popular it is. 428 00:35:32,279 --> 00:35:38,160 Problem: which people use this? Who uses this? It's got a limited anonymity set of the people 429 00:35:38,160 --> 00:35:43,800 who really want anonymity, which is not the anonymity set the people who want anonymity 430 00:35:43,800 --> 00:35:50,099 want, which is kind of confusing. But the idea is, I don't want to just be in this group 431 00:35:50,099 --> 00:35:54,670 of people who want anonymity, because those are the people I don't want to be associated 432 00:35:54,670 --> 00:35:55,670 with. 433 00:35:55,670 --> 00:36:02,010 I would rather associate myself with the people who don't particularly want anonymity. This 434 00:36:02,010 --> 00:36:09,670 is a big problem. People don't care about privacy. People don't want to do these kinds 435 00:36:09,670 --> 00:36:10,670 of things. 436 00:36:10,670 --> 00:36:15,369 And this costs money too, in this case. If you're doing these transactions, this isn't 437 00:36:15,369 --> 00:36:20,410 really a payment. This is just a superfluous transaction to turn your money around to get 438 00:36:20,410 --> 00:36:25,960 some more privacy. This is one of the big issues with, OK, we want anonymity. 439 00:36:25,960 --> 00:36:32,890 But if anonymity is opt in, it's not very useful, because then it's sort of like Tor, 440 00:36:32,890 --> 00:36:44,339 where if I'm using Tor, everyone's like, why are you using Tor? And so at this point, encryption 441 00:36:44,339 --> 00:36:49,490 is now sort of ubiquitous. And it's like the standard. But yeah? 442 00:36:49,490 --> 00:36:56,549 AUDIENCE: Given that most transactions happen on exchanges, why not just lobby a few exchanges 443 00:36:56,549 --> 00:36:59,059 to try and implement this? 444 00:36:59,059 --> 00:37:03,410 TADGE DRYJA: Well, wait, most actual bitcoin transactions are in and out from exchanges?
445 00:37:03,410 --> 00:37:04,410 I don't know-- 446 00:37:04,410 --> 00:37:05,410 AUDIENCE: If that's true. 447 00:37:05,410 --> 00:37:09,160 TADGE DRYJA: There's a lot. But I don't know if it's a majority. I think it's a decent 448 00:37:09,160 --> 00:37:12,010 percentage. But I think it's less than half. But, yeah, there's a lot. 449 00:37:12,010 --> 00:37:17,940 It'd be awesome. So exchanges are actually uniquely positioned, where something like 450 00:37:17,940 --> 00:37:23,211 CoinShuffle, you could really easily do an oblivious withdrawal. Where you say, I want 451 00:37:23,211 --> 00:37:27,609 to withdraw my coins. And I'm not going to tell you where to. But I'll give you this 452 00:37:27,609 --> 00:37:30,290 encrypted thing that gets shuffled around with different users. 453 00:37:30,290 --> 00:37:34,210 And then at the end, I sign off on it. And the exchange can say, well, we're fulfilling 454 00:37:34,210 --> 00:37:38,279 all our customers' withdrawals. But we don't know where the coins are going. You could 455 00:37:38,279 --> 00:37:44,019 totally do that. That's not a conversation that happens at any exchange. Yeah? 456 00:37:44,019 --> 00:37:50,780 AUDIENCE: I also think that the bulk of exchanges are now getting caught up in some regulatory, 457 00:37:50,780 --> 00:37:59,839 in Japan, soon to be in the US, for know your customer, anti-money laundering. So for the 458 00:37:59,839 --> 00:38:05,940 exchange business model, they're going to have to give this Coinbase to the FBI some 459 00:38:05,940 --> 00:38:06,940 data. 460 00:38:06,940 --> 00:38:10,890 TADGE DRYJA: Yeah, and whether that data is, here is this user's name, and address, his 461 00:38:10,890 --> 00:38:14,730 house home address, and his social security number, and here's what he bought and sold. 462 00:38:14,730 --> 00:38:19,270 Versus, does that also have, and here's where the addresses where he withdrew his coins 463 00:38:19,270 --> 00:38:27,130 to? 
I don't think the IRS-- the IRS thing, I think it was mostly we want to get people 464 00:38:27,130 --> 00:38:28,630 for not reporting their gains. 465 00:38:28,630 --> 00:38:30,529 AUDIENCE: That's the IRS point, but the other part-- 466 00:38:30,529 --> 00:38:31,529 TADGE DRYJA: Right, FinCEN. 467 00:38:31,529 --> 00:38:35,920 AUDIENCE: Financial Crimes Unit, FinCEN would want the addresses. 468 00:38:35,920 --> 00:38:42,400 TADGE DRYJA: FinCEN would want to map bitcoin addresses to human names, so they could see 469 00:38:42,400 --> 00:38:47,460 who did what. IRS just wants to know who sold and didn't report gains. By the way, this 470 00:38:47,460 --> 00:38:55,211 weekend, if you sold any coins last year, you got to tell the IRS. I've actually got 471 00:38:55,211 --> 00:38:58,230 to pay a little bit of tax on that. 472 00:38:58,230 --> 00:39:04,750 Yeah, so anonymity is tricky. Like, you could try to go through exchanges, probably not 473 00:39:04,750 --> 00:39:11,890 the best. They're not going to do it. It's hard to do even research on this kind of stuff 474 00:39:11,890 --> 00:39:13,940 when you're a company. 475 00:39:13,940 --> 00:39:17,660 There's a lot of sort of chilling effects and stuff. That's sort of why I like working 476 00:39:17,660 --> 00:39:23,140 at MIT, in that if I want to research crazy anonymity stuff, totally fine. Whereas if 477 00:39:23,140 --> 00:39:28,829 I'm working at a VC-funded company in San Francisco, and I'm like, hey, we're making 478 00:39:28,829 --> 00:39:33,069 these anonymity protocols. Are you sure you want to do it? 479 00:39:33,069 --> 00:39:41,230 It can be an awkward conversation, in some cases. So people don't care about privacy. 480 00:39:41,230 --> 00:39:45,809 It's sort of an externality. If you say, I'm reusing addresses. But that only hurts my 481 00:39:45,809 --> 00:39:46,809 privacy. 482 00:39:46,809 --> 00:39:52,460 That's actually not true.
In the simplest sense, you're reducing the anonymity set for 483 00:39:52,460 --> 00:39:59,339 everyone else. So if I say, OK, where are James' coins? Well, I know they're not at 484 00:39:59,339 --> 00:40:02,799 1Gmaxwell909, because that's Greg Maxwell's coin. 485 00:40:02,799 --> 00:40:08,040 So when you use publicly identifiable addresses, and vanity addresses, you're actually reducing 486 00:40:08,040 --> 00:40:12,230 the anonymity set for everyone else. It's sort of an externality, where the people who 487 00:40:12,230 --> 00:40:17,890 say, I don't care about privacy are not paying the cost of harming the people who do care 488 00:40:17,890 --> 00:40:21,200 about privacy. So this is a tricky problem. 489 00:40:21,200 --> 00:40:30,490 And there's no solution. But one cool thing, one way forward, is, well, everyone likes 490 00:40:30,490 --> 00:40:33,750 cheaper transactions. Everyone likes saving money. 491 00:40:33,750 --> 00:40:40,769 So can we make it cheaper to improve anonymity? And so privacy and scalability, in some cases, 492 00:40:40,769 --> 00:40:47,829 are at odds. So in things like ring signatures, which I don't think I have time to go into-- but 493 00:40:47,829 --> 00:40:53,099 Monero is a sort of privacy focused currency that uses ring signatures, where you 494 00:40:53,099 --> 00:40:57,329 can point to-- a ring signature is, I can point to two public keys. 495 00:40:57,329 --> 00:41:02,579 And say, OK, there's public key A and public key B. I'm signing message M on one of those 496 00:41:02,579 --> 00:41:06,280 two public keys. But I'm not telling you which. And you can verify that, OK, one of these 497 00:41:06,280 --> 00:41:09,220 keys signed. But I don't know which. 498 00:41:09,220 --> 00:41:13,910 And the signer had to have one of the private keys, or both. If you have both keys, you 499 00:41:13,910 --> 00:41:19,730 can obviously make a ring signature. It doesn't have to be two. It can be 5, 10, whatever.
You 500 00:41:19,730 --> 00:41:22,769 pick a bunch of public keys, sign with one of them. 501 00:41:22,769 --> 00:41:28,920 So that's pretty cool. That expands the size of signatures, I believe, with like O of n. 502 00:41:28,920 --> 00:41:34,170 So if you have 10 keys you're pointing to, and you make a signature on one of them, the 503 00:41:34,170 --> 00:41:36,680 signature actually gets bigger. 504 00:41:36,680 --> 00:41:41,710 So for Monero, they use all these different systems. And scalability is one of the biggest 505 00:41:41,710 --> 00:41:45,390 problems there-- I mean, it's a problem with bitcoin as well. It's a problem with 506 00:41:45,390 --> 00:41:46,390 everything. 507 00:41:46,390 --> 00:41:50,620 But it's even worse in Monero because of the choices and the different algorithms they 508 00:41:50,620 --> 00:41:56,740 use. So in some cases, privacy and scalability are at odds. But in some cases, it works together, 509 00:41:56,740 --> 00:42:01,430 where you can say, we just want to have less information about these transactions. So if 510 00:42:01,430 --> 00:42:06,410 you have less information to store, there's less information to link the users to their 511 00:42:06,410 --> 00:42:08,190 coins. 512 00:42:08,190 --> 00:42:16,519 So an idea here is aggregate signatures. So, currently, when you sign, you have this input. 513 00:42:16,519 --> 00:42:21,599 And you say, OK, here's user A's signature, here's user B's signature. And you're doing 514 00:42:21,599 --> 00:42:25,279 this ineffective CoinJoin thing here. 515 00:42:25,279 --> 00:42:30,710 The goal would be we want to aggregate these signatures. So we don't have a signature for 516 00:42:30,710 --> 00:42:38,380 this input. We just have a single signature on key C, which is just, somehow, the combination 517 00:42:38,380 --> 00:42:41,740 of A and B's signature.
518 00:42:41,740 --> 00:42:45,910 And then you can save space, because there's only one signature that stays the same size, 519 00:42:45,910 --> 00:42:51,819 but still prove that both A-- so this is not a ring signature. This aggregate is, A and 520 00:42:51,819 --> 00:42:57,270 B both signed, and produced a single signature together. And that can validate the whole 521 00:42:57,270 --> 00:42:58,270 thing. 522 00:42:58,270 --> 00:43:05,859 OK, so how can we make this signature? We know pub keys A and B. They're both signing 523 00:43:05,859 --> 00:43:10,970 the same message, which is really important. There's other terms for these things. 524 00:43:10,970 --> 00:43:16,869 There's multi-signatures, aggregate signatures, key aggregation. And no one agrees on-- like, 525 00:43:16,869 --> 00:43:19,839 I've had discussions with people. And I'm like, wait, that's what you call it? I was 526 00:43:19,839 --> 00:43:23,930 calling it something else. And there's not really good terms for this. 527 00:43:23,930 --> 00:43:28,369 But in this case, what we want is we're signing the same message. So that makes it easier. 528 00:43:28,369 --> 00:43:34,359 We're not trying to combine different signatures on different messages from different pub keys 529 00:43:34,359 --> 00:43:37,910 and combine it to the same signature. That's even harder, although that is possible in 530 00:43:37,910 --> 00:43:40,170 some ways. 531 00:43:40,170 --> 00:43:44,230 But in this case, A and B are signing the same message with their two different keys. 532 00:43:44,230 --> 00:43:49,029 And they need one signature. And the signature is going to be R and S. 533 00:43:49,029 --> 00:43:55,441 So the equation that we had, that we've done a couple of times, the signature s is k, some 534 00:43:55,441 --> 00:44:02,869 random number, minus the hash of the message and R, where R is k times G, times the private key: s = k - h(m, R) * a. And 535 00:44:02,869 --> 00:44:13,890 to verify, s times G has to equal R minus the hash of the message and R, times the public key: s * G = R - h(m, R) * A.
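A minimal sketch of that single-signer equation, under stated assumptions: instead of an elliptic curve it uses a tiny multiplicative group mod 1019 (completely insecure, chosen only so the numbers are readable), so the lecture's k times G becomes `pow(GEN, k, P_MOD)` and point addition becomes multiplication mod `P_MOD`. The names `challenge`, `sign`, and `verify` are illustrative, not from any library.

```python
import hashlib
import secrets

# Toy Schnorr group, NOT secure: P_MOD = 2 * ORDER + 1, GEN has order ORDER.
P_MOD, ORDER, GEN = 1019, 509, 4


def challenge(message, r):
    """h(m, R): hash the message together with R, reduced mod the group order."""
    data = message + r.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def sign(priv, message):
    k = secrets.randbelow(ORDER - 1) + 1  # fresh secret nonce in [1, ORDER-1]
    r = pow(GEN, k, P_MOD)                # R = k * G in the lecture's notation
    s = (k - challenge(message, r) * priv) % ORDER  # s = k - h(m, R) * a
    return r, s


def verify(pub, message, r, s):
    # s*G == R - h*A, rearranged as s*G + h*A == R.
    h = challenge(message, r)
    return pow(GEN, s, P_MOD) * pow(pub, h, P_MOD) % P_MOD == r


priv = 123                       # little a (example value)
pub = pow(GEN, priv, P_MOD)      # big A = a * G
r, s = sign(priv, b"send 2 coins to C")
print(verify(pub, b"send 2 coins to C", r, s))
```

The check works because raising the generator to s and multiplying by the public key raised to h cancels the private-key term, leaving the generator raised to k, which is R.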
Now, if you share 536 00:44:13,890 --> 00:44:18,450 the private keys with each other, it's kind of easy. 537 00:44:18,450 --> 00:44:22,180 Alice just says, here's my key, Bob. And then Bob can compute everything. You can do 538 00:44:22,180 --> 00:44:26,740 that with ECDSA. But you really don't want to share private keys, because as soon as 539 00:44:26,740 --> 00:44:31,170 you give someone else your private key, maybe they don't sign this aggregate signature of 540 00:44:31,170 --> 00:44:34,900 the thing you want to sign, they sign something else, giving them all the money. So you want 541 00:44:34,900 --> 00:44:38,539 to be involved in this process. 542 00:44:38,539 --> 00:44:45,269 So the simplest sort of multi-signature system for Schnorr signatures-- first, you want to 543 00:44:45,269 --> 00:44:55,029 share an r value. So this r is also going to have to have contributions from both users. 544 00:44:55,029 --> 00:45:00,599 Who comes up with what here? If c is the combination of both Alice and Bob's public key, the message 545 00:45:00,599 --> 00:45:06,089 we have already agreed on. But this r value is the question. 546 00:45:06,089 --> 00:45:13,200 So the idea is, Alice comes up with k sub a, computes r sub a, and gives it to Bob. 547 00:45:13,200 --> 00:45:19,970 Bob makes k sub b, computes r sub b, and gives that to Alice. Then both of them can say, 548 00:45:19,970 --> 00:45:24,930 OK, well, we know the real r that we're going to use for the signature is the sum of r sub 549 00:45:24,930 --> 00:45:25,930 a and r sub b. 550 00:45:25,930 --> 00:45:36,349 Then, they want to compute their own s's. So for Alice, s sub a is going to be equal 551 00:45:36,349 --> 00:45:41,980 to k sub a, that only she knows. She shared r sub a, but not k sub a, with Bob. Minus 552 00:45:41,980 --> 00:45:47,109 the hash of the message, and this aggregate, this summed r, times her private key.
553 00:45:47,109 --> 00:45:54,250 For Bob, it's s sub b, equals k sub b, minus the hash of the message, and r, times little 554 00:45:54,250 --> 00:46:00,349 b. And then they give each other a really-- only one party needs to do this, right? Either 555 00:46:00,349 --> 00:46:05,990 Alice gives s sub a to Bob, or Bob gives s sub b to Alice. But they don't both have to 556 00:46:05,990 --> 00:46:10,829 do it, because then the final step is you just add the s's. 557 00:46:10,829 --> 00:46:19,910 So this aggregate s is s sub a, plus s sub b, which is k sub a, plus k sub b, minus the 558 00:46:19,910 --> 00:46:27,130 hash of m and r, times a, minus the hash of m and r, times b, which is, you can call this 559 00:46:27,130 --> 00:46:32,630 k. That's the discrete log of r, because they summed it. And then you can factor out the 560 00:46:32,630 --> 00:46:34,510 a and b here. 561 00:46:34,510 --> 00:46:40,630 And so this is the sum of their private keys. This is the sum of their k values. And then 562 00:46:40,630 --> 00:46:49,450 this single verification step will work. And that works. Awesome. OK, questions about this? 563 00:46:49,450 --> 00:47:00,049 Basically make sense? There's all sorts of minefield caveat sort of watch out things. 564 00:47:00,049 --> 00:47:06,410 For example, if they learn your k value, they will learn your private key, once they get 565 00:47:06,410 --> 00:47:17,950 this. Normally, you make deterministic k values, where you compute k as the hash of your message 566 00:47:17,950 --> 00:47:22,011 being signed and your private key. That's dangerous to do in this case, because they 567 00:47:22,011 --> 00:47:27,999 might get an s value from you, and then give you a different r value, and get another s 568 00:47:27,999 --> 00:47:29,940 value, and find your private key. 569 00:47:29,940 --> 00:47:37,970 So there's all sorts of watch out things.
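The two-party flow above can be sketched end to end. As with any toy, the assumptions matter: this uses a tiny, insecure multiplicative group mod 1019 in place of an elliptic curve (so adding points becomes multiplying mod `P_MOD`), both parties run in one process, and the helper names are invented for illustration. It shows only that the summed s verifies under the summed key.

```python
import hashlib
import secrets

# Toy Schnorr group, NOT secure: P_MOD = 2 * ORDER + 1, GEN has order ORDER.
P_MOD, ORDER, GEN = 1019, 509, 4


def challenge(message, r):
    """h(m, R) reduced mod the group order."""
    data = message + r.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def new_nonce():
    """Return (k, R) with R = k * G; k stays private, R is shared."""
    k = secrets.randbelow(ORDER - 1) + 1
    return k, pow(GEN, k, P_MOD)


def partial_sig(k, priv, message, r_sum):
    """One party's share: s_i = k_i - h(m, R) * priv_i, using the summed R."""
    return (k - challenge(message, r_sum) * priv) % ORDER


def verify(pub, message, r, s):
    h = challenge(message, r)
    return pow(GEN, s, P_MOD) * pow(pub, h, P_MOD) % P_MOD == r


a, b = 101, 202                                  # Alice's and Bob's private keys
pub_a, pub_b = pow(GEN, a, P_MOD), pow(GEN, b, P_MOD)
pub_c = pub_a * pub_b % P_MOD                    # C = A + B
msg = b"spend inputs A and B"

k_a, r_a = new_nonce()                           # Alice sends r_a to Bob
k_b, r_b = new_nonce()                           # Bob sends r_b to Alice
r = r_a * r_b % P_MOD                            # R = R_a + R_b

s = (partial_sig(k_a, a, msg, r) + partial_sig(k_b, b, msg, r)) % ORDER
print(verify(pub_c, msg, r, s))                  # one signature, checked against C
```

Note that each party draws a fresh random nonce here; the sketch does not attempt to model the nonce-related pitfalls mentioned in the lecture.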
So the idea is, OK, now cool, we've got output, 570 00:47:37,970 --> 00:47:41,359 which needs a signature from a. We've got an output which needs a signature from b. 571 00:47:41,359 --> 00:47:46,750 We'll use those as inputs, and just have this sum c signature. And that shows that both 572 00:47:46,750 --> 00:47:48,000 parties signed. 573 00:47:48,000 --> 00:47:57,390 So we're good. Now, we have way less data. These signatures are like 65 bytes. So instead 574 00:47:57,390 --> 00:48:02,089 of having it twice, you just have one that covers the whole transaction. Great. 575 00:48:02,089 --> 00:48:07,519 And it doesn't have to be two. You can see how this would extend to three, four, or five 576 00:48:07,519 --> 00:48:13,650 different people. They would all have to compute their own k value, give the r value to everyone else, 577 00:48:13,650 --> 00:48:20,140 everyone computes this combined r value, and then does their own s's, and then sums them 578 00:48:20,140 --> 00:48:24,499 all up. And this would work with any number of participants. 579 00:48:24,499 --> 00:48:29,950 Problem-- you've got a bunch of coins, that 40,000 coins from before. And then one day, 580 00:48:29,950 --> 00:48:36,800 you see, wait, I'm user A. User B came up with a user A and B signature and sent it 581 00:48:36,800 --> 00:48:43,410 all to address E. I never signed anything. I didn't do this process with him. 582 00:48:43,410 --> 00:48:48,529 How did he steal all my coins? This is bad. So any idea of how you could do this with 583 00:48:48,529 --> 00:48:51,520 the equations we have? 584 00:48:51,520 --> 00:48:59,430 OK, so what you do is you say, hey, here's key A on the network. And, normally, you see 585 00:48:59,430 --> 00:49:02,529 pub key hashes. But a lot of times you see pub keys. And, anyway, you definitely see 586 00:49:02,529 --> 00:49:05,589 the pub key once people try to spend it. 587 00:49:05,589 --> 00:49:09,309 So maybe you have to do this quickly. But you can do this real quick.
So you say, OK, 588 00:49:09,309 --> 00:49:16,069 I'm going to make little q, a random private key, compute q times G to be big Q. And then I'll 589 00:49:16,069 --> 00:49:22,049 compute key B, which is big Q minus big A. Then I will send some coins to B. 590 00:49:22,049 --> 00:49:27,059 And now, note that I don't know the private key for B. The private key for B is going 591 00:49:27,059 --> 00:49:31,930 to be this little q, minus little a. I don't know little a, so I don't know what little 592 00:49:31,930 --> 00:49:33,990 b is. 593 00:49:33,990 --> 00:49:40,839 But that's OK. I don't send too many coins to key B. And, anyway, I'll get them back. 594 00:49:40,839 --> 00:49:42,859 I don't know b. I can't sign with it. 595 00:49:42,859 --> 00:49:49,259 However, I want to spend from B and A together. I don't know little b. I don't know little a. 596 00:49:49,259 --> 00:49:52,680 I don't know the private key for either. But I do know the private key for both, which 597 00:49:52,680 --> 00:49:53,849 is sort of confusing. 598 00:49:53,849 --> 00:49:58,031 But the idea is, well, C is going to be A plus B, which is A, plus Q minus A, which 599 00:49:58,031 --> 00:50:08,190 is just Q. So in this case, I can observe a key, make this rogue key, and then spend 600 00:50:08,190 --> 00:50:14,190 from both of them, without knowing the actual key that I wanted to steal from. So this is 601 00:50:14,190 --> 00:50:18,410 a huge problem. You can't have this. This would make signature aggregation completely 602 00:50:18,410 --> 00:50:28,509 insecure. What are some ideas of how to fix that? 603 00:50:28,509 --> 00:50:34,259 So the interesting thing is in all the literature about multi-signature, this problem sort of 604 00:50:34,259 --> 00:50:39,720 still persists. And the new papers about how to use this in bitcoin are the ones that are 605 00:50:39,720 --> 00:50:47,539 actually addressing it. So the general idea was, well, make b sign.
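The rogue-key construction described above is short enough to run. This is a sketch under the usual toy assumptions: a tiny, insecure multiplicative group mod 1019 stands in for the curve, so Q minus A becomes Q times the modular inverse of A, and the variable names are illustrative. The attacker never touches the victim's private key after computing the victim's public key.

```python
import hashlib
import secrets

# Toy Schnorr group, NOT secure: P_MOD = 2 * ORDER + 1, GEN has order ORDER.
P_MOD, ORDER, GEN = 1019, 509, 4


def challenge(message, r):
    data = message + r.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def sign(priv, message):
    k = secrets.randbelow(ORDER - 1) + 1
    r = pow(GEN, k, P_MOD)
    return r, (k - challenge(message, r) * priv) % ORDER


def verify(pub, message, r, s):
    h = challenge(message, r)
    return pow(GEN, s, P_MOD) * pow(pub, h, P_MOD) % P_MOD == r


# Victim: only pub_a is ever visible to the attacker.
victim_priv = 321
pub_a = pow(GEN, victim_priv, P_MOD)

# Attacker: picks q, publishes the rogue key B = Q - A.
q = 55
pub_q = pow(GEN, q, P_MOD)
pub_b = pub_q * pow(pub_a, -1, P_MOD) % P_MOD   # needs Python 3.8+ for pow(x, -1, m)

# Naive aggregation C = A + B collapses to Q, whose private key the attacker knows.
pub_c = pub_a * pub_b % P_MOD
print(pub_c == pub_q)

# So the attacker alone produces a valid "A and B" signature.
msg = b"send everything to address E"
r, s = sign(q, msg)
print(verify(pub_c, msg, r, s))
```

Both checks pass for any choice of victim key, which is why naive key summation cannot be used as is.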
Make b prove that 606 00:50:47,539 --> 00:50:50,140 he actually can sign with key b. 607 00:50:50,140 --> 00:50:54,390 Before you start doing these things, have Alice and Bob talk to each other. And Bob 608 00:50:54,390 --> 00:50:58,579 says, OK, here's a signature with key b. And Alice says, OK, here's a signature with key 609 00:50:58,579 --> 00:51:02,470 a. And, now you've proven that they actually can sign, and they're not doing these kinds 610 00:51:02,470 --> 00:51:04,980 of rogue attacks. 611 00:51:04,980 --> 00:51:10,579 So that's the straightforward way to do it, where, OK, if you don't know b and 612 00:51:10,579 --> 00:51:15,119 can't sign, we're not going to continue with this process. That's interactive, though. 613 00:51:15,119 --> 00:51:19,930 So make b sign a message before combining keys. Easy, that's what people sort of thought 614 00:51:19,930 --> 00:51:21,740 about for years. 615 00:51:21,740 --> 00:51:25,720 But the whole point here is to aggregate signatures. And so if you require b to create a signature 616 00:51:25,720 --> 00:51:31,770 and post it into the transaction, you've just eliminated all the gains, all the space saving, 617 00:51:31,770 --> 00:51:36,330 for this technique. So we can't have that. We want it to be non-interactive. 618 00:51:36,330 --> 00:51:43,829 We want any existing keys to be usable in this system, without pre-committing to anything. 619 00:51:43,829 --> 00:51:49,690 You can also make it interactive, where-- well, anyway, there's actually a better way 620 00:51:49,690 --> 00:51:54,359 to do it. You de-linearize the signatures. So you redefine the signatures. 621 00:51:54,359 --> 00:52:01,279 You still send to, for example, key a or key b. Your outputs are still pub keys. But when 622 00:52:01,279 --> 00:52:07,359 you require a signature, you don't require a signature from that pub key. You require 623 00:52:07,359 --> 00:52:11,790 a signature from that pub key, times the hash of that pub key.
624 00:52:11,790 --> 00:52:17,550 So you say, OK, I'm sending to a. But when normally I have sig a, signature from key 625 00:52:17,550 --> 00:52:28,891 a. Instead, I say, no, I want a signature from a, times the hash of a. So anyone who 626 00:52:28,891 --> 00:52:34,859 knows the private key for a will know the private key for this, because this is public. 627 00:52:34,859 --> 00:52:37,529 Everyone can see what the hash of a is. 628 00:52:37,529 --> 00:52:46,279 And then so it's just little a, times the hash of big A. This is a scalar multiplication. 629 00:52:46,279 --> 00:52:52,650 So the scalar multiplication works the same for the-- sorry, this is the point multiplication. 630 00:52:52,650 --> 00:52:55,339 And it works the same for the scalar down here. 631 00:52:55,339 --> 00:52:59,210 So this doesn't hurt. It doesn't make it any harder to sign. You just have to perform a 632 00:52:59,210 --> 00:53:05,279 hash operation and a multiplication, which is quick. But what it does is it prevents 633 00:53:05,279 --> 00:53:06,700 this kind of attack. 634 00:53:06,700 --> 00:53:11,960 So instead of signing with a plus b, sign with a, times the hash of a, plus b times 635 00:53:11,960 --> 00:53:19,590 the hash of b. Since, in this case, in bitcoin, you can see what public keys are being signed 636 00:53:19,590 --> 00:53:24,440 and aggregated, so you say, OK, well I'm going to try to do this technique again. c is a, 637 00:53:24,440 --> 00:53:27,579 times the hash of a, plus b times the hash of b. 638 00:53:27,579 --> 00:53:33,410 The private key is going to be little a, times the hash of the pub key, plus little b times the 639 00:53:33,410 --> 00:53:44,220 hash of pub key big B. I know b, which is q minus a-- sorry, I don't know b. I know 640 00:53:44,220 --> 00:53:54,900 q. And I constructed sort of this b is q minus a. But I don't know it.
641 00:53:54,900 --> 00:54:02,230 So c is now going to be a, times the hash of a, plus q minus a, times the hash of q 642 00:54:02,230 --> 00:54:10,069 minus a. That's what b is. I can't cancel out this little a anymore, because it's got 643 00:54:10,069 --> 00:54:15,140 a different coefficient. Maybe it's easier if you sort of move them on the other side. 644 00:54:15,140 --> 00:54:20,010 I've now got these coefficients in front of my private keys that are not the same, and 645 00:54:20,010 --> 00:54:22,930 won't cancel out anymore. 646 00:54:22,930 --> 00:54:31,171 Before, the coefficient was times 1. So the idea was a, times 1, plus q minus a, times 1. Since 647 00:54:31,171 --> 00:54:37,359 the coefficients are the same, 1, I can factor them out, and now apply these, and remove 648 00:54:37,359 --> 00:54:38,359 the a. 649 00:54:38,359 --> 00:54:42,691 So if the coefficients are the same-- if it was a, times the hash of a, plus q minus a 650 00:54:42,691 --> 00:54:47,049 times the hash of a, I'm good. I can still factor out that coefficient and cancel out 651 00:54:47,049 --> 00:54:52,869 a. But now, the coefficients are different. And so I can't do that subtraction. And now, 652 00:54:52,869 --> 00:54:59,800 I'm stuck. I cannot sign with this little c, just from knowing q. Questions about this? 653 00:54:59,800 --> 00:55:06,390 AUDIENCE: So how do you come up with q? 654 00:55:06,390 --> 00:55:16,950 TADGE DRYJA: In this attack, q is just any random number. But the idea is here, q being 655 00:55:16,950 --> 00:55:22,279 any random number doesn't help you, because you've sent money to b. You're trying to sign 656 00:55:22,279 --> 00:55:29,140 with q minus a. But you don't actually know the private key, q minus a. So you're stuck. 657 00:55:29,140 --> 00:55:34,200 Even if you did, in this case, it wouldn't help, because you still have that either. 658 00:55:34,200 --> 00:55:41,619 So this prevents that kind of straightforward attack.
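The cancellation argument above can be written out as a short derivation. The symbols follow the lecture (lowercase for private keys, uppercase for public keys, with the rogue key B = Q - A); writing the hash as H(.) and reducing scalars modulo the group order are notational assumptions.

```latex
\[
\begin{aligned}
\text{Naive sum:}\quad
  c &= 1\cdot a + 1\cdot(q - a) = q
  && \text{(the attacker knows } q\text{)}\\[4pt]
\text{Delinearized:}\quad
  c &= H(A)\,a + H(B)\,(q - a)\\
    &= H(B)\,q + \bigl(H(A) - H(B)\bigr)\,a
  && \text{(still depends on the unknown } a\text{)}
\end{aligned}
\]
```

The leftover term vanishes only if H(A) = H(B), which for distinct keys would require a hash collision, so knowing q alone is no longer enough to sign.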
You can't get rid of the a times h 659 00:55:41,619 --> 00:55:42,670 term. 660 00:55:42,670 --> 00:55:49,480 This actually is not enough. There's a paper, Wagner's paper, by some guy at 661 00:55:49,480 --> 00:55:56,290 California, UC Berkeley, I think, "A Generalized Birthday Problem." And it's a hard paper to 662 00:55:56,290 --> 00:56:00,509 read. But if you're working on these kinds of things, it's a good paper to read, because 663 00:56:00,509 --> 00:56:04,740 it comes up again and again with these kinds of attacks. 664 00:56:04,740 --> 00:56:11,309 Collisions-- I don't think we ever talked about. So the idea of colliding, for example, 665 00:56:11,309 --> 00:56:21,059 a hash function, or in this case a public key, usually it takes half the work that you-- 666 00:56:21,059 --> 00:56:28,380 the reason it's called a birthday paradox is because-- do people know this birthday 667 00:56:28,380 --> 00:56:32,599 thing? Like, how many people need to be in a room before two people have the same birthday? 668 00:56:32,599 --> 00:56:34,500 And it's like 22 or something. 669 00:56:34,500 --> 00:56:40,310 It's really low. And it's surprising, because you think, well, maybe 365 divided by 2, so 670 00:56:40,310 --> 00:56:49,260 like 180 or something. But it's actually 22, I think, because of the way like for every 671 00:56:49,260 --> 00:56:55,339 new person added, there's a possibility they collide with every already existing person's 672 00:56:55,339 --> 00:56:56,609 birthday. 673 00:56:56,609 --> 00:57:02,180 And so the collisions are more common than you think. So in the case of colliding a hash 674 00:57:02,180 --> 00:57:11,059 function, even if the hash function output is 256 bits, you only need 2 to the 128 attempts 675 00:57:11,059 --> 00:57:16,030 to collide and find two that are the same. And that's if you store all the old hashes.
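A quick way to check the birthday numbers, and to see the square-root behaviour on a hash function, is the sketch below. It is an illustration only: the function names are made up for this example, and the hash is SHA-256 truncated to a small number of bits so that a collision is reachable on a laptop. With exact arithmetic, the smallest group with a better-than-even chance of a shared birthday is 23 people, close to the figure recalled in the lecture.

```python
import hashlib

def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def smallest_group(threshold: float = 0.5, days: int = 365) -> int:
    """Smallest group size whose collision probability reaches the threshold."""
    n = 1
    while shared_birthday_probability(n, days) < threshold:
        n += 1
    return n

def truncated_hash(data: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-256, standing in for a small hash function."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int):
    """Store every hash seen so far and stop at the first repeat."""
    seen = {}
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, counter + 1   # two inputs, number of attempts
        seen[h] = msg
        counter += 1
```

For a 24-bit truncation there are about 16.7 million possible outputs, yet `find_collision(24)` typically needs only a few thousand attempts, on the order of 2 to the 12. The same scaling gives roughly 2 to the 128 work for a 256-bit hash.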
676 00:57:16,030 --> 00:57:20,470 And so keep coming up with new ones after the square root of the number of attempts, 677 00:57:20,470 --> 00:57:24,810 you'll find it. And then there's some techniques with cycle finding, where you don't have to 678 00:57:24,810 --> 00:57:30,910 store them all, things like that. So usually, you think, OK, 2 to the n over 2 time to find 679 00:57:30,910 --> 00:57:34,609 a collision. But that's a collision between two things. 680 00:57:34,609 --> 00:57:41,910 And Wagner's attack is sort of, yeah, find a and b, such that a equals b, kind of the 681 00:57:41,910 --> 00:57:49,789 normal collision. Now, find a0, a1, up to ai. So find i things on this side, and j things 682 00:57:49,789 --> 00:57:56,830 on this side, such that the sum of a equals the sum of b. And in the case of elliptic 683 00:57:56,830 --> 00:57:59,400 curve points, it would actually be the sum. You'd do addition. 684 00:57:59,400 --> 00:58:04,840 In the case of hashes, it might be xor, or some other operation. But the idea is if you 685 00:58:04,840 --> 00:58:09,839 have a lot of different things that you can pick and choose on both sides, you can potentially 686 00:58:09,839 --> 00:58:15,960 find collisions with much less time. And in the case of these elliptic curve points, it 687 00:58:15,960 --> 00:58:21,910 probably would be practical. You'd need dozens, 20, 30, 40, different keys. 688 00:58:21,910 --> 00:58:27,869 But you could potentially say, OK, there's a key with a lot of coins. I'm going to find 689 00:58:27,869 --> 00:58:32,410 a set of a whole bunch of keys, such that I can cancel that key out, even with 690 00:58:32,410 --> 00:58:42,190 this multiplying by a different coefficient, because I've got so much search space. It's 691 00:58:42,190 --> 00:58:45,519 a cool paper. You should look through it, if you're interested. 692 00:58:45,519 --> 00:58:52,180 But that's a problem. You can sort of make progress, is the basic idea.
I can keep getting 693 00:58:52,180 --> 00:58:55,249 closer and closer to canceling this thing out. 694 00:58:55,249 --> 00:59:03,040 So instead, we have this improved delinearization. And this is talked about in the paper from 695 00:59:03,040 --> 00:59:09,210 like January. You take the hash of all the keys concatenated together, in some specified 696 00:59:09,210 --> 00:59:15,780 order. So you sort the keys some way. And you say OK, z is the hash of a and b, or if 697 00:59:15,780 --> 00:59:18,900 there's more keys a, and b, and c, just all concatenated. 698 00:59:18,900 --> 00:59:25,269 And then you sign with a, times the hash of z and 0, plus b times the hash of z and 699 00:59:25,269 --> 00:59:30,349 1. So you make the coefficients distinct. But you also make the coefficients commit 700 00:59:30,349 --> 00:59:37,380 to every single part of the set. Every key in the set needs to be in this coefficient. 701 00:59:37,380 --> 00:59:48,190 And the MuSig paper sort of explains why that works. So the attack in this case is, well, I've 702 00:59:48,190 --> 00:59:52,940 got these different coefficients. But the coefficient here only depends on this key 703 00:59:52,940 --> 00:59:53,960 here. 704 00:59:53,960 --> 00:59:58,480 So maybe I can find a bunch of different keys with different coefficients, such that these 705 00:59:58,480 --> 01:00:06,880 coefficients will cancel out, or these coefficients will sum, or multiply, or sum up to hash of 706 01:00:06,880 --> 01:00:13,670 a. So it seems hard. I can't find different hashes that will equal the hash of a. That's 707 01:00:13,670 --> 01:00:15,279 a hash collision. 708 01:00:15,279 --> 01:00:22,190 But if I have any number of different things, I may be able to. And so the idea is, if you 709 01:00:22,190 --> 01:00:30,440 do this, now every time I'm adding or removing a key from my attempt, I have to change z.
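The improved delinearization described here, where z commits to the whole key set and each key gets its own numbered coefficient, might be sketched as follows. Everything concrete in it is an assumption for illustration: SHA-256 as the hash, byte-wise sorting as the "specified order", a 4-byte index encoding, and placeholder byte strings standing in for serialized public keys. The MuSig paper defines its own exact encoding, which this does not claim to match.

```python
import hashlib

# Order of the secp256k1 group; coefficients are reduced modulo this value.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def key_set_commitment(pubkeys):
    """z = H(all public keys concatenated in sorted order)."""
    return hashlib.sha256(b"".join(sorted(pubkeys))).digest()

def coefficients(pubkeys):
    """Map each key to H(z, index), where index is its position after sorting."""
    z = key_set_commitment(pubkeys)
    result = {}
    for index, pk in enumerate(sorted(pubkeys)):
        digest = hashlib.sha256(z + index.to_bytes(4, "big")).digest()
        result[pk] = int.from_bytes(digest, "big") % N
    return result

# Placeholder 33-byte strings, NOT real curve points (hypothetical stand-ins).
KEY_A = b"\x02" + b"\x11" * 32
KEY_B = b"\x02" + b"\x22" * 32
KEY_C = b"\x02" + b"\x33" * 32

two_keys = coefficients([KEY_A, KEY_B])
three_keys = coefficients([KEY_A, KEY_B, KEY_C])

# Adding a key changes z, and with it the coefficient of every other key.
a_changed = two_keys[KEY_A] != three_keys[KEY_A]
```

Because every coefficient depends on z, an attacker who adds or swaps one candidate key invalidates all the coefficients computed so far, which is the "start over from scratch" property the lecture relies on.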
710 01:00:30,440 --> 01:00:35,820 So I can't leave this here, and start adding things, and try to work on it that way, because 711 01:00:35,820 --> 01:00:39,569 every time I add a c or d, z changes. 712 01:00:39,569 --> 01:00:45,349 And so I have to start over from scratch. So anyway, that's my general intuition of 713 01:00:45,349 --> 01:00:52,109 how the delinearization works for MuSig. And then, yeah, it used to be like z, and then 714 01:00:52,109 --> 01:00:57,119 put a in here, because now that makes it unique. But you can actually just number them. 715 01:00:57,119 --> 01:01:00,849 You just need them to be distinct. And since it's a hash function, just say like 0, 1, 2, 716 01:01:00,849 --> 01:01:05,650 3, 4, is good enough. OK, any questions about this? 717 01:01:05,650 --> 01:01:10,740 I realize it's kind of complicated, and the software is kind of crazy. But the idea is 718 01:01:10,740 --> 01:01:20,499 that it prevents a bunch of these attacks so that you can securely use aggregate signatures. 719 01:01:20,499 --> 01:01:23,749 Questions? 720 01:01:23,749 --> 01:01:33,630 And you wouldn't have to program it. Yeah so you could say, OK, it saves space. The first 721 01:01:33,630 --> 01:01:41,960 use case will be in single wallets. It's interactive. So you have to do that thing where I come 722 01:01:41,960 --> 01:01:46,690 up with different k values, I share the different r values. I come up with the s values. I add 723 01:01:46,690 --> 01:01:49,000 those up, sure. 724 01:01:49,000 --> 01:01:53,450 If it's all doing it yourself, where it's just there happened to be two different pub 725 01:01:53,450 --> 01:01:58,549 keys that you control both of, you can basically skip all of that. And you can just say, no, 726 01:01:58,549 --> 01:02:04,229 I'm just going to add the two points here. And I'll use the same k value. That's just 727 01:02:04,229 --> 01:02:05,249 me.
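The two signing flows contrasted above, several signers each contributing a k value and an s value versus one wallet that holds every key and uses a single k, can be sketched like this. It is a simplified toy under stated assumptions: the curve constants are the public secp256k1 parameters, while the function names, the challenge hash layout, and the convention s = k + e*x are illustrative choices. It sums the keys with no delinearization and skips the nonce-exchange rounds a real multi-party protocol needs, so it is not safe to use as written.

```python
import hashlib
import secrets

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def challenge(R, X, msg):
    data = b"".join(v.to_bytes(32, "big") for v in (*R, *X)) + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def verify(X, msg, R, s):
    return mul(s, G) == add(R, mul(challenge(R, X, msg), X))

def multi_party_sign(privkeys, msg):
    """Each signer picks k_i, shares R_i = k_i*G, then contributes s_i."""
    X = None
    for x in privkeys:
        X = add(X, mul(x, G))            # combined public key
    nonces = [secrets.randbelow(N - 1) + 1 for _ in privkeys]
    R = None
    for k in nonces:
        R = add(R, mul(k, G))            # combined R value
    e = challenge(R, X, msg)
    s = sum(k + e * x for k, x in zip(nonces, privkeys)) % N   # sum of the s_i
    return X, R, s

def single_wallet_sign(privkeys, msg):
    """One owner of all keys: add the private keys and use a single k."""
    x = sum(privkeys) % N
    X = mul(x, G)
    k = secrets.randbelow(N - 1) + 1
    R = mul(k, G)
    return X, R, (k + challenge(R, X, msg) * x) % N
```

Both functions return a signature that `verify` accepts under the same combined key, so a verifier cannot tell whether one wallet or several parties produced it.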
728 01:02:05,249 --> 01:02:10,989 So this happens a lot in bitcoin wallets, where you're making a transaction with multiple 729 01:02:10,989 --> 01:02:15,359 inputs. They're all controlled by you. But you might have two coins here, and three coins 730 01:02:15,359 --> 01:02:19,380 there, and you want to send someone four coins. So you've got to use both of these inputs. 731 01:02:19,380 --> 01:02:25,510 Currently, in bitcoin you're going to have multiple signatures. With this new signature 732 01:02:25,510 --> 01:02:31,029 aggregation, you can say, OK, I'm picking these two inputs. I'm signing for all of them 733 01:02:31,029 --> 01:02:35,330 with a single signature. And in the case of it all being yours, you don't even have to 734 01:02:35,330 --> 01:02:43,099 do the two operations and add them, because you know all the keys. So that's going 735 01:02:43,099 --> 01:02:44,310 to be the first use case. 736 01:02:44,310 --> 01:02:49,800 And that's much easier to program, because it's all just within the same computer. And 737 01:02:49,800 --> 01:02:51,619 this is easy. That's cool. 738 01:02:51,619 --> 01:03:01,380 And then a cooler use is, OK, use it with CoinJoin, where users A, B, C, and D all contribute. 739 01:03:01,380 --> 01:03:06,869 They all have an input with the same amount of coins, great. And they're doing CoinShuffle 740 01:03:06,869 --> 01:03:09,009 to shuffle the order of these outputs. 741 01:03:09,009 --> 01:03:15,150 So A knows, oh, mine is G. But I have no idea about E, F, and H. I don't know the mapping 742 01:03:15,150 --> 01:03:21,579 other than my own. But, hey, my output is in there. So whatever everyone else is doing, 743 01:03:21,579 --> 01:03:22,579 I don't care. 744 01:03:22,579 --> 01:03:27,270 But I'm good with this transaction. And they can contribute their own k value and their 745 01:03:27,270 --> 01:03:33,479 own s value. And then, finally, you put the signature at the bottom.
And it verifies that 746 01:03:33,479 --> 01:03:36,279 all these participants have signed. 747 01:03:36,279 --> 01:03:46,010 And this is really nice because it's smaller. This would be 250 bytes worth of signature 748 01:03:46,010 --> 01:03:51,420 data. Now, it's just 65 bytes of signature data at the bottom. So you save on fees. You 749 01:03:51,420 --> 01:03:53,630 save on space. 750 01:03:53,630 --> 01:03:59,829 Everyone likes this. Yeah, it's cheaper than a solo transaction. And so that potentially 751 01:03:59,829 --> 01:04:04,038 helps scalability and privacy, in that if people say, I want to do something cheaply, 752 01:04:04,038 --> 01:04:09,319 I will connect to other people and aggregate my signature with theirs. 753 01:04:09,319 --> 01:04:13,930 And so we'd save some space and save some money. And maybe I don't really care about 754 01:04:13,930 --> 01:04:20,400 the anonymity thing. Maybe I'm entering into a transaction, and there's some people doing 755 01:04:20,400 --> 01:04:23,609 shady deals in this transaction. But I don't care. 756 01:04:23,609 --> 01:04:28,820 My thing gets in and out. And I save money. So the extension of this would be, what if 757 01:04:28,820 --> 01:04:34,010 you had just one giant transaction in every block, which is all the inputs and all the 758 01:04:34,010 --> 01:04:40,210 outputs, and there's a single signature? That would be really cool. 759 01:04:40,210 --> 01:04:51,329 There are techniques to do that. Maybe next time. So the issue is it's interactive, in 760 01:04:51,329 --> 01:04:55,790 that you have to know about all the other inputs and all the other outputs. 761 01:04:55,790 --> 01:05:00,420 So if it were non-interactive, then I could say, oh, well there's this transaction where 762 01:05:00,420 --> 01:05:05,509 D is sending coins to H. And there's also this other transaction where C is sending 763 01:05:05,509 --> 01:05:10,460 to G. And they've got their own signatures.
764 01:05:10,460 --> 01:05:13,930 I'm not either of these people. I didn't do anything with these signatures. But if I could 765 01:05:13,930 --> 01:05:19,160 take those two things and combine the signature non-interactively, then I could just basically 766 01:05:19,160 --> 01:05:23,119 take all the transactions in my mempool, squish them into one transaction with one 767 01:05:23,119 --> 01:05:28,140 signature, and put that in my block. That would be really cool. 768 01:05:28,140 --> 01:05:33,070 But the issue there is, one, the signatures have to be interactive. And, two, they're 769 01:05:33,070 --> 01:05:39,349 on different messages. So in the case of C sending to G, it's a message of C to G. And 770 01:05:39,349 --> 01:05:42,989 then D sending to H, it's on that message. 771 01:05:42,989 --> 01:05:52,150 There are techniques, though. So BLS signatures do allow you to do this, so non-interactive 772 01:05:52,150 --> 01:05:56,400 aggregation of signatures. Which would be really cool, because then a block would only 773 01:05:56,400 --> 01:06:01,720 need one signature. And then BLS signatures themselves are a single point, not a point 774 01:06:01,720 --> 01:06:05,359 and a scalar. So it would just be 32 bytes. 775 01:06:05,359 --> 01:06:11,309 Well, you'd probably use a different curve. You'd have to. But, yeah, the idea of having 776 01:06:11,309 --> 01:06:18,979 these tiny signatures that serve as proof for the entire set is really cool. 777 01:06:18,979 --> 01:06:29,440 So this is a cool idea. So you're saying there isn't software? Or it's not on GitHub? 778 01:06:29,440 --> 01:06:38,359 OK, so the idea is, there is a bunch of software for this that mostly sipa-- or Pieter Wuille, 779 01:06:38,359 --> 01:06:43,490 his name on the internet is sipa. And he told me what that stood for. And it was some 780 01:06:43,490 --> 01:06:47,288 like super cheesy thing he made when he was in middle school.
781 01:06:47,288 --> 01:06:54,059 Anyway, he has been working on this kind of software. So has Greg, that guy who five years 782 01:06:54,059 --> 01:07:00,380 ago wrote about this stuff. A couple of people have been working on this. 783 01:07:00,380 --> 01:07:04,820 It's not publicly available code, which is kind of weird. But the reason is that they're 784 01:07:04,820 --> 01:07:10,599 working on Schnorr signatures and aggregation. And it was on GitHub in like a branch of 785 01:07:10,599 --> 01:07:16,700 libsecp, the repo they're using. And then all these altcoins were taking the code, and putting 786 01:07:16,700 --> 01:07:21,339 it into their altcoins, and saying, hey, we support Schnorr signatures. 787 01:07:21,339 --> 01:07:25,690 So for example the rogue key attack, it was the simple version, which didn't have this 788 01:07:25,690 --> 01:07:31,279 multiplied by the hash of the pub key. And they were like, wait, no, don't, we're working 789 01:07:31,279 --> 01:07:35,320 on this software. But we know all of these problems with it. 790 01:07:35,320 --> 01:07:39,829 You can't actually use this for signature aggregation. We're just sort of playing around, 791 01:07:39,829 --> 01:07:46,329 and trying different things. And so they actually pulled it from GitHub. 792 01:07:46,329 --> 01:07:54,559 And so I have seen it. You can see the old stuff. Yeah, the three years ago stuff that 793 01:07:54,559 --> 01:07:59,559 doesn't have any of this delinearization, or this thing that prevents Wagner's attack, 794 01:07:59,559 --> 01:08:04,069 all of that stuff is not in there. 795 01:08:04,069 --> 01:08:08,910 AUDIENCE: That's something that sounds really cool. But that's something [INAUDIBLE]. 796 01:08:08,910 --> 01:08:15,130 TADGE DRYJA: No that's not what they-- OK, maybe that's a cool attack. But they were 797 01:08:15,130 --> 01:08:20,010 like, oh, shoot, no. But it said in their repo, do not use this. This is research code.
798 01:08:20,010 --> 01:08:23,649 Do not use this in production. There are known vulnerabilities. 799 01:08:23,649 --> 01:08:28,738 But people are like, whatever, people didn't care. Well, yeah, there's some esoteric edge 800 01:08:28,738 --> 01:08:32,899 case vulnerability, but I'm going to solve that, because xyz. So they actually pulled 801 01:08:32,899 --> 01:08:34,940 the code. 802 01:08:34,940 --> 01:08:41,479 They are working on the code. I mean, the code works. I've seen it. But they are very 803 01:08:41,479 --> 01:08:46,009 sort of like, we don't want to put it publicly until we're sort of saying, OK, here's our 804 01:08:46,009 --> 01:08:49,649 final version that we want to put in bitcoin, because they know as soon as they sort of 805 01:08:49,649 --> 01:08:54,540 put it out there for review, all these altcoins will take-- I mean, you've implemented this 806 01:08:54,540 --> 01:09:04,589 in-- yeah. Not in Vertcoin, in Crypto Kernel. And you do use MuSig? 807 01:09:04,589 --> 01:09:06,670 AUDIENCE: I will do now. 808 01:09:06,670 --> 01:09:12,318 TADGE DRYJA: Oh, OK, cool, let me know. So they know that people are going to use this, 809 01:09:12,318 --> 01:09:17,440 because it's a cool signature scheme. And I've talked about this multiple times before. 810 01:09:17,440 --> 01:09:21,220 Once you have this Schnorr signature equation, there's all these other cool things you can 811 01:09:21,220 --> 01:09:25,279 do with your addresses and things like that. 812 01:09:25,279 --> 01:09:30,920 And so the idea here is scalability is one that, OK, now we can combine things, make 813 01:09:30,920 --> 01:09:35,859 it smaller. And also, we can hopefully help privacy this way, in that people who are not 814 01:09:35,859 --> 01:09:40,339 particularly concerned with their anonymity will say, yeah, I still want to use it, because 815 01:09:40,339 --> 01:09:46,920 I can save $1 on fees. So, yeah, I'll join this transaction with a bunch of other people.
816 01:09:46,920 --> 01:09:50,920 And then you've got the people who actually want anonymity are like, great, these are 817 01:09:50,920 --> 01:09:55,760 exactly the people I want to be associated with, the people who don't particularly care 818 01:09:55,760 --> 01:09:59,280 about anonymity. So it seems like a win-win. 819 01:09:59,280 --> 01:10:09,360 Right now, all the software requires a regular old signature. So, for example, if these are 820 01:10:09,360 --> 01:10:14,671 outputs that are currently being used, and you do this, I'd validate the output one at 821 01:10:14,671 --> 01:10:15,671 a time. 822 01:10:15,671 --> 01:10:18,960 And I say, OK, I'm looking at this output, looking at this input, there's no signature 823 01:10:18,960 --> 01:10:24,150 at all, fail. OK, the whole transaction fails. So you need a new output type, which says, 824 01:10:24,150 --> 01:10:31,131 OK, I'm using this new aggregate signature. So, actually, look at all the inputs. Don't 825 01:10:31,131 --> 01:10:36,270 validate them at all, and just validate at the end at the bottom. 826 01:10:36,270 --> 01:10:44,309 So that's going to be some different code. And you can't retroactively say, OK, all the 827 01:10:44,309 --> 01:10:47,639 existing outputs we can now spend with this new signature scheme. You can't really do 828 01:10:47,639 --> 01:10:51,139 that. So we'll have to make new addresses. 829 01:10:51,139 --> 01:10:57,440 And the new addresses will be bc1v, instead of q, new address type. And then we send to 830 01:10:57,440 --> 01:11:01,570 that. And now, you're allowed to spend from it using aggregate signatures. So all that 831 01:11:01,570 --> 01:11:03,780 software is mostly there. 832 01:11:03,780 --> 01:11:09,550 But it might take a while. You've got to get everyone on board. I'm curious to see what 833 01:11:09,550 --> 01:11:15,929 people will complain about. Segwit was a little bit more like, there's a lot to complain about. 
834 01:11:15,929 --> 01:11:23,079 Even I don't like parts of Segwit. There's a lot of parts I don't like and won't use. 835 01:11:23,079 --> 01:11:26,590 But this seems like a sort of clear win, where it's like, great, you can aggregate signatures, 836 01:11:26,590 --> 01:11:33,230 do all these cool things. I'm guessing there will be people saying, this is bad. And saying, 837 01:11:33,230 --> 01:11:42,050 no, because Satoshi's vision is one signature, one input. A coin is a chain of signatures. 838 01:11:42,050 --> 01:11:43,650 And this coin has no signatures. 839 01:11:43,650 --> 01:11:48,600 I'm guessing. I don't know what'll happen. Or maybe it might be smooth, and everyone 840 01:11:48,600 --> 01:11:54,469 is sort of like, yeah, this is obviously better, great paper, great math, great idea. Let's 841 01:11:54,469 --> 01:11:59,110 all activate it and use bitcoin this way. Who knows? 842 01:11:59,110 --> 01:12:06,300 So we'll see. It's pretty cool software, though. And you can use it in other projects. So Crypto 843 01:12:06,300 --> 01:12:10,610 Kernel is using Schnorr signatures, probably a bunch of different coins who are sort of 844 01:12:10,610 --> 01:12:14,780 more agile, in that it's easier to make changes, will probably start using this. 845 01:12:14,780 --> 01:12:23,260 Ethereum is still ECDSA though, right? There's no coins that are actually implementing this 846 01:12:23,260 --> 01:12:31,150 yet. Monero uses Ed25519, which is essentially a Schnorr signature. And I think Ripple can 847 01:12:31,150 --> 01:12:38,199 use Ed25519. So there's some different schemes that are more similar to this, but still don't 848 01:12:38,199 --> 01:12:39,989 have the definitions for aggregation. 849 01:12:39,989 --> 01:12:47,730 So, anyway, so this is cool stuff.
Next time, I will talk about-- next one, I'm not looking 850 01:12:47,730 --> 01:12:51,540 forward to making slides, because it's like a really complicated one that I don't even 851 01:12:51,540 --> 01:12:57,840 understand. Like, I sort of get it, but it's like, wait, how? OK, but how to mix different 852 01:12:57,840 --> 01:13:02,570 amounts. 853 01:13:02,570 --> 01:13:10,590 In this case, hey, it's great. But in many cases, you can see, I said the reason it doesn't 854 01:13:10,590 --> 01:13:18,179 work is because it's really obvious. But what if you could hide the amounts, but still enforce 855 01:13:18,179 --> 01:13:20,110 that the amounts were correct? 856 01:13:20,110 --> 01:13:24,860 So that's the idea of confidential transactions, which I'll talk about next time. And then 857 01:13:24,860 --> 01:13:29,949 it leads to MimbleWimble, which is even more complicated. I'm not going to promise that 858 01:13:29,949 --> 01:13:33,640 I'll explain the whole thing. But I'll touch on the ideas.