1 00:00:00,850 --> 00:00:03,220 The following content is provided under a Creative 2 00:00:03,220 --> 00:00:04,610 Commons license. 3 00:00:04,610 --> 00:00:06,820 Your support will help MIT OpenCourseWare 4 00:00:06,820 --> 00:00:10,910 continue to offer high quality educational resources for free. 5 00:00:10,910 --> 00:00:13,480 To make a donation or to view additional materials 6 00:00:13,480 --> 00:00:17,440 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,440 --> 00:00:18,313 at ocw.mit.edu. 8 00:00:22,100 --> 00:00:23,350 ALIN TOMESCU: My name is Alin. 9 00:00:23,350 --> 00:00:25,590 I work in Stata, in the Stata Center. 10 00:00:25,590 --> 00:00:27,760 I'm a PhD student there in my fifth year. 11 00:00:27,760 --> 00:00:30,730 And today we're going to be talking about one 12 00:00:30,730 --> 00:00:32,740 of our research projects called Catena. 13 00:00:32,740 --> 00:00:35,020 And Catena is a really nice way of using bitcoin 14 00:00:35,020 --> 00:00:37,390 to build append-only logs. 15 00:00:37,390 --> 00:00:39,250 Bitcoin itself is an append-only log. 16 00:00:39,250 --> 00:00:43,270 And a lot of people have been using it to put data in it. 17 00:00:43,270 --> 00:00:45,040 And I'll describe a really efficient way 18 00:00:45,040 --> 00:00:47,200 of doing that and its applications. 19 00:00:47,200 --> 00:00:49,930 And if there is time, we'll talk about attacks and maybe colored 20 00:00:49,930 --> 00:00:51,580 coins and some other stuff. 21 00:00:51,580 --> 00:00:53,830 We'll talk about the what, the how and the why. 22 00:00:53,830 --> 00:00:57,040 And that's the overview of the presentation. 23 00:00:57,040 --> 00:00:59,780 So let's talk about this problem called the equivocation 24 00:00:59,780 --> 00:01:00,280 problem. 25 00:01:00,280 --> 00:01:02,380 So what is this? 26 00:01:02,380 --> 00:01:05,740 In general, non-equivocation means saying the same thing 27 00:01:05,740 --> 00:01:06,850 to everybody.
28 00:01:06,850 --> 00:01:09,160 So for example, if you have a malicious service 29 00:01:09,160 --> 00:01:11,980 and you have Alice and Bob, the service 30 00:01:11,980 --> 00:01:14,170 should say the same thing to Alice and Bob. 31 00:01:14,170 --> 00:01:16,060 So it would make a bunch of statements. 32 00:01:16,060 --> 00:01:19,180 Let's say s1, s2, s3 over time and Alice and Bob 33 00:01:19,180 --> 00:01:21,047 would see all of these statements. 34 00:01:21,047 --> 00:01:23,380 So this is very similar to what bitcoin provides, right? 35 00:01:23,380 --> 00:01:26,050 In bitcoin you see block one, block two, block three. 36 00:01:26,050 --> 00:01:28,570 And everybody agrees on these blocks in sequence, right? 37 00:01:28,570 --> 00:01:30,230 Does that make sense? 38 00:01:30,230 --> 00:01:32,290 So this is non-equivocation and in some sense 39 00:01:32,290 --> 00:01:34,840 this is what bitcoin already offers. 40 00:01:34,840 --> 00:01:37,363 And in general with non-equivocation, 41 00:01:37,363 --> 00:01:39,280 what you might get is some of these statements 42 00:01:39,280 --> 00:01:41,450 might actually be false or incorrect. 43 00:01:41,450 --> 00:01:43,240 Non-equivocation doesn't guarantee you 44 00:01:43,240 --> 00:01:45,132 that this statement is a correct statement. 45 00:01:45,132 --> 00:01:46,840 But it just guarantees you that everybody 46 00:01:46,840 --> 00:01:47,882 sees the same statements. 47 00:01:47,882 --> 00:01:49,900 And then they can detect incorrect statements. 48 00:01:49,900 --> 00:01:51,220 In bitcoin you get a little bit more. 49 00:01:51,220 --> 00:01:52,970 You actually know that if this is a block, 50 00:01:52,970 --> 00:01:55,930 it's a valid block, assuming there are enough blocks on top 51 00:01:55,930 --> 00:01:57,400 of it, right? 52 00:01:57,400 --> 00:02:00,520 So equivocation means saying different things to different people.
53 00:02:00,520 --> 00:02:03,610 So for example, this malicious service at time four, 54 00:02:03,610 --> 00:02:06,520 it might show Bob a different statement than Alice. 55 00:02:06,520 --> 00:02:09,550 So Bob sees s4 and Alice sees s4 prime. 56 00:02:09,550 --> 00:02:11,590 This is what happens in bitcoin sometimes. 57 00:02:11,590 --> 00:02:13,750 And that's how you can double spend in bitcoin 58 00:02:13,750 --> 00:02:16,660 by putting the transaction here sending money to the merchant, 59 00:02:16,660 --> 00:02:18,160 and then putting another transaction 60 00:02:18,160 --> 00:02:20,110 here sending money back to you. 61 00:02:20,110 --> 00:02:21,960 Right, are you guys familiar with this? 62 00:02:21,960 --> 00:02:23,820 Yeah, OK. 63 00:02:23,820 --> 00:02:25,360 All right, so why does this matter? 64 00:02:25,360 --> 00:02:26,800 Let me give you a silly example. 65 00:02:26,800 --> 00:02:29,860 Suppose we have Jimmy and we have Jimmy's mom and Jimmy's 66 00:02:29,860 --> 00:02:30,695 dad, right? 67 00:02:30,695 --> 00:02:32,320 And Jimmy wants to go outside and play. 68 00:02:32,320 --> 00:02:36,070 But he knows that mom and dad usually don't let him play. 69 00:02:36,070 --> 00:02:39,400 So what he does is he tells dad, hey dad, 70 00:02:39,400 --> 00:02:41,210 mom said I can go outside. 71 00:02:41,210 --> 00:02:44,110 Right, and then he tells mom, hey mom. 72 00:02:44,110 --> 00:02:45,470 Dad said I can go outside. 73 00:02:45,470 --> 00:02:47,470 And let's say mom and dad are in different rooms 74 00:02:47,470 --> 00:02:48,910 and they're watching soap operas and they're not 75 00:02:48,910 --> 00:02:50,020 talking to one another. 76 00:02:50,020 --> 00:02:52,530 So they can't actually confirm that. 77 00:02:52,530 --> 00:02:54,713 You know, mom can't confirm that dad really said that. 78 00:02:54,713 --> 00:02:56,630 And dad can't really confirm that mom said that. 79 00:02:56,630 --> 00:02:58,642 But they both trust Jimmy.
80 00:02:58,642 --> 00:03:00,850 So you see how equivocation can be really problematic 81 00:03:00,850 --> 00:03:02,740 because now mom and dad will say sure, 82 00:03:02,740 --> 00:03:05,860 go outside as long as the other person said that, right? 83 00:03:05,860 --> 00:03:08,540 But let me give you a more practical example. 84 00:03:08,540 --> 00:03:11,580 So let's look at something called a public-key directory. 85 00:03:11,580 --> 00:03:15,200 A public-key directory allows you to map user's public keys-- 86 00:03:15,200 --> 00:03:16,367 a user name to a public key. 87 00:03:16,367 --> 00:03:18,283 Right, so here I have the public key for Alice 88 00:03:18,283 --> 00:03:19,930 and here I have the public key for Bob. 89 00:03:19,930 --> 00:03:22,138 And they look up each other's keys in this directory. 90 00:03:22,138 --> 00:03:24,490 And then they can set up a secure channel. 91 00:03:24,490 --> 00:03:27,370 How many of you guys use Whatsapp, for example? 92 00:03:27,370 --> 00:03:30,340 So the Whatsapp server has a public-key directory. 93 00:03:30,340 --> 00:03:32,050 And when I want to send you a message, 94 00:03:32,050 --> 00:03:34,480 I look up your phone number in that directory 95 00:03:34,480 --> 00:03:36,100 and I get your public key, right? 96 00:03:36,100 --> 00:03:38,940 If that directory equivocates, the following thing can happen. 97 00:03:38,940 --> 00:03:41,590 What the directory can do is it can create a new directory 98 00:03:41,590 --> 00:03:44,650 at time two where he puts a fake public key for Bob 99 00:03:44,650 --> 00:03:47,650 and he shows this to Alice, right? 100 00:03:47,650 --> 00:03:51,730 And at time two also, he creates another directory for Bob 101 00:03:51,730 --> 00:03:55,000 where he puts a fake public key for Alice, right? 
102 00:03:55,000 --> 00:03:57,430 So now the problem here is that when 103 00:03:57,430 --> 00:04:00,640 Alice checks in this directory, she looks at her own public key 104 00:04:00,640 --> 00:04:02,695 to make sure she's not impersonated. 105 00:04:02,695 --> 00:04:04,570 And Alice looks in this version and sees, OK. 106 00:04:04,570 --> 00:04:05,390 That is my public key. 107 00:04:05,390 --> 00:04:05,770 I'm good. 108 00:04:05,770 --> 00:04:06,940 She looks in this version, OK. 109 00:04:06,940 --> 00:04:07,857 This is my public key. 110 00:04:07,857 --> 00:04:08,495 I'm good. 111 00:04:08,495 --> 00:04:10,120 So now I'm ready to use this directory. 112 00:04:10,120 --> 00:04:12,580 And I'll look up Bob and I'll get his public key. 113 00:04:12,580 --> 00:04:14,680 But Alice will actually get the wrong public key. 114 00:04:14,680 --> 00:04:17,070 Does everybody see that? 115 00:04:17,070 --> 00:04:19,420 And similarly Bob will do the same. 116 00:04:19,420 --> 00:04:22,990 So Bob will look in his fork of the directory, right? 117 00:04:22,990 --> 00:04:25,493 And he looks up his key here and his key here. 118 00:04:25,493 --> 00:04:26,410 And he thinks he's OK. 119 00:04:26,410 --> 00:04:27,327 He's not impersonated. 120 00:04:27,327 --> 00:04:30,310 But in fact, Alice is impersonated there. 121 00:04:30,310 --> 00:04:31,540 OK? 122 00:04:31,540 --> 00:04:35,320 And now as a result, they will obtain fake keys 123 00:04:35,320 --> 00:04:36,340 for each other. 124 00:04:36,340 --> 00:04:37,960 And this man-in-the-middle attacker 125 00:04:37,960 --> 00:04:39,970 who knows the corresponding secret keys 126 00:04:39,970 --> 00:04:42,190 for these public keys can basically 127 00:04:42,190 --> 00:04:45,890 read all of their communications. 128 00:04:45,890 --> 00:04:47,460 Any questions about this? 129 00:04:47,460 --> 00:04:49,850 This is just one example of how equivocation 130 00:04:49,850 --> 00:04:51,680 can be really disastrous.
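[Editor's sketch] The forked directory just described can be modeled in a few lines. This is only an illustration under stated assumptions: the usernames, the key strings, and the helper names `digest` and `own_keys_unchanged` are invented here and are not part of any real directory service.

```python
import hashlib
import json

def digest(directory: dict) -> str:
    """Hash a directory snapshot (username -> list of keys) deterministically."""
    encoded = json.dumps(directory, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

# Version 1 is honest and shown to everybody (all key names are made up).
v1 = {"alice": ["pk_alice"], "bob": ["pk_bob"]}

# At time two the malicious directory forks: one view per victim.
v2_for_alice = {"alice": ["pk_alice"], "bob": ["pk_bob", "pk_bob_fake"]}
v2_for_bob = {"alice": ["pk_alice", "pk_alice_fake"], "bob": ["pk_bob"]}

def own_keys_unchanged(user: str, old: dict, new: dict) -> bool:
    """The self-check each user runs: did anyone add a key under my name?"""
    return old[user] == new[user]

# Both self-checks pass, so neither user notices anything wrong.
alice_ok = own_keys_unchanged("alice", v1, v2_for_alice)
bob_ok = own_keys_unchanged("bob", v1, v2_for_bob)

# Yet the two views differ, which is what equivocation means.
views_differ = digest(v2_for_alice) != digest(v2_for_bob)

# Alice takes the latest key listed for Bob, which is the fake one.
key_alice_uses_for_bob = v2_for_alice["bob"][-1]
```

If Alice and Bob could compare the digest of the version each was shown, the mismatch would expose the fork; the rest of the talk is about getting that guarantee without a side channel.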
131 00:04:51,680 --> 00:04:53,900 So in a public-key directory, if you can equivocate, 132 00:04:53,900 --> 00:04:55,910 you can show fake public keys for people 133 00:04:55,910 --> 00:04:58,263 and impersonate them. 134 00:04:58,263 --> 00:04:59,930 So in other words, it's really important 135 00:04:59,930 --> 00:05:02,720 that Alice and Bob both see the same directory. 136 00:05:02,720 --> 00:05:04,520 Because if Bob saw this directory, 137 00:05:04,520 --> 00:05:08,030 the same one Alice saw, then Bob would notice that this is not 138 00:05:08,030 --> 00:05:09,635 the public key he had. 139 00:05:09,635 --> 00:05:11,510 He would notice his first public key and then 140 00:05:11,510 --> 00:05:12,980 that there's a second one there. 141 00:05:12,980 --> 00:05:14,688 And then he would know he's impersonated. 142 00:05:14,688 --> 00:05:17,270 And he could let's say, talk to the New York Times, 143 00:05:17,270 --> 00:05:18,200 and say, look. 144 00:05:18,200 --> 00:05:20,040 This directory is impersonating me. 145 00:05:23,330 --> 00:05:25,640 So in conclusion, equivocation can be pretty bad. 146 00:05:25,640 --> 00:05:27,390 So this idea that you say different things 147 00:05:27,390 --> 00:05:29,360 to different people can be pretty disastrous. 148 00:05:29,360 --> 00:05:31,950 And what Catena does is it prevents that. 149 00:05:31,950 --> 00:05:35,150 So in general, if you have this malicious service that is 150 00:05:35,150 --> 00:05:38,535 backed by Catena, if it wants to say different things 151 00:05:38,535 --> 00:05:40,160 to different people. it cannot do that. 152 00:05:40,160 --> 00:05:42,980 It has to show the same thing to everybody. 153 00:05:42,980 --> 00:05:45,890 And the way we achieve that is by building on top of bitcoin. 154 00:05:45,890 --> 00:05:48,300 And that's what we're going to be talking about today. 
155 00:05:48,300 --> 00:05:51,320 So any questions about sort of the general setting 156 00:05:51,320 --> 00:05:54,030 of the problem and our goals here? 157 00:05:54,030 --> 00:05:55,260 So let's move on then. 158 00:05:55,260 --> 00:05:56,925 So why does this matter? 159 00:05:56,925 --> 00:05:58,800 So this matters for a bunch of other reasons, 160 00:05:58,800 --> 00:06:01,230 not just public-key directories and secure messaging. 161 00:06:01,230 --> 00:06:04,890 It matters because when you want to do secure software updates, 162 00:06:04,890 --> 00:06:06,120 equivocation is a problem. 163 00:06:06,120 --> 00:06:07,540 And I'll talk about that later. 164 00:06:07,540 --> 00:06:09,150 So for example, at some point bitcoin 165 00:06:09,150 --> 00:06:12,240 was concerned about malicious bitcoin binaries 166 00:06:12,240 --> 00:06:14,240 being published on the web and people 167 00:06:14,240 --> 00:06:15,990 like you and me downloading those binaries 168 00:06:15,990 --> 00:06:17,810 and getting our coins stolen. 169 00:06:17,810 --> 00:06:20,310 Right, and it turns out that that's an equivocation problem. 170 00:06:20,310 --> 00:06:22,620 Somebody is equivocating, right? 171 00:06:22,620 --> 00:06:24,610 It's equivocating about the bitcoin binary. 172 00:06:24,610 --> 00:06:26,670 It's showing us a fake version and maybe 173 00:06:26,670 --> 00:06:28,758 other people the real version. 174 00:06:28,758 --> 00:06:30,300 Secure messaging, like I said before, 175 00:06:30,300 --> 00:06:33,540 has applications here, and 176 00:06:33,540 --> 00:06:35,340 not just secure messaging but also the web. 177 00:06:35,340 --> 00:06:36,715 Like when you go on Facebook.com, 178 00:06:36,715 --> 00:06:38,617 you're looking up Facebook's public key.
179 00:06:38,617 --> 00:06:40,700 And if somebody lies to you about that public key, 180 00:06:40,700 --> 00:06:43,020 you could be going to a malicious service 181 00:06:43,020 --> 00:06:46,390 and you could be giving them your Facebook password. 182 00:06:46,390 --> 00:06:48,930 Does that make sense? 183 00:06:48,930 --> 00:06:50,860 And also it has applications in the sense 184 00:06:50,860 --> 00:06:53,830 that if you have a way of building an append-only log, 185 00:06:53,830 --> 00:06:56,470 you really have a way of building a blockchain, right, 186 00:06:56,470 --> 00:06:58,280 for whatever purpose you want. 187 00:06:58,280 --> 00:07:00,830 And we'll talk about that as well. 188 00:07:00,830 --> 00:07:03,940 So the 10,000-foot view of the system 189 00:07:03,940 --> 00:07:07,060 is we built this bitcoin-based append-only log. 190 00:07:07,060 --> 00:07:09,380 And the way to think about it is that bitcoin is already 191 00:07:09,380 --> 00:07:10,832 an append-only log. 192 00:07:10,832 --> 00:07:12,790 It's just that it's kind of inefficient to look 193 00:07:12,790 --> 00:07:13,305 in that log. 194 00:07:13,305 --> 00:07:15,430 If you want to pick certain things from the Bitcoin 195 00:07:15,430 --> 00:07:18,010 blockchain, like you put your certain bits and pieces of data 196 00:07:18,010 --> 00:07:19,060 there, you have to kind of download 197 00:07:19,060 --> 00:07:21,435 the whole thing to make sure you're not missing anything. 198 00:07:21,435 --> 00:07:23,020 And I'll tell you why soon. 199 00:07:23,020 --> 00:07:25,450 So instead of putting stuff naively 200 00:07:25,450 --> 00:07:26,950 in the Bitcoin blockchain, we put it 201 00:07:26,950 --> 00:07:28,720 in a more principled way. 202 00:07:28,720 --> 00:07:31,540 And in a sense we get a log and another log. 203 00:07:31,540 --> 00:07:34,660 We get our log and the bitcoin log. 204 00:07:34,660 --> 00:07:36,910 And this generalizes to other cryptocurrencies.
205 00:07:36,910 --> 00:07:38,530 Like you could do this in Litecoin, 206 00:07:38,530 --> 00:07:40,215 for example, or even in Ethereum. 207 00:07:40,215 --> 00:07:41,590 Though I don't think you guys yet 208 00:07:41,590 --> 00:07:45,090 talked about how the Ethereum blockchain works. 209 00:07:45,090 --> 00:07:48,310 And the cool thing about this Catena log that we're building 210 00:07:48,310 --> 00:07:51,560 is that the Catena log is as hard to fork as the bitcoin 211 00:07:51,560 --> 00:07:52,060 blockchain. 212 00:07:52,060 --> 00:07:54,880 If you want to fork our log, you have to fork bitcoin. 213 00:07:54,880 --> 00:07:56,740 However, unlike the Bitcoin blockchain, 214 00:07:56,740 --> 00:07:59,140 Catena is super efficient to verify. 215 00:07:59,140 --> 00:08:02,770 So in particular, remember, I described the log 216 00:08:02,770 --> 00:08:04,250 in terms of the statements in it. 217 00:08:04,250 --> 00:08:07,840 So if you have 10 statements, each statement 218 00:08:07,840 --> 00:08:09,572 will be 600 bytes to audit. 219 00:08:09,572 --> 00:08:11,530 So you don't have to download the whole Bitcoin 220 00:08:11,530 --> 00:08:14,820 blockchain to make sure you're not missing a statement. 221 00:08:14,820 --> 00:08:18,190 And you also have to download 80 bytes per bitcoin block. 222 00:08:18,190 --> 00:08:20,020 And we have a Java implementation of this. 223 00:08:20,020 --> 00:08:23,710 And if you guys are curious, you can go to my GitHub page, 224 00:08:23,710 --> 00:08:25,630 and take a look at the code. 225 00:08:25,630 --> 00:08:27,880 All right, so before we start, I know you guys already 226 00:08:27,880 --> 00:08:29,410 know a lot about how the Bitcoin blockchain works, 227 00:08:29,410 --> 00:08:31,720 but it's important that I reintroduce some terminology 228 00:08:31,720 --> 00:08:33,490 just so we're on the same page. 229 00:08:33,490 --> 00:08:35,000 So this is the bitcoin blockchain.
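[Editor's sketch] The audit cost quoted above (about 600 bytes per statement plus 80 bytes per Bitcoin block header) works out as follows. The 500,000-block chain length is a hypothetical round number chosen for the arithmetic, not a figure from the talk.

```python
STATEMENT_PROOF_BYTES = 600  # per-statement audit cost quoted in the talk
BLOCK_HEADER_BYTES = 80      # size of one Bitcoin block header

def audit_bytes(num_statements: int, num_blocks: int) -> int:
    """Total download for a client that audits a Catena log."""
    return (num_statements * STATEMENT_PROOF_BYTES
            + num_blocks * BLOCK_HEADER_BYTES)

# Ten statements on their own: 10 * 600 = 6,000 bytes.
statements_only = audit_bytes(10, 0)

# Adding headers for a hypothetical 500,000-block chain:
# 6,000 + 500,000 * 80 = 40,006,000 bytes, roughly 40 MB.
with_headers = audit_bytes(10, 500_000)
```

Under these assumptions the header chain dominates the cost, and it is still far smaller than downloading every full block.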
230 00:08:35,000 --> 00:08:36,490 We have a bunch of blocks connected 231 00:08:36,490 --> 00:08:37,690 by hash chain pointers. 232 00:08:37,690 --> 00:08:40,840 And we have Merkle trees of transactions. 233 00:08:40,840 --> 00:08:43,610 And you know, these arrows indicate hash pointers. 234 00:08:43,610 --> 00:08:47,840 It means that block n stores a hash of block n minus 1 in it, 235 00:08:47,840 --> 00:08:48,340 right? 236 00:08:48,340 --> 00:08:49,690 So in that sense. 237 00:08:49,690 --> 00:08:52,500 Block n has a hash pointer to block n minus 1. 238 00:08:55,320 --> 00:08:57,802 All right, and like I said, each tree has a Merkle tree. 239 00:08:57,802 --> 00:08:59,010 Each block has a Merkle tree. 240 00:08:59,010 --> 00:09:00,960 And everybody agrees on this chain 241 00:09:00,960 --> 00:09:03,870 of blocks via proof of work consensus, which all of you 242 00:09:03,870 --> 00:09:05,160 already know. 243 00:09:05,160 --> 00:09:08,670 And importantly, in the Merkle trees, we have transactions. 244 00:09:08,670 --> 00:09:10,230 And transactions can do two things. 245 00:09:10,230 --> 00:09:13,050 Right, a transaction can mint coins, create new coins. 246 00:09:13,050 --> 00:09:14,590 So here I have transaction a. 247 00:09:14,590 --> 00:09:17,670 It created four coins by storing them in an output. 248 00:09:17,670 --> 00:09:18,920 Are you familiar with outputs? 249 00:09:18,920 --> 00:09:21,462 So you guys already discussed transaction outputs and inputs. 250 00:09:21,462 --> 00:09:25,260 So that's what we're going over here again real quick. 251 00:09:25,260 --> 00:09:27,450 So an output specifies the number of coins 252 00:09:27,450 --> 00:09:30,300 and the public key of the owner of those coins, 253 00:09:30,300 --> 00:09:31,800 as you already know. 254 00:09:31,800 --> 00:09:33,630 And the second purpose of transactions 255 00:09:33,630 --> 00:09:35,940 is that they can transfer coins, right, and pay fees 256 00:09:35,940 --> 00:09:36,900 in the process. 
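[Editor's sketch] The hash pointers between blocks can be illustrated with a toy chain. This is deliberately simplified: real Bitcoin hashes an 80-byte header with double SHA-256, while the `block_hash` helper and the string "Merkle roots" below are stand-ins invented for this example.

```python
import hashlib

GENESIS_PREV = "00" * 32  # placeholder meaning "no previous block"

def block_hash(prev_hash: str, merkle_root: str) -> str:
    """Toy block hash over the previous hash and the Merkle root."""
    return hashlib.sha256((prev_hash + merkle_root).encode()).hexdigest()

def build_chain(merkle_roots):
    """Block n stores the hash of block n minus 1."""
    chain, prev = [], GENESIS_PREV
    for root in merkle_roots:
        h = block_hash(prev, root)
        chain.append({"prev_hash": prev, "merkle_root": root, "hash": h})
        prev = h
    return chain

def chain_is_valid(chain) -> bool:
    """Recompute every hash and check every pointer to the previous block."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block["hash"] != block_hash(block["prev_hash"], block["merkle_root"]):
            return False
        prev = block["hash"]
    return True

chain = build_chain(["root1", "root2", "root3"])
valid_before = chain_is_valid(chain)

# Changing an old block breaks its hash, so the tampering is detected.
chain[0]["merkle_root"] = "tampered"
valid_after = chain_is_valid(chain)
```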
257 00:09:36,900 --> 00:09:39,150 So here if you have a transaction b, 258 00:09:39,150 --> 00:09:42,510 this transaction b might say, hey, here's 259 00:09:42,510 --> 00:09:45,750 a signature from the owner of these coins here. 260 00:09:45,750 --> 00:09:47,850 Here's a signature in transaction b. 261 00:09:47,850 --> 00:09:49,670 And here's the new owner with public key 262 00:09:49,670 --> 00:09:52,200 b of three of those four coins. 263 00:09:52,200 --> 00:09:54,090 And one of those coins I'll just give it 264 00:09:54,090 --> 00:09:57,775 to the miners as a transaction fee. 265 00:09:57,775 --> 00:09:58,650 Does this make sense? 266 00:09:58,650 --> 00:10:00,620 How many of you are with me? 267 00:10:00,620 --> 00:10:03,770 All right, any questions about how transaction inputs 268 00:10:03,770 --> 00:10:04,658 and outputs work? 269 00:10:04,658 --> 00:10:06,200 So if you remember in the input here, 270 00:10:06,200 --> 00:10:09,520 you have a hash pointer to the output here. 271 00:10:09,520 --> 00:10:12,470 All right. 272 00:10:12,470 --> 00:10:14,263 So and the high level idea here is 273 00:10:14,263 --> 00:10:15,680 that the output is just the number 274 00:10:15,680 --> 00:10:16,877 of coins and a public key. 275 00:10:16,877 --> 00:10:18,710 And the input is a hash pointer to an output 276 00:10:18,710 --> 00:10:24,410 plus a digital signature from that output's public key, OK? 277 00:10:24,410 --> 00:10:26,180 And yeah, so what happens here is 278 00:10:26,180 --> 00:10:30,920 that transaction b is spending transaction a's first output. 279 00:10:30,920 --> 00:10:33,415 Right, that's the terminology that we're going to use. 280 00:10:33,415 --> 00:10:34,790 And I think you guys have already 281 00:10:34,790 --> 00:10:37,790 used terminology like this.
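[Editor's sketch] The input and output structure just recapped can be written down as plain data types. The class names, the string transaction ids, and the placeholder signature are assumptions made for illustration; real transactions are binary and the signature is actually verified.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Output:
    coins: int          # number of coins locked in this output
    owner_pubkey: str   # public key of the owner of those coins

@dataclass
class Input:
    prev_txid: str      # hash pointer to the transaction holding the output
    prev_index: int     # which output of that transaction is being spent
    signature: str      # stand-in for a signature by that output's key

@dataclass
class Transaction:
    txid: str
    inputs: List[Input]
    outputs: List[Output]

def fee(tx: Transaction, known: Dict[str, Transaction]) -> int:
    """Coins spent by the inputs minus coins created in the outputs."""
    spent = sum(known[i.prev_txid].outputs[i.prev_index].coins for i in tx.inputs)
    created = sum(o.coins for o in tx.outputs)
    return spent - created

# Transaction a mints four coins; transaction b spends a's first output,
# gives three coins to public key b, and leaves one coin as the fee.
tx_a = Transaction("a", [], [Output(4, "pk_a")])
tx_b = Transaction("b", [Input("a", 0, "sig_by_pk_a")], [Output(3, "pk_b")])
fee_b = fee(tx_b, {"a": tx_a})
```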
282 00:10:37,790 --> 00:10:40,430 And in addition, what we're going to talk about today 283 00:10:40,430 --> 00:10:42,440 is the fact that in these bitcoin transactions 284 00:10:42,440 --> 00:10:43,670 you can actually embed data. 285 00:10:43,670 --> 00:10:48,080 And I think you touched briefly on this concept of OP_RETURN 286 00:10:48,080 --> 00:10:50,060 transaction outputs. 287 00:10:50,060 --> 00:10:54,740 Right, so this is an output that sends coins to a public key. 288 00:10:54,740 --> 00:10:57,800 But here I can have an output that sends coins to nobody. 289 00:10:57,800 --> 00:10:59,810 It just specifies some data. 290 00:10:59,810 --> 00:11:02,480 And in fact, I'll use that data to specify the statements 291 00:11:02,480 --> 00:11:04,870 that I was talking about earlier. 292 00:11:04,870 --> 00:11:06,290 So the high level point here 293 00:11:06,290 --> 00:11:10,340 is that you can embed data in bitcoin transactions using 294 00:11:10,340 --> 00:11:11,102 these OP_RETURN outputs. 295 00:11:11,102 --> 00:11:13,310 And there's a bunch of other ways to do it, actually. 296 00:11:13,310 --> 00:11:15,500 Like initially what people did is 297 00:11:15,500 --> 00:11:17,272 they put the data as the public key. 298 00:11:17,272 --> 00:11:18,980 They just set the public key to the data. 299 00:11:18,980 --> 00:11:22,570 And in that sense, they kind of wasted bitcoins. 300 00:11:22,570 --> 00:11:25,280 Right, they said hey, send these three bitcoins 301 00:11:25,280 --> 00:11:27,890 to this public key, which is just some random data. 302 00:11:27,890 --> 00:11:30,080 But nobody would know the corresponding secret key 303 00:11:30,080 --> 00:11:31,580 of that public key. 304 00:11:31,580 --> 00:11:33,860 So therefore those coins would be burned. 305 00:11:33,860 --> 00:11:36,770 Did you cover this in class already? 306 00:11:36,770 --> 00:11:37,880 Maybe, maybe a little bit.
307 00:11:37,880 --> 00:11:39,602 So but that's kind of inefficient 308 00:11:39,602 --> 00:11:41,060 because if you remember, the miners 309 00:11:41,060 --> 00:11:43,460 have to build this UTXO set. 310 00:11:43,460 --> 00:11:45,770 And they have to keep this output that 311 00:11:45,770 --> 00:11:47,930 has this bad public key in their memory 312 00:11:47,930 --> 00:11:50,673 forever because nobody's going to be able to spend it. 313 00:11:50,673 --> 00:11:51,840 So we don't want to do that. 314 00:11:51,840 --> 00:11:55,730 We want to build a nice system, a system that treats bitcoin 315 00:11:55,730 --> 00:11:57,060 nicely, the bitcoin miners. 316 00:11:57,060 --> 00:12:00,265 So that's why we use these OP_RETURN outputs. 317 00:12:00,265 --> 00:12:01,640 All right, so the high level here 318 00:12:01,640 --> 00:12:03,290 is that Alice gives Bob three bitcoins. 319 00:12:03,290 --> 00:12:06,083 And the miners collect a bitcoin as a fee. 320 00:12:06,083 --> 00:12:08,000 And of course, you can keep doing this, right? 321 00:12:08,000 --> 00:12:10,940 Like Bob can give Carol these two bitcoins later by creating 322 00:12:10,940 --> 00:12:14,460 another transaction with an input referring to that output 323 00:12:14,460 --> 00:12:17,830 and with an output specifying Carol's public key. 324 00:12:17,830 --> 00:12:20,930 All right, and the high level idea of bitcoin is that you 325 00:12:20,930 --> 00:12:22,610 don't-- you cannot double spend coins. 326 00:12:22,610 --> 00:12:25,640 And what that means is that a transaction output can only 327 00:12:25,640 --> 00:12:28,492 be referred to by a single transaction input. 328 00:12:28,492 --> 00:12:30,950 So this thing right here in bitcoin where I have two inputs 329 00:12:30,950 --> 00:12:34,440 spending an output cannot happen, right? 330 00:12:34,440 --> 00:12:36,870 How many of you are familiar with this already? 331 00:12:36,870 --> 00:12:37,495 OK, good.
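[Editor's sketch] A data-carrying output of the kind discussed above can be built by hand at the script level. OP_RETURN is opcode 0x6a in Bitcoin Script, and payloads of 1 to 75 bytes are pushed with a single length byte; the helper name `op_return_script` and the sample payload are assumptions for this example, and longer payloads (which need a different push opcode) are out of scope here.

```python
OP_RETURN = 0x6A  # Bitcoin Script opcode marking an output as unspendable

def op_return_script(data: bytes) -> bytes:
    """Build the script OP_RETURN <data> for a short payload."""
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles 1 to 75 bytes of data")
    # One opcode byte, one push-length byte, then the payload itself.
    return bytes([OP_RETURN, len(data)]) + data

# A made-up statement embedded as data.
script = op_return_script(b"statement 1")
```

Because such an output is provably unspendable, nodes do not need to keep it in the UTXO set, which is the "treating bitcoin nicely" point made in the talk.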
332 00:12:37,495 --> 00:12:39,120 So actually this is the essential trick 333 00:12:39,120 --> 00:12:40,170 that Catena leverages. 334 00:12:40,170 --> 00:12:42,010 And we'll talk about it soon. 335 00:12:42,010 --> 00:12:43,560 And yeah, the moral of the story, 336 00:12:43,560 --> 00:12:46,290 the reason I told you all of this is because you know, 337 00:12:46,290 --> 00:12:48,990 basically if you have proof of work consensus in the bitcoin 338 00:12:48,990 --> 00:12:50,700 sense, you cannot do double spends. 339 00:12:50,700 --> 00:12:52,890 So this thing right here, that I said before, 340 00:12:52,890 --> 00:12:56,157 just cannot occur in the Bitcoin blockchain unless you break 341 00:12:56,157 --> 00:12:58,740 the assumptions, right, unless you have more mining power than 342 00:12:58,740 --> 00:13:01,050 you should. 343 00:13:01,050 --> 00:13:03,270 So you either have tx2 or tx2 344 00:13:03,270 --> 00:13:05,750 prime, but not both, right? 345 00:13:05,750 --> 00:13:08,880 And what Catena really realizes is 346 00:13:08,880 --> 00:13:11,790 that if I put statements in these transactions now, 347 00:13:11,790 --> 00:13:14,370 what that means is that I can only have one second statement. 348 00:13:14,370 --> 00:13:16,110 I cannot have two second statements. 349 00:13:16,110 --> 00:13:19,110 I cannot equivocate about the second statement if I just 350 00:13:19,110 --> 00:13:22,737 restrict my way of issuing statements in this way. 351 00:13:22,737 --> 00:13:24,570 I put the first statement in the transaction 352 00:13:24,570 --> 00:13:25,945 and the second statement I put it 353 00:13:25,945 --> 00:13:28,830 in a transaction that spends the first one. 354 00:13:28,830 --> 00:13:31,403 So does everybody agree that if I do things in this way 355 00:13:31,403 --> 00:13:33,570 and I want to equivocate about the second statement, 356 00:13:33,570 --> 00:13:36,520 I would have to double spend? 
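[Editor's sketch] The key insight above, that equivocating about a statement would require a double spend, can be shown with a toy ledger. The `ToyLedger` class and the transaction ids are invented for illustration, and each transaction is assumed to carry exactly one spendable output; this is not the Catena implementation.

```python
class ToyLedger:
    """Tracks unspent outputs by transaction id; one output per transaction."""

    def __init__(self):
        self.unspent = set()
        self.statements = {}

    def genesis(self, txid: str) -> None:
        """Create the initial funded output that starts the log."""
        self.unspent.add(txid)

    def append(self, txid: str, spends: str, statement: str) -> bool:
        """Issue a statement in a transaction that spends the previous one."""
        if spends not in self.unspent:
            return False  # output already spent: this would be a double spend
        self.unspent.remove(spends)
        self.unspent.add(txid)
        self.statements[txid] = statement
        return True

ledger = ToyLedger()
ledger.genesis("tx0")
first = ledger.append("tx1", spends="tx0", statement="s1")
second = ledger.append("tx2", spends="tx1", statement="s2")

# A conflicting second statement must also spend tx1, so it is rejected.
forked = ledger.append("tx2_prime", spends="tx1", statement="s2_prime")
```

Only one of tx2 and tx2 prime can ever be accepted, so there can be only one second statement.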
357 00:13:36,520 --> 00:13:40,680 So that's the key insight behind our system. 358 00:13:40,680 --> 00:13:42,220 Any questions about this so far? 359 00:13:46,147 --> 00:13:48,480 You know it's really hard to talk if you guys talk back, 360 00:13:48,480 --> 00:13:51,150 it's way easier. 361 00:13:51,150 --> 00:13:51,650 Yeah? 362 00:13:51,650 --> 00:13:52,525 AUDIENCE: A question. 363 00:13:52,525 --> 00:13:55,078 And maybe, I think I'm getting this right. 364 00:13:55,078 --> 00:13:56,870 If you're setting up the first transaction, 365 00:13:56,870 --> 00:13:58,890 then you're adding data on? 366 00:13:58,890 --> 00:14:01,130 You're burning bitcoins every time you're doing it, 367 00:14:01,130 --> 00:14:02,880 so at some point, you're going to run out. 368 00:14:02,880 --> 00:14:03,810 ALIN TOMESCU: That's an excellent question. 369 00:14:03,810 --> 00:14:05,790 So let's-- we'll go over that. 370 00:14:05,790 --> 00:14:07,950 But the idea is that in this output, 371 00:14:07,950 --> 00:14:10,260 I won't burn any bitcoins. 372 00:14:10,260 --> 00:14:12,610 I'll actually specify my own public key here. 373 00:14:12,610 --> 00:14:14,820 And I'll just send the bitcoins back to myself. 374 00:14:14,820 --> 00:14:18,450 And in the process I'll pay a fee to issue this transaction. 375 00:14:18,450 --> 00:14:19,860 Does that make sense? 376 00:14:19,860 --> 00:14:20,400 Yeah? 377 00:14:20,400 --> 00:14:22,690 And we'll talk about it later. 378 00:14:22,690 --> 00:14:25,280 OK, so real quickly, what did previous work 379 00:14:25,280 --> 00:14:26,030 do regarding this? 380 00:14:26,030 --> 00:14:29,090 So how many of you are familiar with blockstack? 381 00:14:29,090 --> 00:14:31,900 1, 2, only two people? 382 00:14:31,900 --> 00:14:33,820 OK how many of you are familiar with Keybase? 383 00:14:33,820 --> 00:14:37,950 OK, so blockstack and Keybase actually post statements 384 00:14:37,950 --> 00:14:39,240 in the Bitcoin blockchain. 
385 00:14:39,240 --> 00:14:41,210 And they're both public directories. 386 00:14:41,210 --> 00:14:43,380 They map user names to public keys. 387 00:14:43,380 --> 00:14:45,870 And for example, Keybase, what Keybase does 388 00:14:45,870 --> 00:14:47,922 is they take this Merkle root hash, 389 00:14:47,922 --> 00:14:49,380 and they put it in the transaction. 390 00:14:49,380 --> 00:14:51,720 And then six hours later, they take the new root hash, 391 00:14:51,720 --> 00:14:54,030 and they put it in another transaction, and so on. 392 00:14:54,030 --> 00:14:56,280 Every six hours, they post the transaction. 393 00:14:56,280 --> 00:14:59,640 But unfortunately, they don't actually do what Catena does. 394 00:14:59,640 --> 00:15:03,360 So they don't have their new transaction spend the old one. 395 00:15:03,360 --> 00:15:05,010 And as a result, if you're trying 396 00:15:05,010 --> 00:15:09,210 to make sure you see all of the statements for Keybase, 397 00:15:09,210 --> 00:15:10,860 you don't have a lot of recourse other 398 00:15:10,860 --> 00:15:14,910 than just downloading each block and looking in the block 399 00:15:14,910 --> 00:15:17,752 for all of the relevant transactions. 400 00:15:17,752 --> 00:15:19,710 Another thing you could do is you could sort of 401 00:15:19,710 --> 00:15:21,750 trust the bitcoin miners-- 402 00:15:21,750 --> 00:15:25,200 the bitcoin full nodes to filter the blocks for you. 403 00:15:25,200 --> 00:15:28,877 So you could contact a bunch of bitcoin full nodes 404 00:15:28,877 --> 00:15:29,460 and say, look. 405 00:15:29,460 --> 00:15:31,260 I'm only interested in transactions 406 00:15:31,260 --> 00:15:34,478 that have a certain OP_RETURN prefix in the data. 407 00:15:34,478 --> 00:15:35,770 And they could do that for you. 408 00:15:35,770 --> 00:15:37,312 But unfortunately, bitcoin full nodes 409 00:15:37,312 --> 00:15:39,060 could also lie to you very easily. 410 00:15:39,060 --> 00:15:41,040 And there is no cost for them to lie to you.
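[Editor's sketch] One way to see why uncorrelated transactions are hard to audit: if a lying node drops one of them from its filtered answer, nothing in the remaining ones reveals the gap. When each transaction spends the previous one, a dropped middle transaction breaks the chain. The dictionary layout and the `chain_is_complete` helper are assumptions for this example, and the sketch does not cover a node withholding the newest transactions.

```python
def chain_is_complete(txs, genesis_txid: str) -> bool:
    """Each transaction must spend the one before it, starting at genesis."""
    expected = genesis_txid
    for tx in txs:
        if tx["spends"] != expected:
            return False  # a link is missing or out of order
        expected = tx["txid"]
    return True

# What an honest node would return for a three-statement log.
full = [
    {"txid": "tx1", "spends": "tx0", "statement": "s1"},
    {"txid": "tx2", "spends": "tx1", "statement": "s2"},
    {"txid": "tx3", "spends": "tx2", "statement": "s3"},
]

# A lying node hides the second statement from the client.
hidden = [full[0], full[2]]

honest_ok = chain_is_complete(full, "tx0")
hiding_detected = not chain_is_complete(hidden, "tx0")
```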
411 00:15:41,040 --> 00:15:43,080 Everybody can be a bitcoin full node. 412 00:15:43,080 --> 00:15:45,300 So then it becomes a very bandwidth intensive process 413 00:15:45,300 --> 00:15:48,840 because you have to ask a lot of full nodes 414 00:15:48,840 --> 00:15:51,310 to deal with the fact that someone might lie to you. 415 00:15:51,310 --> 00:15:53,340 But you either need to download full blocks 416 00:15:53,340 --> 00:15:55,930 to find let's say, a missing statement like this one 417 00:15:55,930 --> 00:15:58,505 that a bitcoin full node might hide from you, 418 00:15:58,505 --> 00:16:00,630 or you can trust the majority of bitcoin full nodes 419 00:16:00,630 --> 00:16:03,302 to not hide statements, which is not very good, right? 420 00:16:03,302 --> 00:16:05,010 So I don't want to trust these full nodes 421 00:16:05,010 --> 00:16:06,750 because all of you guys could run a full node right 422 00:16:06,750 --> 00:16:07,630 now in the bitcoin network. 423 00:16:07,630 --> 00:16:08,838 It doesn't cost you anything. 424 00:16:08,838 --> 00:16:12,220 And if I talk to your malicious full node, I could be screwed. 425 00:16:12,220 --> 00:16:15,750 So our work just says, look. 426 00:16:15,750 --> 00:16:19,350 Instead of issuing transactions in sort 427 00:16:19,350 --> 00:16:23,430 of an uncorrelated fashion, just do the following thing. 428 00:16:23,430 --> 00:16:25,860 Every transaction you issue should spend the previous one. 429 00:16:25,860 --> 00:16:27,780 So as a result, if someone wants to equivocate 430 00:16:27,780 --> 00:16:29,520 about the third statement, they have 431 00:16:29,520 --> 00:16:32,880 to double spend like I said before. 432 00:16:32,880 --> 00:16:33,570 Right? 433 00:16:33,570 --> 00:16:34,350 Yeah? 434 00:16:34,350 --> 00:16:35,975 AUDIENCE: So what's the connection back 435 00:16:35,975 --> 00:16:38,050 to Keybase and blockstack again? 
436 00:16:38,050 --> 00:16:41,890 ALIN TOMESCU: Yeah, so Keybase and blockstack 437 00:16:41,890 --> 00:16:42,930 are public directories. 438 00:16:42,930 --> 00:16:47,800 And what they do is they want to prevent themselves from-- 439 00:16:47,800 --> 00:16:49,990 is there a whiteboard I can draw here? 440 00:16:49,990 --> 00:16:54,990 Yeah so, let's say, you know, if you remember the picture 441 00:16:54,990 --> 00:16:57,360 from the beginning, I can have a public directory 442 00:16:57,360 --> 00:16:59,220 that evolves over time, right? 443 00:16:59,220 --> 00:17:01,200 So this is v1 of the directory, and it 444 00:17:01,200 --> 00:17:03,790 might have the right public keys for Alice and Bob. 445 00:17:03,790 --> 00:17:05,770 But at v2, the directory might do this. 446 00:17:05,770 --> 00:17:07,890 It might do v2. 447 00:17:07,890 --> 00:17:10,980 It might have Alice, Bob, but then put a fake key for Alice. 448 00:17:10,980 --> 00:17:14,369 And at V2 prime, it might do Alice, Bob as before, 449 00:17:14,369 --> 00:17:16,109 and put a fake key for Bob. 450 00:17:16,109 --> 00:17:19,050 So in other words it keeps the directory append-only. 451 00:17:19,050 --> 00:17:22,470 But it just adds fake public keys for the right people 452 00:17:22,470 --> 00:17:23,500 in the right version. 453 00:17:23,500 --> 00:17:26,650 So here this one is shown to Bob. 454 00:17:26,650 --> 00:17:28,860 And here this one is shown to Alice. 455 00:17:28,860 --> 00:17:32,190 So now Alice will use this fake key for Bob. 456 00:17:32,190 --> 00:17:36,330 So she'll encrypt a message with b prime for Bob. 457 00:17:36,330 --> 00:17:40,980 And now the attacker can easily decrypt this message 458 00:17:40,980 --> 00:17:42,780 because he has the secret key. 459 00:17:42,780 --> 00:17:45,810 So what the attacker can do then is re-encrypt it 460 00:17:45,810 --> 00:17:48,570 with the right public key for Bob 461 00:17:48,570 --> 00:17:50,822 and now he can read Alice's messages. 
462 00:17:50,822 --> 00:17:52,530 And the whole idea is that the attacker-- 463 00:17:52,530 --> 00:17:57,690 you know, this b prime is the public key, is pk b prime, 464 00:17:57,690 --> 00:17:58,410 let's say. 465 00:17:58,410 --> 00:18:01,140 But the attacker has this sk b prime. 466 00:18:01,140 --> 00:18:04,110 He knows sk b prime because the attacker put this in there. 467 00:18:04,110 --> 00:18:06,568 The attacker being, really, blockstack or Keybase. 468 00:18:06,568 --> 00:18:08,610 And of course, they're not attackers in the sense 469 00:18:08,610 --> 00:18:09,730 that they want to be good guys. 470 00:18:09,730 --> 00:18:11,640 But they're going to be compromised eventually. 471 00:18:11,640 --> 00:18:13,057 So they want to prevent themselves 472 00:18:13,057 --> 00:18:14,750 from doing things like these. 473 00:18:14,750 --> 00:18:17,300 Does that sort of answer your question? 474 00:18:17,300 --> 00:18:18,870 Yeah. 475 00:18:18,870 --> 00:18:22,130 All right, so yeah, a really simple summary. 476 00:18:22,130 --> 00:18:24,630 If I had two slides to summarize our work, this would be it. 477 00:18:24,630 --> 00:18:26,310 Right, they would be these two. 478 00:18:26,310 --> 00:18:27,730 Look, don't do things this way. 479 00:18:27,730 --> 00:18:29,180 Do them this way. 480 00:18:29,180 --> 00:18:32,040 All right, so let's see, let's look a little bit 481 00:18:32,040 --> 00:18:32,640 at the design. 482 00:18:32,640 --> 00:18:36,390 So remember we have these authorities that 483 00:18:36,390 --> 00:18:38,390 could equivocate about statements they issued 484 00:18:38,390 --> 00:18:39,580 like blockstack and Keybase. 485 00:18:39,580 --> 00:18:42,600 So what we propose is look, these authorities 486 00:18:42,600 --> 00:18:45,930 can run a log server, a Catena log server. 487 00:18:45,930 --> 00:18:49,943 And they start with some funds locked in some output.
488 00:18:49,943 --> 00:18:51,360 And what they can do first is they 489 00:18:51,360 --> 00:18:53,940 can issue this genesis transaction 490 00:18:53,940 --> 00:18:55,290 to start a new log. 491 00:18:55,290 --> 00:18:58,080 So for example, Keybase would issue this genesis transaction 492 00:18:58,080 --> 00:19:02,310 starting the log of their public directory Merkle roots. 493 00:19:02,310 --> 00:19:04,080 And this genesis transaction can be 494 00:19:04,080 --> 00:19:05,760 thought of as the public key of the log. 495 00:19:05,760 --> 00:19:08,250 Once you have this genesis transaction, 496 00:19:08,250 --> 00:19:10,830 you know its transaction ID. 497 00:19:10,830 --> 00:19:12,750 You can verify any future statements, 498 00:19:12,750 --> 00:19:17,610 and you can implicitly prevent equivocation about statements. 499 00:19:17,610 --> 00:19:20,290 And what the log server is going to do 500 00:19:20,290 --> 00:19:21,780 is it's going to take these coins 501 00:19:21,780 --> 00:19:24,450 and send them back to the server to answer your question. 502 00:19:24,450 --> 00:19:26,283 So if there was a public key, if these coins 503 00:19:26,283 --> 00:19:27,825 are owned by some public key, they're 504 00:19:27,825 --> 00:19:29,500 just sent back to the same public key here, 505 00:19:29,500 --> 00:19:32,310 paying some fees in the process. 506 00:19:32,310 --> 00:19:34,230 Right, so we're not burning coins in the sense 507 00:19:34,230 --> 00:19:35,730 that we're just paying fees to the 508 00:19:35,730 --> 00:19:36,897 miners, which we have to do. 509 00:19:40,220 --> 00:19:42,120 OK, so now what you can do is if you 510 00:19:42,120 --> 00:19:43,578 want to issue the first statement, 511 00:19:43,578 --> 00:19:46,020 you create a transaction. 512 00:19:46,020 --> 00:19:48,480 You send the coins from this output to this other output. 513 00:19:48,480 --> 00:19:49,855 You pay some fees in the process.
514 00:19:49,855 --> 00:19:53,850 You put your statement in an OP_RETURN output. 515 00:19:53,850 --> 00:19:56,760 And as a result, if this log server wants to equivocate, 516 00:19:56,760 --> 00:19:59,140 it has to again, double spend here, 517 00:19:59,140 --> 00:20:02,460 which it cannot do unless it has enough mining power. 518 00:20:02,460 --> 00:20:04,230 So Keybase and blockstack, if they 519 00:20:04,230 --> 00:20:05,730 were to use a system like this, they 520 00:20:05,730 --> 00:20:08,518 could prevent themselves from equivocating. 521 00:20:08,518 --> 00:20:09,810 And this can keep going, right? 522 00:20:09,810 --> 00:20:11,185 So you issue another transaction. 523 00:20:11,185 --> 00:20:12,660 Spend the previous output. 524 00:20:12,660 --> 00:20:13,680 Put the new statement. 525 00:20:13,680 --> 00:20:14,190 Yes? 526 00:20:14,190 --> 00:20:16,148 AUDIENCE: This doesn't seem like a new problem. 527 00:20:16,148 --> 00:20:20,795 How have authorities prevented equivocation in the past? 528 00:20:20,795 --> 00:20:22,920 ALIN TOMESCU: This doesn't seem like a new problem. 529 00:20:22,920 --> 00:20:24,180 How did authorities do it? 530 00:20:24,180 --> 00:20:25,200 The problem is not new. 531 00:20:25,200 --> 00:20:27,150 The problem is eternal. 532 00:20:27,150 --> 00:20:28,260 So you are correct there. 533 00:20:28,260 --> 00:20:29,860 How did they do it in the past? 534 00:20:29,860 --> 00:20:32,578 They just used a Byzantine consensus algorithm. 535 00:20:32,578 --> 00:20:34,870 So in some sense this is what we're doing here as well. 536 00:20:34,870 --> 00:20:37,940 We're just piggybacking on top of Bitcoin's 537 00:20:37,940 --> 00:20:39,690 Byzantine consensus algorithm. 538 00:20:39,690 --> 00:20:42,960 AUDIENCE: So you're rolling down to a newer Byzantine consensus 539 00:20:42,960 --> 00:20:44,630 algorithm basically. 540 00:20:44,630 --> 00:20:47,005 ALIN TOMESCU: Sure, I'm not sure what rolling down means.
541 00:20:47,005 --> 00:20:49,197 But yeah, we're piggybacking on top of bitcoin. 542 00:20:49,197 --> 00:20:51,030 The idea is that look, Byzantine consensus 543 00:20:51,030 --> 00:20:53,520 is actually quite complex to get right. 544 00:20:53,520 --> 00:20:56,920 We already have a publicly verifiable Byzantine consensus 545 00:20:56,920 --> 00:20:57,420 algorithm. 546 00:20:57,420 --> 00:20:58,500 It's bitcoin. 547 00:20:58,500 --> 00:21:02,280 Why can't we use it to verify, let's say, a log of statements 548 00:21:02,280 --> 00:21:03,600 super efficiently? 549 00:21:03,600 --> 00:21:07,380 So up until our work, people didn't seem to do this. 550 00:21:07,380 --> 00:21:08,550 So Keybase didn't do this. 551 00:21:08,550 --> 00:21:09,900 Blockstack didn't do this. 552 00:21:09,900 --> 00:21:12,150 They kind of forced you to download the entire bitcoin 553 00:21:12,150 --> 00:21:14,575 blockchain to verify let's say, three statements. 554 00:21:14,575 --> 00:21:16,950 So in our case, you only have to download a few kilobytes 555 00:21:16,950 --> 00:21:18,870 of data to verify these three statements, 556 00:21:18,870 --> 00:21:21,120 assuming you have the bitcoin block headers, right, 557 00:21:21,120 --> 00:21:23,652 which we think is a step forward. 558 00:21:23,652 --> 00:21:25,110 And it's sort of like the right way 559 00:21:25,110 --> 00:21:28,260 to use these systems should be you 560 00:21:28,260 --> 00:21:30,330 know, the efficient way, not the inefficient way. 561 00:21:30,330 --> 00:21:32,095 Because bandwidth is expensive, right? 562 00:21:32,095 --> 00:21:32,970 Computation is cheap. 563 00:21:32,970 --> 00:21:34,220 Bandwidth is expensive. 564 00:21:36,860 --> 00:21:40,680 All right, so anyway the idea is that if the log server becomes 565 00:21:40,680 --> 00:21:43,140 malicious, if Keybase or blockstack gets hacked, 566 00:21:43,140 --> 00:21:46,050 they cannot equivocate about the third statement.
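[Editor's sketch] The chaining rule described here, where every statement's transaction spends the previous one, can be illustrated with a toy model. This is not real Bitcoin code: ToyTx, ToyChain, and the string transaction IDs are hypothetical names made up for illustration, and the "chain" only enforces the one rule that matters for the argument, namely that an output can be spent once.

```python
# Toy model (not real Bitcoin): each log statement lives in a transaction
# that spends the single output of the previous one.
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class ToyTx:
    txid: str                 # hypothetical identifier
    spends: Optional[str]     # txid whose output this spends (None for the funding tx)
    statement: Optional[str]  # data that would sit in the OP_RETURN output


class ToyChain:
    """Accepts a transaction only if the output it spends is still unspent."""

    def __init__(self) -> None:
        self.unspent: Set[str] = set()

    def accept(self, tx: ToyTx) -> bool:
        if tx.spends is not None:
            if tx.spends not in self.unspent:
                return False  # double spend (or unknown output): rejected
            self.unspent.remove(tx.spends)
        self.unspent.add(tx.txid)
        return True


chain = ToyChain()
genesis = ToyTx("gtx", None, None)
tx1 = ToyTx("tx1", "gtx", "s1")
tx2 = ToyTx("tx2", "tx1", "s2")
tx3 = ToyTx("tx3", "tx2", "s3")
tx3_prime = ToyTx("tx3'", "tx2", "s3'")  # a conflicting third statement

results = [chain.accept(t) for t in (genesis, tx1, tx2, tx3, tx3_prime)]
print(results)  # [True, True, True, True, False]
```

The conflicting third statement is rejected because it tries to spend an output that tx3 already consumed; in the real system that rejection comes from Bitcoin's own double-spend protection rather than from anything the log server promises.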
567 00:21:46,050 --> 00:21:49,320 They can only issue one unique third statement. 568 00:21:49,320 --> 00:21:50,820 And the advantages are, you know, 569 00:21:50,820 --> 00:21:52,100 it's hard to fork this log. 570 00:21:52,100 --> 00:21:54,202 It's hard to equivocate about the third statement. 571 00:21:54,202 --> 00:21:55,410 But it's efficient to verify. 572 00:21:55,410 --> 00:21:57,493 And I'll walk you through how clients verify soon. 573 00:22:00,590 --> 00:22:03,020 The disadvantages are that if I want 574 00:22:03,020 --> 00:22:05,390 to know that this is the second statement in the log, 575 00:22:05,390 --> 00:22:07,250 I have to wait for six more blocks 576 00:22:07,250 --> 00:22:10,832 to be built on top of this statement's block, right? 577 00:22:10,832 --> 00:22:13,040 Just like in bitcoin, you have to wait for six blocks 578 00:22:13,040 --> 00:22:15,380 to make sure a transaction is confirmed, right? 579 00:22:15,380 --> 00:22:16,180 Why do you do that? 580 00:22:16,180 --> 00:22:18,055 The reason you do that is because there could 581 00:22:18,055 --> 00:22:20,030 be another transaction here, double spending 582 00:22:20,030 --> 00:22:23,280 because sometimes there are accidental forks in bitcoin 583 00:22:23,280 --> 00:22:26,330 and things like that. 584 00:22:26,330 --> 00:22:27,890 What are some other disadvantages 585 00:22:27,890 --> 00:22:29,057 that you guys can point out? 586 00:22:32,600 --> 00:22:33,224 Yeah? 587 00:22:33,224 --> 00:22:35,040 AUDIENCE: It's going to be expensive. 588 00:22:35,040 --> 00:22:36,450 ALIN TOMESCU: It's going to be expensive to issue 589 00:22:36,450 --> 00:22:36,992 these transactions. 590 00:22:36,992 --> 00:22:40,240 So Alin-- Alin and I share the same name. 591 00:22:40,240 --> 00:22:41,880 So Alin is pointing out that you know, 592 00:22:41,880 --> 00:22:43,380 every time I issue these statements, 593 00:22:43,380 --> 00:22:44,735 I have to pay a fee, right?
594 00:22:44,735 --> 00:22:46,860 And if you remember, the fees were quite ridiculous 595 00:22:46,860 --> 00:22:48,240 in bitcoin. 596 00:22:48,240 --> 00:22:49,530 So that's a problem, right? 597 00:22:49,530 --> 00:22:51,330 So let's see, is that the next thing? 598 00:22:51,330 --> 00:22:53,400 The next thing was you have to issue-- 599 00:22:53,400 --> 00:22:56,170 you can only issue a statement every 10 minutes, right? 600 00:22:56,170 --> 00:22:58,128 So if you want to issue statements really fast, 601 00:22:58,128 --> 00:22:59,250 you can't do that. 602 00:22:59,250 --> 00:23:04,150 All right, like Alin said you have to pay bitcoin transaction 603 00:23:04,150 --> 00:23:04,900 fees. 604 00:23:04,900 --> 00:23:07,510 And the other problem is that you don't get freshness 605 00:23:07,510 --> 00:23:11,500 in the sense that it's kind of easy for this log server 606 00:23:11,500 --> 00:23:13,353 to hide from you the latest statement. 607 00:23:13,353 --> 00:23:15,520 You know, unless you have a lot of these log servers 608 00:23:15,520 --> 00:23:18,100 and you ask many of them, hey, what's the latest statement? 609 00:23:18,100 --> 00:23:20,872 And they show you back the latest statement. 610 00:23:20,872 --> 00:23:23,080 If there is just one log server and it's compromised, 611 00:23:23,080 --> 00:23:24,538 it could always pretend no, no, no. 612 00:23:24,538 --> 00:23:25,850 This is the latest statement. 613 00:23:25,850 --> 00:23:27,760 And if you don't trust it, the best recourse you have 614 00:23:27,760 --> 00:23:30,250 is to download the full block and look for the statement 615 00:23:30,250 --> 00:23:31,360 yourself. 616 00:23:31,360 --> 00:23:34,030 So you don't get freshness. 617 00:23:34,030 --> 00:23:35,240 Those are some disadvantages. 618 00:23:35,240 --> 00:23:38,020 Now let's look at how clients audit this log.
619 00:23:38,020 --> 00:23:39,928 So I was claiming that it's very efficient 620 00:23:39,928 --> 00:23:41,470 to get these statements and make sure 621 00:23:41,470 --> 00:23:44,087 that no equivocation happened. 622 00:23:44,087 --> 00:23:45,670 So let's say, you have a Catena client 623 00:23:45,670 --> 00:23:47,753 and you're running on your phone with this client. 624 00:23:47,753 --> 00:23:50,590 And your goal is to get that list of statements. 625 00:23:50,590 --> 00:23:53,020 And there's the Catena log server over there in the back. 626 00:23:53,020 --> 00:23:54,520 And there's the bitcoin peer to peer 627 00:23:54,520 --> 00:23:58,105 network which at the moment has about 11,000 nodes. 628 00:23:58,105 --> 00:23:59,980 And remember, I said the first thing you need 629 00:23:59,980 --> 00:24:03,790 is the genesis transaction. 630 00:24:03,790 --> 00:24:05,690 Does everybody sort of understand 631 00:24:05,690 --> 00:24:08,150 that if you get the wrong Genesis transaction, 632 00:24:08,150 --> 00:24:09,800 you're completely screwed, right? 633 00:24:09,800 --> 00:24:11,988 Because it's very easy to equivocate 634 00:24:11,988 --> 00:24:14,030 if you have the wrong Genesis transaction, right? 635 00:24:14,030 --> 00:24:18,720 I mean, you know there's the right GTX here where you have, 636 00:24:18,720 --> 00:24:19,520 let's say, a s1. 637 00:24:19,520 --> 00:24:23,180 And then you have s2 in their own transactions, right? 638 00:24:23,180 --> 00:24:26,810 But if there's another GTX prime here and you're using that one, 639 00:24:26,810 --> 00:24:30,970 you're going to get s1 prime, s2 prime, different statements. 640 00:24:30,970 --> 00:24:35,610 So if Alice uses GTX but Bob uses GTX prime, 641 00:24:35,610 --> 00:24:38,237 Alice and Bob are back to square one. 642 00:24:38,237 --> 00:24:40,820 So in some sense, you might ask, OK, so then what's the point? 643 00:24:40,820 --> 00:24:41,903 What have you solved here? 
644 00:24:41,903 --> 00:24:43,940 I still need to get this GTX, right? 645 00:24:43,940 --> 00:24:46,820 So what we claim is that this is a step forward because you only 646 00:24:46,820 --> 00:24:47,725 have to do this once. 647 00:24:47,725 --> 00:24:49,100 Once you've got this GTX, you can 648 00:24:49,100 --> 00:24:52,010 be sure you're never equivocated to, right? 649 00:24:52,010 --> 00:24:53,750 Whereas in the past, you would have 650 00:24:53,750 --> 00:24:55,820 to for each individual statement, 651 00:24:55,820 --> 00:24:57,560 you'd have to do additional checks 652 00:24:57,560 --> 00:24:59,268 to make sure you're not being equivocated 653 00:24:59,268 --> 00:25:02,180 to, like you would have to ask a full node, let's say. 654 00:25:02,180 --> 00:25:05,035 Right, so as long as you have the right GTX, you're good. 655 00:25:05,035 --> 00:25:06,410 And how do you get the right GTX? 656 00:25:06,410 --> 00:25:09,403 Well, usually you ship it with the software on your phone. 657 00:25:09,403 --> 00:25:11,070 And there's some problems there as well, 658 00:25:11,070 --> 00:25:13,833 like there is no problem solved in computer 659 00:25:13,833 --> 00:25:14,750 science in some sense. 660 00:25:14,750 --> 00:25:18,470 But you know, we're trying to make progress here. 661 00:25:18,470 --> 00:25:20,690 OK, so let's say you have the right GTX because it 662 00:25:20,690 --> 00:25:22,402 got shipped with your software. 663 00:25:22,402 --> 00:25:24,860 Now the next thing you want to do is get the block headers. 664 00:25:24,860 --> 00:25:26,870 So you have header i, but there are 665 00:25:26,870 --> 00:25:28,438 some new headers being posted. 666 00:25:28,438 --> 00:25:30,980 Let's say the bitcoin peer to peer network sends them to you. 667 00:25:30,980 --> 00:25:31,938 You have these headers. 668 00:25:31,938 --> 00:25:33,500 You verify the proof of work, right? 669 00:25:33,500 --> 00:25:35,950 So this only costs you 80 bytes per header, right?
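[Editor's sketch] The proof-of-work check on an 80-byte header can be sketched as follows, assuming the standard Bitcoin header layout (version, previous block hash, Merkle root, time, compact "bits" target, nonce) and double SHA-256 hashing. The function names, the make_header helper, and the deliberately easy EASY_BITS target are illustrative assumptions, not part of Catena. A real client would also check that each header points to the previous one and that the difficulty is the one the network expects; both are omitted here.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def header_has_valid_pow(header: bytes) -> bool:
    """True if the header's hash is at or below the target it declares."""
    if len(header) != 80:
        return False
    (bits,) = struct.unpack_from("<I", header, 72)  # bits field sits at offset 72
    # The hash is compared as a little-endian integer against the target.
    return int.from_bytes(double_sha256(header), "little") <= bits_to_target(bits)


def make_header(prev_hash: bytes, merkle_root: bytes, time: int, bits: int, nonce: int) -> bytes:
    """Hypothetical helper: serialize a version-1 header (4+32+32+4+4+4 = 80 bytes)."""
    return struct.pack("<I32s32sIII", 1, prev_hash, merkle_root, time, bits, nonce)


# Demo with a made-up, very easy target so the search finishes instantly.
EASY_BITS = 0x2000FFFF
nonce = 0
while not header_has_valid_pow(make_header(b"\x00" * 32, b"\x11" * 32, 0, EASY_BITS, nonce)):
    nonce += 1
toy_header = make_header(b"\x00" * 32, b"\x11" * 32, 0, EASY_BITS, nonce)
print(len(toy_header), header_has_valid_pow(toy_header))
```

The check needs nothing but the header itself, which is why headers can be fetched from any source and verified locally.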
670 00:25:35,950 --> 00:25:38,330 Does everyone see that this is very cheap? 671 00:25:38,330 --> 00:25:43,040 So far, so far I have the GTX, which is let's say 235 bytes. 672 00:25:43,040 --> 00:25:44,650 And now I'm downloading some headers. 673 00:25:44,650 --> 00:25:46,567 And now I'm ready to ask the log server what's 674 00:25:46,567 --> 00:25:48,703 the first statement in the log, right? 675 00:25:48,703 --> 00:25:50,120 And what the log server will do is 676 00:25:50,120 --> 00:25:52,370 he's going to reply with the transaction 677 00:25:52,370 --> 00:25:55,370 with the statement, which is 600 bytes and the Merkle proof. 678 00:25:55,370 --> 00:25:58,430 So all of this is 600 bytes, actually. 679 00:25:58,430 --> 00:26:01,280 Right, and now what the Catena client will do 680 00:26:01,280 --> 00:26:03,530 is he'll check the Merkle proof against one 681 00:26:03,530 --> 00:26:06,740 of the headers so to see in which headers does it fit. 682 00:26:06,740 --> 00:26:09,200 And then he'll also check that the input here 683 00:26:09,200 --> 00:26:12,950 has a valid signature from the public key in the output here. 684 00:26:16,000 --> 00:26:18,870 All right, I want at least one question about this. 685 00:26:18,870 --> 00:26:19,440 Yeah? 686 00:26:19,440 --> 00:26:21,773 AUDIENCE: Sorry I came in late, but is the Catena client 687 00:26:21,773 --> 00:26:23,940 over there similar to the SPV? 688 00:26:23,940 --> 00:26:25,440 ALIN TOMESCU: Yeah, so that exactly. 689 00:26:25,440 --> 00:26:26,460 It's an SPV client. 690 00:26:26,460 --> 00:26:29,050 Yeah, so the idea is that we want SPV clients. 691 00:26:29,050 --> 00:26:31,140 We don't want these mobile phone clients 692 00:26:31,140 --> 00:26:34,500 to download 150 gigabytes of data. 693 00:26:34,500 --> 00:26:37,560 We want them to download let's say 40 megabytes worth of block 694 00:26:37,560 --> 00:26:41,045 headers, which they can discard very quickly as they verify. 
695 00:26:41,045 --> 00:26:42,420 And then we want them to download 696 00:26:42,420 --> 00:26:44,610 600 bytes per statement, but still 697 00:26:44,610 --> 00:26:47,370 be sure that they saw all of the statements in sequence, 698 00:26:47,370 --> 00:26:48,840 and that there was no equivocation. 699 00:26:51,550 --> 00:26:52,050 Yeah? 700 00:26:53,988 --> 00:26:55,530 AUDIENCE: So in our previous classes, 701 00:26:55,530 --> 00:26:59,890 we discussed, if you can have a full node and an SPV node, 702 00:26:59,890 --> 00:27:03,310 do these sort of vulnerabilities exist with the [INAUDIBLE] 703 00:27:03,310 --> 00:27:04,000 client? 704 00:27:04,000 --> 00:27:05,583 ALIN TOMESCU: So which vulnerabilities 705 00:27:05,583 --> 00:27:06,830 that you did talk about? 706 00:27:06,830 --> 00:27:07,930 AUDIENCE: I forget. 707 00:27:07,930 --> 00:27:08,590 ALIN TOMESCU: Did you talk about-- 708 00:27:08,590 --> 00:27:11,020 AUDIENCE: There was just a box that was like less secure. 709 00:27:11,020 --> 00:27:14,440 And then there was another box that was something else bad. 710 00:27:14,440 --> 00:27:18,210 There was one about [INAUDIBLE]. 711 00:27:18,210 --> 00:27:21,160 The clients would lie to you about-- 712 00:27:21,160 --> 00:27:24,680 If you say, here's some transactions 713 00:27:24,680 --> 00:27:26,240 or here are some unspent outputs, 714 00:27:26,240 --> 00:27:29,280 then they could just tell you something different. 715 00:27:29,280 --> 00:27:30,030 ALIN TOMESCU: Yes. 716 00:27:30,030 --> 00:27:30,440 Yeah? 717 00:27:30,440 --> 00:27:30,730 Sorry. 718 00:27:30,730 --> 00:27:32,188 AUDIENCE: Well, they can't tell you 719 00:27:32,188 --> 00:27:33,980 that transactions exist or don't exist. 
720 00:27:33,980 --> 00:27:35,693 They can just not tell you-- 721 00:27:35,693 --> 00:27:37,610 ALIN TOMESCU: Yeah, they can hide transactions 722 00:27:37,610 --> 00:27:39,920 from you, which gets back to the freshness issue 723 00:27:39,920 --> 00:27:41,120 that we discussed. 724 00:27:41,120 --> 00:27:43,460 They could also-- this block header 725 00:27:43,460 --> 00:27:46,610 could be a header for an invalid block. 726 00:27:46,610 --> 00:27:49,850 But remember, that before you accept this tx1, 727 00:27:49,850 --> 00:27:51,860 you wait for enough proof of work, 728 00:27:51,860 --> 00:27:54,710 you wait for more block headers on top of this guy 729 00:27:54,710 --> 00:27:56,750 to sort of get some assurance that no, this 730 00:27:56,750 --> 00:27:59,270 was a valid block because a bunch of other miners built 731 00:27:59,270 --> 00:28:00,370 on top of it. 732 00:28:00,370 --> 00:28:00,870 Right? 733 00:28:03,037 --> 00:28:05,370 So as long as you're willing to trust that the miners do 734 00:28:05,370 --> 00:28:07,440 the right thing, which they have an incentive 735 00:28:07,440 --> 00:28:09,750 to do, you should be good. 736 00:28:09,750 --> 00:28:12,350 But like you said, what's your name? 737 00:28:12,350 --> 00:28:13,170 AUDIENCE: Anne. 738 00:28:13,170 --> 00:28:13,962 ALIN TOMESCU: Anne? 739 00:28:13,962 --> 00:28:16,260 Like Anne said, there are actually bigger problems 740 00:28:16,260 --> 00:28:17,048 with SPV clients. 741 00:28:17,048 --> 00:28:18,840 And if there is time, we can talk about it. 742 00:28:18,840 --> 00:28:22,650 But it's actually easier to trick SPV clients to fork them. 743 00:28:22,650 --> 00:28:26,100 And there's something called a generalized Vector76 attack. 744 00:28:26,100 --> 00:28:28,650 Have any of you guys heard about this? 745 00:28:28,650 --> 00:28:30,420 So it's like a pre-mining attack but it's 746 00:28:30,420 --> 00:28:32,490 a bit easier to pull off.
747 00:28:32,490 --> 00:28:34,740 Actually, it's a lot easier to pull off on an SPV node 748 00:28:34,740 --> 00:28:35,540 than on a full node. 749 00:28:35,540 --> 00:28:37,748 And if there's time at the end, we can talk about it. 750 00:28:37,748 --> 00:28:39,810 If there isn't, you can read our paper, 751 00:28:39,810 --> 00:28:41,260 which is online on my website. 752 00:28:41,260 --> 00:28:44,160 And you can read about these pre-mining attacks 753 00:28:44,160 --> 00:28:46,645 that work easier for SPV nodes. 754 00:28:46,645 --> 00:28:48,270 But anyway, this can keep going, right? 755 00:28:48,270 --> 00:28:50,710 You get block headers, 80 bytes each. 756 00:28:50,710 --> 00:28:52,570 You ask the log server, hey what's 757 00:28:52,570 --> 00:28:53,830 the next statement in the log? 758 00:28:53,830 --> 00:28:56,140 You get a Merkle proof and a transaction. 759 00:28:56,140 --> 00:28:58,000 And then you verify the Merkle proof. 760 00:28:58,000 --> 00:29:00,430 You put this transaction in one of these blocks 761 00:29:00,430 --> 00:29:03,210 and you verify that it spends the previous one. 762 00:29:03,210 --> 00:29:05,320 Right, and as a result, you implicitly 763 00:29:05,320 --> 00:29:07,733 by doing this verification, by checking that hey, 764 00:29:07,733 --> 00:29:09,400 this is a transaction in a valid block. 765 00:29:09,400 --> 00:29:11,770 This block has enough stuff built on top of it 766 00:29:11,770 --> 00:29:14,020 and this transaction spends this guy here, 767 00:29:14,020 --> 00:29:16,042 you implicitly prevent equivocation. 768 00:29:16,042 --> 00:29:18,250 Right, then you don't have to download anything else. 769 00:29:18,250 --> 00:29:21,210 Right, you only have to download these Merkle proofs 770 00:29:21,210 --> 00:29:23,410 and transactions and these block headers.
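[Editor's sketch] The per-statement check described here has two parts: the transaction's Merkle proof must lead to the Merkle root in some block header, and the transaction must spend the previous statement's transaction. A minimal sketch follows, assuming Bitcoin-style double SHA-256 Merkle trees. The function names and the (sibling, sibling_is_left) proof encoding are assumptions made for illustration, not Catena's actual wire format, and the signature check and the wait for confirmations are left out.

```python
import hashlib
from typing import List, Tuple


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_from_proof(txid: bytes, proof: List[Tuple[bytes, bool]]) -> bytes:
    """Fold a proof from leaf to root; each step is (sibling_hash, sibling_is_left)."""
    node = txid
    for sibling, sibling_is_left in proof:
        node = double_sha256(sibling + node) if sibling_is_left else double_sha256(node + sibling)
    return node


def build_merkle_root(txids: List[bytes]) -> bytes:
    """Demo helper: compute the root the way Bitcoin does (duplicate last hash on odd levels)."""
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def accept_next_statement(prev_txid: bytes, spent_txid: bytes, txid: bytes,
                          proof: List[Tuple[bytes, bool]], header_merkle_root: bytes) -> bool:
    """Accept only if the tx spends the previous statement's tx and is in the block."""
    return spent_txid == prev_txid and merkle_root_from_proof(txid, proof) == header_merkle_root


# Demo: a block with four made-up transactions; the statement is the third one.
txids = [double_sha256(bytes([i])) for i in range(4)]
root = build_merkle_root(txids)
proof = [(txids[3], False), (double_sha256(txids[0] + txids[1]), True)]
prev_txid = double_sha256(b"previous statement tx")

ok = accept_next_statement(prev_txid, prev_txid, txids[2], proof, root)
forged = accept_next_statement(prev_txid, txids[0], txids[2], proof, root)
print(ok, forged)  # True False
```

The proof grows with the logarithm of the number of transactions in the block, which is what keeps the per-statement download to a few hundred bytes.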
771 00:29:23,410 --> 00:29:26,240 Whereas in previous work, you could be missing these s1's, 772 00:29:26,240 --> 00:29:28,000 these s2's, they could be hidden away 773 00:29:28,000 --> 00:29:30,280 in some other branch of the Merkle tree. 774 00:29:30,280 --> 00:29:32,380 And you'd have to do peer to peer bloom 775 00:29:32,380 --> 00:29:34,270 filtering on the full nodes. 776 00:29:34,270 --> 00:29:36,092 And those full nodes could lie to you. 777 00:29:39,023 --> 00:29:40,940 Yeah, so the bandwidth is actually very small. 778 00:29:40,940 --> 00:29:43,250 So suppose we have 500k block headers-- 779 00:29:43,250 --> 00:29:45,320 I think bitcoin has a bit more right now-- 780 00:29:45,320 --> 00:29:46,820 which are 80 bytes each and we have 781 00:29:46,820 --> 00:29:49,790 10,000 statements in this log, which are 600 bytes each. 782 00:29:49,790 --> 00:29:53,057 Then we only need about 46 megabytes, right? 783 00:29:53,057 --> 00:29:54,890 What's the other way of doing it? 784 00:29:54,890 --> 00:29:59,410 You have to download hundreds of gigabytes. 785 00:29:59,410 --> 00:30:02,320 All right, so let's talk about scalability a little bit. 786 00:30:02,320 --> 00:30:04,750 So suppose the system gets deployed widely. 787 00:30:04,750 --> 00:30:09,900 Let's say Whatsapp starts to use the system to witness-- 788 00:30:09,900 --> 00:30:13,000 to publish their public directory in bitcoin, right? 789 00:30:13,000 --> 00:30:15,100 And everybody, a lot of you here have Whatsapp. 790 00:30:15,100 --> 00:30:16,270 And this is you guys. 791 00:30:16,270 --> 00:30:18,445 Let's say there are 200,000 people using Whatsapp. 792 00:30:18,445 --> 00:30:20,710 I think there's more like a billion. 793 00:30:20,710 --> 00:30:21,980 So what are they going to do?
794 00:30:21,980 --> 00:30:24,730 Remember that part of the verification protocol 795 00:30:24,730 --> 00:30:26,170 is asking for these block headers 796 00:30:26,170 --> 00:30:28,450 from the peer to peer network, right? 797 00:30:28,450 --> 00:30:30,170 And in fact, if you're an SPV client, 798 00:30:30,170 --> 00:30:32,770 you usually open up around eight connections to the peer 799 00:30:32,770 --> 00:30:34,180 to peer network. 800 00:30:34,180 --> 00:30:37,300 And if you're a full node in the bitcoin peer to peer network, 801 00:30:37,300 --> 00:30:41,910 you usually have around 117 incoming connections. 802 00:30:41,910 --> 00:30:43,300 That's how much you support. 803 00:30:43,300 --> 00:30:46,395 You support 117 incoming connections as a full node. 804 00:30:46,395 --> 00:30:48,520 So that means in total, you support about a million 805 00:30:48,520 --> 00:30:49,930 incoming connections. 806 00:30:49,930 --> 00:30:51,700 So you know, this guy supports a million. 807 00:30:51,700 --> 00:30:55,840 But we need about 1.6 million connections 808 00:30:55,840 --> 00:30:57,830 from these 200,000 clients, right? 809 00:30:57,830 --> 00:30:59,830 So it's a bit of a problem if you 810 00:30:59,830 --> 00:31:04,020 deploy Catena and it becomes wildly popular. 811 00:31:04,020 --> 00:31:05,770 Being a bit optimistic here, but you know, 812 00:31:05,770 --> 00:31:08,440 let's say that happened, right? 813 00:31:08,440 --> 00:31:09,800 So how can we fix this? 814 00:31:09,800 --> 00:31:12,550 How can we avoid this problem because in this case, what 815 00:31:12,550 --> 00:31:14,110 we would basically be doing is we 816 00:31:14,110 --> 00:31:16,030 would be accidentally DDOSing bitcoin. 817 00:31:16,030 --> 00:31:17,500 And we don't want to do that. 818 00:31:17,500 --> 00:31:20,042 Does everybody see that there's a problem here, first of all? 819 00:31:23,310 --> 00:31:25,270 OK, so the idea is very simple.
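[Editor's sketch] The bandwidth and connection figures quoted in the talk can be checked with a few lines of arithmetic. All inputs below are the numbers stated by the speaker (80-byte headers, 600 bytes per statement, about 11,000 full nodes, about 117 incoming connections per node, 8 connections per SPV client, 200,000 clients); the variable names are only for illustration.

```python
# Bandwidth for an auditing client, using the figures from the talk.
HEADER_BYTES = 80
STATEMENT_BYTES = 600          # transaction plus Merkle proof
num_headers = 500_000
num_statements = 10_000

total_bytes = num_headers * HEADER_BYTES + num_statements * STATEMENT_BYTES
total_mb = total_bytes / 1_000_000   # 40 MB of headers + 6 MB of statements

# Connection capacity of the peer to peer network versus client demand.
full_nodes = 11_000
incoming_per_node = 117
clients = 200_000
connections_per_client = 8

supply = full_nodes * incoming_per_node      # incoming slots available
demand = clients * connections_per_client    # slots the clients would want
shortfall = demand - supply

print(total_mb)   # 46.0
print(supply)     # 1287000
print(demand)     # 1600000
print(shortfall)  # 313000
```

The exact product for supply is about 1.29 million, which the talk rounds to "about a million"; either way it falls short of the 1.6 million connections that 200,000 clients would open.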
820 00:31:25,270 --> 00:31:28,005 We just introduced something called a header relay network. 821 00:31:28,005 --> 00:31:29,880 And what that means is look, you don't really 822 00:31:29,880 --> 00:31:32,400 have to ask for these block headers from the Bitcoin peer 823 00:31:32,400 --> 00:31:33,360 to peer network. 824 00:31:33,360 --> 00:31:36,210 You could just outsource these block headers anywhere 825 00:31:36,210 --> 00:31:37,900 because they're publicly verifiable. 826 00:31:37,900 --> 00:31:41,050 Right, the block headers have proof of work on them. 827 00:31:41,050 --> 00:31:46,530 So you can use volunteer nodes that sort of push block 828 00:31:46,530 --> 00:31:49,230 headers to whoever asks for them. 829 00:31:49,230 --> 00:31:51,570 You could use blockchain explorers like blockchain.info, 830 00:31:51,570 --> 00:31:52,350 right? 831 00:31:52,350 --> 00:31:53,580 You could use Facebook. 832 00:31:53,580 --> 00:31:56,350 You could just post block headers on Facebook. 833 00:31:56,350 --> 00:31:57,988 Right, like in a Facebook feed. 834 00:31:57,988 --> 00:31:59,280 You could use Twitter for that. 835 00:31:59,280 --> 00:32:01,420 You could use GitHub gists. 836 00:32:01,420 --> 00:32:02,893 You know, so you could-- 837 00:32:02,893 --> 00:32:04,560 there's a lot of ways to implement this. 838 00:32:04,560 --> 00:32:07,230 The simplest way is have servers and have 839 00:32:07,230 --> 00:32:11,070 them send these headers to whoever asks for them. 840 00:32:11,070 --> 00:32:12,900 So it's easy to scale in that sense 841 00:32:12,900 --> 00:32:15,270 because if you now ask these header relay 842 00:32:15,270 --> 00:32:17,577 network for the block headers, you 843 00:32:17,577 --> 00:32:20,160 know, it's much easier to scale this than to scale the bitcoin 844 00:32:20,160 --> 00:32:23,190 peer to peer network, which has to do a bit more than just 845 00:32:23,190 --> 00:32:23,820 block headers. 846 00:32:23,820 --> 00:32:25,050 They have to verify blocks. 
847 00:32:25,050 --> 00:32:28,330 They have to verify signatures. 848 00:32:28,330 --> 00:32:28,960 Yeah? 849 00:32:28,960 --> 00:32:29,865 AUDIENCE: Did you consider having 850 00:32:29,865 --> 00:32:31,232 the clients being peer to peer? 851 00:32:31,232 --> 00:32:32,690 ALIN TOMESCU: Yes, so another way-- 852 00:32:32,690 --> 00:32:36,510 and I think we talk about that in the paper-- 853 00:32:36,510 --> 00:32:39,600 is you can implement the header relay network as a peer to peer 854 00:32:39,600 --> 00:32:41,620 network on top of the clients. 855 00:32:41,620 --> 00:32:45,920 Yeah, so that's another way to do it. 856 00:32:45,920 --> 00:32:48,350 There's some subtleties there that you have to get right. 857 00:32:48,350 --> 00:32:49,690 But you can do it, I think. 858 00:32:49,690 --> 00:32:49,840 Yeah? 859 00:32:49,840 --> 00:32:52,423 AUDIENCE: Would you expect that if a company like Whatsapp 860 00:32:52,423 --> 00:32:55,670 decided to adopt Catena, they would run their own servers 861 00:32:55,670 --> 00:32:57,918 to make sure that there was the capacity for it? 862 00:32:57,918 --> 00:32:59,710 ALIN TOMESCU: For the header relay network? 863 00:32:59,710 --> 00:33:00,830 AUDIENCE: Yes. 864 00:33:00,830 --> 00:33:03,260 ALIN TOMESCU: I mean, I would just be speculating, 865 00:33:03,260 --> 00:33:05,750 wishful thinking. 866 00:33:05,750 --> 00:33:07,730 They could. 867 00:33:07,730 --> 00:33:10,490 There is some problem with this header relay network as well. 868 00:33:10,490 --> 00:33:11,865 And we talk about it in the paper 869 00:33:11,865 --> 00:33:13,960 because this header relay network could withhold 870 00:33:13,960 --> 00:33:14,930 block headers from you. 871 00:33:14,930 --> 00:33:16,520 So you do have to distribute it.
872 00:33:16,520 --> 00:33:18,320 Like usually you don't want to just trust 873 00:33:18,320 --> 00:33:21,050 Whatsapp who's also doing the statements, who's 874 00:33:21,050 --> 00:33:23,360 also pushing the statements in the blockchain. 875 00:33:23,360 --> 00:33:25,460 You don't want to trust them to also give you 876 00:33:25,460 --> 00:33:26,210 the block headers. 877 00:33:26,210 --> 00:33:28,730 You actually want to fetch them from a different source 878 00:33:28,730 --> 00:33:30,890 that Whatsapp doesn't collude with. 879 00:33:30,890 --> 00:33:32,722 Yeah, Anne, you had a question? 880 00:33:32,722 --> 00:33:34,680 AUDIENCE: Are there other header relay networks 881 00:33:34,680 --> 00:33:36,648 that are deployed already? 882 00:33:36,648 --> 00:33:38,940 ALIN TOMESCU: Yeah, there was actually one on ethereum. 883 00:33:38,940 --> 00:33:40,887 There's a smart contract in ethereum, 884 00:33:40,887 --> 00:33:43,220 I think, that if you submit bitcoin block headers to it, 885 00:33:43,220 --> 00:33:44,900 you get something back. 886 00:33:44,900 --> 00:33:46,970 And then you can just query bitcoin block headers 887 00:33:46,970 --> 00:33:48,990 from the ethereum blockchain. 888 00:33:48,990 --> 00:33:50,690 Is anyone familiar with this? 889 00:33:50,690 --> 00:33:52,190 So I guess I didn't include it here. 890 00:33:52,190 --> 00:33:54,440 But another way to do it is to just publish the headers 891 00:33:54,440 --> 00:33:56,230 in an ethereum smart contract. 892 00:33:56,230 --> 00:33:57,290 Yeah. 893 00:33:57,290 --> 00:33:59,660 So there's crazy ways you could do this too. 894 00:33:59,660 --> 00:34:00,620 Yeah? 895 00:34:00,620 --> 00:34:02,910 AUDIENCE: Will that one go out of gas? 896 00:34:02,910 --> 00:34:05,563 I don't know, my understanding of ethereum is pretty decent, 897 00:34:05,563 --> 00:34:08,580 but would that smart contract eventually run out of gas 898 00:34:08,580 --> 00:34:10,090 and not publish anymore?
899 00:34:10,090 --> 00:34:14,590 ALIN TOMESCU: So to fetch from it, you don't need to pay gas. 900 00:34:14,590 --> 00:34:18,991 But I suspect to push in it, I actually 901 00:34:18,991 --> 00:34:20,449 don't know who funds that contract. 902 00:34:20,449 --> 00:34:23,322 So I guess you fund that when you push maybe. 903 00:34:23,322 --> 00:34:25,489 Maybe not because then you also want something back. 904 00:34:25,489 --> 00:34:27,648 Why would you push? 905 00:34:27,648 --> 00:34:28,190 I'm not sure. 906 00:34:28,190 --> 00:34:30,790 But we can look at it after. 907 00:34:30,790 --> 00:34:32,639 Yeah. 908 00:34:32,639 --> 00:34:35,050 It's a good question. 909 00:34:35,050 --> 00:34:37,550 So anyway, even if this header relay network is compromised, 910 00:34:37,550 --> 00:34:39,092 if you implement it in the right way, 911 00:34:39,092 --> 00:34:42,770 you've distributed it across a sufficient number of parties, 912 00:34:42,770 --> 00:34:46,083 you can still get all of the properties that you need, 913 00:34:46,083 --> 00:34:47,750 meaning freshness for the block headers. 914 00:34:47,750 --> 00:34:49,540 That's really the only property that you need. 915 00:34:49,540 --> 00:34:51,123 The header relay network should always 916 00:34:51,123 --> 00:34:53,210 reply with the latest block headers. 917 00:34:53,210 --> 00:34:57,610 So let's look at costs since Alin was mentioning the costs. 918 00:34:57,610 --> 00:35:01,490 So to open a statement, you have to issue a transaction, right? 919 00:35:01,490 --> 00:35:06,230 And the size of our transactions is around 235 bytes. 920 00:35:06,230 --> 00:35:11,840 So the fee as of December 13 was $16.24 for transactions 921 00:35:11,840 --> 00:35:14,360 if you guys remember those great bitcoin times. 922 00:35:14,360 --> 00:35:17,690 I think it went up to $40 at some point too.
923 00:35:17,690 --> 00:35:20,510 So it was really hard for me to talk to people 924 00:35:20,510 --> 00:35:21,820 about this research back then. 925 00:35:21,820 --> 00:35:23,278 But guess what the fees are today? 926 00:35:26,660 --> 00:35:28,100 So today, this morning I checked. 927 00:35:28,100 --> 00:35:30,650 And they were $0.78, right? 928 00:35:30,650 --> 00:35:34,670 When we wrote the paper, they were like $0.12. 929 00:35:34,670 --> 00:35:36,620 So you know, here I am standing in front 930 00:35:36,620 --> 00:35:38,720 of you pitching our work. 931 00:35:38,720 --> 00:35:41,840 In two minutes they could be back to $100, but who knows? 932 00:35:41,840 --> 00:35:43,050 Yes, you had a question. 933 00:35:43,050 --> 00:35:45,050 AUDIENCE: Maybe I'm wrong, but in a transaction, 934 00:35:45,050 --> 00:35:46,520 you can have some outputs. 935 00:35:46,520 --> 00:35:49,273 Can you have several statements in there? 936 00:35:49,273 --> 00:35:50,940 ALIN TOMESCU: So that's a good question. 937 00:35:50,940 --> 00:35:52,535 So can you batch statements? 938 00:35:52,535 --> 00:35:53,660 And then the answer is yes. 939 00:35:53,660 --> 00:35:55,760 You can definitely batch statements. 940 00:35:55,760 --> 00:35:59,150 What we've said so far is in a transaction-- 941 00:35:59,150 --> 00:36:02,390 in a Catena transaction, you have this output. 942 00:36:02,390 --> 00:36:04,120 And you have this op return output 943 00:36:04,120 --> 00:36:06,620 where you put the statement, right? 944 00:36:06,620 --> 00:36:10,580 And you know, it spends a previous transaction. 945 00:36:10,580 --> 00:36:16,670 But as a matter of fact, what you can do and some of you 946 00:36:16,670 --> 00:36:19,950 may have already noticed this. 947 00:36:19,950 --> 00:36:22,490 There is no reason to put just one statement in here. 948 00:36:22,490 --> 00:36:24,530 There is some reason-- so the only reason 949 00:36:24,530 --> 00:36:27,150 is that it only fits 80 bytes.
950 00:36:27,150 --> 00:36:29,870 So you could put easily, let's say, two or three statements 951 00:36:29,870 --> 00:36:32,518 in there if you hashed them with the right hash function. 952 00:36:32,518 --> 00:36:34,310 But a better way to do it is why don't you 953 00:36:34,310 --> 00:36:36,620 put a Merkle root hash here? 954 00:36:36,620 --> 00:36:39,380 And then you can have as many statements 955 00:36:39,380 --> 00:36:43,630 as you want in the leaves of that Merkle tree, right? 956 00:36:43,630 --> 00:36:45,380 In fact, here you could have I don't know, 957 00:36:45,380 --> 00:36:49,600 billions of statements. 958 00:36:49,600 --> 00:36:52,320 So keep in mind, you will only be 959 00:36:52,320 --> 00:36:55,380 able to issue billions of statements every 10 minutes. 960 00:36:55,380 --> 00:36:56,880 But you can definitely have billions 961 00:36:56,880 --> 00:36:59,010 of statements in a single transaction 962 00:36:59,010 --> 00:37:00,270 if you just batch them. 963 00:37:00,270 --> 00:37:03,750 So now, remember the blockchain will only store the root hash. 964 00:37:03,750 --> 00:37:06,840 This Merkle tree will be stored by the log server perhaps 965 00:37:06,840 --> 00:37:08,880 or by a different party. 966 00:37:08,880 --> 00:37:11,280 They don't have to be the same party. 967 00:37:11,280 --> 00:37:12,910 Does that make sense? 968 00:37:12,910 --> 00:37:14,370 Does that answer your question? 969 00:37:14,370 --> 00:37:14,870 Yeah. 970 00:37:17,450 --> 00:37:20,100 Right, that was my next point. 971 00:37:20,100 --> 00:37:22,510 Statements can be batched with Merkle trees. 972 00:37:22,510 --> 00:37:24,900 OK, so let's talk about the why since so far 973 00:37:24,900 --> 00:37:27,150 we've been talking abstractly about these statements. 974 00:37:27,150 --> 00:37:29,490 But what could these statements actually be? 975 00:37:29,490 --> 00:37:31,290 So let's look at a secure software update.
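The Merkle batching described above can be sketched as follows. This is an illustrative construction under stated assumptions, not the format Catena itself uses: SHA-256, the `0x00`/`0x01` leaf and node prefixes, and the rule of promoting an unpaired node unchanged are all choices made for this example, and every function name is hypothetical. What it shows is that only a 32-byte root needs to go into the 80-byte OP_RETURN output, while the log server keeps the tree and hands out short inclusion proofs.

```python
import hashlib

OP_RETURN_LIMIT = 80  # bytes of data per statement output, as stated in the lecture


def leaf_hash(statement: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + statement).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(statements):
    """Return every level of the tree, from the leaf hashes up to [root]."""
    if not statements:
        raise ValueError("need at least one statement")
    levels = [[leaf_hash(s) for s in statements]]
    while len(levels[-1]) > 1:
        cur, nxt = levels[-1], []
        for i in range(0, len(cur), 2):
            if i + 1 < len(cur):
                nxt.append(node_hash(cur[i], cur[i + 1]))
            else:
                nxt.append(cur[i])  # unpaired node is promoted unchanged
        levels.append(nxt)
    return levels


def inclusion_proof(levels, index):
    """Sibling hashes needed to recompute the root from leaf number `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify_inclusion(statement: bytes, proof, root: bytes) -> bool:
    acc = leaf_hash(statement)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


if __name__ == "__main__":
    batch = [b"s1", b"s2", b"s3", b"s4", b"s5"]
    levels = build_levels(batch)
    root = levels[-1][0]
    assert len(root) <= OP_RETURN_LIMIT  # only the root goes on chain
    print(verify_inclusion(b"s3", inclusion_proof(levels, 2), root))
```

The proof for one statement grows with the logarithm of the batch size, which is what makes very large batches per transaction practical.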
976 00:37:31,290 --> 00:37:34,140 So how do you do secure software update? 977 00:37:34,140 --> 00:37:36,540 An example attack on a software update scheme 978 00:37:36,540 --> 00:37:41,330 is that somebody compromises the bitcoin.org domain. 979 00:37:41,330 --> 00:37:44,390 And they change the bitcoin binary to a malicious binary. 980 00:37:44,390 --> 00:37:47,417 And they wait for people to install that malicious binary. 981 00:37:47,417 --> 00:37:48,750 And then they steal their coins. 982 00:37:48,750 --> 00:37:49,820 They steal their data. 983 00:37:49,820 --> 00:37:51,890 They could execute arbitrary code. 984 00:37:51,890 --> 00:37:54,230 And an example of this was sort of this binary safety 985 00:37:54,230 --> 00:37:55,875 warning on the bitcoin website. 986 00:37:55,875 --> 00:37:57,500 At some point, they were very concerned 987 00:37:57,500 --> 00:38:01,130 that a state actor is going to mess with the DNS servers 988 00:38:01,130 --> 00:38:04,010 and redirect clients to a different server 989 00:38:04,010 --> 00:38:07,290 and make them download a bad bitcoin binary. 990 00:38:07,290 --> 00:38:08,540 So does the attack make sense? 991 00:38:08,540 --> 00:38:10,410 Does everybody see why this is possible? 992 00:38:10,410 --> 00:38:15,590 You do need to sort of accept that the DNS service that we 993 00:38:15,590 --> 00:38:17,930 currently have on the internet is fundamentally flawed. 994 00:38:17,930 --> 00:38:21,630 It's not built for security. 995 00:38:21,630 --> 00:38:24,680 So the typical defense for this 996 00:38:24,680 --> 00:38:26,960 is that the Bitcoin developers-- they sign the bitcoin 997 00:38:26,960 --> 00:38:29,270 binaries with some secret key. 998 00:38:29,270 --> 00:38:31,820 And they protect that secret key. 999 00:38:31,820 --> 00:38:34,455 And then there's a public key associated 1000 00:38:34,455 --> 00:38:36,830 with the secret key that's posted on the bitcoin website, 1001 00:38:36,830 --> 00:38:37,670 right?
1002 00:38:37,670 --> 00:38:39,795 And maybe some of you'll notice that sometimes it's 1003 00:38:39,795 --> 00:38:42,087 also very easy to change the public key on the website, 1004 00:38:42,087 --> 00:38:43,580 you know, if you can just redirect 1005 00:38:43,580 --> 00:38:45,290 the victim to another website. 1006 00:38:45,290 --> 00:38:47,990 And another problem is that not everyone checks the signature. 1007 00:38:47,990 --> 00:38:50,448 Even if let's say you have the public key on your computer, 1008 00:38:50,448 --> 00:38:52,303 you know what the right public key is, 1009 00:38:52,303 --> 00:38:53,720 only if you're like an expert user 1010 00:38:53,720 --> 00:38:57,380 and you know how to use GPG, you will check that signature, 1011 00:38:57,380 --> 00:38:58,430 right? 1012 00:38:58,430 --> 00:39:00,145 And the other problem that's probably I 1013 00:39:00,145 --> 00:39:01,520 think a much, much bigger problem 1014 00:39:01,520 --> 00:39:03,770 is that for the bitcoin devs themselves, 1015 00:39:03,770 --> 00:39:06,280 it's very hard to detect if someone stole their secret key. 1016 00:39:06,280 --> 00:39:09,500 Like if I'm a state actor and I break your computer 1017 00:39:09,500 --> 00:39:12,440 and I steal your secret key, I will sign this bitcoin binary. 1018 00:39:12,440 --> 00:39:14,270 And I'll give it to let's say, one guy. 1019 00:39:14,270 --> 00:39:15,282 I'll give it to you. 1020 00:39:15,282 --> 00:39:17,240 You know, and I'll just target you individually 1021 00:39:17,240 --> 00:39:18,320 because I'm really-- 1022 00:39:18,320 --> 00:39:19,790 I know you have a lot of bitcoin. 1023 00:39:19,790 --> 00:39:23,047 Right, and then the bitcoin devs will never find out about it 1024 00:39:23,047 --> 00:39:25,130 unless you know you kind of realize what happened. 1025 00:39:25,130 --> 00:39:26,685 Then you take your bitcoin binary 1026 00:39:26,685 --> 00:39:28,310 and you go with it to the bitcoin devs. 
1027 00:39:28,310 --> 00:39:30,828 And they check the signature on it and they say, oh wow. 1028 00:39:30,828 --> 00:39:32,870 This is a valid signature and we never signed it. 1029 00:39:32,870 --> 00:39:36,440 So somebody must have stolen our secret key. 1030 00:39:36,440 --> 00:39:39,553 So this is really a bad-- 1031 00:39:39,553 --> 00:39:40,970 kind of the core of the problem is 1032 00:39:40,970 --> 00:39:43,322 that it's hard for whoever publishes software 1033 00:39:43,322 --> 00:39:44,780 to detect that their secret key has 1034 00:39:44,780 --> 00:39:47,360 been stolen-- to detect malicious signatures 1035 00:39:47,360 --> 00:39:48,200 on their binaries. 1036 00:39:48,200 --> 00:39:49,530 Does that make sense? 1037 00:39:49,530 --> 00:39:51,860 So the solution, of course, is you know, publish 1038 00:39:51,860 --> 00:39:55,460 the signatures of bitcoin binaries in a Catena log. 1039 00:39:55,460 --> 00:39:57,830 So now if there's a malicious binary being published 1040 00:39:57,830 --> 00:40:00,770 by a state actor, people won't accept that binary 1041 00:40:00,770 --> 00:40:02,780 unless it's in the Catena log, which 1042 00:40:02,780 --> 00:40:07,050 means people and the bitcoin devs will see the same binary. 1043 00:40:07,050 --> 00:40:09,800 Right, so let me show you what I mean with a picture. 1044 00:40:09,800 --> 00:40:13,610 So we have this Catena log for Bitcoin binaries. 1045 00:40:13,610 --> 00:40:15,110 And let's say, the first transaction 1046 00:40:15,110 --> 00:40:21,260 has a hash of the bitcoin 0.001 tar file, right, 1047 00:40:21,260 --> 00:40:23,840 the bitcoin binaries. 1048 00:40:23,840 --> 00:40:27,230 And this hash here is implicitly signed 1049 00:40:27,230 --> 00:40:29,780 by the signature in this input because it 1050 00:40:29,780 --> 00:40:32,510 signs the whole transaction. 1051 00:40:32,510 --> 00:40:35,210 So now if I put this hash in a Catena log, 1052 00:40:35,210 --> 00:40:38,240 I get a signature on it for free.
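The rule "don't accept a binary unless its hash is in the Catena log" can be sketched as below. This is a simplified model, not the Catena client: `LogTx`, `extract_log`, and `is_published` are hypothetical names, transactions are assumed to be already confirmed and verified against block headers, and SHA-256 of the file stands in for whatever digest a real deployment would publish. The model keeps the one structural fact the lecture relies on, namely that each log transaction spends the previous one, so two statements at the same position would require a double spend.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LogTx:
    txid: str         # identifier of this transaction
    spends: str       # txid of the log transaction whose output it spends
    statement: bytes  # data carried in the OP_RETURN output


def extract_log(genesis_txid: str, confirmed_txs) -> list:
    """Walk the chain of spends starting from the log's genesis transaction."""
    by_spent = {}
    for tx in confirmed_txs:
        if tx.spends in by_spent:
            # Two transactions spending the same output is a double spend,
            # which Bitcoin itself is assumed to reject.
            raise ValueError("fork detected: output of %s spent twice" % tx.spends)
        by_spent[tx.spends] = tx
    statements, cursor = [], genesis_txid
    while cursor in by_spent:
        tx = by_spent.pop(cursor)  # pop so a malformed input cannot loop forever
        statements.append(tx.statement)
        cursor = tx.txid
    return statements


def is_published(binary: bytes, statements) -> bool:
    """Accept a downloaded binary only if its digest appears in the log."""
    return hashlib.sha256(binary).digest() in statements
```

A downloader would call `is_published` before installing anything, and the software vendor would run `extract_log` over the same transactions to spot releases it never made.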
1053 00:40:38,240 --> 00:40:42,320 And now, if let's say, a state actor compromises the log 1054 00:40:42,320 --> 00:40:44,720 server, gets the secret key, he can 1055 00:40:44,720 --> 00:40:48,110 publish this second malicious binary in the log, right? 1056 00:40:48,110 --> 00:40:51,090 But what that malicious state actor will want to do, 1057 00:40:51,090 --> 00:40:53,240 he will want to hide this from the Bitcoin devs 1058 00:40:53,240 --> 00:40:54,620 and show it to all of you guys. 1059 00:40:54,620 --> 00:40:57,197 All right, so he'll want to equivocate. 1060 00:40:57,197 --> 00:40:59,780 So as a result, he will want to create a different transaction 1061 00:40:59,780 --> 00:41:02,420 with the right bitcoin binary there, 1062 00:41:02,420 --> 00:41:05,350 show this to the bitcoin devs while showing this to you guys. 1063 00:41:05,350 --> 00:41:08,240 All right, so the bitcoin devs would think they're good. 1064 00:41:08,240 --> 00:41:11,540 This is the binary they wanted to publish while you guys would 1065 00:41:11,540 --> 00:41:17,535 be using this malicious binary published by the state actor. 1066 00:41:17,535 --> 00:41:19,910 Of course this cannot happen because in Catena you cannot 1067 00:41:19,910 --> 00:41:21,316 equivocate. 1068 00:41:21,316 --> 00:41:24,190 Right? 1069 00:41:24,190 --> 00:41:27,360 Does everybody see this? 1070 00:41:27,360 --> 00:41:28,832 Right, any questions about this? 1071 00:41:28,832 --> 00:41:30,290 There has to be a question on this. 1072 00:41:37,122 --> 00:41:39,090 No? 1073 00:41:39,090 --> 00:41:42,240 So this mechanism is called software transparency. 
1074 00:41:42,240 --> 00:41:45,720 It's this idea that rather than just downloading software 1075 00:41:45,720 --> 00:41:50,040 like a crazy person from the internet and installing it, 1076 00:41:50,040 --> 00:41:51,930 we should just be publishing these binaries 1077 00:41:51,930 --> 00:41:54,990 in a log that everybody can see, including the software vendors 1078 00:41:54,990 --> 00:41:56,770 that created those binaries. 1079 00:41:56,770 --> 00:41:59,947 So in this way, if somebody compromises a software vendor, 1080 00:41:59,947 --> 00:42:01,530 that vendor can notice that in the log 1081 00:42:01,530 --> 00:42:03,197 there's a new version for their software 1082 00:42:03,197 --> 00:42:04,350 that they didn't publish. 1083 00:42:04,350 --> 00:42:07,890 So you know, this isn't to say that it'll prevent attacks. 1084 00:42:07,890 --> 00:42:10,890 You know, what a state actor can do anyway 1085 00:42:10,890 --> 00:42:12,180 is they can just do this. 1086 00:42:12,180 --> 00:42:13,950 They can post this H2 prime in here, 1087 00:42:13,950 --> 00:42:16,770 show it to you guys including the bitcoin devs 1088 00:42:16,770 --> 00:42:18,960 and still screw everyone over. 1089 00:42:18,960 --> 00:42:21,890 But at least these attacks don't go undetected anymore. 1090 00:42:21,890 --> 00:42:27,325 All right, so it's a step forward in that sense. 1091 00:42:27,325 --> 00:42:28,700 Yes, so the idea is that you have 1092 00:42:28,700 --> 00:42:30,590 to double spend to equivocate. 1093 00:42:30,590 --> 00:42:34,760 And the other example that really-- 1094 00:42:34,760 --> 00:42:36,680 the reason I wanted to start this research 1095 00:42:36,680 --> 00:42:38,340 had to do with public key distribution. 1096 00:42:38,340 --> 00:42:40,550 So let's say we have Alice and we have Bob. 1097 00:42:40,550 --> 00:42:42,530 And they both have their public keys.
1098 00:42:42,530 --> 00:42:44,648 And I'm using this letter b to denote 1099 00:42:44,648 --> 00:42:46,190 Bob and his public key and the letter 1100 00:42:46,190 --> 00:42:48,500 a to denote Alice and her public key. 1101 00:42:48,500 --> 00:42:50,540 And they have their corresponding secret keys, 1102 00:42:50,540 --> 00:42:51,110 right? 1103 00:42:51,110 --> 00:42:53,318 And Alice and Bob, they want to chat securely, right? 1104 00:42:53,318 --> 00:42:58,580 So they want to set up a secure channel. 1105 00:42:58,580 --> 00:43:01,280 And there's this directory which stores their public keys. 1106 00:43:01,280 --> 00:43:03,320 So this guy stores Alice, pk Alice. 1107 00:43:03,320 --> 00:43:05,300 This guy stores Bob, pk Bob. 1108 00:43:05,300 --> 00:43:09,440 All right, and the directory gets updated over time, 1109 00:43:09,440 --> 00:43:13,670 maybe Karl, Ellen and Dan registered. 1110 00:43:13,670 --> 00:43:17,090 And if you have non-equivocation, 1111 00:43:17,090 --> 00:43:19,940 if the attacker wants to impersonate Alice and Bob, 1112 00:43:19,940 --> 00:43:22,730 he kind of has to put their public keys, 1113 00:43:22,730 --> 00:43:24,570 the fake public keys in the same directory, 1114 00:43:24,570 --> 00:43:27,800 which means that when Alice and Bob monitor-- 1115 00:43:27,800 --> 00:43:29,870 they check their own public keys, 1116 00:43:29,870 --> 00:43:34,030 they both notice they've been impersonated, right? 1117 00:43:34,030 --> 00:43:37,420 So again, the idea is that you can detect. 1118 00:43:37,420 --> 00:43:41,030 Now how can this attacker still trick 1119 00:43:41,030 --> 00:43:43,640 Alice to send an encrypted message to Bob 1120 00:43:43,640 --> 00:43:45,590 with Bob's fake public key? 1121 00:43:45,590 --> 00:43:48,057 Is there a way even if you have non-equivocation? 1122 00:43:50,980 --> 00:43:52,030 So what's the attack? 
1123 00:43:52,030 --> 00:43:53,775 Even if I have non-equivocation, I 1124 00:43:53,775 --> 00:43:55,150 claim that the attacker can still 1125 00:43:55,150 --> 00:43:59,710 get Alice to send an encrypted message to Bob 1126 00:43:59,710 --> 00:44:01,573 that the attacker can decrypt. 1127 00:44:01,573 --> 00:44:02,740 What should the attacker do? 1128 00:44:02,740 --> 00:44:07,750 So pretend that we are here without the ability 1129 00:44:07,750 --> 00:44:08,862 to equivocate. 1130 00:44:11,453 --> 00:44:12,870 So the attacker cannot equivocate. 1131 00:44:12,870 --> 00:44:14,790 But I claimed that the attacker can still 1132 00:44:14,790 --> 00:44:17,070 trick Alice into sending a message to Bob 1133 00:44:17,070 --> 00:44:19,790 that the attacker can read. 1134 00:44:19,790 --> 00:44:22,166 So now it's time to see if you guys paid attention. 1135 00:44:28,010 --> 00:44:29,920 Somebody? 1136 00:44:29,920 --> 00:44:31,040 Alin? 1137 00:44:31,040 --> 00:44:32,097 Oh, you? 1138 00:44:32,097 --> 00:44:34,430 AUDIENCE: Does the attacker have to have the secret key? 1139 00:44:34,430 --> 00:44:35,060 ALIN TOMESCU: No, no. 1140 00:44:35,060 --> 00:44:35,680 He does not. 1141 00:44:35,680 --> 00:44:36,920 Yeah. 1142 00:44:36,920 --> 00:44:39,112 He does not have to have the secret key. 1143 00:44:41,840 --> 00:44:44,527 The attacker just creates fake public keys. 1144 00:44:44,527 --> 00:44:45,110 Here's a hint. 1145 00:44:48,290 --> 00:44:51,010 AUDIENCE: If you only changed one person's key, 1146 00:44:51,010 --> 00:44:53,965 they don't know that it's a fake key so they could send it 1147 00:44:53,965 --> 00:44:55,960 to a fake key for a bit? 1148 00:44:55,960 --> 00:44:58,042 ALIN TOMESCU: Yeah, so which person should they 1149 00:44:58,042 --> 00:44:59,250 attack or change the key for?
1150 00:44:59,250 --> 00:45:00,870 AUDIENCE: Like if they change Bob's, Alice 1151 00:45:00,870 --> 00:45:02,245 will still think Bob's is correct 1152 00:45:02,245 --> 00:45:04,873 so she'll send it to the fake Bob until Bob checks it. 1153 00:45:04,873 --> 00:45:05,790 ALIN TOMESCU: Exactly. 1154 00:45:05,790 --> 00:45:07,350 So that's exactly right. 1155 00:45:07,350 --> 00:45:08,483 So what's your name? 1156 00:45:08,483 --> 00:45:09,150 AUDIENCE: Lucas. 1157 00:45:09,150 --> 00:45:09,983 ALIN TOMESCU: Lucas. 1158 00:45:09,983 --> 00:45:15,810 So what Lucas is saying is look, even without equivocation, 1159 00:45:15,810 --> 00:45:18,390 I had this directory at T1. 1160 00:45:18,390 --> 00:45:19,950 I had another one at T2. 1161 00:45:19,950 --> 00:45:25,590 And both of these had keys for Alice and Bob, right? 1162 00:45:25,590 --> 00:45:30,190 But at T3, Lucas is saying look, just put the fake key for Bob 1163 00:45:30,190 --> 00:45:30,690 here. 1164 00:45:30,690 --> 00:45:32,130 And that's it. 1165 00:45:32,130 --> 00:45:35,310 Don't put a fake key for Alice there, just for Bob. 1166 00:45:35,310 --> 00:45:41,335 And now when Alice looks up this public key for Bob here, 1167 00:45:41,335 --> 00:45:42,960 she sends a query to the directory: hey, 1168 00:45:42,960 --> 00:45:44,940 what's Bob's public key? 1169 00:45:44,940 --> 00:45:50,280 She gets back b prime, which is equal to Bob pk 1170 00:45:50,280 --> 00:45:53,980 Bob prime, right? 1171 00:45:53,980 --> 00:45:57,168 Alice can't tell that this is really a fake public key for Bob. 1172 00:45:57,168 --> 00:45:58,960 That's the reason she's using the directory 1173 00:45:58,960 --> 00:46:00,160 in the first place. 1174 00:46:00,160 --> 00:46:02,920 She wants sort of a trustworthy place to get it from. 1175 00:46:02,920 --> 00:46:04,610 Bob can tell if Bob looks. 1176 00:46:04,610 --> 00:46:06,610 But by the time Bob looks, it might be too late.
1177 00:46:06,610 --> 00:46:08,880 Alice might have already encrypted a message, right? 1178 00:46:08,880 --> 00:46:10,880 So again, what's the point of doing all of this? 1179 00:46:10,880 --> 00:46:14,103 It's not like you're preventing attacks, right? 1180 00:46:14,103 --> 00:46:15,520 And the point of doing all of this 1181 00:46:15,520 --> 00:46:16,840 is that you get transparency. 1182 00:46:16,840 --> 00:46:19,810 Bob can detect, whereas right now Bob has no hope. 1183 00:46:19,810 --> 00:46:23,110 In fact, so you said, a lot of you use Whatsapp. 1184 00:46:23,110 --> 00:46:24,793 So you know in Whatsapp, if you really 1185 00:46:24,793 --> 00:46:26,710 want to be sure, so I have a conversation here 1186 00:46:26,710 --> 00:46:28,570 with Alin Dragos. 1187 00:46:28,570 --> 00:46:31,060 So, Alin, do you want to bring your phone here? 1188 00:46:31,060 --> 00:46:32,950 Do you have Whatsapp? 1189 00:46:32,950 --> 00:46:35,440 So if you really want to be sure that you're 1190 00:46:35,440 --> 00:46:37,697 talking to the real Alin and not some other guy, 1191 00:46:37,697 --> 00:46:39,280 you have to go on this encryption tab. 1192 00:46:39,280 --> 00:46:42,548 Can you tape this? 1193 00:46:42,548 --> 00:46:43,840 So my phone is black and white. 1194 00:46:43,840 --> 00:46:45,760 It's going through a depression phase. 1195 00:46:45,760 --> 00:46:47,890 I apologize. 1196 00:46:47,890 --> 00:46:50,140 So you have to go here and there's a code here, right? 1197 00:46:50,140 --> 00:46:51,640 And Alin, can you do the same thing? 1198 00:46:51,640 --> 00:46:52,973 You know what I'm talking about? 1199 00:46:56,250 --> 00:46:59,490 I really hope I don't get some weird text message right now 1200 00:46:59,490 --> 00:47:01,930 with the camera on the phone. 1201 00:47:01,930 --> 00:47:05,095 OK, so now with Alin's phone, is that the same code? 1202 00:47:05,095 --> 00:47:05,970 Can somebody tell me? 1203 00:47:05,970 --> 00:47:07,410 I can't see it.
1204 00:47:07,410 --> 00:47:09,740 AUDIENCE: It's hard to see. 1205 00:47:09,740 --> 00:47:12,333 ALIN TOMESCU: OK, so we have 27836 and yeah. 1206 00:47:12,333 --> 00:47:13,250 So it's the same code. 1207 00:47:13,250 --> 00:47:15,050 Can you see it on the camera? 1208 00:47:15,050 --> 00:47:16,915 So now we have the same code here. 1209 00:47:16,915 --> 00:47:18,290 This code here really is 1210 00:47:18,290 --> 00:47:20,673 a hash of my public key and Alin's public key. 1211 00:47:20,673 --> 00:47:23,090 And if we've got the same hash of both of our public keys, 1212 00:47:23,090 --> 00:47:24,840 then we know we're talking to one another. 1213 00:47:24,840 --> 00:47:27,290 But we won't really know that's the case until we actually 1214 00:47:27,290 --> 00:47:29,360 meet in person and do this exchange, right? 1215 00:47:29,360 --> 00:47:31,880 So what this system does instead is 1216 00:47:31,880 --> 00:47:33,710 it allows Alin to check his own public key 1217 00:47:33,710 --> 00:47:36,410 and it allows me to check my own public key. 1218 00:47:36,410 --> 00:47:38,078 This way if we check our own public key, 1219 00:47:38,078 --> 00:47:40,370 we'll always know when we're impersonated. Even though I 1220 00:47:40,370 --> 00:47:43,130 might send an encrypted message to the wrong Alin, 1221 00:47:43,130 --> 00:47:44,703 Alin will eventually find out. 1222 00:47:44,703 --> 00:47:46,870 It's a bit confusing because we're both called Alin. 1223 00:47:50,505 --> 00:47:51,880 So does that sort of make sense? 1224 00:47:51,880 --> 00:47:52,380 Yeah? 1225 00:47:52,380 --> 00:47:54,290 AUDIENCE: So what do you do when you realize 1226 00:47:54,290 --> 00:47:55,943 that your key is the wrong key? 1227 00:47:55,943 --> 00:47:57,110 ALIN TOMESCU: Good question. 1228 00:47:57,110 --> 00:47:58,400 So that's really the crucial question. 1229 00:47:58,400 --> 00:47:59,442 What the hell can you do?
1230 00:47:59,442 --> 00:48:01,490 Right, so this directory impersonated you. 1231 00:48:01,490 --> 00:48:03,788 In fact, you can't even-- 1232 00:48:03,788 --> 00:48:04,580 here's the problem. 1233 00:48:04,580 --> 00:48:07,713 If you're Bob here and you see this fake public key. 1234 00:48:07,713 --> 00:48:09,380 And you go to the New York Times and you 1235 00:48:09,380 --> 00:48:12,110 say hey, New York Times, this Whatsapp directory 1236 00:48:12,110 --> 00:48:13,820 started impersonating me. 1237 00:48:13,820 --> 00:48:15,920 And the New York Times can go to the directory 1238 00:48:15,920 --> 00:48:18,087 and say, hey directory, why did you impersonate Bob? 1239 00:48:18,087 --> 00:48:20,462 And the directory can say, no, I did not impersonate Bob. 1240 00:48:20,462 --> 00:48:22,280 Bob really just asked for a new public key. 1241 00:48:22,280 --> 00:48:24,620 And this was the public key that Bob gave me. 1242 00:48:24,620 --> 00:48:28,133 And it's just a he said, they said, kind of a thing, right? 1243 00:48:28,133 --> 00:48:30,050 So it's really a sort of an open research area 1244 00:48:30,050 --> 00:48:32,960 to figure out what's the right way to whistle blow here. 1245 00:48:32,960 --> 00:48:35,570 So for example, one project that we're trying to work on 1246 00:48:35,570 --> 00:48:38,690 is there a way to track the directory somehow 1247 00:48:38,690 --> 00:48:40,640 so that when he does stuff like this, 1248 00:48:40,640 --> 00:48:44,868 you get a publicly verifiable cryptographic proof that he 1249 00:48:44,868 --> 00:48:45,910 really misbehaved, right? 1250 00:48:45,910 --> 00:48:47,120 Now, there's no cryptographic proof. 1251 00:48:47,120 --> 00:48:48,860 The fact that that public key is there 1252 00:48:48,860 --> 00:48:50,693 could have come from the malicious directory 1253 00:48:50,693 --> 00:48:52,430 or could have come from an honest Bob who 1254 00:48:52,430 --> 00:48:54,720 just changed his public key.
1255 00:48:54,720 --> 00:48:55,560 So yeah. 1256 00:48:55,560 --> 00:48:57,440 So again, a step forward but we're not there. 1257 00:48:57,440 --> 00:49:00,660 You know, it's just that there's much more work to do here. 1258 00:49:00,660 --> 00:49:02,752 And I think you also had a question. 1259 00:49:02,752 --> 00:49:06,360 AUDIENCE: Can't Alice just ask Bob if this is you? 1260 00:49:06,360 --> 00:49:08,930 ALIN TOMESCU: So that's a chicken and an egg, right? 1261 00:49:08,930 --> 00:49:10,790 So we have Alice. 1262 00:49:10,790 --> 00:49:11,600 We have Bob. 1263 00:49:11,600 --> 00:49:13,130 And we have the attacker. 1264 00:49:13,130 --> 00:49:17,510 Alice asks Bob, is this your public key? 1265 00:49:17,510 --> 00:49:20,730 You know, let's say, b prime. 1266 00:49:20,730 --> 00:49:22,580 Let me make this more readable. 1267 00:49:22,580 --> 00:49:24,860 So attacker, right? 1268 00:49:24,860 --> 00:49:26,360 So Alice asks, hey Bob. 1269 00:49:26,360 --> 00:49:28,490 Is this b prime your public key? 1270 00:49:28,490 --> 00:49:30,410 The attacker changes it. 1271 00:49:30,410 --> 00:49:32,930 Hey Bob, is b your public key? 1272 00:49:32,930 --> 00:49:35,460 The attacker-- Bob says yes. 1273 00:49:35,460 --> 00:49:38,030 Attacker forwards yes to Alice, right? 1274 00:49:38,030 --> 00:49:40,358 Remember, Bob and Alice don't have a secure channel. 1275 00:49:40,358 --> 00:49:42,900 That's the problem we're trying to solve with this directory, 1276 00:49:42,900 --> 00:49:44,910 right? 1277 00:49:44,910 --> 00:49:47,610 So the attacker can always man in the middle people. 1278 00:49:47,610 --> 00:49:49,030 Well, people like Alice and Bob. 1279 00:49:49,030 --> 00:49:51,100 If the attacker can man in the middle everything, 1280 00:49:51,100 --> 00:49:52,320 then there's really no hope. 1281 00:49:52,320 --> 00:49:56,330 And we're living in a very sad, sad world if that's the case. 1282 00:49:56,330 --> 00:50:00,920 Yeah, it actually might be the case but we'll see. 
1283 00:50:00,920 --> 00:50:04,470 Anyway, so yeah, so I claim here that this is a step forward. 1284 00:50:04,470 --> 00:50:05,893 But there's still much work to do. 1285 00:50:05,893 --> 00:50:07,310 All right, so we get transparency. 1286 00:50:07,310 --> 00:50:08,090 Bob can detect. 1287 00:50:08,090 --> 00:50:08,930 He'll know. 1288 00:50:08,930 --> 00:50:10,700 He won't be able to convince anybody. 1289 00:50:10,700 --> 00:50:13,310 But if a lot of Bobs get compromised, 1290 00:50:13,310 --> 00:50:15,610 you're still like in a place where everybody 1291 00:50:15,610 --> 00:50:18,110 knows that something's off and will all stop using Whatsapp, 1292 00:50:18,110 --> 00:50:18,740 for example. 1293 00:50:18,740 --> 00:50:19,358 Right? 1294 00:50:19,358 --> 00:50:20,900 By the way, Whatsapp is a great tool. 1295 00:50:20,900 --> 00:50:23,090 You should continue using it. 1296 00:50:23,090 --> 00:50:25,050 I'm just saying it's difficult to use right. 1297 00:50:25,050 --> 00:50:26,920 Like if somebody really wants to target it, 1298 00:50:26,920 --> 00:50:31,460 they can play a lot of tricks to still trick you. 1299 00:50:31,460 --> 00:50:34,730 So yeah, so all right, we already talked about this. 1300 00:50:34,730 --> 00:50:37,970 And yeah, again if the directory can equivocate, then you know, 1301 00:50:37,970 --> 00:50:41,300 all bets are off because now Bob will look in this directory, 1302 00:50:41,300 --> 00:50:42,763 he'll think he's not impersonated. 1303 00:50:42,763 --> 00:50:44,180 Alice will look in this directory. 1304 00:50:44,180 --> 00:50:46,240 She'll think she's not impersonated, right? 1305 00:50:46,240 --> 00:50:48,140 So the reason we started this research 1306 00:50:48,140 --> 00:50:49,700 is because I really want to-- 1307 00:50:49,700 --> 00:50:52,310 my thesis is on building these directories that 1308 00:50:52,310 --> 00:50:55,370 are efficiently auditable and have a hard time impersonating 1309 00:50:55,370 --> 00:50:56,148 people.
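The self-monitoring idea from the directory discussion above, where Bob checks his own entry in every version of the directory, can be sketched like this. The data layout is an assumption made for illustration: each directory version is modeled as a plain dictionary from user name to public key, and `find_impersonations` is a hypothetical helper. The check is only meaningful under non-equivocation, that is, when Bob is guaranteed to see the same sequence of directory versions (T1, T2, T3, ...) that Alice sees.

```python
def find_impersonations(user, my_keys, directory_versions):
    """Return (version_number, key) for every published key the user never registered.

    user:               name whose entries are being monitored, e.g. "bob"
    my_keys:            set of public keys the user actually registered
    directory_versions: list of dicts, one per directory version, oldest first
    """
    alerts = []
    for version, directory in enumerate(directory_versions, start=1):
        published = directory.get(user)
        if published is not None and published not in my_keys:
            alerts.append((version, published))
    return alerts


if __name__ == "__main__":
    # T1 and T2 are honest; at T3 the directory swaps in a fake key for Bob only.
    t1 = {"alice": "pk_alice", "bob": "pk_bob"}
    t2 = {"alice": "pk_alice", "bob": "pk_bob", "karl": "pk_karl"}
    t3 = {"alice": "pk_alice", "bob": "pk_bob_fake", "karl": "pk_karl"}
    print(find_impersonations("bob", {"pk_bob"}, [t1, t2, t3]))
    print(find_impersonations("alice", {"pk_alice"}, [t1, t2, t3]))
```

As the lecture stresses, this gives detection and not prevention: Alice may already have encrypted to the fake key by the time Bob scans version T3, and the alert alone is not a proof that would convince a third party.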
1310 00:50:56,148 --> 00:50:58,190 So that's why we decided to look at how could you 1311 00:50:58,190 --> 00:50:59,960 do this with Bitcoin. 1312 00:50:59,960 --> 00:51:02,600 Yeah, so of course there's one project called KeyChat 1313 00:51:02,600 --> 00:51:04,940 that we're working on with some high school students. 1314 00:51:04,940 --> 00:51:07,065 And we're using the Keybase public key directory. 1315 00:51:07,065 --> 00:51:09,050 And we're witnessing it in a Catena log 1316 00:51:09,050 --> 00:51:13,370 so that stuff like that doesn't happen. 1317 00:51:13,370 --> 00:51:16,700 OK, so now let's talk about the blockchains. 1318 00:51:16,700 --> 00:51:18,710 In general, people nowadays like to say I 1319 00:51:18,710 --> 00:51:20,940 need a blockchain for x, right? 1320 00:51:20,940 --> 00:51:22,940 I need a blockchain for supply chain management. 1321 00:51:22,940 --> 00:51:25,820 I need a blockchain for cats, for whatever. 1322 00:51:25,820 --> 00:51:27,770 I've heard a lot of crazy stories. 1323 00:51:27,770 --> 00:51:30,430 IoT, self-driving cars, blah, blah, blah. 1324 00:51:30,430 --> 00:51:35,120 And I think the right way to think about blockchain 1325 00:51:35,120 --> 00:51:37,310 is to never ever say that word unless you 1326 00:51:37,310 --> 00:51:38,850 use quotes, first of all. 1327 00:51:38,850 --> 00:51:41,870 And second of all, to understand what Byzantine state 1328 00:51:41,870 --> 00:51:43,185 machine replication is. 1329 00:51:43,185 --> 00:51:45,560 Right, and if you understand what Byzantine state machine 1330 00:51:45,560 --> 00:51:47,687 replication is or a Byzantine consensus, 1331 00:51:47,687 --> 00:51:48,770 you understand blockchain. 1332 00:51:48,770 --> 00:51:50,210 And you understand all the hype. 1333 00:51:50,210 --> 00:51:52,790 And then you can make some progress in solving problems. 1334 00:51:52,790 --> 00:51:54,350 In the sense that what is blockchain?
1335 00:51:54,350 --> 00:51:55,737 So what we're doing here is we're 1336 00:51:55,737 --> 00:51:57,320 doing a Byzantine consensus algorithm. 1337 00:51:57,320 --> 00:51:59,520 We're agreeing on a log of operations. 1338 00:51:59,520 --> 00:52:01,550 Right, by the way, that's what Catena does too 1339 00:52:01,550 --> 00:52:03,960 by piggybacking on bitcoin. 1340 00:52:03,960 --> 00:52:07,080 It agrees on a log of operations. 1341 00:52:07,080 --> 00:52:11,390 Right, but the other thing that SMR or Byzantine consensus does 1342 00:52:11,390 --> 00:52:13,340 is that it also allows you to agree 1343 00:52:13,340 --> 00:52:16,520 on the execution of the ops in that log. 1344 00:52:16,520 --> 00:52:18,530 So in Catena, you don't agree on the execution, 1345 00:52:18,530 --> 00:52:20,430 you just agree on the statements. 1346 00:52:20,430 --> 00:52:23,030 But there is no execution of those statements in the sense 1347 00:52:23,030 --> 00:52:25,550 that you can't build another bitcoin 1348 00:52:25,550 --> 00:52:29,870 on top of bitcoin in Catena because you can't prevent 1349 00:52:29,870 --> 00:52:33,560 double spends of transactions that 1350 00:52:33,560 --> 00:52:34,768 are Catena statements, right? 1351 00:52:34,768 --> 00:52:37,102 Like the Catena statements, you have to look in each one 1352 00:52:37,102 --> 00:52:38,260 and tell if it's correct. 1353 00:52:38,260 --> 00:52:41,360 So to detect a double spend in a Catena-backed cryptocurrency, 1354 00:52:41,360 --> 00:52:43,510 you would have to download all of the transactions, 1355 00:52:43,510 --> 00:52:46,850 because you cannot execute them like the bitcoin miners do 1356 00:52:46,850 --> 00:52:49,430 and build this UTXO set. 1357 00:52:49,430 --> 00:52:51,290 I'm not sure this is making a lot of sense. 1358 00:52:51,290 --> 00:52:54,060 But let's put it another way.
1359 00:52:54,060 --> 00:52:57,677 In bitcoin, you have block one and then 1360 00:52:57,677 --> 00:52:58,760 you have block two, right? 1361 00:53:01,307 --> 00:53:03,890 And there's a hash pointer and there's a bunch of transactions 1362 00:53:03,890 --> 00:53:05,450 here, right? 1363 00:53:05,450 --> 00:53:08,570 And remember that what prevents me from double spending 1364 00:53:08,570 --> 00:53:09,770 something here-- 1365 00:53:09,770 --> 00:53:11,510 I can have two transactions in this block 1366 00:53:11,510 --> 00:53:13,670 that double spend the same one here. 1367 00:53:13,670 --> 00:53:16,340 What prevents me from doing that is exactly this execution 1368 00:53:16,340 --> 00:53:18,140 stage, right? 1369 00:53:18,140 --> 00:53:21,110 Because in the execution stage, when I try to execute this 1370 00:53:21,110 --> 00:53:24,020 first transaction and I mark this output as spent, 1371 00:53:24,020 --> 00:53:25,910 when I execute the second transaction, 1372 00:53:25,910 --> 00:53:27,660 I cannot spend that output anymore, right? 1373 00:53:27,660 --> 00:53:30,285 In Catena, you can't do anything like that with the statements. 1374 00:53:30,285 --> 00:53:31,740 You just agree on the statements. 1375 00:53:31,740 --> 00:53:33,170 In Catena, you would put this transaction 1376 00:53:33,170 --> 00:53:35,503 in the log, this one and then that one and someone would 1377 00:53:35,503 --> 00:53:38,060 have to detect that the second one is a bad one by actually 1378 00:53:38,060 --> 00:53:38,700 downloading it. 1379 00:53:38,700 --> 00:53:40,975 That's kind of what I'm trying to say here. 1380 00:53:40,975 --> 00:53:42,350 So in general, the way you 1381 00:53:42,350 --> 00:53:43,808 should be thinking about blockchain 1382 00:53:43,808 --> 00:53:45,740 is through the lens of Byzantine consensus.
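The contrast between executing transactions and merely ordering statements can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: transactions are plain dictionaries, outputs are opaque strings, and the names `execute_block` and `append_to_log` are hypothetical, not taken from Bitcoin or Catena code.

```python
def execute_block(utxo_set, transactions):
    """Apply transactions in order; reject any that spends a missing output."""
    accepted, rejected = [], []
    for tx in transactions:
        if all(inp in utxo_set for inp in tx["inputs"]):
            utxo_set.difference_update(tx["inputs"])  # mark inputs as spent
            utxo_set.update(tx["outputs"])            # create the new outputs
            accepted.append(tx["id"])
        else:
            rejected.append(tx["id"])                 # e.g. a double spend
    return accepted, rejected


def append_to_log(log, statements):
    """A statement log only fixes the order; nothing is executed or checked."""
    log.extend(statements)
    return log


utxos = {"coin0"}
tx_a = {"id": "a", "inputs": ["coin0"], "outputs": ["coin1"]}
tx_b = {"id": "b", "inputs": ["coin0"], "outputs": ["coin2"]}  # spends coin0 again

accepted, rejected = execute_block(utxos, [tx_a, tx_b])
log = append_to_log([], [tx_a, tx_b])
```

With execution, the second transaction is rejected as soon as it is processed. In the log-only case both entries are recorded, and a reader has to download and inspect them to notice the conflict.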
1383 00:53:45,740 --> 00:53:48,230 And that'll get you ahead of the curve 1384 00:53:48,230 --> 00:53:50,100 in this overly hyped space, right, 1385 00:53:50,100 --> 00:53:51,350 because it's really just this. 1386 00:53:51,350 --> 00:53:52,710 You're agreeing on a log of operations. 1387 00:53:52,710 --> 00:53:55,010 And then you're agreeing on the execution of those operations 1388 00:53:55,010 --> 00:53:56,030 according to some rules. 1389 00:53:56,030 --> 00:53:59,390 The rules in bitcoin are transaction cannot-- 1390 00:53:59,390 --> 00:54:02,880 there cannot be two inputs spending the same output more 1391 00:54:02,880 --> 00:54:03,380 or less. 1392 00:54:03,380 --> 00:54:05,180 There's other things too. 1393 00:54:05,180 --> 00:54:06,680 What that gives you is it allows you 1394 00:54:06,680 --> 00:54:09,510 to agree on a final state which in bitcoin are 1395 00:54:09,510 --> 00:54:10,760 the valid transactions. 1396 00:54:10,760 --> 00:54:12,345 That's the final state right. 1397 00:54:12,345 --> 00:54:14,510 In ethereum, for example, the final state 1398 00:54:14,510 --> 00:54:16,520 are the valid transactions, and the account 1399 00:54:16,520 --> 00:54:19,490 balances of everything, and the smart contract 1400 00:54:19,490 --> 00:54:20,930 state for everything. 1401 00:54:20,930 --> 00:54:23,040 And I guess you'll learn about that later. 1402 00:54:23,040 --> 00:54:25,460 So you can build arbitrarily complex things 1403 00:54:25,460 --> 00:54:29,020 with Byzantine consensus or with blockchains. 1404 00:54:29,020 --> 00:54:31,670 And the high level bit, if you want to look at it another way, 1405 00:54:31,670 --> 00:54:35,390 is that you have a program p, right, which could be anything, 1406 00:54:35,390 --> 00:54:36,808 could be a cryptocurrency. 
1407 00:54:36,808 --> 00:54:38,600 And then, instead of running this program p 1408 00:54:38,600 --> 00:54:41,540 on a single server s, what you do is you 1409 00:54:41,540 --> 00:54:47,380 distribute it on a bunch of servers, s1, s2, s3, s4, right? 1410 00:54:47,380 --> 00:54:49,610 And now as a result, to mess with this program p, 1411 00:54:49,610 --> 00:54:51,320 it's not enough to compromise one server, 1412 00:54:51,320 --> 00:54:53,860 you have to compromise a bunch of them. 1413 00:54:53,860 --> 00:54:55,220 Right? 1414 00:54:55,220 --> 00:54:57,620 OK so, and some of you might also 1415 00:54:57,620 --> 00:55:00,230 be familiar with this term permissioned blockchain. 1416 00:55:00,230 --> 00:55:03,890 So when you distribute this program p amongst n servers 1417 00:55:03,890 --> 00:55:07,760 where n is equal to, let's say, 3f plus 1 and f 1418 00:55:07,760 --> 00:55:10,300 is equal to 1 in this particular case. 1419 00:55:10,300 --> 00:55:13,220 In a permissioned blockchain, this n is fixed, right? 1420 00:55:13,220 --> 00:55:15,980 Once you've set n to 4, it has to stay 4. 1421 00:55:15,980 --> 00:55:17,640 These servers have to know one another. 1422 00:55:17,640 --> 00:55:19,640 They need to know each other's public keys. 1423 00:55:19,640 --> 00:55:23,720 And only one of the servers, f is equal to 1, can fail. 1424 00:55:23,720 --> 00:55:26,510 If more than one server fails, then all bets are off. 1425 00:55:26,510 --> 00:55:28,840 Your program can start doing arbitrary things. 1426 00:55:28,840 --> 00:55:30,590 In particular, if your program is bitcoin, 1427 00:55:30,590 --> 00:55:33,170 it can start double spending. 1428 00:55:33,170 --> 00:55:36,390 I'm moving a little bit fast, so I'll take some questions up 1429 00:55:36,390 --> 00:55:39,326 until this point before I go on. 1430 00:55:39,326 --> 00:55:40,315 Yes?
1431 00:55:40,315 --> 00:55:41,940 AUDIENCE: In a permissioned blockchain, 1432 00:55:41,940 --> 00:55:43,270 do they use proof of work? 1433 00:55:43,270 --> 00:55:44,770 ALIN TOMESCU: No, you don't have to. 1434 00:55:44,770 --> 00:55:47,400 And that's kind of what the hype is about. 1435 00:55:47,400 --> 00:55:49,800 This stuff, there is like-- 1436 00:55:49,800 --> 00:55:52,560 the first interesting paper on this 1437 00:55:52,560 --> 00:55:55,200 was 1976 or something like that. 1438 00:55:55,200 --> 00:55:58,420 So this is 30-, 40-year-old research. 1439 00:55:58,420 --> 00:56:00,510 We've known how to do permissioned consensus-- we 1440 00:56:00,510 --> 00:56:03,060 used to call it Byzantine consensus-- for 30 or 40 1441 00:56:03,060 --> 00:56:04,560 years, right? 1442 00:56:04,560 --> 00:56:06,030 So there's nothing new there. 1443 00:56:06,030 --> 00:56:09,575 It's just that it's very useful nowadays to say blockchain than 1444 00:56:09,575 --> 00:56:10,950 to say consensus because then you 1445 00:56:10,950 --> 00:56:14,243 get 10 more million from your venture capitalist folks. 1446 00:56:14,243 --> 00:56:16,410 AUDIENCE: But if you don't have to do proof of work, 1447 00:56:16,410 --> 00:56:19,440 do they ever do proof of work? 1448 00:56:19,440 --> 00:56:21,660 ALIN TOMESCU: It would be such a bad idea 1449 00:56:21,660 --> 00:56:24,090 technically to do proof of work in a permissioned consensus 1450 00:56:24,090 --> 00:56:24,590 algorithm. 1451 00:56:24,590 --> 00:56:30,510 It's just-- completely unnecessary, plus probably 1452 00:56:30,510 --> 00:56:32,450 insecure too. 1453 00:56:32,450 --> 00:56:36,090 Yeah, so now the reason you do proof of work 1454 00:56:36,090 --> 00:56:39,510 is because in a permissionless blockchain, 1455 00:56:39,510 --> 00:56:40,830 this n is not fixed. 1456 00:56:40,830 --> 00:56:43,080 n could go, let's say, n was 4. 1457 00:56:43,080 --> 00:56:45,330 It could go to 8. 1458 00:56:45,330 --> 00:56:47,760 Then it could go to 3.
1459 00:56:47,760 --> 00:56:50,750 Then it could go to 12. 1460 00:56:50,750 --> 00:56:53,460 In other words, people are joining and leaving 1461 00:56:53,460 --> 00:56:55,038 as they please. 1462 00:56:55,038 --> 00:56:56,580 And the reason you need proof of work 1463 00:56:56,580 --> 00:57:00,030 in bitcoin, one way you can look at it 1464 00:57:00,030 --> 00:57:03,090 is that you're really turning a permissioned consensus 1465 00:57:03,090 --> 00:57:05,280 algorithm into a permissionless one. 1466 00:57:05,280 --> 00:57:08,100 And a consensus algorithm is just voting. 1467 00:57:08,100 --> 00:57:10,440 These n folks are just voting. 1468 00:57:10,440 --> 00:57:16,170 And you need 2f plus 1 votes to sort of move on, right? 1469 00:57:16,170 --> 00:57:20,520 And if this n changes over time, like if the n becomes bigger, 1470 00:57:20,520 --> 00:57:23,670 it's very easy to take over a majority of the voters. 1471 00:57:23,670 --> 00:57:26,730 If I can just add fake voters to a permissioned consensus 1472 00:57:26,730 --> 00:57:29,920 algorithm, I can just take over the consensus algorithm. 1473 00:57:29,920 --> 00:57:32,520 In other words, I can take over more than f nodes. 1474 00:57:32,520 --> 00:57:35,460 Right, so the trick there is you have 1475 00:57:35,460 --> 00:57:37,360 to prevent that from happening. 1476 00:57:37,360 --> 00:57:39,270 And the only way to prevent that is to say, 1477 00:57:39,270 --> 00:57:42,450 look if you're going to join and then make my n bigger, 1478 00:57:42,450 --> 00:57:43,458 you better do some work. 1479 00:57:43,458 --> 00:57:45,000 So that it's not easy for you to join 1480 00:57:45,000 --> 00:57:48,540 because if you're a bad guy and you want to join, you know, 1481 00:57:48,540 --> 00:57:50,040 you can do that very easily unless I 1482 00:57:50,040 --> 00:57:53,727 require you to do some work. 
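A small sketch of why open membership breaks a voting-based protocol. It assumes the usual n = 3f + 1 sizing mentioned earlier in the talk, and it models "joining" as simply adding voters at no cost; the function names are hypothetical.

```python
def max_faults(n):
    """Largest f such that n >= 3f + 1, i.e. the faults n voters can tolerate."""
    return (n - 1) // 3


def breaks_assumption(honest, fake):
    """True once the attacker's free fake voters exceed the tolerated f."""
    n = honest + fake
    return fake > max_faults(n)


# Starting from 4 honest servers (f = 1), count how many free joins are needed.
fakes_needed = next(k for k in range(100) if breaks_assumption(4, k))
```

Under these assumptions a handful of cost-free joins is enough to exceed f, which is the gap that requiring work for each join is meant to close.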
1483 00:57:53,727 --> 00:57:55,310 So that's kind of the trick in turning 1484 00:57:55,310 --> 00:57:56,970 a permissioned consensus algorithm 1485 00:57:56,970 --> 00:57:58,460 into a permissionless one. 1486 00:57:58,460 --> 00:58:01,040 And in fact, the way these permissioned animals work 1487 00:58:01,040 --> 00:58:02,915 is completely different than bitcoin. 1488 00:58:02,915 --> 00:58:04,040 They are much more complex. 1489 00:58:04,040 --> 00:58:07,490 Bitcoin is incredibly simple as a consensus algorithm. 1490 00:58:07,490 --> 00:58:09,980 If you ever read a consensus algorithm paper, 1491 00:58:09,980 --> 00:58:11,228 you know, it's a bit insane. 1492 00:58:11,228 --> 00:58:12,770 Also to implement, it's a bit insane. 1493 00:58:12,770 --> 00:58:14,620 Bitcoin is very simple to implement compared 1494 00:58:14,620 --> 00:58:16,370 to these other things. I mean, bitcoin is, 1495 00:58:16,370 --> 00:58:17,850 of course, a complex beast as well. 1496 00:58:17,850 --> 00:58:21,170 But you should look at, let's say, the Practical Byzantine Fault 1497 00:58:21,170 --> 00:58:25,880 Tolerance paper, PBFT, and try and implement that. 1498 00:58:28,177 --> 00:58:30,010 So anyway, why am I telling you all of this? 1499 00:58:30,010 --> 00:58:31,593 The reason I'm telling you all of this 1500 00:58:31,593 --> 00:58:33,920 is because if you want to do a permissioned blockchain 1501 00:58:33,920 --> 00:58:35,902 for whatever reason, one way to do 1502 00:58:35,902 --> 00:58:38,360 that is to use your favorite Byzantine consensus algorithm. 1503 00:58:38,360 --> 00:58:39,500 So that would be-- 1504 00:58:39,500 --> 00:58:41,000 let's say PBFT. 1505 00:58:41,000 --> 00:58:46,473 This was 1999 from MIT. 1506 00:58:46,473 --> 00:58:47,390 So you could use that. 1507 00:58:47,390 --> 00:58:50,250 You could have a lot of fun implementing it.
1508 00:58:50,250 --> 00:58:52,970 Another thing you could do is you could take your program p 1509 00:58:52,970 --> 00:58:55,522 and just give it to an ethereum smart contract. 1510 00:58:55,522 --> 00:58:57,480 And you know, that the ethereum smart contract, 1511 00:58:57,480 --> 00:59:00,935 if the ethereum security assumption holds, 1512 00:59:00,935 --> 00:59:02,060 it will do the right thing. 1513 00:59:02,060 --> 00:59:04,902 It will execute your program p correctly, right? 1514 00:59:04,902 --> 00:59:06,860 But the other thing that you could do actually, 1515 00:59:06,860 --> 00:59:10,010 is you could use Catena to agree on these logs, 1516 00:59:10,010 --> 00:59:13,280 on the log of operations for your program. 1517 00:59:13,280 --> 00:59:17,630 And then you could use another 2f plus 1 servers or replicas 1518 00:59:17,630 --> 00:59:19,880 to do the execution stuff so that you 1519 00:59:19,880 --> 00:59:22,495 can agree on the final state. 1520 00:59:22,495 --> 00:59:25,120 And this gives you a very simple Byzantine consensus algorithm. 1521 00:59:25,120 --> 00:59:27,440 So remember Catena doesn't give you execution. 1522 00:59:27,440 --> 00:59:29,240 It allows you to agree on the log of ops. 1523 00:59:29,240 --> 00:59:32,690 To get the execution, you'd basically take a majority vote. 1524 00:59:32,690 --> 00:59:37,790 If you see you have 2f plus 1 replica servers 1525 00:59:37,790 --> 00:59:41,780 and if you see f plus 1 votes on a final state, 1526 00:59:41,780 --> 00:59:46,070 you know that's the right state because only f of them 1527 00:59:46,070 --> 00:59:46,850 are malicious. 1528 00:59:51,320 --> 00:59:53,600 So in fact, if you use Catena with 2f plus 1 replicas, 1529 00:59:53,600 --> 00:59:56,630 I claim that you can get sort of a permissioned blockchain that 1530 00:59:56,630 --> 01:00:00,200 sort of leverages the bitcoin blockchain to do the agreement 1531 01:00:00,200 --> 01:00:01,280 on the log of ops. 
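The majority-vote step can be sketched as follows. It assumes each of the 2f + 1 replicas has already replayed the agreed log and reports its final state as a comparable value such as a hash; `agreed_state` is a hypothetical helper, not part of Catena.

```python
from collections import Counter


def agreed_state(reported_states, f):
    """Return the state reported by at least f + 1 of 2f + 1 replicas, else None."""
    if len(reported_states) != 2 * f + 1:
        raise ValueError("expected exactly 2f + 1 replica reports")
    state, votes = Counter(reported_states).most_common(1)[0]
    # With at most f malicious replicas, f + 1 matching reports must include
    # at least one honest replica, so the state is the correct one.
    return state if votes >= f + 1 else None
```

For f = 1, three replicas report and two matching reports are enough. If no state reaches f + 1 votes, the sketch returns `None` rather than guessing.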
1532 01:00:01,280 --> 01:00:03,050 So in that sense, it's sort of a mix of a permissioned 1533 01:00:03,050 --> 01:00:03,980 and permissionless. 1534 01:00:03,980 --> 01:00:05,730 We haven't studied this, so we don't know 1535 01:00:05,730 --> 01:00:07,850 what properties it would have. 1536 01:00:07,850 --> 01:00:10,440 So that's future work. 1537 01:00:10,440 --> 01:00:14,400 And if you don't need the execution, for example, 1538 01:00:14,400 --> 01:00:16,520 if all you're doing is you're agreeing 1539 01:00:16,520 --> 01:00:19,070 on a public key directory-- like here, there's no execution. 1540 01:00:19,070 --> 01:00:21,920 This directory just is supposed to stay append-only, 1541 01:00:21,920 --> 01:00:24,020 we have some research that allows 1542 01:00:24,020 --> 01:00:26,450 you to prove that every transition is 1543 01:00:26,450 --> 01:00:28,832 an append-only directory. 1544 01:00:28,832 --> 01:00:30,290 And if you don't need execution, you 1545 01:00:30,290 --> 01:00:32,900 can just use Catena directly as I already 1546 01:00:32,900 --> 01:00:35,030 told you guys for the software transparency 1547 01:00:35,030 --> 01:00:39,860 application or for the public key directory application. 1548 01:00:39,860 --> 01:00:43,020 And if you want to do a permissionless blockchain then, 1549 01:00:43,020 --> 01:00:46,340 of course, you would have to roll your own. 1550 01:00:46,340 --> 01:00:48,770 But you have to proceed with caution there, right? 1551 01:00:48,770 --> 01:00:51,140 It's not an easy thing to do. 1552 01:00:51,140 --> 01:00:52,940 OK, so let's conclude now. 1553 01:00:52,940 --> 01:00:55,400 What we did here is that we enabled these applications 1554 01:00:55,400 --> 01:00:59,090 to efficiently leverage bitcoin's consensus, right? 1555 01:00:59,090 --> 01:01:02,240 So clients can download transactions selectively 1556 01:01:02,240 --> 01:01:05,360 rather than the full blockchain and prevent equivocation.
1557 01:01:05,360 --> 01:01:07,940 Right, and you only need to get 46 megabytes instead 1558 01:01:07,940 --> 01:01:11,420 of gigabytes from the Bitcoin blockchain. 1559 01:01:11,420 --> 01:01:13,223 So why does this matter? 1560 01:01:13,223 --> 01:01:15,140 These are just I think the three killer apps-- 1561 01:01:15,140 --> 01:01:17,550 secure software update, public key directories-- 1562 01:01:17,550 --> 01:01:18,967 by the way, the public key directories 1563 01:01:18,967 --> 01:01:21,260 also are applicable to HTTPS. 1564 01:01:21,260 --> 01:01:22,760 So when you go on Facebook, you have 1565 01:01:22,760 --> 01:01:24,325 to get Facebook's public key. 1566 01:01:24,325 --> 01:01:26,870 The certificate authorities that sign the public keys 1567 01:01:26,870 --> 01:01:28,533 are often compromised. 1568 01:01:28,533 --> 01:01:30,200 So there are often fake certs for Google, 1569 01:01:30,200 --> 01:01:32,180 for big companies like that. 1570 01:01:32,180 --> 01:01:33,420 And you might use it. 1571 01:01:33,420 --> 01:01:35,462 But if you have a public key directory, 1572 01:01:35,462 --> 01:01:36,920 Facebook and Google can immediately 1573 01:01:36,920 --> 01:01:39,980 notice those fake certs. 1574 01:01:39,980 --> 01:01:41,220 It's a step forward. 1575 01:01:41,220 --> 01:01:44,560 And for more, of course, you can read our paper. 1576 01:01:44,560 --> 01:01:48,980 It appeared in IEEE Security and Privacy 2017. 1577 01:01:48,980 --> 01:01:51,420 And I'll post the slides on GitHub too. 1578 01:01:51,420 --> 01:01:52,870 So there are links there. 1579 01:01:52,870 --> 01:01:56,030 Yeah so, again this is the high level overview of everything 1580 01:01:56,030 --> 01:01:58,340 that's previous work and our work. 1581 01:01:58,340 --> 01:02:01,940 The difference is very small. 1582 01:02:01,940 --> 01:02:04,760 And now we can also talk about other stuff. 1583 01:02:04,760 --> 01:02:06,400 In fact, I have more stuff to talk about.
1584 01:02:06,400 --> 01:02:09,920 But before we do, I'd like to have a discussion with you guys 1585 01:02:09,920 --> 01:02:11,500 if you have questions. 1586 01:02:11,500 --> 01:02:12,410 So? 1587 01:02:12,410 --> 01:02:17,050 AUDIENCE: So could you implement Catena in the actual bitcoin 1588 01:02:17,050 --> 01:02:18,510 node? 1589 01:02:18,510 --> 01:02:21,290 Would that be something that they would want? 1590 01:02:21,290 --> 01:02:26,480 It seems like it would be a good feature to add? 1591 01:02:26,480 --> 01:02:28,523 Or is it strictly separate? 1592 01:02:28,523 --> 01:02:30,440 ALIN TOMESCU: Yeah, I don't think you need to. 1593 01:02:30,440 --> 01:02:31,800 That's the whole point, right? 1594 01:02:31,800 --> 01:02:33,217 The whole point of the research is 1595 01:02:33,217 --> 01:02:36,517 how do we use bitcoin without getting the miners to accept 1596 01:02:36,517 --> 01:02:38,600 a new version of bitcoin, without changing bitcoin 1597 01:02:38,600 --> 01:02:40,990 in any way? 1598 01:02:40,990 --> 01:02:43,610 So no, I don't think-- there's nothing to do really, 1599 01:02:43,610 --> 01:02:44,655 we're just-- 1600 01:02:44,655 --> 01:02:46,280 we're taking bitcoin as it is and we're 1601 01:02:46,280 --> 01:02:48,740 piggybacking on top of it. 1602 01:02:48,740 --> 01:02:49,890 We couldn't change it. 1603 01:02:49,890 --> 01:02:51,410 I mean, there's a lot of things you can do in some sense. 1604 01:02:51,410 --> 01:02:53,600 But then you get a very different system. 1605 01:02:53,600 --> 01:02:56,430 Very different, we can talk about it more if you want. 1606 01:02:56,430 --> 01:02:57,270 Yeah? 
1607 01:02:57,270 --> 01:03:00,020 AUDIENCE: You were talking about how you can use this system 1608 01:03:00,020 --> 01:03:03,060 to verify software binaries and how 1609 01:03:03,060 --> 01:03:06,220 you want this to run in SPV mode on phones. So 1610 01:03:06,220 --> 01:03:08,707 how do you install software on the phones, say, 1611 01:03:08,707 --> 01:03:10,040 through Apple and the App Store? 1612 01:03:10,040 --> 01:03:12,470 Is there a way to sign the binary that you actually 1613 01:03:12,470 --> 01:03:14,420 get from the App Store? 1614 01:03:14,420 --> 01:03:16,730 ALIN TOMESCU: I think your question is really about how 1615 01:03:16,730 --> 01:03:20,188 do App Store binaries-- 1616 01:03:20,188 --> 01:03:21,730 how do you verify App Store binaries? 1617 01:03:21,730 --> 01:03:23,260 It's a chicken and an egg in some sense, right? 1618 01:03:23,260 --> 01:03:24,385 Is that what you're saying? 1619 01:03:24,385 --> 01:03:27,640 Yeah, you're right. 1620 01:03:27,640 --> 01:03:30,610 Eventually, I mean in the best case, 1621 01:03:30,610 --> 01:03:32,440 wishful thinking would be to say that look, 1622 01:03:32,440 --> 01:03:35,470 the App Store does this already for all of the binaries 1623 01:03:35,470 --> 01:03:42,120 that they publish, allowing the developers to make sure nobody 1624 01:03:42,120 --> 01:03:44,480 is posting malicious binaries on the App Store for them. 1625 01:03:48,120 --> 01:03:49,444 Yes? 1626 01:03:49,444 --> 01:03:51,772 AUDIENCE: Who do you envision running it? 1627 01:03:51,772 --> 01:03:54,230 ALIN TOMESCU: So I'd really like to see Keybase run Catena, 1628 01:03:54,230 --> 01:03:56,060 it seems like a missed opportunity 1629 01:03:56,060 --> 01:03:57,380 that they don't do this.
1630 01:03:57,380 --> 01:04:02,750 I'm sure they have better stuff to do but it's just really 1631 01:04:02,750 --> 01:04:04,450 easily allow Keybase-- 1632 01:04:04,450 --> 01:04:06,200 let's say Keybase has a mobile phone app, 1633 01:04:06,200 --> 01:04:09,110 it would allow that mobile phone app to verify the directory 1634 01:04:09,110 --> 01:04:11,490 and get much, much, much more security. 1635 01:04:11,490 --> 01:04:15,080 You know, no equivocation as long as nobody forks bitcoin. 1636 01:04:15,080 --> 01:04:17,150 Since Keybase is already publishing these digests 1637 01:04:17,150 --> 01:04:20,030 but they cannot be audited efficiently on a mobile phone. 1638 01:04:20,030 --> 01:04:21,980 I mean they can but not securely you know, 1639 01:04:21,980 --> 01:04:23,480 because full nodes can lie. 1640 01:04:33,090 --> 01:04:36,450 So there's a big problem with everything I said so far. 1641 01:04:36,450 --> 01:04:37,950 And nobody caught it. 1642 01:04:37,950 --> 01:04:40,710 So one problem is that what do you 1643 01:04:40,710 --> 01:04:44,247 do when you run out of funds? 1644 01:04:44,247 --> 01:04:46,580 Remember I said the log server starts with two bitcoins. 1645 01:04:46,580 --> 01:04:49,220 Let's say it issues thousands of transactions, 1646 01:04:49,220 --> 01:04:52,885 starts paying those $40 fees and it runs out of funds. 1647 01:04:52,885 --> 01:04:53,510 What do you do? 1648 01:04:53,510 --> 01:04:55,820 Then Yeah? 1649 01:04:55,820 --> 01:04:59,274 AUDIENCE: You can maybe reload the new transactions 1650 01:04:59,274 --> 01:05:03,188 with this transaction [INAUDIBLE].. 1651 01:05:03,188 --> 01:05:04,480 ALIN TOMESCU: What's your name? 1652 01:05:04,480 --> 01:05:05,170 AUDIENCE: Raul. 1653 01:05:05,170 --> 01:05:06,110 ALIN TOMESCU: Raul. 1654 01:05:06,110 --> 01:05:07,560 So Raul is saying you can reload. 1655 01:05:07,560 --> 01:05:08,750 And that's exactly right. 
1656 01:05:08,750 --> 01:05:13,680 We just have to change the transaction format slightly. 1657 01:05:13,680 --> 01:05:15,860 We talk about this in the paper as well. 1658 01:05:15,860 --> 01:05:19,190 But just to demonstrate real quickly. 1659 01:05:19,190 --> 01:05:23,510 Suppose, let's take a ridiculous example which hopefully 1660 01:05:23,510 --> 01:05:25,220 will never happen in bitcoin. 1661 01:05:25,220 --> 01:05:29,300 But suppose that the Bitcoin fee is 1 bitcoin. 1662 01:05:32,450 --> 01:05:35,070 So now I have one bitcoin here. 1663 01:05:35,070 --> 01:05:37,380 And I have s1 here. 1664 01:05:37,380 --> 01:05:40,640 And now I have zero bitcoins here. 1665 01:05:44,860 --> 01:05:49,270 All right, so zero bitcoins in this output. 1666 01:05:49,270 --> 01:05:52,097 And maybe s2 here. 1667 01:05:52,097 --> 01:05:53,180 So that would be terrible. 1668 01:05:53,180 --> 01:05:55,790 Right, now I can't go on. 1669 01:05:55,790 --> 01:05:58,650 Does everybody see this as a problem? 1670 01:05:58,650 --> 01:06:01,810 Right, so what Raul is saying is, look, add another input 1671 01:06:01,810 --> 01:06:07,130 here and make it take coins from some other transaction, 1672 01:06:07,130 --> 01:06:08,610 whatever, 20 bitcoins. 1673 01:06:08,610 --> 01:06:11,090 And now you get 20 bitcoins here. 1674 01:06:11,090 --> 01:06:13,982 Right, so you can easily refund transactions. 1675 01:06:19,070 --> 01:06:21,920 There is a bit more subtlety there in the sense 1676 01:06:21,920 --> 01:06:26,180 that you don't want to join logs-- 1677 01:06:26,180 --> 01:06:31,460 let's say if you have two logs, GTX and GTX prime 1678 01:06:31,460 --> 01:06:33,630 for different applications. 1679 01:06:33,630 --> 01:06:36,710 Right, and they start issuing statements-- s1, s2.
1680 01:06:36,710 --> 01:06:38,660 You don't want to be able to join-- 1681 01:06:38,660 --> 01:06:40,850 I'm sorry, s1, s1 prime-- 1682 01:06:40,850 --> 01:06:45,200 these two logs into a single log for certain reasons, right. 1683 01:06:45,200 --> 01:06:47,960 But this doesn't actually allow you to join them 1684 01:06:47,960 --> 01:06:48,965 in the sense that-- 1685 01:06:48,965 --> 01:06:50,840 let's say you actually do this and join them. 1686 01:06:50,840 --> 01:06:53,990 Right, so let's say this transaction here 1687 01:06:53,990 --> 01:06:58,840 came from GTX prime, right? 1688 01:06:58,840 --> 01:07:00,970 And there was an s1 prime here. 1689 01:07:04,428 --> 01:07:08,380 And you did this, right? 1690 01:07:08,380 --> 01:07:10,610 So the problem is this is no longer a valid Catena 1691 01:07:10,610 --> 01:07:15,560 transaction for this log because in a valid Catena 1692 01:07:15,560 --> 01:07:17,840 transaction-- the first input spends 1693 01:07:17,840 --> 01:07:20,450 the previous transaction's output. 1694 01:07:20,450 --> 01:07:24,110 But in this chain, it's the second input that 1695 01:07:24,110 --> 01:07:26,090 spends the previous output. 1696 01:07:26,090 --> 01:07:28,280 So I cannot join logs. 1697 01:07:28,280 --> 01:07:30,800 And this matters for a bunch of reasons 1698 01:07:30,800 --> 01:07:33,110 that we don't have to go into. 1699 01:07:37,350 --> 01:07:41,768 Yeah, so we talked about batching statements. 1700 01:07:41,768 --> 01:07:43,560 I want to show you guys some previous work, 1701 01:07:43,560 --> 01:07:47,190 so how did some previous work do this, since 1702 01:07:47,190 --> 01:07:49,152 we seem to have a bit of time. 1703 01:07:49,152 --> 01:07:50,610 OK, so there is some previous work 1704 01:07:50,610 --> 01:07:53,670 called Liar, Liar, Coins on Fire. 1705 01:07:53,670 --> 01:07:55,380 Have you guys seen this?
1706 01:07:55,380 --> 01:07:58,290 So the idea here is that it is a really nice piece of work 1707 01:07:58,290 --> 01:08:00,240 and lots of people in the bitcoin community 1708 01:08:00,240 --> 01:08:02,740 already know about this. 1709 01:08:02,740 --> 01:08:05,830 And I think it was an idea before the paper or maybe not, 1710 01:08:05,830 --> 01:08:07,330 I'm not sure. 1711 01:08:07,330 --> 01:08:10,330 But Tadge has a similar idea. 1712 01:08:10,330 --> 01:08:15,450 So for example, suppose we have this authority. 1713 01:08:15,450 --> 01:08:17,100 Imagine this is a Catena log server 1714 01:08:17,100 --> 01:08:21,149 and it publishes a transaction which locks two bitcoins 1715 01:08:21,149 --> 01:08:24,420 and locks those bitcoins to that public key. 1716 01:08:24,420 --> 01:08:26,473 And this authority sometimes will want 1717 01:08:26,473 --> 01:08:27,640 to say two different things. 1718 01:08:27,640 --> 01:08:31,250 It will want to say s and s prime, right? 1719 01:08:31,250 --> 01:08:35,319 And it will sign the statements with its secret key. 1720 01:08:37,990 --> 01:08:39,970 But the secret key that the authority 1721 01:08:39,970 --> 01:08:44,290 uses to sign statements is also a bitcoin secret key. 1722 01:08:44,290 --> 01:08:46,569 It's the same secret key that the authority 1723 01:08:46,569 --> 01:08:50,470 used to lock two bitcoins, so $20,000 or something. 1724 01:08:50,470 --> 01:08:53,410 I'm not sure if bitcoin plummeted since yesterday, 1725 01:08:53,410 --> 01:08:54,880 can never be sure. 1726 01:08:54,880 --> 01:08:56,560 Does the setting make sense? 1727 01:08:56,560 --> 01:08:57,720 So I have an authority. 1728 01:08:57,720 --> 01:08:59,710 It issues statements just like before. 1729 01:08:59,710 --> 01:09:04,649 And it numbers them with i, let's say. 1730 01:09:04,649 --> 01:09:07,535 And we want to prevent this authority from equivocating. 1731 01:09:07,535 --> 01:09:09,160 We're not actually going to prevent it.
1732 01:09:09,160 --> 01:09:12,340 We're just going to disincentivize it in the sense 1733 01:09:12,340 --> 01:09:15,430 that if this authority equivocates like this 1734 01:09:15,430 --> 01:09:18,700 for the same statement i, what I claim 1735 01:09:18,700 --> 01:09:21,490 is that anybody can then steal that authority's 1736 01:09:21,490 --> 01:09:23,800 bitcoin because equivocating like this 1737 01:09:23,800 --> 01:09:25,973 reveals the secret key. 1738 01:09:25,973 --> 01:09:27,640 And the reason it reveals the secret key 1739 01:09:27,640 --> 01:09:32,750 is because the signature shares the same i here. 1740 01:09:32,750 --> 01:09:35,270 So how many of you guys actually, you 1741 01:09:35,270 --> 01:09:37,340 did cover Schnorr signatures, right? 1742 01:09:37,340 --> 01:09:41,712 So did Tadge talk about how to do this with Schnorr? 1743 01:09:41,712 --> 01:09:43,170 AUDIENCE: This is the [INAUDIBLE].. 1744 01:09:46,270 --> 01:09:47,930 ALIN TOMESCU: But did Tadge cover it? 1745 01:09:47,930 --> 01:09:48,520 AUDIENCE: Yes. 1746 01:09:48,520 --> 01:09:49,600 ALIN TOMESCU: OK, great. 1747 01:09:49,600 --> 01:09:53,080 So yeah so let's go over that less briefly. 1748 01:09:53,080 --> 01:09:56,440 So again, the idea is that if someone observes these two 1749 01:09:56,440 --> 01:09:59,440 signatures on conflicting statements for the same i, 1750 01:09:59,440 --> 01:10:02,020 there is this box where you can put the two signatures 1751 01:10:02,020 --> 01:10:03,650 and get back the secret key. 1752 01:10:03,650 --> 01:10:05,130 And once you have the secret key, 1753 01:10:05,130 --> 01:10:06,760 we can spend this transaction and get 1754 01:10:06,760 --> 01:10:09,325 that authority's bitcoins. 
1755 01:10:09,325 --> 01:10:10,700 And then there's a lot of details 1756 01:10:10,700 --> 01:10:13,010 to get it right because you might notice 1757 01:10:13,010 --> 01:10:16,860 that if the authority does this, before they do this, 1758 01:10:16,860 --> 01:10:19,280 they might already be spending this transaction themselves 1759 01:10:19,280 --> 01:10:22,970 so as to prevent you from taking it. 1760 01:10:22,970 --> 01:10:25,490 And details on how to prevent the authority from doing that 1761 01:10:25,490 --> 01:10:27,350 are in that paper. 1762 01:10:27,350 --> 01:10:29,870 Yeah, and then you know, whoever discovered 1763 01:10:29,870 --> 01:10:31,950 this can spend those bitcoins. 1764 01:10:31,950 --> 01:10:36,170 And the idea is that this disincentivizes equivocation 1765 01:10:36,170 --> 01:10:38,200 by locking these funds under the secret key 1766 01:10:38,200 --> 01:10:40,190 of the bad authority. 1767 01:10:40,190 --> 01:10:41,420 But it does not prevent it. 1768 01:10:41,420 --> 01:10:44,150 Right, so in Catena we actually prevent equivocation. 1769 01:10:44,150 --> 01:10:46,220 We say, if you want to equivocate, 1770 01:10:46,220 --> 01:10:48,540 you better fork bitcoin. 1771 01:10:48,540 --> 01:10:52,100 Here they say, if you want to equivocate, 1772 01:10:52,100 --> 01:10:54,750 you're going to lose $20,000. 1773 01:10:54,750 --> 01:10:56,870 Right? 1774 01:10:56,870 --> 01:10:58,520 But you have to understand like this 1775 01:10:58,520 --> 01:11:00,872 could be a good authority that locked $20,000 here. 1776 01:11:00,872 --> 01:11:03,080 But the attackers are going to be-- they're not going 1777 01:11:03,080 --> 01:11:04,610 to care about those $20,000. 1778 01:11:04,610 --> 01:11:06,620 They're just going to steal the secret key, 1779 01:11:06,620 --> 01:11:08,780 equivocate and then the authority 1780 01:11:08,780 --> 01:11:11,690 is going to be left without money.
1781 01:11:11,690 --> 01:11:13,203 If the authority is the attacker, 1782 01:11:13,203 --> 01:11:14,120 then this makes sense. 1783 01:11:14,120 --> 01:11:15,828 But if the attacker is not the authority, 1784 01:11:15,828 --> 01:11:18,110 then this makes less sense because the authority is sort 1785 01:11:18,110 --> 01:11:22,070 of risking their bitcoins on the assumption 1786 01:11:22,070 --> 01:11:24,662 that no attacker can compromise them, which you know, 1787 01:11:24,662 --> 01:11:26,370 if you could do that in computer science, 1788 01:11:26,370 --> 01:11:28,328 I wouldn't be sitting here talking to you guys. 1789 01:11:30,330 --> 01:11:33,230 OK, so now how do you do this? 1790 01:11:33,230 --> 01:11:36,500 So do you remember Schnorr signatures real quickly? 1791 01:11:39,050 --> 01:11:44,700 An easy way to do this is using Schnorr signatures. 1792 01:11:44,700 --> 01:11:45,760 And I think-- 1793 01:11:45,760 --> 01:11:48,450 I could be wrong, but I think this new SegWit 1794 01:11:48,450 --> 01:11:54,360 update to bitcoin allows Schnorr signatures, right? 1795 01:11:54,360 --> 01:11:58,160 So with SegWit and with Schnorr signatures, 1796 01:11:58,160 --> 01:11:59,750 you can definitely do this. 1797 01:11:59,750 --> 01:12:02,480 And the idea is that a Schnorr signature, 1798 01:12:02,480 --> 01:12:11,210 if you recall, is just k plus h of m, g to the k, times s, 1799 01:12:11,210 --> 01:12:13,160 where s is the secret key. 1800 01:12:13,160 --> 01:12:16,230 Right, so this is a Schnorr signature on m, right? 1801 01:12:18,840 --> 01:12:20,782 How many recall this? 1802 01:12:20,782 --> 01:12:22,490 Right, and of course, it's not just this. 1803 01:12:22,490 --> 01:12:31,880 I mean, the signature is really this and this h of m, 1804 01:12:31,880 --> 01:12:33,710 g to the k. 1805 01:12:33,710 --> 01:12:36,740 So it's these two things, right.
1806 01:12:36,740 --> 01:12:44,870 But now I want to show you that if I sign two different things, 1807 01:12:44,870 --> 01:12:47,720 I can actually get s. 1808 01:12:47,720 --> 01:12:50,030 But if I sign them in a certain way-- 1809 01:12:50,030 --> 01:12:53,330 so remember what I said before is I would like this authority 1810 01:12:53,330 --> 01:12:59,490 to sign i m and i m prime. 1811 01:12:59,490 --> 01:13:03,200 Right, and if the authority does this, 1812 01:13:03,200 --> 01:13:05,150 I claim that I can get the secret key out. 1813 01:13:05,150 --> 01:13:07,100 But I have to have the same i. 1814 01:13:07,100 --> 01:13:10,340 So we relax this in a sense that we're not 1815 01:13:10,340 --> 01:13:15,950 going to use i here, we're just going to use this g to the k. 1816 01:13:15,950 --> 01:13:19,320 So if the authority uses the same g to the k to sign, 1817 01:13:19,320 --> 01:13:20,790 then we can extract the secret key. 1818 01:13:20,790 --> 01:13:23,200 And the way we can do that is I'll just show an example, 1819 01:13:23,200 --> 01:13:32,120 sig1 would be k plus h of m1 g to the k. 1820 01:13:32,120 --> 01:13:36,950 And it'll be sig 1 and I guess the associated hash 1821 01:13:36,950 --> 01:13:41,370 E1 would be h m1, g to the k. 1822 01:13:41,370 --> 01:13:43,490 Does everybody see this? 1823 01:13:43,490 --> 01:13:50,030 And then sig2 would be k plus h of m 2 g to the k. 1824 01:13:50,030 --> 01:13:53,440 So again, oh, you guys are clearly not paying attention. 1825 01:13:53,440 --> 01:13:54,440 I forgot the secret key. 1826 01:13:57,200 --> 01:14:00,960 E2 is hash of m2 g to the k. 1827 01:14:00,960 --> 01:14:05,090 OK, so I'm using the same g to the k and the same k. 1828 01:14:12,520 --> 01:14:14,320 So now how can I extract the secret key? 1829 01:14:14,320 --> 01:14:15,820 Does anybody see a solution to this? 
1830 01:14:18,970 --> 01:14:21,417 AUDIENCE: It's now just a system of two equations and two 1831 01:14:21,417 --> 01:14:23,500 variables where k and s are the unknown variables. 1832 01:14:23,500 --> 01:14:25,542 ALIN TOMESCU: So I think what Rahul was saying 1833 01:14:25,542 --> 01:14:32,500 is that s is just sig1 minus sig2 divided by e1 minus e2. 1834 01:14:32,500 --> 01:14:33,140 Is that right? 1835 01:14:33,140 --> 01:14:33,640 Yeah. 1836 01:14:39,140 --> 01:14:42,230 See this, because if I subtract this from this, 1837 01:14:42,230 --> 01:14:48,470 I just get h m1-- 1838 01:14:48,470 --> 01:14:50,240 I'm not going to have space here. 1839 01:14:50,240 --> 01:14:54,470 I get-- let's say it here-- 1840 01:14:54,470 --> 01:15:06,150 h of m1, g to the k minus h of m2, g to the k, times s. 1841 01:15:06,150 --> 01:15:08,680 All right? 1842 01:15:08,680 --> 01:15:12,590 And now I take this here in the denominator 1843 01:15:12,590 --> 01:15:14,580 and I simplify and I get s. 1844 01:15:14,580 --> 01:15:15,080 All right. 1845 01:15:18,430 --> 01:15:20,830 So it turns out that that's kind of the trick 1846 01:15:20,830 --> 01:15:22,900 that this work leverages, more or less. 1847 01:15:22,900 --> 01:15:27,000 The only sort of caveat here is that remember I said, 1848 01:15:27,000 --> 01:15:29,080 there is a position i for the statement. 1849 01:15:29,080 --> 01:15:31,330 But now I'm saying, there's no longer a position. 1850 01:15:31,330 --> 01:15:33,130 We have to use this g to the k. 1851 01:15:33,130 --> 01:15:35,290 And how do we map positions to g to the k 1852 01:15:35,290 --> 01:15:38,090 is another trick that you have to do but you can do it. 1853 01:15:38,090 --> 01:15:40,257 And if you want more details, you can read the paper 1854 01:15:40,257 --> 01:15:43,930 and I'm sure Tadge will tell you even more details about this.
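The extraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the construction from the paper: it uses the secp256k1 curve for "g to the k", SHA-256 for h, and hypothetical helper names (point_add, point_mul, challenge, sign, verify, extract_secret). It signs two conflicting statements with the same k and recovers s as (sig1 - sig2) / (e1 - e2) modulo the group order.

```python
import hashlib

# secp256k1 parameters (assumed here as the group; the lecture only says "g to the k").
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def point_mul(k, pt):
    """Double-and-add scalar multiplication (the additive analogue of g to the k)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def challenge(msg, R):
    """e = h(m, g^k), with SHA-256 standing in for h."""
    data = msg + R[0].to_bytes(32, "big") + R[1].to_bytes(32, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(secret, nonce, msg):
    """Return (sig, e) with sig = k + h(m, g^k) * s mod N."""
    e = challenge(msg, point_mul(nonce, G))
    return (nonce + e * secret) % N, e

def verify(pub, msg, sig, e):
    """Recompute g^k as sig*G - e*pub and check that the hash matches e."""
    R = point_add(point_mul(sig, G), point_mul((N - e) % N, pub))
    return R is not None and challenge(msg, R) == e

def extract_secret(sig1, e1, sig2, e2):
    """Two signatures sharing the same k give s = (sig1 - sig2) / (e1 - e2) mod N."""
    return (sig1 - sig2) * pow((e1 - e2) % N, -1, N) % N

# Made-up values for the demonstration only.
secret, nonce = 0x1CEB00DA, 0xC0FFEE
sig1, e1 = sign(secret, nonce, b"position i: statement s")
sig2, e2 = sign(secret, nonce, b"position i: statement s'")
assert extract_secret(sig1, e1, sig2, e2) == secret
```

Both signatures verify on their own, which is what makes the penalty work: anyone who sees the two valid signatures can run the subtraction and spend the locked coins. How a position i is tied to a particular g to the k is not shown here.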
1855 01:15:43,930 --> 01:15:45,850 I think the lightning network also leverages 1856 01:15:45,850 --> 01:15:48,550 this trick in some cases. 1857 01:15:51,850 --> 01:15:55,150 All right now, let's talk a little bit about some attacks 1858 01:15:55,150 --> 01:15:56,380 if there is time. 1859 01:15:56,380 --> 01:15:59,950 The most interesting attack would be the generalized vector 1860 01:15:59,950 --> 01:16:00,748 76 attack. 1861 01:16:00,748 --> 01:16:02,290 So this is a screenshot from a paper. 1862 01:16:02,290 --> 01:16:05,020 Can everybody see this? 1863 01:16:05,020 --> 01:16:09,070 So the generalized vector 76 attack is very interesting. 1864 01:16:09,070 --> 01:16:11,200 So you have an attacker, right? 1865 01:16:11,200 --> 01:16:14,970 And remember, the goal of the attacker is to replace-- 1866 01:16:14,970 --> 01:16:19,120 let's say this TX1 with that TX2. 1867 01:16:19,120 --> 01:16:22,180 It's going to show this TX1 to a merchant saying hey, 1868 01:16:22,180 --> 01:16:23,560 I paid you money. 1869 01:16:23,560 --> 01:16:26,560 And then it's going to show this TX2 to the merchant sending 1870 01:16:26,560 --> 01:16:27,700 the money back to himself. 1871 01:16:27,700 --> 01:16:30,640 So the whole trick is to fork the merchant on this side 1872 01:16:30,640 --> 01:16:32,980 and then to switch him back to that, right. 1873 01:16:32,980 --> 01:16:37,030 It's a sort of a pre-mining attack and this attack is much 1874 01:16:37,030 --> 01:16:40,780 easier to pull on SPV nodes because what the attacker does 1875 01:16:40,780 --> 01:16:41,290 is-- 1876 01:16:41,290 --> 01:16:46,020 let's see-- he works on the secret chain with TX1 in it. 1877 01:16:46,020 --> 01:16:48,410 And at some point, the main chain might take him over, 1878 01:16:48,410 --> 01:16:48,910 might win. 1879 01:16:48,910 --> 01:16:51,250 So the attacker kind of gives up and keeps trying.
1880 01:16:51,250 --> 01:16:53,740 But at some point the attacker gets ahead, 1881 01:16:53,740 --> 01:16:54,867 right in the secret chain. 1882 01:16:54,867 --> 01:16:56,200 He gets ahead of the main chain. 1883 01:16:56,200 --> 01:16:57,940 This stuff is not posted yet. 1884 01:16:57,940 --> 01:17:01,600 And let's say this merchant only for some reason because they're 1885 01:17:01,600 --> 01:17:06,190 silly, they only need one confirmation to accept the TX1. 1886 01:17:06,190 --> 01:17:07,940 And for other silly reasons, this merchant 1887 01:17:07,940 --> 01:17:10,960 is an SPV merchant, right? 1888 01:17:10,960 --> 01:17:12,730 Let's draw a figure here. 1889 01:17:12,730 --> 01:17:14,290 So this merchant is an SPV merchant 1890 01:17:14,290 --> 01:17:19,686 that only needs one confirmation to accept the transaction. 1891 01:17:23,970 --> 01:17:25,875 So we have the attacker. 1892 01:17:29,550 --> 01:17:31,470 And we have let's say, the SPV merchant. 1893 01:17:35,670 --> 01:17:38,300 Right, and now the attacker's sent you know, 1894 01:17:38,300 --> 01:17:42,750 this was the main block there, let's say, this was bi. 1895 01:17:42,750 --> 01:17:44,290 And the attacker forked it. 1896 01:17:44,290 --> 01:17:47,230 He put TX1 here. 1897 01:17:47,230 --> 01:17:49,720 And then had a confirmation on it, right? 1898 01:17:49,720 --> 01:17:52,780 So he sends this chain to the merchant. 1899 01:17:52,780 --> 01:17:55,137 And the other miners haven't found anything yet. 1900 01:17:55,137 --> 01:17:56,720 Right, so the attacker is a bit ahead. 1901 01:17:56,720 --> 01:17:58,240 So he's pre-mining. 1902 01:17:58,240 --> 01:17:59,320 So far, so good. 1903 01:17:59,320 --> 01:18:01,510 Are you all with me? 1904 01:18:01,510 --> 01:18:05,620 And remember that he really just is showing block headers. 1905 01:18:05,620 --> 01:18:06,937 So these are not full blocks. 1906 01:18:06,937 --> 01:18:08,270 He's just showing block headers.
1907 01:18:08,270 --> 01:18:11,620 They're much smaller blocks, 80 bytes. 1908 01:18:11,620 --> 01:18:14,160 And the blocks are missing. 1909 01:18:14,160 --> 01:18:15,940 All right, so when he shows TX1, he 1910 01:18:15,940 --> 01:18:21,340 is just showing like a Merkle path to TX1 to the merchant. 1911 01:18:21,340 --> 01:18:23,200 You guys with me? 1912 01:18:23,200 --> 01:18:25,020 OK, so now he did this to the merchant. 1913 01:18:25,020 --> 01:18:26,850 And now what the attacker is going to do, 1914 01:18:26,850 --> 01:18:29,430 he's going to relax, sit back, and post 1915 01:18:29,430 --> 01:18:33,240 the TX2 to the miners-- give TX2 to the miners. 1916 01:18:33,240 --> 01:18:35,700 Let's say the miners are here. 1917 01:18:35,700 --> 01:18:38,370 He's going to post TX2 to the miners. 1918 01:18:38,370 --> 01:18:39,930 And he'll stop mining, the attacker. 1919 01:18:39,930 --> 01:18:41,100 He's not going to mine anymore. 1920 01:18:41,100 --> 01:18:43,017 But he got the merchant to accept the payment. 1921 01:18:43,017 --> 01:18:44,730 And the merchant shipped the goods. 1922 01:18:44,730 --> 01:18:47,220 Maybe this is an online purchase. 1923 01:18:47,220 --> 01:18:52,740 And so now the miners, they will find the next block bi plus 1. 1924 01:18:52,740 --> 01:18:55,590 They'll put TX2 here. 1925 01:18:55,590 --> 01:18:58,650 They'll find the next block and then the next block. 1926 01:18:58,650 --> 01:18:59,760 And they'll take over. 1927 01:18:59,760 --> 01:19:01,385 Their chain will be the main chain. 1928 01:19:01,385 --> 01:19:02,760 And eventually this merchant will 1929 01:19:02,760 --> 01:19:04,770 hear about this main chain. 1930 01:19:04,770 --> 01:19:06,830 And he'll just see the headers, of course, 1931 01:19:06,830 --> 01:19:08,350 because he's an SPV merchant. 1932 01:19:08,350 --> 01:19:10,860 So now the SPV merchant just got double spent. 1933 01:19:10,860 --> 01:19:12,620 Does everybody see this?
1934 01:19:12,620 --> 01:19:14,340 That I just double spent the merchant? 1935 01:19:14,340 --> 01:19:16,470 And sort of the fundamental problem 1936 01:19:16,470 --> 01:19:20,760 here is that the merchant only received block headers. 1937 01:19:20,760 --> 01:19:23,250 So he cannot take these block headers and broadcast them 1938 01:19:23,250 --> 01:19:26,460 to the miners because miners don't mine on top of block 1939 01:19:26,460 --> 01:19:27,510 headers. 1940 01:19:27,510 --> 01:19:29,760 Miners mine on top of full blocks. 1941 01:19:29,760 --> 01:19:32,340 So this merchant has no protection against this 1942 01:19:32,340 --> 01:19:34,150 if he's doing SPV-- 1943 01:19:34,150 --> 01:19:35,960 if he's accepting SPV payments. 1944 01:19:35,960 --> 01:19:39,210 If these were full blocks, what the merchant could have done 1945 01:19:39,210 --> 01:19:41,273 was the following. 1946 01:19:41,273 --> 01:19:42,690 The merchant would have 1947 01:19:42,690 --> 01:19:51,060 received the full block here with the transaction in it, 1948 01:19:51,060 --> 01:19:51,560 right. 1949 01:19:51,560 --> 01:19:53,390 So again the attacker got ahead. 1950 01:19:53,390 --> 01:19:55,940 But now the merchant because he saw a block, 1951 01:19:55,940 --> 01:19:59,900 he's going to ship this block with TX1 to the miners 1952 01:19:59,900 --> 01:20:01,010 because he's a full node. 1953 01:20:01,010 --> 01:20:01,910 Full nodes do that. 1954 01:20:01,910 --> 01:20:05,000 When they hear about a block, they broadcast it. 1955 01:20:05,000 --> 01:20:07,370 So now the miners know about this block. 1956 01:20:07,370 --> 01:20:09,480 They're going to continue mining on top of it. 1957 01:20:09,480 --> 01:20:11,480 And in fact, the merchant will send both blocks, 1958 01:20:11,480 --> 01:20:12,680 right to the miners. 1959 01:20:12,680 --> 01:20:15,800 So now miners will continue building here.
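The check an SPV merchant performs on TX1, a Merkle path from the transaction up to the Merkle root in a block header, can be sketched as follows. This is a minimal illustration, assuming bitcoin-style double SHA-256 and duplication of the last hash on odd levels; the function names (dsha, merkle_root_and_path, verify_merkle_path) and the sample leaves are hypothetical and not taken from any client.

```python
import hashlib

def dsha(data):
    """Double SHA-256, the hash bitcoin uses for its Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_path(leaves, index):
    """Return the root and the sibling path for leaves[index].

    Each path entry is (sibling_hash, sibling_is_left). An odd-sized
    level duplicates its last hash, as bitcoin does.
    """
    level, path = list(leaves), []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        level = [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], path

def verify_merkle_path(leaf, path, root):
    """What the SPV side does: hash up the path and compare with the header's root."""
    h = leaf
    for sibling, sibling_is_left in path:
        h = dsha(sibling + h) if sibling_is_left else dsha(h + sibling)
    return h == root

# Toy transaction ids for the demonstration only.
txids = [dsha(b"tx%d" % i) for i in range(5)]
root, path = merkle_root_and_path(txids, 3)
assert verify_merkle_path(txids[3], path, root)
```

A valid path only shows that the transaction is committed to by that header. It says nothing about whether the header ends up on the main chain, which is the gap the attack above exploits.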
1960 01:20:15,800 --> 01:20:21,380 So now the attacker has a bit of a tougher problem on his hands. 1961 01:20:21,380 --> 01:20:23,480 There is still a way to trick even full nodes, 1962 01:20:23,480 --> 01:20:25,460 even if this guy is a full node, you 1963 01:20:25,460 --> 01:20:27,730 can sort of leverage some timing assumptions 1964 01:20:27,730 --> 01:20:29,050 to still trick the full node. 1965 01:20:29,050 --> 01:20:32,000 And the details are in that paper over there. 1966 01:20:32,000 --> 01:20:35,193 But that's one way-- 1967 01:20:35,193 --> 01:20:37,610 somebody was asking, I think you were asking Anne, right-- 1968 01:20:37,610 --> 01:20:39,170 about SPV nodes and how they're less secure. 1969 01:20:39,170 --> 01:20:41,712 And this is one fundamental way in which they're less secure. 1970 01:20:41,712 --> 01:20:44,510 If you accept payments with SPV nodes 1971 01:20:44,510 --> 01:20:46,410 you're really playing with fire. 1972 01:20:46,410 --> 01:20:50,930 You know, you do need a sufficiently powerful attacker 1973 01:20:50,930 --> 01:20:52,680 who can get ahead. 1974 01:20:52,680 --> 01:20:54,160 But yeah. 1975 01:20:54,160 --> 01:20:59,010 OK, so with that I think that kind of concludes the lecture. 1976 01:20:59,010 --> 01:21:00,260 Any final questions? 1977 01:21:05,190 --> 01:21:06,250 All right, cool. 1978 01:21:06,250 --> 01:21:07,970 Thank you guys.