1 00:00:19,023 --> 00:00:21,190 PROFESSOR: We spent the last few lectures developing 2 00:00:21,190 --> 00:00:23,530 the theory of graph limits. 3 00:00:23,530 --> 00:00:26,230 And one of the motivations I gave 4 00:00:26,230 --> 00:00:28,720 at the beginning of the lecture on graph limits 5 00:00:28,720 --> 00:00:32,830 was that there were certain graph inequalities. 6 00:00:32,830 --> 00:00:37,580 Specifically, if I tell you that your graph has edge density one 7 00:00:37,580 --> 00:00:41,840 half, what's the minimum possible C4 density? 8 00:00:41,840 --> 00:00:43,810 So for those kinds of problems, graph limits 9 00:00:43,810 --> 00:00:47,890 gives us a very nice language for describing 10 00:00:47,890 --> 00:00:49,990 what the answer is, and also sometimes 11 00:00:49,990 --> 00:00:51,830 for solving these problems. 12 00:00:51,830 --> 00:00:56,620 So today, I want to dive more into these types of problems. 13 00:00:56,620 --> 00:00:59,470 Specifically, we're going to be talking about homomorphism 14 00:00:59,470 --> 00:01:00,730 density inequalities. 15 00:01:06,502 --> 00:01:07,340 Homomorphism. 16 00:01:17,990 --> 00:01:20,450 So trying to understand what is the relationship 17 00:01:20,450 --> 00:01:24,900 between possible subgraph densities or homomorphism 18 00:01:24,900 --> 00:01:28,830 densities within a large graph. 19 00:01:28,830 --> 00:01:31,180 We've seen these kind of problems in the past. 20 00:01:31,180 --> 00:01:32,640 So one of the very first theorems 21 00:01:32,640 --> 00:01:35,105 that we did in this course was Turan's theorem 22 00:01:35,105 --> 00:01:35,980 and Mantel's theorem. 
23 00:01:41,710 --> 00:01:44,310 So specifically, for Mantel's theorem, 24 00:01:44,310 --> 00:01:48,930 it tells us something about the possible edge versus triangle 25 00:01:48,930 --> 00:01:53,730 densities in the graph, which is something 26 00:01:53,730 --> 00:01:56,940 that I want to spend the first part of today's lecture 27 00:01:56,940 --> 00:01:57,810 focusing on. 28 00:01:57,810 --> 00:02:00,130 So what is the possible relationship? 29 00:02:00,130 --> 00:02:07,650 What are all the possible edge versus triangle densities 30 00:02:07,650 --> 00:02:10,539 in the graph? 31 00:02:10,539 --> 00:02:12,630 Mantel's theorem tells us something-- 32 00:02:12,630 --> 00:02:18,120 namely, that if your edge density exceeds one half, 33 00:02:18,120 --> 00:02:22,130 then your triangle density cannot be zero. 34 00:02:22,130 --> 00:02:24,120 So that's what Mantel's theorem tells us. 35 00:02:24,120 --> 00:02:27,230 And let me write it down like this. 36 00:02:27,230 --> 00:02:29,150 So the statement I just said, the one 37 00:02:29,150 --> 00:02:32,690 about Mantel's theorem, and more generally for Turan's theorem 38 00:02:32,690 --> 00:02:42,110 tells us that if the Kr plus 1 density in W is 0, 39 00:02:42,110 --> 00:02:52,930 then necessarily the edge density is at most 1 minus 1 40 00:02:52,930 --> 00:02:53,430 over r. 41 00:02:58,210 --> 00:03:01,090 So this is what it tells us. 42 00:03:01,090 --> 00:03:03,130 It gives us some information about what 43 00:03:03,130 --> 00:03:04,570 are the possible densities. 44 00:03:04,570 --> 00:03:07,490 But I would like to know more generally, 45 00:03:07,490 --> 00:03:09,370 or a good complete picture, of what 46 00:03:09,370 --> 00:03:13,600 is a set of edge versus triangle density inequalities. 47 00:03:13,600 --> 00:03:17,340 So let me draw a picture that captures 48 00:03:17,340 --> 00:03:19,870 what we're looking for. 
49 00:03:19,870 --> 00:03:27,950 So here on the x-axis, I have all the possible edge 50 00:03:27,950 --> 00:03:33,360 densities, and on the vertical axis, 51 00:03:33,360 --> 00:03:36,700 I have the triangle density. 52 00:03:36,700 --> 00:03:40,160 And I would like to know what is a set of feasible 53 00:03:40,160 --> 00:03:42,590 points in this box. 54 00:03:45,790 --> 00:03:48,730 Mantel's theorem tells us already something-- 55 00:03:48,730 --> 00:03:54,530 namely, when can you-- so this region, 56 00:03:54,530 --> 00:03:58,300 the horizontal line at zero extends at most 57 00:03:58,300 --> 00:04:01,300 until the halfway point. 58 00:04:01,300 --> 00:04:05,418 Beyond this point, it's not a part of the feasible region. 59 00:04:05,418 --> 00:04:07,210 So far that's the information that we know. 60 00:04:09,790 --> 00:04:13,860 Our discussion about graph limits, and in particular-- 61 00:04:13,860 --> 00:04:16,870 so let me first write down what is the question. 62 00:04:16,870 --> 00:04:24,870 So if you look at the set of possible edge versus triangle 63 00:04:24,870 --> 00:04:31,840 densities, so there is this region here. 64 00:04:36,070 --> 00:04:37,780 What is this region? 65 00:04:37,780 --> 00:04:41,530 It's a subset of this unit square. 66 00:04:41,530 --> 00:04:43,270 We would like to understand what is 67 00:04:43,270 --> 00:04:47,120 the set of all possibilities. 68 00:04:47,120 --> 00:04:49,910 The compactness of the space of graphons 69 00:04:49,910 --> 00:04:54,260 tells us that this region is compact. 70 00:04:54,260 --> 00:05:02,570 So let me call this region D23 for edge versus triangle. 71 00:05:02,570 --> 00:05:09,050 So D23 is compact because the space of graphons 72 00:05:09,050 --> 00:05:13,190 is compact under the cut metric, and densities are 73 00:05:13,190 --> 00:05:16,890 continuous under cut distance. 
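To make the region D23 concrete, here is a minimal Python sketch that computes the point (edge density, triangle density) for a small graph by brute force. The function names and the adjacency-matrix representation are illustrative assumptions, not something from the lecture; densities are homomorphism densities, i.e. the fraction of vertex maps that send every edge of the pattern to an edge of the graph.

```python
from itertools import product

K2 = [(0, 1)]                  # pattern edges of a single edge
K3 = [(0, 1), (1, 2), (0, 2)]  # pattern edges of a triangle

def hom_density(pattern_edges, k, adj):
    """Fraction of maps from the k pattern vertices into V(G) that send
    every pattern edge to an edge of G (adj is a 0/1 adjacency matrix)."""
    n = len(adj)
    count = 0
    for phi in product(range(n), repeat=k):
        if all(adj[phi[u]][phi[v]] for u, v in pattern_edges):
            count += 1
    return count / n ** k

def complete_graph(n):
    return [[int(i != j) for j in range(n)] for i in range(n)]

def complete_bipartite(a, b):
    n = a + b
    return [[int((i < a) != (j < a)) for j in range(n)] for i in range(n)]

def edge_triangle_point(adj):
    """The point (t(K2, G), t(K3, G)) in the unit square."""
    return hom_density(K2, 2, adj), hom_density(K3, 3, adj)
```

For example, a balanced complete bipartite graph lands at edge density one half with triangle density zero, which is the endpoint of the horizontal segment that Mantel's theorem describes.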
74 00:05:16,890 --> 00:05:21,500 So in particular, if you have some limit point 75 00:05:21,500 --> 00:05:25,610 of some sequence of graphs, that limit point is 76 00:05:25,610 --> 00:05:29,030 achieved by a corresponding limit graphon. 77 00:05:29,030 --> 00:05:33,280 So you really have a nice closed region over here. 78 00:05:33,280 --> 00:05:35,040 So we don't have to-- 79 00:05:35,040 --> 00:05:36,968 I should be able to tell you the answer. 80 00:05:36,968 --> 00:05:37,760 This is the region. 81 00:05:37,760 --> 00:05:39,780 There should not be any additional quantifiers. 82 00:05:39,780 --> 00:05:41,812 There's no optimizer zero, missing this point 83 00:05:41,812 --> 00:05:42,770 and missing that point. 84 00:05:42,770 --> 00:05:43,770 It's a closed region. 85 00:05:43,770 --> 00:05:46,970 So what is this closed region? 86 00:05:46,970 --> 00:05:50,450 Equivalently, we can ask the following question. 87 00:05:50,450 --> 00:05:56,820 Suppose I give you the edge density. 88 00:05:56,820 --> 00:06:00,080 In other words, look at a particular horizontal position 89 00:06:00,080 --> 00:06:01,310 in this picture. 90 00:06:01,310 --> 00:06:12,687 What are the maximum and minimum possible triangle densities? 91 00:06:21,620 --> 00:06:27,580 So I tell you that the edge density is 0.75. 92 00:06:27,580 --> 00:06:31,590 What are the upper and lower boundaries of this region? 93 00:06:31,590 --> 00:06:35,000 I want you to think about why this region is-- 94 00:06:35,000 --> 00:06:38,090 the vertical cross-section is a line segment. 95 00:06:38,090 --> 00:06:40,160 You cannot have any holes. 96 00:06:40,160 --> 00:06:42,388 So that requires an argument, and I'll 97 00:06:42,388 --> 00:06:43,430 let you think about that. 98 00:06:50,270 --> 00:06:52,530 So I want to complete this picture, 99 00:06:52,530 --> 00:06:53,780 and I'll show you some proofs.
100 00:06:53,780 --> 00:06:55,945 And at the end of-- 101 00:06:55,945 --> 00:06:57,570 well, by the middle of today's lecture, 102 00:06:57,570 --> 00:06:59,695 we'll see a picture of what this region looks like. 103 00:07:03,820 --> 00:07:04,360 All right. 104 00:07:04,360 --> 00:07:07,870 First, let me do the easier direction, which 105 00:07:07,870 --> 00:07:12,370 is to find the upper boundary of this region. 106 00:07:12,370 --> 00:07:15,360 So what is the maximum possible triangle density 107 00:07:15,360 --> 00:07:17,160 for a given edge density? 108 00:07:28,950 --> 00:07:34,680 And the answer-- so it turns out to be what I will-- 109 00:07:34,680 --> 00:07:37,080 the result I will tell you is a special case 110 00:07:37,080 --> 00:07:40,012 of what's called Kruskal-Katona. 111 00:07:46,150 --> 00:07:47,290 Think about it this way. 112 00:07:47,290 --> 00:07:49,850 Suppose I give you a very large number of vertices 113 00:07:49,850 --> 00:07:52,060 and I give you some large number of edges, 114 00:07:52,060 --> 00:07:55,300 and I want you to put the edges into the graph in a way that 115 00:07:55,300 --> 00:07:59,350 generates as many triangles as possible. 116 00:07:59,350 --> 00:08:01,120 Intuitively, how should you put the edges 117 00:08:01,120 --> 00:08:04,380 in to try to make as many triangles as you can? 118 00:08:08,196 --> 00:08:09,150 AUDIENCE: Clique. 119 00:08:09,150 --> 00:08:10,460 PROFESSOR: In a clique. 120 00:08:10,460 --> 00:08:14,920 So you put all the edges as closely together as possible, 121 00:08:14,920 --> 00:08:16,820 try to form a clique. 122 00:08:16,820 --> 00:08:24,730 So maximize number of triangles by forming a clique. 123 00:08:28,370 --> 00:08:29,620 And that is indeed the answer. 124 00:08:29,620 --> 00:08:33,340 And this is what we'll prove, at least in the graph densities 125 00:08:33,340 --> 00:08:34,270 version. 
126 00:08:34,270 --> 00:08:38,419 So we will show that the upper boundary is given by the curve 127 00:08:38,419 --> 00:08:41,440 y equals x to the 3/2. 128 00:08:44,420 --> 00:08:46,570 So don't worry about the specific function. 129 00:08:46,570 --> 00:08:54,170 But what's important is that the upper bound is achieved 130 00:08:54,170 --> 00:08:55,360 by the following graphon. 131 00:08:58,700 --> 00:09:01,490 Namely, this graphon corresponding to a clique. 132 00:09:08,450 --> 00:09:14,780 For this graphon here, the edge density is a squared, 133 00:09:14,780 --> 00:09:19,980 and the triangle density is a cubed. 134 00:09:19,980 --> 00:09:22,440 And it turns out this graphon is the best that you 135 00:09:22,440 --> 00:09:27,170 can do with a given edge density in order to generate 136 00:09:27,170 --> 00:09:31,500 as many triangles or the most triangle density possible. 137 00:09:31,500 --> 00:09:34,220 In other words, what we'll prove is 138 00:09:34,220 --> 00:09:40,670 that the triangle density-- throughout, W will be a graphon. 139 00:09:40,670 --> 00:09:42,980 So W is always a graphon. 140 00:09:42,980 --> 00:09:44,870 So values between 0 and 1. 141 00:09:50,230 --> 00:09:52,434 Then you have the following inequality. 142 00:10:00,840 --> 00:10:03,420 So let's prove it. 143 00:10:03,420 --> 00:10:05,700 First let me draw you what this shape looks like. 144 00:10:19,490 --> 00:10:22,760 Because of the relationship between graphs and graph 145 00:10:22,760 --> 00:10:25,760 limits, for any of these inequalities about graph 146 00:10:25,760 --> 00:10:29,660 limits, about graphons, it's sufficient to prove 147 00:10:29,660 --> 00:10:32,750 the corresponding inequality for graphs 148 00:10:32,750 --> 00:10:35,840 because the set of graphs is dense 149 00:10:35,840 --> 00:10:39,620 within the space of graphons according to the topology-- 150 00:10:39,620 --> 00:10:41,930 namely, the cut metric that we discussed.
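The board formulas are not captured in the transcript. As a hedged reconstruction from the spoken description, with t(H, W) denoting the homomorphism density of H in W and W_a an assumed name for the clique graphon, the upper bound and its extremal example can be written as:

```latex
% Upper boundary of the edge-triangle region: for every graphon W,
t(K_3, W) \le t(K_2, W)^{3/2}.

% Clique graphon: W_a(x, y) = 1 if x, y \in [0, a], and 0 otherwise. Then
t(K_2, W_a) = a^2, \qquad t(K_3, W_a) = a^3 = t(K_2, W_a)^{3/2},
% so the point (a^2, a^3) lies on the curve y = x^{3/2}.
```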
151 00:10:41,930 --> 00:10:52,100 So it suffices to show the corresponding inequality 152 00:10:52,100 --> 00:10:54,350 about graphs-- 153 00:10:54,350 --> 00:10:59,750 namely, that the K3 density in a graph 154 00:10:59,750 --> 00:11:08,580 is at most the K2 density in a graph raised to the power 3/2. 155 00:11:08,580 --> 00:11:11,460 So let me belabor this point just a little bit more. 156 00:11:11,460 --> 00:11:14,850 This inequality is a subset of those inequalities 157 00:11:14,850 --> 00:11:18,210 up there because graphs sit inside the space of graphons. 158 00:11:18,210 --> 00:11:22,860 But because they sit inside as a dense subset, 159 00:11:22,860 --> 00:11:27,100 if you know this inequality and everything is continuous, 160 00:11:27,100 --> 00:11:29,290 then you know that inequality. 161 00:11:29,290 --> 00:11:32,240 So these two are equivalent to each other. 162 00:11:32,240 --> 00:11:36,760 Now, with graphs-- and specifically, 163 00:11:36,760 --> 00:11:42,850 these counts here, so triangle densities and edge densities-- 164 00:11:42,850 --> 00:11:46,850 they correspond to counting closed walks in the graph. 165 00:11:46,850 --> 00:11:53,290 So in particular, if we're interested in the number 166 00:11:53,290 --> 00:11:56,590 of K3 homomorphisms in a graph, this 167 00:11:56,590 --> 00:12:01,180 is the same as counting closed walks of length 3. 168 00:12:01,180 --> 00:12:04,300 And there was an important identity we used earlier 169 00:12:04,300 --> 00:12:07,630 when we were discussing the proof of quasi-random graphs, 170 00:12:07,630 --> 00:12:11,080 that for counting closed walks you should 171 00:12:11,080 --> 00:12:14,210 look at the spectral moment. 172 00:12:14,210 --> 00:12:16,030 So that's a very important tool, to look 173 00:12:16,030 --> 00:12:17,410 at the spectral moment-- 174 00:12:17,410 --> 00:12:21,850 namely, the sum of the third powers of the eigenvalues 175 00:12:21,850 --> 00:12:24,580 of the adjacency matrix of this graph.
176 00:12:27,430 --> 00:12:41,820 These are the eigenvalues of the adjacency matrix of G. 177 00:12:41,820 --> 00:12:46,110 I claim that this sum here is upper bounded 178 00:12:46,110 --> 00:12:52,230 by a corresponding sum of squares raised 179 00:12:52,230 --> 00:12:56,307 to the power that normalizes. 180 00:12:56,307 --> 00:12:58,640 The first time I saw this I was a bit confused because I 181 00:12:58,640 --> 00:12:59,850 remember the power mean inequality. 182 00:12:59,850 --> 00:13:00,990 Shouldn't it go the other way? 183 00:13:00,990 --> 00:13:03,390 But actually, no, this is the correct direction. 184 00:13:03,390 --> 00:13:05,070 So let me remind you why. 185 00:13:05,070 --> 00:13:11,930 So if you have a t that is at least 1, then claim that-- 186 00:13:11,930 --> 00:13:17,550 and you have a bunch of non-negative reals, 187 00:13:17,550 --> 00:13:23,430 then the claim is that this t-th power 188 00:13:23,430 --> 00:13:33,200 sum is less than or equal to the t-th power of the sum. 189 00:13:33,200 --> 00:13:35,635 Now, there are several ways to see why this is true. 190 00:13:35,635 --> 00:13:36,510 You can do induction. 191 00:13:36,510 --> 00:13:40,320 But let me show you one way which is quite neat. 192 00:13:40,320 --> 00:13:43,140 Because it's homogeneous in the variables, 193 00:13:43,140 --> 00:13:56,480 I can assume that the sum is 1, in which case 194 00:13:56,480 --> 00:14:01,280 the left-hand side is equal to this sum of t-th powers. 195 00:14:05,540 --> 00:14:12,080 So because I assumed that everything is non-negative, 196 00:14:12,080 --> 00:14:14,460 all these a's are between 0 and 1. 197 00:14:14,460 --> 00:14:18,950 So now, this sum is less than or equal to the same sum 198 00:14:18,950 --> 00:14:24,160 without the t's because each a is at most 1 and t is at least 1. 199 00:14:24,160 --> 00:14:27,410 And that's equal to 1, which is the right-hand side. 200 00:14:36,380 --> 00:14:38,430 So this is true.
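The board work for this step is not in the transcript. A hedged write-up of the power-sum claim follows; it assumes t is at least 1 (which is what the argument uses) and uses a_i and lambda_i as labels for the reals and the eigenvalues:

```latex
% Claim: for t \ge 1 and a_1, \dots, a_n \ge 0,
a_1^t + \cdots + a_n^t \le (a_1 + \cdots + a_n)^t.

% Proof sketch: both sides scale by c^t when every a_i is replaced by c a_i,
% so assume a_1 + \cdots + a_n = 1. Then each a_i \in [0, 1], hence a_i^t \le a_i, and
\sum_i a_i^t \le \sum_i a_i = 1 = \Bigl(\sum_i a_i\Bigr)^t.

% Applied with a_i = \lambda_i^2 and t = 3/2:
\sum_i \lambda_i^3 \le \sum_i |\lambda_i|^3 \le \Bigl(\sum_i \lambda_i^2\Bigr)^{3/2}.
```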
201 00:14:38,430 --> 00:14:41,460 And now we have the sum of the squares 202 00:14:41,460 --> 00:14:44,430 of the eigenvalues, which is also 203 00:14:44,430 --> 00:14:52,053 a moment of the eigenvalues-- 204 00:14:52,053 --> 00:14:53,220 namely, corresponding to K2. 205 00:15:05,850 --> 00:15:08,640 So the same inequality is true for graph homomorphisms. 206 00:15:08,640 --> 00:15:13,440 And to get to the inequality for densities, 207 00:15:13,440 --> 00:15:18,120 we just divide by the number of vertices raised 208 00:15:18,120 --> 00:15:20,880 to the third power from both sides, 209 00:15:20,880 --> 00:15:23,050 and we get the inequality that we're looking for. 210 00:15:27,180 --> 00:15:31,010 So that's the proof of the upper bound. 211 00:15:31,010 --> 00:15:31,910 Any questions? 212 00:15:37,440 --> 00:15:40,350 There is something that bothers me slightly about this proof. 213 00:15:40,350 --> 00:15:41,610 Look, it's a correct proof. 214 00:15:41,610 --> 00:15:43,360 So there is nothing wrong with this proof. 215 00:15:43,360 --> 00:15:44,235 Everything is kosher. 216 00:15:44,235 --> 00:15:46,932 Everything is correct. 217 00:15:46,932 --> 00:15:48,390 You might ask, is there a way to do 218 00:15:48,390 --> 00:15:51,750 this spectral argument in graphons 219 00:15:51,750 --> 00:15:53,730 without passing to graphs? 220 00:15:53,730 --> 00:15:56,940 And yes, you can, because for graphons you 221 00:15:56,940 --> 00:15:59,370 can also talk about spectrum. 222 00:15:59,370 --> 00:16:01,440 It turns out to be a compact operator, 223 00:16:01,440 --> 00:16:03,045 so that spectrum makes sense. 224 00:16:03,045 --> 00:16:05,670 You have to develop a little bit more theory about the spectrum 225 00:16:05,670 --> 00:16:07,800 of compact operators, but everything, more or less, 226 00:16:07,800 --> 00:16:09,470 works exactly the same way. 227 00:16:09,470 --> 00:16:12,300 It's just easier to talk about graphs. 
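As a sanity check on the graph version of the inequality, the sketch below counts closed walks through traces of powers of the adjacency matrix, using the standard facts that hom(K2, G) = tr(A^2) and hom(K3, G) = tr(A^3). It is only an illustration under those assumptions, with hypothetical helper names; it compares squares and cubes of integers so that no floating-point roots are needed.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def closed_walk_counts(adj):
    """Return (tr(A^2), tr(A^3)): closed walks of length 2 and of length 3."""
    a2 = mat_mul(adj, adj)
    a3 = mat_mul(a2, adj)
    return trace(a2), trace(a3)

def satisfies_upper_bound(adj):
    """Check hom(K3, G) <= hom(K2, G)^(3/2) as hom(K3)^2 <= hom(K2)^3."""
    hom_k2, hom_k3 = closed_walk_counts(adj)
    return hom_k3 ** 2 <= hom_k2 ** 3
```

Dividing both sides of hom(K3, G) <= hom(K2, G)^(3/2) by the cube of the number of vertices gives the density form of the inequality.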
228 00:16:12,300 --> 00:16:14,730 But what bothers me about this proof 229 00:16:14,730 --> 00:16:18,000 is that we started with what I would 230 00:16:18,000 --> 00:16:21,090 call a physical inequality, meaning that it only 231 00:16:21,090 --> 00:16:28,290 has to do with the actual edges and subgraph densities. 232 00:16:28,290 --> 00:16:32,950 But the proof involved going to the spectrum. 233 00:16:32,950 --> 00:16:35,150 And that bothers me a little bit. 234 00:16:35,150 --> 00:16:37,680 There's nothing incorrect about it, but somehow in my mind 235 00:16:37,680 --> 00:16:40,980 a physical inequality deserves a physical proof. 236 00:16:40,980 --> 00:16:43,780 So I use the word physical in contrast 237 00:16:43,780 --> 00:16:47,910 to frequency, which is coming from Fourier analysis. 238 00:16:47,910 --> 00:16:50,770 And that's the next thing we'll do in this course. 239 00:16:50,770 --> 00:16:52,540 But this proof goes to the spectrum. 240 00:16:52,540 --> 00:16:54,710 It goes to something beyond the physical domain. 241 00:16:54,710 --> 00:16:55,210 OK. 242 00:16:55,210 --> 00:16:55,750 It's neat. 243 00:16:55,750 --> 00:16:58,420 But I want to show you a different proof that stays 244 00:16:58,420 --> 00:17:00,540 within the physical domain. 245 00:17:00,540 --> 00:17:01,582 And this other proof-- 246 00:17:01,582 --> 00:17:03,790 I mean, it's always nice to see some different proofs 247 00:17:03,790 --> 00:17:07,282 because you can use it to apply to different situations. 248 00:17:07,282 --> 00:17:08,740 And there are some situations where 249 00:17:08,740 --> 00:17:15,730 you might not be able to use this spectral characterization. 250 00:17:15,730 --> 00:17:20,190 For example, what if your K3 is now K4? 251 00:17:20,190 --> 00:17:24,260 A similar inequality is true, but this proof doesn't show it, 252 00:17:24,260 --> 00:17:25,420 at least not directly. 253 00:17:25,420 --> 00:17:27,819 You have to do a little bit of extra work. 
254 00:17:36,240 --> 00:17:39,420 So let me show you a different proof of the upper bound. 255 00:17:42,480 --> 00:17:45,282 And we'll prove a slightly stronger statement. 256 00:17:58,580 --> 00:18:03,820 Namely, that for all, not just graphons-- 257 00:18:03,820 --> 00:18:06,380 it's not so important, but for all symmetric measurable 258 00:18:06,380 --> 00:18:12,135 functions from the unit square to R, 259 00:18:12,135 --> 00:18:15,330 one has the following inequality-- 260 00:18:15,330 --> 00:18:19,890 namely, that the K3 density in W is 261 00:18:19,890 --> 00:18:26,340 upper bounded by the K2 density of W squared raised 262 00:18:26,340 --> 00:18:28,700 to power 3/2. 263 00:18:28,700 --> 00:18:31,730 Here, the square is meant to be a point-wise square. 264 00:18:40,563 --> 00:18:41,480 So a couple of things. 265 00:18:41,480 --> 00:18:46,620 If your W is a graphon or a graph, then-- 266 00:18:46,620 --> 00:18:48,910 if it's a graph, then it's 0, 1 valued. 267 00:18:48,910 --> 00:18:51,990 So taking this point-wise square doesn't do anything. 268 00:18:51,990 --> 00:18:54,980 If you're a graphon, you can always 269 00:18:54,980 --> 00:18:57,380 put one more inequality that replaces it by the thing 270 00:18:57,380 --> 00:18:59,870 that we're looking for because W is always 271 00:18:59,870 --> 00:19:02,000 bounded between 0 and 1. 272 00:19:02,000 --> 00:19:04,510 So it's a slightly stronger inequality. 273 00:19:04,510 --> 00:19:10,550 Let me show it to you by writing down a series of inequalities 274 00:19:10,550 --> 00:19:14,190 and applying the Cauchy-Schwarz inequality repeatedly. 275 00:19:14,190 --> 00:19:18,770 So it's, again, an exercise in using Cauchy-Schwarz. 276 00:19:18,770 --> 00:19:24,890 And we will use three applications of Cauchy-Schwarz. 277 00:19:29,210 --> 00:19:31,430 Essentially, three applications-- one 278 00:19:31,430 --> 00:19:35,640 corresponding to every edge of this triangle.
279 00:19:35,640 --> 00:19:43,510 So let me begin by writing down the expression in graphons 280 00:19:43,510 --> 00:19:45,670 corresponding to the K3 density. 281 00:20:02,590 --> 00:20:07,690 I'm going to apply Cauchy-Schwarz by-- 282 00:20:07,690 --> 00:20:09,880 so I'm going to apply Cauchy-Schwarz 283 00:20:09,880 --> 00:20:18,960 to the variable x, holding all the other variables constant. 284 00:20:18,960 --> 00:20:22,490 So hold y and z constant. 285 00:20:22,490 --> 00:20:23,840 Going to apply to dx. 286 00:20:23,840 --> 00:20:27,080 You see there are two factors that involve the variable x. 287 00:20:27,080 --> 00:20:29,990 So apply Cauchy-Schwarz to them, you split each of them 288 00:20:29,990 --> 00:20:30,560 into an L2 norm. 289 00:20:37,980 --> 00:20:39,850 So one of these factors becomes that. 290 00:20:39,850 --> 00:20:41,850 By the way, all of these are definite integrals. 291 00:20:41,850 --> 00:20:43,725 I'm just omitting the domains of integration. 292 00:20:43,725 --> 00:20:48,640 All the integrals are taken from 0 to 1. 293 00:20:48,640 --> 00:20:50,850 So the second application-- sorry, 294 00:20:50,850 --> 00:20:56,670 the second factor becomes like that. 295 00:20:56,670 --> 00:20:59,315 And the third factor is left intact. 296 00:21:07,700 --> 00:21:09,910 So that's the first application of Cauchy-Schwarz. 297 00:21:09,910 --> 00:21:13,840 You apply it with respect to dx to these two factors. 298 00:21:13,840 --> 00:21:15,806 Split them like that. 299 00:21:15,806 --> 00:21:18,355 AUDIENCE: There's a normalization missing. 300 00:21:18,355 --> 00:21:19,230 PROFESSOR: Thank you. 301 00:21:19,230 --> 00:21:20,750 There is a normalization missing. 302 00:21:25,550 --> 00:21:26,660 OK. 303 00:21:26,660 --> 00:21:28,630 Guess what the second step is? 304 00:21:28,630 --> 00:21:31,310 Going to apply Cauchy-Schwarz again, but now 305 00:21:31,310 --> 00:21:36,430 to dy, to one more variable.
306 00:21:39,220 --> 00:21:43,140 Cauchy-Schwarz with respect to dy. 307 00:21:43,140 --> 00:21:46,800 There are two factors now that involve the letter y. 308 00:21:46,800 --> 00:21:51,570 So I apply Cauchy-Schwarz and I get the following. 309 00:21:51,570 --> 00:22:01,630 The first factor now just becomes the L2 norm of W. 310 00:22:01,630 --> 00:22:03,610 The second factor does not involve y, 311 00:22:03,610 --> 00:22:04,780 so it is left intact. 312 00:22:12,840 --> 00:22:19,290 And the third factor is again integrated with respect 313 00:22:19,290 --> 00:22:23,070 to y after taking the square. 314 00:22:29,080 --> 00:22:31,010 And there's now dz that remains. 315 00:22:33,630 --> 00:22:35,100 Last step. 316 00:22:35,100 --> 00:22:37,440 You can guess, you integrate with respect 317 00:22:37,440 --> 00:22:42,570 to dz and apply Cauchy-Schwarz. 318 00:22:42,570 --> 00:22:47,420 Apply Cauchy-Schwarz to the last two factors. 319 00:22:47,420 --> 00:22:50,310 And there, actually, the outside integral goes away. 320 00:23:15,690 --> 00:23:16,190 OK. 321 00:23:16,190 --> 00:23:17,690 So you get this product. 322 00:23:17,690 --> 00:23:24,470 And you see every single term is just the L2 norm of W. 323 00:23:24,470 --> 00:23:31,170 So you have that, which is the same as what I wrote over here. 324 00:23:38,430 --> 00:23:42,002 Any questions? 325 00:23:42,002 --> 00:23:42,986 Yeah. 326 00:23:42,986 --> 00:23:46,167 AUDIENCE: Where do you use the fact that W is symmetric? 327 00:23:46,167 --> 00:23:47,250 PROFESSOR: Great question. 328 00:23:47,250 --> 00:23:50,348 So where do I use the fact that W is symmetric? 329 00:23:53,150 --> 00:23:56,065 So let's see. 330 00:23:56,065 --> 00:23:57,690 In some sense, we're not using the fact 331 00:23:57,690 --> 00:23:59,640 that W is symmetric because there 332 00:23:59,640 --> 00:24:02,940 is a slightly more general inequality you can write down. 
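The chain of inequalities written on the board is not in the transcript. A hedged reconstruction of the three Cauchy-Schwarz steps described above is sketched here; all integrals are over [0, 1], and replacing W by |W| first lets one assume W is non-negative.

```latex
\begin{align*}
t(K_3, W)
  &= \int W(x,y)\, W(x,z)\, W(y,z) \, dx \, dy \, dz \\
  % Cauchy-Schwarz in x, with y and z held fixed:
  &\le \int \Bigl(\int W(x,y)^2 \, dx\Bigr)^{1/2}
            \Bigl(\int W(x,z)^2 \, dx\Bigr)^{1/2} W(y,z) \, dy \, dz \\
  % Cauchy-Schwarz in y, with z held fixed:
  &\le \int \Bigl(\int W(x,y)^2 \, dx \, dy\Bigr)^{1/2}
            \Bigl(\int W(x,z)^2 \, dx\Bigr)^{1/2}
            \Bigl(\int W(y,z)^2 \, dy\Bigr)^{1/2} dz \\
  % Cauchy-Schwarz in z:
  &\le \Bigl(\int W(x,y)^2 \, dx \, dy\Bigr)^{1/2}
       \Bigl(\int W(x,z)^2 \, dx \, dz\Bigr)^{1/2}
       \Bigl(\int W(y,z)^2 \, dy \, dz\Bigr)^{1/2} \\
  &= \|W\|_2^3 = t(K_2, W^2)^{3/2}.
\end{align*}
```

Nothing in this chain uses symmetry of W, which is consistent with the answer to the audience question.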
333 00:24:02,940 --> 00:24:06,810 And actually, the question gives me a good chance 334 00:24:06,810 --> 00:24:11,610 to do a slight diversion into how this inequality is 335 00:24:11,610 --> 00:24:13,470 related to Holder's inequality. 336 00:24:13,470 --> 00:24:15,690 So this is actually one of my favorite inequalities 337 00:24:15,690 --> 00:24:20,940 for this kind of combinatorial inequalities on graphons. 338 00:24:20,940 --> 00:24:24,690 So many of you may be familiar with Holder's inequality 339 00:24:24,690 --> 00:24:25,650 in the following form. 340 00:24:25,650 --> 00:24:29,220 If I have three functions, if I integrate them, 341 00:24:29,220 --> 00:24:33,930 then you can upper bound this integral 342 00:24:33,930 --> 00:24:38,520 by the product of the L3 norms. 343 00:24:38,520 --> 00:24:42,400 And likewise, if you have more functions. 344 00:24:42,400 --> 00:24:45,600 So if you apply just this inequality directly, 345 00:24:45,600 --> 00:24:49,707 you get a weaker estimate. 346 00:24:49,707 --> 00:24:52,040 So you don't get anything that's quite as strong as what 347 00:24:52,040 --> 00:24:54,390 you're looking for over there. 348 00:24:54,390 --> 00:24:58,580 So what happens is that if you know-- 349 00:24:58,580 --> 00:25:06,620 so if f, g, and h each depends only 350 00:25:06,620 --> 00:25:18,500 on a subset of the coordinates in the following way 351 00:25:18,500 --> 00:25:26,950 that f depends on only x and y, g depends only on x and z, 352 00:25:26,950 --> 00:25:30,450 and h depends only on y and z, then 353 00:25:30,450 --> 00:25:32,640 if you repeat that proof verbatim 354 00:25:32,640 --> 00:25:34,430 with three different functions, you 355 00:25:34,430 --> 00:25:38,100 will find that you can upper bound 356 00:25:38,100 --> 00:25:45,060 this product, this integral, by the product of the L2 norms. 357 00:25:50,280 --> 00:25:53,640 So L2 norms are in general less than or equal to the L3 norms. 
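For reference, the two inequalities being compared here can be written side by side. This is a hedged restatement of what was presumably on the board, for measurable f, g, h on a probability space (here the unit cube):

```latex
% Hölder's inequality for three functions:
\int f g h \le \|f\|_3 \, \|g\|_3 \, \|h\|_3.

% Stronger version when each function depends on only two of the three coordinates:
\int f(x,y)\, g(x,z)\, h(y,z) \, dx \, dy \, dz \le \|f\|_2 \, \|g\|_2 \, \|h\|_2.
```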
358 00:25:53,640 --> 00:25:57,100 So here we're inside a probability measure space. 359 00:25:57,100 --> 00:26:02,140 So the entire space has volume 1. 360 00:26:02,140 --> 00:26:03,713 So this is a stronger inequality, 361 00:26:03,713 --> 00:26:05,880 and this is the inequality that comes up over there. 362 00:26:09,550 --> 00:26:10,135 Yeah. 363 00:26:10,135 --> 00:26:12,718 AUDIENCE: Is there an entirely graph theoretic proof of this-- 364 00:26:12,718 --> 00:26:14,985 say, for graphs instead of graphons-- that doesn't 365 00:26:14,985 --> 00:26:16,272 involve going to spectrum? 366 00:26:16,272 --> 00:26:16,980 PROFESSOR: Great. 367 00:26:16,980 --> 00:26:18,640 So the question is, is there entirely 368 00:26:18,640 --> 00:26:20,680 graph theoretic proof of this? 369 00:26:20,680 --> 00:26:26,020 So the reason why I mentioned that this result is a special 370 00:26:26,020 --> 00:26:27,606 case of Kruskal-Katona-- 371 00:26:32,850 --> 00:26:35,940 so Kruskal-Katona actually is a stronger result, 372 00:26:35,940 --> 00:26:43,560 which tells you precisely how you should construct a graph. 373 00:26:43,560 --> 00:26:50,700 So given exactly m edges, what's the maximum number 374 00:26:50,700 --> 00:26:52,200 of triangles? 375 00:26:56,800 --> 00:27:00,040 And the statement that there is actually-- 376 00:27:00,040 --> 00:27:02,380 it's a very precise result. It tells 377 00:27:02,380 --> 00:27:08,760 you, for example, if you have K choose 2 edges, 378 00:27:08,760 --> 00:27:11,580 you have at most K choose 3 triangles. 379 00:27:14,600 --> 00:27:17,530 It's not just at the density level but exactly. 380 00:27:17,530 --> 00:27:20,530 And even if the number of edges is not in the form of K 381 00:27:20,530 --> 00:27:22,830 choose 2, it tells you what to do. 382 00:27:22,830 --> 00:27:25,140 And actually, the answer is pretty easy to describe. 
383 00:27:25,140 --> 00:27:29,520 It's almost intuitive. So if I give you a bunch of matchsticks 384 00:27:29,520 --> 00:27:32,310 and ask you to construct a graph with as many triangles as you 385 00:27:32,310 --> 00:27:33,780 can, what should you do? 386 00:27:33,780 --> 00:27:37,160 You start with one, two, filling a triangle. 387 00:27:37,160 --> 00:27:40,680 Start filling a triangle. 388 00:27:40,680 --> 00:27:41,730 Another vertex. 389 00:27:41,730 --> 00:27:43,100 1, 2, 3, 4. 390 00:27:43,100 --> 00:27:44,690 You keep going. 391 00:27:44,690 --> 00:27:47,190 And that's the best way to do it. 392 00:27:47,190 --> 00:27:49,470 And that's what Kruskal-Katona tells you. 393 00:27:49,470 --> 00:27:53,410 So that's a more precise version of this inequality. 394 00:27:53,410 --> 00:27:56,680 And the Kruskal-Katona, the combinatorial version, 395 00:27:56,680 --> 00:28:00,100 is proved via a combinatorial shifting argument, also known 396 00:28:00,100 --> 00:28:01,493 as a compression argument. 397 00:28:01,493 --> 00:28:03,160 Namely, if you start with a given graph, 398 00:28:03,160 --> 00:28:05,500 there are some transformations you do to that graph 399 00:28:05,500 --> 00:28:08,050 to push your edges in one direction that 400 00:28:08,050 --> 00:28:10,840 keeps the number of edges exactly the same 401 00:28:10,840 --> 00:28:13,650 but increases the number of triangles at each step. 402 00:28:13,650 --> 00:28:17,218 And eventually, you push everything into a clique. 403 00:28:17,218 --> 00:28:18,760 So it's something you can read about. 404 00:28:18,760 --> 00:28:22,033 It's a very nice result. Other questions? 405 00:28:26,110 --> 00:28:30,770 So we've solved the upper bound. 406 00:28:30,770 --> 00:28:33,340 So from examples and from this upper bound proof, 407 00:28:33,340 --> 00:28:35,560 we see that this curve is the upper boundary.
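The matchstick construction can be sketched in Python as follows. This is an illustration only: the edge order used here (finish all edges to earlier vertices before introducing a new vertex) is one way to read the description above, and the function names are hypothetical. It builds the graph and counts its triangles; it does not prove optimality.

```python
from itertools import combinations

def greedy_clique_edges(m):
    """First m edges in the order (0,1), (0,2), (1,2), (0,3), (1,3), (2,3), ...
    Each new vertex is joined to earlier vertices before the next one appears."""
    edges = []
    v = 1
    while len(edges) < m:
        for u in range(v):
            if len(edges) == m:
                break
            edges.append((u, v))
        v += 1
    return edges

def count_triangles(edges):
    """Number of triangles in the graph whose edges are pairs (u, v) with u < v."""
    edge_set = set(edges)
    vertices = sorted({x for e in edges for x in e})
    return sum(
        1
        for a, b, c in combinations(vertices, 3)
        if (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set
    )
```

With exactly k choose 2 edges the construction is a clique on k vertices, so it has k choose 3 triangles, matching the exact statement quoted from Kruskal-Katona.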
408 00:28:35,560 --> 00:28:41,050 Now let me tell you a fairly general result that 409 00:28:41,050 --> 00:28:45,180 says something about graph theoretic inequalities 410 00:28:45,180 --> 00:28:48,590 but for a specific kind of linear inequalities. 411 00:28:48,590 --> 00:28:50,910 So here's a theorem due to Bollobas. 412 00:28:55,190 --> 00:29:04,900 I'm interested in an inequality of the form like-- 413 00:29:04,900 --> 00:29:19,380 so I'm interested in inequality of this form, where 414 00:29:19,380 --> 00:29:26,920 I have a bunch of real coefficients, 415 00:29:26,920 --> 00:29:29,170 and I'm looking at a linear combination 416 00:29:29,170 --> 00:29:32,120 of the clique densities. 417 00:29:32,120 --> 00:29:36,770 I would like to know if this inequality is true. 418 00:29:36,770 --> 00:29:39,390 So somebody gives us this inequality, 419 00:29:39,390 --> 00:29:41,430 whatever the numbers may be. 420 00:29:41,430 --> 00:29:42,910 You can also have a constant term. 421 00:29:42,910 --> 00:29:46,000 The constant term corresponds to r equals to 1. 422 00:29:46,000 --> 00:29:47,730 So the point density. 423 00:29:47,730 --> 00:29:50,100 That's the constant term. 424 00:29:50,100 --> 00:29:54,290 And asks you to decide is this inequality true. 425 00:29:54,290 --> 00:29:55,410 And if so, prove it. 426 00:29:55,410 --> 00:29:57,343 If not, find a counterexample. 427 00:30:00,800 --> 00:30:03,260 So the theorem tells you that this is actually not 428 00:30:03,260 --> 00:30:04,040 hard to do. 429 00:30:04,040 --> 00:30:09,230 So this inequality holds for all G 430 00:30:09,230 --> 00:30:17,000 if and only if it holds whenever G is a clique. 431 00:30:23,043 --> 00:30:24,710 Maybe somebody gives you this inequality 432 00:30:24,710 --> 00:30:28,350 about-- it's a linear inequality about clique densities. 
433 00:30:28,350 --> 00:30:30,200 Then, to check this inequality, you only 434 00:30:30,200 --> 00:30:35,160 have to check over all cliques G, 435 00:30:35,160 --> 00:30:39,300 which is much easier than checking for all graphs. 436 00:30:39,300 --> 00:30:42,927 For each clique G this is just some specific expression 437 00:30:42,927 --> 00:30:44,510 you can write down, and you can check. 438 00:30:51,340 --> 00:30:55,890 So I want to show you the proof of Bollobas' theorem. 439 00:30:55,890 --> 00:30:59,600 It's a quite nice result. But before that, any 440 00:30:59,600 --> 00:31:01,477 questions about the statement. 441 00:31:05,293 --> 00:31:08,640 All right. 442 00:31:08,640 --> 00:31:11,360 So the reason I say that this is very easy to check 443 00:31:11,360 --> 00:31:15,030 if I actually give you what the numbers are 444 00:31:15,030 --> 00:31:20,770 is because this inequality for cliques-- 445 00:31:20,770 --> 00:31:30,720 so the inequality is equivalent to just the statement 446 00:31:30,720 --> 00:31:33,950 of inequality that I'm writing down now, 447 00:31:33,950 --> 00:31:40,710 where I tell you precisely what the r clique 448 00:31:40,710 --> 00:31:42,520 density is in an n clique. 449 00:31:42,520 --> 00:31:45,092 Because that's just some combinatorial expression. 450 00:31:48,217 --> 00:31:49,800 So to check whether this inequality is 451 00:31:49,800 --> 00:31:51,720 true for all graphs, I just have to check 452 00:31:51,720 --> 00:31:54,540 the specific inequality for all integers n, which 453 00:31:54,540 --> 00:31:56,103 is straightforward. 454 00:31:59,970 --> 00:32:00,780 All right. 455 00:32:00,780 --> 00:32:04,540 So let's see how to prove that inequality up there. 456 00:32:04,540 --> 00:32:05,640 And here we're-- 457 00:32:05,640 --> 00:32:07,290 I mean, we're not going to exactly use 458 00:32:07,290 --> 00:32:09,668 the theorems about graphons, but it's 459 00:32:09,668 --> 00:32:10,960 useful to think about graphons. 
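[Editor's sketch, not part of the lecture.] The check over cliques described above can be sketched in Python. It relies on the standard formula for the density of an r-clique in an n-clique, t(K_r, K_n) = n(n-1)...(n-r+1) / n^r. The function names and the dictionary format for the coefficients are assumptions made for illustration. Scanning n up to a cutoff is only a finite check; the behaviour as n grows still has to be settled by hand.

```python
from fractions import Fraction


def clique_density_in_clique(r, n):
    """Homomorphism density t(K_r, K_n) = n(n-1)...(n-r+1) / n^r (zero when r > n)."""
    numerator = 1
    for i in range(r):
        numerator *= n - i
    return Fraction(numerator, n ** r)


def f_on_clique(coeffs, n):
    """Evaluate sum_r c_r * t(K_r, K_n); coeffs maps r to c_r (r = 1 is the constant term)."""
    return sum(Fraction(c) * clique_density_in_clique(r, n) for r, c in coeffs.items())


def first_failing_clique(coeffs, max_n):
    """Return the smallest n <= max_n with f(K_n) < 0, or None if there is none."""
    for n in range(1, max_n + 1):
        if f_on_clique(coeffs, n) < 0:
            return n
    return None


# Hypothetical example: 3 t(K_3) - 4 t(K_2) + 2 >= 0.
print(first_failing_clique({1: 2, 2: -4, 3: 3}, 200))
# A false inequality, t(K_3) - t(K_2) >= 0, already fails on a small clique.
print(first_failing_clique({2: -1, 3: 1}, 200))
```

Exact rational arithmetic is used so that equality cases, such as n = 2 and n = 3 in the first example, are not disturbed by rounding.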
460 00:32:19,580 --> 00:32:25,380 So it's an if and only if, and one of the directions is trivial-- 461 00:32:25,380 --> 00:32:28,300 so let's get that out of the way first. 462 00:32:28,300 --> 00:32:32,860 But also-- so the only if is clear. 463 00:32:32,860 --> 00:32:35,750 So for the if direction, first note 464 00:32:35,750 --> 00:32:41,860 that this is true for all graphs if and only 465 00:32:41,860 --> 00:32:48,670 if it is true for all graphons, where I replace G 466 00:32:48,670 --> 00:32:53,380 by W. By the general theory of graph limits and whatnot, 467 00:32:53,380 --> 00:32:56,080 this is true. 468 00:32:56,080 --> 00:33:00,210 So in particular, there is one class of graphons 469 00:33:00,210 --> 00:33:02,460 that I would like to look at-- 470 00:33:02,460 --> 00:33:08,430 namely, I want to consider the set of node weighted-- 471 00:33:16,270 --> 00:33:19,900 so I want to consider the set of node weighted simple graphs. 472 00:33:27,150 --> 00:33:29,020 So node weighted simple graphs, by this 473 00:33:29,020 --> 00:33:35,960 I mean a graph where some of the edges are present 474 00:33:35,960 --> 00:33:41,190 and I have a node weight-- 475 00:33:41,190 --> 00:33:43,010 a weight for each node. 476 00:33:43,010 --> 00:33:45,210 And to normalize things properly, 477 00:33:45,210 --> 00:33:50,850 I'm going to assume that the node weights add up to 1. 478 00:33:50,850 --> 00:33:54,690 Now, you see that each graph like that can be represented 479 00:33:54,690 --> 00:34:03,180 by a graphon where-- 480 00:34:13,300 --> 00:34:15,800 so you can have a graphon. 481 00:34:15,800 --> 00:34:17,690 So they're not meant to be the same picture, 482 00:34:17,690 --> 00:34:20,659 but you have some graphon like this, which corresponds 483 00:34:20,659 --> 00:34:23,210 to a node weighted graph. 484 00:34:23,210 --> 00:34:25,699 And the set of such node weighted graphs 485 00:34:25,699 --> 00:34:30,174 is dense in the space of graphons.
486 00:34:37,420 --> 00:34:40,630 In particular, as far as graph densities are concerned, 487 00:34:40,630 --> 00:34:44,020 they include all the simple graphs. 488 00:34:44,020 --> 00:34:45,460 So it suffices-- 489 00:34:45,460 --> 00:34:49,000 I mean, it's equivalent to-- 490 00:34:49,000 --> 00:34:51,699 the inequality is equivalent to it 491 00:34:51,699 --> 00:34:55,650 being true for all node weighted simple graphs. 492 00:34:55,650 --> 00:35:01,170 But for this space of graphs, suppose 493 00:35:01,170 --> 00:35:03,585 that the inequality fails. 494 00:35:06,900 --> 00:35:10,390 Suppose that the inequality is false. 495 00:35:10,390 --> 00:35:19,440 Then there exists a node weighted simple graph. 496 00:35:22,750 --> 00:35:25,330 I'm going to actually drop the word simple from now on. 497 00:35:25,330 --> 00:35:44,390 So node weighted graph H, such that f of H, being the above sum, 498 00:35:44,390 --> 00:35:46,080 is less than zero. 499 00:35:48,840 --> 00:35:53,670 And there could be many possibilities for such an H. 500 00:35:53,670 --> 00:35:54,750 But let me choose. 501 00:35:54,750 --> 00:35:59,490 So among all the possible H's, let's choose 502 00:35:59,490 --> 00:36:04,290 one that is minimal in the sense that it has the smallest 503 00:36:04,290 --> 00:36:06,090 possible number of nodes. 504 00:36:09,038 --> 00:36:09,955 So with this minimum-- 505 00:36:23,060 --> 00:36:24,710 has a minimum number of nodes. 506 00:36:24,710 --> 00:36:40,990 And furthermore, among all H with this number of nodes, 507 00:36:40,990 --> 00:36:53,230 choose the node weights, which we'll denote by alpha 1 through alpha n, 508 00:36:53,230 --> 00:36:54,700 summing to 1. 509 00:36:54,700 --> 00:37:01,030 Choose the node weights so that this expression, the sum, 510 00:37:01,030 --> 00:37:02,753 is minimized. 511 00:37:08,680 --> 00:37:11,117 And by compactness-- and now we're 512 00:37:11,117 --> 00:37:13,450 not even talking about compactness in the space of graphons.
513 00:37:13,450 --> 00:37:15,075 You have a finite number of parameters. 514 00:37:15,075 --> 00:37:16,340 It's a continuous function. 515 00:37:16,340 --> 00:37:19,000 So just by compactness, there exists 516 00:37:19,000 --> 00:37:21,340 such an H for which the minimum is achieved. 517 00:37:24,530 --> 00:37:26,155 This is minimizing over integers. 518 00:37:26,155 --> 00:37:28,840 And here, minimizing over a finite set 519 00:37:28,840 --> 00:37:32,300 of bounded real numbers. 520 00:37:32,300 --> 00:37:34,750 So the name of the game now is we have 521 00:37:34,750 --> 00:37:37,870 this H, which is minimizing. 522 00:37:37,870 --> 00:37:41,620 And I want to show that H has certain properties. 523 00:37:41,620 --> 00:37:43,120 If it doesn't have these properties, 524 00:37:43,120 --> 00:37:45,340 I can decrease those values. 525 00:38:01,450 --> 00:38:06,250 So let's see what properties this H must 526 00:38:06,250 --> 00:38:11,150 have if it has the minimum number of nodes and f of H 527 00:38:11,150 --> 00:38:12,310 is minimum possible. 528 00:38:16,180 --> 00:38:25,620 So first I claim that all the node weights are positive. 529 00:38:25,620 --> 00:38:27,480 If not, I can delete that node and decrease 530 00:38:27,480 --> 00:38:28,400 the number of nodes. 531 00:38:38,880 --> 00:38:41,970 I would like to claim that H must 532 00:38:41,970 --> 00:38:54,010 be a complete graph because if some ij is not an edge of H-- 533 00:38:54,010 --> 00:38:56,715 so here i is different from j. 534 00:38:56,715 --> 00:38:57,590 I do not allow loops. 535 00:38:57,590 --> 00:38:59,410 It's just simple. 536 00:38:59,410 --> 00:39:05,412 Then let's think about what this expression f of H should be. 537 00:39:05,412 --> 00:39:06,870 So I don't want to write this down, 538 00:39:06,870 --> 00:39:08,520 but I want you to imagine in your head. 539 00:39:08,520 --> 00:39:15,990 So you have this graphon H. I'm looking at the clique density.
540 00:39:15,990 --> 00:39:19,600 It's some polynomial. 541 00:39:19,600 --> 00:39:23,990 In fact, it's some multilinear-- 542 00:39:23,990 --> 00:39:27,380 it's some polynomial in these node weights. 543 00:39:30,700 --> 00:39:34,490 So I want to understand what is the shape 544 00:39:34,490 --> 00:39:39,010 of this polynomial as a function of the node weights. 545 00:39:42,130 --> 00:39:53,950 And I observe that it has to be multilinear in-- 546 00:39:53,950 --> 00:39:58,130 has to be multilinear in particular in 547 00:39:58,130 --> 00:40:01,470 alpha i and alpha j. 548 00:40:06,090 --> 00:40:06,915 It's a polynomial. 549 00:40:06,915 --> 00:40:07,790 That should be clear. 550 00:40:07,790 --> 00:40:10,940 It is multilinear because, well, you have-- 551 00:40:16,810 --> 00:40:20,070 why is it multilinear? 552 00:40:20,070 --> 00:40:24,180 Why do I not have alpha i squared? 553 00:40:33,910 --> 00:40:35,266 Either of you. 554 00:40:35,266 --> 00:40:38,580 AUDIENCE: It says the 0 is not [INAUDIBLE].. 555 00:40:38,580 --> 00:40:41,270 PROFESSOR: So we're forbidding-- 556 00:40:41,270 --> 00:40:45,350 so here's alpha 1, alpha 2, alpha 3, alpha 4, alpha 1, 557 00:40:45,350 --> 00:40:48,630 alpha 2, alpha 3, alpha 4. 558 00:40:48,630 --> 00:40:52,877 So understand what the triangle density-- 559 00:40:52,877 --> 00:40:54,460 if you write down the triangle density 560 00:40:54,460 --> 00:40:58,990 as an expression in terms of the parameters, 561 00:40:58,990 --> 00:41:01,420 think about what comes out, what it looks like. 562 00:41:01,420 --> 00:41:06,340 And they essentially consist of you choosing a subgraph, 563 00:41:06,340 --> 00:41:10,260 which you cannot have repeats. 564 00:41:10,260 --> 00:41:11,460 So it's multilinear. 565 00:41:11,460 --> 00:41:14,690 So it's multilinear in particular in alpha 566 00:41:14,690 --> 00:41:17,440 i and alpha j. 
567 00:41:17,440 --> 00:41:31,010 So no term has the product alpha i alpha j in it 568 00:41:31,010 --> 00:41:33,950 because ij is not an edge. 569 00:41:36,720 --> 00:41:41,820 So here's where we're really using that we're only 570 00:41:41,820 --> 00:41:44,580 considering clique densities. 571 00:41:52,380 --> 00:41:56,670 So the theorem is completely false without the assumption 572 00:41:56,670 --> 00:41:57,660 of clique densities. 573 00:41:57,660 --> 00:42:02,573 If we have a general inequality, general linear inequality, then 574 00:42:02,573 --> 00:42:03,990 the statement is completely false. 575 00:42:06,800 --> 00:42:11,240 So it's multilinear. 576 00:42:11,240 --> 00:42:14,360 So if we now fix all the other variables 577 00:42:14,360 --> 00:42:17,480 and just think about how to optimize, 578 00:42:17,480 --> 00:42:23,530 how to minimize f of H by tweaking alpha i and alpha j, 579 00:42:23,530 --> 00:42:27,350 well, it's linear, so you should minimize it 580 00:42:27,350 --> 00:42:29,620 by setting one of them to be zero. 581 00:42:32,190 --> 00:42:36,100 And that would then decrease the number of nodes. 582 00:42:36,100 --> 00:42:49,410 So can shift alpha i and alpha j while preserving alpha i 583 00:42:49,410 --> 00:42:52,755 plus alpha j and not changing-- 584 00:42:55,545 --> 00:43:05,670 so not increasing f of H. And then 585 00:43:05,670 --> 00:43:13,310 we get either alpha i to go to zero 586 00:43:13,310 --> 00:43:16,340 or alpha j to go to zero, in which case 587 00:43:16,340 --> 00:43:23,030 we decrease the number of nodes, thereby contradicting 588 00:43:23,030 --> 00:43:24,808 the minimality assumption. 589 00:43:30,180 --> 00:43:36,290 So this argument here then tells you that H must be a clique. 590 00:43:36,290 --> 00:43:42,110 So hence, H is complete. 591 00:43:46,470 --> 00:43:53,020 And if H is complete, then as a polynomial in these alphas, 592 00:43:53,020 --> 00:43:55,660 what should f look like? 
593 00:43:55,660 --> 00:43:58,810 Well, it has to be symmetric with respect 594 00:43:58,810 --> 00:44:01,270 to all these alphas. 595 00:44:01,270 --> 00:44:05,750 So in particular, it has to be-- 596 00:44:05,750 --> 00:44:13,440 so since H is complete, we see that, in fact, now you 597 00:44:13,440 --> 00:44:15,720 can write down exactly what f of H 598 00:44:15,720 --> 00:44:20,000 is in terms of the parameters described in the problem. 599 00:44:20,000 --> 00:44:26,400 Namely, it's the sum over r of c r times r factorial times s r, 600 00:44:26,400 --> 00:44:34,150 where s r is a symmetric polynomial where you look at-- 601 00:44:34,150 --> 00:44:40,290 you choose r of the terms, r of these alphas 602 00:44:40,290 --> 00:44:42,790 for each term in this sum. 603 00:44:42,790 --> 00:44:46,980 It's just the elementary symmetric polynomial. 604 00:44:46,980 --> 00:44:50,080 And I would like to know, given such a polynomial, 605 00:44:50,080 --> 00:44:54,810 how to minimize this number by choosing the alphas. 606 00:44:54,810 --> 00:44:56,940 But if you think about what happens 607 00:44:56,940 --> 00:45:01,350 if you fix again everything but two of the alphas-- 608 00:45:01,350 --> 00:45:11,310 so by fixing all of, let's say, alpha 3 to alpha n, 609 00:45:11,310 --> 00:45:12,750 we find that-- 610 00:45:12,750 --> 00:45:20,170 so as a function in just alpha 1 and alpha 2, f of H 611 00:45:20,170 --> 00:45:21,550 has the following form. 612 00:45:32,860 --> 00:45:35,475 And because it's symmetric, these two B's 613 00:45:35,475 --> 00:45:36,750 are actually the same. 614 00:45:40,770 --> 00:45:47,390 So if we now vary alpha 1 and alpha 2 but fix everything 615 00:45:47,390 --> 00:45:53,870 else, because alpha 1 plus alpha 2 is constant, 616 00:45:53,870 --> 00:45:57,230 I can even get rid of this linear part. 617 00:46:00,850 --> 00:46:03,050 So that linear part is fixed as a constant.
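[Editor's note, not part of the lecture.] The board expressions referred to above are not captured in the transcript. A reconstruction consistent with the spoken description is sketched below. The symbols c_r, s_r, B and C follow the lecture's wording; the name A for the constant term is an assumption.

```latex
% f(H) for a complete graph H on n nodes with node weights \alpha_1,\dots,\alpha_n:
f(H) \;=\; \sum_{r} c_r \, t(K_r, H) \;=\; \sum_{r} c_r \, r! \, s_r(\alpha_1,\dots,\alpha_n),
\qquad
s_r(\alpha_1,\dots,\alpha_n) \;=\; \sum_{i_1 < \cdots < i_r} \alpha_{i_1} \cdots \alpha_{i_r}.

% With \alpha_3,\dots,\alpha_n held fixed, f is multilinear and symmetric in \alpha_1,\alpha_2:
f(H) \;=\; A + B\,\alpha_1 + B\,\alpha_2 + C\,\alpha_1 \alpha_2
      \;=\; A + B\,(\alpha_1 + \alpha_2) + C\,\alpha_1 \alpha_2 .
```

Since alpha 1 plus alpha 2 is held constant, only the term C alpha 1 alpha 2 varies, and its sign decides whether the product should be made as large or as small as possible.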
618 00:46:07,020 --> 00:46:09,810 I want to minimize this expression with alpha 1 plus 619 00:46:09,810 --> 00:46:11,790 alpha 2 held fixed. 620 00:46:11,790 --> 00:46:13,770 So there are two possibilities depending on 621 00:46:13,770 --> 00:46:19,170 whether C is positive or negative or, I guess, 0. 622 00:46:19,170 --> 00:46:20,370 So now you're here. 623 00:46:20,370 --> 00:46:35,250 So depending on whether C is positive or negative, 624 00:46:35,250 --> 00:46:43,400 it's minimized by either the two alphas equal to each other 625 00:46:43,400 --> 00:46:52,340 or one of the two alphas should be zero. 626 00:46:52,340 --> 00:46:55,520 The latter cannot occur because we assume minimality. 627 00:46:55,520 --> 00:46:58,150 So the first must occur. 628 00:46:58,150 --> 00:47:01,880 And hence, by symmetry, if you apply the same argument 629 00:47:01,880 --> 00:47:04,550 to all the other alphas, all the alphas 630 00:47:04,550 --> 00:47:09,190 are equal to each other, which means 631 00:47:09,190 --> 00:47:11,190 that H is a simple clique. 632 00:47:15,490 --> 00:47:17,150 It's basically an unweighted clique. 633 00:47:20,060 --> 00:47:22,780 So in other words, if this inequality 634 00:47:22,780 --> 00:47:28,360 fails for some H, some node weighted H, 635 00:47:28,360 --> 00:47:33,740 then it must fail for a simple clique H. 636 00:47:33,740 --> 00:47:36,630 And that's the claim above. 637 00:47:36,630 --> 00:47:37,130 Yeah? 638 00:47:37,130 --> 00:47:38,630 AUDIENCE: So in the statement, there 639 00:47:38,630 --> 00:47:41,137 are two n's, are those two n's different n's then? 640 00:47:41,137 --> 00:47:41,720 PROFESSOR: OK. 641 00:47:41,720 --> 00:47:42,230 Question. 642 00:47:42,230 --> 00:47:43,210 There are two n's. 643 00:47:43,210 --> 00:47:44,670 Yeah. 644 00:47:44,670 --> 00:47:45,380 Thank you. 645 00:47:45,380 --> 00:47:47,450 So these are two-- 646 00:47:50,460 --> 00:47:50,960 yeah. 647 00:47:50,960 --> 00:47:53,122 So these are two different n's.
648 00:48:01,554 --> 00:48:03,042 Great. 649 00:48:03,042 --> 00:48:04,034 Yeah. 650 00:48:04,034 --> 00:48:05,540 AUDIENCE: I have a question. 651 00:48:05,540 --> 00:48:10,130 Which is the node weight such that f of H [INAUDIBLE]?? 652 00:48:10,130 --> 00:48:11,590 PROFESSOR: Question is, why can we 653 00:48:11,590 --> 00:48:17,590 assume that you can choose H so that f of H is minimized? 654 00:48:17,590 --> 00:48:18,590 It's because once-- 655 00:48:18,590 --> 00:48:19,090 OK. 656 00:48:19,090 --> 00:48:20,830 So you agreed the first thing you 657 00:48:20,830 --> 00:48:24,187 can minimize because the number of nodes is a positive integer. 658 00:48:24,187 --> 00:48:25,770 So if there's a counterexample, choose 659 00:48:25,770 --> 00:48:27,590 the minimum counterexample. 660 00:48:27,590 --> 00:48:32,215 Now, you fixed that number of vertices, and the number of-- 661 00:48:32,215 --> 00:48:33,970 then this is an optimization problem. 662 00:48:33,970 --> 00:48:36,040 It's minimizing a continuous function 663 00:48:36,040 --> 00:48:38,200 with a finite number of variables. 664 00:48:38,200 --> 00:48:42,600 So it has a minimum just by compactness 665 00:48:42,600 --> 00:48:44,095 and continuity of the function. 666 00:48:44,095 --> 00:48:45,220 So I choose that minimizer. 667 00:48:50,370 --> 00:48:53,380 Any more questions? 668 00:48:53,380 --> 00:48:56,950 So we have this rather general looking theorem. 669 00:48:56,950 --> 00:48:59,283 So in the second part of today's lecture, 670 00:48:59,283 --> 00:49:00,700 after taking a short break, I want 671 00:49:00,700 --> 00:49:02,800 to discuss what are some of the consequences 672 00:49:02,800 --> 00:49:06,960 and also variations of that statement up there. 673 00:49:06,960 --> 00:49:10,240 And I want to also show you what the rest of this picture 674 00:49:10,240 --> 00:49:10,750 looks like.
675 00:49:15,030 --> 00:49:18,390 So let's continue to deduce some consequences of this theorem 676 00:49:18,390 --> 00:49:20,190 up there that tells us that it is 677 00:49:20,190 --> 00:49:22,650 pretty easy to decide linear inequalities 678 00:49:22,650 --> 00:49:25,830 between clique densities. 679 00:49:25,830 --> 00:49:31,380 Namely, to decide it, just check the inequalities on cliques. 680 00:49:31,380 --> 00:49:44,590 So as a corollary for each n-- 681 00:49:44,590 --> 00:49:47,890 yes, for each n, the extremal points-- 682 00:49:55,200 --> 00:50:09,440 so the extremal points of the convex hull of this set 683 00:50:09,440 --> 00:50:29,140 where I record the clique densities over all graphons W. 684 00:50:29,140 --> 00:50:32,610 So think about this set as the higher dimensional 685 00:50:32,610 --> 00:50:35,910 generalization of that picture I drew up there. 686 00:50:35,910 --> 00:50:39,810 Previously we had n equals to 3, 687 00:50:39,810 --> 00:50:42,480 and we're still interested in n equals to 3. 688 00:50:42,480 --> 00:50:51,330 But in general, you have this set sitting in this box. 689 00:50:51,330 --> 00:50:55,220 And so it's some set. 690 00:50:55,220 --> 00:50:59,240 And if I take the convex hull of the set, what 691 00:50:59,240 --> 00:51:03,620 that theorem tells us-- and it requires maybe one bit 692 00:51:03,620 --> 00:51:04,890 of extra computation. 693 00:51:04,890 --> 00:51:09,980 But what it tells us is that the extremal points are precisely 694 00:51:09,980 --> 00:51:23,600 the points given by W equals to Km for all m at least 1. 695 00:51:23,600 --> 00:51:27,950 So evaluate, find what this point is for each m, 696 00:51:27,950 --> 00:51:31,250 and you have a bunch of points. 697 00:51:31,250 --> 00:51:34,520 And those determine the convex hull. 698 00:51:34,520 --> 00:51:37,250 So I'll illustrate by drawing what the points are 699 00:51:37,250 --> 00:51:38,840 for the picture over there.
700 00:51:38,840 --> 00:51:42,170 But it essentially follows from Bollobas' theorem 701 00:51:42,170 --> 00:51:44,390 with one extra bit of computation 702 00:51:44,390 --> 00:51:47,340 to make sure that all of these are actually 703 00:51:47,340 --> 00:51:49,100 extremal points of the convex hull. 704 00:51:49,100 --> 00:51:51,770 None of them is contained in the convex hull 705 00:51:51,770 --> 00:51:54,750 of the other points. 706 00:51:54,750 --> 00:51:59,998 So for example, we can also deduce very easily 707 00:51:59,998 --> 00:52:00,665 Turan's theorem. 708 00:52:05,840 --> 00:52:08,350 So what does Turan's theorem tell us? 709 00:52:08,350 --> 00:52:20,180 It tells us that if the r plus 1 clique density is zero, 710 00:52:20,180 --> 00:52:32,930 then the K2 density is at most 1 minus 1 over r. 711 00:52:32,930 --> 00:52:36,245 So why does Turan's theorem follow from the above claims? 712 00:52:41,350 --> 00:52:44,740 It should follow because all the data here has 713 00:52:44,740 --> 00:52:46,360 to do with clique densities. 714 00:52:46,360 --> 00:52:47,860 And everything we saw so far says 715 00:52:47,860 --> 00:52:50,110 that if you just want to understand 716 00:52:50,110 --> 00:52:52,630 linear inequalities between clique densities, 717 00:52:52,630 --> 00:52:53,320 it's super easy. 718 00:53:01,600 --> 00:53:04,372 Maybe I'll draw the picture for triangles, 719 00:53:04,372 --> 00:53:05,830 and then you'll see what it's like. 720 00:53:08,700 --> 00:53:12,060 So the corollary tells us for this picture, 721 00:53:12,060 --> 00:53:15,540 corresponding to n equals to 3, what 722 00:53:15,540 --> 00:53:19,670 the points, the extreme points of the convex hull are. 723 00:53:19,670 --> 00:53:22,100 So let me draw these points for you. 724 00:53:22,100 --> 00:53:26,340 So one of these points is this 1/2 comma 0. 725 00:53:26,340 --> 00:53:29,180 So that corresponds to Mantel's theorem.
726 00:53:32,400 --> 00:53:36,680 Now, if you go to the other values of m, 727 00:53:36,680 --> 00:53:39,290 you find that those points-- 728 00:53:39,290 --> 00:53:41,930 so the extreme points-- 729 00:53:41,930 --> 00:53:50,200 they are of the form m minus 1 divided by m, m minus 1 times m 730 00:53:50,200 --> 00:53:57,370 minus 2 divided by m squared for positive integers m. 731 00:54:00,396 --> 00:54:03,760 So for m equals to 2, that's the point that we just drew. 732 00:54:03,760 --> 00:54:04,630 And the next point-- 733 00:54:09,270 --> 00:54:12,110 so next two points, one third and one fourth, 734 00:54:12,110 --> 00:54:14,690 they are at, if you plug it in-- 735 00:54:17,850 --> 00:54:18,350 thank you. 736 00:54:18,350 --> 00:54:21,760 2/3 and 3/4. 737 00:54:21,760 --> 00:54:25,720 They correspond to 2/9 and 3/8. 738 00:54:28,780 --> 00:54:30,940 So let me show you where these points are. 739 00:54:30,940 --> 00:54:38,740 So they are at here and over there. 740 00:54:38,740 --> 00:54:41,860 And you have this sequence of points going up. 741 00:54:51,170 --> 00:54:53,110 So this is the convex hull. 742 00:54:53,110 --> 00:54:55,900 And from that information, you should already 743 00:54:55,900 --> 00:54:58,710 be able to deduce Mantel's theorem 744 00:54:58,710 --> 00:55:03,870 because this right half is not part of this convex hull. 745 00:55:03,870 --> 00:55:05,670 So that's Mantel's theorem. 746 00:55:05,670 --> 00:55:09,900 And similarly, the deduction of Turan's theorem 747 00:55:09,900 --> 00:55:11,490 also follows by a similar logic. 748 00:55:13,810 --> 00:55:14,310 OK. 749 00:55:14,310 --> 00:55:17,110 So you have this sequence of points. 750 00:55:17,110 --> 00:55:21,430 Now, it happens that all of these points lie on a curve. 751 00:55:21,430 --> 00:55:26,190 So let me try to draw what this extra curve is. 752 00:55:30,140 --> 00:55:34,966 So there is some curve, like that. 753 00:55:38,438 --> 00:55:40,230 So there's some curve like that.
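[Editor's sketch, not part of the lecture.] The extreme points quoted above can be tabulated with a few lines of Python. The function name is an assumption made for illustration. The last printed column checks the identity y = x(2x - 1), which follows from the formula because 2x - 1 = (m - 2)/m when x = (m - 1)/m.

```python
from fractions import Fraction


def extreme_point(m):
    """(edge density, triangle density) of the complete graph K_m viewed as a graphon."""
    x = Fraction(m - 1, m)
    y = Fraction((m - 1) * (m - 2), m * m)
    return x, y


for m in range(1, 7):
    x, y = extreme_point(m)
    # Columns: m, edge density, triangle density, whether y equals x(2x - 1).
    print(m, x, y, y == x * (2 * x - 1))
```

Exact fractions are used so that the comparison in the last column is an exact equality test and not a floating point approximation.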
754 00:55:42,780 --> 00:55:51,750 The equation of this curve happens to be y equals x times 2x minus 1. 755 00:55:51,750 --> 00:55:53,940 And because the region is contained 756 00:55:53,940 --> 00:55:56,010 in the convex hull of the yellow points, 757 00:55:56,010 --> 00:56:01,720 it certainly lies above this convex red curve. 758 00:56:01,720 --> 00:56:03,160 You've seen this red curve before. 759 00:56:06,470 --> 00:56:06,970 From where? 760 00:56:19,010 --> 00:56:20,010 So what is that saying? 761 00:56:20,010 --> 00:56:23,740 It's saying that if your edge density is 762 00:56:23,740 --> 00:56:27,010 above one half, then you have some lower bound 763 00:56:27,010 --> 00:56:29,730 on the triangle density. 764 00:56:29,730 --> 00:56:31,560 Where have we seen this before? 765 00:56:31,560 --> 00:56:32,658 Problem set one. 766 00:56:32,658 --> 00:56:34,450 There was a problem on problem set one that 767 00:56:34,450 --> 00:56:36,890 says exactly this inequality. 768 00:56:36,890 --> 00:56:38,660 So go back and compare what it did. 769 00:56:43,580 --> 00:56:46,710 But of course, the convex hull result 770 00:56:46,710 --> 00:56:48,410 tells you even a little bit more-- 771 00:56:48,410 --> 00:56:51,950 namely, that you can draw line segments 772 00:56:51,950 --> 00:56:54,770 between these convex hull points. 773 00:56:59,660 --> 00:57:01,850 So you have some polygonal region that 774 00:57:01,850 --> 00:57:03,230 lower bounds the actual region. 775 00:57:06,312 --> 00:57:07,520 So what is the actual region? 776 00:57:07,520 --> 00:57:09,390 So leaving you in suspense. 777 00:57:09,390 --> 00:57:11,990 So let me tell you what the actual region is now. 778 00:57:11,990 --> 00:57:14,680 So it turns out to be actually-- it's beautiful 779 00:57:14,680 --> 00:57:19,043 and it's quite deep, that the region is now 780 00:57:19,043 --> 00:57:19,960 completely understood. 781 00:57:19,960 --> 00:57:22,600 And it's a fairly recent result.
It's only about 10 years ago 782 00:57:22,600 --> 00:57:31,250 roughly that there are some concave curves. 783 00:57:31,250 --> 00:57:39,120 The sequence of scallops going up to the top right corner. 784 00:57:39,120 --> 00:57:42,930 And this is now understood to be the complete region 785 00:57:42,930 --> 00:57:44,580 between these lower and upper curves. 786 00:57:47,510 --> 00:57:50,680 So this is the complete feasible region for edge 787 00:57:50,680 --> 00:57:54,310 versus triangle densities. 788 00:57:54,310 --> 00:57:57,100 So this lower curve is a difficult result 789 00:57:57,100 --> 00:57:57,870 due to Razborov. 790 00:58:02,730 --> 00:58:07,600 And I want to give you a statement of what this curve is. 791 00:58:07,600 --> 00:58:12,760 And Razborov came up with this machinery, this technique, 792 00:58:12,760 --> 00:58:14,770 known by the name of flag algebra. 793 00:58:14,770 --> 00:58:16,930 So actually, he came up with this name. 794 00:58:22,390 --> 00:58:24,390 So I won't really tell you what flag algebra is, 795 00:58:24,390 --> 00:58:26,670 but it's kind of a computerized way of doing 796 00:58:26,670 --> 00:58:28,440 Cauchy-Schwarz inequalities. 797 00:58:28,440 --> 00:58:33,930 So many of our proofs for these graph theoretic inequalities, 798 00:58:33,930 --> 00:58:35,850 they go through some kind of Cauchy-Schwarz 799 00:58:35,850 --> 00:58:38,130 or sum of squares equivalently. 800 00:58:38,130 --> 00:58:42,033 But there are some very large or difficult inequalities 801 00:58:42,033 --> 00:58:43,200 you can also prove this way. 802 00:58:43,200 --> 00:58:45,840 But it may be difficult to find exactly what is 803 00:58:45,840 --> 00:58:48,630 the actual inequality-- the chain of Cauchy-Schwarz 804 00:58:48,630 --> 00:58:50,970 or the sum of squares that you should write down.
805 00:58:50,970 --> 00:58:56,550 So this machinery, flag algebra, is a language, 806 00:58:56,550 --> 00:58:59,910 is a framework for setting up those sum of squares 807 00:58:59,910 --> 00:59:02,220 inequalities in the context of proving 808 00:59:02,220 --> 00:59:04,710 graph theoretic inequalities. 809 00:59:04,710 --> 00:59:07,120 So it can be used in many different ways. 810 00:59:07,120 --> 00:59:11,760 And notably, a lot of people have used serious computer 811 00:59:11,760 --> 00:59:13,180 computations. 812 00:59:13,180 --> 00:59:14,820 If I want to prove something is true, 813 00:59:14,820 --> 00:59:19,050 I plug it into what's called a semidefinite program that 814 00:59:19,050 --> 00:59:23,190 allows me to decide what kinds of Cauchy-Schwarz inequalities 815 00:59:23,190 --> 00:59:26,680 I should be applying to derive the result I want to prove. 816 00:59:26,680 --> 00:59:28,310 So that's what flag algebra roughly is. 817 00:59:32,130 --> 00:59:35,948 So what Razborov proved is the following. 818 00:59:42,430 --> 00:59:46,030 So Razborov's theorem, which is drawn up there-- 819 00:59:46,030 --> 00:59:47,830 that's the lower curve-- 820 00:59:47,830 --> 00:59:50,820 is that for fixed-- 821 00:59:50,820 --> 01:00:02,410 so for a fixed value of edge densities, 822 01:00:02,410 --> 01:00:15,670 if it lies between two specific points, drawn above, 823 01:00:15,670 --> 01:00:23,820 the minimum value of triangle density 824 01:00:23,820 --> 01:00:26,530 with a fixed value of edge density 825 01:00:26,530 --> 01:00:30,164 is attained via the following construction. 826 01:00:33,960 --> 01:00:37,920 It's attained by the step function of the graphon 827 01:00:37,920 --> 01:00:48,340 corresponding to a K clique. 
828 01:00:48,340 --> 01:00:51,160 So a complete graph on K vertices 829 01:00:51,160 --> 01:01:03,450 with node weights alpha 1 through alpha K summing to 1, 830 01:01:03,450 --> 01:01:07,680 and such that the first K minus 1 of the node weights 831 01:01:07,680 --> 01:01:09,300 are equal. 832 01:01:09,300 --> 01:01:18,970 And the last one is smaller. 833 01:01:18,970 --> 01:01:19,690 All right. 834 01:01:19,690 --> 01:01:24,610 And the point here is that if you are given a specific edge 835 01:01:24,610 --> 01:01:27,490 weight, edge density, then there is 836 01:01:27,490 --> 01:01:33,580 a unique choice of these alphas that achieves that edge density. 837 01:01:33,580 --> 01:01:37,420 And that is the graphon you should use that minimizes 838 01:01:37,420 --> 01:01:38,800 the triangle density-- 839 01:01:38,800 --> 01:01:40,998 describes the lower curve. 840 01:01:40,998 --> 01:01:43,540 So you can write down specific equations for the lower curve, 841 01:01:43,540 --> 01:01:44,510 but it's not so important. 842 01:01:44,510 --> 01:01:46,052 This is a more important description. 843 01:01:46,052 --> 01:01:48,070 These are the graphs that come out. 844 01:01:48,070 --> 01:01:51,430 And what is something that is actually quite-- 845 01:01:51,430 --> 01:01:53,680 I mean, why you should suspect this theorem is 846 01:01:53,680 --> 01:01:57,220 difficult is that unlike Turan's theorem-- so Turan's theorem, 847 01:01:57,220 --> 01:02:00,580 which corresponds to all those discrete points. 848 01:02:00,580 --> 01:02:04,370 In Turan's theorem, the minimizer is unique. 849 01:02:04,370 --> 01:02:07,950 I tell you the number-- 850 01:02:07,950 --> 01:02:10,680 I tell you that the edge density is 2/3, 851 01:02:10,680 --> 01:02:13,220 and I want you to minimize the number of triangles. 852 01:02:17,190 --> 01:02:19,250 Not from Turan's theorem, but it turns out 853 01:02:19,250 --> 01:02:22,790 that this extremal point is unique.
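[Editor's sketch, not part of the lecture.] The construction just described can be turned into a small Python function that evaluates the lower curve. The derivation is an assumption-laden reconstruction from the spoken statement: for a weighted clique the edge density is 1 minus the sum of the squared weights, and the triangle density is 3! times the degree-3 elementary symmetric polynomial in the weights. Taking K - 1 weights equal to a and one weight b = 1 - (K - 1)a, the edge density constraint becomes a quadratic in a. The function name and the choice of K as the smallest integer with x at most 1 - 1/K are illustrative assumptions.

```python
import math


def weighted_clique_triangle_density(x):
    """Triangle density of the weighted-clique construction at edge density x, 0 <= x < 1."""
    if not 0 <= x < 1:
        raise ValueError("edge density must satisfy 0 <= x < 1")
    if x <= 0.5:
        return 0.0  # a triangle-free (bipartite-type) graphon already has edge density x
    # Smallest k with x <= 1 - 1/k, so x lies between 1 - 1/(k-1) and 1 - 1/k.
    k = math.ceil(1 / (1 - x))
    # Solve k(k-1) a^2 - 2(k-1) a + x = 0 for the larger root, so that b <= a.
    disc = max(0.0, (k - 1) * (k - 1 - k * x))  # clamp tiny negative rounding errors
    a = 1 / k + math.sqrt(disc) / (k * (k - 1))
    b = 1 - (k - 1) * a
    # 3! * e_3(a, ..., a, b): triples among the equal weights, plus triples using b.
    return 6 * (math.comb(k - 1, 3) * a ** 3 + math.comb(k - 1, 2) * a ** 2 * b)


for x in (0.5, 0.6, 2 / 3, 0.7, 0.75):
    print(x, weighted_clique_triangle_density(x))
```

At the edge densities 1/2, 2/3 and 3/4 the weights are all equal and the function returns the values 0, 2/9 and 3/8 of the corresponding unweighted cliques, up to floating point rounding.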
854 01:02:22,790 --> 01:02:26,750 Essentially corresponds to a complete tripartite graph. 855 01:02:26,750 --> 01:02:30,080 But for the intermediate values, the constructions 856 01:02:30,080 --> 01:02:31,920 are not unique. 857 01:02:31,920 --> 01:02:43,560 So unless the K2 density is exactly of this form, 858 01:02:43,560 --> 01:02:49,760 the minimizer is not unique. 859 01:02:52,770 --> 01:02:54,680 And the reason why it is not unique 860 01:02:54,680 --> 01:03:03,130 is that you can replace-- 861 01:03:03,130 --> 01:03:04,790 so what's going on here? 862 01:03:04,790 --> 01:03:06,400 So you have this graphon. 863 01:03:11,040 --> 01:03:14,270 Alpha 1, alpha 2, alpha 3. 864 01:03:14,270 --> 01:03:34,600 I can replace this graphon here by any triangle free graphon 865 01:03:34,600 --> 01:03:35,830 of the same edge density. 866 01:03:41,240 --> 01:03:43,670 And there are lots and lots of them. 867 01:03:43,670 --> 01:03:46,550 And the non-uniqueness of the minimizer 868 01:03:46,550 --> 01:03:54,370 makes this minimization problem much more difficult. 869 01:03:54,370 --> 01:03:56,580 So Razborov proved this result for edge 870 01:03:56,580 --> 01:03:58,320 versus triangle densities. 871 01:03:58,320 --> 01:04:04,290 And this program was later completed to K4, 872 01:04:04,290 --> 01:04:07,800 and more generally, to Kr. So K4 is 873 01:04:07,800 --> 01:04:17,060 due to a result of Nikiforov, and the Kr result is due to Reiher. 874 01:04:17,060 --> 01:04:18,200 So a similar picture. 875 01:04:18,200 --> 01:04:19,850 It's more or less that picture up 876 01:04:19,850 --> 01:04:22,840 there but with the actual numbers shifted. 877 01:04:22,840 --> 01:04:26,380 Instead of edge versus triangle, it is now edge versus Kr.
878 01:04:29,740 --> 01:04:32,818 I should say that it's worth-- so this is a picture that I 879 01:04:32,818 --> 01:04:35,110 drew up there, and this is roughly the picture that you 880 01:04:35,110 --> 01:04:36,520 see in textbooks-- 881 01:04:36,520 --> 01:04:38,730 how they draw these scallops. 882 01:04:38,730 --> 01:04:42,970 I once plotted what this picture looks like in Mathematica, 883 01:04:42,970 --> 01:04:46,090 just to see for myself where the actual graph is. 884 01:04:46,090 --> 01:04:48,340 And it doesn't actually look like that. 885 01:04:48,340 --> 01:04:50,500 The concaveness is very subtle. 886 01:04:50,500 --> 01:04:53,170 If you draw it on a computer, they look like straight lines. 887 01:04:53,170 --> 01:04:55,320 So in some sense, that's a cartoon. 888 01:04:55,320 --> 01:04:59,170 So the concaveness is caricatured. 889 01:04:59,170 --> 01:05:02,563 So it's not actually as concave as it is drawn, 890 01:05:02,563 --> 01:05:04,480 but I think it's a good illustration of what's 891 01:05:04,480 --> 01:05:05,355 happening in reality. 892 01:05:09,700 --> 01:05:12,830 Questions? 893 01:05:12,830 --> 01:05:18,040 So on one hand, every polynomial graph inequality-- 894 01:05:18,040 --> 01:05:20,840 so what do I mean by a polynomial graph inequality? 895 01:05:20,840 --> 01:05:27,270 So something like-- suppose I have 896 01:05:27,270 --> 01:05:28,920 some inequality of this form. 897 01:05:38,890 --> 01:05:40,890 And I want to know, is this true? 898 01:05:43,420 --> 01:05:44,850 It turns out that I don't actually 899 01:05:44,850 --> 01:05:49,380 need these squares in some sense because I can always 900 01:05:49,380 --> 01:05:58,600 replace them by what happens if you take disjoint unions. 901 01:05:58,600 --> 01:06:03,100 So all I'm trying to say is that every polynomial graph 902 01:06:03,100 --> 01:06:09,670 inequality can be written as a linear graph 903 01:06:09,670 --> 01:06:11,980 inequality of densities. 
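The disjoint-union remark can be checked numerically on a small example. This is a minimal brute-force sketch: the helper names `hom_density` and `disjoint_union` and the 4-vertex graph are made up for illustration, and the enumeration is only feasible for tiny graphs. It checks that the product of two homomorphism densities equals the density of the disjoint union, which is why a product or square of densities can be traded for a single density.

```python
from itertools import product


def hom_density(edges, h, A):
    """Fraction of maps V(H) -> V(G) sending every edge of H to an edge of G."""
    n = len(A)
    count = sum(
        all(A[phi[u]][phi[v]] for u, v in edges)
        for phi in product(range(n), repeat=h)
    )
    return count / n ** h


def disjoint_union(edges1, h1, edges2, h2):
    """Edge list and vertex count of H1 next to a relabelled copy of H2."""
    shifted = [(u + h1, v + h1) for u, v in edges2]
    return list(edges1) + shifted, h1 + h2


# Illustrative host graph on 4 vertices: edges 01, 12, 23, 02.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
K2 = ([(0, 1)], 2)            # a single edge
P3 = ([(0, 1), (1, 2)], 3)    # a path with two edges

union_edges, union_h = disjoint_union(*K2, *P3)
lhs = hom_density(*K2, A) * hom_density(*P3, A)
rhs = hom_density(union_edges, union_h, A)
print(lhs, rhs)
```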
904 01:06:18,710 --> 01:06:21,170 But nevertheless, this still captures a very large class 905 01:06:21,170 --> 01:06:22,890 of graph inequalities. 906 01:06:22,890 --> 01:06:25,850 And if I just give you some arbitrary one that is not 907 01:06:25,850 --> 01:06:27,380 of that form, it can be often very 908 01:06:27,380 --> 01:06:30,000 difficult to decide whether it is true or not. 909 01:06:30,000 --> 01:06:31,250 So over here it's not so hard. 910 01:06:31,250 --> 01:06:32,625 You just plug it in, and then you 911 01:06:32,625 --> 01:06:35,690 can decide whether it is true. 912 01:06:35,690 --> 01:06:39,080 I mean, it turns out to decide whether this inequality is 913 01:06:39,080 --> 01:06:42,110 true, it's really a polynomial. 914 01:06:42,110 --> 01:06:44,600 And then you just check. 915 01:06:44,600 --> 01:06:46,200 It's not too hard to do. 916 01:06:46,200 --> 01:06:50,890 But in general, suppose I give you an inequality of this form. 917 01:06:50,890 --> 01:06:54,610 So some generalized version of a linear inequality, like that. 918 01:06:54,610 --> 01:07:00,000 Is it even decidable whether the inequality holds? 919 01:07:00,000 --> 01:07:04,110 Decidable in the sense of Turing's halting problem. 920 01:07:04,110 --> 01:07:07,350 So is there some computer program 921 01:07:07,350 --> 01:07:10,110 that tells you whether this inequality is true? 922 01:07:10,110 --> 01:07:12,120 I wonder, can you write a computer program 923 01:07:12,120 --> 01:07:16,140 that decides the truthfulness? 924 01:07:16,140 --> 01:07:17,072 It turns out-- OK. 925 01:07:17,072 --> 01:07:18,780 So before telling you what the answer is, 926 01:07:18,780 --> 01:07:22,500 let me just put it in some context. 927 01:07:22,500 --> 01:07:24,452 What about more classical questions before we 928 01:07:24,452 --> 01:07:25,410 jump into graph theory?
929 01:07:25,410 --> 01:07:37,790 If I give you some polynomial p over the real numbers 930 01:07:37,790 --> 01:07:46,500 and I want to check is that true-- 931 01:07:46,500 --> 01:07:49,900 so this is not too hard. 932 01:07:49,900 --> 01:07:51,010 So this is not too hard. 933 01:07:51,010 --> 01:08:02,220 But what if you have a multivariate polynomial, for all real inputs? 934 01:08:06,440 --> 01:08:08,360 Does anyone know the answer? 935 01:08:08,360 --> 01:08:09,320 Is this decidable? 936 01:08:14,250 --> 01:08:15,720 So as you can imagine, these things 937 01:08:15,720 --> 01:08:18,810 were studied pretty classically. 938 01:08:18,810 --> 01:08:22,950 And so it turns out that the first-order 939 01:08:22,950 --> 01:08:25,500 theory of the real numbers is decidable. 940 01:08:25,500 --> 01:08:27,939 So this is a result of Tarski. 941 01:08:27,939 --> 01:08:31,979 In particular, such questions are decidable. 942 01:08:31,979 --> 01:08:34,946 And in fact, there is a very nice characterization of-- 943 01:08:34,946 --> 01:08:39,569 so there's a result called Artin's theorem 944 01:08:39,569 --> 01:08:41,850 that tells you that every such polynomial 945 01:08:41,850 --> 01:08:45,000 is non-negative if and only 946 01:08:45,000 --> 01:08:47,729 if it can be written as a sum of squares 947 01:08:47,729 --> 01:08:50,250 of rational functions. 948 01:08:50,250 --> 01:08:52,170 So there's a very nice characterization 949 01:08:52,170 --> 01:08:57,080 of positiveness of polynomials over the reals. 950 01:09:00,109 --> 01:09:03,410 But now I change the question and I ask, 951 01:09:03,410 --> 01:09:04,970 what about over the integers? 952 01:09:09,350 --> 01:09:11,950 So if I give you a polynomial, is it always non-negative 953 01:09:11,950 --> 01:09:16,760 if I have integer entries? 954 01:09:16,760 --> 01:09:18,078 Is this decidable? 955 01:09:21,214 --> 01:09:24,910 So turns out, this is not decidable. 956 01:09:24,910 --> 01:09:25,859 And this is related.
957 01:09:25,859 --> 01:09:29,560 So it's more or less the same as the undecidability 958 01:09:29,560 --> 01:09:42,670 of Diophantine equations, which is also known 959 01:09:42,670 --> 01:09:44,080 as Hilbert's tenth problem. 960 01:09:50,660 --> 01:09:54,770 So there is no computer program where you give it a Diophantine 961 01:09:54,770 --> 01:09:58,370 equation and it solves the equation or even tells you 962 01:09:58,370 --> 01:10:02,180 whether the equation has a solution. 963 01:10:02,180 --> 01:10:05,223 And this is part of what makes number theory, 964 01:10:05,223 --> 01:10:06,890 makes Diophantine equations interesting. 965 01:10:06,890 --> 01:10:11,630 So it's undecidable, but we talk about it. 966 01:10:15,360 --> 01:10:19,620 So undecidability is a famous result due to Matiyasevich. 967 01:10:24,790 --> 01:10:28,870 So what about graph theoretic inequalities? 968 01:10:28,870 --> 01:10:45,555 So is a graph homomorphism inequality decidable? 969 01:10:51,477 --> 01:10:53,310 I mean, the question you should ask yourself 970 01:10:53,310 --> 01:10:55,290 is, which one is it closer to? 971 01:10:55,290 --> 01:10:59,280 Is it closer to deciding the positiveness of polynomials 972 01:10:59,280 --> 01:11:03,430 over reals or over integers? 973 01:11:03,430 --> 01:11:07,000 On one hand, you might think that it 974 01:11:07,000 --> 01:11:10,238 is more similar to the question of polynomials over the reals. 975 01:11:10,238 --> 01:11:12,280 So first of all, why it's similar to polynomials, 976 01:11:12,280 --> 01:11:16,090 I hope that's at least intuitively-- 977 01:11:16,090 --> 01:11:17,830 this is not a proof, but intuitively it 978 01:11:17,830 --> 01:11:19,800 feels somewhat similar to polynomials. 979 01:11:19,800 --> 01:11:22,140 And all of these guys you can write down 980 01:11:22,140 --> 01:11:23,630 as polynomial-like quantities. 981 01:11:23,630 --> 01:11:28,633 And we saw this earlier in the proof of Bollobas' theorem.
982 01:11:28,633 --> 01:11:30,300 So you might think it's similar to reals 983 01:11:30,300 --> 01:11:32,610 because, well, for graphons, you can 984 01:11:32,610 --> 01:11:34,380 take arbitrary real weights. 985 01:11:34,380 --> 01:11:37,150 So it feels like the reals. 986 01:11:37,150 --> 01:11:46,530 So it turns out, due to a theorem of Hatami and Norine, 987 01:11:46,530 --> 01:11:50,250 that the answer is no. 988 01:11:50,250 --> 01:11:53,940 It is not decidable. 989 01:11:53,940 --> 01:11:59,380 And roughly the reason has to do with this picture. 990 01:11:59,380 --> 01:12:04,720 Even though the space of graphons is not discrete, 991 01:12:04,720 --> 01:12:07,840 it's a very continuous object, even if you just 992 01:12:07,840 --> 01:12:10,120 look at this picture here, you have 993 01:12:10,120 --> 01:12:14,750 a bunch of discrete points along this scallop. 994 01:12:14,750 --> 01:12:18,580 So here's a potential strategy for proving 995 01:12:18,580 --> 01:12:24,190 the undecidability of graph homomorphism inequalities. 996 01:12:24,190 --> 01:12:29,310 I start by just restricting myself to this curve. 997 01:12:29,310 --> 01:12:32,892 I restrict myself to the red curve. 998 01:12:32,892 --> 01:12:34,850 If you restrict yourself to the red curve, then 999 01:12:34,850 --> 01:12:36,620 the set of possibilities-- 1000 01:12:36,620 --> 01:12:39,845 it's now a discrete set, which is like the positive integers. 1001 01:12:43,670 --> 01:12:46,600 And now I start with-- 1002 01:12:46,600 --> 01:12:51,070 I can reduce the problem to the problem of decidability 1003 01:12:51,070 --> 01:12:54,510 of integer inequalities. 1004 01:12:54,510 --> 01:12:57,220 I start with an integer inequality. 1005 01:12:57,220 --> 01:13:03,480 I convert it to an inequality about points on this red curve. 1006 01:13:05,990 --> 01:13:14,170 And that turns into a corresponding graph inequality, 1007 01:13:14,170 --> 01:13:16,231 which must then be undecidable.
1008 01:13:19,020 --> 01:13:22,970 So this undecidability result is related to the discreteness 1009 01:13:22,970 --> 01:13:25,270 of points on this red curve. 1010 01:13:29,990 --> 01:13:32,370 So general undecidability results are interesting. 1011 01:13:32,370 --> 01:13:35,220 But often, we're interested in specific problems. 1012 01:13:35,220 --> 01:13:38,970 So I give you some specific inequality and ask, is it true? 1013 01:13:38,970 --> 01:13:42,720 And there are a lot of interesting open problems 1014 01:13:42,720 --> 01:13:43,680 of that type. 1015 01:13:43,680 --> 01:13:45,990 My favorite one, and also a very important problem 1016 01:13:45,990 --> 01:13:48,170 in extremal graph theory, is known 1017 01:13:48,170 --> 01:13:50,026 as Sidorenko's conjecture. 1018 01:13:58,657 --> 01:14:00,740 So Sidorenko's conjecture-- it's a conjecture-- 1019 01:14:00,740 --> 01:14:11,500 says that if H is bipartite, then the H density in G or W 1020 01:14:11,500 --> 01:14:14,890 is at least the edge density raised 1021 01:14:14,890 --> 01:14:21,060 to the power of the number of edges of H. 1022 01:14:21,060 --> 01:14:27,820 So we saw one example of this inequality 1023 01:14:27,820 --> 01:14:29,990 when H is the 4-cycle. 1024 01:14:29,990 --> 01:14:31,630 So when we discussed quasi-randomness 1025 01:14:31,630 --> 01:14:33,945 we saw that this is true. 1026 01:14:33,945 --> 01:14:35,820 And in the homework, you'll have a few more-- 1027 01:14:35,820 --> 01:14:37,445 so in the next problem homework, you'll 1028 01:14:37,445 --> 01:14:39,040 have a few more examples where you're 1029 01:14:39,040 --> 01:14:41,790 asked to show this inequality. 1030 01:14:41,790 --> 01:14:42,940 It is open. 1031 01:14:42,940 --> 01:14:44,660 We don't know any counterexamples. 1032 01:14:44,660 --> 01:14:49,240 And the first open example, it's known as something 1033 01:14:49,240 --> 01:14:50,260 called a Mobius strip.
1034 01:14:55,240 --> 01:14:59,920 So the Mobius strip graph, which is 1035 01:14:59,920 --> 01:15:04,150 a fancy name for the graph consisting of taking a K5,5 1036 01:15:04,150 --> 01:15:05,500 and removing a 10-cycle. 1037 01:15:16,390 --> 01:15:17,570 So that's the graph. 1038 01:15:17,570 --> 01:15:20,560 It is open whether this inequality holds for that graph 1039 01:15:20,560 --> 01:15:22,010 there. 1040 01:15:22,010 --> 01:15:24,187 And this is something of great interest. 1041 01:15:24,187 --> 01:15:26,020 So if you can make progress on this problem, 1042 01:15:26,020 --> 01:15:27,415 people will be very excited. 1043 01:15:31,750 --> 01:15:37,072 Now, why is this called a Mobius strip? 1044 01:15:37,072 --> 01:15:38,530 This took me a while to figure out. 1045 01:15:38,530 --> 01:15:40,363 So there are many different interpretations. 1046 01:15:40,363 --> 01:15:42,610 I think the reason why it's called a Mobius strip is 1047 01:15:42,610 --> 01:15:47,170 that if you think about the usual simplicial complex 1048 01:15:47,170 --> 01:15:47,980 for a Mobius strip. 1049 01:15:50,950 --> 01:15:56,110 And then this is the face-vertex incidence bipartite graph. 1050 01:15:56,110 --> 01:15:58,570 So five vertices, one for each face. 1051 01:15:58,570 --> 01:16:02,530 Five vertices, one for each vertex. 1052 01:16:02,530 --> 01:16:05,890 And if you draw the incidence structure, that's the graph. 1053 01:16:05,890 --> 01:16:10,150 I'm not sure if this topological formulation will 1054 01:16:10,150 --> 01:16:15,305 help you in proving Sidorenko's conjecture or disproving it, 1055 01:16:15,305 --> 01:16:17,680 but certainly that's why it's called a Mobius strip. 1056 01:16:17,680 --> 01:16:19,222 And there are some people 1057 01:16:19,222 --> 01:16:22,170 who believe that it may be false. 1058 01:16:22,170 --> 01:16:23,920 So it's still open. 1059 01:16:23,920 --> 01:16:25,068 It's still open.
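For readers who want to experiment with this graph, here is a small sketch that builds K5,5 minus a 10-cycle. The function name `mobius_graph`, the vertex labels, and the particular 10-cycle removed are illustrative choices; any Hamiltonian cycle of K5,5 gives an isomorphic result. The graph has 15 edges, so for it the conjectured inequality reads: the H density is at least the edge density to the 15th power.

```python
def mobius_graph():
    """Edges of K_{5,5} minus the 10-cycle L0 R0 L1 R1 ... L4 R4 L0."""
    removed = set()
    for i in range(5):
        removed.add((i, i))            # cycle edge L_i -- R_i
        removed.add(((i + 1) % 5, i))  # cycle edge L_{i+1} -- R_i
    return [
        (("L", i), ("R", j))
        for i in range(5)
        for j in range(5)
        if (i, j) not in removed
    ]


edges = mobius_graph()
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

print(len(edges), sorted(set(degree.values())))
```

Each of the ten vertices keeps three of its five neighbours, so the result is a 3-regular bipartite graph.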
1060 01:16:27,820 --> 01:16:30,130 The one last thing I want to mention 1061 01:16:30,130 --> 01:16:35,330 is that even though the inequality written up there 1062 01:16:35,330 --> 01:16:38,630 in general is undecidable, if you only 1063 01:16:38,630 --> 01:16:41,180 want to know whether this inequality is true 1064 01:16:41,180 --> 01:16:45,830 up to an epsilon error, then it is decidable. 1065 01:16:45,830 --> 01:16:49,850 In fact, there is an algorithm that I can tell you. 1066 01:16:49,850 --> 01:16:58,010 So there exists an algorithm that decides, 1067 01:16:58,010 --> 01:17:04,568 for every epsilon-- 1068 01:17:04,568 --> 01:17:06,860 so I just want to know whether that inequality is true. 1069 01:17:06,860 --> 01:17:10,310 But I allow an epsilon error, meaning 1070 01:17:10,310 --> 01:17:27,860 it correctly decides that this inequality is true 1071 01:17:27,860 --> 01:17:36,530 up to an epsilon error for all G, or outputs a G such 1072 01:17:36,530 --> 01:17:45,616 that the sum here is negative. 1073 01:17:49,430 --> 01:17:52,930 So up to an epsilon of error, I can give you an algorithm. 1074 01:17:52,930 --> 01:17:56,230 And the algorithm follows-- 1075 01:17:56,230 --> 01:17:59,800 I mean, it's not too hard to describe. 1076 01:17:59,800 --> 01:18:02,500 Basically, the idea is that if I take 1077 01:18:02,500 --> 01:18:07,010 an epsilon regular partition, then all the data 1078 01:18:07,010 --> 01:18:10,090 about edge densities can be encoded 1079 01:18:10,090 --> 01:18:12,340 in the epsilon regular partition. 1080 01:18:12,340 --> 01:18:17,590 So applying even the weak regularity lemma is enough. 1081 01:18:22,190 --> 01:18:32,150 And then we can test the bounded number of possibilities 1082 01:18:32,150 --> 01:18:37,820 with some fixed number of parts.
1083 01:18:41,600 --> 01:18:47,270 And by the counting lemma, you lose some epsilon error 1084 01:18:47,270 --> 01:18:51,270 if I check over all weighted graphs 1085 01:18:51,270 --> 01:18:54,860 on some bounded number of parts whose edge weights are 1086 01:18:54,860 --> 01:18:59,330 multiples of epsilon, let's say, whether this is true. 1087 01:18:59,330 --> 01:19:02,090 If it's true, then it is true with this epsilon. 1088 01:19:02,090 --> 01:19:03,560 If it is false, then I can already 1089 01:19:03,560 --> 01:19:06,860 output a counterexample. 1090 01:19:06,860 --> 01:19:08,870 So there are only finitely many possibilities 1091 01:19:08,870 --> 01:19:11,090 as a result of the weak regularity lemma. 1092 01:19:11,090 --> 01:19:15,870 And therefore, this version here is decidable. 1093 01:19:15,870 --> 01:19:20,190 So today, we saw many different graph theoretic inequalities 1094 01:19:20,190 --> 01:19:21,520 and some general results. 1095 01:19:21,520 --> 01:19:24,780 And there are lots of open problems about graph 1096 01:19:24,780 --> 01:19:27,210 homomorphism inequalities. 1097 01:19:27,210 --> 01:19:31,055 So this concludes roughly the extremal graph theory 1098 01:19:31,055 --> 01:19:32,020 section of this course. 1099 01:19:32,020 --> 01:19:33,810 So starting from next lecture, we'll 1100 01:19:33,810 --> 01:19:36,130 be looking at Roth's theorem. 1101 01:19:36,130 --> 01:19:37,710 So looking at the Fourier analytic 1102 01:19:37,710 --> 01:19:40,550 proof of Roth's theorem.
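The finite search behind the epsilon-approximate decision procedure described above can be sketched as follows. This is a toy version under loud assumptions: the parts all have equal weight, the number of parts `k` and the grid size `m` (edge weights are multiples of 1/m) are passed in by hand rather than derived from epsilon via the weak regularity lemma and the counting lemma, and the names `density` and `find_counterexample` are made up for illustration. It only shows why the search space is finite.

```python
from fractions import Fraction
from itertools import product


def density(edges, h, W):
    """Homomorphism density of H in the step graphon W with equal parts."""
    k = len(W)
    total = Fraction(0)
    for phi in product(range(k), repeat=h):
        term = Fraction(1)
        for u, v in edges:
            term *= W[phi[u]][phi[v]]
        total += term
    return total / k ** h


def find_counterexample(terms, k, m):
    """Search k-part weighted graphs with weights in {0, 1/m, ..., 1}.

    terms is a list of (coefficient, edges, vertex_count) triples.
    Returns a W with sum of c * t(H, W) < 0, or None if none exists.
    """
    pairs = [(i, j) for i in range(k) for j in range(i, k)]
    grid = [Fraction(a, m) for a in range(m + 1)]
    for values in product(grid, repeat=len(pairs)):
        W = [[Fraction(0)] * k for _ in range(k)]
        for (i, j), w in zip(pairs, values):
            W[i][j] = W[j][i] = w
        if sum(c * density(e, h, W) for c, e, h in terms) < 0:
            return W
    return None


C4 = ([(0, 1), (1, 2), (2, 3), (3, 0)], 4)
FOUR_EDGES = ([(0, 1), (2, 3), (4, 5), (6, 7)], 8)  # density = t(K2)^4
K3 = ([(0, 1), (1, 2), (0, 2)], 3)
K2 = ([(0, 1)], 2)

# t(C4) - t(K2)^4 >= 0 holds, so no counterexample is found on this grid.
print(find_counterexample([(1, *C4), (-1, *FOUR_EDGES)], k=2, m=4))
# t(K3) - t(K2) >= 0 is false, so some W is returned.
print(find_counterexample([(1, *K3), (-1, *K2)], k=2, m=4))
```

Exact rational arithmetic is used so that a true inequality holding with equality is not misreported because of floating-point rounding.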