1 00:00:15,580 --> 00:00:17,680 PROFESSOR: So welcome, everyone. 2 00:00:17,680 --> 00:00:20,740 Today is the first of what will be 3 00:00:20,740 --> 00:00:25,070 a series of four guest lectures throughout the semester. 4 00:00:25,070 --> 00:00:28,412 There will be two guest lectures, 5 00:00:28,412 --> 00:00:30,370 starting the week from today, and then there'll 6 00:00:30,370 --> 00:00:32,350 be another one towards the end of the semester. 7 00:00:32,350 --> 00:00:34,930 And what Pete and I decided to do 8 00:00:34,930 --> 00:00:37,120 is to bring in people who know a lot more than us 9 00:00:37,120 --> 00:00:39,298 about some area of expertise. 10 00:00:39,298 --> 00:00:40,840 In today's instance, it's going to be 11 00:00:40,840 --> 00:00:44,590 about cardiovascular medicine, in particular 12 00:00:44,590 --> 00:00:47,260 about how to use imaging and machine learning 13 00:00:47,260 --> 00:00:49,240 on images in that context. 14 00:00:49,240 --> 00:00:52,150 And for today's lecture, we're very 15 00:00:52,150 --> 00:00:57,220 excited to have professor Rahul Deo to speak. 16 00:00:57,220 --> 00:01:00,240 Rahul's name kept on showing up, as I did research 17 00:01:00,240 --> 00:01:01,660 over the last couple of years. 18 00:01:01,660 --> 00:01:04,660 First, my group was starting to get interested 19 00:01:04,660 --> 00:01:07,315 in echocardiography, and we said, oh, here's 20 00:01:07,315 --> 00:01:08,920 an interesting paper to read on it. 21 00:01:08,920 --> 00:01:12,610 We read it, and then we read another paper 22 00:01:12,610 --> 00:01:17,050 on doing subtyping of ejection fraction 23 00:01:17,050 --> 00:01:19,840 which is a type of heart failure, and we read it. 
24 00:01:19,840 --> 00:01:22,450 I wasn't really paying attention to the names on the papers, 25 00:01:22,450 --> 00:01:23,908 and then suddenly, someone told me, 26 00:01:23,908 --> 00:01:26,878 there's this guy moving to Boston next month who's 27 00:01:26,878 --> 00:01:29,170 doing a lot of interesting work and interesting machine 28 00:01:29,170 --> 00:01:29,950 learning. 29 00:01:29,950 --> 00:01:31,762 You should go meet him. 30 00:01:31,762 --> 00:01:33,220 And of course, I meet him, and then 31 00:01:33,220 --> 00:01:34,762 I tell him about these papers I read, 32 00:01:34,762 --> 00:01:38,270 and he said, oh, I wrote all of those papers. 33 00:01:38,270 --> 00:01:40,360 He was a senior author on them. 34 00:01:40,360 --> 00:01:42,410 So Rahul's been around for a while. 35 00:01:42,410 --> 00:01:48,440 He is already a senior in his field. 36 00:01:48,440 --> 00:01:52,900 He started out doing his medical school training at Cornell, 37 00:01:52,900 --> 00:01:55,450 in Cornell Medical School, in New York 38 00:01:55,450 --> 00:01:58,630 City, at the same time as doing his PhD at Rockefeller 39 00:01:58,630 --> 00:01:59,860 University. 40 00:01:59,860 --> 00:02:01,690 And then he spent the first large chunk-- 41 00:02:01,690 --> 00:02:05,770 after his post-doctoral training, up here in Boston, 42 00:02:05,770 --> 00:02:07,270 at Harvard Medical School-- he spent 43 00:02:07,270 --> 00:02:12,910 a large chunk of his career as faculty at UCSF, in California. 44 00:02:12,910 --> 00:02:15,880 And just moved back this past year to take a position 45 00:02:15,880 --> 00:02:18,900 as the chief data scientist-- is that right-- 46 00:02:18,900 --> 00:02:21,070 for the One Brave Idea project which 47 00:02:21,070 --> 00:02:24,685 is a very large initiative joint between MIT and Brigham 48 00:02:24,685 --> 00:02:27,805 and Women's Hospital to study cardiovascular medicine. 49 00:02:27,805 --> 00:02:29,830 He'll tell you more maybe. 
50 00:02:29,830 --> 00:02:34,360 And Rahul's research has really gone the full spectrum, 51 00:02:34,360 --> 00:02:36,407 but the type of things you'll hear about today 52 00:02:36,407 --> 00:02:38,740 is actually not what he's been doing most of his career, 53 00:02:38,740 --> 00:02:39,820 amazingly so. 54 00:02:39,820 --> 00:02:40,990 Most of his career, he's been thinking more 55 00:02:40,990 --> 00:02:42,640 about genotype and how to really bridge 56 00:02:42,640 --> 00:02:48,803 that genotype-phenotype branch, but I asked him specifically 57 00:02:48,803 --> 00:02:49,720 to talk about imaging. 58 00:02:49,720 --> 00:02:51,190 So that's what he'll be focusing on today in his lecture. 59 00:02:51,190 --> 00:02:53,648 And without further ado, thank you, Rahul, for coming here. 60 00:02:53,648 --> 00:02:56,150 [APPLAUSE] 61 00:02:56,650 --> 00:02:58,663 RAHUL DEO: So I'm used to lecturing 62 00:02:58,663 --> 00:03:00,580 the clinical audiences, so you guys are by far 63 00:03:00,580 --> 00:03:02,190 the most technical audience. 64 00:03:02,190 --> 00:03:04,510 So please spare me a little bit, but I actually 65 00:03:04,510 --> 00:03:08,530 want to encourage interruptions, questions. 66 00:03:08,530 --> 00:03:10,120 This is a very opinionated lecture, 67 00:03:10,120 --> 00:03:13,780 so that if anybody has sort of any questions, reservations, 68 00:03:13,780 --> 00:03:15,280 please bring them up during lecture. 69 00:03:15,280 --> 00:03:17,200 Don't wait till the end. 70 00:03:17,200 --> 00:03:22,150 And in part, it's opinionated because I feel passionately 71 00:03:22,150 --> 00:03:27,370 that the stuff we're doing needs to make its way into practice. 72 00:03:27,370 --> 00:03:30,083 It's not by itself purely academically interesting. 73 00:03:30,083 --> 00:03:31,750 We need to study the things we're doing. 74 00:03:31,750 --> 00:03:33,970 We're already picking up what everybody else here 75 00:03:33,970 --> 00:03:35,380 is already doing. 
76 00:03:35,380 --> 00:03:38,740 So it's OK from that standpoint, but it really 77 00:03:38,740 --> 00:03:39,648 has to make its way. 78 00:03:39,648 --> 00:03:42,190 And that means that we have to have some mature understanding 79 00:03:42,190 --> 00:03:44,042 of what makes its way into practice, 80 00:03:44,042 --> 00:03:45,250 where the resistance will be. 81 00:03:45,250 --> 00:03:48,850 So the lecture will be peppered throughout with some opinions 82 00:03:48,850 --> 00:03:52,070 and comments in that vein, and hopefully, that will be useful. 83 00:03:52,070 --> 00:03:53,778 So just a quick outline, just going 84 00:03:53,778 --> 00:03:56,320 to introduce cardiac structure and function which is probably 85 00:03:56,320 --> 00:03:59,350 not part of the regular undergraduate and graduate 86 00:03:59,350 --> 00:04:00,730 training here at MIT. 87 00:04:00,730 --> 00:04:03,683 Talk a little bit about what the major cardiac diagnostics are 88 00:04:03,683 --> 00:04:04,600 and how they're used. 89 00:04:04,600 --> 00:04:09,032 And all this is really to help guide the thought 90 00:04:09,032 --> 00:04:10,990 and the decision making about how we would ever 91 00:04:10,990 --> 00:04:12,850 automate and bring this into-- 92 00:04:12,850 --> 00:04:15,280 how to bring machine learning, artificial intelligence, 93 00:04:15,280 --> 00:04:16,530 into actual clinical practice. 94 00:04:16,530 --> 00:04:18,220 Because I need to give enough background 95 00:04:18,220 --> 00:04:20,568 so you realize what the challenges are, 96 00:04:20,568 --> 00:04:23,110 and then the question probably everyone has is where's the data? 97 00:04:23,110 --> 00:04:24,657 How would one get access 98 00:04:24,657 --> 00:04:26,740 to some of this stuff to be able to potentially do 99 00:04:26,740 --> 00:04:28,210 work in this area?
100 00:04:28,210 --> 00:04:31,270 And then, I'm going to venture a little bit into computer vision 101 00:04:31,270 --> 00:04:33,130 and just talk about some of the topics 102 00:04:33,130 --> 00:04:35,088 that at least I've been thinking about that are 103 00:04:35,088 --> 00:04:36,328 relevant to what we're doing. 104 00:04:36,328 --> 00:04:37,870 And then talk about some of this work 105 00:04:37,870 --> 00:04:40,948 around an automated pipeline for echocardiogram, not as by any 106 00:04:40,948 --> 00:04:42,490 means a gold standard but really just 107 00:04:42,490 --> 00:04:44,350 as sort of an initial foray into trying 108 00:04:44,350 --> 00:04:47,350 to make a dent into this. 109 00:04:47,350 --> 00:04:50,158 And then thinking a little bit about what lessons-- 110 00:04:50,158 --> 00:04:52,450 David mentioned that you talked about electrocardiogram 111 00:04:52,450 --> 00:04:56,110 last week or last class, and so a little bit of some 112 00:04:56,110 --> 00:04:59,050 of the ideas from there, and how they would lend themselves 113 00:04:59,050 --> 00:05:01,240 to insights about future types of approaches 114 00:05:01,240 --> 00:05:02,950 with automated interpretation. 115 00:05:02,950 --> 00:05:05,380 And then my background is actually more in biology. 116 00:05:05,380 --> 00:05:07,392 So I'm going to come back and say, OK, 117 00:05:07,392 --> 00:05:09,850 enough with all this imaging stuff, what about the biology? 118 00:05:09,850 --> 00:05:12,450 How can we make some insights there? 119 00:05:12,450 --> 00:05:13,330 OK. 120 00:05:13,330 --> 00:05:17,320 So every time people try to get funding 121 00:05:17,320 --> 00:05:19,870 for coronary heart disease, they try to talk up 122 00:05:19,870 --> 00:05:21,340 just how important it is. 123 00:05:21,340 --> 00:05:23,380 So this is still-- 124 00:05:23,380 --> 00:05:25,570 we have some battles with the oncology people-- 125 00:05:25,570 --> 00:05:30,340 but this is still the leading cause of death in the world. 
126 00:05:30,340 --> 00:05:33,610 And then people are like, you're just emphasizing 127 00:05:33,610 --> 00:05:34,560 the developed world. 128 00:05:34,560 --> 00:05:37,127 There's lots of communicable diseases that matter much more. 129 00:05:37,127 --> 00:05:39,460 So even if you look at those, and you look at the bottom 130 00:05:39,460 --> 00:05:44,560 here, still, if this is all causes of death, age-adjusted, 131 00:05:44,560 --> 00:05:46,920 cardiovascular disease is still number one amongst that. 132 00:05:46,920 --> 00:05:52,180 So certainly it remains important and increasingly so 133 00:05:52,180 --> 00:05:54,550 in some of the developing world also. 134 00:05:54,550 --> 00:05:57,598 So it's important to think a little bit about what 135 00:05:57,598 --> 00:05:59,140 the heart does, because this is going 136 00:05:59,140 --> 00:06:01,740 to guide at least the way that diseases have been classified. 137 00:06:01,740 --> 00:06:03,740 So the main thing the heart does is it's a pump, 138 00:06:03,740 --> 00:06:06,593 and it delivers oxygenated blood throughout the circulatory 139 00:06:06,593 --> 00:06:08,260 system to all the tissues that need it-- 140 00:06:08,260 --> 00:06:11,830 the brain, the kidneys, the muscles-- and oxygen, of course, 141 00:06:11,830 --> 00:06:14,440 is required for ATP production. 142 00:06:14,440 --> 00:06:16,240 So it's a pretty impressive organ. 143 00:06:16,240 --> 00:06:18,160 It pumps about five liters of blood a minute, 144 00:06:18,160 --> 00:06:21,660 and with exercise, that can go up five to seven-fold or so, 145 00:06:21,660 --> 00:06:24,280 with conditioned athletes, not me, but other people 146 00:06:24,280 --> 00:06:26,660 can ramp that up substantially. 147 00:06:26,660 --> 00:06:29,890 And we have this need to keep a very, very regular beat, 148 00:06:29,890 --> 00:06:33,340 so if you pause for about three seconds, 149 00:06:33,340 --> 00:06:36,310 you are likely to get lightheaded or pass out.
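The pump figures quoted here lend themselves to a quick sanity check. The sketch below is a back-of-envelope calculation, not something taken from the lecture slides: the function names are made up for illustration, and the average heart rate of 60 beats per minute and the 70-year lifespan are assumed round numbers.

```python
# Back-of-envelope check of the pump figures (all inputs are assumptions
# chosen for illustration, not values stated by the speaker except the
# resting output of ~5 L/min and the 5x to 7x exercise multiplier).
MINUTES_PER_YEAR = 60 * 24 * 365.25


def exercise_output_range(resting_l_per_min, low_fold, high_fold):
    """Return (low, high) cardiac output in liters per minute during exercise."""
    return resting_l_per_min * low_fold, resting_l_per_min * high_fold


def lifetime_beats(beats_per_minute, years):
    """Estimate total heartbeats over a lifespan at a constant average rate."""
    return beats_per_minute * MINUTES_PER_YEAR * years


if __name__ == "__main__":
    low, high = exercise_output_range(5, 5, 7)
    print(f"Exercise output: {low} to {high} L/min")
    # 60 bpm and 70 years are assumed, hypothetical inputs.
    print(f"Lifetime beats: {lifetime_beats(60, 70):.2e}")
```

With those assumed inputs, the exercise range works out to 25 to 35 liters per minute, and the lifetime count lands a little above two billion beats; a faster average heart rate or a longer lifespan pushes it higher.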
150 00:06:36,310 --> 00:06:40,330 So you have to maintain this rhythmic beating of your heart, 151 00:06:40,330 --> 00:06:42,310 and you can compute what that would be, 152 00:06:42,310 --> 00:06:45,370 and it's somewhere around two billion beats in a typical lifetime. 153 00:06:45,370 --> 00:06:49,353 So I'm going to show a lot of pictures and videos 154 00:06:49,353 --> 00:06:50,020 throughout this. 155 00:06:50,020 --> 00:06:52,562 So it's probably worthwhile just to take a pause a little bit 156 00:06:52,562 --> 00:06:54,790 and talk about what the anatomy of the heart is. 157 00:06:54,790 --> 00:06:57,880 So the heart sits like this, so the pointy part 158 00:06:57,880 --> 00:07:01,200 is kind of sitting out to the side, like that. 159 00:07:01,200 --> 00:07:04,540 And so I'm going to just sort of describe the flow of blood. 160 00:07:04,540 --> 00:07:07,180 So the blood comes in through something called the inferior vena 161 00:07:07,180 --> 00:07:10,510 cava or the superior vena cava-- that's draining from the brain. 162 00:07:10,510 --> 00:07:12,880 This is draining from the lower body, 163 00:07:12,880 --> 00:07:16,312 and then enters into a chamber called the right atrium. 164 00:07:16,312 --> 00:07:18,520 It moves through something called the tricuspid valve 165 00:07:18,520 --> 00:07:20,080 into what's called the right ventricle. 166 00:07:20,080 --> 00:07:21,997 The right ventricle has got some muscle to it. 167 00:07:21,997 --> 00:07:23,935 It pumps into the lungs. 168 00:07:23,935 --> 00:07:25,850 There, the blood picks up oxygen, 169 00:07:25,850 --> 00:07:29,060 so that's why it's shown as being red here. 170 00:07:29,060 --> 00:07:31,647 The oxygenated blood comes through the left atrium 171 00:07:31,647 --> 00:07:33,730 and then into the left ventricle through something 172 00:07:33,730 --> 00:07:34,840 called the mitral valve. 173 00:07:34,840 --> 00:07:37,630 We'll show you some pictures of the mitral valve later on.
174 00:07:37,630 --> 00:07:39,400 And then the left ventricle, which 175 00:07:39,400 --> 00:07:41,170 is the big workhorse of the heart, 176 00:07:41,170 --> 00:07:44,080 pumps blood to the rest of the body, 177 00:07:44,080 --> 00:07:46,510 through a structure called the aorta. 178 00:07:46,510 --> 00:07:48,970 So in through the right heart, through the lungs, 179 00:07:48,970 --> 00:07:51,148 through the left heart, to the rest of the body. 180 00:07:51,148 --> 00:07:53,440 And then shown here in yellow is the conduction system. 181 00:07:53,440 --> 00:07:56,400 So you guys got a little bit of a conversation last class 182 00:07:56,400 --> 00:07:57,810 on the electrical system. 183 00:07:57,810 --> 00:08:02,165 So the sinoatrial node is up here in the right atrium, 184 00:08:02,165 --> 00:08:03,540 and then conduction goes through. 185 00:08:03,540 --> 00:08:06,660 So the P wave on an EKG represents the conduction 186 00:08:06,660 --> 00:08:07,410 through there. 187 00:08:07,410 --> 00:08:08,827 You get through the AV node, where 188 00:08:08,827 --> 00:08:10,530 there's a delay which is the PR interval, 189 00:08:10,530 --> 00:08:12,822 and then you get spreading through the ventricles which 190 00:08:12,822 --> 00:08:16,290 is the QRS complex, and then repolarization is the T wave. 191 00:08:16,290 --> 00:08:18,840 So that's the electrical system, and of course, these things 192 00:08:18,840 --> 00:08:20,910 have to work intimately together. 193 00:08:24,570 --> 00:08:27,080 Every single basic kind of cardiac physiology course 194 00:08:27,080 --> 00:08:29,850 will show this diagram called the Wiggers diagram which 195 00:08:29,850 --> 00:08:31,838 really just shows the interconnectedness 196 00:08:31,838 --> 00:08:32,880 of the electrical and mechanical systems. 197 00:08:32,880 --> 00:08:34,590 So there's the EKG up there.
198 00:08:34,590 --> 00:08:37,929 These are the heart sounds that a provider would listen to 199 00:08:37,929 --> 00:08:39,720 with a stethoscope, and this is 200 00:08:39,720 --> 00:08:43,530 capturing sort of the changes in pressure 201 00:08:43,530 --> 00:08:45,120 in the heart and in the aorta. 202 00:08:45,120 --> 00:08:49,050 So the heart fills during a period of time called diastole. 203 00:08:49,050 --> 00:08:50,940 The mitral valve closes. 204 00:08:50,940 --> 00:08:52,020 The ventricle contracts. 205 00:08:52,020 --> 00:08:53,230 The pressure increases. 206 00:08:53,230 --> 00:08:54,907 This is a period of time called systole. 207 00:08:54,907 --> 00:08:57,240 Eventually, something called the aortic valve pops open, 208 00:08:57,240 --> 00:08:59,073 and blood goes to the rest of the body. 209 00:08:59,073 --> 00:09:01,230 The heart finally starts to relax. 210 00:09:01,230 --> 00:09:03,025 The aortic valve closes. 211 00:09:03,025 --> 00:09:03,900 Then, you fill again. 212 00:09:03,900 --> 00:09:06,960 So this happens again and again and again in a cyclical way, 213 00:09:06,960 --> 00:09:09,450 and you have this combination of electrical and mechanical 214 00:09:09,450 --> 00:09:11,300 properties. 215 00:09:11,300 --> 00:09:11,920 OK. 216 00:09:11,920 --> 00:09:12,890 So I have some pictures here. 217 00:09:12,890 --> 00:09:13,520 These are all MRIs. 218 00:09:13,520 --> 00:09:15,437 I'm going to talk about echocardiography which 219 00:09:15,437 --> 00:09:17,990 is these very ugly, grainy things that I unfortunately 220 00:09:17,990 --> 00:09:18,860 have to work with. 221 00:09:18,860 --> 00:09:20,952 MRIs are beautiful but very expensive. 222 00:09:20,952 --> 00:09:22,160 So there's a reason for that. 223 00:09:22,160 --> 00:09:26,160 So this is something called the long axis view of the heart. 224 00:09:26,160 --> 00:09:28,340 So this is the thick-walled left ventricle there.
225 00:09:28,340 --> 00:09:29,923 This is the left atrium there, and you 226 00:09:29,923 --> 00:09:32,840 can see this beautiful turbulent flow of blood in there, 227 00:09:32,840 --> 00:09:35,150 and it's flowing from the atrium to the ventricle. 228 00:09:35,150 --> 00:09:37,190 This is another patient. 229 00:09:37,190 --> 00:09:38,570 It's called the short axis view. 230 00:09:38,570 --> 00:09:41,090 There is the left ventricle and the right ventricle there. 231 00:09:41,090 --> 00:09:43,402 So we're kind of looking at it somewhat obliquely, 232 00:09:43,402 --> 00:09:45,485 and then this is another view called the-- 233 00:09:45,485 --> 00:09:46,340 It's a little bit dull there. 234 00:09:46,340 --> 00:09:47,150 I'm sorry. 235 00:09:47,150 --> 00:09:48,650 We can brighten it a little bit. 236 00:09:48,650 --> 00:09:52,022 This is what's called the four chamber view. 237 00:09:52,022 --> 00:09:54,230 So you can see the left ventricle and right ventricle 238 00:09:54,230 --> 00:09:54,980 here. 239 00:09:54,980 --> 00:09:57,500 So the reason for these different views 240 00:09:57,500 --> 00:10:01,550 is, ultimately, that people have measures 241 00:10:01,550 --> 00:10:04,007 of function and measures of disease that go along 242 00:10:04,007 --> 00:10:05,090 with these specific views. 243 00:10:05,090 --> 00:10:08,190 So you're going to see them coming back again and again. 244 00:10:08,190 --> 00:10:08,690 OK. 245 00:10:08,690 --> 00:10:14,290 So the way that physicians like to organize disease definitions 246 00:10:14,290 --> 00:10:16,540 is really around some of these same kind of functions.
247 00:10:16,540 --> 00:10:20,710 So failure of the heart to pump properly 248 00:10:20,710 --> 00:10:23,380 causes a disease called heart failure, 249 00:10:23,380 --> 00:10:26,380 and this shows up in terms of being out of breath, having 250 00:10:26,380 --> 00:10:28,780 fluid buildup in the belly and in the legs, 251 00:10:28,780 --> 00:10:30,812 and this is treated with medications. 252 00:10:30,812 --> 00:10:32,770 Sometimes, you can have some artificial devices 253 00:10:32,770 --> 00:10:34,562 to help the heart pump, and ultimately, you 254 00:10:34,562 --> 00:10:37,310 could even have a transplant, depending on how severe it is. 255 00:10:37,310 --> 00:10:38,920 So that's the pump. 256 00:10:38,920 --> 00:10:42,220 Blood supply to the heart ultimately can also be blocked, 257 00:10:42,220 --> 00:10:44,830 and that causes a disease called coronary artery disease. 258 00:10:44,830 --> 00:10:46,410 If blood is completely blocked, you 259 00:10:46,410 --> 00:10:48,618 can get something called a heart attack or myocardial 260 00:10:48,618 --> 00:10:49,210 infarction. 261 00:10:49,210 --> 00:10:51,490 That's chest pain, sometimes shortness of breath, 262 00:10:51,490 --> 00:10:54,370 and we open up those blocked vessels by angioplasty, 263 00:10:54,370 --> 00:10:57,790 stick a stent in there, or bypass them altogether. 264 00:10:57,790 --> 00:11:02,350 And then the flow of blood has to be one way. 265 00:11:02,350 --> 00:11:05,980 So abnormalities of flow of the blood through valves 266 00:11:05,980 --> 00:11:09,030 are valvular disease, and so you can have either too-tight 267 00:11:09,030 --> 00:11:10,480 valves, so that's called stenosis. 268 00:11:10,480 --> 00:11:11,688 Or you can have leaky valves. 269 00:11:11,688 --> 00:11:12,855 That's called regurgitation. 270 00:11:12,855 --> 00:11:14,688 That shows up as light-headedness, shortness 271 00:11:14,688 --> 00:11:17,290 of breath, fainting, and then you've got to fix those valves.
272 00:11:17,290 --> 00:11:19,510 And finally, there's abnormalities of rhythm. 273 00:11:19,510 --> 00:11:21,170 So something like atrial fibrillation 274 00:11:21,170 --> 00:11:24,640 which is a quivering of the atrium, or too slow heartbeats, 275 00:11:24,640 --> 00:11:27,100 which would be bradycardia, can present as palpitations, 276 00:11:27,100 --> 00:11:28,925 fainting, or even sudden death. 277 00:11:28,925 --> 00:11:31,550 And you can stick a pacemaker in there, defibrillator in there, 278 00:11:31,550 --> 00:11:34,190 or try to burn off the arrhythmia. 279 00:11:34,190 --> 00:11:34,690 OK. 280 00:11:34,690 --> 00:11:38,440 So this is like the very physiology-centric view, 281 00:11:38,440 --> 00:11:41,042 but the truth is that the heart has a whole lot of cells. 282 00:11:41,042 --> 00:11:43,000 So there's a lot more biology there than simply 283 00:11:43,000 --> 00:11:46,120 just thinking about the pumping and the electrical function. 284 00:11:46,120 --> 00:11:50,000 Only 30% of the cells or so are these cardiomyocytes. 285 00:11:50,000 --> 00:11:52,640 So these are the cells that are involved in contraction. 286 00:11:52,640 --> 00:11:55,148 These are cells that are excitable, but that's only 30% 287 00:11:55,148 --> 00:11:55,690 of the cells. 288 00:11:55,690 --> 00:11:57,850 There are endothelial cells. 289 00:11:57,850 --> 00:11:58,780 There's fibroblasts. 290 00:11:58,780 --> 00:12:00,850 There's a bunch of blood cells in there 291 00:12:00,850 --> 00:12:02,560 too, certainly a lot of red blood cells in there too. 292 00:12:02,560 --> 00:12:03,710 So you have lots of other things. 293 00:12:03,710 --> 00:12:05,360 So we're going to come back to this a little bit 294 00:12:05,360 --> 00:12:07,818 when talking about how should we be thinking about disease?
295 00:12:07,818 --> 00:12:10,750 The historic way is to think about pumping 296 00:12:10,750 --> 00:12:12,820 and electrical activation, but really, there's 297 00:12:12,820 --> 00:12:14,485 maybe a little bit more complexity here 298 00:12:14,485 --> 00:12:15,610 that needs to be addressed. 299 00:12:15,610 --> 00:12:16,330 OK. 300 00:12:16,330 --> 00:12:20,290 So there's a lot of different-- 301 00:12:20,290 --> 00:12:23,560 so cardiology is very imaging-centric, 302 00:12:23,560 --> 00:12:25,930 and as a result, it's very expensive, 303 00:12:25,930 --> 00:12:28,630 because imaging costs a lot of money to do. 304 00:12:28,630 --> 00:12:31,120 And so I have dollar signs here reflecting 305 00:12:31,120 --> 00:12:32,650 the sorts of different tests we do. 306 00:12:32,650 --> 00:12:36,410 So you saw the cheapest one last week, 307 00:12:36,410 --> 00:12:38,990 the electrocardiogram, so one dollar sign, 308 00:12:38,990 --> 00:12:41,950 and that has lots of utility. 309 00:12:41,950 --> 00:12:44,170 For example, one could diagnose an acute heart attack 310 00:12:44,170 --> 00:12:46,250 with that. 311 00:12:46,250 --> 00:12:48,680 Echocardiography, which involves sound waves, 312 00:12:48,680 --> 00:12:51,280 is ultimately more used for quantifying structure 313 00:12:51,280 --> 00:12:54,740 and function, and can pick up heart failure, valvular disease, 314 00:12:54,740 --> 00:12:56,280 high blood pressure in the lungs. 315 00:12:56,280 --> 00:12:57,760 So that's another modality. 316 00:12:57,760 --> 00:13:00,760 MRI, which is just not used all that much in this country, 317 00:13:00,760 --> 00:13:01,910 is very expensive. 318 00:13:01,910 --> 00:13:04,160 It does largely the same things, and you can imagine, 319 00:13:04,160 --> 00:13:05,620 even though it's beautiful, people 320 00:13:05,620 --> 00:13:07,990 have not had an easy time being able to justify 321 00:13:07,990 --> 00:13:11,980 why it's any better than this slightly cheaper modality.
322 00:13:11,980 --> 00:13:15,010 And then you have angiography which can either be by CAT scan 323 00:13:15,010 --> 00:13:16,240 or by X-ray. 324 00:13:16,240 --> 00:13:19,870 And that visualizes the flow of blood through the heart 325 00:13:19,870 --> 00:13:22,990 and looks for blockages which are going to be stented, 326 00:13:22,990 --> 00:13:24,760 ballooned up and stented. 327 00:13:24,760 --> 00:13:28,840 And then you have these kind of non-invasive technologies, 328 00:13:28,840 --> 00:13:32,970 like PET and SPECT that use radionuclides, 329 00:13:32,970 --> 00:13:35,200 like technetium, rubidium, and they 330 00:13:35,200 --> 00:13:37,030 look for abnormalities in blood flow 331 00:13:37,030 --> 00:13:38,980 to detect, non-invasively, whether 332 00:13:38,980 --> 00:13:40,480 there's some patch of the heart that 333 00:13:40,480 --> 00:13:41,787 isn't getting enough blood. 334 00:13:41,787 --> 00:13:43,870 If you get one of these, and it's abnormal, often, 335 00:13:43,870 --> 00:13:45,790 you go over there, and you take a trip to the movies-- 336 00:13:45,790 --> 00:13:47,350 as my old teachers used to say-- 337 00:13:47,350 --> 00:13:50,800 and then you may find yourself with an angioplasty or stent 338 00:13:50,800 --> 00:13:52,420 or bypass. 339 00:13:52,420 --> 00:13:54,708 So one of the sad things about cardiology 340 00:13:54,708 --> 00:13:56,500 is we don't define our diseases by biology. 341 00:13:56,500 --> 00:13:58,853 We define our diseases often related 342 00:13:58,853 --> 00:14:00,520 to whether the anatomy or the physiology 343 00:14:00,520 --> 00:14:03,520 is abnormal or normal, usually based on some of these images 344 00:14:03,520 --> 00:14:04,860 or some of these numbers. 345 00:14:04,860 --> 00:14:05,990 OK. 346 00:14:05,990 --> 00:14:07,840 So we have to make decisions, and we often 347 00:14:07,840 --> 00:14:09,443 use these very same things too to be 348 00:14:09,443 --> 00:14:10,610 able to make some decisions.
349 00:14:10,610 --> 00:14:13,510 So we have to decide whether we want to put in a defibrillator, 350 00:14:13,510 --> 00:14:16,745 and to do so, you often need to get an echocardiogram 351 00:14:16,745 --> 00:14:18,620 to look at the pumping function of the heart. 352 00:14:18,620 --> 00:14:21,120 If you want to decide on whether somebody needs angioplasty, 353 00:14:21,120 --> 00:14:22,647 you have to get an angiogram. 354 00:14:22,647 --> 00:14:24,730 If you want to decide to get a valve replacement, 355 00:14:24,730 --> 00:14:26,470 you need an echo. 356 00:14:26,470 --> 00:14:28,102 But some of these other ones actually 357 00:14:28,102 --> 00:14:29,560 don't involve any imaging, and this 358 00:14:29,560 --> 00:14:31,268 is sort of one of the challenges that I'm 359 00:14:31,268 --> 00:14:34,300 going to talk about: all of the future-- 360 00:14:34,300 --> 00:14:36,680 you can imagine building brand new risk models, 361 00:14:36,680 --> 00:14:38,140 new classification models. 362 00:14:38,140 --> 00:14:40,510 You're stuck with the data that's out there, 363 00:14:40,510 --> 00:14:42,310 and the data that's out there is ultimately 364 00:14:42,310 --> 00:14:44,890 being collected because somebody feels like it's worth 365 00:14:44,890 --> 00:14:46,360 paying for it already. 366 00:14:46,360 --> 00:14:48,790 So if you want to build a brand new risk 367 00:14:48,790 --> 00:14:51,550 model for who's going to have a myocardial infarction, 368 00:14:51,550 --> 00:14:54,010 you're probably not going to have any echocardiograms to be 369 00:14:54,010 --> 00:14:55,510 able to use for that, because nobody 370 00:14:55,510 --> 00:14:57,460 is going to have paid for that to be collected 371 00:14:57,460 --> 00:14:58,450 in the first place. 372 00:14:58,450 --> 00:14:59,350 So this is a problem.
373 00:14:59,350 --> 00:15:01,570 To be able to innovate, I've got to keep on coming back to that, 374 00:15:01,570 --> 00:15:04,153 because I think you're going to be shocked by the small sample 375 00:15:04,153 --> 00:15:05,950 sizes that we face in some of these things. 376 00:15:05,950 --> 00:15:06,970 And part of it is because if you just 377 00:15:06,970 --> 00:15:08,803 want to piggyback on what insurers are going 378 00:15:08,803 --> 00:15:10,973 to be willing to pay for to get your data, 379 00:15:10,973 --> 00:15:12,390 you're going to be stuck with only 380 00:15:12,390 --> 00:15:14,340 being able to work off the stuff we already 381 00:15:14,340 --> 00:15:15,215 know something about. 382 00:15:15,215 --> 00:15:17,280 So much of my work has been really trying 383 00:15:17,280 --> 00:15:20,070 to think about how we can change that. 384 00:15:20,070 --> 00:15:22,670 OK, so just a little bit more, and then we 385 00:15:22,670 --> 00:15:24,520 can get into a little bit more meat. 386 00:15:24,520 --> 00:15:26,640 So sort of the universal standard 387 00:15:26,640 --> 00:15:29,870 for how imaging data is stored is something called DICOM, 388 00:15:29,870 --> 00:15:33,058 or the Digital Imaging and Communications in Medicine standard, 389 00:15:33,058 --> 00:15:34,600 and really, at the end of the day, there 390 00:15:34,600 --> 00:15:36,457 is some compressed data for the images. 391 00:15:36,457 --> 00:15:38,790 There's a DICOM header, which I'll show you in a moment. 392 00:15:38,790 --> 00:15:40,282 There's lots of nice Python libraries 393 00:15:40,282 --> 00:15:42,490 that are available to be able to work with this data, 394 00:15:42,490 --> 00:15:45,180 and there's a free viewer you could use too. 395 00:15:45,180 --> 00:15:46,555 OK. 396 00:15:46,555 --> 00:15:47,930 So where do I get access to this? 397 00:15:47,930 --> 00:15:49,805 So this has actually been an incredible pain. 398 00:15:49,805 --> 00:15:52,800 So hospitals are set up to be clinical operations.
399 00:15:52,800 --> 00:15:54,300 They're not set up to make it easy 400 00:15:54,300 --> 00:15:56,623 for you to get gobs of data for being 401 00:15:56,623 --> 00:15:57,790 able to do machine learning. 402 00:15:57,790 --> 00:16:00,840 It's just not really there. 403 00:16:00,840 --> 00:16:04,230 And so sometimes, you have some of these data archives 404 00:16:04,230 --> 00:16:05,730 that store this data, but there's 405 00:16:05,730 --> 00:16:08,710 lots of reasons for why people make that difficult. 406 00:16:08,710 --> 00:16:10,530 And one of them is because often images 407 00:16:10,530 --> 00:16:13,100 have these burned in pixels with identifiable information. 408 00:16:13,100 --> 00:16:16,077 So you'll have a patient's name emblazoned in the image. 409 00:16:16,077 --> 00:16:17,160 You'll have date of birth. 410 00:16:17,160 --> 00:16:18,520 You'll have kind of other attributes. 411 00:16:18,520 --> 00:16:20,100 So you're stuck with that, and not 412 00:16:20,100 --> 00:16:22,133 only is it a problem that they're there, 413 00:16:22,133 --> 00:16:24,300 the vendors don't make it easy to be able to get rid 414 00:16:24,300 --> 00:16:25,133 of that information. 415 00:16:25,133 --> 00:16:28,170 So you actually have a problem that they don't really 416 00:16:28,170 --> 00:16:31,202 make it easy to download in bulk or de-identify this. 417 00:16:31,202 --> 00:16:32,910 And part of the reason is because then it 418 00:16:32,910 --> 00:16:35,118 would make it easy for you to switch vendors and have 419 00:16:35,118 --> 00:16:36,200 somebody else take over. 420 00:16:36,200 --> 00:16:37,950 So they make it a little bit hard for you. 421 00:16:37,950 --> 00:16:40,530 Once it's in there, it's hard for you to get it out, 422 00:16:40,530 --> 00:16:42,480 and people are selling their data. 423 00:16:42,480 --> 00:16:43,900 That's certainly happening too. 
424 00:16:43,900 --> 00:16:45,570 So there's a little bit of attempts 425 00:16:45,570 --> 00:16:49,050 to try to control things that way, and many of the labels 426 00:16:49,050 --> 00:16:50,710 you want are stored separately. 427 00:16:50,710 --> 00:16:52,390 So you want to know what the diseases of these people. 428 00:16:52,390 --> 00:16:53,730 So you have the raw imaging data, 429 00:16:53,730 --> 00:16:55,605 but all the clinical stuff is somewhere else. 430 00:16:55,605 --> 00:16:57,130 So you have to sometimes link that, 431 00:16:57,130 --> 00:16:58,740 and so you need to get access there. 432 00:16:58,740 --> 00:17:01,115 And so just to give you a little bit of an idea of scale, 433 00:17:01,115 --> 00:17:03,660 so we're about to get all the ECGs from Brigham and Women's 434 00:17:03,660 --> 00:17:06,960 which is about 30 million historically, 435 00:17:06,960 --> 00:17:08,410 and this is all related to cost. 436 00:17:08,410 --> 00:17:11,609 So positron emission tomography, you can get about 8,000 or so, 437 00:17:11,609 --> 00:17:13,800 and we're one of the busiest centers for that. 438 00:17:13,800 --> 00:17:16,510 Echocardiograms are in the 300,000 to 500,000 439 00:17:16,510 --> 00:17:17,260 range in archives. 440 00:17:17,260 --> 00:17:19,060 So that gets a little bit more interesting. 441 00:17:19,060 --> 00:17:19,680 OK. 442 00:17:19,680 --> 00:17:21,599 This is what a DICOM header looks like. 443 00:17:21,599 --> 00:17:23,849 You have some sort of identifiers, 444 00:17:23,849 --> 00:17:25,980 and then you have some information there, 445 00:17:25,980 --> 00:17:27,660 attributes of the images, patient 446 00:17:27,660 --> 00:17:29,468 name, date of birth, frame rate. 447 00:17:29,468 --> 00:17:32,010 These kind of things are there, and there's some variability. 448 00:17:32,010 --> 00:17:34,880 So it's never quite easy. 449 00:17:34,880 --> 00:17:36,170 OK. 
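The header attributes mentioned above (identifiers, patient name, date of birth, frame rate) can be pictured as key-value pairs. The following is a minimal sketch, assuming the header has already been read into a plain Python dict; it does not parse real DICOM files, and a DICOM library would be needed for that. The function name `deidentify_header`, the list of identifying keys, and all example values are hypothetical, and blanking header fields does nothing about identifying text burned into the pixels.

```python
# Toy stand-in for a DICOM header: a plain dict keyed by attribute keyword.
# Assumed, non-exhaustive list of attributes treated as identifying.
IDENTIFYING_KEYS = {"PatientName", "PatientBirthDate", "PatientID"}


def deidentify_header(header):
    """Return a copy of the header with identifying attributes blanked."""
    return {
        key: ("" if key in IDENTIFYING_KEYS else value)
        for key, value in header.items()
    }


# Hypothetical header values, invented for illustration only.
example_header = {
    "PatientName": "DOE^JANE",
    "PatientBirthDate": "19700101",
    "PatientID": "000000",
    "Modality": "US",
    "CineRate": 30,
}

clean_header = deidentify_header(example_header)
print(clean_header)
```

The original dict is left unchanged, so the identified and de-identified copies can be kept in separate places if linking to clinical labels is still needed.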
450 00:17:36,170 --> 00:17:40,585 So these different modalities have some different benefits 451 00:17:40,585 --> 00:17:41,960 to them which is why they're used 452 00:17:41,960 --> 00:17:44,580 for one disease or the other. 453 00:17:44,580 --> 00:17:47,330 And so one of the real headaches is that the heart moves. 454 00:17:47,330 --> 00:17:49,600 So the chest wall moves, because we breathe, 455 00:17:49,600 --> 00:17:50,600 and the heart moves too. 456 00:17:50,600 --> 00:17:52,580 So you have to image something that 457 00:17:52,580 --> 00:17:55,880 has enough temporal frequency that you're not overwhelmed 458 00:17:55,880 --> 00:17:58,752 by the basic movement of the heart itself, 459 00:17:58,752 --> 00:18:00,460 and so some of these things aren't great. 460 00:18:00,460 --> 00:18:03,110 So SPECT or PET acquire their images, 461 00:18:03,110 --> 00:18:05,443 which are radioactive counts, over minutes. 462 00:18:05,443 --> 00:18:06,860 So that's certainly a problem when 463 00:18:06,860 --> 00:18:08,880 it comes to something that's moving like that, 464 00:18:08,880 --> 00:18:10,547 and if you want to have high resolution. 465 00:18:10,547 --> 00:18:13,130 So typically, you have very poor spatial resolution 466 00:18:13,130 --> 00:18:16,010 for something that ultimately doesn't deal well 467 00:18:16,010 --> 00:18:17,570 with the moving aspect. 468 00:18:17,570 --> 00:18:20,160 So coronary angiography has very, very fast frame rates. 469 00:18:20,160 --> 00:18:22,510 So that's X-ray, and that's sort of very fast. 470 00:18:22,510 --> 00:18:24,740 Echocardiography can be quite fast. 471 00:18:24,740 --> 00:18:26,765 MRI and CT are not quite as good, 472 00:18:26,765 --> 00:18:28,640 and so there's some degradation of the image. 
473 00:18:28,640 --> 00:18:30,770 As a result, people do something called gating, 474 00:18:30,770 --> 00:18:33,920 where they'll take the electrocardiogram, the ECG, 475 00:18:33,920 --> 00:18:36,610 and try to line up different portions 476 00:18:36,610 --> 00:18:37,610 of different heartbeats. 477 00:18:37,610 --> 00:18:40,332 And say, well, we'll take this image from here, 478 00:18:40,332 --> 00:18:42,290 line it up with this one from there, this one-- 479 00:18:42,290 --> 00:18:44,873 I'm going to talk a little bit about that, about registration, 480 00:18:44,873 --> 00:18:47,573 but ultimately, that's a problem that people have to deal with. 481 00:18:47,573 --> 00:18:49,490 So it's a computer vision problem of interest. 482 00:18:49,490 --> 00:18:50,750 OK. 483 00:18:50,750 --> 00:18:52,730 Preamble is almost done. 484 00:18:52,730 --> 00:18:54,770 OK. 485 00:18:54,770 --> 00:18:56,983 So why do we even imagine any of this stuff 486 00:18:56,983 --> 00:18:57,900 is going to be useful? 487 00:18:57,900 --> 00:19:02,330 So it turns out that the practice of interpreting 488 00:19:02,330 --> 00:19:04,520 involves a lot of manual measurements. 489 00:19:04,520 --> 00:19:07,010 So people like me, and people who 490 00:19:07,010 --> 00:19:08,660 have trained for way too long, find 491 00:19:08,660 --> 00:19:11,550 themselves getting little rulers and measuring various things. 492 00:19:11,550 --> 00:19:14,600 So for example, this is a narrowing of an artery. 493 00:19:14,600 --> 00:19:16,850 So you could take a little bit of calipers and measure 494 00:19:16,850 --> 00:19:19,190 across that and compare it to here 495 00:19:19,190 --> 00:19:22,220 and say, ah, this is 80% narrowed. 496 00:19:22,220 --> 00:19:24,440 You could measure the area of this chamber, 497 00:19:24,440 --> 00:19:27,200 the left ventricle, and you can measure its area is, 498 00:19:27,200 --> 00:19:29,660 and you can see, ah, its peak area is this. 
499 00:19:29,660 --> 00:19:31,125 Its minimum area is this. 500 00:19:31,125 --> 00:19:33,000 Therefore, it's contracting a certain amount. 501 00:19:33,000 --> 00:19:33,917 So we do those things. 502 00:19:33,917 --> 00:19:36,410 We measure those things by hand. 503 00:19:36,410 --> 00:19:39,230 And the other thing we do is we actually diagnose things just 504 00:19:39,230 --> 00:19:40,080 by looking at them. 505 00:19:40,080 --> 00:19:43,040 So this is a disease called cardiac amyloid, characterized 506 00:19:43,040 --> 00:19:44,030 by some thickening-- 507 00:19:44,030 --> 00:19:45,110 I'll show you a little bit more about that-- 508 00:19:45,110 --> 00:19:46,130 and some sparkling here. 509 00:19:46,130 --> 00:19:48,440 So people do look and say, ah, this is what this is. 510 00:19:48,440 --> 00:19:50,703 So there's kind of a classification problem 511 00:19:50,703 --> 00:19:52,620 that comes either at the image or video level. 512 00:19:52,620 --> 00:19:54,828 So we'll talk about whether this is even worth doing. 513 00:19:54,828 --> 00:19:56,020 AUDIENCE: I have a question. 514 00:19:56,020 --> 00:19:56,820 RAHUL DEO: Yes. 515 00:19:56,820 --> 00:19:58,987 AUDIENCE: Is this with software, or do you literally 516 00:19:58,987 --> 00:20:00,340 take a ruler and measure? 517 00:20:00,340 --> 00:20:03,500 RAHUL DEO: So the software involves clicking at one point, 518 00:20:03,500 --> 00:20:05,710 stretching something, and clicking another point. 519 00:20:05,710 --> 00:20:07,793 So it's a little better than pulling the ruler out 520 00:20:07,793 --> 00:20:11,140 of your back pocket, but not that much better. 521 00:20:11,140 --> 00:20:11,690 OK. 522 00:20:11,690 --> 00:20:14,580 So we're going to talk about three little areas, 523 00:20:14,580 --> 00:20:15,770 and again, this is not-- 524 00:20:15,770 --> 00:20:18,187 I got involved in this really in the last two years or so.
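The two hand measurements described above, how narrowed an artery is relative to a reference segment and how much a chamber's area shrinks between its peak and its minimum, reduce to simple ratios. This is a minimal sketch of that arithmetic; the function names are hypothetical labels chosen here, and the numbers in the usage lines are invented for illustration rather than taken from the lecture.

```python
def percent_narrowing(narrowed_diameter, reference_diameter):
    """Percent reduction in diameter relative to a reference segment."""
    if reference_diameter <= 0:
        raise ValueError("reference_diameter must be positive")
    return 100.0 * (1.0 - narrowed_diameter / reference_diameter)


def fractional_area_change(peak_area, minimum_area):
    """Percent by which a chamber's area shrinks from its peak to its minimum."""
    if peak_area <= 0:
        raise ValueError("peak_area must be positive")
    return 100.0 * (peak_area - minimum_area) / peak_area


# Invented caliper readings: a 1.0 mm lumen next to a 5.0 mm reference segment.
print(round(percent_narrowing(1.0, 5.0), 1))
# Invented traced areas in square centimetres.
print(round(fractional_area_change(40.0, 20.0), 1))
```

The arithmetic is trivial; the manual work is in placing the calipers and tracing the border, which is the part a segmentation model would take over.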
525 00:20:18,187 --> 00:20:19,978 It's nice of David to ask me to speak here, 526 00:20:19,978 --> 00:20:21,560 but I think there are probably people 527 00:20:21,560 --> 00:20:24,295 in this room who have a lot more experience in this space. 528 00:20:24,295 --> 00:20:26,420 But the areas that have been relevant to what we've 529 00:20:26,420 --> 00:20:29,870 been doing have been image classification and then 530 00:20:29,870 --> 00:20:30,850 semantic segmentation. 531 00:20:30,850 --> 00:20:33,350 So image classification being assigning a label to an image, 532 00:20:33,350 --> 00:20:34,340 very great. 533 00:20:34,340 --> 00:20:37,820 Semantic segmentation, assigning a class label to each pixel, 534 00:20:37,820 --> 00:20:40,278 and we haven't done anything around image registration, 535 00:20:40,278 --> 00:20:41,903 but there are some interesting problems 536 00:20:41,903 --> 00:20:43,260 I've been thinking about there. 537 00:20:43,260 --> 00:20:45,343 And that's really mapping different sets of images 538 00:20:45,343 --> 00:20:46,830 onto one coordinate system. 539 00:20:46,830 --> 00:20:47,330 OK. 540 00:20:47,330 --> 00:20:50,580 So it seems obvious that image classification would 541 00:20:50,580 --> 00:20:53,458 be something that you would imagine a physician does, 542 00:20:53,458 --> 00:20:54,750 and so maybe we can mimic that. 543 00:20:54,750 --> 00:20:56,542 Seems like a reasonable thing that happens. 544 00:20:56,542 --> 00:20:59,610 So lots of things that radiologists, people 545 00:20:59,610 --> 00:21:03,690 who interpret images, do involve pattern recognition, 546 00:21:03,690 --> 00:21:04,890 and they're really fast.
547 00:21:04,890 --> 00:21:08,400 So it takes them a couple of minutes to often do things 548 00:21:08,400 --> 00:21:10,860 like detect if there's cancer, detect if somebody has 549 00:21:10,860 --> 00:21:13,530 pneumonia, detect if there's breast cancer in a mammogram, 550 00:21:13,530 --> 00:21:14,580 tell if there's fluid in the heart, 551 00:21:14,580 --> 00:21:17,038 and then even less than that, one minute often, 30 seconds, 552 00:21:17,038 --> 00:21:19,300 they can be very, very fast. 553 00:21:19,300 --> 00:21:23,160 So you can imagine the wave of excitement 554 00:21:23,160 --> 00:21:26,730 around image classification was really post-ImageNet, 555 00:21:26,730 --> 00:21:28,830 so maybe about three years, four years, or so ago. 556 00:21:28,830 --> 00:21:30,455 We're always a little slow in medicine, 557 00:21:30,455 --> 00:21:32,985 so a little bit behind other fields. 558 00:21:32,985 --> 00:21:34,860 And the places that they went were the places 559 00:21:34,860 --> 00:21:36,810 where there are huge data sets already, 560 00:21:36,810 --> 00:21:38,620 and where there are simple recognition tasks. 561 00:21:38,620 --> 00:21:40,710 So chest X-rays and mammograms are both places 562 00:21:40,710 --> 00:21:43,410 that had a lot of attention, and other places 563 00:21:43,410 --> 00:21:46,310 have been slowed down by just how hard it is to get data. 564 00:21:46,310 --> 00:21:48,060 So if you can't get a big enough data set, 565 00:21:48,060 --> 00:21:49,893 then you're not going to be able to do much. 566 00:21:49,893 --> 00:21:50,460 OK. 567 00:21:50,460 --> 00:21:54,378 So David mentioned you guys already covered this very nicely, 568 00:21:54,378 --> 00:21:55,920 and this is probably kind of old hat. 569 00:21:55,920 --> 00:21:58,650 But I would say that prior to convolutional neural networks, 570 00:21:58,650 --> 00:22:00,840 nothing was happening in the image classification 571 00:22:00,840 --> 00:22:01,600 space in medicine.
572 00:22:01,600 --> 00:22:02,593 It was just not. 573 00:22:02,593 --> 00:22:05,010 People weren't even thinking that it was even worth doing. 574 00:22:05,010 --> 00:22:07,110 Now, there's a lot of interest, and so I 575 00:22:07,110 --> 00:22:10,650 have many different companies coming and asking for help 576 00:22:10,650 --> 00:22:11,780 with some of these things. 577 00:22:11,780 --> 00:22:16,290 And so it is now a very attractive thing 578 00:22:16,290 --> 00:22:17,713 in terms of thinking, and I think 579 00:22:17,713 --> 00:22:19,380 people haven't thought out all that well 580 00:22:19,380 --> 00:22:21,630 how we're going to use that. 581 00:22:21,630 --> 00:22:23,885 So for example, if it takes a radiologist a minute 582 00:22:23,885 --> 00:22:25,260 to two minutes to read something, 583 00:22:25,260 --> 00:22:28,360 how much benefit are you going to get to automate it? 584 00:22:28,360 --> 00:22:30,600 And the real problem is you can't 585 00:22:30,600 --> 00:22:31,860 take that radiologist away. 586 00:22:31,860 --> 00:22:33,360 They're still there, because they're 587 00:22:33,360 --> 00:22:34,740 the ones who are on the hook. 588 00:22:34,740 --> 00:22:36,365 And they're going to get sued, and it's 589 00:22:36,365 --> 00:22:38,460 among the most sued profession in medicine. 590 00:22:38,460 --> 00:22:42,000 So there's lots of people who can read an X-ray. 591 00:22:42,000 --> 00:22:44,140 You don't need to have all that training. 592 00:22:44,140 --> 00:22:46,350 But if you're the one who's going to be sued, 593 00:22:46,350 --> 00:22:48,070 it ends up being that there really isn't 594 00:22:48,070 --> 00:22:49,320 any task shifting in medicine. 595 00:22:49,320 --> 00:22:51,150 There isn't that kind of, oh, I'm 596 00:22:51,150 --> 00:22:53,940 going to let such and such take on 99%, 597 00:22:53,940 --> 00:22:55,690 and just tell me when there is a problem. 
598 00:22:55,690 --> 00:22:58,260 It just doesn't happen, because they ultimately don't feel 599 00:22:58,260 --> 00:22:59,610 comfortable passing that on. 600 00:22:59,610 --> 00:23:01,540 So that's something to think about. 601 00:23:01,540 --> 00:23:03,540 So you have a task that's relatively 602 00:23:03,540 --> 00:23:06,240 easy for a very, very expensive and skilled person to do, 603 00:23:06,240 --> 00:23:07,990 and they refuse to give it up. 604 00:23:07,990 --> 00:23:08,490 OK. 605 00:23:08,490 --> 00:23:10,603 So that's a problem, but you can imagine 606 00:23:10,603 --> 00:23:13,020 there is some scenarios-- and we'll talk more about this-- 607 00:23:13,020 --> 00:23:14,103 as to where that could be. 608 00:23:14,103 --> 00:23:16,050 So let's say it's overnight. 609 00:23:16,050 --> 00:23:18,660 The radiologist is sleeping comfortably at home, 610 00:23:18,660 --> 00:23:20,790 and you have a bunch of studies being 611 00:23:20,790 --> 00:23:22,235 done in the emergency room. 612 00:23:22,235 --> 00:23:24,360 And you want to figure out, OK, which one should we 613 00:23:24,360 --> 00:23:25,050 call them about? 614 00:23:25,050 --> 00:23:26,760 So you can imagine there could be triage, 615 00:23:26,760 --> 00:23:30,300 because the status quo would be, we'll take them one by one. 616 00:23:30,300 --> 00:23:32,850 Maybe you could imagine sifting through them quickly and then 617 00:23:32,850 --> 00:23:34,230 re-prioritizing them. 618 00:23:34,230 --> 00:23:35,412 They'll still be looked at. 619 00:23:35,412 --> 00:23:37,120 Every single one will still be looked at. 620 00:23:37,120 --> 00:23:38,412 It's just the order may change. 621 00:23:38,412 --> 00:23:39,960 So that's an example, and you could 622 00:23:39,960 --> 00:23:42,450 imagine there could be separate-- someone else could 623 00:23:42,450 --> 00:23:43,540 read at the same time. 
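The overnight triage scenario described above, where every study is still read and only the order changes, amounts to sorting a queue by a model score. The following is a minimal sketch under that assumption; the study identifiers, the scores, and the function name `prioritize_studies` are all hypothetical.

```python
def prioritize_studies(studies):
    """Return all studies, most urgent first. Nothing is removed from the queue."""
    return sorted(studies, key=lambda study: study["score"], reverse=True)


# Hypothetical overnight queue in arrival order, with invented model scores.
overnight_queue = [
    {"study_id": "A", "score": 0.10},
    {"study_id": "B", "score": 0.92},
    {"study_id": "C", "score": 0.35},
]

reading_order = prioritize_studies(overnight_queue)
print([study["study_id"] for study in reading_order])  # ['B', 'C', 'A']
```

Because the output has the same length as the input, the radiologist remains responsible for every study, which matches the point that the task itself is not handed off.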
624 00:23:43,540 --> 00:23:44,910 And we'll come back to this in terms of 625 00:23:44,910 --> 00:23:46,285 whether or not you could have two 626 00:23:46,285 --> 00:23:49,500 streams and whether or not that is a scenario that 627 00:23:49,500 --> 00:23:50,478 would make some sense. 628 00:23:50,478 --> 00:23:52,020 And maybe, in resource-poor settings, 629 00:23:52,020 --> 00:23:53,670 where we're not teaming with the radiologist, 630 00:23:53,670 --> 00:23:54,795 maybe that makes sense too. 631 00:23:54,795 --> 00:23:57,210 So we'll come back to that too. 632 00:23:57,210 --> 00:23:57,810 OK. 633 00:23:57,810 --> 00:23:59,460 So here's another problem. 634 00:23:59,460 --> 00:24:03,420 So almost everything in medicine requires some element 635 00:24:03,420 --> 00:24:06,090 of confirmation of a visual finding, 636 00:24:06,090 --> 00:24:08,100 and some of the reasons are very simple. 637 00:24:08,100 --> 00:24:11,248 So let's say you want to talk about there being a tumor. 638 00:24:11,248 --> 00:24:13,290 So if you're going to ask a surgeon to biopsy it, 639 00:24:13,290 --> 00:24:15,440 you better tell them where it is. 640 00:24:15,440 --> 00:24:17,220 It's not enough to just say, this image 641 00:24:17,220 --> 00:24:18,990 has a tumor somewhere on it. 642 00:24:18,990 --> 00:24:20,893 So there is some element of that that you're 643 00:24:20,893 --> 00:24:23,310 going to need to be a little bit more detailed than simply 644 00:24:23,310 --> 00:24:26,100 making a classification with a level one image, 645 00:24:26,100 --> 00:24:28,702 but I would say beyond that. 646 00:24:28,702 --> 00:24:30,910 Let's say, I'm going to try to get one of my patients 647 00:24:30,910 --> 00:24:33,600 to go for valve surgery. 648 00:24:33,600 --> 00:24:36,060 I'll sit with them, bring up their echo, 649 00:24:36,060 --> 00:24:39,120 sit side by side with them, and point them to where it is. 
650 00:24:39,120 --> 00:24:41,090 Bring up a normal one and compare, 651 00:24:41,090 --> 00:24:43,240 because I want them to be involved in the decision. 652 00:24:43,240 --> 00:24:45,387 I want them to feel like they're not just trust-- 653 00:24:45,387 --> 00:24:46,470 and they have to trust me. 654 00:24:46,470 --> 00:24:47,370 At the end of the day, they don't even 655 00:24:47,370 --> 00:24:48,210 know that I'm showing-- 656 00:24:48,210 --> 00:24:50,210 I'll show them their name, but ultimately, there 657 00:24:50,210 --> 00:24:51,310 is some element of trust. 658 00:24:51,310 --> 00:24:53,440 They're not able to do this, but at the same time, 659 00:24:53,440 --> 00:24:55,560 there is this sense of shared decision making. 660 00:24:55,560 --> 00:24:58,920 You're trying to communicate to somebody, whose life is really 661 00:24:58,920 --> 00:25:02,380 at risk here, that this is why we're making this decision. 662 00:25:02,380 --> 00:25:04,698 So the more you could imagine that there is obscuring, 663 00:25:04,698 --> 00:25:06,490 the more difficult it is to make that case. 664 00:25:06,490 --> 00:25:08,460 So medicine is this-- 665 00:25:08,460 --> 00:25:12,560 I found this review by Bin Yu from Berkeley, that just came out, 666 00:25:12,560 --> 00:25:15,900 and it talks about this tension between predictive accuracy 667 00:25:15,900 --> 00:25:17,510 and descriptive accuracy. 668 00:25:17,510 --> 00:25:21,005 So this is sort of the typical thing we think about that matters, 669 00:25:21,005 --> 00:25:22,380 and there's lots of people who've 670 00:25:22,380 --> 00:25:24,340 written about this thing. 671 00:25:24,340 --> 00:25:28,470 Medicine is tough in that it's very demanding in this space 672 00:25:28,470 --> 00:25:32,340 here, and it's almost inflexible in this space here.
673 00:25:32,340 --> 00:25:34,260 So it's a tough nut to crack in terms 674 00:25:34,260 --> 00:25:35,760 of being able to make some progress, 675 00:25:35,760 --> 00:25:38,460 and so we'll talk more about when that's likely to happen. 676 00:25:38,460 --> 00:25:38,960 OK. 677 00:25:38,960 --> 00:25:42,060 So this again may be something that's very familiar to you. 678 00:25:42,060 --> 00:25:44,910 So we had this problem in terms of some of the disease 679 00:25:44,910 --> 00:25:46,620 detection models, and I didn't find 680 00:25:46,620 --> 00:25:48,150 this all that satisfying in terms 681 00:25:48,150 --> 00:25:50,010 of being able to successfully localize. 682 00:25:50,010 --> 00:25:51,635 So just digging through the literature, 683 00:25:51,635 --> 00:25:55,230 it looks like this idea of being able to explain 684 00:25:55,230 --> 00:25:57,690 what part of the image is driving 685 00:25:57,690 --> 00:26:01,510 a certain classification-- 686 00:26:01,510 --> 00:26:03,690 that field is modestly old. 687 00:26:03,690 --> 00:26:05,400 Maybe it goes back before that. 688 00:26:05,400 --> 00:26:07,110 But ultimately, there's two broad ways. 689 00:26:07,110 --> 00:26:10,260 You can imagine finding an exemplary image that maximally 690 00:26:10,260 --> 00:26:12,960 activates a given class, or you can take a given image 691 00:26:12,960 --> 00:26:17,010 and say, what aspect of it is driving the classification? 692 00:26:17,010 --> 00:26:20,040 And so this paper here did both of those things. 693 00:26:20,040 --> 00:26:23,135 They either went through and optimized-- 694 00:26:23,135 --> 00:26:25,260 starting from an average of all the training data-- 695 00:26:25,260 --> 00:26:28,020 they optimized the intensities until they maximized 696 00:26:28,020 --> 00:26:29,580 the score for a given class. 697 00:26:29,580 --> 00:26:31,270 So that's what's shown here.
698 00:26:31,270 --> 00:26:33,660 And then another way to do it is in some sense you could 699 00:26:33,660 --> 00:26:36,240 take a derivative of the score function 700 00:26:36,240 --> 00:26:38,430 with respect to the intensities of all the pixels 701 00:26:38,430 --> 00:26:39,210 and come up with something like this. 702 00:26:39,210 --> 00:26:41,070 But you could imagine, if you showed this to a patient, 703 00:26:41,070 --> 00:26:42,750 they wouldn't be very satisfied. 704 00:26:42,750 --> 00:26:47,850 So it's very difficult to make a case that this is super useful, 705 00:26:47,850 --> 00:26:50,400 but it seems like this field has progressed somewhat, 706 00:26:50,400 --> 00:26:51,970 and I haven't tried this out. 707 00:26:51,970 --> 00:26:53,930 This is a paper by Max Welling and company, 708 00:26:53,930 --> 00:26:55,320 out a couple of years ago, and maybe you guys 709 00:26:55,320 --> 00:26:56,550 are familiar with this. 710 00:26:56,550 --> 00:26:59,008 But this ultimately is a little bit of a different approach 711 00:26:59,008 --> 00:27:01,230 in the sense that they take patches, 712 00:27:01,230 --> 00:27:03,570 the sort of purple-like patch here, 713 00:27:03,570 --> 00:27:10,440 and they compare the final score, or class label, 714 00:27:10,440 --> 00:27:12,180 relative to what it-- 715 00:27:12,180 --> 00:27:15,090 so taking the intensity here and replacing it 716 00:27:15,090 --> 00:27:18,342 by the result of conditional sampling from the periphery. 717 00:27:18,342 --> 00:27:19,800 And just comparing those two things 718 00:27:19,800 --> 00:27:22,210 and seeing whether or not you either get activation, 719 00:27:22,210 --> 00:27:24,960 which is the red here. 720 00:27:24,960 --> 00:27:27,360 This is the way that they did the conditional sampling, 721 00:27:27,360 --> 00:27:29,802 and then blue would be the negative contributors.
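The patch-based comparison described above can be sketched as follows: score the image, replace one patch at a time with values derived from its surroundings, and record how much the score changes. This sketch is a deliberate simplification and not the method from the paper. It fills each patch with the mean of the bordering pixels instead of drawing a conditional sample, it uses non-overlapping patches, and the "classifier" is a hypothetical toy function that just sums four central pixels.

```python
def border_mean(image, top, left, size):
    """Mean of the one-pixel ring around a patch, clipped at the image edges."""
    rows, cols = len(image), len(image[0])
    values = []
    for r in range(max(0, top - 1), min(rows, top + size + 1)):
        for c in range(max(0, left - 1), min(cols, left + size + 1)):
            inside = top <= r < top + size and left <= c < left + size
            if not inside:
                values.append(image[r][c])
    return sum(values) / len(values) if values else 0.0


def patch_relevance(image, score_fn, size=2):
    """Score drop when each non-overlapping patch is replaced by its border mean.

    Positive values mark patches that support the score (the "red" regions);
    negative values mark patches that count against it (the "blue" regions).
    """
    rows, cols = len(image), len(image[0])
    base = score_fn(image)
    relevance = [[0.0] * cols for _ in range(rows)]
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            fill = border_mean(image, top, left, size)
            occluded = [row[:] for row in image]
            for r in range(top, top + size):
                for c in range(left, left + size):
                    occluded[r][c] = fill
            change = base - score_fn(occluded)
            for r in range(top, top + size):
                for c in range(left, left + size):
                    relevance[r][c] = change
    return relevance


def toy_score(image):
    """Hypothetical classifier score: total brightness of the central 2x2 block."""
    return image[2][2] + image[2][3] + image[3][2] + image[3][3]


# 6x6 toy image: dark everywhere except a bright central 2x2 block.
toy_image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        toy_image[r][c] = 1.0

relevance_map = patch_relevance(toy_image, toy_score, size=2)
for row in relevance_map:
    print(row)
```

Changing `size` trades off localization against noise, which corresponds to the remark about patch size controlling the resolution of the resulting map.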
722 00:27:29,802 --> 00:27:31,260 And there, you can imagine, there's 723 00:27:31,260 --> 00:27:32,460 a little bit more distinction here, 724 00:27:32,460 --> 00:27:34,793 and then something a little bit more on the medical side 725 00:27:34,793 --> 00:27:35,910 is this is a brain MRI. 726 00:27:35,910 --> 00:27:37,980 And so depending on this patch size, 727 00:27:37,980 --> 00:27:40,860 you get a different degree of resolution 728 00:27:40,860 --> 00:27:44,590 to localizing some areas of the image that are relevant. 729 00:27:44,590 --> 00:27:46,710 So this is something that we're going 730 00:27:46,710 --> 00:27:50,955 to expect a lot of demands from the medical field in terms 731 00:27:50,955 --> 00:27:52,080 of being able to show this. 732 00:27:52,080 --> 00:27:53,550 And at least our initial forays weren't 733 00:27:53,550 --> 00:27:55,675 very satisfying doing this with what we were doing, 734 00:27:55,675 --> 00:27:57,930 but maybe these algorithms have gotten better. 735 00:27:57,930 --> 00:27:58,430 OK. 736 00:27:58,430 --> 00:27:59,730 So next thing that matters. 737 00:27:59,730 --> 00:28:00,230 OK. 738 00:28:00,230 --> 00:28:01,710 So this is what people do. 739 00:28:01,710 --> 00:28:06,360 So I did my cardiology fellowship in MGH, 740 00:28:06,360 --> 00:28:07,820 and I just traced circles. 741 00:28:07,820 --> 00:28:08,570 That's what I did. 742 00:28:08,570 --> 00:28:11,820 I just trace circles, and I stretched a ruler across, 743 00:28:11,820 --> 00:28:12,750 and then fed that in. 744 00:28:12,750 --> 00:28:14,910 At least the program computed the volumes 745 00:28:14,910 --> 00:28:17,980 for me, the areas and volumes, but otherwise, you 746 00:28:17,980 --> 00:28:20,100 have to do this yourself. 
747 00:28:20,100 --> 00:28:23,880 And so this is like a task that's done, 748 00:28:23,880 --> 00:28:26,040 and sometimes you may have to-- 749 00:28:26,040 --> 00:28:28,680 here's an example of volumes being computed 750 00:28:28,680 --> 00:28:32,238 by tracing these sorts of things, and many radiology reports just 751 00:28:32,238 --> 00:28:33,030 involve doing that. 752 00:28:33,030 --> 00:28:34,800 So this seems like a very obvious task we 753 00:28:34,800 --> 00:28:36,870 should be able to improve on. 754 00:28:36,870 --> 00:28:39,060 So medicine tends to be not the most 755 00:28:39,060 --> 00:28:40,675 creative in terms of trying a bunch 756 00:28:40,675 --> 00:28:41,800 of different architectures. 757 00:28:41,800 --> 00:28:44,190 So if you look at the papers, they all jump on the U-Net 758 00:28:44,190 --> 00:28:47,310 as being the favorite architecture 759 00:28:47,310 --> 00:28:48,870 for semantic segmentation. 760 00:28:48,870 --> 00:28:51,000 So maybe familiar to people here, 761 00:28:51,000 --> 00:28:55,760 really, it just captures this encoding or contracting layer, 762 00:28:55,760 --> 00:28:58,020 where you're downsampling, and then there's 763 00:28:58,020 --> 00:29:00,600 a symmetric upsampling that takes place. 764 00:29:00,600 --> 00:29:03,300 And then ultimately, there's these skip connections, where 765 00:29:03,300 --> 00:29:07,290 you take an image, and then you can concatenate it 766 00:29:07,290 --> 00:29:10,410 with this upsampled layer, and this helps get a little bit 767 00:29:10,410 --> 00:29:11,160 more localization. 768 00:29:11,160 --> 00:29:12,827 So we used this for our paper, and we'll 769 00:29:12,827 --> 00:29:15,090 talk about this a little bit, and it's very popular 770 00:29:15,090 --> 00:29:16,742 within the medical literature.
771 00:29:16,742 --> 00:29:18,450 One of the things that was quite annoying 772 00:29:18,450 --> 00:29:20,802 is that what you would find for some of the images, 773 00:29:20,802 --> 00:29:22,260 you'd find, let's say, a ventricle. 774 00:29:22,260 --> 00:29:24,180 You'd find this nicely segmented area, 775 00:29:24,180 --> 00:29:26,305 and then you'd find this little satellite ventricle 776 00:29:26,305 --> 00:29:28,200 that the image would just pick. 777 00:29:28,200 --> 00:29:31,350 The problem is that this pixel-level classification 778 00:29:31,350 --> 00:29:33,690 tends to be a problem, and a human 779 00:29:33,690 --> 00:29:35,010 would never make that mistake. 780 00:29:35,010 --> 00:29:38,370 But that tends to be something that sounds like it is common 781 00:29:38,370 --> 00:29:40,740 in the-- this is a common tension is 782 00:29:40,740 --> 00:29:45,750 that this sort of focusing on relatively limited scales 783 00:29:45,750 --> 00:29:50,040 ends up being problematic, when it comes to picking up 784 00:29:50,040 --> 00:29:51,100 the global architecture. 785 00:29:51,100 --> 00:29:52,440 And so there's lots of different solutions 786 00:29:52,440 --> 00:29:53,800 it looks like in the literature. 787 00:29:53,800 --> 00:29:55,675 I just highlighted some of these from a paper 788 00:29:55,675 --> 00:29:57,785 that was published from Google a little while ago. 789 00:29:57,785 --> 00:29:59,160 One of the things that's captured 790 00:29:59,160 --> 00:30:01,500 is these ideas of dilated convolutions, 791 00:30:01,500 --> 00:30:04,440 and so that you have convolutions 792 00:30:04,440 --> 00:30:05,520 built on convolutions. 793 00:30:05,520 --> 00:30:08,280 And so ultimately, you have a much bigger receptive field 794 00:30:08,280 --> 00:30:11,513 for this layer, though you haven't really 795 00:30:11,513 --> 00:30:12,930 increased the number of parameters 796 00:30:12,930 --> 00:30:13,560 that you have to learn. 
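The claim above, that dilated convolutions give a much bigger receptive field without adding parameters, can be checked with a little arithmetic. For stacked stride-1 convolutions with kernel size k and dilation d, each layer widens the receptive field by (k - 1) * d pixels along an axis, while the number of weights per kernel stays k * k. The sketch below assumes stride 1 and square kernels, and the two dilation schedules are illustrative choices rather than ones taken from the paper mentioned.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions."""
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


def weights_per_kernel(kernel_size):
    """Weights in one square kernel (per input/output channel pair, no bias)."""
    return kernel_size * kernel_size


ordinary = [1, 1, 1]  # three plain 3x3 layers
dilated = [1, 2, 4]   # three 3x3 layers with growing dilation (assumed schedule)

print(receptive_field(3, ordinary))  # 7
print(receptive_field(3, dilated))   # 15
print(weights_per_kernel(3))         # 9 in both cases
```

Both stacks learn the same number of weights, but the dilated stack sees roughly twice as far along each axis, which is the kind of wider context that could discourage an isolated satellite region.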
797 00:30:13,560 --> 00:30:14,350 So there is some. 798 00:30:14,350 --> 00:30:15,475 It seems like there's lots. 799 00:30:15,475 --> 00:30:17,400 This is not just a problem for us 800 00:30:17,400 --> 00:30:19,323 but a problem for many people in this field. 801 00:30:19,323 --> 00:30:21,240 So we need to be a little bit more adventurous 802 00:30:21,240 --> 00:30:23,198 in terms of trying some of these other methods. 803 00:30:23,198 --> 00:30:26,190 We did try a little bit of that and didn't find any gains, 804 00:30:26,190 --> 00:30:27,690 but I think, ultimately, there still 805 00:30:27,690 --> 00:30:29,270 needs to be a little bit more work there. 806 00:30:29,270 --> 00:30:29,610 OK. 807 00:30:29,610 --> 00:30:31,318 So the last thing I'm going to talk about 808 00:30:31,318 --> 00:30:33,840 before getting into my work is really this idea 809 00:30:33,840 --> 00:30:35,490 of image registration. 810 00:30:35,490 --> 00:30:38,400 So I talked about how there are sometimes some techniques that 811 00:30:38,400 --> 00:30:42,462 have limitations, either in terms of spatial resolution 812 00:30:42,462 --> 00:30:43,420 or temporal resolution. 813 00:30:43,420 --> 00:30:46,940 So this is a PET scan here, this sort of reddish glow here, 814 00:30:46,940 --> 00:30:49,780 and in the background, we have a CAT scan of the heart. 815 00:30:49,780 --> 00:30:52,280 And so clearly, this is a poorly registered image, 816 00:30:52,280 --> 00:30:55,450 where you have the PET scan kind of floating out here, when it 817 00:30:55,450 --> 00:30:56,785 really should be lined up here. 818 00:30:56,785 --> 00:30:59,160 And so you have something that's registered better there. 819 00:30:59,160 --> 00:31:00,868 I also mentioned this problem of gating. 820 00:31:00,868 --> 00:31:03,520 So ultimately, if you have images taken 821 00:31:03,520 --> 00:31:05,560 from different cardiac cycles, you're 822 00:31:05,560 --> 00:31:08,500 going to have to align them in some way.
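One simple way to picture the gating idea, lining up frames from different heartbeats using the ECG, is to express each frame's time as a fraction of the way through its own heartbeat and then group frames with similar fractions. The sketch below assumes the ECG R-peak times are already known and uses evenly spaced phase bins; the function names, the number of bins, and all timings are hypothetical, and real gating and registration involve considerably more than this.

```python
from bisect import bisect_right


def cardiac_phase(frame_time, r_peak_times):
    """Fraction (0 to 1) of the way through the heartbeat at frame_time.

    Returns None when the frame falls outside the recorded beats.
    """
    index = bisect_right(r_peak_times, frame_time) - 1
    if index < 0 or index + 1 >= len(r_peak_times):
        return None
    start, end = r_peak_times[index], r_peak_times[index + 1]
    return (frame_time - start) / (end - start)


def gate_frames(frame_times, r_peak_times, n_bins=4):
    """Group frame indices by cardiac phase so different beats can be combined."""
    bins = [[] for _ in range(n_bins)]
    for frame_index, frame_time in enumerate(frame_times):
        phase = cardiac_phase(frame_time, r_peak_times)
        if phase is None:
            continue  # no complete beat surrounds this frame
        bins[min(int(phase * n_bins), n_bins - 1)].append(frame_index)
    return bins


# Invented timings in seconds: two beats of unequal length and five frames.
r_peaks = [0.0, 1.0, 1.8]
frames = [0.1, 0.6, 1.1, 1.5, 2.0]

gated = gate_frames(frames, r_peaks)
print(gated)  # [[0, 2], [], [1, 3], []]
```

Frames 0 and 2 come from different beats but land in the same phase bin, so they are candidates to be combined; the last frame is dropped because no complete beat surrounds it.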
823 00:31:08,500 --> 00:31:10,870 It seems like a very mature problem in computer vision 824 00:31:10,870 --> 00:31:11,530 world. 825 00:31:11,530 --> 00:31:13,640 We haven't done anything in this space, 826 00:31:13,640 --> 00:31:16,300 but ultimately, it has been around for decades. 827 00:31:16,300 --> 00:31:19,907 If not, I would just at least touch it, touch upon it. 828 00:31:19,907 --> 00:31:21,490 So this is sort of the old school way, 829 00:31:21,490 --> 00:31:23,110 and then now people are starting to use 830 00:31:23,110 --> 00:31:24,610 conditional variational autoencoders 831 00:31:24,610 --> 00:31:27,880 to be able to learn geometric transformations. 832 00:31:27,880 --> 00:31:32,430 This is the Siemens group out in Princeton that has this paper. 833 00:31:32,430 --> 00:31:34,180 Again, nothing I'm going to focus on, just 834 00:31:34,180 --> 00:31:36,250 wanted to bring it up as being an area that 835 00:31:36,250 --> 00:31:38,300 remains of interest. 836 00:31:38,300 --> 00:31:39,280 OK. 837 00:31:39,280 --> 00:31:45,320 So I think we're doing OK, but you said 4:00. 838 00:31:45,320 --> 00:31:46,060 PROFESSOR: 3:55 839 00:31:46,060 --> 00:31:46,600 RAHUL DEO: 3:55. 840 00:31:46,600 --> 00:31:47,110 OK. 841 00:31:47,110 --> 00:31:47,920 All right, and interrupt. 842 00:31:47,920 --> 00:31:48,670 Please, interrupt. 843 00:31:48,670 --> 00:31:49,720 OK? 844 00:31:49,720 --> 00:31:52,650 I'm hoping that I'm not talking too fast. 845 00:31:52,650 --> 00:31:54,530 OK. 846 00:31:54,530 --> 00:31:58,160 As David said, this was not my field, 847 00:31:58,160 --> 00:32:00,328 but increasingly, there is some interest 848 00:32:00,328 --> 00:32:02,120 in terms of getting involved in it, in part 849 00:32:02,120 --> 00:32:03,920 because of my frustrations with clinical medicine. 850 00:32:03,920 --> 00:32:05,295 So this is one of my frustrations 851 00:32:05,295 --> 00:32:06,470 with clinical medicine. 
852 00:32:06,470 --> 00:32:10,380 So cardiology has not really changed, 853 00:32:10,380 --> 00:32:13,850 and one of the things it fails at miserably 854 00:32:13,850 --> 00:32:17,630 is picking up early-onset disease. 855 00:32:17,630 --> 00:32:20,690 So here's the typical profile, a little facetious. 856 00:32:20,690 --> 00:32:24,230 So people like me, in our early 40s, 857 00:32:24,230 --> 00:32:26,365 start to already have some problems 858 00:32:26,365 --> 00:32:27,490 with some of these numbers. 859 00:32:27,490 --> 00:32:29,930 So I like to joke that, since I came back to the Harvard system 860 00:32:29,930 --> 00:32:31,847 from California, my blood pressure has gone up 861 00:32:31,847 --> 00:32:34,450 10 points, which is true, unfortunately. 862 00:32:34,450 --> 00:32:38,020 So these changes already start to happen, 863 00:32:38,020 --> 00:32:40,310 and nobody does anything about it. 864 00:32:40,310 --> 00:32:43,495 So you can go to your doctor, and you're also saying, 865 00:32:43,495 --> 00:32:45,120 no, I don't want to be on any medicine. 866 00:32:45,120 --> 00:32:47,412 They're like, no, no, you shouldn't be on any medicine. 867 00:32:47,412 --> 00:32:52,223 So you kind of hem and haw, and a decade goes by, 15 years go by. 868 00:32:52,223 --> 00:32:53,890 And then finally, you're like, OK, well, 869 00:32:53,890 --> 00:32:55,380 it looks like at least my coworkers 870 00:32:55,380 --> 00:32:58,733 are on some medicines, or maybe I'll be willing to do that. 871 00:32:58,733 --> 00:33:00,900 And so they've got lots of stuff you can be treated with, 872 00:33:00,900 --> 00:33:03,382 but it is often very difficult, and you see 873 00:33:03,382 --> 00:33:04,590 this at the doctor level too. 874 00:33:04,590 --> 00:33:05,155 Yes. 875 00:33:05,155 --> 00:33:06,530 AUDIENCE: For the optimal values, 876 00:33:06,530 --> 00:33:11,990 how much personal deviation is there for the values?
877 00:33:11,990 --> 00:33:17,860 RAHUL DEO: So the optimal value is fixed and is just 878 00:33:17,860 --> 00:33:19,780 like a reference value. 879 00:33:19,780 --> 00:33:21,900 And you can be off-- 880 00:33:21,900 --> 00:33:23,380 so blood pressure, let's say. 881 00:33:23,380 --> 00:33:25,420 So people consider optimal to be less than 120 882 00:33:25,420 --> 00:33:27,600 over less than 80. 883 00:33:27,600 --> 00:33:29,715 People are in the 200s. 884 00:33:29,715 --> 00:33:31,590 So you'd be treated in the 200s, but there'll 885 00:33:31,590 --> 00:33:34,077 be lots of people in the 140s and the 150s, 886 00:33:34,077 --> 00:33:35,910 and there'll be a degree of kind of nihilism 887 00:33:35,910 --> 00:33:38,010 about that for some time. 888 00:33:38,010 --> 00:33:40,560 And my patients would be like, oh, I got into a fight 889 00:33:40,560 --> 00:33:42,930 with the parking attendant. 890 00:33:42,930 --> 00:33:45,160 I just had a really bad phone-- 891 00:33:45,160 --> 00:33:47,250 there are, like, countless excuses for why 892 00:33:47,250 --> 00:33:49,552 it is that one shouldn't start a medication, 893 00:33:49,552 --> 00:33:51,010 and this can go on for a long time. 894 00:33:51,010 --> 00:33:51,553 Yes. 895 00:33:51,553 --> 00:33:52,470 AUDIENCE: [INAUDIBLE]. 896 00:33:52,470 --> 00:33:55,676 How can you assess the risk [INAUDIBLE] for blood pressure? 897 00:33:55,676 --> 00:33:59,127 Is that, like, noise [INAUDIBLE]? 898 00:34:03,933 --> 00:34:04,600 RAHUL DEO: Yeah. 899 00:34:04,600 --> 00:34:05,130 So OK. 900 00:34:05,130 --> 00:34:06,720 So that's a great point. 901 00:34:06,720 --> 00:34:08,100 So yeah. 902 00:34:08,100 --> 00:34:12,030 So the question is that many of the things 903 00:34:12,030 --> 00:34:14,260 that we're seeing as risk factors have 904 00:34:14,260 --> 00:34:15,750 inherent variability to them. 905 00:34:15,750 --> 00:34:19,050 Blood sugar is another great example of those things. 
906 00:34:19,050 --> 00:34:21,210 If you have a single-point estimate that 907 00:34:21,210 --> 00:34:23,370 arises in the setting of a single clinic visit, 908 00:34:23,370 --> 00:34:24,270 how much do you trust that? 909 00:34:24,270 --> 00:34:26,062 So there's a couple of things related to that. 910 00:34:26,062 --> 00:34:28,139 So one of them is that people could 911 00:34:28,139 --> 00:34:31,710 be sent home with monitors, and they can have 24-hour monitors. 912 00:34:31,710 --> 00:34:34,710 In Europe, that's much more often done than here. 913 00:34:34,710 --> 00:34:37,290 And then, the thing is that often they'll say that, 914 00:34:37,290 --> 00:34:40,020 and then you go look at like six consecutive visits, 915 00:34:40,020 --> 00:34:42,420 and they all have something elevated, but it's true. 916 00:34:42,420 --> 00:34:46,260 This is a noisy point estimate, and people 917 00:34:46,260 --> 00:34:49,270 have shown that averages tend to do better. 918 00:34:49,270 --> 00:34:53,090 But at the same time, if that's all you have-- 919 00:34:53,090 --> 00:34:54,449 and the bias is interesting. 920 00:34:54,449 --> 00:34:57,685 Because the bias comes from some degree of stress, 921 00:34:57,685 --> 00:34:59,310 but we have lots of stress in our life. 922 00:34:59,310 --> 00:35:01,060 I hopefully am not the most stressful part 923 00:35:01,060 --> 00:35:04,050 of my patient's life, and so I think that ultimately there 924 00:35:04,050 --> 00:35:05,760 are-- 925 00:35:05,760 --> 00:35:07,742 and the problem with that is it's 926 00:35:07,742 --> 00:35:09,450 a good reason for someone to talk you out 927 00:35:09,450 --> 00:35:10,930 of starting them on anything. 928 00:35:10,930 --> 00:35:13,320 And that's what ends up happening, 929 00:35:13,320 --> 00:35:16,270 and so this can be a really long period of time. 930 00:35:16,270 --> 00:35:16,770 OK. 931 00:35:16,770 --> 00:35:17,530 So this is the grim part. 932 00:35:17,530 --> 00:35:18,030 OK? 
933 00:35:18,030 --> 00:35:19,560 So it turns out that once symptoms 934 00:35:19,560 --> 00:35:22,830 develop for something like heart failure, decline is fast. 935 00:35:22,830 --> 00:35:26,322 So 50% mortality in five years, after somebody gets 936 00:35:26,322 --> 00:35:28,530 hospitalized for their first heart failure admission, 937 00:35:28,530 --> 00:35:30,960 and often the symptoms are just around that time. 938 00:35:30,960 --> 00:35:32,670 So unfortunately, these things tend 939 00:35:32,670 --> 00:35:36,982 to be irreversible changes that happen in the background, 940 00:35:36,982 --> 00:35:38,940 and largely, you don't really have any symptoms 941 00:35:38,940 --> 00:35:40,030 until late in the game. 942 00:35:40,030 --> 00:35:42,630 So we have this problem, where we have this huge stretch. 943 00:35:42,630 --> 00:35:44,250 We know that there are risk factors, 944 00:35:44,250 --> 00:35:46,792 but we have this huge stretch, where nobody is doing anything 945 00:35:46,792 --> 00:35:47,470 about them. 946 00:35:47,470 --> 00:35:49,740 And then we have sort of things going downhill relatively 947 00:35:49,740 --> 00:35:51,030 quickly after that. 948 00:35:51,030 --> 00:35:53,310 And unfortunately, I would make a case 949 00:35:53,310 --> 00:35:55,050 that responsiveness is probably 950 00:35:55,050 --> 00:35:56,850 best at this phase over there. 951 00:35:56,850 --> 00:35:59,160 Expense is really all over there. 952 00:35:59,160 --> 00:36:00,600 So we really want to find-- 953 00:36:00,600 --> 00:36:02,850 and this is what I consider to be missing in medicine. 954 00:36:02,850 --> 00:36:04,620 I'm going to come back to this again a little bit later 955 00:36:04,620 --> 00:36:06,390 on-- but really, we want to have these-- 956 00:36:06,390 --> 00:36:08,910 if you're going to do something in this asymptomatic phase, 957 00:36:08,910 --> 00:36:09,930 it better be cheap. 
958 00:36:09,930 --> 00:36:11,970 You're not going to be getting MRIs every day 959 00:36:11,970 --> 00:36:16,820 or every year for people who have no symptoms. 960 00:36:16,820 --> 00:36:18,570 The system would go bankrupt if you had that. 961 00:36:18,570 --> 00:36:20,460 So we need these low cost metrics 962 00:36:20,460 --> 00:36:22,570 that can tell us, at an individual level, 963 00:36:22,570 --> 00:36:25,260 not just if we had 1,000 people like you, 964 00:36:25,260 --> 00:36:27,053 somebody would benefit. 965 00:36:27,053 --> 00:36:28,470 And this is what my patients would 966 00:36:28,470 --> 00:36:32,190 say is that they would be so excited about their EKG 967 00:36:32,190 --> 00:36:33,690 or their echo being done every year, 968 00:36:33,690 --> 00:36:35,130 because they want to know, how does it look 969 00:36:35,130 --> 00:36:36,047 compared to last year? 970 00:36:36,047 --> 00:36:38,710 They want some comparison at their level, 971 00:36:38,710 --> 00:36:41,070 not just some public health report 972 00:36:41,070 --> 00:36:45,520 about this being a benefit to 100 people like you. 973 00:36:45,520 --> 00:36:48,480 And so it should be low cost, 974 00:36:48,480 --> 00:36:49,860 should be reflective of something 975 00:36:49,860 --> 00:36:51,840 at an individual level, should be relatively 976 00:36:51,840 --> 00:36:55,200 specific to the disease process, expressive in some way, 977 00:36:55,200 --> 00:36:56,670 and should get better with therapy. 978 00:36:56,670 --> 00:36:57,870 I think that's one of the things that's 979 00:36:57,870 --> 00:36:59,430 pretty important is if somebody does 980 00:36:59,430 --> 00:37:01,380 the things you ask them to do, hopefully, 981 00:37:01,380 --> 00:37:02,580 that will look better. 982 00:37:02,580 --> 00:37:04,920 And then that would be motivating, 983 00:37:04,920 --> 00:37:06,930 and I think that's how people get motivated is 984 00:37:06,930 --> 00:37:08,910 that they get responses. 
985 00:37:08,910 --> 00:37:11,502 So I would make a case that even simple things 986 00:37:11,502 --> 00:37:12,960 like an ultrasound-- and I have one 987 00:37:12,960 --> 00:37:14,310 shown here-- really do capture 988 00:37:14,310 --> 00:37:15,780 some of these things, and not all those things, 989 00:37:15,780 --> 00:37:17,460 but they have some of those things. 990 00:37:17,460 --> 00:37:20,100 So you have, for example, that in the setting of high blood 991 00:37:20,100 --> 00:37:22,530 pressure, the left ventricular mass starts to thicken, 992 00:37:22,530 --> 00:37:25,080 and this is a quantitative, continuous measure. 993 00:37:25,080 --> 00:37:28,380 It just thickens over time, and the heart starts to change. 994 00:37:28,380 --> 00:37:31,180 The pumping function can get worse over time. 995 00:37:31,180 --> 00:37:33,570 The left atrium, which is this structure over here, 996 00:37:33,570 --> 00:37:35,730 this thin-walled structure is amazing in the sense 997 00:37:35,730 --> 00:37:38,447 that it's almost this barometer for the pressure in the heart. 998 00:37:38,447 --> 00:37:39,780 Oh, that's a horrible reference. 999 00:37:39,780 --> 00:37:41,820 OK, but it tends to get kind of bigger 1000 00:37:41,820 --> 00:37:44,460 and bigger in a very subtle way before any symptoms happen. 1001 00:37:44,460 --> 00:37:46,320 So you have this, and this is just one view. 1002 00:37:46,320 --> 00:37:46,820 Right? 1003 00:37:46,820 --> 00:37:48,240 So this is a simple view acquired 1004 00:37:48,240 --> 00:37:50,250 from an ultrasound that captures some 1005 00:37:50,250 --> 00:37:52,420 of these things at an individual level. 1006 00:37:52,420 --> 00:37:54,270 So this gets to some of my thoughts 1007 00:37:54,270 --> 00:37:57,590 around where we could imagine automated interpretation 1008 00:37:57,590 --> 00:37:58,660 benefiting. 1009 00:37:58,660 --> 00:38:03,060 So if you want to think about where you're less likely. 
1010 00:38:03,060 --> 00:38:08,400 So with these very, very difficult, end-stage, 1011 00:38:08,400 --> 00:38:09,960 or complex decisions, where you have 1012 00:38:09,960 --> 00:38:12,423 a super skilled person even collecting 1013 00:38:12,423 --> 00:38:13,590 the data in the first place. 1014 00:38:13,590 --> 00:38:14,840 They've gone through training. 1015 00:38:14,840 --> 00:38:16,320 They're super experienced. 1016 00:38:16,320 --> 00:38:18,210 You have a very expensive piece of hardware 1017 00:38:18,210 --> 00:38:19,740 used to collect the data. 1018 00:38:19,740 --> 00:38:21,210 You have an expert interpreting it. 1019 00:38:21,210 --> 00:38:23,310 This is done late in the disease course. 1020 00:38:23,310 --> 00:38:25,590 You have to make really hard decisions, 1021 00:38:25,590 --> 00:38:27,120 and you don't want to mess it up. 1022 00:38:27,120 --> 00:38:29,040 So probably not good places to try 1023 00:38:29,040 --> 00:38:32,303 to stick in an automated system in there, 1024 00:38:32,303 --> 00:38:33,720 but what would be attractive would 1025 00:38:33,720 --> 00:38:37,120 be to try to enable studies that are not even being done at all. 1026 00:38:37,120 --> 00:38:40,470 So move to the primary care setting. 1027 00:38:40,470 --> 00:38:41,830 Use low cost handhelds. 1028 00:38:41,830 --> 00:38:43,890 So there's even now companies that 1029 00:38:43,890 --> 00:38:46,710 are starting to try to automate acquisition of the data 1030 00:38:46,710 --> 00:38:48,780 by helping people collect it and guide them 1031 00:38:48,780 --> 00:38:50,700 to collecting the right views. 1032 00:38:50,700 --> 00:38:53,280 Early in the disease course, no real symptoms here. 1033 00:38:53,280 --> 00:38:55,350 Decision support just around whether you 1034 00:38:55,350 --> 00:38:58,040 should start some meds or intensify them, low liability, 1035 00:38:58,040 --> 00:38:58,843 low cost. 
1036 00:38:58,843 --> 00:39:00,260 So this is a place where we wanted 1037 00:39:00,260 --> 00:39:02,093 to focus in terms of being able to introduce 1038 00:39:02,093 --> 00:39:05,460 some kind of innovations in this space. 1039 00:39:05,460 --> 00:39:05,960 OK. 1040 00:39:05,960 --> 00:39:07,760 So this comes back to this slide I 1041 00:39:07,760 --> 00:39:10,190 talked about where you could imagine some of these things 1042 00:39:10,190 --> 00:39:12,680 being low hanging fruit, but maybe those aren't the ones 1043 00:39:12,680 --> 00:39:14,638 that we should be focusing on. We should instead 1044 00:39:14,638 --> 00:39:19,100 be focusing on enabling more data at low cost, 1045 00:39:19,100 --> 00:39:21,870 getting more out of the data that we're collecting, 1046 00:39:21,870 --> 00:39:24,120 and helping people even acquire it in the first place. 1047 00:39:24,120 --> 00:39:26,287 So that's one category of things, and that's the one 1048 00:39:26,287 --> 00:39:28,105 I just highlighted in the previous slide. 1049 00:39:28,105 --> 00:39:29,480 You can imagine something running 1050 00:39:29,480 --> 00:39:31,380 in the background at a hospital system level 1051 00:39:31,380 --> 00:39:33,380 and just checking to see whether there's anybody 1052 00:39:33,380 --> 00:39:34,753 who was missed in some ways. 1053 00:39:34,753 --> 00:39:37,170 And then triage I'm going to talk about in the next slide. 1054 00:39:37,170 --> 00:39:39,140 I'll come back to that, and then really-- and this 1055 00:39:39,140 --> 00:39:40,280 is, again, one of the reasons I got 1056 00:39:40,280 --> 00:39:41,947 into this-- we want to do something that 1057 00:39:41,947 --> 00:39:44,000 elevates practice beyond just simply repeating 1058 00:39:44,000 --> 00:39:45,540 what we already do. 
1059 00:39:45,540 --> 00:39:48,230 And so this idea of quantitative tracking 1060 00:39:48,230 --> 00:39:50,147 of intermediate states, subclasses of disease, 1061 00:39:50,147 --> 00:39:52,438 which is actually the real reason I got into this space 1062 00:39:52,438 --> 00:39:54,410 is because I wanted to increase scale of data 1063 00:39:54,410 --> 00:39:57,412 to be able to do this, and this is where you potentially 1064 00:39:57,412 --> 00:39:58,120 would like to go. 1065 00:39:58,120 --> 00:40:00,470 So the ECG example is an interesting one, 1066 00:40:00,470 --> 00:40:02,810 because automated systems for ECG interpretation 1067 00:40:02,810 --> 00:40:05,240 have been around for 40 or 50 years, 1068 00:40:05,240 --> 00:40:11,030 and they really got going around the early 2000s, when 1069 00:40:11,030 --> 00:40:13,590 people realized-- 1070 00:40:13,590 --> 00:40:15,568 there's a pattern called an ST elevation. 1071 00:40:15,568 --> 00:40:17,360 I'm not sure if you guys talked about that. 1072 00:40:17,360 --> 00:40:20,225 This is a marker of complete stoppage 1073 00:40:20,225 --> 00:40:21,350 of blood flow to the heart. 1074 00:40:21,350 --> 00:40:24,080 So muscle starts to die. 1075 00:40:24,080 --> 00:40:28,880 And then the early 2000s, there was a quality movement that 1076 00:40:28,880 --> 00:40:31,190 said, as soon as anybody sees that, you 1077 00:40:31,190 --> 00:40:33,200 should get to somebody doing something 1078 00:40:33,200 --> 00:40:35,700 about it within an hour and a half or so. 1079 00:40:35,700 --> 00:40:38,180 And so the problem was that in the old days and the old way 1080 00:40:38,180 --> 00:40:39,020 to do this-- 1081 00:40:39,020 --> 00:40:40,395 and even this was around the time 1082 00:40:40,395 --> 00:40:43,460 I was a resident-- you would have 1083 00:40:43,460 --> 00:40:45,440 to first call the cardiologist. 1084 00:40:45,440 --> 00:40:46,213 Wake him up. 1085 00:40:46,213 --> 00:40:46,880 They would come. 
1086 00:40:46,880 --> 00:40:47,963 You'd send them the image. 1087 00:40:47,963 --> 00:40:49,060 They would look at it. 1088 00:40:49,060 --> 00:40:50,180 Then, they would decide whether or not 1089 00:40:50,180 --> 00:40:51,830 this was the pattern they were seeing, 1090 00:40:51,830 --> 00:40:54,140 and then they would activate the lab, the cath lab. 1091 00:40:54,140 --> 00:40:56,870 They would come in, and you were losing about an hour, hour 1092 00:40:56,870 --> 00:40:58,290 and a half in this process. 1093 00:40:58,290 --> 00:41:01,670 And so instead they decided that automated systems could 1094 00:41:01,670 --> 00:41:05,270 be used to be able to enable ambulance personnel 1095 00:41:05,270 --> 00:41:07,550 or emergency room docs, so non-cardiologists, 1096 00:41:07,550 --> 00:41:09,042 to be able to say, hey, look, this 1097 00:41:09,042 --> 00:41:10,250 is what we think is going on. 1098 00:41:10,250 --> 00:41:12,890 Let's bring the team in, and so people would get mobilized. 1099 00:41:12,890 --> 00:41:14,390 People would come to the hospital. 1100 00:41:14,390 --> 00:41:17,480 Nobody would do anything in terms of starting the case, 1101 00:41:17,480 --> 00:41:20,598 until somebody confirmed it, but already, the whole wheels 1102 00:41:20,598 --> 00:41:21,140 were turning. 1103 00:41:21,140 --> 00:41:22,730 And so you have this triage system, 1104 00:41:22,730 --> 00:41:24,212 where you're making a decision. 1105 00:41:24,212 --> 00:41:25,670 You're not finalizing the decision, 1106 00:41:25,670 --> 00:41:26,850 but you're speeding things up. 1107 00:41:26,850 --> 00:41:28,040 And so this is an example where you 1108 00:41:28,040 --> 00:41:29,498 could imagine it's important to try 1109 00:41:29,498 --> 00:41:31,070 to offload this to something. 1110 00:41:31,070 --> 00:41:33,363 So this is an example, and there's 1111 00:41:33,363 --> 00:41:34,530 going to be false positives. 
1112 00:41:34,530 --> 00:41:36,950 And people will laugh and mock the emergency room doctors 1113 00:41:36,950 --> 00:41:38,972 and mock the ambulance drivers and say, ah, 1114 00:41:38,972 --> 00:41:40,430 they don't know what they're doing. 1115 00:41:40,430 --> 00:41:41,480 They don't have any experience. 1116 00:41:41,480 --> 00:41:43,130 But ultimately, people were dying, 1117 00:41:43,130 --> 00:41:45,047 because they were waiting for the cardiologist 1118 00:41:45,047 --> 00:41:47,150 to be available to read the ECG. 1119 00:41:47,150 --> 00:41:50,390 So you've got to think about those in terms of places 1120 00:41:50,390 --> 00:41:51,930 where there may be cost for delay. 1121 00:41:51,930 --> 00:41:52,430 OK. 1122 00:41:52,430 --> 00:41:54,420 So coming back to echoes. 1123 00:41:54,420 --> 00:41:54,920 OK. 1124 00:41:54,920 --> 00:41:56,253 So why does an echo get studied? 1125 00:41:56,253 --> 00:42:00,110 Because this is probably not something that is typical. 1126 00:42:00,110 --> 00:42:04,310 It's a compilation of videos, and there 1127 00:42:04,310 --> 00:42:06,680 are about 70 different videos typically in the studies 1128 00:42:06,680 --> 00:42:08,690 that we do at the centers that we're at. 1129 00:42:08,690 --> 00:42:10,400 And they're taken over multiple cycles 1130 00:42:10,400 --> 00:42:12,860 and multiple different views, and often it 1131 00:42:12,860 --> 00:42:15,382 takes somebody pretty skilled to acquire those views. 1132 00:42:15,382 --> 00:42:17,090 And they take about 45 minutes to an hour 1133 00:42:17,090 --> 00:42:19,820 to gather that data, multiple different views, 1134 00:42:19,820 --> 00:42:22,010 and the sonographer is changing the depth 1135 00:42:22,010 --> 00:42:24,050 to zoom in on given structures. 
1136 00:42:24,050 --> 00:42:26,090 And so you can understand that there's already 1137 00:42:26,090 --> 00:42:27,830 somebody who was already very experienced 1138 00:42:27,830 --> 00:42:30,350 in this process even collecting the data which is a problem. 1139 00:42:30,350 --> 00:42:32,600 Because you need to take them out of the picture, 1140 00:42:32,600 --> 00:42:35,490 because they're expensive to be able to do those things. 1141 00:42:35,490 --> 00:42:38,170 So we were doing at UCSF 12,000 to 15,000. 1142 00:42:38,170 --> 00:42:41,540 Brigham was probably a little busier at 30,000 to 35,000. 1143 00:42:41,540 --> 00:42:44,300 Medicare, back in 2011, had seven million of these performed, 1144 00:42:44,300 --> 00:42:47,630 and there's probably hundreds of millions of these archived, 1145 00:42:47,630 --> 00:42:50,420 so lots of data. 1146 00:42:50,420 --> 00:42:54,050 So we published a paper last year 1147 00:42:54,050 --> 00:42:57,440 trying to automate really all of the main processes around this, 1148 00:42:57,440 --> 00:43:01,130 and part of the reason to do it all is that it doesn't help you to have 1149 00:43:01,130 --> 00:43:02,540 one little bit automated. 1150 00:43:02,540 --> 00:43:04,550 Because at the end of the day, if you 1151 00:43:04,550 --> 00:43:06,140 have to have a cardiologist doing 1152 00:43:06,140 --> 00:43:07,220 everything else and a sonographer 1153 00:43:07,220 --> 00:43:09,262 doing everything else, what have you really saved 1154 00:43:09,262 --> 00:43:10,900 by having one little step? 1155 00:43:10,900 --> 00:43:14,030 So the goal here was to start from the raw study, coming straight 1156 00:43:14,030 --> 00:43:16,410 off the machine, and try to do everything. 1157 00:43:16,410 --> 00:43:18,110 And so that involves sorting through all 1158 00:43:18,110 --> 00:43:20,660 these different views, coming up with an empirical quality score 1159 00:43:20,660 --> 00:43:25,100 for each, segmenting all the five primary views that we use. 
1160 00:43:25,100 --> 00:43:27,080 Directly detecting some diseases, 1161 00:43:27,080 --> 00:43:29,032 and then computing all the standard mass 1162 00:43:29,032 --> 00:43:31,240 and volume types of measurements that come from this. 1163 00:43:31,240 --> 00:43:34,070 So we wanted to do it all, and this was, I think, 1164 00:43:34,070 --> 00:43:37,880 it wasn't strikingly original in the algorithms that were used. 1165 00:43:37,880 --> 00:43:40,358 But at the same time, it was very bold for anybody 1166 00:43:40,358 --> 00:43:42,650 in the community to try to take this on, and of course, 1167 00:43:42,650 --> 00:43:44,983 in general, all the backlash you could imagine when you 1168 00:43:44,983 --> 00:43:46,460 try to do something like this. 1169 00:43:46,460 --> 00:43:49,712 I still hear it, but there's excitement. 1170 00:43:49,712 --> 00:43:51,170 And certainly on the industry side, 1171 00:43:51,170 --> 00:43:54,110 there's real excitement in that this is feasible. 1172 00:43:54,110 --> 00:44:01,020 So I was running a biology lab, back in 2016 or so, 1173 00:44:01,020 --> 00:44:02,130 and then decided-- 1174 00:44:02,130 --> 00:44:06,720 so my cousin's husband is the Dean of Engineering at Penn, 1175 00:44:06,720 --> 00:44:09,270 and I emailed him and said, do you know anyone at Berkeley? 1176 00:44:09,270 --> 00:44:10,112 I live near there. 1177 00:44:10,112 --> 00:44:12,570 I have a very long commute, and I was like closer to there. 1178 00:44:12,570 --> 00:44:13,653 Is anybody you know there? 1179 00:44:13,653 --> 00:44:14,760 So he's like, yeah. 1180 00:44:14,760 --> 00:44:16,410 I know Ruzena Bajcsy there. 1181 00:44:16,410 --> 00:44:18,840 She used to be at Penn, and I know Alyosha Efros. 1182 00:44:18,840 --> 00:44:21,970 And so he just emailed them and said, can you meet? 
1183 00:44:21,970 --> 00:44:22,998 [INAUDIBLE] 1184 00:44:22,998 --> 00:44:24,540 And so I met some of them, and then I 1185 00:44:24,540 --> 00:44:26,260 tried to find some people who were willing to work. 1186 00:44:26,260 --> 00:44:28,950 So I just spent a day a week there for about two years, 1187 00:44:28,950 --> 00:44:30,720 just hanging out, writing code and trying 1188 00:44:30,720 --> 00:44:32,530 to get this project off the ground. 1189 00:44:32,530 --> 00:44:34,910 So we have a few different institutions. 1190 00:44:34,910 --> 00:44:37,260 Jeff Zhang was a senior undergraduate at the time. 1191 00:44:37,260 --> 00:44:40,020 He's at Illinois right now as a graduate student. 1192 00:44:40,020 --> 00:44:43,170 It's interesting, because it's hard to get grad student level 1193 00:44:43,170 --> 00:44:45,630 people excited over stuff that's applications 1194 00:44:45,630 --> 00:44:51,080 of existing algorithms, but they're happy to advise. 1195 00:44:51,080 --> 00:44:54,070 So I ended up having to write a lot of the code myself. 1196 00:44:54,070 --> 00:44:55,572 And undergraduates are, of course, 1197 00:44:55,572 --> 00:44:57,030 excited to do these kind of things, 1198 00:44:57,030 --> 00:44:59,920 because it's better than homework, and I can pay. 1199 00:44:59,920 --> 00:45:01,920 But I think, ultimately, it's interesting to try 1200 00:45:01,920 --> 00:45:05,940 to find that sweet spot and also find things that ultimately 1201 00:45:05,940 --> 00:45:09,627 could be interesting from an algorithmic standpoint too. 1202 00:45:09,627 --> 00:45:11,460 So I'm trying to do more of that these days. 1203 00:45:11,460 --> 00:45:13,230 OK. 1204 00:45:13,230 --> 00:45:15,330 So we aren't the first to even do something 1205 00:45:15,330 --> 00:45:17,030 around classifying views. 1206 00:45:17,030 --> 00:45:18,780 So somebody already had published something, 1207 00:45:18,780 --> 00:45:20,860 but we wanted to be a little bit more nuanced than that. 
1208 00:45:20,860 --> 00:45:22,693 In that we wanted to be able to distinguish, 1209 00:45:22,693 --> 00:45:25,590 for example, whether this structure, the left ventricle, 1210 00:45:25,590 --> 00:45:26,220 is cut off. 1211 00:45:26,220 --> 00:45:28,387 Because we don't want to measure it if it's cut off, 1212 00:45:28,387 --> 00:45:30,803 and we don't want to measure the atrium if it's completely 1213 00:45:30,803 --> 00:45:31,380 cut off here. 1214 00:45:31,380 --> 00:45:33,390 So we wanted to be able to have a classifier 1215 00:45:33,390 --> 00:45:35,432 able to distinguish between some of those things. 1216 00:45:35,432 --> 00:45:37,740 It's not an easy task, and a lot of these labels 1217 00:45:37,740 --> 00:45:41,400 were me riding the train in my very long commute from East 1218 00:45:41,400 --> 00:45:44,040 Bay, in California, to UCSF. 1219 00:45:44,040 --> 00:45:47,340 And so I did a lot of labeling, and I did a lot of segmentation 1220 00:45:47,340 --> 00:45:47,840 too. 1221 00:45:47,840 --> 00:45:49,175 So I could fly a lot. 1222 00:45:49,175 --> 00:45:50,550 And that's the other thing that's 1223 00:45:50,550 --> 00:45:52,512 kind of interesting is that you often need-- 1224 00:45:52,512 --> 00:45:53,970 even to do the grunt work-- you may 1225 00:45:53,970 --> 00:45:56,890 need somebody fairly specialized to do it which is OK, but yeah, 1226 00:45:56,890 --> 00:45:58,723 so that ended up being me for a lot of this. 1227 00:45:58,723 --> 00:46:01,390 So I traced a lot of these images, 1228 00:46:01,390 --> 00:46:03,400 and then I got some other people to help out. 1229 00:46:03,400 --> 00:46:05,900 But you're not going to get a computer science undergraduate 1230 00:46:05,900 --> 00:46:08,448 to trace heart structures for you, nor are you 1231 00:46:08,448 --> 00:46:10,240 going to get them excited about doing this. 
1232 00:46:10,240 --> 00:46:11,760 So we didn't end up having that much data, 1233 00:46:11,760 --> 00:46:13,885 and I think we could probably get better than that. 1234 00:46:13,885 --> 00:46:17,190 But we had the five main views, and we implemented a modified 1235 00:46:17,190 --> 00:46:18,600 version of the U-Net algorithm. 1236 00:46:18,600 --> 00:46:21,060 We imposed a bit of a penalty to avoid 1237 00:46:21,060 --> 00:46:24,480 this problem of, for example, a little stray ventricle 1238 00:46:24,480 --> 00:46:25,500 being out there. 1239 00:46:25,500 --> 00:46:27,420 We imposed a penalty to say, well, 1240 00:46:27,420 --> 00:46:30,022 if that's too far away from the center, then 1241 00:46:30,022 --> 00:46:31,980 we're going to have the loss function take that 1242 00:46:31,980 --> 00:46:32,670 into account. 1243 00:46:32,670 --> 00:46:37,130 That helped somewhat, but so that was our approach to-- 1244 00:46:37,130 --> 00:46:38,587 this is a pretty substantial deal 1245 00:46:38,587 --> 00:46:40,170 to be able to do all these things that 1246 00:46:40,170 --> 00:46:42,120 normally would be very tedious. 1247 00:46:42,120 --> 00:46:44,310 And as a result, when we start to analyze things, 1248 00:46:44,310 --> 00:46:47,340 we can segment every single frame of every single video. 1249 00:46:47,340 --> 00:46:49,900 The typical echo reader will take two frames and trace them. 1250 00:46:49,900 --> 00:46:50,400 That's it. 1251 00:46:50,400 --> 00:46:51,330 That's all you get. 1252 00:46:51,330 --> 00:46:53,910 So we can do everything over every single cardiac cycle, 1253 00:46:53,910 --> 00:46:56,290 because there's amazing variability from beat to beat. 1254 00:46:56,290 --> 00:46:58,560 And so it's silly to think that that 1255 00:46:58,560 --> 00:47:02,020 should be the gold standard, but that is the gold standard. 1256 00:47:02,020 --> 00:47:03,837 So we had thousands of echoes. 1257 00:47:03,837 --> 00:47:04,920 So that's the other thing. 
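The penalty described above, where a stray piece of predicted ventricle far from the center adds to the loss, could be sketched roughly as follows. This is only an illustration in plain Python, not the code from the paper: the function names, the choice of `center`, the `radius` inside which no penalty applies, and the `weight` on the penalty are all assumptions made for the example.

```python
import math

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel cross-entropy between a predicted mask and a label mask."""
    total, n = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            n += 1
    return total / n

def center_distance_penalty(pred, center, radius):
    """Probability-weighted mean distance beyond `radius` from `center`.

    Pixels predicted as foreground inside the radius cost nothing;
    stray foreground far from the center costs more the farther it is.
    """
    weighted, mass = 0.0, 0.0
    for i, row in enumerate(pred):
        for j, p in enumerate(row):
            d = math.hypot(i - center[0], j - center[1])
            weighted += p * max(0.0, d - radius)
            mass += p
    return weighted / mass if mass > 0 else 0.0

def penalized_loss(pred, target, center, radius=1.5, weight=0.1):
    """Segmentation loss plus the (hypothetical) distance penalty term."""
    return (binary_cross_entropy(pred, target)
            + weight * center_distance_penalty(pred, center, radius))
```

In a real U-Net training loop the same idea would be written with differentiable tensor operations so that the penalty contributes gradients; the version here only shows how the extra term behaves on a small mask.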
1258 00:47:04,920 --> 00:47:07,740 So it turns out that it's almost impossible to get access 1259 00:47:07,740 --> 00:47:09,930 to echoes, so I wrote a keystroke encoder that 1260 00:47:09,930 --> 00:47:13,440 sat at the front end and just mimicked me entering in studies 1261 00:47:13,440 --> 00:47:14,320 and downloading them. 1262 00:47:14,320 --> 00:47:15,862 So that was the only way I could get them. 1263 00:47:15,862 --> 00:47:18,370 So I had about 30,000 studies built up over a year, 1264 00:47:18,370 --> 00:47:21,310 but there's no way to do bulk download. 1265 00:47:21,310 --> 00:47:23,730 And so again, you've got to do some grunt work to be 1266 00:47:23,730 --> 00:47:25,570 willing to play in this space. 1267 00:47:25,570 --> 00:47:29,495 So we had a fair number of studies 1268 00:47:29,495 --> 00:47:30,870 we could use in terms of where we 1269 00:47:30,870 --> 00:47:35,010 had measurements and decent values in terms of that. 1270 00:47:35,010 --> 00:47:36,570 I think it's interesting in terms 1271 00:47:36,570 --> 00:47:39,415 of thinking about how good one can-- how close one can get. 1272 00:47:39,415 --> 00:47:41,040 And one of the things we found is that, 1273 00:47:41,040 --> 00:47:43,890 when there were big deviations-- these are Bland-Altman plots-- 1274 00:47:43,890 --> 00:47:46,170 almost always the manual ones were wrong. 1275 00:47:46,170 --> 00:47:47,200 AUDIENCE: Why is that? 1276 00:47:47,200 --> 00:47:48,090 RAHUL DEO: Oh, OK. 1277 00:47:48,090 --> 00:47:48,590 OK. 1278 00:47:48,590 --> 00:47:53,340 So Bland-Altman plots, so people don't like using correlations 1279 00:47:53,340 --> 00:47:54,185 in the medical-- 1280 00:47:54,185 --> 00:47:56,310 so Bland and Altman published a paper in the Lancet 1281 00:47:56,310 --> 00:47:59,190 about 30 years ago complaining that correlations 1282 00:47:59,190 --> 00:48:01,830 and correlation coefficients are ultimately not good metrics. 
1283 00:48:01,830 --> 00:48:03,940 Because you could have some substantial bias, 1284 00:48:03,940 --> 00:48:06,480 and really you want to know, if this is the gold standard, 1285 00:48:06,480 --> 00:48:08,160 you need to get that value. 1286 00:48:08,160 --> 00:48:11,490 So it really is just looking at differences 1287 00:48:11,490 --> 00:48:15,540 between, let's say, the reference value and the, 1288 00:48:15,540 --> 00:48:18,360 let's say, automated value, and then 1289 00:48:18,360 --> 00:48:20,758 plotting that against the mean of the two. 1290 00:48:20,758 --> 00:48:21,300 So that's it. 1291 00:48:21,300 --> 00:48:23,850 I did it as percentages here, but ultimately, it's just that. 1292 00:48:23,850 --> 00:48:27,330 It's that you're just taking the mean of, 1293 00:48:27,330 --> 00:48:29,790 let's say, the left ventricular volume. 1294 00:48:29,790 --> 00:48:33,450 You have a mean of the automated versus the manually measured 1295 00:48:33,450 --> 00:48:36,318 one, and then you compare what the difference is 1296 00:48:36,318 --> 00:48:37,860 of one minus the other, and so you'll 1297 00:48:37,860 --> 00:48:39,150 be on one side or the other. 1298 00:48:39,150 --> 00:48:41,692 So ideally, you would just be sitting perfectly on this line, 1299 00:48:41,692 --> 00:48:43,233 and then you're going to look and see 1300 00:48:43,233 --> 00:48:45,760 whether or not you're clustered on one side or the other. 1301 00:48:45,760 --> 00:48:48,060 So that's just the typical thing. 1302 00:48:48,060 --> 00:48:50,280 People try to avoid correlation coefficients, 1303 00:48:50,280 --> 00:48:52,890 because they kind of consider them to be not really telling 1304 00:48:52,890 --> 00:48:53,970 you whether or not-- 1305 00:48:53,970 --> 00:48:55,970 there really is a gold standard, and there truly 1306 00:48:55,970 --> 00:48:59,450 is a value here, and you want to be near that value. 
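To make the Bland-Altman idea concrete, here is a minimal sketch in plain Python. The volumes are made-up numbers, and the 1.96 standard deviation limits of agreement are the usual convention rather than something stated in the lecture. The example is built so that the automated values are always 10 units higher than the reference: the correlation is perfect, yet the Bland-Altman bias exposes the systematic offset.

```python
import math
import statistics

def bland_altman(reference, automated):
    """Return per-case means and differences, the bias, and limits of agreement."""
    means = [(r + a) / 2 for r, a in zip(reference, automated)]
    diffs = [a - r for r, a in zip(reference, automated)]
    bias = statistics.mean(diffs)    # average of (automated - reference)
    sd = statistics.stdev(diffs)     # spread of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits

def pearson(x, y):
    """Pearson correlation coefficient, written out to avoid extra dependencies."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sx * sy)

# Hypothetical left ventricular volumes (mL): automated reads 10 mL high every time.
reference = [100.0, 120.0, 140.0, 160.0]
automated = [110.0, 130.0, 150.0, 170.0]

means, diffs, bias, limits = bland_altman(reference, automated)
r = pearson(reference, automated)
print(f"correlation = {r:.2f}, bias = {bias:.1f} mL")
```

Plotting `diffs` against `means` gives the plot itself; points scattered evenly around zero indicate agreement, while points clustered on one side indicate bias.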
1307 00:48:59,450 --> 00:49:04,670 And so that's the standard for looking 1308 00:49:04,670 --> 00:49:06,980 at comparison of diagnostics. 1309 00:49:06,980 --> 00:49:09,080 So we had about 8,000 things. 1310 00:49:09,080 --> 00:49:11,417 The reviewers gave us a hard time for the space up here, 1311 00:49:11,417 --> 00:49:13,250 and there are not that many studies up here, 1312 00:49:13,250 --> 00:49:14,240 but ultimately, there are some. 1313 00:49:14,240 --> 00:49:16,490 And when we manually looked at a bunch of them, always 1314 00:49:16,490 --> 00:49:17,990 the manual ones were just wrong. 1315 00:49:17,990 --> 00:49:20,360 Either there was a typo or something like that, 1316 00:49:20,360 --> 00:49:23,657 so that was reassuring, but we were sometimes very wrong. 1317 00:49:23,657 --> 00:49:25,490 And you'd find that the places we'd be wrong 1318 00:49:25,490 --> 00:49:28,640 would be these ridiculously complex congenital heart 1319 00:49:28,640 --> 00:49:32,390 studies that we had never been given examples like that 1320 00:49:32,390 --> 00:49:33,360 before. 1321 00:49:33,360 --> 00:49:36,190 So that's a lesson to be learned is that, sometimes, you're 1322 00:49:36,190 --> 00:49:39,382 going to be really off in these sorts of approaches, 1323 00:49:39,382 --> 00:49:40,840 and you have to think a little bit. 1324 00:49:40,840 --> 00:49:41,990 And what we ended up doing is having 1325 00:49:41,990 --> 00:49:44,448 an iterative cycle, where we would identify those and feed 1326 00:49:44,448 --> 00:49:46,550 them back and sort of keep on doing that, 1327 00:49:46,550 --> 00:49:49,750 but that still needs to be improved upon. 1328 00:49:49,750 --> 00:49:50,300 OK. 1329 00:49:50,300 --> 00:49:55,550 So function, again, there's a couple of measures of function.
1330 00:49:55,550 --> 00:49:57,050 There's a company that has something 1331 00:49:57,050 --> 00:49:58,970 out there in this space, got FDA approved 1332 00:49:58,970 --> 00:50:00,720 for having an automated ejection fraction. 1333 00:50:00,720 --> 00:50:02,780 So I think we're better than their numbers, 1334 00:50:02,780 --> 00:50:04,405 overall, but yeah. 1335 00:50:04,405 --> 00:50:06,530 I think that that's just one of those things you're 1336 00:50:06,530 --> 00:50:09,090 expected to be able to do. 1337 00:50:09,090 --> 00:50:11,500 And then here's a problem that we run into. 1338 00:50:11,500 --> 00:50:15,330 So we're comparing to the status quo which, like I said, 1339 00:50:15,330 --> 00:50:18,220 is one person tracing two images and comparing them. 1340 00:50:18,220 --> 00:50:19,340 That's it. 1341 00:50:19,340 --> 00:50:24,290 So we're processing potentially 200, 300 different frames 1342 00:50:24,290 --> 00:50:29,210 per study and computing median, smoothing across. 1343 00:50:29,210 --> 00:50:31,520 We're doing a whole lot more than that. 1344 00:50:31,520 --> 00:50:34,820 So what do we do about that in terms of the gold standard? 1345 00:50:34,820 --> 00:50:37,350 And if you just take interobserver variability, 1346 00:50:37,350 --> 00:50:39,080 you're going to have up to 8% to 9% 1347 00:50:39,080 --> 00:50:41,540 in absolute compared to 60% of the reference. 1348 00:50:41,540 --> 00:50:43,470 So that's horrible. 1349 00:50:43,470 --> 00:50:44,890 So what are you supposed to do? 1350 00:50:44,890 --> 00:50:46,460 And I think so one thing people do 1351 00:50:46,460 --> 00:50:48,963 is they take multiple readers and ask them to do that. 1352 00:50:48,963 --> 00:50:50,380 But this is like, are you going 1353 00:50:50,380 --> 00:50:51,922 to get a bunch of cardiologists to do 1354 00:50:51,922 --> 00:50:54,040 like 1,000 studies for you? 1355 00:50:54,040 --> 00:50:57,470 It's very hard to imagine somebody doing that.
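[Editor's sketch] To make the contrast with "one person tracing two images" concrete, here is one way an ejection fraction could be derived from many per-frame volume estimates with median smoothing. It is an assumption-laden illustration, not the pipeline from the lecture: the rolling-median window, the use of the smoothed maximum and minimum as end-diastolic and end-systolic volumes, and the example trace are all hypothetical.

```python
from statistics import median


def rolling_median(values, window=3):
    """Smooth a per-frame series with a centered rolling median (shorter at the edges)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def ejection_fraction(volumes, window=3):
    """EF in percent: (EDV - ESV) / EDV, taken from the smoothed volume trace."""
    smoothed = rolling_median(volumes, window)
    edv = max(smoothed)  # end-diastolic volume: largest smoothed volume
    esv = min(smoothed)  # end-systolic volume: smallest smoothed volume
    return 100.0 * (edv - esv) / edv


# Hypothetical per-frame LV volumes in mL; 300 mimics a single bad segmentation.
volumes = [120, 118, 100, 70, 50, 48, 50, 300, 80, 110, 120]
ef = ejection_fraction(volumes)
```

The single outlier frame would dominate a naive max/min calculation, but it is absorbed by the median, which is one reason to average over many frames rather than rely on two hand-picked ones.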
1356 00:50:57,470 --> 00:50:59,210 You could compare it to another modality. 1357 00:50:59,210 --> 00:51:01,020 So we haven't done this yet, but you could, for example, 1358 00:51:01,020 --> 00:51:02,840 compare it to MRI and say whether or not 1359 00:51:02,840 --> 00:51:05,420 you're more consistent with another modality. 1360 00:51:05,420 --> 00:51:07,100 And then this is indirect, but you 1361 00:51:07,100 --> 00:51:09,038 can go to like outcomes in a trial 1362 00:51:09,038 --> 00:51:10,830 and see whether or not you do a better job. 1363 00:51:10,830 --> 00:51:12,493 So there are things you can do. 1364 00:51:12,493 --> 00:51:13,910 One of the things we decided to do 1365 00:51:13,910 --> 00:51:16,880 is look for correlations of structures 1366 00:51:16,880 --> 00:51:22,250 within a study itself and say, well, the mass-- 1367 00:51:22,250 --> 00:51:24,830 so we know that, for example, thickened hearts 1368 00:51:24,830 --> 00:51:26,583 lead to larger increases of pressure 1369 00:51:26,583 --> 00:51:27,750 and left atrial enlargement. 1370 00:51:27,750 --> 00:51:29,630 So we can look for correlations between those things 1371 00:51:29,630 --> 00:51:31,270 and see whether we do a better job. 1372 00:51:31,270 --> 00:51:33,920 I'd say, for the most part, we're about on par 1373 00:51:33,920 --> 00:51:35,200 with everything that's there. 1374 00:51:35,200 --> 00:51:36,617 So I don't think we're any better. 1375 00:51:36,617 --> 00:51:37,575 Sometimes we're better. 1376 00:51:37,575 --> 00:51:38,580 Sometimes we're worse. 1377 00:51:38,580 --> 00:51:40,400 And I think, for the most part, this was another way 1378 00:51:40,400 --> 00:51:42,740 to try to get at this, because we were stuck with this. 1379 00:51:42,740 --> 00:51:44,900 How do you work with a gold standard 1380 00:51:44,900 --> 00:51:46,790 that ultimately I don't think anybody really 1381 00:51:46,790 --> 00:51:49,130 trusts as a gold standard?
1382 00:51:49,130 --> 00:51:52,970 And this is a problem that just has to keep on coming up. 1383 00:51:52,970 --> 00:51:54,560 This is just an example of where you 1384 00:51:54,560 --> 00:51:58,070 could facilitate this idea of low cost serial imaging 1385 00:51:58,070 --> 00:51:58,860 and point of care. 1386 00:51:58,860 --> 00:52:01,910 So these are patients who are getting chemotherapy, 1387 00:52:01,910 --> 00:52:05,300 and so Herceptin-- not herception, Herceptin, 1388 00:52:05,300 --> 00:52:06,365 it's like inception-- 1389 00:52:10,340 --> 00:52:13,160 is an EGFR inhibitor that causes cardiac toxicity, 1390 00:52:13,160 --> 00:52:15,170 and so people are getting screening echoes. 1391 00:52:15,170 --> 00:52:17,353 So you could imagine, if you make it easier 1392 00:52:17,353 --> 00:52:18,770 to acquire and interpret that, all 1393 00:52:18,770 --> 00:52:20,420 you want to care about is the function and the size. 1394 00:52:20,420 --> 00:52:21,550 So you can imagine automating that. 1395 00:52:21,550 --> 00:52:23,420 So we just did this as proof of concept 1396 00:52:23,420 --> 00:52:26,430 that you could imagine doing something like this. 1397 00:52:26,430 --> 00:52:29,230 And for the last thing I want to talk about-- 1398 00:52:29,230 --> 00:52:31,010 or sorry, the last thing in this space-- 1399 00:52:31,010 --> 00:52:34,530 is that you could also imagine directly detecting disease. 1400 00:52:34,530 --> 00:52:37,850 And so you have to say, well, why is that even worthwhile? 1401 00:52:37,850 --> 00:52:38,389 Yes. 1402 00:52:38,389 --> 00:52:39,389 AUDIENCE: I was curious.
1403 00:52:39,389 --> 00:52:42,049 I guess it's going back to the idea of if you look 1404 00:52:42,049 --> 00:52:45,375 at blended models between human ground truth 1405 00:52:45,375 --> 00:52:54,650 and maybe a biological ground truth, [INAUDIBLE] versus sort 1406 00:52:54,650 --> 00:52:57,500 of what you could get from an MRI or something-- 1407 00:52:57,500 --> 00:53:00,615 or maybe not necessarily an MRI, but what you were saying based 1408 00:53:00,615 --> 00:53:03,240 on the underlying biology, or if those two things are generally 1409 00:53:03,240 --> 00:53:03,860 kept separate? 1410 00:53:03,860 --> 00:53:05,720 RAHUL DEO: Yeah. 1411 00:53:05,720 --> 00:53:07,550 These are early days for a lot of this, 1412 00:53:07,550 --> 00:53:10,040 and I think, anytime you make anything more complicated, 1413 00:53:10,040 --> 00:53:12,320 then the readers will give you a hard time, 1414 00:53:12,320 --> 00:53:13,450 but you can imagine that. 1415 00:53:13,450 --> 00:53:15,242 And especially, you may want to tune things 1416 00:53:15,242 --> 00:53:18,080 to be able to be closer to something like that. 1417 00:53:18,080 --> 00:53:20,348 So yeah, I think, unfortunately, people 1418 00:53:20,348 --> 00:53:22,640 are pretty conservative in terms of how they interpret, 1419 00:53:22,640 --> 00:53:24,890 but it does make some sense that there's probably 1420 00:53:24,890 --> 00:53:26,060 something that-- 1421 00:53:26,060 --> 00:53:29,900 Ideally, you want to be able to have something that is useful, 1422 00:53:29,900 --> 00:53:33,020 and useful may not be exactly the same thing as mimicking 1423 00:53:33,020 --> 00:53:34,250 what humans are doing. 1424 00:53:34,250 --> 00:53:35,630 So no, I think it's a good idea.
1425 00:53:35,630 --> 00:53:37,070 And I think that this is going to be-- 1426 00:53:37,070 --> 00:53:39,487 this next wave-- is going to be thinking a little bit more 1427 00:53:39,487 --> 00:53:41,650 about that in terms of like how do we 1428 00:53:41,650 --> 00:53:44,150 improve on what's going on over there, rather than simply 1429 00:53:44,150 --> 00:53:46,940 dragging it back to that? 1430 00:53:46,940 --> 00:53:48,310 OK. 1431 00:53:48,310 --> 00:53:50,028 So there are multiple rare diseases. 1432 00:53:50,028 --> 00:53:52,070 I used to have a clinic that would focus on these, 1433 00:53:52,070 --> 00:53:53,653 and they tend to get missed at centers 1434 00:53:53,653 --> 00:53:54,980 that don't see them that often. 1435 00:53:54,980 --> 00:53:57,140 So one place you could imagine is 1436 00:53:57,140 --> 00:53:58,940 you can focus on trying to pick those up, 1437 00:53:58,940 --> 00:54:00,590 and you could imagine, this could be just surveillance 1438 00:54:00,590 --> 00:54:01,760 running in the background. 1439 00:54:01,760 --> 00:54:06,080 It doesn't have to be kind of real time identification. 1440 00:54:06,080 --> 00:54:08,510 So there's a few diseases where it's 1441 00:54:08,510 --> 00:54:10,430 very reasonable to do these things, 1442 00:54:10,430 --> 00:54:11,490 where it's very obvious. 1443 00:54:11,490 --> 00:54:13,160 So this is a disease called hypertrophic cardiomyopathy. 1444 00:54:13,160 --> 00:54:14,790 I used to see it in my clinic. 1445 00:54:14,790 --> 00:54:17,660 So abnormally thickened hearts, leading cause of sudden death 1446 00:54:17,660 --> 00:54:18,560 in young athletes. 1447 00:54:18,560 --> 00:54:24,110 So Reggie Lewis, there's a bunch of people who've died suddenly 1448 00:54:24,110 --> 00:54:25,970 from this condition.
1449 00:54:25,970 --> 00:54:28,760 Unstable heart rhythm, sudden death, heart failure, 1450 00:54:28,760 --> 00:54:30,320 it runs in families, and there are 1451 00:54:30,320 --> 00:54:32,390 things you can do, if you identified it. 1452 00:54:32,390 --> 00:54:35,180 And so it's actually a fairly easy task, in the sense 1453 00:54:35,180 --> 00:54:38,130 that it tends to be quite obvious. 1454 00:54:38,130 --> 00:54:40,460 So we built the classification model around this, 1455 00:54:40,460 --> 00:54:43,472 and we tried to understand what it was doing in part. 1456 00:54:43,472 --> 00:54:45,680 And so we tried to do some of these kind of attention 1457 00:54:45,680 --> 00:54:47,780 or saliency type things, and they were very unsatisfying, 1458 00:54:47,780 --> 00:54:49,790 in part because I think there's so many different features 1459 00:54:49,790 --> 00:54:50,895 across the whole image. 1460 00:54:50,895 --> 00:54:52,270 So you're just getting this blob, 1461 00:54:52,270 --> 00:54:53,870 but I think maybe we just weren't implementing it 1462 00:54:53,870 --> 00:54:54,370 correctly. 1463 00:54:54,370 --> 00:54:57,740 I'm not really sure, but you have a left atrium gets bigger. 1464 00:54:57,740 --> 00:54:59,030 The heart gets thicker. 1465 00:54:59,030 --> 00:55:01,790 There's so many changes across the image. 1466 00:55:01,790 --> 00:55:03,602 It was unsatisfying in terms of that. 1467 00:55:03,602 --> 00:55:05,060 So we did something simple and just 1468 00:55:05,060 --> 00:55:06,620 took the output of the probabilities 1469 00:55:06,620 --> 00:55:08,510 and compared it to some simple things 1470 00:55:08,510 --> 00:55:10,580 that we actually know about these things 1471 00:55:10,580 --> 00:55:12,980 and found that there was some degree of correlation. 1472 00:55:12,980 --> 00:55:16,520 But I would like to make that a little bit better. 
1473 00:55:16,520 --> 00:55:18,620 Cardiac amyloid, a very popular disease for which 1474 00:55:18,620 --> 00:55:20,128 there are now therapies. 1475 00:55:20,128 --> 00:55:22,670 And so pharma is very interested in identifying these people, 1476 00:55:22,670 --> 00:55:24,712 and they really get missed at a pretty high rate. 1477 00:55:24,712 --> 00:55:26,990 So we built another model for this. 1478 00:55:26,990 --> 00:55:29,420 Usually, we had about 250 or 300 cases 1479 00:55:29,420 --> 00:55:33,652 for each of these things and maybe a few thousand controls. 1480 00:55:33,652 --> 00:55:35,360 And then this one's a little interesting. 1481 00:55:35,360 --> 00:55:37,550 This is mitral valve prolapse. 1482 00:55:37,550 --> 00:55:41,830 So this is what a prolapsing valve looks like. 1483 00:55:41,830 --> 00:55:45,980 If you imagine the plane of the valve here, it buckles back. 1484 00:55:45,980 --> 00:55:50,425 So it does this, and that's abnormal, 1485 00:55:50,425 --> 00:55:51,550 and this is a normal valve. 1486 00:55:51,550 --> 00:55:53,520 So you notice, it doesn't buckle back in. 1487 00:55:53,520 --> 00:55:55,040 So it's a little interesting in that there's really 1488 00:55:55,040 --> 00:55:57,260 only one part of the cardiac cycle that would really 1489 00:55:57,260 --> 00:55:59,920 highlight this abnormality, at least that's the way that-- 1490 00:55:59,920 --> 00:56:01,910 so the way that it's read clinically 1491 00:56:01,910 --> 00:56:04,490 is people wait for this one part of the cardiac cycle 1492 00:56:04,490 --> 00:56:05,810 where it's buckled back. 1493 00:56:05,810 --> 00:56:07,440 They draw an imaginary line across, 1494 00:56:07,440 --> 00:56:09,440 and they measure what the displacement is there, 1495 00:56:09,440 --> 00:56:11,450 and so we built a reasonable model focusing. 
1496 00:56:11,450 --> 00:56:13,013 So we phased these images and picked 1497 00:56:13,013 --> 00:56:14,930 the part of the cardiac cycle that's relevant, 1498 00:56:14,930 --> 00:56:16,638 all in an automated way and built a model 1499 00:56:16,638 --> 00:56:20,990 around that and pretty good, in terms of being able to do that, 1500 00:56:20,990 --> 00:56:23,700 in terms of being able to detect that. 1501 00:56:23,700 --> 00:56:24,200 Yes. 1502 00:56:24,200 --> 00:56:27,460 AUDIENCE: And so is this model on images at a certain time? 1503 00:56:27,460 --> 00:56:28,687 Like can you just go back? 1504 00:56:28,687 --> 00:56:30,520 Because obviously, you weren't doing videos. 1505 00:56:30,520 --> 00:56:31,170 Right? 1506 00:56:31,170 --> 00:56:32,630 RAHUL DEO: Well, so we would take the whole video. 1507 00:56:32,630 --> 00:56:33,980 We were segmenting it. 1508 00:56:33,980 --> 00:56:36,920 We were phasing it, figuring out what the part of the-- 1509 00:56:36,920 --> 00:56:38,360 when was the end systole in that, 1510 00:56:38,360 --> 00:56:41,330 and then using those as the-- so using a stack of those 1511 00:56:41,330 --> 00:56:42,377 to be able to classify. 1512 00:56:42,377 --> 00:56:44,210 AUDIENCE: So how do you know the time point? 1513 00:56:44,210 --> 00:56:45,668 RAHUL DEO: Well, that's what I'm saying. 1514 00:56:45,668 --> 00:56:47,653 So we're using the variation in the volumes. 1515 00:56:47,653 --> 00:56:48,710 AUDIENCE: The segmentation would allow 1516 00:56:48,710 --> 00:56:50,220 you to know the time point. 1517 00:56:50,220 --> 00:56:54,470 RAHUL DEO: Exactly, because so a typical echo will have an ECG 1518 00:56:54,470 --> 00:56:56,300 to use to gate, but the handhelds don't. 1519 00:56:56,300 --> 00:56:58,400 So we want to move away from the things that 1520 00:56:58,400 --> 00:57:01,040 involve the fanciness and all the bells and whistles.
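[Editor's sketch] The ECG-free phasing described in this exchange, where the variation in segmented volumes identifies end systole, could look roughly like the following. This is a hedged illustration only: the local-minimum rule, the function names, and the half-width of the frame stack are assumptions, and a real volume trace would likely need smoothing first.

```python
def end_systolic_frames(volumes):
    """Indices where the segmented volume is a local minimum (one per beat)."""
    return [
        i
        for i in range(1, len(volumes) - 1)
        if volumes[i] < volumes[i - 1] and volumes[i] <= volumes[i + 1]
    ]


def stack_indices(center, half_width, n_frames):
    """Frame indices around an end-systolic frame, clipped to the video length."""
    start = max(0, center - half_width)
    stop = min(n_frames, center + half_width + 1)
    return list(range(start, stop))


# Hypothetical per-frame ventricular volumes (mL) covering two beats.
volumes = [120, 90, 60, 85, 118, 92, 58, 88, 119]
es_frames = end_systolic_frames(volumes)
stacks = [stack_indices(i, 1, len(volumes)) for i in es_frames]
```

Each stack of frames around end systole would then be what gets passed to the classifier, so no ECG gating signal is needed.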
1521 00:57:01,040 --> 00:57:03,278 We're trying to use the image alone 1522 00:57:03,278 --> 00:57:04,820 to be able to tell the cardiac cycle. 1523 00:57:04,820 --> 00:57:06,450 So that's how we did it. 1524 00:57:06,450 --> 00:57:07,870 Yes. 1525 00:57:07,870 --> 00:57:10,280 AUDIENCE: So you mentioned handhelds. 1526 00:57:10,280 --> 00:57:12,780 With the ultrasounds [INAUDIBLE], 1527 00:57:12,780 --> 00:57:14,090 are they different from these? 1528 00:57:14,090 --> 00:57:16,510 RAHUL DEO: They look pretty similar. 1529 00:57:16,510 --> 00:57:19,340 We got some now, and they look pretty similar 1530 00:57:19,340 --> 00:57:21,380 in terms of the quality of the images, 1531 00:57:21,380 --> 00:57:23,810 and you can acquire the very same view. 1532 00:57:23,810 --> 00:57:27,413 So I think we haven't shown that we can do it off those, in part 1533 00:57:27,413 --> 00:57:29,330 because there just isn't enough training data. 1534 00:57:29,330 --> 00:57:32,630 But they look pretty nice, and I know at UCSF 1535 00:57:32,630 --> 00:57:35,080 and at Brigham, all the fellows are using it. 1536 00:57:35,080 --> 00:57:38,330 It looks pretty much the same in terms of the-- the transducers 1537 00:57:38,330 --> 00:57:40,450 are similar, and image quality is very good. 1538 00:57:40,450 --> 00:57:41,450 Resolution is very good. 1539 00:57:41,450 --> 00:57:43,850 Frame rate probably doesn't get up as high necessarily, 1540 00:57:43,850 --> 00:57:47,750 but for the most part, I don't think it's that different. 1541 00:57:47,750 --> 00:57:50,640 So that is the next phase. 1542 00:57:50,640 --> 00:57:51,400 Yes. 1543 00:57:51,400 --> 00:57:52,793 AUDIENCE: Could you comment on-- 1544 00:57:52,793 --> 00:57:54,993 so you mentioned how each of these three examples 1545 00:57:54,993 --> 00:57:56,910 could be used within a surveillance algorithm. 1546 00:57:56,910 --> 00:57:57,577 RAHUL DEO: Yeah.
1547 00:57:57,577 --> 00:57:59,680 AUDIENCE: Could you comment on where 1548 00:57:59,680 --> 00:58:02,647 along this true positive, false positive trade-off 1549 00:58:02,647 --> 00:58:04,480 you would actually be realistic to use this? 1550 00:58:04,480 --> 00:58:04,880 RAHUL DEO: Yeah. 1551 00:58:04,880 --> 00:58:05,713 That's a good point. 1552 00:58:05,713 --> 00:58:07,880 I think it would vary for every single one of those, 1553 00:58:07,880 --> 00:58:10,060 and you really want to have some costs on what the-- 1554 00:58:10,060 --> 00:58:14,500 so I would typically err on the side of higher sensitivity 1555 00:58:14,500 --> 00:58:19,100 and dump it on the cardiologists to be able to-- 1556 00:58:19,100 --> 00:58:23,550 so I would work, but I think you have to pick some-- 1557 00:58:23,550 --> 00:58:25,152 let's say, you're a product manager. 1558 00:58:25,152 --> 00:58:27,360 AUDIENCE: Just choose one of these three, and maybe-- 1559 00:58:27,360 --> 00:58:28,570 RAHUL DEO: OK. 1560 00:58:28,570 --> 00:58:29,860 Yeah. 1561 00:58:29,860 --> 00:58:32,440 So this is a pretty rare disease. 1562 00:58:32,440 --> 00:58:36,770 So your priors are pretty low in terms of these individuals. 1563 00:58:36,770 --> 00:58:39,760 And so I think you would probably 1564 00:58:39,760 --> 00:58:46,330 want to err somewhere along this area here, 1565 00:58:46,330 --> 00:58:50,110 and so just working on what the-- 1566 00:58:50,110 --> 00:58:53,830 so you probably will still have a relatively high rate 1567 00:58:53,830 --> 00:58:56,120 of false positives even in that space. 1568 00:58:56,120 --> 00:59:01,810 But I would argue that it would take the treating cardiologist 1569 00:59:01,810 --> 00:59:04,850 potentially just a few minutes to look at that study again, 1570 00:59:04,850 --> 00:59:06,850 and if you picked up one of those patients, that 1571 00:59:06,850 --> 00:59:08,147 would be a big win.
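[Editor's sketch] The point about low priors can be made concrete with Bayes' rule: for a rare disease, even a good operating point yields mostly false positives. The sensitivity, specificity, and prevalence below are made-up illustrative numbers, not figures from the lecture.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of flagged studies that are true cases, by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point for a rare disease (2 cases per 1,000 studies).
ppv = positive_predictive_value(sensitivity=0.90, specificity=0.95, prevalence=0.002)
studies_per_true_case = 1.0 / ppv
```

With these assumed numbers roughly 1 flagged study in 29 is a true case, so the trade-off comes down to whether a few minutes of re-reading per flag is worth one picked-up patient.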
1572 00:59:08,147 --> 00:59:10,480 So I think that the cost probably wouldn't be that high, 1573 00:59:10,480 --> 00:59:13,290 and you just have to make the case. 1574 00:59:13,290 --> 00:59:15,840 So therapy for amyloid, for example, 1575 00:59:15,840 --> 00:59:18,150 this is a nice sharp upstroke there. 1576 00:59:18,150 --> 00:59:21,552 There's new drugs out there that are 1577 00:59:21,552 --> 00:59:23,260 sort of begging for patients, and they're 1578 00:59:23,260 --> 00:59:25,138 having a real hard time identifying them. 1579 00:59:25,138 --> 00:59:26,680 So you could imagine again, it's sort 1580 00:59:26,680 --> 00:59:29,830 of a calculus based on what the benefits would 1581 00:59:29,830 --> 00:59:32,020 be for that identification and what burden you're 1582 00:59:32,020 --> 00:59:35,460 placing on the individuals to have to overread something. 1583 00:59:35,460 --> 00:59:37,210 And you could probably tune that depending 1584 00:59:37,210 --> 00:59:42,070 on what the disease is and who you're pitching it to. 1585 00:59:42,070 --> 00:59:44,230 But you're right, you're going to crush people 1586 00:59:44,230 --> 00:59:47,530 if like 1 in 100 ends up being a true positive, then 1587 00:59:47,530 --> 00:59:50,015 you're not going to get many fans. 1588 00:59:50,015 --> 00:59:50,515 Yes. 1589 00:59:50,515 --> 00:59:53,930 AUDIENCE: Could you comment on whether, for example, 1590 00:59:53,930 --> 00:59:56,160 [INAUDIBLE] basis, the ones that you're 1591 00:59:56,160 --> 00:59:59,670 able to predict very well at that point 1592 00:59:59,670 --> 01:00:03,470 you just chose what distinguishes the ones that 1593 01:00:03,470 --> 01:00:04,753 are defined well? 1594 01:00:04,753 --> 01:00:06,170 RAHUL DEO: So that's a good point, 1595 01:00:06,170 --> 01:00:10,060 and I don't really know in the sense 1596 01:00:10,060 --> 01:00:11,860 that I haven't looked that closely.
1597 01:00:11,860 --> 01:00:17,110 But I'm going to guess, they're very thick and very obvious 1598 01:00:17,110 --> 01:00:18,910 in that sort of sense. 1599 01:00:18,910 --> 01:00:22,295 So we have an ECG model that may pick this up early. 1600 01:00:22,295 --> 01:00:23,920 What you want is something to pick it up 1601 01:00:23,920 --> 01:00:26,230 when it's treatable, not having something that's 1602 01:00:26,230 --> 01:00:27,460 ridiculously exaggerated. 1603 01:00:27,460 --> 01:00:29,800 So you may need multiple modalities some of which 1604 01:00:29,800 --> 01:00:33,310 are more sensitive than others that can catch earlier stage 1605 01:00:33,310 --> 01:00:34,827 disease to be able to do that. 1606 01:00:34,827 --> 01:00:36,910 So there are interesting things about this disease 1607 01:00:36,910 --> 01:00:37,493 in particular. 1608 01:00:37,493 --> 01:00:40,770 So cataracts sometimes happen before-- 1609 01:00:40,770 --> 01:00:43,535 so ideally, the way you do this is-- and I'm actually 1610 01:00:43,535 --> 01:00:45,160 consulting around something like this-- 1611 01:00:45,160 --> 01:00:49,780 you ideally want a mixture of electronic health record, 1612 01:00:49,780 --> 01:00:52,870 something from other findings-- neuro findings, eye findings, 1613 01:00:52,870 --> 01:00:54,785 plus maybe something cardiac-- 1614 01:00:54,785 --> 01:00:56,410 and have something that ideally catches 1615 01:00:56,410 --> 01:00:58,660 the disease in the most treatable state. 1616 01:00:58,660 --> 01:01:00,130 And maybe echo's not the best one, 1617 01:01:00,130 --> 01:01:04,070 and I think that we'll come back to that at the end. 1618 01:01:04,070 --> 01:01:05,350 We have a little bit of time. 1619 01:01:05,350 --> 01:01:05,850 OK. 1620 01:01:08,260 --> 01:01:10,598 So UCSF is filing-- 1621 01:01:10,598 --> 01:01:11,140 I don't know.
1622 01:01:11,140 --> 01:01:12,890 I don't think this is actually patentable, 1623 01:01:12,890 --> 01:01:15,340 but they are filing for a patent. 1624 01:01:15,340 --> 01:01:18,310 I'm just filling the paperwork out today in terms of-- 1625 01:01:18,310 --> 01:01:19,600 I don't know. 1626 01:01:19,600 --> 01:01:24,730 But my code is all freely available anyway, 1627 01:01:24,730 --> 01:01:27,078 for academic, non-profit use, and they're just 1628 01:01:27,078 --> 01:01:28,120 trying to make it better. 1629 01:01:28,120 --> 01:01:31,030 I think, ultimately, my view as an academic here is 1630 01:01:31,030 --> 01:01:32,770 to try to show what's possible. 1631 01:01:32,770 --> 01:01:35,410 And then, if you want to get a commercial product, 1632 01:01:35,410 --> 01:01:37,853 then you need people to weigh in on the industry side 1633 01:01:37,853 --> 01:01:40,270 and make something pretty and make it usable and all that. 1634 01:01:40,270 --> 01:01:42,010 But I think, ultimately, I'm trying 1635 01:01:42,010 --> 01:01:44,830 to just show, hey, if we could do this in a scalable way 1636 01:01:44,830 --> 01:01:46,540 and find out something new, then you guys 1637 01:01:46,540 --> 01:01:48,430 can catch up and do something that 1638 01:01:48,430 --> 01:01:51,050 ultimately can be deployed. 1639 01:01:51,050 --> 01:01:53,745 And what's interesting is I have a collaborator in New Zealand. 1640 01:01:53,745 --> 01:01:55,120 There, they're resource poor. 1641 01:01:55,120 --> 01:01:56,948 So they have a huge backlog of patients. 1642 01:01:56,948 --> 01:01:58,490 They don't have enough sonographers, 1643 01:01:58,490 --> 01:02:00,198 and they don't have enough cardiologists. 1644 01:02:00,198 --> 01:02:02,410 So they're trying to implement this super ultra 1645 01:02:02,410 --> 01:02:06,680 quick five-minute study and then have automation.
1646 01:02:06,680 --> 01:02:10,857 And so they want our accuracy to be a little bit better, 1647 01:02:10,857 --> 01:02:12,440 but I think they're ready to roll out, 1648 01:02:12,440 --> 01:02:15,790 if we're able to get something that has probably more training 1649 01:02:15,790 --> 01:02:16,290 data. 1650 01:02:16,290 --> 01:02:16,920 Yes. 1651 01:02:16,920 --> 01:02:18,453 Are you from New Zealand? 1652 01:02:18,453 --> 01:02:18,995 AUDIENCE: No. 1653 01:02:18,995 --> 01:02:23,228 I think you started talking about the trade-off between 1654 01:02:23,228 --> 01:02:24,710 accuracy and-- 1655 01:02:24,710 --> 01:02:27,674 so in academia, I get the sense that they're always 1656 01:02:27,674 --> 01:02:29,173 chasing perfect accuracy. 1657 01:02:29,173 --> 01:02:29,840 RAHUL DEO: Yeah. 1658 01:02:29,840 --> 01:02:31,215 AUDIENCE: But as you said, you're 1659 01:02:31,215 --> 01:02:35,430 not going to get rid of cardiologists in the diagnosis. 1660 01:02:35,430 --> 01:02:37,630 So I have a philosophical question 1661 01:02:37,630 --> 01:02:40,940 of are you chasing the wrong thing? 1662 01:02:40,940 --> 01:02:45,243 Should we chase perfect accuracy? 1663 01:02:45,243 --> 01:02:45,910 RAHUL DEO: Yeah. 1664 01:02:45,910 --> 01:02:48,500 So the question is around what should our goals be? 1665 01:02:51,420 --> 01:02:57,470 So should we be just chasing after a level of accuracy 1666 01:02:57,470 --> 01:03:00,620 that may be either very, very difficult to attain? 1667 01:03:00,620 --> 01:03:03,800 And especially, if there's never a scenario where there'll be 1668 01:03:03,800 --> 01:03:06,680 no clinician involved, should we instead 1669 01:03:06,680 --> 01:03:08,450 be thinking about something that gets good 1670 01:03:08,450 --> 01:03:09,590 enough to that next step? 1671 01:03:09,590 --> 01:03:11,215 And I think that's a really good point. 
1672 01:03:15,230 --> 01:03:16,430 And what's interesting is-- 1673 01:03:16,430 --> 01:03:18,513 and also it's interesting from the industry side-- 1674 01:03:18,513 --> 01:03:21,260 is the field starts with the mimicking mode, 1675 01:03:21,260 --> 01:03:24,200 because it's much harder to change practice. 1676 01:03:24,200 --> 01:03:28,158 It's much easier to just pop something in and say, hey, 1677 01:03:28,158 --> 01:03:29,950 I know you have to make these measurements. 1678 01:03:29,950 --> 01:03:32,210 Let me make them for you, and you could look at them 1679 01:03:32,210 --> 01:03:33,532 and see if you agree. 1680 01:03:33,532 --> 01:03:34,490 So that's what ECGs do. 1681 01:03:34,490 --> 01:03:34,990 Right? 1682 01:03:34,990 --> 01:03:38,100 So nobody these days is measuring the QRS width. 1683 01:03:38,100 --> 01:03:38,990 Nobody does that. 1684 01:03:38,990 --> 01:03:39,878 That's just not done. 1685 01:03:39,878 --> 01:03:42,170 If you've got a number that's absurd, you'll change it. 1686 01:03:42,170 --> 01:03:44,420 But for the most part, you're like, it's close enough, 1687 01:03:44,420 --> 01:03:46,560 but you almost have to start with that. 1688 01:03:46,560 --> 01:03:49,590 To do something that's transformative 1689 01:03:49,590 --> 01:03:51,740 is very hard to do. 1690 01:03:51,740 --> 01:03:53,260 So I think something that involves-- 1691 01:03:53,260 --> 01:03:54,635 and I talked to David about this. 1692 01:03:54,635 --> 01:03:57,800 It's sort of like the man-machine interface is 1693 01:03:57,800 --> 01:03:59,780 fascinating to think about how do we together 1694 01:03:59,780 --> 01:04:01,130 come up with something better?
1695 01:04:01,130 --> 01:04:04,310 But it's just much harder to get that adopted, because it 1696 01:04:04,310 --> 01:04:07,190 requires buy-in in a way that's different than just 1697 01:04:07,190 --> 01:04:10,610 you do my work for me, but more that we come together 1698 01:04:10,610 --> 01:04:12,073 to do something better. 1699 01:04:12,073 --> 01:04:14,240 And I think that's going to be interesting as to how 1700 01:04:14,240 --> 01:04:16,070 to chip away at that problem. 1701 01:04:20,270 --> 01:04:20,770 OK. 1702 01:04:20,770 --> 01:04:22,490 So a couple of musings, then I'm going 1703 01:04:22,490 --> 01:04:24,948 to talk a little bit about One Brave Idea, if we have time, 1704 01:04:24,948 --> 01:04:27,522 or I can stop and take questions instead, 1705 01:04:27,522 --> 01:04:29,480 because it's a little bit of a biology venture. 1706 01:04:29,480 --> 01:04:30,020 OK. 1707 01:04:30,020 --> 01:04:33,272 So I do think that we should really look. 1708 01:04:33,272 --> 01:04:35,480 People give me a hard time around echo, and I'm like, 1709 01:04:35,480 --> 01:04:37,100 well, ECG's been around for a long time, 1710 01:04:37,100 --> 01:04:38,308 and there's automation there. 1711 01:04:38,308 --> 01:04:40,100 So let's think about how it's used there, 1712 01:04:40,100 --> 01:04:41,575 and then see whether or not-- 1713 01:04:41,575 --> 01:04:43,200 it's not as outlandish as people think. 1714 01:04:43,200 --> 01:04:45,117 So I think a lot of these routine measurements 1715 01:04:45,117 --> 01:04:48,625 are just going to be done in an automated way. 1716 01:04:48,625 --> 01:04:51,000 Already in our software, you can put out a little picture 1717 01:04:51,000 --> 01:04:53,060 and overlay the segmentation on the original image 1718 01:04:53,060 --> 01:04:54,143 and say how good it looks. 1719 01:04:54,143 --> 01:04:54,830 So that's easy. 1720 01:04:54,830 --> 01:04:56,200 So you can do that. 
1721 01:04:56,200 --> 01:04:59,540 And then this kind of idea of point of care automated 1722 01:04:59,540 --> 01:05:01,880 diagnoses can make some sense around 1723 01:05:01,880 --> 01:05:03,470 some emergency-type situations. 1724 01:05:03,470 --> 01:05:06,380 So maybe you need a quick check of function. 1725 01:05:06,380 --> 01:05:08,030 Maybe you want to know if they have 1726 01:05:08,030 --> 01:05:10,447 a lot of fluid around the heart, and you don't necessarily 1727 01:05:10,447 --> 01:05:11,090 want to wait. 1728 01:05:11,090 --> 01:05:12,465 So those will be the places where 1729 01:05:12,465 --> 01:05:15,088 there may be some kind of innovations 1730 01:05:15,088 --> 01:05:16,880 around just getting something done quickly. 1731 01:05:16,880 --> 01:05:18,380 And then you always have somebody checking 1732 01:05:18,380 --> 01:05:19,980 in the background, layered on, a little 1733 01:05:19,980 --> 01:05:21,860 like the heart attack thing I showed you, 1734 01:05:21,860 --> 01:05:23,730 and I think this problem in echo is there. 1735 01:05:23,730 --> 01:05:26,287 And so if you need skilled people 1736 01:05:26,287 --> 01:05:28,370 to be able to acquire the data in the first place, 1737 01:05:28,370 --> 01:05:31,190 you're stuck, because they can read an echo. 1738 01:05:31,190 --> 01:05:33,720 A really good sonographer can read the whole study for you. 1739 01:05:33,720 --> 01:05:35,750 So if you already have that person involved 1740 01:05:35,750 --> 01:05:38,090 in the pipeline, then it's really hard 1741 01:05:38,090 --> 01:05:41,683 to introduce a big advance. 1742 01:05:41,683 --> 01:05:43,850 So you need to figure out how to take a primary care 1743 01:05:43,850 --> 01:05:46,320 doc off the street, put a machine in their hand, 1744 01:05:46,320 --> 01:05:48,320 and let them get the image and then automate all 1745 01:05:48,320 --> 01:05:49,550 the interpretation for them.
1746 01:05:49,550 --> 01:05:52,610 And so until you can task shift into that space, 1747 01:05:52,610 --> 01:05:55,850 you're stuck with having still too high a level of skill. 1748 01:05:55,850 --> 01:05:58,170 So there are these companies that are in the space now, 1749 01:05:58,170 --> 01:06:00,280 and there's a few that are trying. 1750 01:06:00,280 --> 01:06:03,440 It's easy to imagine, if you can train a neural network 1751 01:06:03,440 --> 01:06:06,170 to classify a view, you could get it to-- 1752 01:06:06,170 --> 01:06:07,948 this gets to this idea of registration 1753 01:06:07,948 --> 01:06:10,490 a little bit-- you can recognize if you're off by 10 degrees, 1754 01:06:10,490 --> 01:06:11,510 or if you need a translation. 1755 01:06:11,510 --> 01:06:13,635 You could just train a model to be able to do that. 1756 01:06:13,635 --> 01:06:15,830 So I think that's already happening right now. 1757 01:06:15,830 --> 01:06:19,100 So it's a question as to whether that will get adopted or not, 1758 01:06:19,100 --> 01:06:20,720 but I think that, ultimately, if you 1759 01:06:20,720 --> 01:06:24,320 want to get shifting towards sort of less skilled personnel, 1760 01:06:24,320 --> 01:06:26,460 you need to do something in that space. 1761 01:06:26,460 --> 01:06:26,960 OK. 1762 01:06:26,960 --> 01:06:28,793 So this is where it gets a little bit harder 1763 01:06:28,793 --> 01:06:31,940 is to think about how to make stuff and elevate medicine 1764 01:06:31,940 --> 01:06:34,810 beyond what we're doing. 1765 01:06:34,810 --> 01:06:36,320 And this gets back to this problem 1766 01:06:36,320 --> 01:06:38,460 I mentioned is, at the end of the day, 1767 01:06:38,460 --> 01:06:41,990 you can't find new uses for echo, 1768 01:06:41,990 --> 01:06:43,700 unless the data is already there for you 1769 01:06:43,700 --> 01:06:45,450 to be able to show that there's more value 1770 01:06:45,450 --> 01:06:48,110 than there currently is, sort of this chicken and egg thing. 
1771 01:06:48,110 --> 01:06:51,830 So in some sense, what I hope to introduce in some way 1772 01:06:51,830 --> 01:06:54,180 that we can get much bigger data sets, 1773 01:06:54,180 --> 01:06:57,310 and they don't have to be 100 video data sets. 1774 01:06:57,310 --> 01:06:59,160 They can be three video data sets, 1775 01:06:59,160 --> 01:07:01,005 but we want to be able to figure out 1776 01:07:01,005 --> 01:07:02,880 how to enable more and more of these studies. 1777 01:07:02,880 --> 01:07:04,752 So then you can sort of imagine learning 1778 01:07:04,752 --> 01:07:05,960 many more complicated things. 1779 01:07:05,960 --> 01:07:07,863 You want to track people over time. 1780 01:07:07,863 --> 01:07:09,530 You want to look at treatment responses. 1781 01:07:09,530 --> 01:07:11,780 So you've got to look at where the money is already 1782 01:07:11,780 --> 01:07:13,550 and see who could do this. 1783 01:07:13,550 --> 01:07:15,290 So pharma companies are interested, 1784 01:07:15,290 --> 01:07:17,930 because they have these phase II trials. 1785 01:07:17,930 --> 01:07:19,910 They may only have three months or six months 1786 01:07:19,910 --> 01:07:22,850 to show some benefit for a drug, and they're 1787 01:07:22,850 --> 01:07:24,770 really interested in seeing whether there's 1788 01:07:24,770 --> 01:07:26,950 differences after a month, two months, three months, four 1789 01:07:26,950 --> 01:07:27,450 months. 1790 01:07:27,450 --> 01:07:29,420 So that may be a place where you get-- 1791 01:07:29,420 --> 01:07:31,380 and they're being frugal, but they have money. 1792 01:07:31,380 --> 01:07:32,797 So you could imagine, if you could 1793 01:07:32,797 --> 01:07:37,940 introduce this pipeline in there and just have handheld, simple, 1794 01:07:37,940 --> 01:07:40,850 quick to acquire, far more frequency, and you 1795 01:07:40,850 --> 01:07:43,620 show a treatment response, and that's kind of transformative 1796 01:07:43,620 --> 01:07:43,790 then. 
1797 01:07:43,790 --> 01:07:44,810 Because then, you could imagine, that 1798 01:07:44,810 --> 01:07:46,560 can get rolled out in practice after that. 1799 01:07:46,560 --> 01:07:48,902 So you need somebody to bankroll this to start with, 1800 01:07:48,902 --> 01:07:51,110 and then you could imagine, once you have a use case, 1801 01:07:51,110 --> 01:07:53,030 then you could imagine it getting much more. 1802 01:07:53,030 --> 01:07:54,710 And this idea of surveillance, you 1803 01:07:54,710 --> 01:07:57,210 could imagine that would be very doable, that you could just 1804 01:07:57,210 --> 01:07:58,695 have something taking-- 1805 01:07:58,695 --> 01:08:01,070 The problem is, you can't even get the data in the archives 1806 01:08:01,070 --> 01:08:02,600 anyway, but let's say you can get that. 1807 01:08:02,600 --> 01:08:04,820 You could just have this system looking for amyloid, 1808 01:08:04,820 --> 01:08:06,740 looking for whatever, and that would be a win 1809 01:08:06,740 --> 01:08:09,200 too is to be able to imagine doing something like that. 1810 01:08:09,200 --> 01:08:11,720 It's not putting any pressure on the clinical workflow. 1811 01:08:11,720 --> 01:08:13,107 It's not making anybody look bad. 1812 01:08:13,107 --> 01:08:15,440 I think, ultimately, it's trying to just figure out if-- 1813 01:08:15,440 --> 01:08:17,609 well, maybe somebody may be looking bad 1814 01:08:17,609 --> 01:08:19,250 if they miss something, but yeah. 1815 01:08:19,250 --> 01:08:23,450 I think it is just trying to identify individuals. 1816 01:08:23,450 --> 01:08:25,910 And so this is an area I think that's hard, 1817 01:08:25,910 --> 01:08:27,529 and so this kind of idea, this is 1818 01:08:27,529 --> 01:08:29,779 where I started a little bit, around this kind of idea 1819 01:08:29,779 --> 01:08:31,880 of this disease subclassification and risk 1820 01:08:31,880 --> 01:08:32,810 models.
1821 01:08:32,810 --> 01:08:35,939 And so that's like more sophisticated than anything 1822 01:08:35,939 --> 01:08:36,439 we're doing. 1823 01:08:36,439 --> 01:08:39,040 I think we're pretty crude at this kind of stuff, 1824 01:08:39,040 --> 01:08:42,260 but one of the challenges is people just 1825 01:08:42,260 --> 01:08:46,550 aren't interested in new categories or new risk models, 1826 01:08:46,550 --> 01:08:50,890 if they don't have some way that they can change practice. 1827 01:08:50,890 --> 01:08:54,319 And that becomes more difficult, because then you 1828 01:08:54,319 --> 01:08:56,420 need to not only introduce the model, 1829 01:08:56,420 --> 01:08:58,640 you need to show how incorporating 1830 01:08:58,640 --> 01:09:01,700 that model in some way is able to either identify 1831 01:09:01,700 --> 01:09:03,080 people who respond. 1832 01:09:03,080 --> 01:09:04,680 It always comes down to therapies 1833 01:09:04,680 --> 01:09:05,597 at the end of the day. 1834 01:09:05,597 --> 01:09:08,870 So can you tell me some subclass of people who will do better 1835 01:09:08,870 --> 01:09:10,670 on this drug, which means that you 1836 01:09:10,670 --> 01:09:13,399 have to have trial data that has all those people with all 1837 01:09:13,399 --> 01:09:14,167 that data. 1838 01:09:14,167 --> 01:09:16,250 And unfortunately, because echoes are so expensive 1839 01:09:16,250 --> 01:09:19,402 and places like the Brigham charge like $3,000 per echo, 1840 01:09:19,402 --> 01:09:20,819 then you only have like 100 people 1841 01:09:20,819 --> 01:09:23,111 who have an echo in a trial or 300 people have an echo. 1842 01:09:23,111 --> 01:09:26,819 You have a 5,000 person trial, and 5% of them have an echo. 1843 01:09:26,819 --> 01:09:29,700 So you need to change the way that gets done, because you're 1844 01:09:29,700 --> 01:09:33,270 massively underpowered to be able to detect anything that's 1845 01:09:33,270 --> 01:09:36,630 sort of a subgroup within that kind of work. 
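The sample-size point above can be made concrete with a rough calculation. The sketch below is an illustration, not an analysis from the lecture. It uses the 5,000-person trial and 5% echo coverage the speaker mentions, and it assumes (hypothetically) 1:1 randomization, a subgroup making up 20% of participants, a standardized effect size of 0.3, and a normal-approximation power formula for a two-sided comparison of means. The function names are invented for this example.

```python
import math
from statistics import NormalDist


def echo_subset_size(trial_size, echo_fraction):
    """Number of trial participants who actually have an echo."""
    return round(trial_size * echo_fraction)


def approx_power(n_per_arm, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Normal approximation; effect_size is the standardized mean difference.
    The negligible contribution of the opposite tail is ignored.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * math.sqrt(n_per_arm / 2)
    return NormalDist().cdf(noncentrality - z_crit)


TRIAL_SIZE = 5000         # from the lecture
ECHO_FRACTION = 0.05      # from the lecture
SUBGROUP_FRACTION = 0.20  # assumption, for illustration only
EFFECT_SIZE = 0.3         # assumption, for illustration only

with_echo = echo_subset_size(TRIAL_SIZE, ECHO_FRACTION)
# Participants per treatment arm inside the subgroup, with and without
# the echo bottleneck (1:1 randomization assumed).
subgroup_per_arm_echo = round(with_echo * SUBGROUP_FRACTION / 2)
subgroup_per_arm_all = round(TRIAL_SIZE * SUBGROUP_FRACTION / 2)

print("participants with an echo:", with_echo)
print("subgroup power, echo subset only:",
      round(approx_power(subgroup_per_arm_echo, EFFECT_SIZE), 2))
print("subgroup power, if everyone had an echo:",
      round(approx_power(subgroup_per_arm_all, EFFECT_SIZE), 2))
```

Under these assumed numbers, the subgroup analysis restricted to participants with an echo falls well below the conventional 80% power target, while the same analysis on the full trial would be comfortably powered.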
1846 01:09:36,630 --> 01:09:39,300 So yeah, unfortunately, the research pace of things 1847 01:09:39,300 --> 01:09:42,510 outpaces the change in practice in terms of the space, 1848 01:09:42,510 --> 01:09:46,319 until we're able to enable more data collection. 1849 01:09:46,319 --> 01:09:47,760 So I can stop there. 1850 01:09:47,760 --> 01:09:50,355 I was going to talk about blood cells in slides. 1851 01:09:50,355 --> 01:09:52,163 PROFESSOR: We can take some questions. 1852 01:09:52,163 --> 01:09:52,830 RAHUL DEO: Yeah. 1853 01:09:52,830 --> 01:09:53,040 Yeah. 1854 01:09:53,040 --> 01:09:53,250 Yeah. 1855 01:09:53,250 --> 01:09:53,479 OK. 1856 01:09:53,479 --> 01:09:54,354 Why don't we do that. 1857 01:09:57,110 --> 01:09:57,610 Yes. 1858 01:10:00,370 --> 01:10:04,480 AUDIENCE: When CT reconstruction started, 1859 01:10:04,480 --> 01:10:08,510 I remember seeing some papers where people said, well, 1860 01:10:08,510 --> 01:10:11,690 we know roughly what the anatomy should look like, 1861 01:10:11,690 --> 01:10:14,930 and so we can fill in missing details. 1862 01:10:14,930 --> 01:10:18,902 In those days, the slices were run before, 1863 01:10:18,902 --> 01:10:22,073 and so they would hallucinate what the structure looked like. 1864 01:10:22,073 --> 01:10:22,740 RAHUL DEO: Yeah. 1865 01:10:22,740 --> 01:10:25,730 AUDIENCE: And of course, that has the benefit of giving you 1866 01:10:25,730 --> 01:10:28,100 a better model, but it also does risk 1867 01:10:28,100 --> 01:10:30,690 that it's hallucinated data. 1868 01:10:30,690 --> 01:10:34,810 Have you guys tried doing that with some of the-- 1869 01:10:34,810 --> 01:10:35,560 RAHUL DEO: Yeah. 1870 01:10:35,560 --> 01:10:36,500 That's a great point. 1871 01:10:36,500 --> 01:10:37,630 So OK.
1872 01:10:37,630 --> 01:10:40,780 So the question was so cardiac imaging has 1873 01:10:40,780 --> 01:10:43,920 a very long history, and so there was a period of time 1874 01:10:43,920 --> 01:10:45,820 where there's these kind of active modelers 1875 01:10:45,820 --> 01:10:48,370 around morphologies of the heart. 1876 01:10:48,370 --> 01:10:50,710 And so people had these models around what 1877 01:10:50,710 --> 01:10:53,480 the heart should look like from many, many, many studies. 1878 01:10:53,480 --> 01:10:55,480 And they were using that, back at the time, when 1879 01:10:55,480 --> 01:10:59,560 you had these relatively coarse multi-slice scanners for a CT, 1880 01:10:59,560 --> 01:11:02,800 they would reconstruct the 3D image of the heart 1881 01:11:02,800 --> 01:11:06,040 based on some pre-existing geometric model for what 1882 01:11:06,040 --> 01:11:07,310 the heart should look like. 1883 01:11:07,310 --> 01:11:08,650 And there's, of course, a benefit to that, 1884 01:11:08,650 --> 01:11:10,317 but some risk in the sense that somebody 1885 01:11:10,317 --> 01:11:12,555 may be very different in the space that's missing. 1886 01:11:12,555 --> 01:11:14,680 And so the question is whether those kind of priors 1887 01:11:14,680 --> 01:11:17,560 can be introduced in some way, and it 1888 01:11:17,560 --> 01:11:23,290 hasn't been straightforward as to how to do that. 1889 01:11:23,290 --> 01:11:25,357 Whenever you look at these ridiculously poor 1890 01:11:25,357 --> 01:11:27,190 segmentations, you're like, this is idiotic. 1891 01:11:27,190 --> 01:11:29,470 We should be able to introduce some of that, 1892 01:11:29,470 --> 01:11:33,940 and I've seen people, for example, put an autoencoder. 1893 01:11:33,940 --> 01:11:35,450 That's not exactly getting at it, 1894 01:11:35,450 --> 01:11:36,992 but it's actually getting it somewhat 1895 01:11:36,992 --> 01:11:38,740 with these coarser features. 
1896 01:11:38,740 --> 01:11:40,960 But no, I think in terms of using 1897 01:11:40,960 --> 01:11:43,203 some degree of geometric priors, I 1898 01:11:43,203 --> 01:11:45,370 think I may have seen some literature in that space. 1899 01:11:45,370 --> 01:11:46,880 We haven't tried anything there. 1900 01:11:46,880 --> 01:11:49,440 We don't have any data to do that, unfortunately, 1901 01:11:49,440 --> 01:11:52,090 and I suspect, yeah, I just don't 1902 01:11:52,090 --> 01:11:53,884 know how difficult that is. 1903 01:11:53,884 --> 01:11:56,104 AUDIENCE: You mentioned that you don't 1904 01:11:56,104 --> 01:12:01,300 want to see a small additional atrium off at a distance. 1905 01:12:01,300 --> 01:12:03,113 So that's, in a way, building in knowledge. 1906 01:12:03,113 --> 01:12:03,780 RAHUL DEO: Yeah. 1907 01:12:03,780 --> 01:12:04,280 No. 1908 01:12:04,280 --> 01:12:06,385 I remember when I was starting this space. 1909 01:12:06,385 --> 01:12:07,510 I was like this is idiotic. 1910 01:12:07,510 --> 01:12:08,480 Why can't we do this? 1911 01:12:08,480 --> 01:12:10,188 Why don't we have some way of doing that? 1912 01:12:10,188 --> 01:12:13,270 We couldn't find at that time any architectures that 1913 01:12:13,270 --> 01:12:16,030 were straightforward to be able to do that, 1914 01:12:16,030 --> 01:12:20,150 but I'm sure there is something in that space. 1915 01:12:20,150 --> 01:12:23,200 And we didn't also have the data for those priors ourselves. 1916 01:12:23,200 --> 01:12:29,400 There's a long history of these de novo heart modelers 1917 01:12:29,400 --> 01:12:31,715 that exist out there from Oxford and the New Zealand 1918 01:12:31,715 --> 01:12:33,090 group for that matter who've been 1919 01:12:33,090 --> 01:12:35,907 doing some of this kind of multi-scale modeling. 
1920 01:12:35,907 --> 01:12:37,740 It will be interesting to see whether or not 1921 01:12:37,740 --> 01:12:40,440 there is anybody who pushes forward in that space, 1922 01:12:40,440 --> 01:12:41,650 or is it just more data? 1923 01:12:41,650 --> 01:12:44,250 I think that's always that tension. 1924 01:12:51,012 --> 01:12:52,950 AUDIENCE: Can I ask about ultrasounds? 1925 01:12:52,950 --> 01:12:54,300 RAHUL DEO: Yeah. 1926 01:12:54,300 --> 01:12:56,008 AUDIENCE: You didn't show us ultrasounds. 1927 01:12:56,008 --> 01:12:56,560 Right? 1928 01:12:56,560 --> 01:12:57,220 RAHUL DEO: Yeah, I did. 1929 01:12:57,220 --> 01:12:58,280 AUDIENCE: Oh, you did? 1930 01:12:58,280 --> 01:12:58,420 RAHUL DEO: Yeah. 1931 01:12:58,420 --> 01:12:59,450 The echoes are ultrasounds. 1932 01:12:59,450 --> 01:13:01,830 AUDIENCE: Oh, OK, but that's really expensive ultrasound. 1933 01:13:01,830 --> 01:13:02,330 Right? 1934 01:13:02,330 --> 01:13:04,193 Like there are cheaper ultrasounds 1935 01:13:04,193 --> 01:13:06,110 that you could imagine that you constantly do. 1936 01:13:06,110 --> 01:13:06,610 Right? 1937 01:13:06,610 --> 01:13:07,790 RAHUL DEO: Yeah. 1938 01:13:07,790 --> 01:13:11,210 So there is a company that just came out 1939 01:13:11,210 --> 01:13:14,210 with a $2,000 handheld ultrasound, the subscription 1940 01:13:14,210 --> 01:13:15,930 model. 1941 01:13:15,930 --> 01:13:16,430 Yeah. 1942 01:13:16,430 --> 01:13:19,880 So I think that Philips has a handheld device 1943 01:13:19,880 --> 01:13:24,150 around the $8,000 mark, so $2,000 is getting quite cheap. 1944 01:13:24,150 --> 01:13:27,650 So that's I think the space for handheld devices. 1945 01:13:27,650 --> 01:13:29,940 AUDIENCE: We're talking about resource-poor countries. 1946 01:13:29,940 --> 01:13:30,280 RAHUL DEO: Yeah. 1947 01:13:30,280 --> 01:13:32,405 AUDIENCE: In a developing country, where maybe they 1948 01:13:32,405 --> 01:13:35,240 have very few doctors per population kind of thing.
1949 01:13:35,240 --> 01:13:38,130 What kind of imaging might be useful 1950 01:13:38,130 --> 01:13:41,390 that we could then apply computer vision algorithms to? 1951 01:13:41,390 --> 01:13:43,940 RAHUL DEO: I think ultrasound is that sweet spot. 1952 01:13:43,940 --> 01:13:48,250 It has versatility, and its cost is about where-- 1953 01:13:48,250 --> 01:13:50,000 and I'm sure those companies rent it out 1954 01:13:50,000 --> 01:13:52,590 for much lower cost in those kinds of places too. 1955 01:13:52,590 --> 01:13:54,840 We're putting together-- or I put together-- actually, 1956 01:13:54,840 --> 01:13:55,460 it may not have been funded. 1957 01:13:55,460 --> 01:13:56,120 I'm not sure. 1958 01:13:56,120 --> 01:13:59,450 But looking at sub-Saharan Africa 1959 01:13:59,450 --> 01:14:02,420 and collaborating with one of the Brigham doctors 1960 01:14:02,420 --> 01:14:05,068 who travels out to sub-Saharan Africa 1961 01:14:05,068 --> 01:14:07,610 and looking to try to build some of these automated detection 1962 01:14:07,610 --> 01:14:09,860 type of things in that space. 1963 01:14:09,860 --> 01:14:12,650 So no, I think there is definite interest in that, 1964 01:14:12,650 --> 01:14:17,450 and then there may be a much bigger win there than the stuff 1965 01:14:17,450 --> 01:14:18,380 I'm proposing. 1966 01:14:18,380 --> 01:14:20,338 But yeah, no, I think that's a very good point, 1967 01:14:20,338 --> 01:14:21,218 and that would be-- 1968 01:14:21,218 --> 01:14:22,260 it's also, it's portable. 1969 01:14:22,260 --> 01:14:24,260 You could have a phone-based thing. 1970 01:14:24,260 --> 01:14:29,126 So it's actually very attractive from that standpoint. 1971 01:14:29,126 --> 01:14:30,043 PROFESSOR: [INAUDIBLE] 1972 01:14:30,043 --> 01:14:30,918 RAHUL DEO: All right. 1973 01:14:30,918 --> 01:14:33,251 I feel like I'm changing the topic substantially but not 1974 01:14:33,251 --> 01:14:33,751 totally. 1975 01:14:33,751 --> 01:14:34,310 OK.
1976 01:14:34,310 --> 01:14:39,320 So this is that slide I showed, and I pitched it in a way 1977 01:14:39,320 --> 01:14:41,410 to try to motivate you to think of ultrasound. 1978 01:14:41,410 --> 01:14:42,930 But I'm not sure ultrasound really 1979 01:14:42,930 --> 01:14:45,680 achieves all these things, in the sense I wouldn't call it 1980 01:14:45,680 --> 01:14:48,410 the greatest biological tool to get at underlying disease 1981 01:14:48,410 --> 01:14:49,850 pathways. 1982 01:14:49,850 --> 01:14:52,070 Some of these things may be late, like David said, 1983 01:14:52,070 --> 01:14:54,190 or maybe not so reversible. 1984 01:14:54,190 --> 01:14:58,580 So we've been given this One Brave Idea thing $85 million 1985 01:14:58,580 --> 01:15:02,570 now to make some dent in a specific disease, so 1986 01:15:02,570 --> 01:15:05,510 coronary artery disease or coronary heart disease. 1987 01:15:05,510 --> 01:15:07,047 It's that arrogant tech thing, where 1988 01:15:07,047 --> 01:15:08,630 you just dump a lot of money somewhere 1989 01:15:08,630 --> 01:15:10,910 and think you're going to solve all problems. 1990 01:15:10,910 --> 01:15:12,950 And happy to take it, but I think 1991 01:15:12,950 --> 01:15:14,175 that there are some problems. 1992 01:15:14,175 --> 01:15:15,800 So this is what I wanted to do, so I've 1993 01:15:15,800 --> 01:15:18,230 wanted to do this for probably the last five, six 1994 01:15:18,230 --> 01:15:19,880 years, before I even started here, 1995 01:15:19,880 --> 01:15:23,505 and this has motivated me in part for quite a while. 1996 01:15:23,505 --> 01:15:24,630 And so here's our problems. 1997 01:15:24,630 --> 01:15:25,000 OK. 1998 01:15:25,000 --> 01:15:27,500 So we're studying heart disease, so coronary artery disease 1999 01:15:27,500 --> 01:15:30,950 or coronary heart disease is the arteries in the heart. 2000 01:15:30,950 --> 01:15:32,120 You can't get at those. 2001 01:15:32,120 --> 01:15:33,410 So you can't do any biology. 
2002 01:15:33,410 --> 01:15:35,285 You can't do the stuff the cancer people do-- 2003 01:15:35,285 --> 01:15:36,120 you can't biopsy that. 2004 01:15:36,120 --> 01:15:37,575 You can't do anything there. 2005 01:15:37,575 --> 01:15:39,200 So you're stuck with the thing that you 2006 01:15:39,200 --> 01:15:42,470 want to get at is inaccessible. 2007 01:15:42,470 --> 01:15:45,020 I talked about how a lot of the imaging is expensive, 2008 01:15:45,020 --> 01:15:48,080 but all that other omic stuff is really expensive too. 2009 01:15:48,080 --> 01:15:50,980 So that's going to be not so possible, 2010 01:15:50,980 --> 01:15:54,920 and you're not going to be able to do serial $1,000 proteomics 2011 01:15:54,920 --> 01:15:55,670 on people either. 2012 01:15:55,670 --> 01:15:57,680 That's not happening anytime soon. 2013 01:15:57,680 --> 01:16:01,040 And then everything I talked about, we were woefully 2014 01:16:01,040 --> 01:16:02,690 inadequate in terms of sample size, 2015 01:16:02,690 --> 01:16:04,730 especially if we want to characterize 2016 01:16:04,730 --> 01:16:06,833 underlying complex biological processes. 2017 01:16:06,833 --> 01:16:09,125 So we expect we're going to need high dimensional data, 2018 01:16:09,125 --> 01:16:10,875 and we're going to need huge sample sizes. 2019 01:16:10,875 --> 01:16:12,617 There's Vladimir Vapnik over there. 2020 01:16:12,617 --> 01:16:13,950 And then here's another problem. 2021 01:16:13,950 --> 01:16:14,450 OK? 2022 01:16:14,450 --> 01:16:16,490 So this stuff takes time. 2023 01:16:16,490 --> 01:16:17,720 These diseases take time. 2024 01:16:17,720 --> 01:16:20,085 So if I introduce a new assay right now, 2025 01:16:20,085 --> 01:16:21,710 how am I going to show that any of this 2026 01:16:21,710 --> 01:16:22,970 is going to be beneficial? 2027 01:16:22,970 --> 01:16:25,350 Because this disease develops over 10 to 20 years.
2028 01:16:25,350 --> 01:16:27,517 So I'm not going to talk about the solution to that, 2029 01:16:27,517 --> 01:16:29,330 well, a little bit. 2030 01:16:29,330 --> 01:16:29,870 OK. 2031 01:16:29,870 --> 01:16:32,630 So one of the issues with a lot of the data that's 2032 01:16:32,630 --> 01:16:35,010 out there is it's not particularly expressive. 2033 01:16:35,010 --> 01:16:37,460 It's a lot of that just the same clinical stuff, 2034 01:16:37,460 --> 01:16:38,690 the same imaging stuff. 2035 01:16:38,690 --> 01:16:42,680 So all these big studies, these billion dollar big studies, 2036 01:16:42,680 --> 01:16:45,262 ultimately just have echoes and MRIs and maybe 2037 01:16:45,262 --> 01:16:46,970 a little bit of genetics, but they really 2038 01:16:46,970 --> 01:16:48,920 don't have stuff that is this low cost 2039 01:16:48,920 --> 01:16:51,303 expressive biological stuff that we ideally 2040 01:16:51,303 --> 01:16:52,220 want to be able to do. 2041 01:16:52,220 --> 01:16:55,250 So this is really expensive and makes $85 million look 2042 01:16:55,250 --> 01:16:57,800 like a joke, and it's not all that 2043 01:16:57,800 --> 01:16:59,820 rich in terms of complexity. 2044 01:16:59,820 --> 01:17:02,520 So we wanted to do something different, 2045 01:17:02,520 --> 01:17:05,240 and so this is the crazy thing. 2046 01:17:05,240 --> 01:17:08,340 We're focusing on circulating cells, 2047 01:17:08,340 --> 01:17:11,270 and so this is a compromise. 2048 01:17:11,270 --> 01:17:12,950 And there's a reasonably good case 2049 01:17:12,950 --> 01:17:15,250 to be made for their involvement. 2050 01:17:15,250 --> 01:17:17,270 So there's lots of data to suggest 2051 01:17:17,270 --> 01:17:19,910 that these are causal mediators of coronary artery disease 2052 01:17:19,910 --> 01:17:21,270 or coronary heart disease. 2053 01:17:21,270 --> 01:17:24,920 So you can find them in the plaques. 
2054 01:17:24,920 --> 01:17:26,720 So patients who have autoimmune diseases 2055 01:17:26,720 --> 01:17:29,390 certainly have accelerated forms of atherosclerosis. 2056 01:17:29,390 --> 01:17:30,175 There are drugs. 2057 01:17:30,175 --> 01:17:31,550 There's a drug called canakinumab 2058 01:17:31,550 --> 01:17:35,390 that inhibits IL-1 beta secretion from macrophages, 2059 01:17:35,390 --> 01:17:38,360 and this has mortality benefit in coronary artery disease. 2060 01:17:38,360 --> 01:17:40,503 There are mutations in the white blood cell 2061 01:17:40,503 --> 01:17:42,920 population themselves that are associated with early heart 2062 01:17:42,920 --> 01:17:43,730 attack. 2063 01:17:43,730 --> 01:17:46,340 So there's a lot there, and this has been going-- 2064 01:17:46,340 --> 01:17:47,840 and there's plenty of mouse models 2065 01:17:47,840 --> 01:17:49,340 that show that if you make mutations 2066 01:17:49,340 --> 01:17:51,075 only in the white blood cell compartment, 2067 01:17:51,075 --> 01:17:53,700 that you will completely change the disease course itself. 2068 01:17:53,700 --> 01:17:56,450 So there's a good amount of data out there 2069 01:17:56,450 --> 01:17:58,940 to suggest that there is an informative kind of cell type 2070 01:17:58,940 --> 01:17:59,600 there. 2071 01:17:59,600 --> 01:18:01,015 It's accessible. 2072 01:18:01,015 --> 01:18:02,390 There's lots of predictive models 2073 01:18:02,390 --> 01:18:04,515 already there that could be done with some of this, 2074 01:18:04,515 --> 01:18:07,010 and they express many of the genes that are involved. 2075 01:18:07,010 --> 01:18:10,468 And there's a window on many of these biological processes. 2076 01:18:10,468 --> 01:18:13,010 So we're focusing on computer vision approaches to this data.
2077 01:18:13,010 --> 01:18:15,050 So we decided, if we can't do the omic stuff, 2078 01:18:15,050 --> 01:18:16,940 because it costs too much, we're going 2079 01:18:16,940 --> 01:18:20,240 to take slides and have tens of thousands 2080 01:18:20,240 --> 01:18:21,850 of cells per individual. 2081 01:18:21,850 --> 01:18:23,600 And then we can introduce fluorescent dyes 2082 01:18:23,600 --> 01:18:27,350 that can focus on lots of different organelles. 2083 01:18:27,350 --> 01:18:30,860 And then we can potentially expand the phenotypic space 2084 01:18:30,860 --> 01:18:32,780 by adding all kinds of perturbations 2085 01:18:32,780 --> 01:18:35,540 that can be able to unmask attributes 2086 01:18:35,540 --> 01:18:38,600 of people that may not even be relatively there at baseline. 2087 01:18:38,600 --> 01:18:41,017 And I think I've been empowered by the computer vision 2088 01:18:41,017 --> 01:18:43,100 experience with the echo stuff, and I'm like, hey, 2089 01:18:43,100 --> 01:18:44,310 I can do this. 2090 01:18:44,310 --> 01:18:46,370 I can train these models. 2091 01:18:46,370 --> 01:18:49,790 So we're in a position now where we can-- 2092 01:18:49,790 --> 01:18:52,010 this stuff costs a few dollars per person. 2093 01:18:52,010 --> 01:18:55,250 It's cheap, and you can just keep 2094 01:18:55,250 --> 01:18:56,662 on expanding phenotypic space. 2095 01:18:56,662 --> 01:18:57,620 You can bring in drugs. 2096 01:18:57,620 --> 01:18:59,287 You can bring in whatever you want here, 2097 01:18:59,287 --> 01:19:02,300 and you're still in that dollars type range. 2098 01:19:02,300 --> 01:19:05,960 So we just piggy-back, and we just hover around-- 2099 01:19:05,960 --> 01:19:07,550 just a couple of research assistants 2100 01:19:07,550 --> 01:19:09,380 were hovering around clinics. 2101 01:19:09,380 --> 01:19:11,180 And we can do thousands of patients 2102 01:19:11,180 --> 01:19:13,340 a month, so tens of thousands of patients a year. 
2103 01:19:13,340 --> 01:19:18,410 So we can get into a deep learning sample size here, 2104 01:19:18,410 --> 01:19:21,710 and so we want these primary assays 2105 01:19:21,710 --> 01:19:23,570 to be low cost, reproducible, expressive, 2106 01:19:23,570 --> 01:19:24,830 ideally responsive to therapy. 2107 01:19:24,830 --> 01:19:27,740 So that's this space here, and there's lots of stuff 2108 01:19:27,740 --> 01:19:28,656 that we have. 2109 01:19:28,656 --> 01:19:31,470 We have all the medical record data on all these people, 2110 01:19:31,470 --> 01:19:33,810 and we can selectively do somatic sequencing. 2111 01:19:33,810 --> 01:19:35,130 We can do genome associations. 2112 01:19:35,130 --> 01:19:36,270 We have all ECG data. 2113 01:19:36,270 --> 01:19:38,160 We have selective positron emission data. 2114 01:19:38,160 --> 01:19:39,960 So it's lots of additional thought, 2115 01:19:39,960 --> 01:19:42,390 and we want to be able to walk our cheap assay 2116 01:19:42,390 --> 01:19:45,000 towards those things that are more expensive 2117 01:19:45,000 --> 01:19:47,757 but for which there's much more historical data. 2118 01:19:47,757 --> 01:19:49,590 So that's what I do with my life these days, 2119 01:19:49,590 --> 01:19:51,240 and the time problem has been solved. 2120 01:19:51,240 --> 01:19:54,570 Because we found a collaborator at MGH who has 3 1/2 million 2121 01:19:54,570 --> 01:19:57,870 of these records in terms of cell counting and cytometer 2122 01:19:57,870 --> 01:19:59,860 data going back for about three years. 2123 01:19:59,860 --> 01:20:03,390 So we should be able to get some decent events in that time. 2124 01:20:03,390 --> 01:20:06,072 I need to build a document classification model for 3 1/2 2125 01:20:06,072 --> 01:20:08,280 million records and decide whether they have coronary 2126 01:20:08,280 --> 01:20:11,580 heart disease, but sounds like that's doable. 2127 01:20:11,580 --> 01:20:13,920 We're fearless in this space.
2128 01:20:13,920 --> 01:20:15,960 And then they also have 13 million images, 2129 01:20:15,960 --> 01:20:18,412 so hundreds of thousands of people worth of slides. 2130 01:20:18,412 --> 01:20:20,370 So we can at the very least, get decent weights 2131 01:20:20,370 --> 01:20:22,530 for transfer learning from some of this data, 2132 01:20:22,530 --> 01:20:25,730 and we're doing this for acute heart attack patients. 2133 01:20:25,730 --> 01:20:29,140 So yeah, so this is what I'm doing, ultimately, 2134 01:20:29,140 --> 01:20:32,760 and so it's this bridge between existing imaging, existing 2135 01:20:32,760 --> 01:20:36,660 conventional medical data, and this low cost, 2136 01:20:36,660 --> 01:20:39,030 expressive, serial-type of stuff that's ultimately 2137 01:20:39,030 --> 01:20:42,090 hoping to expand phenotypic space and keep the cost down. 2138 01:20:42,090 --> 01:20:44,670 I think all my lessons from working with expensive imaging 2139 01:20:44,670 --> 01:20:47,300 data have motivated me to build something around this space. 2140 01:20:47,300 --> 01:20:50,890 So this is my-- it's my baby right now. 2141 01:20:50,890 --> 01:20:53,550 And so lots of things for people to be involved in, 2142 01:20:53,550 --> 01:20:58,030 if they want to, and these are some of the funding sources. 2143 01:20:58,030 --> 01:20:58,530 All right. 2144 01:20:58,530 --> 01:20:59,340 Thank you. 2145 01:20:59,340 --> 01:21:02,690 [APPLAUSE]