1 00:00:00,790 --> 00:00:03,190 The following content is provided under a Creative 2 00:00:03,190 --> 00:00:04,730 Commons license. 3 00:00:04,730 --> 00:00:07,030 Your support will help MIT OpenCourseWare 4 00:00:07,030 --> 00:00:11,390 continue to offer high quality educational resources for free. 5 00:00:11,390 --> 00:00:13,990 To make a donation or view additional materials 6 00:00:13,990 --> 00:00:17,880 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,880 --> 00:00:18,840 at ocw.mit.edu. 8 00:00:30,709 --> 00:00:33,000 PROFESSOR: Quick, quick recap of what we did last time. 9 00:00:33,000 --> 00:00:36,830 So last time we introduced this idea of decomposition 10 00:00:36,830 --> 00:00:37,610 and abstraction. 11 00:00:37,610 --> 00:00:39,719 And we started putting that into our programs. 12 00:00:39,719 --> 00:00:41,510 And these were sort of high level concepts, 13 00:00:41,510 --> 00:00:45,170 and we achieved them using these concrete things called 14 00:00:45,170 --> 00:00:46,544 functions in our programs. 15 00:00:46,544 --> 00:00:47,960 And functions allowed us to create 16 00:00:47,960 --> 00:00:53,180 code that was coherent, that had some structure to it, 17 00:00:53,180 --> 00:00:55,160 and was reusable. 18 00:00:55,160 --> 00:00:57,220 OK. 19 00:00:57,220 --> 00:00:59,487 And from now on in problem sets and in lectures, 20 00:00:59,487 --> 00:01:01,070 I'm going to be using functions a lot. 21 00:01:01,070 --> 00:01:04,019 So make sure that you understand how they work 22 00:01:04,019 --> 00:01:05,450 and all of those details. 23 00:01:05,450 --> 00:01:09,676 So today, we're going to introduce two new data types. 24 00:01:09,676 --> 00:01:11,300 And they're called compound data types, 25 00:01:11,300 --> 00:01:13,466 because they're actually data types that are made up 26 00:01:13,466 --> 00:01:17,120 of other data types, particularly ints, floats, 27 00:01:17,120 --> 00:01:19,020 Booleans, and strings. 
28 00:01:19,020 --> 00:01:22,220 And actually not just these, but other data types as well. 29 00:01:22,220 --> 00:01:24,390 So that's why they're called compound data types. 30 00:01:24,390 --> 00:01:27,080 So we're going to look at a new data 31 00:01:27,080 --> 00:01:30,776 type called a tuple and a new data type called a list. 32 00:01:30,776 --> 00:01:32,900 And then we're going to talk about these ideas that 33 00:01:32,900 --> 00:01:38,110 come about with-- specifically with lists. 34 00:01:38,110 --> 00:01:38,610 All right. 35 00:01:38,610 --> 00:01:41,790 So let's go right into tuples. 36 00:01:41,790 --> 00:01:44,130 So if you recall strings, strings 37 00:01:44,130 --> 00:01:47,710 were sequences of characters. 38 00:01:47,710 --> 00:01:50,640 Tuples are going to be similar to strings in that they're 39 00:01:50,640 --> 00:01:52,915 going to be sequences of something, 40 00:01:52,915 --> 00:01:55,350 except that tuples aren't just sequences of characters, 41 00:01:55,350 --> 00:01:57,540 they can be sequences of anything. 42 00:01:57,540 --> 00:01:59,310 They're a collection of data where 43 00:01:59,310 --> 00:02:01,070 that data can be of any type. 44 00:02:05,310 --> 00:02:08,680 So a tuple can contain elements that are integers, 45 00:02:08,680 --> 00:02:10,470 floats, strings, and so on. 46 00:02:13,110 --> 00:02:14,640 Tuples are immutable. 47 00:02:14,640 --> 00:02:17,280 And if you recall, we talked about this word a little bit 48 00:02:17,280 --> 00:02:19,020 when we talked about strings. 49 00:02:19,020 --> 00:02:21,330 So that means once you create a tuple object, 50 00:02:21,330 --> 00:02:22,720 you can't modify it. 51 00:02:22,720 --> 00:02:24,960 So when you created a string object, 52 00:02:24,960 --> 00:02:27,690 you were not allowed to modify it. 53 00:02:30,280 --> 00:02:33,520 So the way we create tuples are with these open and close 54 00:02:33,520 --> 00:02:36,270 parentheses. 
55 00:02:36,270 --> 00:02:38,180 This shouldn't be confused with a function, 56 00:02:38,180 --> 00:02:41,330 because there's nothing-- there's no-- 57 00:02:41,330 --> 00:02:42,740 this isn't a function call. 58 00:02:42,740 --> 00:02:44,922 It's just how you represent a tuple. 59 00:02:44,922 --> 00:02:46,880 For a function call, you'd have something right 60 00:02:46,880 --> 00:02:48,950 before the parentheses. 61 00:02:48,950 --> 00:02:53,010 This is just how we chose to represent a tuple. 62 00:02:53,010 --> 00:02:54,920 And just a plain open and close parentheses 63 00:02:54,920 --> 00:02:56,390 represents an empty tuple. 64 00:02:56,390 --> 00:02:57,350 So it's of length 0. 65 00:02:57,350 --> 00:03:00,500 There's nothing in it. 66 00:03:00,500 --> 00:03:02,780 You can create a tuple that contains 67 00:03:02,780 --> 00:03:07,110 some elements by separating each element with a comma. 68 00:03:07,110 --> 00:03:11,100 So in this case, this is a tuple that 69 00:03:11,100 --> 00:03:14,680 can be accessed with a variable t that contains three elements. 70 00:03:14,680 --> 00:03:17,700 The first is an integer, the second is a string, 71 00:03:17,700 --> 00:03:20,370 and the third is another integer. 72 00:03:20,370 --> 00:03:23,910 Much like strings, we can index into tuples to find out 73 00:03:23,910 --> 00:03:27,330 values at particular indices. 74 00:03:27,330 --> 00:03:30,480 So you read this as t at position 0. 75 00:03:30,480 --> 00:03:34,800 So the tuple represented by a variable t at position 0 76 00:03:34,800 --> 00:03:37,140 will evaluate to 2, because again, we 77 00:03:37,140 --> 00:03:39,900 start counting from 0 in computer science. 78 00:03:39,900 --> 00:03:43,660 So that brings us-- gives us the first element. 79 00:03:43,660 --> 00:03:46,060 Just like strings, we can concatenate tuples together. 80 00:03:46,060 --> 00:03:48,410 That just means add them together. 
81 00:03:48,410 --> 00:03:50,260 So if we add these two tuples together, 82 00:03:50,260 --> 00:03:52,390 we just get back one larger tuple 83 00:03:52,390 --> 00:03:57,640 that's just those two-- the elements of those tuples 84 00:03:57,640 --> 00:04:00,890 just put together in one larger tuple. 85 00:04:00,890 --> 00:04:04,330 Again, much like strings, we can slice into tuples. 86 00:04:04,330 --> 00:04:09,560 So t sliced from index 1 until index 2. 87 00:04:09,560 --> 00:04:14,340 Remember, we go until this stop minus 1. 88 00:04:14,340 --> 00:04:19,610 So this only gives us one element inside the tuple. 89 00:04:19,610 --> 00:04:21,430 And this is not a mistake. 90 00:04:21,430 --> 00:04:26,260 This extra comma here actually represents a tuple object. 91 00:04:26,260 --> 00:04:28,480 If I didn't have this comma here, 92 00:04:28,480 --> 00:04:31,800 then this would just be a string. 93 00:04:31,800 --> 00:04:34,237 The parentheses would just-- wouldn't really 94 00:04:34,237 --> 00:04:35,070 make any difference. 95 00:04:35,070 --> 00:04:38,370 But the comma here makes it clear to Python 96 00:04:38,370 --> 00:04:42,660 that this is a tuple with only one element in it. 97 00:04:42,660 --> 00:04:47,140 We can slice even further to get a tuple with two elements. 98 00:04:47,140 --> 00:04:48,940 And we can do the usual operations 99 00:04:48,940 --> 00:04:51,610 like get the length of a tuple, which says, how many elements 100 00:04:51,610 --> 00:04:53,200 are in my tuple? 101 00:04:53,200 --> 00:04:57,370 And len of this t would evaluate to 3, 102 00:04:57,370 --> 00:04:59,680 because there are three elements inside the tuple. 103 00:04:59,680 --> 00:05:03,600 Each element, again, separated by the comma. 
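The tuple operations described so far can be sketched as follows. This is a minimal illustration, not the slide code: the string "mit" and the second tuple (5, 6) are assumed values chosen so the results match what is described (an integer, a string, and another integer, with t at position 0 being 2).

```python
te = ()                              # empty tuple, length 0
t = (2, "mit", 3)                    # "mit" is an assumed placeholder string

first = t[0]                         # indexing starts at 0 -> 2
combined = (2, "mit", 3) + (5, 6)    # concatenation builds one larger tuple
one = t[1:2]                         # stops at index 2 - 1 -> ("mit",)
two = t[1:3]                         # -> ("mit", 3)
size = len(t)                        # -> 3

# The trailing comma is what makes a one-element tuple;
# ("mit") without the comma is just the string "mit".
not_a_tuple = ("mit")
singleton = ("mit",)

print(first, combined, one, two, size)
```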
104 00:05:03,600 --> 00:05:05,530 And just like strings, if we try to change 105 00:05:05,530 --> 00:05:07,930 a value inside the tuple-- in this case, 106 00:05:07,930 --> 00:05:15,970 I wanted to try to change the value of the second element 107 00:05:15,970 --> 00:05:18,760 to 4-- Python doesn't allow that, 108 00:05:18,760 --> 00:05:20,482 because tuples are immutable. 109 00:05:23,442 --> 00:05:24,900 So why would we want to use tuples? 110 00:05:24,900 --> 00:05:30,360 Tuples are actually useful in a couple of different scenarios. 111 00:05:30,360 --> 00:05:33,780 So recall a few lectures ago, we looked at this code 112 00:05:33,780 --> 00:05:36,852 where we tried to swap the values of variables x and y. 113 00:05:36,852 --> 00:05:38,310 And this first code actually didn't 114 00:05:38,310 --> 00:05:42,180 work, because you're overwriting the value for x. 115 00:05:42,180 --> 00:05:44,370 So instead, what we ended up doing 116 00:05:44,370 --> 00:05:46,290 was creating this temporary variable 117 00:05:46,290 --> 00:05:49,320 where we stored the value of x, and then we overwrote it, 118 00:05:49,320 --> 00:05:51,080 and then we used the temporary variable. 119 00:05:51,080 --> 00:05:53,400 Well, turns out this three liner code right 120 00:05:53,400 --> 00:05:56,900 here can actually be written in one line using tuples. 121 00:05:56,900 --> 00:06:00,460 So you say x comma y is equal to y comma x. 122 00:06:00,460 --> 00:06:04,630 And Python goes in and says, what's the value of y? 123 00:06:04,630 --> 00:06:06,670 And assigns it to x. 124 00:06:06,670 --> 00:06:08,380 And then what's the value of x? 125 00:06:08,380 --> 00:06:09,500 And assigns it to y. 126 00:06:15,140 --> 00:06:16,700 Extending on that, we can actually 127 00:06:16,700 --> 00:06:22,410 use tuples to return more than one value from a function. 128 00:06:22,410 --> 00:06:27,242 So functions, you're only allowed to return one object.
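A small sketch of the two points above, immutability and the one-line swap. The starting values of t, x, and y are assumptions for illustration only.

```python
t = (2, "mit", 3)          # assumed example tuple

raised = False
try:
    t[1] = 4               # tuples do not support item assignment
except TypeError:
    raised = True          # Python refuses to modify the tuple

x, y = 1, 2                # assumed starting values

# Three-line swap with a temporary variable.
temp = x
x = y
y = temp
after_temp_swap = (x, y)   # (2, 1)

# One-line swap: the right-hand side (y, x) is evaluated first,
# then its two values are assigned to x and y.
x, y = y, x
after_tuple_swap = (x, y)  # (1, 2)

print(raised, after_temp_swap, after_tuple_swap)
```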
129 00:06:27,242 --> 00:06:29,450 So you're not allowed to return more than one object. 130 00:06:29,450 --> 00:06:32,740 However, if we use a tuple object, 131 00:06:32,740 --> 00:06:34,840 and if that's the thing that we return, 132 00:06:34,840 --> 00:06:39,550 we can actually get around this sort of rule 133 00:06:39,550 --> 00:06:42,160 by putting in as many values as we 134 00:06:42,160 --> 00:06:44,090 want inside the tuple object. 135 00:06:44,090 --> 00:06:47,800 And then we can return as many values as we'd like. 136 00:06:47,800 --> 00:06:49,540 So in this specific example, I'm trying 137 00:06:49,540 --> 00:06:51,520 to calculate the quotient and remainder 138 00:06:51,520 --> 00:06:53,980 when we divide x by a y. 139 00:06:53,980 --> 00:06:56,320 So this is a function definition here. 140 00:06:56,320 --> 00:07:01,670 And down here I'm calling the function with 4 and 5. 141 00:07:01,670 --> 00:07:04,000 So when I make the function call, 142 00:07:04,000 --> 00:07:10,480 4 gets assigned to x and 5 gets assigned to y. 143 00:07:10,480 --> 00:07:16,880 So then q is going to be the integer division when 144 00:07:16,880 --> 00:07:19,970 x is divided by y. 145 00:07:19,970 --> 00:07:21,560 And this double slash just means-- 146 00:07:21,560 --> 00:07:24,260 it's like casting the result to an integer. 147 00:07:24,260 --> 00:07:27,170 It says divide it, keep the whole number part, 148 00:07:27,170 --> 00:07:32,430 and just delete everything else beyond the decimal point. 149 00:07:32,430 --> 00:07:33,910 So when you divide 4 by 5, this q 150 00:07:33,910 --> 00:07:36,530 is actually going to be 0, because it's 0 point something, 151 00:07:36,530 --> 00:07:39,690 and I don't care about the point something. 152 00:07:39,690 --> 00:07:43,290 And then the remainder is just using the percent operator. 153 00:07:43,290 --> 00:07:47,840 So I divide 4 by 5, the remainder is going to be 4. 
154 00:07:47,840 --> 00:07:49,870 And notice that I'm going to be returning 155 00:07:49,870 --> 00:07:52,720 q and r, which are these two values that I calculated 156 00:07:52,720 --> 00:07:54,250 inside my function. 157 00:07:54,250 --> 00:07:58,560 And I'm returning them in the context of this tuple. 158 00:07:58,560 --> 00:08:00,760 So I'm only returning one object, which is a tuple. 159 00:08:00,760 --> 00:08:03,520 It just so happens that I'm populating that object 160 00:08:03,520 --> 00:08:05,470 with a few different values. 161 00:08:08,400 --> 00:08:10,890 So when the function returns here, 162 00:08:10,890 --> 00:08:13,600 this is going to say 0 comma 4. 163 00:08:13,600 --> 00:08:16,710 That's the tuple it's going to return. 164 00:08:16,710 --> 00:08:19,590 Q is going to be 0 and r is going to be 4. 165 00:08:19,590 --> 00:08:22,440 So then this line here-- quot, rem 166 00:08:22,440 --> 00:08:26,130 equals 0, 4-- is basically this-- 167 00:08:26,130 --> 00:08:28,050 it's sort of like what we did up here. 168 00:08:28,050 --> 00:08:30,330 So it assigns quot to 4-- sorry. 169 00:08:30,330 --> 00:08:31,840 Quot to 0 and rem to 4. 170 00:08:34,409 --> 00:08:35,610 So we can use tuples. 171 00:08:35,610 --> 00:08:37,059 This is very useful. 172 00:08:37,059 --> 00:08:42,260 We can use them to return more than one value from a function. 173 00:08:42,260 --> 00:08:43,929 So tuples are great. 174 00:08:43,929 --> 00:08:45,720 Might seem a little bit confusing at first, 175 00:08:45,720 --> 00:08:47,220 but they're actually pretty useful, 176 00:08:47,220 --> 00:08:50,130 because they hold collections of data. 177 00:08:50,130 --> 00:08:54,010 So here, I wrote a function which 178 00:08:54,010 --> 00:08:57,112 I can apply to any set of data. 179 00:08:57,112 --> 00:08:58,820 And I'll explain what this function does, 180 00:08:58,820 --> 00:09:00,403 and then we can apply it to some data. 
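The quotient-and-remainder example described above might look like the sketch below. The function name is an assumption (the transcript does not name it); the variable names q, r, quot, and rem and the call with 4 and 5 follow the description.

```python
def quotient_and_remainder(x, y):
    """Return the integer quotient and the remainder of x divided by y."""
    q = x // y        # integer division: keep only the whole-number part
    r = x % y         # remainder
    return (q, r)     # one object (a tuple) carrying two values

# Unpack the returned tuple into two variables.
(quot, rem) = quotient_and_remainder(4, 5)
print(quot, rem)      # 0 4
```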
181 00:09:00,403 --> 00:09:02,650 And you can see that you can extract some very basic 182 00:09:02,650 --> 00:09:05,140 information from whatever set of data that you 183 00:09:05,140 --> 00:09:07,130 happen to collect. 184 00:09:07,130 --> 00:09:08,830 So here's a function called get_data, 185 00:09:08,830 --> 00:09:11,550 and it does all of this stuff in here. 186 00:09:11,550 --> 00:09:13,920 And in the actual code associated with the lecture, 187 00:09:13,920 --> 00:09:17,640 I actually said what the condition on a tuple was. 188 00:09:17,640 --> 00:09:20,550 So it has to be a tuple of a certain-- 189 00:09:20,550 --> 00:09:23,480 that looks a certain way. 190 00:09:23,480 --> 00:09:25,430 And this is the way it has to look. 191 00:09:25,430 --> 00:09:28,580 So it's one tuple. 192 00:09:28,580 --> 00:09:32,540 The outer parentheses out here represent the fact 193 00:09:32,540 --> 00:09:34,830 that it's a tuple. 194 00:09:34,830 --> 00:09:39,450 And the elements of this tuple are actually other tuples. 195 00:09:39,450 --> 00:09:41,140 So the first element is a tuple object, 196 00:09:41,140 --> 00:09:42,450 the second element is a tuple object, 197 00:09:42,450 --> 00:09:44,370 and third one is a tuple object, and so on. 198 00:09:47,150 --> 00:09:50,710 And each one of these inner tuple objects 199 00:09:50,710 --> 00:09:52,610 are actually going to contain two elements, 200 00:09:52,610 --> 00:09:55,720 the first being an integer and the second being a string. 201 00:09:55,720 --> 00:09:57,430 So that's sort of the precondition 202 00:09:57,430 --> 00:10:01,000 that this function assumes on a tuple 203 00:10:01,000 --> 00:10:06,580 before it can-- before it actually can work. 204 00:10:06,580 --> 00:10:07,390 All right. 205 00:10:07,390 --> 00:10:10,360 So given a tuple that looks like that, 206 00:10:10,360 --> 00:10:11,830 what's the function going to do? 207 00:10:14,520 --> 00:10:17,250 It's first creating two empty tuples.
208 00:10:17,250 --> 00:10:20,220 One is called nums and one is called words. 209 00:10:20,220 --> 00:10:21,700 And then there's a for loop. 210 00:10:21,700 --> 00:10:24,600 And notice here the for loop is going to iterate over 211 00:10:24,600 --> 00:10:27,660 every element inside the tuple. 212 00:10:27,660 --> 00:10:30,240 Remember in strings when we were able to use for loops that 213 00:10:30,240 --> 00:10:32,460 iterated over the characters directly as opposed 214 00:10:32,460 --> 00:10:33,710 to over the indices? 215 00:10:33,710 --> 00:10:36,540 Well, we're doing the same sort of thing here. 216 00:10:36,540 --> 00:10:38,140 Instead of iterating over the indices, 217 00:10:38,140 --> 00:10:42,900 we're going to iterate over the tuple object at each position. 218 00:10:42,900 --> 00:10:46,260 So first time through the loop, t here 219 00:10:46,260 --> 00:10:49,212 is going to be this first tuple. 220 00:10:49,212 --> 00:10:50,670 The second time through the loop, t 221 00:10:50,670 --> 00:10:51,770 is going to be this tuple. 222 00:10:51,770 --> 00:10:54,375 And the third time, it's going to be this exact tuple object. 223 00:10:57,810 --> 00:11:00,120 So each time through the loop, what I'm doing 224 00:11:00,120 --> 00:11:03,390 is I'm going to have this nums tuple that I'm 225 00:11:03,390 --> 00:11:04,534 going to keep adding to. 226 00:11:04,534 --> 00:11:06,450 And each time I'm going to create a new object 227 00:11:06,450 --> 00:11:09,802 and reassign it to this variable nums. 228 00:11:09,802 --> 00:11:11,260 And each time through the loop, I'm 229 00:11:11,260 --> 00:11:14,500 looking at what the previous value of nums was. 230 00:11:14,500 --> 00:11:17,650 So what was my previous tuple? 231 00:11:17,650 --> 00:11:20,800 And I'm going to add it with this singleton tuple. 232 00:11:20,800 --> 00:11:23,325 So it's a tuple of one character or one element. 233 00:11:26,290 --> 00:11:28,500 This element being t at position zero. 
234 00:11:28,500 --> 00:11:31,397 So you have to sort of wrap your mind around 235 00:11:31,397 --> 00:11:32,230 how this is working. 236 00:11:32,230 --> 00:11:37,390 So if t is going to be this tuple element right here, 237 00:11:37,390 --> 00:11:41,380 then t at position zero is going to be this blue bar here. 238 00:11:41,380 --> 00:11:45,340 So it represents the integer portion of the tuple. 239 00:11:45,340 --> 00:11:47,500 So as we're going through the loop, 240 00:11:47,500 --> 00:11:49,960 this nums is going to get populated with all 241 00:11:49,960 --> 00:11:52,960 of the integers from every one of my tuple-- 242 00:11:52,960 --> 00:11:54,010 inside tuple objects. 243 00:11:56,619 --> 00:11:58,660 So that's basically what this line here is doing. 244 00:12:01,870 --> 00:12:07,420 At the same time, I'm also populating this words tuple. 245 00:12:07,420 --> 00:12:09,560 And the words tuple is a little bit different, 246 00:12:09,560 --> 00:12:12,320 because I'm not adding every single one of these string 247 00:12:12,320 --> 00:12:13,740 objects. 248 00:12:13,740 --> 00:12:17,540 So t at position one being the string part of the inner tuple. 249 00:12:17,540 --> 00:12:21,830 I'm actually adding the string part only if it's not already 250 00:12:21,830 --> 00:12:25,020 in my words tuple. 251 00:12:25,020 --> 00:12:26,850 So here, I'm essentially grabbing 252 00:12:26,850 --> 00:12:29,420 all of the unique strings from my list. 253 00:12:32,520 --> 00:12:35,820 These last sort of three lines-- three, four lines here 254 00:12:35,820 --> 00:12:38,640 just do a little bit of arithmetic on it saying, 255 00:12:38,640 --> 00:12:41,100 OK, now I have all of the numbers 256 00:12:41,100 --> 00:12:43,500 here, what's the minimum out of all of these, 257 00:12:43,500 --> 00:12:46,350 and then what's the maximum out of all these?
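A sketch of get_data as it is described here, reconstructed from the spoken walkthrough rather than copied from the lecture code. The parameter name a_tuple, the local names min_n and max_n, and the test data are assumptions; nums, words, t, and the count of unique words follow the description.

```python
def get_data(a_tuple):
    """a_tuple is assumed to be a tuple of (int, str) tuples.

    Returns (smallest int, largest int, number of unique strings).
    """
    nums = ()                            # will collect every integer
    words = ()                           # will collect each string once
    for t in a_tuple:                    # t is one inner (int, str) tuple
        nums = nums + (t[0],)            # new tuple = old tuple + singleton
        if t[1] not in words:            # only add strings not seen before
            words = words + (t[1],)
    min_n = min(nums)
    max_n = max(nums)
    unique_words = len(words)
    return (min_n, max_n, unique_words)

# Assumed test data, not the data used in the lecture.
test = ((1, "a"), (2, "b"), (1, "a"), (7, "b"))
(a, b, c) = get_data(test)
print("a:", a, "b:", b, "c:", c)         # a: 1 b: 7 c: 2
```

Because tuples are immutable, each pass through the loop builds a new tuple and rebinds the name; nothing is modified in place.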
258 00:12:46,350 --> 00:12:49,050 And then this unique words variable 259 00:12:49,050 --> 00:12:54,460 tells me how many unique words do I have in my original tuple. 260 00:12:54,460 --> 00:12:59,720 So this feels sort of generic, so let's run it on some data. 261 00:12:59,720 --> 00:13:05,000 So here I have it-- I tested it on some test data. 262 00:13:05,000 --> 00:13:08,670 And then I got some actual data. 263 00:13:08,670 --> 00:13:10,860 And this actual data that I wanted to analyze 264 00:13:10,860 --> 00:13:14,280 was Taylor Swift data. 265 00:13:14,280 --> 00:13:18,260 And representing the integer portion 266 00:13:18,260 --> 00:13:21,530 of the tuple representing a year and the string portion 267 00:13:21,530 --> 00:13:25,880 of the tuple representing the person who she 268 00:13:25,880 --> 00:13:27,210 wrote a song about that year. 269 00:13:30,440 --> 00:13:35,180 So some real world data that we're working with here. 270 00:13:35,180 --> 00:13:37,590 Very important that we know this information. 271 00:13:37,590 --> 00:13:38,660 OK. 272 00:13:38,660 --> 00:13:41,450 So with this data, I can run it-- 273 00:13:41,450 --> 00:13:46,229 I can plug it into this function that I wrote up here. 274 00:13:46,229 --> 00:13:48,020 And I'm going to actually comment this out, 275 00:13:48,020 --> 00:13:51,960 so it doesn't get cluttered. 276 00:13:51,960 --> 00:13:58,660 And if I run it-- this is the part 277 00:13:58,660 --> 00:14:00,820 where I'm calling my function. 278 00:14:00,820 --> 00:14:03,070 I'm calling it with this data here. 279 00:14:03,070 --> 00:14:07,420 tswift being this tuple of tuples. 280 00:14:07,420 --> 00:14:09,625 And what I get back is-- up here, 281 00:14:09,625 --> 00:14:16,060 line 38-- is the return from the function being a large tuple. 282 00:14:16,060 --> 00:14:18,070 And that large tuple, I'm then assigning it 283 00:14:18,070 --> 00:14:20,622 to my own tuple in my program. 
284 00:14:20,622 --> 00:14:23,080 And then I'm just writing out-- printing out some statement 285 00:14:23,080 --> 00:14:25,210 here. 286 00:14:25,210 --> 00:14:27,360 So I'm getting the minimum year, the maximum year, 287 00:14:27,360 --> 00:14:28,950 and then the number of people. 288 00:14:28,950 --> 00:14:31,620 So I can show you that it works if I replace 289 00:14:31,620 --> 00:14:33,450 one of these names with another one 290 00:14:33,450 --> 00:14:35,272 that I already have in here. 291 00:14:35,272 --> 00:14:37,230 So instead of writing a song about five people, 292 00:14:37,230 --> 00:14:39,146 she would have written a song about four people. 293 00:14:39,146 --> 00:14:40,160 Yay, it worked. 294 00:14:48,180 --> 00:14:50,920 So that's tuples. 295 00:14:50,920 --> 00:14:54,894 And remember or recall-- keep in mind, tuples were immutable. 296 00:14:54,894 --> 00:14:57,060 Now we're going to look at a very, very similar data 297 00:14:57,060 --> 00:14:59,755 structure to tuples called lists, 298 00:14:59,755 --> 00:15:02,890 except that instead of being immutable, 299 00:15:02,890 --> 00:15:07,010 lists are going to be mutable objects. 300 00:15:07,010 --> 00:15:09,170 So much like tuples, they're going 301 00:15:09,170 --> 00:15:14,110 to contain elements of any type or objects of any type. 302 00:15:14,110 --> 00:15:15,870 You denote them with-- you denote a list 303 00:15:15,870 --> 00:15:20,590 with square brackets instead of parentheses. 304 00:15:20,590 --> 00:15:22,180 And the difference being that they're 305 00:15:22,180 --> 00:15:26,800 going to be mutable objects instead of immutable. 306 00:15:26,800 --> 00:15:29,230 So creating an empty list, you just 307 00:15:29,230 --> 00:15:31,600 do open close square brackets. 308 00:15:31,600 --> 00:15:36,420 You can have a list of elements of different types, 309 00:15:36,420 --> 00:15:42,160 even a list of lists. 310 00:15:42,160 --> 00:15:43,660 So one of the elements being a list.
311 00:15:46,680 --> 00:15:49,170 As usual, you can apply length on a list, 312 00:15:49,170 --> 00:15:51,339 and that tells you how many elements are in it. 313 00:15:51,339 --> 00:15:53,130 This is going to tell you how many elements 314 00:15:53,130 --> 00:15:55,380 are in your list l. 315 00:15:55,380 --> 00:15:57,542 So it's not going to look any further than that. 316 00:15:57,542 --> 00:16:00,000 So it's going to say, this is an integer, this is a string, 317 00:16:00,000 --> 00:16:02,410 this is an integer, this is a list. 318 00:16:02,410 --> 00:16:05,160 It's not going to say how many elements are in this list. 319 00:16:05,160 --> 00:16:11,710 It's just going to look at the outer-- the shell of elements. 320 00:16:11,710 --> 00:16:14,090 Indexing and slicing works the same way. 321 00:16:14,090 --> 00:16:17,950 So l at position 0 gives you the value 2. 322 00:16:17,950 --> 00:16:20,200 You can index into a list, and then do something 323 00:16:20,200 --> 00:16:21,760 with the value that you get back. 324 00:16:21,760 --> 00:16:25,090 So l at position 2 says-- that's this value there 325 00:16:25,090 --> 00:16:28,250 and add one to it. 326 00:16:28,250 --> 00:16:32,271 L at position 3, that's going to be this list here. 327 00:16:32,271 --> 00:16:33,770 Notice it evaluates to another list. 328 00:16:37,042 --> 00:16:38,500 You're not allowed to index outside 329 00:16:38,500 --> 00:16:39,420 of the length of the list. 330 00:16:39,420 --> 00:16:41,290 So that's going to give you an error, because we only 331 00:16:41,290 --> 00:16:42,090 have four elements. 332 00:16:45,150 --> 00:16:49,750 And you can also have expressions for your index. 333 00:16:49,750 --> 00:16:53,550 So this-- Python just replaces i with 2 here and says, 334 00:16:53,550 --> 00:16:55,560 what's l at position 1? 335 00:16:55,560 --> 00:16:57,430 And then grabs that from in there. 336 00:16:57,430 --> 00:16:57,930 OK. 
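The list creation, length, and indexing rules above can be sketched like this. Only the first element (2), the element types (integer, string, integer, list), and i being 2 come from the description; the other values are assumed.

```python
a_list = []                    # empty list
L = [2, "a", 4, [1, 2]]        # int, str, int, list ("a", 4, [1, 2] are assumed)

n = len(L)                     # counts only the outer elements -> 4
first = L[0]                   # -> 2
plus_one = L[2] + 1            # index, then use the value -> 5
inner = L[3]                   # evaluates to another list -> [1, 2]

i = 2
by_expr = L[i - 1]             # index can be an expression: L[1] -> "a"

out_of_range = False
try:
    L[4]                       # only indices 0..3 exist
except IndexError:
    out_of_range = True

print(n, first, plus_one, inner, by_expr, out_of_range)
```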
337 00:16:57,930 --> 00:17:00,390 So very, very similar to the kinds of operations 338 00:17:00,390 --> 00:17:03,462 we've seen on strings and tuples. 339 00:17:03,462 --> 00:17:04,920 The one difference, and that's what 340 00:17:04,920 --> 00:17:07,045 we're going to focus on for the rest of this class, 341 00:17:07,045 --> 00:17:11,619 is that lists are mutable objects. 342 00:17:11,619 --> 00:17:13,089 So what does that mean internally? 343 00:17:13,089 --> 00:17:15,950 Internally, that means let's say we have a list l, 344 00:17:15,950 --> 00:17:17,619 and we assign it-- sorry. 345 00:17:17,619 --> 00:17:21,880 Let's say we have a variable l that's going to point to a list 346 00:17:21,880 --> 00:17:24,800 with three elements, 2, 1, and 3. 347 00:17:24,800 --> 00:17:25,300 OK. 348 00:17:25,300 --> 00:17:29,770 They're all-- each element is an integer. 349 00:17:29,770 --> 00:17:33,610 When we were dealing with tuples or with strings, 350 00:17:33,610 --> 00:17:36,880 if we re-assign-- if we try to do this line right here, 351 00:17:36,880 --> 00:17:38,110 we'd have had an error. 352 00:17:38,110 --> 00:17:41,350 But this is actually allowed with lists. 353 00:17:41,350 --> 00:17:44,070 So when you execute that line, Python 354 00:17:44,070 --> 00:17:46,720 is going to look at that middle element, 355 00:17:46,720 --> 00:17:50,846 and it's going to change its value from a 1 to a 5. 356 00:17:50,846 --> 00:17:53,220 And that's just due to the mutability nature of the list. 357 00:17:56,990 --> 00:18:02,120 So notice that this list variable, this variable l, 358 00:18:02,120 --> 00:18:03,740 which originally pointed to this list, 359 00:18:03,740 --> 00:18:05,630 points to the exact same list. 360 00:18:05,630 --> 00:18:08,750 We haven't created a new object in memory. 361 00:18:08,750 --> 00:18:11,900 We're just modifying the same object in memory.
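A short sketch of the in-place change just described. Using id() to check object identity is an addition for illustration, not something shown in the lecture.

```python
L = [2, 1, 3]
before = id(L)      # identity of the list object L points to

L[1] = 5            # allowed for lists: changes the middle element in place

after = id(L)       # same object, only its contents changed
same_object = (before == after)

print(L, same_object)   # [2, 5, 3] True
```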
362 00:18:11,900 --> 00:18:14,000 And you're going to see why this is important 363 00:18:14,000 --> 00:18:15,710 as we look at a few side effects that 364 00:18:15,710 --> 00:18:17,390 can happen when you have this. 365 00:18:22,240 --> 00:18:26,500 So I've said this a couple of times before, 366 00:18:26,500 --> 00:18:32,000 but it'll make your life a lot easier 367 00:18:32,000 --> 00:18:35,090 if you try to think of-- when you want 368 00:18:35,090 --> 00:18:38,300 to iterate through a list if you try to think about iterating 369 00:18:38,300 --> 00:18:40,640 through the elements directly. 370 00:18:40,640 --> 00:18:41,690 It's a lot more Pythonic. 371 00:18:41,690 --> 00:18:43,680 I've used that word before. 372 00:18:43,680 --> 00:18:45,290 So this is sort of a common pattern 373 00:18:45,290 --> 00:18:48,170 that you're going to see where you're iterating over the list 374 00:18:48,170 --> 00:18:49,460 elements directly. 375 00:18:49,460 --> 00:18:50,570 We've done it over tuples. 376 00:18:50,570 --> 00:18:51,695 We've done it over strings. 377 00:18:54,260 --> 00:18:55,670 So these are identical codes. 378 00:18:55,670 --> 00:18:58,280 They do the exact same thing, except on the left, 379 00:18:58,280 --> 00:19:05,010 you're going from-- you're going through 0, 1, 2, 3, and so on. 380 00:19:05,010 --> 00:19:12,150 And then you're indexing into each one of these numbers 381 00:19:12,150 --> 00:19:14,620 to get the element value. 382 00:19:14,620 --> 00:19:18,480 Whereas on the right, this loop variable i 383 00:19:18,480 --> 00:19:22,030 is going to have the element value itself. 384 00:19:22,030 --> 00:19:24,315 So this code on the right is a lot cleaner. 385 00:19:28,320 --> 00:19:28,820 OK. 386 00:19:28,820 --> 00:19:31,460 So now let's look at some operations 387 00:19:31,460 --> 00:19:32,914 that we can do on lists. 
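The two loop styles being compared might look like the following. Summing the elements is an assumed loop body for illustration (the transcript only says the two versions do the exact same thing), and the list values are assumed.

```python
L = [2, 5, 3]                    # assumed example list

# Version 1: iterate over indices 0, 1, 2, ... and index into the list.
total_by_index = 0
for i in range(len(L)):
    total_by_index += L[i]

# Version 2: iterate over the elements directly (the cleaner, more Pythonic form).
total_direct = 0
for i in L:                      # here i is the element value itself
    total_direct += i

print(total_by_index, total_direct)   # 10 10
```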
388 00:19:32,914 --> 00:19:34,580 So there's a lot more operations that we 389 00:19:34,580 --> 00:19:37,970 can do on lists, because of their mutability aspect 390 00:19:37,970 --> 00:19:40,880 than we can do on tuples or strings, for example. 391 00:19:40,880 --> 00:19:42,392 So here's a few of them. 392 00:19:42,392 --> 00:19:43,850 And they're going to take advantage 393 00:19:43,850 --> 00:19:45,740 of this mutability concept. 394 00:19:45,740 --> 00:19:48,530 So we can add elements directly to the end 395 00:19:48,530 --> 00:19:53,690 of the list using this funky looking notation L.append. 396 00:19:53,690 --> 00:19:57,140 And then the element we want to add to the end. 397 00:19:57,140 --> 00:20:00,120 And this operation mutates the list. 398 00:20:00,120 --> 00:20:02,000 So if I have L is equal to 2, 1, 3, and I 399 00:20:02,000 --> 00:20:04,890 append the element 5 to the end, then 400 00:20:04,890 --> 00:20:08,930 L-- the same L is going to point to the same object, 401 00:20:08,930 --> 00:20:11,167 except it's going to have an extra number at the end. 402 00:20:11,167 --> 00:20:11,666 5. 403 00:20:14,840 --> 00:20:16,540 But now what's this dot? 404 00:20:16,540 --> 00:20:19,440 We haven't really seen this before. 405 00:20:19,440 --> 00:20:22,920 And it's going to become apparent 406 00:20:22,920 --> 00:20:25,996 what it means in a few lectures from now. 407 00:20:25,996 --> 00:20:27,370 But for the moment, you can think 408 00:20:27,370 --> 00:20:32,020 of this dot as an operation. 409 00:20:32,020 --> 00:20:35,800 It's like applying a function, except that the function 410 00:20:35,800 --> 00:20:37,870 that you're applying can only work 411 00:20:37,870 --> 00:20:40,570 on certain types of objects. 412 00:20:40,570 --> 00:20:43,690 So in this case, append, for example, 413 00:20:43,690 --> 00:20:48,070 is the function we're trying to apply. 
414 00:20:48,070 --> 00:20:50,230 And we want to apply it to whatever 415 00:20:50,230 --> 00:20:54,040 is before the dot, which is the object. 416 00:20:54,040 --> 00:20:57,980 And append has only been defined to work with a list object, 417 00:20:57,980 --> 00:21:00,620 for example, which is why we're using the dot in this case. 418 00:21:03,480 --> 00:21:05,570 We wouldn't be able to use append on an integer, 419 00:21:05,570 --> 00:21:08,130 for example, because that sort of function 420 00:21:08,130 --> 00:21:10,480 is not defined on the integer. 421 00:21:10,480 --> 00:21:16,270 So for now, you'll sort of have to remember-- 422 00:21:16,270 --> 00:21:19,570 which are functions that work with a dot 423 00:21:19,570 --> 00:21:22,420 and which are functions like len that 424 00:21:22,420 --> 00:21:24,130 aren't with a dot. 425 00:21:24,130 --> 00:21:28,274 But in a couple of lectures, I promise it'll be a lot clearer. 426 00:21:28,274 --> 00:21:29,940 So for now, just think of it as whatever 427 00:21:29,940 --> 00:21:34,220 is before the dot is the object you're applying a function to, 428 00:21:34,220 --> 00:21:36,490 and whatever is after the dot is the function you're 429 00:21:36,490 --> 00:21:37,735 applying on the object. 430 00:21:42,540 --> 00:21:45,090 So we can add things to the end of our list. 431 00:21:45,090 --> 00:21:46,880 We can also combine lists together using 432 00:21:46,880 --> 00:21:49,190 the plus operator. 433 00:21:49,190 --> 00:21:51,950 The plus operator does not mutate the list. 434 00:21:51,950 --> 00:21:53,870 Instead, it gives you a new list that's 435 00:21:53,870 --> 00:21:56,480 the sum of those two lists combined. 436 00:21:56,480 --> 00:22:00,350 So in this case, if L1 is 2,1,3 and L2 is 4, 5, 6, 437 00:22:00,350 --> 00:22:02,400 when we add those two lists together, 438 00:22:02,400 --> 00:22:05,090 that's going to give us an entirely new list leaving 439 00:22:05,090 --> 00:22:07,850 L1 and L2 the same.
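A sketch contrasting append, which mutates the list, with the plus operator, which builds a new one. The list values follow the description; the name L3 for the combined list and the integer n are assumptions.

```python
L = [2, 1, 3]
L.append(5)                 # mutates L in place -> [2, 1, 3, 5]

L1 = [2, 1, 3]
L2 = [4, 5, 6]
L3 = L1 + L2                # new list; L1 and L2 are left unchanged

# append is only defined for lists, so calling it on an integer fails.
n = 3
no_append_on_int = False
try:
    n.append(5)
except AttributeError:
    no_append_on_int = True

print(L, L3, L1, L2, no_append_on_int)
```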
440 00:22:07,850 --> 00:22:16,660 And that's why we have to assign the result of the addition 441 00:22:16,660 --> 00:22:17,265 to a new list. 442 00:22:17,265 --> 00:22:18,515 Otherwise, the result is lost. 443 00:22:22,150 --> 00:22:25,270 If you want to mutate a list directly and make it 444 00:22:25,270 --> 00:22:30,040 longer by the elements within another list, then 445 00:22:30,040 --> 00:22:35,860 you can use this extend function or extend method. 446 00:22:35,860 --> 00:22:38,920 And this is going to mutate L1 directly. 447 00:22:38,920 --> 00:22:44,780 So if L1 was 2,1,3, if you extend it by the list 0,6, 448 00:22:44,780 --> 00:22:47,540 it's just going to tack on 0,6 to L1 directly. 449 00:22:47,540 --> 00:22:49,070 So notice L1 has been mutated. 450 00:22:57,860 --> 00:23:01,120 So that's adding things to lists. 451 00:23:01,120 --> 00:23:03,112 We can also delete things from lists. 452 00:23:03,112 --> 00:23:05,070 We don't just want to keep adding to our lists, 453 00:23:05,070 --> 00:23:07,150 because then they become very, very big. 454 00:23:07,150 --> 00:23:09,720 So let's see how we can delete some items from our list. 455 00:23:09,720 --> 00:23:11,590 There's a few ways. 456 00:23:11,590 --> 00:23:15,600 First one being can use this del function. 457 00:23:15,600 --> 00:23:18,310 And this says delete from the list 458 00:23:18,310 --> 00:23:21,980 L the element at this index. 459 00:23:21,980 --> 00:23:24,004 So you give it the index 0, 1, 2, 460 00:23:24,004 --> 00:23:25,670 or whatever you want to-- whatever index 461 00:23:25,670 --> 00:23:27,857 you want to delete the element at. 462 00:23:27,857 --> 00:23:29,440 If you just want to delete the element 463 00:23:29,440 --> 00:23:32,050 at the end of the list, that's the farthest right, 464 00:23:32,050 --> 00:23:35,150 you do L.pop.
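[Editor's sketch] The contrast between the plus operator and extend, using the values spoken in the lecture. The variable name L3 is an assumption; any name would do for holding the result of the addition.

```python
L1 = [2, 1, 3]
L2 = [4, 5, 6]

L3 = L1 + L2         # builds a new list; L1 and L2 are left alone
print(L3)            # [2, 1, 3, 4, 5, 6]
print(L1, L2)        # [2, 1, 3] [4, 5, 6]

L1.extend([0, 6])    # mutates L1 in place
print(L1)            # [2, 1, 3, 0, 6]
```

If the result of L1 + L2 is not assigned to a name, the new list is simply discarded.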
465 00:23:35,150 --> 00:23:37,400 If you want to remove a specific element-- so you know 466 00:23:37,400 --> 00:23:40,640 there's somewhere in your list there's the number 5, 467 00:23:40,640 --> 00:23:43,310 and you want to delete it from the list-- then you 468 00:23:43,310 --> 00:23:47,450 say L.remove and you say what element you want to remove. 469 00:23:47,450 --> 00:23:50,220 And that only removes the very first occurrence of it. 470 00:23:50,220 --> 00:23:51,840 So if there's two fives in your list, 471 00:23:51,840 --> 00:23:55,610 then it's only going to remove the very first one. 472 00:23:55,610 --> 00:24:01,010 So let's take a look at this sort of sequence of commands. 473 00:24:01,010 --> 00:24:06,327 So we have first L is equal to this long list here. 474 00:24:06,327 --> 00:24:08,410 And I want to mention that all of these operations 475 00:24:08,410 --> 00:24:09,880 are going to mutate our list, which 476 00:24:09,880 --> 00:24:13,630 is why I wrote this comment here that says assume that you're 477 00:24:13,630 --> 00:24:15,004 doing these in order. 478 00:24:15,004 --> 00:24:16,420 So as you're doing these in order, 479 00:24:16,420 --> 00:24:18,260 you're going to be mutating your list. 480 00:24:18,260 --> 00:24:19,180 And if you're mutating your list, 481 00:24:19,180 --> 00:24:20,846 you have to remember that you're working 482 00:24:20,846 --> 00:24:24,256 with this new mutated list. 483 00:24:24,256 --> 00:24:25,630 So the first thing we're doing is 484 00:24:25,630 --> 00:24:29,320 we're removing 2 from our list. 485 00:24:29,320 --> 00:24:31,180 So when you remove 2, this says look 486 00:24:31,180 --> 00:24:36,310 for an element with the value 2 and take it away from the list. 487 00:24:36,310 --> 00:24:39,650 So that's the very first one here. 488 00:24:39,650 --> 00:24:44,350 So the list we're left with is just everything after it.
489 00:24:44,350 --> 00:24:46,000 Then I want to remove 3 from the list 490 00:24:46,000 --> 00:24:47,350 and notice there's two of them. 491 00:24:47,350 --> 00:24:49,760 There's this 3 here and there's this 3 here. 492 00:24:49,760 --> 00:24:52,930 So we're going to remove only the first one, which 493 00:24:52,930 --> 00:24:54,610 is this one here. 494 00:24:54,610 --> 00:24:59,588 So the list we're left with is 1,6,3,7,0. 495 00:24:59,588 --> 00:25:03,680 Then we're going to delete from the list L 496 00:25:03,680 --> 00:25:05,670 the element at position 1. 497 00:25:05,670 --> 00:25:08,060 So starting counting from 0, the element at position 1 498 00:25:08,060 --> 00:25:10,100 is this one here. 499 00:25:10,100 --> 00:25:14,940 So we've removed that, and we're left with 1, 3, 7, 0. 500 00:25:14,940 --> 00:25:16,907 And then when we do L.pop, that's 501 00:25:16,907 --> 00:25:18,990 going to delete the element furthest to the right. 502 00:25:18,990 --> 00:25:20,870 So at the end of the list, which is that 0. 503 00:25:25,450 --> 00:25:30,280 So then we're left with only 1, 3, and 7. 504 00:25:30,280 --> 00:25:33,290 And L.pop is often useful, because it tells you 505 00:25:33,290 --> 00:25:37,970 the return value from L.pop is going to be the value that it 506 00:25:37,970 --> 00:25:38,540 removed. 507 00:25:38,540 --> 00:25:40,260 So in this case, it's going to return 0. 508 00:25:43,740 --> 00:25:46,950 I want to mention, though, that some of the-- so these 509 00:25:46,950 --> 00:25:49,470 functions all mutate the list. 510 00:25:49,470 --> 00:25:52,500 You have to be careful with return values. 511 00:25:52,500 --> 00:25:55,050 So these are all-- you can think of all of these 512 00:25:55,050 --> 00:25:57,999 as functions that operate on the list. 513 00:25:57,999 --> 00:26:00,540 Except that what these functions do is they take in the list, 514 00:26:00,540 --> 00:26:02,700 and they modify it. 
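[Editor's sketch] The sequence of removals walked through above, run in order. The starting list is never read out in full, so the one used here is reconstructed from the intermediate results in the walkthrough and should be treated as an assumption; the name last is also hypothetical.

```python
L = [2, 1, 3, 6, 3, 7, 0]   # starting list reconstructed from the walkthrough

L.remove(2)      # removes the first element whose value is 2
print(L)         # [1, 3, 6, 3, 7, 0]

L.remove(3)      # only the first 3 goes away
print(L)         # [1, 6, 3, 7, 0]

del L[1]         # deletes the element at index 1 (the 6)
print(L)         # [1, 3, 7, 0]

last = L.pop()   # removes the rightmost element and returns it
print(last)      # 0
print(L)         # [1, 3, 7]
```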
515 00:26:02,700 --> 00:26:05,760 But as functions, they obviously return something 516 00:26:05,760 --> 00:26:07,740 back to whoever called them. 517 00:26:07,740 --> 00:26:11,490 And oftentimes, they're going to return the value None. 518 00:26:11,490 --> 00:26:15,780 So for example, if you are going to do L.remove 2, 519 00:26:15,780 --> 00:26:18,900 and you print that out, that might print out None for you. 520 00:26:18,900 --> 00:26:23,850 So you can't just assign the value of this to a variable 521 00:26:23,850 --> 00:26:26,580 and expect it to be the mutated list. 522 00:26:26,580 --> 00:26:28,759 The list got mutated. 523 00:26:28,759 --> 00:26:30,300 The list that got mutated is the list 524 00:26:30,300 --> 00:26:32,100 that was passed in to here. 525 00:26:32,100 --> 00:26:34,710 We're going to look at one example in a few slides that's 526 00:26:34,710 --> 00:26:36,330 going to show this. 527 00:26:36,330 --> 00:26:36,872 OK. 528 00:26:36,872 --> 00:26:39,330 Another thing that we can do, and this is often useful when 529 00:26:39,330 --> 00:26:43,290 you're working with data, is to convert lists 530 00:26:43,290 --> 00:26:45,420 to strings and then strings to lists. 531 00:26:45,420 --> 00:26:48,330 Sometimes it might be useful to work with strings as opposed 532 00:26:48,330 --> 00:26:52,810 to a list and vice versa. 533 00:26:52,810 --> 00:26:56,370 So this first line here, list s takes in a string 534 00:26:56,370 --> 00:26:58,770 and casts it to a list. 535 00:26:58,770 --> 00:27:00,840 So much like when we cast a float to an integer, 536 00:27:00,840 --> 00:27:01,890 for example. 537 00:27:01,890 --> 00:27:04,950 You're just casting a string to a list here.
538 00:27:04,950 --> 00:27:06,430 And when you do that up this line-- 539 00:27:06,430 --> 00:27:09,990 so if this is your s here-- when you do list s, 540 00:27:09,990 --> 00:27:12,080 this is going to give you a list-- 541 00:27:12,080 --> 00:27:14,850 looks like this-- where every single character in s 542 00:27:14,850 --> 00:27:16,920 is going to be its own element. 543 00:27:16,920 --> 00:27:19,170 So that means every character is going to be a string, 544 00:27:19,170 --> 00:27:21,070 and it's going to be separated by a comma, 545 00:27:21,070 --> 00:27:21,925 so including spaces. 546 00:27:25,360 --> 00:27:27,820 Sometimes you don't want each character in the list 547 00:27:27,820 --> 00:27:28,930 to be its own element. 548 00:27:28,930 --> 00:27:31,810 Sometimes you want, for example, if you're given a sentence, 549 00:27:31,810 --> 00:27:34,960 you might want to have everything in between spaces 550 00:27:34,960 --> 00:27:36,290 being its own element. 551 00:27:36,290 --> 00:27:38,873 So that will give you every word in the sentence, for example. 552 00:27:41,230 --> 00:27:45,370 In that case, you're going to use split. 553 00:27:45,370 --> 00:27:47,490 In this case, I've split over the less than sign. 554 00:27:47,490 --> 00:27:52,240 But again, if you're doing the sentence example, 555 00:27:52,240 --> 00:27:54,360 you might want to split on the space. 556 00:27:54,360 --> 00:27:57,327 So this is going to take everything in between the sign 557 00:27:57,327 --> 00:27:59,910 that you're interested in-- in this case, the less than sign-- 558 00:27:59,910 --> 00:28:02,368 and it's going to set it as a separate element in the list. 559 00:28:07,620 --> 00:28:12,170 So that's how you convert strings to lists. 560 00:28:12,170 --> 00:28:14,075 And sometimes you're given a list, 561 00:28:14,075 --> 00:28:15,950 and you might want to convert it to a string. 562 00:28:15,950 --> 00:28:21,880 So that's where this join method or function is useful. 
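[Editor's sketch] Converting a string to a list with list and split. The sample strings below are assumptions chosen to contain a less than sign and spaces; the transcript does not spell out the string shown on the slide.

```python
s = "I<3 cs"                  # assumed sample string

chars = list(s)               # every character, spaces included, becomes an element
print(chars)                  # ['I', '<', '3', ' ', 'c', 's']

parts = s.split('<')          # everything between '<' signs becomes an element
print(parts)                  # ['I', '3 cs']

sentence = "lists are mutable"    # assumed sample sentence
words = sentence.split(' ')       # splitting on the space gives the words
print(words)                      # ['lists', 'are', 'mutable']
```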
563 00:28:21,880 --> 00:28:23,802 So this is an empty string. 564 00:28:23,802 --> 00:28:25,510 So it's just open close quote right away. 565 00:28:25,510 --> 00:28:26,870 No space. 566 00:28:26,870 --> 00:28:30,531 So this just joins every one of the elements in the list 567 00:28:30,531 --> 00:28:31,030 together. 568 00:28:31,030 --> 00:28:32,950 So it'll return the string abc. 569 00:28:32,950 --> 00:28:35,630 And then you can join on any character that you would want. 570 00:28:35,630 --> 00:28:38,240 So in this case, you can join on the underscore. 571 00:28:38,240 --> 00:28:41,320 So it'll put whatever characters in here in between every one 572 00:28:41,320 --> 00:28:44,420 of the elements in your list. 573 00:28:44,420 --> 00:28:48,220 So pretty useful functions. 574 00:28:48,220 --> 00:28:48,940 OK. 575 00:28:48,940 --> 00:28:50,840 Couple other operations we can do on lists-- 576 00:28:50,840 --> 00:28:53,870 and these are also pretty useful-- is to sort lists 577 00:28:53,870 --> 00:28:57,190 and to reverse lists and many, many others 578 00:28:57,190 --> 00:29:01,740 in the Python documentation. 579 00:29:01,740 --> 00:29:06,810 So sort and sorted both sort lists, but one of them 580 00:29:06,810 --> 00:29:09,570 mutates the list and the other one does not. 581 00:29:09,570 --> 00:29:11,460 And sometimes it's useful to use one, 582 00:29:11,460 --> 00:29:14,070 and sometimes it's useful to use the other. 583 00:29:14,070 --> 00:29:19,550 So if I have this list L is equal to 9,6,0,3, 584 00:29:19,550 --> 00:29:24,380 sorted-- you can think of it as giving me 585 00:29:24,380 --> 00:29:27,610 the sorted version of L-- gives you back 586 00:29:27,610 --> 00:29:31,120 the sorted version of L. So it returns a new list that's 587 00:29:31,120 --> 00:29:34,690 the sorted version of the input list and does not mutate L. 588 00:29:34,690 --> 00:29:36,480 So it keeps L the exact same way. 
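[Editor's sketch] Going the other way with join, as described above. The list of letters is inferred from the spoken result abc; the variable names are hypothetical.

```python
letters = ['a', 'b', 'c']

joined = ''.join(letters)         # empty string between the elements
print(joined)                     # abc

underscored = '_'.join(letters)   # underscore between the elements
print(underscored)                # a_b_c
```

Note that join is called on the separator string, and the list is passed in as the argument.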
589 00:29:39,120 --> 00:29:42,600 So this will be replaced by the sorted version 590 00:29:42,600 --> 00:29:44,867 of the list, which you can assign to a variable, 591 00:29:44,867 --> 00:29:46,450 and then do whatever you want with it. 592 00:29:46,450 --> 00:29:50,845 Like L2 is equal to sorted L, for example. 593 00:29:53,430 --> 00:29:55,940 And it keeps L the same. 594 00:29:55,940 --> 00:29:58,451 On the other hand, if you just want to mutate L, 595 00:29:58,451 --> 00:30:00,950 and you don't care about getting another copy that's sorted, 596 00:30:00,950 --> 00:30:03,850 you just do L.sort. 597 00:30:03,850 --> 00:30:07,050 And that's going to automatically sort L for you, 598 00:30:07,050 --> 00:30:14,220 and L now-- L is now the sorted version of L. Similarly, 599 00:30:14,220 --> 00:30:17,880 reverse is going to take L and reverse all the character-- 600 00:30:17,880 --> 00:30:18,840 all the elements in it. 601 00:30:18,840 --> 00:30:21,173 So the last one is the first one, the second to last one 602 00:30:21,173 --> 00:30:24,900 is the second one, and so on. 603 00:30:24,900 --> 00:30:26,520 So lists are mutable. 604 00:30:26,520 --> 00:30:28,337 We've said that so many times this lecture. 605 00:30:28,337 --> 00:30:29,670 But what exactly does that mean? 606 00:30:29,670 --> 00:30:33,690 What implications does that have? 607 00:30:33,690 --> 00:30:39,480 Once again, this next part of the lecture, Python tutor. 608 00:30:39,480 --> 00:30:43,560 Just paste all the code in and go step by step 609 00:30:43,560 --> 00:30:47,420 to see exactly what's happening. 610 00:30:47,420 --> 00:30:49,175 So lists are mutable. 611 00:30:52,460 --> 00:30:56,730 As you have variable names-- so for example, 612 00:30:56,730 --> 00:31:00,710 L is equal to some list-- that L is 613 00:31:00,710 --> 00:31:04,160 going to be pointing to the list in memory. 
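[Editor's sketch] A small run of sorted, sort, and reverse on the list mentioned above. L2 is the name used in the lecture for the sorted copy; the rest is a minimal illustration.

```python
L = [9, 6, 0, 3]

L2 = sorted(L)    # returns a new sorted list
print(L2)         # [0, 3, 6, 9]
print(L)          # [9, 6, 0, 3]  (unchanged)

L.sort()          # sorts L itself
print(L)          # [0, 3, 6, 9]

L.reverse()       # reverses L itself
print(L)          # [9, 6, 3, 0]
```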
614 00:31:04,160 --> 00:31:06,570 And since it's a mutable object, this list, 615 00:31:06,570 --> 00:31:08,330 you can have more than one variable 616 00:31:08,330 --> 00:31:13,250 that points to the exact same object in memory. 617 00:31:13,250 --> 00:31:15,600 And if you have more than one variable that 618 00:31:15,600 --> 00:31:17,100 points to the same object in memory, 619 00:31:17,100 --> 00:31:20,340 if that object in memory is changed, then 620 00:31:20,340 --> 00:31:23,550 when you access it through any one of these variables, 621 00:31:23,550 --> 00:31:30,860 they're all going to give you the changed object value. 622 00:31:30,860 --> 00:31:32,270 So the key phrase to keep in mind 623 00:31:32,270 --> 00:31:34,010 when you're dealing with lists is 624 00:31:34,010 --> 00:31:36,050 what side effects could happen? 625 00:31:36,050 --> 00:31:38,780 If you're mutating a list, if you're doing operations 626 00:31:38,780 --> 00:31:43,370 on lists, what side effects-- what variables might 627 00:31:43,370 --> 00:31:46,280 be affected by this change? 628 00:31:46,280 --> 00:31:49,250 Let's come back down to earth for a second. 629 00:31:49,250 --> 00:31:50,757 This will wake a lot of people up. 630 00:31:53,380 --> 00:31:57,940 So let's do an analogy with people. 631 00:31:57,940 --> 00:31:59,620 Let's say we have a person. 632 00:31:59,620 --> 00:32:02,860 A person-- this case, Justin Bieber-- 633 00:32:02,860 --> 00:32:05,350 is going to be an object. 634 00:32:05,350 --> 00:32:06,790 I'm an object. 635 00:32:06,790 --> 00:32:08,470 I'm like the number three. 636 00:32:08,470 --> 00:32:09,340 Bieber's an object. 637 00:32:09,340 --> 00:32:10,690 He's like number five. 638 00:32:10,690 --> 00:32:11,840 Different objects. 639 00:32:11,840 --> 00:32:15,030 We're both of type people. 640 00:32:15,030 --> 00:32:16,940 OK. 641 00:32:16,940 --> 00:32:20,120 Let's say a person has different attributes.
642 00:32:20,120 --> 00:32:23,870 Let's say we can-- let's say he gets 643 00:32:23,870 --> 00:32:25,430 two attributes to begin with. 644 00:32:25,430 --> 00:32:26,650 He's a singer and he's rich. 645 00:32:30,530 --> 00:32:34,100 I can refer to this person object by many different names. 646 00:32:34,100 --> 00:32:36,980 His full name, his stage name, all of the fan girls 647 00:32:36,980 --> 00:32:39,590 call him by these names, people who 648 00:32:39,590 --> 00:32:41,090 dislike him call him by other names 649 00:32:41,090 --> 00:32:43,490 that they didn't put up here. 650 00:32:43,490 --> 00:32:45,560 But he's known by all these different names. 651 00:32:45,560 --> 00:32:48,560 They're all aliases or nicknames that point to this same person 652 00:32:48,560 --> 00:32:50,770 object. 653 00:32:50,770 --> 00:32:51,680 OK. 654 00:32:51,680 --> 00:32:53,720 So originally, let's say I say Justin Bieber is 655 00:32:53,720 --> 00:32:54,737 a singer and rich. 656 00:32:54,737 --> 00:32:56,570 Those are the two attributes I've originally 657 00:32:56,570 --> 00:32:58,160 assigned to him. 658 00:32:58,160 --> 00:33:00,860 And then let's say I want to assign a different attribute 659 00:33:00,860 --> 00:33:04,390 to him and say Justin Bieber's a singer, rich, 660 00:33:04,390 --> 00:33:05,740 and a troublemaker. 661 00:33:05,740 --> 00:33:07,930 I'm being kind here. 662 00:33:07,930 --> 00:33:09,040 OK. 663 00:33:09,040 --> 00:33:12,530 So if I say Justin Bieber has these three attributes-- 664 00:33:12,530 --> 00:33:15,160 so it's the same person I'm referring to-- then 665 00:33:15,160 --> 00:33:18,760 all of his nicknames are going to refer 666 00:33:18,760 --> 00:33:20,090 to this exact same person. 667 00:33:20,090 --> 00:33:23,290 So all of his nicknames or aliases 668 00:33:23,290 --> 00:33:25,390 will refer to the same person object 669 00:33:25,390 --> 00:33:28,590 with these changed attributes. 670 00:33:28,590 --> 00:33:30,510 Does that make sense?
671 00:33:30,510 --> 00:33:31,470 OK. 672 00:33:31,470 --> 00:33:35,550 So that sort of idea arises in lists. 673 00:33:35,550 --> 00:33:37,370 So a list is like a person object 674 00:33:37,370 --> 00:33:40,940 whose value-- whose attributes can change, for example. 675 00:33:40,940 --> 00:33:43,290 And as they change, all of the different aliases 676 00:33:43,290 --> 00:33:46,525 for this object will point to this changed object. 677 00:33:49,390 --> 00:33:51,000 So let's see a few examples. 678 00:33:51,000 --> 00:33:53,640 I apologize if this is a little small, 679 00:33:53,640 --> 00:33:55,410 but this I basically copied and pasted 680 00:33:55,410 --> 00:33:58,860 from the Python tutor, which is just 681 00:33:58,860 --> 00:34:02,340 from the code from today's lecture. 682 00:34:02,340 --> 00:34:05,130 So I have these lines of code here. 683 00:34:05,130 --> 00:34:06,720 The first couple of lines really just 684 00:34:06,720 --> 00:34:08,261 show what happens when you're dealing 685 00:34:08,261 --> 00:34:09,888 with non-mutable objects. 686 00:34:09,888 --> 00:34:11,429 So with non-mutable objects, you have 687 00:34:11,429 --> 00:34:14,520 two separate objects that get their own values, 688 00:34:14,520 --> 00:34:15,250 and that's it. 689 00:34:15,250 --> 00:34:17,010 End of story. 690 00:34:17,010 --> 00:34:21,428 With lists, however, there's something different 691 00:34:21,428 --> 00:34:21,969 that happens. 692 00:34:21,969 --> 00:34:25,260 So I have warm is a variable. 693 00:34:25,260 --> 00:34:27,090 And it's going to be equal to this list. 694 00:34:27,090 --> 00:34:30,239 So warm is going to point to this list here. 695 00:34:30,239 --> 00:34:31,150 Red, yellow, orange. 696 00:34:31,150 --> 00:34:34,560 It contains three elements. 697 00:34:34,560 --> 00:34:35,730 Hot is equal to warm. 698 00:34:35,730 --> 00:34:39,080 It means I'm creating an alias for this list. 699 00:34:39,080 --> 00:34:43,469 And the alias is going to be with this variable hot. 
700 00:34:43,469 --> 00:34:48,880 So notice warm and hot point to the exact same object. 701 00:34:48,880 --> 00:34:54,870 So on line 8 when I append this string pink to my object, 702 00:34:54,870 --> 00:34:57,000 since both of these two variables 703 00:34:57,000 --> 00:34:58,860 point to the exact same object, if I'm 704 00:34:58,860 --> 00:35:03,000 trying to access this object through either variable, 705 00:35:03,000 --> 00:35:05,640 they're both going to print out the same thing. 706 00:35:08,330 --> 00:35:09,640 And that's the side effect. 707 00:35:09,640 --> 00:35:12,920 That's the side effect of lists being mutable. 708 00:35:17,300 --> 00:35:22,280 If you want to create an entirely new copy of the list, 709 00:35:22,280 --> 00:35:25,220 then you can clone it, which sounds really cool. 710 00:35:25,220 --> 00:35:28,100 But really, it's just making a copy of the list. 711 00:35:28,100 --> 00:35:31,140 And you clone it using this little notation here, 712 00:35:31,140 --> 00:35:34,430 which is open close square brackets with a colon. 713 00:35:34,430 --> 00:35:36,180 And we've sort of seen this notation here. 714 00:35:36,180 --> 00:35:39,350 And this tells Python this is 0-- sorry. 715 00:35:39,350 --> 00:35:42,780 This is 0 and this is length. 716 00:35:42,780 --> 00:35:43,280 Cool. 717 00:35:46,030 --> 00:35:47,940 But it basically says take every element, 718 00:35:47,940 --> 00:35:52,530 create a new list with those exact same elements, 719 00:35:52,530 --> 00:35:55,670 and assign it to the variable chill. 720 00:35:55,670 --> 00:35:57,560 So here, if I originally have cool 721 00:35:57,560 --> 00:35:59,690 is equal to blue, green, gray right 722 00:35:59,690 --> 00:36:05,180 here, when I clone it on line 2 with that funky notation, 723 00:36:05,180 --> 00:36:07,724 I'm creating a new copy of it.
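[Editor's sketch] The aliasing side effect with warm and hot, written out. The values follow the spoken description; the slide's line numbers are not reproduced here.

```python
warm = ['red', 'yellow', 'orange']
hot = warm              # hot is an alias: a second name for the same list object

hot.append('pink')      # mutates the one shared object

print(hot)              # ['red', 'yellow', 'orange', 'pink']
print(warm)             # ['red', 'yellow', 'orange', 'pink']
print(hot is warm)      # True
```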
724 00:36:07,724 --> 00:36:09,140 And then on the next line when I'm 725 00:36:09,140 --> 00:36:14,000 appending another element to the copy, 726 00:36:14,000 --> 00:36:16,520 notice I'm just altering the copy. 727 00:36:16,520 --> 00:36:21,360 The original stayed the same, because I've cloned it. 728 00:36:21,360 --> 00:36:24,360 So if you don't want to have the side effects-- side 729 00:36:24,360 --> 00:36:26,190 effect issue, then you should clone 730 00:36:26,190 --> 00:36:30,300 your variable-- your list. 731 00:36:34,610 --> 00:36:39,370 So let's see a slightly more complicated example 732 00:36:39,370 --> 00:36:42,855 where you're going to see the difference between sort 733 00:36:42,855 --> 00:36:48,770 and sorted in the context of this mutability and side 734 00:36:48,770 --> 00:36:49,591 effects issue. 735 00:36:49,591 --> 00:36:50,090 OK. 736 00:36:50,090 --> 00:36:54,170 So once again, let's create this warm is equal to red, yellow, 737 00:36:54,170 --> 00:36:54,890 orange. 738 00:36:54,890 --> 00:37:00,020 So that's what warm is going to point to, this list. 739 00:37:00,020 --> 00:37:03,830 And then sorted warm is equal to warm.sort. 740 00:37:03,830 --> 00:37:06,500 So .sort mutates. 741 00:37:06,500 --> 00:37:09,830 So as soon as I do that, that list warm 742 00:37:09,830 --> 00:37:13,480 is now the sorted version of it. 743 00:37:13,480 --> 00:37:16,460 And notice that I've assigned the return of this 744 00:37:16,460 --> 00:37:17,610 to sorted warm. 745 00:37:17,610 --> 00:37:26,170 And the return is None, because L.sort or .sort mutated 746 00:37:26,170 --> 00:37:26,950 the list. 747 00:37:26,950 --> 00:37:30,130 It didn't return a sorted version of the list. 748 00:37:30,130 --> 00:37:32,350 It mutated the list itself. 749 00:37:32,350 --> 00:37:33,160 OK. 750 00:37:33,160 --> 00:37:35,230 So when I print warm and I print sorted warm, 751 00:37:35,230 --> 00:37:40,480 I'm printing the mutated version and then this one here.
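[Editor's sketch] The cloning step described earlier, using the names cool and chill from the lecture. The appended color is not named in the transcript, so 'black' is an assumption.

```python
cool = ['blue', 'green', 'grey']
chill = cool[:]          # slice from start to end: a new list with the same elements

chill.append('black')    # assumed extra color; only the copy changes

print(chill)             # ['blue', 'green', 'grey', 'black']
print(cool)              # ['blue', 'green', 'grey']
print(chill is cool)     # False
```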
752 00:37:40,480 --> 00:37:44,780 Sorted, on the other hand, returns-- 753 00:37:44,780 --> 00:37:50,950 it doesn't-- sorted does not sort the list that's given 754 00:37:50,950 --> 00:37:51,580 to it. 755 00:37:51,580 --> 00:37:54,800 And instead, it returns a sorted version of the list. 756 00:37:54,800 --> 00:37:57,820 So in this case, if cool is equal to these three colors-- 757 00:37:57,820 --> 00:38:02,760 gray, green, blue-- if I do sorted cool, 758 00:38:02,760 --> 00:38:05,350 it's going to return the sorted version of that list, which 759 00:38:05,350 --> 00:38:07,214 is blue, green, gray. 760 00:38:07,214 --> 00:38:09,130 And it's assigned to the variable sorted cool. 761 00:38:09,130 --> 00:38:10,796 So when I print them, it's going to show 762 00:38:10,796 --> 00:38:13,230 me the two separate lists. 763 00:38:13,230 --> 00:38:14,956 One being the original unsorted one, 764 00:38:14,956 --> 00:38:16,330 and one being the sorted version. 765 00:38:21,190 --> 00:38:23,680 Last one's a little bit more complicated, 766 00:38:23,680 --> 00:38:27,910 but it shows that even though you have nested-- 767 00:38:27,910 --> 00:38:33,100 even though you can have nested lists, you still-- you're not-- 768 00:38:33,100 --> 00:38:40,630 you don't escape this idea of side effects. 769 00:38:40,630 --> 00:38:44,770 So first, I'm going to create warm is equal to these two 770 00:38:44,770 --> 00:38:46,180 colors, yellow, orange. 771 00:38:46,180 --> 00:38:49,000 So warm points to these two colors. 772 00:38:49,000 --> 00:38:54,000 Hot is equal to this one list-- a list with one element. 773 00:38:54,000 --> 00:38:57,810 Bright colors is going to be a list. 774 00:38:57,810 --> 00:39:02,360 And the element inside the list is a list itself.
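[Editor's sketch] What ends up in the variable when the result of sort or sorted is assigned, as discussed above. The names sortedwarm and sortedcool are written as single identifiers here by assumption, and the color is spelled 'grey', which is the spelling that produces the sorted order read out in the lecture.

```python
warm = ['red', 'yellow', 'orange']
sortedwarm = warm.sort()     # sort mutates warm and returns None
print(warm)                  # ['orange', 'red', 'yellow']
print(sortedwarm)            # None

cool = ['grey', 'green', 'blue']
sortedcool = sorted(cool)    # sorted returns a new list and leaves cool alone
print(cool)                  # ['grey', 'green', 'blue']
print(sortedcool)            # ['blue', 'green', 'grey']
```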
775 00:39:02,360 --> 00:39:05,210 So since it's a list-- this is your list, 776 00:39:05,210 --> 00:39:08,240 and the element inside here, which is a list itself, 777 00:39:08,240 --> 00:39:10,880 is actually just pointing to whatever warm is. 778 00:39:10,880 --> 00:39:11,493 That object. 779 00:39:16,700 --> 00:39:20,540 Then I do-- then I append hot to my bright colors. 780 00:39:20,540 --> 00:39:22,250 So the next element here is going 781 00:39:22,250 --> 00:39:25,460 to be another list, which means it's just pointing 782 00:39:25,460 --> 00:39:26,840 to this other list here. 783 00:39:26,840 --> 00:39:30,164 It's not creating a copy of it. 784 00:39:30,164 --> 00:39:31,580 So each one of these elements here 785 00:39:31,580 --> 00:39:35,540 is actually just pointing to these two lists here. 786 00:39:35,540 --> 00:39:38,000 So if I modified either one of these, 787 00:39:38,000 --> 00:39:40,200 then bright colors would also be modified. 788 00:39:40,200 --> 00:39:45,260 So let's say I add pink here to my hot list. 789 00:39:45,260 --> 00:39:46,960 We have red and pink. 790 00:39:46,960 --> 00:39:50,729 Then notice that bright colors-- the first element 791 00:39:50,729 --> 00:39:52,520 points to this list, and the second element 792 00:39:52,520 --> 00:39:56,459 points to this list, which I've just modified. 793 00:39:56,459 --> 00:39:58,000 Last thing is-- I'll let you try this 794 00:39:58,000 --> 00:40:02,020 as an exercise in Python Tutor-- but the idea here being you 795 00:40:02,020 --> 00:40:04,570 should be careful as you're writing a for loop that 796 00:40:04,570 --> 00:40:08,500 iterates over a list that you're modifying inside the loop. 797 00:40:08,500 --> 00:40:12,250 In this case, I'm trying to go through the list L1. 798 00:40:12,250 --> 00:40:15,130 And if I find an item that's in L1 and L2, 799 00:40:15,130 --> 00:40:18,220 I want to delete it from L1. 800 00:40:18,220 --> 00:40:20,680 So 1 and 2 are also in L2.
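[Editor's sketch] The nested list example with bright colors. The identifier brightcolors is an assumed spelling of the spoken name; the values follow the description above.

```python
warm = ['yellow', 'orange']
hot = ['red']

brightcolors = [warm]        # the single element points to the warm list
brightcolors.append(hot)     # second element points to the hot list, no copy made

hot.append('pink')           # mutating hot shows up through brightcolors

print(hot)                   # ['red', 'pink']
print(brightcolors)          # [['yellow', 'orange'], ['red', 'pink']]
print(brightcolors[1] is hot)   # True
```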
801 00:40:20,680 --> 00:40:24,040 So I want to delete them from L1 and be left with 3, 4. 802 00:40:24,040 --> 00:40:26,200 However, the code on the left here doesn't actually 803 00:40:26,200 --> 00:40:31,710 do what I think it's doing, because here I'm 804 00:40:31,710 --> 00:40:33,650 modifying a list as I'm iterating over it. 805 00:40:33,650 --> 00:40:35,870 And behind the scenes, Python keeps 806 00:40:35,870 --> 00:40:40,220 this-- keeps track of the index and doesn't update the index 807 00:40:40,220 --> 00:40:42,810 as you're changing the list. 808 00:40:42,810 --> 00:40:45,020 So it figures out the length of the list 809 00:40:45,020 --> 00:40:48,490 to begin with and how many indices it has. 810 00:40:48,490 --> 00:40:52,710 It doesn't update it as you're removing items from the list. 811 00:40:52,710 --> 00:40:55,590 So the solution to that is to make a copy of the list 812 00:40:55,590 --> 00:41:01,530 first, iterate over the copy, which will remain intact, 813 00:41:01,530 --> 00:41:05,940 and modify the list that you want to modify inside the loop. 814 00:41:05,940 --> 00:41:08,924 So please run both of these in the Python Tutor, 815 00:41:08,924 --> 00:41:11,340 and you'll see that what ends up happening is on the left, 816 00:41:11,340 --> 00:41:13,860 you're going to skip over one element. 817 00:41:13,860 --> 00:41:17,880 So your code-- so that's going to be the wrong code. 818 00:41:17,880 --> 00:41:19,490 All right.
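[Editor's sketch] The two versions of the loop described above, side by side. The function names and the extra elements of L2 (5 and 6) are assumptions; the transcript only says that 1 and 2 appear in both lists and that the intended result is 3, 4.

```python
def remove_dups_buggy(L1, L2):
    """Removes from L1 while iterating over L1 itself (skips elements)."""
    for e in L1:
        if e in L2:
            L1.remove(e)

def remove_dups(L1, L2):
    """Iterates over a clone of L1 and removes from the original."""
    L1_copy = L1[:]          # the copy stays intact during the loop
    for e in L1_copy:
        if e in L2:
            L1.remove(e)

L2 = [1, 2, 5, 6]            # 5 and 6 are assumed filler values

a = [1, 2, 3, 4]
remove_dups_buggy(a, L2)
print(a)                     # [2, 3, 4]: the 2 was skipped

b = [1, 2, 3, 4]
remove_dups(b, L2)
print(b)                     # [3, 4]
```

In the buggy version, removing 1 shifts 2 into index 0, and the loop then moves on to index 1, so 2 is never examined.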