More On Lists
In Python, lists are represented by square brackets. Therefore, we create a list as follows.
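A minimal sketch of such a list (the color names are illustrative, not necessarily the values used originally):

```python
# Comma-separated items inside square brackets form a list.
colors = ["red", "green", "blue"]

print(colors)        # ['red', 'green', 'blue']
print(type(colors))  # <class 'list'>
```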
The above list, colors
is stored in memory as shown below.
We can also create a list that contains multiple data types, like strings, integers, and floats.
Python lists follow a zero indexing structure, meaning the list index starts from 0. Nested lists are accessed using nested indexing.
Python has a very handy negative indexing feature as well, which starts from the end of the list:
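A short sketch of zero-based, nested, and negative indexing, using a hypothetical mixed-type list:

```python
# Hypothetical list mixing a string, an integer, a float, and a nested list.
mixed = ["apple", 3, 2.5, ["x", "y"]]

first = mixed[0]          # 'apple': index 0 is the first element
inner = mixed[3][1]       # 'y': index into the nested list
last = mixed[-1]          # ['x', 'y']: -1 is the last element
second_last = mixed[-2]   # 2.5: -2 is the second to last
```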
We can reverse and slice lists using list indices, as follows
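For instance, with an illustrative list of numbers (slicing returns a new list and leaves the original unchanged):

```python
numbers = [10, 20, 30, 40, 50]

middle = numbers[1:4]          # [20, 30, 40]: start included, stop excluded
every_other = numbers[::2]     # [10, 30, 50]: step of 2
reversed_copy = numbers[::-1]  # [50, 40, 30, 20, 10]: step of -1 walks backward
```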
For more information regarding list slicing, refer to this link.
list.index()
list.index()
returns the index of a specified element in the list. The syntax is: list.index(element, start, end)
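A small sketch with made-up values. The optional start and end arguments limit the search to a slice of the list, and a missing element raises ValueError:

```python
letters = ["a", "b", "c", "b", "a"]

first_b = letters.index("b")           # 1: first match from the beginning
later_b = letters.index("b", 2)        # 3: search starts at index 2
a_in_range = letters.index("a", 1, 5)  # 4: search limited to indices 1..4

try:
    letters.index("z")
    missing = False
except ValueError:
    missing = True                     # 'z' is not in the list
```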
list.append()
The list.append()
method adds an item at the end of a list.
list.extend()
list.extend()
extends the list by appending items.
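The difference between append and extend is easiest to see side by side (illustrative values):

```python
appended = [1, 2]
appended.append([3, 4])   # the whole list becomes ONE new element

extended = [1, 2]
extended.extend([3, 4])   # each item of the argument is added separately

print(appended)  # [1, 2, [3, 4]]
print(extended)  # [1, 2, 3, 4]
```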
list.insert()
list.insert()
inserts an element at the specified index.
list.remove()
list.remove()
removes the first element in the list that matches the given value.
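A quick sketch with hypothetical values. Only the first match is removed, and removing a value that is not present raises ValueError:

```python
pets = ["cat", "dog", "cat"]
pets.remove("cat")        # removes only the first 'cat'

try:
    pets.remove("fish")
    raised = False
except ValueError:
    raised = True         # 'fish' is not in the list

print(pets)  # ['dog', 'cat']
```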
list.count(x)
list.count()
returns the number of times that ‘x’ appears in the list.
list.pop()
The list.pop()
method removes and returns the element at the index given as its parameter. If no index is given, it removes and returns the last element in the list.
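For example, with an illustrative list:

```python
stack = [1, 2, 3, 4]

last = stack.pop()     # 4: no index given, so the last element is removed
first = stack.pop(0)   # 1: the element at index 0 is removed

print(stack)  # [2, 3]
```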
list.reverse()
The list.reverse()
method reverses the list in place. It has no return value (it returns None).
list.sort()
The list.sort()
method sorts the elements of the given list using the syntax: list.sort(key= , reverse= )
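Both key and reverse are optional keyword arguments. A sketch with made-up words; note that sort works in place and returns None:

```python
words = ["pear", "fig", "banana"]

result = words.sort()            # sorts in place; result is None
alphabetical = list(words)       # ['banana', 'fig', 'pear']

words.sort(key=len)              # sort by length, shortest first
by_length = list(words)          # ['fig', 'pear', 'banana']

words.sort(key=len, reverse=True)
longest_first = list(words)      # ['banana', 'pear', 'fig']
```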
list.copy()
The list.copy()
method returns a shallow copy of the list.
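The copy is a new list, but it is shallow: nested objects are shared between the original and the copy. A sketch with illustrative values:

```python
original = [1, [2, 3]]
duplicate = original.copy()

duplicate.append(4)        # affects only the copy
duplicate[1].append(99)    # the nested list is shared, so both see this

print(original)   # [1, [2, 3, 99]]
print(duplicate)  # [1, [2, 3, 99], 4]
```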
list.clear()
The list.clear()
method empties the given list.
List comprehensions are a concise way to create a new list from an existing list (or any other iterable). A comprehension consists of an expression followed by a for clause, all inside square brackets.
For example:
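A minimal sketch with illustrative numbers. The optional if clause filters which items are included:

```python
numbers = [1, 2, 3, 4, 5]

# Expression first, then the for clause, all inside square brackets.
squares = [n * n for n in numbers]

# An optional if clause keeps only the items that pass the test.
even_squares = [n * n for n in numbers if n % 2 == 0]

print(squares)       # [1, 4, 9, 16, 25]
print(even_squares)  # [4, 16]
```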
Lists are one of the most commonly used and most powerful data structures in Python, and mastering them pays off in programming interviews. Once you have read through and tried the list methods, check out the links below and start solving problems based on lists.
Last chapter we introduced Python’s built-in types int, float, and str, and we stumbled upon tuple.
Integers and floats are numeric types, which means they hold numbers. We can use the numeric operators we saw last chapter with them to form numeric expressions. The Python interpreter can then evaluate these expressions to produce numeric values, making Python a very powerful calculator.
Strings, lists, and tuples are all sequence types, so called because they behave like a sequence - an ordered collection of objects.
Sequence types are qualitatively different from numeric types because they are compound data types - meaning they are made up of smaller pieces. In the case of strings, they’re made up of smaller strings, each containing one character. There is also the empty string, containing no characters at all.
In the case of lists or tuples, they are made up of elements, which are values of any Python datatype, including other lists and tuples.
Lists are enclosed in square brackets ([ and ]) and tuples in parentheses (( and )).
A list containing no elements is called an empty list, and a tuple with no elements is an empty tuple.
The first example is a list of five integers, and the next is a list of three strings. The third is a tuple containing four integers, followed by a tuple containing four strings. The last is a list containing three tuples, each of which contains a pair of strings.
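Examples matching that description might look like the following sketch. The specific values and variable names are assumptions chosen for illustration:

```python
five_ints = [10, 20, 30, 40, 50]                  # list of five integers
three_strs = ["spam", "bungee", "swallow"]        # list of three strings
four_ints = (2, 4, 6, 8)                          # tuple of four integers
four_strs = ("two", "four", "six", "eight")       # tuple of four strings
pairs = [("cheese", "queso"), ("red", "rojo"), ("school", "escuela")]

empty_list = []    # a list with no elements
empty_tuple = ()   # a tuple with no elements
```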
Depending on what we are doing, we may want to treat a compound data type as a single thing, or we may want to access its parts. This ambiguity is useful.
Note
It is possible to drop the parentheses when specifying a tuple, and only use a comma-separated list of values:
Also, it is required to include a comma when specifying a tuple with only one element:
Except for the case of the empty tuple, it is really the commas, not the parentheses, that tell Python it is a tuple.
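A short sketch of these rules, with illustrative values:

```python
implicit = 1, 2, 3     # commas alone make a tuple: (1, 2, 3)
single = (5,)          # the trailing comma makes a one-element tuple
not_a_tuple = (5)      # parentheses alone: this is just the integer 5
empty = ()             # the one case where parentheses are enough

print(type(implicit), type(single), type(not_a_tuple), type(empty))
```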
The sequence types share a common set of operations.
The indexing operator ([
]
) selects a single element from a sequence. The expression inside brackets is called the index, and must be an integer value. The index indicates which element to select, hence its name.
The expression fruit[1] selects the character with index 1 from fruit, and creates a new string containing just this one character, which you may be surprised to see is 'a'.
You probably expected to see 'b', but computer scientists typically start counting from zero, not one. Think of the index as the numbers on a ruler measuring how many elements you have moved into the sequence from the beginning. Both rulers and indices start at 0.
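As a sketch, assuming fruit holds the string "banana" (which is consistent with the letters discussed above):

```python
fruit = "banana"   # assumed value

second = fruit[1]  # 'a': index 1 is the SECOND character
first = fruit[0]   # 'b': index 0 is the first character

print(second, first)  # a b
```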
Last chapter you saw the len
function used to get the number of characters in a string:
With lists and tuples, len
returns the number of elements in the sequence:
It is common in computer programming to need to access elements at the end of a sequence. Now that you have seen the len
function, you might be tempted to try something like this:
That won’t work. It causes the runtime error IndexError: list index out of range. The reason is that len(seq) returns the number of elements in the list, 16, but there is no element at index position 16 in seq.
Since we started counting at zero, the sixteen indices are numbered 0 to 15. To get the last element, we have to subtract 1 from the length:
This is such a common pattern that Python provides a shorthand notation for it, negative indexing, which counts backward from the end of the sequence.
The expression seq[-1] yields the last element, seq[-2] yields the second to last, and so on.
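The whole discussion can be sketched with a hypothetical 16-element list (the contents of seq are an assumption; only its length matters here):

```python
seq = list(range(16))     # assumed contents: 0, 1, ..., 15
count = len(seq)          # 16

try:
    seq[count]            # there is no index 16
    out_of_range = False
except IndexError:
    out_of_range = True

last = seq[count - 1]     # the last valid index is len(seq) - 1
same_last = seq[-1]       # negative indexing reaches the same element
second_to_last = seq[-2]
```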
for loop
A lot of computations involve processing a sequence one element at a time. The most common pattern is to start at the beginning, select each element in turn, do something to it, and continue until the end. This pattern of processing is called a traversal.
Python’s for
loop makes traversal easy to express:
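A minimal traversal might look like this sketch (the names and the message are made up for illustration):

```python
friends = ["Ada", "Grace", "Linus"]   # hypothetical list
invitations = []

# The loop variable 'friend' takes each element of the list in turn.
for friend in friends:
    invitations.append("Hi " + friend + ", please come to my party!")

print(invitations[0])  # Hi Ada, please come to my party!
```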
Note
We will discuss looping in greater detail in the next chapter. For now just note that the colon (:) at the end of the first line and the indentation on the second line are both required for this statement to be syntactically correct.
enumerate
As the standard for
loop traverses a sequence, it assigns each value in the sequence to the loop variable in the order it occurs in the sequence. Sometimes it is helpful to have both the value and the index of each element. The enumerate
function gives us this:
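For instance, with an illustrative list of fruit names:

```python
fruits = ["apples", "bananas", "cherries"]
labeled = []

# enumerate yields (index, value) pairs, unpacked into two loop variables.
for index, fruit in enumerate(fruits):
    labeled.append(str(index) + ": " + fruit)

print(labeled)  # ['0: apples', '1: bananas', '2: cherries']
```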
A subsequence of a sequence is called a slice and the operation that extracts a subsequence is called slicing. As with indexing, we use square brackets ([ ]) as the slice operator, but instead of one integer value inside we have two, separated by a colon (:):
If you omit the first index (before the colon), the slice starts at the beginning of the string. If you omit the second index, the slice goes to the end of the string. Thus:
What do you think s[:]
means? What about classmates[4:]
?
Negative indexes are also allowed, so
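Some slices to experiment with. The values of s and classmates below are assumptions chosen for illustration:

```python
s = "Pirates of the Caribbean"
classmates = ("Alejandro", "Ed", "Kathryn", "Presila", "Sean", "Peter")

first_word = s[0:7]                  # 'Pirates'
also_first = s[:7]                   # omitted start means "from the beginning"
last_word = s[15:]                   # omitted stop means "to the end"

middle_two = classmates[2:4]         # ('Kathryn', 'Presila')
last_two = classmates[-2:]           # ('Sean', 'Peter')
all_but_last_four = classmates[:-4]  # ('Alejandro', 'Ed')
```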
Tip
Developing a firm understanding of how slicing works is important. Keep creating your own “experiments” with sequences and slices until you can consistently predict the result of a slicing operation before you run it.
When you slice a sequence, the resulting subsequence always has the same type as the sequence from which it was derived. This is not generally true with indexing, except in the case of strings.
While the elements of a list (or tuple) can be of any type, no matter how you slice it, a slice of a list is a list.
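A quick check of this rule, using illustrative values:

```python
letters = ["a", "b", "c"]
numbers = (1, 2, 3)
word = "abc"

# Slicing preserves the sequence type...
slice_types = (type(letters[0:2]), type(numbers[0:2]), type(word[0:2]))

# ...while indexing returns whatever type the element has.
index_types = (type(letters[0]), type(numbers[0]), type(word[0]))

print(slice_types)  # list, tuple, str
print(index_types)  # str, int, str
```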
in operator
The in operator returns whether a given element is contained in a list or tuple:
in
works somewhat differently with strings. It evaluates to True
if one string is a substring of another:
Note that a string is a substring of itself, and the empty string is a substring of any other string. (Also note that computer programmers like to think about these edge cases quite carefully!)
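A sketch of both behaviors, with made-up values:

```python
stuff = ["this", "that", "these", "those"]

in_list = "this" in stuff              # True: an element of the list
not_in_list = "everything" in stuff    # False
in_tuple = 4 in (2, 4, 6, 8)           # True

# With strings, in tests for a substring rather than a single element.
substring = "app" in "apple"           # True
not_substring = "pa" in "apple"        # False
itself = "apple" in "apple"            # True: a string is a substring of itself
empty = "" in "apple"                  # True: the empty string is in every string
```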
Strings, lists, and tuples are objects, which means that they not only hold values, but have built-in behaviors called methods, that act on the values in the object.
Let’s look at some string methods in action to see how this works.
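The examples discussed in the following paragraphs can be sketched as below. The first, fourth, and fifth calls follow the text; the second, third, and the argument to strip are assumptions added for illustration:

```python
first = "apple".upper()           # 'APPLE'
second = "HELLO".lower()          # 'hello' (assumed example)
third = "python".capitalize()     # 'Python' (assumed example)
fourth = "42".isdigit()           # True: every character is a digit
fifth = "four".isdigit()          # False
sixth = "  padded \n".strip()     # 'padded' (assumed example)
```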
Now let’s learn to describe what we just saw. Each string in the above examples is followed by a dot operator, a method name, and a parameter list, which may be empty.
In the first example, the string 'apple' is followed by the dot operator and then the upper() method, which has an empty parameter list. We say that the upper() method is invoked on the string 'apple'. Invoking the method causes an action to take place using the value on which the method is invoked. The action produces a result, in this case the string value 'APPLE'. We say that the upper() method returns the string 'APPLE' when it is invoked on (or called on) the string 'apple'.
In the fourth example, the method isdigit()
(again with an empty parameter list) is invoked on the string '42'
. Since each of the characters in the string represents a digit, the isdigit()
method returns the boolean value True
. Invoking isdigit()
on 'four'
produces False
.
The strip() method removes leading and trailing whitespace.
dir() function and docstrings
The previous section introduced several of the methods of string objects. To find all the methods that strings have, we can use Python’s built-in dir function:
We will postpone talking about the ones that begin with double underscores (__
) until later. You can find out more about each of these methods by printing out their docstrings. To find out what the replace
method does, for example, we do this:
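A sketch of both steps. The exact list of names and the wording of the docstring vary between Python versions, so no output is shown here:

```python
# dir(str) lists every attribute of string objects; filter out the
# double-underscore names, which are discussed later.
string_methods = [name for name in dir(str) if not name.startswith("__")]
print(string_methods)

# Every method carries its documentation in the __doc__ attribute.
doc = str.replace.__doc__
print(doc)
```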
Using this information, we can try using the replace method to verify that we know how it works.
The first example replaces all occurrences of 'i' with 'X'. The second replaces the single character 'p' with the two characters 'MO'. The third example replaces the first two occurrences of 'i' with the empty string.
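Calls consistent with that description, assuming the string being modified is 'Mississippi' (an assumption; the original string is not shown here):

```python
word = "Mississippi"   # assumed input

all_i = word.replace("i", "X")       # every 'i' becomes 'X'
p_to_mo = word.replace("p", "MO")    # each 'p' becomes 'MO'
first_two = word.replace("i", "", 2) # the third argument limits the count

print(all_i)      # MXssXssXppX
print(p_to_mo)    # MississiMOMOi
print(first_two)  # Mssssippi
```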
count and index methods
There are two methods that are common to all three sequence types: count and index. Let’s look at their docstrings to see what they do.
We will explore these functions in the exercises.
Unlike strings and tuples, which are immutable objects, lists are mutable, which means we can change their elements. Using the bracket operator on the left side of an assignment, we can update one of the elements:
The bracket operator applied to a list can appear anywhere in an expression. When it appears on the left side of an assignment, it changes one of the elements in the list, so the first element of fruit has been changed from 'banana' to 'pear', and the last from 'quince' to 'orange'. An assignment to an element of a list is called item assignment. Item assignment does not work for strings:
but it does for lists:
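A sketch of both cases. The middle element of fruit and the string my_string are assumptions made for illustration:

```python
fruit = ["banana", "apple", "quince"]
fruit[0] = "pear"       # item assignment replaces the first element
fruit[-1] = "orange"    # and the last

my_string = "TEST"      # hypothetical string
try:
    my_string[2] = "X"  # strings are immutable
    message = ""
except TypeError as err:
    message = str(err)

print(fruit)    # ['pear', 'apple', 'orange']
print(message)  # 'str' object does not support item assignment
```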
With the slice operator we can update several elements at once:
We can also remove elements from a list by assigning the empty list to them:
And we can add elements to a list by squeezing them into an empty slice at the desired location:
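The three slice-assignment patterns above can be sketched on one illustrative list:

```python
letters = list("abcdef")

letters[1:3] = ["x", "y"]     # update several elements at once
updated = list(letters)       # ['a', 'x', 'y', 'd', 'e', 'f']

letters[1:3] = []             # assigning the empty list removes them
removed = list(letters)       # ['a', 'd', 'e', 'f']

letters[1:1] = ["b", "c"]     # an empty slice marks an insertion point
inserted = list(letters)      # ['a', 'b', 'c', 'd', 'e', 'f']
```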
Using slices to delete list elements can be awkward, and therefore error-prone. Python provides an alternative that is more readable.
del
removes an element from a list:
As you might expect, del
handles negative indices and causes a runtime error if the index is out of range.
You can use a slice as an index for del
:
As usual, slices select all the elements up to, but not including, the second index.
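A short sketch of del with a single index, a negative index, an out-of-range index, and a slice (values are illustrative):

```python
a = ["one", "two", "three"]
del a[1]          # a is now ['one', 'three']
del a[-1]         # negative indices work too: a is now ['one']

try:
    del a[5]      # out of range
    raised = False
except IndexError:
    raised = True

b = list("abcdef")
del b[1:5]        # removes indices 1 through 4: b is now ['a', 'f']
```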
In addition to count
and index
, lists have several useful methods. Since lists are mutable, these methods modify the list on which they are invoked, rather than returning a new list.
The sort
method is particularly useful, since it makes it easy to use Python to sort data that you have put in a list.
If we execute these assignment statements,
we know that the names a
and b
will refer to a list with the numbers 1
, 2
, and 3
. But we don’t know yet whether they point to the same list.
In one case, a
and b
refer to two different things that have the same value. In the second case, they refer to the same object.
We can test whether two names have the same value using ==
:
We can test whether two names refer to the same object using the is operator:
This tells us that a and b do not refer to the same object, and that it is the first of the two state diagrams that describes the relationship.
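The two tests can be sketched as follows, assuming a and b were each assigned a separate list literal:

```python
a = [1, 2, 3]
b = [1, 2, 3]

same_value = a == b    # True: the lists hold equal elements
same_object = a is b   # False: they are two distinct list objects

print(same_value, same_object)  # True False
```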
Since variables refer to objects, if we assign one variable to another, both variables refer to the same object:
In this case, it is the second of the two state diagrams that describes the relationship between the variables.
Because the same list has two different names, a
and b
, we say that it is aliased. Since lists are mutable, changes made with one alias affect the other:
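A minimal sketch of aliasing, with illustrative values:

```python
a = [1, 2, 3]
b = a            # b is an alias: both names refer to the same list

b[0] = 5         # a change made through b...
print(a)         # [5, 2, 3]  ...is visible through a
print(a is b)    # True
```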
Although this behavior can be useful, it is sometimes unexpected or undesirable. In general, it is safer to avoid aliasing when you are working with mutable objects. Of course, for immutable objects, there’s no problem, since they can’t be changed after they are created.
If we want to modify a list and also keep a copy of the original, we need to be able to make a copy of the list itself, not just the reference. This process is sometimes called cloning, to avoid the ambiguity of the word copy.
The easiest way to clone a list is to use the slice operator:
Taking any slice of a
creates a new list. In this case the slice happens to consist of the whole list.
Now we are free to make changes to b
without worrying about a
:
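For example, with illustrative values:

```python
a = [1, 2, 3]
b = a[:]         # a slice of the whole list creates a new list

b[0] = 5         # changing the clone...
print(a)         # [1, 2, 3]  ...leaves the original alone
print(b)         # [5, 2, 3]
```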
A nested list is a list that appears as an element in another list. In this list, the element with index 3 is a nested list:
If we print nested[3]
, we get [10, 20]
. To extract an element from the nested list, we can proceed in two steps:
Or we can combine them:
Bracket operators evaluate from left to right, so this expression gets the three-eth element of nested
and extracts the one-eth element from it.
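A sketch of both approaches. The first three elements of nested are assumptions; only the nested list [10, 20] at index 3 comes from the text:

```python
nested = ["hello", 2.0, 5, [10, 20]]

# Two steps: first fetch the inner list, then index into it.
elem = nested[3]
first_inner = elem[0]       # 10

# One step: brackets are applied left to right.
second_inner = nested[3][1] # 20
```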
Python has several tools which combine lists of strings into strings and separate strings into lists of strings.
The list
command takes a sequence type as an argument and creates a list out of its elements. When applied to a string, you get a list of characters.
The split method is invoked on a string and separates the string into a list of strings, breaking it apart wherever a substring called the delimiter occurs. The default delimiter is whitespace, which includes spaces, tabs, and newlines.
Here we have 'o'
as the delimiter.
Notice that the delimiter doesn’t appear in the list.
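A sketch of list and split on made-up strings:

```python
chars = list("abc")                    # ['a', 'b', 'c']

# With no argument, split breaks on any run of whitespace.
words = "a b\tc\nd".split()            # ['a', 'b', 'c', 'd']

# With 'o' as the delimiter, the 'o' characters themselves disappear.
pieces = "hello world".split("o")      # ['hell', ' w', 'rld']
```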
The join method does approximately the opposite of the split method. It takes a list of strings as an argument and returns a string of all the list elements joined together.
The string value on which the join
method is invoked acts as a separator that gets placed between each element in the list in the returned string.
The separator can also be the empty string.
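For instance, with an illustrative list of words:

```python
words = ["crunchy", "raw", "unboned"]

spaced = " ".join(words)    # 'crunchy raw unboned'
starred = "**".join(words)  # 'crunchy**raw**unboned'
fused = "".join(words)      # 'crunchyrawunboned'
```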
Once in a while, it is useful to swap the values of two variables. With conventional assignment statements, we have to use a temporary variable. For example, to swap a
and b
:
If we have to do this often, this approach becomes cumbersome. Python provides a form of tuple assignment that solves this problem neatly:
The left side is a tuple of variables; the right side is a tuple of values. Each value is assigned to its respective variable. All the expressions on the right side are evaluated before any of the assignments. This feature makes tuple assignment quite versatile.
Naturally, the number of variables on the left and the number of values on the right have to be the same:
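A sketch comparing the two ways of swapping, plus the error raised when the two sides do not match (values are illustrative):

```python
a, b = 1, 2

# Conventional swap with a temporary variable.
temp = a
a = b
b = temp
after_temp_swap = (a, b)     # (2, 1)

# Tuple assignment: the right side is evaluated first, then assigned.
a, b = b, a
after_tuple_swap = (a, b)    # (1, 2)

try:
    c, d = 1, 2, 3           # three values, two variables
    mismatch = False
except ValueError:
    mismatch = True
```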
We will now look at a new type of value - boolean values - named after the British mathematician, George Boole. He created the mathematics we call Boolean algebra, which is the basis of all modern computer arithmetic.
Note
It is a computer’s ability to alter its flow of execution depending on whether a boolean value is true or false that makes a general purpose computer more than just a calculator.
There are only two boolean values, True
and False
.
Capitalization is important, since true and false are not boolean values in Python:
A boolean expression is an expression that evaluates to a boolean value.
The operator ==
compares two values and produces a boolean value:
In the first statement, the two operands are equal, so the expression evaluates to True
; in the second statement, 5 is not equal to 6, so we get False
.
The ==
operator is one of six common comparison operators; the others are:
Although these operations are probably familiar to you, the Python symbols are different from the mathematical symbols. A common error is to use a single equal sign (=) instead of a double equal sign (==). Remember that = is an assignment operator and == is a comparison operator. Also, there is no such thing as =< or =>.
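All six comparison operators in action, using illustrative values for x and y:

```python
x, y = 5, 6

equal = x == y              # False
not_equal = x != y          # True
greater = x > y             # False
less = x < y                # True
greater_or_equal = x >= y   # False
less_or_equal = x <= y      # True
```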
There are three logical operators: and
, or
, and not
. The semantics (meaning) of these operators is similar to their meaning in English. For example, x > 0 and x < 10
is true only if x
is greater than 0 and at the same time, x is less than 10.
n % 2 == 0 or n % 3 == 0
is true if either of the conditions is true, that is, if the number is divisible by 2 or divisible by 3.
Finally, the not operator negates a boolean expression, so not (x > y) is true if (x > y) is false, that is, if x is less than or equal to y.
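The three expressions above, evaluated for some illustrative values:

```python
x, y, n = 5, 7, 9

in_range = x > 0 and x < 10            # True: both conditions hold
divisible = n % 2 == 0 or n % 3 == 0   # True: 9 is divisible by 3
not_greater = not (x > y)              # True: 5 is less than or equal to 7
```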
Boolean expressions in Python use short-circuit evaluation, which means only the first argument of an and or or expression is evaluated when its value is sufficient to determine the value of the entire expression.
This can be quite useful in preventing runtime errors. Imagine you want to check if the fifth number in a tuple of integers named numbers is even.
The following expression will work:
unless of course there are not 5 elements in numbers
, in which case you will get:
Short-circuit evaluation makes it possible to avoid this problem.
Since the left hand side of this and
expression is false, Python does not need to evaluate the right hand side to determine that the whole expression is false. Since it uses short-circuit evaluation, it does not, and the runtime error is avoided.
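A sketch of the guarded check. The helper name fifth_is_even and the sample tuples are hypothetical, introduced only for this illustration:

```python
def fifth_is_even(numbers):
    # The length test runs first; if it is false, numbers[4] is never evaluated.
    return len(numbers) >= 5 and numbers[4] % 2 == 0

too_short = (1, 2, 3)
long_enough = (1, 3, 5, 7, 10)

try:
    too_short[4] % 2 == 0      # the unguarded expression
    unguarded_failed = False
except IndexError:
    unguarded_failed = True

print(fifth_is_even(too_short))    # False, and no error
print(fifth_is_even(long_enough))  # True
```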
All Python values have a “truthiness” or “falsiness”, which means they can be used in places requiring a boolean. For the numeric and sequence types we have seen thus far, truthiness is defined as follows:
numeric types
Values equal to 0 are false, all others are true.
sequence types
Empty sequences are false, non-empty sequences are true.
Combining this notion of truthiness with an understanding of short-circuit evaluation makes it possible to understand what Python is doing in the following expressions:
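A few such expressions, chosen here for illustration. Note that and and or return one of their operands, not necessarily True or False:

```python
r1 = "a" and "b"       # 'b': 'a' is truthy, so the result is the second operand
r2 = "" and "b"        # '': the empty string is falsy, so evaluation stops there
r3 = 0 or "fallback"   # 'fallback': 0 is falsy, so the second operand is returned
r4 = [] or [1]         # [1]: the empty list is falsy
r5 = not ()            # True: the empty tuple is falsy
```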
aliases
Multiple variables that contain references to the same object.
boolean value
There are exactly two boolean values: True and False. Boolean values result when a boolean expression is evaluated by the Python interpreter. They have type bool.
boolean expression
An expression that is either true or false.
clone
To create a new object that has the same value as an existing object. Copying a reference to an object creates an alias but doesn’t clone the object.
comparison operator
One of the operators that compares two values: ==, !=, >, <, >=, and <=.
compound data type
A data type in which the values are made up of components, or elements, that are themselves values.
element
One of the parts that make up a sequence type (string, list, or tuple). Elements have a value and an index. The value is accessed by using the index operator ([index]) on the sequence.
immutable data type
A data type which cannot be modified. Assignments to elements or slices of immutable types cause a runtime error.
index
A variable or value used to select a member of an ordered collection, such as a character from a string, or an element from a list or tuple.
logical operator
One of the operators that combines boolean expressions: and, or, and not.
mutable data type
A data type which can be modified. All mutable types are compound types. Lists and dictionaries are mutable data types; strings and tuples are not.
nested list
A list that is an element of another list.
slice
A part of a string (substring) specified by a range of indices. More generally, a subsequence of any sequence type in Python can be created using the slice operator (sequence[start:stop]).
step size
The interval between successive elements of a linear sequence. The third (and optional) argument to the range function is called the step size. If not specified, it defaults to 1.
traverse
To iterate through the elements of a collection, performing a similar operation on each.
tuple
A data type that contains a sequence of elements of any type, like a list, but is immutable. Tuples can be used wherever an immutable type is required, such as a key in a dictionary (see next chapter).
tuple assignment
An assignment to all of the elements in a tuple using a single assignment statement. Tuple assignment occurs in parallel rather than in sequence, making it useful for swapping values.
The operator [n:m]
returns the part of the sequence from the n’th element to the m’th element, including the first but excluding the last. This behavior is counter-intuitive; it makes more sense if you imagine the indices pointing between the characters, as in the following diagram: