Welcome to part 6 of the intermediate Python programming tutorial series. In this part, we're going to talk about the timeit module.
The idea of the timeit module is to be able to test snippets of code. In our previous tutorial, we talked about list comprehension and generators, and explained the difference between the two (speed vs memory). Using the timeit module, I will illustrate this.
Many times on forums and such, people will ask which method is faster in some scenario, and the answer is always the same: test it. In this case, let's test how quickly we can build a list of the numbers in range(100) that are divisible by five. The actual code for a generator:
    input_list = range(100)

    def div_by_five(num):
        if num % 5 == 0:
            return True
        else:
            return False

    # generator, converted to list.
    xyz = list(i for i in input_list if div_by_five(i))
The code for list comprehension:
    input_list = range(100)

    def div_by_five(num):
        if num % 5 == 0:
            return True
        else:
            return False

    # list comprehension:
    xyz = [i for i in input_list if div_by_five(i)]
In these cases, the generator is only taking part in the calculation of whether or not a number is divisible by five, since, at the end, we are creating a list either way. We're only doing this to exemplify a generator vs list comprehension. Now, to test this code, we can use timeit.
A quick example:
    import timeit

    print(timeit.timeit('1+3', number=500000))
Output:
0.006161588492280894
This tells us how long, in seconds, it took to run 500,000 iterations of 1+3. We can also use the timeit module against multiple lines of code, this time with a test for even numbers:
Our generator:
    print(timeit.timeit('''input_list = range(100)

    def div_by_two(num):
        if (num/2).is_integer():
            return True
        else:
            return False

    # generator:
    xyz = list(i for i in input_list if div_by_two(i))
    ''', number=50000))
List comprehension:
    print(timeit.timeit('''input_list = range(100)

    def div_by_two(num):
        if (num/2).is_integer():
            return True
        else:
            return False

    # list comprehension:
    xyz = [i for i in input_list if div_by_two(i)]''', number=50000))
The generator: 1.2544767654206526
List comprehension: 1.1799026390586294
Fairly close, but if we increase the stakes, and do maybe a range(500):
The generator: 6.2863801209022245
List comprehension: 5.917454497778153
Now, these are pretty close, with the list comprehension slightly ahead, so, as you might guess, it would take a huge range for the generator's memory savings to make it preferable. That said, what if we can leave things in generator form?
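A side note on the measurements above: because the whole snippet is passed as the statement, the function definition and the range() call are re-run on every iteration and count toward the time. Here is a minimal sketch of one way to separate them, using the `setup` argument and `timeit.repeat` from the standard library. The names `SETUP` and `time_stmt`, and the smaller `number`, are choices made for this illustration and are not part of the original test:

```python
import timeit

# Setup code runs once per timing run and is not included in the measurement.
SETUP = '''
input_list = range(100)

def div_by_two(num):
    if (num/2).is_integer():
        return True
    else:
        return False
'''

def time_stmt(stmt, number=5000, repeat=3):
    """Return the lowest total time over `repeat` runs of `number` loops each."""
    return min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))

gen_time = time_stmt('xyz = list(i for i in input_list if div_by_two(i))')
list_time = time_stmt('xyz = [i for i in input_list if div_by_two(i)]')

print('The generator:', gen_time)
print('List comprehension:', list_time)
```

Taking the minimum of a few repeats is a common way to reduce noise from whatever else the machine is doing. The exact numbers will differ from machine to machine, so treat them as relative, not absolute.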
It's a common thought process as a scripter to think one line at a time, but what are your next steps going to be? Might it be possible to stay as a generator, and continue your operations as a generator? Do you ever need to access the list as a whole? If not, you might want to stay as a generator. Let's illustrate why!
Rather than converting the generator to a list at the end like this: xyz = list(i for i in input_list if div_by_two(i)), let's leave it as a generator (just delete list)!
Run it again:
The generator: 0.0343689859103488
List comprehension: 5.898960074639096
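Oh my! The generator blew the list comprehension out of the water. Keep in mind what was measured, though: creating a generator object does none of the filtering up front, so this timing covers only building the generator, not producing its values. But didn't we need to convert the generator to a list to see the values? Before it was just a generator object! Nope. Remember the for i in range(50)? range() is not technically a generator, but it is lazy in the same way: it hands out values one at a time as we iterate through it, and a generator works the same. Thus, we can do: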
    input_list = range(500)

    def div_by_two(num):
        if (num/2).is_integer():
            return True
        else:
            return False

    # generator:
    xyz = (i for i in input_list if div_by_two(i))

    for i in xyz:
        print(i)
At no point did we need to load the entire "list" of the even numbers into memory. It's all generated on demand, and it's just a generator object until we do something with it. We can also do:
    input_list = range(500)

    def div_by_two(num):
        if (num/2).is_integer():
            return True
        else:
            return False

    for i in (i for i in input_list if div_by_two(i)):
        print(i)
Boom.
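To see why the unconverted generator timed so quickly, it can help to count how often the test function actually runs. The sketch below is only an illustration: the `calls` counter and `counting_div_by_two` are names invented here, not part of the original code. It suggests that building the generator expression runs nothing, and that the work happens as values are pulled out.

```python
calls = 0  # hypothetical counter, only for this demonstration

def counting_div_by_two(num):
    """Same test as div_by_two, but records how many times it was called."""
    global calls
    calls += 1
    return (num/2).is_integer()

input_list = range(500)

# Creating the generator does not call the function at all.
xyz = (i for i in input_list if counting_div_by_two(i))
calls_after_creation = calls
print(calls_after_creation)  # 0

# Pulling one value runs the test just far enough to find it.
first = next(xyz)
calls_after_first = calls
print(first, calls_after_first)  # 0 1

# Exhausting the generator does the rest of the work.
remaining = list(xyz)
print(len(remaining), calls)  # 249 500
```

So a generator does not make the filtering itself cheaper. It defers the work until the values are needed, and skips it entirely for values you never ask for.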
Thus, you really only need to be using lists IF you need to be able to access the entire list at once (indexing, taking its length, or iterating over it more than once); otherwise you should *probably* be using a generator. Yes, iterating over a list that already exists in memory is, in theory, faster than generating each value on the fly, BUT that only helps if you aren't paying to build a new list first. Building lists is expensive! Yay for generators.
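As a rough illustration of the memory side of that trade-off, the sketch below compares the size of a list with the size of a generator object using `sys.getsizeof`, then chains a second generator onto the first without ever building a list. The larger range and the `squares` step are assumptions added for this example. Note that `sys.getsizeof` reports only the container's own size, not the objects it refers to.

```python
import sys

def div_by_two(num):
    # Shortened form of the same even-number test used above.
    return (num/2).is_integer()

input_list = range(100000)

as_list = [i for i in input_list if div_by_two(i)]
as_gen = (i for i in input_list if div_by_two(i))

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print('List:', list_size, 'bytes')
print('Generator:', gen_size, 'bytes')

# Keep working lazily: this second generator consumes the first one
# value at a time, so no intermediate list is ever created.
squares = (i * i for i in as_gen)
total = sum(squares)
print(total)
```

The generator's size stays small no matter how large the range gets, while the list grows with the number of items it holds. The catch is that a generator can only be walked through once, so this fits best when each value is needed a single time.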
Alright, let's move along and talk about enumerate, a built-in function that has existed for a very long time, but is often not used when it should be!