Variable Scoping in Python 2
I encountered a kind of variable scoping that I did not expect while working on this pull request for the Poetry core packaging module. Poetry still supports Python 2.7 and several tests failed on Python 2.7 while I was working on it. However, all tests were passing on Python 3.5 and above.
The tests failed because of an AttributeError:
    for file in include.elements:  # type: List[Path]
        # omitted for brevity
        if file.is_dir():
            if self.format in formats:
                for f in file.glob("**/*"):  # type: List[Path]
                    rel_path = f.relative_to(self._path)
                    if (
                        rel_path not in set([f.path for f in to_add])
>                       and not f.is_dir()
                        and not self.is_excluded(rel_path)
                    ):
    AttributeError: BuildIncludeFile instance has no attribute 'is_dir'
As you can see on the first line, the type of the for-loop variable is a Path. The Path class has a useful method, Path.is_dir(), to check if it is a directory.
The inner for-loop traverses the directory’s Path objects with an ill-named variable called f. In this for-loop, we check that the file's relative path, rel_path, is not already in the list to_add, that f is not a directory, and that the file is not excluded. to_add is a list of BuildIncludeFile objects, which themselves have the following attributes: path, parent, relative_path and resolve.
I used a list comprehension to turn the paths in to_add into a set [1] and check if the traversed file is already in to_add. In this list comprehension, [f.path for f in to_add], I also used f as a variable name. This was my unfortunate mistake.
The f in the list comprehension (a BuildIncludeFile) does more than shadow the f of the outer for-loop (a Path): in Python 2.7, a list comprehension's loop variable is bound in the enclosing scope, so it overwrites the outer f. Once the list comprehension has finished iterating, f refers to the last BuildIncludeFile instance in to_add instead of the traversed Path from the outer for-loop. The second condition in the if-statement, not f.is_dir(), therefore ran against a BuildIncludeFile, which has no is_dir() method, and raised the AttributeError.
A much simpler example of this phenomenon in Python 2 is the following [2]:
>>> [x for x in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(x) # Python 3 will throw a `NameError` on the other hand.
9
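One way to picture the Python 2 behaviour is to read the list comprehension as an ordinary for-loop, whose variable stays bound after the loop ends in both Python 2 and Python 3. The sketch below is only an illustration of that reading: FakeIncludeFile is a hypothetical stand-in for BuildIncludeFile, and the paths are made up.

```python
from pathlib import Path


class FakeIncludeFile(object):
    """Hypothetical stand-in for BuildIncludeFile: has a path, but no is_dir()."""

    def __init__(self, path):
        self.path = Path(path)


to_add = [FakeIncludeFile("pkg/a.py"), FakeIncludeFile("pkg/b.py")]

f = Path("pkg/c.py")  # plays the role of the outer loop variable

# Roughly how Python 2 treated [f.path for f in to_add]:
# the loop variable f is rebound in the enclosing scope.
paths = []
for f in to_add:
    paths.append(f.path)

# f no longer refers to the Path; it is the last element of to_add.
leaked_type = type(f).__name__
print(leaked_type)
print(hasattr(f, "is_dir"))
```

After the loop, any call to f.is_dir() would fail in the same way as the traceback above, because f is no longer a Path.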
The fix for this variable scoping issue was easy. Renaming the variable in the outer scope from f to something more descriptive like current_file prevented any kind of unexpected scoping behaviour.
Another possible solution is to use a generator expression, since a generator expression is evaluated in its own function-like scope and its loop variable does not leak into the enclosing scope [3] [4].
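A minimal sketch of the generator-expression variant follows. The strings are hypothetical stand-ins for the BuildIncludeFile.path values and for the traversed file; the point is only that the outer f keeps its value.

```python
# Hypothetical stand-ins for the path values stored in to_add.
to_add_paths = ["a.py", "b.py"]

f = "c.py"  # outer variable that must survive the membership check

# The generator expression has its own scope, so its loop variable f
# does not rebind the outer f (in Python 2 as well as Python 3).
seen = set(f for f in to_add_paths)

is_new = f not in seen
print(f, is_new)
```

Reusing the name f here is still confusing to read, so renaming remains the clearer fix; the sketch only shows that the scoping problem itself goes away.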
Python 3 gives a list comprehension its own scope, so its loop variable shadows an outer variable of the same name only while the comprehension runs. According to Guido van Rossum, the list comprehension’s variables “leaked” into the outer scope in Python 2 because the original implementation was “an intentional compromise to make list comprehensions blindingly fast”. Guido called this one of Python’s dirty little secrets.
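The difference between the two versions can be probed with a small helper. This is a sketch, and comprehension_leaks is a hypothetical name: on Python 3 it returns False, while on Python 2.7 the same code would be expected to return True.

```python
def comprehension_leaks():
    """Return True if a list comprehension's loop variable is still bound afterwards."""
    [leaky for leaky in range(3)]
    try:
        leaky  # on Python 3 this name was never bound in this function
    except NameError:
        return False
    return True


print(comprehension_leaks())
```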
It’s often said in the halls of colleges and between ping pong tables in startups that “there are only two hard things in Computer Science: cache invalidation and naming things” [5]. I never thought that the second of these two hard things, naming, would creep up on me as a variable in a list comprehension. 🤦‍♀️
[1] I made to_add into a set because I only needed the BuildIncludeFile.path and not the entire class itself. Had I not accessed just the paths of the BuildIncludeFile objects, then checking for membership of the file in a simple set(to_add) would not have worked since they are of different types: Path and BuildIncludeFile.
[2] Thanks to Alex Louden for proofreading and providing this example.
[3] https://docs.python.org/2/reference/executionmodel.html#naming-and-binding
[4] https://docs.python.org/3.8/reference/executionmodel.html#resolution-of-names
[5] A quote by Phil Karlton. Martin Fowler has an entertaining and very short blog post about it.