The following post contains a summary of the book titled Python Workout by Reuven M. Lerner

Chapter 1

  • randint selects numbers inclusive of both the min and max range provided as arguments

    import random
    x = [random.randint(1,10) for i in range(100)]
    print(max(x))
    
    10
    
  • In Python 2 it wasn’t an error to compare objects of different types. It would compare by type and then within type. In Python 3 this has been removed, and an ordering comparison such as 1 < 'a' raises TypeError.

  • f strings are superuseful in string formatting

  • walrus operator := (Python 3.8+) assigns a value to a name as part of an expression

  • splat operator helps one to pass multiple arguments in to a function

  • Never knew that 0.1+0.2 in Python gives '0.30000000000000004'

  • It is better to use f strings to display floating numbers - for example f'{x:.2f}'

  • “decimal” module in the standard library does arithmetic the way humans do, in base 10

  • Learned how to convert a hexadecimal number to base 10 number

  • Learned that hexadecimal is so called because hex (6) + decimal (10) gives 16 digit values, 0 to 15
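
The number-related notes above can be checked with a short sketch. The variable names are my own, and this only illustrates the points about floats, f-strings, Decimal and hexadecimal conversion; it is not code from the book.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum is slightly off.
total = 0.1 + 0.2
print(total)             # 0.30000000000000004

# An f-string format spec hides the noise when displaying the value.
shown = f"{total:.2f}"
print(shown)             # 0.30

# Decimal values built from strings keep exact decimal digits.
exact = Decimal("0.1") + Decimal("0.2")
print(exact)             # 0.3

# int() with base 16 converts a hexadecimal string to a base-10 integer.
from_hex = int("ff", 16)
print(from_hex)          # 255
```

Building Decimal from a string rather than from a float matters here, since Decimal(0.1) would carry over the float's inexact value.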

Chapter 2

  • There is no character type in Python 3
  • In Python 3, the strings contain unicode characters
  • “Pig Latin”: Never knew such a thing existed
  • print function takes an end argument in which we can pass any character
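  • There are 1,114,112 possible Unicode code points, numbered 0 through 1114111 (0x10FFFF); only a fraction of them have characters assigned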
  • Learned about itemgetter and attrgetter from operator module
  • Decorate-Sort-Undecorate method for sorting elements in an array
  • Unicode lecture at Pycon - https://nedbatchelder.com/text/unipain.html
  • Sorting method has a key argument and a reverse argument
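
A small sketch of the sorting notes above, using made-up sample data. The names people, Person and words are hypothetical, and the Decorate-Sort-Undecorate part is only one simple way to express that idea.

```python
import sys
from collections import namedtuple
from operator import attrgetter, itemgetter

people = [("Ada", 36), ("Linus", 28), ("Grace", 45)]

# key=itemgetter(1) sorts by the second element of each tuple (the age).
by_age = sorted(people, key=itemgetter(1))

# reverse=True flips the resulting order.
oldest_first = sorted(people, key=itemgetter(1), reverse=True)

# attrgetter does the same job for attributes instead of indexes.
Person = namedtuple("Person", ["name", "age"])
staff = [Person(*p) for p in people]
by_name = sorted(staff, key=attrgetter("name"))

# Decorate-Sort-Undecorate: pair each word with its sort value, sort, then unwrap.
words = ["pear", "fig", "banana"]
decorated = [(len(word), word) for word in words]
decorated.sort()
by_length = [word for _, word in decorated]

# print's end argument replaces the default trailing newline.
print("no newline here", end=" | ")
print("same line")

# The largest Unicode code point that chr() accepts.
print(sys.maxunicode)    # 1114111
```

In everyday code the key argument does the decorate and undecorate steps for you, so sorted(words, key=len) gives the same by_length result.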

Chapter 3

  • Slice of a list/str/tuple gives back a list/str/tuple
  • When retrieving via slice, Python allows you to go beyond the bounds
  • Data is strongly typed but variables don’t have types
  • Learned the importance of slicing
  • Python lists are implemented as arrays of pointers. Python allocates some buffer space so that one can add a few elements to it on a dynamic basis. In its background, Python keeps a tab on the buffer space and allocates new memory as and when required
  • Worked on replicating zip function in python
  • Learned about itemgetter in the operator module
  • How does one verbalize a lambda function?
  • How can itemgetter replace a lambda function?
  • Use the collections module, whose Counter class can summarize the contents of a collection
  • max function also takes a key argument that specifies the criterion for comparing the elements of the iterable
  • f-strings (formatted string literals) are super useful for printing stuff
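
Here is a rough sketch of a few of the points above: slices that run past the end, a replica of zip, and max with a key. The function my_zip is my own simplified version (it assumes its inputs support len and indexing), not the book's solution.

```python
items = [10, 20, 30]

# A slice may run past the end without raising an error.
tail = items[1:100]          # [20, 30]
empty = "abc"[5:]            # ''


def my_zip(*sequences):
    """Simplified stand-in for zip: stops at the shortest sequence."""
    if not sequences:
        return []
    shortest = min(len(seq) for seq in sequences)
    return [tuple(seq[i] for seq in sequences) for i in range(shortest)]


pairs = my_zip("abc", [1, 2, 3, 4])   # [('a', 1), ('b', 2), ('c', 3)]

# max compares the elements by whatever the key function returns.
words = ["pear", "fig", "banana"]
longest = max(words, key=len)          # 'banana'

print(tail, repr(empty), pairs, longest)
```

Note that my_zip takes its inputs through the splat operator, so they arrive as a tuple named sequences.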

Chapter 4

  • Much of the Python language is itself implemented using dicts
  • When one creates a dictionary in Python using {'a':12}, Python computes the hash of the key 'a' and uses it to decide where to store the value 12.
  • Most important properties of Python dicts
    • always store key-value pairs together
    • guarantee very fast lookup for keys
    • ensure key uniqueness
    • make no guarantees about the values that are stored
  • sets are dicts without values
  • UTC is a compromise between the English abbreviation for Coordinated Universal Time (CUT) and the French abbreviation (TUC)
  • The actual time zones, GMT for example, are defined as offsets from UTC
  • Python time representation as a continuum
    • float -> tuples -> structs -> string
    • human readable and machine operability go against each other
  • Python time as a tuple with nine fields
  • Fun Fact: If you’re a native English speaker, you might be wondering why the abbreviation for “Coordinated Universal Time” is UTC rather than the more obvious CUT. However, if you’re a native French speaker, you would call it “Temps Universel Coordonné,” which suggests a different abbreviation: TUC.
  • To format a string, one uses directives
  • time.struct_time can be converted to string using asctime and strftime
    • the latter is more flexible
  • strptime converts a string to a time object
  • time.perf_counter can be used to measure time between operations
  • time.ctime gives a string representation of time
  • time.time gives the number of seconds since epoch. It takes in to account fractional seconds
  • time.gmtime and time.localtime are ways to convert time in seconds to struct_time
  • dict.keys has several methods similar to sets
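
The time functions listed above fit together roughly as in this sketch. The format strings and sample values are my own choices, picked only to show each conversion.

```python
import time

# Seconds since the epoch, as a float with a fractional part.
now = time.time()

# Seconds -> struct_time, in UTC or in the local time zone.
epoch_utc = time.gmtime(0)
local_now = time.localtime(now)

# A struct_time behaves like a tuple with nine fields.
field_count = len(epoch_utc)

# struct_time -> string: strftime takes directives, asctime uses a fixed layout.
formatted = time.strftime("%Y-%m-%d %H:%M:%S", epoch_utc)
fixed_layout = time.asctime(epoch_utc)

# String -> struct_time, using the same kind of directives.
parsed = time.strptime("2021-03-04", "%Y-%m-%d")

# perf_counter is meant for measuring elapsed time between two points.
start = time.perf_counter()
sum(range(100_000))
elapsed = time.perf_counter() - start

# dict.keys() returns a view that supports set operations.
first = {"a": 1, "b": 2}
second = {"b": 3, "c": 4}
common = first.keys() & second.keys()

print(formatted, fixed_layout, parsed.tm_year, common)
```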

Chapter 5

  • with is not meant only for files. It works with a context manager, which can be any Python class that has an __enter__ method and an __exit__ method
  • When with is used with a file, the __enter__ method merely returns the file object. When all the operations are done, the __exit__ method flushes and closes the file
  • It is better to use glob because it can retrieve filenames that match a specific pattern. os.listdir cannot take a pattern; it returns every entry in the given directory
  • pathlib is another way to access objects in a directory
  • since paths are not strings, the functionality to handle paths is spread across several modules such as glob, shutil and os
  • This chapter introduced me to pathlib and I was super impressed with the functionality that I am going to use it for all my tasks that deal with going through files and directories
  • pathlib is powerful because
    • it directly represents the underlying object
    • it has super useful functions that make it easy to perform many operations on files and directories
    • it is consistent across several operating systems
  • json.load reads a whole JSON document from an open file object into Python objects, for example a list of dicts
  • with can manage more than one object in a single statement
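
As a sketch of the notes above, here is a tiny context manager plus some pathlib, glob-style matching and json.load in a temporary directory. The class Announce and the file names are hypothetical and exist only for this illustration.

```python
import json
import tempfile
from pathlib import Path


class Announce:
    """Hypothetical context manager that records when its block starts and ends."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("exit")
        return False          # do not swallow exceptions


log = []
with Announce(log):
    log.append("body")
print(log)                    # ['enter', 'body', 'exit']

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # The / operator joins path components on any operating system.
    (folder / "data.json").write_text(json.dumps([{"id": 1}, {"id": 2}]))
    (folder / "notes.txt").write_text("hello")

    # Path.glob filters by pattern, unlike os.listdir.
    json_files = sorted(p.name for p in folder.glob("*.json"))

    # One with statement can manage more than one object.
    with open(folder / "data.json") as src, open(folder / "copy.json", "w") as dst:
        records = json.load(src)
        json.dump(records, dst)

print(json_files, records)
```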

Chapter 6

  • __code__ attribute contains the core of the function, including the bytecodes in to which the function was compiled
  • .__code__.co_argcount gives the number of arguments that the function consumes
  • Do not use mutable values as default arguments in Python functions. It is better to make the default None and create the object inside the function
  • A default value is created once, when the function is defined. If it is mutable, any change made to it during one call is still there in the next call, so the default appears to change across multiple invocations of the function
  • Python has four levels of scoping(LEGB):
    • Local
    • Enclosing function
    • Global
    • Built-ins
  • If you are in a function, all four scopes are searched
  • The way Python looks for a variable is
    • It first looks for the variable in the local scope
    • It then looks for the variable in the enclosing function scope
    • It then looks for the variable in the global name space
    • It then looks for the variable in the built-ins
  • nonlocal tells python that any update to the variable should be done to the variable in the immediate outside scope of the function
  • split takes a maxsplit argument that helps you unpack values according to predefined splits
  • one can use operator module to get a unified interface for most of mathematical functions
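
The function-related notes above can be tried out with this sketch. All the function names (append_shared, append_fresh, make_counter) are invented for the example.

```python
import operator


def append_shared(item, bucket=[]):
    # The default list is created once, at definition time, and then reused.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # None as the default lets every call build its own list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = list(append_shared(1))     # copy of the default: [1]
shared_second = list(append_shared(2))    # [1, 2], the default kept the 1
fresh_first = append_fresh(1)             # [1]
fresh_second = append_fresh(2)            # [2]


def make_counter():
    count = 0                             # lives in the enclosing scope

    def increment():
        nonlocal count                    # rebind the enclosing variable
        count += 1
        return count

    return increment


counter = make_counter()
counter()
second_count = counter()                  # 2

# Number of named positional parameters the function accepts.
arg_count = append_fresh.__code__.co_argcount

# maxsplit=1 splits only once, so unpacking into two names is safe.
key, rest = "a:b:c:d".split(":", maxsplit=1)

# The operator module exposes + as an ordinary function.
total = operator.add(2, 3)

print(shared_first, shared_second, fresh_first, fresh_second)
print(second_count, arg_count, key, rest, total)
```

Without the nonlocal line, count += 1 would raise UnboundLocalError, because the assignment would make count local to increment.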

Chapter 7

  • the syntax for a list comprehension and a generator expression looks similar. Wherever possible, use a generator
  • if you want to read a binary file, you need to specify binary mode; an encoding applies only to files opened in text mode
  • if you open the file in ‘rb’ mode, then Python returns raw bytes and does not attempt to decode the contents into a string
  • if you have two sets a and b, then you can check whether a is a subset of b via a<=b; a<b checks for a proper subset, i.e. a subset that is not equal to b
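
A quick sketch of the three ideas above: generator expression versus list comprehension, text mode versus ‘rb’ mode, and the subset operators. The file name sample.bin is made up, and the file is written to a temporary directory.

```python
import os
import tempfile

# Same syntax apart from the brackets; the generator produces values lazily.
squares_list = [n * n for n in range(5)]
squares_gen = (n * n for n in range(5))
squares_total = sum(squares_gen)          # 30

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as f:
        f.write("caf\u00e9".encode("utf-8"))

    # Binary mode returns bytes and takes no encoding.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode decodes the bytes using the given encoding.
    with open(path, encoding="utf-8") as f:
        text = f.read()

a = {1, 2}
b = {1, 2, 3}
is_subset = a <= b            # True
is_proper_subset = a < b      # True
equal_is_proper = a < a       # False: a set is not a proper subset of itself

print(squares_list, squares_total, len(raw), len(text))
print(is_subset, is_proper_subset, equal_is_proper)
```

The accented character takes two bytes in UTF-8, which is why the raw bytes are one longer than the decoded text.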

Chapter 8

Real Python course on Modules and Packages

  • import module necessitates the use of . notation to access the objects
  • individual objects can be imported via from module import x
  • you can import specific modules within a function
  • dir() is a built in function that gives the defined names in the namespace
  • object names are stored in the local symbol table
  • dir(modulename) gives the namespace for the modulename
  • when a .py file is imported, the dunder variable __name__ is set to the name of the module
  • when a .py file is run as a standalone script, __name__ is set to the string __main__
  • __name__ dunder name
  • one can test a piece of code in a python program via two ways
    • load the code as a module and write tests
    • write the tests as a part of __main__ and execute the python code
  • python imports the module only once. If there is need to reload the module, one can use importlib and then use reload function
  • packages allow hierarchical structuring of namespace using dot notation
  • package initialization entails creating an __init__.py file
  • if you put import statements in __init__.py all the imported modules are available for a program that imports the pkg folder
  • from pkg import * works if there is __all__ variable in __init__.py
  • packages can contain nested packages
  • .. is used for relative imports
  • There are two ways to invoke a python program - either step in to the directory that contains the script and run the code under global name space or step in to any root folder that has visibility on the script and run it from the root folder. In the latter, one should specify -m option
  • if you remove __init__.py from a folder, python will stop searching for modules under that folder
  • when we drop .py , we are telling python to use the namespace and use the -m option. The code runs as a part of an imported module and hence the dunder variable name takes the name of the package
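
To see __name__, dir and importlib.reload in action, this sketch writes a throwaway module to a temporary directory and imports it. The module name greeter and its contents are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical module that records its own __name__.
source = "imported_as = __name__\nGREETING = 'hello'\n"

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "greeter.py").write_text(source)
    sys.path.insert(0, tmp)               # make the folder importable
    importlib.invalidate_caches()
    try:
        greeter = importlib.import_module("greeter")

        # When imported, __name__ inside the module is the module's name.
        imported_as = greeter.imported_as

        # dir(module) lists the names defined in the module's namespace.
        public_names = [n for n in dir(greeter) if not n.startswith("_")]

        # A second import would reuse this cached object.
        cached = sys.modules["greeter"] is greeter

        # reload re-executes the module's code in the same module object.
        reloaded = importlib.reload(greeter)
        same_object = reloaded is greeter
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("greeter", None)

print(imported_as, public_names, cached, same_object)

# In a file run directly as a script, this prints __main__.
print(__name__)
```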

Learnings from the chapter

  • Modules are useful for creating reusable code
  • Modules are useful for creating namespaces
  • curated list of Python packages - https://awesome-python.com/
  • builtins is a namespace in python that contains the built-in names, such as len, print and the standard exceptions
  • import defines a new variable that references a namespace
  • from os import sep, path creates sep and path names in the namespace
  • floor always rounds toward negative infinity: toward zero for positive numbers and away from zero for negative numbers
  • ceil always rounds toward positive infinity: toward zero for negative numbers and away from zero for positive numbers
  • round implements banker’s rounding, i.e. a value exactly halfway between two candidates rounds to the even one: round(2.5) gives 2 whereas round(3.5) gives 4
  • It is better to use the Decimal class in python whenever exact decimal results matter, for example with money
  • Python keeps track of loaded modules in sys.modules and checks there before importing a module again
  • I will stop using float and always use Decimal class
  • poetry package can be used to bundle your source code as a package
  • PyPI has about 250,000 packages
  • __name__ is either defined as the current module name or __main__
  • variable lookup - LEGB - Local, Enclosing, Global, Builtins
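
The rounding notes above are easy to verify with a few calls. The Decimal lines show one alternative when the schoolbook "round half up" behaviour is wanted; that part is my addition rather than something from the chapter.

```python
import math
import sys
from decimal import ROUND_HALF_UP, Decimal

# floor goes toward negative infinity, ceil toward positive infinity.
floors = (math.floor(2.7), math.floor(-2.7))      # (2, -3)
ceils = (math.ceil(2.2), math.ceil(-2.2))         # (3, -2)

# round sends exact halves to the nearest even integer.
halves = (round(0.5), round(1.5), round(2.5), round(3.5))   # (0, 2, 2, 4)

# Decimal lets you choose the rounding rule explicitly.
half_up = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Modules that are already imported are cached in sys.modules.
math_is_cached = "math" in sys.modules

print(floors, ceils, halves, half_up, math_is_cached)
```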

Chapter 9

  • The first parameter of every instance method is self. However, it is not a reserved word in Python; the name comes from the Smalltalk language, whose object system influenced Python
  • When you call a class to instantiate an object, it looks for the name in the global namespace and then invokes the __new__ constructor. The __new__ constructor is responsible for creating the object and invoking __init__ method before returning the instance object to the caller.
  • The job of any __init__ method is to add attributes to the instance
  • One can always add instance attributes outside of init dunder method but it is considered good practice to add all the instance attributes at one place, i.e. in the init dunder method
  • Fantastic article on type checking at Real Python
  • when a parameter is declared with the splat operator (for example *args), the extra positional arguments arrive in the function as a tuple
  • Python searches the attributes based on ICPO - instance, class, parent, object
  • Today I understood why the following code works

    s = 'abcd'
    print(s.upper())

The above code works because Python first checks for an upper attribute on the instance s. Since it does not find one there, it checks whether type(s) has it. In the above case, type(s) is str, and str has an upper method. When the class does not have the attribute either, Python looks in the parent class, and so on up to object

  • the ultimate parent for every object in python is object
  • One can specify class attributes as well as instance attributes
  • One can use self.__class__.__name__ to obtain the class name
  • This chapter has been superuseful to me as I had forgotten most of the OOP concepts in Python. Stumbled on to the OOP Python path on Real Python. Need to work through the course
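
The ICPO lookup and the other class notes above can be seen in a small sketch. The classes Animal and Dog and the function describe_args are invented for this illustration.

```python
class Animal:
    kind = "animal"                    # class attribute, shared by all instances

    def __init__(self, name):
        self.name = name               # instance attribute, set in __init__

    def describe(self):
        return f"{self.__class__.__name__} named {self.name}"


class Dog(Animal):
    pass


rex = Dog("Rex")

# ICPO: "kind" is not on the instance or on Dog, so it is found on Animal.
on_instance = "kind" in vars(rex)      # False
on_class = "kind" in vars(Dog)         # False
on_parent = "kind" in vars(Animal)     # True
found = rex.kind                       # 'animal'

# The lookup chain for Dog ends with object.
chain = [cls.__name__ for cls in Dog.__mro__]    # ['Dog', 'Animal', 'object']

# The same mechanism explains 'abcd'.upper(): upper lives on the str class.
upper_on_str = "upper" in vars(str)    # True

description = rex.describe()           # 'Dog named Rex'


def describe_args(*args):
    # Extra positional arguments are collected into a tuple.
    return type(args).__name__, len(args)


args_info = describe_args(1, 2, 3)     # ('tuple', 3)

print(found, chain, description, args_info)
```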

Chapter 10

  • strings, lists and dicts are iterables because they implement the __iter__ method, which returns an iterator
  • Three parts of a for loop
    • It asks the object whether it is an iterable or not using the iter built-in function. This function invokes the __iter__ method on the target object. Whatever __iter__ returns is an iterator
    • If the object is an iterable, then the for loop invokes the next built-in function on the iterator that was returned. That function invokes __next__ on the iterator
    • if __next__ raises StopIteration, then the loop exits
  • To make any class in to an iterable, it must adhere to the following protocol:
    • it must implement an __iter__ method that returns the object itself
    • it must implement a __next__ method that returns the next value
    • it must raise StopIteration once the index has run through the entire data
  • There is a difference between an iterable and an iterator. The former is an object that can be put in a for loop and that returns an iterator object. The latter actually follows the iterator protocol. In many cases, the iterable and the iterator are the same object. However, there are cases, such as strings and lists, where iter returns a separate iterator object
  • Iterable is a category of data in Python
  • itertools is a module that implements many classes that follow the iterator protocol
  • Awesome explanation of the difference between iterable and iterator using MyEnumerate class and MyEnumerateIterator class. Most of the boiler plate code needed for iteration is present in the helper class
  • Iterator can return whatever data it wants until it hits a StopIteration exception
  • ways to convert an iterator class to generator functions
  • Learned a nice way of using boolean or function to cut down redundant code
  • There are three different ways to create an iterator
    • add the appropriate methods to the class
    • write a generator function
    • write a generator expression
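
The three ways of creating an iterator can be put side by side. The class names MyEnumerate and MyEnumerateIterator come from the notes above, but the bodies below are my own reconstruction of the idea, not the book's code.

```python
class MyEnumerate:
    """Iterable: hands out a fresh iterator every time it is looped over."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return MyEnumerateIterator(self.data)


class MyEnumerateIterator:
    """Iterator: keeps the position and raises StopIteration at the end."""

    def __init__(self, data):
        self.data = data
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration
        value = (self.index, self.data[self.index])
        self.index += 1
        return value


def my_enumerate(data):
    # Generator function: Python builds the iterator machinery from yield.
    for index in range(len(data)):
        yield index, data[index]


letters = MyEnumerate("ab")
first_pass = list(letters)                   # [(0, 'a'), (1, 'b')]
second_pass = list(letters)                  # works again: new iterator each time

from_function = list(my_enumerate("ab"))
from_expression = list((i, "ab"[i]) for i in range(2))   # generator expression

# What a for loop does by hand: iter() once, then next() until StopIteration.
iterator = iter("ab")
manual = [next(iterator), next(iterator)]
try:
    next(iterator)
    exhausted = False
except StopIteration:
    exhausted = True

print(first_pass, from_function, from_expression, manual, exhausted)
```

Splitting the iterable from its iterator is what lets letters be looped over twice; a generator object, by contrast, is used up after one pass.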

Takeaway

After a very long time, I have actually completed a book from cover to cover. I have worked through all the exercises in the book and have learned a ton of Python in the process. Let me recollect all my learnings from this book

  1. difference between iterator and iterable
  2. how does one write a generator function?
  3. how does one enable a class to be an iterable?
  4. class can have attributes and so can the instances
  5. composition and inheritance in python OOP
  6. the way attributes are looked up instance-class-parent-object - ICPO
  7. the way variables are looked up local-enclosing-global-builtin - LEGB
  8. the way namespaces are structured
  9. importing means creating variables in the namespace
  10. pathlib module - a fantastic module to work with files in a file system
  11. argparse module - a fantastic module to create command line programs
  12. why should one use __init__.py in a package
  13. poetry can help in automatically creating a package out of your source code
  14. mutable vs immutable datatypes
  15. list comprehensions, dict comprehensions, set comprehensions
  16. PyPI has about 250,000 packages
  17. Type hints introduced in Python 3
  18. Interesting methods on dicts such as update methods
  19. one can iterate through strings, lists as they all implement iter functions
  20. it is always preferable to use generator expressions wherever possible as a substitute for list comprehensions
  21. way to use key in sorting methods
  22. use of key in max method
  23. collections module that has a lot of useful classes such as defaultdict, Counter
  24. importlib module to reload module multiple times in a REPL

After going back to all the points that I had written while going through the book, here are some of the additional takeaways:

  1. splat operator
  2. itemgetter and attrgetter functions
  3. time module , datetime module - super useful modules
  4. learnt about Decimal class
  5. banker’s rounding
  6. way to incorporate timezones in to python datetime objects
  7. with can work with a lot more python objects than merely file instances
  8. a<b for sets
  9. dunder name __name__
  10. There are about 1.1 million possible Unicode code points