1. Inheritance and Composition
Inheritance models a derivation ("is-a") relationship: the child class inherits methods and attributes from the parent class.
Composition models an inclusion ("has-a") relationship: the composite class holds one or more objects of other component classes as instance variables.
Inheritance designs a class around what it is, while composition designs a class around what it does.
When you need to reuse a class as-is without modification, composition is recommended; when you need to change the behavior of another class's methods, inheritance is recommended.
Composition gives looser coupling than inheritance.
Say there is a company with many employees; some of the employees are engineers, and some are managers. Every employee has a home address, which consists of a country, a state, and a street address. In this example, the relationship between the company and its employees is composition, the relationship between Employee and Engineer is inheritance, and the relationship between an employee and their address is composition.
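The company example above can be sketched as a minimal class hierarchy (all names here are illustrative):

```python
class Address:
    def __init__(self, country, state, street):
        self.country = country
        self.state = state
        self.street = street

class Employee:
    def __init__(self, name, address):
        self.name = name
        self.address = address  # composition: an Employee has an Address

class Engineer(Employee):  # inheritance: an Engineer is an Employee
    def work(self):
        return f"{self.name} writes code"

class Manager(Employee):  # inheritance: a Manager is an Employee
    def work(self):
        return f"{self.name} runs meetings"

class Company:
    def __init__(self, employees):
        self.employees = employees  # composition: a Company has Employees

home = Address("US", "CA", "1 Main St")
company = Company([Engineer("Alice", home), Manager("Bob", home)])
print(company.employees[0].work())  # Alice writes code
```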
2. Generator and Iterator
An iterator is an object that contains a countable number of values and can be iterated upon, meaning that you can traverse through all of its values. Lists, tuples, and dictionaries are iterable containers from which we can get an iterator.
class MyNumbers:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        if self.a <= 20:
            x = self.a
            self.a += 1
            return x
        else:
            raise StopIteration

myclass = MyNumbers()
myiter = iter(myclass)
print(next(myiter))  # 1
for x in myiter:     # continues from 2 up to 20
    print(x)
A generator creates an iterator in a simpler way: it uses the keyword `yield` to give out values. Generators are implemented as functions, and generator functions behave like iterators.
def simpleGeneratorFun():
    yield 1
    yield 2
    yield 3

for i in simpleGeneratorFun():
    print(i)  # 1, 2, 3

x = simpleGeneratorFun()  # a fresh generator (the one above is exhausted)
print(next(x))  # 1
print(next(x))  # 2
print(next(x))  # 3
Iterator | Generator |
---|---|
Implemented with a class that defines `__iter__` and `__next__`. | Implemented with a function that uses `yield`. |
State must be stored manually, e.g. in instance variables. | Local variables and execution state are saved automatically between `yield`s. |
Mostly used to traverse containers, or to turn other objects into iterators via the `iter()` function. | Mostly used to lazily produce a sequence of values without building the whole sequence in memory. |
3. Decorator
3.1 Definition
In Python, functions are first-class objects. This means that functions can be passed around and used as arguments. A function is called a higher-order function if it takes one or more functions as parameters or returns a function as its output. In other words, functions that operate on other functions are known as higher-order functions.
A decorator is a function that takes another function and extends the behavior of the latter without explicitly modifying it.
Say we want to measure the execution time of a function; this can be achieved with a decorator.
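A minimal sketch of such a timing decorator (the name `timeit` here is illustrative, not the standard-library module):

```python
import time
import functools

def timeit(f):
    @functools.wraps(f)  # preserve the wrapped function's name and docstring
    def inner(*args, **kwargs):
        start = time.perf_counter()
        result = f(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{f.__name__} took {elapsed:.6f}s")
        return result
    return inner

@timeit
def slow_add(a, b):
    time.sleep(0.1)
    return a + b

print(slow_add(1, 2))  # prints the timing line, then 3
```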
3.2 Implementation
# easy example
def add2(f):
    def inner(*args, **kwargs):
        return f(*args, **kwargs) + 2
    return inner

@add2
def add(a, b):
    return a + b

"""
The decorator syntax above is equivalent to:
def add(a, b):
    return a + b
add = add2(add)
"""
print(add(1, 2))  # 5
# decorator that makes the function take fewer params
def set_a_to_2(f):
    def inner(*args, **kwargs):
        return f(2, *args, **kwargs)
    return inner

@set_a_to_2
def add(a, b):
    return a + b

print(add(3))  # 5
# decorator with params
# "parametrized" is a decorator for decorators: it turns "multiply",
# which needs an extra argument n, into a decorator factory
def parametrized(dec):
    # "layer" receives the decorator's own arguments (here: n),
    # because "multiply" is called with n to return a decorator
    def layer(*args, **kwargs):
        # "repl" is the actual decorator applied to the target function
        def repl(f):
            # "f" is bound here, so when we use "multiply" as a decorator
            # we do not need to pass "f" to it explicitly
            return dec(f, *args, **kwargs)
        return repl
    return layer

# we want to replace "multiply" with a function that returns a decorator;
# we use another decorator to achieve this
@parametrized
def multiply(f, n):
    def aux(*xs, **kws):
        return n * f(*xs, **kws)
    return aux

# "multiply(2)" returns a decorator function, which is the decorator we apply
@multiply(2)
def function(a):
    return 10 + a

print(function(3))  # prints 26
3.3 Built-in Decorators
property
is a decorator usually used to implement getter (@property
), setter (@prop.setter
), deleter (@prop.deleter
) functions.
classmethod
is a decorator that will make the parameter passed in to the method the class itself (class), rather than self (object, instance). @classmethod
allows to access the function without the need for an instance.
staticmethod
will make a method a method of the class itself (my personal understanding, not definition).
class Pizza:
    def __init__(self, ingredients):
        self.ingredients = ingredients

    def __repr__(self):
        return f'Pizza({self.ingredients!r})'

    @classmethod
    def margherita(cls):
        return cls(['mozzarella', 'tomatoes'])

    @classmethod
    def prosciutto(cls):
        return cls(['mozzarella', 'tomatoes', 'ham'])

>>> Pizza.margherita()
Pizza(['mozzarella', 'tomatoes'])
>>> Pizza.prosciutto()
Pizza(['mozzarella', 'tomatoes', 'ham'])
The class methods above can be called on Pizza itself, with no pre-existing instance; each call then constructs and returns a new Pizza via cls.
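For contrast, a `@staticmethod` receives neither `self` nor `cls`. The `Circle` class below is an illustrative example, not part of the original Pizza snippet:

```python
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @staticmethod
    def area(radius):
        # no self/cls: just a plain function namespaced under the class
        return math.pi * radius ** 2

print(Circle.area(2))     # callable on the class...
print(Circle(2).area(2))  # ...or on an instance, identically
```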
4. Shallow and Deep Copy
In Python, assignment statements do not create copies of objects; they only bind names to objects. For immutable objects (numbers, tuples, strings, booleans) this makes no practical difference. For mutable objects, the newly assigned variable and the original variable in fact point to the same piece of memory.
A shallow copy copies only the references to the child objects of the original object into the new object. A deep copy copies all the child objects into the new object recursively.
The depth of a shallow copy is 1; the depth of a deep copy is the depth of the object.
# shallow copy
a = [[1, 1], 4, [5, 1, 4]]
b = list(a)      # new outer list, but the same inner objects
b[0][0] = 0      # both a and b become [[0, 1], 4, [5, 1, 4]]
b[1] = 3         # only b becomes [[0, 1], 3, [5, 1, 4]]
b.append(1919)   # only b becomes [[0, 1], 3, [5, 1, 4], 1919]

# deep copy
import copy
a = [[1, 1], 4, [5, 1, 4]]
b = copy.deepcopy(a)  # now changes to b have nothing to do with a
5. List Reversal
reversed
function can be utilized to reverse a list, which will create a list_reverseiterator
object, from which we can get the reversed list using type casting. list(reversed(a))
reverse
method of list object can also be called to achieve it, which will reverse the list in-place. a.reverse()
We can also use slicing technique to reverse a list, which will create a copy of the list. a[::-1]
Above are the built-in ways to reverse a list. We can also build our own algorithms, like inserting and two pointers swapping.
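The two-pointer swap mentioned above can be sketched as:

```python
def reverse_in_place(lst):
    # swap symmetric pairs, moving both pointers toward the middle
    i, j = 0, len(lst) - 1
    while i < j:
        lst[i], lst[j] = lst[j], lst[i]
        i += 1
        j -= 1
    return lst

print(reverse_in_place([1, 2, 3, 4, 5]))  # [5, 4, 3, 2, 1]
```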
6. Multithreading
Below is my personal understanding of this concept.
The goal of multithreading is to execute multiple tasks concurrently. There are two main ways to achieve this: alternating tasks on the same core, or running each task on its own core.
In CPython, creating threads will not shorten the execution time of CPU-bound tasks, because all threads run on a single core in a manner similar to time-division multiplexing: only one thread can be executing Python bytecode at any moment. This is the first way described above.
This is a side effect of the GIL, or Global Interpreter Lock. To get around it, we can use other Python implementations or multiprocessing.
6.1 GIL
The GIL is a mutex that grants control of the interpreter to only one thread at a time. It guarantees there are no conflicting updates to objects' reference counts, which are used to keep track of objects so they can be collected as garbage when the count reaches 0. Because the GIL is a single lock on the interpreter itself, there is no need to attach a lock to every object, which avoids deadlock problems and improves single-threaded performance.
6.2 Implementation
The `threading` library can be used to create and start threads.
import threading
import time
import concurrent.futures

def func(n):
    time.sleep(n)

tr = threading.Thread(target=func, args=(2,))
tr.start()
tr.join()  # main program will wait for the thread

# daemon threads are shut down abruptly when the main program finishes
daemon_tr = threading.Thread(target=func, args=(3,), daemon=True)
daemon_tr.start()

# creating a pool of threads; map passes each element of range(3) to func
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    executor.map(func, range(3))
This can also be implemented in an OOP way by creating a class that extends `Thread`.
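A minimal sketch of that OOP style: subclass `Thread`, override `run()`, and let `start()` execute it in the new thread (the class name here is illustrative):

```python
import threading
import time

class SleepThread(threading.Thread):
    def __init__(self, seconds):
        super().__init__()  # must call the parent constructor
        self.seconds = seconds
        self.finished_ok = False

    def run(self):
        # run() is the body that start() executes in the new thread
        time.sleep(self.seconds)
        self.finished_ok = True

t = SleepThread(0.1)
t.start()
t.join()
print(t.finished_ok)  # True
```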
6.3 Multiprocessing
A process is what we call a program that has been loaded into memory along with all the resources it needs to operate. It has its own memory space. A thread is the unit of execution within a process. A process can have multiple threads running as a part of it, where each thread uses the process’s memory space and shares it with other threads.
In Python, we can use multiprocessing to execute different tasks truly in parallel; it spawns new processes, each with its own copy of the current process's memory.
import time
from multiprocessing import Process

def func(n):
    time.sleep(n)

if __name__ == "__main__":  # guard required on platforms that spawn processes
    p = Process(target=func, args=(2,))
    p.start()
    p.join()
It is recommended to use multithreading for IO-bound tasks and multiprocessing for CPU-bound tasks.
7. Context Manager
7.1 Why
When managing external resources like files and connections, the program may retain the resources after we finish using them, causing leaks (of memory, file handles, and so on). To address this, we need to manage resources properly, that is, apply setup and teardown phases. This is what context managers are for.
7.2 With Statement
The Python `with` statement creates a runtime context that allows us to run a group of statements under the control of a context manager.
# most common use: opening files
with open("hello.txt", mode="w") as file:
    file.write("Hello, World!")
The `with` statement is also commonly used to handle locks in multithreaded programs and to test exceptions with pytest.
lock = threading.Lock()
with lock:
    pass

with pytest.raises(ZeroDivisionError):
    a = 1 / 0
7.3 Implementation
Define `__enter__` and `__exit__` inside the class.
class MyContextManager:
    def __enter__(self):
        val = 114514
        print("entered")
        return val

    def __exit__(self, exc_type, exc_value, exc_tb):
        """
        exc_type, exc_value, exc_tb are for error handling:
        exc_type is the exception class      # error type
        exc_value is the exception instance  # error message
        exc_tb is the traceback object       # error traceback
        """
        print("exited")

with MyContextManager() as cm:
    print(cm)  # 114514
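The standard library also offers `contextlib.contextmanager`, which turns a generator function into a context manager: code before the `yield` is the setup phase and code after it is the teardown. A sketch equivalent to the class above:

```python
from contextlib import contextmanager

@contextmanager
def my_context():
    print("entered")     # setup phase
    try:
        yield 114514     # the value bound by "as"
    finally:
        print("exited")  # teardown; runs even if an exception was raised

with my_context() as val:
    print(val)  # 114514
```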
Context managers can also be asynchronous; just implement `__aenter__` and `__aexit__`.
import aiohttp
import asyncio

class AsyncSession:
    def __init__(self, url):
        self._url = url

    async def __aenter__(self):
        self.session = aiohttp.ClientSession()
        response = await self.session.get(self._url)
        return response

    async def __aexit__(self, exc_type, exc_value, exc_tb):
        await self.session.close()

async def check(url):
    async with AsyncSession(url) as response:
        print(f"{url}: status -> {response.status}")
        html = await response.text()
        print(f"{url}: type -> {html[:17].strip()}")

async def main():
    await asyncio.gather(
        check("https://realpython.com"),
        check("https://pycoders.com"),
    )

asyncio.run(main())
# https://realpython.com/python-with-statement/
8. Garbage Collector
Python uses two strategies for memory management: reference counting and garbage collection. Standard CPython's garbage collector has two components: the reference counting collector and the generational garbage collector, known as the gc module.
8.1 Reference Counting
This strategy keeps track of references to each object in Python. Its main advantage is that objects can be destroyed immediately and easily once they are no longer needed. An object's reference count increases when it is assigned to a variable, passed as an argument, appended to a container, and so on.
If the reference counting field reaches zero, CPython automatically calls the object-specific memory deallocation function. If an object contains references to other objects, then their reference count is automatically decremented too.
Global variables stay alive until the end of the process. For other variables, the name is destroyed when its block ends. To remove something from memory earlier, you need to either rebind the variable to a new value or exit the block of code.
To check the reference count, use `sys.getrefcount(obj)`.
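A small demonstration (note that `getrefcount` reports one extra reference, held temporarily by its own argument):

```python
import sys

a = [1, 2, 3]
print(sys.getrefcount(a))  # typically 2: the name "a" plus the call's argument

b = a                      # a second name bound to the same list
print(sys.getrefcount(a))  # one more than before

del b                      # dropping the name decrements the count
print(sys.getrefcount(a))  # back to the original count
```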
Reference counting is one of the reasons why Python can't get rid of the GIL.
8.2 Generational Garbage Collector
Reference counting fails when there are reference cycles, which it cannot detect. A reference cycle occurs when one or more objects reference each other, and it can only occur in container objects.
The GC classifies container objects into three generations. Every new object starts in the first generation. If an object survives a garbage collection round, it moves to the older (higher) generation. Lower generations are collected more often than higher. Because most of the newly created objects die young, it improves GC performance and reduces the GC pause time.
In order to decide when to run, each generation has an individual counter and threshold. The counter stores the number of object allocations minus deallocations since the last collection. Every time you allocate a new container object, CPython checks whether the counter of the first generation exceeds its threshold value. If so, Python initiates the collection process.
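The `gc` module exposes these counters and thresholds, and a quick experiment shows it reclaiming a reference cycle that reference counting alone would leak:

```python
import gc

print(gc.get_threshold())  # per-generation thresholds, e.g. (700, 10, 10)
print(gc.get_count())      # current allocation counters per generation

class Node:
    pass

a = Node()
b = Node()
a.partner = b
b.partner = a      # a reference cycle: the refcounts never reach zero
del a, b           # the names are gone, but the cycle keeps both alive

collected = gc.collect()  # the cyclic pair is reclaimed here
print(collected >= 2)     # True: at least the two Node objects were found
```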
9. Enumerate
The `enumerate()` function is used to get a counter and the corresponding value from an iterable at the same time. On each iteration, `enumerate()` returns a tuple of the counter and the value.
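For example:

```python
seasons = ["spring", "summer", "fall", "winter"]
for i, name in enumerate(seasons, start=1):  # the counter can start anywhere
    print(i, name)

print(list(enumerate(seasons)))
# [(0, 'spring'), (1, 'summer'), (2, 'fall'), (3, 'winter')]
```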
10. Map, Filter and Reduce
10.1 Map
The `map()` function iterates through all items in the given iterable and executes the function we passed as an argument on each of them. It returns a map object.
a = [1, 1, 4, 5, 1, 4]
print(list(map(lambda x: x * 2, a))) # [2, 2, 8, 10, 2, 8]
10.2 Filter
The `filter()` function iterates through all items in the given iterable and executes the function we passed as an argument on each of them. The items for which the function returns a true value are kept in the result.
a = [1, 9, 1, 9, 8, 1, 0]
print(list(filter(lambda x: x != 0, a)))  # [1, 9, 1, 9, 8, 1]
10.3 Reduce
In Python 3, `reduce()` is no longer a built-in function; it must be imported from `functools`.
`reduce()` works by calling the function we passed on the first two items in the sequence. The result is then used in another call to the function along with the next (third, in this case) element, and so on. It returns a single value.
from functools import reduce
a = [1, 1, 4, 5, 1, 4]
print(reduce(lambda x, y: x + y, a))      # 16
print(reduce(lambda x, y: x + y, [1]))    # 1
print(reduce(lambda x, y: x + y, ["1"]))  # "1"
11. Encapsulation and Abstraction
11.1 Encapsulation
Encapsulation is defined as the wrapping up of data under a single unit. It is the mechanism that binds together code and the data it manipulates. Another way to think about encapsulation is that it is a protective shield preventing the data from being accessed by code outside this shield. In code, getters and setters are common examples of encapsulation.
class Person:
    def __init__(self, name):
        self.__name = name  # name mangling discourages outside access

    @property
    def name(self):
        return self.__name

    @name.setter
    def name(self, name):
        self.__name = name
11.2 Abstraction
Abstraction is the concept of object-oriented programming that "shows" only essential attributes and "hides" unnecessary information. The main purpose of abstraction is hiding the unnecessary details from the users.
from abc import ABC, abstractmethod

# ABC: Abstract Base Class
class Polygon(ABC):
    @abstractmethod
    def noofsides(self):
        pass

class Triangle(Polygon):
    # overriding abstract method
    def noofsides(self):
        print("I have 3 sides")

class Pentagon(Polygon):
    # overriding abstract method
    def noofsides(self):
        print("I have 5 sides")

class Hexagon(Polygon):
    # overriding abstract method
    def noofsides(self):
        print("I have 6 sides")

class Quadrilateral(Polygon):
    # overriding abstract method
    def noofsides(self):
        print("I have 4 sides")
# https://www.geeksforgeeks.org/abstract-classes-in-python/
12. Sorting
Sorting means putting elements in a certain order. The order depends on requirements.
By sorting, we can boost the efficiency of tasks like searching.
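In Python, `sorted()` returns a new sorted list, while `list.sort()` sorts in place; both accept `key` and `reverse` parameters:

```python
words = ["banana", "Apple", "cherry"]

print(sorted(words))                         # ['Apple', 'banana', 'cherry'] (uppercase sorts first)
print(sorted(words, key=str.lower))          # case-insensitive order
print(sorted(words, key=len, reverse=True))  # longest words first (stable for ties)

nums = [5, 1, 4, 1, 1, 4]
nums.sort()   # in place; returns None
print(nums)   # [1, 1, 1, 4, 4, 5]
```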
13. Cache
In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster.
For example, if we visit a website several times, we will find that subsequent visits load much faster than the first visit. This is because the browser stores some static resources of the website to the local storage, and when we access the same resources again, we can read them directly from the local storage instead of getting them from the destination server, which saves a lot of time.
There are many cache strategies: (reference)
Strategy | Eviction policy | Use case |
---|---|---|
First-In/First-Out (FIFO) | Evicts the oldest of the entries | Newer entries are most likely to be reused |
Last-In/First-Out (LIFO) | Evicts the latest of the entries | Older entries are most likely to be reused |
Least Recently Used (LRU) | Evicts the least recently used entry | Recently used entries are most likely to be reused |
Most Recently Used (MRU) | Evicts the most recently used entry | Least recently used entries are most likely to be reused |
Least Frequently Used (LFU) | Evicts the least often accessed entry | Entries with a lot of hits are more likely to be reused |
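In Python, `functools.lru_cache` applies the LRU policy from the table above to function results:

```python
from functools import lru_cache

@lru_cache(maxsize=128)  # keep at most 128 results; least recently used evicted first
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))           # 832040, computed in linear time thanks to the cache
print(fib.cache_info())  # hit/miss/size statistics
```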
14. Load Balancing
Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.
Say we have a web service that serves millions of users concurrently in the US, and we deploy servers in each state. If we always serve users from the nearest server, servers in CA and NY would likely be extremely busy while servers in UT and WY sit idle most of the time. To address this problem, we can distribute the work of the busy servers to the less busy ones. This is an example of load balancing.
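As a minimal sketch, here is round-robin, one of the simplest distribution strategies: each incoming request is handed to the next server in a fixed rotation (the server names are illustrative):

```python
import itertools

class RoundRobinBalancer:
    def __init__(self, servers):
        # cycle endlessly through the pool in order
        self._pool = itertools.cycle(servers)

    def pick(self):
        return next(self._pool)

lb = RoundRobinBalancer(["ca-1", "ny-1", "ut-1"])
for _ in range(5):
    print(lb.pick())  # ca-1, ny-1, ut-1, ca-1, ny-1
```

Real load balancers layer health checks, weights, and session affinity on top of a base strategy like this.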