Best Practices for Python Programming (Continuously Updated)
python techTable of Contents
When delving into the codebases of some successful large Python projects such as PyTorch, I am consistently impressed by their code – whether it’s clean yet precise, or leveraging lesser-known built-in or third-party packages to significantly enhance functionality.
High-quality code snippets, handy packages, and modules have greatly facilitated my work. In this blog, I’ll be sharing noteworthy findings and insights learned from the open-source codebase.
Basics
__new__
The __new__
method is used for creating a new instance of a class. It is a static method that gets called before the __init__
method.
The default __new__
method could be
class MyClass:
def __new__(cls, *args, **kwargs):
instance = super(MyClass, cls).__new__(cls, *args, **kwargs)
return instance
Note that, different from __init__
, whose first argument is an instance self
, __new__
’s first argument is a class.
You can override __new__
if something special need to be done with the object creation.
There are some classical use cases for the __new__
method:
Singleton Pattern
class Singleton:
_instance = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
return cls._instance
# Usage
singleton1 = Singleton()
singleton2 = Singleton()
print(singleton1 is singleton2)
True
Subclassing Immutable Types
When subclassing immutable types like str
, int
, unicode
or tuple
, the properties of immutable cannot be changed after they are created, you can override __new__
instead:
class UpperStr(str):
def __new__(cls, value):
return str.__new__(cls, value.upper())
# Usage
upper_string = UpperStr("hello")
print(upper_string) # Output: HELLO
HELLO
Factory Methods
__new__
can be used to implement factory methods that return instances of different classes based on input parameters.
class Shape:
def __new__(cls, *args, **kwargs):
if cls is Shape:
shape_type = args[0]
if shape_type == 'circle':
return Circle()
elif shape_type == 'square':
return Square()
return super(Shape, cls).__new__(cls, *args, **kwargs)
class Circle(Shape):
pass
class Square(Shape):
pass
# Usage
shape = Shape('circle')
print(isinstance(shape, Circle)) # Output: True
True
__iter__
and __next__
The __iter__
is a magic method that allows an object to be iterable, its result should be an iterable, and the __next__
method returns the next element of the iterable.
A naive example:
class Counter:
def __init__(self, n):
self.n = n
self.i = 0
def __iter__(self):
# Return the iterator object (self)
return self
def __next__(self):
if self.i < self.n:
self.i += 1
return self.i
else:
# signal the end
raise StopIteration
__aiter__
and __anext__
Similar to __iter__
and __next__
, the __aiter__
and __anext__
are the asynchronous version.
The __aiter__
allows an object to be an asynchronous iterator object, which is an object that has an __anext__
method that returns an awaitable object that yields the next element of the sequence.
import asyncio
class AsyncCounter:
def __init__(self, n):
self.n = n
self.i = 0
def __aiter__(self):
# Return the iterator object (self)
return self
async def __anext__(self):
if self.i < self.n:
self.i += 1
# Simulate some delay
await asyncio.sleep(1)
return self.i
else:
# Signal the end
raise StopAsyncIteration
Handy builtin utilities
setter and getter
When there is some logic bound to a member when it is got or updated, then the getter and setter could be used.
class App:
def __init__(self):
self.update_count = 0
self._name = ""
@property
def name(self) -> str:
return self._name
@name.setter
def name(self, v:str):
self._name = v
self.update_count += 1
app = App()
app.name = 'a'
app.name = 'b'
print('name:', app.name) # b
print('update_count:', app.update_count) # 2
name: b
update_count: 2
@dataclass
@dataclass
is a decorator that can be used to create classes that mainly store data.
It can automatically generate some common methods for the class, such as __init__
, __repr__
, and __eq__
, based on the type hints of the class attributes.
from dataclasses import dataclass
@dataclass
class Point:
x: float
y: float
p = Point(1., 2.)
print(p)
Point(x=1.0, y=2.0)
There are several classical practices using @dataclass
Use default values or default factories
from dataclasses import dataclass, field
from random import randint
from typing import List
@dataclass
class DummyContainer:
sides: int = 6
value: int = field(default_factory=lambda: randint(1, 6))
alist: List[int] = field(default_factory=list) # avoid assign [] directly
dummy = DummyContainer()
print(dummy)
DummyContainer(sides=6, value=2, alist=[])
Use frozen=True
to make the class immutable
from dataclasses import dataclass
@dataclass(frozen=True)
class Circle:
radius: float
const_circle = Circle(2.0)
Use order=True
to enable comparison operators based on the class attributes
from dataclasses import dataclass
@dataclass(order=True)
class Circle:
radius: float
c0 = Circle(1.)
c1 = Circle(2)
print(c0 > c1)
False
Use inheritance to create subclasses of data classes
from dataclasses import dataclass
@dataclass
class Animal:
name: str
sound: str
@dataclass
class Dog(Animal):
# inherits name and sound from Animal
watch_house: bool
dog = Dog(name="Huang", sound="Wang", watch_house=False)
print(dog)
Dog(name='Huang', sound='Wang', watch_house=False)
functools partial
to get new function by partially fixing some arguments of an existing one
from functools import partial
def func0(a, b):
print(f"a:{a}, b:{b}")
func1 = partial(func0, a = 0)
print(func1)
func1(b=10)
# reset argument a
func1(a=1, b=10)
functools.partial(<function func0 at 0x7fccb00e31e0>, a=0)
a:0, b:10
a:1, b:10
functools @warps
to help define better decorators
Below is a classical way to define an decorator
def decorator(func):
def actual_func(*args, **kwargs):
''' The actual func. '''
print(f"Before Calling {func.__name__}")
func(*args, **kwargs)
print(f"After Calling {func.__name__}")
return actual_func
@decorator
def greet(name):
''' The greet func. '''
print(f"Hello, {name}!")
greet("Martin")
Before Calling greet
Hello, Martin!
After Calling greet
The name and docstring of the decorated function will be hidden in the decorator function, and this makes the usage a bit opaque when debugging.
print(greet.__name__)
print(greet.__doc__)
actual_func
The actual func.
In other words, the name and the docstring of the decorated function is overwritten by the decorator, which is not expected.
We can fix such issue with @wraps
, for example, the original code could be replaced with
from functools import wraps
def decorator(func):
@wraps(func)
def actual_func(*args, **kwargs):
print(f"Before Calling {func.__name__}")
func(*args, **kwargs)
print(f"After Calling {func.__name__}")
return actual_func
@decorator
def greet(name):
''' The greet func. '''
print(f"Hello, {name}!")
print(greet.__name__)
print(greet.__doc__)
greet
The greet func.
functools @lru_cache
: decorator to wrap a function with a LRU cache
Accelerating DP-like recursive function call
@functools.lru_cache(maxsize=1000)
def factorial(n):
return n * factorial(n-1) if n else 1
Initialization for some heavy states without introducing global variables
Suppose we have some global states that should be initialized only once, the naive way to do it is by introducing some global variables,
state = None
def get_state(args):
if state is None:
state = construct_state(args)
return state
We can eliminate the need for a global variable with a cache:
@lru_cache
def get_state(args):
return construct_state(args)