# Pythonic

Python is "simple and correct"!

- Walrus operator `:=`: `(a := 10)` performs the assignment and also evaluates to `10`.
- Structural pattern matching can destructure a sequence and check types at once, e.g. `case [str(name), _, _, (float(lat), float(lon))]:`
- `==` is usually what you want. `is` is mostly used for comparison with `None` or other sentinels; it can't be overloaded, so it's slightly faster than calling `__eq__`.
- [Weak references | Fluent Python](https://fluentpython.com/extra/weak-references/): several classes can reference objects without increasing their reference count, which is useful for caches.
- Prefer EAFP (easier to ask for forgiveness than permission) over LBYL (look before you leap), using `try`-`except` blocks. This improves safety in multi-threaded programs, since the state can change between the check and the use.

## String

- Single quotes by default.
- `'%s/%s' % ('hello', 'world')`: the `%` formatting operator accepts a tuple.
- `'\N{NAME OF UNICODE CHAR}'` inserts a character by its Unicode name.
- `str.maketrans` and `str.translate` for character-level replacement.
- `locale.strxfrm` transforms [[unicode|Unicode]] text to a locale-aware form for comparison. Alternatively, use `pyuca` or the more powerful `pyicu`.
- `unicodedata` module for metadata about Unicode characters.
- [[regex|RegEx]] patterns built from `str` and `bytes` behave differently: in a `bytes` pattern, classes such as `\d` and `\w` match ASCII only.
- Use `reprlib.repr()` to get a concise, readable representation of large objects.

## Data Models

- `l[i, ...]` could mean `l[i, :, :, :]` (e.g. for a 4-dimensional NumPy array).
- The `__missing__` method may or may not be called, depending on the class we're subclassing and on the method used for the lookup.
- `dict` now preserves insertion order by default, but `OrderedDict` is still optimized for reordering operations.
- `types.MappingProxyType` gives a read-only view of a mapping.
- Consider extending `UserDict` instead of `dict` due to their different implementations.
- `dict.setdefault` is helpful for updating mutable values in a mapping with a single lookup.
- `dict.keys`, `dict.values`, and `dict.items` return views. The keys view (and the items view, when the values are hashable) supports `set` operations; the values view does not.
- `memoryview` creates a shared view without copying bytes.
- `__post_init__` method for data classes.
- Inherit from `TypedDict` to annotate the fields of a dictionary. Unlike named tuples or data classes, it has no runtime effect and can't set defaults.
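
A minimal sketch of a few of the mapping behaviors noted above. `StrKeyDict` and the sample data are made up for illustration; everything else comes from the standard library.

```python
from types import MappingProxyType


class StrKeyDict(dict):
    """Hypothetical dict subclass that retries a missing key as a string."""

    def __missing__(self, key):
        if isinstance(key, str):
            raise KeyError(key)
        return self[str(key)]


d = StrKeyDict({'1': 'one'})
via_getitem = d[1]   # d[k] falls back to __missing__
via_get = d.get(1)   # dict.get does not call __missing__

# setdefault fetches or creates the mutable value in one lookup
index = {}
index.setdefault('a', []).append(1)
index.setdefault('a', []).append(2)

# keys views support set operations
common = {'a': 1, 'b': 2}.keys() & {'b': 3, 'c': 4}.keys()

# read-only view of a mapping; item assignment raises TypeError
proxy = MappingProxyType(index)
```

The `d[1]` versus `d.get(1)` difference is one reason the notes suggest `UserDict`: with a plain `dict` subclass, built-in methods don't necessarily go through the overridden special methods.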
## [[oop|Object Oriented]]

- Mixin classes can be used to provide additional functionality to a class when inherited.
- Instance attribute names are stored in a key-sharing dictionary, hence adding attributes after `__init__` can lose that memory optimization.
- In data classes, attributes without type hints are plain class attributes, not fields. An attribute typed as `ClassVar` also stays at class level, while `InitVar` marks an init-only argument that is passed to `__post_init__` and not stored as a field.
- Use the `__match_args__` class attribute (a tuple of attribute names) to support matching objects with positional patterns.
- Name mangling: `__attr` becomes `_ClassName__attr`.
- A method that calls `super()` *activates* the next class in `__mro__`; the corresponding method there may also need to call `super()` to *cooperate*.
- Extend `UserDict`, `UserList`, and `UserString`, since the built-in types are implemented in C and don't call overridden special methods in the expected way.
- Try to avoid inheritance unless the benefit is clear. Static duck typing is usually better.
- Operator overloading
  - For `a + b`, `a.__add__` is invoked first; if that doesn't work, then `b.__radd__`.
  - Return `NotImplemented` if the operand is not supported.
- When accessing an attribute from an instance, Python first searches `obj.__class__` for a property (or another overriding descriptor); if none exists, it falls back to a key in `obj.__dict__`, then to `obj.__getattr__`. This can be bypassed by directly accessing `obj.__dict__`.
- Properties are class attributes. They can be created with `@property`, with corresponding decorators for setters and deleters, or with property factory functions.
  - `property(fget=None, fset=None, fdel=None, doc=None)`
- `__init__` is the initializer; `__new__` is the real constructor.
- Descriptors can act as getters/setters to validate attributes, via the `__set__` and `__get__` methods.
- Every function is a non-overriding descriptor. `Cls.func.__get__(obj)` returns a method bound to `obj`; `Cls.func.__get__(None, Cls)` returns the function without binding.

### Metaprogramming

- `type(name, bases, dict)` can be used to construct classes at runtime. `type` is a metaclass that builds other classes.
- `__init_subclass__` can be used to customize some behaviors of subclasses when they're created. Note that `__slots__` can't be tweaked here, since the class has already been constructed by `__new__`; a metaclass is needed for this.
- `type` is an instance of `type`, `type` is a subclass of `object`, and `object` is an instance of `type`.
- Every class is an *instance* of `type` (maybe indirectly), but only metaclasses are *subclasses* of `type`. Metaclasses like `abc.ABCMeta` inherit from `type` the power to construct classes.
- `MetaKlass.__new__` constructs new classes.

## Utils

- `sorted` is based on the `__lt__` method.
- `func.__code__` to view the code object, and `func.__code__.co_varnames` and `func.__code__.co_freevars` to view the variable names. `func.__closure__` to view the closure; its content can be read with `func.__closure__[0].cell_contents`.
- Given a `str` template, its bound `str.format` method can be used as a formatting function.
- `inspect` module, to help inspect objects.
- `locale.getpreferredencoding`
- `list(zip(*a))` can be used to transpose a 2D list (the rows come back as tuples).
- `itertools.chain` to chain iterables together.

## Functions

- Positional-only arguments, tuple-captured arguments, keyword-only arguments... e.g. `def func(a, b, /, *d, e, **f)`
- `operator` module, which contains operators that can be used together with `reduce`, or to avoid creating `lambda` functions. `itemgetter`, `attrgetter`, and `methodcaller` create a function that accesses a certain item/attribute or calls a certain method, e.g. `key=itemgetter(2)`.
- Use `functools.partial` and `functools.partialmethod` to create a callable from a callable with some arguments bound to certain values.
- Optional arg: `Optional[str]` is shorthand for `Union[str, None]`, or `str | None`.
- `tuple[int, ...]` for a tuple of any length whose items are all `int`.
- Use `collections.abc.Mapping` instead of `dict` in parameter type hints. Parameters should be typed with abstract types; return values should be typed with concrete ones.
- `@functools.wraps(func)` decorator, to copy the wrapped function's metadata onto the wrapper.
- `@cache` can take up all memory; `@lru_cache` with `maxsize` is safer. `maxsize` should preferably be a power of `2`. `typed` defaults to `False`.
- `@functools.singledispatch` dispatches on the type of the first argument:

```python
import decimal
import html
from functools import singledispatch

@singledispatch
def htmlize(obj: object) -> str:
    # fallback for types without a registered implementation
    content = html.escape(repr(obj))
    return f'<pre>{content}</pre>'

@htmlize.register
def _(text: str) -> str:
    # the dispatch type is taken from the annotation of the first parameter
    content = html.escape(text).replace('\n', '<br/>\n')
    return f'<p>{content}</p>'

# stack and pass types if needed
@htmlize.register(decimal.Decimal)
@htmlize.register(float)
def _(x) -> str:
    ...
```

- `locals()` can be used to access all local variables.

## Types

> [!IMPORTANT] Duck Typing
>
> Prefer duck typing when possible, then goose typing, then nominal typing (`isinstance`) if really needed.

- Type `x` is *consistent with* `y` if `x` supports all operations of `y`. `Any` is a special case: it is consistent with every type, and every type is consistent with it.
- `TypeVar`
  - `T = TypeVar('T', int, float)`, or `T = TypeVar('T', bound=Hashable)`; other options are also supported.
  - Since Python 3.12, explicit `TypeVar` declarations can be replaced with the type parameter syntax (`def f[T](x: T) -> T`, `class Box[T]`). Generic classes declared this way still implicitly inherit from `Generic`.
  - Use the `covariant` or `contravariant` argument to specify variance.

```python
# define a protocol
from typing import Any, Protocol, TypeVar

class SupportsLessThan(Protocol):
    def __lt__(self, other: Any) -> bool: ...

LT = TypeVar('LT', bound=SupportsLessThan)
```

- `typing.TYPE_CHECKING` is a `bool` that is `False` at runtime and `True` during static type checking.
- `__annotations__` can be used to read type hints from data classes. With postponed evaluation (`from __future__ import annotations`) the annotations are stored as strings; `inspect.get_annotations(obj, eval_str=True)` or `typing.get_type_hints(obj)` evaluates them.
- Goose typing by subclassing the ABC, or by defining classes first and then registering them as virtual subclasses of an ABC.
- Duck typing by EAFP (easier to ask for forgiveness than for permission), e.g. to assert that `o` is compatible with `list`, simply call `list(o)`.
- `__subclasshook__` can be used to customize `issubclass`/`isinstance` checks. For example, any object with `__len__` will be considered a `Sized`, even without subclassing or registering.
  - However, runtime checks don't always work well, so static protocol typing is still helpful.
- `@runtime_checkable` to make protocols checkable at runtime.
- It's better to make protocols "single method", as in `SupportsComplex`.
- Typing numbers is tricky, since the `numbers.Number` ABC defines no methods.
- The Python interpreter goes a long way to make iterators and sequences work, even when the methods are only partially implemented.
- Use `@overload` to define multiple function signatures.

## Decorators

- `pytest.mark.parametrize` for testing, e.g. `@mark.parametrize('a, b', [...])`
- `@classmethod` receives the class itself as the first argument; `@staticmethod` receives nothing special.
- `@typing.final`: cannot be overridden. `@typing.override` (Python 3.12+): explicitly tells the type checker that this method overrides one in a superclass.

## Iterators and Generators

- `re.finditer` performs lazy evaluation (unlike `findall`), which is useful for building lazily evaluated iterators.
- An object is iterable as long as it has `__iter__`, or `__getitem__` accepting integer indexes starting at 0. It doesn't need `__next__`, which belongs to iterators. Note that `isinstance(obj, collections.abc.Iterable)` only detects `__iter__`.
- To build an iterator, also try using `yield`, `yield from`, and generator expressions. A hand-written `__next__` raises `StopIteration` when done; a generator simply returns.
- The `Generator` and `Coroutine` types take the same type parameters, but a classic coroutine written as a generator function should be annotated with `Generator`.
- Generators produce items; classic coroutines consume items sent to them.
- Pattern: send a sentinel value to the generator and have it return to terminate. Don't raise `StopIteration` inside the generator: since Python 3.7 it is converted to `RuntimeError`.
- When `gen.close()` is invoked, a `GeneratorExit` exception is raised at the `yield` where the generator is paused.

## Control Flow

### Context Manager

- `ctx.__enter__` and `ctx.__exit__` will be called. `ctx.__exit__` also receives the exception details (`exc_type`, `exc_value`, `traceback`) if the body raised, so that it can handle them.
- `ctx.__exit__` returns a *truthy* value to indicate that the exception has been handled; otherwise it will be propagated.
- Parenthesized multiple contexts since Python 3.10: `with ( Ctx1() as ctx1, Ctx2() as ctx2 ):`
- `@contextlib.contextmanager` turns a generator function into a context manager.
  Anything before `yield` runs as `__enter__`, the yielded object is bound to the `as` target, and anything after runs as `__exit__`. An exception raised in the `with` body is re-raised at the `yield`; if the generator catches it and does not re-raise, the exception is assumed to be handled and is suppressed.
- Functions decorated with `@contextmanager` can be used to decorate other functions.

### `else` Block

- `else` can be used with `for` and `while`; it is executed only if the loop finishes without being terminated by `break`.
- `else` can be used with `try`/`except`; it is executed if the `try` block didn't raise any exceptions.

## Modules and Packages

- Modules are also first-class objects, and can be found in `sys.modules`.
- Avoid doing things in the module's body except binding attributes.
- A module-level `__getattr__` function (and `__dir__`) defined in the module body will work.
- `import builtins` or the `__builtins__` attribute can be used to monkey patch builtins.
- If the first statement is a string literal, it's bound to `__doc__`.
- `from mod import *` binds all attributes except those starting with `_`. A module can define `__all__`, a list of the names that should be bound.
- Importing is implemented in `builtins.__import__`; it first searches in `sys.modules`.
- `.pth` files for Python path config. When importing, `sys.path` is searched; the final path for package `M` would be `M/__init__.py` (or `M.py` for a plain module).
- Use `python -m testname` to see if a name is already taken by an importable module, e.g. one in the stdlib.
- `importlib.reload(M)` can reload a module, but existing bindings to its attributes won't be updated (hence `import M` is preferred over `from M import ...` when importing a module).
- Hooks can be added for module importing (see `sys.meta_path` and `sys.path_hooks`).
- `P/__init__.py`, `P`'s module body, must be present for `P` to be recognized as a package (except for namespace packages). If `P.__all__` is not set, `from P import *` will only import the public names bound in `__init__.py`.
- Relative import: `from . import X` will look for `X` in the current package; more dots go up in the hierarchy.
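
A small sketch of three of the module notes above: modules are ordinary objects, `import` consults `sys.modules` first, and a module-level `__getattr__` handles failed lookups. The module name `demo_mod` and its attributes are invented for the example; a real module would normally live in a file.

```python
import sys
import types

# Build a module object by hand; the second argument becomes __doc__.
demo = types.ModuleType('demo_mod', 'Docstring bound to __doc__.')
demo.answer = 42


def _lazy_getattr(name):
    """Module-level __getattr__: called only when normal lookup fails."""
    if name == 'lazy_value':
        return 'computed on demand'
    raise AttributeError(f'module {demo.__name__!r} has no attribute {name!r}')


demo.__getattr__ = _lazy_getattr

# The import machinery checks sys.modules first, so registering the object
# there makes `import demo_mod` succeed without any file on sys.path.
sys.modules['demo_mod'] = demo

import demo_mod

same_object = demo_mod is demo   # the import returned the registered object
doc = demo_mod.__doc__
lazy = demo_mod.lazy_value       # resolved through the module's __getattr__
```

Writing to `sys.modules` directly is mostly a testing or debugging trick; it is used here only to make the lookup order visible.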