# Pythonic
Python is "simple and correct"!
- Walrus operator `:=`, `a := 10` will perform assignment and also return `10`.
- `case [str(name), _, _, (float(lat), float(lon))]:`
- `==` is usually what you want. `is` is mostly used for comparison with `None` or other sentinels; it can't be overloaded, so it's slightly faster than `==`, which has to call `__eq__`.
- [Weak references | Fluent Python](https://fluentpython.com/extra/weak-references/), several classes can be used to reference objects without actually being counted, useful for caches.
- Prefer EAFP (easier to ask for forgiveness than permission) over LBYL (look before you leap), using `try`-`except` blocks. This also improves safety in multi-threaded programs, where the state can change between the check and the use.
## String
- Single quote by default.
- `'%s/%s' % ('hello', 'world')`, the `%` formatting operator understands tuples.
- `'\N{NAME OF UNICODE CHAR}'`
- `str.maketrans` and `str.translate`
- `locale.strxfrm` transforms [[unicode|Unicode]] text into a locale-aware form for comparison. Alternatively, use `pyuca` or the more powerful `PyICU`.
- `unicodedata` module for metadata about unicode characters.
- [[regex|RegEx]] built from `str` and `bytes` are different
- Use `reprlib.repr()` to get concise and readable representation for objects
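
A few of the string tools above in one place, as a rough sketch. The sample values are arbitrary.

```python
import re
import reprlib
import unicodedata

# `%` formatting with a tuple of values.
path = '%s/%s' % ('hello', 'world')

# Named Unicode escape and the reverse lookup in `unicodedata`.
euro = '\N{EURO SIGN}'
euro_name = unicodedata.name(euro)

# maketrans(from, to, delete) builds a table for translate().
table = str.maketrans('abc', 'xyz', '!')
translated = 'aabbcc!'.translate(table)

# A str pattern matches Unicode word characters, a bytes pattern only ASCII.
str_words = re.findall(r'\w+', 'café')
bytes_words = re.findall(rb'\w+', 'café'.encode('utf-8'))

# reprlib.repr abbreviates long containers.
short = reprlib.repr(list(range(100)))
```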
## Data Models
- `l[i, ...]` could mean `l[i, :, :, :]`
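- `__missing__` may or may not be called, depending on the class we're subclassing and the lookup used (e.g. for a `dict` subclass, `d[k]` calls it but `d.get(k)` doesn't)
- `dict` keeps insertion order since Python 3.7, but `OrderedDict` is still optimized for frequent reordering operations
- `types.MappingProxyType` wraps a mapping in a read-only view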
- Consider extending from `UserDict` instead of `dict` due to their different implementations
- `dict.setdefault` is helpful for updating mutable mappings
- `dict.keys`, `dict.values`, and `dict.items` return views. The `keys` and `items` views are similar to a `set` and support set operations.
- `memoryview` creates a shared view without copying bytes
- `__post_init__` method for data classes
- Inherit from `TypedDict` to have annotations for fields in a dictionary. Unlike named tuple or data classes, it has no runtime effect, can't set defaults.
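
A small sketch of several mapping notes above. `StrKeyDict` and `Movie` are illustrative names, not part of any library.

```python
from types import MappingProxyType
from typing import TypedDict

class StrKeyDict(dict):
    # Called by dict.__getitem__ on a missing key, but not by dict.get.
    def __missing__(self, key):
        if isinstance(key, str):
            raise KeyError(key)
        return self[str(key)]

d = StrKeyDict({'1': 'one'})
via_getitem = d[1]      # falls back to the '1' key through __missing__
via_get = d.get(1)      # None, __missing__ is not consulted

# setdefault fetches or creates the mutable value in a single lookup.
index = {}
index.setdefault('a', []).append(1)
index.setdefault('a', []).append(2)

# Key views support set operations.
common = {'a': 1, 'b': 2}.keys() & {'b': 3, 'c': 4}.keys()

# Read-only view over a mapping.
proxy = MappingProxyType(index)

class Movie(TypedDict):
    title: str
    year: int

# No runtime effect: the result is a plain dict.
movie = Movie(title='Alien', year=1979)
```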
## [[oop|Object Oriented]]
- Mixin classes can be used to provide additional functionalities to a class when inherited.
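- Instances of a class share their attribute keys in a key-sharing dictionary, hence adding instance attributes after `__init__` compromises the memory savings and performance.
- In data classes, attributes without type hints are plain class attributes. Attributes with type hints become instance fields, unless they're typed as `ClassVar` (class attribute) or `InitVar` (argument passed only to `__init__` and `__post_init__`).
- Define the `__match_args__` class attribute to support pattern matching with positional sub-patterns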
- Name mangling: `__attr` becomes `_ClassName__attr`
- A method that calls `super()` will *activate* the classes in `__mro__`; the corresponding methods there may also need to call `super()` to *cooperate*
- Extend from `UserDict`, `UserList`, `UserString`, since the built-in types are implemented in C and don't call overridden special methods in the expected way.
- Try to avoid inheritance unless the benefit is clear. Static duck typing is better.
- Operator overloading
    - For `a + b`, `a.__add__` is invoked first; if it returns `NotImplemented`, then `b.__radd__` is tried
    - Return `NotImplemented` if the operand type is not supported
- When reading an attribute from an instance, Python first searches `obj.__class__` for properties (overriding descriptors); if none exists, it falls back to a key in `obj.__dict__`, then to other class attributes, then to `obj.__getattr__`. This can be bypassed by directly accessing `obj.__dict__`.
- Properties are class attributes. They can be created with `@property`, with the corresponding `@name.setter` and `@name.deleter` decorators. They can also be created with property factory functions.
    - `property(fget=None, fset=None, fdel=None, doc=None)`
- `__init__` is initializer, `__new__` is the real constructor.
- Using descriptors as getter/setter to validate properties. `__set__`, `__get__` methods.
- Every function is a non-overriding descriptor. `Cls.func.__get__(obj)` returns a method bound to `obj`, `Cls.func.__get__(None, Cls)` returns a function without binding.
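
A rough sketch combining a validating descriptor with operator overloading. `Positive` and `Money` are hypothetical names, and `__set_name__` is an extra hook not mentioned in the notes above; it just lets the descriptor learn its attribute name.

```python
class Positive:
    """Overriding descriptor: validates on set, stores in the instance dict."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f'{self.name} must be > 0')
        instance.__dict__[self.name] = value

class Money:
    amount = Positive()

    def __init__(self, amount):
        self.amount = amount  # goes through Positive.__set__

    def __add__(self, other):
        if isinstance(other, Money):
            return Money(self.amount + other.amount)
        if isinstance(other, (int, float)):
            return Money(self.amount + other)
        return NotImplemented  # lets Python try the reflected method

    def __radd__(self, other):
        # Reached for `5 + Money(2)` after int.__add__ returns NotImplemented.
        return self + other

m = Money(2)
# Functions are non-overriding descriptors: __get__ builds the bound method.
bound = Money.__add__.__get__(m)
```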
### Metaprogramming
- `type(name, bases, dict)` can be used to construct classes at runtime. `type` is a meta class that builds other classes.
- `__init_subclass__` can be used to customize some behaviors of subclasses when they're created. Note that `__slots__` can't be tweaked here since the classes have been constructed by `__new__` -- meta class is needed for this.
- `type` is an instance of `type`, `type` is a subclass of `object`, `object` is an instance of `type`.
- Every class is an *instance* of `type` (maybe indirectly), but only meta classes are *subclasses* of `type`. Meta classes like `abc.ABCMeta` inherit from `type` the power to construct classes.
- `MetaKlass.__new__` constructs new classes.
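
A short sketch of building a class with `type(...)` and hooking subclass creation with `__init_subclass__`. `Point`, `Plugin`, and `CsvPlugin` are made-up names.

```python
def describe(self):
    return f'{type(self).__name__}({self.x})'

# Equivalent to a `class Point:` statement with an attribute and a method.
Point = type('Point', (object,), {'x': 0, 'describe': describe})

class Plugin:
    registry = []

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass, after the class object exists.
        super().__init_subclass__(**kwargs)
        Plugin.registry.append(cls.__name__)

class CsvPlugin(Plugin):
    pass
```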
## Utils
- `sorted` is based on `__lt__` method
- `func.__code__` to view the code, and `func.__code__.co_varnames` and `func.__code__.co_freevars` to view the var names. `func.__closure__` to view the closure. Content can be viewed with `func.__closure__[0].cell_contents`
- Given a `str`, we can call `str.format` to use it as a formatting string.
- `inspect` lib, to help inspect objects.
- `locale.getpreferredencoding`
- `list(zip(*a))` can be used to transpose a 2D list
- `itertools.chain` to chain generators together
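
A quick sketch of the introspection attributes and helpers above, using a hypothetical `make_averager` closure.

```python
from itertools import chain

def make_averager():
    series = []

    def averager(value):
        series.append(value)
        return sum(series) / len(series)

    return averager

avg = make_averager()
avg(10)
avg(12)

local_names = avg.__code__.co_varnames          # arguments and locals
free_names = avg.__code__.co_freevars           # names captured from the enclosing scope
captured = avg.__closure__[0].cell_contents     # the actual `series` list

# Transpose a 2D list (rows come back as tuples).
transposed = list(zip(*[[1, 2, 3], [4, 5, 6]]))

# Chain several iterables into one stream.
chained = list(chain('ab', range(2)))

# Any str can act as a format template.
template = '{}/{}'
formatted = template.format('hello', 'world')
```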
## Functions
- Positional-only arguments, tuple-captured arguments, keyword-only arguments... e.g. `func(a, b, /, *d, e, **f)`
- `operator` library, which contains operators as functions that can be used together with `reduce`, or to avoid creating `lambda` functions. `itemgetter`, `attrgetter`, and `methodcaller` can be used to create a function that accesses a certain item/attr, or calls a certain method. e.g. `key=itemgetter(2)`.
- Use `functools.partial` and `partialmethod` to create a callable from a callable with arguments bound to certain values.
- Optional arg, `Optional[str]` is a shorthand of `Union[str, None]`, or `str | None`
- `tuple[int, ...]` for unlimited tuple
- Use `collections.abc.Mapping` instead of `dict` in param type hints. Parameters should be typed with abstract (generic) types, returns with concrete types.
- `@functools.wraps(func)` decorator.
- `@cache` can take up all memory, `@lru_cache` with `maxsize` is safer. `maxsize` should be a power of `2`. `typed` defaults to `False`
```python
import decimal
import html
from functools import singledispatch

@singledispatch
def htmlize(obj: object) -> str:
    # Fallback for types without a registered implementation
    content = html.escape(repr(obj))
    return f'<pre>{content}</pre>'

@htmlize.register
def _(text: str) -> str:
    # Dispatch is based on the type hint of the first argument
    content = html.escape(text).replace('\n', '<br/>\n')
    return f'<p>{content}</p>'

# stack and pass types if needed
@htmlize.register(decimal.Decimal)
@htmlize.register(float)
def _(x) -> str:
    ...
```
- `locals()` can be used to access all local vars.
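
A sketch of the parameter kinds, `operator` helpers, `partial`, and `lru_cache` from the list above. The sample data and `fib` are only for illustration.

```python
from functools import lru_cache, partial, reduce
from operator import itemgetter, methodcaller, mul

def func(a, b, /, *d, e, **f):
    # a, b: positional-only; d: extra positionals; e: keyword-only; f: extra keywords
    return a, b, d, e, f

captured = func(1, 2, 3, 4, e=5, g=6)

# operator functions instead of lambdas.
factorial_5 = reduce(mul, range(1, 6))
rows = [('a', 1, 30), ('b', 2, 10), ('c', 3, 20)]
by_third = sorted(rows, key=itemgetter(2))
shout = methodcaller('upper')

# Bind the first argument of mul to 3.
triple = partial(mul, 3)

@lru_cache(maxsize=128)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

`fib.cache_info()` reports hits, misses, and the current cache size, which helps when picking `maxsize`.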
## Types
> [!IMPORTANT] Duck Typing
>
> Prefer duck typing when possible, then goose typing, then nominal typing (`isinstance`) if really needed.
- Type `x` is *consistent with* `y`, if `x` supports all operations of `y`. `Any` is a special case.
- `TypeVar`
    - `T = TypeVar('T', int, float)`, or `T = TypeVar('T', bound=Hashable)`, other options are also supported
    - An explicit `TypeVar` can be replaced with the type parameter syntax (`def f[T](x: T)`, `class C[T]`) in 3.12. Generic classes declared this way still implicitly inherit from `Generic`
    - Use the `covariant` or `contravariant` argument to specify variance
```python
# define a protocol
from typing import Any, Protocol, TypeVar

class SupportsLessThan(Protocol):
    def __lt__(self, other: Any) -> bool: ...

LT = TypeVar('LT', bound=SupportsLessThan)
```
- `typing.TYPE_CHECKING` is a `bool` that is `False` at runtime and `True` during static type checking
- `__annotations__` can be used to read type hints from data classes. With `from __future__ import annotations` the annotations are stored as strings; they are evaluated when read with `inspect.get_annotations(obj, eval_str=True)` or `typing.get_type_hints()`
- Goose typing by sub-classing the ABC, or define classes first, then register them as virtual subclass of an ABC.
- Duck typing by EAFP (easier to ask for forgiveness than for permission), e.g. to check that `o` is compatible with `list`, simply call `list(o)`.
- `__subclasshook__` can be used to customize the behavior of `issubclass`/`isinstance`. For example, any object with `__len__` will be considered `Sized`, even without subclassing or registering.
- However, runtime checks don't always work well, static protocol typing is still helpful.
- `@runtime_checkable` to make protocols checkable at runtime.
- It's better to make protocols "single method", as in `SupportsComplex`.
- Typing for numbers is tricky, since the `numbers.Number` ABC has no methods
- Python interpreter goes a long way for iterators and sequences to work, even when methods are partially implemented.
- Use `@overload` to define multiple function signatures.
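
A sketch of the typing styles above: a runtime-checkable protocol, `__subclasshook__` via `Sized`, a registered virtual subclass, and `@overload`. All class and function names here are made up.

```python
from abc import ABC
from collections.abc import Sized
from typing import Protocol, overload, runtime_checkable

@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...

class Resource:
    # No inheritance: matches SupportsClose structurally.
    def close(self) -> None:
        self.closed = True

class Bag:
    # Having __len__ is enough for Sized.__subclasshook__.
    def __len__(self):
        return 0

class Drawable(ABC):
    pass

@Drawable.register
class Circle:
    # Virtual subclass: registered, not inherited.
    pass

@overload
def double(x: int) -> int: ...
@overload
def double(x: str) -> str: ...
def double(x):
    # The overloads are for the type checker; only this body runs.
    return x * 2
```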
## Decorators
- `pytest.mark.parametrize` for testing. e.g. `@mark.parametrize('a, b', [...])`
- `@classmethod` receives the class itself as first arg, `@staticmethod` receives nothing
- `@typing.final` - cannot be overridden, `@typing.override` - explicitly tell type checker that this function overrides super class
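
A small sketch of `@classmethod` as an alternative constructor next to `@staticmethod`. `Temperature` is a hypothetical class.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # `cls` is the class the method was called on, so subclasses work too.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_valid(celsius):
        # No implicit first argument at all.
        return celsius >= -273.15

class LabTemperature(Temperature):
    pass
```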
## Iterators and Generators
- `re.finditer` performs lazy evaluation (unlike `findall`), useful for building lazy evaluated iterators.
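- An object is iterable if it has `__iter__`, or as a fallback `__getitem__` accepting 0-based indexes (it doesn't have to have `__next__`). Note that `isinstance(obj, Iterable)` only checks for `__iter__`.
- To build an iterator, also try using `yield`, `yield from`, and generator expressions. A hand-written `__next__` raises `StopIteration` when done; a generator function simply returns.
- `Generator` and `Coroutine` types are functionally equivalent, but a classic coroutine built from a generator function should be annotated with `Generator`
- An iterator (or plain generator) produces items; a classic coroutine consumes items sent with `.send()`.
- Pattern: send a sentinel value to the generator to terminate it; the generator then returns, which raises `StopIteration` in the caller. Don't raise `StopIteration` inside the generator body: since Python 3.7 it is converted to `RuntimeError`.
- When `gen.close()` is invoked, a `GeneratorExit` exception will be raised at the `yield` expression where the generator is paused.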
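
A sketch of the iteration protocol fallback, `yield from`, a classic coroutine stopped by a sentinel, and what `close()` does inside a generator. The names are illustrative; `None` as the sentinel is an assumption of this example.

```python
from collections.abc import Iterable

class Letters:
    # Only __getitem__: iter() falls back to indexing from 0 until IndexError.
    def __init__(self, text):
        self.text = text

    def __getitem__(self, index):
        return self.text[index]

def flatten(nested):
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the sub-generator
        else:
            yield item

def averager():
    total, count, average = 0.0, 0, None
    while True:
        value = yield average
        if value is None:      # sentinel: stop consuming
            return average     # surfaces as StopIteration.value in the caller
        total += value
        count += 1
        average = total / count

log = []

def listener():
    try:
        while True:
            yield
    except GeneratorExit:
        # Raised at the paused `yield` when close() is called.
        log.append('closed')
        raise

coro = averager()
next(coro)                 # prime the coroutine up to the first yield
first = coro.send(10)
second = coro.send(20)
try:
    coro.send(None)
except StopIteration as exc:
    result = exc.value

gen = listener()
next(gen)
gen.close()
```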
## Control Flow
### Context Manager
- `ctx.__enter__` and `ctx.__exit__` will be called. `ctx.__exit__` will also be called with exceptions for handling.
- `ctx.__exit__` returns a *truthy* value to indicate that exceptions have been handled, otherwise they will be propagated.
- Parenthesized multiple contexts since Python 3.10: `with ( Ctx1() as ctx1, Ctx2() as ctx2 ):`
- `@contextlib.contextmanager` to make a generator function a context manager. Anything before `yield` will be `__enter__`, the yielded object will be given to `as`, anything after will be `__exit__`. Wrap the `yield` in `try`/`finally` so cleanup always runs; an exception that the generator catches and doesn't re-raise is treated as handled and suppressed.
- Functions decorated with `@contextmanager` can be used to decorate other functions.
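
A sketch of both styles: a class whose `__exit__` suppresses one exception type, and a `@contextmanager` generator that also works as a decorator. `Swallow`, `tracked`, and `work` are made-up names.

```python
from contextlib import contextmanager

class Swallow:
    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.caught = exc
        # Truthy return value means "handled": the exception is suppressed.
        return exc_type is not None and issubclass(exc_type, self.exc_type)

events = []

@contextmanager
def tracked(name):
    events.append(f'enter {name}')      # the __enter__ part
    try:
        yield name                      # bound to the `as` target
    finally:
        events.append(f'exit {name}')   # the __exit__ part, runs on errors too

with Swallow(KeyError) as guard:
    {}['missing']

with tracked('a') as label:
    events.append(f'body {label}')

@tracked('decorated')
def work():
    events.append('working')

work()
```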
### `else` Block
- `else` can be used with `for` and `while`; it is executed only if the loop finishes without being terminated by `break`.
- `else` can be used with `try`/`except`; it is executed only if the `try` block didn't raise any exceptions.
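
A sketch of both `else` forms: the loop `else` runs only when no `break` happened, and the `try` `else` runs only when nothing was raised. The helper names are hypothetical.

```python
def find_index(items, target):
    for i, item in enumerate(items):
        if item == target:
            break
    else:
        # Loop ran to completion without `break`: nothing was found.
        return -1
    return i

def double_int(text):
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        # Only reached when int() succeeded.
        return value * 2
```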
## Modules and Packages
- Modules are also first-class objects, can be found in `sys.modules`.
- Avoid doing things in the module's body except binding attributes.
- Special methods like `__getattr__` at module body will work.
- `import builtins` or the `__builtins__` attribute can be used to monkey patch builtins.
- If the first statement is a string literal, it's bound to `__doc__`
- `from mod import *` binds all attributes except those starting with `_`. Module can define a `__all__` for a list of names that should be bound.
- Importing is implemented in `builtins.__import__`, it first searches in `sys.modules`
- `.pth` file for Python path config. When importing, `sys.path` is looked up; the final path for a package `M` would be `M/__init__.py` (or `M.py` for a plain module)
- Use `python -m testname` to see if a name exists in stdlib.
- `importlib.reload(M)` can reload a module, but existing bindings to attributes won't be reloaded (hence `import` is preferred over `from` when importing a module)
- Hooks can be added for module importing.
- `P/__init__.py`, `P`'s module body, must be present for `P` to be recognized as a package (except for namespace packages). If `P.__all__` is not set, `from P import *` will only import names in `__init__.py`
- Relative import: `from . import X` will look for `X` in current package, more dots can go up in hierarchy.
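
A sketch showing that modules are ordinary objects cached in `sys.modules`, and that a module-level `__getattr__` is consulted for missing names. The module `notes_demo` is built in memory purely for illustration; normally these would live in a `.py` file.

```python
import sys
import types

# Build a module object by hand; the second argument becomes __doc__.
mod = types.ModuleType('notes_demo', 'Demo module docstring.')
mod.visible = 1

def _module_getattr(name):
    # Called only when normal lookup in the module's namespace fails.
    if name == 'answer':
        return 42
    raise AttributeError(f'module notes_demo has no attribute {name!r}')

mod.__getattr__ = _module_getattr

# import checks sys.modules first, so this entry short-circuits the search.
sys.modules['notes_demo'] = mod
import notes_demo
```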