transformer - Samuel's Vault

# Transformer Architecture

- A "mini-brain" sits on each token.
- A mini-brain can pass information to its right.
- A mini-brain has state, i.e. to compute a layer, all it needs is the previous layers and the outputs of the mini-brains to its left.
- Mini-brains can ask questions and share information (queries, and keys/values, in attention terms).
- "Backward and downward" mechanism: a mini-brain only looks backward (at tokens to its left) and downward (at lower layers), so information only flows from left to right and from lower layers to higher ones.
- The only way to get around "downward" is generation: a newly generated token gives insights from high layers a chance to reach future generated tokens, which is the basis of chain-of-thought prompting.
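
The left-to-right flow can be checked numerically. The following is a minimal NumPy sketch, assuming a single attention head with random weights and no residual connections, layer norm, or MLP. The names `causal_self_attention`, `w_q`, `w_k`, and `w_v` are illustrative and do not come from any particular library. It perturbs only the last token and checks that the outputs of every mini-brain to its left stay the same.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a (seq_len, d_model) array."""
    q = x @ w_q  # the "questions" each mini-brain asks
    k = x @ w_k  # what each mini-brain advertises about itself
    v = x @ w_v  # the information each mini-brain shares
    scores = q @ k.T / np.sqrt(q.shape[-1])

    # Causal mask: position i may only look at positions j <= i ("backward").
    seq_len = x.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Row-wise softmax; masked entries get weight exactly 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8  # toy sizes, chosen arbitrarily
x = rng.normal(size=(seq_len, d_model))
# Scale the (hypothetical) weights so attention scores stay of order 1.
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * d_model ** -0.5
                 for _ in range(3))

out = causal_self_attention(x, w_q, w_k, w_v)

# Change only the rightmost token and recompute.
x_changed = x.copy()
x_changed[-1] += 1.0
out_changed = causal_self_attention(x_changed, w_q, w_k, w_v)

left_unchanged = np.allclose(out[:-1], out_changed[:-1])
last_changed = not np.allclose(out[-1], out_changed[-1])
print(left_unchanged, last_changed)
```

Because earlier positions never depend on later ones, their per-layer results can be kept as state and reused while new tokens are generated, which is the property the "mini-brain has state" point relies on.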