```mermaid graph LR i[Inputs] p[Program] o[Observer \n Agent] a[Analysis \n Agent] i --> p --> o --> a ``` ## Design Decisions - How much input to generate? - Inputs dictate the _expense_ (time and space) and _accuracy_ - How to run the program - Runtime environment - On-line vs. off-line analysis - What to instrument? - When to instrument? ## Program Instrumentation - Adding code to the program to capture runtime information. - [[python|Python]] - `sys` module, `sys.settrace(tracefunc)` - `tracefunc` should accept `frame`, `event`, and `arg` - `tracefunc` should return the callback function for the next instruction. ## Input Generation - Rely on symbolic information. - Symbolic state = $(\text{stmt}, \sigma_s, \pi)$ - next instruction to evaluate - symbolic store, aka the variables. - path constraints, a boolean expression, `True` if always executed - Execute the program while carrying the symbolic state, fork the states if conditions encountered. - Implementation - `ast` module in [[python|Python]], `ast.parse()` to parse source obtained from `inspect.getsource()`. ```python import inspect import ast import z3 def traverse_if_children(children, context, cond): previous_len = len(paths) for child in children: traverse(child, context + [cond]) if len(paths) == previous_len: paths.append(context + [cond]) def traverse(node, context): if isinstance(node, ast.If): cond = ast.unparse(node.test).strip() not_cond = f"z3.Not({cond})" traverse_if_children(node.body, context, cond) traverse_if_children(node.orelse, context, not_cond) else: for child in ast.iter_child_nodes(node): traverse(child, context) ``` ## Dynamic Taint Analysis