word-break - Samuel's Vault

# Word Break https://leetcode.com/problems/word-break/ Assume `n = len(s)`, `m = len(words)`, `k` is the average length of words. ## Approach 1: [[bfs|BFS]] `s[start:end] in words` indicates that there is an edge from `start` to `end`. ```python words = set(wordDict) queue = deque([0]) seen = set() while queue: start = queue.popleft() if start == len(s): # reached the end return True for end in range(start + 1, len(s) + 1): if end in seen: # only concerned about reaching end, not about how continue if s[start:end] in words: queue.append(end) seen.add(end) return False ``` Time complexity: $O(n^3 + mk)$. Handling a node costs $O(n^2)$ as we iterate over all the `start` before a specific `end`, and creates a string for each of the range. BFS can cost $O(n^3)$ in total. Creating the set takes $O(mk)$ time. Space complexity: $O(n + mk)$. ## Approach 2: Top-Down [[dp|DP]] ```python lenth = [len(word) for word in words] @cache def dp(i): if i < 0: return True return any( s[i - l + 1:i + 1] == w and dp(i - l) for w, l in zip(words, length) ) return dp(len(s) - 1) ``` Time complexity: $O(nmk)$, there are $n$ states in table in total, to calculate each of them, we iterate over $m$ words, and perform substring operation on average length of $O(k)$. Space complexity: $O(n)$ ## Approach 3: Bottom-Up [[dp|DP]] ```python dp = [False] * len(s) length = [len(w) for w in words] for i in range(len(s)): for w, l in zip(words, length): if i < l - 1: # word is too long continue if i == l - 1 or dp[i - l]: if s[i - l + 1:i + 1] == w: dp[i] = True break return dp[-1] ``` - `i == l - 1`: can reach `i` from start of string with this word - `dp[i - l]`: can reach the end of previous word Time complexity: $O(nmk)$, $O(n)$ states in total, $O(mk)$ time per state. Space complexity: $O(n)$ ## Approach 4: [[trie|Trie]] Optimization ```python class Trie(dict): def __init__(self, words): for w in words: self.insert(w) self.is_word = False def insert(self, word): node = self for c in word: node.setdefault(c, Trie()) node = node[c] node.is_word = True trie = Trie(words) dp = [False] * len(s) for i in range(len(s)): if i == 0 or dp[i - 1]: # either at start or the previous word is reached node = trie for j in range(i, len(s)): c = s[j] if not node := node.get(c): break if node.is_word: dp[j] = True return dp[-1] ``` Time complexity: $O((n+m)k)$. Building the trie iterates over all characters and takes $O(mk)$. Once the trie is built, searching for a word takes $O(1)$. Nested loop iterates over $n$ indices, but each iteration stops after $k$ searches in trie on average. Space complexity: $O(n + mk)$, the trie can have at most $mk$ nodes. ## Approach 5: Another [[dp|DP]] `dp[i] = True` indicates a prefix of length `i` can be formed. ```python n = len(s) dp = [False] * (n + 1) dp[0] = True for i in range(1, n + 1): for j in range(i): if dp[j] and s[j:i] in words: dp[i] = True break return dp[-1] ``` Time complexity: $O(n^3 + mk)$, nested loop with substring operation takes $O(n^3)$ time.