Trie
a.k.a prefix tree, is a tree data structure, which is used for retrieval of a key in a dataset of strings
Compared with other data structures
Hashtable同样可以提供O(1)的search,但是无法处理以下情况:
- Finding all keys with a common prefix
- Enumerating a dataset of strings in lexicographical order.
另外一个原因,当数量增多是,hashtable中存在hash collisions,所以search有可能变成O(n)(这里n是插入的key的个数) 而Tire可以用更少的空间来存放更多的key with the same prefix,相对来说,Tire的search只需要O(m)(这里m是key的长度) 而在Balanced tree中,search需要O(mlogn)
Trie Node Structure
- Maximum of R links to its children (usually it is 26 for lowercase latin letters)
- Boolean field to indicate that if it is a end of the key
- 对于insert来说,时间和空间复杂度都是O(n)
注意这里不需要Node节点的值,值通过links数组的index即可以表示
Implement Trie
class TrieNode {
private TrieNode[] links;
private boolean isEnd;
public TrieNode() {
links = new TrieNode[26];
}
public void setEnd() {
this.isEnd = true;
}
public boolean getEnd() {
return this.isEnd;
}
public boolean containsKey(char key) {
return links[key - 'a'] != null;
}
public TrieNode get(char key) {
return links[key - 'a'];
}
public void put(char key, TrieNode node) {
links[key - 'a'] = node;
}
}
public class Trie {
private TrieNode root;
/** Initialize your data structure here. */
public Trie() {
this.root = new TrieNode();
}
/** Inserts a word into the trie. */
public void insert(String word) {
TrieNode node = this.root;
for (int i = 0; i < word.length(); i++) {
char key = word.charAt(i);
if (!node.containsKey(key)) {
node.put(key, new TrieNode());
}
node = node.get(key);
}
node.setEnd();
}
/** Returns if the word is in the trie. */
public boolean search(String word) {
TrieNode node = this.root;
int i = 0;
for (; i < word.length(); i++) {
char key = word.charAt(i);
if (node.containsKey(key)) {
node = node.get(key);
} else {
return false;
}
}
//必须要恰好在最后这个node结束 a.k.a node.isEnd(),这也就是search和startsWith的不同
return node.getEnd();
}
/** Returns if there is any word in the trie that starts with the given prefix. */
public boolean startsWith(String prefix) {
TrieNode node = this.root;
for (int i = 0; i < prefix.length(); i++) {
char key = prefix.charAt(i);
if (node.containsKey(key)) {
node = node.get(key);
} else {
return false;
}
}
return true;
}
}
Add and Search Word Data Structure
- 处理wildcard通配符的时候,要用到recursion
public boolean search(String word) {
TrieNode node = root;
return search(word, 0, node);
}
private boolean search (String word, int start, TrieNode node) {
if (start == word.length()) return node.isWord;
char c = word.charAt(start);
if (c == '.') {
for (int j = 0; j < 26; j++) {
if (node.children[j] != null && bfs(word, start+1, node.children[j]))
return true;
}
return false;
} else {
TrieNode tmp = node.children[c-'a'];
return tmp != null && bfs(word, start+1, tmp);
}
}