This module is an optimized implementation of Ukkonen's suffix tree algorithm in python. Aiming to support the important text processing functionalities such as:
Search for strings:
✓ Check if a string P of length m is a substring in O(m) time.
✓ Find the first occurrence of the patterns P1,... ,Pq of total length m as substrings in O(m) time.
✓ Find all z occurrences of the patterns P1,... ,Pq of total length m as substrings in O(m+z) time.
- Search for a regular expression P in time expected sublinear in n
- Find for each suffix of a pattern P the length of the longest match between a prefix of P[i... m] and a substring in D in
time. This is termed the matching statistics for P
Find properties of the strings:
sources:
- http://web.stanford.edu/~mjkay/gusfield.pdf
- On–line construction of suffix trees. Esko Ukkonen
- http://www.geeksforgeeks.org/ukkonens-suffix-tree-construction-part-6/