Tuesday, 29 March 2011
pyahocorasick
A Python module implementing the Aho-Corasick algorithm has been released. A C extension (for Py3k) and pure Python code are available.
Monday, 28 March 2011
Internal memory fragmentation
In the previous post I advertised my text about trie representations.
Depending on the particular representation, internal memory fragmentation varies from 25% to 46% (in GNU libc). In other words, if a trie should occupy 100MB, then in the worst case real memory usage is around 200MB. I never supposed that fragmentation could be so significant.
When quite simple memory pools were used, internal fragmentation was cut down to 1-2%! Impressive.
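To illustrate the idea, here is a minimal sketch of such a pool. It is not the code used for the measurements above; the node layout, the `NODES_PER_BLOCK` constant and all `pool_*` names are assumptions made up for this example. Nodes are carved out of large blocks obtained with a single `malloc` call, so the allocator's per-allocation overhead is paid once per block instead of once per node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical trie node; the real layout depends on the representation. */
typedef struct TrieNode {
    struct TrieNode *child;    /* first child */
    struct TrieNode *sibling;  /* next node on the same level */
    unsigned char letter;      /* edge label */
    unsigned char is_final;    /* non-zero if a word ends here */
} TrieNode;

/* Assumed block size; tune it for the expected trie size. */
#define NODES_PER_BLOCK 1024

/* One large allocation holding many nodes back to back. */
typedef struct PoolBlock {
    struct PoolBlock *next;    /* previously allocated block */
    size_t used;               /* nodes already handed out */
    TrieNode nodes[NODES_PER_BLOCK];
} PoolBlock;

typedef struct NodePool {
    PoolBlock *head;           /* block currently being filled */
    size_t blocks;             /* number of blocks allocated so far */
} NodePool;

static void pool_init(NodePool *pool)
{
    assert(pool != NULL);
    pool->head = NULL;
    pool->blocks = 0;
}

/* Return a zero-initialised node, or NULL when malloc fails. */
static TrieNode *pool_alloc(NodePool *pool)
{
    assert(pool != NULL);
    if (pool->head == NULL || pool->head->used == NODES_PER_BLOCK) {
        /* Current block is full (or missing): get a new one. */
        PoolBlock *block = malloc(sizeof *block);
        if (block == NULL)
            return NULL;
        block->next = pool->head;
        block->used = 0;
        pool->head = block;
        pool->blocks++;
    }
    TrieNode *node = &pool->head->nodes[pool->head->used++];
    node->child = NULL;
    node->sibling = NULL;
    node->letter = 0;
    node->is_final = 0;
    return node;
}

/* Nodes are never freed one by one; the whole pool is released at once. */
static void pool_destroy(NodePool *pool)
{
    assert(pool != NULL);
    while (pool->head != NULL) {
        PoolBlock *next = pool->head->next;
        free(pool->head);
        pool->head = next;
    }
    pool->blocks = 0;
}
```

A trie builder would call `pool_init` once, use `pool_alloc` wherever it previously called `malloc` for a node, and call `pool_destroy` when the trie is no longer needed. The trade-off is that single nodes cannot be returned to the system, which is acceptable when the trie only grows.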
Saturday, 26 March 2011
Efficient trie representation
While implementing the Aho-Corasick algorithm, I checked several trie representations.