I've tested following grouping schemes:
length of string, first letter of string (one level trie), - both length and first letter.
Tests: my program read plain text (I've used The Illiad from http://gutenberg.org), text is splitted into words (~190000) and then each words is inserted into dictonary (~28000 distinct words); then the same words are searched in dictionaries. Table below summarizes results on my computer (gcc 4.3.4 from cygwin).
+-------------+--------------------------------+--------------------------------+ | | running time [ms] | speedup [%] | | data struct +----------+----------+----------+----------+----------+----------+ | | min | avg | max | min | avg | max | +-------------+----------+----------+----------+----------+----------+----------+ | inserting | +-------------+----------+----------+----------+----------+----------+----------+ | std::map | 269 | 287 | 355 | 100 | 100 | 100 | +-------------+----------+----------+----------+----------+----------+----------+ | first char | 218 | 241 | 395 | 81 | 84 | 111 | +-------------+----------+----------+----------+----------+----------+----------+ | length | 218 | 240 | 345 | 81 | 84 | 97 | +-------------+----------+----------+----------+----------+----------+----------+ | len./char | 165 | 172 | 207 | 61 | 60 | 58 | +-------------+----------+----------+----------+----------+----------+----------+ | finding | +-------------+----------+----------+----------+----------+----------+----------+ | std::map | 295 | 322 | 483 | 100 | 100 | 100 | +-------------+----------+----------+----------+----------+----------+----------+ | first char | 243 | 263 | 460 | 82 | 82 | 95 | +-------------+----------+----------+----------+----------+----------+----------+ | length | 238 | 248 | 292 | 80 | 77 | 60 | +-------------+----------+----------+----------+----------+----------+----------+ | len./char | 184 | 190 | 241 | 62 | 60 | 50 | +-------------+----------+----------+----------+----------+----------+----------+
Download test program.
Brak komentarzy:
Prześlij komentarz