Arc Forumnew | comments | leaders | submitlogin
4 points by Pauan 4769 days ago | link | parent

I implemented the Boyer–Moore–Horspool substring-search algorithm in Arc[1].

I then plugged it into my playlist program, and, well... see for yourself:

  ;; Arc posmatch:          38,236 ms
  ;; Boyer–Moore–Horspool:   6,396 ms
Wow. I think this confirms that the biggest bottleneck in my program was posmatch[2].

Keep in mind, though, Boyer-Moore isn't always faster than posmatch[3]: due to the preprocessing time, long pattern strings will cause the algorithm to perform slower.

This only applies when the string is preprocessed multiple times: if it's only used once, then the overhead should be negligible. And if you want to use the same pattern string multiple times, you can preprocess it once and then reuse it:

  (let x (boyer-moore-process "foobarquxcorgenou")
    (boyer-moore-search x "adadadadadadadadadadadad")
    (boyer-moore-search x "adadadadadadadadadadadad")
    ...
    (boyer-moore-search x "adadadadadadadadadadadad"))
In general, though, Boyer-Moore is faster. Especially for multi-string searches, which is what my playlist program does.

---

* [1]: https://github.com/Pauan/ar/blob/nu/lib/boyer-moore.arc

* [2]: http://arclanguage.org/item?id=14204

* [3]: https://github.com/Pauan/ar/blob/nu/notes/timing



2 points by Pauan 4768 days ago | link

New timing link: https://github.com/Pauan/ar/blob/nu/notes/timing.md

-----