One Unified Global Perspective
Communications with a Global Perspective
Home
Intro
Contact Us
Voice over IP
PBX Solutions
Services
Support
Glossary
Open Source
Blog
Forum

WebMail





2008 Mar 23 - Sun

C++ Implementation of Wu Manber's Multi-Pattern Search Algorithm

I'm finding that this algorithm is useful in a number of situations. Out in the real world, it is found in Network Intrusion Scanners, grep engines, as well as text processing.

For me, I'm working on an implementation for a RADIUS server for authorizing and logging calls for a voice over ip service. Some vendor specific attributes don't have a unique numeric identifier, the identifier is simply a string. Being able to do a fast match would be nice in this situation. Currently I'm using C++'s unordered_map to facilitate identification of the string. The Wu Manber algorithm may be a suitable alternative.

On another front, I subscribe to a real time news service. Passing this news service through a Multi-Pattern Search Algorithm would make for a processor efficient method of filtering for news items in which I'm interested.

Wu-Manber stand on the shoulders of their Multi Pattern Search predecessors: Aho and Corasick (with a linear time scanner based upon an automata approach), Commentz-Walter (who combined Aho-Corasick with the Boyer-Moore string search algorithm), and Baeza-Yates (with a slightly different combination of Aho-Corasick and Boyer-Moore-Horspool).

For those who are interested, here is a Visual Studio implmentation of Wu Manber's algorithm. It should be readily transportable to other platforms with little or no change, other the the _tmain construct. The implementation relies on C++ Standard Template Library for various containers. A table driven character comparison approach is used to make it easy to choose between case sensitive and case insensitive matches.

The next step would be turn it into a template to make it useful on various types of alphabets. It all needs to be redone a bit in order to accept a 'functor' or a 'delegate' so that when a match is found, an appropriate routine can be invoked.

There is another paper out there called "An Improved Wu-Manber Multiple Patterns Matching Algorithm" that purports to be faster. The charts in the paper say it is a bit faster. I'm wondering if it is 'faster enough' to be of value implementing.

[/OpenSource/Programming] permanent link



New blog site at: Raymond Burkholder - What I Do

Blog Content ©2013
Ray Burkholder
All Rights Reserved
ray@oneunified.net
(519) 838-6013
(441) 705-7292
Available for Contract Work
Resume

RSS: Click to see the XML version of this web page.

twitter
View Ray 
Burkholder's profile on LinkedIn
technorati
Add to Technorati Favorites



March
Su Mo Tu We Th Fr Sa
           
23
         


Main Links:
Monitoring Server
SSH Tools
QuantDeveloper Code

Special Links:
Frink

Blog Links:
Quote Database
Nanex Research
Sergey Solyanik
Marc Andreessen
Micro Persuasion
... Reasonable ...
Chris Donnan
BeyondVC
lifehacker
Trader Mike
Ticker Sense
HeadRush
TraderFeed
Stock Bandit
The Daily WTF
Guy Kawaski
J. Brant Arseneau
Steve Pavlina
Matt Cutts
Kevin Scaldeferri
Joel On Software
Quant Recruiter
Blosxom User Group
Wesner Moise
Julian Dunn
Steve Yegge
Max Dama

2008
Months
Mar




Mason HQ

Disclaimer: This site may include market analysis. All ideas, opinions, and/or forecasts, expressed or implied herein, are for informational purposes only and should not be construed as a recommendation to invest, trade, and/or speculate in the markets. Any investments, trades, and/or speculations made in light of the ideas, opinions, and/or forecasts, expressed or implied herein, are committed at your own risk, financial or otherwise.