2010 Aug 18 - Wed
Cygwin, Eclipse and Subversion Installation Notes
I have written several articles about Eclipse (the code editing UI) and it's
integration with subversion. This is an update of a few things to watch out for
with Eclipse, the Helios release. I do development on Linux as well as on Windows. In
this case my primary machine is a Windows machine running VMWare with several
guest Linux systems.
For the Linux systems with a GUI, I've used Cygwin to provide a mechanism of running the
Linux interfaces on my Windows interface. I have tried the VMWare Unity mechanism, but
on my multi-monitor system, it appears clunky and buggy.
When installing Cygwin, the key library to install is the 'xinit' library. This loads
all other necessary X11 libraries. Also include mintty in the Shells category for an
improved console experience.
As a side note for regular terminal operations in Cygwin, the following can be used with
mintty. Start 'ssh-agent mintty'.
mintty is explained at
http://code.google.com/p/mintty/.
Then use ssh-add to add a private key. The public key can be added to the ~/.ssh/authorized_key
files on the destination machines.
Anyway, for getting the GUI experience, use startxwin to start an xwindow terminal window.
Connect to the destination computer with 'ssh -l username -Y ipaddress'. At that point,
I run eclipse with '/usr/sbin/eclip[se/eclipse &'. The '&' forks the process and allows further
operations in the terminal window.
I've got ahead of myself here. To get eclipse installed, I downloaded the binaries from
eclipse.org, expanded them to a directory called eclipse. I then moved the directory to
/usr/sbin. Eclipse can then be started with '/usr/sbin/eclipse/eclipse'.
For version control, the Polaris subversion client is listed as a standard item
in the Collaboration items in the Eclipse New Software. After trying that, I wasn't very
pleased with the experience. It is not well integrated.
Instead, I removed the Polaris Subversive client and installed the Tigris.org
Subclipose Client.
The integration into Eclipse is much better. I used the SVNKit (Pure Java) connector
so as to obtain the svn+ssh://... tunnelling capability with a private key based login.
2009 Oct 14 - Wed
Boost BJam Updated
With the version 1.40 of Boost, library names are decorated differently. To keep the old style library
decorations and naming style, the option "--layout=tagged" should work. So from my
2008/10/10 Boost Build Article, my typical command line should be:
bjam --layout=versioned --toolset=msvc-9.0 variant=debug threading=multi link=static runtime-link=static stage
2009 Jun 17 - Wed
New Release of WTL (Windows Template Library)
I've been able to start on a new project with a clean slate. For the portion residing on Microsoft
Windows, I'm going to give the latest version (v8.1, build 9127) of the WTL (Windows Template Library) for
crafting the GUI side of things. The latest version can be downloaded from
SourceForge,
which looks to have a release date of May 7, 2009.
The release notes don't indicate anything for running with the Visual Studio 2008 IDE. A blog entry at
Code Gem called
WTL Wizard for Visual Studio 2008 indicates that by changing a few references to registry entries in the
Visual Studio 2005 script, one can get the new WTL Wizard to install in the 2008 IDE. He supplies
downloadable source code in the blog entry. Another location for a patch file is located in the
Yahoo Forums.
WTL has in the past been underdocumented. Besides some websites linked from the wtl.sourceforge.net
web site, the WTL group on Yahoo has had a
WTL Developer's Guide posted in Doc and PDF forms.
2008 Nov 01 - Sat
MinHeap: C++ Template Implementation of Minimum Heap Algorithm
I am in the process of developing a trading application based in C++. I've realized that
it would be useful to test code and algorithms off line, before connecting to a
paper-trading system, or even a live scenario.
For code testing and algorithm testing, I need to send the quotes and trades I've
collected through an order fullfilment simualation engine. As the solution I'm developing
depends upon multiple symbols, I need a synchronization process to ensure the simulated
quotes and trades are processed in the same temporal order as I recieved them in real time.
One option for solving this ordering problem would be to simply merge/sort all the
vectors and process the single vector. I didn't like that idea as it wasn't amenable to
incorporating time based events as they are generated by the fulfillment algorithm.
I ended up designing a 'carrier' to hold each vector. In addition to a few other
house-keeping chores, The carrier manages the index into
the vector for the latest datum to be processed. The various carriers are then linked into
another vector, with the vector sorted in ascending date/time order based upon the current
datum pointed to by the carrier.
My first kick at the can for maintaining the vector of carriers was a brute force sift
down approach, with the sifting occuring after each datum was processed. This could turn
out to be O(n*m/2) time where n is the number of carriers and m is the number of total
datums across all carriers.
After looking at performance figures, I figured I could improve on this somewhat. The
MinHeap algorithm looked like a suitable candidate.
The Standard Template Library implementation of the algorithm uses pops and pushes to get
carriers into and out of the vector. This creates too much overhead. I basically needed a
mechanism to maintain minheap order as the carriers are added to the vector, and then
perform an in-place re-minheap when an datum from a carrier was 'consumed'. Once a carrier
is exhausted of all datums in its vector, the carrier is moved to the end of the carrier
vector with an 'archival' call.
Because, during the process of adding carriers to the vector, they are added at the end
and sifted up, once carrier archival starts to occur, no further carriers can be added. A
flag in the code is used to check for this condition.
The code is written for Visual Studio, but with a few minor mods, will hopefully run with
GCC.
There are two template parameters: C, T. T is the basic type of the carrier, and C is used for determining the
type of the operands for the lt operator in the SiftDown and SiftUp functions. I'm sure there is a more elegant way of handling the
storage of pointers in the carrier vector and trying to perform comparisons of the class represented by the pointer.
I havn't take the time to read up on that yet.
The following is the 'lt' operator of the class C:
static bool lt( CMergeCarrierBase *plhs, CMergeCarrierBase *prhs ) { return plhs->m_dt < prhs->m_dt; };
Here is the rest of the code:
#pragma once
#include <vector>
#include <assert.h>
// http://cis.stvincent.edu/html/tutorials/swd/heaps/heaps.html
// http://en.wikipedia.org/wiki/Binary_heap
// http://en.wikipedia.org/wiki/Heapsort
// http://people.cs.vt.edu/~shaffer/Book/C++2e/progs/minheap.h
// http://www.staroceans.com/minmaxHeap1.htm
// http://www.cppreference.com/wiki/stl/algorithm/is_heap is_heap()
template<class T, class C> class CMinHeap {
public:
CMinHeap<T,C>( size_t size );
CMinHeap<T,C>( void );
virtual ~CMinHeap( void );
void Append( T ); // automatic sift up
T RemoveEnd( void );
T GetRoot( void ) { assert( 0 != m_cntActiveItems ); return m_vT.front(); };
void ArchiveRoot( void ); // root item of no further use, swap to end and sift down
void SiftDown( void ) { SiftDown( 0 ); }; // reorder new root after use and change
bool Empty( void ) { return m_vT.empty(); };
size_t Size( void ) { return m_vT.size(); };
protected:
inline size_t Parent( size_t ix ) { return ( ix - 1 ) / 2; };
inline size_t RightChild( size_t ix ) { return 2 * ( ix + 1 ); };
inline size_t LeftChild( size_t ix ) { return ( 2 * ix ) + 1; };
inline bool isLeaf( size_t ix ) { return LeftChild( ix ) >= m_cntActiveItems; };
inline bool hasOneLeaf( size_t ix ) { return m_cntActiveItems == RightChild( ix ); };
inline size_t ixLastItem( void );
void SiftDown( size_t ix ); // from ix downwards, reordering item
void SiftUp( size_t ix ); // from ix upwards towards root, when appending new items
inline void Swap( size_t ix, size_t iy );
private:
std::vector<T> m_vT;
size_t m_cntActiveItems;
bool m_bArchivalStarted; // prevents further Appends
};
template<class T, class C> CMinHeap<T,C>::CMinHeap(size_t size)
: m_cntActiveItems( 0 ), m_bArchivalStarted( false )
{
m_vT.reserve( size );
}
template<class T, class C> CMinHeap<T,C>::CMinHeap( void )
: m_cntActiveItems( 0 ), m_bArchivalStarted( false )
{
}
template<class T, class C> CMinHeap<T,C>::~CMinHeap() {
}
template<class T, class C> size_t CMinHeap<T,C>::ixLastItem() {
assert( 0 < m_cntActiveItems );
return m_cntActiveItems - 1;
}
template<class T, class C> void
CMinHeap<T,C>::Append( T item ) {
assert( !m_bArchivalStarted );
m_vT.push_back( item );
++m_cntActiveItems;
SiftUp( ixLastItem() );
}
template<class T, class C> T CMinHeap<T,C>::RemoveEnd( void ) {
assert( !m_vT.empty() );
T item = m_vT.back();
m_vT.pop_back();
return item;
}
template<class T, class C> void
CMinHeap<T,C>::ArchiveRoot() {
// swap with last item and SiftDown
assert( 0 < m_cntActiveItems );
m_bArchivalStarted = true;
Swap( 0, ixLastItem() );
--m_cntActiveItems;
if ( 1 < m_cntActiveItems ) { // sift only if 2 or more items
SiftDown();
}
}
template<class T, class C> void
CMinHeap<T,C>::Swap( size_t ix, size_t iy ) {
assert( ix < m_cntActiveItems );
assert( iy < m_cntActiveItems );
T tmp = m_vT.at( ix );
m_vT.at( ix ) = m_vT.at( iy );
m_vT.at( iy ) = tmp;
}
template<class T, class C> void
CMinHeap<T,C>::SiftUp( size_t ix ) {
size_t cur = ix;
size_t parent;
while ( 0 != cur ) {
parent = Parent( cur );
if ( C::lt( m_vT.at( parent ), m_vT.at( cur ) ) ) {
break; // done sifting
}
else { // swap and try again
Swap( cur, parent );
cur = parent;
}
}
}
template<class T, class C> void
CMinHeap<T,C>::SiftDown( size_t ix ) {
size_t cur = ix;
while ( !isLeaf( cur ) ) {
if ( hasOneLeaf( cur ) ) {
size_t left = LeftChild( cur );
if ( C::lt( m_vT.at( left ), m_vT.at( cur ) ) ) {
Swap( left, cur );
cur = left;
}
else {
break;
}
}
else { // has two leaves
size_t right = RightChild( cur );
size_t left = LeftChild( cur );
// right side has shorter distance by default for same or greater
bool bGoRight = !( C::lt( m_vT.at( left ), m_vT.at( right ) ) );
if ( bGoRight ) {
if ( C::lt( m_vT.at( right ), m_vT.at( cur ) ) ) {
Swap( right, cur );
cur = right;
}
else {
break;
}
}
else { // test with left
if ( C::lt( m_vT.at( left ), m_vT.at( cur ) ) ) {
Swap( left, cur );
cur = left;
}
else {
break;
}
}
}
}
}
// if we needed to build a heap from pre-assigned vector:
// for (int i=n/2-1; i>=0; i--) siftdown(i);
2008 Oct 10 - Fri
Boost Build
Boost C++ Library has a large number of
interesting modules. Most of the modules are header only templates readily used on many
platforms and many recent version compilers.
A few of the modules, such as Regex and DateTime, need to be compiled into static or
dynamic link libraries. To do so, requires the use of Boost's build tool: bjam.
To build all variations of the libraries, single/multithreaded, debug/release, and
static/dynamic, the command is simple (example for windows):
bjam --toolset=msvc --build-type=complete stage
Finding out how to build a subset of the libraries is a bit harder. The options I use
are:
bjam --toolset=msvc variant=debug,release threading=multi link=static stage
2008 Jul 03 - Thu
Upgrade from Eclipse Europa to Ganymede (with painful Subversion)
Today, I upgraded from Eclipse/CDT Europa to Eclipse/CDT Ganymede.
(CDT meaning C++ Developer Tools). The Eclipse upgrade was painless:
download the Eclipse/CDT package, expand it, and start eclipse from
within the directory. After pointing it to my workspace, everything
was there. Nicely simple.
My subversion client was an entirely different story. For the Europa installation,
everything came from the tigris site and worked well. For the Ganymede installation,
there are now two sites involved, and I'm not sure which is what. I think
the tigris site can now be ignored (for the time being). In the
installation instructions somewhere, one needs to go to the
Polarion site for their client. There
are Eclipse update links there.
However, what I assumed to be workable defaults of using the JavaHL library on
Debian turned out to be non-workable. The solution was not to use the JavaHL
client but use the SVNKit client.
My key problem is that my SVN repository requries an ssh public/private key.
The JavaHL library, if or when it would or wouldn't load, I'm not sure what I was
seeing, but I could only see the option for user name and password authentication.
It would have been nice if the Subversion/Polarion/Eclipse guys would all get
together and make it straight-forward in terms of which libraries from
which sites need to be download. If they imply that the JavaHL libraries should
be downloaded, please make it painless to get the paths set and ensure the binaries
are present. After 15 million lines of code, you'd think that would be a
small task to accomplish.
2008 Jun 18 - Wed
Concurrency aka MultiThreading
A number of my projects are approaching the phase where some of their feature sets
will work better with some form of background processing. In a trading application,
plowing through historical data on thousands of symbols looking for patterns would be
best left to a background task, rather than rendering the user-interface frozen during
the, well, duration. For Radius, with a listener on an accounting port, and one on an
authorization port, the two threads need to coordinate access to resources.
Windows has a native threads API, but that isn't necessarily portable between operating systems.
My trading application is in Windows, and the Radius application in on Linux. Using the same API on both
platforms wuold be cool.
As I already use the Boost tools in their various forms, Boost::Thread would be a good candidate.
The Boost documentation isn't exactly overflowing with examples. Dr. Dobb's Portal
saves the day with an article dating from May 2002 entitled
The Boost.Threads Library.
It has good examples covering the basics:
- Thread Creation (boost::thread)
- Mutexes (boost::mutex), which protects one thread from another
- Condition Variables (boost::condition), which are good for getting data into and out of a thread
- Thread Local Storage (boost::thread_specific_ptr), for keeping thread specific storage separate from other threads
- Once Routines (boost::call_once), for making sure statics are initialized once and only once
For mutexes, protecting code regions can be as easy as declaring a mutex:
boost::mutex CodeProtectionMutex;
And then putting a scoped lock in the code encountered by multiple threads:
{ // some scope some where
// ... some code
boost::mutex::scoped_lock lock(CodeProtectionMutex);
// .. some more code, which is protected by the lock
} // the scope exit, no unlock is required as the destructor does the work
After reviewing the examples, making use of the Boost documentation should be an easier
task. As such, boost::thread documentation should be reviewed anywa as boost::thread
has gone through some changes since that article.
Paul Bridger has also written a
tutorial on multithreading making use of the boost::thread
class. The navigation through the tutorial isn't the greatest, but the content is good.
Making use of boost::thread as a base, Philipp Henkel has written
threadpool. It has been
brought up to date for use with boost v1.35. It provides a dead easy solution to
making use of a limited number of worker threads to carry out tasks:
pool tp(2); //create a 2 thread pool
// Add some tasks to the pool.
tp.schedule(&first_task);
tp.schedule(&second_task);
tp.schedule(&third_task); // this task waits until of the other two completes
In the similar vein to Henkel, Ted Yuan has a boost::thread based
C++ Producer-Consumer Concurrency Template Library.
Zoltán Porkoláb has written an article on
Distributed Programming and Metaprogramming in C++. It has many
examples and goes into some additional examples for boost::thread. His article also
introduces bind and tuples, which are good backgrounds to boost::lambda.
Back to boost for a second. You can't find it from the boost home page, but here is a
good link to
Boost Libraries Listed Alphabetically.
boost::thread is a basic threading library. Going above and beyond multi-threading
grunt work, Intel's
Thread Building Blocks has higher level constructs for getting
multiple threads going. For example, it has a 'for' construct for simultaneously
executing multiple elements of the for statement. Good tutorials and background
information can be read through
Kevin Farnham's Blog.
Building further on the Threading Building Blocks is something of
simulating interest:
go parallel looks to be chock full of content related to multi-core and multi-threaded
programming.
From the theoretical perspective, I came across an HP paper called
Foundations of the C++ Concurrency Memory Model, and written by Hans-J. Boehm and Sarita V. Adve.
I havn't read it all the way through, but at some time, I think the bibiliography may be
a worthy read in itself.
2008 Jun 16 - Mon
Keyword Matching (non-text streams)
In a previous blog article, I presented the beginnings of a Keyword Lookup class. This
article takes that work, turns it into a template and makes it useful for longest match
lookups.
I'll have to compare this lookup library with what a C++ map or unordered_map does for
lookup speed. I'm hoping that when a comparison is made, that this routine is indeed
faster. I'm thinking it might be because, even though a map will do a binary search through
it's map, complete strings are compared at each step. With this library, keyward patterns
are added to a rooted tree of characters, possibly reducing the amount of time spent
mactching.
I must admit that at each match step, there is a linear search performed through a list
of likely character candidates. To imporve the search, I've been thinking that once the
pattern tree has been
created, it could be sorted so that each step search can be done with a binary search. This
will be something for next time.
In the meantime, this routine does work for finding maximum matches. For example when
performing a long distance rate lookup, this will find the most specific rate code from a
list of various length candidates, something which is difficult to do with a map.
This keyword match algorithm is template based. 'class T' is the type to be returned
upon a successful match. It can be an index, a pointer, a number, or anything else
suitable. The class constructor requires an initializer... basically a value to be given
the equivalent meaning of NULL. On no match, this value is returned. Use 'AddPattern' to
add patterns and their associated 'meanings'. Use FindMatch to perform the lookup.
// this is kind of a subset of Aho Corasick algorithm
// only full keyword matching, no text searches
// no on failure coding
#ifndef CKEYWORDMATCH_H_
#define CKEYWORDMATCH_H_
#include <string>
#include <vector>
#include <stdexcept>
#include <iostream>
template<class T> class CKeyWordMatch {
public:
explicit CKeyWordMatch<T>( T initializer, size_t size );
virtual ~CKeyWordMatch(void);
void ClearPatterns( void );
void AddPattern( const std::string &sPattern, T object );
T FindMatch( const std::string &sMatch );
size_t size( void ) { return m_vNodes.size(); };
protected:
T m_Initializer;
struct structNode {
size_t ixLinkToNextLevel; // next letter of same word
size_t ixLinkAtSameLevel; // look for other letters at same location
T object; // upon match, (returned when keyword found)
char chLetter; // the letter at this node
explicit structNode( T initializer ) : ixLinkToNextLevel( 0 ), ixLinkAtSameLevel( 0 ),
object( initializer ), chLetter( 0 ) {};
};
std::vector<structNode> m_vNodes;
private:
};
template<class T> CKeyWordMatch<T>::CKeyWordMatch( T initializer, size_t size )
: m_Initializer( initializer )
{
m_vNodes.reserve( size );
ClearPatterns();
}
template<class T> CKeyWordMatch<T>::~CKeyWordMatch(void) {
m_vNodes.clear();
}
template<class T> void CKeyWordMatch<T>::ClearPatterns() {
m_vNodes.clear();
structNode node( m_Initializer );
m_vNodes.push_back( node ); // root node with nothing
}
template<class T> void CKeyWordMatch<T>::AddPattern(
const std::string &sPattern, T object ) {
std::string::const_iterator iter = sPattern.begin();
if ( sPattern.end() == iter ) {
throw std::invalid_argument( "zero length pattern" );
}
size_t ixNode = 0;
size_t ix;
bool bDone = false;
while ( !bDone ) {
char ch = *iter;
ix = m_vNodes[ ixNode ].ixLinkToNextLevel;
if ( 0 == ix ) { // end of chain, so add letter
structNode node( m_Initializer );
node.chLetter = ch;
m_vNodes.push_back( node );
ix = m_vNodes.size() - 1;
m_vNodes[ ixNode ].ixLinkToNextLevel = ix;
ixNode = ix;
}
else { // find letter at this level
bool bLevelDone = false;
size_t ixLevel = ix; // set from above
while ( !bLevelDone ) {
if ( ch == m_vNodes[ ixLevel ].chLetter ) {
// found matching character
ixNode = ixLevel;
bLevelDone = true;
}
else {
// move onto next node at this level to find character
size_t ixLinkAtNextSameLevel
= m_vNodes[ ixLevel ].ixLinkAtSameLevel;
if ( 0 == ixLinkAtNextSameLevel ) {
// add a new node at this level
structNode node( m_Initializer );
node.chLetter = ch;
m_vNodes.push_back( node );
ix = m_vNodes.size() - 1;
m_vNodes[ ixLevel ].ixLinkAtSameLevel = ix;
ixNode = ix;
bLevelDone = true;
}
else {
// check the new node, nothing to do here
// check next in sequence
ixLevel = ixLinkAtNextSameLevel;
}
}
}
}
++iter;
if ( sPattern.end() == iter ) {
if ( m_Initializer != m_vNodes[ ixNode ].object ) {
throw std::domain_error( "Pattern already present" );
}
m_vNodes[ ixNode ].object = object; // assign and finish
bDone = true;
}
}
}
template<class T> T CKeyWordMatch<T>::FindMatch( const std::string &sPattern ) {
// traverse structure looking for matches, object at longest match is returned
std::string::const_iterator iter = sPattern.begin();
if ( sPattern.end() == iter ) {
throw std::runtime_error( "zero length pattern" );
}
T object = m_Initializer;
size_t ixNode = 0;
size_t ix;
bool bDone = false;
while ( !bDone ) {
char ch = *iter;
ix = m_vNodes[ ixNode ].ixLinkToNextLevel;
if ( 0 == ix ) {
bDone = true; // no more matches to be found so exit
}
else {
// compare characters at this level
bool bLevelDone = false;
size_t ixLevel = ix; // set from above
while ( !bLevelDone ) {
if ( ch == m_vNodes[ ixLevel ].chLetter ) {
if ( m_Initializer != m_vNodes[ ixLevel ].object )
object = m_vNodes[ ixLevel ].object;
ixNode = ixLevel;
bLevelDone = true;
}
else {
ixLevel = m_vNodes[ ixLevel ].ixLinkAtSameLevel;
if ( 0 == ixLevel ) { // no match so end
bLevelDone = true;
bDone = true;
}
}
}
}
++iter;
if ( sPattern.end() == iter ) {
bDone = true;
}
}
return object;
}
#endif /*CKEYWORDMATCH_H_*/
2008 Apr 25 - Fri
C++ Library: ACE RADIUS
In putting together a mechanism for handling authentication and accounting with a
Cisco Voice Gateway, such as the AS5350XM, I came across the
ACE RADIUS Library. It is
self described as a free, open source portable RADIUS stack.
The stack uses ACE_Task for basic network communication.
ACE
is a very good and well respected C++ network
programming framework. I've started doing an few projects with it.
Anyway, ace-radius comes as a zip file, rather than a .gz file, so remember to unzip it
into a directory, otherwise the 'current' directory may get polluted with files.
When building, the ACE library is required, and the environment variable ACE_ROOT needs
to be set to where the ACE_wrappers directory resides. Before compiling a couple of fixes
for the current version of the compiler are required:
- CRadiusPacket.h, line 291: friend class CRadiusAttribute;
- CRadiusClientStack.h, line 109: friend class CRadiusClientConnection;
- CRadiusAttribute.cpp, line 455, 479, 1084, 1097: s/.S_un.S_addr/.s_addr/
- CRadiusClientConnection.cpp, line 65: ACE::set_flags(m_socket.get_handle(), O_NONBLOCK);
Instead of using the existing Makefiles, I copied all the source files from their various
directories into a single directory under Eclipse and compiled them into a single library.
I excluded the Client and Server test directories.
A number of good examples are provided, which show both the client and server sides.
The API is quite clean and useful. Good doxygen documentation is supplied.
I was quite impressed with how easy it was to accept an authentication request, and reply
with an accept or a deny packet.
The Radius server I'm writing is but one component of several inter-communicating network
components for talking with a voice gateway, authenticating and rating calls, collecting
call detail records, passing along realtime call information, and presenting the realtime
call status in a browser.
For the other inter-communicating components, I had them fleshed out with the
ACE_Connector and ACE_Acceptor Frameworks.
I got to the point where I needed to rework my
Wt web server/client with the
ACE Framework.
However, there became a bit of an issue. Wt uses ASIO for its network communications. This
put
me into a quandary. Recent reading is indicating that ASIO, which now comes natively in the
latest Boost release of 1.35.0, is more C++'sh than is ACE. I see that there is also much
overlap in Boost
and ACE, at least from the concepts I might need to use: message passing, threading,
inter-process communications, network communications, etc.
In ACE, I see a strength with its Service Configuration Framework and its Naming Service.
With the distributed components I'm writing, the mentioned frameworks would come in handy
for provisioning and service enablement.
However, at this point in time, I'm thinking of migrating to the ASIO/Boost way of
doing things. I'll put off thinking about the service configuration and naming stuff for
a little later. Hooefully I'll come across something suitable in the meantime.
The ACE framework is flexible and complicated, and something I was willing to negotiate
my way through. But when I see a lot of learning needed to wend my way through the Boost
libraries as well, I think I'll use Boost where I can, and then wrap ACE in something when
forced to work with it.
Which brings me back to the Ace-Radius library. I may be able to port a couple of key
ACE based classes to ASIO and not have to worry about ACE. If not, then I'll set the ACE
based
classes in a thread for acting as a Radius server, run another thread for ASIO to
communicate with my other network objects, and then have the ACE thread forward stuff to the
ASIO thread for communicating with the rest of my infrastructure.
2008 Mar 23 - Sun
C++ Implementation of Wu Manber's Multi-Pattern Search Algorithm
I'm finding that this algorithm is useful in a number of situations. Out in the
real world, it is found in Network Intrusion Scanners, grep engines, as well as
text processing.
For me, I'm working on an implementation for a RADIUS server for authorizing and
logging calls for a voice over ip service. Some vendor specific attributes don't
have a unique numeric identifier, the identifier is simply a string. Being able to do a
fast match would be nice in this situation. Currently I'm using C++'s unordered_map
to facilitate identification of the string. The Wu Manber algorithm may be a suitable
alternative.
On another front, I subscribe to a real time news service. Passing this news service
through a Multi-Pattern Search Algorithm would make for a processor efficient method of
filtering for news items in which I'm interested.
Wu-Manber
stand on the
shoulders of their Multi Pattern Search predecessors: Aho and
Corasick (with a linear time scanner based upon an automata approach), Commentz-Walter
(who combined Aho-Corasick with the Boyer-Moore string search algorithm), and Baeza-Yates
(with a slightly different combination of Aho-Corasick and Boyer-Moore-Horspool).
For those who are interested, here is a Visual Studio implmentation of Wu Manber's
algorithm. It should be
readily transportable to other platforms with little or no change, other the the _tmain
construct. The implementation relies on C++ Standard Template Library for various
containers. A table driven character comparison approach is used to make it easy
to choose between case sensitive and case insensitive matches.
The next step would be turn it into a template to make it useful on various types of
alphabets. It all needs to be redone a bit in order to accept a 'functor' or a 'delegate'
so that when a match is found, an appropriate routine can be invoked.
There is another paper out there called "An Improved Wu-Manber Multiple Patterns Matching
Algorithm" that purports to be faster. The charts in the paper say it is a bit faster. I'm
wondering if it is 'faster enough' to be of value implementing.
2008 Feb 18 - Mon
Redirect STL cout
In a previous article entitled
C++ Override std::cout, std::cerr Streams, I
wrote about some sites I found regarding the redirection of cout
to some user supplied routine. After some fiddling about, I came up with a result that
works for me in Visual Studio 2005 version of C++.
Many of the sites suggest overriding the xsputn function. I did that in conjunction with
buffered output through the setp function. I found that the xsputn function is used for
string delivery, but the user supplied buffer is used when cout formats binary values. I
had to come up with a mechanism to sync the two. The solution was to not over-ride xsputn, only
use the setp function, and rely on overriding the sync function.
The code in the sync override isn't perfect, but it does get the job done. The code
makes use of the
fastdelegate template to issue a 'callback' to code interested in
processing the buffer on each sync. The short coming with this code is that cout inserts a
0x0a into the buffer for each endl, and so the routine accepting the buffer has to scan and
interpret the character appropriately.
By not using setp, the routine becomes unbuffered, and then xsputn becomes necessary. I havn't
tried that scenario yet.
#pragma once
#include <iostream>
using namespace std;
#include "FastDelegate.h"
using namespace fastdelegate;
class CConsoleStream : public streambuf {
public:
CConsoleStream(void);
virtual ~CConsoleStream(void);
typedef FastDelegate2<const char*, streamsize> OnNewStringHandler;
void SetOnNewString( OnNewStringHandler function ) {
OnNewString = function;
}
typedef FastDelegate0<> OnFlushStringHandler;
void SetOnFlushString( OnFlushStringHandler function ) {
OnFlushString = function;
}
protected:
OnNewStringHandler OnNewString;
OnFlushStringHandler OnFlushString;
static const unsigned short BufSize = 1024;
char buf[ BufSize ]; virtual int sync( void );
virtual int_type overflow( int_type meta );
private:
};
#include "StdAfx.h"
#include "ConsoleStream.h"
#include <stdexcept>
CConsoleStream::CConsoleStream(void) { setp( buf, buf + BufSize );
}
CConsoleStream::~CConsoleStream(void) {
}
int CConsoleStream::sync() {
if ( NULL != OnNewString ) OnNewString( pbase(), (int) ( pptr() - pbase() -
1 ) ); if ( NULL != OnFlushString ) OnFlushString();
setp( pbase(), epptr() ); return 0;
}
int CConsoleStream::overflow(int_type meta) {
throw std::runtime_error( "ConsoleStream overflow" );
}
The code was formatted with the javascript found at
C++2HTML. I see there is
GNU Source-highlight 2.8, but I
don't see a web interactive version handy.
2008 Jan 28 - Mon
Symbol Clash Between VC++ oledb.h and Berkeley DB db.h
When attempting to use Berkeley DB4 in a Microsoft Visual Studio 2005 C++ project, the
symbol DBTYPE is found in both Microsoft's oledb.h and Berkeley DB4's db.h. It is really
hard to get rid of oledb.h as it is buried somewhere in the depths of the stdafx.h
precompiled header file.
In one of the forums, a suggestion was made to wrap the Berkely DB4 header file db_cxx.h
in a namespace. That worked somewhat to remove the name clash, but it resulted in a link
error of not being able to find the namespaced symbol in the dll file. I wasn't sure what
was needed to resolve that bit of a naming/link-resolution problem.
Elsewhere, in another forum posting, there was a suggestion of using an #undef DBTYPE.
That didn't appear to work either. I think this is because DBTYPE is a 'typedef' rather
than a simple #define. I suppose I could have tried to change the typedef to a #define.
Instead, I made a copy of oledb.h as oledb.original.h, and modified the oledb.h file. I
changed all references of DBTYPE to MS_DBTYPE and rebuilt the project. The project compiled
and linked fine.
Of course, if I need to use oledb.h in the future, something I doubt very much, but one
never knows, I'll have to figure something out to maintain code compatibility.
I'm sure there is a more elegant solution. If someone has encountered it, please email a
solution to me and I'll post it.
|