2008 May 29 - Thu
Evaluating Inter-Process Communication Frameworks
I'm reposting some comments regarding IPC frameworks that I made to the Boost-Users
mailing list today. They are in response to someone making unsubstantiated remarks regarding the
relative merits of ACE and Boost, and another looking for some substantiated remarks. What
follows are some substantiated remarks, based upon my personal experience with ACE and
several other libraries.
I've started working on a number of distributed system projects. As a consequence, I
started looking for distributed system libraries. References to ACE were most pervasive. I
implemented a number of trial applications with the library. That was after plowing through
relevant sections in the three primary ACE reference books. That was a good learning
experience, if only to find out the various patterns in distributed architecture definition.
I had the inter-process/inter-server communications (which only sent simple stuff) working
well within ACE's Acceptor/Connector framework. ACE has a number of other patterns one can
use. I was really impressed with the fact that the examples I used from the books worked as
advertised, and I was able to bend them to my will.
ACE is based upon an interaction of classes, macros, and templates. One has to spend some
time with the environment in order to become proficient with it. It has a large API, with a
number of lower level APIs upon which higher level APIs are based. For example, the
Acceptor/Connector framework uses constructs described earlier in the books.
Once I had my basic communications going, I realized I needed to get some form of concrete
messaging infrastructure in place. I had an impression that TAO, which is a layer above
ACE, would be quite extravagant to implement, with it being an implementation of the CORBA
specification. I wanted something a little lighter (a whole lot lighter actually).
As I worked through that project, I started hearing about ASIO, indirectly through some
other libraries I was using. ASIO is now a member of Boost. I read a review somewhere that
ASIO is a 'modern' replacement for ACE. If you want to get into real template structures
and Boost oriented philosophy, I'd say that is a valid statement. I'd also say that ASIO is
'more to the point' and more straightforward than ACE, at least for the things I want to
accomplish. But like ACE, ASIO provides only the basic communications infrastructure, with no
real messaging capability, which is what distributed computing is all about. ASIO turned out to
be a little harder to get my head wrapped around as it uses a number of advanced C++ and
Boost related idioms. For a run-of-the-mill C++ programmer, ACE would be better. For
someone who is steeped in the power and obscurity of advanced C++ and is looking to advance
their skill set, ASIO would be better.
I came across RCF - Interprocess communication for C++, which is a
messaging framework riding atop ASIO. Flexible, lightweight, and to the point. I worked
through the examples and things worked as advertised. It has the encryption,
publisher/subscriber, oneway/twoway idioms, and a few other nifty features.
At the same time I was doing that, seemingly coincidentally, I learned a few more interesting
facts. Going into this, I realized that I needed a message dispatcher/proxy, some decent
failover techniques, and some additional event handling for non-IPC related activities.
Someone suggested ICE from www.zeroc.com for an RCF-like solution, but working at a larger
scale. I've heard that the library's originator is someone who spent much time on CORBA
standards and redid the concept without the 'benefit' of committee involvement. I think the
library has all the bases covered in terms of lightweight message handling, dispatching,
resiliency, and higher level distributed processing philosophies. The drawback is that it
will have a steeper learning curve than would an implementation using RCF. I like RCF, but
I think I'm going to have to tilt towards ICE (itself, like RCF, developed and focused
towards C++ in a multiple license environment).
On the non-IPC front, Qt's QCoreApplication looks to be a good substrate on which to build
event driven daemons.
In the end, I think my solutions are going to involve:
- ZeroC's ICE for primary inter-process communications
- Qt QCoreApplication as a base for daemon development (which has built-in stuff for
threads/locks, slots/signals)
- Wt, a C++ based web toolkit for distributed GUI development
- Boost Libraries to fill in all the holes
- a little legacy layer 3/4 ACE in one library I'm using, but with some work, I think I can
convert the ACE stuff to ASIO
2008 May 24 - Sat
A Keyword Matching Algorithm
There are a number of well known algorithms out there for taking in a set of keywords and
matching them against text. Aho-Corasick comes to mind, as does the Wu-Manber algorithm (the latter I've
implemented, and the code resides elsewhere on this site).
For another project, I didn't need something quite so fancy. Actually two projects come to mind. One is that I
have an input comma separated value file which includes stock symbols, a description, and the associated exchange. I
wanted to keep statistics on what is read in on a per-exchange basis. My first kick at the can on this was to implement a
string look up table using
2008 May 23 - Fri
RCF - Interprocess Communications for C++
For a couple of distributed computing projects, I've been trying to come up with a
feasible and easy to use method for making applications talk to each other, whether they be
on the same machine or across a network.
I started off doing some work with Douglas C. Schmidt's ACE: The ADAPTIVE Communication
Environment. I plowed through ACE's three primary programming books to see which part of
the environment would best suit my needs. I ended up implementing a demo with the
Acceptor/Connector framework, just to see how things worked.
I then started thinking about the messaging structure and the event handling structures.
ACE's mixture of macros and classes turned out to be a little overwhelming for what I wanted
to accomplish.
During my stint with ACE, I started to use ASIO, from the Boost libraries. I was first
introduced to ASIO through working with WT: WebToolKit. I used Wt as a
frontend to a VoIP call sign-in server.
The next step in the evolution is to present a real time call summary report to
authorized management as the calls are authenticated, authorized, and accounted for by a
RADIUS server. This means sending call detail messages from the RADIUS server to a central
dispatch server, which then publishes them to active web clients (with the clients written
with Wt).
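The dispatch step described above can be sketched in a few lines. This is only an in-process illustration of the publish/subscribe idea, not RCF's or Wt's API and with no networking; the `CallDetail` fields, the `Dispatcher` class, and the topic strings are all hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical call detail record; the field names are illustrative only.
struct CallDetail {
    std::string caller;
    std::string callee;
    int seconds;
};

// Hypothetical central dispatcher: subscribers register a callback per topic,
// and every published record is handed to each callback on that topic.
class Dispatcher {
public:
    using Handler = std::function<void(const CallDetail&)>;

    // Register a handler; returns the number of handlers now on the topic.
    std::size_t subscribe(const std::string& topic, Handler handler) {
        assert(handler != nullptr);  // an empty callback would throw on publish
        std::vector<Handler>& list = handlers_[topic];
        list.push_back(std::move(handler));
        return list.size();
    }

    // Deliver the record to every subscriber; returns how many were notified.
    std::size_t publish(const std::string& topic, const CallDetail& record) const {
        auto it = handlers_.find(topic);
        if (it == handlers_.end()) return 0;  // nobody is listening
        for (const Handler& handler : it->second) handler(record);
        return it->second.size();
    }

private:
    std::map<std::string, std::vector<Handler>> handlers_;
};
```

In a real deployment each handler would push the record over a connection to a web client rather than run in the same process; the registration and fan-out logic stays the same.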
As Wt uses ASIO for its underlying network communications, and I had read a remark
somewhere that ASIO is the new improved ACE, I started to look into it as the mechanism for
my inter-process communications. I even got a good chunk of messaging infrastructure
written and was about to start testing it when I found it was all for nought.
I came across RCF - Interprocess Communications for C++. It is a library that has been in
development for the last few years by a talented fellow by the name of Jarl Lindrud. The
library has implemented all the stuff that I only dreamed about doing: publish/subscribe,
stream encryption, payload filtering, and any number of other nifty features.
I had a few painful moments in getting the library built. After a couple of messages
back and forth to the author, I realized I was trying to build the whole thing into a static
library rather than using an 'include' technique to get the platform specific files built.
The client and server examples built and ran without a hitch. I must admit that I was
impressed by the examples in the ACE books as well: they compiled and ran with little or no
messing about.
The RCF library is better because it deals with serializing native values back and forth,
something that ACE only accomplishes when you get into the TAO and CORBA levels of the
environment.
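To make "serializing native values back and forth" concrete, here is a minimal sketch of the idea. It is not RCF's serialization framework; the `Writer` and `Reader` classes and the wire layout (32-bit big-endian integers, length-prefixed strings) are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical writer: appends native values to a byte buffer.
class Writer {
public:
    // Store a 32-bit value most significant byte first (an assumed layout).
    void put_u32(std::uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes_.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }
    // Store a string as a 32-bit length followed by its raw bytes.
    void put_string(const std::string& s) {
        put_u32(static_cast<std::uint32_t>(s.size()));
        for (char c : s) bytes_.push_back(static_cast<std::uint8_t>(c));
    }
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
};

// Hypothetical reader: pulls values back out in the order they were written.
class Reader {
public:
    explicit Reader(std::vector<std::uint8_t> bytes) : bytes_(bytes) {}

    std::uint32_t get_u32() {
        assert(pos_ + 4 <= bytes_.size());  // refuse to read past the buffer
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v = (v << 8) | bytes_[pos_++];
        return v;
    }
    std::string get_string() {
        const std::uint32_t n = get_u32();
        assert(pos_ + n <= bytes_.size());
        std::string s;
        for (std::uint32_t i = 0; i < n; ++i)
            s.push_back(static_cast<char>(bytes_[pos_++]));
        return s;
    }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t pos_ = 0;
};
```

Fixing the byte order in the format, rather than copying memory directly, is what lets the two ends run on different machines.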
So now with Boost (which includes ASIO), RCF (which uses ASIO), and Wt (which also uses
ASIO), I think I have all the interprocess tools I need to make my modules talk to each
other. Now I can get on with the meat of my projects.
2008 May 20 - Tue
Confusion by Committee
While reading Rob Weir's An Antic Disposition blog today, I came across a very cogent
observation regarding committees:
I have a theory concerning committees. A committee may have different states, like water has
gas, liquid or solid phases, depending on temperature and pressure. The same committee,
depending on external circumstances of time and pressure will enter well-defined states that
determine its effectiveness. If a committee works in a deliberate mode, where issues are
freely discussed, objections heard, and consensus is sought, then the committee will make
slow progress, but the decisions of the committee will collectively be smarter than its
smartest member. However, if a committee refuses to deliberate and instead merely votes on
things without discussion, then it will be as dumb as its dumbest members. Voting dulls the
edge of expertise. But discussion among experts socializes that expertise. This should be
obvious. If you put a bunch of smart people in a room and don't let them think or talk, then
don't expect smart things to happen as if the mere exhalation of their breath brings forth
improvements to the standard.
The quotation stems from his observations regarding the committee which was stickhandling
Microsoft's OOXML standard through the fast track process. Committees that do things
properly can be better than the sum of their parts, but without proper communication and
time allotments they can turn out to be no better than their weakest link.
2008 May 05 - Mon
Reducing Traffic on High Cost Inter-ISP Links
AquaLab has released an open source plugin for BitTorrent clients, specifically Azureus.
According to AquaLab, the "main goal of this plugin is simple -- to improve download speeds
for your BitTorrent client."
Here is a press release summary I came across from ACM TechNews:
Northwestern University researchers have developed Ono, software that eases
the strain that peer-to-peer (P2P) file-sharing services place on Internet service providers
(ISPs). Ono allows users to efficiently identify nearby P2P users and requires no
cooperation or trust between ISPs and P2P users. Ono, the Hawaiian word for delicious, is
open source and does not require the deployment of additional infrastructure. When ISPs
configure their networks correctly, Ono can improve transfer speeds by as much as 207
percent on average, the researchers say. Ph.D. student David Choffnes, who developed Ono
with professor Fabian E. Bustamante, says Ono relies on a clever trick based on observations
of Internet companies to find nearby computers. Content-distribution networks (CDN), which
offload data traffic from Web sites onto their proprietary networks, power some of the most
popular Web sites in the world, enabling higher performance for Web clients by sending them
to a server close to them. Using the key assumption that the two computers sent to the same
CDN server are near to each other, Ono can identify P2P users close to each other.
This aids two types of communities:
- Users: who can get faster downloads because P2P peers are closer and are therefore
prone to fewer errors and dropouts.
- Service Providers: traffic can be kept off high cost inter-ISP links. With traffic
kept internal, cost savings on carrier links could be realized.
On the negative side though, last mile links get more saturation with higher traffic
densities. If one is on a shared cable modem or a shared wireless access point,
ironically this isn't the best thing that could happen.
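The press release's key assumption, that two peers sent to the same CDN server are probably near each other, can be turned into a simple comparison. The sketch below is one possible reading, not Ono's actual code: it assumes each peer keeps a map from CDN server name to the fraction of lookups that landed there, and compares two such maps by cosine similarity. The `RatioMap` type, the server names, and the threshold are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical per-peer record: CDN server name -> fraction of lookups
// that redirected this peer to that server.
using RatioMap = std::map<std::string, double>;

// Cosine similarity of two ratio maps: close to 1.0 when both peers are sent
// to the same servers in the same proportions, 0.0 when they share none.
inline double cosine_similarity(const RatioMap& a, const RatioMap& b) {
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (const auto& entry : a) {
        norm_a += entry.second * entry.second;
        auto match = b.find(entry.first);
        if (match != b.end()) dot += entry.second * match->second;
    }
    for (const auto& entry : b) norm_b += entry.second * entry.second;
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0;  // an empty map matches nothing
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Treat two peers as likely neighbours when their maps are similar enough.
// The threshold is an assumed tuning parameter, not a published value.
inline bool likely_nearby(const RatioMap& a, const RatioMap& b, double threshold) {
    assert(threshold >= 0.0 && threshold <= 1.0);
    return cosine_similarity(a, b) >= threshold;
}
```

Note that nothing here needs cooperation from the ISP: each peer only observes where the CDN sends it.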
2008 May 03 - Sat
Multi Touch Screens
In a recent issue of Technology Review, there is an article regarding
Open Source Multi Touch Displays.
The technology is based upon taking an acrylic sheet and projecting video onto
the back surface. Around the edges are some infrared light
emitting diodes focussed to emit light into the sheet. The light bounces around on the
inside from surface to surface.
When someone touches the panel, the light path is interrupted. An infrared sensitive
camera on the back side can then be used to distinguish the touch locations. Simple and
effective touch technology.
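How the camera image might be turned into touch locations can be sketched as follows. This is an illustration only, assuming each touch shows up as a bright blob in a grayscale infrared frame; the `Touch` struct, the `find_touches` function, and the brightness threshold are hypothetical, and a real system would add filtering and calibration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result: centroid of one bright blob, in pixel coordinates.
struct Touch {
    double x;
    double y;
    std::size_t pixels;
};

// Scan a row-major grayscale frame, group bright pixels into 4-connected
// blobs with a flood fill, and report the centroid of each blob.
inline std::vector<Touch> find_touches(const std::vector<std::uint8_t>& frame,
                                       std::size_t width, std::size_t height,
                                       std::uint8_t threshold) {
    assert(frame.size() == width * height);
    std::vector<bool> visited(frame.size(), false);
    std::vector<Touch> touches;
    for (std::size_t start = 0; start < frame.size(); ++start) {
        if (visited[start] || frame[start] < threshold) continue;
        double sum_x = 0.0, sum_y = 0.0;
        std::size_t count = 0;
        std::vector<std::size_t> stack(1, start);
        visited[start] = true;
        while (!stack.empty()) {
            const std::size_t p = stack.back();
            stack.pop_back();
            const std::size_t x = p % width, y = p / width;
            sum_x += static_cast<double>(x);
            sum_y += static_cast<double>(y);
            ++count;
            // Queue a neighbour if it is bright and not yet claimed by a blob.
            auto visit = [&](std::size_t q) {
                if (!visited[q] && frame[q] >= threshold) {
                    visited[q] = true;
                    stack.push_back(q);
                }
            };
            if (x > 0) visit(p - 1);
            if (x + 1 < width) visit(p + 1);
            if (y > 0) visit(p - width);
            if (y + 1 < height) visit(p + width);
        }
        const double n = static_cast<double>(count);
        touches.push_back(Touch{sum_x / n, sum_y / n, count});
    }
    return touches;
}
```

Because every blob is reported separately, several fingers on the sheet at once come out as several entries, which is the multi-touch part.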
If someone could marry Lightfactory's new virtual layout generator with a multitouch board,
suddenly lighting design and control would take on a whole new dimension.
Perhaps even using the multitouch capability on the dance floor would introduce a
whole new level of dance lighting interaction.
2008 Apr 25 - Fri
Latent Brain Power
An article or two ago, I made a brief mention of MapServer in relation to
throwing together a mixture of data types regarding Bermudian Visual Features.
I was thinking a little later on that this exercise becomes one of building a
spatial/temporal complex of meanings. I then got to thinking about this visually. What if
one could take a slider or a bounding box and zoom in on a part of the island, and then zoom
around in time space. It would be interesting to see what the hot spots were, and what they
were about. It would become what could be described as a space/time based Wikipedia for
Bermuda, or any location for that matter. Information is one thing, but navigating it and
relating it is another matter entirely.
Something like this would only be possible through the Collective Intelligence of users.
The article mentions that many many people have contributed many many hours to making
Wikipedia the huge compendium that it is.
But the article goes on to say that there are still many many people out there who have
more time on their hands than they know what to do with. Lots of people have hobbies, do
public service, take care of families, etc. But how many more vegetate on the
couch in front of the 'one eyed monster' known as the TV?
This reminds me of the fact that there must be millions of computers out there sitting
idle, wasting energy, waiting for something to do. Instead of illegitimately using these free
cycles to spew forth harmful spam, what if we could harness them for cataloguing, or
storage, or analysis, or ...
Seagate just sold its billionth hard drive. If we take a billion drives times a billion
bytes each (probably a woefully inadequate estimate), that is 10^18 bytes, or about an
exabyte, of data, and probably underutilized at that.
It is also said that we, as humans, utilize less than ten percent of our brain capacity.
And if less than ten percent of the population is mentally active (doing something other
than passively watching preprogrammed images pass through their retinas into the black hole
of vicarious experience), that represents lots of wasted capability for enhancing humanity.
Robert Heinlein, in one of his science fiction stories, suggested that if we took the top
one percent of mankind and moved them off world to start new digs, what remained would be
unable to take care of themselves in any organized fashion. Not that we are very good at it
as it is.
Anyway, on a positive note, the article seems to think that things might be improving by
saying:
Just as people "woke up" during the Industrial Revolution, society is now beginning to
emerge from its sitcom-induced stupor to see its cognitive surplus as an asset rather than a
crisis. As a result, people are turning to Web 2.0 technologies as an outlet for that
brain-power surplus.
With appropriately designed interaction tools, we have a
reasonable hope for carving out enough of ... the collective goodwill of the citizens to
create a resource you couldn't have imagined existing five years ago. This isn't the sort of
thing that society grows out of. It is something that society grows into.
I'm liking what I am hearing.
2008 Mar 27 - Thu
I DOS'd Myself (created a Slashdot effect without Slashdot)
I wrote a short article comparing ODF with OOXML and posted it on DZone. It ended up being
linked up on reddit. Then I got a bunch of traffic. Way too much traffic for my
poor inefficient Blosxom-based server to handle. It is time to upgrade. Sorry about that folks.
2008 Mar 26 - Wed
C++ Custom Containers and Iterators
I'm using the HDF5 File System for holding
time series information. Rather than writing my own binary search implementation to find
particular elements within a particular saved time series, I thought it would be
clever if I designed the interface so I could use the Standard Template Library's 'find'
algorithm. If I can make the STL's 'find' work, then all the other algorithms should work
just as well, and thus I'll have an
easy mechanism to access time series with very little programming involved.
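As a rough illustration of the idea, here is a minimal sketch of a custom container with a random access iterator that the standard algorithms accept. It does not touch HDF5: a std::vector stands in for the stored dataset, and `Sample`, `TimeSeries`, and `find_time` are hypothetical names. One point worth noting is that std::find itself is a linear scan; the binary-search counterpart in the standard library is std::lower_bound, which the sketch uses on time-sorted data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical record: one timestamped value.
struct Sample {
    long time;
    double value;
};

// Hypothetical container; a std::vector stands in for the stored dataset.
class TimeSeries {
public:
    class const_iterator {
    public:
        // Member types the algorithms look up through std::iterator_traits.
        using iterator_category = std::random_access_iterator_tag;
        using value_type = Sample;
        using difference_type = std::ptrdiff_t;
        using pointer = const Sample*;
        using reference = const Sample&;

        const_iterator() = default;
        const_iterator(const TimeSeries* series, difference_type index)
            : series_(series), index_(index) {}

        reference operator*() const { return series_->at(index_); }
        pointer operator->() const { return &series_->at(index_); }
        reference operator[](difference_type n) const { return series_->at(index_ + n); }

        const_iterator& operator++() { ++index_; return *this; }
        const_iterator operator++(int) { const_iterator old = *this; ++index_; return old; }
        const_iterator& operator--() { --index_; return *this; }
        const_iterator operator--(int) { const_iterator old = *this; --index_; return old; }
        const_iterator& operator+=(difference_type n) { index_ += n; return *this; }
        const_iterator& operator-=(difference_type n) { index_ -= n; return *this; }

        friend const_iterator operator+(const_iterator it, difference_type n) { return it += n; }
        friend const_iterator operator+(difference_type n, const_iterator it) { return it += n; }
        friend const_iterator operator-(const_iterator it, difference_type n) { return it -= n; }
        friend difference_type operator-(const const_iterator& a, const const_iterator& b) {
            return a.index_ - b.index_;
        }
        friend bool operator==(const const_iterator& a, const const_iterator& b) { return a.index_ == b.index_; }
        friend bool operator!=(const const_iterator& a, const const_iterator& b) { return a.index_ != b.index_; }
        friend bool operator<(const const_iterator& a, const const_iterator& b) { return a.index_ < b.index_; }
        friend bool operator>(const const_iterator& a, const const_iterator& b) { return a.index_ > b.index_; }
        friend bool operator<=(const const_iterator& a, const const_iterator& b) { return a.index_ <= b.index_; }
        friend bool operator>=(const const_iterator& a, const const_iterator& b) { return a.index_ >= b.index_; }

    private:
        const TimeSeries* series_ = nullptr;
        difference_type index_ = 0;
    };

    explicit TimeSeries(std::vector<Sample> samples) : samples_(std::move(samples)) {}

    // Element access by position; a file-backed version would read from disk here.
    const Sample& at(std::ptrdiff_t i) const {
        assert(i >= 0 && static_cast<std::size_t>(i) < samples_.size());
        return samples_[static_cast<std::size_t>(i)];
    }
    const_iterator begin() const { return const_iterator(this, 0); }
    const_iterator end() const {
        return const_iterator(this, static_cast<std::ptrdiff_t>(samples_.size()));
    }

private:
    std::vector<Sample> samples_;
};

// Binary search by timestamp; assumes the samples are sorted by time.
// Returns end() when no sample has exactly that timestamp.
inline TimeSeries::const_iterator find_time(const TimeSeries& ts, long t) {
    auto it = std::lower_bound(ts.begin(), ts.end(), t,
                               [](const Sample& s, long key) { return s.time < key; });
    return (it != ts.end() && it->time == t) ? it : ts.end();
}
```

The begin() and end() members plus the iterator's member types are the whole contract; once they exist, std::find_if, std::lower_bound, std::distance and the rest work without further code.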
I can find any number of web sites containing information on how to work with C++'s
standard containers and iterators. When it comes to finding information on custom
containers and iterators, the information is not quite so plentiful.
The first article I came across was one from TechRepublic called
Extending the C++ STL with custom containers. It didn't quite have the meat I was
expecting.
Bjarne Stroustrup's book, The C++ Programming Language, does have a section on
iterators and a section on containers. In retrospect, they are quite good introductions
to the concept, but I didn't feel the examples were as informative as I would have liked.
Microsoft's MSDN has an article called
C++ and STL: Take Advantage of STL Algorithms by Implementing a Custom Iterator, but
this article only covers the custom iterator side of things; it doesn't discuss how it would
interact with a custom container.
Dr. Dobb's inherited an article entitled
Custom Containers & Iterators for STL-Friendly Code:
A pair of approaches for creating custom containers from the March 2005 issue of C/C++
Users Journal. Some code extracts are included but there are some pieces missing, such as
the begin() and end() methods and how they are put together. The link in the article to the
original code no longer works. However, I did find that I have the Dr. Dobb's Developer
Library DVD Release 4. On it resides the full example code. That was much more
informative.
Now that I have a better understanding of what I'm looking for, I see that the
STL compliant container example has some useful information. In the same vein,
CodeProject has another example:
An STL compliant sorted vector.
Finally, I came across Ulrich Breymann's book called Designing Components with the C++
STL. It provided all the necessary background to pull it all together. I always thought
there was more to it, but custom containers and iterators may not be so hard after all.
Once I have the code finished, I'll try to have it posted one way or another.
2008 Mar 25 - Tue
How Not To Form a Standard
Rob Weir has a blog called An Antic Disposition where he discusses
The Disharmony of OOXML.
The eloquent centerpiece of his article is a table representing how various applications represent a simple
text string with one word in red, represented here verbatim:
| Format | Text Color | Text Alignment |
|---|---|---|
| OOXML Text | <w:color w:val="FF0000"/> | <w:jc w:val="right"/> |
| OOXML Sheet | <color rgb="FFFF0000"/> | <alignment horizontal="right"/> |
| OOXML Presentation | <a:srgbClr val="FF0000"/> | <a:pPr algn="r"/> |
| ODF Text | <style:text-properties fo:color="#FF0000"/> | <style:paragraph-properties fo:text-align="end" /> |
| ODF Sheet | <style:text-properties fo:color="#FF0000"/> | <style:paragraph-properties fo:text-align="end"/> |
| ODF Presentation | <style:text-properties fo:color="#FF0000"/> | <style:paragraph-properties fo:text-align="end"/> |
Some wag once mentioned that a standard is nice: you have so many from which to choose. The standards
writers for OOXML must have had this in mind when they allowed the diversity of Text Coloration and Alignment into
the standard. Oh, wait. The applications were written first, then some general bucket was designed to hold
the output these applications produced.
As the writer says, it would have been nice to create a 'single standard' and then retrofit the applications' output
to conform to the file format. If an application needs to store it differently internally, so be it, but conform
to some level of operability in the file format. Hmmm, can each application read each other's handiwork? If not, what
good is a standard?
The article indicates that once ODF was established, OpenOffice changed to match the standard. And from the table
above, we can see that all the tools within OpenOffice conform, with the result that the twin goals of true
universality of information interchange and simplicity of software design have been reached.
That would be a high standard for OOXML to achieve.