2009 May 03 - Sun
Open Source Site of the Day: ModSecurity -- Open Source Web Application Firewall
mod_security is an actively maintained web application firewall.
From my reading, it looks like it is a filter for processing web requests before they hit a company's main web server.
It performs a series of checks: it validates HTTP headers for correctness, applies common checks to field
content to prevent injection attacks, and, through a command language, can perform some complex analysis within a request as
well as across requests.
It can be used as an in-line or out-of-line appliance, or as a module right on the web server. The company
defines its 'Web Application Firewall' as a reverse proxy with additional security-related features.
It is an adjunct to a firewall, which can only do some basic session state analysis. One slide in a
presentation on the site provides a good summary of its capabilities:
- Monitoring: know what happened
- Detection: know when you are being attacked
- Prevention: stop attacks before they succeed
- Assessment: discover problems before the attackers do
It looks like mod_security is a very good tool for helping web developers protect themselves from things they don't know.
Web developers tend to focus more on content and less on security. This tool helps redress that imbalance.
SANS is a good place to start learning about security.
[/OpenSource/SiteOfTheDay/D200905]
Time Series Analysis on RRD Files
Crist Clark, in a posting on the NANOG mailing list, started an interesting thread on
analyzing network traffic based upon frequency analysis rather than the traditional
time-based analysis. He started the thread by asking about Fourier analysis on
network traffic time series. A number of responses indicated that wavelet analysis
might be the 'more modern' approach. This type of analysis has been used for
network traffic anomaly detection. The responses also indicate that operating systems can be
deduced through analysis of the RTD (Round Trip Delay) of ping-generated traffic.
Crist Clark started the thread with:
Has anyone found any value in examining network utilization numbers
with Fourier analyses? After staring at pretty MRTG graphs for a bit
too long today, I'm wondering if there are some interesting periodic
characteristics in the data that could be easily teased out beyond,
"Well, the diurnal fluctuations are obvious, but looks like we may
have some hourly traffic spikes in there too. And maybe some of those
are bigger every fourth hour."
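To make the question concrete, here is a minimal sketch of the kind of analysis being asked about. It is not taken from the thread: the series is synthetic (a stand-in for MRTG/RRD samples at five minute steps), the function names are my own, and the DFT is the plain O(N^2) textbook form so that nothing beyond the Python standard library is needed.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return |X[k]| for k = 0 .. N//2 using a direct O(N^2) DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags

def dominant_periods(samples, step_seconds, count=2):
    """Return the `count` strongest periods in seconds, ignoring the DC bin."""
    n = len(samples)
    mags = dft_magnitudes(samples)
    ranked = sorted(range(1, len(mags)), key=lambda k: mags[k], reverse=True)
    return [n * step_seconds / k for k in ranked[:count]]

# Synthetic data (an assumption, not real traffic): two days of
# five-minute samples with a diurnal swing and a smaller hourly ripple.
STEP = 300
N = 2 * 24 * 3600 // STEP
series = [
    100.0
    + 40.0 * math.sin(2 * math.pi * t * STEP / 86400)  # diurnal component
    + 10.0 * math.sin(2 * math.pi * t * STEP / 3600)   # hourly component
    for t in range(N)
]

periods = dominant_periods(series, STEP)
print([round(p) for p in periods])
```

On this made-up series the two strongest periods come out as one day and one hour, which are the components that were put in. Real traffic will be far noisier, and a period that does not divide the capture length evenly will smear across neighbouring bins.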
Dave Plonka responded:
Such techniques are used in the area of network anomaly detection.
For instance, a search for "network anomaly detection" at scholar.google.com will yield very many results.
Our 2002 paper, "A Signal Analysis of Network Traffic Anomalies"
[ACM SIGCOMM Internet Measurement Workshop 2002, Barford, et al.], is one such work. We mention that we use wavelet analysis
rather than Fourier analysis because wavelet/framelet analysis is able to localize events both in the frequency and time
domains, whereas Fourier analysis would localize the events only in frequency, so an iterative approach (with varying intervals
of time) would be necessary.
In general, this is the reason why Fourier analysis has not been a common technique used in network anomaly detection.
That work used data stored in RRD files at five minute intervals.
Our subsequent work used data stored at one second intervals, again in RRD files.
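The localization point can be illustrated with a small sketch. This is not the wavelet method from the paper; it is the simpler fixed-window variant, where one DFT bin is tracked across successive windows to see when a periodic component is present. The data is synthetic and the names (`bin_magnitude`, `sliding_bin`, the threshold of 100) are my own choices for illustration.

```python
import cmath
import math

def bin_magnitude(window, k):
    """Magnitude of DFT bin k for one window, after a Hamming taper."""
    n = len(window)
    acc = 0j
    for t, x in enumerate(window):
        taper = 0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1))
        acc += taper * x * cmath.exp(-2j * math.pi * k * t / n)
    return abs(acc)

def sliding_bin(samples, window_len, hop, k):
    """Track one frequency bin over time as (start_index, magnitude) pairs."""
    return [
        (start, bin_magnitude(samples[start:start + window_len], k))
        for start in range(0, len(samples) - window_len + 1, hop)
    ]

# Synthetic data (an assumption): eight hours of one-minute samples with a
# ten-minute oscillation that is present only during samples 240..359.
N = 480
series = [
    50.0 + (20.0 * math.sin(2 * math.pi * t / 10) if 240 <= t < 360 else 0.0)
    for t in range(N)
]

WINDOW = 60  # a period of 10 samples falls in bin k = 60 / 10 = 6
track = sliding_bin(series, WINDOW, hop=60, k=6)
active = [start for start, mag in track if mag > 100.0]
print(active)
```

A single DFT over all 480 samples would show that a ten-minute component exists but not when. The windowed version reports which windows contain it, at the cost of coarser frequency resolution inside each short window.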
Anton Kapela had a couple of messages and a link (look for Kapela):
Indeed, there are. Interesting things emerge in frequency (or phase) space - bits/sec, packets/sec, and ave size, etc. - all
have new meaning, often revealing subtle details otherwise missed. The UW paper [Barford/Plonka et al.] is one of my favorites
and often referenced in other publications.
Along similar lines, I presented a lightning talk at NANOG that demonstrates using windowed FTs (mostly Gaussian or Hamming)
in three-axis graphs (i.e. 'waterfalls') available in common tools (baudline, sigview, labview, etc) for characterizing round
trip times through various network queues and queue states. Unexpectedly, interesting details regarding host IP stacks and OS
scheduler behavior became visible.
I want to suggest that time-windowed FT might be a reasonable middle ground, certainly for Crist's case. Naturally, the
trade-offs will be in frequency accuracy (i.e. longer window) vs. temporal accuracy (i.e.
short window). Another solution for your needs might be cascaded FIR "bandpass" filters, but again, you're subject to
time/frequency error trade-offs as related to a filter's bandwidth.
While you're at it, consider processing your time series data into histogram stacks, or nested histograms. I haven't
specifically seen a paper covering this, but another UW gent (DW, are you reading this?) used to process their 30 second ifmib
data into a raw .ps file, and printed this out weekly/daily. The trends visible here were quite interesting, but I don't think
much further work was done to see if anything super-interesting was more/less visible in this form than traditional ones.
... one point - since packets/bits/etc data is more monotonic than not (math wizards, please debate/chime in) and
since it's not a 'signal' in the continuous sense, you might find value in differentially filtering the input data *before* FT
or wavelet processing. This would serve to remove the weird-looking "DC" offset in the output simply by creating a semi-even
distribution of both positive and negative input sample values.
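The differencing suggestion at the end of that message can be sketched as follows. This is my own illustration on synthetic data, not code from the thread: a series with a large constant baseline is run through a first difference, and the DC bin is compared with the strongest non-DC bin before and after.

```python
import cmath
import math

def dc_and_peak(samples):
    """Return (|X[0]|, max |X[k]| for k >= 1) from a direct O(N^2) DFT."""
    n = len(samples)
    mags = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]
    return mags[0], max(mags[1:])

def first_difference(samples):
    """x[t] - x[t-1]: the output takes values on both sides of zero."""
    return [b - a for a, b in zip(samples, samples[1:])]

# Synthetic data (an assumption): a large baseline plus a small ripple.
N = 128
raw = [1000.0 + 5.0 * math.sin(2 * math.pi * 8 * t / N) for t in range(N)]
diffed = first_difference(raw)

raw_dc, raw_peak = dc_and_peak(raw)
diff_dc, diff_peak = dc_and_peak(diffed)
print(raw_dc > raw_peak, diff_peak > diff_dc)
```

In the raw series the DC bin dwarfs the ripple; after differencing the ripple dominates. One caveat: a first difference is itself a high-pass filter, so it also scales each frequency by a different amount (slow components are attenuated more than fast ones), which changes relative peak heights in the output.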
[/OpenSource/Debian/Monitoring]
Routing Within An ISP
Many ISPs I've seen have had two routing protocols implemented: BGP to talk to the
'internet' with the external /24 and shorter prefixes, and an internal routing protocol such
as EIGRP or OSPF to handle the internal /24 and longer prefixes. The internal protocol
would be running on all ISP devices and would handle all infrastructure devices and customer
links. For a multi-homed ISP, BGP would need to be running on all internal devices that
form internal paths from one external link to another. This provides an ability to choose
an appropriate exit point for any traffic generated from within an ISP destined for the
external network. Some ISPs 'cheat' by generating default routes to the nearest
exit and having BGP reside only on edge devices. Some optimum paths will be missed using
this simplified arrangement, particularly if an ISP is connected to non-transit neighbors.
Current best practices make expanded use of BGP. Internal BGP (iBGP), that is, BGP run
between routers inside the ISP's own AS, is used extensively to carry customer prefixes.
The internal routing protocol, such as OSPF or EIGRP, is used simply for carrying
infrastructure routes such as loopback addresses and link addresses.
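The division of labour can be sketched as a two-step lookup: the BGP table maps a prefix to a next hop (a peer's loopback), and the IGP table only has to know how to reach that loopback. The sketch below is a toy model, not router code. All addresses come from the documentation ranges, and the table contents and interface names are hypothetical.

```python
import ipaddress

# Hypothetical IGP table (OSPF/EIGRP): infrastructure routes only,
# here the loopbacks of two edge routers and the interface toward each.
IGP_ROUTES = {
    ipaddress.ip_network("192.0.2.1/32"): "GigabitEthernet0/1",
    ipaddress.ip_network("192.0.2.2/32"): "GigabitEthernet0/2",
}

# Hypothetical iBGP table: customer prefixes, next hop is a peer loopback.
BGP_ROUTES = {
    ipaddress.ip_network("198.51.100.0/24"): ipaddress.ip_address("192.0.2.1"),
    ipaddress.ip_network("203.0.113.0/24"): ipaddress.ip_address("192.0.2.2"),
}

def longest_match(table, address):
    """Return the most specific prefix in `table` containing `address`."""
    matches = [net for net in table if address in net]
    return max(matches, key=lambda net: net.prefixlen) if matches else None

def resolve(destination):
    """Resolve a destination to an outgoing interface via BGP, then IGP."""
    addr = ipaddress.ip_address(destination)
    bgp_prefix = longest_match(BGP_ROUTES, addr)
    if bgp_prefix is None:
        return None  # no BGP route for this destination
    next_hop = BGP_ROUTES[bgp_prefix]
    igp_prefix = longest_match(IGP_ROUTES, next_hop)
    if igp_prefix is None:
        return None  # next hop unreachable, so the BGP route is unusable
    return IGP_ROUTES[igp_prefix]

print(resolve("198.51.100.25"))
```

The point of the model is that the IGP table stays small no matter how many customer prefixes are added to BGP, since it never holds anything but infrastructure addresses.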
With this arrangement, it is then easy to make use of MP-BGP (Multi-Protocol BGP) to
handle the various requirements for carrying MPLS links.
One presentation at RIPE shows some basics of
BGP Best Practices.
[/Cisco]
64 Bit Data Models
As we move to 64 bit processors, variable types and their widths change. I had
originally thought that there would be a consistent naming convention as one moved from 32
bit programming to 64 bit programming. At a 64 Bit Wiki Entry, I find that such is not
the case. Different compilers choose different data models. For example, the Microsoft VC
compiler uses the LLP64 model, which keeps both int and long at 32 bits and widens only
pointers (and long long) to 64 bits, whereas most Unix-like systems use LP64, where long
grows to 64 bits. This is something that one needs to keep in mind when re-compiling
software created for 32 bit processors in a 64 bit environment.
In the same article, mention is made that it is a good habit to make use of 'ptrdiff_t',
which is declared in <stddef.h>, when subtracting two pointers and using the result.
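A quick way to see which data model a given platform uses is to ask for the widths of the relevant C types. The sketch below does this through Python's `ctypes` module. The helper names are my own, and since `ctypes` has no `ptrdiff_t` type, `c_ssize_t` is used as a stand-in on the assumption that the two have the same width on mainstream platforms.

```python
import ctypes

def c_type_widths():
    """Widths in bits of the C types that distinguish the data models."""
    return {
        "int": ctypes.sizeof(ctypes.c_int) * 8,
        "long": ctypes.sizeof(ctypes.c_long) * 8,
        "long long": ctypes.sizeof(ctypes.c_longlong) * 8,
        "pointer": ctypes.sizeof(ctypes.c_void_p) * 8,
        # Stand-in for ptrdiff_t (assumption: same width as ssize_t here).
        "ssize_t": ctypes.sizeof(ctypes.c_ssize_t) * 8,
    }

def data_model(widths):
    """Name the data model from the (int, long, pointer) widths."""
    key = (widths["int"], widths["long"], widths["pointer"])
    return {
        (32, 32, 32): "ILP32",
        (32, 32, 64): "LLP64",
        (32, 64, 64): "LP64",
        (64, 64, 64): "ILP64",
    }.get(key, "unknown")

widths = c_type_widths()
model = data_model(widths)
print(model, widths)
```

Code that stores a pointer difference in an int works by accident under ILP32 and can truncate under both LP64 and LLP64, which is why a type that tracks the pointer width is the safer habit.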
[/Personal/SoftwareDevelopment]