This article is a first answer to a question I recently received: “You’re talking about web application firewalls. I’d like to know how to use the WAF for reverse proxying. … I also might use a firewall as it also contains functions like SQL injection prevention.”
So what’s in a web application firewall?
A web application firewall, also known as a WAF, is a shell around a web application. There are hundreds of threats, and there might be a vulnerability in my application that no one is aware of. There may be debug code left inside my application, or there may be comments containing too much information, like
<!-- now connecting to database, using dbuser/ksjd72h?kasjd -->, and you’ll never know. All of this might help an attacker break the application.
A WAF tries to address several threats in a very streamlined way. There are general approaches like removing comments: no comment is needed, so they all have to go (however, it’s rather common to place scripts in comments to keep them from being displayed, so we have to leave comments containing scripts). There are pattern-based approaches addressing well-known attacks, and there are application descriptions that allow only traffic known not to be malicious.
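To make the comment rule a bit more tangible, here is a minimal sketch in Python. It is not how NetScaler does it internally; the function name strip_comments and the rule "leave everything inside a script element untouched" are my own assumptions for illustration.

```python
import re

# A complete <script>...</script> element; comments inside it are left alone.
SCRIPT_RE = re.compile(r"(<script\b.*?</script\s*>)", re.DOTALL | re.IGNORECASE)
# One HTML comment, matched non-greedily across line breaks.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(html):
    """Remove HTML comments, except those that sit inside script elements."""
    parts = SCRIPT_RE.split(html)
    # re.split with one capturing group alternates: text, script, text, ...
    # so the odd indices hold the script elements.
    return "".join(
        part if index % 2 else COMMENT_RE.sub("", part)
        for index, part in enumerate(parts)
    )


page = (
    "<p>Hi</p><!-- now connecting to database, using dbuser/secret -->"
    "<script><!--\nalert(1)\n//--></script>"
)
cleaned = strip_comments(page)
```

The chatty comment is gone from cleaned, while the script hidden in a comment survives.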
So what’s the difference from a traditional firewall?
First of all, a firewall is an L3/L4 device. As such it’s a packet-based animal. It can, of course, filter malicious packets. But it can’t filter malicious traffic if that traffic is thoughtfully distributed over several packets. A firewall has very limited abilities when it comes to data streams, and data streams are what we face in web attacks.
A NetScaler is – in most deployments – a proxy, so it’s an L7 device. Being a proxy, it handles data streams instead of packets.
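The packet versus stream difference can be sketched in a few lines of Python. This is a toy model under my own assumptions: the signature, the function names and the way the request is split are made up for illustration, and real devices are far more elaborate.

```python
# Hypothetical signature a filter is looking for.
SIGNATURE = b"/etc/passwd"


def packet_filter_detects(packets):
    """Inspect every packet on its own, as a simple packet filter would."""
    return any(SIGNATURE in packet for packet in packets)


def proxy_detects(packets):
    """Reassemble the data stream first, as a proxy does, then inspect it."""
    return SIGNATURE in b"".join(packets)


# One malicious request, thoughtfully split in the middle of the signature.
packets = [
    b"GET /../../etc/pas",
    b"swd HTTP/1.1\r\nHost: www.example.com\r\n\r\n",
]
```

With these two packets, packet_filter_detects(packets) finds nothing because neither packet contains the whole signature, while proxy_detects(packets) sees it in the reassembled stream.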
The negative approach
A WAF always supports a negative approach: well-known malicious traffic gets blocked. Think of an HTTP request for a URL ending with /etc/passwd on a Linux host. We would expose sensitive information if a request like that succeeded. This negative approach is pattern based. A Citrix NetScaler contains lists with hundreds of well-known malicious requests.
We would also block “incomplete” requests, that is, requests lacking the usual header fields (Accept-Language, User-Agent, …). This may block DDoS tools like LOIC!
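A pattern-based deny list could look roughly like the following Python sketch. The three patterns are hypothetical examples I picked for illustration (a request for the Linux password file, a directory traversal and a classic SQL phrase); they are not taken from NetScaler's signature lists.

```python
import re
from urllib.parse import unquote

# Hypothetical deny list; a real product ships hundreds of such patterns.
DENY_PATTERNS = [
    re.compile(r"/etc/passwd$"),             # Linux password file
    re.compile(r"\.\./"),                    # directory traversal
    re.compile(r"(?i)\bunion\s+select\b"),   # classic SQL injection phrase
]


def is_blocked(url):
    """Return True if the URL matches any known-bad pattern."""
    # Decode percent-encoding once so that %2e%2e/ is treated like ../
    decoded = unquote(url)
    return any(pattern.search(decoded) for pattern in DENY_PATTERNS)
```

Anything that matches no pattern passes, which is exactly the weakness of the negative approach: it only knows what it has been told about.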
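A check for such incomplete requests might be sketched like this in Python. Which headers count as "usual" is my assumption here; I only took the two named above, and the function names are hypothetical.

```python
# Headers we expect every real browser to send (an assumed, minimal set).
REQUIRED_HEADERS = {"user-agent", "accept-language"}


def missing_headers(headers):
    """Return the required header names that the request does not carry."""
    # Header names are case-insensitive, so compare them in lower case.
    present = {name.lower() for name in headers}
    return REQUIRED_HEADERS - present


def is_complete(headers):
    """A request is complete if none of the required headers is missing."""
    return not missing_headers(headers)


browser_request = {
    "Host": "www.example.com",
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.5",
}
bare_request = {"Host": "www.example.com"}
```

Here browser_request would pass and bare_request would be rejected. Keep in mind that an attack tool can easily add these headers, so this only stops the lazy ones.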
The positive approach
Unfortunately we don’t know all the risks out there. Even worse, we don’t know all the weaknesses and vulnerabilities of our applications. We would – of course – fix them immediately if we just knew!
However, we know how our application works: we know all the allowed URLs, we know about cookies, and so on. We know the length of URLs, query strings and cookies. So we can describe all allowed traffic and deny all the rest. The positive approach is quite similar to what we do with a firewall: we deny everything and allow only non-malicious traffic. At the same time we will also replace (for example) every single quote (') with a harmless encoded equivalent, which is a very efficient way to prevent SQL injection attacks from occurring.
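Because we know the usual sizes, a first piece of the positive model can be as simple as a few length limits. The following Python sketch uses made-up limits and a hypothetical function name; real values come from watching the application.

```python
from urllib.parse import urlsplit

# Hypothetical limits, learned from observing legitimate traffic.
MAX_URL_LENGTH = 1024
MAX_QUERY_LENGTH = 256
MAX_COOKIE_LENGTH = 4096


def violations(url, cookie_header=""):
    """Return a list of limit violations; an empty list means 'looks normal'."""
    found = []
    if len(url) > MAX_URL_LENGTH:
        found.append("url too long")
    if len(urlsplit(url).query) > MAX_QUERY_LENGTH:
        found.append("query string too long")
    if len(cookie_header) > MAX_COOKIE_LENGTH:
        found.append("cookie too long")
    return found
```

A request with any violation is denied; everything else still has to pass the remaining checks.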
The hard piece of work is finding out what’s going on inside our web application. We have to whitelist all allowed URLs (in NetScaler we call these Start URLs). This may be done using literals or Perl Compatible Regular Expressions (PCRE). In addition we may even deny certain URLs (in NetScaler we call these Deny URLs). We mainly use them if we allow everything inside a certain subdirectory (let’s say ^https?://www\.Example\.com/images) and need to deny access to ^https?://www\.Example\.com/images/secret\.jpg.
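The interplay of allowed and denied URLs can be sketched in Python as follows. The list names and the function is_allowed are hypothetical, and letting a deny match override an allow match is my assumption for this sketch, not a statement about NetScaler's evaluation order.

```python
import re

# Allowed URLs: everything below /images on our example host.
START_URLS = [
    re.compile(r"^https?://www\.Example\.com/images"),
]
# Explicit exceptions inside the allowed area.
DENY_URLS = [
    re.compile(r"^https?://www\.Example\.com/images/secret\.jpg$"),
]


def is_allowed(url):
    """Deny wins over allow; anything not explicitly allowed is rejected."""
    if any(pattern.search(url) for pattern in DENY_URLS):
        return False
    return any(pattern.search(url) for pattern in START_URLS)
```

So https://www.Example.com/images/logo.png passes, the secret picture does not, and neither does any URL nobody thought of.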
An important check is called “cookie consistency”. I once found a cookie containing two fields: Costumer_Number and Logon. Of course I changed it to Costumer_Number=1 and Logon=true. This made me the first customer (and owner) of the shop. I was able to change anything I wanted (prices too), see all customer data and so on. No doubt, that’s a bad mistake inside the web app. What can we do? We could, for example, encrypt these cookies, or add a hash to them to protect them from being tampered with.
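The "add a hash" idea can be sketched with a keyed hash (HMAC) from Python's standard library. This illustrates the principle only, not NetScaler's cookie consistency implementation; the key, the separator and the function names are assumptions of mine.

```python
import hashlib
import hmac

# Hypothetical key; it must stay on the server side and never reach the client.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def _mac(value):
    """Keyed hash over the cookie value."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value):
    """Append the keyed hash to the cookie value before sending it out."""
    return value + "|" + _mac(value)


def verify(signed):
    """Return the original value, or None if the cookie was tampered with."""
    value, separator, mac = signed.rpartition("|")
    if not separator:
        return None
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(mac.encode("utf-8"), _mac(value).encode("utf-8")):
        return value
    return None


cookie = sign("Costumer_Number=4711&Logon=false")
forged = "Costumer_Number=1&Logon=true|" + cookie.rpartition("|")[2]
```

The untouched cookie verifies, while the forged one is rejected because the attacker cannot compute a matching hash without the key.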
Another bad thing is the loss of confidential information. A partner company of mine uses a blog to communicate with customers. They also use the same blog internally, but internal readers may see content tagged with certain keywords that is not available from outside. This company had a top secret project called (let’s say) Middle-Earth. The project was planned to be announced at a big conference some months ahead. Unfortunately one of the guys working on project Middle-Earth forgot to specify the proper keyword (as I do all the time), so a document containing specifications for project Middle-Earth was leaked. A WAF could contain blocking rules (“Content-Type”) listing words (or numbers, or whatever you can specify using RegEx) and preventing documents containing these words from traversing the WAF. The same can be done for credit card numbers (NetScaler supports American Express, Diners Club, Discover, JCB, MasterCard and Visa). So this credit card information can’t leave your web server, even if your application fails or an attacker was able to access your log files. This is a must-have for getting PCI DSS certified.
There are several more things to do. I have just given a short description of possible threats and what a WAF could do for you.
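Filtering outgoing documents could be sketched like this in Python. The keyword pattern, the loose card number pattern and the function names are my own assumptions; I added the Luhn checksum, which payment card numbers use, to avoid blocking every long number, and I am not claiming this is how NetScaler detects cards.

```python
import re

# Hypothetical rule for the secret project name.
BLOCKED_WORDS = re.compile(r"\bMiddle-Earth\b", re.IGNORECASE)
# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number):
    """Luhn checksum: double every second digit, counted from the right."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def should_block(body):
    """Return True if the response body must not leave the web server."""
    if BLOCKED_WORDS.search(body):
        return True
    for match in CARD_CANDIDATE.finditer(body):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return True
    return False
```

A page mentioning Middle-Earth or showing the well-known test number 4111 1111 1111 1111 would be stopped, while an ordinary order confirmation passes.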