I was recently hired to set up a web application firewall (WAF) on Citrix NetScaler to protect an SAP Hybris-based e-shop. The shop has content in several languages, so we had to send each visitor to the right home page.
The base URL of the website looked like this: https://shop.domain.com/shop/language/. SSL was optional. I wanted to set the default language based on the browser settings, so I based it on the HTTP header Accept-Language.
There are several predefined language codes. I found a list here.
Reading RFC 3066, I found out that Accept-Language tags always start with the primary language (for example en, de, fr, …) and can optionally be followed by subtags. de-AT, for example, means Austrian German. There may be more subtags, but they were of no interest to me. A typical header looks like this: en-us, en;q=1.0,fr-ca, fr;q=0.5,pt-br, pt;q=0.5,es;q=0.5
The only part I was interested in was the first two letters. en means English, de means German, fr means French, and so on. As easy as that!
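To make the structure of such a header concrete, here is a small Python sketch that splits it into its primary language codes. It is only an illustration and not part of the NetScaler configuration; the function name primary_languages is made up for this example.

```python
def primary_languages(accept_language):
    """Return the primary language code of every entry, in header order."""
    result = []
    for entry in accept_language.split(","):
        # Drop an optional weight such as ";q=0.5" and surrounding spaces.
        tag = entry.split(";", 1)[0].strip()
        if tag:
            # Keep only the part before the first "-" (de-AT -> de).
            result.append(tag.split("-", 1)[0].lower())
    return result


header = "en-us, en;q=1.0,fr-ca, fr;q=0.5,pt-br, pt;q=0.5,es;q=0.5"
print(primary_languages(header))
# ['en', 'en', 'fr', 'fr', 'pt', 'pt', 'es']
```

The first entry is the one the browser lists first, which is why looking at just the first two letters of the header is usually good enough.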
Even though users could change their language code, they usually don’t. There are tools out there, like “change accept-language switcher” for Firefox, but they are hardly used, and when they are, it is for a reason. So this is something I can count on.
So I decided to base my policy on this header. I made up my mind to use the responder feature, although I could also have used rewrite. Whenever a user just opens the base URL (/), I redirect them to /shop/xy/, where xy is the language code.
The Citrix NetScaler function I have to use is
HTTP.REQ.HEADER("Accept-Language"). I am interested in a substring: the first 2 characters, starting at position 0. So I have to add the SUBSTR(x,y) function, where x is the starting position (0) and y is the length of the substring (2).
The RFC recommends lower-case letters for the language code but treats tags as case-insensitive, and RFC stands for “request for comments” anyway: unlike an ANSI or DIN standard, nobody enforces it, so you cannot count on lower case. So I appended .TO_LOWER, just to make sure. My redirect target looks like this:
"/shop/" + HTTP.REQ.HEADER("Accept-Language").SUBSTR(0,2).TO_LOWER + "/".
It worked great. Our UK customers were sent to English content, the French to French, the Swiss to French, Italian or German (depending on their preference), and the Germans to German content; even customers in Liechtenstein got German content (de-LI). It could hardly have been any better. Only Chinese hackers saw a 404 Not Found, as they were redirected to ../zh/, which does not exist at all! Great!
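The behaviour of the redirect expression can be mimicked in a few lines of Python. This is only a sketch of what SUBSTR(0,2).TO_LOWER does to the header value, with a made-up function name redirect_target; it is not how NetScaler evaluates the expression internally.

```python
def redirect_target(accept_language):
    """Build the redirect path from the first two characters of the header."""
    # accept_language[:2] plays the role of SUBSTR(0,2),
    # .lower() plays the role of TO_LOWER.
    return "/shop/" + accept_language[:2].lower() + "/"


print(redirect_target("de-LI,de;q=0.9"))  # /shop/de/
print(redirect_target("EN-GB"))           # /shop/en/
print(redirect_target("zh-CN,zh;q=0.8"))  # /shop/zh/
```

The last call shows the weak spot: any language code ends up in the path, whether the shop has content for it or not.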
My first thought was: who cares about Chinese, we don’t sell outside of the EU (OK, something our British customers will have to keep in mind). But there is a bigger problem than just Chinese: some search engines might use crawlers that do not send this header at all. So what can I do?
We currently support English (en), French (fr), German (de), Italian (it), Dutch (nl) and some more, 13 languages in total.
So I created a list: a pattern set called suported_languages. It is good style to assign index numbers based on importance, so I gave de (German) 10, as it’s my customer’s biggest market, fr (French) 20, and so on. The reason is to save computing time.
add policy patset suported_languages
bind policy patset suported_languages de -index 10
bind policy patset suported_languages fr -index 20
My customer defined a default language, English for example. I then created 2 policies, one for unsupported languages (not in the list) and one for supported ones.
My first policy expression looks like this:
HTTP.REQ.URL.EQ("/") && HTTP.REQ.HEADER("Accept-Language").SUBSTR(0,2).TO_LOWER.CONTAINS_ANY("suported_languages")
This means: the requested URL is / (the empty URL), and (&&) the browser’s preferred language is in my list.
The other policy expression could be quite similar, but with a .NOT at the end, meaning not in my list; however, NetScaler would then have to check all languages again. Alternatively, I could bind the first policy with GOTO END (so no further policies are evaluated once it matches), and the other policy would just check for
HTTP.REQ.URL.EQ("/"). This saves CPU time (we already know: it’s not in the list!)
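The interplay of the two policies can be sketched in Python as follows. The names SUPPORTED_LANGUAGES, DEFAULT_LANGUAGE and choose_redirect are assumptions made for this illustration, and the language list contains only the five codes named above, not all 13.

```python
# Hypothetical stand-ins for the pattern set and the customer's default language.
# Only the five codes named in the text appear here, not the full list of 13.
SUPPORTED_LANGUAGES = ("de", "fr", "en", "it", "nl")
DEFAULT_LANGUAGE = "en"


def choose_redirect(url, accept_language):
    """Return the redirect target for a request, or None if no policy matches."""
    if url != "/":
        return None  # both policies only fire on the base URL

    # Same idea as SUBSTR(0,2).TO_LOWER; a missing header becomes an empty string.
    language = (accept_language or "")[:2].lower()

    # Policy 1: the preferred language is in the list. Returning here plays
    # the role of GOTO END, so policy 2 is never evaluated for this request.
    if language in SUPPORTED_LANGUAGES:
        return "/shop/" + language + "/"

    # Policy 2: only reached when policy 1 did not match, so checking the
    # URL alone is enough. Unsupported or missing languages get the default.
    return "/shop/" + DEFAULT_LANGUAGE + "/"


print(choose_redirect("/", "de-AT,de;q=0.9"))  # /shop/de/
print(choose_redirect("/", "zh-CN"))           # /shop/en/
print(choose_redirect("/", None))              # /shop/en/
```

The early return is what saves the second lookup: once the first check has failed, the fallback does not need to look at the list again.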
A great but simple example of policy expressions!