Splitting up a NetScaler site using admin partitions

(a nice but partly failed attempt)

Complex web applications may lead to complex NetScaler configurations. And sometimes an administrator may get lost troubleshooting complex websites, especially sites using content switching.

This is an example of a real-world website: the portal page is assembled from several independent web applications. Each application is hosted on a specific group of load balanced servers. There are rewriting policies replacing some content on a website, and there are also rewriting policies on a global basis (and responders, URL transformation, FEO optimization, app firewall, caching, …). Some of the global and some of the server-specific content was not replaced as desired, while other content was. The current configuration confuses the admins, and it also confused me.

Main problem here: I can’t look into the traffic between a content switching and a load balancing vServer, so I can’t see what’s actually going on in there. Second problem: there is a total of 800 rewriting policies. That is too many for me. I can’t keep track of all these policies; I simply don’t remember what they are good for and where they got bound to!

The current solution also used NetScaler MAC based forwarding, but MAC based forwarding had a partly undesired influence on some of the load balancing vServers, and on the NetScaler as a whole, as it blows up the TCP connection tables (by adding MAC addresses to them).

That’s where admin partitions came into focus!

We got admin partitions in NetScaler 11 (10.5e): a possibility to split up a NetScaler into several “virtual” ones. That’s great. I made up my mind to put each load balancing vServer into its own admin partition while leaving the content switching vServer in the default (root) partition.

This is a sketch of the solution I had in mind:

[Image: sketch of the desired layout]

The first big problem I faced: two partitions can’t connect into the same subnet. Being able to do so was a must-have, as I would not have been able to change the current networking and routing configuration in a 10,000+ server data centre without an excessive change process lasting several months. So we stopped here, almost a year ago.

The new version 11.1 offers a feature called partition shared vLan; this seemed to be the solution! So I tried to set up vLan 1 as a partition shared vLan. This was impossible. I guess vLan 1 is not a real vLan at all. It’s not comparable to the rest of the vLans, but I don’t really know.

But I could create a vLan, make it a Partition shared vLan, and bind it to the interface.

Creating vLans

add vLan 1000 -sharing ENABLED -aliasName PartitionShared_vLan

(So we add vLan 1000 with partition sharing enabled. You may skip the alias name, but I always like to add some documentation.)

bind vlan 1000 -ifnum 1/2

(we bind this vLan to the designated interface)
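
To double-check the result before moving on, the vLan can be displayed from the command line. This is only a minimal sketch using the vLan ID and interface from the example above; the exact output differs between builds, but the vLan should show up with interface 1/2 as a member.

```
# run in the default partition
show vlan 1000
```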

Next step: Let’s create the partitions

add ns partition WebServerApp1

(This partition will be used for a webserver of app1, so I’ll call it WebServerApp1)

Open this partition, scroll down to network isolation and click add binding. Then click on VLANS and bind vLan 1000.

bind ns partition WebServerApp1 -vlan 1000

Currently you can’t unbind vLan 1.

I repeat this step for all admin partitions desired. Now I can put all of my load balancing vServers into dedicated admin partitions.
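
Repeating the step from the command line could look like the sketch below. It assumes the `add ns partition` / `bind ns partition` command forms, and the partition names WebServerApp2 and WebServerApp3 are made up for illustration; use whatever names fit your applications.

```
# hypothetical additional partitions, all sharing vLan 1000
add ns partition WebServerApp2
bind ns partition WebServerApp2 -vlan 1000

add ns partition WebServerApp3
bind ns partition WebServerApp3 -vlan 1000

# list all partitions and their bindings
show ns partition

# change into a partition to configure its load balancing vServer
switch ns partition WebServerApp2
```

Everything configured after `switch ns partition` lives inside that partition only, which is exactly the isolation I was after.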

Currently there are several restrictions on NetScaler basic and advanced features in admin partitions:

Restrictions about admin partitions in NetScaler 11.1 build 48.10

(a dash means the feature is not available in admin partitions)

default partition                       | admin partition
SSL Offloading                          | SSL Offloading
Load Balancing                          | Load Balancing
Content Filter                          | -
Rewrite                                 | Rewrite
Authentication, Authorization, Auditing | -
HTTP compression                        | HTTP compression
Content Switch                          | Content Switch
Integrated Caching                      | Integrated Caching
NetScaler Gateway                       | -
Application Firewall                    | -
Surge Protection                        | -
Priority Queuing                        | -
Cache Redirection                       | -
Web Logging                             | Web Logging
RIP Routing                             | RIP Routing
IPv6 Protocol Translation               | IPv6 Protocol Translation
EdgeSight Monitoring (HTML Injection)   | EdgeSight Monitoring (HTML Injection)
AppFlow                                 | AppFlow
ISIS Routing                            | ISIS Routing
AppQoE                                  | AppQoE
Content Accelerator                     | Content Accelerator
vPath                                   | vPath
Reputation                              | -
Sure Connect                            | -
Http Dos Protection                     | -
Global Server Load Balancing            | -
OSPF Routing                            | OSPF Routing
BGP Routing                             | BGP Routing
Responder                               | Responder
NetScaler Push                          | NetScaler Push
Cloud Bridge                            | -
Callhome                                | Callhome
Front End Optimization                  | (missing in GUI)
Large Scale NAT                         | Large Scale NAT
RDP Proxy                               | RDP Proxy
RISE Integration                        | -

A comparison of features may be found here. (Thanks to Balaji for providing this link.)

So some serious features are currently missing in admin partitions! I highlighted some I was interested in. The ones I miss most are App Firewall and Front End Optimization. I would have put these into admin partitions, as they are configured on a per-application basis. I don’t miss Surge Protection, Http Dos Protection and Priority Queuing, as these are handled during connect on the content switching vServer.

This project does not use NetScaler Gateway, so the missing NetScaler Gateway is no problem for me. However, I have missed the chance to isolate NetScaler Gateway in many other projects. NetScaler Gateway is usually governed by other departments, so it should be in a separate admin partition. Our beloved NetScaler will degenerate into a battleground between the application delivery and the network group if we can’t completely isolate it.

I suddenly faced a strange problem (why did it not work?):

Simple: I could not communicate from default partition to WebServerApp1 admin partition. It was a completely impossible thing to do. I tried to send ICMP packets from default to WebServerApp1 admin partition, but without success. Even ARP didn’t work at all.

I started monitoring, both from NetScaler using NSTrace and from a switch board (another restriction here: inside an admin partition, NSTrace is only available from the command line; it does not exist in the GUI).
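
Taking a trace inside the admin partition from the command line can be sketched roughly like this. It is only an outline using the partition name from this post; `-size 0` asks for full packets instead of truncated ones.

```
# change into the admin partition and start capturing
switch ns partition WebServerApp1
start nstrace -size 0

# ... reproduce the problem (e.g. the ping) while the trace is running ...

stop nstrace

# return to the default partition
switch ns partition default
```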

I set up a switch board for monitoring. Pinging from the default partition to 10.0.1.10 (the vServer inside the admin partition), I saw ARP requests going out of NetScaler, but no ARP replies coming back from the admin partition. It was the same the other way round. However, I could ping all IPs of both partitions from an external server (e.g. 10.0.1.100) and vice versa. My networking problems seem to be internal to NetScaler only.

I added a static ARP entry for 10.0.1.10 to the default partition and one for 10.0.1.1 to the WebServerApp1 partition, then tried again. No success.
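
For reference, static ARP entries of this kind are added roughly as sketched below. The MAC addresses are placeholders from the documentation range, not the real ones from my setup, and I assume interface 1/2 as used for the shared vLan; substitute the values of your own appliance.

```
# default partition: static entry for the vServer in the admin partition
# (MAC address is a placeholder)
add arp -IPAddress 10.0.1.10 -mac 00:00:5e:00:53:0a -ifnum 1/2

# admin partition: static entry for the address in the default partition
# (MAC address is a placeholder)
switch ns partition WebServerApp1
add arp -IPAddress 10.0.1.1 -mac 00:00:5e:00:53:01 -ifnum 1/2

# check the ARP table of the current partition
show arp
```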

Sending packets between admin partitions is currently not possible!

I also added virtual MAC addresses to the partition. No success either. There is something spooky going on inside a NetScaler’s internal networking logic making admin partition to admin partition traffic an impossible thing to do.

My current workaround is a router VM based on VyOS. It fixed all of my problems for now. I love my deployment, but I hate this tiny little VM: it should simply not be there!

Comments (and a possible solution) are highly welcome …

10 thoughts on “Splitting up a NetScaler site using admin partitions”

    1. Johannes Norz Post author

      Thanks, Dimitry. I don’t think traffic domains would be helpful as they deal with duplicated IP address spaces. However I’ll give it a try

    1. Johannes Norz Post author

      Thank you, you are right! I just checked the GUI, a shame. It’s not in GUI but seems to be supported via command line. I have seen the same with network traces. I’ll update my blog!

    1. Johannes Norz Post author

      There are very different versions of admin partitions, as admin partitions evolved a lot. Late 10.5 versions supported admin partitions; however, there were no partition shared vLans prior to 11.1.

      You simply have to try out if you can go with adminpartitions in your 10.5

  1. Pingback: Doing Citrix NetScaler trace inside an admin-partition – JustAnotherCitrixBlog

  2. Alexander Ries

    Hello Venkatesh,
    there is one extra release, “10.5e admin partition”, built in parallel to the 10.5e stream. I used it at some customers. It’s usable, but the version has some strange bugs; some of them can be fixed via CLI. So, in my opinion, don’t use it, since 11 exists.

    Cheers

    Alex

  3. NetworkGuy480

    Admin partitions will depend on the underlay network setup.
    Device will not steer traffic within itself internally and partitions should not have overlapping subnets.
    So each partition if needs to talk to an IP on another partition will send the traffic out to Router/L3 switch to which the NetScaler is connected and then it would use the right VLAN to again send back this traffic to the other partition on the same unit.
