Monday, October 20, 2014

Netscaler rewrites for X-Forwarded-Proto

Using X-Forwarded-Proto to tell backend servers if netscaler vservers are terminating http or https.

Seems like an excellent time to learn a bit about netscaler rewrite rules, right?

At a high level, it looks like I want to set an X-Forwarded-Proto header, overwriting any pre-existing value.

Reading through examples, it seems like rewrite policies and rewrite actions have a roughly IF THEN relationship, where the rewrite policy defines the conditional and the rewrite action defines what to do when it matches.

I could then bind these rules to a specific vserver, but as these seemed to be more generically useful, I decided to bind these globally.

So I started with something like
add rewrite action delete_x_forwarded_proto delete_http_header x-forwarded-proto
add rewrite policy check_x_forwarded_proto 'HTTP.REQ.HEADER("x-forwarded-proto").EXISTS' delete_x_forwarded_proto

add rewrite action insert_x_forwarded_proto insert_http_header X-Forwarded-Proto HTTP.REQ.URL.PROTOCOL -bypassSafetyCheck YES
add rewrite policy insert_x_forwarded_proto TRUE insert_x_forwarded_proto

bind rewrite global check_x_forwarded_proto 1
bind rewrite global insert_x_forwarded_proto 100
This did not work.

Watching requests on the server side with tcpdump, I realized that most of the traffic I was interested in had the push (PSH) flag set, and that by filtering on it I could avoid connection build up and tear down noise.
tcpdump -s0 -A 'tcp and (dst port 80 or dst port 443) and tcp[13] & 8 != 0'
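For anyone wondering about the tcp[13] & 8 part: byte 13 of the TCP header holds the flag bits, and 8 is the PSH bit. A quick sketch of the arithmetic in Python (the function name is mine, purely for illustration):

```python
# TCP flag bits as laid out in byte 13 of the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def has_push(flags_byte):
    """Same test as the tcpdump expression 'tcp[13] & 8 != 0'."""
    return (flags_byte & PSH) != 0


print(has_push(SYN))        # False: bare SYN, connection build up
print(has_push(PSH | ACK))  # True: data segment, e.g. the HTTP request
print(has_push(FIN | ACK))  # False: connection tear down
```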
The problem was I was seeing something like this in the requests reaching the server.
X-Forwarded-Proto:
After beating my head against the wall, I realized that because HTTP.REQ.URL.PROTOCOL is extracted from the request URL itself (HTTP.REQ.URL), I was never going to get a useful X-Forwarded-Proto value unless somebody tried to full on proxy through the netscaler vservers with absolute URL requests like so
GET http://www.example.com HTTP/1.1
Ok, so back to the drawing board.

After sifting through a large chunk of the Citrix NetScaler Policy Configuration and Reference Guide I realized I could pivot on CLIENT.SSL.IS_SSL to decide if the client connection was using ssl or not. Not a variable directly containing the protocol itself, but I could work with this.

Since I wasn't setting X-Forwarded-Proto to an extracted protocol value and was instead pivoting on CLIENT.SSL.IS_SSL, I ended up having to define static string insert_http_header actions for http and https. Now, as my previous rewrite actions were all utilizing netscaler variables instead of static values, this led to a wee bit of pain.
add rewrite action test_action insert_http_header test_header test_value
ERROR: Expression syntax error [test_value]
add rewrite action test_action insert_http_header test_header "test_value"
ERROR: Unmatched character ["]
add rewrite action test_action insert_http_header test_header \"test_value\"
ERROR: Unmatched character ["]
add rewrite action test_action insert_http_header test_header "\"test_value\""
 Done
So now my config was looking like so
enable ns feature REWRITE

add rewrite action delete_x_forwarded_proto delete_http_header x-forwarded-proto
add rewrite policy delete_x_forwarded_proto 'HTTP.REQ.HEADER("x-forwarded-proto").EXISTS' delete_x_forwarded_proto

add rewrite action x_forwarded_proto_http insert_http_header X-Forwarded-Proto "\"http\""
add rewrite policy x_forwarded_proto_http !CLIENT.SSL.IS_SSL x_forwarded_proto_http

add rewrite action x_forwarded_proto_https insert_http_header X-Forwarded-Proto "\"https\""
add rewrite policy x_forwarded_proto_https CLIENT.SSL.IS_SSL x_forwarded_proto_https

bind rewrite global delete_x_forwarded_proto 1
bind rewrite global x_forwarded_proto_http 100
bind rewrite global x_forwarded_proto_https 101
And I was seeing appropriate http or https values from my X-Forwarded-Proto headers on the server.

Cool.

Ok, so how do I actually see the current bindings on the netscaler?

show rewrite global was less than useful.
show rewrite global 
1) Global bindpoint: REQ_DEFAULT
 Number of bound policies: 3

 Done
However specifying a type got me where I wanted to go
show rewrite global -type REQ_DEFAULT
1) Policy Name: delete_x_forwarded_proto
 Priority: 1
 GotoPriorityExpression: NEXT

2) Policy Name: x_forwarded_proto_http
 Priority: 100
 GotoPriorityExpression: NEXT

3) Policy Name: x_forwarded_proto_https
 Priority: 101
 GotoPriorityExpression: NEXT

 Done
Now, why is the type REQ_DEFAULT? From Citrix NetScaler Policy Configuration and Reference Guide under Binding a Policy Globally

The type argument is optional to maintain backward compatibility. If you omit the type, the policy is bound to REQ_DEFAULT or RES_DEFAULT, depending on whether the policy rule is a response-time or a request-time expression.

I.e., as my policies all used request-time expressions (HTTP.REQ.HEADER and CLIENT.SSL.IS_SSL), they were all implicitly bound to REQ_DEFAULT.

Getting closer!

Now I went to verify the deletion of pre-existing X-Forwarded-Proto headers behavior. When I sent a request to the vserver with a pre-existing X-Forwarded-Proto header
GET / HTTP/1.0
X-Forwarded-Proto: test
On the server side I was no longer seeing the X-Forwarded-Proto header at all. The spoofed value was gone, but no replacement had been inserted either.

Digging back through Citrix NetScaler Policy Configuration and Reference Guide I finally found my clue under Evaluation Order Within a Policy Bank

If the final Goto in the invoked policy bank has a value of END or is empty, the invocation result is END, and evaluation stops.

I.e., if a bound policy doesn't have an explicit gotoPriorityExpression, END is used. So the netscaler walks the bound policies in priority order, and as soon as one of them matches, it stops processing any further policies unless that binding carries an explicit gotoPriorityExpression such as NEXT.

In this case, the delete_x_forwarded_proto policy was triggering, then using an implicit END to stop processing all further rules. Having an implicit END instead of an explicit END seems a bit hinky, but we can address it by using explicit NEXT gotoPriorityExpressions for our policy bindings.
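To convince myself of the behavior, here is a toy model of a policy bank in Python. It is only a sketch of the evaluation order as I understand it from the guide, not how the netscaler actually implements it, and the tuple layout and function names are made up for illustration.

```python
KEY = "x-forwarded-proto"


def evaluate(policies, headers, is_ssl):
    """Toy model: walk policies in priority order, apply the action of each
    matching policy, and stop after the first match whose goto is END."""
    headers = dict(headers)  # work on a copy of the request headers
    for _priority, rule, action, goto in sorted(policies, key=lambda p: p[0]):
        if rule(headers, is_ssl):
            action(headers)
            if goto == "END":
                break
    return headers


def bank(goto):
    """The three global bindings from above, all using the same goto."""
    return [
        (1, lambda h, ssl: KEY in h, lambda h: h.pop(KEY), goto),
        (100, lambda h, ssl: not ssl, lambda h: h.update({KEY: "http"}), goto),
        (101, lambda h, ssl: ssl, lambda h: h.update({KEY: "https"}), goto),
    ]


spoofed = {KEY: "test"}
print(evaluate(bank("END"), spoofed, is_ssl=True))   # {}
print(evaluate(bank("NEXT"), spoofed, is_ssl=True))  # {'x-forwarded-proto': 'https'}
print(evaluate(bank("END"), {}, is_ssl=False))       # {'x-forwarded-proto': 'http'}
```

The first call is the broken case: the delete policy matches, END kicks in, and the insert policies never run. The last call shows why everything looked fine as long as the client didn't send its own header.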

Final Config Version

enable ns feature REWRITE

add rewrite action delete_x_forwarded_proto delete_http_header x-forwarded-proto
add rewrite policy delete_x_forwarded_proto 'HTTP.REQ.HEADER("x-forwarded-proto").EXISTS' delete_x_forwarded_proto

add rewrite action x_forwarded_proto_http insert_http_header X-Forwarded-Proto "\"http\""
add rewrite policy x_forwarded_proto_http !CLIENT.SSL.IS_SSL x_forwarded_proto_http

add rewrite action x_forwarded_proto_https insert_http_header X-Forwarded-Proto "\"https\""
add rewrite policy x_forwarded_proto_https CLIENT.SSL.IS_SSL x_forwarded_proto_https

bind rewrite global delete_x_forwarded_proto 1 NEXT
bind rewrite global x_forwarded_proto_http 100 NEXT
bind rewrite global x_forwarded_proto_https 101 NEXT
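
For completeness, this is roughly what a backend could do with the header. A minimal sketch assuming a Python WSGI application sits behind the vserver; the function names are hypothetical, and the only value taken from the config above is the header name.

```python
def scheme_from_environ(environ):
    """Return the client-facing scheme as reported by the load balancer."""
    # WSGI exposes the X-Forwarded-Proto request header under this key.
    proto = environ.get("HTTP_X_FORWARDED_PROTO", "").strip().lower()
    if proto in ("http", "https"):
        return proto
    # No usable header: fall back to the scheme of the backend connection.
    return environ.get("wsgi.url_scheme", "http")


def app(environ, start_response):
    """Tiny WSGI app that echoes the scheme the client used."""
    body = ("client scheme: %s\n" % scheme_from_environ(environ)).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Trusting the header like this only makes sense because the netscaler deletes any client supplied value first.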

Monday, October 13, 2014

Keepalive and Healthchecks for Proxied Connections

So, I've been noticing recently that while keepalive connection pools and healthchecks are frequently used for incoming web server connections, they are used much less frequently for outbound proxied connections.

This means that proxied connections are both built up and torn down for every single proxy request, and that the proxied endpoints being used are not checked for availability or performance.
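
To make the build up and tear down cost concrete, here is a small self-contained Python sketch that counts how many TCP connections a local test server accepts with and without connection reuse. Everything in it (class names, the loopback server, the request count) is made up for the demo and has nothing to do with how Apache or Nginx pool connections internally.

```python
import http.client
import http.server
import threading


class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow persistent connections

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


class CountingServer(http.server.ThreadingHTTPServer):
    connections = 0

    def get_request(self):
        # Called once per accepted TCP connection.
        self.connections += 1
        return super().get_request()


def count_connections(n, reuse):
    """Send n GET requests and return how many connections the server saw."""
    server = CountingServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        conn = None
        for _ in range(n):
            if conn is None or not reuse:
                conn = http.client.HTTPConnection("127.0.0.1", port)
            conn.request("GET", "/")
            conn.getresponse().read()
            if not reuse:
                conn.close()
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return server.connections


print(count_connections(5, reuse=False))  # 5
print(count_connections(5, reuse=True))   # 1
```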

Let's fix that.

Keepalive connection pooling for Apache

Example mod_rewrite proxy invocation
RewriteRule ^(.*)$ http://www.example.com:80/$1 [P,L]
Starting with Apache 2.1 the mod_proxy ProxyPass and ProxyPassMatch directives provide implicit keepalive connection pooling.

The secret sauce is described under the mod_proxy ProxyPass directive, but is applicable to both ProxyPass and ProxyPassMatch.

Example mod_proxy ProxyPassMatch proxy invocation with implicit keepalive connection pooling
ProxyPassMatch ^/(.*)$ http://www.example.com:80/$1

Keepalive connection pooling for Nginx

Example ngx_http_upstream_module proxy invocation
upstream example {
    server www.example.com;
}

server {
    location / {
        proxy_pass http://example;
    }
}
Starting with Nginx 1.1.4 the ngx_http_upstream_module keepalive directive can be added for explicit keepalive connection pooling.

Note that for HTTP, the proxy_http_version directive should be set to “1.1” and the “Connection” header field should be cleared.

Example ngx_http_upstream_module proxy invocation with explicit keepalive connection pooling
upstream example {
    server www.example.com;
    keepalive 42;
}

server {
    location / {
        proxy_pass http://example;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

Healthchecking proxy destinations for Apache

Well, this is embarrassing. Apparently mod_proxy and mod_proxy_balancer don't actually support an out of band healthcheck mechanism.

Suck.

The problem of course is that if any of the servers are not responsive, Apache will have to wait for (ProxyPass or ProxyPassMatch) -> connectiontimeout seconds to expire before timing out. connectiontimeout falls back to the Apache TimeOut value, which itself defaults to 300 seconds in Apache 2.2 (60 seconds in 2.4).

Apache mod_proxy will only disable a non-responsive proxy destination if error state is set for the connection, and error state is currently only set on an HTTP 500 or 503 response.

Per https://github.com/apache/httpd/blob/trunk/modules/proxy/mod_proxy.c
if (access_status == OK)
    break;  
else if (access_status == HTTP_INTERNAL_SERVER_ERROR) {
    /* Unrecoverable server error.
     * We can not failover to another worker. 
     * Mark the worker as unusable if member of load balancer
     */      
    if (balancer) {
        worker->s->status |= PROXY_WORKER_IN_ERROR;
        worker->s->error_time = apr_time_now();
    }       
    break;  
}       
else if (access_status == HTTP_SERVICE_UNAVAILABLE) {
    /* Recoverable server error.
     * We can failover to another worker
     * Mark the worker as unusable if member of load balancer
     */      
    if (balancer) {
        worker->s->status |= PROXY_WORKER_IN_ERROR;
        worker->s->error_time = apr_time_now();
    }       
}       
else {  
    /* Unrecoverable error.
     * Return the origin status code to the client. 
     */      
    break;  
} 
On the positive side, once an error state has been set, the server will be failed out for (ProxyPass or ProxyPassMatch) -> retry seconds, 60 seconds by default.

On the negative side, as no preemptive healthchecking is done, each non-responsive server will be returned to service every retry seconds until an error state is again set.

i.e. ongoing intermittent failures.

Healthchecking proxy destinations for Nginx

Example ngx_http_upstream_module proxy invocation
upstream example {
    server www1.example.com;
    server www2.example.com;
}

server {
    location / {
        proxy_pass http://example;
    }
}
The problem of course is that if any of the servers are not responsive, nginx will have to wait for proxy_connect_timeout to expire before trying another server. By default, proxy_connect_timeout is set to 60 seconds.

On the positive side, once a server has hit max_fails (1 by default) within fail_timeout, the server will be failed out for fail_timeout, 10 seconds by default.

On the negative side, as no preemptive healthchecking is done, each non-responsive server will be returned to service every fail_timeout seconds until it again hits its max_fails threshold.

i.e. ongoing intermittent failures.
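
A back-of-the-envelope model of that cycle, in Python. It assumes a backend that is permanently dead, one request per second routed to it whenever it is considered up, and a threshold of a single failure. The function and parameter names are mine; plug in 10 for the Nginx fail_timeout or 60 for the Apache retry.

```python
def failed_requests(duration, retry, request_interval=1):
    """Return the times at which a request is sent to a dead backend that
    is only marked down passively and put back after `retry` seconds."""
    failures = []
    down_until = 0  # time at which the backend is returned to service
    t = 0
    while t < duration:
        if t >= down_until:
            # Backend is considered healthy, so this request is sent to it,
            # fails, and marks the backend down again.
            failures.append(t)
            down_until = t + retry
        t += request_interval
    return failures


print(failed_requests(duration=60, retry=10))  # [0, 10, 20, 30, 40, 50]
```

So with the Nginx defaults, a dead server still eats one request every 10 seconds, and each of those requests has to sit through the connect timeout first.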

The Nginx ngx_http_upstream_module health_check directive can be added for healthchecks. Note that health_check is only available as part of the commercial subscription.

Example ngx_http_upstream_module proxy invocation with healthchecks
upstream example {
    server www1.example.com;
    server www2.example.com;
}

server {
    location / {
        proxy_pass http://example;
        health_check;
    }
}
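
Conceptually, a preemptive healthcheck just means probing each destination out of band and only routing to the ones that answer. A minimal Python sketch of that idea, assuming a plain TCP connect is a good enough probe; this is not how Nginx implements health_check, and the function names and timeout are made up.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_backends(backends, probe=tcp_probe):
    """Probe every (host, port) pair and keep only the responsive ones."""
    return [(host, port) for host, port in backends if probe(host, port)]
```

Run on a timer, something like healthy_backends([("www1.example.com", 80), ("www2.example.com", 80)]) gives the proxy a list of destinations that were reachable a moment ago, so a dead server is dropped before a client request ever hits it.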