ModSecurity Breach

ModSecurity Blog

Impedance Mismatch and Base64

A recent blog article stated that ModSecurity can be bypassed by adding invalid characters to Base64 encoded data. This is somewhat correct, but I am not sure I'd call it a bypass. It is really an "Impedance Mismatch", because the outcome depends on the decoder your app is using. PHP's decoder ignores characters outside the Base64 alphabet (RFC-2045), while ModSecurity does what Apache does for HTTP Basic Auth and does not allow the extra characters (RFC-4648).

The article's example is roughly this:

1) Take an attack string: <script>alert(1)</script>
2) Base64 encode it to: PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==
3) Now add an illegal character: P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==
4) Notice that most decoders will not work, but PHP's will (act surprised)

PHP will apparently just skip the invalid characters (RFC-2045), so a rule like this (article's example, not mine) will of course fail to catch the modified payload:

SecRule ARGS:b64 "alert" "t:base64decode,log,deny,status:501"

The Base64 decoder in ModSecurity follows the RFC-4648 variant of Base64, and there are many other variants. As it turns out, you need to know a bit more about the platform your app is built on, and the trivial rule above is just not going to cut it.
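To make the mismatch concrete, here is a minimal Python sketch. It does not use PHP or ModSecurity; it assumes that Python's `base64.b64decode` with `validate=False` is a fair stand-in for a decoder that skips invalid characters, and that `validate=True` is a fair stand-in for a decoder that refuses them. The helper names `decode_lenient` and `decode_strict` are made up for this illustration.

```python
import base64
import binascii

payload = b"<script>alert(1)</script>"
clean = "PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="
dirty = "P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="  # same data plus one illegal "."


def decode_lenient(value: str) -> bytes:
    """Stand-in for a skip-invalid decoder: non-alphabet characters are discarded."""
    return base64.b64decode(value, validate=False)


def decode_strict(value: str):
    """Stand-in for a strict decoder: return None if any non-alphabet character appears."""
    try:
        return base64.b64decode(value, validate=True)
    except binascii.Error:
        return None


print(decode_lenient(dirty))  # b'<script>alert(1)</script>'
print(decode_strict(dirty))   # None
```

The same input yields the attack string for one decoder and nothing for the other, which is the whole problem: a rule that inspects the strict result never sees what the lenient application sees.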

For PHP and possibly others you will need to go a little further and validate the character set first using a positive rule. Something like this is going to be required for the article's example:

SecRule ARGS:b64 "!^[A-Za-z0-9\+/]*={0,2}$" \
  "phase:2,t:none,log,deny,status:403,msg:'Invalid Base64 Encoding'"
SecRule ARGS:b64 "alert" \
  "phase:2,t:none,t:base64decode,log,deny,status:403,msg:'Badness in b64'"

And now you get some better coverage:

# For PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==
ModSecurity: Access denied with code 403 (phase 2). Pattern match "alert" 
at ARGS:b64. [file "test.conf"] [line "3"] [msg "Badness in b64"] 
[hostname "myhost"] [uri "/foo"] [unique_id "S8-4-X8AAQEAACGOJcoAAAAA"]

# For P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==
ModSecurity: Access denied with code 403 (phase 2). Match of 
"rx ^[A-Za-z0-9\\\\+/]*={0,2}$" against "ARGS:b64" required. 
[file "test.conf"] [line "1"] [msg "Invalid Base64 Encoding"] 
[hostname "myhost"] [uri "/foo"] [unique_id "S8-5BX8AAQEAACJSKBYAAA@i"]
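The logic of the two rules above can also be sketched outside ModSecurity, for example to unit-test the idea before writing rules. This is only an approximation in Python: the function name `inspect_b64` is hypothetical, the messages are copied from the rules, and treating a padding error as an invalid encoding is my assumption rather than documented ModSecurity behavior.

```python
import base64
import binascii
import re

# Same character class as the first rule; fullmatch() supplies the ^...$ anchoring.
B64_CHARSET = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def inspect_b64(value: str):
    """Return the message of the check that would fire, or None if the value passes."""
    # Check 1 (positive rule): only the Base64 alphabet plus up to two "=" is allowed.
    if B64_CHARSET.fullmatch(value) is None:
        return "Invalid Base64 Encoding"
    try:
        decoded = base64.b64decode(value, validate=True)
    except binascii.Error:
        # Assumption: a value that cannot be decoded (bad length or padding) is also invalid.
        return "Invalid Base64 Encoding"
    # Check 2: look for the attack pattern in the decoded bytes.
    if b"alert" in decoded:
        return "Badness in b64"
    return None


for candidate in ("PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==",
                  "P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="):
    print(candidate, "->", inspect_b64(candidate))
```

The order matters: the character set is validated first, so the content check only ever runs on input that every decoder should interpret the same way.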

Though I am picking on PHP a bit here, the same may hold anywhere you have decoders/parsers that accept out-of-the-norm data. You really do have to know your apps to write targeted rules like the example in this article. You cannot detect encodings like Base64 generically, and I would not expect to find a rule like this in a generic rule set such as ModSecurity's CRS.

Edited: Added details on which RFCs I was referring to and removed blame on PHP after further investigation as it really is just an issue with multiple variants of base64.


You're stating that companies should not rely on modsecurity b64 decoder and that they should add a positive rule?

It seems to me that is a choice for them, just as it is a choice to implement controls in the application after the payload has been decoded at the correct layer.

Why not simply give more flexibility to the WAF b64decoder? It shouldn't be that hard since b64 decoding is simple.

I am not saying that exactly.

Let me reverse that on you. Should companies rely on the broken PHP b64 decoder?

The base64 decoder in ModSecurity was not really put there to normalize data (like t:lowercase, t:removeWhitespace, etc). It was put there to do real parsing, such as looking into the Basic Auth header, etc.

From my view (rule writer hat on), it is actually more interesting to see that someone tried to use an invalid character than to detect what they were trying to decode. Mostly this is due to the base64 example itself. Not many apps out there are going to be using an arg (or other field) that is *always* base64 encoded. More likely it will be base64 in some field that is normally plain text and they are trying to hide something. In the latter case, you cannot reliably use t:base64Decode anyhow, as there is no way to decode just the base64 bits in a larger string. So, in a case where t:base64Decode can be used as in your example, it is better to know whether invalid encodings were attempted than to interpret the data in a loose manner just to detect the contents.

If I add the ability to skip over invalid characters like PHP is doing, then it opens up another bypass for decoders that will ignore all text preceding the invalid character. In other words, catering to one bad decoder will just lead to breakage elsewhere. I think it is better to just document these issues and say "PHP has a broken decoder, you will need to work around it like this...".
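A small Python sketch of that trade-off follows. Both decoders are hypothetical and written only for this illustration: `decode_skipping` drops invalid characters the way PHP is described as doing, and `decode_after_last_invalid` models the other kind of decoder mentioned above, one that ignores all text preceding the invalid character. The crafted input is my own example, not one from the article.

```python
import base64
import re

NON_ALPHABET = re.compile(r"[^A-Za-z0-9+/]")  # note: "=" is treated as padding and removed


def _decode(chars: str) -> bytes:
    """Re-pad a run of alphabet characters to a multiple of four and decode it."""
    return base64.b64decode(chars + "=" * (-len(chars) % 4))


def decode_skipping(value: str) -> bytes:
    """Hypothetical decoder A: drop every invalid character, decode the rest."""
    return _decode(NON_ALPHABET.sub("", value))


def decode_after_last_invalid(value: str) -> bytes:
    """Hypothetical decoder B: ignore everything up to the last invalid character."""
    tail = value.rstrip("=").rsplit(".", 1)[-1]  # "." is the only invalid character used here
    return _decode(tail)


# Two filler characters before the "." shift the bit alignment for decoder A.
crafted = "AA" + "." + "PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="

print(b"alert" in decode_skipping(crafted))   # False: A sees misaligned garbage
print(decode_after_last_invalid(crafted))     # B sees the original attack string
```

So a WAF that adopted decoder A to match PHP would be blind against an application that behaves like decoder B, which is the breakage described above.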

Additionally, there are some flavors of Base64 where a "." *is* valid (look at the variant used for XML name tokens). So, which flavor do I choose for ModSecurity? It would have to be flexible, the rule writer would have to choose, and we are back to the same issue of having to know what the decoder is using.
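The "which flavor" problem is easy to show with a variant that Python ships in its standard library. The sketch below does not use the XML name token variant mentioned above; it swaps in the URL-safe alphabet from RFC-4648 section 5 (which replaces "+" and "/" with "-" and "_") purely as an illustration. The helper name `strict_standard_decode` is made up.

```python
import base64
import binascii

raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode()          # standard alphabet: "+/8="
url_safe = base64.urlsafe_b64encode(raw).decode()  # URL-safe alphabet: "-_8="


def strict_standard_decode(value: str):
    """Strict decode with the standard alphabet; None if the value is rejected."""
    try:
        return base64.b64decode(value, validate=True)
    except binascii.Error:
        return None


print(standard, url_safe)
print(strict_standard_decode(url_safe))             # None: "-" and "_" are invalid here
print(base64.urlsafe_b64decode(url_safe) == raw)    # True: valid for the other flavor
```

A character that is an attack indicator under one flavor is ordinary data under another, so the decoder choice cannot be made without knowing the application.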

What ModSecurity should be doing in all of its parsers is setting flags when things go badly. The multi-part parser does this now. Then you can just check for these flags in rules to know when there may be an issue where a bypass was attempted.
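As a rough picture of what such flags could look like for Base64, here is a Python sketch. It is not ModSecurity code and not an existing ModSecurity feature; the `B64Result` type, its field names, and `decode_with_flags` are all hypothetical.

```python
import base64
import binascii
import re
from dataclasses import dataclass

STRICT_FORM = re.compile(r"[A-Za-z0-9+/]*={0,2}")


@dataclass
class B64Result:
    data: bytes          # best-effort decoded bytes (empty if decoding failed)
    invalid_chars: bool  # True if the input strayed outside the strict form
    decode_failed: bool  # True if even the lenient decode gave up


def decode_with_flags(value: str) -> B64Result:
    """Decode leniently, but record anything suspicious instead of hiding it."""
    invalid = STRICT_FORM.fullmatch(value) is None
    try:
        data = base64.b64decode(value, validate=False)
        failed = False
    except binascii.Error:
        data, failed = b"", True
    return B64Result(data=data, invalid_chars=invalid, decode_failed=failed)


result = decode_with_flags("P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==")
print(result.invalid_chars, result.data)
```

With this shape a rule writer gets both signals at once: the decoded content for pattern matching and a flag saying the encoding itself was abnormal.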

You made some good and interesting points.

Companies always rely on something that is broken and not written by them. They also rely on a WAF to control their input.
PHP is not the only one; I'm collecting some Java implementations too.

I agree with you that it is more important to know whether there's an invalid character in the encoded B64 string.
My post is more about awareness. And I'm glad there's a bit of discussion about the topic.
You know better than me how sys admins are..:)
So why not give customers both strictB64Decode and B64Decode?

It is not a matter of '.'; it is a matter of (.mario (c)):
PHNjcml«ÔÑÒÉŽ!!!» wdD5hbGVydCgxKTwvc2NyaXB0Pg==

That said, I got your points and I thank you for explaining your reasons and giving suggestions. I just hope sysadmins will use them.

Just FYI, it is not only PHP.

E.g., org.apache.commons.codec.binary.Base64 is as flexible as PHP.
From a comment in the code:

Discards any characters outside of the base64 alphabet, per
the requirements on page 25 of RFC 2045 - "Any characters
outside of the base64 alphabet are to be ignored in base64
encoded data."

I know that RFC 2045 covers MIME Base64 decoding/encoding, but just to be sure, which RFC did the ModSecurity developers implement?

Not ModSecurity developers, but rather Apache developers. ModSecurity just uses Apache APR-Util's base64 decoder, which I thought was the more generic form, RFC-3548, but I need to check again. Both RFC-3548 and the revised version of that (RFC-4648) state:

"Implementations MUST reject the encoded data if it contains characters outside the base alphabet when interpreting base-encoded data, unless the specification referring to this document explicitly states otherwise."

Looking at this more, though, I am re-reading HTTP RFCs and see that the HTTP Basic Auth is supposed to be RFC-2045 (MIME), not RFC-3548/4648 and that would constitute the "unless" clause above. Hmm, so maybe there is more to this. Oh the irony that PHP's flexibility may actually be correct here, heh!

In any case, that is enough gray area to make me edit this post to not blame PHP so much, heh ;)

I'll continue looking at it and make a note of it in the next ModSecurity release. It may be enough to warrant two different base64Decode flavors as you suggested...
