A Twitter DomXss, a wrong fix and something more

A Twitter DOM Xss

It seems that twitter new site, introduced some issue resulting in a worm exploiting a stored Xss.
They also added some new JavaScript in their pages which I casually saw while searching in the html for the worm payload.

The code was the following :
//<![CDATA[ (function(g){var a=location.href.split("#!")[1];if(a){g.location=g.HBR=a;}})(window); //]]>

Do you spot the issue?
It search for "#!" in the Url and assign the content after that to the window.location object. And it is present in (almost?) every page on twitter.com main site.

According to DOM Xss Wiki the location object is one of the first objects identified for being dangerous as it is both a source and a sink.

In fact the DOM Based Xss will be triggered by simply going to:

http://twitter.com/#!javascript:alert(document.domain);

as shown in the following screenshot:

Very simple and effective.
After spotting the issue, I sent an email to twitter warning them about it but not without apologizing for finding it in the middle of worm spreading.

The response was quite funny because, even if the issue was very straightforward, they cannot reproduce it because of safari antiXss filter.

Obviously I checked on every browser but Safari ... and guess what ? Safari blocked it! We'll talk about it later.

So I told them that it worked on Firefox, Chrome and Opera and after that they confirmed the issue, thanked me and so long. No more mails.

A Wrong Fix (To Replace or not to Replace )
Thanks Gaz for the title.

After some hours, I found the following fix:

var c = location.href.split("#!")[1]; if (c) { window.location = c.replace(":", ""); } else { return true; }

What's wrong with that?

Data Validation. 'c' is not validated as directed, that means every character but ":" is allowed. Data validation is about limiting the set of every possible input to an expected subset. Question is do we need to allow everything but one char?
BlackListing. It is widely known that blacklisting could lead to bypasses if it is done loosely.
No output encoding is applied. Since location assignment calls the URL Parser the context is quite known and have it's own metacharacters and structure. encoding in the URLParser context is also known as URLEncoding.
The use of replace ... let's see the doc (ECMA Specification):

[...]
String.prototype.replace (searchValue, replaceValue)
[...]
If searchValue is not a regular expression, let searchString be ToString(searchValue) and search
string for the first occurrence of searchString. Let m be 0.
[...]

The analysis tells that the fix is wrong, and in fact is possible to bypass the replace by doubling the colon ':' char.

http://twitter.com/#!javascript::alert(document.domain);

See the '::' ?
The replace just deletes the first occurrence of ':' so we just add two ':'.
It has also the drawback to bypass several client side filters, Safari included.

So I wrote again to twitter:

Hey,
that is not correct!

(function(g){var
a=location.href.split("#!")[1];if(a){g.location=g.HBR=a.replace(":","");}})(window);

will allow:

twitter.com/#!javascript::alert(1)

see the :: ?
I'd suggest you to urlencode a
or
if it breaks things use a whitelist of allowed chars before going to assign a to the location.

Another fix could be using:
location.pathname=a
or
location.search=a

which at least let you stay on the same domain (not sure it works on every browser), but I don't know if it's ok for twitter.

It's not a easy task, as usual :)

Also, please, send me an email when the fix is done, cause I don't want to set a cron job to get when the fix is deployed.
...

This morning I found the following fix (no email from them though):
(function(g){ var a=location.href.split("#!")[1]; if(a){ g.location=g.HBR=a.replace(":","","g"); } })(window);

Which resolves the multiple colons attack, but, IMHO, it is not really correct because of what I've said previously.

The Safari Filter Bypass (the something more part)

As a side consequence of the twitter DOM Xss I found myself playing with the Safari anti Xss filter.
It seems that it tries to find a match between the payload used in the assignment to location and the values in the Url in browser location bar.
After checking a bit in order to understand the behavior of the filter, I figured out that it urldecode the Url and then search for the pattern.
The problem comes because of that.
In fact since the + is replaced to a space character, then
twitter.com?#!javascript:1+alert(1)

becomes:
twitter.com?#!javascript:1 alert(1)
which obviously won't ever match the needle:
"javascript:1+alert(1)"

And there you have the bypass.

Update (24/09/2010)

Twitter finally set a working patch to the second wrong fix (see comments).
(function(g){var a=location.href.split("#!")[1];if(a){g.location=g.HBR=a.replace(/:/gi,"");}})(window);

Still not the best,IMHO, but at least it works...well, until there will be a bypass.
Also, since the patch just blocks ':' still remains an arbitrary redirect issue.

twitter.com#!//attacker.ltd/with/a/page/similar/to/twitterlogin/page

Update (25/09/2010)

As it was to be expected, there is a bypass (already public) which works on IE8 (~26% market share).
I found it yesterday independently by Gareth Heyes and Yusuke Hasegawa and reported to Twitter security team.
The bypass takes advantage of the html entity version of ':' which is : or :.
Internet Explorer 8, unlikely other browsers, when finds an entity converts it to its original value when it is assigned to the location object.
location="&x#58;"
will let the browser to go to ':' and not to literally "&x58;"
So, when the patch tries to replace ':' to an empty value, it won't find it, but the assignment to the location will convert it again to a colon.

twitter.com#!javascript&x58;alert(1)
is still valid (not in blacklist).

Finally, after writing a new mail to twitter security team, they came with a good defensive patch:
(function(g){var a=location.href.split("#!")[1];if(a){window.location.hash = "";g.location.pathname = g.HBR = a;}})(window);
As I suggested in my first email.
What happens here is that the assignment is performed on the correct attribute (the pathname) so that it is parsed in the right context with no possible bypasses to force a new URI scheme.
Now everything *should* be really ok... well, if all browsers will behave in the right way!