wonko.com

Hi! I'm Ryan Grove: Sorcerer at SmugMug, lover of movies, eater of pie, connoisseur of awesome.

There's more to HTML escaping than &, <, >, and "

A few days ago I tweeted:

If I had a dollar for every HTML escaper that only escapes &, <, >, and ", I'd have $0. Because my account would've been pwned via XSS.

This was exaggeration for effect—there aren’t many cases where a simple XSS injection could actually empty a bank account—but I wanted to make a point.

By some coincidence, I’ve found myself working with various open source projects recently that take a half-assed approach to HTML escaping. It’s something that tends to be implemented as an afterthought, which is unfortunate because it can be critical for the security of users of these projects. I won’t name any names in this post (pull requests are forthcoming), but I will explain some of the common problems I’ve seen, why they’re problems, and what can be done to fix them.

This post is not an introduction to HTML escaping. It assumes that you already know what HTML escaping is and why it’s necessary. This post also is not a comprehensive catalog of XSS vectors; the examples here are illustrative, but they certainly aren’t the only attacks you need to worry about. The intent of this post is to explain some dangers that you may not be aware of, and to encourage you to read more about them and write safer code.

Note that this post only discusses escaping, which is something entirely different from (and far less complicated than) sanitizing. HTML sanitization is a topic for another time.

Escaping < and > isn’t enough

The worst HTML escaper I’ve seen in a major open source project only escapes the < and > characters. This may actually be worse than not escaping anything at all, since it gives the illusion of security, but is trivial to defeat.

For example, let’s say I have the following template, and I’m going to replace the placeholder values, indicated in [square brackets], with HTML-escaped user input:

<a href="/user/[username]">[username]</a>

The attacker enters foo" onmouseover="alert(1) as their username. End result, even after escaping:

<a href="/user/foo" onmouseover="alert(1)">foo" onmouseover="alert(1)</a>

Because the " character wasn’t escaped and the attacker’s input was used in an attribute value, the attacker was able to inject arbitrary attributes and therefore JavaScript (which, in a real XSS attack, would probably be something more harmful than an alert).

This is a classic example of making input safer in one context—in this case, as the content of an <a> element—without considering the other contexts in which it’s likely to be used, such as inside an attribute value.
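To make the failure concrete, here is a minimal Python sketch of that attack. The names escape_angle_brackets, TEMPLATE, and render are hypothetical, invented for this illustration; they stand in for the kind of escaper described above and don't come from any particular project.

```python
def escape_angle_brackets(value):
    """Deliberately inadequate escaper: only handles < and >."""
    return value.replace("<", "&lt;").replace(">", "&gt;")


# Hypothetical template with double-quoted attribute values.
TEMPLATE = '<a href="/user/{username}">{username}</a>'


def render(username):
    """Fill the template with the "escaped" username."""
    return TEMPLATE.format(username=escape_angle_brackets(username))


attack = 'foo" onmouseover="alert(1)'
print(render(attack))
```

The printed markup contains a live onmouseover attribute, because nothing in the input was a < or > and so nothing was escaped.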

Escaping &, <, >, and " isn’t enough

The characters &, <, >, and " are the ones most commonly targeted by HTML escaper implementations. This seems to be the minimum set of characters that people think need to be escaped. Unfortunately, it’s still not safe if you don’t have complete control over where the escaped values will be used.

Consider the following template, in which the template author has used single-quoted attribute values:

<a href='/user/[username]'>[username]</a>

This is exploitable using the same attack as the previous example, but with single quotes instead of double quotes: foo' onmouseover='alert(1):

<a href='/user/foo' onmouseover='alert(1)'>foo' onmouseover='alert(1)</a>

You may be saying, “But I always use double quotes to quote attribute values!” Are you also the only person who will ever use your HTML escaper? And are you immune to typos?
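Here is a rough Python sketch of the difference one extra character makes. The helper names escape_basic_four and render_single_quoted are hypothetical; html.escape is the Python standard library function, which also escapes both quote characters by default.

```python
from html import escape  # standard library; escapes & < > " and ' by default


def escape_basic_four(value):
    """Hypothetical escaper covering only &, <, >, and "."""
    return (value.replace("&", "&amp;")
                 .replace("<", "&lt;")
                 .replace(">", "&gt;")
                 .replace('"', "&quot;"))


def render_single_quoted(username, escaper):
    """Fill a single-quoted template using the given escaper."""
    safe = escaper(username)
    return "<a href='/user/{0}'>{0}</a>".format(safe)


attack = "foo' onmouseover='alert(1)"
broken = render_single_quoted(attack, escape_basic_four)
fixed = render_single_quoted(attack, escape)

print(broken)  # the injected attribute survives
print(fixed)   # the single quotes are turned into character references
```

This only addresses quoted attribute values. It is not a claim that escaping five characters is sufficient in general.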

Escaping &, <, >, ", and ' isn't enough

This is the character set covered by PHP’s ubiquitous htmlspecialchars function (when it’s called with the ENT_QUOTES flag), and as you may have guessed, it still falls down on attribute values for two reasons.

First, as Hacker News users DanBlake and nbpoole pointed out in a discussion of this blog post, Internet Explorer treats ` as an attribute delimiter. It may be an edge case, but it’s still a potential attack vector, so ` needs to be escaped too.

Second, HTML also allows attribute values to be completely unquoted. Believe it or not, unquoted attribute values are fairly popular (some people are too lazy to quote them, others are performance zealots who can’t bear the thought of wasting those extra bytes).

Unquoted attribute values are one of the single biggest XSS vectors there is. If you don’t quote your attribute values, you’re essentially leaving the door wide open for naughty people to inject naughty things into your HTML. Very few escaper implementations cover all the edge cases necessary to prevent unquoted attribute values from becoming XSS vectors.

Escaping &, <, >, ", ', `, , !, @, $, %, (, ), =, +, {, }, [, and ] is almost enough

All those characters up there (including the space character!) can be used to break out of an unquoted HTML attribute value. If you escape every last one of them, then you’re probably pretty close to being safe. But you’re still not so safe that you can just start throwing around user input willy nilly.

Why? Because this still doesn’t cover some context-specific cases like inserting user input into the body of an inline <script> element or using user input as part of a URL.
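For the unquoted attribute case specifically, one conservative approach is to stop enumerating dangerous characters and instead encode everything that isn't an ASCII letter or digit as a numeric character reference. The sketch below assumes that approach; escape_attr_aggressive is a hypothetical name, and this is an illustration rather than a vetted implementation (for example, it makes no attempt to reject control characters).

```python
import string

# Only ASCII letters and digits pass through untouched.
_SAFE_CHARS = frozenset(string.ascii_letters + string.digits)


def escape_attr_aggressive(value):
    """Encode every character outside _SAFE_CHARS as &#xHH;."""
    return "".join(
        ch if ch in _SAFE_CHARS else "&#x{:X};".format(ord(ch))
        for ch in value
    )


attack = "foo onmouseover=alert(1)"
print(escape_attr_aggressive(attack))
```

The output is ugly, but the space, =, and parentheses can no longer end the attribute value. Quoting your attributes is still the better fix.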

Context is key

If you haven’t figured it out already, the primary message I’m trying to convey here is that you must be aware of the context in which you’re working with user input. Some contexts are more susceptible to attack than others, and there’s no single magic escaping bullet that will protect you or your users in all cases.

In other words, you don’t need to escape everything all the time, but you do need to escape everything that’s important in the particular contexts in which you’re displaying user input.
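One way to keep context front and center is to have a separate, clearly named function per context rather than a single do-everything escaper. The following Python sketch assumes four contexts; all of the for_* function names are hypothetical, while html.escape, urllib.parse.quote, and json.dumps are real standard library calls. Treat it as a starting point, not a complete defense.

```python
import json
from html import escape
from urllib.parse import quote


def for_html_text(value):
    """Element content: only &, <, and > need escaping here."""
    return escape(value, quote=False)


def for_quoted_attribute(value):
    """Quoted attribute value: also escapes " and '."""
    return escape(value, quote=True)


def for_url_path_segment(value):
    """One URL path segment inside a quoted href: percent-encode, then HTML-escape."""
    return escape(quote(value, safe=""), quote=True)


def for_inline_script(value):
    """JavaScript string literal inside an inline <script> element.

    JSON-encode the value, then replace <, >, and & with \\uXXXX escapes
    so the text "</script>" can never appear in the output.
    """
    return (json.dumps(value)
                .replace("<", "\\u003c")
                .replace(">", "\\u003e")
                .replace("&", "\\u0026"))


print(for_url_path_segment('foo" onmouseover="alert(1)'))
print(for_inline_script("</script><script>alert(1)</script>"))
```

The same input comes out differently in each context, which is the whole point: an escaper that is correct for one of them can be useless in another.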

But there’s still one more wrench to throw into the works…

Always specify a charset, or UTF-7 will eat your face

Even if you do everything else right, serving a page that doesn’t explicitly specify a character set can leave Internet Explorer users open to XSS, thanks to the way IE sniffs out the charset when it isn’t specified.

If an attacker is able to get your page to echo back something that looks like UTF-7 encoding early enough in the page, they may be able to trick IE into rendering the page using UTF-7. This could turn the following seemingly harmless input…

+ADw-script+AD4-alert(1)+ADw-/script+AD4-

…into something potentially harmful:

<script>alert(1)</script>
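You can see why escaping alone doesn't help here with a few lines of Python. The sketch below uses the standard library's utf-7 codec and html.escape; the variable names are just for illustration.

```python
from html import escape

payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# Read as ASCII, the payload contains nothing an HTML escaper would touch.
as_ascii = payload.decode("ascii")
untouched = escape(as_ascii) == as_ascii

# Read as UTF-7, the very same bytes become markup.
decoded = payload.decode("utf-7")

print(untouched)
print(decoded)
```

The escaper did its job correctly on the bytes it saw. The problem is that the browser and the escaper disagree about what those bytes mean.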

I recommend specifying a UTF-8 charset in both the Content-Type HTTP response header and a <meta> tag, since it’s easy for one or the other to get switched off or omitted inadvertently as a codebase ages (this has happened to me).
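As a rough illustration of doing both, here is a minimal WSGI application in Python. The app function and the PAGE markup are hypothetical examples, not code from any real project; the point is simply that the charset appears in the response header and again in the document itself.

```python
# Hypothetical page; the <meta> tag declares the charset inside the document.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Example</title>
</head>
<body><p>Hello</p></body>
</html>
"""


def app(environ, start_response):
    """Minimal WSGI app that declares UTF-8 in the Content-Type header too."""
    body = PAGE.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

If one of the two declarations gets dropped in a later refactor, the other is still there to keep the browser from guessing.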

Further reading

As I mentioned in the disclaimer at the top of this post, this is not a comprehensive reference of all the things that can go wrong with HTML escaping. It’s not even a guide. It’s more of a tip-of-the-iceberg preview. Please don’t assume that, having read this post, you now know everything there is to know about HTML escaping. I can guarantee that you don’t, because I don’t.

I learned a lot from the following sources, and I highly recommend them if you’re interested in learning more:

Larch 1.1.0 released

Version 1.1.0 of Larch is now available.

Larch is a command line tool for copying email quickly and reliably from one IMAP server to another. It’s smart enough not to copy messages that already exist on the destination server, robust enough to pick up where it left off when interrupted, and also capable of more advanced operations like syncing message flags from the source server to the destination server. Larch is particularly well-suited for copying messages to, from, and between Gmail and Google Apps accounts.

Installing

Larch is a Ruby application and requires Ruby 1.8.6 or higher (1.9.2 is recommended). Once Ruby is installed, install or upgrade Larch via RubyGems:

gem install larch

What’s new

This release is the culmination of over a year of bug fixes and feature development. Larch is now faster and more reliable than ever. The most significant new features are:

  • Mailbox and message state information is now stored in a local SQLite database, which allows Larch to resync and resume interrupted operations much more quickly without having to rescan all messages.
  • You can now provide configuration options via a config file, so you don’t need to pass all options on the command line. This also allows the use of named session configs, so running larch gmail-to-yahoo will use the config options defined under the gmail-to-yahoo section in the config file.
  • Yahoo! Mail IMAP is now supported when connecting to imap.mail.yahoo.com or imap-ssl.mail.yahoo.com, even for non-pro accounts. Note that Yahoo! doesn’t officially support the use of IMAP by non-pro accounts, so this feature should be considered experimental.
  • Performance and reliability have been improved significantly. Many bugs have been fixed and many workarounds have been added for misbehaving servers.

This is only a partial list of changes. For the complete list, see the HISTORY file. For more details on how to use Larch’s new features, see the comprehensive README.

Try it out, report bugs

Please use Larch’s GitHub issue tracker to report bugs and request features. If you have questions or need help with Larch, the first thing you should do is read the very detailed documentation. If that fails, search the archives of the Larch mailing list on Google Groups. If you still can’t find an answer, send your question to the list.

Sanitize 2.0.0 released

Version 2.0.0 of Sanitize, my whitelist-based HTML filtering library for Ruby, is now available. This release includes several new features and some changes to existing features. I’ll cover the big stuff in this blog post; for the complete list of changes, see the HISTORY.md file.

Installing

To install or upgrade Sanitize via RubyGems, run:

gem install sanitize

Sanitize is fully compatible with Ruby 1.8.7, 1.9.1 and 1.9.2.

Transformers

The most significant change in this release is that Sanitize’s core filtering logic is now implemented entirely as a set of always-on transformers. This simplifies the core code and means that Sanitize itself is now built on the same powerful transformer architecture that you can use in your own apps to enhance or alter Sanitize’s functionality.

The environment object provided as input to transformers now contains a slightly different set of data, and transformer output has been simplified. Transformers are no longer required to return anything, and are expected to make any desired alterations directly to the current node and/or document.

Sanitize now has the ability to traverse the document and execute transformers using either depth-first traversal (the default behavior, same as before) or breadth-first traversal (new in 2.0.0). If necessary, you can even run one set of transformers using one traversal method and another using the other method. This allows for greater flexibility and less complexity when writing certain types of transformers.

The README has more details on these changes and new features.

Other notable changes

  • Sanitize now outputs HTML4/HTML5 markup by default instead of XHTML (e.g., <img src="foo.jpg"> instead of <img src="foo.jpg" />, etc.). If you prefer the old behavior, you can set the :output config to :xhtml.
  • Some new elements and attributes (including several HTML5 elements) have been added to the built-in basic and relaxed whitelists. See HISTORY.md for the complete list.
  • Elements like <br>, <p>, and others are now replaced with whitespace when they’re removed in order to preserve the readability of the remaining text content. The list of elements that will be replaced with whitespace when removed is configurable using the :whitespace_elements setting.

Be aware that if you expect specific output from Sanitize in your unit tests, you may need to update your tests. The HTML output from this release may not precisely match the output from previous releases.

Try it out, report bugs

As always, you can try out Sanitize’s built-in filters using the test page at sanitize.pieisgood.org. Please use Sanitize’s GitHub issue tracker to report bugs and file feature requests.

Introducing YUI 3 AutoComplete

In November I gave a talk at YUIConf 2010 introducing the new AutoComplete widget I wrote for YUI 3.3.0, which we’ll be releasing soon. The video of the talk is now up on YUI Theater for your viewing pleasure. The slides are available on SlideShare.

If you didn’t attend YUIConf and haven’t yet had a chance to check out the videos, you should. The conference was packed with excellent talks this year, including quite a few about topics not directly related to YUI, so there’s sure to be something there for you even if you’re not a YUI user.

Privileges, rights, and the slippery slopes that surround them

The most common argument I see in support of the TSA’s authority to require passengers to undergo screenings that would, in other circumstances, be considered a violation of their Constitutional rights, is that traveling by air is a privilege and not a right. Since travelers choose to travel by air and are not required to do so, it’s legal for the TSA to require them to submit to X-rays, pat downs, and any other screening procedures the TSA deems necessary before they may board a plane.

This is a good argument. In fact, it’s so good that it has allowed the TSA (and its privately operated predecessors) to operate for years in the face of frequent legal challenges and protests.

It’s also a dangerous argument: it sneakily undermines citizens’ Constitutional rights by declaring those rights to exist only within certain boundaries—boundaries that are defined by the government from whom those rights are supposed to protect us, and defined in a way that bypasses the checks and balances required by the Constitution.

Flying was once a luxury, in the same way that traveling by automobile was once a luxury. Fifty years ago, few people could claim that their livelihood or their well-being required them to travel by air. One hundred years ago, the same was true of cars.

Today, though, air travel is commonplace. Many people have jobs that require them—require them—to travel frequently across long distances. In a large country like the United States, without any ground-based high speed transit infrastructure to speak of, flying is often the only realistic option. A trip that takes hours by plane may take days by train or by car.

To a casual traveler who objects to being X-rayed or patted down, this may seem like a relatively easy choice. Take fewer trips, or budget more time and turn them into road trips. To someone who is required to travel as a condition of their employment, this choice isn’t so easy. Budgeting time for road trips isn’t likely to be an option. This person is faced with a more difficult choice: they can forfeit their civil liberties or they can find a new job.

It would be unwise to assume that the TSA will be content to limit their screenings to air travel. With a strong foothold in all of the nation’s airports, it’s only a matter of time before the TSA asserts the authority to require screenings for rail travel. If things continue in this vein, we’ll see mandatory TSA checkpoints on major interstate highways within the next thirty years (the justification: drivers don’t have to take the interstate).

One thing history has taught us is that people who willingly give up their freedoms rarely have an easy time getting them back. Nobody wants to make it easy for evildoers to smuggle weapons onto airplanes, but we need to find better solutions than the ones the TSA claims are necessary, and these solutions need to be balanced with the actual risks they’re intended to guard against.

Freedom makes a terrible currency. Using freedom to buy safety only leads to bankruptcy.

"No thanks, I've already had cancer, just feel me up or whatever."

My former coworker Isaac Schlueter is flying today and brought some copies of the UCSF letter with him. While waiting in line at the TSA checkpoint, he struck up a conversation with the family behind him:

Turns out she’s a breast cancer survivor. And her doctor has told her to avoid x-rays, even at the dentist, unless absolutely medically necessary. And she didn’t realize that “millimeter wave digital backscatter detection” used x-rays, because the TSA doesn’t actually put that on the sign.

She did the rest.

When we got to the scanner, I opted out. Then they opted out. She’d already convinced the family behind them to do the same. Her response to the TSA agent was awesome, I wish I’d thought of it:

“Ma’am, please step over here.”

“No thanks, I’ve already had cancer, just feel me up or whatever.”

After the first 4 “OPT-OUT” calls, they just passed us all through the regular metal detector. No one got groped.

Internets, please do more things like this.

Why I will avoid flying from now on

I just got back from a trip to Sunnyvale for YUIConf 2010.

Since I work remotely, I typically fly to California every few months for a week in the office. I’ve always had decent luck at not getting selected for “enhanced” screening by the TSA, and this trip was no different. I managed to slip through unscathed, but only barely.

The Terminal B checkpoint at SJC now has a backscatter machine next to each metal detector. Of the two lines that were active when I went through security yesterday evening, one backscatter machine was in active use and another was roped off. I managed to get myself into the line with the roped-off machine, but even so, TSA agents were selecting people from both lines for additional screening.

I wasn’t selected, but as I was putting my shoes back on inside the checkpoint, I noticed that a woman behind me (blonde, wearing a spandex sports bra and shorts, with her midriff exposed) who had already gone through either the metal detector or the backscatter machine (I didn’t see which) had been selected for further screening by a male TSA agent.

She didn’t seem to be objecting, and I didn’t stick around to see what happened, but unless this woman was hiding explosives literally inside her body, this additional screening didn’t make anything safer for anyone.

Until today, I was under the impression that if I were to refuse both a backscatter scan and a pat-down, I would still be free to leave the airport and find another mode of travel. Then I saw this well-documented blog post containing both video (albeit pointed at the ceiling) and audio of a man attempting to do just that and being detained and then threatened with fines and a civil suit simply for trying to leave the airport after refusing to allow a TSA agent to give him a pat-down.

That changes the entire game. Now, merely entering the line at a security checkpoint apparently makes you a prisoner of the TSA. You either do whatever they tell you to, even if it means allowing them to shoot x-rays at you and scrutinize your genitalia, or you pay a $10,000 fine and fight a legal battle.

Sunnyvale isn’t far enough from Portland that I’m willing to give up both my civil liberties and my dignity to get there. I’ll drive from now on.

Achieving Performance Zen with YUI 3

Last week I gave an internal tech talk at Yahoo! entitled Achieving Performance Zen with YUI 3. The full video of the talk is now available on YUI Theater. The slides are available on SlideShare (or you can download the Keynote deck).

Synopsis: Following codified guidelines can help you build fast websites, but building applications that are clean, fast and extensible also involves taking a balanced approach to performance at every level of your F2E work. YUI 3 is designed to help you in this process, providing a right-sized abstraction layer with built-in performance magic and a variety of tools that make fast frontend code easy and fun to produce. In this session, we’ll explore the zen of performant JavaScript in the YUI 3 world and introduce you to some of the powerful tools YUI 3 puts at your disposal in every app you write.

What's happening at Yahoo!

I wasn’t at Yahoo! when Paul Graham was. He was there a long time ago. I can’t speak to whether his blog post accurately reflects what Yahoo! was like then. I can tell you that it doesn’t mesh at all with the experiences I’ve had at Yahoo! since I joined the company in early 2007.

The main point of Graham’s article is that Yahoo! didn’t have a hacker-centric culture. If there was a time when that was true, it must have been before I joined.

A company without a hacker-centric culture doesn’t encourage the kind of risk-taking and experimentation I saw when I was at Yahoo! Search. As an engineer, I had direct input into product features at every level, from ideation to design to implementation to launch. If I had a crazy idea, I was encouraged not just to tell people about it (up to and including executives), but to implement it and see if it tested well with users. I was able to add my own personal touch to parts of the product (sometimes big parts) without needing to ask permission or wade through excessive red tape.

This may not sound impressive to someone who’s used to the way things work at startups or small companies. But this was at one of the largest Internet companies in the world, on one of the most visited websites in the world. For Yahoo! to give me and other engineers the kind of freedom and power we had is not normal for a company or a product that operates at this scale.

Earlier this year, I transferred to the YUI team, where I get to work with some of the smartest frontend engineers on the planet on an open source JavaScript library that we develop not just for use by Yahoo!, but also by thousands of other developers around the world. The ideas and the work that I see coming from this team, and from the other teams we work with throughout Yahoo!, are amazing and often groundbreaking.

I have my gripes about Yahoo!, sure. It hasn’t been all kittens and rainbows. But the hacker-centric culture and the brilliant people Paul Graham seems to think don’t exist here are the reason I’ve been here for 3.5 years and counting, and they’re the reason I don’t plan on leaving anytime soon.

I originally wrote this as an answer to a question on Quora. I thought it was worth reposting here. My opinions, as always, are my own, and don’t necessarily reflect the views of my employer.

Dissection of a recruiter email

Like anyone with certain hot buzzwords on their résumé or LinkedIn profile, I’m often contacted by recruiters. Usually they’re perfectly friendly and polite: nice people doing an important and often thankless job. Sometimes they’re less polite, or use sketchy tactics.

Whenever I get annoyed by a recruiter, I try to remember how happy I would have been to have gotten a call from them back in the dark days of ought-one, when much of my time was spent sitting on a ratty couch eating Eggo waffles and watching Zoboomafoo with whichever of my roommates also happened to be unemployed at the time.

There are good recruiters and there are sketchy recruiters, but the worst kind of recruiter is an incompetent recruiter. Like Bjoern, who sent me the following email today.

From: bjoern@[redacted]
To: ryan@wonko.com
Subject: Silicon Explosion? // interesting startup opportunities

Email not displaying correctly? View it in your browser.

Already we’re off to a bad start. Bjoern is so certain of his own incompetence at such a simple task as formatting an email that, before even saying hello, he has offered me the opportunity to skip the email entirely and instead view a web page, presumably generated by some automated tool so foolproof that even he couldn’t screw it up. Confidence has not been instilled.

That’s one interpretation. Another interpretation is that the email itself was generated by an automated tool. Which makes it insulting. But for some reason I keep reading.

Hi Ryan,

You stumbled over your profile a couple days ago … I was impressed. Congratulations!

Just one word into the opening sentence, Bjoern has justified his earlier self-doubt.

I’m not sure exactly what Bjoern is accusing me of here—I swear I don’t remember stumbling over anything a couple days ago—but apparently an impressive profile was involved. So impressive that I deserve congratulating. Go me! I must be super awesome.

You probably noticed that recently there was an explosion of freshly funded startups out there looking for senior devs., tech VPs and CTOs.

We are working with a number of funded startups handson and help them to accelerate with funding, people, prototyping, media, going global, etc.

I don’t have the slightest idea why Bjoern thinks any of this is relevant to me, but hey, I’m impressed he managed to string together so many empty, meaningless words. And that “etc.” at the end totally seals the deal. It tells me that he’s doing so many incredible “handson” things for so many freshly funded startups that he can’t even be bothered to list them all. He must be almost as awesome as I am.

Hey, Bjoern? Just one thing. What is it with you and explosions?

Finding good people – as you probably know – is always the hardest part!

The hardest part? Of what? All those things he listed earlier? Maybe just the “people” part? Or did he mean the “etc.” part? I appreciate the implication that I’m smart enough to know what he’s talking about, but since I’m not sure he knows what he’s talking about, I feel uncomfortable jumping to conclusions.

As it turns out it is the best to ask good people for good people. Could you recommend anybody who is looking for a new opportunity? :)

Bjoern

Well shit. Bjoern had me all worked up about how awesome I was, and then he went and dashed my hopes. It wasn’t me he was interested in after all; it was my awesome friends!

Or maybe he’s actually interested in me and is just being really sneaky about it. Is that what the smiley means? Maybe that’s Bjoern’s way of saying, “Hey, I know your boss reads your personal email, so let’s pretend we’re talking about OTHER PEOPLE and not you. Got it? Wink wink!”

If my boss were reading my personal email, you’d think he’d be smart enough to at least delete the recruiter emails before I saw them.

But wait, there’s more good stuff after the signature.

You were recommended to me :)

Unsubscribe ryan@wonko.com from this list.

Our mailing address is:
[redacted]

Add us to your address book

Copyright © 2010 [redacted] All rights reserved.

Forward this email to a friend
Update your profile

I was recommended to him? But I thought he stumbled across my profile? Somebody stumbled across a profile, at any rate. Maybe the smiley after this sentence means that it, too, is code for something? But for what? Bjoern may have overestimated my intuitive abilities, because his encoded meaning is lost on me.

But then, at the very end, there’s a link to my profile. That must be the one someone stumbled over at the beginning of the email! Boy, I sure don’t remember creating a profile, so I’d better click that link and find out what’s going on!

Of course, clicking that link would route me through a tracking redirect and tell Bjoern that I read his message.

So, yeah. Let’s just not click that. Let’s not click that at all.