Archived Posts

Extending JavaScript natives

Wednesday December 07, 2011 @ 11:13 AM (PST)

The argument about whether or not it’s okay to extend JavaScript natives tends to focus too much on whether you should or shouldn’t, instead of on when you should and shouldn’t, which would be a much more useful discussion.

The ability to extend natives is a powerful feature of the language, and it’s one that can be used to do fantastic things. When used in the wrong places or for the wrong reasons, it can seriously fuck shit up and make developers cry.

Generally speaking, there are two kinds of JavaScript code that you run on any website you build: there’s the code you (or your team) write yourself, and there’s the code that comes from libraries someone else wrote.

Extending natives in your own code is a bit like decorating your living room. It makes perfect sense. You live there, you have opinions about how it should look and where things should go, and you’re the one who either benefits or suffers as a result of your choices, so you should feel free to go nuts.

Extending natives in library code that will be used on someone else’s pages is a bit like being invited into someone else’s house and — without asking permission — repainting the walls, moving furniture around, adding new furniture, breaking expensive vases, and just generally being an asshole.

You could argue that the person who invited you into their home knew you were an asshole before they invited you, so they knew what they were getting themselves into, but that would just make you a victim-blaming asshole, which is even worse than a normal asshole. So don’t argue that.

Sure, some libraries are so well-known for extending natives that anyone who uses them can be said to be opting into those risks. But later, when the developer adds another library to the page to help them build some new functionality and the second library expects natives to behave like real natives when they actually behave however the first library thinks they should behave, it’s the developer who suffers.

And that developer will inevitably file a bug against the second library, because things broke when they started using it. And then that library’s authors suffer.

And then the authors of the second library end up writing code like this to protect users from brokenness caused by the first library, and the fiery heat death of the universe draws just a little bit closer.

All because some asshole thought repainting someone else’s living room was a good idea.
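To make that clash concrete, here's a minimal sketch. Both "libraries" are hypothetical stand-ins, and the names last, sumNaive, and sumDefensive are made up for illustration rather than taken from any real library.

```javascript
// Hypothetical "first library": extends a native by adding an enumerable
// method to Array.prototype.
Array.prototype.last = function () {
  return this[this.length - 1];
};

// Hypothetical "second library": assumes for...in over an array only visits
// the array's own indices, which holds for unmodified natives.
function sumNaive(values) {
  let total = 0;
  for (const key in values) {
    total += values[key]; // also visits "last" once the prototype is extended
  }
  return total;
}

// The kind of defensive code the second library ends up writing: skip
// anything that isn't the array's own property.
function sumDefensive(values) {
  let total = 0;
  for (const key in values) {
    if (Object.prototype.hasOwnProperty.call(values, key)) {
      total += values[key];
    }
  }
  return total;
}

const naiveResult = sumNaive([1, 2, 3]);
const defensiveResult = sumDefensive([1, 2, 3]);

console.log(typeof naiveResult); // "string": the function got concatenated onto the sum
console.log(defensiveResult); // 6

// Put the native back the way we found it.
delete Array.prototype.last;
```

The second library did nothing wrong by the language's own rules, but it's the one that has to carry the extra check.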

Google+ Pages don't make any sense

Friday November 18, 2011 @ 11:15 AM (PST)

I want to create a Google+ Page for YUI. I need other members of the YUI team to be able to edit and post to this page, so I can’t create it through my personal Google account. As far as I can tell, the only way to do this is to create it through a shared Google account.

But before I can do this, I have to create a “personal” Google+ profile for that shared account. Since this shared account isn’t actually a person, creating a personal profile for it would violate Google+’s policies and the profile would likely get suspended.

So, how are organizations supposed to actually create and maintain Google+ pages? Do they really just nominate one member of the organization to carry out all Google+ duties, and let the page go silent if that person goes on vacation or gets sick, and let the page die if that person leaves the organization?

I don’t get it. It seems like this wasn’t thought out at all.

Connecting the dots

Friday October 07, 2011 @ 02:36 PM (PDT)

Designing a flexible framework is tricky. Most framework users are blinded by choice and would rather just connect the dots. And boy do they hate it if the dots don’t connect up to create exactly the picture they want.

Framework designers need to understand this and at least provide clearly defined guideposts for the users who want to connect the dots, while also making it possible for more adventurous users to blaze their own trails without too much effort.

Framework users should strive to see beyond their immediate goals and look at the ways the framework can make their work easier, not just the ways it can do their work for them.

Apathetic

Wednesday October 05, 2011 @ 07:50 PM (PDT)

Guy holding a clipboard knocked on the door. Said something about jobs and times being tough. Asked me to put my name on his clipboard. I told him I wasn’t interested. He insisted. I reiterated.

He assured me no emails or mailings or consequences of any kind would result, he just needed my name. I suggested he just make a name up, since that’s what I’d do anyway if he kept insisting. He told me that if people didn’t do something, things would keep getting worse. I told him I still wasn’t going to put my name on his clipboard.

“Are you apathetic?” he asked.

“Um, yes,” I said. “Exactly.”

I’m glad he noticed.

Simple makefile to minify CSS and JS

Saturday July 30, 2011 @ 03:11 PM (PDT)

I recently needed a quick and easy way to minify CSS and JS for the new YUI Library website (launching soon!). In the past I’ve written powerful and complicated tools for doing static asset management and minification, but this time I wanted something simple.

A good old-fashioned makefile turned out to be the perfect tool for the job. Here’s what I came up with. Feel free to use it in your own projects. This version requires the YUI Compressor, but that can easily be replaced with Closure Compiler, Uglify, or any other tool of your choice.

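# Patterns matching CSS files that should be minified. Files with a -min.css
# suffix will be ignored. Note that GNU make's wildcard function doesn't treat
# ** as recursive, so the second pattern only matches one directory level down.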
CSS_FILES = $(filter-out %-min.css,$(wildcard \
	public/css/*.css \
	public/css/**/*.css \
))

# Patterns matching JS files that should be minified. Files with a -min.js
# suffix will be ignored.
JS_FILES = $(filter-out %-min.js,$(wildcard \
	public/js/*.js \
	public/js/**/*.js \
))

# Command to run to execute the YUI Compressor.
YUI_COMPRESSOR = java -jar yuicompressor-2.4.6.jar

# Flags to pass to the YUI Compressor for both CSS and JS.
YUI_COMPRESSOR_FLAGS = --charset utf-8 --verbose

CSS_MINIFIED = $(CSS_FILES:.css=-min.css)
JS_MINIFIED = $(JS_FILES:.js=-min.js)

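# These targets don't produce files with the same names, so mark them phony.
.PHONY: minify minify-css minify-js clean help

# target: minify - Minifies CSS and JS.
minify: minify-css minify-js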

# target: minify-css - Minifies CSS.
minify-css: $(CSS_FILES) $(CSS_MINIFIED)

# target: minify-js - Minifies JS.
minify-js: $(JS_FILES) $(JS_MINIFIED)

%-min.css: %.css
	@echo '==> Minifying $<'
	$(YUI_COMPRESSOR) $(YUI_COMPRESSOR_FLAGS) --type css $< >$@
	@echo

%-min.js: %.js
	@echo '==> Minifying $<'
	$(YUI_COMPRESSOR) $(YUI_COMPRESSOR_FLAGS) --type js $< >$@
	@echo

# target: clean - Removes minified CSS and JS files.
clean:
	rm -f $(CSS_MINIFIED) $(JS_MINIFIED)

# target: help - Displays help.
help:
	@egrep "^# target:" Makefile

To use this, save it as a file named Makefile (the help target greps that exact filename), customize it as necessary, and then run make minify to minify your .js and .css files. Minified files will be saved with a -min suffix alongside the originals. Only files that have changed since the last time you minified them will be processed.

This file is also available as a gist if you’d like to fork it and improve it. Enjoy!

Steve Jobs on programmer productivity

Sunday June 05, 2011 @ 07:03 PM (PDT)

Steve Jobs discussing programmer productivity in his keynote Q&A from the 1997 WWDC:

The way you get programmer productivity is not by increasing the lines of code per programmer per day. That doesn’t work. The way you get programmer productivity is by eliminating lines of code you have to write.

The line of code that’s the fastest to write, that never breaks, that doesn’t need maintenance, is the line you never had to write.

This isn’t the only gem from that keynote. The entire thing is fantastic and worth watching, if only to see in hindsight what an amazing product visionary Steve is.

In this unrehearsed Q&A session (not a prepared presentation), he lays out many of the ideas that will chart Apple’s course for the next 15 years. Mac OS X, iLife, Xcode, MobileMe, iCloud, and even the iPhone and iPad—the seeds of all of these ideas were clearly already present in Steve’s mind.

Incredible.

Let’s say you have a website at https://buygadgets.example.com. Users shop for shiny gadgets on your website and then enter their credit card numbers to buy them.

Because you value the security and privacy of your users, you use SSL for all traffic. You paid top dollar for an SSL certificate signed by one of the most trusted certificate authorities in the world, so your users can always be certain that they’re communicating with your website and not some other site pretending to be yours.

To build your site, you used an open source JavaScript library called FooLib. FooLib is awesome, and it’s backed by FooCo, one of the world’s largest and most trusted technology companies. This company even provides hosted versions of FooLib on their super fast content delivery network (CDN), so that anyone who wants to can link to http://cdn.foolib.com/foo.js instead of having to host the JavaScript on their own servers.

Because browsers display a warning when you serve a page that has a mix of HTTP and HTTPS content, you want to serve FooLib over SSL. Nobody wants to annoy their users with scary security warnings. Luckily, FooCo’s CDN supports SSL! You can just load https://cdn.foolib.com/foo.js, and now your users don’t see that pesky security warning anymore.

Unfortunately, you’re now deceiving your users, and that fancy SSL certificate you bought from the world’s most trusted CA is worthless.

Why? Because you’re letting FooCo execute any JavaScript it wants on your website. You’re loading that JavaScript securely over SSL, so the browser isn’t displaying any scary warnings, but now your users aren’t just communicating with buygadgets.example.com. Now they’re also communicating with cdn.foolib.com, and since cdn.foolib.com can run JavaScript on your pages, they can also see any information the user reads or enters on those pages.

“But why would FooCo do something like that?” you ask. “After all, their motto is ‘Don’t be naughty’!”

Of course FooCo would never do that. They’re a solid, upstanding, trustworthy company with nothing to gain from stealing credit card numbers. They’re providing a valuable service to the community, and they genuinely do it out of the goodness of their hearts.

But you’re still deceiving your users.

Your SSL certificate says to the user, “Hey, you’re safe. It’s only you and me talking here, and nobody else can decrypt our communications. And you can rest assured that I’m really who you think I am, because this trustworthy CA says so.”

But when you load FooLib from FooCo’s CDN, you’re silently inviting FooCo into that conversation as well. FooCo has their own SSL certificate, which is also signed by a trustworthy CA, but your user doesn’t want to share their information with FooCo. They want to share their information with you. By inviting FooCo into this confidential conversation without even telling your user that you’ve done it, you’re breaking the contract that was implied by your site’s SSL certificate and by the soothing lock icon in the browser’s location bar.

The user thinks they’re only telling you their secrets, but they’re also telling FooCo their secrets. And that’s not cool.

If user trust is important to you, you shouldn’t load JavaScript from third-party CDNs on secure pages. Not even if those CDNs support SSL, and not even if they’re run by the world’s most trustworthy companies.

Now, let’s be realistic: security is always a tradeoff. You trade some convenience for some security. Some websites are willing to share their users’ private information with FooCo, because it’s more convenient to do that than to host a JavaScript library locally. That’s a decision you have to make for yourself.

But as a user, I can tell you that if I found out that a company I trusted was silently making my private information available to some other company without my knowledge, all while making me think they were keeping this information confidential, I’d be pretty pissed. Even if no harm came from it, and even if it was done with the best intentions, I’d consider it a violation of my trust. And I don’t like companies that violate my trust.

Obligatory disclaimer: I work on the YUI JavaScript library at Yahoo!, but these are my own opinions. Nothing in this post should be construed as representing the views of my employer or anyone else on the YUI team.

There's more to HTML escaping than &, <, >, and "

Saturday April 23, 2011 @ 03:49 PM (PDT)

A few days ago I tweeted:

If I had a dollar for every HTML escaper that only escapes &, <, >, and ", I'd have $0. Because my account would've been pwned via XSS.

This was exaggeration for effect—there aren’t many cases where a simple XSS injection could actually empty a bank account—but I wanted to make a point.

By some coincidence, I’ve found myself working with various open source projects recently that take a half-assed approach to HTML escaping. It’s something that tends to be implemented as an afterthought, which is unfortunate because it can be critical for the security of users of these projects. I won’t name any names in this post (pull requests are forthcoming), but I will explain some of the common problems I’ve seen, why they’re problems, and what can be done to fix them.

This post is not an introduction to HTML escaping. It assumes that you already know what HTML escaping is and why it’s necessary. This post also is not a comprehensive catalog of XSS vectors; the examples here are illustrative, but they certainly aren’t the only attacks you need to worry about. The intent of this post is to explain some dangers that you may not be aware of, and to encourage you to read more about them and write safer code.

Note that this post only discusses escaping, which is entirely different from (and far less complicated than) sanitizing. HTML sanitization is a topic for another time.

Escaping < and > isn’t enough

The worst HTML escaper I’ve seen in a major open source project only escapes the < and > characters. This may actually be worse than not escaping anything at all, since it gives the illusion of security, but is trivial to defeat.

For example, let’s say I have the following template, and I’m going to replace the placeholder values, indicated in [square brackets], with HTML-escaped user input:

<a href="/user/[username]">[username]</a>

The attacker enters foo" onmouseover="alert(1) as their username. End result, even after escaping:

<a href="/user/foo" onmouseover="alert(1)">foo" onmouseover="alert(1)</a>

Because the " character wasn’t escaped and the attacker’s input was used in an attribute value, the attacker was able to inject arbitrary attributes and therefore JavaScript (which, in a real XSS attack, would probably be something more harmful than an alert).

This is a classic example of making input safer in one context—in this case, as the content of an <a> element—without considering the other contexts in which it’s likely to be used, such as inside an attribute value.
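As a rough sketch of the failure, here's a hypothetical escaper that only handles < and >, run against the template above. The function names escapeAngleBrackets and renderUserLink are made up for illustration.

```javascript
// Hypothetical escaper that only handles < and >, like the one described above.
function escapeAngleBrackets(input) {
  return input.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Fill in the template from above with "escaped" user input.
function renderUserLink(username) {
  const safe = escapeAngleBrackets(username);
  return '<a href="/user/' + safe + '">' + safe + "</a>";
}

const doubleQuoteAttack = 'foo" onmouseover="alert(1)';
console.log(renderUserLink(doubleQuoteAttack));
// <a href="/user/foo" onmouseover="alert(1)">foo" onmouseover="alert(1)</a>
```

The escaper never touches the input at all, because the attack doesn't contain a single angle bracket.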

Escaping &, <, >, and " isn’t enough

The characters &, <, >, and " are the ones most commonly targeted by HTML escaper implementations. This seems to be the minimum set of characters that people think need to be escaped. Unfortunately, it’s still not safe if you don’t have complete control over where the escaped values will be used.

Consider the following template, in which the template author has used single-quoted attribute values:

<a href='/user/[username]'>[username]</a>

This is exploitable using the same attack as the previous example, but with single quotes instead of double quotes: foo' onmouseover='alert(1):

<a href='/user/foo' onmouseover='alert(1)'>foo' onmouseover='alert(1)</a>

You may be saying, “But I always use double quotes to quote attribute values!” Are you also the only person who will ever use your HTML escaper? And are you immune to typos?
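Here's the same idea as a sketch, this time with a hypothetical four-character escaper (escapeFour is an illustrative name) and the single-quoted template from above.

```javascript
// Hypothetical escaper covering the common four characters: &, <, >, and ".
function escapeFour(input) {
  return input
    .replace(/&/g, "&amp;") // must run first so later entities aren't re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The single-quoted template from above.
function renderSingleQuotedLink(username) {
  const safe = escapeFour(username);
  return "<a href='/user/" + safe + "'>" + safe + "</a>";
}

const singleQuoteAttack = "foo' onmouseover='alert(1)";
console.log(renderSingleQuotedLink(singleQuoteAttack));
// <a href='/user/foo' onmouseover='alert(1)'>foo' onmouseover='alert(1)</a>
```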

Escaping &, <, >, ", and ' isn't enough

This is the character set escaped by PHP’s ubiquitous htmlspecialchars function (when it’s called with the ENT_QUOTES flag), and as you may have guessed, it still falls down on attribute values for two reasons.

First, as Hacker News users DanBlake and nbpoole pointed out in a discussion of this blog post, Internet Explorer treats ` as an attribute delimiter. It may be an edge case, but it’s still a potential attack vector, so ` needs to be escaped too.

Second, HTML also allows attribute values to be completely unquoted. Believe it or not, unquoted attribute values are fairly popular (some people are too lazy to quote them, others are performance zealots who can’t bear the thought of wasting those extra bytes).

Unquoted attribute values are one of the single biggest XSS vectors there is. If you don’t quote your attribute values, you’re essentially leaving the door wide open for naughty people to inject naughty things into your HTML. Very few escaper implementations cover all the edge cases necessary to prevent unquoted attribute values from becoming XSS vectors.
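A quick sketch shows how little it takes. The escaper below is hypothetical (escapeSix is a made-up name) and covers &, <, >, ", ', and `, but the template leaves the attribute value unquoted, so a plain space is all the attacker needs.

```javascript
// Hypothetical escaper covering &, <, >, ", ', and `.
function escapeSix(input) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "`": "&#x60;",
  };
  return input.replace(/[&<>"'`]/g, (ch) => entities[ch]);
}

// Hypothetical template with an unquoted attribute value.
function renderUnquotedLink(username) {
  const safe = escapeSix(username);
  return "<a href=/user/" + safe + ">" + safe + "</a>";
}

const unquotedAttack = "foo onmouseover=alert(1)";
console.log(renderUnquotedLink(unquotedAttack));
// <a href=/user/foo onmouseover=alert(1)>foo onmouseover=alert(1)</a>
```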

Escaping &, <, >, ", ', `, , !, @, $, %, (, ), =, +, {, }, [, and ] is almost enough

All those characters up there (including the space character!) can be used to break out of an unquoted HTML attribute value. If you escape every last one of them, then you’re probably pretty close to being safe. But you’re still not so safe that you can just start throwing around user input willy nilly.

Why? Because this still doesn’t cover some context-specific cases like inserting user input into the body of an inline <script> element or using user input as part of a URL.
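For illustration, here's a minimal sketch of an escaper that covers the full character list from the heading above by turning each character into a hexadecimal character reference. The names ATTRIBUTE_BREAKERS and escapeForAttribute are made up, and this is only one way to do it, not a vetted implementation.

```javascript
// Characters from the heading above, including the space character.
const ATTRIBUTE_BREAKERS = /[&<>"'` !@$%()=+{}\[\]]/g;

// Replace each listed character with a hexadecimal character reference.
// Everything happens in one pass, so the & in an emitted entity is never
// escaped a second time.
function escapeForAttribute(input) {
  return input.replace(ATTRIBUTE_BREAKERS, (ch) => {
    return "&#x" + ch.charCodeAt(0).toString(16).toUpperCase() + ";";
  });
}

console.log(escapeForAttribute("foo onmouseover=alert(1)"));
// foo&#x20;onmouseover&#x3D;alert&#x28;1&#x29;
```

With the space and = escaped, the earlier unquoted-attribute attack stays inside the attribute value instead of starting a new attribute.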

Context is key

If you haven’t figured it out already, the primary message I’m trying to convey here is that you must be aware of the context in which you’re working with user input. Some contexts are more susceptible to attack than others, and there’s no single magic escaping bullet that will protect you or your users in all cases.

In other words, you don’t need to escape everything all the time, but you do need to escape everything that’s important in the particular contexts in which you’re displaying user input.
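The URL case makes a handy example of why context matters. HTML escaping does nothing about a javascript: URL in an href, and it won't stop a / or ? in user input from changing the meaning of a path. The sketch below uses two standard JavaScript features, encodeURIComponent and the URL constructor; the helper name isSafeHref, the http/https allowlist, and the example.com base URL are assumptions made for illustration.

```javascript
// Context 1: user input as a single path segment of a URL. Percent-encode it
// so characters like / and ? can't alter the URL's structure.
const pathSegment = encodeURIComponent("foo/../admin?x=1");
console.log("/user/" + pathSegment);
// /user/foo%2F..%2Fadmin%3Fx%3D1

// Context 2: user input as an entire URL. Hypothetical helper that only
// allows http(s); relative URLs resolve against the base and inherit its scheme.
function isSafeHref(href, base = "https://example.com/") {
  try {
    const protocol = new URL(href, base).protocol;
    return protocol === "http:" || protocol === "https:";
  } catch (err) {
    return false; // unparseable input is treated as unsafe
  }
}

console.log(isSafeHref("/user/foo")); // true
console.log(isSafeHref("javascript:alert(1)")); // false
```

The value that passes either check still needs to be HTML-escaped before it goes into the page. These checks handle the URL context; they don't replace the HTML one.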

But there’s still one more wrench to throw into the works…

Always specify a charset, or UTF-7 will eat your face

Even if you do everything else right, serving a page that doesn’t explicitly specify a character set can leave Internet Explorer users open to XSS, thanks to the way IE sniffs out the charset when it isn’t specified.

If an attacker is able to get your page to echo back something that looks like UTF-7 encoding early enough in the page, he may be able to trick IE into rendering the page using UTF-7. This could turn the following seemingly harmless input…

+ADw-script+AD4-alert(1)+ADw-/script+AD4-

…into something potentially harmful:

<script>alert(1)</script>
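To see why that input slips past an escaper, here's a simplified sketch of UTF-7 decoding. It assumes Node.js (for Buffer), the decodeUtf7 function is a hypothetical helper that only handles the basic "+base64-" form, and it's meant to illustrate the encoding rather than reproduce what IE actually does internally.

```javascript
// Simplified, hypothetical UTF-7 decoder (Node.js, uses Buffer). It handles
// "+<base64>-" runs of UTF-16BE code units and "+-" for a literal plus sign.
function decodeUtf7(input) {
  return input.replace(/\+([A-Za-z0-9+\/]*)-?/g, (match, b64) => {
    if (b64 === "") {
      return "+";
    }
    const bytes = Buffer.from(b64, "base64");
    let decoded = "";
    // Each pair of bytes is one big-endian UTF-16 code unit.
    for (let i = 0; i + 1 < bytes.length; i += 2) {
      decoded += String.fromCharCode((bytes[i] << 8) | bytes[i + 1]);
    }
    return decoded;
  });
}

const utf7Payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-";

// Nothing here for a typical escaper to catch: no &, <, >, " or '.
console.log(/[&<>"']/.test(utf7Payload)); // false

console.log(decodeUtf7(utf7Payload));
// <script>alert(1)</script>
```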

I recommend specifying a UTF-8 charset in both the Content-Type HTTP response header and a <meta> tag, since it’s easy for one or the other to get switched off or omitted inadvertently as a codebase ages (this has happened to me).
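Here's a minimal sketch of declaring the charset in both places. The helper name buildHtmlResponse and the shape of the object it returns are assumptions for illustration, and title and bodyHtml are assumed to have been escaped already.

```javascript
// Hypothetical helper that builds a response with the charset declared twice:
// once in the Content-Type header and once in a <meta> tag.
function buildHtmlResponse(title, bodyHtml) {
  const headers = { "Content-Type": "text/html; charset=utf-8" };
  const body =
    "<!DOCTYPE html>\n" +
    "<html>\n<head>\n" +
    // Declared before any other content so nothing precedes it in the page.
    '<meta charset="utf-8">\n' +
    "<title>" + title + "</title>\n" +
    "</head>\n<body>\n" +
    bodyHtml +
    "\n</body>\n</html>\n";
  return { statusCode: 200, headers: headers, body: body };
}

const charsetResponse = buildHtmlResponse("Hello", "<p>Hi there.</p>");
console.log(charsetResponse.headers["Content-Type"]);
// text/html; charset=utf-8
```

With Node's built-in http module, the result could be sent with res.writeHead(charsetResponse.statusCode, charsetResponse.headers) followed by res.end(charsetResponse.body).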

Further reading

As I mentioned in the disclaimer at the top of this post, this is not a comprehensive reference of all the things that can go wrong with HTML escaping. It’s not even a guide. It’s more of a tip-of-the-iceberg preview. Please don’t assume that, having read this post, you now know everything there is to know about HTML escaping. I can guarantee that you don’t, because I don’t.

I learned a lot from the following sources, and I highly recommend them if you’re interested in learning more:

Larch 1.1.0 released

Saturday January 22, 2011 @ 01:59 PM (PST)

Version 1.1.0 of Larch is now available.

Larch is a command line tool for copying email quickly and reliably from one IMAP server to another. It’s smart enough not to copy messages that already exist on the destination server, robust enough to pick up where it left off when interrupted, and also capable of more advanced operations like syncing message flags from the source server to the destination server. Larch is particularly well-suited for copying messages to, from, and between Gmail and Google Apps accounts.

Installing

Larch is a Ruby application and requires Ruby 1.8.6 or higher (1.9.2 is recommended). Once Ruby is installed, install or upgrade Larch via RubyGems:

gem install larch

What’s new

This release is the culmination of over a year of bug fixes and feature development. Larch is now faster and more reliable than ever. The most significant new features are:

  • Mailbox and message state information is now stored in a local SQLite database, which allows Larch to resync and resume interrupted operations much more quickly without having to rescan all messages.
  • You can now provide configuration options via a config file, so you don’t need to pass all options on the command line. This also allows the use of named session configs, so running larch gmail-to-yahoo will use the config options defined under the gmail-to-yahoo section in the config file.
  • Yahoo! Mail IMAP is now supported when connecting to imap.mail.yahoo.com or imap-ssl.mail.yahoo.com, even for non-pro accounts. Note that Yahoo! doesn’t officially support the use of IMAP by non-pro accounts, so this feature should be considered experimental.
  • Performance and reliability have been improved significantly. Many bugs have been fixed and many workarounds have been added for misbehaving servers.

This is only a partial list of changes. For the complete list, see the HISTORY file. For more details on how to use Larch’s new features, see the comprehensive README.

Try it out, report bugs

Please use Larch’s GitHub issue tracker to report bugs and request features. If you have questions or need help with Larch, the first thing you should do is read the very detailed documentation. If that fails, search the archives of the Larch mailing list on Google Groups. If you still can’t find an answer, send your question to the list.

Sanitize 2.0.0 released

Saturday January 15, 2011 @ 02:21 PM (PST)

Version 2.0.0 of Sanitize, my whitelist-based HTML filtering library for Ruby, is now available. This release includes several new features and some changes to existing features. I’ll cover the big stuff in this blog post; for the complete list of changes, see the HISTORY.md file.

Installing

To install or upgrade Sanitize via RubyGems, run:

gem install sanitize

Sanitize is fully compatible with Ruby 1.8.7, 1.9.1 and 1.9.2.

Transformers

The most significant change in this release is that Sanitize’s core filtering logic is now implemented entirely as a set of always-on transformers. This simplifies the core code and means that Sanitize itself is now built on the same powerful transformer architecture that you can use in your own apps to enhance or alter Sanitize’s functionality.

The environment object provided as input to transformers now contains a slightly different set of data, and transformer output has been simplified. Transformers are no longer required to return anything, and are expected to make any desired alterations directly to the current node and/or document.

Sanitize now has the ability to traverse the document and execute transformers using either depth-first traversal (the default behavior, same as before) or breadth-first traversal (new in 2.0.0). If necessary, you can even run one set of transformers using one traversal method and another using the other method. This allows for greater flexibility and less complexity when writing certain types of transformers.

The README has more details on these changes and new features.

Other notable changes

  • Sanitize now outputs HTML4/HTML5 markup by default instead of XHTML (e.g., <img src="foo.jpg"> instead of <img src="foo.jpg" />, etc.). If you prefer the old behavior, you can set the :output config to :xhtml.
  • Some new elements and attributes (including several HTML5 elements) have been added to the built-in basic and relaxed whitelists. See HISTORY.md for the complete list.
  • Elements like <br>, <p>, and others are now replaced with whitespace when they’re removed in order to preserve the readability of the remaining text content. The list of elements that will be replaced with whitespace when removed is configurable using the :whitespace_elements setting.

Be aware that if you expect specific output from Sanitize in your unit tests, you may need to update your tests. The HTML output from this release may not precisely match the output from previous releases.

Try it out, report bugs

As always, you can try out Sanitize’s built-in filters using the test page at sanitize.pieisgood.org. Please use Sanitize’s GitHub issue tracker to report bugs and file feature requests.

Copyright © 2002-2012 Ryan Grove. All rights reserved.