Ruby script to retrieve and display Comcast data usage

Thursday February 25, 2010 @ 04:43 PM (PST)

Comcast has often advertised their high speed Internet service as providing “unlimited” data transfer, but when they say “unlimited”, what they really mean is “limited to 250GB a month”.

Just before the new year, Comcast finally rolled out a data usage meter to users in the Portland, Oregon area so we can actually tell when we’re in danger of exceeding that 250GB ceiling. I find this usage meter incredibly helpful in achieving my goal of using as much of my monthly 250GB data allotment as I possibly can. I feel it’s my duty to get my full money’s worth.

Unfortunately, the meter is buried several pages deep in Comcast’s account site, which is a slow and ugly beast that requires a login, several redirects, and a click or two. So I whipped up a little Ruby script to do the dirty work for me and just print out my current usage total.

Before using the script, you’ll need to install the Mechanize gem:

gem install mechanize

Here’s the script:

#!/usr/bin/env ruby

require 'rubygems'
require 'mechanize'

URL_LOGIN = 'https://login.comcast.net/login?continue=https://login.comcast.net/account'
URL_USERS = 'https://customer.comcast.com/Secure/Users.aspx'

abort "Usage: #{$0} <username> <password>" unless ARGV.length == 2

agent = Mechanize.new

agent.follow_meta_refresh = true
agent.redirect_ok = true
agent.user_agent = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6'

login_page = agent.get(URL_LOGIN)

login_form = login_page.form_with(:name => 'login-form')
login_form.user = ARGV[0]
login_form.passwd = ARGV[1]

redirect_page = agent.submit(login_form)
redirect_form = redirect_page.form_with(:name => 'redir')

abort 'Error: Login failed' unless redirect_form

account_page = agent.submit(redirect_form, redirect_form.buttons.first)

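users_page = agent.get(URL_USERS)
usage_node = users_page.search("div[@class='usage-graph-legend']").first

# Bail out with a readable message instead of a NoMethodError if the meter
# isn't on the page (e.g. it's not available in your region).
abort 'Error: Usage meter not found' unless usage_node

puts usage_node.content.strip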

Save it to an executable file (I called it capmon.rb), then run it like so, passing in your Comcast.net username and password (they’ll be sent securely over HTTPS):

./capmon.rb myusername mypass

The script will log into your Comcast account, go through all those painful redirects and clicks, and eventually spit out your usage stats, which will look something like this:

166GB of 250GB

Couldn’t be simpler! Naturally, this script won’t work for you unless you’re a Comcast customer in a region where the usage meter is currently available. Also, the script will break if Comcast changes their login flow or page structure, but I’ll try to keep this post updated if that happens.
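
If you'd rather do something with the numbers than just read them, the output is easy to pick apart. Here's a minimal sketch, assuming the meter text keeps the "166GB of 250GB" shape shown above; parse_usage is a hypothetical helper of my own naming, not part of the script or of Mechanize:

```ruby
# Parse a usage string such as "166GB of 250GB" into numbers.
# Returns nil when the text doesn't match the expected format.
def parse_usage(text)
  match = text.strip.match(/\A(\d+)\s*GB\s+of\s+(\d+)\s*GB\z/i)
  return nil unless match

  used, cap = match[1].to_i, match[2].to_i
  return nil if cap.zero?

  { :used => used, :cap => cap, :percent => (used * 100.0 / cap).round(1) }
end

usage = parse_usage('166GB of 250GB')
puts "#{usage[:used]}GB used (#{usage[:percent]}% of #{usage[:cap]}GB)" if usage
```

Returning nil for unrecognized text means a change in Comcast's wording shows up as a missing result rather than a bogus number.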

This script is available as a GitHub gist as well. If you’d like to modify it and make it better, please fork the gist.

Read about Storage Lite on YUIBlog

Monday February 22, 2010 @ 10:25 PM (PST)

I’ve written an article on YUIBlog introducing my new Storage Lite YUI 3 Gallery module, which provides a lightweight, cross-browser client-side storage API without relying on any browser plugins. This is an updated version of the storage library I mentioned last year in my post about the development of Yahoo! Search Pad.

Head over to YUIBlog to read the article, and feel free to fork Storage Lite on GitHub and add awesomeness (or subtract crappiness).

The Net::IMAP standard library distributed with Ruby 1.8.6, 1.8.7, and 1.9.1 contains a response parsing bug that can cause an endless hang (in 1.8.x) or raise an exception (in 1.9.1) when switching between mailboxes on a Dovecot 1.2.x server.

The bug has been fixed in Ruby’s SVN trunk and should eventually make it into the 1.9.2 release, but if you’re using Net::IMAP with a current or older Ruby release and need a fix for this, the following monkeypatch (which just replaces the old buggy method with the fixed one from SVN) should do the trick.

Fortunately, this fix is the only difference between the trunk version of this method and the versions in 1.8.6, 1.8.7, and 1.9.1, so the monkeypatch works for all three. Just add it to your own code at some point after requiring Net::IMAP (require 'net/imap').

if RUBY_VERSION <= '1.9.1'
  module Net # :nodoc:
    class IMAP # :nodoc:
      class ResponseParser # :nodoc:
        private

        # This monkeypatched method is the one included in Ruby SVN trunk as
        # of 2010-02-08.
        def resp_text_code
          @lex_state = EXPR_BEG
          match(T_LBRA)
          token = match(T_ATOM)
          name = token.value.upcase
          case name
          when /\A(?:ALERT|PARSE|READ-ONLY|READ-WRITE|TRYCREATE|NOMODSEQ)\z/n
            result = ResponseCode.new(name, nil)
          when /\A(?:PERMANENTFLAGS)\z/n
            match(T_SPACE)
            result = ResponseCode.new(name, flag_list)
          when /\A(?:UIDVALIDITY|UIDNEXT|UNSEEN)\z/n
            match(T_SPACE)
            result = ResponseCode.new(name, number)
          else
            token = lookahead
            if token.symbol == T_SPACE
              shift_token
              @lex_state = EXPR_CTEXT
              token = match(T_TEXT)
              @lex_state = EXPR_BEG
              result = ResponseCode.new(name, token.value)
            else
              result = ResponseCode.new(name, nil)
            end
          end
          match(T_RBRA)
          @lex_state = EXPR_RTEXT
          return result
        end
      end

    end
  end
end
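
One caveat about the guard at the top of the patch: RUBY_VERSION <= '1.9.1' is a plain string comparison. It gives the right answer for the releases listed above, but it would misorder a version like '1.10.0' (a made-up version number, used here only to illustrate). A sketch of a sturdier check, assuming RubyGems is available so that Gem::Version can be used; affected_ruby? is a hypothetical helper name:

```ruby
require 'rubygems'

# Compare version numbers segment by segment rather than as strings.
def affected_ruby?(version = RUBY_VERSION)
  Gem::Version.new(version) <= Gem::Version.new('1.9.1')
end

puts affected_ruby?('1.8.7')   # true
puts affected_ruby?('1.9.2')   # false
puts affected_ruby?('1.10.0')  # false, although '1.10.0' <= '1.9.1' is true for strings
```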

If you’re a Larch user, the latest Larch development gem includes this fix.

Sanitize 1.2.0 released

Sunday January 17, 2010 @ 04:22 PM (PST)

Version 1.2.0 of Sanitize, my whitelist-based HTML sanitizing library for Ruby, is now available. Consult the HISTORY file for a complete list of changes.

Introducing Transformers

This release adds a major new feature called transformers. Transformers allow you to filter and alter HTML nodes using your own custom logic, on top of (or instead of) Sanitize’s core filter. A transformer is any Ruby object that responds to call() (such as a lambda or proc) and returns either nil or a Hash containing certain optional response values.

To use one or more transformers, pass them to the :transformers config setting:

Sanitize.clean(html, :transformers => [transformer_one, transformer_two])

Input

Each registered transformer’s call() method will be called once for each element node in the HTML, and will receive as an argument an environment Hash that contains Sanitize config information and a reference to a Nokogiri::XML::Node object.

The transformer has full access to the Nokogiri::XML::Node that’s passed into it and to the rest of the document via the node’s document() method. Any changes will be reflected instantly in the document and passed on to subsequently-called transformers and to Sanitize itself. A transformer may even call Sanitize internally to perform custom sanitization if needed.

Transformers have a tremendous amount of power, including the power to completely bypass Sanitize’s built-in filtering.

Output

A transformer may return either nil or a Hash. A return value of nil indicates that the transformer does not wish to act on the current node in any way. A returned Hash may contain instructions that tell Sanitize to whitelist certain attributes or nodes, or to replace the current node with a new node (see the README for specifics).
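
To make the nil-or-Hash contract concrete, here's a minimal sketch that runs without Sanitize or Nokogiri installed. FakeNode is a hypothetical stand-in for Nokogiri::XML::Node (a real transformer receives an actual node), and the choice of <abbr> is arbitrary; the :node and :whitelist_nodes keys are the same ones used in the YouTube example in this post:

```ruby
# Stand-in for Nokogiri::XML::Node so the sketch runs without any gems
# (hypothetical; Sanitize passes a real Nokogiri node in env[:node]).
FakeNode = Struct.new(:name)

# Transformer: ask Sanitize to whitelist <abbr> elements and decline to act
# on everything else.
allow_abbr = lambda do |env|
  node = env[:node]

  # nil means "I have no opinion about this node."
  return nil unless node.name.to_s.downcase == 'abbr'

  {:whitelist_nodes => [node]}
end

abbr_result = allow_abbr.call(:node => FakeNode.new('ABBR'))
div_result  = allow_abbr.call(:node => FakeNode.new('div'))

puts abbr_result.inspect
puts div_result.inspect
```

With Sanitize installed, a lambda like this would be passed in via the :transformers setting shown earlier.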

Example: Transformer to whitelist YouTube video embeds

The following example demonstrates how to create a Sanitize transformer that will safely whitelist valid YouTube video embeds without having to blindly allow other kinds of embedded content, which would be the case if you tried to do this by just whitelisting all <object>, <embed>, and <param> elements:

lambda do |env|
  node      = env[:node]
  node_name = node.name.to_s.downcase
  parent    = node.parent

  # Since the transformer receives the deepest nodes first, we look for a
  # <param> element or an <embed> element whose parent is an <object>.
  return nil unless (node_name == 'param' || node_name == 'embed') &&
      parent.name.to_s.downcase == 'object'

  if node_name == 'param'
    # Quick XPath search to find the <param> node that contains the video URL.
    return nil unless movie_node = parent.search('param[@name="movie"]')[0]
    url = movie_node['value']
  else
    # Since this is an <embed>, the video URL is in the "src" attribute. No
    # extra work needed.
    url = node['src']
  end

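  # Verify that the video URL is actually a valid YouTube video URL. Note the
  # \A anchor: in Ruby, ^ matches at the start of any line, so a URL with an
  # embedded newline could otherwise slip through.
  return nil unless url =~ /\Ahttp:\/\/(?:www\.)?youtube\.com\/v\//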

  # We're now certain that this is a YouTube embed, but we still need to run
  # it through a special Sanitize step to ensure that no unwanted elements or
  # attributes that don't belong in a YouTube embed can sneak in.
  Sanitize.clean_node!(parent, {
    :elements   => ['embed', 'object', 'param'],
    :attributes => {
      'embed'  => ['allowfullscreen', 'allowscriptaccess', 'height', 'src', 'type', 'width'],
      'object' => ['height', 'width'],
      'param'  => ['name', 'value']
    }
  })

  # Now that we're sure that this is a valid YouTube embed and that there are
  # no unwanted elements or attributes hidden inside it, we can tell Sanitize
  # to whitelist the current node (<param> or <embed>) and its parent
  # (<object>).
  {:whitelist_nodes => [node, parent]}
end

For more details on transformers, consult the README.

Installing

To install or upgrade Sanitize via RubyGems, run:

gem install sanitize

Last week, Google banned my PHP port of JSMin from Google Code due to a quibble over a line in the license stating that “The Software shall be used for Good, not Evil”, which they believe makes the license non-free. When I asked Google’s Chris DiBona whether all Google Code projects including JSMin would be subject to bans due to this clause in the license, he replied, “Sadly, yes”.

Today, Etherpad (which was recently acquired by Google) released their source code on Google Code. Unfortunately, their source tree contains at least two different JSMin ports (one in JavaScript and one in Python), thus making Etherpad non-free and violating Google Code’s terms of service. I’ve notified Google via an email to the Google Code mailing list.

I bring this up not because I have anything against Etherpad or Google Code, and not because I want to start a fight, but because it demonstrates the slipperiness of the slope Google launched themselves down when they banned jsmin-php last week. While I may disagree with their interpretation of the JSMin license as non-free, Google is certainly within their rights to refuse to host it. However, since JSMin is so widely used by so many open source projects, Google now has to choose between banning popular, high profile projects (including their own) or applying their rules selectively and thus promoting a double standard.

So what will it be, Google? Will you remove JSMin from Etherpad, ban Etherpad, or just be—dare I say it—evil and ignore your own rules when they’re inconvenient?

If you need a new host for the Etherpad project, the lovely folks over at GitHub don’t seem to have any problem hosting JSMin.


Update (2009-12-18): Chris clarifies: “As a side note, it’s not a matter of violating the terms of service, which don’t mention specific licenses, it is against our practices, though.” I’ve updated the title of this post accordingly. Chris has also asked the Etherpad maintainers to remove JSMin, which seems to indicate that Google is going to do the right thing and follow their own rules. Admirable!

Update 2 (2009-12-18): There are several other Google-sponsored projects that fall afoul of this ban as well:

JSMin isn't welcome on Google Code

Tuesday December 08, 2009 @ 01:37 PM (PST)

Google’s Chris DiBona emailed me this morning to tell me that unless I removed a specific line from the license of my jsmin-php project (a PHP port of Douglas Crockford’s JSMin), Google Code would no longer host the project.

The license in question is the one attached to the original jsmin.c, and is a slightly modified version of the MIT License. Here it is with the offending line emphasized:

Copyright © 2002 Douglas Crockford (www.crockford.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the “Software”), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

The Software shall be used for Good, not Evil.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

As Google (and some others) interpret it, this additional requirement constitutes a vague use restriction and thus makes the license non-free. Chris explained that if I were to remove that line from the license and “return to a proper open source license that we support”, then jsmin-php could stay on Google Code. Otherwise, he said, “we can’t host you”.

Of course, I can’t change the license, because it’s not my license. It’s Douglas’s license, and he wants people who use his software and derivative works of his software to use it for good and not evil. All derivative works and copies of jsmin.c either include this license or are in violation of it.

I added jsmin-php to Google Code in 2007. Since then, it’s been downloaded over 20,000 times. As of today, its new home is GitHub.

I don’t really mind moving the project—I’ve been intending to do it for a while anyway—and I certainly appreciate the fact that Chris was kind enough to send me a personal email about this before taking any action. But jsmin-php is unlikely to be the only project affected by Google’s discovery of JSMin’s license.

In my reply to Chris, I asked him:

There are quite a few other projects on Google Code that are ports of jsmin.c or include either ports or the original. Does this mean those projects will also be banned from Google Code unless jsmin.c’s license changes?

Chris responded: “Sadly, yes.”

I don’t know if Google intends to proactively hunt down all projects using JSMin or whether they’ll only take action when someone rats you out, but if you currently have a project on Google Code that is derived from or includes jsmin.c, you might want to consider migrating to a new host with less restrictive policies.

I asked Douglas what he thought of this. He responded: “When did Google stop being against evil?”


Update (2009-12-09): Via @miraglia, here’s a hilarious excerpt from Doug’s talk, “The JSON Saga”, in which he gives some background on why he added this clause to the license and how often people ask him to remove it:

When I put the reference implementation onto the website, I needed to put a software license on it. I looked up all the licenses that are available, and there were a lot of them. I decided the one I liked the best was the MIT license, which was a notice that you would put on your source, and it would say: “you’re allowed to use this for any purpose you want, just leave the notice in the source, and don’t sue me.” I love that license, it’s really good.

But this was late in 2002, we’d just started the War On Terror, and we were going after the evil-doers with the President, and the Vice-President, and I felt like I need to do my part.

[laughter]

So I added one more line to my license, which was: “The Software shall be used for Good, not Evil.” I thought I’d done my job. About once a year I’ll get a letter from a crank who says: “I should have a right to use it for evil!”

[laughter]

“I’m not going to use it until you change your license!” Or they’ll write to me and say: “How do I know if it’s evil or not? I don’t think it’s evil, but someone else might think it’s evil, so I’m not going to use it.” Great, it’s working. My license works, I’m stopping the evil doers!

Audience member: If you ask for a separate license, can you use it for evil?

Douglas: That’s an interesting point. Also about once a year, I get a letter from a lawyer, every year a different lawyer, at a company—I don’t want to embarrass the company by saying their name, so I’ll just say their initials—IBM

[laughter]

…saying that they want to use something I wrote. Because I put this on everything I write, now. They want to use something that I wrote in something that they wrote, and they were pretty sure they weren’t going to use it for evil, but they couldn’t say for sure about their customers. So could I give them a special license for that?

Of course. So I wrote back—this happened literally two weeks ago—“I give permission for IBM, its customers, partners, and minions, to use JSLint for evil.”

[laughter and applause]

And the attorney wrote back and said: “Thanks very much, Douglas!”

You can see the full video of the talk at YUI Theater (the excerpt above is from 39:45).

History Lite is a new YUI 3 Gallery module that provides an extremely lightweight (856 bytes minified and gzipped) and flexible Ajax browser history API. I originally wrote History Lite as a YUI 2 module for use on Yahoo! Search, and when the YUI 3 Gallery was announced recently, I jumped at the chance to port it to YUI 3 and release it publicly.

What’s it For?

Ajax applications often involve client-side interactions that change the contents or state of the page without performing a full page refresh. Unfortunately, browsers don’t record new history events for this kind of interaction, which means that the back/forward buttons cannot be used to navigate through the client-side changes. It also means that bookmarks and copied/pasted URLs will not return the user to the actual page state they might expect.

History Lite and other similar libraries provide APIs that Ajax applications can use to programmatically add state information to the browser’s history by manipulating the document’s location hash (the part of the URL after the # character), thus preserving the expected back/forward button behavior. This also results in copyable, bookmarkable URLs that allow an Ajax application to restore its state when it’s loaded.

YUI 2 and 3 already provide an excellent History utility written by my colleague Julien Lecomte. However, it has a few inconvenient requirements — an <iframe> must be added to the page, and all state parameters must be pre-registered before the module is initialized — which are necessary in order to provide full support for IE6 and IE7. This makes it a bit heavy for performance-sensitive use cases (especially since the <iframe> causes another HTTP request) and results in an API that can be difficult to share between multiple unrelated modules that coexist on a page.

History Lite provides only partial support for IE6 and IE7, which makes it possible to have a much smaller implementation and a more flexible API that doesn’t require any pre-existing markup or initialization. If supporting older versions of IE is critical for you, then you should use the YUI History utility. However, if you’re willing to do without legacy IE support, History Lite is a good alternative.

Usage

History Lite is hosted on the same Yahoo! CDN as YUI 3 itself, so you don’t even need to download anything to use it. Just tell YUI where to find it and it’ll be loaded automatically on demand:

<script src="http://yui.yahooapis.com/3.0.0/build/yui/yui-min.js"></script>
<script>
  YUI({modules: {
    'gallery-history-lite': {
      fullpath: 'http://yui.yahooapis.com/gallery-2009.12.15-22/build/gallery-history-lite/gallery-history-lite-min.js',
      requires: ['event-custom', 'event-custom-complex', 'node']
    }
  }}).use('gallery-history-lite', function (Y) {

    // Y.HistoryLite is now available to your code.

  });
</script>

History Lite doesn’t require any initialization, and the API consists of the add() and get() methods and the global history-lite:change event. Yep, that’s really the entire API!

Subscribe to the history-lite:change event to be notified when the history state changes. This occurs whenever a history parameter is added, modified, or removed. This example just logs stuff to the console to demonstrate how things work, but typically this is where you would implement any logic necessary to change the state of your application:

Y.on('history-lite:change', function (e) {
  // Properties on e.changed represent new or changed history parameters.
  Y.each(e.changed, function (value, name) {
    console.log(name + ' changed to "' + value + '"');
  });

  // Properties on e.removed represent history parameters that have been
  // removed.
  Y.each(e.removed, function (value, name) {
    console.log(name + ' was removed');
  });

  // The get() method returns the current value of the specified history
  // parameter. If you call get() without specifying a parameter name,
  // it'll return an object containing all current history parameters and
  // their values.
  console.log('current value of foo is ' + Y.HistoryLite.get('foo'));
});

In addition to listening for the history-lite:change event, it’s also a good idea to call get() when the page loads in order to restore state from a bookmarked or copied/pasted URL.

Use the add() method to add new entries to the browser history. Each call to add() will modify the document’s location hash, thus triggering the history-lite:change event:

// The add() method accepts an object containing key/value pairs of
// history parameter names and values. Each call to add() creates a new
// browser history entry.
Y.HistoryLite.add({foo: 'bar', baz: 'quux'});

// The add() method will also accept a query string.
Y.HistoryLite.add('foo=kittens&bar=puppies');

// A null or undefined value causes that parameter to be removed from
// the history state.
Y.HistoryLite.add({foo: null, baz: 'monkeys'});

Whenever you want your application to perform a state-changing action, use add() to trigger a change event and then perform the actual state change in the event handler (or in code called from the event handler). This enforces code modularity while also ensuring that state changes are explicitly tied to history events.

Supported Browsers

  • Firefox 2+
  • Google Chrome (all versions)
  • Internet Explorer 8+
  • Opera 9+
  • Safari 3+
  • Mobile Safari (all versions)

IE6 and IE7 are partially supported in that state changes and back/forward navigation within a single pageview will work, and bookmarked URLs will restore state. However, after navigating away from a page and then returning using the back/forward buttons, previous Ajax history from within that page will be lost.

Sanitize 1.1.0 released

Tuesday October 13, 2009 @ 08:04 PM (PDT)

Sanitize 1.1.0 is now available. The biggest change in this release is a migration from Hpricot to Nokogiri, contributed by Adam Hooper. In addition, a new :output config setting allows you to specify whether you want Sanitize to output XHTML (the default) or HTML4, and Peter Cooper contributed a fix for a bug in which Sanitize would incorrectly strip a whitelisted URL if a path segment contained a colon.

To install or upgrade Sanitize via RubyGems, run:

gem install sanitize

Context clues

Wednesday September 09, 2009 @ 09:07 PM (PDT)

The following badly edited paragraph from an article at Telegraph.co.uk raises some serious questions about what my iPhone is doing when I’m not looking:

Blackberries, iPods, mobile phones, plams TVs, navigation systems, and air defence missiles all use a sprinkling of rare earth metals. They are used to filter viruses and bacteria from water, and cleaning up Sarin gas and VX nerve agents.

It also raises other questions, like “what’s a plams TV?” and “did anyone even read that last sentence before publishing this article?”

Oh yeah, and there’s some other stuff about China hoarding the world’s supply of vitally important rare earth metals and leaving everyone else to fend for themselves, but if the article was researched with as much care as it was edited, it’s probably safe to assume it’s mostly wrong.

I’ve reviewed three different desktop backup applications on wonko.com over the years: Carbonite, Mozy, and CrashPlan. I stopped using Carbonite because it was too basic and too expensive. I stopped using Mozy because I lost hundreds of gigs of data due to a hard drive failure and Mozy’s horrendously broken restore process made it impossible to restore many of my backed up files. I still use CrashPlan, which I love and which has reliably saved my ass several times.

My backup software reviews are among the most commented-on posts on this blog. People find them in searches and can’t resist adding their thoughts. These posts still get several new comments each week. Since my blog has become a repository of comments, both positive and negative, on backup software, I thought I’d tally up the totals.

To produce the graph below, I perused all the comments on this site that were attached to one of my backup software reviews or which contained the name of one or more of the aforementioned backup applications. I excluded my own comments and comments from users who clearly hadn’t actually used the software in question. This graph is a tally of all the positive and negative comments that remained for each application.

Graph of positive and negative comment counts

The totals are as follows:

Carbonite: 1 positive, 3 negative
CrashPlan: 7 positive, 1 negative
Mozy: 10 positive, 88 negative

I’m not sure the totals for Carbonite and CrashPlan are large enough to be statistically significant, but it’s clear that people hate Mozy (or at least that people who search for Mozy and find this blog hate it).
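
For a rough sense of proportion, the raw counts can be turned into percentages. This is only arithmetic on the totals listed above, not a statistical test, and the variable names are my own:

```ruby
# Comment tallies copied from the list above: [positive, negative].
tallies = {
  'Carbonite' => [1, 3],
  'CrashPlan' => [7, 1],
  'Mozy'      => [10, 88]
}

# Percentage of each application's comments that were negative.
negative_share = {}
tallies.each do |name, (positive, negative)|
  negative_share[name] = (negative * 100.0 / (positive + negative)).round(1)
end

negative_share.each { |name, pct| puts "#{name}: #{pct}% negative" }
```

That works out to 75.0% for Carbonite, 12.5% for CrashPlan, and 89.8% for Mozy.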

Update: I’ve updated the totals and the graph to reflect the comments on this post as of 2009-08-14 17:53 PDT.

Copyright © 2002-2010 Ryan Grove. All rights reserved.