Adam @ Heroku
a tornado of razorblades

RubyGems 1.2

June 23, 2008 at 01:38 PM

The new RubyGems doesn't update the index every time, so gems install very quickly. Add the --no-rdoc --no-ri options and they install instantly. Excellent.

Comments: 0 (view/add your own) Tags: ruby

Rack, and Why It Matters

June 19, 2008 at 05:37 PM

Rack is one of the most important developments in the Ruby web space in the past year. I suspect it's been slow to get attention because the benefits are a bit subtle. Witness the Rails core team being confused about Rack just a few months ago. So if you don't get what the deal is with Rack, don't feel bad - you're in good company.

James covered Rack in his Railsconf talk, partially at my insistence. (His talk was about Mongrel handlers, but Rack middleware is a newer and better way to achieve the same end.) It's worth noting that he asked the crowd - a couple hundred Rubyists - whether they had heard of Rack, and almost every single hand went up. But when he asked if they knew what it was for, not a single hand was raised.

So what's the deal with Rack? In short, Rack provides a standard interface between the web app server and the app framework. This is useful in light of the multiplying number of web app servers (Webrick, Mongrel, Thin, Ebb...) and frameworks (Rails, Merb, Sinatra, Ramaze...). A standard not only reduces the amount of code the framework authors have to write, but it makes the layers in the stack more pluggable. Pluggability encourages experimentation (which means more innovation over time), and generally makes the whole stack more robust.

One implication of Rack is that you can skip the app framework altogether. I've always liked using standalone Mongrels running tiny Ruby apps without a framework for internal daemons. These days, I generally use Sinatra for that purpose - but there's still something cool about skipping the use of any framework and just coding down to the metal.

Want to try it out yourself? Stick this code into hello.ru:

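class HelloHandler
   def call(env)
      # The body must respond to each, so wrap the string in an array
      # (a bare String stopped responding to each in Ruby 1.9).
      # Lowercase header names are accepted by every Rack version.
      [ 200, { 'content-type' => 'text/plain' }, [ 'hello, world' ] ]
   end
end

run HelloHandler.new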

Then at the shell:

rackup hello.ru &
curl http://localhost:9292/

Congratulations, you've just made a frameworkless Ruby web application in five lines of code.

A Rack handler is anything that can respond to the call method and returns an array with the status code, output headers, and output body. Handlers can be the end of the request chain, or do input and output filtering anywhere in the middle (hence "middleware"). Here's an example from Marc-André Cournoyer. Though his example is presented for Thin, you can run his code on any Rack-compatible server.
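
To make the "middle of the chain" idea concrete, here is a minimal sketch in plain Ruby, with no Rack gem required. HelloApp and Shout are hypothetical names invented for this illustration; the only thing taken from Rack is the convention that a handler responds to call and returns status, headers, and body.

```ruby
# A terminal handler: the end of the request chain.
class HelloApp
  def call(env)
    [200, { 'content-type' => 'text/plain' }, ['hello, world']]
  end
end

# Hypothetical middleware: wraps another handler and filters its output.
class Shout
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    shouted = []
    body.each { |chunk| shouted << chunk.upcase }  # output filtering
    [status, headers, shouted]
  end
end

# Compose by hand; the env hash is empty because nothing here reads it.
app = Shout.new(HelloApp.new)
status, _headers, body = app.call({})
puts "#{status} #{body.join}"
```

In a rackup file the same composition would presumably be written as use Shout followed by run HelloApp.new, which builds the same chain for you.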

In the real world, what is Rack middleware useful for? We recently ported the Heroku toolbar to Rack middleware. The previous implementation was several hundred lines of very hard-to-follow monkeypatching of ActionController, combined with a rarely-used and poorly-maintained plugin framework for Mongrel called GemPlugins. (Which I nominate for Most Confusing Name Ever.) That code was hard to read and nearly impossible to spec, but it was the only way we could make it work with the traditional Mongrel/Rails setup. It was also very tightly coupled to a particular version of Rails and Mongrel.

Ricardo (one of the new Heroku devs) banged out the Rack middleware port in just a couple of days. It's a fraction of the number of lines of code, and can be specced normally. Plus, our toolbar is now compatible with any Ruby app server or framework.

Because Rack separates the layers of the stack more cleanly, it was way easier to hook the Heroku toolbar code into the right place. Take that lesson and generalize it, and you'll start to glimpse the significance of Rack.

A Better Daemonize

May 07, 2008 at 02:16 AM

Mongrel, Thin, and every other web application server I've ever used all suffer from a similar deficiency: they daemonize too early. That is, they daemonize prior to trying to boot your app, which means any error - even a really obvious, immediate boot problem - will silently feed into the log as the process dies, without so much as a peep on the command line.

Try this:

$ mkdir nothing
$ cd nothing
$ thin start -d && echo success

(Substitute "mongrel_rails" for "thin" here if you want; the result is identical.)

Wait, what? There's not even anything there. Why does it return true? Early daemonization, that's why.

A better approach would be to boot the app, and once it's online and listening and ready to serve requests, then return.
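
Here is a rough sketch of what late daemonization could look like, assuming a Unix system where fork is available. start_daemon is a hypothetical helper written for this post, not something Thin or Mongrel provides: the child boots the app and reports the outcome over a pipe, and the parent only returns once it has heard back.

```ruby
# Hypothetical helper: fork first, but make the parent wait until the
# child reports whether booting succeeded.
def start_daemon
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    begin
      yield                              # boot the app; may raise
      writer.puts 'ok'
    rescue StandardError, ScriptError => e
      writer.puts "error: #{e.message}"
    end
    writer.close
    # A real server would call Process.setsid and enter its accept loop
    # here. The sketch simply ends the child without running exit hooks.
    exit!(0)
  end

  writer.close
  report = reader.gets.to_s.chomp        # blocks until the child reports
  reader.close
  Process.detach(pid)                    # reap the child in the background
  warn report unless report == 'ok'      # the missing peep on the command line
  report == 'ok'
end

puts start_daemon { :booted }
puts start_daemon { raise 'nothing to boot here' }
```

With something like this, the command can exit nonzero when the boot fails, so thin start -d && echo success would stay quiet in an empty directory.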

Comments: 0 (view/add your own) Tags: ruby, thin

What Defines the Ruby Community?

April 29, 2008 at 10:18 PM

A friend of mine, who is a Ruby developer but a little less immersed in the Ruby culture than I, was recently boggling at the excitement around the new VMs (Rubinius, JRuby...) and new frameworks (Merb, Sinatra...). "Wait," he said. "They're replacing Ruby, and they're replacing Rails. So what the heck defines the Ruby on Rails community, if both Ruby and Rails are replaceable?"

Good question. Peter Cooper wrote "'Ruby' is starting to represent both a community and a language 'ideal' rather than just a single, well-defined programming language." I agree. But what are the specific traits that bind the Ruby community together?

I suspect that there will be surprising divergence of opinion on this subject. But here's my answer, in the form of two characteristics that I believe all Rubyists share.

First, Rubyists love elegance. We want to solve problems in a simple and elegant fashion. Most programming languages and software infrastructure feel like the inside of an industrial revolution-era factory: they get the job done, but they sure ain't pretty. Rubyists create things that have the minimalist and pleasing aesthetic of a haiku or a Zen garden. We are so committed to elegance that given the choice between an inelegant solution and none at all, we typically choose the latter.

The second, and more subtle point: Rubyists are dynamists. We have a deep understanding of the infinite series of technological progress: each stage of advancement building on the last. There is no such thing as perfection: anything and everything can be improved upon. Because of this, we are not afraid to swap out any component for a superior replacement. Apache giving way to Nginx, Subversion giving way to Git, Prototype giving way to jQuery, Mongrel giving way to Thin, Test::Unit giving way to RSpec. Even our most fundamental foundation components - Ruby and Rails - are not safe, if someone can build better replacements.

Comments: 0 (view/add your own) Tags: ruby

Avoiding Inject

April 07, 2008 at 01:30 PM

I was rather taken with inject when I first started getting serious with Ruby. It allows you to turn ugly, imperative-style loops requiring temporary variables into functional-style expressions. Like this:

total = 0
items.each do |item|
  total += item.price * item.quantity
end
total

Becomes:

items.inject(0) do |total, item|
  total + item.price * item.quantity
end

Much better. (Jay Fields expands on this, in case you're not already an inject junkie.)

And yet, somehow, this hasn't always sat quite right with me. I had to look up the syntax quite a bit when I was first learning it (I always wanted to do |item, total| rather than the reverse), and from time to time my use of inject seems to create subtle bugs that take me a while to figure out. It feels right in theory, but in practice something is a little off.

My partner and coding buddy Orion pointed out to me recently that map is actually a simpler solution in most cases. The trick is knowing the right Enumerable methods to go with it. Like sum, perfect for the example above:

items.map { |item| item.price * item.quantity }.sum

This breaks the process into two steps: extracting the information you want, then operating on the result set. (I daresay this might be a simple version of map/reduce.) Plus, it works elegantly for hashes, something that always frustrates me about inject. For example, turning a hash into a key-value string with inject is:

hash.inject("") do |string, key_value|
  string + "#{key_value[0]}=#{key_value[1]}\n"
end

With map, this is:

hash.map { |key, value| "#{key}=#{value}" }.join("\n")

The two-step process wins out on elegance here, too.
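
If you want to run the comparison yourself, here is a small self-contained sketch. The Item struct and the sample data are made up for illustration. Note that sum is built into Ruby 2.4 and later; on older versions it comes from ActiveSupport.

```ruby
# Hypothetical sample data standing in for real line items.
Item = Struct.new(:price, :quantity)
items = [Item.new(5, 2), Item.new(3, 4)]

# One step: accumulate while iterating.
with_inject = items.inject(0) { |total, item| total + item.price * item.quantity }

# Two steps: extract the values, then operate on the result set.
with_map = items.map { |item| item.price * item.quantity }.sum

puts with_inject == with_map

# The same two-step shape works on a hash, with key and value unpacked.
hash = { 'a' => 1, 'b' => 2 }
pairs = hash.map { |key, value| "#{key}=#{value}" }.join("\n")
puts pairs
```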

Comments: 5 (view/add your own) Tags: ruby

A Taste of the Future

March 18, 2008 at 05:26 PM

Yesterday I had a chance to tinker around with a new project using Merb, DataMapper, RSpec, and Thin. Merb + Thin means your app serves up pages twice as fast, with half the memory. Merb + DataMapper means your app is threadsafe. RSpec + DataMapper means your tests are completely independent of the database - which means no db:test:prepare step, so the tests run lightning fast and without fragile database dependencies.

Above all, these components are all cleaner, smaller, and more specialized than their mainstream counterparts (Rails, ActiveRecord, Test::Unit, and Mongrel). It's too early for me to make any kind of real judgement about these tools (except for RSpec, which is very mature and entirely suitable for production use). But using these four tools in a stack together leaves me with the gut sense that this is where the future lies.

Rest Client

March 09, 2008 at 08:03 PM

REST is part of the Ruby Way. Which is why I'm surprised that every time I go to access a RESTful resource, I find myself writing some sort of ad hoc REST client. Net::HTTP is too low level - you've got to write at least three or four fairly dense lines of code even for a relatively simple GET or PUT.
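
For a sense of what those dense lines look like, here is a sketch of a PUT with a payload using only the standard library. build_put and send_request are hypothetical helper names for this example, and the URL in the usage line is a placeholder. Nothing is sent over the network until send_request is called.

```ruby
require 'net/http'
require 'uri'

# Build a PUT request with a payload and an explicit content type.
def build_put(url, payload, content_type)
  uri = URI.parse(url)
  request = Net::HTTP::Put.new(uri.request_uri)
  request['Content-Type'] = content_type
  request.body = payload
  [uri, request]
end

# Open a connection, send the request, and return the response.
def send_request(uri, request)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

uri, request = build_put('http://example.com/some/resource', 'payload', 'text/plain')
puts "#{request.method} #{uri.host}#{request.path}"
```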

I was banking on ActiveResource being the de facto solution starting with Rails 2, but I was a bit disappointed when it was finally released. Its purpose is fairly narrow - accessing resources that are database-recordish and which operate completely in a certain XML format. But further, it doesn't (as near as I can tell) support nested resources, which cuts out about 70% of what I might want to use it for.

The only other thing I can find is this, which monkey patches open-uri to handle other kinds of verbs. Fine, but still a bit too low level.

While I was toying with Sinatra the other day, I realized that what I wanted was just the client-side equivalent of its controller syntax. So I threw together rest-client.

require 'rest_client'

RestClient.get 'http://gemtacular.com/gems'

RestClient.post 'http://myphotosite.com/users/adam/photos', File.read('pic.jpg'), :content_type => 'image/jpg'

RestClient.delete 'http://heroku.com/apps/myapp'

The middle one - post (or put) with a payload and a non-XML content type - is the one that interests me the most, and that I find hardest to do with other libraries. Particularly for a one-off on the command line. I'll usually cobble together a curl command with a bunch of obscure switches that I can never remember. But now, I've just added require 'rest_client' to my .rush/env.rb, so at the rush command line I can instantly access any REST resource on the web with an easy-to-remember one-liner.

I also threw together a test server at rest-test.heroku.com, to try out all the different verbs. It just echoes back the verb, the resource you requested (wildcard routing will match anything), and info about the payload. Here's a session from my rush shell:

rush> RestClient.get "http://rest-test.heroku.com/some/resource"
GET http://rest-test.heroku.com/some/resource
rush> RestClient.put "http://rest-test.heroku.com/some/resource", home['pic.jpg'].contents, :content_type => 'image/jpg'
PUT http://rest-test.heroku.com/some/resource with a 70335 byte payload, content type image/jpg
rush> RestClient.delete "http://rest-test.heroku.com/some/resource"
DELETE http://rest-test.heroku.com/some/resource

gem install rest-client if you'd like to give it a try. RDocs here.

git-wiki

March 09, 2008 at 02:28 PM

git-wiki is a wiki written in less than 200 lines of code using Sinatra as the web framework and Git as the database. Quite clever. Check out how they store the CSS at the end of the file using __END__ - a Ruby trick I was previously unaware of.

Comments: 0 (view/add your own) Tags: git, ruby

rush, the Ruby Shell

February 19, 2008 at 01:07 PM

The unix shell (bash) and remote login (ssh) are centerpieces of the server and app deployment process. While building Heroku, however, Orion and I became aware that these tools are pretty far out of step with modern, agile development practices.

I've wanted a Ruby-syntax replacement for the unix shell from almost the moment I began using Ruby. Whenever I can, I write shell scripts as Ruby scripts with lots of backticks. But the "everything is text" mechanism starts to show its age when you end up with Ruby code like this:

my_ip = `ifconfig | grep inet | grep -v 127.0.0.1 | grep -v inet6`.match(/inet ([\d.]+)/)[1]

Yergh. (If you've never had occasion to write code like this in the wild, just check out god's process lookup methods.)

What we really want - the modern way - is to query the unix system (filesystem, processes, network, services) as if they were a database. This avoids the fragility of text pipes, the complexities of firing up a complete new environment on each system call, and would allow unit tests of system-level code.
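
As a small taste of the difference, here is the IP lookup again using Ruby's standard socket library rather than rush. This is only a sketch of the "ask for objects, not text" idea; first_ipv4 is a name made up for the example.

```ruby
require 'socket'

# Returns the first non-loopback IPv4 address as a String, or nil if the
# machine has none. No text parsing involved: each entry is an Addrinfo.
def first_ipv4
  addr = Socket.ip_address_list.find { |a| a.ipv4? && !a.ipv4_loopback? }
  addr && addr.ip_address
end

puts first_ipv4.inspect
```

Because the result is a list of objects, filtering is a method call instead of a chain of greps and a regexp.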

This is why I've created rush. It's a replacement for bash and ssh which uses Ruby syntax. More than that: it IS Ruby. Imagine an irb shell in which you can do everything you can do at the unix command line, but without any backticks. That's the vision; what I've got so far is a good start in that direction.

I said it replaces ssh, so this isn't just a shell: you can use it to control an arbitrary number of remote boxes, using the exact same interface as you would locally. Copy a file, or grep through a logfile, or kill a process - whether the machine is remote or local, the interface is identical.

Unlike the character-based connection of ssh, the rush client connects to the rushd process on the remote server and passes discrete commands. This is very similar to connecting to a remote database. When you run a SQL query, it makes no difference to the programmer whether the connection is a remote box or a local one; the client handles this seamlessly. You can even connect to multiple databases from the same client. rush goes even a step further by allowing you to pass data seamlessly between any number of local and remote connections.

A quick example:

local = Rush::Box.new('localhost')
remote = Rush::Box.new('my.remote.server')
local['/etc/hosts'].copy_to remote['/etc/']

Check the rush website for more examples and to try it out.

One of the inspirations for rush as a shell was this preview of MSH, the Microsoft shell. I get the feeling that this is vaporware (though I don't really know, not being in the Microsoft world at all), but the concepts introduced in the preview really struck home. Treating data returned from shell commands - like file matches from grep or processes from ps - as discrete objects, rather than text which must be parsed, is the obvious next evolution for shells.

There are some other deficiencies in the bash+ssh model:

  • Consistency. Bash is a full-fledged programming language; more specifically, it's a DSL for managing a unix system. But it could also be considered a collection of smaller languages. Standard tools like cp, mv, ps, grep, sed, and sort all have their own unique syntax. You may combine several of these in a single command, which is a bit like mixing several different programming languages on one line. I've been using unix shells on a daily basis for well over a decade, and still I sometimes forget the syntax for a particular command. Compare this to Ruby, or any other modern scripting language, where just a few months of working with the language is enough to teach you 90% of the language's syntax.
  • Quoting. Bash commands often have many layers of quoting. Consider:
    ssh remote "rm `grep '^class Thing' lib/* -l`"
    
    This has four layers of quoting: the bash command line on the client, the bash command line on the server, the backticks, and the regexp. This both causes confusion (do I need one backslash or two to escape this quote character?) and opens security holes.
  • Quirks and limitations. Two that I frequently bump into are the command line length limit and filenames with spaces. Backticks can run out of space in the command line buffer, such as:
    grep some_method `find . -name \*.rb`
    
    On a large project, you'll need to rewrite this with xargs:
    find . -name \*.rb | xargs grep some_method
    
    If the directory has filenames with spaces in them, you have to use the null separator option on both find and xargs:
    find . -name \*.rb -print0 | xargs -0 grep some_method
    
    Ick. In rush, this would be:
    dir['**/*.rb'].search(/some_method/)
    
  • Exceptions. Bash commands have three outputs: stdout, stderr, and the shell return value. Most of the time you're only interested in one and can ignore the others. But for more advanced uses, you need two, or perhaps all three. Explicitly checking for return values (or worse, pattern matching against stderr) is not a lot of fun. Exceptions are the modern way to handle errors.
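
To illustrate the exceptions point in plain Ruby (this is not rush's API), here is a sketch that folds the three outputs into one return value and one exception. CommandFailed and run! are hypothetical names for this example. It uses Open3.capture3 from the standard library, and RbConfig.ruby so the demo command is Ruby itself.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical error class carrying the exit status and stderr text.
class CommandFailed < StandardError
  attr_reader :status, :stderr

  def initialize(command, status, stderr)
    @status = status
    @stderr = stderr
    super("#{command.join(' ')} exited with #{status}: #{stderr.strip}")
  end
end

# Runs a command and returns its stdout, raising if the exit status is nonzero.
def run!(*command)
  stdout, stderr, status = Open3.capture3(*command)
  raise CommandFailed.new(command, status.exitstatus, stderr) unless status.success?
  stdout
end

begin
  run!(RbConfig.ruby, '-e', 'warn "oops"; exit 3')
rescue CommandFailed => e
  puts "failed with status #{e.status}"
end
```

The caller only deals with stdout on the happy path, and the return value and stderr show up together when something goes wrong.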

Go give it a try, and then tell me what you think.

Comments: 32 (view/add your own) Tags: ruby, rush, ssh, unix

Using Ruby's Readline Library

January 22, 2008 at 08:48 PM

There appears to be no documentation for Ruby's readline support. What's worse, it's written in C, so you can't (easily) read the source to find out its interface.

By peering at IRB's source, however, I was able to construct the following:

require 'readline'

loop do
  line = Readline.readline('> ')
  break if line.nil?   # nil means end of input (Ctrl-D)
  Readline::HISTORY.push(line)
  puts "You typed: #{line}"
end

Comments: 3 (view/add your own) Tags: ruby