While it's fun to write little one-off utility scripts, sometimes you need to write a real, honest-to-God command-line application. One that takes arguments and plays nicely with Unix conventions for input, output, error reporting, etc.

Fortunately, Ruby gives you all the building blocks you need to build command-line applications fairly easily. In this post I hope to go beyond the typical "how to do X with gem Y" approach, and instead give a broad overview of all the pieces that go together to make a first-rate command-line app.

Input: Environment Variables

Environment variables are typically used for short configuration values that you want to stick around for a while. API keys are a good example.

Setting Environment Variables

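Users normally set environment variables via a shell like bash. They can set them for the current shell only (a plain assignment like this isn't passed on to programs the shell launches unless the variable has been exported):

$ AWS_ACCESS_KEY_ID=FOO

They can export them, which makes them visible to the current session plus any program or bash session started from it:

$ export AWS_ACCESS_KEY_ID=FOO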

Or they can set them for a single program:

$ env AWS_ACCESS_KEY_ID=FOO aws s3 mb s3://mybucket

Reading Environment Variables

Regardless of how the user sets the environment variable, you can read it via Ruby's ENV hash.

key_id = ENV["AWS_ACCESS_KEY_ID"]
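In practice a variable may simply be unset, so it helps to decide up front what should happen in that case. Here's a minimal sketch of one way to handle it. The `MYAPP_REGION` and `MYAPP_API_KEY` names and the `read_config` helper are made up for illustration.

```ruby
# Hypothetical variable names, used only for illustration.
# `env` defaults to the real ENV but can be any hash-like object.
def read_config(env = ENV)
  {
    # fetch with a block supplies a fallback when the variable is unset
    region: env.fetch("MYAPP_REGION") { "us-east-1" },
    # [] simply returns nil when the variable is unset
    api_key: env["MYAPP_API_KEY"]
  }
end

config = read_config
warn "MYAPP_API_KEY is not set" if config[:api_key].nil?
```

Passing the environment in as an argument is just a convenience: it lets you try the helper with a plain hash instead of changing your real environment.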

How do environment vars work under the hood?

Every process has a table of environment variables associated with it. When it spawns child processes, they get a copy of that table along with whatever changes the parent wants to make.

Clear as mud?

For a more readable explanation, check out my other blog post: The Rubyist's Guide to Environment Variables.
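As a rough illustration of the "child gets a copy plus changes" idea, here's a small sketch in which a Ruby script starts a child Ruby process and hands it one extra variable. The `GREETING` name and the `child_sees` helper are hypothetical. `IO.popen` accepts an optional hash of environment changes as its first argument.

```ruby
require "rbconfig"

# Start a child Ruby process with one extra environment variable
# and return whatever the child prints for that variable.
def child_sees(name, value)
  IO.popen({ name => value },
           [RbConfig.ruby, "-e", "print ENV[ARGV[0]]", name],
           &:read)
end

seen = child_sees("GREETING", "hello from parent")
puts "child saw:   #{seen}"
# The change was applied to the child's copy only, so the parent's
# own table is left exactly as it was.
puts "parent sees: #{ENV["GREETING"].inspect}"
```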

Command-line arguments

Command-line arguments are anything you put after the program name when you run a program in the terminal:

$ echo foo bar baz

In the example above, "foo", "bar", and "baz" are all command-line arguments.

A more realistic example might look like this:

$ honeybadger deploy -v --environment staging

But even here, the command-line arguments are just text. If we want --environment staging to mean anything, we have to figure that out for ourselves.

Sending command-line arguments

We've already seen arguments sent like this:

$ echo foo bar baz

But since Ruby is an interpreted language, you might also come across something like this:

$ ruby myprog.rb foo

As far as the OS is concerned, you're running a program called "ruby" and passing it two arguments.

Luckily, Ruby is smart enough to know that the argument "foo" is intended for your program, not for Ruby itself. So your program will see one argument, "foo".

Reading command-line arguments

Command-line arguments are stored in an array called ARGV. It's a global constant, so you can access it from anywhere in your program.

In the example below, we're printing everything in ARGV. Feel free to copy and paste this into your terminal and play around with it a bit.

$ ruby -e "puts ARGV.inspect" foo bar baz
["foo", "bar", "baz"]

Please don't be thrown off by the ruby -e business. I'm only using it here because it lets me show the program and the result of running it in a single line.
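Since ARGV is an ordinary array of strings, you can pick it apart with normal array operations. Here's a small sketch that treats the first argument as a command and keeps the rest. The `split_command` helper is my own invention for this example, and it takes the array as a parameter so you can try it without touching the real ARGV.

```ruby
# Split an argument list into a command word and its remaining arguments.
# `argv` defaults to the real ARGV but can be any array of strings.
def split_command(argv = ARGV)
  command, *rest = argv
  [command, rest]
end

command, rest = split_command(%w[deploy -v --environment staging])
puts "command: #{command}"
puts "rest:    #{rest.inspect}"
```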

Parsing command-line arguments

Command-line arguments are just text. If you want the text to mean anything, you're going to have to parse it. Fortunately, there are several good libraries that can help you with the parsing.

Over the years, a somewhat standard syntax for command-line arguments has emerged. It looks something like this:

$ program -a --option foo

This style lets you have boolean flags like -h to display help. It also lets you specify options with values.

Introducing OptionParser

The OptionParser class is part of Ruby's standard library. It gives you an easy way to parse options that match the style above.

Let's make a simple application that says hello. It will let you specify a name via a command-line argument.

require 'optparse'

# This will hold the options we parse
options = {}

OptionParser.new do |parser|

  # Whenever we see -n or --name, with an 
  # argument, save the argument.
  parser.on("-n", "--name NAME", "The name of the person to greet.") do |v|
    options[:name] = v
  end
end.parse!

# Now we can use the options hash however we like. 
puts "Hello #{ options[:name] }"  if options[:name]

Then when I run it, it just works:

$ ruby hello.rb --name Starr
Hello Starr

$ ruby hello.rb -n Starr
Hello Starr
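One detail worth knowing: `parse!` removes the switches it recognizes and leaves everything else behind, so plain arguments such as filenames are still available afterwards. Here's a sketch of that, assuming a hypothetical `parse_greeting_args` helper that works on a copy of the argument list rather than on ARGV itself.

```ruby
require "optparse"

# Parse a copy of `argv` and return the options plus whatever was left over.
def parse_greeting_args(argv)
  options = {}
  parser = OptionParser.new do |p|
    p.on("-n", "--name NAME", "The name of the person to greet.") do |v|
      options[:name] = v
    end
  end
  # parse! strips out the switches it understands; everything else remains
  leftover = parser.parse!(argv.dup)
  [options, leftover]
end

options, leftover = parse_greeting_args(%w[--name Starr notes.txt])
puts "name:     #{options[:name]}"
puts "leftover: #{leftover.inspect}"
```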

Adding a "help" screen

Adding a help feature is just as easy. All we have to do is provide some banner text and add a handler for -h:

OptionParser.new do |parser|
  parser.banner = "Usage: hello.rb [options]"

  parser.on("-h", "--help", "Show this help message") do
    puts parser
    exit
  end

  ...
end.parse!

Now we can ask for help:

$ ruby hello.rb -h
Usage: hello.rb [options]
    -h, --help                       Show this help message
    -n, --name NAME                  The name of the person to greet.

Typecasting arguments

All command-line arguments are strings, but sometimes you'd like the user to give you a number or a date. Doing the conversion by hand can be tedious, so OptionParser does it for you.

In the example below, I'm going to add a "count" option that lets the user specify how many times the program should say hello. I'm telling OptionParser to cast to an Integer, which it does.

OptionParser.new do |parser|

  ...

  # Note the `Integer` arg. That tells the parser to cast the value to an int.
  # I could have used `Float`, `Date` (after `require 'optparse/date'`), or a number of other types.
  parser.on("-c", "--count COUNT", Integer, "Repeat the message COUNT times") do |v|
    options[:count] = v
  end

end.parse!

if options[:name]
  options.fetch(:count, 1).times do
    puts "Hello #{ options[:name] }" 
  end
end

Now, when I run the program I'm greeted multiple times:

$ ruby hello.rb -n Starr -c 5
Hello Starr
Hello Starr
Hello Starr
Hello Starr
Hello Starr
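What happens when the user passes something that isn't a number? OptionParser raises `OptionParser::InvalidArgument`, which you can rescue and report. Here's a minimal sketch. The `parse_count` helper and its choice to return nil on bad input are assumptions I'm making for the example.

```ruby
require "optparse"

# Return the requested count (default 1), or nil if the value was not
# a valid integer.
def parse_count(argv)
  count = 1
  OptionParser.new do |parser|
    parser.on("-c", "--count COUNT", Integer, "Repeat the message COUNT times") do |v|
      count = v
    end
  end.parse!(argv.dup)
  count
rescue OptionParser::InvalidArgument => e
  # Report the problem on STDERR and let the caller decide what to do
  warn e.message
  nil
end

puts parse_count(%w[-c 5]).inspect
puts parse_count(%w[-c five]).inspect
```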

Naming conventions

One of the hardest aspects of programming can be figuring out what to name things. Command-line arguments are no exception. To help you, I've compiled a short table of common arguments and their meanings:

Flag  Common meanings
-a    All, or append
-d    Debug mode, or specify a directory
-e    Execute something, or edit it
-f    Specify a file, or force an operation
-h    Help
-m    Specify a message
-o    Specify an output file or device
-q    Quiet mode
-v    Verbose mode, or print the current version
-y    Say "yes" to any prompts

Alternatives to OptionParser

OptionParser is nice, but it does have its limitations. Most obviously, it doesn't support the command-based syntaxes that have become popular in recent years:

$ myprog command subcommand -V

Fortunately, there are a ton of option parsing libraries out there. I'm sure you'll be able to find one that you like. Here are three that look kind of interesting:

  • GLI - The "Git-like Interface" command-line parser allows you to easily make applications that, like git, are a single executable with multiple commands.
  • CRI - An easy-to-use library for building command-line tools with support for nested commands.
  • Methadone - Methadone provides its own option parsing DSL, but it goes way beyond that. It will set up a directory structure for you, a test suite and logging. It's like Rails for your CLI.
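If adding a gem isn't an option, you can get part of the way with OptionParser alone. Its `order!` method stops at the first word that isn't a switch, which lets you treat that word as a command and parse the rest separately. What follows is only a rough sketch of the idea: the `run` method, the `deploy` command, and the return values are all hypothetical, and a real app would want help text and error handling on top.

```ruby
require "optparse"

# Parse global switches, then a command word, then that command's switches.
def run(argv)
  argv = argv.dup
  global = {}

  # order! stops at the first non-switch word and leaves it in argv
  OptionParser.new do |p|
    p.on("-v", "--verbose") { global[:verbose] = true }
  end.order!(argv)

  command = argv.shift
  case command
  when "deploy"
    opts = { environment: "production" }
    OptionParser.new do |p|
      p.on("-e", "--environment NAME") { |v| opts[:environment] = v }
    end.parse!(argv)
    [:deploy, global, opts]
  else
    [:unknown, global, {}]
  end
end

p run(%w[-v deploy --environment staging])
```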

Inputting larger amounts of data with STDIN

Command-line arguments aren't suitable for inputting larger amounts of data. For that, you'll want to use an IO stream. And STDIN is one of the most convenient IO streams out there.

Most programs will automatically be assigned a STDIN by the operating system. It's like a read-only file that the OS uses to send your app data. It can be used to get keyboard input from the user. But more importantly, it can be used to receive data from other programs via a pipe.

You use STDIN just like you'd use a read-only file. Here, I'm piping some text into the STDIN of my program. Then I use the read method to fetch it.

$ echo "hello world" | ruby -e "puts STDIN.read.upcase"
HELLO WORLD

In Ruby, you can use Enumerable features on IO objects. And STDIN is no exception. That means you can do all sorts of tricks:

# Get the first 20 lines
STDIN.first(20)

# Convert to integers and reject odds
STDIN.map(&:to_i).reject(&:odd?)

...etc
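For larger inputs you usually want to process one line at a time rather than reading everything at once. Here's a small sketch of that pattern. The `number_lines` helper is hypothetical; it defaults to STDIN and STDOUT, and the demo feeds it a StringIO so it runs without any piped input.

```ruby
require "stringio"

# Read `input` line by line and write each line, numbered, to `output`.
def number_lines(input = STDIN, output = STDOUT)
  input.each_line.with_index(1) do |line, n|
    output.puts "#{n}: #{line.chomp}"
  end
end

# Stand-in for piped data, so the example is self-contained
demo = StringIO.new("alpha\nbeta\n")
number_lines(demo)
```

In a real script you would call `number_lines` with no arguments and pipe data in, the same way as the echo example above.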

Outputting results to STDOUT

STDOUT is a write-only IO stream that the operating system assigns to your program.

By writing to STDOUT, you can send text to the user's terminal. The user can redirect the output to a file, or pipe it into another program.

You'll use STDOUT just like you'd use any other write-only file:

STDOUT.write("Hi!\n")
# Hi!

And of course, puts and print both output to STDOUT.

Sending status information to STDERR

STDERR is yet another write-only IO stream. It's not for general output. It's specifically for status and error messages, so they don't get in the way of the real output.

STDERR will usually be displayed in the user's terminal, even if they're redirecting STDOUT to a file or another program.

In the example below we are using curl to fetch a webpage. It outputs the content of the webpage to STDOUT, and displays progress information to STDERR:

$ curl "https://www.google.com/" > /tmp/g.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  151k    0  151k    0     0   277k      0 --:--:-- --:--:-- --:--:--  277k

Writing to STDERR from your own programs is easy. Just treat it like the IO object it is:

STDERR.write("blah\n")
STDERR.puts("blah")
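To see why the split matters, here's a sketch of a tiny program that sends its results to one stream and its progress messages to the other. The `square_all` method and its `out:` and `err:` keyword arguments are my own invention for the example. They default to the real streams but can be swapped out.

```ruby
require "stringio"

# Write each squared number to `out` and a progress line to `err`.
def square_all(numbers, out: $stdout, err: $stderr)
  numbers.each_with_index do |n, i|
    err.puts "processing #{i + 1} of #{numbers.size}" # status only
    out.puts n * n                                    # the real output
  end
end

square_all([2, 3, 4])
```

If you save this as a script and redirect its output to a file, the file should contain only the squares while the progress lines still show up in the terminal.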

Hitting it with the pretty stick

The examples we've seen so far aren't going to win any design awards. But there's no reason that command-line apps should be ugly, or non-interactive.

If you decide to soup up your app, there are a ton of great gems to do some of the heavy lifting. Here are three that I like:

  • highline - Highline is a great library that takes a lot of the work out of gathering, validating, and typecasting user input.
  • command_line_reporter - Makes it easy to generate ASCII progress reports.
  • paint - Lets you easily add ANSI color codes to colorize your boring old text.

See a more exhaustive list

Exit Status

If your program exits with an error, you should tell the OS by way of an exit code. That's what makes it possible for bash code like this to work:

$ prog1 && prog2

If you're not familiar with bash's && operator, it simply means "run prog2 only if prog1 exited successfully."

Ruby exits with a "success" code by default, and a "failure" code on exception. Everything else is up to you to implement. Fortunately, it's easy:

# Any nonzero argument to `exit` counts as failure. 
exit(1)
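If you want to see those exit codes for yourself, here's a sketch that runs a few one-line Ruby programs as child processes and prints the status each one returned. The `exit_status_of` helper is hypothetical. The child's STDERR is sent to the null device just to keep the demo output quiet.

```ruby
require "rbconfig"

# Run a one-line Ruby program as a child process and return its exit status.
def exit_status_of(code)
  system(RbConfig.ruby, "-e", code, err: File::NULL)
  $?.exitstatus
end

puts exit_status_of("exit 0")        # explicit success
puts exit_status_of("exit 3")        # explicit failure with a custom code
puts exit_status_of("abort 'boom'")  # abort exits with status 1
puts exit_status_of("raise 'oops'")  # an uncaught exception exits with status 1
```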

For more info, check out my other post How to exit a Ruby program.

Setting the process name

If your command-line program is going to be running for a little while, it's important that people know what it is when they list the system's processes. Normally, a Ruby program's process name consists of everything that you typed in to run the program. So if you typed in ruby myapp -v, that's the name of the process.

If your program has a lot of arguments, this can become unreadable. So you might want to set a more friendly process name. You can do this like so:

Process.setproctitle("My Awesome Command Line App")

If you're feeling fancy, you can use the process title to give the user information about what the process is doing at any given moment.

Process.setproctitle("Mail Sender: initializing")
init(...)
Process.setproctitle("Mail Sender: connecting")
connect(...)
Process.setproctitle("Mail Sender: sending")
send(...)
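If you find yourself repeating that pattern, you could wrap it in a small helper. This is just a sketch: the `step` method is hypothetical, and the blocks here return placeholder symbols instead of doing real work.

```ruby
# Set the process title for the duration of a step, then run the step.
def step(app_name, label)
  Process.setproctitle("#{app_name}: #{label}")
  yield
end

step("Mail Sender", "initializing") { :initialized }
result = step("Mail Sender", "connecting") { :connected }
puts result
```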

For more info, check out my post: How to change the process name of your Ruby script as shown by top and ps