Trying to understand Ruby :symbols

Reading Kevin Clark's recent post on Ruby symbols made me want to learn more about this mysterious part of the Ruby language. After writing some unit tests to check that a symbol really is the same object as any other symbol with the same name, I have come to the following conclusion: symbols are immutable, string-like objects, and same-named symbols share one object and therefore one object_id.
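Here is a minimal sketch of the kind of check described above. It is not the original unit tests (those aren't shown here), and the variable names are made up for illustration:

```ruby
# Two mentions of the same symbol name refer to one object.
first  = :ponce_de_leon
second = :ponce_de_leon
same_symbol_object = first.equal?(second)                  # identity, not just equality
same_symbol_id     = first.object_id == second.object_id

# Two strings with the same characters are equal, but they are separate objects.
# (.dup forces a fresh copy, so the result does not depend on frozen-literal settings.)
str_a = "ponce_de_leon".dup
str_b = "ponce_de_leon".dup
equal_strings      = str_a == str_b
same_string_object = str_a.equal?(str_b)

# Converting a string to a symbol lands on the same shared object.
interned_matches = "ponce_de_leon".to_sym.equal?(first)

# Symbols are always frozen and offer no in-place append like String#<<.
symbol_is_frozen  = first.frozen?
symbol_has_append = first.respond_to?(:<<)

puts "symbols share an object: #{same_symbol_object}"
puts "strings share an object: #{same_string_object}"
```

The comparison with strings is the interesting part: `==` says the two strings match, but only the symbols are literally the same object.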

Kevin's example points out that they make great keys for hashes, and good indicators of what action to take (:get, :post, etc.).
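To make that concrete for myself, here is a small sketch of both uses. The request hash and the describe_action helper are invented for illustration and are not taken from Kevin's post:

```ruby
# A hash keyed by symbols; the values here are made-up example data.
request = { :method => :post, :path => "/comments" }

# Hypothetical helper: choose what to do based on a symbol.
def describe_action(http_method)
  case http_method
  when :get  then "fetch a resource"
  when :post then "create a resource"
  else "unsupported"
  end
end

description = describe_action(request[:method])
puts description
```

Because every :post is the same object, the lookup and the case comparison never have to compare strings character by character.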

Coming from a Java background, I have hit my head on symbols a number of times; this rather helpful post from Rob Sanheim cleared things up a lot for me.

Why the lucky stiff also has a good explanation. He suggests:


Symbols are words that look just like variables. Again, they may contain letters, digits, or underscores. But they start with a colon.

:a, :b, or :ponce_de_leon are examples.

Symbols are lightweight strings. Usually, symbols are used in situations where you need a string but you won’t be printing it to the screen.

You could say a symbol is a bit easier on the computer. It’s like an antacid. The colon indicates the bubbles trickling up from your computer’s stomach as it digests the symbol. Ah. Sweet, sweet relief.

So I've got a number of different explanations of symbols and I've written 5 unit tests about them, but I'm still not 100% sure of all their uses. I suppose I just need to get Rails installed and start playing with that to really see how symbols should be used...



Alaric said...

Ah, if you really want to see symbols in their natural splendour, try a Lisp language - Lisp is, I gather, where symbols first emerged.

As you say, yes, they're special strings, with the system only ever keeping one copy, so all symbols that look the same are the same. Yet the symbol will just be represented internally as a single machine word, usually a pointer to the string, so they can be compared as quickly as an integer - unlike a normal string.

Use symbols wherever you'd use an enum in C, or a set of final static ints with capital-letter names in Java.

They also make good sentinel values. E.g., a CSV parser might take an input file or character stream, and return a list of arrays (one per line) of strings or integers (one per field). However, any invalid lines or fields can be represented by a symbol such as :error - distinct from any actual string or integer or array that might appear there. Sure, you could use a NULL value, but what if you want to indicate more than one kind of error? :invalid_integer and :unterminated_quoted_string can be used.
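A rough Ruby sketch of the sentinel idea, handling just one field at a time rather than a whole CSV file. The parse_integer_field helper and its rules are assumptions made up for illustration, and the symbol names are written with underscores because a plain Ruby symbol literal can't contain a hyphen:

```ruby
# Hypothetical helper: returns an Integer for a valid field,
# or a symbol naming what went wrong.
def parse_integer_field(field)
  text = field.strip
  # An opening quote with no matching closing quote.
  if text.start_with?('"') && (text.length < 2 || !text.end_with?('"'))
    return :unterminated_quoted_string
  end
  Integer(text, 10)   # raises ArgumentError for anything that is not a base-10 integer
rescue ArgumentError
  :invalid_integer
end

fields = ["42", "4x2", "\"oops", "-7"]
parsed = fields.map { |field| parse_integer_field(field) }

# The sentinels can never be mistaken for real data, so they are easy to pick out.
errors = parsed.grep(Symbol)
puts parsed.inspect
```

The caller can then branch on which symbol came back, something a single nil could not express.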

LISP languages represent their own source code with symbols. To cut a long story short, a LISP expression is either:

1) A symbol - written as a string without quotes
2) A number - 2.3
3) A string - in quotes, "Hello"
4) A list - surrounded in (brackets), with elements separated by spaces

There are some other types, but they're irrelevant to this discussion.

Given the above, a simple Scheme function definition might look like:

(define (isZero? x) (= x 0))

define, isZero?, x, and = are all symbols. 0 is an integer. So as you can see, symbols are also like keywords in a programming language; I don't know Ruby, but I guess it has an "if" keyword, which means exactly the same wherever you use it. It is represented to you, the user, as the ASCII codes for "i" and "f", but to the computer, it's just a token with some arbitrary internal identifying number. In other words, it's a symbol!
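For Ruby readers, the same shape can be sketched as nested arrays of symbols. This is only an illustration of the data structure, not something Ruby or Scheme does for you, and the variable names are invented:

```ruby
# The Scheme form (define (isZero? x) (= x 0)) written as nested Ruby arrays.
# :"=" needs quotes because a bare := is not a valid symbol literal in Ruby.
expr = [:define, [:isZero?, :x], [:"=", :x, 0]]

# Pull out every symbol in the tree, leaving the integer behind.
symbols = expr.flatten.grep(Symbol)

# Both mentions of x are the very same object, just as in Lisp.
same_x = expr[1][1].equal?(expr[2][1])

puts "number of symbols: #{symbols.length}"
puts "both x's are one object: #{same_x}"
```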

abeacock said...

Thank you for your detailed comment; the part about using symbols as error indicators was particularly interesting.

I've not dived into Lisp - getting into Ruby is about as much as my spare time can take!