Mu: 2010-01

2010-01-31

Closure Library Tutorial: Tasks (part 2)

It's been nearly two months since I've written about JavaScript and now is the time to follow up on the previous part of this tutorial. Since then, there have been some changes to the Closure library, nothing major, mostly improvements to the documentation and bug fixes.

Sliders

Adding a Slider is unsurprisingly as easy to do as any other components. Contrary to those we already used though, it doesn't come with a nice interface with pre-made CSS and images. Closure let you decide how your sliders will look like, which is comprehensible as this control is really versatile and it would be hard to come up with a good generic design for it. A slider element is composed of two DIVs, one representing the slider itself, the other being the thumb. Both have a class name to let you control their appearance. For the slider element this name is composed of a prefix found in the Slider's CSS_CLASS_PREFIX static property followed by either of "-horizontal" or "-vertical" depending on the orientation. The thumb has only a single class name determined by the THUMB_CSS_CLASS property, note that as the thumb is contained in the slider element, it can be styled depending on orientation too.

There's some rules to obey while styling a slider:

a slider cannot be inlined, but you can use inline-block;
a slider cannot be floated;
the thumb position style must be set to absolute or relative.

Note that it's preferable to make both elements hide their overflowing contents to prevent scrolling bars from appearing.

Using Sliders

As sliders cannot be floated, we have a problem with the existing code. If you remember the way tasks were constructed in the previous tutorial, the action buttons were added to the summary DIV directly with their float styles set to right. This isn't the correct manner of doing things anyway, so we'll fix that first. We'll group the task controls into a single floating element.

    this.summaryControlsDiv = goog.dom.createDom('div', { 'class': 'controls' });
    this.summaryDiv = goog.dom.createDom('div', { 'class': 'summary' }, this.summary, this.summaryControlsDiv);

This break some ugly part of our code, the makeActionButton function is being passed the task DIV and renders the button being created by selecting the appropriate child node. We'll just do a quick fix and correct this issue in the next section.

    button.render(element.childNodes[0].childNodes[1]);

Another more subtle change we need to make is in makeButtons, we must change the order in which the action buttons are created. We must also change some styles and add an entry for the new CSS class.

      .task .summary .controls { float: right; }
      .task .description { background: #eee; border-top: solid 1px #333; }
      
      .taskButton { margin-top: -6px; margin-left: 7px; }
      .editorButton { margin: 0.2em 0.3em 0.5em; }

The following code create a slider, we'll stop event propagation the same way it's done for tasks' action buttons.

mu.tutorial.tasks.Task.prototype.makeSlider = function(task, element) {
    if (task.parent.id == 'taskList') {
        var slider = new goog.ui.Slider;
        slider.createDom();
        slider.render(element.childNodes[0].childNodes[1]);

        slider.addEventListener(
            goog.ui.Component.EventType.CHANGE,
            function() { /* TODO: something... */ });

        goog.events.listen(
            slider.getContentElement(),
            goog.events.EventType.CLICK,
            function(e) { e.stopPropagation(); });
    }
}

We need to call this function from makeDom just before creating the task buttons. To make the page actually show the sliders, we'll add the styles below.

      .goog-slider-horizontal, .goog-slider-thumb {
        overflow: hidden;
        height: 20px;
      }
      
      .goog-slider-horizontal {
        background-color: #ccc;
        display: inline-block;
        position: relative;
        width: 200px;
      }
 
      .goog-slider-thumb {
        background-color: #777;
        position: absolute;
        width: 20px;
      }

Cleaning Up

Earlier, we talked about how bad it was for makeActionButton to have to know where to render the button. We'll take the time to clean that up. If we look back at the code, one thing is obvious, we pass lots of arguments around. That make the code overly functional and this isn't a good approach here. Another thing is that we carefully initialized prototype members for every elements contained in a task and we end up not using them.

For this section I'll skip the code, so use your imagination. First thing to do is to remove the element argument from functions having it. For the controls, we'll use the object's properties, summaryControlsDiv for action buttons and sliders, and editorContainer for editor buttons. Finally, we need to replace the taskDiv local variable in makeDom by a property that we'll name element. We can't use it directly in our callback function as it is a closure and this doesn't point to the task object. We can work around that simply by using a temporary variable, we'll rewrite that code later anyway.

Making the Sliders Do Something

To put the sliders to good use, we'll make them control the priority of a task. We'll first make the tasks display their priorities, to be able to test things manually.

    this.priorityDiv =  goog.dom.createDom('div', { 'class': 'priority' }, this.priority.toString());
    this.summaryDiv = goog.dom.createDom('div', { 'class': 'summary' },
                                         this.summary,
                                         this.summaryControlsDiv,
                                         this.priorityDiv);

We'll also have to add styling for the new priority DIV, it must float to the right and have some padding on the right side. Now, lets add two things to the makeSlider function. Firstly, setting the slider value to the priority of the task currently being created. Secondly, replacing our TODO comment by changing the priority property as well as the content of the priority DIV.

        slider.setValue(this.priority);

        var task = this;
        slider.addEventListener(
            goog.ui.Component.EventType.CHANGE,
            function() {
                task.priority = slider.getValue();
                task.priorityDiv.innerHTML = task.priority;
            });

There's a small bug with this code, can you find it?

If you delete or mark a task as done, then undo that action, you'll see that the resurrected tasks' slider thumb is set at its lowest value. If you insert an alert to display the sliders' value when the task is recreated, you'll see that it's value is correct. Also the slider behave ~~as if the thumb was at the correct position~~ strangely, it can even go into infinite loop. This all points out toward the fact that when we recreate our tasks, they're being added to an invisible element. We'll reorganize our code in the next section to circumvent this issue.

Sorting Tasks

The sliders are now working (with some issues) that's good but it doesn't add much to our little application feature-wise. The obvious use of tasks priority would be to sort them. We'll need to do some major modification to our code to do this though. Currently the tasks add themselves to their containers and appear in whatever order they're being appended. We'll need to create a new object to represent task lists. I won't be showing all the changes made, just the most important ones, you can always refer to the final code if you need more details. First, we'll need to provide a new class name for our TaskList prototype.

goog.provide('mu.tutorial.tasks.TaskList');

mu.tutorial.tasks.TaskList = function(name, data, container) {
    this.name = name;
    this.container = container;
    this.tasks = [];

    for (var i = 0; i < data.length; i++) {
        this.tasks.push(new mu.tutorial.tasks.Task(data[i], this));
    }

    this.sort()
    this.render();
}

It's followed by its constructor which creates, sorts and renders the given tasks, this greatly simplify the makeTasks function. With this modification, we introduced some unwanted duplication of information, our tasks objects already have a reference to their containers. We'll just change that reference for another pointed at the list it's associated to. We'll also change the way we're using the taskLists namespace variable, it will now contains TaskList objects. Now lets see how the render function works.

mu.tutorial.tasks.TaskList.prototype.noTasks = function() {
    this.container.appendChild(
        goog.dom.createDom('h2', { 'class': 'empty' }, 'This list is empty!'));
}

mu.tutorial.tasks.TaskList.prototype.render = function() {
    goog.dom.removeChildren(this.container);

    if (this.tasks.length == 0)
        this.noTasks();
    else
        for (var i = 0; i < this.tasks.length; i++) {
            this.tasks[i].makeDom();
        }
}

It's basically the same code that was previously in makeTasks with the only difference that it clean up its container before adding new elements. We also moved noTasks into the TaskList object for convenience.

Next, some new code. We've got an array of tasks and we'll sort it using the default JavaScript sort function. Google has thought about adding helper functions for arrays including one for sorting, its advantage over the default JavaScript one is that it sorts numbers correctly. Here this sort function wouldn't be helping us as we'll write our own compare function.

mu.tutorial.tasks.defaultCompare = function(t1, t2) {
    return goog.array.defaultCompare(
        t2.priority,
        t1.priority);
};

mu.tutorial.tasks.TaskList.prototype.sort = function() {
    this.tasks.sort(mu.tutorial.tasks.defaultCompare);
}

We'll need to insert and remove tasks from our task lists. As they're always sorted, the Closure Library has two functions to help us: binaryInsert and binaryRemove. These allow for fast manipulation of sorted arrays by using a binary search algorithm.

mu.tutorial.tasks.TaskList.prototype.add = function(task) {
    task.list = this;
    
    goog.array.binaryInsert(
        this.tasks,
        task,
        mu.tutorial.tasks.defaultCompare);
}

mu.tutorial.tasks.TaskList.prototype.remove = function(task) {
    this.container.removeChild(task.element);
    
    goog.array.binaryRemove(
        this.tasks,
        task,
        mu.tutorial.tasks.defaultCompare);

    if (this.tasks.length == 0)
        this.render();
}

The add method is very simple, it just set the current task list as the given task parent and insert it into the array. For the remove method though, it's actually in charge of cleaning up the element of the task being removed. It also ensures to call noTask if there's no task left in that task list.

Next, we'll replace the clickActionButton listener code by a call to a new Task method called moveTo. This method will be in charge of removing the task from its parent list and inserting it into the specified one using the above methods.

mu.tutorial.tasks.Task.prototype.moveTo = function(target) {
    this.list.remove(this);
    mu.tutorial.tasks.taskLists[target].add(this);
}

Yet another change that merit a mention is the switchPanel function which takes a new argument to know what TaskList to render. This could be made cleaner, but we'll keep it that way for this part of the tutorial.

When a user change the priority of a task, we better have them sorted. We'll create a new Task method that will be called when someone stop using the slider. It make use of a new property of the Task object to remember the previous value so as not to reload a task list if the priority didn't changed.

mu.tutorial.tasks.Task.prototype.reload = function() {
    if (this.priority != this.previous_priority) {
        this.list.sort();
        this.list.render();
        this.previous_priority = this.priority;
    }
}

But now we have a problem. When shall we call this function? I've thought about it for some time and came up with a complex and convoluted solution. The correct solution, that is to use a timer and accumulate events, being too much bothersome for the scope of this tutorial, I've decided to settle on a small hack I've found that should do the job.

By the principles of YAGNI, I figured out that the sliders are to heavy to handle. Why would we need four different ways of modifying a task priority? So, from now on, we'll only listen for MOUSEOUT and KEYUP events.

        goog.events.listen(
            slider.getContentElement(),
            [goog.events.EventType.MOUSEOUT,
             goog.events.EventType.KEYUP],
            function() { task.reload(); });

There's still a problem with dragging the sliders, if only we could make that issue disappear.

Invisible Sliders

Sometimes the best style is no styles at all. We must face it, the sliders are ugly. Making them look awesome can take some time for a non-designer like me. While thinking about this problem, I've came up with a solution that would also simplify sliders event handling code. We could make the sliders invisible and put them on top of the priority numbers shown. That doesn't actually fix our issue, we only prevent users from thinking about dragging the slider thumb. This is just a hack and should only be done if you're really short on time. Talking of time, this part is much more longer than I expected. So I'll let you look at the result rather than the code as it's a very simple modification, mostly new DOM elements and CSS.

Less Rendering and More Stability

One last thing, our code is working like a charm and is more than fast enough for the small data set we're testing it with. But in real world situations, there's some bottlenecks that could potentially slow our application down. I'll let the testing for another part, but we'll remove one of those bottleneck.

The most processor hungry function in our code is certainly the render one. Presently, we're calling it every time the priority of a task change. But if the order of tasks don't change, we don't need to render them again. Moreover, the JavaScript sort function isn't stable, two tasks having the same priority could be reordered. So, to kill two birds with one stone a second time, we'll make our own stable sort function. Well, in fact, we'll just take the code from goog.array.stableSort and alter it to return true if the array changed.

mu.tutorial.tasks.TaskList.prototype.sort = function() {
    var arr = this.tasks
    for (var i = 0; i < arr.length; i++) {
        arr[i] = {index: i, value: arr[i]};
    }

    function stableCompareFn(obj1, obj2) {
        return mu.tutorial.tasks.defaultCompare(obj1.value, obj2.value) ||
            obj1.index - obj2.index;
    };
    
    goog.array.sort(arr, stableCompareFn);

    var changed = false;
    for (var i = 0; i < arr.length; i++) {
        if (i != arr[i].index)
            changed = true;
        arr[i] = arr[i].value;
    }
    return changed;
}

Then, it's only a matter of calling render when the sort call returns true in the reload method.

mu.tutorial.tasks.Task.prototype.reload = function() {
    if (this.priority != this.previous_priority) {
        if (this.list.sort())
            this.list.render();
        this.previous_priority = this.priority;
    }
}

Phew, that was a big post, I hope you enjoyed it! I'm still not quite sure about what will be the subject of the next part, if you have any suggestion, drop by a comment.

2010-01-24

Emacs Tip #1

Emacs is a great editor, I won't elaborate on this as there's countless blog posts already celebrating its countless features. After some years of usage, anyone using Emacs end up with a bloated .emacs file that may contains some good tips. I'll be explaining some of the tricks I'm using in a series of posts of which this one is the first.

Web Search Shortcuts

It's a known fact (in the sense that it's a common Intertubes meme) that some people use their computer exclusively through Emacs. I'm not that fond of this editor and prefer to do things like navigating the web with a real browser. Emacs as a way to send URLs to one, using the browse-url function. Feeding an unencoded URL to it could yield a bad request though. To fix that, we can use url-hexify-string from the url library. Here's a simple example of putting these functions to good use.

(require 'url)

(defun google ()
  (interactive)
  (let ((url (if (and transient-mark-mode mark-active)
               (buffer-substring-no-properties
                 (region-beginning)
                 (region-end))
               (thing-at-point 'symbol))))
    (browse-url
      (concat "http://www.google.com/search?"
        (format "q=%s" (url-hexify-string url))))))

This interactive function let us do a simple search with the text under the currently active region or with the symbol found under the cursor if there's no active region. This can be useful, you're no longer forced to copy-paste text from Emacs to your browser.

Next, we could try to send more complex queries to our preferred search engine, which in this case is Google. Standard library documentation such as Java's API specification generally have URLs that are easily queryable. With Google's allinurl keyword, we can make a function to search for a specific class symbol.

(defun search-all-in-url (query site inurl)
  (concat "http://www.google.com/"
    (format "search?q=allinurl:%s+site:%s+inurl:%s&btnI"
      (url-hexify-string query)
      (url-hexify-string site)
      (url-hexify-string inurl))))

(defun web-help (site inurl)
  (browse-url (search-all-in-url (thing-at-point 'symbol) site inurl)))

With web-help, we can now define custom interactive functions. Currently in my .emacs file, I've got two such functions, one for Java and the other for Ruby.

(defun java-help ()
  (interactive)
  (web-help "java.sun.com" "/javase/6/docs/api/"))

(defun ruby-help ()
  (interactive)
  (web-help "www.ruby-doc.org" "/core/"))

Then, it's a simple matter of adding key bindings for our new interactive functions. Personally, I like to have them bound to the lower F* keys.

(global-set-key [(f1)] 'google)
(global-set-key [(f2)] 'java-help)
(global-set-key [(f3)] 'ruby-help)

You might already use these shortcuts for other things (I think that F3 serves to start recording macros) but you certainly still have some unbound key combinations!

Update: I've corrected the second code example that I messed up while writing this post.

2010-01-19

A Recursive Walk on a String

Sometimes you don't pay attention to some details and this lead you right into the wrong direction. That's exactly what happened to me the other day. I had a specific problem I intended to solve for a while, nothing particularly difficult, but an interesting one. After hacking around for an hour, the situation was getting weird and I decided to ask for help on the Clojure Google Group. This post is an account of how that situation arose and what have been done to solve it.

Walking Recursively

The task is nearly as simple as explaining it. I needed a way to apply a function to each strings contained in a tree of nested data structures in Clojure. Not that easy to explain as you see, well you didn't really saw me rewrite the previous sentence a thousand times but you can probably picture it. My first attempt is really naive and leverages the clojure.walk library written by Stuart Sierra. It's a small library that provides functions to perform non-recursive walking of generic Clojure data structures. Lets first have a look at the walk function documentation.

user> (doc clojure.walk/walk)
-------------------------
clojure.walk/walk
([inner outer form])
  Traverses form, an arbitrary data structure.  inner and outer are
  functions.  Applies inner to each element of form, building up a
  data structure of the same type, then applies outer to the result.
  Recognizes all Clojure data structures except sorted-map-by.
  Consumes seqs as with doall. 
nil

It isn't hard to take this function and use it in a recursive one to do the job. Our function to transform strings could go in the inner argument, it's also where we'll add the recursive call. We don't need the outer one, so we will just use the identity function.

(use 'clojure.walk)

(defn recursive-string-walk [f form]
  (walk #(if (string? %) (f %) (recursive-string-walk f %))
    identity form))

It works well but there's an issue with this way of doing things, recursion will end up filling up the stack which will ultimately overflow. For the use I intend to make of this function, it isn't really a big problem, but I always like to take these kind of challenges.

A Buggy Test

The first thing that came to my mind to test the above issue was to create a nested list of arbitrary depth. We can take the iterate function to create an infinite sequence of nested function calls to list, then take and last to fetch the pit we want to test with.

user> (def pit (iterate list "bottom!"))
#'user/pit
user> (take 6 pit)
("bottom!" ("bottom!") (("bottom!")) ((("bottom!"))) (((("bottom!")))) ((((("bottom!"))))))
user> (last (take 6 pit))
((((("bottom!")))))
user> (recursive-string-walk identity (last (take 6 pit)))
((((("bottom!")))))
user> (recursive-string-walk #(str "reached " %) (last (take 6 pit)))
((((("reached bottom!")))))
user> (recursive-string-walk #(str "reached " %) (last (take 1000 pit)))
; Evaluation aborted.
user> (recursive-string-walk #(str "reached " %) (last (take 100 pit)))
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((("reached bottom!")))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

Work great! Now lets create an informal test function. This function should take a range and a recursive walk function then will test it using pits starting with the shallowest. This way we can find exactly at which point the overflow happen.

(defn test-walk [walker shallowest deepest]
  (doseq [depth (range shallowest deepest)]
    (println (walker #(str depth " reached " %)
               (last (take depth pit))))))

We'll use it to make sure it works and see which depth our recursive-string-walk can reach.

user> (test-walk recursive-string-walk 400 500)
...
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((("486 reached bottom!")))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
(...(; Evaluation aborted.

On my computer, this function cannot go deeper than 486 (I wonder how it would fare on a 486?) which isn't that much. It shouldn't be all that difficult to make a lazy version of that function. It boils down to look for sequences and wrap their recursive calls with lazy-seq.

(defn lazy-recursive-string-walk [f form]
  (walk #(cond
           (string? %) (f %)
           (seq? %)    (lazy-seq (lazy-recursive-string-walk f %))
           :default    (lazy-recursive-string-walk f %))
    identity form))

From my understanding of lazy data structures and the walk library, this version should be able to handle data structures of infinite depth given infinite resources. However the test function didn't seems to be in agreement with me as it was crashing at 828. That's about the time I asked for help and stopped thinking about this problem for a while.

Lazy Zippers

A week-end passed by and in the meantime, some people from the group tried to help me, yet nobody found what was the real problem. That would have been miraculous given the actual source of the bug. Furthermore, I was now more convinced than before than there was no bug in the lazy version. Tom Hicks provided a lazy version of walk, which was behaving in the same way. Laurent Petit also suggested that I try to write a version using zippers.

It was an intriguing suggestion, I never used zippers but heard often about them. Additionally, that day, MarkCC from Good Math, Bad Math just posted an article about them. So I learned what they are and how to use them. I must say this is a very neat functional programming technique, really worth the time to learn. It's not that hard, Mark has done a great job explaining it. I won't go into the details, but this version is quite easy to understand when you know that the next function do a depth-first traversal automatically.

(require '[clojure.zip :as z])

(defn lazy-recursive-string-walk-with-zipper [f form]
  (loop [loc (z/seq-zip form)]
    (if (z/end? loc)
      (z/root loc)
      (recur (z/next
               (if (string? (z/node loc))
                 (z/replace loc (f (z/node loc)))
                 loc))))))

To my surprise, the test function still overflowed the stack using this version. And it could simply not throw the StackOverflowError as it executes in a loop. Maybe the error wasn't coming from the lazy walk function after all.

So Where's The Bug?

I've been negligent about one thing, I didn't look closely enough at the stack traces. The one thrown by recursive-string-walk happen to be completely different than the one thrown by lazy-recursive-string-walk. Here's both of them one after the other.

No message.
  [Thrown class java.lang.StackOverflowError]

Restarts:
 0: [ABORT] Return to SLIME's top level.

Backtrace:
  0: clojure.lang.RT.boundedLength(RT.java:1128)
  1: clojure.lang.RestFn.applyTo(RestFn.java:135)
  2: clojure.core$apply__4370.invoke(core.clj:436)
  3: clojure.walk$walk__7942.invoke(walk.clj:43)
  4: recursive_string_walk$recursive_string_walk__4137.invoke(NO_SOURCE_FILE:1)
  5: recursive_string_walk$recursive_string_walk__4137$fn__4139.invoke(NO_SOURCE_FILE:1)
---
No message.
  [Thrown class java.lang.StackOverflowError]

Restarts:
 0: [ABORT] Return to SLIME's top level.

Backtrace:
  0: clojure.lang.RT.assoc(RT.java:666)
  1: clojure.core$assoc__4259.invoke(core.clj:146)
  2: clojure.lang.AFn.applyToHelper(AFn.java:179)
  3: clojure.lang.RestFn.applyTo(RestFn.java:137)
  4: clojure.lang.Ref.alter(Ref.java:174)
  5: clojure.core$alter__4907.doInvoke(core.clj:1542)

The actual source of the exception is the print function that is recursive on nested sequences. This make sense, why would someone want to print such monsters. Because of that inattention I lost time searching for an imaginary bug. Only a small modification is needed to make our test function working.

(defn get-bottom [p]
  (loop [p (first p)]
    (if (seq? p) (recur (first p)) p)))

(defn test-walk [walker shallowest deepest]
  (doseq [depth (range shallowest deepest)]
    (println (get-bottom
               (walker #(str depth " reached " %)
                 (lazy-seq (last (take depth pit))))))))

We simply search for the bottom item and print it, this will also make the test output clearer. We can now see what our original walk function can handle.

user> (test-walk lazy-recursive-string-walk 100000 100001)
...
; Evaluation aborted.
---
Java heap space
  [Thrown class java.lang.OutOfMemoryError]

Well, the heap too has its limit! We would need to augment the JVM heap memory size to be able to run the previous example. Lets reduce our ambitions and do some benchmarking instead.

user> (time (test-walk lazy-recursive-string-walk 10000 10001))
10000 reached bottom!
"Elapsed time: 25.138697 msecs"
nil
user> (time (test-walk lazy-recursive-string-walk-with-zipper 10000 10001))
10000 reached bottom!
"Elapsed time: 122.691972 msecs"
nil

As you see, the version using clojure.walk is much faster. That's perfectly understandable considering the more complex concept of zippers which have multiple uses aside walking trees. As both versions are fully lazy, the first one is the clear winner here.

What Can We Use This For

After all these trials and tribulations, you may wonder what use I may have for such a function. It's to help create functions or macros that perform string replacement on all strings in a Clojure form. This can be useful with Compojure while deploying your application to a different root path than "/" for example. The idea come from my experience with ASP.NET in which URLs can contain a character (~) to denote the web site root path.

(use 'clojure.contrib.str-utils)

(defn expand* [expansions string]
  (reduce (fn [s [regex replacement]]
            (re-gsub regex replacement s))
    string (partition 2 expansions)))

(defn expand-form [url form]
  (recursive-string-walk (partial expand* [#"^~" url]) form))

Having the same functionality in a macro would be nice too.

(defmacro with-expansions [expansions & body]
  `(do ~@(recursive-string-walk (partial expand* expansions) body)))

(defmacro expand [url & body]
  `(with-expansions [#"^~" ~(str url)] ~@body))

To put this to use I've taken a function written by James Reeves a few months ago during a discussion on the Compojure group. A function with the same name (with-context) already exists in Compojure though, so we'll rename it to with-root-path. The way it work is by binding the given context to a var and then we can use a function called url to add the context so that internal links still work once the application is deployed elsewhere. Using the expand-form function on html's arguments can reduce repetition and make the code less cluttered if there's a lot of such URLs.

(use 'compojure)

(declare *context*)

(defn with-root-path
  [ctx & route-seq]
  (let [handler (apply routes route-seq)
         pattern (re-pattern (str "^" ctx "(/.*)?"))]
    (fn [request]
      (if-let [[_ uri] (re-matches pattern (:uri request))]
        (binding [*context* ctx]
          (handler (assoc request :uri uri)))))))

(defn url [path]
  (str *context* path))

(defn html-expand [& trees]
  (apply html (expand-form *context* trees)))

(defroutes test-routes
  (with-root-path "/foo"
    (GET "/"    (html-expand [:h1 "test"] [:p "~/"]))
    (GET "/bar" (html [:h1 "test"] [:p (url "/bar")]))
    (ANY "*" (page-not-found))))

(defn start-test-server []
  (run-server {:port 8000} "/*" (servlet test-routes)))

The first route use a special version of the html function, it's only a slight improvement but I really like that way of doing things. There's plenty of other possible use cases though, but that should be enough for this post.

P.S.: I know the title isn't correct, but it sounds way better than A Recursive Walk on Strings Contained in Arbitrary Nested Data Structures.

2010-01-12

Documenting Clojure Code

Clojure is a truly amazing language and still a very young one. There is already a substantial quantity of documentation available for Clojure and some of its libraries. Yet, when talking about documentation, some is not enough. Sure, there's the REPL with all its bells and whistles, but sometimes I prefer to navigate through a well formatted API document using a browser. There's already some solutions for this, one on which I worked for the defunct Compojure website (which was based on similar code for Clojure) and another more serious attempt in clojure-contrib, gen-html-docs by Craig Andera. This library only output HTML and there arise a small (but annoying) problem: not all projects have their documentation formatted in HTML. For example, the new Compojure website runs on DokuWiki which have its own syntax and Gitorious integrated Wiki, used for ClojureQL, is using Markdown. If only every Wiki engine would understand Creole!

clj-doc

To be able to solve the above problem, I decided to start a new project called clj-doc. For now it's pretty much just a weekend coding session, but I'll continue to work on it so that one day it will be the best tool available to generate documentation for Clojure code. The basic usage is done through the gen-doc macro.

user> (use 'clj-doc)
nil
user> (gen-doc clj-doc.utils)
("<!DOCTYPE html ...Documentation for clj-doc.utils...")

This gives us a sequence of strings being the rendered output of each namespaces given using the default markup language. Namespaces could also be grouped using nested sequences, for now clj-doc support only one level of nesting. Another notable feature is that namespace groups can be specified using regular expressions. Here's an example demonstrating all these methods at once.

user> (use 'clojure.contrib.pprint)
nil
user> (pprint (gen-doc clj-doc
                       (clj-doc.markups clj-doc.generator clj-doc.utils)
                       #"clj-doc\.markups\."))
("<!DOCTYPE html ...Documentation for clj-doc..."
 "<!DOCTYPE html ...Documentation for clj-doc.markups, clj-doc.generator, clj-doc.utils..."
 "<!DOCTYPE html ...Documentation for clj-doc.markups.creole, clj-doc.markups.dokuwiki, ...")

Writing to Files

Usually you would want to write the documentation for your library to a file. I've added basic support for this with the gen-doc-to-file macro. It behaves exactly as gen-doc but writes each of the resulting documentation strings to a file.

user> (gen-doc-to-file "~/test.html" clj-doc)
nil

If you use more than one group of namespaces, this macro will write them into multiple files that will be numbered starting with zero by appending a number before the last dot if there's one, at the end otherwise. I intend to make this argument a format string where you can specify more complex filenames in a next version. So, if you got any ideas about how to design or implement such feature, I encourage you to add a comment to this post.

Supported Markups

Up until now, we've only been using the default output markup. The main point of this library being generating something else than HTML, we better have a way to use another markup language. Before talking about how to do that, lets look under the hood.

(ns clj-doc.markups.creole
  "Creole markup."
  (use clj-doc.markups))

(defmarkup
  #^{:doc "Creole markup."}
  creole
  :title        #(str "\n= "     %     " =")
  :namespace    #(str "\n== "    %    " ==")
  :section      #(str "\n=== "   %1  " ===\n" %2)
  :var-name     #(str "\n==== "  %  " ====")
  :var-arglist  #(str "\n**" % "**\\\\")
  :var-doc      #(str "\n\n" %))

This is the current implementation for the Creole Wiki markup generator. All markups are defined in the same way, they're basically only struct maps, for more details see the clj-doc.markups namespace. The library currently provides four output methods. We've already seen two of them, here's the complete list:

creole
dokuwiki
html-simple
markdown

To define your own markups, you can take one of the implementations, modify it and see how it work out, the keywords should be explicit enough. The library only look for markups into namespaces starting with "clj-doc.markups." though, but that's just a temporary limitation, eventually you'll be able to pass a markup struct map as an option.

The Option Map

The option map is an optional argument that must be specified before any namespaces. It presently only support two keywords, :markup and :separated-by. The first letting you choose the output method and the second used to change the default page separation behavior.

user> (gen-doc {:markup creole} clj-doc.markups.creole)
("\n= Documentation for clj-doc.markups.creole =\n== clj-doc.markups.creole namespace ==\n=== others ===\n\n==== creole ====\n\nCreole markup.")

The :separated-by option can only take one value, namespace, which makes the macros separate pages between namespaces instead of groups of them.

user> (pprint (gen-doc {:markup markdown :separated-by namespace} #"clj-doc\.markups\."))
("\n#Documentation for clj-doc.markups.creole..."
 "\n#Documentation for clj-doc.markups.dokuwiki..."
 "\n#Documentation for clj-doc.markups.html-simple..."
 "\n#Documentation for clj-doc.markups.markdown...")
nil

The Future

This is a very early release done the worse is better way, so it's still a pretty bare-bone tool. I'll be taking a pause this week to work on more (hopefully) profitable projects, so in the meantime I'll be pleased to ear about any suggestions to improve the way clj-doc works.

2010-01-11

Git Tip #1

Changing Filename Case Under Cygwin

When using Git on Windows XP, you're confronted with a basic problem, case changes aren't accounted for while renaming files as Windows internals doesn't have a clue about the difference between lower and upper case. You can actually change the casing, but you can't have two files using the same name with different casing for example. This behavior obviously affect Git, which cannot detect filename case changes.

$ git init Initialized empty Git repository in /home/budu/tmp/.git/

$ touch foo

$ git add .

$ git commit -m 'test' [master (root-commit) 5a1fbb3] test 0 files
changed, 0 insertions(+), 0 deletions(-) create mode 100644 fo

$ mv foo FOO

$ l total 0 drwxr-xr-x+ 1 budu None 0 Jan 11 18:15 .git/ -rw-r--r-- 1
budu None 0 Jan 11 17:59 FOO

$ git status # On branch master nothing to commit (working directory
clean)

This is an unfortunate situation which can be easily resolved by using a simple trick. The first thing you need to do is to copy the file you want to modify to a different temporary name. Then you must use git rm on the original file and finally rename the temporary file to the new name with changed casing.

$ cp FOO bar

$ git rm foo rm 'foo'

$ mv bar FOO

$ git add .

$ git status # On branch master # Changes to be committed: # (use "git
reset HEAD ..." to unstage) # # renamed: foo -> FOO #

One thing to note here is that when calling git rm you have to use the original file name if its casing is already changed.

Mu