Zen and the art of...

Showing posts with label clojure. Show all posts

2010-12-12

Improved Sandbar Forms

There has been a flurry of improvements to the Clojure web stack during the past year. Compojure has matured, some non-trivial code is available (for example, see Brian Carper's cow-blog), and there is the new Sandbar library, which brings a higher level of abstraction on top of Compojure and Ring. For now, it provides a stateful session mechanism, authorization and authentication, and a clever way of defining form layout, processing and validation. There's much more to come; you can look at the details in the roadmap.

For today, I'll discuss the recent changes to the forms namespace. A lot of work has been done on that part during the past week, but the changes to the API are minimal. I'll go over each option of the defform macro, but first let's talk about the code behind it. As Stuart Halloway stated in his book, the first rule of the Macro Club is: "Don't Write Macros." So the most significant alteration is the rewrite of the defform macro. Previously, it was a big piece of code (134 lines) generating a bunch of definitions of all sorts. It was quite hard to modify and had the usual constraints that come with using macros that way. It has now been replaced by a much simpler macro that calls the new make-form function.

Let's look at a sample form written for the current (0.3) version of Sandbar.

(forms/defform group-form "/group/edit"
  :fields [(forms/textfield :name)
           (forms/select :region
                         (db/all-regions)
                         {:id :name :prompt {"" "Select a Region"}})
           (forms/textarea :description)]
  :load #(db/fetch-group %)
  :on-cancel "/groups"
  :on-success #(do (db/store-group %)
                   (session/flash-put! :user-message
                                       "Group has been saved.")
                   "/groups")
  :properties {:name "Group's name:"
               :description "Description:"})

(defroutes group-form-routes
  (group-form (fn [request form] (views/layout form))))

It was nice but had a severe limitation: you were forced to use a fixed set of routes. When creating a record, "/new" was appended to the given URI; otherwise the "id" key was used. Also, the fields option tended to become cluttered with the data source bindings. These are the two main points I'll address here.

First, let's rewrite the fields option. It's quite similar, still taking a vector of field descriptions. The difference is that the field description functions now take the name of the field (as a keyword) followed by optional key/value pairs.

  :fields [(forms/textfield :name
                            :label "Group's name:")
           (forms/select :region
                         :prompt {"" "Select a Region"})
           (forms/textarea  :description
                            :label "Description:")]

Every field function can take a label option. Most also accept an optional boolean required option, which auto-generates the corresponding validator for that field, and the select field has a prompt option. Any other options are added to the field's HTML attributes.

Secondly, there are new options for managing the form's action and method attributes. Each is prefixed by create or update, followed by -action or -method depending on which attribute it sets. The actions can take parametrized routes, whose parameters are replaced by the matching route parameters from the incoming request.

  :create-action "/groups"
  :update-action "/groups/:id"
  :update-method :put

Thirdly, here's the new bindings option, which takes a map from field names to their respective binding information, itself a map.

  :bindings {:region {:value :id
                      :visible :name
                      :source (constantly (db/all-regions))
                      :data :id}}

The source option needs a function that fetches the relevant data, the visible option determines which field to show on the page, and the value and data options name the key holding the actual value in the source data map and the form data map respectively.
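As an illustration of how such a binding could be consumed, here's a small sketch that turns one into value/label pairs for a select field. This is a guess at the idea, not Sandbar's code: select-options is a made-up name, the sample rows stand in for (db/all-regions), and calling the source function with the request is an assumption.

```clojure
;; Hypothetical sketch, not Sandbar's code: turn one binding into
;; [value label] pairs for a select field.
(defn select-options [{:keys [source value visible]} request]
  (for [row (source request)]
    [(get row value) (get row visible)]))

(def region-binding
  {:value   :id
   :visible :name
   ;; Stand-in for (constantly (db/all-regions)).
   :source  (constantly [{:id 1 :name "North"} {:id 2 :name "South"}])
   :data    :id})

(select-options region-binding {})
;; => ([1 "North"] [2 "South"])
```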

Finally, here's the whole code for this example, using the new forms features and RESTful routes.

(forms/defform group-form
  "Form handler for Group entity."
  :fields [(forms/textfield :name
                            :label "Group's name:")
           (forms/select :region
                         :prompt {"" "Select a Region"})
           (forms/textarea  :description
                            :label "Description:")]
  :load #(db/fetch-group %)
  :on-cancel "/groups"
  :on-success #(do (db/store-group %)
                   (session/flash-put! :user-message
                                       "Group has been saved.")
                   "/groups")
  :create-action "/groups"
  :update-action "/groups/:id"
  :update-method :put
  :bindings {:region {:value :id
                      :visible :name
                      :source (constantly (db/all-regions))
                      :data :id}})

(defroutes group-form-routes
  (GET  "/groups/new"      request (group-form request))
  (POST "/groups"          request (group-form request))
  (GET  "/groups/:id/edit" request (group-form request))
  (PUT  "/groups/:id"      request (group-form request)))

Other modifications: most defform options can now take a function of the request instead of just a value, and the redirection URIs in the on-success and on-cancel options can be parametrized.
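The "value or function of the request" convention can be sketched in a couple of lines. option-value is a hypothetical helper, not a Sandbar function, and the shape of the request map below is made up for the example.

```clojure
;; Hypothetical helper: an option is either a plain value or a function
;; that computes the value from the request.
(defn option-value [option request]
  (if (fn? option)
    (option request)
    option))

(option-value "/groups" {})
;; => "/groups"
(option-value #(str "/groups/" (get-in % [:params :id]))
              {:params {:id "7"}})
;; => "/groups/7"
```

Using fn? rather than ifn? keeps keywords and maps, which are also callable in Clojure, as plain values.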

The forms namespace is still a work in progress. In particular, there is talk concerning the way forms get rendered, which currently lacks flexibility, as pointed out by David Nolen in this thread. Nothing has been done yet in this regard, so it's the right time for anyone interested to chime in on the discussion.

2010-02-22

Redesigning ClojureQL's Frontend

I haven't updated this blog in a while, as I've been pretty busy for the past few weeks. I've got many projects on the table and one of them is starting to get really interesting. I'm obviously talking about the subject of this post, ClojureQL. I've mainly been working on the backend since the end of last year. In the meantime, Meikel has started a rework of the frontend to provide a cleaner API, free from the magical artifacts introduced by using macros as in the current (0.9.7) version. This rework was triggered by a post carefully explaining the issue, courtesy of Zef.

I won't go further than a linkfest so here are the important links for those interested in the future of ClojureQL:

We're also ready to receive your suggestions on the brand new clojureql group. Please visit the Wiki pages and tell us what you think.

2010-02-07

Google AI Challenge in Clojure

These days, the web has been abuzz about the brand new Google-sponsored AI Challenge organized by the University of Waterloo Computer Science Club. I find it a great initiative and a good occasion for amateur AI researchers like me to do something other than web development or database-related coding. Being fond of Clojure, I've taken on the task of making a starter package for this language and making it available in a GitHub repository. It's not an official package yet, but that shouldn't be a problem as the organizers seem really open.

The code has been translated from the Java starter package. I've tried to keep the map code close to the original, so it may not be the most idiomatic Clojure code. There are four global refs containing the state of the game: width, height, walls and players. The first two are simple integers, walls is a hash map keyed by points (which are vectors), and players is a vector of points. There are some convenience functions to access that data: wall? tells whether a wall is found at the given coordinates, and you and them return each player's position. To move your bike you must call make-move with one of :north, :east, :south or :west. The map code ends up being nearly half the size of the Java version.

With this code we can rewrite the sample random bot in Clojure. I'll spare you any more explanation and only show the code.

(load "map")

(defn valid-moves [x y]
  [(when-not (wall? x (- y 1)) :north)
   (when-not (wall? (+ x 1) y) :east)
   (when-not (wall? x (+ y 1)) :south)
   (when-not (wall? (- x 1) y) :west)])

(defn compact [coll]
  (filter identity coll))

(defn choose-at-random [coll]
  (let [size (count coll)]
    (when (< 0 size)
      (nth coll (rand-int size)))))

(defn next-move []
  (choose-at-random
   (compact
    (apply valid-moves (you)))))

(defn game-loop []
  (loop []
    (initialize)
    (make-move (next-move))
    (recur)))

(game-loop)
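To try the move-selection logic without the contest's map code, here's a sketch that swaps in a stand-in for wall?. The walls set and this wall? are assumptions made for the example (the starter package keeps its state in refs), and falling back to :north when boxed in is my own choice, not something taken from the package.

```clojure
;; Hypothetical stand-in for the starter package's wall?: walls are a set
;; of [x y] points instead of the package's global refs.
(def walls #{[1 0] [0 1]})

(defn wall? [x y]
  (contains? walls [x y]))

(defn valid-moves
  "Returns the directions that do not lead straight into a wall."
  [x y]
  (remove nil?
          [(when-not (wall? x (- y 1)) :north)
           (when-not (wall? (+ x 1) y) :east)
           (when-not (wall? x (+ y 1)) :south)
           (when-not (wall? (- x 1) y) :west)]))

(defn next-move
  "Picks a random open direction, or :north when boxed in (assumed fallback)."
  [x y]
  (let [moves (valid-moves x y)]
    (if (seq moves)
      (rand-nth moves)
      :north)))

(valid-moves 1 1)
;; => (:east :south)
```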

Hack and be merry!

Update: I've updated the GitHub repository with some improvements and instructions on how to compile your entry. While talking with one of the contest organizers on their forum, I realized that even though Clojure is more than fast enough for this kind of task, the Clojure jar loading time takes a good chunk of the first turn's time, so be careful. By the way, it's now an official package.

2010-01-19

A Recursive Walk on a String

Sometimes you don't pay attention to some details and this leads you right in the wrong direction. That's exactly what happened to me the other day. I had a specific problem I had intended to solve for a while, nothing particularly difficult, but an interesting one. After hacking around for an hour, the situation was getting weird and I decided to ask for help on the Clojure Google Group. This post is an account of how that situation arose and what was done to solve it.

Walking Recursively

The task is nearly as simple as explaining it. I needed a way to apply a function to each string contained in a tree of nested data structures in Clojure. Not that easy to explain, as you see; you didn't see me rewrite the previous sentence a thousand times, but you can probably picture it. My first attempt is really naive and leverages the clojure.walk library written by Stuart Sierra. It's a small library that provides functions for walking generic Clojure data structures; its walk function traverses a single level without recursing. Let's first have a look at the walk function's documentation.

user> (doc clojure.walk/walk)
-------------------------
clojure.walk/walk
([inner outer form])
  Traverses form, an arbitrary data structure.  inner and outer are
  functions.  Applies inner to each element of form, building up a
  data structure of the same type, then applies outer to the result.
  Recognizes all Clojure data structures except sorted-map-by.
  Consumes seqs as with doall. 
nil

It isn't hard to take this function and use it in a recursive one to do the job. Our function to transform strings goes in the inner argument, which is also where we'll add the recursive call. We don't need the outer one, so we'll just use the identity function.

(use 'clojure.walk)

(defn recursive-string-walk [f form]
  (walk #(if (string? %) (f %) (recursive-string-walk f %))
    identity form))

It works well, but there's an issue with this way of doing things: the recursion consumes stack space, so a deep enough structure will overflow the stack. For the use I intend to make of this function, it isn't really a big problem, but I always like to take on this kind of challenge.
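As a quick check on mixed data, here's the function applied to a vector holding a map and a list. The sample data is made up, and the snippet uses require with aliases (a current-Clojure habit) rather than use. For comparison it also shows clojure.walk's own postwalk, which visits every node and so gives the same result; postwalk is recursive too, so it doesn't lift the stack-depth limit.

```clojure
(require '[clojure.walk :as walk]
         '[clojure.string :as str])

(defn recursive-string-walk [f form]
  (walk/walk #(if (string? %) (f %) (recursive-string-walk f %))
             identity
             form))

;; Made-up sample: strings nested inside a vector, a map and a list.
(def sample ["a" {:k "b"} '("c" 1)])

(recursive-string-walk str/upper-case sample)
;; => ["A" {:k "B"} ("C" 1)]

;; The same transformation expressed with postwalk.
(walk/postwalk #(if (string? %) (str/upper-case %) %) sample)
;; => ["A" {:k "B"} ("C" 1)]
```

Note that string keys in maps would be transformed as well, since both versions visit keys and values alike.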

A Buggy Test

The first thing that came to my mind to test the above issue was to create a nested list of arbitrary depth. We can take the iterate function to create an infinite sequence of nested function calls to list, then take and last to fetch the pit we want to test with.

user> (def pit (iterate list "bottom!"))
#'user/pit
user> (take 6 pit)
("bottom!" ("bottom!") (("bottom!")) ((("bottom!"))) (((("bottom!")))) ((((("bottom!"))))))
user> (last (take 6 pit))
((((("bottom!")))))
user> (recursive-string-walk identity (last (take 6 pit)))
((((("bottom!")))))
user> (recursive-string-walk #(str "reached " %) (last (take 6 pit)))
((((("reached bottom!")))))
user> (recursive-string-walk #(str "reached " %) (last (take 1000 pit)))
; Evaluation aborted.
user> (recursive-string-walk #(str "reached " %) (last (take 100 pit)))
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((("reached bottom!")))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

Works great! Now let's create an informal test function. It takes a recursive walk function and a range of depths, then tests the function using pits, starting with the shallowest. This way we can find exactly at which point the overflow happens.

(defn test-walk [walker shallowest deepest]
  (doseq [depth (range shallowest deepest)]
    (println (walker #(str depth " reached " %)
               (last (take depth pit))))))

We'll use it to make sure it works and see which depth our recursive-string-walk can reach.

user> (test-walk recursive-string-walk 400 500)
...
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((("486 reached bottom!")))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
(...(; Evaluation aborted.

On my computer, this function cannot go deeper than 486 (I wonder how it would fare on a 486?), which isn't that much. It shouldn't be all that difficult to make a lazy version of that function. It boils down to looking for sequences and wrapping their recursive calls with lazy-seq.

(defn lazy-recursive-string-walk [f form]
  (walk #(cond
           (string? %) (f %)
           (seq? %)    (lazy-seq (lazy-recursive-string-walk f %))
           :else       (lazy-recursive-string-walk f %))
    identity form))

From my understanding of lazy data structures and the walk library, this version should be able to handle data structures of infinite depth given infinite resources. However, the test function didn't seem to agree with me, as it was crashing at a depth of 828. That's about the time I asked for help and stopped thinking about this problem for a while.

Lazy Zippers

A weekend passed by and in the meantime, some people from the group tried to help me, yet nobody found the real problem. That would have been miraculous given the actual source of the bug. Furthermore, I was now more convinced than before that there was no bug in the lazy version. Tom Hicks provided a lazy version of walk, which behaved in the same way. Laurent Petit also suggested that I try to write a version using zippers.

It was an intriguing suggestion: I had never used zippers but had often heard about them. Additionally, that very day, MarkCC from Good Math, Bad Math had just posted an article about them. So I learned what they are and how to use them. I must say this is a very neat functional programming technique, really worth the time to learn. It's not that hard, and Mark has done a great job explaining it. I won't go into the details, but this version is quite easy to understand once you know that the next function performs a depth-first traversal automatically.

(require '[clojure.zip :as z])

(defn lazy-recursive-string-walk-with-zipper [f form]
  (loop [loc (z/seq-zip form)]
    (if (z/end? loc)
      (z/root loc)
      (recur (z/next
               (if (string? (z/node loc))
                 (z/replace loc (f (z/node loc)))
                 loc))))))

To my surprise, the test function still overflowed the stack using this version. And yet this function simply could not be the one throwing the StackOverflowError, as it executes in a loop. Maybe the error wasn't coming from the lazy walk function after all.
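To make the depth-first order of next concrete, here's a minimal sketch that lists the nodes a seq-zip visits. nodes-in-order is a name made up for this illustration; the clojure.zip calls are the library's own.

```clojure
(require '[clojure.zip :as z])

;; Illustrative helper: collects every node in the order z/next visits them.
(defn nodes-in-order [form]
  (->> (z/seq-zip form)
       (iterate z/next)
       (take-while (complement z/end?))
       (map z/node)))

(nodes-in-order '("a" ("b" "c")))
;; => (("a" ("b" "c")) "a" ("b" "c") "b" "c")
```

Each branch shows up before its children, which is why the walker above can test every node with string? as it goes. Keep in mind that seq-zip only descends into seqs, so strings inside vectors or maps would be treated as part of a leaf.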

So Where's The Bug?

I had been negligent about one thing: I didn't look closely enough at the stack traces. The one thrown by recursive-string-walk happens to be completely different from the one thrown by lazy-recursive-string-walk. Here are both of them, one after the other.

No message.
  [Thrown class java.lang.StackOverflowError]

Restarts:
 0: [ABORT] Return to SLIME's top level.

Backtrace:
  0: clojure.lang.RT.boundedLength(RT.java:1128)
  1: clojure.lang.RestFn.applyTo(RestFn.java:135)
  2: clojure.core$apply__4370.invoke(core.clj:436)
  3: clojure.walk$walk__7942.invoke(walk.clj:43)
  4: recursive_string_walk$recursive_string_walk__4137.invoke(NO_SOURCE_FILE:1)
  5: recursive_string_walk$recursive_string_walk__4137$fn__4139.invoke(NO_SOURCE_FILE:1)
---
No message.
  [Thrown class java.lang.StackOverflowError]

Restarts:
 0: [ABORT] Return to SLIME's top level.

Backtrace:
  0: clojure.lang.RT.assoc(RT.java:666)
  1: clojure.core$assoc__4259.invoke(core.clj:146)
  2: clojure.lang.AFn.applyToHelper(AFn.java:179)
  3: clojure.lang.RestFn.applyTo(RestFn.java:137)
  4: clojure.lang.Ref.alter(Ref.java:174)
  5: clojure.core$alter__4907.doInvoke(core.clj:1542)

The actual source of the exception is the print function, which is recursive on nested sequences. This makes sense: why would someone want to print such monsters? Because of that inattention I lost time searching for an imaginary bug. Only a small modification is needed to make our test function work.

(defn get-bottom [p]
  (loop [p (first p)]
    (if (seq? p) (recur (first p)) p)))

(defn test-walk [walker shallowest deepest]
  (doseq [depth (range shallowest deepest)]
    (println (get-bottom
               (walker #(str depth " reached " %)
                 (lazy-seq (last (take depth pit))))))))

We simply search for the bottom item and print it, which also makes the test output clearer. We can now see what our walk functions can handle.

user> (test-walk lazy-recursive-string-walk 100000 100001)
...
; Evaluation aborted.
---
Java heap space
  [Thrown class java.lang.OutOfMemoryError]

Well, the heap too has its limits! We would need to increase the JVM heap size to be able to run the previous example. Let's reduce our ambitions and do some benchmarking instead.

user> (time (test-walk lazy-recursive-string-walk 10000 10001))
10000 reached bottom!
"Elapsed time: 25.138697 msecs"
nil
user> (time (test-walk lazy-recursive-string-walk-with-zipper 10000 10001))
10000 reached bottom!
"Elapsed time: 122.691972 msecs"
nil

As you can see, the version using clojure.walk is much faster. That's perfectly understandable considering that zippers are a more complex concept, with multiple uses besides walking trees. As both versions are fully lazy, the first one is the clear winner here.

What Can We Use This For

After all these trials and tribulations, you may wonder what use I may have for such a function. It's to help create functions or macros that perform string replacement on all strings in a Clojure form. This can be useful with Compojure when deploying your application to a root path other than "/", for example. The idea comes from my experience with ASP.NET, in which URLs can contain a character (~) to denote the web site root path.

(use 'clojure.contrib.str-utils)

(defn expand* [expansions string]
  (reduce (fn [s [regex replacement]]
            (re-gsub regex replacement s))
    string (partition 2 expansions)))

(defn expand-form [url form]
  (recursive-string-walk (partial expand* [#"^~" url]) form))
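For readers on a current Clojure, where clojure.contrib.str-utils is no longer around, here's a sketch of the same idea. It swaps re-gsub for clojure.string/replace and, to stay self-contained, uses clojure.walk/postwalk in place of recursive-string-walk; both substitutions are mine, not part of the original code.

```clojure
(require '[clojure.string :as str]
         '[clojure.walk :as walk])

;; Applies each regex/replacement pair of the flat expansions vector in turn.
(defn expand* [expansions s]
  (reduce (fn [acc [regex replacement]]
            (str/replace acc regex replacement))
          s
          (partition 2 expansions)))

;; Rewrites a leading "~" in every string of the form with the given url.
(defn expand-form [url form]
  (walk/postwalk #(if (string? %) (expand* [#"^~" url] %) %)
                 form))

(expand-form "/foo" [:a {:href "~/bar"} "see a~b"])
;; => [:a {:href "/foo/bar"} "see a~b"]
```

Only a tilde at the very start of a string is expanded, which is what the ^ anchor in the pattern is for. A replacement string containing $ or \ would need escaping, since clojure.string/replace treats those specially.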

Having the same functionality in a macro would be nice too.

(defmacro with-expansions [expansions & body]
  `(do ~@(recursive-string-walk (partial expand* expansions) body)))

(defmacro expand [url & body]
  `(with-expansions [#"^~" ~(str url)] ~@body))

To put this to use I've taken a function written by James Reeves a few months ago during a discussion on the Compojure group. A function with the same name (with-context) already exists in Compojure though, so we'll rename it to with-root-path. It works by binding the given context to a var; we can then use a function called url to prepend the context so that internal links still work once the application is deployed elsewhere. Using the expand-form function on html's arguments can reduce repetition and make the code less cluttered if there are a lot of such URLs.

(use 'compojure)

(declare *context*)

(defn with-root-path
  [ctx & route-seq]
  (let [handler (apply routes route-seq)
        pattern (re-pattern (str "^" ctx "(/.*)?"))]
    (fn [request]
      (if-let [[_ uri] (re-matches pattern (:uri request))]
        (binding [*context* ctx]
          (handler (assoc request :uri uri)))))))

(defn url [path]
  (str *context* path))

(defn html-expand [& trees]
  (apply html (expand-form *context* trees)))

(defroutes test-routes
  (with-root-path "/foo"
    (GET "/"    (html-expand [:h1 "test"] [:p "~/"]))
    (GET "/bar" (html [:h1 "test"] [:p (url "/bar")]))
    (ANY "*" (page-not-found))))

(defn start-test-server []
  (run-server {:port 8000} "/*" (servlet test-routes)))

The first route uses a special version of the html function. It's only a slight improvement, but I really like that way of doing things. There are plenty of other possible use cases, but that should be enough for this post.

P.S.: I know the title isn't correct, but it sounds way better than A Recursive Walk on Strings Contained in Arbitrary Nested Data Structures.

2010-01-12

Documenting Clojure Code

Clojure is a truly amazing language and still a very young one. There is already a substantial quantity of documentation available for Clojure and some of its libraries. Yet, when talking about documentation, some is not enough. Sure, there's the REPL with all its bells and whistles, but sometimes I prefer to navigate through a well-formatted API document using a browser. There are already some solutions for this: one on which I worked for the defunct Compojure website (which was based on similar code for Clojure) and another, more serious attempt in clojure-contrib, gen-html-docs by Craig Andera. That library only outputs HTML, and therein lies a small (but annoying) problem: not all projects have their documentation formatted in HTML. For example, the new Compojure website runs on DokuWiki, which has its own syntax, and the Gitorious integrated Wiki, used for ClojureQL, uses Markdown. If only every Wiki engine understood Creole!

clj-doc

To be able to solve the above problem, I decided to start a new project called clj-doc. For now it's pretty much just a weekend coding session, but I'll continue to work on it so that one day it will be the best tool available to generate documentation for Clojure code. The basic usage is done through the gen-doc macro.

user> (use 'clj-doc)
nil
user> (gen-doc clj-doc.utils)
("<!DOCTYPE html ...Documentation for clj-doc.utils...")

This gives us a sequence of strings, each being the rendered output of one of the given namespaces in the default markup language. Namespaces can also be grouped using nested sequences; for now clj-doc supports only one level of nesting. Another notable feature is that namespace groups can be specified using regular expressions. Here's an example demonstrating all these methods at once.

user> (use 'clojure.contrib.pprint)
nil
user> (pprint (gen-doc clj-doc
                       (clj-doc.markups clj-doc.generator clj-doc.utils)
                       #"clj-doc\.markups\."))
("<!DOCTYPE html ...Documentation for clj-doc..."
 "<!DOCTYPE html ...Documentation for clj-doc.markups, clj-doc.generator, clj-doc.utils..."
 "<!DOCTYPE html ...Documentation for clj-doc.markups.creole, clj-doc.markups.dokuwiki, ...")

Writing to Files

Usually you would want to write the documentation for your library to a file. I've added basic support for this with the gen-doc-to-file macro. It behaves exactly as gen-doc but writes each of the resulting documentation strings to a file.

user> (gen-doc-to-file "~/test.html" clj-doc)
nil

If you use more than one group of namespaces, this macro will write them into multiple files, numbered starting with zero. The number is inserted before the last dot if there's one, and appended at the end otherwise. I intend to make this argument a format string in which you can specify more complex filenames in a future version. So, if you have any ideas about how to design or implement such a feature, I encourage you to add a comment to this post.
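The numbering rule can be sketched as a small function. numbered-filename is a name made up for this illustration, not clj-doc's actual code, and it naively looks at the last dot of the whole path, so a dot in a directory name would fool it.

```clojure
;; Hypothetical sketch of the numbering rule described above.
(defn numbered-filename [filename n]
  (let [dot (.lastIndexOf ^String filename ".")]
    (if (neg? dot)
      ;; No dot at all: append the number at the end.
      (str filename n)
      ;; Otherwise insert the number just before the last dot.
      (str (subs filename 0 dot) n (subs filename dot)))))

(numbered-filename "~/test.html" 0) ; => "~/test0.html"
(numbered-filename "notes" 1)       ; => "notes1"
```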

Supported Markups

Up until now, we've only been using the default output markup. The main point of this library being to generate something other than HTML, we'd better have a way to use another markup language. Before talking about how to do that, let's look under the hood.

(ns clj-doc.markups.creole
  "Creole markup."
  (:use clj-doc.markups))

(defmarkup
  #^{:doc "Creole markup."}
  creole
  :title        #(str "\n= "     %     " =")
  :namespace    #(str "\n== "    %    " ==")
  :section      #(str "\n=== "   %1  " ===\n" %2)
  :var-name     #(str "\n==== "  %  " ====")
  :var-arglist  #(str "\n**" % "**\\\\")
  :var-doc      #(str "\n\n" %))

This is the current implementation of the Creole Wiki markup generator. All markups are defined in the same way; they're basically just struct maps. For more details, see the clj-doc.markups namespace. The library currently provides four output methods. We've already seen two of them; here's the complete list:

  • creole
  • dokuwiki
  • html-simple
  • markdown

To define your own markup, you can take one of the implementations, modify it and see how it works out; the keywords should be explicit enough. The library only looks for markups in namespaces starting with "clj-doc.markups." though. That's just a temporary limitation; eventually you'll be able to pass a markup struct map as an option.
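To show how a markup made of formatting functions might be applied, here's a sketch that renders a single var using three of the Creole keys. render-var, creole-sketch and inc1 are made up for this example, and a plain map stands in for the struct map; the real generator lives in clj-doc and may well work differently.

```clojure
;; A plain map standing in for a markup, reusing three Creole formatters.
(def creole-sketch
  {:var-name    #(str "\n==== " % " ====")
   :var-arglist #(str "\n**" % "**\\\\")
   :var-doc     #(str "\n\n" %)})

;; Hypothetical renderer: formats a var's name, arglists and docstring.
(defn render-var [markup v]
  (let [{:keys [name arglists doc]} (meta v)]
    (str ((:var-name markup) name)
         (apply str (map (:var-arglist markup) arglists))
         ((:var-doc markup) doc))))

;; A throwaway var with known metadata to render.
(defn inc1 "Adds one." [x] (+ x 1))

(println (render-var creole-sketch #'inc1))
```

Because each key maps to an ordinary function, switching to another markup only means passing a different map.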

The Option Map

The option map is an optional argument that must be specified before any namespaces. It presently supports only two keywords, :markup and :separated-by. The first lets you choose the output method and the second changes the default page separation behavior.

user> (gen-doc {:markup creole} clj-doc.markups.creole)
("\n= Documentation for clj-doc.markups.creole =\n== clj-doc.markups.creole namespace ==\n=== others ===\n\n==== creole ====\n\nCreole markup.")

The :separated-by option can only take one value, namespace, which makes the macros separate pages between namespaces instead of groups of them.

user> (pprint (gen-doc {:markup markdown :separated-by namespace} #"clj-doc\.markups\."))
("\n#Documentation for clj-doc.markups.creole..."
 "\n#Documentation for clj-doc.markups.dokuwiki..."
 "\n#Documentation for clj-doc.markups.html-simple..."
 "\n#Documentation for clj-doc.markups.markdown...")
nil
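Handling an optional leading option map is a common pattern, and a sketch of it is short. split-options is a hypothetical name, not the function clj-doc uses; it works on the unevaluated argument list a macro such as gen-doc would receive.

```clojure
;; Hypothetical helper: separates an optional leading option map from the
;; remaining (namespace) arguments.
(defn split-options [args]
  (if (map? (first args))
    [(first args) (rest args)]
    [{} args]))

(split-options '({:markup creole} clj-doc.utils))
;; => [{:markup creole} (clj-doc.utils)]
(split-options '(clj-doc.utils))
;; => [{} (clj-doc.utils)]
```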

The Future

This is a very early release done the worse-is-better way, so it's still a pretty bare-bones tool. I'll be taking a pause this week to work on more (hopefully) profitable projects, so in the meantime I'll be pleased to hear any suggestions to improve the way clj-doc works.


2009-12-31

Testing ClojureQL

After working on ClojureQL for some time, I've become tired of running manual tests all the time, even with the debug macro. So I began working on a test framework for this project. It's a combination of the test-is library and the concept of demos that's already built in. For now we only have the demos, but they aren't very comprehensive and require a working connection for each backend you want to test. Furthermore, there's practically no validation of the results; we only know that the database accepts the expression that was sent. Now that we're approaching the release of version 1.0, we need something more flexible and exhaustive to keep the code stable.

The Plan

The idea of demos is not bad in itself, so I'll just complement it with tests that can be run without a connection. That's all we need for day-to-day development on ClojureQL; the actual execution of the generated SQL code only needs to be done once in a while. So as not to break the DRY principle, I'll make them both use the same code: demos will be written using a special macro and tests will be generated automatically from them. The demos should also be improved by validating the results coming out of the tested databases.

The Prototype

Implementing this scheme directly in ClojureQL is not a simple task though, so I'll make a prototype to show how it works at a smaller scale. We'll use a very simple database written in Clojure and a ClojureQL-like DSL that compiles to pseudo-SQL statements. No backends, no intermediate forms and only two kinds of statements: inserts and selects. We'll walk through this prototype below; it's barely a hundred lines long.

The Database

All data is kept in a ref, *db*, which is a map of maps with strings as the sole data type for keys as well as for values. The only function we really need here is exec, which takes a compiled statement and returns a result if no exception is raised. I've also included a simple helper function, reset, to empty the database.

;; exec throws SQLException on unrecognized statements.
(import 'java.sql.SQLException)

(def *db* (ref {}))

(defn reset [] (dosync (ref-set *db* {})))

(defn exec-insert [stmt]
  (let [tokens (drop 2 (seq (.split stmt " ")))
        table  (first tokens)
        values (apply hash-map (rest (rest tokens)))]
    (dosync (alter *db* assoc table
              (merge (@*db* table) values)))))

(defn exec-select [stmt]
  (let [tokens (drop 1 (seq (.split stmt " ")))
        table  (last tokens)
        keys   (take (- (count tokens) 2) tokens)]
    (map #((@*db* table) %) keys)))

(defn exec [stmt]
  (condp #(.startsWith %2 %1) stmt
    "INSERT INTO " (exec-insert stmt)
    "SELECT "      (exec-select stmt)
    (throw (SQLException. "error: unrecognized statement."))))

(defn insert [name & key-val-pairs]
  (apply str "INSERT INTO " name " VALUES " (interpose " " key-val-pairs)))

(defn select [name & keys]
  (str "SELECT " (apply str (interpose " " keys)) " FROM " name))

Our DSL is composed of the insert and select functions, which output the compiled pseudo-SQL code that the database can execute. It's a really dumb system, but we don't need more for the purpose of this experiment.
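The dispatch in exec relies on how condp calls its predicate: each clause is tested as (pred test-expr expr), so the clause string arrives first and the statement second, hence the swapped %2 and %1. Here's a stripped-down sketch of just that dispatch; statement-kind and its keyword results are made up for the illustration.

```clojure
;; condp evaluates (pred clause-value stmt) for each clause in order,
;; so %1 is the prefix string and %2 is the statement.
(defn statement-kind [stmt]
  (condp #(.startsWith %2 %1) stmt
    "INSERT INTO " :insert
    "SELECT "      :select
    :unknown))

(statement-kind "INSERT INTO test1 VALUES foo 1") ; => :insert
(statement-kind "SELECT foo FROM test1")          ; => :select
(statement-kind "DROP TABLE test1")               ; => :unknown
```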

The Demos

The demos will remain the heart of the ClojureQL testing framework. The main objectives of the following code are to make it easy to define demos and to make their output human-readable. The trick is to also make them reusable, so demos will be defined as data, not code.

(defmacro defdemo [demo & stmts-results]
  `(def ~demo
     ~(vec (map #(vector (list 'quote (first %))
                         (list 'quote (second %)))
             (doall (partition 2 stmts-results))))))

(defdemo demo1
  (insert 'test1 'foo 1) {"test1" {"foo" "1"}}
  (select 'test1 'foo)   ("1"))

(defdemo demo2
  (insert 'test2 'foo 1 'bar 2) {"test2" {"bar" "2", "foo" "1"}}
  (select 'test2 'foo 'bar)     ("1" "2"))

(defn run-demo [name]
  (reset)
  (println "\n===" name "===\n")
  (doall
    (map (fn [[stmt expected-result]]
           (println "statement: " stmt)
           (let [compiled-stmt (eval stmt)
                 result        (exec compiled-stmt)]
             (println "compiled:  " compiled-stmt)
             (println "result:    " result "\n")
             (is (= result expected-result)))) (eval (symbol name)))))

(def demos ["demo1" "demo2"])

(defn run-demos []
  (every? identity
    (apply concat (map run-demo demos))))

The run-demo function tests every statement in a demo and provides a nice output. It returns the results of the assertions made along the way, one boolean per statement; run-demos then reduces them to a single boolean.
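The "demos as data" idea can be tried in isolation with a sketch like the following, where plain arithmetic forms stand in for DSL statements. arithmetic-demo and check-demo are names invented for this example and are not part of the prototype.

```clojure
;; A demo is just a vector of [quoted-statement expected-result] pairs.
(def arithmetic-demo
  [['(+ 1 2)       3]
   ['(str "a" "b") "ab"]])

;; Evaluates each quoted statement and records whether it met expectations.
(defn check-demo [demo]
  (for [[stmt expected] demo]
    (let [result (eval stmt)]
      {:statement stmt
       :result    result
       :passed?   (= expected result)})))

(every? :passed? (check-demo arithmetic-demo))
;; => true
```

Because the statements stay quoted until eval is called, the same pairs can be printed, executed, or turned into generated tests.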

The Tests

We can now use the above to write some tests. The demos assure us that the compiled statements are all valid. After that, it's simply a matter of generating Clojure code that defines tests and writing it to a file. The gen-test function takes care of the first part and write-tests of the second.

(defn gen-test [name]
  (let [demo (eval (symbol name))]
    `(deftest ~(symbol (.replace name "demo" "test"))
       ~@(map (fn [[stmt _]]
                `(is (= ~stmt ~(eval stmt))))
              demo))))

(defn write-tests [filename]
  (when (run-demos)
    (with-open [writer (java.io.FileWriter. filename)]
      (binding [*out* writer]
        (newline)
        (doseq [demo demos]
          (pprint (gen-test demo))
          (newline))))))

(write-tests "tests.clj")

(load-file "tests.clj")

Once the tests file is written, we can load it to define the tests it contains. So now we can use run-demos to test everything and run-tests to test compilation only.
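
The core idea, generating test definitions as data, can be sketched in isolation. In the snippet below, gen-check is a hypothetical, simplified cousin of gen-test: it receives the statement and its compiled result directly instead of evaluating a demo, and the clojure.test names are fully qualified so the generated form doesn't depend on the current namespace.

```clojure
(require 'clojure.test)

;; gen-check is a hypothetical helper, assumed here for illustration only.
(defn gen-check [test-name stmt compiled]
  `(clojure.test/deftest ~test-name
     (clojure.test/is (= ~stmt ~compiled))))

(def generated
  (gen-check 'test1 '(select 'test1 'foo) "SELECT foo FROM test1"))

;; The result is an ordinary list, so it can be printed to a file
;; and loaded again later.
(first generated)   ;=> clojure.test/deftest
(second generated)  ;=> test1
```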

Conclusion

There are still many details to figure out before implementing this framework, for example the namespace layout to use, or how to integrate the test-is library depending on which Clojure version we end up targeting. This is a work in progress, so I'll keep this post updated until it's done. In the meantime, any comments or suggestions are welcome.

Happy New Year!

2009-12-21

Writing a Help Macro.

Emacs may seem like a pretty bare-bones tool to the uninitiated, but it packs a lot under the hood. Especially when you're coding in a Lisp, Emacs really shines thanks to its renowned SLIME mode. Yet most of the power of using a Lisp-like language with any editor lies in the use of the REPL. Clojure has a great set of functions and macros to help you code at the REPL, but these are dispersed across many libraries in clojure-contrib. As there are lots of such helpers, you have to remember a bunch of names and refer to the documentation often before getting used to all of them. Having this functionality packed into a single command could be very useful for beginners as well as for practitioners. We'll first go through the code and review it, then we'll show how to use the help macro.

Help Macro

We'll proceed in a top-down manner, starting with the namespace definition. We'll need some Java classes from the clojure.lang package, along with aliases for the clojure-contrib libraries we'll be using.

(ns help-macro
  "The help macro regroups clojure-contrib most useful help functions and
  macros into a single call."
  (:import [clojure.lang IFn PersistentList Symbol])
  (:require
    [clojure.contrib.classpath  :as classpath]
    [clojure.contrib.repl-utils :as repl-utils]
    [clojure.contrib.str-utils  :as str-utils]
    [clojure.contrib.ns-utils   :as ns-utils]))

Then comes the definition of *help-usage*, a simple var containing a string explaining how to use the help macro. That could have been included in the docstring, but I chose to emulate the way command line scripts work instead.

(def *help-usage*
  #^{:doc "Help macro usage text."}
  (str
    "Usage: (help pwd)\n"
    "       (help classpath)\n"
    "       (help dir <ns>)\n"
    "       (help docs <ns>)\n"
    "       (help vars <ns>)\n"
    "       (help source <symbol>)\n"
    "       (help <class> [<n>])\n"
    "       (help <expr>)\n"
    "       (help <string>)\n"
    "       (help ? <query-type>)"))

As you can see, we'll support nine kinds of help queries: six use a command symbol, while the other three are dispatched on the class of their first argument. Let's review each type of query:

  • pwd - Returns the current working directory.
  • classpath - Prints the current classpath.
  • dir - Prints a sorted directory of public vars in a namespace.
  • docs - Prints documentation for the public vars in a namespace.
  • vars - Returns a sorted seq of symbols naming public vars in a namespace.
  • source - Returns a string of the source code for the given symbol, if it can find it.
  • <class> - Get help on class members, like clojure.contrib.repl-utils/show.
  • <expr> - Get help on the result of an expression, like clojure.contrib.repl-utils/expression-info.
  • <string> - Like find-doc, but easier to type.

The help usage message is nice, but it would be great to have a more specific one for each type of query, like the list above. To do this, we'll create a help-usage function that prints a different message depending on the symbol given as its first argument.

(defn help-usage
  "Returns documentation on help macro usage."
  [query-type & args]
  (condp = query-type
    'class     (doc clojure.contrib.repl-utils/show)
    'expr      (doc clojure.contrib.repl-utils/expression-info)
    'string    (doc find-doc)
    (println
      (condp = query-type
        'pwd       "Returns the current working directory."
        'classpath "Prints the current classpath."
        'dir       "Prints a sorted directory of public vars in a namespace."
        'docs      "Prints documentation for the public vars in a namespace."
        'vars      "Returns a sorted seq of symbols naming public vars in a namespace."
        'source    "Returns a string of the source code for the given symbol, if it can find it."
        "This type of help query is not recognized."))))

The symbols fall into two categories. For the simpler queries, we only display a description of what they do. For the more complex ones, we show the documentation of the original function or macro using the doc macro. The function accepts additional arguments only to prevent exceptions in case of unintentional input.

We'll now look at the generic-help multimethod, which will dispatch calls on the class of its first argument. We'll only respond to three classes: Class, Clojure's PersistentList and String.

(defmulti generic-help
  "Makes the help macro generic on its first argument if no command found."
  {:arglists '([query args])}
  (fn [query _] (class query)))

(defmethod generic-help Class
  [query args]
  (apply repl-utils/show (cons query args)))

(defmethod generic-help PersistentList
  [query args]
  (repl-utils/expression-info (second query)))

(defmethod generic-help String
  [query args]
  (find-doc query))

One thing to note here is why we take the second element of the list in the PersistentList method. The help macro quotes every argument it receives, but an expression query must already be quoted by the caller, so the method receives a list of the form (quote <expr>) and the expression itself is its second element.
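
The double quoting is easier to see with a tiny experiment. In this sketch, quoted is a hypothetical macro (not part of the help code) that quotes its argument using the same '~ pattern the help macro relies on.

```clojure
;; quoted is a hypothetical macro, assumed here for illustration only.
(defmacro quoted [x]
  `'~x)

(quoted (int 1))            ;=> (int 1)
(quoted '(int 1))           ;=> (quote (int 1))
(second (quoted '(int 1)))  ;=> (int 1)
```

When the caller quotes the expression, the macro sees the list (quote (int 1)), which is why the expression has to be pulled out with second.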

We'll also add a default dispatch method which displays a warning message followed by the help macro usage.

(defmethod generic-help :default
  [query _]
  (println "No help available for object of type " (class query))
  ;; help-usage requires a query type, so print the general usage text here.
  (println *help-usage*))
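
For readers new to multimethods, here is a stripped-down sketch of the same dispatch-on-class pattern that runs without clojure-contrib. The describe multimethod and its messages are hypothetical and only mirror the shape of generic-help.

```clojure
;; describe is a hypothetical multimethod, assumed here for illustration only.
(defmulti describe
  (fn [query _] (class query)))

(defmethod describe String
  [query _]
  (str "string: " query))

(defmethod describe Class
  [query _]
  (str "class: " (.getName query)))

;; Used whenever no method matches the dispatch value.
(defmethod describe :default
  [query _]
  (str "no help for " (class query)))

(describe "help" nil)  ;=> "string: help"
(describe String nil)  ;=> "class: java.lang.String"
(describe :oops nil)   ;=> "no help for class clojure.lang.Keyword"
```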

All simple commands are contained in the *help-commands* map, which is used by the help* function that follows it.

(def *help-commands*
  #^{:doc "This is a map containing all the commands for the help macro."}
  { 'pwd       #(.getCanonicalPath (java.io.File. "."))
    'classpath #(println (str-utils/str-join "\n" (classpath/classpath)))
    'dir       ns-utils/print-dir
    'docs      ns-utils/print-docs
    'vars      ns-utils/ns-vars
    'source    (comp println repl-utils/get-source)})

(defn help*
  "Driver for the help macro."
  [query args]
  (if-let [sc (get *help-commands* query)]
    (apply sc args)
    (generic-help (if (symbol? query)
                    (resolve query)
                    query) args)))

This function looks into the *help-commands* map to see if it contains the given query symbol. If it does, it calls the associated function; otherwise it calls generic-help. Before calling the multimethod, we first try to resolve the query in case it's a symbol, because help* receives only the symbol naming a class when help is called with a class name.
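
The lookup-then-fallback logic can be tried without the contrib libraries. In this sketch, commands and run-command are hypothetical stand-ins for *help-commands* and help*, and the fallback simply returns what would be handed to the multimethod.

```clojure
;; commands and run-command are hypothetical names, assumed for illustration.
(def commands
  {'pwd #(.getCanonicalPath (java.io.File. "."))
   'sum +})

(defn run-command [query args]
  (if-let [command (get commands query)]
    (apply command args)
    ;; No command matched: resolve symbols, pass anything else through.
    (if (symbol? query)
      (resolve query)
      query)))

(run-command 'sum [1 2 3])  ;=> 6
(run-command 'String [])    ;=> java.lang.String
(run-command "text" [])     ;=> "text"
```

Resolving the symbol String yields the class itself, which is what lets a class-based dispatch pick the right method afterwards.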

Finally, here's the help macro in all its glory.

(defmacro help
  "Get help for various kind of expressions, use without arguments for
  detailed usage."
  ([] `(println *help-usage*))
  ([query & args]
    (let [quoted-args (map #(list 'quote %) args)]
      (if (= '? query)
        `(help-usage ~@quoted-args)
        `(help* '~query (list ~@quoted-args))))))

Usage

Here are some examples of using the help macro:

user> (help pwd)
"C:\\"
user> (help classpath)
g:\libraries\java\swank-clojure-1.0-SNAPSHOT.jar
g:\libraries\java\servlet-api-2.5-20081211.jar
...
nil
user> (help dir help-macro)
*help-commands*
*help-usage*
generic-help
help
help*
help-usage
nil
user> (help docs help-macro)
-------------------------
help-macro/*help-commands*
nil
  nil
...
nil
user> (help vars help-macro)
(*help-commands* *help-usage* generic-help help help* help-usage)
user> (help source +)
(defn +
  "Returns the sum of nums. (+) returns 0."
  {:inline (fn [x y] `(. clojure.lang.Numbers (add ~x ~y)))
   :inline-arities #{2}}
  ([] 0)
  ([x] (cast Number x))
  ([x y] (. clojure.lang.Numbers (add x y)))
  ([x y & more]
   (reduce + (+ x y) more)))
nil
user> (help String)
===  public final java.lang.String  ===
[ 0] static CASE_INSENSITIVE_ORDER : Comparator
[ 1] static copyValueOf : String (char[])
...
nil
user> (help String 1)
#<Method public static java.lang.String java.lang.String.copyValueOf(char[])>
user> (help '1)
{:class java.lang.Integer, :primitive? false}
user> (help '(int 1))
{:class int, :primitive? true}
user> (help "help")
-------------------------
clojure.main/help-opt
([_ _])
  Print help text for main
-------------------------
clojure.main/main
([& args])
...

Installing

You can find the complete code here. To use it, place that file on your classpath and create a user script to be called when starting a REPL. From there, you simply have to add the help-macro namespace with the use function if you want to be able to just type "(help...". To have it around in every namespace, you could add the following function to the same file as the macro and use it instead of in-ns.

(defn to-ns [n]
  (in-ns n)
  (use 'help-macro))

Any comments, suggestions, improvements, insults?

2009-12-14

Debugging ClojureQL

Last week, I embarked on a new endeavor: contributing to ClojureQL. My future projects involving Clojure will need to access several databases, some of which don't behave in the same way. Sure, there is already clojure.contrib.sql, which gives you a wrapper around JDBC, but there are two problems with this approach. First, JDBC is quite good at what it does, yet it is not a very sophisticated tool. It gives you access to a portable subset of SQL that mainly supports querying and updating data. This leaves a lot of advanced features out of the deal; they are still available, but in a non-portable way. Conversely, if a database system doesn't support a common feature, JDBC cannot do anything about it, although the driver can. Second, the clojure.contrib.sql API has a very procedural feel to it; it doesn't even let you play with an intermediate form, as it executes statements directly. That's enough for basic database interactions, but not for serious database-agnostic development.

Let's set advocacy aside and talk about something concrete. While contributing to ClojureQL, I realized there were some issues with debugging. It was working well for Postgres (of which I had an instance running); I simply had to use the compile-sql method. A problem arises when you're testing changes that affect all backends. In that case, to verify the generated SQL, you need a server for each DBMS you want to test. After looking at the code for some time, I found an easy way of compiling statements without a connection: we can create mock objects using Clojure's proxy macro and use them instead of live connections. For now, it's really simple, as no backend implementation actually uses the connector. To put this idea into practice, I wrote a macro to debug multiple databases.

(defmacro debug [db ast]
  (let [connector (.getName (db *connectors*))]
    `(compile-sql ~ast (proxy [~(symbol connector)] []))))

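It uses a map containing the interfaces used by all backends, with keywords as keys. Since the macro body refers to this map, it has to be defined before the macro in the actual source file.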

(def *connectors* {
  :postgres org.postgresql.PGConnection
  :mysql    com.mysql.jdbc.Connection
  :derby    org.apache.derby.iapi.jdbc.EngineConnection
  :generic  Object})

With this code you can easily debug statements for every database ClojureQL supports. We can add a final touch to see the SQL output for all backends with a single command.

(defmacro debug-and-print-all [ast]
  (let [longest (reduce max (map (comp count str) (keys *connectors*)))
        debug-and-print (fn [db] `(println (format (str "%1$-" ~longest "s : %2$s")
                                           ~(subs (str db) 1)
                                           (debug ~db ~ast))))]
    `(do ~@(map debug-and-print (keys *connectors*)))))

Finally, here is example output of the debug-and-print-all macro at the REPL, using my ClojureQL clone.

clojureql-test> (debug-and-print-all (create-table test [id int title text date date] :non-nulls * :primary-key id :auto-inc id :unique title))
postgres  : CREATE TABLE test (id SERIAL,title text NOT NULL,date date NOT NULL,PRIMARY KEY ("id"),UNIQUE ("title"))
mysql     : CREATE TABLE test (id int NOT NULL  AUTO_INCREMENT ,title text NOT NULL ,date date NOT NULL ,PRIMARY KEY (`id`),UNIQUE (`title`)) 
derby     : CREATE TABLE test (id int NOT NULL  GENERATED ALWAYS AS IDENTITY ,title text NOT NULL ,date date NOT NULL ,PRIMARY KEY ("id"),UNIQUE ("title"))
generic   : CREATE TABLE test (id int NOT NULL ,title text NOT NULL ,date date NOT NULL ,PRIMARY KEY ("id"),UNIQUE ("title"))
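
The column alignment in that output comes from the format string built inside the macro. Here is a small standalone sketch of the same padding trick; backends, width and label-line are hypothetical names, and the SQL text is just a placeholder.

```clojure
;; backends, width and label-line are hypothetical names for illustration.
(def backends [:postgres :mysql :derby :generic])

;; str keeps the leading colon of a keyword, so the width is 9, not 8.
(def width (reduce max (map (comp count str) backends)))

(defn label-line [backend text]
  ;; "%1$-9s" left-justifies the first argument in a 9-character column.
  (format (str "%1$-" width "s : %2$s") (name backend) text))

(label-line :postgres "SELECT 1")  ;=> "postgres  : SELECT 1"
(label-line :mysql "SELECT 1")     ;=> "mysql     : SELECT 1"
```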

This code doesn't work properly on the main repository for the moment, as there are some issues with Derby and the generic backend. That's enough for this post; I'll go back to hacking my way through the ClojureQL code to find other useful tricks and speed up the version 1.0 release.
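
For readers who want to try the mock-connection idea on its own, here is a minimal sketch. It uses java.sql.Connection from the JDK as a stand-in for the vendor-specific interfaces listed earlier, which is an assumption made so the snippet runs without any driver on the classpath.

```clojure
;; mock-conn implements the interface but provides no method bodies.
;; java.sql.Connection is a stand-in for a vendor-specific interface.
(def mock-conn (proxy [java.sql.Connection] []))

;; It satisfies anything that only checks or dispatches on the type.
(instance? java.sql.Connection mock-conn)  ;=> true

;; Actually calling an unimplemented method fails, so this only works
;; as long as the code under test never touches the connection.
(try
  (.close #^java.sql.Connection mock-conn)
  :closed
  (catch UnsupportedOperationException _
    :unsupported))
;=> :unsupported
```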

2009-12-02

Overlooking the Obvious

On Monday, Tim Bray posted another article on Clojure in his Concur.next series. It includes some beautiful code (as idiomatic Clojure usually is) written by John Evans, which uses a function I wasn't familiar with.

user> (doc merge-with)
-------------------------
clojure.core/merge-with
([f & maps])
  Returns a map that consists of the rest of the maps conj-ed onto
  the first.  If a key occurs in more than one map, the mapping(s)
  from the latter (left-to-right) will be combined with the mapping in
  the result by calling (f val-in-result val-in-latter).

Although I get the general idea, I think it's preferable to run a few experiments after reading documentation, just to be sure.

user> (merge-with #(+ %1 %2) {:a 1} {:a 2 :b 3} {:a 4 :b 5 :c 6})
{:c 6, :b 8, :a 7}
user> (merge-with #(list %1 %2) {:a 1} {:a 2 :b 3} {:a 4 :b 5 :c 6})
{:c 6, :b (3 5), :a ((1 2) 4)}

It does pretty much what the docs say. It's always great to find out about a useful function you didn't know about! Let's see some real code using this function.

(defn- merge-map
  "Merges an inner map in 'from' into 'to'"
  [to key from]
  (merge-with merge to (select-keys from [key])))

This was extracted from the Compojure source code for response handling. It's used to merge the response's headers and session maps when the update-response multimethod is called on an object of the Map class. That gave me an idea: the principle of using merge as the combining function could be generalized into a function that recursively merges maps. Let's do it manually.

user> (merge {:a 1} {:a 2 :b 3} {:b 4 :c 5})
{:c 5, :b 4, :a 2}
user> (merge-with merge {:a {:a 1 :b 2}} {:a {:a 3 :c 4}} {:a {:b 5 :c 6}})
{:a {:c 6, :a 3, :b 5}}
user> (merge-with (partial merge-with merge) {:a {:a {:a 1} :b {:b 2}}} {:a {:a {:a 3} :c {:c 4}}} {:a {:b {:b 5} :c {:c 6}}})
{:a {:c {:c 6}, :a {:a 3}, :b {:b 5}}}

That's surely feasible, but I can't seem to find any use for such a function, so I'll leave it as an exercise. The complexity of the task certainly outweighs the benefits gained from its limited usefulness anyway. Does anybody disagree?
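
For the curious, here is one possible answer to the exercise, as a sketch rather than a definitive implementation. The name deep-merge is my own choice; it recurses with merge-with while every value is a map and otherwise keeps the last value, like merge does.

```clojure
;; deep-merge is a hypothetical name; this is only one way to do it.
(defn deep-merge [& maps]
  (if (every? map? maps)
    ;; All values are maps: merge them key by key, recursing on conflicts.
    (apply merge-with deep-merge maps)
    ;; Otherwise the rightmost value wins, as with plain merge.
    (last maps)))

(deep-merge {:a {:a {:a 1} :b {:b 2}}}
            {:a {:a {:a 3} :c {:c 4}}}
            {:a {:b {:b 5} :c {:c 6}}})
;=> {:a {:a {:a 3}, :b {:b 5}, :c {:c 6}}}

(deep-merge {:a {:b 1}} {:a 2})
;=> {:a 2}
```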

Update: Mister Bray posted yet another article on Clojure yesterday, and proggit burst into flames with relatively interesting (but heated) discussions on Clojure versus everything under the sun. It also provoked craziness!
