Freitag, 1. Juni 2012

Porting PAIP to clojure - Chapter 4: GPS - The General Problem Solver

Porting PAIP to clojure - Chapter 4: GPS - The General Problem Solver


Table of Contents

  • 1 PAIP to clojure - chapter 4
    • 1.1 Description
    • 1.2 Specification
    • 1.3 Implementation
    • 1.4 Testing
    • 1.5 Analysis
    • 1.6 Back to implementation - A more general version 2

1 PAIP to clojure - chapter 4

I continue to translate PAIP to clojure. As always, it is possible to follow the post even without reading PAIP, but I encourage you to do so. This chapter covers the GPS program, the General Problem Solver. As always, the focus lies on the iterative process of developing an (ai-) progam. Norvig presents 5 stages of an programming project, after which the sections of the chapter are numbered:
  1. Description
  2. Specification (in algorithmic terms)
  3. Implementation
  4. Testing
  5. Debugging/Analysis (optional repeat)

1.1 Description

GPS was the first program which separated its problem-solving strategy from its knowledge of particular problems. It used means-ends analysis as problem-solving strategy. In means-ends analysis, you know in what state you are right now and in what state you will be when you have solved your problem, but you consider what is between your state now and the goal-state and how this distance can be overcome. An Example: I have to write a blog-post. What Do I need to do to write an blog-post? I have to type it. What do I need to type it? I have to sit in front of my pc. What do I need to do to sit in front of my pc ? Go to the PC ….
It is obvious that we have to be able to answer "What do I need to … "or, like the book points out "What is the difference between…" from the set of all possible actions. And each actions could have itself preconditions, for which actions have to be taken befor it can be taken. This is a very brief description, I encourage you to read the section in PAIP for the full discussion.

1.2 Specification

After having a (rough) Idea of how the means-ends analysis works, it is time to come to the second step: specification. In this stage, we refine the description to something that is representible in the implementation language. I tink, the stage of specification is best described as follows: Find the most natural mapping between the problem and the implementation language!
So, let's see how means-ends analysis can be represented in clojure:
  • A state can be represented as a set of conditions, for example #{unknown poor} and #{rich famous}. We need 2 states: the current state and the goal state (try to guess which of the two above is the goal-state…?)
  • The allowable actions can naturally be represented as a collection of actions, which can be worke
  • An action needs a Name, a descriptions of what states a person has to be in in order to take the action and a description of what states of the person will change after taking the action. An action can therefore be described by a map (or later by a record) with a :action and a :precond field. The changes of the state of the acting person are trickier. Consider an action "Become rich" that has as precond being poor and changes that to being rich. It could also remove states or add new ones. The change of one state to the other can easily be represented in terms of removing the old one and adding the new one. Thus the Action will have two more fields, a :del-list and a :add-list which contains states which will be added or removed from the current state after taking the action.
  • Last, but not least, we have to specify the entry point of the application, the GPS-function. We need a start-state, a goal-state and the allowed actions. It can be represented as a function which takes a start-state and a goal states, and works with the ops var.
Now we have specified the single entities of the program. The algorithm is next to go: GPS succeeds, if it can achieve all single conditions in the goal state by applying a sequence of actions. To decide, wheter to apply an action or not, we have to look in its add list. If a goal-condition is in the add-list, the action is appropriate and GPS wants to take the action. It can only take the action, if it can achieve all preconds of the action, which is just calling itself with the preconds as goal-state, updating the current state as it goes along. If it was able to achieve every condition of the goal state, it succeedes (Note that this is the first, most simplest specification, it will be refined in part 2). Ok, this is enough for the first implementation of the program!

1.3 Implementation

First, let's have a look at a sample operations list. I find it easier to implement the program, if I see with what data it works with:
(ns PAIP2clojure.Chapter4)

(def school-ops
 [ {:action "drive-son-to-school"
    :preconds #{ "son-at-home" "car-works"}
    :add-list #{ "son-at-school"}
    :del-list #{ "son-at-home"}}
   {:action "shop-installs-battery"
    :preconds #{"car-needs-battery" "shop-knows-problem" "shop-has-money"}
    :add-list #{"car-works"}}
   {:action "tell-shop-problem"
    :preconds #{"in-communication-with-shop"}
    :add-list #{"shop-knows-problem"}}
   {:action "telephone-shop"
    :preconds #{"know-phone-number"}
    :add-list #{"in-communication-with-shop"}}
   {:action "look-up-number"
    :preconds #{"have-phone-book"}
    :add-list #{"know-phone-number"}}
   {:action "give-shop-money"
    :preconds #{"have-money"}
    :add-list #{"shop-has-money"}
    :del-list #{"have-money"}}])
(def ^:dynamic *ops* school-ops)
=> #'PAIP2clojure.Chapter4/*ops*
In the implementation, I choose not follow Norvigs route and modify the current state in the function, which would require something like an atom or agent in clojure. Instead I changed the implementation to be side-effect free and hence more ideomatic clojure. Here is the implementation from the book:
(defvar *state* nil "The current state: a list of conditions.")

(defvar *ops* nil "A list of available operators.")

(defstruct op "An operation"
  (action nil) (preconds nil) (add-list nil) (del-list nil))

(defun GPS (*state* goals *ops*)
  "General Problem Solver: achieve all goals using *ops*."
  (if (every #'achieve goals) 'solved))

(defun achieve (goal)
  "A goal is achieved if it already holds,
  or if there is an appropriate op for it that is applicable."
  (or (member goal *state*)
      (some #'apply-op 
            (find-all goal *ops* :test #'appropriate-p))))

(defun appropriate-p (goal op)
  "An op is appropriate to a goal if it is in its add list."
  (member goal (op-add-list op)))

(defun apply-op (op)
  "Print a message and update *state* if op is applicable."
  (when (every #'achieve (op-preconds op))
    (print (list 'executing (op-action op)))
    (setf *state* (set-difference *state* (op-del-list op)))
    (setf *state* (union *state* (op-add-list op)))
    t))
I had to change the return value of achieve and apply-op to an updated current-state, instead of returning a truth value and destructively modifying the current state as side effect. Also, I couldn't use every? because the state has to be updated after every application of achieve. Thus I made a helper function every-accum? which takes a 2-argument function, a start-state and a collection and reduces the collection applying the function. When one result is nil, the function returns nil.
Notice the use of the with-auto-declare macro, which I described in a previous post. This let me avoid   to add a declare, everytime I make a forward declaration. If a symbol starts with _, a forward-declaration will be automagically added for it.
 (use 'clojure.set)
 (use 'auto-declare.core)

(with-auto-declare _  
   (defn GPS [state goals]
     (let [new-state (_every-accum? _achieve state goals)]
       (if (nil? new-state)
         'not-solved
         'solved)))

   (defn achieve
     "return the new-state after the goal is achieved or nil
      if it could not be archieved"
     [current-state goal]
     (if (contains? current-state goal)
       current-state
        (some (partial _apply-op current-state)
          (filter #(_appropriate? goal %) *ops*))))

   (defn appropriate? [goal op]
     (contains? (:add-list op) goal))

   (defn every-accum? [func start coll]
     (reduce #(if (nil? %1)
                nil
                (func %1 %2)) start coll))

   (defn apply-op [current-state op]
     (let [new-current-state (every-accum? achieve current-state (:preconds op))]
       (if (nil? new-current-state)
         nil
         (do (println (str "executing " (:action op)))
             (-> new-current-state (difference (:del-list op)) (union (:add-list op))))))))
=> #'PAIP2clojure.Chapter4/apply-op
Ok, now comes the 4. stage: Testing

1.4 Testing

Let's test the implementation with the ops defined above and see how it works. Norvig presents three sample applications:
(GPS #{"son-at-home" "car-needs-battery" "have-money" "have-phone-book"} #{"son-at-school"})
executing look-up-number
executing telephone-shop
executing tell-shop-problem
executing give-shop-money
executing shop-installs-battery
executing drive-son-to-school
executing look-up-number
executing telephone-shop
executing tell-shop-problem
executing give-shop-money
executing shop-installs-battery
executing drive-son-to-school
solved
=> solved
(GPS #{"son-at-home" "car-needs-battery" "have-money"}
     #{"son-at-school"})
=> not-solved
(GPS #{"son-at-home" "car-works"}
     #{"son-at-school"})
executing drive-son-to-school
=> solved
The second example has no solution, because the person does not have a phone-book and thus can't come in communication with the shop and the shop cant repair the car, so he can't drive his son to school
But how does it actually work? Trace is a good way to find out.
(dotrace [GPS achieve appropriate? apply-op every-accum?] 
  (GPS #{"son-at-home" "car-needs-battery" "have-money" "have-phone-book"} #{"son-at-school"}))  
 TRACE t3410: (GPS #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} #{"son-at-school"})
 TRACE t3411: | (every-accum? #<Chapter4$eval3393$fn__3396 PAIP2clojure.Chapter4$eval3393$fn__3396@6bb0b0a0> #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} #{"car-works" "son-at-home"})
 TRACE t3412: | | (achieve #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} "car-works")
 TRACE t3413: | | | (every-accum? #<Chapter4$eval3393$fn__3396 PAIP2clojure.Chapter4$eval3393$fn__3396@6bb0b0a0> #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} #{"shop-knows-problem" "car-needs-battery" "shop-has-money"})
 TRACE t3414: | | | | (achieve #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} "shop-knows-problem")
 TRACE t3415: | | | | | (every-accum? #<Chapter4$eval3393$fn__3396 PAIP2clojure.Chapter4$eval3393$fn__3396@6bb0b0a0> #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} #{"in-communication-with-shop"})
 TRACE t3416: | | | | | | (achieve #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} "in-communication-with-shop")
 TRACE t3417: | | | | | | | (every-accum? #<Chapter4$eval3393$fn__3396 PAIP2clojure.Chapter4$eval3393$fn__3396@6bb0b0a0> #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} #{"know-phone-number"})
 TRACE t3418: | | | | | | | | (achieve #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} "know-phone-number")
 TRACE t3419: | | | | | | | | | (every-accum? #<Chapter4$eval3393$fn__3396 PAIP2clojure.Chapter4$eval3393$fn__3396@6bb0b0a0> #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} #{"have-phone-book"})
 TRACE t3420: | | | | | | | | | | (achieve #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"} "have-phone-book")
 TRACE t3420: | | | | | | | | | | => #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"}
 TRACE t3419: | | | | | | | | | => #{"car-needs-battery" "have-phone-book" "have-money" "son-at-home"}
 executing look-up-number
 TRACE t3418: | | | | | | | | => #{"car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home"}
 TRACE t3417: | | | | | | | => #{"car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home"}
 executing telephone-shop
 TRACE t3416: | | | | | | => #{"car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3415: | | | | | => #{"car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"}
 executing tell-shop-problem
 TRACE t3414: | | | | => #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3421: | | | | (achieve #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"} "car-needs-battery")
 TRACE t3421: | | | | => #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3422: | | | | (achieve #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"} "shop-has-money")
 TRACE t3423: | | | | | (every-accum? #<Chapter4$eval3393$fn__3396 PAIP2clojure.Chapter4$eval3393$fn__3396@6bb0b0a0> #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"} #{"have-money"})
 TRACE t3424: | | | | | | (achieve #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"} "have-money")
 TRACE t3424: | | | | | | => #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3423: | | | | | => #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "have-money" "son-at-home" "in-communication-with-shop"}
 executing give-shop-money
 TRACE t3422: | | | | => #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "shop-has-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3413: | | | => #{"shop-knows-problem" "car-needs-battery" "know-phone-number" "have-phone-book" "shop-has-money" "son-at-home" "in-communication-with-shop"}
 executing shop-installs-battery
 TRACE t3412: | | => #{"shop-knows-problem" "car-needs-battery" "car-works" "know-phone-number" "have-phone-book" "shop-has-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3425: | | (achieve #{"shop-knows-problem" "car-needs-battery" "car-works" "know-phone-number" "have-phone-book" "shop-has-money" "son-at-home" "in-communication-with-shop"} "son-at-home")
 TRACE t3425: | | => #{"shop-knows-problem" "car-needs-battery" "car-works" "know-phone-number" "have-phone-book" "shop-has-money" "son-at-home" "in-communication-with-shop"}
 TRACE t3411: | => #{"shop-knows-problem" "car-needs-battery" "car-works" "know-phone-number" "have-phone-book" "shop-has-money" "son-at-home" "in-communication-with-shop"}
 executing drive-son-to-school
 TRACE t3410: => solved
=> solved
The output is rather lengthy, but you can find your way through. Really great about tracing is, that you see all applications of the functions with their arguments and results printed hierachically, so that the caller of a function is less deeply nested. Therefore, it really helps in debugging and understanding code without manually adding debug statements or stepping through breakpoints etc. I am showing here the output when all functions are traced. Turning out tracing on a few functions make the output more readable, but you miss a bit about the whole picture.
As you see, it tries to achieve the goals in reverse order, because it has to solve the subproblems first. The first action it can execute is look-up-number. After this, it knows the telefone-number, can phone the shop, the shop can repair the car and it can solve the problem of driving the son to school because the car works.

1.5 Analysis

It turns out, that the GPS is not very general, at least our implementation is not. Norvig does a great analysis explaining the reasons why the program can not handle many situations. I will give a short summary of the discussion, describing the problems and a miminal explanation why the program fails.
  • Running around the block - GPS don't have any notion of actions, that add nothing to the :add-list, like 'running-around-the-block'. There has to be a state like "got-some-exercise" in the :add-list
  • Clobbered Sibling Goal - In the current implementation, GPS solves every goal independent and in sequence. That means, if solving the second problem undoes a previous goal, GPS will falsly return success. In the school-ops world, consider #{have-money son-at-school}. The program has to be modified, so that it checks after achieving all goals that the state is still a subset of the goal-state
  • Leaping before you look - sometimes, gps will fail, but executes some action before it recognizes that. In the current implementation, planning and execution are interleaved. This will be addressent by returning the list of actions instead of just printing the actions.
  • Recursive Subgoal Problem - suppose we would add another operator to the school-ops "ask-phone-number" with the precondition of being in communication with the shop. GPS will try to achieve know-phone-number with ask-phone-number, which requires to be in communication with the shop, which requires knowing the phone-number … It tries to solve the problem in terms of itself. To address this, we have to add a goal-stack, and let GPS give-up if it recognizes a circle.
  • Lack of Intermediate Information - As at the leaping before you look problem, it is better to return a list of actions instead of just solved. This makes it easier for other parts of the program to interact with GPS

1.6 Back to implementation - A more general version 2

Before we come to the implementation of the program, it helps to get a good debugging tool, so that we can see what it is doing:
(def ^:dynamic *dbg-ids*)
(defn debug
  ([id str] (debug id 0 str))
  ([id indent str]
   (when (contains? *dbg-ids* id)
    (do (dotimes [i indent]
          (print " "))
        (println str)))))
=> #'PAIP2clojure.Chapter4/debug
Here is the common-lisp code. It first transforms all the ops to a new form, which has an '(Executing action) in its add list.
   (defun executing-p (x)
  "Is x of the form: (executing ...) ?"
  (starts-with x 'executing))

  (defun starts-with (list x)
    "Is this a list whose first element is x?"
    (and (consp list) (eql (first list) x)))

(defun convert-op (op)
  "Make op conform to the (EXECUTING op) convention."
  (unless (some #'executing-p (op-add-list op))
    (push (list 'executing (op-action op)) (op-add-list op)))
  op)

(defun op (action &key preconds add-list del-list)
  "Make a new operator that obeys the (EXECUTING op) convention."
  (convert-op
    (make-op :action action :preconds preconds
             :add-list add-list :del-list del-list)))

;;; ==============================

(mapc #'convert-op *school-ops*)

The new version of GPS will return the new state instead of printing the actions and removes all non-atoms from the state so that only (Executing action) forms are left
  (defvar *ops* nil "A list of available operators.")

(defstruct op "An operation"
  (action nil) (preconds nil) (add-list nil) (del-list nil))

(defun GPS (state goals &optional (*ops* *ops*))
  "General Problem Solver: from state, achieve goals using *ops*."
  (remove-if #'atom (achieve-all (cons '(start) state) goals nil)))


(defun achieve-all (state goals goal-stack)
  "Achieve each goal, and make sure they still hold at the end."
  (let ((current-state state))
    (if (and (every #'(lambda (g)
                        (setf current-state
                              (achieve current-state g goal-stack)))
                    goals)
             (subsetp goals current-state :test #'equal))
        current-state)))

(defun achieve (state goal goal-stack)
  "A goal is achieved if it already holds,
  or if there is an appropriate op for it that is applicable."
  (dbg-indent :gps (length goal-stack) "Goal: ~a" goal)
  (cond ((member-equal goal state) state)
        ((member-equal goal goal-stack) nil)
        (t (some #'(lambda (op) (apply-op state goal op goal-stack))
                 (find-all goal *ops* :test #'appropriate-p)))))


(defun member-equal (item list)
  (member item list :test #'equal))


(defun apply-op (state goal op goal-stack)
  "Return a new, transformed state if op is applicable."
  (dbg-indent :gps (length goal-stack) "Consider: ~a" (op-action op))
  (let ((state2 (achieve-all state (op-preconds op) 
                             (cons goal goal-stack))))
    (unless (null state2)
      ;; Return an updated state
      (dbg-indent :gps (length goal-stack) "Action: ~a" (op-action op))
      (append (remove-if #'(lambda (x) 
                             (member-equal x (op-del-list op)))
                         state2)
              (op-add-list op)))))

(defun appropriate-p (goal op)
  "An op is appropriate to a goal if it is in its add list."
  (member-equal goal (op-add-list op)))
The function achieve-all abstracts away the (every #'achieve …) from the first version and checks whether the resulting state is still a subset of the goal-state. Also, it updates the current state for each application of achieve (destructively modifying..)
Also note, that in apply-op append and remove-if are used. They are needed because in this version, the order of the conditions in the current state counts (because they contain (Executing action) elements.
Again, I will not follow the implementation given in PAIP. Especially I want to avoid the side-effects in achieve-all and the mixing of actual conditions and (Executing action) forms, which would prevent to represent states with sets. It turned out, that I needed only three small changes from the first version of the program:
  • The state is now divided in the current-state and a list-of-actions taken so far. Thanks to clojure's destructuring, it is easy to take the state apart again.
  • there is a goal-stack which contains all goals tried so far. achieve gives up if it encounters a goal that is in the goal-stack to avoid stack-overflow-errors.
  • every-time an action is executed, it is appendet to the list-of-actions.
 (use 'clojure.set)
 (use 'auto-declare.core)

(with-auto-declare _  
   (defn GPS [state goals]
     (let [[new-state list-of-actions] (_every-accum? (partial achieve []) [state []] goals)]
       (if (nil? new-state)
         nil
         list-of-actions)))

   (defn achieve
     "return the new-state after the goal is achieved or nil
      if it could not be archieved"
     [goal-stack [current-state list-of-actions] goal]
     (debug :gps (count goal-stack) (str "Goal " goal))
     (cond (contains? current-state goal) [current-state list-of-actions]
           (contains? goal-stack goal) [nil list-of-actions]
           :else (some (partial _apply-op goal-stack goal [current-state list-of-actions])
                       (filter #(_appropriate? goal %) *ops*))))

   (defn appropriate? [goal op]
     (contains? (:add-list op) goal))

   (defn every-accum? [func start coll]
     (reduce #(if (nil? %1)
                nil
                (func %1 %2)) start coll))

   (defn apply-op [goal-stack goal state op]
     (debug :gps (count goal-stack) (str "Consider: " (:action op)))
     (let [[new-current-state new-list-of-actions](every-accum? (partial achieve (conj goal-stack goal))
                                                                state (:preconds op))]
       (if (nil? new-current-state)
         [nil new-list-of-actions]
         (do (debug :gps (count goal-stack) (str "Action " (:action op)))
             [(-> new-current-state (difference (:del-list op)) (union (:add-list op)))
              (conj new-list-of-actions (:action op))])))))

=> #'PAIP2clojure.Chapter4/apply-op
This implementation solves the problems mentioned above. The next parts of Chapter 4 show how the program performs in new domains. Feel free to port the ops in these domains to clojure and report whether the program worked or not. Here is one example domain: monkey and bananas:
  (def ^:dynamic *banana-ops*
  [{:action "climb-on-chair"
    :preconds #{"chair-at-middle-room" "at-middle-room" "on-floor"}
    :add-list #{"at-bananas" "on-chair"}
    :del-list #{"at-middle-room" "on-floor"}}
   {:action "push-chair-from-door-to-middle-room"
    :preconds #{"chair-at-door" "at-door"}
    :add-list #{"chair-at-middle-room" "at-middle-room"}
    :del-list #{"chair-at-door" "at-middle-room"}}
   {:action "walk-from-door-to-middle-room"
    :preconds #{"at-door" "on-floor"}
    :add-list #{"at-middle-room"}
    :del-list #{"at-door"}}
   {:action "grasp-bananas"
    :preconds #{"at-bananas" "empty-handed"}
    :add-list #{"has-bananas"}
    :del-list #{"at-door"}}
   {:action "drop-ball"
    :preconds #{"has-ball"}
    :add-list #{"empty-handed"}
    :del-list #{"has-ball"}}
   {:action "eat-bananas"
    :preconds #{"has-bananas"}
    :add-list #{"empty-handed" "not-hungry"}
    :del-list #{"has-bananas" "hungry"}}])

(def ^:dynamic *ops* *banana-ops*)
=> #'PAIP2clojure.Chapter4/*ops*
(binding [*dbg-ids* #{:gps }]
  (GPS #{"at-door" "on-floor" "has-ball" "hungry" "chair-at-door"}
       #{"not-hungry"}))
 Goal not-hungry
 Consider: eat-bananas
  Goal has-bananas
  Consider: grasp-bananas
   Goal empty-handed
   Consider: drop-ball
    Goal has-ball
   Action drop-ball
   Goal at-bananas
   Consider: climb-on-chair
    Goal chair-at-middle-room
    Consider: push-chair-from-door-to-middle-room
     Goal chair-at-door
     Goal at-door
    Action push-chair-from-door-to-middle-room
    Goal at-middle-room
    Goal on-floor
   Action climb-on-chair
  Action grasp-bananas
 Action eat-bananas
=> ["drop-ball" "push-chair-from-door-to-middle-room" "climb-on-chair" "grasp-bananas" "eat-bananas"]

The Chapter in PAIP ends with a discussion of how general the GPS really is. It turns out, that it has severe limitations and I encourage everyone to read the sections in the book. Many of this issues will be addressed in later chapters using more sophisticated techniques, such as full-fledged search and backtracking, like prolog does.

Puh, that was more work than I thought. But I hope that I managed to translate it to ideomatic clojure which is not to hard to follow. Feel free to comment, if you have questions, anything is not clear, you enounter bugs or have some other advice.

Chapter 5 is next to come. It contains an implementation of the ELIZA program.


Date: 2012-06-01T20:54+0200
Author: Maik Schünemann
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0

Donnerstag, 31. Mai 2012

auto-declare in clojure

Automatic forward-declaration in clojure


Table of Contents

  • 1 Auto-declare in clojure
    • 1.1 The reason why clojure resolves symbols at compile time
    • 1.2 a with-auto-declare macro
    • 1.3 example of the use of with-auto-declare

1 Automatic forward-declaration in clojure

There is one think that distracts me everytime I write a lot clojure code: the lack of forward-declarations. Actually,clojure supports forward declarations,but you have to use the declare macro and I hate to stop working on my function to add a declare on top of it. When I return to the function, I have often forgot what I wanted to write … I found a few discussions on this topic here, here and here. This is what Rich Hickey answered to the question of why clojure resolves symbols at compile time (the reason why you have to declare all the vars you have not yet defined).

1.1 The reason why clojure resolves symbols at compile time

3.  Rich Hickey       
View profile  
 More options Nov 11 2008, 4:16 pm
On Nov 11, 12:24 am, Paul Barry <pauljbar...@gmail.com> wrote: 

- Show quoted text -
I can't speak for Scheme, but CL resolves symbols earlier than does 
Clojure - at read time. This causes a number of problems and a lot of 
complexity [1]. 
In Clojure, I had to solve the Lisp-1 vs defmacro problem, and did so 
by separating symbols from vars. That means that symbols are simple 
when read, and only get resolved when compiled. Were compilation to 
auto-create vars for never before seen symbols, many of the problems 
of packages would remain, so I decided that vars and other name 
resolutions needed to be created intentionally, via def/import/refer. 
This leads to many fewer errors, and avoids the problems identified in 
[1]. There is now a declare macro that makes it simple and obvious to 
intentionally claim names for later use: 
(declare a b c) 
; ...def a b c in any order 
This expands into (def a) (def b) (def c). Note that no initial values 
are supplied, so they are unbound, and you will get a runtime error if 
you use them before subsequently defining them, unlike your (def a 
nil) above, which I don't recommend. 
So, you don't need to define things in any specific order - use 
declare. 
Rich 
So the resolution of vars at compile time is good for clojure's namespaces, which are very well thought out in comparison to other lisps like common lisp. Also, it helps to avoid typos. When you spelled the name of a function wrong and clojure would allow forward-declarations, it would assume that you are referring to a var that isn't defined yet and would forward-declare that for you. The error would come up at runtime and (for me, at least) something as triviall as a spelling error is hard to see.
If you look from that position, declare is actually a feature of clojure. But nevertheless I tend to write my clojure-code in a top-down fashion, just like I think about the code. I start with highest level of abstraction at the top of my file and work down to the lower levels of abstraction. This makes the code easier to write and to read for me. Instead of writing new code above my old code or going away from my function and add a declare and go back to my function, everytime I want to have a forward declaration, I wanted to be able to specify symbols to be forward-declared without a need to leave the function and forgetting what I wanted to implement…

1.2 a with-auto-declare macro

So here is my solution: A with-auto-declare macro. With-auto-declare takes a bunch of code, searches for symbols marked to be forward-declared and adds a declare with these symbols on top of the code and gives the code with the auto-declared symbols replaced by the normal symbols. I choose to mark a symbol to be forward-declared by writing a D!- in front of it (please let me know if you have a better alternative). Here's the code:
(ns auto-declare.core)

(defn auto-declare-symbol?
  "Symbols which start with D!- and can not be resolved
   are subject to be auto-declared"
  [symb]
  (and (.startsWith (str symb) "D!-")
       (not (resolve symb))))

(defn auto-declare-symbol-to-normal-symbol [auto-declare-symbol]
  (-> auto-declare-symbol str (.substring 3) symbol))

(refer 'clojure.walk)
(defmacro with-auto-declare [symb & code]
  (let [auto-defs# (->> code flatten (filter auto-declare-symbol?) distinct)
        normal-symbols# (map auto-declare-symbol-to-normal-symbol auto-defs#)
        replacement-map# (-> {} (into (map vector auto-defs# normal-symbols#)))]
    `(do (declare ~@normal-symbols#) ~@(clojure.walk/postwalk-replace  replacement-map# code))))
Flatten extracts all symbols of the code, and distinct removes the duplicates from the auto-declare symbols. I use the function clojure.walk/postwalk-replace to replace all occurences of an auto-declare symbol with the corresponding normal-symbol. Therefor I need an replacement map, with is easily filled with into. The expression (map vector coll1 coll2) just zips the collections together. The 2-element vectors are interpreted as key-value pairs from into.

1.3 example of the use of with-auto-declare

As an example: Here is a piece of code from my blogpost: Porting PAIP to clojure - chapter 2:
(declare sentence noun-phrase verb-phrase article noun verb one-of) 
(defn  sentence [] (concat (noun-phrase) (verb-phrase)))
(defn  noun-phrase [] (concat (article) (noun)))
(defn  verb-phrase [] (concat (verb) (noun-phrase)))
(defn  article [] (one-of ["the" "a"]))
(defn  noun [] (one-of ["man" "ball" "woman" "table"]))
(defn  verb [] (one-of ["hit" "took" "saw" "liked"]))
=> #'auto-declare.core/verb
I had to declare every single function to get the code compiling… Now with my macro, the code becomes:
(with-auto-declare 
  (defn  sentence [] (concat (D!-noun-phrase) (D!-verb-phrase)))
  (defn  noun-phrase [] (concat (D!-article) (D!-noun)))
  (defn  verb-phrase [] (concat (D!-verb) (D!-noun-phrase)))
  (defn  article [] (D!-one-of ["the" "a"]))
  (defn  noun [] (D!-one-of ["man" "ball" "woman" "table"]))
  (defn  verb [] (D!-one-of ["hit" "took" "saw" "liked"])))
=> #'auto-declare.core/verb
It works! With this macro, you have both: the convienience of automatic-forward declarations when wanted, and you don't have the risk that typos are interpreted as forward-declarations. As always I am thankfull of comments/advice etc.

Edit: Ok I admit that the D!- looks somewhat ugly. I used it to prevent that it shadows an already defined symbol accidently. But, because I test if the symbol can be resolve to decide if it is an auto-declare symbol, this isn't neccesary. So I changed the with-auto-declare macro that it takes the prefix as first argument.
(ns auto-declare.core)

  (defn auto-declare-symbol?
    "Symbols which start with D!- and can not be resolved
     are subject to be auto-declared"
    [prefix symb]
    (and (.startsWith (str symb) (str prefix))
         (not (resolve symb))))

  (defn auto-declare-symbol-to-normal-symbol [prefix auto-declare-symbol]
    (-> auto-declare-symbol str (.substring (.length (str prefix))) symbol))

  (refer 'clojure.walk)

  (defmacro with-auto-declare [symb & code]
    (let [auto-defs# (->> code flatten (filter #(auto-declare-symbol? symb %)) distinct)
          normal-symbols# (map #(auto-declare-symbol-to-normal-symbol symb %1) auto-defs#)
          replacement-map# (-> {} (into (map vector auto-defs# normal-symbols#)))]
      `(do (declare ~@normal-symbols#) ~@(clojure.walk/postwalk-replace  replacement-map# code))))
=> #'auto-declare.core/with-auto-declare
The example: Now with a _ as prefix:
(with-auto-declare _
    (defn  sentence [] (concat (_noun-phrase) (_verb-phrase)))
    (defn  noun-phrase [] (concat (_article) (_noun)))
    (defn  verb-phrase [] (concat (_verb) (_noun-phrase)))
    (defn  article [] (_one-of ["the" "a"]))
    (defn  noun [] (_one-of ["man" "ball" "woman" "table"]))
    (defn  verb [] (_one-of ["hit" "took" "saw" "liked"])))
=> #'auto-declare.core/verb
Date: 2012-06-01T12:55+0200
Author: Maik Schünemann
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0

Mittwoch, 30. Mai 2012


Literate Programming in Clojure


1 Literate programming in clojure using org-mode babel

In this post I'll first show a double setup of org-mode babel-clojure with both inferior-lisp and slime, and then I will show how to use org-babel-clojure for literate programming.

1.1 Literate programming

Who doesn't know this situation: You have to modify someone others code but you have no clue what this code should do. Searching around the (often barely documentation) doesn't help much either, because it is often not clear what intention the author had with the code, or why he implemented it that way. Literate programming does not only help others reading your code, but it helps you as well,because
  • writing down your intention and explaining it makes it easier for you to code it
  • it helps you to be clear about your design decisions
  • writing it down intented to be read by others makes you to automatically write code of better quality
In Literate programming, you mix the explanation of the programm and the source code in one file. The source code can be automatically extracted. This is called tangling. It is also possible to evaluate the code while you are in the document.

1.2 Org-mode-babel

Babel is a part of org-mode that is a full featured literate programming program. Being part of org-mode, it is also able to be exported into a variety of formats including latex and html. I am using it to write my series of 'Porting PAIP to clojure'. Each blog-post is just a .org file which gets exported to html and uploaded on blogger. Here is the setup:

1.2.1 Setting up org-mode-babel with clojure

Fortunately, clojure is a supported language of org-mode, but it's documentation is outdated. So add following to your .emacs file:
(require 'ob)
(require 'ob-tangle)
(add-to-list 'org-babel-tangle-lang-exts '("clojure" . "clj"))

(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (clojure . t)))
(require 'ob-clojure)

(defun org-babel-execute:clojure (body params)
  "Evaluate a block of Clojure code with Babel."
  (concat "=> "
          (if (fboundp 'slime-eval)
              (slime-eval `(swank:interactive-eval-region ,body))
           (if (fboundp 'lisp-eval-string)
             (lisp-eval-string body)
              (error "You have to start a clojure repl first!")))))

(setq org-src-fontify-natively t)
(setq org-confirm-babel-evaluate nil)
(setq org-export-babel-evaluate nil)
(setq org-src-window-setup 'current-window)
(setq inferior-lisp-program "lein repl")
(I assume you have a recent org-mode installed) It will work with either inferior lisp using lein repl, which has the advantage that you can run it without a project.clj file, or with M-x clojure-jack-in, which gives you the goddies of slime with clojure.
I have also written a few lines to make the exported html look better and defined a skeleton to quickly insert the first lines in the .org file. If you type C-S-f4, you will be prompted for the title and the content of the skeleton will be inserted.
  (define-skeleton org-skeleton
  "Header info for a emacs-org file."
  "Title: "
  "#+TITLE:" str " \n"
  "#+AUTHOR: Maik Schünemann\n"
  "#+email: maikschuenemann@gmail.com\n"
  "#+STARTUP:showall\n"
  "-----"
 )
(global-set-key [C-S-f4] 'org-skeleton)

(setq org-export-html-style-extra "<style type=\"text/css\">pre {\n    border: 1pt solid #AEBDCC;\n   color:white;\n background-color: #000000;\n     padding: 5pt;\n font-family: courier, monospace;\n        font-size: 90%;\n        overflow:auto;\n  } </style>")

1.2.2 Interacting with clojure code.

Start a repl first. Either via M-x run-lisp or with M-x clojure-jack-in. Note, if you use the former, the result of the code won't get inserted in the .org file after evaluation.
To write some clojure-code, type the following (may be good to define a kbd-macro to speed this up)
#+begin_src clojure :results output :exports both
(def six (+ 1 2 3))
#+end_src
If you place your cursor between the beginsrc and endsrc and type C-c C-c, you will see the following:
#+begin_src clojure :results output :exports both
 (def six (+ 1 2 3))
 #+end_src

#+RESULTS:
: => #'user/six
Note that the code got evaluated and is now defined in the repl. So if you evaluate the following (either in the file, or in the repl, you will see:
#+begin_src clojure :results output :exports both
six
#+end_src 

#+RESULTS:
: => 6
So you loose nothing of the interactive development you love when coding clojure.

1.2.3 Exporting to html

When you are done, you can export the file to html by typing C-c C-e h or C-c C-e b to see the exported html in your standart browser. If you want the source code to be colorized like you see it in clojure-mode, you need to install the htmlize package (via elpa) The clojure code shown above will export to the following:
(def six (+ 1 2 3))
=> #'user/six
Cool, nicely colored clojure code in your html-documentation or your blog-post. So This should be enough to get you started using org-mode-babel with clojure for Literate Programming. It is interactive like writing normal clojure code and it helps you to think more clearly about your code.

1.3 the org-file

Here is the org-file that you see right now:

#+TITLE:Literate Programming in Clojure
#+AUTHOR: Maik Schünemann
#+email: maikschuenemann@gmail.com
#+STARTUP:showall
-----
* Literate programming in clojure using org-mode babel
  In my previous post ... I showed a setup of clojure with org-mode babel using inferior lisp.
  In this post I'll first show a double setup with both inferior-lisp and slime, and then I will
  show how to use org-babel-clojure for literate programming.
** Literate programming
   Who doesn't know this situation:
   You have to modify someone others code but you have no clue what this code should do. Searching
   around the (often barely documentation) doesn't help much either, because it is often not clear
   what intention the author had with the code, or why he implemented it that way.
   [[http://de.wikipedia.org/wiki/Literate_programming][Literate programming]] does not only help others reading your code, but it helps you as well,because
   - writing down your intention and explaining it makes it easier for you to code it
   - it helps you to be clear about your design decisions
   - writing it down intented to be read by others makes you to automatically write code of better quality
   In Literate programming, you mix the explanation of the programm and the source code in one file.
   The source code can be automatically extracted. This is called tangling. It is also possible to evaluate
   the code while you are in the document.
** Org-mode-babel
   [[http://orgmode.org/worg/org-contrib/babel/][Babel]] is a part of [[http://orgmode.org/][org-mode]] that is a full featured literate programming program. Being part of org-mode,
   it is also able to be exported into a variety of formats including latex and html. I am using it to write
   my series of 'Porting PAIP to clojure'. Each blog-post is just a .org file which gets exported to html and
   uploaded on blogger. Here is the setup:
*** Setting up org-mode-babel with clojure
    Fortunately, clojure is a supported language of org-mode, but it's documentation is outdated. So add following
    to your .emacs file:
    #+begin_src elisp
    (require 'ob)
    (require 'ob-tangle)
    (add-to-list 'org-babel-tangle-lang-exts '("clojure" . "clj"))
 
    (org-babel-do-load-languages
     'org-babel-load-languages
     '((emacs-lisp . t)
       (clojure . t)))
    (require 'ob-clojure)
 
    (defun org-babel-execute:clojure (body params)
      "Evaluate a block of Clojure code with Babel."
      (concat "=> "
              (if (fboundp 'slime-eval)
                  (slime-eval `(swank:interactive-eval-region ,body))
               (if (fboundp 'lisp-eval-string)
                 (lisp-eval-string body)
                  (error "You have to start a clojure repl first!")))))
 
    (setq org-src-fontify-natively t)
    (setq org-confirm-babel-evaluate nil)
    (setq org-export-babel-evaluate nil)
    (setq org-src-window-setup 'current-window)
    (setq inferior-lisp-program "lein repl")
    #+end_src
    (I assume you have a recent org-mode installed)
    It will work with either inferior lisp using lein repl, which has the advantage that you can run it without
    a project.clj file, or with M-x clojure-jack-in, which gives you the goddies of slime with clojure.

    I have also written a few lines to make the exported html look better and defined a skeleton to quickly insert
    the first lines in the .org file. If you type C-S-f4, you will be prompted for the title and the content of the
    skeleton will be inserted.
    #+begin_src elisp
    (define-skeleton org-skeleton
    "Header info for a emacs-org file."
    "Title: "
    "#+TITLE:" str " \n"
    "#+AUTHOR: Maik Schünemann\n"
    "#+email: maikschuenemann@gmail.com\n"
    "#+STARTUP:showall\n"
    "-----"
   )
  (global-set-key [C-S-f4] 'org-skeleton)

  (setq org-export-html-style-extra "<style type=\"text/css\">pre {\n    border: 1pt solid #AEBDCC;\n color:white;\n background-color: #000000;\n padding: 5pt;\n font-family: courier, monospace;\n        font-size: 90%;\n        overflow:auto;\n  } </style>")
    #+end_src
 
*** Interacting with clojure code.
    Start a repl first. Either via M-x run-lisp or with M-x clojure-jack-in. Note, if you use the former, the
    result of the code won't get inserted in the .org file after evaluation.

    To write some clojure-code, type the following (may be good to define a kbd-macro to speed this up)
    #+begin_example
    #+begin_src clojure :results output :exports both
    (def six (+ 1 2 3))
    #+end_src
    #+end_example
    If you place your cursor between the begin_src and end_src and type C-c C-c, you will see the following:
    #+begin_example
    #+begin_src clojure :results output :exports both
     (def six (+ 1 2 3))
     #+end_src

    #+RESULTS:
    : => #'user/six
    #+end_example
    Note that the code got evaluated and is now defined in the repl. So if you evaluate the following (either in
    the file, or in the repl, you will see:
    #+begin_example
    #+begin_src clojure :results output :exports both
    six
    #+end_src

    #+RESULTS:
    : => 6
    #+end_example
    So you loose nothing of the interactive development you love when coding clojure.

*** Exporting to html
    When you are done, you can export the file to html by typing C-c C-e h or C-c C-e b to see the exported
    html in your standart browser.
    If you want the source code to be colorized like you see it in clojure-mode, you need to install the
    htmlize package (via elpa)
    The clojure code shown above will export to the following:

    #+begin_src clojure :results output :exports both
     (def six (+ 1 2 3))
     #+end_src

    #+RESULTS:
    : => #'user/six
    Cool, nicely colored clojure code in your html-documentation or your blog-post.
    So This should be enough to get you started using org-mode-babel with clojure for Literate Programming.
    It is interactive like writing normal clojure code and it helps you to think more clearly about your code.

** the org-file
   Here is the org-file that you see right now:
   (leaving this part out to avoid infinite file size...)

Date: 2012-05-30T11:17+0200
Author: Maik Schünemann
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0

Dienstag, 29. Mai 2012

Porting PAIP to clojure chapter 2

Porting PAIP to clojure chapter 2

Chapter 2 - Generating random Sentences

1 Porting PAIP to clojure - chapter 2 A Simple Lisp Program 

1.1 Introduction

This is the first post of my series porting PAIP to clojure. It is possible to follow my post even if you don't PAIP but I strongly recommend that you get a copy of it. This chapter contains a program to generate random sentences using a grammar for a subset of english. Norvig presents two versions of the program, a straightforward solution using functions to generate random sentences from a grammar Then the weaknesses of this approach are shown and a better solution is developed using a data-driven solution. Afterwards, the advantages of the data-driven approach are discussed.

1.2 Generating random Sentences

1.2.1 The Grammar

Sentence => Noun-Phrase + Verb-Phrase
Noun-Phrase => Article + Noun
 …
 Article => the, a, ….
 …
A rule consists either of two or more non-terminals like Noun-phrase or of a collection of terminals like the, a,… This representation is very natural if you have ever worked with context-free grammars before. The rule Article => the, a, … means: An article is either the or a or … (Article = the | a | …)

1.2.2 The Program

That is the Common Lisp code for the naive solution. It can be found here.
(defun sentence ()    (append (noun-phrase) (verb-phrase)))
(defun noun-phrase () (append (Article) (Noun)))
(defun verb-phrase () (append (Verb) (noun-phrase)))
(defun Article ()     (one-of '(the a)))
(defun Noun ()        (one-of '(man ball woman table)))
(defun Verb ()        (one-of '(hit took saw liked)))
The translation to clojure is straightforward. But I will use a slightly modified version which is, as I think, a little bot more ideomatic in clojure:
(declare sentence noun-phrase verb-phrase article noun verb one-of)

(defn  sentence [] (concat (noun-phrase) (verb-phrase)))
(defn  noun-phrase [] (concat (article) (noun)))
(defn  verb-phrase [] (concat (verb) (noun-phrase)))
(defn  article [] (one-of ["the" "a"]))
(defn  noun [] (one-of ["man" "ball" "woman" "table"]))
(defn  verb [] (one-of ["hit" "took" "saw" "liked"]))
I use vectors instead of lists and strings instead of symbols. here are the helper-functions: one-of picks one element of the collection and returns a single-element collection containing it.
(defn one-of [coll]
  (if (seq coll)
   [(rand-nth coll)]))
To test, let's get 10 random sentences:
(take 10 (repeatedly sentence))   
this evaluates to:
(("a" "ball" "liked" "the" "ball")
("the" "table" "took" "the" "ball")
("the" "man" "took" "a" "man")
("the" "woman" "saw" "a" "man")
("a" "ball" "hit" "the" "ball")
("a" "table" "hit" "the" "man")
("a" "ball" "took" "a" "ball")
("a" "man" "liked" "a" "ball")
("a" "table" "saw" "the" "ball")
("the" "woman" "hit" "a" "ball"))
I believe it is apparent that this is not a good representation for a grammar. The lisp functions are definitly harder to read than the normal grammar syntax shown above. As an example, if you wanted to represent the following rule: Adj* => e,Adj + Adj* (note that e is the empty symbol) you would have to write a function like this:
(declare Adj)
(defn Adj* []
  (if (= (rand-int 2) 0)
    nil
    (concat (Adj) (Adj*))))
There has to be a better way… enter the data-driven solution!

1.3 A data-driven solution

The naive solution is straightforward to implement while making it harder to encode the grammar rules. By using a data-driven approach, it is the other war round. Writing grammar rules becomes easy while implementation may become harder. So let's take a look how the data-driven solution works for this problem.

1.3.1 The representation of the grammar

The first step is to decouple the grammar from the code which processes it. In the book Norvig represents the grammar with the following datastructure:
 (defparameter *simple-grammar*
 '((sentence -> (noun-phrase verb-phrase))
  (noun-phrase -> (Article Noun))
  (verb-phrase -> (Verb noun-phrase))
  (Article -> the a)
  (Noun -> man ball woman table)
   (Verb -> hit took saw liked))
"A grammar for a trivial subset of English.")
Here the grammar rule sentence => noun-phrase + verbphrase gets translated to
(sentence -> (noun-phrase verb-phrase)).
So a list of elements means "produce this nonterminals in sequence".
Multiple nonterminals right of the -> sign mean "choose one of this nonterminals".
Every rule is itself a list and the grammar is a list of rules.
But this representation is not really clojureish?. We can take advantage of clojures literal representation for maps,vectors and sets in addition to lists.
Using these, the meaning of a datastructure- when choosen appropriate- becomes clearer.
Let me give you an example:
What are grammar rules? Rules map a nonterminal to - posible multiple - nonterminals or terminals. Thus it is appropriate to represent them as a map in clojure. I choose to represent the nonterminals as keywords and the values of the map as either one element or a vector of multiple elements.
A vector means: "apply all elements in order".
Norvig represents a choice of nonterminals as simply the nonterminals written after the -> sign.
For me, it was not clear at the beginning that that means "choose one of these nonterminals".
Using a set in clojure, this becomes obvious. Rewritten, the grammar becomes the following:
(def simple-grammar
{:sentence [:noun-phrase :verb-phrase]
 :noun-phrase [:Article :Noun]
 :verb-phrase [:Verb :noun-phrase]
 :Article #{"the" "a"}
 :Noun #{"man" "ball" "woman" "table"}
 :Verb #{"hit" "took" "saw" "liked"}})

(def ^:dynamic *grammar* simple-grammar)
Ok that's for the design part. Now that we have a good representation of our data, the grammar, we have to worry about evaluating it.

1.3.2 Evaluating the grammar

Because I have chosen a different and more ideomatic representation of the grammar in clojure, the code for evaluating the grammar will be different than the code in PAIP. So don't expect a literal translation of the PAIP code.
So, how can we generate a possible sentence:
the function 'generate' will take the startsymbol as argument and retrieves the rule from the grammar.
If there is not a rule for the argument in the grammar, the argument itself is evaluated (thus making it possible to call generate either with the left hand or the right hand side of a rule).
It will recurse on the elements of the rule in sequence, appending the result as it goes along.
When it encounters a set, it will generate one random element of it. it recursively generates the nonterminal. If it encounters a terminal, that is none of the above are true, it just returns a single-element vector of it.
(defn generate [phrase]
    (cond (get *grammar* phrase) (generate (get *grammar* phrase))
          (sequential? phrase) (mapcat generate phrase)
          (set? phrase)  (generate (rand-nth (seq phrase)))
          :else [phrase]))
Amazing how closely the code mimics the description. So this is data-dricen programming.
Decouple the representation of the data from the code which evaluates it.
Let's take a look at a more complicated example and see how the data-driven approach scales. Here's the bigger grammar:

(def bigger-grammar
 {:sentence [:noun-phrase :verb-phrase]
  :noun-phrase #{[:Article :Adj* :Noun :PP*] :Name :Pronoun}
  :verb-phrase [:Verb :noun-phrase :PP*]
  :PP* #{[] [:PP :PP*]}
  :Adj* #{[] [:Adj :Adj*]}
  :PP [:Prep :noun-phrase]
  :Prep #{"to" "in" "by" "with" "on"}
  :Adj #{"big" "little" "blue" "green" "adiabatic"}
  :Article #{"the" "a"}
  :Name #{"Pat" "Kim" "Lee" "Terry" "Robin"}
  :Noun #{"man" "ball" "woman" "table"}
  :Verb #{"hit" "took" "saw" "liked"}
  :Pronoun #{"he" "she" "it" "these" "those" "that"}
 })
(def ^:dynamic *grammar* bigger-grammar)

Let's generate 10 sentences again:

(take 10 (repeatedly #(generate :sentence)))

(("these" "liked" "a" "ball")
 ("a" "ball" "in" "the" "big" "green" "big" "woman" "in" "that" "by" "Terry" "liked" "a" "man")
 ("that" "liked" "that")
 ("Lee" "hit" "it")
 ("Lee" "took" "Lee")
 ("the" "little" "blue" "little" "ball" "in" "Terry" "by" "Robin" "liked" "Pat")
 ("a" "adiabatic" "blue" "blue" "ball" "saw" "the" "man" "in" "Pat" "by" "Lee" "on" "a" "adiabatic" "table"
  "in" "Terry")
 ("Pat" "hit" "Robin" "to" "those")
 ("those" "liked" "the" "woman" "with" "Robin" "with" "these" "in" "the" "table" "to" "Robin" "to" "a"
  "blue" "adiabatic" "ball" "with" "she" "on" "those" "on" "those")
 ("it" "hit" "Kim" "on" "she" "on" "the" "table"))
It works! Enjoy the funny sentences.
But hey, what if we wanted to see how the sentences are generated, to see the parse-tree. Because of the data-driven design, it is easy to implement this. The data doesn't need to be changed, we need only a new evaluation-function.
(defn generate-all [phrase]
    (cond (get *grammar* phrase)  (list phrase (generate-all (get *grammar* phrase)))
          (sequential? phrase) (mapcat generate-all phrase)
          (set? phrase)  (generate-all (rand-nth (seq phrase)))
          :else phrase))
there are only two changes: make a list of phrase and the generated symbols before recursing and return the phrase in the base call instead of a single item vector containing it.
Imaging changing all functions to achieve this instead of just two lines …
(generate-all :sentence)
;=>
    (:sentence 
      (:noun-phrase 
         (:Name "Kim")
       :verb-phrase 
         (:Verb "saw"
          :noun-phrase 
            (:Article "the"
             :Adj* 
               (:Adj "adiabatic" 
                :Adj* ())
             :Noun "ball" 
             :PP* ())
          :PP* ())))
Norvig gives a last example of a generate-all function, which works on the simple grammar and returns all possible sentences defined by the grammar (the language of the grammar). I leaf the implementation to the reader :)

1.4 Advantages of the data-driven solution

Gratulations for making it through the post. With this chapter, Norvig makes a strong point which will be even more important in the next chapters (and is for programming in general). If you use the data-driven approach, you use "the most natural notation available to solve the problem". So instead of worrying how to implement the problem, worry about how to represent you data, so that it is easy to understand and to scale it. With the data-driven solution, you can
  • expand and modify the program easier
  • use different datasets with the same evaluation function
  • use different evaluation functions with the same dataset
  • represent your problem so that it is easier to understand.
If there is something you take away from this post, it is this:
Use the best notation available to solve your problem!


Chapter 4 coming soon. It contains an implementation of the GPS, the General Problem Solver.
Date: 2012-05-28T18:12+0200
Author: kima
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0