# ryjo.codes

## A* Algorithm Written in CLIPS

### Introduction

The A* (A-star) Algorithm is a great way to determine the shortest distance between two points in a coordinate system. In order to determine whether a coordinate is the next "step," you measure the distance from the start point as well as the distance from the end point. You repeat this process until you have found a complete path between the start and end points.

In this article, I'll talk about one way of implementing the A* Algorithm with CLIPS. We'll extend CLIPS by adding a custom C function so that we can call it from the precedents of our rules. We'll also discuss some of the existing functions available out-of-the-box with CLIPS. Let's dive in!

### Explaining the A* Algorithm

First, let's discuss how the A* Algorithm works. The Algorithm uses two measurements to determine the next "step" in a path: 1. the distance to the start node and 2. the distance to the end node. Calculate both, add them together, and whichever square has the lower number is the next step.

The distance from the start node to a square is referred to as the square's "g cost." The distance to the end node from a square is its "h cost." The "f cost" is the sum of these two costs. In the above diagram, the square with the "s" is our starting point, and the square with the "e" is our end.

Start by "opening" the squares that surrounding the starting square. This means: calculate the cost to move to that square, as well as their distance from the end. The cost to move to that square accumulates from the square that caused it to open. Moving north, south, east and west will incur a cost of 10, while moving diagonally will cost 14. This is because of the way we calculate the distance between two points on a grid. To rehash a bit of math, the formula to calculate this is . In other words: subtract the `x` values from both points from one another and square them. Do the same for the `y` values. Sum the result of these two together, then find the square root of this sum.

Using the above diagram, our start square is at `(2, 4)`. We want to calculate the g cost of one of the surrounding squares `(2, 3)`. Our equation is therefore . This simplifies to , then to . The square root of 1 is 1, so our g cost for moving north, south, east, or west from any square is 1. Moving diagonally from `(2, 4)` to `(3, 3)` would be . This simplifies to which is roughly 1.4. We'll multiply g cost by 10 in our calculations to avoid using decimal places.

To calculate our h cost, we'll use the same formula. Instead of using the coordinate of our start node, we use the end node. Once we have both the g and h cost, we add them together to get the f cost.

Calculate all of the f costs for the open nodes. Once this is done, we'll "close" the square with the lowest cost. We then repeat this algorithm, opening the squares that surround our newly closed square. In our above example, we only have one more opened square `(2, 2)`.

Now that we've closed the next lowest f cost square `(2, 3)`, we begin to open the surrounding squares. One of the squares surrounding this one `(2, 1)` is newly opened, but look at `(2, 2)`. We've updated its g cost and, therefore, its f cost. When you close a square and are looking at its surrounding squares, if a previously opened square would have a lower g cost had it been opened by this new square, we replace it. Now, we'll track that `(2, 2)` has an f cost of 42 instead of 50 and was opened by `(2, 3)` instead of `(3, 3)`.

The next square we close is `(3, 4)`. We repeat the algorithm even if it results in a dead end. We may be able to see that `(3, 4)` will not lead us to the end square, but our program cannot. Thus it must explore every possible square according to the rules of the algorithm.

Continue this process until the end node is opened. From here, you may determine your path. We choose the lowest g cost for squares opened multiple times. To tie break f costs, choose the square with the lower h cost. The square that opened a square is the previous step in the path. In our examples above, our full path is `(2, 4) (2, 3) (2, 2) (3, 1) (4, 1)`.

### Implementation in CLIPS

First, we'll download CLIPS and compile it. We'll download it, untar it, and then compile it with `make`:

```\$ wget https://sourceforge.net/projects/clipsrules/files/CLIPS/6.40/clips_core_source_640.tar.gz
# ... truncated output ...
HTTP request sent, awaiting response... 302 Found
Location: https://versaweb.dl.sourceforge.net/project/clipsrules/CLIPS/6.40/clips_core_source_640.tar.gz [following]
--2022-12-11 16:32:46--  https://versaweb.dl.sourceforge.net/project/clipsrules/CLIPS/6.40/clips_core_source_640.tar.gz
Resolving versaweb.dl.sourceforge.net (versaweb.dl.sourceforge.net)... 162.251.232.173
Connecting to versaweb.dl.sourceforge.net (versaweb.dl.sourceforge.net)|162.251.232.173|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1082012 (1.0M) [application/x-gzip]
Saving to: ‘clips_core_source_640.tar.gz’

clips_core_source_640.ta 100%[===============================>]   1.03M  2.97MB/s    in 0.3s

2022-12-11 16:32:51 (2.97 MB/s) - ‘clips_core_source_640.tar.gz’ saved [1082012/1082012]
\$ tar --strip-components=2 -xvf clips_core_source_640.tar.gz
# ... truncated output ...
\$ make```

Now that CLIPS is compiled, we'll define a few templates. In CLIPS, templates are used to give facts more structure:

```(deftemplate barrier
(slot x)
(slot y))

(deftemplate closed
(slot last-x)
(slot last-y)
(slot path))

(deftemplate delta-g
(slot x)
(slot y)
(slot g))

(deftemplate opened
(slot previous-x)
(slot previous-y)
(slot x)
(slot y)
(slot g)
(slot h)
(slot f))```

Now we'll use these templates when we define some of our initial facts with `deffacts`:

```(deffacts config
(start       2       4)
(end         4       1)
(delta-g (x  0)  (y  1)  (g 10))
(delta-g (x  1)  (y  0)  (g 10))
(delta-g (x  0)  (y -1)  (g 10))
(delta-g (x -1)  (y  0)  (g 10))
(delta-g (x  1)  (y  1)  (g 14))
(delta-g (x -1)  (y  1)  (g 14))
(delta-g (x  1)  (y -1)  (g 14))
(delta-g (x -1)  (y -1)  (g 14))
(barrier (x  4)  (y  5))
(barrier (x  4)  (y  4))
(barrier (x  4)  (y  3))
(barrier (x  4)  (y  2))
(barrier (x  3)  (y  2)))```

Note the difference between the `start` and `end` facts and the template facts after them. With the template facts, we must specify which named "slot" the values are for. We can specify them in any order we choose, and we can even omit them if we don't need them in rule antecedents.

### Our Initial Ruling

We'll begin the process of "closing" our starting node and "opening" the surrounding nodes:

```(defrule init
(start ?x  ?y)
(end   ?ex ?ey)
(not (closed))
=>
(assert (closed (path (str-cat "(" ?x ", " ?y ")")) (last-x ?x) (last-y ?y)))
(do-for-all-facts ((?d delta-g))
;Do not open nodes on a barrier
(not (any-factp ((?b barrier)) (and (= ?b:x (+ ?x ?d:x)) (= ?b:y (+ ?y ?d:y)))))
(bind ?xx (+ ?x ?d:x))
(bind ?yy (+ ?y ?d:y))
(bind ?h (h ?xx ?yy ?ex ?ey))
(assert (opened (g ?d:g) (previous-x ?x) (previous-y ?y) (x ?xx) (y ?yy) (h ?h) (f (+ ?d:g ?h))))))```

The `do-for-all-facts` function lets us do some action for all facts that match a given query. We specify we want to look at our `delta-g` facts. These represent the 8 directons one can travel from a square as well as the g cost of taking those directions. We then use `any-factp` to see if there is a barrier at the square in this given direction. If there isn't, we open that square.

Ok, now that we have at least one closed square and opened squares, we can work on closing our next open node:

```(defrule close-next
;There is not a "closed" path ending at the intended end coordinate
(end ?x ?y)
(not (closed (last-x ?x) (last-y ?y)))

;There is a closed path that opened a coordinate
(closed (path ?path) (last-x ?px) (last-y ?py))
(opened (g ?g) (h ?h) (f ?f) (previous-x ?px) (previous-y ?py) (x ?ox) (y ?oy))
;that has the lowest "g cost" leading to that specific coordinate
(not (opened (g ?gg&:(< ?gg ?g)) (x ?ox) (y ?oy)))
;and that has the lowest "f cost" of all open coordinates
(not (opened (f ?ff&:(< ?ff ?f))))
;and that has the lowest "h cost" of all open coordinates tied with its "f cost"
(not (opened (f ?f) (h ?hh&:(< ?hh ?h))))
=>
;Close this coordinate
(assert (closed (path (str-cat ?path " (" ?ox ", " ?oy ")")) (last-x ?ox) (last-y ?oy)))
;Remove all open nodes for this coordinate
(do-for-all-facts ((?o opened)) (and (= ?o:x ?ox) (= ?o:y ?oy)) (retract ?o))
;Open new coordinates
(do-for-all-facts ((?d delta-g))
(and
;Do not re-open closed nodes
(not (any-factp ((?c closed)) (and (= ?c:last-x (+ ?ox ?d:x)) (= ?c:last-y (+ ?oy ?d:y)))))
;Do not open nodes on a barrier
(not (any-factp ((?b barrier))     (and (= ?b:x (+ ?ox ?d:x))      (= ?b:y (+ ?oy ?d:y)))))
;Do not open more nodes at this coordinate
;that would have a higher g cost
(not (any-factp ((?o opened))      (and (= ?o:x (+ ?ox ?d:x))      (= ?o:y (+ ?oy ?d:y))
(<= ?o:g (+ ?g  ?d:g))))))
(bind ?xx (+ ?ox ?d:x))
(bind ?yy (+ ?oy ?d:y))
(bind ?hhh (h ?xx ?yy ?x ?y))
(bind ?ggg (+ ?g ?d:g))
(assert
(opened (previous-x ?ox) (previous-y ?oy) (x ?xx) (y ?yy) (g ?ggg) (h ?hhh) (f (+ ?hhh ?ggg))))))```

We must implement our custom `h` function which calculates the h cost of a node. CLIPS is an awesome language, but doing heavy math in our rules engine will make things slow. To avoid this, we'll calculate the h cost in C. We'll add a function to the file `userfunctions.c`, and then we'll add a call to `AddUDF` inside of the `UserFunctions` function. This will let us call our function inside of our rules engine:

```#include "clips.h"
#include "math.h"

void UserFunctions(Environment *);

void h(
Environment *env,
UDFContext *udfc,
UDFValue *out)
{
UDFValue ox, oy, x, y;
if (! UDFNextArgument(udfc,NUMBER_BITS,&ox))
{ return; }
if (! UDFNextArgument(udfc,NUMBER_BITS,&oy))
{ return; }
if (! UDFNextArgument(udfc,NUMBER_BITS,&x))
{ return; }
if (! UDFNextArgument(udfc,NUMBER_BITS,&y))
{ return; }

double floatValue;
long long oxValue, oyValue, xValue, yValue;
oxValue = ox.integerValue->contents;
oyValue = oy.integerValue->contents;
xValue  = x.integerValue->contents;
yValue  = y.integerValue->contents;
floatValue = 10 * sqrt(
(oxValue - xValue) * (oxValue - xValue) +
(oyValue - yValue) * (oyValue - yValue)
);
out->floatValue = CreateFloat(env, floatValue);
}

/*********************************************************/
/* UserFunctions: Informs the expert system environment  */
/*   of any user defined functions. In the default case, */
/*   there are no user defined functions. To define      */
/*   functions, either this function must be replaced by */
/*   a function with the same name within this file, or  */
/*   this function can be deleted from this file and     */
/*   included in another file.                           */
/*********************************************************/
void UserFunctions(
Environment *env)
{
#if MAC_XCD
#pragma unused(env)
#endif
AddUDF(env, "h", "d", 4, 4, "l", h, "h", NULL);
}```

Finally, we'll add a rule that'll report a completed path:

```(defrule done
(end ?x ?y)
(closed (path ?path) (last-x ?x) (last-y ?y))
=>
(println "Done!" crlf "Path: " ?path))
```

Let's create a batch file called `astar.bat`. We'll use this file to load in the rules and run the rules engine:

```(watch statistics)
(reset)
(run)
(exit)```

We also specify that we want to see some statistics around the performance of our rules engine. We'll run this file with our compiled CLIPS binary:

```\$ ./clips -f2 astar.bat
%%%%\$***
Done!
Path: (2, 4) (2, 3) (2, 2) (3, 1) (4, 1)
8 rules fired        Run time is 0.000307083129882812 seconds.
26051.5776397516 rules per second.
28 mean number of facts (36 maximum).
0 mean number of instances (0 maximum).
1 mean number of activations (1 maximum).
```

Nice! That's the path that we expect, and it's fast!

### Conclusion

I hope that you've found this article useful! CLIPS is such a fun language, and I feel like I'm only scratching the surface. By the way: in learning this Algorithm, I found this YouTube video to be quite helpful.

- ryjo