LISP is an acronym for "LISt Processing Language" - so named because the list is one of the primary data structures in the language. Symbolic AI regards symbolic lists as being a key part of the way intelligent beings and systems actually store and manipulate information.
LISP is the second oldest higher-level language (after FORTRAN), first developed by John McCarthy (then of MIT) following the Dartmouth Conference of 1956 that is often taken as the birth of the discipline of Artificial Intelligence. Though it has been used in other areas, it is primarily identified with the field of AI and was the "lingua franca" of AI work for many years (arguably still so).
In LISP's early years, mutually incompatibile dialects proliferated as it was implemented on different platforms. In 1984, a standardization effort resulted in the development of Common Lisp, as defined in Guy Steele, Common Lisp: The Language. (The second edition of this book remains one of the best-selling LISP books on amazon.com's ranking). Common Lisp became an ANSI standard in 1994. Today, most implementations of LISP attempt to adhere to the Common Lisp standard.
One early dialect of LISP - Scheme - has survived to the present day, and is also a standardized and widely used language. While it resembles Common Lisp, it also differs from it in a few ways that cause some to regard it as a distinct language. At the same time, some ideas that were pioneered in Scheme (e.g. static versus dynamic scoping of names) also found their way into Common Lisp - not surprising given that the same individual (Guy Steele) was a co-creator of Scheme as well as the author of the definition for Common Lisp. This page will discuss Common Lisp.
Two important books on LISP have been made freely available by their publishers on the Internet:
LISP is the earliest representative of the functional programming language paradigm. Unlike procedural and object-oriented languages - whose theoretical model of computation is the Turing Machine, LISP's theoretical model of computation is the Lambda calculus developed by Alonzo Church. (It can be shown that the two models of computation are equivalent in power - that is, any algorithm that can be expressed in one model can also be expressed in the other).
This difference might be understood in terms of the following diagram:
In procedural languages, code operates on data; in object-oriented languages, objects encapsulate code and data and interact with one another; in functional languages data flows through functions but has no separate existence of its own. That is, in procedural languages data is passive; in OO languages it is active; in pure functional languages it is ephemeral.
However, most functional languages (including LISP) provide a mechanism for keeping certain data in existence even when it is not flowing through functions. In LISP, this takes the form of what Common Lisp calls "special variables" - essentially equivalent to what in other programming languages are called "global variables". That is to say, though LISP is a functional language, it is not a pure functional language (though it could be used as one by avoiding imperative constructs.)
LISP uses a variant of prefix (Polish) notation called Cambridge Polish
Notation. All expressions are fully parenthesized, with the first element of
an executable form being the name of a function, macro, or special form. For
example, the expression written as 1 + 2 * 3 - 4 in traditional infix notation
is written as (- (+ 1 (* 2 3)) 4)
in LISP.
A further critical characteristic of LISP - which most functional languages do
not share, is that there is absolutely no syntactic distinction between
functions and data. For example, in one context, (car '(volkswagen
golf))
may be a piece of data representing a particular motor vehicle.
In another context, it may be taken as the application of the function
car
to the list (volkswagen golf)
.
(It turns out that, since car
is a defined operation in LISP,
this form is meaningful and has the value volkswagen
!)
In McCarthy's original paper, data objects were referred to as s-expressions (symbolic expressions). Common Lisp defines a rather complex hierarchy of data types; however, there are only a few core types of s-expression that have been part of the language since the beginning. These include:
word this-is-a-symbol *this-is-also-a-symbol* 123so-is-thisLISP is not case sensitive, so, for example,
word
,
Word
, WORD
, and for that matter wOrD
are all equivalent.42 3.14 6.02e23Along with symbols, numbers are also called atoms.
(a list) (a list (that contains another list)) ((small prime numbers) (2 3 5 7))
()
) is a legitimate list.
It has an alternate name - nil
. These two names mean exactly
the same thing - though one may prefer to use one name or the other
depending on the context. The empty list ( ()
or
nil
) has the interesting property that it is the one and
only s-expression that is both a symbol (and hence an atom) and
a list.
(1 . 2) (a . b) (this . (is . (another . (way . (of . (representing . (a . (list . nil))))))))Dotted pairs do not appear often in LISP programs, but as the last example indicated, they are actually the way that LISP represents lists internally. For instance, the last example above is not a true list (though it may be referred to as a "dotted list"; however, it is equivalent to the list
(this is another way of representing a list)
. Note well that
the last dotted pair in the internal representation of a true list consists
of the last element and the empty list (nil
) - it is not a
dotted pair consisting of the last two elements! For example,
(this . (is . (not . (what (you . think)))))
is not
the true list (this is not what you think)
but rather the
dotted list (this is not what (you . think))
!
It is sometimes helpful to picture the way that LISP represents a list in terms
of dotted pairs (conses). For example, the list (this is a list)
is represented internally as (this . (is . (a . (list . nil))))
,
which looks like this in memory:
The following is the complete type hierarchy of Common Lisp. Note well that
every s-expression is either an atom or a list, except for ()
/
nil
, which is both!
Common Lisp type name | Printed form example(s) | |||
---|---|---|---|---|
Atom | ||||
Symbol | x , this-is-a-symbol ,
*so-is-this* , 1+ ,
nil , () | |||
Keyword | :test | |||
Number | ||||
Integer | ||||
Fixnum | 123 ,
-4 ,
#o377 (octal),
#b10100011 (binary) | |||
Bignum | 12345678901234567890 ,
| |||
Ratio | 4/5 | |||
Float | ||||
Short-Float | -1.3s-4 | |||
Single-Float | -1.3e-4 ,
.00013 | |||
Double-Float | -1.3d-4 | |||
Long-Float | -1.3l-4 | |||
Complex | #c(3.2 4.0) | |||
Character | #\A , #\a ,
#\Control-A
(all 3 different as characters) | |||
Bit | 0 , 1
(printed only, cannot be read) | |||
Vector | #(1 2 3) ,
#(apples bananas cumquats) | |||
Bit-Vector | #*101010 | |||
String | "This is a string" | |||
Array | #2a((1 2 3) (4 5 6)) | |||
Structure | #s(student-info :first-name john
:last-name jones :gpa 2.37) (valid only if the structure student-info has been
defined by defstruct ) | |||
List | ||||
True List | (1 2 3) | |||
Dotted List | (1 . 2) ,
(1 2 (3 . 4)) |
A symbol (other than one of the self-evaluating ones listed below) is regarded as the name of a variable when it is evaluated. LISP is a dynamically typed language - a variable does not have a declared type, but simply takes on the type of its current value. Common Lisp recognizes two kinds of variables.
let
or let*
; the
latter are analogous to local variables in other programming languages.
Example:
(defun f (x) (let ((y 3)) ;; At this point, x and y are lexical variables ) )
setq
. Example (assuming no lexical variable named
x
is in scope):
(setq x 3)Creates the special variable
x
if necessary, and in any case
gives it the value 3
.Common Lisp uses two separate namespaces for the use of a symbol as a special form / macro / function name and the use of a symbol as a variable name. Thus, for example, the following is legal (though arguably a source of confusion)
(setq setq 3)This would cause the variable whose name is
setq
to be given the
value 3
. When the LISP interpreter sees the name setq
as the first element of a list to be evaluated, it takes it as the name of a
special form; but any time the LISP interpreter sees a symbol like
setq
elsewhere, it takes it as the name of a variable.
(This turns out to be one of the significant differences between LISP and Scheme;
the latter has only a single namespace, so a given symbol cannot be both the
name of a variable and the name of a special form / macro / function.
In LISP, code takes the form of expressions to be evaluated. For example,
(+ 3 4)
is an expression that calls for applying the addition
operation to the two numbers 3 and 4, yielding the result 7. We noted earlier
that LISP makes absolutely no syntactic difference between data and code. As a
result, any LISP s-expression might potentially be treated as an expression
to be evaluated - which may or may not be meaningful. For example, we could
attempt to evaluate (3 4 5)
- but the result would be an error
since this s-expression is not meaningful when viewed as code.
When attempting to evaluate a form, LISP uses the following rules:
nil
(or ()
) and t
(which is often used as the logical value true.):
)nil
, t
, and keywords) are
treated as the names of
variables. (See discussion above) If a symbol has been defined as the
name of a lexical or special variable, the variable's value is its value;
otherwise, attempting to evaluate it results in an error.
(+ 3 4)
is treated as the application of the function named
by +
to the arguments 3
and 4
, with
the result (7) being the value of the expression.
As noted above, any list whose first element is a symbol is regarded as either the application of a function or of a special form or macro to the remaining elements of the lists, regarded as its arguments. The LISP interpreter handles these quite differently, though.
When a function call is evaluated, the interpreter first evaluates each of the arguments in the list, and then applies the specified function to the result.
Example: consider the following form:(+ (* 2 3) (* 4 5)))When evaluating it, the interpreter first evaluates the form
(* 2 3)
, then the form (* 4 5)
. To evaluate
(* 2 3)
, it
first evaluates 2
and then 3
. Since numbers are
self-evaluating forms, their values are 2 and 3, and when the interpeter
applies *
to these evaluated arguments, the result is 6.
Something similar happens when (* 4 5)
is evaluated to yield 20.
Then the function +
is applied to these evaluated arguments 6
and 20, yielding the final result 26.
Example: consider the following code excerpt:
(setq a 4) (setq b 5) (+ a b)When evaluating
(+ a b)
, the interpreter first evaluates the forms
a
and b
. Each is taken as the name of a variable,
and the most recent values assigned to those variables (4 and 5) are used.
Finally, the interpreter applies the function +
to these values,
yielding 9.
If the first symbol in a list to be evaluated is the name of a special form or
macro, the interpreter does not evaluate the arguments. However, the
special form or macro may evaluate some of the arguments - how this is handled
varies from special form / macro to special form / macro. For example, the
special form setq
evaluates its second argument, but not its
first.
Example: consider the second line of the following code excerpt:
(setq a 1) (setq a (+ 2 3))Because the special form
setq
evaluates its second argument, the
form (+ 2 3)
is evaluated, yielding 5. However, the form
a
is not evaluated - so 5 is assigned to a
.
(If setq
were a function or a special form that evaluated both
of its arguments, the operation would mean that 5 should be assigned to 1 -
obviously an absurd operation!)
The following are but a few of the special forms, macros, and functions of Common Lisp.
quote
quote
is a special form that simply passes its unevaluated
argument through. For example, the value of (quote a)
is
the symbol a.'anything
to be equivalent to
(quote anything)
.
setq
setq
is a special form that evaluates its
second argument and then assigns the value to the variable named by its
first argument.
let
let
is a special form that creates lexical variables. It has
the following basic form (though variations are possible)(let ( { (variable expression) }* ) { form }* )The forms are evaluated in a context in which each of the variables is a lexical variable bound to the initial value specified. All of the expressions are evaluated before any of the variables is bound - therefore an expression cannot use a variable previously declared in the same
let
.
if
if
is a special form that selects one of two expressions to
evaluate based on the outcome of some test, with its value becoming the
value of the whole expression. It is analogous the conditional
expression of C-family languages (test ? then-expr :
else-expr
), and has the following syntax
(if test then-form else-form)The two forms must be single s-expressions; however
progn
can
be used to group multiple forms (analogous to { ... }
in Java),
though this is, of course, not purely functional style.
The else-form may be omitted; in this case, if the test is
false the result is nil
.
cond
cond
is a macro that selects one series of expressions out of
several to evaluate. It has the following basic form (though variations are
possible)(cond { ( test { form }*) }* )Each test is evaluated in turn until one returns any result other than
nil
- at which point the following form(s) is/are
evaluated. The value of the last form evaluated becomes the value of
the cond
- hence if there is more than one form, any
form save the last is evaluated only for its side effects. It is
common to see that the last clause of cond
has the test
t
, which of course is always true and constitutes a default
case. (Of course, if one is programming in pure functional form, each
clause of a cond
will have exactly one form.)
and
- macro with any number of arguments, which are
usually all tests - true if all are non-nil - false if any is
nil
. Does short-circuit evaluation - i.e. stops evaluating
arguments as soon as one that is false (nil
) is encountered.
or
- macro with any number of arguments, which are
usually all tests - true if any is non-nil - false if all are
nil
. Does short-circuit evaluation - i.e. stops evaluating
arguments as soon as one that is true (non-nil) is encountered.
defun
- macro for defining named functions. The
basic syntax is
(defun name ( { parameter }* ) { form }* )When the function is called, the form(s) constituting the body are evaluated in sequence in a context where any parameter(s) are lexical variables bound to the actual parameters of the call; if there are multiple form(s) the value of the last becomes the value of the function and the others are evaluated only for side-effects. (Of course, if one is programming in pure functional style, the body of a function will consist of exactly one form.)
lambda
- macro for defining anonymous functions. The
basic syntax is
(lambda ( { parameter }* ) { form }* )When the function is called, the form(s) constituting the body are evaluated in sequence in a context where any parameter(s) are lexical variables bound to the actual parameters of the call; if there are multiple form(s) the value of the last becomes the value of the function and the others are evaluated only for side-effects. (Of course, if one is programming in pure functional style, the body of a function will consist of exactly one form.)
The list operations are often best understood by picturing what they do in diagrammtic form, as done below. All are functions, which means that the arguments are evaluated before the operation is performed.
cons
- 2 arguments of any type. Constructs a
dotted pair (cons cell) consisting of the 2 arguments. For example,
(cons 'a 3)
is (a . 3)
car
- 1 argument, which must be a true or dotted list.
Extracts the car of the first cell of the list. For example,
(car '(a 3))
is a
cdr
- 1 argument, which must be a true or dotted list.
Extracts the cdr of the first cell of the list. If the list consists
of more than one cell, its cdr will itself be a list. For example,
(cdr '(a 3))
is (3)
. (Note that this is a two
element true list, as compared to a dotted pair as in the
cons
example.) caar
, cdar
,
cadr
, cddr
,
caddr
, cdddr
and several more - any
combination of up to four a's or d's -
convenience forms equivalent to (car (car ...))
,
(cdr (car ...))
, (car (cdr ...))
,
(cdr (cdr ...))
etc.
list
- any number of arguments of any type. Converts
its arguments to a true list. For example,
(list 'a 'b '(1 2))
is (a b (1 2))
append
- any number of arguments - all, save the last,
must be true lists (and it usually is, too). Combines its arguments into a
single list, which will be a true list if the last argument is. For
example, (append '(1 2) '(3 4))
is (1 2 3 4)
LISP supports the arithmetic operations found in most programming languages,
plus many unique ones as well. Most of the operations take any number of
arguments - e.g. (* 2 3 4 5)
evaluates to 120. All are functions,
which means that the arguments are evaluated before the operation is performed.
+
addition - any number of arguments; with no arguments
gives the identity element for addition, 0.
-
negation - one argument; subtraction - two or more
arguments; illegal with no arguments.
*
multiplication - any number of arguments; with no
arguments gives the identity element for multiplication, 1.
/
reciprocal - one argument; division - two or more
arguments; illegal with no arguments.
1+
increment by 1 - one argument only.
1-
decrement by 1 - one argument only.
exp
e to the power - one argument only.
expt
number to the power - two arguments only.
These are functions that apply some test one or two values (variables or
expressions - evaluated before the test is applied) and either succeed
(evaluating to t
) or fail (evaluating to nil
).
atom
- true if its single argument is an atom (see the
type hierarchy above).
symbolp
- true if its single argument is a symbol
(see the type hierarchy above).
numberp
- true if its single argument is a number
(see the type hierarchy above).
listp
- true if its single argument is a (true or dotted)
list (see the type hierarchy above).
null
- true if its single argument is the empty list
(()
or nil
).
eq
- true if its two arguments are the same identical
object. Because of the way LISP handles symbols, a given name always
maps to the same symbol, so something like (eq 'a 'a)
is
always true.
eql
- true if its two arguments are eq
or
are numbers with the same type and value or characters with the same value.
equal
- true if its two arguments are isomorphic - i.e.
their printed representation is the same.
LISP supports the standard comparisons on numbers, but most can take one up to any number of arguments, rather than just two. (In the case of two arguments, they behave in the traditional way.) All of the arguments must be numbers.
=
- all arguments equal/=
- all arguments differ (no two equal)>
- all arguments monotonically increasing<
- all arguments monotonically decreasing>=
- all arguments monotonically non-decreasing<=
- all arguments monotonically non-increasingzerop
- one argument only - true if the argument is 0Common Lisp provides a large number of input output operations. Input operations are functions whose value is some input read from an external stream; output operations are typically functions which have the side effect of printing something on an external stream and also return as their function value that which was printed.
read
no argument - reads a single s-expression
print
single argument - prints a single s-expression,
preceeded by a newline and followed by a single space.
prin1
single argument - prints a single s-expression,
with no whitespace before or after.
princ
single argument - prints a single s-expression,
with no whitespace before or after, and without special characters such
as quotes around a string.
terpri
no argument - prints a single newline.