Node:Words and Symbols, Next:Syntax, Previous:Divide and Conquer, Up:Words in a defun
When we first start thinking about how to count the words in a
function definition, the first question is (or ought to be) what are
we going to count? When we speak of `words' with respect to a Lisp
function definition, we are actually speaking, in large part, of
`symbols'. For example, the following multiply-by-seven function
contains the five symbols defun, multiply-by-seven, number, *, and 7.
In addition, in the documentation string, it contains the four words
Multiply, NUMBER, by, and seven. The symbol number is repeated, so the
definition contains a total of ten words and symbols.
     (defun multiply-by-seven (number)
       "Multiply NUMBER by seven."
       (* 7 number))
However, if we mark the multiply-by-seven definition with C-M-h
(mark-defun), and then call count-words-region on it, we will find
that count-words-region claims the definition has eleven words, not
ten! Something is wrong!
The problem is twofold: count-words-region does not count the * as a
word, and it counts the single symbol, multiply-by-seven, as
containing three words. The hyphens are treated as if they were
interword spaces rather than intraword connectors: multiply-by-seven
is counted as if it were written multiply by seven.
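The two effects can be seen in isolation with a small sketch. It does not use count-words-region itself; instead it counts with the built-in forward-word, which relies on the same notion of a word. The name my-count-words-in-string is a hypothetical helper, and the sketch assumes the standard syntax table.

```emacs-lisp
(defun my-count-words-in-string (string)
  "Return the number of words in STRING, as `forward-word' sees them.
Hypothetical helper for illustration; assumes the standard syntax table."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (with-syntax-table (standard-syntax-table)
      (let ((count 0))
        ;; `forward-word' returns t only when it moved over a word;
        ;; it returns nil once it runs into the end of the buffer.
        (while (forward-word 1)
          (setq count (1+ count)))
        count))))

(my-count-words-in-string "multiply-by-seven")   ; => 3
(my-count-words-in-string "multiply by seven")   ; => 3
(my-count-words-in-string "*")                   ; => 0
```

The hyphenated symbol and the three separate words give the same count, and the asterisk contributes nothing.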
The cause of this confusion is the regular expression search within
the count-words-region definition that moves point forward word by
word. In the canonical version of count-words-region, the regexp is:

     "\\w+\\W*"
This regular expression matches one or more word-constituent
characters, followed by zero or more characters that are not word
constituents. What is meant by `word-constituent characters' brings us
to the issue of syntax, which is worth a section of its own.
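As a rough check of the counts discussed above, the following sketch applies the regexp to the multiply-by-seven definition in a temporary buffer. The names my-count-matches-in-string and my-sample-defun are hypothetical, the standard syntax table is assumed, and the second regexp (which also accepts symbol-constituent characters) is only an illustrative assumption, not the pattern used by count-words-region.

```emacs-lisp
(defun my-count-matches-in-string (regexp string)
  "Return how many successive matches of REGEXP occur in STRING.
Hypothetical helper for illustration; assumes the standard syntax table.
REGEXP must not match the empty string."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (with-syntax-table (standard-syntax-table)
      (let ((count 0))
        (while (re-search-forward regexp nil t)
          (setq count (1+ count)))
        count))))

(defconst my-sample-defun
  "(defun multiply-by-seven (number) \"Multiply NUMBER by seven.\" (* 7 number))"
  "The example definition, written as a string.")

;; Word constituents only: hyphens split the symbol, `*' is skipped.
(my-count-matches-in-string "\\w+\\W*" my-sample-defun)           ; => 11

;; Word or symbol constituents: `multiply-by-seven' and `*' each count once.
(my-count-matches-in-string "\\(\\w\\|\\s_\\)+" my-sample-defun)  ; => 10

;; `char-syntax' reports the syntax class of a character.
(with-syntax-table (standard-syntax-table)
  (list (string (char-syntax ?a))     ; "w", a word constituent
        (string (char-syntax ?-))))   ; "_", a symbol constituent
```

Under these assumptions the first pattern reproduces the count of eleven, while the second arrives at the ten words and symbols counted by hand.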