10/20/86  create_wordlist, cwl


Syntax as a command:  cwl path {-control_args}


Function:  produces an alphabetized list of all distinct words found in
the specified text segment.  The list is saved in a wordlist segment
created in the working directory and named by adding the suffix wl to
the entryname of the text segment.  The total number of words in the
text segment and the number of words put into the wordlist segment are
displayed.


Arguments:
path
   is the pathname of the text segment or MSF.


Control arguments:
-brief, -bf
   suppresses the display of the total number of words in the text
   segment and the number of words put into the wordlist segment.
-from n, -fm n
   processes words in the text segment starting at line number n.  If
   this control argument is not specified, processing starts at the
   first line of the text segment.
-header, -he
   displays the pathname of the text segment.
-no_control_lines, -ncl
   suppresses the display of the control lines.


-no_exclude, -ne
   specifies that words containing only special characters or
   punctuation are not to be excluded from the wordlist.
-no_sort, -ns
   specifies that the words in the wordlist segment are not to be
   sorted into alphabetical order.  They are put into the wordlist
   segment in the order in which they are found in the text segment,
   and duplicates are not eliminated.  (This control argument is
   intended for special applications and should not be used for normal
   wordlist segment creation.)
-temp_dir path, -td path
   specifies the directory that holds the temporary segment when the
   input file is an MSF.  (The default is the process directory.)


-to n
   processes words in the text segment up to and including line number
   n.  If this control argument is not specified, processing continues
   to the last line of the text segment.
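
The line range selected by -from and -to can be pictured with the
following Python sketch.  It is an illustration only, not the command's
implementation.  The name select_lines and its parameters are
hypothetical, and the sketch assumes that lines are numbered from 1,
that both bounds are inclusive, and that lines end with a newline
character.

```python
def select_lines(text, from_line=1, to_line=None):
    """Return the part of text between two 1-based, inclusive line numbers."""
    # Split on newline only, so that a form feed or vertical tab inside a
    # line does not change the line count (an assumption of this sketch).
    lines = text.split("\n")
    # With no upper bound, process through the last line.
    last = len(lines) if to_line is None else to_line
    return "\n".join(lines[from_line - 1:last])
```

With both parameters left at their defaults the whole text is returned,
which mirrors the defaults described for -from and -to.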


Notes:  Words in the text segment are separated by the following
delimiter (white space) characters:

   space
   horizontal tab
   vertical tab
   newline
   form feed
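
As a rough illustration of this delimiter set, the Python sketch below
splits text on exactly these five characters.  The name split_words is
hypothetical, and treating a run of adjacent delimiters as a single
separator is an assumption of the sketch.

```python
import re

# The five delimiters listed above: space, horizontal tab, vertical tab,
# newline, and form feed.  Carriage return is deliberately not included.
_DELIMITERS = re.compile("[ \t\v\n\f]+")


def split_words(text):
    """Split text into raw words, discarding empty pieces."""
    return [piece for piece in _DELIMITERS.split(text) if piece]
```

Python's built-in str.split() with no argument is not used here because
it also splits on characters, such as carriage return, that are not in
the list above.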


Punctuation characters are removed from the word.  The characters "([{
are removed from the left side of the word.  The characters ")]}.,;:?!
are removed from the right side of the word.  Also, PAD characters
(octal 177) are removed from the left side of the word.  Additional
special processing is performed on each word after all punctuation is
removed.  A summary of this special processing is given below:

     If the entire word is underscored, then the underscores are
     removed.  If only part of a word is underscored, then the
     underscores remain.

     If the word contains no letters, i.e., consists entirely of
     punctuation characters or other special characters, then the word
     is excluded from the wordlist.  The -no_exclude control argument
     disables the automatic exclusion of such words.
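
The rules above can be approximated by the following Python sketch,
which takes words that have already been split on the delimiter
characters.  It is not the command's implementation.  The names
clean_word, has_letter, and build_wordlist are hypothetical.  The
sketch assumes that the double quote belongs to both punctuation sets,
that an underscored character is stored as an overstrike (underscore,
backspace, character, in either order), that a word stripped down to
nothing is always dropped, and that ordering by character code is an
acceptable stand-in for "alphabetized".

```python
import re

LEFT_STRIP = '"([{' + "\x7f"   # opening punctuation plus PAD (octal 177)
RIGHT_STRIP = '")]}.,;:?!'     # closing punctuation

# Assumption: one printed position is "_ BS c", "c BS _", or a plain
# character, where BS is the backspace character (octal 010).
_POSITION = re.compile(r"_\x08([^_\x08])|([^_\x08])\x08_|(.)", re.DOTALL)


def clean_word(word):
    """Strip punctuation, then un-underscore a fully underscored word."""
    word = word.lstrip(LEFT_STRIP).rstrip(RIGHT_STRIP)
    positions = list(_POSITION.finditer(word))
    # group(3) is set only for a position that is not underscored.
    if positions and all(m.group(3) is None for m in positions):
        word = "".join(m.group(1) or m.group(2) for m in positions)
    return word


def has_letter(word):
    """True if the word contains at least one ASCII letter."""
    return any(c.isascii() and c.isalpha() for c in word)


def build_wordlist(raw_words, no_exclude=False, no_sort=False):
    """Apply the cleaning and exclusion rules to already-split words."""
    words = [clean_word(w) for w in raw_words]
    # Assumption: a word that is empty after stripping is always dropped.
    words = [w for w in words if w and (no_exclude or has_letter(w))]
    if no_sort:
        return words            # text order, duplicates kept
    # Assumption: character-code order stands in for "alphabetized".
    return sorted(set(words))
```

For example, build_wordlist(['(quick)', 'fox,', '--', '42.']) returns
['fox', 'quick'], while passing no_exclude=True also keeps '--' and
'42'.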