1 01/22/85  sort_strings, sstr
  2 
  3 Syntax as a command:  sstr {-control_args} strings
  4 
  5 
  6 Syntax as an active function:  [sstr {-control_args} strings]
  7 
  8 
  9 Function: orders the argument strings according to the ASCII collating
 10 sequence.
 11 
 12 
 13 Arguments:
 14 strings
 15    are the strings to be sorted.  All arguments following the first
 16    string are treated as strings.  You can use -string to identify a
 17    first string that looks like a control argument or to separate a
 18    numeric string from operands of -field.
 19 
 20 
 21 Control arguments (sort units):
 22 -all, -a
 23    makes the primary (and only) sort field the entire sort unit; i.e.,
 24    each string is considered to be a sort unit when sorting.  (Default)
 25 -block N, -bk N
 26    makes the sort unit a block of N strings, where N must be a positive
 27    integer (see "Examples" below).  (Default: 1 string)
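The grouping done by -block can be pictured with a short Python
sketch.  This is an illustration only, not the sort_strings
implementation.  The function name sort_in_blocks is hypothetical, and
two points are assumptions the text above does not settle: that the
number of strings is a multiple of N, and that a block is compared
string by string.

```python
def sort_in_blocks(strings, n=1):
    """Group strings into sort units of n strings each, then sort the units."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    # Assumption: len(strings) is a multiple of n.  How a short final
    # block is treated is not stated in the text.
    units = [tuple(strings[i:i + n]) for i in range(0, len(strings), n)]
    # Assumption: a block is compared string by string.  Python compares
    # str values by code point, which matches ASCII order for ASCII text.
    return sorted(units)


print(sort_in_blocks(["b", "2", "a", "1"], n=2))
```

With N of 2, the four strings form two units, and each unit moves as a
whole when sorted.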
 28 
 29 
 30 Control arguments (handling duplicates):
 31 -duplicates, -dup
 32    retains duplicate sort units in the sorted results.  (Default)
 33 -only_duplicates, -odup
 34    only sort units that occur more than once in the input appear in the
 35    sorted results.  One unit from each set of duplicate sort units is
 36    placed in the return value, in sorted order.
 37 -only_duplicate_keys, -odupk
 38    only sort units that have duplicate sort fields appear in the sorted
 39    results.  All such units having duplicate sort fields are placed in
 40    the return value, since the nonsort field portions of the units may
 41    differ.
 42 
 43 
 44 -only_unique, -ouq
 45    only sort units that are unique appear in the sorted results.
 46    Whenever a set of duplicate units is found, they are removed
 47    entirely from the return value.
 48 -only_unique_keys, -ouqk
 49    only sort units that have unique sort fields appear in the sorted
 50    results.  All units having duplicate sort fields are removed
 51    entirely from the return value.
 52 
 53 
 54 -unique, -uq
 55    deletes duplicate sort units from the sorted results.  For each set
 56    of duplicate sort units, only the first appears in the sorted
 57    results, along with nonduplicate sort units.
 58 -unique_keys, -uqk
 59    deletes sort units having duplicate sort fields from the sorted
 60    results.  For each set of sort units having duplicate fields, only
 61    the first appears in the sorted results, along with nonduplicate
 62    sort units.
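The whole-unit duplicate controls can be compared side by side with a
small Python sketch.  It is a model of the descriptions above, not the
actual implementation.  The function name handle_duplicates and its
mode strings are hypothetical, and the sketch assumes the default -all
sort field.  The *_keys variants, which compare sort fields rather
than whole units, are not modeled.

```python
from collections import Counter


def handle_duplicates(units, mode="duplicates"):
    """Sort units, then apply one whole-unit duplicate-handling mode."""
    ordered = sorted(units)      # Python's sort is stable
    counts = Counter(ordered)    # how many times each unit occurs
    if mode == "duplicates":     # keep everything
        return ordered
    if mode == "only_unique":    # drop every unit that has a duplicate
        return [u for u in ordered if counts[u] == 1]
    if mode in ("unique", "only_duplicates"):
        kept, seen = [], set()
        for u in ordered:
            if u in seen:        # later copies are always dropped
                continue
            seen.add(u)
            # unique keeps one copy of every unit; only_duplicates keeps
            # one copy only of units that occurred more than once.
            if mode == "unique" or counts[u] > 1:
                kept.append(u)
        return kept
    raise ValueError("unknown mode: " + mode)


units = ["b", "a", "b", "c", "a", "d"]
for mode in ("duplicates", "unique", "only_duplicates", "only_unique"):
    print(mode, handle_duplicates(units, mode))
```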
 63 
 64 
 65 Control arguments (input strings):
 66 -string strings, -str strings
 67    identifies the strings that follow as the strings to be sorted.  All
 68    remaining arguments are treated as input strings.
 69 
 70 
 71 Control arguments (sort order):
 72 -ascending, -asc
 73    returns the sorted results in ascending order.  (Default)
 74 -case_sensitive, -cs
 75    makes the sort by comparing sort fields without translating letters
 76    to lowercase.  (Default)
 77 -character, -ch
 78    makes the sort based on the character representation of the sort
 79    field.  (Default)
 80 -descending, -dsc
 81    returns the sorted results in descending order.
 82 -field field_specs, -fl field_specs
 83    specifies the field(s) to be used when comparing two sort units.
 84    This allows units to be sorted based upon comparison of only a part
 85    of each sort unit.  (See "Notes on field specification.")  Multiple
 86    -field control arguments may be used to specify multiple fields.
 87 
 88 
 89 -integer, -int
 90    makes the sort by converting the sort field to fixed binary (71,0)
 91    integers when comparing one sort unit with another (see "Notes").
 92 -non_case_sensitive, -ncs
 93    makes the sort by translating letters in the sort fields to
 94    lowercase when comparing one sort unit with another.  The actual
 95    sorted results remain unchanged.
 96 -numeric, -num
 97    makes the sort by converting the sort field to float decimal (59)
 98    numbers when comparing one sort unit with another (see "Notes").
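The effect of the comparison controls is easiest to see on strings
that look like numbers.  The Python sketch below is an approximation:
int and decimal.Decimal stand in for fixed binary (71,0) and float
decimal (59), whose ranges and conversion rules differ.  The helper
name sort_key is hypothetical.  How sort_strings treats a field that
cannot be converted is not covered here; the Python helper simply
raises an error.

```python
from decimal import Decimal


def sort_key(mode="character", case_sensitive=True):
    """Build a key function that mimics one comparison mode."""
    def key(field):
        if mode == "integer":    # stand-in for fixed binary (71,0)
            return int(field)
        if mode == "numeric":    # stand-in for float decimal (59)
            return Decimal(field)
        # Character comparison; -non_case_sensitive lowers the key only,
        # so the sorted strings themselves are unchanged.
        return field if case_sensitive else field.lower()
    return key


nums = ["10", "9", "100"]
print(sorted(nums))                           # character order
print(sorted(nums, key=sort_key("integer")))  # integer order
print(sorted(["b", "A", "a", "B"], key=sort_key(case_sensitive=False)))
```

In character order "10" and "100" sort ahead of "9" because "1"
precedes "9" in the collating sequence; in integer order the values
are compared instead.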
 99 
100 
101 Syntax of field specification: field_start field_length {sort_controls}
102 
103 
104 Notes on field specification: The field_spec operands of -field
105 define the fields within each sort unit by which the unit is sorted.
106 The first field_spec defines the primary sort field, the second, a
107 secondary sort field, and so forth.  Each field_spec consists of a
108 field start location, field length, and optional sorting controls.
109 
110 
111 List of field_start formats: You can give the field start location in
112    one of the following formats:
113 S
114    a positive integer, giving the character position of the start of
115    the field in the sort unit (e.g., 1 if the field begins at the first
116    character).  If the sort unit contains fewer than S characters, then
117    the unit is sorted as if space characters appeared in the sort
118    field.
119 -from S, -fm S
120    where S is a positive integer giving the character position of the
121    start of the field in the sort unit.
122 
123 
124 -from STR, -fm STR
125    where STR is a character string that identifies the beginning of the
126    sort field.  The field begins with the first character of the sort
127    unit that follows STR.  If STR does not appear in the sort unit,
128    then the unit is sorted as if the sort field contained space
129    characters.
130 -from /REGEXP/, -fm /REGEXP/
131    where REGEXP is a regular expression that identifies the beginning
132    of the sort field.  The field begins with the first character of the
133    sort unit that follows the part of the sort unit matching REGEXP
134    (see the qedx command).  If no match for REGEXP is found in the sort
135    unit, then the unit is sorted as if the sort field contained space
136    characters.
137 
138 
139 -from -string STR, -fm -str STR
140    treats STR as a character string that identifies the beginning of
141    the sort field, even though STR may look like an integer or a
142    regular expression.  For example,
143       -from -string 25
144    identifies a sort field that begins with the character following 25
145    in the sort unit.
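The four field_start formats can be summarized in a Python sketch
that locates the first character of the field.  It is a hedged model
of the rules above, not the real parser.  The function name
field_start and its literal flag (standing in for -string) are
hypothetical, and Python's re syntax is only assumed to be close
enough to qedx regular expressions for simple patterns.

```python
import re


def field_start(unit, spec, literal=False):
    """Return the 0-based index where the sort field begins, or None
    when STR or REGEXP does not occur in the unit (the text says such
    a unit sorts as if its field held spaces)."""
    if not literal and re.fullmatch(r"[0-9]+", spec):
        position = int(spec)           # S: 1-based character position
        if position < 1:
            raise ValueError("S must be a positive integer")
        return position - 1
    if (not literal and len(spec) >= 2
            and spec.startswith("/") and spec.endswith("/")):
        match = re.search(spec[1:-1], unit)    # /REGEXP/
        return match.end() if match else None  # field follows the match
    index = unit.find(spec)                    # STR, or -string STR
    return index + len(spec) if index >= 0 else None


print(field_start("name=alice", "6"))             # by position
print(field_start("name=alice", "="))             # after a string
print(field_start("ab12cd", "/[0-9]+/"))          # after a regexp match
print(field_start("id25x", "25", literal=True))   # "25" taken as a string
```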
146 
147 
148 List of field_length formats: You can specify the sort field length in
149    one of the following ways:
150 L
151    a positive integer, giving the length of the sort field in
152    characters.  If the sort unit is too short to hold a sort field of L
153    characters (i.e., if the number of characters from the first
154    character of the sort field to the end of the sort unit is less than
155    L), then the unit is sorted as if the field were extended on the
156    right with space characters to a length of L characters.
157    Alternatively, L can be -1 to indicate that the remainder of the sort
158    unit is to be used as the sort field.
159 -for L
160    where L is a positive integer giving the length of the sort field in
161    characters, or -1 to use the remainder of the sort unit as the sort
162    field.
163 
164 
165 -to E
166    where E is a positive integer giving the character position of the
167    end of the sort field in the sort unit (e.g., 5 if the field stops
168    after the fifth character of the sort unit).  If the sort unit
169    contains fewer than E characters, then the unit is sorted as if
170    space characters were added on the right to extend the unit to E
171    characters.
172 -to STR
173    where STR is a character string that identifies the end of the sort
174    field.  The field ends with the first character of the sort unit
175    preceding STR.  If STR does not appear in the sort unit after the
176    starting position of the sort field, then the unit is sorted as if
177    space characters appeared in the sort field.
178 
179 
180 -to /REGEXP/
181    where REGEXP is a regular expression that identifies the end of the
182    sort field.  The field ends with the first character of the sort
183    unit that precedes the part of the sort unit matching REGEXP (see
184    the qedx command).  If no match for REGEXP is found in the sort unit
185    after the starting position of the sort field, then the unit is
186    sorted as if space characters appeared in the sort field.
187 -to -string STR
188    treats STR as a character string that identifies the end of the sort
189    field, even though STR may look like an integer or a regular
190    expression.
191 
192 
193    Note that when you use -to to indicate the end of the field, then
194    sort_strings examines all sort units to determine the length of the
195    longest instance of this sort field in any sort unit; it then sorts
196    units as if the sort field in each unit were extended on the right
197    with space characters to the length of the longest sort field
198    instance.
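The padding rules for field lengths can be sketched in Python as
follows.  Both helpers are hypothetical illustrations: field_by_length
models a length L (including -1), and fields_to_string models -to STR
with the pad-to-longest step described in the note above.  Start
positions are 1-based, as in the field_start formats.

```python
def field_by_length(unit, start, length):
    """Extract a field of `length` characters beginning at 1-based
    position `start`; -1 means the remainder of the unit."""
    tail = unit[start - 1:]
    if length == -1:
        return tail
    # A short unit is treated as if extended on the right with spaces.
    return tail[:length].ljust(length)


def fields_to_string(units, start, stop_str):
    """Extract fields that end just before stop_str, then pad every
    field with spaces to the length of the longest instance."""
    raw = []
    for unit in units:
        tail = unit[start - 1:]
        end = tail.find(stop_str)
        # If stop_str is absent, the field is treated as all spaces.
        raw.append(tail[:end] if end >= 0 else "")
    width = max((len(f) for f in raw), default=0)
    return [f.ljust(width) for f in raw]


print(repr(field_by_length("abc", 2, 4)))
print(fields_to_string(["ab:1", "abcd:2", "xyz"], 1, ":"))
```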
199 
200 
201 List of sort_controls:  The sort controls may be one from each of the
202    following three sets of arguments; the arguments within each set are
203    incompatible with each other.  If you give none, then the default is
204    specified by the corresponding control argument.
205 ascending, asc
206    sorts units with this field in ascending order.
207 descending, dsc
208    sorts units with this field in descending order.
209 case_sensitive, cs
210    sorts units by treating uppercase letters in this field as being
211    different from lowercase letters.
212 non_case_sensitive, ncs
213    sorts units by translating this field to lowercase.
214 character, ch
215    sorts units with this field by the character representation.
216 
217 
218 integer, int
219    sorts units with this field by converting the character
220    representation to its integer value (fixed binary (71,0)).
221 numeric, num
222    sorts units with this field by converting the character
223    representation to its numeric value (float decimal (59)).
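A primary and a secondary field with different sort controls can be
modeled in Python by sorting once per field, least significant field
first, and relying on a stable sort.  This is one common technique
and is not claimed to be how sort_strings works internally.  The
function sort_by_fields and its tuple layout (start, length,
controls) are hypothetical, and only the short control names dsc,
ncs, and int are handled.

```python
def sort_by_fields(units, field_specs):
    """Sort units by several fixed-position fields.

    field_specs is a list of (start, length, controls) tuples, primary
    field first; start is 1-based and controls is a set of short names.
    """
    result = list(units)
    # Apply the least significant field first; each later stable pass
    # keeps the earlier ordering among units whose fields compare equal.
    for start, length, controls in reversed(field_specs):
        def key(unit, start=start, length=length, controls=controls):
            field = unit[start - 1:start - 1 + length].ljust(length)
            if "ncs" in controls:
                field = field.lower()
            return int(field) if "int" in controls else field
        result.sort(key=key, reverse="dsc" in controls)
    return result


people = ["bob 30", "amy 25", "cat 30", "dan 25"]
# Primary field: columns 5-6 as a descending integer.
# Secondary field: columns 1-3 in ascending character order.
print(sort_by_fields(people, [(5, 2, {"int", "dsc"}), (1, 3, set())]))
```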
224 
225 
226 Notes:  Using the control arguments, each string (or group of strings
227 if you supply -block) is treated as a separate sort unit.  These sort
228 units are then sorted, and the ordered units are printed or returned as
229 the active function return value.
230 
231 If you invoke sort_strings without any control arguments, -ascending,
232 -all, and -character are assumed.
233 
234 You can sort a maximum of 261,119 units.  The sort is stable; i.e.,
235 duplicate units appear in the same order in the sorted results as in
236 the original input.
237 
238 The input strings are sorted using temporary segments in the process
239 directory.
240 
241 
242 The determination of whether or not a sort unit is to be deleted (see
243 -unique) is independent of sort field specifications; i.e., given a
244 number of nonidentical sort units that contain identical sort fields,
245 all the units do appear in the sorted results.
246 
247 The following groups have control arguments that are mutually exclusive
248 with each other.  If you provide more than one from a group in a single
249 command, the last one given in the command overrides the others.
250    1. -all, -field
251    2. -ascending, -descending
252    3. -case_sensitive, -non_case_sensitive
253    4. -character, -integer, -numeric
254    5. -duplicates, -only_duplicates, -only_duplicate_keys,
255       -unique, -unique_keys.
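The "last one given overrides" rule can be expressed as a short
Python sketch.  It is a simplified model: the groups and defaults are
copied from the lists above, the function name resolve is
hypothetical, short names such as -dsc are not recognized, and the
operands of -field and -block are ignored.

```python
# Each tuple is one mutually exclusive group; the first entry in each
# group is the documented default.
GROUPS = [
    ("-all", "-field"),
    ("-ascending", "-descending"),
    ("-case_sensitive", "-non_case_sensitive"),
    ("-character", "-integer", "-numeric"),
    ("-duplicates", "-only_duplicates", "-only_duplicate_keys",
     "-unique", "-unique_keys"),
]


def resolve(args):
    """Return the control argument in effect for each group."""
    chosen = [group[0] for group in GROUPS]   # start from the defaults
    for arg in args:                          # scan in command-line order
        for i, group in enumerate(GROUPS):
            if arg in group:
                chosen[i] = arg               # a later argument overrides
    return chosen


print(resolve(["-descending", "-integer", "-ascending"]))
```

Here -ascending is given after -descending, so the results are
returned in ascending order, while -integer remains in effect because
nothing later in its group replaces it.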