<article data-sblg-article="1" data-sblg-tags="tutorial" itemscope="itemscope" itemtype="http://schema.org/BlogPosting">
	<header>
		<h2 itemprop="name">
			Using Pages
		</h2>
		<address itemprop="author">Ross Richardson</address>
		<time itemprop="datePublished" datetime="2017-09-20">20 September, 2017</time>
	</header>
	<p>
		<strong>Thanks to Ross Richardson's fine work in contributing this tutorial!</strong>
	</p>
	<p>
		To make common cases convenient, <span class="nm">kcgi</span> provides functionality for dealing with
		the <abbr>CGI</abbr> meta variable <code>PATH_INFO</code>.
		For example, if <span class="file">/cgi-bin/foo</span> is the CGI script, invoking <span
			class="file">/cgi-bin/foo/bar/baz</span> will pass <span class="file">/bar/baz</span> as additional information.
		Many CGI scripts use this functionality for <q>URL normalisation</q>, that is, pushing query-string variables into the path.
	</p>
	<p>
		This tutorial describes an example CGI script that implements a news site devoted to some particular topic.
		The default document shows an index page, and there are sections for particular relevant areas.
		In each of these, the trailing slash may be included or omitted.
		I assume that your script is available at <span class="file">/cgi-bin/news</span>.
	</p>
	<dl>
		<dt><span class="file">/cgi-bin/news</span>, <span class="file">/cgi-bin/news/index</span></dt>
		<dd>main index</dd>
		<dt><span class="file">/cgi-bin/news/about/</span></dt>
		<dd>about the site</dd>
		<dt><span class="file">/cgi-bin/news/archive/</span></dt>
		<dd>archive of old articles</dd>
		<dt><span class="file">/cgi-bin/news/archive/<var>yyyy</var></span></dt>
		<dd>archive/index of articles for year <var>yyyy</var></dd>
		<dt><span class="file">/cgi-bin/news/archive/<var>yyyy</var>/<var>mm</var></span></dt>
		<dd>archive/index of articles for month <var>mm</var> of year <var>yyyy</var></dd>
		<dt><span class="file">/cgi-bin/news/archive/<var>yyyy</var>/<var>mm</var>/<var>dd</var></span></dt>
		<dd>archive/index of articles for date <var>yyyy</var>-<var>mm</var>-<var>dd</var></dd>
		<dt><span class="file">/cgi-bin/news/random</span></dt>
		<dd>a random article</dd>
		<dt><span class="file">/cgi-bin/news/tag/<var>subj</var></span></dt>
		<dd>articles tagged with &quot;<var>subj</var>&quot;</dd>
	</dl>
	<p>
		<aside itemprop="about">
			The tutorial gives an overview of the basic path handling provided by <span class="nm">kcgi</span>, and then shows and discusses
			relevant code snippets.
		</aside>
	</p>
	<h3>
		Basic Handling
	</h3>
	<p>
		Assuming a call to <a href="khttp_parse.3.html">khttp_parse(3)</a> returns <code>KCGI_OK</code>, the relevant fields of the
		<code>struct kreq</code> are:
	</p>
	<dl>
		<dt><code>fullpath</code></dt>
		<dd>the value of the <abbr>CGI</abbr> meta variable <code>PATH_INFO</code> (which may be the empty string)</dd>
		<dt><code>pagename</code></dt>
		<dd>the substring of <code>PATH_INFO</code> from after the initial '/' to (but excluding) the next '/', or to the end-of-string
			(or the empty string if no such substring exists)</dd>
		<dt><code>page</code></dt>
		<dd>
			<ul>
				<li>if <code>pagename</code> is the empty string, the <code>defpage</code> parameter passed to
					<a href="khttp_parse.3.html">khttp_parse(3)</a> (that is, the index corresponding to the default page)</li>
				<li>if <code>pagename</code> matches one of the strings in the <code>pages</code> parameter passed to
					<a href="khttp_parse.3.html">khttp_parse(3)</a>, the index of that string</li>
				<li>if <code>pagename</code> does not match any of the strings in <code>pages</code>, the <code>pagesz</code>
					parameter passed to <a href="khttp_parse.3.html">khttp_parse(3)</a></li>
			</ul>
		</dd>
		<dt><code>path</code></dt>
		<dd>the middle part of <code>PATH_INFO</code> after stripping <code>pagename/</code> at the beginning and <code>.suffix</code>
			at the end</dd>
	</dl>
	<p>
		In addition, the field <code>pname</code> contains the value of the <abbr>CGI</abbr> meta variable <code>SCRIPT_NAME</code>.
	</p>
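	<p>
		To make the field descriptions above concrete, here is a minimal stand-alone sketch of the same split.
		It is not <span class="nm">kcgi</span>'s own code: <code>split_path_info()</code> is a hypothetical helper, and I am assuming
		that the suffix is whatever follows the last '.' in the final path component.
	</p>

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kcgi): split a PATH_INFO value into
 * the page name, the remaining path and the suffix, following the field
 * descriptions above.  Each output buffer holds at most "sz" bytes
 * including the terminating NUL; longer components are truncated.
 */
static void
split_path_info(const char *info, char *pagename, char *path,
    char *suffix, size_t sz)
{
  const char *slash, *dot, *end;

  pagename[0] = path[0] = suffix[0] = '\0';

  /* Skip the initial '/', if any. */
  if (info[0] == '/')
    info++;

  /* Assumption: the suffix follows the last '.' of the final component. */
  end = info + strlen(info);
  slash = strrchr(info, '/');
  dot = strrchr(slash != NULL ? slash : info, '.');
  if (dot != NULL) {
    snprintf(suffix, sz, "%s", dot + 1);
    end = dot;
  }

  /* The page name runs up to the next '/' or the end of the string. */
  slash = memchr(info, '/', (size_t)(end - info));
  if (slash == NULL) {
    snprintf(pagename, sz, "%.*s", (int)(end - info), info);
    return;
  }
  snprintf(pagename, sz, "%.*s", (int)(slash - info), info);

  /* Whatever lies between "pagename/" and ".suffix" is the path. */
  snprintf(path, sz, "%.*s", (int)(end - slash - 1), slash + 1);
}
```

	<p>
		With this sketch, <span class="file">/archive/2017/09</span> yields the page name <q>archive</q> and the path <q>2017/09</q>,
		while an empty <code>PATH_INFO</code> yields an empty page name, which is the case where the default page applies.
	</p>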
80
	<h3>
81
		Source Code
82
	</h3>
83
	<p>
84
		Here we look only at the code snippets not covered by the earlier tutorials.
85
		Firstly, we define some values corresponding with the subsections of the site.
86
	</p>
87
	<figure class="sample">
88
		<pre class="prettyprint linenums">enum pg {
89
  PG_INDEX,
90
  PG_ABOUT,
91
  PG_ARCHIVE,
92
  PG_RANDOM,
93
  PG_TAG,
94
  PG__MAX
95
};</pre>
96
	</figure>
97
	<p>
98
		Next, we define the path strings corresponding with the enumeration values
99
	</p>
100
	<figure class="sample">
101
		<pre class="prettyprint linenums">static const char *pages[PG__MAX] = {
102
  &quot;index&quot;,
103
  &quot;about&quot;,
104
  &quot;archive&quot;,
105
  &quot;random&quot;,
106
  &quot;tag&quot;
107
};</pre>
108
	</figure>
109
	<p>
110
		We then define a constant bitmap corresponding with those <code>enum pg</code> values for which no extra path information should
111
		be present in the <abbr>HTTP</abbr> request.
112
		This will be used for sanity-checking the request.
113
	</p>
114
	<figure class="sample">
115
		<pre class="prettyprint linenums">const size_t pg_no_extra_permitted =
116
  ((1 &lt;&lt; PG_INDEX) | 
117
   (1 &lt;&lt; PG_ABOUT) | 
118
   (1 &lt;&lt; PG_RANDOM));</pre>
119
	</figure>
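	<p>
		The sanity check that this bitmap enables can be sketched in isolation.
		The helper name <code>extra_path_forbidden()</code> is hypothetical, and the enumeration and bitmap are repeated only so that
		the snippet compiles on its own.
	</p>

```c
#include <stddef.h>

enum pg {
  PG_INDEX,
  PG_ABOUT,
  PG_ARCHIVE,
  PG_RANDOM,
  PG_TAG,
  PG__MAX
};

/* Pages that must not be followed by extra path information. */
static const size_t pg_no_extra_permitted =
  (((size_t)1 << PG_INDEX) |
   ((size_t)1 << PG_ABOUT) |
   ((size_t)1 << PG_RANDOM));

/*
 * Hypothetical helper: return 1 if the request should be rejected
 * because "path" is non-empty for a page that takes no extra path,
 * and 0 otherwise.
 */
static int
extra_path_forbidden(size_t page, const char *path)
{
  if (page >= PG__MAX || path == NULL || path[0] == '\0')
    return 0;
  return (((size_t)1 << page) & pg_no_extra_permitted) != 0;
}
```

	<p>
		For instance, <span class="file">/cgi-bin/news/about/team</span> would be rejected, since <code>PG_ABOUT</code> is in the
		bitmap, whereas <span class="file">/cgi-bin/news/archive/2017</span> would pass this check.
	</p>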
	<p>
		Next, we define a type for dates, a constant for the earliest valid year, and functions for parsing a string specifying a date.
		(We use year zero to indicate an invalid specification, and month/day zero to indicate that a month/day value was not specified.)
	</p>
	<p>
		<strong>Editor's note</strong>: remember that <a href="https://man.openbsd.org/strptime">strptime(3)</a> and friends may not be
		available within a file-system sandbox due to time-zone access, so we need to find another way.
	</p>
	<figure class="sample">
		<pre class="prettyprint linenums">/* Note: strtonum(3) is a BSD extension declared in &lt;stdlib.h&gt;. */

struct adate {
  unsigned int year; /* 0 if invalid */
  unsigned int month; /* 0 if not specified */
  unsigned int day; /* 0 if not specified */
};

const unsigned int archive_first_yr = 1995;

static unsigned int
current_year(void)
{
  struct tm *t;
  time_t now;

  if ((now = time(NULL)) == (time_t)-1 ||
      (t = gmtime(&amp;now)) == NULL)
    exit(EXIT_FAILURE);

  return t-&gt;tm_year + 1900;
} /* current_year */

static unsigned int
month_length(unsigned int y, unsigned int m)
{
  unsigned int len;

  switch (m) {
    case 2:
      /* Gregorian leap-year rule. */
      if (y % 4 == 0 &amp;&amp; (y % 100 != 0 || y % 400 == 0))
        len = 29;
      else
        len = 28;
      break;
    case 1:
    case 3:
    case 5:
    case 7:
    case 8:
    case 10:
    case 12:
      len = 31;
      break;
    case 4:
    case 6:
    case 9:
    case 11:
      len = 30;
      break;
    default:
      exit(EXIT_FAILURE);
  }
  return len;
} /* month_length */

static void
str_to_adate(const char *s, char sep, struct adate *d)
{
  long long val;
  char *t, *a, *b;
  size_t i;

  /* Set error/default state until proven otherwise. */
  d-&gt;year = 0;
  d-&gt;month = 0;
  d-&gt;day = 0;

  i = 0;
  while (isdigit((unsigned char)s[i]) || s[i] == sep)
    i++;

  if (i &gt; 0 &amp;&amp; s[i] == '\0') {
    /* s consists of digits and sep characters only. */
    /* Make a copy with which it is safe to tamper. */
    t = kstrdup(s);
    a = t;
    if ((b = strchr(a, sep)) != NULL)
      *b = '\0';
    /* strtonum() returns 0 on error, which is below every minimum here. */
    val = strtonum(a, archive_first_yr, current_year(), NULL);
    if (val != 0) {
      /* Year is OK. */
      d-&gt;year = val;
      if (b != NULL &amp;&amp; b[1] != '\0') {
        /* Move on to month. */
        a = &amp;b[1];
        if ((b = strchr(a, sep)) != NULL)
          *b = '\0';
        val = strtonum(a, 1, 12, NULL);
        if (val == 0) {
          d-&gt;year = 0;
        } else {
          d-&gt;month = val;
          if (b != NULL &amp;&amp; b[1] != '\0') {
            /* Move on to day. */
            a = &amp;b[1];
            if ((b = strchr(a, sep)) != NULL)
              *b = '\0';
            /* Anything after the day, or a bad day, is an error. */
            if ((b != NULL &amp;&amp; b[1] != '\0') ||
                (val = strtonum(a, 1,
                 month_length(d-&gt;year, d-&gt;month), NULL)) == 0) {
              d-&gt;year  = 0;
              d-&gt;month = 0;
            } else {
              d-&gt;day   = val;
            }
          }
        }
      }
    }
    free(t);
  }
} /* str_to_adate */</pre>
	</figure>
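	<p>
		Since <code>strtonum(3)</code> is not available everywhere, here is a rough portable sketch of the same year/month/day parsing
		using only standard C.
		It is meant to follow the rules of the code above but is not a drop-in replacement: <code>parse_adate()</code>,
		<code>read_field()</code> and <code>days_in_month()</code> are hypothetical names, and the valid year range is passed in by
		the caller (assumed to start at 1 or later) instead of being computed from the clock.
	</p>

```c
#include <ctype.h>
#include <stddef.h>

struct adate {
  unsigned int year;  /* 0 if invalid */
  unsigned int month; /* 0 if not specified */
  unsigned int day;   /* 0 if not specified */
};

/* Days in month m (1-12) of year y, or 0 if m is out of range. */
static unsigned int
days_in_month(unsigned int y, unsigned int m)
{
  static const unsigned int len[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

  if (m < 1 || m > 12)
    return 0;
  if (m == 2 && y % 4 == 0 && (y % 100 != 0 || y % 400 == 0))
    return 29;
  return len[m - 1];
}

/*
 * Read a run of decimal digits at *s into *out and advance *s past it.
 * Return 0 if there are no digits, more than nine digits, or the value
 * lies outside [lo, hi]; return 1 otherwise.
 */
static int
read_field(const char **s, unsigned int lo, unsigned int hi,
    unsigned int *out)
{
  unsigned int v = 0;
  size_t n = 0;

  while (isdigit((unsigned char)(*s)[n])) {
    if (n >= 9)
      return 0; /* Too long: refuse rather than overflow. */
    v = v * 10 + (unsigned int)((*s)[n] - '0');
    n++;
  }
  *s += n;
  if (n == 0 || v < lo || v > hi)
    return 0;
  *out = v;
  return 1;
}

/*
 * Parse "yyyy", "yyyy<sep>mm" or "yyyy<sep>mm<sep>dd"; one trailing
 * separator is tolerated.  On any error d->year is left at 0.
 */
static void
parse_adate(const char *s, char sep, unsigned int first_yr,
    unsigned int last_yr, struct adate *d)
{
  struct adate t = { 0, 0, 0 };

  d->year = d->month = d->day = 0;

  if (!read_field(&s, first_yr, last_yr, &t.year))
    return;
  if (*s == sep && s[1] != '\0') {
    s++;
    if (!read_field(&s, 1, 12, &t.month))
      return;
    if (*s == sep && s[1] != '\0') {
      s++;
      if (!read_field(&s, 1, days_in_month(t.year, t.month), &t.day))
        return;
    }
  }
  /* Allow one trailing separator, then require the end of the string. */
  if (*s == sep)
    s++;
  if (*s == '\0')
    *d = t;
}
```

	<p>
		Because it touches neither the clock nor the time-zone database, a parser of this shape should also be usable from inside a
		sandbox, which is the concern raised in the editor's note.
	</p>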
	<p>
		Now, we consider the basic handling of the request.
	</p>
	<figure class="sample">
		<pre class="prettyprint linenums">int
main(void) {
  struct kreq r;
  struct adate ad;

  if (khttp_parse(&amp;r, NULL, 0,
      pages, PG__MAX, PG_INDEX) != KCGI_OK)
    return EXIT_FAILURE; /* abort */

  if (r.mime != KMIME_TEXT_HTML) {
    handle_err(&amp;r, KHTTP_404);
  } else if (r.method != KMETHOD_GET &amp;&amp;
             r.method != KMETHOD_HEAD) {
    handle_err(&amp;r, KHTTP_405);
  } else if (r.page == PG__MAX ||
            (r.path[0] != '\0' &amp;&amp;
             ((1 &lt;&lt; r.page) &amp; pg_no_extra_permitted))) {
    /* Unknown page, or extra path where none is permitted. */
    handle_err(&amp;r, KHTTP_404);
  } else {
    switch (r.page) {
      case PG_INDEX :
        handle_index(&amp;r);
        break;
      case PG_ABOUT :
        handle_about(&amp;r);
        break;
      case PG_ARCHIVE :
        if (r.path != NULL &amp;&amp; r.path[0] != '\0') {
          str_to_adate(r.path, '/', &amp;ad);
          if (ad.year != 0) {
            handle_archive(&amp;r, &amp;ad);
          } else {
            handle_err(&amp;r, KHTTP_404);
          }
        } else {
          /* Not specified at all. */
          handle_archive(&amp;r, NULL);
        }
        break;
      case PG_RANDOM :
        handle_random(&amp;r);
        break;
      case PG_TAG :
        handle_tag(&amp;r, r.path);
        break;
      default :
        /* shouldn't happen */
        handle_err(&amp;r, KHTTP_500);
        break;
    }
  }
  khttp_free(&amp;r);
  return EXIT_SUCCESS;
}</pre>
	</figure>
	<p>
		Suppose we now decide that we wish to fall back to looking for a date specification (with '-' separators rather than '/') in the
		query string if none is specified in the path.
		This is as simple as adding the required definition&#x2026;
	</p>
	<figure class="sample">
		<pre class="prettyprint linenums">enum key {
  KEY_ADATE,
  KEY__MAX
};</pre>
	</figure>
	<p>
		&#x2026;and adding a validator function&#x2026;
	</p>
	<figure class="sample">
		<pre class="prettyprint linenums">static int
valid_adate(struct kpair *kp)
{
  struct adate ad;
  int ok;

  /* Invalid until proven otherwise. */
  ok = 0;

  if (kvalid_stringne(kp)) {
    str_to_adate(kp-&gt;val, '-', &amp;ad);
    if (ad.year != 0) {
      /* We have a valid specification. */
      kp-&gt;type = KPAIR__MAX; /* Not a simple type. */
      kp-&gt;valsz = sizeof(ad);
      kp-&gt;val   = kmalloc(kp-&gt;valsz);
      ((struct adate *)kp-&gt;val)-&gt;year  = ad.year;
      ((struct adate *)kp-&gt;val)-&gt;month = ad.month;
      ((struct adate *)kp-&gt;val)-&gt;day   = ad.day;
      ok = 1;
    }
  }
  return ok;
} /* valid_adate */

static const struct kvalid keys[KEY__MAX] = {
  { valid_adate, &quot;adate&quot; }  /* KEY_ADATE */
};</pre>
	</figure>
	<p>
		(Note that the same date parsing function, <kbd>str_to_adate()</kbd>, is used, but in this case it is wrapped in a validator
		function and thus executes in the sandboxed environment.)
	</p>
	<p>
		&#x2026;and, in <kbd>main()</kbd>, modifying the call to <a href="khttp_parse.3.html">khttp_parse(3)</a>&#x2026;
	</p>
	<figure class="sample">
		<pre class="prettyprint linenums">if (khttp_parse(&amp;r, keys, KEY__MAX,
      pages, PG__MAX, PG_INDEX) != KCGI_OK) {
  khttp_free(&amp;r);
  return EXIT_FAILURE; /* abort */
}</pre>
	</figure>
	<p>
		&#x2026;and handling of the <kbd>PG_ARCHIVE</kbd> case&#x2026;
	</p>
	<figure class="sample">
		<pre class="prettyprint linenums">case PG_ARCHIVE :
  if (r.path != NULL &amp;&amp; r.path[0] != '\0') {
    str_to_adate(r.path, '/', &amp;ad);
    if (ad.year != 0)
      handle_archive(&amp;r, &amp;ad);
    else
      handle_err(&amp;r, KHTTP_404);
  } else if (r.fieldmap[KEY_ADATE] != NULL) {
    /* Fall back to the validated field. */
    handle_archive(&amp;r, (struct adate *)r.fieldmap[KEY_ADATE]-&gt;val);
  } else if (r.fieldnmap[KEY_ADATE] != NULL) {
    /* Field is present but invalid. */
    handle_err(&amp;r, KHTTP_404);
  } else {
    /* Not specified at all. */
    handle_archive(&amp;r, NULL);
  }
  break;</pre>
	</figure>
	<p>
		Whilst some specifications are naturally suited to the use of path information (for example, dates, file system hierarchies, and
		timezones), others are a less natural fit.
		Suppose, in our example, that we want to be able to specify a date and a tag <em>at the same time</em>.
		This could be achieved by extending the behaviour of the <kbd>archive</kbd> or <kbd>tag</kbd> &quot;page&quot;, but does not fit
		comfortably with either.
		In general, use of query string <kbd>keys</kbd> is preferred over <kbd>pages</kbd> because the former:
	</p>
	<ul>
		<li><strong>involve parsing/validation in a sandboxed environment</strong></li>
		<li>allow for greater flexibility</li>
	</ul>
	<p>
		<strong>Editor's note</strong>: Ross makes a good case
		for putting some sort of handling facility for URLs into
		the protected child process.
		For example, we could pass a string into <a href="khttp_parse.3.html">khttp_parsex(3)</a> that would define a template for
		splitting the path into arguments.
		A template such as <q>/@@0@@/@@1@@/@@2@@</q> might match a pathname such as <q>/foo/bar/baz</q>, with each component validated
		as a query argument.
	</p>
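	<p>
		The template idea in the editor's note is only a proposal, so nothing below reflects an actual <span class="nm">kcgi</span>
		interface.
		As a rough illustration of the splitting step such a facility would need, here is a sketch with a hypothetical
		<code>split_components()</code> helper and an arbitrary limit of <code>MAXCOMP</code> components; each resulting string could
		then be handed to a validator in the same way as a query argument.
	</p>

```c
#include <string.h>

#define MAXCOMP 8 /* Arbitrary limit chosen for this sketch. */

/*
 * Hypothetical helper: split "path" in place at each '/' and store
 * pointers to the components in "comp".  Leading slashes are skipped;
 * an empty component between two slashes is kept as "".
 * Returns the number of components, or -1 if there are more than
 * MAXCOMP of them.
 */
static int
split_components(char *path, char *comp[MAXCOMP])
{
  char *p = path;
  int n = 0;

  while (*p == '/')
    p++;

  while (*p != '\0') {
    if (n == MAXCOMP)
      return -1;
    comp[n++] = p;
    p += strcspn(p, "/"); /* Advance to the next '/' or the end. */
    if (*p == '/')
      *p++ = '\0';        /* Terminate this component. */
  }
  return n;
}
```

	<p>
		Applied to a writable copy of <q>/foo/bar/baz</q>, this yields the three components <q>foo</q>, <q>bar</q> and <q>baz</q>,
		which would correspond to slots 0, 1 and 2 of the template.
	</p>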
</article>