Specifying the markup characters from within the input string itself #6
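I found one more thing that could become a problem.
The command-line options forbid setting more than one meta character to the same character,
but the meta-character change sequences inside the input string do not.
If the input can no longer be parsed correctly because of this, that is the responsibility of the input data,
but for now I will try prohibiting it at the point where the lexer parses a meta-character change sequence.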
int yylex(void)
{
    ...snip
    if (c == current_start_mark_char) {
        ...snip
        if (c2 == current_start_mark_char) {
            ...snip
            if (c3 == current_end_mark_char || c3 == current_start_annotation_char) {
                fputs("error: meta characters must differ from each other anytime\n", stderr);
                exit(1);
            }
            current_start_mark_char = c3;
        } else if (c2 == current_end_mark_char) {
            ...snip
            if (c3 == current_start_mark_char || c3 == current_start_annotation_char) {
                fputs("error: meta characters must differ from each other anytime\n", stderr);
                exit(1);
            }
            current_end_mark_char = c3;
        } else if (c2 == current_start_annotation_char) {
            ...snip
            if (c3 == current_start_mark_char || c3 == current_end_mark_char) {
                fputs("error: meta characters must differ from each other anytime\n", stderr);
                exit(1);
            }
            current_start_annotation_char = c3;
        } else {
            ...snip
        }
    ...snip
    while (1) {
        ...snip
        if (c == current_start_mark_char) {
            ...snip
            if (c2 == current_start_mark_char) {
                ...snip
                if (c3 == current_end_mark_char || c3 == current_start_annotation_char) {
                    fputs("error: meta characters must differ from each other anytime\n", stderr);
                    exit(1);
                }
                current_start_mark_char = c3;
            } else if (c2 == current_end_mark_char) {
                ...snip
                if (c3 == current_start_mark_char || c3 == current_start_annotation_char) {
                    fputs("error: meta characters must differ from each other anytime\n", stderr);
                    exit(1);
                }
                current_end_mark_char = c3;
            } else if (c2 == current_start_annotation_char) {
                ...snip
                if (c3 == current_start_mark_char || c3 == current_end_mark_char) {
                    fputs("error: meta characters must differ from each other anytime\n", stderr);
                    exit(1);
                }
                current_start_annotation_char = c3;
            } else {
                ...snip
            }
        ...snip
    }
}
This is not compact at all and is hard to read, but

$ echo -n "(()x" | ./parser
error: meta characters must differ from each other anytime

the error is now detected at the lexing stage.
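Since the same three-way comparison is repeated six times in the lexer, one way to make it more compact would be to move the check into a helper. The following is only a sketch under assumptions, not the code actually used here: the `meta_kind` enum, the array that holds the three meta characters, and the function `set_meta_char` are all hypothetical names introduced for illustration, whereas the original code keeps the characters in the separate variables `current_start_mark_char`, `current_end_mark_char`, and `current_start_annotation_char`.

```c
#include <stdio.h>

/* Hypothetical indices for the three meta characters.
   The original lexer stores them in three separate variables instead. */
enum meta_kind {
    META_START_MARK,
    META_END_MARK,
    META_START_ANNOTATION,
    META_KIND_COUNT
};

/* Try to change the meta character selected by `kind` to `c`.
   Returns 0 and updates chars[kind] when `c` differs from every
   OTHER meta character; returns -1 and leaves `chars` untouched
   when `c` collides with one of them.
   Setting a meta character to its own current value is allowed,
   which matches the comparisons in the original code. */
int set_meta_char(int chars[META_KIND_COUNT], enum meta_kind kind, int c)
{
    for (int i = 0; i < META_KIND_COUNT; i++) {
        if (i != (int)kind && chars[i] == c) {
            return -1;
        }
    }
    chars[kind] = c;
    return 0;
}

/* Report the collision with the same message the lexer prints.
   The caller decides whether to exit afterwards. */
void report_meta_char_collision(FILE *out)
{
    fputs("error: meta characters must differ from each other anytime\n", out);
}
```

With a helper like this, each branch of the change-sequence handling could shrink to one call to `set_meta_char` followed by `report_meta_char_collision(stderr)` and `exit(1)` when it returns -1. Returning an error code instead of calling `exit` inside the helper is a design choice made here so that the check can be tested on its own.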