// Fraunhofer Institute for Open Communication Systems (FOKUS)
// Competence Center NETwork research (NET), St. Augustin, GERMANY
//     Stefan Bund <g0dil@berlios.de>
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation; either version 2 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the
// Free Software Foundation, Inc.,
// 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

/** \file
    \brief Parse public header */

#ifndef HH_SENF_Scheduler_Console_Parse_
#define HH_SENF_Scheduler_Console_Parse_ 1

/** \defgroup console_parser The parser

The console/config library defines a simple language used to interact with the console or to
configure the application. The parser is not concerned with interpreting commands or
arguments, checking that a command exists, or managing directories. The parser just takes the
input and parses it.

\section console_language The Language

The config/console language is used in configuration files and interactively at the
console. Some features of the language are more useful in config files, others at the
interactive console, but the language is the same in both cases.

Let's start with a sample of the config/console language. The following is written as a
configuration file:

# My someserver configuration file

accept senf::log::Debug IMPORTANT;
accept server::ServerLog CRITICAL;

provide serverlog senf::log::FileTarget "/var/log/server.log";

reject senf::log::Debug senf::Console::Server NOTICE;
accept senf::log::Debug NOTICE;
accept server::ServerLog;

/server/stuffing (UDPPacket x"01 02 03 04");
/server/allow_hosts 10.1.2.3               # our internal server
                    10.2.3.4 10.4.3.5      # client workstations

/help/infoUrl "http://senf.j32.de/src/doc";

The interactive syntax is the same, with a few differences:
\li All commands must be complete on a single line. This includes grouping constructs, which must
    be closed on the same line on which they are opened.
\li The last ';' is optional. However, multiple commands may be entered on a single line when
    they are separated by ';'.
\li An empty line on the interactive console will repeat the last command.

The language consists of a small number of syntactic entities:

\subsection console_special_chars Special characters

These characters have a special meaning. Some are used internally, others are just
returned as punctuation tokens.

<table>
<tr><td>#</td><td>Comments are marked with '#' and continue to the end of the line</td></tr>
<tr><td>/</td><td>path component separator</td></tr>
<tr><td>( )</td><td>argument grouping</td></tr>
<tr><td>{ }</td><td>directory grouping</td></tr>
<tr><td>;</td><td>command terminator</td></tr>
<tr><td>, =</td><td>punctuation tokens</td></tr>
</table>

\subsection console_basic Basic elements

A <b>word</b> is \e any sequence of consecutive characters which does not include any special
character. Examples of words are:

jens@fokus.fraunhofer.de

The following are \e not valid words:

A <b>string literal</b> is just that: a double-quoted string (C/C++ style), possibly with
embedded escape characters:

A <b>hex-string literal</b> is used to represent binary data. It looks like a string which has
only hexadecimal bytes or whitespace as contents (comments and newlines are OK when not read
from the interactive console).

A <b>token</b> is a \e word, \e string or \e hex-string, or a single special character (that's
true, any special character is allowed as a token). '(' and ')' must be properly nested.

A <b>path</b> is a sequence of \e words separated by '/' (and optional whitespace). A path may
have an optional initial and/or a terminating '/'.
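
As a rough illustration of how such a path breaks down into elements, consider the sketch
below. It follows the convention documented for ParseCommandInfo::commandPath(): an empty
first element marks an absolute path, an empty last element a trailing '/'. The names
splitPath and isAbsolutePath are hypothetical helpers, not part of senf::console, and the
sketch ignores the optional whitespace around '/'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of senf::console): split a path at each '/'.
// A leading '/' yields an empty first element, a trailing '/' an empty last one.
std::vector<std::string> splitPath(std::string const & path)
{
    std::vector<std::string> elements;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = path.find('/', start);
        if (end == std::string::npos) {
            // No further separator: the rest of the string is the last element.
            elements.push_back(path.substr(start));
            break;
        }
        elements.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return elements;
}

// Hypothetical helper: a path is absolute if its first element is empty
// and at least one more element follows.
bool isAbsolutePath(std::vector<std::string> const & elements)
{
    return elements.size() > 1 && elements.front().empty();
}
```

With this sketch, a path such as /server/port splits into three elements (the first one
empty), while server/port splits into two and is treated as relative.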

\subsection console_statements Statements

There are several types of statements:
\li The bulk of all statements are \e path statements.
\li There are some \e built-in statements which are mostly useful at the interactive console.
\li A special form of statement is the <em>directory group</em>.

A <b>path</b> statement consists of a (possibly relative) path followed by any number of
arguments and terminated with a ';' (or end-of-input):

/path/to/command arg1 "arg2" (complex=(1 2) another) ;

Every argument is either
\li a single word, string or hex-string
\li or a parenthesized list of tokens.

So the above command has three arguments: 'arg1', 'arg2' (a single token each) and one argument with
the 7 tokens 'complex', '=', '(', '1', '2', ')', 'another'. The interpretation of the arguments
is completely up to the command.
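
The grouping rule can be sketched in a few lines. This is an illustration only, not the
library's implementation: it operates on plain strings instead of senf::console::Token
instances, and groupArguments is a hypothetical name. A top-level token forms its own
argument; a parenthesized group forms one argument holding the tokens between the outer
parentheses, nested parentheses included.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<std::string> TokenList;

// Hypothetical sketch: group a flat token list into arguments.
std::vector<TokenList> groupArguments(TokenList const & tokens)
{
    std::vector<TokenList> arguments;
    std::size_t i = 0;
    while (i < tokens.size()) {
        TokenList argument;
        if (tokens[i] == "(") {
            // Collect everything up to the matching ')'. Nested parentheses
            // are kept as tokens, the outer pair is dropped.
            unsigned depth = 1;
            ++i;
            while (i < tokens.size()) {
                if (tokens[i] == "(") ++depth;
                else if (tokens[i] == ")" && --depth == 0) break;
                argument.push_back(tokens[i]);
                ++i;
            }
            if (depth != 0)
                throw std::runtime_error("unbalanced parentheses");
            ++i; // skip the closing ')'
        }
        else if (tokens[i] == ")") {
            throw std::runtime_error("unbalanced parentheses");
        }
        else {
            // A single word, string or hex-string is an argument of its own.
            argument.push_back(tokens[i]);
            ++i;
        }
        arguments.push_back(argument);
    }
    return arguments;
}
```

Applied to the tokens of the example statement, the sketch yields three arguments, the
last of which holds the seven tokens listed above.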

A <b>built-in</b> statement is one of

<table>
<tr><td>\c cd \e path</td><td>Change current directory</td></tr>
<tr><td>\c ls [ \e path ]</td><td>List contents of \e path or current directory</td></tr>
<tr><td>\c exit</td><td>Exit interactive console</td></tr>
<tr><td>\c help [ \e path ]</td><td>Show help for \e path or current directory</td></tr>
</table>

A <b>directory group</b> statement is a block of statements all executed relative to a fixed
directory.

At the beginning of the block, the current directory is saved and the directory is changed to
the given directory. All commands are executed and at the end of the block, the saved directory
is restored.

\section console_parse_api The parser API

The senf::console::CommandParser is responsible for taking text input and turning it into a
sequence of senf::console::ParseCommandInfo structures. The structures are returned by passing
them successively to a callback function.

Every statement is returned as a senf::console::ParseCommandInfo instance. Directory groups are
handled specially: They are divided into two special built-in commands called PUSHD and POPD.
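
One way a consumer of these commands might handle PUSHD and POPD is an explicit stack of
saved directories, mirroring the save and restore behaviour described for directory groups.
The sketch below is self-contained and does not use the senf::console classes: Command, Kind
and DirectoryTracker are hypothetical stand-ins for ParseCommandInfo and for whatever
executor receives the callbacks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed statement.
enum Kind { PushD, PopD, Other };
struct Command { Kind kind; std::string path; };

// Hypothetical callback object tracking the current directory.
class DirectoryTracker
{
public:
    DirectoryTracker() : cwd_("/") {}

    void operator()(Command const & cmd)
    {
        switch (cmd.kind) {
        case PushD:
            saved_.push_back(cwd_);     // save the current directory
            cwd_ = resolve(cmd.path);   // enter the directory of the group
            break;
        case PopD:
            if (! saved_.empty()) {     // restore the saved directory
                cwd_ = saved_.back();
                saved_.pop_back();
            }
            break;
        case Other:
            break;                      // ordinary commands leave cwd_ alone
        }
    }

    std::string const & cwd() const { return cwd_; }

private:
    std::string resolve(std::string const & path) const
    {
        if (! path.empty() && path[0] == '/')
            return path;                // absolute path
        return cwd_ == "/" ? "/" + path : cwd_ + "/" + path;
    }

    std::string cwd_;
    std::vector<std::string> saved_;
};
```

Because every PUSHD is matched by a POPD at the end of its block, nested directory groups
unwind in the reverse order in which they were entered.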
 */

#include <boost/utility.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/iterator/iterator_facade.hpp>
#include <boost/function.hpp>
#include "../../Utils/safe_bool.hh"
#include "../../Utils/Exception.hh"

//#include "Parse.mpp"
///////////////////////////////hh.p////////////////////////////////////////

namespace detail { class FilePositionWithIndex; }

namespace detail { struct ParserAccess; }

/** \brief Single argument token

All command arguments are split into tokens by the parser. Each token is returned as an
instance of this class.

\ingroup console_parser
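
Token types are bit flags, so a whole set of acceptable types can be tested with a single
mask, which is what Token::is() is documented to do. The following stand-alone sketch shows
the idea. TokenTypeSketch and matches are hypothetical names; only the three flag values are
taken from this header, and the assumption that a type matches when it shares at least one
bit with the mask is ours, not a statement about the actual implementation.

```cpp
#include <cassert>

// Hypothetical reduced copy of the flag layout. The first three values are
// taken from Token::TokenType, all other token types are omitted.
enum TokenTypeSketch {
    PathSeparator      = 0x0001,
    ArgumentGroupOpen  = 0x0002,
    ArgumentGroupClose = 0x0004,
    ArgumentGrouper    = ArgumentGroupOpen | ArgumentGroupClose
};

// Assumed semantics: a type matches a mask if both share at least one set bit.
inline bool matches(unsigned type, unsigned mask)
{
    return (type & mask) != 0;
}
```

A combined mask such as ArgumentGrouper then answers "is this either kind of parenthesis?"
with one comparison.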
 */

    PathSeparator       = 0x0001, // '/'
    ArgumentGroupOpen   = 0x0002, // '('
    ArgumentGroupClose  = 0x0004, // ')'
    DirectoryGroupOpen  = 0x0008, // '{'
    DirectoryGroupClose = 0x0010, // '}'
    CommandTerminator   = 0x0020, // ';'
    OtherPunctuation    = 0x0040,
    BasicString         = 0x0080,

    ArgumentGrouper     = ArgumentGroupOpen
                        | ArgumentGroupClose,

    DirectoryGrouper    = DirectoryGroupOpen
                        | DirectoryGroupClose,

    Punctuation         = DirectoryGroupOpen
                        | DirectoryGroupClose

    SimpleArgument      = Word

Token();                            ///< Create empty token
Token(TokenType type, std::string token);
                                    ///< Create token with given type and value
Token(TokenType type, std::string token, detail::FilePositionWithIndex const & pos);
                                    ///< Create token with given type, value and source position

std::string const & value() const;  ///< String value of token
                                    /**< This value is properly unquoted */

TokenType type() const;             ///< Token type

unsigned line() const;              ///< Line number of token in source
unsigned column() const;            ///< Column number of token in source
unsigned index() const;             ///< Index (char count) of token in source

bool is(unsigned tokens) const;     ///< Check whether the token's type matches \a tokens
                                    /**< \a tokens is a bit-mask of token types to check. */

bool operator==(Token const & other) const;
bool operator!=(Token const & other) const;

std::ostream & operator<<(std::ostream & os, Token const & token);

/** \brief Create a \c None token */

/** \brief Create a \c PathSeparator ['/'] token */
Token PathSeparatorToken();

/** \brief Create an \c ArgumentGroupOpen ['('] token */
Token ArgumentGroupOpenToken();

/** \brief Create an \c ArgumentGroupClose [')'] token */
Token ArgumentGroupCloseToken();

/** \brief Create a \c DirectoryGroupOpen ['{'] token */
Token DirectoryGroupOpenToken();

/** \brief Create a \c DirectoryGroupClose ['}'] token */
Token DirectoryGroupCloseToken();

/** \brief Create a \c CommandTerminator [';'] token */
Token CommandTerminatorToken();

/** \brief Create an \c OtherPunctuation ['=', ','] token with the given \a value */
Token OtherPunctuationToken(std::string const & value);

/** \brief Create a \c BasicString token with the given \a value */
Token BasicStringToken(std::string const & value);

/** \brief Create a \c HexString token with the given \a value */
Token HexStringToken(std::string const & value);

/** \brief Create a \c Word token with the given \a value */
Token WordToken(std::string const & value);

/** \brief Single parsed console command

Every command parsed is returned in a ParseCommandInfo instance. This information is purely
taken from the parser; no semantic information is attached at this point, and the config/console
node tree is not involved in any way. ParseCommandInfo consists of

\li the type of command: built-in or normal command represented by a possibly relative path
    into the command tree.

\li the arguments. Every argument consists of a range of Token instances.

\ingroup console_parser
 */
class ParseCommandInfo

typedef std::vector<Token> Tokens;
typedef std::vector<std::string> CommandPath;

class ArgumentIterator;

typedef CommandPath::const_iterator path_iterator;
typedef Tokens::const_iterator token_iterator;
typedef ArgumentIterator argument_iterator;
typedef Tokens::size_type size_type;

typedef boost::iterator_range<path_iterator> CommandPathRange;
typedef boost::iterator_range<argument_iterator> ArgumentsRange;
typedef boost::iterator_range<token_iterator> TokensRange;

enum BuiltinCommand { NoBuiltin,

BuiltinCommand builtin() const;     ///< Command type
                                    /**< \returns \c NoBuiltin, if the command is an ordinary
                                         command, otherwise the id of the built-in command */
TokensRange commandPath() const;    ///< Command path
                                    /**< This is the path to the command if it is not a built-in
                                         command. Every element of the returned range
                                         constitutes one path element. If the first element is
                                         empty, the path is an absolute path, otherwise it is
                                         relative. If the last element is an empty string, the
                                         path ends with a '/' char. */
ArgumentsRange arguments() const;   ///< Command arguments
                                    /**< The returned range contains one TokensRange for each
                                         argument. */
TokensRange tokens() const;         ///< All argument tokens
                                    /**< The returned range contains \e all argument tokens in a
                                         single range not divided into separate arguments. */

void clear();                       ///< Clear all data members
bool empty();                       ///< \c true, if the data is empty

void builtin(BuiltinCommand builtin); ///< Assign builtin command
void command(std::vector<Token> & commandPath); ///< Assign non-builtin command

void addToken(Token const & token); ///< Add argument token
                                    /**< You \e must ensure that the resulting argument tokens
                                         are properly nested regarding '()' groups, otherwise
                                         interpreting arguments using the arguments() call will
                                         crash the program. */

std::vector<Token> commandPath_;
BuiltinCommand builtin_;

/** \brief Iterator parsing argument groups

This special iterator parses a token range returned by the parser into argument ranges. An
argument range is either a single token or a range of tokens enclosed in matching
parentheses. ParseCommandInfo::arguments() uses this iterator type. To recursively parse
complex arguments, you can also use this iterator to divide a multi-token argument into
further argument groups (e.g. to parse a list or vector of items).

This iterator is a bidirectional iterator, \e not a random access iterator.
 */
class ParseCommandInfo::ArgumentIterator
    : public boost::iterator_facade< ParseCommandInfo::ArgumentIterator,
                                     ParseCommandInfo::TokensRange,
                                     boost::bidirectional_traversal_tag,
                                     ParseCommandInfo::TokensRange >

explicit ArgumentIterator(ParseCommandInfo::TokensRange::iterator i);

reference dereference() const;
bool equal(ArgumentIterator const & other) const;

mutable ParseCommandInfo::TokensRange::iterator b_;
mutable ParseCommandInfo::TokensRange::iterator e_;

void setRange() const;

friend class boost::iterator_core_access;
friend class ParseCommandInfo;

/** \brief Syntax error parsing command arguments exception

All errors while parsing the arguments of a command must be signaled by throwing an instance
of SyntaxErrorException. This is important so that command overloading works.
 */
struct SyntaxErrorException : public senf::Exception
{ explicit SyntaxErrorException(std::string const & msg = "syntax error")
      : senf::Exception(msg) {} };

/** \brief Wrapper checking argument iterator access for validity

CheckedArgumentIteratorWrapper is a wrapper around a range of arguments parsed using the
ParseCommandInfo::ArgumentIterator. It is used to parse arguments either in a command
(registered with manual argument parsing) or when defining a custom parser.

void fn(std::ostream & out, senf::console::ParseCommandInfo command)

    senf::console::CheckedArgumentIteratorWrapper arg (command.arguments());
    senf::console::parse( *(arg++), arg1 );
    senf::console::parse( *(arg++), arg2 );

To use the wrapper, you must ensure that:
\li You increment the iterator \e past all arguments you parse. The iterator must point to
    the end of the range when parsing is complete.
\li The iterator wrapper is destroyed after parsing but before executing the command itself.

Accessing a non-existent argument or failing to parse all arguments will raise a
senf::console::SyntaxErrorException.
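
The contract can be illustrated with a small stand-alone sketch. It is not the senf
implementation: CheckedCursor and SyntaxError are hypothetical names, arguments are plain
strings, and the code assumes C++17 for std::uncaught_exceptions(). Access past the end
throws, and the destructor throws if arguments are left over, unless the scope is being left
because of another exception.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for senf::console::SyntaxErrorException.
struct SyntaxError : std::runtime_error
{
    explicit SyntaxError(std::string const & msg) : std::runtime_error(msg) {}
};

// Hypothetical checked cursor over a list of arguments.
class CheckedCursor
{
public:
    explicit CheckedCursor(std::vector<std::string> const & args)
        : i_(args.begin()), e_(args.end()),
          exceptions_(std::uncaught_exceptions()) {}

    CheckedCursor(CheckedCursor const &) = delete;
    CheckedCursor & operator=(CheckedCursor const &) = delete;

    // Destructors are noexcept by default, so throwing must be enabled explicitly.
    ~CheckedCursor() noexcept(false)
    {
        // Only complain when the scope is left normally, not during unwinding.
        bool unwinding = std::uncaught_exceptions() > exceptions_;
        if (i_ != e_ && ! unwinding)
            throw SyntaxError("invalid number of arguments");
    }

    // Return the current argument and advance; throw if none is left.
    std::string const & next()
    {
        if (i_ == e_)
            throw SyntaxError("invalid number of arguments");
        return *i_++;
    }

    bool done() const { return i_ == e_; }

private:
    std::vector<std::string>::const_iterator i_;
    std::vector<std::string>::const_iterator e_;
    int exceptions_;
};
```

As in the usage rules above, the cursor is best placed in its own inner scope so that the
leftover check runs before the command body starts executing.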

\see \link console_arg_custom Example custom parser \endlink
 */
class CheckedArgumentIteratorWrapper
    : boost::noncopyable,
      public boost::iterator_facade< CheckedArgumentIteratorWrapper,
                                     ParseCommandInfo::TokensRange,
                                     boost::forward_traversal_tag,
                                     ParseCommandInfo::TokensRange >,
      public senf::safe_bool<CheckedArgumentIteratorWrapper>

typedef boost::iterator_facade< CheckedArgumentIteratorWrapper,
                                ParseCommandInfo::TokensRange,
                                boost::forward_traversal_tag,
                                ParseCommandInfo::TokensRange > IteratorFacade;

explicit CheckedArgumentIteratorWrapper(
    ParseCommandInfo::ArgumentsRange const & range,
    std::string const & msg = "invalid number of arguments");
                                    ///< Make wrapper from ArgumentsRange
                                    /**< This constructs a wrapper from a
                                         ParseCommandInfo::ArgumentsRange.
                                         \param[in] range Range of arguments to parse
                                         \param[in] msg Error message */
explicit CheckedArgumentIteratorWrapper(
    ParseCommandInfo::TokensRange const & range,
    std::string const & msg = "invalid number of arguments");
                                    ///< Make wrapper from TokensRange
                                    /**< This constructs a wrapper from a
                                         ParseCommandInfo::TokensRange. The TokensRange is first
                                         converted into a ParseCommandInfo::ArgumentsRange
                                         which is then wrapped.
                                         \param[in] range Range of tokens to parse
                                         \param[in] msg Error message */

~CheckedArgumentIteratorWrapper();  ///< Check whether all arguments are parsed
                                    /**< The destructor validates that all arguments have been
                                         parsed when the scope in which the wrapper is
                                         instantiated is left normally (not by an exception).

                                         \warning This destructor will throw a
                                             SyntaxErrorException if not all arguments are parsed
                                             and no other exception is in progress. */

operator ParseCommandInfo::ArgumentIterator();
                                    ///< Use wrapper as ParseCommandInfo::ArgumentIterator

bool boolean_test() const;          ///< \c true, if more arguments are available
bool done() const;                  ///< \c true, if all arguments are parsed

void clear();                       ///< Set range empty
                                    /**< This call will point the current iterator to the end of
                                         the range.
                                         \post done() == \c true; */

bool operator==(ParseCommandInfo::ArgumentIterator const & other) const;
                                    ///< Compare wrapper against ArgumentIterator
bool operator!=(ParseCommandInfo::ArgumentIterator const & other) const;
                                    ///< Compare wrapper against ArgumentIterator

using IteratorFacade::operator++;
ParseCommandInfo::ArgumentIterator operator++(int);

reference dereference() const;

ParseCommandInfo::ArgumentIterator i_;
ParseCommandInfo::ArgumentIterator e_;

friend class boost::iterator_core_access;

/** \brief Output ParseCommandInfo instance
    \related ParseCommandInfo
 */
std::ostream & operator<<(std::ostream & stream, ParseCommandInfo const & info);

/** \brief Parse commands

This class implements a parser for the console/config language. It supports parsing strings
as well as files. For every parsed command, a callback function is called.
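
The callback style of the interface, together with the contract documented for
parseIncremental(), can be pictured with the toy splitter below. It is a sketch under strong
simplifying assumptions and not how the parser works internally: splitIncremental and
StatementCallback are hypothetical names, statements are handed over as raw strings, and
quoting, comments and grouping are ignored entirely.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

typedef std::function<void (std::string const &)> StatementCallback;

// Toy stand-in for an incremental parse: report every complete,
// ';'-terminated statement through the callback and return the number of
// characters consumed. Quoting, comments and grouping are ignored here.
std::string::size_type splitIncremental(std::string const & commands, StatementCallback cb)
{
    std::string::size_type consumed = 0;
    for (;;) {
        std::string::size_type end = commands.find(';', consumed);
        if (end == std::string::npos)
            break; // incomplete statement: leave it for the next call
        cb(commands.substr(consumed, end - consumed));
        consumed = end + 1;
    }
    return consumed;
}
```

A caller reading from a socket would typically keep the unconsumed tail of its buffer and
prepend it to the next chunk of input before calling the function again.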

\implementation The implementation is based on Boost.Spirit. See the file \ref Parse.ih for
    the formal language grammar.

\implementation Parsing an arbitrary iostream is not supported since arbitrary streams are
    not seekable. If this is needed, it can be provided using stream iterators and
    some special iterator adaptors from Boost.Spirit. However, the amount of backtracking
    needs to be analyzed before this is viable.

\ingroup console_parser
 */

///////////////////////////////////////////////////////////////////////////

typedef boost::function<void (ParseCommandInfo const &)> Callback;

///////////////////////////////////////////////////////////////////////////
///\name Structors and default members

///////////////////////////////////////////////////////////////////////////

void parse(std::string const & command, Callback cb); ///< Parse string
void parseFile(std::string const & filename, Callback cb); ///< Parse file
                                    /**< \throws SystemException if the file cannot be
                                         read. */

void parseArguments(std::string const & arguments, ParseCommandInfo & info);
                                    ///< Parse \a arguments
                                    /**< parseArguments() parses the string \a arguments which
                                         contains arbitrary command arguments (without the name
                                         of the command). The argument tokens are written into
                                         \a info. */

void parsePath(std::string const & path, ParseCommandInfo & info);
                                    /**< parsePath() parses the string \a path as an arbitrary
                                         command path. The result is written into \a info. */

std::string::size_type parseIncremental(std::string const & commands, Callback cb);
                                    ///< Incremental parse
                                    /**< An incremental parse will parse all complete statements
                                         in \a commands. parseIncremental() will return the
                                         number of characters successfully parsed from \a
                                         commands.

                                         \note The incremental parser \e requires all statements
                                             to be terminated explicitly. This means that the
                                             last ';' is \e not optional in this case. */

static bool isSpecialChar(char ch);     ///< Check whether \a ch is a special character
static bool isPunctuationChar(char ch); ///< Check whether \a ch is a punctuation character
static bool isSpaceChar(char ch);       ///< Check whether \a ch is a space character
static bool isInvalidChar(char ch);     ///< Check whether \a ch is an invalid character
static bool isWordChar(char ch);        ///< Check whether \a ch is a word character

/** \brief Exception thrown when the parser detects an error */
struct ParserErrorException : public SyntaxErrorException
{ explicit ParserErrorException(std::string const & msg) : SyntaxErrorException(msg) {} };

struct SetIncremental;

template <class Iterator>
Iterator parseLoop(Iterator b, Iterator e, std::string const & source, Callback cb);

boost::scoped_ptr<Impl> impl_;

friend class SetIncremental;

///////////////////////////////hh.e////////////////////////////////////////

//#include "Parse.ct"
//#include "Parse.cti"

// comment-column: 40
// c-file-style: "senf"
// indent-tabs-mode: nil
// ispell-local-dictionary: "american"
// compile-command: "scons -u test"