// Fraunhofer Institute for Open Communication Systems (FOKUS)
// Competence Center NETwork research (NET), St. Augustin, GERMANY
//     Stefan Bund <g0dil@berlios.de>
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation; either version 2 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the
// Free Software Foundation, Inc.,
// 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
\brief Parse public header */

/** \defgroup console_parser The parser

The console/config library defines a simple language used to interact with the console or to
configure the application. The parser is not concerned with interpreting commands or
arguments, checking that a command exists or managing directories. The parser just takes the
\section console_language The Language

The config/console language is used in configuration files and interactively at the
console. Some features of the language are more useful in config files, others at the
interactive console but the language is the same in both cases.

Let's start with a sample of the config/console language. The following is written as a
# My someserver configuration file
accept senf::log::Debug IMPORTANT;
accept server::ServerLog CRITICAL;
provide serverlog senf::log::FileTarget "/var/log/server.log";
reject senf::log::Debug senf::Console::Server NOTICE;
accept senf::log::Debug NOTICE;
accept server::ServerLog;
/server/stuffing (UDPPacket x"01 02 03 04");
/server/allow_hosts 10.1.2.3 # our internal server
    10.2.3.4 10.4.3.5 # client workstations
/help/infoUrl "http://senf.j32.de/src/doc";
The interactive syntax is the same, with the following differences:
\li All commands must be complete on a single line. This includes grouping constructs, which
must be closed on the same line they are opened.
\li The final ';' is optional. Multiple commands may still be entered on a single line if
they are separated by ';'.
\li An empty line on the interactive console will repeat the last command.
The language consists of a small number of syntactic entities:

\subsection console_special_chars Special characters

These characters have a special meaning. Some are used internally, others are just
returned as punctuation tokens.
<tr><td>/</td><td>path component separator</td></tr>
<tr><td>( )</td><td>argument grouping</td></tr>
<tr><td>{ }</td><td>directory grouping</td></tr>
<tr><td>;</td><td>command terminator</td></tr>
<tr><td>, =</td><td>punctuation tokens</td></tr>
\subsection console_basic Basic elements

A <b>word</b> is \e any sequence of consecutive characters which does not include any special
character. Examples of words are thus
jens@fokus.fraunhofer.de
The following are \e not valid words:

A <b>string literal</b> is just that: A double-quoted string (C/C++ style), possibly with
embedded escape chars:

A <b>hex-string literal</b> is used to represent binary data. It looks like a string which has
only hexadecimal bytes or whitespace as contents (comments and newlines are OK when not read
from the interactive console).

A <b>token</b> is a \e word, \e string or \e hex-string, or a single special character (indeed,
any special character is allowed as a token). '(' and ')' must be properly nested.

A <b>path</b> is a sequence of \e words separated by '/' (and optional whitespace). A path may
have an optional initial and/or a terminating '/'.
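
The path rules can be illustrated with a small standalone sketch. It is not part of the
library: \c splitPath is a hypothetical helper written for this example only. It assumes the
convention documented for ParseCommandInfo::commandPath(), where an empty first element
marks an absolute path and an empty last element marks a trailing '/'.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of senf::console): split a path at '/'.
// Whitespace is dropped, since the language allows optional whitespace
// around the separators and a word cannot contain blanks in this sketch.
std::vector<std::string> splitPath(std::string const & path)
{
    std::vector<std::string> elements;
    std::string current;
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        char c = path[i];
        if (c == '/') {
            // A separator closes the current element (which may be empty)
            elements.push_back(current);
            current.clear();
        }
        else if (c != ' ' && c != '\t')
            current += c;
    }
    // The text after the last separator is the final element
    elements.push_back(current);
    return elements;
}
```

With this helper, "/server/allow_hosts" yields an empty element followed by "server" and
"allow_hosts", so the path is absolute, while "a/b/" ends in an empty element.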
\subsection console_statements Statements

There are several types of statements:
\li The bulk of all statements are \e path statements
\li There are some \e built-in statements which are mostly useful at the interactive console
\li A special form of statement is the <em>directory group</em>

A <b>path</b> statement consists of a (possibly relative) path followed by any number of
arguments and terminated with a ';' (or end-of-input).
/path/to/command arg1 "arg2" (complex=(1 2) another) ;

Every argument is either
\li A single word, string or hex-string
\li or a parenthesized list of tokens.

So the above command has three arguments: 'arg1', 'arg2' (a single token each) and one argument
with the 7 tokens 'complex', '=', '(', '1', '2', ')', 'another'. The interpretation of the
arguments is completely up to the command.
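
The grouping rule can be sketched in a few lines of standalone C++. This is an illustration
only, not the library's implementation: \c TokenList and \c groupArguments are hypothetical
names, and tokens are modelled as plain strings, so a quoted string containing just "(" could
not be told apart from a real parenthesis here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<std::string> TokenList;

// Hypothetical helper: group a flat token list into arguments.
// A token outside parentheses is an argument of its own. A parenthesized
// group becomes one argument holding the tokens between the outer '(' and ')'.
std::vector<TokenList> groupArguments(TokenList const & tokens)
{
    std::vector<TokenList> arguments;
    std::size_t depth = 0;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        std::string const & t = tokens[i];
        if (t == "(") {
            if (depth == 0)
                arguments.push_back(TokenList());   // outer '(' starts a new argument
            else
                arguments.back().push_back(t);      // nested '(' is an ordinary token
            ++depth;
        }
        else if (t == ")") {
            if (depth == 0)
                throw std::runtime_error("unbalanced ')'");
            --depth;
            if (depth > 0)
                arguments.back().push_back(t);      // nested ')' is an ordinary token
        }
        else if (depth == 0)
            arguments.push_back(TokenList(1, t));   // single-token argument
        else
            arguments.back().push_back(t);
    }
    if (depth != 0)
        throw std::runtime_error("unbalanced '('");
    return arguments;
}
```

Fed with the tokens of the example statement, the sketch produces three arguments, the last
of which holds the seven tokens listed above.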
A <b>built-in</b> statement is one of
<tr><td>\c cd \e path</td><td>Change current directory</td></tr>
<tr><td>\c ls [ \e path ]</td><td>List contents of \e path or current directory</td></tr>
<tr><td>\c exit</td><td>Exit interactive console</td></tr>
<tr><td>\c help [ \e path ]</td><td>Show help for \e path or current directory</td></tr>

A <b>directory group</b> statement is a block of statements all executed relative to a fixed
At the beginning of the block, the current directory is saved and the directory is changed to
the given directory. All commands are executed and at the end of the block, the saved directory

\section console_parse_api The parser API

The senf::console::CommandParser is responsible for taking text input and turning it into a
sequence of senf::console::ParseCommandInfo structures. The structures are returned by passing
them successively to a callback function.

Every statement is returned as a senf::console::ParseCommandInfo instance. Directory groups are
handled specially: They are divided into two special built-in commands called PUSHD and POPD.
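
How a consumer of the callback might treat PUSHD and POPD can be sketched as follows. This is
a standalone illustration under stated assumptions: \c Builtin, \c Command and
\c DirectoryTracker are hypothetical stand-ins, not the library's types, and directories are
modelled as plain strings.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the built-in command ids and a parsed command.
enum Builtin { NoBuiltin, BuiltinPUSHD, BuiltinPOPD };

struct Command {
    Builtin builtin;
    std::string path;   // directory for PUSHD, command path otherwise
};

// Tracks the current directory while commands are delivered one by one.
class DirectoryTracker
{
public:
    DirectoryTracker() : cwd_("/") {}

    // Returns the absolute path an ordinary command resolves to, or an
    // empty string for PUSHD / POPD, which only change the directory.
    std::string handle(Command const & cmd)
    {
        switch (cmd.builtin) {
        case BuiltinPUSHD:
            saved_.push_back(cwd_);          // save the directory on block entry
            cwd_ = resolve(cmd.path);
            return "";
        case BuiltinPOPD:
            if (! saved_.empty()) {          // restore it at the end of the block
                cwd_ = saved_.back();
                saved_.pop_back();
            }
            return "";
        default:
            return resolve(cmd.path);
        }
    }

    std::string const & cwd() const { return cwd_; }

private:
    std::string resolve(std::string const & path) const
    {
        if (! path.empty() && path[0] == '/') return path;       // absolute path
        return cwd_ == "/" ? "/" + path : cwd_ + "/" + path;     // relative path
    }

    std::string cwd_;
    std::vector<std::string> saved_;
};
```

Using a stack for the saved directories lets directory groups nest: each POPD restores the
directory that was current when the matching PUSHD arrived.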
#include <boost/utility.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/iterator/iterator_facade.hpp>
#include <boost/function.hpp>
#include "../Utils/safe_bool.hh"

//#include "Parse.mpp"
///////////////////////////////hh.p////////////////////////////////////////

namespace detail { struct ParserAccess; }
/** \brief Single argument token

All command arguments are split into tokens by the parser. Each token is returned as an
\ingroup console_parser
PathSeparator = 0x0001, // '/'
ArgumentGroupOpen = 0x0002, // '('
ArgumentGroupClose = 0x0004, // ')'
DirectoryGroupOpen = 0x0008, // '{'
DirectoryGroupClose = 0x0010, // '}'
CommandTerminator = 0x0020, // ';'
OtherPunctuation = 0x0040,
BasicString = 0x0080,
ArgumentGrouper = ArgumentGroupOpen
    | ArgumentGroupClose,
DirectoryGrouper = DirectoryGroupOpen
    | DirectoryGroupClose,
Punctuation = DirectoryGroupOpen
    | DirectoryGroupClose
SimpleArgument = Word
Token(); ///< Create empty token
Token(TokenType type, std::string token); ///< Create token with given type and value
std::string const & value() const; ///< String value of token
    /**< This value is properly unquoted */
TokenType type() const; ///< Token type
bool is(unsigned tokens) const; ///< Check whether the token type matches \a tokens
    /**< \a tokens is a bit-mask of token types to check. */
bool operator==(Token const & other) const;
bool operator!=(Token const & other) const;
std::ostream & operator<<(std::ostream & os, Token const & token);

/** \brief Create a \c None token
/** \brief Create a \c PathSeparator ['/'] token
Token PathSeparatorToken();

/** \brief Create an \c ArgumentGroupOpen ['('] token
Token ArgumentGroupOpenToken();

/** \brief Create an \c ArgumentGroupClose [')'] token
Token ArgumentGroupCloseToken();

/** \brief Create a \c DirectoryGroupOpen ['{'] token
Token DirectoryGroupOpenToken();

/** \brief Create a \c DirectoryGroupClose ['}'] token
Token DirectoryGroupCloseToken();

/** \brief Create a \c CommandTerminator [';'] token
Token CommandTerminatorToken();

/** \brief Create an \c OtherPunctuation ['=', ','] token with the given \a value
Token OtherPunctuationToken(std::string const & value);

/** \brief Create a \c BasicString token with the given \a value
Token BasicStringToken(std::string const & value);

/** \brief Create a \c HexString token with the given \a value
Token HexStringToken(std::string const & value);

/** \brief Create a \c Word token with the given \a value
Token WordToken(std::string const & value);
/** \brief Single parsed console command

Every command parsed is returned in a ParseCommandInfo instance. This information is taken
purely from the parser: no semantic information is attached at this point, and the
config/console node tree is not involved in any way. ParseCommandInfo consists of

\li the type of command: built-in or normal command represented by a possibly relative path
into the command tree.

\li the arguments. Every argument consists of a range of Token instances.

\ingroup console_parser
class ParseCommandInfo
typedef std::vector<Token> Tokens;
typedef std::vector<std::string> CommandPath;
class ArgumentIterator;

typedef CommandPath::const_iterator path_iterator;
typedef Tokens::const_iterator token_iterator;
typedef ArgumentIterator argument_iterator;
typedef Tokens::size_type size_type;

typedef boost::iterator_range<path_iterator> CommandPathRange;
typedef boost::iterator_range<argument_iterator> ArgumentsRange;
typedef boost::iterator_range<token_iterator> TokensRange;

enum BuiltinCommand { NoBuiltin,
BuiltinCommand builtin() const; ///< Command type
    /**< \returns \c NoBuiltin, if the command is an ordinary
         command, otherwise the id of the built-in command */
TokensRange commandPath() const; ///< Command path
    /**< This is the path to the command if it is not a built-in
         command. Every element of the returned range
         constitutes one path element. If the first element is
         empty, the path is an absolute path, otherwise it is
         relative. If the last element is an empty string, the
         path ends with a '/' char. */
ArgumentsRange arguments() const; ///< Command arguments
    /**< The returned range contains one TokensRange for each
TokensRange tokens() const; ///< All argument tokens
    /**< The returned range contains \e all argument tokens in a
         single range not divided into separate arguments. */

void clear(); ///< Clear all data members
bool empty(); ///< \c true, if the data is empty

void builtin(BuiltinCommand builtin); ///< Assign builtin command
void command(std::vector<Token> & commandPath); ///< Assign non-builtin command

void addToken(Token const & token); ///< Add argument token
    /**< You \e must ensure that the resulting argument tokens
         are properly nested with respect to '()' groups;
         otherwise, interpreting arguments using the arguments()
         call will crash the program. */
std::vector<Token> commandPath_;
BuiltinCommand builtin_;
/** \brief Iterator parsing argument groups

This special iterator parses a token range returned by the parser into argument ranges. An
argument range is either a single token or a range of tokens enclosed in matching
parentheses. ParseCommandInfo::arguments() uses this iterator type. You can also use this
iterator to parse complex arguments recursively by dividing a multi-token argument into
further argument groups (e.g. to parse a list or vector of items).

This iterator is a bidirectional iterator, \e not a random access iterator.
class ParseCommandInfo::ArgumentIterator
    : public boost::iterator_facade< ParseCommandInfo::ArgumentIterator,
                                     ParseCommandInfo::TokensRange,
                                     boost::bidirectional_traversal_tag,
                                     ParseCommandInfo::TokensRange >
explicit ArgumentIterator(ParseCommandInfo::TokensRange::iterator i);
reference dereference() const;
bool equal(ArgumentIterator const & other) const;
mutable ParseCommandInfo::TokensRange::iterator b_;
mutable ParseCommandInfo::TokensRange::iterator e_;

void setRange() const;

friend class boost::iterator_core_access;
friend class ParseCommandInfo;
/** \brief Syntax error parsing command arguments exception

All errors while parsing the arguments of a command must be signaled by throwing an instance
of SyntaxErrorException. This is required for command overloading to work.
struct SyntaxErrorException : public std::exception
explicit SyntaxErrorException(std::string const & msg = "");
virtual ~SyntaxErrorException() throw();

virtual char const * what() const throw();
std::string const & message() const;
std::string message_;
/** \brief Wrapper checking argument iterator access for validity

CheckedArgumentIteratorWrapper is a wrapper around a range of arguments parsed using the
ParseCommandInfo::ArgumentIterator. It is used to parse arguments either in a command
(registered with manual argument parsing) or when defining a custom parser.
void fn(std::ostream & out, senf::console::ParseCommandInfo command)
    senf::console::CheckedArgumentIteratorWrapper arg (command.arguments());
    senf::console::parse( *(arg++), arg1 );
    senf::console::parse( *(arg++), arg2 );
To use the wrapper, you must ensure that:
\li You increment the iterator \e past all arguments you parse. The iterator must point to
the end of the range when parsing is complete.
\li The iterator wrapper is destroyed after parsing but before executing the command itself

Accessing a non-existent argument or failing to parse all arguments will raise a
senf::console::SyntaxErrorException.
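
The two checks can be illustrated with a simplified standalone sketch. It is not the
library's implementation: \c SyntaxError and \c CheckedArguments are hypothetical names,
arguments are modelled as plain strings, and the "all arguments parsed" check is an explicit
\c finish() call here instead of being done in the destructor.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for senf::console::SyntaxErrorException.
struct SyntaxError : std::runtime_error {
    explicit SyntaxError(std::string const & msg) : std::runtime_error(msg) {}
};

// Simplified checked cursor over a list of arguments.
class CheckedArguments
{
public:
    explicit CheckedArguments(std::vector<std::string> const & args,
                              std::string const & msg = "invalid number of arguments")
        : args_(args), pos_(0), msg_(msg) {}

    // Return the current argument and advance; throws if none is left.
    std::string const & next()
    {
        if (pos_ >= args_.size()) throw SyntaxError(msg_);
        return args_[pos_++];
    }

    // true, if all arguments have been consumed
    bool done() const { return pos_ == args_.size(); }

    // Explicit form of the "everything was parsed" check; throws if
    // arguments are left over.
    void finish() const
    {
        if (! done()) throw SyntaxError(msg_);
    }

private:
    std::vector<std::string> args_;   // copied so the cursor owns its data
    std::size_t pos_;
    std::string msg_;
};
```

A command taking two arguments would call \c next() twice and then \c finish(). Too few
arguments fail in \c next(), too many fail in \c finish().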
\see \link console_arg_custom Example custom parser \endlink
class CheckedArgumentIteratorWrapper
    : boost::noncopyable,
      public boost::iterator_facade< CheckedArgumentIteratorWrapper,
                                     ParseCommandInfo::TokensRange,
                                     boost::forward_traversal_tag,
                                     ParseCommandInfo::TokensRange >,
      public senf::safe_bool<CheckedArgumentIteratorWrapper>
typedef boost::iterator_facade< CheckedArgumentIteratorWrapper,
                                ParseCommandInfo::TokensRange,
                                boost::forward_traversal_tag,
                                ParseCommandInfo::TokensRange > IteratorFacade;
explicit CheckedArgumentIteratorWrapper(
    ParseCommandInfo::ArgumentsRange const & range,
    std::string const & msg = "invalid number of arguments");
    ///< Make wrapper from ArgumentsRange
    /**< This constructs a wrapper from a
         ParseCommandInfo::ArgumentsRange.
         \param[in] range Range of arguments to parse
         \param[in] msg Error message */
explicit CheckedArgumentIteratorWrapper(
    ParseCommandInfo::TokensRange const & range,
    std::string const & msg = "invalid number of arguments");
    ///< Make wrapper from TokensRange
    /**< This constructs a wrapper from a
         ParseCommandInfo::TokensRange. The TokensRange is first
         converted into a ParseCommandInfo::ArgumentsRange
         which is then wrapped.
         \param[in] range Range of tokens to parse
         \param[in] msg Error message */

~CheckedArgumentIteratorWrapper(); ///< Check whether all arguments are parsed
    /**< The destructor validates that all arguments are parsed
         correctly when the scope in which the wrapper is
         instantiated is left normally (not by an exception).

         \warning This destructor will throw a
         SyntaxErrorException, if not all arguments are parsed
         and when no other exception is in progress. */

operator ParseCommandInfo::ArgumentIterator();
    ///< Use wrapper as ParseCommandInfo::ArgumentIterator

bool boolean_test() const; ///< \c true, if more arguments are available
bool done() const; ///< \c true, if all arguments are parsed

void clear(); ///< Set range empty
    /**< This call will point the current iterator to the end of
         \post done() == \c true; */

bool operator==(ParseCommandInfo::ArgumentIterator const & other) const;
    ///< Compare wrapper against ArgumentIterator
bool operator!=(ParseCommandInfo::ArgumentIterator const & other) const;
    ///< Compare wrapper against ArgumentIterator

using IteratorFacade::operator++;
ParseCommandInfo::ArgumentIterator operator++(int);
reference dereference() const;
ParseCommandInfo::ArgumentIterator i_;
ParseCommandInfo::ArgumentIterator e_;
friend class boost::iterator_core_access;
/** \brief Output ParseCommandInfo instance
\related ParseCommandInfo
std::ostream & operator<<(std::ostream & stream, ParseCommandInfo const & info);

/** \brief Parse commands

This class implements a parser for the console/config language. It supports parsing strings
as well as files. For every parsed command, a callback function is called.

\implementation The implementation is based on Boost.Spirit. See the file \ref Parse.ih for
the formal language grammar.

\implementation Parsing an arbitrary iostream is not supported since arbitrary streams are
not seekable. If needed, this could be provided using stream iterators and some special
iterator adaptors from Boost.Spirit. However, the amount of backtracking needs to be
analyzed before this is viable.

\todo Implement more detailed error reporting and error recovery.

\ingroup console_parser
///////////////////////////////////////////////////////////////////////////
typedef boost::function<void (ParseCommandInfo const &)> Callback;

///////////////////////////////////////////////////////////////////////////
///\name Structors and default members
///////////////////////////////////////////////////////////////////////////
bool parse(std::string const & command, Callback cb); ///< Parse string
bool parseFile(std::string const & filename, Callback cb); ///< Parse file
    /**< \throws SystemException if the file cannot be
bool parseArguments(std::string const & arguments, ParseCommandInfo & info);
    ///< Parse \a arguments
    /**< parseArguments() parses the string \a arguments which
         contains arbitrary command arguments (without the name
         of the command). The argument tokens are written into
std::string::size_type parseIncremental(std::string const & commands, Callback cb);
    ///< Incremental parse
    /**< An incremental parse will parse all complete statements
         in \a commands. parseIncremental() will return the
         number of characters successfully parsed from \a
         \note The incremental parser \e requires all statements
         to be terminated explicitly. This means that the
         last ';' is \e not optional in this case. */
struct SetIncremental;

template <class Iterator>
Iterator parseLoop(Iterator b, Iterator e, Callback cb);
boost::scoped_ptr<Impl> impl_;

friend class SetIncremental;
///////////////////////////////hh.e////////////////////////////////////////
//#include "Parse.ct"
//#include "Parse.cti"
// comment-column: 40
// c-file-style: "senf"
// indent-tabs-mode: nil
// ispell-local-dictionary: "american"
// compile-command: "scons -u test"