// Fraunhofer Institute for Open Communication Systems (FOKUS)
//
// The contents of this file are subject to the Fraunhofer FOKUS Public License
// Version 1.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
// http://senf.berlios.de/license.html
//
// The Fraunhofer FOKUS Public License Version 1.0 is based on,
// but modifies the Mozilla Public License Version 1.1.
// See the full license text for the amendments.
//
// Software distributed under the License is distributed on an "AS IS" basis,
// WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
// for the specific language governing rights and limitations under the License.
//
// The Original Code is Fraunhofer FOKUS code.
//
// The Initial Developer of the Original Code is Fraunhofer-Gesellschaft e.V.
// (registered association), Hansastraße 27 c, 80686 Munich, Germany.
// All Rights Reserved.
//
// Stefan Bund <g0dil@berlios.de>
/** \file
\brief Parse public header */
#ifndef HH_SENF_Scheduler_Console_Parse_
#define HH_SENF_Scheduler_Console_Parse_ 1
/** \defgroup console_parser The parser

The console/config library defines a simple language used to interact with the console or to
configure the application. The parser is not concerned with interpreting commands or
arguments, checking that a command exists or managing directories. The parser just takes the

\section console_language The Language

The config/console language is used in configuration files and interactively at the
console. Some features of the language are more useful in config files, others at the
interactive console, but the language is the same in both cases.
Let's start with a sample of the config/console language. The following is written as a

# My someserver configuration file

accept senf::log::Debug IMPORTANT;
accept server::ServerLog CRITICAL;

provide serverlog senf::log::FileTarget "/var/log/server.log";

reject senf::log::Debug senf::Console::Server NOTICE;
accept senf::log::Debug NOTICE;
accept server::ServerLog;

/server/stuffing (UDPPacket x"01 02 03 04");
/server/allow_hosts 10.1.2.3 # our internal server
                    10.2.3.4 10.4.3.5 # client workstations

/help/infoUrl "http://senf.j32.de/src/doc";
The interactive syntax is the same, with a few differences:
\li All commands must be complete on a single line. This includes grouping constructs, which must
    be closed on the same line they are opened.
\li The last ';' is optional. However, multiple commands may be entered on a single line when
    they are separated by ';'.
\li An empty line on the interactive console will repeat the last command.
The language consists of a small number of syntactic entities:

\subsection console_special_chars Special characters

These characters have a special meaning. Some are used internally, others are just
returned as punctuation tokens.

<tr><td>#</td><td>Comments are marked with '#' and continue to the end of the line</td></tr>
<tr><td>/</td><td>path component separator</td></tr>
<tr><td>( )</td><td>argument grouping</td></tr>
<tr><td>{ }</td><td>directory grouping</td></tr>
<tr><td>;</td><td>command terminator</td></tr>
<tr><td>, =</td><td>punctuation tokens</td></tr>

\subsection console_basic Basic elements

A <b>word</b> is \e any sequence of consecutive characters which does not include any special
character. Examples of words are thus

jens@fokus.fraunhofer.de

The following are \e not valid words:

A <b>string literal</b> is just that: A double-quoted string (C/C++ style) possibly with
embedded escape chars:

A <b>hex-string literal</b> is used to represent binary data. It looks like a string which has
only hexadecimal bytes or whitespace as contents (comments and newlines are OK when not read
from the interactive console).

A <b>token</b> is a \e word, \e string or \e hex-string, or a single special character (that's
true, any special character is allowed as a token). '(' and ')' must be properly nested.

A <b>path</b> is a sequence of \e words separated by '/' (and optional whitespace). A path may
have an optional initial and/or a terminating '/'.
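
The path rules can be illustrated with a small stand-alone sketch. \c splitPath and
\c isAbsolute are hypothetical helpers and not part of the SENF API. The sketch assumes the
convention that an empty first element marks an absolute path and an empty last element marks a
terminating '/'. It does not check that the elements are valid words.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Remove leading and trailing blanks and tabs (hypothetical helper).
std::string trim(std::string const & s)
{
    std::string::size_type b = s.find_first_not_of(" \t");
    if (b == std::string::npos) return std::string();
    std::string::size_type e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Split a path at every '/'. An initial '/' yields an empty first
// element, a terminating '/' yields an empty last element.
std::vector<std::string> splitPath(std::string const & path)
{
    std::vector<std::string> elements;
    if (path.empty()) return elements;
    std::string current;
    for (char ch : path) {
        if (ch == '/') {
            elements.push_back(trim(current));
            current.clear();
        } else {
            current += ch;
        }
    }
    elements.push_back(trim(current));
    assert(!elements.empty());
    return elements;
}

// A path is absolute if its first element is empty.
bool isAbsolute(std::vector<std::string> const & elements)
{
    return !elements.empty() && elements.front().empty();
}
```

With these definitions, <tt>splitPath("/server/stuffing")</tt> yields an empty element followed by
\c server and \c stuffing, so \c isAbsolute reports an absolute path.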
\subsection console_statements Statements

There are several types of statements:
\li The bulk of all statements are \e path statements
\li There are some \e built-in statements which are mostly useful at the interactive console
\li A special form of statement is the <em>directory group</em>

A <b>path</b> statement consists of a (possibly relative) path followed by any number of
arguments and terminated with a ';' (or end-of-input)

/path/to/command arg1 "arg2" (complex=(1 2) another) ;

Every argument is either
\li A single word, string or hex-string
\li or a parenthesized list of tokens.

So the above command has three arguments: 'arg1', 'arg2' (a single token each) and one argument with
the 7 tokens 'complex', '=', '(', '1', '2', ')', 'another'. The interpretation of the arguments
is completely up to the command.
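
The grouping rule can be made concrete with a stand-alone sketch. \c Tok, \c groupArguments and
\c exampleTokens are illustrative stand-ins and not the SENF types. The sketch assumes the tokens
have already been produced by a tokenizer and only shows how a flat token list is divided into
arguments by matching parentheses.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal token: either '(' or ')' punctuation or any other token.
struct Tok {
    enum Kind { Open, Close, Other };
    Kind kind;
    std::string value;
};
typedef std::vector<Tok> Toks;

// Divide a flat token list into arguments. A token outside of
// parentheses is an argument of its own. A '(' starts an argument
// which extends to the matching ')'. The outer pair is dropped,
// nested parentheses are kept as tokens.
std::vector<Toks> groupArguments(Toks const & tokens)
{
    std::vector<Toks> arguments;
    std::size_t i = 0;
    while (i < tokens.size()) {
        Toks argument;
        if (tokens[i].kind == Tok::Close)
            throw std::runtime_error("unbalanced ')'");
        if (tokens[i].kind == Tok::Other) {
            argument.push_back(tokens[i++]);
        } else {
            int depth = 1;
            ++i; // skip the outer '('
            while (i < tokens.size() && depth > 0) {
                if (tokens[i].kind == Tok::Open) ++depth;
                else if (tokens[i].kind == Tok::Close) --depth;
                if (depth > 0) argument.push_back(tokens[i]);
                ++i;
            }
            if (depth != 0) throw std::runtime_error("missing ')'");
        }
        arguments.push_back(argument);
    }
    assert(i == tokens.size());
    return arguments;
}

// Tokens of: arg1 "arg2" (complex=(1 2) another)
Toks exampleTokens()
{
    Tok const tokens[] = {
        {Tok::Other, "arg1"}, {Tok::Other, "arg2"},
        {Tok::Open, "("}, {Tok::Other, "complex"}, {Tok::Other, "="},
        {Tok::Open, "("}, {Tok::Other, "1"}, {Tok::Other, "2"},
        {Tok::Close, ")"}, {Tok::Other, "another"}, {Tok::Close, ")"}
    };
    return Toks(tokens, tokens + sizeof(tokens) / sizeof(tokens[0]));
}
```

Calling <tt>groupArguments(exampleTokens())</tt> returns three arguments holding 1, 1 and 7 tokens
respectively, matching the description above.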
A <b>built-in</b> statement is one of

<tr><td>\c cd \e path</td><td>Change current directory</td></tr>
<tr><td>\c ls [ \e path ]</td><td>List contents of \e path or current directory</td></tr>
<tr><td>\c exit</td><td>Exit interactive console</td></tr>
<tr><td>\c help [ \e path ]</td><td>Show help for \e path or current directory</td></tr>

A <b>directory group</b> statement is a block of statements all executed relative to a fixed

At the beginning of the block, the current directory is saved and the directory is changed to
the given directory. All commands are executed and at the end of the block, the saved directory

\section console_parse_api The parser API

The senf::console::CommandParser is responsible for taking text input and turning it into a
sequence of senf::console::ParseCommandInfo structures. The structures are returned by passing
them successively to a callback function.

Every statement is returned as a senf::console::ParseCommandInfo instance. Directory groups are
handled specially: They are divided into two special built-in commands called PUSHD and POPD.
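
How a callback might react to the PUSHD and POPD commands can be sketched without the library.
\c Builtin, \c Command and \c DirectoryTracker are hypothetical stand-ins (the real
ParseCommandInfo carries token ranges, not plain strings), and the way relative paths are
resolved here is an assumption made for the illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified command record as it might be passed to a callback.
enum Builtin { NoBuiltin, PushD, PopD };
struct Command {
    Builtin builtin;
    std::string path;
};

// Keeps track of the current directory while commands arrive.
class DirectoryTracker
{
public:
    DirectoryTracker() : cwd_("/") {}

    void operator()(Command const & cmd)
    {
        switch (cmd.builtin) {
        case PushD: // start of a directory group
            saved_.push_back(cwd_);
            cwd_ = resolve(cmd.path);
            break;
        case PopD: // end of a directory group
            assert(!saved_.empty());
            cwd_ = saved_.back();
            saved_.pop_back();
            break;
        case NoBuiltin: // ordinary path statement
            executed_.push_back(resolve(cmd.path));
            break;
        }
    }

    std::vector<std::string> const & executed() const { return executed_; }
    std::string const & cwd() const { return cwd_; }

private:
    // Absolute paths are kept, relative paths are appended to cwd_.
    std::string resolve(std::string const & path) const
    {
        if (!path.empty() && path[0] == '/') return path;
        return cwd_ == "/" ? "/" + path : cwd_ + "/" + path;
    }

    std::string cwd_;
    std::vector<std::string> saved_;
    std::vector<std::string> executed_;
};
```

Feeding the tracker a PUSHD for \c /server, the relative command \c allow_hosts and a POPD records
the command as \c /server/allow_hosts and leaves the current directory at '/' again.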
#include <boost/utility.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/iterator/iterator_facade.hpp>
#include <boost/function.hpp>
#include <senf/Utils/safe_bool.hh>
#include <senf/Utils/Exception.hh>

//#include "Parse.mpp"
//-/////////////////////////////////////////////////////////////////////////////////////////////////

namespace detail { class FilePositionWithIndex; }

namespace detail { struct ParserAccess; }
/** \brief Single argument token

All command arguments are split into tokens by the parser. Each token is returned as an

\ingroup console_parser

PathSeparator       = 0x0001, // '/'
ArgumentGroupOpen   = 0x0002, // '('
ArgumentGroupClose  = 0x0004, // ')'
DirectoryGroupOpen  = 0x0008, // '{'
DirectoryGroupClose = 0x0010, // '}'
CommandTerminator   = 0x0020, // ';'
OtherPunctuation    = 0x0040,
BasicString         = 0x0080,

ArgumentGrouper = ArgumentGroupOpen
    | ArgumentGroupClose,

DirectoryGrouper = DirectoryGroupOpen
    | DirectoryGroupClose,

Punctuation = DirectoryGroupOpen
    | DirectoryGroupClose

SimpleArgument = Word

Token(); ///< Create empty token
Token(TokenType type, std::string token);
    ///< Create token with given type and value
Token(TokenType type, std::string token, detail::FilePositionWithIndex const & pos);
    ///< Create token with given type, value and source position

std::string const & value() const; ///< String value of token
    /**< This value is properly unquoted */

TokenType type() const; ///< Token type

unsigned line() const; ///< Line number of token in source
unsigned column() const; ///< Column number of token in source
unsigned index() const; ///< Index (char count) of token in source

bool is(unsigned tokens) const; ///< Check whether the token type matches \a tokens
    /**< \a tokens is a bit-mask of token types to check. */

bool operator==(Token const & other) const;
bool operator!=(Token const & other) const;

std::ostream & operator<<(std::ostream & os, Token const & token);
/** \brief Create a \c None token

/** \brief Create a \c PathSeparator ['/'] token

Token PathSeparatorToken();

/** \brief Create an \c ArgumentGroupOpen ['('] token

Token ArgumentGroupOpenToken();

/** \brief Create an \c ArgumentGroupClose [')'] token

Token ArgumentGroupCloseToken();

/** \brief Create a \c DirectoryGroupOpen ['{'] token

Token DirectoryGroupOpenToken();

/** \brief Create a \c DirectoryGroupClose ['}'] token

Token DirectoryGroupCloseToken();

/** \brief Create a \c CommandTerminator [';'] token

Token CommandTerminatorToken();

/** \brief Create an \c OtherPunctuation ['=', ','] token with the given \a value

Token OtherPunctuationToken(std::string const & value);

/** \brief Create a \c BasicString token with the given \a value

Token BasicStringToken(std::string const & value);

/** \brief Create a \c HexString token with the given \a value

Token HexStringToken(std::string const & value);

/** \brief Create a \c Word token with the given \a value

Token WordToken(std::string const & value);
/** \brief Single parsed console command

Every command parsed is returned in a ParseCommandInfo instance. This information is taken
purely from the parser: no semantic information is attached at this point and the config/console
node tree is not involved in any way. ParseCommandInfo consists of

\li the type of command: built-in or normal command represented by a possibly relative path
    into the command tree.

\li the arguments. Every argument consists of a range of Token instances.

\ingroup console_parser

class ParseCommandInfo

typedef std::vector<Token> Tokens;
typedef std::vector<std::string> CommandPath;

class ArgumentIterator;

typedef CommandPath::const_iterator path_iterator;
typedef Tokens::const_iterator token_iterator;
typedef ArgumentIterator argument_iterator;
typedef Tokens::size_type size_type;

typedef boost::iterator_range<path_iterator> CommandPathRange;
typedef boost::iterator_range<argument_iterator> ArgumentsRange;
typedef boost::iterator_range<token_iterator> TokensRange;

enum BuiltinCommand { NoBuiltin,

BuiltinCommand builtin() const; ///< Command type
    /**< \returns \c NoBuiltin, if the command is an ordinary
         command, otherwise the id of the built-in command */
TokensRange commandPath() const; ///< Command path
    /**< This is the path to the command if it is not a built-in
         command. Every element of the returned range
         constitutes one path element. If the first element is
         empty, the path is an absolute path, otherwise it is
         relative. If the last element is an empty string, the
         path ends with a '/' char. */
ArgumentsRange arguments() const; ///< Command arguments
    /**< The returned range contains one TokensRange for each

TokensRange tokens() const; ///< All argument tokens
    /**< The returned range contains \e all argument tokens in a
         single range not divided into separate arguments. */

void clear(); ///< Clear all data members
bool empty(); ///< \c true, if the data is empty

void builtin(BuiltinCommand builtin); ///< Assign builtin command
void command(std::vector<Token> & commandPath); ///< Assign non-builtin command

void addToken(Token const & token); ///< Add argument token
    /**< You \e must ensure that the resulting argument tokens
         are properly nested regarding '()' groups, otherwise
         interpreting arguments using the arguments() call will
         crash the program. */

std::vector<Token> commandPath_;
BuiltinCommand builtin_;
/** \brief Iterator parsing argument groups

This special iterator parses a token range returned by the parser into argument ranges. An
argument range is either a single token or a range of tokens enclosed in matching
parentheses. ParseCommandInfo::arguments() uses this iterator type. To parse complex arguments
recursively, you can also use this iterator to divide a multi-token argument into
further argument groups (e.g. to parse a list or vector of items).

This iterator is a bidirectional iterator, \e not a random access iterator.

class ParseCommandInfo::ArgumentIterator
    : public boost::iterator_facade< ParseCommandInfo::ArgumentIterator,
                                     ParseCommandInfo::TokensRange,
                                     boost::bidirectional_traversal_tag,
                                     ParseCommandInfo::TokensRange >

explicit ArgumentIterator(ParseCommandInfo::TokensRange::iterator i);

reference dereference() const;
bool equal(ArgumentIterator const & other) const;

mutable ParseCommandInfo::TokensRange::iterator b_;
mutable ParseCommandInfo::TokensRange::iterator e_;

void setRange() const;

friend class boost::iterator_core_access;
friend class ParseCommandInfo;
/** \brief Syntax error parsing command arguments exception

All errors while parsing the arguments of a command must be signaled by throwing an instance
of SyntaxErrorException. This is important so that command overloading works.

struct SyntaxErrorException : public senf::Exception
{ explicit SyntaxErrorException(std::string const & msg = "syntax error")
    : senf::Exception(msg) {} };

/** \brief Wrapper checking argument iterator access for validity

CheckedArgumentIteratorWrapper is a wrapper around a range of arguments parsed using the
ParseCommandInfo::ArgumentIterator. It is used to parse arguments either in a command
(registered with manual argument parsing) or when defining a custom parser.

void fn(std::ostream & out, senf::console::ParseCommandInfo command)

    senf::console::CheckedArgumentIteratorWrapper arg (command.arguments());
    senf::console::parse( *(arg++), arg1 );
    senf::console::parse( *(arg++), arg2 );

To use the wrapper, you must ensure that:
\li You increment the iterator \e past all arguments you parse. The iterator must point to
    the end of the range when parsing is complete.
\li The iterator wrapper is destroyed after parsing but before executing the command itself

Accessing a non-existent argument or failing to parse all arguments will raise a
senf::console::SyntaxErrorException.
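
The two checks can be mimicked in a stand-alone sketch. \c CheckedCursor, \c SyntaxError and
\c joinTwo are hypothetical names and not part of the SENF API. Unlike the wrapper described
here, the sketch does not check in its destructor but uses an explicit \c finish() call, a
deliberate simplification since destructors are implicitly \c noexcept in current C++.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for senf::console::SyntaxErrorException.
struct SyntaxError : std::runtime_error {
    explicit SyntaxError(std::string const & msg) : std::runtime_error(msg) {}
};

// Cursor over a list of arguments which checks every access.
class CheckedCursor
{
public:
    explicit CheckedCursor(std::vector<std::string> const & args,
                           std::string const & msg = "invalid number of arguments")
        : args_(args), pos_(0), msg_(msg) {}

    // Return the current argument and advance. Throws if none is left.
    std::string const & next()
    {
        if (pos_ >= args_.size()) throw SyntaxError(msg_);
        return args_[pos_++];
    }

    // true, if all arguments have been consumed
    bool done() const
    {
        assert(pos_ <= args_.size());
        return pos_ == args_.size();
    }

    // Throws if unparsed arguments remain.
    void finish() const
    {
        if (!done()) throw SyntaxError(msg_);
    }

private:
    std::vector<std::string> args_;
    std::size_t pos_;
    std::string msg_;
};

// Hypothetical command taking exactly two arguments.
std::string joinTwo(std::vector<std::string> const & args)
{
    CheckedCursor arg(args);
    std::string first = arg.next();
    std::string second = arg.next();
    arg.finish(); // parsing ends here, before the command body runs
    return first + "+" + second;
}
```

Calling \c joinTwo with one or with three arguments throws \c SyntaxError in this sketch, the first
from \c next() and the second from \c finish().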
\see \ref console_args_custom "Example custom parser"

class CheckedArgumentIteratorWrapper
    : boost::noncopyable,
      public boost::iterator_facade< CheckedArgumentIteratorWrapper,
                                     ParseCommandInfo::TokensRange,
                                     boost::forward_traversal_tag,
                                     ParseCommandInfo::TokensRange >,
      public senf::safe_bool<CheckedArgumentIteratorWrapper>

typedef boost::iterator_facade< CheckedArgumentIteratorWrapper,
                                ParseCommandInfo::TokensRange,
                                boost::forward_traversal_tag,
                                ParseCommandInfo::TokensRange > IteratorFacade;

explicit CheckedArgumentIteratorWrapper(
    ParseCommandInfo::ArgumentsRange const & range,
    std::string const & msg = "invalid number of arguments");
    ///< Make wrapper from ArgumentsRange
    /**< This constructs a wrapper from a
         ParseCommandInfo::ArgumentsRange.
         \param[in] range Range of arguments to parse
         \param[in] msg Error message */
explicit CheckedArgumentIteratorWrapper(
    ParseCommandInfo::TokensRange const & range,
    std::string const & msg = "invalid number of arguments");
    ///< Make wrapper from TokensRange
    /**< This constructs a wrapper from a
         ParseCommandInfo::TokensRange. The TokensRange is first
         converted into a ParseCommandInfo::ArgumentsRange
         which is then wrapped.
         \param[in] range Range of tokens to parse
         \param[in] msg Error message */

~CheckedArgumentIteratorWrapper(); ///< Check, if all arguments are parsed
    /**< The destructor validates that all arguments are parsed
         correctly when leaving the scope in which the wrapper
         is instantiated normally (not by an exception).

         \warning This destructor will throw a
         SyntaxErrorException if not all arguments are parsed
         and no other exception is in progress. */

operator ParseCommandInfo::ArgumentIterator();
    ///< Use wrapper as ParseCommandInfo::ArgumentIterator

bool boolean_test() const; ///< \c true, if more arguments are available
bool done() const; ///< \c true, if all arguments are parsed

void clear(); ///< Set range empty
    /**< This call will point the current iterator to the end of

         \post done() == \c true; */

bool operator==(ParseCommandInfo::ArgumentIterator const & other) const;
    ///< Compare wrapper against ArgumentIterator
bool operator!=(ParseCommandInfo::ArgumentIterator const & other) const;
    ///< Compare wrapper against ArgumentIterator

using IteratorFacade::operator++;
ParseCommandInfo::ArgumentIterator operator++(int);

reference dereference() const;

ParseCommandInfo::ArgumentIterator i_;
ParseCommandInfo::ArgumentIterator e_;

friend class boost::iterator_core_access;
/** \brief Output ParseCommandInfo instance
\related ParseCommandInfo

std::ostream & operator<<(std::ostream & stream, ParseCommandInfo const & info);

/** \brief Parse commands

This class implements a parser for the console/config language. It supports parsing strings
as well as files. For every parsed command, a callback function is called.

\implementation The implementation is based on Boost.Spirit. See the file \ref Parse.ih for
    the formal language grammar.

\implementation Parsing an arbitrary iostream is not supported since arbitrary streams are
    not seekable. If this is needed, it can be provided using stream iterators and
    some special iterator adaptors from Boost.Spirit. However, the amount of backtracking
    needs to be analyzed before this is viable.

\ingroup console_parser

//-////////////////////////////////////////////////////////////////////////

typedef boost::function<void (ParseCommandInfo const &)> Callback;

//-////////////////////////////////////////////////////////////////////////
///\name Structors and default members

//-////////////////////////////////////////////////////////////////////////

void parse(std::string const & command, Callback cb); ///< Parse string
void parseFile(std::string const & filename, Callback cb); ///< Parse file
    /**< \throws SystemException if the file cannot be

void parseArguments(std::string const & arguments, ParseCommandInfo & info);
    ///< Parse \a arguments
    /**< parseArguments() parses the string \a arguments which
         contains arbitrary command arguments (without the name
         of the command). The argument tokens are written into

void parsePath(std::string const & path, ParseCommandInfo & info);

    /**< parsePath() parses the string \a path as an arbitrary
         command path. The result is written into \a info. */

std::string::size_type parseIncremental(std::string const & commands, Callback cb);
    ///< Incremental parse
    /**< An incremental parse will parse all complete statements
         in \a commands. parseIncremental() will return the
         number of characters successfully parsed from \a

         \note The incremental parser \e requires all statements
         to be terminated explicitly. This means that the
         last ';' is \e not optional in this case. */

static bool isSpecialChar(char ch); ///< Check if \a ch is a special character
static bool isPunctuationChar(char ch); ///< Check if \a ch is a punctuation character
static bool isSpaceChar(char ch); ///< Check if \a ch is a space character
static bool isInvalidChar(char ch); ///< Check if \a ch is an invalid character
static bool isWordChar(char ch); ///< Check if \a ch is a word character

/** \brief Exception thrown when the parser detects an error */
struct ParserErrorException : public SyntaxErrorException
{ explicit ParserErrorException(std::string const & msg) : SyntaxErrorException(msg) {} };

struct SetIncremental;

template <class Iterator>
Iterator parseLoop(Iterator b, Iterator e, std::string const & source, Callback cb);

boost::scoped_ptr<Impl> impl_;

friend class SetIncremental;

//-/////////////////////////////////////////////////////////////////////////////////////////////////

//#include "Parse.ct"
//#include "Parse.cti"
// comment-column: 40
// c-file-style: "senf"
// indent-tabs-mode: nil
// ispell-local-dictionary: "american"
// compile-command: "scons -u test"