Opened 3 years ago

Closed 3 years ago

#5110 closed enhancement (complete)

Migrate D2 server configuration to the bison parser

Reported by: fdupont Owned by: tmark
Priority: medium Milestone: Kea1.2
Component: remote-management Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: Mozilla Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 4
Total Hours: 68 Internal?: no

Description

An expected side effect is to get rid of old DhcpParser.

Subtickets

Change History (20)

comment:1 Changed 3 years ago by hschempf

  • Milestone changed from Kea-proposed to Kea1.2
  • Priority changed from medium to low

Per team meeting Jan 12, accept Kea 1.2 Low

comment:2 Changed 3 years ago by hschempf

  • Owner set to tmark
  • Status changed from new to assigned

comment:3 Changed 3 years ago by tomek

Once this ticket is ready for merge, please try to remove DhcpConfigParser? class. AFAICT D2 stuff was the last piece of code that was using it. If there are other areas that still using it, let's discuss that on a call.

comment:4 Changed 3 years ago by fdupont

Updated flex source codes and regenerated with last flex (2.6.3) and bison.

comment:5 Changed 3 years ago by tmark

  • Add Hours to Ticket changed from 0 to 50
  • Owner changed from tmark to Unassigned
  • Status changed from assigned to reviewing

At long last this ticket is ready for review. It required changes to libprocess as well as to D2. D2's entire configuration is now handled by bison JSON parsing and all of its element parsers derive from SimpleParser.

I apologize ahead of time for the large diff but there was little point in doing it piece meal.

All references to DhcpConfigParser? have been removed from libprocess as well as D2. The only remaining reference is in the control agent, so deleting the DhcpConfigParser? class is left for a subsequent ticket.

comment:6 Changed 3 years ago by tomek

  • Owner changed from Unassigned to tomek

comment:7 follow-up: Changed 3 years ago by fdupont

Some comments (for you agree I can fix them directly in the branch):

  • d2_cfg_mgr.cc in getInt() and co the intValue() and co can raise an exception. IMHO you should adopt the new version of these templates and when needed just change the type of the raised exception.
  • d2_cfg_mgr.cc please use isV4Zero()
  • d2_config.h NCRS -> NCRs, UPD -> UDP, emtpy -> empty
  • d2_config.h it is a matter of style but IMHO data::SimpleParser -> isc::data::SimpleParser (BTW one of them at least is preceded by a double space)

I'll continue in another comment.

comment:8 Changed 3 years ago by fdupont

d2_controller.h bison parsing to parse -> bison generated parser to parse

d2_lexer.ll SUB_DNS_SERVERS is missing

d2_parser.yy I noted you chose to put some checks into the parser itself (IMHO a good idea).

d2_parser.yy missing ; at the end of sub_ddns_domains, sub_dns_servers sub_tsig_keys,

d2_parser.yy extra elem in ddns_domain_name & others

d2_parser.yy please add a blank line between debuglevel and severity

I'll continue...

comment:9 Changed 3 years ago by fdupont

parser_context.h syntactix -> syntactic, determing -> determining, servess -> servers

parser_context.h perhaps the renew-timer example should be changed for the DHCP-DDNS one?

parser_context.cc perhaps the copyright years should be 2017 alone (I don't know when you started...)?

d2_cfg_mgr_unittests.cc spelling: anwser, cannat, insensisitive, replacable, sytnax (BTW I did a sort -u so there can be more than one occurrence)

d2_simple_parser_unittest.cc (same ispell | sort -u): alogorithm, approprate, specifices, valaue, virtural

parser_unittest.cc betelguese -> betelgeuse

process/d_controller.h udpateConfig -> updateConfig (doxygen!)

Next my final (?) comment.

comment:10 Changed 3 years ago by fdupont

Extra comments:

  • first I ran last week the branch so it passed on macOS (so if I don't review the code itself I don't expect critical errors).
  • even the buggy flex version is not expected to be used IMHO the .ll file should be converted to use C (vs C++) comments only (or the ticket handling this to be updated)
  • same for required parameters in the .yy file (again the corresponding ticket can be updated too).
  • I recommend to use the fact the parser does not return unknown keywords to remove the foreach in the D2CfgMgr::buildParams (of course if this function won't be phased out soon). And using the SimpleParser routine you can remove also the try/catch so save 2 indent levels.


comment:11 Changed 3 years ago by tomek

  • Add Hours to Ticket changed from 50 to 14
  • Owner changed from tomek to tmark
  • Total Hours changed from 0 to 64

When you metioned large diff you weren't joking. Oh boy.

Having said that, some of the changes are not necessary. This branch
contains changes in lib/eval, but only in the regenerated files. Those
should not be checked in unless you changed the source .ll or .yy
files. Make sure you don't merge them.

src/lib/process/d_controller.cc
DControllerBase::configFromFile() calls fromJSONFile to to parse the
input if parseFile returned NULL. That assumes the default parseFile
impelementation was used. But what would happen if a user specified
a file that is there, but is empty?

src/lib/process/d_cfg_mgr.cc|h
Comment for parseElement should be corrected: "Parses the an element".

Let's just say I dislike the idea of doing any changes to the
buildParams and buildAndCommit methods. Those two methods are
dedicated to the old paradigm and spending any cycles on improving
them is a waste of time. It would be better to work towards
eliminating it completely.

The new methods you added in d_cfg_mgr.h and d_controller.h produce
unused parameter warnings:

In file included from ../../../../src/lib/process/d_process.h:12:0,
                 from ../../../../src/lib/process/d_controller.h:16,
                 from ../../../../src/bin/agent/ctrl_agent_controller.h:10,
                 from ctrl_agent_controller_unittests.cc:8:
../../../../src/lib/process/d_cfg_mgr.h:324:55: warning: unused parameter ‘mutable_config’ [-Wunused-parameter]
     virtual void setCfgDefaults(isc::data::ElementPtr mutable_config) {
                                                       ^
../../../../src/lib/process/d_cfg_mgr.h:335:50: warning: unused parameter ‘element_id’ [-Wunused-parameter]
     virtual void parseElement(const std::string& element_id,
                                                  ^
../../../../src/lib/process/d_cfg_mgr.h:336:58: warning: unused parameter ‘element’ [-Wunused-parameter]
                               isc::data::ConstElementPtr element) {
                                                          ^
In file included from ../../../../src/bin/agent/ctrl_agent_controller.h:10:0,
                 from ctrl_agent_controller_unittests.cc:8:
../../../../src/lib/process/d_controller.h:414:69: warning: unused parameter ‘file_name’ [-Wunused-parameter]
     virtual isc::data::ConstElementPtr parseFile(const std::string& file_name) {
                                                                     ^
../../../../src/lib/process/d_controller.h:434:69: warning: unused parameter ‘input’ [-Wunused-parameter]
     virtual isc::data::ConstElementPtr parseText(const std::string& input) {

changes in src/bin/dhcpX/*.hh and dhcpX_lexer.cc
How did you generate the files? The .cc files have a huge difference,
but the only difference I spotted was minor change in the .ll files.

src/bin/d2/parser_context.h

I see you added D2ParseError exception. We already have Dhcp4ParseError and
Dhcp6ParseError. Your would be a third one and I was expected to add another one
for agent parser. Instead, I decided to create a generic ParseError? exception
and move it to lib/cc/dhcp_config_error.h. This header contains two exceptions:
ParseError? and DhcpConfigError? (that will soon be renamed to ConfigError?).
ParseError? was added as part of #5076 work and I don't rememeber the ticket #
for the DhcpConfigError?, but Francis created it already. I'm writing all of
this mostly as information. The exception you wrote will be converted to use
ParseError? eventually, but right now we probably don't want to add any
dependencies between tickets, so let's keep the exception as is.

There are a number of parser types defined. Do you really need all of them?
Yes, similar list was added in DHCPv4 and DHCPv6, but we do have specific use
cases for many of them (e.g. option parser needed to parse options stored in
DB). For agent parser, I wrote only 3 types: JSON (to parse generic json), AGENT
(to parse full agent config) and SUB_AGENT (to parse simplified config without
loggers and without top level map with "Control-agent" in it; that's useful for
writing simpler tests). To summarize, I'm not saying that you must remove them
all, just think whether we will use them. We haven't talked about any designs
yet, but are you envisioning D2 commands that would take advantage of those?
I don't know, maybe having D2 commands like add-tsig-key or add-domain would
use those sub-parsers?

DHCPvX parsers defined a lot of ParserContext? values, and so did your D2 code.
I have thought about it a bit and came to a conclusion that those are not
needed. We only need 2 values: whether the lexer should recognize the keywords
or not. So you can collapse all values different than NO_KEYWORD to a single
value of KEYWORDS. You may take a look at src/bin/agent/parser_context.h on
trac5076. One minor advantage of having so many contexts is the ability to print
the scope in error messages. On the other hand, without them defining tokens in
.ll files is much simpler, there are far fewer calls to ctx.enter() and
ctx.leave() in the .yy file and parser_context.h is shorter. So in my opinion
it's better to have only 2 values rather than close to 20.

src/bin/d2/parser_context.cc
This stuff is so awesome that in 2016 nobody thought it would even be possible.
You probably should make it more obvious by changing the first line to 2017 :)

Do we really need virtual destructor (or any destructor at all)?

src/bin/d2/d2_simple_parser.cc
Is there any reason why D2_GLOBAL_DEFAULTS use constants from D2Params?
From C++ perspective it doesn't matter at all, but there are two other reasons
why using the actual values rather than constants is more useful. First,
we can tell users to look at the file and they would understand it, even if they
have no clue about C++. Second, I'm thinking about a script that would
extract those tables and generate User's Guide sections out of them.
This would only work, however, if the absolute values were used. With
constants, it would still perhaps be possible, but it would be more fragile
and more difficult to implement.

Can you move the DEFAULTS descriptions from .h to .cc? The idea is that even
non-programmers could understand it, when they look at the .cc file.

Finally, for timeout value it would be nice to add a comment regarding the
units. I presume these are seconds, right?

src/bin/d2/d2_parser.yy
Line 275 adds a check if port number is negative. That's fine, but if you
want to add a check here, you should also check for the value being over
64K. There are similar checks in other parts of the code (e.g. line 487).
Do you think it would be useful to write a common method for it?

You have an interesting way of indenting the bison rules. Francis proposed
to keep the | below the semicolon in the first line. This was never codified
in any way, though, and it's more like a matter of aestetics, so feel free
to keep the code as it is if you prefer it that way.

There's a @todo in line 367 about checking FQDN. This may be a tricky
thing to do. I remember a discussion in IETF about adding a text for
validation of received FQDN option. The conclusion was that it was
very complex thing to verify, especially if you want to cover things
like IDN. Some regional encodings are really weird. So perhaps we can
simply say something in the documentation that we do not verify it?

Do I get it correctly that the hostname (line 466) is accepted only
if its value is ""?

You did a fine job writing this bison grammar. There's very little to
change here.

src/bin/d2/d2_lexer.ll
If you agree with my earlier comments about only needing two lexer
contexts (KEYWORDS vs NO_KEYWORDS), all the switches in d2_lexer.ll
could be replaced with code similar to:

if (driver.ctx_ != ParserContext::NO_KEYWORDS) {
    return D2Parser::make_PORT(driver.loc_);
} else {
    return D2Parser::make_STRING("port", driver.loc_);
}

This is less error prone and easier to maintain. With the code as
currently written, the default is to use string tokens, that may
cause problems when the grammar is extended in the future. If you
forget to update the lexer code, the token being generated will change
if you introduce new contexts.

src/bin/d2/d2_controller.cc|h
Corrected couple minor things. Please pull and review.

src/bin/d2/d2_config.h

The comment for TSIGKeyInfoParser class mentions .spec file. What's
your opinion on them? So far we still have them in the git repo, but
after migration to bison, is there much value in keeping them? There
are at least couple aspects to consider here. First, the bison grammar
sort of documents what is *actually* supported and what is
not. Second, the grammar is not so easy to read. Third, the .spec
files are most likely outdated and sometimes plain wrong. The fact
that we don't really know for sure is a clear sign of the problem.

Removed a bunch of obsolete methods. Please pull and review.

Comment for TSIGKeyInfoListParser::parse says that a separate parser
instance is created for each entry on that list. Is it really
necessary? Can't the code simply create one instance and use it for
all entries? It doesn't matter much, though, because I'm not sure what
exactly the code does when the parser instance is created. There is no
constructor and no member fields. [5 mins later]. Ok, I just checked.
The sizeof(instance of TSIGKeyInfoParser) returns a grand total of

  1. Nevermind then, it's ok to instantiate as many of those objects as

you like. :)

src/bin/d2/d2_config.cc
As pointed our earlier, there are some default values here and it would
be cleaner to move them to d2_simple_parser.cc. If you agree with
this, we should probably move this task to a separate ticket.

The comment in TSIGKeyInfoParser::parse() mentions that creation of
TSIGKey info may fail, because the secret may be invlid. What kind of
content could be considered invalid? An empty string? Too long string?
Is this something we could check against? I'm not pushing for this in
any way, these are honest questions from a security noob.

I see that D2CfgError is being used. Would you be ok with replacing it
with ConfigError?, which would be a generic exception shared among all
components? Unifying the exceptions is one thing that we can do in the
not so distant future.

src/bin/d2/d2_cfg_mgr.h
My reaction to line 86: Your Polish is improving. Dobra robota!
(mapy means maps in Polish).

Corrected. Please pull and review.

src/bin/d2/d2_cfg_mgr.cc

I see much similarity between getInt (204) and extractInt in
lib/cc/simple_parser.h. When we have ConfigError? and ParseError? in
lib/cc, it will be possible to remove that duplication.

src/bin/d2/tests/nc_test_utils.cc
The valid_d2_config uses tmark.org domain, which is a real domain
(currently available for sale). That's a violation of RFC2606. You
should use one of the domains listed in that RFC. tmark.example.org
looks like a good choice if you want to leave an everlasting
mark of your contributions. :)

The same applies to DdnsDomainListParserTest?.invalidDomain,
validDomain, validList, duplicateDomain and possibly couple others.


There are 8 new doxygen warnings (src/bin/d2/d2_cfg_mgr.h and
d2_simple_parser.h)


I have fixed a couple minor issues. Please pull and review.


The code builds and unit-tests pass on Ubuntu 16.04.1LTS x64.
Note there are abundant compilation warnings due to the header
issues mentioned above (d_cfg_mgr.h and d_controller.h).

In general, you did a very fine job. Sure, there are some
little things to fix, but these are insignificant and overall
the code is in a good shape.

I think this scope of changes warrant a changelog entry,
regardless if we feel the changes are transparent to the user
or not.

comment:12 Changed 3 years ago by fdupont

Two comments about last comments:

  • I am against the use of 2 syntactic contexts only: even if the parsing is still correct it breaks the error reporting in particular the unknown_map_entry rule which requires a STRING and in general all the cases where a keyword is used at a bad place (there is a long technical discussion on jabber about this).
  • the idea of one big bison file has some advantages but raises at least 2 problems: first it is not clear it is easier to maintain one big file for many purposes, second (bound to the first point) there is no built-in way to make a bison input file modular: no way to call another parser from bison, no way to split the .yy file into smaller files and to include them. Of course it is still possible to use a text processor like m4 to build the big .yy file but it is complex, time-consuming, etc.

comment:13 Changed 3 years ago by tomek

  • Priority changed from low to medium

comment:14 in reply to: ↑ 7 ; follow-up: Changed 3 years ago by tmark

Replying to fdupont:

Some comments (for you agree I can fix them directly in the branch):

Fixed

  • d2_cfg_mgr.cc in getInt() and co the intValue() and co can raise an exception. IMHO you should adopt the new version of these templates and when needed just change the type of the raised exception.

What "new versions"? Could you tell me where they are defined or were they added after 5110 was rebased?


  • d2_cfg_mgr.cc please use isV4Zero()

Done. Didn't know those existed when I originally wrote this.

  • d2_config.h NCRS -> NCRs, UPD -> UDP, emtpy -> empty
  • d2_config.h it is a matter of style but IMHO data::SimpleParser -> isc::data::SimpleParser (BTW one of them at least is preceded by a double space)

I'll continue in another comment.

Done

comment:15 in reply to: ↑ 14 Changed 3 years ago by fdupont

Replying to tmark:

  • d2_cfg_mgr.cc in getInt() and co the intValue() and co can raise an exception. IMHO you should adopt the new version of these templates and when needed just change the type of the raised exception.

What "new versions"? Could you tell me where they are defined or were they added after 5110 was rebased?

=> trac5110 has: extractInt() template doing first value->intValue();, new version as in master has getIntType() template doing first getInteger(scope, name);. Note exceptions where changed too (twice when trac5076 (agent parser) was merged) and I have under review another ticket which instantiates the template in getUint{8,16,32}() methods...

Now there are a lot of things waiting for trac5110 merge so in the choice between merge now and create a followup ticket, and refine trac5110 code the second option can be not the best one...

comment:16 follow-up: Changed 3 years ago by tmark

  • Owner changed from tmark to tomek

At this point I am submitting the changes I have. Any further refinements can be done under an additional ticket.

comment:17 in reply to: ↑ 16 Changed 3 years ago by tomek

  • Add Hours to Ticket changed from 14 to 4
  • Total Hours changed from 64 to 68

Replying to tmark:

At this point I am submitting the changes I have. Any further refinements can be done under an additional ticket.

Agree. I did review your changes and have only minor changes. To not prolong this ticket and further, I have addressed them. The comments below are mostly for documentation purposes only.

There were couple unused parameters warnings. I have fixed those.

d2_parser.yy
There was an odd check for port number: if <=0 or >= 65792.
Why such a strange number? Shouldn't that be max uint16_t, i.e. 65535?
The number you used is exactly 256 too large. This is repeated
twice - for port and for dns-server-port. If this is not a mistake,
the code definitely needs a comment why it's 257*256 and not the
usually expected 256*256. Asked Marcin if he's knew any unusual
port tricks here, but he wasn't aware of any. Checked the docs, but there was no mention, so I decided it's a bit uncommon off-by-one (ok, off by 256) type of error and corrected the parser and related unit-tests.

parser_unittest.cc
Thanks for correcting the spelling of Betegeuse. It was not a typo,
at least not a typical one. I mislearnt English spelling of that name.
Both the text and my perception of its spelling are now corrected. Thanks.

d2_simple_parser_unittest.cc
The change introduced in line 5 was peculiar. I presume it was a mistake and reverted it.


Thomas is on PTO today and the code is ready to go. Since there are several tickets that can take advantage of those changes (in particular #5134, removal of the DhcpConfigParser? class and exceptions name clean-up), I decided to merge it today.

I verified that the code builds and unit-tests pass on Ubuntu 16.04.1 LTS.

The ticket is left open, because ChangeLog? entry is needed. I'll let Thomas do the honours here :)

comment:18 Changed 3 years ago by tomek

  • Owner changed from tomek to tmark

comment:19 Changed 3 years ago by tomek

  • Sub-Project changed from DHCP to Mozilla

comment:20 Changed 3 years ago by tmark

  • Resolution set to complete
  • Status changed from reviewing to closed

Ticket merged as of git feb2cedc0936364a923ab78542a21114533dd0f5.

Note: There was some sort of merge issue when Tomek attempted to merge trac5110, so he created a new branch, trac5110_fix, and merged that.

ChangeLog? entry 1218 has been added.

Note: See TracTickets for help on using tickets.