BIND 10 Face to Face Meeting Notes

CNNIC offices, Beijing

Day 3, Wednesday April 28, 2010

Jelte's Python wrapper tutorial (insert slides)


Methods: Errors

Michael: Is Python threaded at all?

Jelte: no, Python has threads but it fakes them

Shane: it's going to change soon; someone has done a project at Apple that will allow it to become threaded.

Shane: when you raised an exception you did it with a class; can you throw an error there?

Jelte: yes, the second one here is actually the class (on slide: Methods: Errors)

Shane: and if you want to set other values you set them in the object as you normally do?

Jelte: yes
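
As a Python-level illustration of the exchange above (raising with a class, and setting extra values on the exception object as on any other object), here is a minimal sketch. The names `WrapperError`, `rcode`, and `parse_label` are hypothetical and do not come from the tutorial slides.

```python
class WrapperError(Exception):
    """Hypothetical exception class; extra values are ordinary attributes."""

    def __init__(self, message, rcode=None):
        super().__init__(message)
        self.rcode = rcode  # extra value set on the exception object


def parse_label(label):
    """Raise WrapperError for labels longer than 63 octets (the DNS limit)."""
    if len(label) > 63:
        raise WrapperError("label too long", rcode=1)
    return label.lower()


try:
    parse_label("x" * 64)
except WrapperError as err:
    caught = err  # the handler can read the extra value back from the object
```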


The one static thing in the module is the initializer; it is called as soon as the module is imported. If we define our own exceptions we define another static object.

Modules: Exceptions

You create a new exception class and give it a name; if you don't give it a base class, it will be derived from Exception

Shane: there is a function which will return a hash you can use to do that?

Jelte: or maybe you can put it directly in to be parsed, I don't know

Jelte: the important thing in the end is to add an object, and because we're going to destroy it soon we need to add a reference (increment the reference count)
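
The steps discussed here (create an exception class, then add it to the module so that a reference stays alive) can be mirrored in pure Python. This is only an analogy for what the C wrapper does, not the wrapper code itself; the module name `example_wrapper` and the class name `NameTooLong` are hypothetical.

```python
import types

# Hypothetical module name, standing in for the extension module being built.
module = types.ModuleType("example_wrapper")

# Create a new exception class at run time, derived from Exception.
NameTooLong = type("NameTooLong", (Exception,), {})

# Adding the class as a module attribute keeps a reference to it alive,
# so it survives after the local name goes away.
module.NameTooLong = NameTooLong
del NameTooLong

try:
    raise module.NameTooLong("example")
except Exception as err:
    caught_type = type(err).__name__
```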

Modules: Constants

Modules: Enums

Enums are not much more than constants in the form of a list

the difference is that you make an associative array from integers to strings and add it to our module (getting very big)
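
A minimal sketch of the "associative array from integers to strings" idea, written in Python rather than in the C wrapper. The RR class codes are the standard DNS values; the `RRCLASS_` prefix and the `module_namespace` dictionary are assumptions made for illustration.

```python
# DNS RR class codes mapped to their mnemonic names.
RRCLASS_NAMES = {1: "IN", 3: "CH", 4: "HS"}

module_namespace = {}  # stands in for the module's attribute table

# Expose each entry as a constant and keep the mapping for reverse lookups.
for code, name in RRCLASS_NAMES.items():
    module_namespace["RRCLASS_" + name] = code
module_namespace["RRCLASS_NAMES"] = RRCLASS_NAMES
```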

Modules: classes

Jelte: the PyType_Ready function doesn't just tell you you're ready, it actually starts the function, so it is poorly named.

In general, unless specifically noted, all references you get from your objects are borrowed.

One More Thing

If you need to keep a reference around use Py_INCREF and then don't forget to use Py_DECREF when you don't need it anymore.
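
The effect of keeping and releasing a reference can be observed from Python with `sys.getrefcount`. This sketch assumes CPython (reference counts are an implementation detail) and only shows the counter moving; it is not the C-level Py_INCREF/Py_DECREF code.

```python
import sys

obj = []                       # a fresh object owned by the name `obj`
before = sys.getrefcount(obj)  # the call itself may add a temporary reference

kept = obj                     # holding on to the object adds a reference
during = sys.getrefcount(obj)

del kept                       # releasing it drops the count again
after = sys.getrefcount(obj)
```

Only the differences between the three readings are meaningful; the absolute numbers vary between interpreter versions.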


Shane: so it seems like there is a lot of setup/base work. It seems like it's almost constant.

Jelte: well, a lot of the framework is copy/pasting and changing names

Evan: can it be automated?

Michael: the point of this discussion is to not automate

Feng: template classes: you can use them for this. Wrapping STL classes may be hard (templates like vector etc.).

Jinmei: in general we don't expose template APIs from the DNS library. In some cases we may expose vectors; if that is a problem we can probably change it.

Shane: what about shared pointers?

Jinmei: those might be a problem

Shane: Jelte do you have any idea how we'd support templates doing this ourselves?

Jelte: I don't see a problem; there might be more functions but they'd be more copy/paste-able

Feng: maybe you can write a template for each type of class?

Shane: efficiency concerns?

Jelte: well if you want it to be efficient, don't use Python

Jinmei: what do we mean?

Shane: you could end up having to do a deep copy, in the worst case

Jelte: sometimes you can need up to three copies

Michael: how are overloaded methods handled?

Jelte: let me show you

Shane: going to the experiments directory in BIND10 svn

something called python_binding

Jelte: this is the wrapper from the name class

we define exceptions, constants, and the function list, looking for overloading.... toWire perhaps?

Jelte: wait, I haven't done this one yet. I do overloading the way it's done in general: you use the PyArg_ParseTuple function.

(looking in rrclass_python.xx) this class takes either a byte object or a message renderer object. Overloading is done by checking all possible forms of arguments. Need to clear the error after doing this.

Shane: presumably this could take any sequence

Jelte: this will never fail unless you don't provide an argument

Michael: then the second one will fail and it will fall through the clearing set. The second one can fail right?

Jelte: yes.
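
The overloading approach described above (try each accepted argument form in turn, and only report an error once none of them matches) can be sketched in Python. In the C wrapper this is done with repeated argument-parsing attempts and clearing the error in between; the version below is a stand-in under stated assumptions, and `MessageRenderer` and `to_wire` are hypothetical names, not the real wrapper API.

```python
class MessageRenderer:
    """Hypothetical stand-in for a message renderer object."""

    def __init__(self):
        self.data = bytearray()

    def write(self, octets):
        self.data += octets


def to_wire(rrclass_code, target):
    """Write a 16-bit RR class code to `target`, accepting two argument forms."""
    wire = rrclass_code.to_bytes(2, "big")
    # Form 1: a mutable byte buffer, extended in place.
    if isinstance(target, bytearray):
        target += wire
        return target
    # Form 2: a renderer object.
    if isinstance(target, MessageRenderer):
        target.write(wire)
        return target
    # Neither form matched: report one error for the whole call.
    raise TypeError("to_wire() expects a bytearray or a MessageRenderer")
```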

Shane: so presumably the PyBytes_FromStringAndSize could fail, right?

Jelte: unless our own toWire function fails. This is not really efficient because we first create a buffer and then read that into a new Python object.

Shane: maybe you could make a "scratch work" buffer to use over and over

Jelte: I didn't want to do it because we don't really need buffers like we need them in C.

Jinmei: I also looked at the code and found that in some cases you need to know some of the implementation of the C++ version to write the wrapper. For example, RRClass uses a short integer... Python generally only handles generic integers

Jelte: actually I discovered later that some of those letters are for short integer, long integer, etc.

Jinmei: conversion may not be a trivial task

Jelte: the typing work is similar, but wrapping the functions can be much less obvious, and this is an example
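
The point about fixed-width integers versus Python's generic integers can be shown with the standard `struct` module, whose format letters also distinguish short and long integers. This is a stand-in for the wrapper's own argument format letters, not the same mechanism, and `pack_rrclass` is a hypothetical helper.

```python
import struct


def pack_rrclass(code):
    """Pack an RR class code as an unsigned 16-bit big-endian integer."""
    return struct.pack("!H", code)


packed = pack_rrclass(1)

try:
    pack_rrclass(70000)  # a valid Python int, but it does not fit in 16 bits
    overflow_rejected = False
except struct.error:
    overflow_rejected = True
```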

Jelte: I do an ugly and not normal thing here, I include C files (in

Jelte: normally people only make tiny modules to do tiny things, not wrap entire libraries

Jinmei: maybe we can do this in C++ and set up separate namespaces

Jelte: then maybe we should talk about that later.

Jelte: one other thing is that you have to define a nonstatic function that goes into all of them, but you can't give them all the same name. If we need to add a class we make a .cc file like this and include it in the module aggregator assay.

Jeremy: is there a performance cost?

Jelte: there is always a performance cost

Jeremy: have we checked?

Jelte: not yet, but I assume a cost, especially if we do the dirty tricks I described for toWire.

Shane: it won't be slower, it might not be faster, but I don't think it will be slower.

Jinmei: I think this will be faster than DNS_Python (5x faster)

Jelte: the trick to high performance here is to use the highest level functions possible for your python code

Michael: will there be cases where you want to add a function for Python's use?

Jelte: here yes but not in C++

Shane: perhaps to create iterators

Michael: or questions

Jelte: or to use an object as a list
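
A small sketch of what such Python-only additions could look like: iteration and list-style access on a wrapped object. The `Name` class below is a hypothetical pure-Python stand-in, not the real wrapper for the C++ name class.

```python
class Name:
    """Hypothetical wrapper that exposes a domain name's labels like a list."""

    def __init__(self, text):
        # Split on dots and drop the empty label left by a trailing dot.
        self._labels = [label for label in text.rstrip(".").split(".") if label]

    def __len__(self):
        return len(self._labels)

    def __getitem__(self, index):
        return self._labels[index]

    def __iter__(self):
        return iter(self._labels)


name = Name("www.example.com.")
```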

Michael: is there any way we can use a template for the r types? That we copy and add things, not that we expect to be automatically generated.

Jelte: I kind of did that here: copied rrclass python and then re-wrote functions as I needed to, and I could probably just rip out the actual code and we'd have a template.

David: a piece of experience - a system in this area has the problem of migration - you need to build over the migration gap when the new server understands it as a discrete class.

Jelte: we have this option in C++ as well, but here I only implemented the highest level of rdata and what is beneath doesn't matter to me. What we lose is specific functionality for specific types.

Shane: what do I get?

Jelte: you get a string or a value

Shane: well, we can see if anyone needs more than that.

Evan: what would DNS_Python do?

Shane: it has a file that declares all of the DNS various names as constants. So you have the classes as a module you import, and one for types, etc. They're symbols not strings.

Evan: what about the structures of the data representations of the types?

Shane: It's actually a little bit cumbersome: you get an attribute within an object which has the values, and you can get down to the point where you have an array, and if you know the right offset you can get the values. But as a developer I could not find the documentation to find that easily.

Jinmei: that is different from my experience. It does have a different class for each RR type; overall structure is a lot like BIND 9.

Shane: when I used it I was doing research into whether contact information in a DNS database was valid. So looking for MX records and converting to address records. The lookup utilities didn't coerce into anything useful.

Michael: that seems weird.

Shane: it was perfectly usable but you had to know where to poke, and the automatically generated PyDoc documentation wasn't easy to work with

Michael: just because C++ does it one way doesn't mean we can't create a class on the fly and make it more featured?

Jelte: true, I've been trying to stick to the API as close as possible but we may not always want to do that

Evan: the big benefit of a class for each RR type is that the object has a method you can use to access individual fields. Python has a dictionary field you could use.
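
To make Evan's comparison concrete, here is a sketch of a per-type class with named fields next to a plain dictionary. The `MX` class and its `from_text` helper are hypothetical; the generic string stands for what a base rdata class might hand back.

```python
from dataclasses import dataclass


@dataclass
class MX:
    """Hypothetical per-type rdata class with named fields."""
    preference: int
    exchange: str

    @classmethod
    def from_text(cls, text):
        preference, exchange = text.split()
        return cls(int(preference), exchange)


generic = "10 mail.example.com."  # what a base rdata class might return
mx = MX.from_text(generic)

# The dictionary alternative carries the same fields without a dedicated class.
as_dict = {"preference": mx.preference, "exchange": mx.exchange}
```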

Jelte: in this case I only did the base rdata class.

Michael: one reason you want to use Python is to get away from the C++ API in some cases. If we can hide the ugliness of C++ with a Python class, that sounds pretty cool.

Jelte: the code I got from Francis also defines a Python wrapping class that wraps a wrapper and can have other cool stuff right there.

Michael: does it have introspection?

Shane: yes, there is a function called dir that gives you all the definitions of one thing.

Shane: this is almost like C code, right?

Jelte: yes, they call it the Python/C API; it just supports C++
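
The built-in `dir` function mentioned in the introspection answer can be tried on any class. The `RRClass` below is a hypothetical pure-Python stand-in for a wrapped class.

```python
class RRClass:
    """Hypothetical wrapped class with one constant and one method."""
    IN = 1

    def to_text(self):
        return "IN"


# dir() lists attribute names in sorted order; drop the underscore-prefixed ones.
public_names = [n for n in dir(RRClass) if not n.startswith("_")]
```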

Shane: Earlier I was questioning how much work this is, to do stuff by hand. We have to have Python wrappers, but we need to decide how to do that. From the managerial point of view we need to decide which takes the least work, both now to set up and in the future to extend and maintain. As the person with the most experience in this, what is your perspective?

Jelte: I only really know the manual one. But in each of the wrappers that are already made there are specific things to hack around

Jelte: if we keep this clean, it's a lot of work once, and we need automated checks to see if the original API changes, and if we ever fall behind we'll never catch up again. While I was working on this, especially the first few days, every few words I would think "we should automate this"

Michael: it will be really hard to automate function overloading

Jelte: and if you really do it, you'll end up with something like Swig or Boost_Python anyway.

Michael: If the tests could be written in a generic way, which could then be used to write a test for C++ and Python, that would help us keep things in sync

Jelte: the r_python tests look a lot like our C++ tests

Michael: converting from a string seems less efficient than converting from a number, say in a case statement. Perl's DNS library and Ruby's DNS library define constants for numerical values. If we did that we could use those and still read it
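
One way to get numeric constants that stay readable, as Michael suggests, is Python's `enum.IntEnum`. This is a sketch of the idea rather than a decision recorded in the meeting; the RR type codes are the standard DNS values, and `describe` is a hypothetical helper.

```python
from enum import IntEnum


class RRType(IntEnum):
    A = 1
    NS = 2
    CNAME = 5
    MX = 15
    AAAA = 28


def describe(rrtype_code):
    """Branch on the numeric value while keeping the code readable."""
    rrtype = RRType(rrtype_code)
    if rrtype == RRType.MX:
        return "mail exchanger"
    if rrtype in (RRType.A, RRType.AAAA):
        return "address"
    return rrtype.name
```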

Shane: I guess my big concern is if we do it by hand, will we get to a point where it is so hard to maintain that we wish we had used the automated tool?

Michael: or the other way, but no matter what we will end up unhappy

Jelte: the least work would be Boost_Python, and the question is do we want this dependency. Swig is better if we want to involve other languages.

Jinmei: it will look much simpler than this

Michael: there is another option which is similar but not quite - (looking at what you give Boost_Python to generate)

Jelte: what it does not give us is flexibility...

Feng: yes you can

Shane: you inherit it? ok that makes sense

Jinmei: I prefer avoiding Boost_Python because it is a difficult interface with Python 3 and it requires a ridiculously long compile time

Jelte: the Boost ParseGenerator has a longer compile time

Michael: not worried about compile times compared to run times, and it won't be as bad as BIND 9.

Jinmei: it takes an hour and a half to compile.

Michael: oh woe. I care now.

Evan: let's not do this then

Michael: it's not surprising me because it's all chained together. It's cool but it's a stack recall every time.

Michael: I'd still like to see what Swig would do, because it has the advantage of giving us free output into other languages.

Michael: if you really can create 4 languages from one file in Swig that's a huge win but it has big disadvantages.

Jelte: it probably comes down to needing to only use a common subset of languages, which might eliminate the value of using different languages in the first place

Michael: Maybe lets make the python stuff a client of C++

Jelte: we could see what functions we use now, implement those, and do it as we go along

Michael: that way if we change later the cost is less

Shane: Swig is an external tool, and we'd have to assign someone to look at that. I suspect there is not a fourth option. But there may be at least one more option

Jinmei: performance really matters here, there is a market here

Michael: suppose it was easier if we refactored the DNS library to make it simpler? If putting restrictions on the libraries we write to make them easier to put into Python was an option, would it help? I am thinking maybe when we write C++ code now, we need to think about how hard it would be to convert to Python

Evan: only libDNS?

Shane: and libDHCP and maybe the interface stuff

Jelte: maybe msgq but maybe not

Michael: no, if we switch to JSON then it's just network I/O.

Jinmei: maybe we provide binary versions of dynamic modules so that they don't have to use it, can that be an option? We compile it.

Shane: yes. good idea.

Jelte: I don't really like us doing packaging, or compiling.

Shane: If we go with the recommendation to write the minimum set, natively, we could use the experiment you've done as a basis, Jelte?

Jelte: yes we could. Actually we could view it this way: Francis's stuff was an experiment and mine came out of it.

Jelte: easier to go from native to boost later, than the other way around.

Decision: Shane: we need to comment out or remove the part we don't need, and expand it just enough for the functionality we need for the xfer code and the zone loader, and then continue as we need. Then take the existing Boost_Python stuff and put that into experiments, in case we go back to it later. We need to do this if not by our next release then the next one.

Decision: Shane: and we need to investigate Swig, not too far into the future

BIND 10 Year One Post-Mortem:

A post-mortem is when you review the project after it is done. The goal is to capture lessons you can learn, repeat the good things, and avoid the bad things. It is actually very important in the larger context of quality assurance. With QA, you need to measure and review everything you do. A post-mortem is how you check to see how the projects are going.

What went well? What needs to be improved?


Communication, Management, Planning, Documentation, Development methodology

Technology Choices, Coding practices, Specific APIs

Y2 release one

What didn't go well

--We had a slow start to it: end of Y1, IETF, burnout; we spent half the 6 weeks recovering (Shane)

--Before the release we didn't have specific goals for the release (Jeremy)

--We didn't have any metrics (Shane)

--lack of actual tasks from backlog (Michael)

-- missed own timeline by one day because of problems with tarball missing tests (Jeremy)

--release day of the week confusion

--I need more coordination with the real time release stuff (Jeremy) - let's set up blocks on jabber for "release buddy"

stopped using daily checkins

What went well

--It was so good when Jeremy came up with the review tickets and he and Shane went through every one and reviewed and assigned them. We're going to do this on a more regular basis (Shane)

--Up front we had a date picked out (Jeremy)

--we actually beat our deadline by two days (Jeremy)

--liked having Jeremy doing the release real time on Jabber (Michael)

-- a lot less work because of the working system (Jelte and Shane) had some simple runtime test scripts

Year One Release

What didn't go well


lack of schedule clarity

need to pay attention to timezones on deadlines (big deadline snuck up) (Jelte)

no documented decision for autoconf and automake versions used (Jeremy)

-- Rob and Dave think this is a huge problem

-- do we care?

-- lets try not to go backwards (Michael)

not all the tests were being included in the tarball until really late (Jeremy)

confusion around PGP signing of code process (Jeremy)

too many changes in the last 36 hours

too many suggestions in the last 36 hours

rushing at the end

release engineer's system is too slow (Jeremy) (note Shane has approved new desktop for Jeremy)

readmes need rework/review

releng had to work all night long before the deadline

release process doesn't have enough detail - even though he didn't use it

didn't have verification procedure

didn't have runtime tests, had to do it manually

we shipped xfrin secretly without documentation and it didn't really work (but no one complained)

we don't define release related bugs, what does that mean?

need version numbering clarification and schema for trunk, etc. (and include subversion tag from root directory)

xfrout was missing documentation

no review process for xfrout

Jeremy created manual changelog entries but only got a little feedback right beforehand. He didn't ask anyone specifically, but he also didn't get any review.

our changelogs are written for us - do we want to summarize and make them public friendly, or.... what?

need to make a customer oriented changes summary

not enough testers of BIND 10 as a whole

test failures not always handled quickly

failures lingering can mask other problems

no generic DNS server testing (system level tests)

don't have distributor relationships yet - a few are ad hoc - Jeremy has invited people to lists and wiki

need binary packaging procedures and we don't have a timeline for it yet


not always knowing what each person's responsibilities are is not efficient

a lot of procedures are still unclear

-- subversion procedures

-- tools

-- review

-- ticket process

who owns what?

need individual goals

when you finish one task, what comes next?

we had a major bootstrapping problem (fuzzy front end)

unclear deadlines

who assigns tickets?

being unable to reply to bug mail from trac was confusing (learning trac email system confusing)

Communication - 8 months in, challenging period - daily checkins helped

internal and external distractions

resource issues/fire fighting and BIND 9

easy to get sucked into never ending BIND 9 task

issue with international language folks on phone and being able to understand

more thorough action item tracking out of the call


we don't have anything like documentation coverage testing

will the examples for year one work for year two?

it's not clear what level of docs are expected

developer docs have fallen to the side

need documentation as an ongoing deliverable

Development Methodology

moving from waterfall to scrum

did we check in on how long things took compared to the plan? not really.

Face to face

getting agenda sooner?

focus on what we need to do instead of brainstorming?

What went well

we made our release date even after our resource challenge, 2 weeks early even (fantastic)

had a documented release process (need even more documentation)

Evan and Mark walked Jeremy through releng procedure as needed (thank you)

bouncing from reviewer to reviewee was successful

daily check ins helped a LOT - always minutes on the wiki

we use our communication tools about as well as we can

feature backlog is a big improvement

weekly call is excellent in general

waterfall development model worked as well as it ever does

Face to face: splitting day between identifying and fixing problems is good

some benefit to splitting locations between continents of participants

some people don't want more travel, some do

meeting at ISC is cheaper usually


we really really really need to deal with the backlog

we're about 3 hours behind schedule

not so bad because of thursday/friday free

Feature backlog review (

shane: the idea here is that when you have a new need or idea, you add it to the backlog, and then periodically you go through and work on prioritizing the backlog.

tool issue: what will we use? a spreadsheet? the twiki? (the markup is impossible) pivotal? MS project? (poop!) openoffice? (open source... poop?)

Perhaps Michael will write us a tool

AI: Larissa to review tools for this issue

Sticky notes of all the open items on the backlog, known issues list, and bug tickets

starting with the 49 backlog items

1 Ticket #231 full edns0 support

description: in the C++ library; one week FTE + review; owner: shane; developer: Jinmei

2 move data source out of auth server

description: the code for the data source is a library now. this one is DONE! w00t!

3 Ticket #232 write access to data source --> needs further investigation

description: design the API and implement it, with versioning, concurrency, and which features are supported where; time estimate: one month; "too hard"

4 Trac Ticket #233 loader using data source

description: we would like the loader to write through the data source like DDNS or IXFR would; dependency: on write access to data source; "too hard"

5 xfrin should use the data source

"too hard"

6 xferout should use the data source

needs to iterate the database and write relevant API pieces; time estimate: 3 days FTE; "too hard"

  1. ASIO work

description: changing from Boost ASIO to using separate ASIO and then removing custom TCP code time estimate: 2 weeks

  1. ACL for queries

description: standalone (need for xfrin, xfrout, allow updates, view selection(?), recursive queries) time estimate:

  1. BDB style Datasource

including the config file and tests time estimate: 2 weeks FTE priority: medium low developer: Evan

  1. In Memory Database --> need more thoughts

"too hard"

  1. statistics reporting
  1. Postgres Database


  1. MySQL database


  1. administrator roles

such as bind-ctrl, cmnd-ctrl, groups, roles, etc 4 weeks Likun

  1. network interface monitoring component

listens for interfaces becoming available or going away and then sends a message

  1. Per component use case review for administrators



config mgr





2 days

also note that going forward we want modules to have use cases/requirements

  1. parse crash test for real world data

set up a framework scope: crash test only, zone contents, query trace could someone else run this for us? SIDN can do the other part 5 day max for small scope version

  1. control multiple auth daemons

multicore support "too hard"

  1. b10 auth man page - done
  1. b10 man page - done
  1. b10 c&c - man page
  1. maketargets to generate man pages

need top level maketarget 1 day lower priority

  1. complete host tool feature

(incl. man page) low priority for y2

  1. subset of 23
  1. b10 loadzone or python module fixed (bug report)

one day

  1. b10 loadzone should have option to choose data source type - handled in 4.
  1. spec files to be documented as features --> closed.
  1. usersguide.xml

1 day jeremy

  1. initial performance tests for auth server

output should be a benchmarking suite database generates a webpage with graphs to compare. QPS performance automated 2 weeks (to be added to incrementally throughout development)

  1. profiling for the auth server

tool to run to see where the time is going find the hot spots and document them as tasks. need to discuss whether or not to automate "too hard"

  1. perf testing for xfrin

across backends start with basic setup, 2 servers and small, medium, large zones, adding backends as we get them (and server versions) can't use cucumber 2 weeks

  1. perf testing for xfrout

"" with xfrout 1 week

  1. perf testing for loadzone

"" with loadzone 1 week

  1. stopped notification


  1. 1 week with testing

"too hard"

  1. Core Component Crash Protection

2 weeks

  1. log framework

start with one python module 1 week max

  1. b10 start auth server drop privs asap

3 days

  1. UNIX domain socket for msgq

1 week

  1. Windows port.

--> pending further discussion

  1. IXFR in

"too hard"

  1. IXFR out

"too hard"

  1. port selection to specify one address to listen to a service

(procedure for using services available now, but other services can be plugged in later) 2 days

  1. hostname.bind, id.server

disable any of these, set them to a specific value WWB9D depends on EDNS0 support 2 weeks

  1. consistent --version (Trac #166)- for 3rd incremental release

every tool should have two versions it returns, the version of b10 and the version of the tool

  1. bind10-info - returns "showtech" type information, OS version, --version, etc

1 week

  1. bind10-conf feature in bind-ctl - Similar to postfix's postconf

set a value and get a value basics in a few days 1.5 weeks full support

  1. configure at end

one day

  1. MacOS binary package

low priority (maybe combine with other binary packaging stuff)

  1. python bindings for datasource API --> need more discussion
  1. CC Channel ASIO

pretty easy (2 weeks with tests); depends on learning ASIO

  1. access control configuration architecture

normal boolean operations? investigate compatibility with BIND 9? "too hard"

  1. ACL architecture

"too hard"

  1. ACL object

"too hard"

  1. use JSON

"too hard"

  1. timeout on the blocking rate of CC msg

1 week

non captive data source ---> to be investigated.

  1. document collection spec

Jelte time estimate:

  1. stat collector daemon

how does it collect and how often does it collect them Fujiwara time estimate: priority: need more input

  1. stat reporter using bind-ctrl

not an API, just functionality and pretty printing 1 week Fujiwara depends upon the collector daemon (or parallel)

  1. stat reporter using XML/HTTP
  1. stat reporting over DNS

3 week task

  1. SNMP

plugin for SNMPd 3 weeks Shawn

63 DDNS "too hard"

  1. Create predefined roles for system bootstrap
  1. NSID support

NSID is a EDNS0 based server identifier

  1. port creator

4 days not Jelte

  1. per component use case review for developers

2 days

  1. SIDN black box testing
  1. hot cache experiment

for all queries simple full query cache with a hash lookup and then check performance Michael one week

  1. BoB to stop/start modules based on user commands

include configuration and restart commands 2 weeks

  1. CC Channel use JSON

"too hard"

  1. hankins? or a public call to the dev list?
  1. lwresd implementation

part of BIND 10 or separate? (small caching resolver)

73 convert a python module over to log4python (xfer-in or xfer-out or cmdctl daemon or...)

74 lwresd compatible thingy

75 rename libauth to libdatasrc (1 day)

76 configuration for core system components (command line, environment vars, or file) (1 week)

77 message-library: TSIG support (given 78) (2 week)

78 complete dns message library (which may require rewriting, redesign, or refactoring) and add docs (2 weeks)

79 (trac#184) b10 config manager code coverage and test completion (3 days)

80 add value/type check according to module specification. (need to decide if this is done already or if the API is not used) (2 days)

81 cmdctl: HTTP return codes for RESTful API (1 week)

82 cmdctl: code coverage and test completion (3 days)

83 cmdctl: improve error messages (1 day)

84 cmdctl: choose different default port number (1 day)

85 loadzone: check zone correctness and report loudly of errors/warnings during loading (unknown time)

86 loadzone: $INCLUDE fixes. Support optional (origin, comment) and $TTL (comment) (RFC compliance) (1 day)

87 loadzone: $GENERATE support (1 week)

88 loadzone: test suite of different file formats to load (1 week)

89 loadzone: verbose option to exactly what is happening with name cleanup based on origin, and %done while running (3 days)

90 xfrin: "refresh" zone (check and retransfer if needed) (through zone manager) (2 days) "too hard"

91 Implement zone maintenance daemon functionality (unknown time) "too hard"

92 Trac#179 xfrin: test code coverage, write code to handle errors (2 weeks)

93 Ticket #230 BoB: do not start other processes until specific processes are started (sequenced startup) (3 days)

94 base32: improve, test, document, and check for security issues, and move into its own library (1 week)

95 sort out where misc. functions and classes live (which library) (Larissa and Shane to decide)

96 xferin: TSIG support (2 weeks)

97 xferout: TSIG support (2 weeks)

98 trac#150 message-library: TXT rdata is incomplete: multiple strings: it needs to fully support the TXT specification. (3 days)

99 web-based configuration, control, and status (undefined time)

100 trac#17 exception handling

101 trac#17 micro-benchmarks for name construction

102 trac#24 split python "binaries" from a file into a library with a small wrapper to call it. (1 week)

103 trac#49 Name::split(n) (3 days)

104 trac#57 compareNames() (name comparison API) (1 week)

105 cmdctl: generate certificate and private key on install, and set permissions correctly (1 week)

106 trac#68 install headers (1 day)

107 trac#77 catch exception from database layer when loading an invalid schema (1 day)(3rd release or later)

108 Handle schema upgrades by providing a one-way migration from old to new (2 weeks)

109 trac#109 auth: better error message when cannot open socket (1 day)

110 trac#110 BoB: clean up even if start up fails

111 trac#124 Element class should have a way to make an "empty" item (arrays, lists) (2 days)

112 trac#134 xferout not dying in time, so killed via SIGKILL. (2 weeks)

113 trac#154 generate changelog from svn (1 day)

114 recurser: timeline analysis of recursive query validation (3 days)

116 recurser: packet sender/tracker (2 weeks)

117 recurser: augment b10-host (2 days)

118 recurser: benchmarks (2 weeks) (3rd release or later)

119 recurser: architecture and design "too hard"

120 recurser: DNS cache class design - hash table - no timer or LRU (1 week)

121 write straw man proposal for writeable data source (3 days) (Trac Ticket # 173)

122 notify recv (2 weeks)

123 python wrappers to support xfr (2 weeks)
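
The "hot cache experiment" item in the backlog above describes a simple full query cache with a hash lookup. A minimal sketch of that idea follows, assuming the key is the (name, type, class) question tuple and the stored value is an opaque response; the `HotCache` class and its counters are hypothetical and not part of any BIND 10 code.

```python
class HotCache:
    """Hypothetical full-response cache keyed by the question tuple."""

    def __init__(self):
        self._answers = {}  # hash table: (qname, qtype, qclass) -> response
        self.hits = 0
        self.misses = 0

    def lookup(self, qname, qtype, qclass):
        # DNS names compare case-insensitively, so normalise the key.
        key = (qname.lower(), qtype, qclass)
        answer = self._answers.get(key)
        if answer is None:
            self.misses += 1
        else:
            self.hits += 1
        return answer

    def store(self, qname, qtype, qclass, answer):
        self._answers[(qname.lower(), qtype, qclass)] = answer
```

The hit and miss counters are there because the backlog item asks to check performance afterwards; no timer or eviction is included in this sketch.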

Last modified on Jun 7, 2010, 9:38:32 PM