Opened 7 years ago

Closed 7 years ago

#2856 closed task (fixed)

memory manager initialization

Reported by: jinmei Owned by: muks
Priority: medium Milestone: Sprint-20130806
Component: shmem manager Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: shared memory data source
Estimated Difficulty: 4 Add Hours to Ticket: 0
Total Hours: 4.99 Internal?: no

Description (last modified by jinmei)

Subtask of #2830. Depend on #2854, #2855, #2993.

(Revised, based on more details of resulting #2854, although the
details may even change more after review).

First, this is how the entire memmgr will look like after the series
of related tasks, not only this particular ticket:

  • (the base class of) SegmentInfo will have the __state attribute. This is independent from the underlying segment type ("mapped", etc), and can take the following 4 values:
    • UPDATING: the segment is being updated (by the builder thread, although SegmentInfo won't care about this level of details).
    • SYNCHRONIZING: one pair of underlying segments has been updated, and readers are now migrating to the updated version of the segment.
    • COPYING: all readers that used the old version of segment have been migrated to the updated version, and the old segment is now being updated.
    • READY: both segments of the pair have been updated. it can now handle further updates (e.g., from xfrin).
  • __state is private to SegmentInfo; but maybe we provide read-only access for tests
  • To manage state transition, SegmentInfo will have some more attributes and methods:
    • __readers (set of 'reader_session_id'): private to SegmentInfo. It consists of the (ID of) reader modules that are using the "current" reader version of the segment.
    • __old_readers (set of 'reader_session_id'): private to SegmentInfo for write (update), but publicly readable. It can be non empty only in the SYNCHRONIZING state, and consists of (ID of) reader modules that are using the old version of the segments (and have to migrate to the updated version).
    • __events (FIFO queue of opaque data): queue for pending update events. update events can come at any timing (e.g., after xfr-in), but can be only handled if SegmentInfo is in the READY state. This maintain such pending events in the order of coming. SegmentInfo doesn't have to know the details of the stored data; it only matters for the memmgr.
    • complete_update(): called when memmgr is notified from the builder of the completion of segment update. change the state from UPDATING to SYNCHRONIZING, and COPYING to READY. In the former case, elements of __readers are moved to __old_readers. and, if __old_readers is empty, it pops the head (oldest) event from __events and return it. error if it's called in other states than UPDATING and COPYING.
    • sync_reader(reader_session_id): can only be called in the SYNCHRONIZING state. memmgr calls it when it receives segment_update_ack message from a reader module. It moves the given ID from __old_readers to __readers, and if __old_readers becomes empty, change the state to COPYING. If the state has changed to COPYING, it pops the head (oldest) event from __events and return it; otherwise it returns None.
    • add_event(event_data): called by memmgr when it receives a request for reloading a zone, and simply append the given data to __events.
    • start_update(): if the state is READY and __events is non empty, changes the state to UPDATING and returns the head (oldest) event (not popping it). This tells the caller (memmgr) should initiate the update process with the builder. In all other cases it returns None.
    • add_reader(reader_session_id): called by memmgr when it first gets the pre-existing readers or when it's notified of a new reader. It simply adds the given ID to __readers. no state transition happens.
    • remove_reader(reader_session_id): called by memmgr when it's notified that an existing reader has unsubscribed. It removes the given reader from either __readers or __old_readers (wherever the reader belonged), and in the latter case, if __old_readers becomes empty, change the state to COPYING (the state should have been SYNCHRONIZING), pops the head (oldest) event from __events and return it.

A summarized (and simplified) state transition diagram would be as
follows:

                                            +--sync_reader()/remove_reader()
                                            |  still have old readers
                                            |          |
            UPDATING-----complete_--->SYNCHRONIZING<---+
              ^          update()           |
start_update()|                             | sync_reader()/remove_reader()
events        |                             V no more old reader
exist       READY<------complete_----------COPYING
                        update()

On construction, SegmentInfo object is in the READY state, with
__readers, __old_readers, __events being all empty.

Commands (elements of __events) passed from the main memmgr thread
to the builder would be a tuple (or whatever, this is just a
conceptual example):

('load', zone_name, dsrc_info, rrclass, dsrc_name)

where

  • zone_name is None or isc.dns.Name, specifying the zone name to load. If it's None, it means all zones to be cached in the specified data source (used for initialization)
  • dsrc_info is a DataSrcInfo? object corresponding to the generation ID of the set of data sources for this loading.
  • rrclass is isc.dns.RRClass object, the RR class of the data source.
  • dsrc_name is a string, specifying a data source name.

The builder thread tells the main memmgr thread when it completes a
given load command with the following parameter (again, this is just a
conceptual example):

('load-completed', dsrc_info, rrclass, dsrc_name)

where dsrc_info, rrclass and dsrc_name are the same values of those in the
corresponding command.

In this particular ticket, we do the following:

  • Extend SegmentInfo, adding the new attributes and methods (even though not all of them are needed right now).
  • Then extend the MemorySegmentBuilder class (introduced in #2855) for the 'load' command. It will look like as follows:
        def __handle_load(self, zname, dsrc_info, rrclass, dsrc_name):
            clist = dsrc_info.clients_map[rrclass]
            sgmt_info = dsrc_info.segmet_info_map[(rrclass, dsrc_name)]
            clist.reset_memory_segment(dsrc_name, READ_WRITE,
                                       sgmt_info.get_reset_param(WRITER))
    
            if zname is not None:
                zones = [(None, zname)]
            else:
                zones = clist.get_zone_table_accessor(dsrc_name, True)
    
            for _, zname in zones:
                # Note: until #2993 is completed we cannot use this extension
                cache_load_error = (zname is None) # install empty zone initially
                writer = clist.get_cached_zone_writer(zname, dsrc_name,
                                                      cache_load_error)
                try:
                    error = writer.load()
                    if error is not None:
                        # log the error
                        continue
                except Exception:
                    # log it
                    continue
                writer.install()
                writer.cleanup()
    
            # need to reset the segment so readers can read it (note:
            # memmgr itself doesn't have to keep it open, but there's currently
            # no public API to just clear the segment)
            clist.reset_memory_segment(dsrc_name, READ_ONLY,
                                       sgmt_info.get_reset_param(WRITER))
    
            with self.__lock_update_queue:
                self.__update_queue.append(('load-completed', dsrc_info, rrclass,
                                            dsrc_name))
            self.__sock_to_main.send(<some_data>)
    

Subtickets

Change History (21)

comment:1 Changed 7 years ago by jinmei

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

comment:2 Changed 7 years ago by jinmei

  • Description modified (diff)

comment:3 Changed 7 years ago by jinmei

  • Description modified (diff)

comment:4 Changed 7 years ago by muks

  • Milestone set to Sprint-20130625

comment:5 Changed 7 years ago by jinmei

  • Description modified (diff)

comment:6 Changed 7 years ago by jinmei

  • Description modified (diff)

comment:7 Changed 7 years ago by jinmei

  • Description modified (diff)

comment:8 Changed 7 years ago by muks

  • Owner set to muks
  • Status changed from new to assigned

Picking

comment:9 Changed 7 years ago by muks

  • Owner changed from muks to UnAssigned
  • Status changed from assigned to reviewing

This ticket is up for review.

  • Please review the tests for behaviour of various SegmentInfo methods in different states carefully.
  • Please check documentation as other developers will read this in the future.
  • I have left the logging calls out of __handle_load() for now with FIXMEs, as I don't know how this code will end up in the bigger picture. I guess these should be logged in some corresponding memmgr ticket (they'll be simple logging calls).

comment:10 Changed 7 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:11 follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

First, the scope of the ticket seems strange to me. There seems to be an update to builder part and also to the SegmentInfo class. But I don't think these are directly related (eg. there'll be something in between connecting them together, which is not there yet). I don't think the ticket mentions that, but it makes me ask why it is the case. Or, it is possible I look incorrectly into the code.

Also, the semantics of the SegmentInfo class seems strange to me. It looks like a lot of different methods return the event to be handled. So the caller will have to check the return value in like 5 different places.

The distcheck fails:

ERROR: files left in build directory after distclean:
./src/lib/python/isc/memmgr/tests/zone-IN-1-MasterFiles-mapped.0
make[1]: *** [distcleancheck] Error 1
make[1]: Leaving directory `/tmp/bind10/bind10-3/bind10-20130529/_build'

Should this at least log the error?

    def __handle_bad_command(self):
        # A bad command was received. Raising an exception is not useful
        # in this case as we are likely running in a different thread
        # from the main thread which would need to be notified. Instead
        # return this in the response queue.
        self._response_queue.append(('bad_command',))
        self._shutdown = True

I don't know if it's a good idea to leave the FIXME: log error in the code. It still seems better to log at least something and leave fixup of the messages for later, than not log at all.

Do we really start at READY in the SegmentInfo class? Won't we start by loading something and sending updates to the clients?

I don't see what is meant by „publicly available“ here:

        # __old_readers is a set of 'reader_session_id' private to
        # SegmentInfo for write (update), but publicly readable. It can
        # be non empty only in the SYNCHRONIZING state, and consists of
        # (ID of) reader modules that are using the old version of the
        # segments (and have to migrate to the updated version).
        self.__old_readers = set()

There's a class that does a queue effectively in python (sorry, I don't remember the name now). When using list, it needs to copy all the items to the left when removed from the head.

You seem to check for emptiness with len at many places. This looks ineffective, because the whole length must be computed. I believe all well-behaved containers in python can be directly used in the condition to mean non-emptiness.

I'm little bit worried about the case when there's no update and no reader either. It seems it'll get to the SYNCHRONIZING state. But as there's no reader, no reader will ever move from the old readers, so nothing will trigger the check for the emptiness of that set. It seems like the manager will get stuck in that state forever.

    def complete_update(self):
        """This method should be called when memmgr is notified by the
        builder of the completion of segment update. It changes the
        state from UPDATING to SYNCHRONIZING, and COPYING to READY. In
        the former case, set of reader modules that are using the
        "current" reader version of the segment are moved to the set
        that are using an "old" version of segment. If there are no such
        readers using the "old" version of segment, it pops the head
        (oldest) event from the pending events queue and returns it. It
        is an error if this method is called in other states than
        UPDATING and COPYING."""
        if self.__state == self.UPDATING:
            self.__state = self.SYNCHRONIZING
            self.__old_readers.update(self.__readers)
            self.__readers.clear()
            return self.__sync_reader_helper(self.SYNCHRONIZING)
        elif self.__state == self.COPYING:
            self.__state = self.READY
            return None
        else:
            raise SegmentInfoError('complete_update() called in ' +
                                   'incorrect state: ' + str(self.__state))

Why is the switching of __readers and __old_readers so complicated? Won't this be simpler and also faster?

self.__old_readers = self.__readers
self.__readers = set()

Why can't remove_reader happen any time? I don't see a reason why the auth server couldn't terminate or crash at any time it finds appropriate. Will the manager be required to buffer the unsubscriptions?

The sync_reader and remove_reader seem to be quite similar. Could they be somewhat unified or share some of their code?

Looking at this, it makes me wonder if it makes sense to test anything at all on the memory manager if the shared memory is not available. Would it make sense to completely exclude the memory manager from build then?

    @unittest.skipIf(os.environ['HAVE_SHARED_MEMORY'] != 'yes',
                     'shared memory is not available')

Would it be better to assertEqual for 1, or even better, for concrete value?

        self.assertNotEqual(len(self.__sgmt_info.get_events()), 0)

Why does it use set() sometimes and {…} sometimes? Why do we test that python doesn't care about order of values in set?

    def test_add_reader(self):
        self.assertSetEqual(self.__sgmt_info.get_readers(), set())
        self.assertSetEqual(self.__sgmt_info.get_old_readers(), set())
        self.__sgmt_info.add_reader(1)

        self.assertSetEqual(self.__sgmt_info.get_readers(), {1})
        self.__sgmt_info.add_reader(3)
        self.assertSetEqual(self.__sgmt_info.get_readers(), {1, 3})
        # ordering doesn't matter in sets
        self.__sgmt_info.add_reader(2)
        self.assertSetEqual(self.__sgmt_info.get_readers(), {1, 2, 3})
        self.assertSetEqual(self.__sgmt_info.get_readers(), {1, 3, 2})

Is it really allowed to call start_update any time?

comment:12 in reply to: ↑ 11 ; follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Hi Michal

Replying to vorner:

First, the scope of the ticket seems strange to me. There seems to be
an update to builder part and also to the SegmentInfo class. But I
don't think these are directly related (eg. there'll be something in
between connecting them together, which is not there yet). I don't
think the ticket mentions that, but it makes me ask why it is the
case. Or, it is possible I look incorrectly into the code.

There isn't anything in implementation connecting them. But I guess this
was included as ground work by Jinmei.

Also, the semantics of the SegmentInfo class seems strange to me. It
looks like a lot of different methods return the event to be
handled. So the caller will have to check the return value in like 5
different places.

I didn't follow the last sentence.

The distcheck fails:

ERROR: files left in build directory after distclean:
./src/lib/python/isc/memmgr/tests/zone-IN-1-MasterFiles-mapped.0
make[1]: *** [distcleancheck] Error 1
make[1]: Leaving directory `/tmp/bind10/bind10-3/bind10-20130529/_build'

It didn't cleanup the mapped file after the test. The fixture has been
updated to do that now during teardown.

Should this at least log the error?

    def __handle_bad_command(self):
        # A bad command was received. Raising an exception is not useful
        # in this case as we are likely running in a different thread
        # from the main thread which would need to be notified. Instead
        # return this in the response queue.
        self._response_queue.append(('bad_command',))
        self._shutdown = True

This has been updated to log the error.

I don't know if it's a good idea to leave the FIXME: log error in
the code. It still seems better to log at least something and leave
fixup of the messages for later, than not log at all.

Suggestion taken and logging has been added.

Do we really start at READY in the SegmentInfo class? Won't we
start by loading something and sending updates to the clients?

This is what the ticket explicitly specifies. I guess it's ok
considering we can start by start_update().

I don't see what is meant by „publicly available“ here:

        # __old_readers is a set of 'reader_session_id' private to
        # SegmentInfo for write (update), but publicly readable. It can
        # be non empty only in the SYNCHRONIZING state, and consists of
        # (ID of) reader modules that are using the old version of the
        # segments (and have to migrate to the updated version).
        self.__old_readers = set()

If you mean the phrase "publicly readable", this means that there is a
getter, but there's no way to set it from outside the class.

There's a class that does a queue effectively in python (sorry, I
don't remember the name now). When using list, it needs to copy all
the items to the left when removed from the head.

I am not able to find such a class. If you remember the name, please let
me know. The queue.Queue class is a producer-consumer queue.

You seem to check for emptiness with len at many places. This looks
ineffective, because the whole length must be computed. I believe all
well-behaved containers in python can be directly used in the
condition to mean non-emptiness.

OK I have modified the code to test the variables directly for truth.

I'm little bit worried about the case when there's no update and no
reader either. It seems it'll get to the SYNCHRONIZING state. But as
there's no reader, no reader will ever move from the old readers, so
nothing will trigger the check for the emptiness of that set. It seems
like the manager will get stuck in that state forever.

    def complete_update(self):
        """This method should be called when memmgr is notified by the
        builder of the completion of segment update. It changes the
        state from UPDATING to SYNCHRONIZING, and COPYING to READY. In
        the former case, set of reader modules that are using the
        "current" reader version of the segment are moved to the set
        that are using an "old" version of segment. If there are no such
        readers using the "old" version of segment, it pops the head
        (oldest) event from the pending events queue and returns it. It
        is an error if this method is called in other states than
        UPDATING and COPYING."""
        if self.__state == self.UPDATING:
            self.__state = self.SYNCHRONIZING
            self.__old_readers.update(self.__readers)
            self.__readers.clear()
            return self.__sync_reader_helper(self.SYNCHRONIZING)
        elif self.__state == self.COPYING:
            self.__state = self.READY
            return None
        else:
            raise SegmentInfoError('complete_update() called in ' +
                                   'incorrect state: ' + str(self.__state))

What do you suggest we do for it?

Why is the switching of __readers and __old_readers so complicated? Won't this be simpler and also faster?

self.__old_readers = self.__readers
self.__readers = set()

The current code performs a set union. I'm not sure if the assignment
does the same. Does it?

Why can't remove_reader happen any time? I don't see a reason why
the auth server couldn't terminate or crash at any time it finds
appropriate. Will the manager be required to buffer the
unsubscriptions?

Hmmm... shall we allow this then? I implemented the description to the
letter, but didn't consider the bigger picture in this particular
ticket.

The sync_reader and remove_reader seem to be quite similar. Could
they be somewhat unified or share some of their code?

It could be, but they differ in small ways. I think such unification
would make it harder to follow.

Looking at this, it makes me wonder if it makes sense to test anything
at all on the memory manager if the shared memory is not
available. Would it make sense to completely exclude the memory
manager from build then?

    @unittest.skipIf(os.environ['HAVE_SHARED_MEMORY'] != 'yes',
                     'shared memory is not available')

I think we still allow local segments to the memory manager. There is
another ticket about it, but this is something we can do afterwards.

Would it be better to assertEqual for 1, or even better, for concrete value?

        self.assertNotEqual(len(self.__sgmt_info.get_events()), 0)

This is done too.

Why does it use set() sometimes and {…} sometimes? Why do we test that python doesn't care about order of values in set?

    def test_add_reader(self):
        self.assertSetEqual(self.__sgmt_info.get_readers(), set())
        self.assertSetEqual(self.__sgmt_info.get_old_readers(), set())
        self.__sgmt_info.add_reader(1)

        self.assertSetEqual(self.__sgmt_info.get_readers(), {1})
        self.__sgmt_info.add_reader(3)
        self.assertSetEqual(self.__sgmt_info.get_readers(), {1, 3})
        # ordering doesn't matter in sets
        self.__sgmt_info.add_reader(2)
        self.assertSetEqual(self.__sgmt_info.get_readers(), {1, 2, 3})
        self.assertSetEqual(self.__sgmt_info.get_readers(), {1, 3, 2})

{} is syntax for an empty dict.

Is it really allowed to call start_update any time?

I thought it was weird too, but thought maybe I misunderstood the
description. Shall I update it?

Did all the documentation look OK to you?

comment:13 in reply to: ↑ 12 ; follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

Also, the semantics of the SegmentInfo class seems strange to me. It
looks like a lot of different methods return the event to be
handled. So the caller will have to check the return value in like 5
different places.

I didn't follow the last sentence.

These methods:

  • complete_update()
  • sync_reader()
  • start_update()
  • remove_reader()

Whenever they figure out a state change is needed, they return it as their result and pop from the queue. Or they return None.

The caller then needs to check the return value of each of these methods, but probably handle it the same way. It doesn't sound like convenient API of the class.

I know it is described in the ticket this way, though.

There's a class that does a queue effectively in python (sorry, I
don't remember the name now). When using list, it needs to copy all
the items to the left when removed from the head.

I am not able to find such a class. If you remember the name, please let
me know. The queue.Queue class is a producer-consumer queue.

collections.deque

I'm little bit worried about the case when there's no update and no
reader either. It seems it'll get to the SYNCHRONIZING state. But as
there's no reader, no reader will ever move from the old readers, so
nothing will trigger the check for the emptiness of that set. It seems
like the manager will get stuck in that state forever.

    def complete_update(self):
        """This method should be called when memmgr is notified by the
        builder of the completion of segment update. It changes the
        state from UPDATING to SYNCHRONIZING, and COPYING to READY. In
        the former case, set of reader modules that are using the
        "current" reader version of the segment are moved to the set
        that are using an "old" version of segment. If there are no such
        readers using the "old" version of segment, it pops the head
        (oldest) event from the pending events queue and returns it. It
        is an error if this method is called in other states than
        UPDATING and COPYING."""
        if self.__state == self.UPDATING:
            self.__state = self.SYNCHRONIZING
            self.__old_readers.update(self.__readers)
            self.__readers.clear()
            return self.__sync_reader_helper(self.SYNCHRONIZING)
        elif self.__state == self.COPYING:
            self.__state = self.READY
            return None
        else:
            raise SegmentInfoError('complete_update() called in ' +
                                   'incorrect state: ' + str(self.__state))

What do you suggest we do for it?

Skip the SYNCHRONIZING state directly?

Why is the switching of __readers and __old_readers so complicated? Won't this be simpler and also faster?

self.__old_readers = self.__readers
self.__readers = set()

The current code performs a set union. I'm not sure if the assignment
does the same. Does it?

No, it overwrites. But, at the moment when it happens, isn't the target empty?

Why can't remove_reader happen any time? I don't see a reason why
the auth server couldn't terminate or crash at any time it finds
appropriate. Will the manager be required to buffer the
unsubscriptions?

Hmmm... shall we allow this then? I implemented the description to the
letter, but didn't consider the bigger picture in this particular
ticket.

I think we should allow it.

Looking at this, it makes me wonder if it makes sense to test anything
at all on the memory manager if the shared memory is not
available. Would it make sense to completely exclude the memory
manager from build then?

    @unittest.skipIf(os.environ['HAVE_SHARED_MEMORY'] != 'yes',
                     'shared memory is not available')

I think we still allow local segments to the memory manager. There is
another ticket about it, but this is something we can do afterwards.

Well, we didn't manage to get rid of local segments yet. But the usefulness of memmgr without the shared memory is just completely zero, because the local segments are just local in there.

Is it really allowed to call start_update any time?

I thought it was weird too, but thought maybe I misunderstood the
description. Shall I update it?

I don't know. I don't have completely clear and detailed view of how the rest will look like, but I think the operation makes no sense in many contexts, so we probably should disallow it then.

Did all the documentation look OK to you?

I think it looked OK.

Also, the tests fail for me:

EE.E
======================================================================
ERROR: test_configure (__main__.TestMemmgr)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 120, in test_configure
    parse_answer(self.__mgr._config_handler({})))
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 76, in _config_handler
    logger.debug(logger.DBGLVL_TRACE_BASIC, MEMMGR_CONFIG_UPDATE)
NameError: global name 'MEMMGR_CONFIG_UPDATE' is not defined

======================================================================
ERROR: test_datasrc_config_handler (__main__.TestMemmgr)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 178, in test_datasrc_config_handler
    self.__mgr._datasrc_config_handler({}, cfg_data)
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 208, in _datasrc_config_handler
    logger.info(MEMMGR_DATASRC_RECONFIGURED, genid)
NameError: global name 'MEMMGR_DATASRC_RECONFIGURED' is not defined

======================================================================
ERROR: test_setup_module (__main__.TestMemmgr)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 181, in _setup_module
    'data_sources', self._datasrc_config_handler)
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 41, in add_remote_config_by_name
    raise self.add_remote_exception
isc.config.ccsession.ModuleCCSessionError: faked exception

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 163, in test_setup_module
    self.__mgr._setup_module)
  File "/usr/lib64/python3.3/unittest/case.py", line 570, in assertRaises
    return context.handle('assertRaises', callableObj, args, kwargs)
  File "/usr/lib64/python3.3/unittest/case.py", line 135, in handle
    callable_obj(*args, **kwargs)
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 183, in _setup_module
    logger.error(MEMMGR_NO_DATASRC_CONF, ex)
NameError: global name 'MEMMGR_NO_DATASRC_CONF' is not defined

----------------------------------------------------------------------

And, the copyright year in isc/memmgr/logger.py is wrong.

comment:14 in reply to: ↑ 13 ; follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Hi Michal

Replying to vorner:

Hello

Replying to muks:

Also, the semantics of the SegmentInfo class seems strange to me. It
looks like a lot of different methods return the event to be
handled. So the caller will have to check the return value in like 5
different places.

I didn't follow the last sentence.

These methods:

  • complete_update()
  • sync_reader()
  • start_update()
  • remove_reader()

Whenever they figure out a state change is needed, they return it as
their result and pop from the queue. Or they return None.

The caller then needs to check the return value of each of these
methods, but probably handle it the same way. It doesn't sound like
convenient API of the class.

I know it is described in the ticket this way, though.

I guess we'll know better how to handle this once we start using these
methods in memmgr.

There's a class that does a queue effectively in python (sorry, I
don't remember the name now). When using list, it needs to copy all
the items to the left when removed from the head.

I am not able to find such a class. If you remember the name, please let
me know. The queue.Queue class is a producer-consumer queue.

collections.deque

The code has been updated to use the deque now.

I'm little bit worried about the case when there's no update and no
reader either. It seems it'll get to the SYNCHRONIZING state. But as
there's no reader, no reader will ever move from the old readers, so
nothing will trigger the check for the emptiness of that set. It seems
like the manager will get stuck in that state forever.

    def complete_update(self):
        """This method should be called when memmgr is notified by the
        builder of the completion of segment update. It changes the
        state from UPDATING to SYNCHRONIZING, and COPYING to READY. In
        the former case, set of reader modules that are using the
        "current" reader version of the segment are moved to the set
        that are using an "old" version of segment. If there are no such
        readers using the "old" version of segment, it pops the head
        (oldest) event from the pending events queue and returns it. It
        is an error if this method is called in other states than
        UPDATING and COPYING."""
        if self.__state == self.UPDATING:
            self.__state = self.SYNCHRONIZING
            self.__old_readers.update(self.__readers)
            self.__readers.clear()
            return self.__sync_reader_helper(self.SYNCHRONIZING)
        elif self.__state == self.COPYING:
            self.__state = self.READY
            return None
        else:
            raise SegmentInfoError('complete_update() called in ' +
                                   'incorrect state: ' + str(self.__state))

What do you suggest we do for it?

Skip the SYNCHRONIZING state directly?

This is now implemented. Please check the testcase too.

Why is the switching of __readers and __old_readers so
complicated? Won't this be simpler and also faster?

self.__old_readers = self.__readers
self.__readers = set()

The current code performs a set union. I'm not sure if the assignment
does the same. Does it?

No, it overwrites. But, at the moment when it happens, isn't the target empty?

You're right. I've updated this code to make it an assignment.

Why can't remove_reader happen any time? I don't see a reason why
the auth server couldn't terminate or crash at any time it finds
appropriate. Will the manager be required to buffer the
unsubscriptions?

Hmmm... shall we allow this then? I implemented the description to the
letter, but didn't consider the bigger picture in this particular
ticket.

I think we should allow it.

Reading this again, I wonder if memmgr is supposed to populate and
maintain readers (from the group) only after start_update() is issued?
And then it stops maintaining this list after it gets back to READY
state. Otherwise I don't follow why it is designed this way.

Looking at this, it makes me wonder if it makes sense to test anything
at all on the memory manager if the shared memory is not
available. Would it make sense to completely exclude the memory
manager from build then?

    @unittest.skipIf(os.environ['HAVE_SHARED_MEMORY'] != 'yes',
                     'shared memory is not available')

I think we still allow local segments to the memory manager. There is
another ticket about it, but this is something we can do afterwards.

Well, we didn't manage to get rid of local segments yet. But the
usefulness of memmgr without the shared memory is just completely
zero, because the local segments are just local in there.

I agree memmgr doesn't make any sense without shared memory, and we can
install it conditionally. Given the other memmgr tickets, would auth,
etc. work properly without memmgr running? We should keep this in mind
as we complete the shared memory work.

Shall I make another ticket for it, that conditionally installs memmgr?

Is it really allowed to call start_update any time?

I thought it was weird too, but thought maybe I misunderstood the
description. Shall I update it?

I don't know. I don't have completely clear and detailed view of how
the rest will look like, but I think the operation makes no sense in
many contexts, so we probably should disallow it then.

This has been updated as suggested.

Did all the documentation look OK to you?

I think it looked OK.

Also, the tests fail for me:

EE.E
======================================================================
ERROR: test_configure (__main__.TestMemmgr)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 120, in test_configure
    parse_answer(self.__mgr._config_handler({})))
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 76, in _config_handler
    logger.debug(logger.DBGLVL_TRACE_BASIC, MEMMGR_CONFIG_UPDATE)
NameError: global name 'MEMMGR_CONFIG_UPDATE' is not defined

======================================================================
ERROR: test_datasrc_config_handler (__main__.TestMemmgr)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 178, in test_datasrc_config_handler
    self.__mgr._datasrc_config_handler({}, cfg_data)
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 208, in _datasrc_config_handler
    logger.info(MEMMGR_DATASRC_RECONFIGURED, genid)
NameError: global name 'MEMMGR_DATASRC_RECONFIGURED' is not defined

======================================================================
ERROR: test_setup_module (__main__.TestMemmgr)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 181, in _setup_module
    'data_sources', self._datasrc_config_handler)
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 41, in add_remote_config_by_name
    raise self.add_remote_exception
isc.config.ccsession.ModuleCCSessionError: faked exception

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/../src/bin/memmgr/tests/memmgr_test.py", line 163, in test_setup_module
    self.__mgr._setup_module)
  File "/usr/lib64/python3.3/unittest/case.py", line 570, in assertRaises
    return context.handle('assertRaises', callableObj, args, kwargs)
  File "/usr/lib64/python3.3/unittest/case.py", line 135, in handle
    callable_obj(*args, **kwargs)
  File "/tmp/bind10/bind10-1/bind10-20130529/_build/src/bin/memmgr/memmgr.py", line 183, in _setup_module
    logger.error(MEMMGR_NO_DATASRC_CONF, ex)
NameError: global name 'MEMMGR_NO_DATASRC_CONF' is not defined

----------------------------------------------------------------------

This was due to a library .mes file being named the same as another in
src/bin/. It has been fixed now.

And, the copyright year in isc/memmgr/logger.py is wrong.

Also fixed.

comment:15 in reply to: ↑ 14 Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

I know it is described in the ticket this way, though.

I guess we'll know better how to handle this once we start using these
methods in memmgr.

Yes, I don't know if it's so cool to do the bottom-up development, since we build classes we don't see the exact reason for yet :-|.

I think we should allow it.

Reading this again, I wonder if memmgr is supposed to populate and
maintain readers (from the group) only after start_update() is issued?
And then it stops maintaining this list after it gets back to READY
state. Otherwise I don't follow why it is designed this way.

Same as with unsubscriptions, subscriptions can come at any time IMO.

But it probably depends on the current state it is in, to which set to put them and so.

What is done now?

I agree memmgr doesn't make any sense without shared memory, and we can
install it conditionally. Given the other memmgr tickets, would auth,
etc. work properly without memmgr running? We should keep this in mind
as we complete the shared memory work.

As far as I follow the development, I think auth should work well enough without memmgr, provided it doesn't use the mapped segments. If it tried to use the segments, it would not get them, so the zones would stay unavailable and it would return something like SERVFAIL for the queries to them, but the rest of zones would work OK.

Shall I make another ticket for it, that conditionally installs memmgr?

It might make some sense, so yes, please.

comment:16 follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Replying to vorner:

I think we should allow it.

Reading this again, I wonder if memmgr is supposed to populate and
maintain readers (from the group) only after start_update() is issued?
And then it stops maintaining this list after it gets back to READY
state. Otherwise I don't follow why it is designed this way.

Same as with unsubscriptions, subscriptions can come at any time IMO.

But it probably depends on the current state it is in, to which set to
put them and so.

What is done now?

Right now memmgr does not call any of these methods (in SegmentInfo)
at all. But I don't follow this anymore. What behavior changes are
suggested now?

I agree memmgr doesn't make any sense without shared memory, and we can
install it conditionally. Given the other memmgr tickets, would auth,
etc. work properly without memmgr running? We should keep this in mind
as we complete the shared memory work.

As far as I follow the development, I think auth should work well
enough without memmgr, provided it doesn't use the mapped segments. If
it tried to use the segments, it would not get them, so the zones
would stay unavailable and it would return something like SERVFAIL for
the queries to them, but the rest of zones would work OK.

Shall I make another ticket for it, that conditionally installs memmgr?

It might make some sense, so yes, please.

#3079 (Don't build and install memmgr where shared memory support is not
available) was created for this.

comment:17 in reply to: ↑ 16 Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

Same as with unsubscriptions, subscriptions can come at any time IMO.

But it probably depends on the current state it is in, to which set to
put them and so.

What is done now?

Right now memmgr does not call any of these methods (in SegmentInfo)
at all. But I don't follow this anymore. What behavior changes are
suggested now?

No, I meant what happens if I call add_reader in, lets say, SYNCHRONIZING state. Do we throw? If we throw, we should change that (I looked and it seems we don't throw, but I better ask if I'm right).

And, depending on the state, the new reader might need to go to either old readers or the current readers (I did not think about when to which, maybe putting it to the current ones is OK every time).

So, depending on the current state, it might be no change is needed. Could you go through the states and decide where it makes sense to put them? If no change is needed, then it is OK to merge.

comment:18 follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Do you agree with the following (it's not very clear from the design doc):

When the memmgr is not in the READY state, if it gets notified of a new auth subscribing to the readers group, it is fine to assume the new auth instance is using the new mapped file and not the old one. For making sure there is no race, memmgr should make SegmentInfo updates in the main thread itself (which also handles communications) and only have the builder in a different thread.

So in such a case, notifications of reader group subscriptions can make memmgr add all readers to the current list of readers as they are using the latest mapped file.

comment:19 in reply to: ↑ 18 Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

Do you agree with the following (it's not very clear from the design doc):

When the memmgr is not in the READY state, if it gets notified of a new auth subscribing to the readers group, it is fine to assume the new auth instance is using the new mapped file and not the old one. For making sure there is no race, memmgr should make SegmentInfo updates in the main thread itself (which also handles communications) and only have the builder in a different thread.

Thinking about that, yes, you are right. It's correct to put them all to the current set.

So it is OK to merge this (it might be good to add the ↑ text to a comment/docstring, though ‒ but that doesn't need another review round).

comment:20 Changed 7 years ago by vorner

  • Total Hours changed from 0 to 4.99

comment:21 Changed 7 years ago by muks

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged to master branch in commit 26b0ae80ecf50d3868ca059a5965ecb275bb135b:

* e3e92f0 [2856] Add a comment about add_reader() in states other than READY
* 786a742 [2856] With no readers, make complete_update() from UPDATING go directly to READY
* 4f96d3a [2856] Raise an exception if start_update() is called in any state other than READY
* 77e3dfc [2856] Simplify code (old_readers will be empty before syncing readers)
* 724e39e [2856] Use deque instead of a list to store events
* 941c1d8 [2856] Fix copyright year
* 40471d9 [2856] Rename memmgr module mes file to libmemmgr to avoid conflict
* 1f09795 [2856] Update test to check more concretely
* cd9d3e1 [2856] Log errors when using the ZoneWriter
* 9c54e83 [2856] Log when the MemorySegmentBuilder receives a bad command
* 27a4c21 [2856] Remove excessive ordering test
* e61c298 [2856] Test lists and sets directly instead of using len()
* 7c6999a [2856] Delete the mapped file after the test completes
* 37753c0 [2856] Add test for "load" command to builder
* cca0075 [2856] Rename env variable
* 377e881 [2856] Add documentation for the __handle_load() method, etc.
* d9b5677 [2856] Include the state transition diagram
* 7027266 [2856] Add some more missing documentation
* 2de17f7 [2856] Add documentation
* 644a03f [2856] Fix handle_load() invocation
* 89a447a [2856] Add more thorough remove_reader() tests
* 292f28a [2856] Add more thorough sync_reader() tests
* 1bbc190 [2856] Add basic sync_reader() and remove_reader() tests
* af62ea0 [2856] Test that the state doesn't change when an exception is raised
* a84352b [2856] Check complete_update() return value
* d4fba54 [2856] Add start_update() tests
* fd38783 [2856] Check that old_readers is untouched by add_reader()
* d474c8d [2856] Fix exception messages
* e785578 [2856] Reorder methods
* 59c4a13 [2856] Add add_reader() tests
* af0db15 [2856] Fix test name
* 3401846 [2856] Move test in file
* e68217d [2856] Test UPDATING -> SYNCHRONIZING with and without events in the queue
* c6e820b [2856] Add state transition test helpers, and extend complete_update() tests
* 8df3463 [2856] Add "load" command handler
* e291894 [2856] Pass tuples in the command queue
* 894180b [2856] Move command handlers into their own methods
* a06a52e [2856] Extend SegmentInfo and add some basic tests

Resolving as fixed. Thank you for the reviews Michal.

Note: See TracTickets for help on using tickets.