Opened 7 years ago

Closed 7 years ago

#2962 closed defect (fixed)

cmdctl test failure due to race in certificate keypair creation

Reported by: muks
Owned by: muks
Priority: medium
Milestone: Sprint-20130806
Component: ~cmd-ctl (obsolete)
Version:
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DNS
Feature Depending on Ticket:
Estimated Difficulty: 5
Add Hours to Ticket: 0
Total Hours: 0.22
Internal?: no

Description

There is a cmdctl test failure on my Fedora 18 box (with Python 3.3). It does not reproduce reliably, but I have the backtrace from one failure:

make[5]: Entering directory `/home/muks/bind10/src/bin/cmdctl/tests'
for pytest in cmdctl_test.py b10-certgen_test.py ; do \
echo Running test: $pytest ; \
 \
PYTHONPATH=/home/muks/bind10/src/lib/python/isc/log_messages:/home/muks/bind10/src/lib/python/isc/cc:/home/muks/bind10/src/lib/python:/home/muks/bind10/src/lib/python:/home/muks/bind10/src/lib/dns/python/.libs:/home/muks/bind10/src/bin/cmdctl \
CMDCTL_SRC_PATH=/home/muks/bind10/src/bin/cmdctl \
CMDCTL_BUILD_PATH=/home/muks/bind10/src/bin/cmdctl \
B10_LOCKFILE_DIR_FROM_BUILD=/home/muks/bind10 \
/usr/bin/python3.3 /home/muks/bind10/src/bin/cmdctl/tests/$pytest || exit ; \
done
Running test: cmdctl_test.py
....................................E
======================================================================
ERROR: test_wrap_sock_in_ssl_context (__main__.TestSecureHTTPServer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/muks/bind10/src/bin/cmdctl/tests/cmdctl_test.py", line 741, in test_wrap_sock_in_ssl_context
    BUILD_FILE_PATH + 'cmdctl-certfile.pem')
  File "/home/muks/bind10/src/bin/cmdctl/cmdctl.py", line 610, in _wrap_socket_in_ssl_context
    raise socket.error
OSError

----------------------------------------------------------------------
Ran 37 tests in 0.019s

FAILED (errors=1)
make[5]: *** [check-local] Error 1
make[5]: Leaving directory `/home/muks/bind10/src/bin/cmdctl/tests'
make[4]: *** [check-am] Error 2
make[4]: Leaving directory `/home/muks/bind10/src/bin/cmdctl/tests'

Attachments (2)

cmdctl-certfile.pem (1.1 KB) - added by muks 7 years ago.
cmdctl-keyfile.pem (1.7 KB) - added by muks 7 years ago.

Change History (16)

comment:1 Changed 7 years ago by muks

  • Estimated Difficulty changed from 0 to 5

comment:2 Changed 7 years ago by jinmei

I hit this in my environment, too: see http://bind10.isc.org/ticket/2930#comment:7

That was with Python 3.2, so it's not specific to 3.3.

comment:3 Changed 7 years ago by shane

We should catch the error and output some information for future debugging.

If we want to actively track this down we could try running just this test in a loop and see if it repeats that way.
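
A loop like that could be sketched as follows. This is only a minimal sketch: run_until_failure is a hypothetical helper, not something in the tree, and the command passed to it would need the same environment (PYTHONPATH, CMDCTL_SRC_PATH and so on) that make sets up for the test.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=1000):
    """Run cmd repeatedly and stop at the first failing run.

    Returns (run_number, combined_output) for the first run that exits
    with a non-zero status, or None if all max_runs runs succeed.
    """
    for run in range(1, max_runs + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        if result.returncode != 0:
            return run, result.stdout
    return None


# Hypothetical usage (the test path is an assumption):
#   failure = run_until_failure([sys.executable, "cmdctl_test.py"], 200)
#   if failure is not None:
#       print("failed on run %d:\n%s" % failure)
```

Keeping the output of the failing run means the backtrace is not lost when the failure finally shows up.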

comment:4 Changed 7 years ago by shane

  • Milestone New Tasks deleted

comment:5 Changed 7 years ago by muks

I could reproduce this issue today. By logging the initial exception, I got the following:

======================================================================
ERROR: test_wrap_sock_in_ssl_context (__main__.TestSecureHTTPServer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/muks/bind10/src/bin/cmdctl/tests/cmdctl_test.py", line 751, in test_wrap_sock_in_ssl_context
    BUILD_FILE_PATH + 'cmdctl-certfile.pem')
  File "/home/muks/bind10/src/bin/cmdctl/cmdctl.py", line 611, in _wrap_socket_in_ssl_context
    raise socket.error(str(ex))
OSError: [X509: KEY_VALUES_MISMATCH] key values mismatch (_ssl.c:2084)

----------------------------------------------------------------------

Checking...

comment:6 Changed 7 years ago by muks

This is the direct backtrace (no re-raise):

======================================================================
ERROR: test_wrap_sock_in_ssl_context (__main__.TestSecureHTTPServer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/muks/bind10/src/bin/cmdctl/tests/cmdctl_test.py", line 751, in test_wrap_sock_in_ssl_context
    BUILD_FILE_PATH + 'cmdctl-certfile.pem')
  File "/home/muks/bind10/src/bin/cmdctl/cmdctl.py", line 599, in _wrap_socket_in_ssl_context
    ssl_version=ssl.PROTOCOL_SSLv23)
  File "/usr/lib64/python3.3/ssl.py", line 597, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib64/python3.3/ssl.py", line 260, in __init__
    self.context.load_cert_chain(certfile, keyfile)
ssl.SSLError: [X509: KEY_VALUES_MISMATCH] key values mismatch (_ssl.c:2084)

----------------------------------------------------------------------
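
For reference, the difference between the three ways of reporting the failure can be sketched like this. It is an illustration only: failing_ssl_setup is a hypothetical stand-in for the failing ssl.wrap_socket() call, and the wrapper functions are not the actual cmdctl.py code.

```python
import socket
import ssl


def failing_ssl_setup():
    # Hypothetical stand-in for the SSL setup step that fails.
    raise ssl.SSLError(1, "[X509: KEY_VALUES_MISMATCH] key values mismatch")


def wrap_lossy():
    # Raises a fresh, empty socket.error: the new exception object
    # itself carries no message about what went wrong.
    try:
        failing_ssl_setup()
    except ssl.SSLError:
        raise socket.error


def wrap_with_message():
    # Copies the text of the original error into the new exception.
    try:
        failing_ssl_setup()
    except ssl.SSLError as ex:
        raise socket.error(str(ex))


def wrap_reraise():
    # Re-raises the same exception, so its type, message and original
    # traceback are all preserved.
    try:
        failing_ssl_setup()
    except ssl.SSLError:
        raise
```

Since ssl.SSLError is itself a subclass of OSError (which socket.error aliases in Python 3.3 and later), callers that catch socket.error still see the re-raised exception.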

comment:7 Changed 7 years ago by muks

Removing the *.pem files and re-creating them (during make using b10-certgen) seems to make this issue go away. Bringing back the original *.pem files makes this problem reappear. So something's wrong with the certificate keypair that b10-certgen created. I'll attach them here.
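
A quick way to check a given pair outside the test suite might look like the sketch below. keypair_loads is a hypothetical helper name; it relies on the same load_cert_chain() call that appears in the backtrace above, and it cannot tell a mismatched pair from a malformed PEM file, since both are rejected with ssl.SSLError.

```python
import ssl


def keypair_loads(certfile, keyfile):
    """Return True if the ssl module accepts certfile and keyfile together.

    Returns False when the pair is rejected with ssl.SSLError, which
    covers both a key that does not match the certificate and a
    malformed PEM file. Other errors (such as a missing file) propagate.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        context.load_cert_chain(certfile, keyfile)
    except ssl.SSLError:
        return False
    return True


# Hypothetical usage against the files from this ticket:
#   keypair_loads("cmdctl-certfile.pem", "cmdctl-keyfile.pem")
```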

comment:8 Changed 7 years ago by muks

  • Milestone set to Sprint-20130806
  • Owner set to muks
  • Status changed from new to assigned

comment:9 Changed 7 years ago by muks

  • Owner changed from muks to UnAssigned
  • Status changed from assigned to reviewing

This issue was due to the following in Makefile.am:

# Generate the initial certificates immediately
cmdctl-certfile.pem: b10-certgen
	./b10-certgen -q -w

cmdctl-keyfile.pem: b10-certgen
	./b10-certgen -q -w

A parallel make can run these two b10-certgen commands at the same time, resulting in corrupted PEM files (as you can see at the bottom of the attached keyfile) or in a certificate and key that do not match each other.
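
The mismatch case can be illustrated with a small deterministic simulation. This is a sketch under loud assumptions: certgen_steps is a hypothetical stand-in for one b10-certgen run, the order of the two writes (key first, then certificate) is assumed, and corruption from partial writes is not modelled at all.

```python
def certgen_steps(run_id, files):
    """Hypothetical stand-in for one generator run, as two write steps.

    Each run produces its own fresh pair and writes both output files.
    Yielding after each write lets a schedule interleave two runs.
    """
    files["cmdctl-keyfile.pem"] = "key-from-run-%d" % run_id
    yield
    files["cmdctl-certfile.pem"] = "cert-from-run-%d" % run_id
    yield


def run_schedule(schedule):
    """Advance runs 1 and 2 one step at a time in the given order."""
    files = {}
    runs = {1: certgen_steps(1, files), 2: certgen_steps(2, files)}
    for run_id in schedule:
        next(runs[run_id])
    return files


def pair_matches(files):
    """True if both files were written by the same run."""
    key_run = files["cmdctl-keyfile.pem"].rsplit("-", 1)[1]
    cert_run = files["cmdctl-certfile.pem"].rsplit("-", 1)[1]
    return key_run == cert_run


serial = run_schedule([1, 1, 2, 2])       # one run after the other
interleaved = run_schedule([1, 2, 2, 1])  # the two runs overlap
```

With the serial schedule both files end up from the same run; with the interleaved one the key comes from one run and the certificate from the other, which is the kind of pair the ssl module rejects.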

This has been fixed using a hack, as there's no portable and correct way to serialize the two rules, or to have the command execute just once for that matter.

Up for review.

comment:10 Changed 7 years ago by muks

  • Summary changed from cmdctl test failure (doesn't happen regularly) to cmdctl test failure due to race in certificate keypair creation

comment:11 Changed 7 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:12 follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks
  • Total Hours changed from 0 to 0.22

I agree with the fix (and with the fact that this could have the described effect, so it probably fixes the problem).

However, could we use a command that is generally available? noop is not found on my system:

/bin/sed -e "s|@@SYSCONFDIR@@|/tmp/bind10//bind10-1.install/etc|" cmdctl.spec.pre >cmdctl.spec
  CXX      b10_certgen-b10-certgen.o
  CXXLD    b10-certgen
./b10-certgen -q -w
noop

/bin/bash: noop: command not found

Maybe even touch of the target would be good, this way, if the
cmdctl-certfile.pem is older than cmdctl-keyfile.pem for some reason, make will
try to run the rule every time.

I think this can be merged afterwards.

comment:13 in reply to: ↑ 12 Changed 7 years ago by muks

Replying to vorner:

Maybe even touch of the target would be good, this way, if the
cmdctl-certfile.pem is older than cmdctl-keyfile.pem for some reason, make will
try to run the rule every time.

This is now done.

comment:14 Changed 7 years ago by muks

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged to master branch in commit 09f557d871faef090ed444ebeee7f13e142184a0:

* 2b589c6 [2962] Touch the dependency instead of using noop
* 35c53d9 [2962] Don't run b10-certgen in parallel (which results in corruption of PEM output files)
* 940eae0 [2962] Don't raise another exception, but re-raise the same one

Resolving as fixed. Thank you for the reviews, Michal.

A ChangeLog entry was also added.
