Opened 9 years ago

Closed 9 years ago

#838 closed defect (fixed)

"string iterator is not dereferencable" issue

Reported by: fdupont Owned by: ocean
Priority: medium Milestone: Sprint-20110614
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 3.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

The debug version of lib/dns tests on Windows fails with this message
"string iterator is not dereferencable" in Visual Studio C++ 2010
in VC/include/xstring
Looking at the include it seems the functor madness uses the operator*()
on xxx.end() iterators...

Subtickets

Attachments (2)

di.cc (166 bytes) - added by fdupont 9 years ago.
base-n.diff (1.2 KB) - added by jinmei 9 years ago.

Download all attachments as: .zip

Change History (35)

Changed 9 years ago by fdupont

comment:1 Changed 9 years ago by fdupont

Trivial triggering example in attachment: works well with g++ on Linux,
raise the infamous message on Windows in Debug mode only.

comment:2 Changed 9 years ago by fdupont

I'll retry to set _ITERATOR_DEBUG_LEVEL to 0 because I have two tests which fails on vector out
of range (i.e., the support library crashes in place of raise the exception :-).
With this: Base64 and Rdata_Unknown (see 836)
With the string message: Hex, Base32, Rdata_DNSKEY and Rdata_NSEC3

Last edited 9 years ago by fdupont (previous) (diff)

comment:3 Changed 9 years ago by fdupont

With _ITERATOR_DEBUG_LEVEL=0 (note I had to recompile gtest too) errors are different,
for instance isspace(0x80) fails (nothing to do, 0x80 is set to a char, isspace() takes an int).
In conclusion IMHO Windows is another world and the Debug version will never fully work!
Give up?

comment:4 in reply to: ↑ description Changed 9 years ago by fdupont

Found at least one bug: src/lib/dns/util/base_n.cc is wrong:
in DecodeNormalizer:

const char& operator*() const {

if (in_pad_ && *base_ == BASE_PADDING_CHAR) {

return (base_zero_code_);

} else {

return (*base_);

}

}

if base_ is already at the end, it is illegal to deference it.
Now BaseNTransformer<>::decode calls:

result.assign(Decoder(DecodeNormalizer(BaseZeroCode, input.begin(),

srit.base(), input.end())),

Decoder(DecodeNormalizer(BaseZeroCode, input.end(),

input.end(), input.end())));

This is an internal error as (from DecodeNormalizer comment):
// Note: this class is intended to be used within this implementation file,
// and for simplicity assumes "base < base_beginpad <= base_end" on
// construction without validating the arguments. The behavior is undefined
// if this assumption doesn't hold.

I agree: it is undefined and only seems to work...
Of course there is a similar issue on the encoding side.

About the crash itself, I added something to trace to_4_bit() in
util/binary_from_base16.h when decodeHex() is called with "dea":
the stopping condition doesn't work as the function is called
for 'd', 'e', 'a', and:

  • on Windows it crashes (at the place I cited at the beginning of the comment)
  • on Linux it is called with '\0', fortunately it raises BadValue?

IMHO the transform_width logic is not what was expected.

A good news: it seems it is the only source of errors (others are
in 836).

comment:5 Changed 9 years ago by fdupont

Proposed fixes:

  • add in DecodeNomalizer operator*() at the beginning:

if (base_ == base_end_)

isc_throw(BadValue, "end of input");

  • add (int)xxx & 0xff to the two calls to isspace() (with _DEBUG set isspace int argument must verify ((unsigned)xxx + 1) <= 256

With this (src/lib/dns/tests) run_unittests no longer crashes
in Debug mode on Windows with Visual Studio 2010.
Tested on Visual Studio 2008 too...

comment:6 Changed 9 years ago by stephen

  • Milestone set to Sprint-20110503

comment:7 Changed 9 years ago by ocean

  • Owner set to ocean
  • Status changed from new to assigned

comment:8 follow-up: Changed 9 years ago by ocean

  • Defect Severity set to N/A
  • Owner changed from ocean to UnAssigned
  • Status changed from assigned to reviewing
  • Sub-Project set to DNS

comment:9 in reply to: ↑ 8 ; follow-up: Changed 9 years ago by jinmei

  • Owner changed from UnAssigned to ocean

Replying to ocean:

Could you provide some more context? Have you successfully reproduced
the reported problem? If so, have you confirmed the change fixed it?
The committed change seems to be a simple copy and paste of (part of)
what Francis suggested...does that mean you agree that it was the best
fix? Or did you commit it simply because you confirmed it fixed it?

Also, what about other issues reported in the ticket?

comment:10 in reply to: ↑ 9 ; follow-up: Changed 9 years ago by ocean

Replying to jinmei:

Replying to ocean:

Could you provide some more context? Have you successfully reproduced
the reported problem? If so, have you confirmed the change fixed it?
The committed change seems to be a simple copy and paste of (part of)
what Francis suggested...does that mean you agree that it was the best
fix? Or did you commit it simply because you confirmed it fixed it?

Also, what about other issues reported in the ticket?

Yes I did reproduced this problem on Windows with VS2010. And this can fix it.

But to reproduce this, a lot of code changes need to make otherwise the code cannot be compiled, all the changes are scattered in several related tickes #827 #829 #831 #832 #836 #837 #839 #858

For this ticket, the current fix should be enough. Another concern is the isspace() problem, because this can only reproduced in DEBUG version. But it seems add the (int)xxx & 0xFF will not do any harm, so I will add it too.

For other related tickets, I'm not sure whether we should fix it one by one or create another ticket to put windows related code changes into the trunk. It seems that Shane does not hope to waste too much time on Windows platform.:)

comment:11 Changed 9 years ago by ocean

On windows, the compiler will validate the character that passed to isspace() on DEBUG mode:

        _ASSERTE((unsigned)(c + 1) <= 256);

So if c < -1 it will make this ASSERT fail. So I added some checking before the parameter is passed to isspace().

comment:12 Changed 9 years ago by ocean

  • Owner changed from ocean to jinmei

comment:13 in reply to: ↑ 10 Changed 9 years ago by fdupont

Replying to ocean:

But to reproduce this, a lot of code changes need to make otherwise the code cannot be compiled, all the changes are scattered in several related tickets #827 #829 #831 #832 #836 #837 #839 #858

=> this is why I consolidated all the corresponding branches into trac826 (not yet a candidate for the whole stuff but already a solid base)

For other related tickets, I'm not sure whether we should fix it one by one or create another ticket to put windows related code changes into the trunk. It seems that Shane does not hope to waste too much time on Windows platform.:)

=> IMHO bugs should be fixed even they are discovered on an unsupported platform. After we have things which clearly have a general benefit (isspace() example) and things which are specific to a platform (should be protected by an #ifdef)

comment:14 follow-up: Changed 9 years ago by jinmei

Another quick question: the change committed in the branch modified
EncodeNormalizer::operator*(), while the original proposed fix was
about DecodeNormalizer::operator*(). Could you clarify it? Is that a
typo of the original proposed fix, or did you find the original report
was wrong and the problem was actually in and only in
EncodeNormalizer??

In any case, lacking the environment, it's not very clear to me what
is exactly wrong with the original implementation (calling operator*
at the end() of the data is wrong of course, but I don't understand
how that situation happened). Please explain more details about the
problem.

Thanks,

comment:15 in reply to: ↑ 14 ; follow-up: Changed 9 years ago by ocean

Replying to jinmei:

Another quick question: the change committed in the branch modified
EncodeNormalizer::operator*(), while the original proposed fix was
about DecodeNormalizer::operator*(). Could you clarify it? Is that a
typo of the original proposed fix, or did you find the original report
was wrong and the problem was actually in and only in
EncodeNormalizer??

Sorry, it's my fault. it should be put in DecodeNormalizer, this is not needed to put
into EncodeNormalizer. I cannot commit the change on the Windows platform because a lot
of hacks are made to reproduce this. So we'd better put the windows related changes into the trunk.

In any case, lacking the environment, it's not very clear to me what
is exactly wrong with the original implementation (calling operator*
at the end() of the data is wrong of course, but I don't understand
how that situation happened). Please explain more details about the
problem.

Thanks,

This is triggered by the following sequence.

  1. Given input string of " " which is a one char string with 0x20 (the space) character.
  2. Call decodeBase32Hex-> Base32HexTransformer::decode->
            result.assign(Decoder(DecodeNormalizer(BaseZeroCode, input.begin(),
                                                   srit.base(), input.end())),
                          Decoder(DecodeNormalizer(BaseZeroCode, input.end(),
                                                   input.end(), input.end())));
    
  3. The DecodeNormalizer() is a iterator with base_ points to the start of string.
  4. When later the result.assign() is called, it will increase the first iterator and try to dereference it.
        DecodeNormalizer& operator++() {
            ++base_;
            while (base_ != base_end_ && isspace(*base_)) {
                ++base_;
            }
            if (base_ == base_beginpad_) {
                in_pad_ = true;
            }
    
            return (*this);
        }
    

Initially the base_ is point to the begin of the input string, since *base_ is a space character (0x20), the base_ will be increased by 1 and equal to base_end_

  1. Later when the iterator is dereferenced, it will trigger this problem.

I tried to fix it in the operator++() function, but it seems that no appropriate method.

comment:16 in reply to: ↑ 15 ; follow-up: Changed 9 years ago by jinmei

Replying to ocean:

In any case, lacking the environment, it's not very clear to me what
is exactly wrong with the original implementation (calling operator*
at the end() of the data is wrong of course, but I don't understand
how that situation happened). Please explain more details about the
problem.

Thanks,

This is triggered by the following sequence.

[snip]

Okay, thanks. I now think I fully understand the problem.

I also successfully reproduced the same (I believe) problem on our
Solaris buildbox machine with the STLPort debug mode.

Now, comments about the fix:

It seems to be the correct fix to throw an exception in
DecodeNormalizer::operator*(). But we can then make another cleanup:
we now don't need to check whether *base_ is BASE_PADDING_CHAR. It
was an (incorrect) attempt to avoid the situation that the base_ ==
base_end_ check now does.

Also,

  1. Given input string of " " which is a one char string with 0x20 (the space) character.

with your fix, the input of " " will result in the exception, which is
not correct. It should be handled just like "". On looking at it
closely, I noticed the previous code doesn't handle the case where the
input begins with a space (this is a different problem than the
iterator bug itself).

Finally, regarding the conversion of char to signed int for isspace():
I don't think checking it with >0 is the cleanest fix. Even though I
generally want to avoid relying on casts, it seems to be one of the
(rare) cases where a cast makes most sense (and in this case the case
should be safe).

I've just committed my counter proposal based on the above comments
and pushed it. I also made a couple of unrelated (but STL-related)
bugs in the test code. Please check it (it's just a "proposal" and
I've pushed it so that we can easily share it.) Also, please consider
adding more tests to base64 and hex tests that highlight the problem
as I did it for base32hex.

Oh, and really finally, a higher level point. Wheter or not we create
a separate ticket, I believe we should fix the problems Francis
reported for Windows. I agree with Francis on this point; it's not an
issue specific for Windows, but real bugs that just happen to be
revealed more clearly in a Windows environment (in fact, we can
reproduce some of the problems on Solaris + STLPort). Priority is
another issue, and on that point the fact we are not seriously
thinking Windows support can matter.

comment:17 Changed 9 years ago by jinmei

  • Owner changed from jinmei to ocean

comment:18 in reply to: ↑ 16 ; follow-up: Changed 9 years ago by ocean

Replying to jinmei:

Replying to ocean:

In any case, lacking the environment, it's not very clear to me what
is exactly wrong with the original implementation (calling operator*
at the end() of the data is wrong of course, but I don't understand
how that situation happened). Please explain more details about the
problem.

Thanks,

This is triggered by the following sequence.

[snip]

Okay, thanks. I now think I fully understand the problem.

I also successfully reproduced the same (I believe) problem on our
Solaris buildbox machine with the STLPort debug mode.

Now, comments about the fix:

It seems to be the correct fix to throw an exception in
DecodeNormalizer::operator*(). But we can then make another cleanup:
we now don't need to check whether *base_ is BASE_PADDING_CHAR. It
was an (incorrect) attempt to avoid the situation that the base_ ==
base_end_ check now does.

Also,

  1. Given input string of " " which is a one char string with 0x20 (the space) character.

with your fix, the input of " " will result in the exception, which is
not correct. It should be handled just like "". On looking at it
closely, I noticed the previous code doesn't handle the case where the
input begins with a space (this is a different problem than the
iterator bug itself).

Finally, regarding the conversion of char to signed int for isspace():
I don't think checking it with >0 is the cleanest fix. Even though I
generally want to avoid relying on casts, it seems to be one of the
(rare) cases where a cast makes most sense (and in this case the case
should be safe).

Consider the input is a value in range between [-128, 0), it will be mapped
to [0x80, 0xFF) after static_cast<unsigned char>(c), which are
control and Latin-1 supplement in Unicode. For example, 0xA0 is
defined as "non-break space", so whether isspace(0xA0) returned true or false
depends on the implementation. I have tested on my pc that it returns 0 for
[0x80, 0xFF], but I'm afraid that future or other implementation may break it.

I've just committed my counter proposal based on the above comments
and pushed it. I also made a couple of unrelated (but STL-related)
bugs in the test code. Please check it (it's just a "proposal" and
I've pushed it so that we can easily share it.) Also, please consider
adding more tests to base64 and hex tests that highlight the problem
as I did it for base32hex.

OK, I will check it.

Oh, and really finally, a higher level point. Wheter or not we create
a separate ticket, I believe we should fix the problems Francis
reported for Windows. I agree with Francis on this point; it's not an
issue specific for Windows, but real bugs that just happen to be
revealed more clearly in a Windows environment (in fact, we can
reproduce some of the problems on Solaris + STLPort). Priority is
another issue, and on that point the fact we are not seriously
thinking Windows support can matter.

Yes, I agree:)

comment:19 Changed 9 years ago by ocean

  • Owner changed from ocean to jinmei

Add more unit tests to base64 decode

comment:20 in reply to: ↑ 18 ; follow-up: Changed 9 years ago by jinmei

Replying to ocean:

Finally, regarding the conversion of char to signed int for isspace():
I don't think checking it with >0 is the cleanest fix. Even though I
generally want to avoid relying on casts, it seems to be one of the
(rare) cases where a cast makes most sense (and in this case the case
should be safe).

Consider the input is a value in range between [-128, 0), it will be mapped
to [0x80, 0xFF) after static_cast<unsigned char>(c), which are
control and Latin-1 supplement in Unicode. For example, 0xA0 is
defined as "non-break space", so whether isspace(0xA0) returned true or false
depends on the implementation. I have tested on my pc that it returns 0 for
[0x80, 0xFF], but I'm afraid that future or other implementation may break it.

Hmm, point taken. On thinking about it with the rationale, I now tend
to agree that >0 is better than the naive cast. If you like please
feel free to revert that part. But in that case please leave some
comment about the implication. I suspect it's not trivial.

Also, with other fixes I suspect this bug cannot be triggered via our
current tests. Please confirm that (or confirm I'm wrong:-) and if my
observation is correct, please add an explicit test that would
re-introduce this problem, confirm the new code without the additional

0 check triggers it, and the additional check actually prevents that.

I've just committed my counter proposal based on the above comments
and pushed it. I also made a couple of unrelated (but STL-related)
bugs in the test code. Please check it (it's just a "proposal" and
I've pushed it so that we can easily share it.) Also, please consider
adding more tests to base64 and hex tests that highlight the problem
as I did it for base32hex.

OK, I will check it.

I've not seen you modify it, so are you okay with them (except the
cast for isspace())?

comment:21 Changed 9 years ago by jinmei

  • Owner changed from jinmei to ocean

comment:22 in reply to: ↑ 20 ; follow-up: Changed 9 years ago by ocean

Replying to jinmei:

Replying to ocean:

Finally, regarding the conversion of char to signed int for isspace():
I don't think checking it with >0 is the cleanest fix. Even though I
generally want to avoid relying on casts, it seems to be one of the
(rare) cases where a cast makes most sense (and in this case the case
should be safe).

Consider the input is a value in range between [-128, 0), it will be mapped
to [0x80, 0xFF) after static_cast<unsigned char>(c), which are
control and Latin-1 supplement in Unicode. For example, 0xA0 is
defined as "non-break space", so whether isspace(0xA0) returned true or false
depends on the implementation. I have tested on my pc that it returns 0 for
[0x80, 0xFF], but I'm afraid that future or other implementation may break it.

Hmm, point taken. On thinking about it with the rationale, I now tend
to agree that >0 is better than the naive cast. If you like please
feel free to revert that part. But in that case please leave some
comment about the implication. I suspect it's not trivial.

I have reverted to the original fix and add some comments.

Also, with other fixes I suspect this bug cannot be triggered via our
current tests. Please confirm that (or confirm I'm wrong:-) and if my
observation is correct, please add an explicit test that would
re-introduce this problem, confirm the new code without the additional
> 0 check triggers it, and the additional check actually prevents that.

I'm a little confused for this. For even we don't add the < 0 check, most
of the isspace() implementation will just accept and return zero.
It will only cause problem on windows platform in debug mode because it will
do the check of

 _ASSERTE((unsigned)(c + 1) <= 256);

This will be trigger in unit test of HexTest::decodeMap

input[1] = 128 

since input[1] is a char type, it will be converted to -128 and then trigger it.

I've just committed my counter proposal based on the above comments
and pushed it. I also made a couple of unrelated (but STL-related)
bugs in the test code. Please check it (it's just a "proposal" and
I've pushed it so that we can easily share it.) Also, please consider
adding more tests to base64 and hex tests that highlight the problem
as I did it for base32hex.

OK, I will check it.

I've not seen you modify it, so are you okay with them (except the
cast for isspace())?

Yes, I think it is ok.

comment:23 Changed 9 years ago by ocean

  • Owner changed from ocean to jinmei

comment:24 in reply to: ↑ 22 Changed 9 years ago by jinmei

Replying to ocean:

Consider the input is a value in range between [-128, 0), it will be mapped
to [0x80, 0xFF) after static_cast<unsigned char>(c), which are
control and Latin-1 supplement in Unicode. For example, 0xA0 is
defined as "non-break space", so whether isspace(0xA0) returned true or false
depends on the implementation. I have tested on my pc that it returns 0 for
[0x80, 0xFF], but I'm afraid that future or other implementation may break it.

Hmm, point taken. On thinking about it with the rationale, I now tend
to agree that >0 is better than the naive cast. If you like please
feel free to revert that part. But in that case please leave some
comment about the implication. I suspect it's not trivial.

I have reverted to the original fix and add some comments.

Okay, but the comment doesn't seem to be sufficient to me. It doesn't
explain why we cannot simply cast it to unsigned char. Also, it
doesn't match the implementation: while the comment says "larger than
0", the test is ">= 0". Further, I suspect >= 0 could be harmful for
systems with unsigned char.

I'd note another minor style point. The added parentheses seem just
redundant to me. While it's true we sometimes be redundant
intentionally (for example, we always add {} even when they can
omitted), and in some cases it might be rather preferred than relying
on implicit priorities, but "((*base_) > 0)" seems way too awkward.
It also makes the policy of when to use () inconsistent, even within
that particular while condition.

Finally, I just noticed the opening curly brace should be in the same
line as that for while, according to our common style guideline.

I'm attaching a diff that fixes all of the above. Please check it and
apply if you agree.

Also, with other fixes I suspect this bug cannot be triggered via our
current tests. Please confirm that (or confirm I'm wrong:-) and if my
observation is correct, please add an explicit test that would
re-introduce this problem, confirm the new code without the additional
> 0 check triggers it, and the additional check actually prevents that.

I'm a little confused for this. For even we don't add the < 0 check, most
of the isspace() implementation will just accept and return zero.
It will only cause problem on windows platform in debug mode because it will
do the check of

 _ASSERTE((unsigned)(c + 1) <= 256);

This will be trigger in unit test of HexTest::decodeMap

input[1] = 128 

since input[1] is a char type, it will be converted to -128 and then trigger it.

Ah, okay, I missed that test. I incorrectly thought this was
triggered as a side effect of overrun.

comment:25 Changed 9 years ago by jinmei

  • Owner changed from jinmei to ocean

Changed 9 years ago by jinmei

comment:26 Changed 9 years ago by stephen

  • Estimated Difficulty changed from 0.0 to 3

comment:27 follow-up: Changed 9 years ago by jinmei

What's the current status of the ticket?

I thought we were basically in agreement except for minor editorial
points (and even with suggested patch for those parts) and expected to
be completed quite soon. Yet it's been in the review queue, still
assigned, for more than 10 days.

If Ocean cannot continue this ticket right now, I'd suggest
re-assigning it to someone else only for my final suggestions, and
complete it.

comment:28 in reply to: ↑ 27 Changed 9 years ago by ocean

sorry, I'm quite busy with CNNIC works last week, I'll finish it today.
Replying to jinmei:

What's the current status of the ticket?

I thought we were basically in agreement except for minor editorial
points (and even with suggested patch for those parts) and expected to
be completed quite soon. Yet it's been in the review queue, still
assigned, for more than 10 days.

If Ocean cannot continue this ticket right now, I'd suggest
re-assigning it to someone else only for my final suggestions, and
complete it.

comment:29 follow-up: Changed 9 years ago by ocean

  • Owner changed from ocean to jinmei

comment:30 in reply to: ↑ 29 Changed 9 years ago by jinmei

Replying to ocean:

Hey, if you simply change the owner without a comment, I cannot be
sure what I'm expected to do:-)

If you agreed with my suggest change and applied it without a change
(from a quick look at the latest diff, it seems to indicate so), I
have no other issue. Please merge.

comment:31 Changed 9 years ago by jinmei

  • Owner changed from jinmei to ocean

comment:32 Changed 9 years ago by ocean

Merged to master, sorry for any trouble:)

comment:33 Changed 9 years ago by ocean

  • Resolution set to fixed
  • Status changed from reviewing to closed
Note: See TracTickets for help on using tickets.