Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#1568 closed enhancement (complete)

Change throw to assert to gain performance

Reported by: vorner Owned by: vorner
Priority: medium Milestone: Sprint-20120320
Component: b10-auth Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: performance
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 3.37 Internal?: no

Description

As discovered during the f2f meeting, this change improves performance by around 20% (may need official benchmarking to confirm).

Subtickets

Change History (17)

comment:1 Changed 8 years ago by vorner

It is ready for review. I don't think the feature is visible enough to have a changelog entry (maybe we should include a more global changelook entry like „multiple performance improvements“ on the release then).

comment:2 Changed 8 years ago by vorner

  • Owner changed from vorner to UnAssigned
  • Status changed from new to reviewing

comment:3 Changed 8 years ago by jinmei

I suggest we hold off merging (or perhaps reviewing) this change until
we see the effect of MessageRenderer? refactoring (part of changes
now available in jinmei-labelseq). In my experiment it could increase
the qps performance more than 50%, and doesn't rely on
OutputBuffer::operator[] in the first place, so probably making this
change moot in terms of performance improvement.

That refactoring work itself is not big, so I believe we can
reasonably complete that task by the end of y3.

comment:4 Changed 8 years ago by vorner

Jeremy says it has 2%-13% speedup on builtin and in-memory, but 1%-2% slowdown on sqlit3 with cache, according to his benchmarks.

Anyway, I'm OK with waiting to see if it'll help after the other improvements.

comment:5 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to jinmei

comment:6 Changed 8 years ago by jinmei

Now the new MessageRenderer implementation is likely to need this,
I think it's time to resume this ticket.

First, I made a very minor editorial fix to the branch.

Second, I think we need a changelog for this because it could
potentially change observable behavior.

Third, I wonder we could do something similar to what I did in
trac1603: still throw on error but move the throw statement to a
separate private method:

-        assert(pos < size_);
+        if (pos >= size_) {
+            throwError();
+        }
         return (buffer_[pos]);
     }
     //@}
+private:
+    static void throwError() {
+        isc_throw(InvalidBufferPosition, "read at invalid position");
+    }
+public:

I've tested this on trac1603bench, and I saw it was slightly slower
than the assert() version, but the gap was not that big and I was not
sure if it was an experimental margin or the real difference due to
the implementation difference. At the moment I'm okay with keeping
the use of assert() if we want the highest possible performance, but
we might note the possibility of future extension where we use an
exception in a separate function.

Other than these, the branch is okay for merge.

comment:7 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:8 follow-up: Changed 8 years ago by vorner

OK, I'll bring it on tomorrow on the call and we'll talk about it. Your solution seems better in the terms of design and I wouldn't be against myself (if the difference is small, it isn't that important as the big one before).

comment:9 in reply to: ↑ 8 Changed 8 years ago by jinmei

Replying to vorner:

OK, I'll bring it on tomorrow on the call and we'll talk about it. Your solution seems better in the terms of design and I wouldn't be against myself (if the difference is small, it isn't that important as the big one before).

If you have time, could you also compare the performance in your
environment? You can do it for the total server performance with a
tool like queryperf, or message_renderer_bench for the specific effect
on the renderer.

comment:10 Changed 8 years ago by vorner

So, I measured it (and noticed a test was failing for the original branch ‒ but I'd solve only if we go for it). The results are as follows:

The original, unchanged (with changes from #1603):

Parameters:
  Iterations: 100000
Benchmark for old MessageRenderer (positive response)
Performed 4200000 iterations in 5.414628s (775676.56ips)
Benchmark for dumb MessageRenderer (positive response)
Performed 4200000 iterations in 0.307879s (13641722.88ips)
Benchmark for new MessageRenderer (positive response)
Performed 4200000 iterations in 0.982537s (4274648.18ips)
Benchmark for old MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.327549s (1221191.33ips)
Benchmark for dumb MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.006059s (66017494.64ips)
Benchmark for new MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.110053s (3634612.41ips)
Benchmark for old MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.027959s (3576665.83ips)
Benchmark for dumb MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.001662s (60168471.72ips)
Benchmark for new MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.033245s (3007971.12ips)

With the assert (applied to #1603, now trac1568_new):

Parameters:
  Iterations: 100000
Benchmark for old MessageRenderer (positive response)
Performed 4200000 iterations in 1.967427s (2134767.90ips)
Benchmark for dumb MessageRenderer (positive response)
Performed 4200000 iterations in 0.144562s (29053278.18ips)
Benchmark for new MessageRenderer (positive response)
Performed 4200000 iterations in 0.699861s (6001191.67ips)
Benchmark for old MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.131807s (3034740.19ips)
Benchmark for dumb MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.006047s (66148503.39ips)
Benchmark for new MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.087616s (4565376.19ips)
Benchmark for old MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.023528s (4250255.02ips)
Benchmark for dumb MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.001675s (59701492.54ips)
Benchmark for new MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.030313s (3298914.66ips)

And with the private method (pushed as track1568_method):

Parameters:
  Iterations: 100000
Benchmark for old MessageRenderer (positive response)
Performed 4200000 iterations in 3.635178s (1155376.71ips)
Benchmark for dumb MessageRenderer (positive response)
Performed 4200000 iterations in 0.169868s (24725080.65ips)
Benchmark for new MessageRenderer (positive response)
Performed 4200000 iterations in 0.855160s (4911361.62ips)
Benchmark for old MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.230515s (1735244.99ips)
Benchmark for dumb MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.006246s (64040986.23ips)
Benchmark for new MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.096402s (4149291.51ips)
Benchmark for old MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.022769s (4391936.40ips)
Benchmark for dumb MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.001805s (55401662.05ips)
Benchmark for new MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.032079s (3117304.16ips)

So there seems to be some difference. Let's wait for tomorrow call and discuss it over.

comment:11 Changed 8 years ago by jinmei

nFYI: I'm pasting my result

Focusing on the new renderer implementation, the assert version
is 7.3%, 9.2%, 1.0% faster than the method version (for positive,
NXDOMAIN, SERVFAIL cases respectively).

1568_new

Parameters:
  Iterations: 100000
Benchmark for old MessageRenderer (positive response)
Performed 4200000 iterations in 6.840785s (613964.63ips)
Benchmark for dumb MessageRenderer (positive response)
Performed 4200000 iterations in 0.166941s (25158588.96ips)
Benchmark for new MessageRenderer (positive response)
Performed 4200000 iterations in 1.085116s (3870553.93ips)
Benchmark for old MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.496187s (806147.68ips)
Benchmark for dumb MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.016813s (23791114.02ips)
Benchmark for new MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.149617s (2673492.99ips)
Benchmark for old MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.065764s (1520588.77ips)
Benchmark for dumb MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.004095s (24420024.42ips)
Benchmark for new MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.052844s (1892362.43ips)

1568_method

Parameters:
  Iterations: 100000
Benchmark for old MessageRenderer (positive response)
Performed 4200000 iterations in 7.061764s (594752.25ips)
Benchmark for dumb MessageRenderer (positive response)
Performed 4200000 iterations in 0.175361s (23950593.35ips)
Benchmark for new MessageRenderer (positive response)
Performed 4200000 iterations in 1.164642s (3606258.40ips)
Benchmark for old MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.485412s (824042.26ips)
Benchmark for dumb MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.016988s (23546032.49ips)
Benchmark for new MessageRenderer (NXDOMAIN response)
Performed 400000 iterations in 0.163406s (2447890.53ips)
Benchmark for old MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.068464s (1460621.64ips)
Benchmark for dumb MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.004166s (24003840.61ips)
Benchmark for new MessageRenderer (SERVFAIL response)
Performed 100000 iterations in 0.053355s (1874238.59ips)

comment:12 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

As we decided to merge it on the call, I run the tests and found one failing. Would you have a quick look at it, before I actually merge? (in the trac1568_new branch).

Thank you.

comment:13 in reply to: ↑ 12 Changed 8 years ago by jinmei

Replying to vorner:

As we decided to merge it on the call, I run the tests and found one failing. Would you have a quick look at it, before I actually merge? (in the trac1568_new branch).

Looks okay.

comment:14 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:15 Changed 8 years ago by vorner

  • Total Hours changed from 0 to 2.87

Thank you, closing.

(I'm not sure if the hours are correct, as it is so old ticket, my time logs probably don't go that far to past)

comment:16 Changed 8 years ago by vorner

  • Resolution set to complete
  • Status changed from reviewing to closed

comment:17 Changed 8 years ago by jinmei

  • Total Hours changed from 2.87 to 3.37
Note: See TracTickets for help on using tickets.