Opened 9 years ago

Closed 7 years ago

#357 closed defect (fixed)

There should be timeout on TCP connection in auth server

Reported by: vorner Owned by: jelte
Priority: medium Milestone: Sprint-20120904
Component: b10-auth Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: Medium
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 5.94 Internal?: no

Description

It seems that the b10-auth holds an inactive TCP connection opened forever (I already have a socat connected to it for an hour while nothing is sent). This seems to be a bad thing, since it allocates a 64kB buffer for each TCP connection. It is too easy to create a connection and keep it open.

Furthemore, as an idle connection can survive forever without any packet going any way, this is possible resource leak. Imagine that a machine starts a TCP connection to Auth and crashes after the handshake, but before anything is sent. No packets are sent from the auth, since it waits on read. Therefore there will be no os-level timeout. But the machine will not send any packets, it does not know about the connection after reboot. Therefore the connection will be opened forever, eating 64kB of memory and a file descriptor.

Subtickets

Change History (12)

comment:1 Changed 8 years ago by shane

  • Defect Severity set to Medium
  • Sub-Project set to DNS

Agreed, we need this. It should default to a small value (a few seconds should be plenty), but also be configurable. Probably system-wide is fine for this.

comment:2 Changed 7 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed

comment:3 Changed 7 years ago by jelte

  • Estimated Difficulty changed from 0.0 to 5

comment:4 Changed 7 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120904

comment:5 Changed 7 years ago by jelte

  • Owner set to jelte
  • Status changed from new to assigned

comment:6 Changed 7 years ago by jelte

  • Owner changed from jelte to UnAssigned
  • Status changed from assigned to reviewing

ready for review.

First commit is the actual implementation.
Second one is to be able to configure and change the value (lots of small changes to propagate it all the way from auth_srv to TCPServer, and a few tricks to get it working there).
Commits 3 and 4 are documentation.

Proposed changelog:
[bug] jelte
b10-auth no longer keeps incoming TCP connections open if no (or not enough) data is sent by the client. The timeout value is configurable.

comment:7 Changed 7 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:8 follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to jelte

Hello

About the changelog message, it looks bad if it starts with lower-case letter, maybe changing the sentence slightly so b10-auth is not the first word? Also, maybe saying the well-known word „Timeout“ in there somewhere might make sense.

Now, to the code. Cppcheck complains here:
src/bin/auth/auth_config.cc:123: check_fail: Member variable 'TCPRecvTimeoutConfig::timeout_' is not initialized in the constructor. (warning,uninitMemberVar)

The special-case of 0 should be documented in the guide.

Maybe we should also validate the value is non-negative (someone might try to put -1 there hoping for some magic to happen). Also, maybe a warning with too low values (below 1 second, for example)?

Also, I'm not sure if 1 is enough as a default value. There are connection types (like GPRS/EDGE) with possibly multi-second round trips.

comment:9 in reply to: ↑ 8 Changed 7 years ago by jelte

  • Owner changed from jelte to vorner

Replying to vorner:

Hello

About the changelog message, it looks bad if it starts with lower-case letter, maybe changing the sentence slightly so b10-auth is not the first word? Also, maybe saying the well-known word „Timeout“ in there somewhere might make sense.

take 2:

[bug]
TCP connections now time out in b10-auth if no (or not all) query data is sent by the client. The timeout value defaults to 5000 milliseconds, but is configurable in Auth/tcp_recv_timeout.

Now, to the code. Cppcheck complains here:
src/bin/auth/auth_config.cc:123: check_fail: Member variable 'TCPRecvTimeoutConfig::timeout_' is not initialized in the constructor. (warning,uninitMemberVar)

fixed (set to 0, build() will override it anyway)

The special-case of 0 should be documented in the guide.

oh right, of course

Maybe we should also validate the value is non-negative (someone might try to put -1 there hoping for some magic to happen). Also, maybe a warning with too low values (below 1 second, for example)?

i suspect that people might actually want to (temporarily) set this quite low, to mitigate synflood-like attacks and increase the chance that at least fast proper connections get through

It does error on negative values now. And I discovered a lowlevel problem that is out of scope for this ticket, but the c++ client side of msgq can't handle very large integers in JSON data, I'll make a separate ticket for that.

Also, I'm not sure if 1 is enough as a default value. There are connection types (like GPRS/EDGE) with possibly multi-second round trips.

right, I chatted a bit with jeremy about this, we decided that we could use some statistical info here :) but since we need a value now, i chose 5 seconds (windows CE clients themselves give up after 3, a common resolv.conf client-side timeout is 5)

comment:10 Changed 7 years ago by vorner

  • Owner changed from vorner to jelte

Hello

I think it is ready to merge.

comment:11 Changed 7 years ago by vorner

  • Total Hours changed from 0 to 0.94

comment:12 Changed 7 years ago by jelte

  • Resolution set to fixed
  • Status changed from reviewing to closed
  • Total Hours changed from 0.94 to 5.94

Thanks!

Merged, closing ticket.

Note: See TracTickets for help on using tickets.