Commit Graph

157996 Commits

Author SHA1 Message Date
Stephen Hemminger
dc1f8bf68b netdev: change transmit to limited range type
The transmit function should only return one of three possible values,
some drivers got confused and returned errno's or other values.
This changes the definition so that this can be caught at compile time.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-01 01:13:03 -07:00
Joe Perches
9e39f7c5b3 s2io: Generate complete messages using single line DBG_PRINTs
Single line log messages should be emitted by a single call
where possible.

Converted multiple calls to DBG_PRINT to single call form.
Removed "s2io:" preface from DBG_PRINTs.

The DBG_PRINT macro now emits a log level and is surrounded by
a do {...} while (0)

All s2io log output is now prefaced with KBUILD_MODNAME ": "
via pr_fmt.

The DBG_PRINT macro should probably be converted to use the
dev_<level> form eventually.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:35:11 -07:00
Joe Perches
82c2d02356 s2io.c: Convert skipped nic->config.tx_cfg[i]. to tx_cfg->
Missed doing the conversion in earlier patch.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:35:08 -07:00
Joe Perches
ffb5df6ce7 s2io.c: Standardize statistics accessors
Regularize the declaration and uses of
	struct config_param *config = &sp->config;
	struct mac_info *mac_control = &sp->mac_control;
and use
	struct stat_block *stats = mac_control->stats_info;
	struct swStat *swstats = &stats->sw_stat;
	struct xpakStat *xstats = &stats->xpak_stat;
and convert the longish uses like
	nic->mac_control.stats_info->sw_stat.<foo>
to
	swstats-><foo>
etc.

This also makes the statistics code marginally smaller
and presumably faster.

Old:
$ size s2io.o
   text	   data	    bss	    dec	    hex	filename
 114289	    516	  33360	 148165	  242c5	s2io.o
New:
$ size s2io.o
   text	   data	    bss	    dec	    hex	filename
 114097	    516	  33360	 147973	  24205	s2io.o

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:35:06 -07:00
Joe Perches
a2a20aef44 s2io.c: fix spelling explaination
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:35:01 -07:00
Joe Perches
6cef2b8eb7 s2io.c: convert printks to pr_<level>
Fixed trivial typo as well

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:34:55 -07:00
Joe Perches
d44570e406 s2io.c: Make more conforming to normal kernel style
Still has a few long lines.

checkpatch was:
	total: 263 errors, 53 warnings, 8751 lines checked
is:
	total: 4 errors, 35 warnings, 8767 lines checked

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:34:51 -07:00
Joe Perches
44364a035a s2io.c: use kzalloc
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:34:46 -07:00
Joe Perches
4f87032021 s2io.c: Use calculated size in kmallocs
Use consistent style.  Don't calculate the kmalloc size multiple times

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:34:43 -07:00
Joe Perches
13d866a9c9 s2io.c: Shorten code line length by using intermediate pointers
Repeated variable use and line wrapping is hard to read.
Use temp variables instead of direct references.

struct fifo_info *fifo = &mac_control->fifos[i];
struct ring_info *ring = &mac_control->rings[i];
struct tx_fifo_config *tx_cfg = &config->tx_cfg[i];
struct rx_ring_config *rx_cfg = &config->rx_cfg[i];

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:34:39 -07:00
Joe Perches
6fce365df8 s2io.c: Use const for strings
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:34:36 -07:00
Krishna Kumar
a453e0689a pkt_sched: Fix resource limiting in pfifo_fast
pfifo_fast_enqueue has this check:
        if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {

which allows each band to enqueue upto tx_queue_len skbs for a
total of 3*tx_queue_len skbs. I am not sure if this was the
intention of limiting in qdisc.

Patch compiled and 32 simultaneous netperf testing ran fine. Also:
# tc -s qdisc show dev eth2
qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) 
 rate 0bit 0pps backlog 0b 0p requeues 25 

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:20:28 -07:00
Krishna Kumar
03a9a447d2 net: convert remaining non-symbolic return values in dev_queue_xmit
Patch compiled and 32 simultaneous netperf testing ran fine.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:16:57 -07:00
Krishna Kumar
7b3d3e4fc6 netdevice: Consolidate to use existing macros where available.
Patch compiled and 32 simultaneous netperf testing ran fine.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:16:20 -07:00
Oliver Hartkopp
6ca8b990e0 can: use correct NET_RX_ return values
Dropped skb's should be documented by an appropriate return value.
Use the correct NET_RX_DROP and NET_RX_SUCCESS values for that reason.

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:13:18 -07:00
roel kluin
5de3fcab91 WAN: bit and/or confusion
Fix the tests that check whether Frame* bits are not set

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 22:02:26 -07:00
Anton Vorontsov
2394905f67 ucc_geth: Implement suspend/resume and Wake-On-LAN support
This patch implements suspend/resume and WOL support for UCC Ethernet
driver.

We support two wake up events: wake on PHY/link changes and wake
on magic packet.

In some CPUs (like MPC8569) QE shuts down during sleep, so magic packet
detection is unusable, and also on resume we should fully reinitialize
UCC structures.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 21:51:47 -07:00
Anton Vorontsov
bf5aec2e79 ucc_geth: Remove UGETH_MAGIC_PACKET Kconfig symbol and code
This patch removes currently unused UGETH_MAGIC_PACKET Kconfig symbol
and code, i.e. magic_packet_detection_{enable,disable} functions.

The two functions each contain just two steps that we'll place into
suspend/resume code path under CONFIG_PM.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 21:51:43 -07:00
Anton Vorontsov
54b1598384 ucc_geth: Factor out MAC initialization steps into a call
This patch factors out MAC initialization into ucc_geth_init_mac()
function that we'll use for suspend/resume.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 21:51:37 -07:00
Anton Vorontsov
ed24157ede powerpc/qe: Implement qe_alive_during_sleep() helper function
In some CPUs (i.e. MPC8569) QE shuts down completely during sleep,
drivers may want to know that to reinitialize registers and buffer
descriptors.

This patch implements qe_alive_during_sleep() helper function, so far
it just checks if MPC8569-compatible power management controller is
present, which is a sign that QE turns off during sleep.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 21:51:33 -07:00
Anton Vorontsov
e0ad2cd8ff ucc_geth: Fix NULL pointer dereference in uec_get_ethtool_stats()
In commit 3e73fc9a12 ("ucc_geth: Fix IO
memory (un)mapping code") I fixed ug_regs IO memory leak by properly
freeing the allocated memory. But ethtool_stats() callback doesn't
check for ug_regs being NULL, and that causes following oops if
'ethtool -S' is executed on a closed eth device:

  Unable to handle kernel paging request for data at address 0x00000180
  Faulting instruction address: 0xc0208228
  Oops: Kernel access of bad area, sig: 11 [#1]
  ...
  NIP [c0208228] uec_get_ethtool_stats+0x38/0x140
  LR [c02559a0] ethtool_get_stats+0xf8/0x23c
  Call Trace:
  [ef87bcd0] [c025597c] ethtool_get_stats+0xd4/0x23c (unreliable)
  [ef87bd00] [c025706c] dev_ethtool+0xfe8/0x11bc
  [ef87be00] [c0252b5c] dev_ioctl+0x454/0x6a8
  ...
  ---[ end trace 77fff1162a9586b0 ]---
  Segmentation fault

This patch fixes the issue.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 21:51:29 -07:00
David S. Miller
b9caaabb99 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6 2009-08-30 21:30:39 -07:00
Matt Carlson
fc57e515a2 tg3: Update version to 3.101
This patch updates the tg3 version to 3.101.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:43:06 -07:00
Matt Carlson
f3f3f27e5b tg3: Move per-int tx members to a per-int struct
This patch moves the tx_prod, tx_cons, tx_pending, tx_ring, and
tx_buffers transmit ring device members to a per-interrupt structure.
It also adds a new transmit producer mailbox member (prodmbox) and
converts the code to use it rather than a preprocessor constant.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:43:04 -07:00
Matt Carlson
723344820a tg3: Move per-int rx members to per-int struct
This patch moves the rx_rcb, rx_rcb_mapping, and rx_rcb_ptr return ring
device members to a per-interrupt structure.  It also adds a new return
ring consumer mailbox register member (consmbox) and converts the code
to use it rather than a preprocessor constant.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:43:03 -07:00
Matt Carlson
898a56f8d8 tg3: Move general int members to a per-int struct
This patch moves the last_tag, last_tag_irq, and hw_status device
members to a per-interrupt structure.  It also adds a new interrupt
mailbox member (int_mbox) and converts the code to use it rather than a
direct preprocessor constant.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:43:01 -07:00
Matt Carlson
17375d25d3 tg3: Convert napi handlers to use tnapi
This patch converts the napi interrupt handler functions to accept and
use tg3_napi structures.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:59 -07:00
Matt Carlson
09943a1819 tg3: Convert ISR parameter to tnapi
This patch migrates the ISR parameter from struct net_device to struct
tg3_napi.  Checkpatch complains about the existence of the preexisting
IRQF_SAMPLE_RANDOM flag.  I've opted to keep this patch conservative and
let it continue to exist until the flag gets officially purged from the
kernel.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:58 -07:00
Matt Carlson
8ef0442f98 tg3: Move napi to per-int struct
This patch creates a per-interrupt data structure, moves the napi
member over, and creates a tg3 pointer back to the device structure.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:56 -07:00
Matt Carlson
07b0173cb5 tg3: Cleanup interrupt setup / teardown
Later patches will be adding MSIX support, which will complicate
interrupt initialization.  This patch prepares for the integration by
breaking out the interrupt setup and teardown code into separate
functions and cleaning up the error return paths.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:54 -07:00
Matt Carlson
79ed5ac7dd tg3: Use ext rx bds
The 5717 only uses extended buffer descriptors for the jumbo producer
ring.  Extended buffer descriptors are available on all devices that
support a separate jumbo producer ring so make the change universal.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:52 -07:00
Matt Carlson
21f581a536 tg3: Create a new prodring_set structure
This patch migrates most of the rx producer ring variables to a new
tg3_rx_prodring_set structure and modifies the code accordingly.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:50 -07:00
Matt Carlson
cf7a7298c4 tg3: Create rx producer ring setup routines
Later patches are going to complicate the ring initialization routines.
This patch breaks out the setup and teardown of the rx producer rings
into separate functions to make the code more readable.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:47 -07:00
Matt Carlson
287be12e17 tg3: Clarify rx buffer relationships
This patch attempts to document the various rx buffer sizes used by the
driver and how they relate to each other.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:43 -07:00
Matt Carlson
8f666b07ac tg3: Move the JUMBO_CAPABLE and SUPPORT_MSI flags
This patch moves where the jumbo capable and msi support flags are
located.  This is prep work for the addition of msix support flags.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:41 -07:00
Matt Carlson
fdb72b38c9 tg3: Break out mini producer ring handling
This patch separates the code that sets up the mini producer ring from
the code that sets up the jumbo producer rings.  The 5717 asic rev
devices do not have a mini ring, but do have a jumbo frame
implementation similar to the 5704 and previous devices.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:36 -07:00
Matt Carlson
8590a603e5 tg3: Reformat NVRAM case statements
This patch fixes up the NVRAM detection switch statements to conform
to the kernel coding style.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:33 -07:00
Matt Carlson
2befdcea96 tg3: Add new 5785 10/100 only device ID
This patch adds a new device ID for those 5785 devices that will only
use 10/100 phys.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:31 -07:00
Matt Carlson
0a9140cff2 tg3: Delay mdio bus init until fw finishes
The device firmware uses the MDIO bus during early setup.  If the driver
modifies the MDIO bus configuration while it is in use by the firmware,
any number of bad things can happen.  This patch delays MDIO setup until
after the firmware posts its magic signature, signifying initialization
is complete.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 15:42:27 -07:00
roel kluin
b3df9a514f tipc: fix test of bearer_priority range in tipc_register_media()
For the bearer_priority to be less than TIPC_MIN_LINK_PRI and greater than
TIPC_MAX_LINK_PRI is logically impossible.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:42 -07:00
Alexey Dobriyan
ea00b8e222 can: switch to seq_file
create_proc_read_entry() is going to be removed soon.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:38 -07:00
Yoshihiro Shimoda
4923576b8a net: sh_eth: add value of ether_link pin in platform_data
The method of ETHER_LINK pin is board dependence.
This patch adding paramters are:
 - no_ether_link          : If set to 1, do not use ETHER_LINK
 - ether_link_active_low  : If set to 1, ETHER_LINK is active low.

Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:35 -07:00
Rajashekhara, Sudhakar
2db9517ef3 TI DaVinci EMAC: delay DaVinci EMAC initialization
On TI's DA850/OMAP-L138 EVM, MAC address is stored in SPI
flash which is accessed using MTD interface.

This patch delays the initialization of DaVinci EMAC driver
by changing module_init to late_initcall. This helps SPI and
MTD drivers to get initialized before EMAC thereby enabling
EMAC driver to read the MAC address while booting and use it.

Tested with NFS on DM644x, DM6467, DA830/OMAP-L137 and
DA850/OMAP-L138 EVMs.

Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Reviewed-by: Chaithrika U S <chaithrika@ti.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:33 -07:00
Krzysztof Halasa
38edb5b87e WAN/LMC: Fix type_trans().
Fix lmc_proto_type() invocation.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:32 -07:00
Joe Perches
8a27f7c90f lib/vsprintf.c: Add "%pI6c" - print pointer as compressed ipv6 address
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Jens Rosenboom <jens@mcbone.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:26 -07:00
John Dykstra
9a7030b76a tcp: Remove redundant copy of MD5 authentication key
Remove the copy of the MD5 authentication key from tcp_check_req().
This key has already been copied by tcp_v4_syn_recv_sock() or
tcp_v6_syn_recv_sock().

Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:25 -07:00
Krishna Kumar
fd3ae5e8fc Speed-up pfifo_fast lookup using a private bitmap
Maintain a per-qdisc bitmap for pfifo_fast giving  availability
of skbs for each band. This allows faster lookup for a skb when
there are no high priority skbs. Also, it helps in (rare) cases
when there are no skbs on the list, where an immediate lookup is
faster than iterating through the three bands.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:19:21 -07:00
David Ward
31ce8c71a3 ipv6: Update Neighbor Cache when IPv6 RA is received on a router
When processing a received IPv6 Router Advertisement, the kernel
creates or updates an IPv6 Neighbor Cache entry for the sender --
but presently this does not occur if IPv6 forwarding is enabled
(net.ipv6.conf.*.forwarding = 1), or if IPv6 Router Advertisements
are not accepted (net.ipv6.conf.*.accept_ra = 0), because in these
cases processing of the Router Advertisement has already halted.

This patch allows the Neighbor Cache to be updated in these cases,
while still avoiding any modification to routes or link parameters.

This continues to satisfy RFC 4861, since any entry created in the
Neighbor Cache as the result of a received Router Advertisement is
still placed in the STALE state.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:04:09 -07:00
Michael Chan
078b073588 bnx2: Update firmware to 5.0.0.j3.
- Better small packet receive performance.
- Better handling of Flow control on 5709.
- Fixed iSCSI TMP ABORT TASK problem.
- Added iSCSI TCP timestamp option.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:02:46 -07:00
Octavian Purdila
80a1096bac tcp: fix premature termination of FIN_WAIT2 time-wait sockets
There is a race condition in the time-wait sockets code that can lead
to premature termination of FIN_WAIT2 and, subsequently, to RST
generation when the FIN,ACK from the peer finally arrives:

Time     TCP header
0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
0.000008 http > 30755 aSYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0

Say twdr->slot = 1 and we are running inet_twdr_hangman and in this
instance inet_twdr_do_twkill_work returns 1. At that point we will
mark slot 1 and schedule inet_twdr_twkill_work. We will also make
twdr->slot = 2.

Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
is called which will create a new FIN_WAIT2 time-wait socket and will
place it in the last to be reached slot, i.e. twdr->slot = 1.

At this point say inet_twdr_twkill_work will run which will start
destroying the time-wait sockets in slot 1, including the just added
TCP_FIN_WAIT2 one.

To avoid this issue we increment the slot only if all entries in the
slot have been purged.

This change may delay the slots cleanup by a time-wait death row
period but only if the worker thread didn't had the time to run/purge
the current slot in the next period (6 seconds with default sysctl
settings). However, on such a busy system even without this change we
would probably see delays...

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-29 00:00:35 -07:00