Commit 1d7138de87
igmp: RCU conversion of in_dev->mc_list
converted rwlock to RCU.
Update the s390 network drivers(qeth & lcs) code to adapt to this change.
V2 : Changes based on suggestions given by Eric Dumazet
Signed-off-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit to convert to use the dev_pm_ops struct
introduces a bug. The shutdown flag is not yet used
because the hibernation on memory is done by using
the freeze callback.
Thanks to Vlad for having reported it.
Reported-by: Vlad Lungu <vlad.lungu@windriver.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the ret variable is not used for anything other than
receive the value of the t3_adapter_error(), which will always be 0,
because the reset parameter is 0.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
jhash is widely used in the kernel and because the functions
are inlined, the cost in size is significant. Also, the new jhash
functions are slightly larger than the previous ones so better un-inline.
As a preparation step, the calls to the internal macros are replaced
with the plain jhash function calls.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
fl->fl_gre_key is network byte order contrary to fl->fl_icmp_*.
Make xfrm_flowi_{s|d}port return network byte order values for gre
key too.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calculating the BD entry using a modulus operation isn't optimal, especially
inside the loop. This patch removes the modulus operations in favour of:
i) simply checking for wrapping in the case of cur_rx
ii) forcing num_tx to be a power of two and using it to mask out the
entry from cur_tx
The also prevents possible issues related overflow of the cur_rx and cur_tx
counters.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
update_ethoc_tx_stats doesn't need to return anything so make its return
type void in order to avoid an unnecessary cast when the function is called.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
MDIO read and write were checking whether a timeout had expired to determine
whether to recheck the result of the MDIO operation. Under heavy CPU usage,
however, it was possible for the timeout to expire before the routine got
around to be able to check a second time even, thus erroneousy returning an
-EBUSY.
This patch changes the the MDIO IO routines to try up to five times to complete
the operation before giving up, thus lessening the dependency on CPU load.
This resolves a problem whereby a ping flood would keep the CPU so busy that
the above problem would manifest itself; the MDIO command to check link status
would fail and the interface would erroneously be shut down.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
The old interrupt handling was incorrect in that it did not account for the
fact that the interrupt source bits get set irregardless of whether or not
their corresponding mask is set. This patch fixes that by masking off the
source bits for masked interrupts.
Furthermore, the handling of transmission events is moved to the NAPI polling
handler alongside the reception handler, thus preventing a whole bunch of
interrupts during heavy traffic.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
An interrupt may occur between checking bd.stat and clearing the
interrupt source register which would result in the packet going totally
unnoticed as the interrupt will be missed. Double check bd.stat after
clearing the interrupt source register to guard against such an
occurrence.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Occasionally, it seems that some race is causing the interrupts to not be
reenabled otherwise with the end result that networking just stops working.
Enabling interrupts after calling napi_complete is more in line with what
other drivers do.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the ability to describe ethernet devices via a flattened
device tree. As device tree remains an optional feature, these bits all
need to be guarded by CONFIG_OF ifdefs.
MAC address is settable via the device tree parameter "local-mac-address";
however, the selection of the phy id is limited to probing, for now.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In KVM passthrough mode, the driver may not have config access to
non-standard registers. The BNX2_PCICFG_MISC_CONFIG config register
access to setup mailbox swapping can be done using MMIO.
Update version to 2.0.20.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 chip requires the BNX2_MISC_NEW_CORE_CTL_DMA_ENABLE bit to be
cleared and polling for pending DMAs to complete before chip reset.
Without this step, we've seen NMIs during repeated resets of the chip.
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the vlan device is lockless and single queue do not
transfer the real num queues. This is causing a BUG_ON to occur.
kernel BUG at net/8021q/vlan.c:345!
Call Trace:
[<ffffffff813fd6e8>] ? fib_rules_event+0x28/0x1b0
[<ffffffff814ad2b5>] notifier_call_chain+0x55/0x80
[<ffffffff81089156>] raw_notifier_call_chain+0x16/0x20
[<ffffffff813e5af7>] call_netdevice_notifiers+0x37/0x70
[<ffffffff813e6756>] netdev_features_change+0x16/0x20
[<ffffffffa02995be>] ixgbe_fcoe_enable+0xae/0x100 [ixgbe]
[<ffffffffa01da06a>] vlan_dev_fcoe_enable+0x2a/0x30 [8021q]
[<ffffffffa02d08c3>] fcoe_create+0x163/0x630 [fcoe]
[<ffffffff811244d5>] ? mmap_region+0x255/0x5a0
[<ffffffff81080ef0>] param_attr_store+0x50/0x80
[<ffffffff810809b6>] module_attr_store+0x26/0x30
[<ffffffff811b9db2>] sysfs_write_file+0xf2/0x180
[<ffffffff8114fc88>] vfs_write+0xc8/0x190
[<ffffffff81150621>] sys_write+0x51/0x90
[<ffffffff8100c0b2>] system_call_fastpath+0x16/0x1b
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When testing struct netdev_queue state against FROZEN bit, we also test
XOFF bit. We can test both bits at once and save some cycles.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. IPV6_TLV_TEL_DST_SIZE
This has not been using for several years since created.
2. RT6_INFO_LEN
commit 33120b30 kill all RT6_INFO_LEN's references, but only this definition remained.
commit 33120b30cc
Author: Alexey Dobriyan <adobriyan@sw.ru>
Date: Tue Nov 6 05:27:11 2007 -0800
[IPV6]: Convert /proc/net/ipv6_route to seq_file interface
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In kdump environment do not depend on reset_devices
parameter to reset the device as the parameter may become obsolete.
Instead use an adapter specific mechanism to determine if the device
needs a reset.
Driver maintains a count of number of pci functions probed
and decrements the count when remove handler of that pci function
is called. If the first probe, probe of function 0,
detects the count as non zero then reset the device.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In kdump environment do not depend upon reset_devices parameter
to reset the pci function as this parameter may become obsolete.
Instead use an adapter specific mechanism to determine if the pci
function needs to be reset.
Per function refcount is maintained in driver, which is set in probe
and reset in remove handler of adapter. If the probe detects the count
as non zero then reset the function.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These macros have been existed for several years since v2.6.12-rc2.
But they never be used. So remove them now.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As David pointed out correctly, updates to af-specific attributes
are currently not atomic. If multiple changes are requested and
one of them fails, previous updates may have been applied already
leaving the link behind in a undefined state.
This patch splits the function parse_link_af() into two functions
validate_link_af() and set_link_at(). validate_link_af() is placed
to validate_linkmsg() check for errors as early as possible before
any changes to the link have been made. set_link_af() is called to
commit the changes later.
This method is not fail proof, while it is currently sufficient
to make set_link_af() inerrable and thus 100% atomic, the
validation function method will not be able to detect all error
scenarios in the future, there will likely always be errors
depending on states which are f.e. not protected by rtnl_mutex
and thus may change between validation and setting.
Also, instead of silently ignoring unknown address families and
config blocks for address families which did not register a set
function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are
returned to avoid comitting 4 out of 5 update requests without
notifying the user.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vzalloc() and vzalloc_node() in net drivers
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Key changes are:
- EQ ids are not assigned consecutively in Lancer. So, fix mapping of MSIx
vector to EQ-id.
- BAR mapping and some req locations different for Lancer.
- TCP,UDP,IP checksum fields must be compulsorily set in TX wrb for TSO in
Lancer.
- CEV_IST reg not present in Lancer; so, peek into event queue to check for
new entries
- cq_create and mcc_create cmd interface is different for Lancer; handle
accordingly
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements transmit packet steering (XPS) for multiqueue
devices. XPS selects a transmit queue during packet transmission based
on configuration. This is done by mapping the CPU transmitting the
packet to a queue. This is the transmit side analogue to RPS-- where
RPS is selecting a CPU based on receive queue, XPS selects a queue
based on the CPU (previously there was an XPS patch from Eric
Dumazet, but that might more appropriately be called transmit completion
steering).
Each transmit queue can be associated with a number of CPUs which will
use the queue to send packets. This is configured as a CPU mask on a
per queue basis in:
/sys/class/net/eth<n>/queues/tx-<n>/xps_cpus
The mappings are stored per device in an inverted data structure that
maps CPUs to queues. In the netdevice structure this is an array of
num_possible_cpu structures where each structure holds and array of
queue_indexes for queues which that CPU can use.
The benefits of XPS are improved locality in the per queue data
structures. Also, transmit completions are more likely to be done
nearer to the sending thread, so this should promote locality back
to the socket on free (e.g. UDP). The benefits of XPS are dependent on
cache hierarchy, application load, and other factors. XPS would
nominally be configured so that a queue would only be shared by CPUs
which are sharing a cache, the degenerative configuration woud be that
each CPU has it's own queue.
Below are some benchmark results which show the potential benfit of
this patch. The netperf test has 500 instances of netperf TCP_RR test
with 1 byte req. and resp.
bnx2x on 16 core AMD
XPS (16 queues, 1 TX queue per CPU) 1234K at 100% CPU
No XPS (16 queues) 996K at 100% CPU
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In dev_pick_tx, don't do work in calculating queue
index or setting
the index in the sock unless the device has more than one queue. This
allows the sock to be set only with a queue index of a multi-queue
device which is desirable if device are stacked like in a tunnel.
We also allow the mapping of a socket to queue to be changed. To
maintain in order packet transmission a flag (ooo_okay) has been
added to the sk_buff structure. If a transport layer sets this flag
on a packet, the transmit queue can be changed for the socket.
Presumably, the transport would set this if there was no possbility
of creating OOO packets (for instance, there are no packets in flight
for the socket). This patch includes the modification in TCP output
for setting this flag.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_base_lock is the legacy way to lock the device list, and is planned
to disappear. (writers hold RTNL, readers hold RCU lock)
Convert rdma_translate_ip() and update_ipv6_gids() to RCU locking.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lower SCM_MAX_FD from 255 to 253 so that allocations for scm_fp_list are
halved. (commit f8d570a4 added two pointers in this structure)
scm_fp_dup() should not copy whole structure (and trigger kmemcheck
warnings), but only the used part. While we are at it, only allocate
needed size.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_sk_mc_lock rwlock becomes a spinlock.
readers (inet6_mc_check()) now takes rcu_read_lock() instead of read
lock. Writers dont need to disable BH anymore.
struct ipv6_mc_socklist objects are reclaimed after one RCU grace
period.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates the PM support using the dev_pm_ops
and reviews the hibernation support.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds in the plat_stmmacenet_data
the init and exit callbacks that can be used
for invoking specific platform functions.
For example, on ST targets, these call the
PAD manager functions to set PIO lines and
syscfg registers.
The patch removes the stmmac_claim_resource
only used on STM Kernels as well.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch tidies-up the stmmac_priv structure
that had many fileds alredy defined in the
plat_stmmacenet_data structure.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the interrupt mode configuration and NAPIs adding before a
register_netdev() call to prevent netdev->open() from running
before these functions are done.
Advance a driver version number.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reported-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>