In case the discovery session is carried over iser, we can't
access the assumed network portal since the default portal is
used. In this case we don't really need to allocate the fastreg
pool, just prepare to the text pdu that will follow.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Alex Tabachnik <alext@mellanox.com>
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This is a small follow-up to the larger ARM SoC updates merged
last week, almost entirely for the keystone platform.
The main change here is to use the new dma-ranges parsing code
that came in through Russell's ARM tree. This allows the keystone
platform to do cache-coherent DMA and to finally support all the
available physical memory when LPAE is enabled.
Aside from this, the keystone reset driver has been rewritten,
and there is a small bug fix to allow building the orion5x platform
again.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAU5harmCrR//JCVInAQJuXhAA2v+IVK4HqkIric4Hj/7monhv/Gxm+7Fj
+3GYX9QHncSnILQm0ke+OYKYoLNPDbDVyvsY5JWbewLC3TbvnKZoM25Q4Aj6Fcj5
GAVgggMtc7dkk8VIDfUvjyYZsiiWcvhATvwlQg70oOC0LmRjPwQnBvyP+zvUiBUJ
++xBGiVsfScIRhbjB+cYZl7dhLXA+GKzaCw9sKVCS+ExmlAYBSy4cUkRCkw4AQI6
s4udVuPtFi7++rNTdn2MBeyLyk0pOjQu+jepo/bwmKGaKS88KwZjTCPI1YRtrkrv
ZjF4AY0kjUenoJuN9Dj9q6I6Ivc8YFDr411hF/LwiBF7+QUk0EnAEeAhDGan/ZCG
1+xZbY54tQk+TNNgR2V/RzUWilhPmG3rdtq9MnvYrmQnyGeF07MCCAIqoH+B+hDm
AvApocOT9sReD4yTNIBHYPKsp04jLXR5XKRZ98QkAOInEQaYTpFOU3LyrdgFLLB3
fIfDf2ZHCc6cmp/wXKk83LVxFQqYbyw0Q93xB/X/8yqb/NyJ8hVaKWgAcofUuj3t
M87I+XcljA/dx8FsELM3B+qSZXakdDUgXBkfNePkm7GGlq5NziBTMJnhbXiIVMe+
4kZIyxRD3OZj0d1K3mHMSvoX/FY/98AJWk2eizwwIx64PskatiKQTkX9XFnYPXLb
AtdWv8t3Ozc=
=AqkT
-----END PGP SIGNATURE-----
Merge tag 'soc2-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull part two of ARM SoC updates from Arnd Bergmann:
"This is a small follow-up to the larger ARM SoC updates merged last
week, almost entirely for the keystone platform.
The main change here is to use the new dma-ranges parsing code that
came in through Russell's ARM tree. This allows the keystone platform
to do cache-coherent DMA and to finally support all the available
physical memory when LPAE is enabled.
Aside from this, the keystone reset driver has been rewritten, and
there is a small bug fix to allow building the orion5x platform again"
* tag 'soc2-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: keystone: Drop use of meminfo since its not available anymore
ARM: orion5x: fix mvebu_mbus_dt_init call
ARM: configs: keystone: enable reset driver support
ARM: dts: keystone: update reset node to work with reset driver
ARM: keystone: remove redundant reset stuff
ARM: keystone: Update the dma offset for non-dt platform devices
ARM: keystone: Switch over to coherent memory address space
ARM: configs: keystone: add MTD_SPI_NOR (new dependency for M25P80)
ARM: configs: keystone: drop CONFIG_COMMON_CLK_DEBUG
Pull reiserfs and ext3 changes from Jan Kara:
"Big reiserfs cleanup from Jeff, an ext3 deadlock fix, and some small
cleanups"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (34 commits)
reiserfs: Fix compilation breakage with CONFIG_REISERFS_CHECK
ext3: Fix deadlock in data=journal mode when fs is frozen
reiserfs: call truncate_setsize under tailpack mutex
fs/jbd/revoke.c: replace shift loop by ilog2
reiserfs: remove obsolete __constant_cpu_to_le32
reiserfs: balance_leaf refactor, split up balance_leaf_when_delete
reiserfs: balance_leaf refactor, format balance_leaf_finish_node
reiserfs: balance_leaf refactor, format balance_leaf_new_nodes_paste
reiserfs: balance_leaf refactor, format balance_leaf_paste_right
reiserfs: balance_leaf refactor, format balance_leaf_insert_right
reiserfs: balance_leaf refactor, format balance_leaf_paste_left
reiserfs: balance_leaf refactor, format balance_leaf_insert_left
reiserfs: balance_leaf refactor, pull out balance_leaf{left, right, new_nodes, finish_node}
reiserfs: balance_leaf refactor, pull out balance_leaf_finish_node_paste
reiserfs: balance_leaf refactor pull out balance_leaf_finish_node_insert
reiserfs: balance_leaf refactor, pull out balance_leaf_new_nodes_paste
reiserfs: balance_leaf refactor, pull out balance_leaf_new_nodes_insert
reiserfs: balance_leaf refactor, pull out balance_leaf_paste_right
reiserfs: balance_leaf refactor, pull out balance_leaf_insert_right
reiserfs: balance_leaf refactor, pull out balance_leaf_paste_left
...
free_msi_irqs() is leaking memory, since list_for_each_entry(entry,
&dev->msi_list, list) {...} is never executed, because dev->msi_list is
made empty by the loop just above this one.
Fix it by relying on zero termination of attribute array like
populate_msi_sysfs() does.
Fixes: 1c51b50c29 ("PCI/MSI: Export MSI mode using attributes, not kobjects")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: stable@vger.kernel.org # v3.14+
Pull btrfs updates from Chris Mason:
"The biggest change here is Josef's rework of the btrfs quota
accounting, which improves the in-memory tracking of delayed extent
operations.
I had been working on Btrfs stack usage for a while, mostly because it
had become impossible to do long stress runs with slab, lockdep and
pagealloc debugging turned on without blowing the stack. Even though
you upgraded us to a nice king sized stack, I kept most of the
patches.
We also have some very hard to find corruption fixes, an awesome sysfs
use after free, and the usual assortment of optimizations, cleanups
and other fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (80 commits)
Btrfs: convert smp_mb__{before,after}_clear_bit
Btrfs: fix scrub_print_warning to handle skinny metadata extents
Btrfs: make fsync work after cloning into a file
Btrfs: use right type to get real comparison
Btrfs: don't check nodes for extent items
Btrfs: don't release invalid page in btrfs_page_exists_in_range()
Btrfs: make sure we retry if page is a retriable exception
Btrfs: make sure we retry if we couldn't get the page
btrfs: replace EINVAL with EOPNOTSUPP for dev_replace raid56
trivial: fs/btrfs/ioctl.c: fix typo s/substract/subtract/
Btrfs: fix leaf corruption after __btrfs_drop_extents
Btrfs: ensure btrfs_prev_leaf doesn't miss 1 item
Btrfs: fix clone to deal with holes when NO_HOLES feature is enabled
btrfs: free delayed node outside of root->inode_lock
btrfs: replace EINVAL with ERANGE for resize when ULLONG_MAX
Btrfs: fix transaction leak during fsync call
btrfs: Avoid trucating page or punching hole in a already existed hole.
Btrfs: update commit root on snapshot creation after orphan cleanup
Btrfs: ioctl, don't re-lock extent range when not necessary
Btrfs: avoid visiting all extent items when cloning a range
...
This update contains:
o cleanup removing unused function args
o rework of the filestreams allocator to use dentry cache parent lookups
o new on-disk free inode btree and optimised inode allocator
o various bug fixes
o rework of internal attribute API
o cleanup of superblock feature bit support to remove historic cruft
o more fixes and minor cleanups
o added a new directory/attribute geometry abstraction
o yet more fixes and minor cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJTl+YfAAoJEK3oKUf0dfod/T8QALmvR28JZTL3vtlD5rppXp9+
DXOMrwgJ9V+GOI39tpUgw1u/C5DuaFPRPtmCjnb9Do4DJMrHj+zD8ADvoVd6asa0
FHH4TuulQqOJVu67SZ3ng15yjyy+wPfymQZIQPQY/IwVMUUEpWnSFnKha1GAsL8Y
RY/WNU50wMu4wxi0++ENooHJC2EoxXpzB80cHddN81zFEFZobw0cm5Aa5xBZEZ4i
P+GpEuUpWHKvVaWRLuIMgVC0NuOt5KtLfS97ong+tRgWCw//QVl28Rxhrj1ZHsF3
VAskVsSFVIIPHP7qKjQyCGk71iqBfrfAgRqqJHFZgSmtSzyK3hVvJlRRDdCT5hi6
00aHg9vz9815I7zrQwyMuy872N3DTislOxJZGD7PKgLpgfeHs4qk+cQ1xCi2gdFn
xnh2p4mLolZHzanUsoxYpSh7f7o+NT3xgET3yS63uuO/I57o74JJDfRDjWNX6I9F
LLtIGb1cwVFUYbXcHGfP1wxQ1BS6rYYYwKpSJqqwJXApL9MqoxH2B8Hoo0BaG43/
3UlNi+yljvhBNiJnx2pAIdU+WaIL1ZQj9XzuU1sFSa8lnFNb2x+wkgjHzJ0Hdotm
zZqirCo1jyyNkyTwGfwJwGzNgZemQDMQ7cr2MYzG1mhFMLEZZJeFmWVzfuzJ3yoR
jke/Hy/qiWVK0en43MdR
=qnz2
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-3.16-rc1' of git://oss.sgi.com/xfs/xfs
Pull xfs updates from Dave Chinner:
"This update contains:
- cleanup removing unused function args
- rework of the filestreams allocator to use dentry cache parent
lookups
- new on-disk free inode btree and optimised inode allocator
- various bug fixes
- rework of internal attribute API
- cleanup of superblock feature bit support to remove historic cruft
- more fixes and minor cleanups
- added a new directory/attribute geometry abstraction
- yet more fixes and minor cleanups"
* tag 'xfs-for-linus-3.16-rc1' of git://oss.sgi.com/xfs/xfs: (86 commits)
xfs: fix xfs_da_args sparse warning in xfs_readdir
xfs: Fix rounding in xfs_alloc_fix_len()
xfs: tone down writepage/releasepage WARN_ONs
xfs: small cleanup in xfs_lowbit64()
xfs: kill xfs_buf_geterror()
xfs: xfs_readsb needs to check for magic numbers
xfs: block allocation work needs to be kswapd aware
xfs: remove redundant geometry information from xfs_da_state
xfs: replace attr LBSIZE with xfs_da_geometry
xfs: pass xfs_da_args to xfs_attr_leaf_newentsize
xfs: use xfs_da_geometry for block size in attr code
xfs: remove mp->m_dir_geo from directory logging
xfs: reduce direct usage of mp->m_dir_geo
xfs: move node entry counts to xfs_da_geometry
xfs: convert dir/attr btree threshold to xfs_da_geometry
xfs: convert m_dirblksize to xfs_da_geometry
xfs: convert m_dirblkfsbs to xfs_da_geometry
xfs: convert directory segment limits to xfs_da_geometry
xfs: convert directory db conversion to xfs_da_geometry
xfs: convert directory dablk conversion to xfs_da_geometry
...
No need to read the PCI register for the PF's base queue on every single Tx
queue enable and disable as we already have the value stored from reading
the capability features at startup.
Change-ID: Ic02fb622757742f43cb8269369c3d972d4f66555
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A drop action comes down as a ring_cookie value, so allow it as
a special value that can be used to configure destination control.
Also fix the output to filter read command accordingly.
Change-ID: I9956723cee42f3194885403317dd21ed4a151144
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add members to stat struct to keep track of Flow director ATR and
SideBand filter packet matches.
Change-ID: Ibbb31a53c7adcc2bb96991dd80565442a2f2513c
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change drops the FTYPE field from the Rx descriptor, to
match the hardware implementation.
Change-ID: I66d31d2b43861da45e8ace4fb03df033abe88bab
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
FW can indicate any admin queue error states to the driver via some bits
in the length registers. Each time we process an admin queue message,
check these bits and log any errors we find. Since the VF really can't
do much, we just print the message and depend on the PF driver to clear
things up on our behalf.
Change-ID: I92bc6c53ce3b4400544e0ca19c5de2d27490bd0d
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Linux gives us a function to copy Ethernet MAC addresses, let's use it.
Change-ID: I0c861900029ca5ea65a53ca39565852fb633f6fd
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove the filter created by the firmware with the default MAC address it
reads out of the NVM storage and a promiscuous VLAN tag and replace it
with a filter that will not accept tagged packets by default. The system
must request a VLAN tag packet filter to get packets with that tag.
Change-ID: I119e6c3603a039bd68282ba31bf26f33a575490a
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver
inserts a tag when transmitting a packet from the port even if there
are no DCB traffic classes configured at the port.
This patch adds a new flag I40E_FLAG_DCB_CAPABLE that will be set
when the DCB capability is present and the existing flag
I40E_FLAG_DCB_ENABLED will be set only if there are more than one
traffic classes configured at the port.
Change-ID: I24ccbf53ef293db2eba80c8a9772acf729795bd5
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the device is down, there's no place to go but up, so don't try to go
down even more. This prevents a CPU soft lock in napi_disable().
Change-ID: I8b058b9ee974dfa01c212fae2597f4f54b333314
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In XL710 devices we program FD filter's fields from Tx perspective of the flow.
However the user interface exposed in ethtool should be compliant with the
previous generation of drivers where a filter src and dst field are from
the RX perspective. This patch changes the ethtool interface in this regard
to match the other drivers.
Change-ID: Iec6ccddd87357c4fb53ccf33aa0fae699faf70cf
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add set_pf_context, replace set_phy_reset with set_phy_debug, add
nvm_config_read/write, remove nvm_read/write_reg_se and add some
PHY types.
With these changes we bump the API version to 1.2.
Change-ID: I4dc3aec175c2316f66fc9b726b3f7d594699d84e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set appropriate fields in Tx queue configuration virtchnl message
to pf to enable headwb and setup headwb addr.
Then use that info from the VF to set headwb and headwb_addr instead of
always enabling them.
Change-ID: I7d393d1b2b07f0f3355b3a4f7c2d3c6ee3b0d622
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch separates the hardware logic from the set function, so that
we can re-use it during a ptp_reset. This enables the reset to return
functionality to the last known timestamp mode, rather than resetting
the value. We initialize the mode to off during the ptp_init cycle.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Return a 0 directly rather than a constant.
Reported-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pull block layer fixes from Jens Axboe:
"Final small batch of fixes to be included before -rc1. Some general
cleanups in here as well, but some of the blk-mq fixes we need for the
NVMe conversion and/or scsi-mq. The pull request contains:
- Support for not merging across a specified "chunk size", if set by
the driver. Some NVMe devices perform poorly for IO that crosses
such a chunk, so we need to support it generically as part of
request merging avoid having to do complicated split logic. From
me.
- Bump max tag depth to 10Ki tags. Some scsi devices have a huge
shared tag space. Before we failed with EINVAL if a too large tag
depth was specified, now we truncate it and pass back the actual
value. From me.
- Various blk-mq rq init fixes from me and others.
- A fix for enter on a dying queue for blk-mq from Keith. This is
needed to prevent oopsing on hot device removal.
- Fixup for blk-mq timer addition from Ming Lei.
- Small round of performance fixes for mtip32xx from Sam Bradshaw.
- Minor stack leak fix from Rickard Strandqvist.
- Two __init annotations from Fabian Frederick"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: add __init to blkcg_policy_register
block: add __init to elv_register
block: ensure that bio_add_page() always accepts a page for an empty bio
blk-mq: add timer in blk_mq_start_request
blk-mq: always initialize request->start_time
block: blk-exec.c: Cleaning up local variable address returnd
mtip32xx: minor performance enhancements
blk-mq: ->timeout should be cleared in blk_mq_rq_ctx_init()
blk-mq: don't allow queue entering for a dying queue
blk-mq: bump max tag depth to 10K tags
block: add blk_rq_set_block_pc()
block: add notion of a chunk size for request merging
- refactor m25p80.c driver for use as a general SPI NOR framework for other
drivers which may speak to SPI NOR flash without providing full SPI support
(i.e., not part of drivers/spi/)
- new Freescale QuadSPI driver (utilizing new SPI NOR framework)
- updates for the STMicro "FSM" SPI NOR driver
- fix sync/flush behavior on mtd_blkdevs
- fixup subpage write support on a few NAND drivers
- correct the MTD OOB test for odd-sized OOB areas
- add BCH-16 support for OMAP NAND
- fix warnings and trivial refactoring
- utilize new ECC DT bindings in pxa3xx NAND driver
- new LPDDR NVM driver
- address a few assorted bugs caught by Coverity
- add new imx6sx support for GPMI NAND
- use a bounce buffer for NAND when non-DMA-able buffers are used
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTl/9oAAoJEFySrpd9RFgtfnUP+wURZfF1Xnek/3yNZP0Pt90x
yToXPDgsK5oteBAAWMUwtnJImDzsUMD8BgLNLU1jvPKuvMo9lwNMaCF+l1wUrTeC
F1VgYWq4tub3tk104Dthlguk0Jhj66k61LbvFvKXhkGEYGD9iPFeTPWyARUZTYOv
R4eRybuU+l2ZTDd+vNStXx9oWyqzWXekwrhMi10YWoxF694kBMI4C0rZQ2CexjVl
B5K0oL2P8JU/yNLgtMgPOfkh8rHZEoXECA3vaQscZzsOnc0evDndKSTkTX1Ls61Y
eWFgXV6qyhL+5VKTiHNzi6/J0NeNaTquOs9HoBuWr1DwaS8aoWEhBjeuNrXGYs/s
6++CRoDDcdWunAXBH8hqFSu6IweYB5TQ+QMUa7Z69C9n/fZg82dF4i2RSnp4Y2rs
qI19LrPIpdyL1ril5ndp0U2JRYXdxOpX3+jf2anG6u3vYjzI8P8tXEGKUz/uNpnK
fpEn2zKpeHAq62eqScuhGsO7MO2bIg7yNKMdSoeeeT9dgbah6fkrQnaDNbtjC+Y1
rXMhgLiVebmm8BVe6w5XSFqCw+76RxmO04TAj/Vy3WVPQ2KNn+OuLc0yVlsqAO9n
7Y19QvHeMZZW4O/w5RQ/OniJpysXN0ESj2cE93DHdgUPQ5aedIN0r5eQA0M1e8c6
W2MQFS5nJPiCxUYia4KP
=6UIq
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20140610' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
- refactor m25p80.c driver for use as a general SPI NOR framework for
other drivers which may speak to SPI NOR flash without providing full
SPI support (i.e., not part of drivers/spi/)
- new Freescale QuadSPI driver (utilizing new SPI NOR framework)
- updates for the STMicro "FSM" SPI NOR driver
- fix sync/flush behavior on mtd_blkdevs
- fixup subpage write support on a few NAND drivers
- correct the MTD OOB test for odd-sized OOB areas
- add BCH-16 support for OMAP NAND
- fix warnings and trivial refactoring
- utilize new ECC DT bindings in pxa3xx NAND driver
- new LPDDR NVM driver
- address a few assorted bugs caught by Coverity
- add new imx6sx support for GPMI NAND
- use a bounce buffer for NAND when non-DMA-able buffers are used
* tag 'for-linus-20140610' of git://git.infradead.org/linux-mtd: (77 commits)
mtd: gpmi: add gpmi support for imx6sx
mtd: maps: remove check for CONFIG_MTD_SUPERH_RESERVE
mtd: bf5xx_nand: use the managed version of kzalloc
mtd: pxa3xx_nand: make the driver work on big-endian systems
mtd: nand: omap: fix omap_calculate_ecc_bch() for-loop error
mtd: nand: r852: correct write_buf loop bounds
mtd: nand_bbt: handle error case for nand_create_badblock_pattern()
mtd: nand_bbt: remove unused variable
mtd: maps: sc520cdp: fix warnings
mtd: slram: fix unused variable warning
mtd: pfow: remove unused variable
mtd: lpddr: fix Kconfig dependency, for I/O accessors
mtd: nand: pxa3xx: Add supported ECC strength and step size to the DT binding
mtd: nand: pxa3xx: Use ECC strength and step size devicetree binding
mtd: nand: pxa3xx: Clean pxa_ecc_init() error handling
mtd: nand: Warn the user if the selected ECC strength is too weak
mtd: nand: omap: Documentation: How to select correct ECC scheme for your device ?
mtd: nand: omap: add support for BCH16_ECC - NAND driver updates
mtd: nand: omap: add support for BCH16_ECC - ELM driver updates
mtd: nand: omap: add support for BCH16_ECC - GPMC driver updates
...
Mostly performance improvements with a few corner-case bug fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIVAwUAU5fevznsnt1WYoG5AQKzNxAAgjZPg6LZgl0wEHJi09ykqLK9dSkglz2B
RY5IEWPEMcsGJQbSX7fEME5WSHkqKBIHx43xmdzijPCHyWLnF4vQRubNAi/Wo8PC
yfJyI5hMrG8+0rNKnmXWeG+NLGj7l7j+wkExe27VBlu2+6ATvIUCa34kjOWFDJJx
xAN9/t22XGly776SiuLXTpMhA0pRWH4sqijJhvM5EqPlsEUjGxvQU/URwUPYzhYr
iCbvY64uKB1Crh+vn7yP1IUWc77aDOoUPIxO50m0GXOQFU/wbQta1NAFmMeu8tIo
3QRlGzP+jRJFBV7jLii4nHTMSVxBmD2zMppxVtDoq/egWhyZlJhh2Kd5sUKo2r9V
8bGQm7e1JlsDsRMVXxSlx3TebZDfJhtnFNhYb6JPlpWM5EYjCzXXfJUl3/6LfKX6
BKJ800xvabr5TFkVubF9H80LTQxaO+pQoS6PSs8hnwneqtybdbt9o7V59yluMGao
IwaG/BY4jlY+76YnK32Kbr1eofUNYXBLg45mgSkrhUVqv9HLHe0CDnYAzYs68aur
zG8a1rhA9IJtBmWwql2Gm8lKOnDFM6onvSmaYntsabJ++eahaNDpL5ck+R4kO8mJ
pyI1N6GQDjtLdONk4NmH36P+CuCZje6KwRbVXvsJIBA1042tkZfU0zdqgt5QhFZj
Ma9ZCn9wZFo=
=MaM/
-----END PGP SIGNATURE-----
Merge tag 'md/3.16' of git://neil.brown.name/md
Pull md updates from Neil Brown:
"Assorted md fixes for 3.16
Mostly performance improvements with a few corner-case bug fixes"
* tag 'md/3.16' of git://neil.brown.name/md:
raid5: speedup sync_request processing
md/raid5: deadlock between retry_aligned_read with barrier io
raid5: add an option to avoid copy data from bio to stripe cache
md/bitmap: remove confusing code from filemap_get_page.
raid5: avoid release list until last reference of the stripe
md: md_clear_badblocks should return an error code on failure.
md/raid56: Don't perform reads to support writes until stripe is ready.
md: refuse to change shape of array if it is active but read-only
There was a bug in debug printout when CONFIG_REISERFS_CHECK was
enabled so one of the assertions in do_balan.c didn't compile. Fix it.
Fixes: 0080e9f9d3
Signed-off-by: Jan Kara <jack@suse.cz>
Currently we forward MCEs to guest which have been recovered by guest.
And for unhandled errors we do not deliver the MCE to guest. It looks like
with no support of FWNMI in qemu, guest just panics whenever we deliver the
recovered MCEs to guest. Also, the existig code used to return to host for
unhandled errors which was casuing guest to hang with soft lockups inside
guest and makes it difficult to recover guest instance.
This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
And, for recovered errors we just go back to normal functioning of guest
instead of returning to host. This fixes soft lockup issues in guest.
This patch also fixes an issue where guest MCE events were not logged to
host console.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We don't see MCE counter getting increased in /proc/interrupts which gives
false impression of no MCE occurred even when there were MCE events.
The machine check early handling was added for PowerKVM and we missed to
increment the MCE count in the early handler.
We also increment mce counters in the machine_check_exception call, but
in most cases where we handle the error hypervisor never reaches there
unless its fatal and we want to crash. Only during fatal situation we may
see double increment of mce count. We need to fix that. But for
now it always good to have some count increased instead of zero.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently machine check handler does not check for stack overflow for
nested machine check. If we hit another MCE while inside the machine check
handler repeatedly from same address then we get into risk of stack
overflow which can cause huge memory corruption. This patch limits the
nested MCE level to 4 and panic when we cross level 4.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Current code does not check for unhandled/unrecovered errors and return from
interrupt if it is recoverable exception which in-turn triggers same machine
check exception in a loop causing hypervisor to be unresponsive.
This patch fixes this situation and forces hypervisor to panic for
unhandled/unrecovered errors.
This patch also fixes another issue where unrecoverable_exception routine
was called in real mode in case of unrecoverable exception (MSR_RI = 0).
This causes another exception vector 0x300 (data access) during system crash
leading to confusion while debugging cause of the system crash.
Also turn ME bit off while going down, so that when another MCE is hit during
panic path, system will checkstop and hypervisor will get restarted cleanly
by SP.
With the above fixes we now throw correct console messages (see below) while
crashing the system in case of unhandled/unrecoverable machine checks.
--------------
Severe Machine check interrupt [[Not recovered]
Initiator: CPU
Error type: UE [Instruction fetch]
Effective address: 0000000030002864
Oops: Machine check, sig: 7 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork]
CPU: 36 PID: 55162 Comm: bash Tainted: G O 3.14.0mce #1
task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000
NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c
REGS: c000000007ec3d80 TRAP: 0200 Tainted: G O (3.14.0mce)
MSR: 9000000000041002 <SF,HV,ME,RI> CR: 28222848 XER: 20000000
CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1
GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864
GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9
GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728
GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000
GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc
GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000
GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250
GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000
NIP [0000000030002864] 0x30002864
LR [00000000300151a4] 0x300151a4
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 7285f0beac1e29d3 ]---
Sending IPI to other CPUs
IPI complete
OPAL V3 detected !
--------------
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As Ben suggested, it's meaningful to dump PE's location code
for site engineers when hitting EEH errors. The patch introduces
function eeh_pe_loc_get() to retireve the location code from
dev-tree so that we can output it when hitting EEH errors.
If primary PE bus is root bus, the PHB's dev-node would be tried
prior to root port's dev-node. Otherwise, the upstream bridge's
dev-node of the primary PE bus will be check for the location code
directly.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
MAX_DMA_CHANNELS is defined in asm/scatterlist.h of the powerpc
architecture. Rename this #define in xgbe.h to avoid the
redefined warning issued during compilation.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a kernel BUG_ON in skb_segment. It is hit when
testing two VMs on openvswitch with one VM acting as VXLAN gateway.
During VXLAN packet GSO, skb_segment is called with skb->data
pointing to inner TCP payload. skb_segment calls skb_network_protocol
to retrieve the inner protocol. skb_network_protocol actually expects
skb->data to point to MAC and it calls pskb_may_pull with ETH_HLEN.
This ends up pulling in ETH_HLEN data from header tail. As a result,
pskb_trim logic is skipped and BUG_ON is hit later.
Move skb_push in front of skb_network_protocol so that skb->data
lines up properly.
kernel BUG at net/core/skbuff.c:2999!
Call Trace:
[<ffffffff816ac412>] tcp_gso_segment+0x122/0x410
[<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
[<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
[<ffffffff816b3658>] skb_udp_tunnel_segment+0xd8/0x390
[<ffffffff816b3c00>] udp4_ufo_fragment+0x120/0x140
[<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
[<ffffffff8109d742>] ? default_wake_function+0x12/0x20
[<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
[<ffffffff8164b4d0>] __skb_gso_segment+0x60/0xc0
[<ffffffff8164b6b3>] dev_hard_start_xmit+0x183/0x550
[<ffffffff8166c91e>] sch_direct_xmit+0xfe/0x1d0
[<ffffffff8164bc94>] __dev_queue_xmit+0x214/0x4f0
[<ffffffff8164bf90>] dev_queue_xmit+0x10/0x20
[<ffffffff81687edb>] ip_finish_output+0x66b/0x890
[<ffffffff81688a58>] ip_output+0x58/0x90
[<ffffffff816c628f>] ? fib_table_lookup+0x29f/0x350
[<ffffffff816881c9>] ip_local_out_sk+0x39/0x50
[<ffffffff816cbfad>] iptunnel_xmit+0x10d/0x130
[<ffffffffa0212200>] vxlan_xmit_skb+0x1d0/0x330 [vxlan]
[<ffffffffa02a3919>] vxlan_tnl_send+0x129/0x1a0 [openvswitch]
[<ffffffffa02a2cd6>] ovs_vport_send+0x26/0xa0 [openvswitch]
[<ffffffffa029931e>] do_output+0x2e/0x50 [openvswitch]
Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug report on https://bugzilla.kernel.org/show_bug.cgi?id=75781
When a local output ipsec packet match the mangle table rule,
and be set mark value, the packet will be route again in
route_me_harder -> _session_decoder6
In this case, the nhoff in CB of skb was still the default
value 0. So the protocal match can't success and the packet can't match
correct SA rule,and then the packet be send out in plaintext.
To fixed up the issue. The CB->nhoff must be set.
Signed-off-by: Hui Zhang <huizhang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_addr_t for DMA address parameters and u32 for shared memory
offset parameters.
Do not assume that dma_addr_t is the same as unsigned long; it will
not be in PAE configurations. Truncate DMA addresses to 32 bits when
printing them. This is OK because the DMA mask for this device is
32-bit (per default).
Also rename the DMA address parameters from 'skb' to 'dma'.
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some tunnels (though only vti as for now) can use i_key just for internal use:
for example vti uses it for fwmark'ing incoming packets. So raw i_key value
shouldn't be treated as a distinguisher for them. ip_tunnel_key_match exists for
cases when we want to compare two ip_tunnel_parms' i_keys.
Example bug:
ip link add type vti ikey 1 local 1.0.0.1 remote 2.0.0.2
ip link add type vti ikey 2 local 1.0.0.1 remote 2.0.0.2
spawned two tunnels, although it doesn't make sense.
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz says:
====================
mlx4 SRIOV fixes
The patch from Wei Yang is a designed fix to a regression introduced by earlier commit
of him. Jack added a fix to the resource management which we got from IBM.
Let's get that into 3.16-rc1 1st and later see to what stable version/s this should go.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Following commit befdf89 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.
This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.
During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.
Fix that by keeping only one driver entry which releases mlx4_priv.
Fixes: befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Hypervisor driver tracks free slots and reserved slots at the global level
and tracks allocated slots and guaranteed slots per VF.
Guaranteed slots are treated as reserved by the driver, so the total
reserved slots is the sum of all guaranteed slots over all the VFs.
As VFs allocate resources, free (global) is decremented and allocated (per VF)
is incremented for those resources. However, reserved (global) is never changed.
This means that effectively, when a VF allocates a resource from its
guaranteed pool, it is actually reducing that resource's free pool (since
the global reserved count was not also reduced).
The fix for this problem is the following: For each resource, as long as a
VF's allocated count is <= its guaranteed number, when allocating for that
VF, the reserved count (global) should be reduced by the allocation as well.
When the global reserved count reaches zero, the remaining global free count
is still accessible as the free pool for that resource.
When the VF frees resources, the reverse happens: the global reserved count
for a resource is incremented only once the VFs allocated number falls below
its guaranteed number.
This fix was developed by Rick Kready <kready@us.ibm.com>
Reported-by: Rick Kready <kready@us.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip tunnel add remote 10.2.2.1 local 10.2.2.2 mode vti ikey 1 okey 2
translates to p->iflags = VTI_ISVTI|GRE_KEY and p->i_key = 1, but GRE_KEY !=
TUNNEL_KEY, so ip_tunnel_ioctl would set i_key to 0 (same story with o_key)
making us unable to create vti tunnels with [io]key via ip tunnel.
We cannot simply translate GRE_KEY to TUNNEL_KEY (as GRE module does) because
vti_tunnels with same local/remote addresses but different ikeys will be treated
as different then. So, imo the best option here is to move p->i_flags & *_KEY
check for vti tunnels from ip_tunnel.c to ip_vti.c and to think about [io]_mark
field for ip_tunnel_parm in the future.
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Logical conjunction always evaluates to false: minor < 2 && minor > 1
I guess what you wanted is rather: minor > 2 || minor < 1
This was partly found using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
A check on a memory allocation is checked incorrectly.
This was partly found using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following compilation warning:
[...]
CC drivers/net/phy/amd-xgbe-phy.o
drivers/net/phy/amd-xgbe-phy.c:1353:30: warning:
‘amd_xgbe_phy_ids’ defined but not used [-Wunused-variable]
static struct mdio_device_id amd_xgbe_phy_ids[] = {
^
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- 'struct nlattr' must be 2 byte aligned
- provide big-endian input data for nlattr/nlattr_nest tests
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macro 'A' used in internal BPF interpreter:
#define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.
This patch is trying to clean up the naming and clarify its usage in the
following way:
- A and X are names of two classic BPF registers
- BPF_REG_A denotes internal BPF register R0 used to map classic register A
in internal BPF programs generated from classic
- BPF_REG_X denotes internal BPF register R7 used to map classic register X
in internal BPF programs generated from classic
- internal BPF instruction format:
struct sock_filter_int {
__u8 code; /* opcode */
__u8 dst_reg:4; /* dest register */
__u8 src_reg:4; /* source register */
__s16 off; /* signed offset */
__s32 imm; /* signed immediate constant */
};
- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
BPF_X - means use register X as source operand
BPF_K - means use 32-bit immediate as source operand
In internal:
BPF_X - means use 'src_reg' register as source operand
BPF_K - means use 32-bit immediate as source operand
Suggested-by: Chema Gonzalez <chema@google.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Chema Gonzalez <chema@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dns_query() credulously assumes that keys are null-terminated and
returns a copy of a memory block that is off by one.
Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables POWER8 doorbell IPIs on powernv.
Since doorbells can only IPI within a core, we test to see when we can use
doorbells and if not we fall back to XICS. This also enables hypervisor
doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit.
Based on tests by Anton, the best case IPI latency between two threads dropped
from 894ns to 512ns.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently when entering fastsleep we clear all LPCR PECE bits.
This patch changes it to only clear the decrementer bit (ie. PECE1), which is
the only bit we really need to clear here. This is needed if we want to set
other wakeup causes like the PECEDH bit so we can use hypervisor doorbells on
powernv. Also we no longer clear the MER bit as it should never be set in the
host anyway.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PowerNV platform, EEH errors are reported by IO accessors or poller
driven by interrupt. After the PE is isolated, we won't produce EEH
event for the PE. The current implementation has possibility of EEH
event lost in this way:
The interrupt handler queues one "special" event, which drives the poller.
EEH thread doesn't pick the special event yet. IO accessors kicks in, the
frozen PE is marked as "isolated" and EEH event is queued to the list.
EEH thread runs because of special event and purge all existing EEH events.
However, we never produce an other EEH event for the frozen PE. Eventually,
the PE is marked as "isolated" and we don't have EEH event to recover it.
The patch fixes the issue to keep EEH events for PEs that have been
marked as "isolated" with the help of additional "force" help to
eeh_remove_event().
Reported-by: Rolf Brudeseth <rolfb@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit b0d278b7d3 ("powerpc/perf_event: Reduce latency of calling
perf_event_do_pending") added a check for CONFIG_PMAC were a check for
CONFIG_PPC_PMAC was clearly intended.
Fixes: b0d278b7d3 ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit cd64d1697c ("powerpc: mtmsrd not defined") added a check for
CONFIG_PPC_CPU were a check for CONFIG_PPC_FPU was clearly intended.
Fixes: cd64d1697c ("powerpc: mtmsrd not defined")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>