Fix return from sprd_handle_irq() with spin_lock held.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ae9200f2c ("enlarge console.name") increased the storage
for the console name to 16 bytes, but not the corresponding
struct console_cmdline::name storage. Console names longer than
8 bytes cause read beyond end-of-string and failure to match
console; I'm not sure if there are other unexpected consequences.
Cc: <stable@vger.kernel.org> # 2.6.22+
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This problem was taken care of three times already in
* b0de59b573 (TTY: do not update
atime/mtime on read/write),
* 37b7f3c765 (TTY: fix atime/mtime
regression), and
* b0b885657b (tty: fix up atime/mtime
mess, take three)
But it still misses one point. As John Paul correctly points out, we
do not care about setting date. If somebody ever changes wall
time backwards (by mistake for example), tty timestamps are never
updated until the original wall time passes.
So check the absolute difference of times and if it large than "8
seconds or so", always update the time. That means we will update
immediatelly when changing time. Ergo, CAP_SYS_TIME can foul the
check, but it was always that way.
Thanks John for serving me this so nicely debugged.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: John Paul Perry <john_paul.perry@alcatel-lucent.com>
Cc: <stable@vger.kernel.org> # all, as b0b885657 was backported
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixed behaviour of get_mctrl() serial driver function as documented in:
https://www.kernel.org/doc/Documentation/serial/driver
Added device-tree properties 'dcd-override', 'dsr-override',
'cts-override', and 'ri-override' specific to the Synopsis 8250
DesignWare UART driver. Allows one to force Data Carrier Detect,
Clear To Send, and Data Set Ready signals to permanently be reported as
active. The Ring indicator can be forced to be reported as inactive.
It is possible that if modem control signalling is enabled on a port
that doesn't have these pins (e.g. - a simple two wire Tx/Rx port), the
driver can hang indefinitely waiting for the state to change. The new
DT properties allow the driver to ignore the state of these pins on
serial ports that don't support them, as recommended in the kernel
documentation.
Reviewed-by: JD (Jiandong) Zheng <jdzheng@broadcom.com>
Signed-off-by: Jonathan Richardson <jonathar@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These quirk entries have the same effect as default
quirk entry, so we can just delete them.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8b5c913f7e
("serial: 8250_pci: Add WCH CH352 quirk to avoid Xscale detection")
trigger one redundant entry report message.
This patch fix it.
Reported-by: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I'm still receiving reports to my email address, so let's point this
at the linux-serial mailing list instead.
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 0aa525d118.
The conditional RX-FIFO read seems to cause spurious interrupts and we
see just:
|serial8250: too much work for irq29
The previous behaviour was "default" for decades and Marvell's 88f6282 SoC
might not be the only that relies on it. Therefore the Omap fix is
reverted for now.
Fixes: 0aa525d118 ("tty: serial: 8250_core: read only RX if there is
something in the FIFO")
Reported-By: Nicolas Schichan <nschichan@freebox.fr>
Debuged-By: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 6d01bb9dc8.
The exact same code was added in commit 3239fd31d4 (serial: of-serial: fetch
line number from DT) a few lined above. Doing this once should be enough.
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull drm fixes from Dave Airlie:
"Radeon, imx, msm, and i915 fixes.
The msm, imx and i915 ones are fairly run of the mill.
Radeon had some DP audio and posting reads for irq fixes, along with a
fix for 32-bit kernels with new cards, we were using unsigned long to
represent GPU side memory space, but since that changed size on 32 vs
64 cards with lots of VRAM failed, so the change has no effect on
x86-64, just moves to using uint64_t instead"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (35 commits)
drm/msm: kexec fixes
drm/msm/mdp5: fix cursor blending
drm/msm/mdp5: fix cursor ROI
drm/msm/atomic: Don't leak atomic commit object when commit fails
drm/msm/mdp5: Avoid flushing registers when CRTC is disabled
drm/msm: update generated headers (add 6th lm.base entry)
drm/msm/mdp5: fixup "drm/msm: fix fallout of atomic dpms changes"
drm/ttm: device address space != CPU address space
drm/mm: Support 4 GiB and larger ranges
drm/i915: gen4: work around hang during hibernation
drm/i915: Check for driver readyness before handling an underrun interrupt
drm/radeon: fix interlaced modes on DCE8
drm/radeon: fix DRM_IOCTL_RADEON_CS oops
drm/radeon: do a posting read in cik_set_irq
drm/radeon: do a posting read in si_set_irq
drm/radeon: do a posting read in evergreen_set_irq
drm/radeon: do a posting read in r600_set_irq
drm/radeon: do a posting read in rs600_set_irq
drm/radeon: do a posting read in r100_set_irq
radeon/audio: fix DP audio on DCE6
...
A clock specifier is required for i.MX I2C and is
provided in all DTS implementations. Add this to the
list of required properties in the binding.
Signed-off-by: Matt Porter <mporter@konsulko.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
This patch marks baytrail_i2c_acquire() that it might sleep. Also it chages
while-loop to do-while and, though it is matter of taste, gives a chance to
check one more time before report a timeout.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
It seems the idea behind the cross-check is to prevent acquire semaphore when
there is no release callback and vice versa. Thus, patch fixes a typo.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
There is no need to export functions that are used as the callbacks in the
struct dw_i2c_dev. Otherwise we get the following warnings:
drivers/i2c/busses/i2c-designware-baytrail.c:63:5: warning: symbol 'baytrail_i2c_acquire' was not declared. Should it be static?
drivers/i2c/busses/i2c-designware-baytrail.c:114:6: warning: symbol 'baytrail_i2c_release' was not declared. Should it be static?
While here, do few indentation fixes, remove i2c_dw_eval_lock_support() from
functions exported to the modules and redundant assignment of local sem
variable.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
It seems we have same message for different return values in get_sem() and
baytrail_i2c_acquire(). I suspect this is just a typo, so this patch fixes it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The patch converts hardcoded numerical constants to a named ones.
While here, align the variable name in get_sem() and reset_semaphore().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Pull btrfs fixes from Chris Mason:
"Outside of misc fixes, Filipe has a few fsync corners and we're
pulling in one more of Josef's fixes from production use here"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref.
Btrfs: fix data loss in the fast fsync path
Btrfs: remove extra run_delayed_refs in update_cowonly_root
Btrfs: incremental send, don't rename a directory too soon
btrfs: fix lost return value due to variable shadowing
Btrfs: do not ignore errors from btrfs_lookup_xattr in do_setxattr
Btrfs: fix off-by-one logic error in btrfs_realloc_node
Btrfs: add missing inode update when punching hole
Btrfs: abort the transaction if we fail to update the free space cache inode
Btrfs: fix fsync race leading to ordered extent memory leaks
Pull livepatching fix from Jiri Kosina:
"Fix an RCU unlock misplacement in live patching infrastructure, from
Peter Zijlstra"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: fix RCU usage in klp_find_external_symbol()
Pull thermal management fixes from Eduardo Valentin:
"Specifics:
- adding Lukasz as maintainer of samsung thermal driver.
- driver fixes: exynos and int430x.
- one fix in the exynos cpufreq driver related to cpu cooling (acked
by cpufreq maintainer).
- fix default sysfs attributes of cooling devices
Note: I am sending this pull on Rui's behalf while he fixes issues in his Linux box"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: Make sysfs attributes of cooling devices default attributes
Thermal/int340x: Fix memleak for aux trip
MAINTAINERS: Add entry for SAMSUNG THERMAL DRIVER
cpufreq: exynos: Use simple approach to asses if cpu cooling can be used
thermal: exynos: Fix wrong control of power down detection mode for Exynos7
two fixes, both cc'd stable.
* tag 'drm-intel-fixes-2015-03-05' of git://anongit.freedesktop.org/drm-intel:
drm/i915: gen4: work around hang during hibernation
drm/i915: Check for driver readyness before handling an underrun interrupt
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJU9enEAAoJEHm+PkMAQRiG/ewIAJ4MW4tcAhaVj6ndCF3+uL/b
RaVm1apUjsTloe5Fl0TT9J5CO3zdOetmMNToy2sf0W4MJDIyHf21o83l7eniV/6q
al/c3fQ6HVtNjiSUNghTtzVlL+gUD1F60b9BGYi1V5h2Mp8u0NG1alTGLQfCB8sE
ArB+v2aWEdSPn7mZDA0Yuc1In+8bkpht3oy+OLD/8JNkqqLnml9YOyPjM1cuRpBr
NxKCLcPzSHH9/nR3T6XtkxXYV5xD3+CDm9roJhfHukoFmfT/G3C65Zcp2KEed/Cw
QQpu+ox7fpUs10F/Fbfm8AE+tRB4o2sGh97sprXrO5oaFdx6FPIBo4WN8i/Vy68=
=qpY+
-----END PGP SIGNATURE-----
Merge tag 'v4.0-rc2' into drm-fixes
Linux 4.0-rc2
Merging this manually as the i915 change is in it,
and intel fixes are on top of this
- tpm_dev_add_device(): cdev_add() must be done before uevent is
propagated in order to avoid races.
- tpm_chip_register(): tpm_dev_add_device() must be done as the
last step before exposing device to the user space in order to
avoid races.
In addition clarified description in tpm_chip_register().
Fixes: 313d21eeab ("tpm: device class for tpm")
Fixes: afb5abc262 ("tpm: two-phase chip management functions")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Problem: When IMA and VTPM are both enabled in kernel config,
kernel hangs during bootup on LE OS.
Why?: IMA calls tpm_pcr_read() which results in tpm_ibmvtpm_send
and tpm_ibmtpm_recv getting called. A trace showed that
tpm_ibmtpm_recv was hanging.
Resolution: tpm_ibmtpm_recv was hanging because tpm_ibmvtpm_send
was sending CRQ message that probably did not make much sense
to phype because of Endianness. The fix below sends correctly
converted CRQ for LE. This was not caught before because it
seems IMA is not enabled by default in kernel config and
IMA exercises this particular code path in vtpm.
Tested with IMA and VTPM enabled in kernel config and VTPM
enabled on both a BE OS and a LE OS ppc64 lpar. This exercised
CRQ and TPM command code paths in vtpm.
Patch is against Peter's tpmdd tree on github which included
Vicky's previous vtpm le patches.
Signed-off-by: Joy Latten <jmlatten@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # eb71f8a5e3: "Added Little Endian support to vtpm module"
Cc: <stable@vger.kernel.org>
Reviewed-by: Ashley Lai <ashley@ahsleylai.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Alexander Duyck says:
====================
The rest of the FIB patches (add key_vector to fib_table)
This patch series is the rest of what I had originally planned for this kernel
release. It adds a structure called key_vector which is embedded within
every tnode, leaf, and the trie root itself. By doing this we can navigate
from any point within the trie to any other point fairly quickly and
avoiding NULL pointer checks in the case of a backtrace. As a result we
can pipeline things a bit further since we don't have to worry about
dereferencing NULL in a backtrace. This can amount to significant savings
on a long backtrace.
I decided to drop the up-level code as that conflicts with combining the
main and local tries. I have one patch as an RFC that currently combines
the tries however it still needs some work as we have to split the local
and main tries in the event of custom rules being defined. As such we are
probably going to be doing some more hacking on fib_table_flush_external as
that will also need to flush the local entries from the main trie and place
them back in the local trie.
v2: Rebased on the switchdev FIB offload work
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This change makes it so that the root of the trie contains a key_vector, by
doing this we make room to essentially collapse the entire trie by at least
one cache line as we can store the information about the tnode or leaf that
is pointed to in the root.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change pulls the parent pointer from the key_vector and places it in
the tnode structure.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This pulls the information about the child array out of the key_vector and
places it in the tnode since that is where it is needed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RCU is only needed once for the entire node, not once per key_vector so we
can pull that out and move it to the tnode structure.
In addition add accessors to be used inside the RCU functions so that we
can more easily get from the key vector to either the tnode or the trie
pointers.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change pulls the fields not explicitly needed in the key_vector and
placed them in the new tnode structure. By doing this we will eventually
be able to reduce the key_vector down to 16 bytes on 64 bit systems, and
12 bytes on 32 bit systems.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are now checking the length of a key_vector instead of a tnode so it
makes sense to probably just rename this to child_length since it would
probably even be applicable to a leaf.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I am replacing the tnode_get_child call with get_child since we are
techically pulling the child out of a key_vector now and not a tnode.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the tnode to key_vector. The key_vector will be the eventual
container for all of the information needed by either a leaf or a tnode.
The final result should be much smaller than the 40 bytes currently needed
for either one.
This also updates the trie struct so that it contains an array of size 1 of
tnode pointers. This is to bring the structure more inline with how an
actual tnode itself is configured.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resize related functions now all return a pointer to the pointer that
references the object that was resized.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change just does a couple of minor cleanups on
fib_table_flush_external. Specifically it addresses the fact that resize
was being called even though nothing was being removed from the table, and
it drops an unecessary indent since we could just call continue on the
inverse of the fi && flag check.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ZynqMP soc has single interrupt for all the queue events. So,
passing the IRQF_SHARED flag for interrupt registration call.
Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include multi queue support for the ethernet IP version in xilinx ZynqMP
SoC.
Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
brcmfmac:
* sdio improvements
* add a debugfs file so users can provide us all the revinfo we could
ask for
iwlwifi:
* add triggers for firmware dump collection
* remove support for -9.ucode
* new statitics API
* rate control improvements
ath9k:
* add per-vif TX power capability
* BT coexistance fixes
ath10k:
* qca6174: enable STA transmit beamforming (TxBF) support
* disable multi-vif power save by default
bcma:
* enable support for PCIe Gen 2 host devices
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJU+dliAAoJEG4XJFUm622bsqQH/RO1Gxuw6hmiHPeeIcoDmlvt
MZKvy6xcAiFqREfGwDxjVminlTZ7/MB9bABeaoQKzpQFpCJW/ftjIqwfbRqZWsvG
3IC0s2nPTwWU8YSsZTbifnyXCVNQDJuE+5nQ3hMO2rE/dZDi1zt1fS2hiSXtlASS
kgBJcfXgoVxvhZ1WI+uVpbU0RtwXmI7tVylREE1sbgCrg7AuJx4Q2QmZ1GioPRLy
20HnFVFcIcbHk4eXVwAJOspdjctujoR858pg/oxlcVXWb7MOOCV/Fk8WMursZxFh
qj/I/kbDcFYh3H5uC+6qL/kRByY80/yckLDiMbghA0QR5/PSx2nvp/UfkqIf008=
=qgVl
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2015-03-06' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Major changes:
brcmfmac:
* sdio improvements
* add a debugfs file so users can provide us all the revinfo we could
ask for
iwlwifi:
* add triggers for firmware dump collection
* remove support for -9.ucode
* new statitics API
* rate control improvements
ath9k:
* add per-vif TX power capability
* BT coexistance fixes
ath10k:
* qca6174: enable STA transmit beamforming (TxBF) support
* disable multi-vif power save by default
bcma:
* enable support for PCIe Gen 2 host devices
Signed-off-by: David S. Miller <davem@davemloft.net>
When building without CONFIG_NET_SWITCHDEV,
netdev_switch_fib_ipv4_abort is defined in the header file. It must
be static inline to avoid build failure at link time.
Fixes: 8e05fd7166 ("fib: hook IPv4 fib for hardware offload")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the nla length is less than 2 then the nla data could be accessed
beyond the accessible bounds. So ensure that the nla is big enough to
at least read the via_family before doing so. Replace magic value of
2.
Fixes: 03c0566542 ("mpls: Basic support for adding and removing routes")
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petri Gynther says:
====================
net: bcmgenet: preparation for multiple Rx queues
Three small patches in preparation for supporting multiple Rx queues:
1. set hw_params->rx_queues = 0
2. adjust the call to alloc_etherdev_mqs()
3. add GENET_Q16_RX_BD_CNT and hw_params->rx_bds_per_q
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for supporting multiple Rx queues, add GENET_Q16_RX_BD_CNT
and hw_params->rx_bds_per_q.
Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for supporting multiple Rx queues, adjust the call to
alloc_etherdev_mqs() to allow max GENET_MAX_MQ_CNT + 1 Rx queues.
The actual number of Rx queues in use is correctly adjusted with:
netif_set_real_num_rx_queues(priv->dev, priv->hw_params->rx_queues + 1);
Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcmgenet driver doesn't yet support multiple Rx queues.
Set hw_params->rx_queues = 0 accordingly.
The default Rx queue (Q16) is still created and operational.
Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-03-06
This series contains updates to e1000, e1000e and igb.
Yanir provides updates to e1000e based on the patches provided by John
Linville. First updates the code comment to better describe the changes
and the impact on the driver. Second removed calls to ioremap/unmap for
i219 since this is only relevant to older hardware only. Starting with
i219, the NVM will not be mapped to its one BAR but to a address region
in another bar.
Alex Duyck provides two fixes for igb, first fixes a compile warning
where a variable may be used uninitialized, so Alex initializes it.
Second fixes an issue where all of the pin register values were having
to be pushed onto the stack each time the function was called, so to
avoid this, Alex made them static const so that they should only need
to be allocated once and we can avoid all the instructions to get them
onto the stack.
Eliezer found an issue in e1000 where we needed to be calling
netif_carrier_off earlier in the down() to prevent the stack from
queuing more packets to the interface.
Sabrina Dubroca resolved a potential race condition by adding a
dummy allocator. There was a race condition between e1000_change_mtu()
cleanups and netpoll, when changing the MTU across jumbo sizes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fan Du says:
====================
Improvements for TCP PMTU
This patchset performs some improvements and enhancement
for current TCP PMTU as per RFC4821 with the aim to find
optimal mms size quickly, and also be adaptive to route
changes like enlarged path MTU. Then TCP PMTU could be
used to probe a effective pmtu in absence of ICMP message
for tunnels(e.g. vxlan) across different networking stack.
Patch1/4: Set probe mss base to 1024 Bytes per RFC4821
Patch2/4: Do not double probe_size for each probing,
use a simple binary search to gain maximum performance.
mss for next probing.
Patch3/4: Create a probe timer to detect enlarged path MTU.
Patch4/4: Update ip-sysctl.txt for new sysctl knobs.
Changelog:
v5:
- Zero probe_size before resetting search range.
- Update ip-sysctl.txt for new sysctl knobs.
v4:
- Convert probe_size to mss, not directly from search_low/high
- Clamp probe_threshold
- Don't adjust search_high in blackhole probe, so drop orignal patch3
v3:
- Update commit message for patch2
- Fix pseudo timer delta calculation in patch4
v2:
- Introduce sysctl_tcp_probe_threshold to control when
probing will stop, as suggested by John Heffner.
- Add patch3 to shrink current mss value for search low boundary.
- Drop cannonical timer usages, implements pseudo timer based on
32bits jiffies tcp_time_stamp, as suggested by Eric Dumazet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Namely tcp_probe_interval to control how often to restart
a probe. And tcp_probe_threshold to control when stop the
probing in respect to the width of search range in bytes
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per RFC4821 7.3. Selecting Probe Size, a probe timer should
be armed once probing has converged. Once this timer expired,
probing again to take advantage of any path PMTU change. The
recommended probing interval is 10 minutes per RFC1981. Probing
interval could be sysctled by sysctl_tcp_probe_interval.
Eric Dumazet suggested to implement pseudo timer based on 32bits
jiffies tcp_time_stamp instead of using classic timer for such
rare event.
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>