If both hot-add and power fault were observed in a single interrupt, we
handled the hot-add first, then the power fault, in this path:
pciehp_ist
if (events & (PDC | DLLSC))
pciehp_handle_presence_or_link_change
case OFF_STATE:
pciehp_enable_slot
__pciehp_enable_slot
board_added
pciehp_power_on_slot
ctrl->power_fault_detected = 0
pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_ON, PCI_EXP_SLTCTL_PCC)
pciehp_green_led_on(p_slot) # power LED on
pciehp_set_attention_status(p_slot, 0) # attention LED off
if ((events & PFD) && !ctrl->power_fault_detected)
ctrl->power_fault_detected = 1
pciehp_set_attention_status(1) # attention LED on
pciehp_green_led_off(slot) # power LED off
This left the attention indicator on (even though the hot-add succeeded)
and the power indicator off (even though the slot power was on).
Fix this by checking for power faults before checking for new devices.
Prior to 0e94916e60, this was successful because everything was chained
through work queues and the order was:
INT_PRESENCE_ON -> INT_POWER_FAULT -> ENABLE_REQ
The ENABLE_REQ cleared the power fault at the end, but now everything is
handled inline with the interrupt thread, such that the work ENABLE_REQ was
doing happens before power fault handling now.
Fixes: 0e94916e60 ("PCI: pciehp: Handle events synchronously")
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
p.port can is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/pci/switch/switchtec.c:912 ioctl_port_to_pff() warn: potential spectre issue 'pcfg->dsp_pff_inst_id' [r]
Fix this by sanitizing p.port before using it to index
pcfg->dsp_pff_inst_id
Notice that given that speculation windows are large, the policy is to kill
the speculation on the first load and not worry if it can be completed with
a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Cc: stable@vger.kernel.org
Currently I am managing the Synopsys drivers & tools team (full-time) and
so I am passing the pcie-designware maintenance to Gustavo.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
CC: Jingoo Han <jingoohan1@gmail.com>
Add myself as maintainer of the IBM RPA hotplug modules in the
drivers/pci/hotplug directory. These modules provide kernel interfaces for
support of Dynamic Logical Partitioning (DLPAR) of Logical and Physical IO
slots, and hotplug of physical PCI slots of a PHB on RPA-compliant ppc64
platforms (pseries).
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since commit 23c85094fe ("proc/kcore: add vmcoreinfo note to /proc/kcore")
the kernel has exported the vmcoreinfo PT_NOTE on /proc/kcore as well
as /proc/vmcore.
arm64 only exposes it's additional arch information via
arch_crash_save_vmcoreinfo() if built with CONFIG_KEXEC, as kdump was
previously the only user of vmcoreinfo.
Move this weak function to a separate file that is built at the same
time as its caller in kernel/crash_core.c. This ensures values like
'kimage_voffset' are always present in the vmcoreinfo PT_NOTE.
CC: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
All other uses of "asm goto" go through asm_volatile_goto, which avoids
a miscompile when using GCC < 4.8.2. Replace our open-coded "asm goto"
statements with the asm_volatile_goto macro to avoid issues with older
toolchains.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This reverts commit 375899cddc.
The visibility of early messages did not longer take into account
"quiet", "debug", and "loglevel" early parameters.
It would be possible to invalidate and recompute LOG_NOCONS flag
for the affected messages. But it would be hairy.
Instead this patch just reverts the problematic commit. We could
come up with a better solution for the original problem. For example,
we could simplify the logic and just mark messages that should always
be visible or always invisible on the console.
Also this patch reverts the related build fix commit ffaa619af1
("printk: Fix warning about unused suppress_message_printing").
Finally, this patch does not put back the unused LOG_NOCONS flag.
Link: http://lkml.kernel.org/r/20180910145747.emvfzv4mzlk5dfqk@pathway.suse.cz
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Maninder Singh <maninder1.s@samsung.com>
Reported-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
since we use PSP to program IH regs now
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix SDMA hang in prt mode, clear XNACK_WATERMARK in reg SDMA0_UTCL1_WATERMK to avoid the issue
Affected ASICs: VEGA10 VEGA12 RV1 RV2
v2: add reg clear for SDMA1
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Tested-by: Yukun Li <yukun1.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Avoid unlocking a lock we never locked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Building drivers/mtd/nand/raw/nandsim.c on arch/hexagon/ produces a
printk format build warning. This is due to hexagon's ffs() being
coded as returning long instead of int.
Fix the printk format warning by changing all of hexagon's ffs() and
fls() functions to return int instead of long. The variables that
they return are already int instead of long. This return type
matches the return type in <asm-generic/bitops/>.
../drivers/mtd/nand/raw/nandsim.c: In function 'init_nandsim':
../drivers/mtd/nand/raw/nandsim.c:760:2: warning: format '%u' expects argument of type 'unsigned int', but argument 2 has type 'long int' [-Wformat]
There are no ffs() or fls() allmodconfig build errors after making this
change.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-hexagon@vger.kernel.org
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Patch-mainline: linux-kernel @ 07/22/2018, 16:03
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
Fix build warning in arch/hexagon/kernel/dma.c by casting a void *
to unsigned long to match the function parameter type.
../arch/hexagon/kernel/dma.c: In function 'arch_dma_alloc':
../arch/hexagon/kernel/dma.c:51:5: warning: passing argument 2 of 'gen_pool_add' makes integer from pointer without a cast [enabled by default]
../include/linux/genalloc.h:112:19: note: expected 'long unsigned int' but argument is of type 'void *'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Patch-mainline: linux-kernel @ 07/20/2018, 20:17
[rkuo@codeaurora.org: fixed architecture name]
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
After switching to the new procfs API, it is supposed to
retrieve the private pointer from PDE_DATA(file_inode(s->file)),
s->private is no longer referred.
Fixes: 1cd6718272 ("netfilter/x_tables: switch to proc_create_seq_private")
Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
NF_REPEAT places the packet at the beginning of the iptables chain
instead of accepting or rejecting it right away. The packet however will
reach the end of the chain and continue to the end of iptables
eventually, so it needs the same handling as NF_ACCEPT and NF_DROP.
Fixes: 368982cd7d ("netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks")
Signed-off-by: Michal 'vorner' Vaner <michal.vaner@avast.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Compiler did not catch incorrect typing in the rcu hook assignment.
% nfct add timeout test-tcp inet tcp established 100 close 10 close_wait 10
% iptables -I OUTPUT -t raw -p tcp -j CT --timeout test-tcp
dmesg - xt_CT: Timeout policy `test-tcp' can only be used by L3 protocol number 25000
The CT target bails out with incorrect layer 3 protocol number.
Fixes: 6c1fd7dc48 ("netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout object")
Reported-by: Harsha Sharma <harshasharmaiitr@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Now that cttimeout support for nft_ct is in place, these should depend
on CONFIG_NF_CONNTRACK_TIMEOUT otherwise we can crash when dumping the
policy if this option is not enabled.
[ 71.600121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[...]
[ 71.600141] CPU: 3 PID: 7612 Comm: nft Not tainted 4.18.0+ #246
[...]
[ 71.600188] Call Trace:
[ 71.600201] ? nft_ct_timeout_obj_dump+0xc6/0xf0 [nft_ct]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Doug Smythies says:
Sometimes it is desirable to temporarily disable, or clear,
the iptables rule set on a computer being controlled via a
secure shell session (SSH). While unwise on an internet facing
computer, I also do it often on non-internet accessible computers
while testing. Recently, this has become problematic, with the
SSH session being dropped upon re-load of the rule set.
The problem is that when all rules are deleted, conntrack hooks get
unregistered.
In case the rules are re-added later, its possible that tcp window
has moved far enough so that all packets are considered invalid (out of
window) until entry expires (which can take forever, default
established timeout is 5 days).
Fix this by clearing maxwin of existing tcp connections on register.
v2: don't touch entries on hook removal.
v3: remove obsolete expiry check.
Reported-by: Doug Smythies <dsmythies@telus.net>
Fixes: 4d3a57f23d ("netfilter: conntrack: do not enable connection tracking unless needed")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Committing a transaction can consume some metadata of it's own, we now
reserve a small amount of metadata to cover this. Free metadata
reported by the kernel will not include this reserve.
If any of the reserve has been used after a commit we enter a new
internal state PM_OUT_OF_METADATA_SPACE. This is reported as
PM_READ_ONLY, so no userland changes are needed. If the metadata
device is resized the pool will move back to PM_WRITE.
These changes mean we never need to abort and rollback a transaction due
to running out of metadata space. This is particularly important
because there have been a handful of reports of data corruption against
DM thin-provisioning that can all be attributed to the thin-pool having
ran out of metadata space.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This reverts commit a81cf9799a.
The patch causes a regression, which I cannot find the reason for.
So let's revert for now, as a revert hurts only performance.
Original report:
I was trying to resolve the problem with Oliver but we don't get any conclusion
for 5 months, so I am now sending this to mail list and cdc_acm authors.
I am using simple request-response protocol to obtain the boiller parameters
in constant intervals.
A simple one transaction is:
1. opening the /dev/ttyACM0
2. sending the following 10-bytes request to the device:
unsigned char req[] = {0x02, 0xfe, 0x01, 0x05, 0x08, 0x02, 0x01, 0x69, 0xab, 0x03};
3. reading response (frame of 74 bytes length).
4. closing the descriptor
I am doing this transaction with 5 seconds intervals.
Before the bad commit everything was working correctly: I've got a requests and
a responses in a timely manner.
After the bad commit more time I am using the kernel module, more problems I have.
The graph [2] is showing the problem.
As you can see after module load all seems fine but after about 30 minutes I've got
a plenty of EAGAINs when doing read()'s and trying to read back the data.
When I rmmod and insmod the cdc_acm module again, then the situation is starting
over again: running ok shortly after load, and more time it is running, more EAGAINs
I have when calling read().
As a bonus I can see the problem on the device itself:
The device is configured as you can see here on this screen [3].
It has two transmision LEDs: TX and RX. Blink duration is set for 100ms.
This is a recording before the bad commit when all is working fine: [4]
And this is with the bad commit: [5]
As you can see the TX led is blinking wrongly long (indicating transmission?)
and I have problems doing read() calls (EAGAIN).
Reported-by: Mariusz Bialonczyk <manio@skyboo.net>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Fixes: a81cf9799a ("cdc-acm: implement put_char() and flush_chars()")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since renesas_usb3 udc driver calls usb_of_get_companion_dev()
which is on usb/core/of.c, build error like below happens if we
disable CONFIG_USB because the usb/core/ needs CONFIG_USB:
ERROR: "usb_of_get_companion_dev" [drivers/usb/gadget/udc/renesas_usb3.ko] undefined!
According to the usb/gadget/Kconfig, "NOTE: Gadget support
** DOES NOT ** depend on host-side CONFIG_USB !!".
So, to fix the issue, this patch changes the usb_of_get_companion_dev()
place from usb/core/of.c to usb/common/common.c to be called by both
host and gadget.
Reported-by: John Garry <john.garry@huawei.com>
Fixes: 39facfa01c ("usb: gadget: udc: renesas_usb3: Add register of usb role switch")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MTK xHCI controller use some reserved bytes in endpoint context for
bandwidth scheduling, so need keep them in xhci_endpoint_copy();
The issue is introduced by:
commit f5249461b5 ("xhci: Clear the host side toggle manually when
endpoint is soft reset")
It resets endpoints and will drop bandwidth scheduling parameters used
by interrupt or isochronous endpoints on MTK xHCI controller.
Fixes: f5249461b5 ("xhci: Clear the host side toggle manually when
endpoint is soft reset")
Cc: stable@vger.kernel.org
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Tested-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Quectel EP06 (and EM06/EG06) supports dynamic configuration of USB
interfaces, without the device changing VID/PID or configuration number.
When the configuration is updated and interfaces are added/removed, the
interface numbers change. This means that the current code for matching
EP06 does not work.
This patch removes the current EP06 interface number match, and replaces
it with a match on class, subclass and protocol. Unfortunately, matching
on those three alone is not enough, as the diag interface exports the
same values as QMI. The other serial interfaces + adb export different
values and do not match.
The diag interface only has two endpoints, while the QMI interface has
three. I have therefore added a check for number of interfaces, and we
ignore the interface if the number of endpoints equals two.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates license to use SPDX-License-Identifier
instead of verbose license text.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The gasket in-kernel framework, recently introduced under staging,
re-implements what is already long-time provided by the UIO
subsystem, with extra PCI BAR remapping and MSI conveniences.
Before moving it out of staging, make sure we add the new bits to
the UIO framework instead, then transform its signle client, the
Apex driver, to a proper UIO driver (uio_driver.h).
Link: https://lkml.kernel.org/r/20180828103817.GB1397@do-kernel
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 550ddadcc7 ("tty: hvc: hvc_write() may sleep") broke the
termination condition in case the driver stops accepting characters.
This can result in unnecessary polling of the busy driver.
Restore it by testing the hvc_push return code.
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit ec97eaad13 ("tty: hvc: hvc_poll() break hv read loop") causes
the virtio console to hang at times (e.g., if you paste a bunch of
characters to it.
The reason is that get_chars must return 0 before we can be sure the
driver will kick or poll input again, but this change only scheduled a
poll if get_chars had returned a full count. Change this to poll on
any > 0 count.
Reported-by: Matteo Croce <mcroce@redhat.com>
Reported-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix a few issues in Documentation/x86/earlyprintk.txt:
- correct typos, punctuation, missing word, wrong word
- change product name from Netchip to NetChip
- expand where to add "earlyprintk=dbg"
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Link: http://lkml.kernel.org/r/d0c40ac3-7659-6374-dbda-23d3d2577f30@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Perf can record user stack data in response to a synchronous request, such
as a tracepoint firing. If this happens under set_fs(KERNEL_DS), then we
end up reading user stack data using __copy_from_user_inatomic() under
set_fs(KERNEL_DS). I think this conflicts with the intention of using
set_fs(KERNEL_DS). And it is explicitly forbidden by hardware on ARM64
when both CONFIG_ARM64_UAO and CONFIG_ARM64_PAN are used.
So fix this by forcing USER_DS when recording user stack data.
Signed-off-by: Yabin Cui <yabinc@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 88b0193d94 ("perf/callchain: Force USER_DS when invoking perf_callchain_user()")
Link: http://lkml.kernel.org/r/20180823225935.27035-1-yabinc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Trivial fix to spelling mistake in pr_err() error message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180824112235.8842-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit:
c3bc8fd637 ("tracing: Centralize preemptirq tracepoints and unify their usage")
added the inclusion of <trace/events/preemptirq.h>.
liblockdep doesn't have a stub version of that header so now fails to build.
However, commit:
bff1b208a5 ("tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"")
removed the use of functions declared in that header. So delete the #include.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <alexander.levin@verizon.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: bff1b208a5 ("tracing: Partial revert of "tracing: Centralize ...")
Fixes: c3bc8fd637 ("tracing: Centralize preemptirq tracepoints ...")
Link: http://lkml.kernel.org/r/20180828203315.GD18030@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ovl_free_fs() dereferences ofs->workbasedir and ofs->upper_mnt in cases when
those might not have been initialized yet.
Fix the initialization order for these fields.
Reported-by: syzbot+c75f181dc8429d2eb887@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.15
Fixes: 95e6d4177c ("ovl: grab reference to workbasedir early")
Fixes: a9075cdb46 ("ovl: factor out ovl_free_fs() helper")
Fix kernel-doc warning for missing 'flags' parameter description:
../kernel/sched/fair.c:3371: warning: Function parameter or member 'flags' not described in 'attach_entity_load_avg'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: ea14b57e8a ("sched/cpufreq: Provide migration hint")
Link: http://lkml.kernel.org/r/cdda0d42-880d-4229-a9f7-5899c977a063@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It can happen that load_balance() finds a busiest group and then a
busiest rq but the calculated imbalance is in fact 0.
In such situation, detach_tasks() returns immediately and lets the
flag LBF_ALL_PINNED set. The busiest CPU is then wrongly assumed to
have pinned tasks and removed from the load balance mask. then, we
redo a load balance without the busiest CPU. This creates wrong load
balance situation and generates wrong task migration.
If the calculated imbalance is 0, it's useless to try to find a
busiest rq as no task will be migrated and we can return immediately.
This situation can happen with heterogeneous system or smp system when
RT tasks are decreasing the capacity of some CPUs.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: jhugo@codeaurora.org
Link: http://lkml.kernel.org/r/1536306664-29827-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since commit:
523e979d31 ("sched/core: Use PELT for scale_rt_capacity()")
scale_rt_capacity() returns the remaining capacity and not a scale factor
to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
scale_rt_capacity() so we must take the sched_domain argument.
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 523e979d31 ("sched/core: Use PELT for scale_rt_capacity()")
Link: http://lkml.kernel.org/r/20180904093626.GA23936@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a task which previously ran on a given CPU is remotely queued to
wake up on that same CPU, there is a period where the task's state is
TASK_WAKING and its vruntime is not normalized. This is not accounted
for in vruntime_normalized() which will cause an error in the task's
vruntime if it is switched from the fair class during this time.
For example if it is boosted to RT priority via rt_mutex_setprio(),
rq->min_vruntime will not be subtracted from the task's vruntime but
it will be added again when the task returns to the fair class. The
task's vruntime will have been erroneously doubled and the effective
priority of the task will be reduced.
Note this will also lead to inflation of all vruntimes since the doubled
vruntime value will become the rq's min_vruntime when other tasks leave
the rq. This leads to repeated doubling of the vruntime and priority
penalty.
Fix this by recognizing a WAKING task's vruntime as normalized only if
sched_remote_wakeup is true. This indicates a migration, in which case
the vruntime would have been normalized in migrate_task_rq_fair().
Based on a similar patch from John Dias <joaodias@google.com>.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Steve Muckle <smuckle@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Redpath <Chris.Redpath@arm.com>
Cc: John Dias <joaodias@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel de Dios <migueldedios@google.com>
Cc: Morten Rasmussen <Morten.Rasmussen@arm.com>
Cc: Patrick Bellasi <Patrick.Bellasi@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: kernel-team@android.com
Fixes: b5179ac70d ("sched/fair: Prepare to fix fairness problems on migration")
Link: http://lkml.kernel.org/r/20180831224217.169476-1-smuckle@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
update_blocked_averages() is called to periodiccally decay the stalled load
of idle CPUs and to sync all loads before running load balance.
When cfs rq is idle, it trigs a load balance during pick_next_task_fair()
in order to potentially pull tasks and to use this newly idle CPU. This
load balance happens whereas prev task from another class has not been put
and its utilization updated yet. This may lead to wrongly account running
time as idle time for RT or DL classes.
Test that no RT or DL task is running when updating their utilization in
update_blocked_averages().
We still update RT and DL utilization instead of simply skipping them to
make sure that all metrics are synced when used during load balance.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 371bf42732 ("sched/rt: Add rt_rq utilization tracking")
Fixes: 3727e0e163 ("sched/dl: Add dl_rq utilization tracking")
Link: http://lkml.kernel.org/r/1535728975-22799-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the following commit:
051f3ca02e ("sched/topology: Introduce NUMA identity node sched domain")
the scheduler introduced a new NUMA level. However this leads to the NUMA topology
on 2 node systems to not be marked as NUMA_DIRECT anymore.
After this commit, it gets reported as NUMA_BACKPLANE, because
sched_domains_numa_level is now 2 on 2 node systems.
Fix this by allowing setting systems that have up to 2 NUMA levels as
NUMA_DIRECT.
While here remove code that assumes that level can be 0.
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andre Wild <wild@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Fixes: 051f3ca02e "Introduce NUMA identity node sched domain"
Link: http://lkml.kernel.org/r/1533920419-17410-1-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch follows commit 1751e8a6cb ("Rename superblock
flags (MS_xyz -> SB_xyz)") and after commit ("vfs: Suppress
MS_* flag defs within the kernel unless explicitly enabled"),
there is no MS_RDONLY and MS_NOATIME at all.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>