linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-03 05:26:42 +07:00

Author	SHA1	Message	Date
Richard Weinberger	417e02bf42	netfilter: xt_LOG: fix bogus extra layer-4 logging information In 16059b5 netfilter: merge ipt_LOG and ip6_LOG into xt_LOG, we have merged ipt_LOG and ip6t_LOG. However: IN=wlan0 OUT= MAC=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx SRC=213.150.61.61 DST=192.168.1.133 LEN=40 TOS=0x00 PREC=0x00 TTL=117 ID=10539 DF PROTO=TCP SPT=80 DPT=49013 WINDOW=0 RES=0x00 ACK RST URGP=0 PROTO=UDPLITE SPT=80 DPT=49013 LEN=45843 PROTO=ICMP TYPE=0 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Several missing break in the code led to including bogus layer-4 information. This patch fixes this problem. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:59 +01:00
WANG Cong	5f1f815103	netfilter: remove ipt_SAME.h and ipt_realm.h These two headers are not required anymore, they have been replaced by xt_SAME.h and xt_realm.h. Florian Westphal pointed out this. Cc: "David S. Miller" <davem@davemloft.net> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: WANG Cong <xiyou.wangcong@gmail. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:56 +01:00
Tony Zelenoff	58020f7761	netfilter: nf_ct_ecache: refactor nf_ct_deliver_cached_events * identation lowered * some CPU cycles saved at delayed item variable initialization Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:53 +01:00
Tony Zelenoff	93326ae312	netfilter: nf_ct_ecache: trailing whitespace removed Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:51 +01:00
Richard Weinberger	6939c33a75	netfilter: merge ipt_LOG and ip6_LOG into xt_LOG ipt_LOG and ip6_LOG have a lot of common code, merge them to reduce duplicate code. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:49 +01:00
Pablo Neira Ayuso	544d5c7d9f	netfilter: ctnetlink: allow to set expectfn for expectations This patch allows you to set expectfn which is specifically used by the NAT side of most of the existing conntrack helpers. I have added a symbol map that uses a string as key to look up for the function that is attached to the expectation object. This is the best solution I came out with to solve this issue. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:46 +01:00
Pablo Neira Ayuso	076a0ca026	netfilter: ctnetlink: add NAT support for expectations This patch adds the missing bits to create expectations that are created in NAT setups.	2012-03-07 17:40:44 +01:00
Pablo Neira Ayuso	b8c5e52c13	netfilter: ctnetlink: allow to set expectation class This patch allows you to set the expectation class. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:42 +01:00
Pablo Neira Ayuso	660fdb2a0f	netfilter: ctnetlink: allow to set helper for new expectations This patch allow you to set the helper for newly created expectations based of the CTA_EXPECT_HELP_NAME attribute. Before this, the helper set was NULL. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:40 +01:00
Jozsef Kadlecsik	7f81c951d9	netfilter: ipset: hash:net,iface timeout bug fixed Timed out entries were still matched till the garbage collector purged them out. The fix is verified in the testsuite. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:37 +01:00
Jozsef Kadlecsik	2a7cef2a4b	netfilter: ipset: Exceptions support added to hash:net types The "nomatch" keyword and option is added to the hash:net types, by which one can add exception entries to sets. Example: ipset create test hash:net ipset add test 192.168.0/24 ipset add test 192.168.0/30 nomatch In this case the IP addresses from 192.168.0/24 except 192.168.0/30 match the elements of the set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:35 +01:00
Jozsef Kadlecsik	0927a1ac63	netfilter: ipset: Log warning when a hash type of set gets full If the set is full, the SET target cannot add more elements. Log warning so that the admin got notified about it. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:33 +01:00
Jan Engelhardt	ae8ded1cb8	netfilter: ipset: expose userspace-relevant parts in ip_set.h iptables's libxt_SET.c depends on these. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:31 +01:00
Jan Engelhardt	c15f1c8325	netfilter: ipset: use NFPROTO_ constants ipset is actually using NFPROTO values rather than AF (xt_set passes that along). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-03-07 17:40:29 +01:00
Duc Dang	ae5d33723e	powerpc/44x: Add more changes for APM821XX EMAC driver This patch includes: Configure EMAC PHY clock source (clock from PHY or internal clock). Do not advertise PHY half duplex capability as APM821XX EMAC does not support half duplex mode. Add changes to support configuring jumbo frame for APM821XX EMAC. [ Fix coding style -DaveM ] Signed-off-by: Duc Dang <dhdang@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 17:07:42 -05:00
Duc Dang	8dfc2b45ff	powerpc/44x: Add new compatible value for EMAC node of APM821XX dts file. This compatible value will be used to distinguish some special features of APM821XX EMAC: no half duplex mode support, configuring jumbo frame. Signed-off-by: Duc Dang <dhdang@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 17:06:07 -05:00
David S. Miller	95f050bf7f	net: Use bool for return value of dev_valid_name(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 16:12:15 -05:00
Yevgeny Petrilin	66431a7d45	net/mlx4: defining functions as static Fixing sparse warnings, the 2 functions are only used in same file. Defining them as static and not exporting them. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:18 -05:00
Yevgeny Petrilin	be6736ba1f	net/mlx4: remove unused functions Removing functions that are no longer in use, but still exist Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:18 -05:00
Yevgeny Petrilin	9a9a232a92	net/mlx4: fixing sparse warnings for not declared, functions The SET_PORT functions are implemented in port.c, which is part of mlx4_core, these functions are exported. The functions are in use by the mlx4_en module (were originally part of mlx4_en). Their declaration remained in mlx4_en module, moving the declaration to the right location. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:18 -05:00
Yevgeny Petrilin	2ab573c586	net/mlx4: fixing sparse warnings when copying mac, address to gid entry The mac should be written as __be64 the gid. The warning was because we changed the mac parameter, which is u64, by writing result of cpu_to_be64 into it. Fixing by using new variable of type __be64. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Yevgeny Petrilin	39b2c4ebb4	net/mlx4: fix sparse warnings on wrong type for RSS keys The keys used for the hardware RSS topelitz hash are of type __be32 where the values provided by the driver are from array of u32, this triggered sparse warning on incorrect type in assignment as of different base types. Since these values are picked randomly, the fix is to transform the key to __be32 by executing cpu_to_be_32() Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Or Gerlitz	966684d581	net/mlx4: fix sparse warnings on TX blue flame buffer The blue flame buffer is defined to be of type void __iomem * but was passed to mlx4_bf_copy which gets unsigned long * . This triggered sparse warning on different address spaces, fix that by changing mlx4_bf_copy first param to be of type void __iomem * . Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Or Gerlitz	4ef2a435be	net/mlx4: fix sparse warnings on TX control flags, endianess Fix sparse warnings on incompatibility between the endianess of the ctrl_flags field of struct mlx4_en_priv to the srcrb_flags field of struct mlx4_wqe_ctrl_seg by changing the former to be __be32 instead of u32. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Yevgeny Petrilin	ebf8c9aa03	net/mlx4_en: Saving mem access on data path Localized the pdev->dev, and using dma_map instead of pci_map There are multiple map/unmap operations on data path, optimizing those by saving redundant pointer access. Those places were identified as hot-spots when running kernel profiling during some benchmarks. The fixes had most impact when testing packet rate with small packets, reducing several % from CPU load, and in some case being the difference between reaching wire speed or being CPU bound. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Santosh Nayak	6975f4ce5a	qla3xxx: ethernet: Silence static checker warning. Silence the following warning: "warn: returning -1 instead of -ENOMEM is sloppy". Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:14 -05:00
Junchang Wang	9184a22701	8139too: Add 64bit statistics Switch to use ndo_get_stats64 to get 64bit statistics. Two sync entries are used (one for Rx and one for Tx). Signed-off-by: Junchang Wang <junchangwang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 00:04:16 -05:00
David S. Miller	f6a1ad4295	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vmxnet3/vmxnet3_drv.c Small vmxnet3 conflict with header size bug fix in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-05 21:16:26 -05:00
Linus Torvalds	f3969bf78f	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "It contains three cherry-picked fixes from perf/core, which turned out to be more urgent than we originally thought." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Handle kernels that don't support attr.exclude_{guest,host} perf tools: Change perf_guest default back to false perf record: No build id option fails	2012-03-05 16:23:12 -08:00
Linus Torvalds	98e990afa6	USB: revert a powerpc EHCI patch There is just one patch in here, a revert of a powerpc EHCI driver patch that was reported to cause problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iEYEABECAAYFAk9VTXkACgkQMUfUDdst+ylV+wCg0LCngetBRR4J7Tu+fxfIBS00 z6YAni9fZFigFsapZqiypbSVrZ6FARQs =g7Br -----END PGP SIGNATURE----- Merge tag 'usb-3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb USB: revert a powerpc EHCI patch There is just one patch in here, a revert of a powerpc EHCI driver patch that was reported to cause problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> * tag 'usb-3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: Revert "powerpc/usb: fix issue of CPU halt when missing USB PHY clock"	2012-03-05 16:10:44 -08:00
Linus Torvalds	75d7b398b7	tty: build fix for 3.3-rc6 This contains one build fix for the powerpc udbg driver that was reported. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iEYEABECAAYFAk9VTK0ACgkQMUfUDdst+ykAGACeLDYE9U586NNUAGcHALtb6AtT R1IAoK4NgsUvzxkp8XOlUYUar1DulcZB =0xfn -----END PGP SIGNATURE----- Merge tag 'tty-3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty tty: build fix for 3.3-rc6 This contains one build fix for the powerpc udbg driver that was reported. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> * tag 'tty-3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty/powerpc: early udbg consoles can't be modules	2012-03-05 16:10:27 -08:00
Linus Torvalds	a2e5f13ce8	3 fixes for md in 3.3-rc 2 relate to the recently added drive replacement. One causes read error in RAID10 to sometimes be retried indefinitely. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQIVAwUAT1VI1znsnt1WYoG5AQK47Q//d51y5QCpABFNUcgIM626zJXlBWFUSmzU wFOGXh5emN6/TWguzkiZwrvcspDmXMzz1zmJtGWixYb2jBpn2MHEN4uNz3Vq68w+ IYk/dJg/CG4+lzX+6IjiHOb3+TASRx94QZHJASx68vypqniAyikshqcbUeZBMTB0 Fu+sKqsOGYmwQfe6/vtRPVXY7DYK2dFDBRMFpmOl+o4Y2XxmmWzMw4Dg1RIEdtFS Jo9GwLHTnlw2xoc0XooufeT0Q2KOpqi9T8L6Nj0ORwpgsFqgtZ/kIOoGU6qOpSri ofLTrobVKMpjFtmiYVOp9TaBlPnd/TNX3E4WPLGNsAwYuRUFjq8evmJKjG+pOdeB 3ArxRKRJCaI2jnVhH+NpT7i/tpkEg/8a/BoOAihX+hM/8QkmsWluaRBOGMhpuuuc 1baPVTusi/zijO9cM8RGIXaQj5UG4s3LUpCIOIYdDyxsfmAH5KN1F2EPrU4NMME2 96THSshIZLkgAg5ICwtva0qoHlBlEclAlVAzEomT7R9KwHojEB1xUiyMmaIdMFoy JjGFAMp2E5+KBKZ1eYEHjthPWCb+nZ3eYHUh0DOnEt4kASCXnn45GJREQkpkNIR/ HhDTS8vI743unKnbCtYFMxiw/9OXZbMkdoZhobg7lxcpoQlWJ+5ziOtACl0h0Kv8 +ET+Kp3W8K4= =93ms -----END PGP SIGNATURE----- Merge tag 'md-3.3-fixes' of git://neil.brown.name/md Pull md fixes from Neil Brown: "Three fixes for md in 3.3-rc: Two relate to the recently added drive replacement. One fixes the problem where a read error in RAID10 would sometimes be retried indefinitely." * tag 'md-3.3-fixes' of git://neil.brown.name/md: md/raid10: fix assembling of arrays with replacement devices. md/raid10: fix handling of error on last working device in array. md/raid1: fix buglet in md_raid1_contested.	2012-03-05 16:01:25 -08:00
Linus Torvalds	3e85fb9cd4	Merge branch 'akpm' (Andrew's patch bomb) Merge the emailed seties of 19 patches from Andrew Morton * akpm: rapidio/tsi721: fix queue wrapping bug in inbound doorbell handler memcg: fix mapcount check in move charge code for anonymous page mm: thp: fix BUG on mm->nr_ptes alpha: fix 32/64-bit bug in futex support memcg: fix GPF when cgroup removal races with last exit debugobjects: Fix selftest for static warnings floppy/scsi: fix setting of BIO flags memcg: fix deadlock by inverting lrucare nesting drivers/rtc/rtc-r9701.c: fix crash in r9701_remove() c2port: class_create() returns an ERR_PTR pps: class_create() returns an ERR_PTR, not NULL hung_task: fix the broken rcu_lock_break() logic vfork: kill PF_STARTING coredump_wait: don't call complete_vfork_done() vfork: make it killable vfork: introduce complete_vfork_done() aio: wake up waiters when freeing unused kiocbs kprobes: return proper error code from register_kprobe() kmsg_dump: don't run on non-error paths by default	2012-03-05 15:50:25 -08:00
Alexandre Bounine	b24823e61b	rapidio/tsi721: fix queue wrapping bug in inbound doorbell handler Fix a bug that causes a kernel panic when the number of received doorbells is larger than number of entries in the inbound doorbell queue (current default value = 512). Another possible indication for this bug is large number of spurious doorbells reported by tsi721 driver after reaching the queue size maximum. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Chul Kim <chul.kim@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: <stable@vger.kernel.org> [3.2.x+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Naoya Horiguchi	e6ca7b89dc	memcg: fix mapcount check in move charge code for anonymous page Currently the charge on shared anonyous pages is supposed not to moved in task migration. To implement this, we need to check that mapcount > 1, instread of > 2. So this patch fixes it. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Andrea Arcangeli	1c641e8471	mm: thp: fix BUG on mm->nr_ptes Dave Jones reports a few Fedora users hitting the BUG_ON(mm->nr_ptes...) in exit_mmap() recently. Quoting Hugh's discovery and explanation of the SMP race condition: "mm->nr_ptes had unusual locking: down_read mmap_sem plus page_table_lock when incrementing, down_write mmap_sem (or mm_users 0) when decrementing; whereas THP is careful to increment and decrement it under page_table_lock. Now most of those paths in THP also hold mmap_sem for read or write (with appropriate checks on mm_users), but two do not: when split_huge_page() is called by hwpoison_user_mappings(), and when called by add_to_swap(). It's conceivable that the latter case is responsible for the exit_mmap() BUG_ON mm->nr_ptes that has been reported on Fedora." The simplest way to fix it without having to alter the locking is to make split_huge_page() a noop in nr_ptes terms, so by counting the preallocated pagetables that exists for every mapped hugepage. It was an arbitrary choice not to count them and either way is not wrong or right, because they are not used but they're still allocated. Reported-by: Dave Jones <davej@redhat.com> Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Josh Boyer <jwboyer@redhat.com> Cc: <stable@vger.kernel.org> [3.0.x, 3.1.x, 3.2.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Andrew Morton	62aca40365	alpha: fix 32/64-bit bug in futex support Michael Cree said: : : I have noticed some user space problems (pulseaudio crashes in pthread : : code, glibc/nptl test suite failures, java compiler freezes on SMP alpha : : systems) that arise when using a 2.6.39 or later kernel on Alpha. : : Bisecting between 2.6.38 and 2.6.39 (using glibc/nptl test suite as : : criterion for good/bad kernel) eventually leads to: : : : : `8d7718aa08` is the first bad commit : : commit `8d7718aa08` : : Author: Michel Lespinasse <walken@google.com> : : Date: Thu Mar 10 18:50:58 2011 -0800 : : : : futex: Sanitize futex ops argument types : : : : Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic : : prototypes to use u32 types for the futex as this is the data type the : : futex core code uses all over the place. : : : : Looking at the commit I see there is a change of the uaddr argument in : : the Alpha architecture specific code for futexes from int to u32, but I : : don't see why this should cause a problem. Richard Henderson said: : futex_atomic_cmpxchg_inatomic(u32 uval, u32 __user uaddr, : u32 oldval, u32 newval) : ... : : "r"(uaddr), "r"((long)oldval), "r"(newval) : : : There is no 32-bit compare instruction. These are implemented by : consistently extending the values to a 64-bit type. Since the : load instruction sign-extends, we want to sign-extend the other : quantity as well (despite the fact it's logically unsigned). : : So: : : - : "r"(uaddr), "r"((long)oldval), "r"(newval) : + : "r"(uaddr), "r"((long)(int)oldval), "r"(newval) : : should do the trick. Michael said: : This fixes the glibc test suite failures and the pulseaudio related : crashes, but it does not fix the java compiiler lockups that I was (and : are still) observing. That is some other problem. Reported-by: Michael Cree <mcree@orcon.net.nz> Tested-by: Michael Cree <mcree@orcon.net.nz> Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Michel Lespinasse <walken@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Hugh Dickins	7512102cf6	memcg: fix GPF when cgroup removal races with last exit When moving tasks from old memcg (with move_charge_at_immigrate on new memcg), followed by removal of old memcg, hit General Protection Fault in mem_cgroup_lru_del_list() (called from release_pages called from free_pages_and_swap_cache from tlb_flush_mmu from tlb_finish_mmu from exit_mmap from mmput from exit_mm from do_exit). Somewhat reproducible, takes a few hours: the old struct mem_cgroup has been freed and poisoned by SLAB_DEBUG, but mem_cgroup_lru_del_list() is still trying to update its stats, and take page off lru before freeing. A task, or a charge, or a page on lru: each secures a memcg against removal. In this case, the last task has been moved out of the old memcg, and it is exiting: anonymous pages are uncharged one by one from the memcg, as they are zapped from its pagetables, so the charge gets down to 0; but the pages themselves are queued in an mmu_gather for freeing. Most of those pages will be on lru (and force_empty is careful to lru_add_drain_all, to add pages from pagevec to lru first), but not necessarily all: perhaps some have been isolated for page reclaim, perhaps some isolated for other reasons. So, force_empty may find no task, no charge and no page on lru, and let the removal proceed. There would still be no problem if these pages were immediately freed; but typically (and the put_page_testzero protocol demands it) they have to be added back to lru before they are found freeable, then removed from lru and freed. We don't see the issue when adding, because the mem_cgroup_iter() loops keep their own reference to the memcg being scanned; but when it comes to mem_cgroup_lru_del_list(). I believe this was not an issue in v3.2: there, PageCgroupAcctLRU and PageCgroupUsed flags were used (like a trick with mirrors) to deflect view of pc->mem_cgroup to the stable root_mem_cgroup when neither set. `38c5d72f3e` ("memcg: simplify LRU handling by new rule") mercifully removed those convolutions, but left this General Protection Fault. But it's surprisingly easy to restore the old behaviour: just check PageCgroupUsed in mem_cgroup_lru_add_list() (which decides on which lruvec to add), and reset pc to root_mem_cgroup if page is uncharged. A risky change? just going back to how it worked before; testing, and an audit of uses of pc->mem_cgroup, show no problem. And there's a nice bonus: with mem_cgroup_lru_add_list() itself making sure that an uncharged page goes to root lru, mem_cgroup_reset_owner() no longer has any purpose, and we can safely revert `4e5f01c2b9` ("memcg: clear pc->mem_cgroup if necessary"). Calling update_page_reclaim_stat() after add_page_to_lru_list() in swap.c is not strictly necessary: the lru_lock there, with RCU before memcg structures are freed, makes mem_cgroup_get_reclaim_stat_from_page safe without that; but it seems cleaner to rely on one dependency less. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Stephen Boyd	9f78ff005a	debugobjects: Fix selftest for static warnings debugobjects is now printing a warning when a fixup for a NOTAVAILABLE object is run. This causes the selftest to fail like: ODEBUG: selftest warnings failed 4 != 5 We could just increase the number of warnings that the selftest is expecting to see because that is actually what has changed. But, it turns out that fixup_activate() was written with inverted logic and thus a fixup for a static object returned 1 indicating the object had been fixed, and 0 otherwise. Fix the logic to be correct and update the counts to reflect that nothing needed fixing for a static object. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Muthu Kumar	9354f1b8e6	floppy/scsi: fix setting of BIO flags Fix setting bio flags in drivers (sd_dif/floppy). Signed-off-by: Muthukumar R <muthur@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Hugh Dickins	9ce70c0240	memcg: fix deadlock by inverting lrucare nesting We have forgotten the rules of lock nesting: the irq-safe ones must be taken inside the non-irq-safe ones, otherwise we are open to deadlock: CPU0 CPU1 ---- ---- lock(&(&pc->lock)->rlock); local_irq_disable(); lock(&(&zone->lru_lock)->rlock); lock(&(&pc->lock)->rlock); <Interrupt> lock(&(&zone->lru_lock)->rlock); To check a different locking issue, I happened to add a spin_lock to memcg's bit_spin_lock in lock_page_cgroup(), and lockdep very quickly complained about __mem_cgroup_commit_charge_lrucare() (on CPU1 above). So delete __mem_cgroup_commit_charge_lrucare(), passing a bool lrucare to __mem_cgroup_commit_charge() instead, taking zone->lru_lock under lock_page_cgroup() in the lrucare case. The original was using spin_lock_irqsave, but we'd be in more trouble if it were ever called at interrupt time: unconditional _irq is enough. And ClearPageLRU before del from lru, SetPageLRU before add to lru: no strong reason, but that is the ordering used consistently elsewhere. Fixes `36b62ad539` ("memcg: simplify corner case handling of LRU"). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Anatolij Gustschin	73737b8787	drivers/rtc/rtc-r9701.c: fix crash in r9701_remove() If probing the RTC didn't succeed due to failed RTC register access, the RTC device will be unregistered. Then, when removing the module r9701_remove() causes a kernel crash while trying to unregister a not registered RTC device. Fix this by doing RTC register access test before RTC device registration. Signed-off-by: Anatolij Gustschin <agust@denx.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Dan Carpenter	22ea71d7f4	c2port: class_create() returns an ERR_PTR class_create() doesn't return a NULL, it only returns ERR_PTRs. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Dan Carpenter	7ad12566dc	pps: class_create() returns an ERR_PTR, not NULL class_create() never returns NULLs only ERR_PTRs. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Oleg Nesterov	6027ce497d	hung_task: fix the broken rcu_lock_break() logic check_hung_uninterruptible_tasks()->rcu_lock_break() introduced by "softlockup: check all tasks in hung_task" commit `ce9dbe24` looks absolutely wrong. - rcu_lock_break() does put_task_struct(). If the task has exited it is not safe to even read its ->state, nothing protects this task_struct. - The TASK_DEAD checks are wrong too. Contrary to the comment, we can't use it to check if the task was unhashed. It can be unhashed without TASK_DEAD, or it can be valid with TASK_DEAD. For example, an autoreaping task can do release_task(current) long before it sets TASK_DEAD in do_exit(). Or, a zombie task can have ->state == TASK_DEAD but release_task() was not called, and in this case we must not break the loop. Change this code to check pid_alive() instead, and do this before we drop the reference to the task_struct. Note: while_each_thread() under rcu_read_lock() is not really safe, it can livelock. This will be fixed later, but fortunately in this case the "max_count" logic saves us anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mandeep Singh Baines <msb@google.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:42 -08:00
Oleg Nesterov	6e27f63edb	vfork: kill PF_STARTING Previously it was (ab)used by utrace. Then it was wrongly used by the scheduler code. Currently it is not used, kill it before it finds the new erroneous user. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:42 -08:00
Oleg Nesterov	57b59c4a14	coredump_wait: don't call complete_vfork_done() Now that CLONE_VFORK is killable, coredump_wait() no longer needs complete_vfork_done(). zap_threads() should find and kill all tasks with the same ->mm, this includes our parent if ->vfork_done is set. mm_release() becomes the only caller, unexport complete_vfork_done(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:42 -08:00
Oleg Nesterov	d68b46fe16	vfork: make it killable Make vfork() killable. Change do_fork(CLONE_VFORK) to do wait_for_completion_killable(). If it fails we do not return to the user-mode and never touch the memory shared with our child. However, in this case we should clear child->vfork_done before return, we use task_lock() in do_fork()->wait_for_vfork_done() and complete_vfork_done() to serialize with each other. Note: now that we use task_lock() we don't really need completion, we could turn task->vfork_done into "task_struct *wake_up_me" but this needs some complications. NOTE: this and the next patches do not affect in-kernel users of CLONE_VFORK, kernel threads run with all signals ignored including SIGKILL/SIGSTOP. However this is obviously the user-visible change. Not only a fatal signal can kill the vforking parent, a sub-thread can do execve or exit_group() and kill the thread sleeping in vfork(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:42 -08:00
Oleg Nesterov	c415c3b47e	vfork: introduce complete_vfork_done() No functional changes. Move the clear-and-complete-vfork_done code into the new trivial helper, complete_vfork_done(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:42 -08:00
Jeff Moyer	880641bb9d	aio: wake up waiters when freeing unused kiocbs Bart Van Assche reported a hung fio process when either hot-removing storage or when interrupting the fio process itself. The (pruned) call trace for the latter looks like so: fio D 0000000000000001 0 6849 6848 0x00000004 ffff880092541b88 0000000000000046 ffff880000000000 ffff88012fa11dc0 ffff88012404be70 ffff880092541fd8 ffff880092541fd8 ffff880092541fd8 ffff880128b894d0 ffff88012404be70 ffff880092541b88 000000018106f24d Call Trace: schedule+0x3f/0x60 io_schedule+0x8f/0xd0 wait_for_all_aios+0xc0/0x100 exit_aio+0x55/0xc0 mmput+0x2d/0x110 exit_mm+0x10d/0x130 do_exit+0x671/0x860 do_group_exit+0x44/0xb0 get_signal_to_deliver+0x218/0x5a0 do_signal+0x65/0x700 do_notify_resume+0x65/0x80 int_signal+0x12/0x17 The problem lies with the allocation batching code. It will opportunistically allocate kiocbs, and then trim back the list of iocbs when there is not enough room in the completion ring to hold all of the events. In the case above, what happens is that the pruning back of events ends up freeing up the last active request and the context is marked as dead, so it is thus responsible for waking up waiters. Unfortunately, the code does not check for this condition, so we end up with a hung task. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Reported-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Bart Van Assche <bvanassche@acm.org> Cc: <stable@kernel.org> [3.2.x only] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:42 -08:00

1 2 3 4 5 ...

289353 Commits