linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-13 01:46:52 +07:00

Author	SHA1	Message	Date
Sebastian Ott	c630493327	[S390] proper use of device register Don't use kfree directly after device registration started. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:45 +02:00
Stefan Haberland	ca99dab01d	[S390] dasd: fix message naming This patch fixes message naming so that generic dasd messages do not contain the device discipline. For this purpose the dev_ makros are replaced by pr_ makros for generic dasd messages. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:42 +02:00
Stefan Haberland	68b781fe1b	[S390] dasd: optimize cpu usage in goodcase remove unnecessary dbf call, remove string operations for magic Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:41 +02:00
Stefan Weinhuber	97f604b074	[S390] dasd: fail requests when device state is less then ready A DASD device that is not ready or online has no defined disk layout, so all requests that arrive in such a state need to be returned as failed. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:41 +02:00
Sebastian Ott	3ac276f8cb	[S390] cio: remove ccw_device init_name We used the init_name to set the console ccw_device's name early at the boot stage. This patch moves the name setting (for all ccw devices) to the point where we actually register the device. At this time we can do dynamic allocations and therefore use dev_set_name. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:41 +02:00
Sebastian Ott	3b554a14f4	[S390] cio: move final put_device to ccw_device_unregister We use a test_and_clear_bit to prevent a device from being unregistered twice. Unfortunately in this cases the "final" put_device (from device_initialize) was issued more than once, resulting in an use after free error. Fix this by moving this put_device to ccw_device_unregister. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:40 +02:00
Sebastian Ott	6ee4fec6be	[S390] cio: remove subchannel init_name We used the init_name to set the console subchannels name early at the boot stage. With the patch cio: fix memleak in subchannel validation we moved the name setting to the point where we actually register the console subchannel. At this time we can do dynamic allocations and therefore use dev_set_name. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:40 +02:00
Sebastian Ott	ab6aae0902	[S390] cio: fix memleak in subchannel validation When scanning for new subchannels we have a code path where we allocate memory for a struct subchannel, set the device name (which is dynamically allocated now) and do a check if the underlying device is blacklisted - if so we free the subchannel structure. Since we have not set up refcounting at this stage, the device name's memory is lost. Fix this by moving the dev_set_name after the blacklist test. Note: With this patch the init_name for the console subchannel becomes virtually obsolete. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:39 +02:00
Sebastian Ott	f014824ee7	[S390] cio: fix use after free in s390 debug feature When using s390dbf with "%s" in sprintf format strings the string itself is not copied to the dbf buffer. Since in this case only pointers are stored in the s390dbf, we should not use dev_name - which is bound to the lifetime of the device. Reading this entry from s390dbf after the device was released will cause an use after free error. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:39 +02:00
Jan Glauber	3f09bb8965	[S390] qdio: remove limited number of debugfs entries The number of qdio debugfs entries was limited. Remove this limit and group the queue files in a per device directory. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:39 +02:00
Michael Ernst	217ee6c64a	[S390] cio: failing set online/offline processing. When unit checks trigger sensing the device state is set to W4SENSE until sense completion; then the device state is set back to ONLINE. If a unit check occurs while set online or set offline requests are processed then it might happen that the device's temporary W4SENSE state causes these functions to terminate, leaving the device in an inconsistent state when the state is set back to ONLINE later on so that the device cannot be set online or offline any longer. To solve this, set online/offline and related rollback or error routines are processed only if the device is in a final or DISCONNECTED state. Signed-off-by: Michael Ernst <mernst@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:38 +02:00
Sebastian Ott	be7a2ddce6	[S390] cio: ensure to hold a reference for deferred deregistration Ensure to always hold an extra device reference for scheduling a subchannel deregistration, by moving the get_device to ccw_device_schedule_sch_unregister. This fixes an use after free error in ccw_device_call_sch_unregister where put_device was called on an already freed device structure. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:38 +02:00
Jan Glauber	e2910bcf8c	[S390] qdio: continue polling if the queue is not finished With commit `c38f960809` polling was stopped for the queue even if new data is available. Return immediately after scheduling the queue tasklet if the queue is not done. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:37 +02:00
Sebastian Ott	efd986db2d	[S390] cio: increase trace level Move debug traces for start I/O and interrupt events to exclusive trace levels. Also change tracing in hot-path from sprintf (costly) to hex. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:37 +02:00
Sebastian Ott	626e476ae0	[S390] cio: fix not oper handling after failed [on\|off]line processing If online/offline processing of a ccw device fails, resulting in not operational state, notify the driver and unregister the device in case the driver dosn't want to keep it. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:37 +02:00
Peter Oberparleiter	1da73bc80b	[S390] cio: consolidate subchannel intparm reset Ensure that the hardware interruption parameter for a subchannel is reset when the associated subchannel data structure is freed. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:36 +02:00
Heiko Carstens	62733e5a5a	[S390] cio: move scsw helper functions to header file All scsw helper functions are very short and usage of them shouldn't result in function calls. Therefore we move them to a separate header file. Also saves a lot of EXPORT_SYMBOLs. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:36 +02:00
Peter Oberparleiter	1f1148c88a	[S390] cio: fix ineffective verify event Path verification events occurring for offline devices are currently ignored. As a result, offline devices are not removed, even though they might no longer be accessible (for example because the last path to the device was varied offline). Fix this by scheduling a status evaluation for the affected subchannel when a path verification event occurs. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:36 +02:00
Geert Uytterhoeven	0d03d59d9b	md: Fix "strchr" [drivers/md/dm-log-userspace.ko] undefined! Commit `b8313b6da7` ("dm log: remove incorrect field from userspace table output") added a call to strstr() with a single-character "needle" string parameter. Unfortunately some versions of gcc replace such calls to strstr() by calls to strchr() behind our back. This causes linking errors if strchr() is defined as an inline function in <asm/string.h> (e.g. on m68k): \| WARNING: "strchr" [drivers/md/dm-log-userspace.ko] undefined! Avoid this by explicitly calling strchr() instead. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-10 14:55:01 -07:00
Ed Cashin	7135a71b19	aoe: allocate unused request_queue for sysfs Andy Whitcroft reported an oops in aoe triggered by use of an incorrectly initialised request_queue object: [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add an uninitialized object, something is seriously wrong. [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu [ 2645.959107] Call Trace: [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70 [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0 [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160 [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe] The request queue of an aoe device is not used but can be allocated in code that does not sleep. Bruno bisected this regression down to `cd43e26f07` block: Expose stacked device queues in sysfs "This seems to generate /sys/block/$device/queue and its contents for everyone who is using queues, not just for those queues that have a non-NULL queue->request_fn." Addresses http://bugs.launchpad.net/bugs/410198 Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942 Note that embedding a queue inside another object has always been an illegal construct, since the queues are reference counted and must persist until the last reference is dropped. So aoe was always buggy in this respect (Jens). Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Bruno Premont <bonbons@linux-vserver.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-09 14:10:18 +02:00
Linus Torvalds	e6890f6f3d	i915: disable interrupts before tearing down GEM state Reinette Chatre reports a frozen system (with blinking keyboard LEDs) when switching from graphics mode to the text console, or when suspending (which does the same thing). With netconsole, the oops turned out to be BUG: unable to handle kernel NULL pointer dereference at 0000000000000084 IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915] and it's due to the i915_gem.c code doing drm_irq_uninstall() after having done i915_gem_idle(). And the i915_gem_idle() path will do i915_gem_idle() -> i915_gem_cleanup_ringbuffer() -> i915_gem_cleanup_hws() -> dev_priv->hw_status_page = NULL; but if an i915 interrupt comes in after this stage, it may want to access that hw_status_page, and gets the above NULL pointer dereference. And since the NULL pointer dereference happens from within an interrupt, and with the screen still in graphics mode, the common end result is simply a silently hung machine. Fix it by simply uninstalling the irq handler before idling rather than after. Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13819 Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 17:09:24 -07:00
Zhenyu Wang	7c8460db30	drm/i915: fix mask bits setting eDP is exclusive connector too, and add missing crtc_mask setting for TV. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=14139 Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reported-and-tested-by: Carlos R. Mafra <crmafra2@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 10:16:20 -07:00
Linus Torvalds	3ff323f890	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register.	2009-09-07 11:42:25 -07:00
Linus Torvalds	4886b5b485	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: gianfar: Fix build.	2009-09-07 11:40:24 -07:00
Linus Torvalds	cbeb2864b1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6: pcmcia: add CNF-CDROM-ID for ide	2009-09-07 11:40:15 -07:00
Linus Torvalds	f69fb9c398	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: agp/intel: support for new chip variant of IGDNG mobile drm/i915: Unref old_obj on get_fence_reg() error path drm/i915: increase default latency constant (v2 w/comment)	2009-09-07 11:38:30 -07:00
Dave Airlie	a54775c875	drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register. This adds some rv350+ register for LTE/GTE discard, and enables the rv515 two sided stencil register. It also disables the DEPTHXY_OFFSET register which can be used to workaround the CS checker. Moves rs690 to proper place in rs600 and uses correct table on rs600. Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-09-07 15:26:19 +10:00
David S. Miller	d9d8e0418f	gianfar: Fix build. Reported by Michael Guntsche <mike@it-loops.com> -------------------- Commit `38bddf04bc` gianfar: gfar_remove needs to call unregister_netdev() breaks the build of the gianfar driver because "dev" is undefined in this function. To quickly test rc9 I changed this to priv->ndev but I do not know if this is the correct one. -------------------- Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-06 01:41:24 -07:00
Linus Torvalds	f815c335d2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: sbp2: fix freeing of unallocated memory firewire: ohci: fix Ricoh R5C832, video reception firewire: ohci: fix Agere FW643 and multiple cameras firewire: core: fix crash in iso resource management	2009-09-05 14:59:00 -07:00
Linus Torvalds	5136a6c0fd	Merge git://git.infradead.org/~dwmw2/mtd-2.6.31 * git://git.infradead.org/~dwmw2/mtd-2.6.31: JFFS2: add missing verify buffer allocation/deallocation mtd: nftl: fix offset alignments mtd: nftl: write support is broken mtd: m25p80: fix null pointer dereference bug	2009-09-05 14:57:04 -07:00
Linus Torvalds	59430c2f43	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: tc: Fix unitialized kernel memory leak pkt_sched: Revert tasklet_hrtimer changes. net: sk_free() should be allowed right after sk_alloc() gianfar: gfar_remove needs to call unregister_netdev() ipw2200: firmware DMA loading rework	2009-09-05 14:52:41 -07:00
Linus Torvalds	3bb314f01c	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Re-enable cpufreq suspend and resume code	2009-09-05 14:51:24 -07:00
Linus Torvalds	154f807e55	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm snapshot: fix on disk chunk size validation dm exception store: split set_chunk_size dm snapshot: fix header corruption race on invalidation dm snapshot: refactor zero_disk_area to use chunk_io dm log: userspace add luid to distinguish between concurrent log instances dm raid1: do not allow log_failure variable to unset after being set dm log: remove incorrect field from userspace table output dm log: fix userspace status output dm stripe: expose correct io hints dm table: add more context to terse warning messages dm table: fix queue_limit checking device iterator dm snapshot: implement iterate devices dm multipath: fix oops when request based io fails when no paths	2009-09-05 13:51:07 -07:00
Linus Torvalds	9b6a3df372	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI SR-IOV: correct broken resource alignment calculations	2009-09-05 13:50:46 -07:00
Linus Torvalds	6399534472	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: atkbd - add Compaq Presario R4000-series repeat quirk Input: i8042 - add Acer Aspire 5536 to the nomux list	2009-09-05 13:41:29 -07:00
Linus Torvalds	ac89a9174d	pty: don't limit the writes to 'pty_space()' inside 'pty_write()' The whole write-room thing is something that is up to the _caller_ to worry about, not the pty layer itself. The total buffer space will still be limited by the buffering routines themselves, so there is no advantage or need in having pty_write() artificially limit the size somehow. And what happened was that the caller (the n_tty line discipline, in this case) may have verified that there is room for 2 bytes to be written (for NL -> CRNL expansion), and it used to then do those writes as two single-byte writes. And if the first byte written (CR) then caused a new tty buffer to be allocated, pty_space() may have returned zero when trying to write the second byte (LF), and then incorrectly failed the write - leading to a lost newline character. This should finally fix http://bugzilla.kernel.org/show_bug.cgi?id=14015 Reported-by: Mikael Pettersson <mikpe@it.uu.se> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-05 13:27:10 -07:00
Linus Torvalds	37f81fa1f6	n_tty: do O_ONLCR translation as a single write When translating CR to CRNL in the n_tty line discipline, we did it as two tty_put_char() calls. Which works, but is stupid, and has caused problems before too with bad interactions with the write_room() logic. The generic USB serial driver had that problem, for example. Now the pty layer had similar issues after being moved to the generic tty buffering code (in commit `d945cb9cce`: "pty: Rework the pty layer to use the normal buffering logic"). So stop doing the silly separate two writes, and do it as a single write instead. That's what the n_tty layer already does for the space expansion of tabs (XTABS), and it means that we'll now always have just a single write for the CRNL to match the single 'tty_write_room()' test, which hopefully means that the next time somebody screws up buffering, it won't cause weeks of debugging. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-05 12:46:07 -07:00
Stefan Richter	baed6b82d9	firewire: sbp2: fix freeing of unallocated memory If a target writes invalid status (typically status of a command that already timed out), firewire-sbp2 attempts to put away an ORB that doesn't exist. https://bugzilla.redhat.com/show_bug.cgi?id=519772 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-09-05 15:59:34 +02:00
Stefan Richter	4fe0badd58	firewire: ohci: fix Ricoh R5C832, video reception In dual-buffer DMA mode, no video frames are ever received from R5C832 by libdc1394. Fallback to packet-per-buffer DMA works reliably. http://thread.gmane.org/gmane.linux.kernel.firewire.devel/13393/focus=13476 Reported-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-09-05 15:59:34 +02:00
Stefan Richter	fc383796a8	firewire: ohci: fix Agere FW643 and multiple cameras An Agere FW643 OHCI 1.1 card works fine for video reception from one camera but fails early if receiving from two cameras. After a short while, no IR IRQ events occur and the context control register does not react anymore. This happens regardless whether both IR DMA contexts are dual-buffer or one is dual-buffer and the other packet-per-buffer. This can be worked around by disabling dual buffer DMA mode entirely. http://sourceforge.net/mailarchive/message.php?msg_name=4A7C0594.2020208%40gmail.com (Reported by Samuel Audet.) In another report (by Jonathan Cameron), an FW643 works OK with two cameras in dual buffer mode. Whether this is due to different chip revisions or different usage patterns (different video formats) is not yet clear. However, as far as the current capabilities of firewire-core's isochronous I/O interface are concerned, simply switching off dual-buffer on non-working and working FW643s alike is not a problem in practice. We only need to revisit this issue if we are going to enhance the interface, e.g. so that applications can explicitly choose modes. Reported-by: Samuel Audet <samuel.audet@gmail.com> Reported-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-09-05 15:59:34 +02:00
Stefan Richter	1821bc19d5	firewire: core: fix crash in iso resource management This fixes a regression due to post 2.6.30 commit "firewire: core: do not DMA-map stack addresses" `6fdc037094`. As David Moore noted, a previously correct sizeof() expression became wrong since the commit changed its argument from an array to a pointer. This resulted in an oops in ohci_cancel_packet in the shared workqueue thread's context when an isochronous resource was to be freed. Reported-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-09-05 15:59:34 +02:00
Mikulas Patocka	ae0b7448e9	dm snapshot: fix on disk chunk size validation Fix some problems seen in the chunk size processing when activating a pre-existing snapshot. For a new snapshot, the chunk size can either be supplied by the creator or a default value can be used. For an existing snapshot, the chunk size in the snapshot header on disk should always be used. If someone attempts to load an existing snapshot and has the 'default chunk size' option set, the kernel uses its default value even when it is incorrect for the snapshot being loaded. This patch ensures the correct on-disk value is always used. Secondly, when the code does use the chunk size stored on the disk it is prudent to revalidate it, so the code can exit cleanly if it got corrupted as happened in https://bugzilla.redhat.com/show_bug.cgi?id=461506 . Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:43 +01:00
Mikulas Patocka	2defcc3fb4	dm exception store: split set_chunk_size Break the function set_chunk_size to two functions in preparation for the fix in the following patch. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:41 +01:00
Mikulas Patocka	61578dcd3f	dm snapshot: fix header corruption race on invalidation If a persistent snapshot fills up, a race can corrupt the on-disk header which causes a crash on any future attempt to activate the snapshot (typically while booting). This patch fixes the race. When the snapshot overflows, __invalidate_snapshot is called, which calls snapshot store method drop_snapshot. It goes to persistent_drop_snapshot that calls write_header. write_header constructs the new header in the "area" location. Concurrently, an existing kcopyd job may finish, call copy_callback and commit_exception method, that goes to persistent_commit_exception. persistent_commit_exception doesn't do locking, relying on the fact that callbacks are single-threaded, but it can race with snapshot invalidation and overwrite the header that is just being written while the snapshot is being invalidated. The result of this race is a corrupted header being written that can lead to a crash on further reactivation (if chunk_size is zero in the corrupted header). The fix is to use separate memory areas for each. See the bug: https://bugzilla.redhat.com/show_bug.cgi?id=461506 Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:39 +01:00
Mikulas Patocka	02d2fd31de	dm snapshot: refactor zero_disk_area to use chunk_io Refactor chunk_io to prepare for the fix in the following patch. Pass an area pointer to chunk_io and simplify zero_disk_area to use chunk_io. No functional change. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:37 +01:00
Jonathan Brassow	7ec23d5094	dm log: userspace add luid to distinguish between concurrent log instances Device-mapper userspace logs (like the clustered log) are identified by a universally unique identifier (UUID). This identifier is used to associate requests from the kernel to a specific log in userspace. The UUID must be unique everywhere, since multiple machines may use this identifier when communicating about a particular log, as is the case for cluster logs. Sometimes, device-mapper/LVM may re-use a UUID. This is the case during pvmoves, when moving from one segment of an LV to another, or when resizing a mirror, etc. In these cases, a new log is created with the same UUID and loaded in the "inactive" slot. When a device-mapper "resume" is issued, the "live" table is deactivated and the new "inactive" table becomes "live". (The "inactive" table can also be removed via a device-mapper 'clear' command.) The above two issues were colliding. More than one log was being created with the same UUID, and there was no way to distinguish between them. So, sometimes the wrong log would be swapped out during the exchange. The solution is to create a locally unique identifier, 'luid', to go along with the UUID. This new identifier is used to determine exactly which log is being referenced by the kernel when the log exchange is made. The identifier is not universally safe, but it does not need to be, since create/destroy/suspend/resume operations are bound to a specific machine; and these are the operations that make up the exchange. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:34 +01:00
Jonathan Brassow	d2b698644c	dm raid1: do not allow log_failure variable to unset after being set This patch fixes a bug which was triggering a case where the primary leg could not be changed on failure even when the mirror was in-sync. The case involves the failure of the primary device along with the transient failure of the log device. The problem is that bios can be put on the 'failures' list (due to log failure) before 'fail_mirror' is called due to the primary device failure. Normally, this is fine, but if the log device failure is transient, a subsequent iteration of the work thread, 'do_mirror', will reset 'log_failure'. The 'do_failures' function then resets the 'in_sync' variable when processing bios on the failures list. The 'in_sync' variable is what is used to determine if the primary device can be switched in the event of a failure. Since this has been reset, the primary device is incorrectly assumed to be not switchable. The case has been seen in the cluster mirror context, where one machine realizes the log device is dead before the other machines. As the responsibilities of the server migrate from one node to another (because the mirror is being reconfigured due to the failure), the new server may think for a moment that the log device is fine - thus resetting the 'log_failure' variable. In any case, it is inappropiate for us to reset the 'log_failure' variable. The above bug simply illustrates that it can actually hurt us. Cc: stable@kernel.org Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:32 +01:00
Jonathan Brassow	b8313b6da7	dm log: remove incorrect field from userspace table output The output of 'dmsetup table' includes an internal field that should not be there. This patch removes it. To make the fix simpler, we first reorder a constructor argument The 'device size' argument is generated internally. Currently it is placed as the last space-separated word of the constructor string. However, we need to use a version of the string without this word, so we move it to the beginning instead so it is trivial to skip past it. We keep a copy of the arguments passed to userspace for creating a log, just in case we need to resend them. These are the same arguments that are desired in the STATUSTYPE_TABLE request, except for one. When creating the userspace log, the userspace daemon must know the size of the mirror, so that is added to the arguments given in the constructor table. We were printing this extra argument out as well, which is a mistake. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:30 +01:00
Jonathan Brassow	4142a96917	dm log: fix userspace status output Fix 'dmsetup table' output. There is a missing ' ' at the end of the string causing two words to run together. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:28 +01:00
Mike Snitzer	40bea43127	dm stripe: expose correct io hints Set sensible I/O hints for striped DM devices in the topology infrastructure added for 2.6.31 for userspace tools to obtain via sysfs. Add .io_hints to 'struct target_type' to allow the I/O hints portion (io_min and io_opt) of the 'struct queue_limits' to be set by each target and implement this for dm-stripe. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-09-04 20:40:25 +01:00

1 2 3 4 5 ...

68428 Commits