linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-15 17:36:57 +07:00

Author	SHA1	Message	Date
Stefan Richter	5e2125677f	firewire: sbp2: fix DMA mapping leak on the failure path Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> who also provided a first version of the fix. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-28 20:31:08 +01:00
Stefan Richter	f746072abc	firewire: sbp2: define some magic numbers as macros Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-28 20:31:07 +01:00
Stefan Richter	a08e100aec	firewire: sbp2: fix payload limit at S1600 and S3200 1394-2008 clause 16.3.4.1 (1394b-2002 clause 16.3.1.1) defines tighter limits than 1394-2008 clause 6.2.2.3 (1394a-2000 clause 6.2.2.3). Our previously too large limit doesn't matter though if the controller reports its max_receive correctly. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-28 20:31:07 +01:00
Stefan Richter	e747a5c0be	firewire: core: optimize card shutdown This fixes a regression by "firewire: keep highlevel drivers attached during brief connection loss": There were 2 seconds unnecessary waiting added to the shutdown procedure of each controller. We use card->link as status flag to signal the device handler that there is no use to wait for a come-back. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-24 20:40:12 +01:00
Stefan Richter	8b7b6afaa8	firewire: ohci: increase AT req. retries, fix ack_busy_X from Panasonic camcorders and others Camcorders have a tendency to fail read requests to their config ROM and write request to their FCP command register with ack_busy_X. This has become a problem with newer kernels and especially Panasonic camcorders, causing AV/C in dvgrab and kino to fail. Dvgrab for example frequently logs "send oops"; kino reports loss of AV/C control. I suspect that lower CPU scheduling latencies in newer kernels made this issue more prominent now. According to https://sourceforge.net/tracker/?func=detail&atid=114103&aid=2492640&group_id=14103 this can be fixed by configuring the FireWire controller for more hardware retries for request transmission; these retries are evidently more successful than libavc1394's own retry loop (typically 3 tries on top of hardware retries). Presumably the same issue has been reported at https://bugzilla.redhat.com/show_bug.cgi?id=449252 and https://bugzilla.redhat.com/show_bug.cgi?id=477279 . In a quick test with a JVC camcorder (which didn't malfunction like the reported camcorders), this change decreased the number of ack_busy_X from 16 in three runs of dvgrab to 4 in three runs of the same capture duration. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-24 11:17:27 +01:00
Stefan Richter	b006854955	firewire: ohci: change "context_stop: still active" log message The present message is mostly just noise. We only need to be notified if the "active" flag does not go off before the retry loop terminates. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-24 11:17:27 +01:00
Stefan Richter	3d36a0df3b	firewire: keep highlevel drivers attached during brief connection loss There are situations when nodes vanish from the bus and come back quickly thereafter: - When certain bus-powered hubs are plugged in, - when certain devices are plugged into 6-port hubs, - when certain disk enclosures are switched from self-power to bus power or vice versa and break the daisy chain during the transition, - when the user plugs a cable out and quickly plugs it back in, e.g. to reorder a daisy chain (works on Mac OS X if done quickly enough), - when certain hubs temporarily malfunction during high bus traffic. Until now, firewire-core reported affected nodes as lost to the highlevel drivers (firewire-sbp2 and userspace drivers). We now delay the destruction of device representations until after at least two seconds after the last bus reset. If a "new" device is detected in this period whose bus information block and root directory header match that of a device which is pending for deletion, we resurrect that device and send update calls to highlevel drivers. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-20 19:29:52 +01:00
Stefan Richter	8cd0bbbdff	firewire: unnecessary BM delay after generation rollover Noticed by Jarod Wilson: The bus manager work was unnecessarily delayed each time the bus generation counter rolled over. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2009-01-20 19:29:51 +01:00
Stefan Richter	a5c7f4710f	firewire: insist on successive self ID complete events The whole topology code only works if the old and new topologies which are compared come from immediately successive self ID complete events. If there happened bus resets without self ID complete events in the meantime, or self ID complete events with invalid selfIDs, the topology comparison could identify nodes wrongly, or more likely just corrupt kernel memory or panic right away. We now discard all nodes of the old topology and treat all current nodes as new ones if the current self ID generation is not the previous one plus 1. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2009-01-20 19:29:51 +01:00
Stefan Richter	6230582320	firewire: core: fix sleep in atomic context due to driver core change Due to commit `2831fe6f9c`, "driver core: create a private portion of struct device", device_initialize() can no longer be called from atomic contexts. We now defer it until after config ROM probing. This requires changes to the bus manager code because this may use a device before it was probed. Reported-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-09 23:22:32 +01:00
Stefan Richter	c8a12d45d5	firewire: reorder struct fw_card for better cache efficiency topology_map is by far the largest member in struct fw_card. Move it to the very end of the struct so that card pointer dereferences have better chances to hit the CPU cache. This requires to increase the topology_map backing store to the size specified in IEEE 1394, i.e. 256 rather than 255 quadlets. Otherwise the topology_map response handler may access invalid memory. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-04 23:50:38 +01:00
Stefan Richter	d6f95a3d14	firewire: fix resetting of bus manager retry counter An earlier change, maybe long ago, removed the copying of self_id_count into card->self_id_count. Since then each bus reset cleared card->bm_retries even when it shouldn't. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-04 23:50:38 +01:00
Jay Fenlason	0fa1986f3a	firewire: improve refcounting of fw_card Take a reference to the card whenever fw_card_bm_work() is scheduled on that card and release it when the work is done. This allows us to remove the cancel_delayed_work_sync() in fw_core_remove_card(). Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (patch update)	2009-01-04 23:50:37 +01:00
Jay Fenlason	2cc489c213	firewire: typo in comment Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-04 23:50:37 +01:00
Stefan Richter	d6053e08f5	firewire: fix small memory leak at module removal Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-04 23:50:37 +01:00
Stefan Richter	621f6dd715	firewire: fw-sbp2: remove unnecessary locking What was I thinking when I added sbp2_set_generation()? Its locking did nothing (except for implicitly providing the necessary barrier between node IDs update and generation update). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-01-04 23:50:36 +01:00
Stefan Richter	1d1dc5e83f	firewire: fw-ohci: fix IOMMU resource exhaustion There is a DMA map/ unmap imbalance whenever a block write request packet is sent and then dequeued with ohci_cancel_packet. The latter may happen frequently if the AR resp tasklet is executed before the AT req tasklet for the same transaction. Add the missing dma_unmap_single. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=475156 Reported-by: Emmanuel Kowalski Tested-by: Emmanuel Kowalski Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-12-10 12:45:34 +01:00
Stefan Richter	031bb27c4b	firewire: fw-sbp2: another iPod mini quirk entry Add another model ID of a broken firmware to prevent early I/O errors by acesses at the end of the disk. Reported at linux1394-user, http://marc.info/?t=122670842900002 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-11-25 21:38:31 +01:00
Kay Sievers	a1f64819fe	firewire: struct device - replace bus_id with dev_name(), dev_set_name() Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-31 08:48:25 +01:00
Jay Fenlason	cd1f70fdb4	firewire: fw-sbp2: fix races 1: There is a small race between queue_delayed_work() and its corresponding kref_get(). Do the kref_get first, and _put it again if the queue_delayed_work() failed, so there is no chance of the kref going to zero while the work is scheduled. 2: An SBP2_LOGOUT_REQUEST could be sent out with a login_id full of garbage. Initialize it to an invalid value so we can tell if we ever got a valid login_id. 3: The node ID and generation may have changed but the new values may not yet have been recorded in lu and tgt when the final logout is attempted. Use the latest values from the device in sbp2_release_target(). Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-26 10:27:01 +01:00
Stefan Richter	0dcfeb7e3c	firewire: fw-sbp2: delay first login to avoid retries This optimizes firewire-sbp2's device probe for the case that the local node and the SBP-2 node were discovered at the same time. In this case, fw-core's bus management work and fw-sbp2's login and SCSI probe work are scheduled in parallel (in the globally shared workqueue and in fw-sbp2's workqueue, respectively). The bus reset from fw-core may then disturb and extremely delay the login and SCSI probe because the latter fails with several command timeouts and retries and has to be retried from scratch. We avoid this particular situation of sbp2_login() and fw_card_bm_work() running in parallel by delaying the first sbp2_login() a little bit. This is meant to be a short-term fix for https://bugzilla.redhat.com/show_bug.cgi?id=466679. In the long run, the SCSI probe, i.e. fw-sbp2's call of __scsi_add_device(), should be parallelized with sbp2_reconnect(). Problem reported and fix tested and confirmed by Alex Kanavin. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-26 10:27:01 +01:00
Stefan Richter	7007a0765e	firewire: fw-ohci: initialization failure path fixes Fix leaks when pci_probe fails. Simplify error log strings. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-26 10:27:00 +01:00
Jay Fenlason	a55709ba9d	firewire: fw-ohci: don't leak dma memory on module removal The transmit and receive context dma memory was not being freed on module removal. Neither was the config rom memory. Fix that. The ab->next assignment is pure paranoia. Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-26 10:27:00 +01:00
Jay Fenlason	77e5571917	firewire: fix struct fw_node memory leak With the bus_resets patch applied, it is easy to see this memory leak by repeatedly resetting the firewire bus while running slabtop in another window. Just watch kmalloc-32 grow and grow... Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-26 10:27:00 +01:00
Jay Fenlason	4f9740d4f5	firewire: Survive more than 256 bus resets The "color" is used during the topology building after a bus reset, hovever in "struct fw_node"s it is stored in a u8, but in struct fw_card it is stored in an int. When the value wraps in one struct, but not the other, disaster strikes. Signed-off-by: Jay Fenlason <fenlason@redhat.com> Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10922. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-26 10:26:59 +01:00
Stefan Richter	99692f71ee	firewire: fix ioctl() return code Reported by Jay Fenlason: ioctl() did not return as intended - the size of data read into ioctl_send_request, - the number of datagrams enqueued by ioctl_queue_iso. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-15 22:21:10 +02:00
Stefan Richter	7a1003449c	firewire: fix setting tag and sy in iso transmission Reported by Jay Fenlason: The iso packet control accessors in fw-cdev.c had bogus masks. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-15 22:21:10 +02:00
Stefan Richter	4bbc1bdd01	firewire: fw-sbp2: fix another small generation access bug queuecommand() looked at the remote and local node IDs before it read the bus generation. The corresponding race with sbp2_reconnect updating these data was probably impossible to happen though because the current code blocks the SCSI layer during reconnection. However, better safe than sorry, especially if someone later improves the code to not block the SCSI layer. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-15 22:21:10 +02:00
Stefan Richter	09b12dd4e3	firewire: fw-sbp2: enforce s/g segment size limit 1. We don't need to round the SBP-2 segment size limit down to a multiple of 4 kB (0xffff -> 0xf000). It is only necessary to ensure quadlet alignment (0xffff -> 0xfffc). 2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure and the block IO layer about the restriction. This way we can remove the size checks and segment splitting in the queuecommand path. This assumes that no other code in the firewire stack uses dma_map_sg() with conflicting requirements. It furthermore assumes that the controller device's platform actually allows us to set the segment size to our liking. Assert the latter with a BUG_ON(). 3. Also use blk_queue_max_segment_size() to tell the block IO layer about it. It cannot know it because our scsi_add_host() does not point to the FireWire controller's device. Thanks to Grant Grundler and FUJITA Tomonori for advice. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-10-15 22:21:10 +02:00
Jay Fenlason	1e119fa995	firewire: fw_send_request_sync() Share code between fw_send_request + wait_for_completion callers. Signed-off-by: Jay Fenlason <fenlason@redhat.com> Addendum: Removes an unnecessary struct and an ununsed retry loop. Calls it fw_run_transaction() instead of fw_send_request_sync(). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Kristian Høgsberg <krh@redhat.com>	2008-10-15 22:21:09 +02:00
Stefan Richter	30b0aa7c9a	firewire: Kconfig help update Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-08-19 18:47:56 +02:00
Linus Torvalds	a14ad05f47	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: Preserve response data alignment bug when it is harmless	2008-08-06 12:03:43 -07:00
David Moore	8401d92ba4	firewire: Preserve response data alignment bug when it is harmless Recently, a bug having to do with the alignment of transaction response data was fixed. However, some apps such as libdc1394 relied on the presence of that bug in order to function correctly. In order to stay compatible with old versions of those apps, this patch preserves the bug in cases where it is harmless to normal operation (such as the single quadlet read) due to a simple duplication of data. This guarantees maximum compatability for those users who are using the old app with the fixed kernel. Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-08-02 20:03:49 +02:00
Linus Torvalds	837b41b5de	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: state userland requirements in Kconfig help firewire: avoid memleak after phy config transmit failure firewire: fw-ohci: TSB43AB22/A dualbuffer workaround firewire: queue the right number of data firewire: warn on unfinished transactions during card removal firewire: small fw_fill_request cleanup firewire: fully initialize fw_transaction before marking it pending firewire: fix race of bus reset with request transmission	2008-07-27 10:24:06 -07:00
FUJITA Tomonori	8d8bb39b9e	dma-mapping: add the device argument to dma_mapping_error() Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
Stefan Richter	f05e21b39f	firewire: state userland requirements in Kconfig help Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-25 20:10:33 +02:00
Stefan Richter	c0220d686b	firewire: avoid memleak after phy config transmit failure Use only statically allocated data for PHY config packet transmission. With the previous incarnation, some data wouldn't be freed if the packet transmit callback was never called. A theoretical drawback now is that, in PCs with more than one card, card A may complete() for a waiter on card B. But this is highly unlikely and its impact not serious. Bus manager B may reset bus B before the PHY config went out, but the next phy config on B should be fine. However, with a timeout of 100ms, this situation is close to impossible. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-25 20:10:32 +02:00
Stefan Richter	95984f62c9	firewire: fw-ohci: TSB43AB22/A dualbuffer workaround Isochronous reception in dualbuffer mode is reportedly broken with TI TSB43AB22A on x86-64. Descriptor addresses above 2G have been determined as the trigger: https://bugzilla.redhat.com/show_bug.cgi?id=435550 Two fixes are possible: - pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK); at least when IR descriptors are allocated, or - simply don't use dualbuffer. This fix implements the latter workaround. But we keep using dualbuffer on x86-32 which won't give us highmen (and thus physical addresses outside the 31bit range) in coherent DMA memory allocations. Right now we could for example also whitelist PPC32, but DMA mapping implementation details are expected to change there. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-07-25 15:41:23 +02:00
JiSheng Zhang	f9543d0ab6	firewire: queue the right number of data There will be 4 padding bytes in struct fw_cdev_event_response on some platforms The member:__u32 data will point to these padding bytes. While queue the response and data in complete_transaction in fw-cdev.c, it will queue like this: \|response(excluding padding bytes)\|4 padding bytes\|4 padding bytes\|data. It queue 4 extra bytes. That is to say it use "&response + sizeof(response)" while other place of kernel and userspace library use "&response + offsetof (typeof(response), data)". So it will lost the last 4 bytes of data. This patch can fix it while not changing the struct definition. Signed-off-by: JiSheng Zhang <jszhang3@mail.ustc.edu.cn> This fixes responses to outbound block read requests on 64bit architectures. Tested on i686, x86-64, and x86-64 with i686 userland, using firecontrol and gscanbus. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-20 15:25:03 +02:00
Linus Torvalds	22a37bcb78	Merge branch 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: ieee1394: sbp2: spin disks down on suspend and shutdown firewire: fw-sbp2: spin disks down on suspend and shutdown ieee1394: sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares firewire: fw-sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares scsi: sd: optionally set power condition in START STOP UNIT	2008-07-15 12:39:44 -07:00
Stefan Richter	1e8afea124	firewire: warn on unfinished transactions during card removal After card->done and card->work are completed, any remaining pending request would be a bug. We cannot safely complete a transaction at that point anymore. IOW card users must not drop their last fw_card reference (usually indirect references through fw_device references) before their last outbound transaction through that card was finished. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:04 +02:00
Stefan Richter	b9549bc680	firewire: small fw_fill_request cleanup - better name for a function argument - removal of a local variable which became unnecessary after "fully initialize fw_transaction before marking it pending" Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:04 +02:00
Stefan Richter	e9aeb46c93	firewire: fully initialize fw_transaction before marking it pending In theory, card->flush_timer could already access a transaction between fw_send_request()'s spin_unlock_irqrestore and the rest of what happens in fw_send_request(). This would happen if the process which sends the request is preempted and put to sleep right after spin_unlock_irqrestore for longer than 100ms. Therefore we fill in everything in struct fw_transaction at which the flush_timer might look at before we lift the lock. To do: Ensure that the timer does not pick up the transaction before the time of the AT request event plus split transaction timeout. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:04 +02:00
Stefan Richter	792a61021c	firewire: fix race of bus reset with request transmission Reported by Jay Fenlason: A bus reset tasklet may call fw_flush_transactions and touch transactions (call their callback which will free them) while the context which submitted the transaction is still inserting it into the transmission queue. A simple solution to this problem is to _not_ "flush" the transactions because of a bus reset (complete the transcations as 'cancelled'). They will now simply time out (completed as 'cancelled' by the split-timeout timer). Jay Fenlason thought of this fix too but I was quicker to type it out. :-) Background: Contexts which access an instance of struct fw_transaction are: 1. the submitter, until it inserted the packet which is embedded in the transaction into the AT req DMA, 2. the AsReqTrContext tasklet when the request packet was acked by the responder node or transmission to the responder failed, 3. the AsRspRcvContext tasklet when it found a request which matched an incoming response, 4. the card->flush_timer when it picks up timed-out transactions to cancel them, 5. the bus reset tasklet when it cancels transactions (this access is eliminated by this patch), 6. a process which shuts down an fw_card (unregisters it from fw-core when the controller is unbound from fw-ohci) --- although in this case there shouldn't really be any transactions anymore because we wait until all card users finished their business with the card. All of these contexts run concurrently (except for the 6th, presumably). The 1st is safe against the 2nd and 3rd because of the way how a request packet is carefully submitted to the hardware. A race between 2nd and 3rd has been fixed a while ago (bug 9617). The 4th is almost safe against 1st, 2nd, 3rd; there are issues with it if huge scheduling latencies occur, to be fixed separately. The 5th looks safe against 2nd, 3rd, and 4th but is unsafe against 1st. Maybe this could be fixed with an explicit state variable in struct fw_transaction. But this would require fw_transaction to be rewritten as only dynamically allocatable object with reference counting --- not a good solution if we also can simply kill this 5th accessing context (replace it by the 4th). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:04 +02:00
Stefan Richter	a7ea67823a	firewire: don't respond to broadcast write requests Contrary to a comment in the source, request->ack of a broadcast write request can be ACK_PENDING. Hence the existing check is insufficient. Debug dmesg before: AR spd 0 tl 00, ffc0 -> ffff, ack_pending , QW req, fffff0000234 = ffffffff AT spd 0 tl 00, ffff -> ffc0, ack_complete, W resp And the requesting node (linux1394) reports an unsolicited response. Debug dmesg after: AR spd 0 tl 00, ffc0 -> ffff, ack_pending , QW req, fffff0000234 = ffffffff Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:03 +02:00
Stefan Richter	459f79235d	firewire: clean up fw_card reference counting This is a functionally equivalent replacement of the current reference counting of struct fw_card instances. It only converts it to common idioms as suggested by Kristian Høgsberg: - struct kref replaces atomic_t as the counter. - wait_for_completion is used to wait for all card users to complete. BTW, it may make sense to count card->flush_timer and card->work as card users too. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:03 +02:00
Stefan Richter	2147ef204f	firewire: clean up some includes Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:03 +02:00
Stefan Richter	bbf094cf3d	firewire: remove unused struct members Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:03 +02:00
Stefan Richter	e534fe16b9	firewire: implement broadcast_channel CSR for 1394a compliance See IEEE 1394a clause 8.3.2.3.11. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-07-14 13:06:03 +02:00
Stefan Richter	2635f96f90	firewire: fw-sbp2: spin disks down on suspend and shutdown This instructs sd_mod to send START STOP UNIT on suspend and resume, and on driver unbinding or unloading (including when the system is shut down). We don't do this though if multiple initiators may log in to the target. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Tested-by: Tino Keitel <tino.keitel@gmx.de>	2008-07-14 13:00:18 +02:00
Stefan Richter	ffcaade310	firewire: fw-sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares Reported by Tino Keitel: PL-3507 with firmware from Prolific does not spin down the disk on START STOP UNIT with power condition = 0 and start = 0. It does however work with power condition = 2 or 3. Also found while investigating this: DViCO Momobay CX-1 and FX-3A (TI TSB42AA9/A based) become unresponsive after START STOP UNIT with power condition = 0 and start = 0. They stay responsive if power condition is set when stopping the motor. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Tested-by: Tino Keitel <tino.keitel@gmx.de>	2008-07-14 13:00:17 +02:00
Richard Sharpe	0e3e2eabf4	firewire: fw-sbp2: fix parsing of logical unit directories There is a small off-by-one bug in firewire-sbp2. This causes problems when a device exports multiple LUN Directories. I found it when trying to talk to a SONY DVD Jukebox. Signed-off-by: Richard Sharpe <realrichardsharpe@gmail.com> Acked-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (op. order, changelog)	2008-06-27 20:55:00 +02:00
Stefan Richter	a7b64b8704	firewire: Kconfig menu touch-up Emphasize the recommendation to build only one stack. Trim the prompts to better fit into short attention spans. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-06-19 00:12:35 +02:00
Stefan Richter	ae1e535579	firewire: deadline for PHY config transmission If the low-level driver failed to initialize a card properly without noticing it, fw-core was blocked indefinitely when trying to send a PHY config packet. This hung up the events kernel thread, e.g. locked up keyboard input. https://bugzilla.redhat.com/show_bug.cgi?id=444694 https://bugzilla.redhat.com/show_bug.cgi?id=446763 This problem was introduced between 2.6.25 and 2.6.26-rc1 by commit `2a0a259049` "firewire: wait until PHY configuration packet was transmitted (fix bus reset loop)". The solution is to wait with timeout. I tested it with 7 different working controllers and 1 non-working controller. On the working ones, the packet callback complete()s usually --- but not always --- before a timeout of 10ms. Hence I chose a safer timeout of 100ms. On the few tests with the non-working controller ALi M5271, PHY config packet transmission always timed out so far. (Fw-ohci needs to be fixed for this controller independently of this deadline fix. Often the core doesn't even attempt to send a phy config because not even self ID reception works.) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-06-19 00:12:35 +02:00
Stefan Richter	161b96e782	firewire: fw-ohci: unify printk prefixes The messages which can be enabled by fw-ohci's debug module parameter are changed from KERN_DEBUG to KERN_NOTICE level and uniformly prefixed with "firewire_ohci: ". This further simplifies communication with users when we ask them to capture debug messages. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-06-19 00:12:35 +02:00
Stefan Richter	5cb84067d6	firewire: fill_bus_reset_event needs lock protection Callers of fill_bus_reset_event() have to take card->lock. Otherwise access to node data may oops if node removal is in progress. A lockless alternative would be - event->local_node_id = card->local_node->node_id; + tmp = fw_node_get(card->local_node); + event->local_node_id = tmp->node_id; + fw_node_put(tmp); and ditto with the other node pointers which fill_bus_reset_event() accesses. But I went the locked route because one of the two callers already holds the lock. As a bonus, we don't need the memory barrier anymore because device->generation and device->node_id are written in a card->lock protected section. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Kristian Høgsberg <krh@redhat.com>	2008-06-19 00:12:35 +02:00
Stefan Richter	affc9c24ad	firewire: fw-ohci: write selfIDBufferPtr before LinkControl.rcvSelfID OHCI 1.1 clause 5.10 requires that selfIDBufferPtr is valid when a 1 is written into LinkControl.rcvSelfID. This driver bug has so far not been known to cause harm because most chips obviously accept a later selfIDBufferPtr write, at least before HCControl.linkEnable is written. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Kristian Høgsberg <krh@redhat.com>	2008-06-19 00:12:35 +02:00
Stefan Richter	e896ec4302	firewire: fw-ohci: disable PHY packet reception into AR context We want the rcvPhyPkt bit in LinkControl off before we start using the chip. However, the spec says that the reset value of it is undefined. Hence switch it explicitly off. https://bugzilla.redhat.com/show_bug.cgi?id=244576#c48 shows that for example the nForce2 integrated FireWire controller seems to have it on by default. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-06-19 00:12:34 +02:00
Stefan Richter	ccff962943	firewire: fw-ohci: use of uninitialized data in AR handler header_length and payload_length are filled with random data if an unknown tcode was read from the AR buffer (i.e. if the AR buffer contained invalid data). We still need a better strategy to recover from this, but at least handle_ar_packet now doesn't return out of bound buffer addresses anymore. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-06-19 00:12:34 +02:00
Stefan Richter	0bf607c5b4	firewire: don't panic on invalid AR request buffer BUG() at this place is wrong. (Unless if the low level driver would already do higher-level input validation of incoming request headers.) Invalid incoming requests or bugs in the controller which corrupt the AR-req buffer needlessly crashed the box because this is run in tasklet context. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-06-19 00:12:34 +02:00
Jay Fenlason	551f4cb9de	firewire: prevent userspace from accessing shut down devices If userspace ignores the POLLERR bit from poll(), and only attempts to read() the device when POLLIN is set, it can still make ioctl() calls on a device that has been removed from the system. The node_id and generation returned by GET_INFO will be outdated, but INITIATE_BUS_RESET would still cause a bus reset, and GET_CYCLE_TIMER will return data. And if you guess the correct generation to use, you can send requests to a different device on the bus, and get responses back. This patch prevents open, ioctl, compat_ioctl, and mmap against shutdown devices. Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-05-20 18:24:17 +02:00
Linus Torvalds	d626e3bf72	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: [SCSI] aic94xx: fix section mismatch [SCSI] u14-34f: Fix 32bit only problem [SCSI] dpt_i2o: sysfs code [SCSI] dpt_i2o: 64 bit support [SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent [SCSI] dpt_i2o: use standard __init / __exit code [SCSI] megaraid_sas: fix suspend/resume sections [SCSI] aacraid: Add Power Management support [SCSI] aacraid: Fix jbod operations scan issues [SCSI] aacraid: Fix warning about macro side-effects [SCSI] add support for variable length extended commands [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer [SCSI] bsg: add large command support [SCSI] aacraid: Fix down_interruptible() to check the return value correctly [SCSI] megaraid_sas; Update the Version and Changelog [SCSI] ibmvscsi: Handle non SCSI error status [SCSI] bug fix for free list handling [SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions [SCSI] megaraid_mbox: fix Dell CERC firmware problem	2008-05-02 13:52:35 -07:00
Boaz Harrosh	64a87b244b	[SCSI] Let scsi_cmnd->cmnd use request->cmd buffer - struct scsi_cmnd had a 16 bytes command buffer of its own. This is an unnecessary duplication and copy of request's cmd. It is probably left overs from the time that scsi_cmnd could function without a request attached. So clean that up. - Once above is done, few places, apart from scsi-ml, needed adjustments due to changing the data type of scsi_cmnd->cmnd. - Lots of drivers still use MAX_COMMAND_SIZE. So I have left that #define but equate it to BLK_MAX_CDB. The way I see it and is reflected in the patch below is. MAX_COMMAND_SIZE - means: The longest fixed-length () SCSI CDB as per the SCSI standard and is not related to the implementation. BLK_MAX_CDB. - The allocated space at the request level - I have audit all ISA drivers and made sure none use ->cmnd in a DMA Operation. Same audit was done by Andi Kleen. ()fixed-length here means commands that their size can be determined by their opcode and the CDB does not carry a length specifier, (unlike the VARIABLE_LENGTH_CMD(0x7f) command). This is actually not exactly true and the SCSI standard also defines extended commands and vendor specific commands that can be bigger than 16 bytes. The kernel will support these using the same infrastructure used for VARLEN CDB's. So in effect MAX_COMMAND_SIZE means the maximum size command scsi-ml supports without specifying a cmd_len by ULD's Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-05-02 10:18:22 -05:00
Linus Torvalds	886c35fbcf	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: fw-sbp2: log scsi_target ID at release ieee1394: fix NULL pointer dereference in sysfs access	2008-05-01 11:31:38 -07:00
Stefan Richter	f32ddaddf9	firewire: fw-sbp2: log scsi_target ID at release Makes the good-by message more informative. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-05-01 19:55:24 +02:00
Matthew Wilcox	6188e10d38	Convert asm/semaphore.h users to linux/semaphore.h Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2008-04-18 22:22:54 -04:00
Matthew Wilcox	d3135846f6	drivers: Remove unnecessary inclusions of asm/semaphore.h None of these files use any of the functionality promised by asm/semaphore.h. It's possible that they rely on it dragging in some unrelated header file, but I can't build all these files, so we'll have fix any build failures as they come up. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2008-04-18 22:16:32 -04:00
Adrian Bunk	db8be076ca	firewire: cleanups This patch contains the following cleanups: - #if 0 the following unused structs: - fw-transaction.c:fw_low_memory_region - fw-transaction.c:fw_private_region - fw-transaction.c:fw_csr_region - fw-transaction.c:fw_unit_space_region - remove the following unused EXPORT_SYMBOL's: - fw-card.c:fw_core_add_descriptor - fw-card.c:fw_core_remove_descriptor - fw-iso.c:fw_iso_context_create - fw-iso.c:fw_iso_context_destroy - fw-iso.c:fw_iso_context_start - fw-iso.c:fw_iso_context_queue - fw-iso.c:fw_iso_context_stop Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:37 +02:00
Stefan Richter	25b1c3d888	firewire: fix synchronization of gap counts Fix: The fact that nodes had different gap counts would be overlooked if the bus manager code would pick gap count 63 because of beta repeaters or because of very large hop counts. In this case, the bus manager code would miss that it actually has to send the PHY config packet with gap count 63. Related trivial changes: Use bool for an int used as bool, touch up some comments. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:36 +02:00
Stefan Richter	2a0a259049	firewire: wait until PHY configuration packet was transmitted (fix bus reset loop) We now exit fw_send_phy_config /after/ the PHY config packet has been transmitted, instead of before. A subsequent fw_core_initiate_bus_reset will therefore not overlap with the transmission. This is meant to make the send PHY config packet + reset bus routine more deterministic. Fixes bus reset loop and eventual panic with - VIA VT6307 + IOGEAR hub + Unibrain Fire-i camera http://bugzilla.kernel.org/show_bug.cgi?id=10128 - JMicron card Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:36 +02:00
Stefan Richter	e09770db0f	firewire: remove unused struct member request_generation is internal to fw-ohci and unneeded in fw_card. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:36 +02:00
Jarod Wilson	15f0d833f6	firewire: use bitwise and to get reg in handle_registers for code efficiency. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:36 +02:00
Jarod Wilson	cca6097713	firewire: replace more hex values with defined csr constants Trivial change to replace more meaningless (to the untrained eye) hex values with defined CSR constants. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:36 +02:00
Stefan Richter	c9755e14a0	firewire: reread config ROM when device reset the bus When a device changes its configuration ROM, it announces this with a bus reset. firewire-core has to check which node initiated a bus reset and whether any unit directories went away or were added on this node. Tested with an IOI FWB-IDE01AB which has its link-on bit set if bus power is available but does not respond to ROM read requests if self power is off. This implements - recognition of the units if self power is switched on after fw-core gave up the initial attempt to read the config ROM, - shutdown of the units when self power is switched off. Also tested with a second PC running Linux/ieee1394. When the eth1394 driver is inserted and removed on that node, fw-core now notices the addition and removal of the IPv4 unit on the ieee1394 node. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:36 +02:00
Stefan Richter	1dadff71d6	firewire: replace static ROM cache by allocated cache read_bus_info_block() is repeatedly called by workqueue jobs. These will step on each others toes eventually if there are multiple workqueue threads, and we end up with corrupt config ROM images. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:35 +02:00
Stefan Richter	d34316a4bd	firewire: fw-ohci: work around generation bug in TI controllers (fix AV/C and more) Unlike the ohci1394 driver, fw-ohci uses the selfIDGeneration field of bus reset packets to determine the generation of incoming requests as per OHCI 1.1 clause 8.4.2.3. This is more precise --- provided that the controller inserts the correct generation. Texas Instruments chips often don't. This prevented the transmission of response packets, which for example broke AV/C transactions as used when communicating with miniDV cameras and any other AV/C devices. There is apparently no way to detect and adjust incorrect generations. Therefore we ignore the generation of bus reset packets from TI chips and use the generation of the self ID buffer instead. Alas this is received at a slightly wrong time. In rare cases, this could cause us to not respond to legitimate requests or to respond to expired requests. (The latter is less likely because the bus reset packet AR event is typically handled before the self ID complete event.) Bug reported by Mladen Kuntner, who was extraordinarily patient while dealing with the driver maintainers. Fix confirmed to be required and effective for TSB82AA2 and a TSB43AB22 or TSB43AB22A. https://bugzilla.redhat.com/show_bug.cgi?id=243081 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:35 +02:00
Stefan Richter	08ddb2f4c2	firewire: fw-ohci: extend logging of bus generations and node ID Extend the logging of "AR evt_bus_reset, link internal" to "AR evt_bus_reset, generation ${selfIDGeneration}". That way we can check whether this generation matches the one seen in self ID complete event logging. See OHCI 1.1 clause 8.4.2.3. Also extend logging of "firewire_ohci: * selfIDs, generation " by "local node ID ffc" in self ID logging to make the local node in AT/AR event logs more obvious. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:35 +02:00
Stefan Richter	a007bb857e	firewire: fw-ohci: conditionally log busReset interrupts Add a debug option to watch bus reset interrupt events. Half of this patch is taken from Jarod Wilson's first version of the JMicron fix. BusReset interrupts are only generated if the respective module parameter flag was set before the controller is being initialized. Else we keep this event masked to reduce IRQ load in normal operation and to avoid potential problems with buggy chips. Note, this is unlike the other IRQ events whose logging can be enabled any time after chip initialization. This and the influence on what interrupts the chip generates is why I added an extra flag for it. Also, reorder the debug parameter flags according to their perceived usefulness. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:35 +02:00
Jarod Wilson	76f73ca1b2	firewire: fw-ohci: don't append to AT context when it's not active I finally tracked down the issues with this JMicron PCI-e card in my possession to a failure to comply with section 7.2.3.2 of the OHCI 1.1 specification (thanks to Kristian for the pointer to illustrate that it is indeed a flaw in this card, not the driver). The controller should simply flush the packets we've appended to its AT queue if a bus reset occurs before they've been transmitted and we'll try again, but something goes wrong and the controller winds up hung. However, we can avoid the problem by simply checking if the IntEvent.busReset register had been set before we try appending to the AT context. When busReset is set, the AT context is completely halted until busReset is cleared, so there's no point in appending AT packets until the register is cleared. So at_context_queue_packet() now checks for busReset being set, and bails with an RCODE_GENERATION packet ack, which results in us trying to append the packet again after recognizing the fact there has been a bus reset, and clearing busReset. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:35 +02:00
Jarod Wilson	75f7832e3b	firewire: fw-ohci: log regAccessFail events While trying to debug this piece of crap JMicron PCI-e controller in my possession, one thought was that perhaps I was encountering register access failures. I'm not, but logging them would be good, so we can see if they are a real problem we should be taking into account anywhere in the code. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (added list contact)	2008-04-18 17:55:34 +02:00
Jarod Wilson	022147242f	firewire: fw-ohci: make sure HCControl register LPS bit is set I've now witnessed multiple occasions where one of my controllers (a very poorly working JMicron PCIe card) fails to get its registers properly set up in ohci_enable(), apparently due to an occasionally very slow to initiate SClk. The easy fix for this problem is to add a tiny while loop to try again a time or three after initially enabling LPS before we move on (or give up). Of course, the card still isn't fully functional yet, but this gets it at least one tiny step closer... Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:34 +02:00
Stefan Richter	130d5496e2	firewire: fw-ohci: missing PPC PMac feature calls in failure path Balance ohci_pmac_on and ohci_pmac_off if pci_driver.probe fails. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:34 +02:00
Stefan Richter	43286568ad	firewire: fw-ohci: untangle a mixed unsigned/signed expression and make another expression more readable. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:34 +02:00
Stefan Richter	ad3c0fe8b8	firewire: debug interrupt events This adds debug printks for asynchronous transmission and reception and for self ID reception. They can be enabled at module load time, and at runtime via /sys/module/firewire_ohci/parameters/debug. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Also added: Logging of interrupt event codes and of cancelled AT packets. The code now depends on a Kconfig variable. This makes it easier to build firewire-ohci without the feature or to make it an option in the future. The variable is currently hidden and always on. This feature inflates firewire-ohci.ko by 7 kB = 27% on x86-64 and by 4 kB = 23% on i686. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:34 +02:00
Stefan Richter	016bf3dfcf	firewire: fw-ohci: catch self_id_count == 0 fw_core_handle_bus_reset() incorrectly relied on the assumption that self_id_count > 0. We check early in fw-ohci and discard the self ID complete event if self_id_count == 0 because a valid event always has at least one self ID packet in it (the one of the local node). Hence treat self_id_count == 0 like any other kind of invalid self ID buffer. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:34 +02:00
Stefan Richter	c8a9a498e1	firewire: fw-ohci: add self ID error check Discard self ID buffer contents if - the selfIDError flag is set, - any of the self ID packets has bit errors. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:33 +02:00
Stefan Richter	2ed0f181f0	firewire: fw-ohci: refactor probe, remove, suspend, resume Clean up shared code and variable names. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:33 +02:00
Stefan Richter	eb5ca72eff	firewire: fw-ohci: switch on bus power after resume on PPC PMac The platform feature calls in the suspend method switched off cable power, but the calls in the resume method did not switch it back on. Add the necessary feature call to .resume. Also add the corresponding call to .suspend to make .suspend's behavior explicitly the same on all PMacs. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:33 +02:00
Stefan Richter	080de8c2c5	firewire: fw-ohci: add option for remote debugging This way firewire-ohci can be used for remote debugging like ohci1394. Version with amendment from Fri, 11 Apr 2008 00:08:08 +0200. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Bernhard Kaindl <bk@suse.de>	2008-04-18 17:55:33 +02:00
Jarod Wilson	17cff9ff87	firewire: fw-sbp2: set dual-phase cycle_limit Try to write dual-phase retry protocol limits to BUSY_TIMEOUT register. - The dual-phase retry protocol is optional to implement, and if not supported, writes to the dual-phase portion of the register will be ignored. We try to write the original 1394-1995 default here. - In the case of devices that are also SBP-3-compliant, all writes are ignored, as the register is read-only, but contains single-phase retry of 15, which is what we're trying to set for all SBP-2 device anyway, so this write attempt is safe and yields more consistent behavior for all devices. See section 8.3.2.3.5 of the 1394-1995 spec, section 6.2 of the SBP-2 spec, and section 6.4 of the SBP-3 spec for further details. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:33 +02:00
Stefan Richter	a5fd9ec7a2	firewire: fw-sbp2: reduce log noise The block/unblock logic is now sufficiently tested. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:32 +02:00
Stefan Richter	6f73100cbb	firewire: fw-sbp2: remove unnecessary memset orb came from kzalloc. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:32 +02:00
Stefan Richter	0d7dcbf2a3	firewire: fw-sbp2: simplify some macros How hard can it be to switch on one bit? :-) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:32 +02:00
Stefan Richter	71ee9f01f2	firewire: fw-sbp2: remove usages of fw_memcpy_to_be32 Write directly in big endian instead of byte-swapping after the fact. This saves a few conversions, lets gcc use constant endianess conversions where possible, and enables deeper endianess annotation. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:32 +02:00
Stefan Richter	8ac3a47cab	firewire: fw-sbp2: relax SCSI DMA alignment Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-04-18 17:55:32 +02:00
Stefan Richter	1dc3bea78b	firewire: refactor fw_unit reference counting Add wrappers for getting and putting a unit. Remove some line breaks. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:32 +02:00
Stefan Richter	7c1fca3366	firewire: fw-sbp2: fix reference counting The reference count of the unit dropped too low in an error path in sbp2_probe. Fixed by moving the _get further up. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:31 +02:00
Stefan Richter	bd7dee6311	firewire: remove superfluous reference counting The card->kref became obsolete since patch "firewire: fix crash in automatic module unloading" added another counter of card users. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-04-18 17:55:31 +02:00
Jarod Wilson	6b84236d37	firewire: fw-ohci: plug dma memory leak in AR handler There's an ugly little memory leak in firewire-ohci's ar_context_tasklet(), where we're not freeing up some of the memory we use for each ar_buffer, due to a moving pointer. The problem has been there for a while, but didn't get noticed until after converting the AR routines over to use coherent DMA and I started running into I/O stall- outs with the following message output repeatedly to the console: PCI-DMA: Out of IOMMU space for 53248 bytes at device 0000:04:09.0 Plugging this leak is definitely necessary, but unfortunately, isn't the entire answer to my problem, it only increases the amount of I/O that I can do before hitting the problem. Still working on tracking down the root cause.. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-27 21:01:14 +01:00
Stefan Richter	10a4c73551	firewire: fix panic in handle_at_packet This fixes a use-after-free bug in the handling of split transactions. The AT DMA handler of the request was occasionally executed after the AR DMA handler of the response. The AT DMA handler then accessed an already freed packet. Reported by Johannes Berg. http://bugzilla.kernel.org/show_bug.cgi?id=9617 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Tested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-03-20 18:13:05 +01:00
Stefan Richter	f5101d58af	firewire: fw-ohci: shut up false compiler warning on PPC32 Shut up "may be used uninitialised in this function" warnings due to PPC32's implementation of dma_alloc_coherent(). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-14 00:57:00 +01:00
Jarod Wilson	bde1709aaa	firewire: fw-ohci: use dma_alloc_coherent for ar_buffer Currently, we do nothing to guarantee we have a consistent DMA buffer for asynchronous receive packets. Rather than doing several sync's following a dma_map_single() to get consistent buffers, just switch to using dma_alloc_coherent(). Resolves constant buffer failures on my own x86_64 laptop w/4GB of RAM and likely to fix a number of other failures witnessed on x86_64 systems with 4GB of RAM or more. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-14 00:57:00 +01:00
Stefan Richter	2aa9ff7fc5	firewire: fw-sbp2: fix for SYM13FW500 bridge (Datafab disk) Fix I/O errors due to SYM13FW500's inability to handle larger request sizes. Reported by Piergiorgio Sartor <piergiorgio.sartor@nexgo.de> in https://bugzilla.redhat.com/show_bug.cgi?id=436879 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-03-14 00:56:59 +01:00
Stefan Richter	0a8da30dc7	firewire: update Kconfig help text Remove some less necessary information, point out that video1394 and dv1394 should be blacklisted along with ohci1394. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-14 00:56:59 +01:00
Stefan Richter	a2cdebe33f	firewire: warn on fatal condition in topology code If this ever happens to anybody, we want to have it in his log. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-14 00:56:59 +01:00
Jarod Wilson	51f9dbef5b	firewire: fw-sbp2: set single-phase retry_limit Per the SBP-2 specification, all SBP-2 target devices must have a BUSY_TIMEOUT register. Per the 1394-1995 specification, the retry_limt portion of the register should be set to 0x0 initially, and set on the target by a logged in initiator (i.e., a Linux host w/firewire controller(s)). Well, as it turns out, lots of devices these days have actually moved on to starting to implement SBP-3 compliance, which says that retry_limit should default to 0xf instead (yes, SBP-3 stomps directly on 1394-1995, oops). Prior to this change, the firewire driver stack didn't touch retry_limit, and any SBP-3 compliant device worked fine, while SBP-2 compliant ones were unable to retransmit when the host returned an ack_busy_X, which resulted in stalled out I/O, eventually causing the SCSI layer to give up and offline the device. The simple fix is for us to set retry_limit to 0xf in the register for all devices (which actually matches what the old ieee1394 stack did). Prior to this change, a hard disk behind an SBP-2 Prolific PL-3507 bridge chip would routinely encounter buffer I/O errors and wind up offlined by the SCSI layer. With this change, I've encountered zero I/O failures moving tens of GB of data around. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-14 00:56:59 +01:00
Stefan Richter	11bf20ad02	firewire: fw-ohci: Apple UniNorth 1st generation support Mostly copied from ohci1394.c. Necessary for some older Macs, e.g. PowerBook G3 Pismo and early PowerBook G4 Titanium. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-03-14 00:56:59 +01:00
Stefan Richter	ea8d006b91	firewire: fw-ohci: PPC PMac platform code Copied from ohci1394.c. This code is necessary to prevent machine check exceptions when reloading or resuming the driver. Tested on a 1st generation PowerBook G4 Titanium, which also needs the pci_probe() hunk. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> I was able to reproduce the system exception on resume with a 3rd-gen Titanium PowerBook G4 667, and this patch does let the system resume successfully now. Not quite clear if there was possibly an updated version coming using pci_enable_device() instead of the pair of pmac_call_feature() calls, but either way, this is a definite must-have, at least for older ppc macs -- my Aluminum PowerBook G4/1.67 suspends and resumes without this patch just fine. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-03-14 00:56:58 +01:00
Stefan Richter	efbf390a2d	firewire: endianess annotations Kills warnings from 'make C=1 CHECKFLAGS="-D__CHECK_ENDIAN__" modules': drivers/firewire/fw-transaction.c:771:10: warning: incorrect type in assignment (different base types) drivers/firewire/fw-transaction.c:771:10: expected unsigned int [unsigned] [usertype] <noident> drivers/firewire/fw-transaction.c:771:10: got restricted unsigned int [usertype] <noident> drivers/firewire/fw-transaction.h:93:10: warning: incorrect type in assignment (different base types) drivers/firewire/fw-transaction.h:93:10: expected unsigned int [unsigned] [usertype] <noident> drivers/firewire/fw-transaction.h:93:10: got restricted unsigned int [usertype] <noident> drivers/firewire/fw-ohci.c:1490:8: warning: restricted degrades to integer drivers/firewire/fw-ohci.c:1490:35: warning: restricted degrades to integer drivers/firewire/fw-ohci.c:1516:5: warning: cast to restricted type Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-03-14 00:56:58 +01:00
Stefan Richter	25df287dc7	firewire: endianess fix The generation of incoming requests was filled in in wrong byte order on machines with big endian CPU. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-03-14 00:56:58 +01:00
Stefan Richter	855c603d61	firewire: fix crash in automatic module unloading "modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to result in crashes like this: BUG: unable to handle kernel paging request at ffffffff8807b455 IP: [<ffffffff8807b455>] PGD 203067 PUD 207063 PMD 7c170067 PTE 0 Oops: 0010 [1] PREEMPT SMP CPU 0 Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t] Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3 RIP: 0010:[<ffffffff8807b455>] [<ffffffff8807b455>] RSP: 0018:ffff81007dcdde88 EFLAGS: 00010246 RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13 RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388 RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388 R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040) Stack: ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80 ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000 Call Trace: [<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df [<ffffffff8023be87>] ? worker_thread+0xd8/0xe3 [<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e [<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3 [<ffffffff8023e813>] ? kthread+0x47/0x74 [<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020c008>] ? child_rip+0xa/0x12 [<ffffffff8020b6e3>] ? restore_args+0x0/0x3d [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171 [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171 [<ffffffff8023e7cc>] ? kthread+0x0/0x74 [<ffffffff8020bffe>] ? child_rip+0x0/0x12 Code: Bad RIP value. RIP [<ffffffff8807b455>] RSP <ffff81007dcdde88> CR2: ffffffff8807b455 ---[ end trace c7366c6657fe5bed ]--- Note that this crash happened _after_ firewire-core was unloaded. The shared workqueue tried to run firewire-core's device initialization jobs or similar jobs. The fix makes sure that firewire-ohci and hence firewire-core is not unloaded before all device shutdown jobs have been completed. This is determined by the count of device initializations minus device releases. Also skip useless retries in the node initialization job if the node is to be shut down. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-03-02 12:35:46 +01:00
Stefan Richter	15803478fd	firewire: potentially invalid pointers used in fw_card_bm_work The bus management workqueue job was in danger to dereference NULL pointers. Also, after having temporarily lifted card->lock, a few node pointers and a device pointer may have become invalid. Add NULL pointer checks and get the necessary references. Also, move card->local_node out of fw_card_bm_work's sight during shutdown of the card. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-03-02 12:35:46 +01:00
Stefan Richter	f8436158b1	firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device" had the unintended effect that firewire-sbp2 could not be unloaded anymore until all SBP-2 devices were unplugged. We now fix the NULL pointer bug by reacquiring a reference to the sdev instead of holding a reference to the sdev (and to the module) all the time. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Tested-by: Jarod Wilson <jwilson@redhat.com>	2008-03-02 12:35:46 +01:00
Stefan Richter	fae6031214	firewire: fix NULL pointer deref. and resource leak By supplying ioctl()s in the wrong order, a userspace client was able to trigger NULL pointer dereferences. Furthermore, by calling ioctl_create_iso_context more than once, new contexts could be created without ever freeing the previously created contexts. Thanks to Anders Blomdell for the report. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-21 19:05:56 +01:00
Stefan Richter	33f1c6c352	firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device Fix a kernel bug when unplugging an SBP-2 device after having its scsi_device already removed via the "delete" sysfs attribute. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-19 19:57:23 +01:00
Stefan Richter	5513c5f6f9	firewire: fw-sbp2: fix NULL pointer deref. in slave_alloc Fix a kernel bug when running rescan-scsi-bus while a FireWire disk is connected: http://bugzilla.kernel.org/show_bug.cgi?id=10008 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-19 19:57:23 +01:00
Stefan Richter	2e2705bdcb	firewire: fw-sbp2: (try to) avoid I/O errors during reconnect While fw-sbp2 takes the necessary time to reconnect to a logical unit after bus reset, the SCSI core keeps sending new commands. They are all immediately completed with host busy status, and application clients or filesystems will break quickly. The SCSI device might even be taken offline: http://bugzilla.kernel.org/show_bug.cgi?id=9734 The only remedy seems to be to block the SCSI device until reconnect. Alas the SCSI core has no useful API to block only one logical unit i.e. the scsi_device, therefore we block the entire Scsi_Host. This currently corresponds to an SBP-2 target. In case of targets with multiple logical units, we need to satisfy the dependencies between logical units by carefully tracking the blocking state of the target and its units. We block all logical units of a target as soon as one of them needs to be blocked, and keep them blocked until all of them are ready to be unblocked. Furthermore, as the history of the old sbp2 driver has shown, the scsi_block_requests() API is a minefield with high potential of deadlocks. We therefore take extra measures to keep logical units unblocked during __scsi_add_device() and during shutdown. This avoids I/O errors during reconnect in many but alas not in all cases. There may still be errors after a re-login had to be performed. Also, some bridges have been seen to cease fetching management ORBs if I/O went on up until a bus reset. In these cases, all management ORBs time out after mgt_orb_timeout. The old sbp2 driver is less vulnerable or maybe not vulnerable to this, for as yet unknown reasons. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-19 19:57:23 +01:00
Stefan Richter	e80de3704a	firewire: fw-sbp2: enforce a retry of __scsi_add_device if bus generation changed fw-sbp2 is unable to reconnect while performing __scsi_add_device because there is only a single workqueue thread context available for both at the moment. This should be fixed eventually. An actual failure of __scsi_add_device is easy to handle, but an incomplete execution of __scsi_add_device with an sdev returned would remain undetected and leave the SBP-2 target unusable. Therefore we use a workaround: If there was a bus reset during __scsi_add_device (i.e. during the SCSI probe), we remove the new sdev immediately, log out, and attempt login and SCSI probe again. Tested-by: Jarod Wilson <jwilson@redhat.com> (earlier version) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-16 15:40:35 +01:00
Stefan Richter	7bb6bf7c8b	firewire: fw-sbp2: sort includes Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-16 15:40:35 +01:00
Stefan Richter	ce896d95cc	firewire: fw-sbp2: logout and login after failed reconnect If fw-sbp2 was too late with requesting the reconnect, the target would reject this. In this case, log out before attempting the reconnect. Else several firmwares will deny the re-login because they somehow didn't invalidate the old login. Also, don't retry reconnects in this situation. The retries won't succeed either. These changes improve chances for successful re-login and shorten the period during which the logical unit is inaccessible. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-02-16 15:40:35 +01:00
Stefan Richter	0fa6dfdb0a	firewire: fw-sbp2: don't add scsi_device twice When a reconnect failed but re-login succeeded, __scsi_add_device was called again. In those cases, __scsi_add_device succeeded and returned the pointer to the existing scsi_device. fw-sbp2 then continued orderly, except that it missed to call sbp2_cancel_orbs. SCSI core would call fw-sbp2's eh_abort_handler eventually if there had been an outstanding command. This patch avoids the needless lookups and temporary allocations in SCSI core and I/O stall and timeout until eh_abort_handler hits. Also, __scsi_add_device tolerating calls for devices which already exist is undocumented behavior on which we shouldn't rely. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-02-16 15:40:35 +01:00
Stefan Richter	48f18c761c	firewire: fw-sbp2: log bus_id at management request failures for easier readable logs if more than one SBP-2 device is present. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-02-16 15:40:34 +01:00
Stefan Richter	e0e6021555	firewire: fw-sbp2: wait for completion of fetch agent reset Like the old sbp2 driver, wait for the write transaction to the AGENT_RESET to complete before proceeding (after login, after reconnect, or in SCSI error handling). There is one occasion where AGENT_RESET is written to from atomic context when getting DEAD status for a command ORB. There we still continue without waiting for the transaction to complete because this is more difficult to fix... Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-16 15:40:34 +01:00
Stefan Richter	9220f19462	firewire: fw-sbp2: add INQUIRY delay workaround Several different SBP-2 bridges accept a login early while the IDE device is still powering up. They are therefore unable to respond to SCSI INQUIRY immediately, and the SCSI core has to retry the INQUIRY. One of these retries is typically successful, and all is well. But in case of Momobay FX-3A, the INQUIRY retries tend to fail entirely. This can usually be avoided by waiting a little while after login before letting the SCSI core send the INQUIRY. The old sbp2 driver handles this more gracefully for as yet unknown reasons (perhaps because it waits for fetch agent resets to complete, unlike fw-sbp2 which quickly proceeds after requesting the agent reset). Therefore the workaround is not as much necessary for sbp2. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-02-16 15:40:34 +01:00
Stefan Richter	fa6e697b85	firewire: log GUID of new devices This should help to interpret user reports. E.g. one can look up the vendor OUI (first three bytes of the GUID) and thus tell what is what. Also simplifies the math in the GUID sysfs attribute. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-02-16 15:40:34 +01:00
Stefan Richter	be6f48b017	firewire: fw-sbp2: don't retry login or reconnect after unplug If a device is being unplugged while fw-sbp2 had a login or reconnect on schedule, it would take about half a minute to shut the fw_unit down: Jan 27 18:34:54 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries) <unplug> Jan 27 18:34:59 stein firewire_sbp2: sbp2_scsi_abort Jan 27 18:34:59 stein scsi 25:0:0:0: Device offlined - not ready after error recovery Jan 27 18:35:01 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:06 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:12 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:17 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:22 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:27 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:32 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:32 stein firewire_sbp2: failed to login to fw2.0 LUN 0000 Jan 27 18:35:32 stein firewire_sbp2: released fw2.0 After this patch, typically only a few seconds spent in __scsi_add_device remain: Jan 27 19:05:50 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries) <unplug> Jan 27 19:05:56 stein firewire_sbp2: sbp2_scsi_abort Jan 27 19:05:56 stein scsi 33:0:0:0: Device offlined - not ready after error recovery Jan 27 19:05:56 stein firewire_sbp2: released fw2.0 The benefit of this is less noise in the syslog. It furthermore avoids a few wasted CPU cycles and needlessly prolonged lifetime of a few driver objects. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-02-16 15:40:33 +01:00
Stefan Richter	96b19062e7	firewire: fix "kobject_add failed for fw* with -EEXIST" There is a race between shutdown and creation of devices: fw-core may attempt to add a device with the same name of an already existing device. http://bugzilla.kernel.org/show_bug.cgi?id=9828 Impact of the bug: Happens rarely (when shutdown of a device coincides with creation of another), forces the user to unplug and replug the new device to get it working. The fix is obvious: Free the minor number after instead of before device_unregister(). This requires to take an additional reference of the fw_device as long as the IDR tree points to it. And while we are at it, we fix an additional race condition: fw_device_op_open() took its reference of the fw_device a little bit too late, hence was in danger to access an already invalid fw_device. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-16 15:40:33 +01:00
Stefan Richter	1b9c12ba2f	firewire: fw-sbp2: fix logout before login retry This fixes a "can't recognize device" kind of bug. If the SCSI INQUIRY failed and hence __scsi_add_device failed due to a bus reset, we tried a logout and then waited for the already scheduled login work to happen. So far so good, but the generation used for the logout was outdated, hence the logout never reached the target. The target might therefore deny the subsequent relogin attempt, which would also leave the target inaccessible. Therefore fetch a fresh device->generation for the logout. Use memory barriers to prevent our plan being foiled by compiler or hardware optimizations. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-16 15:40:32 +01:00
Stefan Richter	05cca73814	firewire: fw-sbp2: unsigned int vs. unsigned Standardize on "unsigned int" style. Sort some struct members thematically. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-02-16 15:40:32 +01:00
Jarod Wilson	384170da93	firewire: fw-sbp2: Use sbp2 device-provided mgt orb timeout for logins To be more compliant with section 7.4.8 of the SBP-2 specification, use the mgt_ORB_timeout specified in the SBP-2 device's config rom for login ORB attempts (though with some sanity checks). A happy side-effect is that certain device and controller combinations that sometimes take more than 20 seconds to get synced up (like my laptop with just about any SBP-2 device) now function more reliably. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (silenced sparse)	2008-01-30 22:22:29 +01:00
Jarod Wilson	a4c379c197	firewire: fw-sbp2: increase login orb reply timeout, fix "failed to login" Increase (and rename) the login orb reply timeout value to 20s to match that of the old firewire stack. 2s simply didn't give many devices enough time to spin up and reply. Fixes inability to recognize some devices. Failure mode was "orb reply timed out"/"failed to login". Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (style, comments, changelog)	2008-01-30 22:22:28 +01:00
Jarod Wilson	8f9f963e5d	firewire: replace subtraction with bitwise and Replace an unnecessary subtraction with a bitwise AND when determining the value of ext_tcode in fw_fill_transaction() to save a cpu cycle or two in a somewhat critical path. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:28 +01:00
Stefan Richter	f8d2dc3938	firewire: fw-core: react on bus resets while the config ROM is being fetched read_rom() obtained a fresh new fw_device.generation for each read transaction. Hence it was able to continue reading in the middle of the ROM even if a bus reset happened. However the device may have modified the ROM during the reset. We would end up with a corrupt fetched ROM image then. Although all of this is quite unlikely, it is not impossible. Therefore we now restart reading the ROM if the bus generation changed. Note, the memory barrier in read_rom() is still necessary according to tests by Jarod Wilson, despite of the ->generation access being moved up in the call chain. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> This is essentially what I've been beating on locally, and I've yet to hit another config rom read failure with it. Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-01-30 22:22:28 +01:00
Stefan Richter	b5d2a5e04e	firewire: enforce access order between generation and node ID, fix "giving up on config rom" fw_device.node_id and fw_device.generation are accessed without mutexes. We have to ensure that all readers will get to see node_id updates before generation updates. Fixes an inability to recognize devices after "giving up on config rom", https://bugzilla.redhat.com/show_bug.cgi?id=429950 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Reviewed by Nick Piggin <nickpiggin@yahoo.com.au>. Verified to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Kristian Høgsberg <krh@redhat.com>	2008-01-30 22:22:27 +01:00
Stefan Richter	cf5a56ac80	firewire: fw-cdev: use device generation, not card generation We have to use the fw_device.generation here, not the fw_card.generation, because the generation must never be newer than the node ID when we emit a transaction. This cannot be guaranteed with fw_card.generation. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Verified in concert with subsequent memory barriers patch to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected. Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-01-30 22:22:27 +01:00
Stefan Richter	5a8a1bcd15	firewire: fw-sbp2: use device generation, not card generation There was a small window where a login or reconnect job could use an already updated card generation with an outdated node ID. We have to use the fw_device.generation here, not the fw_card.generation, because the generation must never be newer than the node ID when we emit a transaction. This cannot be guaranteed with fw_card.generation. Furthermore, the target's and initiator's node IDs can be obtained from fw_device and fw_card. Dereferencing their underlying topology objects is not necessary. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Verified in concert with subsequent memory barriers patch to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected. Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-01-30 22:22:26 +01:00
Stefan Richter	14dc992aa7	firewire: fw-sbp2: try to increase reconnect_hold (speed up reconnection) Ask the target to grant 4 seconds instead of the standard and minimum of 1 second window after bus reset for reconnection. This accelerates reconnection if there are more than one targets on the bus: If a login and inquiry to one target blocks the fw-sbp2 workqueue for more than 1s after bus reset, we now still can reconnect to the other target. Before that, fw-sbp2's reconnect attempts would be rejected with "error status: 0:9" (function rejected), and fw-sbp2 would finally re-login. All those futile reconnect attemps cost extra time until the target which needs re-login is ready for I/O again. The reconnect timeout field in the login ORB doesn't have to be honored by the target though. I found that we could get up to - allegedly 32768s from an old OXFW911 firmware - 256s from LSI bridges - 4s from OXUF922 and OXFW912 bridges, - 2s from TI bridges, - only the standard 1s from Initio and Prolific bridges and from Apple OpenFirmware in target mode. We just try to get 4 seconds which already covers the case of a few HDDs on the same bus quite nicely. A minor drawback occurs in the following (rare and impractical) border case: - two initiators are there, initiator 1 holds an exclusive login to a target, - initiator 1 goes off the bus, - target refuses login attempts from initiator 2 until reconnect_hold seconds after bus reset. An alternative approach to the issue at hand would be to parallelize fw-sbp2's reconnect and login work. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Jarod Wilson <jwilson@redhat.com>	2008-01-30 22:22:26 +01:00
Stefan Richter	4dccd020d7	firewire: fw-sbp2: skip unnecessary logout Don't attempt to send a logout ORB if the target was already unplugged or had its link switched off. If two targets are attached, this enhances the chance to quickly reconnect to the remaining target when one target is plugged out. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Jarod Wilson <jwilson@redhat.com>	2008-01-30 22:22:26 +01:00
David Moore	fe5ca63430	firewire: fw-ohci: Dynamically allocate buffers for DMA descriptors Previously, the fw-ohci driver used fixed-length buffers for storing descriptors for isochronous receive DMA programs. If an application (such as libdc1394) generated a DMA program that was too large, fw-ohci would reach the limit of its fixed-sized buffer and return an error to userspace. This patch replaces the fixed-length ring-buffer with a linked-list of page-sized buffers. Additional buffers can be dynamically allocated and appended to the list when necessary. For a particular context, buffers are kept around after use and reused as necessary, so there is no allocation taking place after the DMA program is generated for the first time. In addition, the buffers it uses are coherent for DMA so there is no syncing required before and after writes. This syncing wasn't properly done in the previous version of the code. - This is the fourth version of my patch that replaces a fixed-length buffer for DMA descriptors with a dynamically allocated linked-list of buffers. As we discovered with the last attempt, new context programs are sometimes queued from interrupt context, making it unacceptable to call tasklet_disable() from context_get_descriptors(). This version of the patch uses ohci->lock for all locking needs instead of tasklet_disable/enable. There is a new requirement that context_get_descriptors() be called while holding ohci->lock. It was already held for the AT context, so adding the requirement for the iso context did not seem particularly onerous. In addition, this has the side benefit of allowing iso queue to be safely called from concurrent user-space threads, which previously was not safe. Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Jarod Wilson <jwilson@redhat.com> - Fixes the following issues: - Isochronous reception stopped prematurely if an application used a larger buffer. (Reproduced with coriander.) - Isochronous reception stopped after one or a few frames on VT630x in OHCI 1.0 mode. (Fixes reception in coriander, but dvgrab still doesn't work with these chips.) Patch update: struct member alignment, whitespace nits Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:24 +01:00
Stefan Richter	bb9f2206b6	firewire: fw-ohci: CycleTooLong interrupt management The firewire-ohci driver so far lacked the ability to resume cycle master duty after that condition happened, as added to ohci1394 in Linux 2.6.18 by commit `57fdb58fa5`. This ports this patch to fw-ohci. The "cycle too long" condition has been seen in practice - with IIDC cameras if a mode with packets too large for a speed is chosen, - sporadically when capturing DV on a VIA VT6306 card with ohci1394/ ieee1394/ raw1394/ dvgrab 2. https://bugzilla.redhat.com/show_bug.cgi?id=415841#c7 (This does not fix Fedora bug 415841.) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:24 +01:00
Rabin Vincent	478b233eda	firewire: Fix extraction of source node id Fix extraction of the source node id from the packet header. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:24 +01:00
David Moore	bcee893c6c	firewire: fw-ohci: Bug fixes for packet-per-buffer support This patch corrects a number of bugs in the current OHCI 1.0 packet-per-buffer support: 1. Correctly deal with payloads that cross a page boundary. The previous version would not split the descriptor at such a boundary, potentially corrupting unrelated memory. 2. Allow user-space to specify multiple packets per struct fw_cdev_iso_packet in the same way that dual-buffer allows. This is signaled by header_length being a multiple of header_size. This multiple determines the number of packets. The payload size allocated per packet is determined by dividing the total payload size by the number of packets. 3. Make sync support work properly for packet-per-buffer. I have tested this patch with libdc1394 by forcing my OHCI 1.1 controller to use the packet-per-buffer support instead of dual-buffer. I would greatly appreciate testing by those who have a DV devices and other types of iso streamers to make sure I didn't cause any regressions. Stefan, with this patch, I'm hoping that libdc1394 will work with all your OHCI 1.0 controllers now. The one bit of future work that remains for packet-per-buffer support is the automatic compaction of short payloads that I discussed with Kristian. Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:23 +01:00
David Moore	0642b6577f	firewire: fw-ohci: Fix for dualbuffer three-or-more buffers This patch fixes the problem where different OHCI 1.1 controllers behave differently when a received iso packet straddles three or more buffers when using the dual-buffer receive mode. Two changes are made in order to handle this situation: 1. The packet sync DMA descriptor is given a non-zero header length and non-zero payload length. This is because zero-payload descriptors are not discussed in the OHCI 1.1 specs and their behavior is thus undefined. Instead we use a header size just large enough for a single header and a payload length of 4 bytes for this first descriptor. 2. As we process received packets in the context's tasklet, read the packet length out of the headers. Keep track of the running total of the packet length as "excess_bytes", so we can ignore any descriptors where no packet starts or ends. These descriptors may not have had their first_res_count or second_res_count fields updated by the controller so we cannot rely on those values. The main drawback of this patch is that the excess_bytes value might get "out of sync" with the packet descriptors if something strange happens to the DMA program. I'm not if such a thing could ever happen, but I appreciate any suggestions in making it more robust. Also, the packet-per-buffer support may need a similar fix to deal with issue 1, but I haven't done any work on that yet. Stefan, I'm hoping that with this patch, all your OHCI 1.1 controllers will work properly with an unmodified version of libdc1394. Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:23 +01:00
Stefan Richter	4b11ea96a0	firewire: fw-sbp2: remove unused misleading macro SBP2_MAX_SECTORS is nowhere used in fw-sbp2. It merely got copied over from sbp2 where it played a role in the past. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:22 +01:00
Stefan Richter	b7811da2d9	firewire: fw-sbp2: prepare for s/g chaining Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:22 +01:00
Stefan Richter	285838eb22	firewire: fw-sbp2: refactor workq and kref handling This somewhat reduces the size of firewire-sbp2.ko. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2008-01-30 22:22:21 +01:00
James Bottomley	465ff3185e	[SCSI] relax scsi dma alignment This patch relaxes the default SCSI DMA alignment from 512 bytes to 4 bytes. I remember from previous discussions that usb and firewire have sector size alignment requirements, so I upped their alignments in the respective slave allocs. The reason for doing this is so that we don't get such a huge amount of copy overhead in bio_copy_user() for udev. (basically all inquiries it issues can now be directly mapped). Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:29:22 -06:00
Jarod Wilson	a186b4a6b2	firewire: OHCI 1.0 Isochronous Receive support Third rendition of FireWire OHCI 1.0 Isochronous Receive support, using a zer-copy method similar to OHCI 1.1 which puts the IR data payload directly into the userspace buffer. The zero-copy implementation eliminates the video artifacts, audio popping, and buffer underrun problems seen with version 1 of this patch, as well as fixing a regression in OHCI 1.1 support introduced by version 2 of this patch. Successfully tested in OHCI 1.1 mode on the following chipsets: - NEC uPD72847 (rev 01), OHCI 1.1 (PCI) - Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe) - Ti TSB41AB2 (rev 01), OHCI 1.1 (PCI on SB Audigy) - Apple UniNorth 2 (rev 81), OHCI 1.1 (PowerBook G4 onboard) Successfully tested in OHCI 1.0 mode on the following chipsets: - Agere FW323 (rev 06), OHCI 1.0 (Mac Mini onboard) - Agere FW323 (rev 06), OHCI 1.0 (PCI) - Via VT6306 (rev 46), OHCI 1.0 (PCI) - NEC OrangeLink (rev 01), OHCI 1.0 (PCI) - NEC uPD72847 (rev 01), OHCI 1.1 (PCI) - Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe) The bulk of testing was done in an x86_64 system, but was also successfully sanity-tested on other systems, including a PPC(32) PowerBook G4 and an i686 EPIA M10k. Crude benchmarking (watching top during capture) puts the cpu utilization during capture on the EPIA's 1GHz Via C3 processor around 13%, which is down from 30% with the v1 code. Some implementation details: To maintain the same userspace API as dual-buffer mode, we set up two descriptors for every incoming packet. The first is an INPUT_MORE descriptor, pointing to a buffer large enough to hold just the packet's iso headers, immediately followed by an INPUT_LAST descriptor, pointing to a chunk of the userspace buffer big enough for the packet's data payload. With this setup, each incoming packet fills in these two descriptors in a manner that very closely emulates dual-buffer receive, to the point where the bulk of the handle_ir_* code is now identical between the two (and probably primed for some restructuring to share code between them). The only caveat I have at the moment is that neither of my OHCI 1.0 Via VT6307-based FireWire controllers work particularly well with this code for reasons I have yet to figure out. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-12-10 21:55:19 +01:00
Stefan Richter	7c45d1913f	firewire: fw-sbp2: fix refcounting Since patch "fw-sbp2: use an own workqueue (fix system responsiveness)" increased parallelism between fw-sbp2 and fw-core, it was possible that fw-sbp2 didn't release the SCSI device when the FireWire device was disconnected. This happened if sbp2_update() ran during sbp2_login(), because a bus reset occurred during sbp2_login(). The sbp2_login() work would [try to] reschedule itself because it failed due to the bus reset, and it would _not_ drop its reference on the target. However, sbp2_update() would schedule sbp2_login() too before sbp2_login() rescheduled itself and hence sbp2_update() would take an additional reference. And then we would have one reference too many. The fix is to _always_ drop the reference when leaving the sbp2_login() work. If the sbp2_login() work reschedules itself, it takes a reference, but only if it wasn't already rescheduled by sbp2_update(). Ditto in the sbp2_reconnect() work. The resulting code is actually simpler than before: We _always_ take a reference when successfully scheduling work. And we _always_ drop a reference when leaving a workqueue job. No exceptions. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-11-07 01:59:28 +01:00
Kristian Høgsberg	0bd243c4d9	firewire: Fix pci resume to not pass in a __be32 config rom. The ohci_enable() function shared between pci_probe and pci_resume takes a host endian config rom, but ohci->config_rom is __be32. This sets up the config rom in the wrong endian on little endian machine, specifically, BusOptions will be initialized to a 0 max receive size. This patch changes the way we reuse the config rom so that we avoid this problem. Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-10-31 19:02:19 +01:00

1 2 3 4 5 ...

449 Commits