                         How To Write Linux PCI Drivers

                   by Martin Mares <mj@ucw.cz> on 07-Feb-2000

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The world of PCI is vast and it's full of (mostly unpleasant) surprises.
Different PCI devices have different requirements and different bugs --
because of this, the PCI support layer in the Linux kernel is not as trivial
as one would wish. This short pamphlet tries to help all potential driver
authors find their way through the deep forests of PCI handling.


0. Structure of PCI drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~
There exist two kinds of PCI drivers: new-style ones (which leave most of
probing for devices to the PCI layer and support online insertion and removal
of devices [thus supporting PCI, hot-pluggable PCI and CardBus in a single
driver]) and old-style ones which just do all the probing themselves. Unless
you have a very good reason to do so, please don't use the old way of probing
in any new code. After the driver finds the devices it wishes to operate
on (either the old or the new way), it needs to perform the following steps:

        Enable the device
        Access device configuration space
        Discover resources (addresses and IRQ numbers) provided by the device
        Allocate these resources
        Communicate with the device
        Disable the device

Most of these topics are covered by the following sections; for the rest,
look at <linux/pci.h>, which is hopefully well commented.

If the PCI subsystem is not configured (CONFIG_PCI is not set), most of
the functions described below are defined as inline functions that are either
completely empty or just return an appropriate error code, to avoid lots of
ifdefs in the drivers.


1. New-style drivers
~~~~~~~~~~~~~~~~~~~~
The new-style drivers just call pci_register_driver during their initialization
with a pointer to a structure describing the driver (struct pci_driver) which
contains:

        name            Name of the driver
        id_table        Pointer to table of device IDs the driver is
                        interested in.  Most drivers should export this
                        table using MODULE_DEVICE_TABLE(pci,...).
        probe           Pointer to a probing function which gets called (during
                        execution of pci_register_driver for already existing
                        devices or later if a new device gets inserted) for all
                        PCI devices which match the ID table and are not handled
                        by the other drivers yet. This function gets passed a
                        pointer to the pci_dev structure representing the device
                        and also which entry in the ID table the device
                        matched. It returns zero when the driver has accepted
                        the device or an error code (negative number) otherwise.
                        This function always gets called from process context,
                        so it can sleep.
        remove          Pointer to a function which gets called whenever a
                        device being handled by this driver is removed (either
                        during deregistration of the driver or when it's
                        manually pulled out of a hot-pluggable slot). This
                        function always gets called from process context, so it
                        can sleep.
        save_state      Save a device's state before it is suspended.
        suspend         Put device into low power state.
        resume          Wake device from low power state.
        enable_wake     Enable device to generate wake events from a low power
                        state.

                        (Please see Documentation/power/pci.txt for descriptions
                        of PCI Power Management and the related functions)

The ID table is an array of struct pci_device_id ending with an all-zero entry.
Each entry consists of:

        vendor, device  Vendor and device ID to match (or PCI_ANY_ID)
        subvendor,      Subsystem vendor and device ID to match (or PCI_ANY_ID)
        subdevice
        class,          Device class to match. The class_mask tells which bits
        class_mask      of the class are honored during the comparison.
        driver_data     Data private to the driver.

Most drivers don't need to use the driver_data field.  Best practice
for use of driver_data is to use it as an index into a static list of
equivalent device types, not to use it as a pointer.

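The index pattern for driver_data can be sketched in plain user-space C.
This is only an illustration: struct model_id stands in for
struct pci_device_id, and board_table, board_info, the BOARD_* values and
board_for_id() are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical board types; the enum value is what goes into driver_data. */
enum board_type { BOARD_A = 0, BOARD_B, BOARD_COUNT };

/* Hypothetical per-board parameters, kept in one static list. */
struct board_info {
	const char *name;
	int num_ports;
};

static const struct board_info board_table[BOARD_COUNT] = {
	[BOARD_A] = { "Example board A", 1 },
	[BOARD_B] = { "Example board B", 4 },
};

/* Reduced stand-in for struct pci_device_id (assumption, not the real type). */
struct model_id {
	unsigned int vendor, device;
	unsigned long driver_data;
};

/*
 * Look up the board description for a matched ID entry.
 * Returns NULL if driver_data does not index a known board.
 */
static const struct board_info *board_for_id(const struct model_id *id)
{
	assert(id != NULL);
	if (id->driver_data >= BOARD_COUNT)
		return NULL;
	return &board_table[id->driver_data];
}
```

Because driver_data is only an index, a bogus value can be range-checked
before use, which would not be possible with an arbitrary pointer.
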
Use a table entry {PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID}
to have probe() called for every PCI device known to the system.

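The matching rule for ID table entries can be modelled in user space
roughly as follows. The struct and function names are made up for this sketch
(the class field is called class_val here), and MODEL_ANY_ID merely plays the
role of PCI_ANY_ID; the real comparison is done by the PCI layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ANY_ID 0xffffffffu	/* plays the role of PCI_ANY_ID */

/* Reduced stand-ins for struct pci_device_id and struct pci_dev. */
struct model_id {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class_val, class_mask;
	unsigned long driver_data;
};

struct model_dev {
	uint16_t vendor, device;
	uint16_t subsystem_vendor, subsystem_device;
	uint32_t class_val;	/* base class, sub-class, prog-if */
};

/* A wildcard field matches anything, otherwise values must be equal. */
static int field_matches(uint32_t want, uint32_t have)
{
	return want == MODEL_ANY_ID || want == have;
}

/* Only the class bits selected by class_mask take part in the comparison. */
static int id_matches(const struct model_id *id, const struct model_dev *dev)
{
	return field_matches(id->vendor, dev->vendor) &&
	       field_matches(id->device, dev->device) &&
	       field_matches(id->subvendor, dev->subsystem_vendor) &&
	       field_matches(id->subdevice, dev->subsystem_device) &&
	       !((id->class_val ^ dev->class_val) & id->class_mask);
}

/* Walk a table ending with an all-zero entry; return the first match. */
static const struct model_id *match_table(const struct model_id *table,
					  const struct model_dev *dev)
{
	assert(table != NULL && dev != NULL);
	for (; table->vendor || table->subvendor || table->class_mask; table++)
		if (id_matches(table, dev))
			return table;
	return NULL;
}
```

In this model an entry of four wildcards with a zero class_mask matches every
device, because a zero mask removes the class from the comparison entirely.
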
New PCI IDs may be added to a device driver at runtime by writing
to the file /sys/bus/pci/drivers/{driver}/new_id.  When added, the
driver will probe for all devices it can support.

echo "vendor device subvendor subdevice class class_mask driver_data" > \
 /sys/bus/pci/drivers/{driver}/new_id

where all fields are passed in as hexadecimal values (no leading 0x).
Users need pass only as many fields as necessary; vendor, device,
subvendor, and subdevice fields default to PCI_ANY_ID (FFFFFFFF),
class and class_mask fields default to 0, and driver_data defaults to
0UL.  Device drivers must initialize use_driver_data in the dynids struct
in their pci_driver struct prior to calling pci_register_driver in order
for the driver_data field to get passed to the driver. Otherwise, only a
0 is passed in that field.

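The field defaults described for new_id can be illustrated with a small
user-space parser. This is a model of the documented behaviour only, not the
kernel's own code; struct model_new_id and parse_new_id() are hypothetical
names, and the use of sscanf() here is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_ANY_ID 0xffffffffu	/* plays the role of PCI_ANY_ID */

/* Stand-in for the fields accepted by the new_id file. */
struct model_new_id {
	unsigned int vendor, device, subvendor, subdevice;
	unsigned int class_val, class_mask;
	unsigned long driver_data;
};

/*
 * Parse up to seven whitespace-separated hexadecimal fields.
 * Fields that are not supplied keep the documented defaults.
 * Returns the number of fields parsed, or -1 if none could be read.
 */
static int parse_new_id(const char *buf, struct model_new_id *id)
{
	int fields;

	assert(buf != NULL && id != NULL);

	/* Defaults: wildcard IDs, class ignored, driver_data zero. */
	id->vendor = id->device = MODEL_ANY_ID;
	id->subvendor = id->subdevice = MODEL_ANY_ID;
	id->class_val = id->class_mask = 0;
	id->driver_data = 0UL;

	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
			&id->vendor, &id->device,
			&id->subvendor, &id->subdevice,
			&id->class_val, &id->class_mask, &id->driver_data);
	return fields > 0 ? fields : -1;
}
```

Writing only "8086 1229" would, under this model, leave the subsystem IDs as
wildcards and the class mask at zero, so any subsystem variant of that
vendor/device pair would be accepted.
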
When the driver exits, it just calls pci_unregister_driver() and the PCI layer
automatically calls the remove hook for all devices handled by the driver.

Please mark the initialization and cleanup functions where appropriate
(the corresponding macros are defined in <linux/init.h>):

        __init          Initialization code. Thrown away after the driver
                        initializes.
        __exit          Exit code. Ignored for non-modular drivers.
        __devinit       Device initialization code. Identical to __init if
                        the kernel is not compiled with CONFIG_HOTPLUG, normal
                        function otherwise.
        __devexit       The same for __exit.

Tips:
        The module_init()/module_exit() functions (and all initialization
        functions called only from these) should be marked __init/__exit.
        The struct pci_driver shouldn't be marked with any of these tags.
        The ID table array should be marked __devinitdata.
        The probe() and remove() functions (and all initialization
        functions called only from these) should be marked __devinit/__devexit.
        If you are sure the driver is not a hotplug driver then use only
        __init/__exit and __initdata/__exitdata.

        Pointers to functions marked as __devexit must be created using
        __devexit_p(function_name).  That will generate the function
        name or NULL if the __devexit function will be discarded.


2. How to find PCI devices manually (the old style)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PCI drivers not using the pci_register_driver() interface search
for PCI devices manually using the following constructs:

Searching by vendor and device ID:

        struct pci_dev *dev = NULL;
        while ((dev = pci_get_device(VENDOR_ID, DEVICE_ID, dev)))
                configure_device(dev);

Searching by class ID (iterate in a similar way):

        pci_get_class(CLASS_ID, dev)

Searching by both vendor/device and subsystem vendor/device ID:

        pci_get_subsys(VENDOR_ID, DEVICE_ID, SUBSYS_VENDOR_ID, SUBSYS_DEVICE_ID, dev)

You can use the constant PCI_ANY_ID as a wildcard replacement for
VENDOR_ID or DEVICE_ID.  This allows searching for any device from a
specific vendor, for example.

These functions are hotplug-safe. They increment the reference count on
the pci_dev that they return. You must eventually (possibly at module unload)
decrement the reference count on these devices by calling pci_dev_put().


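The reference counting behind the pci_get_device() loop can be modelled in
user space as below. All names (model_dev, model_devices, model_get_device(),
model_dev_put(), count_matches()) are hypothetical. The sketch assumes, as the
kernel's iterator does, that the reference on the device passed in as the
starting point is dropped while a new one is taken on the device returned.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ANY_ID 0xffffffffu	/* plays the role of PCI_ANY_ID */

struct model_dev {
	unsigned int vendor, device;
	int refcount;
};

/* Hypothetical device list standing in for the kernel's own. */
static struct model_dev model_devices[] = {
	{ 0x8086, 0x1229, 0 },
	{ 0x10ec, 0x8139, 0 },
	{ 0x8086, 0x1229, 0 },
};
#define MODEL_NDEV (sizeof(model_devices) / sizeof(model_devices[0]))

/* Drop one reference; a NULL pointer is ignored. */
static void model_dev_put(struct model_dev *dev)
{
	if (dev) {
		assert(dev->refcount > 0);
		dev->refcount--;
	}
}

/*
 * Return the next matching device after 'from' (or the first one if
 * 'from' is NULL). Takes a reference on the result and drops the
 * reference held on 'from'.
 */
static struct model_dev *model_get_device(unsigned int vendor,
					   unsigned int device,
					   struct model_dev *from)
{
	size_t i = from ? (size_t)(from - model_devices) + 1 : 0;
	struct model_dev *found = NULL;

	for (; i < MODEL_NDEV; i++) {
		struct model_dev *d = &model_devices[i];

		if ((vendor == MODEL_ANY_ID || vendor == d->vendor) &&
		    (device == MODEL_ANY_ID || device == d->device)) {
			found = d;
			found->refcount++;
			break;
		}
	}
	model_dev_put(from);
	return found;
}

/* A loop that runs to completion leaves no references behind. */
static int count_matches(unsigned int vendor, unsigned int device)
{
	struct model_dev *dev = NULL;
	int n = 0;

	while ((dev = model_get_device(vendor, device, dev)))
		n++;
	return n;
}
```

A loop that stops early, or a driver that keeps the returned pointer, still
holds one reference and has to release it itself, which is the case the
pci_dev_put() requirement is about.
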
3. Enabling and disabling devices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before you do anything with the device you've found, you need to enable
it by calling pci_enable_device() which enables I/O and memory regions of
the device, allocates an IRQ if necessary, assigns missing resources if
needed and wakes up the device if it was in suspended state. Please note
that this function can fail.

If you want to use the device in bus mastering mode, call pci_set_master()
which enables the bus master bit in PCI_COMMAND register and also fixes
the latency timer value if it's set to something bogus by the BIOS.

If you want to use the PCI Memory-Write-Invalidate transaction,
call pci_set_mwi().  This enables the PCI_COMMAND bit for Mem-Wr-Inval
and also ensures that the cache line size register is set correctly.
Make sure to check the return value of pci_set_mwi(), since not all
architectures support Memory-Write-Invalidate.

If your driver decides to stop using the device (e.g., there was an
error while setting it up or the driver module is being unloaded), it
should call pci_disable_device() to deallocate any IRQ resources, disable
PCI bus-mastering, etc.  You should not do anything with the device after
calling pci_disable_device().

4. How to access PCI config space
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can use pci_(read|write)_config_(byte|word|dword) to access the config
space of a device represented by struct pci_dev *. All these functions return 0
when successful or an error code (PCIBIOS_...) which can be translated to a text
string by pcibios_strerror. Most drivers expect that accesses to valid PCI
devices don't fail.

If you don't have a struct pci_dev available, you can call
pci_bus_(read|write)_config_(byte|word|dword) to access a given device
and function on that bus.

If you access fields in the standard portion of the config header, please
use symbolic names of locations and bits declared in <linux/pci.h>.

If you need to access Extended PCI Capability registers, just call
pci_find_capability() for the particular capability and it will find the
corresponding register block for you.


5. Addresses and interrupts
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Memory and port addresses and interrupt numbers should NOT be read from the
config space. You should use the values in the pci_dev structure as they might
have been remapped by the kernel.

See Documentation/IO-mapping.txt for how to access device memory.

The device driver needs to call pci_request_region() to make sure
no other device is already using the same resource. The driver is expected
to determine MMIO and IO Port resource availability _before_ calling
pci_enable_device().  Conversely, drivers should call pci_release_region()
_after_ calling pci_disable_device().  The idea is to prevent two devices
colliding on the same address range.

Generic flavors of pci_request_region() are request_mem_region()
(for MMIO ranges) and request_region() (for IO Port ranges).
Use these for address resources that are not described by "normal" PCI
interfaces (e.g. BAR).

All interrupt handlers should be registered with IRQF_SHARED and use the devid
to map IRQs to devices (remember that all PCI interrupts are shared).


6. Other interesting functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pci_find_slot()         Find pci_dev corresponding to given bus and
                        slot numbers.
pci_set_power_state()   Set PCI Power Management state (0=D0 ... 3=D3)
pci_find_capability()   Find specified capability in device's capability
                        list.
pci_module_init()       Inline helper function for ensuring correct
                        pci_driver initialization and error handling.
pci_resource_start()    Returns bus start address for a given PCI region
pci_resource_end()      Returns bus end address for a given PCI region
pci_resource_len()      Returns the byte length of a PCI region
pci_set_drvdata()       Set private driver data pointer for a pci_dev
pci_get_drvdata()       Return private driver data pointer for a pci_dev
pci_set_mwi()           Enable Memory-Write-Invalidate transactions.
pci_clear_mwi()         Disable Memory-Write-Invalidate transactions.


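The relation between the start, end and length of a region can be shown with
a tiny user-space model. It assumes the usual convention that the end address
is inclusive and that an all-zero resource denotes an unused region;
struct model_resource and model_resource_len() are hypothetical names used
only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one PCI region as seen through pci_resource_start/end(). */
struct model_resource {
	unsigned long start;
	unsigned long end;	/* inclusive: address of the last byte */
};

/* Byte length of a region; an all-zero resource counts as unused. */
static unsigned long model_resource_len(const struct model_resource *r)
{
	assert(r != NULL);
	if (r->start == 0 && r->end == r->start)
		return 0;
	return r->end - r->start + 1;
}
```

The "+ 1" is the part that is easy to get wrong when computing a length by
hand from the start and end addresses.
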
7. Miscellaneous hints
~~~~~~~~~~~~~~~~~~~~~~
When displaying PCI slot names to the user (for example when a driver wants
to tell the user which card it has found), please use pci_name(pci_dev)
for this purpose.

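For orientation, the kind of string a slot name consists of can be reproduced
in user space as follows. The sketch assumes the common
domain:bus:slot.function layout with hexadecimal fields; model_slot_name(),
MODEL_SLOT() and MODEL_FUNC() are names invented here, and a driver should
simply print what pci_name() returns instead of building the string itself.

```c
#include <assert.h>
#include <stdio.h>

/* A devfn byte packs the slot number (upper 5 bits) and function (lower 3). */
#define MODEL_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define MODEL_FUNC(devfn)	((devfn) & 0x07)

/*
 * Format a slot name into buf. Returns the number of characters the
 * full name needs, as snprintf() does.
 */
static int model_slot_name(char *buf, size_t len, unsigned int domain,
			   unsigned int bus, unsigned int devfn)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%04x:%02x:%02x.%u", domain, bus,
			MODEL_SLOT(devfn), MODEL_FUNC(devfn));
}
```

Under these assumptions, domain 0, bus 2 and devfn 0xfa (slot 0x1f,
function 2) give the name 0000:02:1f.2.
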
Always refer to the PCI devices by a pointer to the pci_dev structure.
All PCI layer functions use this identification and it's the only
reasonable one. Don't use bus/slot/function numbers except for very
special purposes -- on systems with multiple primary buses their semantics
can be pretty complex.

If you're going to use PCI bus mastering DMA, take a look at
Documentation/DMA-mapping.txt.

Don't try to turn on Fast Back to Back writes in your driver.  All devices
on the bus need to be capable of doing it, so this is something which needs
to be handled by platform and generic code, not individual drivers.


8. Vendor and device identifications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For the future, let's avoid adding device ids to include/linux/pci_ids.h.

Use PCI_VENDOR_ID_xxx constants for vendors, and a hex constant for device ids.

Rationale:  PCI_VENDOR_ID_xxx constants are re-used, but device ids are not.
            Further, device ids are arbitrary hex numbers, normally used only
            in a single location, the pci_device_id table.

9. Obsolete functions
~~~~~~~~~~~~~~~~~~~~~
There are several functions which you might come across when trying to
port an old driver to the new PCI interface.  They are no longer present
in the kernel as they aren't compatible with hotplug, PCI domains or
sane locking.

pci_find_device()       Superseded by pci_get_device()
pci_find_subsys()       Superseded by pci_get_subsys()
pci_find_slot()         Superseded by pci_get_slot()