mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-26 20:59:25 +07:00
5c7a35e3e2
On the PowerNV platform, EEH errors are reported either by IO accessors or by the poller, which is driven by an interrupt. Once a PE has been isolated, no further EEH events are produced for it. The current implementation can therefore lose an EEH event in the following way:

The interrupt handler queues one "special" event, which drives the poller. The EEH thread has not picked up the special event yet. IO accessors kick in, the frozen PE is marked as "isolated", and an EEH event is queued to the list. The EEH thread then runs because of the special event and purges all existing EEH events. However, another EEH event is never produced for the frozen PE. In the end, the PE is marked as "isolated" and there is no EEH event left to recover it.

The patch fixes the issue by keeping EEH events for PEs that have been marked as "isolated", with the help of an additional "force" parameter to eeh_remove_event().

Reported-by: Rolf Brudeseth <rolfb@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
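The race and the fix described above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not the kernel implementation: the `sketch_*` names, the `SKETCH_PE_ISOLATED` flag, the singly linked queue, and the use of a NULL PE to stand for the "special" event are all hypothetical stand-ins, and locking is left out entirely. The only part taken from the header is the shape of the call, a PE pointer plus a `bool force`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
#define SKETCH_PE_ISOLATED 0x1

struct sketch_pe {
	int state;                      /* bit flags, e.g. SKETCH_PE_ISOLATED */
};

struct sketch_event {
	struct sketch_event *next;      /* simple singly linked event queue */
	struct sketch_pe *pe;           /* assumption: NULL models the "special" event */
};

struct sketch_event *sketch_queue;      /* head of the pending-event queue */

/* Queue one event for @pe; returns 0 on success, -1 if allocation fails. */
int sketch_send_event(struct sketch_pe *pe)
{
	struct sketch_event *ev = malloc(sizeof(*ev));

	if (!ev)
		return -1;
	ev->pe = pe;
	ev->next = sketch_queue;
	sketch_queue = ev;
	return 0;
}

/*
 * Remove pending events. A NULL @pe matches every event, otherwise only
 * events for @pe are removed. Unless @force is set, events whose PE is
 * already marked isolated are kept, so the recovery event is not lost.
 */
void sketch_remove_events(struct sketch_pe *pe, bool force)
{
	struct sketch_event **link = &sketch_queue;

	while (*link) {
		struct sketch_event *ev = *link;
		bool isolated = ev->pe && (ev->pe->state & SKETCH_PE_ISOLATED);

		if ((!force && isolated) || (pe && ev->pe != pe)) {
			link = &ev->next;   /* keep this event, move on */
			continue;
		}
		*link = ev->next;           /* unlink and free the event */
		free(ev);
	}
}

/* Count the events still waiting in the queue. */
size_t sketch_pending(void)
{
	size_t n = 0;

	for (struct sketch_event *ev = sketch_queue; ev; ev = ev->next)
		n++;
	return n;
}
```

In this model, queueing the special event, marking a PE isolated, queueing its event, and then purging with `force` set to false leaves the isolated PE's event in the queue. Purging with `force` set to true drops it as well, which corresponds to the purge behaviour the commit message describes as losing the event.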
41 lines
1.4 KiB
C
/*
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 * Copyright (c) 2005 Linas Vepstas <linas@linas.org>
 */

#ifndef ASM_POWERPC_EEH_EVENT_H
#define ASM_POWERPC_EEH_EVENT_H
#ifdef __KERNEL__

/*
 * structure holding pci controller data that describes a
 * change in the isolation status of a PCI slot. A pointer
 * to this struct is passed as the data pointer in a notify
 * callback.
 */
struct eeh_event {
	struct list_head list;	/* to form event queue */
	struct eeh_pe *pe;	/* EEH PE */
};

int eeh_event_init(void);
int eeh_send_failure_event(struct eeh_pe *pe);
void eeh_remove_event(struct eeh_pe *pe, bool force);
void eeh_handle_event(struct eeh_pe *pe);

#endif /* __KERNEL__ */
#endif /* ASM_POWERPC_EEH_EVENT_H */