Commit Graph

370 Commits

Author SHA1 Message Date
Lv Zheng
c2b46d679b ACPI / EC: Add PM operations to improve event handling for resume process
This patch makes 2 changes:

1. Restore old behavior
Originally, EC driver stops handling both events and transactions in
acpi_ec_block_transactions(), and restarts to handle transactions in
acpi_ec_unblock_transactions_early(), restarts to handle both events and
transactions in acpi_ec_unblock_transactions().
While currently, EC driver still stops handling both events and
transactions in acpi_ec_block_transactions(), but restarts to handle both
events and transactions in acpi_ec_unblock_transactions_early().
This patch tries to restore the old behavior by dropping
__acpi_ec_enable_event() from acpi_unblock_transactions_early().

2. Improve old behavior
However this still cannot fix the real issue as both of the
acpi_ec_unblock_xxx() functions are invoked in the noirq stage. Since the
EC driver actually doesn't implement the event handling in the polling
mode, re-enabling the event handling too early in the noirq stage could
result in the problem that if there is no triggering source causing
advance_transaction() to be invoked, pending SCI_EVT cannot be detected by
the EC driver and _Qxx cannot be triggered.
It actually makes sense to restart the event handling in any point during
resuming after the noirq stage. Just like the boot stage where the event
handling is enabled in .add(), this patch further moves
acpi_ec_enable_event() to .resume(). After doing that, the following 2
functions can be combined:
acpi_ec_unblock_transactions_early()/acpi_ec_unblock_transactions().

The differences of the event handling availability between the old behavior
(this patch isn't applied) and the new behavior (this patch is applied) are
as follows:
                        !Applied        Applied
before suspend          Y               Y
suspend before EC       Y               Y
suspend after EC        Y               Y
suspend_late            Y               Y
suspend_noirq           Y (actually N)  Y (actually N)
resume_noirq            Y (actually N)  Y (actually N)
resume_late             Y (actually N)  Y (actually N)
resume before EC        Y (actually N)  Y (actually N)
resume after EC         Y (actually N)  Y
after resume            Y (actually N)  Y
Where "actually N" means if there is no triggering source, the EC driver
is actually not able to notice the pending SCI_EVT occurred in the noirq
stage. So we can clearly see that this patch has improved the situation.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 00:32:10 +02:00
Lv Zheng
e923e8e79e ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled
After enabling the EC event handling, Linux is still in the noirq stage, if
there is no triggering source (EC transaction, GPE STS status),
advance_transaction() will not be invoked and SCI_EVT cannot be detected.
This patch adds one more triggering source after enabling the EC event
handling to poll the pending SCI_EVT.

Known issues:
1. Still no SCI_EVT triggering source
   There could still be no SCI_EVT triggering source after handling the
   first SCI_EVT (polled by this patch if any). Because after handling the
   first SCI_EVT, Linux could still be in noirq stage and there could still
   be no further triggering source in this stage. Then the second SCI_EVT
   indicated during this stage still cannot be detected by the EC driver.
   With this improvement applied, it is then possible to move
   acpi_ec_enable_event() out of the noirq stage to fix this issue (if the
   first SCI_EVT is handled out of the noirq stage, the follow-up SCI_EVTs
   should be able to trigger IRQs).

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 00:32:10 +02:00
Lv Zheng
750f628be6 ACPI / EC: Add EC_FLAGS_QUERY_ENABLED to reveal a hidden logic
There is a hidden logic in the EC driver:
1. During boot, EC_FLAGS_QUERY_PENDING is responsible for blocking event
   handling;
2. During suspend, EC_FLAGS_STARTED is responsible for blocking event
   handling.
This patch uses a new EC_FLAGS_QUERY_ENABLED flag to make this hidden
logic explicit and have code cleaned up. No functional change.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 00:32:10 +02:00
Lv Zheng
df45db6177 ACPI / EC: Add PM operations for suspend/resume noirq stage
It is reported that on some platforms, resume speed is not fast. The cause
is: in noirq stage, EC driver is working in polling mode, and each state
machine advancement requires a context switch.

The context switch is not necessary to the EC driver's polling mode. This
patch implements PM hooks to automatically switch the driver to/from the
busy polling mode to eliminate the overhead caused by the context switch.

This finally contributes to the tuning result: acpi_pm_finish() execution
time is improved from 192ms to 6ms.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reported-and-tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-17 02:37:02 +02:00
Rafael J. Wysocki
4dc14b343d Merge branches 'acpi-ec' and 'acpi-button'
* acpi-ec:
  ACPI / EC: Work around method reentrancy limit in ACPICA for _Qxx

* acpi-button:
  ACPI / button: remove pointer to old lid_sysfs on unbind
2016-08-05 16:04:49 +02:00
Lv Zheng
e1191bd4f6 ACPI / EC: Work around method reentrancy limit in ACPICA for _Qxx
A regression is caused by the following commit:

  Commit: 02b771b64b
  Subject: ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations

In this commit, using system workqueue causes that the maximum parallel
executions of _Qxx can exceed 255. This violates the method reentrancy
limit in ACPICA and generates the following error log:

  ACPI Error: Method reached maximum reentrancy limit (255) (20150818/dsmethod-341)

This patch creates a seperate workqueue and limits the number of parallel
_Qxx evaluations down to a configurable value (can be tuned against number
of online CPUs).

Since EC events are handled after driver probe, we can create the workqueue
in acpi_ec_init().

Fixes: 02b771b64b (ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=135691
Cc: 4.3+ <stable@vger.kernel.org> # 4.3+
Reported-and-tested-by: Helen Buus <ubuntu@hbuus.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-04 02:07:53 +02:00
Rafael J. Wysocki
f6bc0a168e Merge branches 'acpi-ec', 'acpi-video', 'acpi-button' and 'acpi-thermal'
* acpi-ec:
  ACPI / EC: Remove wrong ECDT correction quirks
  ACPI / EC: Cleanup boot EC code using acpi_ec_alloc()

* acpi-video:
  ACPI / video: Dummy acpi_video_register should return error code
  ACPI / video: skip evaluating _DOD when it does not exist
  ACPI / video: Thinkpad X201 Tablet needs video_detect_force_video

* acpi-button:
  ACPI / button: Add quirks for initial lid state notification
  ACPI / button: Refactor functions to eliminate redundant code
  ACPI / button: Remove initial lid state notification

* acpi-thermal:
  ACPI / thermal: Remove create_workqueue()
2016-07-25 13:42:00 +02:00
Lv Zheng
fa5b4a509d ACPI / EC: Fix code ordering issue in ec_remove_handlers()
There is an order issue in ec_remove_handlers() that acpi_ec_stop()
is called before removing the operation region handler. That is
incorrect, because the operation region handler removal triggers
_REG(DISCONNECT) which may result in new EC transactions to carry
out.

That existing issue has been triggered by the following commit:

    Commit: dcf15cbded
    Subject: ACPI / EC: Fix a boot EC regresion by restoring boot EC

which changed the driver to call ec_remove_handlers() after invoking
_REG(CONNECT), so the issue has become visible.

Fixes: dcf15cbded (ACPI / EC: Fix a boot EC regresion by restoring boot EC)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=102421
Reported-and-tested-by: Wolfram Sang <wsa@the-dreams.de>
Reported-by: Nicholas <nkudriavtsev@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-08 21:44:12 +02:00
Lv Zheng
bc539567ab ACPI / EC: Remove wrong ECDT correction quirks
Our Windows probe result shows that EC._REG is evaluated after evaluating
all _INI/_STA control methods.

With boot EC always switched in acpi_ec_dsdt_probe(), we can see that as
long as there is no EC opregion accesses in the MLC (module level code, AML
code out of any control methods) and in _INI/_STA, there is no need to make
sure that ECDT must be correct.

Bugs of 9399/12461 were reported against an order issue that BAT0/1._STA
evaluations contain EC accesses while the ECDT setting is wrong.

>From the acpidump output posted on bug 9399, we can see that it is actually
a different issue. In this table, if EC._REG is not executed, EC accesses
will be done in a platform specific manner. As we've already ensured not to
execute EC._REG during the eary stage, we can remove the quirks for bug
9399.

From the acpidump output posted on bug 12461, we can see that it still
needs the quirk. In this table, EC._REG flags a named object whose default
value is One, thus BAT1._STA surely should invoke EC accesses whatever we
invoke EC._REG or not. We have to keep the quirk for it before we can root
cause the issue.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-04 15:30:06 +02:00
Lv Zheng
fd6231e785 ACPI / EC: Cleanup boot EC code using acpi_ec_alloc()
Failure handling of the boot EC code is not tidy. This patch cleans
them up with acpi_ec_alloc().

This patch also changes acpi_ec_dsdt_probe(), always switches the
boot EC from the ECDT one to the DSDT one in this function.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-04 15:30:06 +02:00
Lv Zheng
dcf15cbded ACPI / EC: Fix a boot EC regresion by restoring boot EC support for the DSDT EC
According to the Windows probing result, during the table loading, the EC
device described in the ECDT should be used. And the ECDT EC is also
effective during the period the namespace objects are initialized (we can
see a separate process executing _STA/_INI on Windows before executing
other device specific control methods, for example, EC._REG). During the
device enumration, the EC device described in the DSDT should be used. But
there are differences between Linux and Windows around the device probing
order. Thus in Linux, we should enable the DSDT EC as early as possible
before enumerating devices in order not to trigger issues related to the
device enumeration order differences.

This patch thus converts acpi_boot_ec_enable() into acpi_ec_dsdt_probe() to
fix the gap. This also fixes a user reported regression triggered after we
switched the "table loading"/"ECDT support" to be ACPI spec 2.0 compliant.

Fixes: 59f0aa9480 (ACPI 2.0 / ECDT: Remove early namespace reference from EC)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=119261
Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-07 02:29:53 +02:00
Lv Zheng
59f0aa9480 ACPI 2.0 / ECDT: Remove early namespace reference from EC
All operation region accesses are allowed by AML interpreter when AML is
executed, so actually BIOSen are responsible to avoid the operation region
accesses in AML before OSPM has prepared an operation region driver. This
is done via _REG control method. So AML code normally sets a global named
object REGC to 1 when _REG(3, 1) is evaluated.

Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
 "The presence of this table allows OSPM to provide Embedded Controller
  operation region space access before the namespace has been evaluated."
Spec also suggests a compatible mean to indicate the early EC access
availability:
 Device (EC)
 {
     Name (REGC, Ones)
     Method (_REG, 2)
     {
         If (LEqual (Arg0, 3))
         {
             Store (Arg1, REGC)
         }
     }
     Method (ECAV)
     {
         If (LEqual (REGC, Ones))
         {
             If (LGreaterEqual (_REV, 2))
             {
                 Return (One)
             }
             Else
             {
                 Return (Zero)
             }
         }
         Else
         {
             Return (REGC)
         }
     }
 }
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.

But ECAV is not the only way practical BIOSen using to indicate the early
EC access availibility, the known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
   first control method evaluated by OSPM during the enumeration, this
   allows EC accesses to happen for the entire enumeration process before
   the namespace EC is enumerated.
2. Initialize REGC to One by default, this even allows EC accesses to
   happen during the table loading.

Linux is now broken around ECDT support during the long term bug fixing
work because it has merged many wrong ECDT bug fixes (see details below).
Linux currently uses namespace EC's settings instead of ECDT settings when
ECDT is detected. This apparently will result in namespace walk and
_CRS/_GPE/_REG evaluations. Such stuffs could only happen after namespace
is ready, while ECDT is purposely to be used before namespace is ready.

The wrong bug fixing story is:
1. Link 1:
   At Linux ACPI early stages, "no _Lxx/_Exx/_Qxx evaluation can happen
   before the namespace is ready" are not ensured by ACPICA core and Linux.
   This is currently ensured by deferred enabling of GPE and defered
   registering of EC query methods (acpi_ec_register_query_methods).
2. Link 2:
   Reporters reported buggy ECDTs, expecting quirks for the platform.
   Originally, the quirk is simple, only doing things with ECDT.
   Bug 9399 and 12461 are platforms (Asus L4R, Asus M6R, MSI MS-171F)
   reported to have wrong ECDT IO port addresses, the port addresses are
   reversed.
   Bug 11880 is a platform (Asus X50GL) reported to have 0 valued port
   addresses, we can see that all EC accesses are protected by ECAV on
   this platform, so actually no early EC accesses is required by this
   platform.
3. Link 3:
   But when the bug fixing developer was requested to provide a handy and
   non-quirk bug fix, he tried to use correct EC settings from namespace
   and broke the spec purpose. We can even see that the developer was
   suffered from many regrssions. One interesting one is 14086, where the
   actual root cause obviously should be: _REG is evaluated too early. But
   unfortunately, the bug is fixed in a totally wrong way.

So everything goes wrong from these commits:
   Commit: c6cb0e8784
   Subject: ACPI: EC: Don't trust ECDT tables from ASUS
   Commit: a5032bfdd9
   Subject: ACPI: EC: Always parse EC device

This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.

Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916
        http://bugzilla.kernel.org/show_bug.cgi?id=10100
        https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399
        https://bugzilla.kernel.org/show_bug.cgi?id=12461
        https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884
        https://bugzilla.kernel.org/show_bug.cgi?id=14081
        https://bugzilla.kernel.org/show_bug.cgi?id=14086
        https://bugzilla.kernel.org/show_bug.cgi?id=14446
Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-09 03:06:44 +02:00
Lv Zheng
0e1affe41b ACPI 2.0 / ECDT: Split EC_FLAGS_HANDLERS_INSTALLED
This patch splits EC_FLAGS_HANDLERS_INSTALLED so that address space handler
can be installed when it is not possible to install GPE handler during
early stage.
This patch also tunes address space handler installation, making it
happening earlier than GPE handler installation for the same purpose.

Since acpi_ec_start()/acpi_ec_stop() will be entered multiple times after
applying this change, it is also required to protect acpi_enable_gpe()/
acpi_disable_gpe() invocations.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-09 03:06:43 +02:00
Markus Elfring
4981c2b7ab ACPI-EC: Drop unnecessary check made before calling acpi_ec_delete_query()
The acpi_ec_delete_query() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-16 23:29:44 +01:00
Lv Zheng
6119754707 ACPI / EC: Fix a race issue in acpi_ec_guard_event()
In acpi_ec_guard_event(), EC transaction state machine variables should be
checked with the EC spinlock locked.
The bug doesn't trigger any real issue now because this bug can only occur
when the ec_event_clearing=event mode is applied while there is no user
currently using this mode.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-26 01:46:25 +02:00
Lv Zheng
0700c047f6 ACPI / EC: Fix query handler related issues
1. acpi_ec_remove_query_handlers()
This patch refines the query handler removal logic implemented in
acpi_ec_remove_query_handler(), making it to invoke new
acpi_ec_remove_query_handlers() API, and ensuring all other removal code
paths to invoke the new API to honor the reference count of the query
handlers.

2. acpi_ec_get_query_handler_by_value()
This patch also refines the query handler search logic originally
implemented in acpi_ec_query(), collecting it into
acpi_ec_get_query_handler_by_value(). And since schedule_work() can ensure
the serilization of acpi_ec_event_handler(), we needn't put the
mutex_lock() around schedule_work().

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-26 01:46:24 +02:00
Lv Zheng
15b94fa32a ACPI / EC: Fix a memory leak issue in acpi_ec_query()
When query handler is not found, "result" is actually stil 0, and
"struct acpi_ec_query" is not NULL, so the deletion code of
"struct acpi_ec_query" at the end of the function cannot be invoked.
As a consequence, memory leak can be observed.

The issue is introduced by this commit:
  Commit: 02b771b64b
  Subject: ACPI / EC: Fix an issue caused by the serialized _Qxx

This patch fixes such memory leakage.

Fixes: 02b771b64b (ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations)
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-26 01:44:59 +02:00
Rafael J. Wysocki
5d2a1a927d Merge branches 'acpi-pci', 'acpi-soc', 'acpi-ec' and 'acpi-osl'
* acpi-pci:
  ACPI, PCI: Penalize legacy IRQ used by ACPI SCI

* acpi-soc:
  ACPI / LPSS: Ignore 10ms delay for Braswell

* acpi-ec:
  ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations

* acpi-osl:
  ACPI / osl: replace custom implementation of readq / writeq
2015-09-01 03:41:19 +02:00
Lv Zheng
02b771b64b ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations
It is proven that Windows evaluates _Qxx handlers in a parallel way. This
patch follows this fact, splits _Qxx evaluations from the NOTIFY queue to
form a separate queue, so that _Qxx evaluations can be queued up on
different CPUs rather than being queued up on a CPU0 bound queue.
Event handling related callbacks are also renamed and sorted in this patch.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=94411
Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-08-25 03:19:32 +02:00
Jarkko Nikula
4c62dbbce9 ACPI: Remove FSF mailing addresses
There is no need to carry potentially outdated Free Software Foundation
mailing address in file headers since the COPYING file includes it.

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-08 02:27:32 +02:00
Lv Zheng
66db383439 ACPI / EC: Fix a code coverity issue when QR_EC transactions are failed.
When the QR_EC transaction fails, the EC_FLAGS_QUERY_PENDING flag prevents
the event handling work queue from being scheduled again.

Though there shouldn't be failed QR_EC transactions, and this gap was
efficiently used for catching and learning the SCI_EVT clearing timing
compliance issues, we need to fix this as we are not fully compatible
with all platforms/Windows to handle SCI_EVT clearing timing correctly.
Fixing this gives the EC driver the chances to recover from a state machine
failure.

So this patch fixes this issue. When nr_pending_queries drops to 0, it
clears EC_FLAGS_QUERY_PENDING at the proper position for different modes in
order to ensure that the SCI_EVT handling can proceed.

In order to be clearer for future ec_event_clearing modes, all checks in
this patch are written in the inclusive style, not the exclusive style.

Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 14:35:59 +02:00
Lv Zheng
3cb02aeb28 ACPI / EC: Fix EC_FLAGS_QUERY_HANDSHAKE platforms using new event clearing timing.
It is reported that on several platforms, EC firmware will not respond
non-expected QR_EC (see EC_FLAGS_QUERY_HANDSHAKE, only write QR_EC when
SCI_EVT is set).

Unfortunately, ACPI specification doesn't define when the SCI_EVT should be
cleared by the firmware, thus the original implementation queued up second
QR_EC right after writing QR_EC command and before reading the returned
event value as at that time the SCI_EVT is ensured not cleared. This
behavior is also based on the assumption that the firmware should be able
to return 0x00 to indicate "no outstanding event". This behavior did fix
issues on Samsung platforms where the spurious query value of 0x00 is
supported and didn't break platforms in my test queue.

But recently, specific Acer, Asus, Lenovo platforms keep on blaming this
change.

This patch changes the behavior to re-check the SCI_EVT a bit later and
removes EC_FLAGS_QUERY_HANDSHAKE quirks, hoping this is the Windows
compliant EC driver behavior.

In order to be robust to the possible regressions, instead of removing the
quirk directly, this patch keeps the quirk code, removes the quirk users
and keeps old behavior for Samsung platforms.

Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Link: https://bugzilla.kernel.org/show_bug.cgi?id=94411
Link: https://bugzilla.kernel.org/show_bug.cgi?id=97381
Link: https://bugzilla.kernel.org/show_bug.cgi?id=98111
Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Reported-and-tested-by: Tigran Gabrielyan <tigrangab@gmail.com>
Reported-and-tested-by: Adrien D <ghbdtn@openmailbox.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 14:35:59 +02:00
Lv Zheng
1d68d2612c ACPI / EC: Add event clearing variation support.
We've been suffering from the uncertainty of the SCI_EVT clearing timing.
This patch implements 3 of 4 possible modes to handle SCI_EVT clearing
variations. The old behavior is kept in this patch.

Status: QR_EC is re-checked as early as possible after checking previous
        SCI_EVT. This always leads to 2 QR_EC transactions per SCI_EVT
        indication and the target may implement event queue which returns
        0x00 indicating "no outstanding event".
        This is proven to be a conflict against Windows behavior, but is
        still kept in this patch to make the EC driver robust to the
        possible regressions that may occur on Samsung platforms.
Query: QR_EC is re-checked after the target has handled the QR_EC query
       request command pushed by the host.
Event: QR_EC is re-checked after the target has noticed the query event
       response data pulled by the host.
       This timing is not determined by any IRQs, so we may need to use a
       guard period in this mode, which may explain the existence of the
       ec_guard() code used by the old EC driver where the re-check timing
       is implemented in the similar way as this mode.
Method: QR_EC is re-checked as late as possible after completing the _Qxx
        evaluation. The target may implement SCI_EVT like a level triggered
        interrupt.
        It is proven on kernel bugzilla 94411 that, Windows will have all
        _Qxx evaluations parallelized. Thus unless required by further
        evidences, we needn't implement this mode as it is a conflict of
        the _Qxx parallelism requirement.

Note that, according to the reports, there are platforms that cannot be
handled using the "Status" mode without enabling the
EC_FLAGS_QUERY_HANDSHAKE quirk. But they can be handled with the other
modes according to the tests (kernel bugzilla 97381).

The following log entry can be used to confirm the differences of the 3
modes as it should appear at the different positions for the 3 modes:
  Command(QR_EC) unblocked
Status: appearing after
         EC_SC(W) = 0x84
Query: appearing after
         EC_DATA(R) = 0xXX
       where XX is the event number used to determine _QXX
Event: appearing after first
         EC_SC(R) = 0xX0 SCI_EVT=x BURST=0 CMD=0 IBF=0 OBF=0
       that is next to the following log entry:
         Command(QR_EC) completed by hardware

Link: https://bugzilla.kernel.org/show_bug.cgi?id=94411
Link: https://bugzilla.kernel.org/show_bug.cgi?id=97381
Link: https://bugzilla.kernel.org/show_bug.cgi?id=98111
Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Reported-and-tested-by: Tigran Gabrielyan <tigrangab@gmail.com>
Reported-and-tested-by: Adrien D <ghbdtn@openmailbox.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 14:35:59 +02:00
Lv Zheng
9d8993be2d ACPI / EC: Convert event handling work queue into loop style.
During the period that a work queue is scheduled (queued up for run) but
hasn't been run, second schedule_work() could fail. This may not lead to
the loss of queries because QR_EC is always ensured to be submitted after
the work queue has been in the running state.

The event handling work queue can be changed into the loop style to allow
us to control the code in a more flexible way:
1. Makes it possible to add event=0x00 termination condition in the loop.
2. Increases the thoughput of the QR_EC transactions as the 2nd+ QR_EC
   transactions may be handled in the same work item used for the 1st QR_EC
   transaction, thus the delay caused by the 2nd+ work item scheduling can
   be eliminated.

Except the logging message changes and the throughput improvement, this
patch is just a funcitonal no-op.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Tigran Gabrielyan <tigrangab@gmail.com>
Tested-by: Adrien D <ghbdtn@openmailbox.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 14:35:59 +02:00
Lv Zheng
f8b8eb7153 ACPI / EC: Cleanup transaction state transition.
This patch collects transaction state transition code into one function. We
then could have a single function to maintain transaction transition
related behaviors. No functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Tigran Gabrielyan <tigrangab@gmail.com>
Tested-by: Adrien D <ghbdtn@openmailbox.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 14:35:58 +02:00
Lv Zheng
3174abcfea ACPI / EC: Remove non-root-caused busy polling quirks.
{ Update to correct 1 patch subject in the description }

We have fixed a lot of race issues in the EC driver recently.

The following commit introduces MSI udelay()/msleep() quirk to MSI laptops
to make EC firmware working for bug 12011 without root causing any EC
driver race issues:
  Commit: 5423a0cb3f
  Subject: ACPI: EC: Add delay for slow MSI controller
  Commit: 34ff4dbccc
  Subject: ACPI: EC: Separate delays for MSI hardware

The following commit extends ECDT validation quirk to MSI laptops to make
EC driver locating EC registers properly for bug 12461:
  Commit: a5032bfdd9
  Subject: ACPI: EC: Always parse EC device
This is a different quirk than the MSI udelay()/msleep() quirk. This patch
keeps validating ECDT for only "Micro-Star MS-171F" as reported.

The following commit extends MSI udelay()/msleep() quirk to Quanta laptops
to make EC firmware working for bug 20242, there is no requirement to
validate ECDT for Quanta laptops:
  Commit: 534bc4e3d2 Mon Sep 17 00:00:00 2001
  Subject: ACPI EC: enable MSI workaround for Quanta laptops

The following commit extends MSI udelay()/msleep() quirk to Clevo laptops
to make EC firmware working for bug 77431, there is no requirement to
validate ECDT for Clevo laptops:
  Commit: 777cb38295
  Subject: ACPI / EC: Add msi quirk for Clevo W350etq

All udelay()/msleep() quirks for MSI/Quanta/Clevo seem to be the wrong
fixes generated without fixing the EC driver race issues.
And even if it is not wrong, the guarding can be covered by the following
commits in wait polling mode:
  Commit: 9e295ac14d
  Subject: ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp.
  Commit: commit in the same series
  Subject: ACPI / EC: Fix and clean up register access guarding logics.
The only case that is not covered is the inter-transaction guarding. And
there is no evidence that we need the inter-transaction guarding upon
reading the noted bug entries.

So it is time to remove the quirks and let the users to try again. If there
is a regression, the only thing we need to do is to restore the
inter-transaction guarding for the reported platforms.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12461
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-16 01:53:03 +02:00
Lv Zheng
15de603b04 ACPI / EC: Add module params for polling modes.
We have 2 polling modes in the EC driver:
1. busy polling: originally used for the MSI quirks. udelay() is used to
   perform register access guarding.
2. wait polling: normal code path uses wait_event_timeout() and it can be
   woken up as soon as the transaction is completed in the interrupt mode.
   It also contains the register acces guarding logic in case the interrupt
   doesn't arrive and the EC driver is about to advance the transaction in
   task context (the polling mode).
The wait polling is useful for interrupt mode to allow other tasks to use
the CPU during the wait.
But for the polling mode, the busy polling takes less time than the wait
polling, because if no interrupt arrives, the wait polling has to wait the
minimal HZ interval.

We have a new use case for using the busy polling mode. Some GPIO drivers
initialize PIN configuration which cause a GPIO multiplexed EC GPE to be
disabled out of the GPE register's control. Busy polling mode is useful
here as it takes less time than the wait polling. But the guarding logic
prevents it from responding even faster. We should spinning around the EC
status rather than spinning around the nop execution lasted a determined
period.

This patch introduces 2 module params for the polling mode switch and the
guard time, so that users can use the busy polling mode without the
guarding in case the guarding is not necessary. This is an example to use
the 2 module params for this purpose:
  acpi.ec_busy_polling acpi.ec_polling_guard=0

We've tested the patch on a test platform. The platform suffers from such
kind of the GPIO PIN issue. The GPIO driver resets all PIN configuration
and after that, EC interrupt cannot arrive because of the multiplexing.
Then the platform suffers from a long delay carried out by the
wait_event_timeout() as all further EC transactions will run in the polling
mode. We switched the EC driver to use the busy polling mechanism instead
of the wait timeout polling mechanism and the delay is still high:
[   44.283005] calling  PNP0C0B:00+ @ 1305, parent: platform
[   44.417548] call PNP0C0B:00+ returned 0 after 131323 usecs
And this patch can significantly reduce the delay:
[   44.502625] calling  PNP0C0B:00+ @ 1308, parent: platform
[   44.503760] call PNP0C0B:00+ returned 0 after 1103 usecs

Tested-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-16 01:51:18 +02:00
Lv Zheng
d8d031a605 ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.

The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
   implemented in the same place. We have an inter-transaction udelay() in
   acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
   if we can use per-device timestamp. This patch completes such merge to
   form a new ec_guard() function and collects all guarding related hidden
   logics in it.
   One hidden logic is: there is no inter-transaction guarding performed
   for non MSI quirk (wait polling mode), this patch skips
   inter-transaction guarding before wait_event_timeout() for the wait
   polling mode to reveal the hidden logic.
   The other hidden logic is: there is msleep() inter-transaction guarding
   performed when the GPE storming is observed. As after merging this
   commit:
     Commit: e1d4d90fc0
     Subject: ACPI / EC: Refine command storm prevention support
   EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
   acpi_ec_transaction_unlocked(), the msleep() guard logic will never
   happen now. Since no one complains such change, this logic is likely
   added during the old times where the EC race issues are not fixed and
   the bugs are false root-caused to the timing issue. This patch simply
   removes the out-dated logic. We can restore it by stop skipping
   inter-transaction guarding for wait polling mode.
   Two different delay values are defined for msleep() and udelay() while
   they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
   observed in noirq suspend/resume processes where polling mode is always
   used) before advance_transaction() is invoked ("wait polling" log is
   added before wait_event_timeout()). We can see 2 wait_event_timeout()
   invocations. This is because time_after() ensures a ">" validation while
   we only need a ">=" validation here:
   [   86.739909] ACPI: Waking up from system sleep state S3
   [   86.742857] ACPI : EC: 2: Increase command
   [   86.742859] ACPI : EC: ***** Command(RD_EC) started *****
   [   86.742861] ACPI : EC: ===== TASK (0) =====
   [   86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
   [   86.742873] ACPI : EC: EC_SC(W) = 0x80
   [   86.742876] ACPI : EC: ***** Event started *****
   [   86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   86.747966] ACPI : EC: ===== TASK (0) =====
   [   86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
   [   86.747978] ACPI : EC: EC_DATA(W) = 0x06
   [   86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   86.755969] ACPI : EC: ===== TASK (0) =====
   [   86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
   [   86.755993] ACPI : EC: EC_DATA(R) = 0x03
   [   86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
   [   86.755996] ACPI : EC: 1: Decrease command
   This patch corrects this by using time_before() instead in ec_guard():
   [   54.283146] ACPI: Waking up from system sleep state S3
   [   54.285414] ACPI : EC: 2: Increase command
   [   54.285415] ACPI : EC: ***** Command(RD_EC) started *****
   [   54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   54.285417] ACPI : EC: ===== TASK (0) =====
   [   54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
   [   54.285425] ACPI : EC: EC_SC(W) = 0x80
   [   54.285427] ACPI : EC: ***** Event started *****
   [   54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   54.287209] ACPI : EC: ===== TASK (0) =====
   [   54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
   [   54.287219] ACPI : EC: EC_DATA(W) = 0x06
   [   54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   54.291190] ACPI : EC: ===== TASK (0) =====
   [   54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
   [   54.291213] ACPI : EC: EC_DATA(R) = 0x03
   [   54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
   [   54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
   [   54.291216] ACPI : EC: 1: Decrease command

After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.

Except the time_before() change, all other changes do not change the
behavior of the EC driver.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-16 01:51:18 +02:00
Lv Zheng
373783e6e9 ACPI / EC: Remove irqs_disabled() check.
The following commit merges polling and interrupt modes for EC driver:
  Commit: 2a84cb9852 Mon Sep 17 00:00:00 2001
  Subject: ACPI: EC: Merge IRQ and POLL modes
The irqs_disabled() check introduced in it tries to fall into busy polling
mode when the context of ec_poll() cannot sleep.

Actually ec_poll() is ensured to be invoked in the contexts that can sleep
(from a sysfs /sys/kernel/debug/ec/ec0/io access, or from
acpi_evaluate_object(), or from acpi_ec_gpe_poller()). Without the MSI
quirk, we never saw the udelay() logic invoked. Thus this check is useless
and can be removed.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-16 01:51:17 +02:00
Lv Zheng
5ab82a11e5 ACPI / EC: Remove storming threashold enlarging quirk.
This patch removes the storming threashold enlarging quirk.

After applying the following commit, we can notice that there is no no-op
GPE handling invocation can be observed, thus it is unlikely that the
no-op counts can exceed the storming threashold:
  Commit: ca37bfdfbc
  Subject: ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
Even when the storming happens, we have already limited its affection to
the only transaction and no further transactions will be affected. This is
done by this commit:
  Commit: e1d4d90fc0
  Subject: ACPI / EC: Refine command storm prevention support

So it's time to remove this quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=45151
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-16 01:51:17 +02:00
Lv Zheng
7c0b2595da ACPI / EC: Update acpi_ec_is_gpe_raised() with new GPE status flag.
This patch updates acpi_ec_is_gpe_raised() according to the following
commit:
  Commit: 09af8e8290
  Subject: ACPICA: Events: Add support to return both enable/status register values for GPE and fixed event.
This is actually a no-op change as both the flags are defined to a same
value.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-16 01:51:17 +02:00
Chris Bainbridge
6b5eab5469 ACPI / EC: fix NULL pointer dereference in acpi_ec_remove_query_handler()
Use list_for_each_entry_safe for iterating because handler may be freed
in the loop.

BUG: unable to handle kernel NULL pointer dereference at 000000000000002c
IP: [<ffffffff814d69c8>] acpi_ec_put_query_handler+0x7/0x1a
Call Trace:
 acpi_ec_remove_query_handler+0x87/0x97
 acpi_smbus_hc_remove+0x2a/0x44 [sbshc]
 acpi_device_remove+0x7b/0x9a
 __device_release_driver+0x7e/0x110
 driver_detach+0xb0/0xc0
 bus_remove_driver+0x54/0xe0
 driver_unregister+0x2b/0x60
 acpi_bus_unregister_driver+0x10/0x12
 acpi_smb_hc_driver_exit+0x10/0x12 [sbshc]
 SyS_delete_module+0x1b8/0x210
 system_call_fastpath+0x12/0x6a

Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-22 04:12:35 +02:00
Lan Tianyu
1c832b3e85 ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
On some machines(E,G Mircosoft surface 3), ACPI battery depends on
the EC operation region and it has _DEP method which contains EC.
Current code doesn't support such devices whose dep_unmet will be
not be decreased after EC opregion handler being installed. This
blocks battery device to be attached with its driver. This patch
is to fix the issue.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=90161
Reported-and-tested-by: Lompik <lompik@voila.fr>
Tested-by: Valentin Lab <valentin.lab_bugzilla.kernel.org@kalysto.org>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-02 02:21:41 +02:00
Lv Zheng
770970f0b4 ACPI / EC: Add GPE reference counting debugging messages.
This patch enhances debugging with the GPE reference count messages added.

This kind of log entries can be used by the platform validators to validate
if there is an EC transaction broken because of firmware/driver bugs.

No functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-10 00:58:37 +01:00
Lv Zheng
3535a3c126 ACPI / EC: Cleanup logging/debugging splitter support.
This patch refines logging/debugging splitter support so that when DEBUG is
disabled, splitters won't be visible in the kernel logs while they are
still available for developers when DEBUG is enabled.

This patch also refines the splitters to mark the following handling
process boundaries:
  +++++: boundary of driver starting/stopping
         boundary of IRQ storming
  =====: boundary of transaction advancement
  *****: boundary of EC command
         boundary of EC query
  #####: boundary of EC _Qxx evaluation

The following 2 log entries are originally logged using pr_info() in order
to be used as the boot/suspend/resume log entries for the EC device, this
patch also restores them to pr_info() logging level:
 ACPI : EC: EC started
 ACPI : EC: EC stopped

In this patch, one log entry around "Polling quirk" is converted into
ec_dbg_raw() which doesn't contain the boundary marker.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-10 00:58:36 +01:00
Scot Doyle
92e4b1bcd6 ACPI / EC: Remove non-standard log emphasis
Remove unusual pr_info() visual emphasis introduced in ad479e7f47
"ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag".

Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
[ rjw: Change pr_info() to pr_debug() too in those places. ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-17 18:27:59 +01:00
Rafael J. Wysocki
37d11391c2 Revert "ACPI / EC: Add query flushing support"
Revert commit f252cb09e1 (ACPI / EC: Add query flushing support),
because it breaks system suspend on Acer Aspire S5.  The machine
just hangs solid at the last stage of suspend (after taking non-boot
CPUs offline).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-11 17:35:05 +01:00
Rafael J. Wysocki
e06bf91b59 Revert "ACPI / EC: Add GPE reference counting debugging messages"
Revert commit b5bca896ef (ACPI / EC: Add GPE reference counting
debugging messages), because it depends on commit f252cb09e1
(ACPI / EC: Add query flushing support) which breaks system suspend
on Acer Aspire S5 and needs to be reverted.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-11 17:33:23 +01:00
Lv Zheng
b5bca896ef ACPI / EC: Add GPE reference counting debugging messages
This patch enhances debugging with the GPE reference count messages added.
No functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 15:48:10 +01:00
Lv Zheng
f252cb09e1 ACPI / EC: Add query flushing support
This patch implementes the QR_EC flushing support.

Grace periods are implemented from the detection of an SCI_EVT to the
submission/completion of the QR_EC transaction. During this period, all
EC command transactions are allowed to be submitted.

Note that query periods and event periods are intentionally distiguished to
allow further improvements.
1. Query period: from the detection of an SCI_EVT to the sumission of the
   QR_EC command. This period is used for storming prevention, as currently
   QR_EC is deferred to a work queue rather than directly issued from the
   IRQ context even there is no other transactions pending, so malicous
   SCI_EVT GPE can act like "level triggered" to trigger a GPE storm. We
   need to be prepared for this. And in the future, we may change it to be
   a part of the advance_transaction() where we will try QR_EC submission
   in appropriate positions to avoid such GPE storming.
2. Event period: from the detection of an SCI_EVT to the completion of the
   QR_EC command. We may extend it to the completion of _Qxx evaluation.
   This is actually a grace period for event flushing, but we only flush
   queries due to the reason stated in known issue 1. That's also why we
   use EC_FLAGS_EVENT_xxx. During this period, QR_EC transactions need to
   pass the flushable submission check.

In this patch, the following flags are implemented:
1. EC_FLAGS_EVENT_ENABLED: this is derived from the old
   EC_FLAGS_QUERY_PENDING flag which can block SCI_EVT handlings.
   With this flag, the logics implemented by the original flag are
   extended:
   1. Old logic: unless both of the flags are set, the event poller will
                 not be scheduled, and
   2. New logic: as soon as both of the flags are set, the evet poller will
                 be scheduled.
2. EC_FLAGS_EVENT_DETECTED: this is also derived from the old
   EC_FLAGS_QUERY_PENDING flag which can block SCI_EVT detection. It thus
   can be used to indicate the storming prevention period for query
   submission.
   acpi_ec_submit_request()/acpi_ec_complete_request() are invoked to
   implement this period so that acpi_set_gpe() can be invoked under the
   "reference count > 0" condition.
3. EC_FLAGS_EVENT_PENDING: this is newly added to indicate the grace period
   for event flushing (query flushing for now).
   acpi_ec_submit_request()/acpi_ec_complete_request() are invoked to
   implement this period so that the flushing process can wait until the
   event handling (query transaction for now) to be completed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=82611
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 15:48:10 +01:00
Lv Zheng
e1d4d90fc0 ACPI / EC: Refine command storm prevention support
This patch refines EC command storm prevention support.

Current command storming code is wrong, when the storming condition is
detected, it only flags the condition without doing anything for the
current command but performing storming prevention for the follow-up
commands. So:
1. The first command which suffers from the storming still suffers from
   storming.
2. The follow-up commands which may not suffer from the storming are
   unconditionally forced into the storming prevention mode.
Ideally, we should only enable storm prevention immediately after detection
for the current command so that the next command can try the
power/performance efficient interrupt mode again.

This patch improves the command storm prevention by disabling GPE right
after the detection and re-enabling it right before completing the command
transaction using the GPE storming prevention APIs. This thus deploys the
following GPE handling model:
1. acpi_enable_gpe()/acpi_disable_gpe() for reference count changes:
   This set of APIs are used for EC usage reference counting.
2. acpi_set_gpe(ACPI_GPE_ENABLE)/acpi_set_gpe(ACPI_GPE_DISABLE):
   This set of APIs are used for preventing GPE storm. They must be invoked
   when the reference count > 0.
   Note that as the storming prevention should always happen when there is
   an outstanding request, or GPE enabling value will be messed up by the
   races. This patch also adds BUG_ON() to enforces this rule to prevent
   future bugs.

The msleep(1) used after completing a transaction is useless now as this
sounds like a guard time only useful for platforms that need the
EC_FLAGS_MSI quirks while we have fixed GPE race issues using the previous
raw handler mode enabling. It is kept to avoid regressions. A seperate
patch which deletes EC_FLAGS_MSI quirks should take care of deleting it.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 15:48:09 +01:00
Lv Zheng
9887d22add ACPI / EC: Add command flushing support.
This patch implements the EC command flushing support.

During the grace period indicated by EC_FLAGS_STARTED and EC_FLAGS_STOPPED,
all submitted EC command transactions can be completed and new submissions
are prevented before suspending so that the EC hardware can be ensured to
be in the idle state when the system is resumed.

There is a good indicator for flush support:
All acpi_ec_submit_request() is invoked after checking driver state with
acpi_ec_started() except the first one. This means all code paths can be
flushed as fast as possible by discarding the requests occurred after the
flush operation. The reference increased for such kind of code path is
wrapped by acpi_ec_submit_flushable_request().

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 15:48:09 +01:00
Lv Zheng
ad479e7f47 ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag
By using the 2 flags, we can indicate an inter-mediate state where the
current transactions should be completed while the new transactions should
be dropped.

The comparison of the old flag and the new flags:
  Old			New
  about to set BLOCKED	STOPPED set / STARTED set
  BLOCKED set		STOPPED clear / STARTED clear
  BLOCKED clear		STOPPED clear / STARTED set
A new period can be indicated by the 2 flags. The new period is between the
point where we are about to set BLOCKED and the point when the BLOCKED is
set. The new flags facilitate us with acpi_ec_started() check to allow the
EC transaction to be submitted during the new period. This period thus can
be used as a grace period for the EC transaction flushing.

The only functional change after applying this patch is:
1. The GPE enabling/disabling is protected by the EC specific lock. We can
   do this because of recent ACPICA GPE API enhancement. This is reasonable
   as the GPE disabling/enabling state should only be determined by the EC
   driver's state machine which is protected by the EC spinlock.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 15:48:09 +01:00
Lv Zheng
a8d4fc227f ACPI / EC: Update revision due to raw handler mode.
The bug fixes around GPE races have been done to the EC driver by the
previous commits. This patch increases the revision to 3 to indicate the
behavior differences between the old and the new drivers. The
copyright/authorship notices are also updated.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 15:42:18 +01:00
Lv Zheng
9e295ac14d ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp.
Timeout in the ec_poll() doesn't refer to the last register access time. It
thus can win the competition against the acpi_ec_gpe_handler() if a
transaction takes longer than 1ms but individual register accesses are less
than 1ms.  In some cases, it can make the following silicon bug easier to
be triggered:
 GPE EN is not wired to the GPE trigger line, so when GPE STS is already
 set when 1 is written to GPE EN, no GPE can be triggered.

This patch adds register access timestamp reference support for ec_poll()
to reduce the number of ec_poll() invocations.

Reported-by: Venkat Raghavulu <venkat.raghavulu@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 15:42:18 +01:00
Lv Zheng
ca37bfdfbc ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.

There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
   registers, so their interrupts can be disabled/enabled/acknowledged
   without using the super IRQ controller provided functions. While there
   is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
   complete the platform firmware provided functionailities using ec_poll()
   after all GPEs are disabled (see acpi_ec_block_transactions()), which
   means the EC driver will drive the EC GPE out of the GPE core's control.

Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
   disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
     Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
     GPE handler and re-enabled (EN=1) after invoking a GPE handler in
     acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
     (STS=1).
       =================================================================
       (event pending A)
       =================================================================
       acpi_ev_gpe_dispatch()    ec_poll()
         EN=0
         STS=0
         acpi_ec_gpe_handler()
       *****************************************************************
       (event handling A)
           Lock(EC)
           advance_transaction()
             EC_SC read
       =================================================================
       (event pending B)
       =================================================================
             EC_SC handled
           Unlock(EC)
       *****************************************************************
       *****************************************************************
       (event handling B)
                                   Lock(EC)
                                   advance_transaction()
                                     EC_SC read
       =================================================================
       (event pending C)
       =================================================================
                                     EC_SC handled
                                   Unlock(EC)
       *****************************************************************
           EN=1
   This race condition is the root cause of different issues on different
   silicon variations.
   A. Silicon variation A:
      On some platforms, GPE will be triggered due to "writing 1 to EN when
      STS=1". This is because both EN and STS lines are wired to the GPE
      trigger line.
      1. Issue 1:
         We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
         This is because:
         a. event pending B: An event can arrive after ACPICA's GPE
            clearing performed in acpi_ev_gpe_dispatch(), this event may
            fail to be detected by EC_SC read that is performed before its
            arrival;
         b. event handling B: The event can be handled in ec_poll() because
            EC lock is released after acpi_ec_gpe_handler() invocation;
         c. There is no code in ec_poll() to clear STS but the GPE can
            still be triggered by the EN=1 write performed in
            acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
            invocation;
         d. As no-op GPE handler invocations are counted by the EC driver
            to trigger the command storming conditions, the wrong no-op
            GPE handler invocations thus can easily trigger wrong command
            storming conditions.
         Note 1:
         If we removed GPE disabling/enabling code from
         acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
         triggered by the event arriving after the GPE clearing and before
         the GPE handling on both silicon variation A and B. This can only
         occur if the CPU is very slow (timing slice between STS=0 write
         and EC_SC read should be short enough before hardware sets another
         GPE indication). Thus this is very rare and is not what we need to
         fix.
   B. Silicon variation B:
      On other platforms, GPE may not be triggered due to "writing 1 to EN
      when STS=1". This is because only STS line is wired to the GPE
      trigger line.
      2. Issue 2:
         We can see GPE loss on such platforms. This is because:
         a. event pending B vs. event handling A: An event can arrive after
            ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
            event pending C vs. event handling B: An event can arrive after
            Linux's GPE handling performed in ec_poll(),
            these events may fail to be detected by EC_SC read that is
            performed before their arrival;
         b. The GPE cannot be triggered by EN=1 write performed in
            acpi_ev_finish_gpe();
         c. If no polling mechanism is implemented in the driver for the
            pending event (for example, SCI_EVT), this event is lost due to
            no GPE being triggered.
         Note 2:
         On most platforms, there might be another rule that GPE may not be
         triggered due to "writing 1 to STS when STS=1 and EN=1".
         Then on silicon variation B, an even worse case is if the issue 2
         event loss happens, further events may never trigger GPE again on
         such platforms due to being blocked by the current STS=1. Unless
         someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
   driver:
     Originally, GPE status is checked in ACPICA core but not checked in
     the GPE handler. Thus since the status checking and handling is not
     locked, it can be interrupted by another handling path.
       =================================================================
       (event pending A)
       =================================================================
       acpi_ev_gpe_detect()        ec_poll()
         if (EN==1 && STS==1)
       *****************************************************************
       (event handling A)
                                     Lock(EC)
                                     advance_transaction()
                                       EC_SC read
                                       EC_SC handled
                                     Unlock(EC)
       *****************************************************************
         acpi_ev_gpe_dispatch()
           EN=0
           STS=0
           acpi_ec_gpe_handler()
       *****************************************************************
       (event handling B)
             Lock(EC)
             advance_transaction()
               EC_SC read
             Unlock(EC)
       *****************************************************************
      3. Issue 3:
         We can see no-op acpi_ec_gpe_handler() invoked on both silicon
         variation A and B. This is because:
         a. event pending A: An event can arrive to trigger an EC GPE and
            ACPICA checks it and is about to invoke the EC GPE handler;
         b. event handling A: The event can be handled in ec_poll() because
            EC lock is not held after the GPE status checking;
         c. event handling B: Then when the EC GPE handler is invoked, it
            becomes a no-op GPE handler invocation.
         d. As no-op GPE handler invocations are counted by the EC driver
            to trigger the command storming conditions, the wrong no-op
            GPE handler invocations thus can easily trigger wrong command
            storming conditions.
      Note 3:
      This no-op GPE handler invocation is rare because the time between
      the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
      the timeout value waited in ec_poll(). So most of the no-op GPE
      handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
   acpi_enable_gpe():
     During runtime, acpi_enable_gpe() can be invoked by the EC storming
     prevention code. When it is invoked, GPE may be flagged (STS=1).
       =================================================================
       (event pending A)
       =================================================================
       acpi_ev_gpe_dispatch()    acpi_ec_transaction()
         EN=0
         STS=0
         acpi_ec_gpe_handler()
       *****************************************************************
       (event handling A)
           Lock(EC)
           advance_transaction()
             EC_SC read
             EC_SC handled
           Unlock(EC)
       *****************************************************************
         EN=1 ?
                                   Lock(EC)
                                   Unlock(EC)
       =================================================================
       (event pending B)
       =================================================================
                                   acpi_enable_gpe()
                                     STS=0
                                     EN=1
    4. Issue 4:
       We can see GPE loss on both silicon variation A and B platforms.
       This is because:
       a. event pending B: An event can arrive right before ACPICA's GPE
          clearing performed in acpi_enable_gpe();
       b. If the GPE is cleared when GPE is disabled, then EN=1 write in
          acpi_enable_gpe() cannot trigger this GPE;
       c. If no polling mechanism is implemented in the driver for this
          event (for example, SCI_EVT), this event is lost due to no GPE
          being triggered.
       Note 4:
       Currently we don't have this issue, but after we switch the EC
       driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
       of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
       will be abandoned.

There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).

This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
   using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
   spin lock held. This can fix issue 1 as it makes a non frequent GPE
   enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
   acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
   issue 4. And this can also help to fix issue 1 as it makes a no sudden
   GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
   in advance_transaction(). This can finally fix issue 1 even in a
   frequent GPE enabling/disabling environment. And this can also finally
   fix issue 3 when issue 2 is fixed.
   Note 3:
   GPE clearing is edge triggered W1C, which means we can clear it right
   before handling it. Since all EC GPE indications are handled in
   advance_transaction() by previous commits, we can now move GPE clearing
   into it to implement the correct GPE clearing.
   Note 4:
   We can use acpi_set_gpe() which is not shared GPE safer instead of
   acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
   other hardware, which is mentioned in the ACPI specification 5.0, 12.6
   Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
   cannot be shared)". So we can stop using shared GPE safer APIs
   acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
   cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
   clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
   right after EN=1 writes. This can fix issue 2.

After applying this patch, without frequent GPE enablings considered:
       =================================================================
       (event pending A)
       =================================================================
       acpi_ec_gpe_handler()     ec_poll()
       *****************************************************************
       (event handling A)
         Lock(EC)
           advance_transaction()
             if STS==1
               STS=0
             EC_SC read
       =================================================================
       (event pending B)
       =================================================================
             EC_SC handled
         Unlock(EC)
       *****************************************************************
       *****************************************************************
       (event handling B)
                                   Lock(EC)
                                     advance_transaction()
                                       if STS==1
                                         STS=0
                                       EC_SC read
       =================================================================
       (event pending C)
       =================================================================
                                       EC_SC handled
                                   Unlock(EC)
       *****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.

After applying this patch, with frequent GPE enablings considered:
       =================================================================
       (event pending A)
       =================================================================
       acpi_ec_gpe_handler()     acpi_ec_transaction()
       *****************************************************************
       (event handling A)
         Lock(EC)
           advance_transaction()
             if STS==1
               STS=0
             EC_SC read
       =================================================================
       (event pending B)
       =================================================================
             EC_SC handled
         Unlock(EC)
       *****************************************************************
       *****************************************************************
       (event handling B)
                                   Lock(EC)
                                     EN=1
                                     if STS==1
                                       advance_transaction()
                                         if STS==1
                                           STS=0
                                         EC_SC read
       =================================================================
       (event pending C)
       =================================================================
                                         EC_SC handled
                                   Unlock(EC)
       *****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.

As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
  Lock(EC)
  if (STS==1)
    STS=0
  EC_SC read
  EC_SC handled
  Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.

Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.

Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 15:42:18 +01:00
Lv Zheng
550b3aac5a ACPI / EC: Cleanup QR_EC related code
The QR_EC related code pieces have redundants, this patch merges them into
acpi_ec_query() which invokes acpi_ec_transaction() where EC mutex and the
global lock are already held. After doing so, query handler traversal still
need to be locked by EC mutex after invoking acpi_ec_transaction().

Note that EC event handling is sequential. We fetch one event from firmware
event queue and process it until 0x00 or error returned. So we don't need
to hold mutex for whole acpi_ec_clear() process to determine whether we
should continue to drain. And for the same reason, we don't need to hold
mutex for the whole procedure from the QR_EC transaction to the query
handler traversal.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:06:49 +01:00
Lv Zheng
74443bbed7 ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.

The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).

Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
 If no EC GPE IRQs are arrived after that, EC driver cannot detect this
 event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.

Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.

Notes:
1. After introducing the 2 SCI_EVT related handlings into
   advance_transaction(), a next QR_EC can be queued right after writing
   the current QR_EC command and before reading the event. But this still
   hasn't implemented the draining behavior as the draining support
   requires:
     If a previous returned event value isn't 0x00, a draining QR_EC need
     to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
   item to avoid invoking kmalloc() in the atomic context. We can do this
   because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
   multiple work items (created by acpi_os_execute()), this can be covered
   by only using a single work item. But this patch still keeps this flag
   as there are different usages in the driver initialization steps relying
   on this flag.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:06:49 +01:00
Lv Zheng
f3e1432951 ACPI / EC: Fix a code path that global lock is not held
Currently QR_EC is queued up on CPU 0 to be safe with SMM because there is
no global lock held for acpi_ec_gpe_query(). As we are about to move QR_EC
to a non CPU 0 bound work queue to avoid invoking kmalloc() in
advance_transaction(), we have to acquire global lock for the new QR_EC
work item to avoid regressions.

Known issue:
1. Global lock for acpi_ec_clear().
   This is an existing issue that acpi_ec_clear() which invokes
   acpi_ec_sync_query() also suffers from the same issue. But this patch's
   target is only the code to invoke acpi_ec_sync_query() in a CPU 0 bound
   work queue item, and the acpi_ec_clear() can be automatically fixed by
   further patch that merges the redundant code, so it is left unchanged.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:06:49 +01:00
Lv Zheng
c2cf5769fa ACPI / EC: Fix returning values in acpi_ec_sync_query()
The returning value of acpi_os_execute() is erroneously handled as errno.
This patch corrects it by returning EBUSY to indicate the work queue item
creation failure.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:06:49 +01:00
Lv Zheng
01305d4139 ACPI / EC: Add reference counting for query handlers
This patch adds reference counting for query handlers in order to eliminate
kmalloc()/kfree() usage.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:06:48 +01:00
Lv Zheng
0c78808f51 ACPI / EC: Cleanup transaction wakeup code
This patch moves transaction wakeup code into advance_transaction().
No functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:06:48 +01:00
Lv Zheng
1741acea75 ACPI / EC: Fix unexpected ec_remove_handlers() invocations
The ec_remove_handlers() is invoked without checking
EC_FLAGS_HANDLERS_INSTALLED, this patch enhances this check to avoid issues
that acpi_disable_gpe() is invoked unexpectedly to reduce the GPE runtime
count. This may happen when the EC handler installation failed on some
platforms.

Reported-by: Venkat Raghavulu <venkat.raghavulu@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-15 15:10:23 +01:00
Lv Zheng
7914900110 ACPI / EC: Fix regression due to conflicting firmware behavior between Samsung and Acer.
It is reported that Samsung laptops that need to poll events are broken by
the following commit:
 Commit 3afcf2ece4
 Subject: ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set

The behaviors of the 2 vendor firmwares are conflict:
 1. Acer: OSPM shouldn't issue QR_EC unless SCI_EVT is set, firmware
         automatically sets SCI_EVT as long as there is event queued up.
 2. Samsung: OSPM should issue QR_EC whatever SCI_EVT is set, firmware
            returns 0 when there is no event queued up.

This patch is a quick fix to distinguish the behaviors to make Acer
behavior only effective for Acer EC firmware so that the breakages on
Samsung EC firmware can be avoided.

Fixes: 3afcf2ece4 (ACPI / EC: Add support to disallow QR_EC to be issued ...)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: 3.17+ <stable@vger.kernel.org> # 3.17+
[ rjw : Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-29 16:52:35 +01:00
Lv Zheng
df9ff91801 Revert "ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC"
It is reported that the following commit breaks Samsung hardware:
 Commit: 558e4736f2.
 Subject: ACPI / EC: Add support to disallow QR_EC to be issued before
          completing previous QR_EC

Which means the Samsung behavior conflicts with the Acer behavior.

1. Samsung may behave like:
   [ +event 1 ] SCI_EVT set
   [ +event 2 ] SCI_EVT set
                              write QR_EC
                              read event
   [ -event 1 ] SCI_EVT clear
   Without the above commit, Samsung can work:
   [ +event 1 ] SCI_EVT set
   [ +event 2 ] SCI_EVT set
                              write QR_EC
                              CAN prepare next QR_EC as SCI_EVT=1
                              read event
   [ -event 1 ] SCI_EVT clear
                              write QR_EC
                              read event
   [ -event 2 ] SCI_EVT clear
   With the above commit, Samsung cannot work:
   [ +event 1 ] SCI_EVT set
   [ +event 2 ] SCI_EVT set
                              write QR_EC
                              read event
   [ -event 1 ] SCI_EVT clear
                              CANNOT prepare next QR_EC as SCI_EVT=0
2. Acer may behave like:
   [ +event 1 ] SCI_EVT set
   [ +event 2 ]
                              write QR_EC
                              read event
   [ -event 1 ] SCI_EVT clear
   [ +event 2 ] SCI_EVT set
   Without the above commit, Acer cannot work when there is only 1 event:
   [ +event 1 ] SCI_EVT set
                              write QR_EC
                              can prepared next QR_EC as SCI_EVT=1
                              read event
   [ -event 1 ] SCI_EVT clear
                              CANNOT write QR_EC as SCI_EVT=0
   With the above commit, Acer can work:
   [ +event 1 ] SCI_EVT set
   [ +event 2 ]
                              write QR_EC
                              read event
   [ -event 1 ] SCI_EVT set
                              can prepare next QR_EC because SCI_EVT=0
                              CAN write QR_EC as SCI_EVT=1

Since Acer can also work with only the following commit applied:
 Commit: 3afcf2ece4
 Subject: ACPI / EC: Add support to disallow QR_EC to be issued when
          SCI_EVT isn't set
commit 558e4736f2 can be reverted.

Fixes: 558e4736f2 (ACPI / EC: Add support to disallow QR_EC to be issued ...)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: 3.17+ <stable@vger.kernel.org> # 3.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-29 16:48:33 +01:00
Lv Zheng
7a73e60e39 ACPI / EC: Cleanup coding style.
This patch cleans up the following coding style issues that are detected by
scripts/checkpatch.pl:
 ERROR: code indent should use tabs where possible
 ERROR: "foo * bar" should be "foo *bar"
 WARNING: Missing a blank line after declarations
 WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
 WARNING: void function return statements are not generally useful
 WARNING: else is not generally useful after a break or return
 WARNING: break is not useful after a goto or return
 WARNING: braces {} are not necessary for single statement blocks
 WARNING: line over 80 characters
 WARNING: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.txt
No functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:44:54 +02:00
Lv Zheng
d3090b6a6c ACPI / EC: Refine event/query debugging messages.
This patch refines event/query debugging messages to use a unified format
as commands. Developers can clearly find different processes by checking
different log seperators. No functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:44:54 +02:00
Lv Zheng
e34c0e2bb4 ACPI / EC: Add detailed command/query debugging information.
Developers really don't need to translate EC commands in mind. This patch
adds detailed debugging information for the EC commands.
The address can be found in the follow-up sequential EC_DATA(W) accesses,
thus this patch also removes some of the redundant address information.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:44:53 +02:00
Lv Zheng
459572a750 ACPI / EC: Enhance the logs to apply to QR_EC transactions.
Currently some logs are applied to new transactions, but QR_EC transactions
are not included. This patch merges the code path to make the logs also
applying to the QR_EC transactions.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:44:53 +02:00
Lv Zheng
c95f25b036 ACPI / EC: Add CPU ID to debugging messages.
This patch adds CPU ID to the context entries' debugging output. no
functional changes.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:44:53 +02:00
Lan Tianyu
777cb38295 ACPI / EC: Add msi quirk for Clevo W350etq
Clevo W350etq's EC will not produce GPE interrupt some time after
booting. The ACPI notify event won't trigger when the issue takes
place. After debugging, adding msi quirk for the machine can fix
the issue. This patch is to add msi quirk for the machine.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Reported-and-tested-by: qbanin@gmail.com
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-02 01:57:56 +02:00
Lv Zheng
558e4736f2 ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC
There is platform refusing to respond QR_EC when SCI_EVT isn't set
which is Acer Aspire V5-573G.

By disallowing QR_EC to be issued before the previous one has been
completed we are able to reduce the possibilities to trigger issues on
such platforms.

Note that this fix can only reduce the occurrence rate of this issue, but
this issue may still occur when such a platform doesn't clear SCI_EVT
before or immediately after completing the previous QR_EC transaction.
This patch cannot fix the CLEAR_ON_RESUME quirk which also relies on
the assumption that the platforms are able to respond even when SCI_EVT
isn't set.

But this patch is still useful as it can help to reduce the number of
scheduled QR_EC work items.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=82611
Reported-and-tested-by: Alexander Mezin <mezin.alexander@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-26 02:15:51 +02:00
Lv Zheng
3afcf2ece4 ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set
There is a platform refusing to respond QR_EC when SCI_EVT isn't set
(Acer Aspire V5-573G).

Currently, we rely on the behaviour that the EC firmware can respond
something (for example, 0x00 to indicate "no outstanding events") to
QR_EC even when SCI_EVT is not set, but the reporter has complained
about AC/battery pluging/unpluging and video brightness change delay
on that platform.

This is because the work item that has issued QR_EC has to wait until
timeout in this case, and the _Qxx method evaluation work item queued
after QR_EC one is delayed.

It sounds reasonable to fix this issue by:
 1. Implementing SCI_EVT sanity check before issuing QR_EC in the EC
    driver's main state machine.
 2. Moving QR_EC issuing out of the work queue used by _Qxx evaluation
    to a seperate IRQ handling thread.

This patch fixes this issue using solution 1.

By disallowing QR_EC to be issued when SCI_EVT isn't set, we are able to
handle such platform in the EC driver's main state machine. This patch
enhances the state machine in this way to survive with such malfunctioning
EC firmware.

Note that this patch can also fix CLEAR_ON_RESUME quirk which also relies
on the assumption that the platforms are able to respond even when SCI_EVT
isn't set.

Fixes: c0d653412f ACPI / EC: Fix race condition in ec_transaction_completed()
Link: https://bugzilla.kernel.org/show_bug.cgi?id=82611
Reported-and-tested-by: Alexander Mezin <mezin.alexander@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-26 02:15:47 +02:00
Colin Ian King
ed4b197ddd ACPI / EC: Free saved_ec on error exit path
Smatch detected two memory leaks on saved_ec:

drivers/acpi/ec.c:1070 acpi_ec_ecdt_probe() warn: possible
  memory leak of 'saved_ec'
drivers/acpi/ec.c:1109 acpi_ec_ecdt_probe() warn: possible
  memory leak of 'saved_ec'

Free saved_ec on these two error exit paths to stop the memory
leak.  Note that saved_ec maybe null, but kfree on null is allowed.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 13:20:30 +02:00
Lv Zheng
dd43de20f5 ACPI / EC: Add detailed fields debugging support of EC_SC(R).
Developers really don't need to translate EC_SC(R) in mind as long as the
field details are decoded in the debugging message.

Tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
Tested-by: Arthur Chen <axchen@nvidia.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 12:56:11 +02:00
Lv Zheng
4a3f6b5bf3 ACPI / EC: Update revision due to recent changes
The bug fixes and asynchronous improvements have been done to the EC driver
by the previous commits. This patch increases the revision to 2.2 to
indicate the behavior differences between the old and the new drivers. The
copyright/authorship notices are also updated.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 12:55:45 +02:00
Lv Zheng
c0d653412f ACPI / EC: Fix race condition in ec_transaction_completed()
There is a race condition in ec_transaction_completed().

When ec_transaction_completed() is called in the GPE handler, it could
return true because of (ec->curr == NULL). Then the wake_up() invocation
could complete the next command unexpectedly since there is no lock between
the 2 invocations. With the previous cleanup, the IBF=0 waiter race need
not be handled any more. It's now safe to return a flag from
advance_condition() to indicate the requirement of wakeup, the flag is
returned from a locked context.

The ec_transaction_completed() is now only invoked by the ec_poll() where
the ec->curr is ensured to be different from NULL.

After cleaning up, the EVT_SCI=1 check should be moved out of the wakeup
condition so that an EVT_SCI raised with (ec->curr == NULL) can trigger a
QR_SC command.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
Reported-by: Barton Xu <tank.xuhan@gmail.com>
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Arthur Chen <axchen@nvidia.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 12:55:32 +02:00
Lv Zheng
9b80f0f73a ACPI / EC: Remove duplicated ec_wait_ibf0() waiter
After we've added the first command byte write into advance_transaction(),
the IBF=0 waiter is duplicated with the command completion waiter
implemented in the ec_poll() because:
   If IBF=1 blocked the first command byte write invoked in the task
   context ec_poll(), it would be kicked off upon IBF=0 interrupt or timed
   out and retried again in the task context.

Remove this seperate and duplicate IBF=0 waiter.  By doing so we can
reduce the overall number of times to access the EC_SC(R) status
register.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
Reported-by: Barton Xu <tank.xuhan@gmail.com>
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Arthur Chen <axchen@nvidia.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 12:54:03 +02:00
Lv Zheng
f92fca0060 ACPI / EC: Add asynchronous command byte write support
Move the first command byte write into advance_transaction() so that all
EC register accesses that can affect the command processing state machine
can happen in this asynchronous state machine advancement function.

The advance_transaction() function then can be a complete implementation
of an asyncrhonous transaction for a single command so that:
 1. The first command byte can be written in the interrupt context;
 2. The command completion waiter can also be used to wait the first command
    byte's timeout;
 3. In BURST mode, the follow-up command bytes can be written in the
    interrupt context directly, so that it doesn't need to return to the
    task context. Returning to the task context reduces the throughput of
    the BURST mode and in the worst cases where the system workload is very
    high, this leads to the hardware driven automatic BURST mode exit.

In order not to increase memory consumption, convert 'done' into 'flags'
to contain multiple indications:
 1. ACPI_EC_COMMAND_COMPLETE: converting from original 'done' condition,
    indicating the completion of the command transaction.
 2. ACPI_EC_COMMAND_POLL: indicating the availability of writing the first
    command byte. A new command can utilize this flag to compete for the
    right of accessing the underlying hardware. There is a follow-up bug
    fix that has utilized this new flag.

The 2 flags are important because it also reflects a key concept of IO
programs' design used in the system softwares. Normally an IO program
running in the kernel should first be implemented in the asynchronous way.
And the 2 flags are the most common way to implement its synchronous
operations on top of the asynchronous operations:
1. POLL: This flag can be used to block until the asynchronous operations
         can happen.
2. COMPLETE: This flag can be used to block until the asynchronous
             operations have completed.
By constructing code cleanly in this way, many difficult problems can be
solved smoothly.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
Reported-by: Barton Xu <tank.xuhan@gmail.com>
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Arthur Chen <axchen@nvidia.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 12:52:02 +02:00
Lv Zheng
66b42b78bc ACPI / EC: Avoid race condition related to advance_transaction()
The advance_transaction() will be invoked from the IRQ context GPE handler
and the task context ec_poll(). The handling of this function is locked so
that the EC state machine are ensured to be advanced sequentially.

But there is a problem. Before invoking advance_transaction(), EC_SC(R) is
read. Then for advance_transaction(), there could be race condition around
the lock from both contexts. The first one reading the register could fail
this race and when it passes the stale register value to the state machine
advancement code, the hardware condition is totally different from when
the register is read. And the hardware accesses determined from the wrong
hardware status can break the EC state machine. And there could be cases
that the functionalities of the platform firmware are seriously affected.
For example:
 1. When 2 EC_DATA(W) writes compete the IBF=0, the 2nd EC_DATA(W) write may
    be invalid due to IBF=1 after the 1st EC_DATA(W) write. Then the
    hardware will either refuse to respond a next EC_SC(W) write of the next
    command or discard the current WR_EC command when it receives a EC_SC(W)
    write of the next command.
 2. When 1 EC_SC(W) write and 1 EC_DATA(W) write compete the IBF=0, the
    EC_DATA(W) write may be invalid due to IBF=1 after the EC_SC(W) write.
    The next EC_DATA(R) could never be responded by the hardware. This is
    the root cause of the reported issue.

Fix this issue by moving the EC_SC(R) access into the lock so that we can
ensure that the state machine is advanced consistently.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
Reported-by: Barton Xu <tank.xuhan@gmail.com>
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Arthur Chen <axchen@nvidia.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 12:50:25 +02:00
Kieran Clancy
3eba563e28 ACPI / EC: Process rather than discard events in acpi_ec_clear
Address a regression caused by commit ad332c8a45:
(ACPI / EC: Clear stale EC events on Samsung systems)

After the earlier patch, there was found to be a race condition on some
earlier Samsung systems (N150/N210/N220). The function acpi_ec_clear was
sometimes discarding a new EC event before its GPE was triggered by the
system. In the case of these systems, this meant that the "lid open"
event was not registered on resume if that was the cause of the wake,
leading to problems when attempting to close the lid to suspend again.

After testing on a number of Samsung systems, both those affected by the
previous EC bug and those affected by the race condition, it seemed that
the best course of action was to process rather than discard the events.
On Samsung systems which accumulate stale EC events, there does not seem
to be any adverse side-effects of running the associated _Q methods.

This patch adds an argument to the static function acpi_ec_sync_query so
that it may be used within the acpi_ec_clear loop in place of
acpi_ec_query_unlocked which was used previously.

With thanks to Stefan Biereigel for reporting the issue, and for all the
people who helped test the new patch on affected systems.

Fixes: ad332c8a45 (ACPI / EC: Clear stale EC events on Samsung systems)
References: https://lkml.kernel.org/r/532FE3B2.9060808@biereigel-wb.de
References: https://bugzilla.kernel.org/show_bug.cgi?id=44161#c173
Reported-by: Stefan Biereigel <stefan@biereigel.de>
Signed-off-by: Kieran Clancy <clancy.kieran@gmail.com>
Tested-by: Stefan Biereigel <stefan@biereigel.de>
Tested-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Nicolas Porcel <nicolasporcel06@gmail.com>
Tested-by: Maurizio D'Addona <mauritiusdadd@gmail.com>
Tested-by: Juan Manuel Cabo <juanmanuel.cabo@gmail.com>
Tested-by: Giannis Koutsou <giannis.koutsou@gmail.com>
Tested-by: Kieran Clancy <clancy.kieran@gmail.com>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-29 23:07:38 +02:00
Kieran Clancy
ad332c8a45 ACPI / EC: Clear stale EC events on Samsung systems
A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
continue to log events during sleep (lid open/close, AC plug/unplug,
battery level change), which accumulate in the EC until a buffer fills.
After the buffer is full (tests suggest it holds 8 events), GPEs stop
being triggered for new events. This state persists on wake or even on
power cycle, and prevents new events from being registered until the EC
is manually polled.

This is the root cause of a number of bugs, including AC not being
detected properly, lid close not triggering suspend, and low ambient
light not triggering the keyboard backlight. The bug also seemed to be
responsible for performance issues on at least one user's machine.

Juan Manuel Cabo found the cause of bug and the workaround of polling
the EC manually on wake.

The loop which clears the stale events is based on an earlier patch by
Lan Tianyu (see referenced attachment).

This patch:
 - Adds a function acpi_ec_clear() which polls the EC for stale _Q
   events at most ACPI_EC_CLEAR_MAX (currently 100) times. A warning is
   logged if this limit is reached.
 - Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
   system vendor is Samsung. This check could be replaced by several
   more specific DMI vendor/product pairs, but it's likely that the bug
   affects more Samsung products than just the five series mentioned
   above. Further, it should not be harmful to run acpi_ec_clear() on
   systems without the bug; it will return immediately after finding no
   data waiting.
 - Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
 - Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()

References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
References: https://bugzilla.kernel.org/attachment.cgi?id=126801
Suggested-by: Juan Manuel Cabo <juanmanuel.cabo@gmail.com>
Signed-off-by: Kieran Clancy <clancy.kieran@gmail.com>
Reviewed-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Kieran Clancy <clancy.kieran@gmail.com>
Tested-by: Juan Manuel Cabo <juanmanuel.cabo@gmail.com>
Tested-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Maurizio D'Addona <mauritiusdadd@gmail.com>
Tested-by: San Zamoyski <san@plusnet.pl>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06 13:27:23 +01:00
Rafael J. Wysocki
bcc7201a91 Merge branches 'acpi-gpe', 'acpi-video', 'acpi-thermal', 'acpi-processor', 'acpi-sleep'
* acpi-gpe:
  ACPI / EC: disable GPE before removing GPE handler
  ACPI / Button: Fix enabling button GPEs twice

* acpi-video:
  ACPI: Blacklist Win8 OSI for some HP laptop 2013 models
  ACPI / video: Fix typo in video_detect.c

* acpi-thermal:
  ACPI / thermal: remove const from thermal_zone_device_ops declaration

* acpi-processor:
  ACPI / scan: bail out early if failed to parse APIC ID for CPU

* acpi-sleep:
  ACPI / sleep: remove panic in case hardware has changed after S4
2014-01-12 23:46:55 +01:00
Rashika
b8a0b0d199 ACPI / EC: Remove unused functions and add prototype declaration in internal.h
Adds the prototype declarations of functions acpi_ec_add_query_handler()
and acpi_ec_remove_query_handler() in header file internal.h and removes
unused functions ec_burst_enable() and ec_burst_disable() in ec.c.

This eliminates the following warnings in ec.c:
drivers/acpi/ec.c:393:5: warning: no previous prototype for ‘ec_burst_enable’ [-Wmissing-prototypes]
drivers/acpi/ec.c:402:5: warning: no previous prototype for ‘ec_burst_disable’ [-Wmissing-prototypes]
drivers/acpi/ec.c:531:5: warning: no previous prototype for ‘acpi_ec_add_query_handler’ [-Wmissing-prototypes]
drivers/acpi/ec.c:552:6: warning: no previous prototype for ‘acpi_ec_remove_query_handler’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06 00:13:23 +01:00
Lan Tianyu
42b946bb35 ACPI / EC: disable GPE before removing GPE handler
Adjust the order of disabling the EC GPE and removing its handler to
avoid unhandled events.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-19 15:56:15 +01:00
Lv Zheng
8b48463f89 ACPI: Clean up inclusions of ACPI header files
Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
<acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
inclusions and remove some inclusions of those files that aren't
necessary.

First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
should not be included directly from any files that are built for
CONFIG_ACPI unset, because that generally leads to build warnings about
undefined symbols in !CONFIG_ACPI builds.  For CONFIG_ACPI set,
<linux/acpi.h> includes those files and for CONFIG_ACPI unset it
provides stub ACPI symbols to be used in that case.

Second, there are ordering dependencies between those files that always
have to be met.  Namely, it is required that <acpi/acpi_bus.h> be included
prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
latter depends on are always there.  And <acpi/acpi.h> which provides
basic ACPICA type declarations should always be included prior to any other
ACPI headers in CONFIG_ACPI builds.  That also is taken care of including
<linux/acpi.h> as appropriate.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-07 01:03:14 +01:00
Rafael J. Wysocki
dcaea2c18e Merge branch 'acpi-ec'
* acpi-ec:
  ACPI / EC: Ensure lock is acquired before accessing ec struct members
2013-11-19 01:06:06 +01:00
Puneet Kumar
36b15875a7 ACPI / EC: Ensure lock is acquired before accessing ec struct members
A bug was introduced by commit b76b51ba0c ('ACPI / EC: Add more debug
info and trivial code cleanup') that erroneously caused the struct member
to be accessed before acquiring the required lock.  This change fixes
it by ensuring the lock acquisition is done first.

Found by Aaron Durbin <adurbin@chromium.org>

Fixes: b76b51ba0c ('ACPI / EC: Add more debug info and trivial code cleanup')
References: http://crbug.com/319019
Signed-off-by: Puneet Kumar <puneetster@chromium.org>
Reviewed-by: Aaron Durbin <adurbin@chromium.org>
[olof: Commit message reworded a bit]
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: 3.8+ <stable@vger.kernel.org> # 3.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-15 23:18:27 +01:00
Lan Tianyu
16a26e8527 ACPI / EC: Convert all printk() calls to dynamic debug function
This patch is to convert all printks in the ec driver to pr_debug/info/err
and define pr_fmt macro to replace PREFIX.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25 17:13:50 +02:00
Rafael J. Wysocki
da48afb26b Merge branch 'acpi-assorted'
* acpi-assorted:
  ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
  ACPI / thermal: Add check of "_TZD" availability and evaluating result
2013-08-30 14:13:50 +02:00
Lan Tianyu
524f42fab7 ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
The ECDT of ASUSTEK L4R doesn't provide correct command and data
I/O ports.  The DSDT provides the correct information instead.

For this reason, add this machine to quirk list for ECDT validation
and use the EC information from the DSDT.

[rjw: Changelog]
References: https://bugzilla.kernel.org/show_bug.cgi?id=60765
Reported-and-tested-by: Daniele Esposti <expo@expobrain.net>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Cc: All <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-28 21:50:13 +02:00
Rafael J. Wysocki
0c581415b5 Merge branch 'acpi-assorted'
* acpi-assorted:
  ACPI / osl: Kill macro INVALID_TABLE().
  earlycpio.c: Fix the confusing comment of find_cpio_data().
  ACPI / x86: Print Hot-Pluggable Field in SRAT.
  ACPI / thermal: Use THERMAL_TRIPS_NONE macro to replace number
  ACPI / thermal: Remove unused macros in the driver/acpi/thermal.c
  ACPI / thermal: Remove the unused lock of struct acpi_thermal
  ACPI / osl: Fix osi_setup_entries[] __initdata attribute location
  ACPI / numa: Fix __init attribute location in slit_valid()
  ACPI / dock: Fix __init attribute location in find_dock_and_bay()
  ACPI / Sleep: Fix incorrect placement of __initdata
  ACPI / processor: Fix incorrect placement of __initdata
  ACPI / EC: Fix incorrect placement of __initdata
  ACPI / scan: Drop unnecessary label from acpi_create_platform_device()
  ACPI: Move acpi_bus_get_device() from bus.c to scan.c
  ACPI / scan: Allow platform device creation without any IO resources
  ACPI: Cleanup sparse warning on acpi_os_initialize1()
  platform / thinkpad: Remove deprecated hotkey_report_mode parameter
  ACPI: Remove the old /proc/acpi/event interface
2013-08-27 01:29:04 +02:00
Sachin Kamat
6cef749701 ACPI / EC: Fix incorrect placement of __initdata
__initdata should be placed between the variable name and equal
sign for the variable to be placed in the intended section.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 23:49:28 +02:00
Jiang Liu
952c63e951 ACPI: introduce helper function acpi_has_method()
Introduce helper function acpi_has_method() and use it in a number
of places to simplify code.

[rjw: Changelog]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-15 01:33:10 +02:00
Lan Tianyu
eff9a4b62b ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
HP Folio 13's BIOS defines CMOS RTC Operation Region and the EC's
_REG method will access that region.  To allow the CMOS RTC region
handler to be installed before the EC _REG method is first invoked,
add ec_skip_dsdt_scan() as HP Folio 13's callback to ec_dmi_table.

References: https://bugzilla.kernel.org/show_bug.cgi?id=54621
Reported-and-tested-by: Stefan Nagy <public@stefan-nagy.at>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-27 21:37:18 +02:00
Lan Tianyu
28fe5c825f ACPI / EC: Restart transaction even when the IBF flag set
The EC driver works abnormally with IBF flag always set.
IBF means "The host has written a byte of data to the command
or data port, but the embedded controller has not yet read it".
If IBF is set in the EC status and not cleared, this will cause
all subsequent EC requests to fail with a timeout error.

Change the EC driver so that it doesn't refuse to restart a
transaction if IBF is set in the status.  Also increase the
number of transaction restarts to 5, as it turns out that 2
is not sufficient in some cases.

This bug happens on several different machines (Asus V1S,
Dell Latitude E6530, Samsung R719, Acer Aspire 5930G,
Sony Vaio SR19VN and others).

[rjw: Changelog]
References: https://bugzilla.kernel.org/show_bug.cgi?id=14733
References: https://bugzilla.kernel.org/show_bug.cgi?id=15560
References: https://bugzilla.kernel.org/show_bug.cgi?id=15946
References: https://bugzilla.kernel.org/show_bug.cgi?id=42945
References: https://bugzilla.kernel.org/show_bug.cgi?id=48221
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Cc: All <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:03:15 +02:00
Rafael J. Wysocki
51fac8388a ACPI: Remove useless type argument of driver .remove() operation
The second argument of ACPI driver .remove() operation is only used
by the ACPI processor driver and the value passed to that driver
through it is always available from the given struct acpi_device
object's removal_type field.  For this reason, the second ACPI driver
.remove() argument is in fact useless, so drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2013-01-26 00:37:24 +01:00
Feng Tang
a3cd8d2789 ACPI / EC: Don't count a SCI interrupt as a false one
Currently when advance_transaction() is called in EC interrupt handler,
if there is nothing driver can do with the interrupt, it will be taken
as a false one.

But this is not always true, as there may be a SCI EC interrupt fired
during normal read/write operation, which should not be counted as a
false one. This patch fixes the problem.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:15:59 +01:00
Feng Tang
b76b51ba0c ACPI / EC: Add more debug info and trivial code cleanup
Add more debug info for EC transaction debugging, like the interrupt
status register value, the detail info of a EC transaction.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:15:59 +01:00
Feng Tang
f351d027ee ACPI / EC: Cleanup the member name for spinlock/mutex in struct
Current member names for mutex/spinlock are a little confusing.

Change the
{
	struct mutex lock;
	spinlock_t curr_lock;
}
to
{
	struct mutex mutex;
	spinlock_t lock;
}

So that the code is cleaner and easier to read.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:15:59 +01:00
Feng Tang
67bfa9b60b ACPI: EC: Add a quirk for CLEVO M720T/M730T laptop
By enlarging the GPE storm threshold back to 20, that laptop's
EC works fine with interrupt mode instead of polling mode.

https://bugzilla.kernel.org/show_bug.cgi?id=45151

Reported-and-Tested-by: Francesco <trentini@dei.unipd.it>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
cc: stable@vger.kernel.org
2012-10-06 14:57:33 -04:00
Feng Tang
a520d52e99 ACPI: EC: Make the GPE storm threshold a module parameter
The Linux EC driver includes a mechanism to detect GPE storms,
and switch from interrupt-mode to polling mode.  However, polling
mode sometimes doesn't work, so the workaround is problematic.
Also, different systems seem to need the threshold for detecting
the GPE storm at different levels.

ACPI_EC_STORM_THRESHOLD was initially 20 when it's created, and
was changed to 8 in 2.6.28 commit 06cf7d3c7 "ACPI: EC: lower interrupt storm
threshold" to fix kernel bug 11892 by forcing the laptop in that bug to
work in polling mode. However in bug 45151, it works fine in interrupt
mode if we lift the threshold back to 20.

This patch makes the threshold a module parameter so that user has a
flexible option to debug/workaround this issue.

The default is unchanged.

This is also a preparation patch to fix specific systems:
	https://bugzilla.kernel.org/show_bug.cgi?id=45151

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
cc: stable@vger.kernel.org
2012-10-06 14:53:10 -04:00
Linus Torvalds
a335750b9a Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI & Power Management changes from Len Brown:
 - ACPI 5.0 after-ripples, ACPICA/Linux divergence cleanup
 - cpuidle evolving, more ARM use
 - thermal sub-system evolving, ditto
 - assorted other PM bits

Fix up conflicts in various cpuidle implementations due to ARM cpuidle
cleanups (ARM at91 self-refresh and cpu idle code rewritten into
"standby" in asm conflicting with the consolidation of cpuidle time
keeping), trivial SH include file context conflict and RCU tracing fixes
in generic code.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (77 commits)
  ACPI throttling: fix endian bug in acpi_read_throttling_status()
  Disable MCP limit exceeded messages from Intel IPS driver
  ACPI video: Don't start video device until its associated input device has been allocated
  ACPI video: Harden video bus adding.
  ACPI: Add support for exposing BGRT data
  ACPI: export acpi_kobj
  ACPI: Fix logic for removing mappings in 'acpi_unmap'
  CPER failed to handle generic error records with multiple sections
  ACPI: Clean redundant codes in scan.c
  ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
  ACPI: consistently use should_use_kmap()
  PNPACPI: Fix device ref leaking in acpi_pnp_match
  ACPI: Fix use-after-free in acpi_map_lsapic
  ACPI: processor_driver: add missing kfree
  ACPI, APEI: Fix incorrect APEI register bit width check and usage
  Update documentation for parameter *notrigger* in einj.txt
  ACPI, APEI, EINJ, new parameter to control trigger action
  ACPI, APEI, EINJ, limit the range of einj_param
  ACPI, APEI, Fix ERST header length check
  cpuidle: power_usage should be declared signed integer
  ...
2012-03-30 16:45:39 -07:00
Andi Kleen
d6795fe32d ACPI: ec: Do request_region outside WARN()
WARN() is not supposed to have side effects, so move the request_regions
outside.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:41:35 -04:00
Seth Forshee
3e2abc5a35 ACPI: EC: Add ec_get_handle()
toshiba_acpi needs to execute an AML method within the EC namespace
to make hotkeys work on some platforms. Provide an interface to
allow it to easily get a handle to the EC namespace for this purpose.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2012-03-20 12:14:25 -04:00
Len Brown
751516f0a9 Merge branch 'ec-cleanup' into release
Conflicts:
	drivers/platform/x86/compal-laptop.c
2011-05-29 04:40:39 -04:00
Len Brown
6288cf1e76 Merge branches 'acpica', 'aml-custom', 'bugzilla-16548', 'bugzilla-20242', 'd3-cold', 'ec-asus' and 'thermal-fix' into release 2011-05-29 04:38:48 -04:00
Zhang Rui
08b53f0e6b ACPI EC: remove redundant code
ec->handle is set in ec_parse_device(), so don't bother to set it again.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 02:59:50 -04:00
Zhang Rui
534bc4e3d2 ACPI EC: enable MSI workaround for Quanta laptops
Enable MSI workaround for Quanta laptops.
https://bugzilla.kernel.org/show_bug.cgi?id=20242

Tested-by: Jan-Matthias Braun <jan_braun@gmx.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 01:35:46 -04:00
Peter Collingbourne
af986d101d ACPI: EC: add another DMI check for ASUS hardware
Commit 0adf3c746a introduced a regression
by making the ECDT validation test for ASUS hardware more restrictive.
The previous test used the dmi_name_in_vendors function which searches
a number of DMI fields, while the new test checked only the BIOS
vendor, which is known to not match on an ASUS F5GL laptop which
requires ECDT validation.

Add a rule to ec_dmi_table based on an alternative DMI pattern for
ASUS hardware as found elsewhere in the kernel.

Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-04-25 16:54:15 -04:00