From cf579dfb82550e34de7ccf3ef090d8b834ccd3a9 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Sun, 29 Jan 2012 20:38:29 +0100 Subject: [PATCH 01/20] PM / Sleep: Introduce "late suspend" and "early resume" of devices The current device suspend/resume phases during system-wide power transitions appear to be insufficient for some platforms that want to use the same callback routines for saving device states and related operations during runtime suspend/resume as well as during system suspend/resume. In principle, they could point their .suspend_noirq() and .resume_noirq() to the same callback routines as their .runtime_suspend() and .runtime_resume(), respectively, but at least some of them require device interrupts to be enabled while the code in those routines is running. It also makes sense to have device suspend-resume callbacks that will be executed with runtime PM disabled and with device interrupts enabled in case someone needs to run some special code in that context during system-wide power transitions. Apart from this, .suspend_noirq() and .resume_noirq() were introduced as a workaround for drivers using shared interrupts and failing to prevent their interrupt handlers from accessing suspended hardware. It appears to be better not to use them for other porposes, or we may have to deal with some serious confusion (which seems to be happening already). For the above reasons, introduce new device suspend/resume phases, "late suspend" and "early resume" (and analogously for hibernation) whose callback will be executed with runtime PM disabled and with device interrupts enabled and whose callback pointers generally may point to runtime suspend/resume routines. Signed-off-by: Rafael J. Wysocki Reviewed-by: Mark Brown Reviewed-by: Kevin Hilman --- Documentation/power/devices.txt | 93 ++++++++---- arch/x86/kernel/apm_32.c | 11 +- drivers/base/power/main.c | 249 +++++++++++++++++++++++++++++--- drivers/xen/manage.c | 6 +- include/linux/pm.h | 43 +++++- include/linux/suspend.h | 4 + kernel/kexec.c | 8 +- kernel/power/hibernate.c | 24 +-- kernel/power/main.c | 8 +- kernel/power/suspend.c | 4 +- 10 files changed, 358 insertions(+), 92 deletions(-) diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt index 20af7def23c8..872815cd41d3 100644 --- a/Documentation/power/devices.txt +++ b/Documentation/power/devices.txt @@ -96,6 +96,12 @@ struct dev_pm_ops { int (*thaw)(struct device *dev); int (*poweroff)(struct device *dev); int (*restore)(struct device *dev); + int (*suspend_late)(struct device *dev); + int (*resume_early)(struct device *dev); + int (*freeze_late)(struct device *dev); + int (*thaw_early)(struct device *dev); + int (*poweroff_late)(struct device *dev); + int (*restore_early)(struct device *dev); int (*suspend_noirq)(struct device *dev); int (*resume_noirq)(struct device *dev); int (*freeze_noirq)(struct device *dev); @@ -305,7 +311,7 @@ Entering System Suspend ----------------------- When the system goes into the standby or memory sleep state, the phases are: - prepare, suspend, suspend_noirq. + prepare, suspend, suspend_late, suspend_noirq. 1. The prepare phase is meant to prevent races by preventing new devices from being registered; the PM core would never know that all the @@ -324,7 +330,12 @@ When the system goes into the standby or memory sleep state, the phases are: appropriate low-power state, depending on the bus type the device is on, and they may enable wakeup events. - 3. The suspend_noirq phase occurs after IRQ handlers have been disabled, + 3 For a number of devices it is convenient to split suspend into the + "quiesce device" and "save device state" phases, in which cases + suspend_late is meant to do the latter. It is always executed after + runtime power management has been disabled for all devices. + + 4. The suspend_noirq phase occurs after IRQ handlers have been disabled, which means that the driver's interrupt handler will not be called while the callback method is running. The methods should save the values of the device's registers that weren't saved previously and finally put the @@ -359,7 +370,7 @@ Leaving System Suspend ---------------------- When resuming from standby or memory sleep, the phases are: - resume_noirq, resume, complete. + resume_noirq, resume_early, resume, complete. 1. The resume_noirq callback methods should perform any actions needed before the driver's interrupt handlers are invoked. This generally @@ -375,14 +386,18 @@ When resuming from standby or memory sleep, the phases are: device driver's ->pm.resume_noirq() method to perform device-specific actions. - 2. The resume methods should bring the the device back to its operating + 2. The resume_early methods should prepare devices for the execution of + the resume methods. This generally involves undoing the actions of the + preceding suspend_late phase. + + 3 The resume methods should bring the the device back to its operating state, so that it can perform normal I/O. This generally involves undoing the actions of the suspend phase. - 3. The complete phase uses only a bus callback. The method should undo the - actions of the prepare phase. Note, however, that new children may be - registered below the device as soon as the resume callbacks occur; it's - not necessary to wait until the complete phase. + 4. The complete phase should undo the actions of the prepare phase. Note, + however, that new children may be registered below the device as soon as + the resume callbacks occur; it's not necessary to wait until the + complete phase. At the end of these phases, drivers should be as functional as they were before suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are @@ -429,8 +444,8 @@ an image of the system memory while everything is stable, reactivate all devices (thaw), write the image to permanent storage, and finally shut down the system (poweroff). The phases used to accomplish this are: - prepare, freeze, freeze_noirq, thaw_noirq, thaw, complete, - prepare, poweroff, poweroff_noirq + prepare, freeze, freeze_late, freeze_noirq, thaw_noirq, thaw_early, + thaw, complete, prepare, poweroff, poweroff_late, poweroff_noirq 1. The prepare phase is discussed in the "Entering System Suspend" section above. @@ -441,7 +456,11 @@ system (poweroff). The phases used to accomplish this are: save time it's best not to do so. Also, the device should not be prepared to generate wakeup events. - 3. The freeze_noirq phase is analogous to the suspend_noirq phase discussed + 3. The freeze_late phase is analogous to the suspend_late phase described + above, except that the device should not be put in a low-power state and + should not be allowed to generate wakeup events by it. + + 4. The freeze_noirq phase is analogous to the suspend_noirq phase discussed above, except again that the device should not be put in a low-power state and should not be allowed to generate wakeup events. @@ -449,15 +468,19 @@ At this point the system image is created. All devices should be inactive and the contents of memory should remain undisturbed while this happens, so that the image forms an atomic snapshot of the system state. - 4. The thaw_noirq phase is analogous to the resume_noirq phase discussed + 5. The thaw_noirq phase is analogous to the resume_noirq phase discussed above. The main difference is that its methods can assume the device is in the same state as at the end of the freeze_noirq phase. - 5. The thaw phase is analogous to the resume phase discussed above. Its + 6. The thaw_early phase is analogous to the resume_early phase described + above. Its methods should undo the actions of the preceding + freeze_late, if necessary. + + 7. The thaw phase is analogous to the resume phase discussed above. Its methods should bring the device back to an operating state, so that it can be used for saving the image if necessary. - 6. The complete phase is discussed in the "Leaving System Suspend" section + 8. The complete phase is discussed in the "Leaving System Suspend" section above. At this point the system image is saved, and the devices then need to be @@ -465,16 +488,19 @@ prepared for the upcoming system shutdown. This is much like suspending them before putting the system into the standby or memory sleep state, and the phases are similar. - 7. The prepare phase is discussed above. + 9. The prepare phase is discussed above. - 8. The poweroff phase is analogous to the suspend phase. + 10. The poweroff phase is analogous to the suspend phase. - 9. The poweroff_noirq phase is analogous to the suspend_noirq phase. + 11. The poweroff_late phase is analogous to the suspend_late phase. -The poweroff and poweroff_noirq callbacks should do essentially the same things -as the suspend and suspend_noirq callbacks. The only notable difference is that -they need not store the device register values, because the registers should -already have been stored during the freeze or freeze_noirq phases. + 12. The poweroff_noirq phase is analogous to the suspend_noirq phase. + +The poweroff, poweroff_late and poweroff_noirq callbacks should do essentially +the same things as the suspend, suspend_late and suspend_noirq callbacks, +respectively. The only notable difference is that they need not store the +device register values, because the registers should already have been stored +during the freeze, freeze_late or freeze_noirq phases. Leaving Hibernation @@ -518,22 +544,25 @@ To achieve this, the image kernel must restore the devices' pre-hibernation functionality. The operation is much like waking up from the memory sleep state, although it involves different phases: - restore_noirq, restore, complete + restore_noirq, restore_early, restore, complete 1. The restore_noirq phase is analogous to the resume_noirq phase. - 2. The restore phase is analogous to the resume phase. + 2. The restore_early phase is analogous to the resume_early phase. - 3. The complete phase is discussed above. + 3. The restore phase is analogous to the resume phase. -The main difference from resume[_noirq] is that restore[_noirq] must assume the -device has been accessed and reconfigured by the boot loader or the boot kernel. -Consequently the state of the device may be different from the state remembered -from the freeze and freeze_noirq phases. The device may even need to be reset -and completely re-initialized. In many cases this difference doesn't matter, so -the resume[_noirq] and restore[_norq] method pointers can be set to the same -routines. Nevertheless, different callback pointers are used in case there is a -situation where it actually matters. + 4. The complete phase is discussed above. + +The main difference from resume[_early|_noirq] is that restore[_early|_noirq] +must assume the device has been accessed and reconfigured by the boot loader or +the boot kernel. Consequently the state of the device may be different from the +state remembered from the freeze, freeze_late and freeze_noirq phases. The +device may even need to be reset and completely re-initialized. In many cases +this difference doesn't matter, so the resume[_early|_noirq] and +restore[_early|_norq] method pointers can be set to the same routines. +Nevertheless, different callback pointers are used in case there is a situation +where it actually does matter. Device Power Management Domains diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c index f76623cbe263..5d56931a15b3 100644 --- a/arch/x86/kernel/apm_32.c +++ b/arch/x86/kernel/apm_32.c @@ -1234,8 +1234,7 @@ static int suspend(int vetoable) struct apm_user *as; dpm_suspend_start(PMSG_SUSPEND); - - dpm_suspend_noirq(PMSG_SUSPEND); + dpm_suspend_end(PMSG_SUSPEND); local_irq_disable(); syscore_suspend(); @@ -1259,9 +1258,9 @@ static int suspend(int vetoable) syscore_resume(); local_irq_enable(); - dpm_resume_noirq(PMSG_RESUME); - + dpm_resume_start(PMSG_RESUME); dpm_resume_end(PMSG_RESUME); + queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); for (as = user_list; as != NULL; as = as->next) { @@ -1277,7 +1276,7 @@ static void standby(void) { int err; - dpm_suspend_noirq(PMSG_SUSPEND); + dpm_suspend_end(PMSG_SUSPEND); local_irq_disable(); syscore_suspend(); @@ -1291,7 +1290,7 @@ static void standby(void) syscore_resume(); local_irq_enable(); - dpm_resume_noirq(PMSG_RESUME); + dpm_resume_start(PMSG_RESUME); } static apm_event_t get_event(void) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index e2cc3d2e0ecc..b462c0e341cb 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -47,6 +47,7 @@ typedef int (*pm_callback_t)(struct device *); LIST_HEAD(dpm_list); LIST_HEAD(dpm_prepared_list); LIST_HEAD(dpm_suspended_list); +LIST_HEAD(dpm_late_early_list); LIST_HEAD(dpm_noirq_list); struct suspend_stats suspend_stats; @@ -245,6 +246,40 @@ static pm_callback_t pm_op(const struct dev_pm_ops *ops, pm_message_t state) return NULL; } +/** + * pm_late_early_op - Return the PM operation appropriate for given PM event. + * @ops: PM operations to choose from. + * @state: PM transition of the system being carried out. + * + * Runtime PM is disabled for @dev while this function is being executed. + */ +static pm_callback_t pm_late_early_op(const struct dev_pm_ops *ops, + pm_message_t state) +{ + switch (state.event) { +#ifdef CONFIG_SUSPEND + case PM_EVENT_SUSPEND: + return ops->suspend_late; + case PM_EVENT_RESUME: + return ops->resume_early; +#endif /* CONFIG_SUSPEND */ +#ifdef CONFIG_HIBERNATE_CALLBACKS + case PM_EVENT_FREEZE: + case PM_EVENT_QUIESCE: + return ops->freeze_late; + case PM_EVENT_HIBERNATE: + return ops->poweroff_late; + case PM_EVENT_THAW: + case PM_EVENT_RECOVER: + return ops->thaw_early; + case PM_EVENT_RESTORE: + return ops->restore_early; +#endif /* CONFIG_HIBERNATE_CALLBACKS */ + } + + return NULL; +} + /** * pm_noirq_op - Return the PM operation appropriate for given PM event. * @ops: PM operations to choose from. @@ -374,21 +409,21 @@ static int device_resume_noirq(struct device *dev, pm_message_t state) TRACE_RESUME(0); if (dev->pm_domain) { - info = "EARLY power domain "; + info = "noirq power domain "; callback = pm_noirq_op(&dev->pm_domain->ops, state); } else if (dev->type && dev->type->pm) { - info = "EARLY type "; + info = "noirq type "; callback = pm_noirq_op(dev->type->pm, state); } else if (dev->class && dev->class->pm) { - info = "EARLY class "; + info = "noirq class "; callback = pm_noirq_op(dev->class->pm, state); } else if (dev->bus && dev->bus->pm) { - info = "EARLY bus "; + info = "noirq bus "; callback = pm_noirq_op(dev->bus->pm, state); } if (!callback && dev->driver && dev->driver->pm) { - info = "EARLY driver "; + info = "noirq driver "; callback = pm_noirq_op(dev->driver->pm, state); } @@ -399,13 +434,13 @@ static int device_resume_noirq(struct device *dev, pm_message_t state) } /** - * dpm_resume_noirq - Execute "early resume" callbacks for non-sysdev devices. + * dpm_resume_noirq - Execute "noirq resume" callbacks for all devices. * @state: PM transition of the system being carried out. * - * Call the "noirq" resume handlers for all devices marked as DPM_OFF_IRQ and + * Call the "noirq" resume handlers for all devices in dpm_noirq_list and * enable device drivers to receive interrupts. */ -void dpm_resume_noirq(pm_message_t state) +static void dpm_resume_noirq(pm_message_t state) { ktime_t starttime = ktime_get(); @@ -415,7 +450,7 @@ void dpm_resume_noirq(pm_message_t state) int error; get_device(dev); - list_move_tail(&dev->power.entry, &dpm_suspended_list); + list_move_tail(&dev->power.entry, &dpm_late_early_list); mutex_unlock(&dpm_list_mtx); error = device_resume_noirq(dev, state); @@ -423,6 +458,80 @@ void dpm_resume_noirq(pm_message_t state) suspend_stats.failed_resume_noirq++; dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); dpm_save_failed_dev(dev_name(dev)); + pm_dev_err(dev, state, " noirq", error); + } + + mutex_lock(&dpm_list_mtx); + put_device(dev); + } + mutex_unlock(&dpm_list_mtx); + dpm_show_time(starttime, state, "noirq"); + resume_device_irqs(); +} + +/** + * device_resume_early - Execute an "early resume" callback for given device. + * @dev: Device to handle. + * @state: PM transition of the system being carried out. + * + * Runtime PM is disabled for @dev while this function is being executed. + */ +static int device_resume_early(struct device *dev, pm_message_t state) +{ + pm_callback_t callback = NULL; + char *info = NULL; + int error = 0; + + TRACE_DEVICE(dev); + TRACE_RESUME(0); + + if (dev->pm_domain) { + info = "early power domain "; + callback = pm_late_early_op(&dev->pm_domain->ops, state); + } else if (dev->type && dev->type->pm) { + info = "early type "; + callback = pm_late_early_op(dev->type->pm, state); + } else if (dev->class && dev->class->pm) { + info = "early class "; + callback = pm_late_early_op(dev->class->pm, state); + } else if (dev->bus && dev->bus->pm) { + info = "early bus "; + callback = pm_late_early_op(dev->bus->pm, state); + } + + if (!callback && dev->driver && dev->driver->pm) { + info = "early driver "; + callback = pm_late_early_op(dev->driver->pm, state); + } + + error = dpm_run_callback(callback, dev, state, info); + + TRACE_RESUME(error); + return error; +} + +/** + * dpm_resume_early - Execute "early resume" callbacks for all devices. + * @state: PM transition of the system being carried out. + */ +static void dpm_resume_early(pm_message_t state) +{ + ktime_t starttime = ktime_get(); + + mutex_lock(&dpm_list_mtx); + while (!list_empty(&dpm_late_early_list)) { + struct device *dev = to_device(dpm_late_early_list.next); + int error; + + get_device(dev); + list_move_tail(&dev->power.entry, &dpm_suspended_list); + mutex_unlock(&dpm_list_mtx); + + error = device_resume_early(dev, state); + if (error) { + suspend_stats.failed_resume_early++; + dpm_save_failed_step(SUSPEND_RESUME_EARLY); + dpm_save_failed_dev(dev_name(dev)); pm_dev_err(dev, state, " early", error); } @@ -431,9 +540,18 @@ void dpm_resume_noirq(pm_message_t state) } mutex_unlock(&dpm_list_mtx); dpm_show_time(starttime, state, "early"); - resume_device_irqs(); } -EXPORT_SYMBOL_GPL(dpm_resume_noirq); + +/** + * dpm_resume_start - Execute "noirq" and "early" device callbacks. + * @state: PM transition of the system being carried out. + */ +void dpm_resume_start(pm_message_t state) +{ + dpm_resume_noirq(state); + dpm_resume_early(state); +} +EXPORT_SYMBOL_GPL(dpm_resume_start); /** * device_resume - Execute "resume" callbacks for given device. @@ -716,21 +834,21 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state) char *info = NULL; if (dev->pm_domain) { - info = "LATE power domain "; + info = "noirq power domain "; callback = pm_noirq_op(&dev->pm_domain->ops, state); } else if (dev->type && dev->type->pm) { - info = "LATE type "; + info = "noirq type "; callback = pm_noirq_op(dev->type->pm, state); } else if (dev->class && dev->class->pm) { - info = "LATE class "; + info = "noirq class "; callback = pm_noirq_op(dev->class->pm, state); } else if (dev->bus && dev->bus->pm) { - info = "LATE bus "; + info = "noirq bus "; callback = pm_noirq_op(dev->bus->pm, state); } if (!callback && dev->driver && dev->driver->pm) { - info = "LATE driver "; + info = "noirq driver "; callback = pm_noirq_op(dev->driver->pm, state); } @@ -738,21 +856,21 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state) } /** - * dpm_suspend_noirq - Execute "late suspend" callbacks for non-sysdev devices. + * dpm_suspend_noirq - Execute "noirq suspend" callbacks for all devices. * @state: PM transition of the system being carried out. * * Prevent device drivers from receiving interrupts and call the "noirq" suspend * handlers for all non-sysdev devices. */ -int dpm_suspend_noirq(pm_message_t state) +static int dpm_suspend_noirq(pm_message_t state) { ktime_t starttime = ktime_get(); int error = 0; suspend_device_irqs(); mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_suspended_list)) { - struct device *dev = to_device(dpm_suspended_list.prev); + while (!list_empty(&dpm_late_early_list)) { + struct device *dev = to_device(dpm_late_early_list.prev); get_device(dev); mutex_unlock(&dpm_list_mtx); @@ -761,7 +879,7 @@ int dpm_suspend_noirq(pm_message_t state) mutex_lock(&dpm_list_mtx); if (error) { - pm_dev_err(dev, state, " late", error); + pm_dev_err(dev, state, " noirq", error); suspend_stats.failed_suspend_noirq++; dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ); dpm_save_failed_dev(dev_name(dev)); @@ -776,10 +894,95 @@ int dpm_suspend_noirq(pm_message_t state) if (error) dpm_resume_noirq(resume_event(state)); else - dpm_show_time(starttime, state, "late"); + dpm_show_time(starttime, state, "noirq"); return error; } -EXPORT_SYMBOL_GPL(dpm_suspend_noirq); + +/** + * device_suspend_late - Execute a "late suspend" callback for given device. + * @dev: Device to handle. + * @state: PM transition of the system being carried out. + * + * Runtime PM is disabled for @dev while this function is being executed. + */ +static int device_suspend_late(struct device *dev, pm_message_t state) +{ + pm_callback_t callback = NULL; + char *info = NULL; + + if (dev->pm_domain) { + info = "late power domain "; + callback = pm_late_early_op(&dev->pm_domain->ops, state); + } else if (dev->type && dev->type->pm) { + info = "late type "; + callback = pm_late_early_op(dev->type->pm, state); + } else if (dev->class && dev->class->pm) { + info = "late class "; + callback = pm_late_early_op(dev->class->pm, state); + } else if (dev->bus && dev->bus->pm) { + info = "late bus "; + callback = pm_late_early_op(dev->bus->pm, state); + } + + if (!callback && dev->driver && dev->driver->pm) { + info = "late driver "; + callback = pm_late_early_op(dev->driver->pm, state); + } + + return dpm_run_callback(callback, dev, state, info); +} + +/** + * dpm_suspend_late - Execute "late suspend" callbacks for all devices. + * @state: PM transition of the system being carried out. + */ +static int dpm_suspend_late(pm_message_t state) +{ + ktime_t starttime = ktime_get(); + int error = 0; + + mutex_lock(&dpm_list_mtx); + while (!list_empty(&dpm_suspended_list)) { + struct device *dev = to_device(dpm_suspended_list.prev); + + get_device(dev); + mutex_unlock(&dpm_list_mtx); + + error = device_suspend_late(dev, state); + + mutex_lock(&dpm_list_mtx); + if (error) { + pm_dev_err(dev, state, " late", error); + suspend_stats.failed_suspend_late++; + dpm_save_failed_step(SUSPEND_SUSPEND_LATE); + dpm_save_failed_dev(dev_name(dev)); + put_device(dev); + break; + } + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_late_early_list); + put_device(dev); + } + mutex_unlock(&dpm_list_mtx); + if (error) + dpm_resume_early(resume_event(state)); + else + dpm_show_time(starttime, state, "late"); + + return error; +} + +/** + * dpm_suspend_end - Execute "late" and "noirq" device suspend callbacks. + * @state: PM transition of the system being carried out. + */ +int dpm_suspend_end(pm_message_t state) +{ + int error = dpm_suspend_late(state); + + return error ? : dpm_suspend_noirq(state); +} +EXPORT_SYMBOL_GPL(dpm_suspend_end); /** * legacy_suspend - Execute a legacy (bus or class) suspend callback for device. diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c index ce4fa0831860..9e14ae6cd49c 100644 --- a/drivers/xen/manage.c +++ b/drivers/xen/manage.c @@ -129,9 +129,9 @@ static void do_suspend(void) printk(KERN_DEBUG "suspending xenstore...\n"); xs_suspend(); - err = dpm_suspend_noirq(PMSG_FREEZE); + err = dpm_suspend_end(PMSG_FREEZE); if (err) { - printk(KERN_ERR "dpm_suspend_noirq failed: %d\n", err); + printk(KERN_ERR "dpm_suspend_end failed: %d\n", err); goto out_resume; } @@ -149,7 +149,7 @@ static void do_suspend(void) err = stop_machine(xen_suspend, &si, cpumask_of(0)); - dpm_resume_noirq(si.cancelled ? PMSG_THAW : PMSG_RESTORE); + dpm_resume_start(si.cancelled ? PMSG_THAW : PMSG_RESTORE); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); diff --git a/include/linux/pm.h b/include/linux/pm.h index e4982ac3fbbc..c68e1f22ac95 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -110,6 +110,10 @@ typedef struct pm_message { * Subsystem-level @suspend() is executed for all devices after invoking * subsystem-level @prepare() for all of them. * + * @suspend_late: Continue operations started by @suspend(). For a number of + * devices @suspend_late() may point to the same callback routine as the + * runtime suspend callback. + * * @resume: Executed after waking the system up from a sleep state in which the * contents of main memory were preserved. The exact action to perform * depends on the device's subsystem, but generally the driver is expected @@ -122,6 +126,10 @@ typedef struct pm_message { * Subsystem-level @resume() is executed for all devices after invoking * subsystem-level @resume_noirq() for all of them. * + * @resume_early: Prepare to execute @resume(). For a number of devices + * @resume_early() may point to the same callback routine as the runtime + * resume callback. + * * @freeze: Hibernation-specific, executed before creating a hibernation image. * Analogous to @suspend(), but it should not enable the device to signal * wakeup events or change its power state. The majority of subsystems @@ -131,6 +139,10 @@ typedef struct pm_message { * Subsystem-level @freeze() is executed for all devices after invoking * subsystem-level @prepare() for all of them. * + * @freeze_late: Continue operations started by @freeze(). Analogous to + * @suspend_late(), but it should not enable the device to signal wakeup + * events or change its power state. + * * @thaw: Hibernation-specific, executed after creating a hibernation image OR * if the creation of an image has failed. Also executed after a failing * attempt to restore the contents of main memory from such an image. @@ -140,15 +152,23 @@ typedef struct pm_message { * subsystem-level @thaw_noirq() for all of them. It also may be executed * directly after @freeze() in case of a transition error. * + * @thaw_early: Prepare to execute @thaw(). Undo the changes made by the + * preceding @freeze_late(). + * * @poweroff: Hibernation-specific, executed after saving a hibernation image. * Analogous to @suspend(), but it need not save the device's settings in * memory. * Subsystem-level @poweroff() is executed for all devices after invoking * subsystem-level @prepare() for all of them. * + * @poweroff_late: Continue operations started by @poweroff(). Analogous to + * @suspend_late(), but it need not save the device's settings in memory. + * * @restore: Hibernation-specific, executed after restoring the contents of main * memory from a hibernation image, analogous to @resume(). * + * @restore_early: Prepare to execute @restore(), analogous to @resume_early(). + * * @suspend_noirq: Complete the actions started by @suspend(). Carry out any * additional operations required for suspending the device that might be * racing with its driver's interrupt handler, which is guaranteed not to @@ -158,9 +178,10 @@ typedef struct pm_message { * @suspend_noirq() has returned successfully. If the device can generate * system wakeup signals and is enabled to wake up the system, it should be * configured to do so at that time. However, depending on the platform - * and device's subsystem, @suspend() may be allowed to put the device into - * the low-power state and configure it to generate wakeup signals, in - * which case it generally is not necessary to define @suspend_noirq(). + * and device's subsystem, @suspend() or @suspend_late() may be allowed to + * put the device into the low-power state and configure it to generate + * wakeup signals, in which case it generally is not necessary to define + * @suspend_noirq(). * * @resume_noirq: Prepare for the execution of @resume() by carrying out any * operations required for resuming the device that might be racing with @@ -171,9 +192,9 @@ typedef struct pm_message { * additional operations required for freezing the device that might be * racing with its driver's interrupt handler, which is guaranteed not to * run while @freeze_noirq() is being executed. - * The power state of the device should not be changed by either @freeze() - * or @freeze_noirq() and it should not be configured to signal system - * wakeup by any of these callbacks. + * The power state of the device should not be changed by either @freeze(), + * or @freeze_late(), or @freeze_noirq() and it should not be configured to + * signal system wakeup by any of these callbacks. * * @thaw_noirq: Prepare for the execution of @thaw() by carrying out any * operations required for thawing the device that might be racing with its @@ -249,6 +270,12 @@ struct dev_pm_ops { int (*thaw)(struct device *dev); int (*poweroff)(struct device *dev); int (*restore)(struct device *dev); + int (*suspend_late)(struct device *dev); + int (*resume_early)(struct device *dev); + int (*freeze_late)(struct device *dev); + int (*thaw_early)(struct device *dev); + int (*poweroff_late)(struct device *dev); + int (*restore_early)(struct device *dev); int (*suspend_noirq)(struct device *dev); int (*resume_noirq)(struct device *dev); int (*freeze_noirq)(struct device *dev); @@ -584,13 +611,13 @@ struct dev_pm_domain { #ifdef CONFIG_PM_SLEEP extern void device_pm_lock(void); -extern void dpm_resume_noirq(pm_message_t state); +extern void dpm_resume_start(pm_message_t state); extern void dpm_resume_end(pm_message_t state); extern void dpm_resume(pm_message_t state); extern void dpm_complete(pm_message_t state); extern void device_pm_unlock(void); -extern int dpm_suspend_noirq(pm_message_t state); +extern int dpm_suspend_end(pm_message_t state); extern int dpm_suspend_start(pm_message_t state); extern int dpm_suspend(pm_message_t state); extern int dpm_prepare(pm_message_t state); diff --git a/include/linux/suspend.h b/include/linux/suspend.h index 91784a4f8608..ac1c114c499d 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -42,8 +42,10 @@ enum suspend_stat_step { SUSPEND_FREEZE = 1, SUSPEND_PREPARE, SUSPEND_SUSPEND, + SUSPEND_SUSPEND_LATE, SUSPEND_SUSPEND_NOIRQ, SUSPEND_RESUME_NOIRQ, + SUSPEND_RESUME_EARLY, SUSPEND_RESUME }; @@ -53,8 +55,10 @@ struct suspend_stats { int failed_freeze; int failed_prepare; int failed_suspend; + int failed_suspend_late; int failed_suspend_noirq; int failed_resume; + int failed_resume_early; int failed_resume_noirq; #define REC_FAILED_NUM 2 int last_failed_dev; diff --git a/kernel/kexec.c b/kernel/kexec.c index 7b0886786701..a6a675cb9818 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -1546,13 +1546,13 @@ int kernel_kexec(void) if (error) goto Resume_console; /* At this point, dpm_suspend_start() has been called, - * but *not* dpm_suspend_noirq(). We *must* call - * dpm_suspend_noirq() now. Otherwise, drivers for + * but *not* dpm_suspend_end(). We *must* call + * dpm_suspend_end() now. Otherwise, drivers for * some devices (e.g. interrupt controllers) become * desynchronized with the actual state of the * hardware at resume time, and evil weirdness ensues. */ - error = dpm_suspend_noirq(PMSG_FREEZE); + error = dpm_suspend_end(PMSG_FREEZE); if (error) goto Resume_devices; error = disable_nonboot_cpus(); @@ -1579,7 +1579,7 @@ int kernel_kexec(void) local_irq_enable(); Enable_cpus: enable_nonboot_cpus(); - dpm_resume_noirq(PMSG_RESTORE); + dpm_resume_start(PMSG_RESTORE); Resume_devices: dpm_resume_end(PMSG_RESTORE); Resume_console: diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 6d6d28870335..a5d4cf0aa03e 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -245,8 +245,8 @@ void swsusp_show_speed(struct timeval *start, struct timeval *stop, * create_image - Create a hibernation image. * @platform_mode: Whether or not to use the platform driver. * - * Execute device drivers' .freeze_noirq() callbacks, create a hibernation image - * and execute the drivers' .thaw_noirq() callbacks. + * Execute device drivers' "late" and "noirq" freeze callbacks, create a + * hibernation image and run the drivers' "noirq" and "early" thaw callbacks. * * Control reappears in this routine after the subsequent restore. */ @@ -254,7 +254,7 @@ static int create_image(int platform_mode) { int error; - error = dpm_suspend_noirq(PMSG_FREEZE); + error = dpm_suspend_end(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); @@ -306,7 +306,7 @@ static int create_image(int platform_mode) Platform_finish: platform_finish(platform_mode); - dpm_resume_noirq(in_suspend ? + dpm_resume_start(in_suspend ? (error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); return error; @@ -394,16 +394,16 @@ int hibernation_snapshot(int platform_mode) * resume_target_kernel - Restore system state from a hibernation image. * @platform_mode: Whether or not to use the platform driver. * - * Execute device drivers' .freeze_noirq() callbacks, restore the contents of - * highmem that have not been restored yet from the image and run the low-level - * code that will restore the remaining contents of memory and switch to the - * just restored target kernel. + * Execute device drivers' "noirq" and "late" freeze callbacks, restore the + * contents of highmem that have not been restored yet from the image and run + * the low-level code that will restore the remaining contents of memory and + * switch to the just restored target kernel. */ static int resume_target_kernel(bool platform_mode) { int error; - error = dpm_suspend_noirq(PMSG_QUIESCE); + error = dpm_suspend_end(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); @@ -460,7 +460,7 @@ static int resume_target_kernel(bool platform_mode) Cleanup: platform_restore_cleanup(platform_mode); - dpm_resume_noirq(PMSG_RECOVER); + dpm_resume_start(PMSG_RECOVER); return error; } @@ -518,7 +518,7 @@ int hibernation_platform_enter(void) goto Resume_devices; } - error = dpm_suspend_noirq(PMSG_HIBERNATE); + error = dpm_suspend_end(PMSG_HIBERNATE); if (error) goto Resume_devices; @@ -549,7 +549,7 @@ int hibernation_platform_enter(void) Platform_finish: hibernation_ops->finish(); - dpm_resume_noirq(PMSG_RESTORE); + dpm_resume_start(PMSG_RESTORE); Resume_devices: entering_platform_hibernation = false; diff --git a/kernel/power/main.c b/kernel/power/main.c index 9824b41e5a18..8c5014a4e052 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -165,16 +165,20 @@ static int suspend_stats_show(struct seq_file *s, void *unused) last_errno %= REC_FAILED_NUM; last_step = suspend_stats.last_failed_step + REC_FAILED_NUM - 1; last_step %= REC_FAILED_NUM; - seq_printf(s, "%s: %d\n%s: %d\n%s: %d\n%s: %d\n" - "%s: %d\n%s: %d\n%s: %d\n%s: %d\n", + seq_printf(s, "%s: %d\n%s: %d\n%s: %d\n%s: %d\n%s: %d\n" + "%s: %d\n%s: %d\n%s: %d\n%s: %d\n%s: %d\n", "success", suspend_stats.success, "fail", suspend_stats.fail, "failed_freeze", suspend_stats.failed_freeze, "failed_prepare", suspend_stats.failed_prepare, "failed_suspend", suspend_stats.failed_suspend, + "failed_suspend_late", + suspend_stats.failed_suspend_late, "failed_suspend_noirq", suspend_stats.failed_suspend_noirq, "failed_resume", suspend_stats.failed_resume, + "failed_resume_early", + suspend_stats.failed_resume_early, "failed_resume_noirq", suspend_stats.failed_resume_noirq); seq_printf(s, "failures:\n last_failed_dev:\t%-s\n", diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 4fd51beed879..560a639614a1 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -147,7 +147,7 @@ static int suspend_enter(suspend_state_t state, bool *wakeup) goto Platform_finish; } - error = dpm_suspend_noirq(PMSG_SUSPEND); + error = dpm_suspend_end(PMSG_SUSPEND); if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Platform_finish; @@ -189,7 +189,7 @@ static int suspend_enter(suspend_state_t state, bool *wakeup) if (suspend_ops->wake) suspend_ops->wake(); - dpm_resume_noirq(PMSG_RESUME); + dpm_resume_start(PMSG_RESUME); Platform_finish: if (suspend_ops->finish) From e470d06655e00749f6f9372e4fa4f20cea7ed7c5 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Sun, 29 Jan 2012 20:38:41 +0100 Subject: [PATCH 02/20] PM / Sleep: Introduce generic callbacks for new device PM phases Introduce generic subsystem callbacks for the new phases of device suspend/resume during system power transitions: "late suspend", "early resume", "late freeze", "early thaw", "late poweroff", "early restore". Signed-off-by: Rafael J. Wysocki --- drivers/base/power/generic_ops.c | 165 ++++++++++++++++++++----------- include/linux/pm.h | 6 ++ 2 files changed, 114 insertions(+), 57 deletions(-) diff --git a/drivers/base/power/generic_ops.c b/drivers/base/power/generic_ops.c index 10bdd793f0bd..d03d290f31c2 100644 --- a/drivers/base/power/generic_ops.c +++ b/drivers/base/power/generic_ops.c @@ -91,68 +91,39 @@ int pm_generic_prepare(struct device *dev) return ret; } -/** - * __pm_generic_call - Generic suspend/freeze/poweroff/thaw subsystem callback. - * @dev: Device to handle. - * @event: PM transition of the system under way. - * @bool: Whether or not this is the "noirq" stage. - * - * Execute the PM callback corresponding to @event provided by the driver of - * @dev, if defined, and return its error code. Return 0 if the callback is - * not present. - */ -static int __pm_generic_call(struct device *dev, int event, bool noirq) -{ - const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; - int (*callback)(struct device *); - - if (!pm) - return 0; - - switch (event) { - case PM_EVENT_SUSPEND: - callback = noirq ? pm->suspend_noirq : pm->suspend; - break; - case PM_EVENT_FREEZE: - callback = noirq ? pm->freeze_noirq : pm->freeze; - break; - case PM_EVENT_HIBERNATE: - callback = noirq ? pm->poweroff_noirq : pm->poweroff; - break; - case PM_EVENT_RESUME: - callback = noirq ? pm->resume_noirq : pm->resume; - break; - case PM_EVENT_THAW: - callback = noirq ? pm->thaw_noirq : pm->thaw; - break; - case PM_EVENT_RESTORE: - callback = noirq ? pm->restore_noirq : pm->restore; - break; - default: - callback = NULL; - break; - } - - return callback ? callback(dev) : 0; -} - /** * pm_generic_suspend_noirq - Generic suspend_noirq callback for subsystems. * @dev: Device to suspend. */ int pm_generic_suspend_noirq(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_SUSPEND, true); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->suspend_noirq ? pm->suspend_noirq(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_suspend_noirq); +/** + * pm_generic_suspend_late - Generic suspend_late callback for subsystems. + * @dev: Device to suspend. + */ +int pm_generic_suspend_late(struct device *dev) +{ + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->suspend_late ? pm->suspend_late(dev) : 0; +} +EXPORT_SYMBOL_GPL(pm_generic_suspend_late); + /** * pm_generic_suspend - Generic suspend callback for subsystems. * @dev: Device to suspend. */ int pm_generic_suspend(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_SUSPEND, false); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->suspend ? pm->suspend(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_suspend); @@ -162,17 +133,33 @@ EXPORT_SYMBOL_GPL(pm_generic_suspend); */ int pm_generic_freeze_noirq(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_FREEZE, true); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->freeze_noirq ? pm->freeze_noirq(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_freeze_noirq); +/** + * pm_generic_freeze_late - Generic freeze_late callback for subsystems. + * @dev: Device to freeze. + */ +int pm_generic_freeze_late(struct device *dev) +{ + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->freeze_late ? pm->freeze_late(dev) : 0; +} +EXPORT_SYMBOL_GPL(pm_generic_freeze_late); + /** * pm_generic_freeze - Generic freeze callback for subsystems. * @dev: Device to freeze. */ int pm_generic_freeze(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_FREEZE, false); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->freeze ? pm->freeze(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_freeze); @@ -182,17 +169,33 @@ EXPORT_SYMBOL_GPL(pm_generic_freeze); */ int pm_generic_poweroff_noirq(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_HIBERNATE, true); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->poweroff_noirq ? pm->poweroff_noirq(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_poweroff_noirq); +/** + * pm_generic_poweroff_late - Generic poweroff_late callback for subsystems. + * @dev: Device to handle. + */ +int pm_generic_poweroff_late(struct device *dev) +{ + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->poweroff_late ? pm->poweroff_late(dev) : 0; +} +EXPORT_SYMBOL_GPL(pm_generic_poweroff_late); + /** * pm_generic_poweroff - Generic poweroff callback for subsystems. * @dev: Device to handle. */ int pm_generic_poweroff(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_HIBERNATE, false); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->poweroff ? pm->poweroff(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_poweroff); @@ -202,17 +205,33 @@ EXPORT_SYMBOL_GPL(pm_generic_poweroff); */ int pm_generic_thaw_noirq(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_THAW, true); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->thaw_noirq ? pm->thaw_noirq(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_thaw_noirq); +/** + * pm_generic_thaw_early - Generic thaw_early callback for subsystems. + * @dev: Device to thaw. + */ +int pm_generic_thaw_early(struct device *dev) +{ + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->thaw_early ? pm->thaw_early(dev) : 0; +} +EXPORT_SYMBOL_GPL(pm_generic_thaw_early); + /** * pm_generic_thaw - Generic thaw callback for subsystems. * @dev: Device to thaw. */ int pm_generic_thaw(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_THAW, false); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->thaw ? pm->thaw(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_thaw); @@ -222,17 +241,33 @@ EXPORT_SYMBOL_GPL(pm_generic_thaw); */ int pm_generic_resume_noirq(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_RESUME, true); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->resume_noirq ? pm->resume_noirq(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_resume_noirq); +/** + * pm_generic_resume_early - Generic resume_early callback for subsystems. + * @dev: Device to resume. + */ +int pm_generic_resume_early(struct device *dev) +{ + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->resume_early ? pm->resume_early(dev) : 0; +} +EXPORT_SYMBOL_GPL(pm_generic_resume_early); + /** * pm_generic_resume - Generic resume callback for subsystems. * @dev: Device to resume. */ int pm_generic_resume(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_RESUME, false); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->resume ? pm->resume(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_resume); @@ -242,17 +277,33 @@ EXPORT_SYMBOL_GPL(pm_generic_resume); */ int pm_generic_restore_noirq(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_RESTORE, true); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->restore_noirq ? pm->restore_noirq(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_restore_noirq); +/** + * pm_generic_restore_early - Generic restore_early callback for subsystems. + * @dev: Device to resume. + */ +int pm_generic_restore_early(struct device *dev) +{ + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->restore_early ? pm->restore_early(dev) : 0; +} +EXPORT_SYMBOL_GPL(pm_generic_restore_early); + /** * pm_generic_restore - Generic restore callback for subsystems. * @dev: Device to restore. */ int pm_generic_restore(struct device *dev) { - return __pm_generic_call(dev, PM_EVENT_RESTORE, false); + const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + + return pm && pm->restore ? pm->restore(dev) : 0; } EXPORT_SYMBOL_GPL(pm_generic_restore); diff --git a/include/linux/pm.h b/include/linux/pm.h index c68e1f22ac95..73c610573a74 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -632,17 +632,23 @@ extern void __suspend_report_result(const char *function, void *fn, int ret); extern int device_pm_wait_for_dev(struct device *sub, struct device *dev); extern int pm_generic_prepare(struct device *dev); +extern int pm_generic_suspend_late(struct device *dev); extern int pm_generic_suspend_noirq(struct device *dev); extern int pm_generic_suspend(struct device *dev); +extern int pm_generic_resume_early(struct device *dev); extern int pm_generic_resume_noirq(struct device *dev); extern int pm_generic_resume(struct device *dev); extern int pm_generic_freeze_noirq(struct device *dev); +extern int pm_generic_freeze_late(struct device *dev); extern int pm_generic_freeze(struct device *dev); extern int pm_generic_thaw_noirq(struct device *dev); +extern int pm_generic_thaw_early(struct device *dev); extern int pm_generic_thaw(struct device *dev); extern int pm_generic_restore_noirq(struct device *dev); +extern int pm_generic_restore_early(struct device *dev); extern int pm_generic_restore(struct device *dev); extern int pm_generic_poweroff_noirq(struct device *dev); +extern int pm_generic_poweroff_late(struct device *dev); extern int pm_generic_poweroff(struct device *dev); extern void pm_generic_complete(struct device *dev); From 8916e3702ec422b57cc549fbae3986106292100f Mon Sep 17 00:00:00 2001 From: Marcos Paulo de Souza Date: Sat, 4 Feb 2012 22:26:13 +0100 Subject: [PATCH 03/20] PM / Suspend: Avoid code duplication in suspend statistics update The code if (error) { suspend_stats.fail++; dpm_save_failed_errno(error); } else suspend_stats.success++; Appears in the kernel/power/main.c and kernel/power/suspend.c. This patch just creates a new function to avoid duplicated code. Suggested-by: Srivatsa S. Bhat Signed-off-by: Marcos Paulo de Souza Acked-by: Srivatsa S. Bhat Signed-off-by: Rafael J. Wysocki --- include/linux/suspend.h | 16 ++++++++++++++++ kernel/power/main.c | 6 +----- kernel/power/suspend.c | 6 +----- 3 files changed, 18 insertions(+), 10 deletions(-) diff --git a/include/linux/suspend.h b/include/linux/suspend.h index ac1c114c499d..b90191894441 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -94,6 +94,22 @@ static inline void dpm_save_failed_step(enum suspend_stat_step step) suspend_stats.last_failed_step %= REC_FAILED_NUM; } +/** + * suspend_stats_update - Update success/failure statistics of suspend-to-ram + * + * @error: Value returned by enter_state() function + */ +static inline void suspend_stats_update(int error) +{ + if (error) { + suspend_stats.fail++; + dpm_save_failed_errno(error); + } else { + suspend_stats.success++; + } +} + + /** * struct platform_suspend_ops - Callbacks for managing platform dependent * system sleep states. diff --git a/kernel/power/main.c b/kernel/power/main.c index 8c5014a4e052..b1e324878d5f 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -296,11 +296,7 @@ static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr, } if (state < PM_SUSPEND_MAX && *s) { error = enter_state(state); - if (error) { - suspend_stats.fail++; - dpm_save_failed_errno(error); - } else - suspend_stats.success++; + suspend_stats_update(error); } #endif diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 560a639614a1..03bc92b42750 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -321,11 +321,7 @@ int pm_suspend(suspend_state_t state) int ret; if (state > PM_SUSPEND_ON && state < PM_SUSPEND_MAX) { ret = enter_state(state); - if (ret) { - suspend_stats.fail++; - dpm_save_failed_errno(ret); - } else - suspend_stats.success++; + suspend_stats_update(ret); return ret; } return -EINVAL; From 9045a05044268b075c13bb0284601b24959dc3c6 Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Sat, 4 Feb 2012 22:26:26 +0100 Subject: [PATCH 04/20] PM / Freezer / Docs: Document the beauty of freeze/thaw semantics The way the different freeze/thaw functions encapsulate each other are quite lovely from a design point of view. And as a side-effect, the way in which they are invoked (cleaning up on failure for example) differs significantly from how usual functions are dealt with. This is because of the underlying semantics that govern the freezing and thawing of various tasks. This subtle aspect that differentiates these functions from the rest, is worth documenting. Many thanks to Tejun Heo for providing enlightenment on this topic. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Rafael J. Wysocki --- Documentation/power/freezing-of-tasks.txt | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt index ebd7490ef1df..ec715cd78fbb 100644 --- a/Documentation/power/freezing-of-tasks.txt +++ b/Documentation/power/freezing-of-tasks.txt @@ -63,6 +63,27 @@ devices have been reinitialized, the function thaw_processes() is called in order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that have been frozen leave __refrigerator() and continue running. + +Rationale behind the functions dealing with freezing and thawing of tasks: +------------------------------------------------------------------------- + +freeze_processes(): + - freezes only userspace tasks + +freeze_kernel_threads(): + - freezes all tasks (including kernel threads) because we can't freeze + kernel threads without freezing userspace tasks + +thaw_kernel_threads(): + - thaws only kernel threads; this is particularly useful if we need to do + anything special in between thawing of kernel threads and thawing of + userspace tasks, or if we want to postpone the thawing of userspace tasks + +thaw_processes(): + - thaws all tasks (including kernel threads) because we can't thaw userspace + tasks without thawing kernel threads + + III. Which kernel threads are freezable? Kernel threads are not freezable by default. However, a kernel thread may clear From 51d6ff7acd920379f54d0be4dbe844a46178a65f Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Sat, 4 Feb 2012 22:26:38 +0100 Subject: [PATCH 05/20] PM / Hibernate: Thaw kernel threads in hibernation_snapshot() in error/test path In the hibernation call path, the kernel threads are frozen inside hibernation_snapshot(). If we happen to encounter an error further down the road or if we are exiting early due to a successful freezer test, then thaw kernel threads before returning to the caller. Signed-off-by: Srivatsa S. Bhat Acked-by: Tejun Heo Signed-off-by: Rafael J. Wysocki --- kernel/power/hibernate.c | 6 ++++-- kernel/power/user.c | 8 ++------ 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index a5d4cf0aa03e..c6dee739080c 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -343,13 +343,13 @@ int hibernation_snapshot(int platform_mode) * successful freezer test. */ freezer_test_done = true; - goto Cleanup; + goto Thaw; } error = dpm_prepare(PMSG_FREEZE); if (error) { dpm_complete(PMSG_RECOVER); - goto Cleanup; + goto Thaw; } suspend_console(); @@ -385,6 +385,8 @@ int hibernation_snapshot(int platform_mode) platform_end(platform_mode); return error; + Thaw: + thaw_kernel_threads(); Cleanup: swsusp_free(); goto Close; diff --git a/kernel/power/user.c b/kernel/power/user.c index 3e100075b13c..7bee91f9af51 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -249,16 +249,12 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd, } pm_restore_gfp_mask(); error = hibernation_snapshot(data->platform_support); - if (error) { - thaw_kernel_threads(); - } else { + if (!error) { error = put_user(in_suspend, (int __user *)arg); if (!error && !freezer_test_done) data->ready = 1; - if (freezer_test_done) { + if (freezer_test_done) freezer_test_done = false; - thaw_kernel_threads(); - } } break; From a556d5b58345ccf51826b9ceac078072f830738b Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Sat, 4 Feb 2012 23:39:56 +0100 Subject: [PATCH 06/20] PM / Hibernate: Refactor and simplify freezer_test_done The code related to 'freezer_test_done' is needlessly convoluted. Refactor the code and simplify the implementation. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Rafael J. Wysocki --- kernel/power/hibernate.c | 10 +++++----- kernel/power/user.c | 6 ++---- 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index c6dee739080c..72baaf011fb7 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -629,12 +629,8 @@ int hibernate(void) goto Finish; error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM); - if (error) + if (error || freezer_test_done) goto Thaw; - if (freezer_test_done) { - freezer_test_done = false; - goto Thaw; - } if (in_suspend) { unsigned int flags = 0; @@ -659,6 +655,10 @@ int hibernate(void) Thaw: thaw_processes(); + + /* Don't bother checking whether freezer_test_done is true */ + freezer_test_done = false; + Finish: free_basic_memory_bitmaps(); usermodehelper_enable(); diff --git a/kernel/power/user.c b/kernel/power/user.c index 7bee91f9af51..33c4329205af 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -251,10 +251,8 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd, error = hibernation_snapshot(data->platform_support); if (!error) { error = put_user(in_suspend, (int __user *)arg); - if (!error && !freezer_test_done) - data->ready = 1; - if (freezer_test_done) - freezer_test_done = false; + data->ready = !freezer_test_done && !error; + freezer_test_done = false; } break; From 7c95149b7f1f61201b12c73c4862a41bf2428961 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Sat, 11 Feb 2012 00:00:11 +0100 Subject: [PATCH 07/20] PM / Sleep: Initialize wakeup source locks in wakeup_source_add() Initialize wakeup source locks in wakeup_source_add() instead of wakeup_source_create(), because otherwise the locks of the wakeup sources that haven't been allocated with wakeup_source_create() aren't initialized and handled properly. Signed-off-by: Rafael J. Wysocki --- drivers/base/power/wakeup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index caf995fb774b..6e591a8a49da 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -64,7 +64,6 @@ struct wakeup_source *wakeup_source_create(const char *name) if (!ws) return NULL; - spin_lock_init(&ws->lock); if (name) ws->name = kstrdup(name, GFP_KERNEL); @@ -105,6 +104,7 @@ void wakeup_source_add(struct wakeup_source *ws) if (WARN_ON(!ws)) return; + spin_lock_init(&ws->lock); setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws); ws->active = false; From 6c83b4818dd65eb17e633b6b629a81da7bed90b3 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Sat, 11 Feb 2012 00:00:34 +0100 Subject: [PATCH 08/20] PM / Sleep: Do not check wakeup too often in try_to_freeze_tasks() Use the observation that it is more efficient to check the wakeup variable once before the loop reporting tasks that were not frozen in try_to_freeze_tasks() than to do that in every step of that loop. Signed-off-by: Rafael J. Wysocki --- kernel/power/process.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/kernel/power/process.c b/kernel/power/process.c index 7e426459e60a..6aeb5efe00eb 100644 --- a/kernel/power/process.c +++ b/kernel/power/process.c @@ -98,13 +98,15 @@ static int try_to_freeze_tasks(bool user_only) elapsed_csecs / 100, elapsed_csecs % 100, todo - wq_busy, wq_busy); - read_lock(&tasklist_lock); - do_each_thread(g, p) { - if (!wakeup && !freezer_should_skip(p) && - p != current && freezing(p) && !frozen(p)) - sched_show_task(p); - } while_each_thread(g, p); - read_unlock(&tasklist_lock); + if (!wakeup) { + read_lock(&tasklist_lock); + do_each_thread(g, p) { + if (p != current && !freezer_should_skip(p) + && freezing(p) && !frozen(p)) + sched_show_task(p); + } while_each_thread(g, p); + read_unlock(&tasklist_lock); + } } else { printk("(elapsed %d.%02d seconds) ", elapsed_csecs / 100, elapsed_csecs % 100); From 6f585f750d792652f33b6e85b1ee205be4b5e572 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Sat, 11 Feb 2012 22:40:23 +0100 Subject: [PATCH 09/20] PM / Sleep: Remove unnecessary label from suspend_freeze_processes() The Finish label in suspend_freeze_processes() is in fact unnecessary and makes the function look more complicated than it really is, so remove that label (along with a few empty lines). Signed-off-by: Rafael J. Wysocki Acked-by: Srivatsa S. Bhat --- kernel/power/power.h | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/kernel/power/power.h b/kernel/power/power.h index 21724eee5206..398d42b48e9e 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -234,16 +234,14 @@ static inline int suspend_freeze_processes(void) int error; error = freeze_processes(); - /* * freeze_processes() automatically thaws every task if freezing * fails. So we need not do anything extra upon error. */ if (error) - goto Finish; + return error; error = freeze_kernel_threads(); - /* * freeze_kernel_threads() thaws only kernel threads upon freezing * failure. So we have to thaw the userspace tasks ourselves. @@ -251,7 +249,6 @@ static inline int suspend_freeze_processes(void) if (error) thaw_processes(); - Finish: return error; } From 55ae451918ec62e553f11b6118fec157f90c31c3 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 13 Feb 2012 16:29:14 +0100 Subject: [PATCH 10/20] PM / Sleep: Unify kerneldoc comments in kernel/power/suspend.c The kerneldoc comments in kernel/power/suspend.c are not formatted in the same way and the quality of some of them is questionable. Unify the formatting and improve the contents. Signed-off-by: Rafael J. Wysocki Acked-by: Srivatsa S. Bhat --- kernel/power/suspend.c | 56 ++++++++++++++++++++---------------------- 1 file changed, 27 insertions(+), 29 deletions(-) diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 03bc92b42750..e6b5ef958603 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -37,8 +37,8 @@ const char *const pm_states[PM_SUSPEND_MAX] = { static const struct platform_suspend_ops *suspend_ops; /** - * suspend_set_ops - Set the global suspend method table. - * @ops: Pointer to ops structure. + * suspend_set_ops - Set the global suspend method table. + * @ops: Suspend operations to use. */ void suspend_set_ops(const struct platform_suspend_ops *ops) { @@ -58,11 +58,11 @@ bool valid_state(suspend_state_t state) } /** - * suspend_valid_only_mem - generic memory-only valid callback + * suspend_valid_only_mem - Generic memory-only valid callback. * - * Platform drivers that implement mem suspend only and only need - * to check for that in their .valid callback can use this instead - * of rolling their own .valid callback. + * Platform drivers that implement mem suspend only and only need to check for + * that in their .valid() callback can use this instead of rolling their own + * .valid() callback. */ int suspend_valid_only_mem(suspend_state_t state) { @@ -83,10 +83,11 @@ static int suspend_test(int level) } /** - * suspend_prepare - Do prep work before entering low-power state. + * suspend_prepare - Prepare for entering system sleep state. * - * This is common code that is called for each state that we're entering. - * Run suspend notifiers, allocate a console and stop all processes. + * Common code run for every system sleep state that can be entered (except for + * hibernation). Run suspend notifiers, allocate the "suspend" console and + * freeze processes. */ static int suspend_prepare(void) { @@ -131,9 +132,9 @@ void __attribute__ ((weak)) arch_suspend_enable_irqs(void) } /** - * suspend_enter - enter the desired system sleep state. - * @state: State to enter - * @wakeup: Returns information that suspend should not be entered again. + * suspend_enter - Make the system enter the given sleep state. + * @state: System sleep state to enter. + * @wakeup: Returns information that the sleep state should not be re-entered. * * This function should be called after devices have been suspended. */ @@ -199,9 +200,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup) } /** - * suspend_devices_and_enter - suspend devices and enter the desired system - * sleep state. - * @state: state to enter + * suspend_devices_and_enter - Suspend devices and enter system sleep state. + * @state: System sleep state to enter. */ int suspend_devices_and_enter(suspend_state_t state) { @@ -251,10 +251,10 @@ int suspend_devices_and_enter(suspend_state_t state) } /** - * suspend_finish - Do final work before exiting suspend sequence. + * suspend_finish - Clean up before finishing the suspend sequence. * - * Call platform code to clean up, restart processes, and free the - * console that we've allocated. This is not called for suspend-to-disk. + * Call platform code to clean up, restart processes, and free the console that + * we've allocated. This routine is not called for hibernation. */ static void suspend_finish(void) { @@ -265,14 +265,12 @@ static void suspend_finish(void) } /** - * enter_state - Do common work of entering low-power state. - * @state: pm_state structure for state we're entering. + * enter_state - Do common work needed to enter system sleep state. + * @state: System sleep state to enter. * - * Make sure we're the only ones trying to enter a sleep state. Fail - * if someone has beat us to it, since we don't want anything weird to - * happen when we wake up. - * Then, do the setup for suspend, enter the state, and cleaup (after - * we've woken up). + * Make sure that no one else is trying to put the system into a sleep state. + * Fail if that's not the case. Otherwise, prepare for system suspend, make the + * system enter the given sleep state and clean up after wakeup. */ int enter_state(suspend_state_t state) { @@ -310,11 +308,11 @@ int enter_state(suspend_state_t state) } /** - * pm_suspend - Externally visible function for suspending system. - * @state: Enumerated value of state to enter. + * pm_suspend - Externally visible function for suspending the system. + * @state: System sleep state to enter. * - * Determine whether or not value is within range, get state - * structure, and enter (above). + * Check if the value of @state represents one of the supported states, + * execute enter_state() and update system suspend statistics. */ int pm_suspend(suspend_state_t state) { From 93e1ee43a72b11e1b50aab87046c131a836a4456 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 13 Feb 2012 16:29:24 +0100 Subject: [PATCH 11/20] PM / Sleep: Make enter_state() in kernel/power/suspend.c static The enter_state() function in kernel/power/suspend.c should be static and state_store() in kernel/power/suspend.c should call pm_suspend() instead of it, so make that happen (which also reduces code duplication related to suspend statistics). Signed-off-by: Rafael J. Wysocki Acked-by: Srivatsa S. Bhat --- kernel/power/main.c | 8 +++----- kernel/power/power.h | 2 -- kernel/power/suspend.c | 2 +- 3 files changed, 4 insertions(+), 8 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index b1e324878d5f..1c12581f1c62 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -291,12 +291,10 @@ static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr, #ifdef CONFIG_SUSPEND for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) { - if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) + if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) { + error = pm_suspend(state); break; - } - if (state < PM_SUSPEND_MAX && *s) { - error = enter_state(state); - suspend_stats_update(error); + } } #endif diff --git a/kernel/power/power.h b/kernel/power/power.h index 398d42b48e9e..98f3622d7407 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -177,13 +177,11 @@ extern const char *const pm_states[]; extern bool valid_state(suspend_state_t state); extern int suspend_devices_and_enter(suspend_state_t state); -extern int enter_state(suspend_state_t state); #else /* !CONFIG_SUSPEND */ static inline int suspend_devices_and_enter(suspend_state_t state) { return -ENOSYS; } -static inline int enter_state(suspend_state_t state) { return -ENOSYS; } static inline bool valid_state(suspend_state_t state) { return false; } #endif /* !CONFIG_SUSPEND */ diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index e6b5ef958603..4914358a0543 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -272,7 +272,7 @@ static void suspend_finish(void) * Fail if that's not the case. Otherwise, prepare for system suspend, make the * system enter the given sleep state and clean up after wakeup. */ -int enter_state(suspend_state_t state) +static int enter_state(suspend_state_t state) { int error; From bc25cf508942c56810d4fb623ef27b56ccef7783 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 13 Feb 2012 16:29:33 +0100 Subject: [PATCH 12/20] PM / Sleep: Drop suspend_stats_update() Since suspend_stats_update() is only called from pm_suspend(), move its code directly into that function and remove the static inline definition from include/linux/suspend.h. Clean_up pm_suspend() in the process. Signed-off-by: Rafael J. Wysocki Acked-by: Srivatsa S. Bhat --- include/linux/suspend.h | 16 ---------------- kernel/power/suspend.c | 18 ++++++++++++------ 2 files changed, 12 insertions(+), 22 deletions(-) diff --git a/include/linux/suspend.h b/include/linux/suspend.h index b90191894441..ac1c114c499d 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -94,22 +94,6 @@ static inline void dpm_save_failed_step(enum suspend_stat_step step) suspend_stats.last_failed_step %= REC_FAILED_NUM; } -/** - * suspend_stats_update - Update success/failure statistics of suspend-to-ram - * - * @error: Value returned by enter_state() function - */ -static inline void suspend_stats_update(int error) -{ - if (error) { - suspend_stats.fail++; - dpm_save_failed_errno(error); - } else { - suspend_stats.success++; - } -} - - /** * struct platform_suspend_ops - Callbacks for managing platform dependent * system sleep states. diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 4914358a0543..88e5c967370d 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -316,12 +316,18 @@ static int enter_state(suspend_state_t state) */ int pm_suspend(suspend_state_t state) { - int ret; - if (state > PM_SUSPEND_ON && state < PM_SUSPEND_MAX) { - ret = enter_state(state); - suspend_stats_update(ret); - return ret; + int error; + + if (state <= PM_SUSPEND_ON || state >= PM_SUSPEND_MAX) + return -EINVAL; + + error = enter_state(state); + if (error) { + suspend_stats.fail++; + dpm_save_failed_errno(error); + } else { + suspend_stats.success++; } - return -EINVAL; + return error; } EXPORT_SYMBOL(pm_suspend); From c48825251cf5950da9d618144c4db6c130e6c0cd Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 13 Feb 2012 16:29:47 +0100 Subject: [PATCH 13/20] PM: Add comment describing relationships between PM callbacks to pm.h The UNIVERSAL_DEV_PM_OPS() macro is slightly misleading, because it may suggest that it's a good idea to point runtime PM callback pointers to the same routines as system suspend/resume callbacks .suspend() and .resume(), which is not the case. For this reason, add a comment to include/linux/pm.h, next to the definition of UNIVERSAL_DEV_PM_OPS(), describing how device PM callbacks are related to each other. Signed-off-by: Rafael J. Wysocki --- include/linux/pm.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/include/linux/pm.h b/include/linux/pm.h index 73c610573a74..d6dd6f612b8d 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -320,6 +320,15 @@ const struct dev_pm_ops name = { \ /* * Use this for defining a set of PM operations to be used in all situations * (sustem suspend, hibernation or runtime PM). + * NOTE: In general, system suspend callbacks, .suspend() and .resume(), should + * be different from the corresponding runtime PM callbacks, .runtime_suspend(), + * and .runtime_resume(), because .runtime_suspend() always works on an already + * quiescent device, while .suspend() should assume that the device may be doing + * something when it is called (it should ensure that the device will be + * quiescent after it has returned). Therefore it's better to point the "late" + * suspend and "early" resume callback pointers, .suspend_late() and + * .resume_early(), to the same routines as .runtime_suspend() and + * .runtime_resume(), respectively (and analogously for hibernation). */ #define UNIVERSAL_DEV_PM_OPS(name, suspend_fn, resume_fn, idle_fn) \ const struct dev_pm_ops name = { \ From 69f1d475cc80c55121852b3030873cdd407fd31c Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Tue, 14 Feb 2012 22:20:52 +0100 Subject: [PATCH 14/20] PM / Hibernate: print physical addresses consistently with other parts of kernel Print physical address info in a style consistent with the %pR style used elsewhere in the kernel. Signed-off-by: Bjorn Helgaas Signed-off-by: Rafael J. Wysocki --- kernel/power/snapshot.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index 6a768e537001..8e2e7461375f 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -711,9 +711,10 @@ static void mark_nosave_pages(struct memory_bitmap *bm) list_for_each_entry(region, &nosave_regions, list) { unsigned long pfn; - pr_debug("PM: Marking nosave pages: %016lx - %016lx\n", - region->start_pfn << PAGE_SHIFT, - region->end_pfn << PAGE_SHIFT); + pr_debug("PM: Marking nosave pages: [mem %#010llx-%#010llx]\n", + (unsigned long long) region->start_pfn << PAGE_SHIFT, + ((unsigned long long) region->end_pfn << PAGE_SHIFT) + - 1); for (pfn = region->start_pfn; pfn < region->end_pfn; pfn++) if (pfn_valid(pfn)) { From d94aff87826ee6aa43032f4c0263482913f4e2c8 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Fri, 17 Feb 2012 23:39:20 +0100 Subject: [PATCH 15/20] PM / Sleep: Fix possible infinite loop during wakeup source destruction MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If wakeup_source_destroy() is called for an active wakeup source that is never deactivated, it will spin forever. To prevent that from happening, make wakeup_source_destroy() call __pm_relax() for the wakeup source object it is about to free instead of waiting until it will be deactivated by someone else. However, for this to work it also needs to make sure that the timer function will not be executed after the final __pm_relax(), so make it run del_timer_sync() on the wakeup source's timer beforehand. Additionally, update the kerneldoc comment to document the requirement that __pm_stay_awake() and __pm_wakeup_event() must not be run in parallel with wakeup_source_destroy(). Reported-by: Arve Hjønnevåg Signed-off-by: Rafael J. Wysocki --- drivers/base/power/wakeup.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index 6e591a8a49da..d279f462d624 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -74,22 +74,17 @@ EXPORT_SYMBOL_GPL(wakeup_source_create); /** * wakeup_source_destroy - Destroy a struct wakeup_source object. * @ws: Wakeup source to destroy. + * + * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never + * be run in parallel with this function for the same wakeup source object. */ void wakeup_source_destroy(struct wakeup_source *ws) { if (!ws) return; - spin_lock_irq(&ws->lock); - while (ws->active) { - spin_unlock_irq(&ws->lock); - - schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT)); - - spin_lock_irq(&ws->lock); - } - spin_unlock_irq(&ws->lock); - + del_timer_sync(&ws->timer); + __pm_relax(ws); kfree(ws->name); kfree(ws); } From da863cddd831b0f4bf2d067f8b75254f1be94590 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Fri, 17 Feb 2012 23:39:33 +0100 Subject: [PATCH 16/20] PM / Sleep: Fix race conditions related to wakeup source timer function MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If __pm_wakeup_event() has been used (with a nonzero timeout) to report a wakeup event and then __pm_relax() immediately followed by __pm_stay_awake() is called or __pm_wakeup_event() is called once again for the same wakeup source object before its timer expires, the timer function pm_wakeup_timer_fn() may still be run as a result of the previous __pm_wakeup_event() call. In either of those cases it may mistakenly deactivate the wakeup source that has just been activated. To prevent that from happening, make wakeup_source_deactivate() clear the wakeup source's timer_expires field and make pm_wakeup_timer_fn() check if timer_expires is different from zero and if it's not in future before calling wakeup_source_deactivate() (if timer_expires is 0, it means that the timer has just been deleted and if timer_expires is in future, it means that the timer has just been rescheduled to a different time). Reported-by: Arve Hjønnevåg Signed-off-by: Rafael J. Wysocki --- drivers/base/power/wakeup.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index d279f462d624..b38bb9afb719 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -433,6 +433,7 @@ static void wakeup_source_deactivate(struct wakeup_source *ws) ws->max_time = duration; del_timer(&ws->timer); + ws->timer_expires = 0; /* * Increment the counter of registered wakeup events and decrement the @@ -487,11 +488,22 @@ EXPORT_SYMBOL_GPL(pm_relax); * pm_wakeup_timer_fn - Delayed finalization of a wakeup event. * @data: Address of the wakeup source object associated with the event source. * - * Call __pm_relax() for the wakeup source whose address is stored in @data. + * Call wakeup_source_deactivate() for the wakeup source whose address is stored + * in @data if it is currently active and its timer has not been canceled and + * the expiration time of the timer is not in future. */ static void pm_wakeup_timer_fn(unsigned long data) { - __pm_relax((struct wakeup_source *)data); + struct wakeup_source *ws = (struct wakeup_source *)data; + unsigned long flags; + + spin_lock_irqsave(&ws->lock, flags); + + if (ws->active && ws->timer_expires + && time_after_eq(jiffies, ws->timer_expires)) + wakeup_source_deactivate(ws); + + spin_unlock_irqrestore(&ws->lock, flags); } /** From 4782e1654bdbd30cf307da090b3c4f70157477cb Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Fri, 17 Feb 2012 23:39:39 +0100 Subject: [PATCH 17/20] PM / Sleep: Make __pm_stay_awake() delete wakeup source timers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If __pm_stay_awake() is called after __pm_wakeup_event() for the same wakep source object before its timer expires, it won't cancel the timer, so the wakeup source will be deactivated from the timer function as scheduled by __pm_wakeup_event(). In that case __pm_stay_awake() doesn't have any effect beyond incrementing the wakeup source's event_count field, although it should cancel the timer and make the wakeup source stay active until __pm_relax() is called for it. To fix this problem make __pm_stay_awake() delete the wakeup source's timer and ensure that it won't be deactivated from the timer funtion afterwards by clearing its timer_expires field. Reported-by: Arve Hjønnevåg Signed-off-by: Rafael J. Wysocki --- drivers/base/power/wakeup.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index b38bb9afb719..7c5ab70b9ef3 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -344,7 +344,6 @@ static void wakeup_source_activate(struct wakeup_source *ws) { ws->active = true; ws->active_count++; - ws->timer_expires = jiffies; ws->last_time = ktime_get(); /* Increment the counter of events in progress. */ @@ -365,9 +364,14 @@ void __pm_stay_awake(struct wakeup_source *ws) return; spin_lock_irqsave(&ws->lock, flags); + ws->event_count++; if (!ws->active) wakeup_source_activate(ws); + + del_timer(&ws->timer); + ws->timer_expires = 0; + spin_unlock_irqrestore(&ws->lock, flags); } EXPORT_SYMBOL_GPL(__pm_stay_awake); @@ -541,7 +545,7 @@ void __pm_wakeup_event(struct wakeup_source *ws, unsigned int msec) if (!expires) expires = 1; - if (time_after(expires, ws->timer_expires)) { + if (!ws->timer_expires || time_after(expires, ws->timer_expires)) { mod_timer(&ws->timer, expires); ws->timer_expires = expires; } From 05b4877f6a4f1ba4952d1222213d262bf8c132b7 Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Fri, 17 Feb 2012 23:39:51 +0100 Subject: [PATCH 18/20] PM / Hibernate: Enable usermodehelpers in hibernate() error path If create_basic_memory_bitmaps() fails, usermodehelpers are not re-enabled before returning. Fix this. And while at it, reword the goto labels so that they look more meaningful. Signed-off-by: Srivatsa S. Bhat Cc: stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki --- kernel/power/hibernate.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 72baaf011fb7..0a186cfde788 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -618,7 +618,7 @@ int hibernate(void) /* Allocate memory management structures */ error = create_basic_memory_bitmaps(); if (error) - goto Exit; + goto Enable_umh; printk(KERN_INFO "PM: Syncing filesystems ... "); sys_sync(); @@ -626,7 +626,7 @@ int hibernate(void) error = freeze_processes(); if (error) - goto Finish; + goto Free_bitmaps; error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM); if (error || freezer_test_done) @@ -659,8 +659,9 @@ int hibernate(void) /* Don't bother checking whether freezer_test_done is true */ freezer_test_done = false; - Finish: + Free_bitmaps: free_basic_memory_bitmaps(); + Enable_umh: usermodehelper_enable(); Exit: pm_notifier_call_chain(PM_POST_HIBERNATION); From 8671bbc1bd0442ef0eab27f7d56216431c490820 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 21 Feb 2012 23:47:56 +0100 Subject: [PATCH 19/20] PM / Sleep: Add more wakeup source initialization routines MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The existing wakeup source initialization routines are not particularly useful for wakeup sources that aren't created by wakeup_source_create(), because their users have to open code filling the objects with zeros and setting their names. For this reason, introduce routines that can be used for initializing, for example, static wakeup source objects. Requested-by: Arve Hjønnevåg Signed-off-by: Rafael J. Wysocki --- drivers/base/power/wakeup.c | 50 ++++++++++++++++++++++++++++++------- include/linux/pm_wakeup.h | 22 +++++++++++++++- 2 files changed, 62 insertions(+), 10 deletions(-) diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index 7c5ab70b9ef3..2a3e581b8dcd 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -52,6 +52,23 @@ static void pm_wakeup_timer_fn(unsigned long data); static LIST_HEAD(wakeup_sources); +/** + * wakeup_source_prepare - Prepare a new wakeup source for initialization. + * @ws: Wakeup source to prepare. + * @name: Pointer to the name of the new wakeup source. + * + * Callers must ensure that the @name string won't be freed when @ws is still in + * use. + */ +void wakeup_source_prepare(struct wakeup_source *ws, const char *name) +{ + if (ws) { + memset(ws, 0, sizeof(*ws)); + ws->name = name; + } +} +EXPORT_SYMBOL_GPL(wakeup_source_prepare); + /** * wakeup_source_create - Create a struct wakeup_source object. * @name: Name of the new wakeup source. @@ -60,31 +77,44 @@ struct wakeup_source *wakeup_source_create(const char *name) { struct wakeup_source *ws; - ws = kzalloc(sizeof(*ws), GFP_KERNEL); + ws = kmalloc(sizeof(*ws), GFP_KERNEL); if (!ws) return NULL; - if (name) - ws->name = kstrdup(name, GFP_KERNEL); - + wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL); return ws; } EXPORT_SYMBOL_GPL(wakeup_source_create); /** - * wakeup_source_destroy - Destroy a struct wakeup_source object. - * @ws: Wakeup source to destroy. + * wakeup_source_drop - Prepare a struct wakeup_source object for destruction. + * @ws: Wakeup source to prepare for destruction. * * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never * be run in parallel with this function for the same wakeup source object. */ -void wakeup_source_destroy(struct wakeup_source *ws) +void wakeup_source_drop(struct wakeup_source *ws) { if (!ws) return; del_timer_sync(&ws->timer); __pm_relax(ws); +} +EXPORT_SYMBOL_GPL(wakeup_source_drop); + +/** + * wakeup_source_destroy - Destroy a struct wakeup_source object. + * @ws: Wakeup source to destroy. + * + * Use only for wakeup source objects created with wakeup_source_create(). + */ +void wakeup_source_destroy(struct wakeup_source *ws) +{ + if (!ws) + return; + + wakeup_source_drop(ws); kfree(ws->name); kfree(ws); } @@ -147,8 +177,10 @@ EXPORT_SYMBOL_GPL(wakeup_source_register); */ void wakeup_source_unregister(struct wakeup_source *ws) { - wakeup_source_remove(ws); - wakeup_source_destroy(ws); + if (ws) { + wakeup_source_remove(ws); + wakeup_source_destroy(ws); + } } EXPORT_SYMBOL_GPL(wakeup_source_unregister); diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h index a32da962d693..d9f05113e5fb 100644 --- a/include/linux/pm_wakeup.h +++ b/include/linux/pm_wakeup.h @@ -41,7 +41,7 @@ * @active: Status of the wakeup source. */ struct wakeup_source { - char *name; + const char *name; struct list_head entry; spinlock_t lock; struct timer_list timer; @@ -73,7 +73,9 @@ static inline bool device_may_wakeup(struct device *dev) } /* drivers/base/power/wakeup.c */ +extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name); extern struct wakeup_source *wakeup_source_create(const char *name); +extern void wakeup_source_drop(struct wakeup_source *ws); extern void wakeup_source_destroy(struct wakeup_source *ws); extern void wakeup_source_add(struct wakeup_source *ws); extern void wakeup_source_remove(struct wakeup_source *ws); @@ -103,11 +105,16 @@ static inline bool device_can_wakeup(struct device *dev) return dev->power.can_wakeup; } +static inline void wakeup_source_prepare(struct wakeup_source *ws, + const char *name) {} + static inline struct wakeup_source *wakeup_source_create(const char *name) { return NULL; } +static inline void wakeup_source_drop(struct wakeup_source *ws) {} + static inline void wakeup_source_destroy(struct wakeup_source *ws) {} static inline void wakeup_source_add(struct wakeup_source *ws) {} @@ -165,4 +172,17 @@ static inline void pm_wakeup_event(struct device *dev, unsigned int msec) {} #endif /* !CONFIG_PM_SLEEP */ +static inline void wakeup_source_init(struct wakeup_source *ws, + const char *name) +{ + wakeup_source_prepare(ws, name); + wakeup_source_add(ws); +} + +static inline void wakeup_source_trash(struct wakeup_source *ws) +{ + wakeup_source_remove(ws); + wakeup_source_drop(ws); +} + #endif /* _LINUX_PM_WAKEUP_H */ From 37f08be11be9a7d9351fb1b9b408259519a126f3 Mon Sep 17 00:00:00 2001 From: Marcos Paulo de Souza Date: Tue, 21 Feb 2012 23:57:47 +0100 Subject: [PATCH 20/20] PM / Freezer: Remove references to TIF_FREEZE in comments This patch removes all the references in the code about the TIF_FREEZE flag removed by commit a3201227f803ad7fd43180c5195dbe5a2bf998aa freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE There still are some references to TIF_FREEZE in Documentation/power/freezing-of-tasks.txt, but it looks like that documentation needs more thorough work to reflect how the new freezer works, and hence merely removing the references to TIF_FREEZE won't really help. So I have not touched that part in this patch. Suggested-by: Srivatsa S. Bhat Signed-off-by: Marcos Paulo de Souza Signed-off-by: Rafael J. Wysocki --- kernel/exit.c | 2 +- kernel/freezer.c | 6 +++--- kernel/power/process.c | 8 +++----- 3 files changed, 7 insertions(+), 9 deletions(-) diff --git a/kernel/exit.c b/kernel/exit.c index 294b1709170d..fd0af05e0639 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -424,7 +424,7 @@ void daemonize(const char *name, ...) */ exit_mm(current); /* - * We don't want to have TIF_FREEZE set if the system-wide hibernation + * We don't want to get frozen, in case system-wide hibernation * or suspend transition begins right now. */ current->flags |= (PF_NOFREEZE | PF_KTHREAD); diff --git a/kernel/freezer.c b/kernel/freezer.c index 9815b8d1eed5..11f82a4d4eae 100644 --- a/kernel/freezer.c +++ b/kernel/freezer.c @@ -99,9 +99,9 @@ static void fake_signal_wake_up(struct task_struct *p) * freeze_task - send a freeze request to given task * @p: task to send the request to * - * If @p is freezing, the freeze request is sent by setting %TIF_FREEZE - * flag and either sending a fake signal to it or waking it up, depending - * on whether it has %PF_FREEZER_NOSIG set. + * If @p is freezing, the freeze request is sent either by sending a fake + * signal (if it's not a kernel thread) or waking it up (if it's a kernel + * thread). * * RETURNS: * %false, if @p is not freezing or already frozen; %true, otherwise diff --git a/kernel/power/process.c b/kernel/power/process.c index 6aeb5efe00eb..0d2aeb226108 100644 --- a/kernel/power/process.c +++ b/kernel/power/process.c @@ -53,11 +53,9 @@ static int try_to_freeze_tasks(bool user_only) * It is "frozen enough". If the task does wake * up, it will immediately call try_to_freeze. * - * Because freeze_task() goes through p's - * scheduler lock after setting TIF_FREEZE, it's - * guaranteed that either we see TASK_RUNNING or - * try_to_stop() after schedule() in ptrace/signal - * stop sees TIF_FREEZE. + * Because freeze_task() goes through p's scheduler lock, it's + * guaranteed that TASK_STOPPED/TRACED -> TASK_RUNNING + * transition can't race with task state testing here. */ if (!task_is_stopped_or_traced(p) && !freezer_should_skip(p))