From: Takao Indoh <tindoh@redhat.com>
Date: Mon, 19 Jul 2010 16:11:12 -0400
Subject: [ipmi] add parameter to limit CPU usage in kipmid
Message-id: <20100719161112.27697.76231.sendpatchset@flat.lab.bos.redhat.com>
Patchwork-id: 26947
O-Subject: [RHEL5.6 PATCH v2][IPMI] Add parameter to limit CPU usage in kipmid
Bugzilla: 494680
RH-Acked-by: Larry Woodman <lwoodman@redhat.com>

BZ#494680
https://bugzilla.redhat.com/show_bug.cgi?id=494680

I backported ae74e823cb7d4cd476f623fce9a38f625f6c09a8 and posted a patch to
rh-kernel, but a problem was then reported upstream:
https://bugzilla.kernel.org/show_bug.cgi?id=16147
That problem was fixed by 8d1f66dc9b4f80a1441bc1c33efa98aca99e8813, so I am
posting again.

[Summary]
In some cases kipmid can use a lot of CPU depending on the interface's
performance. This can waste a lot of CPU and cause various issues with
detecting idle CPU and using extra power.

[How to fix]
Add a new module parameter, kipmid_max_busy_us, which sets the maximum amount
of time, in microseconds, that kipmid will spin before sleeping for a tick.
This value sets a balance between performance and CPU waste and needs to be
tuned to your needs.

[Upstream]
Included in upstream:

commit ae74e823cb7d4cd476f623fce9a38f625f6c09a8
Author: Martin Wilck <martin.wilck@ts.fujitsu.com>
Date:   Wed Mar 10 15:23:06 2010 -0800

commit 8d1f66dc9b4f80a1441bc1c33efa98aca99e8813
Author: Martin Wilck <martin.wilck@ts.fujitsu.com>
Date:   Tue Jun 29 15:05:31 2010 -0700

[Test status]
The reporter of this problem tested the patch and confirmed that it works
as expected.
- In the default configuration (kipmid_max_busy_us=0) the behaviour is the
  same as before (high CPU load from kipmid).
- With kipmid_max_busy_us set to 500, kipmid no longer shows up as a heavy
  CPU consumer in top.

Successful brew scratch build against each arch:
https://brewweb.devel.redhat.com/taskinfo?taskID=2592413

Please review and ACK.

Thanks,
Takao Indoh

diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt
index 2b705df..793ad43 100644
--- a/Documentation/IPMI.txt
+++ b/Documentation/IPMI.txt
@@ -365,6 +365,7 @@ You can change this at module load time (for a module) with:
        regshifts=<shift1>,<shift2>,...
        slave_addrs=<addr1>,<addr2>,...
        force_kipmid=<enable1>,<enable2>,...
+       kipmid_max_busy_us=<ustime1>,<ustime2>,...
        unload_when_empty=[0|1]
 
 Each of these except si_trydefaults is a list, the first item for the
@@ -433,6 +434,7 @@ kernel command line as:
        ipmi_si.regshifts=<shift1>,<shift2>,...
        ipmi_si.slave_addrs=<addr1>,<addr2>,...
        ipmi_si.force_kipmid=<enable1>,<enable2>,...
+       ipmi_si.kipmid_max_busy_us=<ustime1>,<ustime2>,...
 
 It works the same as the module parameters of the same names.
 
@@ -447,6 +449,16 @@ have high-res timers enabled in the kernel and you don't have
 interrupts enabled, the driver will run VERY slowly.  Don't blame me,
 these interfaces suck.
 
+Unfortunately, this thread can use a lot of CPU depending on the
+interface's performance.  This can waste a lot of CPU and cause
+various issues with detecting idle CPU and using extra power.  To
+avoid this, the kipmid_max_busy_us sets the maximum amount of time, in
+microseconds, that kipmid will spin before sleeping for a tick.  This
+value sets a balance between performance and CPU waste and needs to be
+tuned to your needs.  Maybe, someday, auto-tuning will be added, but
+that's not a simple thing and even the auto-tuning would need to be
+tuned to the user's desired performance.
+
 The driver supports a hot add and remove of interfaces.  This way,
 interfaces can be added or removed after the kernel is up and running.
 This is done using /sys/modules/ipmi_si/hotmod, which is a write-only
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index cd5d7e4..652cc42 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -240,6 +240,9 @@ struct smi_info
 static int force_kipmid[SI_MAX_PARMS];
 static int num_force_kipmid;
 
+static unsigned int kipmid_max_busy_us[SI_MAX_PARMS];
+static int num_max_busy_us;
+
 static int unload_when_empty = 1;
 
 static int try_smi_init(struct smi_info *smi);
@@ -868,21 +871,75 @@ static void set_run_to_completion(void *send_info, int i_run_to_completion)
 	spin_unlock_irqrestore(&(smi_info->si_lock), flags);
 }
 
+/*
+ * Use -1 in the nsec value of the busy waiting timespec to tell that
+ * we are spinning in kipmid looking for something and not delaying
+ * between checks
+ */
+static inline void ipmi_si_set_not_busy(struct timespec *ts)
+{
+	ts->tv_nsec = -1;
+}
+static inline int ipmi_si_is_busy(struct timespec *ts)
+{
+	return ts->tv_nsec != -1;
+}
+
+static int ipmi_thread_busy_wait(enum si_sm_result smi_result,
+				 const struct smi_info *smi_info,
+				 struct timespec *busy_until)
+{
+	unsigned int max_busy_us = 0;
+
+	if (smi_info->intf_num < num_max_busy_us)
+		max_busy_us = kipmid_max_busy_us[smi_info->intf_num];
+	if (max_busy_us == 0 || smi_result != SI_SM_CALL_WITH_DELAY)
+		ipmi_si_set_not_busy(busy_until);
+	else if (!ipmi_si_is_busy(busy_until)) {
+		getnstimeofday(busy_until);
+		timespec_add_ns(busy_until, max_busy_us*NSEC_PER_USEC);
+	} else {
+		struct timespec now;
+		getnstimeofday(&now);
+		if (unlikely(timespec_compare(&now, busy_until) > 0)) {
+			ipmi_si_set_not_busy(busy_until);
+			return 0;
+		}
+	}
+	return 1;
+}
+
+
+/*
+ * A busy-waiting loop for speeding up IPMI operation.
+ *
+ * Lousy hardware makes this hard.  This is only enabled for systems
+ * that are not BT and do not have interrupts.  It starts spinning
+ * when an operation is complete or until max_busy tells it to stop
+ * (if that is enabled).  See the paragraph on kipmid_max_busy_us in
+ * Documentation/IPMI.txt for details.
+ */
 static int ipmi_thread(void *data)
 {
 	struct smi_info *smi_info = data;
 	unsigned long flags;
 	enum si_sm_result smi_result;
+	struct timespec busy_until;
 
+	ipmi_si_set_not_busy(&busy_until);
 	set_user_nice(current, 19);
 	while (!kthread_should_stop()) {
+		int busy_wait;
+
 		spin_lock_irqsave(&(smi_info->si_lock), flags);
 		smi_result = smi_event_handler(smi_info, 0);
 		spin_unlock_irqrestore(&(smi_info->si_lock), flags);
+		busy_wait = ipmi_thread_busy_wait(smi_result, smi_info,
+						  &busy_until);
 		if (smi_result == SI_SM_CALL_WITHOUT_DELAY) {
 			/* do nothing */
 		}
-		else if (smi_result == SI_SM_CALL_WITH_DELAY)
+		else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
 			schedule();
 		else
 			schedule_timeout_interruptible(1);
@@ -1157,6 +1214,11 @@ module_param(unload_when_empty, int, 0);
 MODULE_PARM_DESC(unload_when_empty, "Unload the module if no interfaces are"
 		 " specified or found, default is 1.  Setting to 0"
 		 " is useful for hot add of devices using hotmod.");
+module_param_array(kipmid_max_busy_us, uint, &num_max_busy_us, 0644);
+MODULE_PARM_DESC(kipmid_max_busy_us,
+		 "Max time (in microseconds) to busy-wait for IPMI data before"
+		 " sleeping. 0 (default) means to wait forever. Set to 100-500"
+		 " if kipmid is using up a lot of CPU time.");
 
 
 static void std_irq_cleanup(struct smi_info *info)
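
Reviewer note (not part of the patch): the deadline logic in
ipmi_thread_busy_wait() can be tried outside the kernel with a small
userspace model. The sketch below is only an illustration under stated
assumptions: the names sm_result, busy_state and busy_wait are hypothetical
stand-ins, the "armed" flag plays the role of tv_nsec != -1, and the current
time is passed in as a plain nanosecond counter instead of being read with
getnstimeofday().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for enum si_sm_result; not the driver's type. */
enum sm_result { CALL_WITHOUT_DELAY, CALL_WITH_DELAY, SM_IDLE };

/* Hypothetical stand-in for the busy_until timespec. */
struct busy_state {
	bool armed;           /* true while a spin budget is running */
	uint64_t deadline_ns; /* time at which the budget expires */
};

/*
 * Returns 1 if the caller may keep spinning, 0 once the budget of
 * max_busy_us microseconds has been used up and it should sleep.
 */
int busy_wait(enum sm_result res, unsigned int max_busy_us,
	      uint64_t now_ns, struct busy_state *st)
{
	if (max_busy_us == 0 || res != CALL_WITH_DELAY) {
		/* Limit disabled, or not spinning: drop any budget. */
		st->armed = false;
	} else if (!st->armed) {
		/* First spin of a burst: start the budget. */
		st->armed = true;
		st->deadline_ns = now_ns + (uint64_t)max_busy_us * 1000;
	} else if (now_ns > st->deadline_ns) {
		/* Budget exhausted: disarm and tell the caller to sleep. */
		st->armed = false;
		return 0;
	}
	return 1;
}
```

With max_busy_us set to 500, a first call at t=1000ns arms a deadline of
501000ns; calls before that time return 1, the first call after it returns 0,
and the call after that starts a fresh budget. With max_busy_us set to 0 the
function always returns 1, which corresponds to the old spin-forever
behaviour.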