There is a race in ME hardware between data copy for host and interrupt
delivery. An interrupt can be delivered prior to whole data copied for the
host to read but rather then going trough the reset we just merely need to
wait for the next interrupt.
The bug is visible in read/write stress with multiple connections per client
This is a regression caused as a side effect of the commit:
commit 544f94601409653f07ae6e22d4a39e3a90dceead
mei: do not run reset flow from the interrupt thread
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Cc: stable <stable@vger.kernel.org> # 3.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1. Delete cb from list before freeing it
2. Fix missed break that leads to
switch case fall-through and BUG invocation.
Regression from:
commit 6bb948c9e500d24321c36c67c81daf8d1a7e561e
mei: get rid of ext_msg
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While running Documentation/watchdog/src/watchdog-simple.c
and quiting by Ctrl-C, fallowing error is displayed:
mei_me 0000:00:16.0: wd: stop failed to complete ret=-512.
The whatchdog core framework is not able to propagate
-ESYSRESTART or -EINTR. Also There is no much sense in
restarting the close system call so instead of using
wait_event_interruptible_timeout we can use wait_event_timeout
with reasonable 10 msecs timeout.
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add reduce credits to wd_send to remove code
repetition and simplify error handling
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1. Propagate ENOTTY to user space if the client is not present
in the system
2. Use ETIME consistently on timeouts
3. Return EIO on write failures
4. Return ENODEV on recoverable device failures such as resets
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since txe use doorbell and not circular buffer
we have to cheat in write slot counting, txe always consume all the
slots upon write. In order for it to work we need to track
slots using mei_hbuf_empty_slots() instead of tracking it in mei layer
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A client has to acquire host buffer
before writing, we add lock like wrapper
to replace the code snippet
if (dev->hbuf_is_ready)
dev->hbuf_is_ready = false;
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
PCI suspend resume callbacks should be defined
under CONFIG_PM_SLEEP
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This will eliminate compilation warning:
drivers/misc/mei/pci-me.c:303:12: warning: mei_me_pci_suspend defined but not used [-Wunused-function]
static int mei_me_pci_suspend(struct device *device)
drivers/misc/mei/pci-me.c:323:12: warning: mei_me_pci_resume defined but not used [-Wunused-function]
static int mei_me_pci_resume(struct device *device)
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We can use simply list_for_each_entry if there is no
entry removal
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We already have a helper to find me client by id, let's
use it in all relevant places.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Drop not-very-useful check and with this
fix read on index that can be after array end.
Cleanup search function as byproduct.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Connect wd and amthif through regular mei_cl_connect API
as there is no reason to connect in asynchronous mode.
Also use mei_cl_is_connected in order to protect flows
instead of depending on wd_pending and amthif_timer
Now we can remove all the special handling in hbm layer
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1. Return -ENOTTY on client connect if the requested client was not found
on the enumeration list or the client was internally disabled, in the later
case FW will return NOT_FOUND.
2. Return -EBUSY if the client cannot be connected because of resource
contention
3. Change response status enum to have MEI_CL_ prefix
4. Add function to translate response status to a string
for more readable logging
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When stopping the MEI, we should remove and potentially unregister
all bus devices queued on the mei_dev linked list.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use more standard message writing for
oob data.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This operation actually only support connection
and not a generic ioctl
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kconfig is not transitive so INTEL_ME_TXE has to depend
on WATCHDOG_CORE as well
ERROR: "watchdog_unregister_device" [drivers/misc/mei/mei.ko] undefined!
ERROR: "watchdog_register_device" [drivers/misc/mei/mei.ko] undefined!
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Export active connection state to debugfs
The information displayed is [me,host] id pair,
client connection state, and client's read and write states
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some rare case mei hw reset may take long time to settle.
Instead of blocking resume flow we span another driver reset flow in
separate work context
This allows as to shorten hw reset timeout to something more acceptable
by DPM_WATCHDOG_TIMEOUT
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
register txe hardware with pci bus
and add pci pm handlers
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hw-txe.c adds txe hw specific functionality
It implements hw specific interrupt handler, mei_hw_ops
functions and as well txe hw helpers
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This header file add mei_txe_hw structure
that hold txe hw specific state and other sw constructs.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This header file add register definitions
for TXE hardware found BayTrail platforms.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clear write callbacks sitting in write_waiting list on reset.
Otherwise these callbacks are left dangling and cause memory leak.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
give up reseting after 3 unsuccessful tries
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1. MEI_DEV_RESETTING device state spans only hardware reset flow
while starting dev state is saved into a local variable for further
reference, this let us to reduce big if statements in case we
are trying to avoid nested resets
2. During initializations if the reset ended in MEI_DEV_DISABLED device
state we bail out with -ENODEV
3. Remove redundant interrupts_enabled parameter as this
can be deduced from the starting dev_state
4. mei_reset propagates error code to the caller
5. Add mei_restart function to wrap the pci resume
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix syntax errors in comments and debug strings
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nfc_nfc_free unlink clients from the device list
and has to be called under mei mutex
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When reset is caused by hbm protocol mismatch or timeout
we might end up in an endless reset loop and hbm protocol
will never sync
Cc: <stable@vger.kernel.org>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a potential deadlock in case of a firmware
initiated reset
mei_reset has a dialog with the interrupt thread hence
it has to be run from an another work item
Most of the mei_resets were called from mei_hbm_dispatch
which is called in interrupt thread context so this
function underwent major revamp. The error code is
propagated to the interrupt thread and if needed
the reset is scheduled from there.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ME device is 64bit DMA capable
We assume both coherent and consistent memory to match
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set hbm header bit 30 for internal commands
This mark commands that are generated by
the device driver
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
And Lynx Point H Refresh and Wildcat Point LP
device ids.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
list_del_init appears twice in row in mei_cl_unlink, drop one.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cancel each work properly and remove flash_work_queue.
Quoting documentation:
In most situations flushing the entire workqueue is overkill; you merely
need to know that a particular work item isn't queued and isn't running.
In such cases you should use cancel_delayed_work_sync() or
cancel_work_sync() instead.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Propagate error codes from called functions, they are correct.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Driver better use dev_dbg, not pr_debug.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
no need to change error code value returned by
mei_me_cl_by_id, just propagate it on
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the unexpected state print to the beginning of mei_reset,
thus printing right state.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>