movs instruction will combine data to accelerate moving data,
however we need to concern two cases about it.
1. movs instruction need long lantency to startup,
so here we use general mov instruction to copy data.
2. movs instruction is not good for unaligned case,
even if src offset is 0x10, dest offset is 0x0,
we avoid and handle the case by general mov instruction.
Signed-off-by: Ma Ling <ling.ma@intel.com>
LKML-Reference: <1284664360-6138-1-git-send-email-ling.ma@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
memmove() allow source and destination address to be overlap, but
there is no such limitation for memcpy(). Therefore, explicitly
implement memmove() in both the forwards and backward directions, to
give us the ability to optimize memcpy().
Signed-off-by: Ma Ling <ling.ma@intel.com>
LKML-Reference: <C10D3FB0CD45994C8A51FEC1227CE22F0E483AD86A@shsmsx502.ccr.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>