Commit Graph

616515 Commits

Author SHA1 Message Date
wang di
4f76f0ec09 staging: lustre: llite: move dir cache to MDC layer
Move directory entries cache from llite to MDC, so client
side dir stripe will use independent hash function(in LMV),
which does not need to be tightly coupled with the backend
storage dir-entry hash function. With striped directory, it
will be 2-tier hash, LMV calculate hash value according to the
name and hash-type in layout, then each MDT will store these
entry in disk by its own hash.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3531
Reviewed-on: http://review.whamcloud.com/7043
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:09:27 +02:00
wang di
7ccb7c8f17 staging: lustre: lmv: implement lmv version of read_page
All the code needed to implement read_page. This
will eventually replace lmv_readpage.

Signed-off-by: wang di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/10761
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4906
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:09:27 +02:00
Oleg Drokin
3a77df1180 staging/lustre: Always return EEXIST on mkdir for existing names
if the name already exists, but we don't have write permissions
in the parent, force talking to the MDS to determine what
more sensical error code to return.
This also happens to fix matlab and other such programs that
assume that EEXIST is the only valid error code for mkdir of
an existing directory.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
4cae780e54 lustre: introduce lnet_copy_{k, }iov2iter(), kill lnet_copy_{k, }iov2{k, }iov()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
c1b7b8eb86 lustre: pass iov_iter to ->lnd_recv()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
03766dca62 lustre: constify lib-move.c stuff
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
a482284865 lustre: ->kss_scratch... are unused now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
1b4e992f50 staging: lustre: ksocknal_lib_send_kiov(): sendmsg doesn't bugger iovec...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
805560e8f1 staging: lustre: ksocknal_lib_send_iov(): sendmsg doesn't bugger iovec...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Al Viro
8040ddfb68 staging: lustre: ksocknal_lib_recv_iov(): recvmsg doesn't bugger iovec anymore...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 16:03:38 +02:00
Doug Oucharek
72e62b6c7c staging: lustre: lnet: Stop Infinite CON RACE Condition
In current code, when a CON RACE occurs, the passive side will
let the node with the higher NID value win the race.

We have a field case where a node can have a "stuck"
connection which never goes away and is the trigger of a
never-ending loop of re-connections.

This patch introduces a counter to how many times a
connection in a connecting state has been the cause of a CON RACE
rejection. After 20 times (constant MAX_CONN_RACES_BEFORE_ABORT),
we assume the connection is stuck and let the other side (with
lower NID) win.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7646
Reviewed-on: http://review.whamcloud.com/19430
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
Liang Zhen
ba5301428e staging: lustre: lnet: lock improvement for ko2iblnd
kiblnd_check_sends() takes conn::ibc_lock at the begin and release
this lock at the end, this is inefficient because most use-case
needs to explicitly release ibc_lock before caling this function.

This patches changes it to kiblnd_check_sends_locked() and avoid
unnecessary lock dances.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7099
Reviewed-on: http://review.whamcloud.com/20322
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
Alexander Boyko
d04c0943c1 staging: lustre: lnet: make connection more stable with packet loss
IB network may lose last connection handshake packet.
This problem isn't Lustre specific and described at
https://oss.oracle.com/pipermail/rds-devel/2007-December/000271.html
for example. Solution is to make conection established if any packet
is received for it.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Seagate-bug-id: MRP-2883
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8303
Reviewed-on: http://review.whamcloud.com/20874
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
Doug Oucharek
af96bab11b staging: lustre: lnet: Correct position of lnet_ni_decref()
In fix http://review.whamcloud.com/#/c/19614/, the call
to lnet_ni_decref() should have followed the routines
which are using the NI.  This patch correct that.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8022
Reviewed-on: http://review.whamcloud.com/21001
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
Doug Oucharek
e426f0d24e staging: lustre: lnet: Do not drop message when shutting down LNet
There is a case in lnet_parse() where we discover that LNet
is shutting down but we continue to use the NI when we drop the
message and end up calling ko2iblnd_check_send_locked() which tries to
allocate from the Tx pool which has been cleaned up already.
This triggers a NULL pointer dereference.

This fix just returns from lnet_parse() when we disover LNet is
shutting down.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8106
Reviewed-on: http://review.whamcloud.com/19993
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
wang di
bce1bbf4b8 staging: lustre: llite: set op_max_pages
Cache the maximum allowed pages supported by the llite
layer. This value will be used in the mdc and lmv layer.

Signed-off-by: wang di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/7043
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3531
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
wang di
049e215e0e staging: lustre: obd: implement md_read_page
This patch adds md_read_page which is a new more
flexiable api that will replace md_readpage.

Signed-off-by: wang di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/10761
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4906
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
Dmitry Eremin
a043a1020f staging: lustre: lmv: build error with gcc 4.7.0 20110509
Fixed comparison between signed and unsigned indexes.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3775
Reviewed-on: http://review.whamcloud.com/7382
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
John L. Hammond
8015e18087 staging: lustre: obd: validate open handle cookies
Add a const void *h_owner member to struct portals_handle. Add a const
void *owner parameter to class_handle2object() which must be matched
by the h_owner member of the handle in addition to the cookie.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3233
Reviewed-on: http://review.whamcloud.com/6938
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
James Simmons
6a42e615a2 staging: lustre: llite: remove assert for acl refcount
The purpose of this asssert to was to ensure lustre
was properly managing its posix_acl access. This test
is invalid due to the VFS layer also taking references
on the posix_acl. In reality their is no simple way to
detect this class of mistakes.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:39 +02:00
James Simmons
a5c954eff6 staging: lustre: include: fix one off errors in lustre_id.h
During inspection of another patch Dan Carpenter noticed some
one off errors in lustre_id.h. Fix the condition test for
OBIF_MAX_OID.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Hongchao Zhang
3147b26840 staging: lustre: osc: Automatically increase the max_dirty_mb
When RPC size or the max RPCs in flight is increased, the actual
limit might be max_dirty_mb. This patch automatically increases
the max_dirty_mb value at connection time and when the related
values are tuned manually by proc file system.

this patch also changes the unit of "cl_dirty" and "cl_dirty_max"
in client_obd from byte to page.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4933
Reviewed-on: http://review.whamcloud.com/10446
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Fan Yong
c8deb3cb5f staging: lustre: lmv: build master LMV EA dynamically build via readdir
When creating a striped directory, the master object saves the slave
objects (or shards) as internal sub-directories. The sub-directory's
name is composed of ${shard_FID}:${shard_idx}. With the name, we can
easily to know what the shard is and where it should be.

On the other hand, we need to store some information related with the
striped directory, such as magic, hash type, shards count, and so on.
That is the LMV EA (header). We do NOT store the FID of each shard in
the LMV EA. Instead, when we need the shards' FIDs (such as readdir()
on client-side), we can build the entrie LMV EA on the MDT (in RAM) by
iterating the sub-directory entries that are contained in the master
object of the striped directroy.

Above mechanism can simplify the striped directory create operation.
For very large striped directory, logging the FIDs array in the LMV
EA will be trouble. It also simplify the LFSCK for verifying striped
directory, because it reduces the inconsistency sources.

Another fixing is about the lmv_master_fid in master LMV EA header,
it is redundant information, and may become one of the inconsistency
sources. So replace it with two __u64 padding fields.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5223
Reviewed-on: http://review.whamcloud.com/10751
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Andriy Skulysh
36c6607c79 staging: lustre: ptlrpc: request gets stuck in UNREGISTERING phase
Exit condition from UNREGISTERING phase is releasing of
both reply and bulk buffers.

Call ptlrpc_unregister_bulk() if ptlrpc_unregister_reply()
wasn't completed in async mode before switching to
UNREGISTERING phase.

Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5259
Xyratex-bug-id: MRP-1960
Reviewed-on: http://review.whamcloud.com/10846
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
wang di
893ab74792 staging: lustre: lmv: try all stripes for unknown hash functions
For unknown hash type, LMV should try all stripes to locate
the name entry. But it will only for lookup and unlink, i.e.
we can only list and unlink entries under striped dir with
unknown hash type.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4921
Reviewed-on: http://review.whamcloud.com/10041
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Mikhail Pershin
b14b3ba577 staging: lustre: llog: keep llog ctxt indices constant
The llog context id table cannot be shrunk easily because that
will cause index shifting and incompatibility between old client
and new server and vice versa.

Patch moves llog_ctxt_id table to the lustre_idl.h because this is
wire protocol data, these values are added to the wirecheck.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5218
Reviewed-on: http://review.whamcloud.com/10758
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Lai Siyao
ac094f382f staging: lustre: ptlrpc: add OBD_CONNECT_UNLINK_CLOSE flag
Add OBD_CONNECT_UNLINK_CLOSE flag for interop, once this is supported,
client packs file handle in unlink RPC, and MDT will close file before
unlink.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4367
Reviewed-on: http://review.whamcloud.com/10426
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Lai Siyao
c1b66fccf9 staging: lustre: fid: do open-by-fid by default
Currently client open-by-fid often packs name into the request,
but the name may be invalid, eg. NFS export, and even if it's
valid, it may cause inconsistency because this operation is done
on this fid, which is globally unique, but name not.

Since open-by-fid doesn't pack name, for striped dir we can't know
parent stripe fid on client, so we set parent fid the same as
child fid, and MDT has to find its parent fid from linkea (this is
already supported by MDT).

M_CHECK_STALE becomes obsolete.

Unset MDS_OPEN_FL_INTERNAL from open syscall flags, because these
flags are internally used, and should not be set from user space.

It's not necessary to store parent fid in lli_pfid, because MDT
can get it's parent fid from linkea, and now that DNE stripe
directory stores master inode fid in lli_pfid, stop storing parent
fid to avoid conflict.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
Reviewed-on: http://review.whamcloud.com/7476
Reviewed-on: http://review.whamcloud.com/10692
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Brian Behlendorf
be191af9f5 staging: lustre: obd: limit lu_object cache
As the LU cache grows it can consume large enough chunks of
memory that ends up preventing buffers for other objects,
such as the OIs, from being cached and severely impacting
the performance for FID lookups. Limit the lu_object cache
to a maximum of lu_cache_nr objects.

NOTES:

* In order to be able to quickly determine the number of objects in
  the hash table the CFS_HASH_COUNTER flag is added.  This adds an
  atomic_inc/dec to the hash insert/remove paths but is not expected
  to have any measurable impact of performance.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5164
Reviewed-on: http://review.whamcloud.com/10237
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
James Simmons
b4e40299f6 staging: lustre: obdclass: compile issues with variable not being initialized
One of the versions of gcc I have refuses to build obd_mount.c due to
index not be initialized in function lmd_make_exclusion before it is
used.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
Reviewed-on: http://review.whamcloud.com/10705
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Emoly Liu
099d5adf1e staging: lustre: ldlm: improve ldlm_lock_create() return value
ldlm_lock_create() and ldlm_resource_get() always return NULL as
error reporting and "NULL" is interpretted as ENOMEM incorrectly
sometimes. This patch fixes this problem by using ERR_PTR() rather
than NULL.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4524
Reviewed-on: http://review.whamcloud.com/9004
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
Patrick Farrell
aa08b0e3c7 staging: lustre: fld: add fld description documentation
Add subsystem description from Di Wang to header file.

Signed-off-by: Patrick Farrell <paf@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5153
Reviewed-on: http://review.whamcloud.com/10631
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:38 +02:00
wang di
7aca4ae34c staging: lustre: mdc: always use D_INFO for debug info when mdc_put_rpc_lock fails
Also use D_INFO no matter what the error returned from
mdc_put_rpc_lock.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4973
Reviewed-on: http://review.whamcloud.com/10150
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
Jinshan Xiong
d806f30e63 staging: lustre: osc: revise unstable pages accounting
A few changes are made in this patch for unstable pages tracking:

1. Remove kernel NFS unstable pages tracking because it killed
   performance
2. Track unstable pages as part of LRU cache. Otherwise Lustre
   can use much more memory than max_cached_mb
3. Remove obd_unstable_pages tracking to avoid using global
   atomic counter
4. Make unstable pages track optional. Tracking unstable pages is
   turned off by default, and can be controlled by
   llite.*.unstable_stats.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4841
Reviewed-on: http://review.whamcloud.com/10003
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
Jinshan Xiong
96c53363d8 staging: lustre: clio: Reduce memory overhead of per-page allocation
A page in clio used to occupy 584 bytes, which will use size-1024
slab cache. This patch reduces the per-page overhead to 512 bytes
so it can use size-512 instead.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4793
Reviewed-on: http://review.whamcloud.com/10070
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
John L. Hammond
2e1b5b8b35 staging: lustre: mdt: add mbo_ prefix to members of struct mdt_body
Rename each member of struct mdt_body, adding the prefix mbo_.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675
Reviewed-on: http://review.whamcloud.com/10202
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
Hongchao Zhang
50ccd16ea2 staging: lustre: llite: set dir LOV xattr length variable
the LOV xattr of directory could be either lov_user_md_v1
(size is 32) or lov_user_md_v3 (size is 48), then the actual
size of the LOV xattr should be return.

Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5100
Reviewed-on: http://review.whamcloud.com/10453
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
Niu Yawei
f7aafa7cad staging: lustre: obd: rename lsr_padding to lsr_valid
Simple variable rename.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4345
Reviewed-on: http://review.whamcloud.com/10223
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
wang di
d07d4cb829 staging: lustre: llite: use the correct mode for striped directory
Create striped directory with correct mode, which should be
handling same as mkdir.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4929
Reviewed-on: http://review.whamcloud.com/10028
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
wang di
711942df95 staging: lustre: lmv: Match MDT where the FID locates first
With DNE every object can have two locks in different namespaces:
lookup lock in space of MDT storing direntry and update/open lock
in space of MDT storing inode. In lmv_find_cbdata/lmv_lock_lock,
it should try the MDT that the FID maps to first, since this can
be easily found, and only try others if that fails.

In the error handler of lmv_add_targets, it should check whether
ld_tgt_count is being increased before ld_tgt_count is being -1.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4098
Reviewed-on: http://review.whamcloud.com/8019
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
Fan Yong
865b734e8d staging: lustre: lov: new pattern flag for partially repaired file
When the layout LFSCK repairs orphan OST-object, if the parent
MDT-object was lost, then it will re-create the MDT-object and
regenerate the LOV EA and fill the target LOV EA slot with the
orphan information, and fill other slots with zero (LOV hole);
if related LOV EA slot is invalid or hole, then it will refill
the target LOV EA slot; if the target slot exceeds current LOV
EA tail, then extend the LOV EA, and fill the gaps as zero.

Some of the LOV EA holes may cannot be re-filled finally because
of lost some OST-objects. And even if they can be re-filled, but
there are still some possible race accessings from client before
the re-filling. If the client access the LOV EA with hole(s), it
may cause some strange behaviour, such as trigger LBUG()/LASSERT()
on the client.

So we will make the client to be aware of the LOV EA is incomplete.
We introduce a new LOV EA pattern flag LOV_PATTERN_F_HOLE for that:
any time when the LFSCK repairs the LOV EA with hole(s), the LOV EA
will be marked as LOV_PATTERN_F_HOLE; when all the holes in the LOV
EA are refilled, the LOV_PATTERN_F_HOLE will be dropped.

For a new client, it recongizes the pattern flag LOV_PATTERN_F_HOLE,
then it can permit/forbid some opertions on the file with LOV holes:

 1) Normal read/write the file with LOV EA hole is permitted, but the
    application will get EIO error when read data from the dummy slot
    or write data to the dummy slot.
 2) The users can dump the recovered data via some common read tools,
    such as "dd conv=sync,noerror".

 3) Append data to the file which has LOV EA hole will get EIO failure.

 4) Other operations will skip the LOV EA hole(s), and will not get
    failures, such as {s,g}etattr, {s,g}getxattr, stat, chown/chgrp,
    chmod, touch, unlink, and so on.

For an old client, since it will not recognize the new pattern flag
LOV_PATTERN_F_HOLE. So the LOV EA with hole will be dicarded with
failure, but it will not cause the client to be crashed.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4675
Reviewed-on: http://review.whamcloud.com/10042
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
wang di
a609c39370 staging: lustre: lmv: validate lock with correct stripe FID
In ll_lookup_it_finish, we need use the real parent(stripe)
FID to validate the parent UPDATE lock.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4925
Reviewed-on: http://review.whamcloud.com/10026
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
wang di
ef21b1fb83 staging: lustre: llite: a few fixes about readdir of striped dir.
Normally we know the value of op_mea1 when ll_readdir is called.
In the case of '.' or '..' op_mea1 is unknown so for that case
fetch the real parents FID.

Signed-off-by: wang di <di.wang@intel.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4603
Reviewed-on: http://review.whamcloud.com/9191
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Li Xi <pkuelelixi@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
John L. Hammond
00c0a6aea0 staging: lustre: uapi: reduce scope of lustre_idl.h
Move the definition of OBD_OCD_VERSION() and similar macros from
lustre_idl.h to lustre_ver.h. These macros are primarily used in
comparisons to LUSTRE_VERSION_CODE which is defined in lustre_ver.h
and so should be defined there as well. Move a few definitions
(related to FIDs, quota and striping) from lustre_idl.h to
lustre_user.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5065
Reviewed-on: http://review.whamcloud.com/10336
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:37 +02:00
Jinshan Xiong
d8c0b0a9e6 staging: lustre: llite: Change readdir BRW metrics
To simplify the code, change the metrics from bytes to pages.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5034
Reviewed-on: http://review.whamcloud.com/10275
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:36 +02:00
Jinshan Xiong
297e908f08 staging: lustre: llite: Fix the deadlock in balance_dirty_pages()
If the page is already dirtied in ll_write_end() and kernel tries
to call balance_dirty_pages() to write back dirty pages in the same
thread, this is deadlock case if the page is already held by clio.

This can also fix the issue of LU-4873.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4977
Reviewed-on: http://review.whamcloud.com/10149
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:36 +02:00
Christopher J. Morrone
d6099af20a staging: lustre: Remove static declaration in anonymous union
It is not permitted in C++ to have a static declaration inside
of an anonymous union. The g++ compiler will complaine with an
error like this:

 error: struct ost_id::<anonymous union>::ostid invalid; an
 anonymous union can only have non-static data members [-fpermissive]

This patch changes the code to use an unnamed struct in place of
"struct ostid" inside of the anonymous union. That name declaration
was completely unnecessary anyway, since it was not used anywhere else.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4987
Reviewed-on: http://review.whamcloud.com/10176
Reviewed-by: Robert Read <robert.read@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:36 +02:00
Gregoire Pichon
c948390f10 staging: lustre: llite: fix inconsistencies of root squash feature
Root squash exhibits inconsistent behaviour on a client when
enabled. If a file is not cached on the client, then root will get
a permission denied error when accessing the file. When
the file has recently been accessed by a regular user and is
still in cache, root will be able to access the file without error
because the permission check is only done by the client that
isn't aware of root squash.

While the only real security benefit from root squash is to deny
clients access to files owned by root itself, it also makes sense
to treat file access on the client in a consistent manner
regardless of whether the file is in cache or not.

This patch adds root squash settings to llite so that client is able
to apply root squashing when it is relevant.

Configuration of MDT root squash settings will automatically be
applied to llite config log as well.

Update cfs_str2num_check() routine by removing any modification
of the specified string parameter. Since string can come from ls_str
field of a lstr structure, this avoids inconsistent ls_len field.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1778
Reviewed-on: http://review.whamcloud.com/5700
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:36 +02:00
John L. Hammond
d097d67b7f staging: lustre: llite: validate names
In ll_prep_md_op_data() validate names according to the same formula
used in mdd_name_check(). Add mdc_pack_name() to validate the name
actually packed in the request.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4992
Reviewed-on: http://review.whamcloud.com/10198
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:36 +02:00
wang di
8f18c8a48b staging: lustre: lmv: separate master object with master stripe
Separate master stripe with master object, so
1. stripeEA only exists on master object.
2. sub-stripe object will be inserted into master object
as sub-directory, and it can get the master object by "..".

By this, it will remove those specilities for stripe0 in
LMV and LOD. And also simplify LFSCK, i.e. consistency check
would be easier.

When then master object becomes an orphan, we should
mark all of its sub-stripes as dead object as well,
otherwise client might still be able to create files
under these stripes.

A few fixes for striped directory layout lock:

 1. stripe 0 should be locked as EX, same as other stripes.
 2. Acquire the layout for directory, when it is being unliked.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4690
Reviewed-on: http://review.whamcloud.com/9511
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-21 15:57:36 +02:00