Commit Graph

381575 Commits

Author SHA1 Message Date
Theodore Ts'o
cb53054118 ext4: fix up error handling for mpage_map_and_submit_extent()
The function mpage_released_unused_page() must only be called once;
otherwise the kernel will BUG() when the second call to
mpage_released_unused_page() tries to unlock the pages which had been
unlocked by the first call.

Also restructure the error handling so that we only give up on writing
the dirty pages in the case of ENOSPC where retrying the allocation
won't help.  Otherwise, a transient failure, such as a kmalloc()
failure in calling ext4_map_blocks() might cause us to give up on
those pages, leading to a scary message in /var/log/messages plus data
loss.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2013-07-01 08:12:40 -04:00
Theodore Ts'o
39c04153fd jbd2: fix theoretical race in jbd2__journal_restart
Once we decrement transaction->t_updates, if this is the last handle
holding the transaction from closing, and once we release the
t_handle_lock spinlock, it's possible for the transaction to commit
and be released.  In practice with normal kernels, this probably won't
happen, since the commit happens in a separate kernel thread and it's
unlikely this could all happen within the space of a few CPU cycles.

On the other hand, with a real-time kernel, this could potentially
happen, so save the tid found in transaction->t_tid before we release
t_handle_lock.  It would require an insane configuration, such as one
where the jbd2 thread was set to a very high real-time priority,
perhaps because a high priority real-time thread is trying to read or
write to a file system.  But some people who use real-time kernels
have been known to do insane things, including controlling
laser-wielding industrial robots.  :-)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-07-01 08:12:40 -04:00
Lukas Czerner
e1be3a928e ext4: only zero partial blocks in ext4_zero_partial_blocks()
Currently if we pass range into ext4_zero_partial_blocks() which covers
entire block we would attempt to zero it even though we should only zero
unaligned part of the block.

Fix this by checking whether the range covers the whole block skip
zeroing if so.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:39 -04:00
Theodore Ts'o
42c832debb ext4: check error return from ext4_write_inline_data_end()
The function ext4_write_inline_data_end() can return an error.  So we
need to assign it to a signed integer variable to check for an error
return (since copied is an unsigned int).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <wenqing.lz@taobao.com>
Cc: stable@vger.kernel.org
2013-07-01 08:12:39 -04:00
jon ernst
353eefd338 ext4: delete unnecessary C statements
Comparing unsigned variable with 0 always returns false.
err = 0 is duplicated and unnecessary.

[ tytso: Also cleaned up error handling in ext4_block_zero_page_range() ]

Signed-off-by: "Jon Ernst" <jonernst07@gmx.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:39 -04:00
Al Viro
64cb927371 ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
Both ext3 and ext4 htree_dirblock_to_tree() is just filling the
in-core rbtree for use by call_filldir().  All updates of ->f_pos are
done by the latter; bumping it here (on error) is obviously wrong - we
might very well have it nowhere near the block we'd found an error in.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-07-01 08:12:38 -04:00
Theodore Ts'o
fe52d17cdd jbd2: move superblock checksum calculation to jbd2_write_superblock()
Some of the functions which modify the jbd2 superblock were not
updating the checksum before calling jbd2_write_superblock().  Move
the call to jbd2_superblock_csum_set() to jbd2_write_superblock(), so
that the checksum is calculated consistently.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: stable@vger.kernel.org
2013-07-01 08:12:38 -04:00
Ashish Sangwan
aeb2817a4e ext4: pass inode pointer instead of file pointer to punch hole
No need to pass file pointer when we can directly pass inode pointer.

Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:38 -04:00
boxi liu
c4932dbe63 ext4: improve free space calculation for inline_data
In ext4 feature inline_data,it use the xattr's space to store the
inline data in inode.When we calculate the inline data as the xattr,we
add the pad.But in get_max_inline_xattr_value_size() function we count
the free space without pad.It cause some contents are moved to a block
even if it can be
stored in the inode.

Signed-off-by: liulei <lewis.liulei@huawei.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Tao Ma <boyu.mt@taobao.com>
2013-07-01 08:12:37 -04:00
Joe Perches
e7c96e8e47 ext4: reduce object size when !CONFIG_PRINTK
Reduce the object size ~10% could be useful for embedded systems.

Add #ifdef CONFIG_PRINTK #else #endif blocks to hold formats and
arguments, passing " " to functions when !CONFIG_PRINTK and still
verifying format and arguments with no_printk.

$ size fs/ext4/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 239375	    610	    888	 240873	  3ace9	fs/ext4/built-in.o.new
 264167	    738	    888	 265793	  40e41	fs/ext4/built-in.o.old

    $ grep -E "CONFIG_EXT4|CONFIG_PRINTK" .config
    # CONFIG_PRINTK is not set
    CONFIG_EXT4_FS=y
    CONFIG_EXT4_USE_FOR_EXT23=y
    CONFIG_EXT4_FS_POSIX_ACL=y
    # CONFIG_EXT4_FS_SECURITY is not set
    # CONFIG_EXT4_DEBUG is not set

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:37 -04:00
Zheng Liu
d3922a777f ext4: improve extent cache shrink mechanism to avoid to burn CPU time
Now we maintain an proper in-order LRU list in ext4 to reclaim entries
from extent status tree when we are under heavy memory pressure.  For
keeping this order, a spin lock is used to protect this list.  But this
lock burns a lot of CPU time.  We can use the following steps to trigger
it.

  % cd /dev/shm
  % dd if=/dev/zero of=ext4-img bs=1M count=2k
  % mkfs.ext4 ext4-img
  % mount -t ext4 -o loop ext4-img /mnt
  % cd /mnt
  % for ((i=0;i<160;i++)); do truncate -s 64g $i; done
  % for ((i=0;i<160;i++)); do cp $i /dev/null &; done
  % perf record -a -g
  % perf report

This commit tries to fix this problem.  Now a new member called
i_touch_when is added into ext4_inode_info to record the last access
time for an inode.  Meanwhile we never need to keep a proper in-order
LRU list.  So this can avoid to burns some CPU time.  When we try to
reclaim some entries from extent status tree, we use list_sort() to get
a proper in-order list.  Then we traverse this list to discard some
entries.  In ext4_sb_info, we use s_es_last_sorted to record the last
time of sorting this list.  When we traverse the list, we skip the inode
that is newer than this time, and move this inode to the tail of LRU
list.  When the head of the list is newer than s_es_last_sorted, we will
sort the LRU list again.

In this commit, we break the loop if s_extent_cache_cnt == 0 because
that means that all extents in extent status tree have been reclaimed.

Meanwhile in this commit, ext4_es_{un}register_shrinker()'s prototype is
changed to save a local variable in these functions.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:37 -04:00
Alexey Khoroshilov
2c00ef3ee3 ext4: implement error handling of ext4_mb_new_preallocation()
If memory allocation in ext4_mb_new_group_pa() is failed,
it returns error code, ext4_mb_new_preallocation() propages it,
but ext4_mb_new_blocks() ignores it.

An observed result was:

- allocation fail means ext4_mb_new_group_pa() does not update
  ext4_allocation_context;

- ext4_mb_new_blocks() sets ext4_allocation_request->len (ar->len =
  ac->ac_b_ex.fe_len;) to number of blocks preallocated (512) instead
  of number of blocks requested (1);

- that activates update cycle in ext4_splice_branch():
    for (i = 1; i < blks; i++) <-- blks is 512 instead of 1 here
      *(where->p + i) = cpu_to_le32(current_block++);

- it iterates 511 times and corrupts a chunk of memory including inode
  structure;

- page fault happens at EXT4_SB(inode->i_sb) in ext4_mark_inode_dirty();

- system hangs with 'scheduling while atomic' BUG.

The patch implements a check for ext4_mb_new_preallocation() error
code and handles its failure as if ext4_mb_regular_allocator() fails.

Found by Linux File System Verification project (linuxtesting.org).

[ Patch restructed by tytso to make the flow of control easier to follow. ]

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:36 -04:00
Maarten ter Huurne
6ca792edc1 ext4: fix corruption when online resizing a fs with 1K block size
Subtracting the number of the first data block places the superblock
backups one block too early, corrupting the file system. When the block
size is larger than 1K, the first data block is 0, so the subtraction
has no effect and no corruption occurs.

Signed-off-by: Maarten ter Huurne <maarten@treewalker.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
CC: stable@vger.kernel.org
2013-07-01 08:12:08 -04:00
Catalin Marinas
aa729dccb5 Merge branch 'for-next/hugepages' of git://git.linaro.org/people/stevecapper/linux into upstream-hugepages
* 'for-next/hugepages' of git://git.linaro.org/people/stevecapper/linux:
  ARM64: mm: THP support.
  ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
  ARM64: mm: HugeTLB support.
  ARM64: mm: Move PTE_PROT_NONE bit.
  ARM64: mm: Make PAGE_NONE pages read only and no-execute.
  ARM64: mm: Restore memblock limit when map_mem finished.
  mm: thp: Correct the HPAGE_PMD_ORDER check.
  x86: mm: Remove general hugetlb code from x86.
  mm: hugetlb: Copy general hugetlb code from x86 to mm.
  x86: mm: Remove x86 version of huge_pmd_share.
  mm: hugetlb: Copy huge_pmd_share from x86 to mm.

Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/pgtable-hwdef.h
	arch/arm64/include/asm/pgtable.h
2013-07-01 11:20:58 +01:00
Mark Brown
70083c4c8c Merge remote-tracking branch 'regulator/topic/tps62360' into regulator-next 2013-07-01 11:17:11 +01:00
Mark Brown
60908305fb Merge remote-tracking branch 'regulator/topic/of' into regulator-next 2013-07-01 11:17:10 +01:00
Mark Brown
5ee6c72880 Merge remote-tracking branch 'regulator/topic/max8973' into regulator-next 2013-07-01 11:17:10 +01:00
Mark Brown
7455c7f4ac Merge remote-tracking branch 'regulator/topic/lp872x' into regulator-next 2013-07-01 11:17:09 +01:00
Mark Brown
bc830f352e Merge remote-tracking branch 'regulator/topic/lp397x' into regulator-next 2013-07-01 11:17:09 +01:00
Mark Brown
0a192cc860 Merge remote-tracking branch 'regulator/topic/linar' into regulator-next 2013-07-01 11:17:08 +01:00
Mark Brown
39c9f80f43 Merge remote-tracking branch 'regulator/topic/isl6271a' into regulator-next 2013-07-01 11:17:08 +01:00
Mark Brown
bf7b8113e3 Merge remote-tracking branch 'regulator/topic/drvdata' into regulator-next 2013-07-01 11:17:07 +01:00
Mark Brown
71c2dca1ff Merge remote-tracking branch 'regulator/topic/delay' into regulator-next 2013-07-01 11:17:07 +01:00
Mark Brown
59aedb6df1 Merge remote-tracking branch 'regulator/topic/abb' into regulator-next 2013-07-01 11:17:06 +01:00
Mark Brown
c84130e700 Merge remote-tracking branch 'regulator/topic/ab8500' into regulator-next 2013-07-01 11:17:06 +01:00
Mark Brown
28120bf849 Merge remote-tracking branch 'regulator/fix/max77693' into regulator-linus 2013-07-01 11:17:05 +01:00
Axel Lin
8c7e7ddf31 regulator: max77693: Remove NULL test for rmatch[i].init_data
The implementation in of_regulator_match() already ensures match->init_data is
not NULL for all matched cases if the return value of of_regulator_match() > 0.

Thus remove NULL test for rmatch[i].init_data.

This patch also fixes the condition for loop iteration.
The for loop should iterate "matched" times rather than ARRAY_SIZE(regulators)
because we only allocate "matched" number of entries for rdata.
Though in most cases, "matched" == ARRAY_SIZE(regulators).

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
2013-07-01 11:16:15 +01:00
Axel Lin
f3ecdd7bf6 regulator: max77693: Fix trivial typo
Fix trivial typo in the equation to check upper bound of current setting.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
2013-07-01 11:15:38 +01:00
Ingo Molnar
2fd1b48788 Linux 3.10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
 Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
 Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
 bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
 =Ob6Z
 -----END PGP SIGNATURE-----

Merge tag 'v3.10' into sched/core

Merge in a recent upstream commit:

  c2853c8df5 include/linux/math64.h: add div64_ul()

because:

  72a4cf20cb sched: Change cfs_rq load avg to unsigned long

relies on it.

[ We don't rebase sched/core for this, because the handful of
  followup commits after the broken commit are not behavioral
  changes so are unlikely to be needed during bisection. ]

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-01 11:18:53 +02:00
Linus Torvalds
8bb495e3f0 Linux 3.10 2013-06-30 15:13:29 -07:00
Linus Torvalds
f0277dce1b Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull another powerpc fix from Benjamin Herrenschmidt:
 "I mentioned that while we had fixed the kernel crashes, EEH error
  recovery didn't always recover...  It appears that I had a fix for
  that already in powerpc-next (with a stable CC).

  I cherry-picked it today and did a few tests and it seems that things
  now work quite well.  The patch is also pretty simple, so I see no
  reason to wait before merging it."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/eeh: Fix fetching bus for single-dev-PE
2013-06-30 15:08:15 -07:00
Linus Torvalds
4b483802fd SCSI fixes on 20130626
This is a set of seven bug fixes.  Several fcoe fixes for locking problems,
 initiator issues and a VLAN API change, all of which could eventually lead to
 data corruption, one fix for a qla2xxx locking problem which could lead to
 multiple completions of the same request (and subsequent data corruption) and
 a use after free in the ipr driver.  Plus one minor MAINTAINERS file update
 
 Signed-off-by: James Bottomley <JBottomley@Parallels.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRy9qEAAoJEDeqqVYsXL0MIoUH/RkCyVWZqj2utTMXg0olPxm/
 GE2OC5X944mBrMeY0wSfEj58NQoo9cRiXGTxZLl+X0lRHbTU4CCGUAAEu/0o5N/M
 Gqtmk3Gn+iY819F0gYqs/En4IjLPsifonXaamHA1341NlNjDqb6IQdOQN8qjlfQT
 aDebRuzS/z5jFO8pviem+GD2FDVSCdkM24SeJLxqxoyuOR77W/3n6bjlVY1jkCoI
 lFA5k9OeqQRiEGqR+Da2nLWmCPt85R+qjNzxTqFmF2gh+Z/VW1hwAcjWz+/xSc4O
 V7d/ZN9Qhk31PY3oQ1Q1jJU0fW95bqOo6dZvGrLkOVc1FkaFgjMF7RQuA8rVxRI=
 =xWnQ
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of seven bug fixes.  Several fcoe fixes for locking
  problems, initiator issues and a VLAN API change, all of which could
  eventually lead to data corruption, one fix for a qla2xxx locking
  problem which could lead to multiple completions of the same request
  (and subsequent data corruption) and a use after free in the ipr
  driver.  Plus one minor MAINTAINERS file update"

(only six bugfixes in this pull, since I had already pulled the fcoe API
fix directly from Robert Love)

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  [SCSI] ipr: Avoid target_destroy accessing memory after it was freed
  [SCSI] qla2xxx: Fix for locking issue between driver ISR and mailbox routines
  MAINTAINERS: Fix fcoe mailing list
  libfc: extend ex_lock to protect all of fc_seq_send
  libfc: Correct check for initiator role
  libfcoe: Fix Conflicting FCFs issue in the fabric
2013-06-30 15:06:25 -07:00
Greg Kroah-Hartman
828c6a102b Revert "serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller"
This reverts commit 8d2f8cd424.

As reported by Stefan, this device already works with the parport_serial
driver, so the 8250_pci driver should not also try to grab it as well.

Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Cc: Wang YanQing <udknight@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-30 09:03:06 -07:00
Mark Brown
7bc8c4c37a Merge remote-tracking branch 'regmap/topic/field' into regmap-next 2013-06-30 12:40:03 +01:00
Mark Brown
ad4f496b44 Merge remote-tracking branch 'regmap/topic/debugfs' into regmap-next 2013-06-30 12:40:02 +01:00
Mark Brown
912af52f31 Merge remote-tracking branch 'regmap/topic/core' into regmap-next 2013-06-30 12:40:02 +01:00
Mark Brown
feff98f550 Merge remote-tracking branch 'regmap/topic/cache' into regmap-next 2013-06-30 12:40:01 +01:00
Axel Lin
539fde59eb pinctrl: st: Remove unnecessary use of of_match_ptr macro
This is a DT only driver and st_pctl_of_match is always compiled
in. Hence of_match_ptr is unnecessary.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
2013-06-30 12:39:33 +01:00
Gavin Shan
ea461abf61 powerpc/eeh: Fix fetching bus for single-dev-PE
While running Linux as guest on top of phyp, we possiblly have
PE that includes single PCI device. However, we didn't return
its PCI bus correctly and it leads to failure on recovery from
EEH errors for single-dev-PE. The patch fixes the issue.

Cc: <stable@vger.kernel.org> # v3.7+
Cc: Steve Best <sbest@us.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 14:08:34 +10:00
Alexander Graf
a3ff5fbc94 KVM: PPC: Ignore PIR writes
While technically it's legal to write to PIR and have the identifier changed,
we don't implement logic to do so because we simply expose vcpu_id to the guest.

So instead, let's ignore writes to PIR. This ensures that we don't inject faults
into the guest for something the guest is allowed to do. While at it, we cross
our fingers hoping that it also doesn't mind that we broke its PIR read values.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras
681562cd56 KVM: PPC: Book3S PR: Invalidate SLB entries properly
At present, if the guest creates a valid SLB (segment lookaside buffer)
entry with the slbmte instruction, then invalidates it with the slbie
instruction, then reads the entry with the slbmfee/slbmfev instructions,
the result of the slbmfee will have the valid bit set, even though the
entry is not actually considered valid by the host.  This is confusing,
if not worse.  This fixes it by zeroing out the orige and origv fields
of the SLB entry structure when the entry is invalidated.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras
0f296829b5 KVM: PPC: Book3S PR: Allow guest to use 1TB segments
With this, the guest can use 1TB segments as well as 256MB segments.
Since we now have the situation where a single emulated guest segment
could correspond to multiple shadow segments (as the shadow segments
are still 256MB segments), this adds a new kvmppc_mmu_flush_segment()
to scan for all shadow segments that need to be removed.

This restructures the guest HPT (hashed page table) lookup code to
use the correct hashing and matching functions for HPTEs within a
1TB segment.  We use the standard hpt_hash() function instead of
open-coding the hash calculation, and we use HPTE_V_COMPARE() with
an AVPN value that has the B (segment size) field included.  The
calculation of avpn is done a little earlier since it doesn't change
in the loop starting at the do_second label.

The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that
it returns a 256MB VSID even if the guest SLB entry is a 1TB entry.
This is because the users of this function are creating 256MB SLB
entries.  We set a new VSID_1T flag so that entries created from 1T
segments don't collide with entries from 256MB segments.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras
6ed1485f65 KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
The loop in kvmppc_mmu_book3s_64_xlate() that looks up a translation
in the guest hashed page table (HPT) keeps going if it finds an
HPTE that matches but doesn't allow access.  This is incorrect; it
is different from what the hardware does, and there should never be
more than one matching HPTE anyway.  This fixes it to stop when any
matching HPTE is found.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras
bc1bc4e392 KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
On entering a PR KVM guest, we invalidate the whole SLB before loading
up the guest entries.  We do this using an slbia instruction, which
invalidates all entries except entry 0, followed by an slbie to
invalidate entry 0.  However, the slbie turns out to be ineffective
in some circumstances (specifically when the host linear mapping uses
64k pages) because of errors in computing the parameter to the slbie.
The result is that the guest kernel hangs very early in boot because
it takes a DSI the first time it tries to access kernel data using
a linear mapping address in real mode.

Currently we construct bits 36 - 43 (big-endian numbering) of the slbie
parameter by taking bits 56 - 63 of the SLB VSID doubleword.  These bits
for the tlbie are C (class, 1 bit), B (segment size, 2 bits) and 5
reserved bits.  For the SLB VSID doubleword these are C (class, 1 bit),
reserved (1 bit), LP (large page size, 2 bits), and 4 reserved bits.
Thus we are not setting the B field correctly, and when LP = 01 as
it is for 64k pages, we are setting a reserved bit.

Rather than add more instructions to calculate the slbie parameter
correctly, this takes a simpler approach, which is to set entry 0 to
zeroes explicitly.  Normally slbmte should not be used to invalidate
an entry, since it doesn't invalidate the ERATs, but it is OK to use
it to invalidate an entry if it is immediately followed by slbia,
which does invalidate the ERATs.  (This has been confirmed with the
Power architects.)  This approach takes fewer instructions and will
work whatever the contents of entry 0.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:21 +02:00
Paul Mackerras
8ed7b7e9d2 KVM: PPC: Book3S PR: Fix proto-VSID calculations
This makes sure the calculation of the proto-VSIDs used by PR KVM
is done with 64-bit arithmetic.  Since vcpu3s->context_id[] is int,
when we do vcpu3s->context_id[0] << ESID_BITS the shift will be done
with 32-bit instructions, possibly leading to significant bits
getting lost, as the context id can be up to 524283 and ESID_BITS is
18.  To fix this we cast the context id to u64 before shifting.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:21 +02:00
Tiejun Chen
5f17ce8b95 KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
Availablity of the doorbell_exception function is guarded by
CONFIG_PPC_DOORBELL. Use the same define to guard our caller
of it.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[agraf: improve patch description]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:21 +02:00
Linus Torvalds
6c355beafd Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
 "We discovered some breakage in our "EEH" (PCI Error Handling) code
  while doing error injection, due to a couple of regressions.  One of
  them is due to a patch (37f02195be "powerpc/pci: fix PCI-e devices
  rescan issue on powerpc platform") that, in hindsight, I shouldn't
  have merged considering that it caused more problems than it solved.

  Please pull those two fixes.  One for a simple EEH address cache
  initialization issue.  The other one is a patch from Guenter that I
  had originally planned to put in 3.11 but which happens to also fix
  that other regression (a kernel oops during EEH error handling and
  possibly hotplug).

  With those two, the couple of test machines I've hammered with error
  injection are remaining up now.  EEH appears to still fail to recover
  on some devices, so there is another problem that Gavin is looking
  into but at least it's no longer crashing the kernel."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pci: Improve device hotplug initialization
  powerpc/eeh: Add eeh_dev to the cache during boot
2013-06-29 17:02:48 -07:00
Olof Johansson
8d5bc1a6ac ARM: dt: Only print warning, not WARN() on bad cpu map in device tree
Due to recent changes and expecations of proper cpu bindings, there are
now cases for many of the in-tree devicetrees where a WARN() will hit
on boot due to badly formatted /cpus nodes.

Downgrade this to a pr_warn() to be less alarmist, since it's not a
new problem.

Tested on Arndale, Cubox, Seaboard and Panda ES. Panda hits the WARN
without this, the others do not.

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-29 17:00:40 -07:00
Yijing Wang
038f0babc9 Gpio/trivial: replace numeric with standard PM state macros
Use standard PM state macros PCI_Dx instead of numeric 0/1/2..

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Cc: Michael Buesch <m@bues.ch>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2013-06-30 01:37:07 +02:00
Guenter Roeck
7846de406f powerpc/pci: Improve device hotplug initialization
Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc
platform) fixes a problem with interrupt and DMA initialization on hot
plugged devices. With this commit, interrupt and DMA initialization for
hot plugged devices is handled in the pci device enable function.

This approach has a couple of drawbacks. First, it creates two code paths
for device initialization, one for hot plugged devices and another for devices
known during the initial PCI scan. Second, the initialization code for hot
plugged devices is only called when the device is enabled, ie typically
in the probe function. Also, the platform specific setup code is called each
time pci_enable_device() is called, not only once during device discovery,
meaning it is actually called multiple times, once for devices discovered
during the initial scan and again each time a driver is re-loaded.

The visible result is that interrupt pins are only assigned to hot plugged
devices when the device driver is loaded. Effectively this changes the PCI
probe API, since pci_dev->irq and the device's dma configuration will now
only be valid after pci_enable() was called at least once. A more subtle
change is that platform specific PCI device setup is moved from device
discovery into the driver's probe function, more specifically into the
pci_enable_device() call.

To fix the inconsistencies, add new function pcibios_add_device.
Call pcibios_setup_device from pcibios_setup_bus_devices if device setup
is not complete, and from pcibios_add_device if bus setup is complete.

With this change, device setup code is moved back into device initialization,
and called exactly once for both static and hot plugged devices.

[ This also fixes a regression introduced by the above patch which
  causes dev->irq to be overwritten under some cirumstances after
  MSIs have been enabled for the device which leads to crashes due
  to the MSI core "hijacking" dev->irq to store the base MSI number
  and not the LSI. --BenH
]

Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 08:46:46 +10:00