linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-04 09:06:55 +07:00

Author	SHA1	Message	Date
Jan Engelhardt	4b2cbd42be	netfilter: x_tables: rectify XT_FUNCTION_MAXNAMELEN usage There has been quite a confusion in userspace about XT_FUNCTION_MAXNAMELEN; because struct xt_entry_match used MAX-1, userspace would have to do an awkward MAX-2 for maximum length checking (due to '\0'). This patch adds a new define that matches the definition of XT_TABLE_MAXNAMELEN - being the size of the actual struct member, not one off. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-27 15:34:34 +02:00
Jesper Dangaard Brouer	af740b2c8f	netfilter: nf_conntrack: extend with extra stat counter I suspect an unfortunatly series of events occuring under a DDoS attack, in function __nf_conntrack_find() nf_contrack_core.c. Adding a stats counter to see if the search is restarted too often. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-23 12:34:56 +02:00
Bart De Schuymer	6c79bf0f24	netfilter: bridge-netfilter: fix refragmenting IP traffic encapsulated in PPPoE traffic The MTU for IP traffic encapsulated inside PPPoE traffic is smaller than the MTU of the Ethernet device (1500). Connection tracking gathers all IP packets and sometimes will refragment them in ip_fragment(). We then need to subtract the length of the encapsulating header from the mtu used in ip_fragment(). The check in br_nf_dev_queue_xmit() which determines if ip_fragment() has to be called is also updated for the PPPoE-encapsulated packets. nf_bridge_copy_header() is also updated to make sure the PPPoE data length field has the correct value. Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-20 16:22:01 +02:00
Patrick McHardy	6291055465	Merge branch 'master' of /repos/git/net-next-2.6 Conflicts: Documentation/feature-removal-schedule.txt net/ipv6/netfilter/ip6t_REJECT.c net/netfilter/xt_limit.c Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-20 16:02:01 +02:00
Patrick McHardy	22265a5c3c	netfilter: xt_TEE: resolve oif using netdevice notifiers Replace the runtime oif name resolving by netdevice notifier based resolving. When an oif is given, a netdevice notifier is registered to resolve the name on NETDEV_REGISTER or NETDEV_CHANGE and unresolve it again on NETDEV_UNREGISTER or NETDEV_CHANGE to a different name. Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-20 15:07:32 +02:00
Eric Dumazet	e36fa2f7e9	rps: cleanups struct softnet_data holds many queues, so consistent use "sd" name instead of "queue" is better. Adds a rps_ipi_queued() helper to cleanup enqueue_to_backlog() Adds a _and_irq_disable suffix to net_rps_action() name, as David suggested. incr_input_queue_head() becomes input_queue_head_incr() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 01:18:05 -07:00
Eric Dumazet	88751275b8	rps: shortcut net_rps_action() net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if RPS is not active. Tom Herbert used two bitmasks to hold information needed to send IPI, but a single LIFO list seems more appropriate. Move all RPS logic into net_rps_action() to cleanup net_rx_action() code (remove two ifdefs) Move rps_remote_softirq_cpus into softnet_data to share its first cache line, filling an existing hole. In a future patch, we could call net_rps_action() from process_backlog() to make sure we send IPI before handling this cpu backlog. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-19 13:20:34 -07:00
Jan Engelhardt	f3c5c1bfd4	netfilter: xtables: make ip_tables reentrant Currently, the table traverser stores return addresses in the ruleset itself (struct ip6t_entry->comefrom). This has a well-known drawback: the jumpstack is overwritten on reentry, making it necessary for targets to return absolute verdicts. Also, the ruleset (which might be heavy memory-wise) needs to be replicated for each CPU that can possibly invoke ip6t_do_table. This patch decouples the jumpstack from struct ip6t_entry and instead puts it into xt_table_info. Not being restricted by 'comefrom' anymore, we can set up a stack as needed. By default, there is room allocated for two entries into the traverser. arp_tables is not touched though, because there is just one/two modules and further patches seek to collapse the table traverser anyhow. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-19 16:05:10 +02:00
Jan Engelhardt	e281b19897	netfilter: xtables: inclusion of xt_TEE xt_TEE can be used to clone and reroute a packet. This can for example be used to copy traffic at a router for logging purposes to another dedicated machine. References: http://www.gossamer-threads.com/lists/iptables/devel/68781 Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-19 14:17:47 +02:00
Tom Herbert	fec5e652e5	rfs: Receive Flow Steering This patch implements receive flow steering (RFS). RFS steers received packets for layer 3 and 4 processing to the CPU where the application for the corresponding flow is running. RFS is an extension of Receive Packet Steering (RPS). The basic idea of RFS is that when an application calls recvmsg (or sendmsg) the application's running CPU is stored in a hash table that is indexed by the connection's rxhash which is stored in the socket structure. The rxhash is passed in skb's received on the connection from netif_receive_skb. For each received packet, the associated rxhash is used to look up the CPU in the hash table, if a valid CPU is set then the packet is steered to that CPU using the RPS mechanisms. The convolution of the simple approach is that it would potentially allow OOO packets. If threads are thrashing around CPUs or multiple threads are trying to read from the same sockets, a quickly changing CPU value in the hash table could cause rampant OOO packets-- we consider this a non-starter. To avoid OOO packets, this solution implements two types of hash tables: rps_sock_flow_table and rps_dev_flow_table. rps_sock_table is a global hash table. Each entry is just a CPU number and it is populated in recvmsg and sendmsg as described above. This table contains the "desired" CPUs for flows. rps_dev_flow_table is specific to each device queue. Each entry contains a CPU and a tail queue counter. The CPU is the "current" CPU for a matching flow. The tail queue counter holds the value of a tail queue counter for the associated CPU's backlog queue at the time of last enqueue for a flow matching the entry. Each backlog queue has a queue head counter which is incremented on dequeue, and so a queue tail counter is computed as queue head count + queue length. When a packet is enqueued on a backlog queue, the current value of the queue tail counter is saved in the hash entry of the rps_dev_flow_table. And now the trick: when selecting the CPU for RPS (get_rps_cpu) the rps_sock_flow table and the rps_dev_flow table for the RX queue are consulted. When the desired CPU for the flow (found in the rps_sock_flow table) does not match the current CPU (found in the rps_dev_flow table), the current CPU is changed to the desired CPU if one of the following is true: - The current CPU is unset (equal to RPS_NO_CPU) - Current CPU is offline - The current CPU's queue head counter >= queue tail counter in the rps_dev_flow table. This checks if the queue tail has advanced beyond the last packet that was enqueued using this table entry. This guarantees that all packets queued using this entry have been dequeued, thus preserving in order delivery. Making each queue have its own rps_dev_flow table has two advantages: 1) the tail queue counters will be written on each receive, so keeping the table local to interrupting CPU s good for locality. 2) this allows lockless access to the table-- the CPU number and queue tail counter need to be accessed together under mutual exclusion from netif_receive_skb, we assume that this is only called from device napi_poll which is non-reentrant. This patch implements RFS for TCP and connected UDP sockets. It should be usable for other flow oriented protocols. There are two configuration parameters for RFS. The "rps_flow_entries" kernel init parameter sets the number of entries in the rps_sock_flow_table, the per rxqueue sysfs entry "rps_flow_cnt" contains the number of entries in the rps_dev_flow table for the rxqueue. Both are rounded to power of two. The obvious benefit of RFS (over just RPS) is that it achieves CPU locality between the receive processing for a flow and the applications processing; this can result in increased performance (higher pps, lower latency). The benefits of RFS are dependent on cache hierarchy, application load, and other factors. On simple benchmarks, we don't necessarily see improvement and sometimes see degradation. However, for more complex benchmarks and for applications where cache pressure is much higher this technique seems to perform very well. Below are some benchmark results which show the potential benfit of this patch. The netperf test has 500 instances of netperf TCP_RR test with 1 byte req. and resp. The RPC test is an request/response test similar in structure to netperf RR test ith 100 threads on each host, but does more work in userspace that netperf. e1000e on 8 core Intel No RFS or RPS 104K tps at 30% CPU No RFS (best RPS config): 290K tps at 63% CPU RFS 303K tps at 61% CPU RPC test tps CPU% 50/90/99% usec latency Latency StdDev No RFS/RPS 103K 48% 757/900/3185 4472.35 RPS only: 174K 73% 415/993/2468 491.66 RFS 223K 73% 379/651/1382 315.61 Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-16 16:01:27 -07:00
Shan Wei	4e15ed4d93	net: replace ipfragok with skb->local_df As Herbert Xu said: we should be able to simply replace ipfragok with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function) has droped ipfragok and set local_df value properly. The patch kills the ipfragok parameter of .queue_xmit(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 23:36:37 -07:00
David S. Miller	3eb14b944f	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-04-15 14:31:06 -07:00
John W. Linville	5c01d56693	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ath/ath5k/phy.c drivers/net/wireless/wl12xx/wl1271_main.c	2010-04-15 16:21:34 -04:00
Bart De Schuymer	e179e6322a	netfilter: bridge-netfilter: Fix MAC header handling with IP DNAT - fix IP DNAT on vlan- or pppoe-encapsulated traffic: The functions neigh_hh_output() or dst->neighbour->output() overwrite the complete Ethernet header, although we only need the destination MAC address. For encapsulated packets, they ended up overwriting the encapsulating header. The new code copies the Ethernet source MAC address and protocol number before calling dst->neighbour->output(). The Ethernet source MAC and protocol number are copied back in place in br_nf_pre_routing_finish_bridge_slow(). This also makes the IP DNAT more transparent because in the old scheme the source MAC of the bridge was copied into the source address in the Ethernet header. We also let skb->protocol equal ETH_P_IP resp. ETH_P_IPV6 during the execution of the PF_INET resp. PF_INET6 hooks. - Speed up IP DNAT by calling neigh_hh_bridge() instead of neigh_hh_output(): if dst->hh is available, we already know the MAC address so we can just copy it. Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-15 12:26:39 +02:00
Bart De Schuymer	ea2d9b41bd	netfilter: bridge-netfilter: simplify IP DNAT Remove br_netfilter.c::br_nf_local_out(). The function br_nf_local_out() was needed because the PF_BRIDGE::LOCAL_OUT hook could be called when IP DNAT happens on to-be-bridged traffic. The new scheme eliminates this mess. Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-15 12:14:51 +02:00
Changli Gao	fd793d8905	net: CONFIG_SMP should be CONFIG_RPS Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 00:16:59 -07:00
Giuseppe CAVALLARO	e326e8503d	stmmac: new descriptor field for the driver's platform The new enh_desc is used for selecting the enhanced descriptors structure. There are several scenarios; some chips (mac10/100 or gmac) want to use the enhanced descriptors; others want the normal ones. For example, on ST platforms: MAC10/100 uses the normal desc structure and the GMAC uses the enhanced one. It can be useful to get this information from the platform. This could also be decided at run-time looking at the chip's ID number; but it could happen that chips with the same ID want to use different descriptor structure. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-14 04:49:51 -07:00
Patrick McHardy	f0ad0860d0	ipv4: ipmr: support multiple tables This patch adds support for multiple independant multicast routing instances, named "tables". Userspace multicast routing daemons can bind to a specific table instance by issuing a setsockopt call using a new option MRT_TABLE. The table number is stored in the raw socket data and affects all following ipmr setsockopt(), getsockopt() and ioctl() calls. By default, a single table (RT_TABLE_DEFAULT) is created with a default routing rule pointing to it. Newly created pimreg devices have the table number appended ("pimregX"), with the exception of devices created in the default table, which are named just "pimreg" for compatibility reasons. Packets are directed to a specific table instance using routing rules, similar to how regular routing rules work. Currently iif, oif and mark are supported as keys, source and destination addresses could be supported additionally. Example usage: - bind pimd/xorp/... to a specific table: uint32_t table = 123; setsockopt(fd, IPPROTO_IP, MRT_TABLE, &table, sizeof(table)); - create routing rules directing packets to the new table: # ip mrule add iif eth0 lookup 123 # ip mrule add oif eth0 lookup 123 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:34 -07:00
Patrick McHardy	0c12295a74	ipv4: ipmr: move mroute data into seperate structure Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:34 -07:00
Patrick McHardy	862465f2e7	ipv4: ipmr: convert struct mfc_cache to struct list_head Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:33 -07:00
Patrick McHardy	d658f8a0e6	ipv4: ipmr: remove net pointer from struct mfc_cache Now that cache entries in unres_queue don't need to be distinguished by their network namespace pointer anymore, we can remove it from struct mfc_cache add pass the namespace as function argument to the functions that need it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:33 -07:00
Patrick McHardy	e258beb22f	ipv4: ipmr: move unres_queue and timer to per-namespace data The unres_queue is currently shared between all namespaces. Following patches will additionally allow to create multiple multicast routing tables in each namespace. Having a single shared queue for all these users seems to excessive, move the queue and the cleanup timer to the per-namespace data to unshare it. As a side-effect, this fixes a bug in the seq file iteration functions: the first entry returned is always from the current namespace, entries returned after that may belong to any namespace. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:32 -07:00
Patrick McHardy	f74e49b561	ipv4: raw: move struct raw_sock and raw_sk() to include/net/raw.h A following patch will use struct raw_sock to store state for ipmr, so having the definitions in icmp.h doesn't fit very well anymore. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:31 -07:00
Patrick McHardy	0f87b1dd01	net: fib_rules: decouple address families from real address families Decouple the address family values used for fib_rules from the real address families in socket.h. This allows to use fib_rules for code that is not a real address family without increasing AF_MAX/NPROTO. Values up to 127 are reserved for real address families and map directly to the corresponding AF value, values starting from 128 are for other uses. rtnetlink is changed to invoke the AF_UNSPEC dumpit/doit handlers for these families. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:31 -07:00
Patrick McHardy	d8a566beaa	net: fib_rules: consolidate IPv4 and DECnet ->default_pref() functions. Both functions are equivalent, consolidate them since a following patch needs a third implementation for multicast routing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:30 -07:00
Jan Engelhardt	9c6eb28aca	netfilter: ipv6: add IPSKB_REROUTED exclusion to NF_HOOK/POSTROUTING invocation Similar to how IPv4's ip_output.c works, have ip6_output also check the IPSKB_REROUTED flag. It will be set from xt_TEE for cloned packets since Xtables can currently only deal with a single packet in flight at a time. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Acked-by: David S. Miller <davem@davemloft.net> [Patrick: changed to use an IP6SKB value instead of IPSKB] Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-13 15:32:16 +02:00
Alexey Dobriyan	9f93ff5be5	Restore __ALIGN_MASK() Fix lib/bitmap.c compile failure due to __ALIGN_KERNEL changes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-13 14:09:15 +02:00
Eric Dumazet	acbbc07145	net: uninline skb_bond_should_drop() skb_bond_should_drop() is too big to be inlined. This patch reduces kernel text size, and its compilation time as well (shrinking include/linux/netdevice.h) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 03:32:42 -07:00
Hans J. Koch	829e001543	Fix some #includes in CAN drivers (rebased for net-next-2.6) In the current implementation, CAN drivers need to #include <linux/can.h> _before_ they #include <linux/can/dev.h>, which is both ugly and unnecessary. Fix this by including <linux/can.h> in <linux/can/dev.h> and remove the #include <linux/can.h> lines from drivers. Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 03:32:42 -07:00
Alexander Duyck	cd58950a53	skbuff: remove unused dev_consume_skb macro definition dev_consume_skb and kfree_skb_clean have no users and in the case of kfree_skb_clean could cause potential build issues since I cannot find where it is defined. Based on the patch in which it was introduced it appears to have been a bit of leftover code from an earlier version of the patch in which kfree_skb_clean was dropped in favor of consume_skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 02:58:24 -07:00
Alexey Dobriyan	a79ff731a1	netfilter: xtables: make XT_ALIGN() usable in exported headers by exporting __ALIGN_KERNEL() XT_ALIGN() was rewritten through ALIGN() by commit `42107f5009` "netfilter: xtables: symmetric COMPAT_XT_ALIGN definition". ALIGN() is not exported in userspace headers, which created compile problem for tc(8) and will create problem for iptables(8). We can't export generic looking name ALIGN() but we can export less generic __ALIGN_KERNEL() (suggested by Ben Hutchings). Google knows nothing about __ALIGN_KERNEL(). COMPAT_XT_ALIGN() changed for symmetry. Reported-by: Andreas Henriksson <andreas@fatal.se> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-13 11:21:46 +02:00
Eric Dumazet	b6c6712a42	net: sk_dst_cache RCUification With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this work. sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock) This rwlock is readlocked for a very small amount of time, and dst entries are already freed after RCU grace period. This calls for RCU again :) This patch converts sk_dst_lock to a spinlock, and use RCU for readers. __sk_dst_get() is supposed to be called with rcu_read_lock() or if socket locked by user, so use appropriate rcu_dereference_check() condition (rcu_read_lock_held() \|\| sock_owned_by_user(sk)) This patch avoids two atomic ops per tx packet on UDP connected sockets, for example, and permits sk_dst_lock to be much less dirtied. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 01:41:33 -07:00
Richard Cochran	ed85b565b8	packet: support for TX time stamps on RAW sockets Enable the SO_TIMESTAMPING socket infrastructure for raw packet sockets. We introduce PACKET_TX_TIMESTAMP for the control message cmsg_type. Similar support for UDP and CAN sockets was added in commit `51f31cabe3` Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 01:30:48 -07:00
Herbert Xu	bb29624614	inet: Remove unused send_check length argument inet: Remove unused send_check length argument This patch removes the unused length argument from the send_check function in struct inet_connection_sock_af_ops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Yinghai <yinghai.lu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 15:29:09 -07:00
David S. Miller	871039f02f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/stmmac/stmmac_main.c drivers/net/wireless/wl12xx/wl1271_cmd.c drivers/net/wireless/wl12xx/wl1271_main.c drivers/net/wireless/wl12xx/wl1271_spi.c net/core/ethtool.c net/mac80211/scan.c	2010-04-11 14:53:53 -07:00
David S. Miller	4a1032faac	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	2010-04-11 02:44:30 -07:00
Linus Torvalds	2f4084209a	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits) cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch loop: Update mtime when writing using aops block: expose the statistics in blkio.time and blkio.sectors for the root cgroup backing-dev: Handle class_create() failure Block: Fix block/elevator.c elevator_get() off-by-one error drbd: lc_element_by_index() never returns NULL cciss: unlock on error path cfq-iosched: Do not merge queues of BE and IDLE classes cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging i2o: Remove the dangerous kobj_to_i2o_device macro block: remove 16 bytes of padding from struct request on 64bits cfq-iosched: fix a kbuild regression block: make CONFIG_BLK_CGROUP visible Remove GENHD_FL_DRIVERFS block: Export max number of segments and max segment size in sysfs block: Finalize conversion of block limits functions block: Fix overrun in lcm() and move it to lib vfs: improve writeback_inodes_wb() paride: fix off-by-one test drbd: fix al-to-on-disk-bitmap for 4k logical_block_size ...	2010-04-09 11:50:29 -07:00
David Howells	ce82653d6c	radix_tree_tag_get() is not as safe as the docs make out [ver #2 ] radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set() or radix_tree_tag_clear(). The problem is that the double tag_get() in radix_tree_tag_get(): if (!tag_get(node, tag, offset)) saw_unset_tag = 1; if (height == 1) { int ret = tag_get(node, tag, offset); may see the value change due to the action of set/clear. RCU is no protection against this as no pointers are being changed, no nodes are being replaced according to a COW protocol - set/clear alter the node directly. The documentation in linux/radix-tree.h, however, says that radix_tree_tag_get() is an exception to the rule that "any function modifying the tree or tags (...) must exclude other modifications, and exclude any functions reading the tree". The problem is that the next statement in radix_tree_tag_get() checks that the tag doesn't vary over time: BUG_ON(ret && saw_unset_tag); This has been seen happening in FS-Cache: https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various comments that the value of the tag may change whilst the RCU read lock is held, and thus that the return value of radix_tree_tag_get() may not be relied upon unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from running concurrently with it. Reported-by: Romain DEGEZ <romain.degez@smartjog.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-09 10:12:03 -07:00
Pekka Enberg	fc1c183353	slab: Generify kernel pointer validation As suggested by Linus, introduce a kern_ptr_validate() helper that does some sanity checks to make sure a pointer is a valid kernel pointer. This is a preparational step for fixing SLUB kmem_ptr_validate(). Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-09 10:09:50 -07:00
Javier Cardona	97ad9139fd	mac80211: Moved mesh action codes to a more visible location Grouped mesh action codes together with the other action codes in ieee80211.h. Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-08 15:24:07 -04:00
Timo Teräs	e4077e018b	xfrm: Fix crashes in xfrm_lookup() From: Timo Teräs <timo.teras@iki.fi> Happens because CONFIG_XFRM_SUB_POLICY is not enabled, and one of the helper functions I used did unexpected things in that case. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-08 11:27:42 -07:00
John W. Linville	0f2df9eac7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into merge Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ath/ath5k/phy.c drivers/net/wireless/iwlwifi/iwl-4965.c drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl-core.c drivers/net/wireless/iwlwifi/iwl-core.h drivers/net/wireless/iwlwifi/iwl-tx.c	2010-04-08 13:34:54 -04:00
Mark Lord	45c4d015a9	libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2) Most drives from Seagate, Hitachi, and possibly other brands, do not allow LBA28 access to sector number 0x0fffffff (2^28 - 1). So instead use LBA48 for such accesses. This bug could bite a lot of systems, especially when the user has taken care to align partitions to 4KB boundaries. On misaligned systems, it is less likely to be encountered, since a 4KB read would end at 0x10000000 rather than at 0x0fffffff. Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2010-04-08 12:53:57 -04:00
Linus Torvalds	cf90bfe2eb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6: ide: Fix IDE taskfile with cfq scheduler ide: Must hold queue lock when requeueing ide: Requeue request after DMA timeout	2010-04-08 07:45:36 -07:00
chavey	97f8aefbbf	net: fix ethtool coding style errors and warnings Fix coding style errors and warnings output while running checkpatch.pl on the files net/core/ethtool.c and include/linux/ethtool.h Signed-off-by: chavey <chavey@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 21:54:42 -07:00
John Hughes	f5eb917b86	x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet. Here is a patch to stop X.25 examining fields beyond the end of the packet. For example, when a simple CALL ACCEPTED was received: 10 10 0f x25_parse_facilities was attempting to decode the FACILITIES field, but this packet contains no facilities field. Signed-off-by: John Hughes <john@calva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 21:29:25 -07:00
Michael S. Tsirkin	b7a413015d	virtio: disable multiport console support. Move MULTIPORT feature and related config changes out of exported headers, and disable the feature at runtime. At this point, it seems less risky to keep code around until we can enable it than rip it out completely. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-04-08 09:46:19 +09:30
Pavel Roskin	18e225f257	net: fix definition of netdev_for_each_mc_addr() The first argument should be called ha, not mclist. All callers use the name "ha", but if they used a different name, there would be a compile error. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 16:40:09 -07:00
Johannes Berg	098a607091	mac80211: clean up/fix aggregation code The aggregation code has a number of quirks, like inventing an unneeded WLAN_BACK_TIMER value and leaking memory under certain circumstances during station destruction. Fix these issues by using the regular aggregation session teardown code and blocking new aggregation sessions, all before the station is really destructed. As a side effect, this gets rid of the long code block to destroy aggregation safely. Additionally, rename tid_state_rx which can only have the values IDLE and OPERATIONAL to tid_active_rx to make it easier to understand that there is no bitwise stuff going on on the RX side -- the TX side remains because it needs to keep track of the driver and peer states. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:05 -04:00
Jouni Malinen	d5cdfacb35	cfg80211: Add local-state-change-only auth/deauth/disassoc cfg80211 is quite strict on allowing authentication and association commands only in certain states. In order to meet these requirements, user space applications may need to clear authentication or association state in some cases. Currently, this can be done with deauth/disassoc command, but that ends up sending out Deauthentication or Disassociation frame unnecessarily. Add a new nl80211 attribute to allow this sending of the frame be skipped, but with all other deauth/disassoc operations being completed. Similar state change is also needed for IEEE 802.11r FT protocol in the FT-over-DS case which does not use Authentication frame exchange in a transition to another BSS. For this to work with cfg80211, an authentication entry needs to be created for the target BSS without sending out an Authentication frame. The nl80211 authentication command can be used for this purpose, too, with the new attribute to indicate that the command is only for changing local state. This enables wpa_supplicant to complete FT-over-DS transition successfully. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:37:56 -04:00

1 2 3 4 5 ...

36562 Commits