linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 18:32:14 +07:00

Author	SHA1	Message	Date
Li RongQing	8f84985fec	net: unify the pcpu_tstats and br_cpu_netstats as one They are same, so unify them as one, pcpu_sw_netstats. Define pcpu_sw_netstat in netdevice.h, remove pcpu_tstats from if_tunnel and remove br_cpu_netstats from br_private.h Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-04 20:10:24 -05:00
stephen hemminger	5e419e68a6	llc: make lock static The llc_sap_list_lock does not need to be global, only acquired in core. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-03 20:56:48 -05:00
stephen hemminger	8f09898bf0	socket: cleanups Namespace related cleaning * make cred_to_ucred static * remove unused sock_rmalloc function Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-03 20:55:58 -05:00
Tom Herbert	9a4aa9af44	ipv4: Use percpu Cache route in IP tunnels percpu route cache eliminates share of dst refcnt between CPUs. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-03 19:40:57 -05:00
Tom Herbert	7d442fab0a	ipv4: Cache dst in tunnels Avoid doing a route lookup on every packet being tunneled. In ip_tunnel.c cache the route returned from ip_route_output if the tunnel is "connected" so that all the rouitng parameters are taken from tunnel parms for a packet. Specifically, not NBMA tunnel and tos is from tunnel parms (not inner packet). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-03 19:38:45 -05:00
Neil Horman	f916ec9608	sctp: Add process name and pid to deprecation warnings Recently I updated the sctp socket option deprecation warnings to be both a bit more clear and ratelimited to prevent user processes from spamming the log file. Ben Hutchings suggested that I add the process name and pid to these warnings so that users can tell who is responsible for using the deprecated apis. This patch accomplishes that. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Ben Hutchings <bhutchings@solarflare.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-03 19:36:46 -05:00
Pablo Neira Ayuso	c9c8e48597	netfilter: nf_tables: dump sets in all existing families This patch allows you to dump all sets available in all of the registered families. This allows you to use NFPROTO_UNSPEC to dump all existing sets, similarly to other existing table, chain and rule operations. This patch is based on original patch from Arturo Borrero González. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-04 00:23:11 +01:00
Daniel Borkmann	82a37132f3	netfilter: x_tables: lightweight process control group matching It would be useful e.g. in a server or desktop environment to have a facility in the notion of fine-grained "per application" or "per application group" firewall policies. Probably, users in the mobile, embedded area (e.g. Android based) with different security policy requirements for application groups could have great benefit from that as well. For example, with a little bit of configuration effort, an admin could whitelist well-known applications, and thus block otherwise unwanted "hard-to-track" applications like [1] from a user's machine. Blocking is just one example, but it is not limited to that, meaning we can have much different scenarios/policies that netfilter allows us than just blocking, e.g. fine grained settings where applications are allowed to connect/send traffic to, application traffic marking/conntracking, application-specific packet mangling, and so on. Implementation of PID-based matching would not be appropriate as they frequently change, and child tracking would make that even more complex and ugly. Cgroups would be a perfect candidate for accomplishing that as they associate a set of tasks with a set of parameters for one or more subsystems, in our case the netfilter subsystem, which, of course, can be combined with other cgroup subsystems into something more complex if needed. As mentioned, to overcome this constraint, such processes could be placed into one or multiple cgroups where different fine-grained rules can be defined depending on the application scenario, while e.g. everything else that is not part of that could be dropped (or vice versa), thus making life harder for unwanted processes to communicate to the outside world. So, we make use of cgroups here to track jobs and limit their resources in terms of iptables policies; in other words, limiting, tracking, etc what they are allowed to communicate. In our case we're working on outgoing traffic based on which local socket that originated from. Also, one doesn't even need to have an a-prio knowledge of the application internals regarding their particular use of ports or protocols. Matching is extremly lightweight as we just test for the sk_classid marker of sockets, originating from net_cls. net_cls and netfilter do not contradict each other; in fact, each construct can live as standalone or they can be used in combination with each other, which is perfectly fine, plus it serves Tejun's requirement to not introduce a new cgroups subsystem. Through this, we result in a very minimal and efficient module, and don't add anything except netfilter code. One possible, minimal usage example (many other iptables options can be applied obviously): 1) Configuring cgroups if not already done, e.g.: mkdir /sys/fs/cgroup/net_cls mount -t cgroup -o net_cls net_cls /sys/fs/cgroup/net_cls mkdir /sys/fs/cgroup/net_cls/0 echo 1 > /sys/fs/cgroup/net_cls/0/net_cls.classid (resp. a real flow handle id for tc) 2) Configuring netfilter (iptables-nftables), e.g.: iptables -A OUTPUT -m cgroup ! --cgroup 1 -j DROP 3) Running applications, e.g.: ping 208.67.222.222 <pid:1799> echo 1799 > /sys/fs/cgroup/net_cls/0/tasks 64 bytes from 208.67.222.222: icmp_seq=44 ttl=49 time=11.9 ms [...] ping 208.67.220.220 <pid:1804> ping: sendmsg: Operation not permitted [...] echo 1804 > /sys/fs/cgroup/net_cls/0/tasks 64 bytes from 208.67.220.220: icmp_seq=89 ttl=56 time=19.0 ms [...] Of course, real-world deployments would make use of cgroups user space toolsuite, or own custom policy daemons dynamically moving applications from/to various cgroups. [1] http://www.blackhat.com/presentations/bh-europe-06/bh-eu-06-biondi/bh-eu-06-biondi-up.pdf Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: cgroups@vger.kernel.org Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:44 +01:00
Daniel Borkmann	86f8515f97	net: netprio: rename config to be more consistent with cgroup configs While we're at it and introduced CGROUP_NET_CLASSID, lets also make NETPRIO_CGROUP more consistent with the rest of cgroups and rename it into CONFIG_CGROUP_NET_PRIO so that for networking, we now have CONFIG_CGROUP_NET_{PRIO,CLASSID}. This not only makes the CONFIG option consistent among networking cgroups, but also among cgroups CONFIG conventions in general as the vast majority has a prefix of CONFIG_CGROUP_<SUBSYS>. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Zefan Li <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:42 +01:00
Daniel Borkmann	fe1217c4f3	net: net_cls: move cgroupfs classid handling into core Zefan Li requested [1] to perform the following cleanup/refactoring: - Split cgroupfs classid handling into net core to better express a possible more generic use. - Disable module support for cgroupfs bits as the majority of other cgroupfs subsystems do not have that, and seems to be not wished from cgroup side. Zefan probably might want to follow-up for netprio later on. - By this, code can be further reduced which previously took care of functionality built when compiled as module. cgroupfs bits are being placed under net/core/netclassid_cgroup.c, so that we are consistent with {netclassid,netprio}_cgroup naming that is under net/core/ as suggested by Zefan. No change in functionality, but only code refactoring that is being done here. [1] http://patchwork.ozlabs.org/patch/304825/ Suggested-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Zefan Li <lizefan@huawei.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: cgroups@vger.kernel.org Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:41 +01:00
Eric Leblond	14abfa161d	netfilter: xt_CT: fix error value in xt_ct_tg_check() If setting event mask fails then we were returning 0 for success. This patch updates return code to -EINVAL in case of problem. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:39 +01:00
stephen hemminger	dcd93ed4cd	netfilter: nf_conntrack: remove dead code The following code is not used in current upstream code. Some of this seems to be old hooks, other might be used by some out of tree module (which I don't care about breaking), and the need_ipv4_conntrack was used by old NAT code but no longer called. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:37 +01:00
stephen hemminger	02eca9d2cc	netfilter: ipset: remove unused code Function never used in current upstream code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:35 +01:00
Daniel Borkmann	34ce324019	netfilter: nf_nat: add full port randomization support We currently use prandom_u32() for allocation of ports in tcp bind(0) and udp code. In case of plain SNAT we try to keep the ports as is or increment on collision. SNAT --random mode does use per-destination incrementing port allocation. As a recent paper pointed out in [1] that this mode of port allocation makes it possible to an attacker to find the randomly allocated ports through a timing side-channel in a socket overloading attack conducted through an off-path attacker. So, NF_NAT_RANGE_PROTO_RANDOM actually weakens the port randomization in regard to the attack described in this paper. As we need to keep compatibility, add another flag called NF_NAT_RANGE_PROTO_RANDOM_FULLY that would replace the NF_NAT_RANGE_PROTO_RANDOM hash-based port selection algorithm with a simple prandom_u32() in order to mitigate this attack vector. Note that the lfsr113's internal state is periodically reseeded by the kernel through a local secure entropy source. More details can be found in [1], the basic idea is to send bursts of packets to a socket to overflow its receive queue and measure the latency to detect a possible retransmit when the port is found. Because of increasing ports to given destination and port, further allocations can be predicted. This information could then be used by an attacker for e.g. for cache-poisoning, NS pinning, and degradation of service attacks against DNS servers [1]: The best defense against the poisoning attacks is to properly deploy and validate DNSSEC; DNSSEC provides security not only against off-path attacker but even against MitM attacker. We hope that our results will help motivate administrators to adopt DNSSEC. However, full DNSSEC deployment make take significant time, and until that happens, we recommend short-term, non-cryptographic defenses. We recommend to support full port randomisation, according to practices recommended in [2], and to avoid per-destination sequential port allocation, which we show may be vulnerable to derandomisation attacks. Joint work between Hannes Frederic Sowa and Daniel Borkmann. [1] https://sites.google.com/site/hayashulman/files/NIC-derandomisation.pdf [2] http://arxiv.org/pdf/1205.5190v1.pdf Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:26 +01:00
Michal Nazarewicz	720e0dfa3a	netfilter: nf_tables: remove unused variable in nf_tables_dump_set() The nfmsg variable is not used (except in sizeof operator which does not care about its value) between the first and second time it is assigned the value. Furthermore, nlmsg_data has no side effects, so the assignment can be safely removed. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 22:51:14 +01:00
Daniel Borkmann	1466291790	netfilter: nf_tables: fix type in parsing in nf_tables_set_alloc_name() In nf_tables_set_alloc_name(), we are trying to find a new, unused name for our new set and interate through the list of present sets. As far as I can see, we're using format string %d to parse already present names in order to mark their presence in a bitmap, so that we can later on find the first 0 in that map to assign the new set name to. We should rather use a temporary variable of type int to store the result of sscanf() to, and for making sanity checks on. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 22:50:59 +01:00
David S. Miller	aca5f58f9b	netpoll: Fix missing TXQ unlock and and OOPS. The VLAN tag handling code in netpoll_send_skb_on_dev() has two problems. 1) It exits without unlocking the TXQ. 2) It then tries to queue a NULL skb to npinfo->txq. Reported-by: Ahmed Tamrawi <atamrawi@iastate.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:50:52 -05:00
Li RongQing	469bdcefdc	ipv6: fix the use of pcpu_tstats in ip6_vti.c when read/write the 64bit data, the correct lock should be hold. and we can use the generic vti6_get_stats to return stats, and not define a new one in ip6_vti.c Fixes: `87b6d218f3` ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:37:21 -05:00
Li RongQing	abb6013cca	ipv6: fix the use of pcpu_tstats in ip6_tunnel when read/write the 64bit data, the correct lock should be hold. Fixes: `87b6d218f3` ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:37:21 -05:00
Yasushi Asano	fad8da3e08	ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity Fixed a problem with setting the lifetime of an IPv6 address. When setting preferred_lft to a value not zero or infinity, while valid_lft is infinity(0xffffffff) preferred lifetime is set to forever and does not update. Therefore preferred lifetime never becomes deprecated. valid lifetime and preferred lifetime should be set independently, even if valid lifetime is infinity, preferred lifetime must expire correctly (meaning it must eventually become deprecated) Signed-off-by: Yasushi Asano <yasushi.asano@jp.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:34:40 -05:00
Daniel Borkmann	4d231b76ee	net: llc: fix use after free in llc_ui_recvmsg While commit `30a584d944` fixes datagram interface in LLC, a use after free bug has been introduced for SOCK_STREAM sockets that do not make use of MSG_PEEK. The flow is as follow ... if (!(flags & MSG_PEEK)) { ... sk_eat_skb(sk, skb, false); ... } ... if (used + offset < skb->len) continue; ... where sk_eat_skb() calls __kfree_skb(). Therefore, cache original length and work on skb_len to check partial reads. Fixes: `30a584d944` ("[LLX]: SOCK_DGRAM interface fixes") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:31:09 -05:00
Wei-Chun Chao	7a7ffbabf9	ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC VM to VM GSO traffic is broken if it goes through VXLAN or GRE tunnel and the physical NIC on the host supports hardware VXLAN/GRE GSO offload (e.g. bnx2x and next-gen mlx4). Two issues - (VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with SKB_GSO_TCP/UDP set depending on the inner protocol. GSO header integrity check fails in udp4_ufo_fragment if inner protocol is TCP. Also gso_segs is calculated incorrectly using skb->len that includes tunnel header. Fix: robust check should only be applied to the inner packet. (VXLAN & GRE) Once GSO header integrity check passes, NULL segs is returned and the original skb is sent to hardware. However the tunnel header is already pulled. Fix: tunnel header needs to be restored so that hardware can perform GSO properly on the original packet. Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:06:47 -05:00
Cong Wang	c1ddf295f5	net: revert "sched classifier: make cgroup table local" This reverts commit `de6fb288b1`. Otherwise we got: net/sched/cls_cgroup.c:106:29: error: static declaration of ‘net_cls_subsys’ follows non-static declaration static struct cgroup_subsys net_cls_subsys = { ^ In file included from include/linux/cgroup.h:654:0, from net/sched/cls_cgroup.c:18: include/linux/cgroup_subsys.h:35:29: note: previous declaration of ‘net_cls_subsys’ was here SUBSYS(net_cls) ^ make[2]: *** [net/sched/cls_cgroup.o] Error 1 Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:02:12 -05:00
Vlad Yasevich	619a60ee04	sctp: Remove outqueue empty state The SCTP outqueue structure maintains a data chunks that are pending transmission, the list of chunks that are pending a retransmission and a length of data in flight. It also tries to keep the emtpy state so that it can performe shutdown sequence or notify user. The problem is that the empy state is inconsistently tracked. It is possible to completely drain the queue without sending anything when using PR-SCTP. In this case, the empty state will not be correctly state as report by Jamal Hadi Salim <jhs@mojatatu.com>. This can cause an association to be perminantly stuck in the SHUTDOWN_PENDING state. Additionally, SCTP is incredibly inefficient when setting the empty state. Even though all the data is availaible in the outqueue structure, we ignore it and walk a list of trasnports. In the end, we can completely remove the extra empty state and figure out if the queue is empty by looking at 3 things: length of pending data, length of in-flight data, and exisiting of retransmit data. All of these are already in the strucutre. Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 17:22:48 -05:00
stephen hemminger	de6fb288b1	sched classifier: make cgroup table local Doesn't need to be global. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
stephen hemminger	9c75f4029c	sched action: make local function static No need to export functions only used in one file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
Weilong Chen	dd9b45598a	ipv4: switch and case should be at the same indent Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
Weilong Chen	442c67f844	ipv4: spaces required around that '=' Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
Li RongQing	0c3584d589	ipv6: remove prune parameter for fib6_clean_all since the prune parameter for fib6_clean_all always is 0, remove it. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:35 -05:00
wangweidong	b055597697	tipc: make the code look more readable In commit `3b8401fe9d` ("tipc: kill unnecessary goto's") didn't make the code look most readable, so fix it. This patch is cosmetic and does not change the operation of TIPC in any way. Suggested-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:35 -05:00
Salam Noureddine	56022a8fdd	ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set Gratuitous arp packets are useful in switchover scenarios to update client arp tables as quickly as possible. Currently, the mac address of a neighbour is only updated after a locktime period has elapsed since the last update. In most use cases such delays are unacceptable for network admins. Moreover, the "updated" field of the neighbour stucture doesn't record the last time the address of a neighbour changed but records any change that happens to the neighbour. This is clearly a bug since locktime uses that field as meaning "addr_updated". With this observation, I was able to perpetuate a stale address by sending a stream of gratuitous arp packets spaced less than locktime apart. With this change the address is updated when a gratuitous arp is received and the arp_accept sysctl is set. Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 00:08:38 -05:00
stephen hemminger	e82435341f	ipv6: namespace cleanups Running 'make namespacecheck' shows: net/ipv6/route.o ipv6_route_table_template rt6_bind_peer net/ipv6/icmp.o icmpv6_route_lookup ipv6_icmp_table_template This addresses some of those warnings by: * make icmpv6_route_lookup static * move inline's out of ip6_route.h since only used into route.c * move rt6_bind_peer into route.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:46:09 -05:00
stephen hemminger	1d143d9f0c	net: core functions cleanup The following functions are not used outside of net/core/dev.c and should be declared static. call_netdevice_notifiers_info __dev_remove_offload netdev_has_any_upper_dev __netdev_adjacent_dev_remove __netdev_adjacent_dev_link_lists __netdev_adjacent_dev_unlink_lists __netdev_adjacent_dev_unlink __netdev_adjacent_dev_link_neighbour __netdev_adjacent_dev_unlink_neighbour And the following are never used and should be deleted netdev_lower_dev_get_private_rcu __netdev_find_adj_rcu Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:46:09 -05:00
stephen hemminger	2173f8d953	netlink: cleanup tap related functions Cleanups in netlink_tap code * remove unused function netlink_clear_multicast_users * make local function static Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:43:36 -05:00
stephen hemminger	3678a9d863	netlink: cleanup rntl_af_register The function __rtnl_af_register is never called outside this code, and the return value is always 0. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:42:19 -05:00
Li RongQing	c3ac17cd6a	ipv6: fix the use of pcpu_tstats in sit when read/write the 64bit data, the correct lock should be hold. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 22:48:59 -05:00
John W. Linville	ad86c55bac	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2014-01-01 15:39:56 -05:00
Pablo Neira Ayuso	d497c63527	netfilter: add help information to new nf_tables Kconfig options Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-01 18:37:10 +01:00
Zhi Yong Wu	21eb218989	net, sch: fix the typo in register_qdisc() Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 16:44:10 -05:00
Zhi Yong Wu	855abcf066	net, rps: fix the comment of net_rps_action_and_irq_enable() Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 16:44:10 -05:00
David S. Miller	2205369a31	vlan: Fix header ops passthru when doing TX VLAN offload. When the vlan code detects that the real device can do TX VLAN offloads in hardware, it tries to arrange for the real device's header_ops to be invoked directly. But it does so illegally, by simply hooking the real device's header_ops up to the VLAN device. This doesn't work because we will end up invoking a set of header_ops routines which expect a device type which matches the real device, but will see a VLAN device instead. Fix this by providing a pass-thru set of header_ops which will arrange to pass the proper real device instead. To facilitate this add a dev_rebuild_header(). There are implementations which provide a ->cache and ->create but not a ->rebuild (f.e. PLIP). So we need a helper function just like dev_hard_header() to avoid crashes. Use this helper in the one existing place where the header_ops->rebuild was being invoked, the neighbour code. With lots of help from Florian Westphal. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 16:23:35 -05:00
wangweidong	0438816efd	sctp: move skb_dst_set() a bit downwards in sctp_packet_transmit() skb_dst_set will use dst, if dst is NULL although is not a problem, then goto the 'no_route' and free nskb, so do the skb_dst_set is pointless. so move the skb_dst_set after dst check. Remove the unnecessary initialization as well. v2: fix the subject line because it would confuse people, as pointed out by Daniel. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:44 -05:00
Yang Yingliang	6a031f67c8	sch_netem: support of 64bit rates Add a new attribute to support 64bit rates so that tc can use them to break the 32bit limit. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:44 -05:00
Yang Yingliang	8cfd88d6d7	sch_netem: more precise length of packets With TSO/GSO/GRO packets, skb->len doesn't represent a precise amount of bytes on wire. This patch replace skb->len with qdisc_pkt_len(skb) which is more precise. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Daniel Borkmann	604d13c97f	netlink: specify netlink packet direction for nlmon In order to facilitate development for netlink protocol dissector, fill the unused field skb->pkt_type of the cloned skb with a hint of the address space of the new owner (receiver) socket in the notion of "to kernel" resp. "to user". At the time we invoke __netlink_deliver_tap_skb(), we already have set the new skb owner via netlink_skb_set_owner_r(), so we can use that for netlink_is_kernel() probing. In normal PF_PACKET network traffic, this field denotes if the packet is destined for us (PACKET_HOST), if it's broadcast (PACKET_BROADCAST), etc. As we only have 3 bit reserved, we can use the value (= 6) of PACKET_FASTROUTE as it's _not used_ anywhere in the whole kernel and not supported anywhere, and packets of such type were never exposed to user space, so there are no overlapping users of such kind. Thus, as wished, that seems the only way to make both PACKET_* values non-overlapping and therefore device agnostic. By using those two flags for netlink skbs on nlmon devices, they can be made available and picked up via sll_pkttype (previously unused in netlink context) in struct sockaddr_ll. We now have these two directions: - PACKET_USER (= 6) -> to user space - PACKET_KERNEL (= 7) -> to kernel space Partial `ip a` example strace for sa_family=AF_NETLINK with detected nl msg direction: syscall: direction: sendto(3, ...) = 40 /* to kernel / recvmsg(3, ...) = 3404 / to user / recvmsg(3, ...) = 1120 / to user / recvmsg(3, ...) = 20 / to user / sendto(3, ...) = 40 / to kernel / recvmsg(3, ...) = 168 / to user / recvmsg(3, ...) = 144 / to user / recvmsg(3, ...) = 20 / to user */ Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Daniel Borkmann	73bfd370c8	netlink: only do not deliver to tap when both sides are kernel sks We should also deliver packets to nlmon devices when we are in netlink_unicast_kernel(), and only one of the {src,dst} sockets is user sk and the other one kernel sk. That's e.g. the case in netlink diag, netlink route, etc. Still, forbid to deliver messages from kernel to kernel sks. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Neil Horman	94f65193ab	SCTP: Reduce log spamming for sctp setsockopt During a recent discussion regarding some sctp socket options, it was noted that we have several points at which we issue log warnings that can be flooded at an unbounded rate by any user. Fix this by converting all the pr_warns in the sctp_setsockopt path to be pr_warn_ratelimited. Note there are several debug level messages as well. I'm leaving those alone, as, if you turn on pr_debug, you likely want lots of verbosity. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: David Miller <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:58:40 -05:00
Yang Yingliang	c76f2a2c4c	sch_dsmark: use correct func name in print messages In dsmark_drop(), the function name printed by pr_debug is "dsmark_reset", correct it to "dsmark_drop" by using __func__ . BTW, replace the other function names with __func__ . Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:50:57 -05:00
Yang Yingliang	a071d27241	sch_htb: use /* comments Do not use C99 // comments and correct a spelling typo. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:50:57 -05:00
Yang Yingliang	c17988a90f	net_sched: replace pr_warning with pr_warn Prefer pr_warn(... to pr_warning(... Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:50:56 -05:00
Weilong Chen	d4dd8aeefd	packet: fix "foo * bar" and "(foo*)" problems Cleanup checkpatch errors.Specially,the second changed line is exactly 80 columns long. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:38:41 -05:00
Li RongQing	46306b4934	ipv6: unneccessary to get address prefix in addrconf_get_prefix_route Since addrconf_get_prefix_route inputs the address prefix to fib6_locate, which does not uses the data which is out of the prefix_len length, so do not need to use ipv6_addr_prefix to get address prefix. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:37:46 -05:00
Eric Leblond	bee11dc78f	netfilter: nft_reject: support for IPv6 and TCP reset This patch moves nft_reject_ipv4 to nft_reject and adds support for IPv6 protocol. This patch uses functions included in nf_reject.h to implement reject by TCP reset. The code has to be build as a module if NF_TABLES_IPV6 is also a module to avoid compilation error due to usage of IPv6 functions. This has been done in Kconfig by using the construct: depends on NF_TABLES_IPV6 \|\| !NF_TABLES_IPV6 This seems a bit weird in terms of syntax but works perfectly. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-30 18:15:38 +01:00
Eric Leblond	cc70d069e2	netfilter: REJECT: separate reusable code This patch prepares the addition of TCP reset support in the nft_reject module by moving reusable code into a header file. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-30 15:04:41 +01:00
Florian Westphal	f81152e350	net: rose: restore old recvmsg behavior recvmsg handler in net/rose/af_rose.c performs size-check ->msg_namelen. After commit `f3d3342602` (net: rework recvmsg handler msg_name and msg_namelen logic), we now always take the else branch due to namelen being initialized to 0. Digging in netdev-vger-cvs git repo shows that msg_namelen was initialized with a fixed-size since at least 1995, so the else branch was never taken. Compile tested only. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 22:33:17 -05:00
Ying Xue	84602761ca	tipc: fix deadlock during socket release A deadlock might occur if name table is withdrawn in socket release routine, and while packets are still being received from bearer. CPU0 CPU1 T0: recv_msg() release() T1: tipc_recv_msg() tipc_withdraw() T2: [grab node lock] [grab port lock] T3: tipc_link_wakeup_ports() tipc_nametbl_withdraw() T4: [grab port lock]* named_cluster_distribute() T5: wakeupdispatch() tipc_link_send() T6: [grab node lock]* The opposite order of holding port lock and node lock on above two different paths may result in a deadlock. If socket lock instead of port lock is used to protect port instance in tipc_withdraw(), the reverse order of holding port lock and node lock will be eliminated, as a result, the deadlock is killed as well. Reported-by: Lars Everbrand <lars.everbrand@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 22:24:07 -05:00
stephen hemminger	24245a1b05	lro: remove dead code Remove leftover code that is not used anywhere in current tree. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 16:34:25 -05:00
stephen hemminger	f7e56a76ac	tcp: make local functions static The following are only used in one file: tcp_connect_init tcp_set_rto Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 16:34:24 -05:00
Eric Leblond	5f291c2869	netfilter: select NFNETLINK when enabling NF_TABLES In Kconfig, nf_tables depends on NFNETLINK so building nf_tables as a module or inside kernel depends on the state of NFNETLINK inside the kernel config. If someone wants to build nf_tables inside the kernel, it is necessary to also build NFNETLINK inside the kernel. But NFNETLINK can not be set in the menu so it is necessary to toggle other nfnetlink subsystems such as logging and nfacct to see the nf_tables switch. This patch changes the dependency from 'depend' to 'select' inside Kconfig to allow to set the build of nftables as modules or inside kernel independently. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-29 12:08:29 +01:00
David S. Miller	8eb9bff0ed	Included changes: - reset netfilter-bridge state when removing the batman-adv header from an incoming packet. This prevents netfilter bridge from being fooled when the same packet enters a bridge twice (or more): the first time within the batman-adv header and the second time without. - adjust the packet layout to prevent any architecture from adding padding bytes. All the structs sent over the wire now have size multiple of 4bytes (unless pack(2) is used). - fix access to the inner vlan_eth header when reading the VID in the rx path. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJSvvT6AAoJEEKTMo6mOh1Vn4EP+gKxohskmxmnRd/MnOL1FyyR +T1+RtNm3Z2+v0OlR1zcHS5MJwt7KDrakS+w4tJTNiZTaPPjHQZu9wAn4wlFt9ix 9GO9nAtTbOcPj72WZsmq3Wslm2qKnF+kmu7hdOu7Grv3n3EEafQ+P8vzw7FACB+N dMiwkfGdHgSxlM7a1faJM9N2qiSM2vUesElYPyinhjCciHwBuFW3HwK3CxayVwVE Q20AMlgUd18ZGgy5oqrZz/1LCpaOsydtbcNHOXjF8gX0zwCTmog9GTAD6hT0mtTU 8lo2DHLFH9Hxm1bs8jDkiQEBLK2gXPGb6ic07IRE6g24my62RrAizE/yMmd1WrCP 6I8Qhs58tQ0RUVH8m6XG71TBw3nLk69QbKxHbYiDWlLQhrygx/9+ZXfuu3VUPvGS 9EmNuWw3Io7+KqaT0BJC6CJIkZLTL7oJvgyO2uFGEgkSjY5vxSF8GU0yisP9xilW bpfhQO64gsmxjY2tjmIVRckVTUOsMnJKVlquq6aHKfgsOQdnTTNOd+l7bD8izIJ8 GwZKZLuzYJqYop4TxpkbNK2ckCl0kcMCHHLInJ/pO1KU0sAs+V1+HHouX6VGr9HT IsbBvwMslbRPi8uOFAzueHTT/DzBXdP4sMsg5QomBQOqH8LlZuQ4ZBHtUbEQOVrp xRCy3Fj/wjWzZUFdYE3f =RDuB -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - reset netfilter-bridge state when removing the batman-adv header from an incoming packet. This prevents netfilter bridge from being fooled when the same packet enters a bridge twice (or more): the first time within the batman-adv header and the second time without. - adjust the packet layout to prevent any architecture from adding padding bytes. All the structs sent over the wire now have size multiple of 4bytes (unless pack(2) is used). - fix access to the inner vlan_eth header when reading the VID in the rx path. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 00:30:59 -05:00
David S. Miller	a72338a00e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net This patchset contains four nf_tables fixes, one IPVS fix due to missing updates in the interaction with the new sedadj conntrack extension that was added to support the netfilter synproxy code, and a couple of one-liners to fix netnamespace netfilter issues. More specifically, they are: * Fix ipv6_find_hdr() call without offset being explicitly initialized in nft_exthdr, as required by that function, from Daniel Borkmann. * Fix oops in nfnetlink_log when using netns and unloading the kernel module, from Gao feng. * Fix BUG_ON in nf_ct_timestamp extension after netns is destroyed, from Helmut Schaa. * Fix crash in IPVS due to missing sequence adjustment extension being allocated in the conntrack, from Jesper Dangaard Brouer. * Add bugtrap to spot a warning in case you deference sequence adjustment conntrack area when not available, this should help to catch similar invalid dereferences in the Netfilter tree, also from Jesper. * Fix incomplete dumping of sets in nf_tables when retrieving by family, from me. * Fix oops when updating the table state (dormant <-> active) and having user (not base ) chains, from me. * Fix wrong validation in set element data that results in returning -EINVAL when using the nf_tables dictionary feature with mappings, also from me. We don't usually have this amount of fixes by this time (as we're already in -rc5 of the development cycle), although half of them are related to nf_tables which is a relatively new thing, and I also believe that holidays have also delayed the flight of bugfixes to mainstream a bit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 00:24:28 -05:00
Stephen Hemminger	ea074b3495	ipv4: ping make local stuff static Don't export ping_table or ping_v4_sendmsg. Both are only used inside ping code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:05:45 -05:00
Stephen Hemminger	068a6e1834	ipv4: remove unused function inetpeer_invalidate_family defined but never used Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:03:20 -05:00
Stephen Hemminger	7195cf7221	arp: make arp_invalidate static Don't export arp_invalidate, only used in arp.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:02:46 -05:00
Stephen Hemminger	c9cb6b6ec1	ipv4: make fib_detect_death static Make fib_detect_death function static only used in one file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:01:46 -05:00
Pablo Neira Ayuso	2ee0d3c80f	netfilter: nf_tables: fix wrong datatype in nft_validate_data_load() This patch fixes dictionary mappings, eg. add rule ip filter input meta dnat set tcp dport map { 22 => 1.1.1.1, 23 => 2.2.2.2 } The kernel was returning -EINVAL in nft_validate_data_load() since the type of the set element data that is passed was the real userspace datatype instead of NFT_DATA_VALUE. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 22:32:28 +01:00
Antonio Quartulli	2b1e2cb359	batman-adv: fix vlan header access When batadv_get_vid() is invoked in interface_rx() the batman-adv header has already been removed, therefore the header_len argument has to be 0. Introduced by `c018ad3de6` ("batman-adv: add the VLAN ID attribute to the TT entry") Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 14:48:40 +01:00
Antonio Quartulli	55883fd104	batman-adv: clean nf state when removing protocol header If an interface enslaved into batman-adv is a bridge (or a virtual interface built on top of a bridge) the nf_bridge member of the skbs reaching the soft-interface is filled with the state about "netfilter bridge" operations. Then, if one of such skbs is locally delivered, the nf_bridge member should be cleaned up to avoid that the old state could mess up with other "netfilter bridge" operations when entering a second bridge. This is needed because batman-adv is an encapsulation protocol. However at the moment skb->nf_bridge is not released at all leading to bogus "netfilter bridge" behaviours. Fix this by cleaning the netfilter state of the skb before it gets delivered to the upper layer in interface_rx(). Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 14:47:44 +01:00
Pablo Neira Ayuso	994737513e	netfilter: nf_tables: remove nft_meta_target In `e035b77` ("netfilter: nf_tables: nft_meta module get/set ops"), we got the meta target merged into the existing meta expression. So let's get rid of this dead code now that we fully support that feature. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 14:06:25 +01:00
Arturo Borrero Gonzalez	e035b77ac7	netfilter: nf_tables: nft_meta module get/set ops This patch adds kernel support for the meta expression in get/set flavour. The set operation indicates that a given packet has to be set with a property, currently one of mark, priority, nftrace. The get op is what was currently working: evaluate the given packet property. In the nftrace case, the value is always 1. Such behaviour is copied from net/netfilter/xt_TRACE.c The NFTA_META_DREG and NFTA_META_SREG attributes are mutually exclusives. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 14:02:12 +01:00
Antonio Quartulli	ca66304644	batman-adv: fix alignment for batadv_tvlv_tt_change Make struct batadv_tvlv_tt_change a multiple 4 bytes long to avoid padding on any architecture. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 12:51:18 +01:00
Simon Wunderlich	2f7a318219	batman-adv: fix size of batadv_bla_claim_dst Since this is a mac address and always 48 bit, and we can assume that it is always aligned to 2-byte boundaries, add a pack(2) pragma. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-12-28 12:51:17 +01:00
Antonio Quartulli	27a417e6ba	batman-adv: fix size of batadv_icmp_header struct batadv_icmp_header currently has a size of 17, which will be padded to 20 on some architectures. Fix this by unrolling the header into the parent structures. Moreover keep the ICMP parsing functions as generic as they are now by using a stub icmp_header struct during packet parsing. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 12:51:16 +01:00
Simon Wunderlich	a40d9b075c	batman-adv: fix header alignment by unrolling batadv_header The size of the batadv_header of 3 is problematic on some architectures which automatically pad all structures to a 32 bit boundary. To not lose performance by packing this struct, better embed it into the various host structures. Reported-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-12-28 12:51:16 +01:00
Simon Wunderlich	46b76e0b8b	batman-adv: fix alignment for batadv_coded_packet The compiler may decide to pad the structure, and then it does not have the expected size of 46 byte. Fix this by moving it in the pragma pack(2) part of the code. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-12-28 12:51:15 +01:00
Pablo Neira Ayuso	d201297561	netfilter: nf_tables: fix oops when updating table with user chains This patch fixes a crash while trying to deactivate a table that contains user chains. You can reproduce it via: % nft add table table1 % nft add chain table1 chain1 % nft-table-upd ip table1 dormant [ 253.021026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 [ 253.021114] IP: [<ffffffff8134cebd>] nf_register_hook+0x35/0x6f [ 253.021167] PGD 30fa5067 PUD 30fa2067 PMD 0 [ 253.021208] Oops: 0000 [#1] SMP [...] [ 253.023305] Call Trace: [ 253.023331] [<ffffffffa0885020>] nf_tables_newtable+0x11c/0x258 [nf_tables] [ 253.023385] [<ffffffffa0878592>] nfnetlink_rcv_msg+0x1f4/0x226 [nfnetlink] [ 253.023438] [<ffffffffa0878418>] ? nfnetlink_rcv_msg+0x7a/0x226 [nfnetlink] [ 253.023491] [<ffffffffa087839e>] ? nfnetlink_bind+0x45/0x45 [nfnetlink] [ 253.023542] [<ffffffff8134b47e>] netlink_rcv_skb+0x3c/0x88 [ 253.023586] [<ffffffffa0878973>] nfnetlink_rcv+0x3af/0x3e4 [nfnetlink] [ 253.023638] [<ffffffff813fb0d4>] ? _raw_read_unlock+0x22/0x34 [ 253.023683] [<ffffffff8134af17>] netlink_unicast+0xe2/0x161 [ 253.023727] [<ffffffff8134b29a>] netlink_sendmsg+0x304/0x332 [ 253.023773] [<ffffffff8130d250>] __sock_sendmsg_nosec+0x25/0x27 [ 253.023820] [<ffffffff8130fb93>] sock_sendmsg+0x5a/0x7b [ 253.023861] [<ffffffff8130d5d5>] ? copy_from_user+0x2a/0x2c [ 253.023905] [<ffffffff8131066f>] ? move_addr_to_kernel+0x35/0x60 [ 253.023952] [<ffffffff813107b3>] SYSC_sendto+0x119/0x15c [ 253.023995] [<ffffffff81401107>] ? sysret_check+0x1b/0x56 [ 253.024039] [<ffffffff8108dc30>] ? trace_hardirqs_on_caller+0x140/0x1db [ 253.024090] [<ffffffff8120164e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 253.024141] [<ffffffff81310caf>] SyS_sendto+0x9/0xb [ 253.026219] [<ffffffff814010e2>] system_call_fastpath+0x16/0x1b Reported-by: Alex Wei <alex.kern.mentor@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 12:18:16 +01:00
Pablo Neira Ayuso	e38195bf32	netfilter: nf_tables: fix dumping with large number of sets If not table name is specified, the dumping of the existing sets may be incomplete with a sufficiently large number of sets and tables. This patch fixes missing reset of the cursors after finding the location of the last object that has been included in the previous multi-part message. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 12:14:42 +01:00
Li RongQing	6a9eadccff	ipv6: release dst properly in ipip6_tunnel_xmit if a dst is not attached to anywhere, it should be released before exit ipip6_tunnel_xmit, otherwise cause dst memory leakage. Fixes: `61c1db7fae` ("ipv6: sit: add GSO/TSO support") Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 13:14:40 -05:00
Weilong Chen	3193502e5b	ieee802154: space prohibited before that close parenthesis Fix checkpatch error with space. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 13:06:16 -05:00
Weilong Chen	3cdba604d0	llc: "foo* bar" should be "foo *bar" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 13:06:15 -05:00
Jamal Hadi Salim	1a29321ed0	net_sched: act: Dont increment refcnt on replace This is a bug fix. The existing code tries to kill many birds with one stone: Handling binding of actions to filters, new actions and replacing of action attributes. A simple test case to illustrate: XXXX moja@fe1:~$ sudo tc actions add action drop index 12 moja@fe1:~$ actions get action gact index 12 action order 1: gact action drop random type none pass val 0 index 12 ref 1 bind 0 moja@fe1:~$ sudo tc actions replace action ok index 12 moja@fe1:~$ actions get action gact index 12 action order 1: gact action drop random type none pass val 0 index 12 ref 2 bind 0 XXXX The above shows the refcounf being wrongly incremented on replace. There are more complex scenarios with binding of actions to filters that i am leaving out that didnt work as well... Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 12:50:00 -05:00
Sasha Levin	c2349758ac	rds: prevent dereference of a NULL device Binding might result in a NULL device, which is dereferenced causing this BUG: [ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097 4 [ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110 [ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0 [ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 1317.264179] Dumping ftrace buffer: [ 1317.264774] (ftrace buffer empty) [ 1317.265220] Modules linked in: [ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4- next-20131218-sasha-00013-g2cebb9b-dirty #4159 [ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000 [ 1317.268399] RIP: 0010:[<ffffffff84225f52>] [<ffffffff84225f52>] rds_ib_laddr_check+ 0x82/0x110 [ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246 [ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000 [ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286 [ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000 [ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031 [ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000 0000 [ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0 [ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602 [ 1317.270230] Stack: [ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000 [ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160 [ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280 [ 1317.270230] Call Trace: [ 1317.270230] [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0 [ 1317.270230] [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0 [ 1317.270230] [<ffffffff8421c9c3>] rds_bind+0x73/0xf0 [ 1317.270230] [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0 [ 1317.270230] [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0 [ 1317.270230] [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10 [ 1317.270230] [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290 [ 1317.270230] [<ffffffff83e4cece>] SyS_bind+0xe/0x10 [ 1317.270230] [<ffffffff843a6ad0>] tracesys+0xdd/0xe2 [ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7 4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02 [ 1317.270230] RIP [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110 [ 1317.270230] RSP <ffff8803cd31bdf8> [ 1317.270230] CR2: 0000000000000974 Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 12:33:58 -05:00
Jesper Dangaard Brouer	b25adce160	ipvs: correct usage/allocation of seqadj ext in ipvs The IPVS FTP helper ip_vs_ftp could trigger an OOPS in nf_ct_seqadj_set, after commit `41d73ec053` (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT). This is because, the seqadj ext is now allocated dynamically, and the IPVS code didn't handle this situation. Fix this in the IPVS nfct code by invoking the alloc function nfct_seqadj_ext_add(). Fixes: `41d73ec053` (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT) Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-12-27 12:30:02 +09:00
Jesper Dangaard Brouer	db12cf2743	netfilter: WARN about wrong usage of sequence number adjustments Since commit `41d73ec053` (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT), the sequence number extension is dynamically allocated. Instead of dying, give a WARN splash, in case of wrong usage of the seqadj code, e.g. when forgetting to allocate via nfct_seqadj_ext_add(). Wrong usage have been seen in the IPVS code path. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-12-27 12:29:54 +09:00
Geert Uytterhoeven	9dcbe1b87c	ipvs: Remove unused variable ret from sync_thread_master() net/netfilter/ipvs/ip_vs_sync.c: In function 'sync_thread_master': net/netfilter/ipvs/ip_vs_sync.c:1640:8: warning: unused variable 'ret' [-Wunused-variable] Commit `35a2af94c7` ("sched/wait: Make the __wait_event*() interface more friendly") changed how the interruption state is returned. However, sync_thread_master() ignores this state, now causing a compile warning. According to Julian Anastasov <ja@ssi.bg>, this behavior is OK: "Yes, your patch looks ok to me. In the past we used ssleep() but IPVS users were confused why IPVS threads increase the load average. So, we switched to _interruptible calls and later the socket polling was added." Document this, as requested by Peter Zijlstra, to avoid precious developers disappearing in this pitfall in the future. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-12-27 12:19:32 +09:00
Yang Yingliang	2e04ad424b	sch_tbf: add TBF_BURST/TBF_PBURST attribute When we set burst to 1514 with low rate in userspace, the kernel get a value of burst that less than 1514, which doesn't work. Because it may make some loss when transform burst to buffer in userspace. This makes burst lose some bytes, when the kernel transform the buffer back to burst. This patch adds two new attributes to support sending burst/mtu to kernel directly to avoid the loss. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:54:22 -05:00
wangweidong	4e2d52bfb2	sctp: fix checkpatch errors with //commen Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	8d72651d86	sctp: fix checkpatch errors with open brace '{' and trailing statements fix checkpatch errors below: ERROR: that open brace { should be on the previous line ERROR: open brace '{' following function declarations go on the next line ERROR: trailing statements should be on next line Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	f7010e6144	sctp: fix checkpatch errors with indent fix checkpatch errors below: ERROR: switch and case should be at the same inden ERROR: code indent should use tabs where possible Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	26ac8e5fe1	sctp: fix checkpatch errors with (foo)\|foo bar\|foo* bar fix checkpatch errors below: ERROR: "(foo)" should be "(foo )" ERROR: "foo * bar" should be "foo bar" ERROR: "foo bar" should be "foo *bar" Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:47 -05:00
wangweidong	cb3f837ba9	sctp: fix checkpatch errors with space required or prohibited fix checkpatch errors while the space is required or prohibited to the "=,()++..." Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:47 -05:00
Weilong Chen	4c99aa409a	ipv6: cleanup for tcp_ipv6.c Fix some checkpatch errors for tcp_ipv6.c Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:46:23 -05:00
Weilong Chen	49564e5516	ipv4: ipv4: Cleanup the comments in tcp_yeah.c This cleanup the comments in tcp_yeah.c. 1.The old link is dead,use a new one to instead. 2.'lin' add nothing useful,remove it. 3.do not use C99 // comments. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:55 -05:00
Weilong Chen	5797deb657	ipv4: ERROR: code indent should use tabs where possible Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	47d18a9be1	ipv4: ERROR: do not initialise globals to 0 or NULL Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	c71151f05b	ipv4: fix all space errors in file igmp.c Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	d41db5af26	ipv4: fix checkpatch error with foo * bar Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	0c9a67d2ed	ipv4: fix checkpatch error "space prohibited" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	a22318e83b	ipv4: do clean up with spaces Fix checkpatch errors like: ERROR: spaces required around that XXX Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
dingtianhong	496d7e8ea3	mac8011: slight optimization of addr compare Use the possibly more efficient ether_addr_equal to instead of memcmp. Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John W. Linville <linville@tuxdriver.com> Cc: David Miller <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:31:34 -05:00

1 2 3 4 5 ...

30951 Commits