linux_dsm_epyc7002/net/tipc/server.c

645 lines
15 KiB
C
Raw Normal View History

tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
/*
* net/tipc/server.c: TIPC server infrastructure
*
* Copyright (c) 2012-2013, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. Neither the names of the copyright holders nor the names of its
* contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* Alternatively, this software may be distributed under the terms of the
* GNU General Public License ("GPL") version 2 as published by the Free
* Software Foundation.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include "server.h"
#include "core.h"
#include "socket.h"
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
#include <net/sock.h>
tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 08:32:57 +07:00
#include <linux/module.h>
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
/* Number of messages to send before rescheduling */
#define MAX_SEND_MSG_COUNT 25
#define MAX_RECV_MSG_COUNT 25
#define CF_CONNECTED 1
tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 08:32:57 +07:00
#define CF_SERVER 2
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
#define sock2con(x) ((struct tipc_conn *)(x)->sk_user_data)
/**
* struct tipc_conn - TIPC connection structure
* @kref: reference counter to connection object
* @conid: connection identifier
* @sock: socket handler associated with connection
* @flags: indicates connection state
* @server: pointer to connected server
* @rwork: receive work item
* @usr_data: user-specified field
* @rx_action: what to do when connection socket is active
* @outqueue: pointer to first outbound message in queue
* @outqueue_lock: control access to the outqueue
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
* @outqueue: list of connection objects for its server
* @swork: send work item
*/
struct tipc_conn {
struct kref kref;
int conid;
struct socket *sock;
unsigned long flags;
struct tipc_server *server;
struct work_struct rwork;
int (*rx_action) (struct tipc_conn *con);
void *usr_data;
struct list_head outqueue;
spinlock_t outqueue_lock;
struct work_struct swork;
};
/* An entry waiting to be sent */
struct outqueue_entry {
struct list_head list;
struct kvec iov;
struct sockaddr_tipc dest;
};
static void tipc_recv_work(struct work_struct *work);
static void tipc_send_work(struct work_struct *work);
static void tipc_clean_outqueues(struct tipc_conn *con);
tipc: fix a race condition leading to subscriber refcnt bug Until now, the requests sent to topology server are queued to a workqueue by the generic server framework. These messages are processed by worker threads and trigger the registered callbacks. To reduce latency on uniprocessor systems, explicit rescheduling is performed using cond_resched() after MAX_RECV_MSG_COUNT(25) messages. This implementation on SMP systems leads to an subscriber refcnt error as described below: When a worker thread yields by calling cond_resched() in a SMP system, a new worker is created on another CPU to process the pending workitem. Sometimes the sleeping thread wakes up before the new thread finishes execution. This breaks the assumption on ordering and being single threaded. The fault is more frequent when MAX_RECV_MSG_COUNT is lowered. If the first thread was processing subscription create and the second thread processing close(), the close request will free the subscriber and the create request oops as follows: [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97 [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc] [...] [31.228377] Call Trace: [31.228377] [<ffffffff812fbb6b>] dump_stack+0x4d/0x72 [31.228377] [<ffffffff8105a311>] __warn+0xd1/0xf0 [31.228377] [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20 [31.228377] [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228377] [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc] [31.228377] [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc] [31.228377] [<ffffffff81071925>] process_one_work+0x145/0x3d0 [31.246554] ---[ end trace c3882c9baa05a4fd ]--- [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266 [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428 [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0 [31.249323] PGD 0 [31.249323] Oops: 0000 [#1] SMP In this commit, we - rename tipc_conn_shutdown() to tipc_conn_release(). - move connection release callback execution from tipc_close_conn() to a new function tipc_sock_release(), which is executed before we free the connection. Thus we release the subscriber during connection release procedure rather than connection shutdown procedure. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12 18:05:21 +07:00
static void tipc_sock_release(struct tipc_conn *con);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
static void tipc_conn_kref_release(struct kref *kref)
{
struct tipc_conn *con = container_of(kref, struct tipc_conn, kref);
struct sockaddr_tipc *saddr = con->server->saddr;
tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 08:32:57 +07:00
struct socket *sock = con->sock;
struct sock *sk;
if (sock) {
sk = sock->sk;
if (test_bit(CF_SERVER, &con->flags)) {
__module_get(sock->ops->owner);
__module_get(sk->sk_prot_creator->owner);
}
saddr->scope = -TIPC_NODE_SCOPE;
kernel_bind(sock, (struct sockaddr *)saddr, sizeof(*saddr));
tipc: fix a race condition leading to subscriber refcnt bug Until now, the requests sent to topology server are queued to a workqueue by the generic server framework. These messages are processed by worker threads and trigger the registered callbacks. To reduce latency on uniprocessor systems, explicit rescheduling is performed using cond_resched() after MAX_RECV_MSG_COUNT(25) messages. This implementation on SMP systems leads to an subscriber refcnt error as described below: When a worker thread yields by calling cond_resched() in a SMP system, a new worker is created on another CPU to process the pending workitem. Sometimes the sleeping thread wakes up before the new thread finishes execution. This breaks the assumption on ordering and being single threaded. The fault is more frequent when MAX_RECV_MSG_COUNT is lowered. If the first thread was processing subscription create and the second thread processing close(), the close request will free the subscriber and the create request oops as follows: [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97 [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc] [...] [31.228377] Call Trace: [31.228377] [<ffffffff812fbb6b>] dump_stack+0x4d/0x72 [31.228377] [<ffffffff8105a311>] __warn+0xd1/0xf0 [31.228377] [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20 [31.228377] [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228377] [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc] [31.228377] [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc] [31.228377] [<ffffffff81071925>] process_one_work+0x145/0x3d0 [31.246554] ---[ end trace c3882c9baa05a4fd ]--- [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266 [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428 [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0 [31.249323] PGD 0 [31.249323] Oops: 0000 [#1] SMP In this commit, we - rename tipc_conn_shutdown() to tipc_conn_release(). - move connection release callback execution from tipc_close_conn() to a new function tipc_sock_release(), which is executed before we free the connection. Thus we release the subscriber during connection release procedure rather than connection shutdown procedure. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12 18:05:21 +07:00
tipc_sock_release(con);
sock_release(sock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
con->sock = NULL;
}
tipc_clean_outqueues(con);
kfree(con);
}
static void conn_put(struct tipc_conn *con)
{
kref_put(&con->kref, tipc_conn_kref_release);
}
static void conn_get(struct tipc_conn *con)
{
kref_get(&con->kref);
}
static struct tipc_conn *tipc_conn_lookup(struct tipc_server *s, int conid)
{
struct tipc_conn *con;
spin_lock_bh(&s->idr_lock);
con = idr_find(&s->conn_idr, conid);
if (con)
conn_get(con);
spin_unlock_bh(&s->idr_lock);
return con;
}
static void sock_data_ready(struct sock *sk)
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
{
struct tipc_conn *con;
read_lock_bh(&sk->sk_callback_lock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
con = sock2con(sk);
if (con && test_bit(CF_CONNECTED, &con->flags)) {
conn_get(con);
if (!queue_work(con->server->rcv_wq, &con->rwork))
conn_put(con);
}
read_unlock_bh(&sk->sk_callback_lock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
}
static void sock_write_space(struct sock *sk)
{
struct tipc_conn *con;
read_lock_bh(&sk->sk_callback_lock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
con = sock2con(sk);
if (con && test_bit(CF_CONNECTED, &con->flags)) {
conn_get(con);
if (!queue_work(con->server->send_wq, &con->swork))
conn_put(con);
}
read_unlock_bh(&sk->sk_callback_lock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
}
static void tipc_register_callbacks(struct socket *sock, struct tipc_conn *con)
{
struct sock *sk = sock->sk;
write_lock_bh(&sk->sk_callback_lock);
sk->sk_data_ready = sock_data_ready;
sk->sk_write_space = sock_write_space;
sk->sk_user_data = con;
con->sock = sock;
write_unlock_bh(&sk->sk_callback_lock);
}
static void tipc_unregister_callbacks(struct tipc_conn *con)
{
struct sock *sk = con->sock->sk;
write_lock_bh(&sk->sk_callback_lock);
sk->sk_user_data = NULL;
write_unlock_bh(&sk->sk_callback_lock);
}
tipc: fix a race condition leading to subscriber refcnt bug Until now, the requests sent to topology server are queued to a workqueue by the generic server framework. These messages are processed by worker threads and trigger the registered callbacks. To reduce latency on uniprocessor systems, explicit rescheduling is performed using cond_resched() after MAX_RECV_MSG_COUNT(25) messages. This implementation on SMP systems leads to an subscriber refcnt error as described below: When a worker thread yields by calling cond_resched() in a SMP system, a new worker is created on another CPU to process the pending workitem. Sometimes the sleeping thread wakes up before the new thread finishes execution. This breaks the assumption on ordering and being single threaded. The fault is more frequent when MAX_RECV_MSG_COUNT is lowered. If the first thread was processing subscription create and the second thread processing close(), the close request will free the subscriber and the create request oops as follows: [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97 [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc] [...] [31.228377] Call Trace: [31.228377] [<ffffffff812fbb6b>] dump_stack+0x4d/0x72 [31.228377] [<ffffffff8105a311>] __warn+0xd1/0xf0 [31.228377] [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20 [31.228377] [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228377] [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc] [31.228377] [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc] [31.228377] [<ffffffff81071925>] process_one_work+0x145/0x3d0 [31.246554] ---[ end trace c3882c9baa05a4fd ]--- [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266 [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428 [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0 [31.249323] PGD 0 [31.249323] Oops: 0000 [#1] SMP In this commit, we - rename tipc_conn_shutdown() to tipc_conn_release(). - move connection release callback execution from tipc_close_conn() to a new function tipc_sock_release(), which is executed before we free the connection. Thus we release the subscriber during connection release procedure rather than connection shutdown procedure. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12 18:05:21 +07:00
static void tipc_sock_release(struct tipc_conn *con)
{
struct tipc_server *s = con->server;
if (con->conid)
s->tipc_conn_release(con->conid, con->usr_data);
tipc_unregister_callbacks(con);
}
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
static void tipc_close_conn(struct tipc_conn *con)
{
struct tipc_server *s = con->server;
if (test_and_clear_bit(CF_CONNECTED, &con->flags)) {
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
spin_lock_bh(&s->idr_lock);
idr_remove(&s->conn_idr, con->conid);
s->idr_in_use--;
spin_unlock_bh(&s->idr_lock);
/* We shouldn't flush pending works as we may be in the
* thread. In fact the races with pending rx/tx work structs
* are harmless for us here as we have already deleted this
tipc: fix a race condition leading to subscriber refcnt bug Until now, the requests sent to topology server are queued to a workqueue by the generic server framework. These messages are processed by worker threads and trigger the registered callbacks. To reduce latency on uniprocessor systems, explicit rescheduling is performed using cond_resched() after MAX_RECV_MSG_COUNT(25) messages. This implementation on SMP systems leads to an subscriber refcnt error as described below: When a worker thread yields by calling cond_resched() in a SMP system, a new worker is created on another CPU to process the pending workitem. Sometimes the sleeping thread wakes up before the new thread finishes execution. This breaks the assumption on ordering and being single threaded. The fault is more frequent when MAX_RECV_MSG_COUNT is lowered. If the first thread was processing subscription create and the second thread processing close(), the close request will free the subscriber and the create request oops as follows: [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97 [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc] [...] [31.228377] Call Trace: [31.228377] [<ffffffff812fbb6b>] dump_stack+0x4d/0x72 [31.228377] [<ffffffff8105a311>] __warn+0xd1/0xf0 [31.228377] [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20 [31.228377] [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228377] [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc] [31.228377] [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc] [31.228377] [<ffffffff81071925>] process_one_work+0x145/0x3d0 [31.246554] ---[ end trace c3882c9baa05a4fd ]--- [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266 [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428 [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0 [31.249323] PGD 0 [31.249323] Oops: 0000 [#1] SMP In this commit, we - rename tipc_conn_shutdown() to tipc_conn_release(). - move connection release callback execution from tipc_close_conn() to a new function tipc_sock_release(), which is executed before we free the connection. Thus we release the subscriber during connection release procedure rather than connection shutdown procedure. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12 18:05:21 +07:00
* connection from server connection list.
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
*/
kernel_sock_shutdown(con->sock, SHUT_RDWR);
conn_put(con);
}
}
static struct tipc_conn *tipc_alloc_conn(struct tipc_server *s)
{
struct tipc_conn *con;
int ret;
con = kzalloc(sizeof(struct tipc_conn), GFP_ATOMIC);
if (!con)
return ERR_PTR(-ENOMEM);
kref_init(&con->kref);
INIT_LIST_HEAD(&con->outqueue);
spin_lock_init(&con->outqueue_lock);
INIT_WORK(&con->swork, tipc_send_work);
INIT_WORK(&con->rwork, tipc_recv_work);
spin_lock_bh(&s->idr_lock);
ret = idr_alloc(&s->conn_idr, con, 0, 0, GFP_ATOMIC);
if (ret < 0) {
kfree(con);
spin_unlock_bh(&s->idr_lock);
return ERR_PTR(-ENOMEM);
}
con->conid = ret;
s->idr_in_use++;
spin_unlock_bh(&s->idr_lock);
set_bit(CF_CONNECTED, &con->flags);
con->server = s;
return con;
}
static int tipc_receive_from_sock(struct tipc_conn *con)
{
struct msghdr msg = {};
struct tipc_server *s = con->server;
struct sockaddr_tipc addr;
struct kvec iov;
void *buf;
int ret;
buf = kmem_cache_alloc(s->rcvbuf_cache, GFP_ATOMIC);
if (!buf) {
ret = -ENOMEM;
goto out_close;
}
iov.iov_base = buf;
iov.iov_len = s->max_rcvbuf_size;
msg.msg_name = &addr;
ret = kernel_recvmsg(con->sock, &msg, &iov, 1, iov.iov_len,
MSG_DONTWAIT);
if (ret <= 0) {
kmem_cache_free(s->rcvbuf_cache, buf);
goto out_close;
}
s->tipc_conn_recvmsg(sock_net(con->sock->sk), con->conid, &addr,
con->usr_data, buf, ret);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
kmem_cache_free(s->rcvbuf_cache, buf);
return 0;
out_close:
if (ret != -EWOULDBLOCK)
tipc_close_conn(con);
else if (ret == 0)
/* Don't return success if we really got EOF */
ret = -EAGAIN;
return ret;
}
static int tipc_accept_from_sock(struct tipc_conn *con)
{
struct tipc_server *s = con->server;
struct socket *sock = con->sock;
struct socket *newsock;
struct tipc_conn *newcon;
int ret;
tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 08:32:57 +07:00
ret = kernel_accept(sock, &newsock, O_NONBLOCK);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
if (ret < 0)
return ret;
newcon = tipc_alloc_conn(con->server);
if (IS_ERR(newcon)) {
ret = PTR_ERR(newcon);
sock_release(newsock);
return ret;
}
newcon->rx_action = tipc_receive_from_sock;
tipc_register_callbacks(newsock, newcon);
/* Notify that new connection is incoming */
newcon->usr_data = s->tipc_conn_new(newcon->conid);
if (!newcon->usr_data) {
sock_release(newsock);
return -ENOMEM;
}
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
/* Wake up receive process in case of 'SYN+' message */
newsock->sk->sk_data_ready(newsock->sk);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
return ret;
}
static struct socket *tipc_create_listen_sock(struct tipc_conn *con)
{
struct tipc_server *s = con->server;
struct socket *sock = NULL;
int ret;
ret = sock_create_kern(s->net, AF_TIPC, SOCK_SEQPACKET, 0, &sock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
if (ret < 0)
return NULL;
ret = kernel_setsockopt(sock, SOL_TIPC, TIPC_IMPORTANCE,
(char *)&s->imp, sizeof(s->imp));
if (ret < 0)
goto create_err;
ret = kernel_bind(sock, (struct sockaddr *)s->saddr, sizeof(*s->saddr));
if (ret < 0)
goto create_err;
switch (s->type) {
case SOCK_STREAM:
case SOCK_SEQPACKET:
con->rx_action = tipc_accept_from_sock;
ret = kernel_listen(sock, 0);
if (ret < 0)
goto create_err;
break;
case SOCK_DGRAM:
case SOCK_RDM:
con->rx_action = tipc_receive_from_sock;
break;
default:
pr_err("Unknown socket type %d\n", s->type);
goto create_err;
}
tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 08:32:57 +07:00
/* As server's listening socket owner and creator is the same module,
* we have to decrease TIPC module reference count to guarantee that
* it remains zero after the server socket is created, otherwise,
* executing "rmmod" command is unable to make TIPC module deleted
* after TIPC module is inserted successfully.
*
* However, the reference count is ever increased twice in
* sock_create_kern(): one is to increase the reference count of owner
* of TIPC socket's proto_ops struct; another is to increment the
* reference count of owner of TIPC proto struct. Therefore, we must
* decrement the module reference count twice to ensure that it keeps
* zero after server's listening socket is created. Of course, we
* must bump the module reference count twice as well before the socket
* is closed.
*/
module_put(sock->ops->owner);
module_put(sock->sk->sk_prot_creator->owner);
set_bit(CF_SERVER, &con->flags);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
return sock;
create_err:
tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 08:32:57 +07:00
kernel_sock_shutdown(sock, SHUT_RDWR);
sock_release(sock);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
return NULL;
}
static int tipc_open_listening_sock(struct tipc_server *s)
{
struct socket *sock;
struct tipc_conn *con;
con = tipc_alloc_conn(s);
if (IS_ERR(con))
return PTR_ERR(con);
sock = tipc_create_listen_sock(con);
tipc: fix oops when creating server socket fails When creation of TIPC internal server socket fails, we get an oops with the following dump: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] PGD 13719067 PUD 12008067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: tipc(+) CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000 RIP: 0010:[<ffffffffa0011f49>] [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP: 0018:ffff88001374dc98 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000 RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0 R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8 FS: 0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0 CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0 Stack: ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001 ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150 Call Trace: [<ffffffffa0012922>] tipc_server_stop+0xa2/0x1b0 [tipc] [<ffffffffa0009995>] tipc_subscr_stop+0x15/0x20 [tipc] [<ffffffffa00130f5>] tipc_core_stop+0x1d/0x33 [tipc] [<ffffffffa001f0d4>] tipc_init+0xd4/0xf8 [tipc] [<ffffffffa001f000>] ? 0xffffffffa001efff [<ffffffff8100023f>] do_one_initcall+0x3f/0x150 [<ffffffff81082f4d>] ? __blocking_notifier_call_chain+0x7d/0xd0 [<ffffffff810cc58a>] load_module+0x11aa/0x19c0 [<ffffffff810c8d60>] ? show_initstate+0x50/0x50 [<ffffffff8190311c>] ? retint_restore_args+0xe/0xe [<ffffffff810cce79>] SyS_init_module+0xd9/0x110 [<ffffffff8190dc65>] sysenter_dispatch+0x7/0x1f Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24 b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 <4c> 8b 68 20 4d 8d a5 48 03 00 00 4c 89 e7 e8 04 05 8f e1 4c 89 RIP [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP <ffff88001374dc98> CR2: 0000000000000020 ---[ end trace b02321f40e4269a3 ]--- We have the following call chain: tipc_core_start() ret = tipc_subscr_start() ret = tipc_server_start(){ server->enabled = 1; ret = tipc_open_listening_sock() } I.e., the server->enabled flag is unconditionally set to 1, whatever the return value of tipc_open_listening_sock(). This causes a crash when tipc_core_start() tries to clean up resources after a failed initialization: if (ret == failed) tipc_subscr_stop() tipc_server_stop(){ if (server->enabled) tipc_close_conn(){ NULL reference of con->sock-sk OOPS! } } To avoid this, tipc_server_start() should only set server->enabled to 1 in case of a succesful socket creation. In case of failure, it should release all allocated resources before returning. Problem introduced in commit c5fa7b3cf3cb22e4ac60485fc2dc187fe012910f ("tipc: introduce new TIPC server infrastructure") in v3.11-rc1. Note that it won't be seen often; it takes a module load under memory constrained conditions in order to trigger the failure condition. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 19:29:18 +07:00
if (!sock) {
idr_remove(&s->conn_idr, con->conid);
s->idr_in_use--;
kfree(con);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
return -EINVAL;
tipc: fix oops when creating server socket fails When creation of TIPC internal server socket fails, we get an oops with the following dump: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] PGD 13719067 PUD 12008067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: tipc(+) CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000 RIP: 0010:[<ffffffffa0011f49>] [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP: 0018:ffff88001374dc98 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000 RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0 R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8 FS: 0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0 CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0 Stack: ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001 ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150 Call Trace: [<ffffffffa0012922>] tipc_server_stop+0xa2/0x1b0 [tipc] [<ffffffffa0009995>] tipc_subscr_stop+0x15/0x20 [tipc] [<ffffffffa00130f5>] tipc_core_stop+0x1d/0x33 [tipc] [<ffffffffa001f0d4>] tipc_init+0xd4/0xf8 [tipc] [<ffffffffa001f000>] ? 0xffffffffa001efff [<ffffffff8100023f>] do_one_initcall+0x3f/0x150 [<ffffffff81082f4d>] ? __blocking_notifier_call_chain+0x7d/0xd0 [<ffffffff810cc58a>] load_module+0x11aa/0x19c0 [<ffffffff810c8d60>] ? show_initstate+0x50/0x50 [<ffffffff8190311c>] ? retint_restore_args+0xe/0xe [<ffffffff810cce79>] SyS_init_module+0xd9/0x110 [<ffffffff8190dc65>] sysenter_dispatch+0x7/0x1f Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24 b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 <4c> 8b 68 20 4d 8d a5 48 03 00 00 4c 89 e7 e8 04 05 8f e1 4c 89 RIP [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP <ffff88001374dc98> CR2: 0000000000000020 ---[ end trace b02321f40e4269a3 ]--- We have the following call chain: tipc_core_start() ret = tipc_subscr_start() ret = tipc_server_start(){ server->enabled = 1; ret = tipc_open_listening_sock() } I.e., the server->enabled flag is unconditionally set to 1, whatever the return value of tipc_open_listening_sock(). This causes a crash when tipc_core_start() tries to clean up resources after a failed initialization: if (ret == failed) tipc_subscr_stop() tipc_server_stop(){ if (server->enabled) tipc_close_conn(){ NULL reference of con->sock-sk OOPS! } } To avoid this, tipc_server_start() should only set server->enabled to 1 in case of a succesful socket creation. In case of failure, it should release all allocated resources before returning. Problem introduced in commit c5fa7b3cf3cb22e4ac60485fc2dc187fe012910f ("tipc: introduce new TIPC server infrastructure") in v3.11-rc1. Note that it won't be seen often; it takes a module load under memory constrained conditions in order to trigger the failure condition. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 19:29:18 +07:00
}
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
tipc_register_callbacks(sock, con);
return 0;
}
static struct outqueue_entry *tipc_alloc_entry(void *data, int len)
{
struct outqueue_entry *entry;
void *buf;
entry = kmalloc(sizeof(struct outqueue_entry), GFP_ATOMIC);
if (!entry)
return NULL;
buf = kmemdup(data, len, GFP_ATOMIC);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
if (!buf) {
kfree(entry);
return NULL;
}
entry->iov.iov_base = buf;
entry->iov.iov_len = len;
return entry;
}
static void tipc_free_entry(struct outqueue_entry *e)
{
kfree(e->iov.iov_base);
kfree(e);
}
static void tipc_clean_outqueues(struct tipc_conn *con)
{
struct outqueue_entry *e, *safe;
spin_lock_bh(&con->outqueue_lock);
list_for_each_entry_safe(e, safe, &con->outqueue, list) {
list_del(&e->list);
tipc_free_entry(e);
}
spin_unlock_bh(&con->outqueue_lock);
}
int tipc_conn_sendmsg(struct tipc_server *s, int conid,
struct sockaddr_tipc *addr, void *data, size_t len)
{
struct outqueue_entry *e;
struct tipc_conn *con;
con = tipc_conn_lookup(s, conid);
if (!con)
return -EINVAL;
e = tipc_alloc_entry(data, len);
if (!e) {
conn_put(con);
return -ENOMEM;
}
if (addr)
memcpy(&e->dest, addr, sizeof(struct sockaddr_tipc));
spin_lock_bh(&con->outqueue_lock);
list_add_tail(&e->list, &con->outqueue);
spin_unlock_bh(&con->outqueue_lock);
if (test_bit(CF_CONNECTED, &con->flags)) {
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
if (!queue_work(s->send_wq, &con->swork))
conn_put(con);
} else {
conn_put(con);
}
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
return 0;
}
void tipc_conn_terminate(struct tipc_server *s, int conid)
{
struct tipc_conn *con;
con = tipc_conn_lookup(s, conid);
if (con) {
tipc_close_conn(con);
conn_put(con);
}
}
static void tipc_send_to_sock(struct tipc_conn *con)
{
int count = 0;
struct tipc_server *s = con->server;
struct outqueue_entry *e;
struct msghdr msg;
int ret;
spin_lock_bh(&con->outqueue_lock);
while (1) {
e = list_entry(con->outqueue.next, struct outqueue_entry,
list);
if ((struct list_head *) e == &con->outqueue)
break;
spin_unlock_bh(&con->outqueue_lock);
memset(&msg, 0, sizeof(msg));
msg.msg_flags = MSG_DONTWAIT;
if (s->type == SOCK_DGRAM || s->type == SOCK_RDM) {
msg.msg_name = &e->dest;
msg.msg_namelen = sizeof(struct sockaddr_tipc);
}
ret = kernel_sendmsg(con->sock, &msg, &e->iov, 1,
e->iov.iov_len);
if (ret == -EWOULDBLOCK || ret == 0) {
cond_resched();
goto out;
} else if (ret < 0) {
goto send_err;
}
/* Don't starve users filling buffers */
if (++count >= MAX_SEND_MSG_COUNT) {
cond_resched();
count = 0;
}
spin_lock_bh(&con->outqueue_lock);
list_del(&e->list);
tipc_free_entry(e);
}
spin_unlock_bh(&con->outqueue_lock);
out:
return;
send_err:
tipc_close_conn(con);
}
static void tipc_recv_work(struct work_struct *work)
{
struct tipc_conn *con = container_of(work, struct tipc_conn, rwork);
int count = 0;
while (test_bit(CF_CONNECTED, &con->flags)) {
if (con->rx_action(con))
break;
/* Don't flood Rx machine */
if (++count >= MAX_RECV_MSG_COUNT) {
cond_resched();
count = 0;
}
}
conn_put(con);
}
static void tipc_send_work(struct work_struct *work)
{
struct tipc_conn *con = container_of(work, struct tipc_conn, swork);
if (test_bit(CF_CONNECTED, &con->flags))
tipc_send_to_sock(con);
conn_put(con);
}
static void tipc_work_stop(struct tipc_server *s)
{
destroy_workqueue(s->rcv_wq);
destroy_workqueue(s->send_wq);
}
static int tipc_work_start(struct tipc_server *s)
{
s->rcv_wq = alloc_ordered_workqueue("tipc_rcv", 0);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
if (!s->rcv_wq) {
pr_err("can't start tipc receive workqueue\n");
return -ENOMEM;
}
s->send_wq = alloc_ordered_workqueue("tipc_send", 0);
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
if (!s->send_wq) {
pr_err("can't start tipc send workqueue\n");
destroy_workqueue(s->rcv_wq);
return -ENOMEM;
}
return 0;
}
int tipc_server_start(struct tipc_server *s)
{
int ret;
spin_lock_init(&s->idr_lock);
idr_init(&s->conn_idr);
s->idr_in_use = 0;
s->rcvbuf_cache = kmem_cache_create(s->name, s->max_rcvbuf_size,
0, SLAB_HWCACHE_ALIGN, NULL);
if (!s->rcvbuf_cache)
return -ENOMEM;
ret = tipc_work_start(s);
if (ret < 0) {
kmem_cache_destroy(s->rcvbuf_cache);
return ret;
}
tipc: fix oops when creating server socket fails When creation of TIPC internal server socket fails, we get an oops with the following dump: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] PGD 13719067 PUD 12008067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: tipc(+) CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000 RIP: 0010:[<ffffffffa0011f49>] [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP: 0018:ffff88001374dc98 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000 RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0 R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8 FS: 0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0 CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0 Stack: ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001 ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150 Call Trace: [<ffffffffa0012922>] tipc_server_stop+0xa2/0x1b0 [tipc] [<ffffffffa0009995>] tipc_subscr_stop+0x15/0x20 [tipc] [<ffffffffa00130f5>] tipc_core_stop+0x1d/0x33 [tipc] [<ffffffffa001f0d4>] tipc_init+0xd4/0xf8 [tipc] [<ffffffffa001f000>] ? 0xffffffffa001efff [<ffffffff8100023f>] do_one_initcall+0x3f/0x150 [<ffffffff81082f4d>] ? __blocking_notifier_call_chain+0x7d/0xd0 [<ffffffff810cc58a>] load_module+0x11aa/0x19c0 [<ffffffff810c8d60>] ? show_initstate+0x50/0x50 [<ffffffff8190311c>] ? retint_restore_args+0xe/0xe [<ffffffff810cce79>] SyS_init_module+0xd9/0x110 [<ffffffff8190dc65>] sysenter_dispatch+0x7/0x1f Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24 b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 <4c> 8b 68 20 4d 8d a5 48 03 00 00 4c 89 e7 e8 04 05 8f e1 4c 89 RIP [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP <ffff88001374dc98> CR2: 0000000000000020 ---[ end trace b02321f40e4269a3 ]--- We have the following call chain: tipc_core_start() ret = tipc_subscr_start() ret = tipc_server_start(){ server->enabled = 1; ret = tipc_open_listening_sock() } I.e., the server->enabled flag is unconditionally set to 1, whatever the return value of tipc_open_listening_sock(). This causes a crash when tipc_core_start() tries to clean up resources after a failed initialization: if (ret == failed) tipc_subscr_stop() tipc_server_stop(){ if (server->enabled) tipc_close_conn(){ NULL reference of con->sock-sk OOPS! } } To avoid this, tipc_server_start() should only set server->enabled to 1 in case of a succesful socket creation. In case of failure, it should release all allocated resources before returning. Problem introduced in commit c5fa7b3cf3cb22e4ac60485fc2dc187fe012910f ("tipc: introduce new TIPC server infrastructure") in v3.11-rc1. Note that it won't be seen often; it takes a module load under memory constrained conditions in order to trigger the failure condition. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 19:29:18 +07:00
ret = tipc_open_listening_sock(s);
if (ret < 0) {
tipc_work_stop(s);
kmem_cache_destroy(s->rcvbuf_cache);
return ret;
}
return ret;
tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 21:54:39 +07:00
}
void tipc_server_stop(struct tipc_server *s)
{
struct tipc_conn *con;
int total = 0;
int id;
spin_lock_bh(&s->idr_lock);
for (id = 0; total < s->idr_in_use; id++) {
con = idr_find(&s->conn_idr, id);
if (con) {
total++;
spin_unlock_bh(&s->idr_lock);
tipc_close_conn(con);
spin_lock_bh(&s->idr_lock);
}
}
spin_unlock_bh(&s->idr_lock);
tipc_work_stop(s);
kmem_cache_destroy(s->rcvbuf_cache);
idr_destroy(&s->conn_idr);
}