summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-09-29ip6tnl: percpu stats accountingEric Dumazet
Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ipip: enable lockless xmitsEric Dumazet
IPIP tunnels can benefit from lockless xmits, using NETIF_F_LLTX Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one ipip tunnel (size:200 bytes per frame) Before patch : real 2m53.321s user 0m10.277s sys 46m0.597s After patch: real 0m32.063s user 0m9.237s sys 8m16.255s Last problem to solve is the contention on dst : 16118.00 28.3% __ip_route_output_key vmlinux 6135.00 10.8% dst_release vmlinux 3220.00 5.6% ip_finish_output vmlinux 2149.00 3.8% ip_route_output_flow vmlinux 1575.00 2.8% ip_append_data vmlinux 1481.00 2.6% ip_push_pending_frames vmlinux 1349.00 2.4% __xfrm_lookup vmlinux 1216.00 2.1% csum_partial_copy_generic vmlinux 1208.00 2.1% udp_sendmsg vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ip_gre: lockless xmitEric Dumazet
GRE tunnels can benefit from lockless xmits, using NETIF_F_LLTX Note: If tunnels are created with the "oseq" option, LLTX is not enabled : Even using an atomic_t o_seq, we would increase chance for packets being out of order at receiver. Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one gre tunnel (size:200 bytes per frame) Before patch : real 3m0.094s user 0m9.365s sys 47m50.103s After patch: real 0m29.756s user 0m11.097s sys 7m33.012s Last problem to solve is the contention on dst : 38660.00 21.4% __ip_route_output_key vmlinux 20786.00 11.5% dst_release vmlinux 14191.00 7.8% __xfrm_lookup vmlinux 12410.00 6.9% ip_finish_output vmlinux 4540.00 2.5% ip_push_pending_frames vmlinux 4427.00 2.4% ip_append_data vmlinux 4265.00 2.4% __alloc_skb vmlinux 4140.00 2.3% __ip_local_out vmlinux 3991.00 2.2% dev_queue_xmit vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29sit: enable lockless xmitsEric Dumazet
SIT tunnels can benefit from lockless xmits, using NETIF_F_LLTX Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one sit tunnel (size:220 bytes per frame) Before patch : real 3m15.399s user 0m9.185s sys 51m55.403s 75029.00 87.5% _raw_spin_lock vmlinux 1090.00 1.3% dst_release vmlinux 902.00 1.1% dev_queue_xmit vmlinux 627.00 0.7% sock_wfree vmlinux 613.00 0.7% ip6_push_pending_frames ipv6.ko 505.00 0.6% __ip_route_output_key vmlinux After patch: real 1m1.387s user 0m12.489s sys 15m58.868s 28239.00 23.3% dst_release vmlinux 13570.00 11.2% ip6_push_pending_frames ipv6.ko 13118.00 10.8% ip6_append_data ipv6.ko 7995.00 6.6% __ip_route_output_key vmlinux 7924.00 6.5% sk_dst_check vmlinux 5015.00 4.1% udpv6_sendmsg ipv6.ko 3594.00 3.0% sock_alloc_send_pskb vmlinux 3135.00 2.6% sock_wfree vmlinux 3055.00 2.5% ip6_sk_dst_lookup ipv6.ko 2473.00 2.0% ip_finish_output vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29sit: fix percpu stats accountingEric Dumazet
commit 15fc1f7056ebd (sit: percpu stats accounting) forgot the fallback tunnel case (sit0), and can crash pretty fast. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29ipip: fix percpu stats accountingEric Dumazet
commit 3c97af99a5aa1 (ipip: percpu stats accounting) forgot the fallback tunnel case (tunl0), and can crash pretty fast. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29dummy: percpu stats and lockless xmitEric Dumazet
Converts dummy network device driver to : - percpu stats - 64bit stats - lockless xmit (NETIF_F_LLTX) - performance features added (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29net: add a recursion limit in xmit pathEric Dumazet
As tunnel devices are going to be lockless, we need to make sure a misconfigured machine wont enter an infinite loop. Add a percpu variable, and limit to three the number of stacked xmits. Reported-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28ipv6: Implement Any-IP support for IPv6.Maciej Żenczykowski
AnyIP is the capability to receive packets and establish incoming connections on IPs we have not explicitly configured on the machine. An example use case is to configure a machine to accept all incoming traffic on eth0, and leave the policy of whether traffic for a given IP should be delivered to the machine up to the load balancer. Can be setup as follows: ip -6 rule from all iif eth0 lookup 200 ip -6 route add local default dev lo table 200 (in this case for all IPv6 addresses) Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28ipv4: Allow configuring subnets as local addressesTom Herbert
This patch allows a host to be configured to respond to any address in a specified range as if it were local, without actually needing to configure the address on an interface. This is done through routing table configuration. For instance, to configure a host to respond to any address in 10.1/16 received on eth0 as a local address we can do: ip rule add from all iif eth0 lookup 200 ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200 This host is now reachable by any 10.1/16 address (route lookup on input for packets received on eth0 can find the route). On output, the rule will not be matched so that this host can still send packets to 10.1/16 (not sent on loopback). Presumably, external routing can be configured to make sense out of this. To make this work, we needed to modify the logic in finding the interface which is assigned a given source address for output (dev_ip_find). We perform a normal fib_lookup instead of just a lookup on the local table, and in the lookup we ignore the input interface for matching. This patch is useful to implement IP-anycast for subnets of virtual addresses. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28pch_gbe: add header filesRandy Dunlap
Fix build errors, more header files needed. drivers/net/pch_gbe/pch_gbe_main.c:965: error: implicit declaration of function 'tcp_hdr' drivers/net/pch_gbe/pch_gbe_main.c:965: error: invalid type argument of '->' (have 'int') drivers/net/pch_gbe/pch_gbe_main.c:968: error: invalid type argument of '->' (have 'int') drivers/net/pch_gbe/pch_gbe_main.c:976: error: implicit declaration of function 'udp_hdr' drivers/net/pch_gbe/pch_gbe_main.c:976: error: invalid type argument of '->' (have 'int') drivers/net/pch_gbe/pch_gbe_main.c:980: error: invalid type argument of '->' (have 'int') Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27Documentation: Update Phonet doc for Pipe Controller implementationKumar Sanghvi
Updates the Phonet document with description related to Pipe controller implementation Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27tg3: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-278021q: Use netif_copy_real_num_queues() to set queue countsBen Hutchings
This covers RX if necessary, as well as TX. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27sfc: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27niu: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27myri10ge: Use netif_set_real_num_{rx, tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27mv643xx_eth: Use netif_set_real_num_{rx, tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27mlx4_en: Use netif_set_real_num_{rx, tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27ixgbe: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27igb: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27gianfar: Use netif_set_real_num_rx_queues()Ben Hutchings
Do not set num_tx_queues or real_num_tx_queues, since alloc_etherdev_mq() does that. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27cxgb4vf: Use netif_set_real_num_{rx, tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27cxgb4: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27cxgb3: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27bnx2x: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27bnx2: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27net: Add netif_copy_real_num_queues() for use by virtual net driversBen Hutchings
This sets the active numbers of queues on a net device to match another. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27net: Allow changing number of RX queues after device allocationBen Hutchings
For RPS, we create a kobject for each RX queue based on the number of queues passed to alloc_netdev_mq(). However, drivers generally do not determine the numbers of hardware queues to use until much later, so this usually represents the maximum number the driver may use and not the actual number in use. For TX queues, drivers can update the actual number using netif_set_real_num_tx_queues(). Add a corresponding function for RX queues, netif_set_real_num_rx_queues(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27net: sk_{detach|attach}_filter() rcu fixesEric Dumazet
sk_attach_filter() and sk_detach_filter() are run with socket locked. Use the appropriate rcu_dereference_protected() instead of blocking BH, and rcu_dereference_bh(). There is no point adding BH prevention and memory barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27fib: use atomic_inc_not_zero() in fib_rules_lookupEric Dumazet
It seems we dont use appropriate refcount increment in an rcu_read_lock() protected section. fib_rule_get() might increment a null refcount and bad things could happen. While fib_nl_delrule() respects an rcu grace period before calling fib_rule_put(), fib_rules_cleanup_ops() calls fib_rule_put() without a grace period. Note : after this patch, we might avoid the synchronize_rcu() call done in fib_nl_delrule() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27sit: percpu stats accountingEric Dumazet
Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27ipip: percpu stats accountingEric Dumazet
Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27ip_gre: percpu stats accountingEric Dumazet
Le lundi 27 septembre 2010 à 14:29 +0100, Ben Hutchings a écrit : > > diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c > > index 5d6ddcb..de39b22 100644 > > --- a/net/ipv4/ip_gre.c > > +++ b/net/ipv4/ip_gre.c > [...] > > @@ -377,7 +405,7 @@ static struct ip_tunnel *ipgre_tunnel_locate(struct net *net, > > if (parms->name[0]) > > strlcpy(name, parms->name, IFNAMSIZ); > > else > > - sprintf(name, "gre%%d"); > > + strcpy(name, "gre%d"); > > > > dev = alloc_netdev(sizeof(*t), name, ipgre_tunnel_setup); > > if (!dev) > [...] > > This is a valid fix, but doesn't belong in this patch! > Sorry ? It was not a fix, but at most a cleanup ;) Anyway I forgot the gretap case... [PATCH 2/4 v2] ip_gre: percpu stats accounting Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27tunnels: prepare percpu accountingEric Dumazet
Tunnels are going to use percpu for their accounting. They are going to use a new tstats field in net_device. skb_tunnel_rx() is changed to be a wrapper around __skb_tunnel_rx() IPTUNNEL_XMIT() is changed to be a wrapper around __IPTUNNEL_XMIT() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27vlan: use this_cpu_ptr() in vlan_skb_recv()Eric Dumazet
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27Phonet: Implement Pipe Controller to support Nokia Slim ModemsKumar Sanghvi
Phonet stack assumes the presence of Pipe Controller, either in Modem or on Application Processing Engine user-space for the Pipe data. Nokia Slim Modems like WG2.5 used in ST-Ericsson U8500 platform do not implement Pipe controller in them. This patch adds Pipe Controller implemenation to Phonet stack to support Pipe data over Phonet stack for Nokia Slim Modems. Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/qlcnic/qlcnic_init.c net/ipv4/ip_output.c
2010-09-26net: r6040: store BIOS default MAC in perm_addOtavio Salvador
Signed-off-by: Otavio Salvador <otavio@ossystems.com.br> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26ipv6: add a missing unregister_pernet_subsys callNeil Horman
Clean up a missing exit path in the ipv6 module init routines. In addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys for the ipv6_addr_label_ops structure. But if module loading fails, or if the ipv6 module is removed, there is no corresponding unregister_pernet_subsys call, which leaves a now-bogus address on the pernet_list, leading to oopses in subsequent registrations. This patch cleans up both the failed load path and the unload path. Tested by myself with good results. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/net/addrconf.h | 1 + net/ipv6/addrconf.c | 11 ++++++++--- net/ipv6/addrlabel.c | 5 +++++ 3 files changed, 14 insertions(+), 3 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26net: loopback driver cleanupEric Dumazet
loopback driver uses dev->ml_priv to store its percpu stats pointer. It uses ugly casts "(void __percpu __force *)" to shut up sparse complains. Define an union to better document we use ml_priv in loopback driver and define a lstats field with appropriate types. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26net: fix rcu use in ip_route_output_slowEric Dumazet
__in_dev_get_rtnl(dev_out) is called while RTNL is not held, thus triggers a lockdep fault. At this point, we only perform a raw test of dev_out->ip_ptr being NULL, we dont need to make sure ip_ptr cant changed right after. We can use rcu_dereference_raw() for this. Reported-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26rps: allocate rx queues in register_netdevice onlyEric Dumazet
Instead of having two places were we allocate dev->_rx, introduce netif_alloc_rx_queues() helper and call it only from register_netdevice(), not from alloc_netdev_mq() Goal is to let drivers change dev->num_rx_queues after allocating netdev and before registering it. This also removes a lot of ifdefs in net/core/dev.c Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26s390: use free_netdev(netdev) instead of kfree()Vasiliy Kulikov
Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26sgiseeq: use free_netdev(netdev) instead of kfree()Kulikov Vasiliy
Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26rionet: use free_netdev(netdev) instead of kfree()Kulikov Vasiliy
Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26ibm_newemac: use free_netdev(netdev) instead of kfree()Kulikov Vasiliy
Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26net: update SOCK_MIN_RCVBUFEric Dumazet
SOCK_MIN_RCVBUF current value is 256 bytes It doesnt permit to receive the smallest possible frame, considering socket sk_rmem_alloc/sk_rcvbuf account skb truesizes. On 64bit arches, sizeof(struct sk_buff) is 240 bytes. Add the typical 64 bytes of headroom, and we go over the limit. With old kernels and 32bit arches, we were under the limit, if netdriver was doing copybreak. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26smsc911x: Add MODULE_ALIAS()Vincent Stehlé
This enables auto loading for the smsc911x ethernet driver. Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26net: reset skb queue mapping when rx'ing over tunnelTom Herbert
Reset queue mapping when an skb is reentering the stack via a tunnel. On second pass, the queue mapping from the original device is no longer valid. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>