Age  Commit message  Author
2017-01-13tcp: remove early retransmitYuchung Cheng
This patch removes the support of RFC5827 early retransmit (i.e., fast recovery on small inflight with <3 dupacks) because it is subsumed by the new RACK loss detection. More specifically when RACK receives DUPACKs, it'll arm a reordering timer to start fast recovery after a quarter of (min)RTT, hence it covers the early retransmit except RACK does not limit itself to specific inflight or dupack numbers. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13tcp: remove forward retransmit featureYuchung Cheng
Forward retransmit is an esoteric feature in RFC3517 (condition (3) in NextSeg()). Basically, if a packet is not considered lost by the current criteria (# of dupacks etc.), but the congestion window has room for more packets, then this packet is retransmitted. However, it actually conflicts with the rest of the recovery design. For example, when reordering is detected we want to be conservative in retransmitting packets, but the forward-retransmit feature would break that by forcing more retransmissions. Also, the implementation is fairly complicated inside the retransmission logic, inducing extra iterations over the write queue. With RACK, losses are detected in a timely manner and this heuristic is no longer necessary. Therefore this patch removes the feature. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13tcp: extend F-RTO to catch more spurious timeoutsYuchung Cheng
Current F-RTO reverts the cwnd reset whenever a never-retransmitted packet was (s)acked. The timeout can be declared spurious because the packets acknowledged with this ACK were transmitted before the timeout, so clearly not all the packets were lost and the cwnd reset was not warranted. This nice detection does not really depend on F-RTO internals. This patch applies the detection universally. On Google servers this change detected 20% more spurious timeouts. Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
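The detection rule described above can be modelled in a few lines. This is only an illustrative Python sketch, not the kernel code: the AckedPacket record, its fields, and the timeout_looks_spurious() helper are hypothetical names introduced here, and the kernel tracks this state with per-skb flags rather than objects.

```python
from dataclasses import dataclass


@dataclass
class AckedPacket:
    """Hypothetical record for a packet newly (s)acked by the current ACK."""
    sent_time: float      # when the packet was (last) transmitted
    retransmitted: bool   # True if the packet was ever retransmitted


def timeout_looks_spurious(newly_acked, timeout_time):
    """Return True if some never-retransmitted packet sent before the
    timeout has now been (s)acked, i.e. not everything was lost."""
    return any(
        not p.retransmitted and p.sent_time < timeout_time
        for p in newly_acked
    )
```

With a timeout at t=2.0, an ACK covering an original transmission from t=1.0 marks the timeout as spurious, while an ACK covering only retransmitted packets does not.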
2017-01-13tcp: enable RACK loss detection to trigger recoveryYuchung Cheng
This patch changes two things: 1. Start fast recovery with RACK in addition to other heuristics (e.g., DUPACK threshold, FACK). Prior to this change RACK was enabled to detect losses only after the recovery had been started by other algorithms. 2. Disable TCP early retransmit. RACK subsumes the early retransmit with the new reordering timer feature. A later patch in this series removes the early retransmit code. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13tcp: check undo conditions before detecting lossesYuchung Cheng
Currently RACK would mark loss before the undo operations in TCP loss recovery. This could incorrectly identify real losses as spurious. For example, a sender first experiences a delay spike and then eventually some packets are lost due to buffer overrun. In this case, the sender should perform fast recovery because not all the packets were lost. But the sender may first trigger a (spurious) RTO and reset cwnd to 1. The following ACKs may be used to mark real losses by tcp_rack_mark_lost. Then in tcp_process_loss this ACK could trigger the F-RTO undo condition, unmark real losses and revert the cwnd reduction. If there are no more ACKs coming back, eventually the sender would time out again instead of performing fast recovery. The patch fixes this incorrect process by always performing the undo checks before detecting losses. Fixes: 4f41b1c58a32 ("tcp: use RACK to detect losses") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13tcp: use sequence to break TS ties for RACK loss detectionYuchung Cheng
The packets inside a jumbo skb (e.g., TSO) share the same skb timestamp, even though they are sent sequentially on the wire. Since RACK is based on time, it cannot detect that some packets inside the same skb are lost. However, we can leverage the packet sequence numbers as extended timestamps to detect losses. Therefore, when the RACK timestamp is identical to the skb's timestamp (i.e., one of the packets of the skb is acked or sacked), we use the sequence numbers of the acked and unacked packets to break ties. We can use the same sequence logic to advance the RACK xmit time as well, to detect more losses and avoid timeouts. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
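The tie-breaking idea can be sketched as a comparison on (timestamp, sequence) pairs. The Python below is a model under assumptions, not the kernel implementation: seq_after() and sent_after() are names chosen for this sketch, and sequence numbers are assumed to be 32-bit values compared modulo 2**32.

```python
def seq_after(seq1: int, seq2: int) -> bool:
    """True if 32-bit sequence number seq1 is later than seq2,
    taking wraparound modulo 2**32 into account."""
    diff = (seq1 - seq2) & 0xFFFFFFFF
    return diff != 0 and diff < 0x80000000


def sent_after(t1: int, t2: int, seq1: int, seq2: int) -> bool:
    """True if packet 1 was sent after packet 2. Timestamps decide first;
    when they are equal (same skb), sequence numbers break the tie."""
    return t1 > t2 or (t1 == t2 and seq_after(seq1, seq2))
```

For two packets carved from the same TSO skb, sent_after(10, 10, 2000, 1000) is true while sent_after(10, 10, 1000, 2000) is false, so a (s)acked later packet can still expose an earlier unacked one as a loss candidate.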
2017-01-13tcp: add reordering timer in RACK loss detectionYuchung Cheng
This patch makes RACK install a reordering timer when it suspects some packets might be lost, but wants to delay the decision a little bit to accommodate reordering. It does not create a new timer but instead repurposes the existing RTO timer, because both are meant to retransmit packets. Specifically it arms a timer ICSK_TIME_REO_TIMEOUT when the RACK timing check fails. The wait time is set to

  RACK.RTT + RACK.reo_wnd - (NOW - Packet.xmit_time) + fudge

This translates to expecting that a packet (Packet) should take (RACK.RTT + RACK.reo_wnd + fudge) to deliver after it was sent. When there are multiple packets that need a timer, we use one timer with the maximum timeout. Therefore the timer conservatively uses the maximum window to expire N packets by one timeout, instead of N timeouts to expire N packets sent at different times. The fudge factor is 2 jiffies to ensure that when the timer fires, all the suspected packets would exceed the deadline and be marked lost by tcp_rack_detect_loss(). It has to be at least 1 jiffy because the clock may tick between calling icsk_reset_xmit_timer(timeout) and actually hanging the timer. The second jiffy is to lower-bound the timeout to 2 jiffies when reo_wnd is < 1ms. When the reordering timer fires (tcp_rack_reo_timeout): if we aren't in Recovery we'll enter fast recovery and force fast retransmit. This is very similar to the early retransmit (RFC5827) except RACK is not constrained to only enter recovery for small outstanding flights. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
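The wait-time formula and the "one timer with the maximum timeout" rule can be checked with a small calculation. This Python sketch is an assumption-laden model: reo_wait() and reo_timeout() are hypothetical helper names, and all quantities are expressed in one abstract time unit, whereas the kernel mixes microsecond timestamps and jiffies.

```python
FUDGE = 2  # fudge factor from the commit message (2 jiffies there)


def reo_wait(rack_rtt, reo_wnd, now, xmit_time, fudge=FUDGE):
    """Remaining wait for one suspected packet:
    RACK.RTT + RACK.reo_wnd - (NOW - Packet.xmit_time) + fudge."""
    return rack_rtt + reo_wnd - (now - xmit_time) + fudge


def reo_timeout(rack_rtt, reo_wnd, now, xmit_times):
    """One timer for all suspected packets: the maximum individual wait.
    Returns None when there is nothing to time."""
    waits = [reo_wait(rack_rtt, reo_wnd, now, t) for t in xmit_times]
    return max(waits) if waits else None
```

For example, with RTT 100, reo_wnd 25 and NOW 1000, packets sent at 950, 990 and 900 would need waits of 77, 117 and 27, so a single timer of 117 expires all three.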
2017-01-13tcp: record most recent RTT in RACK loss detectionYuchung Cheng
Record the most recent RTT in RACK. It is often identical to the "ca_rtt_us" values in tcp_clean_rtx_queue. But when the packet has been retransmitted, RACK chooses to believe the ACK is for the (latest) retransmitted packet if the RTT is over the minimum RTT. This requires passing the arrival time of the most recent ACK to RACK routines. The timestamp is now recorded in the "ack_time" in tcp_sacktag_state during the ACK processing. This patch does not change the RACK algorithm itself. It only adds the RTT variable to prepare for the next main patch. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13tcp: new helper for RACK to detect lossYuchung Cheng
Create a new helper tcp_rack_detect_loss to prepare the upcoming RACK reordering timer patch. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13tcp: new helper function for RACK loss detectionYuchung Cheng
Create a new helper tcp_rack_mark_skb_lost to prepare the upcoming RACK reordering timer support. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13liquidio: use fallback for selecting txqSatanand Burla
Remove assignment to ndo_select_queue so that fallback is used for selecting txq. Also remove the now-useless function that used to be assigned to ndo_select_queue. Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-13net: dsa: mv88e6xxx: add EEPROM support to 6390Vivien Didelot
The Marvell 6352 chip has an 8-bit address/16-bit data EEPROM access. The Marvell 6390 chip has a 16-bit address/8-bit data EEPROM access. This patch implements the 8-bit data EEPROM access in the mv88e6xxx driver and adds its support to chips of the 6390 family. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12ipv6: sr: static percpu allocation for hmac_ringEric Dumazet
Current allocations are not NUMA aware, and lack proper cleanup in case of error. It is perfectly fine to use static per cpu allocations for 256 bytes per cpu. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Lebrun <david.lebrun@uclouvain.be> Acked-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12ipmr: improve hash scalabilityNikolay Aleksandrov
Recently we started using ipmr with thousands of entries and easily hit soft lockups on smaller devices. The reason is that the hash function uses the high order bits from the src and dst, but those don't change in many common cases; also the hash table is only 64 elements, so with thousands of entries it doesn't scale at all. This patch migrates the hash table to rhashtable, and in particular the rhl interface which allows for duplicate elements to be chained because of the MFC_PROXY support (*,G; *,*,oif cases) which allows for multiple duplicate entries to be added with different interfaces (IMO wrong, but it's been in for a long time). And here are some results from tests I've run in a VM:

mr_table size (default, allocated for all namespaces):
  Before         After
  49304 bytes    2400 bytes

Add 65000 routes (the diff is much larger on smaller devices):
  Before    After
  1m42s     58s

Forwarding 256 byte packets with 65000 routes (test done in a VM):
  Before                After
  3 Mbps / ~1465 pps    122 Mbps / ~59000 pps

As a bonus we no longer see the soft lockups on smaller devices which showed up even with 2000 entries before. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
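The scaling problem can be illustrated with a toy model. The Python below is a sketch under stated assumptions: high_bits_hash() and all_bits_hash() are hypothetical stand-ins, not the old ipmr hash or the rhashtable hash, and the route set (one source, 2000 consecutive group addresses) is made up for the demonstration.

```python
import ipaddress
from collections import Counter

BUCKETS = 64  # fixed table size mentioned in the commit message


def high_bits_hash(src: int, dst: int) -> int:
    """Hypothetical hash that only looks at the top byte of each address."""
    return ((src >> 24) ^ (dst >> 24)) % BUCKETS


def all_bits_hash(src: int, dst: int) -> int:
    """Hypothetical contrast: a hash in which the low-order bits matter too."""
    return (src ^ dst) % BUCKETS


def longest_chain(routes, hash_fn) -> int:
    """Length of the longest bucket chain for the given (src, dst) routes."""
    counts = Counter(hash_fn(s, d) for s, d in routes)
    return max(counts.values())


# Made-up workload: one source, 2000 consecutive multicast groups.
src = int(ipaddress.IPv4Address("10.0.0.1"))
base = int(ipaddress.IPv4Address("239.1.0.0"))
routes = [(src, base + i) for i in range(2000)]
```

With this workload every route shares the same high-order bits, so longest_chain(routes, high_bits_hash) is 2000: every lookup walks one long list. Even the better-mixing hash still leaves about 32 entries per bucket in a 64-slot table, which is why a resizable table helps.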
2017-01-12secure_seq: fix sparse errorsEric Dumazet
Fixes the following warnings:

net/core/secure_seq.c:125:28: warning: incorrect type in argument 1 (different base types)
net/core/secure_seq.c:125:28: expected unsigned int const [unsigned] [usertype] a
net/core/secure_seq.c:125:28: got restricted __be32 [usertype] saddr
net/core/secure_seq.c:125:35: warning: incorrect type in argument 2 (different base types)
net/core/secure_seq.c:125:35: expected unsigned int const [unsigned] [usertype] b
net/core/secure_seq.c:125:35: got restricted __be32 [usertype] daddr
net/core/secure_seq.c:125:43: warning: cast from restricted __be16
net/core/secure_seq.c:125:61: warning: restricted __be16 degrades to integer

Fixes: 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12liquidio VF: reduce load time of modulePrasad Kanneganti
Reduce the load time of the VF driver by decreasing the wait time between iterations of the loop that polls for a mailbox response from the PF. Also change the wait time units from jiffies to milliseconds. Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12liquidio: remove unnecessary codeFelix Manlunas
Remove code that's no longer needed. It used to serve a purpose, which was to fix a link-related bug. For a while now, the NIC firmware has had a more elegant fix for that bug. Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12tilepro: Fix non-void return from void functionJoe Perches
commit bc1f44709cf2 ("net: make ndo_get_stats64 a void function") mistakenly used a return value for this void conversion. Fix it. Signed-off-by: Joe Perches <joe@perches.com> cc: stephen hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12Merge branch 'mdio-gpio-next'David S. Miller
Florian Fainelli says: ==================== net: mdio-gpio: Use modern GPIO helpers This patch series modernizes the mdio-gpio and makes it switch to the latest and greatest API for manipulating GPIO lines, thus allowing some simplifications in the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12net: mdio-gpio: Use gpio subsystem to handle low-active pinsGuenter Roeck
gpiod functions support handling low-active pins, so we can move this code out of this driver into the gpio subsystem and simplify the code a bit. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12net: mdio-gpio: Convert to use gpiod functions where possibleGuenter Roeck
Using gpiod functions lets us use functionality which is not available with gpio functions. There is no gpiod function to match devm_gpio_request_one, so leave it in place and use gpio_to_desc() to convert absolute pin numbers to gpio descriptors. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12net: mdio-gpio: Use devm_gpio_request_one instead of devm_gpio_requestGuenter Roeck
Using devm_gpio_request_one lets us request gpio pins with initial state in one go. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12cdc-ether: usbnet_cdc_zte_status() can be staticWei Yongjun
Fixes the following sparse warning: drivers/net/usb/cdc_ether.c:469:6: warning: symbol 'usbnet_cdc_zte_status' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12tools: psock_lib: harden socket filter used by psock testsSowmini Varadhan
The filter added by sock_setfilter is intended to only permit packets matching the pattern set up by create_payload(), but we only check the ip_len and a single test-character in the IP packet to ensure this condition. Harden the filter by adding additional constraints so that we only permit UDP/IPv4 packets that meet the ip_len and test-character requirements. Include the bpf_asm src as a comment, in case this needs to be enhanced in the future. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
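The hardened predicate can be expressed in plain Python to show what the extra constraints buy. This is a sketch, not the BPF program from the patch: accept() and make_frame() are hypothetical helpers, the frame is assumed to start with a 14-byte Ethernet header followed by an IPv4 header without options, and the test-character offset is passed in because the commit message does not state it.

```python
import struct

ETH_P_IP = 0x0800    # EtherType for IPv4
IPPROTO_UDP = 17     # IPv4 protocol number for UDP
ETH_HLEN = 14        # Ethernet header length


def accept(frame: bytes, want_ip_len: int, char_off: int, want_char: int) -> bool:
    """Permit only UDP/IPv4 frames with the expected ip_len and test char."""
    if len(frame) < ETH_HLEN + 20 or char_off >= len(frame):
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    (ip_len,) = struct.unpack_from("!H", frame, ETH_HLEN + 2)
    proto = frame[ETH_HLEN + 9]
    return (ethertype == ETH_P_IP and proto == IPPROTO_UDP
            and ip_len == want_ip_len and frame[char_off] == want_char)


def make_frame(proto: int, payload: bytes) -> bytes:
    """Build a minimal Ethernet + IPv4 + 8-byte L4 header frame for testing."""
    ip_len = 20 + 8 + len(payload)
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, ip_len, 0, 0, 64, proto, 0,
                     bytes(4), bytes(4))
    l4 = struct.pack("!HHHH", 1, 2, 8 + len(payload), 0)
    return eth + ip + l4 + payload
```

A frame built with make_frame(IPPROTO_UDP, b"a" * 10) passes accept(frame, 38, 42, ord("a")), whereas the same bytes carried with protocol 6 (TCP) are rejected even though ip_len and the test character still match.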
2017-01-12lwt_bpf: bpf_lwt_prog_cmp() can be staticWei Yongjun
Fixes the following sparse warning: net/core/lwt_bpf.c:355:5: warning: symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12Merge branch 's390-qeth-next'David S. Miller
Ursula Braun says: ==================== s390: qeth patches yesterday I came up with 13 qeth patches. Since you have not been happy with the 13th patch, I want to make sure that at least the remaining 12 qeth patches can be applied to net-next. Here is the resend of them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: fix retrieval of vipa and proxy-arp addressesUrsula Braun
qeth devices in layer3 mode need a separate handling of vipa and proxy-arp addresses. vipa and proxy-arp addresses processed by qeth can be read from userspace. Since commit 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback"), the retrieval of vipa and proxy-arp addresses is broken if more than one vipa or proxy-arp address is set. The qeth code used the local variable "int i" for 2 different purposes. This patch now uses 2 separate local variables of type "int". While touching these functions, hash_for_each_safe() is converted to hash_for_each(), since there is no removal of hash entries. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reference-ID: RQM 3524 Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: issue STARTLAN as first IPA commandJulian Wiedmann
STARTLAN needs to be the first IPA command after MPC initialization completes. So move the qeth_send_startlan() call from the layer disciplines into the core path, right after the MPC handshake. While at it, replace the magic LAN OFFLINE return code with the existing enum. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: shuffle MAC management functions aroundJulian Wiedmann
Move all MAC utility functions in one place, and drop the forward declarations. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: extract qeth_l2_remove_mac()Julian Wiedmann
This matches qeth_l2_write_mac(). Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: consolidate errno translationJulian Wiedmann
Consolidate errno handling for MAC management: Instead of doing this in every caller, do it in one place. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Suggested-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: don't convert return code twiceJulian Wiedmann
qeth_l2_send_groupmac() already translates the return code, so calling qeth_setdel_makerc() a second time only produces garbage. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
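Why a second translation yields garbage can be seen with a toy translator. In this Python sketch the hardware return codes and the translate() function are hypothetical (the real IPA codes and the body of qeth_setdel_makerc() are not given in the commit message); only the errno values come from the standard errno module.

```python
import errno

# Hypothetical hardware return codes, for illustration only.
HW_OK = 0x00
HW_DUPLICATE = 0x10
HW_NOT_FOUND = 0x11


def translate(rc: int) -> int:
    """Map a hardware return code to a negative errno value.
    Anything unrecognised becomes -EIO."""
    table = {
        HW_OK: 0,
        HW_DUPLICATE: -errno.EEXIST,
        HW_NOT_FOUND: -errno.ENOENT,
    }
    return table.get(rc, -errno.EIO)
```

translate(HW_DUPLICATE) gives -EEXIST, but feeding that result back in, translate(translate(HW_DUPLICATE)), gives -EIO: the already-translated errno is not a known hardware code, so the specific error is lost.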
2017-01-12s390/qeth: drop qeth_l2_del_all_macs() parameterJulian Wiedmann
The only caller passes del = 0, so remove both the parameter and the code that handles != 0. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: Remove QETH_IP_HEADER_SIZEJulian Wiedmann
Remove unused define QETH_IP_HEADER_SIZE. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: Allow reading hsuid in state DOWNJulian Wiedmann
Accessing the current hsuid via card->options.hsuid is perfectly fine, even when the card is DOWN. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: display warning for OSA3 RX/TX checksum offloadingThomas Richter
When RX/TX checksum offloading is turned on and the adapter is an OSA 3 card in layer 3 mode, the checksum offloading is only performed when both peers use different adapters. If both peers share an OSA 3 card, communication is a memory copy and checksum offloading is not performed. This patch adds a warning to inform the administrator. OSA 3 in layer 2 mode does not offer the RX/TX checksum offload feature. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12s390/qeth: test RX/TX checksum offload replyThomas Richter
Turning on receive and/or transmit checksum offload support on the OSA card requires 2 commands: 1. a start command, which replies with the available features; 2. an enable command to turn on selected features. The current version does not check the reply of the start command and simply uses the returned value to enable offload features. When the start command returns zero, this leads to a situation where no checksum offload is turned on by the hardware. Even worse, no error indication is returned. The Linux kernel assumes the OSA card performs RX/TX checksum offload, but the hardware does not perform any checksum verification at all. This patch checks the start and enable command responses from the hardware and turns off checksum offloading if a command fails or does not respond with the correct bit setting. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
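The two-step check described above might look roughly like this. The Python is a sketch under assumptions: negotiate_checksum_offload(), the callables standing in for the start and enable commands, and the convention that a failed command returns None are all hypothetical, not the qeth driver's interfaces.

```python
def negotiate_checksum_offload(send_start, send_enable, wanted: int) -> bool:
    """Return True only if the card confirms every wanted feature bit.

    send_start()      -> bitmask of available features, or None on failure
    send_enable(mask) -> bitmask of enabled features, or None on failure
    """
    available = send_start()
    if available is None or (available & wanted) != wanted:
        return False  # start failed or feature not offered: keep offload off
    enabled = send_enable(wanted)
    if enabled is None or (enabled & wanted) != wanted:
        return False  # enable failed or bit not confirmed: keep offload off
    return True
```

A start reply of zero, the case called out in the commit message, makes the function return False, so the caller would leave checksum offload disabled instead of assuming the hardware performs it.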
2017-01-12s390/qeth: rework RX/TX checksum offloadThomas Richter
Rework the RX/TX checksum offloading command sequence to use the provided function callback mechanisms to return card data to the device driver. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12Merge branch 'bpf-cb-access'David S. Miller
Daniel Borkmann says: ==================== More flexible BPF cb access This patch improves BPF's cb access by allowing b/h/w/dw access variants on it. For details, please see individual patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12bpf: allow b/h/w/dw access for bpf's cb in ctxDaniel Borkmann
When structs are used to store temporary state in cb[] buffer that is used with programs and among tail calls, then the generated code will not always access the buffer in bpf_w chunks. We can ease programming of it and let this act more natural by allowing for aligned b/h/w/dw sized access for cb[] ctx member. Various test cases are attached as well for the selftest suite. Potentially, this can also be reused for other program types to pass data around. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
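The "aligned b/h/w/dw sized access" rule can be written as a small validity check. This Python sketch is a model, not the verifier code: cb_access_ok() is a hypothetical helper, offsets are taken relative to the start of cb[], and the 20-byte size assumes cb[] is an array of five 32-bit words.

```python
CB_SIZE = 5 * 4  # assumption: cb[] holds five 32-bit words


def cb_access_ok(off: int, size: int) -> bool:
    """True if an access of `size` bytes at byte offset `off` into cb[]
    is one of b/h/w/dw (1/2/4/8 bytes), naturally aligned, and in bounds."""
    return (size in (1, 2, 4, 8)
            and off >= 0
            and off % size == 0
            and off + size <= CB_SIZE)
```

Under these assumptions a dw access at offsets 0 or 8 is accepted, a dw access at offset 4 is rejected as misaligned, and a dw access at offset 16 is rejected because it would run past the end of the buffer.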
2017-01-12bpf: pass original insn directly to convert_ctx_accessDaniel Borkmann
Currently, when calling the convert_ctx_access() callback for the various program types, we pass in insn->dst_reg, insn->src_reg, insn->off from the original instruction. This information is needed to rewrite the instruction that is based on the user ctx structure into a kernel representation for the ctx. As we'd like to allow access sizes beyond just BPF_W, we'd also need insn->code in order to decode the original access size. Given that, let's just pass insn directly to the convert_ctx_access() callback and work on that, so as not to clutter the callback with even more arguments when everything is already contained in insn. So let's go through that once, no functional change. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12Merge branch 'smc-fixes'David S. Miller
Ursula Braun says: ==================== net/smc: fix typo and clc-bug I received 2 bug reports for my new AF_SMC-code. Here are the fixes for them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12smc: ETH_ALEN as memcpy length for mac addressesUrsula Braun
When creating an SMC connection, there is a CLC (connection layer control) handshake to prepare for RDMA traffic. The corresponding code is part of commit 0cfdd8f92cac ("smc: connection and link group creation"). MAC addresses to be exchanged in the handshake are copied with a wrong length of 12 instead of 6 bytes. Subsequent code overwrites the wrongly copied bytes, but nevertheless the correct length should already be used for the preceding MAC address copying. Use ETH_ALEN for the memcpy length with MAC addresses. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Fixes: 0cfdd8f92cac ("smc: connection and link group creation") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
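The effect of the wrong copy length can be reproduced on a byte buffer. The Python below is only a model: build_msg() and the 16-byte message layout (a 6-byte MAC followed by other fields) are hypothetical and do not reflect the actual CLC message structure.

```python
ETH_ALEN = 6  # length of an Ethernet MAC address in bytes


def build_msg(src: bytes, mac_len: int) -> bytearray:
    """Copy `mac_len` bytes from `src` into the MAC field at the start of a
    hypothetical 16-byte message whose remaining bytes hold other fields."""
    msg = bytearray(16)
    msg[0:mac_len] = src[0:mac_len]
    return msg


# Source buffer: a MAC address followed by unrelated bytes.
mac = bytes([0x02, 0x00, 0x00, 0x00, 0x00, 0x01])
src = mac + b"\xaa" * 6
```

build_msg(src, 12) spills six unrelated 0xaa bytes into the field after the MAC, while build_msg(src, ETH_ALEN) leaves that field untouched. In the kernel the spilled bytes happened to be overwritten afterwards, which is why the bug was harmless in practice.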
2017-01-12net: fix AF_SMC related typoUrsula Braun
When introducing the new socket family AF_SMC in commit ac7138746e14 ("smc: establish new socket family"), a typo in af_family_clock_key_strings has slipped in. This patch repairs it. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Fixes: ac7138746e14 ("smc: establish new socket family") Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12cxgb4: Initialize mbox lock and list for mgmt devGanesh Goudar
Initialize mbox lock and list for mgmt dev to avoid NULL pointer dereference when cxgb_set_vf_mac is called. And also allocate memory for private data while allocating mgmt netdev. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12net: core: Make netif_wake_subqueue a wrapperFlorian Fainelli
netif_wake_subqueue() is duplicating the same thing that netif_tx_wake_queue() does, so make it call it directly after looking up the queue from the index. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11net: thunderx: Make hfunc variable const type in nicvf_set_rxfh()Robert Richter
From struct ethtool_ops: int (*set_rxfh)(struct net_device *, const u32 *indir, const u8 *key, const u8 hfunc); Change function arg of hfunc to const type. V2: Fixed indentation. Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11net: thunderx: Fix error return code in nicvf_open()Wei Yongjun
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 712c31853440 ("net: thunderx: Program LMAC credits based on MTU") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11sfc: efx_get_phys_port_id() can be staticWei Yongjun
Fixes the following sparse warning: drivers/net/ethernet/sfc/efx.c:2337:5: warning: symbol 'efx_get_phys_port_id' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two AF_* families adding entries to the lockdep tables at the same time. Signed-off-by: David S. Miller <davem@davemloft.net>