commit d3c67a52a66ba2d44bcf1b8262609148c7c73113 Author: Greg Kroah-Hartman Date: Mon Dec 17 21:55:18 2018 +0100 Linux 4.4.168 commit 5e27a782d2cd82e7ac599b1769ad061370a55aff Author: Shuah Khan Date: Thu Sep 15 08:36:07 2016 -0600 selftests: Move networking/timestamping from Documentation commit 3d2c86e3057995270e08693231039d9d942871f0 upstream. Remove networking from Documentation Makefile to move the test to selftests. Update networking/timestamping Makefile to work under selftests. These tests will not be run as part of selftests suite and will not be included in install targets. They can be built and run separately for now. This is part of the effort to move runnable code from Documentation. Acked-by: Jonathan Corbet Signed-off-by: Shuah Khan [ added to 4.4.y stable to remove a build warning - gregkh] Signed-off-by: Greg Kroah-Hartman commit 2397ae1e4b9a914346fa77acfc9b066fdea00833 Author: Arnd Bergmann Date: Fri Sep 22 23:29:18 2017 +0200 rocker: fix rocker_tlv_put_* functions for KASAN commit 6098d7ddd62f532f80ee2a4b01aca500a8e4e9e4 upstream. Inlining these functions creates lots of stack variables that each take 64 bytes when KASAN is enabled, leading to this warning about potential stack overflow: drivers/net/ethernet/rocker/rocker_ofdpa.c: In function 'ofdpa_cmd_flow_tbl_add': drivers/net/ethernet/rocker/rocker_ofdpa.c:621:1: error: the frame size of 2752 bytes is larger than 1536 bytes [-Werror=frame-larger-than=] gcc-8 can now consolidate the stack slots itself, but on older versions we get the same behavior by using a temporary variable that holds a copy of the inline function argument. Cc: stable@vger.kernel.org Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e375a05c5082d708bdd79eff150fe2e53cccf5e8 Author: Guenter Roeck Date: Sun Jul 1 13:57:24 2018 -0700 staging: speakup: Replace strncpy with memcpy commit fd29edc7232bc19f969e8f463138afc5472b3d5f upstream. gcc 8.1.0 generates the following warnings. drivers/staging/speakup/kobjects.c: In function 'punc_store': drivers/staging/speakup/kobjects.c:522:2: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length drivers/staging/speakup/kobjects.c:504:6: note: length computed here drivers/staging/speakup/kobjects.c: In function 'synth_store': drivers/staging/speakup/kobjects.c:391:2: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length drivers/staging/speakup/kobjects.c:388:8: note: length computed here Using strncpy() is indeed less than perfect since the length of data to be copied has already been determined with strlen(). Replace strncpy() with memcpy() to address the warning and optimize the code a little. Signed-off-by: Guenter Roeck Reviewed-by: Samuel Thibault Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit ed8c3cb1e66ac583ae461fef25b0cd01df1e50e7 Author: Sudip Mukherjee Date: Thu Aug 25 23:14:12 2016 +0530 matroxfb: fix size of memcpy commit 59921b239056fb6389a865083284e00ce0518db6 upstream. hw->DACreg has a size of 80 bytes and MGADACbpp32 has 21. So when memcpy copies MGADACbpp32 to hw->DACreg it copies 80 bytes but only 21 bytes are valid. Signed-off-by: Sudip Mukherjee Signed-off-by: Tomi Valkeinen Signed-off-by: Greg Kroah-Hartman commit dc4bc70259daba144f799e40a99413a86c601006 Author: Arnd Bergmann Date: Thu Nov 30 11:55:46 2017 -0500 media: dvb-frontends: fix i2c access helpers for KASAN commit 3cd890dbe2a4f14cc44c85bb6cf37e5e22d4dd0e upstream. A typical code fragment was copied across many dvb-frontend drivers and causes large stack frames when built with with CONFIG_KASAN on gcc-5/6/7: drivers/media/dvb-frontends/cxd2841er.c:3225:1: error: the frame size of 3992 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/cxd2841er.c:3404:1: error: the frame size of 3136 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/stv0367.c:3143:1: error: the frame size of 4016 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/stv090x.c:3430:1: error: the frame size of 5312 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/stv090x.c:4248:1: error: the frame size of 4872 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] gcc-8 now solves this by consolidating the stack slots for the argument variables, but on older compilers we can get the same behavior by taking the pointer of a local variable rather than the inline function argument. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit adc143b97d06a3305707726e69b4247db050cb88 Author: Willy Tarreau Date: Fri May 11 08:11:44 2018 +0200 proc: do not access cmdline nor environ from file-backed areas commit 7f7ccc2ccc2e70c6054685f5e3522efa81556830 upstream. proc_pid_cmdline_read() and environ_read() directly access the target process' VM to retrieve the command line and environment. If this process remaps these areas onto a file via mmap(), the requesting process may experience various issues such as extra delays if the underlying device is slow to respond. Let's simply refuse to access file-backed areas in these functions. For this we add a new FOLL_ANON gup flag that is passed to all calls to access_remote_vm(). The code already takes care of such failures (including unmapped areas). Accesses via /proc/pid/mem were not changed though. This was assigned CVE-2018-1120. Note for stable backports: the patch may apply to kernels prior to 4.11 but silently miss one location; it must be checked that no call to access_remote_vm() keeps zero as the last argument. Reported-by: Qualys Security Advisory Cc: Linus Torvalds Cc: Andy Lutomirski Cc: Oleg Nesterov Signed-off-by: Willy Tarreau Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: - Update the extra call to access_remote_vm() from proc_pid_cmdline_read() - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 56941bb6400ca6ed0fdcaaa1f8c8183234bf199c Author: Linus Torvalds Date: Mon Oct 24 19:00:44 2016 -0700 proc: don't use FOLL_FORCE for reading cmdline and environment commit 272ddc8b37354c3fe111ab26d25e792629148eee upstream. Now that Lorenzo cleaned things up and made the FOLL_FORCE users explicit, it becomes obvious how some of them don't really need FOLL_FORCE at all. So remove FOLL_FORCE from the proc code that reads the command line and arguments from user space. The mem_rw() function actually does want FOLL_FORCE, because gdd (and possibly many other debuggers) use it as a much more convenient version of PTRACE_PEEKDATA, but we should consider making the FOLL_FORCE part conditional on actually being a ptracer. This does not actually do that, just moves adds a comment to that effect and moves the gup_flags settings next to each other. Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 079d9ea86202777cd57c69879a5ba8db6a2c1b1e Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:19 2016 +0100 mm: replace access_remote_vm() write parameter with gup_flags commit 6347e8d5bcce33fc36e651901efefbe2c93a43ef upstream. This removes the 'write' argument from access_remote_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Acked-by: Michal Hocko Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 2b8143d6874b385c79b60257bb0f0ad328ee2194 Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:18 2016 +0100 mm: replace __access_remote_vm() write parameter with gup_flags commit 442486ec1096781c50227b73f721a63974b0fdda upstream. This removes the 'write' argument from __access_remote_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Acked-by: Michal Hocko Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 8e50b8b07f462ab4b91bc1491b1c91bd75e4ad40 Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:16 2016 +0100 mm: replace get_user_pages() write/force parameters with gup_flags commit 768ae309a96103ed02eb1e111e838c87854d8b51 upstream. This removes the 'write' and 'force' from get_user_pages() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Acked-by: Christian König Acked-by: Jesper Nilsson Acked-by: Michal Hocko Reviewed-by: Jan Kara Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: - Drop changes in rapidio, vchiq, goldfish - Keep the "write" variable in amdgpu_ttm_tt_pin_userptr() as it's still needed - Also update calls from various other places that now use get_user_pages_remote() upstream, which were updated there by commit 9beae1ea8930 "mm: replace get_user_pages_remote() write/force ..." - Also update calls from hfi1 and ipath - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 3ec22a6bce3f06aa3b8a399ea456fb1cb3792584 Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:15 2016 +0100 mm: replace get_vaddr_frames() write/force parameters with gup_flags commit 7f23b3504a0df63b724180262c5f3f117f21bcae upstream. This removes the 'write' and 'force' from get_vaddr_frames() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Acked-by: Michal Hocko Reviewed-by: Jan Kara Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit ffe6376c1803b5e964f609e8bc01507642397641 Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:14 2016 +0100 mm: replace get_user_pages_locked() write/force parameters with gup_flags commit 3b913179c3fa89dd0e304193fa0c746fc0481447 upstream. This removes the 'write' and 'force' use from get_user_pages_locked() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Acked-by: Michal Hocko Reviewed-by: Jan Kara Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 2b29980eb75bc7dcb23ed0436fe805ac6e684542 Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:13 2016 +0100 mm: replace get_user_pages_unlocked() write/force parameters with gup_flags commit c164154f66f0c9b02673f07aa4f044f1d9c70274 upstream. This removes the 'write' and 'force' use from get_user_pages_unlocked() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Reviewed-by: Jan Kara Acked-by: Michal Hocko Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: - Also update calls from process_vm_rw_single_vec() and async_pf_execute() - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit ff099ed77421a5dc8206bb61e5a598b28ab39ebb Author: Ben Hutchings Date: Sun Dec 16 23:50:08 2018 +0000 mm/nommu.c: Switch __get_user_pages_unlocked() to use __get_user_pages() Extracted from commit cde70140fed8 "mm/gup: Overload get_user_pages() functions". This is needed before picking commit 768ae309a961 "mm: replace get_user_pages() write/force parameters with gup_flags". Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit ab424c8eb71ee9ea4ba798faaeaf62e84048cb9a Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:12 2016 +0100 mm: remove write/force parameters from __get_user_pages_unlocked() commit d4944b0ecec0af882483fe44b66729316e575208 upstream. This removes the redundant 'write' and 'force' parameters from __get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Acked-by: Paolo Bonzini Reviewed-by: Jan Kara Acked-by: Michal Hocko Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: - Defer changes in process_vm_rw_single_vec() and async_pf_execute() since they use get_user_pages_unlocked() here - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 6a3c9524df93e9597d30c2d32114950772fce58e Author: Lorenzo Stoakes Date: Thu Oct 13 01:20:11 2016 +0100 mm: remove write/force parameters from __get_user_pages_locked() commit 859110d7497cdd0e6b21010d6f777049d676382c upstream. This removes the redundant 'write' and 'force' parameters from __get_user_pages_locked() to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes Reviewed-by: Jan Kara Acked-by: Michal Hocko Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: - Drop change in get_user_pages_remote() - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 8aca77150a9a2f89aba94f62a09c95f9a00c2956 Author: Jens Axboe Date: Mon May 21 12:21:14 2018 -0600 sr: pass down correctly sized SCSI sense buffer commit f7068114d45ec55996b9040e98111afa56e010fe upstream. We're casting the CDROM layer request_sense to the SCSI sense buffer, but the former is 64 bytes and the latter is 96 bytes. As we generally allocate these on the stack, we end up blowing up the stack. Fix this by wrapping the scsi_execute() call with a properly sized sense buffer, and copying back the bits for the CDROM layer. Reported-by: Piotr Gabriel Kosinski Reported-by: Daniel Shapira Tested-by: Kees Cook Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Jens Axboe [bwh: Despite what the "Fixes" field says, a buffer overrun was already possible if the sense data was really > 64 bytes long. Backported to 4.4: - We always need to allocate a sense buffer in order to call scsi_normalize_sense() - Remove the existing conditional heap-allocation of the sense buffer] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit c873dfa0ccbdb08e9fb42f497503e148f79cdebb Author: Kees Cook Date: Tue Jul 10 16:22:22 2018 -0700 swiotlb: clean up reporting commit 7d63fb3af87aa67aa7d24466e792f9d7c57d8e79 upstream. This removes needless use of '%p', and refactors the printk calls to use pr_*() helpers instead. Signed-off-by: Kees Cook Reviewed-by: Konrad Rzeszutek Wilk Signed-off-by: Christoph Hellwig [bwh: Backported to 4.4: - Adjust filename - Remove "swiotlb: " prefix from an additional log message] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 0d547583faaa4371ec566bdd38488b1eba5e8a2d Author: Mike Kravetz Date: Thu Apr 5 16:18:21 2018 -0700 hugetlbfs: fix bug in pgoff overflow checking commit 5df63c2a149ae65a9ec239e7c2af44efa6f79beb upstream. This is a fix for a regression in 32 bit kernels caused by an invalid check for pgoff overflow in hugetlbfs mmap setup. The check incorrectly specified that the size of a loff_t was the same as the size of a long. The regression prevents mapping hugetlbfs files at offsets greater than 4GB on 32 bit kernels. On 32 bit kernels conversion from a page based unsigned long can not overflow a loff_t byte offset. Therefore, skip this check if sizeof(unsigned long) != sizeof(loff_t). Link: http://lkml.kernel.org/r/20180330145402.5053-1-mike.kravetz@oracle.com Fixes: 63489f8e8211 ("hugetlbfs: check for pgoff value overflow") Reported-by: Dan Rue Signed-off-by: Mike Kravetz Tested-by: Anders Roxell Cc: Michal Hocko Cc: Yisheng Xie Cc: "Kirill A . Shutemov" Cc: Nic Losby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 38d924e1d040a5632c023d7b9a3a6bfbaf4d49d1 Author: Mike Kravetz Date: Thu Mar 22 16:17:13 2018 -0700 hugetlbfs: check for pgoff value overflow commit 63489f8e821144000e0bdca7e65a8d1cc23a7ee7 upstream. A vma with vm_pgoff large enough to overflow a loff_t type when converted to a byte offset can be passed via the remap_file_pages system call. The hugetlbfs mmap routine uses the byte offset to calculate reservations and file size. A sequence such as: mmap(0x20a00000, 0x600000, 0, 0x66033, -1, 0); remap_file_pages(0x20a00000, 0x600000, 0, 0x20000000000000, 0); will result in the following when task exits/file closed, kernel BUG at mm/hugetlb.c:749! Call Trace: hugetlbfs_evict_inode+0x2f/0x40 evict+0xcb/0x190 __dentry_kill+0xcb/0x150 __fput+0x164/0x1e0 task_work_run+0x84/0xa0 exit_to_usermode_loop+0x7d/0x80 do_syscall_64+0x18b/0x190 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 The overflowed pgoff value causes hugetlbfs to try to set up a mapping with a negative range (end < start) that leaves invalid state which causes the BUG. The previous overflow fix to this code was incomplete and did not take the remap_file_pages system call into account. [mike.kravetz@oracle.com: v3] Link: http://lkml.kernel.org/r/20180309002726.7248-1-mike.kravetz@oracle.com [akpm@linux-foundation.org: include mmdebug.h] [akpm@linux-foundation.org: fix -ve left shift count on sh] Link: http://lkml.kernel.org/r/20180308210502.15952-1-mike.kravetz@oracle.com Fixes: 045c7a3f53d9 ("hugetlbfs: fix offset overflow in hugetlbfs mmap") Signed-off-by: Mike Kravetz Reported-by: Nic Losby Acked-by: Michal Hocko Cc: "Kirill A . Shutemov" Cc: Yisheng Xie Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [bwh: Backported to 4.4: Use a conditional WARN() instead of VM_WARN()] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit f24fb9efea4df0376c5bda53a2e4b189a28dc09e Author: Mike Kravetz Date: Thu Apr 13 14:56:32 2017 -0700 hugetlbfs: fix offset overflow in hugetlbfs mmap commit 045c7a3f53d9403b62d396b6d051c4be5044cdb4 upstream. If mmap() maps a file, it can be passed an offset into the file at which the mapping is to start. Offset could be a negative value when represented as a loff_t. The offset plus length will be used to update the file size (i_size) which is also a loff_t. Validate the value of offset and offset + length to make sure they do not overflow and appear as negative. Found by syzcaller with commit ff8c0c53c475 ("mm/hugetlb.c: don't call region_abort if region_chg fails") applied. Prior to this commit, the overflow would still occur but we would luckily return ENOMEM. To reproduce: mmap(0, 0x2000, 0, 0x40021, 0xffffffffffffffffULL, 0x8000000000000000ULL); Resulted in, kernel BUG at mm/hugetlb.c:742! Call Trace: hugetlbfs_evict_inode+0x80/0xa0 evict+0x24a/0x620 iput+0x48f/0x8c0 dentry_unlink_inode+0x31f/0x4d0 __dentry_kill+0x292/0x5e0 dput+0x730/0x830 __fput+0x438/0x720 ____fput+0x1a/0x20 task_work_run+0xfe/0x180 exit_to_usermode_loop+0x133/0x150 syscall_return_slowpath+0x184/0x1c0 entry_SYSCALL_64_fastpath+0xab/0xad Fixes: ff8c0c53c475 ("mm/hugetlb.c: don't call region_abort if region_chg fails") Link: http://lkml.kernel.org/r/1491951118-30678-1-git-send-email-mike.kravetz@oracle.com Reported-by: Vegard Nossum Signed-off-by: Mike Kravetz Acked-by: Hillf Danton Cc: Dmitry Vyukov Cc: Michal Hocko Cc: "Kirill A . Shutemov" Cc: Andrey Ryabinin Cc: Naoya Horiguchi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 30a2ae50aef84ce6bb6132859a04dca461dbafdd Author: Mike Kravetz Date: Fri Mar 31 15:12:07 2017 -0700 mm/hugetlb.c: don't call region_abort if region_chg fails commit ff8c0c53c47530ffea82c22a0a6df6332b56c957 upstream. Changes to hugetlbfs reservation maps is a two step process. The first step is a call to region_chg to determine what needs to be changed, and prepare that change. This should be followed by a call to call to region_add to commit the change, or region_abort to abort the change. The error path in hugetlb_reserve_pages called region_abort after a failed call to region_chg. As a result, the adds_in_progress counter in the reservation map is off by 1. This is caught by a VM_BUG_ON in resv_map_release when the reservation map is freed. syzkaller fuzzer (when using an injected kmalloc failure) found this bug, that resulted in the following: kernel BUG at mm/hugetlb.c:742! Call Trace: hugetlbfs_evict_inode+0x7b/0xa0 fs/hugetlbfs/inode.c:493 evict+0x481/0x920 fs/inode.c:553 iput_final fs/inode.c:1515 [inline] iput+0x62b/0xa20 fs/inode.c:1542 hugetlb_file_setup+0x593/0x9f0 fs/hugetlbfs/inode.c:1306 newseg+0x422/0xd30 ipc/shm.c:575 ipcget_new ipc/util.c:285 [inline] ipcget+0x21e/0x580 ipc/util.c:639 SYSC_shmget ipc/shm.c:673 [inline] SyS_shmget+0x158/0x230 ipc/shm.c:657 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: resv_map_release+0x265/0x330 mm/hugetlb.c:742 Link: http://lkml.kernel.org/r/1490821682-23228-1-git-send-email-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Reported-by: Dmitry Vyukov Acked-by: Hillf Danton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 954648ebf8e27fcbf23b7954b79a22a5cacc83b1 Author: Thomas Gleixner Date: Thu Nov 1 13:02:38 2018 -0700 posix-timers: Sanitize overrun handling commit 78c9c4dfbf8c04883941445a195276bb4bb92c76 upstream. The posix timer overrun handling is broken because the forwarding functions can return a huge number of overruns which does not fit in an int. As a consequence timer_getoverrun(2) and siginfo::si_overrun can turn into random number generators. The k_clock::timer_forward() callbacks return a 64 bit value now. Make k_itimer::ti_overrun[_last] 64bit as well, so the kernel internal accounting is correct. 3Remove the temporary (int) casts. Add a helper function which clamps the overrun value returned to user space via timer_getoverrun(2) or siginfo::si_overrun limited to a positive value between 0 and INT_MAX. INT_MAX is an indicator for user space that the overrun value has been clamped. Reported-by: Team OWL337 Signed-off-by: Thomas Gleixner Acked-by: John Stultz Cc: Peter Zijlstra Cc: Michael Kerrisk Link: https://lkml.kernel.org/r/20180626132705.018623573@linutronix.de [florian: Make patch apply to v4.9.135] Signed-off-by: Florian Fainelli Reviewed-by: Thomas Gleixner Signed-off-by: Sasha Levin Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit e47b9b2b005ab8b1b83bc0ac4aa2803cba57182a Author: Lior David Date: Tue Nov 14 15:25:39 2017 +0200 wil6210: missing length check in wmi_set_ie commit b5a8ffcae4103a9d823ea3aa3a761f65779fbe2a upstream. Add a length check in wmi_set_ie to detect unsigned integer overflow. Signed-off-by: Lior David Signed-off-by: Maya Erez Signed-off-by: Kalle Valo Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 1c74bd22e846b162ea6401e8d43172e0e7256ccf Author: Alexei Starovoitov Date: Tue May 15 09:27:05 2018 -0700 bpf: Prevent memory disambiguation attack commit af86ca4e3088fe5eacf2f7e58c01fa68ca067672 upstream. Detect code patterns where malicious 'speculative store bypass' can be used and sanitize such patterns. 39: (bf) r3 = r10 40: (07) r3 += -216 41: (79) r8 = *(u64 *)(r7 +0) // slow read 42: (7a) *(u64 *)(r10 -72) = 0 // verifier inserts this instruction 43: (7b) *(u64 *)(r8 +0) = r3 // this store becomes slow due to r8 44: (79) r1 = *(u64 *)(r6 +0) // cpu speculatively executes this load 45: (71) r2 = *(u8 *)(r1 +0) // speculatively arbitrary 'load byte' // is now sanitized Above code after x86 JIT becomes: e5: mov %rbp,%rdx e8: add $0xffffffffffffff28,%rdx ef: mov 0x0(%r13),%r14 f3: movq $0x0,-0x48(%rbp) fb: mov %rdx,0x0(%r14) ff: mov 0x0(%rbx),%rdi 103: movzbq 0x0(%rdi),%rsi Signed-off-by: Alexei Starovoitov Signed-off-by: Thomas Gleixner [bwh: Backported to 4.4: - Add verifier_env parameter to check_stack_write() - Look up stack slot_types with state->stack_slot_type[] rather than state->stack[].slot_type[] - Drop bpf_verifier_env argument to verbose() - Adjust filename, context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 451624d47005aace4e314b488cb70ba3ee5dcce8 Author: Ben Hutchings Date: Wed Dec 5 22:41:36 2018 +0000 bpf/verifier: Pass instruction index to check_mem_access() and check_xadd() Extracted from commit 31fd85816dbe "bpf: permits narrower load from bpf program context fields". Cc: Daniel Borkmann Cc: Alexei Starovoitov Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 168cb9b7b2839e861278f9fde03820aba32c4ee0 Author: Ben Hutchings Date: Wed Dec 5 22:45:15 2018 +0000 bpf/verifier: Add spi variable to check_stack_write() Extracted from commit dc503a8ad984 "bpf/verifier: track liveness for pruning". Cc: Daniel Borkmann Cc: Alexei Starovoitov Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 3c4bb079e16e222324c68d7594b1ab6f699edfca Author: Alexei Starovoitov Date: Thu Sep 1 18:37:21 2016 -0700 bpf: support 8-byte metafield access commit cedaf52693f02372010548c63b2e63228b959099 upstream. The verifier supported only 4-byte metafields in struct __sk_buff and struct xdp_md. The metafields in upcoming struct bpf_perf_event are 8-byte to match register width in struct pt_regs. Teach verifier to recognize 8-byte metafield access. The patch doesn't affect safety of sockets and xdp programs. They check for 4-byte only ctx access before these conditions are hit. Signed-off-by: Alexei Starovoitov Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit ff3c3b181c5ee5930b9cc6ca59c4c985a3d93220 Author: Tom Lendacky Date: Thu May 10 22:06:39 2018 +0200 KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD commit bc226f07dcd3c9ef0b7f6236fe356ea4a9cb4769 upstream. Expose the new virtualized architectural mechanism, VIRT_SSBD, for using speculative store bypass disable (SSBD) under SVM. This will allow guests to use SSBD on hardware that uses non-architectural mechanisms for enabling SSBD. [ tglx: Folded the migration fixup from Paolo Bonzini ] Signed-off-by: Tom Lendacky Signed-off-by: Thomas Gleixner Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 3e3a1c2ee031cd3d1a8fe9a990b61c8f17a6dd83 Author: Borislav Petkov Date: Wed May 2 18:15:14 2018 +0200 x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP commit e7c587da125291db39ddf1f49b18e5970adbac17 upstream. Intel and AMD have different CPUID bits hence for those use synthetic bits which get set on the respective vendor's in init_speculation_control(). So that debacles like what the commit message of c65732e4f721 ("x86/cpu: Restore CPUID_8000_0008_EBX reload") talks about don't happen anymore. Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner Reviewed-by: Konrad Rzeszutek Wilk Tested-by: Jörg Otte Cc: Linus Torvalds Cc: "Kirill A. Shutemov" Link: https://lkml.kernel.org/r/20180504161815.GG9257@pd.tnic Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: This was partly applied before; apply just the missing bits] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit b5ec2b3f11993d843f75c2d2954ece20af96dc88 Author: Thomas Gleixner Date: Wed May 9 23:01:01 2018 +0200 x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL commit ccbcd2674472a978b48c91c1fbfb66c0ff959f24 upstream. AMD is proposing a VIRT_SPEC_CTRL MSR to handle the Speculative Store Bypass Disable via MSR_AMD64_LS_CFG so that guests do not have to care about the bit position of the SSBD bit and thus facilitate migration. Also, the sibling coordination on Family 17H CPUs can only be done on the host. Extend x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() with an extra argument for the VIRT_SPEC_CTRL MSR. Hand in 0 from VMX and in SVM add a new virt_spec_ctrl member to the CPU data structure which is going to be used in later patches for the actual implementation. Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Konrad Rzeszutek Wilk Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: This was partly applied before; apply just the missing bits] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 7f77d36ab3f3d3dc09af0afbc7b58198382e9941 Author: Thomas Gleixner Date: Fri May 11 15:21:01 2018 +0200 KVM: SVM: Move spec control call after restore of GS commit 15e6c22fd8e5a42c5ed6d487b7c9fe44c2517765 upstream. svm_vcpu_run() invokes x86_spec_ctrl_restore_host() after VMEXIT, but before the host GS is restored. x86_spec_ctrl_restore_host() uses 'current' to determine the host SSBD state of the thread. 'current' is GS based, but host GS is not yet restored and the access causes a triple fault. Move the call after the host GS restore. Fixes: 885f82bfbc6f x86/process: Allow runtime control of Speculative Store Bypass Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Konrad Rzeszutek Wilk Acked-by: Paolo Bonzini Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 0109a1b0a5cababd514671b517722585302c0d4f Author: Konrad Rzeszutek Wilk Date: Wed Apr 25 22:04:25 2018 -0400 x86/KVM/VMX: Expose SPEC_CTRL Bit(2) to the guest commit da39556f66f5cfe8f9c989206974f1cb16ca5d7c upstream. Expose the CPUID.7.EDX[31] bit to the guest, and also guard against various combinations of SPEC_CTRL MSR values. The handling of the MSR (to take into account the host value of SPEC_CTRL Bit(2)) is taken care of in patch: KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Thomas Gleixner Reviewed-by: Ingo Molnar [dwmw2: Handle 4.9 guest CPUID differences, rename guest_cpu_has_ibrs() → guest_cpu_has_spec_ctrl()] Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: Update feature bit name] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 2658e4d66deca4c1fc6eb59514bded62dd0a7812 Author: Konrad Rzeszutek Wilk Date: Wed Apr 25 22:04:19 2018 -0400 x86/bugs, KVM: Support the combination of guest and host IBRS commit 5cf687548705412da47c9cec342fd952d71ed3d5 upstream. A guest may modify the SPEC_CTRL MSR from the value used by the kernel. Since the kernel doesn't use IBRS, this means a value of zero is what is needed in the host. But the 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to the other bits as reserved so the kernel should respect the boot time SPEC_CTRL value and use that. This allows to deal with future extensions to the SPEC_CTRL interface if any at all. Note: This uses wrmsrl() instead of native_wrmsl(). I does not make any difference as paravirt will over-write the callq *0xfff.. with the wrmsrl assembler code. Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Ingo Molnar [bwh: Backported to 4.4: This was partly applied before; apply just the missing bits] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 67e326e034383857f0cd0a2bc92c6b525fc710e6 Author: Dan Williams Date: Mon Jan 29 17:02:49 2018 -0800 x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec commit 304ec1b050310548db33063e567123fae8fd0301 upstream. Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. __uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the limit check is far away from the user pointer de-reference. In those cases a barrier_nospec() prevents speculation with a potential pointer to privileged memory. uaccess_try_nospec covers get_user_try. Suggested-by: Linus Torvalds Suggested-by: Andi Kleen Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwillia2-desk3.amr.corp.intel.com [bwh: Backported to 4.4: - Convert several more functions to use __uaccess_begin_nospec(), that are just wrappers in mainline - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 557cd0d20ec971f52e4b9482d551b41503bb3e55 Author: Dan Williams Date: Mon Jan 29 17:02:44 2018 -0800 x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end} commit b5c4ae4f35325d520b230bab6eb3310613b72ac1 upstream. In preparation for converting some __uaccess_begin() instances to __uacess_begin_nospec(), make sure all 'from user' uaccess paths are using the _begin(), _end() helpers rather than open-coded stac() and clac(). No functional changes. Suggested-by: Ingo Molnar Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwillia2-desk3.amr.corp.intel.com [bwh: Backported to 4.4: - Convert several more functions to use __uaccess_begin_nospec(), that are just wrappers in mainline - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 6d1d4fc34287da617b50bd7139e536a8d69c24ea Author: Dan Williams Date: Mon Jan 29 17:02:39 2018 -0800 x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec commit b3bbfb3fb5d25776b8e3f361d2eedaabb0b496cd upstream. For __get_user() paths, do not allow the kernel to speculate on the value of a user controlled pointer. In addition to the 'stac' instruction for Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the access_ok() result to resolve in the pipeline before the CPU might take any speculative action on the pointer value. Given the cost of 'stac' the speculation barrier is placed after 'stac' to hopefully overlap the cost of disabling SMAP with the cost of flushing the instruction pipeline. Since __get_user is a major kernel interface that deals with user controlled pointers, the __uaccess_begin_nospec() mechanism will prevent speculative execution past an access_ok() permission check. While speculative execution past access_ok() is not enough to lead to a kernel memory leak, it is a necessary precondition. To be clear, __uaccess_begin_nospec() is addressing a class of potential problems near __get_user() usages. Note, that while the barrier_nospec() in __uaccess_begin_nospec() is used to protect __get_user(), pointer masking similar to array_index_nospec() will be used for get_user() since it incorporates a bounds check near the usage. uaccess_try_nospec provides the same mechanism for get_user_try. No functional changes. Suggested-by: Linus Torvalds Suggested-by: Andi Kleen Suggested-by: Ingo Molnar Signed-off-by: Dan Williams Signed-off-by: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky Cc: Kees Cook Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwillia2-desk3.amr.corp.intel.com [bwh: Backported to 4.4: use current_thread_info()] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 45b871531cb6d2973d0c68083376719c50779a96 Author: Linus Torvalds Date: Tue Feb 23 14:58:52 2016 -0800 x86: fix SMAP in 32-bit environments commit de9e478b9d49f3a0214310d921450cf5bb4a21e6 upstream. In commit 11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses") I changed how the stac/clac instructions were generated around the user space accesses, which then made it possible to do batched accesses efficiently for user string copies etc. However, in doing so, I completely spaced out, and didn't even think about the 32-bit case. And nobody really even seemed to notice, because SMAP doesn't even exist until modern Skylake processors, and you'd have to be crazy to run 32-bit kernels on a modern CPU. Which brings us to Andy Lutomirski. He actually tested the 32-bit kernel on new hardware, and noticed that it doesn't work. My bad. The trivial fix is to add the required uaccess begin/end markers around the raw accesses in . I feel a bit bad about this patch, just because that header file really should be cleaned up to avoid all the duplicated code in it, and this commit just expands on the problem. But this just fixes the bug without any bigger cleanup surgery. Reported-and-tested-by: Andy Lutomirski Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit cc8c5450b6d7ac0cbc36d133ffa21a01dd21b3d8 Author: Linus Torvalds Date: Thu Dec 17 09:45:09 2015 -0800 x86: reorganize SMAP handling in user space accesses commit 11f1a4b9755f5dbc3e822a96502ebe9b044b14d8 upstream. This reorganizes how we do the stac/clac instructions in the user access code. Instead of adding the instructions directly to the same inline asm that does the actual user level access and exception handling, add them at a higher level. This is mainly preparation for the next step, where we will expose an interface to allow users to mark several accesses together as being user space accesses, but it does already clean up some code: - the inlined trivial cases of copy_in_user() now do stac/clac just once over the accesses: they used to do one pair around the user space read, and another pair around the write-back. - the {get,put}_user_ex() macros that are used with the catch/try handling don't do any stac/clac at all, because that happens in the try/catch surrounding them. Other than those two cleanups that happened naturally from the re-organization, this should not make any difference. Yet. Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit d0169c04fee013922a272a19f7950439a5e07230 Author: Paolo Bonzini Date: Thu Feb 22 16:43:17 2018 +0100 KVM/x86: Remove indirect MSR op calls from SPEC_CTRL commit ecb586bd29c99fb4de599dec388658e74388daad upstream. Having a paravirt indirect call in the IBRS restore path is not a good idea, since we are trying to protect from speculative execution of bogus indirect branch targets. It is also slower, so use native_wrmsrl() on the vmentry path too. Signed-off-by: Paolo Bonzini Reviewed-by: Jim Mattson Cc: David Woodhouse Cc: KarimAllah Ahmed Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Radim Krčmář Cc: Thomas Gleixner Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Fixes: d28b387fb74da95d69d2615732f50cceb38e9a4d Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 89be8950bab799ddb9cc3777345e3c21bcb32dba Author: KarimAllah Ahmed Date: Sat Feb 3 15:56:23 2018 +0100 KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL commit b2ac58f90540e39324e7a29a7ad471407ae0bf48 upstream. [ Based on a patch from Paolo Bonzini ] ... basically doing exactly what we do for VMX: - Passthrough SPEC_CTRL to guests (if enabled in guest CPUID) - Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest actually used it. Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Darren Kenny Reviewed-by: Konrad Rzeszutek Wilk Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Jun Nakajima Cc: kvm@vger.kernel.org Cc: Dave Hansen Cc: Tim Chen Cc: Andy Lutomirski Cc: Asit Mallick Cc: Arjan Van De Ven Cc: Greg KH Cc: Paolo Bonzini Cc: Dan Williams Cc: Linus Torvalds Cc: Ashok Raj Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit fc6aae9f407810cb153a9133c28735871f9f0a16 Author: KarimAllah Ahmed Date: Thu Feb 1 22:59:45 2018 +0100 KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL commit d28b387fb74da95d69d2615732f50cceb38e9a4d upstream. [ Based on a patch from Ashok Raj ] Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a retpoline+IBPB based approach. To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only start saving and restoring when a non-zero is written to it. No attempt is made to handle STIBP here, intentionally. Filtering STIBP may be added in a future patch, which may require trapping all writes if we don't want to pass it through directly to the guest. [dwmw2: Clean up CPUID bits, save/restore manually, handle reset] Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Darren Kenny Reviewed-by: Konrad Rzeszutek Wilk Reviewed-by: Jim Mattson Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Jun Nakajima Cc: kvm@vger.kernel.org Cc: Dave Hansen Cc: Tim Chen Cc: Andy Lutomirski Cc: Asit Mallick Cc: Arjan Van De Ven Cc: Greg KH Cc: Paolo Bonzini Cc: Dan Williams Cc: Linus Torvalds Cc: Ashok Raj Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 337c26f50a7189f114fce87e45eadd8d6dd9560b Author: KarimAllah Ahmed Date: Thu Feb 1 22:59:44 2018 +0100 KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES commit 28c1c9fabf48d6ad596273a11c46e0d0da3e14cd upstream. Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the contents will come directly from the hardware, but user-space can still override it. [dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional] Signed-off-by: KarimAllah Ahmed Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Reviewed-by: Paolo Bonzini Reviewed-by: Darren Kenny Reviewed-by: Jim Mattson Reviewed-by: Konrad Rzeszutek Wilk Cc: Andrea Arcangeli Cc: Andi Kleen Cc: Jun Nakajima Cc: kvm@vger.kernel.org Cc: Dave Hansen Cc: Linus Torvalds Cc: Andy Lutomirski Cc: Asit Mallick Cc: Arjan Van De Ven Cc: Greg KH Cc: Dan Williams Cc: Tim Chen Cc: Ashok Raj Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 4b3870c343a82cd2df7192cc5149c87205dcc611 Author: Ashok Raj Date: Thu Feb 1 22:59:43 2018 +0100 KVM/x86: Add IBPB support commit 15d45071523d89b3fb7372e2135fbd72f6af9506 upstream. The Indirect Branch Predictor Barrier (IBPB) is an indirect branch control mechanism. It keeps earlier branches from influencing later ones. Unlike IBRS and STIBP, IBPB does not define a new mode of operation. It's a command that ensures predicted branch targets aren't used after the barrier. Although IBRS and IBPB are enumerated by the same CPUID enumeration, IBPB is very different. IBPB helps mitigate against three potential attacks: * Mitigate guests from being attacked by other guests. - This is addressed by issing IBPB when we do a guest switch. * Mitigate attacks from guest/ring3->host/ring3. These would require a IBPB during context switch in host, or after VMEXIT. The host process has two ways to mitigate - Either it can be compiled with retpoline - If its going through context switch, and has set !dumpable then there is a IBPB in that path. (Tim's patch: https://patchwork.kernel.org/patch/10192871) - The case where after a VMEXIT you return back to Qemu might make Qemu attackable from guest when Qemu isn't compiled with retpoline. There are issues reported when doing IBPB on every VMEXIT that resulted in some tsc calibration woes in guest. * Mitigate guest/ring0->host/ring0 attacks. When host kernel is using retpoline it is safe against these attacks. If host kernel isn't using retpoline we might need to do a IBPB flush on every VMEXIT. Even when using retpoline for indirect calls, in certain conditions 'ret' can use the BTB on Skylake-era CPUs. There are other mitigations available like RSB stuffing/clearing. * IBPB is issued only for SVM during svm_free_vcpu(). VMX has a vmclear and SVM doesn't. Follow discussion here: https://lkml.org/lkml/2018/1/15/146 Please refer to the following spec for more details on the enumeration and control. Refer here to get documentation about mitigations. https://software.intel.com/en-us/side-channel-security-support [peterz: rebase and changelog rewrite] [karahmed: - rebase - vmx: expose PRED_CMD if guest has it in CPUID - svm: only pass through IBPB if guest has it in CPUID - vmx: support !cpu_has_vmx_msr_bitmap()] - vmx: support nested] [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS) PRED_CMD is a write-only MSR] Signed-off-by: Ashok Raj Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: David Woodhouse Signed-off-by: KarimAllah Ahmed Signed-off-by: Thomas Gleixner Reviewed-by: Konrad Rzeszutek Wilk Cc: Andrea Arcangeli Cc: Andi Kleen Cc: kvm@vger.kernel.org Cc: Asit Mallick Cc: Linus Torvalds Cc: Andy Lutomirski Cc: Dave Hansen Cc: Arjan Van De Ven Cc: Greg KH Cc: Jun Nakajima Cc: Paolo Bonzini Cc: Dan Williams Cc: Tim Chen Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 321fbb1fad297ccbac0efd28e58851a085ac29fa Author: Paolo Bonzini Date: Tue Jan 16 16:51:18 2018 +0100 KVM: VMX: make MSR bitmaps per-VCPU commit 904e14fb7cb96401a7dc803ca2863fd5ba32ffe6 upstream. Place the MSR bitmap in struct loaded_vmcs, and update it in place every time the x2apic or APICv state can change. This is rare and the loop can handle 64 MSRs per iteration, in a similar fashion as nested_vmx_prepare_msr_bitmap. This prepares for choosing, on a per-VM basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs. Suggested-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: - APICv support looked different - We still need to intercept the APIC_ID MSR - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 8f54df9756caed1d499bc8f412ab736a8928dc39 Author: Paolo Bonzini Date: Thu Jan 11 12:16:15 2018 +0100 KVM: VMX: introduce alloc_loaded_vmcs commit f21f165ef922c2146cc5bdc620f542953c41714b upstream. Group together the calls to alloc_vmcs and loaded_vmcs_init. Soon we'll also allocate an MSR bitmap there. Signed-off-by: Paolo Bonzini Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: - No loaded_vmcs::shadow_vmcs field to initialise - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 81cd492667c69020b3f55bed8eb5bfa4bebf7895 Author: Jim Mattson Date: Mon Nov 27 17:22:25 2017 -0600 KVM: nVMX: Eliminate vmcs02 pool commit de3a0021a60635de96aa92713c1a31a96747d72c upstream. The potential performance advantages of a vmcs02 pool have never been realized. To simplify the code, eliminate the pool. Instead, a single vmcs02 is allocated per VCPU when the VCPU enters VMX operation. Signed-off-by: Jim Mattson Signed-off-by: Mark Kanda Reviewed-by: Ameya More Reviewed-by: David Hildenbrand Reviewed-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 4.4: - No loaded_vmcs::shadow_vmcs field to initialise - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 1bec1a14bb080e86af254984135cd83e76f1f91d Author: David Matlack Date: Tue Aug 1 14:00:40 2017 -0700 KVM: nVMX: mark vmcs12 pages dirty on L2 exit commit c9f04407f2e0b3fc9ff7913c65fcfcb0a4b61570 upstream. The host physical addresses of L1's Virtual APIC Page and Posted Interrupt descriptor are loaded into the VMCS02. The CPU may write to these pages via their host physical address while L2 is running, bypassing address-translation-based dirty tracking (e.g. EPT write protection). Mark them dirty on every exit from L2 to prevent them from getting out of sync with dirty tracking. Also mark the virtual APIC page and the posted interrupt descriptor dirty when KVM is virtualizing posted interrupt processing. Signed-off-by: David Matlack Reviewed-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit e90c6ad207bcb7a599c259596c9a9e1bb15eb7bb Author: Radim Krčmář Date: Mon Aug 8 20:16:22 2016 +0200 KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC commit d048c098218e91ed0e10dfa1f0f80e2567fe4ef7 upstream. msr bitmap can be used to avoid a VM exit (interception) on guest MSR accesses. In some configurations of VMX controls, the guest can even directly access host's x2APIC MSRs. See SDM 29.5 VIRTUALIZING MSR-BASED APIC ACCESSES. L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI. To do so, L1 would first trick KVM to disable all possible interceptions by enabling APICv features and then would turn those features off; nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would not intercept previously enabled MSRs even though they were not safe with the new configuration. Correctly re-enabling interceptions is not enough as a second bug would still allow L1+L2 to access host's MSRs: msr bitmap was shared for all VMCSs, so L1 could trigger a race to get the desired combination of msr bitmap and VMX controls. This fix allocates a msr bitmap for every L1 VCPU, allows only safe x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would have to intercept everything anyway. Fixes: 3af18d9c5fe9 ("KVM: nVMX: Prepare for using hardware MSR bitmap") Reported-by: Jim Mattson Suggested-by: Wincy Van Reviewed-by: Wanpeng Li Signed-off-by: Radim Krčmář [bwh: Backported to 4.4: - handle_vmon() doesn't allocate a cached vmcs12 - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 8420459f1d938a02b060bde1e161fdd1de212fac Author: Takashi Sakamoto Date: Wed Jun 14 19:30:03 2017 +0900 ALSA: pcm: remove SNDRV_PCM_IOCTL1_INFO internal command commit e11f0f90a626f93899687b1cc909ee37dd6c5809 upstream. Drivers can implement 'struct snd_pcm_ops.ioctl' to handle some requests from ALSA PCM core. These requests are internal purpose in kernel land. Usually common set of operations are used for it. SNDRV_PCM_IOCTL1_INFO is one of the requests. According to code comment, it has been obsoleted in the old days. We can see old releases in ftp.alsa-project.org. The command was firstly introduced in v0.5.0 release as SND_PCM_IOCTL1_INFO, to allow drivers to fill data of 'struct snd_pcm_channel_info' type. In v0.9.0 release, this was obsoleted by the other commands for ioctl(2) such as SNDRV_PCM_IOCTL_CHANNEL_INFO. This commit removes the long-abandoned command, bye. Signed-off-by: Takashi Sakamoto Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 7f0324fb34c4736bf12ae8a5137e6913772749cc Author: Namhyung Kim Date: Wed Oct 19 10:23:41 2016 +0900 pstore: Convert console write to use ->write_buf [ Upstream commit 70ad35db3321a6d129245979de4ac9d06eed897c ] Maybe I'm missing something, but I don't know why it needs to copy the input buffer to psinfo->buf and then write. Instead we can write the input buffer directly. The only implementation that supports console message (i.e. ramoops) already does it for ftrace messages. For the upcoming virtio backend driver, it needs to protect psinfo->buf overwritten from console messages. If it could use ->write_buf method instead of ->write, the problem will be solved easily. Cc: Stefan Hajnoczi Signed-off-by: Namhyung Kim Signed-off-by: Kees Cook Signed-off-by: Sasha Levin commit 466570dc30cf556a0f27c9d823341e249b2000a9 Author: Pan Bian Date: Fri Nov 30 14:10:54 2018 -0800 ocfs2: fix potential use after free [ Upstream commit 164f7e586739d07eb56af6f6d66acebb11f315c8 ] ocfs2_get_dentry() calls iput(inode) to drop the reference count of inode, and if the reference count hits 0, inode is freed. However, in this function, it then reads inode->i_generation, which may result in a use after free bug. Move the put operation later. Link: http://lkml.kernel.org/r/1543109237-110227-1-git-send-email-bianpan2016@163.com Fixes: 781f200cb7a("ocfs2: Remove masklog ML_EXPORT.") Signed-off-by: Pan Bian Reviewed-by: Andrew Morton Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Joseph Qi Cc: Changwei Ge Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 5d8fe653a9340cf2a4daca908e1a56984f1c0909 Author: Qian Cai Date: Fri Nov 30 14:09:48 2018 -0800 debugobjects: avoid recursive calls with kmemleak [ Upstream commit 8de456cf87ba863e028c4dd01bae44255ce3d835 ] CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to recursive calls. fill_pool kmemleak_ignore make_black_object put_object __call_rcu (kernel/rcu/tree.c) debug_rcu_head_queue debug_object_activate debug_object_init fill_pool kmemleak_ignore make_black_object ... So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly allocated debug objects at all. Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us Signed-off-by: Qian Cai Suggested-by: Catalin Marinas Acked-by: Waiman Long Acked-by: Catalin Marinas Cc: Thomas Gleixner Cc: Yang Shi Cc: Arnd Bergmann Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 168a1537ca6f5058243cc88c025a55ac87f22fd3 Author: Pan Bian Date: Fri Nov 30 14:09:18 2018 -0800 hfsplus: do not free node before using [ Upstream commit c7d7d620dcbd2a1c595092280ca943f2fced7bbd ] hfs_bmap_free() frees node via hfs_bnode_put(node). However it then reads node->this when dumping error message on an error path, which may result in a use-after-free bug. This patch frees node only when it is never used. Link: http://lkml.kernel.org/r/1543053441-66942-1-git-send-email-bianpan2016@163.com Signed-off-by: Pan Bian Reviewed-by: Andrew Morton Cc: Ernesto A. Fernandez Cc: Joe Perches Cc: Viacheslav Dubeyko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 736ba5cba268c3b2b6abd16947cf42ff5ad96164 Author: Pan Bian Date: Fri Nov 30 14:09:14 2018 -0800 hfs: do not free node before using [ Upstream commit ce96a407adef126870b3f4a1b73529dd8aa80f49 ] hfs_bmap_free() frees the node via hfs_bnode_put(node). However, it then reads node->this when dumping error message on an error path, which may result in a use-after-free bug. This patch frees the node only when it is never again used. Link: http://lkml.kernel.org/r/1542963889-128825-1-git-send-email-bianpan2016@163.com Fixes: a1185ffa2fc ("HFS rewrite") Signed-off-by: Pan Bian Reviewed-by: Andrew Morton Cc: Joe Perches Cc: Ernesto A. Fernandez Cc: Viacheslav Dubeyko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 9549f09b02ade613e233d61c1a058bee8a584425 Author: Larry Chen Date: Fri Nov 30 14:08:56 2018 -0800 ocfs2: fix deadlock caused by ocfs2_defrag_extent() [ Upstream commit e21e57445a64598b29a6f629688f9b9a39e7242a ] ocfs2_defrag_extent may fall into deadlock. ocfs2_ioctl_move_extents ocfs2_ioctl_move_extents ocfs2_move_extents ocfs2_defrag_extent ocfs2_lock_allocators_move_extents ocfs2_reserve_clusters inode_lock GLOBAL_BITMAP_SYSTEM_INODE __ocfs2_flush_truncate_log inode_lock GLOBAL_BITMAP_SYSTEM_INODE As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock against the global bitmap if local allocator has not sufficient cluters. Once global bitmap could meet the demand, ocfs2_reserve_cluster will return success with global bitmap locked. After ocfs2_reserve_cluster(), if truncate log is full, __ocfs2_flush_truncate_log() will definitely fall into deadlock because it needs to inode_lock global bitmap, which has already been locked. To fix this bug, we could remove from ocfs2_lock_allocators_move_extents() the code which intends to lock global allocator, and put the removed code after __ocfs2_flush_truncate_log(). ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here, the other does not need the data allocator context, which means this patch does not affect the caller so far. Link: http://lkml.kernel.org/r/20181101071422.14470-1-lchen@suse.com Signed-off-by: Larry Chen Reviewed-by: Changwei Ge Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Joseph Qi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 8f1ee7557959af3367d573ec11d376f4a94e1f5e Author: Colin Ian King Date: Tue Jul 17 09:53:42 2018 +0100 fscache, cachefiles: remove redundant variable 'cache' [ Upstream commit 31ffa563833576bd49a8bf53120568312755e6e2 ] Variable 'cache' is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: variable 'cache' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King Signed-off-by: David Howells Signed-off-by: Sasha Levin commit b1ef956a8ba5d57c7209539b149e7e4b1d22c208 Author: NeilBrown Date: Fri Oct 26 17:16:29 2018 +1100 fscache: fix race between enablement and dropping of object [ Upstream commit c5a94f434c82529afda290df3235e4d85873c5b4 ] It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa #12 [ffff881ff43eff18] sys_read at ffffffff811fda62 #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 26e081481506908e75a202ac91d15b484be7b27c Author: Srikanth Boddepalli Date: Tue Nov 27 19:53:27 2018 +0530 xen: xlate_mmu: add missing header to fix 'W=1' warning [ Upstream commit 72791ac854fea36034fa7976b748fde585008e78 ] Add a missing header otherwise compiler warns about missed prototype: drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes] int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma, ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Srikanth Boddepalli Reviewed-by: Boris Ostrovsky Reviewed-by: Joey Pabalinas Signed-off-by: Juergen Gross Signed-off-by: Sasha Levin commit 0f1b4c748370a567c6d0ecc0a48a079a450861a0 Author: Y.C. Chen Date: Thu Nov 22 11:56:28 2018 +0800 drm/ast: fixed reading monitor EDID not stable issue [ Upstream commit 300625620314194d9e6d4f6dda71f2dc9cf62d9f ] v1: over-sample data to increase the stability with some specific monitors v2: refine to avoid infinite loop v3: remove un-necessary "volatile" declaration [airlied: fix two checkpatch warnings] Signed-off-by: Y.C. Chen Signed-off-by: Dave Airlie Link: https://patchwork.freedesktop.org/patch/msgid/1542858988-1127-1-git-send-email-yc_chen@aspeedtech.com Signed-off-by: Sasha Levin commit 573f19f1517bee0dd16ebf629c0746a2dfc20461 Author: Pan Bian Date: Wed Nov 28 15:30:24 2018 +0800 net: hisilicon: remove unexpected free_netdev [ Upstream commit c758940158bf29fe14e9d0f89d5848f227b48134 ] The net device ndev is freed via free_netdev when failing to register the device. The control flow then jumps to the error handling code block. ndev is used and freed again. Resulting in a use-after-free bug. Signed-off-by: Pan Bian Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 553c8675f5fea8c9fbf178c3a2187a2423b4af38 Author: Josh Elsasser Date: Sat Nov 24 12:57:33 2018 -0800 ixgbe: recognize 1000BaseLX SFP modules as 1Gbps [ Upstream commit a8bf879af7b1999eba36303ce9cc60e0e7dd816c ] Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules, allowing the core driver code to establish a link over this SFP type. This is done by the out-of-tree driver but the fix wasn't in mainline. Fixes: e23f33367882 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+”) Fixes: 6a14ee0cfb19 ("ixgbe: Add X550 support function pointers") Signed-off-by: Josh Elsasser Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit 6a1fb1bf8e6db1da631f14f790b5fc776cb93c7a Author: Lorenzo Bianconi Date: Mon Nov 26 15:07:16 2018 +0100 net: thunderx: fix NULL pointer dereference in nic_remove [ Upstream commit 24a6d2dd263bc910de018c78d1148b3e33b94512 ] Fix a possible NULL pointer dereference in nic_remove routine removing the nicpf module if nic_probe fails. The issue can be triggered with the following reproducer: $rmmod nicvf $rmmod nicpf [ 521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014 [ 521.422777] Mem abort info: [ 521.425561] ESR = 0x96000004 [ 521.428624] Exception class = DABT (current EL), IL = 32 bits [ 521.434535] SET = 0, FnV = 0 [ 521.437579] EA = 0, S1PTW = 0 [ 521.440730] Data abort info: [ 521.443603] ISV = 0, ISS = 0x00000004 [ 521.447431] CM = 0, WnR = 0 [ 521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42 [ 521.457022] [0000000000000014] pgd=0000000000000000 [ 521.461916] Internal error: Oops: 96000004 [#1] SMP [ 521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO) [ 521.523451] pc : nic_remove+0x24/0x88 [nicpf] [ 521.527808] lr : pci_device_remove+0x48/0xd8 [ 521.532066] sp : ffff000013433cc0 [ 521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000 [ 521.540672] x27: 0000000000000000 x26: 0000000000000000 [ 521.545974] x25: 0000000056000000 x24: 0000000000000015 [ 521.551274] x23: ffff8007ff89a110 x22: ffff000001667070 [ 521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000 [ 521.561877] x19: 0000000000000000 x18: 0000000000000025 [ 521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000 [ 521.593683] x7 : 0000000000000000 x6 : 0000000000000001 [ 521.598983] x5 : 0000000000000002 x4 : 0000000000000003 [ 521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184 [ 521.609585] x1 : ffff000001662118 x0 : ffff000008557be0 [ 521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3) [ 521.621490] Call trace: [ 521.623928] nic_remove+0x24/0x88 [nicpf] [ 521.627927] pci_device_remove+0x48/0xd8 [ 521.631847] device_release_driver_internal+0x1b0/0x248 [ 521.637062] driver_detach+0x50/0xc0 [ 521.640628] bus_remove_driver+0x60/0x100 [ 521.644627] driver_unregister+0x34/0x60 [ 521.648538] pci_unregister_driver+0x24/0xd8 [ 521.652798] nic_cleanup_module+0x14/0x111c [nicpf] [ 521.657672] __arm64_sys_delete_module+0x150/0x218 [ 521.662460] el0_svc_handler+0x94/0x110 [ 521.666287] el0_svc+0x8/0xc [ 521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660) Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit afca87e81681a55a1db525e1ea66cbf601a6fee9 Author: Yi Wang Date: Thu Nov 8 16:48:36 2018 +0800 KVM: x86: fix empty-body warnings [ Upstream commit 354cb410d87314e2eda344feea84809e4261570a ] We get the following warnings about empty statements when building with 'W=1': arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] Rework the debug helper macro to get rid of these warnings. Signed-off-by: Yi Wang Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 026e6bd1a41b375b40c013ee9e1dfa8afba3859d Author: Aaro Koskinen Date: Sun Nov 25 00:17:07 2018 +0200 USB: omap_udc: fix USB gadget functionality on Palm Tungsten E [ Upstream commit 2c2322fbcab8102b8cadc09d66714700a2da42c2 ] On Palm TE nothing happens when you try to use gadget drivers and plug the USB cable. Fix by adding the board to the vbus sense quirk list. Signed-off-by: Aaro Koskinen Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit 09542b329dbf6dc12f3d7590adfe02429791cf49 Author: Aaro Koskinen Date: Sun Nov 25 00:17:06 2018 +0200 USB: omap_udc: fix omap_udc_start() on 15xx machines [ Upstream commit 6ca6695f576b8453fe68865e84d25946d63b10ad ] On OMAP 15xx machines there are no transceivers, and omap_udc_start() always fails as it forgot to adjust the default return value. Signed-off-by: Aaro Koskinen Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit 39d5919321d360227862a08a0788b205d9335b14 Author: Aaro Koskinen Date: Sun Nov 25 00:17:05 2018 +0200 USB: omap_udc: fix crashes on probe error and module removal [ Upstream commit 99f700366fcea1aa2fa3c49c99f371670c3c62f8 ] We currently crash if usb_add_gadget_udc_release() fails, since the udc->done is not initialized until in the remove function. Furthermore, on module removal the udc data is accessed although the release function is already triggered by usb_del_gadget_udc() early in the function. Fix by rewriting the release and remove functions, basically moving all the cleanup into the release function, and doing the completion only in the module removal case. The patch fixes omap_udc module probe with a failing gadged, and also allows the removal of omap_udc. Tested by running "modprobe omap_udc; modprobe -r omap_udc" in a loop. Signed-off-by: Aaro Koskinen Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit a20d7a308fa2191256c1734499c7fb85a5e6f92d Author: Aaro Koskinen Date: Sun Nov 25 00:17:04 2018 +0200 USB: omap_udc: use devm_request_irq() [ Upstream commit 286afdde1640d8ea8916a0f05e811441fbbf4b9d ] The current code fails to release the third irq on the error path (observed by reading the code), and we get also multiple WARNs with failing gadget drivers due to duplicate IRQ releases. Fix by using devm_request_irq(). Signed-off-by: Aaro Koskinen Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit ac86c99ca19ae57ba955312f5183e1a49aa9372e Author: Martynas Pumputis Date: Fri Nov 23 17:43:26 2018 +0100 bpf: fix check of allowed specifiers in bpf_trace_printk [ Upstream commit 1efb6ee3edea57f57f9fb05dba8dcb3f7333f61f ] A format string consisting of "%p" or "%s" followed by an invalid specifier (e.g. "%p%\n" or "%s%") could pass the check which would make format_decode (lib/vsprintf.c) to warn. Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()") Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com Signed-off-by: Martynas Pumputis Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin commit d4327131dbe0147d0f07c5f0a65a0cb079bd2caa Author: Pan Bian Date: Fri Nov 23 15:56:33 2018 +0800 exportfs: do not read dentry after free [ Upstream commit 2084ac6c505a58f7efdec13eba633c6aaa085ca5 ] The function dentry_connected calls dput(dentry) to drop the previously acquired reference to dentry. In this case, dentry can be released. After that, IS_ROOT(dentry) checks the condition (dentry == dentry->d_parent), which may result in a use-after-free bug. This patch directly compares dentry with its parent obtained before dropping the reference. Fixes: a056cc8934c("exportfs: stop retrying once we race with rename/remove") Signed-off-by: Pan Bian Signed-off-by: Al Viro Signed-off-by: Sasha Levin commit 476bea0f653c47228d22975a16858a080de04993 Author: Peter Ujfalusi Date: Wed Nov 14 13:06:23 2018 +0200 ASoC: omap-dmic: Add pm_qos handling to avoid overruns with CPU_IDLE [ Upstream commit ffdcc3638c58d55a6fa68b6e5dfd4fb4109652eb ] We need to block sleep states which would require longer time to leave than the time the DMA must react to the DMA request in order to keep the FIFO serviced without overrun. Signed-off-by: Peter Ujfalusi Acked-by: Jarkko Nikula Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 3aa69785b7d5c0257a0d7c88c746d26c96a46e66 Author: Peter Ujfalusi Date: Wed Nov 14 13:06:22 2018 +0200 ASoC: omap-mcpdm: Add pm_qos handling to avoid under/overruns with CPU_IDLE [ Upstream commit 373a500e34aea97971c9d71e45edad458d3da98f ] We need to block sleep states which would require longer time to leave than the time the DMA must react to the DMA request in order to keep the FIFO serviced without under of overrun. Signed-off-by: Peter Ujfalusi Acked-by: Jarkko Nikula Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 44b0243956db138911bafd0bc1ff8427a24b53e8 Author: Robbie Ko Date: Wed Nov 14 18:32:37 2018 +0000 Btrfs: send, fix infinite loop due to directory rename dependencies [ Upstream commit a4390aee72713d9e73f1132bcdeb17d72fbbf974 ] When doing an incremental send, due to the need of delaying directory move (rename) operations we can end up in infinite loop at apply_children_dir_moves(). An example scenario that triggers this problem is described below, where directory names correspond to the numbers of their respective inodes. Parent snapshot: . |--- 261/ |--- 271/ |--- 266/ |--- 259/ |--- 260/ | |--- 267 | |--- 264/ | |--- 258/ | |--- 257/ | |--- 265/ |--- 268/ |--- 269/ | |--- 262/ | |--- 270/ |--- 272/ | |--- 263/ | |--- 275/ | |--- 274/ |--- 273/ Send snapshot: . |-- 275/ |-- 274/ |-- 273/ |-- 262/ |-- 269/ |-- 258/ |-- 271/ |-- 268/ |-- 267/ |-- 270/ |-- 259/ | |-- 265/ | |-- 272/ |-- 257/ |-- 260/ |-- 264/ |-- 263/ |-- 261/ |-- 266/ When processing inode 257 we delay its move (rename) operation because its new parent in the send snapshot, inode 272, was not yet processed. Then when processing inode 272, we delay the move operation for that inode because inode 274 is its ancestor in the send snapshot. Finally we delay the move operation for inode 274 when processing it because inode 275 is its new parent in the send snapshot and was not yet moved. When finishing processing inode 275, we start to do the move operations that were previously delayed (at apply_children_dir_moves()), resulting in the following iterations: 1) We issue the move operation for inode 274; 2) Because inode 262 depended on the move operation of inode 274 (it was delayed because 274 is its ancestor in the send snapshot), we issue the move operation for inode 262; 3) We issue the move operation for inode 272, because it was delayed by inode 274 too (ancestor of 272 in the send snapshot); 4) We issue the move operation for inode 269 (it was delayed by 262); 5) We issue the move operation for inode 257 (it was delayed by 272); 6) We issue the move operation for inode 260 (it was delayed by 272); 7) We issue the move operation for inode 258 (it was delayed by 269); 8) We issue the move operation for inode 264 (it was delayed by 257); 9) We issue the move operation for inode 271 (it was delayed by 258); 10) We issue the move operation for inode 263 (it was delayed by 264); 11) We issue the move operation for inode 268 (it was delayed by 271); 12) We verify if we can issue the move operation for inode 270 (it was delayed by 271). We detect a path loop in the current state, because inode 267 needs to be moved first before we can issue the move operation for inode 270. So we delay again the move operation for inode 270, this time we will attempt to do it after inode 267 is moved; 13) We issue the move operation for inode 261 (it was delayed by 263); 14) We verify if we can issue the move operation for inode 266 (it was delayed by 263). We detect a path loop in the current state, because inode 270 needs to be moved first before we can issue the move operation for inode 266. So we delay again the move operation for inode 266, this time we will attempt to do it after inode 270 is moved (its move operation was delayed in step 12); 15) We issue the move operation for inode 267 (it was delayed by 268); 16) We verify if we can issue the move operation for inode 266 (it was delayed by 270). We detect a path loop in the current state, because inode 270 needs to be moved first before we can issue the move operation for inode 266. So we delay again the move operation for inode 266, this time we will attempt to do it after inode 270 is moved (its move operation was delayed in step 12). So here we added again the same delayed move operation that we added in step 14; 17) We attempt again to see if we can issue the move operation for inode 266, and as in step 16, we realize we can not due to a path loop in the current state due to a dependency on inode 270. Again we delay inode's 266 rename to happen after inode's 270 move operation, adding the same dependency to the empty stack that we did in steps 14 and 16. The next iteration will pick the same move dependency on the stack (the only entry) and realize again there is still a path loop and then again the same dependency to the stack, over and over, resulting in an infinite loop. So fix this by preventing adding the same move dependency entries to the stack by removing each pending move record from the red black tree of pending moves. This way the next call to get_pending_dir_moves() will not return anything for the current parent inode. A test case for fstests, with this reproducer, follows soon. Signed-off-by: Robbie Ko Reviewed-by: Filipe Manana [Wrote changelog with example and more clear explanation] Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit b9b45f496f3ee085d6f8439f993124ef94fb13e9 Author: Huacai Chen Date: Thu Nov 15 10:44:57 2018 +0800 hwmon: (w83795) temp4_type has writable permission [ Upstream commit 09aaf6813cfca4c18034fda7a43e68763f34abb1 ] Both datasheet and comments of store_temp_mode() tell us that temp1~4_type is writable, so fix it. Signed-off-by: Yao Wang Signed-off-by: Huacai Chen Fixes: 39deb6993e7c (" hwmon: (w83795) Simplify temperature sensor type handling") Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit fd103a69b40023d38e8b93bfdc6a8c7522db0efd Author: Tzung-Bi Shih Date: Wed Nov 14 17:06:13 2018 +0800 ASoC: dapm: Recalculate audio map forcely when card instantiated [ Upstream commit 882eab6c28d23a970ae73b7eb831b169a672d456 ] Audio map are possible in wrong state before card->instantiated has been set to true. Imaging the following examples: time 1: at the beginning in:-1 in:-1 in:-1 in:-1 out:-1 out:-1 out:-1 out:-1 SIGGEN A B Spk time 2: after someone called snd_soc_dapm_new_widgets() (e.g. create_fill_widget_route_map() in sound/soc/codecs/hdac_hdmi.c) in:1 in:0 in:0 in:0 out:0 out:0 out:0 out:1 SIGGEN A B Spk time 3: routes added in:1 in:0 in:0 in:0 out:0 out:0 out:0 out:1 SIGGEN -----> A -----> B ---> Spk In the end, the path should be powered on but it did not. At time 3, "in" of SIGGEN and "out" of Spk did not propagate to their neighbors because snd_soc_dapm_add_path() will not invalidate the paths if the card has not instantiated (i.e. card->instantiated is false). To correct the state of audio map, recalculate the whole map forcely. Signed-off-by: Tzung-Bi Shih Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 15d8d7246173d11e7b75e34bc0226682468f9a8c Author: Nicolin Chen Date: Tue Nov 13 19:48:54 2018 -0800 hwmon: (ina2xx) Fix current value calculation [ Upstream commit 38cd989ee38c16388cde89db5b734f9d55b905f9 ] The current register (04h) has a sign bit at MSB. The comments for this calculation also mention that it's a signed register. However, the regval is unsigned type so result of calculation turns out to be an incorrect value when current is negative. This patch simply fixes this by adding a casting to s16. Fixes: 5d389b125186c ("hwmon: (ina2xx) Make calibration register value fixed") Signed-off-by: Nicolin Chen Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit 9ba7a303ff290d21bade27317337505640e556c0 Author: Thomas Richter Date: Tue Nov 13 15:38:22 2018 +0000 s390/cpum_cf: Reject request for sampling in event initialization [ Upstream commit 613a41b0d16e617f46776a93b975a1eeea96417c ] On s390 command perf top fails [root@s35lp76 perf] # ./perf top -F100000 --stdio Error: cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' [root@s35lp76 perf] # Using event -e rb0000 works as designed. Event rb0000 is the event number of the sampling facility for basic sampling. During system start up the following PMUs are installed in the kernel's PMU list (from head to tail): cpum_cf --> s390 PMU counter facility device driver cpum_sf --> s390 PMU sampling facility device driver uprobe kprobe tracepoint task_clock cpu_clock Perf top executes following functions and calls perf_event_open(2) system call with different parameters many times: cmd_top --> __cmd_top --> perf_evlist__add_default --> __perf_evlist__add_default --> perf_evlist__new_cycles (creates event type:0 (HW) config 0 (CPU_CYCLES) --> perf_event_attr__set_max_precise_ip Uses perf_event_open(2) to detect correct precise_ip level. Fails 3 times on s390 which is ok. Then functions cmd_top --> __cmd_top --> perf_top__start_counters -->perf_evlist__config --> perf_can_comm_exec --> perf_probe_api This functions test support for the following events: "cycles:u", "instructions:u", "cpu-clock:u" using --> perf_do_probe_api --> perf_event_open_cloexec Test the close on exec flag support with perf_event_open(2). perf_do_probe_api returns true if the event is supported. The function returns true because event cpu-clock is supported by the PMU cpu_clock. This is achieved by many calls to perf_event_open(2). Function perf_top__start_counters now calls perf_evsel__open() for every event, which is the default event cpu_cycles (config:0) and type HARDWARE (type:0) which a predfined frequence of 4000. Given the above order of the PMU list, the PMU cpum_cf gets called first and returns 0, which indicates support for this sampling. The event is fully allocated in the function perf_event_open (file kernel/event/core.c near line 10521 and the following check fails: event = perf_event_alloc(&attr, cpu, task, group_leader, NULL, NULL, NULL, cgroup_fd); if (IS_ERR(event)) { err = PTR_ERR(event); goto err_cred; } if (is_sampling_event(event)) { if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { err = -EOPNOTSUPP; goto err_alloc; } } The check for the interrupt capabilities fails and the system call perf_event_open() returns -EOPNOTSUPP (-95). Add a check to return -ENODEV when sampling is requested in PMU cpum_cf. This allows common kernel code in the perf_event_open() system call to test the next PMU in above list. Fixes: 97b1198fece0 (" "s390, perf: Use common PMU interrupt disabled code") Signed-off-by: Thomas Richter Reviewed-by: Hendrik Brueckner Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit c65b63a085c86d23b439f1a2831628c3c3ae4f94 Author: YueHaibing Date: Sat Nov 10 04:13:24 2018 +0000 sysv: return 'err' instead of 0 in __sysv_write_inode [ Upstream commit c4b7d1ba7d263b74bb72e9325262a67139605cde ] Fixes gcc '-Wunused-but-set-variable' warning: fs/sysv/inode.c: In function '__sysv_write_inode': fs/sysv/inode.c:239:6: warning: variable 'err' set but not used [-Wunused-but-set-variable] __sysv_write_inode should return 'err' instead of 0 Fixes: 05459ca81ac3 ("repair sysv_write_inode(), switch sysv to simple_fsync()") Signed-off-by: YueHaibing Signed-off-by: Al Viro Signed-off-by: Sasha Levin commit d7589b83d628eeb0850d61c354dbf47e15befc10 Author: Janusz Krzysztofik Date: Wed Nov 7 22:30:31 2018 +0100 ARM: OMAP1: ams-delta: Fix possible use of uninitialized field [ Upstream commit cec83ff1241ec98113a19385ea9e9cfa9aa4125b ] While playing with initialization order of modem device, it has been discovered that under some circumstances (early console init, I believe) its .pm() callback may be called before the uart_port->private_data pointer is initialized from plat_serial8250_port->private_data, resulting in NULL pointer dereference. Fix it by checking for uninitialized pointer before using it in modem_pm(). Fixes: aabf31737a6a ("ARM: OMAP1: ams-delta: update the modem to use regulator API") Signed-off-by: Janusz Krzysztofik Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit d2f642b05832f54e46bb7cc601068038c27901ea Author: Nathan Chancellor Date: Wed Oct 17 17:54:00 2018 -0700 ARM: OMAP2+: prm44xx: Fix section annotation on omap44xx_prm_enable_io_wakeup [ Upstream commit eef3dc34a1e0b01d53328b88c25237bcc7323777 ] When building the kernel with Clang, the following section mismatch warning appears: WARNING: vmlinux.o(.text+0x38b3c): Section mismatch in reference from the function omap44xx_prm_late_init() to the function .init.text:omap44xx_prm_enable_io_wakeup() The function omap44xx_prm_late_init() references the function __init omap44xx_prm_enable_io_wakeup(). This is often because omap44xx_prm_late_init lacks a __init annotation or the annotation of omap44xx_prm_enable_io_wakeup is wrong. Remove the __init annotation from omap44xx_prm_enable_io_wakeup so there is no more mismatch. Signed-off-by: Nathan Chancellor Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit a56de6cd69bd700e0db0ad3e02e45a7910319cec Author: Stefano Brivio Date: Thu Dec 6 19:30:37 2018 +0100 neighbour: Avoid writing before skb->head in neigh_hh_output() [ Upstream commit e6ac64d4c4d095085d7dd71cbd05704ac99829b2 ] While skb_push() makes the kernel panic if the skb headroom is less than the unaligned hardware header size, it will proceed normally in case we copy more than that because of alignment, and we'll silently corrupt adjacent slabs. In the case fixed by the previous patch, "ipv6: Check available headroom in ip6_xmit() even without options", we end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware header and write 16 bytes, starting 2 bytes before the allocated buffer. Always check we're not writing before skb->head and, if the headroom is not enough, warn and drop the packet. v2: - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet (Eric Dumazet) - if we avoid the panic, though, we need to explicitly check the headroom before the memcpy(), otherwise we'll have corrupted slabs on a running kernel, after we warn - use __skb_push() instead of skb_push(), as the headroom check is already implemented here explicitly (Eric Dumazet) Signed-off-by: Stefano Brivio Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2da7c7b22b33ea0e240b1627e10b76212cba94aa Author: Nicolas Dichtel Date: Thu Nov 29 14:45:39 2018 +0100 tun: forbid iface creation with rtnl ops [ Upstream commit 35b827b6d06199841a83839e8bb69c0cd13a28be ] It's not supported right now (the goal of the initial patch was to support 'ip link del' only). Before the patch: $ ip link add foo type tun [ 239.632660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [snip] [ 239.636410] RIP: 0010:register_netdevice+0x8e/0x3a0 This panic occurs because dev->netdev_ops is not set by tun_setup(). But to have something usable, it will require more than just setting netdev_ops. Fixes: f019a7a594d9 ("tun: Implement ip link del tunXXX") CC: Eric W. Biederman Signed-off-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ed7748bcf290ad8f80020217d832f458ac9bae7f Author: Yuchung Cheng Date: Wed Dec 5 14:38:38 2018 -0800 tcp: fix NULL ref in tail loss probe [ Upstream commit b2b7af861122a0c0f6260155c29a1b2e594cd5b5 ] TCP loss probe timer may fire when the retranmission queue is empty but has a non-zero tp->packets_out counter. tcp_send_loss_probe will call tcp_rearm_rto which triggers NULL pointer reference by fetching the retranmission queue head in its sub-routines. Add a more detailed warning to help catch the root cause of the inflight accounting inconsistency. Reported-by: Rafael Tinoco Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Signed-off-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 266b50e76449bf4a2391aabd9cc8ec364f8e0589 Author: Eric Dumazet Date: Tue Dec 4 09:40:35 2018 -0800 rtnetlink: ndo_dflt_fdb_dump() only work for ARPHRD_ETHER devices [ Upstream commit 688838934c231bb08f46db687e57f6d8bf82709c ] kmsan was able to trigger a kernel-infoleak using a gre device [1] nlmsg_populate_fdb_fill() has a hard coded assumption that dev->addr_len is ETH_ALEN, as normally guaranteed for ARPHRD_ETHER devices. A similar issue was fixed recently in commit da71577545a5 ("rtnetlink: Disallow FDB configuration for non-Ethernet device") [1] BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:143 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576 CPU: 0 PID: 6697 Comm: syz-executor310 Not tainted 4.20.0-rc3+ #95 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x32d/0x480 lib/dump_stack.c:113 kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683 kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743 kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634 copyout lib/iov_iter.c:143 [inline] _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576 copy_to_iter include/linux/uio.h:143 [inline] skb_copy_datagram_iter+0x4e2/0x1070 net/core/datagram.c:431 skb_copy_datagram_msg include/linux/skbuff.h:3316 [inline] netlink_recvmsg+0x6f9/0x19d0 net/netlink/af_netlink.c:1975 sock_recvmsg_nosec net/socket.c:794 [inline] sock_recvmsg+0x1d1/0x230 net/socket.c:801 ___sys_recvmsg+0x444/0xae0 net/socket.c:2278 __sys_recvmsg net/socket.c:2327 [inline] __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x441119 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffc7f008a8 EFLAGS: 00000207 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000441119 RDX: 0000000000000040 RSI: 00000000200005c0 RDI: 0000000000000003 RBP: 00000000006cc018 R08: 0000000000000100 R09: 0000000000000100 R10: 0000000000000100 R11: 0000000000000207 R12: 0000000000402080 R13: 0000000000402110 R14: 0000000000000000 R15: 0000000000000000 Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline] kmsan_save_stack mm/kmsan/kmsan.c:261 [inline] kmsan_internal_chain_origin+0x13d/0x240 mm/kmsan/kmsan.c:469 kmsan_memcpy_memmove_metadata+0x1a9/0xf70 mm/kmsan/kmsan.c:344 kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:362 __msan_memcpy+0x61/0x70 mm/kmsan/kmsan_instr.c:162 __nla_put lib/nlattr.c:744 [inline] nla_put+0x20a/0x2d0 lib/nlattr.c:802 nlmsg_populate_fdb_fill+0x444/0x810 net/core/rtnetlink.c:3466 nlmsg_populate_fdb net/core/rtnetlink.c:3775 [inline] ndo_dflt_fdb_dump+0x73a/0x960 net/core/rtnetlink.c:3807 rtnl_fdb_dump+0x1318/0x1cb0 net/core/rtnetlink.c:3979 netlink_dump+0xc79/0x1c90 net/netlink/af_netlink.c:2244 __netlink_dump_start+0x10c4/0x11d0 net/netlink/af_netlink.c:2352 netlink_dump_start include/linux/netlink.h:216 [inline] rtnetlink_rcv_msg+0x141b/0x1540 net/core/rtnetlink.c:4910 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline] kmsan_internal_poison_shadow+0x6d/0x130 mm/kmsan/kmsan.c:170 kmsan_kmalloc+0xa1/0x100 mm/kmsan/kmsan_hooks.c:186 __kmalloc+0x14c/0x4d0 mm/slub.c:3825 kmalloc include/linux/slab.h:551 [inline] __hw_addr_create_ex net/core/dev_addr_lists.c:34 [inline] __hw_addr_add_ex net/core/dev_addr_lists.c:80 [inline] __dev_mc_add+0x357/0x8a0 net/core/dev_addr_lists.c:670 dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687 ip_mc_filter_add net/ipv4/igmp.c:1128 [inline] igmp_group_added+0x4d4/0xb80 net/ipv4/igmp.c:1311 __ip_mc_inc_group+0xea9/0xf70 net/ipv4/igmp.c:1444 ip_mc_inc_group net/ipv4/igmp.c:1453 [inline] ip_mc_up+0x1c3/0x400 net/ipv4/igmp.c:1775 inetdev_event+0x1d03/0x1d80 net/ipv4/devinet.c:1522 notifier_call_chain kernel/notifier.c:93 [inline] __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x13d/0x240 kernel/notifier.c:401 __dev_notify_flags+0x3da/0x860 net/core/dev.c:1733 dev_change_flags+0x1ac/0x230 net/core/dev.c:7569 do_setlink+0x165f/0x5ea0 net/core/rtnetlink.c:2492 rtnl_newlink+0x2ad7/0x35a0 net/core/rtnetlink.c:3111 rtnetlink_rcv_msg+0x1148/0x1540 net/core/rtnetlink.c:4947 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 Bytes 36-37 of 105 are uninitialized Memory access of size 105 starts at ffff88819686c000 Data copied to user address 0000000020000380 Fixes: d83b06036048 ("net: add fdb generic dump routine") Signed-off-by: Eric Dumazet Cc: John Fastabend Cc: Ido Schimmel Cc: David Ahern Reviewed-by: Ido Schimmel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d7519c01c9b83129530434baa3d78ee23078400c Author: Christoph Paasch Date: Thu Nov 29 16:01:04 2018 -0800 net: Prevent invalid access to skb->prev in __qdisc_drop_all [ Upstream commit 9410d386d0a829ace9558336263086c2fbbe8aed ] __qdisc_drop_all() accesses skb->prev to get to the tail of the segment-list. With commit 68d2f84a1368 ("net: gro: properly remove skb from list") the skb-list handling has been changed to set skb->next to NULL and set the list-poison on skb->prev. With that change, __qdisc_drop_all() will panic when it tries to dereference skb->prev. Since commit 992cba7e276d ("net: Add and use skb_list_del_init().") __list_del_entry is used, leaving skb->prev unchanged (thus, pointing to the list-head if it's the first skb of the list). This will make __qdisc_drop_all modify the next-pointer of the list-head and result in a panic later on: [ 34.501053] general protection fault: 0000 [#1] SMP KASAN PTI [ 34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108 [ 34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011 [ 34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90 [ 34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04 [ 34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202 [ 34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6 [ 34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038 [ 34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062 [ 34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000 [ 34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008 [ 34.512082] FS: 0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000 [ 34.513036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0 [ 34.514593] Call Trace: [ 34.514893] [ 34.515157] napi_gro_receive+0x93/0x150 [ 34.515632] receive_buf+0x893/0x3700 [ 34.516094] ? __netif_receive_skb+0x1f/0x1a0 [ 34.516629] ? virtnet_probe+0x1b40/0x1b40 [ 34.517153] ? __stable_node_chain+0x4d0/0x850 [ 34.517684] ? kfree+0x9a/0x180 [ 34.518067] ? __kasan_slab_free+0x171/0x190 [ 34.518582] ? detach_buf+0x1df/0x650 [ 34.519061] ? lapic_next_event+0x5a/0x90 [ 34.519539] ? virtqueue_get_buf_ctx+0x280/0x7f0 [ 34.520093] virtnet_poll+0x2df/0xd60 [ 34.520533] ? receive_buf+0x3700/0x3700 [ 34.521027] ? qdisc_watchdog_schedule_ns+0xd5/0x140 [ 34.521631] ? htb_dequeue+0x1817/0x25f0 [ 34.522107] ? sch_direct_xmit+0x142/0xf30 [ 34.522595] ? virtqueue_napi_schedule+0x26/0x30 [ 34.523155] net_rx_action+0x2f6/0xc50 [ 34.523601] ? napi_complete_done+0x2f0/0x2f0 [ 34.524126] ? kasan_check_read+0x11/0x20 [ 34.524608] ? _raw_spin_lock+0x7d/0xd0 [ 34.525070] ? _raw_spin_lock_bh+0xd0/0xd0 [ 34.525563] ? kvm_guest_apic_eoi_write+0x6b/0x80 [ 34.526130] ? apic_ack_irq+0x9e/0xe0 [ 34.526567] __do_softirq+0x188/0x4b5 [ 34.527015] irq_exit+0x151/0x180 [ 34.527417] do_IRQ+0xdb/0x150 [ 34.527783] common_interrupt+0xf/0xf [ 34.528223] This patch makes sure that skb->prev is set to NULL when entering netem_enqueue. Cc: Prashant Bhole Cc: Tyler Hicks Cc: Eric Dumazet Fixes: 68d2f84a1368 ("net: gro: properly remove skb from list") Suggested-by: Eric Dumazet Signed-off-by: Christoph Paasch Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a1ec658a588b6d18c1cc1a5760c34e1dcfca1696 Author: Heiner Kallweit Date: Mon Dec 3 08:19:33 2018 +0100 net: phy: don't allow __set_phy_supported to add unsupported modes [ Upstream commit d2a36971ef595069b7a600d1144c2e0881a930a1 ] Currently __set_phy_supported allows to add modes w/o checking whether the PHY supports them. This is wrong, it should never add modes but only remove modes we don't want to support. The commit marked as fixed didn't do anything wrong, it just copied existing functionality to the helper which is being fixed now. Fixes: f3a6bd393c2c ("phylib: Add phy_set_max_speed helper") Signed-off-by: Heiner Kallweit Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8832e33045b5219357fe7c859fda385902c1d052 Author: Su Yanjun Date: Mon Dec 3 15:33:07 2018 +0800 net: 8139cp: fix a BUG triggered by changing mtu with network traffic [ Upstream commit a5d4a89245ead1f37ed135213653c5beebea4237 ] When changing mtu many times with traffic, a bug is triggered: [ 1035.684037] kernel BUG at lib/dynamic_queue_limits.c:26! [ 1035.684042] invalid opcode: 0000 [#1] SMP [ 1035.684049] Modules linked in: loop binfmt_misc 8139cp(OE) macsec tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag tcp_lp fuse uinput xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter devlink ip6_tables iptable_filter sunrpc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep ppdev snd_seq iosf_mbi crc32_pclmul parport_pc snd_seq_device ghash_clmulni_intel parport snd_pcm aesni_intel joydev lrw snd_timer virtio_balloon sg gf128mul glue_helper ablk_helper cryptd snd soundcore i2c_piix4 pcspkr ip_tables xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic ata_generic [ 1035.684102] pata_acpi virtio_console qxl drm_kms_helper syscopyarea sysfillrect sysimgblt floppy fb_sys_fops crct10dif_pclmul crct10dif_common ttm crc32c_intel serio_raw ata_piix drm libata 8139too virtio_pci drm_panel_orientation_quirks virtio_ring virtio mii dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 8139cp] [ 1035.684132] CPU: 9 PID: 25140 Comm: if-mtu-change Kdump: loaded Tainted: G OE ------------ T 3.10.0-957.el7.x86_64 #1 [ 1035.684134] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 1035.684136] task: ffff8f59b1f5a080 ti: ffff8f5a2e32c000 task.ti: ffff8f5a2e32c000 [ 1035.684149] RIP: 0010:[] [] dql_completed+0x180/0x190 [ 1035.684162] RSP: 0000:ffff8f5a75483e50 EFLAGS: 00010093 [ 1035.684162] RAX: 00000000000000c2 RBX: ffff8f5a6f91c000 RCX: 0000000000000000 [ 1035.684162] RDX: 0000000000000000 RSI: 0000000000000184 RDI: ffff8f599fea3ec0 [ 1035.684162] RBP: ffff8f5a75483ea8 R08: 00000000000000c2 R09: 0000000000000000 [ 1035.684162] R10: 00000000000616ef R11: ffff8f5a75483b56 R12: ffff8f599fea3e00 [ 1035.684162] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000184 [ 1035.684162] FS: 00007fa8434de740(0000) GS:ffff8f5a75480000(0000) knlGS:0000000000000000 [ 1035.684162] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1035.684162] CR2: 00000000004305d0 CR3: 000000024eb66000 CR4: 00000000001406e0 [ 1035.684162] Call Trace: [ 1035.684162] [ 1035.684162] [] ? cp_interrupt+0x478/0x580 [8139cp] [ 1035.684162] [] __handle_irq_event_percpu+0x44/0x1c0 [ 1035.684162] [] handle_irq_event_percpu+0x32/0x80 [ 1035.684162] [] handle_irq_event+0x3c/0x60 [ 1035.684162] [] handle_fasteoi_irq+0x59/0x110 [ 1035.684162] [] handle_irq+0xe4/0x1a0 [ 1035.684162] [] do_IRQ+0x4d/0xf0 [ 1035.684162] [] common_interrupt+0x162/0x162 [ 1035.684162] [ 1035.684162] [] ? __wake_up_bit+0x24/0x70 [ 1035.684162] [] ? do_set_pte+0xd5/0x120 [ 1035.684162] [] unlock_page+0x2b/0x30 [ 1035.684162] [] do_read_fault.isra.61+0x139/0x1b0 [ 1035.684162] [] handle_pte_fault+0x2f4/0xd10 [ 1035.684162] [] handle_mm_fault+0x39d/0x9b0 [ 1035.684162] [] __do_page_fault+0x203/0x500 [ 1035.684162] [] trace_do_page_fault+0x56/0x150 [ 1035.684162] [] do_async_page_fault+0x22/0xf0 [ 1035.684162] [] async_page_fault+0x28/0x30 [ 1035.684162] Code: 54 c7 47 54 ff ff ff ff 44 0f 49 ce 48 8b 35 48 2f 9c 00 48 89 77 58 e9 fe fe ff ff 0f 1f 80 00 00 00 00 41 89 d1 e9 ef fe ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 8d 42 ff 48 [ 1035.684162] RIP [] dql_completed+0x180/0x190 [ 1035.684162] RSP It's not the same as in 7fe0ee09 patch described. As 8139cp uses shared irq mode, other device irq will trigger cp_interrupt to execute. cp_change_mtu -> cp_close -> cp_open In cp_close routine just before free_irq(), some interrupt may occur. In my environment, cp_interrupt exectutes and IntrStatus is 0x4, exactly TxOk. That will cause cp_tx to wake device queue. As device queue is started, cp_start_xmit and cp_open will run at same time which will cause kernel BUG. For example: [#] for tx descriptor At start: [#][#][#] num_queued=3 After cp_init_hw->cp_start_hw->netdev_reset_queue: [#][#][#] num_queued=0 When 8139cp starts to work then cp_tx will check num_queued mismatchs the complete_bytes. The patch will check IntrMask before check IntrStatus in cp_interrupt. When 8139cp interrupt is disabled, just return. Signed-off-by: Su Yanjun Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 298e11444788a49c7b8b26abe2761e26d0426a01 Author: Stefano Brivio Date: Thu Dec 6 19:30:36 2018 +0100 ipv6: Check available headroom in ip6_xmit() even without options [ Upstream commit 66033f47ca60294a95fc85ec3a3cc909dab7b765 ] Even if we send an IPv6 packet without options, MAX_HEADER might not be enough to account for the additional headroom required by alignment of hardware headers. On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL, sending short SCTP packets over IPv4 over L2TP over IPv6, we start with 100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54 bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2(). Those would be enough to append our 14 bytes header, but we're going to align that to 16 bytes, and write 2 bytes out of the allocated slab in neigh_hh_output(). KASan says: [ 264.967848] ================================================================== [ 264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70 [ 264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201 [ 264.967870] [ 264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1 [ 264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) [ 264.967887] Call Trace: [ 264.967896] ([<00000000001347d6>] show_stack+0x56/0xa0) [ 264.967903] [<00000000017e379c>] dump_stack+0x23c/0x290 [ 264.967912] [<00000000007bc594>] print_address_description+0xf4/0x290 [ 264.967919] [<00000000007bc8fc>] kasan_report+0x13c/0x240 [ 264.967927] [<000000000162f5e4>] ip6_finish_output2+0x1aec/0x1c70 [ 264.967935] [<000000000163f890>] ip6_finish_output+0x430/0x7f0 [ 264.967943] [<000000000163fe44>] ip6_output+0x1f4/0x580 [ 264.967953] [<000000000163882a>] ip6_xmit+0xfea/0x1ce8 [ 264.967963] [<00000000017396e2>] inet6_csk_xmit+0x282/0x3f8 [ 264.968033] [<000003ff805fb0ba>] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core] [ 264.968037] [<000003ff80631192>] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth] [ 264.968041] [<0000000001220020>] dev_hard_start_xmit+0x268/0x928 [ 264.968069] [<0000000001330e8e>] sch_direct_xmit+0x7ae/0x1350 [ 264.968071] [<000000000122359c>] __dev_queue_xmit+0x2b7c/0x3478 [ 264.968075] [<00000000013d2862>] ip_finish_output2+0xce2/0x11a0 [ 264.968078] [<00000000013d9b14>] ip_finish_output+0x56c/0x8c8 [ 264.968081] [<00000000013ddd1e>] ip_output+0x226/0x4c0 [ 264.968083] [<00000000013dbd6c>] __ip_queue_xmit+0x894/0x1938 [ 264.968100] [<000003ff80bc3a5c>] sctp_packet_transmit+0x29d4/0x3648 [sctp] [ 264.968116] [<000003ff80b7bf68>] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp] [ 264.968131] [<000003ff80b7c716>] sctp_outq_flush+0x22e/0x7d8 [sctp] [ 264.968146] [<000003ff80b35c68>] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp] [ 264.968161] [<000003ff80b3410a>] sctp_do_sm+0x222/0x648 [sctp] [ 264.968177] [<000003ff80bbddac>] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp] [ 264.968192] [<000003ff80b93328>] __sctp_connect+0x830/0xc20 [sctp] [ 264.968208] [<000003ff80bb11ce>] sctp_inet_connect+0x2e6/0x378 [sctp] [ 264.968212] [<0000000001197942>] __sys_connect+0x21a/0x450 [ 264.968215] [<000000000119aff8>] sys_socketcall+0x3d0/0xb08 [ 264.968218] [<000000000184ea7a>] system_call+0x2a2/0x2c0 [...] Just like ip_finish_output2() does for IPv4, check that we have enough headroom in ip6_xmit(), and reallocate it if we don't. This issue is older than git history. Reported-by: Jianlin Shi Signed-off-by: Stefano Brivio Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman