author | Tomáš Mózes <tomas.mozes@gmail.com> | 2024-10-02 07:59:17 +0200 |
---|---|---|
committer | Tomáš Mózes <tomas.mozes@gmail.com> | 2024-10-02 07:59:17 +0200 |
commit | befc038ba7247e93c8b224942fcca2c4a9e32717 | |
tree | 950eaff689ddf97a580c2969891a193643bce8fb | |
parent | Xen 4.18.3-pre-patchset-0 | |
Xen 4.19.1-pre-patchset-0 (HEAD, 4.19.1-pre-patchset-0, main)
Signed-off-by: Tomáš Mózes <tomas.mozes@gmail.com>
92 files changed, 3028 insertions, 4591 deletions
diff --git a/0001-update-Xen-version-to-4.19.1-pre.patch b/0001-update-Xen-version-to-4.19.1-pre.patch new file mode 100644 index 0000000..8801862 --- /dev/null +++ b/0001-update-Xen-version-to-4.19.1-pre.patch @@ -0,0 +1,164 @@ +From f97db9b3bc3deac4eead160106a3f6de2ccce81d Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Thu, 8 Aug 2024 13:43:19 +0200 +Subject: [PATCH 01/35] update Xen version to 4.19.1-pre + +--- + Config.mk | 2 - + MAINTAINERS | 106 +++++---------------------------------------------- + xen/Makefile | 2 +- + 3 files changed, 10 insertions(+), 100 deletions(-) + +diff --git a/Config.mk b/Config.mk +index ac8fb847ce..03a89624c7 100644 +--- a/Config.mk ++++ b/Config.mk +@@ -234,8 +234,6 @@ ETHERBOOT_NICS ?= rtl8139 8086100e + + QEMU_TRADITIONAL_URL ?= https://xenbits.xen.org/git-http/qemu-xen-traditional.git + QEMU_TRADITIONAL_REVISION ?= xen-4.19.0 +-# Wed Jul 15 10:01:40 2020 +0100 +-# qemu-trad: remove Xen path dependencies + + # Specify which qemu-dm to use. This may be `ioemu' to use the old + # Mercurial in-tree version, or a local directory, or a git URL. +diff --git a/MAINTAINERS b/MAINTAINERS +index 2b0c894527..fe81ed63ad 100644 +--- a/MAINTAINERS ++++ b/MAINTAINERS +@@ -54,6 +54,15 @@ list. Remember to copy the appropriate stable branch maintainer who + will be listed in this section of the MAINTAINERS file in the + appropriate branch. + ++The maintainer for this branch is: ++ ++ Jan Beulich <jbeulich@suse.com> ++ ++Tools backport requests should also be copied to: ++ ++ Anthony Perard <anthony.perard@citrix.com> ++ ++ + Unstable Subsystem Maintainers + ============================== + +@@ -104,103 +113,6 @@ Descriptions of section entries: + xen-maintainers-<version format number of this file> + + +- Check-in policy +- =============== +- +-In order for a patch to be checked in, in general, several conditions +-must be met: +- +-1. 
In order to get a change to a given file committed, it must have +- the approval of at least one maintainer of that file. +- +- A patch of course needs Acks from the maintainers of each file that +- it changes; so a patch which changes xen/arch/x86/traps.c, +- xen/arch/x86/mm/p2m.c, and xen/arch/x86/mm/shadow/multi.c would +- require an Ack from each of the three sets of maintainers. +- +- See below for rules on nested maintainership. +- +-2. Each change must have appropriate approval from someone other than +- the person who wrote it. This can be either: +- +- a. An Acked-by from a maintainer of the code being touched (a +- co-maintainer if available, or a more general level maintainer if +- not available; see the secton on nested maintainership) +- +- b. A Reviewed-by by anyone of suitable stature in the community +- +-3. Sufficient time must have been given for anyone to respond. This +- depends in large part upon the urgency and nature of the patch. +- For a straightforward uncontroversial patch, a day or two may be +- sufficient; for a controversial patch, a week or two may be better. +- +-4. There must be no "open" objections. +- +-In a case where one person submits a patch and a maintainer gives an +-Ack, the Ack stands in for both the approval requirement (#1) and the +-Acked-by-non-submitter requirement (#2). +- +-In a case where a maintainer themselves submits a patch, the +-Signed-off-by meets the approval requirement (#1); so a Review +-from anyone in the community suffices for requirement #2. +- +-Before a maintainer checks in their own patch with another community +-member's R-b but no co-maintainer Ack, it is especially important to +-give their co-maintainer opportunity to give feedback, perhaps +-declaring their intention to check it in without their co-maintainers +-ack a day before doing so. 
+- +-In the case where two people collaborate on a patch, at least one of +-whom is a maintainer -- typically where one maintainer will do an +-early version of the patch, and another maintainer will pick it up and +-revise it -- there should be two Signed-off-by's and one Acked-by or +-Reviewed-by; with the maintainer who did the most recent change +-sending the patch, and an Acked-by or Reviewed-by coming from the +-maintainer who did not most recently edit the patch. This satisfies +-the requirement #2 because a) the Signed-off-by of the sender approves +-the final version of the patch; including all parts of the patch that +-the sender did not write b) the Reviewed-by approves the final version +-of the patch, including all patches that the reviewer did not write. +-Thus all code in the patch has been approved by someone who did not +-write it. +- +-Maintainers may choose to override non-maintainer objections in the +-case that consensus can't be reached. +- +-As always, no policy can cover all possible situations. In +-exceptional circumstances, committers may commit a patch in absence of +-one or more of the above requirements, if they are reasonably +-confident that the other maintainers will approve of their decision in +-retrospect. +- +- The meaning of nesting +- ====================== +- +-Many maintainership areas are "nested": for example, there are entries +-for xen/arch/x86 as well as xen/arch/x86/mm, and even +-xen/arch/x86/mm/shadow; and there is a section at the end called "THE +-REST" which lists all committers. The meaning of nesting is that: +- +-1. Under normal circumstances, the Ack of the most specific maintainer +-is both necessary and sufficient to get a change to a given file +-committed. So a change to xen/arch/x86/mm/shadow/multi.c requires the +-the Ack of the xen/arch/x86/mm/shadow maintainer for that part of the +-patch, but would not require the Ack of the xen/arch/x86 maintainer or +-the xen/arch/x86/mm maintainer. +- +-2. 
In unusual circumstances, a more general maintainer's Ack can stand +-in for or even overrule a specific maintainer's Ack. Unusual +-circumstances might include: +- - The patch is fixing a high-priority issue causing immediate pain, +- and the more specific maintainer is not available. +- - The more specific maintainer has not responded either to the +- original patch, nor to "pings", within a reasonable amount of time. +- - The more general maintainer wants to overrule the more specific +- maintainer on some issue. (This should be exceptional.) +- - In the case of a disagreement between maintainers, THE REST can +- settle the matter by majority vote. (This should be very exceptional +- indeed.) +- + + Maintainers List (try to look for most precise areas first) + +diff --git a/xen/Makefile b/xen/Makefile +index 16055101fb..59dac504b3 100644 +--- a/xen/Makefile ++++ b/xen/Makefile +@@ -6,7 +6,7 @@ this-makefile := $(call lastword,$(MAKEFILE_LIST)) + # All other places this is stored (eg. compile.h) should be autogenerated. + export XEN_VERSION = 4 + export XEN_SUBVERSION = 19 +-export XEN_EXTRAVERSION ?= .0$(XEN_VENDORVERSION) ++export XEN_EXTRAVERSION ?= .1-pre$(XEN_VENDORVERSION) + export XEN_FULLVERSION = $(XEN_VERSION).$(XEN_SUBVERSION)$(XEN_EXTRAVERSION) + -include xen-version + +-- +2.46.1 + diff --git a/0001-x86-entry-Fix-build-with-older-toolchains.patch b/0001-x86-entry-Fix-build-with-older-toolchains.patch deleted file mode 100644 index ad6e76a..0000000 --- a/0001-x86-entry-Fix-build-with-older-toolchains.patch +++ /dev/null @@ -1,32 +0,0 @@ -From 2d38302c33b117aa9a417056db241aefc840c2f0 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Tue, 9 Apr 2024 21:39:51 +0100 -Subject: [PATCH 01/56] x86/entry: Fix build with older toolchains - -Binutils older than 2.29 doesn't know INCSSPD. 
- -Fixes: 8e186f98ce0e ("x86: Use indirect calls in reset-stack infrastructure") -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> -(cherry picked from commit a9fa82500818a8d8ce5f2843f1577bd2c29d088e) ---- - xen/arch/x86/x86_64/entry.S | 2 ++ - 1 file changed, 2 insertions(+) - -diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S -index ad7dd3b23b..054fcb225f 100644 ---- a/xen/arch/x86/x86_64/entry.S -+++ b/xen/arch/x86/x86_64/entry.S -@@ -643,7 +643,9 @@ ENTRY(continue_pv_domain) - * JMPed to. Drop the return address. - */ - add $8, %rsp -+#ifdef CONFIG_XEN_SHSTK - ALTERNATIVE "", "mov $2, %eax; incsspd %eax", X86_FEATURE_XEN_SHSTK -+#endif - - call check_wakeup_from_wait - ret_from_intr: --- -2.45.2 - diff --git a/0002-altcall-fix-__alt_call_maybe_initdata-so-it-s-safe-f.patch b/0002-altcall-fix-__alt_call_maybe_initdata-so-it-s-safe-f.patch deleted file mode 100644 index 05ecd83..0000000 --- a/0002-altcall-fix-__alt_call_maybe_initdata-so-it-s-safe-f.patch +++ /dev/null @@ -1,49 +0,0 @@ -From 8bdcb0b98b53140102031ceca0611f22190227fd Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Mon, 29 Apr 2024 09:35:21 +0200 -Subject: [PATCH 02/56] altcall: fix __alt_call_maybe_initdata so it's safe for - livepatch -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Setting alternative call variables as __init is not safe for use with -livepatch, as livepatches can rightfully introduce new alternative calls to -structures marked as __alt_call_maybe_initdata (possibly just indirectly due to -replacing existing functions that use those). Attempting to resolve those -alternative calls then results in page faults as the variable that holds the -function pointer address has been freed. - -When livepatch is supported use the __ro_after_init attribute instead of -__initdata for __alt_call_maybe_initdata. 
- -Fixes: f26bb285949b ('xen: Implement xen/alternative-call.h for use in common code') -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: af4cd0a6a61cdb03bc1afca9478b05b0c9703599 -master date: 2024-04-11 18:51:36 +0100 ---- - xen/include/xen/alternative-call.h | 7 ++++++- - 1 file changed, 6 insertions(+), 1 deletion(-) - -diff --git a/xen/include/xen/alternative-call.h b/xen/include/xen/alternative-call.h -index 5c6b9a562b..10f7d7637e 100644 ---- a/xen/include/xen/alternative-call.h -+++ b/xen/include/xen/alternative-call.h -@@ -50,7 +50,12 @@ - - #include <asm/alternative.h> - --#define __alt_call_maybe_initdata __initdata -+#ifdef CONFIG_LIVEPATCH -+/* Must keep for livepatches to resolve alternative calls. */ -+# define __alt_call_maybe_initdata __ro_after_init -+#else -+# define __alt_call_maybe_initdata __initdata -+#endif - - #else - --- -2.45.2 - diff --git a/0002-bunzip2-fix-rare-decompression-failure.patch b/0002-bunzip2-fix-rare-decompression-failure.patch new file mode 100644 index 0000000..79e8339 --- /dev/null +++ b/0002-bunzip2-fix-rare-decompression-failure.patch @@ -0,0 +1,39 @@ +From e54077cbca7149c8fa856535b69a4c70dfd48cd2 Mon Sep 17 00:00:00 2001 +From: Ross Lagerwall <ross.lagerwall@citrix.com> +Date: Thu, 8 Aug 2024 13:44:26 +0200 +Subject: [PATCH 02/35] bunzip2: fix rare decompression failure + +The decompression code parses a huffman tree and counts the number of +symbols for a given bit length. In rare cases, there may be >= 256 +symbols with a given bit length, causing the unsigned char to overflow. +This causes a decompression failure later when the code tries and fails to +find the bit length for a given symbol. + +Since the maximum number of symbols is 258, use unsigned short instead. 
+ +Fixes: ab77e81f6521 ("x86/dom0: support bzip2 and lzma compressed bzImage payloads") +Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> +Acked-by: Jan Beulich <jbeulich@suse.com> +master commit: 303d3ff85c90ee4af4bad4e3b1d4932fa2634d64 +master date: 2024-07-30 11:55:56 +0200 +--- + xen/common/bunzip2.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/xen/common/bunzip2.c b/xen/common/bunzip2.c +index 4466426941..79f17162b1 100644 +--- a/xen/common/bunzip2.c ++++ b/xen/common/bunzip2.c +@@ -221,7 +221,8 @@ static int __init get_next_block(struct bunzip_data *bd) + RUNB) */ + symCount = symTotal+2; + for (j = 0; j < groupCount; j++) { +- unsigned char length[MAX_SYMBOLS], temp[MAX_HUFCODE_BITS+1]; ++ unsigned char length[MAX_SYMBOLS]; ++ unsigned short temp[MAX_HUFCODE_BITS+1]; + int minLen, maxLen, pp; + /* Read Huffman code lengths for each symbol. They're + stored in a way similar to mtf; record a starting +-- +2.46.1 + diff --git a/0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch b/0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch new file mode 100644 index 0000000..ccdb369 --- /dev/null +++ b/0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch @@ -0,0 +1,150 @@ +From d2ecc1f231b90d4e54394e25a9aef9be42c0d196 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper <andrew.cooper3@citrix.com> +Date: Thu, 8 Aug 2024 13:44:56 +0200 +Subject: [PATCH 03/35] XSM/domctl: Fix permission checks on + XEN_DOMCTL_createdomain + +The XSM checks for XEN_DOMCTL_createdomain are problematic. There's a split +between xsm_domctl() called early, and flask_domain_create() called quite late +during domain construction. + +All XSM implementations except Flask have a simple IS_PRIV check in +xsm_domctl(), and operate as expected when an unprivileged domain tries to +make a hypercall. 
+ +Flask however foregoes any action in xsm_domctl() and defers everything, +including the simple "is the caller permitted to create a domain" check, to +flask_domain_create(). + +As a consequence, when XSM Flask is active, and irrespective of the policy +loaded, all domains irrespective of privilege can: + + * Mutate the global 'rover' variable, used to track the next free domid. + Therefore, all domains can cause a domid wraparound, and combined with a + voluntary reboot, choose their own domid. + + * Cause a reasonable amount of a domain to be constructed before ultimately + failing for permission reasons, including the use of settings outside of + supported limits. + +In order to remediate this, pass the ssidref into xsm_domctl() and at least +check that the calling domain privileged enough to create domains. + +Take the opportunity to also fix the sign of the cmd parameter to be unsigned. + +This issue has not been assigned an XSA, because Flask is experimental and not +security supported. + +Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com> +Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Daniel P. 
Smith <dpsmith@apertussolutions.com> +master commit: ee32b9b29af449d38aad0a1b3a81aaae586f5ea7 +master date: 2024-07-30 17:42:17 +0100 +--- + xen/arch/x86/mm/paging.c | 2 +- + xen/common/domctl.c | 4 +++- + xen/include/xsm/dummy.h | 2 +- + xen/include/xsm/xsm.h | 7 ++++--- + xen/xsm/flask/hooks.c | 14 ++++++++++++-- + 5 files changed, 21 insertions(+), 8 deletions(-) + +diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c +index bca320fffa..dd47bde5ce 100644 +--- a/xen/arch/x86/mm/paging.c ++++ b/xen/arch/x86/mm/paging.c +@@ -767,7 +767,7 @@ long do_paging_domctl_cont( + if ( d == NULL ) + return -ESRCH; + +- ret = xsm_domctl(XSM_OTHER, d, op.cmd); ++ ret = xsm_domctl(XSM_OTHER, d, op.cmd, 0 /* SSIDref not applicable */); + if ( !ret ) + { + if ( domctl_lock_acquire() ) +diff --git a/xen/common/domctl.c b/xen/common/domctl.c +index 2c0331bb05..ea16b75910 100644 +--- a/xen/common/domctl.c ++++ b/xen/common/domctl.c +@@ -322,7 +322,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) + break; + } + +- ret = xsm_domctl(XSM_OTHER, d, op->cmd); ++ ret = xsm_domctl(XSM_OTHER, d, op->cmd, ++ /* SSIDRef only applicable for cmd == createdomain */ ++ op->u.createdomain.ssidref); + if ( ret ) + goto domctl_out_unlock_domonly; + +diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h +index 00d2cbebf2..7956f27a29 100644 +--- a/xen/include/xsm/dummy.h ++++ b/xen/include/xsm/dummy.h +@@ -162,7 +162,7 @@ static XSM_INLINE int cf_check xsm_set_target( + } + + static XSM_INLINE int cf_check xsm_domctl( +- XSM_DEFAULT_ARG struct domain *d, int cmd) ++ XSM_DEFAULT_ARG struct domain *d, unsigned int cmd, uint32_t ssidref) + { + XSM_ASSERT_ACTION(XSM_OTHER); + switch ( cmd ) +diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h +index 8dad03fd3d..627c0d2731 100644 +--- a/xen/include/xsm/xsm.h ++++ b/xen/include/xsm/xsm.h +@@ -60,7 +60,7 @@ struct xsm_ops { + int (*domctl_scheduler_op)(struct domain *d, int op); + int (*sysctl_scheduler_op)(int 
op); + int (*set_target)(struct domain *d, struct domain *e); +- int (*domctl)(struct domain *d, int cmd); ++ int (*domctl)(struct domain *d, unsigned int cmd, uint32_t ssidref); + int (*sysctl)(int cmd); + int (*readconsole)(uint32_t clear); + +@@ -248,9 +248,10 @@ static inline int xsm_set_target( + return alternative_call(xsm_ops.set_target, d, e); + } + +-static inline int xsm_domctl(xsm_default_t def, struct domain *d, int cmd) ++static inline int xsm_domctl(xsm_default_t def, struct domain *d, ++ unsigned int cmd, uint32_t ssidref) + { +- return alternative_call(xsm_ops.domctl, d, cmd); ++ return alternative_call(xsm_ops.domctl, d, cmd, ssidref); + } + + static inline int xsm_sysctl(xsm_default_t def, int cmd) +diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c +index 5e88c71b8e..278ad38c2a 100644 +--- a/xen/xsm/flask/hooks.c ++++ b/xen/xsm/flask/hooks.c +@@ -663,12 +663,22 @@ static int cf_check flask_set_target(struct domain *d, struct domain *t) + return rc; + } + +-static int cf_check flask_domctl(struct domain *d, int cmd) ++static int cf_check flask_domctl(struct domain *d, unsigned int cmd, ++ uint32_t ssidref) + { + switch ( cmd ) + { +- /* These have individual XSM hooks (common/domctl.c) */ + case XEN_DOMCTL_createdomain: ++ /* ++ * There is a later hook too, but at this early point simply check ++ * that the calling domain is privileged enough to create a domain. ++ * ++ * Note that d is NULL because we haven't even allocated memory for it ++ * this early in XEN_DOMCTL_createdomain. 
++ */ ++ return avc_current_has_perm(ssidref, SECCLASS_DOMAIN, DOMAIN__CREATE, NULL); ++ ++ /* These have individual XSM hooks (common/domctl.c) */ + case XEN_DOMCTL_getdomaininfo: + case XEN_DOMCTL_scheduler_op: + case XEN_DOMCTL_irq_permission: +-- +2.46.1 + diff --git a/0003-x86-rtc-Avoid-UIP-flag-being-set-for-longer-than-exp.patch b/0003-x86-rtc-Avoid-UIP-flag-being-set-for-longer-than-exp.patch deleted file mode 100644 index 8307630..0000000 --- a/0003-x86-rtc-Avoid-UIP-flag-being-set-for-longer-than-exp.patch +++ /dev/null @@ -1,57 +0,0 @@ -From af0e9ba44a58c87d6d135d8ffbf468b4ceac0a41 Mon Sep 17 00:00:00 2001 -From: Ross Lagerwall <ross.lagerwall@citrix.com> -Date: Mon, 29 Apr 2024 09:36:04 +0200 -Subject: [PATCH 03/56] x86/rtc: Avoid UIP flag being set for longer than - expected - -In a test, OVMF reported an error initializing the RTC without -indicating the precise nature of the error. The only plausible -explanation I can find is as follows: - -As part of the initialization, OVMF reads register C and then reads -register A repatedly until the UIP flag is not set. If this takes longer -than 100 ms, OVMF fails and reports an error. This may happen with the -following sequence of events: - -At guest time=0s, rtc_init() calls check_update_timer() which schedules -update_timer for t=(1 - 244us). - -At t=1s, the update_timer function happens to have been called >= 244us -late. In the timer callback, it sets the UIP flag and schedules -update_timer2 for t=1s. - -Before update_timer2 runs, the guest reads register C which calls -check_update_timer(). check_update_timer() stops the scheduled -update_timer2 and since the guest time is now outside of the update -cycle, it schedules update_timer for t=(2 - 244us). - -The UIP flag will therefore be set for a whole second from t=1 to t=2 -while the guest repeatedly reads register A waiting for the UIP flag to -clear. Fix it by clearing the UIP flag when scheduling update_timer. 
- -I was able to reproduce this issue with a synthetic test and this -resolves the issue. - -Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 43a07069863b419433dee12c9b58c1f7ce70aa97 -master date: 2024-04-23 14:09:18 +0200 ---- - xen/arch/x86/hvm/rtc.c | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c -index 206b4296e9..4839374352 100644 ---- a/xen/arch/x86/hvm/rtc.c -+++ b/xen/arch/x86/hvm/rtc.c -@@ -202,6 +202,7 @@ static void check_update_timer(RTCState *s) - } - else - { -+ s->hw.cmos_data[RTC_REG_A] &= ~RTC_UIP; - next_update_time = (USEC_PER_SEC - guest_usec - 244) * NS_PER_USEC; - expire_time = NOW() + next_update_time; - s->next_update_time = expire_time; --- -2.45.2 - diff --git a/0004-x86-MTRR-correct-inadvertently-inverted-WC-check.patch b/0004-x86-MTRR-correct-inadvertently-inverted-WC-check.patch deleted file mode 100644 index ed7754d..0000000 --- a/0004-x86-MTRR-correct-inadvertently-inverted-WC-check.patch +++ /dev/null @@ -1,36 +0,0 @@ -From eb7059767c82d833ebecdf8106e96482b04f3c40 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Mon, 29 Apr 2024 09:36:37 +0200 -Subject: [PATCH 04/56] x86/MTRR: correct inadvertently inverted WC check -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The ! clearly got lost by mistake. 
- -Fixes: e9e0eb30d4d6 ("x86/MTRR: avoid several indirect calls") -Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 77e25f0e30ddd11e043e6fce84bf108ce7de5b6f -master date: 2024-04-23 14:13:48 +0200 ---- - xen/arch/x86/cpu/mtrr/main.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/xen/arch/x86/cpu/mtrr/main.c b/xen/arch/x86/cpu/mtrr/main.c -index 55a4da54a7..90b235f57e 100644 ---- a/xen/arch/x86/cpu/mtrr/main.c -+++ b/xen/arch/x86/cpu/mtrr/main.c -@@ -316,7 +316,7 @@ int mtrr_add_page(unsigned long base, unsigned long size, - } - - /* If the type is WC, check that this processor supports it */ -- if ((type == X86_MT_WC) && mtrr_have_wrcomb()) { -+ if ((type == X86_MT_WC) && !mtrr_have_wrcomb()) { - printk(KERN_WARNING - "mtrr: your processor doesn't support write-combining\n"); - return -EOPNOTSUPP; --- -2.45.2 - diff --git a/0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch b/0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch new file mode 100644 index 0000000..40dbb9f --- /dev/null +++ b/0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch @@ -0,0 +1,38 @@ +From adf1939b51a0a2fa596f7acca0989bfe56cab307 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> +Date: Thu, 8 Aug 2024 13:45:28 +0200 +Subject: [PATCH 04/35] x86/dom0: fix restoring %cr3 and the mapcache override + on PV build error +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +One of the error paths in the PV dom0 builder section that runs on the guest +page-tables wasn't restoring the Xen value of %cr3, neither removing the +mapcache override. 
+ +Fixes: 079ff2d32c3d ('libelf-loader: introduce elf_load_image') +Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: 1fc3f77113dd43b14fa7ef5936dcdba120c0b63f +master date: 2024-07-31 12:41:02 +0200 +--- + xen/arch/x86/pv/dom0_build.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c +index d8043fa58a..57e58a02e7 100644 +--- a/xen/arch/x86/pv/dom0_build.c ++++ b/xen/arch/x86/pv/dom0_build.c +@@ -825,6 +825,8 @@ int __init dom0_construct_pv(struct domain *d, + rc = elf_load_binary(&elf); + if ( rc < 0 ) + { ++ mapcache_override_current(NULL); ++ switch_cr3_cr4(current->arch.cr3, read_cr4()); + printk("Failed to load the kernel binary\n"); + goto out; + } +-- +2.46.1 + diff --git a/0005-x86-altcall-further-refine-clang-workaround.patch b/0005-x86-altcall-further-refine-clang-workaround.patch new file mode 100644 index 0000000..7107099 --- /dev/null +++ b/0005-x86-altcall-further-refine-clang-workaround.patch @@ -0,0 +1,73 @@ +From ee032f29972b8c58db9fcf96650f9cbc083edca8 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> +Date: Thu, 8 Aug 2024 13:45:58 +0200 +Subject: [PATCH 05/35] x86/altcall: further refine clang workaround +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The current code in ALT_CALL_ARG() won't successfully workaround the clang +code-generation issue if the arg parameter has a size that's not a power of 2. +While there are no such sized parameters at the moment, improve the workaround +to also be effective when such sizes are used. + +Instead of using a union with a long use an unsigned long that's first +initialized to 0 and afterwards set to the argument value. 
+ +Reported-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> +Suggested-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> +Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: 561cba38ff551383a628dc93e64ab0691cfc92bf +master date: 2024-07-31 12:41:22 +0200 +--- + xen/arch/x86/include/asm/alternative.h | 26 ++++++++++++-------------- + 1 file changed, 12 insertions(+), 14 deletions(-) + +diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h +index e63b459276..a86eadfaec 100644 +--- a/xen/arch/x86/include/asm/alternative.h ++++ b/xen/arch/x86/include/asm/alternative.h +@@ -169,27 +169,25 @@ extern void alternative_branches(void); + + #ifdef CONFIG_CC_IS_CLANG + /* +- * Use a union with an unsigned long in order to prevent clang from +- * skipping a possible truncation of the value. By using the union any +- * truncation is carried before the call instruction, in turn covering +- * for ABI-non-compliance in that the necessary clipping / extension of +- * the value is supposed to be carried out in the callee. ++ * Clang doesn't follow the psABI and doesn't truncate parameter values at the ++ * callee. This can lead to bad code being generated when using alternative ++ * calls. + * +- * Note this behavior is not mandated by the standard, and hence could +- * stop being a viable workaround, or worse, could cause a different set +- * of code-generation issues in future clang versions. ++ * Workaround it by using a temporary intermediate variable that's zeroed ++ * before being assigned the parameter value, as that forces clang to zero the ++ * register at the caller. 
+ * + * This has been reported upstream: + * https://github.com/llvm/llvm-project/issues/12579 + * https://github.com/llvm/llvm-project/issues/82598 + */ + #define ALT_CALL_ARG(arg, n) \ +- register union { \ +- typeof(arg) e[sizeof(long) / sizeof(arg)]; \ +- unsigned long r; \ +- } a ## n ## _ asm ( ALT_CALL_arg ## n ) = { \ +- .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })\ +- } ++ register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = ({ \ ++ unsigned long tmp = 0; \ ++ BUILD_BUG_ON(sizeof(arg) > sizeof(unsigned long)); \ ++ *(typeof(arg) *)&tmp = (arg); \ ++ tmp; \ ++ }) + #else + #define ALT_CALL_ARG(arg, n) \ + register typeof(arg) a ## n ## _ asm ( ALT_CALL_arg ## n ) = \ +-- +2.46.1 + diff --git a/0005-x86-spec-fix-reporting-of-BHB-clearing-usage-from-gu.patch b/0005-x86-spec-fix-reporting-of-BHB-clearing-usage-from-gu.patch deleted file mode 100644 index bad0428..0000000 --- a/0005-x86-spec-fix-reporting-of-BHB-clearing-usage-from-gu.patch +++ /dev/null @@ -1,69 +0,0 @@ -From 0b0c7dca70d64c35c86e5d503f67366ebe2b9138 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Mon, 29 Apr 2024 09:37:04 +0200 -Subject: [PATCH 05/56] x86/spec: fix reporting of BHB clearing usage from - guest entry points -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Reporting whether the BHB clearing on entry is done for the different domains -types based on cpu_has_bhb_seq is unhelpful, as that variable signals whether -there's a BHB clearing sequence selected, but that alone doesn't imply that -such sequence is used from the PV and/or HVM entry points. - -Instead use opt_bhb_entry_{pv,hvm} which do signal whether BHB clearing is -performed on entry from PV/HVM. 
- -Fixes: 689ad48ce9cf ('x86/spec-ctrl: Wire up the Native-BHI software sequences') -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: 049ab0b2c9f1f5edb54b505fef0bc575787dafe9 -master date: 2024-04-25 16:35:56 +0200 ---- - xen/arch/x86/spec_ctrl.c | 8 ++++---- - 1 file changed, 4 insertions(+), 4 deletions(-) - -diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c -index ba4349a024..8c67d6256a 100644 ---- a/xen/arch/x86/spec_ctrl.c -+++ b/xen/arch/x86/spec_ctrl.c -@@ -634,7 +634,7 @@ static void __init print_details(enum ind_thunk thunk) - (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) || - boot_cpu_has(X86_FEATURE_SC_RSB_HVM) || - boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) || -- cpu_has_bhb_seq || amd_virt_spec_ctrl || -+ opt_bhb_entry_hvm || amd_virt_spec_ctrl || - opt_eager_fpu || opt_verw_hvm) ? "" : " None", - boot_cpu_has(X86_FEATURE_SC_MSR_HVM) ? " MSR_SPEC_CTRL" : "", - (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) || -@@ -643,7 +643,7 @@ static void __init print_details(enum ind_thunk thunk) - opt_eager_fpu ? " EAGER_FPU" : "", - opt_verw_hvm ? " VERW" : "", - boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) ? " IBPB-entry" : "", -- cpu_has_bhb_seq ? " BHB-entry" : ""); -+ opt_bhb_entry_hvm ? " BHB-entry" : ""); - - #endif - #ifdef CONFIG_PV -@@ -651,14 +651,14 @@ static void __init print_details(enum ind_thunk thunk) - (boot_cpu_has(X86_FEATURE_SC_MSR_PV) || - boot_cpu_has(X86_FEATURE_SC_RSB_PV) || - boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV) || -- cpu_has_bhb_seq || -+ opt_bhb_entry_pv || - opt_eager_fpu || opt_verw_pv) ? "" : " None", - boot_cpu_has(X86_FEATURE_SC_MSR_PV) ? " MSR_SPEC_CTRL" : "", - boot_cpu_has(X86_FEATURE_SC_RSB_PV) ? " RSB" : "", - opt_eager_fpu ? " EAGER_FPU" : "", - opt_verw_pv ? " VERW" : "", - boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV) ? " IBPB-entry" : "", -- cpu_has_bhb_seq ? " BHB-entry" : ""); -+ opt_bhb_entry_pv ? 
" BHB-entry" : ""); - - printk(" XPTI (64-bit PV only): Dom0 %s, DomU %s (with%s PCID)\n", - opt_xpti_hwdom ? "enabled" : "disabled", --- -2.45.2 - diff --git a/0006-x86-spec-adjust-logic-that-elides-lfence.patch b/0006-x86-spec-adjust-logic-that-elides-lfence.patch deleted file mode 100644 index 6da96c4..0000000 --- a/0006-x86-spec-adjust-logic-that-elides-lfence.patch +++ /dev/null @@ -1,75 +0,0 @@ -From f0ff1d9cb96041a84a24857a6464628240deed4f Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Mon, 29 Apr 2024 09:37:29 +0200 -Subject: [PATCH 06/56] x86/spec: adjust logic that elides lfence -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -It's currently too restrictive by just checking whether there's a BHB clearing -sequence selected. It should instead check whether BHB clearing is used on -entry from PV or HVM specifically. - -Switch to use opt_bhb_entry_{pv,hvm} instead, and then remove cpu_has_bhb_seq -since it no longer has any users. 
- -Reported-by: Jan Beulich <jbeulich@suse.com> -Fixes: 954c983abcee ('x86/spec-ctrl: Software BHB-clearing sequences') -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: 656ae8f1091bcefec9c46ec3ea3ac2118742d4f6 -master date: 2024-04-25 16:37:01 +0200 ---- - xen/arch/x86/include/asm/cpufeature.h | 3 --- - xen/arch/x86/spec_ctrl.c | 6 +++--- - 2 files changed, 3 insertions(+), 6 deletions(-) - -diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h -index 7a312c485e..3c57f55de0 100644 ---- a/xen/arch/x86/include/asm/cpufeature.h -+++ b/xen/arch/x86/include/asm/cpufeature.h -@@ -228,9 +228,6 @@ static inline bool boot_cpu_has(unsigned int feat) - #define cpu_bug_fpu_ptrs boot_cpu_has(X86_BUG_FPU_PTRS) - #define cpu_bug_null_seg boot_cpu_has(X86_BUG_NULL_SEG) - --#define cpu_has_bhb_seq (boot_cpu_has(X86_SPEC_BHB_TSX) || \ -- boot_cpu_has(X86_SPEC_BHB_LOOPS)) -- - enum _cache_type { - CACHE_TYPE_NULL = 0, - CACHE_TYPE_DATA = 1, -diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c -index 8c67d6256a..12c19b7eca 100644 ---- a/xen/arch/x86/spec_ctrl.c -+++ b/xen/arch/x86/spec_ctrl.c -@@ -2328,7 +2328,7 @@ void __init init_speculation_mitigations(void) - * unconditional WRMSR. If we do have it, or we're not using any - * prior conditional block, then it's safe to drop the LFENCE. - */ -- if ( !cpu_has_bhb_seq && -+ if ( !opt_bhb_entry_pv && - (boot_cpu_has(X86_FEATURE_SC_MSR_PV) || - !boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV)) ) - setup_force_cpu_cap(X86_SPEC_NO_LFENCE_ENTRY_PV); -@@ -2344,7 +2344,7 @@ void __init init_speculation_mitigations(void) - * active in the block that is skipped when interrupting guest - * context, then it's safe to drop the LFENCE. 
- */ -- if ( !cpu_has_bhb_seq && -+ if ( !opt_bhb_entry_pv && - (boot_cpu_has(X86_FEATURE_SC_MSR_PV) || - (!boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV) && - !boot_cpu_has(X86_FEATURE_SC_RSB_PV))) ) -@@ -2356,7 +2356,7 @@ void __init init_speculation_mitigations(void) - * A BHB sequence, if used, is the only conditional action, so if we - * don't have it, we don't need the safety LFENCE. - */ -- if ( !cpu_has_bhb_seq ) -+ if ( !opt_bhb_entry_hvm ) - setup_force_cpu_cap(X86_SPEC_NO_LFENCE_ENTRY_VMX); - } - --- -2.45.2 - diff --git a/0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch b/0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch new file mode 100644 index 0000000..86189a6 --- /dev/null +++ b/0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch @@ -0,0 +1,113 @@ +From b37580d5e984770266783b639552a97c36ecb58a Mon Sep 17 00:00:00 2001 +From: Juergen Gross <jgross@suse.com> +Date: Thu, 8 Aug 2024 13:46:21 +0200 +Subject: [PATCH 06/35] xen/sched: fix error handling in cpu_schedule_up() + +In case cpu_schedule_up() is failing, it needs to undo all externally +visible changes it has done before. + +Reason is that cpu_schedule_callback() won't be called with the +CPU_UP_CANCELED notifier in case cpu_schedule_up() did fail. 
+ +Fixes: 207589dbacd4 ("xen/sched: move per cpu scheduler private data into struct sched_resource") +Reported-by: Jan Beulich <jbeulich@suse.com> +Signed-off-by: Juergen Gross <jgross@suse.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: 44a7d4f0a5e9eae41a44a162e54ff6d2ebe5b7d6 +master date: 2024-07-31 14:50:18 +0200 +--- + xen/common/sched/core.c | 63 +++++++++++++++++++++-------------------- + 1 file changed, 33 insertions(+), 30 deletions(-) + +diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c +index d84b65f197..c466711e9e 100644 +--- a/xen/common/sched/core.c ++++ b/xen/common/sched/core.c +@@ -2755,6 +2755,36 @@ static struct sched_resource *sched_alloc_res(void) + return sr; + } + ++static void cf_check sched_res_free(struct rcu_head *head) ++{ ++ struct sched_resource *sr = container_of(head, struct sched_resource, rcu); ++ ++ free_cpumask_var(sr->cpus); ++ if ( sr->sched_unit_idle ) ++ sched_free_unit_mem(sr->sched_unit_idle); ++ xfree(sr); ++} ++ ++static void cpu_schedule_down(unsigned int cpu) ++{ ++ struct sched_resource *sr; ++ ++ rcu_read_lock(&sched_res_rculock); ++ ++ sr = get_sched_res(cpu); ++ ++ kill_timer(&sr->s_timer); ++ ++ cpumask_clear_cpu(cpu, &sched_res_mask); ++ set_sched_res(cpu, NULL); ++ ++ /* Keep idle unit. 
*/ ++ sr->sched_unit_idle = NULL; ++ call_rcu(&sr->rcu, sched_res_free); ++ ++ rcu_read_unlock(&sched_res_rculock); ++} ++ + static int cpu_schedule_up(unsigned int cpu) + { + struct sched_resource *sr; +@@ -2794,7 +2824,10 @@ static int cpu_schedule_up(unsigned int cpu) + idle_vcpu[cpu]->sched_unit->res = sr; + + if ( idle_vcpu[cpu] == NULL ) ++ { ++ cpu_schedule_down(cpu); + return -ENOMEM; ++ } + + idle_vcpu[cpu]->sched_unit->rendezvous_in_cnt = 0; + +@@ -2812,36 +2845,6 @@ static int cpu_schedule_up(unsigned int cpu) + return 0; + } + +-static void cf_check sched_res_free(struct rcu_head *head) +-{ +- struct sched_resource *sr = container_of(head, struct sched_resource, rcu); +- +- free_cpumask_var(sr->cpus); +- if ( sr->sched_unit_idle ) +- sched_free_unit_mem(sr->sched_unit_idle); +- xfree(sr); +-} +- +-static void cpu_schedule_down(unsigned int cpu) +-{ +- struct sched_resource *sr; +- +- rcu_read_lock(&sched_res_rculock); +- +- sr = get_sched_res(cpu); +- +- kill_timer(&sr->s_timer); +- +- cpumask_clear_cpu(cpu, &sched_res_mask); +- set_sched_res(cpu, NULL); +- +- /* Keep idle unit. */ +- sr->sched_unit_idle = NULL; +- call_rcu(&sr->rcu, sched_res_free); +- +- rcu_read_unlock(&sched_res_rculock); +-} +- + void sched_rm_cpu(unsigned int cpu) + { + int rc; +-- +2.46.1 + diff --git a/0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch b/0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch new file mode 100644 index 0000000..0649616 --- /dev/null +++ b/0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch @@ -0,0 +1,40 @@ +From 97a15007c9606d4c53109754bb21fd593bca589b Mon Sep 17 00:00:00 2001 +From: George Dunlap <george.dunlap@cloud.com> +Date: Thu, 8 Aug 2024 13:47:02 +0200 +Subject: [PATCH 07/35] xen/hvm: Don't skip MSR_READ trace record + +Commit 37f074a3383 ("x86/msr: introduce guest_rdmsr()") introduced a +function to combine the MSR_READ handling between PV and HVM. 
+Unfortunately, by returning directly, it skipped the trace generation, +leading to gaps in the trace record, as well as xenalyze errors like +this: + +hvm_generic_postprocess: d2v0 Strange, exit 7c(VMEXIT_MSR) missing a handler + +Replace the `return` with `goto out`. + +Fixes: 37f074a3383 ("x86/msr: introduce guest_rdmsr()") +Signed-off-by: George Dunlap <george.dunlap@cloud.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: bc8a43fff61ae4162a95d84f4e148d6773667cd2 +master date: 2024-08-02 08:42:09 +0200 +--- + xen/arch/x86/hvm/hvm.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c +index 7f4b627b1f..0fe2b85b16 100644 +--- a/xen/arch/x86/hvm/hvm.c ++++ b/xen/arch/x86/hvm/hvm.c +@@ -3557,7 +3557,7 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) + fixed_range_base = (uint64_t *)v->arch.hvm.mtrr.fixed_ranges; + + if ( (ret = guest_rdmsr(v, msr, msr_content)) != X86EMUL_UNHANDLEABLE ) +- return ret; ++ goto out; + + ret = X86EMUL_OKAY; + +-- +2.46.1 + diff --git a/0007-xen-xsm-Wire-up-get_dom0_console.patch b/0007-xen-xsm-Wire-up-get_dom0_console.patch deleted file mode 100644 index 540541c..0000000 --- a/0007-xen-xsm-Wire-up-get_dom0_console.patch +++ /dev/null @@ -1,66 +0,0 @@ -From 026542c8577ab6af7c1dbc7446547bdc2bc705fd Mon Sep 17 00:00:00 2001 -From: Jason Andryuk <jason.andryuk@amd.com> -Date: Tue, 21 May 2024 10:19:43 +0200 -Subject: [PATCH 07/56] xen/xsm: Wire up get_dom0_console - -An XSM hook for get_dom0_console is currently missing. Using XSM with -a PVH dom0 shows: -(XEN) FLASK: Denying unknown platform_op: 64. - -Wire up the hook, and allow it for dom0. - -Fixes: 4dd160583c ("x86/platform: introduce hypercall to get initial video console settings") -Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> -Acked-by: Daniel P. 
Smith <dpsmith@apertussolutions.com> -master commit: 647f7e50ebeeb8152974cad6a12affe474c74513 -master date: 2024-04-30 08:33:41 +0200 ---- - tools/flask/policy/modules/dom0.te | 2 +- - xen/xsm/flask/hooks.c | 4 ++++ - xen/xsm/flask/policy/access_vectors | 2 ++ - 3 files changed, 7 insertions(+), 1 deletion(-) - -diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te -index f1dcff48e2..16b8c9646d 100644 ---- a/tools/flask/policy/modules/dom0.te -+++ b/tools/flask/policy/modules/dom0.te -@@ -16,7 +16,7 @@ allow dom0_t xen_t:xen { - allow dom0_t xen_t:xen2 { - resource_op psr_cmt_op psr_alloc pmu_ctrl get_symbol - get_cpu_levelling_caps get_cpu_featureset livepatch_op -- coverage_op -+ coverage_op get_dom0_console - }; - - # Allow dom0 to use all XENVER_ subops that have checks. -diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c -index 78225f68c1..5e88c71b8e 100644 ---- a/xen/xsm/flask/hooks.c -+++ b/xen/xsm/flask/hooks.c -@@ -1558,6 +1558,10 @@ static int cf_check flask_platform_op(uint32_t op) - return avc_has_perm(domain_sid(current->domain), SECINITSID_XEN, - SECCLASS_XEN2, XEN2__GET_SYMBOL, NULL); - -+ case XENPF_get_dom0_console: -+ return avc_has_perm(domain_sid(current->domain), SECINITSID_XEN, -+ SECCLASS_XEN2, XEN2__GET_DOM0_CONSOLE, NULL); -+ - default: - return avc_unknown_permission("platform_op", op); - } -diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors -index 4e6710a63e..a35e3d4c51 100644 ---- a/xen/xsm/flask/policy/access_vectors -+++ b/xen/xsm/flask/policy/access_vectors -@@ -99,6 +99,8 @@ class xen2 - livepatch_op - # XEN_SYSCTL_coverage_op - coverage_op -+# XENPF_get_dom0_console -+ get_dom0_console - } - - # Classes domain and domain2 consist of operations that a domain performs on --- -2.45.2 - diff --git a/0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch b/0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch new file mode 100644 index 
0000000..76bb65e --- /dev/null +++ b/0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch @@ -0,0 +1,75 @@ +From e0e84771b61ed985809d105d8f116d4c520542b0 Mon Sep 17 00:00:00 2001 +From: Matthew Barnes <matthew.barnes@cloud.com> +Date: Thu, 8 Aug 2024 13:47:30 +0200 +Subject: [PATCH 08/35] tools/lsevtchn: Use errno macro to handle hypercall + error cases + +Currently, lsevtchn aborts its event channel enumeration when it hits +an event channel that is owned by Xen. + +lsevtchn does not distinguish between different hypercall errors, which +results in lsevtchn missing potential relevant event channels with +higher port numbers. + +Use the errno macro to distinguish between hypercall errors, and +continue event channel enumeration if the hypercall error is not +critical to enumeration. + +Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com> +Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> +master commit: e92a453c8db8bba62d6be3006079e2b9990c3978 +master date: 2024-08-02 08:43:57 +0200 +--- + tools/xcutils/lsevtchn.c | 22 ++++++++++++++++++++-- + 1 file changed, 20 insertions(+), 2 deletions(-) + +diff --git a/tools/xcutils/lsevtchn.c b/tools/xcutils/lsevtchn.c +index d1710613dd..30c8d847b8 100644 +--- a/tools/xcutils/lsevtchn.c ++++ b/tools/xcutils/lsevtchn.c +@@ -3,6 +3,7 @@ + #include <stdint.h> + #include <string.h> + #include <stdio.h> ++#include <errno.h> + + #include <xenctrl.h> + +@@ -24,7 +25,23 @@ int main(int argc, char **argv) + status.port = port; + rc = xc_evtchn_status(xch, &status); + if ( rc < 0 ) +- break; ++ { ++ switch ( errno ) ++ { ++ case EACCES: /* Xen-owned evtchn */ ++ continue; ++ ++ case EINVAL: /* Port enumeration has ended */ ++ rc = 0; ++ break; ++ ++ default: ++ perror("xc_evtchn_status"); ++ rc = 1; ++ break; ++ } ++ goto out; ++ } + + if ( status.status == EVTCHNSTAT_closed ) + continue; +@@ -58,7 +75,8 @@ int main(int argc, char **argv) + printf("\n"); + } + ++ out: + xc_interface_close(xch); + +- return 0; ++ 
return rc; + } +-- +2.46.1 + diff --git a/0008-xen-x86-Fix-Syntax-warning-in-gen-cpuid.py.patch b/0008-xen-x86-Fix-Syntax-warning-in-gen-cpuid.py.patch deleted file mode 100644 index 7c04f23..0000000 --- a/0008-xen-x86-Fix-Syntax-warning-in-gen-cpuid.py.patch +++ /dev/null @@ -1,41 +0,0 @@ -From 47cf06c09a2fa1ee92ea3e7718c8f8e0f1450d88 Mon Sep 17 00:00:00 2001 -From: Jason Andryuk <jason.andryuk@amd.com> -Date: Tue, 21 May 2024 10:20:06 +0200 -Subject: [PATCH 08/56] xen/x86: Fix Syntax warning in gen-cpuid.py - -Python 3.12.2 warns: - -xen/tools/gen-cpuid.py:50: SyntaxWarning: invalid escape sequence '\s' - "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\)" -xen/tools/gen-cpuid.py:51: SyntaxWarning: invalid escape sequence '\s' - "\s+/\*([\w!]*) .*$") - -Specify the strings as raw strings so '\s' is read as literal '\' + 's'. -This avoids escaping all the '\'s in the strings. - -Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: 08e79bba73d74a85d3ce6ff0f91c5205f1e05eda -master date: 2024-04-30 08:34:37 +0200 ---- - xen/tools/gen-cpuid.py | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py -index 02dd45a5ed..415d644db5 100755 ---- a/xen/tools/gen-cpuid.py -+++ b/xen/tools/gen-cpuid.py -@@ -47,8 +47,8 @@ def parse_definitions(state): - """ - feat_regex = re.compile( - r"^XEN_CPUFEATURE\(([A-Z0-9_]+)," -- "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\)" -- "\s+/\*([\w!]*) .*$") -+ r"\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\)" -+ r"\s+/\*([\w!]*) .*$") - - word_regex = re.compile( - r"^/\* .* word (\d*) \*/$") --- -2.45.2 - diff --git a/0009-9pfsd-fix-release-build-with-old-gcc.patch b/0009-9pfsd-fix-release-build-with-old-gcc.patch new file mode 100644 index 0000000..6d6f2ef --- /dev/null +++ b/0009-9pfsd-fix-release-build-with-old-gcc.patch @@ -0,0 +1,33 @@ +From 8ad5a8c5c36add2eee70a7253da4098ebffdb79b Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> 
+Date: Thu, 8 Aug 2024 13:47:44 +0200 +Subject: [PATCH 09/35] 9pfsd: fix release build with old gcc + +Being able to recognize that "par" is reliably initialized on the 1st +loop iteration requires not overly old compilers. + +Fixes: 7809132b1a1d ("tools/xen-9pfsd: add 9pfs response generation support") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Juergen Gross <jgross@suse.com> +master commit: 984cb316cb27b53704c607e640a7dd2763b898ab +master date: 2024-08-02 08:44:22 +0200 +--- + tools/9pfsd/io.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/tools/9pfsd/io.c b/tools/9pfsd/io.c +index df1be3df7d..468e0241f5 100644 +--- a/tools/9pfsd/io.c ++++ b/tools/9pfsd/io.c +@@ -196,7 +196,7 @@ static void fill_buffer_at(void **data, const char *fmt, ...); + static void vfill_buffer_at(void **data, const char *fmt, va_list ap) + { + const char *f; +- const void *par; ++ const void *par = NULL; /* old gcc */ + const char *str_val; + const struct p9_qid *qid; + const struct p9_stat *stat; +-- +2.46.1 + diff --git a/0009-VT-d-correct-ATS-checking-for-root-complex-integrate.patch b/0009-VT-d-correct-ATS-checking-for-root-complex-integrate.patch deleted file mode 100644 index 2d2dc91..0000000 --- a/0009-VT-d-correct-ATS-checking-for-root-complex-integrate.patch +++ /dev/null @@ -1,63 +0,0 @@ -From a4c5bbb9db07b27e66f7c47676b1c888e1bece20 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Tue, 21 May 2024 10:20:58 +0200 -Subject: [PATCH 09/56] VT-d: correct ATS checking for root complex integrated - devices -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Spec version 4.1 says - -"The ATSR structures identifies PCI Express Root-Ports supporting - Address Translation Services (ATS) transactions. Software must enable - ATS on endpoint devices behind a Root Port only if the Root Port is - reported as supporting ATS transactions." 
- -Clearly root complex integrated devices aren't "behind root ports", -matching my observation on a SapphireRapids system having an ATS- -capable root complex integrated device. Hence for such devices we -shouldn't try to locate a corresponding ATSR. - -Since both pci_find_ext_capability() and pci_find_cap_offset() return -"unsigned int", change "pos" to that type at the same time. - -Fixes: 903b93211f56 ("[VTD] laying the ground work for ATS") -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 04e31583bab97e5042a44a1d00fce2760272635f -master date: 2024-05-06 09:22:45 +0200 ---- - xen/drivers/passthrough/vtd/x86/ats.c | 9 +++++++-- - 1 file changed, 7 insertions(+), 2 deletions(-) - -diff --git a/xen/drivers/passthrough/vtd/x86/ats.c b/xen/drivers/passthrough/vtd/x86/ats.c -index 1f5913bed9..61052ef580 100644 ---- a/xen/drivers/passthrough/vtd/x86/ats.c -+++ b/xen/drivers/passthrough/vtd/x86/ats.c -@@ -44,7 +44,7 @@ struct acpi_drhd_unit *find_ats_dev_drhd(struct vtd_iommu *iommu) - int ats_device(const struct pci_dev *pdev, const struct acpi_drhd_unit *drhd) - { - struct acpi_drhd_unit *ats_drhd; -- int pos; -+ unsigned int pos, expfl = 0; - - if ( !ats_enabled || !iommu_qinval ) - return 0; -@@ -53,7 +53,12 @@ int ats_device(const struct pci_dev *pdev, const struct acpi_drhd_unit *drhd) - !ecap_dev_iotlb(drhd->iommu->ecap) ) - return 0; - -- if ( !acpi_find_matched_atsr_unit(pdev) ) -+ pos = pci_find_cap_offset(pdev->sbdf, PCI_CAP_ID_EXP); -+ if ( pos ) -+ expfl = pci_conf_read16(pdev->sbdf, pos + PCI_EXP_FLAGS); -+ -+ if ( MASK_EXTR(expfl, PCI_EXP_FLAGS_TYPE) != PCI_EXP_TYPE_RC_END && -+ !acpi_find_matched_atsr_unit(pdev) ) - return 0; - - ats_drhd = find_ats_dev_drhd(drhd->iommu); --- -2.45.2 - diff --git a/0010-tools-libxs-Open-dev-xen-xenbus-fds-as-O_CLOEXEC.patch b/0010-tools-libxs-Open-dev-xen-xenbus-fds-as-O_CLOEXEC.patch deleted file mode 100644 index 9f9cdd7..0000000 --- 
a/0010-tools-libxs-Open-dev-xen-xenbus-fds-as-O_CLOEXEC.patch +++ /dev/null @@ -1,47 +0,0 @@ -From 2bc52041cacb33a301ebf939d69a021597941186 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Tue, 21 May 2024 10:21:47 +0200 -Subject: [PATCH 10/56] tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC - -The header description for xs_open() goes as far as to suggest that the fd is -O_CLOEXEC, but it isn't actually. - -`xl devd` has been observed leaking /dev/xen/xenbus into children. - -Link: https://github.com/QubesOS/qubes-issues/issues/8292 -Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Juergen Gross <jgross@suse.com> -master commit: f4f2f3402b2f4985d69ffc0d46f845d05fd0b60f -master date: 2024-05-07 15:18:36 +0100 ---- - tools/libs/store/xs.c | 6 +++++- - 1 file changed, 5 insertions(+), 1 deletion(-) - -diff --git a/tools/libs/store/xs.c b/tools/libs/store/xs.c -index 140b9a2839..1498515073 100644 ---- a/tools/libs/store/xs.c -+++ b/tools/libs/store/xs.c -@@ -54,6 +54,10 @@ struct xs_stored_msg { - #include <dlfcn.h> - #endif - -+#ifndef O_CLOEXEC -+#define O_CLOEXEC 0 -+#endif -+ - struct xs_handle { - /* Communications channel to xenstore daemon. 
*/ - int fd; -@@ -227,7 +231,7 @@ error: - static int get_dev(const char *connect_to) - { - /* We cannot open read-only because requests are writes */ -- return open(connect_to, O_RDWR); -+ return open(connect_to, O_RDWR | O_CLOEXEC); - } - - static int all_restrict_cb(Xentoolcore__Active_Handle *ah, domid_t domid) { --- -2.45.2 - diff --git a/0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch b/0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch new file mode 100644 index 0000000..07b592a --- /dev/null +++ b/0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch @@ -0,0 +1,41 @@ +From 033060ee6e05f9e86ef1a51674864b55dc15e62c Mon Sep 17 00:00:00 2001 +From: Matthew Barnes <matthew.barnes@cloud.com> +Date: Thu, 8 Aug 2024 13:48:03 +0200 +Subject: [PATCH 10/35] x86/emul: Fix misaligned IO breakpoint behaviour in PV + guests + +When hardware breakpoints are configured on misaligned IO ports, the +hardware will mask the addresses based on the breakpoint width during +comparison. + +For PV guests, misaligned IO breakpoints do not behave the same way, and +therefore yield different results. + +This patch tweaks the emulation of IO breakpoints for PV guests such +that they reproduce the same behaviour as hardware. 
+ +Fixes: bec9e3205018 ("x86: emulate I/O port access breakpoints") +Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: 08aacc392d86d4c7dbebdb5e664060ae2af72057 +master date: 2024-08-08 13:27:50 +0200 +--- + xen/arch/x86/pv/emul-priv-op.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c +index f101510a1b..aa11ecadaa 100644 +--- a/xen/arch/x86/pv/emul-priv-op.c ++++ b/xen/arch/x86/pv/emul-priv-op.c +@@ -346,6 +346,8 @@ static unsigned int check_guest_io_breakpoint(struct vcpu *v, + case DR_LEN_8: width = 8; break; + } + ++ start &= ~(width - 1UL); ++ + if ( (start < (port + len)) && ((start + width) > port) ) + match |= 1u << i; + } +-- +2.46.1 + diff --git a/0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch b/0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch new file mode 100644 index 0000000..930ea55 --- /dev/null +++ b/0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch @@ -0,0 +1,111 @@ +From c61d4264d26d1ffb26563bfb6dc2f0b06cd72128 Mon Sep 17 00:00:00 2001 +From: Teddy Astie <teddy.astie@vates.tech> +Date: Tue, 13 Aug 2024 16:47:19 +0200 +Subject: [PATCH 11/35] x86/IOMMU: move tracking in iommu_identity_mapping() +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +If for some reason xmalloc() fails after having mapped the reserved +regions, an error is reported, but the regions remain mapped in the P2M. + +Similarly if an error occurs during set_identity_p2m_entry() (except on +the first call), the partial mappings of the region would be retained +without being tracked anywhere, and hence without there being a way to +remove them again from the domain's P2M. + +Move the setting up of the list entry ahead of trying to map the region. 
+In cases other than the first mapping failing, keep record of the full +region, such that a subsequent unmapping request can be properly torn +down. + +To compensate for the potentially excess unmapping requests, don't log a +warning from p2m_remove_identity_entry() when there really was nothing +mapped at a given GFN. + +This is XSA-460 / CVE-2024-31145. + +Fixes: 2201b67b9128 ("VT-d: improve RMRR region handling") +Fixes: c0e19d7c6c42 ("IOMMU: generalize VT-d's tracking of mapped RMRR regions") +Signed-off-by: Teddy Astie <teddy.astie@vates.tech> +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> +master commit: beadd68b5490ada053d72f8a9ce6fd696d626596 +master date: 2024-08-13 16:36:40 +0200 +--- + xen/arch/x86/mm/p2m.c | 8 +++++--- + xen/drivers/passthrough/x86/iommu.c | 30 ++++++++++++++++++++--------- + 2 files changed, 26 insertions(+), 12 deletions(-) + +diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c +index e7e327d6a6..1739133fc2 100644 +--- a/xen/arch/x86/mm/p2m.c ++++ b/xen/arch/x86/mm/p2m.c +@@ -1267,9 +1267,11 @@ int p2m_remove_identity_entry(struct domain *d, unsigned long gfn_l) + else + { + gfn_unlock(p2m, gfn, 0); +- printk(XENLOG_G_WARNING +- "non-identity map d%d:%lx not cleared (mapped to %lx)\n", +- d->domain_id, gfn_l, mfn_x(mfn)); ++ if ( (p2mt != p2m_invalid && p2mt != p2m_mmio_dm) || ++ a != p2m_access_n || !mfn_eq(mfn, INVALID_MFN) ) ++ printk(XENLOG_G_WARNING ++ "non-identity map %pd:%lx not cleared (mapped to %lx)\n", ++ d, gfn_l, mfn_x(mfn)); + ret = 0; + } + +diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c +index cc0062b027..8b1e0596b8 100644 +--- a/xen/drivers/passthrough/x86/iommu.c ++++ b/xen/drivers/passthrough/x86/iommu.c +@@ -267,24 +267,36 @@ int iommu_identity_mapping(struct domain *d, p2m_access_t p2ma, + if ( p2ma == p2m_access_x ) + return -ENOENT; + +- while ( base_pfn < end_pfn ) +- { +- int err = set_identity_p2m_entry(d, 
base_pfn, p2ma, flag); +- +- if ( err ) +- return err; +- base_pfn++; +- } +- + map = xmalloc(struct identity_map); + if ( !map ) + return -ENOMEM; ++ + map->base = base; + map->end = end; + map->access = p2ma; + map->count = 1; ++ ++ /* ++ * Insert into list ahead of mapping, so the range can be found when ++ * trying to clean up. ++ */ + list_add_tail(&map->list, &hd->arch.identity_maps); + ++ for ( ; base_pfn < end_pfn; ++base_pfn ) ++ { ++ int err = set_identity_p2m_entry(d, base_pfn, p2ma, flag); ++ ++ if ( !err ) ++ continue; ++ ++ if ( (map->base >> PAGE_SHIFT_4K) == base_pfn ) ++ { ++ list_del(&map->list); ++ xfree(map); ++ } ++ return err; ++ } ++ + return 0; + } + +-- +2.46.1 + diff --git a/0011-x86-cpu-policy-Fix-migration-from-Ice-Lake-to-Cascad.patch b/0011-x86-cpu-policy-Fix-migration-from-Ice-Lake-to-Cascad.patch deleted file mode 100644 index 26eb3ec..0000000 --- a/0011-x86-cpu-policy-Fix-migration-from-Ice-Lake-to-Cascad.patch +++ /dev/null @@ -1,92 +0,0 @@ -From 0673eae8e53de5007dba35149527579819428323 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Tue, 21 May 2024 10:22:08 +0200 -Subject: [PATCH 11/56] x86/cpu-policy: Fix migration from Ice Lake to Cascade - Lake -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Ever since Xen 4.14, there has been a latent bug with migration. - -While some toolstacks can level the features properly, they don't shink -feat.max_subleaf when all features have been dropped. This is because -we *still* have not completed the toolstack side work for full CPU Policy -objects. - -As a consequence, even when properly feature levelled, VMs can't migrate -"backwards" across hardware which reduces feat.max_subleaf. One such example -is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0). - -Extend the max policies feat.max_subleaf to the hightest number Xen knows -about, but leave the default policies matching the host. 
This will allow VMs -with a higher feat.max_subleaf than strictly necessary to migrate in. - -Eventually we'll manage to teach the toolstack how to avoid creating such VMs -in the first place, but there's still more work to do there. - -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: a2330b51df267e20e66bbba6c5bf08f0570ed58b -master date: 2024-05-07 16:56:46 +0100 ---- - xen/arch/x86/cpu-policy.c | 22 ++++++++++++++++++++++ - 1 file changed, 22 insertions(+) - -diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c -index a822800f52..1aba6ed4ca 100644 ---- a/xen/arch/x86/cpu-policy.c -+++ b/xen/arch/x86/cpu-policy.c -@@ -603,6 +603,13 @@ static void __init calculate_pv_max_policy(void) - unsigned int i; - - *p = host_cpu_policy; -+ -+ /* -+ * Some VMs may have a larger-than-necessary feat max_subleaf. Allow them -+ * to migrate in. -+ */ -+ p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1; -+ - x86_cpu_policy_to_featureset(p, fs); - - for ( i = 0; i < ARRAY_SIZE(fs); ++i ) -@@ -643,6 +650,10 @@ static void __init calculate_pv_def_policy(void) - unsigned int i; - - *p = pv_max_cpu_policy; -+ -+ /* Default to the same max_subleaf as the host. */ -+ p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf; -+ - x86_cpu_policy_to_featureset(p, fs); - - for ( i = 0; i < ARRAY_SIZE(fs); ++i ) -@@ -679,6 +690,13 @@ static void __init calculate_hvm_max_policy(void) - const uint32_t *mask; - - *p = host_cpu_policy; -+ -+ /* -+ * Some VMs may have a larger-than-necessary feat max_subleaf. Allow them -+ * to migrate in. -+ */ -+ p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1; -+ - x86_cpu_policy_to_featureset(p, fs); - - mask = hvm_hap_supported() ? -@@ -780,6 +798,10 @@ static void __init calculate_hvm_def_policy(void) - const uint32_t *mask; - - *p = hvm_max_cpu_policy; -+ -+ /* Default to the same max_subleaf as the host. 
*/ -+ p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf; -+ - x86_cpu_policy_to_featureset(p, fs); - - mask = hvm_hap_supported() ? --- -2.45.2 - diff --git a/0012-x86-pass-through-documents-as-security-unsupported-w.patch b/0012-x86-pass-through-documents-as-security-unsupported-w.patch new file mode 100644 index 0000000..a83553c --- /dev/null +++ b/0012-x86-pass-through-documents-as-security-unsupported-w.patch @@ -0,0 +1,42 @@ +From 3e8a2217f211d49dd771f7918d72df057121109f Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 13 Aug 2024 16:48:13 +0200 +Subject: [PATCH 12/35] x86/pass-through: documents as security-unsupported + when sharing resources + +When multiple devices share resources and one of them is to be passed +through to a guest, security of the entire system and of respective +guests individually cannot really be guaranteed without knowing +internals of any of the involved guests. Therefore such a configuration +cannot really be security-supported, yet making that explicit was so far +missing. + +This is XSA-461 / CVE-2024-31146. + +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Juergen Gross <jgross@suse.com> +master commit: 9c94eda1e3790820699a6de3f6a7c959ecf30600 +master date: 2024-08-13 16:37:25 +0200 +--- + SUPPORT.md | 5 +++++ + 1 file changed, 5 insertions(+) + +diff --git a/SUPPORT.md b/SUPPORT.md +index 8b998d9bc7..1d8b38cbd0 100644 +--- a/SUPPORT.md ++++ b/SUPPORT.md +@@ -841,6 +841,11 @@ This feature is not security supported: see https://xenbits.xen.org/xsa/advisory + + Only systems using IOMMUs are supported. + ++Passing through of devices sharing resources with another device is not ++security supported. Such sharing could e.g. be the same line interrupt being ++used by multiple devices, one of which is to be passed through, or two such ++devices having memory BARs within the same 4k page. 
++ + Not compatible with migration, populate-on-demand, altp2m, + introspection, memory sharing, or memory paging. + +-- +2.46.1 + diff --git a/0012-x86-ucode-Distinguish-ucode-already-up-to-date.patch b/0012-x86-ucode-Distinguish-ucode-already-up-to-date.patch deleted file mode 100644 index dd2f91a..0000000 --- a/0012-x86-ucode-Distinguish-ucode-already-up-to-date.patch +++ /dev/null @@ -1,58 +0,0 @@ -From a42c83b202cc034c43c723cf363dbbabac61b1af Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Tue, 21 May 2024 10:22:52 +0200 -Subject: [PATCH 12/56] x86/ucode: Distinguish "ucode already up to date" -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Right now, Xen returns -ENOENT for both "the provided blob isn't correct for -this CPU", and "the blob isn't newer than what's loaded". - -This in turn causes xen-ucode to exit with an error, when "nothing to do" is -more commonly a success condition. - -Handle EEXIST specially and exit cleanly. - -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 648db37a155aca6f66d4cf3bb118417a728c3579 -master date: 2024-05-09 18:19:49 +0100 ---- - tools/misc/xen-ucode.c | 5 ++++- - xen/arch/x86/cpu/microcode/core.c | 2 +- - 2 files changed, 5 insertions(+), 2 deletions(-) - -diff --git a/tools/misc/xen-ucode.c b/tools/misc/xen-ucode.c -index c6ae6498d6..390969db3d 100644 ---- a/tools/misc/xen-ucode.c -+++ b/tools/misc/xen-ucode.c -@@ -125,8 +125,11 @@ int main(int argc, char *argv[]) - exit(1); - } - -+ errno = 0; - ret = xc_microcode_update(xch, buf, len); -- if ( ret ) -+ if ( ret == -1 && errno == EEXIST ) -+ printf("Microcode already up to date\n"); -+ else if ( ret ) - { - fprintf(stderr, "Failed to update microcode. 
(err: %s)\n", - strerror(errno)); -diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c -index 4e011cdc41..d5338ad345 100644 ---- a/xen/arch/x86/cpu/microcode/core.c -+++ b/xen/arch/x86/cpu/microcode/core.c -@@ -640,7 +640,7 @@ static long cf_check microcode_update_helper(void *data) - "microcode: couldn't find any newer%s revision in the provided blob!\n", - opt_ucode_allow_same ? " (or the same)" : ""); - microcode_free_patch(patch); -- ret = -ENOENT; -+ ret = -EEXIST; - - goto put; - } --- -2.45.2 - diff --git a/0013-automation-disable-Yocto-jobs.patch b/0013-automation-disable-Yocto-jobs.patch new file mode 100644 index 0000000..72fda13 --- /dev/null +++ b/0013-automation-disable-Yocto-jobs.patch @@ -0,0 +1,48 @@ +From 51ae51301f2b4bccd365353f78510c1bdac522c9 Mon Sep 17 00:00:00 2001 +From: Stefano Stabellini <stefano.stabellini@amd.com> +Date: Fri, 9 Aug 2024 23:59:18 -0700 +Subject: [PATCH 13/35] automation: disable Yocto jobs + +The Yocto jobs take a long time to run. We are changing Gitlab ARM64 +runners and the new runners might not be able to finish the Yocto jobs +in a reasonable time. + +For now, disable the Yocto jobs by turning them into "manual" trigger +(they need to be manually executed.) 
+ +Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> +Reviewed-by: Michal Orzel <michal.orzel@amd.com> +master commit: 1c24bca387136d73f88f46ce3db82d34411702e8 +master date: 2024-08-09 23:59:18 -0700 +--- + automation/gitlab-ci/build.yaml | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml +index 7ce88d38e7..09895d1fbd 100644 +--- a/automation/gitlab-ci/build.yaml ++++ b/automation/gitlab-ci/build.yaml +@@ -470,17 +470,20 @@ yocto-qemuarm64: + extends: .yocto-test-arm64 + variables: + YOCTO_BOARD: qemuarm64 ++ when: manual + + yocto-qemuarm: + extends: .yocto-test-arm64 + variables: + YOCTO_BOARD: qemuarm + YOCTO_OUTPUT: --copy-output ++ when: manual + + yocto-qemux86-64: + extends: .yocto-test-x86-64 + variables: + YOCTO_BOARD: qemux86-64 ++ when: manual + + # Cppcheck analysis jobs + +-- +2.46.1 + diff --git a/0013-libxl-fix-population-of-the-online-vCPU-bitmap-for-P.patch b/0013-libxl-fix-population-of-the-online-vCPU-bitmap-for-P.patch deleted file mode 100644 index e5fb285..0000000 --- a/0013-libxl-fix-population-of-the-online-vCPU-bitmap-for-P.patch +++ /dev/null @@ -1,61 +0,0 @@ -From 9966e5413133157a630f7462518005fb898e582a Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Tue, 21 May 2024 10:23:27 +0200 -Subject: [PATCH 13/56] libxl: fix population of the online vCPU bitmap for PVH -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -libxl passes some information to libacpi to create the ACPI table for a PVH -guest, and among that information it's a bitmap of which vCPUs are online -which can be less than the maximum number of vCPUs assigned to the domain. - -While the population of the bitmap is done correctly for HVM based on the -number of online vCPUs, for PVH the population of the bitmap is done based on -the number of maximum vCPUs allowed. 
This leads to all local APIC entries in -the MADT being set as enabled, which contradicts the data in xenstore if vCPUs -is different than maximum vCPUs. - -Fix by copying the internal libxl bitmap that's populated based on the vCPUs -parameter. - -Reported-by: Arthur Borsboom <arthurborsboom@gmail.com> -Link: https://gitlab.com/libvirt/libvirt/-/issues/399 -Reported-by: Leigh Brown <leigh@solinno.co.uk> -Fixes: 14c0d328da2b ('libxl/acpi: Build ACPI tables for HVMlite guests') -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Tested-by: Leigh Brown <leigh@solinno.co.uk> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: 5cc7347b04b2d0a3133754c7a9b936f614ec656a -master date: 2024-05-11 00:13:43 +0100 ---- - tools/libs/light/libxl_x86_acpi.c | 6 +++--- - 1 file changed, 3 insertions(+), 3 deletions(-) - -diff --git a/tools/libs/light/libxl_x86_acpi.c b/tools/libs/light/libxl_x86_acpi.c -index 620f3c700c..5cf261bd67 100644 ---- a/tools/libs/light/libxl_x86_acpi.c -+++ b/tools/libs/light/libxl_x86_acpi.c -@@ -89,7 +89,7 @@ static int init_acpi_config(libxl__gc *gc, - uint32_t domid = dom->guest_domid; - xc_domaininfo_t info; - struct hvm_info_table *hvminfo; -- int i, r, rc; -+ int r, rc; - - config->dsdt_anycpu = config->dsdt_15cpu = dsdt_pvh; - config->dsdt_anycpu_len = config->dsdt_15cpu_len = dsdt_pvh_len; -@@ -138,8 +138,8 @@ static int init_acpi_config(libxl__gc *gc, - hvminfo->nr_vcpus = info.max_vcpu_id + 1; - } - -- for (i = 0; i < hvminfo->nr_vcpus; i++) -- hvminfo->vcpu_online[i / 8] |= 1 << (i & 7); -+ memcpy(hvminfo->vcpu_online, b_info->avail_vcpus.map, -+ b_info->avail_vcpus.size); - - config->hvminfo = hvminfo; - --- -2.45.2 - diff --git a/0014-automation-use-expect-to-run-QEMU.patch b/0014-automation-use-expect-to-run-QEMU.patch new file mode 100644 index 0000000..90e2c62 --- /dev/null +++ b/0014-automation-use-expect-to-run-QEMU.patch @@ -0,0 +1,362 @@ +From 0918434e0fbee48c9dccc5fe262de5a81e380c15 Mon Sep 17 00:00:00 2001 
+From: Stefano Stabellini <stefano.stabellini@amd.com> +Date: Fri, 9 Aug 2024 23:59:20 -0700 +Subject: [PATCH 14/35] automation: use expect to run QEMU + +Use expect to invoke QEMU so that we can terminate the test as soon as +we get the right string in the output instead of waiting until the +final timeout. + +For timeout, instead of an hardcoding the value, use a Gitlab CI +variable "QEMU_TIMEOUT" that can be changed depending on the latest +status of the Gitlab CI runners. + +Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> +Reviewed-by: Michal Orzel <michal.orzel@amd.com> +master commit: c36efb7fcea6ef9f31a20e60ec79ed3ae293feee +master date: 2024-08-09 23:59:20 -0700 +--- + automation/scripts/qemu-alpine-x86_64.sh | 16 +++---- + automation/scripts/qemu-key.exp | 45 +++++++++++++++++++ + automation/scripts/qemu-smoke-dom0-arm32.sh | 16 +++---- + automation/scripts/qemu-smoke-dom0-arm64.sh | 16 +++---- + .../scripts/qemu-smoke-dom0less-arm32.sh | 18 ++++---- + .../scripts/qemu-smoke-dom0less-arm64.sh | 16 +++---- + automation/scripts/qemu-smoke-ppc64le.sh | 13 +++--- + automation/scripts/qemu-smoke-riscv64.sh | 13 +++--- + automation/scripts/qemu-smoke-x86-64.sh | 15 ++++--- + automation/scripts/qemu-xtf-dom0less-arm64.sh | 15 +++---- + 10 files changed, 112 insertions(+), 71 deletions(-) + create mode 100755 automation/scripts/qemu-key.exp + +diff --git a/automation/scripts/qemu-alpine-x86_64.sh b/automation/scripts/qemu-alpine-x86_64.sh +index 8e398dcea3..5359e0820b 100755 +--- a/automation/scripts/qemu-alpine-x86_64.sh ++++ b/automation/scripts/qemu-alpine-x86_64.sh +@@ -77,18 +77,16 @@ EOF + # Run the test + rm -f smoke.serial + set +e +-timeout -k 1 720 \ +-qemu-system-x86_64 \ ++export QEMU_CMD="qemu-system-x86_64 \ + -cpu qemu64,+svm \ + -m 2G -smp 2 \ + -monitor none -serial stdio \ + -nographic \ + -device virtio-net-pci,netdev=n0 \ +- -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& \ +- # Remove carriage returns from the stdout 
output, as gitlab +- # interface chokes on them +- tee smoke.serial | sed 's/\r//' ++ -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0" + +-set -e +-(grep -q "Domain-0" smoke.serial && grep -q "BusyBox" smoke.serial) || exit 1 +-exit 0 ++export QEMU_LOG="smoke.serial" ++export LOG_MSG="Domain-0" ++export PASSED="BusyBox" ++ ++./automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-key.exp b/automation/scripts/qemu-key.exp +new file mode 100755 +index 0000000000..35eb903a31 +--- /dev/null ++++ b/automation/scripts/qemu-key.exp +@@ -0,0 +1,45 @@ ++#!/usr/bin/expect -f ++ ++set timeout $env(QEMU_TIMEOUT) ++ ++log_file -a $env(QEMU_LOG) ++ ++match_max 10000 ++ ++eval spawn $env(QEMU_CMD) ++ ++expect_after { ++ -re "(.*)\r" { ++ exp_continue ++ } ++ timeout {send_error "ERROR-Timeout!\n"; exit 1} ++ eof {send_error "ERROR-EOF!\n"; exit 1} ++} ++ ++if {[info exists env(UBOOT_CMD)]} { ++ expect "=>" ++ ++ send "$env(UBOOT_CMD)\r" ++} ++ ++if {[info exists env(LOG_MSG)]} { ++ expect { ++ "$env(PASSED)" { ++ expect "$env(LOG_MSG)" ++ exit 0 ++ } ++ "$env(LOG_MSG)" { ++ expect "$env(PASSED)" ++ exit 0 ++ } ++ } ++} ++ ++expect { ++ "$env(PASSED)" { ++ exit 0 ++ } ++} ++ ++expect eof ++ +diff --git a/automation/scripts/qemu-smoke-dom0-arm32.sh b/automation/scripts/qemu-smoke-dom0-arm32.sh +index 31c05cc840..bab66bfe44 100755 +--- a/automation/scripts/qemu-smoke-dom0-arm32.sh ++++ b/automation/scripts/qemu-smoke-dom0-arm32.sh +@@ -78,9 +78,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d . 
-c config + + rm -f ${serial_log} + set +e +-echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ +-timeout -k 1 720 \ +-./qemu-system-arm \ ++export QEMU_CMD="./qemu-system-arm \ + -machine virt \ + -machine virtualization=true \ + -smp 4 \ +@@ -91,9 +89,11 @@ timeout -k 1 720 \ + -no-reboot \ + -device virtio-net-pci,netdev=n0 \ + -netdev user,id=n0,tftp=./ \ +- -bios /usr/lib/u-boot/qemu_arm/u-boot.bin |& \ +- tee ${serial_log} | sed 's/\r//' ++ -bios /usr/lib/u-boot/qemu_arm/u-boot.bin" ++ ++export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" ++export QEMU_LOG="${serial_log}" ++export LOG_MSG="Domain-0" ++export PASSED="/ #" + +-set -e +-(grep -q "Domain-0" ${serial_log} && grep -q "^/ #" ${serial_log}) || exit 1 +-exit 0 ++../automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-smoke-dom0-arm64.sh b/automation/scripts/qemu-smoke-dom0-arm64.sh +index 352963a741..0094bfc8e1 100755 +--- a/automation/scripts/qemu-smoke-dom0-arm64.sh ++++ b/automation/scripts/qemu-smoke-dom0-arm64.sh +@@ -94,9 +94,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf + # Run the test + rm -f smoke.serial + set +e +-echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ +-timeout -k 1 720 \ +-./binaries/qemu-system-aarch64 \ ++export QEMU_CMD="./binaries/qemu-system-aarch64 \ + -machine virtualization=true \ + -cpu cortex-a57 -machine type=virt \ + -m 2048 -monitor none -serial stdio \ +@@ -104,9 +102,11 @@ timeout -k 1 720 \ + -no-reboot \ + -device virtio-net-pci,netdev=n0 \ + -netdev user,id=n0,tftp=binaries \ +- -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& \ +- tee smoke.serial | sed 's/\r//' ++ -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" ++ ++export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" ++export QEMU_LOG="smoke.serial" ++export LOG_MSG="Domain-0" ++export PASSED="BusyBox" + +-set -e +-(grep -q "Domain-0" smoke.serial 
&& grep -q "BusyBox" smoke.serial) || exit 1 +-exit 0 ++./automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh b/automation/scripts/qemu-smoke-dom0less-arm32.sh +index c027c8c5c8..68ffbabdb8 100755 +--- a/automation/scripts/qemu-smoke-dom0less-arm32.sh ++++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh +@@ -5,7 +5,7 @@ set -ex + test_variant=$1 + + # Prompt to grep for to check if dom0 booted successfully +-dom0_prompt="^/ #" ++dom0_prompt="/ #" + + serial_log="$(pwd)/smoke.serial" + +@@ -131,9 +131,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config + # Run the test + rm -f ${serial_log} + set +e +-echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ +-timeout -k 1 240 \ +-./qemu-system-arm \ ++export QEMU_CMD="./qemu-system-arm \ + -machine virt \ + -machine virtualization=true \ + -smp 4 \ +@@ -144,9 +142,11 @@ timeout -k 1 240 \ + -no-reboot \ + -device virtio-net-pci,netdev=n0 \ + -netdev user,id=n0,tftp=./ \ +- -bios /usr/lib/u-boot/qemu_arm/u-boot.bin |& \ +- tee ${serial_log} | sed 's/\r//' ++ -bios /usr/lib/u-boot/qemu_arm/u-boot.bin" ++ ++export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" ++export QEMU_LOG="${serial_log}" ++export LOG_MSG="${dom0_prompt}" ++export PASSED="${passed}" + +-set -e +-(grep -q "${dom0_prompt}" ${serial_log} && grep -q "${passed}" ${serial_log}) || exit 1 +-exit 0 ++../automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh +index 15258692d5..eb25c4af4b 100755 +--- a/automation/scripts/qemu-smoke-dom0less-arm64.sh ++++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh +@@ -205,9 +205,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf + # Run the test + rm -f smoke.serial + set +e +-echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ +-timeout -k 1 240 \ 
+-./binaries/qemu-system-aarch64 \ ++export QEMU_CMD="./binaries/qemu-system-aarch64 \ + -machine virtualization=true \ + -cpu cortex-a57 -machine type=virt,gic-version=$gic_version \ + -m 2048 -monitor none -serial stdio \ +@@ -215,9 +213,11 @@ timeout -k 1 240 \ + -no-reboot \ + -device virtio-net-pci,netdev=n0 \ + -netdev user,id=n0,tftp=binaries \ +- -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& \ +- tee smoke.serial | sed 's/\r//' ++ -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" ++ ++export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" ++export QEMU_LOG="smoke.serial" ++export LOG_MSG="Welcome to Alpine Linux" ++export PASSED="${passed}" + +-set -e +-(grep -q "^Welcome to Alpine Linux" smoke.serial && grep -q "${passed}" smoke.serial) || exit 1 +-exit 0 ++./automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-smoke-ppc64le.sh b/automation/scripts/qemu-smoke-ppc64le.sh +index 9088881b73..ccb4a576f4 100755 +--- a/automation/scripts/qemu-smoke-ppc64le.sh ++++ b/automation/scripts/qemu-smoke-ppc64le.sh +@@ -11,8 +11,7 @@ machine=$1 + rm -f ${serial_log} + set +e + +-timeout -k 1 20 \ +-qemu-system-ppc64 \ ++export QEMU_CMD="qemu-system-ppc64 \ + -bios skiboot.lid \ + -M $machine \ + -m 2g \ +@@ -21,9 +20,9 @@ qemu-system-ppc64 \ + -monitor none \ + -nographic \ + -serial stdio \ +- -kernel binaries/xen \ +- |& tee ${serial_log} | sed 's/\r//' ++ -kernel binaries/xen" + +-set -e +-(grep -q "Hello, ppc64le!" ${serial_log}) || exit 1 +-exit 0 ++export QEMU_LOG="${serial_log}" ++export PASSED="Hello, ppc64le!" 
++ ++./automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-smoke-riscv64.sh b/automation/scripts/qemu-smoke-riscv64.sh +index f90df3c051..0355c075b7 100755 +--- a/automation/scripts/qemu-smoke-riscv64.sh ++++ b/automation/scripts/qemu-smoke-riscv64.sh +@@ -6,15 +6,14 @@ set -ex + rm -f smoke.serial + set +e + +-timeout -k 1 2 \ +-qemu-system-riscv64 \ ++export QEMU_CMD="qemu-system-riscv64 \ + -M virt \ + -smp 1 \ + -nographic \ + -m 2g \ +- -kernel binaries/xen \ +- |& tee smoke.serial | sed 's/\r//' ++ -kernel binaries/xen" + +-set -e +-(grep -q "All set up" smoke.serial) || exit 1 +-exit 0 ++export QEMU_LOG="smoke.serial" ++export PASSED="All set up" ++ ++./automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-smoke-x86-64.sh b/automation/scripts/qemu-smoke-x86-64.sh +index 3014d07314..37ac10e068 100755 +--- a/automation/scripts/qemu-smoke-x86-64.sh ++++ b/automation/scripts/qemu-smoke-x86-64.sh +@@ -16,11 +16,12 @@ esac + + rm -f smoke.serial + set +e +-timeout -k 1 30 \ +-qemu-system-x86_64 -nographic -kernel binaries/xen \ ++export QEMU_CMD="qemu-system-x86_64 -nographic -kernel binaries/xen \ + -initrd xtf/tests/example/$k \ +- -append "loglvl=all console=com1 noreboot console_timestamps=boot $extra" \ +- -m 512 -monitor none -serial file:smoke.serial +-set -e +-grep -q 'Test result: SUCCESS' smoke.serial || exit 1 +-exit 0 ++ -append \"loglvl=all console=com1 noreboot console_timestamps=boot $extra\" \ ++ -m 512 -monitor none -serial stdio" ++ ++export QEMU_LOG="smoke.serial" ++export PASSED="Test result: SUCCESS" ++ ++./automation/scripts/qemu-key.exp +diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh b/automation/scripts/qemu-xtf-dom0less-arm64.sh +index b08c2d44fb..0666f6363e 100755 +--- a/automation/scripts/qemu-xtf-dom0less-arm64.sh ++++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh +@@ -51,9 +51,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf + # Run the test + rm -f 
smoke.serial + set +e +-echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ +-timeout -k 1 120 \ +-./binaries/qemu-system-aarch64 \ ++export QEMU_CMD="./binaries/qemu-system-aarch64 \ + -machine virtualization=true \ + -cpu cortex-a57 -machine type=virt \ + -m 2048 -monitor none -serial stdio \ +@@ -61,9 +59,10 @@ timeout -k 1 120 \ + -no-reboot \ + -device virtio-net-pci,netdev=n0 \ + -netdev user,id=n0,tftp=binaries \ +- -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& \ +- tee smoke.serial | sed 's/\r//' ++ -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" ++ ++export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" ++export QEMU_LOG="smoke.serial" ++export PASSED="${passed}" + +-set -e +-(grep -q "${passed}" smoke.serial) || exit 1 +-exit 0 ++./automation/scripts/qemu-key.exp +-- +2.46.1 + diff --git a/0014-libxl-Fix-handling-XenStore-errors-in-device-creatio.patch b/0014-libxl-Fix-handling-XenStore-errors-in-device-creatio.patch deleted file mode 100644 index ac28521..0000000 --- a/0014-libxl-Fix-handling-XenStore-errors-in-device-creatio.patch +++ /dev/null @@ -1,191 +0,0 @@ -From 8271f0e8f23b63199caf0edcfe85ebc1c1412d1b Mon Sep 17 00:00:00 2001 -From: Demi Marie Obenour <demi@invisiblethingslab.com> -Date: Tue, 21 May 2024 10:23:52 +0200 -Subject: [PATCH 14/56] libxl: Fix handling XenStore errors in device creation - -If xenstored runs out of memory it is possible for it to fail operations -that should succeed. libxl wasn't robust against this, and could fail -to ensure that the TTY path of a non-initial console was created and -read-only for guests. This doesn't qualify for an XSA because guests -should not be able to run xenstored out of memory, but it still needs to -be fixed. - -Add the missing error checks to ensure that all errors are properly -handled and that at no point can a guest make the TTY path of its -frontend directory writable. 
- -Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> -Reviewed-by: Juergen Gross <jgross@suse.com> -master commit: 531d3bea5e9357357eaf6d40f5784a1b4c29b910 -master date: 2024-05-11 00:13:43 +0100 ---- - tools/libs/light/libxl_console.c | 11 ++--- - tools/libs/light/libxl_device.c | 72 ++++++++++++++++++++------------ - tools/libs/light/libxl_xshelp.c | 13 ++++-- - 3 files changed, 60 insertions(+), 36 deletions(-) - -diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c -index cd7412a327..a563c9d3c7 100644 ---- a/tools/libs/light/libxl_console.c -+++ b/tools/libs/light/libxl_console.c -@@ -351,11 +351,10 @@ int libxl__device_console_add(libxl__gc *gc, uint32_t domid, - flexarray_append(front, "protocol"); - flexarray_append(front, LIBXL_XENCONSOLE_PROTOCOL); - } -- libxl__device_generic_add(gc, XBT_NULL, device, -- libxl__xs_kvs_of_flexarray(gc, back), -- libxl__xs_kvs_of_flexarray(gc, front), -- libxl__xs_kvs_of_flexarray(gc, ro_front)); -- rc = 0; -+ rc = libxl__device_generic_add(gc, XBT_NULL, device, -+ libxl__xs_kvs_of_flexarray(gc, back), -+ libxl__xs_kvs_of_flexarray(gc, front), -+ libxl__xs_kvs_of_flexarray(gc, ro_front)); - out: - return rc; - } -@@ -665,6 +664,8 @@ int libxl_device_channel_getinfo(libxl_ctx *ctx, uint32_t domid, - */ - if (!val) val = "/NO-SUCH-PATH"; - channelinfo->u.pty.path = strdup(val); -+ if (channelinfo->u.pty.path == NULL) -+ abort(); - break; - default: - break; -diff --git a/tools/libs/light/libxl_device.c b/tools/libs/light/libxl_device.c -index 13da6e0573..3035501f2c 100644 ---- a/tools/libs/light/libxl_device.c -+++ b/tools/libs/light/libxl_device.c -@@ -177,8 +177,13 @@ int libxl__device_generic_add(libxl__gc *gc, xs_transaction_t t, - ro_frontend_perms[1].perms = backend_perms[1].perms = XS_PERM_READ; - - retry_transaction: -- if (create_transaction) -+ if (create_transaction) { - t = xs_transaction_start(ctx->xsh); -+ if (t == XBT_NULL) { -+ LOGED(ERROR, device->domid, 
"xs_transaction_start failed"); -+ return ERROR_FAIL; -+ } -+ } - - /* FIXME: read frontend_path and check state before removing stuff */ - -@@ -195,42 +200,55 @@ retry_transaction: - if (rc) goto out; - } - -- /* xxx much of this function lacks error checks! */ -- - if (fents || ro_fents) { -- xs_rm(ctx->xsh, t, frontend_path); -- xs_mkdir(ctx->xsh, t, frontend_path); -+ if (!xs_rm(ctx->xsh, t, frontend_path) && errno != ENOENT) -+ goto out; -+ if (!xs_mkdir(ctx->xsh, t, frontend_path)) -+ goto out; - /* Console 0 is a special case. It doesn't use the regular PV - * state machine but also the frontend directory has - * historically contained other information, such as the - * vnc-port, which we don't want the guest fiddling with. - */ - if ((device->kind == LIBXL__DEVICE_KIND_CONSOLE && device->devid == 0) || -- (device->kind == LIBXL__DEVICE_KIND_VUART)) -- xs_set_permissions(ctx->xsh, t, frontend_path, -- ro_frontend_perms, ARRAY_SIZE(ro_frontend_perms)); -- else -- xs_set_permissions(ctx->xsh, t, frontend_path, -- frontend_perms, ARRAY_SIZE(frontend_perms)); -- xs_write(ctx->xsh, t, GCSPRINTF("%s/backend", frontend_path), -- backend_path, strlen(backend_path)); -- if (fents) -- libxl__xs_writev_perms(gc, t, frontend_path, fents, -- frontend_perms, ARRAY_SIZE(frontend_perms)); -- if (ro_fents) -- libxl__xs_writev_perms(gc, t, frontend_path, ro_fents, -- ro_frontend_perms, ARRAY_SIZE(ro_frontend_perms)); -+ (device->kind == LIBXL__DEVICE_KIND_VUART)) { -+ if (!xs_set_permissions(ctx->xsh, t, frontend_path, -+ ro_frontend_perms, ARRAY_SIZE(ro_frontend_perms))) -+ goto out; -+ } else { -+ if (!xs_set_permissions(ctx->xsh, t, frontend_path, -+ frontend_perms, ARRAY_SIZE(frontend_perms))) -+ goto out; -+ } -+ if (!xs_write(ctx->xsh, t, GCSPRINTF("%s/backend", frontend_path), -+ backend_path, strlen(backend_path))) -+ goto out; -+ if (fents) { -+ rc = libxl__xs_writev_perms(gc, t, frontend_path, fents, -+ frontend_perms, ARRAY_SIZE(frontend_perms)); -+ if (rc) goto 
out; -+ } -+ if (ro_fents) { -+ rc = libxl__xs_writev_perms(gc, t, frontend_path, ro_fents, -+ ro_frontend_perms, ARRAY_SIZE(ro_frontend_perms)); -+ if (rc) goto out; -+ } - } - - if (bents) { - if (!libxl_only) { -- xs_rm(ctx->xsh, t, backend_path); -- xs_mkdir(ctx->xsh, t, backend_path); -- xs_set_permissions(ctx->xsh, t, backend_path, backend_perms, -- ARRAY_SIZE(backend_perms)); -- xs_write(ctx->xsh, t, GCSPRINTF("%s/frontend", backend_path), -- frontend_path, strlen(frontend_path)); -- libxl__xs_writev(gc, t, backend_path, bents); -+ if (!xs_rm(ctx->xsh, t, backend_path) && errno != ENOENT) -+ goto out; -+ if (!xs_mkdir(ctx->xsh, t, backend_path)) -+ goto out; -+ if (!xs_set_permissions(ctx->xsh, t, backend_path, backend_perms, -+ ARRAY_SIZE(backend_perms))) -+ goto out; -+ if (!xs_write(ctx->xsh, t, GCSPRINTF("%s/frontend", backend_path), -+ frontend_path, strlen(frontend_path))) -+ goto out; -+ rc = libxl__xs_writev(gc, t, backend_path, bents); -+ if (rc) goto out; - } - - /* -@@ -276,7 +294,7 @@ retry_transaction: - out: - if (create_transaction && t) - libxl__xs_transaction_abort(gc, &t); -- return rc; -+ return rc != 0 ? 
rc : ERROR_FAIL; - } - - typedef struct { -diff --git a/tools/libs/light/libxl_xshelp.c b/tools/libs/light/libxl_xshelp.c -index 751cd942d9..a6e34ab10f 100644 ---- a/tools/libs/light/libxl_xshelp.c -+++ b/tools/libs/light/libxl_xshelp.c -@@ -60,10 +60,15 @@ int libxl__xs_writev_perms(libxl__gc *gc, xs_transaction_t t, - for (i = 0; kvs[i] != NULL; i += 2) { - path = GCSPRINTF("%s/%s", dir, kvs[i]); - if (path && kvs[i + 1]) { -- int length = strlen(kvs[i + 1]); -- xs_write(ctx->xsh, t, path, kvs[i + 1], length); -- if (perms) -- xs_set_permissions(ctx->xsh, t, path, perms, num_perms); -+ size_t length = strlen(kvs[i + 1]); -+ if (length > UINT_MAX) -+ return ERROR_FAIL; -+ if (!xs_write(ctx->xsh, t, path, kvs[i + 1], length)) -+ return ERROR_FAIL; -+ if (perms) { -+ if (!xs_set_permissions(ctx->xsh, t, path, perms, num_perms)) -+ return ERROR_FAIL; -+ } - } - } - return 0; --- -2.45.2 - diff --git a/0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch b/0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch new file mode 100644 index 0000000..ce66fe7 --- /dev/null +++ b/0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch @@ -0,0 +1,57 @@ +From 9358a7fad7f0427e7d1666da0c78cef341ee9072 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:27:03 +0200 +Subject: [PATCH 15/35] x86/vLAPIC: prevent undue recursion of vlapic_error() + +With the error vector set to an illegal value, the function invoking +vlapic_set_irq() would bring execution back here, with the non-recursive +lock already held. Avoid the call in this case, merely further updating +ESR (if necessary). + +This is XSA-462 / CVE-2024-45817. 
+ +Fixes: 5f32d186a8b1 ("x86/vlapic: don't silently accept bad vectors") +Reported-by: Federico Serafini <federico.serafini@bugseng.com> +Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> +Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: c42d9ec61f6d11e25fa77bd44dd11dad1edda268 +master date: 2024-09-24 14:23:29 +0200 +--- + xen/arch/x86/hvm/vlapic.c | 17 ++++++++++++++++- + 1 file changed, 16 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c +index 9cfc82666a..46ff758904 100644 +--- a/xen/arch/x86/hvm/vlapic.c ++++ b/xen/arch/x86/hvm/vlapic.c +@@ -112,9 +112,24 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask) + if ( (esr & errmask) != errmask ) + { + uint32_t lvterr = vlapic_get_reg(vlapic, APIC_LVTERR); ++ bool inj = false; + +- vlapic_set_reg(vlapic, APIC_ESR, esr | errmask); + if ( !(lvterr & APIC_LVT_MASKED) ) ++ { ++ /* ++ * If LVTERR is unmasked and has an illegal vector, vlapic_set_irq() ++ * will end up back here. Break the cycle by only injecting LVTERR ++ * if it will succeed, and folding in RECVILL otherwise. 
++ */ ++ if ( (lvterr & APIC_VECTOR_MASK) >= 16 ) ++ inj = true; ++ else ++ errmask |= APIC_ESR_RECVILL; ++ } ++ ++ vlapic_set_reg(vlapic, APIC_ESR, esr | errmask); ++ ++ if ( inj ) + vlapic_set_irq(vlapic, lvterr & APIC_VECTOR_MASK, 0); + } + spin_unlock_irqrestore(&vlapic->esr_lock, flags); +-- +2.46.1 + diff --git a/0015-xen-sched-set-all-sched_resource-data-inside-locked-.patch b/0015-xen-sched-set-all-sched_resource-data-inside-locked-.patch deleted file mode 100644 index a8090d4..0000000 --- a/0015-xen-sched-set-all-sched_resource-data-inside-locked-.patch +++ /dev/null @@ -1,84 +0,0 @@ -From 3999b675cad5b717274d6493899b0eea8896f4d7 Mon Sep 17 00:00:00 2001 -From: Juergen Gross <jgross@suse.com> -Date: Tue, 21 May 2024 10:24:26 +0200 -Subject: [PATCH 15/56] xen/sched: set all sched_resource data inside locked - region for new cpu - -When adding a cpu to a scheduler, set all data items of struct -sched_resource inside the locked region, as otherwise a race might -happen (e.g. when trying to access the cpupool of the cpu): - - (XEN) ----[ Xen-4.19.0-1-d x86_64 debug=y Tainted: H ]---- - (XEN) CPU: 45 - (XEN) RIP: e008:[<ffff82d040244cbf>] common/sched/credit.c#csched_load_balance+0x41/0x877 - (XEN) RFLAGS: 0000000000010092 CONTEXT: hypervisor - (XEN) rax: ffff82d040981618 rbx: ffff82d040981618 rcx: 0000000000000000 - (XEN) rdx: 0000003ff68cd000 rsi: 000000000000002d rdi: ffff83103723d450 - (XEN) rbp: ffff83207caa7d48 rsp: ffff83207caa7b98 r8: 0000000000000000 - (XEN) r9: ffff831037253cf0 r10: ffff83103767c3f0 r11: 0000000000000009 - (XEN) r12: ffff831037237990 r13: ffff831037237990 r14: ffff831037253720 - (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 0000000000f526e0 - (XEN) cr3: 000000005bc2f000 cr2: 0000000000000010 - (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000 - (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 - (XEN) Xen code around <ffff82d040244cbf> (common/sched/credit.c#csched_load_balance+0x41/0x877): - 
(XEN) 48 8b 0c 10 48 8b 49 08 <48> 8b 79 10 48 89 bd b8 fe ff ff 49 8b 4e 28 48 - <snip> - (XEN) Xen call trace: - (XEN) [<ffff82d040244cbf>] R common/sched/credit.c#csched_load_balance+0x41/0x877 - (XEN) [<ffff82d040245a18>] F common/sched/credit.c#csched_schedule+0x36a/0x69f - (XEN) [<ffff82d040252644>] F common/sched/core.c#do_schedule+0xe8/0x433 - (XEN) [<ffff82d0402572dd>] F common/sched/core.c#schedule+0x2e5/0x2f9 - (XEN) [<ffff82d040232f35>] F common/softirq.c#__do_softirq+0x94/0xbe - (XEN) [<ffff82d040232fc8>] F do_softirq+0x13/0x15 - (XEN) [<ffff82d0403075ef>] F arch/x86/domain.c#idle_loop+0x92/0xe6 - (XEN) - (XEN) Pagetable walk from 0000000000000010: - (XEN) L4[0x000] = 000000103ff61063 ffffffffffffffff - (XEN) L3[0x000] = 000000103ff60063 ffffffffffffffff - (XEN) L2[0x000] = 0000001033dff063 ffffffffffffffff - (XEN) L1[0x000] = 0000000000000000 ffffffffffffffff - (XEN) - (XEN) **************************************** - (XEN) Panic on CPU 45: - (XEN) FATAL PAGE FAULT - (XEN) [error_code=0000] - (XEN) Faulting linear address: 0000000000000010 - (XEN) **************************************** - -Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> -Fixes: a8c6c623192e ("sched: clarify use cases of schedule_cpu_switch()") -Signed-off-by: Juergen Gross <jgross@suse.com> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: d104a07524ffc92ae7a70dfe192c291de2a563cc -master date: 2024-05-15 19:59:52 +0100 ---- - xen/common/sched/core.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c -index 34ad39b9ad..3c2403ebcf 100644 ---- a/xen/common/sched/core.c -+++ b/xen/common/sched/core.c -@@ -3179,6 +3179,8 @@ int schedule_cpu_add(unsigned int cpu, struct cpupool *c) - - sr->scheduler = new_ops; - sr->sched_priv = ppriv; -+ sr->granularity = cpupool_get_granularity(c); -+ sr->cpupool = c; - - /* - * Reroute the lock to the 
per pCPU lock as /last/ thing. In fact, -@@ -3191,8 +3193,6 @@ int schedule_cpu_add(unsigned int cpu, struct cpupool *c) - /* _Not_ pcpu_schedule_unlock(): schedule_lock has changed! */ - spin_unlock_irqrestore(old_lock, flags); - -- sr->granularity = cpupool_get_granularity(c); -- sr->cpupool = c; - /* The cpu is added to a pool, trigger it to go pick up some work */ - cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ); - --- -2.45.2 - diff --git a/0016-Arm-correct-FIXADDR_TOP.patch b/0016-Arm-correct-FIXADDR_TOP.patch new file mode 100644 index 0000000..244e873 --- /dev/null +++ b/0016-Arm-correct-FIXADDR_TOP.patch @@ -0,0 +1,58 @@ +From 46a2ce35212c9b35c4818ca9eec918aa4a45cb48 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:28:22 +0200 +Subject: [PATCH 16/35] Arm: correct FIXADDR_TOP + +While reviewing a RISC-V patch cloning the Arm code, I noticed an +off-by-1 here: FIX_PMAP_{BEGIN,END} being an inclusive range and +FIX_LAST being the same as FIX_PMAP_END, FIXADDR_TOP cannot derive from +FIX_LAST alone, or else the BUG_ON() in virt_to_fix() would trigger if +FIX_PMAP_END ended up being used. + +While touching this area also add a check for fixmap and boot FDT area +to not only not overlap, but to have at least one (unmapped) page in +between. 
+ +Fixes: 4f17357b52f6 ("xen/arm: add Persistent Map (PMAP) infrastructure") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Michal Orzel <michal.orzel@amd.com> +master commit: fe3412ab83cc53c2bf2c497be3794bc09751efa5 +master date: 2024-08-13 21:50:55 +0100 +--- + xen/arch/arm/include/asm/fixmap.h | 2 +- + xen/arch/arm/mmu/setup.c | 6 ++++++ + 2 files changed, 7 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/arm/include/asm/fixmap.h b/xen/arch/arm/include/asm/fixmap.h +index a823456ecb..0cb5d54d1c 100644 +--- a/xen/arch/arm/include/asm/fixmap.h ++++ b/xen/arch/arm/include/asm/fixmap.h +@@ -18,7 +18,7 @@ + #define FIX_LAST FIX_PMAP_END + + #define FIXADDR_START FIXMAP_ADDR(0) +-#define FIXADDR_TOP FIXMAP_ADDR(FIX_LAST) ++#define FIXADDR_TOP FIXMAP_ADDR(FIX_LAST + 1) + + #ifndef __ASSEMBLY__ + +diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c +index f4bb424c3c..57042ed57b 100644 +--- a/xen/arch/arm/mmu/setup.c ++++ b/xen/arch/arm/mmu/setup.c +@@ -128,6 +128,12 @@ static void __init __maybe_unused build_assertions(void) + + #undef CHECK_SAME_SLOT + #undef CHECK_DIFFERENT_SLOT ++ ++ /* ++ * Fixmaps must not overlap with boot FDT mapping area. Make sure there's ++ * at least one guard page in between. 
++ */ ++ BUILD_BUG_ON(FIXADDR_TOP >= BOOT_FDT_VIRT_START); + } + + lpae_t __init pte_of_xenaddr(vaddr_t va) +-- +2.46.1 + diff --git a/0016-x86-respect-mapcache_domain_init-failing.patch b/0016-x86-respect-mapcache_domain_init-failing.patch deleted file mode 100644 index db7ddfe..0000000 --- a/0016-x86-respect-mapcache_domain_init-failing.patch +++ /dev/null @@ -1,38 +0,0 @@ -From dfabab2cd9461ef9d21a708461f35d2ae4b55220 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Tue, 21 May 2024 10:25:08 +0200 -Subject: [PATCH 16/56] x86: respect mapcache_domain_init() failing -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The function itself properly handles and hands onwards failure from -create_perdomain_mapping(). Therefore its caller should respect possible -failure, too. - -Fixes: 4b28bf6ae90b ("x86: re-introduce map_domain_page() et al") -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 7270fdc7a0028d4b7b26fd1b36c6b9e97abcf3da -master date: 2024-05-15 19:59:52 +0100 ---- - xen/arch/x86/domain.c | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c -index 307446273a..5feb0d0679 100644 ---- a/xen/arch/x86/domain.c -+++ b/xen/arch/x86/domain.c -@@ -850,7 +850,8 @@ int arch_domain_create(struct domain *d, - } - else if ( is_pv_domain(d) ) - { -- mapcache_domain_init(d); -+ if ( (rc = mapcache_domain_init(d)) != 0 ) -+ goto fail; - - if ( (rc = pv_domain_initialise(d)) != 0 ) - goto fail; --- -2.45.2 - diff --git a/0017-tools-xentop-Fix-cpu-sort-order.patch b/0017-tools-xentop-Fix-cpu-sort-order.patch deleted file mode 100644 index de19ddc..0000000 --- a/0017-tools-xentop-Fix-cpu-sort-order.patch +++ /dev/null @@ -1,76 +0,0 @@ -From f3d20dd31770a70971f4f85521eec1e741d38695 Mon Sep 17 00:00:00 2001 -From: Leigh Brown <leigh@solinno.co.uk> -Date: Tue, 21 May 2024 10:25:30 +0200 
-Subject: [PATCH 17/56] tools/xentop: Fix cpu% sort order - -In compare_cpu_pct(), there is a double -> unsigned long long converion when -calling compare(). In C, this discards the fractional part, resulting in an -out-of order sorting such as: - - NAME STATE CPU(sec) CPU(%) - xendd --b--- 4020 5.7 - icecream --b--- 2600 3.8 - Domain-0 -----r 1060 1.5 - neon --b--- 827 1.1 - cheese --b--- 225 0.7 - pizza --b--- 359 0.5 - cassini --b--- 490 0.4 - fusilli --b--- 159 0.2 - bob --b--- 502 0.2 - blender --b--- 121 0.2 - bread --b--- 69 0.1 - chickpea --b--- 67 0.1 - lentil --b--- 67 0.1 - -Introduce compare_dbl() function and update compare_cpu_pct() to call it. - -Fixes: 49839b535b78 ("Add xenstat framework.") -Signed-off-by: Leigh Brown <leigh@solinno.co.uk> -Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: e27fc7d15eab79e604e8b8728778594accc23cf1 -master date: 2024-05-15 19:59:52 +0100 ---- - tools/xentop/xentop.c | 13 ++++++++++++- - 1 file changed, 12 insertions(+), 1 deletion(-) - -diff --git a/tools/xentop/xentop.c b/tools/xentop/xentop.c -index 545bd5e96d..c2a311befe 100644 ---- a/tools/xentop/xentop.c -+++ b/tools/xentop/xentop.c -@@ -85,6 +85,7 @@ static void set_delay(const char *value); - static void set_prompt(const char *new_prompt, void (*func)(const char *)); - static int handle_key(int); - static int compare(unsigned long long, unsigned long long); -+static int compare_dbl(double, double); - static int compare_domains(xenstat_domain **, xenstat_domain **); - static unsigned long long tot_net_bytes( xenstat_domain *, int); - static bool tot_vbd_reqs(xenstat_domain *, int, unsigned long long *); -@@ -422,6 +423,16 @@ static int compare(unsigned long long i1, unsigned long long i2) - return 0; - } - -+/* Compares two double precision numbers, returning -1,0,1 for <,=,> */ -+static int compare_dbl(double d1, double d2) -+{ -+ if (d1 < d2) -+ return -1; -+ if (d1 > d2) -+ return 1; -+ return 0; -+} -+ - /* Comparison function for use 
with qsort. Compares two domains using the - * current sort field. */ - static int compare_domains(xenstat_domain **domain1, xenstat_domain **domain2) -@@ -523,7 +534,7 @@ static double get_cpu_pct(xenstat_domain *domain) - - static int compare_cpu_pct(xenstat_domain *domain1, xenstat_domain *domain2) - { -- return -compare(get_cpu_pct(domain1), get_cpu_pct(domain2)); -+ return -compare_dbl(get_cpu_pct(domain1), get_cpu_pct(domain2)); - } - - /* Prints cpu percentage statistic */ --- -2.45.2 - diff --git a/0017-xl-fix-incorrect-output-in-help-command.patch b/0017-xl-fix-incorrect-output-in-help-command.patch new file mode 100644 index 0000000..f2ab58a --- /dev/null +++ b/0017-xl-fix-incorrect-output-in-help-command.patch @@ -0,0 +1,36 @@ +From e12998a9db8d0ac14477557d09b437783a999ea4 Mon Sep 17 00:00:00 2001 +From: "John E. Krokes" <mag@netherworld.org> +Date: Tue, 24 Sep 2024 14:29:26 +0200 +Subject: [PATCH 17/35] xl: fix incorrect output in "help" command + +In "xl help", the output includes this line: + + vsnd-list List virtual display devices for a domain + +This should obviously say "sound devices" instead of "display devices". + +Signed-off-by: John E. 
Krokes <mag@netherworld.org> +Reviewed-by: Juergen Gross <jgross@suse.com> +Acked-by: Anthony PERARD <anthony.perard@vates.tech> +master commit: 09226d165b57d919150458044c5b594d3d1dc23a +master date: 2024-08-14 08:49:44 +0200 +--- + tools/xl/xl_cmdtable.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c +index 42751228c1..53fc22d344 100644 +--- a/tools/xl/xl_cmdtable.c ++++ b/tools/xl/xl_cmdtable.c +@@ -433,7 +433,7 @@ const struct cmd_spec cmd_table[] = { + }, + { "vsnd-list", + &main_vsndlist, 0, 0, +- "List virtual display devices for a domain", ++ "List virtual sound devices for a domain", + "<Domain(s)>", + }, + { "vsnd-detach", +-- +2.46.1 + diff --git a/0018-x86-mtrr-avoid-system-wide-rendezvous-when-setting-A.patch b/0018-x86-mtrr-avoid-system-wide-rendezvous-when-setting-A.patch deleted file mode 100644 index a57775d..0000000 --- a/0018-x86-mtrr-avoid-system-wide-rendezvous-when-setting-A.patch +++ /dev/null @@ -1,60 +0,0 @@ -From 7cdb1fa2ab0b5e11f66cada0370770404153c824 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Tue, 21 May 2024 10:25:39 +0200 -Subject: [PATCH 18/56] x86/mtrr: avoid system wide rendezvous when setting AP - MTRRs -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -There's no point in forcing a system wide update of the MTRRs on all processors -when there are no changes to be propagated. On AP startup it's only the AP -that needs to write the system wide MTRR values in order to match the rest of -the already online CPUs. - -We have occasionally seen the watchdog trigger during `xen-hptool cpu-online` -in one Intel Cascade Lake box with 448 CPUs due to the re-setting of the MTRRs -on all the CPUs in the system. - -While there adjust the comment to clarify why the system-wide resetting of the -MTRR registers is not needed for the purposes of mtrr_ap_init(). 
- -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: abd00b037da5ffa4e8c4508a5df0cd6eabb805a4 -master date: 2024-05-15 19:59:52 +0100 ---- - xen/arch/x86/cpu/mtrr/main.c | 15 ++++++++------- - 1 file changed, 8 insertions(+), 7 deletions(-) - -diff --git a/xen/arch/x86/cpu/mtrr/main.c b/xen/arch/x86/cpu/mtrr/main.c -index 90b235f57e..0a44ebbcb0 100644 ---- a/xen/arch/x86/cpu/mtrr/main.c -+++ b/xen/arch/x86/cpu/mtrr/main.c -@@ -573,14 +573,15 @@ void mtrr_ap_init(void) - if (!mtrr_if || hold_mtrr_updates_on_aps) - return; - /* -- * Ideally we should hold mtrr_mutex here to avoid mtrr entries changed, -- * but this routine will be called in cpu boot time, holding the lock -- * breaks it. This routine is called in two cases: 1.very earily time -- * of software resume, when there absolutely isn't mtrr entry changes; -- * 2.cpu hotadd time. We let mtrr_add/del_page hold cpuhotplug lock to -- * prevent mtrr entry changes -+ * hold_mtrr_updates_on_aps takes care of preventing unnecessary MTRR -+ * updates when batch starting the CPUs (see -+ * mtrr_aps_sync_{begin,end}()). -+ * -+ * Otherwise just apply the current system wide MTRR values to this AP. -+ * Note this doesn't require synchronization with the other CPUs, as -+ * there are strictly no modifications of the current MTRR values. 
- */ -- set_mtrr(~0U, 0, 0, 0); -+ mtrr_set_all(); - } - - /** --- -2.45.2 - diff --git a/0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch b/0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch new file mode 100644 index 0000000..cdbc59e --- /dev/null +++ b/0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch @@ -0,0 +1,37 @@ +From e2f29f7bad59c4be53363c8c0d2933982a22d0de Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:30:04 +0200 +Subject: [PATCH 18/35] x86emul: correct #UD check for AVX512-FP16 complex + multiplications + +avx512_vlen_check()'s argument was inverted, while the surrounding +conditional wrongly forced the EVEX.L'L check for the scalar forms when +embedded rounding was in effect. + +Fixes: d14c52cba0f5 ("x86emul: handle AVX512-FP16 complex multiplication insns") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: a30d438ce58b70c5955f5d37f776086ab8f88623 +master date: 2024-08-19 15:32:31 +0200 +--- + xen/arch/x86/x86_emulate/x86_emulate.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 2d5c1de8ec..16557385bf 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -7984,8 +7984,8 @@ x86_emulate( + generate_exception_if(modrm_reg == src1 || + (ea.type != OP_MEM && modrm_reg == modrm_rm), + X86_EXC_UD); +- if ( ea.type != OP_REG || (b & 1) || !evex.brs ) +- avx512_vlen_check(!(b & 1)); ++ if ( ea.type != OP_REG || !evex.brs ) ++ avx512_vlen_check(b & 1); + goto simd_zmm; + } + +-- +2.46.1 + diff --git a/0019-update-Xen-version-to-4.18.3-pre.patch b/0019-update-Xen-version-to-4.18.3-pre.patch deleted file mode 100644 index 34f2b33..0000000 --- a/0019-update-Xen-version-to-4.18.3-pre.patch +++ /dev/null @@ -1,25 +0,0 @@ -From 
01f7a3c792241d348a4e454a30afdf6c0d6cd71c Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Tue, 21 May 2024 11:52:11 +0200 -Subject: [PATCH 19/56] update Xen version to 4.18.3-pre - ---- - xen/Makefile | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/xen/Makefile b/xen/Makefile -index 657f6fa4e3..786ab61600 100644 ---- a/xen/Makefile -+++ b/xen/Makefile -@@ -6,7 +6,7 @@ this-makefile := $(call lastword,$(MAKEFILE_LIST)) - # All other places this is stored (eg. compile.h) should be autogenerated. - export XEN_VERSION = 4 - export XEN_SUBVERSION = 18 --export XEN_EXTRAVERSION ?= .2$(XEN_VENDORVERSION) -+export XEN_EXTRAVERSION ?= .3-pre$(XEN_VENDORVERSION) - export XEN_FULLVERSION = $(XEN_VERSION).$(XEN_SUBVERSION)$(XEN_EXTRAVERSION) - -include xen-version - --- -2.45.2 - diff --git a/0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch b/0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch new file mode 100644 index 0000000..87690da --- /dev/null +++ b/0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch @@ -0,0 +1,140 @@ +From de924e4dbac80ac7d94a2e86c37eecccaa1bc677 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper <andrew.cooper3@citrix.com> +Date: Tue, 24 Sep 2024 14:30:49 +0200 +Subject: [PATCH 19/35] x86/pv: Introduce x86_merge_dr6() and fix do_debug() +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Pretty much everywhere in Xen the logic to update %dr6 when injecting #DB is +buggy. Introduce a new x86_merge_dr6() helper, and start fixing the mess by +adjusting the dr6 merge in do_debug(). Also correct the comment. 
+ +Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> +Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: 54ef601a66e8d812a6a6a308f02524e81201825e +master date: 2024-08-21 23:59:19 +0100 +--- + xen/arch/x86/debug.c | 40 ++++++++++++++++++++++++++++ + xen/arch/x86/include/asm/debugreg.h | 7 +++++ + xen/arch/x86/include/asm/x86-defns.h | 7 +++++ + xen/arch/x86/traps.c | 11 +++++--- + 4 files changed, 62 insertions(+), 3 deletions(-) + +diff --git a/xen/arch/x86/debug.c b/xen/arch/x86/debug.c +index 127fe83021..b10f1f12b6 100644 +--- a/xen/arch/x86/debug.c ++++ b/xen/arch/x86/debug.c +@@ -2,12 +2,52 @@ + /* + * Copyright (C) 2023 XenServer. + */ ++#include <xen/bug.h> + #include <xen/kernel.h> + + #include <xen/lib/x86/cpu-policy.h> + + #include <asm/debugreg.h> + ++/* ++ * Merge new bits into dr6. 'new' is always given in positive polarity, ++ * matching the Intel VMCS PENDING_DBG semantics. ++ * ++ * At the time of writing (August 2024), on the subject of %dr6 updates the ++ * manuals are either vague (Intel "certain exceptions may clear bits 0-3"), ++ * or disputed (AMD makes statements which don't match observed behaviour). ++ * ++ * The only debug exception I can find which doesn't clear the breakpoint bits ++ * is ICEBP(/INT1) on AMD systems. This is also the one source of #DB that ++ * doesn't have an explicit status bit, meaning we can't easily identify this ++ * case either (AMD systems don't virtualise PENDING_DBG and only provide a ++ * post-merge %dr6 value). ++ * ++ * Treat %dr6 merging as unconditionally writing the breakpoint bits. ++ * ++ * We can't really manage any better, and guest kernels handling #DB as ++ * instructed by the SDM/APM (i.e. reading %dr6 then resetting it back to ++ * default) wont notice. ++ */ ++unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6, ++ unsigned int new) ++{ ++ /* Flip dr6 to have positive polarity. 
*/ ++ dr6 ^= X86_DR6_DEFAULT; ++ ++ /* Sanity check that only known values are passed in. */ ++ ASSERT(!(dr6 & ~X86_DR6_KNOWN_MASK)); ++ ASSERT(!(new & ~X86_DR6_KNOWN_MASK)); ++ ++ /* Breakpoint bits overridden. All others accumulate. */ ++ dr6 = (dr6 & ~X86_DR6_BP_MASK) | new; ++ ++ /* Flip dr6 back to having default polarity. */ ++ dr6 ^= X86_DR6_DEFAULT; ++ ++ return x86_adj_dr6_rsvd(p, dr6); ++} ++ + unsigned int x86_adj_dr6_rsvd(const struct cpu_policy *p, unsigned int dr6) + { + unsigned int ones = X86_DR6_DEFAULT; +diff --git a/xen/arch/x86/include/asm/debugreg.h b/xen/arch/x86/include/asm/debugreg.h +index 96c406ad53..6baa725441 100644 +--- a/xen/arch/x86/include/asm/debugreg.h ++++ b/xen/arch/x86/include/asm/debugreg.h +@@ -108,4 +108,11 @@ struct cpu_policy; + unsigned int x86_adj_dr6_rsvd(const struct cpu_policy *p, unsigned int dr6); + unsigned int x86_adj_dr7_rsvd(const struct cpu_policy *p, unsigned int dr7); + ++/* ++ * Merge new bits into dr6. 'new' is always given in positive polarity, ++ * matching the Intel VMCS PENDING_DBG semantics. ++ */ ++unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6, ++ unsigned int new); ++ + #endif /* _X86_DEBUGREG_H */ +diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h +index 3bcdbaccd3..caa92829ea 100644 +--- a/xen/arch/x86/include/asm/x86-defns.h ++++ b/xen/arch/x86/include/asm/x86-defns.h +@@ -132,6 +132,13 @@ + #define X86_DR6_ZEROS _AC(0x00001000, UL) /* %dr6 bits forced to 0 */ + #define X86_DR6_DEFAULT _AC(0xffff0ff0, UL) /* Default %dr6 value */ + ++#define X86_DR6_BP_MASK \ ++ (X86_DR6_B0 | X86_DR6_B1 | X86_DR6_B2 | X86_DR6_B3) ++ ++#define X86_DR6_KNOWN_MASK \ ++ (X86_DR6_BP_MASK | X86_DR6_BLD | X86_DR6_BD | X86_DR6_BS | \ ++ X86_DR6_BT | X86_DR6_RTM) ++ + /* + * Debug control flags in DR7. 
+ */ +diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c +index ee91fc56b1..78e83f6fc1 100644 +--- a/xen/arch/x86/traps.c ++++ b/xen/arch/x86/traps.c +@@ -2017,9 +2017,14 @@ void asmlinkage do_debug(struct cpu_user_regs *regs) + return; + } + +- /* Save debug status register where guest OS can peek at it */ +- v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT); +- v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT); ++ /* ++ * Update the guest's dr6 so the debugger can peek at it. ++ * ++ * TODO: This should be passed out-of-band, so guest state is not modified ++ * by debugging actions completed behind it's back. ++ */ ++ v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy, ++ v->arch.dr6, dr6 ^ X86_DR6_DEFAULT); + + if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached ) + { +-- +2.46.1 + diff --git a/0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch b/0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch new file mode 100644 index 0000000..b9be372 --- /dev/null +++ b/0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch @@ -0,0 +1,222 @@ +From b74a5ea8399d1a0466c55332f557863acdae21b6 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper <andrew.cooper3@citrix.com> +Date: Tue, 24 Sep 2024 14:34:30 +0200 +Subject: [PATCH 20/35] x86/pv: Fix merging of new status bits into %dr6 +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +All #DB exceptions result in an update of %dr6, but this isn't captured in +Xen's handling, and is buggy just about everywhere. + +To begin resolving this issue, add a new pending_dbg field to x86_event +(unioned with cr2 to avoid taking any extra space, adjusting users to avoid +old-GCC bugs with anonymous unions), and introduce pv_inject_DB() to replace +the current callers using pv_inject_hw_exception(). + +Push the adjustment of v->arch.dr6 into pv_inject_event(), and use the new +x86_merge_dr6() rather than the current incorrect logic. 
+ +A key property is that pending_dbg is taken with positive polarity to deal +with RTM/BLD sensibly. Most callers pass in a constant, but callers passing +in a hardware %dr6 value need to XOR the value with X86_DR6_DEFAULT to flip to +positive polarity. + +This fixes the behaviour of the breakpoint status bits; that any left pending +are generally discarded when a new #DB is raised. In principle it would fix +RTM/BLD too, except PV guests can't turn these capabilities on to start with. + +Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> +Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: db39fa4b27ea470902d4625567cb6fa24030ddfa +master date: 2024-08-21 23:59:19 +0100 +--- + xen/arch/x86/include/asm/domain.h | 18 ++++++++++++++++-- + xen/arch/x86/include/asm/hvm/hvm.h | 3 ++- + xen/arch/x86/pv/emul-priv-op.c | 5 +---- + xen/arch/x86/pv/emulate.c | 9 +++++++-- + xen/arch/x86/pv/ro-page-fault.c | 2 +- + xen/arch/x86/pv/traps.c | 16 ++++++++++++---- + xen/arch/x86/traps.c | 2 +- + xen/arch/x86/x86_emulate/x86_emulate.h | 5 ++++- + 8 files changed, 44 insertions(+), 16 deletions(-) + +diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h +index f5daeb182b..5d92891e6f 100644 +--- a/xen/arch/x86/include/asm/domain.h ++++ b/xen/arch/x86/include/asm/domain.h +@@ -731,15 +731,29 @@ static inline void pv_inject_hw_exception(unsigned int vector, int errcode) + pv_inject_event(&event); + } + ++static inline void pv_inject_DB(unsigned long pending_dbg) ++{ ++ struct x86_event event = { ++ .vector = X86_EXC_DB, ++ .type = X86_EVENTTYPE_HW_EXCEPTION, ++ .error_code = X86_EVENT_NO_EC, ++ }; ++ ++ event.pending_dbg = pending_dbg; ++ ++ pv_inject_event(&event); ++} ++ + static inline void pv_inject_page_fault(int errcode, unsigned long cr2) + { +- const struct x86_event event = { ++ struct x86_event event = { + .vector = X86_EXC_PF, + .type = X86_EVENTTYPE_HW_EXCEPTION, + .error_code 
= errcode, +- .cr2 = cr2, + }; + ++ event.cr2 = cr2; ++ + pv_inject_event(&event); + } + +diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h +index 1c01e22c8e..238eece0cf 100644 +--- a/xen/arch/x86/include/asm/hvm/hvm.h ++++ b/xen/arch/x86/include/asm/hvm/hvm.h +@@ -525,9 +525,10 @@ static inline void hvm_inject_page_fault(int errcode, unsigned long cr2) + .vector = X86_EXC_PF, + .type = X86_EVENTTYPE_HW_EXCEPTION, + .error_code = errcode, +- .cr2 = cr2, + }; + ++ event.cr2 = cr2; ++ + hvm_inject_event(&event); + } + +diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c +index aa11ecadaa..15c83b9d23 100644 +--- a/xen/arch/x86/pv/emul-priv-op.c ++++ b/xen/arch/x86/pv/emul-priv-op.c +@@ -1366,10 +1366,7 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs) + ctxt.bpmatch |= DR_STEP; + + if ( ctxt.bpmatch ) +- { +- curr->arch.dr6 |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE; +- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); +- } ++ pv_inject_DB(ctxt.bpmatch); + + /* fall through */ + case X86EMUL_RETRY: +diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c +index e7a1c0a2cc..8c44dea123 100644 +--- a/xen/arch/x86/pv/emulate.c ++++ b/xen/arch/x86/pv/emulate.c +@@ -71,10 +71,15 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip) + { + regs->rip = rip; + regs->eflags &= ~X86_EFLAGS_RF; ++ + if ( regs->eflags & X86_EFLAGS_TF ) + { +- current->arch.dr6 |= DR_STEP | DR_STATUS_RESERVED_ONE; +- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); ++ /* ++ * TODO: this should generally use TF from the start of the ++ * instruction. It's only a latent bug for now, as this path isn't ++ * used for any instruction which modifies eflags. 
++ */ ++ pv_inject_DB(X86_DR6_BS); + } + } + +diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c +index cad28ef928..d0fe07e3a1 100644 +--- a/xen/arch/x86/pv/ro-page-fault.c ++++ b/xen/arch/x86/pv/ro-page-fault.c +@@ -390,7 +390,7 @@ int pv_ro_page_fault(unsigned long addr, struct cpu_user_regs *regs) + /* Fallthrough */ + case X86EMUL_OKAY: + if ( ctxt.retire.singlestep ) +- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); ++ pv_inject_DB(X86_DR6_BS); + + /* Fallthrough */ + case X86EMUL_RETRY: +diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c +index 83e84e2762..5a7341abf0 100644 +--- a/xen/arch/x86/pv/traps.c ++++ b/xen/arch/x86/pv/traps.c +@@ -12,6 +12,7 @@ + #include <xen/lib.h> + #include <xen/softirq.h> + ++#include <asm/debugreg.h> + #include <asm/pv/trace.h> + #include <asm/shared.h> + #include <asm/traps.h> +@@ -50,9 +51,9 @@ void pv_inject_event(const struct x86_event *event) + tb->cs = ti->cs; + tb->eip = ti->address; + +- if ( event->type == X86_EVENTTYPE_HW_EXCEPTION && +- vector == X86_EXC_PF ) ++ switch ( vector | -(event->type == X86_EVENTTYPE_SW_INTERRUPT) ) + { ++ case X86_EXC_PF: + curr->arch.pv.ctrlreg[2] = event->cr2; + arch_set_cr2(curr, event->cr2); + +@@ -62,9 +63,16 @@ void pv_inject_event(const struct x86_event *event) + error_code |= PFEC_user_mode; + + trace_pv_page_fault(event->cr2, error_code); +- } +- else ++ break; ++ ++ case X86_EXC_DB: ++ curr->arch.dr6 = x86_merge_dr6(curr->domain->arch.cpu_policy, ++ curr->arch.dr6, event->pending_dbg); ++ fallthrough; ++ default: + trace_pv_trap(vector, regs->rip, use_error_code, error_code); ++ break; ++ } + + if ( use_error_code ) + { +diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c +index 78e83f6fc1..8e2df3e719 100644 +--- a/xen/arch/x86/traps.c ++++ b/xen/arch/x86/traps.c +@@ -2032,7 +2032,7 @@ void asmlinkage do_debug(struct cpu_user_regs *regs) + return; + } + +- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); ++ pv_inject_DB(0 /* 
N/A, already merged */); + } + + void asmlinkage do_entry_CP(struct cpu_user_regs *regs) +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h +index d92be69d84..e8a0e57228 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.h ++++ b/xen/arch/x86/x86_emulate/x86_emulate.h +@@ -78,7 +78,10 @@ struct x86_event { + uint8_t type; /* X86_EVENTTYPE_* */ + uint8_t insn_len; /* Instruction length */ + int32_t error_code; /* X86_EVENT_NO_EC if n/a */ +- unsigned long cr2; /* Only for X86_EXC_PF h/w exception */ ++ union { ++ unsigned long cr2; /* #PF */ ++ unsigned long pending_dbg; /* #DB (new DR6 bits, positive polarity) */ ++ }; + }; + + /* +-- +2.46.1 + diff --git a/0020-x86-ucode-Further-fixes-to-identify-ucode-already-up.patch b/0020-x86-ucode-Further-fixes-to-identify-ucode-already-up.patch deleted file mode 100644 index c00dce2..0000000 --- a/0020-x86-ucode-Further-fixes-to-identify-ucode-already-up.patch +++ /dev/null @@ -1,92 +0,0 @@ -From cd873f00bedca2f1afeaf13a78f70e719c5b1398 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Wed, 26 Jun 2024 13:36:13 +0200 -Subject: [PATCH 20/56] x86/ucode: Further fixes to identify "ucode already up - to date" -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -When the revision in hardware is newer than anything Xen has to hand, -'microcode_cache' isn't set up. Then, `xen-ucode` initiates the update -because it doesn't know whether the revisions across the system are symmetric -or not. This involves the patch getting all the way into the -apply_microcode() hooks before being found to be too old. - -This is all a giant mess and needs an overhaul, but in the short term simply -adjust the apply_microcode() to return -EEXIST. - -Also, unconditionally print the preexisting microcode revision on boot. It's -relevant information which is otherwise unavailable if Xen doesn't find new -microcode to use. 
- -Fixes: 648db37a155a ("x86/ucode: Distinguish "ucode already up to date"") -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 977d98e67c2e929c62aa1f495fc4c6341c45abb5 -master date: 2024-05-16 13:59:11 +0100 ---- - xen/arch/x86/cpu/microcode/amd.c | 7 +++++-- - xen/arch/x86/cpu/microcode/core.c | 2 ++ - xen/arch/x86/cpu/microcode/intel.c | 7 +++++-- - 3 files changed, 12 insertions(+), 4 deletions(-) - -diff --git a/xen/arch/x86/cpu/microcode/amd.c b/xen/arch/x86/cpu/microcode/amd.c -index 75fc84e445..d8f7646e88 100644 ---- a/xen/arch/x86/cpu/microcode/amd.c -+++ b/xen/arch/x86/cpu/microcode/amd.c -@@ -222,12 +222,15 @@ static int cf_check apply_microcode(const struct microcode_patch *patch) - uint32_t rev, old_rev = sig->rev; - enum microcode_match_result result = microcode_fits(patch); - -+ if ( result == MIS_UCODE ) -+ return -EINVAL; -+ - /* - * Allow application of the same revision to pick up SMT-specific changes - * even if the revision of the other SMT thread is already up-to-date. - */ -- if ( result != NEW_UCODE && result != SAME_UCODE ) -- return -EINVAL; -+ if ( result == OLD_UCODE ) -+ return -EEXIST; - - if ( check_final_patch_levels(sig) ) - { -diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c -index d5338ad345..8a47f4471f 100644 ---- a/xen/arch/x86/cpu/microcode/core.c -+++ b/xen/arch/x86/cpu/microcode/core.c -@@ -887,6 +887,8 @@ int __init early_microcode_init(unsigned long *module_map, - - ucode_ops.collect_cpu_info(); - -+ printk(XENLOG_INFO "BSP microcode revision: 0x%08x\n", this_cpu(cpu_sig).rev); -+ - /* - * Some hypervisors deliberately report a microcode revision of -1 to - * mean that they will not accept microcode updates. 
-diff --git a/xen/arch/x86/cpu/microcode/intel.c b/xen/arch/x86/cpu/microcode/intel.c -index 060c529a6e..a2d88e3ac0 100644 ---- a/xen/arch/x86/cpu/microcode/intel.c -+++ b/xen/arch/x86/cpu/microcode/intel.c -@@ -294,10 +294,13 @@ static int cf_check apply_microcode(const struct microcode_patch *patch) - - result = microcode_update_match(patch); - -- if ( result != NEW_UCODE && -- !(opt_ucode_allow_same && result == SAME_UCODE) ) -+ if ( result == MIS_UCODE ) - return -EINVAL; - -+ if ( result == OLD_UCODE || -+ (result == SAME_UCODE && !opt_ucode_allow_same) ) -+ return -EEXIST; -+ - wbinvd(); - - wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)patch->data); --- -2.45.2 - diff --git a/0021-x86-msi-prevent-watchdog-triggering-when-dumping-MSI.patch b/0021-x86-msi-prevent-watchdog-triggering-when-dumping-MSI.patch deleted file mode 100644 index 8bcc63f..0000000 --- a/0021-x86-msi-prevent-watchdog-triggering-when-dumping-MSI.patch +++ /dev/null @@ -1,44 +0,0 @@ -From 1ffb29d132600e6a7965c2885505615a6fd6c647 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:36:52 +0200 -Subject: [PATCH 21/56] x86/msi: prevent watchdog triggering when dumping MSI - state -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Use the same check that's used in dump_irqs(). 
- -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: 594b22ca5be681ec1b42c34f321cc2600d582210 -master date: 2024-05-20 14:29:44 +0100 ---- - xen/arch/x86/msi.c | 4 ++++ - 1 file changed, 4 insertions(+) - -diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c -index a78367d7cf..3eaeffd1e0 100644 ---- a/xen/arch/x86/msi.c -+++ b/xen/arch/x86/msi.c -@@ -17,6 +17,7 @@ - #include <xen/param.h> - #include <xen/pci.h> - #include <xen/pci_regs.h> -+#include <xen/softirq.h> - #include <xen/iocap.h> - #include <xen/keyhandler.h> - #include <xen/pfn.h> -@@ -1405,6 +1406,9 @@ static void cf_check dump_msi(unsigned char key) - unsigned long flags; - const char *type = "???"; - -+ if ( !(irq & 0x1f) ) -+ process_pending_softirqs(); -+ - if ( !irq_desc_initialized(desc) ) - continue; - --- -2.45.2 - diff --git a/0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch b/0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch new file mode 100644 index 0000000..b951fb1 --- /dev/null +++ b/0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch @@ -0,0 +1,112 @@ +From cb6c3cfc5f8aa8bd8aae1abffea0574b02a04840 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper <andrew.cooper3@citrix.com> +Date: Tue, 24 Sep 2024 14:36:25 +0200 +Subject: [PATCH 21/35] x86/pv: Address Coverity complaint in + check_guest_io_breakpoint() + +Commit 08aacc392d86 ("x86/emul: Fix misaligned IO breakpoint behaviour in PV +guests") caused a Coverity INTEGER_OVERFLOW complaint based on the reasoning +that width could be 0. + +It can't, but digging into the code generation, GCC 8 and later (bisected on +godbolt) choose to emit a CSWITCH lookup table, and because the range (bottom +2 bits clear), it's a 16-entry lookup table. + +So Coverity is understandable, given that GCC did emit a (dead) logic path +where width stayed 0. + +Rewrite the logic. 
Introduce x86_bp_width() which compiles to a single basic +block, which replaces the switch() statement. Take the opportunity to also +make start and width be loop-scope variables. + +No practical change, but it should compile better and placate Coverity. + +Fixes: 08aacc392d86 ("x86/emul: Fix misaligned IO breakpoint behaviour in PV guests") +Coverity-ID: 1616152 +Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: 6d41a9d8a12ff89adabdc286e63e9391a0481699 +master date: 2024-08-21 23:59:19 +0100 +--- + xen/arch/x86/include/asm/debugreg.h | 25 +++++++++++++++++++++++++ + xen/arch/x86/pv/emul-priv-op.c | 21 ++++++--------------- + 2 files changed, 31 insertions(+), 15 deletions(-) + +diff --git a/xen/arch/x86/include/asm/debugreg.h b/xen/arch/x86/include/asm/debugreg.h +index 6baa725441..23aa592e40 100644 +--- a/xen/arch/x86/include/asm/debugreg.h ++++ b/xen/arch/x86/include/asm/debugreg.h +@@ -115,4 +115,29 @@ unsigned int x86_adj_dr7_rsvd(const struct cpu_policy *p, unsigned int dr7); + unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6, + unsigned int new); + ++/* ++ * Calculate the width of a breakpoint from its dr7 encoding. ++ * ++ * The LEN encoding in dr7 is 2 bits wide per breakpoint and encoded as a X-1 ++ * (0, 1 and 3) for widths of 1, 2 and 4 respectively in the 32bit days. ++ * ++ * In 64bit, the unused value (2) was given a meaning of width 8, which is ++ * great for efficiency but less great for nicely calculating the width. ++ */ ++static inline unsigned int x86_bp_width(unsigned int dr7, unsigned int bp) ++{ ++ unsigned int raw = (dr7 >> (DR_CONTROL_SHIFT + ++ DR_CONTROL_SIZE * bp + 2)) & 3; ++ ++ /* ++ * If the top bit is set (i.e. we've got an 4 or 8 byte wide breakpoint), ++ * flip the bottom to reverse their order, making them sorted properly. ++ * Then it's a simple shift to calculate the width. 
++ */ ++ if ( raw & 2 ) ++ raw ^= 1; ++ ++ return 1U << raw; ++} ++ + #endif /* _X86_DEBUGREG_H */ +diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c +index 15c83b9d23..b90f745c75 100644 +--- a/xen/arch/x86/pv/emul-priv-op.c ++++ b/xen/arch/x86/pv/emul-priv-op.c +@@ -323,30 +323,21 @@ static unsigned int check_guest_io_breakpoint(struct vcpu *v, + unsigned int port, + unsigned int len) + { +- unsigned int width, i, match = 0; +- unsigned long start; ++ unsigned int i, match = 0; + + if ( !v->arch.pv.dr7_emul || !(v->arch.pv.ctrlreg[4] & X86_CR4_DE) ) + return 0; + + for ( i = 0; i < 4; i++ ) + { ++ unsigned long start; ++ unsigned int width; ++ + if ( !(v->arch.pv.dr7_emul & (3 << (i * DR_ENABLE_SIZE))) ) + continue; + +- start = v->arch.dr[i]; +- width = 0; +- +- switch ( (v->arch.dr7 >> +- (DR_CONTROL_SHIFT + i * DR_CONTROL_SIZE)) & 0xc ) +- { +- case DR_LEN_1: width = 1; break; +- case DR_LEN_2: width = 2; break; +- case DR_LEN_4: width = 4; break; +- case DR_LEN_8: width = 8; break; +- } +- +- start &= ~(width - 1UL); ++ width = x86_bp_width(v->arch.dr7, i); ++ start = v->arch.dr[i] & ~(width - 1UL); + + if ( (start < (port + len)) && ((start + width) > port) ) + match |= 1u << i; +-- +2.46.1 + diff --git a/0022-x86-irq-remove-offline-CPUs-from-old-CPU-mask-when-a.patch b/0022-x86-irq-remove-offline-CPUs-from-old-CPU-mask-when-a.patch deleted file mode 100644 index 28fec3e..0000000 --- a/0022-x86-irq-remove-offline-CPUs-from-old-CPU-mask-when-a.patch +++ /dev/null @@ -1,44 +0,0 @@ -From 52e16bf065cb42b79d14ac74d701d1f9d8506430 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:37:20 +0200 -Subject: [PATCH 22/56] x86/irq: remove offline CPUs from old CPU mask when - adjusting move_cleanup_count -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -When adjusting move_cleanup_count to account for CPUs that are offline also 
-adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract -those again and create an imbalance in move_cleanup_count. - -Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()') -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: e63209d3ba2fd1b2f232babd14c9c679ffa7b09a -master date: 2024-06-10 10:33:22 +0200 ---- - xen/arch/x86/irq.c | 8 ++++++++ - 1 file changed, 8 insertions(+) - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index e07006391a..db14df93db 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -2576,6 +2576,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - desc->arch.move_cleanup_count -= cpumask_weight(affinity); - if ( !desc->arch.move_cleanup_count ) - release_old_vec(desc); -+ else -+ /* -+ * Adjust old_cpu_mask to account for the offline CPUs, -+ * otherwise further calls to fixup_irqs() could subtract those -+ * again and possibly underflow the counter. 
-+ */ -+ cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask, -+ affinity); - } - - if ( !desc->action || cpumask_subset(desc->affinity, mask) ) --- -2.45.2 - diff --git a/0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch b/0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch new file mode 100644 index 0000000..cf37c99 --- /dev/null +++ b/0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch @@ -0,0 +1,36 @@ +From 1e68200487e662e9f8720d508a1d6b3d3e2c72b9 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:37:08 +0200 +Subject: [PATCH 22/35] x86emul: always set operand size for AVX-VNNI-INT8 + insns + +Unlike for AVX-VNNI-INT16 I failed to notice that op_bytes may still be +zero when reaching the respective case block: With the ext0f38_table[] +entries having simd_packed_int, the defaulting at the bottom of +x86emul_decode() won't set the field to non-zero for F3- or F2-prefixed +insns. + +Fixes: 842acaa743a5 ("x86emul: support AVX-VNNI-INT8") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: d45687cca2450bfebe1dfbddb22f4f03c6fbc9cb +master date: 2024-08-23 09:11:15 +0200 +--- + xen/arch/x86/x86_emulate/x86_emulate.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 16557385bf..4d9649a2af 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -6075,6 +6075,7 @@ x86_emulate( + case X86EMUL_OPC_VEX_F2(0x0f38, 0x51): /* vpdpbssds [xy]mm/mem,[xy]mm,[xy]mm */ + host_and_vcpu_must_have(avx_vnni_int8); + generate_exception_if(vex.w, X86_EXC_UD); ++ op_bytes = 16 << vex.l; + goto simd_0f_ymm; + + case X86EMUL_OPC_VEX_66(0x0f38, 0x50): /* vpdpbusd [xy]mm/mem,[xy]mm,[xy]mm */ +-- +2.46.1 + diff --git a/0023-CI-Update-FreeBSD-to-13.3.patch b/0023-CI-Update-FreeBSD-to-13.3.patch deleted 
file mode 100644 index 6a6e7ae..0000000 --- a/0023-CI-Update-FreeBSD-to-13.3.patch +++ /dev/null @@ -1,33 +0,0 @@ -From 80f2d2c2a515a6b9a4ea1b128267c6e1b5085002 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Wed, 26 Jun 2024 13:37:58 +0200 -Subject: [PATCH 23/56] CI: Update FreeBSD to 13.3 -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> -Acked-by: Stefano Stabellini <sstabellini@kernel.org> -master commit: 5ea7f2c9d7a1334b3b2bd5f67fab4d447b60613d -master date: 2024-06-11 17:00:10 +0100 ---- - .cirrus.yml | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/.cirrus.yml b/.cirrus.yml -index 63f3afb104..e961877881 100644 ---- a/.cirrus.yml -+++ b/.cirrus.yml -@@ -17,7 +17,7 @@ freebsd_template: &FREEBSD_TEMPLATE - task: - name: 'FreeBSD 13' - freebsd_instance: -- image_family: freebsd-13-2 -+ image_family: freebsd-13-3 - << : *FREEBSD_TEMPLATE - - task: --- -2.45.2 - diff --git a/0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch b/0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch new file mode 100644 index 0000000..d94f56e --- /dev/null +++ b/0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch @@ -0,0 +1,35 @@ +From a0d6b75b832d2f7c54429de1a550fe122bcd6881 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:37:52 +0200 +Subject: [PATCH 23/35] x86emul: set (fake) operand size for AVX512CD broadcast + insns + +Back at the time I failed to pay attention to op_bytes still being zero +when reaching the respective case block: With the ext0f38_table[] +entries having simd_packed_int, the defaulting at the bottom of +x86emul_decode() won't set the field to non-zero for F3-prefixed insns. 
+ +Fixes: 37ccca740c26 ("x86emul: support AVX512CD insns") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: 6fa6b7feaafd622db3a2f3436750cf07782f4c12 +master date: 2024-08-23 09:12:24 +0200 +--- + xen/arch/x86/x86_emulate/x86_emulate.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 4d9649a2af..305f4286bf 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -5928,6 +5928,7 @@ x86_emulate( + evex.w == ((b >> 4) & 1)), + X86_EXC_UD); + d |= TwoOp; ++ op_bytes = 1; /* fake */ + /* fall through */ + case X86EMUL_OPC_EVEX_66(0x0f38, 0xc4): /* vpconflict{d,q} [xyz]mm/mem,[xyz]mm{k} */ + fault_suppression = false; +-- +2.46.1 + diff --git a/0024-x86-smp-do-not-use-shorthand-IPI-destinations-in-CPU.patch b/0024-x86-smp-do-not-use-shorthand-IPI-destinations-in-CPU.patch deleted file mode 100644 index b69c88c..0000000 --- a/0024-x86-smp-do-not-use-shorthand-IPI-destinations-in-CPU.patch +++ /dev/null @@ -1,98 +0,0 @@ -From 98238d49ecb149a5ac07cb8032817904c404ac2b Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:38:36 +0200 -Subject: [PATCH 24/56] x86/smp: do not use shorthand IPI destinations in CPU - hot{,un}plug contexts -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Due to the current rwlock logic, if the CPU calling get_cpu_maps() does -so from a cpu_hotplug_{begin,done}() region the function will still -return success, because a CPU taking the rwlock in read mode after -having taken it in write mode is allowed. Such corner case makes using -get_cpu_maps() alone not enough to prevent using the shorthand in CPU -hotplug regions. 
- -Introduce a new helper to detect whether the current caller is between a -cpu_hotplug_{begin,done}() region and use it in send_IPI_mask() to restrict -shorthand usage. - -Fixes: 5500d265a2a8 ('x86/smp: use APIC ALLBUT destination shorthand when possible') -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 171c52fba5d94e050d704770480dcb983490d0ad -master date: 2024-06-12 14:29:31 +0200 ---- - xen/arch/x86/smp.c | 2 +- - xen/common/cpu.c | 5 +++++ - xen/include/xen/cpu.h | 10 ++++++++++ - xen/include/xen/rwlock.h | 2 ++ - 4 files changed, 18 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c -index 3a331cbdbc..340fcafb46 100644 ---- a/xen/arch/x86/smp.c -+++ b/xen/arch/x86/smp.c -@@ -88,7 +88,7 @@ void send_IPI_mask(const cpumask_t *mask, int vector) - * the system have been accounted for. - */ - if ( system_state > SYS_STATE_smp_boot && -- !unaccounted_cpus && !disabled_cpus && -+ !unaccounted_cpus && !disabled_cpus && !cpu_in_hotplug_context() && - /* NB: get_cpu_maps lock requires enabled interrupts. 
*/ - local_irq_is_enabled() && (cpus_locked = get_cpu_maps()) && - (park_offline_cpus || -diff --git a/xen/common/cpu.c b/xen/common/cpu.c -index 8709db4d29..6e35b114c0 100644 ---- a/xen/common/cpu.c -+++ b/xen/common/cpu.c -@@ -68,6 +68,11 @@ void cpu_hotplug_done(void) - write_unlock(&cpu_add_remove_lock); - } - -+bool cpu_in_hotplug_context(void) -+{ -+ return rw_is_write_locked_by_me(&cpu_add_remove_lock); -+} -+ - static NOTIFIER_HEAD(cpu_chain); - - void __init register_cpu_notifier(struct notifier_block *nb) -diff --git a/xen/include/xen/cpu.h b/xen/include/xen/cpu.h -index e1d4eb5967..6bf5786750 100644 ---- a/xen/include/xen/cpu.h -+++ b/xen/include/xen/cpu.h -@@ -13,6 +13,16 @@ void put_cpu_maps(void); - void cpu_hotplug_begin(void); - void cpu_hotplug_done(void); - -+/* -+ * Returns true when the caller CPU is between a cpu_hotplug_{begin,done}() -+ * region. -+ * -+ * This is required to safely identify hotplug contexts, as get_cpu_maps() -+ * would otherwise succeed because a caller holding the lock in write mode is -+ * allowed to acquire the same lock in read mode. -+ */ -+bool cpu_in_hotplug_context(void); -+ - /* Receive notification of CPU hotplug events. 
*/ - void register_cpu_notifier(struct notifier_block *nb); - -diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h -index 9e35ee2edf..dc74d1c057 100644 ---- a/xen/include/xen/rwlock.h -+++ b/xen/include/xen/rwlock.h -@@ -309,6 +309,8 @@ static always_inline void write_lock_irq(rwlock_t *l) - - #define rw_is_locked(l) _rw_is_locked(l) - #define rw_is_write_locked(l) _rw_is_write_locked(l) -+#define rw_is_write_locked_by_me(l) \ -+ lock_evaluate_nospec(_is_write_locked_by_me(atomic_read(&(l)->cnts))) - - - typedef struct percpu_rwlock percpu_rwlock_t; --- -2.45.2 - diff --git a/0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch b/0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch new file mode 100644 index 0000000..a85c858 --- /dev/null +++ b/0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch @@ -0,0 +1,52 @@ +From 404fb9b745dd3f1ca17c3e957e43e3f95ab2613a Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:38:27 +0200 +Subject: [PATCH 24/35] x86/x2APIC: correct cluster tracking upon CPUs going + down for S3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Downing CPUs for S3 is somewhat special: Since we can expect the system +to come back up in exactly the same hardware configuration, per-CPU data +for the secondary CPUs isn't de-allocated (and then cleared upon re- +allocation when the CPUs are being brought back up). Therefore the +cluster_cpus per-CPU pointer will retain its value for all CPUs other +than the final one in a cluster (i.e. in particular for all CPUs in the +same cluster as CPU0). That, however, is in conflict with the assertion +early in init_apic_ldr_x2apic_cluster(). + +Note that the issue is avoided on Intel hardware, where we park CPUs +instead of bringing them down. + +Extend the bypassing of the freeing to the suspend case, thus making +suspend/resume also a tiny bit faster. 
+ +Fixes: 2e6c8f182c9c ("x86: distinguish CPU offlining from CPU removal") +Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> +Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: ad3ff7b4279d16c91c23cda6e8be5bc670b25c9a +master date: 2024-08-26 10:30:40 +0200 +--- + xen/arch/x86/genapic/x2apic.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c +index 371dd100c7..d531035fa4 100644 +--- a/xen/arch/x86/genapic/x2apic.c ++++ b/xen/arch/x86/genapic/x2apic.c +@@ -228,7 +228,8 @@ static int cf_check update_clusterinfo( + case CPU_UP_CANCELED: + case CPU_DEAD: + case CPU_REMOVE: +- if ( park_offline_cpus == (action != CPU_REMOVE) ) ++ if ( park_offline_cpus == (action != CPU_REMOVE) || ++ system_state == SYS_STATE_suspend ) + break; + if ( per_cpu(cluster_cpus, cpu) ) + { +-- +2.46.1 + diff --git a/0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch b/0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch new file mode 100644 index 0000000..f431756 --- /dev/null +++ b/0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch @@ -0,0 +1,145 @@ +From 743af916723eb4f1197719fc0aebd4460bafb5bf Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> +Date: Tue, 24 Sep 2024 14:39:23 +0200 +Subject: [PATCH 25/35] x86/dom0: disable SMAP for PV domain building only +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Move the logic that disables SMAP so it's only performed when building a PV +dom0, PVH dom0 builder doesn't require disabling SMAP. + +The fixes tag is to account for the wrong usage of cpu_has_smap in +create_dom0(), it should instead have used +boot_cpu_has(X86_FEATURE_XEN_SMAP). Fix while moving the logic to apply to PV +only. 
+ +While there also make cr4_pv32_mask __ro_after_init. + +Fixes: 493ab190e5b1 ('xen/sm{e, a}p: allow disabling sm{e, a}p for Xen itself') +Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: fb1658221a31ec1db33253a80001191391e73b17 +master date: 2024-08-28 19:59:07 +0100 +--- + xen/arch/x86/include/asm/setup.h | 2 ++ + xen/arch/x86/pv/dom0_build.c | 40 ++++++++++++++++++++++++++++---- + xen/arch/x86/setup.c | 20 +--------------- + 3 files changed, 38 insertions(+), 24 deletions(-) + +diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h +index d75589178b..8f7dfefb4d 100644 +--- a/xen/arch/x86/include/asm/setup.h ++++ b/xen/arch/x86/include/asm/setup.h +@@ -64,6 +64,8 @@ extern bool opt_dom0_verbose; + extern bool opt_dom0_cpuid_faulting; + extern bool opt_dom0_msr_relaxed; + ++extern unsigned long cr4_pv32_mask; ++ + #define max_init_domid (0) + + #endif +diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c +index 57e58a02e7..07e9594493 100644 +--- a/xen/arch/x86/pv/dom0_build.c ++++ b/xen/arch/x86/pv/dom0_build.c +@@ -354,11 +354,11 @@ static struct page_info * __init alloc_chunk(struct domain *d, + return page; + } + +-int __init dom0_construct_pv(struct domain *d, +- const module_t *image, +- unsigned long image_headroom, +- module_t *initrd, +- const char *cmdline) ++static int __init dom0_construct(struct domain *d, ++ const module_t *image, ++ unsigned long image_headroom, ++ module_t *initrd, ++ const char *cmdline) + { + int i, rc, order, machine; + bool compatible, compat; +@@ -1051,6 +1051,36 @@ out: + return rc; + } + ++int __init dom0_construct_pv(struct domain *d, ++ const module_t *image, ++ unsigned long image_headroom, ++ module_t *initrd, ++ const char *cmdline) ++{ ++ int rc; ++ ++ /* ++ * Clear SMAP in CR4 to allow user-accesses in construct_dom0(). 
This ++ * prevents us needing to rewrite construct_dom0() in terms of ++ * copy_{to,from}_user(). ++ */ ++ if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) ) ++ { ++ cr4_pv32_mask &= ~X86_CR4_SMAP; ++ write_cr4(read_cr4() & ~X86_CR4_SMAP); ++ } ++ ++ rc = dom0_construct(d, image, image_headroom, initrd, cmdline); ++ ++ if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) ) ++ { ++ write_cr4(read_cr4() | X86_CR4_SMAP); ++ cr4_pv32_mask |= X86_CR4_SMAP; ++ } ++ ++ return rc; ++} ++ + /* + * Local variables: + * mode: C +diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c +index eee20bb175..f1076c7203 100644 +--- a/xen/arch/x86/setup.c ++++ b/xen/arch/x86/setup.c +@@ -79,8 +79,7 @@ bool __read_mostly use_invpcid; + int8_t __initdata opt_probe_port_aliases = -1; + boolean_param("probe-port-aliases", opt_probe_port_aliases); + +-/* Only used in asm code and within this source file */ +-unsigned long asmlinkage __read_mostly cr4_pv32_mask; ++unsigned long __ro_after_init cr4_pv32_mask; + + /* **** Linux config option: propagated to domain0. */ + /* "acpi=off": Sisables both ACPI table parsing and interpreter. */ +@@ -955,26 +954,9 @@ static struct domain *__init create_dom0(const module_t *image, + } + } + +- /* +- * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0(). +- * This saves a large number of corner cases interactions with +- * copy_from_user(). 
+- */ +- if ( cpu_has_smap ) +- { +- cr4_pv32_mask &= ~X86_CR4_SMAP; +- write_cr4(read_cr4() & ~X86_CR4_SMAP); +- } +- + if ( construct_dom0(d, image, headroom, initrd, cmdline) != 0 ) + panic("Could not construct domain 0\n"); + +- if ( cpu_has_smap ) +- { +- write_cr4(read_cr4() | X86_CR4_SMAP); +- cr4_pv32_mask |= X86_CR4_SMAP; +- } +- + return d; + } + +-- +2.46.1 + diff --git a/0025-x86-irq-limit-interrupt-movement-done-by-fixup_irqs.patch b/0025-x86-irq-limit-interrupt-movement-done-by-fixup_irqs.patch deleted file mode 100644 index 7c40bba..0000000 --- a/0025-x86-irq-limit-interrupt-movement-done-by-fixup_irqs.patch +++ /dev/null @@ -1,104 +0,0 @@ -From ce0a0cb0a74a909abf988f242aa228acdd2917fe Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:39:11 +0200 -Subject: [PATCH 25/56] x86/irq: limit interrupt movement done by fixup_irqs() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The current check used in fixup_irqs() to decide whether to move around -interrupts is based on the affinity mask, but such mask can have all bits set, -and hence is unlikely to be a subset of the input mask. For example if an -interrupt has an affinity mask of all 1s, any input to fixup_irqs() that's not -an all set CPU mask would cause that interrupt to be shuffled around -unconditionally. - -What fixup_irqs() care about is evacuating interrupts from CPUs not set on the -input CPU mask, and for that purpose it should check whether the interrupt is -assigned to a CPU not present in the input mask. Assume that ->arch.cpu_mask -is a subset of the ->affinity mask, and keep the current logic that resets the -->affinity mask if the interrupt has to be shuffled around. 
- -Doing the affinity movement based on ->arch.cpu_mask requires removing the -special handling to ->arch.cpu_mask done for high priority vectors, otherwise -the adjustment done to cpu_mask makes them always skip the CPU interrupt -movement. - -While there also adjust the comment as to the purpose of fixup_irqs(). - -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: c7564d7366d865cc407e3d64bca816d07edee174 -master date: 2024-06-12 14:30:40 +0200 ---- - xen/arch/x86/include/asm/irq.h | 2 +- - xen/arch/x86/irq.c | 21 +++++++++++---------- - 2 files changed, 12 insertions(+), 11 deletions(-) - -diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h -index d7fb8ec7e8..71d4a8fc56 100644 ---- a/xen/arch/x86/include/asm/irq.h -+++ b/xen/arch/x86/include/asm/irq.h -@@ -132,7 +132,7 @@ void free_domain_pirqs(struct domain *d); - int map_domain_emuirq_pirq(struct domain *d, int pirq, int emuirq); - int unmap_domain_pirq_emuirq(struct domain *d, int pirq); - --/* Reset irq affinities to match the given CPU mask. */ -+/* Evacuate interrupts assigned to CPUs not present in the input CPU mask. */ - void fixup_irqs(const cpumask_t *mask, bool verbose); - void fixup_eoi(void); - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index db14df93db..566331bec1 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -2529,7 +2529,7 @@ static int __init cf_check setup_dump_irqs(void) - } - __initcall(setup_dump_irqs); - --/* Reset irq affinities to match the given CPU mask. */ -+/* Evacuate interrupts assigned to CPUs not present in the input CPU mask. 
*/ - void fixup_irqs(const cpumask_t *mask, bool verbose) - { - unsigned int irq; -@@ -2553,19 +2553,15 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - - vector = irq_to_vector(irq); - if ( vector >= FIRST_HIPRIORITY_VECTOR && -- vector <= LAST_HIPRIORITY_VECTOR ) -+ vector <= LAST_HIPRIORITY_VECTOR && -+ desc->handler == &no_irq_type ) - { -- cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask); -- - /* - * This can in particular happen when parking secondary threads - * during boot and when the serial console wants to use a PCI IRQ. - */ -- if ( desc->handler == &no_irq_type ) -- { -- spin_unlock(&desc->lock); -- continue; -- } -+ spin_unlock(&desc->lock); -+ continue; - } - - if ( desc->arch.move_cleanup_count ) -@@ -2586,7 +2582,12 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - affinity); - } - -- if ( !desc->action || cpumask_subset(desc->affinity, mask) ) -+ /* -+ * Avoid shuffling the interrupt around as long as current target CPUs -+ * are a subset of the input mask. What fixup_irqs() cares about is -+ * evacuating interrupts from CPUs not in the input mask. -+ */ -+ if ( !desc->action || cpumask_subset(desc->arch.cpu_mask, mask) ) - { - spin_unlock(&desc->lock); - continue; --- -2.45.2 - diff --git a/0026-x86-EPT-correct-special-page-checking-in-epte_get_en.patch b/0026-x86-EPT-correct-special-page-checking-in-epte_get_en.patch deleted file mode 100644 index c94728a..0000000 --- a/0026-x86-EPT-correct-special-page-checking-in-epte_get_en.patch +++ /dev/null @@ -1,46 +0,0 @@ -From 6e647efaf2b02ce92bcf80bec47c18cca5084f8a Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Wed, 26 Jun 2024 13:39:44 +0200 -Subject: [PATCH 26/56] x86/EPT: correct special page checking in - epte_get_entry_emt() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -mfn_valid() granularity is (currently) 256Mb. 
Therefore the start of a -1Gb page passing the test doesn't necessarily mean all parts of such a -range would also pass. Yet using the result of mfn_to_page() on an MFN -which doesn't pass mfn_valid() checking is liable to result in a crash -(the invocation of mfn_to_page() alone is presumably "just" UB in such a -case). - -Fixes: ca24b2ffdbd9 ("x86/hvm: set 'ipat' in EPT for special pages") -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 5540b94e8191059eb9cbbe98ac316232a42208f6 -master date: 2024-06-13 16:53:34 +0200 ---- - xen/arch/x86/mm/p2m-ept.c | 6 +++++- - 1 file changed, 5 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c -index 85c4e8e54f..1aa6bbc771 100644 ---- a/xen/arch/x86/mm/p2m-ept.c -+++ b/xen/arch/x86/mm/p2m-ept.c -@@ -518,8 +518,12 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn, - } - - for ( special_pgs = i = 0; i < (1ul << order); i++ ) -- if ( is_special_page(mfn_to_page(mfn_add(mfn, i))) ) -+ { -+ mfn_t cur = mfn_add(mfn, i); -+ -+ if ( mfn_valid(cur) && is_special_page(mfn_to_page(cur)) ) - special_pgs++; -+ } - - if ( special_pgs ) - { --- -2.45.2 - diff --git a/0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch b/0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch new file mode 100644 index 0000000..3b79d84 --- /dev/null +++ b/0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch @@ -0,0 +1,37 @@ +From 6e96dee93c60af4ee446f5e0fddf3b424824de18 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:40:03 +0200 +Subject: [PATCH 26/35] x86/HVM: correct partial HPET_STATUS write emulation + +For partial writes the non-written parts of registers are folded into +the full 64-bit value from what they're presently set to. 
That's wrong +to do though when the behavior is write-1-to-clear: Writes not +including to low 3 bits would unconditionally clear all ISR bits which +are presently set. Re-calculate the value to use. + +Fixes: be07023be115 ("x86/vhpet: add support for level triggered interrupts") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: 41d358d2f9607ba37c216effa39b9f1bc58de69d +master date: 2024-08-29 10:02:20 +0200 +--- + xen/arch/x86/hvm/hpet.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c +index 87642575f9..f0e5f877f4 100644 +--- a/xen/arch/x86/hvm/hpet.c ++++ b/xen/arch/x86/hvm/hpet.c +@@ -404,7 +404,8 @@ static int cf_check hpet_write( + break; + + case HPET_STATUS: +- /* write 1 to clear. */ ++ /* Write 1 to clear. Therefore don't use new_val directly here. */ ++ new_val = val << ((addr & 7) * 8); + while ( new_val ) + { + bool active; +-- +2.46.1 + diff --git a/0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch b/0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch new file mode 100644 index 0000000..a95c549 --- /dev/null +++ b/0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch @@ -0,0 +1,61 @@ +From ee826bc490d6036ed9b637ada014a2d59d151f79 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:40:34 +0200 +Subject: [PATCH 27/35] Arm64: adjust __irq_to_desc() to fix build with gcc14 +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +With the original code I observe + +In function ‘__irq_to_desc’, + inlined from ‘route_irq_to_guest’ at arch/arm/irq.c:465:12: +arch/arm/irq.c:54:16: error: array subscript -2 is below array bounds of ‘irq_desc_t[32]’ {aka ‘struct irq_desc[32]’} [-Werror=array-bounds=] + 54 | return &this_cpu(local_irq_desc)[irq]; + | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +which looks pretty bogus: How in 
the world does the compiler arrive at +-2 when compiling route_irq_to_guest()? Yet independent of that the +function's parameter wants to be of unsigned type anyway, as shown by +a vast majority of callers (others use plain int when they really mean +non-negative quantities). With that adjustment the code compiles fine +again. + +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Michal Orzel <michal.orzel@amd.com> +master commit: 99f942f3d410059dc223ee0a908827e928ef3592 +master date: 2024-08-29 10:03:53 +0200 +--- + xen/arch/arm/include/asm/irq.h | 2 +- + xen/arch/arm/irq.c | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h +index ec437add09..88e060bf29 100644 +--- a/xen/arch/arm/include/asm/irq.h ++++ b/xen/arch/arm/include/asm/irq.h +@@ -56,7 +56,7 @@ extern const unsigned int nr_irqs; + struct irq_desc; + struct irqaction; + +-struct irq_desc *__irq_to_desc(int irq); ++struct irq_desc *__irq_to_desc(unsigned int irq); + + #define irq_to_desc(irq) __irq_to_desc(irq) + +diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c +index 6b89f64fd1..b9757d7ad3 100644 +--- a/xen/arch/arm/irq.c ++++ b/xen/arch/arm/irq.c +@@ -48,7 +48,7 @@ void irq_end_none(struct irq_desc *irq) + static irq_desc_t irq_desc[NR_IRQS]; + static DEFINE_PER_CPU(irq_desc_t[NR_LOCAL_IRQS], local_irq_desc); + +-struct irq_desc *__irq_to_desc(int irq) ++struct irq_desc *__irq_to_desc(unsigned int irq) + { + if ( irq < NR_LOCAL_IRQS ) + return &this_cpu(local_irq_desc)[irq]; +-- +2.46.1 + diff --git a/0027-x86-EPT-avoid-marking-non-present-entries-for-re-con.patch b/0027-x86-EPT-avoid-marking-non-present-entries-for-re-con.patch deleted file mode 100644 index 23e8946..0000000 --- a/0027-x86-EPT-avoid-marking-non-present-entries-for-re-con.patch +++ /dev/null @@ -1,85 +0,0 @@ -From d31385be5c8e8bc5efb6f8848057bd0c69e8274a Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Wed, 26 Jun 2024 
13:40:11 +0200 -Subject: [PATCH 27/56] x86/EPT: avoid marking non-present entries for - re-configuring -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -For non-present entries EMT, like most other fields, is meaningless to -hardware. Make the logic in ept_set_entry() setting the field (and iPAT) -conditional upon dealing with a present entry, leaving the value at 0 -otherwise. This has two effects for epte_get_entry_emt() which we'll -want to leverage subsequently: -1) The call moved here now won't be issued with INVALID_MFN anymore (a - respective BUG_ON() is being added). -2) Neither of the other two calls could now be issued with a truncated - form of INVALID_MFN anymore (as long as there's no bug anywhere - marking an entry present when that was populated using INVALID_MFN). - -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 777c71d31325bc55ba1cc3f317d4155fe519ab0b -master date: 2024-06-13 16:54:17 +0200 ---- - xen/arch/x86/mm/p2m-ept.c | 29 ++++++++++++++++++----------- - 1 file changed, 18 insertions(+), 11 deletions(-) - -diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c -index 1aa6bbc771..641d61b350 100644 ---- a/xen/arch/x86/mm/p2m-ept.c -+++ b/xen/arch/x86/mm/p2m-ept.c -@@ -649,6 +649,8 @@ static int cf_check resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn) - if ( e.emt != MTRR_NUM_TYPES ) - break; - -+ ASSERT(is_epte_present(&e)); -+ - if ( level == 0 ) - { - for ( gfn -= i, i = 0; i < EPT_PAGETABLE_ENTRIES; ++i ) -@@ -914,17 +916,6 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, - - if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) ) - { -- bool ipat; -- int emt = epte_get_entry_emt(p2m->domain, _gfn(gfn), mfn, -- i * EPT_TABLE_ORDER, &ipat, -- p2mt); -- -- if ( emt >= 0 ) -- new_entry.emt = emt; -- else /* ept_handle_misconfig() will need to take care of this. 
*/ -- new_entry.emt = MTRR_NUM_TYPES; -- -- new_entry.ipat = ipat; - new_entry.sp = !!i; - new_entry.sa_p2mt = p2mt; - new_entry.access = p2ma; -@@ -940,6 +931,22 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, - need_modify_vtd_table = 0; - - ept_p2m_type_to_flags(p2m, &new_entry); -+ -+ if ( is_epte_present(&new_entry) ) -+ { -+ bool ipat; -+ int emt = epte_get_entry_emt(p2m->domain, _gfn(gfn), mfn, -+ i * EPT_TABLE_ORDER, &ipat, -+ p2mt); -+ -+ BUG_ON(mfn_eq(mfn, INVALID_MFN)); -+ -+ if ( emt >= 0 ) -+ new_entry.emt = emt; -+ else /* ept_handle_misconfig() will need to take care of this. */ -+ new_entry.emt = MTRR_NUM_TYPES; -+ new_entry.ipat = ipat; -+ } - } - - if ( sve != -1 ) --- -2.45.2 - diff --git a/0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch b/0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch new file mode 100644 index 0000000..7f43c74 --- /dev/null +++ b/0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch @@ -0,0 +1,100 @@ +From c18635fd69fc2da238f00a26ab707f1b2a50bf64 Mon Sep 17 00:00:00 2001 +From: Javi Merino <javi.merino@cloud.com> +Date: Tue, 24 Sep 2024 14:41:06 +0200 +Subject: [PATCH 28/35] libxl: Fix nul-termination of the return value of + libxl_xen_console_read_line() +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" +call in main_dmesg(). ASAN reports a heap buffer overflow: an +off-by-one access to cr->buffer. + +The readconsole sysctl copies up to count characters into the buffer, +but it does not add a null character at the end. Despite the +documentation of libxl_xen_console_read_line(), line_r is not +nul-terminated if 16384 characters were copied to the buffer. + +Fix this by asking xc_readconsolering() to fill the buffer up to size +- 1. 
As the number of characters in the buffer is only needed in +libxl_xen_console_read_line(), make it a local variable there instead +of part of the libxl__xen_console_reader struct. + +Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'") +Reported-by: Edwin Török <edwin.torok@cloud.com> +Signed-off-by: Javi Merino <javi.merino@cloud.com> +Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> +master commit: bb03169bcb6ecccf372de1f6b9285cd519a26bb8 +master date: 2024-09-03 10:53:44 +0100 +--- + tools/libs/light/libxl_console.c | 19 +++++++++++++++---- + tools/libs/light/libxl_internal.h | 1 - + 2 files changed, 15 insertions(+), 5 deletions(-) + +diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c +index a563c9d3c7..9f736b8913 100644 +--- a/tools/libs/light/libxl_console.c ++++ b/tools/libs/light/libxl_console.c +@@ -774,12 +774,17 @@ libxl_xen_console_reader * + { + GC_INIT(ctx); + libxl_xen_console_reader *cr; +- unsigned int size = 16384; ++ /* ++ * We want xen to fill the buffer in as few hypercalls as ++ * possible, but xen will not nul-terminate it. The default size ++ * of Xen's console buffer is 16384. Leave one byte at the end ++ * for the null character. ++ */ ++ unsigned int size = 16384 + 1; + + cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); + cr->buffer = libxl__zalloc(NOGC, size); + cr->size = size; +- cr->count = size; + cr->clear = clear; + cr->incremental = 1; + +@@ -800,10 +805,16 @@ int libxl_xen_console_read_line(libxl_ctx *ctx, + char **line_r) + { + int ret; ++ /* ++ * Number of chars to copy into the buffer. xc_readconsolering() ++ * does not add a null character at the end, so leave a space for ++ * us to add it. 
++ */ ++ unsigned int nr_chars = cr->size - 1; + GC_INIT(ctx); + + memset(cr->buffer, 0, cr->size); +- ret = xc_readconsolering(ctx->xch, cr->buffer, &cr->count, ++ ret = xc_readconsolering(ctx->xch, cr->buffer, &nr_chars, + cr->clear, cr->incremental, &cr->index); + if (ret < 0) { + LOGE(ERROR, "reading console ring buffer"); +@@ -811,7 +822,7 @@ int libxl_xen_console_read_line(libxl_ctx *ctx, + return ERROR_FAIL; + } + if (!ret) { +- if (cr->count) { ++ if (nr_chars) { + *line_r = cr->buffer; + ret = 1; + } else { +diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h +index 3b58bb2d7f..96d14f5746 100644 +--- a/tools/libs/light/libxl_internal.h ++++ b/tools/libs/light/libxl_internal.h +@@ -2077,7 +2077,6 @@ _hidden char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid); + struct libxl__xen_console_reader { + char *buffer; + unsigned int size; +- unsigned int count; + unsigned int clear; + unsigned int incremental; + unsigned int index; +-- +2.46.1 + diff --git a/0028-x86-EPT-drop-questionable-mfn_valid-from-epte_get_en.patch b/0028-x86-EPT-drop-questionable-mfn_valid-from-epte_get_en.patch deleted file mode 100644 index ee495d4..0000000 --- a/0028-x86-EPT-drop-questionable-mfn_valid-from-epte_get_en.patch +++ /dev/null @@ -1,47 +0,0 @@ -From 3b777c2ce4ea8cf67b79a5496e51201145606798 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Wed, 26 Jun 2024 13:40:35 +0200 -Subject: [PATCH 28/56] x86/EPT: drop questionable mfn_valid() from - epte_get_entry_emt() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -mfn_valid() is RAM-focused; it will often return false for MMIO. Yet -access to actual MMIO space should not generally be restricted to UC -only; especially video frame buffer accesses are unduly affected by such -a restriction. 
- -Since, as of 777c71d31325 ("x86/EPT: avoid marking non-present entries -for re-configuring"), the function won't be called with INVALID_MFN or, -worse, truncated forms thereof anymore, we call fully drop that check. - -Fixes: 81fd0d3ca4b2 ("x86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()") -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 4fdd8d75566fdad06667a79ec0ce6f43cc466c54 -master date: 2024-06-13 16:55:22 +0200 ---- - xen/arch/x86/mm/p2m-ept.c | 6 ------ - 1 file changed, 6 deletions(-) - -diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c -index 641d61b350..d325424e97 100644 ---- a/xen/arch/x86/mm/p2m-ept.c -+++ b/xen/arch/x86/mm/p2m-ept.c -@@ -500,12 +500,6 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn, - return -1; - } - -- if ( !mfn_valid(mfn) ) -- { -- *ipat = true; -- return X86_MT_UC; -- } -- - /* - * Conditional must be kept in sync with the code in - * {iomem,ioports}_{permit,deny}_access(). --- -2.45.2 - diff --git a/0029-SUPPORT.md-split-XSM-from-Flask.patch b/0029-SUPPORT.md-split-XSM-from-Flask.patch new file mode 100644 index 0000000..4cf9adb --- /dev/null +++ b/0029-SUPPORT.md-split-XSM-from-Flask.patch @@ -0,0 +1,66 @@ +From 3ceb79ceabab58305a0f35aed0117537f7a6b922 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:41:51 +0200 +Subject: [PATCH 29/35] SUPPORT.md: split XSM from Flask +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +XSM is a generic framework, which in particular is also used by SILO. +With this it can't really be experimental: Arm mandates SILO for having +a security supported configuration. + +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Daniel P. 
Smith <dpsmith@apertussolutions.com> +master commit: d7c18b8720824d7efc39ffa7296751e1812865a9 +master date: 2024-09-04 16:05:03 +0200 +--- + SUPPORT.md | 19 +++++++++++++++++-- + 1 file changed, 17 insertions(+), 2 deletions(-) + +diff --git a/SUPPORT.md b/SUPPORT.md +index 1d8b38cbd0..ba6052477b 100644 +--- a/SUPPORT.md ++++ b/SUPPORT.md +@@ -768,13 +768,21 @@ Compile time disabled for ARM by default. + + Status, x86: Supported, not security supported + +-### XSM & FLASK ++### XSM (Xen Security Module) Framework ++ ++XSM is a security policy framework. The dummy implementation is covered by this ++statement, and implements a policy whereby dom0 is all powerful. See below for ++alternative modules (FLASK, SILO). ++ ++ Status: Supported ++ ++### FLASK XSM Module + + Status: Experimental + + Compile time disabled by default. + +-Also note that using XSM ++Also note that using FLASK + to delegate various domain control hypercalls + to particular other domains, rather than only permitting use by dom0, + is also specifically excluded from security support for many hypercalls. +@@ -787,6 +795,13 @@ Please see XSA-77 for more details. + The default policy includes FLASK labels and roles for a "typical" Xen-based system + with dom0, driver domains, stub domains, domUs, and so on. + ++### SILO XSM Module ++ ++SILO extends the dummy policy by enforcing that DomU-s can only communicate ++with Dom0, yet not with each other. 
++ ++ Status: Supported ++ + ## Virtual Hardware, Hypervisor + + ### x86/Nested PV +-- +2.46.1 + diff --git a/0029-x86-Intel-unlock-CPUID-earlier-for-the-BSP.patch b/0029-x86-Intel-unlock-CPUID-earlier-for-the-BSP.patch deleted file mode 100644 index 6722508..0000000 --- a/0029-x86-Intel-unlock-CPUID-earlier-for-the-BSP.patch +++ /dev/null @@ -1,105 +0,0 @@ -From c4b284912695a5802433512b913e968eda01544f Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Wed, 26 Jun 2024 13:41:05 +0200 -Subject: [PATCH 29/56] x86/Intel: unlock CPUID earlier for the BSP -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If -this bit is set by the BIOS then CPUID evaluation does not work when -data from any leaf greater than two is needed; early_cpu_init() in -particular wants to collect leaf 7 data. - -Cure this by unlocking CPUID right before evaluating anything which -depends on the maximum CPUID leaf being greater than two. - -Inspired by (and description cloned from) Linux commit 0c2f6d04619e -("x86/topology/intel: Unlock CPUID before evaluating anything"). 
- -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: fa4d026737a47cd1d66ffb797a29150b4453aa9f -master date: 2024-06-18 15:12:44 +0200 ---- - xen/arch/x86/cpu/common.c | 3 ++- - xen/arch/x86/cpu/cpu.h | 2 ++ - xen/arch/x86/cpu/intel.c | 29 +++++++++++++++++------------ - 3 files changed, 21 insertions(+), 13 deletions(-) - -diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c -index 26eed2ade1..edec0a2546 100644 ---- a/xen/arch/x86/cpu/common.c -+++ b/xen/arch/x86/cpu/common.c -@@ -336,7 +336,8 @@ void __init early_cpu_init(bool verbose) - - c->x86_vendor = x86_cpuid_lookup_vendor(ebx, ecx, edx); - switch (c->x86_vendor) { -- case X86_VENDOR_INTEL: actual_cpu = intel_cpu_dev; break; -+ case X86_VENDOR_INTEL: intel_unlock_cpuid_leaves(c); -+ actual_cpu = intel_cpu_dev; break; - case X86_VENDOR_AMD: actual_cpu = amd_cpu_dev; break; - case X86_VENDOR_CENTAUR: actual_cpu = centaur_cpu_dev; break; - case X86_VENDOR_SHANGHAI: actual_cpu = shanghai_cpu_dev; break; -diff --git a/xen/arch/x86/cpu/cpu.h b/xen/arch/x86/cpu/cpu.h -index e3d06278b3..8be65e975a 100644 ---- a/xen/arch/x86/cpu/cpu.h -+++ b/xen/arch/x86/cpu/cpu.h -@@ -24,3 +24,5 @@ void amd_init_lfence(struct cpuinfo_x86 *c); - void amd_init_ssbd(const struct cpuinfo_x86 *c); - void amd_init_spectral_chicken(void); - void detect_zen2_null_seg_behaviour(void); -+ -+void intel_unlock_cpuid_leaves(struct cpuinfo_x86 *c); -diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c -index deb7b70464..0dc7c27601 100644 ---- a/xen/arch/x86/cpu/intel.c -+++ b/xen/arch/x86/cpu/intel.c -@@ -303,10 +303,24 @@ static void __init noinline intel_init_levelling(void) - ctxt_switch_masking = intel_ctxt_switch_masking; - } - --static void cf_check early_init_intel(struct cpuinfo_x86 *c) -+/* Unmask CPUID levels if masked. 
*/ -+void intel_unlock_cpuid_leaves(struct cpuinfo_x86 *c) - { -- u64 misc_enable, disable; -+ uint64_t misc_enable, disable; -+ -+ rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable); -+ -+ disable = misc_enable & MSR_IA32_MISC_ENABLE_LIMIT_CPUID; -+ if (disable) { -+ wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable); -+ bootsym(trampoline_misc_enable_off) |= disable; -+ c->cpuid_level = cpuid_eax(0); -+ printk(KERN_INFO "revised cpuid level: %u\n", c->cpuid_level); -+ } -+} - -+static void cf_check early_init_intel(struct cpuinfo_x86 *c) -+{ - /* Netburst reports 64 bytes clflush size, but does IO in 128 bytes */ - if (c->x86 == 15 && c->x86_cache_alignment == 64) - c->x86_cache_alignment = 128; -@@ -315,16 +329,7 @@ static void cf_check early_init_intel(struct cpuinfo_x86 *c) - bootsym(trampoline_misc_enable_off) & MSR_IA32_MISC_ENABLE_XD_DISABLE) - printk(KERN_INFO "re-enabled NX (Execute Disable) protection\n"); - -- /* Unmask CPUID levels and NX if masked: */ -- rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable); -- -- disable = misc_enable & MSR_IA32_MISC_ENABLE_LIMIT_CPUID; -- if (disable) { -- wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable); -- bootsym(trampoline_misc_enable_off) |= disable; -- printk(KERN_INFO "revised cpuid level: %d\n", -- cpuid_eax(0)); -- } -+ intel_unlock_cpuid_leaves(c); - - /* CPUID workaround for Intel 0F33/0F34 CPU */ - if (boot_cpu_data.x86 == 0xF && boot_cpu_data.x86_model == 3 && --- -2.45.2 - diff --git a/0030-x86-fix-UP-build-with-gcc14.patch b/0030-x86-fix-UP-build-with-gcc14.patch new file mode 100644 index 0000000..bdb7dbe --- /dev/null +++ b/0030-x86-fix-UP-build-with-gcc14.patch @@ -0,0 +1,63 @@ +From d625c4e9fb46ef1b81a5b32d8fe1774c432cddd6 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:41:59 +0200 +Subject: [PATCH 30/35] x86: fix UP build with gcc14 +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The complaint is: + +In file included from 
././include/xen/config.h:17, + from <command-line>: +arch/x86/smpboot.c: In function ‘link_thread_siblings.constprop’: +./include/asm-generic/percpu.h:16:51: error: array subscript [0, 0] is outside array bounds of ‘long unsigned int[1]’ [-Werror=array-bounds=] + 16 | (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu])) +./include/xen/compiler.h:140:29: note: in definition of macro ‘RELOC_HIDE’ + 140 | (typeof(ptr)) (__ptr + (off)); }) + | ^~~ +arch/x86/smpboot.c:238:27: note: in expansion of macro ‘per_cpu’ + 238 | cpumask_set_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu1)); + | ^~~~~~~ +In file included from ./arch/x86/include/generated/asm/percpu.h:1, + from ./include/xen/percpu.h:30, + from ./arch/x86/include/asm/cpuid.h:9, + from ./arch/x86/include/asm/cpufeature.h:11, + from ./arch/x86/include/asm/system.h:6, + from ./include/xen/list.h:11, + from ./include/xen/mm.h:68, + from arch/x86/smpboot.c:12: +./include/asm-generic/percpu.h:12:22: note: while referencing ‘__per_cpu_offset’ + 12 | extern unsigned long __per_cpu_offset[NR_CPUS]; + | ^~~~~~~~~~~~~~~~ + +Which I consider bogus in the first place ("array subscript [0, 0]" vs a +1-element array). Yet taking the experience from 99f942f3d410 ("Arm64: +adjust __irq_to_desc() to fix build with gcc14") I guessed that +switching function parameters to unsigned int (which they should have +been anyway) might help. And voilà ... + +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: a2de7dc4d845738e734b10fce6550c89c6b1092c +master date: 2024-09-04 16:09:28 +0200 +--- + xen/arch/x86/smpboot.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c +index 8aa621533f..0a89f22a39 100644 +--- a/xen/arch/x86/smpboot.c ++++ b/xen/arch/x86/smpboot.c +@@ -226,7 +226,7 @@ static int booting_cpu; + /* CPUs for which sibling maps can be computed. 
*/ + static cpumask_t cpu_sibling_setup_map; + +-static void link_thread_siblings(int cpu1, int cpu2) ++static void link_thread_siblings(unsigned int cpu1, unsigned int cpu2) + { + cpumask_set_cpu(cpu1, per_cpu(cpu_sibling_mask, cpu2)); + cpumask_set_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu1)); +-- +2.46.1 + diff --git a/0030-x86-irq-deal-with-old_cpu_mask-for-interrupts-in-mov.patch b/0030-x86-irq-deal-with-old_cpu_mask-for-interrupts-in-mov.patch deleted file mode 100644 index 785df10..0000000 --- a/0030-x86-irq-deal-with-old_cpu_mask-for-interrupts-in-mov.patch +++ /dev/null @@ -1,84 +0,0 @@ -From 39a6170c15bf369a2b26c855ea7621387ed4070b Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:41:35 +0200 -Subject: [PATCH 30/56] x86/irq: deal with old_cpu_mask for interrupts in - movement in fixup_irqs() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Given the current logic it's possible for ->arch.old_cpu_mask to get out of -sync: if a CPU set in old_cpu_mask is offlined and then onlined -again without old_cpu_mask having been updated the data in the mask will no -longer be accurate, as when brought back online the CPU will no longer have -old_vector configured to handle the old interrupt source. - -If there's an interrupt movement in progress, and the to be offlined CPU (which -is the call context) is in the old_cpu_mask, clear it and update the mask, so -it doesn't contain stale data. - -Note that when the system is going down fixup_irqs() will be called by -smp_send_stop() from CPU 0 with a mask with only CPU 0 on it, effectively -asking to move all interrupts to the current caller (CPU 0) which is the only -CPU to remain online. In that case we don't care to migrate interrupts that -are in the process of being moved, as it's likely we won't be able to move all -interrupts to CPU 0 due to vector shortage anyway. 
- -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 817d1cd627be668c358d038f0fadbf7d24d417d3 -master date: 2024-06-18 15:14:49 +0200 ---- - xen/arch/x86/irq.c | 29 ++++++++++++++++++++++++++++- - 1 file changed, 28 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index 566331bec1..f877327975 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -2539,7 +2539,7 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - for ( irq = 0; irq < nr_irqs; irq++ ) - { - bool break_affinity = false, set_affinity = true; -- unsigned int vector; -+ unsigned int vector, cpu = smp_processor_id(); - cpumask_t *affinity = this_cpu(scratch_cpumask); - - if ( irq == 2 ) -@@ -2582,6 +2582,33 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - affinity); - } - -+ if ( desc->arch.move_in_progress && -+ /* -+ * Only attempt to adjust the mask if the current CPU is going -+ * offline, otherwise the whole system is going down and leaving -+ * stale data in the masks is fine. -+ */ -+ !cpu_online(cpu) && -+ cpumask_test_cpu(cpu, desc->arch.old_cpu_mask) ) -+ { -+ /* -+ * This CPU is going offline, remove it from ->arch.old_cpu_mask -+ * and possibly release the old vector if the old mask becomes -+ * empty. -+ * -+ * Note cleaning ->arch.old_cpu_mask is required if the CPU is -+ * brought offline and then online again, as when re-onlined the -+ * per-cpu vector table will no longer have ->arch.old_vector -+ * setup, and hence ->arch.old_cpu_mask would be stale. -+ */ -+ cpumask_clear_cpu(cpu, desc->arch.old_cpu_mask); -+ if ( cpumask_empty(desc->arch.old_cpu_mask) ) -+ { -+ desc->arch.move_in_progress = 0; -+ release_old_vec(desc); -+ } -+ } -+ - /* - * Avoid shuffling the interrupt around as long as current target CPUs - * are a subset of the input mask. 
What fixup_irqs() cares about is --- -2.45.2 - diff --git a/0031-x86-irq-handle-moving-interrupts-in-_assign_irq_vect.patch b/0031-x86-irq-handle-moving-interrupts-in-_assign_irq_vect.patch deleted file mode 100644 index 96e87cd..0000000 --- a/0031-x86-irq-handle-moving-interrupts-in-_assign_irq_vect.patch +++ /dev/null @@ -1,172 +0,0 @@ -From 3a8f4ec75d8ed8da6370deac95c341cbada96802 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:42:05 +0200 -Subject: [PATCH 31/56] x86/irq: handle moving interrupts in - _assign_irq_vector() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Currently there's logic in fixup_irqs() that attempts to prevent -_assign_irq_vector() from failing, as fixup_irqs() is required to evacuate all -interrupts from the CPUs not present in the input mask. The current logic in -fixup_irqs() is incomplete, as it doesn't deal with interrupts that have -move_cleanup_count > 0 and a non-empty ->arch.old_cpu_mask field. - -Instead of attempting to fixup the interrupt descriptor in fixup_irqs() so that -_assign_irq_vector() cannot fail, introduce logic in _assign_irq_vector() -to deal with interrupts that have either move_{in_progress,cleanup_count} set -and no remaining online CPUs in ->arch.cpu_mask. - -If _assign_irq_vector() is requested to move an interrupt in the state -described above, first attempt to see if ->arch.old_cpu_mask contains any valid -CPUs that could be used as fallback, and if that's the case do move the -interrupt back to the previous destination. Note this is easier because the -vector hasn't been released yet, so there's no need to allocate and setup a new -vector on the destination. 
- -Due to the logic in fixup_irqs() that clears offline CPUs from -->arch.old_cpu_mask (and releases the old vector if the mask becomes empty) it -shouldn't be possible to get into _assign_irq_vector() with -->arch.move_{in_progress,cleanup_count} set but no online CPUs in -->arch.old_cpu_mask. - -However if ->arch.move_{in_progress,cleanup_count} is set and the interrupt has -also changed affinity, it's possible the members of ->arch.old_cpu_mask are no -longer part of the affinity set, move the interrupt to a different CPU part of -the provided mask and keep the current ->arch.old_{cpu_mask,vector} for the -pending interrupt movement to be completed. - -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 369558924a642bbb0cb731e9a3375958867cb17b -master date: 2024-06-18 15:15:10 +0200 ---- - xen/arch/x86/irq.c | 97 ++++++++++++++++++++++++++++++++-------------- - 1 file changed, 68 insertions(+), 29 deletions(-) - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index f877327975..13ef61a5b7 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -553,7 +553,58 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask) - } - - if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count ) -- return -EAGAIN; -+ { -+ /* -+ * If the current destination is online refuse to shuffle. Retry after -+ * the in-progress movement has finished. -+ */ -+ if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) ) -+ return -EAGAIN; -+ -+ /* -+ * Due to the logic in fixup_irqs() that clears offlined CPUs from -+ * ->arch.old_cpu_mask it shouldn't be possible to get here with -+ * ->arch.move_{in_progress,cleanup_count} set and no online CPUs in -+ * ->arch.old_cpu_mask. 
-+ */ -+ ASSERT(valid_irq_vector(desc->arch.old_vector)); -+ ASSERT(cpumask_intersects(desc->arch.old_cpu_mask, &cpu_online_map)); -+ -+ if ( cpumask_intersects(desc->arch.old_cpu_mask, mask) ) -+ { -+ /* -+ * Fallback to the old destination if moving is in progress and the -+ * current destination is to be offlined. This is only possible if -+ * the CPUs in old_cpu_mask intersect with the affinity mask passed -+ * in the 'mask' parameter. -+ */ -+ desc->arch.vector = desc->arch.old_vector; -+ cpumask_and(desc->arch.cpu_mask, desc->arch.old_cpu_mask, mask); -+ -+ /* Undo any possibly done cleanup. */ -+ for_each_cpu(cpu, desc->arch.cpu_mask) -+ per_cpu(vector_irq, cpu)[desc->arch.vector] = irq; -+ -+ /* Cancel the pending move and release the current vector. */ -+ desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; -+ cpumask_clear(desc->arch.old_cpu_mask); -+ desc->arch.move_in_progress = 0; -+ desc->arch.move_cleanup_count = 0; -+ if ( desc->arch.used_vectors ) -+ { -+ ASSERT(test_bit(old_vector, desc->arch.used_vectors)); -+ clear_bit(old_vector, desc->arch.used_vectors); -+ } -+ -+ return 0; -+ } -+ -+ /* -+ * There's an interrupt movement in progress but the destination(s) in -+ * ->arch.old_cpu_mask are not suitable given the 'mask' parameter, go -+ * through the full logic to find a new vector in a suitable CPU. -+ */ -+ } - - err = -ENOSPC; - -@@ -609,7 +660,22 @@ next: - current_vector = vector; - current_offset = offset; - -- if ( valid_irq_vector(old_vector) ) -+ if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count ) -+ { -+ ASSERT(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map)); -+ /* -+ * Special case when evacuating an interrupt from a CPU to be -+ * offlined and the interrupt was already in the process of being -+ * moved. Leave ->arch.old_{vector,cpu_mask} as-is and just -+ * replace ->arch.{cpu_mask,vector} with the new destination. -+ * Cleanup will be done normally for the old fields, just release -+ * the current vector here. 
-+ */ -+ if ( desc->arch.used_vectors && -+ !test_and_clear_bit(old_vector, desc->arch.used_vectors) ) -+ ASSERT_UNREACHABLE(); -+ } -+ else if ( valid_irq_vector(old_vector) ) - { - cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask, - &cpu_online_map); -@@ -2620,33 +2686,6 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - continue; - } - -- /* -- * In order for the affinity adjustment below to be successful, we -- * need _assign_irq_vector() to succeed. This in particular means -- * clearing desc->arch.move_in_progress if this would otherwise -- * prevent the function from succeeding. Since there's no way for the -- * flag to get cleared anymore when there's no possible destination -- * left (the only possibility then would be the IRQs enabled window -- * after this loop), there's then also no race with us doing it here. -- * -- * Therefore the logic here and there need to remain in sync. -- */ -- if ( desc->arch.move_in_progress && -- !cpumask_intersects(mask, desc->arch.cpu_mask) ) -- { -- unsigned int cpu; -- -- cpumask_and(affinity, desc->arch.old_cpu_mask, &cpu_online_map); -- -- spin_lock(&vector_lock); -- for_each_cpu(cpu, affinity) -- per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq; -- spin_unlock(&vector_lock); -- -- release_old_vec(desc); -- desc->arch.move_in_progress = 0; -- } -- - if ( !cpumask_intersects(mask, desc->affinity) ) - { - break_affinity = true; --- -2.45.2 - diff --git a/0031-x86emul-test-fix-build-with-gas-2.43.patch b/0031-x86emul-test-fix-build-with-gas-2.43.patch new file mode 100644 index 0000000..fe30e10 --- /dev/null +++ b/0031-x86emul-test-fix-build-with-gas-2.43.patch @@ -0,0 +1,86 @@ +From 78d412f8bc3d78458cd868ba375ad30175194d91 Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:42:39 +0200 +Subject: [PATCH 31/35] x86emul/test: fix build with gas 2.43 + +Drop explicit {evex} pseudo-prefixes. 
New gas (validly) complains when +they're used on things other than instructions. Our use was potentially +ahead of macro invocations - see simd.h's "override" macro. + +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: 3c09288298af881ea1bb568740deb2d2a06bcd41 +master date: 2024-09-06 08:41:18 +0200 +--- + tools/tests/x86_emulator/simd.c | 14 +++++++------- + 1 file changed, 7 insertions(+), 7 deletions(-) + +diff --git a/tools/tests/x86_emulator/simd.c b/tools/tests/x86_emulator/simd.c +index 263cea662d..d68a7364c2 100644 +--- a/tools/tests/x86_emulator/simd.c ++++ b/tools/tests/x86_emulator/simd.c +@@ -333,7 +333,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { + # if FLOAT_SIZE == 4 + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vbroadcastss %1, %0" \ ++ asm ( "vbroadcastss %1, %0" \ + : "=v" (t_) : "m" (*(float[1]){ x }) ); \ + t_; \ + }) +@@ -401,14 +401,14 @@ static inline vec_t movlhps(vec_t x, vec_t y) { + # if VEC_SIZE >= 32 + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vbroadcastsd %1, %0" : "=v" (t_) \ ++ asm ( "vbroadcastsd %1, %0" : "=v" (t_) \ + : "m" (*(double[1]){ x }) ); \ + t_; \ + }) + # else + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vpbroadcastq %1, %0" \ ++ asm ( "vpbroadcastq %1, %0" \ + : "=v" (t_) : "m" (*(double[1]){ x }) ); \ + t_; \ + }) +@@ -601,7 +601,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { + # if INT_SIZE == 4 || UINT_SIZE == 4 + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vpbroadcastd %1, %0" \ ++ asm ( "vpbroadcastd %1, %0" \ + : "=v" (t_) : "m" (*(int[1]){ x }) ); \ + t_; \ + }) +@@ -649,7 +649,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { + # elif INT_SIZE == 8 || UINT_SIZE == 8 + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vpbroadcastq %1, %0" \ ++ asm ( "vpbroadcastq %1, %0" \ + : "=v" (t_) : "m" (*(long long[1]){ x }) ); \ + t_; \ + }) +@@ -716,7 +716,7 @@ 
static inline vec_t movlhps(vec_t x, vec_t y) { + # if INT_SIZE == 1 || UINT_SIZE == 1 + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vpbroadcastb %1, %0" \ ++ asm ( "vpbroadcastb %1, %0" \ + : "=v" (t_) : "m" (*(char[1]){ x }) ); \ + t_; \ + }) +@@ -745,7 +745,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { + # elif INT_SIZE == 2 || UINT_SIZE == 2 + # define broadcast(x) ({ \ + vec_t t_; \ +- asm ( "%{evex%} vpbroadcastw %1, %0" \ ++ asm ( "vpbroadcastw %1, %0" \ + : "=v" (t_) : "m" (*(short[1]){ x }) ); \ + t_; \ + }) +-- +2.46.1 + diff --git a/0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch b/0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch new file mode 100644 index 0000000..79652e7 --- /dev/null +++ b/0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch @@ -0,0 +1,45 @@ +From ec3999e205ccadbeb8ab1f8420dea02fee2b5a5d Mon Sep 17 00:00:00 2001 +From: Jan Beulich <jbeulich@suse.com> +Date: Tue, 24 Sep 2024 14:43:02 +0200 +Subject: [PATCH 32/35] x86/HVM: properly reject "indirect" VRAM writes + +While ->count will only be different from 1 for "indirect" (data in +guest memory) accesses, it being 1 does not exclude the request being an +"indirect" one. Check both to be on the safe side, and bring the ->count +part also in line with what ioreq_send_buffered() actually refuses to +handle. 
+ +Fixes: 3bbaaec09b1b ("x86/hvm: unify stdvga mmio intercept with standard mmio intercept") +Signed-off-by: Jan Beulich <jbeulich@suse.com> +Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> +master commit: eb7cd0593d88c4b967a24bca8bd30591966676cd +master date: 2024-09-12 09:13:04 +0200 +--- + xen/arch/x86/hvm/stdvga.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c +index b16c59f772..5f02d88615 100644 +--- a/xen/arch/x86/hvm/stdvga.c ++++ b/xen/arch/x86/hvm/stdvga.c +@@ -530,14 +530,14 @@ static bool cf_check stdvga_mem_accept( + + spin_lock(&s->lock); + +- if ( p->dir == IOREQ_WRITE && p->count > 1 ) ++ if ( p->dir == IOREQ_WRITE && (p->data_is_ptr || p->count != 1) ) + { + /* + * We cannot return X86EMUL_UNHANDLEABLE on anything other then the + * first cycle of an I/O. So, since we cannot guarantee to always be + * able to send buffered writes, we have to reject any multi-cycle +- * I/O and, since we are rejecting an I/O, we must invalidate the +- * cache. ++ * or "indirect" I/O and, since we are rejecting an I/O, we must ++ * invalidate the cache. + * Single-cycle write transactions are accepted even if the cache is + * not active since we can assert, when in stdvga mode, that writes + * to VRAM have no side effect and thus we can try to buffer them. +-- +2.46.1 + diff --git a/0032-xen-ubsan-Fix-UB-in-type_descriptor-declaration.patch b/0032-xen-ubsan-Fix-UB-in-type_descriptor-declaration.patch deleted file mode 100644 index c7c0968..0000000 --- a/0032-xen-ubsan-Fix-UB-in-type_descriptor-declaration.patch +++ /dev/null @@ -1,39 +0,0 @@ -From 5397ab9995f7354e7f8122a8a91c810256afa3d1 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Wed, 26 Jun 2024 13:42:30 +0200 -Subject: [PATCH 32/56] xen/ubsan: Fix UB in type_descriptor declaration - -struct type_descriptor is arranged with a NUL terminated string following the -kind/info fields. 
- -The only reason this doesn't trip UBSAN detection itself (on more modern -compilers at least) is because struct type_descriptor is only referenced in -suppressed regions. - -Switch the declaration to be a real flexible member. No functional change. - -Fixes: 00fcf4dd8eb4 ("xen/ubsan: Import ubsan implementation from Linux 4.13") -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: bd59af99700f075d06a6d47a16f777c9519928e0 -master date: 2024-06-18 14:55:04 +0100 ---- - xen/common/ubsan/ubsan.h | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/xen/common/ubsan/ubsan.h b/xen/common/ubsan/ubsan.h -index a3159040fe..3db42e75b1 100644 ---- a/xen/common/ubsan/ubsan.h -+++ b/xen/common/ubsan/ubsan.h -@@ -10,7 +10,7 @@ enum { - struct type_descriptor { - u16 type_kind; - u16 type_info; -- char type_name[1]; -+ char type_name[]; - }; - - struct source_location { --- -2.45.2 - diff --git a/0033-x86-xstate-Fix-initialisation-of-XSS-cache.patch b/0033-x86-xstate-Fix-initialisation-of-XSS-cache.patch deleted file mode 100644 index 1a8c724..0000000 --- a/0033-x86-xstate-Fix-initialisation-of-XSS-cache.patch +++ /dev/null @@ -1,74 +0,0 @@ -From 4ee1df89d9c92609e5fff3c9b261ce4b1bb88e42 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Wed, 26 Jun 2024 13:43:19 +0200 -Subject: [PATCH 33/56] x86/xstate: Fix initialisation of XSS cache - -The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid -values is to force the subsequent set_xcr0() and set_msr_xss() to reload the -hardware register. - -While XCR0 is reloaded in xstate_init(), MSR_XSS isn't. This causes -get_msr_xss() to return the invalid value, and logic of the form: - - old = get_msr_xss(); - set_msr_xss(new); - ... - set_msr_xss(old); - -to try and restore said invalid value. 
- -The architecturally invalid value must be purged from the cache, meaning the -hardware register must be written at least once. This in turn highlights that -the invalid value must only be used in the case that the hardware register is -available. - -Fixes: f7f4a523927f ("x86/xstate: reset cached register values on resume") -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 9e6dbbe8bf400aacb99009ddffa91d2a0c312b39 -master date: 2024-06-19 13:00:06 +0100 ---- - xen/arch/x86/xstate.c | 18 +++++++++++------- - 1 file changed, 11 insertions(+), 7 deletions(-) - -diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c -index f442610fc5..ca76f98fe2 100644 ---- a/xen/arch/x86/xstate.c -+++ b/xen/arch/x86/xstate.c -@@ -641,13 +641,6 @@ void xstate_init(struct cpuinfo_x86 *c) - return; - } - -- /* -- * Zap the cached values to make set_xcr0() and set_msr_xss() really -- * write it. -- */ -- this_cpu(xcr0) = 0; -- this_cpu(xss) = ~0; -- - cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx); - feature_mask = (((u64)edx << 32) | eax) & XCNTXT_MASK; - BUG_ON(!valid_xcr0(feature_mask)); -@@ -657,8 +650,19 @@ void xstate_init(struct cpuinfo_x86 *c) - * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size. - */ - set_in_cr4(X86_CR4_OSXSAVE); -+ -+ /* -+ * Zap the cached values to make set_xcr0() and set_msr_xss() really write -+ * the hardware register. 
-+ */ -+ this_cpu(xcr0) = 0; - if ( !set_xcr0(feature_mask) ) - BUG(); -+ if ( cpu_has_xsaves ) -+ { -+ this_cpu(xss) = ~0; -+ set_msr_xss(0); -+ } - - if ( bsp ) - { --- -2.45.2 - diff --git a/0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch b/0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch new file mode 100644 index 0000000..d5d65b8 --- /dev/null +++ b/0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch @@ -0,0 +1,63 @@ +From d0ea9b319d4ca04e29ef533db0c3655a78dec315 Mon Sep 17 00:00:00 2001 +From: Stefano Stabellini <stefano.stabellini@amd.com> +Date: Tue, 24 Sep 2024 14:43:24 +0200 +Subject: [PATCH 33/35] xen/x86/pvh: handle ACPI RSDT table in PVH Dom0 build +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Xen always generates an XSDT table even if the firmware only provided an +RSDT table. Copy the RSDT header from the firmware table, adjusting the +signature, for the XSDT table when not provided by the firmware. + +This is necessary to run Xen on QEMU. + +Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') +Suggested-by: Roger Pau Monné <roger.pau@citrix.com> +Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> +Signed-off-by: Daniel P. 
Smith <dpsmith@apertussolutions.com> +Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> +master commit: 6e7f7a0c16c4d406bda6d4a900252ff63a7c5fad +master date: 2024-09-12 09:18:25 +0200 +--- + xen/arch/x86/hvm/dom0_build.c | 17 ++++++++++++++++- + 1 file changed, 16 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c +index f3eddb6846..3dd913bdb0 100644 +--- a/xen/arch/x86/hvm/dom0_build.c ++++ b/xen/arch/x86/hvm/dom0_build.c +@@ -1078,7 +1078,16 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, + rc = -EINVAL; + goto out; + } +- xsdt_paddr = rsdp->xsdt_physical_address; ++ /* ++ * Note the header is the same for both RSDT and XSDT, so it's fine to ++ * copy the native RSDT header to the Xen crafted XSDT if no native ++ * XSDT is available. ++ */ ++ if ( rsdp->revision > 1 && rsdp->xsdt_physical_address ) ++ xsdt_paddr = rsdp->xsdt_physical_address; ++ else ++ xsdt_paddr = rsdp->rsdt_physical_address; ++ + acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); + table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); + if ( !table ) +@@ -1090,6 +1099,12 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, + xsdt->header = *table; + acpi_os_unmap_memory(table, sizeof(*table)); + ++ /* ++ * In case the header is an RSDT copy, unconditionally ensure it has ++ * an XSDT sig. ++ */ ++ xsdt->header.signature[0] = 'X'; ++ + /* Add the custom MADT. 
*/
+     xsdt->table_offset_entry[0] = madt_addr;
+
+--
+2.46.1
+
diff --git a/0034-blkif-reconcile-protocol-specification-with-in-use-i.patch b/0034-blkif-reconcile-protocol-specification-with-in-use-i.patch
new file mode 100644
index 0000000..baa2b49
--- /dev/null
+++ b/0034-blkif-reconcile-protocol-specification-with-in-use-i.patch
@@ -0,0 +1,183 @@
+From 933416b13966a3fa2a37b1f645c23afbd8fb6d09 Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
+Date: Tue, 24 Sep 2024 14:43:50 +0200
+Subject: [PATCH 34/35] blkif: reconcile protocol specification with in-use
+ implementations
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Current blkif implementations (both backends and frontends) have all slight
+differences about how they handle the 'sector-size' xenstore node, and how
+other fields are derived from this value or hardcoded to be expressed in units
+of 512 bytes.
+
+To give some context, this is an excerpt of how different implementations use
+the value in 'sector-size' as the base unit for to other fields rather than
+just to set the logical sector size of the block device:
+
+                        │ sectors xenbus node │ requests sector_number │ requests {first,last}_sect
+────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
+FreeBSD blk{front,back} │ sector-size         │ sector-size            │ 512
+────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
+Linux blk{front,back}   │ 512                 │ 512                    │ 512
+────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
+QEMU blkback            │ sector-size         │ sector-size            │ sector-size
+────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
+Windows blkfront        │ sector-size         │ sector-size            │ sector-size
+────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
+MiniOS                  │ sector-size         │ 512                    │ 512
+
+An attempt was made by 67e1c050e36b in order to change the base units of the
+request fields and the xenstore 'sectors' node. That however only lead to more
+confusion, as the specification now clearly diverged from the reference
+implementation in Linux. Such change was only implemented for QEMU Qdisk
+and Windows PV blkfront.
+
+Partially revert to the state before 67e1c050e36b while adjusting the
+documentation for 'sectors' to match what it used to be previous to
+2fa701e5346d:
+
+ * Declare 'feature-large-sector-size' deprecated. Frontends should not expose
+   the node, backends should not make decisions based on its presence.
+
+ * Clarify that 'sectors' xenstore node and the requests fields are always in
+   512-byte units, like it was previous to 2fa701e5346d and 67e1c050e36b.
+
+All base units for the fields used in the protocol are 512-byte based, the
+xenbus 'sector-size' field is only used to signal the logic block size. When
+'sector-size' is greater than 512, blkfront implementations must make sure that
+the offsets and sizes (despite being expressed in 512-byte units) are aligned
+to the logical block size specified in 'sector-size', otherwise the backend
+will fail to process the requests.
+
+This will require changes to some of the frontends and backends in order to
+properly support 'sector-size' nodes greater than 512.
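Editorial note, not part of the patch: the unit rule described in the commit message can be illustrated with a minimal C sketch. It assumes only what the message states, namely that 'sectors' and 'sector_number' are counted in 512-byte units and that 'sector-size' is a power of two of at least 512. The names `BLKIF_UNIT`, `device_size_bytes` and `request_is_aligned` are hypothetical and do not appear in blkif.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the fixed 512-byte base unit of the protocol fields. */
#define BLKIF_UNIT 512u

/* Device size in bytes, derived from the 'sectors' node (512-byte units). */
static uint64_t device_size_bytes(uint64_t sectors)
{
    return sectors * BLKIF_UNIT;
}

/*
 * Hypothetical helper: true if a request that starts at 'sector_number' and
 * covers 'nr_units' (both counted in 512-byte units) is aligned to the
 * logical block size advertised in 'sector-size'.
 */
static bool request_is_aligned(uint64_t sector_number, uint64_t nr_units,
                               uint32_t sector_size)
{
    /* 'sector-size' must be a power of two with a minimum value of 512. */
    assert(sector_size >= BLKIF_UNIT && !(sector_size & (sector_size - 1)));

    uint64_t units_per_block = sector_size / BLKIF_UNIT;

    return sector_number % units_per_block == 0 &&
           nr_units % units_per_block == 0;
}
```

With a 'sector-size' of 4096 there are eight 512-byte units per logical block, so under this sketch a request at sector_number 8 is acceptable while one at sector_number 9 is not, even though both are valid 512-byte offsets.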
+ +Fixes: 2fa701e5346d ('blkif.h: Provide more complete documentation of the blkif interface') +Fixes: 67e1c050e36b ('public/io/blkif.h: try to fix the semantics of sector based quantities') +Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> +Reviewed-by: Juergen Gross <jgross@suse.com> +Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> +master commit: 221f2748e8dabe8361b8cdfcffbeab9102c4c899 +master date: 2024-09-12 14:04:56 +0200 +--- + xen/include/public/io/blkif.h | 52 ++++++++++++++++++++++++++--------- + 1 file changed, 39 insertions(+), 13 deletions(-) + +diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h +index 22f1eef0c0..9b00d633d3 100644 +--- a/xen/include/public/io/blkif.h ++++ b/xen/include/public/io/blkif.h +@@ -237,12 +237,16 @@ + * sector-size + * Values: <uint32_t> + * +- * The logical block size, in bytes, of the underlying storage. This +- * must be a power of two with a minimum value of 512. ++ * The logical block size, in bytes, of the underlying storage. This must ++ * be a power of two with a minimum value of 512. The sector size should ++ * only be used for request segment length and alignment. + * +- * NOTE: Because of implementation bugs in some frontends this must be +- * set to 512, unless the frontend advertizes a non-zero value +- * in its "feature-large-sector-size" xenbus node. (See below). ++ * When exposing a device that uses a logical sector size of 4096, the ++ * only difference xenstore wise will be that 'sector-size' (and possibly ++ * 'physical-sector-size' if supported by the backend) will be 4096, but ++ * the 'sectors' node will still be calculated using 512 byte units. The ++ * sector base units in the ring requests fields will all be 512 byte ++ * based despite the logical sector size exposed in 'sector-size'. 
+ * + * physical-sector-size + * Values: <uint32_t> +@@ -254,9 +258,9 @@ + * sectors + * Values: <uint64_t> + * +- * The size of the backend device, expressed in units of "sector-size". +- * The product of "sector-size" and "sectors" must also be an integer +- * multiple of "physical-sector-size", if that node is present. ++ * The size of the backend device, expressed in units of 512b. The ++ * product of "sectors" * 512 must also be an integer multiple of ++ * "physical-sector-size", if that node is present. + * + ***************************************************************************** + * Frontend XenBus Nodes +@@ -338,6 +342,7 @@ + * feature-large-sector-size + * Values: 0/1 (boolean) + * Default Value: 0 ++ * Notes: DEPRECATED, 12 + * + * A value of "1" indicates that the frontend will correctly supply and + * interpret all sector-based quantities in terms of the "sector-size" +@@ -411,6 +416,11 @@ + *(10) The discard-secure property may be present and will be set to 1 if the + * backing device supports secure discard. + *(11) Only used by Linux and NetBSD. ++ *(12) Possibly only ever implemented by the QEMU Qdisk backend and the Windows ++ * PV block frontend. Other backends and frontends supported 'sector-size' ++ * values greater than 512 before such feature was added. Frontends should ++ * not expose this node, neither should backends make any decisions based ++ * on it being exposed by the frontend. + */ + + /* +@@ -619,11 +629,14 @@ + #define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8 + + /* +- * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as +- * 'sector_number' in blkif_request, blkif_request_discard and +- * blkif_request_indirect are sector-based quantities. See the description +- * of the "feature-large-sector-size" frontend xenbus node above for +- * more information. ++ * NB. 
'first_sect' and 'last_sect' in blkif_request_segment are all in units ++ * of 512 bytes, despite the 'sector-size' xenstore node possibly having a ++ * value greater than 512. ++ * ++ * The value in 'first_sect' and 'last_sect' fields must be setup so that the ++ * resulting segment offset and size is aligned to the logical sector size ++ * reported by the 'sector-size' xenstore node, see 'Backend Device Properties' ++ * section. + */ + struct blkif_request_segment { + grant_ref_t gref; /* reference to I/O buffer frame */ +@@ -634,6 +647,10 @@ struct blkif_request_segment { + + /* + * Starting ring element for any I/O request. ++ * ++ * The 'sector_number' field is in units of 512b, despite the value of the ++ * 'sector-size' xenstore node. Note however that the offset in ++ * 'sector_number' must be aligned to 'sector-size'. + */ + struct blkif_request { + uint8_t operation; /* BLKIF_OP_??? */ +@@ -648,6 +665,10 @@ typedef struct blkif_request blkif_request_t; + /* + * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD + * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request) ++ * ++ * The 'sector_number' field is in units of 512b, despite the value of the ++ * 'sector-size' xenstore node. Note however that the offset in ++ * 'sector_number' must be aligned to 'sector-size'. + */ + struct blkif_request_discard { + uint8_t operation; /* BLKIF_OP_DISCARD */ +@@ -660,6 +681,11 @@ struct blkif_request_discard { + }; + typedef struct blkif_request_discard blkif_request_discard_t; + ++/* ++ * The 'sector_number' field is in units of 512b, despite the value of the ++ * 'sector-size' xenstore node. Note however that the offset in ++ * 'sector_number' must be aligned to 'sector-size'. 
++ */ + struct blkif_request_indirect { + uint8_t operation; /* BLKIF_OP_INDIRECT */ + uint8_t indirect_op; /* BLKIF_OP_{READ/WRITE} */ +-- +2.46.1 + diff --git a/0034-x86-cpuid-Fix-handling-of-XSAVE-dynamic-leaves.patch b/0034-x86-cpuid-Fix-handling-of-XSAVE-dynamic-leaves.patch deleted file mode 100644 index 1905728..0000000 --- a/0034-x86-cpuid-Fix-handling-of-XSAVE-dynamic-leaves.patch +++ /dev/null @@ -1,72 +0,0 @@ -From 9b43092d54b5f9e9d39d9f20393671e303b19e81 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Wed, 26 Jun 2024 13:43:44 +0200 -Subject: [PATCH 34/56] x86/cpuid: Fix handling of XSAVE dynamic leaves - -[ This is a minimal backport of commit 71cacfb035f4 ("x86/cpuid: Fix handling - of XSAVE dynamic leaves") to fix the bugs without depending on the large - rework of XSTATE handling in Xen 4.19 ] - -First, if XSAVE is available in hardware but not visible to the guest, the -dynamic leaves shouldn't be filled in. - -Second, the comment concerning XSS state is wrong. VT-x doesn't manage -host/guest state automatically, but there is provision for "host only" bits to -be set, so the implications are still accurate. - -In Xen 4.18, no XSS states are supported, so it's safe to keep deferring to -real hardware. - -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 71cacfb035f4a78ee10970dc38a3baa04d387451 -master date: 2024-06-19 13:00:06 +0100 ---- - xen/arch/x86/cpuid.c | 30 +++++++++++++----------------- - 1 file changed, 13 insertions(+), 17 deletions(-) - -diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c -index 455a09b2dd..f6fd6cc6b3 100644 ---- a/xen/arch/x86/cpuid.c -+++ b/xen/arch/x86/cpuid.c -@@ -330,24 +330,20 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf, - case XSTATE_CPUID: - switch ( subleaf ) - { -- case 1: -- if ( p->xstate.xsavec || p->xstate.xsaves ) -- { -- /* -- * TODO: Figure out what to do for XSS state. 
VT-x manages -- * host vs guest MSR_XSS automatically, so as soon as we start -- * supporting any XSS states, the wrong XSS will be in -- * context. -- */ -- BUILD_BUG_ON(XSTATE_XSAVES_ONLY != 0); -- -- /* -- * Read CPUID[0xD,0/1].EBX from hardware. They vary with -- * enabled XSTATE, and appropraite XCR0|XSS are in context. -- */ -+ /* -+ * Read CPUID[0xd,0/1].EBX from hardware. They vary with enabled -+ * XSTATE, and the appropriate XCR0 is in context. -+ */ - case 0: -- res->b = cpuid_count_ebx(leaf, subleaf); -- } -+ if ( p->basic.xsave ) -+ res->b = cpuid_count_ebx(0xd, 0); -+ break; -+ -+ case 1: -+ /* This only works because Xen doesn't support XSS states yet. */ -+ BUILD_BUG_ON(XSTATE_XSAVES_ONLY != 0); -+ if ( p->xstate.xsavec ) -+ res->b = cpuid_count_ebx(0xd, 1); - break; - } - break; --- -2.45.2 - diff --git a/0035-x86-irq-forward-pending-interrupts-to-new-destinatio.patch b/0035-x86-irq-forward-pending-interrupts-to-new-destinatio.patch deleted file mode 100644 index f05b09e..0000000 --- a/0035-x86-irq-forward-pending-interrupts-to-new-destinatio.patch +++ /dev/null @@ -1,143 +0,0 @@ -From e95d30f9e5eed0c5d9dbf72d4cc3ae373152ab10 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Wed, 26 Jun 2024 13:44:08 +0200 -Subject: [PATCH 35/56] x86/irq: forward pending interrupts to new destination - in fixup_irqs() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -fixup_irqs() is used to evacuate interrupts from to be offlined CPUs. Given -the CPU is to become offline, the normal migration logic used by Xen where the -vector in the previous target(s) is left configured until the interrupt is -received on the new destination is not suitable. - -Instead attempt to do as much as possible in order to prevent loosing -interrupts. 
If fixup_irqs() is called from the CPU to be offlined (as is -currently the case for CPU hot unplug) attempt to forward pending vectors when -interrupts that target the current CPU are migrated to a different destination. - -Additionally, for interrupts that have already been moved from the current CPU -prior to the call to fixup_irqs() but that haven't been delivered to the new -destination (iow: interrupts with move_in_progress set and the current CPU set -in ->arch.old_cpu_mask) also check whether the previous vector is pending and -forward it to the new destination. - -This allows us to remove the window with interrupts enabled at the bottom of -fixup_irqs(). Such window wasn't safe anyway: references to the CPU to become -offline are removed from interrupts masks, but the per-CPU vector_irq[] array -is not updated to reflect those changes (as the CPU is going offline anyway). - -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: e2bb28d621584fce15c907002ddc7c6772644b64 -master date: 2024-06-20 12:09:32 +0200 ---- - xen/arch/x86/include/asm/apic.h | 5 ++++ - xen/arch/x86/irq.c | 46 ++++++++++++++++++++++++++++----- - 2 files changed, 45 insertions(+), 6 deletions(-) - -diff --git a/xen/arch/x86/include/asm/apic.h b/xen/arch/x86/include/asm/apic.h -index 7625c0ecd6..ad8d7cc054 100644 ---- a/xen/arch/x86/include/asm/apic.h -+++ b/xen/arch/x86/include/asm/apic.h -@@ -145,6 +145,11 @@ static __inline bool_t apic_isr_read(u8 vector) - (vector & 0x1f)) & 1; - } - -+static inline bool apic_irr_read(unsigned int vector) -+{ -+ return apic_read(APIC_IRR + (vector / 32 * 0x10)) & (1U << (vector % 32)); -+} -+ - static __inline u32 get_apic_id(void) /* Get the physical APIC id */ - { - u32 id = apic_read(APIC_ID); -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index 13ef61a5b7..290f8d26e7 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -2604,7 +2604,7 @@ void fixup_irqs(const cpumask_t 
*mask, bool verbose) - - for ( irq = 0; irq < nr_irqs; irq++ ) - { -- bool break_affinity = false, set_affinity = true; -+ bool break_affinity = false, set_affinity = true, check_irr = false; - unsigned int vector, cpu = smp_processor_id(); - cpumask_t *affinity = this_cpu(scratch_cpumask); - -@@ -2657,6 +2657,25 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - !cpu_online(cpu) && - cpumask_test_cpu(cpu, desc->arch.old_cpu_mask) ) - { -+ /* -+ * This to be offlined CPU was the target of an interrupt that's -+ * been moved, and the new destination target hasn't yet -+ * acknowledged any interrupt from it. -+ * -+ * We know the interrupt is configured to target the new CPU at -+ * this point, so we can check IRR for any pending vectors and -+ * forward them to the new destination. -+ * -+ * Note that for the other case of an interrupt movement being in -+ * progress (move_cleanup_count being non-zero) we know the new -+ * destination has already acked at least one interrupt from this -+ * source, and hence there's no need to forward any stale -+ * interrupts. -+ */ -+ if ( apic_irr_read(desc->arch.old_vector) ) -+ send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)), -+ desc->arch.vector); -+ - /* - * This CPU is going offline, remove it from ->arch.old_cpu_mask - * and possibly release the old vector if the old mask becomes -@@ -2697,6 +2716,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - if ( desc->handler->disable ) - desc->handler->disable(desc); - -+ /* -+ * If the current CPU is going offline and is (one of) the target(s) of -+ * the interrupt, signal to check whether there are any pending vectors -+ * to be handled in the local APIC after the interrupt has been moved. 
-+ */ -+ if ( !cpu_online(cpu) && cpumask_test_cpu(cpu, desc->arch.cpu_mask) ) -+ check_irr = true; -+ - if ( desc->handler->set_affinity ) - desc->handler->set_affinity(desc, affinity); - else if ( !(warned++) ) -@@ -2707,6 +2734,18 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - - cpumask_copy(affinity, desc->affinity); - -+ if ( check_irr && apic_irr_read(vector) ) -+ /* -+ * Forward pending interrupt to the new destination, this CPU is -+ * going offline and otherwise the interrupt would be lost. -+ * -+ * Do the IRR check as late as possible before releasing the irq -+ * desc in order for any in-flight interrupts to be delivered to -+ * the lapic. -+ */ -+ send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)), -+ desc->arch.vector); -+ - spin_unlock(&desc->lock); - - if ( !verbose ) -@@ -2718,11 +2757,6 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) - printk("Broke affinity for IRQ%u, new: %*pb\n", - irq, CPUMASK_PR(affinity)); - } -- -- /* That doesn't seem sufficient. Give it 1ms. */ -- local_irq_enable(); -- mdelay(1); -- local_irq_disable(); - } - - void fixup_eoi(void) --- -2.45.2 - diff --git a/0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch b/0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch new file mode 100644 index 0000000..221f55b --- /dev/null +++ b/0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch @@ -0,0 +1,62 @@ +From 2c61ab407172682e1382204a8305107f19e2951b Mon Sep 17 00:00:00 2001 +From: Demi Marie Obenour <demi@invisiblethingslab.com> +Date: Tue, 24 Sep 2024 14:44:10 +0200 +Subject: [PATCH 35/35] xen/ucode: Fix buffer under-run when parsing AMD + containers + +The AMD container format has no formal spec. It is, at best, precision +guesswork based on AMD's prior contributions to open source projects. The +Equivalence Table has both an explicit length, and an expectation of having a +NULL entry at the end. 
+
+Xen was sanity checking the NULL entry, but without confirming that an entry
+was present, resulting in a read off the front of the buffer. With some
+manual debugging/annotations this manifests as:
+
+  (XEN) *** Buf ffff83204c00b19c, eq ffff83204c00b194
+  (XEN) *** eq: 0c 00 00 00 44 4d 41 00 00 00 00 00 00 00 00 00 aa aa aa aa
+                                        ^-Actual buffer-------------------^
+  (XEN) *** installed_cpu: 000c
+  (XEN) microcode: Bad equivalent cpu table
+  (XEN) Parsing microcode blob error -22
+
+When loaded by hypercall, the 4 bytes interpreted as installed_cpu happen to
+be the containing struct ucode_buf's len field, and luckily will be nonzero.
+
+When loaded at boot, it's possible for the access to #PF if the module happens
+to have been placed on a 2M boundary by the bootloader. Under Linux, it will
+commonly be the end of the CPIO header.
+
+Drop the probe of the NULL entry; Nothing else cares. A container without one
+is well formed, insofar that we can still parse it correctly. With this
+dropped, the same container results in:
+
+  (XEN) microcode: couldn't find any matching ucode in the provided blob!
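Editorial note, not part of the patch: the under-run described above can be sketched in a few lines of C. The struct layouts are assumptions for illustration only. Just `type`, `len`, `eq` and `installed_cpu` appear in the diff. The 16-byte entry size is inferred from `eq` sitting 8 bytes before `Buf` in the log, and the type value 0 is inferred from the dumped buffer. `EQUIV_TABLE_TYPE`, `equiv_table_ok` and `last_entry_index` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 16-byte entry; only 'installed_cpu' is named in the diff. */
struct equiv_cpu_entry {
    uint32_t installed_cpu;
    uint8_t  rest[12];            /* remaining fields, left opaque here */
};

/* Assumed table layout: 8-byte header followed by the entries. */
struct equiv_table {
    uint32_t type;
    uint32_t len;                 /* size in bytes of the entries below */
    struct equiv_cpu_entry eq[];
};

#define EQUIV_TABLE_TYPE 0u       /* assumed value, matching the dump */

/* The checks that remain after the patch: no probe of a trailing entry. */
static bool equiv_table_ok(const void *buf, size_t size)
{
    const struct equiv_table *et = buf;

    return size >= sizeof(*et) &&
           et->type == EQUIV_TABLE_TYPE &&
           size - sizeof(*et) >= et->len &&
           et->len % sizeof(et->eq[0]) == 0;
}

/* Index the removed probe computed; it wraps to SIZE_MAX when len is 0. */
static size_t last_entry_index(uint32_t len)
{
    return len / sizeof(struct equiv_cpu_entry) - 1;
}
```

Under these assumptions a table with `len == 0` passes every remaining check, and the index computed by the removed probe wraps around, which corresponds to addressing the entry one slot before `eq[0]`, that is, memory in front of the buffer.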
+ +Fixes: 4de936a38aa9 ("x86/ucode/amd: Rework parsing logic in cpu_request_microcode()") +Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> +Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> +Reviewed-by: Jan Beulich <jbeulich@suse.com> +master commit: a8bf14f6f331d4f428010b4277b67c33f561ed19 +master date: 2024-09-13 15:23:30 +0100 +--- + xen/arch/x86/cpu/microcode/amd.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/xen/arch/x86/cpu/microcode/amd.c b/xen/arch/x86/cpu/microcode/amd.c +index f76a563c8b..9fe6e29751 100644 +--- a/xen/arch/x86/cpu/microcode/amd.c ++++ b/xen/arch/x86/cpu/microcode/amd.c +@@ -336,8 +336,7 @@ static struct microcode_patch *cf_check cpu_request_microcode( + if ( size < sizeof(*et) || + (et = buf)->type != UCODE_EQUIV_CPU_TABLE_TYPE || + size - sizeof(*et) < et->len || +- et->len % sizeof(et->eq[0]) || +- et->eq[(et->len / sizeof(et->eq[0])) - 1].installed_cpu ) ++ et->len % sizeof(et->eq[0]) ) + { + printk(XENLOG_ERR "microcode: Bad equivalent cpu table\n"); + error = -EINVAL; +-- +2.46.1 + diff --git a/0036-x86-re-run-exception-from-stub-recovery-selftests-wi.patch b/0036-x86-re-run-exception-from-stub-recovery-selftests-wi.patch deleted file mode 100644 index a552e9c..0000000 --- a/0036-x86-re-run-exception-from-stub-recovery-selftests-wi.patch +++ /dev/null @@ -1,84 +0,0 @@ -From 5ac3cbbf83e1f955aeaf5d0f503099f5249b5c25 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Thu, 4 Jul 2024 14:06:19 +0200 -Subject: [PATCH 36/56] x86: re-run exception-from-stub recovery selftests with - CET-SS enabled - -On the BSP, shadow stacks are enabled only relatively late in the -booting process. They in particular aren't active yet when initcalls are -run. Keep the testing there, but invoke that testing a 2nd time when -shadow stacks are active, to make sure we won't regress that case after -addressing XSA-451. 
- -While touching this code, switch the guard from NDEBUG to CONFIG_DEBUG, -such that IS_ENABLED() can validly be used at the new call site. - -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: cfe3ad67127b86e1b1c06993b86422673a51b050 -master date: 2024-02-27 13:49:52 +0100 ---- - xen/arch/x86/extable.c | 8 +++++--- - xen/arch/x86/include/asm/setup.h | 2 ++ - xen/arch/x86/setup.c | 4 ++++ - 3 files changed, 11 insertions(+), 3 deletions(-) - -diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c -index 8ffcd346d7..12cc9935d8 100644 ---- a/xen/arch/x86/extable.c -+++ b/xen/arch/x86/extable.c -@@ -128,10 +128,11 @@ search_exception_table(const struct cpu_user_regs *regs, unsigned long *stub_ra) - return 0; - } - --#ifndef NDEBUG -+#ifdef CONFIG_DEBUG -+#include <asm/setup.h> - #include <asm/traps.h> - --static int __init cf_check stub_selftest(void) -+int __init cf_check stub_selftest(void) - { - static const struct { - uint8_t opc[8]; -@@ -155,7 +156,8 @@ static int __init cf_check stub_selftest(void) - unsigned int i; - bool fail = false; - -- printk("Running stub recovery selftests...\n"); -+ printk("%s stub recovery selftests...\n", -+ system_state < SYS_STATE_active ? 
"Running" : "Re-running"); - - for ( i = 0; i < ARRAY_SIZE(tests); ++i ) - { -diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h -index 9a460e4db8..14d15048eb 100644 ---- a/xen/arch/x86/include/asm/setup.h -+++ b/xen/arch/x86/include/asm/setup.h -@@ -38,6 +38,8 @@ void *bootstrap_map(const module_t *mod); - - int xen_in_range(unsigned long mfn); - -+int cf_check stub_selftest(void); -+ - extern uint8_t kbd_shift_flags; - - #ifdef NDEBUG -diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c -index 25017b5d96..f2592c3dc9 100644 ---- a/xen/arch/x86/setup.c -+++ b/xen/arch/x86/setup.c -@@ -738,6 +738,10 @@ static void noreturn init_done(void) - - system_state = SYS_STATE_active; - -+ /* Re-run stub recovery self-tests with CET-SS active. */ -+ if ( IS_ENABLED(CONFIG_DEBUG) && cpu_has_xen_shstk ) -+ stub_selftest(); -+ - domain_unpause_by_systemcontroller(dom0); - - /* MUST be done prior to removing .init data. */ --- -2.45.2 - diff --git a/0037-tools-tests-don-t-let-test-xenstore-write-nodes-exce.patch b/0037-tools-tests-don-t-let-test-xenstore-write-nodes-exce.patch deleted file mode 100644 index cc7e47d..0000000 --- a/0037-tools-tests-don-t-let-test-xenstore-write-nodes-exce.patch +++ /dev/null @@ -1,41 +0,0 @@ -From 0ebfa35965257343ba3d8377be91ad8512a9c749 Mon Sep 17 00:00:00 2001 -From: Juergen Gross <jgross@suse.com> -Date: Thu, 4 Jul 2024 14:06:54 +0200 -Subject: [PATCH 37/56] tools/tests: don't let test-xenstore write nodes - exceeding default size - -Today test-xenstore will write nodes with 3000 bytes node data. This -size is exceeding the default quota for the allowed node size. While -working in dom0 with C-xenstored, OCAML-xenstored does not like that. - -Use a size of 2000 instead, which is lower than the allowed default -node size of 2048. 
- -Fixes: 3afc5e4a5b75 ("tools/tests: add xenstore testing framework") -Signed-off-by: Juergen Gross <jgross@suse.com> -Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: 642005e310483c490b0725fab4672f2b77fdf2ba -master date: 2024-05-02 18:15:31 +0100 ---- - tools/tests/xenstore/test-xenstore.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/tools/tests/xenstore/test-xenstore.c b/tools/tests/xenstore/test-xenstore.c -index d491dac53b..73a7011d21 100644 ---- a/tools/tests/xenstore/test-xenstore.c -+++ b/tools/tests/xenstore/test-xenstore.c -@@ -408,9 +408,9 @@ static int test_ta3_deinit(uintptr_t par) - #define TEST(s, f, p, l) { s, f ## _init, f, f ## _deinit, (uintptr_t)(p), l } - struct test tests[] = { - TEST("read 1", test_read, 1, "Read node with 1 byte data"), --TEST("read 3000", test_read, 3000, "Read node with 3000 bytes data"), -+TEST("read 2000", test_read, 2000, "Read node with 2000 bytes data"), - TEST("write 1", test_write, 1, "Write node with 1 byte data"), --TEST("write 3000", test_write, 3000, "Write node with 3000 bytes data"), -+TEST("write 2000", test_write, 2000, "Write node with 2000 bytes data"), - TEST("dir", test_dir, 0, "List directory"), - TEST("rm node", test_rm, 0, "Remove single node"), - TEST("rm dir", test_rm, WRITE_BUFFERS_N, "Remove node with sub-nodes"), --- -2.45.2 - diff --git a/0038-tools-tests-let-test-xenstore-exit-with-non-0-status.patch b/0038-tools-tests-let-test-xenstore-exit-with-non-0-status.patch deleted file mode 100644 index ee0a497..0000000 --- a/0038-tools-tests-let-test-xenstore-exit-with-non-0-status.patch +++ /dev/null @@ -1,57 +0,0 @@ -From 22f623622cc60571be9cccc323a1d17749683667 Mon Sep 17 00:00:00 2001 -From: Juergen Gross <jgross@suse.com> -Date: Thu, 4 Jul 2024 14:07:12 +0200 -Subject: [PATCH 38/56] tools/tests: let test-xenstore exit with non-0 status - in case of error - -In case a test is failing in test-xenstore, let the tool exit with an -exit status other 
than 0. - -Fix a typo in an error message. - -Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> -Fixes: 3afc5e4a5b75 ("tools/tests: add xenstore testing framework") -Signed-off-by: Juergen Gross <jgross@suse.com> -master commit: 2d4ba205591ba64f31149ae31051678159ee9e11 -master date: 2024-05-02 18:15:46 +0100 ---- - tools/tests/xenstore/test-xenstore.c | 8 ++++---- - 1 file changed, 4 insertions(+), 4 deletions(-) - -diff --git a/tools/tests/xenstore/test-xenstore.c b/tools/tests/xenstore/test-xenstore.c -index 73a7011d21..7a9bd9afb3 100644 ---- a/tools/tests/xenstore/test-xenstore.c -+++ b/tools/tests/xenstore/test-xenstore.c -@@ -506,14 +506,14 @@ int main(int argc, char *argv[]) - stop = time(NULL) + randtime; - srandom((unsigned int)stop); - -- while ( time(NULL) < stop ) -+ while ( time(NULL) < stop && !ret ) - { - t = random() % ARRAY_SIZE(tests); - ret = call_test(tests + t, iters, true); - } - } - else -- for ( t = 0; t < ARRAY_SIZE(tests); t++ ) -+ for ( t = 0; t < ARRAY_SIZE(tests) && !ret; t++ ) - { - if ( !test || !strcmp(test, tests[t].name) ) - ret = call_test(tests + t, iters, false); -@@ -525,10 +525,10 @@ int main(int argc, char *argv[]) - xs_close(xsh); - - if ( ta_loops ) -- printf("Exhaustive transaction retries (%d) occurrred %d times.\n", -+ printf("Exhaustive transaction retries (%d) occurred %d times.\n", - MAX_TA_LOOPS, ta_loops); - -- return 0; -+ return ret ? 3 : 0; - } - - /* --- -2.45.2 - diff --git a/0039-LICENSES-Add-MIT-0-MIT-No-Attribution.patch b/0039-LICENSES-Add-MIT-0-MIT-No-Attribution.patch deleted file mode 100644 index 8b2c4ec..0000000 --- a/0039-LICENSES-Add-MIT-0-MIT-No-Attribution.patch +++ /dev/null @@ -1,58 +0,0 @@ -From 75b4f9474a1aa33a6f9e0986b51c390f9b38ae5a Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Thu, 4 Jul 2024 14:08:11 +0200 -Subject: [PATCH 39/56] LICENSES: Add MIT-0 (MIT No Attribution) - -We are about to import code licensed under MIT-0. 
It's compatible for us to
-use, so identify it as a permitted license.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
-Acked-by: Christian Lindig <christian.lindig@cloud.com>
-master commit: 219cdff3fb7b4a03ab14869584f111e0f623b330
-master date: 2024-05-23 15:04:40 +0100
----
- LICENSES/MIT-0 | 31 +++++++++++++++++++++++++++++++
- 1 file changed, 31 insertions(+)
- create mode 100644 LICENSES/MIT-0
-
-diff --git a/LICENSES/MIT-0 b/LICENSES/MIT-0
-new file mode 100644
-index 0000000000..70fb90ee34
---- /dev/null
-+++ b/LICENSES/MIT-0
-@@ -0,0 +1,31 @@
-+Valid-License-Identifier: MIT-0
-+
-+SPDX-URL: https://spdx.org/licenses/MIT-0.html
-+
-+Usage-Guide:
-+
-+ To use the MIT-0 License put the following SPDX tag/value pair into a
-+ comment according to the placement guidelines in the licensing rules
-+ documentation:
-+ SPDX-License-Identifier: MIT-0
-+
-+License-Text:
-+
-+MIT No Attribution
-+
-+Copyright <year> <copyright holder>
-+
-+Permission is hereby granted, free of charge, to any person obtaining a copy
-+of this software and associated documentation files (the "Software"), to deal
-+in the Software without restriction, including without limitation the rights
-+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-+copies of the Software, and to permit persons to whom the Software is
-+furnished to do so.
-+
-+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-+SOFTWARE.
---
-2.45.2
-
diff --git a/0040-tools-Import-stand-alone-sd_notify-implementation-fr.patch b/0040-tools-Import-stand-alone-sd_notify-implementation-fr.patch
deleted file mode 100644
index 990158d..0000000
--- a/0040-tools-Import-stand-alone-sd_notify-implementation-fr.patch
+++ /dev/null
@@ -1,130 +0,0 @@
-From 1743102a92479834c8e17b20697129e05b7c8313 Mon Sep 17 00:00:00 2001
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Date: Thu, 4 Jul 2024 14:10:10 +0200
-Subject: [PATCH 40/56] tools: Import stand-alone sd_notify() implementation
- from systemd
-
-... in order to avoid linking against the whole of libsystemd.
-
-Only minimal changes to the upstream copy, to function as a drop-in
-replacement for sd_notify() and as a header-only library.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Juergen Gross <jgross@suse.com>
-Acked-by: Christian Lindig <christian.lindig@cloud.com>
-master commit: 78510f3a1522f2856330ffa429e0e35f8aab4277
-master date: 2024-05-23 15:04:40 +0100
-master commit: 78510f3a1522f2856330ffa429e0e35f8aab4277
-master date: 2024-05-23 15:04:40 +0100
----
- tools/include/xen-sd-notify.h | 98 +++++++++++++++++++++++++++++++++++
- 1 file changed, 98 insertions(+)
- create mode 100644 tools/include/xen-sd-notify.h
-
-diff --git a/tools/include/xen-sd-notify.h b/tools/include/xen-sd-notify.h
-new file mode 100644
-index 0000000000..28c9b20f15
---- /dev/null
-+++ b/tools/include/xen-sd-notify.h
-@@ -0,0 +1,98 @@
-+/* SPDX-License-Identifier: MIT-0 */
-+
-+/*
-+ * Implement the systemd notify protocol without external dependencies.
-+ * Supports both readiness notification on startup and on reloading,
-+ * according to the protocol defined at:
-+ * https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html
-+ * This protocol is guaranteed to be stable as per:
-+ * https://systemd.io/PORTABILITY_AND_STABILITY/
-+ *
-+ * Differences from the upstream copy:
-+ * - Rename/rework as a drop-in replacement for systemd/sd-daemon.h
-+ * - Only take the subset Xen cares about
-+ * - Respect -Wdeclaration-after-statement
-+ */
-+
-+#ifndef XEN_SD_NOTIFY
-+#define XEN_SD_NOTIFY
-+
-+#include <errno.h>
-+#include <stddef.h>
-+#include <stdlib.h>
-+#include <sys/socket.h>
-+#include <sys/un.h>
-+#include <unistd.h>
-+
-+static inline void xen_sd_closep(int *fd) {
-+ if (!fd || *fd < 0)
-+ return;
-+
-+ close(*fd);
-+ *fd = -1;
-+}
-+
-+static inline int xen_sd_notify(const char *message) {
-+ union sockaddr_union {
-+ struct sockaddr sa;
-+ struct sockaddr_un sun;
-+ } socket_addr = {
-+ .sun.sun_family = AF_UNIX,
-+ };
-+ size_t path_length, message_length;
-+ ssize_t written;
-+ const char *socket_path;
-+ int __attribute__((cleanup(xen_sd_closep))) fd = -1;
-+
-+ /* Verify the argument first */
-+ if (!message)
-+ return -EINVAL;
-+
-+ message_length = strlen(message);
-+ if (message_length == 0)
-+ return -EINVAL;
-+
-+ /* If the variable is not set, the protocol is a noop */
-+ socket_path = getenv("NOTIFY_SOCKET");
-+ if (!socket_path)
-+ return 0; /* Not set?
Nothing to do */
-+
-+ /* Only AF_UNIX is supported, with path or abstract sockets */
-+ if (socket_path[0] != '/' && socket_path[0] != '@')
-+ return -EAFNOSUPPORT;
-+
-+ path_length = strlen(socket_path);
-+ /* Ensure there is room for NUL byte */
-+ if (path_length >= sizeof(socket_addr.sun.sun_path))
-+ return -E2BIG;
-+
-+ memcpy(socket_addr.sun.sun_path, socket_path, path_length);
-+
-+ /* Support for abstract socket */
-+ if (socket_addr.sun.sun_path[0] == '@')
-+ socket_addr.sun.sun_path[0] = 0;
-+
-+ fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
-+ if (fd < 0)
-+ return -errno;
-+
-+ if (connect(fd, &socket_addr.sa, offsetof(struct sockaddr_un, sun_path) + path_length) != 0)
-+ return -errno;
-+
-+ written = write(fd, message, message_length);
-+ if (written != (ssize_t) message_length)
-+ return written < 0 ? -errno : -EPROTO;
-+
-+ return 1; /* Notified! */
-+}
-+
-+static inline int sd_notify(int unset_environment, const char *message) {
-+ int r = xen_sd_notify(message);
-+
-+ if (unset_environment)
-+ unsetenv("NOTIFY_SOCKET");
-+
-+ return r;
-+}
-+
-+#endif /* XEN_SD_NOTIFY */
---
-2.45.2
-
diff --git a/0041-tools-c-o-xenstored-Don-t-link-against-libsystemd.patch b/0041-tools-c-o-xenstored-Don-t-link-against-libsystemd.patch
deleted file mode 100644
index 5bf3f98..0000000
--- a/0041-tools-c-o-xenstored-Don-t-link-against-libsystemd.patch
+++ /dev/null
@@ -1,87 +0,0 @@
-From 77cf215157d267a7776f3c4ec32e89064dcd84cd Mon Sep 17 00:00:00 2001
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Date: Thu, 4 Jul 2024 14:10:29 +0200
-Subject: [PATCH 41/56] tools/{c,o}xenstored: Don't link against libsystemd
-
-Use the local freestanding wrapper instead.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Juergen Gross <jgross@suse.com>
-Acked-by: Christian Lindig <christian.lindig@cloud.com>
-master commit: caf864482689a5dd6a945759b6372bb260d49665
-master date: 2024-05-23 15:04:40 +0100
----
- tools/ocaml/xenstored/Makefile | 3 +--
- tools/ocaml/xenstored/systemd_stubs.c | 2 +-
- tools/xenstored/Makefile | 5 -----
- tools/xenstored/core.c | 4 ++--
- 4 files changed, 4 insertions(+), 10 deletions(-)
-
-diff --git a/tools/ocaml/xenstored/Makefile b/tools/ocaml/xenstored/Makefile
-index e8aaecf2e6..fa45305d8c 100644
---- a/tools/ocaml/xenstored/Makefile
-+++ b/tools/ocaml/xenstored/Makefile
-@@ -4,8 +4,7 @@ include $(OCAML_TOPLEVEL)/common.make
-
- # Include configure output (config.h)
- CFLAGS += -include $(XEN_ROOT)/tools/config.h
--CFLAGS-$(CONFIG_SYSTEMD) += $(SYSTEMD_CFLAGS)
--LDFLAGS-$(CONFIG_SYSTEMD) += $(SYSTEMD_LIBS)
-+CFLAGS-$(CONFIG_SYSTEMD) += $(CFLAGS_xeninclude)
-
- CFLAGS += $(CFLAGS-y)
- CFLAGS += $(APPEND_CFLAGS)
-diff --git a/tools/ocaml/xenstored/systemd_stubs.c b/tools/ocaml/xenstored/systemd_stubs.c
-index f4c875075a..7dbbdd35bf 100644
---- a/tools/ocaml/xenstored/systemd_stubs.c
-+++ b/tools/ocaml/xenstored/systemd_stubs.c
-@@ -25,7 +25,7 @@
-
- #if defined(HAVE_SYSTEMD)
-
--#include <systemd/sd-daemon.h>
-+#include <xen-sd-notify.h>
-
- CAMLprim value ocaml_sd_notify_ready(value ignore)
- {
-diff --git a/tools/xenstored/Makefile b/tools/xenstored/Makefile
-index e0897ed1ba..09adfe1d50 100644
---- a/tools/xenstored/Makefile
-+++ b/tools/xenstored/Makefile
-@@ -9,11 +9,6 @@ xenstored: LDLIBS += $(LDLIBS_libxenctrl)
- xenstored: LDLIBS += -lrt
- xenstored: LDLIBS += $(SOCKET_LIBS)
-
--ifeq ($(CONFIG_SYSTEMD),y)
--$(XENSTORED_OBJS-y): CFLAGS += $(SYSTEMD_CFLAGS)
--xenstored: LDLIBS += $(SYSTEMD_LIBS)
--endif
--
- TARGETS := xenstored
-
- .PHONY: all
-diff --git a/tools/xenstored/core.c b/tools/xenstored/core.c
-index edd07711db..dfe98e7bfc 100644
---- a/tools/xenstored/core.c
-+++ b/tools/xenstored/core.c
-@@ -61,7 +61,7 @@
- #endif
-
- #if defined(XEN_SYSTEMD_ENABLED)
--#include <systemd/sd-daemon.h>
-+#include <xen-sd-notify.h>
- #endif
-
- extern xenevtchn_handle *xce_handle; /* in domain.c */
-@@ -3000,7 +3000,7 @@ int main(int argc, char *argv[])
- #if defined(XEN_SYSTEMD_ENABLED)
- if (!live_update) {
- sd_notify(1, "READY=1");
-- fprintf(stderr, SD_NOTICE "xenstored is ready\n");
-+ fprintf(stderr, "xenstored is ready\n");
- }
- #endif
-
---
-2.45.2
-
diff --git a/0042-tools-Drop-libsystemd-as-a-dependency.patch b/0042-tools-Drop-libsystemd-as-a-dependency.patch
deleted file mode 100644
index 168680e..0000000
--- a/0042-tools-Drop-libsystemd-as-a-dependency.patch
+++ /dev/null
@@ -1,648 +0,0 @@
-From 7967bd358e93ed83e01813a8d0dfd68aa67f5780 Mon Sep 17 00:00:00 2001
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Date: Thu, 4 Jul 2024 14:10:40 +0200
-Subject: [PATCH 42/56] tools: Drop libsystemd as a dependency
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-There are no more users, and we want to disuade people from introducing new
-users just for sd_notify() and friends. Drop the dependency.
-
-We still want the overall --with{,out}-systemd to gate the generation of the
-service/unit/mount/etc files.
-
-Rerun autogen.sh, and mark the dependency as removed in the build containers.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Juergen Gross <jgross@suse.com>
-Acked-by: Christian Lindig <christian.lindig@cloud.com>
-
-tools: (Actually) drop libsystemd as a dependency
-
-When reinstating some of systemd.m4 between v1 and v2, I reintroduced a little
-too much. While {c,o}xenstored are indeed no longer linked against
-libsystemd, ./configure still looks for it.
-
-Drop this too.
-
-Fixes: ae26101f6bfc ("tools: Drop libsystemd as a dependency")
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
-master commit: ae26101f6bfc8185adcdb9165d469bdc467780db
-master date: 2024-05-23 15:04:40 +0100
-master commit: 6ef4fa1e7fe78c1dae07b451292b07facfce4902
-master date: 2024-05-30 12:15:25 +0100
----
- CHANGELOG.md | 7 +-
- config/Tools.mk.in | 2 -
- m4/systemd.m4 | 17 --
- tools/configure | 485 +--------------------------------------------
- 4 files changed, 7 insertions(+), 504 deletions(-)
-
-diff --git a/CHANGELOG.md b/CHANGELOG.md
-index fa54d59df1..ceca12eb5f 100644
---- a/CHANGELOG.md
-+++ b/CHANGELOG.md
-@@ -4,7 +4,12 @@ Notable changes to Xen will be documented in this file.
-
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
-
--## [4.18.2](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.18.2)
-+## [4.18.3](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.18.3)
-+
-+### Changed
-+ - When building with Systemd support (./configure --enable-systemd), remove
-+ libsystemd as a build dependency. Systemd Notify support is retained, now
-+ using a standalone library implementation.
-
- ## [4.18.1](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.18.1)
-
-diff --git a/config/Tools.mk.in b/config/Tools.mk.in
-index b54ab21f96..50fbef841f 100644
---- a/config/Tools.mk.in
-+++ b/config/Tools.mk.in
-@@ -52,8 +52,6 @@ CONFIG_PYGRUB := @pygrub@
- CONFIG_LIBFSIMAGE := @libfsimage@
-
- CONFIG_SYSTEMD := @systemd@
--SYSTEMD_CFLAGS := @SYSTEMD_CFLAGS@
--SYSTEMD_LIBS := @SYSTEMD_LIBS@
- XEN_SYSTEMD_DIR := @SYSTEMD_DIR@
- XEN_SYSTEMD_MODULES_LOAD := @SYSTEMD_MODULES_LOAD@
- CONFIG_9PFS := @ninepfs@
-diff --git a/m4/systemd.m4 b/m4/systemd.m4
-index 112dc11b5e..ab12ea313d 100644
---- a/m4/systemd.m4
-+++ b/m4/systemd.m4
-@@ -41,15 +41,6 @@ AC_DEFUN([AX_ALLOW_SYSTEMD_OPTS], [
- ])
-
- AC_DEFUN([AX_CHECK_SYSTEMD_LIBS], [
-- PKG_CHECK_MODULES([SYSTEMD], [libsystemd-daemon],,
-- [PKG_CHECK_MODULES([SYSTEMD], [libsystemd >= 209])]
-- )
-- dnl pkg-config older than 0.24 does not set these for
-- dnl PKG_CHECK_MODULES() worth also noting is that as of version 208
-- dnl of systemd pkg-config --cflags currently yields no extra flags yet.
-- AC_SUBST([SYSTEMD_CFLAGS])
-- AC_SUBST([SYSTEMD_LIBS])
--
- AS_IF([test "x$SYSTEMD_DIR" = x], [
- dnl In order to use the line below we need to fix upstream systemd
- dnl to properly ${prefix} for child variables in
-@@ -95,13 +86,6 @@ AC_DEFUN([AX_CHECK_SYSTEMD], [
- ],[systemd=n])
- ])
-
--AC_DEFUN([AX_CHECK_SYSTEMD_ENABLE_AVAILABLE], [
-- PKG_CHECK_MODULES([SYSTEMD], [libsystemd-daemon], [systemd="y"],[
-- PKG_CHECK_MODULES([SYSTEMD], [libsystemd >= 209],
-- [systemd="y"],[systemd="n"])
-- ])
--])
--
- dnl Enables systemd by default and requires a --disable-systemd option flag
- dnl to configure if you want to disable.
- AC_DEFUN([AX_ENABLE_SYSTEMD], [
-@@ -121,6 +105,5 @@ dnl to have systemd build libraries it will be enabled.
You can always force
- dnl disable with --disable-systemd
- AC_DEFUN([AX_AVAILABLE_SYSTEMD], [
- AX_ALLOW_SYSTEMD_OPTS()
-- AX_CHECK_SYSTEMD_ENABLE_AVAILABLE()
- AX_CHECK_SYSTEMD()
- ])
-diff --git a/tools/configure b/tools/configure
-index 38c0808d3a..7bb935d23b 100755
---- a/tools/configure
-+++ b/tools/configure
-@@ -626,8 +626,6 @@ ac_subst_vars='LTLIBOBJS
- LIBOBJS
- pvshim
- ninepfs
--SYSTEMD_LIBS
--SYSTEMD_CFLAGS
- SYSTEMD_MODULES_LOAD
- SYSTEMD_DIR
- systemd
-@@ -864,9 +862,7 @@ pixman_LIBS
- libzstd_CFLAGS
- libzstd_LIBS
- LIBNL3_CFLAGS
--LIBNL3_LIBS
--SYSTEMD_CFLAGS
--SYSTEMD_LIBS'
-+LIBNL3_LIBS'
-
-
- # Initialize some variables set by options.
-@@ -1621,10 +1617,6 @@ Some influential environment variables:
- LIBNL3_CFLAGS
- C compiler flags for LIBNL3, overriding pkg-config
- LIBNL3_LIBS linker flags for LIBNL3, overriding pkg-config
-- SYSTEMD_CFLAGS
-- C compiler flags for SYSTEMD, overriding pkg-config
-- SYSTEMD_LIBS
-- linker flags for SYSTEMD, overriding pkg-config
-
- Use these variables to override the choices made by `configure' or to help
- it to find libraries and programs with nonstandard names/locations.
-@@ -3889,8 +3881,6 @@ esac
-
-
-
--
--
-
-
-
-@@ -9540,223 +9530,6 @@ fi
-
-
-
--
--pkg_failed=no
--{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for SYSTEMD" >&5
--$as_echo_n "checking for SYSTEMD... " >&6; }
--
--if test -n "$SYSTEMD_CFLAGS"; then
-- pkg_cv_SYSTEMD_CFLAGS="$SYSTEMD_CFLAGS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd-daemon\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd-daemon") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_CFLAGS=`$PKG_CONFIG --cflags "libsystemd-daemon" 2>/dev/null`
-- test "x$?"
!= "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--if test -n "$SYSTEMD_LIBS"; then
-- pkg_cv_SYSTEMD_LIBS="$SYSTEMD_LIBS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd-daemon\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd-daemon") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_LIBS=`$PKG_CONFIG --libs "libsystemd-daemon" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--
--
--
--if test $pkg_failed = yes; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
-- _pkg_short_errors_supported=yes
--else
-- _pkg_short_errors_supported=no
--fi
-- if test $_pkg_short_errors_supported = yes; then
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libsystemd-daemon" 2>&1`
-- else
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libsystemd-daemon" 2>&1`
-- fi
-- # Put the nasty error message in config.log where it belongs
-- echo "$SYSTEMD_PKG_ERRORS" >&5
--
--
--
--pkg_failed=no
--{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for SYSTEMD" >&5
--$as_echo_n "checking for SYSTEMD... " >&6; }
--
--if test -n "$SYSTEMD_CFLAGS"; then
-- pkg_cv_SYSTEMD_CFLAGS="$SYSTEMD_CFLAGS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$?
= $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_CFLAGS=`$PKG_CONFIG --cflags "libsystemd >= 209" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--if test -n "$SYSTEMD_LIBS"; then
-- pkg_cv_SYSTEMD_LIBS="$SYSTEMD_LIBS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_LIBS=`$PKG_CONFIG --libs "libsystemd >= 209" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--
--
--
--if test $pkg_failed = yes; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
-- _pkg_short_errors_supported=yes
--else
-- _pkg_short_errors_supported=no
--fi
-- if test $_pkg_short_errors_supported = yes; then
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- else
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- fi
-- # Put the nasty error message in config.log where it belongs
-- echo "$SYSTEMD_PKG_ERRORS" >&5
--
-- systemd="n"
--elif test $pkg_failed = untried; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
-- systemd="n"
--else
-- SYSTEMD_CFLAGS=$pkg_cv_SYSTEMD_CFLAGS
-- SYSTEMD_LIBS=$pkg_cv_SYSTEMD_LIBS
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
--$as_echo "yes" >&6; }
-- systemd="y"
--fi
--
--elif test $pkg_failed = untried; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--
--pkg_failed=no
--{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for SYSTEMD" >&5
--$as_echo_n "checking for SYSTEMD... " >&6; }
--
--if test -n "$SYSTEMD_CFLAGS"; then
-- pkg_cv_SYSTEMD_CFLAGS="$SYSTEMD_CFLAGS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_CFLAGS=`$PKG_CONFIG --cflags "libsystemd >= 209" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--if test -n "$SYSTEMD_LIBS"; then
-- pkg_cv_SYSTEMD_LIBS="$SYSTEMD_LIBS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_LIBS=`$PKG_CONFIG --libs "libsystemd >= 209" 2>/dev/null`
-- test "x$?"
!= "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--
--
--
--if test $pkg_failed = yes; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
-- _pkg_short_errors_supported=yes
--else
-- _pkg_short_errors_supported=no
--fi
-- if test $_pkg_short_errors_supported = yes; then
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- else
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- fi
-- # Put the nasty error message in config.log where it belongs
-- echo "$SYSTEMD_PKG_ERRORS" >&5
--
-- systemd="n"
--elif test $pkg_failed = untried; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
-- systemd="n"
--else
-- SYSTEMD_CFLAGS=$pkg_cv_SYSTEMD_CFLAGS
-- SYSTEMD_LIBS=$pkg_cv_SYSTEMD_LIBS
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
--$as_echo "yes" >&6; }
-- systemd="y"
--fi
--
--else
-- SYSTEMD_CFLAGS=$pkg_cv_SYSTEMD_CFLAGS
-- SYSTEMD_LIBS=$pkg_cv_SYSTEMD_LIBS
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
--$as_echo "yes" >&6; }
-- systemd="y"
--fi
--
--
- if test "x$enable_systemd" != "xno"; then :
-
- if test "x$systemd" = "xy" ; then :
-@@ -9766,262 +9539,6 @@ $as_echo "#define HAVE_SYSTEMD 1" >>confdefs.h
-
- systemd=y
-
--
--pkg_failed=no
--{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for SYSTEMD" >&5
--$as_echo_n "checking for SYSTEMD... " >&6; }
--
--if test -n "$SYSTEMD_CFLAGS"; then
-- pkg_cv_SYSTEMD_CFLAGS="$SYSTEMD_CFLAGS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd-daemon\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd-daemon") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$?
= $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_CFLAGS=`$PKG_CONFIG --cflags "libsystemd-daemon" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--if test -n "$SYSTEMD_LIBS"; then
-- pkg_cv_SYSTEMD_LIBS="$SYSTEMD_LIBS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd-daemon\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd-daemon") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_LIBS=`$PKG_CONFIG --libs "libsystemd-daemon" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--
--
--
--if test $pkg_failed = yes; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
-- _pkg_short_errors_supported=yes
--else
-- _pkg_short_errors_supported=no
--fi
-- if test $_pkg_short_errors_supported = yes; then
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libsystemd-daemon" 2>&1`
-- else
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libsystemd-daemon" 2>&1`
-- fi
-- # Put the nasty error message in config.log where it belongs
-- echo "$SYSTEMD_PKG_ERRORS" >&5
--
--
--pkg_failed=no
--{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for SYSTEMD" >&5
--$as_echo_n "checking for SYSTEMD... " >&6; }
--
--if test -n "$SYSTEMD_CFLAGS"; then
-- pkg_cv_SYSTEMD_CFLAGS="$SYSTEMD_CFLAGS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_CFLAGS=`$PKG_CONFIG --cflags "libsystemd >= 209" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--if test -n "$SYSTEMD_LIBS"; then
-- pkg_cv_SYSTEMD_LIBS="$SYSTEMD_LIBS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_LIBS=`$PKG_CONFIG --libs "libsystemd >= 209" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--
--
--
--if test $pkg_failed = yes; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
-- _pkg_short_errors_supported=yes
--else
-- _pkg_short_errors_supported=no
--fi
-- if test $_pkg_short_errors_supported = yes; then
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- else
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- fi
-- # Put the nasty error message in config.log where it belongs
-- echo "$SYSTEMD_PKG_ERRORS" >&5
--
-- as_fn_error $? "Package requirements (libsystemd >= 209) were not met:
--
--$SYSTEMD_PKG_ERRORS
--
--Consider adjusting the PKG_CONFIG_PATH environment variable if you
--installed software in a non-standard prefix.
--
--Alternatively, you may set the environment variables SYSTEMD_CFLAGS
--and SYSTEMD_LIBS to avoid the need to call pkg-config.
--See the pkg-config man page for more details." "$LINENO" 5
--elif test $pkg_failed = untried; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
--$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
--as_fn_error $? "The pkg-config script could not be found or is too old. Make sure it
--is in your PATH or set the PKG_CONFIG environment variable to the full
--path to pkg-config.
--
--Alternatively, you may set the environment variables SYSTEMD_CFLAGS
--and SYSTEMD_LIBS to avoid the need to call pkg-config.
--See the pkg-config man page for more details.
--
--To get pkg-config, see <http://pkg-config.freedesktop.org/>.
--See \`config.log' for more details" "$LINENO" 5; }
--else
-- SYSTEMD_CFLAGS=$pkg_cv_SYSTEMD_CFLAGS
-- SYSTEMD_LIBS=$pkg_cv_SYSTEMD_LIBS
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
--$as_echo "yes" >&6; }
--
--fi
--
--elif test $pkg_failed = untried; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--pkg_failed=no
--{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for SYSTEMD" >&5
--$as_echo_n "checking for SYSTEMD... " >&6; }
--
--if test -n "$SYSTEMD_CFLAGS"; then
-- pkg_cv_SYSTEMD_CFLAGS="$SYSTEMD_CFLAGS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_CFLAGS=`$PKG_CONFIG --cflags "libsystemd >= 209" 2>/dev/null`
-- test "x$?"
!= "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--if test -n "$SYSTEMD_LIBS"; then
-- pkg_cv_SYSTEMD_LIBS="$SYSTEMD_LIBS"
-- elif test -n "$PKG_CONFIG"; then
-- if test -n "$PKG_CONFIG" && \
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libsystemd >= 209\""; } >&5
-- ($PKG_CONFIG --exists --print-errors "libsystemd >= 209") 2>&5
-- ac_status=$?
-- $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-- test $ac_status = 0; }; then
-- pkg_cv_SYSTEMD_LIBS=`$PKG_CONFIG --libs "libsystemd >= 209" 2>/dev/null`
-- test "x$?" != "x0" && pkg_failed=yes
--else
-- pkg_failed=yes
--fi
-- else
-- pkg_failed=untried
--fi
--
--
--
--if test $pkg_failed = yes; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
--
--if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
-- _pkg_short_errors_supported=yes
--else
-- _pkg_short_errors_supported=no
--fi
-- if test $_pkg_short_errors_supported = yes; then
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- else
-- SYSTEMD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libsystemd >= 209" 2>&1`
-- fi
-- # Put the nasty error message in config.log where it belongs
-- echo "$SYSTEMD_PKG_ERRORS" >&5
--
-- as_fn_error $? "Package requirements (libsystemd >= 209) were not met:
--
--$SYSTEMD_PKG_ERRORS
--
--Consider adjusting the PKG_CONFIG_PATH environment variable if you
--installed software in a non-standard prefix.
--
--Alternatively, you may set the environment variables SYSTEMD_CFLAGS
--and SYSTEMD_LIBS to avoid the need to call pkg-config.
--See the pkg-config man page for more details." "$LINENO" 5
--elif test $pkg_failed = untried; then
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
--$as_echo "no" >&6; }
-- { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
--$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
--as_fn_error $? "The pkg-config script could not be found or is too old. Make sure it
--is in your PATH or set the PKG_CONFIG environment variable to the full
--path to pkg-config.
--
--Alternatively, you may set the environment variables SYSTEMD_CFLAGS
--and SYSTEMD_LIBS to avoid the need to call pkg-config.
--See the pkg-config man page for more details.
--
--To get pkg-config, see <http://pkg-config.freedesktop.org/>.
--See \`config.log' for more details" "$LINENO" 5; }
--else
-- SYSTEMD_CFLAGS=$pkg_cv_SYSTEMD_CFLAGS
-- SYSTEMD_LIBS=$pkg_cv_SYSTEMD_LIBS
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
--$as_echo "yes" >&6; }
--
--fi
--
--else
-- SYSTEMD_CFLAGS=$pkg_cv_SYSTEMD_CFLAGS
-- SYSTEMD_LIBS=$pkg_cv_SYSTEMD_LIBS
-- { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
--$as_echo "yes" >&6; }
--
--fi
--
--
--
- if test "x$SYSTEMD_DIR" = x; then :
-
- SYSTEMD_DIR="\$(prefix)/lib/systemd/system/"
---
-2.45.2
-
diff --git a/0043-x86-ioapic-Fix-signed-shifts-in-io_apic.c.patch b/0043-x86-ioapic-Fix-signed-shifts-in-io_apic.c.patch
deleted file mode 100644
index c368c1d..0000000
--- a/0043-x86-ioapic-Fix-signed-shifts-in-io_apic.c.patch
+++ /dev/null
@@ -1,46 +0,0 @@
-From 0dc5fbee17cd2bcb1aa6a1cf420dd80381587de8 Mon Sep 17 00:00:00 2001
-From: Matthew Barnes <matthew.barnes@cloud.com>
-Date: Thu, 4 Jul 2024 14:11:03 +0200
-Subject: [PATCH 43/56] x86/ioapic: Fix signed shifts in io_apic.c
-
-There exists bitshifts in the IOAPIC code where signed integers are
-shifted to the left by up to 31 bits, which is undefined behaviour.
-
-This patch fixes this by changing the integers from signed to unsigned.
-
-Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-master commit: c5746b021e573184fb92b601a0e93a295485054e
-master date: 2024-06-21 15:09:26 +0100
----
- xen/arch/x86/io_apic.c | 6 ++++--
- 1 file changed, 4 insertions(+), 2 deletions(-)
-
-diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
-index 0ef61fb2f1..c5342789e8 100644
---- a/xen/arch/x86/io_apic.c
-+++ b/xen/arch/x86/io_apic.c
-@@ -1692,7 +1692,8 @@ static void cf_check mask_and_ack_level_ioapic_irq(struct irq_desc *desc)
- !io_apic_level_ack_pending(desc->irq))
- move_masked_irq(desc);
-
-- if ( !(v & (1 << (i & 0x1f))) ) {
-+ if ( !(v & (1U << (i & 0x1f))) )
-+ {
- spin_lock(&ioapic_lock);
- __edge_IO_APIC_irq(desc->irq);
- __level_IO_APIC_irq(desc->irq);
-@@ -1756,7 +1757,8 @@ static void cf_check end_level_ioapic_irq_new(struct irq_desc *desc, u8 vector)
- !io_apic_level_ack_pending(desc->irq) )
- move_native_irq(desc);
-
-- if (!(v & (1 << (i & 0x1f)))) {
-+ if ( !(v & (1U << (i & 0x1f))) )
-+ {
- spin_lock(&ioapic_lock);
- __mask_IO_APIC_irq(desc->irq);
- __edge_IO_APIC_irq(desc->irq);
---
-2.45.2
-
diff --git a/0044-tools-xl-Open-xldevd.log-with-O_CLOEXEC.patch b/0044-tools-xl-Open-xldevd.log-with-O_CLOEXEC.patch
deleted file mode 100644
index 39dc3eb..0000000
--- a/0044-tools-xl-Open-xldevd.log-with-O_CLOEXEC.patch
+++ /dev/null
@@ -1,53 +0,0 @@
-From 2b3bf02c4f5e44d7d7bd3636530c9ebc837dea87 Mon Sep 17 00:00:00 2001
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Date: Thu, 4 Jul 2024 14:11:36 +0200
-Subject: [PATCH 44/56] tools/xl: Open xldevd.log with O_CLOEXEC
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-`xl devd` has been observed leaking /var/log/xldevd.log into children.
- -Note this is specifically safe; dup2() leaves O_CLOEXEC disabled on newfd, so -after setting up stdout/stderr, it's only the logfile fd which will close on -exec(). - -Link: https://github.com/QubesOS/qubes-issues/issues/8292 -Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> -Reviewed-by: Demi Marie Obenour <demi@invisiblethingslab.com> -Acked-by: Anthony PERARD <anthony.perard@vates.tech> -master commit: ba52b3b624e4a1a976908552364eba924ca45430 -master date: 2024-06-24 16:22:59 +0100 ---- - tools/xl/xl_utils.c | 6 +++++- - 1 file changed, 5 insertions(+), 1 deletion(-) - -diff --git a/tools/xl/xl_utils.c b/tools/xl/xl_utils.c -index 17489d1829..b0d23b2cdb 100644 ---- a/tools/xl/xl_utils.c -+++ b/tools/xl/xl_utils.c -@@ -27,6 +27,10 @@ - #include "xl.h" - #include "xl_utils.h" - -+#ifndef O_CLOEXEC -+#define O_CLOEXEC 0 -+#endif -+ - void dolog(const char *file, int line, const char *func, const char *fmt, ...) 
- { - va_list ap; -@@ -270,7 +274,7 @@ int do_daemonize(const char *name, const char *pidfile) - exit(-1); - } - -- CHK_SYSCALL(logfile = open(fullname, O_WRONLY|O_CREAT|O_APPEND, 0644)); -+ CHK_SYSCALL(logfile = open(fullname, O_WRONLY | O_CREAT | O_APPEND | O_CLOEXEC, 0644)); - free(fullname); - assert(logfile >= 3); - --- -2.45.2 - diff --git a/0045-pirq_cleanup_check-leaks.patch b/0045-pirq_cleanup_check-leaks.patch deleted file mode 100644 index dcf96c7..0000000 --- a/0045-pirq_cleanup_check-leaks.patch +++ /dev/null @@ -1,84 +0,0 @@ -From c9f50d2c5f29b630603e2b95f29e5b6e416a6187 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Thu, 4 Jul 2024 14:11:57 +0200 -Subject: [PATCH 45/56] pirq_cleanup_check() leaks -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Its original introduction had two issues: For one the "common" part of -the checks (carried out in the macro) was inverted. And then after -removal from the radix tree the structure wasn't scheduled for freeing. -(All structures still left in the radix tree would be freed upon domain -destruction, though.) - -For the freeing to be safe even if it didn't use RCU (i.e. to avoid use- -after-free), re-arrange checks/operations in evtchn_close(), such that -the pointer wouldn't be used anymore after calling pirq_cleanup_check() -(noting that unmap_domain_pirq_emuirq() itself calls the function in the -success case). 
- -Fixes: c24536b636f2 ("replace d->nr_pirqs sized arrays with radix tree") -Fixes: 79858fee307c ("xen: fix hvm_domain_use_pirq's behavior") -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: daa90dfea9175c07f13d1a2d901857b2dd14d080 -master date: 2024-07-02 08:35:56 +0200 ---- - xen/arch/x86/irq.c | 1 + - xen/common/event_channel.c | 11 ++++++++--- - xen/include/xen/irq.h | 2 +- - 3 files changed, 10 insertions(+), 4 deletions(-) - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index 290f8d26e7..00be3b88e8 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -1413,6 +1413,7 @@ void (pirq_cleanup_check)(struct pirq *pirq, struct domain *d) - - if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq ) - BUG(); -+ free_pirq_struct(pirq); - } - - /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */ -diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c -index 66f924a7b0..b1a6215c37 100644 ---- a/xen/common/event_channel.c -+++ b/xen/common/event_channel.c -@@ -705,11 +705,16 @@ int evtchn_close(struct domain *d1, int port1, bool guest) - if ( !is_hvm_domain(d1) ) - pirq_guest_unbind(d1, pirq); - pirq->evtchn = 0; -- pirq_cleanup_check(pirq, d1); - #ifdef CONFIG_X86 -- if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 ) -- unmap_domain_pirq_emuirq(d1, pirq->pirq); -+ if ( !is_hvm_domain(d1) || -+ domain_pirq_to_irq(d1, pirq->pirq) <= 0 || -+ unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 ) -+ /* -+ * The successful path of unmap_domain_pirq_emuirq() will have -+ * called pirq_cleanup_check() already. 
-+ */ - #endif -+ pirq_cleanup_check(pirq, d1); - } - unlink_pirq_port(chn1, d1->vcpu[chn1->notify_vcpu_id]); - break; -diff --git a/xen/include/xen/irq.h b/xen/include/xen/irq.h -index 65083135e1..5dcd2d8f0c 100644 ---- a/xen/include/xen/irq.h -+++ b/xen/include/xen/irq.h -@@ -180,7 +180,7 @@ extern struct pirq *pirq_get_info(struct domain *d, int pirq); - void pirq_cleanup_check(struct pirq *pirq, struct domain *d); - - #define pirq_cleanup_check(pirq, d) \ -- ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0) -+ (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0) - - extern void pirq_guest_eoi(struct pirq *pirq); - extern void desc_guest_eoi(struct irq_desc *desc, struct pirq *pirq); --- -2.45.2 - diff --git a/0046-tools-dombuilder-Correct-the-length-calculation-in-x.patch b/0046-tools-dombuilder-Correct-the-length-calculation-in-x.patch deleted file mode 100644 index b25f15d..0000000 --- a/0046-tools-dombuilder-Correct-the-length-calculation-in-x.patch +++ /dev/null @@ -1,44 +0,0 @@ -From 8e51c8f1d45fad242a315fa17ba3582c02e66840 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Thu, 4 Jul 2024 14:12:31 +0200 -Subject: [PATCH 46/56] tools/dombuilder: Correct the length calculation in - xc_dom_alloc_segment() - -xc_dom_alloc_segment() is passed a size in bytes, calculates a size in pages -from it, then fills in the new segment information with a bytes value -re-calculated from the number of pages. - -This causes the module information given to the guest (MB, or PVH) to have -incorrect sizes; specifically, sizes rounded up to the next page. - -This in turn is problematic for Xen. When Xen finds a gzipped module, it -peeks at the end metadata to judge the decompressed size, which is a -4 -backreference from the reported end of the module. - -Fill in seg->vend using the correct number of bytes. 
- -Fixes: ea7c8a3d0e82 ("libxc: reorganize domain builder guest memory allocator") -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Acked-by: Anthony PERARD <anthony.perard@vates.tech> -master commit: 4c3a618b0adaa0cd59e0fa0898bb60978b8b3a5f -master date: 2024-07-02 10:50:18 +0100 ---- - tools/libs/guest/xg_dom_core.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/tools/libs/guest/xg_dom_core.c b/tools/libs/guest/xg_dom_core.c -index c4f4e7f3e2..f5521d528b 100644 ---- a/tools/libs/guest/xg_dom_core.c -+++ b/tools/libs/guest/xg_dom_core.c -@@ -601,7 +601,7 @@ int xc_dom_alloc_segment(struct xc_dom_image *dom, - memset(ptr, 0, pages * page_size); - - seg->vstart = start; -- seg->vend = dom->virt_alloc_end; -+ seg->vend = start + size; - - DOMPRINTF("%-20s: %-12s : 0x%" PRIx64 " -> 0x%" PRIx64 - " (pfn 0x%" PRIpfn " + 0x%" PRIpfn " pages)", --- -2.45.2 - diff --git a/0047-tools-libxs-Fix-CLOEXEC-handling-in-get_dev.patch b/0047-tools-libxs-Fix-CLOEXEC-handling-in-get_dev.patch deleted file mode 100644 index aabae58..0000000 --- a/0047-tools-libxs-Fix-CLOEXEC-handling-in-get_dev.patch +++ /dev/null @@ -1,95 +0,0 @@ -From d1b3bbb46402af77089906a97c413c14ed1740d2 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Thu, 4 Jul 2024 14:13:10 +0200 -Subject: [PATCH 47/56] tools/libxs: Fix CLOEXEC handling in get_dev() - -Move the O_CLOEXEC compatibility outside of an #ifdef USE_PTHREAD block. - -Introduce set_cloexec() to wrap fcntl() setting FD_CLOEXEC. It will be reused -for other CLOEXEC fixes too. - -Use set_cloexec() when O_CLOEXEC isn't available as a best-effort fallback. 
- -Fixes: f4f2f3402b2f ("tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC") -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Juergen Gross <jgross@suse.com> -Acked-by: Anthony PERARD <anthony.perard@vates.tech> -master commit: bf7c1464706adfa903f1e7d59383d042c3a88e39 -master date: 2024-07-02 10:51:06 +0100 ---- - tools/libs/store/xs.c | 38 ++++++++++++++++++++++++++++++++------ - 1 file changed, 32 insertions(+), 6 deletions(-) - -diff --git a/tools/libs/store/xs.c b/tools/libs/store/xs.c -index 1498515073..037e79d98b 100644 ---- a/tools/libs/store/xs.c -+++ b/tools/libs/store/xs.c -@@ -40,6 +40,10 @@ - #include <xentoolcore_internal.h> - #include <xen_list.h> - -+#ifndef O_CLOEXEC -+#define O_CLOEXEC 0 -+#endif -+ - struct xs_stored_msg { - XEN_TAILQ_ENTRY(struct xs_stored_msg) list; - struct xsd_sockmsg hdr; -@@ -54,10 +58,6 @@ struct xs_stored_msg { - #include <dlfcn.h> - #endif - --#ifndef O_CLOEXEC --#define O_CLOEXEC 0 --#endif -- - struct xs_handle { - /* Communications channel to xenstore daemon. */ - int fd; -@@ -176,6 +176,16 @@ static bool setnonblock(int fd, int nonblock) { - return true; - } - -+static bool set_cloexec(int fd) -+{ -+ int flags = fcntl(fd, F_GETFL); -+ -+ if (flags < 0) -+ return false; -+ -+ return fcntl(fd, flags | FD_CLOEXEC) >= 0; -+} -+ - int xs_fileno(struct xs_handle *h) - { - char c = 0; -@@ -230,8 +240,24 @@ error: - - static int get_dev(const char *connect_to) - { -- /* We cannot open read-only because requests are writes */ -- return open(connect_to, O_RDWR | O_CLOEXEC); -+ int fd, saved_errno; -+ -+ fd = open(connect_to, O_RDWR | O_CLOEXEC); -+ if (fd < 0) -+ return -1; -+ -+ /* Compat for non-O_CLOEXEC environments. Racy. 
*/ -+ if (!O_CLOEXEC && !set_cloexec(fd)) -+ goto error; -+ -+ return fd; -+ -+error: -+ saved_errno = errno; -+ close(fd); -+ errno = saved_errno; -+ -+ return -1; - } - - static int all_restrict_cb(Xentoolcore__Active_Handle *ah, domid_t domid) { --- -2.45.2 - diff --git a/0048-tools-libxs-Fix-CLOEXEC-handling-in-get_socket.patch b/0048-tools-libxs-Fix-CLOEXEC-handling-in-get_socket.patch deleted file mode 100644 index e01a6b4..0000000 --- a/0048-tools-libxs-Fix-CLOEXEC-handling-in-get_socket.patch +++ /dev/null @@ -1,60 +0,0 @@ -From d689bb4d2cd3ccdb0067b0ca953cccbc5ab375ae Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Thu, 4 Jul 2024 14:13:18 +0200 -Subject: [PATCH 48/56] tools/libxs: Fix CLOEXEC handling in get_socket() - -get_socket() opens a socket, then uses fcntl() to set CLOEXEC. This is racy -with exec(). - -Open the socket with SOCK_CLOEXEC. Use the same compatibility strategy as -O_CLOEXEC on ancient versions of Linux. - -Reported-by: Frediano Ziglio <frediano.ziglio@cloud.com> -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Juergen Gross <jgross@suse.com> -Acked-by: Anthony PERARD <anthony.perard@vates.tech> -master commit: 1957dd6aff931877fc22699d8f2d4be8728014ba -master date: 2024-07-02 10:51:11 +0100 ---- - tools/libs/store/xs.c | 14 ++++++++------ - 1 file changed, 8 insertions(+), 6 deletions(-) - -diff --git a/tools/libs/store/xs.c b/tools/libs/store/xs.c -index 037e79d98b..11a766c508 100644 ---- a/tools/libs/store/xs.c -+++ b/tools/libs/store/xs.c -@@ -44,6 +44,10 @@ - #define O_CLOEXEC 0 - #endif - -+#ifndef SOCK_CLOEXEC -+#define SOCK_CLOEXEC 0 -+#endif -+ - struct xs_stored_msg { - XEN_TAILQ_ENTRY(struct xs_stored_msg) list; - struct xsd_sockmsg hdr; -@@ -207,16 +211,14 @@ int xs_fileno(struct xs_handle *h) - static int get_socket(const char *connect_to) - { - struct sockaddr_un addr; -- int sock, saved_errno, flags; -+ int sock, saved_errno; - -- sock = socket(PF_UNIX, SOCK_STREAM, 
0); -+ sock = socket(PF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0); - if (sock < 0) - return -1; - -- if ((flags = fcntl(sock, F_GETFD)) < 0) -- goto error; -- flags |= FD_CLOEXEC; -- if (fcntl(sock, F_SETFD, flags) < 0) -+ /* Compat for non-SOCK_CLOEXEC environments. Racy. */ -+ if (!SOCK_CLOEXEC && !set_cloexec(sock)) - goto error; - - addr.sun_family = AF_UNIX; --- -2.45.2 - diff --git a/0049-tools-libxs-Fix-CLOEXEC-handling-in-xs_fileno.patch b/0049-tools-libxs-Fix-CLOEXEC-handling-in-xs_fileno.patch deleted file mode 100644 index 564cece..0000000 --- a/0049-tools-libxs-Fix-CLOEXEC-handling-in-xs_fileno.patch +++ /dev/null @@ -1,109 +0,0 @@ -From 26b8ff1861a870e01456b31bf999f25df5538ebf Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Thu, 4 Jul 2024 14:13:30 +0200 -Subject: [PATCH 49/56] tools/libxs: Fix CLOEXEC handling in xs_fileno() - -xs_fileno() opens a pipe on first use to communicate between the watch thread -and the main thread. Nothing ever sets CLOEXEC on the file descriptors. - -Check for the availability of the pipe2() function with configure. Despite -starting life as Linux-only, FreeBSD and NetBSD have gained it. - -When pipe2() isn't available, try our best with pipe() and set_cloexec(). - -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Juergen Gross <jgross@suse.com> -Acked-by: Anthony PERARD <anthony.perard@vates.tech> -master commit: a2ff677852f0ce05fa335e8e5682bf2ae0c916ee -master date: 2024-07-02 10:52:59 +0100 ---- - tools/config.h.in | 3 +++ - tools/configure | 12 ++++++++++++ - tools/configure.ac | 2 ++ - tools/libs/store/xs.c | 16 +++++++++++++++- - 4 files changed, 32 insertions(+), 1 deletion(-) - -diff --git a/tools/config.h.in b/tools/config.h.in -index 0bb2fe08a1..50ad60fcb0 100644 ---- a/tools/config.h.in -+++ b/tools/config.h.in -@@ -39,6 +39,9 @@ - /* Define to 1 if you have the <memory.h> header file. */ - #undef HAVE_MEMORY_H - -+/* Define to 1 if you have the `pipe2' function. 
*/ -+#undef HAVE_PIPE2 -+ - /* pygrub enabled */ - #undef HAVE_PYGRUB - -diff --git a/tools/configure b/tools/configure -index 7bb935d23b..e35112b5c5 100755 ---- a/tools/configure -+++ b/tools/configure -@@ -9751,6 +9751,18 @@ if test "$ax_found" = "0"; then : - fi - - -+for ac_func in pipe2 -+do : -+ ac_fn_c_check_func "$LINENO" "pipe2" "ac_cv_func_pipe2" -+if test "x$ac_cv_func_pipe2" = xyes; then : -+ cat >>confdefs.h <<_ACEOF -+#define HAVE_PIPE2 1 -+_ACEOF -+ -+fi -+done -+ -+ - cat >confcache <<\_ACEOF - # This file is a shell script that caches the results of configure - # tests run on this system so they can be shared between configure -diff --git a/tools/configure.ac b/tools/configure.ac -index 618ef8c63f..53ac20af1e 100644 ---- a/tools/configure.ac -+++ b/tools/configure.ac -@@ -543,4 +543,6 @@ AS_IF([test "x$pvshim" = "xy"], [ - - AX_FIND_HEADER([INCLUDE_ENDIAN_H], [endian.h sys/endian.h]) - -+AC_CHECK_FUNCS([pipe2]) -+ - AC_OUTPUT() -diff --git a/tools/libs/store/xs.c b/tools/libs/store/xs.c -index 11a766c508..c8845b69e2 100644 ---- a/tools/libs/store/xs.c -+++ b/tools/libs/store/xs.c -@@ -190,13 +190,27 @@ static bool set_cloexec(int fd) - return fcntl(fd, flags | FD_CLOEXEC) >= 0; - } - -+static int pipe_cloexec(int fds[2]) -+{ -+#if HAVE_PIPE2 -+ return pipe2(fds, O_CLOEXEC); -+#else -+ if (pipe(fds) < 0) -+ return -1; -+ /* Best effort to set CLOEXEC. Racy. */ -+ set_cloexec(fds[0]); -+ set_cloexec(fds[1]); -+ return 0; -+#endif -+} -+ - int xs_fileno(struct xs_handle *h) - { - char c = 0; - - mutex_lock(&h->watch_mutex); - -- if ((h->watch_pipe[0] == -1) && (pipe(h->watch_pipe) != -1)) { -+ if ((h->watch_pipe[0] == -1) && (pipe_cloexec(h->watch_pipe) != -1)) { - /* Kick things off if the watch list is already non-empty. 
*/ - if (!XEN_TAILQ_EMPTY(&h->watch_list)) - while (write(h->watch_pipe[1], &c, 1) != 1) --- -2.45.2 - diff --git a/0050-cmdline-document-and-enforce-extra_guest_irqs-upper-.patch b/0050-cmdline-document-and-enforce-extra_guest_irqs-upper-.patch deleted file mode 100644 index f7f61e8..0000000 --- a/0050-cmdline-document-and-enforce-extra_guest_irqs-upper-.patch +++ /dev/null @@ -1,156 +0,0 @@ -From 30c695ddaf067cbe7a98037474e7910109238807 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Thu, 4 Jul 2024 14:14:16 +0200 -Subject: [PATCH 50/56] cmdline: document and enforce "extra_guest_irqs" upper - bounds -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -PHYSDEVOP_pirq_eoi_gmfn_v<N> accepting just a single GFN implies that no -more than 32k pIRQ-s can be used by a domain on x86. Document this upper -bound. - -To also enforce the limit, (ab)use both arch_hwdom_irqs() (changing its -parameter type) and setup_system_domains(). This is primarily to avoid -exposing the two static variables or introducing yet further arch hooks. - -While touching arch_hwdom_irqs() also mark it hwdom-init. 
- -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Acked-by: Roger Pau Monné <roger.pau@citrix.com> - -amend 'cmdline: document and enforce "extra_guest_irqs" upper bounds' - -Address late review comments for what is now commit 17f6d398f765: -- bound max_irqs right away against nr_irqs -- introduce a #define for a constant used twice - -Requested-by: Roger Pau Monné <roger.pau@citrix.com> -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 17f6d398f76597f8009ec0530842fb8705ece7ba -master date: 2024-07-02 12:00:27 +0200 -master commit: 1f56accba33ffea0abf7d1c6384710823d10cbd6 -master date: 2024-07-03 14:03:27 +0200 ---- - docs/misc/xen-command-line.pandoc | 3 ++- - xen/arch/x86/io_apic.c | 17 ++++++++++------- - xen/common/domain.c | 24 ++++++++++++++++++++++-- - xen/include/xen/irq.h | 3 ++- - 4 files changed, 36 insertions(+), 11 deletions(-) - -diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc -index 10a09bbf23..d857bd0f89 100644 ---- a/docs/misc/xen-command-line.pandoc -+++ b/docs/misc/xen-command-line.pandoc -@@ -1175,7 +1175,8 @@ common for all domUs, while the optional second number (preceded by a comma) - is for dom0. Changing the setting for domU has no impact on dom0 and vice - versa. For example to change dom0 without changing domU, use - `extra_guest_irqs=,512`. The default value for Dom0 and an eventual separate --hardware domain is architecture dependent. -+hardware domain is architecture dependent. The upper limit for both values on -+x86 is such that the resulting total number of IRQs can't be higher than 32768. - Note that specifying zero as domU value means zero, while for dom0 it means - to use the default. 
- -diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c -index c5342789e8..f7591fd091 100644 ---- a/xen/arch/x86/io_apic.c -+++ b/xen/arch/x86/io_apic.c -@@ -2664,18 +2664,21 @@ void __init ioapic_init(void) - nr_irqs_gsi, nr_irqs - nr_irqs_gsi); - } - --unsigned int arch_hwdom_irqs(domid_t domid) -+unsigned int __hwdom_init arch_hwdom_irqs(const struct domain *d) - { - unsigned int n = fls(num_present_cpus()); -+ /* Bounding by the domain pirq EOI bitmap capacity. */ -+ const unsigned int max_irqs = min_t(unsigned int, nr_irqs, -+ PAGE_SIZE * BITS_PER_BYTE); - -- if ( !domid ) -- n = min(n, dom0_max_vcpus()); -- n = min(nr_irqs_gsi + n * NR_DYNAMIC_VECTORS, nr_irqs); -+ if ( is_system_domain(d) ) -+ return max_irqs; - -- /* Bounded by the domain pirq eoi bitmap gfn. */ -- n = min_t(unsigned int, n, PAGE_SIZE * BITS_PER_BYTE); -+ if ( !d->domain_id ) -+ n = min(n, dom0_max_vcpus()); -+ n = min(nr_irqs_gsi + n * NR_DYNAMIC_VECTORS, max_irqs); - -- printk("Dom%d has maximum %u PIRQs\n", domid, n); -+ printk("%pd has maximum %u PIRQs\n", d, n); - - return n; - } -diff --git a/xen/common/domain.c b/xen/common/domain.c -index 003f4ab125..62832a5860 100644 ---- a/xen/common/domain.c -+++ b/xen/common/domain.c -@@ -351,7 +351,8 @@ static int late_hwdom_init(struct domain *d) - } - - static unsigned int __read_mostly extra_hwdom_irqs; --static unsigned int __read_mostly extra_domU_irqs = 32; -+#define DEFAULT_EXTRA_DOMU_IRQS 32U -+static unsigned int __read_mostly extra_domU_irqs = DEFAULT_EXTRA_DOMU_IRQS; - - static int __init cf_check parse_extra_guest_irqs(const char *s) - { -@@ -688,7 +689,7 @@ struct domain *domain_create(domid_t domid, - d->nr_pirqs = nr_static_irqs + extra_domU_irqs; - else - d->nr_pirqs = extra_hwdom_irqs ? 
nr_static_irqs + extra_hwdom_irqs -- : arch_hwdom_irqs(domid); -+ : arch_hwdom_irqs(d); - d->nr_pirqs = min(d->nr_pirqs, nr_irqs); - - radix_tree_init(&d->pirq_tree); -@@ -812,6 +813,25 @@ void __init setup_system_domains(void) - if ( IS_ERR(dom_xen) ) - panic("Failed to create d[XEN]: %ld\n", PTR_ERR(dom_xen)); - -+#ifdef CONFIG_HAS_PIRQ -+ /* Bound-check values passed via "extra_guest_irqs=". */ -+ { -+ unsigned int n = max(arch_hwdom_irqs(dom_xen), nr_static_irqs); -+ -+ if ( extra_hwdom_irqs > n - nr_static_irqs ) -+ { -+ extra_hwdom_irqs = n - nr_static_irqs; -+ printk(XENLOG_WARNING "hwdom IRQs bounded to %u\n", n); -+ } -+ if ( extra_domU_irqs > -+ max(DEFAULT_EXTRA_DOMU_IRQS, n - nr_static_irqs) ) -+ { -+ extra_domU_irqs = n - nr_static_irqs; -+ printk(XENLOG_WARNING "domU IRQs bounded to %u\n", n); -+ } -+ } -+#endif -+ - /* - * Initialise our DOMID_IO domain. - * This domain owns I/O pages that are within the range of the page_info -diff --git a/xen/include/xen/irq.h b/xen/include/xen/irq.h -index 5dcd2d8f0c..bef170bcb6 100644 ---- a/xen/include/xen/irq.h -+++ b/xen/include/xen/irq.h -@@ -196,8 +196,9 @@ extern struct irq_desc *pirq_spin_lock_irq_desc( - - unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask); - -+/* When passed a system domain, this returns the maximum permissible value. 
*/ - #ifndef arch_hwdom_irqs --unsigned int arch_hwdom_irqs(domid_t domid); -+unsigned int arch_hwdom_irqs(const struct domain *d); - #endif - - #ifndef arch_evtchn_bind_pirq --- -2.45.2 - diff --git a/0051-x86-entry-don-t-clear-DF-when-raising-UD-for-lack-of.patch b/0051-x86-entry-don-t-clear-DF-when-raising-UD-for-lack-of.patch deleted file mode 100644 index acefc8e..0000000 --- a/0051-x86-entry-don-t-clear-DF-when-raising-UD-for-lack-of.patch +++ /dev/null @@ -1,58 +0,0 @@ -From 7e636b8a16412d4f0d94b2b24d7ebcd2c749afff Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Thu, 4 Jul 2024 14:14:49 +0200 -Subject: [PATCH 51/56] x86/entry: don't clear DF when raising #UD for lack of - syscall handler - -While doing so is intentional when invoking the actual callback, to -mimic a hard-coded SYCALL_MASK / FMASK MSR, the same should not be done -when no handler is available and hence #UD is raised. - -Fixes: ca6fcf4321b3 ("x86/pv: Inject #UD for missing SYSCALL callbacks") -Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> -master commit: d2fe9ab3048d503869ec81bc49db07e55a4a2386 -master date: 2024-07-02 12:01:21 +0200 ---- - xen/arch/x86/x86_64/entry.S | 12 +++++++++++- - 1 file changed, 11 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S -index 054fcb225f..d3def49ea3 100644 ---- a/xen/arch/x86/x86_64/entry.S -+++ b/xen/arch/x86/x86_64/entry.S -@@ -38,6 +38,14 @@ switch_to_kernel: - setc %cl - leal (,%rcx,TBF_INTERRUPT),%ecx - -+ /* -+ * The PV ABI hardcodes the (guest-inaccessible and virtual) -+ * SYSCALL_MASK MSR such that DF (and nothing else) would be cleared. -+ * Note that the equivalent of IF (VGCF_syscall_disables_events) is -+ * dealt with separately above. 
-+ */ -+ mov $~X86_EFLAGS_DF, %esi -+ - test %rax, %rax - UNLIKELY_START(z, syscall_no_callback) /* TB_eip == 0 => #UD */ - mov VCPU_trap_ctxt(%rbx), %rdi -@@ -47,12 +55,14 @@ UNLIKELY_START(z, syscall_no_callback) /* TB_eip == 0 => #UD */ - testb $4, X86_EXC_UD * TRAPINFO_sizeof + TRAPINFO_flags(%rdi) - setnz %cl - lea TBF_EXCEPTION(, %rcx, TBF_INTERRUPT), %ecx -+ or $~0, %esi /* Don't clear DF */ - UNLIKELY_END(syscall_no_callback) - - movq %rax,TRAPBOUNCE_eip(%rdx) - movb %cl,TRAPBOUNCE_flags(%rdx) - call create_bounce_frame -- andl $~X86_EFLAGS_DF,UREGS_eflags(%rsp) -+ /* Conditionally clear DF */ -+ and %esi, UREGS_eflags(%rsp) - /* %rbx: struct vcpu */ - test_all_events: - ASSERT_NOT_IN_ATOMIC --- -2.45.2 - diff --git a/0052-evtchn-build-fix-for-Arm.patch b/0052-evtchn-build-fix-for-Arm.patch deleted file mode 100644 index 6cbeb10..0000000 --- a/0052-evtchn-build-fix-for-Arm.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 45c5333935628e7c80de0bd5a9d9eff50b305b16 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Thu, 4 Jul 2024 16:57:29 +0200 -Subject: [PATCH 52/56] evtchn: build fix for Arm - -When backporting daa90dfea917 ("pirq_cleanup_check() leaks") I neglected -to pay attention to it depending on 13a7b0f9f747 ("restrict concept of -pIRQ to x86"). That one doesn't want backporting imo, so use / adjust -custom #ifdef-ary to address the immediate issue of pirq_cleanup_check() -not being available on Arm. 
- -Signed-off-by: Jan Beulich <jbeulich@suse.com> ---- - xen/common/event_channel.c | 4 +++- - 1 file changed, 3 insertions(+), 1 deletion(-) - -diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c -index b1a6215c37..e6ec556603 100644 ---- a/xen/common/event_channel.c -+++ b/xen/common/event_channel.c -@@ -643,7 +643,9 @@ static int evtchn_bind_pirq(evtchn_bind_pirq_t *bind) - if ( rc != 0 ) - { - info->evtchn = 0; -+#ifdef CONFIG_X86 - pirq_cleanup_check(info, d); -+#endif - goto out; - } - -@@ -713,8 +715,8 @@ int evtchn_close(struct domain *d1, int port1, bool guest) - * The successful path of unmap_domain_pirq_emuirq() will have - * called pirq_cleanup_check() already. - */ --#endif - pirq_cleanup_check(pirq, d1); -+#endif - } - unlink_pirq_port(chn1, d1->vcpu[chn1->notify_vcpu_id]); - break; --- -2.45.2 - diff --git a/0053-x86-IRQ-avoid-double-unlock-in-map_domain_pirq.patch b/0053-x86-IRQ-avoid-double-unlock-in-map_domain_pirq.patch deleted file mode 100644 index 686e142..0000000 --- a/0053-x86-IRQ-avoid-double-unlock-in-map_domain_pirq.patch +++ /dev/null @@ -1,53 +0,0 @@ -From d46a1ce3175dc45e97a8c9b89b0d0ff46145ae64 Mon Sep 17 00:00:00 2001 -From: Jan Beulich <jbeulich@suse.com> -Date: Tue, 16 Jul 2024 14:14:43 +0200 -Subject: [PATCH 53/56] x86/IRQ: avoid double unlock in map_domain_pirq() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Forever since its introduction the main loop in the function dealing -with multi-vector MSI had error exit points ("break") with different -properties: In one case no IRQ descriptor lock is being held. -Nevertheless the subsequent error cleanup path assumed such a lock would -uniformly need releasing. Identify the case by setting "desc" to NULL, -thus allowing the unlock to be skipped as necessary. - -This is CVE-2024-31143 / XSA-458. 
- -Coverity ID: 1605298 -Fixes: d1b6d0a02489 ("x86: enable multi-vector MSI") -Signed-off-by: Jan Beulich <jbeulich@suse.com> -Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> -master commit: 57338346f29cea7b183403561bdc5f407163b846 -master date: 2024-07-16 14:09:14 +0200 ---- - xen/arch/x86/irq.c | 5 ++++- - 1 file changed, 4 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index 00be3b88e8..5dae8bd1b9 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -2287,6 +2287,7 @@ int map_domain_pirq( - - set_domain_irq_pirq(d, irq, info); - spin_unlock_irqrestore(&desc->lock, flags); -+ desc = NULL; - - info = NULL; - irq = create_irq(NUMA_NO_NODE, true); -@@ -2322,7 +2323,9 @@ int map_domain_pirq( - - if ( ret ) - { -- spin_unlock_irqrestore(&desc->lock, flags); -+ if ( desc ) -+ spin_unlock_irqrestore(&desc->lock, flags); -+ - pci_disable_msi(msi_desc); - if ( nr ) - { --- -2.45.2 - diff --git a/0054-x86-physdev-Return-pirq-that-irq-was-already-mapped-.patch b/0054-x86-physdev-Return-pirq-that-irq-was-already-mapped-.patch deleted file mode 100644 index 5e245f9..0000000 --- a/0054-x86-physdev-Return-pirq-that-irq-was-already-mapped-.patch +++ /dev/null @@ -1,38 +0,0 @@ -From f9f3062f11e144438fac9e9da6aa4cb41a6009b1 Mon Sep 17 00:00:00 2001 -From: Jiqian Chen <Jiqian.Chen@amd.com> -Date: Thu, 25 Jul 2024 16:20:17 +0200 -Subject: [PATCH 54/56] x86/physdev: Return pirq that irq was already mapped to - -Fix bug introduced by 0762e2502f1f ("x86/physdev: factor out the code to allocate and -map a pirq"). After that re-factoring, when pirq<0 and current_pirq>0, it means -caller want to allocate a free pirq for irq but irq already has a mapped pirq, then -it returns the negative pirq, so it fails. However, the logic before that -re-factoring is different, it should return the current_pirq that irq was already -mapped to and make the call success. 
- -Fixes: 0762e2502f1f ("x86/physdev: factor out the code to allocate and map a pirq") -Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> -Signed-off-by: Huang Rui <ray.huang@amd.com> -Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: 0d2b87b5adfc19e87e9027d996db204c66a47f30 -master date: 2024-07-08 14:46:12 +0100 ---- - xen/arch/x86/irq.c | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c -index 5dae8bd1b9..6b1f338eae 100644 ---- a/xen/arch/x86/irq.c -+++ b/xen/arch/x86/irq.c -@@ -2914,6 +2914,7 @@ static int allocate_pirq(struct domain *d, int index, int pirq, int irq, - d->domain_id, index, pirq, current_pirq); - if ( current_pirq < 0 ) - return -EBUSY; -+ pirq = current_pirq; - } - else if ( type == MAP_PIRQ_TYPE_MULTI_MSI ) - { --- -2.45.2 - diff --git a/0055-tools-libxs-Fix-fcntl-invocation-in-set_cloexec.patch b/0055-tools-libxs-Fix-fcntl-invocation-in-set_cloexec.patch deleted file mode 100644 index e4cc09e..0000000 --- a/0055-tools-libxs-Fix-fcntl-invocation-in-set_cloexec.patch +++ /dev/null @@ -1,57 +0,0 @@ -From 81f1e807fadb8111d71b78191e01ca688d74eac7 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper <andrew.cooper3@citrix.com> -Date: Thu, 25 Jul 2024 16:20:53 +0200 -Subject: [PATCH 55/56] tools/libxs: Fix fcntl() invocation in set_cloexec() - -set_cloexec() had a bit too much copy&pate from setnonblock(), and -insufficient testing on ancient versions of Linux... - -As written (emulating ancient linux by undef'ing O_CLOEXEC), strace shows: - - open("/dev/xen/xenbus", O_RDWR) = 3 - fcntl(3, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE) - fcntl(3, 0x8003 /* F_??? */, 0x7ffe4a771d90) = -1 EINVAL (Invalid argument) - close(3) = 0 - -which is obviously nonsense. - -Switch F_GETFL -> F_GETFD, and fix the second invocation to use F_SETFD. 
With -this, strace is rather happer: - - open("/dev/xen/xenbus", O_RDWR) = 3 - fcntl(3, F_GETFD) = 0 - fcntl(3, F_SETFD, FD_CLOEXEC) = 0 - -Fixes: bf7c1464706a ("tools/libxs: Fix CLOEXEC handling in get_dev()") -Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com> -Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> -Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com> -Reviewed-by: Juergen Gross <jgross@suse.com> -master commit: 37810b52d003f8a04af41d7b1f85eff24af9f804 -master date: 2024-07-09 15:32:18 +0100 ---- - tools/libs/store/xs.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/tools/libs/store/xs.c b/tools/libs/store/xs.c -index c8845b69e2..38a6ce3cf2 100644 ---- a/tools/libs/store/xs.c -+++ b/tools/libs/store/xs.c -@@ -182,12 +182,12 @@ static bool setnonblock(int fd, int nonblock) { - - static bool set_cloexec(int fd) - { -- int flags = fcntl(fd, F_GETFL); -+ int flags = fcntl(fd, F_GETFD); - - if (flags < 0) - return false; - -- return fcntl(fd, flags | FD_CLOEXEC) >= 0; -+ return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) >= 0; - } - - static int pipe_cloexec(int fds[2]) --- -2.45.2 - diff --git a/0056-x86-altcall-fix-clang-code-gen-when-using-altcall-in.patch b/0056-x86-altcall-fix-clang-code-gen-when-using-altcall-in.patch deleted file mode 100644 index c94c516..0000000 --- a/0056-x86-altcall-fix-clang-code-gen-when-using-altcall-in.patch +++ /dev/null @@ -1,85 +0,0 @@ -From d078d0aa86e9e3b937f673dc89306b3afd09d560 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com> -Date: Thu, 25 Jul 2024 16:21:17 +0200 -Subject: [PATCH 56/56] x86/altcall: fix clang code-gen when using altcall in - loop constructs -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Yet another clang code generation issue when using altcalls. 
- -The issue this time is with using loop constructs around alternative_{,v}call -instances using parameter types smaller than the register size. - -Given the following example code: - -static void bar(bool b) -{ - unsigned int i; - - for ( i = 0; i < 10; i++ ) - { - int ret_; - register union { - bool e; - unsigned long r; - } di asm("rdi") = { .e = b }; - register unsigned long si asm("rsi"); - register unsigned long dx asm("rdx"); - register unsigned long cx asm("rcx"); - register unsigned long r8 asm("r8"); - register unsigned long r9 asm("r9"); - register unsigned long r10 asm("r10"); - register unsigned long r11 asm("r11"); - - asm volatile ( "call %c[addr]" - : "+r" (di), "=r" (si), "=r" (dx), - "=r" (cx), "=r" (r8), "=r" (r9), - "=r" (r10), "=r" (r11), "=a" (ret_) - : [addr] "i" (&(func)), "g" (func) - : "memory" ); - } -} - -See: https://godbolt.org/z/qvxMGd84q - -Clang will generate machine code that only resets the low 8 bits of %rdi -between loop calls, leaving the rest of the register possibly containing -garbage from the use of %rdi inside the called function. Note also that clang -doesn't truncate the input parameters at the callee, thus breaking the psABI. - -Fix this by turning the `e` element in the anonymous union into an array that -consumes the same space as an unsigned long, as this forces clang to reset the -whole %rdi register instead of just the low 8 bits. 
- -Fixes: 2ce562b2a413 ('x86/altcall: use a union as register type for function parameters on clang') -Suggested-by: Jan Beulich <jbeulich@suse.com> -Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> -Reviewed-by: Jan Beulich <jbeulich@suse.com> -master commit: d51b2f5ea1915fe058f730b0ec542cf84254fca0 -master date: 2024-07-23 13:59:30 +0200 ---- - xen/arch/x86/include/asm/alternative.h | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h -index 0d3697f1de..e63b459276 100644 ---- a/xen/arch/x86/include/asm/alternative.h -+++ b/xen/arch/x86/include/asm/alternative.h -@@ -185,10 +185,10 @@ extern void alternative_branches(void); - */ - #define ALT_CALL_ARG(arg, n) \ - register union { \ -- typeof(arg) e; \ -+ typeof(arg) e[sizeof(long) / sizeof(arg)]; \ - unsigned long r; \ - } a ## n ## _ asm ( ALT_CALL_arg ## n ) = { \ -- .e = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); }) \ -+ .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })\ - } - #else - #define ALT_CALL_ARG(arg, n) \ --- -2.45.2 - @@ -1,6 +1,6 @@ -Xen upstream patchset #0 for 4.18.3-pre +Xen upstream patchset #0 for 4.19.1-pre Containing patches from -RELEASE-4.18.2 (844f9931c6c207588a70f897262c628cd542f75a) +RELEASE-4.19.0 (0ef126c163d99932c9d7142e2bd130633c5c4844) to -staging-4.18 (d078d0aa86e9e3b937f673dc89306b3afd09d560) +staging-4.19 (2c61ab407172682e1382204a8305107f19e2951b) |
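Note on the removed tools/libxs patch above: its fix amounts to reading the descriptor flags with F_GETFD and writing them back with F_SETFD, rather than reusing the F_GETFL result as the fcntl() command. The following is a minimal stand-alone sketch of that pattern, not the libxs source itself. The names set_cloexec_sketch() and has_cloexec_sketch() are hypothetical and exist only for illustration.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>   /* pipe() and close(), for callers that try the helpers */

/*
 * Hypothetical stand-alone version of the fixed helper (illustrative name).
 * F_GETFD returns the descriptor flags, which is where FD_CLOEXEC lives;
 * F_GETFL would return the file status flags (O_RDWR etc.) instead.
 */
static bool set_cloexec_sketch(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return false;

    /* The command argument must be F_SETFD; the flags go in the third slot. */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) >= 0;
}

/* Hypothetical check helper: true only if FD_CLOEXEC is set on fd. */
static bool has_cloexec_sketch(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    return flags >= 0 && (flags & FD_CLOEXEC);
}
```

A caller could create a descriptor with pipe(), call set_cloexec_sketch() on one end, and confirm the result with has_cloexec_sketch(). An invalid descriptor such as -1 makes the first fcntl() fail, so the helper returns false.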