mirror of https://github.com/xcat2/xcat-core.git synced 2026-05-17 19:57:18 +00:00
Commit Graph

27023 Commits

Author SHA1 Message Date
Markus Hilger 5bd7bd2112 Merge pull request #7561 from VersatusHPC/fix/confignetwork-typo
fix: typo in confignetwork preventing SETINSTALLNIC from working
2026-05-07 11:55:55 +02:00
Markus Hilger 79dd24d80b Merge pull request #7560 from VersatusHPC/fix/otherpkgs-dnf-detection
fix: detect dnf as package manager in ospkgs and otherpkgs postscripts
2026-05-07 11:53:54 +02:00
Markus Hilger b2272818d8 Merge pull request #7559 from VersatusHPC/fix/bmcsetup-disable-retry
fix: skip disabled IPMI user slots in bmcsetup
2026-05-07 11:52:30 +02:00
Markus Hilger 5463a7c46c Merge pull request #7558 from VersatusHPC/fix/mkdef-partial-object-on-validation-error
fix: prevent mkdef partial writes on validation errors
2026-05-07 11:50:41 +02:00
Vinícius Ferrão 1e341d941a fix: typo in confignetwork preventing SETINSTALLNIC from working
bool_install_nic on line 99 should be boot_install_nic, matching
the variable used everywhere else in the script. This caused the
SETINSTALLNIC environment variable to have no effect.

Fixes: xcat2/xcat-core#7472
2026-05-07 03:02:07 -03:00
Vinícius Ferrão dd15d5abb5 fix: detect dnf as package manager in ospkgs and otherpkgs postscripts
On RHEL 9.x minimal installs, the yum package may not exist as a
separate RPM — only dnf is present with /usr/bin/yum as a symlink.
The previous detection using rpm -q yum would fail, causing hasyum
to remain 0 and skipping repo file creation entirely.

Replace rpm -q based detection with executable checks for /usr/bin/dnf
and /usr/bin/yum. Introduce yumcmd variable to carry the actual command
name through all package operations instead of hardcoding yum.

Fixes: xcat2/xcat-core#7497
2026-05-07 02:54:44 -03:00
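
The executable-based detection described in dd15d5abb5 can be sketched as follows. This is only an illustration in Python, not the postscript itself: the function name `pick_package_command` and the preference for dnf over yum are assumptions; the two paths come from the commit message.

```python
import os


def pick_package_command(candidates=("/usr/bin/dnf", "/usr/bin/yum"),
                         is_executable=None):
    """Return the command name of the first executable candidate, or None.

    The candidate order (dnf before yum) is an assumption of this sketch.
    """
    if is_executable is None:
        # isfile() follows symlinks, so /usr/bin/yum -> dnf-3 still counts.
        def is_executable(path):
            return os.path.isfile(path) and os.access(path, os.X_OK)

    for path in candidates:
        if is_executable(path):
            return os.path.basename(path)
    return None


if __name__ == "__main__":
    # Prints "dnf", "yum", or "None" depending on the host.
    print(pick_package_command())
```

Checking the file rather than the RPM database is what lets this work when yum exists only as a symlink and no yum package is installed.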
Vinícius Ferrão 7c1444335e fix: skip disabled IPMI user slots in bmcsetup
bmcsetup iterated every user slot and retried ipmitool user disable for slots that were already disabled. Lenovo XCC reports those attempts as Invalid data field in request, so discovery can spend minutes retrying no-op disables.

Read the current user table once per BMC, keep the old fallback when the table cannot be read, and disable only non-target slots whose IPMI Msg flag is true. Also use the loop's current username when resolving the target slot and keep the intended userslot 2 fallback assignment.

Fixes xcat2/xcat-core#5065
2026-05-06 23:54:02 -03:00
Vinícius Ferrão 2ae97c2ac4 fix: prevent mkdef partial writes on validation errors 2026-05-06 20:40:38 -03:00
Markus Hilger 238b357428 Merge pull request #7557 from Obihoernchen/ci
Speed up CI
2026-05-07 01:00:20 +02:00
Markus Hilger cb11564ff1 Merge pull request #7556 from VersatusHPC/fix/openbmc-503-retry
fix: retry on HTTP 503 from OpenBMC REST API instead of failing
2026-05-07 00:34:10 +02:00
Markus Hilger 145bff6ca3 Speed up CI 2026-05-07 00:30:10 +02:00
Markus Hilger a90ef274aa Merge pull request #7555 from VersatusHPC/fix/ipmi-rmcptag-and-cbcpad
fix: ipmi rmcptag and cbcpad
2026-05-07 00:00:15 +02:00
Vinícius Ferrão 1bca57fa2a fix: retry on HTTP 503 from OpenBMC REST API instead of failing
OpenBMC BMCs intermittently return 503 Service Unavailable when the
REST service is busy or recovering. xCAT reported the error immediately,
requiring the user to manually retry. A second attempt usually succeeds.

Retry the same request up to 3 times with a 3-second wait on 503.
If all retries fail, report the error as before. The existing 504
handling for bmcreboot is preserved.

Ref: #4264
2026-05-06 18:50:53 -03:00
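
A minimal sketch of the retry policy described in 1bca57fa2a, written in Python for illustration. The `send` callable and the function name are hypothetical, and reading "up to 3 times" as three retries after the first attempt is an assumption of this sketch.

```python
import time


def request_with_retry(send, retries=3, wait_seconds=3, sleep=time.sleep):
    """Call send() and repeat it while the response status is 503.

    send is a hypothetical callable returning (status_code, body).
    After the retries are used up, the last response is returned so the
    caller can report the error as before.
    """
    status, body = send()
    attempts_left = retries
    while status == 503 and attempts_left > 0:
        sleep(wait_seconds)          # fixed wait between attempts
        status, body = send()
        attempts_left -= 1
    return status, body
```

Other status codes, including 504, pass straight through, so existing handling for them stays with the caller.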
Markus Hilger ca6bafd723 Merge pull request #7553 from VersatusHPC/fix/tls-policy
feat: add xCAT TLS policy selection
2026-05-06 19:39:19 +02:00
Markus Hilger aa180925e3 Merge pull request #7550 from VersatusHPC/fix/profile-asset-dotted-osvers
fix: handle dotted OS versions in profile asset lookup
2026-05-06 19:18:26 +02:00
Markus Hilger 2b1986d946 Merge pull request #7552 from VersatusHPC/fix/ubuntu-live-media-guardrails
fix: guardrails for Ubuntu genimage
2026-05-06 19:17:08 +02:00
Markus Hilger b93a54daee Merge pull request #7554 from VersatusHPC/fix/dhcpop-requires
fix: use runtime require for xCAT::DHCP::Backend in dhcpop
2026-05-06 18:45:27 +02:00
Vinícius Ferrão 911c74fda6 fix: use runtime require for xCAT::DHCP::Backend in dhcpop
The xCAT-server build-readme script runs every tool in share/xcat/tools/
with --help during RPM packaging. At build time perl-xCAT is not installed,
so the compile-time 'use xCAT::DHCP::Backend' aborts before --help can run.

Switch to runtime require inside the remove-operation branch where the
module is actually needed.
2026-05-06 02:55:11 -03:00
Vinícius Ferrão c0e8b1730e fix: fall back from sha256 to sha1 on RAKP2 auth rejection
Extend the existing sha256-to-sha1 fallback (already present in
got_rmcp_response for Open Session errors) to also cover RAKP2
rejections with "Unauthorized name" (0x0d) or "Invalid role" (0x09).

Ref: #7511
2026-05-06 01:26:29 -03:00
Vinícius Ferrão 86f6a12264 fix: set IPMI name-only lookup bit in RAKP1 to match ipmitool
Set bit 4 (0x10) of the requested privilege byte in RAKP Message 1
for name-only user lookup, matching ipmitool behavior. Use the same
value consistently in all HMAC calculations (RAKP2 verification,
RAKP3 auth code, SIK derivation).

Without this, some BMCs fail user lookup with "Unauthorized name"
even though the credentials are correct.

Ref: #7511
2026-05-06 01:25:55 -03:00
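
The bit manipulation described in 86f6a12264 can be illustrated as below. The names are hypothetical, and masking the privilege level to the low four bits is an assumption of this sketch; only the 0x10 bit comes from the commit message.

```python
NAME_ONLY_LOOKUP = 0x10  # bit 4 of the requested privilege byte


def rakp1_privilege_byte(requested_privilege):
    """Combine a privilege level with the name-only lookup bit."""
    # Assumption: the privilege level occupies the low four bits.
    return (requested_privilege & 0x0F) | NAME_ONLY_LOOKUP
```

The point made in the commit is that the resulting byte, not the bare privilege level, has to be fed into every later HMAC calculation, otherwise both sides hash different inputs.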
Vinícius Ferrão 2bcdc52f92 fix: accept RMCP message tag 0 from OpenBMC with session ID correlation
OpenBMC-based BMCs return message tag 0 in RAKP2/RAKP4 instead of
echoing the tag from the request. xCAT rejected these as stale
responses and retried indefinitely until timeout.

Accept tag 0 but verify the remote console session ID in the response
matches our current sidm. This prevents stale retries from corrupting
session state while allowing OpenBMC responses through.

Applied to got_rmcp_response, got_rakp2, and got_rakp4.

Ref: #7511
2026-05-06 01:25:09 -03:00
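
The acceptance rule described in 2bcdc52f92 might look roughly like this. It is a Python sketch with hypothetical names; the actual check lives in the Perl handlers named in the commit, and whether the session ID is also checked when the tag matches is not stated there.

```python
def accept_rmcp_response(response_tag, expected_tag,
                         response_console_sid, current_sidm):
    """Decide whether a response belongs to the current exchange."""
    if response_tag == expected_tag:
        # Normal case: the BMC echoed the tag from the request.
        return True
    if response_tag == 0:
        # OpenBMC case: tag 0 is tolerated only when the remote console
        # session ID matches the session currently being negotiated.
        return response_console_sid == current_sidm
    return False
```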
Vinícius Ferrão cb2a6b3f3c fix: reject IPMI packets with invalid CBC padding instead of crashing
cbc_pad in decrypt mode reads the last byte as the pad count, then
calls splice(@block, 0 - $count). If decrypted data is corrupt, the
pad count can exceed the array size, crashing with "Modification of
non-creatable array value attempted, subscript -16".

Return empty string on invalid padding so the caller treats it as a
decryption failure rather than accepting corrupted data as a valid
IPMI response.

Ref: #7511
2026-05-06 01:23:10 -03:00
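
The bounds check described in cb2a6b3f3c can be sketched in Python as follows. The function name is hypothetical, and the trailer layout (pad bytes followed by a one-byte pad count) is an assumption of this sketch based on the description above.

```python
def strip_ipmi_pad(decrypted: bytes) -> bytes:
    """Remove the trailing pad from a decrypted payload.

    Returns b"" when the pad count cannot be valid, so the caller can
    treat the packet as a decryption failure.
    """
    if not decrypted:
        return b""
    pad_count = decrypted[-1]
    # The trailer is pad_count pad bytes plus the count byte itself.
    if pad_count + 1 > len(decrypted):
        return b""
    return decrypted[: len(decrypted) - pad_count - 1]
```

Validating the count before slicing is the whole fix: corrupt data is rejected instead of driving an out-of-range removal.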
Vinícius Ferrão 2915e9be0e Add xCAT TLS policy selection 2026-05-05 23:20:18 -03:00
Markus Hilger 0649b4c4ac Merge pull request #7549 from VersatusHPC/fix/update-copyright
docs: update copyright to include xCAT Consortium
2026-05-06 03:17:12 +02:00
Markus Hilger b006975b54 Merge pull request #7551 from VersatusHPC/fix/sles-legacy-validation
fix: restore legacy SLES provisioning paths
2026-05-06 02:49:34 +02:00
Vinícius Ferrão 7b20bbd187 Guard Ubuntu live media package sources 2026-05-05 21:40:04 -03:00
Vinícius Ferrão 9f33b19214 fix: restore legacy SLES provisioning paths 2026-05-05 17:09:37 -03:00
Vinícius Ferrão 119b19ce14 fix: handle dotted OS versions in profile asset lookup 2026-05-05 13:50:58 -03:00
Vinícius Ferrão 59af27444b docs: update copyright to include xCAT Consortium
Add xCAT Consortium (2022-2026) alongside the original IBM Corporation
copyright (2015-2022) in the Sphinx documentation configuration.
2026-05-05 10:44:03 -03:00
Markus Hilger 5b12eecd40 Merge pull request #7548 from VersatusHPC/fix/update-cuda-docs
docs: update NVIDIA CUDA documentation for modern OS support
2026-05-05 09:30:21 +02:00
Markus Hilger 2618b532c5 Merge pull request #7547 from VersatusHPC/fix/dhcp-dynamic-range-overlap
fix: errors out when node IP overlaps DHCP dynamic range
2026-05-05 09:28:23 +02:00
Markus Hilger 472529e046 Merge pull request #7546 from VersatusHPC/fix/remove-hardcoded-quiet
fix: move kernel quiet flag from hardcoded to osimage default
2026-05-05 09:26:26 +02:00
Vinícius Ferrão 60820b1abe docs: update NVIDIA CUDA documentation for modern OS support
The CUDA docs were frozen at CUDA 9.2 / RHEL 7.5 / Ubuntu 14.04 since
2019. Update to cover all currently supported OS and architecture
combinations (EL 7-10, Ubuntu 20.04-24.04, x86_64/ppc64le/sbsa).

Consolidate the version-specific repo and osimage pages into generic
guides that use placeholder variables, reducing 7 files to 2 while
covering more OS versions. Both online (direct NVIDIA repo URL) and
offline (dnf download / apt download mirroring) workflows are
documented.

All NVIDIA repository URLs validated against
developer.download.nvidia.com/compute/cuda/repos/ and confirmed
accessible with valid repodata.

Addresses #7373
2026-05-05 02:32:09 -03:00
Vinícius Ferrão bb8dd525da Error when node IP overlaps DHCP dynamic range
Previously, makedhcp warned but still created host entries without
a static IP reservation when a node's address fell inside the
dynamic range. The node would silently get a random IP from the
pool instead of its configured address.

Now errors and skips the node on all four DHCP paths (ISC v4/v6,
Kea v4/v6) with a clear message telling the admin to move the IP
outside the range or adjust the dynamic range.

This makes ISC DHCP and Kea behavior consistent and aligns with
xCAT's design: the dynamic range is for hardware discovery,
known nodes should have static IPs outside it.

Closes #6539
2026-05-05 00:48:30 -03:00
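
The overlap test behind bb8dd525da amounts to an inclusive range check. The Python sketch below uses the standard `ipaddress` module; the function name and the choice to treat a v4/v6 mismatch as "no overlap" are assumptions of this sketch, not makedhcp behavior.

```python
import ipaddress


def in_dynamic_range(node_ip, range_start, range_end):
    """Return True if node_ip lies within [range_start, range_end]."""
    ip = ipaddress.ip_address(node_ip)
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    # Addresses of different families cannot be ordered against each other.
    if not (ip.version == start.version == end.version):
        return False
    return start <= ip <= end
```

A caller following the commit's policy would report an error and skip the node whenever this returns True, instead of emitting a host entry without a reservation.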
Vinícius Ferrão 03a16dd081 fix: move kernel quiet flag from hardcoded to osimage default
The quiet kernel parameter was hardcoded in anaconda.pm and sles.pm,
making it impossible for admins to get verbose boot output without
editing plugin source code. The existing addkcmdline mechanism
(bootparams and linuximage tables) only appends to the kernel command
line, so there was no way to remove quiet.

Move quiet out of the plugin kcmdline construction and into the
linuximage.addkcmdline default set during copycds osimage creation.
Admins who want verbose boot for debugging can now remove it per
osimage:

    chdef -t osimage <image> addkcmdline=""

New osimages get addkcmdline="quiet" by default. Existing osimages
with a custom addkcmdline are not overwritten on re-run of copycds.

Genesis/discovery boot (mknb.pm) is unchanged as it does not use
osimage definitions.

Addresses #6916
2026-05-04 22:58:52 -03:00
Markus Hilger a51a4d7710 Merge pull request #7543 from VersatusHPC/fix/systemd-xcatd
feat: Use systemd instead of legacy initscripts
2026-05-05 01:38:37 +02:00
Markus Hilger e65b968000 Merge pull request #7545 from VersatusHPC/fix/nodeset-empty-master-ip
fix: fail nodeset when MASTER_IP cannot be resolved
2026-05-05 01:34:25 +02:00
Vinícius Ferrão bfbc48c698 fix: fail nodeset when MASTER_IP cannot be resolved
Template.pm silently continued rendering kickstart templates when
getipaddr() failed to resolve the master hostname, producing
kickstarts with an empty MASTER_IP. Nodes would install successfully
but fail on first reboot when post.xcat and xcatinstallpost tried
to contact the master, timing out after 90 retries with:

    the network between the node and  is not ready

Postage.pm (mypostscript generation) already checks for this and
returns a clear error. Apply the same pattern in Template.pm so
nodeset fails immediately with a descriptive message instead of
producing a broken kickstart.

Fixes #7544
2026-05-04 18:52:13 -03:00
Vinícius Ferrão 7897f30bfe Modernize xcatd service packaging 2026-05-04 18:13:23 -03:00
Markus Hilger d5831828d6 Merge pull request #7533 from VersatusHPC/fix/opensuse-leap-support
feat: add openSUSE Leap 15 and SLES 15 provisioning support
2026-05-04 17:20:59 +02:00
Vinícius Ferrão 88da644249 Merge pull request #7532 from VersatusHPC/fix/el10-netboot-dhcp-client
fix: use NetworkManager for EL10 netboot DHCP instead of dhclient
2026-05-04 17:20:11 +02:00
Markus Hilger c7915645b3 Merge pull request #7541 from VersatusHPC/fix/ipmi-rspconfig-set-readback
Improve rspconfig SET readback and fix backupgateway SET target
2026-05-04 17:19:38 +02:00
Markus Hilger 679bed8926 Merge pull request #7542 from VersatusHPC/fix/apache-disable-directory-indexing
fix: disable Apache directory indexing on /install and /tftpboot
2026-05-04 17:18:39 +02:00
Markus Hilger 2bdb0d4d02 Merge pull request #7540 from VersatusHPC/fix/remove-docker-lifecycle
fix: remove Docker container lifecycle management (dead code since 2016)
2026-05-04 17:15:58 +02:00
Vinícius Ferrão 5035697e9b fix: disable Apache directory indexing on /install and /tftpboot
The default xCAT Apache configuration shipped with Options Indexes
enabled for the /install and /tftpboot directories. This allowed
unauthenticated users to browse directory listings, disclosing the
full tree of postscripts, boot files, and (in production deployments)
potentially kickstart files with password hashes, custom scripts with
embedded credentials, and cluster topology details.

Replace Options Indexes with -Indexes in all four shipped Apache config
files (MN and SN, Apache 2.2 and 2.4 variants). Direct file access
by known path continues to work, so all provisioning workflows are
unaffected. Directory browsing for /xcat-doc is preserved as it
contains only public documentation.

Additionally, add an Apache hardening guide documenting recommended
permissions for sensitive directories under /install, network binding
best practices, and IP-based access control options.

Addresses #7450
2026-05-03 23:01:01 -03:00
Vinícius Ferrão d71c7f7ac6 Improve rspconfig SET readback and fix backupgateway SET target
On some BMCs (notably Supermicro), a GET immediately after SET
returns the old value until the BMC applies the change. This made
rspconfig output misleading for network setting operations.

- Store the canonical SET value after normalization and compare
  with the GET readback for ip, netmask, gateway, and backupgateway.
  When they differ, annotate the output:
  "BMC Gateway: 10.20.0.1 (requested 10.20.0.254, not yet reflected)"
- Consolidate ip/netmask/gateway/backupgateway display into one block
- Fix backupgateway SET: was routed through the gateway branch
  writing parameter 0x0C instead of 0x0E. Now has its own branch
  writing the correct IPMI parameter.
- ip=dhcp is unaffected (separate code path, never stores a value)

Tested on Supermicro IPMI BMC (10.20.0.51).

Fixes #3445
2026-05-03 21:01:42 -03:00
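
The readback annotation described in d71c7f7ac6 can be sketched as below. This is a Python illustration with a hypothetical function name; the output format follows the example quoted in the commit message.

```python
def describe_readback(label, requested, readback):
    """Format a setting, flagging a readback that differs from the SET value.

    requested is the normalized value that was written, or None when no
    value was stored (for example after ip=dhcp).
    """
    if requested is None or requested == readback:
        return f"{label}: {readback}"
    return f"{label}: {readback} (requested {requested}, not yet reflected)"


if __name__ == "__main__":
    print(describe_readback("BMC Gateway", "10.20.0.254", "10.20.0.1"))
```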
Markus Hilger ddd7f8da3f Merge pull request #7539 from VersatusHPC/fix/ipmi-vlan-disable
fix: IPMI VLAN disable
2026-05-03 20:10:47 +02:00
Markus Hilger 1c132aab49 Merge pull request #7538 from VersatusHPC/feat/openbmc-rspconfig-user-snmp
feat: add OpenBMC rspconfig user and alert support
2026-05-03 20:09:35 +02:00
Vinícius Ferrão 4165b26a04 fix: remove Docker container lifecycle management (dead code since 2016)
Docker container lifecycle management (mgt=docker, mkdocker, rmdocker,
lsdocker) was added in 2015-2016 as an experiment targeting Docker API
v1.22 on Ubuntu only. Documentation and man pages were deliberately
removed in 2019 (PRs #6222 and #6324) with the original developer's
approval, noting that "the interface of Docker has become very simple
right now, so there is no value for xCAT to offer such functions."

The plugin was still being shipped but has had no functional code changes
since April 2016, was never listed as a valid mgt value in Schema.pm,
and no user ever filed an issue about it.

Removed:
- xCAT-server/lib/xcat/plugins/docker.pm (1,142 lines)
- xCAT/postscripts/setupdockerhost
- xCAT-server/share/xcat/scripts/setup-dockerhost-cert.sh
- xCAT-test/autotest/testcase/dockercommand/ (test cases)
- Docker attribute definitions in Schema.pm
- Client symlinks (mkdocker, rmdocker, lsdocker)
- Usage entries and dockerhost cert handling in credentials.pm
- Docker attribute documentation in man7 pages

The "Running xCAT in Docker" documentation (dockerized_xcat/) is
retained as it documents containerizing xCAT itself, not the removed
mgt=docker feature.

Closes #7518
2026-05-03 12:11:33 -03:00
Vinícius Ferrão 2fa7fca1ad Allow rspconfig to disable VLAN on IPMI BMCs
rspconfig vlan= only accepted values 1-4096 with no way to disable
VLAN tagging. Users had to resort to raw IPMI commands to clear a
stale VLAN after ip=dhcp.

- Accept vlan=off/disable/disabled to clear VLAN tagging via
  standard IPMI parameter 0x14 with the enable bit unset
- Fix valid range from 1-4096 to 1-4094 (IEEE 802.1Q)
- Use strict digit matching to reject malformed inputs

To clear VLAN after a DHCP reset: rspconfig <node> vlan=off

Tested on Supermicro IPMI BMC (10.20.0.51).

Partially addresses #3725
2026-05-03 12:04:21 -03:00
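
The input rules described in 2fa7fca1ad can be illustrated with a small validator. This is a Python sketch with hypothetical names, not the rspconfig code; matching the keywords case-sensitively is an assumption.

```python
import re

DISABLE_KEYWORDS = ("off", "disable", "disabled")


def parse_vlan(value):
    """Return None to disable tagging, or a VLAN ID in 1-4094.

    Raises ValueError for anything else.
    """
    if value in DISABLE_KEYWORDS:
        return None
    # Strict digit match: rejects "12a", "-5", " 7" and empty strings.
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"invalid VLAN value: {value!r}")
    vlan = int(value)
    if not 1 <= vlan <= 4094:
        raise ValueError(f"VLAN ID out of range 1-4094: {vlan}")
    return vlan
```

IDs 0 and 4095 are reserved by IEEE 802.1Q, which is why the upper bound is 4094 rather than 4096.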