mirror of https://github.com/xcat2/confluent.git synced 2026-02-15 12:18:59 +00:00
Commit Graph

202 Commits

Author SHA1 Message Date
Jarrod Johnson
68097428a5 Modernize asyncio invocation in main confluent runtime 2026-01-21 16:47:17 -05:00
Jarrod Johnson
d89305ca42 Merge branch 'master' into async
Try to merge in 2025 work into async
2026-01-20 14:24:01 -05:00
Jarrod Johnson
c567bfbd17 Add sysctl tune check to selfcheck
Apart from the indirect gc_thresh check, perform other checks.

For now, just highlight that disabling tcp_sack can seriously
disrupt BMC connections. Since the management node may have a high-speed
link while the BMC sits behind a 100 Mbit link, SACK is needed to
overcome the massive loss and induce TCP to rate limit appropriately.
2025-09-02 08:53:55 -04:00
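A check of this kind could be sketched as follows. This is only an illustration, not the code from the commit: the helper names `sack_enabled` and `check_tcp_sack` are hypothetical, and the path assumes a Linux host where the setting is exposed under /proc/sys.

```python
from pathlib import Path
from typing import Optional

# Assumed Linux location of the setting (same value as `sysctl net.ipv4.tcp_sack`).
TCP_SACK_PATH = Path("/proc/sys/net/ipv4/tcp_sack")


def sack_enabled(raw: str) -> bool:
    """Interpret the text of the tcp_sack sysctl: '1' is enabled, '0' is disabled."""
    return raw.strip() == "1"


def check_tcp_sack(path: Path = TCP_SACK_PATH) -> Optional[str]:
    """Return a warning if SACK is disabled, else None (hypothetical helper)."""
    try:
        raw = path.read_text()
    except OSError:
        # Not Linux, or the value is unreadable: nothing to report.
        return None
    if sack_enabled(raw):
        return None
    return "net.ipv4.tcp_sack is disabled; BMC connections over slow links may suffer"


if __name__ == "__main__":
    warning = check_tcp_sack()
    if warning:
        print(warning)
```

Separating the parsing from the file read keeps the decision testable on hosts where the sysctl file does not exist.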
Jarrod Johnson
6be98c7e60 Fix leaking ssh-agent processes in selfcheck 2025-08-26 08:44:42 -04:00
Jarrod Johnson
05dbbd6ce0 Explicitly check root user keys
Replace simple existence check
with a check that ensures the content also matches.
2025-06-25 16:10:26 -04:00
Jarrod Johnson
94dc266cd4 Add neighbor overflow check to confluent_selfcheck
A common issue in larger layer 2 configurations is
for the neighbor table to be undersized for the number of
nodes.

Detect this manifesting and present a message.
2025-05-22 13:57:16 -04:00
Jarrod Johnson
ca3a53fde4 Provide specific guidance for bad ssh key permissions 2025-05-06 09:51:11 -04:00
Markus Hilger
e5b1b5d3a0 Implement YAML support for confluentdbutil (fixes #152) 2025-03-05 17:42:31 +01:00
Jarrod Johnson
52497d7d95 Broaden except clause on automation check
For whatever reason, we can't seem to specifically catch
the CalledProcessError and have to resort to generic Exception.
2025-02-06 10:44:59 -05:00
Jarrod Johnson
b46aecbeed Fix PXE after merge and have rebase work with async 2024-08-25 18:40:26 -04:00
Jarrod Johnson
fe0a15faf2 Merge branch 'master' into async 2024-08-22 08:42:37 -04:00
Markus Hilger
e735a12b3a Fix small typo 2024-08-22 12:38:52 +02:00
Jarrod Johnson
2f415caead Fix osdeploy updateboot with asyncio 2024-08-16 17:06:16 -04:00
Jarrod Johnson
5eaf998391 Remove greenlet, and change 'confluent' to asyncio 2024-08-16 14:36:59 -04:00
Jarrod Johnson
2cc61a1810 Merge branch 'master' into async 2024-08-14 16:26:55 -04:00
Jarrod Johnson
28b88bdb12 Add reporting of skipped nodes in a 'skip' merge 2024-08-14 11:40:11 -04:00
Jarrod Johnson
29d0e90487 Implement confluentdbutil 'merge'
For now, implement 'skip', where conflicting nodes/groups are
ignored in new input.
2024-08-14 11:26:51 -04:00
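The 'skip' behavior described here can be pictured with a small sketch. It assumes nodes or groups are held as a name-to-attributes mapping, which is an illustrative simplification and not confluentdbutil's actual data model; `merge_skip` is a hypothetical name.

```python
def merge_skip(existing, incoming):
    """Merge incoming entries into a copy of existing, skipping name conflicts.

    Returns (merged, skipped), where skipped lists the conflicting names
    so they can be reported to the user.
    """
    merged = dict(existing)
    skipped = []
    for name, attrs in incoming.items():
        if name in merged:
            # Conflict: keep the existing definition and ignore the new input.
            skipped.append(name)
            continue
        merged[name] = attrs
    return merged, skipped


if __name__ == "__main__":
    current = {"n1": {"bmc": "10.0.0.1"}}
    new_input = {"n1": {"bmc": "10.0.0.9"}, "n2": {"bmc": "10.0.0.2"}}
    merged, skipped = merge_skip(current, new_input)
    print(sorted(merged), skipped)
```

Returning the skipped names alongside the result lets a caller report exactly which entries were ignored.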
Jarrod Johnson
c754dc2641 Merge branch 'master' into async 2024-08-08 09:45:15 -04:00
Jarrod Johnson
8f1a1130a8 Add a selfcheck to check misdone collective manager 2024-07-24 15:55:04 -04:00
Jarrod Johnson
b6a0250e5c Advance state of asyncio
Add a mechanism to close a session the right way
in tlvdata

Fix confluentdbutil/configmanager to restore/dump db to directory

Move auth to asyncio away from eventlet

Fix some issues with httpapi, enable reading body via aiohttp

Fix health from ipmi plugin

Fix user creation across a collective.
2024-06-13 16:32:02 -04:00
Jarrod Johnson
bdb7f064d6 Rework a number of subprocess calls and osdeploy
Some subprocess calls were reworked to use asyncio friendly
variants.

Also, osdeploy initialize was checked, and its ssh and tls
handling was reworked.

osdeploy import was also reworked to function with async only.
2024-05-31 17:22:26 -04:00
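As a general illustration of an asyncio-friendly subprocess call (not the code from this commit), the standard library's `asyncio.create_subprocess_exec` can replace a blocking `subprocess` invocation. The helper name `run_command` is hypothetical.

```python
import asyncio
import sys


async def run_command(*argv):
    """Run a command without blocking the event loop.

    Returns (returncode, stdout, stderr) with the outputs as bytes.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() reads both pipes to EOF and waits for the process to exit.
    stdout, stderr = await proc.communicate()
    return proc.returncode, stdout, stderr


if __name__ == "__main__":
    rc, out, _ = asyncio.run(run_command(sys.executable, "-c", "print('ok')"))
    print(rc, out.decode().strip())
```

While the child runs, other coroutines on the same loop keep making progress, which is the main reason to prefer this over a blocking call inside async code.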
Jarrod Johnson
c5405f832c Advance state of async shellserver
Can successfully run ssh sessions through
confluent with async now
2024-05-29 20:18:07 -04:00
Jarrod Johnson
207cc3471e Fix closing sockets in various contexts
With asyncio, we must close the writer half of a pair

Also rework the get_next_msg to work better.

Still need to allow stop_following to interrupt get_next_msg
2024-05-16 15:40:43 -04:00
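The point about closing the writer half can be illustrated with plain asyncio streams. This is a minimal sketch against a throwaway local echo server, not confluent's tlvdata code; `handle_echo` and `roundtrip` are hypothetical names.

```python
import asyncio


async def handle_echo(reader, writer):
    """Echo one line back, then close our writer so the socket is released."""
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def roundtrip(message: bytes) -> bytes:
    # Port 0 asks the OS for any free port; the server exists only for this demo.
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    try:
        writer.write(message)
        await writer.drain()
        return await reader.readline()
    finally:
        # StreamReader has no close(); closing the writer closes the connection.
        writer.close()
        await writer.wait_closed()
        server.close()
        await server.wait_closed()


if __name__ == "__main__":
    print(asyncio.run(roundtrip(b"ping\n")))
```

Awaiting `wait_closed()` after `close()` gives the transport a chance to flush and tear down before the coroutine returns.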
Jarrod Johnson
5e222041bf Merge branch 'master' into async 2024-05-03 10:27:31 -04:00
Jarrod Johnson
b7a5101a34 Provide extra warning about redoing SSH materials 2024-05-03 10:27:01 -04:00
Jarrod Johnson
ee6f869cea Port utilities to asyncio, selfcheck and osdeploy
confluent_selfcheck removes eventlet dependency.

osdeploy reworked to use async methods to work with new client.
2024-04-30 14:30:01 -04:00
Jarrod Johnson
e8110551db Port some of the collective management to asyncio 2024-04-15 17:19:27 -04:00
Jarrod Johnson
198ffb8be6 Advance asyncio port
Purge sockapi of remaining eventlet call

Extend asyncio into the credserver to finish out sockapi.

Have client and sockapi complete TLS connection including password checking

Fix confetty ability to 'create'.
2024-04-01 16:38:10 -04:00
Jarrod Johnson
1fbaee6149 Further move toward asyncio and reduce PyOpenSSL dep
Since we are rebasing to at least Python 3.6, and with
some extra ctypes wrangling of the ssl context, we can likely
remove PyOpenSSL. Take first steps by removing it from 'sockapi'.

Have confluent executable become the 'top level' for eventlet, to allow
work on 'de-eventleting' on 'main.py'.

Rework tlvdata to deal with either a socket or a reader, writer tuple.
Using TLS with asyncio is easiest with the 'open_connection'
semantics, which force either a Protocol handler (callback based) or
dual streams. While the protocol approach ends with a more socket-like
'transport', the 'protocol' half is a bit unwieldy, so use reader and
writer streams instead.
2024-03-29 16:23:45 -04:00
Jarrod Johnson
1d4505ff3c SSH test by IP, to reflect actual usage and catch issues
One issue is modified ssh_known_hosts wildcard customization
failing to cover IP address.
2024-03-14 11:21:41 -04:00
Jarrod Johnson
4ca82948ba SSH test by IP, to reflect actual usage and catch issues
One issue is modified ssh_known_hosts wildcard customization
failing to cover IP address.
2024-03-14 11:20:36 -04:00
Jarrod Johnson
399c1467c1 Remove redundant kill on the agent pid
Extraneous kill on the agent pid is removed.
2024-03-14 10:53:13 -04:00
Jarrod Johnson
876b59c1f0 Remove redundant kill on the agent pid
Extraneous kill on the agent pid is removed.
2024-03-14 10:52:52 -04:00
Jarrod Johnson
58d9bc1816 Updates to confluent_selfcheck
Reap ssh-agent to avoid stale agents lying around.

Remove nuisance warnings about virbr0 when present.

Do a full runthrough as the confluent user to ssh to a node when user
requests with '-a', marking known_hosts and automation key issues.
2024-03-14 10:50:26 -04:00
Jarrod Johnson
dcb6a1c759 Updates to confluent_selfcheck
Reap ssh-agent to avoid stale agents lying around.

Remove nuisance warnings about virbr0 when present.

Do a full runthrough as the confluent user to ssh to a node when user
requests with '-a', marking known_hosts and automation key issues.
2024-03-14 10:50:01 -04:00
Jarrod Johnson
cdefb400f9 Expose fingerprinting and better error handling to osdeploy
This allows custom name and pre-import checking.
2024-03-11 13:33:15 -04:00
Jarrod Johnson
4f92e3413a Expose fingerprinting and better error handling to osdeploy
This allows custom name and pre-import checking.
2024-03-11 13:32:45 -04:00
Jarrod Johnson
d6bff637db Commence work on async 2024-02-23 11:56:07 -05:00
Jarrod Johnson
c9452e65e8 Fix some osdeploy ordering issues
osdeploy initialization dependencies have been
improved and marked if absolutely dependent.
2023-11-15 11:30:20 -05:00
Jarrod Johnson
f475d58955 Various permission fixes for osdeploy initialize
Fix a few scenarios where certain ordering of
initialize creates unworkable permissions.
2023-11-13 15:43:11 -05:00
Jarrod Johnson
d082610678 Add more deep checking of node networking
Whether due to the management node or node IP addresses,
check if deployment can reasonably proceed using IPv4 or IPv6,
and give a warning with some suggestions to check.

Also, add nodeinventory <node> -s as an example resolution for missing
uuid.
2023-10-27 13:34:52 -04:00
Jarrod Johnson
5d1315098f Enhance and extend check of node relations 2023-05-25 11:14:58 -04:00
Jarrod Johnson
b9d0da0416 Correct mistake in the gathering of valid nodenames 2023-04-26 15:37:08 -04:00
erderial
9bb402a1b8 Update confluent_selfcheck 2023-04-03 10:27:07 +03:00
erderial
13d4c57ee2 changes done as per request 2023-03-31 19:32:43 +03:00
erderial
88c47c9254 added functionality to check for net.*switch
2023-03-31 16:43:15 +03:00
Jarrod Johnson
baa365fcac Implement non-voting collective members
Provide for applications where only a small subset of collective
members should be considered to count toward whether the collective
can proceed.

Commonly, 'service' nodes may be numerous to do work, but may all want
to go offline during a maintenance window.
2023-03-06 11:56:15 -05:00
Jarrod Johnson
5ea214a726 Use eventlet subprocess
sshutil uses eventlet subprocess, making CalledProcessError
hard to catch.

Adjust to consistently use the same subprocess module.
2023-02-22 16:34:13 -05:00
Jarrod Johnson
fcde113e08 Add a check of dns.domain to selfcheck for node 2023-02-08 14:45:16 -05:00
Jarrod Johnson
1777223232 Fixes for osdeploy arm ipxe init 2023-01-27 08:40:31 -05:00