At present Zaza will wait until it hits a timeout if one of the
units enters an error state while it awaits an idle model.
Add check for errored units to ``block_until_all_units_idle()``
and ``wait_for_application_states()``.
Fixes #100
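
A minimal sketch of the kind of check described above, assuming unit
status is available as a plain mapping of unit name to status string.
``UnitError`` and ``check_for_errored_units`` are hypothetical names used
for illustration only and are not taken from Zaza's API.

```python
class UnitError(Exception):
    """Raised when one or more units are in an error state (hypothetical)."""


def check_for_errored_units(unit_states):
    """Raise UnitError if any unit reports the 'error' state.

    unit_states is assumed to be a dict of unit name -> status string.
    """
    errored = sorted(name for name, state in unit_states.items()
                     if state == "error")
    if errored:
        raise UnitError("Units in error state: " + ", ".join(errored))
```

Calling such a check on every polling iteration lets the wait fail as
soon as a unit errors, rather than only when the timeout expires.
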
The call to the status action in the ``wait_for_mirror_state`` helper
is susceptible to Ceph bug LP: #1820976.
Encapsulate it in a try/except clause to work around the bug.
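
A minimal sketch of the workaround pattern, assuming the bug shows up as
action output that cannot be parsed as JSON. That failure mode and the
name ``parse_status_output`` are assumptions made for illustration; the
exception actually caught in the helper may differ.

```python
import json


def parse_status_output(output):
    """Return the decoded status, or None if the output is not valid JSON."""
    try:
        return json.loads(output)
    except json.JSONDecodeError:
        # Assumed failure mode; the caller can simply poll again.
        return None
```
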
When you tell Ceph to resync mirrored RBD images it will in
practice remove and re-create the image.
At present the image state wait helper will happily accept no
images in a pool as a positive outcome.
Add optional ``require_images_in`` parameter that allows the wait
helper to block even when no images are available in the pool (yet).
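
A minimal sketch of how such a parameter can change the outcome,
assuming image state is available as a mapping of pool name to
``{image name: state}``. The function name and the data shape are
hypothetical; only ``require_images_in`` comes from the change itself.

```python
def images_in_expected_state(pool_images, wanted_state,
                             require_images_in=None):
    """Return True when every image is in wanted_state.

    pool_images is assumed to map pool name -> {image name: state}.
    Pools listed in require_images_in must contain at least one image.
    """
    for pool in (require_images_in or []):
        if not pool_images.get(pool):
            return False
    return all(state == wanted_state
               for images in pool_images.values()
               for state in images.values())
```

Without ``require_images_in`` an empty pool passes trivially, which is
the behaviour that made the helper return while a resync was still
re-creating the image.
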
Update installation sequence to fit the Travis Xenial environment.
This should save us some time on binary deps as we no longer need
to install snapd and kernel packages mandated by spectre/meltdown
in the Travis ``trusty`` image.
At present the test uses a dict which includes the internal
numeric ID each pool has. This may be different on each side
depending on which order the pools were created in.
Use pool name in a sorted list for comparison instead.
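
A minimal sketch of the comparison, assuming each side reports its pools
as a mapping of numeric ID to pool name. The data shape, the pool names
and ``pool_names`` are illustrative assumptions.

```python
def pool_names(pools):
    """Return a sorted list of pool names from a {numeric id: name} dict."""
    return sorted(pools.values())


# Same pools on both sides, created in a different order.
site_a = {1: "glance", 2: "cinder-ceph"}
site_b = {1: "cinder-ceph", 2: "glance"}

dicts_match = site_a == site_b                        # IDs differ
names_match = pool_names(site_a) == pool_names(site_b)
```
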
Sometimes the slave RGW takes a little time to re-sync periods
and data from the new master RGW after a promotion.
Increase the number of attempts to retrieve the data written
to the master RGW from the slave RGW units.
* Expose resource_reaches_status' retry options
resource_reaches_status is used in many places to monitor many
different kinds of OpenStack resource. It is reasonable to expect a
flavor to be created in a few seconds, whereas an image-based instance
may take minutes to become active. With that in mind this change
exposes the retry options used by tenacity allowing the caller to
set reasonable expectations for the resource state change.
In addition, the image-based instance creation call is updated
to retry for longer as this has been timing out.
* Remove swap files
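
The idea of exposing retry options to the caller can be sketched as
follows. This is a standard-library stand-in rather than the
tenacity-based helper itself; ``wait_for_status``, ``max_attempts`` and
``wait_seconds`` are hypothetical names.

```python
import time


def wait_for_status(get_status, expected_status="ACTIVE",
                    max_attempts=5, wait_seconds=1):
    """Poll get_status() until it returns expected_status.

    Returns the number of attempts used, or raises TimeoutError once
    max_attempts polls have been made without success.
    """
    for attempt in range(1, max_attempts + 1):
        if get_status() == expected_status:
            return attempt
        if attempt < max_attempts:
            time.sleep(wait_seconds)
    raise TimeoutError("resource did not reach status " + expected_status)
```

A caller waiting on a flavor could keep the defaults, while a caller
waiting on an image-based instance could pass a larger ``max_attempts``.
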
Adapt existing CephRGW test case to deal with multisite
deployment testing.
Validate services across both standard ceph-radosgw and
slave-ceph-radosgw applications if discovered.
Upload test data to source site, validate replication and
delete on target site.
Switch masakari tests to use pacemaker-remote. For this the method
of simulating a host failure has to change to sending the process a
SIGTERM, as shutting down pacemaker-remote any other way is seen as an
orderly shutdown and does not trigger things like stonith.
Also add a check to ensure all nodes appear to be online from a
crm point of view before starting the failover test.
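
A local illustration of terminating a process by signal, assuming a
POSIX system. The child process is a stand-in; in the tests the signal
goes to pacemaker-remote on a remote unit, and how it is delivered there
is not shown here.

```python
import signal
import subprocess
import sys

# Stand-in child process that would otherwise sleep for a minute.
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"])

# Send SIGTERM directly instead of asking the process to stop cleanly.
proc.send_signal(signal.SIGTERM)
exit_code = proc.wait(timeout=10)

# On POSIX a negative return code is the number of the terminating signal.
killed_by_sigterm = exit_code == -signal.SIGTERM
```
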
When running the file ready function each unit would download into the
same tmp directory. For files with restrictive permissions this would
lead to a permission denied error and a false negative on the second
unit, e.g. the file is 444 and the subsequent scp fails because it
cannot overwrite the first file.
Use a separate tmp dir for each unit to avoid this problem.
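
A minimal sketch of the per-unit directory approach. The name
``unit_download_path`` and the directory naming scheme are assumptions
for illustration.

```python
import os
import tempfile


def unit_download_path(base_dir, unit_name, remote_path):
    """Return a local path inside a fresh directory for unit_name."""
    # Unit names contain '/', which is not usable in a directory prefix.
    prefix = unit_name.replace("/", "-") + "-"
    unit_dir = tempfile.mkdtemp(prefix=prefix, dir=base_dir)
    return os.path.join(unit_dir, os.path.basename(remote_path))
```

Because every unit gets its own directory, a read-only copy left behind
by one unit never blocks the copy from the next.
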
Centralize the check for whether the certificates relation to vault
exists. If it does, update authentication information to include the CA
certificate and https as the protocol.
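
A minimal sketch of a centralized update, assuming authentication
information is held in a dict. The key names ``protocol`` and ``cacert``
and the function ``apply_tls_settings`` are hypothetical, and
``ca_cert_path`` is assumed to be ``None`` when the relation is absent.

```python
def apply_tls_settings(auth_settings, ca_cert_path=None):
    """Return a copy of auth_settings, switched to https if a CA is given."""
    settings = dict(auth_settings)
    if ca_cert_path:
        settings["cacert"] = ca_cert_path
        settings["protocol"] = "https"
    return settings
```
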