2. RHV TrilioVault Deployment Guide

2.1. Introduction

TrilioVault for RHV, by Trilio Data, is a native RHV service that provides policy-based, comprehensive backup and recovery for RHV workloads. The solution captures point-in-time workloads (Application, OS, Compute, Network, Configurations, Data and Metadata of an environment) as full or incremental snapshots. These snapshots can be held in a variety of storage environments, including NFS and, in future releases, AWS S3-compatible storage. With TrilioVault and its single-click recovery, organizations can improve their Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO). TrilioVault enables IT departments to fully deploy RHV solutions and provide business assurance through enhanced data retention, protection and integrity.

With TrilioVault’s VAST (Virtual Snapshot Technology), Enterprise IT and Cloud Service Providers can now deploy backup and disaster recovery as a service to prevent data loss or data corruption through point-in-time snapshots and seamless one-click recovery. TrilioVault takes a point-in-time backup of the entire workload, consisting of compute resources, network configurations and storage data, as one unit. It also takes incremental backups that capture only the changes made since the last backup. Incremental snapshots save time and storage space because each backup includes only the changes since the previous one. The benefits of using VAST for backup and restore can be summarized as follows:

  1. Efficient capture and storage of snapshots. Since full backups include only data that is committed to the storage volume and incremental backups include only the blocks changed since the last backup, the backup process is efficient and stores backup images efficiently on the backup media.
  2. Faster, more reliable recovery. When your applications grow complex and span multiple VMs and storage volumes, our efficient recovery process brings your application from zero to operational with just the click of a button.
  3. Easy migration of workloads between clouds. TrilioVault captures all the details of your application, so a migration includes your entire application stack without leaving anything to guesswork.
  4. Lower Total Cost of Ownership through policy and automation. Our role-driven backup process and automation eliminate the need for dedicated backup administrators, thereby improving your total cost of ownership.

2.2. System Requirements

TrilioVault is a software-only solution that is shipped as a VM image in the QCOW2 file format. The solution is deployed as a TrilioVault virtual machine created from the TrilioVault QCOW2 image.

The hardware requirements for the TrilioVault virtual appliance are listed below.

TrilioVault Appliance (QCOW2)
Storage: 40 GB
Memory: 24 GB
vCPUs: 4

2.3. Prerequisites

2.3.1. Version Requirement

The minimum version of both the ovirt-imageio-proxy and ovirt-imageio-daemon services required for running the Ansible playbook is 1.4.2. Deployment will fail if either service version is below 1.4.2.
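To confirm the installed versions, you can query the packages (a quick check, assuming the standard RHV package names):

On the RHV Manager:

# rpm -q ovirt-imageio-proxy

On each RHV host:

# rpm -q ovirt-imageio-daemon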

The Red Hat Virtualization Hypervisor (RHVH) and Red Hat Virtualization Manager (RHV) version supported by TrilioVault is 4.2.x. Support for 4.3.x will be provided in future releases.

2.3.2. Installing Redis

Trilio uses the Python Celery service for transferring disk(s). Celery requires Redis for publishing and subscribing to the messages that manage those transfers. Hence, the Redis service has to be installed and enabled on Red Hat Virtualization.

For this distribution, install the EPEL repository:

# yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Once the EPEL installation has finished, you can install Redis, again using yum:

# yum install redis

This may take a few minutes to complete. After the installation finishes, start the Redis service:

# systemctl start redis.service

If you’d like Redis to start on boot, you can enable it with the enable command:

# systemctl enable redis

You can check Redis’s status by running the following:

# systemctl status redis.service
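To verify that Redis is accepting connections, you can also ping it with the bundled client; the expected reply is PONG:

# redis-cli ping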

2.3.3. Importing Certificates

We need to import certificates to enable disk upload through the UI. Follow the steps below to do so:

  1. For image transfer (upload) to work correctly from the browser, we need the CA certificates of the manager and of the imageio-proxy. The following is the path of the imageio-proxy certificate on the RHV Manager:
/etc/pki/ovirt-engine/certs/imageio-proxy.cer

Note

The above path can be obtained from the following file on the RHV Manager: /etc/ovirt-imageio-proxy/ovirt-imageio-proxy.conf

  2. imageio-proxy.cer needs to be manually copied to the client machine from which the RHV Manager will be accessed (an example copy command is shown after this list). Once copied, double-click the certificate to install it.
  3. Click “Install Certificate…” > select “Local Machine” > select “Place all certificates in the following store”, then click “Browse” > select “Trusted Root Certification Authorities”, then click OK > click Next and then “Finish”.
  4. Once the certificate is installed, add the RHV Manager’s entry to the hosts file of the client machine from which the Manager will be accessed. Similarly, add an entry for the TrilioVault VM FQDN (example entries are shown after this list).
  5. To import the remaining certificate, go to Storage > Disks > click “Upload” > click “Start” and then click “Test Connections”. An alert in red is displayed asking you to install one more certificate (pki.cer); it can be downloaded from the link given in the alert. Follow the same procedure to install this certificate.
  6. To verify successful certificate installation, go to Storage > Disks > click “Upload” > click “Start” and then click “Test Connections”. An alert in green should be displayed.
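For reference, a minimal example of copying the certificate from the RHV Manager to a Linux client, followed by example hosts file entries on the client machine (C:\Windows\System32\drivers\etc\hosts on Windows, /etc/hosts on Linux). The IPs and FQDNs are placeholders for your environment:

# scp root@<RHV-Manager_FQDN>:/etc/pki/ovirt-engine/certs/imageio-proxy.cer .

<RHV-Manager_IP>   <RHV-Manager_FQDN>
<TVM_IP>           <TVM_FQDN>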

2.3.4. Adding an exception

Once the TVM is deployed, we need to add an exception to view the contents of the new “Backup” tab. Follow the steps below to do so:

Note

If certificates are properly imported and applied to TVM, there will be no need to add any exception to access the page.

  1. Right-click anywhere on the web page, select “Inspect Element” (Mozilla Firefox) or “Inspect” (Chrome), and go to the “Network” tab.
  2. Click on the Backup tab; you will see an entry with an error (in red). Click on it and copy the “Requested URL”.
  3. Manually add a security exception for that URL. The workload page contents will be visible after this.

3. TVM Deployment Steps

3.1. Uploading TrilioVault disk

TrilioVault deployment is done by attaching the pre-formatted CentOS 7 QCOW2 disk to a VM with the specifications mentioned above.

  1. Click on “Storage → Disks”

image1

  2. Click on “Upload” and then click on “Start”, choose the QCOW2 image, and enter all the highlighted details.

Note

QCOW2/RAW uploads are supported over NFS and SAN storage domains. Refer to the Red Hat Virtualization Administration Guide for details.

image2

3.2. TrilioVault VM Creation

  1. To create a VM and attach the disk we just uploaded, click on “Compute → Virtual Machines → New”.

image3

  2. Click on “Attach” as highlighted and select the uploaded disk.

image4

  3. Select the appropriate NIC for VM traffic:

image5

  4. Now click on the “System” tab for VM configuration (Memory, vCPUs, etc.).

image6

  5. Once this is done, click on “OK”.

Note

Though the recommended memory size is 24 GB, 8 GB works just fine and can be expanded when required.

  6. After the VM is successfully created and is up and running, open its console for network configuration.

3.3. TVM Network Configuration

  1. Open the TVM console and open the network configuration file:

vi /etc/sysconfig/network-scripts/ifcfg-eth0
  2. Edit or enter the following details:

Example:

BOOTPROTO=static
DEVICE=eth0
HWADDR=00:1a:4a:16:01:01
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
IPADDR=149.56.121.111
NETMASK=255.255.255.224
GATEWAY=149.56.121.126
  3. After saving the file, restart the network service:
service network restart

and verify that the IP is reachable by pinging it from another machine on the network (see the check below).
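To confirm the interface picked up the new settings, you can also check the assigned address on the TVM:

ip addr show eth0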

3.4. TVM Configuration

  1. Open the configurator in a browser using the newly assigned IP; the default credentials are admin | password. You will be asked to reset the password. Enter all the relevant details, including the license, and click Submit once the validation is successful (press TAB after every entry to validate the field).

image7

  2. Configuration in progress:

image8

  3. Once the configuration is done, you can click on “Click here”, which will redirect you to the TVM portal at the virtual IP.

image9

  4. You can also update the license later; to do so, click on “TrilioVault License”, browse and select the license to update, and click Submit.

image10

  5. This concludes the configuration of the TVM. Next, we will proceed with the TVM installation on RHV.
  6. Once configuration is done, we can import previous workloads onto the new TVM along with all their previous snapshots and schedules. To do so, click on the “Import Workloads” tab and perform the import:

image11

3.5. Installation

We have to configure the imageio services for the manager and hosts in order to integrate the TVM with the RHV setup. There are two installation methods: installation using a password, and installation using passwordless SSH keys.

Note

The following operations need to be performed on the TVM.

3.5.1. Installation using Password

  1. Enter the hosts’ details in the file /opt/stack/imageio-ansible/inventories/production/daemon:
<RHV-host1_IP> ansible_user=root password=xxxxx

<RHV-host2_IP> ansible_user=root password=xxxxx

.

.

<RHV-hostn_IP> ansible_user=root password=xxxxx
  2. Enter the manager’s details in the file /opt/stack/imageio-ansible/inventories/production/proxy:
<RHV-Manager_IP> ansible_user=root password=xxxxx
  3. Then go to the following location on the TVM and run the appropriate playbook:
cd /opt/stack/imageio-ansible
  • For Manager run:
ansible-playbook site.yml -i inventories/production/proxy --tags proxy
  • For Hosts run:
ansible-playbook site.yml -i inventories/production/daemon --tags daemon
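After the playbooks complete, one way to confirm that the imageio services are running (standard RHV 4.2 service names; adjust if your environment differs):

On the RHV Manager:

systemctl status ovirt-imageio-proxy

On each RHV host:

systemctl status ovirt-imageio-daemon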

3.5.2. Installation using passwordless ssh keys

  1. Enter the hosts’ details in the file /opt/stack/imageio-ansible/inventories/production/daemon:
<RHV-host1_IP> ansible_user=root

<RHV-host2_IP> ansible_user=root

.

.

<RHV-hostn_IP> ansible_user=root
  2. Enter the manager’s details in the file /opt/stack/imageio-ansible/inventories/production/proxy:
<RHV-Manager_IP> ansible_user=root
  3. Check whether the TVM has private and public keys generated under /root/.ssh/.
    • If yes, copy the public key and add it to the /root/.ssh/authorized_keys file of the daemon/proxy hosts on which the playbooks are to be executed.
    • If no, generate keys on the TVM using the ssh-keygen command, passing an appropriate passphrase to generate the keys. Then copy the public key and add it to the /root/.ssh/authorized_keys file of the hosts (daemon) and manager (proxy) on which you want to run the playbooks. A minimal example is shown after this procedure.
  4. Then go to the following location on the TVM and run the appropriate playbook:
cd /opt/stack/imageio-ansible
  • For Manager run:
ansible-playbook site.yml -i inventories/production/proxy --private-key ~/.ssh/id_rsa --tags proxy
  • For Hosts run:
ansible-playbook site.yml -i inventories/production/daemon --private-key ~/.ssh/id_rsa --tags daemon
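For reference, a minimal sketch of generating a key pair on the TVM and distributing the public key with ssh-copy-id (assuming ssh-copy-id is available; the public key can also be appended to authorized_keys manually):

ssh-keygen -t rsa
ssh-copy-id root@<RHV-host1_IP>
ssh-copy-id root@<RHV-Manager_IP>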

3.5.3. Post configuration health check

Log in to the TVM as root and run pcs status as shown below.

[root@om_tvm imageio-ansible]# pcs status
Cluster name: triliovault

WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: om_tvm (version 1.1.19-8.el7_6.1-c3c624ea3d) - partition with quorum
Last updated: Wed Dec 5 12:25:02 2018
Last change: Wed Dec 5 09:20:08 2018 by root via cibadmin on om_tvm
1 node configured
4 resources configured

Online: [ om_tvm ]
Full list of resources:
virtual_ip (ocf::heartbeat:IPaddr2): Started om_tvm
wlm-api (systemd:wlm-api): Started om_tvm
wlm-scheduler (systemd:wlm-scheduler): Started om_tvm
Clone Set: lb_nginx-clone [lb_nginx]
Started: [ om_tvm ]
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
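If any resource is reported as stopped, the underlying systemd units can be inspected directly using the service names shown in the resource list above:

systemctl status wlm-api
systemctl status wlm-scheduler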

4. Backup and Recovery

4.1. Workload creation

  1. Once the TVM is successfully configured and installed, we will find a new tab on the RHV Manager named “Backup”.

image12

  2. Under this tab, we can create a workload, which is a backup profile; the profile defines the workload name, workload members, workload execution schedule, retention policy and full backup interval.
  3. Click on the “Create Workload” button and, inside the “Details” tab, give the workload a relevant name.

image13

  4. Click on the “Workload Members” tab to add VMs to the workload.

image14

  5. Click on the “Schedule” tab to set the schedule for snapshot execution.

image15

  6. The workload policy defines various aspects of the backup process, including the number of backups to retain, the frequency at which backups are taken, and the interval between full backups. Click on the “Policy” tab to set the appropriate retention type and full backup interval:

image16

  7. Once the workload schedule is in action, it looks like the following:

image17

4.2. Snapshots

  1. We can also trigger a manual snapshot as follows: click on the workload created, then click on the “Snapshots” tab and click “Create snapshot”. Give the snapshot a name, select whether you want to take a full or an incremental snapshot, and click “Create”.

image18

  2. A snapshot can also be created from the main page, as shown below; the rest of the steps are the same:

image19

4.3. Snapshot Restore

There are two restore types:
  • Selective restore
  • One Click restore

4.3.1. Selective Restore

  1. Click on the workload; at the right you will find a “One Click Restore” button and a down arrow. Click on the down arrow to find the “Selective Restore” option:

image20

  2. In the Details tab, fill out the relevant details; most importantly, select the appropriate NIC and storage types in the Networks and Storage sections.

Note

If we select “DCA”, the VM can be restored only into Datacenter A; the same applies to Datacenter B.

image21

  3. In the VM Instances tab, select the VM that needs to be restored, fill in all the relevant details such as New VM Name, Instance Type (flavor), Data Center and Cluster (DCA is selected here) as highlighted, and click “Restore”. The VM with the new VM name will be visible under Compute → Virtual Machines.

image22

4.3.2. One Click Restore

Note

One Click Restore only works when the original VMs have been deleted.

First delete the original VM(s) that were added to the workload as members.

  1. Click on the workload, then the “Snapshots” tab, and click on the “One Click Restore” button.

image23

  2. Enter the restore job name and description, and then click “Create”; the restore will be initiated.

Note

There is also another way to initiate a Selective Restore as well as a One Click Restore: click on the workload → “Snapshots” tab → click on the latest snapshot, or any snapshot you want to restore from → click on the “Restores” tab, then finally click on the “Selective Restore” or “One Click Restore” button.

image24

4.5. Global Job Scheduler

Backup schedules of all the workloads can be enabled/disabled using this feature.

If disabled, no workload, even one with its local scheduler enabled, will trigger a scheduled backup.

Once re-enabled, all workloads with a previously enabled local scheduler will trigger backups as per their scheduler settings (backup intervals).

image28

image29

5. Troubleshooting and Logs

5.1. TVM Logs

  1. The /var/log/workloads directory contains all the logs of operations performed by the TVM.
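For example, a quick way to search the TVM logs for recent errors (an illustrative command, not a required step):

grep -ri error /var/log/workloads/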

5.2. RHV Manager Logs

  1. /var/log/ovirt-engine/ contains useful logs; engine.log records all the operations performed on/by ovirt-engine, i.e. the RHV Manager.
  2. Other useful logs are ui.log, console.log and boot.log.
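For example, to follow engine activity in real time while reproducing an issue (an illustrative command, not a required step):

tail -f /var/log/ovirt-engine/engine.log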

5.3. RHV Host Logs

  1. /var/log/vdsm contains logs of the operations performed on virtual machines hosted on a particular host. When troubleshooting storage-related issues such as cold merge, the vdsm logs of the SPM host should be checked (applicable when more than one host exists in a data center).
  2. /var/log/ovirt-imageio-daemon contains information on the tasks being performed by the imageio-daemon service.
  3. /var/log/ovirt_celery/ contains information about the disk transfers performed by the TVM.