Disaster Recovery Checklist for Alation
Business Function
Alation is a data catalog application that supports collaboration, data analysis, and data governance. It links to legacy Oracle and Redshift data warehousing backends.
There are two environments:
- UAT
- Prod
Scope
The scope of this document is the prod Alation application running in the ACE AWS Account.
| Environment | Instance | IP | URL |
|---|---|---|---|
| Prod | ecd-alation-prod | 10.158.21.158 | https://ecd-alation.gene.com/ |
Key Information
- The latest Alation backups are required.
- Additional support documentation for restore: https://docs2.alationdata.com/en/latest/installconfig/BackupandRestore/BackupRestore46.html
- Backups are stored locally in the /backup/backup folder.
- We receive daily backups from Alation instances through AWS Backup. You can find more information at: Alation-EC2-Daily-Backup
Previously, backups were copied directly to the S3 bucket gred-alation-backups, but the last backup in this bucket is from December 15, 2022. For new backups, refer to Alation-EC2-Daily-Backup in AWS Backup.
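Because backup file names begin with a timestamp, the newest local backup can be picked by name. The following is a minimal sketch, assuming file names look like the example 202211040400_9-2-9-155519_alation_backup.tar.gz. The helper latest_backup is hypothetical and is not an Alation tool.

```shell
# Hypothetical helper: print the newest Alation backup in a directory.
# Assumption: file names start with a fixed-width numeric timestamp, so
# the shell's sorted glob expansion lists the newest file last.
latest_backup() {
    local dir="${1:-/backup/backup}"
    local newest=""
    local f
    for f in "$dir"/*_alation_backup.tar.gz; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            newest="$f"
        fi
    done
    if [ -z "$newest" ]; then
        echo "no backup found in $dir" >&2
        return 1
    fi
    printf '%s\n' "$newest"
}

# Example usage (not executed here):
#   latest_backup /backup/backup
```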
Recovery Checklist & Dependencies
Disclaimer: The procedures outlined in this document have not been reviewed or updated for an extended period. Do not execute them verbatim without cross-checking them against the most recent operational practices and protocols.
The following tables provide the step-by-step recovery checklist that the ACE Infra team can use to guide the recovery of Alation. Tasks to verify the functionality of the recovered service are included.
| Status | Task Summary | Task Details |
|---|---|---|
| ☐ | Log into AWS | Log into the AWS Console using this link |
| ☐ | Go to S3 Service | Search for the S3 service in the upper search bar and then search for bucket name: gred-alation-backups |
| ☐ | Find the folder with current backups | Within the gred-alation-backups bucket, search for the folder with the current <4 digit year>-<2 digit month> |
| ☐ | Find latest backup | Verify that the folder has the most current backup, from either the current date or the day before. The file name should be formatted <timestamp>_<release version>_alation_backup.tar.gz. Example: 202211040400_9-2-9-155519_alation_backup.tar.gz |
| ☐ | Log into the Alation prod server | Log in with ssh to ecd-alation.gene.com |
| ☐ | Verify Alation server has current backups | Go to the directory /backup/backup/ and verify that there is a current backup, from either the current date or the day before |
| ☐ | Make a restore folder | Switch to elevated root privileges and create the folder /backup/restore. Run the command sudo -i, then run mkdir /backup/restore |
| ☐ | Copy the latest backup to restore folder | Copy the latest backup file from /backup/backup/ to the /backup/restore folder. Run the command sudo cp /backup/backup/backup_file.tar.gz /backup/restore/ |
| ☐ | Enter a Screen session | Run the command screen -S alation-restore |
| ☐ | Enter the Alation shell | Run the command sudo /etc/init.d/alation shell |
| ☐ | Stop the alation services | Run the command stop_alation |
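| ☐ | Change the alation_conf value to reflect the path to the backup file | Run the command alation_conf alation.backup.restore_file -s /data2/restore/backup_file.tar.gz (this path assumes that /backup on the host appears as /data2 inside the Alation shell; confirm before running) |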
| ☐ | Log into Alation and | Run the command stop_alation |
| ☐ | Switch to the alation user | Run the command sudo su alation |
| ☐ | Change ownership of the restore file to the alation user | Run the command sudo chown alation:alation /data2/restore/backup_file.tar.gz |
| ☐ | Switch to the alation user | Run the command sudo su alation |
| ☐ | Verify successful restore | Log into https://ecd-alation.gene.com/ and verify that you see the expected data from the backup |
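The "current date or the day before" check in the table above can be scripted. This is a rough sketch under two assumptions: the file name starts with a YYYYMMDD date (as in the example 202211040400_9-2-9-155519_alation_backup.tar.gz), and GNU date is available for the "-d yesterday" option. The helper backup_is_current is hypothetical.

```shell
# Hypothetical helper: succeed if the backup's date prefix is today or yesterday.
# Assumptions: the file name begins with YYYYMMDD, and GNU date is installed.
backup_is_current() {
    local name stamp today yesterday
    name=$(basename "$1")
    # The first eight characters of the name are taken as the backup date.
    stamp=$(printf '%s' "$name" | cut -c1-8)
    today=$(date +%Y%m%d)
    yesterday=$(date -d yesterday +%Y%m%d)
    [ "$stamp" = "$today" ] || [ "$stamp" = "$yesterday" ]
}

# Example usage (not executed here):
#   if backup_is_current /backup/backup/backup_file.tar.gz; then
#       echo "backup is current"
#   else
#       echo "backup is stale; check AWS Backup before restoring" >&2
#   fi
```

The check only looks at the name. It does not confirm that the archive is complete or readable.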
Addendum
Note: if Alation is in such a state that you cannot enter the Alation shell, you will need to uninstall Alation, reinstall Alation at the release matching the latest backup, and then perform the restore steps above.
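The release to reinstall can be read from the backup file name. This is a minimal sketch, assuming the second underscore-separated field of the name is the release version, as in the example 202211040400_9-2-9-155519_alation_backup.tar.gz. The helper backup_release is hypothetical and is not an Alation command.

```shell
# Hypothetical helper: print the release field of a backup file name.
# Assumption: names look like <timestamp>_<release>_alation_backup.tar.gz.
backup_release() {
    local name
    name=$(basename "$1")
    case "$name" in
        *_*_alation_backup.tar.gz) ;;
        *)
            echo "unexpected backup file name: $name" >&2
            return 1
            ;;
    esac
    # Field 2 (split on "_") is the release string.
    printf '%s\n' "$name" | cut -d_ -f2
}

# Example usage (not executed here):
#   backup_release /backup/backup/202211040400_9-2-9-155519_alation_backup.tar.gz
#   prints: 9-2-9-155519
```

Compare the printed release with the version offered in the support portal before downloading the rpm.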
Steps to uninstall and re-install Alation:
| Status | Task Summary | Task Details |
|---|---|---|
| ☐ | Log into the Alation Support Portal | Log into https://customerportal.alationdata.com |
| ☐ | Get the correct download link | Select the Genentech - Prod link and copy the curl command for the current version. |
| ☐ | Verify if Alation rpm is stored locally | You can skip downloading the rpm from the support portal if you find the current rpm under /root/. If that’s the case, skip the next two steps |
| ☐ | Download the rpm to a local workstation | Assuming the Alation rpm isn’t stored on the server, run the curl command to download it to a local workstation |
| ☐ | Copy the rpm to the Alation server | Assuming the Alation rpm isn’t stored on the server, copy (scp or rsync) the rpm from the local workstation to a /tmp folder location on the Alation server |
| ☐ | Uninstall Alation | Run the command `rpm -qa |
| ☐ | Confirm rpm download is good | Go to the directory where the alation installer is located and run rpm -K alation####.rpm. If the download is corrupted, download the rpm again. |
| ☐ | Re-install Alation | Run the command sudo rpm -ivh alation####.rpm |
| ☐ | Initialize Alation | Run the command sudo /etc/init.d/alation init /data /backup |
| ☐ | Continue with the restore | Continue with the rest of the restore steps outlined in the above section |
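The rpm check and re-install steps in the table above can be chained so that a corrupted download is never installed. This is a sketch only: install_verified_rpm is a hypothetical wrapper, and it uses just the two rpm invocations already listed in the table (rpm -K and rpm -ivh).

```shell
# Hypothetical wrapper: install the Alation rpm only if "rpm -K" passes.
install_verified_rpm() {
    local pkg="$1"
    if [ ! -f "$pkg" ]; then
        echo "rpm not found: $pkg" >&2
        return 1
    fi
    # rpm -K exits non-zero when the package digests or signatures do not verify.
    if ! rpm -K "$pkg"; then
        echo "rpm check failed; download the rpm again" >&2
        return 1
    fi
    sudo rpm -ivh "$pkg"
}

# Example usage (not executed here; the file name is a placeholder):
#   install_verified_rpm /tmp/alation####.rpm
```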