ACE PlatformDR Checklist — Alation

Disaster Recovery Checklist for Alation

Business Function

Alation is a data catalog application that supports collaboration, data analysis, and data governance. It links to a legacy Oracle backend and a Redshift data warehousing backend.

There are two environments:

  • UAT
  • Prod

Scope

The scope of this document is the prod Alation application running in the ACE AWS Account.

| Environment | Instance | IP | URL |
| --- | --- | --- | --- |
| Prod | ecd-alation-prod | 10.158.21.158 | https://ecd-alation.gene.com/ |

Please note that only the prod environment is backed up to S3.

Key Information

Previously, backups were copied directly to the S3 bucket gred-alation-backups, but the most recent backup in that bucket is dated December 15, 2022. For newer backups, refer to Alation-EC2-Daily-Backup in AWS Backup.
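
As a rough sketch of how the newer backups could be located from the command line, the helpers below wrap two AWS CLI calls. They assume that Alation-EC2-Daily-Backup is the name of a backup plan, that the AWS CLI is already authenticated against the ACE AWS account, and that the vault name (not given in this document) is supplied by the operator. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: the name from this document refers to an AWS Backup plan.
PLAN_NAME="Alation-EC2-Daily-Backup"

# Print the ID of the backup plan whose name matches PLAN_NAME.
alation_backup_plan_id() {
  aws backup list-backup-plans \
    --query "BackupPlansList[?BackupPlanName=='${PLAN_NAME}'].BackupPlanId" \
    --output text
}

# Print ARN, creation date, and status of the newest recovery point in a vault.
# $1 = backup vault name (hypothetical; look it up in the AWS Backup console).
latest_recovery_point() {
  local vault=$1
  aws backup list-recovery-points-by-backup-vault \
    --backup-vault-name "$vault" \
    --query 'sort_by(RecoveryPoints,&CreationDate)[-1].[RecoveryPointArn,CreationDate,Status]' \
    --output text
}
```

For example, `latest_recovery_point <vault name>` can be used to check that a recent recovery point exists before starting a recovery.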

Recovery Checklist & Dependencies

Disclaimer: The procedures outlined in this document have not been reviewed or updated for an extended period. Do not execute them verbatim without first cross-checking them against the most recent operational practices and protocols.

The following tables provide the step-by-step recovery checklist that the ACE Infra team can use to guide them through the recovery steps for Alation. Tasks to verify the functionality of the recovered service are included.

| Status | Task Summary | Task Details |
| --- | --- | --- |
| | Log into AWS | Log into the AWS Console using this link |
| | Go to S3 Service | Search for the S3 service in the upper search bar and then search for bucket name: gred-alation-backups |
| | Find the folder with current backups | Within the gred-alation-backups bucket, find the folder named for the current <4-digit year>-<2-digit month> |
| | Find latest backup | Verify that the folder has the most current backup, dated either the current date or the day before. The file name should be formatted as <timestamp>_<release version>_alation_backup.tar.gz. Example: 202211040400_9-2-9-155519_alation_backup.tar.gz |
| | Log into the Alation prod server | Log in with ssh to ecd-alation.gene.com |
| | Verify Alation server has current backups | Go to the directory /backup/backup/ and verify that there is a current backup, dated either the current date or the day before |
| | Make a restore folder | Switch to elevated root privileges and create the folder /backup/restore. Run the command `sudo -i; mkdir /backup/restore` |
| | Copy the latest backup to restore folder | Copy the latest backup file from /backup/backup/ to the /backup/restore folder. Run the command sudo cp /backup/backup/backup_file.tar.gz /backup/restore/ |
| | Enter a Screen session | Run the command screen -S alation-restore |
| | Enter the Alation shell | Run the command sudo /etc/init.d/alation shell |
| | Stop the Alation services | Run the command stop_alation |
| | Change the alation_conf value to reflect the path to the backup file | Run the command alation_conf alation.backup.restore_file -s /data2/restore/backup_file.tar.gz |
| | Log into Alation and | Run the command stop_alation |
| | Switch to the alation user | Run the command sudo su alation |
| | Change ownership of the restore file to the alation user | Run the command sudo chown alation:alation /data2/restore/backup_file.tar.gz |
| | Switch to the alation user | Run the command sudo su alation |
| | Verify successful restore | Log into https://ecd-alation.gene.com/ and verify that you see the expected data from the backup |
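
The "current date or the day before" check on a backup file can be scripted. The following is a minimal sketch that assumes file names follow the example 202211040400_9-2-9-155519_alation_backup.tar.gz, with a timestamp that begins with YYYYMMDD. The function names are hypothetical, and the default for "yesterday" relies on GNU date (`date -d`), which is an assumption about the server.

```shell
#!/usr/bin/env bash
# Print the <timestamp> field (e.g. 202211040400) of a backup file name.
backup_timestamp() {
  local name
  name=$(basename "$1")
  printf '%s\n' "${name%%_*}"      # everything before the first underscore
}

# Print the <release version> field (e.g. 9-2-9-155519) of a backup file name.
backup_release() {
  local name rest
  name=$(basename "$1")
  rest=${name#*_}                  # drop "<timestamp>_"
  printf '%s\n' "${rest%%_*}"      # keep everything before the next underscore
}

# Succeed if the backup is dated today or yesterday.
# $2 and $3 optionally override today/yesterday as YYYYMMDD (useful for testing).
is_backup_current() {
  local stamp today yesterday
  stamp=$(backup_timestamp "$1")
  today=${2:-$(date +%Y%m%d)}
  yesterday=${3:-$(date -d yesterday +%Y%m%d)}
  [ "${stamp:0:8}" = "$today" ] || [ "${stamp:0:8}" = "$yesterday" ]
}
```

For the example file name, `backup_timestamp` prints 202211040400 and `backup_release` prints 9-2-9-155519. The release value is the one to match if Alation has to be reinstalled before the restore.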

Addendum

Note that if Alation is in such a state that you cannot enter the Alation shell, you will need to uninstall Alation, reinstall Alation at the release matching the latest backup, and then perform the steps above.

Steps to uninstall and re-install Alation:

| Status | Task Summary | Task Details |
| --- | --- | --- |
| | Log into the Alation Support Portal | Log into https://customerportal.alationdata.com |
| | Get the correct download link | Select the Genentech - Prod link and copy the curl command for the current version. |
| | Verify if Alation rpm is stored locally | You can skip downloading the rpm from the support portal if you find the current rpm under /root/. If that’s the case, skip the next two steps |
| | Download the rpm to a local workstation | Assuming the Alation rpm isn’t stored on the server, run the curl command to download it to a local workstation |
| | Copy the rpm to the Alation server | Assuming the Alation rpm isn’t stored on the server, copy (scp or rsync) the rpm from the workstation to a /tmp folder location on the Alation server |
| | Uninstall Alation | Run the command `rpm -qa |
| | Confirm rpm download is good | Go to the directory where the Alation installer is located and run rpm -K alation####.rpm. If the download is corrupted, download the rpm again. |
| | Re-install Alation | Run the command sudo rpm -ivh alation####.rpm |
| | Initialize Alation | Run the command sudo /etc/init.d/alation init /data /backup |
| | Continue with the restore | Continue with the rest of the restore steps outlined in the above section |
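
The "is the rpm already on the server" and "is the rpm intact" checks can be combined in a small helper. This is a sketch only: the function names are hypothetical, the alation*.rpm glob is an assumption based on the alation####.rpm naming used above, and it does not confirm that the package release matches the backup.

```shell
#!/usr/bin/env bash
# Print the newest alation*.rpm in a directory (default: /root).
# Returns non-zero if no matching file exists.
find_local_rpm() {
  local dir=${1:-/root} f newest=
  for f in "$dir"/alation*.rpm; do
    [ -e "$f" ] || continue                 # glob matched nothing
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f                             # keep the most recently modified file
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Print the path of a local rpm that passes "rpm -K" (digest/signature check).
# Returns non-zero if no rpm is found or the check fails.
usable_local_rpm() {
  local pkg
  pkg=$(find_local_rpm "$1") || return 1
  rpm -K "$pkg" >/dev/null || return 1
  printf '%s\n' "$pkg"
}
```

If `usable_local_rpm /root` succeeds, the download and copy steps can be skipped. Otherwise, download the rpm again from the support portal.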