GRADE Reading Center Data Transfer

GRADE RC Transfer Overview

The GRADE Reading Center (RC) transfer infrastructure lets external users transfer files to GRADE RC-specific buckets in our ACE Prod infrastructure.

Note that this is for the HONU Imaging Data Ingestion project.

The infrastructure consists of three main AWS resources.

  • AWS Transfer
  • AWS Cognito
  • AWS S3

AWS Transfer is an SFTP server hosted by AWS that facilitates transfers to S3. Users also need a file transfer tool such as rclone to run the CLI commands that perform the transfers.

AWS Cognito is an authentication, authorization, and user management service that enables us to create accounts for external users to access internal AWS resources.

We have one S3 bucket set aside to receive files - grade-rc-bucket-staging.

AWS Transfer

The server endpoint is: s-b92e2a6b98e44b198.server.transfer.us-west-2.amazonaws.com

This server is configured for SFTP connections across three Availability Zones (AZs) and subnets, and is publicly accessible. IAM roles and policies limit who can access the server, and Cognito authentication limits which external users can access it.

Note that for a user to access the SFTP server, a local user must be created by the admin.

How to create a User

  1. Log into the AWS Console and make sure you’re in the us-west-2 region
  2. Search for AWS Transfer in the search bar
  3. Select the Servers link in the left-hand menu
  4. You should see the server ID - s-b92e2a6b98e44b198
  5. Select it and scroll down to the Users section.
  6. Note the list of existing users.
  7. Click the Add User button to create a new user.
  8. Enter a username for the new user
  9. Select the role grade-rc_transfer-s3-role for the user. This gives the local account and the AWS Transfer service access to the GRADE RC S3 bucket.
  10. An RSA public/private key pair is also needed. Add the public key to the new user you are creating on the transfer server.
  11. Click Add when you are done.
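For admins who prefer to script the console steps above, the following is a minimal Python sketch using the boto3 Transfer client's create_user call. It is not part of the existing tooling. The server ID and role name come from this page; the AWS account ID, the username, and the key path are placeholders the caller must supply, and the role ARN format assumes the role has no IAM path.

```python
"""Sketch: create an AWS Transfer user from code (assumes boto3 and admin credentials)."""

SERVER_ID = "s-b92e2a6b98e44b198"        # server ID from this page
ROLE_NAME = "grade-rc_transfer-s3-role"  # role name from this page
REGION = "us-west-2"


def build_create_user_request(user_name, ssh_public_key, account_id):
    """Assemble the keyword arguments for the Transfer CreateUser call.

    account_id is a placeholder supplied by the caller; it is only used to
    turn the role name into a full IAM role ARN (assumes no IAM path).
    """
    return {
        "ServerId": SERVER_ID,
        "UserName": user_name,
        "Role": f"arn:aws:iam::{account_id}:role/{ROLE_NAME}",
        "SshPublicKeyBody": ssh_public_key.strip(),
    }


def create_transfer_user(user_name, public_key_path, account_id):
    """Read the RSA public key from disk and create the user on the server."""
    import boto3  # imported lazily so the helper above works without boto3

    with open(public_key_path, encoding="utf-8") as handle:
        request = build_create_user_request(user_name, handle.read(), account_id)
    return boto3.client("transfer", region_name=REGION).create_user(**request)
```

Keeping the request builder separate from the AWS call makes it possible to review exactly what would be sent before any user is created.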

The user should now have passwordless access to the SFTP server at the endpoint URL noted above.
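As an illustration of what the client side might look like, the sketch below renders an rclone SFTP remote definition that points at the endpoint and authenticates with the user's private key. The remote name, username, and key path are placeholders, not values from our setup; type, host, user, and key_file are standard options of rclone's SFTP backend.

```python
import configparser
import io

# Endpoint from this page.
ENDPOINT = "s-b92e2a6b98e44b198.server.transfer.us-west-2.amazonaws.com"


def render_rclone_config(remote_name, user_name, private_key_path):
    """Return rclone.conf text for an SFTP remote that authenticates by key."""
    config = configparser.ConfigParser()
    config[remote_name] = {
        "type": "sftp",
        "host": ENDPOINT,
        "user": user_name,
        "key_file": private_key_path,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # "grade-rc", "alice" and the key path are placeholders.
    print(render_rclone_config("grade-rc", "alice", "~/.ssh/grade_rc_rsa"))
```

With that section added to the user's rclone configuration file, a command such as `rclone lsd grade-rc:` should serve as a quick connectivity check, assuming the placeholder remote name is kept.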

AWS Transfer Logs

Logs can be found in the /aws/transfer/s-b92e2a6b98e44b198 log group under CloudWatch.
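The log group can also be read from code. The following is a rough sketch using the boto3 CloudWatch Logs filter_log_events paginator; it assumes boto3 is installed and that the caller's credentials allow reading this log group. The helper names are illustrative only.

```python
import time

SERVER_ID = "s-b92e2a6b98e44b198"  # server ID from this page
REGION = "us-west-2"


def transfer_log_group(server_id=SERVER_ID):
    """Return the CloudWatch log group name for a Transfer server."""
    return f"/aws/transfer/{server_id}"


def build_log_query(minutes_back, now_ms=None):
    """Build FilterLogEvents arguments covering the last minutes_back minutes."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # CloudWatch expects epoch milliseconds
    return {
        "logGroupName": transfer_log_group(),
        "startTime": now_ms - minutes_back * 60 * 1000,
        "endTime": now_ms,
    }


def recent_transfer_events(minutes_back=60):
    """Yield (timestamp, message) pairs for recent Transfer log events."""
    import boto3  # imported lazily so the helpers above work without boto3

    logs = boto3.client("logs", region_name=REGION)
    paginator = logs.get_paginator("filter_log_events")
    for page in paginator.paginate(**build_log_query(minutes_back)):
        for event in page["events"]:
            yield event["timestamp"], event["message"]
```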

AWS Cognito

The Cognito component consists of a User Pool and an Identity Pool.

User Pool

User Pool Name: grade-rc_s3_user_pool

User pools are specifically for authentication (identity verification).

This user pool is where we onboard new users for access to the S3 buckets. We also define an app client - grade-rc_s3_transfer. An app client is an entity within a user pool that has permission to call unauthenticated API operations, as defined by AWS.

To access the User Pool

  1. Log into the AWS Console and make sure you’re in the us-west-2 region
  2. Search for Cognito
  3. Click the Manage User Pools button
  4. Select the User pool - grade-rc_s3_user_pool

To update the user pool's configuration settings, select the pool and make the changes directly.

Identity Pool

Identity Pool name: grade-rc_s3_id_pool

Identity pools are specifically for authorization (access control).

The identity pool is where we link our application (in this case, the transfer of files to S3) to our authenticated users from the user pool.

To access the Identity Pool

  1. Log into the AWS Console and make sure you’re in the us-west-2 region
  2. Search for Cognito
  3. Click the Manage Identity Pools button
  4. Select the Identity pool - grade-rc_s3_id_pool
  5. Click the Edit identity pool link in the upper left of the page.

Note that this identity pool is configured with the user pool ID from above and the app client ID from that user pool.
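To make the link between the two pools concrete, here is a hedged sketch of how a signed-in user's ID token could be exchanged for temporary AWS credentials through the identity pool. The identity pool ID and user pool ID are not listed on this page, so they appear as parameters; the provider-name format in the Logins map is the standard one for Cognito user pools. This is an illustration, not the code our scripts run.

```python
REGION = "us-west-2"


def logins_map(user_pool_id, id_token, region=REGION):
    """Map the user pool's provider name to the ID token returned at sign-in."""
    provider = f"cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    return {provider: id_token}


def temporary_credentials(identity_pool_id, user_pool_id, id_token):
    """Exchange a user pool ID token for temporary AWS credentials.

    identity_pool_id and user_pool_id are placeholders the caller supplies.
    """
    import boto3  # imported lazily so logins_map works without boto3

    identity = boto3.client("cognito-identity", region_name=REGION)
    logins = logins_map(user_pool_id, id_token)
    identity_id = identity.get_id(
        IdentityPoolId=identity_pool_id, Logins=logins
    )["IdentityId"]
    response = identity.get_credentials_for_identity(
        IdentityId=identity_id, Logins=logins
    )
    # Contains AccessKeyId, SecretKey, SessionToken and Expiration.
    return response["Credentials"]
```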

AWS S3 buckets

Files for GRADE RC will be transferred to the bucket - grade-rc-bucket-staging. Additional buckets can be added where needed.
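One way to confirm that transferred files have arrived is to list the bucket. The sketch below uses the boto3 S3 list_objects_v2 paginator and assumes the caller has credentials with list access to the bucket; the function names and the optional prefix are illustrative.

```python
BUCKET = "grade-rc-bucket-staging"  # bucket name from this page
REGION = "us-west-2"


def summarize_listing(page):
    """Return (key, size) pairs from one ListObjectsV2 response page."""
    # "Contents" is absent from the response when no objects match.
    return [(item["Key"], item["Size"]) for item in page.get("Contents", [])]


def list_received_files(prefix=""):
    """Yield (key, size) for every object in the bucket under prefix."""
    import boto3  # imported lazily so summarize_listing works without boto3

    s3 = boto3.client("s3", region_name=REGION)
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix):
        yield from summarize_listing(page)
```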

Bash scripts for automating Cognito signup and authentication

We have also provided users with two Bash scripts.

  • One script handles signup and authentication into Cognito.
  • The other script is used after the user has been onboarded. It authenticates the user and automates transfers into our GRADE RC bucket.
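The scripts themselves are not reproduced on this page. For readers who want to see the shape of the underlying Cognito calls, here is an independent Python sketch of sign-up and sign-in using the boto3 cognito-idp client; it is not the provided Bash. It assumes the app client has no client secret, that the USER_PASSWORD_AUTH flow is enabled on it, and that users sign up with an email address as their username. None of these settings is confirmed here, and the app client ID must be supplied by the caller.

```python
REGION = "us-west-2"


def build_sign_up_request(client_id, email, password):
    """Arguments for the Cognito SignUp call (assumes email is the username)."""
    return {
        "ClientId": client_id,
        "Username": email,
        "Password": password,
        "UserAttributes": [{"Name": "email", "Value": email}],
    }


def build_auth_request(client_id, username, password):
    """Arguments for InitiateAuth (assumes USER_PASSWORD_AUTH is enabled)."""
    return {
        "ClientId": client_id,
        "AuthFlow": "USER_PASSWORD_AUTH",
        "AuthParameters": {"USERNAME": username, "PASSWORD": password},
    }


def sign_up(client_id, email, password):
    """Register a new user in the user pool."""
    import boto3  # imported lazily so the builders work without boto3

    idp = boto3.client("cognito-idp", region_name=REGION)
    return idp.sign_up(**build_sign_up_request(client_id, email, password))


def sign_in(client_id, username, password):
    """Authenticate and return the ID token for the signed-in user."""
    import boto3

    idp = boto3.client("cognito-idp", region_name=REGION)
    result = idp.initiate_auth(**build_auth_request(client_id, username, password))
    # If Cognito answers with a challenge instead, this key is missing.
    return result["AuthenticationResult"]["IdToken"]
```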