Workspace:
Workspace Creation Process:
- Each workspace costs about 20K; check with Saima before reaching out to V7 for creation
- Workspace creation can be requested via two methods:
  1. Reach out to @James Hudson via the Slack channel ext-ecdi-v7 with a workspace creation request
  2. Log on to V7, click Support at the bottom right, and submit a workspace creation request
- Once approved, an email will be sent; whoever creates the workspace becomes its owner
- Request V7 (via @James Hudson on the Slack channel ext-ecdi-v7) to change the owner to Parthi KT
- The owner has permission to drop the workspace in addition to all admin permissions, so it should be owned by Ace-Infra
Workspace Naming Standards:
- Names should be org, department, and team; see the ECDI ACE AI team as an example: gene gred ecdi ace ai
- Spaces in the workspace name are internally converted to - (e.g., gene-gred-ecdi-ace-ai)
Workspace Local Upload Disable:
- By default, a workspace has permission to upload files from the local machine
- Local file upload violates Roche security policy, since the file would be uploaded to the V7 S3 bucket
- Reach out to @James Hudson via the Slack channel ext-ecdi-v7 (or the support channel) and request that local upload be disabled
Workspace AD Group Naming Standards:
- Always start with v7ws-
- Typical naming: v7ws-<workspacename>, after replacing spaces with -
- Example: the gene gred ecdi ace ai workspace will have the AD group v7ws-gene-gred-ecdi-ace-ai
- Create the owner group v7ws-gene-gred-ecdi-ace-ai-owners with Miao, Adam, and Parthi as the owners; make it non-requestable on CIDM
- Create the AD group v7ws-gene-gred-ecdi-ace-ai with v7ws-gene-gred-ecdi-ace-ai-owners as the group owners
- With this approach, Miao/Adam approve membership requests, and we know the user count for each workspace
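The naming convention above can be sketched as a small helper (the function name is hypothetical; the logic just follows the v7ws- rules as described):

```python
def ad_group_name(workspace_name: str) -> str:
    """Derive the AD group name from a workspace name:
    prefix with v7ws- and replace spaces with hyphens."""
    return "v7ws-" + workspace_name.strip().lower().replace(" ", "-")

print(ad_group_name("gene gred ecdi ace ai"))              # v7ws-gene-gred-ecdi-ace-ai
print(ad_group_name("gene gred ecdi ace ai") + "-owners")  # the matching owner group
```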
S3 Setup:
V7 S3 setup:
- S3 setup is done via Terraform; the code is on GitHub:
- https://github.com/gred-ecdi/terraform-ace-prod/blob/master/us-west-2/infra-v7-prod-usw2/main.tf
- Note: virus scan setup is manual; see https://wiki.gred.ai/en/ACE/CSS-Antivirus-for-S3-Integration-Guide for the steps
User Permission Change:
User Roles:
- Once a user logs on to V7, they will have the worker role
- Change it to admin or user via the V7 portal: Settings -> Members -> change the role from the drop-down
- A user name is listed on V7 only after the user has logged on
- If a user is not part of the CIDM group, they will not be able to log on
User Login Page:
Refer to the example below for the user login page:
- Go to https://darwin.v7labs.com/
- Choose "Use Single Sign-on"
- Type the team name (e.g., Genentech Parent) and click Login
- You will be prompted for your Roche Single Sign-On login details
User Per Workspace Limitation:
- Currently we have 100 licenses; if we go beyond that, we pay $2,000 per additional user
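The overage math can be sketched as follows (constant and function names are hypothetical; figures are from the note above):

```python
BASE_LICENSES = 100      # licenses currently included
OVERAGE_PER_USER = 2000  # USD per user beyond the base

def extra_license_cost(total_users: int) -> int:
    """Estimated additional cost when the user count exceeds the base licenses."""
    return max(0, total_users - BASE_LICENSES) * OVERAGE_PER_USER

print(extra_license_cost(95))   # 0
print(extra_license_cost(110))  # 20000
```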
S3 on Other AWS Accounts:
- Set up the bucket policy:
{
"Version": "2012-10-17",
"Id": "V7Access",
"Statement": [
{
"Sid": "DarwinAccess",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::258327614892:role/external_s3"
},
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": "arn:aws:s3:::orion-caris/*" # change it to your bucket
}
]
}
- Add a CORS policy on the bucket:
[
{
"AllowedHeaders": [
"*"
],
"AllowedMethods": [
"GET"
],
"AllowedOrigins": [
"https://darwin.v7labs.com"
],
"ExposeHeaders": [],
"MaxAgeSeconds": 3000
}
]
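The two policies above can also be generated and applied programmatically. The sketch below builds them as Python dicts (function and constant names are hypothetical; the role ARN and values are copied from the policies above), and the commented-out boto3 calls show how they could be applied given AWS credentials:

```python
import json

V7_EXTERNAL_ROLE = "arn:aws:iam::258327614892:role/external_s3"

def v7_bucket_policy(bucket_name: str) -> dict:
    """Bucket policy granting the V7 external_s3 role read/write access."""
    return {
        "Version": "2012-10-17",
        "Id": "V7Access",
        "Statement": [{
            "Sid": "DarwinAccess",
            "Effect": "Allow",
            "Principal": {"AWS": V7_EXTERNAL_ROLE},
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        }],
    }

V7_CORS_RULES = [{
    "AllowedHeaders": ["*"],
    "AllowedMethods": ["GET"],
    "AllowedOrigins": ["https://darwin.v7labs.com"],
    "ExposeHeaders": [],
    "MaxAgeSeconds": 3000,
}]

# Applying them with boto3 (requires AWS credentials with s3:PutBucketPolicy
# and s3:PutBucketCORS on the bucket):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_policy(Bucket="my-bucket",
#                      Policy=json.dumps(v7_bucket_policy("my-bucket")))
# s3.put_bucket_cors(Bucket="my-bucket",
#                    CORSConfiguration={"CORSRules": V7_CORS_RULES})
```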
- If the bucket is encrypted with a customer-managed key, add this statement to the KMS key policy:
{
"Sid": "Allow access for v7",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::258327614892:role/external_s3"
},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
],
"Resource": "*"
},

Example Slack Notification to users who plan to use their own AWS S3 buckets:
@Jeff Eastham - Your V7 setup is complete.
Access control: Raise a CIDM request to add yourself to the AD group v7ws-gene-gred-rp-digipath
Logon method: After CIDM completion, log on to https://darwin.v7labs.com/ -> Use Single Sign-on -> gene gred rp digipath -> SSO logon details
Admin access: By default, you are a worker. Ping me after logging on once, and I will make you a workspace admin.
Data access: Upload from the desktop to V7 is disabled. A workspace admin needs to configure the S3 bucket on the Storage tab of V7.
AWS admin: The bucket you are configuring needs the bucket policy below. If your bucket is KMS-CMK encrypted, you also need to add the key policy below. Check with V7 on what policy to use if you use an AWS-managed key.
Vendor questions: Via the Slack channel ext-ecdi-v7 on the gRED workspace
Bucket Policy:
{
"Version": "2012-10-17",
"Id": "V7Access",
"Statement": [
{
"Sid": "DarwinAccess",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::258327614892:role/external_s3"
},
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": "arn:aws:s3:::v7-gene-gred-ace-nlp-prod/*" ##replace with your bucket arn
}
]
}

KMS-CMK Key Policy:
{
"Sid": "Allow access for v7",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::258327614892:role/external_s3"
},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
],
"Resource": "*"
}

Next Step: I will schedule a call with the Vendor
V7 DataSet Registration Steps:
- Typically this is the first step after setting up the workspace
- Users upload their files/images to the S3 bucket, after which they need to register them as a dataset
- The sample code below helps start the process for single-file registration
import requests

api_key = "dfdf3r4cg.8SFv1FTV0BfRvKky_VU0MlmIPxJ4t6Sm"  # Generate this key on the V7 portal
team_slug = "gene-gred-ace-nlp"             # Workspace name with spaces replaced by -
dataset_slug = "data"                       # Any name you want to display on V7
storage_name = "v7-gene-gred-ace-nlp-prod"  # S3 bucket name

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": f"ApiKey {api_key}"
}
payload = {
    "items": [
        {
            "path": "/",
            "slots": [
                {
                    "as_frames": "false",
                    "slot_name": "1",
                    "storage_key": "data/000000000.png",  # Provide the S3 folder/filename
                    "file_name": "000000000.png"
                }
            ],
            "name": "000000000.png"
        }
    ],
    "dataset_slug": dataset_slug,
    "storage_slug": storage_name
}
response = requests.post(
    f"https://darwin.v7labs.com/api/v2/teams/{team_slug}/items/register_existing",
    headers=headers,
    json=payload
)
body = response.json()
if response.status_code != 200:
    print("request failed", response.text)
elif 'blocked_items' in body and len(body['blocked_items']) > 0:
    print("failed to register items:")
    for item in body['blocked_items']:
        print("\t - ", item)
    if len(body['items']) > 0:
        print("successfully registered items:")
        for item in body['items']:
            print("\t - ", item)
else:
    print("success")

- If you need multi-file registration, you can use the sample below
import boto3
import requests

dev = boto3.session.Session(profile_name='default')
# Connect to S3
s3 = dev.client('s3')
# Your AWS bucket name
bucket_name = 'v7-roche-pred-opm-prod'
# List objects within the whole bucket
# objects = s3.list_objects_v2(Bucket=bucket_name)
# List objects within a subfolder if needed (comment the line above and use this instead)
objects = s3.list_objects_v2(Bucket=bucket_name, Prefix='data/acdc_batch2_dcm')

# V7 API setup
api_key = "dfdf3r4cg.8SFv1FTV0BfRvKky_VU0MlmIPxJ4t6Sm"  # Generate this key on the V7 portal
team_slug = "gene-gred-ace-nlp"             # Workspace name with spaces replaced by -
dataset_slug = "data"                       # Any name you want to display on V7
storage_name = "v7-gene-gred-ace-nlp-prod"  # S3 bucket name

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": f"ApiKey {api_key}"
}

# Initialize payload
payload = {
    "items": [],
    "dataset_slug": dataset_slug,
    "storage_slug": storage_name
}

# Iterate over each object in the bucket
for obj in objects.get('Contents', []):
    file_name = obj['Key']
    if file_name.split('/')[-1]:  # skip folder placeholder keys (empty file name)
        payload['items'].append({
            "path": "/",
            "slots": [
                {
                    "as_frames": "false",
                    "slot_name": "1",
                    "storage_key": file_name,
                    "file_name": file_name.split('/')[-1]
                }
            ],
            "name": file_name.split('/')[-1]
        })

# print(payload)  # To test before loading to V7, uncomment this and comment out everything below

# Send request to V7
response = requests.post(
    f"https://darwin.v7labs.com/api/v2/teams/{team_slug}/items/register_existing",
    headers=headers,
    json=payload,
    verify=True
)

# Process response
body = response.json()
if response.status_code != 200:
    print("request failed", response.text)
elif 'blocked_items' in body and len(body['blocked_items']) > 0:
    print("failed to register items:")
    for item in body['blocked_items']:
        print("\t - ", item)
    if len(body['items']) > 0:
        print("successfully registered items:")
        for item in body['items']:
            print("\t - ", item)
else:
    print("success")
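One caveat on the multi-file script above: a single list_objects_v2 call returns at most 1,000 keys, so larger prefixes need a paginator. The sketch below factors the item-building loop into a helper (the function name is hypothetical) and shows, commented out, how a boto3 paginator could feed it:

```python
def keys_to_items(keys):
    """Build V7 registration items from S3 object keys,
    skipping folder placeholder keys (those ending in '/')."""
    items = []
    for key in keys:
        file_name = key.split("/")[-1]
        if not file_name:  # placeholder "directory" object
            continue
        items.append({
            "path": "/",
            "slots": [{
                "as_frames": "false",
                "slot_name": "1",
                "storage_key": key,
                "file_name": file_name,
            }],
            "name": file_name,
        })
    return items

# For prefixes with more than 1,000 objects, paginate (requires AWS credentials):
# paginator = s3.get_paginator("list_objects_v2")
# keys = [o["Key"]
#         for page in paginator.paginate(Bucket=bucket_name, Prefix="data/")
#         for o in page.get("Contents", [])]
# payload["items"] = keys_to_items(keys)
```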