CDP part 5: user permissions management on CDP Public Cloud

When you create a user or a group in CDP, it requires permissions to access resources and use the Data Services.

This article is the fifth in a series of six.

CDP Public Cloud manages these permissions through roles, which control the scope of access to the resources.

There are two main types of roles:

  • Account Roles: permissions to access or perform tasks on all resources within the CDP tenant
  • Resource Roles: permissions to access or perform tasks on a specific resource, such as an environment

This article focuses on setting the roles and the Ranger policies required for the group of users created in User management on CDP Public Cloud with Keycloak, so that they can complete the lab article which closes this series.

By definition, a group in CDP is a collection of user accounts that have the same account and resource roles. Therefore we can manage all our needs at the group level.

Three remarks before starting the configuration:

  • At least one user of the group has to log in to make the group visible on the CDP console.
  • The PowerUser role is required to assign roles to a group.
  • The EnvironmentAdmin role is required to set the Ranger Policies.

Required Roles

To give users access to all resources required to set up the lab article, we need to assign them the following roles:

  • Account Roles:
    • PowerUser
    • DFCatalogAdmin
    • DFCatalogViewer
  • Resource Roles:
    • DWAdmin
    • DWUser
    • DFFlowAdmin
    • DFFlowUser
    • DEUser

In addition, we need to set the Data Access Role on the IDBroker Mappings to ensure user applications can access the Data Lake.

As in CDP Public Cloud deployment on AWS, the role configuration can be done via the Cloudera web interface or the CDP CLI. Both approaches are covered.

Configuring Roles using the CDP Web Interface

This approach is recommended if you are new to CDP. It is slower but gives you a better idea of the configuration process. If you did not install and configure the CDP CLI and the AWS CLI as described in Introduction to end-to-end data lakehouse architecture with CDP, this is also your only option.

If you want to go faster and use the terminal to set the roles, scroll down to the Configuring roles from the Terminal section.

Note: You still need to use the CDP console to configure the Ranger policies since this task cannot be accomplished using the CDP CLI.

To set the Account Roles:

  1. Log in to the CDP console and select Management Console

    management_console

  2. Navigate to User Management > Groups > Your group name

    account_roles01

  3. Select Roles and click Update Roles

    account_roles02

  4. Select the account roles of the list above and click Update

    account_roles03

  5. You should get the following

    account_roles04

To set the Resource Roles:

  1. Log in to the CDP console and select Management Console

    management_console

  2. Navigate to Environments > Your environment

    resource_roles01

  3. On the top right corner select Actions and click Manage Access

    resource_roles02

  4. Select the Access tab, write your group name in the search box

    resource_roles03

  5. Select the resource roles of the list above and click Update

    resource_roles04

  6. The last step is to synchronize the users with the environment: click Synchronize Users

    resource_roles05

  7. Click Synchronize Users

    resource_roles05

To set the IDBroker Mappings:

  1. Select the IDBroker Mappings tab, click Edit

    idbroker_mapping01

  2. Add Data Access Role

    1. Select your group name in the search box
    2. Copy the Data Access Role above
    3. Paste it into the Role space
    4. Click Save and Sync

    idbroker_mapping02

  3. You should get the following

    idbroker_mapping03

Configuring Roles from the Terminal

Configuring roles via the terminal is recommended for experienced users who want to launch their environment quickly. You need to have the CDP CLI and the AWS CLI installed on your system as described in CDP part 1: introduction to end-to-end data lakehouse architecture with CDP.

Configuration via the terminal requires the following steps:

  1. Set Account Roles
  2. Set Resource Roles
  3. Set IDBroker mappings
  4. Synchronize users

Set Account Roles

To set the account roles, you need your group name and the CRN of the roles you want to assign. In order to do so, use the following commands:

# Set your group name
export CDP_GROUP_NAME=adaltas-students
# Array with the names of the required Account Roles (resolved to CRNs below)
export ACCOUNT_ROLES=(PowerUser DFCatalogAdmin DFCatalogViewer)

# Get Account Roles CRN
get_crn_account_role () {
   CDP_ACCOUNT_ROLE_NAME=$1
   CDP_ACCOUNT_ROLE_CRN=$(cdp iam list-roles |jq --arg CDP_ACCOUNT_ROLE_NAME "$CDP_ACCOUNT_ROLE_NAME" '.roles[] | select(.crn | endswith($CDP_ACCOUNT_ROLE_NAME))| .crn')
   echo $CDP_ACCOUNT_ROLE_CRN | tr -d '"'
}
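The jq filter above matches role CRNs by suffix with endswith, so a role name that is also the ending of a longer role name would match more than one CRN. The following is a minimal sketch of an exact-match lookup, assuming each role CRN ends with `:<RoleName>`. The function name find_crn_by_name is hypothetical, and the CRNs you pass in would come from something like `cdp iam list-roles | jq -r '.roles[].crn'`.

```shell
# find_crn_by_name NAME: read one CRN per line on stdin and print the first
# CRN whose last colon-separated segment equals NAME exactly.
# (Hypothetical helper, not part of the CDP CLI.)
find_crn_by_name() {
  wanted=$1
  while IFS= read -r crn; do
    # ${crn##*:} removes everything up to the last colon, leaving the name.
    if [ "${crn##*:}" = "$wanted" ]; then
      printf '%s\n' "$crn"
      return 0
    fi
  done
  return 1  # no CRN matched
}

# Possible usage (requires the CDP CLI and jq):
#   cdp iam list-roles | jq -r '.roles[].crn' | find_crn_by_name PowerUser
```

Because the comparison is on the whole last segment, a partial name such as User matches nothing instead of matching every role that ends in User.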

With all the required variables defined, you can set the roles.

# Assign Account Roles
for role_name in "${ACCOUNT_ROLES[@]}"; do
  cdp iam assign-group-role \
    --group-name "${CDP_GROUP_NAME}" \
    --role "$(get_crn_account_role "${role_name}")"
done
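If a role name is misspelled, the lookup returns an empty string and the assignment command is called with an empty role. A small guard can catch this before the CLI is invoked. This is only a sketch: is_crn and checked_role_crn are hypothetical helper names, and the check simply assumes that every valid CRN starts with `crn:`.

```shell
# is_crn VALUE: succeed only if VALUE looks like a CRN (starts with "crn:").
is_crn() {
  case "$1" in
    crn:*) return 0 ;;
    *) return 1 ;;
  esac
}

# checked_role_crn NAME CRN: print CRN if it looks valid, otherwise report
# the role NAME on stderr and fail. (Hypothetical helper.)
checked_role_crn() {
  if is_crn "$2"; then
    printf '%s\n' "$2"
  else
    echo "No CRN found for role '$1', skipping" >&2
    return 1
  fi
}

# Possible usage inside the loop above (bash, requires the CDP CLI):
#   role_crn=$(checked_role_crn "$role_name" \
#     "$(get_crn_account_role "$role_name")") || continue
#   cdp iam assign-group-role --group-name "$CDP_GROUP_NAME" --role "$role_crn"
```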

The command prints no confirmation when the roles are assigned successfully. You can validate the assignment with this command:

cdp iam list-group-assigned-roles --group-name $CDP_GROUP_NAME
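Reading the list of CRNs by eye is error-prone, so the check can be scripted. The sketch below compares the assigned CRNs against the required role names and prints the ones that are missing. The function name missing_roles is hypothetical, and the usage line assumes that the command output exposes the CRNs under a `roleCrns` key, which you should confirm against your CLI version.

```shell
# missing_roles NAME...: read assigned role CRNs on stdin (one per line) and
# print every required NAME that has no CRN ending in ":NAME".
# (Hypothetical helper, not part of the CDP CLI.)
missing_roles() {
  assigned=$(cat)
  for name in "$@"; do
    found=no
    for crn in $assigned; do   # CRNs contain no spaces, so splitting is safe
      if [ "${crn##*:}" = "$name" ]; then
        found=yes
      fi
    done
    if [ "$found" = no ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Possible usage (requires the CDP CLI and jq; "roleCrns" is an assumption):
#   cdp iam list-group-assigned-roles --group-name "$CDP_GROUP_NAME" \
#     | jq -r '.roleCrns[]' \
#     | missing_roles PowerUser DFCatalogAdmin DFCatalogViewer
```

An empty output means that every required account role is assigned to the group.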

Set Resource Roles

To set resource roles, you need the CRN of your CDP environment, your group of users, and the roles you want to assign. In order to do so, use the following commands:

# Set your environment name
export CDP_ENV_NAME=[your-environment-name]
# Get Group CRN
export CDP_GROUP_CRN=$(cdp iam list-groups | jq -r --arg CDP_GROUP_NAME "$CDP_GROUP_NAME" '.groups[] | select(.groupName==$CDP_GROUP_NAME).crn')
# Get Environment CRN
export CDP_ENV_CRN=$(cdp environments describe-environment --environment-name ${CDP_ENV_NAME} | jq -r .environment.crn)
# Array with the names of the required Resource Roles (resolved to CRNs below)
export RESOURCE_ROLES=(DWAdmin DWUser DFFlowAdmin DFFlowUser DEUser)
# Function: Get Resource Roles CRN
get_crn_resource_role () {
   CDP_RESOURCE_ROLE_NAME=$1
   CDP_RESOURCE_ROLE_CRN=$(cdp iam list-resource-roles |jq --arg CDP_RESOURCE_ROLE_NAME "$CDP_RESOURCE_ROLE_NAME" '.resourceRoles[] | select(.crn | endswith($CDP_RESOURCE_ROLE_NAME))| .crn')
   echo $CDP_RESOURCE_ROLE_CRN | tr -d '"'
}

With all the required variables defined, you can set the roles.

# Set Resource Roles
for role_name in "${RESOURCE_ROLES[@]}"; do
  cdp iam assign-group-resource-role \
    --group-name "$CDP_GROUP_NAME" \
    --resource-role-crn "$(get_crn_resource_role "${role_name}")" \
    --resource-crn "$CDP_ENV_CRN"
done

The command prints no confirmation when the roles are assigned successfully. You can validate the assignment with this command:

cdp iam list-group-assigned-resource-roles --group-name $CDP_GROUP_NAME
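Resource roles are scoped to a resource, so a group may hold the same role on several environments. The sketch below keeps only the role names assigned on one given resource. It is an illustration under assumptions: roles_on_resource is a hypothetical helper, and the usage line assumes that the command output contains a `resourceAssignments` list with `resourceCrn` and `resourceRoleCrn` fields.

```shell
# roles_on_resource RESOURCE_CRN: read "resourceCrn resourceRoleCrn" pairs on
# stdin (one pair per line) and print the names of the roles assigned on
# RESOURCE_CRN only. (Hypothetical helper, not part of the CDP CLI.)
roles_on_resource() {
  target=$1
  while read -r resource_crn role_crn; do
    if [ "$resource_crn" = "$target" ]; then
      printf '%s\n' "${role_crn##*:}"   # keep the text after the last colon
    fi
  done
}

# Possible usage (requires the CDP CLI and jq; field names are assumptions):
#   cdp iam list-group-assigned-resource-roles --group-name "$CDP_GROUP_NAME" \
#     | jq -r '.resourceAssignments[] | "\(.resourceCrn) \(.resourceRoleCrn)"' \
#     | roles_on_resource "$CDP_ENV_CRN"
```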

Set IDBroker mapping

To configure the IDBroker Mapping, you need information from your AWS CloudFormation stack. Retrieve this information using the following commands:

# Get AWS Stack required parameters
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity | jq -r .Account)
export CDP_RESOURCE_PREFIX=$(aws cloudformation describe-stacks --stack-name aws-${USER}-env | jq -r '.Stacks[].Parameters[] | select(.ParameterKey=="prefix").ParameterValue')

export AWS_DATA_ADMIN_ROLE_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CDP_RESOURCE_PREFIX}-datalake-admin-role
export AWS_RANGER_AUDIT_ROLE_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CDP_RESOURCE_PREFIX}-ranger-audit-role
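jq prints JSON strings with their surrounding double quotes unless the -r option is used, and a stray quote inside an ARN makes it invalid. The following sketch builds the role ARN defensively: it strips any double quotes and checks that the account ID has the 12 digits AWS uses. The function name iam_role_arn is hypothetical.

```shell
# iam_role_arn ACCOUNT_ID ROLE_NAME: print the IAM role ARN after removing
# any double quotes and checking that ACCOUNT_ID is exactly 12 digits.
# (Hypothetical helper.)
iam_role_arn() {
  account_id=$(printf '%s' "$1" | tr -d '"')
  role_name=$(printf '%s' "$2" | tr -d '"')
  case "$account_id" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "Invalid AWS account ID: '$account_id'" >&2; return 1 ;;
  esac
  printf 'arn:aws:iam::%s:role/%s\n' "$account_id" "$role_name"
}

# Possible usage with the variables defined above:
#   iam_role_arn "$AWS_ACCOUNT_ID" "${CDP_RESOURCE_PREFIX}-datalake-admin-role"
```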

Now you can assign the Data Access Role to your group in the IDBroker Mappings.

Note: The following command updates ALL the IDBroker Mappings configuration, which is why both Data Access and Ranger Audit roles are required.

# Update IDBroker Mappings roles
cdp environments set-id-broker-mappings \
  --environment-name $CDP_ENV_NAME \
  --data-access-role $AWS_DATA_ADMIN_ROLE_ARN \
  --ranger-audit-role $AWS_RANGER_AUDIT_ROLE_ARN \
  --mappings accessorCrn=$CDP_GROUP_CRN,role=$AWS_DATA_ADMIN_ROLE_ARN

Synchronize Users and IDBroker Mappings

With all the configuration done, it’s time to synchronize both users and IDBroker mappings with your environment.

# Synchronize users
cdp environments sync-all-users \
  --environment-name $CDP_ENV_NAME

# Synchronize IDBroker Mappings
cdp environments sync-id-broker-mappings \
  --environment-name $CDP_ENV_NAME
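Synchronization runs in the background, so a script that continues immediately may act before the users are available in the environment. The sketch below is a generic polling loop that repeats a status command until it reports completion. It rests on assumptions to verify against your CLI version: that the sync commands return an `operationId`, that `cdp environments sync-status --operation-id` reports a `status` field, and that COMPLETED, FAILED, REJECTED and TIMEDOUT are the terminal values. The names poll_status, POLL_ATTEMPTS and POLL_DELAY are hypothetical.

```shell
# poll_status CMD [ARGS...]: run CMD repeatedly until it prints COMPLETED
# (success) or a terminal failure status. POLL_ATTEMPTS and POLL_DELAY
# (seconds) are optional settings. (Hypothetical helper.)
poll_status() {
  attempts=${POLL_ATTEMPTS:-30}
  delay=${POLL_DELAY:-10}
  tries=0
  while [ "$tries" -lt "$attempts" ]; do
    status=$("$@") || return 2   # the status command itself failed
    case "$status" in
      COMPLETED) printf '%s\n' "$status"; return 0 ;;
      FAILED|REJECTED|TIMEDOUT) printf '%s\n' "$status" >&2; return 1 ;;
    esac
    tries=$((tries + 1))
    sleep "$delay"
  done
  echo "Gave up after $attempts attempts" >&2
  return 3
}

# Possible usage (requires the CDP CLI and jq; field names are assumptions):
#   op_id=$(cdp environments sync-all-users \
#     --environment-name "$CDP_ENV_NAME" | jq -r .operationId)
#   sync_state() {
#     cdp environments sync-status --operation-id "$op_id" | jq -r .status
#   }
#   poll_status sync_state
```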

Configure Ranger policies

There is one more layer of permissions to configure before users can complete the lab: Ranger policies.

These policies are at the Data Warehouse service level. As you will see in the next article, users need to create and query tables on the data warehouse.

All this configuration is done via the Cloudera web interface using the Ranger console.

As a reminder, before starting, you need the PowerUser account role on CDP to follow along.

  1. Navigate to Data Warehouse

    cdp_datawarehouse

  2. In Overview, find the Database Catalog title for your environment, click on the three vertical dots on the top right, and select Open Ranger

    ranger_policies01

  3. In the Ranger Service Manager, click Hadoop SQL

    ranger_policies02

  4. Open policy 9: all - database, table, columns

    • Add {USER} under Allow Conditions, Select Users
    • Click Save

    ranger_policy9_00

    ranger_policy9_01

    ranger_policy9_02

    ranger_policy9_03

  5. Open policy 11: all - storage-type, storage-url

    • Add {USER} under Allow Conditions, Select Users
    • Click Save

    ranger_policy11_00

    ranger_policy11_01

    ranger_policy11_02

    ranger_policy11_03

Next Steps: Hands-On Lab on a CDP Public Cloud Environment

Finally, both users and architecture are ready, so it’s time to let users experiment with all the managed services of your AWS-hosted CDP Public Cloud Environment with the hands-on lab-article that closes this series.
