[HOPSWORKS-3136] Migrate IAM role chaining docs (logicalclocks#1088)
ErmiasG authored Jun 7, 2022
1 parent 2e17eb6 commit e89856b
Showing 12 changed files with 129 additions and 5 deletions.
77 changes: 77 additions & 0 deletions docs/admin/iamRoleChaining.md
@@ -0,0 +1,77 @@
# AWS IAM Role Chaining
Using an EC2 instance profile enables your Hopsworks cluster to access AWS resources.
However, this forces all Hopsworks users to share the instance profile role and the resource access policies attached to
that role. To allow per-project access policies, you could have your users put AWS credentials directly in
their programs, but this is not recommended. Instead, use [role chaining](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_terms-and-concepts.html#iam-term-role-chaining).
To use role chaining, you first need to set up IAM roles in AWS:

**Step 1**. Create an instance profile role with a policy that allows it to assume every resource role that should be
assumable from the Hopsworks cluster.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeDataRoles",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": [
        "arn:aws:iam::123456789011:role/test-role",
        "arn:aws:iam::xxxxxxxxxxxx:role/s3-role",
        "arn:aws:iam::xxxxxxxxxxxx:role/dev-s3-role",
        "arn:aws:iam::xxxxxxxxxxxx:role/redshift"
      ]
    }
  ]
}
```
<figcaption>Example policy for assuming four roles.</figcaption>
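
A policy like the one above can also be generated instead of written by hand. The following is a minimal sketch, assuming the resource role ARNs are kept in a Python list. The helper name `build_assume_role_policy` and the ARNs are illustrative and are not part of Hopsworks or of any AWS tool.

```python
import json


def build_assume_role_policy(role_arns):
    """Return an IAM policy document that allows sts:AssumeRole on the given roles."""
    if not role_arns:
        raise ValueError("at least one role ARN is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AssumeDataRoles",
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": list(role_arns),
            }
        ],
    }


# Placeholder ARNs; replace them with the resource roles in your account.
policy = build_assume_role_policy(
    [
        "arn:aws:iam::123456789011:role/test-role",
        "arn:aws:iam::123456789011:role/s3-role",
    ]
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor of the instance profile role.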

**Step 2**. Create the resource roles, then edit each role's trust relationship and add a policy document that allows
the instance profile role to assume it.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::xxxxxxxxxxxx:role/instance-profile"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```
<figcaption>Example policy document.</figcaption>
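
Because both sides must agree (the instance profile policy from step 1 and the trust relationship from step 2), it can help to check a trust policy document before saving it. The sketch below is a simplified check under stated assumptions: the function name `allows_assume_role` is hypothetical, and the check ignores wildcards, conditions and `Deny` statements, so it is not a substitute for AWS's own policy evaluation.

```python
def allows_assume_role(trust_policy, principal_arn):
    """Return True if an Allow statement lets principal_arn call sts:AssumeRole."""
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            # For example "Principal": "*"; wildcards are outside this simplified check.
            continue
        # Both "Action" and "Principal.AWS" may be a single string or a list.
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principals = principal.get("AWS", [])
        if isinstance(principals, str):
            principals = [principals]
        if "sts:AssumeRole" in actions and principal_arn in principals:
            return True
    return False


# Placeholder account id and role names.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789011:role/instance-profile"},
            "Action": "sts:AssumeRole",
        }
    ],
}
print(allows_assume_role(trust_policy, "arn:aws:iam::123456789011:role/instance-profile"))  # True
print(allows_assume_role(trust_policy, "arn:aws:iam::123456789011:role/other-role"))  # False
```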

Role chaining allows the instance profile to assume any role listed in the policy attached in step 1. To limit access to
IAM roles, you can create a per-project mapping from the admin page in Hopsworks.
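
To make the mechanism concrete, the sketch below shows what a chained call to AWS STS looks like. It is an illustration, not the Hopsworks implementation: `assume_resource_role` and `FakeStsClient` are hypothetical names, and the fake client only stands in for a real STS client so that the example runs without AWS access. The `assume_role` keyword arguments and the shape of the `Credentials` response follow the STS `AssumeRole` API as exposed by boto3.

```python
def assume_resource_role(sts_client, role_arn, session_name="hopsworks-session"):
    """Assume role_arn through the given STS client and return temporary credentials."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=3600,  # one hour
    )
    credentials = response["Credentials"]
    return {
        "access_key_id": credentials["AccessKeyId"],
        "secret_access_key": credentials["SecretAccessKey"],
        "session_token": credentials["SessionToken"],
    }


class FakeStsClient:
    """In-memory stand-in for an STS client, used only to make the sketch runnable."""

    def assume_role(self, RoleArn, RoleSessionName, DurationSeconds):
        self.last_request = (RoleArn, RoleSessionName, DurationSeconds)
        return {
            "Credentials": {
                "AccessKeyId": "EXAMPLEKEYID",
                "SecretAccessKey": "example-secret",
                "SessionToken": "example-token",
            }
        }


client = FakeStsClient()
temporary = assume_resource_role(client, "arn:aws:iam::123456789011:role/s3-role")
print(sorted(temporary))
```

On an EC2 instance with boto3 installed, passing `boto3.client("sts")` instead of the fake client would perform the call with the instance profile credentials. AWS limits role-chained sessions to a maximum of one hour, which is why the sketch does not request a longer duration.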

<figure>
<a href="../../assets/images/admin/iam-role/cluster-settings.png">
<img src="../../assets/images/admin/iam-role/cluster-settings.png" alt="Role Chaining"/>
</a>
<figcaption>Role Chaining</figcaption>
</figure>

Click on your name in the top right corner of the navigation bar and choose _Cluster Settings_ from the dropdown menu.
In the _IAM Role Chaining_ tab of the Cluster Settings you can configure the mappings between projects and IAM roles.
To add a mapping, enter the project name, the project roles that may access the cloud role, and the role ARN.
Optionally, you can set a role mapping as the default by marking the _default_ checkbox. A Data owner in the project can
later change the default roles from the project settings.

<figure>
<a href="../../assets/images/admin/iam-role/new-role-chaining.png">
<img src="../../assets/images/admin/iam-role/new-role-chaining.png" alt="Create Role Chaining"/>
</a>
<figcaption>Create Role Chaining</figcaption>
</figure>

Any member of a project can then go to the _Project Settings_ ->
[Assuming IAM Roles](../compute/project/iamRoleChaining.md) page to see which roles they can assume.
11 changes: 6 additions & 5 deletions docs/admin/services.md
@@ -5,8 +5,8 @@ You can find the Services page by clicking on your name, in the top right corner
_Cluster Settings_ from the dropdown menu and going to the _Services_ tab.

<figure>
-<a href="../../assets/images/admin/services/full.png">
-<img src="../../assets/images/admin/services/full.png" alt="services page" />
+<a href="../../assets/images/admin/services/services-page.png">
+<img src="../../assets/images/admin/services/services-page.png" alt="services page" />
</a>
<figcaption>Services page</figcaption>
</figure>
@@ -16,7 +16,8 @@ It provides information about their status as reported by agents that monitor th
Systemd units.

Columns in the services table represent machines in your cluster. Each service running on a machine will have a status
-_running_ (green), _stopped_ (gray), or _bad health_ (red).
+_running_ (green) or _stopped_ (red). If a service is not installed on a machine it will have a status _not installed_
+(gray).
Services are divided into groups, and you can search for a service by its name or group. You can also search for
machines by their host name.

@@ -29,8 +30,8 @@ machines by their host name.

After you find the correct service you will be able to **start**, **stop** or **restart** it, by clicking on its status.
<figure>
-<a href="../../assets/images/admin/services/start.png">
-<img src="../../assets/images/admin/services/start.png" alt="start services" />
+<a href="../../assets/images/admin/services/services-start.png">
+<img src="../../assets/images/admin/services/services-start.png" alt="start services" />
</a>
<figcaption>Start, Stop and Restart a service</figcaption>
</figure>
Binary file removed docs/assets/images/admin/services/full.png
Binary file not shown.
Binary file modified docs/assets/images/admin/services/services.png
Binary file removed docs/assets/images/admin/services/start.png
Binary file not shown.
Binary file added docs/assets/images/iam-role/project-settings.png
44 changes: 44 additions & 0 deletions docs/compute/project/iamRoleChaining.md
@@ -0,0 +1,44 @@
# Assuming AWS IAM Roles
When deploying Hopsworks on EC2 instances, you might need to assume different roles to access resources on AWS.
These roles are configured in AWS and mapped to a project in Hopsworks. For a guide on how to configure this, see
[AWS IAM Role Chaining](../../admin/iamRoleChaining.md).

After an administrator has configured role mappings in Hopsworks, you can see the roles you can assume in the
_IAM Role Chaining_ tab of the Project Settings.
<figure>
<a href="../../../assets/images/iam-role/project-settings.png">
<img src="../../../assets/images/iam-role/project-settings.png" alt="Role Chaining"/>
</a>
<figcaption>Role Chaining</figcaption>
</figure>

You can then use the [Hops Python library](https://hops-py.logicalclocks.com/) or the
[Hops Java/Scala library](https://github.com/logicalclocks/hops-util) to assume the roles listed in your project’s settings page.

When calling _assume\_role_, you can pass the role ARN string directly or obtain it from the _get\_role_ method, which
takes the role id as an argument. If a default role is assigned for your project, you can call _assume\_role_ without arguments.

If you are a Data owner in a project, you can assign a default role to that project by clicking the _default_
checkbox of the role you want to make the default. You can set one default per project role. If defaults are set both
for a project role (Data scientist or Data owner) and for all members (ALL), the default set for the project role takes
precedence over the default set for all members.
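
The precedence rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here: the `RoleMapping` class and the `resolve_default` function are hypothetical and do not mirror Hopsworks internals, and the ARNs are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RoleMapping:
    role_arn: str
    project_role: str  # "Data owner", "Data scientist" or "ALL"
    is_default: bool = False


def resolve_default(mappings: List[RoleMapping], member_role: str) -> Optional[str]:
    """Return the default role ARN for a member, or None if no default applies."""
    # A default set for the member's own project role takes precedence.
    for mapping in mappings:
        if mapping.is_default and mapping.project_role == member_role:
            return mapping.role_arn
    # Otherwise fall back to a default set for all members.
    for mapping in mappings:
        if mapping.is_default and mapping.project_role == "ALL":
            return mapping.role_arn
    return None


mappings = [
    RoleMapping("arn:aws:iam::123456789011:role/s3-role", "ALL", is_default=True),
    RoleMapping("arn:aws:iam::123456789011:role/dev-s3-role", "Data owner", is_default=True),
]
print(resolve_default(mappings, "Data owner"))      # arn:aws:iam::123456789011:role/dev-s3-role
print(resolve_default(mappings, "Data scientist"))  # arn:aws:iam::123456789011:role/s3-role
```

In this example a Data owner gets the role-specific default, while a Data scientist falls back to the default set for all members.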

###### python
```python
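from hops.credentials_provider import get_role, assume_role

# Assume the role with id 1; this also configures Spark's S3 access.
credentials = assume_role(role_arn=get_role(1))
# `spark` is assumed to be an existing SparkSession.
spark.read.csv("s3a://resource/test.csv").show()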
```

###### scala
```scala
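import io.hops.util.CredentialsProvider

// Assume the role with id 1; this also configures Spark's S3 access.
val creds = CredentialsProvider.assumeRole(CredentialsProvider.getRole(1))
// `spark` is assumed to be an existing SparkSession.
spark.read.csv("s3a://resource/test.csv").show()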
```

The _assume\_role_ method sets the Spark Hadoop configuration that allows Spark to read S3 buckets. The code examples
above show how to read from an S3 bucket using Python and Scala.

The method also sets environment variables **AWS_ACCESS_KEY_ID**, **AWS_SECRET_ACCESS_KEY** and
**AWS_SESSION_TOKEN** so that programs running in the container can use the credentials for the newly assumed role.
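
For programs that do not go through Spark, the credentials can be read back from these environment variables. The following is a minimal sketch using only the Python standard library; the function name `credentials_from_environment` is illustrative, and the example passes a plain dictionary with placeholder values in place of the real environment.

```python
import os

REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def credentials_from_environment(environ=os.environ):
    """Return the temporary credentials found in environ, or raise if any is missing."""
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARIABLES}


# Placeholder values standing in for what assume_role would have exported.
example_environment = {
    "AWS_ACCESS_KEY_ID": "EXAMPLEKEYID",
    "AWS_SECRET_ACCESS_KEY": "example-secret",
    "AWS_SESSION_TOKEN": "example-token",
}
found = credentials_from_environment(example_environment)
print(sorted(found))
```

AWS SDKs such as boto3 read these three variables by default, so an explicit lookup like this is only needed when handing the credentials to a tool that does not.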
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -27,6 +27,7 @@ nav:
- Project-based Multi-tenancy: compute/project/multiTenancy.md
- Delete a Project: compute/project/deleteProject.md
- Project Name Reserved Words: compute/project/reservedNames.md
- Assuming IAM Roles: compute/project/iamRoleChaining.md
- Python: compute/python.md
- Jupyter: compute/jupyter.md
- Jobs: compute/jobs.md
@@ -51,6 +52,7 @@ nav:
- User Management: admin/user.md
- Configure Alerts: admin/alert.md
- Manage Services: admin/services.md
- IAM Role Chaining: admin/iamRoleChaining.md
- Hopsworks.ai: https://docs.hopsworks.ai/hopsworks-cloud/latest/
- Examples: https://examples.hopsworks.ai/
- Community: https://community.hopsworks.ai/