Enterprise-level policy enforcement with OPA (Open Policy Agent) and Gloo Edge


In this post, we’ll take you through Open Policy Agent (OPA), the reasons why it is widely used, different architectures to enable global and local policies, and a workshop where you can try it out and practice.

OPA is an open source, general-purpose policy engine. It is a small component that takes the PDP (Policy Decision Point) role and works alongside a PEP (Policy Enforcement Point), such as a proxy or gateway. In a nutshell, it is the place in your system where you can find the answer to “Can I do x action?”

 

Where OPA came from

The concepts of PEP, PDP, and others such as PAP (Policy Administration Point) and PIP (Policy Information Point) became widely used through XACML (eXtensible Access Control Markup Language), which offered ABAC (attribute-based access control). Years ago it was considered a standard.

With Cloud Native, however, XACML has become less common, and an alternative has arisen: Open Policy Agent, which simplifies the schema. Keep in mind that the concepts of PEP, PDP, etc. still apply with OPA.

Its simplified architecture looks like this:

 

Strengths of OPA

After listening to the community and how OPA is used, I would highlight two major strengths:

1. ABAC flexibility compared to RBAC

Can OPA perform RBAC (role-based access control)? Yes. But don’t stop there! When it comes to authorization, you would typically think of RBAC as an appropriate solution. Generally, when you think of a User, you link it to a Role: User Admin, User Developer, User Employee. With RBAC, you can also define hierarchies, for example: a Developer is also an Employee. This is known as Groups and Roles.

However, Role is just one of many possible attributes. Forget about the User-Role pair. You live in a world of objects and attributes, and your policies should reflect that. This is where ABAC (attribute-based access control) comes in. The intention of ABAC is to offer additional flexibility through access control based on any attribute.

To give an example:

A CI/CD pipeline in GitHub creates an environment to run tests. A policy is needed to allow/disallow the capability of creating the environment, due to quota or any other reason.

Assigning a Role to that pipeline does not make much sense. Which Role does a pipeline have? However, the pipeline itself has attributes: 

  • Time. You can create a policy which allows the pipeline to run only during a certain period of time.
  • Name. You can allow/disallow actions based on the pipeline name.
  • Repository. You can allow/disallow the action for pipelines triggered in a specific repository or branch.

As you can see, RBAC can limit you in this case, while ABAC gives you full flexibility.
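To make the pipeline example more concrete, here is a minimal Python sketch of an attribute-based decision. It only models the idea and is not how OPA evaluates policies. The class name, the repository name, the time window, and the name prefix are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class PipelineRequest:
    """Attributes of the pipeline asking to create a test environment."""
    name: str
    repository: str
    branch: str
    started_at: time


# Hypothetical policy values, chosen only for illustration.
ALLOWED_WINDOW = (time(8, 0), time(20, 0))  # creation allowed from 08:00 to 20:00
ALLOWED_REPOSITORIES = {"acme/payments"}
ALLOWED_BRANCH = "main"
BLOCKED_NAME_PREFIX = "experimental-"


def may_create_environment(request: PipelineRequest) -> bool:
    """Allow the action only if every attribute check passes."""
    start, end = ALLOWED_WINDOW
    in_window = start <= request.started_at < end
    name_ok = not request.name.startswith(BLOCKED_NAME_PREFIX)
    repo_ok = (request.repository in ALLOWED_REPOSITORIES
               and request.branch == ALLOWED_BRANCH)
    return in_window and name_ok and repo_ok


daytime = PipelineRequest("integration-tests", "acme/payments", "main", time(9, 30))
late = PipelineRequest("integration-tests", "acme/payments", "main", time(23, 0))
print(may_create_environment(daytime))  # True
print(may_create_environment(late))     # False
```

Notice that no role appears anywhere in the decision: the time, the name, and the repository are enough.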

2. Separation of responsibilities

At the beginning of this post, you read about the PEP and PDP roles. Besides those, you need a place where you can administer your policies (Rego rules in OPA) in a controlled way. This is the PAP component. Then, to bridge the worlds of Policy Operations and Policy Development, you need a PIP, which supplies the data that policies are evaluated against. This separation allows global architectures that combine global policies (across regions) with local policies (application scoped).

Global Policies

Here’s an example of a global policies architecture using S3 buckets to store the policies and a Policy Admin Tool to administer those policies:

 

As you can see, the decision of how to create and store the policies is left to another team to develop and maintain, which allows for different designs and development lifecycles.

OPA acts as an agent that relies on a pull mechanism: the OPA servers periodically pull the policies from storage. An S3 bucket can easily be replicated across regions, which makes it easy to maintain the policies globally.

 

Local Policies

Local policies are scoped to the application. This covers the level of authorization that only the application developers know.

Traditionally, developers are in charge of defining these policies. The information is taken from the application itself, making the policy enforcement “local”.

The architecture would look like this:

As you can see, this separation lets you define the other parts of the decision flow however you want, which fits well with architectures in a Cloud Native world.

 

Try OPA in this hands-on tutorial

Now that you have the context, let’s get this working. In this step-by-step tutorial, you will deploy Gloo Edge and all the components necessary to perform Authorization (AuthZ) with OPA, using one of the architectures presented above.

The scenario is that you offer a service which is behind Gloo Edge. This is integrated with OPA for enterprise policy decisions.

In the tutorial, you will apply global and local policies.

  • Global policy: The requests need to contain a header called token. This would be a policy that all services in the system should follow.
  • Local policy: OPA will call the upstream service to fetch the quota usage of the tenants and the limit. The tenant is given in a header. The policy will ensure that the tenant’s current usage is lower than the limit.

In a real scenario, the tenant usage should vary. But in this tutorial it is static and hardcoded:

  • Tenant Antonio has used 2 units
  • Tenant Pedro has used 3 units
  • Quota limit is 3 units

Therefore, Antonio is able to access the resources, since his 2 units are below the limit of 3 units.

Pedro, on the other hand, has reached the limit, so his request will return 403 Forbidden.
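The two policies and the expected outcomes can be sketched in plain Python before touching the cluster. This is only a model of the decision logic, not the Rego used later in the tutorial. The function names mirror the rule names used later, the nested input shape follows the path the tutorial's rules read headers from (attributes.request.http.headers), and the helper names make_input and status_code are assumptions made for this sketch.

```python
# Static quota document, as served by the demo service at /quota.
QUOTA = {"limit": 3, "current": {"antonio": 2, "pedro": 3}}


def make_input(headers):
    """Build the nested input shape that the policies read headers from."""
    return {"attributes": {"request": {"http": {"headers": headers}}}}


def token_header_is_mandatory(input_doc):
    """Global policy: the request must carry a header called token."""
    return "token" in input_doc["attributes"]["request"]["http"]["headers"]


def quota_limit(input_doc, quota):
    """Local policy: the tenant's current usage must be below the limit."""
    headers = input_doc["attributes"]["request"]["http"]["headers"]
    current = quota["current"].get(headers.get("tenant"))
    # An unknown or missing tenant has no usage entry, so the check fails.
    return current is not None and current < quota["limit"]


def status_code(headers, quota=QUOTA):
    """Return 200 when both policies hold, 403 otherwise."""
    input_doc = make_input(headers)
    allowed = token_header_is_mandatory(input_doc) and quota_limit(input_doc, quota)
    return 200 if allowed else 403


print(status_code({"tenant": "antonio", "token": "xxx"}))  # 200
print(status_code({"tenant": "antonio"}))                  # 403
print(status_code({"tenant": "pedro", "token": "xxx"}))    # 403
```

The comparison is strict (current < limit), which is why Pedro, who sits exactly at the limit, is rejected.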

Your architecture will look like this:

 

A: Asynchronously, OPA is configured to pull Policies and Data from the S3 bucket

1: Gloo Edge extAuth filter is configured to connect to OPA

2: OPA is configured to pull Data from the upstream service

 

Tutorial prerequisites

First, let’s make sure Gloo Edge is working smoothly:

glooctl check

Make sure that there are no VirtualServices:

kubectl get vs -A

This shouldn’t show any results.

Apply the upstream service. In this case, it will be a static web server exposing two endpoints: /endpoint and /quota.

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
 name: demo
 labels:
   app: demo
spec:
 replicas: 1
 selector:
   matchLabels:
     app: demo
 template:
   metadata:
     labels:
       app: demo
   spec:
     containers:
     - name: apache
       image: httpd
       lifecycle:
         postStart:
           exec:
             command:
               - "/bin/sh"
               - "-c"
               - |
                 echo '{"here": "my-content"}' > htdocs/endpoint
                 echo '{"limit": 3, "current": {"antonio": 2, "pedro": 3}}' > htdocs/quota
       ports:
       - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
 name: demo
spec:
 ports:
 - port: 80
   protocol: TCP
   targetPort: 80
 selector:
   app: demo
 type: ClusterIP
EOF

Install MinIO, which offers S3-compatible storage. The command below assumes that the minio Helm repository has already been added:

helm upgrade --install minio minio/minio -n minio --create-namespace --version 8.0.10 -f - <<EOF
accessKey: admin
secretKey: admin123
persistence:
 enabled: false
defaultBucket:
 enabled: true
 name: opa-policies
EOF

Create an OPA bundle.

OPA can download bundles of policies and data from a remote location. In this example, you only use policies.

A bundle is a compressed file with a specific structure:

bundle
├── policies
│   ├── global
│   │   └── my_enterprise_rule.rego
│   └── local
│       └── my_app_rule.rego
└── .manifest

As you saw at the beginning of the post, administering policies is the job of the PAP. Its implementation is left for the reader to develop. In this tutorial, you will do it manually. OPA expects the bundle to follow this specific structure:

mkdir -p bundle/policies/global
mkdir -p bundle/policies/local

cat << EOF > bundle/policies/global/my_enterprise_rule.rego
package policies.global.my_enterprise

import input.attributes.request.http as http_request

token_header_is_mandatory {
   http_request.headers["token"]              # It must have this header
}
EOF

cat << EOF > bundle/policies/local/my_app_rule.rego
package policies.local.my_app

import input.attributes.request.http as http_request

quota_limit {
 response := http.send({
     "method" : "GET",
     "url": "http://demo.default.svc/quota"
 })

 response.status_code == 200              # The request to the upstream needs to be 200
 tenant = http_request.headers["tenant"]   
 body = json.unmarshal(response.raw_body)
 body.current[tenant] < body.limit        # The current value needs to be less than the quota limit
}
EOF

cat << EOF > bundle/.manifest
{
   "roots": ["policies"]
}
EOF


opa build --bundle bundle
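If you want to double-check what ended up inside the archive, a short Python sketch like the one below lists the files stored in a gzip-compressed tar. To stay self-contained, it builds a small in-memory archive that mimics the tutorial's layout. The helper names (normalize, bundle_members, sample_bundle) are made up for this sketch, and the real bundle.tar.gz may contain additional files.

```python
import io
import tarfile


def normalize(name):
    """Drop a leading './' or '/' so member names compare cleanly."""
    if name.startswith("./"):
        name = name[2:]
    return name.lstrip("/")


def bundle_members(fileobj):
    """Return the sorted file paths stored in a gzip-compressed tar bundle."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        return sorted(normalize(m.name) for m in tar.getmembers() if m.isfile())


def sample_bundle():
    """Build an in-memory archive that mimics the tutorial's layout."""
    files = {
        ".manifest": b'{"roots": ["policies"]}',
        "policies/global/my_enterprise_rule.rego": b"package policies.global.my_enterprise\n",
        "policies/local/my_app_rule.rego": b"package policies.local.my_app\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for path, content in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    buffer.seek(0)
    return buffer


for member in bundle_members(sample_bundle()):
    print(member)
```

To inspect the file produced in this step, open it in binary mode, for example with open("bundle.tar.gz", "rb"), and pass the file object to bundle_members.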

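Push the bundle to the MinIO (S3) bucket. Make sure the port-forward process is killed after the upload finishes:

kubectl port-forward svc/minio -n minio 9000 >/dev/null 2>&1 &
PF_PID=$!   # remember the port-forward process so only that one is stopped
sleep 1
mc -q alias set minio http://localhost:9000 admin admin123
# The Helm chart may already have created the bucket, so do not fail if it exists
mc -q mb --ignore-existing minio/opa-policies
mc -q cp bundle.tar.gz minio/opa-policies/bundle.tar.gz
kill $PF_PID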

You can now see the bundle through the browser. Use port-forward:

kubectl port-forward svc/minio -n minio 9000

And then in your browser, access the web UI: http://localhost:9000/minio/opa-policies/

You will be prompted to enter credentials. You can find these values in the helm command you used to install minio:

Access Key: admin

Secret Key: admin123

You should see something like this:

Install OPA. In this example, you will install it standalone. However, it can also be installed as a sidecar. The reason for using the sidecar model is to reduce latency to a minimum. In that model, the containers belong to the same pod and depend on each other: if OPA is not ready, the ExtAuth server will not work either.

Note: The model you choose (sidecar or standalone) directly impacts the whole solution. Make sure you analyze which option better fits your requirements.

kubectl create ns opa

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
 labels:
   app: opa
 name: opa
 namespace: opa
spec:
 replicas: 1
 selector:
   matchLabels:
     app: opa
 template:
   metadata:
     labels:
       app: opa
   spec:
     containers:
     - image: openpolicyagent/opa:latest-envoy
       name: opa-istio
       env:
         - name: "AWS_REGION"
           value: "this_is_not_needed_by_minio"
         - name: "AWS_SECRET_ACCESS_KEY"
           value: "admin123"
         - name: "AWS_ACCESS_KEY_ID"
           value: "admin"
       ports:
         - containerPort: 9191
       args:
         - "run"
         - "--server"
         - "--config-file=/config/config.yaml"
         - "--log-level=debug"
         - "--set=decision_logs.console=true"
         - "--addr=localhost:8181"
         - "--diagnostic-addr=0.0.0.0:8282"
         - "/policy/policy.rego"
       volumeMounts:
         - mountPath: "/config"
           name: opa-istio-config
         - mountPath: "/policy"
           name: opa-policy
     volumes:
       - name: opa-istio-config
         configMap:
           name: opa-istio-config
       - name: opa-policy
         configMap:
           name: opa-policy
---
apiVersion: v1
kind: ConfigMap
metadata:
 name: opa-istio-config
 namespace: opa
data:
 config.yaml: |
   services:
     minio:
       url: http://minio.minio.svc:9000
       credentials:
         s3_signing:
           environment_credentials: {}
   bundles:
     authz:
       service: minio
       resource: opa-policies/bundle.tar.gz
       polling:
         min_delay_seconds: 10
         max_delay_seconds: 20
   plugins:
     envoy_ext_authz_grpc:
       addr: :9191
       path: mypackage/authz/allow
   decision_logs:
     console: true
---
apiVersion: v1
kind: ConfigMap
metadata:
 name: opa-policy
 namespace: opa
data:
 policy.rego: |
   package mypackage.authz

   import data.policies.global.my_enterprise as my_enterprise
   import data.policies.local.my_app as my_app

   import input.attributes.request.http as http_request

   default allow = false

   allow {
     my_enterprise.token_header_is_mandatory
     my_app.quota_limit
   }

---

apiVersion: v1
kind: Service
metadata:
 name: opa
 namespace: opa
 labels:
   app: opa
spec:
 ports:
   - port: 9191
     protocol: TCP
     name: tcp
 selector:
   app: opa
EOF

Apply the AuthConfig which will be used later by the VirtualService.

Note: OPA is defined here as passthrough auth, one of the authorization models that Gloo Edge offers.

kubectl apply -f - <<EOF
apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
 name: passthrough-auth
spec:
 configs:
 - passThroughAuth:
     grpc:
       address: opa.opa.svc.cluster.local:9191
       connectionTimeout: 30s
EOF

Create the Upstream for the demo service and the VirtualService that routes to it:

kubectl apply -f - <<EOF
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
 name: my-static-upstream
 namespace: default
spec:
 kube:
   selector:
     app: demo
   serviceName: demo
   serviceNamespace: default
   servicePort: 80

---

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
 name: demo
spec:
 virtualHost:
   domains:
     - '*'
   options:
     extauth:
       configRef:
         name: passthrough-auth
         namespace: default
   routes:
     - matchers:
         - prefix: /
       routeAction:
           single:
             upstream:
               name: my-static-upstream
               namespace: default
EOF

 

Test it out!

After all the installations, let’s test and see the results.

Open a port-forward to the proxy:

kubectl -n gloo-system port-forward svc/gateway-proxy 8080:80

And let’s do the positive test: Tenant Antonio did not reach the limit and the request contains a token header:

curl -w 'Status code: %{http_code}' -XGET localhost:8080/endpoint -H "tenant: antonio" -H "token: xxx"

You should retrieve:

{"here": "my-content"}
Status code: 200

Positive tests are always easy. Now for the negative tests, where a request does not meet the policy.

Let’s test the Global Policy: token header must exist:

curl -w 'Status code: %{http_code}' -XGET localhost:8080/endpoint -H "tenant: antonio"

And you should retrieve:

Status code: 403

This is because the request does not satisfy the global policy.

Now, the local policy. With the positive test, you proved that Antonio did not reach the limit. Now, let’s see Pedro.

curl -w 'Status code: %{http_code}' -XGET localhost:8080/endpoint -H "tenant: pedro" -H "token: xxx"

And you should retrieve:

Status code: 403

This proves that Pedro has reached the limit and the request was rejected by the ExtAuth filter (through OPA and the local policy).
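If you prefer to run the three checks in one go, the following Python sketch replays them and compares each status code with the expected one. It is a convenience sketch rather than part of the original workshop: the names CASES, fetch_status and run_cases are invented here, and the default URL assumes the port-forward to the proxy on localhost:8080 is still open.

```python
import urllib.error
import urllib.request

# (description, request headers, expected status code) for the three tests above.
CASES = [
    ("antonio with token", {"tenant": "antonio", "token": "xxx"}, 200),
    ("antonio without token", {"tenant": "antonio"}, 403),
    ("pedro with token", {"tenant": "pedro", "token": "xxx"}, 403),
]


def fetch_status(url, headers):
    """Send a GET request and return the HTTP status code."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return error.code


def run_cases(fetch=fetch_status, url="http://localhost:8080/endpoint"):
    """Return (description, passed) for every case, using the given fetch function."""
    return [(name, fetch(url, headers) == expected)
            for name, headers, expected in CASES]
```

With the port-forward open, calling run_cases() returns one (description, passed) pair per test. The fetch parameter exists so that the checks can also be exercised against a stand-in function without a cluster.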

 

Final thoughts on OPA

As you have seen, OPA is extremely flexible and a good component to add to your stack when you need to apply ABAC policies. The architectures are not fixed, and they can be adjusted to most scenarios.

Keep an eye out for more posts on this topic. In another workshop, I will show you how to apply these concepts to a Service Mesh like Istio.