HowTo access HCP’s S3 interface

This tiny HowTo is not intended to teach you how to code against Amazon S3 or compatible services; it is a short guide on how to talk to HCP’s S3-compatible interface (HS3).

Prerequisites

This is for Python 3 using the boto3 package (at the time of writing, we used boto3 version 1.3.0).

Make sure you have at least Python 3.4 installed, then use pip to install the boto3 package (you might want to consider using a virtual environment):

$ pip install boto3
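To confirm the environment before going further, a quick check along these lines may help. This is only a sketch: the helper names `python_ok` and `boto3_installed` are made up for illustration, and the 3.4 minimum is simply the one stated above.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 4)  # minimum version stated in this HowTo


def python_ok(version_info=None):
    """Return True if the given (or the running) interpreter meets MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def boto3_installed():
    """Return True if the boto3 package can be located, without importing it."""
    return importlib.util.find_spec('boto3') is not None


if __name__ == '__main__':
    print('Python version OK:', python_ok())
    print('boto3 installed:  ', boto3_installed())
```

If the second line reports `False`, the pip command was most likely run for a different interpreter or outside the active virtual environment.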

Prerequisites on HCP:

  • MAPI enabled on System level

  • A Tenant to be used as HS3 endpoint

  • MAPI enabled within that Tenant

  • Properly set up Namespace defaults

    • Default size for new Namespaces (Buckets)

    • Versioning enabled

  • A local user account within the Tenant having Allow namespace management enabled

Basic Usage

Let’s have a look at this piece of sample code; the highlighted lines are crucial and commented below:

 1  import boto3
 2  from botocore.utils import fix_s3_host
 3  from botocore.vendored.requests.packages.urllib3 import disable_warnings
 4
 5  if __name__ == '__main__':
 6
 7      disable_warnings()
 8
 9      s3 = boto3.resource(service_name='s3',
10                          endpoint_url='https://s3erlei.hcp72.archivas.com',
11                          verify=False)
12
13      s3.meta.client.meta.events.unregister('before-sign.s3', fix_s3_host)
14
15      # create a new bucket
16      s3.create_bucket(Bucket='test')
17
18      # list existing buckets
19      for b in s3.buckets.all():
20          print(b.name)
21      print('-' * 60)
22
23      # store two objects
24      s3.Object('test', 'hello.txt').put(Body=b"I'm a test file")
25      s3.Object('test', 'bin.txt').put(Body=b"0123456789abcdef"*10000)
26
27      # read an object and print its content
28      obj = s3.Object(bucket_name='test', key='hello.txt')
29      response = obj.get()
30      print(response['Body'].read())

By default, boto3 tries to use Amazon’s S3 service, so we need to specify our private endpoint (lines 9 to 11) when creating the resource. Unfortunately, this isn’t enough; we also need to disable the mechanism that always tries to redirect us to Amazon S3 (lines 2 and 13) when it comes to object access.

BTW, boto3 outputs a warning if SSL certificate verification is switched off. To suppress this warning, use the code on lines 3 and 7. But take it from me, you really shouldn’t do so!
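Rather than switching verification off, the `verify` parameter of `boto3.resource()` also accepts the path to a CA bundle file. The sketch below assumes you have exported the CA certificate(s) for your HCP system to a PEM file; the helper names `resource_kwargs` and `connect` and the file path are hypothetical.

```python
def resource_kwargs(endpoint_url, ca_bundle=None):
    """Build keyword arguments for boto3.resource().

    ca_bundle: path to a PEM file holding the CA certificate(s) that signed
    the endpoint's certificate (a hypothetical path, supplied by you).
    If omitted, verification stays on and uses the default trust store.
    """
    kwargs = {'service_name': 's3', 'endpoint_url': endpoint_url}
    if ca_bundle is not None:
        kwargs['verify'] = ca_bundle
    return kwargs


def connect(endpoint_url, ca_bundle=None):
    """Create the S3 resource with certificate verification left enabled."""
    import boto3  # imported here so resource_kwargs() works without boto3
    return boto3.resource(**resource_kwargs(endpoint_url, ca_bundle))
```

A call such as `connect('https://s3erlei.hcp72.archivas.com', ca_bundle='/path/to/hcp-ca.pem')` would then replace lines 9 to 11 of the sample, and lines 3 and 7 become unnecessary because no warning is raised. The `unregister` call on line 13 is still needed afterwards.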

Tip

For a more in-depth example of how to interact with Amazon S3 (and compatible storage services, including HCP), have a look at the HS3 Shell.

ACLs

Besides using canned Access Control Lists, ACLs are often set by PUTting a request that has an XML-formatted body. While its format is defined by a specific XSD published by Amazon, the AWS S3 service seems not to care too much about adherence when it comes to ACLs…

HCP, on the other hand, insists on adherence to the XSD, which will likely lead apps using boto3 into errors when PUTting ACLs.

The reason for this lies in the dict()s you have to provide to the respective method calls in boto3. On older Python versions, such as the 3.4 this HowTo targets, these are ordered by hash instead of by the order in which the keys were added, so the generated XML elements may not appear in the sequence the XSD requires. The solution is to use collections.OrderedDict()s instead of dict()s, and to fill them with the key-value pairs in the order defined by the mentioned XSD. Example:

from collections import OrderedDict as OD

owner = OD([('ID', 'myID')
            ])
# each grant, and each Grantee within it, is an OrderedDict of its own
grants = [OD([('Grantee', OD([('Type', 'AmazonCustomerByEmail'),
                              ('EmailAddress', 'testuser'),
                              ])),
              ('Permission', 'READ')
              ]),
          OD([('Grantee', OD([('Type', 'Group'),
                              ('URI',
                               'http://acs.amazonaws.com/groups/global/AllUsers')
                              ])),
              ('Permission', 'READ')
              ])
          ]

acp = OD([('Owner', owner),
          ('Grants', grants)
          ])

# 'acl' is assumed to be an ACL resource obtained earlier,
# e.g. acl = s3.Bucket('test').Acl()
acl.put(AccessControlPolicy=acp)
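To avoid repeating the nested OrderedDict literals, the construction can be wrapped in small helpers. This is a sketch only: `make_grant` and `make_acp` are hypothetical names, and the key order simply mirrors the example above rather than being checked against the XSD.

```python
from collections import OrderedDict


def make_grant(grantee_type, id_key, id_value, permission):
    """Return one grant with its keys in the order used in the example above."""
    grantee = OrderedDict([('Type', grantee_type), (id_key, id_value)])
    return OrderedDict([('Grantee', grantee), ('Permission', permission)])


def make_acp(owner_id, grants):
    """Return an access control policy: Owner first, then Grants."""
    owner = OrderedDict([('ID', owner_id)])
    return OrderedDict([('Owner', owner), ('Grants', list(grants))])


acp = make_acp('myID', [
    make_grant('AmazonCustomerByEmail', 'EmailAddress', 'testuser', 'READ'),
    make_grant('Group', 'URI',
               'http://acs.amazonaws.com/groups/global/AllUsers', 'READ'),
])

# Show the key order that will drive the order of the XML elements.
print(list(acp.keys()))
print(list(acp['Grants'][0]['Grantee'].keys()))
```

The resulting `acp` can be handed to `put(AccessControlPolicy=acp)` exactly as in the example above.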

More to learn

For everything else, I refer you to the boto3 documentation and the example code of the hs3sh project.