MareArts Computer Vision Study.: The list_objects_v2 function returns up to 1000 objects by default. To read all the contents in the bucket, you can use pagination.

3/30/2023

The list_objects_v2 function returns at most 1000 objects per call. To read all the contents in the bucket, you need to use pagination.

Refer to the code below.

You can change '.json' to suit your case.

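import boto3

def get_origin_fn_list(ORIGIN_DATA_S3, ORIGIN_DATA_S3_prefix):
    """Return {uid: key} for every '.json' object under the given prefix."""
    s3 = boto3.client('s3')
    # The paginator follows continuation tokens, so every page is visited.
    paginator = s3.get_paginator('list_objects_v2')
    origin_path = {}
    total = 0  # number of objects seen across all pages

    for response in paginator.paginate(Bucket=ORIGIN_DATA_S3, Prefix=ORIGIN_DATA_S3_prefix):
        # 'Contents' is missing when a page has no matching objects.
        for obj in response.get('Contents', []):
            total += 1
            path = obj['Key']
            # endswith() compares the full 5-character suffix ([-4:] would never match '.json').
            if path.endswith('.json'):
                # Assumes keys look like .../<uid>/<file>.json; the parent folder is the uid.
                uid = path.split('/')[-2]
                origin_path[uid] = path

    # Matched .json files / total objects listed (counted in the loop, no second listing pass).
    print(f"get kv.json list: {len(origin_path)}/{total}")
    return origin_path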
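
If you are curious what pagination does behind the scenes, here is a minimal sketch of the same loop written by hand with ContinuationToken. The function name list_all_keys and the idea of passing the client in as an argument are my own assumptions for illustration. The list_objects_v2 parameters and response fields (Contents, IsTruncated, NextContinuationToken) are the ones boto3 provides.

```python
def list_all_keys(s3_client, bucket, prefix=""):
    """Collect every key under a prefix by following continuation tokens.

    s3_client is expected to behave like boto3.client('s3');
    list_all_keys itself is a hypothetical helper, not part of boto3.
    """
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}

    while True:
        response = s3_client.list_objects_v2(**kwargs)

        # 'Contents' is missing when nothing matches, so fall back to an empty list.
        for obj in response.get("Contents", []):
            keys.append(obj["Key"])

        # IsTruncated is True while more pages remain.
        if not response.get("IsTruncated"):
            break

        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]

    return keys
```

You would call it as list_all_keys(boto3.client('s3'), ORIGIN_DATA_S3, ORIGIN_DATA_S3_prefix). In most cases the paginator is the simpler choice, since it handles the token for you.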

Thank you.

🙇🏻‍♂️

www.marearts.com
