utils
Helper functions
Simple class that allows streaming reads from GZip files (from https://gist.github.com/beaufour/4205533).
Return string representation of bytes.
Remove leading “/” from object_name.
A generator that splits an array into chunks of desired byte size.
Return byte size of file-object.
Get AWS keys from environment variables if available.
Get AWS keys from default S3fs location if available.
Read AWS keys from S3fs configuration or environment variables.
Return the size of the S3 object in MB.
Check string to see if it has any glob magic.
Check if string has non-trivial glob pattern.
Check string to see if it has trivial glob magic (e.g. “path/*”).
Make the path behave as expected when querying S3 with list_objects.
Return the name of all objects in a list.
Join two or more pathname components, inserting SEPARATOR as needed.
Print name, size, and creation date of objects in list.
Fill a numpy n-d array with file-like object contents.
Remove leading “/” from a string.
Convert a URI to a bucket, object name tuple.
Clean URL names from a list.
GzipInputStream
- class cottoncandy.utils.GzipInputStream(fileobj, block_size=16384)
Bases: object
Simple class that allows streaming reads from GZip files (from https://gist.github.com/beaufour/4205533).
Python 2.x gzip.GzipFile relies on .seek() and .tell(), so it doesn’t support streaming reads (see http://bo4.me/YKWSsL).
Adapted from: http://effbot.org/librarybook/zlib-example-4.py
- __init__(fileobj, block_size=16384)
Initialize with the given file-like object.
- Parameters
fileobj (file-like object) –
- next()
- read(size=0)
- readline()
- readlines()
- seek(offset, whence=0)
- tell()
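The streaming idea described above can be sketched with the standard library's zlib module. This is a minimal illustration under stated assumptions, not the cottoncandy implementation: the name `iter_gunzip` and its parameters are hypothetical, and only a `read(n)` method is assumed on the source object.

```python
import gzip
import io
import zlib


def iter_gunzip(fileobj, block_size=16384):
    """Yield decompressed byte chunks from a gzip-compressed file-like object.

    Hypothetical helper: only ``fileobj.read(n)`` is required, so the source
    does not need to support ``seek()`` or ``tell()``.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        block = fileobj.read(block_size)
        if not block:
            break
        data = decompressor.decompress(block)
        if data:
            yield data
    # Flush whatever is still buffered inside the decompressor.
    tail = decompressor.flush()
    if tail:
        yield tail


# Round-trip an in-memory payload in 1 KiB blocks.
payload = b"cottoncandy " * 10000
compressed = io.BytesIO(gzip.compress(payload))
restored = b"".join(iter_gunzip(compressed, block_size=1024))
```

Reading in fixed-size blocks keeps memory use bounded by `block_size` plus the decompressor's internal buffer, which is the point of streaming from a source that cannot seek.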
bytes2human
- cottoncandy.utils.bytes2human(nbytes)
Return string representation of bytes.
- Parameters
nbytes (int) – Number of bytes
- Returns
human_bytes – Human readable byte size (e.g. “10.00MB”, “1.24GB”, etc.).
- Return type
str
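A formatter of this kind could look roughly like the sketch below. It assumes 1024-based units and two decimal places, which matches the “10.00MB” example above, but the library's actual unit table and rounding are not stated here; `bytes_to_human_sketch` is a hypothetical name.

```python
def bytes_to_human_sketch(nbytes):
    """Format a byte count as a short human-readable string (illustrative only)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(nbytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit in the table.
        if value < 1024.0 or unit == units[-1]:
            return "%.2f%s" % (value, unit)
        value /= 1024.0


example_small = bytes_to_human_sketch(512)
example_mb = bytes_to_human_sketch(10 * 1024 ** 2)
```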
clean_object_name
- cottoncandy.utils.clean_object_name(input_function)
Remove leading “/” from object_name.
This is important for compatibility with S3fs. S3fs does not list objects with a “/” prefix.
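Because the function takes `input_function` as its argument, it presumably works as a decorator. The sketch below shows that pattern under that assumption. The names `strip_leading_slash` and `describe` are hypothetical, and the sketch assumes the object name is the first positional argument, which may not hold for the real library.

```python
import functools


def strip_leading_slash(func):
    """Hypothetical decorator: strip leading "/" from the object name argument."""

    @functools.wraps(func)
    def wrapper(object_name, *args, **kwargs):
        # lstrip removes every leading "/", not just the first one.
        return func(object_name.lstrip("/"), *args, **kwargs)

    return wrapper


@strip_leading_slash
def describe(object_name):
    # Stand-in for a function that would query S3 with the cleaned name.
    return "looking up %r" % object_name


cleaned = describe("/data/array.npy")
```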
generate_ndarray_chunks
- cottoncandy.utils.generate_ndarray_chunks(arr, axis=None, buffersize=104857600)
A generator that splits an array into chunks of desired byte size.
- Parameters
arr (np.ndarray) –
axis (int, None) – The axis along which to slice the array. If None is given, the array is chunked into ideal isotropic voxels.
buffersize (scalar) – Byte size of the desired array chunks
- Returns
iterator – The object yields the tuple: (chunk_coordinates, chunk_data_slice)
chunk_coordinates: Indices of the current chunk along each dimension
chunk_data_slice: Data for this chunk
- Return type
generator object
Notes
axis=None is a work in progress and only works well for near-isotropic matrices.
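To make the byte-budget arithmetic concrete for the case where an axis is given, the sketch below computes the slices along one axis from the array shape and item size alone. It is an illustration under stated assumptions, not the library's algorithm: `axis_chunk_slices` is a hypothetical helper, and it does not cover the axis=None case.

```python
def axis_chunk_slices(shape, itemsize, axis=0, buffersize=100 * 2 ** 20):
    """Yield (chunk_index, slice) pairs that cut ``axis`` into pieces of
    roughly ``buffersize`` bytes each (hypothetical helper, illustrative only).
    """
    # Bytes occupied by one index step along ``axis``.
    bytes_per_step = itemsize
    for dim, length in enumerate(shape):
        if dim != axis:
            bytes_per_step *= length
    # At least one step per chunk, even if a single step exceeds the budget.
    steps_per_chunk = max(1, int(buffersize // bytes_per_step))
    for index, start in enumerate(range(0, shape[axis], steps_per_chunk)):
        stop = min(start + steps_per_chunk, shape[axis])
        yield index, slice(start, stop)


# A (1000, 256, 256) array of 8-byte items, split along axis 0.
# 100 * 2 ** 20 equals the documented default of 104857600 bytes.
chunks = list(axis_chunk_slices((1000, 256, 256), itemsize=8, axis=0))
```

Each step along axis 0 here costs 8 × 256 × 256 = 524288 bytes, so 200 steps fit in one buffer and the array splits into five chunks. With NumPy, each slice could then index the array along the chosen axis, for example `arr[sl]` when `axis=0`.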
get_fileobject_size
- cottoncandy.utils.get_fileobject_size(file_object)
Return byte size of file-object.
- Parameters
file_object (file object) –
- Returns
nbytes
- Return type
int
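One common way to measure a seekable file object is to seek to its end and read the offset. The sketch below shows that technique; whether the library does it this way is an assumption, and `fileobject_nbytes` is a hypothetical name.

```python
import io
import os


def fileobject_nbytes(file_object):
    """Return the byte size of a seekable file object, preserving its position."""
    position = file_object.tell()
    file_object.seek(0, os.SEEK_END)  # jump to the end
    nbytes = file_object.tell()       # offset at the end equals the size in bytes
    file_object.seek(position)        # restore the caller's position
    return nbytes


buffer = io.BytesIO(b"0123456789")
buffer.read(4)
size = fileobject_nbytes(buffer)
```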
get_key_from_environ
- cottoncandy.utils.get_key_from_environ()
Get AWS keys from environment variables if available.
- Returns
ACCESS_KEY (str)
SECRET_KEY (str)
Notes
Reads the AWS_ACCESS_KEY and AWS_SECRET_KEY environment variables.
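A lookup of this kind can be sketched with `os.environ`. The helper name `keys_from_environ` is hypothetical, and returning `None` for an unset variable is an assumption, since the behaviour for missing keys is not documented here.

```python
import os


def keys_from_environ(environ=None):
    """Return (access_key, secret_key); each is None if its variable is unset.

    ``environ`` defaults to os.environ; passing a dict makes the sketch testable.
    """
    environ = os.environ if environ is None else environ
    return environ.get("AWS_ACCESS_KEY"), environ.get("AWS_SECRET_KEY")


access, secret = keys_from_environ(
    {"AWS_ACCESS_KEY": "EXAMPLEACCESS", "AWS_SECRET_KEY": "examplesecret"}
)
missing = keys_from_environ({})
```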
get_key_from_s3fs
- cottoncandy.utils.get_key_from_s3fs()
Get AWS keys from default S3fs location if available.
- Returns
ACCESS_KEY (str)
SECRET_KEY (str)
Notes
Reads ~/.passwd-s3fs to get ACCESS_KEY and SECRET_KEY.
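As a rough illustration, the sketch below parses a password file that holds one `ACCESS_KEY:SECRET_KEY` line. That layout is an assumption about the file format, `keys_from_passwd_file` is a hypothetical helper, and the example writes a temporary file instead of touching the real ~/.passwd-s3fs.

```python
import os
import tempfile


def keys_from_passwd_file(path="~/.passwd-s3fs"):
    """Return (access_key, secret_key) from an s3fs-style password file,
    or (None, None) if no usable line is found (hypothetical helper).
    """
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        return None, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blanks, comments, and lines without a separator.
            if not line or line.startswith("#") or ":" not in line:
                continue
            # Assumed layout: ACCESS_KEY:SECRET_KEY on a single line.
            access_key, secret_key = line.split(":", 1)
            return access_key, secret_key
    return None, None


# Demonstrate with a temporary file instead of the real ~/.passwd-s3fs.
with tempfile.NamedTemporaryFile("w", suffix=".passwd", delete=False) as tmp:
    tmp.write("EXAMPLEACCESS:examplesecret\n")
parsed = keys_from_passwd_file(tmp.name)
os.remove(tmp.name)
```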
get_keys
- cottoncandy.utils.get_keys()
Read AWS keys from S3fs configuration or environment variables.
- Returns
ACCESS_KEY (str)
SECRET_KEY (str)
get_object_size
- cottoncandy.utils.get_object_size(boto_s3_object)
Return the size of the S3 object in MB.
- Parameters
boto_s3_object (boto object) –
- Returns
object_size
- Return type
float (in MB)
has_trivial_magic
mk_aws_path
- cottoncandy.utils.mk_aws_path(path)
Make the path behave as expected when querying S3 with list_objects.
xxx/yyy -> xxx/yyy/
xxx/ -> xxx/
xxx -> xxx/
/ -> ‘’
‘’ -> ‘’
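The five mappings above can be reproduced with a short normalization function. This is a sketch that satisfies the listed cases, not the library's code: `mk_aws_path_sketch` is a hypothetical name, and its behaviour for inputs outside the list (such as a leading “/” before a name) is an assumption.

```python
def mk_aws_path_sketch(path):
    """Normalize ``path`` into a prefix with a single trailing "/" (illustrative)."""
    # Drop trailing slashes, then add exactly one back unless nothing is left.
    stripped = path.rstrip("/")
    return stripped + "/" if stripped else ""


examples = {
    p: mk_aws_path_sketch(p) for p in ["xxx/yyy", "xxx/", "xxx", "/", ""]
}
```

Ending every non-empty prefix with “/” means a prefix query for `xxx/` will not also match sibling names such as `xxxyyy`.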
objects2names
- cottoncandy.utils.objects2names(objects)
Return the name of all objects in a list
- Parameters
objects (list (of boto3 objects)) –
- Returns
object_names
- Return type
list (of strings)
pathjoin
- cottoncandy.utils.pathjoin(a, *p)
Join two or more pathname components, inserting SEPARATOR as needed. If any component is an absolute path, all previous path components will be discarded. An empty last part will result in a path that ends with a separator.
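The three rules described above are the same ones the standard library's `posixpath.join` follows with “/” as the separator. The example below uses `posixpath.join` as a stand-in to show them. That `pathjoin` behaves identically with SEPARATOR set to “/” is an assumption, not something confirmed here.

```python
import posixpath

# Components are joined with a single separator.
joined = posixpath.join("bucket_dir", "sub", "file.npy")

# An absolute component discards everything before it.
absolute_wins = posixpath.join("bucket_dir", "/other", "file.npy")

# An empty last part leaves a trailing separator.
trailing = posixpath.join("bucket_dir", "")
```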
print_objects
- cottoncandy.utils.print_objects(object_list)
Print name, size, and creation date of objects in list.
- Parameters
object_list (list (of boto3 objects)) –
read_buffered
- cottoncandy.utils.read_buffered(frm, to, buffersize=64)
Fill a numpy n-d array with file-like object contents.
- Parameters
frm (buffer) – Object with a read method
to (np.ndarray) – Array to which the contents will be put
remove_trivial_magic
- cottoncandy.utils.remove_trivial_magic(s)
xxx/* -> xxx/
xxx/ -> xxx/
xxx//yyy/ -> xxx//yyy/
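The mappings above amount to dropping a trailing “*” that directly follows a “/”. The sketch below reproduces the three listed cases. `remove_trivial_magic_sketch` is a hypothetical name, and inputs outside the list (for example “xxx*”) are left unchanged here as an assumption.

```python
def remove_trivial_magic_sketch(s):
    """Drop a single trailing "*" that follows a "/" (illustrative only)."""
    if s.endswith("/*"):
        return s[:-1]
    return s


results = [
    remove_trivial_magic_sketch(p) for p in ["xxx/*", "xxx/", "xxx//yyy/"]
]
```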