aodncore.util package
Subpackages
Submodules
aodncore.util.fileops module
This module provides utility functions relating to various filesystem operations
- class aodncore.util.fileops.TemporaryDirectory(suffix=None, prefix=None, dir=None)[source]
Bases:
object
Create and return a temporary directory. This has the same behavior as mkdtemp but can be used as a context manager. For example:
- with TemporaryDirectory() as tmpdir:
…
Upon exiting the context, the directory and everything contained in it are removed.
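The clean-up behaviour described above can be illustrated with the standard library's tempfile.TemporaryDirectory, which has the same context-manager semantics. This is a minimal sketch that uses the standard-library class as a stand-in; it does not import aodncore, and the helper name demo_temporary_directory is hypothetical.

```python
import os
import tempfile


def demo_temporary_directory():
    # Standard-library stand-in (assumption): behaves like the class documented above.
    with tempfile.TemporaryDirectory(prefix="demo_") as tmpdir:
        scratch = os.path.join(tmpdir, "scratch.txt")
        with open(scratch, "w") as f:
            f.write("temporary data")
        existed_inside = os.path.isfile(scratch)
    # Once the block has exited, the directory and its contents are removed.
    return existed_inside, os.path.exists(tmpdir)


inside, after = demo_temporary_directory()
```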
- aodncore.util.fileops.dir_exists(path)[source]
Check whether a directory exists at a given path, returning True or False
- Parameters
path – directory to check
- aodncore.util.fileops.extract_gzip(gzip_path, dest_dir, dest_name=None)[source]
Extract a GZ (GZIP) file’s contents into a directory
- Parameters
gzip_path – path to the source GZ file
dest_dir – destination directory into which the GZ is extracted
dest_name – basename for the extracted file (defaults to the original name minus the ‘.gz’ extension)
- Returns
None
- aodncore.util.fileops.extract_zip(zip_path, dest_dir)[source]
Extract a ZIP file’s contents into a directory
- Parameters
zip_path – path to the source ZIP file
dest_dir – destination directory into which the ZIP is extracted
- Returns
None
- aodncore.util.fileops.find_file(base_path, regex)[source]
Find a file in a directory (recursively) based on a match string.
The purpose of this method is to identify a specific file, so only the first match is returned.
- Parameters
base_path – A string containing the base directory in which to recursively search.
regex – A string containing a regular expression used to identify the file.
- Returns
A string containing the full path to the matched file.
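A recursive first-match search of this kind might look like the sketch below. It is not the aodncore implementation: the name find_file_sketch is hypothetical, and matching the regex against the file name (rather than the whole path) and returning None when nothing matches are both assumptions.

```python
import os
import re
import tempfile


def find_file_sketch(base_path, regex):
    """Return the full path of the first file whose name matches regex, else None."""
    pattern = re.compile(regex)
    for root, dirs, files in os.walk(base_path):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            # assumption: the regex is tested against the file name, not the whole path
            if pattern.match(name):
                return os.path.join(root, name)
    return None  # assumption: no match yields None rather than an exception


# Usage: build a small throwaway tree and look for the NetCDF-style file in it.
with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, "a", "b")
    os.makedirs(nested)
    for name in ("notes.txt", "data_01.nc"):
        open(os.path.join(nested, name), "w").close()
    found = find_file_sketch(base, r".*\.nc$")
    expected = os.path.join(nested, "data_01.nc")
```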
- aodncore.util.fileops.get_file_checksum(filepath, block_size=65536, algorithm='sha256')[source]
Get the hash (checksum) of a file
- Parameters
filepath – path to the input file
block_size – number of bytes to hash each iteration
algorithm – hash algorithm (from the hashlib module)
- Returns
hash of the input file
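The block_size parameter suggests the file is hashed in chunks rather than read into memory at once. A minimal sketch of that technique with hashlib follows; the name file_checksum_sketch is hypothetical, and returning the digest as a hex string is an assumption.

```python
import hashlib
import os
import tempfile


def file_checksum_sketch(filepath, block_size=65536, algorithm="sha256"):
    """Hash a file in fixed-size blocks so large files never sit fully in memory."""
    hasher = hashlib.new(algorithm)  # any algorithm name known to hashlib
    with open(filepath, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for block in iter(lambda: f.read(block_size), b""):
            hasher.update(block)
    return hasher.hexdigest()  # assumption: the hash is returned as a hex string


# Usage: a tiny block size forces several iterations over a small file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
digest = file_checksum_sketch(tmp.name, block_size=4)
os.remove(tmp.name)
```

The result is independent of block_size, so the chunked digest equals the digest of the whole byte string.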
- aodncore.util.fileops.is_dir_writable(path)[source]
Check whether a directory is writable
- Parameters
path – directory path to check
- Returns
None
- aodncore.util.fileops.is_file_writable(path)[source]
Check whether a file is writable
Note
Not as reliable as the is_dir_writable() function since that actually writes a file
- Parameters
path – file path to check
- Returns
None
- aodncore.util.fileops.is_gzip_file(filepath)[source]
Check whether a file path refers to a valid GZIP file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid GZIP file, otherwise False
- aodncore.util.fileops.is_json_file(filepath)[source]
Check whether a file path refers to a valid JSON file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid JSON file, otherwise False
- aodncore.util.fileops.is_netcdf_file(filepath)[source]
Check whether a file path refers to a valid NetCDF file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid NetCDF file, otherwise False
- aodncore.util.fileops.is_nonempty_file(filepath)[source]
Check whether a file path refers to a file with length greater than zero
- Parameters
filepath – path to the file being checked
- Returns
True if the file at filepath has a non-zero length, otherwise False
- aodncore.util.fileops.is_zip_file(filepath)[source]
Check whether a file path refers to a valid ZIP file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid ZIP file, otherwise False
- aodncore.util.fileops.list_regular_files(path, recursive=False, sort_key=<functools.KeyWrapper object>)[source]
List all regular files in a given directory, returning the absolute path
- Parameters
sort_key – callable used to sort directory listings
path – input directory to list
recursive – bool flag to enable recursive listing
- Returns
iterator returning only regular files
- aodncore.util.fileops.mkdir_p(path, mode=493)[source]
Recursively create a directory, including parent directories (analogous to shell command ‘mkdir -p’)
- Parameters
mode – permission mode for the new directory
path – path to new directory
- Returns
None
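The default mode=493 in the signature is the decimal rendering of the octal permission value 0o755. The ‘mkdir -p’ behaviour can be sketched with os.makedirs as below; the name mkdir_p_sketch is hypothetical and this is not necessarily how aodncore implements it.

```python
import os
import tempfile


def mkdir_p_sketch(path, mode=0o755):
    # exist_ok=True makes the call idempotent, like 'mkdir -p'
    os.makedirs(path, mode=mode, exist_ok=True)


# Usage: create a nested path twice; the second call must not raise.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "a", "b", "c")
    mkdir_p_sketch(target)
    mkdir_p_sketch(target)  # already exists: a no-op rather than an error
    created = os.path.isdir(target)
```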
- aodncore.util.fileops.rm_f(path)[source]
Remove a file, ignoring “file not found” errors (analogous to shell command ‘rm -f’)
- Parameters
path – path to file being deleted
- Returns
None
- aodncore.util.fileops.rm_r(path)[source]
Remove a file or directory recursively (analogous to shell command ‘rm -r’)
- Parameters
path – path to file being deleted
- Returns
None
- aodncore.util.fileops.rm_rf(path)[source]
Remove a file or directory recursively, ignoring “file not found” errors (analogous to shell command ‘rm -rf’)
- Parameters
path – path to file being deleted
- Returns
None
- aodncore.util.fileops.safe_copy_file(source, destination, overwrite=False)[source]
Copy a file atomically by copying first to a temporary file in the same directory (and therefore filesystem) as the intended destination, before performing a rename (which is atomic)
- Parameters
source – source file path
destination – destination file path (will not be overwritten unless ‘overwrite’ set to True)
overwrite – set to True to allow existing destination file to be overwritten
- Returns
None
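The copy-then-rename technique described for safe_copy_file can be sketched as follows. The name safe_copy_sketch is hypothetical, and raising FileExistsError when the destination exists is an assumption, since the exception used by aodncore is not documented here.

```python
import os
import shutil
import tempfile


def safe_copy_sketch(source, destination, overwrite=False):
    if not overwrite and os.path.exists(destination):
        raise FileExistsError(destination)  # assumption: exception type is illustrative
    dest_dir = os.path.dirname(os.path.abspath(destination))
    # Create the temporary file next to the destination so the final rename
    # never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(source, tmp_path)
        os.replace(tmp_path, destination)  # atomic on POSIX within one filesystem
    except BaseException:
        # remove the partial temporary file, then re-raise the original error
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage: copy once, then confirm a second copy is refused without overwrite=True.
with tempfile.TemporaryDirectory() as base:
    src = os.path.join(base, "src.txt")
    dst = os.path.join(base, "dst.txt")
    with open(src, "w") as f:
        f.write("payload")
    safe_copy_sketch(src, dst)
    with open(dst) as f:
        copied = f.read()
    try:
        safe_copy_sketch(src, dst)
        refused = False
    except FileExistsError:
        refused = True
    leftovers = sorted(os.listdir(base))
```

Because readers only ever see the destination before or after the rename, they never observe a partially written file.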
aodncore.util.misc module
This module provides miscellaneous utility functions not related to filesystem or subprocess operations.
These are typically functions which query, manipulate or transform Python objects.
- class aodncore.util.misc.CaptureStdIO(merge_streams=False)[source]
Bases:
object
Context manager to capture stdout and stderr emitted from the block into a list. Optionally merge stdout and stderr streams into stdout.
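Capturing output from a block can be sketched with contextlib from the standard library. This is not the CaptureStdIO implementation; the helper capture_output_sketch is hypothetical and always keeps the two streams separate.

```python
import contextlib
import io
import sys


def capture_output_sketch(func):
    """Run func and return (stdout_lines, stderr_lines) as two lists of strings."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        func()
    return out.getvalue().splitlines(), err.getvalue().splitlines()


def noisy():
    print("to stdout")
    print("to stderr", file=sys.stderr)  # sys.stderr is looked up at call time


stdout_lines, stderr_lines = capture_output_sketch(noisy)
```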
- class aodncore.util.misc.LoggingContext(logger, level=None, format_=None, handler=None, close=True)[source]
Bases:
object
Context manager to allow temporary changes to logging configuration within the context of the block
- class aodncore.util.misc.Pattern
Bases:
object
Compiled regular expression object.
- findall(string, pos=0, endpos=9223372036854775807)
Return a list of all non-overlapping matches of pattern in string.
- finditer(string, pos=0, endpos=9223372036854775807)
Return an iterator over all non-overlapping matches for the RE pattern in string.
For each match, the iterator returns a match object.
- flags
The regex matching flags.
- fullmatch(string, pos=0, endpos=9223372036854775807)
Matches against all of the string.
- groupindex
A dictionary mapping group names to group numbers.
- groups
The number of capturing groups in the pattern.
- match(string, pos=0, endpos=9223372036854775807)
Matches zero or more characters at the beginning of the string.
- pattern
The pattern string from which the RE object was compiled.
- scanner(string, pos=0, endpos=9223372036854775807)
- search(string, pos=0, endpos=9223372036854775807)
Scan through string looking for a match, and return a corresponding match object instance.
Return None if no position in the string matches.
- split(string, maxsplit=0)
Split string by the occurrences of pattern.
- sub(repl, string, count=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
- subn(repl, string, count=0)
Return the tuple (new_string, number_of_subs_made) found by replacing the leftmost non-overlapping occurrences of pattern with the replacement repl.
- class aodncore.util.misc.TemplateRenderer(package='aodncore.pipeline', package_path='templates')[source]
Bases:
object
Simple template renderer
- class aodncore.util.misc.WriteOnceOrderedDict[source]
Bases:
OrderedDict
Sub-class of OrderedDict which prevents overwriting/deleting of keys once set
- clear() → None. Remove all items from od.
- pop(k[, d]) → v, remove specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem(*args, **kwargs)
Remove and return a (key, value) pair from the dictionary.
Pairs are returned in LIFO order if last is true or FIFO order if false.
- setdefault(*args, **kwargs)
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) → None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]
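The write-once idea can be sketched by overriding item assignment and deletion on an OrderedDict subclass. The class name WriteOnceSketch is hypothetical and the use of TypeError is an assumption; the exception raised by WriteOnceOrderedDict is not documented here.

```python
from collections import OrderedDict


class WriteOnceSketch(OrderedDict):
    """Illustrative only: refuse to overwrite or delete a key once it is set."""

    def __setitem__(self, key, value):
        if key in self:
            # assumption: the exception type is chosen for illustration
            raise TypeError("key {!r} is already set".format(key))
        super().__setitem__(key, value)

    def __delitem__(self, key):
        raise TypeError("keys cannot be deleted")


d = WriteOnceSketch()
d["a"] = 1
try:
    d["a"] = 2
    overwritten = True
except TypeError:
    overwritten = False
```

A complete version would also need to guard the inherited clear(), pop(), popitem(), setdefault() and update() methods, since some of them bypass __setitem__ and __delitem__.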
- aodncore.util.misc.discover_entry_points(entry_point_group, working_set=<pkg_resources.WorkingSet object>)[source]
Discover entry points registered under the given entry point group name in the given pkg_resources.WorkingSet instance
- Parameters
entry_point_group – entry point group name used to find entry points in working set
working_set – pkg_resources.WorkingSet instance
- Returns
tuple containing two elements, with the first being a dict containing each discovered and successfully loaded entry point (with keys being the entry point name and values being a reference to the object referenced by the entry point), and the second tuple element being a list of objects which failed to be loaded
- aodncore.util.misc.ensure_regex(o)[source]
Ensure that the returned value is a compiled regular expression (Pattern) from a given input, or raise if the object is not a valid regular expression
- Parameters
o – input object, a single regex (string or pre-compiled)
- Returns
Pattern instance
- aodncore.util.misc.ensure_regex_list(o)[source]
Ensure that the returned value is a list of compiled regular expressions (Pattern) from a given input, or raise if the object is not a list of valid regular expressions
- Parameters
o – input object, either a single regex or a sequence of regexes (string or pre-compiled)
- Returns
list of Pattern instances
- aodncore.util.misc.ensure_writeonceordereddict(o, empty_on_fail=True)[source]
Function to accept an object and return the WriteOnceOrderedDict representation of the object. An object which cannot be handled by the WriteOnceOrderedDict __init__ method will either result in an empty WriteOnceOrderedDict or, if ‘empty_on_fail’ is set to False, in an exception.
- Parameters
o – input object
empty_on_fail – boolean flag to determine whether an invalid object will result in a new empty WriteOnceOrderedDict being returned or the exception re-raised.
- Returns
WriteOnceOrderedDict instance
- aodncore.util.misc.format_exception(exception)[source]
Return a pretty string representation of an Exception object containing the Exception name and message
- Parameters
exception – Exception object
- Returns
string
- aodncore.util.misc.generate_id()[source]
Generate a unique id starting with non-numeric character
- Returns
unique id
- aodncore.util.misc.get_pattern_subgroups_from_string(string, regex)
Retrieve parts of a string given a compiled pattern (re.compile(pattern)). The pattern needs to match the beginning of the string (see https://docs.python.org/2/library/re.html#re.RegexObject.match), so:
there is no need to start the pattern with “^”; and
to match anywhere in the string, start the pattern with “.*”.
- Returns
dictionary of fields matching a given pattern
- aodncore.util.misc.get_regex_subgroups_from_string(string, regex)[source]
Retrieve parts of a string given a compiled pattern (re.compile(pattern)). The pattern needs to match the beginning of the string (see https://docs.python.org/2/library/re.html#re.RegexObject.match), so:
there is no need to start the pattern with “^”; and
to match anywhere in the string, start the pattern with “.*”.
- Returns
dictionary of fields matching a given pattern
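The anchoring rule above follows from the semantics of the match() method, and can be illustrated with named groups. This is a sketch only: the name regex_subgroups_sketch, the sample file name and the empty dict returned on a failed match are all assumptions.

```python
import re


def regex_subgroups_sketch(string, regex):
    """Return the named groups captured by matching regex at the start of string."""
    pattern = re.compile(regex)  # accepts a str or an already compiled pattern
    match = pattern.match(string)  # match() only tries the beginning of the string
    # assumption: a failed match yields an empty dict
    return match.groupdict() if match else {}


filename = "IMOS_ANMN_20200115_temperature.nc"  # hypothetical file name
fields = regex_subgroups_sketch(filename, r"IMOS_(?P<facility>[A-Z]+)_(?P<date>\d{8})_")
unanchored = regex_subgroups_sketch(filename, r"(?P<date>\d{8})")
anywhere = regex_subgroups_sketch(filename, r".*(?P<date>\d{8})")
```

The second call finds nothing because the string does not begin with eight digits; prefixing the pattern with “.*” lets the same group match further along.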
- aodncore.util.misc.is_function(o)[source]
Check whether a given object is a function
- Parameters
o – object to check
- Returns
True if object is a function, otherwise False
- aodncore.util.misc.is_nonstring_iterable(sequence)[source]
Check whether an object is a non-string Iterable
- Parameters
sequence – object to check
- Returns
True if object is a non-string subclass of Iterable
- aodncore.util.misc.is_valid_email_address(address)[source]
Simple email address validation
- Parameters
address – address to validate
- Returns
True if address matches the regex, otherwise False
- aodncore.util.misc.iter_public_attributes(instance, ignored_attributes=None)[source]
Get an iterator over an instance’s public attributes, including properties
- Parameters
instance – object instance
ignored_attributes – set of attribute names to exclude
- Returns
iterator over the instance’s public attributes
- aodncore.util.misc.list_not_empty(_list)[source]
Flag a list containing non-None values
- Returns
boolean - True if list contains any non-None values, otherwise False
- aodncore.util.misc.matches_regexes(input_string, include_regexes, exclude_regexes=None)[source]
Function to filter a string (e.g. file path) according to regular expression inclusions minus exclusions
- Parameters
input_string – string for comparison to the regular expressions
include_regexes – list of inclusions
exclude_regexes – list of exclusions to subtract from the list produced by inclusions
- Returns
True if the string matches one of the ‘include_regexes’ but not one of the ‘exclude_regexes’
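The include-minus-exclude logic can be sketched as below. The name matches_regexes_sketch is hypothetical, and the use of re.match (anchored at the start of the string) is an assumption about the matching semantics.

```python
import re


def matches_regexes_sketch(input_string, include_regexes, exclude_regexes=None):
    exclude_regexes = exclude_regexes or []
    # assumption: re.match semantics (anchored at the start of the string)
    included = any(re.match(r, input_string) for r in include_regexes)
    excluded = any(re.match(r, input_string) for r in exclude_regexes)
    return included and not excluded


kept = matches_regexes_sketch("data/a.nc", [r".*\.nc$"])
dropped = matches_regexes_sketch("data/a.nc", [r".*\.nc$"], [r"data/"])
unmatched = matches_regexes_sketch("readme.txt", [r".*\.nc$"])
```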
- aodncore.util.misc.merge_dicts(*args)[source]
Recursive dict merge
Dict-like objects are merged sequentially from left to right into a new dict
Based on: https://gist.github.com/angstwad/bf22d1822c38a92ec0a9
- Returns
None
- aodncore.util.misc.slice_sequence(sequence, slice_size)[source]
Return a list containing the input Sequence sliced into Sequence instances with a length equal to or less than slice_size
Note
The type of the elements should be the same type as the original sequence based on the usual Python slicing behaviour, but the outer sequence will always be a list type.
- Parameters
sequence – input sequence
slice_size – size of each sub-Sequence
- Returns
list of Sequence instances
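The slicing behaviour, including the note about element types, can be sketched with ordinary Python slices. The name slice_sequence_sketch is hypothetical.

```python
def slice_sequence_sketch(sequence, slice_size):
    # Each slice keeps the type of the input; the outer container is always a list.
    return [sequence[i:i + slice_size] for i in range(0, len(sequence), slice_size)]


list_slices = slice_sequence_sketch([1, 2, 3, 4, 5], 2)
str_slices = slice_sequence_sketch("abcdef", 4)
tuple_slices = slice_sequence_sketch((1, 2, 3), 2)
```

The last slice is shorter whenever the length of the input is not a multiple of slice_size.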
- aodncore.util.misc.str_to_list(string_, delimiter=',', strip_method='strip', include_empty=False)[source]
Return a comma-separated string as a native list, with whitespace stripped and empty strings excluded
- Parameters
string_ – input string
delimiter – character(s) used to split the string
strip_method – which strip method to use for each element (invalid method names
include_empty – boolean to control whether empty strings are included in returned list
- Returns
list representation of the given config option
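The split, strip and filter steps can be sketched as below. The name str_to_list_sketch is hypothetical, and looking up the strip method by name with getattr is an assumption; the handling of invalid method names is not covered.

```python
def str_to_list_sketch(string_, delimiter=",", strip_method="strip", include_empty=False):
    # Look up 'strip', 'lstrip' or 'rstrip' by name and apply it to each element.
    items = [getattr(part, strip_method)() for part in string_.split(delimiter)]
    return items if include_empty else [item for item in items if item]


default_result = str_to_list_sketch(" a, b ,,c ")
with_empty = str_to_list_sketch(" a, b ,,c ", include_empty=True)
left_only = str_to_list_sketch("x ; y", delimiter=";", strip_method="lstrip")
```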
- aodncore.util.misc.validate_bool(o)
- aodncore.util.misc.validate_dict(o)
- aodncore.util.misc.validate_int(o)
- aodncore.util.misc.validate_mandatory_elements(mandatory, actual, name='item')[source]
Ensure that a collection contains all the elements in a ‘mandatory’ collection of elements
- Parameters
mandatory – collection of mandatory elements
actual – collection to compare against mandatory collection
name – name of object being validated for exception message (e.g. ‘item’ or ‘section’)
- Returns
None
- aodncore.util.misc.validate_mapping(o)
- aodncore.util.misc.validate_relative_path_attr(path, path_attr)[source]
Validate a path, raising an exception containing the name of the attribute which failed
- Parameters
path – string containing the path to test
path_attr – attribute name to include in the exception message if validation fails
- Returns
None
- aodncore.util.misc.validate_string(o)
aodncore.util.process module
This module provides general purpose code for interacting with operating system subprocesses
aodncore.util.wfs module
- class aodncore.util.wfs.WfsBroker(wfs_url, version='1.1.0')[source]
Bases:
object
Simple higher level interface to a WebFeatureService instance, to provide common helper methods and standardise response handling around JSON
- get_url_property_name(layer)[source]
Get the URL property name for a given layer
- Parameters
layer – schema dict as returned by WebFeatureService.get_schema
- Returns
string containing the URL property name
- getfeature_dict(layer, ogc_expression=None, **kwargs)[source]
Make a GetFeature request, and return the response in a native dict.
- Parameters
layer – layer name supplied to GetFeature typename parameter
ogc_expression – OgcExpression used to filter the returned features. If omitted, returns all features.
kwargs – keyword arguments passed to the underlying WebFeatureService.getfeature method
- Returns
dict containing the parsed GetFeature response
- query_file_exists(layer, name)[source]
Returns a bool representing whether a given ‘file_url’ is present in a layer
- Parameters
layer – layer name supplied to GetFeature typename parameter
name – ‘file_url’ inserted into OGC filter, and supplied to GetFeature filter parameter
- Returns
whether the given file is present in the layer
- query_files(layer, ogc_expression=None, url_property_name=None)[source]
Return an IndexedSet of files for a given layer
- Parameters
layer – layer name supplied to GetFeature typename parameter
ogc_expression – OgcExpression used to filter the returned features. If omitted, all URLs are returned.
url_property_name – property name for file URL. If omitted, property name is determined from layer schema
- Returns
list of files for the layer
- url_propertyname_candidates = ('file_url', 'url')
- property wfs
Read-only property to access the instantiated WebFeatureService object directly
Note: lazily initialised because instantiating a WebFeatureService causes HTTP traffic, which is only desirable if subsequent WFS requests are actually going to be made (which isn’t always the case when instantiating this broker class)
- Returns
WebFeatureService instance
Module contents
- class aodncore.util.CaptureStdIO(merge_streams=False)[source]
Bases:
object
Context manager to capture stdout and stderr emitted from the block into a list. Optionally merge stdout and stderr streams into stdout.
- class aodncore.util.IndexedSet(other=None)[source]
Bases:
MutableSet
IndexedSet is a collections.MutableSet that maintains insertion order and uniqueness of inserted elements. It’s a hybrid type, mostly like an OrderedSet, but also list-like, in that it supports indexing and slicing.
- Args:
other (iterable): An optional iterable used to initialize the set.
>>> x = IndexedSet(list(range(4)) + list(range(8)))
>>> x
IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
>>> x - set(range(2))
IndexedSet([2, 3, 4, 5, 6, 7])
>>> x[-1]
7
>>> fcr = IndexedSet('freecreditreport.com')
>>> ''.join(fcr[:fcr.index('.')])
'frecditpo'
Standard set operators and interoperation with set are all supported:
>>> fcr & set('cash4gold.com')
IndexedSet(['c', 'd', 'o', '.', 'm'])
As you can see, the IndexedSet is almost like a UniqueList, retaining only one copy of a given value, in the order it was first added. For the curious, to see why IndexedSet does not support setting items based on index (i.e., __setitem__()), consider the following dilemma:
my_indexed_set = [A, B, C, D]
my_indexed_set[2] = A
At this point, a set requires only one A, but a list would overwrite C. Overwriting C would change the length of the list, meaning that my_indexed_set[2] would not be A, as expected with a list, but rather D. So, no __setitem__().
Otherwise, the API strives to be as complete a union of the list and set APIs as possible.
- class aodncore.util.LoggingContext(logger, level=None, format_=None, handler=None, close=True)[source]
Bases:
object
Context manager to allow temporary changes to logging configuration within the context of the block
- class aodncore.util.Pattern
Bases:
object
Compiled regular expression object.
- findall(string, pos=0, endpos=9223372036854775807)
Return a list of all non-overlapping matches of pattern in string.
- finditer(string, pos=0, endpos=9223372036854775807)
Return an iterator over all non-overlapping matches for the RE pattern in string.
For each match, the iterator returns a match object.
- flags
The regex matching flags.
- fullmatch(string, pos=0, endpos=9223372036854775807)
Matches against all of the string.
- groupindex
A dictionary mapping group names to group numbers.
- groups
The number of capturing groups in the pattern.
- match(string, pos=0, endpos=9223372036854775807)
Matches zero or more characters at the beginning of the string.
- pattern
The pattern string from which the RE object was compiled.
- scanner(string, pos=0, endpos=9223372036854775807)
- search(string, pos=0, endpos=9223372036854775807)
Scan through string looking for a match, and return a corresponding match object instance.
Return None if no position in the string matches.
- split(string, maxsplit=0)
Split string by the occurrences of pattern.
- sub(repl, string, count=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
- subn(repl, string, count=0)
Return the tuple (new_string, number_of_subs_made) found by replacing the leftmost non-overlapping occurrences of pattern with the replacement repl.
- class aodncore.util.SystemProcess(command, stdin=-1, stdout=-1, stdin_text=None, env=None, shell=False)[source]
Bases:
object
Class to encapsulate a system command, including execution, output handling and returncode handling
- class aodncore.util.TemplateRenderer(package='aodncore.pipeline', package_path='templates')[source]
Bases:
object
Simple template renderer
- class aodncore.util.TemporaryDirectory(suffix=None, prefix=None, dir=None)[source]
Bases:
object
Create and return a temporary directory. This has the same behavior as mkdtemp but can be used as a context manager. For example:
- with TemporaryDirectory() as tmpdir:
…
Upon exiting the context, the directory and everything contained in it are removed.
- class aodncore.util.WfsBroker(wfs_url, version='1.1.0')[source]
Bases:
object
Simple higher level interface to a WebFeatureService instance, to provide common helper methods and standardise response handling around JSON
- get_url_property_name(layer)[source]
Get the URL property name for a given layer
- Parameters
layer – schema dict as returned by WebFeatureService.get_schema
- Returns
string containing the URL property name
- getfeature_dict(layer, ogc_expression=None, **kwargs)[source]
Make a GetFeature request, and return the response in a native dict.
- Parameters
layer – layer name supplied to GetFeature typename parameter
ogc_expression – OgcExpression used to filter the returned features. If omitted, returns all features.
kwargs – keyword arguments passed to the underlying WebFeatureService.getfeature method
- Returns
dict containing the parsed GetFeature response
- query_file_exists(layer, name)[source]
Returns a bool representing whether a given ‘file_url’ is present in a layer
- Parameters
layer – layer name supplied to GetFeature typename parameter
name – ‘file_url’ inserted into OGC filter, and supplied to GetFeature filter parameter
- Returns
whether the given file is present in the layer
- query_files(layer, ogc_expression=None, url_property_name=None)[source]
Return an IndexedSet of files for a given layer
- Parameters
layer – layer name supplied to GetFeature typename parameter
ogc_expression – OgcExpression used to filter the returned features. If omitted, all URLs are returned.
url_property_name – property name for file URL. If omitted, property name is determined from layer schema
- Returns
list of files for the layer
- url_propertyname_candidates = ('file_url', 'url')
- property wfs
Read-only property to access the instantiated WebFeatureService object directly
Note: lazily initialised because instantiating a WebFeatureService causes HTTP traffic, which is only desirable if subsequent WFS requests are actually going to be made (which isn’t always the case when instantiating this broker class)
- Returns
WebFeatureService instance
- class aodncore.util.WriteOnceOrderedDict[source]
Bases:
OrderedDict
Sub-class of OrderedDict which prevents overwriting/deleting of keys once set
- clear() → None. Remove all items from od.
- pop(k[, d]) → v, remove specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem(*args, **kwargs)
Remove and return a (key, value) pair from the dictionary.
Pairs are returned in LIFO order if last is true or FIFO order if false.
- setdefault(*args, **kwargs)
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) → None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]
- class aodncore.util.classproperty(fget=None, doc=None, lazy=False)[source]
Bases:
property
Similar to property, but allows class-level properties. That is, a property whose getter is like a classmethod.
The wrapped method may explicitly use the classmethod decorator (which must come before this decorator), or the classmethod may be omitted (it is implicit through use of this decorator).
Note
classproperty only works for read-only properties. It does not currently allow writeable/deletable properties, due to subtleties of how Python descriptors work. In order to implement such properties on a class a metaclass for that class must be implemented.
- fget : callable
The function that computes the value of this property (in particular, the function when this is used as a decorator) a la property.
- doc : str, optional
The docstring for the property–by default inherited from the getter function.
- lazy : bool, optional
If True, caches the value returned by the first call to the getter function, so that it is only called once (used for lazy evaluation of an attribute). This is analogous to lazyproperty. The lazy argument can also be used when classproperty is used as a decorator (see the third example below). When used in the decorator syntax this must be passed in as a keyword argument.
>>> class Foo:
...     _bar_internal = 1
...     @classproperty
...     def bar(cls):
...         return cls._bar_internal + 1
...
>>> Foo.bar
2
>>> foo_instance = Foo()
>>> foo_instance.bar
2
>>> foo_instance._bar_internal = 2
>>> foo_instance.bar  # Ignores instance attributes
2
As previously noted, a classproperty is limited to implementing read-only attributes:
>>> class Foo:
...     _bar_internal = 1
...     @classproperty
...     def bar(cls):
...         return cls._bar_internal
...     @bar.setter
...     def bar(cls, value):
...         cls._bar_internal = value
...
Traceback (most recent call last):
...
NotImplementedError: classproperty can only be read-only; use a metaclass to implement modifiable class-level properties
When the lazy option is used, the getter is only called once:
>>> class Foo:
...     @classproperty(lazy=True)
...     def bar(cls):
...         print("Performing complicated calculation")
...         return 1
...
>>> Foo.bar
Performing complicated calculation
1
>>> Foo.bar
1
If a subclass inherits a lazy classproperty the property is still re-evaluated for the subclass:
>>> class FooSub(Foo):
...     pass
...
>>> FooSub.bar
Performing complicated calculation
1
>>> FooSub.bar
1
- aodncore.util.discover_entry_points(entry_point_group, working_set=<pkg_resources.WorkingSet object>)[source]
Discover entry points registered under the given entry point group name in the given pkg_resources.WorkingSet instance
- Parameters
entry_point_group – entry point group name used to find entry points in working set
working_set – pkg_resources.WorkingSet instance
- Returns
tuple containing two elements, with the first being a dict containing each discovered and successfully loaded entry point (with keys being the entry point name and values being a reference to the object referenced by the entry point), and the second tuple element being a list of objects which failed to be loaded
- aodncore.util.ensure_regex(o)[source]
Ensure that the returned value is a compiled regular expression (Pattern) from a given input, or raise if the object is not a valid regular expression
- Parameters
o – input object, a single regex (string or pre-compiled)
- Returns
Pattern instance
- aodncore.util.ensure_regex_list(o)[source]
Ensure that the returned value is a list of compiled regular expressions (Pattern) from a given input, or raise if the object is not a list of valid regular expressions
- Parameters
o – input object, either a single regex or a sequence of regexes (string or pre-compiled)
- Returns
list of Pattern instances
- aodncore.util.ensure_writeonceordereddict(o, empty_on_fail=True)[source]
Function to accept an object and return the WriteOnceOrderedDict representation of the object. An object which cannot be handled by the WriteOnceOrderedDict __init__ method will either result in an empty WriteOnceOrderedDict or, if ‘empty_on_fail’ is set to False, in an exception.
- Parameters
o – input object
empty_on_fail – boolean flag to determine whether an invalid object will result in a new empty WriteOnceOrderedDict being returned or the exception re-raised.
- Returns
WriteOnceOrderedDict instance
- aodncore.util.extract_gzip(gzip_path, dest_dir, dest_name=None)[source]
Extract a GZ (GZIP) file’s contents into a directory
- Parameters
gzip_path – path to the source GZ file
dest_dir – destination directory into which the GZ is extracted
dest_name – basename for the extracted file (defaults to the original name minus the ‘.gz’ extension)
- Returns
None
- aodncore.util.extract_zip(zip_path, dest_dir)[source]
Extract a ZIP file’s contents into a directory
- Parameters
zip_path – path to the source ZIP file
dest_dir – destination directory into which the ZIP is extracted
- Returns
None
- aodncore.util.find_file(base_path, regex)[source]
Find a file in a directory (recursively) based on a match string.
The purpose of this method is to identify a specific file, so only the first match is returned.
- Parameters
base_path – A string containing the base directory in which to recursively search.
regex – A string containing a regular expression used to identify the file.
- Returns
A string containing the full path to the matched file.
- aodncore.util.format_exception(exception)[source]
Return a pretty string representation of an Exception object containing the Exception name and message
- Parameters
exception – Exception object
- Returns
string
- aodncore.util.generate_id()[source]
Generate a unique id starting with non-numeric character
- Returns
unique id
- aodncore.util.get_file_checksum(filepath, block_size=65536, algorithm='sha256')[source]
Get the hash (checksum) of a file
- Parameters
filepath – path to the input file
block_size – number of bytes to hash each iteration
algorithm – hash algorithm (from the hashlib module)
- Returns
hash of the input file
- aodncore.util.get_pattern_subgroups_from_string(string, regex)
Retrieve parts of a string given a compiled pattern (re.compile(pattern)). The pattern needs to match the beginning of the string (see https://docs.python.org/2/library/re.html#re.RegexObject.match), so:
there is no need to start the pattern with “^”; and
to match anywhere in the string, start the pattern with “.*”.
- Returns
dictionary of fields matching a given pattern
- aodncore.util.is_dir_writable(path)[source]
Check whether a directory is writable
- Parameters
path – directory path to check
- Returns
None
- aodncore.util.is_function(o)[source]
Check whether a given object is a function
- Parameters
o – object to check
- Returns
True if object is a function, otherwise False
- aodncore.util.is_gzip_file(filepath)[source]
Check whether a file path refers to a valid GZIP file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid GZIP file, otherwise False
- aodncore.util.is_json_file(filepath)[source]
Check whether a file path refers to a valid JSON file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid JSON file, otherwise False
- aodncore.util.is_netcdf_file(filepath)[source]
Check whether a file path refers to a valid NetCDF file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid NetCDF file, otherwise False
- aodncore.util.is_nonempty_file(filepath)[source]
Check whether a file path refers to a file with length greater than zero
- Parameters
filepath – path to the file being checked
- Returns
True if the file at filepath has a length greater than zero, otherwise False
- aodncore.util.is_nonstring_iterable(sequence)[source]
Check whether an object is a non-string Iterable
- Parameters
sequence – object to check
- Returns
True if object is a non-string subclass of Iterable, otherwise False
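A check of this kind is commonly used to tell a single string apart from a collection of strings. The following is a minimal sketch of the idea using collections.abc. The helper name nonstring_iterable is hypothetical, and excluding only str (not bytes) is an assumption.

```python
from collections.abc import Iterable


def nonstring_iterable(obj):
    """Hypothetical stand-in: True for iterables other than str."""
    # Only str is excluded here; how bytes are treated is an assumption
    return isinstance(obj, Iterable) and not isinstance(obj, str)
```

With this sketch, a list or tuple of paths passes the check while a single path string does not.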
- aodncore.util.is_valid_email_address(address)[source]
Simple email address validation
- Parameters
address – address to validate
- Returns
True if address matches the regex, otherwise False
- aodncore.util.is_zip_file(filepath)[source]
Check whether a file path refers to a valid ZIP file
- Parameters
filepath – path to the file being checked
- Returns
True if filepath is a valid ZIP file, otherwise False
- aodncore.util.iter_public_attributes(instance, ignored_attributes=None)[source]
Get an iterator over an instance’s public attributes, including properties
- Parameters
instance – object instance
ignored_attributes – set of attribute names to exclude
- Returns
iterator over the instance’s public attributes
- class aodncore.util.lazyproperty(fget, fset=None, fdel=None, doc=None)[source]
Bases:
property
Works similarly to property(), but computes the value only once.
This essentially memoizes the value of the property by storing the result of its computation in the __dict__ of the object instance. This is useful for computing the value of some property that should otherwise be invariant. For example:
>>> class LazyTest:
...     @lazyproperty
...     def complicated_property(self):
...         print('Computing the value for complicated_property...')
...         return 42
...
>>> lt = LazyTest()
>>> lt.complicated_property
Computing the value for complicated_property...
42
>>> lt.complicated_property
42
As the example shows, the second time complicated_property is accessed, the print statement is not executed. Only the return value from the first access of complicated_property is returned.
By default, a setter and deleter are used which simply overwrite and delete, respectively, the value stored in __dict__. Any user-specified setter or deleter is executed before executing these default actions. The one exception is that the default setter is not run if the user setter already sets the new value in __dict__ and returns that value and the returned value is not None.
Adapted from the recipe at http://code.activestate.com/recipes/363602-lazy-property-evaluation
- aodncore.util.list_not_empty(_list)[source]
Check whether a list contains any non-None values
- Returns
boolean - True if list contains any non-None values, otherwise False
- aodncore.util.list_regular_files(path, recursive=False, sort_key=<functools.KeyWrapper object>)[source]
List all regular files in a given directory, returning the absolute path
- Parameters
sort_key – callable used to sort directory listings
path – input directory to list
recursive – bool flag to enable recursive listing
- Returns
iterator returning only regular files
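The listing behaviour can be approximated with os.walk. This is a rough sketch under stated assumptions: the name iter_regular_files is hypothetical, the sort_key parameter is replaced by plain alphabetical sorting, and the real function may treat symlinks or ordering differently.

```python
import os


def iter_regular_files(path, recursive=False):
    """Hypothetical sketch: yield absolute paths of regular files under path."""
    for root, dirs, files in os.walk(os.path.abspath(path)):
        dirs.sort()  # fix the order in which subdirectories are visited
        for name in sorted(files):
            full_path = os.path.join(root, name)
            if os.path.isfile(full_path):
                yield full_path
        if not recursive:
            break  # stop after the top-level directory
```

Because the result is a generator, large directory trees are listed lazily rather than collected up front.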
- aodncore.util.matches_regexes(input_string, include_regexes, exclude_regexes=None)[source]
Function to filter a string (e.g. file path) according to regular expression inclusions minus exclusions
- Parameters
input_string – string for comparison to the regular expressions
include_regexes – list of inclusions
exclude_regexes – list of exclusions to subtract from the list produced by inclusions
- Returns
True if the string matches at least one of the ‘include_regexes’ and none of the ‘exclude_regexes’, otherwise False
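The "inclusions minus exclusions" rule can be sketched in a few lines. The helper name matches is hypothetical, and the use of re.match (anchored at the start of the string) rather than re.search is an assumption.

```python
import re


def matches(input_string, include_regexes, exclude_regexes=None):
    """Hypothetical sketch: included by any pattern and excluded by none."""
    exclude_regexes = exclude_regexes or []
    included = any(re.match(r, input_string) for r in include_regexes)
    excluded = any(re.match(r, input_string) for r in exclude_regexes)
    return included and not excluded
```

For example, including every path ending in .nc while excluding anything under a tmp directory accepts data/a.nc and rejects data/tmp/a.nc.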
- aodncore.util.merge_dicts(*args)[source]
Recursive dict merge
Dict-like objects are merged sequentially from left to right into a new dict
Based on: https://gist.github.com/angstwad/bf22d1822c38a92ec0a9
- Returns
new dict containing the merged result
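A left-to-right recursive merge can be sketched as follows. The helper name merge is hypothetical, and details such as how non-dict values are overwritten are assumptions rather than the library's confirmed behaviour.

```python
from collections.abc import Mapping


def merge(*dicts):
    """Hypothetical recursive merge; later arguments win on conflicts."""
    result = {}
    for d in dicts:
        for key, value in d.items():
            if isinstance(value, Mapping) and isinstance(result.get(key), Mapping):
                # Both sides are dict-like: merge them into a new dict
                result[key] = merge(result[key], value)
            else:
                # Otherwise the later value replaces the earlier one
                # (nested dicts that need no merging are shared, not copied)
                result[key] = value
    return result
```

Nested keys from both inputs survive, and only conflicting leaf values are replaced by the right-hand argument.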
- aodncore.util.mkdir_p(path, mode=493)[source]
Recursively create a directory, including parent directories (analogous to shell command ‘mkdir -p’)
- Parameters
mode – permission mode for the created directories (default 493, i.e. octal 0o755)
path – path to new directory
- Returns
None
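The default mode of 493 in the signature is the decimal rendering of octal 0o755. For comparison, the standard library offers similar 'mkdir -p' behaviour through os.makedirs with exist_ok=True. The sketch below uses only the standard library and a hypothetical temporary path.

```python
import os
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, 'a', 'b', 'c')

# 0o755 is the octal form of the decimal default 493 shown in the signature
os.makedirs(target, mode=0o755, exist_ok=True)

# Repeating the call is harmless, as with 'mkdir -p'
os.makedirs(target, mode=0o755, exist_ok=True)
```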
- aodncore.util.retry_decorator(exceptions=<class 'Exception'>, tries=-1, delay=0, max_delay=None, backoff=1, jitter=0, logger=<Logger aodncore.util.external.retry.api (WARNING)>)
Returns a retry decorator.
- Parameters
exceptions – an exception or a tuple of exceptions to catch. default: Exception.
tries – the maximum number of attempts. default: -1 (infinite).
delay – initial delay between attempts. default: 0.
max_delay – the maximum value of delay. default: None (no limit).
backoff – multiplier applied to delay between attempts. default: 1 (no backoff).
jitter – extra seconds added to delay between attempts. default: 0. fixed if a number, random if a range tuple (min, max)
logger – logger.warning(fmt, error, delay) will be called on failed attempts. default: retry.logging_logger. if None, logging is disabled.
- Returns
a retry decorator.
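To show how tries, delay and backoff interact, here is a simplified, self-contained retry decorator. It is a sketch written for illustration, not the decorator documented above: the name simple_retry is hypothetical, and jitter, max_delay and logging are omitted.

```python
import functools
import time


def simple_retry(exceptions=Exception, tries=-1, delay=0, backoff=1):
    """Hypothetical, simplified retry decorator (no jitter, max_delay or logging)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            remaining, wait = tries, delay
            while True:
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    remaining -= 1
                    if remaining == 0:  # tries=-1 never reaches zero: retry forever
                        raise
                    time.sleep(wait)
                    wait *= backoff  # grow the delay before the next attempt
        return wrapper
    return decorator


calls = []


@simple_retry(exceptions=ValueError, tries=3)
def flaky():
    """Fail twice, then succeed on the third attempt."""
    calls.append(1)
    if len(calls) < 3:
        raise ValueError('not yet')
    return 'ok'


result = flaky()
```

With tries=3 the function is attempted at most three times, and the last exception propagates if every attempt fails.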
- aodncore.util.rm_f(path)[source]
Remove a file, ignoring “file not found” errors (analogous to shell command ‘rm -f’)
- Parameters
path – path to file being deleted
- Returns
None
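The 'rm -f' behaviour amounts to suppressing only the "file not found" error while letting other failures propagate. Here is a minimal standard-library sketch; the name remove_quietly is hypothetical.

```python
import errno
import os


def remove_quietly(path):
    """Hypothetical sketch: delete a file, ignoring only 'file not found'."""
    try:
        os.remove(path)
    except OSError as e:
        # Re-raise anything other than ENOENT (e.g. permission errors)
        if e.errno != errno.ENOENT:
            raise
```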
- aodncore.util.rm_r(path)[source]
Remove a file or directory recursively (analogous to shell command ‘rm -r’)
- Parameters
path – path to file being deleted
- Returns
None
- aodncore.util.rm_rf(path)[source]
Remove a file or directory, ignoring “file not found” errors (analogous to shell command ‘rm -rf’)
- Parameters
path – path to file being deleted
- Returns
None
- aodncore.util.safe_copy_file(source, destination, overwrite=False)[source]
Copy a file atomically by copying first to a temporary file in the same directory (and therefore filesystem) as the intended destination, before performing a rename (which is atomic)
- Parameters
source – source file path
destination – destination file path (will not be overwritten unless ‘overwrite’ set to True)
overwrite – set to True to allow existing destination file to be overwritten
- Returns
None
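The copy-to-temporary-then-rename technique described above can be sketched with the standard library. This is an illustration under stated assumptions, not the library's implementation: the name atomic_copy is hypothetical, raising OSError for an existing destination is an assumption, and POSIX rename semantics are assumed.

```python
import os
import shutil
import tempfile


def atomic_copy(source, destination, overwrite=False):
    """Hypothetical sketch of copy-to-temp-then-rename."""
    if os.path.exists(destination) and not overwrite:
        # The exception type is an assumption for this sketch
        raise OSError("destination exists: {}".format(destination))
    dest_dir = os.path.dirname(os.path.abspath(destination))
    # Create the temporary file next to the destination so both share a filesystem
    fd, temp_path = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)
    try:
        shutil.copyfile(source, temp_path)  # copies content only, not permissions
        os.replace(temp_path, destination)  # atomic rename on the same filesystem
    except Exception:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

Readers of the destination path see either the old file or the complete new one, never a partially written copy.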
- aodncore.util.safe_move_file(src, dst, overwrite=False)[source]
Move a file atomically by performing a copy and delete
- Parameters
src – source file path
dst – destination file path
overwrite – set to True to allow existing destination file to be overwritten
- Returns
None
- aodncore.util.slice_sequence(sequence, slice_size)[source]
Return a list containing the input Sequence sliced into Sequence instances with a length equal to or less than slice_size
Note
The type of the elements should be the same type as the original sequence based on the usual Python slicing behaviour, but the outer sequence will always be a list type.
- Parameters
sequence – input sequence
slice_size – size of each sub-Sequence
- Returns
list of Sequence instances
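The slicing behaviour, including the note about element types, can be illustrated with ordinary Python slicing. The helper name slice_into_chunks is hypothetical and stands in for the documented function.

```python
def slice_into_chunks(sequence, slice_size):
    """Hypothetical sketch: split a sequence into slices of at most slice_size."""
    return [sequence[i:i + slice_size] for i in range(0, len(sequence), slice_size)]


list_chunks = slice_into_chunks([1, 2, 3, 4, 5], 2)
tuple_chunks = slice_into_chunks((1, 2, 3), 2)
```

The outer container is always a list, while each chunk keeps the type of the input (lists for a list, tuples for a tuple).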
- aodncore.util.str_to_list(string_, delimiter=',', strip_method='strip', include_empty=False)[source]
Return a delimited string (comma-separated by default) as a native list, with whitespace stripped and empty strings excluded by default
- Parameters
string_ – input string
delimiter – character(s) used to split the string
strip_method – which strip method to use for each element (invalid method names
include_empty – boolean to control whether empty strings are included in returned list
- Returns
list representation of the given config option
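The way the parameters combine can be sketched as follows. The helper name split_to_list is hypothetical, and the sketch does not attempt to reproduce how invalid strip_method names are handled.

```python
def split_to_list(string_, delimiter=',', strip_method='strip', include_empty=False):
    """Hypothetical sketch of the splitting behaviour described above."""
    # Look up the named str method (strip, lstrip or rstrip) on each element
    elements = (getattr(e, strip_method)() for e in string_.split(delimiter))
    return [e for e in elements if include_empty or e]


default_result = split_to_list(' a, b ,,c ')
with_empty = split_to_list(' a, b ,,c ', include_empty=True)
```

By default the empty element between the two adjacent commas is dropped, and include_empty=True keeps it as an empty string.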
- aodncore.util.validate_bool(o)
- aodncore.util.validate_dict(o)
- aodncore.util.validate_int(o)
- aodncore.util.validate_mandatory_elements(mandatory, actual, name='item')[source]
Ensure that a collection contains all the elements in a ‘mandatory’ collection of elements
- Parameters
mandatory – collection of mandatory elements
actual – collection to compare against mandatory collection
name – name of object being validated for exception message (e.g. ‘item’ or ‘section’)
- Returns
None
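A check of this kind reduces to a set difference between the mandatory and actual collections. In the sketch below, the name check_mandatory, the ValueError exception type and the message wording are all assumptions, since the exception raised is not documented here.

```python
def check_mandatory(mandatory, actual, name='item'):
    """Hypothetical sketch: raise if any mandatory element is missing."""
    missing = set(mandatory) - set(actual)
    if missing:
        # ValueError is an assumption; the real exception type is not documented here
        raise ValueError("{} is missing mandatory elements: {}".format(name, sorted(missing)))
```

The name argument only affects the error message, which makes the same check reusable for different kinds of collections.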
- aodncore.util.validate_mapping(o)
- aodncore.util.validate_relative_path_attr(path, path_attr)[source]
Validate a path, raising an exception containing the name of the attribute which failed
- Parameters
path – string containing the path to test
path_attr – attribute name to include in the exception message if validation fails
- Returns
None
- aodncore.util.validate_string(o)