Package resource¶
Extracts package resource data
works!
intuitive
flexible
static type checks
The Problem¶
Some Python package authors create bash scripts, or use Path(__file__).parent, to find the folder which contains their data files.
Understandably, the UX of working with Python package data is beyond their patience level.
This module is for those who want the UX of extracting package data to be easy. Easy enough that they’ll go back and remove all those ugly hacks and bash scripts.
Note
PackageResource.package_data_folders yields data folders
The first step to extracting package data is to narrow down the (package) folders. The second step is extracting the data files.
Example¶
Extract package data to local cache
package data folder: data/currency
Note the path is relative
Local cache folder: $HOME/.cache/[package name]
import sys
from collections.abc import Iterator
from functools import partial
from pathlib import PurePath
from typing import TYPE_CHECKING

from logging_strict.util.package_resource import (
    PackageResource,
    PartStem,
    PartSuffix,
    filter_by_file_stem,
    filter_by_suffix,
)

if sys.version_info >= (3, 9):  # pragma: no cover
    try:
        from importlib.resources.abc import Traversable  # py311+
    except ImportError:
        from importlib.abc import Traversable  # py39+
else:  # pragma: no cover
    msg_exc = "Traversable py39+"
    raise ImportError(msg_exc)

if TYPE_CHECKING:
    data_folder_path: str
    cb_file_stem: PartStem
    cb_file_suffix: PartSuffix
    generator_folders: Iterator[Traversable]
    path_entry: type[PurePath]

package_name = "decimals"
data_folder_path = "data/currency"
cb_file_stem = partial(filter_by_file_stem, "crypto_btc_default")
cb_file_suffix = partial(filter_by_suffix, ".bitcoin")

# PackageResource(package, package_data_folder_start)
pr = PackageResource(package_name, "data")

# Step 1: narrow down the package data folders (lazy; nothing runs yet)
generator_folders = pr.package_data_folders(
    cb_suffix=cb_file_suffix,
    cb_file_stem=cb_file_stem,
    path_relative_package_dir=data_folder_path,
)

# Step 2: extract the matching data files into the local cache
for path_entry in pr.cache_extract(
    generator_folders,
    cb_suffix=cb_file_suffix,
    cb_file_stem=cb_file_stem,
    is_overwrite=False,
):
    # path_entry is the extracted file path in local cache
    pass
So our file, data/currency/crypto_btc_default.bitcoin, is extracted into the folder $HOME/.cache/[package name]/data/currency
For finer control, the option is to:
move it within the cache_extract for loop
Note
DIY
filter_by_file_stem() especially, and possibly filter_by_suffix() as well, cover only the simplest scenario. Both are just normal functions. If/when necessary, roll your own
Note
Package name
Change package_name to whichever package contains the data files you are interested in, not the package used in this example
Module private variables¶
- logging_strict.util.package_resource.__all__: tuple[str, str, str, str, str, str] = ("filter_by_suffix", "filter_by_file_stem", "PackageResource", "PartSuffix", "PartStem", "get_package_data")¶
Module object exports
- logging_strict.util.package_resource.is_module_debug: bool = False¶
During development, turns on logging. Once unittest coverage reaches 100%, turn off
- logging_strict.util.package_resource.g_module: str = logging_strict.util.package_resource¶
logging dotted path
- logging_strict.util.package_resource._LOGGER: logging.Logger¶
Complicated module. Does issue logging warnings
Module objects¶
- class logging_strict.util.package_resource.PackageResource(package, package_data_folder_start)¶
For a Python package, which could be any package installed into the virtual environment, identifies which package data folder is the base folder in which to start the search for data files. As in a fallback folder
Do not assume the default start data folder is data. Imposes the rule that data files must not be stored in the package base folder; they must be placed into a sub-folder
- Variables:
- cache_extract(base_folder_generator, /, cb_suffix=None, cb_file_stem=None, is_overwrite=False)¶
A generic extractor to local cache folder
- Parameters:
base_folder_generator¶ (collections.abc.Iterator[importlib.resources.abc.Traversable]) – Package data folder generator. Narrows down the folders to search to only those containing the target data files
cb_suffix¶ (collections.abc.Callable[[str],bool]) – Function created using functools.partial() which filters by suffix
cb_file_stem¶ (collections.abc.Callable[[str],bool]) – Function created using functools.partial() which filters by file name stem
- Returns:
local cached file path
- Return type:
See also
Caution
Refresh generator
Resources will not be extracted if the generator is exhausted. If running in a loop, reinitialize generator
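The caution above is ordinary Python generator behavior. A minimal standard-library sketch follows; make_folder_generator is a hypothetical stand-in for a package data folder generator and is not part of logging_strict.

```python
from collections.abc import Iterator


def make_folder_generator() -> Iterator[str]:
    """Hypothetical stand-in for a package data folder generator."""
    yield from ("data/currency", "data/theme")


gen = make_folder_generator()
first_pass = list(gen)   # consumes the generator
second_pass = list(gen)  # exhausted: nothing left to yield

# Inside a loop, build a fresh generator for every pass
fresh_pass = list(make_folder_generator())
```

The second pass is empty, which is why an extractor fed an exhausted generator would extract nothing.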
- get_parent_paths(*, cb_suffix=None, cb_file_stem=None, path_relative_package_dir=None, parent_count=1)¶
Example: a package contains the resource
data/theme/size/category/[image file name]
The path relative to the start folder is extracted. In this case the start folder is data, which is relevant only to the package, not to the final file system location. The interest is in a relative path, not the absolute path from the POV of the package
Remaining path: theme/size/category/[image file name]
resource: [image file name]
Parents: [“theme”, “size”, “category”]
The cb_suffix and cb_file_stem select the relevant file
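The split into resource and parents can be sketched with pathlib alone. This is an illustration, not the library's implementation: split_parents and the file name icon.png are hypothetical, and taking the parents nearest to the file is an assumption about what parent_count means.

```python
from pathlib import PurePosixPath


def split_parents(relative_to_start: str, parent_count: int = 1) -> tuple[str, list[str]]:
    """Return the file name and up to parent_count nearest parent folder names."""
    path = PurePosixPath(relative_to_start)
    parents = list(path.parts[:-1])  # every part except the file name
    nearest = parents[-parent_count:] if parent_count > 0 else []
    return path.name, nearest


# Path already relative to the start folder "data"
name, parents = split_parents("theme/size/category/icon.png", parent_count=3)
```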
Caution
Location of package data files
Data files CANNOT be in the base folder of a package. Move any package data files into an appropriately named/categorized sub-folder.
There is a strong assumption that there will never be data files in the package base folder. And if there are, those aren’t data files, that’s clutter
- Parameters:
cb_suffix¶ (collections.abc.Callable[[str],bool] | None) – Function created using functools.partial() which filters by suffix
cb_file_stem¶ (collections.abc.Callable[[str],bool] | None) – Function created using functools.partial() which filters by file name stem
path_relative_package_dir¶ (pathlib.Path | str | None) – package base folder to start the search. None becomes the PackageResource.package_data_folder_start, not the package base folder. Assumes package authors are smart and would never be that gullible.
parent_count¶ (int | None) – Default 1. Retrieve x number of parent folder names
- Returns:
file name and respective parents as a Sequence[str]
- Return type:
- property package_data_folder_start¶
Package name
- Returns:
package base data folder name. Not relative path
- Return type:
- package_data_folders(*, cb_suffix=None, cb_file_stem=None, path_relative_package_dir=None)¶
Generic generator for retrieving package data folder paths. Does not do the file extraction.
Caution
Generators delayed execution
Creating a generator will always succeed; the code is not immediately executed. If the code would normally raise an Exception, the generator has to be executed for that to occur.
The generator is used as input to PackageResource.resource_extract and PackageResource.cache_extract, so any Exception or logging is delayed until those calls
- Parameters:
cb_suffix¶ (collections.abc.Callable[[str],bool]) – Function created using functools.partial() which filters by suffix
cb_file_stem¶ (collections.abc.Callable[[str],bool]) – Function created using functools.partial() which filters by file name stem
package_name¶ (str) – Default [app name]. Python 3 has namespace packages, so a Distribution need not contain all the packages which share the same namespace. There may be multiple gui implementations installed
path_relative_package_dir¶ (pathlib.Path | str | None) – package base folder to start the search
- Returns:
All importlib.resources.abc.Traversable paths. Possibly filtered by theme
- Return type:
collections.abc.Iterator[importlib.resources.abc.Traversable]
- Raises:
ImportError – package not installed. Before introspecting package data, install package
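The delayed ImportError can be reproduced with a plain generator. In this sketch lazy_folders and the package name are hypothetical; it only illustrates that creating the generator never raises, while the first iteration does.

```python
import importlib
from collections.abc import Iterator


def lazy_folders(package_name: str) -> Iterator[str]:
    """Hypothetical generator: the import only runs on first iteration."""
    module = importlib.import_module(package_name)  # may raise ImportError
    yield module.__name__


# Creating the generator succeeds even though the package does not exist
gen = lazy_folders("package_that_is_not_installed_xyz")

try:
    next(gen)  # the body runs now, so the ImportError surfaces here
    raised = False
except ImportError:
    raised = True
```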
- path_relative(y, /, *, path_relative_package_dir=None, parent_count=None)¶
Whilst traversing package data, a data file’s path, relative to a package folder, usually root folder, is unavailable. Only have the absolute path of the extracted data file
This limits flexibility. There might be need, especially during testing, to move the extracted data file to another folder
An Example
y is the absolute path of package data extracted by importlib.resources.as_file(), which should be zip safe
[venv path]/lib/python3.9/site-packages/decimals/data/currency/digital_tox_default.ini
The code sample does not extract package data; instead it fakes an absolute path, which needs to contain the folder “data” although the local cache wouldn’t have this folder.
>>> from pathlib import Path
>>> from logging_strict.constants import g_app_name
>>> from logging_strict.util.package_resource import (
...     PackageResource,
...     _extract_folder,
... )
>>> path_local_cache = Path(_extract_folder(g_app_name))
>>> y = path_local_cache.joinpath(
...     "data", "currency", "nonsense", "digital_tox_default.ini"
... )
>>> pr = PackageResource("some package name", "data")
>>> pr.path_relative(y, parent_count=None)
PosixPath('currency/nonsense/digital_tox_default.ini')
>>> pr.path_relative(y, parent_count=0)
PosixPath('digital_tox_default.ini')
>>> pr.path_relative(y, parent_count=1)
PosixPath('nonsense/digital_tox_default.ini')
>>> pr.path_relative(y, parent_count=2)
PosixPath('currency/nonsense/digital_tox_default.ini')
>>> pr.path_relative(y, parent_count=3)  # can't do beyond start dir, "data"
PosixPath('currency/nonsense/digital_tox_default.ini')
- Parameters:
y¶ (pathlib.Path) – Extracted data file’s path
path_relative_package_dir¶ (Path | str | None) – Default “data” (folder). Relative package path. Treated as a base folder
parent_count¶ (int | None) – Ignoring file name. Default None indicates entire relative path. Return x folders, from parent, working backwards
- Returns:
Relative path, excluding path_relative_package_dir
- Return type:
- Raises:
TypeError – None, not a type[PurePath] or relative path
LookupError – Cannot return relative path from non-existing parent folder
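The parent_count behavior shown in the doctest can be approximated with pathlib alone. This is a sketch under assumptions, not the library's code: relative_after is a hypothetical helper, it splits on the first occurrence of the start folder, and the cache path is made up.

```python
from pathlib import PurePosixPath
from typing import Optional


def relative_after(
    y: PurePosixPath,
    start_dir: str = "data",
    parent_count: Optional[int] = None,
) -> PurePosixPath:
    """Path of y relative to start_dir, keeping at most parent_count parents."""
    parts = y.parts
    if start_dir not in parts:
        raise LookupError(f"folder {start_dir!r} not found in {y}")
    # Everything after the first occurrence of the start folder
    remainder = parts[parts.index(start_dir) + 1:]
    if not remainder:
        raise LookupError(f"no file below folder {start_dir!r} in {y}")
    *parents, file_name = remainder
    if parent_count is None:
        kept = parents  # entire relative path
    elif parent_count <= 0:
        kept = []  # file name only
    else:
        kept = parents[-parent_count:]  # cannot go beyond start_dir
    return PurePosixPath(*kept, file_name)


y = PurePosixPath("/home/user/.cache/app/data/currency/nonsense/digital_tox_default.ini")
entire = relative_after(y)
file_only = relative_after(y, parent_count=0)
one_parent = relative_after(y, parent_count=1)
```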
- resource_extract(base_folder_generator, path_dest, /, cb_suffix=None, cb_file_stem=None, is_overwrite=False, as_user=False)¶
A generic extractor
Use task specific resource extractors for a cleaner UX
- Parameters:
base_folder_generator¶ (collections.abc.Iterator[importlib.resources.abc.Traversable]) – Package data folder generator. Narrows down the search to folders known to contain target package data files
path_dest¶ (pathlib.Path | str) – destination folder
cb_suffix¶ (collections.abc.Callable[[str],bool]) – Function created using functools.partial() which filters by suffix
cb_file_stem¶ (collections.abc.Callable[[str],bool]) – Function created using functools.partial() which filters by file name stem
is_overwrite¶ (bool | None) – Default False. Force overwriting of destination file
as_user¶ (bool | None) – Default False. False dest file owner set to root. Otherwise dest file owner set to user
- Returns:
local cached file path
- Return type:
Caution
Refresh generator
Resources will not be extracted if the generator is exhausted. If running in a loop, reinitialize generator
Todo
acl permissions of dest folder
Check acl writable permissions. Is the dest folder tree writable?
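The general shape of a generic extractor can be sketched with the standard library only. This is an assumption-laden illustration, not resource_extract itself: extract_files is a hypothetical helper, it searches only the top folder of a package, it ignores file ownership (as_user), and it uses the stdlib json package purely as a convenient source of files.

```python
import shutil
import tempfile
from collections.abc import Callable, Iterator
from importlib.resources import as_file, files
from pathlib import Path


def extract_files(
    package: str,
    path_dest: Path,
    cb_suffix: Callable[[str], bool],
    is_overwrite: bool = False,
) -> Iterator[Path]:
    """Copy matching files from a package's top folder into path_dest."""
    path_dest.mkdir(parents=True, exist_ok=True)
    for entry in files(package).iterdir():
        if not entry.is_file() or not cb_suffix(Path(entry.name).suffix):
            continue
        target = path_dest / entry.name
        if is_overwrite or not target.exists():
            # as_file provides a real file system path, even for zipped packages
            with as_file(entry) as src:
                shutil.copy(src, target)
        yield target


with tempfile.TemporaryDirectory() as tmp_dir:
    extracted = sorted(
        p.name for p in extract_files("json", Path(tmp_dir), lambda s: s == ".py")
    )
```

Because extract_files is a generator, nothing is copied until it is iterated, matching the delayed-execution caution above.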
- class logging_strict.util.package_resource.PartStem(*args, **kwargs)¶
File stem callback functions. Careful! Will return all files that match the file stem
Usage
from typing import TYPE_CHECKING
from functools import partial
from logging_strict.util.package_resource import filter_by_file_stem
from logging_strict.util.package_resource import PartStem

if TYPE_CHECKING:
    cb_file_stem: PartStem

cb_file_stem = partial(filter_by_file_stem, file_name)
cb_file_stem = partial(filter_by_file_stem, "index.theme")
- class logging_strict.util.package_resource.PartSuffix(*args, **kwargs)¶
Type of suffix callback functions
Usage
from typing import TYPE_CHECKING
from functools import partial
from logging_strict.util.package_resource import filter_by_suffix
from logging_strict.util.package_resource import PartSuffix

if TYPE_CHECKING:
    cb_suffix: PartSuffix

cb_suffix = partial(filter_by_suffix, (".svg", ".png"))
cb_suffix = partial(filter_by_suffix, ".toml")
- logging_strict.util.package_resource.filter_by_file_stem(expected_file_name, test_file_name)¶
This is the simplest case: simple matching of the package resource file name against the expected file name
Usage
from functools import partial
from logging_strict.util.package_resource import filter_by_file_stem

cb_file_stem = partial(filter_by_file_stem, expected_file_name)
...
cb_file_stem is used extensively within this module
- Parameters:
- Returns:
True if same, otherwise False
- Return type:
Note
This is the simplest case
- For more complex cases write a lambda or function and use functools.partial() to create a callback
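A more complex case might look like the following sketch, which matches the stem against a regular expression instead of an exact name. filter_by_stem_pattern is a hypothetical function, not part of logging_strict, and it assumes the callback receives the file stem as a single str, as the documented Callable[[str], bool] type suggests.

```python
import re
from functools import partial


def filter_by_stem_pattern(pattern: re.Pattern, test_file_name: str) -> bool:
    """True when the whole file stem matches the compiled pattern."""
    return pattern.fullmatch(test_file_name) is not None


# Bind the first argument; the result takes one str and returns a bool
cb_file_stem = partial(
    filter_by_stem_pattern,
    re.compile(r"crypto_[a-z]+_default"),
)

candidates = ("crypto_btc_default", "crypto_eth_default", "fiat_usd_default")
matches = [name for name in candidates if cb_file_stem(name)]
```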
- logging_strict.util.package_resource.filter_by_suffix(expected_suffix, test_suffix)¶
Usage
from functools import partial
from logging_strict.util.package_resource import filter_by_suffix, PartSuffix

cb_suffix: PartSuffix = partial(filter_by_suffix, expected_suffix)
...
Then use cb_suffix as kwarg to PackageResource.cache_extract
- logging_strict.util.package_resource.get_package_data(package_name: str, file_name_stem: str, suffix='.csv', convert_to_path: Sequence[str] = ('data',), is_extract: bool | None = False) str¶
Export and read one package file. Exports to /run/user/[current session user id]. This tmp folder is inaccessible to other users and its contents are automagically removed at system shutdown
- Parameters:
suffix¶ (str | collections.abc.Sequence[str] | None) – str or tuple. Target file suffixes
convert_to_path¶ (collections.abc.Sequence[str] | None) – Default ("data",). Relative dotted path to subfolder, excluding package_name
is_extract¶ (bool | None) – Default False. Before reading file contents,
True – extract to tmp folder
False – read data file contents from within package
- Returns:
file contents or on failure None
- Return type:
str | None
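Reading a data file from within a package, without extracting it, can be sketched with importlib.resources. read_package_text is a hypothetical helper, not get_package_data; it takes a full file name rather than a stem plus suffix, and it returns None on failure to mirror the documented return type. The stdlib json package is used only because it is always installed.

```python
from importlib.resources import files
from typing import Optional


def read_package_text(
    package_name: str,
    file_name: str,
    sub_folders: tuple[str, ...] = (),
) -> Optional[str]:
    """Return a package file's text, or None if package or file is missing."""
    try:
        resource = files(package_name)
        # Walk one path segment at a time; works on all Traversable versions
        for part in (*sub_folders, file_name):
            resource = resource.joinpath(part)
        return resource.read_text(encoding="utf-8")
    except (ImportError, OSError):
        return None


text = read_package_text("json", "__init__.py")
missing = read_package_text("package_that_is_not_installed_xyz", "rates.csv")
```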