powersimdata.data_access package

Subpackages

Submodules

powersimdata.data_access.context module

class powersimdata.data_access.context.Context[source]

Bases: object

Factory for data access instances

static get_data_access(make_fs=None)[source]

Return a data access instance appropriate for the current environment.

Parameters:

make_fs (callable) – a function that returns a filesystem instance, or None to use a default

Returns:

(powersimdata.data_access.data_access.DataAccess) – a data access instance

static get_launcher(scenario)[source]

Return a launcher instance for interacting with the simulation engine.

Parameters:

scenario (powersimdata.scenario.scenario.Scenario) – a scenario object

Returns:

(powersimdata.data_access.launcher.Launcher) – a launcher instance

powersimdata.data_access.csv_store module

class powersimdata.data_access.csv_store.CsvStore(data_access)[source]

Bases: object

Base class for common functionality used to manage the scenario list and execute list, which are stored as csv files on the server.

Parameters:

data_access (powersimdata.data_access.data_access.DataAccess) – data access object

commit(table, checksum)[source]

Save to local directory and upload if needed

Parameters:
  • table (pandas.DataFrame) – the data frame to save

  • checksum (str) – the checksum prior to download

get_table()[source]

Attempt to download the file from server and blob storage, falling back to local copy if one exists, and return the combined result.

Returns:

(pandas.DataFrame) – the specified table as a data frame.

powersimdata.data_access.csv_store.verify_hash(func)[source]

Utility function which verifies the sha1sum of the file before writing it on the server. Operates on methods that return an updated scenario or execute list.
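
The signature `verify_hash(func)` reads like a decorator. A minimal sketch of that pattern follows, assuming the wrapped method returns the updated table and that the store exposes `checksum()` and `commit(table, checksum)`. `TinyStore`, `sha1_of` and the string "table" are hypothetical stand-ins, not the library's code.

```python
import functools
import hashlib


def sha1_of(text):
    """Return the SHA-1 hex digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def verify_hash(func):
    """Record the checksum before the update and hand it to commit()."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        checksum = self.checksum()  # state of the file before the edit
        table = func(self, *args, **kwargs)  # method returns the updated table
        self.commit(table, checksum)
        return table

    return wrapper


class TinyStore:
    """Hypothetical in-memory store; `content` plays the role of the server file."""

    def __init__(self, content=""):
        self.content = content

    def checksum(self):
        return sha1_of(self.content)

    def commit(self, table, checksum):
        # Refuse to overwrite if the file changed since the checksum was taken.
        if sha1_of(self.content) != checksum:
            raise IOError("file changed since it was read")
        self.content = table

    @verify_hash
    def add_line(self, line):
        return self.content + line + "\n"
```

In this sketch the decorated method only computes the new table; saving and the integrity check are left to the wrapper.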

powersimdata.data_access.data_access module

class powersimdata.data_access.data_access.DataAccess[source]

Bases: object

Interface to a local or remote data store.

checksum(relative_path)[source]

Return the checksum of the file at the given path.

Parameters:

relative_path (str) – path relative to root

Returns:

(str) – the checksum of the file

copy_from(file_name, from_dir=None)[source]

Copy a file from data store to userspace.

Parameters:
  • file_name (str) – file name to copy.

  • from_dir (str) – data store directory to copy file from.

get(filepath)[source]

Copy file from remote filesystem if needed and read into memory

Parameters:

filepath (str) – path to file

Returns:

(tuple) – file object and filepath to be handled by caller

get_profile_version(callback)[source]

Return the available raw profile versions from blob storage or local disk.

Parameters:

callback (callable) – a function taking a fs instance that returns the available profiles on that fs

Returns:

(list) – available profile versions.

push(file_name, checksum)[source]

Push the file from local to remote root folder, ensuring integrity

Parameters:
  • file_name (str) – the file name, located at the local root

  • checksum (str) – the checksum prior to download
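
One way to read "ensuring integrity" is as an optimistic check: the file is pushed only if the remote copy still has the checksum recorded before download. A minimal sketch with two local directories standing in for the local and remote roots; the helper names and the use of SHA-1 over the whole file are assumptions for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_file(path):
    """Return the SHA-1 hex digest of a file, or None if it does not exist."""
    path = Path(path)
    if not path.exists():
        return None
    return hashlib.sha1(path.read_bytes()).hexdigest()


def push(local_root, remote_root, file_name, checksum):
    """Copy file_name to remote_root only if the remote copy is unchanged."""
    target = Path(remote_root) / file_name
    if sha1_file(target) != checksum:
        raise IOError(f"{file_name} changed on the remote since download")
    shutil.copyfile(Path(local_root) / file_name, target)
```

A second push with the same stale checksum fails, because the first push changed the remote file.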

remove(base_dir, pattern, confirm=True)[source]

Delete files in current environment

Parameters:
  • base_dir (str) – root within which to search

  • pattern (str) – glob specifying files to remove

  • confirm (bool) – prompt before executing command

Returns:

(bool) – True if the operation is completed
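
The three parameters can be illustrated with a minimal local-only sketch using `pathlib`. The injectable `ask` argument is an addition made here so the prompt can be exercised without a terminal; it is not part of the documented signature.

```python
from pathlib import Path


def remove(base_dir, pattern, confirm=True, ask=input):
    """Delete files under base_dir matching a glob; return True if deletion ran."""
    matches = [p for p in Path(base_dir).glob(pattern) if p.is_file()]
    if confirm:
        # `ask` defaults to input(); it is hypothetical and exists for testing.
        answer = ask(f"Delete {len(matches)} file(s)? [y/n] ")
        if answer.strip().lower() != "y":
            return False
    for path in matches:
        path.unlink()
    return True
```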

tmp_folder(scenario_id)[source]

Get path to temporary scenario folder

Parameters:

scenario_id (int/str) – the scenario id

Returns:

(str) – the specified path

write(filepath, save_local=True)[source]

Write a file to data store.

Parameters:
  • filepath (str) – path to save data to

  • save_local (bool) – whether a copy should also be saved to the local filesystem, if such a filesystem is configured. Defaults to True.

class powersimdata.data_access.data_access.LocalDataAccess(_fs=None)[source]

Bases: DataAccess

Interface to shared data volume

push(file_name, checksum)[source]

Write file if checksum matches

Parameters:
  • file_name (str) – the file name, located at the local root

  • checksum (str) – the checksum prior to download

class powersimdata.data_access.data_access.MemoryDataAccess[source]

Bases: _DataAccessTemplate

Mimic a client-server architecture using in-memory filesystems

class powersimdata.data_access.data_access.SSHDataAccess(_fs=None)[source]

Bases: DataAccess

Interface to a remote data store, accessed via SSH.

checksum(relative_path)[source]

Return the checksum of the file at the given path.

Parameters:

relative_path (str) – path relative to root

Returns:

(str) – the checksum of the file

exec_command(command)[source]

execute_command_async(command)[source]

Execute a command via ssh, without waiting for completion.

Parameters:

command (list) – list of str to be passed to command line.

Returns:

(subprocess.Popen) – the local ssh process
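
"Without waiting for completion" corresponds to how `subprocess.Popen` behaves: it returns a handle as soon as the process starts. A minimal sketch, in which the current Python interpreter stands in for the ssh client so that it runs anywhere; the actual ssh command line is not shown in this reference and is not reproduced here.

```python
import subprocess
import sys


def execute_command_async(command):
    """Start a command and return immediately with the process handle."""
    return subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)


# Stand-in for an ssh invocation: run the current Python interpreter locally.
process = execute_command_async([sys.executable, "-c", "print('done')"])
# The caller decides when (or whether) to wait for the result.
stdout, _ = process.communicate()
```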

property fs

Get or create a filesystem object, defaulting to a MultiFS that combines the server and blob containers.

Returns:

(fs.base.FS) – filesystem instance

push(file_name, checksum)[source]

Push file to server and verify the checksum matches a prior value

Parameters:
  • file_name (str) – the file name, located at the local root

  • checksum (str) – the checksum prior to download

Raises:

IOError – if command generated stderr

class powersimdata.data_access.data_access.TempDataAccess[source]

Bases: _DataAccessTemplate

Mimic a client-server architecture using temp filesystems

powersimdata.data_access.execute_list module

class powersimdata.data_access.execute_list.ExecuteListManager(data_access)[source]

Bases: CsvStore

Storage abstraction for execute list using a csv file.

add_entry(scenario_info)[source]

Add entry to execute list

Parameters:

scenario_info (collections.OrderedDict) – entry to add

delete_entry(scenario_id)[source]

Deletes entry from execute list.

Parameters:

scenario_id (int/str) – the id of the scenario

Returns:

(pandas.DataFrame) – the updated data frame

get_execute_table()[source]

Return the execute table from the server if possible; otherwise read the local copy. The local copy is updated upon a successful server connection.

Returns:

(pandas.DataFrame) – execute list as a data frame.

get_status(scenario_id)[source]

Return the status for the scenario

Parameters:

scenario_id (str/int) – the scenario id

Raises:

Exception – if scenario not found in execute list.

Returns:

(str) – scenario status
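
A minimal sketch of the lookup-or-raise behavior described above, using the standard `csv` module on an in-memory table. The column names `id` and `status` and the sample rows are assumptions made for illustration.

```python
import csv
import io

EXECUTE_LIST = "id,status\n1,created\n2,finished\n"  # made-up sample data


def get_status(table_text, scenario_id):
    """Return the status for scenario_id, raising if the id is absent."""
    wanted = str(scenario_id)  # accept int or str ids
    for row in csv.DictReader(io.StringIO(table_text)):
        if row["id"] == wanted:
            return row["status"]
    raise Exception(f"Scenario not found in execute list, id = {scenario_id}")
```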

set_status(scenario_id, status)[source]

Set the scenario status

Parameters:
  • scenario_id (int/str) – the scenario id

  • status (str) – the new status

Returns:

(pandas.DataFrame) – the updated data frame

powersimdata.data_access.execute_table module

powersimdata.data_access.fs_helper module

powersimdata.data_access.fs_helper.get_blob_fs(container)[source]

Create fs for the given blob storage container

Parameters:

container (str) – the container name

Returns:

(fs.base.FS) – filesystem instance

powersimdata.data_access.fs_helper.get_multi_fs(root)[source]

Create a filesystem combining the server (if connected) with the profile and scenario containers in blob storage. Priority is in descending order, so the server is used first if possible.

Parameters:

root (str) – root directory on server

Returns:

(fs.base.FS) – filesystem instance
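
The priority rule can be illustrated without any filesystem library. In this minimal sketch plain dicts play the role of the server and the blob containers, and `PriorityLookup` is a hypothetical stand-in for the combined filesystem: a path is served by the highest-priority source that contains it.

```python
class PriorityLookup:
    """Hypothetical stand-in for a combined filesystem backed by dicts."""

    def __init__(self):
        self._sources = []  # list of (priority, name, mapping)

    def add(self, name, mapping, priority):
        self._sources.append((priority, name, mapping))
        # Keep the highest priority first.
        self._sources.sort(key=lambda item: item[0], reverse=True)

    def which(self, path):
        """Return the name of the first source that has path, or None."""
        for _, name, mapping in self._sources:
            if path in mapping:
                return name
        return None

    def read(self, path):
        """Return the content of path from the highest-priority source."""
        for _, _, mapping in self._sources:
            if path in mapping:
                return mapping[path]
        raise FileNotFoundError(path)


lookup = PriorityLookup()
lookup.add("blob", {"a.csv": "blob-a", "b.csv": "blob-b"}, priority=1)
lookup.add("server", {"a.csv": "server-a"}, priority=2)
```

Here `a.csv` resolves to the server copy, while `b.csv`, which exists only in blob storage, is still found.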

powersimdata.data_access.fs_helper.get_scenario_fs()[source]

Create filesystem combining the server (if connected) with blob storage, prioritizing the server if connected.

Returns:

(fs.base.FS) – filesystem instance

powersimdata.data_access.fs_helper.get_ssh_fs(root='')[source]

Create fs for the given directory on the server

Parameters:

root (str) – root directory on server

Returns:

(fs.base.FS) – filesystem instance

powersimdata.data_access.launcher module

class powersimdata.data_access.launcher.HttpLauncher(scenario)[source]

Bases: Launcher

BASE_URL = 'http://becompute01.gatesventures.com:5000'

check_progress()[source]

Get the status of an ongoing simulation, if possible

Returns:

(dict) – contains “output”, “errors”, “scenario_id”, and “status” keys which map to stdout, stderr, and the respective scenario attributes

extract_simulation_output()[source]

Extracts simulation outputs {PG, PF, LMP, CONGU, CONGL}

Returns:

(dict) – contains “output”, “errors”, “scenario_id”, and “status” keys which map to stdout, stderr, and the respective scenario attributes

class powersimdata.data_access.launcher.Launcher(scenario)[source]

Bases: object

Base class for interaction with simulation engine.

Parameters:

scenario (powersimdata.scenario.scenario.Scenario) – scenario instance

extract_simulation_output()[source]

Extracts simulation outputs {PG, PF, LMP, CONGU, CONGL} on server.

launch_simulation(threads=None, solver=None, extract_data=True)[source]

Launches simulation on target environment

Parameters:
  • threads (int) – the number of threads to be used. This defaults to None, where None means auto.

  • solver (str) – the solver used for optimization. This defaults to None, which translates to gurobi

  • extract_data (bool) – whether the results of the simulation engine should be automatically extracted after the simulation has run. This defaults to True.

Returns:

(subprocess.Popen) or (dict) – the process, if using ssh to the server; otherwise a dict containing status information.

property scenario_id

class powersimdata.data_access.launcher.NativeLauncher(scenario)[source]

Bases: Launcher

check_progress()[source]

Get the status of an ongoing simulation, if possible

Returns:

(dict) – contains “output”, “errors”, “scenario_id”, and “status” keys which map to stdout, stderr, and the respective scenario attributes

extract_simulation_output()[source]

Extracts simulation outputs {PG, PF, LMP, CONGU, CONGL}

Returns:

(dict) – contains “output”, “errors”, “scenario_id”, and “status” keys which map to stdout, stderr, and the respective scenario attributes

class powersimdata.data_access.launcher.SSHLauncher(scenario)[source]

Bases: Launcher

check_progress()[source]

extract_simulation_output()[source]

Extracts simulation outputs {PG, PF, LMP, CONGU, CONGL} on server.

Returns:

(subprocess.Popen) – new process used to extract output data.

powersimdata.data_access.scenario_list module

class powersimdata.data_access.scenario_list.ScenarioListManager(data_access)[source]

Bases: CsvStore

Storage abstraction for scenario list using a csv file.

add_entry(scenario_info)[source]

Adds scenario to the scenario list file.

Parameters:

scenario_info (collections.OrderedDict) – entry to add to scenario list.

Returns:

(pandas.DataFrame) – the updated data frame

delete_entry(scenario_id)[source]

Deletes entry in scenario list.

Parameters:

scenario_id (int/str) – the id of the scenario

Returns:

(pandas.DataFrame) – the updated data frame

get_scenario(descriptor)[source]

Get information for a scenario based on id or name

Parameters:

descriptor (int/str) – the id or name of the scenario

Returns:

(collections.OrderedDict) – the matching entry as a dict, or None if zero or multiple matches are found
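
A minimal sketch of the "exactly one match" rule, using a list of dicts in place of the scenario table. The column names `id` and `name` and the sample rows are assumptions for illustration.

```python
SCENARIOS = [  # made-up sample rows
    {"id": "1", "name": "base"},
    {"id": "2", "name": "upgrade"},
    {"id": "3", "name": "upgrade"},
]


def get_scenario(rows, descriptor):
    """Return the single row whose id or name equals descriptor, else None."""
    wanted = str(descriptor)  # accept int or str descriptors
    matches = [row for row in rows if wanted in (row["id"], row["name"])]
    # Zero matches and ambiguous (multiple) matches are both reported as None.
    return matches[0] if len(matches) == 1 else None
```

Returning None for an ambiguous name leaves the caller to ask for an id instead of silently picking one of the candidates.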

get_scenario_table()[source]

Return the scenario table from the server if possible; otherwise read the local copy. The local copy is updated upon a successful server connection.

Returns:

(pandas.DataFrame) – scenario list as a data frame.

powersimdata.data_access.scenario_table module

powersimdata.data_access.sql_store module

powersimdata.data_access.ssh_fs module

class powersimdata.data_access.ssh_fs.WrapSSHFS(parent_fs, path)[source]

Bases: SubFS

Wrapper around another filesystem which is rooted at the given path and adds a progress bar for downloads

Parameters:
  • parent_fs (fs.base.FS) – the filesystem instance to wrap

  • path (str) – the path which will be the root of the wrapped filesystem

checksum(filepath)[source]

Return the checksum of the file path (using sha1sum)

Parameters:

filepath (str) – path to file

Returns:

(str) – the checksum of the file

download(path, file, chunk_size=None, **options)[source]

Wrapper around pyfilesystem download with progress bar

exec_command(command)[source]

Wrapper around paramiko exec_command

Parameters:

command (str) – the command to execute

Returns:

(tuple) – standard streams

powersimdata.data_access.ssh_fs.progress_bar(*args, **kwargs)[source]

Creates progress bar

Parameters:
  • *args – variable length argument list passed to the tqdm constructor.

  • **kwargs – arbitrary keyword arguments passed to the tqdm constructor.

Module contents