
data_set

Module data_set contains ...

as_xarray(time_series_info)

Coerce an object to an xarray time series

Converts an object to an xarray time series where possible. Suitable objects are wrappers around an external pointer to an equivalent 'uchronia' C++ entity. Currently handles single time series and ensembles thereof; support for more types may be added later. An object that is already an xr.DataArray or xr.Dataset is returned unchanged.

Parameters:

Name Type Description Default
time_series_info NdTimeSeries

A representation of a time series. Supported types are external pointers to data returned by the uchronia C API.

required

Returns:

Type Description
xr.DataArray

an xarray object

Source code in uchronia/data_set.py
def as_xarray(time_series_info: "NdTimeSeries") -> xr.DataArray:
    """
    Coerce an object to an xarray time series

    Converts an object to an xarray time series where possible. Suitable objects are wrappers around an external
    pointer to an equivalent 'uchronia' C++ entity. Currently handles single time series and ensembles thereof;
    support for more types may be added later. An xr.DataArray or xr.Dataset is returned unchanged.

    Args:
        time_series_info (NdTimeSeries): A representation of a time series. Supported types are external pointers to data returned by the uchronia C API.

    Returns:
        an xarray object

    """
    if isinstance(time_series_info, (xr.DataArray, xr.Dataset)):
        return time_series_info
    # if isinstance(time_series_info, dict):
    #     return marshaledTimeSeriesToXts(time_series_info)
    # if cinterop::isInteropRegularTimeSeries(time_series_info)):
    #     return(cinterop::as_xarrayTimeSeries(time_series_info))
    if is_cffi_native_handle(time_series_info):
        if is_singular_time_series(time_series_info):
            return uwg.ToStructSingleTimeSeriesData_py(time_series_info)
        elif is_ensemble_time_series(time_series_info):
            return uwg.ToStructEnsembleTimeSeriesData_py(time_series_info)
        # elif is_time_series_of_ensemble_time_series(time_series_info):
        #     return uwg.SomethingStillNotDone(time_series_info)
        else:
            raise ValueError(
                'as_xarray: does not know how to convert to xarray an object of external type "'
                + time_series_info.type_id
                + '"'
            )
    else:
        k = type(time_series_info)
        raise ValueError(
            "cannot convert objects of type " + str(k) + " to an xarray time series"
        )
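
The branching above can be tried without the native library. The following is a minimal sketch, not the uchronia implementation: `FakeHandle`, the `SINGLE_TS` and `ENSEMBLE_TS` type ids, and the string return values are hypothetical stand-ins for cffi handles and the `uwg` conversion calls, and a plain `dict` stands in for an object that is already an xarray type.

```python
from dataclasses import dataclass


@dataclass
class FakeHandle:
    """Hypothetical stand-in for a wrapper around a native (cffi) pointer."""
    type_id: str


def as_xarray_sketch(obj):
    """Mirror the dispatch order of as_xarray using stand-in types."""
    # Stand-in for: already an xr.DataArray or xr.Dataset, so return unchanged.
    if isinstance(obj, dict):
        return obj
    if isinstance(obj, FakeHandle):
        if obj.type_id == "SINGLE_TS":
            return "converted single time series"
        elif obj.type_id == "ENSEMBLE_TS":
            return "converted ensemble of time series"
        # A native handle of a kind that has no conversion yet.
        raise ValueError(
            f'does not know how to convert an object of external type "{obj.type_id}"'
        )
    # Anything that is neither an xarray object nor a native handle.
    raise ValueError(f"cannot convert objects of type {type(obj)} to an xarray time series")


def error_message(obj):
    """Return the ValueError message raised for obj, or None if conversion succeeds."""
    try:
        as_xarray_sketch(obj)
    except ValueError as exc:
        return str(exc)
    return None
```

In this sketch both failure modes raise `ValueError`, as in the source: one for a native handle of an unsupported kind, one for any other Python object.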

datasets_summaries(data_library)

Get the summaries of datasets in a library

Parameters:

Name Type Description Default
data_library TimeSeriesLibrary

library to query

required

Returns:

Type Description
List[str]

short descriptions of all the datasets in this library

Source code in uchronia/data_set.py
def datasets_summaries(data_library: "TimeSeriesLibrary") -> List[str]:
    """Get the summaries of datasets in a library 

    Args:
        data_library (TimeSeriesLibrary): library to query

    Returns:
        List[str]: short descriptions of all the datasets in this library
    """    
    return uwg.GetEnsembleDatasetDataSummaries_py(data_library)

get_data_identifiers(provider)

Gets the known time series identifiers (e.g. gauge names) of a time series provider

The provider argument must be a wrapper around an external pointer to an object whose type inherits from the C++ type datatypes::timeseries::TimeSeriesProvider<double>

Parameters:

Name Type Description Default
provider Any

an external pointer to a uchronia Time Series Provider.

required

Returns:

Type Description
List[str]

a list of time series identifiers

Source code in uchronia/data_set.py
def get_data_identifiers(provider: "TimeSeriesProvider") -> List[str]:
    """
    Gets the known time series identifiers (e.g. gauge names) of a time series provider

    The provider argument must be a wrapper around an external pointer to an object whose type
    inherits from the C++ type datatypes::timeseries::TimeSeriesProvider<double>

    Args:
        provider (Any): an external pointer to a uchronia Time Series Provider.

    Returns:
        a list of time series identifiers

    """
    return uwg.GetProviderTimeSeriesIdentifiers_py(provider)

get_dataset(data_library, data_id)

Retrieve data from a data set library

Gets the data from a library for a given identifier.

Parameters:

Name Type Description Default
data_library TimeSeriesLibrary

wrapper around an external pointer ENSEMBLE_DATA_SET_PTR, a.k.a. a "time series library"

required
data_id str

identifier of the data to retrieve.

required

Returns:

Name Type Description
NdTimeSeries NdTimeSeries

a uni- or multidimensional time series

Source code in uchronia/data_set.py
def get_dataset(data_library: "TimeSeriesLibrary", data_id: str) -> "NdTimeSeries": # Union['TimeSeries','EnsemblePtrTimeSeries','EnsembleForecastTimeSeries']:
    """
    Retrieve data from a data set library

    Gets the data from a library for a given identifier.

    Args:
        data_library (TimeSeriesLibrary): wrapper around an external pointer ENSEMBLE_DATA_SET_PTR, a.k.a. a "time series library"
        data_id (str): identifier of the data to retrieve.

    Returns:
        NdTimeSeries: a uni- or multidimensional time series
    """
    result = uwc.GetDatasetFromLibrary_Pkg(data_library, data_id)
    return result

get_dataset_ids(data_library)

Gets the top level data identifiers in a data library (data set)

Parameters:

Name Type Description Default
data_library TimeSeriesLibrary

wrapper around an external pointer ENSEMBLE_DATA_SET_PTR, a.k.a. a "time series library"

required

Returns:

Type Description
List[str]

identifiers for the datasets in this library

Source code in uchronia/data_set.py
def get_dataset_ids(data_library: "TimeSeriesLibrary") -> List[str]:
    """
    Gets the top level data identifiers in a data library (data set)

    Args:
        data_library (TimeSeriesLibrary): wrapper around an external pointer ENSEMBLE_DATA_SET_PTR, a.k.a. a "time series library"

    Returns:
        List[str]: identifiers for the datasets in this library
    """
    return uwg.GetEnsembleDatasetDataIdentifiers_py(data_library)

get_dataset_single_time_series(data_library, data_id)

Retrieve a single time series from a data set library

Gets a univariate time series from a library for a given identifier.

Parameters:

Name Type Description Default
data_library TimeSeriesLibrary

wrapper around an external pointer ENSEMBLE_DATA_SET_PTR

required
data_id str

identifier of the time series to retrieve.

required

Returns:

Name Type Description
TimeSeries TimeSeries

univariate time series of dimension 1

Source code in uchronia/data_set.py
def get_dataset_single_time_series(data_library: "TimeSeriesLibrary", data_id: str) -> "TimeSeries":
    """
    Retrieve a single time series from a data set library

    Gets a univariate time series from a library for a given identifier.

    Args:
        data_library (TimeSeriesLibrary): wrapper around an external pointer ENSEMBLE_DATA_SET_PTR
        data_id (str): identifier of the time series to retrieve.

    Returns:
        TimeSeries: univariate time series of dimension 1

    """
    result = uwg.GetDatasetSingleTimeSeries_py(data_library, data_id)
    return result

get_ensemble_dataset(dataset_id='', data_path='')

Gets an object, a library to access a set of time series

Loads a time series library from a data set descriptor file. The file must exist, otherwise a FileNotFoundError is raised.

Parameters:

Name Type Description Default
dataset_id str

path to a data set descriptor file; currently only YAML descriptors are supported.

''
data_path str

(unused - for future use) overriding path to data storage

''

Returns:

Name Type Description
TimeSeriesLibrary TimeSeriesLibrary

external pointer type ENSEMBLE_DATA_SET_PTR, or a Python class wrapper around it

Example

TODO

Source code in uchronia/data_set.py
def get_ensemble_dataset(dataset_id: str = "", data_path: str = "") -> "TimeSeriesLibrary":
    """
    Gets an object, a library to access a set of time series

    Loads a time series library from a data set descriptor file. The file must exist, otherwise a FileNotFoundError is raised.

    Args:
        dataset_id (str): path to a data set descriptor file; currently only YAML descriptors are supported.
        data_path (str): (unused - for future use) overriding path to data storage

    Returns:
        TimeSeriesLibrary: external pointer type ENSEMBLE_DATA_SET_PTR, or a Python class wrapper around it 

    Example:
        TODO

    """
    #     \dontrun{
    # d <- uchronia::sampleDataDir()
    # yamlFn <- file.path(d, 'time_series_library.yaml')
    # if(!file.exists(yamlFn)) {stop(paste0('sample YAML file ', yamlFn, ' not found')) }
    # file.show(yamlFn)
    # dataSet <- uchronia::getEnsembleDataSet(yamlFn)
    # (dataIds <- uchronia::getDataSetIds(dataSet))
    # subIdentifiers(dataSet, "var1_obs_collection")
    # (univTs <- uchronia::getDataSet(dataSet, "var1_obs"))
    # (multivTs <- uchronia::getDataSet(dataSet, "var1_obs_collection"))
    # (ensFcast <- uchronia::getDataSet(dataSet, "pet_fcast_ens"))
    # plot(asXts(univTs))
    # zoo::plot.zoo(asXts(multivTs))
    # ## precipIds <- paste( 'subarea', getSubareaIds(simulation), 'P', sep='.')
    # ## evapIds <- paste( 'subarea', getSubareaIds(simulation), 'E', sep='.')
    # ## swift::playInputs(simulation, dataSet, precipIds, rep('rain_obs', length(precipIds)))
    # ## swift::playInputs(simulation, dataSet, evapIds, rep('pet_obs', length(evapIds)), 'daily_to_hourly')
    # }

    # tools::file_ext(dataset_id)
    if os.path.exists(dataset_id):
        # if this is an RData file, load into this environment?
        # if this is a YAML data set, or something like that
        return uwg.LoadEnsembleDataset_py(dataset_id, data_path)
    else:
        raise FileNotFoundError(
            "file not found. get_ensemble_dataset is in a prototype stage and supports only YAML data set descriptors"
        )

get_multiple_time_series_from_provider(ts_provider, var_ids, api_get_ts_func)

Gets one or more time series from a time series provider

Gets one or more time series from a time series provider. This function is exported for use by other Python packages rather than for end users.

Parameters:

Name Type Description Default
ts_provider Any

wrapper around an object coercible to a TIME_SERIES_PROVIDER_PTR

required
var_ids VecStr

IDs of the time series to retrieve from the provider

required
api_get_ts_func TsRetrievalSignature

a function that takes 'ts_provider' and a string (a time series ID) as arguments.

required

Returns:

Type Description
xr.DataArray

an xarray time series

Example

R syntax, carried over from the R package:

internalGetRecordedTts <- function(simulation, varIds) {
  uchronia::getMultipleTimeSeriesFromProvider(simulation, varIds, GetRecorded_Pkg_R)
}

Source code in uchronia/data_set.py
def get_multiple_time_series_from_provider(
    ts_provider: "TimeSeriesProvider", var_ids: VecStr, api_get_ts_func: "TsRetrievalSignature"
) -> xr.DataArray:
    """
    Gets one or more time series from a time series provider

    Gets one or more time series from a time series provider. This function is exported for use by other Python packages rather than for end users.

    Args:
        ts_provider (Any): wrapper around an object coercible to a TIME_SERIES_PROVIDER_PTR
        var_ids (VecStr): IDs of the time series to retrieve from the provider
        api_get_ts_func (TsRetrievalSignature): a function that takes 'ts_provider' and a string (a time series ID) as arguments.

    Returns:
        an xarray time series

    Example:
        >>> # internalGetRecordedTts <- function(simulation, varIds) {
        >>> # uchronia::getMultipleTimeSeriesFromProvider(simulation, varIds, GetRecorded_Pkg_R)
        >>> # }

    """
    from uchronia.internals import internal_get_multiple_time_series

    return internal_get_multiple_time_series(ts_provider, var_ids, api_get_ts_func)
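
The callback contract can be illustrated with a Python analogue of the R example. This is a sketch under stated assumptions: the body of `internal_get_multiple_time_series` is not shown on this page, so `get_multiple_sketch` simply assumes one call to the retrieval function per identifier and returns a plain `dict` rather than an `xr.DataArray`. `fake_simulation`, `get_recorded`, and `internal_get_recorded_ts` are hypothetical names.

```python
from typing import Callable, Dict, List


def get_multiple_sketch(
    provider, var_ids: List[str], api_get_ts_func: Callable
) -> Dict[str, list]:
    """Call api_get_ts_func(provider, var_id) once per id and collect the results."""
    return {var_id: api_get_ts_func(provider, var_id) for var_id in var_ids}


# Hypothetical provider: a dict mapping variable ids to recorded values.
fake_simulation = {"runoff": [0.1, 0.2], "rain": [5.0, 0.0]}


def get_recorded(provider, var_id):
    """Hypothetical retrieval function with the (provider, id) signature."""
    return provider[var_id]


def internal_get_recorded_ts(simulation, var_ids):
    """Analogue of the R wrapper: bind the retrieval function once."""
    return get_multiple_sketch(simulation, var_ids, get_recorded)


recorded = internal_get_recorded_ts(fake_simulation, ["rain", "runoff"])
```

A dependent package would supply its own retrieval function in place of `get_recorded`, which is why the function is passed in as an argument.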

get_time_series_from_provider(provider, data_id=None)

Gets a time series from a time series provider, given a data ID

The provider argument must be a wrapper around an external pointer to an object whose type inherits from the C++ type datatypes::timeseries::TimeSeriesProvider<double>

Parameters:

Name Type Description Default
provider Any

an external pointer to a uchronia Time Series Provider.

required
data_id str

one or more data identifiers for the time series. If None, all known identifiers are used; be careful about the resulting size.

None

Returns:

Name Type Description
Any Any

an xarray time series, or several, or None?

Source code in uchronia/data_set.py
def get_time_series_from_provider(provider: "TimeSeriesProvider", data_id: str = None) -> Any:
    """
    Gets a time series from a time series provider, given a data ID

    The provider argument must be a wrapper around an external pointer to an object whose type
    inherits from the C++ type datatypes::timeseries::TimeSeriesProvider<double>

    Args:
        provider (Any): an external pointer to a uchronia Time Series Provider.
        data_id (str): one or more data identifiers for the time series. If None, all known identifiers are used; be careful about the resulting size.

    Returns:
        Any: an xarray time series, or several, or None?

    """
    if data_id is None:
        data_id = get_data_identifiers(provider)
    return internal_get_time_series_from_provider(provider, data_id)
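
The `data_id is None` fallback can be sketched with an in-memory provider. Everything here is hypothetical: `FakeProvider`, its `identifiers` and `fetch` methods, and the gauge names stand in for the native provider and for `get_data_identifiers`. Accepting either a single id or a list of ids is an assumption of this sketch, and it returns a `dict` rather than an xarray object.

```python
class FakeProvider:
    """Hypothetical provider holding a few named series in memory."""

    def __init__(self, series):
        self._series = series

    def identifiers(self):
        return list(self._series)

    def fetch(self, data_id):
        return self._series[data_id]


def get_time_series_sketch(provider, data_id=None):
    """Fall back to every known identifier when data_id is None."""
    if data_id is None:
        data_id = provider.identifiers()
    # Accept a single id or a list of ids (an assumption of this sketch).
    ids = [data_id] if isinstance(data_id, str) else list(data_id)
    return {i: provider.fetch(i) for i in ids}


provider = FakeProvider({"gauge_a": [1.0, 2.0], "gauge_b": [3.0]})
everything = get_time_series_sketch(provider)
one = get_time_series_sketch(provider, "gauge_b")
```

Omitting `data_id` retrieves every series the provider knows about, which is the case the size warning refers to.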