efts_io

efts-io package.

Ensemble forecast time series

Modules:

  • attributes

    Management of netCDF attributes.

  • cli

    Module that contains the command line application.

  • conventions

    Naming conventions for the EFTS netCDF file format.

  • debug

    Debugging utilities.

  • dimensions

    Functions to create and manipulate dimensions for netCDF files.

  • helpers

    Helper functions for netcdf file.

  • variables

    Handling of EFTS netCDF variables definitions.

  • wrapper

    A thin wrapper around xarray for reading and writing Ensemble Forecast Time Series (EFTS) data sets.

Classes:

  • DataOriginType

    Type of data origin according to STF 2.0 conventions.

  • EftsDataSet

    Convenience class for access to an Ensemble Forecast Time Series in a netCDF file.

  • LocationType

    Type of measurement location according to STF 2.0 conventions.

  • StfVariable

    Hydrological variable type in the STF convention.

  • TimeSeriesType

    Type of time series aggregation according to STF 2.0 conventions.

Functions:

  • create_global_attributes

    Creates STF global attributes.

  • create_mandatory_global_attributes

    Create a dictionary of mandatory global attributes for an EFTS dataset.

  • create_quality_variable_attributes

    Create attributes for a quality code variable (e.g., rain_obs_qual).

  • create_state_variable_attributes

    Create attributes for a state variable (e.g., sv1, sv2).

  • create_var_attribute_definition

    Create variable attribute definition (legacy function).

  • create_variable_attributes

    Create variable attributes for STF 2.0 compliant netCDF files.

  • get_parser

    Return the CLI argument parser.

  • main

    Run the main program.

  • open_efts

    Open an EFTS NetCDF file.

  • template_variable_attributes

    Create a template dictionary for variable attributes.

  • validate_global_attributes

    Validate a dictionary of global attributes against STF 2.0 conventions.

  • validate_quality_variable_attributes

    Validate a dictionary of quality variable attributes against STF 2.0 conventions.

  • validate_state_variable_attributes

    Validate a dictionary of state variable attributes against STF 2.0 conventions.

  • validate_variable_attributes

    Validate a dictionary of data variable attributes against STF 2.0 conventions.

  • xr_efts

    Create an xarray Dataset for EFTS data.

DataOriginType

DataOriginType(code: str, description: str)

Bases: Enum

Type of data origin according to STF 2.0 conventions.

This enumeration defines how the data was obtained or generated, following the STF (Standard Time Format) 2.0 conventions.

Attributes:

  • OBSERVED

    Data observed directly from instruments (e.g., gauged rainfall)

  • DERIVED

    Data derived from observations through processing (e.g., AWAP rainfall)

  • SIMULATED

    Data simulated from historical observations (e.g., flow from GR4H with obs forcing)

  • FORECAST

    Data forecast/simulated from predictions (e.g., flow from GR4H with NWP forcing)

Examples:

>>> from efts_io.attributes import DataOriginType
>>> origin = DataOriginType.OBSERVED
>>> origin.code
'obs'
>>> origin.description
'observed directly'

Parameters:

  • code

    (str) –

    String code defined by STF 2.0 conventions

  • description

    (str) –

    Human-readable description of the data origin

Source code in src/efts_io/conventions.py
def __init__(self, code: str, description: str) -> None:
    """Initialize a DataOriginType with its string code and text description.

    Args:
        code: String code defined by STF 2.0 conventions
        description: Human-readable description of the data origin
    """
    self.code = code
    self.description = description

EftsDataSet

EftsDataSet(data: Union[str, Dataset])

Convenience class for access to an Ensemble Forecast Time Series in a netCDF file.

Methods:

  • append_history

    Append a new entry to the history attribute with a timestamp.

  • create_data_variables

    Create data variables in the data set.

  • get_all_series

    Return a multivariate time series, where each column is the series for one of the identifiers.

  • get_dim_names

    Gets the names of all dimensions in the data set.

  • get_ensemble_for_stations

    Not yet implemented.

  • get_ensemble_forecasts

    Not yet implemented. Gets an ensemble forecast for a variable.

  • get_ensemble_size

    Return the length of the ensemble size dimension.

  • get_lead_time_count

    Length of the lead time dimension.

  • get_lead_time_values

    Return the values of the lead time dimension.

  • get_single_series

    Return a single point time series for a station identifier.

  • get_station_count

    Return the number of stations in the data set.

  • get_stations_varname

    Return the name of the variable that has the station identifiers.

  • get_time_dim

    Return the time dimension variable as a vector of date-time stamps.

  • new_variable

    Create a new variable in the data set.

  • put_lead_time_values

    Set the values of the lead time dimension.

  • save_to_stf2

    Save to file.

  • set_mandatory_global_attributes

    Sets mandatory global attributes for an EFTS dataset.

  • to_netcdf

    Write the data set to a netCDF file.

  • writeable_to_stf2

    Check if the dataset can be written to a netCDF file compliant with STF 2.0 specification.

Attributes:

  • catchment (str) –

    Get or set the catchment attribute of the dataset.

  • comment (str) –

    Get or set the comment attribute of the dataset.

  • history (str) –

    Gets/sets the history attribute of the dataset.

  • institution (str) –

    Get or set the institution attribute of the dataset.

  • source (str) –

    Get or set the source attribute of the dataset.

  • stf2_int_datatype (str) –

    The integer type used when saving in the STF 2.x netCDF convention: 'i4' or 'i8'.

  • stf_convention_version (float) –

    Get or set the STF_convention_version attribute of the dataset.

  • stf_nc_spec (str) –

    Get or set the STF_nc_spec attribute of the dataset.

  • title (str) –

    Get or set the title attribute of the dataset.

Source code in src/efts_io/wrapper.py
def __init__(self, data: Union[str, xr.Dataset]) -> None:
    """Create a new EftsDataSet object."""
    self.time_dim = None
    self.time_zone = "UTC"
    self.time_zone_timestamps = True  # Not sure about https://github.com/csiro-hydroinformatics/efts-io/issues/3
    self.STATION_DIMNAME = STATION_DIMNAME
    self.stations_varname = STATION_ID_VARNAME
    self.LEAD_TIME_DIMNAME = LEAD_TIME_DIMNAME
    self.ENS_MEMBER_DIMNAME = ENS_MEMBER_DIMNAME
    # self.identifiers_dimensions: list = []
    self.data: xr.Dataset
    from pathlib import Path

    if data is None:
        raise ValueError("input cannot be None")
    if isinstance(data, Path):
        data = str(data)
    if isinstance(data, str):
        new_dataset = load_from_stf2_file(data, self.time_zone_timestamps)
        self.data = new_dataset
    elif isinstance(data, xr.Dataset):
        self.data = data
    else:
        raise TypeError(f"Unsupported type {type(data)}")

    self.stf2_int_datatype = "i4"  # default integer type for STF2 saving
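
A minimal usage sketch (not from the package docs): the file name forecast.nc is hypothetical, and any STF 2.0 netCDF file or pre-built xarray Dataset would do.

from efts_io.wrapper import EftsDataSet

# Open from a file path; str and pathlib.Path both work, and an
# existing xarray.Dataset can be wrapped directly instead.
ds = EftsDataSet("forecast.nc")  # hypothetical file path
print(ds.get_dim_names())
print(ds.get_station_count())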

catchment property writable

catchment: str

Get or set the catchment attribute of the dataset.

comment property writable

comment: str

Get or set the comment attribute of the dataset.

history property writable

history: str

Gets/sets the history attribute of the dataset.

institution property writable

institution: str

Get or set the institution attribute of the dataset.

source property writable

source: str

Get or set the source attribute of the dataset.

stf2_int_datatype property writable

stf2_int_datatype: str

The integer type used when saving in the STF 2.x netCDF convention: 'i4' or 'i8'.

stf_convention_version property writable

stf_convention_version: float

Get or set the STF_convention_version attribute of the dataset.

stf_nc_spec property writable

stf_nc_spec: str

Get or set the STF_nc_spec attribute of the dataset.

title property writable

title: str

Get or set the title attribute of the dataset.

append_history

append_history(
    message: str, timestamp: Optional[datetime] = None
) -> None

Append a new entry to the history attribute with a timestamp.

message: The message to append.
timestamp: If not provided, the current UTC time is used.

Source code in src/efts_io/wrapper.py
def append_history(self, message: str, timestamp: Optional[datetime] = None) -> None:
    """Append a new entry to the `history` attribute with a timestamp.

    message: The message to append.
    timestamp: If not provided, the current UTC time is used.
    """
    from datetime import UTC

    if timestamp is None:
        timestamp = datetime.now(UTC).isoformat()

    current_history = self.data.attrs.get(HISTORY_ATTR_KEY, "")
    if current_history:
        self.data.attrs[HISTORY_ATTR_KEY] = f"{current_history}\n{timestamp} - {message}"
    else:
        self.data.attrs[HISTORY_ATTR_KEY] = f"{timestamp} - {message}"
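
A short sketch, assuming ds is an EftsDataSet as constructed above:

from datetime import datetime, timezone

# Append with the current UTC time...
ds.append_history("bias-corrected rainfall")
# ...or pin an explicit timestamp for reproducibility.
ds.append_history(
    "regridded to stations",
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
)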

create_data_variables

create_data_variables(
    data_var_def: Dict[str, Dict[str, Any]],
) -> None

Create data variables in the data set.

Each value in data_var_def is expected to provide the keys 'name', 'longname', 'units', 'dim_type', 'missval', 'precision' and 'attributes'.

Source code in src/efts_io/wrapper.py
def create_data_variables(self, data_var_def: Dict[str, Dict[str, Any]]) -> None:
    """Create data variables in the data set.

    var_defs_dict["variable_1"].keys()
    dict_keys(['name', 'longname', 'units', 'dim_type', 'missval', 'precision', 'attributes'])
    """
    ens_fcast_data_var_def = [x for x in data_var_def.values() if x["dim_type"] == "4"]
    ens_data_var_def = [x for x in data_var_def.values() if x["dim_type"] == "3"]
    point_data_var_def = [x for x in data_var_def.values() if x["dim_type"] == "2"]

    four_dims_names = (LEAD_TIME_DIMNAME, STATION_ID_DIMNAME, REALISATION_DIMNAME, TIME_DIMNAME)
    three_dims_names = (STATION_ID_DIMNAME, REALISATION_DIMNAME, TIME_DIMNAME)
    two_dims_names = (STATION_ID_DIMNAME, TIME_DIMNAME)

    four_dims_shape = tuple(self.data.sizes[dimname] for dimname in four_dims_names)
    three_dims_shape = tuple(self.data.sizes[dimname] for dimname in three_dims_names)
    two_dims_shape = tuple(self.data.sizes[dimname] for dimname in two_dims_names)
    for vardefs, dims_shape, dims_names in [
        (ens_fcast_data_var_def, four_dims_shape, four_dims_names),
        (ens_data_var_def, three_dims_shape, three_dims_names),
        (point_data_var_def, two_dims_shape, two_dims_names),
    ]:
        for x in vardefs:
            varname = x["name"]
            # TODO:
            # _check_mandatory_keys(x)
            self._new_variable_from_legacy_specs(dims_shape, dims_names, x, varname)
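
A hedged sketch of the legacy definition dictionary this method consumes; the variable name rain_obs and the attribute values are illustrative only:

var_defs = {
    "rain_obs": {
        "name": "rain_obs",
        "longname": "observed rainfall",
        "units": "mm",
        "dim_type": "2",  # "2": [station, time]; "3" adds realisation; "4" adds lead time
        "missval": -9999.0,
        "precision": "double",
        "attributes": {},
    },
}
ds.create_data_variables(var_defs)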

get_all_series

get_all_series(
    variable_name: str = "rain_obs",
    dimension_id: Optional[str] = None,
) -> DataArray

Return a multivariate time series, where each column is the series for one of the identifiers.

Source code in src/efts_io/wrapper.py
def get_all_series(
    self,
    variable_name: str = "rain_obs",
    dimension_id: Optional[str] = None,  # noqa: ARG002
) -> xr.DataArray:
    """Return a multivariate time series, where each column is the series for one of the identifiers."""
    # Return a multivariate time series, where each column is the series for one of the identifiers (self, e.g. rainfall station identifiers):
    return self.data[variable_name]

get_dim_names

get_dim_names() -> List[str]

Gets the names of all dimensions in the data set.

Source code in src/efts_io/wrapper.py
def get_dim_names(self) -> List[str]:
    """Gets the name of all dimensions in the data set."""
    return [x for x in self.data.sizes.keys()]  # noqa: C416, SIM118

get_ensemble_for_stations

get_ensemble_for_stations(
    variable_name: str = "rain_sim",
    identifier: Optional[str] = None,
    dimension_id: str = ENS_MEMBER_DIMNAME,
    start_time: Optional[Timestamp] = None,
    lead_time_count: Optional[int] = None,
) -> DataArray

Not yet implemented.

Source code in src/efts_io/wrapper.py
def get_ensemble_for_stations(
    self,
    variable_name: str = "rain_sim",
    identifier: Optional[str] = None,
    dimension_id: str = ENS_MEMBER_DIMNAME,
    start_time: Optional[pd.Timestamp] = None,
    lead_time_count: Optional[int] = None,
) -> xr.DataArray:
    """Not yet implemented."""
    # Return a time series, representing a single ensemble member forecast for all stations over the lead time
    raise NotImplementedError

get_ensemble_forecasts

get_ensemble_forecasts(
    variable_name: str = "rain_sim",
    identifier: Optional[str] = None,
    dimension_id: Optional[str] = None,
    start_time: Optional[Timestamp] = None,
    lead_time_count: Optional[int] = None,
) -> DataArray

Not yet implemented. Gets an ensemble forecast for a variable.

Source code in src/efts_io/wrapper.py
def get_ensemble_forecasts(
    self,
    variable_name: str = "rain_sim",
    identifier: Optional[str] = None,
    dimension_id: Optional[str] = None,
    start_time: Optional[pd.Timestamp] = None,
    lead_time_count: Optional[int] = None,
) -> xr.DataArray:
    """Not yet implemented. Gets an ensemble forecast for a variable."""
    # Return a time series, ensemble of forecasts over the lead time
    raise NotImplementedError(
        "get_ensemble_forecasts: not yet implemented",
    )

get_ensemble_size

get_ensemble_size() -> int

Return the length of the ensemble size dimension.

Source code in src/efts_io/wrapper.py
def get_ensemble_size(self) -> int:
    """Return the length of the ensemble size dimension."""
    return self._dim_size(REALISATION_DIMNAME)

get_lead_time_count

get_lead_time_count() -> int

Length of the lead time dimension.

Source code in src/efts_io/wrapper.py
def get_lead_time_count(self) -> int:
    """Length of the lead time dimension."""
    return self._dim_size(self.LEAD_TIME_DIMNAME)

get_lead_time_values

get_lead_time_values() -> ndarray

Return the values of the lead time dimension.

Source code in src/efts_io/wrapper.py
def get_lead_time_values(self) -> np.ndarray:
    """Return the values of the lead time dimension."""
    return self.data[self.LEAD_TIME_DIMNAME].values

get_single_series

get_single_series(
    variable_name: str = "rain_obs",
    identifier: Optional[str] = None,
    dimension_id: Optional[str] = None,
) -> DataArray

Return a single point time series for a station identifier.

Source code in src/efts_io/wrapper.py
def get_single_series(
    self,
    variable_name: str = "rain_obs",
    identifier: Optional[str] = None,
    dimension_id: Optional[str] = None,
) -> xr.DataArray:
    """Return a single point time series for a station identifier."""
    # Return a single point time series for a station identifier. Falls back on def get_all_series if the argument "identifier" is missing
    if dimension_id is None:
        dimension_id = self.get_stations_varname()
    return self.data[variable_name].sel({dimension_id: identifier})
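
A sketch, assuming the dataset holds a rain_obs variable and a station identifier "123456" (both hypothetical):

# Select the observed rainfall series for a single station.
series = ds.get_single_series(variable_name="rain_obs", identifier="123456")
print(series.dims, series.shape)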

get_station_count

get_station_count() -> int

Return the number of stations in the data set.

Source code in src/efts_io/wrapper.py
def get_station_count(self) -> int:
    """Return the number of stations in the data set."""
    return self._dim_size(STATION_ID_DIMNAME)

get_stations_varname

get_stations_varname() -> str

Return the name of the variable that has the station identifiers.

Source code in src/efts_io/wrapper.py
def get_stations_varname(self) -> str:
    """Return the name of the variable that has the station identifiers."""
    # Gets the name of the variable that has the station identifiers
    # TODO: station is integer normally in STF (Euargh)
    return STATION_ID_VARNAME

get_time_dim

get_time_dim() -> ndarray

Return the time dimension variable as a vector of date-time stamps.

Source code in src/efts_io/wrapper.py
def get_time_dim(self) -> np.ndarray:
    """Return the time dimension variable as a vector of date-time stamps."""
    # Gets the time dimension variable as a vector of date-time stamps
    return self.data.time.values  # but losing attributes.

new_variable

new_variable(
    varname: str,
    dim_names: Iterable[str],
    var_attributes: dict[str, Any],
    data: Optional[ndarray] = None,
) -> DataArray

Create a new variable in the data set.

Parameters:

  • varname

    (str) –

    Name of the new variable.

  • dim_names

    (Iterable[str]) –

    Names of the dimensions for the new variable.

  • var_attributes

    (dict[str, Any]) –

    Attributes for the new variable. Must include 'units' key. See template_variable_attributes

  • data

    (Optional[ndarray], default: None ) –

    Data for the new variable. If None, the variable is initialized with NaNs. Defaults to None.

Returns:

  • DataArray

    xr.DataArray: The newly created variable as an xarray DataArray.

Source code in src/efts_io/wrapper.py
def new_variable(
    self,
    varname: str,
    dim_names: Iterable[str],
    var_attributes: dict[str, Any],
    data: Optional[np.ndarray] = None,
) -> xr.DataArray:
    """Create a new variable in the data set.

    Args:
        varname (str): Name of the new variable.
        dim_names (Iterable[str]): Names of the dimensions for the new variable.
        var_attributes (dict[str, Any]): Attributes for the new variable. Must include 'units' key. See `template_variable_attributes`
        data (Optional[np.ndarray], optional): Data for the new variable. If None, the variable is initialized with NaNs. Defaults to None.

    Returns:
        xr.DataArray: The newly created variable as an xarray DataArray.
    """
    if varname in self.data.variables:
        raise ValueError(f"Variable '{varname}' already exists in the dataset.")
    if UNITS_ATTR_KEY not in var_attributes:
        raise ValueError(f"Variable attributes must include '{UNITS_ATTR_KEY}' key.")
    known_dimnames = self.get_dim_names()
    unknown_dims = [x for x in dim_names if x not in set(known_dimnames)]
    if unknown_dims:
        raise ValueError(f"Unknown dimension names: {unknown_dims}; must be one of {known_dimnames}.")
    dims_shape = tuple(self.data.sizes[dimname] for dimname in dim_names)
    if data is not None:
        if data.shape != dims_shape:
            raise ValueError(
                f"Data shape {data.shape} does not match expected shape {dims_shape} for dimensions {dim_names}.",
            )
        data_array = data
    else:
        data_array = nan_full(dims_shape)
    data_coords = {dim: self.data.coords[dim] for dim in dim_names}
    new_array = xr.DataArray(
        name=varname,
        data=data_array,
        coords=data_coords,
        dims=dim_names,
        attrs=var_attributes.copy(),
    )
    self.data[varname] = new_array
    return new_array
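
A sketch combining new_variable with the attribute helpers documented further below; the dimension names are assumptions and must match names returned by get_dim_names():

from efts_io.attributes import (
    DataOriginType,
    TimeSeriesType,
    create_variable_attributes,
)

attrs = create_variable_attributes(
    long_name="observed rainfall",
    units="mm",
    time_series_type=TimeSeriesType.ACCUMULATED,
    data_origin=DataOriginType.OBSERVED,
    data_description="gauge measurements",
)
# With data=None the variable is initialized with NaNs.
rain = ds.new_variable(
    "rain_obs",
    dim_names=("station", "time"),  # assumed dimension names; adjust to your dataset
    var_attributes=attrs,
)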

put_lead_time_values

put_lead_time_values(values: Iterable[float]) -> None

Set the values of the lead time dimension.

Source code in src/efts_io/wrapper.py
def put_lead_time_values(self, values: Iterable[float]) -> None:
    """Set the values of the lead time dimension."""
    self.data[self.LEAD_TIME_DIMNAME].values = np.array(values)

save_to_stf2

save_to_stf2(
    path: str,
    variable_name: Optional[str] = None,
    var_type: StfVariable = STREAMFLOW,
    data_type: DataOriginType = OBSERVED,
    ens: bool = False,
    timestep: str = "days",
    data_qual: Optional[DataArray] = None,
) -> None

Save the data set to an STF 2.x netCDF file.

Source code in src/efts_io/wrapper.py
def save_to_stf2(
    self,
    path: str,
    variable_name: Optional[str] = None,
    var_type: StfVariable = StfVariable.STREAMFLOW,
    data_type: DataOriginType = DataOriginType.OBSERVED,
    ens: bool = False,  # noqa: FBT001, FBT002
    timestep: str = "days",
    data_qual: Optional[xr.DataArray] = None,
) -> None:
    """Save to file."""
    from efts_io._ncdf_stf2 import write_nc_stf2

    if isinstance(self.data, xr.Dataset):
        if variable_name is None:
            raise ValueError("Inner data is a DataSet, so an explicit variable name must be explicitely specified.")
        d = self.data[variable_name]
    # elif isinstance(self.data, xr.DataArray):
    #    d = self.data
    else:
        raise TypeError(f"Unsupported data type {type(self.data)}")

    if UNITS_ATTR_KEY not in d.attrs:
        raise ValueError(f"DataArray variable '{d.name}' must have '{UNITS_ATTR_KEY}' attribute defined.")

    write_nc_stf2(
        out_nc_file=path,  # : str,
        dataset=self.data,
        data=d,  # : xr.DataArray,
        var_type=var_type,  # : int = 1,
        data_type=data_type,  # : int = 3,
        stf_nc_vers=2,  # : int = 2,
        ens=ens,  # : bool = False,
        timestep=timestep,  # :str="days",
        data_qual=data_qual,  # : Optional[xr.DataArray] = None,
        overwrite=True,  # :bool=True,
        # loc_info=loc_info, # : Optional[Dict[str, Any]] = None,
        intdata_type=self.stf2_int_datatype,
    )
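
A sketch of writing one variable out; output.nc is a hypothetical path and q_obs an assumed variable name that carries a units attribute:

from efts_io import DataOriginType, StfVariable

ds.save_to_stf2(
    "output.nc",  # hypothetical output path
    variable_name="q_obs",  # required because the inner data is a Dataset
    var_type=StfVariable.STREAMFLOW,
    data_type=DataOriginType.OBSERVED,
    timestep="days",
)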

set_mandatory_global_attributes

set_mandatory_global_attributes(
    title: str = "not provided",
    institution: str = "not provided",
    catchment: str = "not provided",
    source: str = "not provided",
    comment: str = "not provided",
    history: str = "not provided",
    append_history: bool = False,
) -> None

Sets mandatory global attributes for an EFTS dataset.

Source code in src/efts_io/wrapper.py
def set_mandatory_global_attributes(
    self,
    title: str = "not provided",
    institution: str = "not provided",
    catchment: str = "not provided",
    source: str = "not provided",
    comment: str = "not provided",
    history: str = "not provided",
    append_history: bool = False,  # noqa: FBT001, FBT002
) -> None:
    """Sets mandatory global attributes for an EFTS dataset."""
    self.title = title
    self.institution = institution
    self.catchment = catchment
    self.source = source
    self.comment = comment
    if append_history:
        self.append_history(history)
    else:
        self.history = history
    self.stf_convention_version = "2.0"
    self.stf_nc_spec = STF_2_0_URL
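
A sketch populating the mandatory STF 2.0 attributes before writing; all string values are placeholders:

ds.set_mandatory_global_attributes(
    title="Ensemble rainfall forecasts",
    institution="Example Institution",
    catchment="Example Catchment",
    source="example model run",
    comment="illustrative values only",
    history="created for demonstration",
)
# writeable_to_stf2 reports whether the dataset now meets the STF 2.0 spec.
print(ds.writeable_to_stf2())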

to_netcdf

to_netcdf(
    path: str, version: Optional[str] = "2.0"
) -> None

Write the data set to a netCDF file.

Source code in src/efts_io/wrapper.py
def to_netcdf(self, path: str, version: Optional[str] = "2.0") -> None:
    """Write the data set to a netCDF file."""
    if version is None:
        self.data.to_netcdf(path)
    elif version == "2.0":
        self.save_to_stf2(path)
    else:
        raise ValueError("Only version 2.0 is supported for now")

writeable_to_stf2

writeable_to_stf2() -> bool

Check if the dataset can be written to a netCDF file compliant with STF 2.0 specification.

This method checks whether the underlying xarray Dataset or DataArray has the required dimensions and global attributes as specified by the STF 2.0 convention.

Returns:

  • bool ( bool ) –

    True if the dataset can be written to a STF 2.0 compliant netCDF file, False otherwise.

Source code in src/efts_io/wrapper.py
def writeable_to_stf2(self) -> bool:
    """Check if the dataset can be written to a netCDF file compliant with STF 2.0 specification.

    This method checks if the underlying xarray dataset or dataarray has the required dimensions and global attributes as specified by the STF 2.0 convention.

    Returns:
        bool: True if the dataset can be written to a STF 2.0 compliant netCDF file, False otherwise.
    """
    from efts_io.conventions import exportable_to_stf2

    return exportable_to_stf2(self.data)

LocationType

Bases: Enum

Type of measurement location according to STF 2.0 conventions.

This enumeration defines whether the measurement represents a point or an area-averaged value.

Attributes:

  • POINT

    Point measurement (e.g., rain gauge, stream gauge)

  • AREA

    Area-averaged measurement (e.g., subcatchment area)

Examples:

>>> from efts_io.attributes import LocationType
>>> loc = LocationType.POINT
>>> loc.value
'Point'

StfVariable

Bases: Enum

Hydrological variable type in the STF convention.

TimeSeriesType

TimeSeriesType(code: int, description: str)

Bases: Enum

Type of time series aggregation according to STF 2.0 conventions.

This enumeration defines how time series data is aggregated or sampled, following the STF (Standard Time Format) 2.0 conventions for water forecasting netCDF files.

Attributes:

  • INSTANTANEOUS

    Data recorded at a specific instant (e.g., stage height)

  • ACCUMULATED

    Data accumulated over the preceding time interval (e.g., rainfall)

  • AVERAGED

    Data averaged over the preceding time interval (e.g., flow, average temp)

  • ACCUMULATED_FORECAST

    Data accumulated since start of forecast (e.g., cumulative flow)

  • POINT_IN_INTERVAL

    Point value recorded in the preceding interval (e.g., max/min temperature)

  • CLIMATOLOGY_INSTANTANEOUS

    Climatology of instantaneous data

  • CLIMATOLOGY_ACCUMULATED

    Climatology of accumulated data

  • CLIMATOLOGY_AVERAGED

    Climatology of averaged data

  • CLIMATOLOGY_ACCUMULATED_FORECAST

    Climatology of forecast-accumulated data

  • CLIMATOLOGY_POINT

    Climatology of point-in-interval data

Examples:

>>> from efts_io.attributes import TimeSeriesType
>>> ts_type = TimeSeriesType.ACCUMULATED
>>> ts_type.code
2
>>> ts_type.description
'accumulated over the preceding interval'

Parameters:

  • code

    (int) –

    Numeric code defined by STF 2.0 conventions

  • description

    (str) –

    Human-readable description of the aggregation type

Source code in src/efts_io/conventions.py
def __init__(self, code: int, description: str) -> None:
    """Initialize a TimeSeriesType with its numeric code and text description.

    Args:
        code: Numeric code defined by STF 2.0 conventions
        description: Human-readable description of the aggregation type
    """
    self.code = code
    self.description = description

create_global_attributes

create_global_attributes(
    title: str,
    institution: str,
    source: str,
    catchment: str,
    comment: str,
    stf_convention_version: float = 2.0,
    stf_nc_spec: str = STF_2_0_URL,
    history: str = "",
) -> dict[str, Any]

Creates STF global attributes.

Parameters:

  • title

    (str) –

    title

  • institution

    (str) –

    institution

  • source

    (str) –

    source

  • catchment

    (str) –

    catchment

  • comment

    (str) –

    comment

  • stf_convention_version

    (float, default: 2.0 ) –

    STF convention version (default: 2.0)

  • stf_nc_spec

    (str, default: STF_2_0_URL ) –

    URL to the STF specification document (default: STF 2.0 URL)

  • history

    (str, default: '' ) –

    audit trail for modifications to the original data (default: "")

Raises:

  • ValueError

    Unexpected or insufficient information

Returns:

  • dict[str, Any]

    dict[str, Any]: dictionary of global attributes

Source code in src/efts_io/attributes.py
def create_global_attributes(
    title: str,
    institution: str,
    source: str,
    catchment: str,
    comment: str,
    stf_convention_version: float = 2.0,
    stf_nc_spec: str = STF_2_0_URL,
    history: str = "",
) -> dict[str, Any]:
    """Creates STF global attributes.

    Args:
        title (str): title
        institution (str): institution
        source (str): source
        catchment (str): catchment
        comment (str): comment
        stf_convention_version (float): STF convention version (default: 2.0)
        stf_nc_spec (str): URL to the STF specification document (default: STF 2.0 URL)
        history (str): audit trail for modifications to the original data (default: "")

    Raises:
        ValueError: Unexpected or insufficient information

    Returns:
        dict[str, Any]: dictionary of global attributes
    """
    # catchment info should not have white spaces (and why was that???)
    # catchment = 'Upper  Murray River '
    # catchment = stringr::str_replace_all(catchment, pattern='\\s+', '_')

    if title == "":
        raise ValueError("Empty title is not accepted as a valid attribute")

    return {
        TITLE_ATTR_KEY: title,
        INSTITUTION_ATTR_KEY: institution,
        SOURCE_ATTR_KEY: source,
        CATCHMENT_ATTR_KEY: catchment,
        STF_CONVENTION_VERSION_ATTR_KEY: stf_convention_version,
        STF_NC_SPEC_ATTR_KEY: stf_nc_spec,
        COMMENT_ATTR_KEY: comment,
        HISTORY_ATTR_KEY: history,
    }

create_mandatory_global_attributes

create_mandatory_global_attributes(
    title: str,
    institution: str,
    catchment: str,
    source: str,
    comment: str,
    history: Optional[str] = None,
) -> Dict[str, str]

Create a dictionary of mandatory global attributes for an EFTS dataset.

Parameters:

  • title

    (str) –

    Title of the dataset.

  • institution

    (str) –

    Institution responsible for the dataset.

  • catchment

    (str) –

    Catchment area description.

  • source

    (str) –

    Source of the data.

  • comment

    (str) –

    Additional comments about the dataset.

  • history

    (Optional[str], default: None ) –

    History of the dataset. If None, a default history message is created. Defaults to None.

Returns:

  • Dict[str, str]

    Dict[str, str]: A dictionary containing the mandatory global attributes.

Source code in src/efts_io/wrapper.py
def create_mandatory_global_attributes(
    title: str,
    institution: str,
    catchment: str,
    source: str,
    comment: str,
    history: Optional[str] = None,
) -> Dict[str, str]:
    """Create a dictionary of mandatory global attributes for an EFTS dataset.

    Args:
        title (str): Title of the dataset.
        institution (str): Institution responsible for the dataset.
        catchment (str): Catchment area description.
        source (str): Source of the data.
        comment (str): Additional comments about the dataset.
        history (Optional[str], optional): History of the dataset. If None, a default history message is created. Defaults to None.

    Returns:
        Dict[str, str]: A dictionary containing the mandatory global attributes.
    """
    d = _stf2_mandatory_global_attributes(
        title=title,
        institution=institution,
        catchment=catchment,
        source=source,
        comment=comment,
        history=history or __default_history_attval(),
    )
    return d  # noqa: RET504
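
A sketch with placeholder values; omitting history lets the function supply a default entry, and dataset stands for an assumed pre-existing xarray.Dataset:

from efts_io.wrapper import create_mandatory_global_attributes

attrs = create_mandatory_global_attributes(
    title="Ensemble rainfall forecasts",
    institution="Example Institution",
    catchment="Example Catchment",
    source="example model run",
    comment="illustrative values only",
)
dataset.attrs.update(attrs)  # `dataset` is an assumed xarray.Dataset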

create_quality_variable_attributes

create_quality_variable_attributes(
    long_name: str,
    quality_code_standard: str,
    fill_value: int = -1,
) -> dict[str, Any]

Create attributes for a quality code variable (e.g., rain_obs_qual).

Quality code variables have a distinct set of attributes from data variables. Per the STF 2.0 conventions, they require long_name, units (the quality code standard), and _FillValue (an integer, default -1).

Parameters:

  • long_name

    (str) –

    Human-readable name (e.g., "Quality of observed rainfall")

  • quality_code_standard

    (str) –

    Quality code standard used (e.g., "ABC Quality coding")

  • fill_value

    (int, default: -1 ) –

    Integer fill value for missing data (default: -1)

Returns:

  • dict[str, Any]

    Dictionary of attributes ready to use with xarray DataArray or EftsDataSet.new_variable()

Examples:

>>> attrs = create_quality_variable_attributes(
...     long_name="Quality of observed rainfall",
...     quality_code_standard="ABC Quality coding",
... )
>>> attrs['_FillValue']
-1
Source code in src/efts_io/attributes.py
def create_quality_variable_attributes(
    long_name: str,
    quality_code_standard: str,
    fill_value: int = -1,
) -> dict[str, Any]:
    """Create attributes for a quality code variable (e.g., rain_obs_qual).

    Quality code variables have a distinct set of attributes from data variables.
    Per the STF 2.0 conventions, they require ``long_name``, ``units`` (the quality
    code standard), and ``_FillValue`` (an integer, default -1).

    Args:
        long_name: Human-readable name (e.g., "Quality of observed rainfall")
        quality_code_standard: Quality code standard used (e.g., "ABC Quality coding")
        fill_value: Integer fill value for missing data (default: -1)

    Returns:
        Dictionary of attributes ready to use with xarray DataArray or EftsDataSet.new_variable()

    Examples:
        >>> attrs = create_quality_variable_attributes(
        ...     long_name="Quality of observed rainfall",
        ...     quality_code_standard="ABC Quality coding",
        ... )
        >>> attrs['_FillValue']
        -1
    """
    return {
        LONG_NAME_ATTR_KEY: long_name,
        UNITS_ATTR_KEY: quality_code_standard,
        FILLVALUE_ATTR_KEY: fill_value,
    }

create_state_variable_attributes

create_state_variable_attributes(
    long_name: str,
    model_name: str,
    sv_name: str,
    sv_description: str,
    fill_value: float = -9999.0,
) -> dict[str, Any]

Create attributes for a state variable (e.g., sv1, sv2).

State variables store internal model states. Per the STF 2.0 conventions, they require long_name, model_name, sv_name, sv_description, and _FillValue.

Parameters:

  • long_name

    (str) –

    Human-readable name (e.g., "state var 1")

  • model_name

    (str) –

    Name of the model (e.g., "GR4H_RR")

  • sv_name

    (str) –

    Name of the state variable in the model (e.g., "UH_Inflow")

  • sv_description

    (str) –

    Description of the state variable (e.g., "Total inflow to Unit Hydrographs in GR4H")

  • fill_value

    (float, default: -9999.0 ) –

    Fill value for missing data (default: -9999.0)

Returns:

  • dict[str, Any]

    Dictionary of attributes ready to use with xarray DataArray or EftsDataSet.new_variable()

Examples:

>>> attrs = create_state_variable_attributes(
...     long_name="state var 1",
...     model_name="GR4H_RR",
...     sv_name="UH_Inflow",
...     sv_description="Total inflow to Unit Hydrographs in GR4H",
... )
>>> attrs['model_name']
'GR4H_RR'
Source code in src/efts_io/attributes.py
def create_state_variable_attributes(
    long_name: str,
    model_name: str,
    sv_name: str,
    sv_description: str,
    fill_value: float = -9999.0,
) -> dict[str, Any]:
    """Create attributes for a state variable (e.g., sv1, sv2).

    State variables store internal model states. Per the STF 2.0 conventions,
    they require ``long_name``, ``model_name``, ``sv_name``, ``sv_description``,
    and ``_FillValue``.

    Args:
        long_name: Human-readable name (e.g., "state var 1")
        model_name: Name of the model (e.g., "GR4H_RR")
        sv_name: Name of the state variable in the model (e.g., "UH_Inflow")
        sv_description: Description of the state variable (e.g., "Total inflow to Unit Hydrographs in GR4H")
        fill_value: Fill value for missing data (default: -9999.0)

    Returns:
        Dictionary of attributes ready to use with xarray DataArray or EftsDataSet.new_variable()

    Examples:
        >>> attrs = create_state_variable_attributes(
        ...     long_name="state var 1",
        ...     model_name="GR4H_RR",
        ...     sv_name="UH_Inflow",
        ...     sv_description="Total inflow to Unit Hydrographs in GR4H",
        ... )
        >>> attrs['model_name']
        'GR4H_RR'
    """
    return {
        LONG_NAME_ATTR_KEY: long_name,
        MODEL_NAME_ATTR_KEY: model_name,
        SV_NAME_ATTR_KEY: sv_name,
        SV_DESCRIPTION_ATTR_KEY: sv_description,
        FILLVALUE_ATTR_KEY: fill_value,
    }

create_var_attribute_definition

create_var_attribute_definition(
    data_type_code: int = 2,
    type_description: str = "accumulated over the preceding interval",
    dat_type: str = "der",
    dat_type_description: str = "AWAP data interpolated from observations",
    location_type: str = "Point",
) -> dict[str, str]

Create variable attribute definition (legacy function).

Deprecated: this function is maintained for backward compatibility. For new code, use create_variable_attributes with the type-safe enumerations (TimeSeriesType, DataOriginType, LocationType) instead.

Parameters:

  • data_type_code

    (int, default: 2 ) –

    Numeric code for time series type (1-5, 11-15)

  • type_description

    (str, default: 'accumulated over the preceding interval' ) –

    Description of the aggregation type

  • dat_type

    (str, default: 'der' ) –

    String code for data origin ("obs", "der", "sim", "fct")

  • dat_type_description

    (str, default: 'AWAP data interpolated from observations' ) –

    Description of the data

  • location_type

    (str, default: 'Point' ) –

    "Point" or "Area"

Returns:

  • dict[str, str]

    Dictionary of type-related attributes

Examples:

>>> # Old way (still works but not recommended)
>>> attrs = create_var_attribute_definition(
...     data_type_code=2,
...     type_description='accumulated over the preceding interval',
...     dat_type='obs'
... )
>>>
>>> # New recommended way
>>> from efts_io.attributes import create_variable_attributes, TimeSeriesType, DataOriginType
>>> attrs = create_variable_attributes(
...     long_name="observed rainfall",
...     units="mm",
...     time_series_type=TimeSeriesType.ACCUMULATED,
...     data_origin=DataOriginType.OBSERVED,
...     data_description="gauge measurements"
... )
Source code in src/efts_io/attributes.py
def create_var_attribute_definition(
    data_type_code: int = 2,
    type_description: str = "accumulated over the preceding interval",
    dat_type: str = "der",
    dat_type_description: str = "AWAP data interpolated from observations",
    location_type: str = "Point",
) -> dict[str, str]:
    """Create variable attribute definition (legacy function).

    .. deprecated::
        This function is maintained for backward compatibility.
        For new code, use :func:`create_variable_attributes` with the type-safe enumerations
        (:class:`TimeSeriesType`, :class:`DataOriginType`, :class:`LocationType`) instead.

    Args:
        data_type_code: Numeric code for time series type (1-5, 11-15)
        type_description: Description of the aggregation type
        dat_type: String code for data origin ("obs", "der", "sim", "fct")
        dat_type_description: Description of the data
        location_type: "Point" or "Area"

    Returns:
        Dictionary of type-related attributes

    Examples:
        >>> # Old way (still works but not recommended)
        >>> attrs = create_var_attribute_definition(
        ...     data_type_code=2,
        ...     type_description='accumulated over the preceding interval',
        ...     dat_type='obs'
        ... )
        >>>
        >>> # New recommended way
        >>> from efts_io.attributes import create_variable_attributes, TimeSeriesType, DataOriginType
        >>> attrs = create_variable_attributes(
        ...     long_name="observed rainfall",
        ...     units="mm",
        ...     time_series_type=TimeSeriesType.ACCUMULATED,
        ...     data_origin=DataOriginType.OBSERVED,
        ...     data_description="gauge measurements"
        ... )
    """
    return {
        TYPE_ATTR_KEY: str(data_type_code),
        TYPE_DESCRIPTION_ATTR_KEY: type_description,
        DAT_TYPE_ATTR_KEY: dat_type,
        DAT_TYPE_DESCRIPTION_ATTR_KEY: dat_type_description,
        LOCATION_TYPE_ATTR_KEY: location_type,
    }

create_variable_attributes

create_variable_attributes(
    long_name: str,
    units: str,
    time_series_type: TimeSeriesType,
    data_origin: DataOriginType,
    data_description: str,
    location_type: LocationType = POINT,
    fill_value: float = -9999.0,
) -> dict[str, Any]

Create variable attributes for STF 2.0 compliant netCDF files.

This is the recommended function for creating metadata attributes for data variables. It uses type-safe enumerations to ensure attributes conform to STF 2.0 conventions without requiring users to remember numeric codes or string identifiers.

Parameters:

  • long_name

    (str) –

    Human-readable name for the variable (e.g., "observed rainfall")

  • units

    (str) –

    Units of measurement (e.g., "mm", "m3/s", "°C")

  • time_series_type

    (TimeSeriesType) –

    How the data is aggregated/sampled (use TimeSeriesType enum)

  • data_origin

    (DataOriginType) –

    How the data was obtained (use DataOriginType enum)

  • data_description

    (str) –

    Detailed description of the data (e.g., "AWAP data interpolated from observations")

  • location_type

    (LocationType, default: POINT ) –

    Whether data is point or area measurement (default: POINT)

  • fill_value

    (float, default: -9999.0 ) –

    Value used for missing data (default: -9999.0)

Returns:

  • dict[str, Any]

    Dictionary of attributes ready to use with xarray DataArray or EftsDataSet.new_variable()

Examples:

>>> from efts_io.attributes import (
...     create_variable_attributes,
...     TimeSeriesType,
...     DataOriginType,
...     LocationType
... )
>>> attrs = create_variable_attributes(
...     long_name="observed rainfall",
...     units="mm",
...     time_series_type=TimeSeriesType.ACCUMULATED,
...     data_origin=DataOriginType.OBSERVED,
...     data_description="gauge measurements from station network",
...     location_type=LocationType.POINT
... )
>>> attrs['type']
2
>>> attrs['type_description']
'accumulated over the preceding interval'
>>> attrs['dat_type']
'obs'
See Also
  • TimeSeriesType: Enumeration of valid time series aggregation types
  • DataOriginType: Enumeration of valid data origin types
  • LocationType: Enumeration of valid location types
  • template_variable_attributes: For getting an empty template dictionary
Source code in src/efts_io/attributes.py
def create_variable_attributes(
    long_name: str,
    units: str,
    time_series_type: TimeSeriesType,
    data_origin: DataOriginType,
    data_description: str,
    location_type: LocationType = LocationType.POINT,
    fill_value: float = -9999.0,
) -> dict[str, Any]:
    """Create variable attributes for STF 2.0 compliant netCDF files.

    This is the recommended function for creating metadata attributes for data variables.
    It uses type-safe enumerations to ensure attributes conform to STF 2.0 conventions
    without requiring users to remember numeric codes or string identifiers.

    Args:
        long_name: Human-readable name for the variable (e.g., "observed rainfall")
        units: Units of measurement (e.g., "mm", "m3/s", "°C")
        time_series_type: How the data is aggregated/sampled (use TimeSeriesType enum)
        data_origin: How the data was obtained (use DataOriginType enum)
        data_description: Detailed description of the data (e.g., "AWAP data interpolated from observations")
        location_type: Whether data is point or area measurement (default: POINT)
        fill_value: Value used for missing data (default: -9999.0)

    Returns:
        Dictionary of attributes ready to use with xarray DataArray or EftsDataSet.new_variable()

    Examples:
        >>> from efts_io.attributes import (
        ...     create_variable_attributes,
        ...     TimeSeriesType,
        ...     DataOriginType,
        ...     LocationType
        ... )
        >>> attrs = create_variable_attributes(
        ...     long_name="observed rainfall",
        ...     units="mm",
        ...     time_series_type=TimeSeriesType.ACCUMULATED,
        ...     data_origin=DataOriginType.OBSERVED,
        ...     data_description="gauge measurements from station network",
        ...     location_type=LocationType.POINT
        ... )
        >>> attrs['type']
        2
        >>> attrs['type_description']
        'accumulated over the preceding interval'
        >>> attrs['dat_type']
        'obs'

    See Also:
        - TimeSeriesType: Enumeration of valid time series aggregation types
        - DataOriginType: Enumeration of valid data origin types
        - LocationType: Enumeration of valid location types
        - template_variable_attributes: For getting an empty template dictionary
    """
    return {
        LONG_NAME_ATTR_KEY: long_name,
        UNITS_ATTR_KEY: units,
        FILLVALUE_ATTR_KEY: fill_value,
        TYPE_ATTR_KEY: time_series_type.code,
        TYPE_DESCRIPTION_ATTR_KEY: time_series_type.description,
        DAT_TYPE_ATTR_KEY: data_origin.code,
        DAT_TYPE_DESCRIPTION_ATTR_KEY: data_description,
        LOCATION_TYPE_ATTR_KEY: location_type.value,
    }

get_parser

get_parser() -> ArgumentParser

Return the CLI argument parser.

Returns:

  • ArgumentParser –

    An argparse parser.

Source code in src/efts_io/_internal/cli.py
def get_parser() -> argparse.ArgumentParser:
    """Return the CLI argument parser.

    Returns:
        An argparse parser.
    """
    parser = argparse.ArgumentParser(prog="efts")
    parser.add_argument("-V", "--version", action="version", version=f"%(prog)s {debug._get_version()}")
    parser.add_argument("--debug-info", action=_DebugInfo, help="Print debug information.")
    return parser

main

main(args: list[str] | None = None) -> int

Run the main program.

This function is executed when you type efts or python -m efts_io.

Parameters:

  • args

    (list[str] | None, default: None ) –

    Arguments passed from the command line.

Returns:

  • int

    An exit code.

Source code in src/efts_io/_internal/cli.py
def main(args: list[str] | None = None) -> int:
    """Run the main program.

    This function is executed when you type `efts` or `python -m efts_io`.

    Parameters:
        args: Arguments passed from the command line.

    Returns:
        An exit code.
    """
    parser = get_parser()
    opts = parser.parse_args(args=args)
    print(opts)
    return 0
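
A sketch of invoking the entry point programmatically (assuming main is importable from the cli module listed above); an empty argument list parses to an empty namespace and the function returns 0:

from efts_io.cli import main

exit_code = main([])  # prints the parsed (empty) options
assert exit_code == 0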

open_efts

open_efts(ncfile: Any) -> EftsDataSet

Open an EFTS NetCDF file.

Source code in src/efts_io/wrapper.py
def open_efts(ncfile: Any) -> EftsDataSet:
    """Open an EFTS NetCDF file."""
    # raise NotImplemented("open_efts")
    # if isinstance(ncfile, str):
    #     nc = ncdf4::nc_open(ncfile, readunlim = FALSE, write = writein)
    # } else if (methods::is(ncfile, "ncdf4")) {
    #     nc = ncfile
    # }
    return EftsDataSet(ncfile)
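
A one-line convenience sketch, equivalent to constructing EftsDataSet directly; forecast.nc is a hypothetical file:

from efts_io.wrapper import open_efts

ds = open_efts("forecast.nc")  # hypothetical file path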

template_variable_attributes

template_variable_attributes(
    time_series_type: Optional[TimeSeriesType] = None,
    data_origin: Optional[DataOriginType] = None,
    location_type: Optional[LocationType] = None,
    fill_value: float = -9999.0,
) -> dict[str, Any]

Create a template dictionary for variable attributes.

This function provides a starting point for creating variable attributes that comply with STF 2.0 conventions. For the recommended type-safe approach, use the enumerations from efts_io.attributes.

Parameters:

  • time_series_type

    (Optional[TimeSeriesType], default: None ) –

    TimeSeriesType enum or None (pre-fills type info if provided)

  • data_origin

    (Optional[DataOriginType], default: None ) –

    DataOriginType enum or None (pre-fills data origin if provided)

  • location_type

    (Optional[LocationType], default: None ) –

    LocationType enum or None (defaults to POINT)

  • fill_value

    (float, default: -9999.0 ) –

    Value for missing data (default: -9999.0)

Returns:

  • dict[str, Any]

    Dictionary with all required attribute keys

Examples:

>>> from efts_io import EftsDataSet
>>> from efts_io.attributes import TimeSeriesType, DataOriginType
>>>
>>> # Using type-safe enums (recommended)
>>> attrs = template_variable_attributes(
...     time_series_type=TimeSeriesType.ACCUMULATED,
...     data_origin=DataOriginType.OBSERVED
... )
>>> attrs['long_name'] = "observed rainfall"
>>> attrs['units'] = "mm"
>>>
>>> # Or get a blank template
>>> attrs = template_variable_attributes()
Note

For complete attribute creation in one call, use: from efts_io.attributes import create_variable_attributes

See Also
  • efts_io.attributes.create_variable_attributes: Type-safe attribute creation
  • efts_io.attributes.TimeSeriesType: Valid time series aggregation types
  • efts_io.attributes.DataOriginType: Valid data origin types
  • efts_io.attributes.LocationType: Valid location types
Source code in src/efts_io/attributes.py
def template_variable_attributes(
    time_series_type: Optional["TimeSeriesType"] = None,
    data_origin: Optional["DataOriginType"] = None,
    location_type: Optional["LocationType"] = None,
    fill_value: float = -9999.0,
) -> dict[str, Any]:
    """Create a template dictionary for variable attributes.

    This function provides a starting point for creating variable attributes
    that comply with STF 2.0 conventions. For the recommended type-safe approach,
    use the enumerations from efts_io.attributes.

    Args:
        time_series_type: TimeSeriesType enum or None (pre-fills type info if provided)
        data_origin: DataOriginType enum or None (pre-fills data origin if provided)
        location_type: LocationType enum or None (defaults to POINT)
        fill_value: Value for missing data (default: -9999.0)

    Returns:
        Dictionary with all required attribute keys

    Examples:
        >>> from efts_io import EftsDataSet
        >>> from efts_io.attributes import TimeSeriesType, DataOriginType
        >>>
        >>> # Using type-safe enums (recommended)
        >>> attrs = template_variable_attributes(
        ...     time_series_type=TimeSeriesType.ACCUMULATED,
        ...     data_origin=DataOriginType.OBSERVED
        ... )
        >>> attrs['long_name'] = "observed rainfall"
        >>> attrs['units'] = "mm"
        >>>
        >>> # Or get a blank template
        >>> attrs = template_variable_attributes()

    Note:
        For complete attribute creation in one call, use:
        `from efts_io.attributes import create_variable_attributes`

    See Also:
        - efts_io.attributes.create_variable_attributes: Type-safe attribute creation
        - efts_io.attributes.TimeSeriesType: Valid time series aggregation types
        - efts_io.attributes.DataOriginType: Valid data origin types
        - efts_io.attributes.LocationType: Valid location types
    """
    from efts_io.attributes import LocationType

    if location_type is None:
        location_type = LocationType.POINT

    return _create_template_variable_attributes(
        time_series_type=time_series_type,
        data_origin=data_origin,
        location_type=location_type,
        fill_value=fill_value,
    )

validate_global_attributes

validate_global_attributes(
    attrs: dict[str, Any],
) -> list[str]

Validate a dictionary of global attributes against STF 2.0 conventions.

Parameters:

  • attrs

    (dict[str, Any]) –

    Dictionary of attributes to validate

Returns:

  • list[str]

    List of error message strings. Empty list means valid.

Examples:

>>> from efts_io.attributes import create_global_attributes
>>> attrs = create_global_attributes("Title", "Inst", "Src", "Catch", "Comment")
>>> validate_global_attributes(attrs)
[]
Source code in src/efts_io/attributes.py
def validate_global_attributes(attrs: dict[str, Any]) -> list[str]:
    """Validate a dictionary of global attributes against STF 2.0 conventions.

    Args:
        attrs: Dictionary of attributes to validate

    Returns:
        List of error message strings. Empty list means valid.

    Examples:
        >>> from efts_io.attributes import create_global_attributes
        >>> attrs = create_global_attributes("Title", "Inst", "Src", "Catch", "Comment")
        >>> validate_global_attributes(attrs)
        []
    """
    errors: list[str] = []
    required_keys = {
        TITLE_ATTR_KEY: str,
        INSTITUTION_ATTR_KEY: str,
        SOURCE_ATTR_KEY: str,
        CATCHMENT_ATTR_KEY: str,
        STF_CONVENTION_VERSION_ATTR_KEY: (int, float),
        STF_NC_SPEC_ATTR_KEY: str,
        COMMENT_ATTR_KEY: str,
        HISTORY_ATTR_KEY: str,
    }

    for key, expected_type in required_keys.items():
        if key not in attrs:
            errors.append(f"Missing required attribute '{key}'")
        elif not isinstance(attrs[key], expected_type):
            errors.append(
                f"Attribute '{key}' has type '{type(attrs[key]).__name__}',"
                f" expected '{expected_type.__name__ if isinstance(expected_type, type) else ' or '.join(t.__name__ for t in expected_type)}'",
            )

    if TITLE_ATTR_KEY in attrs and isinstance(attrs[TITLE_ATTR_KEY], str) and attrs[TITLE_ATTR_KEY] == "":
        errors.append(f"Attribute '{TITLE_ATTR_KEY}' must not be empty")

    return errors

validate_quality_variable_attributes

validate_quality_variable_attributes(
    attrs: dict[str, Any],
) -> list[str]

Validate a dictionary of quality variable attributes against STF 2.0 conventions.

Parameters:

  • attrs

    (dict[str, Any]) –

    Dictionary of attributes to validate

Returns:

  • list[str]

    List of error message strings. Empty list means valid.

Examples:

>>> from efts_io.attributes import create_quality_variable_attributes
>>> attrs = create_quality_variable_attributes("Quality of observed rainfall", "ABC Quality coding")
>>> validate_quality_variable_attributes(attrs)
[]
Source code in src/efts_io/attributes.py
def validate_quality_variable_attributes(attrs: dict[str, Any]) -> list[str]:
    """Validate a dictionary of quality variable attributes against STF 2.0 conventions.

    Args:
        attrs: Dictionary of attributes to validate

    Returns:
        List of error message strings. Empty list means valid.

    Examples:
        >>> from efts_io.attributes import create_quality_variable_attributes
        >>> attrs = create_quality_variable_attributes("Quality of observed rainfall", "ABC Quality coding")
        >>> validate_quality_variable_attributes(attrs)
        []
    """
    errors: list[str] = []
    required_keys = {
        LONG_NAME_ATTR_KEY: str,
        UNITS_ATTR_KEY: str,
        FILLVALUE_ATTR_KEY: int,
    }

    for key, expected_type in required_keys.items():
        if key not in attrs:
            errors.append(f"Missing required attribute '{key}'")
        elif not isinstance(attrs[key], expected_type):
            errors.append(
                f"Attribute '{key}' has type '{type(attrs[key]).__name__}', expected '{expected_type.__name__}'",
            )

    return errors

validate_state_variable_attributes

validate_state_variable_attributes(
    attrs: dict[str, Any],
) -> list[str]

Validate a dictionary of state variable attributes against STF 2.0 conventions.

Parameters:

  • attrs

    (dict[str, Any]) –

    Dictionary of attributes to validate

Returns:

  • list[str]

    List of error message strings. Empty list means valid.

Examples:

>>> from efts_io.attributes import create_state_variable_attributes
>>> attrs = create_state_variable_attributes("sv1", "GR4H_RR", "UH_Inflow", "desc")
>>> validate_state_variable_attributes(attrs)
[]
Source code in src/efts_io/attributes.py
def validate_state_variable_attributes(attrs: dict[str, Any]) -> list[str]:
    """Validate a dictionary of state variable attributes against STF 2.0 conventions.

    Args:
        attrs: Dictionary of attributes to validate

    Returns:
        List of error message strings. Empty list means valid.

    Examples:
        >>> from efts_io.attributes import create_state_variable_attributes
        >>> attrs = create_state_variable_attributes("sv1", "GR4H_RR", "UH_Inflow", "desc")
        >>> validate_state_variable_attributes(attrs)
        []
    """
    errors: list[str] = []
    required_keys = {
        LONG_NAME_ATTR_KEY: str,
        MODEL_NAME_ATTR_KEY: str,
        SV_NAME_ATTR_KEY: str,
        SV_DESCRIPTION_ATTR_KEY: str,
        FILLVALUE_ATTR_KEY: (int, float),
    }

    for key, expected_type in required_keys.items():
        if key not in attrs:
            errors.append(f"Missing required attribute '{key}'")
        elif not isinstance(attrs[key], expected_type):
            errors.append(
                f"Attribute '{key}' has type '{type(attrs[key]).__name__}',"
                f" expected '{expected_type.__name__ if isinstance(expected_type, type) else ' or '.join(t.__name__ for t in expected_type)}'",
            )

    return errors

validate_variable_attributes

validate_variable_attributes(
    attrs: dict[str, Any],
) -> list[str]

Validate a dictionary of data variable attributes against STF 2.0 conventions.

Checks that all required keys are present and that coded values are valid.

Parameters:

  • attrs

    (dict[str, Any]) –

    Dictionary of attributes to validate

Returns:

  • list[str]

    List of error message strings. Empty list means valid.

Examples:

>>> errors = validate_variable_attributes({})
>>> len(errors) > 0
True
Source code in src/efts_io/attributes.py
def validate_variable_attributes(attrs: dict[str, Any]) -> list[str]:
    """Validate a dictionary of data variable attributes against STF 2.0 conventions.

    Checks that all required keys are present and that coded values are valid.

    Args:
        attrs: Dictionary of attributes to validate

    Returns:
        List of error message strings. Empty list means valid.

    Examples:
        >>> from efts_io.attributes import validate_variable_attributes
        >>> errors = validate_variable_attributes({})
        >>> len(errors) > 0
        True
    """
    errors: list[str] = []
    required_keys = {
        LONG_NAME_ATTR_KEY: str,
        UNITS_ATTR_KEY: str,
        FILLVALUE_ATTR_KEY: (int, float),
        TYPE_ATTR_KEY: int,
        TYPE_DESCRIPTION_ATTR_KEY: str,
        DAT_TYPE_ATTR_KEY: str,
        DAT_TYPE_DESCRIPTION_ATTR_KEY: str,
        LOCATION_TYPE_ATTR_KEY: str,
    }

    for key, expected_type in required_keys.items():
        if key not in attrs:
            errors.append(f"Missing required attribute '{key}'")
        elif not isinstance(attrs[key], expected_type):
            # expected_type is either a single type or a tuple of accepted types
            expected_name = (
                expected_type.__name__
                if isinstance(expected_type, type)
                else " or ".join(t.__name__ for t in expected_type)
            )
            errors.append(
                f"Attribute '{key}' has type '{type(attrs[key]).__name__}', expected '{expected_name}'",
            )

    type_code = attrs.get(TYPE_ATTR_KEY)
    if isinstance(type_code, int) and type_code not in _VALID_TYPE_CODES:
        errors.append(
            f"Attribute '{TYPE_ATTR_KEY}' has value {type_code}, expected one of {sorted(_VALID_TYPE_CODES)}",
        )

    dat_type = attrs.get(DAT_TYPE_ATTR_KEY)
    if isinstance(dat_type, str) and dat_type not in _VALID_DAT_TYPE_CODES:
        errors.append(
            f"Attribute '{DAT_TYPE_ATTR_KEY}' has value '{dat_type}', expected one of {sorted(_VALID_DAT_TYPE_CODES)}",
        )

    location_type = attrs.get(LOCATION_TYPE_ATTR_KEY)
    if isinstance(location_type, str) and location_type not in _VALID_LOCATION_TYPES:
        errors.append(
            f"Attribute '{LOCATION_TYPE_ATTR_KEY}' has value '{location_type}', expected one of {sorted(_VALID_LOCATION_TYPES)}",
        )

    return errors
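
A sketch of the coded-value checks: an unrecognised code is reported alongside the sorted list of valid values, on top of any missing-key errors. LOCATION_TYPE_ATTR_KEY is assumed importable from efts_io.attributes, where it is referenced unqualified in the source above:

from efts_io.attributes import (
    LOCATION_TYPE_ATTR_KEY,  # assumption: re-exported here; it is used unqualified in the source above
    validate_variable_attributes,
)

attrs = {LOCATION_TYPE_ATTR_KEY: "not-a-location-type"}
errors = validate_variable_attributes(attrs)
# Expect one "Missing required attribute ..." message per absent required key,
# plus a coded-value error listing the valid location types.
print("\n".join(errors))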

xr_efts

xr_efts(
    issue_times: Iterable[ConvertibleToTimestamp],
    station_ids: Iterable[str],
    lead_times: Optional[Iterable[int]] = None,
    lead_time_tstep: str = "hours",
    ensemble_size: int = 1,
    station_names: Optional[Iterable[str]] = None,
    latitudes: Optional[Iterable[float]] = None,
    longitudes: Optional[Iterable[float]] = None,
    areas: Optional[Iterable[float]] = None,
    nc_attributes: Optional[Dict[str, str]] = None,
) -> Dataset

Create an xarray Dataset for EFTS data.

Parameters:

  • issue_times

    (Iterable[ConvertibleToTimestamp]) –

    Forecast issue times. A pandas DatetimeIndex is accepted and converted to a list of timestamps.

  • station_ids

    (Iterable[str]) –

    Station identifiers; must be unique, otherwise a ValueError is raised.

  • lead_times

    (Optional[Iterable[int]]) –

    Forecast lead time steps. Defaults to [0] when omitted.

  • lead_time_tstep

    (str) –

    Unit of the lead time step (e.g. "hours"), recorded as "<lead_time_tstep> since time" in the lead_time units attribute.

  • ensemble_size

    (int) –

    Number of ensemble members; the realisation coordinate runs from 1 to ensemble_size.

  • station_names

    (Optional[Iterable[str]]) –

    Station names. Default to the station identifiers.

  • latitudes

    (Optional[Iterable[float]]) –

    Station latitudes in degrees north. Default to NaN.

  • longitudes

    (Optional[Iterable[float]]) –

    Station longitudes in degrees east. Default to NaN.

  • areas

    (Optional[Iterable[float]]) –

    Station areas in km^2. Default to NaN.

  • nc_attributes

    (Optional[Dict[str, str]]) –

    Global netCDF attributes. Default to the mandatory STF 2.0 global attributes.

Returns:

  • Dataset

    An xarray Dataset with time, station_id, realisation and lead_time coordinates, plus per-station station_name, lat, lon and area variables.

Source code in src/efts_io/wrapper.py
def xr_efts(
    issue_times: Iterable[ConvertibleToTimestamp],
    station_ids: Iterable[str],
    lead_times: Optional[Iterable[int]] = None,
    lead_time_tstep: str = "hours",
    ensemble_size: int = 1,
    # variables
    station_names: Optional[Iterable[str]] = None,
    latitudes: Optional[Iterable[float]] = None,
    longitudes: Optional[Iterable[float]] = None,
    areas: Optional[Iterable[float]] = None,
    nc_attributes: Optional[Dict[str, str]] = None,
) -> xr.Dataset:
    """Create an xarray Dataset for EFTS data."""
    # Check that station ids are unique:
    if len(set(station_ids)) != len(station_ids):
        raise ValueError("Station names must be unique.")
    # I learned today that xarray 2025.7.1 can now accept pandas datetimeindex as coordinates
    # for backward compatibility with older xarray versions, we convert to list here.
    # See https://github.com/csiro-hydroinformatics/efts-io/issues/13, in the future may change design.
    if isinstance(issue_times, pd.DatetimeIndex):
        # This will convert each item to a tstamp such as
        # Timestamp('2023-01-01 00:00:00+1000', tz='UTC+10:00')
        issue_times = list(issue_times)  # issue_times is iterable,and iterated over indeed.
    if lead_times is None:
        lead_times = [0]
    coords = {
        TIME_DIMNAME: issue_times,
        # STATION_DIMNAME: np.arange(start=1, stop=len(station_ids) + 1, step=1),
        STATION_ID_DIMNAME: station_ids,
        REALISATION_DIMNAME: np.arange(start=1, stop=ensemble_size + 1, step=1),
        LEAD_TIME_DIMNAME: lead_times,
        # Initially I explored attaching a coordinate to the existing STATION_DIMNAME dimension, using
        # https://docs.xarray.dev/en/latest/generated/xarray.DataArray.assign_coords.html#xarray.DataArray.assign_coords
        # and https://github.com/pydata/xarray/issues/2028#issuecomment-1265252754 to be able to
        # index by station IDs. In July 2025 I decided not to have a STATION_DIMNAME dimension, which was
        # an artefact of legacy conventions (Fortran 1-based indexing and related limitations).
        # Keeping a number-based STATION_DIMNAME only makes things more difficult and data subsetting more bug-prone.
        # STATION_ID_VARNAME: (STATION_DIMNAME, station_ids),
    }
    n_stations = len(station_ids)
    latitudes = latitudes if latitudes is not None else nan_full(n_stations)
    longitudes = longitudes if longitudes is not None else nan_full(n_stations)
    areas = areas if areas is not None else nan_full(n_stations)
    station_names = station_names if station_names is not None else [f"{i}" for i in station_ids]
    data_vars = {
        STATION_NAME_VARNAME: (STATION_ID_DIMNAME, station_names),
        LAT_VARNAME: (STATION_ID_DIMNAME, latitudes),
        LON_VARNAME: (STATION_ID_DIMNAME, longitudes),
        AREA_VARNAME: (STATION_ID_DIMNAME, areas),
    }
    nc_attributes = nc_attributes or _stf2_mandatory_global_attributes()
    d = xr.Dataset(
        data_vars=data_vars,
        coords=coords,
        attrs=nc_attributes,
    )
    # Credits to the work reported in https://github.com/pydata/xarray/issues/2028#issuecomment-1265252754
    # d = d.set_xindex(STATION_ID_VARNAME)
    d.time.attrs = {
        STANDARD_NAME_ATTR_KEY: TIME_DIMNAME,
        LONG_NAME_ATTR_KEY: TIME_DIMNAME,
        # TIME_STANDARD_KEY: "UTC",
        AXIS_ATTR_KEY: "t",
        # UNITS_ATTR_KEY: "days since 2000-11-14 23:00:00.0 +0000",
    }
    d.lead_time.attrs = {
        STANDARD_NAME_ATTR_KEY: "lead time",
        LONG_NAME_ATTR_KEY: "forecast lead time",
        AXIS_ATTR_KEY: "v",
        UNITS_ATTR_KEY: f"{lead_time_tstep} since time",
    }
    d.realisation.attrs = {
        STANDARD_NAME_ATTR_KEY: ENS_MEMBER_DIMNAME,  # TODO: should we keep the STF 2.0 ens_member as a standard name?
        LONG_NAME_ATTR_KEY: "ensemble member",
        UNITS_ATTR_KEY: "member id",
        AXIS_ATTR_KEY: "u",
    }
    d.station_id.attrs = {LONG_NAME_ATTR_KEY: "station or node identification code"}
    d.station_name.attrs = {LONG_NAME_ATTR_KEY: "station or node name"}
    d.lat.attrs = {LONG_NAME_ATTR_KEY: "latitude", UNITS_ATTR_KEY: "degrees_north", AXIS_ATTR_KEY: "y"}
    d.lon.attrs = {LONG_NAME_ATTR_KEY: "longitude", UNITS_ATTR_KEY: "degrees_east", AXIS_ATTR_KEY: "x"}
    d.area.attrs = {
        LONG_NAME_ATTR_KEY: "station area",
        UNITS_ATTR_KEY: "km^2",
        STANDARD_NAME_ATTR_KEY: AREA_VARNAME,
    }
    return d
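
A minimal usage sketch, assuming xr_efts is importable from efts_io.wrapper (the source file shown above); the station identifiers and dimension sizes are made up for illustration:

import pandas as pd

from efts_io.wrapper import xr_efts  # import path assumed from the source file above

# Hypothetical skeleton: 3 daily issue times, 2 stations, 24 hourly lead times, 10 members.
ds = xr_efts(
    issue_times=pd.date_range("2024-01-01", periods=3, freq="D"),
    station_ids=["410001", "410002"],  # made-up identifiers
    lead_times=list(range(1, 25)),
    lead_time_tstep="hours",
    ensemble_size=10,
    station_names=["Station A", "Station B"],
)
print(ds.sizes)  # expect time=3, station_id=2, realisation=10, lead_time=24
# Because station_id is an indexed coordinate, subsetting by identifier is direct:
print(ds.sel(station_id="410001"))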