Basic Processors#

Select Region#

canproc.processors.base.select_region(data: Dataset | DataArray, region: dict[str, tuple[float, float]] = {'lat': (-90, 90), 'lon': (-180, 360)}) Dataset | DataArray#

Select a geographic region. Expects longitude coordinates (0 to 360). If longitude[1] > longitude[0], the selection is wrapped from east to west.

Parameters:
  • data (xr.Dataset | xr.DataArray) – input data

  • region (dict[str, tuple[float, float]], optional) – region to use for selection, by default {"lat": (-90, 90), "lon": (-180, 360)}

Returns:

xr.Dataset | xr.DataArray – subset of input data

Area Weights#

canproc.processors.base.area_weights(data: DataArray | Dataset, dim: str = 'lat', kernel: Callable = <function spherical_weights>) DataArray#

Compute the relative weights for area weighting. Input data is expected to be on a regular grid.

Parameters:
  • data (xr.DataArray | xr.Dataset) – Input data

  • dim (str, optional) – name of the latitude dimension, by default “lat”

  • kernel (Callable, optional) – function to compute the weights, by default spherical_weights

Returns:

xr.DataArray – weights along the dim dimension
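
As a rough illustration of what relative area weights on a regular grid look like, the sketch below uses the common cosine-of-latitude approximation in plain Python. The helper name `cosine_latitude_weights` is hypothetical, and whether the default kernel computes exactly this is an assumption; it is not taken from the canproc source.

```python
import math


def cosine_latitude_weights(lats_deg):
    """Return relative area weights for latitudes given in degrees.

    Assumption: cos(latitude) stands in for the true cell area, a common
    approximation on a regular grid. Weights are normalized to sum to 1.
    """
    raw = [math.cos(math.radians(lat)) for lat in lats_deg]
    total = sum(raw)
    return [w / total for w in raw]


lats = [-60.0, -30.0, 0.0, 30.0, 60.0]
weights = cosine_latitude_weights(lats)
```

Rows near the equator receive larger weights than rows near the poles, and the weights are symmetric about the equator.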

Area Mean#

canproc.processors.base.area_mean(data: DataArray | Dataset, weights: DataArray | None = None, region: dict[str, tuple[float, float]] | None = {'lat': (-90, 90), 'lon': (-180, 360)}, method: Literal['max', 'min', 'mean', 'std', 'sum'] = 'mean') DataArray | Dataset#

Compute the area-weighted mean of the data.

Parameters:
  • data (xr.DataArray | xr.Dataset) – input dataset to be averaged

  • region (dict[str, tuple[float, float]] | None, optional) – If set, a region is selected before averaging is performed, by default {"lat": (-90, 90), "lon": (-180, 360)}. Latitude and longitude dimensions are read from the region parameter if provided.

  • weights (xr.DataArray | None, optional) – User-provided weights to use for the average. If not supplied, weights are calculated using area_weights.

  • method (Literal['max', 'min', 'mean', 'std', 'sum'], optional) – statistic used to reduce the data over the region, by default “mean”

Returns:

xr.DataArray | xr.Dataset – input data after selection and averaging

Raises:

ValueError – If latitude and longitude coordinates cannot be found
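
The weighted average itself can be sketched in plain Python for a small latitude-by-longitude grid. This is only an illustration of the arithmetic, assuming one weight per latitude row; the helper name `weighted_area_mean` and the sample values are hypothetical and say nothing about how canproc handles dimensions, regions, or missing data.

```python
def weighted_area_mean(field, lat_weights):
    """Weighted mean of field[i][j], where i indexes latitude and j longitude.

    Every value in latitude row i is given the weight lat_weights[i].
    """
    numerator = 0.0
    denominator = 0.0
    for row, weight in zip(field, lat_weights):
        for value in row:
            numerator += weight * value
            denominator += weight
    if denominator == 0.0:
        raise ValueError("weights sum to zero")
    return numerator / denominator


# Three latitude rows, two longitudes; the middle row has twice the weight.
field = [[1.0, 1.0], [3.0, 3.0], [1.0, 1.0]]
lat_weights = [0.5, 1.0, 0.5]
mean = weighted_area_mean(field, lat_weights)
```

With these numbers the weighted mean is 2.0, whereas an unweighted mean of the same values would be 5/3, which shows how the weights shift the result toward the heavily weighted row.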

Cell Area#

canproc.processors.base.cell_area(data: DataArray | Dataset, latdim: str = 'lat', londim: str = 'lon', radius: float = 6371000.0, broadcast: bool = False) DataArray#

Calculate the area of each grid cell in a rectilinear latitude-longitude grid. If longitude spacing is constant, a 1D array is returned unless broadcast is True.

Parameters:
  • data (xr.DataArray or xr.Dataset) – Input data containing latitude and longitude dimensions.

  • latdim (str, optional) – Name of the latitude dimension in the data. Default is “lat”.

  • londim (str, optional) – Name of the longitude dimension in the data. Default is “lon”.

  • radius (float, optional) – Radius of the sphere (e.g., Earth) in meters. Default is 6371000 m.

  • broadcast (bool, optional) – If True, broadcast longitude weights to match the shape of the data. Default is False.

Returns:

xr.DataArray – Array of cell areas with the same shape as the input data’s spatial dimensions.

Notes

  • Assumes latitude and longitude are in degrees.

  • Uses spherical geometry for latitude weighting and linear weighting for longitude.
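
The note on spherical geometry can be made concrete with the standard formula for a latitude-longitude cell on a sphere: area = R² · (sin φ_north − sin φ_south) · Δλ, with Δλ in radians. The sketch below is a minimal plain-Python version that assumes the cell edges are known; the helper name `spherical_cell_area` is hypothetical, and how canproc derives cell edges from the coordinate values is not stated here.

```python
import math


def spherical_cell_area(lat_south, lat_north, dlon_deg, radius=6371000.0):
    """Area in square meters of a cell bounded by two latitudes (degrees)
    and spanning dlon_deg degrees of longitude on a sphere."""
    # Spherical term in latitude: difference of the sines of the cell edges.
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    # Linear term in longitude: the cell width in radians.
    return radius ** 2 * band * math.radians(dlon_deg)


equator_cell = spherical_cell_area(0.0, 1.0, 1.0)
polar_cell = spherical_cell_area(89.0, 90.0, 1.0)
whole_sphere = spherical_cell_area(-90.0, 90.0, 360.0)
```

A one-degree cell at the equator is much larger than one next to the pole, and a single cell covering the whole globe recovers the sphere's surface area, 4πR².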

Zonal Mean#

canproc.processors.base.zonal_mean(data: Dataset | DataArray, lon_dim: str = 'lon') Dataset | DataArray#

Compute the zonal mean, i.e. the average over the longitude dimension.

Parameters:
  • data (xr.Dataset | xr.DataArray) – Data to be averaged.

  • lon_dim (str, optional) – name of longitude dimension over which to average, by default “lon”.

Returns:

xr.Dataset | xr.DataArray – Zonally averaged data

Pressure Interpolation#

canproc.processors.physics.interpolate_to_pressure(data: DataArray, input_pressure: DataArray, output_pressure: DataArray | list[float] | ndarray, input_dim: str = 'level', output_dim: str = 'plev') DataArray#

Interpolate data from log hybrid sigma levels [0, 1] onto pressure levels.

Parameters:
  • data (xr.DataArray) – Data to be interpolated

  • input_pressure (xr.DataArray) – Pressure used for interpolation, should be the same shape as data

  • output_pressure (xr.DataArray | list[float] | np.ndarray) – One-dimensional output pressure levels

  • input_dim (str, optional) – name of the level dimension in the input data, by default “level”

  • output_dim (str, optional) – name of the pressure dimension in the output, by default “plev”

Returns:

xr.DataArray – data interpolated onto pressure levels
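
To show the idea for a single vertical column, the sketch below interpolates linearly in the logarithm of pressure using only the standard library. That the interpolation is linear in log pressure is an assumption made for illustration, as are the helper name `interp_log_pressure`, the NaN fill outside the input range, and the sample values; none of this is taken from the canproc source.

```python
import math
from bisect import bisect_left


def interp_log_pressure(values, pressure, targets):
    """Interpolate one column onto target pressures, linearly in log(p).

    `pressure` must be strictly increasing and the same length as `values`.
    Targets outside the input pressure range yield NaN.
    """
    log_p = [math.log(p) for p in pressure]
    result = []
    for target in targets:
        if target < pressure[0] or target > pressure[-1]:
            result.append(float("nan"))
            continue
        x = math.log(target)
        i = bisect_left(log_p, x)  # first index with log_p[i] >= x
        if i == 0:
            result.append(values[0])
            continue
        frac = (x - log_p[i - 1]) / (log_p[i] - log_p[i - 1])
        result.append(values[i - 1] + frac * (values[i] - values[i - 1]))
    return result


pressure = [100.0, 1000.0, 10000.0, 100000.0]  # Pa, one column
values = [0.0, 1.0, 2.0, 3.0]
column = interp_log_pressure(values, pressure, [316.22776601683796, 1000.0, 50.0])
```

The first target sits halfway between 100 Pa and 1000 Pa in log space, so it lands midway between the two neighbouring values; the last target lies above the top input level and is filled with NaN.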

To NetCDF#

canproc.processors.base.to_netcdf(data: Dataset | DataArray, filename: str, **kwargs) Dataset | DataArray#

Save an xarray Dataset or DataArray to a NetCDF file with CMIP-compliant options.

This function wraps xarray.to_netcdf, providing additional handling for encoding options (such as writing double-precision floats as single-precision) and metadata insertion. It also appends provenance information to the dataset attributes.

Parameters:
  • data (xr.Dataset or xr.DataArray) – The xarray object to be saved.

  • filename (str) – The path to the output NetCDF file.

  • **kwargs

    Additional keyword arguments passed to xarray.to_netcdf. Special handling is provided for:
    • encoding: dict, optional

      Encoding options for variables. If ‘write_double_as_float’ is present, float64 variables are written as float32.

    • metadata: dict, optional

      Metadata to be added to the dataset before saving.

    • template: string, optional

      If a template is provided, the filename is determined dynamically using values from kwargs["naming_kwargs"].

    • naming_kwargs: dict, optional

      Values used to fill the template. Ignored if template is not provided.

Returns:

xr.Dataset or xr.DataArray – The input data, possibly with updated attributes.
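
xarray's to_netcdf accepts an encoding mapping from variable name to options such as dtype. The sketch below builds such a mapping so that double-precision variables are stored as single precision. It only illustrates the effect described for ‘write_double_as_float’; the helper name `float32_encoding` and its input format are hypothetical, and how canproc implements the option internally is not shown here.

```python
def float32_encoding(dtypes):
    """Build an xarray-style encoding dict that stores float64 variables
    as float32.

    `dtypes` maps variable names to dtype strings, e.g. {"tas": "float64"}.
    Variables of any other dtype are left out, so their encoding is unchanged.
    """
    return {
        name: {"dtype": "float32"}
        for name, dtype in dtypes.items()
        if dtype == "float64"
    }


encoding = float32_encoding({"tas": "float64", "mask": "int8", "pr": "float32"})
```

The resulting dictionary contains an entry only for the float64 variable and could be passed as the encoding argument when writing a dataset with xarray.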