mth5.groups

Import all Group objects

Submodules

Classes

BaseGroup

Base class for HDF5 group management with metadata handling.

ReportsGroup

Store report files (PDF/text) and images under /Survey/Reports.

StandardsGroup

Container for metadata standards documentation stored in the HDF5 file.

FiltersGroup

Container for managing all filter types in MTH5 format.

EstimateDataset

Container for statistical estimates of transfer functions.

FCChannelDataset

Container for Fourier coefficients (FC) from windowed FFT analysis.

MasterFCGroup

Master container for all Fourier Coefficient estimations of time series data.

FCGroup

Manage a set of Fourier Coefficients from a single processing run.

FCDecimationGroup

Container for a single decimation level of Fourier Coefficient data.

TransferFunctionsGroup

Container for transfer functions under a station.

TransferFunctionGroup

Wrapper for a single transfer function estimation.

ElectricDataset

Specialized container for electric field channel data.

MagneticDataset

Specialized container for magnetic field channel data.

ChannelDataset

A container for channel time series data stored in HDF5 format.

AuxiliaryDataset

Specialized container for auxiliary channel data.

RunGroup

Container for a single MT measurement run with multiple channels.

FeatureChannelDataset

Container for multi-dimensional Fourier Coefficients organized by time and frequency.

MasterFeaturesGroup

Master group container for features associated with Fourier Coefficients or time series.

FeatureGroup

Container for a single feature set with all associated runs and decimation levels.

FeatureTSRunGroup

Container for time series features from a processing or analysis run.

FeatureFCRunGroup

Container for Fourier Coefficient features from a processing run.

FeatureDecimationGroup

Container for a single decimation level with multiple Fourier Coefficient channels.

MasterSurveyGroup

Collection helper for surveys under Experiment/Surveys.

SurveyGroup

Wrapper for a single survey at Experiment/Surveys/<id>.

ExperimentGroup

Utility class to hold general information about the experiment and

Package Contents

class mth5.groups.BaseGroup(group: h5py.Group | h5py.Dataset, group_metadata: mt_metadata.base.MetadataBase | None = None, **kwargs: Any)

Base class for HDF5 group management with metadata handling.

Provides core functionality for reading, writing, and managing HDF5 groups with integrated metadata validation using mt_metadata standards.

Parameters:
  • group (h5py.Group or h5py.Dataset) – HDF5 group or dataset object to wrap.

  • group_metadata (MetadataBase, optional) – Metadata container with validated attributes. Default is None.

  • **kwargs (dict) – Additional keyword arguments to set as instance attributes.

hdf5_group

Weak reference to the underlying HDF5 group.

Type:

h5py.Group or h5py.Dataset

metadata

Metadata object with validation and standards compliance.

Type:

MetadataBase

logger

Logger instance for tracking operations.

Type:

loguru.Logger

compression

HDF5 compression method (e.g., ‘gzip’).

Type:

str, optional

compression_opts

Compression options/level.

Type:

int, optional

shuffle

Enable HDF5 shuffle filter. Default is False.

Type:

bool

fletcher32

Enable HDF5 Fletcher32 checksum. Default is False.

Type:

bool

Notes

  • All HDF5 group references are weak references to prevent lingering file references after the group is closed.

  • Metadata changes are not persisted until they are written with the write_metadata() method.

  • This is a base class inherited by more specific group types like SurveyGroup, StationGroup, RunGroup, etc.
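The weak-reference behaviour described in the first note can be illustrated with the standard library alone. This is a minimal sketch, not the mth5 implementation: FakeGroup and GroupWrapper are hypothetical stand-ins for an h5py group and its wrapper.

```python
import gc
import weakref


class FakeGroup:
    """Hypothetical stand-in for an open h5py.Group."""

    def __init__(self, name):
        self.name = name


class GroupWrapper:
    """Hypothetical wrapper that keeps only a weak reference to the group."""

    def __init__(self, group):
        self._group_ref = weakref.ref(group)

    @property
    def hdf5_group(self):
        # Returns None once the referent has been garbage collected.
        return self._group_ref()


group = FakeGroup("/Survey")
wrapper = GroupWrapper(group)
name_while_open = wrapper.hdf5_group.name

del group      # drop the only strong reference
gc.collect()   # make collection explicit on non-refcounting interpreters
group_after_close = wrapper.hdf5_group
```

Because the wrapper never holds a strong reference, it cannot keep the underlying object (and, by analogy, the file handle) alive on its own.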

Examples

Create and manage a group with metadata

>>> import h5py
>>> with h5py.File('data.h5', 'r+') as f:
...     group = f.create_group('MyGroup')
...     base_obj = BaseGroup(group)
...     print(base_obj)
...     # Set and write metadata
...     base_obj.metadata.id = 'MyGroup'
...     base_obj.write_metadata()

Access metadata and group structure

>>> print(base_obj.metadata.id)
MyGroup
>>> print(base_obj.groups_list)
['subgroup1', 'subgroup2']
>>> print(base_obj.hdf5_group.ref)  # Get HDF5 reference
<HDF5 Group Reference>
compression = None
compression_opts = None
shuffle = False
fletcher32 = False
logger
property metadata: mt_metadata.base.MetadataBase

Get metadata object with lazy loading from HDF5 attributes.

Returns:

Metadata container with all attributes and validation.

Return type:

MetadataBase

Notes

Metadata is loaded on first access and cached for subsequent accesses.
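The load-once-then-cache pattern can be sketched as follows. The class name, the attrs dictionary, and the read_count counter are illustrative assumptions; the real property reads from HDF5 attributes.

```python
class LazyMetadataHolder:
    """Hypothetical holder that reads attributes only on first access."""

    def __init__(self, attrs):
        self._attrs = attrs      # stands in for the HDF5 attributes
        self._metadata = None    # cache, filled on first access
        self.read_count = 0      # counts how often the attributes are read

    def read_metadata(self):
        self.read_count += 1
        self._metadata = dict(self._attrs)

    @property
    def metadata(self):
        # Read from the backing store only if nothing is cached yet.
        if self._metadata is None:
            self.read_metadata()
        return self._metadata


holder = LazyMetadataHolder({"id": "MyGroup"})
reads_before = holder.read_count
first = holder.metadata["id"]
second = holder.metadata["id"]
reads_after = holder.read_count
```

Two accesses trigger a single read, which is the behaviour the note describes.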

Examples

>>> meta = base_obj.metadata
>>> print(meta.id)
MyGroup
>>> print(meta.mth5_type)
Survey
property groups_list: list[str]

Get list of all subgroup and dataset names in the HDF5 group.

Returns:

Names of all subgroups and datasets.

Return type:

list of str

Examples

>>> print(base_obj.groups_list)
['Station_001', 'Station_002', 'metadata']
property dataset_options: dict[str, Any]

Get the HDF5 dataset creation options.

Returns:

Dictionary containing compression, shuffle, and checksum settings.

Return type:

dict

Examples

>>> options = base_obj.dataset_options
>>> print(options)
{'compression': 'gzip', 'compression_opts': 4,
 'shuffle': True, 'fletcher32': False}
read_metadata() → None

Read metadata from HDF5 group attributes into metadata object.

Loads all HDF5 attributes and converts them to appropriate Python types before populating the metadata object with validation.

Notes

This method is called automatically on first metadata access if metadata has not been read yet. Empty attributes are skipped with a debug message.
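A rough sketch of the conversion step, assuming the attributes arrive as a plain mapping: bytes are decoded, and empty values are skipped. The function name and the returned list of skipped keys are hypothetical; the real method logs a debug message instead of returning the skipped keys.

```python
def attrs_to_python(attrs):
    """Convert raw attribute values to plain Python values, skipping empties."""
    converted = {}
    skipped = []
    for key, value in attrs.items():
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        # Treat None and empty strings as "empty attributes".
        if value is None or (isinstance(value, str) and value == ""):
            skipped.append(key)
            continue
        converted[key] = value
    return converted, skipped


raw_attrs = {"id": b"MyGroup", "comments": "", "sample_rate": 8.0}
metadata_dict, skipped_keys = attrs_to_python(raw_attrs)
```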

Examples

Manually read metadata after file changes

>>> base_obj.read_metadata()
>>> print(base_obj.metadata.id)
MyGroup

Check what attributes were read

>>> base_obj.read_metadata()
>>> attrs = list(base_obj.metadata.to_dict().keys())
>>> print(f"Attributes: {attrs}")
Attributes: ['id', 'comments', 'provenance']
write_metadata() → None

Write metadata from object to HDF5 group attributes.

Converts metadata values to numpy-compatible types before writing to HDF5 attributes. Handles read-only mode gracefully with warnings.

Raises:
  • KeyError – If HDF5 write fails for reasons other than read-only mode.

  • ValueError – If synchronous group creation fails for reasons other than read-only mode.

Notes

  • Keys that already exist are overwritten.

  • Read-only files will log a warning instead of raising an error.

  • This method should be called after any metadata changes.

Examples

Update metadata and write to file

>>> base_obj.metadata.id = 'UpdatedGroup'
>>> base_obj.metadata.comments = 'New comments'
>>> base_obj.write_metadata()

Verify write by reloading

>>> base_obj._has_read_metadata = False
>>> base_obj.read_metadata()
>>> print(base_obj.metadata.id)
UpdatedGroup
initialize_group(**kwargs: Any) → None

Initialize group by setting attributes and writing metadata.

Convenience method that sets keyword arguments as instance attributes and writes all metadata to the HDF5 file.

Parameters:

**kwargs (dict) – Key-value pairs to set as instance attributes.

Examples

Initialize with compression settings

>>> base_obj.initialize_group(
...     compression='gzip',
...     compression_opts=4,
...     shuffle=True
... )
rename_group(new_name: str) → None

Rename the current group in the HDF5 file.

Parameters:

new_name (str) – New name for the group. Will be validated and normalized.

Raises:

MTH5Error – If renaming fails due to read-only mode or other issues.

Examples

Rename a group

>>> print(survey_obj.hdf5_group.name)
/OldSurveyName
>>> survey_obj.rename_group('NewSurveyName')
>>> print(survey_obj.hdf5_group.name)
/NewSurveyName
class mth5.groups.ReportsGroup(group: h5py.Group, **kwargs: Any)

Bases: mth5.groups.base.BaseGroup

Store report files (PDF/text) and images under /Survey/Reports.

Files are embedded into HDF5 datasets with basic metadata preserved.

Examples

>>> reports = survey.reports_group
>>> _ = reports.add_report("site_report", filename="/tmp/report.pdf")
>>> _ = reports.get_report("site_report")
add_report(report_name: str, report_metadata: dict[str, Any] | None = None, filename: str | pathlib.Path | None = None) → None

Add a report or image file to the group.

Parameters:
  • report_name (str) – Dataset name to store the file under.

  • report_metadata (dict, optional) – Additional attributes to attach to the dataset.

  • filename (str or Path, optional) – Path to the file to embed; supported types: PDF/TXT/MD and common images.

Raises:

FileNotFoundError – If filename does not exist.

Examples

>>> reports.add_report("manual", filename="docs/manual.pdf")
get_report(report_name: str) → pathlib.Path

Extract a stored report or image to the current working directory.

Parameters:

report_name (str) – Name of the stored dataset.

Returns:

Path to the materialized file on disk.

Return type:

pathlib.Path

Raises:

ValueError – If the stored file type is unsupported.

Examples

>>> path = reports.get_report("site_report")
>>> path.exists()
True
class mth5.groups.StandardsGroup(group: Any, **kwargs: Any)

Bases: mth5.groups.base.BaseGroup

Container for metadata standards documentation stored in the HDF5 file.

Stores metadata standards used throughout the survey in a standardized summary table. This enables users to understand metadata directly from the file without requiring external documentation.

The standards are organized in a summary table at /Survey/Standards/summary with columns for attribute name, type, requirements, style, units, and descriptions.

summary_table

The standards summary table with metadata definitions.

Type:

MTH5Table

Notes

Standards include definitions for:

  • Survey, Station, Run, Electric, Magnetic, Auxiliary metadata

  • Filter types: Coefficient, FIR, FrequencyResponseTable, PoleZero, TimeDelay

  • Processing standards from aurora and fourier_coefficients modules

Examples

>>> with MTH5('survey.mth5') as mth5_obj:
...     standards = mth5_obj.standards_group
...     summary = standards.summary_table
...     print(summary.array.dtype.names)
('attribute', 'type', 'required', 'style', 'units', 'description', ...)

Get information about a specific attribute:

>>> standards.get_attribute_information('survey.release_license')
survey.release_license
--------------------------
        type          : string
        required      : True
        style         : controlled vocabulary
        ...
property summary_table: mth5.tables.MTH5Table
get_attribute_information(attribute_name: str) → None

Print detailed information about a metadata attribute.

Retrieves and displays all metadata standards information for the specified attribute from the standards summary table.

Parameters:

attribute_name (str) – Name of the attribute to describe (e.g., ‘survey.release_license’).

Raises:

MTH5TableError – If the attribute is not found in the standards summary table.

Notes

Prints formatted output including:

  • Data type

  • Whether attribute is required

  • Style (e.g., controlled vocabulary)

  • Units

  • Description

  • Valid options

  • Aliases

  • Example values

  • Default value

Examples

>>> standards = mth5_obj.standards_group
>>> standards.get_attribute_information('survey.release_license')
survey.release_license
--------------------------
        type          : string
        required      : True
        style         : controlled vocabulary
        units         :
        description   : How the data can be used. The options are based on
                 Creative Commons licenses.
        options       : CC-0,CC-BY,CC-BY-SA,CC-BY-ND,CC-BY-NC-SA
        alias         :
        example       : CC-0
        default       : CC-0
summary_table_from_dict(summary_dict: dict[str, Any]) → None

Populate summary table from a dictionary of metadata standards.

Converts a flattened dictionary of metadata standards into rows in the HDF5 summary table.

Parameters:

summary_dict (dict[str, Any]) – Flattened dictionary of all metadata standards. Keys are attribute names, values are dictionaries with type, required, style, units, description, etc.

Notes

Processes dictionary values:

  • Lists are converted to comma-separated strings

  • None values become empty strings

  • Bytes are decoded to UTF-8
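The three value conversions listed above can be sketched as a single helper. The function name and the sample entry are hypothetical and only illustrate the rules; they are not taken from the mth5 source.

```python
def normalize_standard_value(value):
    """Apply the conversions listed above to a single dictionary value."""
    if value is None:
        return ""                                   # None becomes an empty string
    if isinstance(value, bytes):
        return value.decode("utf-8")                # bytes are decoded to UTF-8
    if isinstance(value, list):
        return ",".join(str(item) for item in value)  # lists become comma-separated
    return value


entry = {
    "type": "string",
    "required": True,
    "units": None,
    "options": ["CC-0", "CC-BY", "CC-BY-SA"],
    "description": b"How the data can be used.",
}
row = {key: normalize_standard_value(value) for key, value in entry.items()}
```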

Examples

>>> standards = StandardsGroup(group)
>>> metadata = summarize_metadata_standards()
>>> standards.summary_table_from_dict(metadata)
get_standards_summary(modules: list[str] | None = None) → numpy.ndarray

Get standards for specified metadata modules.

Retrieves and concatenates standards arrays from one or more metadata modules for inclusion in the standards table.

Parameters:

modules (list[str], optional) – List of module names to include (e.g., ‘timeseries’, ‘filters’). If None, uses default modules: common, timeseries, timeseries.filters, transfer_functions.tf, features, features.weights, processing, processing.fourier_coefficients, processing.aurora. Default is None.

Returns:

Concatenated numpy structured array containing standards for all requested modules with dtype matching STANDARDS_DTYPE.

Return type:

np.ndarray

Examples

>>> standards = StandardsGroup(group)
>>> ts_standards = standards.get_standards_summary(['timeseries'])
>>> print(ts_standards.shape)
(45,)

Get all default modules:

>>> all_standards = standards.get_standards_summary()
summary_table_from_array(array: numpy.ndarray) → None

Populate summary table from a numpy structured array.

Converts a structured numpy array into rows in the HDF5 summary table.

Parameters:

array (np.ndarray) – Structured numpy array with dtype matching STANDARDS_DTYPE. Each row represents one metadata attribute definition.

Notes

Iterates through all rows of the structured array and adds them sequentially to the summary table using add_row().

Examples

>>> standards = StandardsGroup(group)
>>> standards_array = standards.get_standards_summary()
>>> standards.summary_table_from_array(standards_array)
initialize_group() → None

Initialize the standards group and create the summary table.

Creates the summary table dataset in the HDF5 file and populates it with metadata standards from all default modules. Sets appropriate HDF5 attributes and writes the group metadata.

Notes

Initialization process:

  1. Creates HDF5 dataset for summary table with maximum expandable shape

  2. Applies compression if configured in dataset_options

  3. Sets HDF5 attributes: type, last_updated, reference

  4. Populates table with standards from all default modules

  5. Writes group metadata to HDF5

The summary table uses STANDARDS_DTYPE and supports up to 1000 rows.

Examples

>>> standards = mth5_obj.standards_group
>>> standards.initialize_group()
>>> summary_table = standards.summary_table
>>> print(summary_table.array.shape)
(342,)
class mth5.groups.FiltersGroup(group: h5py.Group, **kwargs)

Bases: mth5.groups.base.BaseGroup

Container for managing all filter types in MTH5 format.

This class provides a unified interface for organizing and accessing filters of different types. It automatically creates and manages subgroups for each filter type (ZPK, Coefficient, Time Delay, FAP, and FIR) within the HDF5 file structure.

Filter Types

  • zpk: Zeros, Poles, and Gain representation

  • coefficient: Coefficient filter

  • time_delay: Time delay filter

  • fap: Frequency-Amplitude-Phase (FAP) lookup table

  • fir: Finite Impulse Response filter

Parameters:
  • group (h5py.Group) – HDF5 group object for the filters container.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

zpk_group

Subgroup for zeros-poles-gain filters.

Type:

ZPKGroup

coefficient_group

Subgroup for coefficient filters.

Type:

CoefficientGroup

time_delay_group

Subgroup for time delay filters.

Type:

TimeDelayGroup

fap_group

Subgroup for frequency-amplitude-phase filters.

Type:

FAPGroup

fir_group

Subgroup for FIR filters.

Type:

FIRGroup

Examples

>>> import h5py
>>> from mth5.groups.filters import FiltersGroup
>>> with h5py.File('data.h5', 'r') as f:
...     filters = FiltersGroup(f['Filters'])
...     all_filters = filters.filter_dict
...     zpk_filter = filters.to_filter_object('my_zpk_filter')
property filter_dict: dict[str, Any]

Get a dictionary of all filters across all filter type groups.

Aggregates filters from all subgroups (ZPK, Coefficient, Time Delay, FAP, FIR) into a single dictionary for convenient access and querying.

Returns:

Dictionary mapping filter names to filter metadata dictionaries. Each entry contains filter information including type and HDF5 reference.

Return type:

dict[str, Any]

Examples

>>> filters = FiltersGroup(h5_group)
>>> all_filters = filters.filter_dict
>>> print(list(all_filters.keys()))
['my_zpk_filter', 'lowpass_coefficient', 'time_delay_1', ...]
>>> print(all_filters['my_zpk_filter']['type'])
zpk
add_filter(filter_object: object) → object

Add a filter dataset based on its type.

Automatically detects the filter type and routes the filter to the appropriate subgroup. Filter names are normalized to lowercase and forward slashes are replaced with “ per “ for consistency.
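The name normalization described above amounts to two string operations. The helper below is a sketch under that reading; its name is hypothetical.

```python
def normalize_filter_name(name):
    """Lowercase the name and replace forward slashes with ' per '."""
    return name.lower().replace("/", " per ")


normalized = normalize_filter_name("Counts/mV")
```

Replacing the slash keeps the filter name from being interpreted as a nested HDF5 path.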

Parameters:

filter_object (mt_metadata.timeseries.filters) –

An MT metadata filter object with a ‘type’ attribute. Supported types:

  • ‘zpk’, ‘poles_zeros’: Zeros-Poles-Gain filter

  • ‘coefficient’: Coefficient filter

  • ‘time_delay’, ‘time delay’: Time delay filter

  • ‘fap’, ‘frequency response table’: Frequency-Amplitude-Phase filter

  • ‘fir’: Finite Impulse Response filter

Returns:

Filter group object from the appropriate subgroup.

Return type:

object

Notes

If a filter with the same name already exists, the existing filter is returned instead of creating a duplicate.
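Routing by type and returning an existing entry instead of a duplicate can be sketched with plain dictionaries. The type aliases come from the list above; FilterStore and its payloads are hypothetical and stand in for the HDF5 subgroups.

```python
# Map each accepted type string to the subgroup that stores it.
FILTER_TYPE_TO_GROUP = {
    "zpk": "zpk",
    "poles_zeros": "zpk",
    "coefficient": "coefficient",
    "time_delay": "time_delay",
    "time delay": "time_delay",
    "fap": "fap",
    "frequency response table": "fap",
    "fir": "fir",
}


class FilterStore:
    """Hypothetical in-memory stand-in for the filter subgroups."""

    def __init__(self):
        self.groups = {name: {} for name in set(FILTER_TYPE_TO_GROUP.values())}

    def add_filter(self, name, filter_type, payload):
        group_name = FILTER_TYPE_TO_GROUP[filter_type]
        # setdefault keeps the existing entry if the name is already present.
        stored = self.groups[group_name].setdefault(name, payload)
        return group_name, stored


store = FilterStore()
first = store.add_filter("my_filter", "poles_zeros", {"gain": 1.0})
second = store.add_filter("my_filter", "zpk", {"gain": 99.0})
```

The second call returns the filter stored by the first call, so the differing payload is ignored.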

Examples

>>> from mt_metadata.timeseries.filters import PoleZeroFilter
>>> filters = FiltersGroup(h5_group)
>>> zpk_filter = PoleZeroFilter(name='my_filter')
>>> added_filter = filters.add_filter(zpk_filter)

Add coefficient filter:

>>> from mt_metadata.timeseries.filters import CoefficientFilter
>>> coeff_filter = CoefficientFilter(name='lowpass')
>>> filters.add_filter(coeff_filter)
get_filter(name: str) → h5py.Dataset | h5py.Group

Retrieve a filter dataset by name.

Looks up the filter by name in the aggregated filter dictionary and returns the HDF5 dataset or group object.

Parameters:

name (str) – Name of the filter to retrieve.

Returns:

HDF5 dataset or group object for the requested filter.

Return type:

h5py.Dataset or h5py.Group

Raises:

KeyError – If the filter name is not found in the filter dictionary.

Examples

>>> filters = FiltersGroup(h5_group)
>>> filter_dataset = filters.get_filter('my_zpk_filter')
>>> print(filter_dataset.attrs)
to_filter_object(name: str) → object

Convert a filter HDF5 dataset to an MT metadata filter object.

Retrieves the filter metadata from the HDF5 file and converts it to the appropriate MT metadata filter class based on filter type.

Parameters:

name (str) – Name of the filter to convert.

Returns:

MT metadata filter object (ZPK, Coefficient, TimeDelay, FAP, or FIR).

Return type:

object

Raises:

KeyError – If the filter name is not found in the filter dictionary.

Examples

>>> filters = FiltersGroup(h5_group)
>>> zpk_filter = filters.to_filter_object('my_zpk_filter')
>>> print(zpk_filter.name)
my_zpk_filter
>>> print(type(zpk_filter))
<class 'mt_metadata.timeseries.filters.ZPK'>

Get different filter types:

>>> coeff_filter = filters.to_filter_object('lowpass_coefficient')
>>> fap_filter = filters.to_filter_object('frequency_response_1')
class mth5.groups.EstimateDataset(dataset: h5py.Dataset, dataset_metadata: mt_metadata.transfer_functions.tf.statistical_estimate.StatisticalEstimate | None = None, write_metadata: bool = True, **kwargs: Any)

Container for statistical estimates of transfer functions.

This class holds multi-dimensional statistical estimates for transfer functions with full metadata management. Estimates are stored as HDF5 datasets with dimensions for period, output channels, and input channels.

Parameters:
  • dataset (h5py.Dataset) – HDF5 dataset containing the statistical estimate data.

  • dataset_metadata (mt_metadata.transfer_functions.tf.StatisticalEstimate, optional) – Metadata object for the estimate. If provided and write_metadata is True, the metadata will be written to the HDF5 attributes. Defaults to None.

  • write_metadata (bool, optional) – If True, write metadata to the HDF5 dataset attributes. Defaults to True.

  • **kwargs (Any) – Additional keyword arguments (reserved for future use).

hdf5_dataset

Weak reference to the HDF5 dataset.

Type:

h5py.Dataset

metadata

Metadata container for the estimate.

Type:

StatisticalEstimate

logger

Logger instance for reporting messages.

Type:

loguru.Logger

Raises:
  • MTH5Error – If dataset_metadata is provided but is not of type StatisticalEstimate or a compatible metadata class.

  • TypeError – If input data cannot be converted to numpy array or has wrong dtype/shape.

Notes

The estimate data is stored in 3D form with shape: (n_periods, n_output_channels, n_input_channels)

Metadata is automatically synchronized between the pydantic model and HDF5 attributes on initialization and after any modifications.
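The (n_periods, n_output_channels, n_input_channels) layout can be illustrated with nested lists. The channel labels and the element helper below are hypothetical; the values are chosen so that each element encodes its own indices.

```python
output_channels = ["ex", "ey"]   # hypothetical output labels
input_channels = ["hx", "hy"]    # hypothetical input labels
n_periods = 3

# estimate[p][o][i]: period index first, then output channel, then input channel.
estimate = [
    [
        [float(100 * p + 10 * o + i) for i in range(len(input_channels))]
        for o in range(len(output_channels))
    ]
    for p in range(n_periods)
]


def element(period_index, output_name, input_name):
    """Look up one value by period index and channel labels."""
    o = output_channels.index(output_name)
    i = input_channels.index(input_name)
    return estimate[period_index][o][i]


shape = (len(estimate), len(estimate[0]), len(estimate[0][0]))
value = element(2, "ey", "hx")
```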

Examples

Create an estimate dataset from an HDF5 group:

>>> import h5py
>>> import numpy as np
>>> from mt_metadata.transfer_functions.tf.statistical_estimate import StatisticalEstimate
>>> # Create HDF5 file with estimate dataset
>>> with h5py.File('estimate.h5', 'w') as f:
...     # Create dataset with shape (10 periods, 2 outputs, 2 inputs)
...     data = np.random.rand(10, 2, 2)
...     dset = f.create_dataset('estimate', data=data)
...     # Create EstimateDataset
...     est = EstimateDataset(dset, write_metadata=True)

Convert estimate to xarray and back:

>>> periods = np.logspace(-3, 3, 10)  # 10 periods from 1e-3 to 1e3 s
>>> xr_data = est.to_xarray(periods)
>>> # Relabel the channel coordinates; dimension names stay (period, output, input)
>>> new_xr = xr_data.assign_coords(output=['ex', 'ey'], input=['hx', 'hy'])
>>> est.from_xarray(new_xr)  # Load modified data back

Access estimate data in different formats:

>>> # Get numpy array
>>> np_data = est.to_numpy()
>>> print(np_data.shape)  # (10, 2, 2)
>>> # Get xarray with proper coordinates
>>> xr_data = est.to_xarray(periods)
>>> print(xr_data.dims)  # ('period', 'output', 'input')
logger
metadata
read_metadata() → None

Read metadata from HDF5 attributes into metadata container.

Reads all attributes from the HDF5 dataset and loads them into the internal metadata object for validation and access.

Return type:

None

Notes

This is automatically called during initialization if ‘mth5_type’ attribute exists in the HDF5 dataset.

Examples

Reload metadata from HDF5 after external modification:

>>> # Metadata was modified in HDF5
>>> est.read_metadata()  # Reload changes
>>> print(est.metadata.name)  # Access updated name
write_metadata() → None

Write metadata from container to HDF5 dataset attributes.

Converts the pydantic metadata model to a dictionary and writes each field as an HDF5 attribute. Values are converted to appropriate numpy types for compatibility.

Return type:

None

Notes

All existing attributes with the same names will be overwritten. This is called automatically during initialization and after metadata updates.

Examples

Save updated metadata to HDF5:

>>> est.metadata.name = "Updated Estimate"
>>> est.write_metadata()  # Persist to file
>>> # Verify write
>>> print(est.hdf5_dataset.attrs['name'])
b'Updated Estimate'
replace_dataset(new_data_array: numpy.ndarray) → None

Replace entire dataset with new data.

Resizes the HDF5 dataset if necessary and replaces all data. Converts input to numpy array if needed.

Parameters:

new_data_array (np.ndarray) – New estimate data to store. Should have shape (n_periods, n_output_channels, n_input_channels).

Return type:

None

Raises:

TypeError – If input cannot be converted to numpy array.

Notes

If new data has different shape, HDF5 dataset will be resized. This is generally safe but may fragment the HDF5 file.

Examples

Replace estimate with new data:

>>> import numpy as np
>>> new_estimate = np.random.rand(10, 2, 2)  # 10 periods, 2 channels
>>> est.replace_dataset(new_estimate)
>>> print(est.to_numpy().shape)
(10, 2, 2)

Replace with data from list (auto-converted to array):

>>> data_list = [[[1, 2], [3, 4]]] * 5  # 5 periods
>>> est.replace_dataset(data_list)
>>> est.to_numpy().shape
(5, 2, 2)
to_xarray(period: numpy.ndarray | list) → xarray.DataArray

Convert estimate to xarray DataArray.

Creates an xarray DataArray with proper coordinates for periods, output channels, and input channels. Includes metadata as attributes.

Parameters:

period (np.ndarray | list) – Period values for coordinate. Should have length equal to estimate first dimension (n_periods).

Returns:

DataArray with dimensions (period, output, input) and coordinates from metadata.

Return type:

xr.DataArray

Notes

Metadata changes made on the xarray object are not validated and are not synchronized back to HDF5 without an explicit call to from_xarray(). Data is loaded entirely into memory.

Examples

Convert to xarray with logarithmic period spacing:

>>> import numpy as np
>>> periods = np.logspace(-2, 3, 10)  # 10 periods from 0.01 to 1000
>>> xr_data = est.to_xarray(periods)
>>> print(xr_data.dims)
('period', 'output', 'input')
>>> print(xr_data.coords['period'].values)
[1.00e-02 3.59e-02 ... 1.00e+03]

Select data by period range:

>>> subset = xr_data.sel(period=slice(0.1, 100))
>>> print(subset.shape)
(6, 2, 2)
to_numpy() → numpy.ndarray

Convert estimate to numpy array.

Returns the HDF5 dataset as a numpy array. Data is loaded entirely into memory.

Returns:

3D array with shape (n_periods, n_output_channels, n_input_channels).

Return type:

np.ndarray

Notes

For large estimates, this loads all data into RAM. Consider using HDF5 slicing for memory-efficient access.

Examples

Get full estimate as numpy array:

>>> data = est.to_numpy()
>>> print(data.shape)
(10, 2, 2)
>>> print(data.dtype)
float64

Access specific period and channels:

>>> data = est.to_numpy()
>>> # Get first 5 periods, output channel 0, input channel 1
>>> subset = data[:5, 0, 1]
>>> print(subset.shape)
(5,)
from_numpy(new_estimate: numpy.ndarray) → None

Load estimate data from numpy array.

Validates dtype and shape compatibility, resizes dataset if needed, and stores the data.

Parameters:

new_estimate (np.ndarray) – Estimate data to load. Must be convertible to numpy array. Preferred shape: (n_periods, n_output_channels, n_input_channels).

Return type:

None

Raises:

TypeError – If dtype doesn’t match existing dataset or input cannot be converted to numpy array.

Notes

The array is passed as new_estimate. The dataset will be resized if its shape doesn’t match the new array.

Examples

Load estimate from numpy array:

>>> import numpy as np
>>> new_data = np.random.rand(5, 2, 2)
>>> est.from_numpy(new_data)
>>> print(est.to_numpy().shape)
(5, 2, 2)

Load an array whose dtype matches the existing dataset:

>>> float_data = np.array([[[1.0, 2.0]]], dtype=np.float64)
>>> est.from_numpy(float_data)
from_xarray(data: xarray.DataArray) → None

Load estimate data from xarray DataArray.

Updates metadata from xarray coordinates and attributes, then stores the data.

Parameters:

data (xr.DataArray) – DataArray containing estimate. Expected dimensions: (period, output, input).

Return type:

None

Notes

This will update output_channels, input_channels, name, and data_type from the xarray object. All changes are persisted to HDF5.

Examples

Load estimate from modified xarray:

>>> xr_data = est.to_xarray(periods)
>>> # Modify data and metadata
>>> modified = xr_data * 2  # Scale by 2
>>> est.from_xarray(modified)
>>> print(est.to_numpy()[0, 0, 0])  # Verify scale

Rename channels and reload:

>>> xr_data = est.to_xarray(periods)
>>> new_xr = xr_data.assign_coords(
...     output=['Ex', 'Ey'],
...     input=['Bx', 'By']
... )
>>> est.from_xarray(new_xr)
>>> print(est.metadata.output_channels)
['Ex', 'Ey']
class mth5.groups.FCChannelDataset(dataset: h5py.Dataset, dataset_metadata: mt_metadata.processing.fourier_coefficients.FCChannel | None = None, **kwargs: Any)

Container for Fourier coefficients (FC) from windowed FFT analysis.

Holds multi-dimensional Fourier coefficient data representing time-frequency analysis results. Data is uniformly sampled in both frequency (via harmonic index) and time (via uniform FFT window step size).

Parameters:
  • dataset (h5py.Dataset) – HDF5 dataset containing the Fourier coefficient data.

  • dataset_metadata (FCChannel | None, optional) – Metadata object containing FC channel properties like start time, end time, sample rates, units, and frequency method. If provided, metadata will be written to HDF5 attributes. Defaults to None.

  • **kwargs (Any) – Additional keyword arguments (reserved for future use).

hdf5_dataset

Weak reference to the HDF5 dataset.

Type:

h5py.Dataset

metadata

Metadata container for the Fourier coefficients.

Type:

FCChannel

logger

Logger instance for reporting messages.

Type:

loguru.Logger

Raises:
  • MTH5Error – If dataset_metadata is provided but is not of type FCChannel.

  • TypeError – If input data cannot be converted to numpy array or has incompatible dtype/shape.

Notes

The data array has shape (n_windows, n_frequencies) where:

  • n_windows: Number of time windows in the FFT moving window analysis

  • n_frequencies: Number of frequency bins determined by window size

Data is typically complex-valued representing Fourier coefficients. Time windows are uniformly spaced with interval 1/sample_rate_window_step. Frequencies are uniformly spaced from frequency_min to frequency_max.

Metadata includes:

  • Time period (start and end)

  • Acquisition and decimated sample rates

  • Window sample rate (delta_t within window)

  • Units

  • Frequency method (integer harmonic index calculation)

  • Component name (channel designation)
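A uniformly spaced frequency axis between frequency_min and frequency_max can be sketched in pure Python. This assumes both endpoints are included, as numpy.linspace would do; the helper name and the sample bounds are hypothetical, and the library may compute its axis differently.

```python
def uniform_axis(start, stop, count):
    """Return `count` evenly spaced values from start to stop, endpoints included."""
    if count == 1:
        return [start]
    step = (stop - start) / (count - 1)
    return [start + index * step for index in range(count)]


frequency_min, frequency_max, n_frequencies = 0.0, 4.0, 5  # assumed metadata values
frequency = uniform_axis(frequency_min, frequency_max, n_frequencies)
```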

Examples

Create an FC dataset from HDF5 group:

>>> import h5py
>>> import numpy as np
>>> from mt_metadata.processing.fourier_coefficients import FCChannel
>>> with h5py.File('fc.h5', 'w') as f:
...     # Create 2D array: 50 time windows, 256 frequencies
...     data = np.random.rand(50, 256) + 1j * np.random.rand(50, 256)
...     dset = f.create_dataset('Ex', data=data, dtype=np.complex128)
...     # Create FCChannelDataset
...     fc = FCChannelDataset(dset)

Convert to xarray and access time-frequency data:

>>> xr_data = fc.to_xarray()
>>> print(xr_data.dims)  # ('time', 'frequency')
>>> # Access the window nearest a specific time
>>> subset = xr_data.sel(time='2023-01-01T12:00:00', method='nearest')

Inspect properties:

>>> print(f"Windows: {fc.n_windows}, Frequencies: {fc.n_frequencies}")
>>> print(f"Frequency range: {fc.frequency.min():.2f}-{fc.frequency.max():.2f} Hz")
logger
metadata
read_metadata() → None

Read metadata from HDF5 attributes into metadata container.

Reads all attributes from the HDF5 dataset and loads them into the internal metadata object for validation and access.

Return type:

None

Notes

This is automatically called during initialization if ‘mth5_type’ attribute exists in the HDF5 dataset.

Examples

Reload metadata from HDF5 after external modification:

>>> # Metadata was modified in HDF5
>>> fc.read_metadata()  # Reload changes
>>> print(fc.metadata.component)  # Access updated component
write_metadata() → None

Write metadata from container to HDF5 dataset attributes.

Converts the pydantic metadata model to a dictionary and writes each field as an HDF5 attribute. Values are converted to appropriate numpy types for compatibility. Always ensures ‘mth5_type’ attribute is set to ‘FCChannel’.

Return type:

None

Notes

All existing attributes with the same names will be overwritten. This is called automatically during initialization and after metadata updates. Read-only files will silently skip writes.

Examples

Save updated metadata to HDF5:

>>> fc.metadata.component = "Ey"
>>> fc.write_metadata()  # Persist to file
>>> # Verify write
>>> print(fc.hdf5_dataset.attrs['component'])
b'Ey'
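
The conversion step described above can be sketched without mth5. This is a minimal illustration only: the helper names to_attr_value and metadata_to_attrs, the dotted key layout, and the "none" placeholder are assumptions, not the library's actual implementation.

```python
import numpy as np


def to_attr_value(value):
    """Hypothetical converter from Python values to HDF5-friendly types."""
    if value is None:
        return "none"  # assumed placeholder; HDF5 attributes cannot hold None
    if isinstance(value, (bool, np.bool_)):  # bool is checked before int
        return np.bool_(value)
    if isinstance(value, (int, np.integer)):
        return np.int64(value)
    if isinstance(value, (float, np.floating)):
        return np.float64(value)
    return str(value)


def metadata_to_attrs(meta, prefix=""):
    """Flatten a nested dict into dotted keys with converted values."""
    attrs = {}
    for key, value in meta.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            attrs.update(metadata_to_attrs(value, prefix=f"{full_key}."))
        else:
            attrs[full_key] = to_attr_value(value)
    return attrs


# Plain dict standing in for the dictionary form of the metadata model
meta = {
    "component": "Ey",
    "sample_rate_decimation_level": 4,
    "time_period": {"start": "2023-01-01T00:00:00", "end": None},
}
attrs = metadata_to_attrs(meta)
attrs["mth5_type"] = "FCChannel"  # always set, as described above
```

Each entry of attrs could then be assigned to the attrs mapping of an h5py dataset.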
property n_windows: int

Number of time windows in the FFT analysis.

Returns:

Number of time windows (first dimension of data array).

Return type:

int

Notes

This corresponds to the number of rows in the 2D spectrogram data. Each window represents a uniform time interval determined by the window step size (1/sample_rate_window_step).

Examples

>>> print(f"Time windows: {fc.n_windows}")
Time windows: 50
property time: numpy.ndarray

Time array containing the start time of each window.

Generates uniformly spaced time coordinates based on the start time, window step rate, and number of windows. Uses metadata time period to determine bounds.

Returns:

Array of datetime64 values for each window start time.

Return type:

np.ndarray

Notes

Time coordinates are generated using make_dt_coordinates, which ensures consistency between specified start/end times and the number of windows.

Examples

Access time array for time-based indexing:

>>> time_array = fc.time
>>> print(time_array.shape)  # (n_windows,)
>>> print(time_array[0])  # First window time
2023-01-01T00:00:00.000000
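
For intuition, uniformly spaced window start times can be built with NumPy alone. This is a sketch, not the make_dt_coordinates implementation; the helper name window_start_times is hypothetical, and the step is assumed to be 1/sample_rate_window_step seconds.

```python
import numpy as np


def window_start_times(start, sample_rate_window_step, n_windows):
    """Hypothetical helper: one datetime64 start time per window."""
    # The window step in nanoseconds is the inverse of the step rate in Hz
    step_ns = int(round(1e9 / sample_rate_window_step))
    offsets = np.arange(n_windows) * np.timedelta64(step_ns, "ns")
    return np.datetime64(start, "ns") + offsets


# Four windows stepping at 0.5 Hz, i.e. one window every 2 seconds
times = window_start_times("2023-01-01T00:00:00", 0.5, 4)
```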
property n_frequencies: int

Number of frequency bins in the Fourier analysis.

Returns:

Number of frequency bins (second dimension of data array).

Return type:

int

Notes

This corresponds to the number of columns in the 2D spectrogram data. Determined by the FFT window size and relates to the frequency resolution of the analysis.

Examples

>>> print(f"Frequency bins: {fc.n_frequencies}")
Frequency bins: 256
property frequency: numpy.ndarray

Frequency array from metadata frequency bounds.

Generates uniformly spaced frequency coordinates based on the metadata frequency range and number of frequency bins.

Returns:

Array of frequency values, linearly spaced from frequency_min to frequency_max.

Return type:

np.ndarray

Notes

Frequencies represent harmonic indices or actual frequency values depending on the frequency method specified in metadata. Spacing is determined by n_frequencies bins over the range.

Examples

Access frequency array for frequency-based indexing:

>>> freq_array = fc.frequency
>>> print(freq_array.shape)  # (n_frequencies,)
>>> print(f"Frequency range: {freq_array.min():.2f} to {freq_array.max():.2f} Hz")
Frequency range: 0.00 to 64.00 Hz
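
The linear spacing described above corresponds to a np.linspace call. The sketch below assumes both end points are included, which the source does not state explicitly; the variable names mirror the metadata fields but are local to this example.

```python
import numpy as np

# Assumed metadata values, matching the example output above
frequency_min, frequency_max, n_frequencies = 0.0, 64.0, 256

# n_frequencies evenly spaced values from frequency_min to frequency_max
frequency = np.linspace(frequency_min, frequency_max, n_frequencies)

# With both end points included, the bin spacing is range / (n - 1)
bin_width = (frequency_max - frequency_min) / (n_frequencies - 1)
```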
replace_dataset(new_data_array: numpy.ndarray) None[source]

Replace entire dataset with new data.

Resizes the HDF5 dataset if necessary and replaces all data. Converts input to numpy array if needed.

Parameters:

new_data_array (np.ndarray) – New FC data to store. Should have shape (n_windows, n_frequencies) and typically complex-valued.

Return type:

None

Raises:

TypeError – If input cannot be converted to numpy array.

Notes

If new data has different shape, HDF5 dataset will be resized. This is generally safe but may fragment the HDF5 file.

Examples

Replace FC data with new analysis results:

>>> import numpy as np
>>> new_fc = np.random.rand(30, 256) + 1j * np.random.rand(30, 256)
>>> fc.replace_dataset(new_fc)
>>> print(fc.to_numpy().shape)
(30, 256)

Replace with data from list (auto-converted to array):

>>> data_list = [[1+1j, 2+2j], [3+3j, 4+4j]] * 15
>>> fc.replace_dataset(data_list)
>>> fc.to_numpy().shape
(30, 2)
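
The resize behaviour can be illustrated with plain h5py. This sketch assumes the dataset was created with an unlimited maxshape (a dataset created without one cannot be resized); it uses an in-memory file and is not the replace_dataset implementation itself.

```python
import h5py
import numpy as np

# In-memory HDF5 file so nothing is written to disk
with h5py.File("resize_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset(
        "Ex",
        shape=(50, 256),
        maxshape=(None, None),  # both axes may grow or shrink
        chunks=True,
        dtype=np.complex128,
    )
    new_data = np.zeros((30, 128), dtype=np.complex128)

    # Resize only when the shapes differ, then overwrite every element
    if dset.shape != new_data.shape:
        dset.resize(new_data.shape)
    dset[...] = new_data
    final_shape = dset.shape
```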
to_xarray() xarray.DataArray[source]

Convert FC data to xarray DataArray.

Creates an xarray DataArray with proper coordinates for time and frequency. Includes metadata as attributes.

Returns:

DataArray with dimensions (time, frequency) and coordinates from metadata and computed properties.

Return type:

xr.DataArray

Notes

Metadata changes made on the xarray object are not validated and are not synchronized back to HDF5 without an explicit call to from_xarray(). Data is loaded entirely into memory.

Examples

Convert to xarray with automatic coordinates:

>>> xr_data = fc.to_xarray()
>>> print(xr_data.dims)
('time', 'frequency')
>>> print(xr_data.shape)
(50, 256)

Select data by time and frequency range:

>>> subset = xr_data.sel(
...     time=slice('2023-01-01T00:00:00', '2023-01-01T12:00:00'),
...     frequency=slice(0, 10)
... )
>>> print(subset.shape)  # Subset shape
to_numpy() numpy.ndarray[source]

Convert FC data to numpy array.

Returns the HDF5 dataset as a numpy array. Data is loaded entirely into memory.

Returns:

2D complex array with shape (n_windows, n_frequencies).

Return type:

np.ndarray

Notes

For large spectrograms, this loads all data into RAM. Consider using HDF5 slicing for memory-efficient access to subsets.

Examples

Get full FC data as numpy array:

>>> data = fc.to_numpy()
>>> print(data.shape)
(50, 256)
>>> print(data.dtype)
complex128

Access specific time window and frequency:

>>> data = fc.to_numpy()
>>> # Get first 10 windows, frequency bin 100
>>> subset = data[:10, 100]
>>> print(subset.shape)
(10,)
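
As the Notes suggest, slicing the underlying HDF5 dataset reads only the requested subset. A minimal h5py sketch follows; it uses an in-memory file and a bare dataset rather than an FCChannelDataset, so the names here are illustrative.

```python
import h5py
import numpy as np

with h5py.File("slice_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("Ex", data=np.ones((50, 256), dtype=np.complex128))

    # Only the selected elements are read into NumPy arrays
    first_windows = dset[:10, 100]  # first 10 windows, frequency bin 100
    low_band = dset[:, 0:32]        # all windows, first 32 frequency bins
```

With an FCChannelDataset, the same indexing would presumably be applied to its hdf5_dataset attribute.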
from_numpy(new_estimate: numpy.ndarray) None[source]

Load FC data from numpy array.

Validates dtype and shape compatibility, resizes dataset if needed, and stores the data.

Parameters:

new_estimate (np.ndarray) – FC data to load. Should have shape (n_windows, n_frequencies). Typically complex-valued array.

Return type:

None

Raises:

TypeError – If dtype doesn’t match existing dataset or input cannot be converted to numpy array.

Notes

The parameter is named new_estimate rather than data. Note that data is not a Python built-in, so this is a naming choice and not a language restriction. The dataset will be resized if the shape doesn’t match. Dtype compatibility is strictly enforced.

Examples

Load FC data from numpy array:

>>> import numpy as np
>>> new_data = np.random.rand(25, 128) + 1j * np.random.rand(25, 128)
>>> fc.from_numpy(new_data)
>>> print(fc.to_numpy().shape)
(25, 128)

Load with magnitude and phase separation:

>>> magnitude = np.random.rand(20, 256)
>>> phase = np.random.rand(20, 256) * 2 * np.pi
>>> fc_data = magnitude * np.exp(1j * phase)
>>> fc.from_numpy(fc_data)
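
The strict dtype check can be pictured as follows. This is a sketch under assumptions: validate_estimate is a hypothetical helper, and the real method may differ in how it converts or reports errors.

```python
import numpy as np


def validate_estimate(new_estimate, existing_dtype):
    """Hypothetical check mirroring the dtype rule described above."""
    array = np.asarray(new_estimate)
    if array.dtype != existing_dtype:
        raise TypeError(
            f"dtype {array.dtype} does not match existing dtype {existing_dtype}"
        )
    return array


stored_dtype = np.dtype(np.complex128)

# A complex array passes the check unchanged
accepted = validate_estimate(np.zeros((25, 128), dtype=complex), stored_dtype)

# A real-valued array is rejected instead of being silently cast
try:
    validate_estimate(np.zeros((25, 128)), stored_dtype)
    rejected = False
except TypeError:
    rejected = True
```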
from_xarray(data: xarray.DataArray, sample_rate_decimation_level: int | float) None[source]

Load FC data from xarray DataArray.

Updates metadata from xarray coordinates and attributes, then stores the data. Computes frequency and time parameters from the provided xarray object.

Parameters:
  • data (xr.DataArray) – DataArray containing FC data. Expected dimensions: (time, frequency).

  • sample_rate_decimation_level (int | float) – Sample rate of the decimation level (Hz). Stored in the channel metadata.

Return type:

None

Notes

This will update time_period (start/end), frequency bounds, window step rate, decimation level, component name, and units from the xarray object. All changes are persisted to HDF5.

Examples

Load FC data from modified xarray:

>>> xr_data = fc.to_xarray()
>>> # Modify data (e.g., apply filter)
>>> modified = xr_data * np.hamming(256)  # Apply frequency window
>>> fc.from_xarray(modified, sample_rate_decimation_level=4)
>>> print(fc.metadata.sample_rate_decimation_level)
4

Load with updated metadata from another analysis:

>>> import xarray as xr
>>> import pandas as pd
>>> time_coords = pd.date_range('2023-01-01', periods=30, freq='1h')
>>> freq_coords = np.arange(0, 128)
>>> new_fc = xr.DataArray(
...     data=np.random.rand(30, 128) + 1j * np.random.rand(30, 128),
...     coords={'time': time_coords, 'frequency': freq_coords},
...     dims=['time', 'frequency'],
...     name='Ey',
...     attrs={'units': 'mV/km'}
... )
>>> fc.from_xarray(new_fc, sample_rate_decimation_level=1)
>>> print(fc.metadata.component)
Ey
class mth5.groups.MasterFCGroup(group: h5py.Group, **kwargs)[source]

Bases: mth5.groups.BaseGroup

Master container for all Fourier Coefficient estimations of time series data.

This class manages multiple Fourier Coefficient processing runs, each containing different decimation levels. No metadata is required at the master level.

Hierarchy

MasterFCGroup -> FCGroup (processing runs) -> FCDecimationGroup (decimation levels) -> FCChannelDataset (individual channels)

Parameters:
  • group (h5py.Group) – HDF5 group object for the master FC container.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Examples

>>> import h5py
>>> from mth5.groups.fourier_coefficients import MasterFCGroup
>>> with h5py.File('data.h5', 'a') as f:  # append mode: add_fc_group writes
...     master = MasterFCGroup(f['FC'])
...     fc_group = master.add_fc_group('processing_run_1')
property fc_summary: pandas.DataFrame

Get a summary of all Fourier Coefficient processing runs.

Returns:

Summary information for all FC groups including names and metadata.

Return type:

pd.DataFrame

Examples

>>> master = MasterFCGroup(h5_group)
>>> summary = master.fc_summary
add_fc_group(fc_name: str, fc_metadata: mt_metadata.processing.fourier_coefficients.Decimation | None = None) FCGroup[source]

Add a Fourier Coefficient processing run group.

Parameters:
  • fc_name (str) – Name for the FC group (usually identifies the processing run).

  • fc_metadata (fc.Decimation, optional) – Metadata for the FC group. Default is None.

Returns:

Newly created Fourier Coefficient group.

Return type:

FCGroup

Examples

>>> master = MasterFCGroup(h5_group)
>>> fc_group = master.add_fc_group('processing_run_1')
>>> print(fc_group.name)
'processing_run_1'
get_fc_group(fc_name: str) FCGroup[source]

Retrieve a Fourier Coefficient group by name.

Parameters:

fc_name (str) – Name of the FC group to retrieve.

Returns:

The requested Fourier Coefficient group.

Return type:

FCGroup

Raises:

MTH5Error – If the FC group does not exist.

Examples

>>> master = MasterFCGroup(h5_group)
>>> fc_group = master.get_fc_group('processing_run_1')
remove_fc_group(fc_name: str) None[source]

Remove a Fourier Coefficient group.

Deletes the specified FC group and all associated decimation levels and channels.

Parameters:

fc_name (str) – Name of the FC group to remove.

Raises:

MTH5Error – If the FC group does not exist.

Examples

>>> master = MasterFCGroup(h5_group)
>>> master.remove_fc_group('processing_run_1')
class mth5.groups.FCGroup(group: h5py.Group, decimation_level_metadata: mt_metadata.processing.fourier_coefficients.Decimation | None = None, **kwargs)[source]

Bases: mth5.groups.BaseGroup

Manage a set of Fourier Coefficients from a single processing run.

Holds Fourier Coefficient estimations organized by decimation level. Each decimation level contains channels (Ex, Ey, Hz, etc.) with complex frequency or time-frequency representations of the input signal.

All channels must use the same calibration. Recalibration requires rerunning the Fourier Coefficient estimation.

hdf5_group

The HDF5 group containing decimation levels

Type:

h5py.Group

metadata[source]

Decimation metadata including time period, sample rates, and channels

Type:

fc.Decimation

Notes

Processing run structure:

  • Multiple decimation levels at different sample rates

  • Each decimation level contains multiple channels

  • Each channel contains complex Fourier coefficients

  • Time period and sample rates define the estimation window

Examples

>>> with h5py.File('data.h5', 'r') as f:
...     fc_run = FCGroup(f['Fourier_Coefficients/run_1'])
...     print(fc_run.decimation_level_summary)
metadata() mt_metadata.processing.fourier_coefficients.Decimation[source]

Get processing run metadata including all decimation levels.

Collects metadata from all decimation level groups and aggregates into a single Decimation metadata object.

Returns:

Metadata containing time period, sample rates, and all decimation level information.

Return type:

fc.Decimation

Notes

This getter automatically populates:

  • Time period (start and end)

  • List of all decimation levels and their metadata

  • HDF5 reference to this group

Examples

>>> fc_run = FCGroup(h5_group)
>>> metadata = fc_run.metadata
>>> print(metadata.time_period.start)
2023-01-01T00:00:00
property decimation_level_summary: pandas.DataFrame

Get a summary of all decimation levels in this processing run.

Returns information about each decimation level including sample rate, decimation level value, and time span.

Returns:

Summary with columns:

  • decimation_level: Integer decimation level identifier

  • start: ISO format start time of this decimation level

  • end: ISO format end time of this decimation level

  • hdf5_reference: Reference to the HDF5 group

Return type:

pd.DataFrame

Notes

Each row represents a single decimation level. Each level has its own sample rate and contains multiple channels of Fourier coefficients.

Examples

>>> fc_run = FCGroup(h5_group)
>>> summary = fc_run.decimation_level_summary
>>> print(summary[['decimation_level', 'start', 'end']])
   decimation_level                       start                         end
0                 0  2023-01-01T00:00:00.000000  2023-01-01T01:00:00.000000
1                 1  2023-01-01T00:00:00.000000  2023-01-01T02:00:00.000000
add_decimation_level(decimation_level_name: str, decimation_level_metadata: dict | mt_metadata.processing.fourier_coefficients.Decimation | None = None) FCDecimationGroup[source]

Add a new decimation level to the processing run.

Creates a new FCDecimationGroup for a single decimation level containing Fourier Coefficient channels at a specific sample rate.

Parameters:
  • decimation_level_name (str) – Identifier for the decimation level.

  • decimation_level_metadata (dict | fc.Decimation, optional) – Metadata for the decimation level. Can be a dictionary or fc.Decimation object. Default is None.

Returns:

Newly created decimation level group.

Return type:

FCDecimationGroup

Examples

>>> fc_run = FCGroup(h5_group)
>>> from mt_metadata.processing import fourier_coefficients as fc
>>> metadata = fc.Decimation(decimation_level=0)
>>> decimation = fc_run.add_decimation_level('0', metadata)
get_decimation_level(decimation_level_name: str) FCDecimationGroup[source]

Retrieve a decimation level by name.

Parameters:

decimation_level_name (str) – Name or identifier of the decimation level.

Returns:

The requested decimation level group.

Return type:

FCDecimationGroup

Examples

>>> fc_run = FCGroup(h5_group)
>>> decimation = fc_run.get_decimation_level('0')
>>> channels = decimation.groups_list
remove_decimation_level(decimation_level_name: str) None[source]

Remove a decimation level from the processing run.

Deletes the HDF5 group and all its channels (FCChannelDataset objects).

Parameters:

decimation_level_name (str) – Name or identifier of the decimation level to remove.

Notes

This removes the entire decimation level and all channels within it. To remove individual channels, use FCDecimationGroup.remove_channel() instead.

Examples

>>> fc_run = FCGroup(h5_group)
>>> fc_run.remove_decimation_level('0')
update_metadata() None[source]

Update processing run metadata from all decimation levels.

Aggregates time period information from all decimation levels and writes updated metadata to HDF5.

Notes

Collects:

  • Earliest start time across all decimation levels

  • Latest end time across all decimation levels

Should be called after adding or removing decimation levels.

Examples

>>> fc_run = FCGroup(h5_group)
>>> fc_run.add_decimation_level('0', metadata0)
>>> fc_run.add_decimation_level('1', metadata1)
>>> fc_run.update_metadata()
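
The aggregation rule in the Notes (earliest start, latest end) is simple enough to sketch with the standard library. The dictionaries below are stand-ins for decimation level metadata, and aggregate_time_period is a hypothetical name.

```python
from datetime import datetime


def aggregate_time_period(levels):
    """Return (earliest start, latest end) as ISO strings."""
    starts = [datetime.fromisoformat(level["start"]) for level in levels]
    ends = [datetime.fromisoformat(level["end"]) for level in levels]
    return min(starts).isoformat(), max(ends).isoformat()


# Stand-in time periods for two decimation levels
levels = [
    {"start": "2023-01-01T00:00:00", "end": "2023-01-01T01:00:00"},
    {"start": "2023-01-01T00:00:00", "end": "2023-01-01T02:00:00"},
]
run_start, run_end = aggregate_time_period(levels)
```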
supports_aurora_processing_config(processing_config: aurora.config.metadata.processing.Processing, remote: bool) bool[source]

Check if all required decimation levels exist for Aurora processing.

Performs an all-or-nothing check: returns True only if every decimation level required by the processing config is available in this FCGroup.

Uses sequential logic to short-circuit: if any required decimation level is missing, immediately returns False without checking remaining levels.

Parameters:
  • processing_config (aurora.config.metadata.processing.Processing) – Aurora processing configuration containing required decimation levels.

  • remote (bool) – Whether to check for remote processing compatibility.

Returns:

True if all required decimation levels are available and consistent, False otherwise.

Return type:

bool

Notes

Validation logic:

  1. Extract list of decimation levels from processing config

  2. Iterate through each required level in sequence

  3. For each level, find a matching FCDecimation in this group

  4. Check consistency using Aurora’s validation method

  5. If any level is missing or inconsistent, return False immediately

  6. Return True only if all levels pass validation

Examples

>>> fc_run = FCGroup(h5_group)
>>> config = aurora.config.metadata.processing.Processing(...)
>>> if fc_run.supports_aurora_processing_config(config, remote=False):
...     # All decimation levels are available
...     pass
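
The all-or-nothing logic in the Notes can be sketched in plain Python. Everything here is hypothetical: LevelSpec stands in for a decimation level's settings, and same_sample_rate stands in for Aurora's consistency check, whose real criteria are not described in this section.

```python
from dataclasses import dataclass


@dataclass
class LevelSpec:
    """Hypothetical stand-in for one decimation level's settings."""

    decimation_level: int
    sample_rate: float


def supports_config(required, available, is_consistent):
    """Return True only if every required level has a consistent match."""
    for req in required:
        match = next(
            (lvl for lvl in available if lvl.decimation_level == req.decimation_level),
            None,
        )
        if match is None or not is_consistent(match, req):
            return False  # short-circuit on the first failure
    return True


def same_sample_rate(stored, requested):
    # Assumed consistency rule, used for this sketch only
    return stored.sample_rate == requested.sample_rate


stored = [LevelSpec(0, 1.0), LevelSpec(1, 0.25)]
full_match = supports_config(
    [LevelSpec(0, 1.0), LevelSpec(1, 0.25)], stored, same_sample_rate
)
missing_level = supports_config(
    [LevelSpec(0, 1.0), LevelSpec(2, 0.0625)], stored, same_sample_rate
)
```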
class mth5.groups.FCDecimationGroup(group: h5py.Group, decimation_level_metadata: mt_metadata.processing.fourier_coefficients.Decimation | None = None, **kwargs)[source]

Bases: mth5.groups.BaseGroup

Container for a single decimation level of Fourier Coefficient data.

This class manages all channels at a specific decimation level, assuming uniform sampling in both frequency and time domains.

Data Assumptions

  1. Data uniformly sampled in frequency domain

  2. Data uniformly sampled in time domain

  3. FFT moving window has uniform step size

start_time

Start time of the decimation level

Type:

datetime

end_time

End time of the decimation level

Type:

datetime

channels

List of channel names in this decimation level

Type:

list

decimation_factor

Factor by which data was decimated

Type:

int

decimation_level

Level index in decimation hierarchy

Type:

int

sample_rate

Sample rate after decimation (Hz)

Type:

float

method

Method used (FFT, wavelet, etc.)

Type:

str

window

Window parameters (length, overlap, type, sample rate)

Type:

dict

Parameters:
  • group (h5py.Group) – HDF5 group object for this decimation level.

  • decimation_level_metadata (fc.Decimation, optional) – Metadata for the decimation level. Default is None.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Examples

>>> decimation = FCDecimationGroup(h5_group, decimation_level_metadata=metadata)
>>> channel = decimation.add_channel('Ex', fc_data=fc_array)
metadata()[source]

Override the metadata getter to include channel information for this decimation level.

property channel_summary: pandas.DataFrame

Get a summary of all channels in this decimation level.

Returns a pandas DataFrame with detailed information about each Fourier Coefficient channel including time ranges, dimensions, and sampling rates.

Returns:

DataFrame with columns:

  • component (str) – Channel component name (e.g., ‘Ex’, ‘Hy’)

  • start (datetime64[ns]) – Start time of the channel data

  • end (datetime64[ns]) – End time of the channel data

  • n_frequency (int64) – Number of frequency bins

  • n_windows (int64) – Number of time windows

  • sample_rate_decimation_level (float64) – Decimation level sample rate (Hz)

  • sample_rate_window_step (float64) – Sample rate of window stepping (Hz)

  • units (str) – Physical units of the data

  • hdf5_reference (h5py.ref_dtype) – HDF5 reference to the channel dataset

Return type:

pd.DataFrame

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> summary = decimation.channel_summary
>>> print(summary[['component', 'n_frequency', 'n_windows']])
from_dataframe(df: pandas.DataFrame, channel_key: str, time_key: str = 'time', frequency_key: str = 'frequency') None[source]

Load Fourier Coefficient data from a pandas DataFrame.

Assumes the channel_key column contains complex coefficient values organized with time and frequency dimensions.

Parameters:
  • df (pd.DataFrame) – Input DataFrame containing the coefficient data.

  • channel_key (str) – Name of the column containing coefficient values.

  • time_key (str, default='time') – Name of the time coordinate column.

  • frequency_key (str, default='frequency') – Name of the frequency coordinate column.

Raises:

TypeError – If df is not a pandas DataFrame.

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> decimation.from_dataframe(df, channel_key='Ex', time_key='time')
from_xarray(data_array: xarray.Dataset | xarray.DataArray, sample_rate_decimation_level: float) None[source]

Load Fourier Coefficient data from an xarray DataArray or Dataset.

Automatically extracts metadata (time, frequency, units) from the xarray object and creates appropriate FCChannelDataset instances for each variable or the single DataArray.

Parameters:
  • data_array (xr.DataArray or xr.Dataset) – Input xarray object with ‘time’ and ‘frequency’ coordinates and dimensions [‘time’, ‘frequency’] (or transposed variant).

  • sample_rate_decimation_level (float) – Sample rate of the decimation level (Hz).

Raises:

TypeError – If data_array is not an xarray Dataset or DataArray.

Notes

Automatically handles both (time, frequency) and (frequency, time) dimension ordering. Units are extracted from xarray attributes if available.

Examples

>>> import xarray as xr
>>> import numpy as np
>>> decimation = FCDecimationGroup(h5_group)

Create sample xarray data:

>>> times = np.arange('2023-01-01', '2023-01-02', dtype='datetime64[s]')
>>> freqs = np.linspace(0.01, 100, 256)
>>> data_array = np.random.randn(len(times), len(freqs)) + \
...              1j * np.random.randn(len(times), len(freqs))
>>> xr_data = xr.DataArray(
...     data_array,
...     dims=['time', 'frequency'],
...     coords={'time': times, 'frequency': freqs},
...     name='Ex'
... )

Load into decimation group:

>>> decimation.from_xarray(xr_data, sample_rate_decimation_level=0.5)
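
Handling both dimension orders amounts to transposing by dimension name. With xarray this is what DataArray.transpose('time', 'frequency') does; the NumPy sketch below shows the same idea with an explicit dims tuple. The helper name to_time_frequency is hypothetical and is not part of mth5.

```python
import numpy as np


def to_time_frequency(values, dims):
    """Return values ordered as (time, frequency), given their dims."""
    dims = tuple(dims)
    if dims == ("time", "frequency"):
        return values
    if dims == ("frequency", "time"):
        return values.T  # swap axes so that time comes first
    raise ValueError(f"unexpected dimensions: {dims}")


# 256 frequencies by 100 windows, i.e. the transposed layout
freq_major = np.zeros((256, 100), dtype=complex)
time_major = to_time_frequency(freq_major, ("frequency", "time"))
```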
to_xarray(channels: list[str] | None = None) xarray.Dataset[source]

Create an xarray Dataset from Fourier Coefficient channels.

If no channels are specified, all channels in the decimation level are included. Each channel becomes a data variable in the resulting Dataset.

Parameters:

channels (list[str], optional) – List of channel names to include. If None, all channels are used. Default is None.

Returns:

xarray Dataset with channels as data variables and ‘time’ and ‘frequency’ as shared coordinates.

Return type:

xr.Dataset

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> xr_data = decimation.to_xarray()
>>> print(xr_data.data_vars)
Data variables:
    Ex  (time, frequency) complex128
    Ey  (time, frequency) complex128

Get specific channels:

>>> subset = decimation.to_xarray(channels=['Ex', 'Ey'])
from_numpy_array(nd_array: numpy.ndarray, ch_name: str | list[str]) None[source]

Load Fourier Coefficient data from a numpy array.

Assumes array shape is either (n_frequencies, n_windows) for a single channel or (n_channels, n_frequencies, n_windows) for multiple channels.

Parameters:
  • nd_array (np.ndarray) – Input numpy array containing coefficient data.

  • ch_name (str or list[str]) – Channel name (for 2D array) or list of channel names (for 3D array).

Raises:
  • TypeError – If nd_array is not a numpy ndarray.

  • ValueError – If array shape is not (n_frequencies, n_windows) or (n_channels, n_frequencies, n_windows).

Examples

>>> decimation = FCDecimationGroup(h5_group)

Load single channel:

>>> data_2d = np.random.randn(256, 100) + 1j * np.random.randn(256, 100)
>>> decimation.from_numpy_array(data_2d, ch_name='Ex')

Load multiple channels:

>>> data_3d = np.random.randn(2, 256, 100) + 1j * np.random.randn(2, 256, 100)
>>> decimation.from_numpy_array(data_3d, ch_name=['Ex', 'Ey'])
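
The 2D versus 3D handling can be pictured as a small dispatch on the number of array dimensions. This is a sketch only: split_channels is a hypothetical helper that returns a plain dictionary, whereas the real method stores each channel in HDF5.

```python
import numpy as np


def split_channels(nd_array, ch_name):
    """Map channel names to 2D arrays, following the shape rules above."""
    if not isinstance(nd_array, np.ndarray):
        raise TypeError("nd_array must be a numpy ndarray")
    if nd_array.ndim == 2:
        # Single channel: ch_name is one string
        return {ch_name: nd_array}
    if nd_array.ndim == 3:
        # Multiple channels: one name per entry along the first axis
        if isinstance(ch_name, str) or len(ch_name) != nd_array.shape[0]:
            raise ValueError("need one channel name per entry on axis 0")
        return {name: nd_array[i] for i, name in enumerate(ch_name)}
    raise ValueError("array must be 2D or 3D")


single = split_channels(np.zeros((256, 100), dtype=complex), "Ex")
multi = split_channels(np.zeros((2, 256, 100), dtype=complex), ["Ex", "Ey"])
```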
add_channel(fc_name: str, fc_data: numpy.ndarray | None = None, fc_metadata: mt_metadata.processing.fourier_coefficients.FCChannel | None = None, max_shape: tuple = (None, None), chunks: bool = True, dtype: type = complex, **kwargs) mth5.groups.FCChannelDataset[source]

Add a Fourier Coefficient channel to the decimation level.

Creates a new FCChannelDataset for a single channel at a single decimation level. Input data can be provided as numpy array or created empty.

Parameters:
  • fc_name (str) – Name for the Fourier Coefficient channel (usually component name like ‘Ex’).

  • fc_data (np.ndarray, optional) – Input data with shape (n_windows, n_frequencies), matching the (time, frequency) layout described in the Notes. Default is None (creates empty).

  • fc_metadata (fc.FCChannel, optional) – Metadata for the channel. Default is None.

  • max_shape (tuple, default=(None, None)) – Maximum shape for HDF5 dataset dimensions (expandable if None).

  • chunks (bool, default=True) – Whether to use HDF5 chunking.

  • dtype (type, default=complex) – Data type for the dataset.

  • **kwargs – Additional keyword arguments for HDF5 dataset creation.

Returns:

Newly created FCChannelDataset object.

Return type:

FCChannelDataset

Raises:

TypeError – If fc_data type is not supported.

Notes

Data layout assumes (time, frequency) organization:

  • time index: window start times

  • frequency index: harmonic indices or float values

  • data: complex Fourier coefficients

If a channel with the same name already exists, the existing channel is returned instead of creating a duplicate.

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> from mt_metadata.processing import fourier_coefficients as fc
>>> metadata = fc.FCChannel(component='Ex')

Create from numpy array:

>>> fc_data = np.random.randn(100, 256) + 1j * np.random.randn(100, 256)
>>> channel = decimation.add_channel('Ex', fc_data=fc_data, fc_metadata=metadata)

Create empty channel (expandable):

>>> channel = decimation.add_channel('Ex', fc_metadata=metadata)
get_channel(fc_name: str) mth5.groups.FCChannelDataset[source]

Retrieve a Fourier Coefficient channel by name.

Parameters:

fc_name (str) – Name of the Fourier Coefficient channel to retrieve.

Returns:

The requested Fourier Coefficient channel dataset.

Return type:

FCChannelDataset

Raises:
  • KeyError – If the channel does not exist in this decimation level.

  • MTH5Error – If unable to retrieve the channel from HDF5.

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> channel = decimation.get_channel('Ex')
>>> print(channel.shape)
(100, 256)
remove_channel(fc_name: str) None[source]

Remove a Fourier Coefficient channel from the decimation level.

Deletes the HDF5 dataset associated with the channel.

Parameters:

fc_name (str) – Name of the Fourier Coefficient channel to remove.

Raises:

MTH5Error – If the channel does not exist.

Notes

Deleting a channel does not reduce the HDF5 file size; it simply removes the reference to the data. To truly reduce file size, copy the desired data to a new file.

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> decimation.remove_channel('Ex')
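
Copying the remaining data to a new file, as the Notes recommend, can be done with plain h5py. The sketch below works on bare groups and datasets in a temporary directory; the file names and group layout are illustrative and this is not an mth5 API.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "original.h5")
    dst_path = os.path.join(tmp, "repacked.h5")

    # Build a file with two channels in one decimation level
    with h5py.File(src_path, "w") as f:
        level = f.create_group("0")
        level.create_dataset("Ex", data=np.zeros((200, 256), dtype=complex))
        level.create_dataset("Ey", data=np.zeros((200, 256), dtype=complex))

    # Deleting removes the link; the freed space stays inside the file
    with h5py.File(src_path, "a") as f:
        del f["0/Ex"]
    size_after_delete = os.path.getsize(src_path)

    # Copy what is left into a fresh file to actually reclaim the space
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        for name in src:
            src.copy(name, dst)
        remaining = sorted(dst["0"].keys())
    size_repacked = os.path.getsize(dst_path)
```

The HDF5 command-line tool h5repack achieves the same result without custom code.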
update_metadata() None[source]

Update decimation level metadata from all channels.

Aggregates metadata from all FC channels in the decimation level including time period, sample rates, and window step information. Updates the internal metadata object and writes to HDF5.

Notes

Collects the following information from channels:

  • Time period start/end from channel data

  • Sample rate decimation level

  • Sample rate window step

Should be called after adding or modifying channels to keep metadata synchronized.

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> decimation.add_channel('Ex', fc_data=data_ex)
>>> decimation.add_channel('Ey', fc_data=data_ey)
>>> decimation.update_metadata()
add_feature(feature_name: str, feature_data: numpy.ndarray | None = None, feature_metadata: dict | None = None, max_shape: tuple = (None, None, None), chunks: bool = True, **kwargs) None[source]

Add a feature dataset to the decimation level.

Creates a new dataset for auxiliary features or derived quantities related to Fourier Coefficients (e.g., SNR, coherency, power, etc.).

Parameters:
  • feature_name (str) – Name for the feature dataset.

  • feature_data (np.ndarray, optional) – Input data for the feature. Default is None (creates empty).

  • feature_metadata (dict, optional) – Metadata dictionary for the feature. Default is None.

  • max_shape (tuple, default=(None, None, None)) – Maximum shape for HDF5 dataset dimensions (expandable if None).

  • chunks (bool, default=True) – Whether to use HDF5 chunking.

  • **kwargs – Additional keyword arguments for HDF5 dataset creation.

Notes

Feature types may include:

  • Power: Total power in Fourier coefficients

  • SNR: Signal-to-noise ratio

  • Coherency: Cross-component coherence

  • Weights: Channel-specific weights

  • Flags: Data quality or processing flags

Examples

>>> decimation = FCDecimationGroup(h5_group)
>>> snr_data = np.random.randn(100, 256)
>>> decimation.add_feature('snr', feature_data=snr_data)

Or create empty feature for later population:

>>> decimation.add_feature('power_Ex')
class mth5.groups.TransferFunctionsGroup(group: Any, **kwargs: Any)[source]

Bases: mth5.groups.BaseGroup

Container for transfer functions under a station.

Each child group is a single transfer function estimation managed by TransferFunctionGroup.

Examples

>>> from mth5 import mth5
>>> m5 = mth5.MTH5()
>>> _ = m5.open_mth5("/tmp/example.mth5", mode="a")
>>> station = m5.stations_group.add_station("mt01")
>>> tf_group = station.transfer_functions_group
>>> tf_group.groups_list
[]
tf_summary(as_dataframe: bool = True) pandas.DataFrame | numpy.ndarray[source]

Summarize transfer functions stored for the station.

Parameters:

as_dataframe (bool, default True) – If True return a pandas DataFrame, otherwise a NumPy structured array.

Returns:

Summary rows including station reference, location, and TF metadata.

Return type:

pandas.DataFrame or numpy.ndarray

Examples

>>> summary = tf_group.tf_summary()
>>> summary.columns[:4].tolist()
['station_hdf5_reference', 'station', 'latitude', 'longitude']
add_transfer_function(name: str, tf_object: mt_metadata.transfer_functions.core.TF | None = None) TransferFunctionGroup[source]

Add a transfer function group under this station.

Parameters:
  • name (str) – Transfer function identifier.

  • tf_object (TF, optional) – Transfer function instance to seed metadata and datasets.

Returns:

Wrapper for the created or existing transfer function.

Return type:

TransferFunctionGroup

Examples

>>> tf_group = station.transfer_functions_group
>>> _ = tf_group.add_transfer_function("mt01_4096")
get_transfer_function(tf_id: str) TransferFunctionGroup[source]

Return an existing transfer function by id.

Parameters:

tf_id (str) – Name of the transfer function.

Returns:

Wrapper for the requested transfer function.

Return type:

TransferFunctionGroup

Raises:

MTH5Error – If the transfer function does not exist.

Examples

>>> existing = station.transfer_functions_group.get_transfer_function("mt01_4096")
>>> existing.name
'mt01_4096'
remove_transfer_function(tf_id: str) None[source]

Delete a transfer function reference from the station.

Parameters:

tf_id (str) – Transfer function name.

Notes

HDF5 deletion removes the reference only; storage is not reclaimed.

Examples

>>> tf_group.remove_transfer_function("mt01_4096")
get_tf_object(tf_id: str) mt_metadata.transfer_functions.core.TF[source]

Return a populated mt_metadata.transfer_functions.core.TF.

Parameters:

tf_id (str) – Transfer function name to convert.

Returns:

Transfer function populated with metadata and estimates.

Return type:

mt_metadata.transfer_functions.core.TF

Examples

>>> tf_obj = tf_group.get_tf_object("mt01_4096")
class mth5.groups.TransferFunctionGroup(group: Any, **kwargs: Any)[source]

Bases: mth5.groups.BaseGroup

Wrapper for a single transfer function estimation.

has_estimate(estimate: str) bool[source]

Return True if an estimate exists and is populated.

property period: numpy.ndarray | None

Return period array stored in period dataset, if present.

add_statistical_estimate(estimate_name: str, estimate_data: numpy.ndarray | xarray.DataArray | None = None, estimate_metadata: mt_metadata.transfer_functions.tf.statistical_estimate.StatisticalEstimate | None = None, max_shape: tuple[int | None, int | None, int | None] = (None, None, None), chunks: bool = True, **kwargs: Any) mth5.groups.EstimateDataset[source]

Add a statistical estimate dataset.

Parameters:
  • estimate_name (str) – Dataset name.

  • estimate_data (numpy.ndarray or xarray.DataArray, optional) – Estimate values; if None a placeholder array is created.

  • estimate_metadata (StatisticalEstimate, optional) – Metadata describing the estimate.

  • max_shape (tuple of int or None, default (None, None, None)) – Maximum shape for resizable datasets.

  • chunks (bool, default True) – Chunking flag forwarded to HDF5 dataset creation.

Returns:

Wrapper combining dataset and metadata.

Return type:

EstimateDataset

Raises:

TypeError – If estimate_data is not array-like.

Examples

>>> est = tf_group.add_statistical_estimate("transfer_function")
>>> isinstance(est, EstimateDataset)
True
get_estimate(estimate_name: str) mth5.groups.EstimateDataset[source]

Return a statistical estimate dataset by name.

remove_estimate(estimate_name: str) None[source]

Remove a statistical estimate dataset reference.

to_tf_object() mt_metadata.transfer_functions.core.TF[source]

Convert this group into a populated TF object.

Returns:

TF instance with survey, station, runs, channels, period, and estimate datasets applied.

Return type:

mt_metadata.transfer_functions.core.TF

Raises:

ValueError – If no period dataset is present.

Examples

>>> tf_obj = tf_group.to_tf_object()
from_tf_object(tf_obj: mt_metadata.transfer_functions.core.TF, update_metadata: bool = True) None[source]

Populate datasets from a TF object.

Parameters:
  • tf_obj (TF) – Transfer function object containing estimates and metadata.

  • update_metadata (bool, default True) – If True write transfer function metadata to HDF5.

Raises:

ValueError – If tf_obj is not a TF instance.

Examples

>>> tf_group.from_tf_object(tf_obj)
class mth5.groups.ElectricDataset(group: h5py.Dataset, **kwargs: Any)[source]

Bases: ChannelDataset

Specialized container for electric field channel data.

Inherits all functionality from ChannelDataset with electric field specific metadata handling.

Parameters:
  • group (h5py.Dataset) – HDF5 dataset containing electric field data.

  • **kwargs (dict) – Additional keyword arguments passed to ChannelDataset.

Examples

>>> ex_dataset = run_group.get_channel('Ex')
>>> print(type(ex_dataset))
<class 'mth5.groups.channel_dataset.ElectricDataset'>
>>> print(ex_dataset.metadata.type)
'electric'
>>> print(ex_dataset.metadata.units)
'mV/km'
class mth5.groups.MagneticDataset(group: h5py.Dataset, **kwargs: Any)[source]

Bases: ChannelDataset

Specialized container for magnetic field channel data.

Inherits all functionality from ChannelDataset with magnetic field specific metadata handling.

Parameters:
  • group (h5py.Dataset) – HDF5 dataset containing magnetic field data.

  • **kwargs (dict) – Additional keyword arguments passed to ChannelDataset.

Examples

>>> hx_dataset = run_group.get_channel('Hx')
>>> print(type(hx_dataset))
<class 'mth5.groups.channel_dataset.MagneticDataset'>
>>> print(hx_dataset.metadata.type)
'magnetic'
>>> print(hx_dataset.metadata.units)
'nT'
class mth5.groups.ChannelDataset(dataset: h5py.Dataset | None, dataset_metadata: mt_metadata.base.MetadataBase | None = None, write_metadata: bool = True, **kwargs: Any)[source]

A container for channel time series data stored in HDF5 format.

This class provides a flexible interface to work with magnetotelluric channel data, allowing conversion to various formats (xarray, pandas, numpy) while maintaining metadata integrity.

Parameters:
  • dataset (h5py.Dataset or None) – HDF5 dataset object containing the channel time series data.

  • dataset_metadata (MetadataBase, optional) – Metadata container for Electric, Magnetic, or Auxiliary channel types. Default is None.

  • write_metadata (bool, optional) – Whether to write metadata to the HDF5 dataset on initialization. Default is True.

  • **kwargs (dict) – Additional keyword arguments to set as instance attributes.

hdf5_dataset

Weak reference to the underlying HDF5 dataset.

Type:

h5py.Dataset

metadata

Channel metadata object with validation.

Type:

MetadataBase

logger

Logger instance for tracking operations.

Type:

loguru.Logger

Raises:

MTH5Error – If the dataset is not of the correct type or metadata validation fails.

See also

ElectricDataset

Specialized container for electric field channels.

MagneticDataset

Specialized container for magnetic field channels.

AuxiliaryDataset

Specialized container for auxiliary channels.

Examples

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
  component:        Ex
  data type:        electric
  data format:      float32
  data shape:       (4096,)
  start:            1980-01-01T00:00:00+00:00
  end:              1980-01-01T00:00:01+00:00
  sample rate:      4096

Access time series data

>>> ts_data = channel.to_channel_ts()
>>> print(f"Mean: {ts_data.ts.mean():.2f}, Std: {ts_data.ts.std():.2f}")

Convert to xarray for time-based indexing

>>> xr_data = channel.to_xarray()
>>> subset = xr_data.sel(time=slice('1980-01-01T00:00:00', '1980-01-01T00:00:10'))
logger
metadata
property run_metadata: mt_metadata.timeseries.Run

Get the run-level metadata containing this channel.

Returns:

Run metadata object with channel information included.

Return type:

metadata.Run

Examples

>>> run_meta = channel.run_metadata
>>> print(run_meta.id)
'MT001a'
>>> print(run_meta.channels_recorded_electric)
['Ex', 'Ey']
property station_metadata: mt_metadata.timeseries.Station

Get the station-level metadata containing this channel.

Returns:

Station metadata object with run and channel information.

Return type:

metadata.Station

Examples

>>> station_meta = channel.station_metadata
>>> print(f"{station_meta.id}: {station_meta.location.latitude}, {station_meta.location.longitude}")
'MT001: 40.5, -112.3'
property survey_metadata: mt_metadata.timeseries.Survey

Get the survey-level metadata containing this channel.

Returns:

Complete survey metadata hierarchy including this channel.

Return type:

metadata.Survey

Examples

>>> survey_meta = channel.survey_metadata
>>> print(survey_meta.id)
'MT Survey 2023'
>>> print(f"Stations: {len(survey_meta.stations)}")
Stations: 15
property survey_id: str

Get the survey identifier.

Returns:

Survey ID string.

Return type:

str

Examples

>>> print(channel.survey_id)
'MT_Survey_2023'
property channel_response: mt_metadata.timeseries.filters.ChannelResponse

Get the complete channel response from applied filters.

Constructs a ChannelResponse object by retrieving all filters referenced in the channel metadata from the survey’s Filters group.

Returns:

Channel response object containing all applied filters in sequence.

Return type:

ChannelResponse

Notes

Filters are applied in the order specified by their sequence_number. Filter names are normalized by replacing ‘/’ with ‘ per ‘ and converting to lowercase.

Examples

>>> response = channel.channel_response
>>> print(f"Number of filters: {len(response.filters_list)}")
Number of filters: 3
>>> for filt in response.filters_list:
...     print(f"{filt.name}: {filt.type}")
zpk: zpk
coefficient: coefficient
time delay: time_delay
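
The ordering and name normalization described in the notes can be sketched without mth5. The `FilterRef` class and both helper functions below are hypothetical stand-ins for illustration; they are not part of the mth5 or mt_metadata API.

```python
from dataclasses import dataclass


@dataclass
class FilterRef:
    """Hypothetical stand-in for a filter reference in channel metadata."""
    name: str
    sequence_number: int


def normalize_filter_name(name: str) -> str:
    # Replace '/' with ' per ' and lowercase, as described in the notes.
    return name.replace("/", " per ").lower()


def ordered_filter_names(filters):
    # Sort by sequence_number so filters come out in the stated order.
    ordered = sorted(filters, key=lambda f: f.sequence_number)
    return [normalize_filter_name(f.name) for f in ordered]


refs = [
    FilterRef("Time Delay", 3),
    FilterRef("mV/km", 2),
    FilterRef("ZPK", 1),
]
names = ordered_filter_names(refs)
print(names)
```

The sketch only covers name lookup order; the real property also fetches each filter object from the Filters group.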
property start: mt_metadata.common.mttime.MTime

Get the start time of the channel data.

Returns:

Start time from metadata.time_period.start.

Return type:

MTime

Examples

>>> print(channel.start)
1980-01-01T00:00:00+00:00
>>> print(channel.start.iso_str)
'1980-01-01T00:00:00.000000+00:00'
property end: mt_metadata.common.mttime.MTime

Calculate the end time based on start time, sample rate, and number of samples.

Returns:

Calculated end time of the data.

Return type:

MTime

Notes

End time is calculated as start + (n_samples - 1) / sample_rate. Because the first sample falls at the start time, subtracting 1 ensures the last sample falls exactly at the end time.

Examples

>>> print(f"Duration: {channel.end - channel.start} seconds")
Duration: 3600.0 seconds
>>> print(channel.end.iso_str)
'1980-01-01T01:00:00.000000+00:00'
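
As a rough illustration of the end-time formula in the notes, the arithmetic can be reproduced with the standard library alone. The `end_time` helper is a hypothetical name used only for this sketch, and the sample count and rate are assumed values.

```python
from datetime import datetime, timedelta, timezone


def end_time(start: datetime, n_samples: int, sample_rate: float) -> datetime:
    # start + (n_samples - 1) / sample_rate, per the notes above
    return start + timedelta(seconds=(n_samples - 1) / sample_rate)


start = datetime(1980, 1, 1, tzinfo=timezone.utc)
end = end_time(start, n_samples=4096, sample_rate=256.0)

# 4096 samples at 256 Hz span 4095 sample intervals, i.e. just under 16 s.
print((end - start).total_seconds())
```

With a single sample the end time equals the start time, which is the case the -1 term is there to handle.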
property sample_rate: float

Get the sample rate in samples per second.

Returns:

Sample rate in Hz.

Return type:

float

Examples

>>> print(f"Sample rate: {channel.sample_rate} Hz")
Sample rate: 256.0 Hz
property n_samples: int

Get the total number of samples in the dataset.

Returns:

Number of data points in the time series.

Return type:

int

Examples

>>> print(f"Total samples: {channel.n_samples:,}")
Total samples: 921,600
>>> duration = channel.n_samples / channel.sample_rate
>>> print(f"Duration: {duration/3600:.1f} hours")
Duration: 1.0 hours
property time_index: pandas.DatetimeIndex

Create a time index for the dataset based on metadata.

Returns:

Pandas datetime index spanning the entire dataset.

Return type:

pd.DatetimeIndex

Notes

The time index is useful for time-based queries and slicing operations. It is generated dynamically from start time, sample rate, and number of samples.

Examples

>>> time_idx = channel.time_index
>>> print(time_idx[0], time_idx[-1])
1980-01-01 00:00:00 1980-01-01 00:59:59.996093750
>>> print(f"Index length: {len(time_idx)}")
Index length: 921600
read_metadata() None[source]

Read metadata from HDF5 attributes into the metadata container.

Loads all HDF5 attributes from the dataset and converts them to the appropriate Python types before populating the metadata object.

For older MTH5 files, this method attempts to coerce values to the expected types based on the metadata schema to maintain backwards compatibility.

Notes

This method automatically validates metadata through the metadata container’s validators. Type coercion is applied to handle older file formats that may have stored metadata with different types.

Examples

>>> channel.read_metadata()
>>> print(channel.metadata.component)
'Ex'
>>> print(channel.metadata.sample_rate)
256.0

Handles type coercion for older files

>>> # If sample_rate was stored as string '256.0' in old file
>>> channel.read_metadata()
>>> print(type(channel.metadata.sample_rate))
<class 'float'>
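
The type coercion for older files can be pictured as follows. This is a minimal sketch under stated assumptions: the `EXPECTED_TYPES` mapping and the `coerce_attrs` helper are hypothetical, whereas the actual method takes its expected types from the metadata schema.

```python
# Hypothetical schema: attribute name -> expected Python type.
EXPECTED_TYPES = {
    "component": str,
    "sample_rate": float,
    "measurement_azimuth": float,
}


def coerce_attrs(raw_attrs: dict) -> dict:
    """Coerce raw attribute values to the types named in EXPECTED_TYPES."""
    coerced = {}
    for key, value in raw_attrs.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None or isinstance(value, expected):
            coerced[key] = value  # unknown key or already the right type
            continue
        try:
            coerced[key] = expected(value)
        except (TypeError, ValueError):
            coerced[key] = value  # leave unchanged; validation happens later
    return coerced


# An "old file" that stored sample_rate as a string and azimuth as an int.
old_attrs = {"component": "Ex", "sample_rate": "256.0", "measurement_azimuth": 90}
attrs = coerce_attrs(old_attrs)
print(type(attrs["sample_rate"]).__name__, attrs["sample_rate"])
```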
write_metadata() None[source]

Write metadata from the container to HDF5 dataset attributes.

Converts all metadata values to numpy-compatible types before writing to HDF5 attributes. Falls back to string conversion if direct conversion fails.

Notes

This method is automatically called during initialization and when metadata is updated.

Examples

>>> channel.metadata.component = 'Ey'
>>> channel.metadata.measurement_azimuth = 90.0
>>> channel.write_metadata()
replace_dataset(new_data_array: numpy.ndarray) None[source]

Replace the entire dataset with new data.

Parameters:

new_data_array (np.ndarray) – New data array with shape (npts,). Must be 1-dimensional.

Raises:

TypeError – If new_data_array cannot be converted to numpy array.

Notes

The HDF5 dataset will be resized if the new array has a different shape. All existing data will be overwritten.

Examples

Replace with synthetic data

>>> import numpy as np
>>> new_data = np.sin(2 * np.pi * 1.0 * np.linspace(0, 10, 2560))
>>> channel.replace_dataset(new_data)
>>> print(f"New shape: {channel.hdf5_dataset.shape}")
New shape: (2560,)

Replace with processed data

>>> original = channel.hdf5_dataset[:]
>>> filtered = np.convolve(original, np.ones(5)/5, mode='same')
>>> channel.replace_dataset(filtered)
extend_dataset(new_data_array: numpy.ndarray, start_time: str | mt_metadata.common.mttime.MTime, sample_rate: float, fill: str | float | int | None = None, max_gap_seconds: float | int = 1, fill_window: int = 10) None[source]

Extend or prepend data to the existing dataset with gap handling.

Adds new data before, after, or within the existing time series. Time alignment, overlaps, and gaps are handled with configurable fill strategies.

Parameters:
  • new_data_array (np.ndarray) – New data array with shape (npts,).

  • start_time (str or MTime) – Start time of the new data array in UTC.

  • sample_rate (float) – Sample rate of the new data array in Hz. Must match existing sample rate.

  • fill (str, float, int, or None, optional) –

    Strategy for filling data gaps:

    • None : Raise MTH5Error if gap exists (default)

    • ’mean’ : Fill with mean of both datasets within fill_window

    • ’median’ : Fill with median of both datasets within fill_window

    • ’nan’ : Fill with NaN values

    • numeric value : Fill with specified constant

  • max_gap_seconds (float or int, optional) – Maximum allowed gap in seconds. Exceeding this raises MTH5Error. Default is 1 second.

  • fill_window (int, optional) – Number of points from each dataset edge to estimate fill values. Default is 10 points.

Raises:
  • MTH5Error – If sample rates don’t match, gap exceeds max_gap_seconds, or fill strategy is invalid.

  • TypeError – If new_data_array cannot be converted to numpy array.

Notes

  • Prepend: New data start < existing start

  • Append: New data start > existing end

  • Overwrite: New data overlaps existing data

The dataset is automatically resized to accommodate new data.
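
The three placement cases listed above reduce to two comparisons on the new start time. The sketch below assumes a hypothetical `classify_placement` helper and plain `datetime` values; it only mirrors the rules as stated in the notes and does not touch an HDF5 dataset.

```python
from datetime import datetime, timedelta


def classify_placement(new_start, existing_start, existing_end):
    """Return 'prepend', 'append', or 'overwrite' for a new data block."""
    if new_start < existing_start:
        return "prepend"
    if new_start > existing_end:
        return "append"
    return "overwrite"


existing_start = datetime(2015, 1, 8, 19, 0, 0)
existing_end = existing_start + timedelta(seconds=60)

# Offsets in seconds relative to the existing start time.
for offset in (-10.0, 1.0, 60.5):
    new_start = existing_start + timedelta(seconds=offset)
    print(offset, classify_placement(new_start, existing_start, existing_end))
```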

Examples

Append data with a small gap

>>> ex = mth5_obj.get_channel('MT001', 'MT001a', 'Ex')
>>> print(f"Original: {ex.n_samples} samples, ends {ex.end}")
Original: 4096 samples, ends 2015-01-08T19:32:09.500000+00:00
>>> new_data = np.random.randn(4096)
>>> new_start = (ex.end + 0.5).isoformat()  # 0.5s gap
>>> ex.extend_dataset(new_data, new_start, ex.sample_rate,
...                   fill='median', max_gap_seconds=2)
>>> print(f"Extended: {ex.n_samples} samples, ends {ex.end}")
Extended: 8200 samples, ends 2015-01-08T19:40:42.500000+00:00

Prepend data seamlessly

>>> prepend_data = np.random.randn(2048)
>>> prepend_start = (ex.start - 2048/ex.sample_rate).isoformat()
>>> ex.extend_dataset(prepend_data, prepend_start, ex.sample_rate)
>>> print(f"New start: {ex.start}")

Overwrite section of existing data

>>> replacement_data = np.zeros(1024)
>>> replace_start = (ex.start + 1.0).isoformat()  # 1s after start
>>> ex.extend_dataset(replacement_data, replace_start, ex.sample_rate)
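
To make the fill strategies concrete, here is a minimal sketch of gap filling on plain Python lists. The `fill_gap` helper is hypothetical, it raises `ValueError` where the documented method raises MTH5Error, and the exact way mth5 selects edge samples is an assumption.

```python
import math
import statistics


def fill_gap(existing, new, n_gap, fill=None, fill_window=10):
    """Join two sample lists, filling n_gap missing samples between them."""
    if n_gap <= 0:
        return existing + new
    if fill is None:
        raise ValueError("gap found and no fill strategy given")

    # Samples nearest the gap, taken from both sides.
    edge = existing[-fill_window:] + new[:fill_window]
    if fill == "mean":
        value = statistics.fmean(edge)
    elif fill == "median":
        value = statistics.median(edge)
    elif fill == "nan":
        value = math.nan
    elif isinstance(fill, (int, float)):
        value = float(fill)  # constant fill value
    else:
        raise ValueError(f"unknown fill strategy: {fill!r}")
    return existing + [value] * n_gap + new


joined = fill_gap([1.0, 2.0, 3.0], [5.0, 6.0, 7.0], n_gap=2,
                  fill="median", fill_window=2)
print(joined)
```

Restricting the estimate to `fill_window` samples on each side keeps the fill value representative of the data next to the gap rather than of the whole series.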
has_data() bool[source]

Check if the channel contains non-zero data.

Returns:

True if dataset has non-zero values, False if all zeros or empty.

Return type:

bool

Examples

>>> if channel.has_data():
...     print("Channel has valid data")
... else:
...     print("Channel is empty or all zeros")
Channel has valid data
>>> empty_channel.has_data()
False
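
The documented return rule (True only when at least one value is non-zero) can be expressed in one line. The `has_nonzero_data` function below is a hypothetical stand-in operating on a plain sequence, not the method's actual implementation.

```python
def has_nonzero_data(values) -> bool:
    """True if values is non-empty and contains at least one non-zero entry."""
    return len(values) > 0 and any(v != 0 for v in values)


print(has_nonzero_data([0.0, 0.0, 0.3]))
print(has_nonzero_data([0.0, 0.0]))
print(has_nonzero_data([]))
```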
to_channel_ts() mth5.timeseries.ChannelTS[source]

Convert the dataset to a ChannelTS object with full metadata.

Returns:

Time series object with data, metadata, and channel response.

Return type:

ChannelTS

Notes

Data is loaded into memory. The resulting ChannelTS object is independent of the HDF5 file and can be modified without affecting the original dataset.

Examples

>>> ts = channel.to_channel_ts()
>>> print(f"Type: {type(ts)}")
Type: <class 'mth5.timeseries.channel_ts.ChannelTS'>
>>> print(f"Shape: {ts.ts.shape}, Mean: {ts.ts.mean():.2f}")
Shape: (4096,), Mean: 0.15

Process the time series

>>> filtered_ts = ts.low_pass_filter(cutoff=10.0)
>>> detrended_ts = ts.detrend('linear')
>>> ts.plot()
to_xarray() xarray.DataArray[source]

Convert the dataset to an xarray DataArray with time coordinates.

Returns:

DataArray with time index and metadata as attributes.

Return type:

xr.DataArray

Notes

Data is loaded into memory. Metadata is stored in the attrs dictionary and will not be validated if modified.

Examples

>>> xr_data = channel.to_xarray()
>>> print(xr_data)
<xarray.DataArray (time: 4096)>
array([0.931, 0.142, ..., 0.882])
Coordinates:
  * time     (time) datetime64[ns] 1980-01-01 ... 1980-01-01T00:00:15.996
Attributes:
    component:    Ex
    sample_rate:  256.0
    ...

Use xarray’s powerful selection

>>> morning = xr_data.sel(time=slice('1980-01-01T06:00', '1980-01-01T12:00'))
>>> daily_mean = xr_data.resample(time='1D').mean()
>>> xr_data.plot()
to_dataframe() pandas.DataFrame[source]

Convert the dataset to a pandas DataFrame with time index.

Returns:

DataFrame with ‘data’ column and time index. Metadata stored in attrs.

Return type:

pd.DataFrame

Notes

Data is loaded into memory. Metadata is stored in the experimental attrs attribute and will not be validated if modified.

Examples

>>> df = channel.to_dataframe()
>>> print(df.head())
                                data
time
1980-01-01 00:00:00.000000000  0.931
1980-01-01 00:00:00.003906250  0.142
...

Use pandas operations

>>> df['data'].describe()
>>> df.resample('1H').mean()
>>> df.plot(y='data', figsize=(12, 4))

Access metadata

>>> print(df.attrs['component'])
'Ex'
>>> print(df.attrs['sample_rate'])
256.0
to_numpy() numpy.recarray[source]

Convert the dataset to a numpy structured array with time and data columns.

Returns:

Record array with ‘time’ and ‘channel_data’ fields.

Return type:

np.recarray

Notes

Data is loaded into memory. The data field is named ‘channel_data’ rather than ‘data’ because data is already an attribute of numpy arrays.

Examples

>>> arr = channel.to_numpy()
>>> print(arr.dtype.names)
('time', 'channel_data')
>>> print(arr['time'][0])
1980-01-01T00:00:00.000000000
>>> print(arr['channel_data'].mean())
0.152

Access fields

>>> times = arr['time']
>>> data = arr['channel_data']
>>> import matplotlib.pyplot as plt
>>> plt.plot(times, data)
from_channel_ts(channel_ts_obj: mth5.timeseries.ChannelTS, how: str = 'replace', fill: str | float | int | None = None, max_gap_seconds: float | int = 1, fill_window: int = 10) None[source]

Populate the dataset from a ChannelTS object.

Parameters:
  • channel_ts_obj (ChannelTS) – Time series object containing data and metadata.

  • how ({'replace', 'extend'}, optional) –

    Method for adding data:

    • ’replace’ : Replace entire dataset (default)

    • ’extend’ : Append/prepend to existing data with gap handling

  • fill (str, float, int, or None, optional) –

    Gap filling strategy (only used with how=’extend’):

    • None : Raise error on gaps (default)

    • ’mean’ : Fill with mean of both datasets

    • ’median’ : Fill with median of both datasets

    • ’nan’ : Fill with NaN

    • numeric : Fill with constant value

  • max_gap_seconds (float or int, optional) – Maximum allowed gap in seconds. Default is 1.

  • fill_window (int, optional) – Points to use for estimating fill values. Default is 10.

Raises:
  • TypeError – If channel_ts_obj is not a ChannelTS instance.

  • MTH5Error – If time alignment or metadata validation fails.

Examples

Replace entire dataset

>>> from mth5.timeseries import ChannelTS
>>> import numpy as np
>>> ts = ChannelTS(
...     channel_type='electric',
...     data=np.random.randn(1000),
...     channel_metadata={'electric': {
...         'component': 'ex',
...         'sample_rate': 256.0
...     }}
... )
>>> channel.from_channel_ts(ts, how='replace')
>>> print(channel.n_samples)
1000

Extend existing dataset

>>> new_ts = ChannelTS(
...     channel_type='electric',
...     data=np.random.randn(500),
...     channel_metadata={'electric': {
...         'component': 'ex',
...         'sample_rate': 256.0,
...         'time_period.start': channel.end.isoformat()
...     }}
... )
>>> channel.from_channel_ts(new_ts, how='extend', fill='median')
>>> print(channel.n_samples)
1500
from_xarray(data_array: xarray.DataArray, how: str = 'replace', fill: str | float | int | None = None, max_gap_seconds: float | int = 1, fill_window: int = 10) None[source]

Populate the dataset from an xarray DataArray.

Parameters:
  • data_array (xr.DataArray) – DataArray with time coordinate and metadata in attrs.

  • how ({'replace', 'extend'}, optional) –

    Method for adding data:

    • ’replace’ : Replace entire dataset (default)

    • ’extend’ : Append/prepend to existing data with gap handling

  • fill (str, float, int, or None, optional) –

    Gap filling strategy (only used with how=’extend’):

    • None : Raise error on gaps (default)

    • ’mean’ : Fill with mean of both datasets

    • ’median’ : Fill with median of both datasets

    • ’nan’ : Fill with NaN

    • numeric : Fill with constant value

  • max_gap_seconds (float or int, optional) – Maximum allowed gap in seconds. Default is 1.

  • fill_window (int, optional) – Points to use for estimating fill values. Default is 10.

Raises:
  • TypeError – If data_array is not an xarray.DataArray.

  • MTH5Error – If time alignment fails.

Examples

Replace from xarray

>>> import xarray as xr
>>> import numpy as np
>>> import pandas as pd
>>> time = pd.date_range('2020-01-01', periods=1000, freq=pd.Timedelta(seconds=1 / 256))  # 256 Hz
>>> data = xr.DataArray(
...     np.random.randn(1000),
...     coords=[('time', time)],
...     attrs={'component': 'ex', 'sample_rate': 256.0}
... )
>>> channel.from_xarray(data, how='replace')
>>> print(channel.n_samples)
1000

Extend from xarray with gap

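>>> time2 = pd.date_range('2020-01-01T00:00:05', periods=500, freq=pd.Timedelta(seconds=1 / 256))
>>> data2 = xr.DataArray(np.random.randn(500), coords=[('time', time2)])
>>> # The gap is slightly over 1 second, so raise max_gap_seconds above its default
>>> channel.from_xarray(data2, how='extend', fill='mean', max_gap_seconds=2)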
property channel_entry: numpy.ndarray

Create a structured array entry for channel summary tables.

Returns:

Structured array with dtype=CHANNEL_DTYPE containing channel metadata and HDF5 references for survey-wide summaries.

Return type:

np.ndarray

Notes

This entry includes survey ID, station ID, run ID, location, component, time period, sample rate, and HDF5 references for navigation.

Examples

>>> entry = channel.channel_entry
>>> print(entry['component'][0])
'Ex'
>>> print(entry['sample_rate'][0])
256.0
>>> print(entry['station'][0])
'MT001'
time_slice(start: str | mt_metadata.common.mttime.MTime, end: str | mt_metadata.common.mttime.MTime | None = None, n_samples: int | None = None, return_type: str = 'channel_ts') mth5.timeseries.ChannelTS | xarray.DataArray | pandas.DataFrame | numpy.ndarray[source]

Extract a time slice from the channel dataset.

Parameters:
  • start (str or MTime) – Start time of the slice in UTC.

  • end (str or MTime, optional) – End time of the slice. Mutually exclusive with n_samples.

  • n_samples (int, optional) – Number of samples to extract. Mutually exclusive with end.

  • return_type ({'channel_ts', 'xarray', 'pandas', 'numpy'}, optional) – Format for returned data. Default is ‘channel_ts’.

Returns:

Time slice in the requested format with appropriate metadata.

Return type:

ChannelTS or xr.DataArray or pd.DataFrame or np.ndarray

Raises:

ValueError – If both end and n_samples are provided or neither is provided.

Notes

  • If the requested slice extends beyond available data, it will be automatically truncated with a warning.

  • Regional HDF5 references are used when possible for efficiency.

Examples

Extract by number of samples

>>> ex = mth5_obj.get_channel('FL001', 'FL001a', 'Ex')
>>> ex_slice = ex.time_slice("2015-01-08T19:49:15", n_samples=4096)
>>> print(type(ex_slice))
<class 'mth5.timeseries.channel_ts.ChannelTS'>
>>> print(f"Slice shape: {ex_slice.ts.shape}")
Slice shape: (4096,)
>>> ex_slice.plot()

Extract by time range

>>> ex_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     end="2015-01-08T20:49:15"
... )
>>> print(f"Duration: {ex_slice.end - ex_slice.start} seconds")
Duration: 3600.0 seconds

Return as xarray for analysis

>>> xr_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     n_samples=1000,
...     return_type='xarray'
... )
>>> print(xr_slice.mean().values)
0.152
>>> xr_slice.plot()

Return as pandas for tabular ops

>>> df_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     n_samples=500,
...     return_type='pandas'
... )
>>> df_slice['data'].describe()
>>> df_slice.resample('10S').mean()

Return as numpy for computation

>>> np_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     n_samples=100,
...     return_type='numpy'
... )
>>> np.fft.fft(np_slice)
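
The argument checking and truncation described above can be sketched as index arithmetic. The `slice_bounds` helper and its parameter names are hypothetical, and the sketch omits the warning that the real method emits on truncation.

```python
from datetime import datetime, timedelta


def slice_bounds(data_start, sample_rate, n_available, start,
                 end=None, n_samples=None):
    """Return (start_index, stop_index) for a requested time slice."""
    # Exactly one of end / n_samples must be given.
    if (end is None) == (n_samples is None):
        raise ValueError("provide exactly one of end or n_samples")

    start_index = round((start - data_start).total_seconds() * sample_rate)
    if n_samples is not None:
        stop_index = start_index + n_samples
    else:
        # +1 so the sample at the end time is included in data[start:stop]
        stop_index = round((end - data_start).total_seconds() * sample_rate) + 1

    # Truncate requests that run past the available data.
    start_index = max(start_index, 0)
    stop_index = min(stop_index, n_available)
    return start_index, stop_index


data_start = datetime(2015, 1, 8, 19, 49, 15)
t10 = data_start + timedelta(seconds=10)
t20 = data_start + timedelta(seconds=20)

by_count = slice_bounds(data_start, 8.0, 4096, t10, n_samples=4096)
by_range = slice_bounds(data_start, 8.0, 4096, t10, end=t20)
print(by_count, by_range)
```

In the first call the request runs past the 4096 available samples, so the stop index is clipped to the dataset length.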
get_index_from_time(given_time: str | mt_metadata.common.mttime.MTime) int[source]

Calculate the array index for a given time.

Parameters:

given_time (str or MTime) – Time to convert to index.

Returns:

Array index corresponding to the given time.

Return type:

int

Notes

Index is calculated as: (time - start_time) * sample_rate and rounded to nearest integer.

Examples

>>> idx = channel.get_index_from_time('1980-01-01T00:00:10')
>>> print(f"Index for 10 seconds: {idx}")
Index for 10 seconds: 2560
>>> # With 256 Hz sample rate: 10 * 256 = 2560
>>> start_idx = channel.get_index_from_time(channel.start)
>>> print(start_idx)
0
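
The index formula in the notes is easy to check by hand. The sketch below uses a hypothetical `index_from_time` function on `datetime` values; note that Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding mth5 uses.

```python
from datetime import datetime, timedelta


def index_from_time(given_time, start, sample_rate):
    # (time - start_time) * sample_rate, rounded to the nearest integer
    return round((given_time - start).total_seconds() * sample_rate)


start = datetime(1980, 1, 1)
idx = index_from_time(start + timedelta(seconds=10), start, 256.0)
print(idx)
```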
get_index_from_end_time(given_time: str | mt_metadata.common.mttime.MTime) int[source]

Get the end index value (inclusive) for a given time.

Parameters:

given_time (str or MTime) – Time to convert to end index.

Returns:

Array index + 1 for inclusive slicing.

Return type:

int

Notes

Adds 1 to the calculated index so that the sample at given_time is included when the result is used as the stop of a Python slice, which excludes its stop index (e.g., array[start:end]).

Examples

>>> end_idx = channel.get_index_from_end_time('1980-01-01T00:00:10')
>>> data_slice = channel.hdf5_dataset[0:end_idx]
>>> # Includes sample at exactly 10 seconds
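
The reason for the extra 1 is that Python slices exclude their stop index. A small sketch with assumed values (one sample per second, so the index equals the elapsed seconds):

```python
samples = list(range(10))  # one sample per second, index == seconds since start
sample_rate = 1.0
seconds_to_end = 3

index = round(seconds_to_end * sample_rate)  # index of the sample at t = 3 s
end_index = index + 1                        # the extra 1 described in the notes

without_end = samples[0:index]      # stops before the sample at 3 s
with_end = samples[0:end_index]     # includes the sample at 3 s
print(without_end, with_end)
```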
class mth5.groups.AuxiliaryDataset(group: h5py.Dataset, **kwargs: Any)[source]

Bases: ChannelDataset

Specialized container for auxiliary channel data.

Inherits all functionality from ChannelDataset with auxiliary channel specific metadata handling. Used for temperature, battery voltage, etc.

Parameters:
  • group (h5py.Dataset) – HDF5 dataset containing auxiliary data.

  • **kwargs (dict) – Additional keyword arguments passed to ChannelDataset.

Examples

>>> temp_dataset = run_group.get_channel('Temperature')
>>> print(type(temp_dataset))
<class 'mth5.groups.channel_dataset.AuxiliaryDataset'>
>>> print(temp_dataset.metadata.type)
'auxiliary'
>>> print(temp_dataset.metadata.units)
'celsius'
class mth5.groups.RunGroup(group: h5py.Group, run_metadata: mt_metadata.timeseries.Run | None = None, **kwargs: Any)[source]

Bases: mth5.groups.BaseGroup

Container for a single MT measurement run with multiple channels.

Manages time series data and metadata for one measurement run within a station. A run can contain multiple channels of electric, magnetic, and auxiliary data. This class provides methods to add, retrieve, and manage individual channels, along with convenient access to station and survey metadata.

The run group is located at /Survey/Stations/{station_name}/{run_name} in the HDF5 file hierarchy.

metadata[source]

Run metadata including sample rate, time period, and channel information.

Type:

mt_metadata.timeseries.Run

channel_summary

Summary table of all channels in the run.

Type:

pd.DataFrame

groups_list

List of channel names in the run.

Type:

list[str]

Parameters:
  • group (h5py.Group) – HDF5 group for the run, should have path like /Survey/Stations/{station_name}/{run_name}

  • run_metadata (mt_metadata.timeseries.Run, optional) – Metadata container for the run. Default is None.

  • **kwargs (Any) – Additional keyword arguments passed to BaseGroup.

Notes

Key behaviors:

  • Channels can be of type: electric, magnetic, or auxiliary

  • All metadata updates should use the metadata object for validation

  • Call write_metadata() after modifying metadata to persist changes

  • Channel metadata is cached for performance during repeated access

  • Deleting a channel removes the reference but doesn’t reduce file size

Examples

Access run from an open MTH5 file:

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')

Check available channels:

>>> run.groups_list
['Ex', 'Ey', 'Hx', 'Hy']

Access HDF5 group directly:

>>> run.hdf5_group.ref
<HDF5 Group Reference>

Update metadata and persist to file:

>>> run.metadata.sample_rate = 512.0
>>> run.write_metadata()

Add a channel:

>>> import numpy as np
>>> data = np.random.rand(4096)
>>> ex = run.add_channel('Ex', 'electric', data=data)

This class provides methods to add and get channels. A summary table of all existing channels in the run is also provided as a convenience look up table to make searching easier.


Note

All attributes should be input into the metadata object so that all input is validated against the metadata standards. If you change attributes in the metadata object, you should run the RunGroup.write_metadata() method. This is a temporary solution; an automatic updater for metadata changes is in progress.

>>> run.metadata.existing_attribute = 'update_existing_attribute'
>>> run.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> run.metadata.add_base_attribute('new_attribute',
...                                 'new_attribute_value',
...                                 {'type':str,
...                                  'required':True,
...                                  'style':'free form',
...                                  'description': 'new attribute desc.',
...                                  'units':None,
...                                  'options':[],
...                                  'alias':[],
...                                  'example':'new attribute
Add a channel:

>>> import numpy as np
>>> new_channel = run.add_channel('Ex', 'electric',
...                               data=np.random.rand(4096))
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ex
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
Add a channel with metadata:

>>> from mt_metadata.timeseries import Electric
>>> ex_metadata = Electric()
>>> ex_metadata.time_period.start = '2020-01-01T12:30:00'
>>> ex_metadata.time_period.end = '2020-01-03T16:30:00'
>>> new_ex = run.add_channel('Ex', 'electric',
...                          channel_metadata=ex_metadata)
>>> # to look at the metadata
>>> new_ex.metadata
{
     "electric": {
        "ac.end": 1.2,
        "ac.start": 2.3,
        ...
        }
}

See also

mt_metadata for details on how to add metadata from various files and Python objects.

Remove a channel:

>>> run.remove_channel('Ex')
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................

Note

Deleting a channel is not as simple as del(channel). In HDF5 this does not free up space in the file; it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.

Get a channel:

>>> existing_ex = run.get_channel('Ex')
>>> existing_ex
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:08:32+00:00
    sample rate:      8
Summary Table:

A summary table is provided to make searching easier. The table summarizes all channels within a run. To see what names are in the summary table:

>>> run.summary_table.dtype.descr
[('component', ('|S5', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('n_samples', '<i4'),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('units', ('|S25', {'h5py_encoding': 'ascii'})),
 ('hdf5_reference', ('|O', {'ref': h5py.h5r.Reference}))]

Note

When a channel is added, an entry is added to the summary table, with the information pulled from the channel metadata.

>>> new_run.summary_table
index | component | start | end | n_samples | measurement_type | units | hdf5_reference
---------------------------------------------------------------------------------------
property station_metadata: mt_metadata.timeseries.Station

Get station metadata with current run included.

Returns:

Station metadata object containing this run’s information.

Return type:

metadata.Station

Examples

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='r')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> station_meta = run.station_metadata
>>> print(station_meta.id)
MT001
property survey_metadata: mt_metadata.timeseries.Survey

Get survey metadata with current station and run included.

Returns:

Survey metadata object containing the full hierarchy.

Return type:

metadata.Survey

Examples

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='r')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> survey_meta = run.survey_metadata
>>> print(survey_meta.id)
CONUS_South
recache_channel_metadata() → None

Clear and rebuild the channel metadata cache from current HDF5 data.

This method reads all channel metadata from HDF5 storage and updates the internal cache. Useful when channel metadata has been modified externally or needs to be synchronized.

Examples

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run.recache_channel_metadata()
>>> # Cache is now synchronized with HDF5 storage
metadata() → mt_metadata.timeseries.Run

Get run metadata including all channel information.

This property dynamically reads and caches channel metadata from HDF5, ensuring the run metadata always reflects the current state of channels.

Returns:

Run metadata object with all channels included.

Return type:

metadata.Run

Examples

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run_meta = run.metadata
>>> print(run_meta.channels_recorded_electric)
['ex', 'ey']
>>> print(run_meta.sample_rate)
256.0
property channel_summary: pandas.DataFrame

Get summary of all channels in the run as a DataFrame.

Returns:

DataFrame with columns: component, start, end, n_samples, sample_rate, measurement_type, units, hdf5_reference.

Return type:

pandas.DataFrame

Examples

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> summary = run.channel_summary
>>> print(summary[['component', 'sample_rate', 'n_samples']])
  component  sample_rate  n_samples
0        ex        256.0      65536
1        ey        256.0      65536
2        hx        256.0      65536
3        hy        256.0      65536
write_metadata() → None

Write run metadata to HDF5 attributes.

Converts metadata object to dictionary and writes all attributes to the HDF5 group.

Examples

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run.metadata.sample_rate = 512.0
>>> run.write_metadata()
>>> # Metadata is now persisted to HDF5 file
add_channel(channel_name, channel_type, data, channel_dtype='int32', shape=None, max_shape=(None,), chunks=True, channel_metadata=None, **kwargs)

Add a channel to the run.

Parameters:
  • channel_name (str) – Name of the channel (e.g., ‘ex’, ‘ey’, ‘hx’, ‘hy’, ‘hz’).

  • channel_type (str) – Type of channel: ‘electric’, ‘magnetic’, or ‘auxiliary’.

  • data (numpy.ndarray or None) – Time series data for the channel. If None, an empty resizable dataset will be created.

  • channel_dtype (str, optional) – Data type for the channel if data is None, by default “int32”.

  • shape (tuple of int, optional) – Initial shape of the dataset. If None and data is None, shape is estimated from metadata or set to (1,), by default None.

  • max_shape (tuple of int or None, optional) – Maximum shape the dataset can be resized to. Use None for unlimited growth in that dimension, by default (None,).

  • chunks (bool or int, optional) – Enable chunked storage. If True, uses automatic chunking. If int, uses that chunk size, by default True.

  • channel_metadata (mt_metadata.timeseries.Electric, Magnetic, or Auxiliary, optional) – Metadata object for the channel, by default None.

  • **kwargs (dict) – Additional keyword arguments.

Returns:

The created channel dataset object.

Return type:

ElectricDataset or MagneticDataset or AuxiliaryDataset

Raises:

MTH5Error – If channel_type is not one of: electric, magnetic, auxiliary.

Examples

Add a channel with data:

>>> import numpy as np
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='a')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> data = np.random.rand(4096)
>>> ex = run.add_channel('ex', 'electric', data)
>>> print(ex.metadata.component)
ex

Add a channel with metadata:

>>> from mt_metadata.timeseries import Electric
>>> ex_meta = Electric()
>>> ex_meta.time_period.start = '2020-01-01T12:30:00'
>>> ex_meta.sample_rate = 256.0
>>> ex = run.add_channel('ex', 'electric', None,
...                      channel_metadata=ex_meta)
>>> print(ex.metadata.sample_rate)
256.0

Add a channel with custom shape:

>>> ex = run.add_channel('ex', 'electric', None,
...                      shape=(8192,), channel_dtype='float32')
>>> print(ex.hdf5_dataset.shape)
(8192,)
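
The channel_type check described under Raises can be pictured as a simple lookup. The sketch below is illustrative only and runs without mth5: `CHANNEL_CLASSES`, `resolve_channel_class`, and the stand-in `MTH5Error` class are hypothetical names, not the library's internals.

```python
class MTH5Error(Exception):
    """Stand-in for mth5's error type so this sketch runs without mth5."""


# Hypothetical lookup from channel_type to the name of the returned class.
CHANNEL_CLASSES = {
    "electric": "ElectricDataset",
    "magnetic": "MagneticDataset",
    "auxiliary": "AuxiliaryDataset",
}


def resolve_channel_class(channel_type: str) -> str:
    """Return the dataset class name for channel_type, or raise MTH5Error."""
    try:
        return CHANNEL_CLASSES[channel_type]
    except KeyError:
        valid = ", ".join(CHANNEL_CLASSES)
        raise MTH5Error(
            f"channel_type must be one of: {valid}; got {channel_type!r}"
        ) from None


print(resolve_channel_class("electric"))  # ElectricDataset
```

Validating the type before any dataset is created means a misspelled channel_type fails early instead of leaving a partially written channel behind.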
get_channel(channel_name: str) → mth5.groups.ElectricDataset | mth5.groups.MagneticDataset | mth5.groups.AuxiliaryDataset | mth5.groups.ChannelDataset

Get a channel from an existing name.

Returns the appropriate channel dataset container based on the channel type (electric, magnetic, or auxiliary).

Parameters:

channel_name (str) – Name of the channel to retrieve (e.g., ‘ex’, ‘ey’, ‘hx’).

Returns:

Channel dataset object containing the channel data and metadata.

Return type:

ElectricDataset or MagneticDataset or AuxiliaryDataset or ChannelDataset

Raises:

MTH5Error – If the channel does not exist in the run.

Examples

Attempting to get a non-existent channel:

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='r')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> ex = run.get_channel('ex')
MTH5Error: ex does not exist, check groups_list for existing names

Check available channels first:

>>> run.groups_list
['ey', 'hx', 'hz']

Get an existing channel:

>>> ey = run.get_channel('ey')
>>> print(ey)
Channel Electric:
-------------------
        component:        ey
        data type:        electric
        data format:      float32
        data shape:       (4096,)
        start:            1980-01-01T00:00:00+00:00
        end:              1980-01-01T00:00:01+00:00
        sample rate:      4096
remove_channel(channel_name: str) → None

Remove a channel from the run.

Deleting a channel is not as simple as del(channel). In HDF5, this does not free up space in the file; it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.

Parameters:

channel_name (str) – Name of the existing channel to remove.

Notes

Deleting a channel does not reduce the HDF5 file size. It simply removes the reference. If file size reduction is your goal, copy what you want into another file.

Todo: Need to remove summary table entry as well.

Examples

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> run.remove_channel('ex')
has_data() → bool

Check whether every channel in the run contains non-empty, non-zero data.

Verifies that all channels in the run have valid data (non-zero and non-empty arrays). Returns False if any channel lacks data.

Returns:

True if all channels have data, False if any channel is empty or all zeros.

Return type:

bool

Notes

A channel is considered to have data if its has_data() method returns True, meaning it contains non-zero values.

Examples

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> if run.has_data():
...     print("Run contains valid data")
...     runts = run.to_runts()
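
The rule above (every channel must be non-empty and not all zeros) can be sketched in plain Python. This is an illustration, not the mth5 implementation: `channel_has_data` and `run_has_data` are hypothetical helpers, and channels are modeled as lists of numbers rather than HDF5 datasets.

```python
def channel_has_data(samples) -> bool:
    """A channel has data if it is non-empty and contains a non-zero value."""
    return len(samples) > 0 and any(value != 0 for value in samples)


def run_has_data(channels: dict) -> bool:
    """True only if every channel in the mapping has data.

    Note: an empty mapping returns True here; how mth5 treats a run with
    no channels is not covered by this sketch.
    """
    return all(channel_has_data(samples) for samples in channels.values())


channels = {"ex": [0.1, 0.0, -0.3], "ey": [0.0, 0.0, 0.0]}
print(run_has_data(channels))  # False, because 'ey' is all zeros
```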
to_runts(start: str | None = None, end: str | None = None, n_samples: int | None = None) → mth5.timeseries.RunTS

Convert run to a RunTS timeseries object.

Combines all channels in the run into a RunTS object which handles multi-channel time series data with associated metadata.

Parameters:
  • start (str, optional) – Start time for time slice in ISO format (e.g., ‘2023-01-01T12:00:00’). If None, uses entire channel data. Default is None.

  • end (str, optional) – End time for time slice in ISO format. Only used if start is specified. Default is None.

  • n_samples (int, optional) – Number of samples to extract from start. If both end and n_samples are specified, end takes precedence. Default is None.

Returns:

RunTS object containing all channels with full run and station metadata.

Return type:

RunTS

Notes

  • Includes run, station, and survey metadata in the output

  • Skips the ‘summary’ group which is not a channel

  • If start is specified, performs time slicing; otherwise returns full data

Examples

Convert entire run to RunTS:

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> runts = run.to_runts()
>>> print(runts.channels)
['ex', 'ey', 'hx', 'hy']

Time slice the run:

>>> runts = run.to_runts(start='2023-01-01T12:00:00',
...                       end='2023-01-01T13:00:00')
>>> print(runts.ex.ts.shape)
(1024,)
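
The precedence among start, end, and n_samples can be sketched as an index calculation. The helper below is hypothetical and makes several assumptions that the text does not confirm: the end index is exclusive, a start without end or n_samples runs to the end of the record, and indices are clamped to the available data.

```python
from datetime import datetime


def slice_indices(data_start, sample_rate, n_total,
                  start=None, end=None, n_samples=None):
    """Return (first, last_exclusive) sample indices for a requested slice."""
    if start is None:
        return 0, n_total  # no start given: use the full record

    t0 = datetime.fromisoformat(data_start)
    offset = (datetime.fromisoformat(start) - t0).total_seconds()
    first = round(offset * sample_rate)

    if end is not None:
        # end takes precedence over n_samples
        span = (datetime.fromisoformat(end) - t0).total_seconds()
        last = round(span * sample_rate)
    elif n_samples is not None:
        last = first + n_samples
    else:
        last = n_total  # assumption: slice runs to the end of the record

    # Clamp to the available data (assumption of this sketch)
    first = max(0, min(first, n_total))
    last = max(first, min(last, n_total))
    return first, last


print(slice_indices("2023-01-01T12:00:00", 8.0, 4096,
                    start="2023-01-01T12:01:00",
                    end="2023-01-01T12:02:00",
                    n_samples=100))  # (480, 960)
```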
from_runts(run_ts_obj: mth5.timeseries.RunTS, **kwargs: Any) → list[mth5.groups.ElectricDataset | mth5.groups.MagneticDataset | mth5.groups.AuxiliaryDataset]

Create channel datasets from a RunTS timeseries object.

Converts a RunTS object with multiple channels and metadata into HDF5 channel datasets and updates run metadata accordingly.

Parameters:
  • run_ts_obj (RunTS) – RunTS object containing multiple channels and metadata.

  • **kwargs (Any) – Additional keyword arguments.

Returns:

List of created channel dataset objects.

Return type:

list[ElectricDataset | MagneticDataset | AuxiliaryDataset]

Raises:

MTH5Error – If input is not a RunTS object.

Notes

  • Updates run metadata from input object

  • Validates station and run IDs match current context

  • Creates appropriate channel type based on channel metadata

  • Automatically registers recorded channels in run metadata

Examples

>>> from mth5.timeseries import RunTS
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> runts = RunTS.from_file("timeseries_data.txt")
>>> channels = run.from_runts(runts)
>>> print(f"Created {len(channels)} channels")
Created 4 channels
from_channel_ts(channel_ts_obj: mth5.timeseries.ChannelTS) → mth5.groups.ElectricDataset | mth5.groups.MagneticDataset | mth5.groups.AuxiliaryDataset

Create a channel dataset from a ChannelTS timeseries object.

Converts a single ChannelTS object with time series data and metadata into an HDF5 channel dataset. Handles filter registration and updates run metadata with channel information.

Parameters:

channel_ts_obj (ChannelTS) – ChannelTS object containing time series data and metadata.

Returns:

Created channel dataset object.

Return type:

ElectricDataset | MagneticDataset | AuxiliaryDataset

Raises:

MTH5Error – If input is not a ChannelTS object.

Notes

  • Registers filters from channel response if present

  • Validates and corrects station/run ID mismatches

  • Updates run metadata recorded channel lists

  • Automatically determines channel type from metadata

Examples

>>> from mth5.timeseries import ChannelTS
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> channel = ChannelTS.from_file("ex_timeseries.txt")
>>> ex = run.from_channel_ts(channel)
>>> print(ex.metadata.component)
ex
update_run_metadata() → None

Update metadata and table entries (Deprecated).

Deprecated: Use update_metadata() instead.

Raises:

DeprecationWarning – Always raised to indicate this method should not be used.

update_metadata() → None

Update run metadata from all channels and persist to HDF5.

Aggregates metadata from all channels including time period and sample rate, then writes updated metadata to HDF5 attributes.

Raises:

Exception – May raise exceptions if no channels exist (logs warning).

Notes

Updates:

  • Time period start from minimum of all channels

  • Time period end from maximum of all channels

  • Sample rate from first channel (assumes uniform across channels)

Should be called after adding or removing channels to maintain consistency between channel and run metadata.

Examples

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run.add_channel('ex', 'electric', data=ex_data)
>>> run.add_channel('ey', 'electric', data=ey_data)
>>> run.update_metadata()  # Updates time period and sample rate
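
The aggregation listed in the Notes can be sketched with plain dictionaries. This is a minimal illustration and not the mth5 code: `aggregate_run_metadata` is a hypothetical helper, channel metadata is modeled as dicts with ISO time strings, and raising ValueError for an empty run is an assumption of the sketch.

```python
from datetime import datetime


def aggregate_run_metadata(channels):
    """Combine per-channel metadata into run-level start, end, and sample rate."""
    if not channels:
        raise ValueError("run has no channels to aggregate")

    starts = [datetime.fromisoformat(c["start"]) for c in channels]
    ends = [datetime.fromisoformat(c["end"]) for c in channels]
    return {
        "start": min(starts).isoformat(),  # earliest channel start
        "end": max(ends).isoformat(),      # latest channel end
        # taken from the first channel; assumes a uniform rate across channels
        "sample_rate": channels[0]["sample_rate"],
    }


channels = [
    {"start": "2020-01-01T12:30:00", "end": "2020-01-01T13:30:00", "sample_rate": 256.0},
    {"start": "2020-01-01T12:30:05", "end": "2020-01-01T13:30:10", "sample_rate": 256.0},
]
print(aggregate_run_metadata(channels))
```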
plot(start: str | None = None, end: str | None = None, n_samples: int | None = None) → Any

Create a matplotlib plot of all channels in the run.

Generates a multi-panel plot showing all channels in the run using the RunTS plotting functionality.

Parameters:
  • start (str, optional) – Start time for time slice in ISO format. If None, plots entire channel data. Default is None.

  • end (str, optional) – End time for time slice in ISO format. Only used if start is specified. Default is None.

  • n_samples (int, optional) – Number of samples to extract from start. If both end and n_samples are specified, end takes precedence. Default is None.

Returns:

Matplotlib figure or axes object (depends on RunTS.plot() implementation).

Return type:

Any

Notes

  • Creates separate subplots for each channel type (electric, magnetic, auxiliary)

  • Time slice parameters work the same as to_runts()

  • Requires matplotlib to be installed

Examples

Plot entire run:

>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> fig = run.plot()
>>> fig.show()

Plot time slice:

>>> fig = run.plot(start='2023-01-01T12:00:00',
...                end='2023-01-01T13:00:00')
class mth5.groups.FeatureChannelDataset(dataset: h5py.Dataset, dataset_metadata: mt_metadata.features.FeatureDecimationChannel | None = None, **kwargs)

Container for multi-dimensional Fourier Coefficients organized by time and frequency.

This class manages Fourier Coefficient data with frequency band organization, similar to FCChannelDataset but with enhanced band tracking capabilities. The data array is organized with the following assumptions:

  1. Data are grouped into frequency bands

  2. Data are uniformly sampled in time (uniform FFT moving window step size)

The dataset tracks temporal evolution of frequency content across multiple windows, making it suitable for time-frequency analysis of geophysical signals.

Parameters:
  • dataset (h5py.Dataset) – HDF5 dataset containing the Fourier coefficient data.

  • dataset_metadata (FeatureDecimationChannel, optional) – Metadata for the dataset. See mt_metadata.features.FeatureDecimationChannel. If provided, must be of the same type as the internal metadata class. Default is None.

  • **kwargs – Additional keyword arguments for future extensibility.

hdf5_dataset

Reference to the HDF5 dataset.

Type:

h5py.Dataset

metadata

Metadata container with the following attributes:

  • name (str) – Dataset name

  • time_period.start (datetime) – Start time of the data acquisition

  • time_period.end (datetime) – End time of the data acquisition

  • sample_rate_window_step (float) – Sample rate of the time window stepping (Hz)

  • frequency_min (float) – Minimum frequency in the band (Hz)

  • frequency_max (float) – Maximum frequency in the band (Hz)

  • units (str) – Physical units of the coefficient data

  • component (str) – Component identifier (e.g., ‘Ex’, ‘Hy’)

  • sample_rate_decimation_level (int) – Decimation level applied to acquire this data

Type:

FeatureDecimationChannel

Raises:

MTH5Error – If dataset_metadata type does not match the expected FeatureDecimationChannel type.

Examples

>>> import h5py
>>> from mt_metadata.features import FeatureDecimationChannel
>>> from mth5.groups.feature_dataset import FeatureChannelDataset

Create a feature dataset from an HDF5 group and read its arrays while the file is still open:

>>> with h5py.File('data.h5', 'r') as f:
...     h5_dataset = f['feature_group']['Ex']
...     feature = FeatureChannelDataset(h5_dataset)
...     print(f"Time windows: {feature.n_windows}")
...     print(f"Frequencies: {feature.n_frequencies}")
...     # The dataset is only readable while the file is open
...     time_array = feature.time
...     freq_array = feature.frequency
...     data_array = feature.to_numpy()
logger
metadata
read_metadata() → None

Read metadata from the HDF5 file into the metadata container.

This method loads all attributes from the HDF5 dataset into the metadata container, enabling validation and type checking.

Examples

>>> feature.read_metadata()
>>> print(feature.metadata.component)
Ex
write_metadata() → None

Write metadata from the metadata container to the HDF5 attributes.

This method serializes the metadata container and writes all metadata as attributes to the HDF5 dataset. Exceptions raised because the file is read-only are caught.

Examples

>>> feature.metadata.component = 'Ey'
>>> feature.write_metadata()
property n_windows: int

Get the number of time windows in the dataset.

Returns:

Number of time windows (first dimension of the dataset).

Return type:

int

property time: numpy.ndarray

Get the time array for each window.

Returns an array of datetime64 values representing the start time of each time window. The time spacing is determined by the sample rate of the window stepping.

Returns:

Array of datetime64 values with shape (n_windows,) representing the start time of each window.

Return type:

np.ndarray

Examples

>>> time_array = feature.time
>>> print(time_array.shape)
(100,)
>>> print(time_array[0])
2023-01-01T00:00:00
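
How the window start times follow from the start time and the window step rate can be sketched with the standard library. This is an illustration only: `window_start_times` is a hypothetical helper that returns Python datetime objects, whereas the property itself returns a numpy datetime64 array.

```python
from datetime import datetime, timedelta


def window_start_times(start, sample_rate_window_step, n_windows):
    """Start time of each window, spaced 1 / sample_rate_window_step seconds apart."""
    t0 = datetime.fromisoformat(start)
    step = timedelta(seconds=1.0 / sample_rate_window_step)
    return [t0 + i * step for i in range(n_windows)]


# A step rate of 0.5 Hz means one new window every 2 seconds.
times = window_start_times("2023-01-01T00:00:00", 0.5, 4)
print(times[1].isoformat())  # 2023-01-01T00:00:02
```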
property n_frequencies: int

Get the number of frequency bins in the dataset.

Returns:

Number of frequency bins (second dimension of the dataset).

Return type:

int

property frequency: numpy.ndarray

Get the frequency array for the dataset.

Returns a linearly-spaced frequency array from frequency_min to frequency_max with n_frequencies points.

Returns:

Array of float64 frequencies in Hz with shape (n_frequencies,).

Return type:

np.ndarray

Examples

>>> freq_array = feature.frequency
>>> print(freq_array.shape)
(256,)
>>> print(f"Frequency range: {freq_array[0]:.2f} - {freq_array[-1]:.2f} Hz")
Frequency range: 0.01 - 100.00 Hz
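
The linear spacing described above can be sketched without numpy. `linear_frequencies` is a hypothetical helper; including both endpoints (as numpy.linspace does by default) is an assumption of this sketch.

```python
def linear_frequencies(frequency_min, frequency_max, n_frequencies):
    """Evenly spaced frequencies from frequency_min to frequency_max inclusive."""
    if n_frequencies == 1:
        return [float(frequency_min)]
    step = (frequency_max - frequency_min) / (n_frequencies - 1)
    return [frequency_min + i * step for i in range(n_frequencies)]


freqs = linear_frequencies(0.0, 100.0, 5)
print(freqs)  # [0.0, 25.0, 50.0, 75.0, 100.0]
```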
replace_dataset(new_data_array: numpy.ndarray) → None

Replace the entire HDF5 dataset with new data.

This method resizes the HDF5 dataset as needed and replaces all data. The input array must have the same dtype as the existing dataset.

Parameters:

new_data_array (np.ndarray) – New data array to replace the existing dataset. Will be converted to numpy array if necessary.

Raises:

TypeError – If input cannot be converted to a numpy array or has incompatible shape.

Examples

>>> import numpy as np
>>> new_data = np.random.randn(100, 256)
>>> feature.replace_dataset(new_data)
to_xarray() → xarray.DataArray

Convert the feature dataset to an xarray DataArray.

Returns an xarray DataArray with proper time and frequency coordinates, metadata attributes, and component naming. The entire dataset is loaded into memory.

Returns:

DataArray with dimensions [‘time’, ‘frequency’] and coordinates matching the dataset’s time and frequency arrays.

Return type:

xr.DataArray

Notes

Metadata stored in xarray attributes will not be validated if modified. The full dataset is loaded into memory; use with caution for large datasets.

Examples

>>> xr_data = feature.to_xarray()
>>> print(xr_data.dims)
('time', 'frequency')
>>> print(xr_data.name)
Ex
>>> subset = xr_data.sel(time=slice('2023-01-01', '2023-01-02'))
to_numpy() → numpy.ndarray

Convert the feature dataset to a numpy array.

Returns the dataset as a numpy array by loading it from the HDF5 file into memory. The array shape is (n_windows, n_frequencies).

Returns:

Numpy array containing all feature data with shape (n_windows, n_frequencies).

Return type:

np.ndarray

Examples

>>> data = feature.to_numpy()
>>> print(data.shape)
(100, 256)
>>> print(data.dtype)
complex128
>>> mean_amplitude = np.abs(data).mean()
from_numpy(new_estimate: numpy.ndarray) → None

Load data from a numpy array into the HDF5 dataset.

This method updates the HDF5 dataset with new data from a numpy array. The input array must match the dataset’s dtype. The HDF5 dataset will be resized if necessary to accommodate the new data.

Parameters:

new_estimate (np.ndarray) – Numpy array to write to the HDF5 dataset. Must have compatible dtype with the existing dataset.

Raises:

TypeError – If input array dtype does not match the HDF5 dataset dtype or if input cannot be converted to numpy array.

Notes

The variable ‘data’ is a builtin in numpy and cannot be used as a parameter name.

Examples

>>> import numpy as np
>>> new_data = np.random.randn(100, 256) + 1j * np.random.randn(100, 256)
>>> feature.from_numpy(new_data)
>>> loaded_data = feature.to_numpy()
>>> assert loaded_data.shape == new_data.shape
from_xarray(data: xarray.DataArray, sample_rate_decimation_level: int) → None

Load data and metadata from an xarray DataArray.

This method updates both the HDF5 dataset and metadata from an xarray DataArray. It extracts time coordinates, frequency range, and component information from the DataArray and its attributes.

Parameters:
  • data (xr.DataArray) – Input xarray DataArray with ‘time’ and ‘frequency’ coordinates. Expected dimensions are [‘time’, ‘frequency’].

  • sample_rate_decimation_level (int) – Decimation level applied to the original data to produce this feature dataset (integer ≥ 1).

Notes

Metadata stored in xarray attributes will be extracted and written to the HDF5 file. The full dataset is loaded into memory during this process.

Examples

>>> import xarray as xr
>>> import numpy as np

Create sample xarray data:

>>> times = np.arange('2023-01-01', '2023-01-02', dtype='datetime64[s]')
>>> freqs = np.linspace(0.01, 100, 256)
>>> data_array = np.random.randn(len(times), len(freqs)) + \
...              1j * np.random.randn(len(times), len(freqs))
>>> xr_data = xr.DataArray(
...     data_array,
...     dims=['time', 'frequency'],
...     coords={'time': times, 'frequency': freqs},
...     name='Ex',
...     attrs={'units': 'mV/km'}
... )

Load into feature dataset:

>>> feature.from_xarray(xr_data, sample_rate_decimation_level=2)
>>> print(feature.metadata.component)
Ex
class mth5.groups.MasterFeaturesGroup(group: h5py.Group, **kwargs)

Bases: mth5.groups.BaseGroup

Master group container for features associated with Fourier Coefficients or time series.

This class manages the top-level organization of geophysical feature data, organizing it into feature-specific groups. Features can include various frequency or time-domain analyses.

Hierarchy

MasterFeaturesGroup -> FeatureGroup -> FeatureRunGroup ->

  • FC: FeatureDecimationGroup -> FeatureChannelDataset

  • Time Series: FeatureChannelDataset

Parameters:
  • group (h5py.Group) – HDF5 group object for this MasterFeaturesGroup.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Examples

>>> import h5py
>>> from mth5.groups.features import MasterFeaturesGroup
>>> with h5py.File('data.h5', 'r') as f:
...     master = MasterFeaturesGroup(f['features'])
...     feature_list = master.groups_list
add_feature_group(feature_name: str, feature_metadata: mt_metadata.features.FeatureDecimationChannel | None = None) → FeatureGroup

Add a feature group to the master features container.

Creates a new FeatureGroup with the specified name and optional metadata. Feature groups organize all runs and decimation levels for a particular feature.

Parameters:
  • feature_name (str) – Name for the feature group. Will be validated and formatted.

  • feature_metadata (FeatureDecimationChannel, optional) – Metadata describing the feature. Default is None.

Returns:

Newly created feature group object.

Return type:

FeatureGroup

Examples

>>> master = MasterFeaturesGroup(h5_group)
>>> feature = master.add_feature_group('coherency')
>>> print(feature.name)
coherency
get_feature_group(feature_name: str) → FeatureGroup

Retrieve a feature group by name.

Parameters:

feature_name (str) – Name of the feature group to retrieve.

Returns:

The requested feature group.

Return type:

FeatureGroup

Raises:

MTH5Error – If the feature group does not exist.

Examples

>>> master = MasterFeaturesGroup(h5_group)
>>> feature = master.get_feature_group('coherency')
>>> print(feature.name)
coherency
remove_feature_group(feature_name: str) → None

Remove a feature group from the master container.

Deletes the specified feature group and its associated data from the HDF5 file. Note that this operation removes the reference but does not reduce the file size; copy desired data to a new file for size reduction.

Parameters:

feature_name (str) – Name of the feature group to remove.

Raises:

MTH5Error – If the feature group does not exist.

Examples

>>> master = MasterFeaturesGroup(h5_group)
>>> master.remove_feature_group('coherency')
class mth5.groups.FeatureGroup(group: h5py.Group, feature_metadata: object | None = None, **kwargs)

Bases: mth5.groups.BaseGroup

Container for a single feature set with all associated runs and decimation levels.

This class manages feature-specific data including all processing runs and decimation levels. Features can include both Fourier Coefficient and time series data.

Hierarchy

FeatureGroup -> FeatureRunGroup ->

  • FC: FeatureDecimationGroup -> FeatureChannelDataset

  • TS: FeatureChannelDataset

Parameters:
  • group (h5py.Group) – HDF5 group object for this FeatureGroup.

  • feature_metadata (optional) – Metadata specific to this feature. Should include description and parameters.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Notes

Feature metadata should be specific to the feature and include descriptions of the feature and any parameters used in its computation.

Examples

>>> feature = FeatureGroup(h5_group, feature_metadata=metadata)
>>> run_group = feature.add_feature_run_group('run_1', domain='fc')
add_feature_run_group(feature_name: str, feature_run_metadata: object | None = None, domain: str = 'fc') → object

Add a feature run group for a single feature.

Creates either a Fourier Coefficient run group or a time series run group based on the specified domain. The domain can be determined from the metadata or explicitly provided.

Parameters:
  • feature_name (str) – Name for the feature run group.

  • feature_run_metadata (optional) – Metadata for the feature run. If provided, domain is extracted from metadata.domain attribute. Default is None.

  • domain (str, default='fc') –

    Domain type for the data. Must be one of:

    • ‘fc’, ‘frequency’, ‘fourier’, ‘fourier_domain’: Fourier Coefficients

    • ‘ts’, ‘time’, ‘time series’, ‘time_series’: Time series

Returns:

Newly created feature run group.

Return type:

FeatureFCRunGroup or FeatureTSRunGroup

Raises:
  • ValueError – If domain is not recognized.

  • AttributeError – If metadata does not have a domain attribute when metadata is provided.

Examples

>>> feature = FeatureGroup(h5_group)
>>> fc_run = feature.add_feature_run_group('processing_run_1', domain='fc')
>>> ts_run = feature.add_feature_run_group('ts_analysis', domain='ts')
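
The accepted domain strings listed under Parameters can be reduced to two canonical values. The sketch below is illustrative and is not the mth5 implementation: `normalize_domain`, `FC_ALIASES`, and `TS_ALIASES` are hypothetical names, and matching is exact because the text does not say whether the library ignores case.

```python
FC_ALIASES = {"fc", "frequency", "fourier", "fourier_domain"}
TS_ALIASES = {"ts", "time", "time series", "time_series"}


def normalize_domain(domain: str) -> str:
    """Map an accepted domain alias to 'fc' or 'ts'; raise ValueError otherwise."""
    if domain in FC_ALIASES:
        return "fc"
    if domain in TS_ALIASES:
        return "ts"
    raise ValueError(f"domain {domain!r} is not recognized")


print(normalize_domain("fourier"))      # fc
print(normalize_domain("time series"))  # ts
```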
get_feature_run_group(feature_name: str, domain: str = 'frequency') → object

Retrieve a feature run group by name and domain type.

Parameters:
  • feature_name (str) – Name of the feature run group to retrieve.

  • domain (str, default='frequency') –

    Domain type. Must be one of:

    • ‘fc’, ‘frequency’, ‘fourier’, ‘fourier_domain’: Fourier Coefficients

    • ‘ts’, ‘time’, ‘time series’, ‘time_series’: Time series

Returns:

The requested feature run group.

Return type:

FeatureFCRunGroup or FeatureTSRunGroup

Raises:
  • ValueError – If domain is not recognized.

  • MTH5Error – If the feature run group does not exist.

Examples

>>> feature = FeatureGroup(h5_group)
>>> fc_run = feature.get_feature_run_group('processing_run_1', domain='fc')
remove_feature_run_group(feature_name: str) → None

Remove a feature run group.

Deletes the specified feature run group and all its associated data. Note that deletion removes the reference but does not reduce HDF5 file size.

Parameters:

feature_name (str) – Name of the feature run group to remove.

Raises:

MTH5Error – If the feature run group does not exist.

Examples

>>> feature = FeatureGroup(h5_group)
>>> feature.remove_feature_run_group('processing_run_1')
class mth5.groups.FeatureTSRunGroup(group: h5py.Group, feature_run_metadata: object | None = None, **kwargs)

Bases: mth5.groups.BaseGroup

Container for time series features from a processing or analysis run.

This class wraps a RunGroup to manage time series data features while maintaining compatibility with the feature hierarchy structure.

Parameters:
  • group (h5py.Group) – HDF5 group object for this FeatureTSRunGroup.

  • feature_run_metadata (optional) – Metadata for the feature run (same type as timeseries.Run).

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Notes

This class uses methods from RunGroup for channel management, which may have performance implications due to multiple RunGroup instantiations.

Examples

>>> ts_run = FeatureTSRunGroup(h5_group, feature_run_metadata=metadata)
>>> channel = ts_run.add_feature_channel('Ex', 'electric', data)
add_feature_channel(channel_name: str, channel_type: str, data: numpy.ndarray | None = None, channel_dtype: str = 'int32', shape: tuple | None = None, max_shape: tuple = (None,), chunks: bool = True, channel_metadata: object | None = None, **kwargs) → object

Add a time series channel to the feature run group.

Creates a new channel for time series data with the specified properties and optional metadata. Channel metadata should be a timeseries.Channel object.

Parameters:
  • channel_name (str) – Name for the channel.

  • channel_type (str) – Type of channel (e.g., ‘electric’, ‘magnetic’).

  • data (np.ndarray, optional) – Initial data for the channel. Default is None.

  • channel_dtype (str, default='int32') – Data type for the channel.

  • shape (tuple, optional) – Shape of the channel data. Default is None.

  • max_shape (tuple, default=(None,)) – Maximum shape for expandable dimensions.

  • chunks (bool, default=True) – Whether to use chunking for the dataset.

  • channel_metadata (optional) – Metadata object (timeseries.Channel type). Default is None.

  • **kwargs – Additional keyword arguments for dataset creation.

Returns:

Channel object from RunGroup.

Return type:

object

Examples

>>> ts_run = FeatureTSRunGroup(h5_group)
>>> channel = ts_run.add_feature_channel(
...     'Ex', 'electric', data=np.arange(1000))
get_feature_channel(channel_name: str) → object

Retrieve a feature channel by name.

Parameters:

channel_name (str) – Name of the channel to retrieve.

Returns:

Channel object from RunGroup.

Return type:

object

Raises:

MTH5Error – If the channel does not exist.

Examples

>>> ts_run = FeatureTSRunGroup(h5_group)
>>> channel = ts_run.get_feature_channel('Ex')
remove_feature_channel(channel_name: str) → None

Remove a feature channel from the run group.

Parameters:

channel_name (str) – Name of the channel to remove.

Raises:

MTH5Error – If the channel does not exist.

Examples

>>> ts_run = FeatureTSRunGroup(h5_group)
>>> ts_run.remove_feature_channel('Ex')
class mth5.groups.FeatureFCRunGroup(group: h5py.Group, feature_run_metadata: mt_metadata.processing.fourier_coefficients.decimation.Decimation | None = None, **kwargs)[source]

Bases: mth5.groups.BaseGroup

Container for Fourier Coefficient features from a processing run.

This class manages Fourier Coefficient data organized by decimation levels, each containing multiple frequency channels with time-frequency data.

Hierarchy

FeatureFCRunGroup -> FeatureDecimationGroup -> FeatureChannelDataset

metadata[source]

Metadata including:

  • list of decimation levels

  • start time (earliest)

  • end time (latest)

  • method (fft, wavelet, …)

  • list of channels used

  • starting sample rate

  • bands used

  • type (TS or FC)

Type:

Decimation

Parameters:
  • group (h5py.Group) – HDF5 group object for this FeatureFCRunGroup.

  • feature_run_metadata (optional) – Decimation metadata for the feature run. Default is None.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Examples

>>> fc_run = FeatureFCRunGroup(h5_group, feature_run_metadata=metadata)
>>> decimation = fc_run.add_decimation_level('level_0', dec_metadata)
metadata() mt_metadata.processing.fourier_coefficients.decimation.Decimation[source]

Overrides the base metadata getter to include channel information.

property decimation_level_summary: pandas.DataFrame

Get a summary of all decimation levels in the run.

Returns a pandas DataFrame with information about each decimation level including decimation factor, time range, and HDF5 reference.

Returns:

DataFrame with columns:

  • name (str) – Decimation level name

  • start (datetime64[ns]) – Start time of the decimation level

  • end (datetime64[ns]) – End time of the decimation level

  • hdf5_reference (h5py.ref_dtype) – HDF5 reference to the decimation level group

Return type:

pd.DataFrame

Examples

>>> fc_run = FeatureFCRunGroup(h5_group)
>>> summary = fc_run.decimation_level_summary
>>> print(summary[['name', 'start', 'end']])
add_decimation_level(decimation_level_name: str, feature_decimation_level_metadata: object | None = None) FeatureDecimationGroup[source]

Add a decimation level group to the feature run.

Parameters:
  • decimation_level_name (str) – Name for the decimation level.

  • feature_decimation_level_metadata (optional) – Metadata for the decimation level. Default is None.

Returns:

Newly created decimation level group.

Return type:

FeatureDecimationGroup

Examples

>>> fc_run = FeatureFCRunGroup(h5_group)
>>> decimation = fc_run.add_decimation_level('level_0', dec_metadata)
>>> print(decimation.name)
'level_0'
get_decimation_level(decimation_level_name: str) FeatureDecimationGroup[source]

Retrieve a decimation level group by name.

Parameters:

decimation_level_name (str) – Name of the decimation level to retrieve.

Returns:

The requested decimation level group.

Return type:

FeatureDecimationGroup

Raises:

MTH5Error – If the decimation level does not exist.

Examples

>>> fc_run = FeatureFCRunGroup(h5_group)
>>> decimation = fc_run.get_decimation_level('level_0')
remove_decimation_level(decimation_level_name: str) None[source]

Remove a decimation level from the feature run.

Parameters:

decimation_level_name (str) – Name of the decimation level to remove.

Raises:

MTH5Error – If the decimation level does not exist.

Examples

>>> fc_run = FeatureFCRunGroup(h5_group)
>>> fc_run.remove_decimation_level('level_0')
update_metadata() None[source]

Update metadata from all decimation levels.

Scans all decimation levels and updates the run-level metadata with aggregated information including time ranges.
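As a minimal sketch of the time-range aggregation described above, assuming the run start is the earliest level start and the run end is the latest level end. The dictionary of level ranges and the helper name are hypothetical stand-ins for the decimation level metadata, not part of the mth5 API.

```python
from datetime import datetime

# Hypothetical (start, end) pairs standing in for decimation level metadata.
levels = {
    "level_0": (datetime(2023, 1, 1, 0, 0), datetime(2023, 1, 1, 12, 0)),
    "level_1": (datetime(2023, 1, 1, 0, 5), datetime(2023, 1, 1, 11, 30)),
    "level_2": (datetime(2022, 12, 31, 23, 50), datetime(2023, 1, 1, 11, 0)),
}


def aggregate_time_range(level_ranges):
    """Return (earliest start, latest end) across all levels."""
    starts = [start for start, _ in level_ranges.values()]
    ends = [end for _, end in level_ranges.values()]
    return min(starts), max(ends)


run_start, run_end = aggregate_time_range(levels)
```

Here the run start comes from level_2 and the run end from level_0, which shows that the two bounds need not come from the same level.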

Examples

>>> fc_run = FeatureFCRunGroup(h5_group)
>>> fc_run.update_metadata()
class mth5.groups.FeatureDecimationGroup(group: h5py.Group, decimation_level_metadata: object | None = None, **kwargs)[source]

Bases: mth5.groups.BaseGroup

Container for a single decimation level with multiple Fourier Coefficient channels.

This class manages Fourier Coefficient data organized by frequency, time, and channel. Data is assumed to be uniformly sampled in both frequency and time domains.

Hierarchy

FeatureDecimationGroup -> FeatureChannelDataset (multiple channels)

Data Assumptions

  1. Data are uniformly sampled in frequency domain

  2. Data are uniformly sampled in time domain

  3. FFT moving window has uniform step size
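The three assumptions can be illustrated by building the two axes directly. This is a sketch only: the sample rate, window length, window step, and start time are assumed values, and the use of a real-input FFT frequency axis is an assumption rather than something this class prescribes.

```python
import numpy as np

sample_rate = 8.0    # Hz after decimation (assumed value)
window_length = 64   # samples per FFT window (assumed value)
window_step = 16     # samples between window starts (assumed value)
n_windows = 5

# Uniform frequency axis for a real-input FFT of window_length samples.
frequencies = np.fft.rfftfreq(window_length, d=1.0 / sample_rate)

# Uniform time axis: one window start every window_step / sample_rate seconds.
step_seconds = window_step / sample_rate
step = np.timedelta64(int(step_seconds * 1e9), "ns")
start = np.datetime64("2023-01-01T00:00:00")
window_starts = start + np.arange(n_windows) * step

# Rate at which windows advance, in Hz.
sample_rate_window_step = 1.0 / step_seconds
```

With these values the frequency spacing is 0.125 Hz and a new window starts every 2 seconds, so both axes are fully described by a start value, a step, and a count.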

start time

Start time of the decimation level

Type:

datetime

end time

End time of the decimation level

Type:

datetime

channels

List of channel names in this decimation level

Type:

list

decimation_factor

Factor by which data was decimated

Type:

int

decimation_level

Level index in decimation hierarchy

Type:

int

decimation_sample_rate

Sample rate after decimation (Hz)

Type:

float

method

Method used (FFT, wavelet, etc.)

Type:

str

anti_alias_filter

Anti-aliasing filter used

Type:

optional

prewhitening_type

Type of prewhitening applied

Type:

optional

harmonics_kept

Harmonic indices kept in the data

Type:

list or ‘all’

window

Window parameters (length, overlap, type, sample rate)

Type:

dict

bands

Frequency bands in the data

Type:

list

Parameters:
  • group (h5py.Group) – HDF5 group object for this FeatureDecimationGroup.

  • decimation_level_metadata (optional) – Metadata for the decimation level. Default is None.

  • **kwargs – Additional keyword arguments passed to BaseGroup.

Examples

>>> decimation = FeatureDecimationGroup(h5_group, metadata)
>>> channel = decimation.add_channel('Ex', fc_data=fc_array, fc_metadata=ch_metadata)
metadata()[source]

Overrides the base metadata getter to include channel information.

property channel_summary: pandas.DataFrame

Get a summary of all channels in this decimation level.

Returns a pandas DataFrame with detailed information about each Fourier Coefficient channel including time ranges, dimensions, and sampling rates.

Returns:

DataFrame with columns:

  • name (str) – Channel name

  • start (datetime64[ns]) – Start time of the channel data

  • end (datetime64[ns]) – End time of the channel data

  • n_frequency (int64) – Number of frequency bins

  • n_windows (int64) – Number of time windows

  • sample_rate_decimation_level (float64) – Decimation level sample rate (Hz)

  • sample_rate_window_step (float64) – Sample rate of window stepping (Hz)

  • units (str) – Physical units of the data

  • hdf5_reference (h5py.ref_dtype) – HDF5 reference to the channel dataset

Return type:

pd.DataFrame

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> summary = decimation.channel_summary
>>> print(summary[['name', 'n_frequency', 'n_windows']])
from_dataframe(df: pandas.DataFrame, channel_key: str, time_key: str = 'time', frequency_key: str = 'frequency') None[source]

Load Fourier Coefficient data from a pandas DataFrame.

Assumes the channel_key column contains complex coefficient values organized with time and frequency dimensions.

Parameters:
  • df (pd.DataFrame) – Input DataFrame containing the coefficient data.

  • channel_key (str) – Name of the column containing coefficient values.

  • time_key (str, default='time') – Name of the time coordinate column.

  • frequency_key (str, default='frequency') – Name of the frequency coordinate column.

Raises:

TypeError – If df is not a pandas DataFrame.
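One plausible reading of the expected input is a long-format table with one row per (time, frequency) pair. The sketch below pivots such a table into a 2D (time, frequency) array with pandas. The helper name and the sample values are hypothetical, and the actual method may organize the data differently.

```python
import pandas as pd


def dataframe_to_grid(df, channel_key, time_key="time", frequency_key="frequency"):
    """Pivot a long-format table into times, frequencies, and a 2D array."""
    if not isinstance(df, pd.DataFrame):
        raise TypeError("df must be a pandas DataFrame")
    grid = df.pivot(index=time_key, columns=frequency_key, values=channel_key)
    return grid.index.to_numpy(), grid.columns.to_numpy(), grid.to_numpy()


# Two windows by three frequencies, with complex placeholder coefficients.
times = pd.to_datetime(["2023-01-01T00:00:00", "2023-01-01T00:00:10"])
freqs = [0.1, 0.2, 0.3]
rows = [
    (t, f, complex(i, j))
    for i, t in enumerate(times)
    for j, f in enumerate(freqs)
]
df = pd.DataFrame(rows, columns=["time", "frequency", "Ex"])

time_axis, frequency_axis, values = dataframe_to_grid(df, channel_key="Ex")
```

The resulting array has one row per time and one column per frequency, matching the (time, frequency) layout used elsewhere in this class.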

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.from_dataframe(df, channel_key='Ex', time_key='time')
from_xarray(data_array: xarray.DataArray | xarray.Dataset, sample_rate_decimation_level: float) None[source]

Load Fourier Coefficient data from an xarray DataArray or Dataset.

Automatically extracts metadata (time, frequency, units) from the xarray object and creates appropriate FeatureChannelDataset instances for each variable or the single DataArray.

Parameters:
  • data_array (xr.DataArray or xr.Dataset) – Input xarray object with ‘time’ and ‘frequency’ coordinates and dimensions [‘time’, ‘frequency’] (or transposed variant).

  • sample_rate_decimation_level (float) – Sample rate of the decimation level (Hz).

Raises:

TypeError – If data_array is not an xarray Dataset or DataArray.

Notes

Automatically handles both (time, frequency) and (frequency, time) dimension ordering. Units are extracted from xarray attributes if available.
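The dimension handling can be sketched with xarray's name-based transpose. The helper below is hypothetical and is not the code used by from_xarray; the "unknown" fallback for missing units is also an assumption.

```python
import numpy as np
import xarray as xr


def as_time_frequency(data_array):
    """Return the array ordered as (time, frequency) plus its units."""
    if not isinstance(data_array, xr.DataArray):
        raise TypeError("expected an xarray.DataArray")
    # transpose() reorders dimensions by name, so either input order works.
    ordered = data_array.transpose("time", "frequency")
    units = data_array.attrs.get("units", "unknown")  # assumed fallback
    return ordered, units


times = np.arange(
    "2023-01-01T00:00:00", "2023-01-01T00:00:04", dtype="datetime64[s]"
)
freqs = np.linspace(0.5, 2.0, 3)

# Input deliberately stored as (frequency, time).
flipped = xr.DataArray(
    np.zeros((len(freqs), len(times)), dtype=complex),
    dims=["frequency", "time"],
    coords={"frequency": freqs, "time": times},
    name="Ex",
    attrs={"units": "mV/km"},
)

ordered, units = as_time_frequency(flipped)
```

Because the reordering is by dimension name rather than position, the same helper leaves an array that is already (time, frequency) unchanged.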

Examples

>>> import xarray as xr
>>> import numpy as np
>>> decimation = FeatureDecimationGroup(h5_group)

Create sample xarray data:

>>> times = np.arange('2023-01-01', '2023-01-02', dtype='datetime64[s]')
>>> freqs = np.linspace(0.01, 100, 256)
>>> data_array = np.random.randn(len(times), len(freqs)) + \
...              1j * np.random.randn(len(times), len(freqs))
>>> xr_data = xr.DataArray(
...     data_array,
...     dims=['time', 'frequency'],
...     coords={'time': times, 'frequency': freqs},
...     name='Ex',
...     attrs={'units': 'mV/km'}
... )

Load into decimation group:

>>> decimation.from_xarray(xr_data, sample_rate_decimation_level=0.5)
to_xarray(channels: list | None = None) xarray.Dataset[source]

Create an xarray Dataset from Fourier Coefficient channels.

If no channels are specified, all channels in the decimation level are included. Each channel becomes a data variable in the resulting Dataset.

Parameters:

channels (list, optional) – List of channel names to include. If None, all channels are used. Default is None.

Returns:

xarray Dataset with channels as data variables and ‘time’ and ‘frequency’ as shared coordinates.

Return type:

xr.Dataset

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> xr_data = decimation.to_xarray()
>>> print(xr_data.data_vars)
Data variables:
    Ex  (time, frequency) complex128
    Ey  (time, frequency) complex128

Get specific channels:

>>> subset = decimation.to_xarray(channels=['Ex', 'Ey'])
from_numpy_array(nd_array: numpy.ndarray, ch_name: str | list) None[source]

Load Fourier Coefficient data from a numpy array.

Assumes array shape is either (n_frequencies, n_windows) for a single channel or (n_channels, n_frequencies, n_windows) for multiple channels.

Parameters:
  • nd_array (np.ndarray) – Input numpy array containing coefficient data.

  • ch_name (str or list) – Channel name (for 2D array) or list of channel names (for 3D array).

Raises:
  • TypeError – If nd_array is not a numpy ndarray.

  • ValueError – If array shape is not (n_frequencies, n_windows) or (n_channels, n_frequencies, n_windows).
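The shape rules above can be sketched as a small validation helper. The function name is hypothetical, and the check that the number of names matches the first axis is an assumption about sensible behavior, not a documented guarantee.

```python
import numpy as np


def split_channels(nd_array, ch_name):
    """Map channel names to 2D (n_frequencies, n_windows) arrays."""
    if not isinstance(nd_array, np.ndarray):
        raise TypeError("nd_array must be a numpy.ndarray")

    if nd_array.ndim == 2:
        # Single channel: one name, one 2D array.
        return {str(ch_name): nd_array}

    if nd_array.ndim == 3:
        # Multiple channels: the first axis indexes the channel.
        names = [ch_name] if isinstance(ch_name, str) else list(ch_name)
        if len(names) != nd_array.shape[0]:
            raise ValueError("need one channel name per entry on the first axis")
        return {name: nd_array[index] for index, name in enumerate(names)}

    raise ValueError(
        "expected shape (n_frequencies, n_windows) or "
        "(n_channels, n_frequencies, n_windows)"
    )


single = split_channels(np.zeros((256, 100), dtype=complex), "Ex")
multi = split_channels(np.zeros((2, 256, 100), dtype=complex), ["Ex", "Ey"])
```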

Examples

>>> decimation = FeatureDecimationGroup(h5_group)

Load single channel:

>>> data_2d = np.random.randn(256, 100) + 1j * np.random.randn(256, 100)
>>> decimation.from_numpy_array(data_2d, ch_name='Ex')

Load multiple channels:

>>> data_3d = np.random.randn(2, 256, 100) + 1j * np.random.randn(2, 256, 100)
>>> decimation.from_numpy_array(data_3d, ch_name=['Ex', 'Ey'])
add_channel(fc_name: str, fc_data: numpy.ndarray | xarray.DataArray | xarray.Dataset | pandas.DataFrame | None = None, fc_metadata: mt_metadata.features.FeatureDecimationChannel | None = None, max_shape: tuple = (None, None), chunks: bool = True, dtype: type = complex, **kwargs) mth5.groups.FeatureChannelDataset[source]

Add a Fourier Coefficient channel to the decimation level.

Creates a new FeatureChannelDataset for a single channel at a single decimation level. Input data can be provided as numpy array, xarray, DataFrame, or created empty.

Parameters:
  • fc_name (str) – Name for the Fourier Coefficient channel.

  • fc_data (np.ndarray, xr.DataArray, xr.Dataset, pd.DataFrame, optional) – Input data. Can be numpy array (time, frequency) or xarray/DataFrame format. Default is None (creates empty dataset).

  • fc_metadata (FeatureDecimationChannel, optional) – Metadata for the channel. Default is None.

  • max_shape (tuple, default=(None, None)) – Maximum shape for HDF5 dataset dimensions (expandable if None).

  • chunks (bool, default=True) – Whether to use HDF5 chunking.

  • dtype (type, default=complex) – Data type for the dataset (e.g., complex, float, int).

  • **kwargs – Additional keyword arguments for HDF5 dataset creation.

Returns:

Newly created FeatureChannelDataset object.

Return type:

FeatureChannelDataset

Raises:
  • TypeError – If fc_data type is not supported or metadata type mismatch.

  • RuntimeError or OSError – Raised internally if the channel already exists; the error is handled and the existing channel is returned.

Notes

Data layout assumes (time, frequency) organization:

  • time index: window start times

  • frequency index: harmonic indices or float values

  • data: complex Fourier coefficients

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> metadata = FeatureDecimationChannel(name='Ex')

Create from numpy array:

>>> fc_data = np.random.randn(100, 256) + 1j * np.random.randn(100, 256)
>>> channel = decimation.add_channel('Ex', fc_data=fc_data, fc_metadata=metadata)

Create empty channel (expandable):

>>> channel = decimation.add_channel('Ex', fc_metadata=metadata)
get_channel(fc_name: str) mth5.groups.FeatureChannelDataset[source]

Retrieve a Fourier Coefficient channel by name.

Parameters:

fc_name (str) – Name of the channel to retrieve.

Returns:

The requested FeatureChannelDataset object.

Return type:

FeatureChannelDataset

Raises:

MTH5Error – If the channel does not exist.

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> channel = decimation.get_channel('Ex')
>>> data = channel.to_numpy()
remove_channel(fc_name: str) None[source]

Remove a Fourier Coefficient channel from the decimation level.

Deletes the channel from the HDF5 file. Note that this removes the reference but does not reduce file size.

Parameters:

fc_name (str) – Name of the channel to remove.

Raises:

MTH5Error – If the channel does not exist.

Notes

To reduce HDF5 file size, copy desired data to a new file.
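A minimal sketch of that copy step, using plain h5py rather than the mth5 API. The file names and dataset names are illustrative assumptions, and only top-level objects are copied; attributes on the root group would need to be copied separately.

```python
import os
import tempfile

import h5py
import numpy as np

tmp_dir = tempfile.mkdtemp()
old_path = os.path.join(tmp_dir, "old.h5")
new_path = os.path.join(tmp_dir, "new.h5")

# Build a file with two datasets.
with h5py.File(old_path, "w") as h5:
    h5.create_dataset("Ex", data=np.zeros(100_000))
    h5.create_dataset("Ey", data=np.ones(100_000))

# Deleting unlinks the dataset; the space it used is not necessarily reclaimed.
with h5py.File(old_path, "a") as h5:
    del h5["Ex"]

# Copy what remains into a fresh file.
with h5py.File(old_path, "r") as src, h5py.File(new_path, "w") as dst:
    for name in src:
        src.copy(name, dst)

with h5py.File(new_path, "r") as h5:
    remaining = sorted(h5.keys())
    ey_sum = float(h5["Ey"][...].sum())

# Compare these two values to see how much space the copy saved.
old_size = os.path.getsize(old_path)
new_size = os.path.getsize(new_path)
```

The new file holds only the data still linked in the old one, so comparing old_size with new_size shows whether the copy was worth doing.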

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.remove_channel('Ex')
update_metadata() None[source]

Update metadata from all channels in the decimation level.

Scans all channels and updates the decimation-level metadata with aggregated information including time ranges and sampling rates.

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.update_metadata()
add_weights(weight_name: str, weight_data: numpy.ndarray | None = None, weight_metadata: object | None = None, max_shape: tuple = (None, None, None), chunks: bool = True, **kwargs) None[source]

Add weight or masking data for Fourier Coefficients.

Creates a dataset to store weights or masks for quality control, frequency band selection, or time window filtering.

Parameters:
  • weight_name (str) – Name for the weight dataset.

  • weight_data (np.ndarray, optional) – Weight values. Default is None.

  • weight_metadata (optional) – Metadata for the weight dataset. Default is None.

  • max_shape (tuple, default=(None, None, None)) – Maximum shape for expandable dimensions.

  • chunks (bool, default=True) – Whether to use HDF5 chunking.

  • **kwargs – Additional keyword arguments for HDF5 dataset creation.

Notes

Weight datasets can track:

  • weight_channel: Per-channel weights

  • weight_band: Per-frequency-band weights

  • weight_time: Per-time-window weights

This method is a placeholder for future implementation.

Examples

>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.add_weights('coherency_weights', weight_data=weights)
class mth5.groups.MasterSurveyGroup(group: h5py.Group, **kwargs: Any)[source]

Bases: mth5.groups.BaseGroup

Collection helper for surveys under Experiment/Surveys.

Provides helpers to add, fetch, or remove surveys and to summarize all channels in the experiment.

Examples

>>> from mth5 import mth5
>>> m5 = mth5.MTH5()
>>> _ = m5.open_mth5("/tmp/example.mth5", mode="a")
>>> surveys = m5.surveys_group
>>> _ = surveys.add_survey("survey_01")
>>> surveys.channel_summary.head()
property channel_summary: pandas.DataFrame

Return a DataFrame summarizing all channels across surveys.

Returns:

Columns include survey, station, run, location, component, start/end, sample info, orientation, units, and HDF5 reference.

Return type:

pandas.DataFrame

Examples

>>> summary = surveys.channel_summary
>>> set(summary.columns) >= {"survey", "station", "run", "component"}
True
add_survey(survey_name: str, survey_metadata: mt_metadata.timeseries.Survey | None = None) SurveyGroup[source]

Add or fetch a survey at /Experiment/Surveys/<name>.

Parameters:
  • survey_name (str) – Survey identifier; validated with validate_name.

  • survey_metadata (Survey, optional) – Metadata container used to seed the survey attributes.

Returns:

Wrapper for the created or existing survey.

Return type:

SurveyGroup

Raises:
  • ValueError – If survey_name is empty.

  • MTH5Error – If the provided metadata id conflicts with the group name.

Examples

>>> survey = surveys.add_survey("survey_01")
>>> survey.metadata.id
'survey_01'
get_survey(survey_name: str) SurveyGroup[source]

Return an existing survey by name.

Parameters:

survey_name (str) – Existing survey name.

Returns:

Wrapper for the requested survey.

Return type:

SurveyGroup

Raises:

MTH5Error – If the survey does not exist.

Examples

>>> existing = surveys.get_survey("survey_01")
>>> existing.metadata.id
'survey_01'
remove_survey(survey_name: str) None[source]

Delete a survey reference from the file.

Parameters:

survey_name (str) – Existing survey name.

Notes

HDF5 deletion removes the reference only; storage is not reclaimed.

Examples

>>> surveys.remove_survey("survey_01")
class mth5.groups.SurveyGroup(group: h5py.Group, survey_metadata: mt_metadata.timeseries.Survey | None = None, **kwargs: Any)[source]

Bases: mth5.groups.BaseGroup

Wrapper for a single survey at Experiment/Surveys/<id>.

Handles survey-level metadata, child groups (stations, reports, filters, standards), and synchronization utilities.

Examples

>>> survey = surveys.add_survey("survey_01")
>>> survey.metadata.id
'survey_01'
initialize_group(**kwargs: Any) None[source]

Create default subgroups and write survey metadata.

Parameters:

**kwargs – Additional attributes to set on the instance before initialization.

Examples

>>> survey.initialize_group()
metadata() mt_metadata.timeseries.Survey[source]

Survey metadata enriched with station and filter information.

write_metadata() None[source]

Write HDF5 attributes from the survey metadata object.

property stations_group: mth5.groups.MasterStationGroup
property filters_group: mth5.groups.FiltersGroup

Convenience accessor for /Survey/Filters group.

property reports_group: mth5.groups.ReportsGroup

Convenience accessor for /Survey/Reports group.

property standards_group: mth5.groups.StandardsGroup

Convenience accessor for /Survey/Standards group.

update_survey_metadata(survey_dict: dict[str, Any] | None = None) None[source]

Deprecated alias for update_metadata().

Raises:

DeprecationWarning – Always raised to direct callers to update_metadata.

Examples

>>> survey.update_survey_metadata()
Traceback (most recent call last):
...
DeprecationWarning: 'update_survey_metadata' has been deprecated use 'update_metadata()'
update_metadata(survey_dict: dict[str, Any] | None = None) None[source]

Synchronize survey metadata from station summaries.

Parameters:

survey_dict (dict, optional) – Additional metadata values to merge before synchronization.

Notes

Updates survey start/end dates and bounding box from station summaries, then writes metadata to HDF5.
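The synchronization can be sketched as a reduction over per-station records. The record fields, the output keys, and the helper name below are hypothetical and do not reflect the actual station summary columns or Survey metadata attributes.

```python
from datetime import datetime

# Hypothetical station records standing in for the station summary.
stations = [
    {"id": "mt01", "start": "2020-01-01T00:00:00", "end": "2020-01-05T00:00:00",
     "latitude": 40.1, "longitude": -115.2},
    {"id": "mt02", "start": "2020-01-03T00:00:00", "end": "2020-01-09T12:00:00",
     "latitude": 40.6, "longitude": -114.8},
    {"id": "mt03", "start": "2020-01-02T06:00:00", "end": "2020-01-07T00:00:00",
     "latitude": 39.9, "longitude": -115.5},
]


def summarize_survey(records):
    """Return the overall date range and bounding box of the stations."""
    starts = [datetime.fromisoformat(r["start"]) for r in records]
    ends = [datetime.fromisoformat(r["end"]) for r in records]
    latitudes = [r["latitude"] for r in records]
    longitudes = [r["longitude"] for r in records]
    return {
        "start_date": min(starts).date().isoformat(),
        "end_date": max(ends).date().isoformat(),
        "min_latitude": min(latitudes),
        "max_latitude": max(latitudes),
        "min_longitude": min(longitudes),
        "max_longitude": max(longitudes),
    }


survey_summary = summarize_survey(stations)
```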

Examples

>>> _ = survey.update_metadata()
>>> survey.metadata.time_period.start_date
'2020-01-01'
class mth5.groups.ExperimentGroup(group, **kwargs)[source]

Bases: mth5.groups.BaseGroup

Utility class to hold general information about the experiment and accompanying metadata for an MT experiment.

To access the hdf5 group directly use ExperimentGroup.hdf5_group.

>>> experiment = ExperimentGroup(hdf5_group)
>>> experiment.hdf5_group.ref
<HDF5 Group Reference>

Note

Set all attributes through the metadata object so that every input is validated against the metadata standards. If you change attributes in the metadata object, run the ExperimentGroup.write_metadata() method afterwards. This is a temporary solution; an automatic updater that runs when metadata changes is in progress.

>>> experiment.metadata.existing_attribute = 'update_existing_attribute'
>>> experiment.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> experiment.metadata.add_base_attribute('new_attribute',
...                                'new_attribute_value',
...                                {'type':str,
...                                 'required':True,
...                                 'style':'free form',
...                                 'description': 'new attribute desc.',
...                                 'units':None,
...                                 'options':[],
...                                 'alias':[],
...                                 'example':'new attribute

Tip

If you want to add stations, reports, etc. to the experiment, do so from the MTH5 object. This avoids duplication, at least for now.

To look at what the structure of /Experiment looks like:

>>> experiment
/Experiment:
====================
    |- Group: Surveys
    -----------------
    |- Group: Reports
    -----------------
    |- Group: Standards
    -------------------
    |- Group: Stations
    ------------------
metadata()[source]

Overrides the base metadata getter to include station information.

property surveys_group