mth5.groups
Import all Group objects
Submodules
- mth5.groups.base
- mth5.groups.channel_dataset
- mth5.groups.estimate_dataset
- mth5.groups.experiment
- mth5.groups.fc_dataset
- mth5.groups.feature_dataset
- mth5.groups.features
- mth5.groups.filter_groups
- mth5.groups.filters
- mth5.groups.fourier_coefficients
- mth5.groups.reports
- mth5.groups.run
- mth5.groups.standards
- mth5.groups.station
- mth5.groups.survey
- mth5.groups.transfer_function
Classes
Base class for HDF5 group management with metadata handling.
Store report files (PDF/text) and images under /Survey/Reports.
Container for metadata standards documentation stored in the HDF5 file.
Container for managing all filter types in MTH5 format.
Container for statistical estimates of transfer functions.
Container for Fourier coefficients (FC) from windowed FFT analysis.
Master container for all Fourier Coefficient estimations of time series data.
Manage a set of Fourier Coefficients from a single processing run.
Container for a single decimation level of Fourier Coefficient data.
Container for transfer functions under a station.
Wrapper for a single transfer function estimation.
Specialized container for electric field channel data.
Specialized container for magnetic field channel data.
A container for channel time series data stored in HDF5 format.
Specialized container for auxiliary channel data.
Container for a single MT measurement run with multiple channels.
Container for multi-dimensional Fourier Coefficients organized by time and frequency.
Master group container for features associated with Fourier Coefficients or time series.
Container for a single feature set with all associated runs and decimation levels.
Container for time series features from a processing or analysis run.
Container for Fourier Coefficient features from a processing run.
Container for a single decimation level with multiple Fourier Coefficient channels.
Collection helper for surveys under
Wrapper for a single survey at
Utility class to hold general information about the experiment and
Package Contents
- class mth5.groups.BaseGroup(group: h5py.Group | h5py.Dataset, group_metadata: mt_metadata.base.MetadataBase | None = None, **kwargs: Any)
Base class for HDF5 group management with metadata handling.
Provides core functionality for reading, writing, and managing HDF5 groups with integrated metadata validation using mt_metadata standards.
- Parameters:
group (h5py.Group or h5py.Dataset) – HDF5 group or dataset object to wrap.
group_metadata (MetadataBase, optional) – Metadata container with validated attributes. Default is None.
**kwargs (dict) – Additional keyword arguments to set as instance attributes.
- hdf5_group
Weak reference to the underlying HDF5 group.
- Type:
h5py.Group or h5py.Dataset
- metadata
Metadata object with validation and standards compliance.
- Type:
MetadataBase
- logger
Logger instance for tracking operations.
- Type:
loguru.Logger
- compression
HDF5 compression method (e.g., ‘gzip’).
- Type:
str, optional
- compression_opts
Compression options/level.
- Type:
int, optional
- shuffle
Enable HDF5 shuffle filter. Default is False.
- Type:
bool
- fletcher32
Enable HDF5 Fletcher32 checksum. Default is False.
- Type:
bool
Notes
All HDF5 group references are weak references to prevent lingering file references after the group is closed.
Metadata changes should be written using write_metadata() method.
This is a base class inherited by more specific group types like SurveyGroup, StationGroup, RunGroup, etc.
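The weak-reference behaviour described in these notes can be illustrated with Python's standard weakref module. This is a minimal sketch, not MTH5's implementation: FakeHandle and Wrapper are hypothetical stand-ins for an open HDF5 object and a group wrapper.

```python
import gc
import weakref


class FakeHandle:
    """Hypothetical stand-in for an open HDF5 group handle."""

    def __init__(self, name):
        self.name = name


class Wrapper:
    """Hypothetical wrapper that keeps only a weak reference to its handle."""

    def __init__(self, handle):
        self._ref = weakref.ref(handle)

    @property
    def handle(self):
        # Returns the handle while it is alive, otherwise None.
        return self._ref()


handle = FakeHandle("/MyGroup")
wrapper = Wrapper(handle)
alive_name = wrapper.handle.name  # `handle` still holds a strong reference

del handle      # drop the only strong reference
gc.collect()    # make collection explicit on non-refcounting interpreters
released = wrapper.handle is None
```

Because the wrapper never owns a strong reference, dropping the original handle is enough for the object to be released, which is the property the notes rely on when a file is closed.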
Examples
Create and manage a group with metadata
>>> import h5py
>>> with h5py.File('data.h5', 'r+') as f:
...     group = f.create_group('MyGroup')
...     base_obj = BaseGroup(group)
...     print(base_obj)
...     # Set and write metadata
...     base_obj.metadata.id = 'MyGroup'
...     base_obj.write_metadata()
Access metadata and group structure
>>> print(base_obj.metadata.id)
MyGroup
>>> print(base_obj.groups_list)
['subgroup1', 'subgroup2']
>>> print(base_obj.hdf5_group.ref)  # Get HDF5 reference
<HDF5 Group Reference>
- compression = None
- compression_opts = None
- shuffle = False
- fletcher32 = False
- logger
- property metadata: mt_metadata.base.MetadataBase
Get metadata object with lazy loading from HDF5 attributes.
- Returns:
Metadata container with all attributes and validation.
- Return type:
MetadataBase
Notes
Metadata is loaded on first access and cached for subsequent accesses.
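The load-once-then-cache pattern described here can be sketched as follows. The class below is a hypothetical illustration (LazyMetadataHolder, read_count, and the dict standing in for HDF5 attributes are all assumptions), not the BaseGroup source.

```python
class LazyMetadataHolder:
    """Hypothetical holder that reads attributes once, on first access."""

    def __init__(self, attrs):
        self._attrs = attrs      # stands in for HDF5 group attributes
        self._metadata = None    # cache, filled on first access
        self.read_count = 0      # counts how often attributes are read

    def _read(self):
        self.read_count += 1
        return dict(self._attrs)

    @property
    def metadata(self):
        # Read only if nothing has been cached yet.
        if self._metadata is None:
            self._metadata = self._read()
        return self._metadata


holder = LazyMetadataHolder({"id": "MyGroup"})
first = holder.metadata
second = holder.metadata
```

Both accesses return the same cached object, and the underlying attributes are read only once.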
Examples
>>> meta = base_obj.metadata
>>> print(meta.id)
MyGroup
>>> print(meta.mth5_type)
Survey
- property groups_list: list[str]
Get list of all subgroup names in the HDF5 group.
- Returns:
Names of all subgroups and datasets.
- Return type:
list of str
Examples
>>> print(base_obj.groups_list)
['Station_001', 'Station_002', 'metadata']
- property dataset_options: dict[str, Any]
Get the HDF5 dataset creation options.
- Returns:
Dictionary containing compression, shuffle, and checksum settings.
- Return type:
dict
Examples
>>> options = base_obj.dataset_options
>>> print(options)
{'compression': 'gzip', 'compression_opts': 4, 'shuffle': True, 'fletcher32': False}
- read_metadata() → None
Read metadata from HDF5 group attributes into metadata object.
Loads all HDF5 attributes and converts them to appropriate Python types before populating the metadata object with validation.
Notes
This method is called automatically on first metadata access if metadata has not been read yet. Empty attributes are skipped with a debug message.
Examples
Manually read metadata after file changes
>>> base_obj.read_metadata()
>>> print(base_obj.metadata.id)
MyGroup
Check what attributes were read
>>> base_obj.read_metadata()
>>> attrs = list(base_obj.metadata.to_dict().keys())
>>> print(f"Attributes: {attrs}")
Attributes: ['id', 'comments', 'provenance']
- write_metadata() → None
Write metadata from object to HDF5 group attributes.
Converts metadata values to numpy-compatible types before writing to HDF5 attributes. Handles read-only mode gracefully with warnings.
- Raises:
KeyError – If HDF5 write fails for reasons other than read-only mode.
ValueError – If synchronous group creation fails for reasons other than read-only mode.
Notes
Keys that already exist are overwritten.
Read-only files will log a warning instead of raising an error.
This method should be called after any metadata changes.
Examples
Update metadata and write to file
>>> base_obj.metadata.id = 'UpdatedGroup'
>>> base_obj.metadata.comments = 'New comments'
>>> base_obj.write_metadata()
Verify write by reloading
>>> base_obj._has_read_metadata = False
>>> base_obj.read_metadata()
>>> print(base_obj.metadata.id)
UpdatedGroup
- initialize_group(**kwargs: Any) → None
Initialize group by setting attributes and writing metadata.
Convenience method that sets keyword arguments as instance attributes and writes all metadata to the HDF5 file.
- Parameters:
**kwargs (dict) – Key-value pairs to set as instance attributes.
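The "set keyword arguments as attributes, then write" behaviour can be sketched with a small stand-in class. GroupSketch and its `written` flag are hypothetical; only the four class-level defaults mirror the attributes listed for BaseGroup.

```python
class GroupSketch:
    """Hypothetical group carrying the class-level defaults listed above."""

    compression = None
    compression_opts = None
    shuffle = False
    fletcher32 = False

    def __init__(self):
        self.written = False

    def write_metadata(self):
        # Placeholder for persisting metadata to the file.
        self.written = True

    def initialize_group(self, **kwargs):
        # Set each keyword argument as an instance attribute, then write.
        for key, value in kwargs.items():
            setattr(self, key, value)
        self.write_metadata()


g = GroupSketch()
g.initialize_group(compression="gzip", compression_opts=4, shuffle=True)
```

Attributes that are not passed, such as fletcher32 here, keep their class-level default.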
Examples
Initialize with compression settings
>>> base_obj.initialize_group(
...     compression='gzip',
...     compression_opts=4,
...     shuffle=True
... )
- rename_group(new_name: str) → None
Rename the current group in the HDF5 file.
- Parameters:
new_name (str) – New name for the group. Will be validated and normalized.
- Raises:
MTH5Error – If renaming fails due to read-only mode or other issues.
Examples
Rename a group
>>> print(survey_obj.hdf5_group.name)
/OldSurveyName
>>> survey_obj.rename_group('NewSurveyName')
>>> print(survey_obj.hdf5_group.name)
/NewSurveyName
- class mth5.groups.ReportsGroup(group: h5py.Group, **kwargs: Any)
Bases: mth5.groups.base.BaseGroup
Store report files (PDF/text) and images under /Survey/Reports.
Files are embedded into HDF5 datasets with basic metadata preserved.
Examples
>>> reports = survey.reports_group
>>> _ = reports.add_report("site_report", filename="/tmp/report.pdf")
>>> _ = reports.get_report("site_report")
- add_report(report_name: str, report_metadata: dict[str, Any] | None = None, filename: str | pathlib.Path | None = None) → None
Add a report or image file to the group.
- Parameters:
report_name (str) – Dataset name to store the file under.
report_metadata (dict, optional) – Additional attributes to attach to the dataset.
filename (str or Path, optional) – Path to the file to embed; supported types: PDF/TXT/MD and common images.
- Raises:
FileNotFoundError – If filename does not exist.
Examples
>>> reports.add_report("manual", filename="docs/manual.pdf")
- get_report(report_name: str) → pathlib.Path
Extract a stored report or image to the current working directory.
- Parameters:
report_name (str) – Name of the stored dataset.
- Returns:
Path to the materialized file on disk.
- Return type:
pathlib.Path
- Raises:
ValueError – If the stored file type is unsupported.
Examples
>>> path = reports.get_report("site_report")
>>> path.exists()
True
- class mth5.groups.StandardsGroup(group: Any, **kwargs: Any)
Bases: mth5.groups.base.BaseGroup
Container for metadata standards documentation stored in the HDF5 file.
Stores metadata standards used throughout the survey in a standardized summary table. This enables users to understand metadata directly from the file without requiring external documentation.
The standards are organized in a summary table at /Survey/Standards/summary with columns for attribute name, type, requirements, style, units, and descriptions.
Notes
Standards include definitions for:
- Survey, Station, Run, Electric, Magnetic, Auxiliary metadata
- Filter types: Coefficient, FIR, FrequencyResponseTable, PoleZero, TimeDelay
- Processing standards from aurora and fourier_coefficients modules
Examples
>>> with MTH5('survey.mth5') as mth5_obj:
...     standards = mth5_obj.standards_group
...     summary = standards.summary_table
...     print(summary.array.dtype.names)
('attribute', 'type', 'required', 'style', 'units', 'description', ...)
Get information about a specific attribute:
>>> standards.get_attribute_information('survey.release_license')
survey.release_license
--------------------------
type : string
required : True
style : controlled vocabulary
...
- property summary_table: mth5.tables.MTH5Table
- get_attribute_information(attribute_name: str) → None
Print detailed information about a metadata attribute.
Retrieves and displays all metadata standards information for the specified attribute from the standards summary table.
- Parameters:
attribute_name (str) – Name of the attribute to describe (e.g., ‘survey.release_license’).
- Raises:
MTH5TableError – If the attribute is not found in the standards summary table.
Notes
Prints formatted output including:
Data type
Whether attribute is required
Style (e.g., controlled vocabulary)
Units
Description
Valid options
Aliases
Example values
Default value
Examples
>>> standards = mth5_obj.standards_group
>>> standards.get_attribute_information('survey.release_license')
survey.release_license
--------------------------
type : string
required : True
style : controlled vocabulary
units :
description : How the data can be used. The options are based on Creative Commons licenses.
options : CC-0,CC-BY,CC-BY-SA,CC-BY-ND,CC-BY-NC-SA
alias :
example : CC-0
default : CC-0
- summary_table_from_dict(summary_dict: dict[str, Any]) → None
Populate summary table from a dictionary of metadata standards.
Converts a flattened dictionary of metadata standards into rows in the HDF5 summary table.
- Parameters:
summary_dict (dict[str, Any]) – Flattened dictionary of all metadata standards. Keys are attribute names, values are dictionaries with type, required, style, units, description, etc.
Notes
Processes dictionary values:
Lists are converted to comma-separated strings
None values become empty strings
Bytes are decoded to UTF-8
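The three value conversions listed in these notes can be sketched as a small helper. normalize_value and the sample entry are hypothetical names and data chosen for illustration; the actual method may order or implement these steps differently.

```python
def normalize_value(value):
    """Hypothetical helper applying the three conversions listed above."""
    if value is None:
        return ""                                      # None -> empty string
    if isinstance(value, bytes):
        return value.decode("utf-8")                   # bytes -> str
    if isinstance(value, list):
        return ",".join(str(item) for item in value)   # list -> "a,b,c"
    return value


# Illustrative entry for a single attribute (values are made up).
entry = {
    "type": b"string",
    "required": True,
    "units": None,
    "options": ["CC-0", "CC-BY", "CC-BY-SA"],
}
row = {key: normalize_value(val) for key, val in entry.items()}
```

Values that match none of the three cases, such as the boolean above, pass through unchanged.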
Examples
>>> standards = StandardsGroup(group)
>>> metadata = summarize_metadata_standards()
>>> standards.summary_table_from_dict(metadata)
- get_standards_summary(modules: list[str] | None = None) → numpy.ndarray
Get standards for specified metadata modules.
Retrieves and concatenates standards arrays from one or more metadata modules for inclusion in the standards table.
- Parameters:
modules (list[str], optional) – List of module names to include (e.g., ‘timeseries’, ‘filters’). If None, uses default modules: common, timeseries, timeseries.filters, transfer_functions.tf, features, features.weights, processing, processing.fourier_coefficients, processing.aurora. Default is None.
- Returns:
Concatenated numpy structured array containing standards for all requested modules with dtype matching STANDARDS_DTYPE.
- Return type:
np.ndarray
Examples
>>> standards = StandardsGroup(group)
>>> ts_standards = standards.get_standards_summary(['timeseries'])
>>> print(ts_standards.shape)
(45,)
Get all default modules:
>>> all_standards = standards.get_standards_summary()
- summary_table_from_array(array: numpy.ndarray) → None
Populate summary table from a numpy structured array.
Converts a structured numpy array into rows in the HDF5 summary table.
- Parameters:
array (np.ndarray) – Structured numpy array with dtype matching STANDARDS_DTYPE. Each row represents one metadata attribute definition.
Notes
Iterates through all rows of the structured array and adds them sequentially to the summary table using add_row().
Examples
>>> standards = StandardsGroup(group)
>>> standards_array = standards.get_standards_summary()
>>> standards.summary_table_from_array(standards_array)
- initialize_group() → None
Initialize the standards group and create the summary table.
Creates the summary table dataset in the HDF5 file and populates it with metadata standards from all default modules. Sets appropriate HDF5 attributes and writes the group metadata.
Notes
Initialization process:
Creates HDF5 dataset for summary table with maximum expandable shape
Applies compression if configured in dataset_options
Sets HDF5 attributes: type, last_updated, reference
Populates table with standards from all default modules
Writes group metadata to HDF5
The summary table uses STANDARDS_DTYPE and supports up to 1000 rows.
Examples
>>> standards = mth5_obj.standards_group
>>> standards.initialize_group()
>>> summary_table = standards.summary_table
>>> print(summary_table.array.shape)
(342,)
- class mth5.groups.FiltersGroup(group: h5py.Group, **kwargs)
Bases: mth5.groups.base.BaseGroup
Container for managing all filter types in MTH5 format.
This class provides a unified interface for organizing and accessing filters of different types. It automatically creates and manages subgroups for each filter type (ZPK, Coefficient, Time Delay, FAP, and FIR) within the HDF5 file structure.
Filter Types
zpk: Zeros, Poles, and Gain representation
coefficient: Coefficient filter
time_delay: Time delay filter
fap: Frequency-Amplitude-Phase (FAP) lookup table
fir: Finite Impulse Response filter
- Parameters:
group (h5py.Group) – HDF5 group object for the filters container.
**kwargs – Additional keyword arguments passed to BaseGroup.
- coefficient_group
Subgroup for coefficient filters.
- Type:
- time_delay_group
Subgroup for time delay filters.
- Type:
Examples
>>> import h5py
>>> from mth5.groups.filters import FiltersGroup
>>> with h5py.File('data.h5', 'r') as f:
...     filters = FiltersGroup(f['Filters'])
...     all_filters = filters.filter_dict
...     zpk_filter = filters.to_filter_object('my_zpk_filter')
- property filter_dict: dict[str, Any]
Get a dictionary of all filters across all filter type groups.
Aggregates filters from all subgroups (ZPK, Coefficient, Time Delay, FAP, FIR) into a single dictionary for convenient access and querying.
- Returns:
Dictionary mapping filter names to filter metadata dictionaries. Each entry contains filter information including type and HDF5 reference.
- Return type:
dict[str, Any]
Examples
>>> filters = FiltersGroup(h5_group)
>>> all_filters = filters.filter_dict
>>> print(list(all_filters.keys()))
['my_zpk_filter', 'lowpass_coefficient', 'time_delay_1', ...]
>>> print(all_filters['my_zpk_filter']['type'])
zpk
- add_filter(filter_object: object) → object
Add a filter dataset based on its type.
Automatically detects the filter type and routes the filter to the appropriate subgroup. Filter names are normalized to lowercase and forward slashes are replaced with “ per “ for consistency.
- Parameters:
filter_object (mt_metadata.timeseries.filters) –
An MT metadata filter object with a ‘type’ attribute. Supported types:
‘zpk’, ‘poles_zeros’: Zeros-Poles-Gain filter
‘coefficient’: Coefficient filter
‘time_delay’, ‘time delay’: Time delay filter
‘fap’, ‘frequency response table’: Frequency-Amplitude-Phase filter
‘fir’: Finite Impulse Response filter
- Returns:
Filter group object from the appropriate subgroup.
- Return type:
object
Notes
If a filter with the same name already exists, the existing filter is returned instead of creating a duplicate.
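The name normalization and the return-existing-instead-of-duplicating rule described for add_filter can be sketched with an in-memory registry. FilterRegistry, normalize_filter_name, and the payload dictionaries are hypothetical; the real method works on HDF5 subgroups rather than a Python dict.

```python
def normalize_filter_name(name):
    """Lowercase the name and replace '/' with ' per ', as described above."""
    return name.lower().replace("/", " per ")


class FilterRegistry:
    """Hypothetical in-memory stand-in for a filter subgroup."""

    def __init__(self):
        self._filters = {}

    def add_filter(self, name, payload):
        key = normalize_filter_name(name)
        if key in self._filters:
            # An existing entry is returned rather than duplicated.
            return self._filters[key]
        self._filters[key] = payload
        return payload


registry = FilterRegistry()
first = registry.add_filter("mV/km", {"gain": 1.0})
second = registry.add_filter("MV/KM", {"gain": 2.0})  # same normalized name
```

Both calls resolve to the key "mv per km", so the second call returns the entry stored by the first and its payload is not written.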
Examples
>>> from mt_metadata.timeseries.filters import PoleZeroFilter
>>> filters = FiltersGroup(h5_group)
>>> zpk_filter = PoleZeroFilter(name='my_filter')
>>> added_filter = filters.add_filter(zpk_filter)
Add coefficient filter:
>>> from mt_metadata.timeseries.filters import CoefficientFilter
>>> coeff_filter = CoefficientFilter(name='lowpass')
>>> filters.add_filter(coeff_filter)
- get_filter(name: str) → h5py.Dataset | h5py.Group
Retrieve a filter dataset by name.
Looks up the filter by name in the aggregated filter dictionary and returns the HDF5 dataset or group object.
- Parameters:
name (str) – Name of the filter to retrieve.
- Returns:
HDF5 dataset or group object for the requested filter.
- Return type:
h5py.Dataset or h5py.Group
- Raises:
KeyError – If the filter name is not found in the filter dictionary.
Examples
>>> filters = FiltersGroup(h5_group)
>>> filter_dataset = filters.get_filter('my_zpk_filter')
>>> print(filter_dataset.attrs)
- to_filter_object(name: str) → object
Convert a filter HDF5 dataset to an MT metadata filter object.
Retrieves the filter metadata from the HDF5 file and converts it to the appropriate MT metadata filter class based on filter type.
- Parameters:
name (str) – Name of the filter to convert.
- Returns:
MT metadata filter object (ZPK, Coefficient, TimeDelay, FAP, or FIR).
- Return type:
object
- Raises:
KeyError – If the filter name is not found in the filter dictionary.
Examples
>>> filters = FiltersGroup(h5_group)
>>> zpk_filter = filters.to_filter_object('my_zpk_filter')
>>> print(zpk_filter.name)
my_zpk_filter
>>> print(type(zpk_filter).__name__)
PoleZeroFilter
Get different filter types:
>>> coeff_filter = filters.to_filter_object('lowpass_coefficient')
>>> fap_filter = filters.to_filter_object('frequency_response_1')
- class mth5.groups.EstimateDataset(dataset: h5py.Dataset, dataset_metadata: mt_metadata.transfer_functions.tf.statistical_estimate.StatisticalEstimate | None = None, write_metadata: bool = True, **kwargs: Any)
Container for statistical estimates of transfer functions.
This class holds multi-dimensional statistical estimates for transfer functions with full metadata management. Estimates are stored as HDF5 datasets with dimensions for period, output channels, and input channels.
- Parameters:
dataset (h5py.Dataset) – HDF5 dataset containing the statistical estimate data.
dataset_metadata (mt_metadata.transfer_functions.tf.StatisticalEstimate, optional) – Metadata object for the estimate. If provided and write_metadata is True, the metadata will be written to the HDF5 attributes. Defaults to None.
write_metadata (bool, optional) – If True, write metadata to the HDF5 dataset attributes. Defaults to True.
**kwargs (Any) – Additional keyword arguments (reserved for future use).
- hdf5_dataset
Weak reference to the HDF5 dataset.
- Type:
h5py.Dataset
- metadata
Metadata container for the estimate.
- Type:
StatisticalEstimate
- logger
Logger instance for reporting messages.
- Type:
loguru.logger
- Raises:
MTH5Error – If dataset_metadata is provided but is not of type StatisticalEstimate or a compatible metadata class.
TypeError – If input data cannot be converted to numpy array or has wrong dtype/shape.
Notes
The estimate data is stored in 3D form with shape: (n_periods, n_output_channels, n_input_channels)
Metadata is automatically synchronized between the pydantic model and HDF5 attributes on initialization and after any modifications.
Examples
Create an estimate dataset from an HDF5 group:
>>> import h5py
>>> import numpy as np
>>> from mt_metadata.transfer_functions.tf.statistical_estimate import StatisticalEstimate
>>> # Create HDF5 file with estimate dataset
>>> with h5py.File('estimate.h5', 'w') as f:
...     # Create dataset with shape (10 periods, 2 outputs, 2 inputs)
...     data = np.random.rand(10, 2, 2)
...     dset = f.create_dataset('estimate', data=data)
...     # Create EstimateDataset
...     est = EstimateDataset(dset, write_metadata=True)
Convert estimate to xarray and back:
>>> periods = np.logspace(-3, 3, 10)  # 10 periods from 1e-3 to 1e3 s
>>> xr_data = est.to_xarray(periods)
>>> # Modify xarray coordinates
>>> new_xr = xr_data.rename({'output': 'new_output', 'input': 'new_input'})
>>> est.from_xarray(new_xr)  # Load modified data back
Access estimate data in different formats:
>>> # Get numpy array
>>> np_data = est.to_numpy()
>>> print(np_data.shape)  # (10, 2, 2)
>>> # Get xarray with proper coordinates
>>> xr_data = est.to_xarray(periods)
>>> print(xr_data.dims)  # ('period', 'output', 'input')
- logger
- metadata
- read_metadata() → None
Read metadata from HDF5 attributes into metadata container.
Reads all attributes from the HDF5 dataset and loads them into the internal metadata object for validation and access.
- Return type:
None
Notes
This is automatically called during initialization if ‘mth5_type’ attribute exists in the HDF5 dataset.
Examples
Reload metadata from HDF5 after external modification:
>>> # Metadata was modified in HDF5
>>> est.read_metadata()  # Reload changes
>>> print(est.metadata.name)  # Access updated name
- write_metadata() → None
Write metadata from container to HDF5 dataset attributes.
Converts the pydantic metadata model to a dictionary and writes each field as an HDF5 attribute. Values are converted to appropriate numpy types for compatibility.
- Return type:
None
Notes
All existing attributes with the same names will be overwritten. This is called automatically during initialization and after metadata updates.
Examples
Save updated metadata to HDF5:
>>> est.metadata.name = "Updated Estimate"
>>> est.write_metadata()  # Persist to file
>>> # Verify write
>>> print(est.hdf5_dataset.attrs['name'])
b'Updated Estimate'
- replace_dataset(new_data_array: numpy.ndarray) → None
Replace entire dataset with new data.
Resizes the HDF5 dataset if necessary and replaces all data. Converts input to numpy array if needed.
- Parameters:
new_data_array (np.ndarray) – New estimate data to store. Should have shape (n_periods, n_output_channels, n_input_channels).
- Return type:
None
- Raises:
TypeError – If input cannot be converted to numpy array.
Notes
If new data has different shape, HDF5 dataset will be resized. This is generally safe but may fragment the HDF5 file.
Examples
Replace estimate with new data:
>>> import numpy as np
>>> new_estimate = np.random.rand(10, 2, 2)  # 10 periods, 2 channels
>>> est.replace_dataset(new_estimate)
>>> print(est.to_numpy().shape)
(10, 2, 2)
Replace with data from list (auto-converted to array):
>>> data_list = [[[1, 2], [3, 4]]] * 5  # 5 periods
>>> est.replace_dataset(data_list)
>>> est.to_numpy().shape
(5, 2, 2)
- to_xarray(period: numpy.ndarray | list) → xarray.DataArray
Convert estimate to xarray DataArray.
Creates an xarray DataArray with proper coordinates for periods, output channels, and input channels. Includes metadata as attributes.
- Parameters:
period (np.ndarray | list) – Period values for coordinate. Should have length equal to estimate first dimension (n_periods).
- Returns:
DataArray with dimensions (period, output, input) and coordinates from metadata.
- Return type:
xr.DataArray
Notes
Metadata changes in xarray are not validated and will not be synchronized back to HDF5 without explicit call to from_xarray(). Data is loaded entirely into memory.
Examples
Convert to xarray with logarithmic period spacing:
>>> import numpy as np
>>> periods = np.logspace(-2, 3, 10)  # 10 periods from 0.01 to 1000
>>> xr_data = est.to_xarray(periods)
>>> print(xr_data.dims)
('period', 'output', 'input')
>>> print(xr_data.coords['period'].values)
[1.00e-02 3.59e-02 ... 1.00e+03]
Select data by period range:
>>> subset = xr_data.sel(period=slice(0.1, 100))
>>> print(subset.shape)
(6, 2, 2)
- to_numpy() → numpy.ndarray
Convert estimate to numpy array.
Returns the HDF5 dataset as a numpy array. Data is loaded entirely into memory.
- Returns:
3D array with shape (n_periods, n_output_channels, n_input_channels).
- Return type:
np.ndarray
Notes
For large estimates, this loads all data into RAM. Consider using HDF5 slicing for memory-efficient access.
Examples
Get full estimate as numpy array:
>>> data = est.to_numpy()
>>> print(data.shape)
(10, 2, 2)
>>> print(data.dtype)
float64
Access specific period and channels:
>>> data = est.to_numpy()
>>> # Get first 5 periods, output channel 0, input channel 1
>>> subset = data[:5, 0, 1]
>>> print(subset.shape)
(5,)
- from_numpy(new_estimate: numpy.ndarray) → None
Load estimate data from numpy array.
Validates dtype and shape compatibility, resizes dataset if needed, and stores the data.
- Parameters:
new_estimate (np.ndarray) – Estimate data to load. Must be convertible to numpy array. Preferred shape: (n_periods, n_output_channels, n_input_channels).
- Return type:
None
- Raises:
TypeError – If dtype doesn’t match existing dataset or input cannot be converted to numpy array.
Notes
The dataset is resized if the shape of new_estimate does not match the existing dataset.
Examples
Load estimate from numpy array:
>>> import numpy as np
>>> new_data = np.random.rand(5, 2, 2)
>>> est.from_numpy(new_data)
>>> print(est.to_numpy().shape)
(5, 2, 2)
Load an array with an explicit float64 dtype:
>>> float_data = np.array([[[1.0, 2.0]]], dtype=np.float64)
>>> est.from_numpy(float_data)
- from_xarray(data: xarray.DataArray) → None
Load estimate data from xarray DataArray.
Updates metadata from xarray coordinates and attributes, then stores the data.
- Parameters:
data (xr.DataArray) – DataArray containing estimate. Expected dimensions: (period, output, input).
- Return type:
None
Notes
This will update output_channels, input_channels, name, and data_type from the xarray object. All changes are persisted to HDF5.
Examples
Load estimate from modified xarray:
>>> xr_data = est.to_xarray(periods)
>>> # Modify data and metadata
>>> modified = xr_data * 2  # Scale by 2
>>> est.from_xarray(modified)
>>> print(est.to_numpy()[0, 0, 0])  # Verify scale
Rename channels and reload:
>>> xr_data = est.to_xarray(periods)
>>> new_xr = xr_data.assign_coords(
...     output=['Ex', 'Ey'],
...     input=['Bx', 'By']
... )
>>> est.from_xarray(new_xr)
>>> print(est.metadata.output_channels)
['Ex', 'Ey']
- class mth5.groups.FCChannelDataset(dataset: h5py.Dataset, dataset_metadata: mt_metadata.processing.fourier_coefficients.FCChannel | None = None, **kwargs: Any)
Container for Fourier coefficients (FC) from windowed FFT analysis.
Holds multi-dimensional Fourier coefficient data representing time-frequency analysis results. Data is uniformly sampled in both frequency (via harmonic index) and time (via uniform FFT window step size).
- Parameters:
dataset (h5py.Dataset) – HDF5 dataset containing the Fourier coefficient data.
dataset_metadata (FCChannel | None, optional) – Metadata object containing FC channel properties like start time, end time, sample rates, units, and frequency method. If provided, metadata will be written to HDF5 attributes. Defaults to None.
**kwargs (Any) – Additional keyword arguments (reserved for future use).
- hdf5_dataset
Weak reference to the HDF5 dataset.
- Type:
h5py.Dataset
- metadata
Metadata container for the Fourier coefficients.
- Type:
FCChannel
- logger
Logger instance for reporting messages.
- Type:
loguru.logger
- Raises:
MTH5Error – If dataset_metadata is provided but is not of type FCChannel.
TypeError – If input data cannot be converted to numpy array or has incompatible dtype/shape.
Notes
The data array has shape (n_windows, n_frequencies) where:
- n_windows: Number of time windows in the FFT moving window analysis
- n_frequencies: Number of frequency bins determined by window size
Data is typically complex-valued representing Fourier coefficients. Time windows are uniformly spaced with interval 1/sample_rate_window_step. Frequencies are uniformly spaced from frequency_min to frequency_max.
Metadata includes:
- Time period (start and end)
- Acquisition and decimated sample rates
- Window sample rate (delta_t within window)
- Units
- Frequency method (integer harmonic index calculation)
- Component name (channel designation)
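The uniform time and frequency axes described in these notes can be sketched with the standard library alone. The two helper functions are hypothetical, and including both frequency end points is an assumption; the library's own coordinate construction may differ in such details.

```python
from datetime import datetime, timedelta


def window_start_times(start, sample_rate_window_step, n_windows):
    """Start time of each window, spaced 1 / sample_rate_window_step seconds apart."""
    step = timedelta(seconds=1.0 / sample_rate_window_step)
    return [start + i * step for i in range(n_windows)]


def linear_frequencies(frequency_min, frequency_max, n_frequencies):
    """Evenly spaced frequencies; including both end points is an assumption."""
    if n_frequencies == 1:
        return [frequency_min]
    df = (frequency_max - frequency_min) / (n_frequencies - 1)
    return [frequency_min + i * df for i in range(n_frequencies)]


# Illustrative values: one window every 2 s, five bins from 0 to 64 Hz.
times = window_start_times(datetime(2023, 1, 1), sample_rate_window_step=0.5, n_windows=4)
freqs = linear_frequencies(0.0, 64.0, 5)
```

With a window step rate of 0.5 Hz the windows start 2 s apart, and the five frequency bins fall at 0, 16, 32, 48, and 64 Hz.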
Examples
Create an FC dataset from HDF5 group:
>>> import h5py
>>> import numpy as np
>>> from mt_metadata.processing.fourier_coefficients import FCChannel
>>> with h5py.File('fc.h5', 'w') as f:
...     # Create 2D array: 50 time windows, 256 frequencies
...     data = np.random.rand(50, 256) + 1j * np.random.rand(50, 256)
...     dset = f.create_dataset('Ex', data=data, dtype=np.complex128)
...     # Create FCChannelDataset
...     fc = FCChannelDataset(dset, write_metadata=True)
Convert to xarray and access time-frequency data:
>>> xr_data = fc.to_xarray()
>>> print(xr_data.dims)  # ('time', 'frequency')
>>> # Access data at specific time and frequency
>>> subset = xr_data.sel(time='2023-01-01T12:00:00', method='nearest')
Inspect properties:
>>> print(f"Windows: {fc.n_windows}, Frequencies: {fc.n_frequencies}")
>>> print(f"Frequency range: {fc.frequency.min():.2f}-{fc.frequency.max():.2f} Hz")
- logger
- metadata
- read_metadata() → None
Read metadata from HDF5 attributes into metadata container.
Reads all attributes from the HDF5 dataset and loads them into the internal metadata object for validation and access.
- Return type:
None
Notes
This is automatically called during initialization if ‘mth5_type’ attribute exists in the HDF5 dataset.
Examples
Reload metadata from HDF5 after external modification:
>>> # Metadata was modified in HDF5
>>> fc.read_metadata()  # Reload changes
>>> print(fc.metadata.component)  # Access updated component
- write_metadata() → None
Write metadata from container to HDF5 dataset attributes.
Converts the pydantic metadata model to a dictionary and writes each field as an HDF5 attribute. Values are converted to appropriate numpy types for compatibility. Always ensures ‘mth5_type’ attribute is set to ‘FCChannel’.
- Return type:
None
Notes
All existing attributes with the same names will be overwritten. This is called automatically during initialization and after metadata updates. Read-only files will silently skip writes.
Examples
Save updated metadata to HDF5:
>>> fc.metadata.component = "Ey"
>>> fc.write_metadata()  # Persist to file
>>> # Verify write
>>> print(fc.hdf5_dataset.attrs['component'])
b'Ey'
- property n_windows: int
Number of time windows in the FFT analysis.
- Returns:
Number of time windows (first dimension of data array).
- Return type:
int
Notes
This corresponds to the number of rows in the 2D spectrogram data. Each window represents a uniform time interval determined by the window step size (1/sample_rate_window_step).
Examples
>>> print(f"Time windows: {fc.n_windows}")
Time windows: 50
- property time: numpy.ndarray
Time array including the start of each time window.
Generates uniformly spaced time coordinates based on the start time, window step rate, and number of windows. Uses metadata time period to determine bounds.
- Returns:
Array of datetime64 values for each window start time.
- Return type:
np.ndarray
Notes
Time coordinates are generated using make_dt_coordinates, which ensures consistency between specified start/end times and the number of windows.
Examples
Access time array for time-based indexing:
>>> time_array = fc.time
>>> print(time_array.shape)  # (n_windows,)
>>> print(time_array[0])  # First window time
2023-01-01T00:00:00.000000
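The uniform spacing described for this property can be sketched with plain NumPy. This is a minimal illustration and not the library's make_dt_coordinates: the helper name window_start_times and its arguments are hypothetical, and it assumes the step between windows is 1/sample_rate_window_step seconds.

```python
import numpy as np


def window_start_times(start, sample_rate_window_step, n_windows):
    """Return datetime64[ns] start times for uniformly stepped windows.

    Hypothetical helper: assumes windows advance by
    1 / sample_rate_window_step seconds.
    """
    # Step between consecutive windows, expressed in whole nanoseconds
    step_ns = int(round(1e9 / sample_rate_window_step))
    offsets = np.arange(n_windows, dtype="int64") * step_ns
    return np.datetime64(start, "ns") + offsets.astype("timedelta64[ns]")


# A 0.5 Hz window step rate means one window every 2 seconds
times = window_start_times("2023-01-01T00:00:00", 0.5, 4)
print(times)
```

The result has one entry per window, so its length should match n_windows for the same channel.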
- property n_frequencies: int
Number of frequency bins in the Fourier analysis.
- Returns:
Number of frequency bins (second dimension of data array).
- Return type:
int
Notes
This corresponds to the number of columns in the 2D spectrogram data. Determined by the FFT window size and relates to the frequency resolution of the analysis.
Examples
>>> print(f"Frequency bins: {fc.n_frequencies}")
Frequency bins: 256
- property frequency: numpy.ndarray
Frequency array from metadata frequency bounds.
Generates uniformly spaced frequency coordinates based on the metadata frequency range and number of frequency bins.
- Returns:
Array of frequency values, linearly spaced from frequency_min to frequency_max.
- Return type:
np.ndarray
Notes
Frequencies represent harmonic indices or actual frequency values depending on the frequency method specified in metadata. Spacing is determined by n_frequencies bins over the range.
Examples
Access frequency array for frequency-based indexing:
>>> freq_array = fc.frequency
>>> print(freq_array.shape)  # (n_frequencies,)
>>> print(f"Frequency range: {freq_array.min():.2f} to {freq_array.max():.2f} Hz")
Frequency range: 0.00 to 64.00 Hz
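A linearly spaced frequency axis like the one described here can be reproduced with np.linspace. The sketch below is an assumption about the construction, not the library's code: frequency_axis is a hypothetical name, and both endpoints are assumed to be included.

```python
import numpy as np


def frequency_axis(frequency_min, frequency_max, n_frequencies):
    """Hypothetical helper: linearly spaced bins, both endpoints included."""
    return np.linspace(frequency_min, frequency_max, n_frequencies)


# 257 bins from 0 to 64 Hz gives a constant 0.25 Hz spacing
freqs = frequency_axis(0.0, 64.0, 257)
spacing = np.diff(freqs)
print(freqs[:3], spacing[0])
```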
- replace_dataset(new_data_array: numpy.ndarray) None[source]
Replace entire dataset with new data.
Resizes the HDF5 dataset if necessary and replaces all data. Converts input to numpy array if needed.
- Parameters:
new_data_array (np.ndarray) – New FC data to store. Should have shape (n_windows, n_frequencies) and typically complex-valued.
- Return type:
None
- Raises:
TypeError – If input cannot be converted to numpy array.
Notes
If new data has different shape, HDF5 dataset will be resized. This is generally safe but may fragment the HDF5 file.
Examples
Replace FC data with new analysis results:
>>> import numpy as np
>>> new_fc = np.random.rand(30, 256) + 1j * np.random.rand(30, 256)
>>> fc.replace_dataset(new_fc)
>>> print(fc.to_numpy().shape)
(30, 256)
Replace with data from list (auto-converted to array):
>>> # Each inner list is one window (row) with two frequency bins
>>> data_list = [[1+1j, 2+2j], [3+3j, 4+4j]] * 15
>>> fc.replace_dataset(data_list)
>>> fc.to_numpy().shape
(30, 2)
- to_xarray() xarray.DataArray[source]
Convert FC data to xarray DataArray.
Creates an xarray DataArray with proper coordinates for time and frequency. Includes metadata as attributes.
- Returns:
DataArray with dimensions (time, frequency) and coordinates from metadata and computed properties.
- Return type:
xr.DataArray
Notes
Metadata changes in xarray are not validated and will not be synchronized back to HDF5 without explicit call to from_xarray(). Data is loaded entirely into memory.
Examples
Convert to xarray with automatic coordinates:
>>> xr_data = fc.to_xarray()
>>> print(xr_data.dims)
('time', 'frequency')
>>> print(xr_data.shape)
(50, 256)
Select data by time and frequency range:
>>> subset = xr_data.sel(
...     time=slice('2023-01-01T00:00:00', '2023-01-01T12:00:00'),
...     frequency=slice(0, 10)
... )
>>> print(subset.shape)  # Subset shape
- to_numpy() numpy.ndarray[source]
Convert FC data to numpy array.
Returns the HDF5 dataset as a numpy array. Data is loaded entirely into memory.
- Returns:
2D complex array with shape (n_windows, n_frequencies).
- Return type:
np.ndarray
Notes
For large spectrograms, this loads all data into RAM. Consider using HDF5 slicing for memory-efficient access to subsets.
Examples
Get full FC data as numpy array:
>>> data = fc.to_numpy()
>>> print(data.shape)
(50, 256)
>>> print(data.dtype)
complex128
Access specific time window and frequency:
>>> data = fc.to_numpy()
>>> # Get first 10 windows, frequency bin 100
>>> subset = data[:10, 100]
>>> print(subset.shape)
(10,)
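To judge whether loading a full spectrogram is reasonable, the in-memory size can be estimated from the shape before calling to_numpy(). The helper below is a hypothetical sketch using only NumPy; it assumes complex128 storage, as in the example output above.

```python
import numpy as np


def spectrogram_nbytes(n_windows, n_frequencies, dtype=np.complex128):
    """Hypothetical helper: bytes needed to hold the full 2D array in RAM."""
    return n_windows * n_frequencies * np.dtype(dtype).itemsize


# complex128 uses 16 bytes per value
full = spectrogram_nbytes(50, 256)
# A slice of 10 windows at a single frequency bin is far smaller
subset = spectrogram_nbytes(10, 1)
print(full, subset)
```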
- from_numpy(new_estimate: numpy.ndarray) None[source]
Load FC data from numpy array.
Validates dtype and shape compatibility, resizes dataset if needed, and stores the data.
- Parameters:
new_estimate (np.ndarray) – FC data to load. Should have shape (n_windows, n_frequencies). Typically complex-valued array.
- Return type:
None
- Raises:
TypeError – If dtype doesn’t match existing dataset or input cannot be converted to numpy array.
Notes
The array is passed as new_estimate. The dataset is resized if the shape doesn’t match. Dtype compatibility is strictly enforced.
Examples
Load FC data from numpy array:
>>> import numpy as np
>>> new_data = np.random.rand(25, 128) + 1j * np.random.rand(25, 128)
>>> fc.from_numpy(new_data)
>>> print(fc.to_numpy().shape)
(25, 128)
Load with magnitude and phase separation:
>>> magnitude = np.random.rand(20, 256)
>>> phase = np.random.rand(20, 256) * 2 * np.pi
>>> fc_data = magnitude * np.exp(1j * phase)
>>> fc.from_numpy(fc_data)
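Because a dtype mismatch raises TypeError, it can help to check an array before handing it to from_numpy(). The following is a sketch under stated assumptions, not the library's validation: validate_estimate is a hypothetical helper, the expected dtype is assumed to be complex128, and the 2D shape check is an extra guard added for illustration.

```python
import numpy as np


def validate_estimate(new_estimate, expected_dtype=np.complex128):
    """Hypothetical pre-check mirroring the documented TypeError behaviour."""
    arr = np.asarray(new_estimate)
    if arr.dtype != np.dtype(expected_dtype):
        raise TypeError(
            f"dtype {arr.dtype} does not match {np.dtype(expected_dtype)}"
        )
    if arr.ndim != 2:
        # Illustrative guard: FC data is (n_windows, n_frequencies)
        raise ValueError(f"expected a 2D array, got shape {arr.shape}")
    return arr


good = validate_estimate(np.zeros((25, 128), dtype=np.complex128))

try:
    validate_estimate(np.zeros((25, 128), dtype=np.float64))
    rejected = False
except TypeError:
    rejected = True
print(good.shape, rejected)
```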
- from_xarray(data: xarray.DataArray, sample_rate_decimation_level: int | float) None[source]
Load FC data from xarray DataArray.
Updates metadata from xarray coordinates and attributes, then stores the data. Computes frequency and time parameters from the provided xarray object.
- Parameters:
data (xr.DataArray) – DataArray containing FC data. Expected dimensions: (time, frequency).
sample_rate_decimation_level (int | float) – Sample rate of the decimation level (Hz), stored in the metadata field of the same name.
- Return type:
None
Notes
This will update time_period (start/end), frequency bounds, window step rate, decimation level, component name, and units from the xarray object. All changes are persisted to HDF5.
Examples
Load FC data from modified xarray:
>>> import numpy as np
>>> xr_data = fc.to_xarray()
>>> # Modify data (e.g., apply filter)
>>> modified = xr_data * np.hamming(256)  # Apply frequency window
>>> fc.from_xarray(modified, sample_rate_decimation_level=4)
>>> print(fc.metadata.sample_rate_decimation_level)
4
Load with updated metadata from another analysis:
>>> import xarray as xr
>>> import pandas as pd
>>> time_coords = pd.date_range('2023-01-01', periods=30, freq='1H')
>>> freq_coords = np.arange(0, 128)
>>> new_fc = xr.DataArray(
...     data=np.random.rand(30, 128) + 1j * np.random.rand(30, 128),
...     coords={'time': time_coords, 'frequency': freq_coords},
...     dims=['time', 'frequency'],
...     name='Ey',
...     attrs={'units': 'mV/km'}
... )
>>> fc.from_xarray(new_fc, sample_rate_decimation_level=1)
>>> print(fc.metadata.component)
Ey
- class mth5.groups.MasterFCGroup(group: h5py.Group, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Master container for all Fourier Coefficient estimations of time series data.
This class manages multiple Fourier Coefficient processing runs, each containing different decimation levels. No metadata is required at the master level.
Hierarchy
MasterFCGroup -> FCGroup (processing runs) -> FCDecimationGroup (decimation levels) -> FCChannelDataset (individual channels)
- param group:
HDF5 group object for the master FC container.
- type group:
h5py.Group
- param **kwargs:
Additional keyword arguments passed to BaseGroup.
Examples
>>> import h5py
>>> from mth5.groups.fourier_coefficients import MasterFCGroup
>>> with h5py.File('data.h5', 'a') as f:  # append mode, since add_fc_group writes to the file
...     master = MasterFCGroup(f['FC'])
...     fc_group = master.add_fc_group('processing_run_1')
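The four-level hierarchy can be pictured as nested mappings: processing runs contain decimation levels, which contain channels. The sketch below uses plain Python only; iter_channel_paths, the root name "FC", and the run, level, and channel names are illustrative assumptions rather than paths guaranteed by the library.

```python
def iter_channel_paths(root, tree):
    """Yield one slash-separated path per channel in a nested mapping.

    Hypothetical helper: tree maps run name -> level name -> channel names.
    """
    for run_name, levels in tree.items():
        for level_name, channels in levels.items():
            for channel in channels:
                yield "/".join([root, run_name, level_name, channel])


# Illustrative layout: one processing run with two decimation levels
tree = {"processing_run_1": {"0": ["ex", "ey"], "1": ["ex"]}}
paths = list(iter_channel_paths("FC", tree))
for path in paths:
    print(path)
```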
- property fc_summary: pandas.DataFrame
Get a summary of all Fourier Coefficient processing runs.
- Returns:
Summary information for all FC groups including names and metadata.
- Return type:
pd.DataFrame
Examples
>>> master = MasterFCGroup(h5_group)
>>> summary = master.fc_summary
- add_fc_group(fc_name: str, fc_metadata: mt_metadata.processing.fourier_coefficients.Decimation | None = None) FCGroup[source]
Add a Fourier Coefficient processing run group.
- Parameters:
fc_name (str) – Name for the FC group (usually identifies the processing run).
fc_metadata (fc.Decimation, optional) – Metadata for the FC group. Default is None.
- Returns:
Newly created Fourier Coefficient group.
- Return type:
FCGroup
Examples
>>> master = MasterFCGroup(h5_group)
>>> fc_group = master.add_fc_group('processing_run_1')
>>> print(fc_group.name)
'processing_run_1'
- get_fc_group(fc_name: str) FCGroup[source]
Retrieve a Fourier Coefficient group by name.
- Parameters:
fc_name (str) – Name of the FC group to retrieve.
- Returns:
The requested Fourier Coefficient group.
- Return type:
FCGroup
- Raises:
MTH5Error – If the FC group does not exist.
Examples
>>> master = MasterFCGroup(h5_group)
>>> fc_group = master.get_fc_group('processing_run_1')
- remove_fc_group(fc_name: str) None[source]
Remove a Fourier Coefficient group.
Deletes the specified FC group and all associated decimation levels and channels.
- Parameters:
fc_name (str) – Name of the FC group to remove.
- Raises:
MTH5Error – If the FC group does not exist.
Examples
>>> master = MasterFCGroup(h5_group)
>>> master.remove_fc_group('processing_run_1')
- class mth5.groups.FCGroup(group: h5py.Group, decimation_level_metadata: mt_metadata.processing.fourier_coefficients.Decimation | None = None, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Manage a set of Fourier Coefficients from a single processing run.
Holds Fourier Coefficient estimations organized by decimation level. Each decimation level contains channels (Ex, Ey, Hz, etc.) with complex frequency or time-frequency representations of the input signal.
All channels must use the same calibration. Recalibration requires rerunning the Fourier Coefficient estimation.
- hdf5_group
The HDF5 group containing decimation levels
- Type:
h5py.Group
- metadata[source]
Decimation metadata including time period, sample rates, and channels
- Type:
fc.Decimation
Notes
Processing run structure:
Multiple decimation levels at different sample rates
Each decimation level contains multiple channels
Each channel contains complex Fourier coefficients
Time period and sample rates define the estimation window
Examples
>>> with h5py.File('data.h5', 'r') as f:
...     fc_run = FCGroup(f['Fourier_Coefficients/run_1'])
...     print(fc_run.decimation_level_summary)
- metadata() mt_metadata.processing.fourier_coefficients.Decimation[source]
Get processing run metadata including all decimation levels.
Collects metadata from all decimation level groups and aggregates into a single Decimation metadata object.
- Returns:
Metadata containing time period, sample rates, and all decimation level information.
- Return type:
fc.Decimation
Notes
This getter automatically populates:
Time period (start and end)
List of all decimation levels and their metadata
HDF5 reference to this group
Examples
>>> fc_run = FCGroup(h5_group)
>>> metadata = fc_run.metadata
>>> print(metadata.time_period.start)
2023-01-01T00:00:00
- property decimation_level_summary: pandas.DataFrame
Get a summary of all decimation levels in this processing run.
Returns information about each decimation level including sample rate, decimation level value, and time span.
- Returns:
Summary with columns:
decimation_level: Integer decimation level identifier
start: ISO format start time of this decimation level
end: ISO format end time of this decimation level
hdf5_reference: Reference to the HDF5 group
- Return type:
pd.DataFrame
Notes
Each row represents a single decimation level containing multiple channels with Fourier coefficients at different sample rates.
Examples
>>> fc_run = FCGroup(h5_group)
>>> summary = fc_run.decimation_level_summary
>>> print(summary[['decimation_level', 'start', 'end']])
   decimation_level                       start                         end
0                 0  2023-01-01T00:00:00.000000  2023-01-01T01:00:00.000000
1                 1  2023-01-01T00:00:00.000000  2023-01-01T02:00:00.000000
- add_decimation_level(decimation_level_name: str, decimation_level_metadata: dict | mt_metadata.processing.fourier_coefficients.Decimation | None = None) FCDecimationGroup[source]
Add a new decimation level to the processing run.
Creates a new FCDecimationGroup for a single decimation level containing Fourier Coefficient channels at a specific sample rate.
- Parameters:
decimation_level_name (str) – Identifier for the decimation level.
decimation_level_metadata (dict | fc.Decimation, optional) – Metadata for the decimation level. Can be a dictionary or fc.Decimation object. Default is None.
- Returns:
Newly created decimation level group.
- Return type:
FCDecimationGroup
Examples
>>> fc_run = FCGroup(h5_group)
>>> metadata = fc.Decimation(decimation_level=0)
>>> decimation = fc_run.add_decimation_level('0', metadata)
- get_decimation_level(decimation_level_name: str) FCDecimationGroup[source]
Retrieve a decimation level by name.
- Parameters:
decimation_level_name (str) – Name or identifier of the decimation level.
- Returns:
The requested decimation level group.
- Return type:
FCDecimationGroup
Examples
>>> fc_run = FCGroup(h5_group)
>>> decimation = fc_run.get_decimation_level('0')
>>> channels = decimation.groups_list
- remove_decimation_level(decimation_level_name: str) None[source]
Remove a decimation level from the processing run.
Deletes the HDF5 group and all its channels (FCChannelDataset objects).
- Parameters:
decimation_level_name (str) – Name or identifier of the decimation level to remove.
Notes
This removes the entire decimation level and all channels within it. To remove individual channels, use FCDecimationGroup.remove_channel() instead.
Examples
>>> fc_run = FCGroup(h5_group)
>>> fc_run.remove_decimation_level('0')
- update_metadata() None[source]
Update processing run metadata from all decimation levels.
Aggregates time period information from all decimation levels and writes updated metadata to HDF5.
Notes
Collects:
Earliest start time across all decimation levels
Latest end time across all decimation levels
Should be called after adding or removing decimation levels.
Examples
>>> fc_run = FCGroup(h5_group)
>>> fc_run.add_decimation_level('0', metadata0)
>>> fc_run.add_decimation_level('1', metadata1)
>>> fc_run.update_metadata()
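The aggregation described in the notes (earliest start, latest end) amounts to a min and a max over the per-level time periods. The sketch below shows that idea with the standard library only; aggregate_time_period and the dictionary layout are hypothetical, and it is not the library's implementation.

```python
from datetime import datetime


def aggregate_time_period(levels):
    """Return (earliest start, latest end) as ISO strings.

    Hypothetical helper: each level is a dict with ISO 'start' and 'end'.
    """
    starts = [datetime.fromisoformat(level["start"]) for level in levels]
    ends = [datetime.fromisoformat(level["end"]) for level in levels]
    return min(starts).isoformat(), max(ends).isoformat()


# Two decimation levels with the same start but different end times
levels = [
    {"start": "2023-01-01T00:00:00", "end": "2023-01-01T01:00:00"},
    {"start": "2023-01-01T00:00:00", "end": "2023-01-01T02:00:00"},
]
period = aggregate_time_period(levels)
print(period)
```

With no levels present, min() and max() raise ValueError in this sketch, so a caller would need to handle the empty case separately.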
- supports_aurora_processing_config(processing_config: aurora.config.metadata.processing.Processing, remote: bool) bool[source]
Check if all required decimation levels exist for Aurora processing.
Performs an all-or-nothing check: returns True only if every decimation level required by the processing config is available in this FCGroup.
Uses sequential logic to short-circuit: if any required decimation level is missing, immediately returns False without checking remaining levels.
- Parameters:
processing_config (aurora.config.metadata.processing.Processing) – Aurora processing configuration containing required decimation levels.
remote (bool) – Whether to check for remote processing compatibility.
- Returns:
True if all required decimation levels are available and consistent, False otherwise.
- Return type:
bool
Notes
Validation logic:
Extract list of decimation levels from processing config
Iterate through each required level in sequence
For each level, find a matching FCDecimation in this group
Check consistency using Aurora’s validation method
If any level is missing or inconsistent, return False immediately
Return True only if all levels pass validation
Examples
>>> fc_run = FCGroup(h5_group)
>>> config = aurora.config.metadata.processing.Processing(...)
>>> if fc_run.supports_aurora_processing_config(config, remote=False):
...     # All decimation levels are available
...     pass
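The all-or-nothing, short-circuit check listed in the validation logic can be sketched in plain Python. Everything here is a stand-in: LevelSpec and supports_config are hypothetical, and comparing sample rates is only a placeholder for Aurora's own consistency check, which is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelSpec:
    """Hypothetical description of one decimation level."""
    decimation_level: int
    sample_rate: float


def supports_config(required, available):
    """Return True only if every required level has a consistent match."""
    by_level = {spec.decimation_level: spec for spec in available}
    for spec in required:
        found = by_level.get(spec.decimation_level)
        if found is None:
            return False  # missing level: stop without checking the rest
        if found.sample_rate != spec.sample_rate:
            return False  # level exists but is inconsistent (placeholder check)
    return True


available = [LevelSpec(0, 1.0), LevelSpec(1, 0.25)]
ok = supports_config([LevelSpec(0, 1.0), LevelSpec(1, 0.25)], available)
missing = supports_config([LevelSpec(0, 1.0), LevelSpec(2, 0.0625)], available)
mismatch = supports_config([LevelSpec(1, 0.5)], available)
print(ok, missing, mismatch)
```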
- class mth5.groups.FCDecimationGroup(group: h5py.Group, decimation_level_metadata: mt_metadata.processing.fourier_coefficients.Decimation | None = None, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Container for a single decimation level of Fourier Coefficient data.
This class manages all channels at a specific decimation level, assuming uniform sampling in both frequency and time domains.
Data Assumptions
Data uniformly sampled in frequency domain
Data uniformly sampled in time domain
FFT moving window has uniform step size
- start_time
Start time of the decimation level
- Type:
datetime
- end_time
End time of the decimation level
- Type:
datetime
- channels
List of channel names in this decimation level
- Type:
list
- decimation_factor
Factor by which data was decimated
- Type:
int
- decimation_level
Level index in decimation hierarchy
- Type:
int
- sample_rate
Sample rate after decimation (Hz)
- Type:
float
- method
Method used (FFT, wavelet, etc.)
- Type:
str
- window
Window parameters (length, overlap, type, sample rate)
- Type:
dict
- param group:
HDF5 group object for this decimation level.
- type group:
h5py.Group
- param decimation_level_metadata:
Metadata for the decimation level. Default is None.
- type decimation_level_metadata:
optional
- param **kwargs:
Additional keyword arguments passed to BaseGroup.
Examples
>>> decimation = FCDecimationGroup(h5_group, decimation_level_metadata=metadata)
>>> channel = decimation.add_channel('Ex', fc_data=fc_array)
- property channel_summary: pandas.DataFrame
Get a summary of all channels in this decimation level.
Returns a pandas DataFrame with detailed information about each Fourier Coefficient channel including time ranges, dimensions, and sampling rates.
- Returns:
DataFrame with columns:
- component : str
Channel component name (e.g., ‘Ex’, ‘Hy’)
- start : datetime64[ns]
Start time of the channel data
- end : datetime64[ns]
End time of the channel data
- n_frequency : int64
Number of frequency bins
- n_windows : int64
Number of time windows
- sample_rate_decimation_level : float64
Decimation level sample rate (Hz)
- sample_rate_window_step : float64
Sample rate of window stepping (Hz)
- units : str
Physical units of the data
- hdf5_reference : h5py.ref_dtype
HDF5 reference to the channel dataset
- Return type:
pd.DataFrame
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> summary = decimation.channel_summary
>>> print(summary[['component', 'n_frequency', 'n_windows']])
- from_dataframe(df: pandas.DataFrame, channel_key: str, time_key: str = 'time', frequency_key: str = 'frequency') None[source]
Load Fourier Coefficient data from a pandas DataFrame.
Assumes the channel_key column contains complex coefficient values organized with time and frequency dimensions.
- Parameters:
df (pd.DataFrame) – Input DataFrame containing the coefficient data.
channel_key (str) – Name of the column containing coefficient values.
time_key (str, default='time') – Name of the time coordinate column.
frequency_key (str, default='frequency') – Name of the frequency coordinate column.
- Raises:
TypeError – If df is not a pandas DataFrame.
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> decimation.from_dataframe(df, channel_key='Ex', time_key='time')
- from_xarray(data_array: xarray.Dataset | xarray.DataArray, sample_rate_decimation_level: float) None[source]
Load Fourier Coefficient data from an xarray DataArray or Dataset.
Automatically extracts metadata (time, frequency, units) from the xarray object and creates appropriate FCChannelDataset instances for each variable or the single DataArray.
- Parameters:
data_array (xr.DataArray or xr.Dataset) – Input xarray object with ‘time’ and ‘frequency’ coordinates and dimensions [‘time’, ‘frequency’] (or transposed variant).
sample_rate_decimation_level (float) – Sample rate of the decimation level (Hz).
- Raises:
TypeError – If data_array is not an xarray Dataset or DataArray.
Notes
Automatically handles both (time, frequency) and (frequency, time) dimension ordering. Units are extracted from xarray attributes if available.
Examples
>>> import xarray as xr
>>> import numpy as np
>>> decimation = FCDecimationGroup(h5_group)
Create sample xarray data:
>>> times = np.arange('2023-01-01', '2023-01-02', dtype='datetime64[s]')
>>> freqs = np.linspace(0.01, 100, 256)
>>> data_array = np.random.randn(len(times), len(freqs)) + \
...     1j * np.random.randn(len(times), len(freqs))
>>> xr_data = xr.DataArray(
...     data_array,
...     dims=['time', 'frequency'],
...     coords={'time': times, 'frequency': freqs},
...     name='Ex'
... )
Load into decimation group:
>>> decimation.from_xarray(xr_data, sample_rate_decimation_level=0.5)
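Handling both dimension orderings comes down to transposing (frequency, time) input into (time, frequency). The sketch below shows the idea with NumPy only; to_time_frequency is a hypothetical helper and not the library's code. With an xarray object the equivalent step would be DataArray.transpose("time", "frequency").

```python
import numpy as np


def to_time_frequency(values, dims):
    """Hypothetical helper: return values ordered as (time, frequency)."""
    dims = tuple(dims)
    if dims == ("time", "frequency"):
        return values
    if dims == ("frequency", "time"):
        return values.T  # swap the two axes
    raise ValueError(f"unsupported dimensions: {dims}")


# 256 frequency bins by 10 windows, stored frequency-first
spectrogram = np.zeros((256, 10), dtype=complex)
ordered = to_time_frequency(spectrogram, ("frequency", "time"))
print(ordered.shape)
```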
- to_xarray(channels: list[str] | None = None) xarray.Dataset[source]
Create an xarray Dataset from Fourier Coefficient channels.
If no channels are specified, all channels in the decimation level are included. Each channel becomes a data variable in the resulting Dataset.
- Parameters:
channels (list[str], optional) – List of channel names to include. If None, all channels are used. Default is None.
- Returns:
xarray Dataset with channels as data variables and ‘time’ and ‘frequency’ as shared coordinates.
- Return type:
xr.Dataset
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> xr_data = decimation.to_xarray()
>>> print(xr_data.data_vars)
Data variables:
    Ex       (time, frequency) complex128
    Ey       (time, frequency) complex128
Get specific channels:
>>> subset = decimation.to_xarray(channels=['Ex', 'Ey'])
- from_numpy_array(nd_array: numpy.ndarray, ch_name: str | list[str]) None[source]
Load Fourier Coefficient data from a numpy array.
Assumes array shape is either (n_frequencies, n_windows) for a single channel or (n_channels, n_frequencies, n_windows) for multiple channels.
- Parameters:
nd_array (np.ndarray) – Input numpy array containing coefficient data.
ch_name (str or list[str]) – Channel name (for 2D array) or list of channel names (for 3D array).
- Raises:
TypeError – If nd_array is not a numpy ndarray.
ValueError – If array shape is not (n_frequencies, n_windows) or (n_channels, n_frequencies, n_windows).
Examples
>>> decimation = FCDecimationGroup(h5_group)
Load single channel:
>>> data_2d = np.random.randn(256, 100) + 1j * np.random.randn(256, 100)
>>> decimation.from_numpy_array(data_2d, ch_name='Ex')
Load multiple channels:
>>> data_3d = np.random.randn(2, 256, 100) + 1j * np.random.randn(2, 256, 100)
>>> decimation.from_numpy_array(data_3d, ch_name=['Ex', 'Ey'])
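The 2D versus 3D convention can be illustrated by a small dispatcher that maps channel names to per-channel arrays. This is a sketch of the documented shape rules only; split_channels is a hypothetical helper, and the name-count check is an added assumption rather than documented behaviour.

```python
import numpy as np


def split_channels(nd_array, ch_name):
    """Hypothetical helper: map channel names to 2D coefficient arrays."""
    if not isinstance(nd_array, np.ndarray):
        raise TypeError("nd_array must be a numpy ndarray")
    names = [ch_name] if isinstance(ch_name, str) else list(ch_name)
    if nd_array.ndim == 2:
        arrays = [nd_array]  # single channel
    elif nd_array.ndim == 3:
        arrays = list(nd_array)  # first axis indexes channels
    else:
        raise ValueError(f"unsupported shape: {nd_array.shape}")
    if len(names) != len(arrays):
        # Assumed guard: one name per channel
        raise ValueError("number of names does not match number of channels")
    return dict(zip(names, arrays))


data_3d = np.zeros((2, 256, 100), dtype=complex)
channels = split_channels(data_3d, ["Ex", "Ey"])
single = split_channels(data_3d[0], "Ex")
print(sorted(channels), single["Ex"].shape)
```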
- add_channel(fc_name: str, fc_data: numpy.ndarray | None = None, fc_metadata: mt_metadata.processing.fourier_coefficients.FCChannel | None = None, max_shape: tuple = (None, None), chunks: bool = True, dtype: type = complex, **kwargs) mth5.groups.FCChannelDataset[source]
Add a Fourier Coefficient channel to the decimation level.
Creates a new FCChannelDataset for a single channel at a single decimation level. Input data can be provided as numpy array or created empty.
- Parameters:
fc_name (str) – Name for the Fourier Coefficient channel (usually component name like ‘Ex’).
fc_data (np.ndarray, optional) – Input data with shape (n_frequencies, n_windows). Default is None (creates empty).
fc_metadata (fc.FCChannel, optional) – Metadata for the channel. Default is None.
max_shape (tuple, default=(None, None)) – Maximum shape for HDF5 dataset dimensions (expandable if None).
chunks (bool, default=True) – Whether to use HDF5 chunking.
dtype (type, default=complex) – Data type for the dataset.
**kwargs – Additional keyword arguments for HDF5 dataset creation.
- Returns:
Newly created FCChannelDataset object.
- Return type:
mth5.groups.FCChannelDataset
- Raises:
TypeError – If fc_data type is not supported.
Notes
Data layout assumes (time, frequency) organization:
time index: window start times
frequency index: harmonic indices or float values
data: complex Fourier coefficients
If a channel with the same name already exists, the existing channel is returned instead of creating a duplicate.
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> metadata = fc.FCChannel(component='Ex')
Create from numpy array:
>>> fc_data = np.random.randn(100, 256) + 1j * np.random.randn(100, 256)
>>> channel = decimation.add_channel('Ex', fc_data=fc_data, fc_metadata=metadata)
Create empty channel (expandable):
>>> channel = decimation.add_channel('Ex', fc_metadata=metadata)
- get_channel(fc_name: str) mth5.groups.FCChannelDataset[source]
Retrieve a Fourier Coefficient channel by name.
- Parameters:
fc_name (str) – Name of the Fourier Coefficient channel to retrieve.
- Returns:
The requested Fourier Coefficient channel dataset.
- Return type:
mth5.groups.FCChannelDataset
- Raises:
KeyError – If the channel does not exist in this decimation level.
MTH5Error – If unable to retrieve the channel from HDF5.
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> channel = decimation.get_channel('Ex')
>>> print(channel.shape)
(100, 256)
- remove_channel(fc_name: str) None[source]
Remove a Fourier Coefficient channel from the decimation level.
Deletes the HDF5 dataset associated with the channel. Note that this removes the reference but does not reduce the HDF5 file size.
- Parameters:
fc_name (str) – Name of the Fourier Coefficient channel to remove.
- Raises:
MTH5Error – If the channel does not exist.
Notes
Deleting a channel does not reduce the HDF5 file size; it simply removes the reference to the data. To truly reduce file size, copy the desired data to a new file.
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> decimation.remove_channel('Ex')
- update_metadata() None[source]
Update decimation level metadata from all channels.
Aggregates metadata from all FC channels in the decimation level including time period, sample rates, and window step information. Updates the internal metadata object and writes to HDF5.
Notes
Collects the following information from channels:
Time period start/end from channel data
Sample rate decimation level
Sample rate window step
Should be called after adding or modifying channels to keep metadata synchronized.
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> decimation.add_channel('Ex', fc_data=data_ex)
>>> decimation.add_channel('Ey', fc_data=data_ey)
>>> decimation.update_metadata()
- add_feature(feature_name: str, feature_data: numpy.ndarray | None = None, feature_metadata: dict | None = None, max_shape: tuple = (None, None, None), chunks: bool = True, **kwargs) None[source]
Add a feature dataset to the decimation level.
Creates a new dataset for auxiliary features or derived quantities related to Fourier Coefficients (e.g., SNR, coherency, power, etc.).
- Parameters:
feature_name (str) – Name for the feature dataset.
feature_data (np.ndarray, optional) – Input data for the feature. Default is None (creates empty).
feature_metadata (dict, optional) – Metadata dictionary for the feature. Default is None.
max_shape (tuple, default=(None, None, None)) – Maximum shape for HDF5 dataset dimensions (expandable if None).
chunks (bool, default=True) – Whether to use HDF5 chunking.
**kwargs – Additional keyword arguments for HDF5 dataset creation.
Notes
Feature types may include:
Power: Total power in Fourier coefficients
SNR: Signal-to-noise ratio
Coherency: Cross-component coherence
Weights: Channel-specific weights
Flags: Data quality or processing flags
Examples
>>> decimation = FCDecimationGroup(h5_group)
>>> snr_data = np.random.randn(100, 256)
>>> decimation.add_feature('snr', feature_data=snr_data)
Or create empty feature for later population:
>>> decimation.add_feature('power_Ex')
- class mth5.groups.TransferFunctionsGroup(group: Any, **kwargs: Any)[source]
Bases:
mth5.groups.BaseGroup
Container for transfer functions under a station.
Each child group is a single transfer function estimation managed by TransferFunctionGroup.
Examples
>>> from mth5 import mth5
>>> m5 = mth5.MTH5()
>>> _ = m5.open_mth5("/tmp/example.mth5", mode="a")
>>> station = m5.stations_group.add_station("mt01")
>>> tf_group = station.transfer_functions_group
>>> tf_group.groups_list
[]
- tf_summary(as_dataframe: bool = True) pandas.DataFrame | numpy.ndarray[source]
Summarize transfer functions stored for the station.
- Parameters:
as_dataframe (bool, default True) – If True, return a pandas DataFrame; otherwise return a NumPy structured array.
- Returns:
Summary rows including station reference, location, and TF metadata.
- Return type:
pandas.DataFrame or numpy.ndarray
Examples
>>> summary = tf_group.tf_summary()
>>> summary.columns[:4].tolist()
['station_hdf5_reference', 'station', 'latitude', 'longitude']
- add_transfer_function(name: str, tf_object: mt_metadata.transfer_functions.core.TF | None = None) TransferFunctionGroup[source]
Add a transfer function group under this station.
- Parameters:
name (str) – Transfer function identifier.
tf_object (TF, optional) – Transfer function instance to seed metadata and datasets.
- Returns:
Wrapper for the created or existing transfer function.
- Return type:
TransferFunctionGroup
Examples
>>> tf_group = station.transfer_functions_group
>>> _ = tf_group.add_transfer_function("mt01_4096")
- get_transfer_function(tf_id: str) TransferFunctionGroup[source]
Return an existing transfer function by id.
- Parameters:
tf_id (str) – Name of the transfer function.
- Returns:
Wrapper for the requested transfer function.
- Return type:
TransferFunctionGroup
- Raises:
MTH5Error – If the transfer function does not exist.
Examples
>>> existing = station.transfer_functions_group.get_transfer_function("mt01_4096")
>>> existing.name
'mt01_4096'
- remove_transfer_function(tf_id: str) None[source]
Delete a transfer function reference from the station.
- Parameters:
tf_id (str) – Transfer function name.
Notes
HDF5 deletion removes the reference only; storage is not reclaimed.
Examples
>>> tf_group.remove_transfer_function("mt01_4096")
- get_tf_object(tf_id: str) mt_metadata.transfer_functions.core.TF[source]
Return a populated mt_metadata.transfer_functions.core.TF.
- Parameters:
tf_id (str) – Transfer function name to convert.
- Returns:
Transfer function populated with metadata and estimates.
- Return type:
mt_metadata.transfer_functions.core.TF
Examples
>>> tf_obj = tf_group.get_tf_object("mt01_4096")
- class mth5.groups.TransferFunctionGroup(group: Any, **kwargs: Any)[source]
Bases:
mth5.groups.BaseGroup
Wrapper for a single transfer function estimation.
- property period: numpy.ndarray | None
Return the period array stored in the period dataset, if present.
- add_statistical_estimate(estimate_name: str, estimate_data: numpy.ndarray | xarray.DataArray | None = None, estimate_metadata: mt_metadata.transfer_functions.tf.statistical_estimate.StatisticalEstimate | None = None, max_shape: tuple[int | None, int | None, int | None] = (None, None, None), chunks: bool = True, **kwargs: Any) mth5.groups.EstimateDataset[source]
Add a statistical estimate dataset.
- Parameters:
estimate_name (str) – Dataset name.
estimate_data (numpy.ndarray or xarray.DataArray, optional) – Estimate values; if None, a placeholder array is created.
estimate_metadata (StatisticalEstimate, optional) – Metadata describing the estimate.
max_shape (tuple of int or None, default (None, None, None)) – Maximum shape for resizable datasets.
chunks (bool, default True) – Chunking flag forwarded to HDF5 dataset creation.
- Returns:
Wrapper combining dataset and metadata.
- Return type:
mth5.groups.EstimateDataset
- Raises:
TypeError – If estimate_data is not array-like.
Examples
>>> est = tf_group.add_statistical_estimate("transfer_function")
>>> isinstance(est, EstimateDataset)
True
- get_estimate(estimate_name: str) mth5.groups.EstimateDataset[source]
Return a statistical estimate dataset by name.
- to_tf_object() mt_metadata.transfer_functions.core.TF[source]
Convert this group into a populated TF object.
- Returns:
TF instance with survey, station, runs, channels, period, and estimate datasets applied.
- Return type:
mt_metadata.transfer_functions.core.TF
- Raises:
ValueError – If no period dataset is present.
Examples
>>> tf_obj = tf_group.to_tf_object()
- from_tf_object(tf_obj: mt_metadata.transfer_functions.core.TF, update_metadata: bool = True) None[source]
Populate datasets from a TF object.
- Parameters:
tf_obj (TF) – Transfer function object containing estimates and metadata.
update_metadata (bool, default True) – If True, write transfer function metadata to HDF5.
- Raises:
ValueError – If tf_obj is not a TF instance.
Examples
>>> tf_group.from_tf_object(tf_obj)
- class mth5.groups.ElectricDataset(group: h5py.Dataset, **kwargs: Any)[source]
Bases:
ChannelDataset
Specialized container for electric field channel data.
Inherits all functionality from ChannelDataset with electric field specific metadata handling.
- Parameters:
group (h5py.Dataset) – HDF5 dataset containing electric field data.
**kwargs (dict) – Additional keyword arguments passed to ChannelDataset.
Examples
>>> ex_dataset = run_group.get_channel('Ex')
>>> print(type(ex_dataset))
<class 'mth5.groups.channel_dataset.ElectricDataset'>
>>> print(ex_dataset.metadata.type)
'electric'
>>> print(ex_dataset.metadata.units)
'mV/km'
- class mth5.groups.MagneticDataset(group: h5py.Dataset, **kwargs: Any)[source]
Bases:
ChannelDataset
Specialized container for magnetic field channel data.
Inherits all functionality from ChannelDataset with magnetic field specific metadata handling.
- Parameters:
group (h5py.Dataset) – HDF5 dataset containing magnetic field data.
**kwargs (dict) – Additional keyword arguments passed to ChannelDataset.
Examples
>>> hx_dataset = run_group.get_channel('Hx')
>>> print(type(hx_dataset))
<class 'mth5.groups.channel_dataset.MagneticDataset'>
>>> print(hx_dataset.metadata.type)
'magnetic'
>>> print(hx_dataset.metadata.units)
'nT'
- class mth5.groups.ChannelDataset(dataset: h5py.Dataset | None, dataset_metadata: mt_metadata.base.MetadataBase | None = None, write_metadata: bool = True, **kwargs: Any)[source]
A container for channel time series data stored in HDF5 format.
This class provides a flexible interface to work with magnetotelluric channel data, allowing conversion to various formats (xarray, pandas, numpy) while maintaining metadata integrity.
- Parameters:
dataset (h5py.Dataset or None) – HDF5 dataset object containing the channel time series data.
dataset_metadata (MetadataBase, optional) – Metadata container for Electric, Magnetic, or Auxiliary channel types. Default is None.
write_metadata (bool, optional) – Whether to write metadata to the HDF5 dataset on initialization. Default is True.
**kwargs (dict) – Additional keyword arguments to set as instance attributes.
- hdf5_dataset
Weak reference to the underlying HDF5 dataset.
- Type:
h5py.Dataset
- metadata
Channel metadata object with validation.
- Type:
MetadataBase
- logger
Logger instance for tracking operations.
- Type:
loguru.Logger
- Raises:
MTH5Error – If the dataset is not of the correct type or metadata validation fails.
See also
ElectricDataset – Specialized container for electric field channels.
MagneticDataset – Specialized container for magnetic field channels.
AuxiliaryDataset – Specialized container for auxiliary channels.
Examples
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
Access time series data
>>> ts_data = channel.to_channel_ts()
>>> print(f"Mean: {ts_data.ts.mean():.2f}, Std: {ts_data.ts.std():.2f}")
Convert to xarray for time-based indexing
>>> xr_data = channel.to_xarray()
>>> subset = xr_data.sel(time=slice('1980-01-01T00:00:00', '1980-01-01T00:00:10'))
- logger
- metadata
- property run_metadata: mt_metadata.timeseries.Run
Get the run-level metadata containing this channel.
- Returns:
Run metadata object with channel information included.
- Return type:
metadata.Run
Examples
>>> run_meta = channel.run_metadata
>>> print(run_meta.id)
'MT001a'
>>> print(run_meta.channels_recorded_electric)
['Ex', 'Ey']
- property station_metadata: mt_metadata.timeseries.Station
Get the station-level metadata containing this channel.
- Returns:
Station metadata object with run and channel information.
- Return type:
metadata.Station
Examples
>>> station_meta = channel.station_metadata
>>> print(f"{station_meta.id}: {station_meta.location.latitude}, {station_meta.location.longitude}")
'MT001: 40.5, -112.3'
- property survey_metadata: mt_metadata.timeseries.Survey
Get the survey-level metadata containing this channel.
- Returns:
Complete survey metadata hierarchy including this channel.
- Return type:
metadata.Survey
Examples
>>> survey_meta = channel.survey_metadata
>>> print(survey_meta.id)
'MT Survey 2023'
>>> print(f"Stations: {len(survey_meta.stations)}")
Stations: 15
- property survey_id: str
Get the survey identifier.
- Returns:
Survey ID string.
- Return type:
str
Examples
>>> print(channel.survey_id) 'MT_Survey_2023'
- property channel_response: mt_metadata.timeseries.filters.ChannelResponse
Get the complete channel response from applied filters.
Constructs a ChannelResponse object by retrieving all filters referenced in the channel metadata from the survey’s Filters group.
- Returns:
Channel response object containing all applied filters in sequence.
- Return type:
ChannelResponse
Notes
Filters are applied in the order specified by their sequence_number. Filter names are normalized by replacing ‘/’ with ‘ per ‘ and converting to lowercase.
Examples
>>> response = channel.channel_response
>>> print(f"Number of filters: {len(response.filters_list)}")
Number of filters: 3
>>> for filt in response.filters_list:
...     print(f"{filt.name}: {filt.type}")
zpk: zpk
coefficient: coefficient
time delay: time_delay
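The ordering and name-normalization rules from the Notes above can be sketched in isolation. This is a minimal illustration, not the library's code: `FilterRef`, `normalize_filter_name`, and `ordered_filter_names` are hypothetical names introduced here, and only the two documented rules (sort by sequence_number; replace '/' with ' per ' and lowercase) are modeled.

```python
from dataclasses import dataclass


@dataclass
class FilterRef:
    """Hypothetical stand-in for a filter reference held in channel metadata."""
    name: str
    sequence_number: int


def normalize_filter_name(name: str) -> str:
    # Documented rule: '/' becomes ' per ', then the name is lowercased.
    return name.replace("/", " per ").lower()


def ordered_filter_names(refs: list) -> list:
    # Sort by sequence_number so the filters come out in application order.
    ordered = sorted(refs, key=lambda ref: ref.sequence_number)
    return [normalize_filter_name(ref.name) for ref in ordered]


refs = [
    FilterRef("Time Delay", 3),
    FilterRef("mV/km to V/m", 2),
    FilterRef("ZPK", 1),
]
names = ordered_filter_names(refs)
print(names)
```

The normalized names are presumably what is used to look filters up in the survey's Filters group; that lookup is not modeled here.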
- property start: mt_metadata.common.mttime.MTime
Get the start time of the channel data.
- Returns:
Start time from metadata.time_period.start.
- Return type:
MTime
Examples
>>> print(channel.start)
1980-01-01T00:00:00+00:00
>>> print(channel.start.iso_str)
'1980-01-01T00:00:00.000000+00:00'
- property end: mt_metadata.common.mttime.MTime
Calculate the end time based on start time, sample rate, and number of samples.
- Returns:
Calculated end time of the data.
- Return type:
MTime
Notes
End time is calculated as: start + (n_samples - 1) / sample_rate. The -1 ensures the last sample falls exactly at the end time.
Examples
>>> print(f"Duration: {channel.end - channel.start} seconds")
Duration: 3600.0 seconds
>>> print(channel.end.iso_str)
'1980-01-01T01:00:00.000000+00:00'
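The end-time formula in the Notes can be checked with the standard library alone. The sketch below assumes plain `datetime` objects in place of MTime, and `compute_end` is a hypothetical helper, not part of mth5.

```python
from datetime import datetime, timedelta, timezone


def compute_end(start: datetime, sample_rate: float, n_samples: int) -> datetime:
    # The last sample sits (n_samples - 1) sample intervals after the first.
    return start + timedelta(seconds=(n_samples - 1) / sample_rate)


start = datetime(1980, 1, 1, tzinfo=timezone.utc)

# 257 samples at 256 Hz span exactly 256 intervals, i.e. one second.
end = compute_end(start, sample_rate=256.0, n_samples=257)
print(end.isoformat())

# A single sample starts and ends at the same instant.
single = compute_end(start, sample_rate=256.0, n_samples=1)
```

Under this formula a record of exactly `sample_rate` samples ends one sample interval short of a full second, which is why the -1 term matters when comparing durations.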
- property sample_rate: float
Get the sample rate in samples per second.
- Returns:
Sample rate in Hz.
- Return type:
float
Examples
>>> print(f"Sample rate: {channel.sample_rate} Hz") Sample rate: 256.0 Hz
- property n_samples: int
Get the total number of samples in the dataset.
- Returns:
Number of data points in the time series.
- Return type:
int
Examples
>>> print(f"Total samples: {channel.n_samples:,}")
Total samples: 921,600
>>> duration = channel.n_samples / channel.sample_rate
>>> print(f"Duration: {duration/3600:.1f} hours")
Duration: 1.0 hours
- property time_index: pandas.DatetimeIndex
Create a time index for the dataset based on metadata.
- Returns:
Pandas datetime index spanning the entire dataset.
- Return type:
pd.DatetimeIndex
Notes
The time index is useful for time-based queries and slicing operations. It is generated dynamically from start time, sample rate, and number of samples.
Examples
>>> time_idx = channel.time_index
>>> print(time_idx[0], time_idx[-1])
1980-01-01 00:00:00 1980-01-01 00:59:59.996093750
>>> print(f"Index length: {len(time_idx)}")
Index length: 921600
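As a rough illustration of how a time index follows from start time, sample rate, and number of samples, the sketch below builds one with the standard library. `build_time_index` is a hypothetical helper; the real property returns a pandas.DatetimeIndex with nanosecond resolution, whereas `datetime` only resolves microseconds, so a 4 Hz rate is used to keep every step exact.

```python
from datetime import datetime, timedelta


def build_time_index(start: datetime, sample_rate: float, n_samples: int) -> list:
    # Compute each timestamp from the sample number rather than accumulating
    # a rounded step, so rounding error does not grow along the index.
    return [start + timedelta(seconds=i / sample_rate) for i in range(n_samples)]


start = datetime(1980, 1, 1)
index = build_time_index(start, sample_rate=4.0, n_samples=5)
print(index[0], index[-1])
```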
- read_metadata() None[source]
Read metadata from HDF5 attributes into the metadata container.
Loads all HDF5 attributes from the dataset and converts them to the appropriate Python types before populating the metadata object.
For older MTH5 files, this method attempts to coerce values to the expected types based on the metadata schema to maintain backwards compatibility.
Notes
This method automatically validates metadata through the metadata container’s validators. Type coercion is applied to handle older file formats that may have stored metadata with different types.
Examples
>>> channel.read_metadata()
>>> print(channel.metadata.component)
'Ex'
>>> print(channel.metadata.sample_rate)
256.0
Handles type coercion for older files
>>> # If sample_rate was stored as string '256.0' in old file
>>> channel.read_metadata()
>>> print(type(channel.metadata.sample_rate))
<class 'float'>
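The backwards-compatibility coercion described above can be pictured as follows. This is only a sketch under stated assumptions: the `expected_types` mapping stands in for the metadata schema, `coerce_metadata` is a hypothetical helper, and values that cannot be converted are simply passed through for a validator to handle.

```python
def coerce_metadata(raw: dict, expected_types: dict) -> dict:
    """Coerce raw attribute values to the types a schema expects."""
    coerced = {}
    for key, value in raw.items():
        target = expected_types.get(key)
        if target is None or isinstance(value, target):
            # Unknown key or already the right type: keep as-is.
            coerced[key] = value
            continue
        try:
            coerced[key] = target(value)
        except (TypeError, ValueError):
            # Conversion failed: leave the value untouched.
            coerced[key] = value
    return coerced


# Attributes as an older file might have stored them (all strings).
raw = {"component": "Ex", "sample_rate": "256.0", "channel_number": "3"}
schema = {"component": str, "sample_rate": float, "channel_number": int}
meta = coerce_metadata(raw, schema)
print(meta)
```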
- write_metadata() None[source]
Write metadata from the container to HDF5 dataset attributes.
Converts all metadata values to numpy-compatible types before writing to HDF5 attributes. Falls back to string conversion if direct conversion fails.
Notes
This method is automatically called during initialization and when metadata is updated.
Examples
>>> channel.metadata.component = 'Ey'
>>> channel.metadata.measurement_azimuth = 90.0
>>> channel.write_metadata()
- replace_dataset(new_data_array: numpy.ndarray) None[source]
Replace the entire dataset with new data.
- Parameters:
new_data_array (np.ndarray) – New data array with shape (npts,). Must be 1-dimensional.
- Raises:
TypeError – If new_data_array cannot be converted to numpy array.
Notes
The HDF5 dataset will be resized if the new array has a different shape. All existing data will be overwritten.
Examples
Replace with synthetic data
>>> import numpy as np
>>> new_data = np.sin(2 * np.pi * 1.0 * np.linspace(0, 10, 2560))
>>> channel.replace_dataset(new_data)
>>> print(f"New shape: {channel.hdf5_dataset.shape}")
New shape: (2560,)
Replace with processed data
>>> original = channel.hdf5_dataset[:]
>>> filtered = np.convolve(original, np.ones(5)/5, mode='same')
>>> channel.replace_dataset(filtered)
- extend_dataset(new_data_array: numpy.ndarray, start_time: str | mt_metadata.common.mttime.MTime, sample_rate: float, fill: str | float | int | None = None, max_gap_seconds: float | int = 1, fill_window: int = 10) None[source]
Extend or prepend data to the existing dataset with gap handling.
Intelligently adds new data before, after, or within the existing time series. Handles time alignment, overlaps, and gaps with configurable fill strategies.
- Parameters:
new_data_array (np.ndarray) – New data array with shape (npts,).
start_time (str or MTime) – Start time of the new data array in UTC.
sample_rate (float) – Sample rate of the new data array in Hz. Must match existing sample rate.
fill (str, float, int, or None, optional) –
Strategy for filling data gaps:
None : Raise MTH5Error if gap exists (default)
’mean’ : Fill with mean of both datasets within fill_window
’median’ : Fill with median of both datasets within fill_window
’nan’ : Fill with NaN values
numeric value : Fill with specified constant
max_gap_seconds (float or int, optional) – Maximum allowed gap in seconds. Exceeding this raises MTH5Error. Default is 1 second.
fill_window (int, optional) – Number of points from each dataset edge to estimate fill values. Default is 10 points.
- Raises:
MTH5Error – If sample rates don’t match, gap exceeds max_gap_seconds, or fill strategy is invalid.
TypeError – If new_data_array cannot be converted to numpy array.
Notes
Prepend: New data start < existing start
Append: New data start > existing end
Overwrite: New data overlaps existing data
The dataset is automatically resized to accommodate new data.
Examples
Append data with a small gap
>>> import numpy as np
>>> ex = mth5_obj.get_channel('MT001', 'MT001a', 'Ex')
>>> print(f"Original: {ex.n_samples} samples, ends {ex.end}")
Original: 4096 samples, ends 2015-01-08T19:32:09.500000+00:00
>>> new_data = np.random.randn(4096)
>>> new_start = (ex.end + 0.5).isoformat()  # 0.5s gap
>>> ex.extend_dataset(new_data, new_start, ex.sample_rate,
...                   fill='median', max_gap_seconds=2)
>>> print(f"Extended: {ex.n_samples} samples, ends {ex.end}")
Extended: 8200 samples, ends 2015-01-08T19:40:42.500000+00:00
Prepend data seamlessly
>>> prepend_data = np.random.randn(2048)
>>> prepend_start = (ex.start - 2048/ex.sample_rate).isoformat()
>>> ex.extend_dataset(prepend_data, prepend_start, ex.sample_rate)
>>> print(f"New start: {ex.start}")
Overwrite section of existing data
>>> replacement_data = np.zeros(1024)
>>> replace_start = (ex.start + 1.0).isoformat()  # 1s after start
>>> ex.extend_dataset(replacement_data, replace_start, ex.sample_rate)
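To make the fill strategies concrete, here is a minimal sketch of how gap values might be derived from the edges of the two datasets. It is not the library's implementation: `gap_fill_values` is a hypothetical helper, the number of gap samples is taken as an input rather than derived from timestamps, and ValueError stands in for MTH5Error.

```python
import math
import statistics


def gap_fill_values(existing_tail, new_head, n_gap, fill, fill_window=10):
    """Return n_gap fill values for the chosen strategy."""
    if fill is None:
        # Documented default: a gap without a fill strategy is an error.
        raise ValueError("gap present and no fill strategy given")
    # Pool fill_window points from each side of the gap
    # (fill_window is assumed to be a positive integer).
    window = list(existing_tail[-fill_window:]) + list(new_head[:fill_window])
    if fill == "mean":
        value = statistics.fmean(window)
    elif fill == "median":
        value = statistics.median(window)
    elif fill == "nan":
        value = math.nan
    elif isinstance(fill, (int, float)):
        value = float(fill)
    else:
        raise ValueError(f"unsupported fill strategy: {fill!r}")
    return [value] * n_gap


tail = [1.0, 2.0, 3.0]
head = [5.0, 6.0]
mean_fill = gap_fill_values(tail, head, n_gap=2, fill="mean", fill_window=2)
median_fill = gap_fill_values(tail, head, n_gap=2, fill="median", fill_window=2)
nan_fill = gap_fill_values(tail, head, n_gap=3, fill="nan")
const_fill = gap_fill_values(tail, head, n_gap=2, fill=0)
print(mean_fill, median_fill, const_fill)
```

Pooling both edges keeps the filled segment close to the signal level on either side of the gap, which is presumably the intent of fill_window.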
- has_data() bool[source]
Check if the channel contains non-zero data.
- Returns:
True if dataset has non-zero values, False if all zeros or empty.
- Return type:
bool
Examples
>>> if channel.has_data():
...     print("Channel has valid data")
... else:
...     print("Channel is empty or all zeros")
Channel has valid data
>>> empty_channel.has_data()
False
- to_channel_ts() mth5.timeseries.ChannelTS[source]
Convert the dataset to a ChannelTS object with full metadata.
- Returns:
Time series object with data, metadata, and channel response.
- Return type:
mth5.timeseries.ChannelTS
Notes
Data is loaded into memory. The resulting ChannelTS object is independent of the HDF5 file and can be modified without affecting the original dataset.
Examples
>>> ts = channel.to_channel_ts()
>>> print(f"Type: {type(ts)}")
Type: <class 'mth5.timeseries.channel_ts.ChannelTS'>
>>> print(f"Shape: {ts.ts.shape}, Mean: {ts.ts.mean():.2f}")
Shape: (4096,), Mean: 0.15
Process the time series
>>> filtered_ts = ts.low_pass_filter(cutoff=10.0)
>>> detrended_ts = ts.detrend('linear')
>>> ts.plot()
- to_xarray() xarray.DataArray[source]
Convert the dataset to an xarray DataArray with time coordinates.
- Returns:
DataArray with time index and metadata as attributes.
- Return type:
xr.DataArray
Notes
Data is loaded into memory. Metadata is stored in the attrs dictionary and will not be validated if modified.
Examples
>>> xr_data = channel.to_xarray()
>>> print(xr_data)
<xarray.DataArray (time: 4096)>
array([0.931, 0.142, ..., 0.882])
Coordinates:
  * time     (time) datetime64[ns] 1980-01-01 ... 1980-01-01T00:00:15.996
Attributes:
    component:    Ex
    sample_rate:  256.0
    ...
Use xarray’s powerful selection
>>> morning = xr_data.sel(time=slice('1980-01-01T06:00', '1980-01-01T12:00'))
>>> daily_mean = xr_data.resample(time='1D').mean()
>>> xr_data.plot()
- to_dataframe() pandas.DataFrame[source]
Convert the dataset to a pandas DataFrame with time index.
- Returns:
DataFrame with ‘data’ column and time index. Metadata stored in attrs.
- Return type:
pd.DataFrame
Notes
Data is loaded into memory. Metadata is stored in the experimental attrs attribute and will not be validated if modified.
Examples
>>> df = channel.to_dataframe()
>>> print(df.head())
                      data
time
1980-01-01 00:00:00  0.931
1980-01-01 00:00:00  0.142
...
Use pandas operations
>>> df['data'].describe()
>>> df.resample('1H').mean()
>>> df.plot(y='data', figsize=(12, 4))
Access metadata
>>> print(df.attrs['component'])
'Ex'
>>> print(df.attrs['sample_rate'])
256.0
- to_numpy() numpy.recarray[source]
Convert the dataset to a numpy structured array with time and data columns.
- Returns:
Record array with ‘time’ and ‘channel_data’ fields.
- Return type:
np.recarray
Notes
Data is loaded into memory. The ‘data’ name is avoided as it’s a builtin to numpy.
Examples
>>> arr = channel.to_numpy()
>>> print(arr.dtype.names)
('time', 'channel_data')
>>> print(arr['time'][0])
1980-01-01T00:00:00.000000000
>>> print(arr['channel_data'].mean())
0.152
Access fields
>>> times = arr['time']
>>> data = arr['channel_data']
>>> import matplotlib.pyplot as plt
>>> plt.plot(times, data)
- from_channel_ts(channel_ts_obj: mth5.timeseries.ChannelTS, how: str = 'replace', fill: str | float | int | None = None, max_gap_seconds: float | int = 1, fill_window: int = 10) None[source]
Populate the dataset from a ChannelTS object.
- Parameters:
channel_ts_obj (ChannelTS) – Time series object containing data and metadata.
how ({'replace', 'extend'}, optional) –
Method for adding data:
’replace’ : Replace entire dataset (default)
’extend’ : Append/prepend to existing data with gap handling
fill (str, float, int, or None, optional) –
Gap filling strategy (only used with how=’extend’):
None : Raise error on gaps (default)
’mean’ : Fill with mean of both datasets
’median’ : Fill with median of both datasets
’nan’ : Fill with NaN
numeric : Fill with constant value
max_gap_seconds (float or int, optional) – Maximum allowed gap in seconds. Default is 1.
fill_window (int, optional) – Points to use for estimating fill values. Default is 10.
- Raises:
TypeError – If channel_ts_obj is not a ChannelTS instance.
MTH5Error – If time alignment or metadata validation fails.
Examples
Replace entire dataset
>>> from mth5.timeseries import ChannelTS
>>> import numpy as np
>>> ts = ChannelTS(
...     channel_type='electric',
...     data=np.random.randn(1000),
...     channel_metadata={'electric': {
...         'component': 'ex',
...         'sample_rate': 256.0
...     }}
... )
>>> channel.from_channel_ts(ts, how='replace')
>>> print(channel.n_samples)
1000
Extend existing dataset
>>> new_ts = ChannelTS(
...     channel_type='electric',
...     data=np.random.randn(500),
...     channel_metadata={'electric': {
...         'component': 'ex',
...         'sample_rate': 256.0,
...         'time_period.start': channel.end.isoformat()
...     }}
... )
>>> channel.from_channel_ts(new_ts, how='extend', fill='median')
>>> print(channel.n_samples)
1500
- from_xarray(data_array: xarray.DataArray, how: str = 'replace', fill: str | float | int | None = None, max_gap_seconds: float | int = 1, fill_window: int = 10) None[source]
Populate the dataset from an xarray DataArray.
- Parameters:
data_array (xr.DataArray) – DataArray with time coordinate and metadata in attrs.
how ({'replace', 'extend'}, optional) –
Method for adding data:
’replace’ : Replace entire dataset (default)
’extend’ : Append/prepend to existing data with gap handling
fill (str, float, int, or None, optional) –
Gap filling strategy (only used with how=’extend’):
None : Raise error on gaps (default)
’mean’ : Fill with mean of both datasets
’median’ : Fill with median of both datasets
’nan’ : Fill with NaN
numeric : Fill with constant value
max_gap_seconds (float or int, optional) – Maximum allowed gap in seconds. Default is 1.
fill_window (int, optional) – Points to use for estimating fill values. Default is 10.
- Raises:
TypeError – If data_array is not an xarray.DataArray.
MTH5Error – If time alignment fails.
Examples
Replace from xarray
>>> import xarray as xr
>>> import numpy as np
>>> import pandas as pd
>>> time = pd.date_range('2020-01-01', periods=1000, freq='0.004S')
>>> data = xr.DataArray(
...     np.random.randn(1000),
...     coords=[('time', time)],
...     attrs={'component': 'ex', 'sample_rate': 256.0}
... )
>>> channel.from_xarray(data, how='replace')
>>> print(channel.n_samples)
1000
Extend from xarray with gap
>>> time2 = pd.date_range('2020-01-01T00:00:05', periods=500, freq='0.004S')
>>> data2 = xr.DataArray(np.random.randn(500), coords=[('time', time2)])
>>> channel.from_xarray(data2, how='extend', fill='mean')
- property channel_entry: numpy.ndarray
Create a structured array entry for channel summary tables.
- Returns:
Structured array with dtype=CHANNEL_DTYPE containing channel metadata and HDF5 references for survey-wide summaries.
- Return type:
np.ndarray
Notes
This entry includes survey ID, station ID, run ID, location, component, time period, sample rate, and HDF5 references for navigation.
Examples
>>> entry = channel.channel_entry
>>> print(entry['component'][0])
'Ex'
>>> print(entry['sample_rate'][0])
256.0
>>> print(entry['station'][0])
'MT001'
- time_slice(start: str | mt_metadata.common.mttime.MTime, end: str | mt_metadata.common.mttime.MTime | None = None, n_samples: int | None = None, return_type: str = 'channel_ts') mth5.timeseries.ChannelTS | xarray.DataArray | pandas.DataFrame | numpy.ndarray[source]
Extract a time slice from the channel dataset.
- Parameters:
start (str or MTime) – Start time of the slice in UTC.
end (str or MTime, optional) – End time of the slice. Mutually exclusive with n_samples.
n_samples (int, optional) – Number of samples to extract. Mutually exclusive with end.
return_type ({'channel_ts', 'xarray', 'pandas', 'numpy'}, optional) – Format for returned data. Default is ‘channel_ts’.
- Returns:
Time slice in the requested format with appropriate metadata.
- Return type:
ChannelTS or xr.DataArray or pd.DataFrame or np.ndarray
- Raises:
ValueError – If both end and n_samples are provided or neither is provided.
Notes
If the requested slice extends beyond available data, it will be automatically truncated with a warning.
Regional HDF5 references are used when possible for efficiency.
Examples
Extract by number of samples
>>> ex = mth5_obj.get_channel('FL001', 'FL001a', 'Ex')
>>> ex_slice = ex.time_slice("2015-01-08T19:49:15", n_samples=4096)
>>> print(type(ex_slice))
<class 'mth5.timeseries.channel_ts.ChannelTS'>
>>> print(f"Slice shape: {ex_slice.ts.shape}")
Slice shape: (4096,)
>>> ex_slice.plot()
Extract by time range
>>> ex_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     end="2015-01-08T20:49:15"
... )
>>> print(f"Duration: {ex_slice.end - ex_slice.start} seconds")
Duration: 3600.0 seconds
Return as xarray for analysis
>>> xr_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     n_samples=1000,
...     return_type='xarray'
... )
>>> print(xr_slice.mean().values)
0.152
>>> xr_slice.plot()
Return as pandas for tabular ops
>>> df_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     n_samples=500,
...     return_type='pandas'
... )
>>> df_slice['data'].describe()
>>> df_slice.resample('10S').mean()
Return as numpy for computation
>>> np_slice = ex.time_slice(
...     "2015-01-08T19:49:15",
...     n_samples=100,
...     return_type='numpy'
... )
>>> np.fft.fft(np_slice)
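The argument rules for time_slice (exactly one of end or n_samples, with truncation to the available data) can be sketched as index arithmetic. This is a hypothetical helper, `resolve_slice`, using plain `datetime` objects. Treating end as inclusive is an assumption borrowed from the get_index_from_end_time description, and the sketch says nothing about how mth5 reads the data.

```python
import warnings
from datetime import datetime, timedelta


def resolve_slice(start, sample_rate, total_samples, slice_start,
                  end=None, n_samples=None):
    """Return (first, stop) array indices for a requested time slice."""
    # Exactly one of end / n_samples must be given.
    if (end is None) == (n_samples is None):
        raise ValueError("provide exactly one of end or n_samples")
    first = round((slice_start - start).total_seconds() * sample_rate)
    if n_samples is not None:
        stop = first + n_samples
    else:
        # Assumed inclusive end: +1 so the sample at `end` is included.
        stop = round((end - start).total_seconds() * sample_rate) + 1
    if first < 0 or stop > total_samples:
        warnings.warn("requested slice extends beyond available data; truncating")
    return max(first, 0), min(stop, total_samples)


t0 = datetime(2015, 1, 8, 19, 49, 15)
by_count = resolve_slice(t0, 4.0, 100, t0 + timedelta(seconds=2), n_samples=16)
by_end = resolve_slice(t0, 4.0, 100, t0 + timedelta(seconds=2),
                       end=t0 + timedelta(seconds=5))
truncated = resolve_slice(t0, 4.0, 100, t0 + timedelta(seconds=2), n_samples=1000)
print(by_count, by_end, truncated)
```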
- get_index_from_time(given_time: str | mt_metadata.common.mttime.MTime) int[source]
Calculate the array index for a given time.
- Parameters:
given_time (str or MTime) – Time to convert to index.
- Returns:
Array index corresponding to the given time.
- Return type:
int
Notes
Index is calculated as: (time - start_time) * sample_rate and rounded to nearest integer.
Examples
>>> idx = channel.get_index_from_time('1980-01-01T00:00:10')
>>> print(f"Index for 10 seconds: {idx}")
Index for 10 seconds: 2560
>>> # With 256 Hz sample rate: 10 * 256 = 2560
>>> start_idx = channel.get_index_from_time(channel.start)
>>> print(start_idx)
0
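The index formula from the Notes, (time - start_time) * sample_rate rounded to the nearest integer, can be reproduced with the standard library. `index_from_time` below is a hypothetical stand-in that takes `datetime` objects rather than MTime.

```python
from datetime import datetime, timedelta


def index_from_time(start: datetime, given_time: datetime, sample_rate: float) -> int:
    # Elapsed seconds times samples per second, rounded to the nearest sample.
    elapsed = (given_time - start).total_seconds()
    return int(round(elapsed * sample_rate))


start = datetime(1980, 1, 1)
idx = index_from_time(start, start + timedelta(seconds=10), sample_rate=256.0)
start_idx = index_from_time(start, start, sample_rate=256.0)
print(idx, start_idx)
```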
- get_index_from_end_time(given_time: str | mt_metadata.common.mttime.MTime) int[source]
Get the end index value (inclusive) for a given time.
- Parameters:
given_time (str or MTime) – Time to convert to end index.
- Returns:
Array index + 1 for inclusive slicing.
- Return type:
int
Notes
Adds 1 to the calculated index because the stop value of a Python slice is exclusive; with the extra 1, array[start:end] includes the sample at the given time.
Examples
>>> end_idx = channel.get_index_from_end_time('1980-01-01T00:00:10')
>>> data_slice = channel.hdf5_dataset[0:end_idx]
>>> # Includes sample at exactly 10 seconds
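Why the extra 1 matters is easiest to see on a plain list, since Python slice stops are exclusive. The sketch below uses a hypothetical `end_index_from_time` helper and a 4 Hz series whose values equal their own sample numbers; it illustrates the arithmetic only.

```python
from datetime import datetime, timedelta


def end_index_from_time(start: datetime, given_time: datetime, sample_rate: float) -> int:
    # Index of the sample at given_time, plus 1 so it survives an exclusive stop.
    elapsed = (given_time - start).total_seconds()
    return int(round(elapsed * sample_rate)) + 1


start = datetime(1980, 1, 1)
samples = list(range(100))  # sample value == sample number

end_idx = end_index_from_time(start, start + timedelta(seconds=10), sample_rate=4.0)
chunk = samples[0:end_idx]

# Sample number 40 is the one recorded at exactly 10 s for a 4 Hz series.
print(end_idx, chunk[-1])
```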
- class mth5.groups.AuxiliaryDataset(group: h5py.Dataset, **kwargs: Any)[source]
Bases:
ChannelDataset
Specialized container for auxiliary channel data.
Inherits all functionality from ChannelDataset with auxiliary channel specific metadata handling. Used for temperature, battery voltage, etc.
- Parameters:
group (h5py.Dataset) – HDF5 dataset containing auxiliary data.
**kwargs (dict) – Additional keyword arguments passed to ChannelDataset.
Examples
>>> temp_dataset = run_group.get_channel('Temperature')
>>> print(type(temp_dataset))
<class 'mth5.groups.channel_dataset.AuxiliaryDataset'>
>>> print(temp_dataset.metadata.type)
'auxiliary'
>>> print(temp_dataset.metadata.units)
'celsius'
- class mth5.groups.RunGroup(group: h5py.Group, run_metadata: mt_metadata.timeseries.Run | None = None, **kwargs: Any)[source]
Bases:
mth5.groups.BaseGroup
Container for a single MT measurement run with multiple channels.
Manages time series data and metadata for one measurement run within a station. A run can contain multiple channels of electric, magnetic, and auxiliary data. This class provides methods to add, retrieve, and manage individual channels, along with convenient access to station and survey metadata.
The run group is located at /Survey/Stations/{station_name}/{run_name} in the HDF5 file hierarchy.
- metadata[source]
Run metadata including sample rate, time period, and channel information.
- Type:
mt_metadata.timeseries.Run
- channel_summary
Summary table of all channels in the run.
- Type:
pd.DataFrame
- groups_list
List of channel names in the run.
- Type:
list[str]
- Parameters:
group (h5py.Group) – HDF5 group for the run, should have path like /Survey/Stations/{station_name}/{run_name}
run_metadata (mt_metadata.timeseries.Run, optional) – Metadata container for the run. Default is None.
**kwargs (Any) – Additional keyword arguments passed to BaseGroup.
Notes
Key behaviors:
Channels can be of type: electric, magnetic, or auxiliary
All metadata updates should use the metadata object for validation
Call write_metadata() after modifying metadata to persist changes
Channel metadata is cached for performance during repeated access
Deleting a channel removes the reference but doesn’t reduce file size
Examples
Access run from an open MTH5 file:
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
Check available channels:
>>> run.groups_list
['Ex', 'Ey', 'Hx', 'Hy']
Access HDF5 group directly:
>>> run.hdf5_group.ref
<HDF5 Group Reference>
Update metadata and persist to file:
>>> run.metadata.sample_rate = 512.0
>>> run.write_metadata()
Add a channel:
>>> import numpy as np
>>> data = np.random.rand(4096)
>>> ex = run.add_channel('Ex', 'electric', data=data)
This class provides methods to add and get channels. A summary table of all existing channels in the run is also provided as a convenience look up table to make searching easier.
- Parameters:
group (h5py.Group) – HDF5 group for a run, should have a path /Survey/Stations/station_name/run_name
run_metadata (mt_metadata.timeseries.Run, optional) – metadata container, defaults to None
- Access RunGroup from an open MTH5 file:
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
- Check what channels exist:
>>> run.groups_list
['Ex', 'Ey', 'Hx', 'Hy']
To access the hdf5 group directly use RunGroup.hdf5_group
>>> run.hdf5_group.ref
<HDF5 Group Reference>
Note
All attributes should be set through the metadata object so that every input is validated against the metadata standards. If you change attributes in the metadata object, run the RunGroup.write_metadata() method afterwards to persist them. This is a temporary solution; an automatic updater for changed metadata is in progress.
>>> run.metadata.existing_attribute = 'update_existing_attribute'
>>> run.write_metadata()
If you want to add a new attribute this should be done using the metadata.add_base_attribute method.
>>> run.metadata.add_base_attribute('new_attribute',
...                                 'new_attribute_value',
...                                 {'type':str,
...                                  'required':True,
...                                  'style':'free form',
...                                  'description': 'new attribute desc.',
...                                  'units':None,
...                                  'options':[],
...                                  'alias':[],
...                                  'example':'new attribute
- Add a channel:
>>> import numpy
>>> new_channel = run.add_channel('Ex', 'electric',
...                               data=numpy.random.rand(4096))
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ex
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
- Add a channel with metadata:
>>> from mt_metadata.timeseries import Electric
>>> ex_metadata = Electric()
>>> ex_metadata.time_period.start = '2020-01-01T12:30:00'
>>> ex_metadata.time_period.end = '2020-01-03T16:30:00'
>>> new_ex = run.add_channel('Ex', 'electric', None,
...                          channel_metadata=ex_metadata)
>>> # to look at the metadata
>>> new_ex.metadata
{
    "electric": {
        "ac.end": 1.2,
        "ac.start": 2.3,
        ...
    }
}
See also
mth5.metadata for details on how to add metadata from various files and python objects.
- Remove a channel:
>>> run.remove_channel('Ex')
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
Note
Deleting a channel is not as simple as del(channel). In HDF5 this does not free up space in the file; it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.
- Get a channel:
>>> existing_ex = run.get_channel('Ex')
>>> existing_ex
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:08:32+00:00
    sample rate:      8
- Summary Table:
A summary table is provided to make searching easier. The table summarizes all channels within a run. To see what names are in the summary table:
>>> run.summary_table.dtype.descr
[('component', ('|S5', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('n_samples', '<i4'),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('units', ('|S25', {'h5py_encoding': 'ascii'})),
 ('hdf5_reference', ('|O', {'ref': h5py.h5r.Reference}))]
Note
When a channel is added, an entry is added to the summary table, with the information pulled from the channel metadata.
>>> run.summary_table
index | component | start | end | n_samples | measurement_type | units | hdf5_reference
---------------------------------------------------------------------------------------
- property station_metadata: mt_metadata.timeseries.Station
Get station metadata with current run included.
- Returns:
Station metadata object containing this run’s information.
- Return type:
metadata.Station
Examples
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='r')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> station_meta = run.station_metadata
>>> print(station_meta.id)
MT001
- property survey_metadata: mt_metadata.timeseries.Survey
Get survey metadata with current station and run included.
- Returns:
Survey metadata object containing the full hierarchy.
- Return type:
metadata.Survey
Examples
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='r')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> survey_meta = run.survey_metadata
>>> print(survey_meta.id)
CONUS_South
- recache_channel_metadata() None[source]
Clear and rebuild the channel metadata cache from current HDF5 data.
This method reads all channel metadata from HDF5 storage and updates the internal cache. Useful when channel metadata has been modified externally or needs to be synchronized.
Examples
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run.recache_channel_metadata()
>>> # Cache is now synchronized with HDF5 storage
- metadata() mt_metadata.timeseries.Run[source]
Get run metadata including all channel information.
This property dynamically reads and caches channel metadata from HDF5, ensuring the run metadata always reflects the current state of channels.
- Returns:
Run metadata object with all channels included.
- Return type:
metadata.Run
Examples
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run_meta = run.metadata
>>> print(run_meta.channels_recorded_electric)
['ex', 'ey']
>>> print(run_meta.sample_rate)
256.0
- property channel_summary: pandas.DataFrame
Get summary of all channels in the run as a DataFrame.
- Returns:
DataFrame with columns: component, start, end, n_samples, sample_rate, measurement_type, units, hdf5_reference.
- Return type:
pandas.DataFrame
Examples
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> summary = run.channel_summary
>>> print(summary[['component', 'sample_rate', 'n_samples']])
  component  sample_rate  n_samples
0        ex        256.0      65536
1        ey        256.0      65536
2        hx        256.0      65536
3        hy        256.0      65536
- write_metadata() None[source]
Write run metadata to HDF5 attributes.
Converts metadata object to dictionary and writes all attributes to the HDF5 group.
Examples
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run.metadata.sample_rate = 512.0
>>> run.write_metadata()
>>> # Metadata is now persisted to HDF5 file
- add_channel(channel_name, channel_type, data, channel_dtype='int32', shape=None, max_shape=(None,), chunks=True, channel_metadata=None, **kwargs)[source]
Add a channel to the run.
- Parameters:
channel_name (str) – Name of the channel (e.g., ‘ex’, ‘ey’, ‘hx’, ‘hy’, ‘hz’).
channel_type (str) – Type of channel: ‘electric’, ‘magnetic’, or ‘auxiliary’.
data (numpy.ndarray or None) – Time series data for the channel. If None, an empty resizable dataset will be created.
channel_dtype (str, optional) – Data type for the channel if data is None, by default “int32”.
shape (tuple of int, optional) – Initial shape of the dataset. If None and data is None, shape is estimated from metadata or set to (1,), by default None.
max_shape (tuple of int or None, optional) – Maximum shape the dataset can be resized to. Use None for unlimited growth in that dimension, by default (None,).
chunks (bool or int, optional) – Enable chunked storage. If True, uses automatic chunking. If int, uses that chunk size, by default True.
channel_metadata (mt_metadata.timeseries.Electric, Magnetic, or Auxiliary, optional) – Metadata object for the channel, by default None.
**kwargs (dict) – Additional keyword arguments.
- Returns:
The created channel dataset object.
- Return type:
ElectricDataset or MagneticDataset or AuxiliaryDataset
- Raises:
MTH5Error – If channel_type is not one of: electric, magnetic, auxiliary.
Examples
Add a channel with data:
>>> import numpy as np
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='a')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> data = np.random.rand(4096)
>>> ex = run.add_channel('ex', 'electric', data)
>>> print(ex.metadata.component)
ex
Add a channel with metadata:
>>> from mt_metadata.timeseries import Electric
>>> ex_meta = Electric()
>>> ex_meta.time_period.start = '2020-01-01T12:30:00'
>>> ex_meta.sample_rate = 256.0
>>> ex = run.add_channel('ex', 'electric', None,
...                      channel_metadata=ex_meta)
>>> print(ex.metadata.sample_rate)
256.0
Add a channel with custom shape:
>>> ex = run.add_channel('ex', 'electric', None,
...                      shape=(8192,), channel_dtype='float32')
>>> print(ex.hdf5_dataset.shape)
(8192,)
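The argument handling described for add_channel can be sketched in plain Python. This is not the mth5 implementation: resolve_channel_request is a hypothetical helper, ValueError stands in for MTH5Error, estimated_samples stands in for whatever the library derives from metadata, and the precedence of data over shape is an assumption.

```python
VALID_CHANNEL_TYPES = ("electric", "magnetic", "auxiliary")


def resolve_channel_request(channel_type, data=None, shape=None,
                            estimated_samples=None):
    """Validate the channel type and pick an initial dataset shape."""
    if channel_type not in VALID_CHANNEL_TYPES:
        # The documented behavior raises MTH5Error here.
        raise ValueError(
            f"channel_type must be one of {VALID_CHANNEL_TYPES}, "
            f"got {channel_type!r}"
        )
    if data is not None:
        # Assumption: supplied data determines the shape.
        return (len(data),)
    if shape is not None:
        return tuple(shape)
    if estimated_samples is not None:
        # Stand-in for a shape estimated from channel metadata.
        return (estimated_samples,)
    # Documented fallback for an empty, resizable dataset.
    return (1,)


print(resolve_channel_request("electric", data=[0.1, 0.2, 0.3]))
print(resolve_channel_request("magnetic", shape=(8192,)))
print(resolve_channel_request("auxiliary"))
```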
- get_channel(channel_name: str) mth5.groups.ElectricDataset | mth5.groups.MagneticDataset | mth5.groups.AuxiliaryDataset | mth5.groups.ChannelDataset[source]
Get a channel from an existing name.
Returns the appropriate channel dataset container based on the channel type (electric, magnetic, or auxiliary).
- Parameters:
channel_name (str) – Name of the channel to retrieve (e.g., ‘ex’, ‘ey’, ‘hx’).
- Returns:
Channel dataset object containing the channel data and metadata.
- Return type:
ElectricDataset or MagneticDataset or AuxiliaryDataset or ChannelDataset
- Raises:
MTH5Error – If the channel does not exist in the run.
Examples
Attempting to get a non-existent channel:
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5("example.h5", mode='r')
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> ex = run.get_channel('ex')
MTH5Error: ex does not exist, check groups_list for existing names
Check available channels first:
>>> run.groups_list
['ey', 'hx', 'hz']
Get an existing channel:
>>> ey = run.get_channel('ey')
>>> print(ey)
Channel Electric:
-------------------
  component:        ey
  data type:        electric
  data format:      float32
  data shape:       (4096,)
  start:            1980-01-01T00:00:00+00:00
  end:              1980-01-01T00:00:01+00:00
  sample rate:      4096
- remove_channel(channel_name: str) None[source]
Remove a channel from the run.
Deleting a channel is not as simple as del(channel). In HDF5, deletion does not free space in the file; it only removes the reference to that channel. The usual workaround is to copy the data you want to keep into a new file, or to overwrite the channel.
- Parameters:
channel_name (str) – Name of the existing channel to remove.
Notes
Deleting a channel does not reduce the HDF5 file size. It simply removes the reference. If file size reduction is your goal, copy what you want into another file.
Todo: Need to remove summary table entry as well.
Examples
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> run.remove_channel('ex')
- has_data() bool[source]
Check whether every channel in the run contains non-empty, non-zero data.
Verifies that all channels in the run have valid data (non-empty arrays that are not all zeros). Returns False if any channel lacks data.
- Returns:
True if all channels have data, False if any channel is empty or all zeros.
- Return type:
bool
Notes
A channel is considered to have data if its has_data() method returns True, meaning it contains non-zero values.
Examples
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> if run.has_data():
...     print("Run contains valid data")
...     runts = run.to_runts()
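The "all channels must have data" rule can be sketched with plain Python lists standing in for channel arrays. channel_has_data and run_has_data are hypothetical helpers that only mirror the behavior described above; they are not the mth5 code.

```python
def channel_has_data(samples):
    """A channel has data if it is non-empty and not all zeros."""
    return len(samples) > 0 and any(value != 0 for value in samples)


def run_has_data(channels):
    """A run has data only if every one of its channels has data."""
    return all(channel_has_data(samples) for samples in channels.values())


good_run = {"ex": [0.0, 1.5, -2.0], "hx": [3.0, 0.0, 0.0]}
bad_run = {"ex": [0.0, 1.5, -2.0], "hx": [0.0, 0.0, 0.0]}

print(run_has_data(good_run))
print(run_has_data(bad_run))
```

A single empty or all-zero channel is enough to make the whole run report no data, so the result is worth checking before converting the run to a RunTS.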
- to_runts(start: str | None = None, end: str | None = None, n_samples: int | None = None) mth5.timeseries.RunTS[source]
Convert run to a RunTS timeseries object.
Combines all channels in the run into a RunTS object which handles multi-channel time series data with associated metadata.
- Parameters:
start (str, optional) – Start time for time slice in ISO format (e.g., ‘2023-01-01T12:00:00’). If None, uses entire channel data. Default is None.
end (str, optional) – End time for time slice in ISO format. Only used if start is specified. Default is None.
n_samples (int, optional) – Number of samples to extract from start. If both end and n_samples are specified, end takes precedence. Default is None.
- Returns:
RunTS object containing all channels with full run and station metadata.
- Return type:
mth5.timeseries.RunTS
Notes
Includes run, station, and survey metadata in the output
Skips the ‘summary’ group which is not a channel
If start is specified, performs time slicing; otherwise returns full data
Examples
Convert entire run to RunTS:
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> runts = run.to_runts()
>>> print(runts.channels)
['ex', 'ey', 'hx', 'hy']
Time slice the run:
>>> runts = run.to_runts(start='2023-01-01T12:00:00',
...                      end='2023-01-01T13:00:00')
>>> print(runts.ex.ts.shape)
(1024,)
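The interplay of start, end, and n_samples can be sketched as an index calculation. slice_indices is a hypothetical helper, not part of mth5. It assumes a half-open sample range and a known channel start time and sample rate, and it encodes only the documented rules: no slicing without start, and end takes precedence over n_samples.

```python
from datetime import datetime


def slice_indices(channel_start, sample_rate, total_samples,
                  start=None, end=None, n_samples=None):
    """Return (first, last) sample indices for a requested time slice."""
    if start is None:
        # Without a start time the full channel is returned.
        return 0, total_samples

    t0 = datetime.fromisoformat(channel_start)
    offset = (datetime.fromisoformat(start) - t0).total_seconds()
    first = round(offset * sample_rate)

    if end is not None:
        # end wins even when n_samples is also given.
        span = (datetime.fromisoformat(end) - t0).total_seconds()
        last = round(span * sample_rate)
    elif n_samples is not None:
        last = first + n_samples
    else:
        last = total_samples

    # Clamp to the samples that actually exist.
    return max(first, 0), min(last, total_samples)


print(slice_indices("2023-01-01T12:00:00", 1.0, 7200,
                    start="2023-01-01T12:30:00", n_samples=10))
```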
- from_runts(run_ts_obj: mth5.timeseries.RunTS, **kwargs: Any) list[mth5.groups.ElectricDataset | mth5.groups.MagneticDataset | mth5.groups.AuxiliaryDataset][source]
Create channel datasets from a RunTS timeseries object.
Converts a RunTS object with multiple channels and metadata into HDF5 channel datasets and updates run metadata accordingly.
- Parameters:
run_ts_obj (RunTS) – RunTS object containing multiple channels and metadata.
**kwargs (Any) – Additional keyword arguments.
- Returns:
List of created channel dataset objects.
- Return type:
list[ElectricDataset | MagneticDataset | AuxiliaryDataset]
- Raises:
MTH5Error – If input is not a RunTS object.
Notes
Updates run metadata from input object
Validates station and run IDs match current context
Creates appropriate channel type based on channel metadata
Automatically registers recorded channels in run metadata
Examples
>>> from mth5.timeseries import RunTS
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> runts = RunTS.from_file("timeseries_data.txt")
>>> channels = run.from_runts(runts)
>>> print(f"Created {len(channels)} channels")
Created 4 channels
- from_channel_ts(channel_ts_obj: mth5.timeseries.ChannelTS) mth5.groups.ElectricDataset | mth5.groups.MagneticDataset | mth5.groups.AuxiliaryDataset[source]
Create a channel dataset from a ChannelTS timeseries object.
Converts a single ChannelTS object with time series data and metadata into an HDF5 channel dataset. Handles filter registration and updates run metadata with channel information.
- Parameters:
channel_ts_obj (ChannelTS) – ChannelTS object containing time series data and metadata.
- Returns:
Created channel dataset object.
- Return type:
ElectricDataset or MagneticDataset or AuxiliaryDataset
- Raises:
MTH5Error – If input is not a ChannelTS object.
Notes
Registers filters from channel response if present
Validates and corrects station/run ID mismatches
Updates run metadata recorded channel lists
Automatically determines channel type from metadata
Examples
>>> from mth5.timeseries import ChannelTS
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> channel = ChannelTS.from_file("ex_timeseries.txt")
>>> ex = run.from_channel_ts(channel)
>>> print(ex.metadata.component)
ex
- update_run_metadata() None[source]
Update metadata and table entries.
Deprecated: use update_metadata() instead.
- Raises:
DeprecationWarning – Always raised to indicate this method should not be used.
- update_metadata() None[source]
Update run metadata from all channels and persist to HDF5.
Aggregates metadata from all channels including time period and sample rate, then writes updated metadata to HDF5 attributes.
- Raises:
Exception – May be raised when the run has no channels (a warning is logged).
Notes
Updates:
Time period start from minimum of all channels
Time period end from maximum of all channels
Sample rate from first channel (assumes uniform across channels)
Should be called after adding or removing channels to maintain consistency between channel and run metadata.
Examples
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> run.add_channel('ex', 'electric', data=ex_data)
>>> run.add_channel('ey', 'electric', data=ey_data)
>>> run.update_metadata()  # Updates time period and sample rate
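The aggregation rules listed in the notes (earliest start, latest end, sample rate taken from the first channel) can be sketched as follows. ChannelInfo and aggregate_run_metadata are hypothetical stand-ins for the channel metadata objects and the update logic; raising ValueError for an empty run is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ChannelInfo:
    component: str
    start: datetime
    end: datetime
    sample_rate: float


def aggregate_run_metadata(channels):
    """Derive run-level time period and sample rate from its channels."""
    if not channels:
        raise ValueError("run has no channels to aggregate")
    return {
        "start": min(ch.start for ch in channels),
        "end": max(ch.end for ch in channels),
        # Assumes a uniform sample rate, so the first channel is used.
        "sample_rate": channels[0].sample_rate,
    }


channels = [
    ChannelInfo("ex", datetime(2020, 1, 1, 0, 0, 5), datetime(2020, 1, 1, 1, 0, 0), 256.0),
    ChannelInfo("ey", datetime(2020, 1, 1, 0, 0, 0), datetime(2020, 1, 1, 0, 59, 0), 256.0),
]
run_meta = aggregate_run_metadata(channels)
print(run_meta["start"], run_meta["end"], run_meta["sample_rate"])
```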
- plot(start: str | None = None, end: str | None = None, n_samples: int | None = None) Any[source]
Create a matplotlib plot of all channels in the run.
Generates a multi-panel plot showing all channels in the run using the RunTS plotting functionality.
- Parameters:
start (str, optional) – Start time for time slice in ISO format. If None, plots entire channel data. Default is None.
end (str, optional) – End time for time slice in ISO format. Only used if start is specified. Default is None.
n_samples (int, optional) – Number of samples to extract from start. If both end and n_samples are specified, end takes precedence. Default is None.
- Returns:
Matplotlib figure or axes object (depends on RunTS.plot() implementation).
- Return type:
Any
Notes
Creates separate subplots for each channel type (electric, magnetic, auxiliary)
Time slice parameters work the same as to_runts()
Requires matplotlib to be installed
Examples
Plot entire run:
>>> run = mth5_obj.get_run("MT001", "MT001a")
>>> fig = run.plot()
>>> fig.show()
Plot time slice:
>>> fig = run.plot(start='2023-01-01T12:00:00',
...                end='2023-01-01T13:00:00')
- class mth5.groups.FeatureChannelDataset(dataset: h5py.Dataset, dataset_metadata: mt_metadata.features.FeatureDecimationChannel | None = None, **kwargs)[source]
Container for multi-dimensional Fourier Coefficients organized by time and frequency.
This class manages Fourier Coefficient data with frequency band organization, similar to FCDataset but with enhanced band tracking capabilities. The data array is organized with the following assumptions:
Data are grouped into frequency bands
Data are uniformly sampled in time (uniform FFT moving window step size)
The dataset tracks temporal evolution of frequency content across multiple windows, making it suitable for time-frequency analysis of geophysical signals.
- Parameters:
dataset (h5py.Dataset) – HDF5 dataset containing the Fourier coefficient data.
dataset_metadata (FeatureDecimationChannel, optional) – Metadata for the dataset. See mt_metadata.features.FeatureDecimationChannel. If provided, must be of the same type as the internal metadata class. Default is None.
**kwargs – Additional keyword arguments for future extensibility.
- hdf5_dataset
Reference to the HDF5 dataset.
- Type:
h5py.Dataset
- metadata
Metadata container with the following attributes:
name (str) – Dataset name
time_period.start (datetime) – Start time of the data acquisition
time_period.end (datetime) – End time of the data acquisition
sample_rate_window_step (float) – Sample rate of the time window stepping (Hz)
frequency_min (float) – Minimum frequency in the band (Hz)
frequency_max (float) – Maximum frequency in the band (Hz)
units (str) – Physical units of the coefficient data
component (str) – Component identifier (e.g., ‘Ex’, ‘Hy’)
sample_rate_decimation_level (int) – Decimation level applied to acquire this data
- Type:
FeatureDecimationChannel
- Raises:
MTH5Error – If dataset_metadata type does not match the expected FeatureDecimationChannel type.
Examples
>>> import h5py
>>> from mt_metadata.features import FeatureDecimationChannel
>>> from mth5.groups.feature_dataset import FeatureChannelDataset
Create a feature dataset from an HDF5 group:
>>> with h5py.File('data.h5', 'r') as f:
...     h5_dataset = f['feature_group']['Ex']
...     feature = FeatureChannelDataset(h5_dataset)
...     print(f"Time windows: {feature.n_windows}")
...     print(f"Frequencies: {feature.n_frequencies}")
Access time and frequency arrays:
>>> time_array = feature.time
>>> freq_array = feature.frequency
>>> data_array = feature.to_numpy()
- logger
- metadata
- read_metadata() None[source]
Read metadata from the HDF5 file into the metadata container.
This method loads all attributes from the HDF5 dataset into the metadata container, enabling validation and type checking.
Examples
>>> feature.read_metadata()
>>> print(feature.metadata.component)
Ex
- write_metadata() None[source]
Write metadata from the metadata container to the HDF5 attributes.
This method serializes the metadata container and writes all metadata as attributes to the HDF5 dataset. Exceptions raised because the file is read-only are caught.
Examples
>>> feature.metadata.component = 'Ey'
>>> feature.write_metadata()
- property n_windows: int
Get the number of time windows in the dataset.
- Returns:
Number of time windows (first dimension of the dataset).
- Return type:
int
- property time: numpy.ndarray
Get the time array for each window.
Returns an array of datetime64 values representing the start time of each time window. The time spacing is determined by the sample rate of the window stepping.
- Returns:
Array of datetime64 values with shape (n_windows,) representing the start time of each window.
- Return type:
np.ndarray
Examples
>>> time_array = feature.time
>>> print(time_array.shape)
(100,)
>>> print(time_array[0])
2023-01-01T00:00:00
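The relationship between the window-step sample rate and the window start times can be sketched with the standard library. window_start_times is a hypothetical helper that uses datetime objects in place of numpy datetime64 values; it assumes each window starts 1 / sample_rate_window_step seconds after the previous one.

```python
from datetime import datetime, timedelta


def window_start_times(start, sample_rate_window_step, n_windows):
    """Start time of each window, spaced by the window step period."""
    step = timedelta(seconds=1.0 / sample_rate_window_step)
    return [start + i * step for i in range(n_windows)]


# A 0.5 Hz window step means one new window every 2 seconds.
times = window_start_times(datetime(2023, 1, 1), 0.5, 3)
for t in times:
    print(t.isoformat())
```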
- property n_frequencies: int
Get the number of frequency bins in the dataset.
- Returns:
Number of frequency bins (second dimension of the dataset).
- Return type:
int
- property frequency: numpy.ndarray
Get the frequency array for the dataset.
Returns a linearly-spaced frequency array from frequency_min to frequency_max with n_frequencies points.
- Returns:
Array of float64 frequencies in Hz with shape (n_frequencies,).
- Return type:
np.ndarray
Examples
>>> freq_array = feature.frequency
>>> print(freq_array.shape)
(256,)
>>> print(f"Frequency range: {freq_array[0]:.2f} - {freq_array[-1]:.2f} Hz")
Frequency range: 0.01 - 100.00 Hz
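A linearly spaced frequency axis of this kind can be sketched without numpy. linear_frequencies is a hypothetical helper; it assumes both endpoints are included, as with the default behavior of numpy.linspace.

```python
def linear_frequencies(frequency_min, frequency_max, n_frequencies):
    """Evenly spaced frequencies from frequency_min to frequency_max."""
    if n_frequencies < 1:
        return []
    if n_frequencies == 1:
        return [float(frequency_min)]
    step = (frequency_max - frequency_min) / (n_frequencies - 1)
    return [frequency_min + i * step for i in range(n_frequencies)]


freqs = linear_frequencies(0.0, 100.0, 5)
print(freqs)
```

With n_frequencies points spanning the band, the bin spacing is (frequency_max - frequency_min) / (n_frequencies - 1).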
- replace_dataset(new_data_array: numpy.ndarray) None[source]
Replace the entire HDF5 dataset with new data.
This method resizes the HDF5 dataset as needed and replaces all data. The input array must have the same dtype as the existing dataset.
- Parameters:
new_data_array (np.ndarray) – New data array to replace the existing dataset. Will be converted to numpy array if necessary.
- Raises:
TypeError – If input cannot be converted to a numpy array or has incompatible shape.
Examples
>>> import numpy as np
>>> new_data = np.random.randn(100, 256)
>>> feature.replace_dataset(new_data)
- to_xarray() xarray.DataArray[source]
Convert the feature dataset to an xarray DataArray.
Returns an xarray DataArray with proper time and frequency coordinates, metadata attributes, and component naming. The entire dataset is loaded into memory.
- Returns:
DataArray with dimensions [‘time’, ‘frequency’] and coordinates matching the dataset’s time and frequency arrays.
- Return type:
xr.DataArray
Notes
Metadata stored in xarray attributes will not be validated if modified. The full dataset is loaded into memory; use with caution for large datasets.
Examples
>>> xr_data = feature.to_xarray()
>>> print(xr_data.dims)
('time', 'frequency')
>>> print(xr_data.name)
Ex
>>> subset = xr_data.sel(time=slice('2023-01-01', '2023-01-02'))
- to_numpy() numpy.ndarray[source]
Convert the feature dataset to a numpy array.
Returns the dataset as a numpy array by loading it from the HDF5 file into memory. The array shape is (n_windows, n_frequencies).
- Returns:
Numpy array containing all feature data with shape (n_windows, n_frequencies).
- Return type:
np.ndarray
Examples
>>> import numpy as np
>>> data = feature.to_numpy()
>>> print(data.shape)
(100, 256)
>>> print(data.dtype)
complex128
>>> mean_amplitude = np.abs(data).mean()
- from_numpy(new_estimate: numpy.ndarray) None[source]
Load data from a numpy array into the HDF5 dataset.
This method updates the HDF5 dataset with new data from a numpy array. The input array must match the dataset’s dtype. The HDF5 dataset will be resized if necessary to accommodate the new data.
- Parameters:
new_estimate (np.ndarray) – Numpy array to write to the HDF5 dataset. Must have compatible dtype with the existing dataset.
- Raises:
TypeError – If input array dtype does not match the HDF5 dataset dtype or if input cannot be converted to numpy array.
Notes
The variable ‘data’ is a builtin in numpy and cannot be used as a parameter name.
Examples
>>> import numpy as np
>>> new_data = np.random.randn(100, 256) + 1j * np.random.randn(100, 256)
>>> feature.from_numpy(new_data)
>>> loaded_data = feature.to_numpy()
>>> assert loaded_data.shape == new_data.shape
- from_xarray(data: xarray.DataArray, sample_rate_decimation_level: int) None[source]
Load data and metadata from an xarray DataArray.
This method updates both the HDF5 dataset and metadata from an xarray DataArray. It extracts time coordinates, frequency range, and component information from the DataArray and its attributes.
- Parameters:
data (xr.DataArray) – Input xarray DataArray with ‘time’ and ‘frequency’ coordinates. Expected dimensions are [‘time’, ‘frequency’].
sample_rate_decimation_level (int) – Decimation level applied to the original data to produce this feature dataset (integer ≥ 1).
Notes
Metadata stored in xarray attributes will be extracted and written to the HDF5 file. The full dataset is loaded into memory during this process.
Examples
>>> import xarray as xr
>>> import numpy as np
Create sample xarray data:
>>> times = np.arange('2023-01-01', '2023-01-02', dtype='datetime64[s]')
>>> freqs = np.linspace(0.01, 100, 256)
>>> data_array = np.random.randn(len(times), len(freqs)) + \
...     1j * np.random.randn(len(times), len(freqs))
>>> xr_data = xr.DataArray(
...     data_array,
...     dims=['time', 'frequency'],
...     coords={'time': times, 'frequency': freqs},
...     name='Ex',
...     attrs={'units': 'mV/km'}
... )
Load into feature dataset:
>>> feature.from_xarray(xr_data, sample_rate_decimation_level=2)
>>> print(feature.metadata.component)
Ex
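The kind of metadata that can be derived from the time and frequency coordinates is sketched below with plain lists. summarize_coordinates is a hypothetical helper, and the exact fields and rules that from_xarray uses are not confirmed by this page. The sketch assumes uniformly spaced windows (at least two) and takes the window-step sample rate from the spacing of the first two time values.

```python
from datetime import datetime


def summarize_coordinates(times, freqs, name):
    """Derive metadata-like values from time and frequency coordinates."""
    if len(times) < 2:
        raise ValueError("need at least two time values to infer the step")
    step_seconds = (times[1] - times[0]).total_seconds()
    return {
        "component": name,
        "time_period.start": times[0],
        "time_period.end": times[-1],
        "sample_rate_window_step": 1.0 / step_seconds,
        "frequency_min": min(freqs),
        "frequency_max": max(freqs),
    }


# Three windows, 4 seconds apart, covering a 0.01 to 100 Hz band.
times = [datetime(2023, 1, 1, 0, 0, s) for s in (0, 4, 8)]
freqs = [0.01, 50.0, 100.0]
meta = summarize_coordinates(times, freqs, "Ex")
print(meta["sample_rate_window_step"], meta["frequency_min"], meta["frequency_max"])
```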
- class mth5.groups.MasterFeaturesGroup(group: h5py.Group, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Master group container for features associated with Fourier Coefficients or time series.
This class manages the top-level organization of geophysical feature data, organizing it into feature-specific groups. Features can include various frequency or time-domain analyses.
Hierarchy
MasterFeaturesGroup -> FeatureGroup -> FeatureRunGroup ->
FC: FeatureDecimationGroup -> FeatureChannelDataset
Time Series: FeatureChannelDataset
- Parameters:
group (h5py.Group) – HDF5 group object for this MasterFeaturesGroup.
**kwargs – Additional keyword arguments passed to BaseGroup.
Examples
>>> import h5py
>>> from mth5.groups.features import MasterFeaturesGroup
>>> with h5py.File('data.h5', 'r') as f:
...     master = MasterFeaturesGroup(f['features'])
...     feature_list = master.groups_list
- add_feature_group(feature_name: str, feature_metadata: mt_metadata.features.FeatureDecimationChannel | None = None) FeatureGroup[source]
Add a feature group to the master features container.
Creates a new FeatureGroup with the specified name and optional metadata. Feature groups organize all runs and decimation levels for a particular feature.
- Parameters:
feature_name (str) – Name for the feature group. Will be validated and formatted.
feature_metadata (FeatureDecimationChannel, optional) – Metadata describing the feature. Default is None.
- Returns:
Newly created feature group object.
- Return type:
FeatureGroup
Examples
>>> master = MasterFeaturesGroup(h5_group)
>>> feature = master.add_feature_group('coherency')
>>> print(feature.name)
coherency
- get_feature_group(feature_name: str) FeatureGroup[source]
Retrieve a feature group by name.
- Parameters:
feature_name (str) – Name of the feature group to retrieve.
- Returns:
The requested feature group.
- Return type:
FeatureGroup
- Raises:
MTH5Error – If the feature group does not exist.
Examples
>>> master = MasterFeaturesGroup(h5_group)
>>> feature = master.get_feature_group('coherency')
>>> print(feature.name)
coherency
- remove_feature_group(feature_name: str) None[source]
Remove a feature group from the master container.
Deletes the specified feature group and its associated data from the HDF5 file. Note that this operation removes the reference but does not reduce the file size; copy desired data to a new file for size reduction.
- Parameters:
feature_name (str) – Name of the feature group to remove.
- Raises:
MTH5Error – If the feature group does not exist.
Examples
>>> master = MasterFeaturesGroup(h5_group)
>>> master.remove_feature_group('coherency')
- class mth5.groups.FeatureGroup(group: h5py.Group, feature_metadata: object | None = None, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Container for a single feature set with all associated runs and decimation levels.
This class manages feature-specific data including all processing runs and decimation levels. Features can include both Fourier Coefficient and time series data.
Hierarchy
FeatureGroup -> FeatureRunGroup ->
FC: FeatureDecimationGroup -> FeatureChannelDataset
TS: FeatureChannelDataset
- Parameters:
group (h5py.Group) – HDF5 group object for this FeatureGroup.
feature_metadata (optional) – Metadata specific to this feature. Should include description and parameters.
**kwargs – Additional keyword arguments passed to BaseGroup.
Notes
Feature metadata should be specific to the feature and include descriptions of the feature and any parameters used in its computation.
Examples
>>> feature = FeatureGroup(h5_group, feature_metadata=metadata)
>>> run_group = feature.add_feature_run_group('run_1', domain='fc')
- add_feature_run_group(feature_name: str, feature_run_metadata: object | None = None, domain: str = 'fc') object[source]
Add a feature run group for a single feature.
Creates either a Fourier Coefficient run group or a time series run group based on the specified domain. The domain can be determined from the metadata or explicitly provided.
- Parameters:
feature_name (str) – Name for the feature run group.
feature_run_metadata (optional) – Metadata for the feature run. If provided, domain is extracted from metadata.domain attribute. Default is None.
domain (str, default='fc') –
Domain type for the data. Must be one of:
’fc’, ‘frequency’, ‘fourier’, ‘fourier_domain’: Fourier Coefficients
’ts’, ‘time’, ‘time series’, ‘time_series’: Time series
- Returns:
Newly created feature run group.
- Return type:
- Raises:
ValueError – If domain is not recognized.
AttributeError – If metadata does not have a domain attribute when metadata is provided.
Examples
>>> feature = FeatureGroup(h5_group)
>>> fc_run = feature.add_feature_run_group('processing_run_1', domain='fc')
>>> ts_run = feature.add_feature_run_group('ts_analysis', domain='ts')
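The accepted domain strings reduce to two cases, which the sketch below normalizes to 'fc' or 'ts'. normalize_domain is a hypothetical helper; the alias sets are copied from the parameter description, while the case-insensitive matching and whitespace stripping are assumptions of this sketch.

```python
FC_ALIASES = {"fc", "frequency", "fourier", "fourier_domain"}
TS_ALIASES = {"ts", "time", "time series", "time_series"}


def normalize_domain(domain):
    """Map any accepted domain alias to 'fc' or 'ts'."""
    key = domain.strip().lower()
    if key in FC_ALIASES:
        return "fc"
    if key in TS_ALIASES:
        return "ts"
    # The documented behavior raises ValueError for unknown domains.
    raise ValueError(f"domain {domain!r} is not recognized")


print(normalize_domain("fourier_domain"))
print(normalize_domain("time series"))
```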
- get_feature_run_group(feature_name: str, domain: str = 'frequency') object[source]
Retrieve a feature run group by name and domain type.
- Parameters:
feature_name (str) – Name of the feature run group to retrieve.
domain (str, default='frequency') –
Domain type. Must be one of:
’fc’, ‘frequency’, ‘fourier’, ‘fourier_domain’: Fourier Coefficients
’ts’, ‘time’, ‘time series’, ‘time_series’: Time series
- Returns:
The requested feature run group.
- Return type:
- Raises:
ValueError – If domain is not recognized.
MTH5Error – If the feature run group does not exist.
Examples
>>> feature = FeatureGroup(h5_group)
>>> fc_run = feature.get_feature_run_group('processing_run_1', domain='fc')
- remove_feature_run_group(feature_name: str) None[source]
Remove a feature run group.
Deletes the specified feature run group and all its associated data. Note that deletion removes the reference but does not reduce HDF5 file size.
- Parameters:
feature_name (str) – Name of the feature run group to remove.
- Raises:
MTH5Error – If the feature run group does not exist.
Examples
>>> feature = FeatureGroup(h5_group)
>>> feature.remove_feature_run_group('processing_run_1')
- class mth5.groups.FeatureTSRunGroup(group: h5py.Group, feature_run_metadata: object | None = None, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Container for time series features from a processing or analysis run.
This class wraps a RunGroup to manage time series data features while maintaining compatibility with the feature hierarchy structure.
- Parameters:
group (h5py.Group) – HDF5 group object for this FeatureTSRunGroup.
feature_run_metadata (optional) – Metadata for the feature run (same type as timeseries.Run).
**kwargs – Additional keyword arguments passed to BaseGroup.
Notes
This class uses methods from RunGroup for channel management, which may have performance implications due to multiple RunGroup instantiations.
Examples
>>> ts_run = FeatureTSRunGroup(h5_group, feature_run_metadata=metadata)
>>> channel = ts_run.add_feature_channel('Ex', 'electric', data)
- add_feature_channel(channel_name: str, channel_type: str, data: numpy.ndarray | None = None, channel_dtype: str = 'int32', shape: tuple | None = None, max_shape: tuple = (None,), chunks: bool = True, channel_metadata: object | None = None, **kwargs) object[source]
Add a time series channel to the feature run group.
Creates a new channel for time series data with the specified properties and optional metadata. Channel metadata should be a timeseries.Channel object.
- Parameters:
channel_name (str) – Name for the channel.
channel_type (str) – Type of channel (e.g., ‘electric’, ‘magnetic’).
data (np.ndarray, optional) – Initial data for the channel. Default is None.
channel_dtype (str, default='int32') – Data type for the channel.
shape (tuple, optional) – Shape of the channel data. Default is None.
max_shape (tuple, default=(None,)) – Maximum shape for expandable dimensions.
chunks (bool, default=True) – Whether to use chunking for the dataset.
channel_metadata (optional) – Metadata object (timeseries.Channel type). Default is None.
**kwargs – Additional keyword arguments for dataset creation.
- Returns:
Channel object from RunGroup.
- Return type:
object
Examples
>>> import numpy as np
>>> ts_run = FeatureTSRunGroup(h5_group)
>>> channel = ts_run.add_feature_channel(
...     'Ex', 'electric', data=np.arange(1000))
- get_feature_channel(channel_name: str) object[source]
Retrieve a feature channel by name.
- Parameters:
channel_name (str) – Name of the channel to retrieve.
- Returns:
Channel object from RunGroup.
- Return type:
object
- Raises:
MTH5Error – If the channel does not exist.
Examples
>>> ts_run = FeatureTSRunGroup(h5_group)
>>> channel = ts_run.get_feature_channel('Ex')
- remove_feature_channel(channel_name: str) None[source]
Remove a feature channel from the run group.
- Parameters:
channel_name (str) – Name of the channel to remove.
- Raises:
MTH5Error – If the channel does not exist.
Examples
>>> ts_run = FeatureTSRunGroup(h5_group)
>>> ts_run.remove_feature_channel('Ex')
- class mth5.groups.FeatureFCRunGroup(group: h5py.Group, feature_run_metadata: mt_metadata.processing.fourier_coefficients.decimation.Decimation | None = None, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Container for Fourier Coefficient features from a processing run.
This class manages Fourier Coefficient data organized by decimation levels, each containing multiple frequency channels with time-frequency data.
Hierarchy
FeatureFCRunGroup -> FeatureDecimationGroup -> FeatureChannelDataset
- metadata[source]
Metadata including:
list of decimation levels
start time (earliest)
end time (latest)
method (fft, wavelet, …)
list of channels used
starting sample rate
bands used
type (TS or FC)
- Type:
Decimation
- Parameters:
group (h5py.Group) – HDF5 group object for this FeatureFCRunGroup.
feature_run_metadata (optional) – Decimation metadata for the feature run. Default is None.
**kwargs – Additional keyword arguments passed to BaseGroup.
Examples
>>> fc_run = FeatureFCRunGroup(h5_group, feature_run_metadata=metadata)
>>> decimation = fc_run.add_decimation_level('level_0', dec_metadata)
- metadata() mt_metadata.processing.fourier_coefficients.decimation.Decimation[source]
Override the metadata getter so that the returned run metadata includes channel information.
- property decimation_level_summary: pandas.DataFrame
Get a summary of all decimation levels in the run.
Returns a pandas DataFrame with information about each decimation level including decimation factor, time range, and HDF5 reference.
- Returns:
DataFrame with columns:
name (str) – Decimation level name
start (datetime64[ns]) – Start time of the decimation level
end (datetime64[ns]) – End time of the decimation level
hdf5_reference (h5py.ref_dtype) – HDF5 reference to the decimation level group
- Return type:
pd.DataFrame
Examples
>>> fc_run = FeatureFCRunGroup(h5_group)
>>> summary = fc_run.decimation_level_summary
>>> print(summary[['name', 'start', 'end']])
- add_decimation_level(decimation_level_name: str, feature_decimation_level_metadata: object | None = None) FeatureDecimationGroup[source]
Add a decimation level group to the feature run.
- Parameters:
decimation_level_name (str) – Name for the decimation level.
feature_decimation_level_metadata (optional) – Metadata for the decimation level. Default is None.
- Returns:
Newly created decimation level group.
- Return type:
FeatureDecimationGroup
Examples
>>> fc_run = FeatureFCRunGroup(h5_group)
>>> decimation = fc_run.add_decimation_level('level_0', dec_metadata)
>>> print(decimation.name)
level_0
- get_decimation_level(decimation_level_name: str) FeatureDecimationGroup[source]
Retrieve a decimation level group by name.
- Parameters:
decimation_level_name (str) – Name of the decimation level to retrieve.
- Returns:
The requested decimation level group.
- Return type:
FeatureDecimationGroup
- Raises:
MTH5Error – If the decimation level does not exist.
Examples
>>> fc_run = FeatureFCRunGroup(h5_group)
>>> decimation = fc_run.get_decimation_level('level_0')
- remove_decimation_level(decimation_level_name: str) None[source]
Remove a decimation level from the feature run.
- Parameters:
decimation_level_name (str) – Name of the decimation level to remove.
- Raises:
MTH5Error – If the decimation level does not exist.
Examples
>>> fc_run = FeatureFCRunGroup(h5_group)
>>> fc_run.remove_decimation_level('level_0')
- class mth5.groups.FeatureDecimationGroup(group: h5py.Group, decimation_level_metadata: object | None = None, **kwargs)[source]
Bases:
mth5.groups.BaseGroup
Container for a single decimation level with multiple Fourier Coefficient channels.
This class manages Fourier Coefficient data organized by frequency, time, and channel. Data is assumed to be uniformly sampled in both frequency and time domains.
Hierarchy
FeatureDecimationGroup -> FeatureChannelDataset (multiple channels)
Data Assumptions
Data are uniformly sampled in frequency domain
Data are uniformly sampled in time domain
FFT moving window has uniform step size
- start time
Start time of the decimation level
- Type:
datetime
- end time
End time of the decimation level
- Type:
datetime
- channels
List of channel names in this decimation level
- Type:
list
- decimation_factor
Factor by which data was decimated
- Type:
int
- decimation_level
Level index in decimation hierarchy
- Type:
int
- decimation_sample_rate
Sample rate after decimation (Hz)
- Type:
float
- method
Method used (FFT, wavelet, etc.)
- Type:
str
- anti_alias_filter
Anti-aliasing filter used
- Type:
optional
- prewhitening_type
Type of prewhitening applied
- Type:
optional
- harmonics_kept
Harmonic indices kept in the data
- Type:
list or ‘all’
- window
Window parameters (length, overlap, type, sample rate)
- Type:
dict
- bands
Frequency bands in the data
- Type:
list
- Parameters:
group (h5py.Group) – HDF5 group object for this FeatureDecimationGroup.
decimation_level_metadata (optional) – Metadata for the decimation level. Default is None.
**kwargs – Additional keyword arguments passed to BaseGroup.
Examples
>>> decimation = FeatureDecimationGroup(h5_group, metadata)
>>> channel = decimation.add_channel('Ex', fc_data=fc_array, fc_metadata=ch_metadata)
- property channel_summary: pandas.DataFrame
Get a summary of all channels in this decimation level.
Returns a pandas DataFrame with detailed information about each Fourier Coefficient channel including time ranges, dimensions, and sampling rates.
- Returns:
DataFrame with columns:
name (str) – Channel name
start (datetime64[ns]) – Start time of the channel data
end (datetime64[ns]) – End time of the channel data
n_frequency (int64) – Number of frequency bins
n_windows (int64) – Number of time windows
sample_rate_decimation_level (float64) – Decimation level sample rate (Hz)
sample_rate_window_step (float64) – Sample rate of window stepping (Hz)
units (str) – Physical units of the data
hdf5_reference (h5py.ref_dtype) – HDF5 reference to the channel dataset
- Return type:
pd.DataFrame
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> summary = decimation.channel_summary
>>> print(summary[['name', 'n_frequency', 'n_windows']])
- from_dataframe(df: pandas.DataFrame, channel_key: str, time_key: str = 'time', frequency_key: str = 'frequency') None[source]
Load Fourier Coefficient data from a pandas DataFrame.
Assumes the channel_key column contains complex coefficient values organized with time and frequency dimensions.
- Parameters:
df (pd.DataFrame) – Input DataFrame containing the coefficient data.
channel_key (str) – Name of the column containing coefficient values.
time_key (str, default='time') – Name of the time coordinate column.
frequency_key (str, default='frequency') – Name of the frequency coordinate column.
- Raises:
TypeError – If df is not a pandas DataFrame.
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.from_dataframe(df, channel_key='Ex', time_key='time')
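One way to picture the input that from_dataframe describes is a long-form table with one row per (time, frequency) pair. The sketch below folds such a table into a 2D array using plain pandas and NumPy. It assumes a complete, duplicate-free grid and is not the MTH5 implementation; the column names follow the method's defaults, while `fc_array` is an illustrative name.

```python
import numpy as np
import pandas as pd

times = pd.to_datetime(["2023-01-01T00:00:00", "2023-01-01T00:00:10"])
freqs = [0.1, 0.2, 0.3]

# Long-form table: one row per (time, frequency) pair, complex values in "Ex".
df = pd.DataFrame(
    {
        "time": times.repeat(len(freqs)),
        "frequency": np.tile(freqs, len(times)),
        "Ex": np.arange(6) + 1j * np.arange(6),
    }
)

# Sort so rows run time-major, then fold into (n_times, n_frequencies).
ordered = df.sort_values(["time", "frequency"])
n_times = ordered["time"].nunique()
n_freqs = ordered["frequency"].nunique()
fc_array = ordered["Ex"].to_numpy().reshape(n_times, n_freqs)
```

If the grid had gaps, the reshape would fail or misalign values, so a real loader would need to validate or fill missing pairs first.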
- from_xarray(data_array: xarray.DataArray | xarray.Dataset, sample_rate_decimation_level: float) None[source]
Load Fourier Coefficient data from an xarray DataArray or Dataset.
Automatically extracts metadata (time, frequency, units) from the xarray object and creates appropriate FeatureChannelDataset instances for each variable or the single DataArray.
- Parameters:
data_array (xr.DataArray or xr.Dataset) – Input xarray object with ‘time’ and ‘frequency’ coordinates and dimensions [‘time’, ‘frequency’] (or transposed variant).
sample_rate_decimation_level (float) – Sample rate of the decimation level (Hz).
- Raises:
TypeError – If data_array is not an xarray Dataset or DataArray.
Notes
Automatically handles both (time, frequency) and (frequency, time) dimension ordering. Units are extracted from xarray attributes if available.
Examples
>>> import xarray as xr
>>> import numpy as np
>>> decimation = FeatureDecimationGroup(h5_group)
Create sample xarray data:
>>> times = np.arange('2023-01-01', '2023-01-02', dtype='datetime64[s]')
>>> freqs = np.linspace(0.01, 100, 256)
>>> data_array = np.random.randn(len(times), len(freqs)) + \
...     1j * np.random.randn(len(times), len(freqs))
>>> xr_data = xr.DataArray(
...     data_array,
...     dims=['time', 'frequency'],
...     coords={'time': times, 'frequency': freqs},
...     name='Ex',
...     attrs={'units': 'mV/km'}
... )
Load into decimation group:
>>> decimation.from_xarray(xr_data, sample_rate_decimation_level=0.5)
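The dimension-order handling mentioned in the Notes can be sketched with NumPy alone. This is an illustration under assumed names (`as_time_frequency` is hypothetical), not necessarily how MTH5 does it: the axis names decide whether a transpose is needed, so the stored layout is always (time, frequency).

```python
import numpy as np


def as_time_frequency(values, dims):
    """Return a 2D array ordered (time, frequency).

    `dims` names the axes of `values`, for example ("frequency", "time").
    Hypothetical helper for illustration only.
    """
    values = np.asarray(values)
    if values.ndim != 2 or len(dims) != 2 or set(dims) != {"time", "frequency"}:
        raise ValueError(f"expected 2D data with dims 'time' and 'frequency', got {dims}")
    # Axis positions of time and frequency in the input, in the target order.
    order = [dims.index("time"), dims.index("frequency")]
    return np.transpose(values, order)


fc = np.zeros((256, 100), dtype=complex)  # stored as (frequency, time)
fixed = as_time_frequency(fc, ("frequency", "time"))
unchanged = as_time_frequency(fixed, ("time", "frequency"))
```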
- to_xarray(channels: list | None = None) xarray.Dataset[source]
Create an xarray Dataset from Fourier Coefficient channels.
If no channels are specified, all channels in the decimation level are included. Each channel becomes a data variable in the resulting Dataset.
- Parameters:
channels (list, optional) – List of channel names to include. If None, all channels are used. Default is None.
- Returns:
xarray Dataset with channels as data variables and ‘time’ and ‘frequency’ as shared coordinates.
- Return type:
xr.Dataset
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> xr_data = decimation.to_xarray()
>>> print(xr_data.data_vars)
Data variables:
    Ex       (time, frequency) complex128
    Ey       (time, frequency) complex128
Get specific channels:
>>> subset = decimation.to_xarray(channels=['Ex', 'Ey'])
- from_numpy_array(nd_array: numpy.ndarray, ch_name: str | list) None[source]
Load Fourier Coefficient data from a numpy array.
Assumes array shape is either (n_frequencies, n_windows) for a single channel or (n_channels, n_frequencies, n_windows) for multiple channels.
- Parameters:
nd_array (np.ndarray) – Input numpy array containing coefficient data.
ch_name (str or list) – Channel name (for 2D array) or list of channel names (for 3D array).
- Raises:
TypeError – If nd_array is not a numpy ndarray.
ValueError – If array shape is not (n_frequencies, n_windows) or (n_channels, n_frequencies, n_windows).
Examples
>>> import numpy as np
>>> decimation = FeatureDecimationGroup(h5_group)
Load single channel:
>>> data_2d = np.random.randn(256, 100) + 1j * np.random.randn(256, 100)
>>> decimation.from_numpy_array(data_2d, ch_name='Ex')
Load multiple channels:
>>> data_3d = np.random.randn(2, 256, 100) + 1j * np.random.randn(2, 256, 100)
>>> decimation.from_numpy_array(data_3d, ch_name=['Ex', 'Ey'])
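The shape rules documented for from_numpy_array can be mirrored in a few lines. The helper below is a hypothetical stand-in (`split_channels` is not an MTH5 function) that applies the same TypeError and ValueError conditions and returns one 2D array per channel name. Requiring a plain string name for a 2D array is an assumption of this sketch.

```python
import numpy as np


def split_channels(nd_array, ch_name):
    """Map channel names to 2D (n_frequencies, n_windows) arrays.

    Hypothetical helper that mirrors the documented checks.
    """
    if not isinstance(nd_array, np.ndarray):
        raise TypeError(f"expected numpy.ndarray, got {type(nd_array).__name__}")
    if nd_array.ndim == 2:
        # Assumption: a single channel is named by a plain string.
        if not isinstance(ch_name, str):
            raise ValueError("a 2D array needs a single channel name")
        return {ch_name: nd_array}
    if nd_array.ndim == 3:
        names = [ch_name] if isinstance(ch_name, str) else list(ch_name)
        if len(names) != nd_array.shape[0]:
            raise ValueError(
                f"{len(names)} names given for {nd_array.shape[0]} channels"
            )
        # Iterating a 3D array yields one 2D slice per channel.
        return dict(zip(names, nd_array))
    raise ValueError(
        "shape must be (n_frequencies, n_windows) or "
        "(n_channels, n_frequencies, n_windows)"
    )


single = split_channels(np.zeros((256, 100), dtype=complex), "Ex")
multi = split_channels(np.zeros((2, 256, 100), dtype=complex), ["Ex", "Ey"])
```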
- add_channel(fc_name: str, fc_data: numpy.ndarray | xarray.DataArray | xarray.Dataset | pandas.DataFrame | None = None, fc_metadata: mt_metadata.features.FeatureDecimationChannel | None = None, max_shape: tuple = (None, None), chunks: bool = True, dtype: type = complex, **kwargs) mth5.groups.FeatureChannelDataset[source]
Add a Fourier Coefficient channel to the decimation level.
Creates a new FeatureChannelDataset for a single channel at a single decimation level. Input data can be provided as numpy array, xarray, DataFrame, or created empty.
- Parameters:
fc_name (str) – Name for the Fourier Coefficient channel.
fc_data (np.ndarray, xr.DataArray, xr.Dataset, pd.DataFrame, optional) – Input data. Can be numpy array (time, frequency) or xarray/DataFrame format. Default is None (creates empty dataset).
fc_metadata (FeatureDecimationChannel, optional) – Metadata for the channel. Default is None.
max_shape (tuple, default=(None, None)) – Maximum shape for HDF5 dataset dimensions (expandable if None).
chunks (bool, default=True) – Whether to use HDF5 chunking.
dtype (type, default=complex) – Data type for the dataset (e.g., complex, float, int).
**kwargs – Additional keyword arguments for HDF5 dataset creation.
- Returns:
Newly created FeatureChannelDataset object.
- Return type:
- Raises:
TypeError – If fc_data type is not supported or metadata type mismatch.
RuntimeError or OSError – If channel already exists (will return existing channel).
Notes
Data layout assumes (time, frequency) organization:
time index: window start times
frequency index: harmonic indices or float values
data: complex Fourier coefficients
Examples
>>> import numpy as np
>>> from mt_metadata.features import FeatureDecimationChannel
>>> decimation = FeatureDecimationGroup(h5_group)
>>> metadata = FeatureDecimationChannel(name='Ex')
Create from numpy array:
>>> fc_data = np.random.randn(100, 256) + 1j * np.random.randn(100, 256)
>>> channel = decimation.add_channel('Ex', fc_data=fc_data, fc_metadata=metadata)
Create empty channel (expandable):
>>> channel = decimation.add_channel('Ex', fc_metadata=metadata)
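The max_shape, chunks, and dtype parameters correspond to standard HDF5 dataset options. The sketch below uses h5py directly, not the MTH5 wrapper, to show what an expandable complex dataset looks like. The file name, group name, and the (1, 1) starting shape are arbitrary choices for illustration.

```python
import h5py
import numpy as np

# In-memory HDF5 file so the sketch leaves nothing on disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    level = f.create_group("level_0")
    ex = level.create_dataset(
        "Ex",
        shape=(1, 1),            # placeholder size until real data arrives
        maxshape=(None, None),   # None means the axis may grow without limit
        chunks=True,             # let h5py choose a chunk layout
        dtype=complex,
    )

    # Later: grow the dataset to fit (time, frequency) data and write it.
    fc = np.ones((100, 256), dtype=complex)
    ex.resize(fc.shape)
    ex[...] = fc

    final_shape = ex.shape
    final_maxshape = ex.maxshape
    first_value = ex[0, 0]
```

Resizing is only possible on chunked datasets, which is why an expandable channel needs chunking enabled.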
- get_channel(fc_name: str) mth5.groups.FeatureChannelDataset[source]
Retrieve a Fourier Coefficient channel by name.
- Parameters:
fc_name (str) – Name of the channel to retrieve.
- Returns:
The requested FeatureChannelDataset object.
- Return type:
mth5.groups.FeatureChannelDataset
- Raises:
MTH5Error – If the channel does not exist.
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> channel = decimation.get_channel('Ex')
>>> data = channel.to_numpy()
- remove_channel(fc_name: str) None[source]
Remove a Fourier Coefficient channel from the decimation level.
Deletes the channel from the HDF5 file. Note that this removes the reference but does not reduce file size.
- Parameters:
fc_name (str) – Name of the channel to remove.
- Raises:
MTH5Error – If the channel does not exist.
Notes
To reduce HDF5 file size, copy desired data to a new file.
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.remove_channel('Ex')
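The advice in the Notes of remove_channel, copying the data you want to keep into a new file, can be sketched with h5py alone. The `repack` helper and the file names are hypothetical, and the sketch copies only top-level objects and root attributes. A real MTH5 file has a deeper hierarchy, but `copy` carries whole groups along recursively.

```python
import os
import tempfile

import h5py
import numpy as np


def repack(src_path, dst_path):
    """Copy every object still linked at the root of src_path into a new file."""
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        for key, value in src.attrs.items():
            dst.attrs[key] = value
        for name in src:
            # Group.copy duplicates datasets and whole groups recursively.
            src.copy(name, dst)


with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.h5")
    new_path = os.path.join(tmp, "new.h5")

    with h5py.File(old_path, "w") as f:
        f["Ex"] = np.zeros((100, 256), dtype=complex)
        f["Ey"] = np.zeros((100, 256), dtype=complex)
        del f["Ex"]  # removes the link; the file is not guaranteed to shrink

    repack(old_path, new_path)

    with h5py.File(new_path, "r") as f:
        remaining = sorted(f.keys())
        ey_shape = f["Ey"].shape
```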
- update_metadata() None[source]
Update metadata from all channels in the decimation level.
Scans all channels and updates the decimation-level metadata with aggregated information including time ranges and sampling rates.
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.update_metadata()
- add_weights(weight_name: str, weight_data: numpy.ndarray | None = None, weight_metadata: object | None = None, max_shape: tuple = (None, None, None), chunks: bool = True, **kwargs) None[source]
Add weight or masking data for Fourier Coefficients.
Creates a dataset to store weights or masks for quality control, frequency band selection, or time window filtering.
- Parameters:
weight_name (str) – Name for the weight dataset.
weight_data (np.ndarray, optional) – Weight values. Default is None.
weight_metadata (optional) – Metadata for the weight dataset. Default is None.
max_shape (tuple, default=(None, None, None)) – Maximum shape for expandable dimensions.
chunks (bool, default=True) – Whether to use HDF5 chunking.
**kwargs – Additional keyword arguments for HDF5 dataset creation.
Notes
Weight datasets can track:
weight_channel: Per-channel weights
weight_band: Per-frequency-band weights
weight_time: Per-time-window weights
This method is a placeholder for future implementation.
Examples
>>> decimation = FeatureDecimationGroup(h5_group)
>>> decimation.add_weights('coherency_weights', weight_data=weights)
- class mth5.groups.MasterSurveyGroup(group: h5py.Group, **kwargs: Any)[source]
Bases: mth5.groups.BaseGroup
Collection helper for surveys under Experiment/Surveys.
Provides helpers to add, fetch, or remove surveys and to summarize all channels in the experiment.
Examples
>>> from mth5 import mth5
>>> m5 = mth5.MTH5()
>>> _ = m5.open_mth5("/tmp/example.mth5", mode="a")
>>> surveys = m5.surveys_group
>>> _ = surveys.add_survey("survey_01")
>>> surveys.channel_summary.head()
- property channel_summary: pandas.DataFrame
Return a DataFrame summarizing all channels across surveys.
- Returns:
Columns include survey, station, run, location, component, start/end, sample info, orientation, units, and HDF5 reference.
- Return type:
pandas.DataFrame
Examples
>>> summary = surveys.channel_summary
>>> set(summary.columns) >= {"survey", "station", "run", "component"}
True
- add_survey(survey_name: str, survey_metadata: mt_metadata.timeseries.Survey | None = None) SurveyGroup[source]
Add or fetch a survey at /Experiment/Surveys/<name>.
- Parameters:
survey_name (str) – Survey identifier; validated with validate_name.
survey_metadata (Survey, optional) – Metadata container used to seed the survey attributes.
- Returns:
Wrapper for the created or existing survey.
- Return type:
SurveyGroup
- Raises:
ValueError – If survey_name is empty.
MTH5Error – If the provided metadata id conflicts with the group name.
Examples
>>> survey = surveys.add_survey("survey_01")
>>> survey.metadata.id
'survey_01'
- get_survey(survey_name: str) SurveyGroup[source]
Return an existing survey by name.
- Parameters:
survey_name (str) – Existing survey name.
- Returns:
Wrapper for the requested survey.
- Return type:
SurveyGroup
- Raises:
MTH5Error – If the survey does not exist.
Examples
>>> existing = surveys.get_survey("survey_01")
>>> existing.metadata.id
'survey_01'
- class mth5.groups.SurveyGroup(group: h5py.Group, survey_metadata: mt_metadata.timeseries.Survey | None = None, **kwargs: Any)[source]
Bases: mth5.groups.BaseGroup
Wrapper for a single survey at Experiment/Surveys/<id>.
Handles survey-level metadata, child groups (stations, reports, filters, standards), and synchronization utilities.
Examples
>>> survey = surveys.add_survey("survey_01")
>>> survey.metadata.id
'survey_01'
- initialize_group(**kwargs: Any) None[source]
Create default subgroups and write survey metadata.
- Parameters:
**kwargs – Additional attributes to set on the instance before initialization.
Examples
>>> survey.initialize_group()
- metadata() mt_metadata.timeseries.Survey[source]
Survey metadata enriched with station and filter information.
- property stations_group: mth5.groups.MasterStationGroup
- property filters_group: mth5.groups.FiltersGroup
Convenience accessor for the /Survey/Filters group.
- property reports_group: mth5.groups.ReportsGroup
Convenience accessor for the /Survey/Reports group.
- property standards_group: mth5.groups.StandardsGroup
Convenience accessor for the /Survey/Standards group.
- update_survey_metadata(survey_dict: dict[str, Any] | None = None) None[source]
Deprecated alias for update_metadata().
- Raises:
DeprecationWarning – Always raised to direct callers to update_metadata.
Examples
>>> survey.update_survey_metadata()
Traceback (most recent call last):
...
DeprecationWarning: 'update_survey_metadata' has been deprecated use 'update_metadata()'
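Because the DeprecationWarning is raised as an exception rather than issued through the warnings machinery, a call to the old name stops unless the caller handles it. A minimal sketch of that pattern follows, using a hypothetical `SurveySketch` class in place of the real SurveyGroup.

```python
class SurveySketch:
    """Hypothetical stand-in for a group class; not part of MTH5."""

    def __init__(self):
        self.synced = False

    def update_metadata(self, survey_dict=None):
        # Placeholder for the real synchronization work.
        self.synced = True

    def update_survey_metadata(self, survey_dict=None):
        # Deprecated name: always raises so callers notice and migrate.
        raise DeprecationWarning(
            "'update_survey_metadata' has been deprecated use 'update_metadata()'"
        )


survey_sketch = SurveySketch()
try:
    survey_sketch.update_survey_metadata()
except DeprecationWarning:
    # Fall back to the supported method.
    survey_sketch.update_metadata()
```

New code should call update_metadata() directly and skip the try/except entirely.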
- update_metadata(survey_dict: dict[str, Any] | None = None) None[source]
Synchronize survey metadata from station summaries.
- Parameters:
survey_dict (dict, optional) – Additional metadata values to merge before synchronization.
Notes
Updates survey start/end dates and bounding box from station summaries, then writes metadata to HDF5.
Examples
>>> _ = survey.update_metadata()
>>> survey.metadata.time_period.start_date
'2020-01-01'
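The synchronization described in the Notes for update_metadata amounts to a min/max reduction over station records. The sketch below shows that reduction in plain Python. `StationSummary`, `survey_extent`, and the field names are hypothetical and do not reflect the actual MTH5 station summary schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StationSummary:
    """Hypothetical per-station record; field names are illustrative."""

    start: datetime
    end: datetime
    latitude: float
    longitude: float


def survey_extent(stations):
    """Aggregate station records into survey-level dates and a bounding box."""
    if not stations:
        raise ValueError("need at least one station")
    return {
        # Survey spans from the earliest station start to the latest end.
        "start_date": min(s.start for s in stations).date().isoformat(),
        "end_date": max(s.end for s in stations).date().isoformat(),
        # Bounding box as (min, max) pairs.
        "latitude": (
            min(s.latitude for s in stations),
            max(s.latitude for s in stations),
        ),
        "longitude": (
            min(s.longitude for s in stations),
            max(s.longitude for s in stations),
        ),
    }


extent = survey_extent(
    [
        StationSummary(datetime(2020, 1, 1), datetime(2020, 1, 15), 40.0, -115.5),
        StationSummary(datetime(2020, 1, 10), datetime(2020, 2, 1), 41.25, -116.0),
    ]
)
```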
- class mth5.groups.ExperimentGroup(group, **kwargs)[source]
Bases: mth5.groups.BaseGroup
Utility class to hold general information about the experiment and accompanying metadata for an MT experiment.
To access the hdf5 group directly use ExperimentGroup.hdf5_group.
>>> experiment = ExperimentGroup(hdf5_group)
>>> experiment.hdf5_group.ref
<HDF5 Group Reference>
Note
All attributes should be set through the metadata object so that every input is validated against the metadata standards. If you change attributes in the metadata object, run the ExperimentGroup.write_metadata() method afterwards. This is a temporary solution; an automatic updater that reacts to metadata changes is in progress.
>>> experiment.metadata.existing_attribute = 'update_existing_attribute'
>>> experiment.write_metadata()
If you want to add a new attribute this should be done using the metadata.add_base_attribute method.
>>> experiment.metadata.add_base_attribute('new_attribute',
...                                        'new_attribute_value',
...                                        {'type':str,
...                                         'required':True,
...                                         'style':'free form',
...                                         'description': 'new attribute desc.',
...                                         'units':None,
...                                         'options':[],
...                                         'alias':[],
...                                         'example':'new attribute
Tip
If you want to add stations, reports, etc. to the experiment, this should be done from the MTH5 object. This is to avoid duplication, at least for now.
To look at what the structure of /Experiment looks like:
>>> experiment
/Experiment:
====================
    |- Group: Surveys
    -----------------
    |- Group: Reports
    -----------------
    |- Group: Standards
    -------------------
    |- Group: Stations
    ------------------
- property surveys_group