mth5.tables package
Submodules
mth5.tables.channel_table module
- class mth5.tables.channel_table.ChannelSummaryTable(hdf5_dataset: Dataset)[source]
Bases: MTH5Table
Convenience wrapper around the channel summary dataset.
Provides helpers to summarize channels, convert to pandas, and derive run-level summaries.
Examples
>>> ch_table = ChannelSummaryTable(hdf5_dataset)
>>> df = ch_table.to_dataframe()
>>> run_df = ch_table.to_run_summary()
- to_dataframe() DataFrame[source]
Convert the channel summary to a pandas DataFrame.
- Returns:
Channel summary with decoded string columns and parsed datetimes.
- Return type:
pandas.DataFrame
Examples
>>> df = ch_table.to_dataframe()
>>> df.head()
- to_run_summary(allowed_input_channels: Iterable[str] = ['h2', 'hx', 'hy', 'by', 'bx', 'h1'], allowed_output_channels: Iterable[str] = ['e1', 'ex', 'ey', 'e2', 'e4', 'bz', 'h3', 'hz', 'e3'], sortby: list[str] | None = None) DataFrame[source]
Compress channel summary into a run-level summary (one row per run).
- Parameters:
allowed_input_channels (Iterable[str], optional) – Allowed input channel names, by default ALLOWED_INPUT_CHANNELS.
allowed_output_channels (Iterable[str], optional) – Allowed output channel names, by default ALLOWED_OUTPUT_CHANNELS.
sortby (list of str or None, optional) – Columns to sort by; defaults to ["station", "start"] when None.
- Returns:
Run-level summary including channels, durations, and references.
- Return type:
pandas.DataFrame
Examples
>>> run_df = ch_table.to_run_summary()
>>> run_df.columns[:4].tolist()
['survey', 'station', 'run', 'start']
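The compression from channel rows to run rows can be pictured with a small pure-Python sketch. This is not the library's implementation: the row keys `component`, `end`, `input_channels`, and `output_channels` are assumptions made for illustration, and only the default channel lists and the default sort order are taken from the signature above.

```python
from collections import defaultdict

# Defaults copied from the to_run_summary signature above.
ALLOWED_INPUT_CHANNELS = ["h2", "hx", "hy", "by", "bx", "h1"]
ALLOWED_OUTPUT_CHANNELS = ["e1", "ex", "ey", "e2", "e4", "bz", "h3", "hz", "e3"]


def run_summary(channel_rows, sortby=None):
    """Collapse channel-level rows into one dict per (survey, station, run)."""
    sortby = ["station", "start"] if sortby is None else sortby
    groups = defaultdict(list)
    for row in channel_rows:
        groups[(row["survey"], row["station"], row["run"])].append(row)

    summary = []
    for (survey, station, run), rows in groups.items():
        components = [r["component"] for r in rows]
        summary.append({
            "survey": survey,
            "station": station,
            "run": run,
            # ISO 8601 strings in one format compare correctly as text.
            "start": min(r["start"] for r in rows),
            "end": max(r["end"] for r in rows),
            "input_channels": [c for c in components if c in ALLOWED_INPUT_CHANNELS],
            "output_channels": [c for c in components if c in ALLOWED_OUTPUT_CHANNELS],
        })
    return sorted(summary, key=lambda r: tuple(r[k] for k in sortby))


# Hypothetical channel rows, for illustration only.
rows = [
    {"survey": "s1", "station": "mt02", "run": "001", "component": "ex",
     "start": "2020-01-02T00:00:00", "end": "2020-01-02T06:00:00"},
    {"survey": "s1", "station": "mt01", "run": "001", "component": "hx",
     "start": "2020-01-01T00:00:00", "end": "2020-01-01T06:00:00"},
    {"survey": "s1", "station": "mt01", "run": "001", "component": "ey",
     "start": "2020-01-01T00:00:00", "end": "2020-01-01T05:00:00"},
]
result = run_summary(rows)
```

Three channel rows collapse into two run rows here, sorted by station and then start time, with each channel assigned to the input or output list according to the allowed names.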
mth5.tables.fc_table module
Tabulate Fourier coefficients stored in an MTH5 file.
This module provides a small utility for summarizing Fourier-coefficient datasets (e.g., FCChannel) into a structured table and exporting to a convenient pandas.DataFrame for querying and analysis.
Notes
- A basic test for this module exists under mth5/tests/version_1/test_fcs.py.
- The table is populated by traversing the HDF5 hierarchy and collecting entries for datasets labeled with the attribute mth5_type='FCChannel'.
- class mth5.tables.fc_table.FCSummaryTable(hdf5_dataset: Dataset)[source]
Bases: MTH5Table
Summary table for Fourier coefficients.
This class wraps an HDF5 dataset that stores a summary of Fourier coefficient datasets and provides convenience functions such as summarize() (to populate the table) and to_dataframe() (to export entries).
Examples
Populate and export a summary from an existing MTH5 file:
>>> import h5py
>>> from mth5.tables.fc_table import FCSummaryTable
>>> f = h5py.File('example.mth5', 'r+')  # summarize() writes rows, so open read/write
>>> # Assume the summary dataset already exists at this path
>>> table_ds = f['Exchange']['FC_Summary']
>>> fc_table = FCSummaryTable(table_ds)
>>> fc_table.summarize()  # walk the file and fill entries
>>> df = fc_table.to_dataframe()
>>> df.head()
- summarize() None[source]
Populate the summary table by traversing the HDF5 hierarchy.
The traversal searches for datasets with attribute mth5_type == 'FCChannel' and adds a corresponding summary row for each.
- Return type:
None
Notes
If the table contains rows from a different OS/encoding, row insertion can raise a ValueError. A warning is logged and processing continues for subsequent rows.
Examples
Refresh the table entries:
>>> fc_table.clear_table()
>>> fc_table.summarize()
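The traversal and the warn-and-continue behavior described above can be sketched without HDF5 at all. In this illustration the hierarchy is a nested dictionary standing in for the file, and `find_by_type`, `summarize`, and `add_row` are hypothetical names; the real method walks h5py objects and writes structured rows.

```python
import logging

logger = logging.getLogger("fc_summary_sketch")


def find_by_type(node, wanted, path=""):
    """Yield the path of every node whose attrs['mth5_type'] equals wanted."""
    if node.get("attrs", {}).get("mth5_type") == wanted:
        yield path or "/"
    for name, child in node.get("children", {}).items():
        yield from find_by_type(child, wanted, f"{path}/{name}")


def summarize(tree, add_row):
    """Add one row per FCChannel node; log and skip rows that fail to insert."""
    added = 0
    for path in find_by_type(tree, "FCChannel"):
        try:
            add_row(path)
            added += 1
        except ValueError as error:
            logger.warning("Skipping %s: %s", path, error)
    return added


# Hypothetical hierarchy: one station holding two FC channels.
tree = {
    "attrs": {},
    "children": {
        "mt01": {
            "attrs": {"mth5_type": "Station"},
            "children": {
                "ex": {"attrs": {"mth5_type": "FCChannel"}},
                "hy": {"attrs": {"mth5_type": "FCChannel"}},
            },
        },
    },
}

table = []


def add_row(path):
    # Simulate an insertion failure for one entry.
    if path.endswith("hy"):
        raise ValueError("incompatible encoding")
    table.append(path)


found = list(find_by_type(tree, "FCChannel"))
added = summarize(tree, add_row)
```

Both channels are found, but only one row is added: the failing insertion is logged as a warning and the loop moves on to the next entry.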
- to_dataframe() DataFrame[source]
Convert the table to a pandas.DataFrame for easier querying.
- Returns:
A dataframe with decoded string columns and parsed start/end timestamps.
- Return type:
pandas.DataFrame
Examples
Export to a dataframe and filter by component:
>>> df = fc_table.to_dataframe()
>>> df[df.component == 'ex']
mth5.tables.mth5_table module
MTH5 table utilities.
This module provides the MTH5Table base class which wraps an HDF5 dataset and offers convenience methods for row management, locating entries, and exporting to pandas.DataFrame.
Notes
- Designed as a thin layer on top of NumPy/HDF5; for complex querying, prefer converting to a DataFrame via to_dataframe().
- Datatypes are validated and kept consistent with the underlying dataset.
- class mth5.tables.mth5_table.MTH5Table(hdf5_dataset: Dataset, default_dtype: dtype)[source]
Bases: object
Base wrapper around an HDF5 dataset representing a typed table.
Provides simple NumPy-based operations including row insertion/removal, basic locating utilities, and conversion to pandas.DataFrame.
- Parameters:
hdf5_dataset (h5py.Dataset) – The HDF5 dataset that stores the table.
default_dtype (numpy.dtype) – The default dtype schema for the table entries.
- Raises:
MTH5TableError – If hdf5_dataset is not an instance of h5py.Dataset.
Examples
Create a simple table and add a row:
>>> import h5py, numpy as np
>>> f = h5py.File('example.h5', 'w')
>>> dtype = np.dtype([('name', 'S16'), ('value', 'f8')])
>>> ds = f.create_dataset('table', (1,), maxshape=(None,), dtype=dtype)
>>> from mth5.tables.mth5_table import MTH5Table
>>> t = MTH5Table(ds, dtype)
>>> row = np.array([('alpha'.encode('utf-8'), 1.23)], dtype=dtype)
>>> t.add_row(row)
1
>>> df = t.to_dataframe()
>>> df.head()
- add_row(row: ndarray, index: int | None = None) int[source]
Add a row to the table.
- Parameters:
row (numpy.ndarray) – Row to insert. Must have the same dtype (or same field names, allowing safe casting) as the table.
index (int, optional) – Index at which to insert the row. If None, appends to the end.
- Returns:
Index of the inserted row.
- Return type:
int
- Raises:
TypeError – If row is not a numpy.ndarray.
ValueError – If the dtype is incompatible with the table.
- check_dtypes(other_dtype: dtype) bool[source]
Check that dtypes match the table’s dtype (including field names).
- Parameters:
other_dtype (numpy.dtype) – The dtype to compare against the table’s dtype.
- Returns:
True if the dtypes match; otherwise False.
- Return type:
bool
- clear_table() None[source]
Reset the table by recreating the dataset with a single null row.
Notes
Deletes the current dataset and replaces it with a new dataset with the same compression/options and dtype, but shape (1,).
- locate(column: str, value: Any, test: Literal['eq', 'lt', 'le', 'gt', 'ge', 'be', 'bt'] = 'eq') ndarray[source]
Locate row indices where a column satisfies a comparison.
- Parameters:
column (str) – Name of the column to test.
value (Any) – Value to compare against. For string columns, a str is converted to a numpy.bytes_. For time columns (start, end, start_date, end_date), values are coerced to numpy.datetime64.
test ({'eq','lt','le','gt','ge','be','bt'}, default 'eq') – Type of comparison to perform.
- 'eq': equals
- 'lt': less than
- 'le': less than or equal to
- 'gt': greater than
- 'ge': greater than or equal to
- 'be': strictly between
- 'bt': alias for 'be'
- Returns:
Array of matching row indices.
- Return type:
numpy.ndarray
- Raises:
ValueError – If test is 'be'/'bt' and value is not a 2-length iterable.
Examples
Find rows with value greater than 10:
>>> idx = t.locate('value', 10, test='gt')
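The comparison codes map naturally onto Python's `operator` functions. The following is a minimal sketch of that dispatch over a plain list, assuming the semantics listed above; it is not the library's code, and it skips the string and datetime coercion that the real method performs.

```python
import operator

# Two-sided comparisons keyed by the test codes documented above.
_TESTS = {
    "eq": operator.eq,
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}


def locate(values, value, test="eq"):
    """Return the indices of values that satisfy the requested comparison."""
    if test in ("be", "bt"):
        try:
            low, high = value
        except (TypeError, ValueError):
            raise ValueError("value must be a 2-length iterable for 'be'/'bt'") from None
        # 'be' is documented as strictly between, so both bounds are excluded.
        return [i for i, v in enumerate(values) if low < v < high]
    return [i for i, v in enumerate(values) if _TESTS[test](v, value)]


column = [5, 10, 15, 20]
greater = locate(column, 10, test="gt")
between = locate(column, (5, 20), test="be")
```

Note that the bounds 5 and 20 are themselves excluded from the 'be' result, which is the practical meaning of "strictly between".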
- remove_row(index: int) int[source]
Remove a row by replacing it with a null entry.
- Parameters:
index (int) – Index of the row to remove.
- Returns:
Index that was updated with a null row.
- Return type:
int
- Raises:
IndexError – If the index is out of bounds for the current shape.
Notes
There is no intrinsic index stored within the array; indexing is on-the-fly. Prefer using the HDF5 reference column for robust identification.
The current approach inserts a null row at the specified index.
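Because removal overwrites rather than deletes, the table keeps its length and later rows keep their indices. A small sketch with a plain list shows the effect; `NULL_ROW` is a hypothetical null entry, and rejecting negative indices is an assumption of this sketch rather than documented behavior.

```python
import math

# Hypothetical null entry for a two-field table (name, value).
NULL_ROW = ("", math.nan)


def remove_row(table, index):
    """Overwrite table[index] with a null entry and return the index."""
    if not 0 <= index < len(table):
        raise IndexError(f"index {index} is out of bounds for {len(table)} rows")
    table[index] = NULL_ROW
    return index


table = [("alpha", 1.0), ("beta", 2.0), ("gamma", 3.0)]
removed = remove_row(table, 1)
# The table keeps its length, so "gamma" is still found at index 2.
```

Code that consumes the table should therefore expect null rows and filter them out, rather than assume every row holds data.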
- to_dataframe() DataFrame[source]
Convert the table into a pandas.DataFrame.
- Returns:
DataFrame with decoded string columns where applicable.
- Return type:
pandas.DataFrame
Examples
Convert and preview:
>>> df = t.to_dataframe()
>>> df.head()
- update_dtype(new_dtype: dtype) None[source]
Update the dataset’s dtype while preserving data and field names.
- Parameters:
new_dtype (numpy.dtype) – New dtype to apply. Must have identical field names.
Notes
Performs a manual copy into a new array to avoid unsafe casting errors, then recreates the dataset with the new dtype and same dataset options.
- update_row(entry: ndarray) int[source]
Update a row by locating its index and rewriting the entry.
- Parameters:
entry (numpy.ndarray) – Entry to update, with the same dtype as the table.
- Returns:
Row index that was updated, or the new row index if not found.
- Return type:
int
Notes
Matching by hdf5_reference is not reliable; this uses add_row and will append if the original row cannot be located.
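The update-or-append behavior can be sketched as follows. Matching on a `name` field is an assumption made only for this illustration; the fields the library actually uses to locate the original row are not stated here.

```python
def update_row(table, entry, key="name"):
    """Overwrite the row whose key matches entry, or append if none matches."""
    for index, row in enumerate(table):
        if row[key] == entry[key]:
            table[index] = entry
            return index
    # No match found: fall back to appending and report the new index.
    table.append(entry)
    return len(table) - 1


table = [{"name": "alpha", "value": 1.0}, {"name": "beta", "value": 2.0}]
updated = update_row(table, {"name": "beta", "value": 2.5})
appended = update_row(table, {"name": "gamma", "value": 3.0})
```

The returned index alone does not say whether a row was rewritten or appended, so a caller who needs to know should compare the table length before and after the call.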
mth5.tables.tf_table module
Transfer function summary table utilities.
Summarize TransferFunction groups stored in an MTH5 file into a structured table and provide a convenient pandas.DataFrame view for querying.
Notes
- Traversal searches for groups with attribute mth5_type='transferfunction' and collects basic availability flags (impedance, tipper, covariance) along with period range and references.
- class mth5.tables.tf_table.TFSummaryTable(hdf5_dataset: Dataset)[source]
Bases: MTH5Table
Summary table for TransferFunction groups.
Provides convenience functions to populate the table (summarize) and export to pandas.DataFrame (to_dataframe).
Examples
Build and export a TF summary:
>>> import h5py
>>> from mth5.tables.tf_table import TFSummaryTable
>>> f = h5py.File('example.mth5', 'r+')  # summarize() writes rows, so open read/write
>>> tf_summary_ds = f['Exchange']['TF_Summary']
>>> tf_table = TFSummaryTable(tf_summary_ds)
>>> tf_table.summarize()
>>> df = tf_table.to_dataframe()
>>> df.head()
- summarize() None[source]
Populate the summary table by traversing the HDF5 hierarchy.
Searches for groups where mth5_type equals 'transferfunction' and adds a row indicating available datasets (impedance, tipper, covariance), period min/max, and relevant references.
- Return type:
None
Examples
Refresh the TF summary:
>>> tf_table.clear_table()
>>> tf_table.summarize()
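As a rough picture of what one summary row holds, the sketch below derives the availability flags and period range for a single transfer function group. The function name, the row keys, and the dataset names are hypothetical placeholders; the names actually stored in an MTH5 file may differ.

```python
def tf_summary_row(name, datasets, periods):
    """Build one summary row for a transfer function group."""
    return {
        "tf_id": name,
        # Availability flags: True when the named dataset exists in the group.
        "has_impedance": "impedance" in datasets,
        "has_tipper": "tipper" in datasets,
        "has_covariance": "covariance" in datasets,
        # Period range, or None when the group stores no periods.
        "period_min": min(periods) if periods else None,
        "period_max": max(periods) if periods else None,
    }


row = tf_summary_row("mt01_tf", {"impedance", "tipper"}, [0.01, 1.0, 100.0])
```

Once exported to a DataFrame, flags of this kind make it simple to select, for example, only the stations that have tipper data.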
Module contents
- class mth5.tables.ChannelSummaryTable(hdf5_dataset: Dataset)[source]
Bases: MTH5Table
Convenience wrapper around the channel summary dataset.
Provides helpers to summarize channels, convert to pandas, and derive run-level summaries.
Examples
>>> ch_table = ChannelSummaryTable(hdf5_dataset)
>>> df = ch_table.to_dataframe()
>>> run_df = ch_table.to_run_summary()
- to_dataframe() DataFrame[source]
Convert the channel summary to a pandas DataFrame.
- Returns:
Channel summary with decoded string columns and parsed datetimes.
- Return type:
pandas.DataFrame
Examples
>>> df = ch_table.to_dataframe()
>>> df.head()
- to_run_summary(allowed_input_channels: Iterable[str] = ['h2', 'hx', 'hy', 'by', 'bx', 'h1'], allowed_output_channels: Iterable[str] = ['e1', 'ex', 'ey', 'e2', 'e4', 'bz', 'h3', 'hz', 'e3'], sortby: list[str] | None = None) DataFrame[source]
Compress channel summary into a run-level summary (one row per run).
- Parameters:
allowed_input_channels (Iterable[str], optional) – Allowed input channel names, by default ALLOWED_INPUT_CHANNELS.
allowed_output_channels (Iterable[str], optional) – Allowed output channel names, by default ALLOWED_OUTPUT_CHANNELS.
sortby (list of str or None, optional) – Columns to sort by; defaults to ["station", "start"] when None.
- Returns:
Run-level summary including channels, durations, and references.
- Return type:
pandas.DataFrame
Examples
>>> run_df = ch_table.to_run_summary()
>>> run_df.columns[:4].tolist()
['survey', 'station', 'run', 'start']
- class mth5.tables.FCSummaryTable(hdf5_dataset: Dataset)[source]
Bases: MTH5Table
Summary table for Fourier coefficients.
This class wraps an HDF5 dataset that stores a summary of Fourier coefficient datasets and provides convenience functions such as summarize() (to populate the table) and to_dataframe() (to export entries).
Examples
Populate and export a summary from an existing MTH5 file:
>>> import h5py
>>> from mth5.tables.fc_table import FCSummaryTable
>>> f = h5py.File('example.mth5', 'r+')  # summarize() writes rows, so open read/write
>>> # Assume the summary dataset already exists at this path
>>> table_ds = f['Exchange']['FC_Summary']
>>> fc_table = FCSummaryTable(table_ds)
>>> fc_table.summarize()  # walk the file and fill entries
>>> df = fc_table.to_dataframe()
>>> df.head()
- summarize() None[source]
Populate the summary table by traversing the HDF5 hierarchy.
The traversal searches for datasets with attribute mth5_type == 'FCChannel' and adds a corresponding summary row for each.
- Return type:
None
Notes
If the table contains rows from a different OS/encoding, row insertion can raise a ValueError. A warning is logged and processing continues for subsequent rows.
Examples
Refresh the table entries:
>>> fc_table.clear_table()
>>> fc_table.summarize()
- to_dataframe() DataFrame[source]
Convert the table to a pandas.DataFrame for easier querying.
- Returns:
A dataframe with decoded string columns and parsed start/end timestamps.
- Return type:
pandas.DataFrame
Examples
Export to a dataframe and filter by component:
>>> df = fc_table.to_dataframe()
>>> df[df.component == 'ex']
- class mth5.tables.MTH5Table(hdf5_dataset: Dataset, default_dtype: dtype)[source]
Bases: object
Base wrapper around an HDF5 dataset representing a typed table.
Provides simple NumPy-based operations including row insertion/removal, basic locating utilities, and conversion to pandas.DataFrame.
- Parameters:
hdf5_dataset (h5py.Dataset) – The HDF5 dataset that stores the table.
default_dtype (numpy.dtype) – The default dtype schema for the table entries.
- Raises:
MTH5TableError – If hdf5_dataset is not an instance of h5py.Dataset.
Examples
Create a simple table and add a row:
>>> import h5py, numpy as np
>>> f = h5py.File('example.h5', 'w')
>>> dtype = np.dtype([('name', 'S16'), ('value', 'f8')])
>>> ds = f.create_dataset('table', (1,), maxshape=(None,), dtype=dtype)
>>> from mth5.tables.mth5_table import MTH5Table
>>> t = MTH5Table(ds, dtype)
>>> row = np.array([('alpha'.encode('utf-8'), 1.23)], dtype=dtype)
>>> t.add_row(row)
1
>>> df = t.to_dataframe()
>>> df.head()
- add_row(row: ndarray, index: int | None = None) int[source]
Add a row to the table.
- Parameters:
row (numpy.ndarray) – Row to insert. Must have the same dtype (or same field names, allowing safe casting) as the table.
index (int, optional) – Index at which to insert the row. If None, appends to the end.
- Returns:
Index of the inserted row.
- Return type:
int
- Raises:
TypeError – If row is not a numpy.ndarray.
ValueError – If the dtype is incompatible with the table.
- check_dtypes(other_dtype: dtype) bool[source]
Check that dtypes match the table’s dtype (including field names).
- Parameters:
other_dtype (numpy.dtype) – The dtype to compare against the table’s dtype.
- Returns:
True if the dtypes match; otherwise False.
- Return type:
bool
- clear_table() None[source]
Reset the table by recreating the dataset with a single null row.
Notes
Deletes the current dataset and replaces it with a new dataset with the same compression/options and dtype, but shape (1,).
- property dtype: dtype
- property hdf5_reference: object
- locate(column: str, value: Any, test: Literal['eq', 'lt', 'le', 'gt', 'ge', 'be', 'bt'] = 'eq') ndarray[source]
Locate row indices where a column satisfies a comparison.
- Parameters:
column (str) – Name of the column to test.
value (Any) – Value to compare against. For string columns, a str is converted to a numpy.bytes_. For time columns (start, end, start_date, end_date), values are coerced to numpy.datetime64.
test ({'eq','lt','le','gt','ge','be','bt'}, default 'eq') – Type of comparison to perform.
- 'eq': equals
- 'lt': less than
- 'le': less than or equal to
- 'gt': greater than
- 'ge': greater than or equal to
- 'be': strictly between
- 'bt': alias for 'be'
- Returns:
Array of matching row indices.
- Return type:
numpy.ndarray
- Raises:
ValueError – If test is 'be'/'bt' and value is not a 2-length iterable.
Examples
Find rows with value greater than 10:
>>> idx = t.locate('value', 10, test='gt')
- property nrows: int
- remove_row(index: int) int[source]
Remove a row by replacing it with a null entry.
- Parameters:
index (int) – Index of the row to remove.
- Returns:
Index that was updated with a null row.
- Return type:
int
- Raises:
IndexError – If the index is out of bounds for the current shape.
Notes
There is no intrinsic index stored within the array; indexing is on-the-fly. Prefer using the HDF5 reference column for robust identification.
The current approach inserts a null row at the specified index.
- property shape: tuple[int, ...]
- to_dataframe() DataFrame[source]
Convert the table into a pandas.DataFrame.
- Returns:
DataFrame with decoded string columns where applicable.
- Return type:
pandas.DataFrame
Examples
Convert and preview:
>>> df = t.to_dataframe()
>>> df.head()
- update_dtype(new_dtype: dtype) None[source]
Update the dataset’s dtype while preserving data and field names.
- Parameters:
new_dtype (numpy.dtype) – New dtype to apply. Must have identical field names.
Notes
Performs a manual copy into a new array to avoid unsafe casting errors, then recreates the dataset with the new dtype and same dataset options.
- update_row(entry: ndarray) int[source]
Update a row by locating its index and rewriting the entry.
- Parameters:
entry (numpy.ndarray) – Entry to update, with the same dtype as the table.
- Returns:
Row index that was updated, or the new row index if not found.
- Return type:
int
Notes
Matching by hdf5_reference is not reliable; this uses add_row and will append if the original row cannot be located.
- class mth5.tables.TFSummaryTable(hdf5_dataset: Dataset)[source]
Bases: MTH5Table
Summary table for TransferFunction groups.
Provides convenience functions to populate the table (summarize) and export to pandas.DataFrame (to_dataframe).
Examples
Build and export a TF summary:
>>> import h5py
>>> from mth5.tables.tf_table import TFSummaryTable
>>> f = h5py.File('example.mth5', 'r+')  # summarize() writes rows, so open read/write
>>> tf_summary_ds = f['Exchange']['TF_Summary']
>>> tf_table = TFSummaryTable(tf_summary_ds)
>>> tf_table.summarize()
>>> df = tf_table.to_dataframe()
>>> df.head()
- summarize() None[source]
Populate the summary table by traversing the HDF5 hierarchy.
Searches for groups where mth5_type equals 'transferfunction' and adds a row indicating available datasets (impedance, tipper, covariance), period min/max, and relevant references.
- Return type:
None
Examples
Refresh the TF summary:
>>> tf_table.clear_table()
>>> tf_table.summarize()