mth5.groups package

Submodules

mth5.groups.base module

Base Group Class

Contains all the base functions that will be used by group classes.

Created on Fri May 29 15:09:48 2020

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.groups.base.BaseGroup(group, group_metadata=None, **kwargs)[source]

Bases: object

Generic object that will have functionality for reading/writing groups, including attributes. To access the hdf5 group directly use the BaseGroup.hdf5_group property.

>>> base = BaseGroup(hdf5_group)
>>> base.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object; that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the BaseGroup.write_metadata method. This is a temporary solution; an automatic updater for metadata changes is in the works.

>>> base.metadata.existing_attribute = 'update_existing_attribute'
>>> base.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> base.metadata.add_base_attribute('new_attribute',
...                                  'new_attribute_value',
...                                  {'type':str,
...                                   'required':True,
...                                   'style':'free form',
...                                   'description': 'new attribute desc.',
...                                   'units':None,
...                                   'options':[],
...                                   'alias':[],
...                                   'example':'new attribute'})

Includes initializing functions that make a summary table and write metadata.

property dataset_options
property groups_list
initialize_group(**kwargs)[source]

Initialize group by making a summary table and writing metadata

property metadata

Metadata for the Group based on mt_metadata.timeseries

read_metadata()[source]

read metadata from the HDF5 group into metadata object

write_metadata()[source]

Write HDF5 metadata from metadata object.

mth5.groups.filters module

Created on Wed Dec 23 17:08:40 2020

Need to make a group for FAP and FIR filters.

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.groups.filters.FiltersGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Not implemented yet

add_filter(filter_object)[source]

Add a filter dataset based on type

current types are:
  • zpk -> zeros, poles, gain

  • fap -> frequency look-up table

  • time_delay -> time delay filter

  • coefficient -> coefficient filter

Parameters

filter_object (mt_metadata.timeseries.filters) – An MT metadata filter object

property filter_dict
get_filter(name)[source]

Get a filter by name

to_filter_object(name)[source]

return the MT metadata representation of the filter
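The type-based storage that add_filter describes can be pictured as a simple dispatch table. The sketch below is illustrative only; the mapping and the helper name group_for_filter are assumptions, not the actual FiltersGroup internals.

```python
# Illustrative sketch only: map a filter type to the sub-group name it
# would be stored under; the real logic lives in mth5.groups.filters.
FILTER_GROUPS = {
    "zpk": "zpk",                  # zeros, poles, gain
    "fap": "fap",                  # frequency look-up table
    "time_delay": "time_delay",    # time delay filter
    "coefficient": "coefficient",  # coefficient filter
}

def group_for_filter(filter_type):
    """Return the sub-group a filter of the given type belongs in."""
    try:
        return FILTER_GROUPS[filter_type.lower()]
    except KeyError:
        raise ValueError(f"Unsupported filter type: {filter_type!r}")
```

An unknown type raises, mirroring the fact that only the four types above are handled.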

mth5.groups.master_station_run_channel module

Created on Wed Dec 23 17:18:29 2020

Note

Need to keep these groups together; if you split them into separate files you get a circular import.

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.groups.master_station_run_channel.AuxiliaryDataset(group, **kwargs)[source]

Bases: mth5.groups.master_station_run_channel.ChannelDataset

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
class mth5.groups.master_station_run_channel.ChannelDataset(dataset, dataset_metadata=None, **kwargs)[source]

Bases: object

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
property channel_entry

channel entry that will go into a full channel summary of the entire survey

property channel_response_filter
property end

return end time based on the data

extend_dataset(new_data_array, start_time, sample_rate, fill=None, max_gap_seconds=1, fill_window=10)[source]

Append data according to how the start time aligns with existing data. If the start time is before the existing start time, the data is prepended; similarly, if the start time is near the end, the data will be appended.

If the start time is within the existing time range, existing data will be replaced with the new data.

If there is a gap between the start or end time of the new data and the existing data, you can either fill the gap with a constant value or an error will be raised, depending on the value of fill.

Parameters
  • new_data_array (numpy.ndarray) – new data array with shape (npts, )

  • start_time (string or mth5.utils.mttime.MTime) – start time of the new data array in UTC

  • sample_rate (float) – Sample rate of the new data array, must match existing sample rate

  • fill (string, None, float, integer) – If there is a data gap how do you want to fill the gap:

    • None -> will raise an mth5.utils.exceptions.MTH5Error

    • 'mean' -> will fill with the mean of each data set within the fill window

    • 'median' -> will fill with the median of each data set within the fill window

    • value -> can be an integer or float to fill the gap

    • 'nan' -> will fill the gap with NaN

  • max_gap_seconds (float or integer) – sets a maximum number of seconds the gap can be. Anything over this number will raise a mth5.utils.exceptions.MTH5Error.

  • fill_window (integer) – number of points from the end of each data set to estimate fill value from.

Raises

mth5.utils.exceptions.MTH5Error – if the sample rate is not the same, or the fill value is not understood

Rubric

>>> ex = mth5_obj.get_channel('MT001', 'MT001a', 'Ex')
>>> ex.n_samples
4096
>>> ex.end
2015-01-08T19:32:09.500000+00:00
>>> t = timeseries.ChannelTS('electric',
...                          data=2*np.cos(4 * np.pi * .05 *
...                                        np.linspace(0, 4096, num=4096) *
...                                        .01),
...                          channel_metadata={'electric':{
...                             'component': 'ex',
...                             'sample_rate': 8,
...                             'time_period.start':(ex.end+(1)).iso_str}})
>>> ex.extend_dataset(t.ts, t.start, t.sample_rate, fill='median',
...                   max_gap_seconds=2)
2020-07-02T18:02:47 - mth5.groups.Electric.extend_dataset - INFO -
filling data gap with 1.0385180759767025
>>> ex.n_samples
8200
>>> ex.end
2015-01-08T19:40:42.500000+00:00
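The fill behaviour described above can be illustrated with plain numpy. This is a sketch of the gap-filling idea only; fill_gap is a hypothetical helper, not the actual extend_dataset implementation.

```python
import numpy as np

def fill_gap(existing, new, n_gap, fill="median", fill_window=10):
    """Join two arrays across a gap of n_gap missing samples.

    The fill value is estimated from fill_window points at the end of
    the existing data and the start of the new data, as described above.
    """
    window = np.concatenate([existing[-fill_window:], new[:fill_window]])
    if fill == "median":
        value = np.median(window)
    elif fill == "mean":
        value = np.mean(window)
    elif fill == "nan":
        value = np.nan
    elif fill is None:
        raise ValueError("data gap found and fill is None")
    else:
        value = float(fill)  # a constant numeric fill value
    return np.concatenate([existing, np.full(n_gap, value), new])
```

For example, joining twenty 1.0 samples and twenty 3.0 samples across a four-sample gap with fill='median' inserts four samples of 2.0.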
from_channel_ts(channel_ts_obj, how='replace', fill=None, max_gap_seconds=1, fill_window=10)[source]

Fill the data set from a mth5.timeseries.ChannelTS object.

Will check for time alignment and metadata.

Parameters
  • channel_ts_obj (mth5.timeseries.ChannelTS) – time series object

  • how

    how the new array will be input to the existing dataset:

    • 'replace' -> replace the entire dataset, nothing is left over.

    • 'extend' -> add onto the existing dataset; any overlapping values will be rewritten, and if there are gaps between data sets those will be handled depending on the value of fill.

  • fill – If there is a data gap how do you want to fill the gap:

    • None -> will raise an mth5.utils.exceptions.MTH5Error

    • 'mean' -> will fill with the mean of each data set within the fill window

    • 'median' -> will fill with the median of each data set within the fill window

    • value -> can be an integer or float to fill the gap

    • 'nan' -> will fill the gap with NaN

  • max_gap_seconds (float or integer) – sets a maximum number of seconds the gap can be. Anything over this number will raise a mth5.utils.exceptions.MTH5Error.

  • fill_window (integer) – number of points from the end of each data set to estimate fill value from.
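The max_gap_seconds rule can be sketched with plain numbers (seconds since epoch); check_gap_seconds is a hypothetical helper illustrating the check, not the library's code.

```python
def check_gap_seconds(existing_end, new_start, sample_rate, max_gap_seconds=1.0):
    """Seconds of missing data between the end of one data set and the
    start of the next; raises if the gap exceeds max_gap_seconds."""
    # one sample interval between consecutive points is not a gap
    gap = (new_start - existing_end) - 1.0 / sample_rate
    if gap > max_gap_seconds:
        raise ValueError(
            f"gap of {gap:.3f} s exceeds max_gap_seconds={max_gap_seconds}"
        )
    return max(gap, 0.0)
```

At 8 samples per second, a new start exactly one sample interval after the existing end is contiguous (gap of zero).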

from_xarray(data_array, how='replace', fill=None, max_gap_seconds=1, fill_window=10)[source]

Fill the data set from an xarray.DataArray object.

Will check for time alignment and metadata.

Parameters
  • data_array (xarray.DataArray) – Xarray data array

  • how

    how the new array will be input to the existing dataset:

    • 'replace' -> replace the entire dataset, nothing is left over.

    • 'extend' -> add onto the existing dataset; any overlapping values will be rewritten, and if there are gaps between data sets those will be handled depending on the value of fill.

  • fill – If there is a data gap how do you want to fill the gap:

    • None -> will raise an mth5.utils.exceptions.MTH5Error

    • 'mean' -> will fill with the mean of each data set within the fill window

    • 'median' -> will fill with the median of each data set within the fill window

    • value -> can be an integer or float to fill the gap

    • 'nan' -> will fill the gap with NaN

  • max_gap_seconds (float or integer) – sets a maximum number of seconds the gap can be. Anything over this number will raise a mth5.utils.exceptions.MTH5Error.

  • fill_window (integer) – number of points from the end of each data set to estimate fill value from.

get_index_from_time(given_time)[source]

get the appropriate index for a given time.

Parameters

given_time (string or mth5.utils.mttime.MTime) – time to find the index of

Returns

index of the sample closest to the given time

Return type

int
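For uniformly sampled data the index computation reduces to scaling the elapsed time by the sample rate. A minimal sketch, using seconds-since-start floats rather than MTime objects:

```python
def index_from_time(given_time, start_time, sample_rate):
    """Index of the sample nearest to given_time (times in seconds)."""
    # elapsed seconds times samples-per-second gives a fractional index;
    # round to the nearest whole sample
    return int(round((given_time - start_time) * sample_rate))
```

For example, at 8 samples per second a time 0.5 s after the start maps to index 4.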

property master_station_group

shortcut to master station group

property n_samples
read_metadata()[source]

Read metadata from the HDF5 file into the metadata container, that way it can be validated.

replace_dataset(new_data_array)[source]

replace the entire dataset with a new one, nothing left behind

Parameters

new_data_array (numpy.ndarray) – new data array shape (npts, )

property run_group

shortcut to run group

property sample_rate
property start
property station_group

shortcut to station group

property table_entry

Create a table entry to put into the run summary table.

property time_index

time index given parameters in metadata

Return type

pandas.DatetimeIndex
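A DatetimeIndex like the one this property returns can be built from the start time, sample rate, and number of samples. A minimal sketch assuming a fixed sample rate; make_time_index is a hypothetical helper, not the property's internals.

```python
import pandas as pd

def make_time_index(start, sample_rate, n_samples):
    """DatetimeIndex implied by a start time, sample rate, and length."""
    # one step per sample interval
    step = pd.Timedelta(seconds=1.0 / sample_rate)
    return pd.date_range(start=start, periods=n_samples, freq=step)
```

At 8 samples per second the index steps by 125 ms between entries.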

time_slice(start_time, end_time=None, n_samples=None, return_type='channel_ts')[source]

Get a time slice from the channel and return the appropriate type

  • numpy array with metadata

  • pandas.Dataframe with metadata

  • xarray.DataArray with metadata

  • mth5.timeseries.ChannelTS ‘default’

  • dask.DataFrame with metadata ‘not yet’

Parameters
  • start_time (string or mth5.utils.mttime.MTime) – start time of the slice

  • end_time (string or mth5.utils.mttime.MTime, optional) – end time of the slice

  • n_samples (integer, optional) – number of samples to read in

Returns

the correct container for the time series.

Return type

[ xarray.DataArray | pandas.DataFrame | mth5.timeseries.ChannelTS | numpy.ndarray ]

Raises

ValueError – if both end_time and n_samples are None, or if both are given.

Example with number of samples

>>> ex = mth5_obj.get_channel('FL001', 'FL001a', 'Ex')
>>> ex_slice = ex.time_slice("2015-01-08T19:49:15", n_samples=4096)
>>> ex_slice
<xarray.DataArray (time: 4096)>
array([0.93115046, 0.14233688, 0.87917119, ..., 0.26073634, 0.7137319 ,
       0.88154395])
Coordinates:
  * time     (time) datetime64[ns] 2015-01-08T19:49:15 ... 2015-01-08T19:57:46.875000
Attributes:
    ac.end:                      None
    ac.start:                    None
    ...

>>> type(ex_slice)
mth5.timeseries.ChannelTS

# plot the time series
>>> ex_slice.ts.plot()
Example with start and end time

>>> ex_slice = ex.time_slice("2015-01-08T19:49:15",
...                          end_time="2015-01-09T19:49:15")
Raises Example

>>> ex_slice = ex.time_slice("2015-01-08T19:49:15",
...                          end_time="2015-01-09T19:49:15",
...                          n_samples=4096)
ValueError: Must input either end_time or n_samples, not both.
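The end_time/n_samples bookkeeping behind time_slice can be sketched with plain numbers (seconds since start). slice_bounds is a hypothetical helper illustrating the rule, not the method's actual internals.

```python
def slice_bounds(start, end=None, n_samples=None, sample_rate=1.0):
    """Resolve a slice given exactly one of end or n_samples."""
    if (end is None) == (n_samples is None):
        raise ValueError("Must input either end_time or n_samples, not both.")
    if n_samples is None:
        # count of samples between start and end, inclusive of both
        n_samples = int(round((end - start) * sample_rate)) + 1
    else:
        # last sample lands (n_samples - 1) intervals after the start
        end = start + (n_samples - 1) / sample_rate
    return end, n_samples
```

Passing both arguments, or neither, raises the ValueError shown in the example above.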
to_channel_ts()[source]
Returns

a Timeseries with the appropriate time index and metadata

Return type

mth5.timeseries.ChannelTS

loads into RAM (nearly half the size of xarray alone, not sure why)

to_dataframe()[source]
Returns

a dataframe where data is stored in the ‘data’ column and attributes are stored in the experimental attrs attribute

Return type

pandas.DataFrame

Note

that metadata will not be validated if changed in an xarray.

loads into RAM

to_numpy()[source]
Returns

a numpy structured array with 2 columns (time, channel_data)

Return type

numpy.core.records

data is a built-in attribute of numpy record arrays and cannot be used as a field name

loads into RAM
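The two-column structured array can be reproduced with numpy record arrays; the values below are made up for illustration.

```python
import numpy as np

# "data" is reserved on numpy record arrays, hence the field name
# "channel_data" noted above.
times = np.arange(4) / 8.0  # seconds at an 8 Hz sample rate
values = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
rec = np.rec.fromarrays([times, values], names="time,channel_data")
```

The fields are then available as attributes, e.g. rec.time and rec.channel_data.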

to_xarray()[source]
Returns

an xarray DataArray with appropriate metadata and the appropriate time index.

Return type

xarray.DataArray

Note

that metadata will not be validated if changed in an xarray.

loads into RAM

write_metadata()[source]

Write metadata from the metadata container to the HDF5 attrs dictionary.

class mth5.groups.master_station_run_channel.ElectricDataset(group, **kwargs)[source]

Bases: mth5.groups.master_station_run_channel.ChannelDataset

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
class mth5.groups.master_station_run_channel.MagneticDataset(group, **kwargs)[source]

Bases: mth5.groups.master_station_run_channel.ChannelDataset

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
class mth5.groups.master_station_run_channel.MasterStationGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Utility class that holds information about the stations within a survey and accompanying metadata. This class is the next level down from Survey for stations: /Survey/Stations. It provides methods to add and get stations. A summary table of all existing stations is also provided as a convenience look-up table to make searching easier.

To access MasterStationGroup from an open MTH5 file:

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> stations = mth5_obj.stations_group

To check what stations exist

>>> stations.groups_list
['summary', 'MT001', 'MT002', 'MT003']

To access the hdf5 group directly use MasterStationGroup.hdf5_group.

>>> stations.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object; that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the MasterStationGroup.write_metadata() method. This is a temporary solution; an automatic updater for metadata changes is in the works.

>>> stations.metadata.existing_attribute = 'update_existing_attribute'
>>> stations.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> stations.metadata.add_base_attribute('new_attribute',
...                                      'new_attribute_value',
...                                      {'type':str,
...                                       'required':True,
...                                       'style':'free form',
...                                       'description': 'new attribute desc.',
...                                       'units':None,
...                                       'options':[],
...                                       'alias':[],
...                                       'example':'new attribute'})

To add a station:

>>> new_station = stations.add_station('new_station')
>>> stations
/Survey/Stations:
====================
    --> Dataset: summary
    ......................
    |- Group: new_station
    ---------------------
        --> Dataset: summary
        ......................

Add a station with metadata:

>>> from mth5.metadata import Station
>>> station_metadata = Station()
>>> station_metadata.id = 'MT004'
>>> station_metadata.time_period.start = '2020-01-01T12:30:00'
>>> station_metadata.location.latitude = 40.000
>>> station_metadata.location.longitude = -120.000
>>> new_station = stations.add_station('Test_01', station_metadata)
>>> # to look at the metadata
>>> new_station.metadata
{
    "station": {
        "acquired_by.author": null,
        "acquired_by.comments": null,
        "id": "MT004",
        ...
        }
}

See also

mth5.metadata for details on how to add metadata from various files and python objects.

To remove a station:

>>> stations.remove_station('new_station')
>>> stations
/Survey/Stations:
====================
    --> Dataset: summary
    ......................

Note

Deleting a station is not as simple as del(station). In HDF5 this does not free up memory, it simply removes the reference to that station. The common way to get around this is to copy what you want into a new file, or overwrite the station.

To get a station:

>>> existing_station = stations.get_station('existing_station_name')
>>> existing_station
/Survey/Stations/existing_station_name:
=======================================
    --> Dataset: summary
    ......................
    |- Group: run_01
    ----------------
        --> Dataset: summary
        ......................
        --> Dataset: Ex
        ......................
        --> Dataset: Ey
        ......................
        --> Dataset: Hx
        ......................
        --> Dataset: Hy
        ......................
        --> Dataset: Hz
        ......................

A summary table is provided to make searching easier. The table summarizes all stations within a survey. To see what names are in the summary table:

>>> stations.summary_table.dtype.descr
[('id', ('|S5', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('components', ('|S100', {'h5py_encoding': 'ascii'})),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('sample_rate', '<f8')]
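The dtype above can be reproduced with plain numpy, dropping the h5py-specific ascii-encoding metadata; the row values below are illustrative.

```python
import numpy as np

# Plain-numpy equivalent of the station summary table dtype.
summary_dtype = np.dtype([
    ("id", "S5"),
    ("start", "S32"),
    ("end", "S32"),
    ("components", "S100"),
    ("measurement_type", "S12"),
    ("sample_rate", "<f8"),
])

# One example row, as it might be pulled from station metadata.
row = np.array(
    [(b"MT001", b"1980-01-01T00:00:00+00:00", b"1980-01-01T00:00:00+00:00",
      b"Ex,Ey,Hx,Hy,Hz", b"BBMT", 100.0)],
    dtype=summary_dtype,
)
```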

Note

When a station is added an entry is added to the summary table, where the information is pulled from the metadata.

>>> stations.summary_table
index |   id    |           start            |            end             |   components   | measurement_type | sample_rate
---------------------------------------------------------------------------------------------------------------------------
  0   | Test_01 | 1980-01-01T00:00:00+00:00  | 1980-01-01T00:00:00+00:00  | Ex,Ey,Hx,Hy,Hz |       BBMT       |     100
add_station(station_name, station_metadata=None)[source]
Add a station, with metadata if given, at the path:

/Survey/Stations/station_name

If the station already exists, the existing station is returned and nothing is added.

Parameters
  • station_name (string) – Name of the station, should be the same as metadata.id

  • station_metadata (mth5.metadata.Station, optional) – Station metadata container, defaults to None

Returns

A convenience class for the added station

Return type

mth5_groups.StationGroup

Example
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> # one option
>>> stations = mth5_obj.stations_group
>>> new_station = stations.add_station('MT001')
>>> # another option
>>> new_station = mth5_obj.stations_group.add_station('MT001')
property channel_summary

Summary of all channels in the file.

get_station(station_name)[source]

Get a station with the same name as station_name

Parameters

station_name (string) – existing station name

Returns

convenience station class

Return type

mth5.mth5_groups.StationGroup

Raises

MTH5Error – if the station name is not found.

Example

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> # one option
>>> stations = mth5_obj.stations_group
>>> existing_station = stations.get_station('MT001')
>>> # another option
>>> existing_station = mth5_obj.stations_group.get_station('MT001')
>>> # if the station does not exist an MTH5Error is raised
MTH5Error: MT001 does not exist, check station_list for existing names
remove_station(station_name)[source]

Remove a station from the file.

Note

Deleting a station is not as simple as del(station). In HDF5 this does not free up memory, it simply removes the reference to that station. The common way to get around this is to copy what you want into a new file, or overwrite the station.

Parameters

station_name (string) – existing station name

Example
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> # one option
>>> stations = mth5_obj.stations_group
>>> stations.remove_station('MT001')
>>> # another option
>>> mth5_obj.stations_group.remove_station('MT001')
property station_summary

Summary of stations in the file

Returns

a summary table of the stations in the file

Return type

pandas.DataFrame

class mth5.groups.master_station_run_channel.RunGroup(group, run_metadata=None, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

RunGroup is a utility class to hold information about a single run and accompanying metadata. This class is the next level down from Stations –> /Survey/Stations/station/station{a-z}.

This class provides methods to add and get channels. A summary table of all existing channels in the run is also provided as a convenience look up table to make searching easier.

Parameters
  • group (h5py.Group) – HDF5 group for a station, should have a path /Survey/Stations/station_name/run_name

  • run_metadata (mth5.metadata.Run, optional) – metadata container, defaults to None

Access RunGroup from an open MTH5 file

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
Check what channels exist

>>> run.groups_list
['Ex', 'Ey', 'Hx', 'Hy']

To access the hdf5 group directly use RunGroup.hdf5_group

>>> run.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object; that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the RunGroup.write_metadata() method. This is a temporary solution; an automatic updater for metadata changes is in the works.

>>> run.metadata.existing_attribute = 'update_existing_attribute'
>>> run.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> run.metadata.add_base_attribute('new_attribute',
...                                 'new_attribute_value',
...                                 {'type':str,
...                                  'required':True,
...                                  'style':'free form',
...                                  'description': 'new attribute desc.',
...                                  'units':None,
...                                  'options':[],
...                                  'alias':[],
...                                  'example':'new attribute'})
Add a channel

>>> new_channel = run.add_channel('Ex', 'electric',
...                               data=numpy.random.rand(4096))
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ex
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
Add a channel with metadata

>>> from mth5.metadata import Electric
>>> ex_metadata = Electric()
>>> ex_metadata.time_period.start = '2020-01-01T12:30:00'
>>> ex_metadata.time_period.end = '2020-01-03T16:30:00'
>>> new_ex = run.add_channel('Ex', 'electric',
>>> ...                       channel_metadata=ex_metadata)
>>> # to look at the metadata
>>> new_ex.metadata
{
     "electric": {
        "ac.end": 1.2,
        "ac.start": 2.3,
        ...
        }
}

See also

mth5.metadata for details on how to add metadata from various files and python objects.

Remove a channel

>>> run.remove_channel('Ex')
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................

Note

Deleting a channel is not as simple as del(channel). In HDF5 this does not free up memory, it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.

Get a channel

>>> existing_ex = run.get_channel('Ex')
>>> existing_ex
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:08:32+00:00
    sample rate:      8
Summary Table

A summary table is provided to make searching easier. The table summarizes all channels within the run. To see what names are in the summary table:

>>> run.summary_table.dtype.descr
[('component', ('|S5', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('n_samples', '<i4'),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('units', ('|S25', {'h5py_encoding': 'ascii'})),
 ('hdf5_reference', ('|O', {'ref': h5py.h5r.Reference}))]

Note

When a run is added an entry is added to the summary table, where the information is pulled from the metadata.

>>> new_run.summary_table
index | component | start | end | n_samples | measurement_type | units | hdf5_reference
----------------------------------------------------------------------------------------
add_channel(channel_name, channel_type, data, channel_dtype='int32', max_shape=(None), chunks=True, channel_metadata=None, **kwargs)[source]

add a channel to the run

Parameters
  • channel_name (string) – name of the channel

  • channel_type (string) – [ electric | magnetic | auxiliary ]

  • channel_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If channel type is not correct

Returns

Channel container

Return type

[ mth5.mth5_groups.ElectricDataset | mth5.mth5_groups.MagneticDataset | mth5.mth5_groups.AuxiliaryDataset ]

>>> new_channel = run.add_channel('Ex', 'electric', None)
>>> new_channel
Channel Electric:
-------------------
    component:        None
    data type:        electric
    data format:      float32
    data shape:       (1,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:00+00:00
    sample rate:      None
property channel_summary

Summary of the channels in the run.

from_channel_ts(channel_ts_obj)[source]

create a channel data set from a mth5.timeseries.ChannelTS object and update metadata.

Parameters

channel_ts_obj (mth5.timeseries.ChannelTS) – a single time series object

Returns

new channel dataset

Return type

mth5.groups.ChannelDataset

from_runts(run_ts_obj, **kwargs)[source]

create channel datasets from a mth5.timeseries.RunTS object and update metadata.

Parameters

run_ts_obj (mth5.timeseries.RunTS) – Run object with all the appropriate channels and metadata.

Will create a run group and appropriate channel datasets.

get_channel(channel_name)[source]

Get a channel from an existing name. Returns the appropriate container.

Parameters

channel_name (string) – name of the channel

Returns

Channel container

Return type

[ mth5.mth5_groups.ElectricDataset | mth5.mth5_groups.MagneticDataset | mth5.mth5_groups.AuxiliaryDataset ]

Raises

MTH5Error – If no channel is found

Example

>>> existing_channel = run.get_channel('Ex')
MTH5Error: Ex does not exist, check groups_list for existing names
>>> run.groups_list
['Ey', 'Hx', 'Hz']
>>> existing_channel = run.get_channel('Ey')
>>> existing_channel
Channel Electric:
-------------------
                component:        Ey
        data type:        electric
        data format:      float32
        data shape:       (4096,)
        start:            1980-01-01T00:00:00+00:00
        end:              1980-01-01T00:00:01+00:00
        sample rate:      4096
property master_station_group

shortcut to master station group

property metadata

Overwrite get metadata to include channel information in the runs

remove_channel(channel_name)[source]

Remove a channel from the run.

Note

Deleting a channel is not as simple as del(channel). In HDF5 this does not free up memory, it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.

Parameters

channel_name (string) – existing channel name

Example

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> run.remove_channel('Ex')
property station_group

shortcut to station group

property table_entry

Get a run table entry

Returns

a properly formatted run table entry

Return type

numpy.ndarray with dtype:

>>> dtype([('id', 'S20'),
         ('start', 'S32'),
         ('end', 'S32'),
         ('components', 'S100'),
         ('measurement_type', 'S12'),
         ('sample_rate', float),
         ('hdf5_reference', h5py.ref_dtype)])
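This entry dtype can be reproduced in plain numpy. A minimal sketch, substituting a generic object column for h5py.ref_dtype so h5py is not required (the values are illustrative):

```python
import numpy as np

# Sketch of a run table entry; 'hdf5_reference' is stored as a generic
# object column here in place of h5py.ref_dtype.
run_dtype = np.dtype([('id', 'S20'),
                      ('start', 'S32'),
                      ('end', 'S32'),
                      ('components', 'S100'),
                      ('measurement_type', 'S12'),
                      ('sample_rate', float),
                      ('hdf5_reference', object)])

entry = np.array([(b'MT001a', b'2020-01-01T12:30:00+00:00',
                   b'2020-01-03T16:30:00+00:00', b'Ex,Ey,Hx,Hy,Hz',
                   b'BBMT', 256.0, None)], dtype=run_dtype)

print(entry['id'][0])           # b'MT001a'
print(entry['sample_rate'][0])  # 256.0
```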
to_runts()[source]

create a mth5.timeseries.RunTS object from channels of the run

Returns

a RunTS object containing all channels of the run

Return type

mth5.timeseries.RunTS

validate_run_metadata()[source]

Update metadata and table entries to ensure consistency


write_metadata()[source]

Overwrite BaseGroup.write_metadata to update the table entry and write HDF5 metadata from the metadata object.

class mth5.groups.master_station_run_channel.StationGroup(group, station_metadata=None, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

StationGroup is a utility class to hold information about a single station and accompanying metadata. This class is the next level down from Stations –> /Survey/Stations/station_name.

This class provides methods to add and get runs. A summary table of all existing runs in the station is also provided as a convenience look up table to make searching easier.

Parameters
  • group (h5py.Group) – HDF5 group for a station, should have a path /Survey/Stations/station_name

  • station_metadata (mth5.metadata.Station, optional) – metadata container, defaults to None

Usage

Access StationGroup from an open MTH5 file

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> station = mth5_obj.stations_group.get_station('MT001')
Check what runs exist

>>> station.groups_list
['MT001a', 'MT001b', 'MT001c', 'MT001d']

To access the hdf5 group directly use StationGroup.hdf5_group.

>>> station.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object; that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the StationGroup.write_metadata() method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> station.metadata.existing_attribute = 'update_existing_attribute'
>>> station.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> station.metadata.add_base_attribute('new_attribute',
...                                     'new_attribute_value',
...                                     {'type':str,
...                                      'required':True,
...                                      'style':'free form',
...                                      'description': 'new attribute desc.',
...                                      'units':None,
...                                      'options':[],
...                                      'alias':[],
...                                      'example':'new attribute'})

To add a run

>>> new_run = station.add_run('MT001e')
>>> new_run
/Survey/Stations/Test_01:
=========================
    |- Group: MT001e
    -----------------
        --> Dataset: summary
        ......................
    --> Dataset: summary
    ......................
Add a run with metadata

>>> from mth5.metadata import Run
>>> run_metadata = Run()
>>> run_metadata.time_period.start = '2020-01-01T12:30:00'
>>> run_metadata.time_period.end = '2020-01-03T16:30:00'
>>> run_metadata.location.latitude = 40.000
>>> run_metadata.location.longitude = -120.000
>>> new_run = station.add_run('Test_01', run_metadata)
>>> # to look at the metadata
>>> new_run.metadata
{
    "run": {
        "acquired_by.author": "new_user",
        "acquired_by.comments": "First time",
        "channels_recorded_auxiliary": ['T'],
        ...
        }
}

See also

mth5.metadata for details on how to add metadata from various files and python objects.

Remove a run

>>> station.remove_run('new_run')
>>> station
/Survey/Stations/Test_01:
=========================
    --> Dataset: summary
    ......................

Note

Deleting a station is not as simple as del(station). In HDF5 this does not free up memory, it simply removes the reference to that station. The common way to get around this is to copy what you want into a new file, or overwrite the station.

Get a run

>>> existing_run = station.get_run('MT001a')
>>> existing_run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ex
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
    --> Dataset: Hz
    ......................
Summary Table

A summary table is provided to make searching easier. The table summarizes all runs within the station. To see what names are in the summary table:

>>> new_run.summary_table.dtype.descr
[('id', ('|S20', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('components', ('|S100', {'h5py_encoding': 'ascii'})),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('sample_rate', '<f8'),
 ('hdf5_reference', ('|O', {'ref': h5py.h5r.Reference}))]

Note

When a run is added an entry is added to the summary table, where the information is pulled from the metadata.

>>> station.summary_table
index | id | start | end | components | measurement_type | sample_rate | hdf5_reference
---------------------------------------------------------------------------------------
add_run(run_name, run_metadata=None)[source]

Add a run to a station.

Parameters
  • run_name (string) – run name, should be id{a-z}

  • run_metadata (mth5.metadata.Run, optional) – metadata container, defaults to None

The run metadata is used to fill an entry in the summary table.

get_run(run_name)[source]

get a run from run name

Parameters

run_name (string) – existing run name

Returns

Run object

Return type

mth5.mth5_groups.RunGroup

>>> existing_run = station.get_run('MT001')
locate_run(sample_rate, start)[source]

Locate a run based on sample rate and start time from the summary table

Parameters
  • sample_rate (float) – sample rate in samples/seconds

  • start (string or mth5.utils.mttime.MTime) – start time

Returns

appropriate run name, None if not found

Return type

string or None
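The lookup amounts to matching sample rate and start time against the summary table. A simplified stand-in using a plain list of rows (the row layout here is illustrative, not the real summary-table API):

```python
from datetime import datetime

# Illustrative stand-in for the station summary table.
run_table = [
    {'id': 'MT001a', 'sample_rate': 256.0, 'start': '2020-01-01T12:30:00'},
    {'id': 'MT001b', 'sample_rate': 8.0,   'start': '2020-01-03T17:00:00'},
]

def locate_run(sample_rate, start):
    """Return the run id matching sample rate and start time, or None."""
    start = datetime.fromisoformat(start)
    for row in run_table:
        if row['sample_rate'] == sample_rate and \
                datetime.fromisoformat(row['start']) == start:
            return row['id']
    return None

print(locate_run(8.0, '2020-01-03T17:00:00'))    # MT001b
print(locate_run(256.0, '2020-01-02T00:00:00'))  # None
```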

make_run_name()[source]

Make a run name that will be the next alphabet letter extracted from the run list. Expects that all runs are labeled as id{a-z}.

Returns

metadata.id + next letter

Return type

string

>>> station.metadata.id = 'MT001'
>>> station.make_run_name()
'MT001a'
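The next-letter logic can be sketched as follows; this is a simplified stand-in for the actual implementation, assuming runs are labeled id{a-z}:

```python
import string

def make_run_name(station_id, existing_runs):
    """Return station_id plus the next unused letter a-z."""
    used = {name[len(station_id):] for name in existing_runs
            if name.startswith(station_id)}
    for letter in string.ascii_lowercase:
        if letter not in used:
            return station_id + letter
    raise ValueError("all run letters a-z are used")

print(make_run_name('MT001', []))                    # MT001a
print(make_run_name('MT001', ['MT001a', 'MT001b']))  # MT001c
```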
property master_station_group

shortcut to master station group

property metadata

Overwrite get metadata to include run information in the station

property name
remove_run(run_name)[source]

Remove a run from the station.

Note

Deleting a run is not as simple as del(run). In HDF5 this does not free up memory, it simply removes the reference to that run. The common way to get around this is to copy what you want into a new file, or overwrite the run.

Parameters

run_name (string) – existing run name

Example
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> station = mth5_obj.stations_group.get_station('MT001')
>>> station.remove_run('MT001a')
property run_summary

Summary of runs in the station


property table_entry

make table entry

validate_station_metadata()[source]

Check metadata from the runs and make sure it matches the station metadata


mth5.groups.reports module

Created on Wed Dec 23 17:03:53 2020

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.groups.reports.ReportsGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Not sure how to handle this yet

add_report(report_name, report_metadata=None, report_data=None)[source]
Parameters
  • report_name (TYPE) – DESCRIPTION

  • report_metadata (TYPE, optional) – DESCRIPTION, defaults to None

  • report_data (TYPE, optional) – DESCRIPTION, defaults to None


mth5.groups.standards module

Created on Wed Dec 23 17:05:33 2020

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.groups.standards.StandardsGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

The StandardsGroup is a convenience group that stores the metadata standards that were used to make the current file. This is to help a user understand the metadata directly from the file and not have to look up documentation that might not be updated.

The metadata standards are stored in the summary table /Survey/Standards/summary

>>> standards = mth5_obj.standards_group
>>> standards.summary_table
index | attribute | type | required | style | units | description | options | alias | example
---------------------------------------------------------------------------------------------
get_attribute_information(attribute_name)[source]

get information about an attribute

The attribute name should be in the summary table.

Parameters

attribute_name (string) – attribute name

Returns

prints a description of the attribute

Raises

MTH5TableError – if attribute is not found

>>> standards = mth5_obj.standards_group
>>> standards.get_attribute_information('survey.release_license')
survey.release_license
--------------------------
        type          : string
        required      : True
        style         : controlled vocabulary
        units         :
        description   : How the data can be used. The options are based on
                 Creative Commons licenses. For details visit
                 https://creativecommons.org/licenses/
        options       : CC-0,CC-BY,CC-BY-SA,CC-BY-ND,CC-BY-NC-SA,CC-BY-NC-ND
        alias         :
        example       : CC-0
initialize_group()[source]

Initialize the group by making a summary table that summarizes the metadata standards used to describe the data.

Also, write generic metadata information.

property summary_table
summary_table_from_dict(summary_dict)[source]

Fill summary table from a dictionary that summarizes the metadata for the entire survey.

Parameters

summary_dict (dictionary) – Flattened dictionary of all metadata standards within the survey.
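A flattened dictionary joins nested keys with dots. A minimal sketch of producing one (the attribute names are illustrative):

```python
def flatten(d, prefix=''):
    """Flatten nested dictionaries into dot-separated keys."""
    flat = {}
    for key, value in d.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat

standards = {'survey': {'release_license': {'type': 'string',
                                            'required': True}}}
print(flatten(standards))
# {'survey.release_license.type': 'string',
#  'survey.release_license.required': True}
```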

mth5.groups.standards.summarize_metadata_standards()[source]

Summarize metadata standards into a dictionary

mth5.groups.survey module

Created on Wed Dec 23 16:59:45 2020

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.groups.survey.SurveyGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Utility class that holds general information and accompanying metadata for an MT survey.

To access the hdf5 group directly use SurveyGroup.hdf5_group.

>>> survey = SurveyGroup(hdf5_group)
>>> survey.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object; that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the SurveyGroup.write_metadata() method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> survey.metadata.existing_attribute = 'update_existing_attribute'
>>> survey.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> survey.metadata.add_base_attribute('new_attribute',
...                                    'new_attribute_value',
...                                    {'type':str,
...                                     'required':True,
...                                     'style':'free form',
...                                     'description': 'new attribute desc.',
...                                     'units':None,
...                                     'options':[],
...                                     'alias':[],
...                                     'example':'new attribute'})

Tip

If you want to add stations, reports, etc. to the survey, this should be done from the MTH5 object. This is to avoid duplication, at least for now.

To look at what the structure of /Survey looks like:

>>> survey
/Survey:
====================
    |- Group: Filters
    -----------------
        --> Dataset: summary
    -----------------
    |- Group: Reports
    -----------------
        --> Dataset: summary
        -----------------
    |- Group: Standards
    -------------------
        --> Dataset: summary
        -----------------
    |- Group: Stations
    ------------------
        --> Dataset: summary
        -----------------
property metadata

Overwrite get metadata to include station information

property stations_group
update_survey_metadata(survey_dict=None)[source]

Update start and end dates and location corners from stations_group.summary_table.
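The update amounts to taking the earliest start, the latest end, and the bounding box of station locations. A sketch with illustrative station rows; the survey key names below are assumptions, not the exact mt_metadata attribute names:

```python
# Illustrative station summary entries.
stations = [
    {'start': '2020-01-01T12:30:00', 'end': '2020-01-03T16:30:00',
     'latitude': 40.0, 'longitude': -120.0},
    {'start': '2020-01-02T00:00:00', 'end': '2020-01-05T08:00:00',
     'latitude': 40.5, 'longitude': -119.5},
]

# ISO-8601 strings sort chronologically, so min/max work directly.
survey = {
    'time_period.start_date': min(s['start'] for s in stations),
    'time_period.end_date': max(s['end'] for s in stations),
    'northwest_corner.latitude': max(s['latitude'] for s in stations),
    'northwest_corner.longitude': min(s['longitude'] for s in stations),
    'southeast_corner.latitude': min(s['latitude'] for s in stations),
    'southeast_corner.longitude': max(s['longitude'] for s in stations),
}
print(survey['time_period.start_date'])  # 2020-01-01T12:30:00
```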

Module contents

Import all Group objects

class mth5.groups.AuxiliaryDataset(group, **kwargs)[source]

Bases: mth5.groups.master_station_run_channel.ChannelDataset

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
            component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
class mth5.groups.BaseGroup(group, group_metadata=None, **kwargs)[source]

Bases: object

Generic object that will have functionality for reading/writing groups, including attributes. To access the hdf5 group directly use the BaseGroup.hdf5_group property.

>>> base = BaseGroup(hdf5_group)
>>> base.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object; that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the BaseGroup.write_metadata method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> base.metadata.existing_attribute = 'update_existing_attribute'
>>> base.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> base.metadata.add_base_attribute('new_attribute',
...                                  'new_attribute_value',
...                                  {'type':str,
...                                   'required':True,
...                                   'style':'free form',
...                                   'description': 'new attribute desc.',
...                                   'units':None,
...                                   'options':[],
...                                   'alias':[],
...                                   'example':'new attribute'})

Includes initializing functions that make a summary table and write metadata.

property dataset_options
property groups_list
initialize_group(**kwargs)[source]

Initialize group by making a summary table and writing metadata

property metadata

Metadata for the Group based on mt_metadata.timeseries

read_metadata()[source]

read metadata from the HDF5 group into metadata object

write_metadata()[source]

Write HDF5 metadata from metadata object.

class mth5.groups.ChannelDataset(dataset, dataset_metadata=None, **kwargs)[source]

Bases: object

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
            component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
property channel_entry

channel entry that will go into a full channel summary of the entire survey

property channel_response_filter
property end

return end time based on the data

extend_dataset(new_data_array, start_time, sample_rate, fill=None, max_gap_seconds=1, fill_window=10)[source]

Append data according to how the start time aligns with existing data. If the start time is before existing start time the data is prepended, similarly if the start time is near the end data will be appended.

If the start time is within the existing time range, existing data will be replace with the new data.

If there is a gap between start or end time of the new data with the existing data you can either fill the data with a constant value or an error will be raise depending on the value of fill.

Parameters
  • new_data_array (numpy.ndarray) – new data array with shape (npts, )

  • start_time (string or mth5.utils.mttime.MTime) – start time of the new data array in UTC

  • sample_rate (float) – Sample rate of the new data array, must match existing sample rate

  • fill (string, None, float, integer) –

    If there is a data gap how do you want to fill the gap:

    • None -> will raise an mth5.utils.exceptions.MTH5Error

    • ’mean’ -> will fill with the mean of each data set within the fill window

    • ’median’ -> will fill with the median of each data set within the fill window

    • value -> can be an integer or float to fill the gap

    • ’nan’ -> will fill the gap with NaN

  • max_gap_seconds (float or integer) – sets a maximum number of seconds the gap can be. Anything over this number will raise a mth5.utils.exceptions.MTH5Error.

  • fill_window (integer) – number of points from the end of each data set to estimate fill value from.

Raises

mth5.utils.exceptions.MTH5Error – if the sample rate is not the same or the fill value is not understood

Example

>>> ex = mth5_obj.get_channel('MT001', 'MT001a', 'Ex')
>>> ex.n_samples
4096
>>> ex.end
2015-01-08T19:32:09.500000+00:00
>>> t = timeseries.ChannelTS('electric',
...                          data=2 * np.cos(4 * np.pi * 0.05 *
...                                          np.linspace(0, 4096, num=4096) *
...                                          0.01),
...                          channel_metadata={'electric':{
...                              'component': 'ex',
...                              'sample_rate': 8,
...                              'time_period.start': (ex.end + 1).iso_str}})
>>> ex.extend_dataset(t.ts, t.start, t.sample_rate, fill='median',
...                   max_gap_seconds=2)
2020-07-02T18:02:47 - mth5.groups.Electric.extend_dataset - INFO -
filling data gap with 1.0385180759767025
>>> ex.n_samples
8200
>>> ex.end
2015-01-08T19:40:42.500000+00:00
from_channel_ts(channel_ts_obj, how='replace', fill=None, max_gap_seconds=1, fill_window=10)[source]

fill data set from a mth5.timeseries.ChannelTS object.

Will check for time alignment and metadata.

Parameters
  • channel_ts_obj (mth5.timeseries.ChannelTS) – time series object

  • how

    how the new array will be input to the existing dataset:

    • ’replace’ -> replace the entire dataset; nothing is left over.

    • ’extend’ -> add onto the existing dataset, any overlapping values will be rewritten, if there are gaps between data sets those will be handled depending on the value of fill.

  • fill (string, None, float, integer) –

    If there is a data gap how do you want to fill the gap:

    • None -> will raise an mth5.utils.exceptions.MTH5Error

    • ’mean’ -> will fill with the mean of each data set within the fill window

    • ’median’ -> will fill with the median of each data set within the fill window

    • value -> can be an integer or float to fill the gap

    • ’nan’ -> will fill the gap with NaN

  • max_gap_seconds (float or integer) – sets a maximum number of seconds the gap can be. Anything over this number will raise a mth5.utils.exceptions.MTH5Error.

  • fill_window (integer) – number of points from the end of each data set to estimate fill value from.

from_xarray(data_array, how='replace', fill=None, max_gap_seconds=1, fill_window=10)[source]

fill data set from a xarray.DataArray object.

Will check for time alignment and metadata.

Parameters
  • data_array_obj – Xarray data array

  • how

    how the new array will be input to the existing dataset:

    • ’replace’ -> replace the entire dataset; nothing is left over.

    • ’extend’ -> add onto the existing dataset, any overlapping values will be rewritten, if there are gaps between data sets those will be handled depending on the value of fill.

  • fill (string, None, float, integer) –

    If there is a data gap how do you want to fill the gap:

    • None -> will raise an mth5.utils.exceptions.MTH5Error

    • ’mean’ -> will fill with the mean of each data set within the fill window

    • ’median’ -> will fill with the median of each data set within the fill window

    • value -> can be an integer or float to fill the gap

    • ’nan’ -> will fill the gap with NaN

  • max_gap_seconds (float or integer) – sets a maximum number of seconds the gap can be. Anything over this number will raise a mth5.utils.exceptions.MTH5Error.

  • fill_window (integer) – number of points from the end of each data set to estimate fill value from.
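The ’extend’ mode with a median fill can be sketched in plain numpy. This is a simplified analogue of the gap handling (working in samples rather than seconds), not the actual implementation:

```python
import numpy as np

def extend_with_fill(existing, new, gap_samples, fill_window=10,
                     max_gap_samples=8):
    """Concatenate two arrays, filling the gap between them with the
    median of up to fill_window points on either side of the gap."""
    if gap_samples > max_gap_samples:
        raise ValueError("gap is larger than the maximum allowed")
    edges = np.concatenate([existing[-fill_window:], new[:fill_window]])
    fill_values = np.full(gap_samples, np.median(edges))
    return np.concatenate([existing, fill_values, new])

a = np.arange(8.0)          # existing data, ends at 7
b = np.arange(12.0, 16.0)   # new data, starts at 12
combined = extend_with_fill(a, b, gap_samples=4)
print(combined.size)  # 16
```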

get_index_from_time(given_time)[source]

get the appropriate index for a given time.

Parameters

given_time (string or mth5.utils.mttime.MTime) – time to find the index of

Returns

index of the sample closest to the given time

Return type

integer
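A sketch of the index computation, assuming a known start time and sample rate (plain datetime in place of mth5.utils.mttime.MTime):

```python
from datetime import datetime

def index_from_time(given_time, start, sample_rate):
    """Return the sample index closest to given_time."""
    t0 = datetime.fromisoformat(start)
    t = datetime.fromisoformat(given_time)
    return round((t - t0).total_seconds() * sample_rate)

print(index_from_time('2020-01-01T00:00:02', '2020-01-01T00:00:00', 8))  # 16
```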

property master_station_group

shortcut to master station group

property n_samples
read_metadata()[source]

Read metadata from the HDF5 file into the metadata container, that way it can be validated.

replace_dataset(new_data_array)[source]

replace the entire dataset with a new one, nothing left behind

Parameters

new_data_array (numpy.ndarray) – new data array shape (npts, )

property run_group

shortcut to run group

property sample_rate
property start
property station_group

shortcut to station group

property table_entry

Create a table entry to put into the run summary table.

property time_index

Time index built from the parameters in the metadata (start time, sample rate, number of samples).

Return type

pandas.DatetimeIndex
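The index can be built from the start time, sample rate, and number of samples. A sketch with pandas:

```python
import pandas as pd

def make_time_index(start, sample_rate, n_samples):
    """Build a DatetimeIndex from start time, sample rate, and length."""
    step = pd.Timedelta(seconds=1.0 / sample_rate)
    return pd.date_range(start=start, periods=n_samples, freq=step)

index = make_time_index('2020-01-01T00:00:00', 8, 4096)
print(index[0])   # 2020-01-01 00:00:00
print(index[-1])  # 2020-01-01 00:08:31.875000
```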

time_slice(start_time, end_time=None, n_samples=None, return_type='channel_ts')[source]

Get a time slice from the channel and return the appropriate type

  • numpy array with metadata

  • pandas.Dataframe with metadata

  • xarray.DataArray with metadata

  • mth5.timeseries.ChannelTS ‘default’

  • dask.DataFrame with metadata ‘not yet’

Parameters
  • start_time (string or mth5.utils.mttime.MTime) – start time of the slice

  • end_time (string or mth5.utils.mttime.MTime, optional) – end time of the slice

  • n_samples (integer, optional) – number of samples to read in

Returns

the correct container for the time series.

Return type

[ xarray.DataArray | pandas.DataFrame | mth5.timeseries.ChannelTS | numpy.ndarray ]

Raises

ValueError – if end_time and n_samples are both None or both given

Example with number of samples

>>> ex = mth5_obj.get_channel('FL001', 'FL001a', 'Ex')
>>> ex_slice = ex.time_slice("2015-01-08T19:49:15", n_samples=4096)
>>> ex_slice
<xarray.DataArray (time: 4096)>
array([0.93115046, 0.14233688, 0.87917119, ..., 0.26073634, 0.7137319 ,
       0.88154395])
Coordinates:
  * time     (time) datetime64[ns] 2015-01-08T19:49:15 ... 2015-01-08T19:57:46.875000
Attributes:
    ac.end:                      None
    ac.start:                    None
    ...

>>> type(ex_slice)
mth5.timeseries.ChannelTS

# plot the time series
>>> ex_slice.ts.plot()
Example with start and end time

>>> ex_slice = ex.time_slice("2015-01-08T19:49:15",
...                          end_time="2015-01-09T19:49:15")
Raises Example

>>> ex_slice = ex.time_slice("2015-01-08T19:49:15",
...                          end_time="2015-01-09T19:49:15",
...                          n_samples=4096)
ValueError: Must input either end_time or n_samples, not both.
to_channel_ts()[source]
Returns

a Timeseries with the appropriate time index and metadata

Return type

mth5.timeseries.ChannelTS

loads into RAM (nearly half the size of the xarray alone, not sure why)

to_dataframe()[source]
Returns

a dataframe where data is stored in the ‘data’ column and attributes are stored in the experimental attrs attribute

Return type

pandas.DataFrame

Note

Metadata will not be validated if changed in the returned DataFrame.

loads into RAM

to_numpy()[source]
Returns

a numpy structured array with 2 columns (time, channel_data)

Return type

numpy.core.records

data is a builtin name in numpy and cannot be used as a column name, so channel_data is used instead

loads into RAM
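The two-column layout can be sketched directly with numpy record arrays (illustrative values; the actual dtype comes from the channel data):

```python
import numpy as np

n = 4
# Build a time column at 8 samples/second (125 ms spacing).
times = np.datetime64('2020-01-01T00:00:00') + \
    np.arange(n) * np.timedelta64(125, 'ms')
data = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)

# 'data' collides with a builtin numpy name, hence 'channel_data'.
records = np.rec.fromarrays([times, data], names='time,channel_data')
print(records.dtype.names)  # ('time', 'channel_data')
```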

to_xarray()[source]
Returns

an xarray DataArray with appropriate metadata and the appropriate time index.

Return type

xarray.DataArray

Note

Metadata will not be validated if changed in the returned DataArray.

loads into RAM

write_metadata()[source]

Write metadata from the metadata container to the HDF5 attrs dictionary.

class mth5.groups.ElectricDataset(group, **kwargs)[source]

Bases: mth5.groups.master_station_run_channel.ChannelDataset

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
            component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
class mth5.groups.FiltersGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Not implemented yet

add_filter(filter_object)[source]

Add a filter dataset based on type

current types are:
  • zpk –> zeros, poles, gain

  • fap –> frequency look up table

  • time_delay –> time delay filter

  • coefficient –> coefficient filter

Parameters

filter_object (mt_metadata.timeseries.filters) – An MT metadata filter object
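A zpk filter is fully described by its zeros, poles, and gain. A minimal sketch of evaluating its frequency response in plain numpy (not the mt_metadata filter API):

```python
import numpy as np

def zpk_response(zeros, poles, gain, frequencies):
    """Evaluate H(s) = gain * prod(s - z) / prod(s - p) at s = 2j*pi*f."""
    s = 2j * np.pi * np.asarray(frequencies, dtype=float)
    num = np.prod([s - z for z in zeros], axis=0) if zeros else 1.0
    den = np.prod([s - p for p in poles], axis=0)
    return gain * num / den

# Single-pole low-pass with a 1 Hz corner: H = 1 / (1 + j*f).
h = zpk_response([], [-2 * np.pi], 2 * np.pi, [0.01, 1.0, 100.0])
print(np.abs(h).round(3))  # [1.    0.707 0.01 ]
```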

property filter_dict
get_filter(name)[source]

Get a filter by name

to_filter_object(name)[source]

return the MT metadata representation of the filter

class mth5.groups.MagneticDataset(group, **kwargs)[source]

Bases: mth5.groups.master_station_run_channel.ChannelDataset

Holds a channel dataset. This is a simple container for the data to make sure that the user has the flexibility to turn the channel into an object they want to deal with.

For now all the numpy type slicing can be used on hdf5_dataset

Parameters
  • dataset (h5py.Dataset) – dataset object for the channel

  • dataset_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If the dataset is not of the correct type

Utilities will be written to create some common objects like:

  • xarray.DataArray

  • pandas.DataFrame

  • zarr

  • dask.Array

The benefit of these other objects is that they can be indexed by time, and they have much more built-in functionality.

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> channel = run.get_channel('Ex')
>>> channel
Channel Electric:
-------------------
            component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
class mth5.groups.MasterStationGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Utility class that holds information about the stations within a survey and accompanying metadata. This class is the next level down from Survey –> /Survey/Stations. This class provides methods to add and get stations. A summary table of all existing stations is also provided as a convenience look-up table to make searching easier.

To access MasterStationGroup from an open MTH5 file:

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> stations = mth5_obj.stations_group

To check what stations exist

>>> stations.groups_list
['summary', 'MT001', 'MT002', 'MT003']

To access the hdf5 group directly use MasterStationGroup.hdf5_group.

>>> stations.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object, that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the MasterStationGroup.write_metadata() method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> stations.metadata.existing_attribute = 'update_existing_attribute'
>>> stations.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> stations.metadata.add_base_attribute('new_attribute',
...                                      'new_attribute_value',
...                                      {'type':str,
...                                       'required':True,
...                                       'style':'free form',
...                                       'description': 'new attribute desc.',
...                                       'units':None,
...                                       'options':[],
...                                       'alias':[],
...                                       'example':'new attribute'})

To add a station:

>>> new_station = stations.add_station('new_station')
>>> stations
/Survey/Stations:
====================
    --> Dataset: summary
    ......................
    |- Group: new_station
    ---------------------
        --> Dataset: summary
        ......................

Add a station with metadata:

>>> from mth5.metadata import Station
>>> station_metadata = Station()
>>> station_metadata.id = 'MT004'
>>> station_metadata.time_period.start = '2020-01-01T12:30:00'
>>> station_metadata.location.latitude = 40.000
>>> station_metadata.location.longitude = -120.000
>>> new_station = stations.add_station('MT004', station_metadata)
>>> # to look at the metadata
>>> new_station.metadata
{
    "station": {
        "acquired_by.author": null,
        "acquired_by.comments": null,
        "id": "MT004",
        ...
        }
}

See also

mth5.metadata for details on how to add metadata from various files and python objects.

To remove a station:

>>> stations.remove_station('new_station')
>>> stations
/Survey/Stations:
====================
    --> Dataset: summary
    ......................

Note

Deleting a station is not as simple as del(station). In HDF5 this does not free up memory, it simply removes the reference to that station. The common way to get around this is to copy what you want into a new file, or overwrite the station.

To get a station:

>>> existing_station = stations.get_station('existing_station_name')
>>> existing_station
/Survey/Stations/existing_station_name:
=======================================
    --> Dataset: summary
    ......................
    |- Group: run_01
    ----------------
        --> Dataset: summary
        ......................
        --> Dataset: Ex
        ......................
        --> Dataset: Ey
        ......................
        --> Dataset: Hx
        ......................
        --> Dataset: Hy
        ......................
        --> Dataset: Hz
        ......................

A summary table is provided to make searching easier. The table summarizes all stations within a survey. To see what names are in the summary table:

>>> stations.summary_table.dtype.descr
[('id', ('|S5', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('components', ('|S100', {'h5py_encoding': 'ascii'})),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('sample_rate', '<f8')]

Note

When a station is added an entry is added to the summary table, where the information is pulled from the metadata.

>>> stations.summary_table
index |   id    |           start           |            end            |   components   | measurement_type | sample_rate
-------------------------------------------------------------------------------------------------------------------------
  0   | Test_01 | 1980-01-01T00:00:00+00:00 | 1980-01-01T00:00:00+00:00 | Ex,Ey,Hx,Hy,Hz |       BBMT       |     100
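The look-up role of the summary table can be sketched in plain Python; the rows and the find_station helper below are illustrative only, not part of the MTH5 API:

```python
# Hypothetical in-memory stand-in for the stations summary table; the
# real table is an HDF5 dataset with the dtype shown above.
rows = [
    {"id": "MT001", "start": "2020-01-01T00:00:00+00:00", "sample_rate": 256.0},
    {"id": "MT002", "start": "2020-01-05T00:00:00+00:00", "sample_rate": 4096.0},
]

def find_station(rows, station_id):
    """Return the first summary row whose id matches, else None."""
    for row in rows:
        if row["id"] == station_id:
            return row
    return None

match = find_station(rows, "MT002")
```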
add_station(station_name, station_metadata=None)[source]
Add a station with metadata, if given, at the path:

/Survey/Stations/station_name

If the station already exists, that station is returned and nothing is added.

Parameters
  • station_name (string) – Name of the station, should be the same as metadata.id

  • station_metadata (mth5.metadata.Station, optional) – Station metadata container, defaults to None

Returns

A convenience class for the added station

Return type

mth5_groups.StationGroup

Example
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> # one option
>>> stations = mth5_obj.stations_group
>>> new_station = stations.add_station('MT001')
>>> # another option
>>> new_station = mth5_obj.stations_group.add_station('MT001')
property channel_summary

Summary of all channels in the file.

get_station(station_name)[source]

Get a station with the same name as station_name

Parameters

station_name (string) – existing station name

Returns

convenience station class

Return type

mth5.mth5_groups.StationGroup

Raises

MTH5Error – if the station name is not found.

Example

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> # one option
>>> stations = mth5_obj.stations_group
>>> existing_station = stations.get_station('MT001')
>>> # another option
>>> existing_station = mth5_obj.stations_group.get_station('MT001')
MTH5Error: MT001 does not exist, check station_list for existing names
remove_station(station_name)[source]

Remove a station from the file.

Note

Deleting a station is not as simple as del(station). In HDF5 this does not free up memory, it simply removes the reference to that station. The common way to get around this is to copy what you want into a new file, or overwrite the station.

Parameters

station_name (string) – existing station name

Example
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> # one option
>>> stations = mth5_obj.stations_group
>>> stations.remove_station('MT001')
>>> # another option
>>> mth5_obj.stations_group.remove_station('MT001')
property station_summary

Summary of stations in the file

Returns

DESCRIPTION

Return type

TYPE

class mth5.groups.ReportsGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Not sure how to handle this yet

add_report(report_name, report_metadata=None, report_data=None)[source]
Parameters
  • report_name (TYPE) – DESCRIPTION

  • report_metadata (TYPE, optional) – DESCRIPTION, defaults to None

  • report_data (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

class mth5.groups.RunGroup(group, run_metadata=None, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

RunGroup is a utility class to hold information about a single run and accompanying metadata. This class is the next level down from Stations –> /Survey/Stations/station/station{a-z}.

This class provides methods to add and get channels. A summary table of all existing channels in the run is also provided as a convenience look up table to make searching easier.

Parameters
  • group (h5py.Group) – HDF5 group for a station, should have a path /Survey/Stations/station_name/run_name

  • run_metadata (mth5.metadata.Run, optional) – metadata container, defaults to None

Access RunGroup from an open MTH5 file

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
Check what channels exist

>>> run.groups_list
['Ex', 'Ey', 'Hx', 'Hy']

To access the hdf5 group directly use RunGroup.hdf5_group

>>> run.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object, that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the RunGroup.write_metadata() method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> run.metadata.existing_attribute = 'update_existing_attribute'
>>> run.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> run.metadata.add_base_attribute('new_attribute',
...                                 'new_attribute_value',
...                                 {'type':str,
...                                  'required':True,
...                                  'style':'free form',
...                                  'description': 'new attribute desc.',
...                                  'units':None,
...                                  'options':[],
...                                  'alias':[],
...                                  'example':'new attribute'})
Add a channel

>>> new_channel = run.add_channel('Ex', 'electric',
...                               data=numpy.random.rand(4096))
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ex
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
Add a channel with metadata

>>> from mth5.metadata import Electric
>>> ex_metadata = Electric()
>>> ex_metadata.time_period.start = '2020-01-01T12:30:00'
>>> ex_metadata.time_period.end = '2020-01-03T16:30:00'
>>> new_ex = run.add_channel('Ex', 'electric',
...                          channel_metadata=ex_metadata)
>>> # to look at the metadata
>>> new_ex.metadata
{
     "electric": {
        "ac.end": 1.2,
        "ac.start": 2.3,
        ...
        }
}

See also

mth5.metadata for details on how to add metadata from various files and python objects.

Remove a channel

>>> run.remove_channel('Ex')
>>> run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................

Note

Deleting a channel is not as simple as del(channel). In HDF5 this does not free up memory, it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.

Get a channel

>>> existing_ex = run.get_channel('Ex')
>>> existing_ex
Channel Electric:
-------------------
    component:        Ex
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:08:32+00:00
    sample rate:      8
Summary Table

A summary table is provided to make searching easier. The table summarizes all channels within the run. To see what names are in the summary table:

>>> run.summary_table.dtype.descr
[('component', ('|S5', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('n_samples', '<i4'),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('units', ('|S25', {'h5py_encoding': 'ascii'})),
 ('hdf5_reference', ('|O', {'ref': h5py.h5r.Reference}))]

Note

When a run is added an entry is added to the summary table, where the information is pulled from the metadata.

>>> new_run.summary_table
index | component | start | end | n_samples | measurement_type | units | hdf5_reference
---------------------------------------------------------------------------------------
add_channel(channel_name, channel_type, data, channel_dtype='int32', max_shape=(None), chunks=True, channel_metadata=None, **kwargs)[source]

add a channel to the run

Parameters
  • channel_name (string) – name of the channel

  • channel_type (string) – [ electric | magnetic | auxiliary ]

  • channel_metadata ([ mth5.metadata.Electric | mth5.metadata.Magnetic | mth5.metadata.Auxiliary ], optional) – metadata container, defaults to None

Raises

MTH5Error – If channel type is not correct

Returns

Channel container

Return type

[ mth5.mth5_groups.ElectricDataset | mth5.mth5_groups.MagneticDataset | mth5.mth5_groups.AuxiliaryDataset ]

>>> new_channel = run.add_channel('Ex', 'electric', None)
>>> new_channel
Channel Electric:
-------------------
    component:        None
    data type:        electric
    data format:      float32
    data shape:       (1,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:00+00:00
    sample rate:      None
property channel_summary

Summary of channels in the run.

from_channel_ts(channel_ts_obj)[source]

create a channel data set from a mth5.timeseries.ChannelTS object and update metadata.

Parameters

channel_ts_obj (mth5.timeseries.ChannelTS) – a single time series object

Returns

new channel dataset

Return type

mth5.groups.ChannelDataset

from_runts(run_ts_obj, **kwargs)[source]

create channel datasets from a mth5.timeseries.RunTS object and update metadata.

Parameters

run_ts_obj (mth5.timeseries.RunTS) – Run object with all the appropriate channels and metadata.

Will create a run group and appropriate channel datasets.

get_channel(channel_name)[source]

Get a channel from an existing name. Returns the appropriate container.

Parameters

channel_name (string) – name of the channel

Returns

Channel container

Return type

[ mth5.mth5_groups.ElectricDataset | mth5.mth5_groups.MagneticDataset | mth5.mth5_groups.AuxiliaryDataset ]

Raises

MTH5Error – If no channel is found

Example

>>> existing_channel = run.get_channel('Ex')
MTH5Error: Ex does not exist, check groups_list for existing names
>>> run.groups_list
['Ey', 'Hx', 'Hz']
>>> existing_channel = run.get_channel('Ey')
>>> existing_channel
Channel Electric:
-------------------
    component:        Ey
    data type:        electric
    data format:      float32
    data shape:       (4096,)
    start:            1980-01-01T00:00:00+00:00
    end:              1980-01-01T00:00:01+00:00
    sample rate:      4096
property master_station_group

shortcut to master station group

property metadata

Overwrite get metadata to include channel information in the runs

remove_channel(channel_name)[source]

Remove a channel from the run.

Note

Deleting a channel is not as simple as del(channel). In HDF5 this does not free up memory, it simply removes the reference to that channel. The common way to get around this is to copy what you want into a new file, or overwrite the channel.

Parameters

channel_name (string) – existing channel name

Example

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> run = mth5_obj.stations_group.get_station('MT001').get_run('MT001a')
>>> run.remove_channel('Ex')
property station_group

shortcut to station group

property table_entry

Get a run table entry

Returns

a properly formatted run table entry

Return type

numpy.ndarray with dtype:

>>> dtype([('id', 'S20'),
         ('start', 'S32'),
         ('end', 'S32'),
         ('components', 'S100'),
         ('measurement_type', 'S12'),
         ('sample_rate', float),
         ('hdf5_reference', h5py.ref_dtype)])
to_runts()[source]

create a mth5.timeseries.RunTS object from channels of the run

Returns

DESCRIPTION

Return type

TYPE

validate_run_metadata()[source]

Update metadata and table entries to ensure consistency

Returns

DESCRIPTION

Return type

TYPE

write_metadata()[source]

Overwrite BaseGroup.write_metadata to also update the table entry, then write HDF5 metadata from the metadata object.

class mth5.groups.StandardsGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

The StandardsGroup is a convenience group that stores the metadata standards that were used to make the current file. This is to help a user understand the metadata directly from the file and not have to look up documentation that might not be updated.

The metadata standards are stored in the summary table /Survey/Standards/summary

>>> standards = mth5_obj.standards_group
>>> standards.summary_table
index | attribute | type | required | style | units | description | options | alias | example
---------------------------------------------------------------------------------------------
get_attribute_information(attribute_name)[source]

get information about an attribute

The attribute name should be in the summary table.

Parameters

attribute_name (string) – attribute name

Returns

prints a description of the attribute

Raises

MTH5TableError – if attribute is not found

>>> standards = mth5_obj.standards_group
>>> standards.get_attribute_information('survey.release_license')
survey.release_license
--------------------------
        type          : string
        required      : True
        style         : controlled vocabulary
        units         :
        description   : How the data can be used. The options are based on
                 Creative Commons licenses. For details visit
                 https://creativecommons.org/licenses/
        options       : CC-0,CC-BY,CC-BY-SA,CC-BY-ND,CC-BY-NC-SA,CC-BY-NC-ND
        alias         :
        example       : CC-0
initialize_group()[source]

Initialize the group by making a summary table that summarizes the metadata standards used to describe the data.

Also, write generic metadata information.

property summary_table
summary_table_from_dict(summary_dict)[source]

Fill summary table from a dictionary that summarizes the metadata for the entire survey.

Parameters

summary_dict (dictionary) – Flattened dictionary of all metadata standards within the survey.

class mth5.groups.StationGroup(group, station_metadata=None, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

StationGroup is a utility class to hold information about a single station and accompanying metadata. This class is the next level down from Stations –> /Survey/Stations/station_name.

This class provides methods to add and get runs. A summary table of all existing runs in the station is also provided as a convenience look up table to make searching easier.

Parameters
  • group (h5py.Group) – HDF5 group for a station, should have a path /Survey/Stations/station_name

  • station_metadata (mth5.metadata.Station, optional) – metadata container, defaults to None

Usage

Access StationGroup from an open MTH5 file

>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> station = mth5_obj.stations_group.get_station('MT001')
Check what runs exist

>>> station.groups_list
['MT001a', 'MT001b', 'MT001c', 'MT001d']

To access the hdf5 group directly use StationGroup.hdf5_group.

>>> station.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object, that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the StationGroup.write_metadata() method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> station.metadata.existing_attribute = 'update_existing_attribute'
>>> station.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> station.metadata.add_base_attribute('new_attribute',
...                                     'new_attribute_value',
...                                     {'type':str,
...                                      'required':True,
...                                      'style':'free form',
...                                      'description': 'new attribute desc.',
...                                      'units':None,
...                                      'options':[],
...                                      'alias':[],
...                                      'example':'new attribute'})
To add a run

>>> new_run = station.add_run('MT001e')
>>> new_run
/Survey/Stations/Test_01:
=========================
    |- Group: MT001e
    -----------------
        --> Dataset: summary
        ......................
    --> Dataset: summary
    ......................
Add a run with metadata

>>> from mth5.metadata import Run
>>> run_metadata = Run()
>>> run_metadata.time_period.start = '2020-01-01T12:30:00'
>>> run_metadata.time_period.end = '2020-01-03T16:30:00'
>>> new_run = station.add_run('MT001e', run_metadata)
>>> # to look at the metadata
>>> new_run.metadata
{
    "run": {
        "acquired_by.author": "new_user",
        "acquired_by.comments": "First time",
        "channels_recorded_auxiliary": ['T'],
        ...
        }
}

See also

mth5.metadata for details on how to add metadata from various files and python objects.

Remove a run

>>> station.remove_run('new_run')
>>> station
/Survey/Stations/Test_01:
=========================
    --> Dataset: summary
    ......................

Note

Deleting a run is not as simple as del(run). In HDF5 this does not free up memory, it simply removes the reference to that run. The common way to get around this is to copy what you want into a new file, or overwrite the run.

Get a run

>>> existing_run = station.get_run('existing_run')
>>> existing_run
/Survey/Stations/MT001/MT001a:
=======================================
    --> Dataset: summary
    ......................
    --> Dataset: Ex
    ......................
    --> Dataset: Ey
    ......................
    --> Dataset: Hx
    ......................
    --> Dataset: Hy
    ......................
    --> Dataset: Hz
    ......................
Summary Table

A summary table is provided to make searching easier. The table summarizes all runs within the station. To see what names are in the summary table:

>>> new_run.summary_table.dtype.descr
[('id', ('|S20', {'h5py_encoding': 'ascii'})),
 ('start', ('|S32', {'h5py_encoding': 'ascii'})),
 ('end', ('|S32', {'h5py_encoding': 'ascii'})),
 ('components', ('|S100', {'h5py_encoding': 'ascii'})),
 ('measurement_type', ('|S12', {'h5py_encoding': 'ascii'})),
 ('sample_rate', '<f8'),
 ('hdf5_reference', ('|O', {'ref': h5py.h5r.Reference}))]

Note

When a run is added an entry is added to the summary table, where the information is pulled from the metadata.

>>> station.summary_table
index | id | start | end | components | measurement_type | sample_rate | hdf5_reference
----------------------------------------------------------------------------------------
add_run(run_name, run_metadata=None)[source]

Add a run to a station.

Parameters
  • run_name (string) – run name, should be id{a-z}

  • run_metadata (mth5.metadata.Run, optional) – metadata container, defaults to None

need to be able to fill an entry in the summary table.

get_run(run_name)[source]

get a run from run name

Parameters

run_name (string) – existing run name

Returns

Run object

Return type

mth5.mth5_groups.RunGroup

>>> existing_run = station.get_run('MT001a')
locate_run(sample_rate, start)[source]

Locate a run based on sample rate and start time from the summary table

Parameters
  • sample_rate (float) – sample rate in samples/seconds

  • start (string or mth5.utils.mttime.MTime) – start time

Returns

appropriate run name, None if not found

Return type

string or None
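The matching logic can be sketched as a scan over summary-table rows; the run_rows data and the helper below are illustrative, not the MTH5 implementation:

```python
# Hypothetical summary rows: (run name, start time, sample rate).
run_rows = [
    ("MT001a", "2020-01-01T12:30:00+00:00", 256.0),
    ("MT001b", "2020-01-02T12:30:00+00:00", 4096.0),
]

def locate_run(rows, sample_rate, start):
    """Return the first run name matching both criteria, None if absent."""
    for run_id, run_start, run_rate in rows:
        if run_rate == sample_rate and run_start == start:
            return run_id
    return None

found = locate_run(run_rows, 4096.0, "2020-01-02T12:30:00+00:00")
```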

make_run_name()[source]

Make a run name that will be the next alphabet letter extracted from the run list. Expects that all runs are labeled as id{a-z}.

Returns

metadata.id + next letter

Return type

string

>>> station.metadata.id = 'MT001'
>>> station.make_run_name()
'MT001a'
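The id{a-z} naming scheme can be sketched as follows; next_run_name is an illustrative helper, not part of the MTH5 API:

```python
import string

def next_run_name(station_id, existing_runs):
    """Return station_id plus the first unused letter a-z."""
    used = {name[len(station_id):] for name in existing_runs
            if name.startswith(station_id)}
    for letter in string.ascii_lowercase:
        if letter not in used:
            return station_id + letter
    raise ValueError("all run letters a-z are in use")

print(next_run_name("MT001", ["MT001a", "MT001b"]))  # MT001c
```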
property master_station_group

shortcut to master station group

property metadata

Overwrite get metadata to include run information in the station

property name
remove_run(run_name)[source]

Remove a run from the station.

Note

Deleting a run is not as simple as del(run). In HDF5 this does not free up memory, it simply removes the reference to that run. The common way to get around this is to copy what you want into a new file, or overwrite the run.

Parameters

run_name (string) – existing run name

Example
>>> from mth5 import mth5
>>> mth5_obj = mth5.MTH5()
>>> mth5_obj.open_mth5(r"/test.mth5", mode='a')
>>> station = mth5_obj.stations_group.get_station('MT001')
>>> station.remove_run('MT001a')
property run_summary

Summary of runs in the station

Returns

DESCRIPTION

Return type

TYPE

property table_entry

make table entry

validate_station_metadata()[source]

Check metadata from the runs and make sure it matches the station metadata

Returns

DESCRIPTION

Return type

TYPE

class mth5.groups.SurveyGroup(group, **kwargs)[source]

Bases: mth5.groups.base.BaseGroup

Utility class that holds general information about the survey and accompanying metadata for an MT survey.

To access the hdf5 group directly use SurveyGroup.hdf5_group.

>>> survey = SurveyGroup(hdf5_group)
>>> survey.hdf5_group.ref
<HDF5 Group Reference>

Note

All attributes should be input into the metadata object, that way all input will be validated against the metadata standards. If you change attributes in the metadata object, you should run the SurveyGroup.write_metadata() method. This is a temporary solution; an automatic updater for changed metadata is in the works.

>>> survey.metadata.existing_attribute = 'update_existing_attribute'
>>> survey.write_metadata()

If you want to add a new attribute this should be done using the metadata.add_base_attribute method.

>>> survey.metadata.add_base_attribute('new_attribute',
...                                    'new_attribute_value',
...                                    {'type':str,
...                                     'required':True,
...                                     'style':'free form',
...                                     'description': 'new attribute desc.',
...                                     'units':None,
...                                     'options':[],
...                                     'alias':[],
...                                     'example':'new attribute'})

Tip

If you want to add stations, reports, etc. to the survey, this should be done from the MTH5 object. This is to avoid duplication, at least for now.

To look at what the structure of /Survey looks like:

>>> survey
/Survey:
====================
    |- Group: Filters
    -----------------
        --> Dataset: summary
    -----------------
    |- Group: Reports
    -----------------
        --> Dataset: summary
        -----------------
    |- Group: Standards
    -------------------
        --> Dataset: summary
        -----------------
    |- Group: Stations
    ------------------
        --> Dataset: summary
        -----------------
property metadata

Overwrite get metadata to include station information

property stations_group
update_survey_metadata(survey_dict=None)[source]

Update start/end dates and location corners from stations_group.summary_table.