mth5.clients package

Submodules

mth5.clients.make_mth5 module

Make MTH5

This module provides helper functions for making an MTH5 file from various clients.

Supported clients include:

  • FDSN (through Obspy)

  • Science Base (TODO)

  • NCI - Australia (TODO)

Updated on Wed Aug 25 19:57:00 2021

@author: jpeacock + tronan

class mth5.clients.make_mth5.MakeMTH5(mth5_version='0.2.0', interact=False, save_path=None, **kwargs)[source]

Bases: object

from_fdsn_client(request_df, client='IRIS', **kwargs)[source]

Pull data from an FDSN archive like IRIS. Uses Obspy.Clients.

Parameters
  • request_df (pandas.DataFrame) –

    DataFrame with columns

    • ’network’ –> FDSN Network code

    • ’station’ –> FDSN Station code

    • ’location’ –> FDSN Location code

    • ’channel’ –> FDSN Channel code

    • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

    • ’end’ –> End time YYYY-MM-DDThh:mm:ss

  • client (string, optional) – FDSN client name, defaults to “IRIS”

  • interact (bool) – Whether to keep the created MTH5 file open

Raises
  • AttributeError – Raised if the input DataFrame is not properly formatted.

  • ValueError – Raised if the values of the DataFrame are not correct.

Returns

MTH5 file name

Return type

pathlib.Path

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.
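The request table described above can be sketched as follows. This is a minimal illustration, not the library's own code: `make_request_row` and `download` are hypothetical helpers, the network, station and channel codes are placeholders, and the check that start precedes end is an added sanity check. The call in `download` follows the signatures shown in this reference and needs pandas, mth5 and network access, so it is defined but not run here.

```python
from datetime import datetime

# Column names as listed for request_df above.
FDSN_COLUMNS = ["network", "station", "location", "channel", "start", "end"]
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"  # YYYY-MM-DDThh:mm:ss


def make_request_row(network, station, location, channel, start, end):
    """Return one request row as a dict, rejecting malformed time stamps."""
    # strptime raises ValueError when a stamp does not match TIME_FORMAT.
    t0 = datetime.strptime(start, TIME_FORMAT)
    t1 = datetime.strptime(end, TIME_FORMAT)
    if t0 >= t1:
        raise ValueError("start must be earlier than end")
    values = [network, station, location, channel, start, end]
    return dict(zip(FDSN_COLUMNS, values))


def download(rows, save_path=None):
    """Pass the rows to MakeMTH5 (needs pandas, mth5 and network access)."""
    import pandas as pd
    from mth5.clients import MakeMTH5

    request_df = pd.DataFrame(rows, columns=FDSN_COLUMNS)
    maker = MakeMTH5(mth5_version="0.2.0", interact=False, save_path=save_path)
    return maker.from_fdsn_client(request_df, client="IRIS")


# Placeholder codes, not a real deployment. The blank location means
# "any location", as described in the note above.
rows = [
    make_request_row("XX", "STA01", "", "LFE", "2020-06-02T19:00:00", "2020-07-13T19:00:00"),
    make_request_row("XX", "STA01", "", "LFN", "2020-06-02T19:00:00", "2020-07-13T19:00:00"),
]
```

Calling `download(rows)` would then hand the table to `from_fdsn_client`.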

from_usgs_geomag(request_df, **kwargs)[source]

Download geomagnetic observatory data from USGS webservices into an MTH5 using a request dataframe or csv file.

  • observatory: Geomagnetic observatory ID

  • type: type of data to get, e.g. ‘adjusted’

  • start: start date and time of the request, in UTC

  • end: end date and time of the request, in UTC

  • elements: components to get

  • sampling_period: time between samples, in seconds

Parameters

request_df (pandas.DataFrame, str or Path if csv file) –

DataFrame with columns

  • ’observatory’ –> Observatory code

  • ’type’ –> data type [ ‘variation’ | ‘adjusted’ | ‘quasi-definitive’ | ‘definitive’ ]

  • ’elements’ –> Elements to get [D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z]

  • ’sampling_period’ –> sample period [ 1 | 60 | 3600 ]

  • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

  • ’end’ –> End time YYYY-MM-DDThh:mm:ss

Returns

If interact is True, an MTH5 object is returned; otherwise the path to the file is returned.

Return type

Path or mth5.mth5.MTH5
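Because request_df may also be given as a CSV file, a request can be prepared with the standard library alone. The sketch below is illustrative: `write_request_csv` and `download` are hypothetical helpers, the observatory code and dates are example values, and writing several elements as one comma-separated cell is an assumption, since this reference does not state how the column is encoded in a CSV file. `download` needs mth5 and network access and is not run here.

```python
import csv
import tempfile
from pathlib import Path

# Column names as listed for the request table above.
GEOMAG_COLUMNS = ["observatory", "type", "elements", "sampling_period", "start", "end"]

# Example values only; "X,Y" in one cell is an assumed encoding.
rows = [
    {"observatory": "FRN", "type": "adjusted", "elements": "X,Y",
     "sampling_period": 1, "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
    {"observatory": "FRN", "type": "adjusted", "elements": "X,Y",
     "sampling_period": 1, "start": "2020-01-03T00:00:00", "end": "2020-01-04T00:00:00"},
]


def write_request_csv(rows, csv_path):
    """Write the request rows to csv_path with a header line."""
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="") as fid:
        writer = csv.DictWriter(fid, fieldnames=GEOMAG_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return csv_path


def download(csv_path, save_path=None):
    """Pass the CSV file to MakeMTH5 (needs mth5 and network access)."""
    from mth5.clients import MakeMTH5

    maker = MakeMTH5(interact=False, save_path=save_path)
    return maker.from_usgs_geomag(csv_path)


request_csv = write_request_csv(rows, Path(tempfile.mkdtemp()) / "geomag_request.csv")
```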

from_zen(data_path, sample_rates=[4096, 1024, 256], calibration_path=None, survey_id=None, combine=True, **kwargs)[source]

Create an MTH5 from zen data.

Parameters
  • data_path (TYPE) – DESCRIPTION

  • sample_rates (TYPE, optional) – DESCRIPTION, defaults to [4096, 1024, 256]

  • save_path (TYPE, optional) – DESCRIPTION, defaults to None

  • calibration_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

Module contents

class mth5.clients.FDSN(client='IRIS', mth5_version='0.2.0', **kwargs)[source]

Bases: object

build_network_dict(df, client)[source]

Build out a dictionary of networks, keyed by network_id, start_time. We could return this dict and use it as an auxiliary variable, but it seems easier to just add a column to the df.

Parameters

df (pd.DataFrame) – This is a “request_df”
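The two-level keying described here can be sketched with plain dictionaries. This is only an illustration of the data structure: `build_network_lookup` and `fetch_network` are hypothetical names, with `fetch_network` standing in for whatever call retrieves a network object from the client, and the rows are plain dicts rather than DataFrame rows.

```python
def build_network_lookup(rows, fetch_network):
    """Return networks[network_id][start_time] -> network object."""
    networks = {}
    for row in rows:
        by_start = networks.setdefault(row["network"], {})
        # Fetch each (network, start) pair once, however many rows share it.
        if row["start"] not in by_start:
            by_start[row["start"]] = fetch_network(row["network"], row["start"])
    return networks


calls = []


def fake_fetch(network_id, start_time):
    """Stand-in for a client request; records each call."""
    calls.append((network_id, start_time))
    return f"network {network_id} at {start_time}"


rows = [
    {"network": "XX", "station": "STA01", "start": "2020-06-02T19:00:00"},
    {"network": "XX", "station": "STA02", "start": "2020-06-02T19:00:00"},
    {"network": "XX", "station": "STA01", "start": "2020-08-01T00:00:00"},
]
networks = build_network_lookup(rows, fake_fetch)
```

Three request rows lead to two fetches here, because the first two rows share a network and start time.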

build_station_dict(df, client, networks_dict)[source]

Given the {network-id, starttime}-keyed dict of networks, we build a station layer beneath it.

Parameters
  • df

  • networks_dict

get_df_from_inventory(inventory)[source]

Create a data frame from an inventory object

Parameters

inventory (obspy.Inventory) – inventory object

Returns

dataframe in proper format

Return type

pandas.DataFrame

get_fdsn_channel_map()[source]

get_inventory_from_df(df, client=None, data=True)[source]

20230806: The nested for-looping here can make debugging complex, as well as lead to a lot of redundancies. I propose that we build out a dictionary of networks, keyed by network_id, start_time. It may actually be simpler to just add a column to the request_df that has the network_obj.

    networks = {}
    networks[network_id] = {}
    networks[network_id][start_time_1] = obspy_network_obj
    networks[network_id][start_time_2] = obspy_network_obj
    …

Then the role of “returned_network” can be replaced by accessing the appropriate element, and the second for-loop can move up by a layer of indentation.

Will try to factor i

Get an obspy.Inventory object from a pandas.DataFrame

Parameters
  • df (pandas.DataFrame) –

    DataFrame with columns

    • ’network’ –> FDSN Network code

    • ’station’ –> FDSN Station code

    • ’location’ –> FDSN Location code

    • ’channel’ –> FDSN Channel code

    • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

    • ’end’ –> End time YYYY-MM-DDThh:mm:ss

  • client (string) – FDSN client

  • data (boolean, optional) – True if you want data, False if you want just metadata; defaults to True

Returns

An inventory of the requested metadata, and the data

Return type

obspy.Inventory and obspy.Stream

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.

get_run_group(mth5_obj_or_survey, station_id, run_id)[source]

This method is key to merging wrangle_runs_into_containers_v1 and wrangle_runs_into_containers_v2, because a v1 mth5 object can get a survey group with the same method as a v2 survey_group can.

Thus we can replace

    run_group = m.stations_group.get_station(station_id).add_run(run_id)

and

    run_group = survey_group.stations_group.get_station(station_id).add_run(run_id)

with

    run_group = mth5_obj_or_survey.stations_group.get_station(station_id).add_run(run_id)

Parameters

mth5_obj_or_survey (mth5.mth5.MTH5 or mth5.groups.survey.SurveyGroup) –
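The idea is that both containers expose the same stations_group interface, so one expression serves either. The sketch below shows this with stand-in classes; every `Stub…` class is hypothetical and only mimics the attribute and method names quoted above, not the real mth5 objects.

```python
class StubStation:
    """Stand-in for a station group; records the runs added to it."""

    def __init__(self):
        self.runs = []

    def add_run(self, run_id):
        self.runs.append(run_id)
        return run_id


class StubStations:
    """Stand-in for a stations_group holding stations by id."""

    def __init__(self, station_ids):
        self._stations = {sid: StubStation() for sid in station_ids}

    def get_station(self, station_id):
        return self._stations[station_id]


class StubV1File:
    """Plays the role of a v0.1.0 MTH5 object."""

    def __init__(self):
        self.stations_group = StubStations(["STA01"])


class StubV2Survey:
    """Plays the role of a v0.2.0 survey group."""

    def __init__(self):
        self.stations_group = StubStations(["STA01"])


def get_run_group(mth5_obj_or_survey, station_id, run_id):
    # The same attribute chain works for either container type.
    return mth5_obj_or_survey.stations_group.get_station(station_id).add_run(run_id)


v1_run = get_run_group(StubV1File(), "STA01", "001")
v2_run = get_run_group(StubV2Survey(), "STA01", "001")
```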

get_run_list_from_station_id(m, station_id, survey_id=None)[source]

ignored_groups created to address issue #153. This might be better placed closer to the core of mth5.

Parameters
  • m

  • station_id

Returns

run_list

Return type

list of strings

get_station_streams(station_id)[source]

Get streams for a certain station

get_unique_networks_and_stations(df)[source]

Get unique lists of networks, stations, locations, and channels from a given data frame.

[{‘network’: FDSN code, “stations”: [list of stations for network]}]

Parameters

df (pandas.DataFrame) – request data frame

Returns

list of network dictionaries of the form [{‘network’: FDSN code, “stations”: [list of stations for network]}]

Return type

list
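The returned structure can be illustrated with a small stand-alone sketch. It works on plain dicts rather than a DataFrame, and `unique_networks_and_stations` is a hypothetical name, so treat it as a picture of the output shape rather than the method's implementation.

```python
def unique_networks_and_stations(rows):
    """Collect each network once, with its stations in first-seen order."""
    stations_by_network = {}
    for row in rows:
        stations = stations_by_network.setdefault(row["network"], [])
        if row["station"] not in stations:
            stations.append(row["station"])
    return [
        {"network": network, "stations": stations}
        for network, stations in stations_by_network.items()
    ]


# Placeholder codes for illustration.
rows = [
    {"network": "XX", "station": "STA01"},
    {"network": "XX", "station": "STA01"},
    {"network": "XX", "station": "STA02"},
    {"network": "YY", "station": "STA09"},
]
unique_list = unique_networks_and_stations(rows)
# unique_list == [{"network": "XX", "stations": ["STA01", "STA02"]},
#                 {"network": "YY", "stations": ["STA09"]}]
```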

get_waveforms_from_request_row(client, row)[source]

Parameters
  • client

  • row

make_filename(df)[source]

Make a filename from the networks and stations in a data frame

Parameters

df (pandas.DataFrame) – request data frame

Returns

file name as network_01+stations_network_02+stations.h5

Return type

string
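One way to read the naming pattern is sketched below. It assumes that a network code and its stations are joined with underscores, which the pattern above does not state explicitly, and it takes the list-of-dicts structure documented for get_unique_networks_and_stations as input. `filename_from_unique_list` is a hypothetical name.

```python
def filename_from_unique_list(unique_list):
    """Join each network with its stations, then join the networks."""
    parts = [
        "_".join([entry["network"], *entry["stations"]])
        for entry in unique_list
    ]
    return "_".join(parts) + ".h5"


# Placeholder codes for illustration.
unique_list = [
    {"network": "XX", "stations": ["STA01", "STA02"]},
    {"network": "YY", "stations": ["STA09"]},
]
file_name = filename_from_unique_list(unique_list)
# file_name == "XX_STA01_STA02_YY_STA09.h5"
```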

make_mth5_from_fdsn_client(df, path=None, client=None, interact=False)[source]

Make an MTH5 file from an FDSN data center

Parameters
  • df (pandas.DataFrame) –

    DataFrame with columns

    • ’network’ –> FDSN Network code

    • ’station’ –> FDSN Station code

    • ’location’ –> FDSN Location code

    • ’channel’ –> FDSN Channel code

    • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

    • ’end’ –> End time YYYY-MM-DDThh:mm:ss

  • path (string or pathlib.Path, optional) – Path to save MTH5 file to, defaults to None

  • client (string, optional) – FDSN client name, defaults to “IRIS”

Raises
  • AttributeError – If the input DataFrame is not properly formatted an Attribute Error will be raised.

  • ValueError – If the values of the DataFrame are not correct a ValueError will be raised.

Returns

MTH5 file name

Return type

pathlib.Path

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.

pack_stream_into_run_group(run_group, run_stream)[source]

property run_list_ne_stream_intervals_message

Note about unequal stream intervals

run_timings_match_stream_timing(run_group, stream_start, stream_end)[source]

Checks start and end times in the run. Compares start and end times of runs to start and end times of traces. If True, will pack runs based on time spans.

Parameters
  • run_group

  • stream_start

  • stream_end

stream_boundaries(streams)[source]

Identify start and end times of streams

Parameters

streams (obspy.core.stream.Stream) –
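A rough picture of what identifying stream boundaries can mean is given below. It is a sketch under assumptions: the `Trace` tuple is a hypothetical stand-in for ObsPy traces (whose times live in `tr.stats.starttime` and `tr.stats.endtime`), times are ISO strings, and collecting the unique start and end times is one plausible reading of the description, not a statement of what the method does internally.

```python
from collections import namedtuple

# Hypothetical stand-in for an ObsPy trace.
Trace = namedtuple("Trace", ["channel", "start", "end"])


def stream_boundaries(traces):
    """Return sorted unique start times and end times of the traces."""
    # ISO time stamps in one format sort chronologically as strings.
    starts = sorted({trace.start for trace in traces})
    ends = sorted({trace.end for trace in traces})
    if len(starts) != len(ends):
        raise ValueError("number of start times and end times differ")
    return starts, ends


# Two channels, each recorded over the same two time windows.
traces = [
    Trace("LFE", "2020-06-02T19:00:00", "2020-06-09T00:00:00"),
    Trace("LFN", "2020-06-02T19:00:00", "2020-06-09T00:00:00"),
    Trace("LFE", "2020-06-10T00:00:00", "2020-07-13T19:00:00"),
    Trace("LFN", "2020-06-10T00:00:00", "2020-07-13T19:00:00"),
]
starts, ends = stream_boundaries(traces)
```

Each start and end pair then describes one candidate run interval.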

wrangle_runs_into_containers(m, station_id, survey_group=None)[source]

Note 1: There used to be two separate functions for this, but now there is one. run_group_source is defined as either m or survey_group, depending on whether the file version is 0.1.0 or 0.2.0.

Note 2: If/elif/elif/else logic: The strategy is to add the group first. This will get the already filled-in metadata to update the run_ts_obj. Then get streams and add existing metadata.

Parameters
  • m

  • streams

  • station_id

  • survey_group

class mth5.clients.MakeMTH5(mth5_version='0.2.0', interact=False, save_path=None, **kwargs)[source]

Bases: object

from_fdsn_client(request_df, client='IRIS', **kwargs)[source]

Pull data from an FDSN archive like IRIS. Uses Obspy.Clients.

Parameters
  • request_df (pandas.DataFrame) –

    DataFrame with columns

    • ’network’ –> FDSN Network code

    • ’station’ –> FDSN Station code

    • ’location’ –> FDSN Location code

    • ’channel’ –> FDSN Channel code

    • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

    • ’end’ –> End time YYYY-MM-DDThh:mm:ss

  • client (string, optional) – FDSN client name, defaults to “IRIS”

  • interact (bool) – Whether to keep the created MTH5 file open

Raises
  • AttributeError – Raised if the input DataFrame is not properly formatted.

  • ValueError – Raised if the values of the DataFrame are not correct.

Returns

MTH5 file name

Return type

pathlib.Path

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.

from_usgs_geomag(request_df, **kwargs)[source]

Download geomagnetic observatory data from USGS webservices into an MTH5 using a request dataframe or csv file.

  • observatory: Geomagnetic observatory ID

  • type: type of data to get, e.g. ‘adjusted’

  • start: start date and time of the request, in UTC

  • end: end date and time of the request, in UTC

  • elements: components to get

  • sampling_period: time between samples, in seconds

Parameters

request_df (pandas.DataFrame, str or Path if csv file) –

DataFrame with columns

  • ’observatory’ –> Observatory code

  • ’type’ –> data type [ ‘variation’ | ‘adjusted’ | ‘quasi-definitive’ | ‘definitive’ ]

  • ’elements’ –> Elements to get [D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z]

  • ’sampling_period’ –> sample period [ 1 | 60 | 3600 ]

  • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

  • ’end’ –> End time YYYY-MM-DDThh:mm:ss

Returns

If interact is True, an MTH5 object is returned; otherwise the path to the file is returned.

Return type

Path or mth5.mth5.MTH5

from_zen(data_path, sample_rates=[4096, 1024, 256], calibration_path=None, survey_id=None, combine=True, **kwargs)[source]

Create an MTH5 from zen data.

Parameters
  • data_path (TYPE) – DESCRIPTION

  • sample_rates (TYPE, optional) – DESCRIPTION, defaults to [4096, 1024, 256]

  • save_path (TYPE, optional) – DESCRIPTION, defaults to None

  • calibration_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

class mth5.clients.PhoenixClient(data_path, sample_rates=[130, 24000], save_path=None, calibration_path=None)[source]

Bases: object

property calibration_path

Path to calibration data

property data_path

Path to phoenix data

get_run_dict()[source]

Get Run information

Returns

DESCRIPTION

Return type

TYPE

make_mth5_from_phoenix(**kwargs)[source]

Make an MTH5 from Phoenix files. Split into runs, account for filters

Parameters
  • data_path (TYPE, optional) – DESCRIPTION, defaults to None

  • sample_rates (TYPE, optional) – DESCRIPTION, defaults to None

  • save_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

property sample_rates

sample rates to look for

property save_path

Path to save mth5

class mth5.clients.USGSGeomag(**kwargs)[source]

Bases: object

add_run_id(request_df)[source]

Add run id to request df

Parameters

request_df (pandas.DataFrame) – request dataframe

Returns

add a run number to unique time windows for each observatory at each unique sampling period.

Return type

pandas.DataFrame
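The numbering rule can be sketched as follows: within each observatory and sampling period, every distinct time window gets the next run number, and repeated windows reuse theirs. The function name `add_run_ids`, the `run` column name and the `sp<period>_<number>` label format are all assumptions made for this illustration, and the rows are plain dicts rather than a DataFrame.

```python
def add_run_ids(rows):
    """Return copies of the rows with a hypothetical 'run' label added."""
    # (observatory, sampling_period) -> {(start, end): run number}
    windows = {}
    labelled = []
    for row in rows:
        group = windows.setdefault((row["observatory"], row["sampling_period"]), {})
        window = (row["start"], row["end"])
        # A new window takes the next number; a known window keeps its own.
        number = group.setdefault(window, len(group) + 1)
        label = f"sp{row['sampling_period']}_{number:03d}"
        labelled.append({**row, "run": label})
    return labelled


# Example values for illustration.
rows = [
    {"observatory": "FRN", "sampling_period": 1,
     "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
    {"observatory": "FRN", "sampling_period": 1,
     "start": "2020-01-03T00:00:00", "end": "2020-01-04T00:00:00"},
    {"observatory": "FRN", "sampling_period": 1,
     "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
    {"observatory": "BOU", "sampling_period": 60,
     "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
]
run_labels = [row["run"] for row in add_run_ids(rows)]
# run_labels == ["sp1_001", "sp1_002", "sp1_001", "sp60_001"]
```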

make_mth5_from_geomag(request_df)[source]

Download geomagnetic observatory data from USGS webservices into an MTH5 using a request dataframe or csv file.

Parameters

request_df (pandas.DataFrame, str or Path if csv file) –

DataFrame with columns

  • ’observatory’ –> Observatory code

  • ’type’ –> data type [ ‘variation’ | ‘adjusted’ | ‘quasi-definitive’ | ‘definitive’ ]

  • ’elements’ –> Elements to get [D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z]

  • ’sampling_period’ –> sample period [ 1 | 60 | 3600 ]

  • ’start’ –> Start time YYYY-MM-DDThh:mm:ss

  • ’end’ –> End time YYYY-MM-DDThh:mm:ss

Returns

If interact is True, an MTH5 object is returned; otherwise the path to the file is returned.

Return type

Path or mth5.mth5.MTH5

validate_request_df(request_df)[source]

Make sure the input request dataframe has the appropriate columns

Parameters

request_df (pandas.DataFrame) – request dataframe

Returns

valid request dataframe

Return type

pandas.DataFrame
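A check of this kind can be sketched from the column descriptions given for the request table. The allowed values below are copied from this reference; everything else is an assumption: the name `validate_request_row`, the use of ValueError, the handling of one plain dict at a time, and the acceptance of a comma-separated string for elements.

```python
REQUIRED_COLUMNS = ("observatory", "type", "elements", "sampling_period", "start", "end")
ALLOWED_TYPES = {"variation", "adjusted", "quasi-definitive", "definitive"}
ALLOWED_PERIODS = {1, 60, 3600}
ALLOWED_ELEMENTS = {
    "D", "DIST", "DST", "E", "E-E", "E-N", "F", "G", "H",
    "SQ", "SV", "UK1", "UK2", "UK3", "UK4", "X", "Y", "Z",
}


def validate_request_row(row):
    """Return a normalised copy of row, or raise ValueError."""
    missing = [name for name in REQUIRED_COLUMNS if name not in row]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if row["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown data type: {row['type']!r}")
    period = int(row["sampling_period"])
    if period not in ALLOWED_PERIODS:
        raise ValueError(f"unsupported sampling period: {period}")
    elements = row["elements"]
    if isinstance(elements, str):
        # Assumed encoding: several elements separated by commas.
        elements = [item.strip() for item in elements.split(",")]
    elements = [item.upper() for item in elements]
    unknown = [item for item in elements if item not in ALLOWED_ELEMENTS]
    if unknown:
        raise ValueError(f"unknown elements: {unknown}")
    return {**row, "sampling_period": period, "elements": elements}


valid = validate_request_row({
    "observatory": "FRN", "type": "adjusted", "elements": "x, y",
    "sampling_period": "60", "start": "2020-01-01T00:00:00",
    "end": "2020-01-02T00:00:00",
})
```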

class mth5.clients.ZenClient(data_path, sample_rates=[4096, 1024, 256], save_path=None, calibration_path=None, **kwargs)[source]

Bases: object

property calibration_path

Path to calibration data

property data_path

Path to Zen data

get_run_dict()[source]

Get Run information

Returns

DESCRIPTION

Return type

TYPE

get_survey(station_dict)[source]

Get survey name from a dictionary of a single station of runs

Parameters

station_dict (TYPE) – DESCRIPTION

Returns

DESCRIPTION

Return type

TYPE

make_mth5_from_zen(survey_id=None, combine=True, **kwargs)[source]

Make an MTH5 from Zen files. Split into runs, account for filters

Parameters
  • data_path (TYPE, optional) – DESCRIPTION, defaults to None

  • sample_rates (TYPE, optional) – DESCRIPTION, defaults to None

  • save_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

property sample_rates

sample rates to look for

property save_path

Path to save mth5