mth5.clients package

Submodules

mth5.clients.make_mth5 module

Make MTH5

This module provides helper functions for making MTH5 files from various clients.

Supported Clients include:

FDSN (through Obspy)

Science Base (TODO)

NCI - Australia (TODO)

Updated on Wed Aug 25 19:57:00 2021

@author: jpeacock + tronan

class mth5.clients.make_mth5.MakeMTH5(mth5_version='0.2.0', interact=False, save_path=None, **kwargs)[source]

Bases: object

from_fdsn_client(request_df, client='IRIS', **kwargs)[source]

Pull data from an FDSN archive like IRIS. Uses Obspy.Clients.

Parameters

request_df (pandas.DataFrame) –
DataFrame with columns
- ’network’ –> FDSN Network code
- ’station’ –> FDSN Station code
- ’location’ –> FDSN Location code
- ’channel’ –> FDSN Channel code
- ’start’ –> Start time YYYY-MM-DDThh:mm:ss
- ’end’ –> End time YYYY-MM-DDThh:mm:ss
client (string, optional) – FDSN client name, defaults to “IRIS”
interact (bool) – Boolean to keep the created MTH5 file open or not

Raises

AttributeError – If the input DataFrame is not properly formatted an Attribute Error will be raised.
ValueError – If the values of the DataFrame are not correct a ValueError will be raised.

Returns

MTH5 file name

Return type

pathlib.Path

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.
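A request table with the columns listed above can be sketched as follows. This is a minimal illustration, not the package's own example: the network, station, and channel codes and the time window are placeholder values, and `make_mth5_from_rows` is a hypothetical helper that assumes the constructor and method signatures documented on this page. pandas and mth5 are imported only inside the helper, since calling it needs both packages and network access.

```python
# Column names come from the parameter list above; the row values are placeholders.
REQUEST_COLUMNS = ["network", "station", "location", "channel", "start", "end"]

request_rows = [
    # One row per channel. "XX" and "STA01" are made-up codes; the blank
    # location is left empty on purpose (blank values match anything).
    ["XX", "STA01", "", "LFN", "2020-06-02T18:00:00", "2020-07-13T19:00:00"],
    ["XX", "STA01", "", "LFE", "2020-06-02T18:00:00", "2020-07-13T19:00:00"],
]


def make_mth5_from_rows(rows, save_path=None):
    """Hypothetical helper: build the request DataFrame and pass it to MakeMTH5.

    Requires pandas, mth5, and network access, so it is defined here but not called.
    """
    import pandas as pd
    from mth5.clients.make_mth5 import MakeMTH5

    request_df = pd.DataFrame(rows, columns=REQUEST_COLUMNS)
    maker = MakeMTH5(mth5_version="0.2.0", interact=False, save_path=save_path)
    return maker.from_fdsn_client(request_df, client="IRIS")
```

With `interact=False` the call is documented to return the path of the created file.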

from_usgs_geomag(request_df, **kwargs)[source]

Download geomagnetic observatory data from USGS webservices into an MTH5 using a request dataframe or csv file.

observatory: Geomagnetic observatory ID
type: type of data to get, e.g. ‘adjusted’
start: start date time to request UTC
end: end date time to request UTC
elements: components to get
sampling_period: time between measurements in seconds

Parameters

request_df (pandas.DataFrame, str or Path if csv file) –

DataFrame with columns

’observatory’ –> Observatory code
’type’ –> data type [ ‘variation’ | ‘adjusted’ | ‘quasi-definitive’ | ‘definitive’ ]
’elements’ –> Elements to get [D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z]
’sampling_period’ –> sample period [ 1 | 60 | 3600 ]
’start’ –> Start time YYYY-MM-DDThh:mm:ss
’end’ –> End time YYYY-MM-DDThh:mm:ss

Returns

If interact is True, an MTH5 object is returned; otherwise the path to the file is returned.

Return type

Path or mth5.mth5.MTH5
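Because the request may also be given as a csv file, the sketch below writes one with the standard library. It assumes the file is a plain comma-separated table with a header row holding the column names above; the observatory code and times are example values only. How several elements are encoded in a single cell is not specified here, so the row requests one element.

```python
import csv
import io

GEOMAG_COLUMNS = ["observatory", "type", "elements", "sampling_period", "start", "end"]

geomag_rows = [
    {
        "observatory": "FRN",          # example observatory code
        "type": "adjusted",            # one of the documented data types
        "elements": "X",               # a single element, to avoid guessing a list format
        "sampling_period": 60,         # one of 1, 60, 3600
        "start": "2020-01-01T00:00:00",
        "end": "2020-01-02T00:00:00",
    },
]


def rows_to_csv_text(rows):
    """Serialize request rows to csv text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=GEOMAG_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


geomag_csv_text = rows_to_csv_text(geomag_rows)
```

Writing `geomag_csv_text` to a file gives a path that could then be passed as `request_df`.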

from_zen(data_path, sample_rates=[4096, 1024, 256], calibration_path=None, survey_id=None, combine=True, **kwargs)[source]

Create an MTH5 from zen data.

Parameters

data_path (TYPE) – DESCRIPTION
sample_rates (TYPE, optional) – DESCRIPTION, defaults to [4096, 1024, 256]
save_path (TYPE, optional) – DESCRIPTION, defaults to None
calibration_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

Module contents

class mth5.clients.FDSN(client='IRIS', mth5_version='0.2.0', **kwargs)[source]

Bases: object

build_network_dict(df, client)[source]

Build out a dictionary of networks, keyed by network_id, start_time. We could return this dict and use it as an auxiliary variable, but it seems easier to just add a column to the df.

Parameters: df (pd.DataFrame) – This is a “request_df”
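The nested lookup described here can be sketched with plain dictionaries. This is an illustration of the data structure only, not the method's implementation: `build_network_lookup` and `fetch_network` are hypothetical names, and the stand-in fetch function replaces whatever call actually retrieves a network object from the client.

```python
def build_network_lookup(rows, fetch_network):
    """Return {network_id: {start_time: network_obj}} for the given request rows.

    rows: iterable of dicts with "network" and "start" keys.
    fetch_network: callable (network_id, start_time) -> network object.
    Each (network_id, start_time) pair is fetched only once.
    """
    networks = {}
    for row in rows:
        by_start = networks.setdefault(row["network"], {})
        if row["start"] not in by_start:
            by_start[row["start"]] = fetch_network(row["network"], row["start"])
    return networks


# Stand-in for a real client call; records how often it is invoked.
fetch_calls = []


def fake_fetch(network_id, start_time):
    fetch_calls.append((network_id, start_time))
    return f"{network_id}@{start_time}"


example_rows = [
    {"network": "XX", "start": "2020-01-01T00:00:00"},
    {"network": "XX", "start": "2020-01-01T00:00:00"},  # duplicate key, not refetched
    {"network": "XX", "start": "2021-01-01T00:00:00"},
]
network_lookup = build_network_lookup(example_rows, fake_fetch)
```

Keying on both values means repeated rows for the same network and start time reuse the stored object instead of triggering another request.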

build_station_dict(df, client, networks_dict)[source]

Given the {network-id, starttime}-keyed dict of networks, we build a station layer below this

Parameters

df –
networks_dict –

get_df_from_inventory(inventory)[source]

Create a data frame from an inventory object

Parameters: inventory (obspy.Inventory) – inventory object
Returns: dataframe in proper format
Return type: pandas.DataFrame

get_fdsn_channel_map()[source]

get_inventory_from_df(df, client=None, data=True)[source]

20230806: The nested for looping here can make debugging complex, as well as lead to a lot of redundancies. I propose that we build out a dictionary of networks, keyed by network_id, start_time. It may actually be simpler to just add a column to the request_df that has the network_obj

networks = {}
networks[network_id] = {}
networks[network_id][start_time_1] = obspy_network_obj
networks[network_id][start_time_2] = obspy_network_obj
…

Then the role of “returned_network” can be replaced by accessing the appropriate element and the second for-loop can move up by a layer of indentation.

Will try to factor i

Get an obspy.Inventory object from a pandas.DataFrame

Parameters

df (pandas.DataFrame) –
DataFrame with columns
- ’network’ –> FDSN Network code
- ’station’ –> FDSN Station code
- ’location’ –> FDSN Location code
- ’channel’ –> FDSN Channel code
- ’start’ –> Start time YYYY-MM-DDThh:mm:ss
- ’end’ –> End time YYYY-MM-DDThh:mm:ss
client (string) – FDSN client
data (boolean, optional) – True if you want data, False if you want just metadata, defaults to True

Returns

An inventory of metadata requested and data

Return type

obspy.Inventory and obspy.Stream

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.

get_run_group(mth5_obj_or_survey, station_id, run_id)[source]

This method is key to merging wrangle_runs_into_containers_v1 and wrangle_runs_into_containers_v2, because a v1 mth5 object can get a survey group with the same method as can a v2 survey_group.

Thus we can replace

run_group = m.stations_group.get_station(station_id).add_run(run_id)

and

run_group = survey_group.stations_group.get_station(station_id).add_run(run_id)

with

run_group = mth5_obj_or_survey.stations_group.get_station(station_id).add_run(run_id)

Parameters: mth5_obj_or_survey (mth5.mth5.MTH5 or mth5.groups.survey.SurveyGroup) –

get_run_list_from_station_id(m, station_id, survey_id=None)[source]

ignored_groups created to address issue #153. This might be better placed closer to the core of mth5.

Parameters

m –
station_id –

Returns

run_list

Return type

list of strings

get_station_streams(station_id)[source]: Get streams for a certain station

get_unique_networks_and_stations(df)[source]

Get unique lists of networks, stations, locations, and channels from a given data frame.

[{‘network’: FDSN code, “stations”: [list of stations for network]}]

Parameters: df (pandas.DataFrame) – request data frame
Returns: list of network dictionaries of the form [{‘network’: FDSN code, “stations”: [list of stations for network]}]
Return type: list
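The returned structure can be illustrated with a small standard-library sketch. It works on a list of row dictionaries rather than a DataFrame, and `unique_networks_and_stations` is a hypothetical stand-in that only mirrors the documented output shape; the codes are placeholders.

```python
def unique_networks_and_stations(rows):
    """Group station codes by network code, keeping first-seen order."""
    stations_by_network = {}
    for row in rows:
        stations = stations_by_network.setdefault(row["network"], [])
        if row["station"] not in stations:
            stations.append(row["station"])
    return [
        {"network": network, "stations": stations}
        for network, stations in stations_by_network.items()
    ]


example_request = [
    {"network": "XX", "station": "STA01"},
    {"network": "XX", "station": "STA02"},
    {"network": "XX", "station": "STA01"},  # repeated station, listed once
    {"network": "YY", "station": "STA03"},
]
network_list = unique_networks_and_stations(example_request)
```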

get_waveforms_from_request_row(client, row)[source]

Parameters

client –
row –

make_filename(df)[source]

Make a filename from the networks and stations in a request data frame

Parameters: df (pandas.DataFrame) – request data frame
Returns: file name as network_01+stations_network_02+stations.h5
Return type: string
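One possible reading of the pattern network_01+stations_network_02+stations.h5 is sketched below. The underscore separator between a network code and its stations is an assumption, as is the input shape (a list of network dictionaries like the one returned by get_unique_networks_and_stations); `sketch_filename` is a hypothetical name, not the method itself.

```python
def sketch_filename(network_list):
    """Join each network code with its stations, then join all networks.

    Assumes "_" separates every part; the real separator may differ.
    """
    parts = []
    for entry in network_list:
        parts.append("_".join([entry["network"], *entry["stations"]]))
    return "_".join(parts) + ".h5"


example_name = sketch_filename(
    [
        {"network": "XX", "stations": ["STA01", "STA02"]},
        {"network": "YY", "stations": ["STA03"]},
    ]
)
```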

make_mth5_from_fdsn_client(df, path=None, client=None, interact=False)[source]

Make an MTH5 file from an FDSN data center

Parameters

df (pandas.DataFrame) –
DataFrame with columns
- ’network’ –> FDSN Network code
- ’station’ –> FDSN Station code
- ’location’ –> FDSN Location code
- ’channel’ –> FDSN Channel code
- ’start’ –> Start time YYYY-MM-DDThh:mm:ss
- ’end’ –> End time YYYY-MM-DDThh:mm:ss
path (string or pathlib.Path, optional) – Path to save MTH5 file to, defaults to None
client (string, optional) – FDSN client name, defaults to “IRIS”

Raises

AttributeError – If the input DataFrame is not properly formatted an Attribute Error will be raised.
ValueError – If the values of the DataFrame are not correct a ValueError will be raised.

Returns

MTH5 file name

Return type

pathlib.Path

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.

pack_stream_into_run_group(run_group, run_stream)[source]

property run_list_ne_stream_intervals_message: note about not equal stream intervals

run_timings_match_stream_timing(run_group, stream_start, stream_end)[source]

Checks start and end times in the run. Compares start and end times of runs to start and end times of traces. If True, runs will be packed based on time spans.

Parameters

run_group –
stream_start –
stream_end –

stream_boundaries(streams)[source]

Identify start and end times of streams

Parameters: streams (obspy.core.stream.Stream) –

wrangle_runs_into_containers(m, station_id, survey_group=None)[source]

Note 1: There used to be two separate functions for this, but now there is one. run_group_source is defined as either m or survey_group, depending on whether the file version is 0.1.0 or 0.2.0.

Note 2: If/elif/elif/else logic: The strategy is to add the group first. This will get the already filled in metadata to update the run_ts_obj. Then get streams and add existing metadata.

Parameters

m –
streams –
station_id –
survey_group –

class mth5.clients.MakeMTH5(mth5_version='0.2.0', interact=False, save_path=None, **kwargs)[source]

Bases: object

from_fdsn_client(request_df, client='IRIS', **kwargs)[source]

Pull data from an FDSN archive like IRIS. Uses Obspy.Clients.

Parameters

request_df (pandas.DataFrame) –
DataFrame with columns
- ’network’ –> FDSN Network code
- ’station’ –> FDSN Station code
- ’location’ –> FDSN Location code
- ’channel’ –> FDSN Channel code
- ’start’ –> Start time YYYY-MM-DDThh:mm:ss
- ’end’ –> End time YYYY-MM-DDThh:mm:ss
client (string, optional) – FDSN client name, defaults to “IRIS”
interact (bool) – Boolean to keep the created MTH5 file open or not

Raises

AttributeError – If the input DataFrame is not properly formatted an Attribute Error will be raised.
ValueError – If the values of the DataFrame are not correct a ValueError will be raised.

Returns

MTH5 file name

Return type

pathlib.Path

Note

If any of the column values are blank, then any value will be searched for. For example, if you leave ‘station’ blank, any station within the given start and end time will be returned.

from_usgs_geomag(request_df, **kwargs)[source]

Download geomagnetic observatory data from USGS webservices into an MTH5 using a request dataframe or csv file.

observatory: Geomagnetic observatory ID
type: type of data to get, e.g. ‘adjusted’
start: start date time to request UTC
end: end date time to request UTC
elements: components to get
sampling_period: time between measurements in seconds

Parameters

request_df (pandas.DataFrame, str or Path if csv file) –

DataFrame with columns

’observatory’ –> Observatory code
’type’ –> data type [ ‘variation’ | ‘adjusted’ | ‘quasi-definitive’ | ‘definitive’ ]
’elements’ –> Elements to get [D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z]
’sampling_period’ –> sample period [ 1 | 60 | 3600 ]
’start’ –> Start time YYYY-MM-DDThh:mm:ss
’end’ –> End time YYYY-MM-DDThh:mm:ss

Returns

If interact is True, an MTH5 object is returned; otherwise the path to the file is returned.

Return type

Path or mth5.mth5.MTH5

from_zen(data_path, sample_rates=[4096, 1024, 256], calibration_path=None, survey_id=None, combine=True, **kwargs)[source]

Create an MTH5 from zen data.

Parameters

data_path (TYPE) – DESCRIPTION
sample_rates (TYPE, optional) – DESCRIPTION, defaults to [4096, 1024, 256]
save_path (TYPE, optional) – DESCRIPTION, defaults to None
calibration_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

class mth5.clients.PhoenixClient(data_path, sample_rates=[130, 24000], save_path=None, calibration_path=None)[source]

Bases: object

property calibration_path: Path to calibration data

property data_path: Path to phoenix data

get_run_dict()[source]

Get Run information

Returns: DESCRIPTION
Return type: TYPE

make_mth5_from_phoenix(**kwargs)[source]

Make an MTH5 from Phoenix files. Split into runs, account for filters

Parameters

data_path (TYPE, optional) – DESCRIPTION, defaults to None
sample_rates (TYPE, optional) – DESCRIPTION, defaults to None
save_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

property sample_rates: sample rates to look for

property save_path: Path to save mth5

class mth5.clients.USGSGeomag(**kwargs)[source]

Bases: object

add_run_id(request_df)[source]

Add run id to request df

Parameters: request_df (pandas.DataFrame) – request dataframe
Returns: add a run number to unique time windows for each observatory at each unique sampling period.
Return type: pandas.DataFrame
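The numbering rule described above (one run number per unique time window, counted separately for each observatory and sampling period) can be sketched without pandas. `add_run_ids` is a hypothetical helper working on row dictionaries, and the `sp<period>_<number>` label format is an assumption chosen for illustration; the label actually written by add_run_id is not stated here.

```python
def add_run_ids(rows):
    """Return copies of rows with a "run" label added.

    Windows are numbered in first-seen order within each
    (observatory, sampling_period) group; repeated windows reuse their number.
    """
    windows_by_group = {}  # (observatory, sampling_period) -> {(start, end): number}
    labelled = []
    for row in rows:
        group = windows_by_group.setdefault(
            (row["observatory"], row["sampling_period"]), {}
        )
        window = (row["start"], row["end"])
        if window not in group:
            group[window] = len(group) + 1
        run_label = f"sp{row['sampling_period']}_{group[window]:03d}"  # assumed format
        labelled.append({**row, "run": run_label})
    return labelled


example_windows = [
    {"observatory": "FRN", "sampling_period": 1, "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
    {"observatory": "FRN", "sampling_period": 1, "start": "2020-02-01T00:00:00", "end": "2020-02-02T00:00:00"},
    {"observatory": "FRN", "sampling_period": 60, "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
    {"observatory": "FRN", "sampling_period": 1, "start": "2020-01-01T00:00:00", "end": "2020-01-02T00:00:00"},
]
labelled_rows = add_run_ids(example_windows)
```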

make_mth5_from_geomag(request_df)[source]

Download geomagnetic observatory data from USGS webservices into an MTH5 using a request dataframe or csv file.

Parameters

request_df (pandas.DataFrame, str or Path if csv file) –

DataFrame with columns

’observatory’ –> Observatory code
’type’ –> data type [ ‘variation’ | ‘adjusted’ | ‘quasi-definitive’ | ‘definitive’ ]
’elements’ –> Elements to get [D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z]
’sampling_period’ –> sample period [ 1 | 60 | 3600 ]
’start’ –> Start time YYYY-MM-DDThh:mm:ss
’end’ –> End time YYYY-MM-DDThh:mm:ss

Returns

If interact is True, an MTH5 object is returned; otherwise the path to the file is returned.

Return type

Path or mth5.mth5.MTH5

validate_request_df(request_df)[source]

Make sure the input request dataframe has the appropriate columns

Parameters: request_df (pandas.DataFrame) – request dataframe
Returns: valid request dataframe
Return type: pandas.DataFrame
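A column check of this kind might look like the sketch below. It validates a single row dictionary against the columns and allowed values listed for make_mth5_from_geomag; `check_request_row` is a hypothetical function, and raising ValueError is an assumption, since the exceptions raised by validate_request_df are not documented here.

```python
REQUIRED_COLUMNS = {"observatory", "type", "elements", "sampling_period", "start", "end"}
ALLOWED_TYPES = {"variation", "adjusted", "quasi-definitive", "definitive"}
ALLOWED_PERIODS = {1, 60, 3600}


def check_request_row(row):
    """Raise ValueError if a request row is missing columns or uses unlisted values."""
    missing = REQUIRED_COLUMNS - set(row)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    if row["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown data type: {row['type']!r}")
    if int(row["sampling_period"]) not in ALLOWED_PERIODS:
        raise ValueError(f"unsupported sampling period: {row['sampling_period']!r}")
    return row


valid_row = {
    "observatory": "FRN",
    "type": "adjusted",
    "elements": "X",
    "sampling_period": 60,
    "start": "2020-01-01T00:00:00",
    "end": "2020-01-02T00:00:00",
}
checked_row = check_request_row(valid_row)
```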

class mth5.clients.ZenClient(data_path, sample_rates=[4096, 1024, 256], save_path=None, calibration_path=None, **kwargs)[source]

Bases: object

property calibration_path: Path to calibration data

property data_path: Path to phoenix data

get_run_dict()[source]

Get Run information

Returns: DESCRIPTION
Return type: TYPE

get_survey(station_dict)[source]

Get survey name from a dictionary of a single station of runs

Parameters: station_dict (TYPE) – DESCRIPTION
Returns: DESCRIPTION
Return type: TYPE

make_mth5_from_zen(survey_id=None, combine=True, **kwargs)[source]

Make an MTH5 from Zen files. Split into runs, account for filters

Parameters

data_path (TYPE, optional) – DESCRIPTION, defaults to None
sample_rates (TYPE, optional) – DESCRIPTION, defaults to None
save_path (TYPE, optional) – DESCRIPTION, defaults to None

Returns

DESCRIPTION

Return type

TYPE

property sample_rates: sample rates to look for

property save_path: Path to save mth5