mth5.io.lemi package

Submodules

mth5.io.lemi.lemi424 module

Created on Tue May 11 15:31:31 2021

copyright

Jared Peacock (jpeacock@usgs.gov)

license

MIT

class mth5.io.lemi.lemi424.LEMI424(fn=None, **kwargs)[source]

Bases: object

Read in a LEMI424 file. This is a placeholder until IRIS finalizes their reader.

Parameters
  • fn (pathlib.Path or string) – full path to LEMI424 file

  • sample_rate (float) – sample rate of the file, default is 1.0

  • chunk_size (integer) – chunk size for pandas to use; it does not change the reading time much for a single-day file. Default is 8640.

  • file_column_names (list of strings) – column names of the LEMI424 file

  • dtypes (dictionary with keys of column names and values of data types) – data types for each column

  • data_column_names (dictionary with keys of column names and values of data types) – same as file_column_names with an added column for date, which is the combined date and time columns.

LEMI424 File Column Names
  • year

  • month

  • day

  • hour

  • minute

  • second

  • bx

  • by

  • bz

  • temperature_e

  • temperature_h

  • e1

  • e2

  • e3

  • e4

  • battery

  • elevation

  • latitude

  • lat_hemisphere

  • longitude

  • lon_hemisphere

  • n_satellites

  • gps_fix

  • time_diff

Data Column Names
  • date

  • bx

  • by

  • bz

  • temperature_e

  • temperature_h

  • e1

  • e2

  • e3

  • e4

  • battery

  • elevation

  • latitude

  • lat_hemisphere

  • longitude

  • lon_hemisphere

  • n_satellites

  • gps_fix

  • time_diff
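
As a minimal sketch of how the two lists above relate (plain Python, not the library's own code), the six date-time columns can be collapsed into a single date column while the remaining columns keep their order. The constant and variable names are illustrative assumptions.

```python
# File columns exactly as listed above for a LEMI424 TXT file.
FILE_COLUMN_NAMES = [
    "year", "month", "day", "hour", "minute", "second",
    "bx", "by", "bz",
    "temperature_e", "temperature_h",
    "e1", "e2", "e3", "e4",
    "battery", "elevation",
    "latitude", "lat_hemisphere",
    "longitude", "lon_hemisphere",
    "n_satellites", "gps_fix", "time_diff",
]

# The first six columns hold the date and time pieces.
DATE_TIME_COLUMNS = FILE_COLUMN_NAMES[:6]

# Replace the six date-time columns with one combined "date" column.
data_column_names = ["date"] + [
    name for name in FILE_COLUMN_NAMES if name not in DATE_TIME_COLUMNS
]
```

The result has 19 entries, starting with date and followed by bx through time_diff in file order.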

property data

Data represented as a pandas.DataFrame whose columns are given by data_column_names

property elevation

median elevation where data have been collected in the LEMI424 file

property end

end time of data collection in the LEMI424 file

property file_size

size of file in bytes

property fn

full path to LEMI424 file

property gps_lock

has GPS lock

property latitude

median latitude where data have been collected in the LEMI424 file

property longitude

median longitude where data have been collected in the LEMI424 file

property n_samples

number of samples in the file

read(fn=None, fast=True)[source]

Read a LEMI424 file using pandas. The fast way reads the first and last lines to get the start and end times and builds a time index from them. It then reads the data without parsing the date-time columns and checks that the number of points read matches the expected number. If it does not, the file is read the slower way, which uses the date-time parser so that any time gaps are respected.

Parameters
  • fn (string or pathlib.Path, optional) – full path to file, defaults to None. Uses LEMI424.fn if not provided

  • fast (bool, optional) – read the fast way (True) or the slower way (False), defaults to True
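
The check behind the fast path can be sketched in plain Python as below. This is an illustration of the idea described above, not the library's implementation; the function names are hypothetical, and it assumes the expected count is the number of sample periods between the first and last time stamps plus one.

```python
from datetime import datetime, timedelta


def expected_samples(start, end, sample_rate=1.0):
    """Number of samples a gap-free file spanning start..end should hold."""
    return int(round((end - start).total_seconds() * sample_rate)) + 1


def build_time_index(start, end, n_rows, sample_rate=1.0):
    """Return a regular time index, or None when the row count shows gaps.

    A None result signals that the caller should fall back to the slower
    read that parses every date-time row.
    """
    if n_rows != expected_samples(start, end, sample_rate):
        return None
    step = timedelta(seconds=1.0 / sample_rate)
    return [start + i * step for i in range(n_rows)]


start = datetime(2021, 5, 11, 0, 0, 0)
end = datetime(2021, 5, 11, 0, 0, 9)
index = build_time_index(start, end, n_rows=10)    # complete file
gappy = build_time_index(start, end, n_rows=8)     # rows are missing
```

Here index holds ten evenly spaced time stamps, while gappy is None because two rows are missing.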


read_metadata()[source]

Read only first and last rows to get important metadata to use in the collection.

property run_metadata

run metadata as mt_metadata.timeseries.Run

property start

start time of data collection in the LEMI424 file

property station_metadata

station metadata as mt_metadata.timeseries.Station

to_run_ts(fn=None, e_channels=['e1', 'e2'])[source]

Create a mth5.timeseries.RunTS object from the data

Parameters
  • fn (string or pathlib.Path, optional) – full path to file, defaults to None. Will use LEMI424.fn if None.

  • e_channels (list of strings, optional) – columns for the electric channels to use, defaults to ["e1", "e2"]

Returns

RunTS object

Return type

mth5.timeseries.RunTS

mth5.io.lemi.lemi424.lemi_date_parser(year, month, day, hour, minute, second)[source]

Convenience function to combine the date-time columns output by the LEMI into a single column.

Assumes UTC

Parameters
  • year (int) – year

  • month (int) – month

  • day (int) – day of the month

  • hour (int) – hour in 24 hr format

  • minute (int) – minutes in the hour

  • second (int) – seconds in the minute

Returns

date time as a single column

Return type

pandas.DateTime
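
For a single row, the combination can be sketched with the standard library as follows. The function name is hypothetical and this is not the library's code, which operates on whole columns; the UTC time zone follows the assumption stated above.

```python
from datetime import datetime, timezone


def combine_date_columns(year, month, day, hour, minute, second):
    """Combine the six LEMI date-time fields of one row into a UTC datetime."""
    return datetime(
        int(year), int(month), int(day),
        int(hour), int(minute), int(second),
        tzinfo=timezone.utc,
    )


stamp = combine_date_columns(2021, 5, 11, 15, 31, 31)
```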

mth5.io.lemi.lemi424.lemi_hemisphere_parser(hemisphere)[source]

Convert a hemisphere letter into a sign, either -1 or 1. Assumes the prime meridian is 0.

Parameters

hemisphere (string) – hemisphere string [ ‘N’ | ‘S’ | ‘E’ | ‘W’]

Returns

unity with a sign for the given hemisphere

Return type

signed integer
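
A minimal sketch of such a mapping, assuming the usual convention that north and east are positive and south and west are negative. The function name and the ValueError raised for unknown input are assumptions, not documented library behavior.

```python
def hemisphere_sign(hemisphere):
    """Map 'N'/'E' to 1 and 'S'/'W' to -1."""
    letter = hemisphere.strip().upper()
    if letter in ("N", "E"):
        return 1
    if letter in ("S", "W"):
        return -1
    # Assumption: reject anything that is not a hemisphere letter.
    raise ValueError(f"unknown hemisphere: {hemisphere!r}")
```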

mth5.io.lemi.lemi424.lemi_position_parser(position)[source]

Convenience function to parse the location strings into a decimal float. Uses the hemisphere for the sign.

Note

The format of the location is odd: the LEMI packs degrees and decimal minutes into a single floating point value, {degrees}{minutes}, that is, degrees * 100 + decimal minutes. For example, 40.50166 degrees would be represented as 4030.1.

Parameters

position (string or float) – location value in the LEMI format described in the note

Returns

location in decimal degrees

Return type

float
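
The conversion described in the note can be sketched as below, assuming the stored value is degrees * 100 plus decimal minutes, which is what the 4030.1 example implies. The function name and the sign argument are hypothetical, and the input is assumed to be non-negative with the sign supplied separately.

```python
def lemi_position_to_decimal(position, sign=1):
    """Convert a LEMI position such as 4030.1 into decimal degrees."""
    value = float(position)
    degrees = int(value // 100)        # leading digits are whole degrees
    minutes = value - degrees * 100    # remainder is decimal minutes
    return sign * (degrees + minutes / 60.0)


latitude = lemi_position_to_decimal(4030.1)           # about 40.50167
longitude = lemi_position_to_decimal("12215.5", -1)   # western hemisphere
```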

mth5.io.lemi.lemi424.read_lemi424(fn, e_channels=['e1', 'e2'], fast=True, logger_file_handler=None)[source]

Read a LEMI 424 TXT file.

Parameters
  • fn (string or Path) – input file name

  • e_channels (list of strings, optional) – a list of electric channels to read, defaults to ["e1", "e2"]

Returns

A RunTS object with appropriate metadata

Return type

mth5.timeseries.RunTS

mth5.io.lemi.lemi_collection module

LEMI 424 Collection

Collection of TXT files combined into runs

Created on Wed Aug 31 10:32:44 2022

@author: jpeacock

class mth5.io.lemi.lemi_collection.LEMICollection(file_path=None, **kwargs)[source]

Bases: Collection

Collection of LEMI 424 files into runs based on start and end times. Will assign the run name as ‘sr1_{index:0{zeros}}’ –> ‘sr1_0001’ for zeros = 4.

Parameters
  • file_path (string or pathlib.Path) – full path to single station LEMI424 directory

  • file_ext (string) – extension of LEMI424 files, default is ‘txt’

  • station_id (string) – station id

  • survey_id (string) – survey id

Note

This class assumes that the given file path contains a single LEMI station. To handle multiple stations, merge the returned data frames.

Note

LEMI data comes with little metadata about the station or survey; therefore, you should assign station_id and survey_id.

>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection(r"/path/to/single/lemi/station")
>>> lc.station_id = "mt001"
>>> lc.survey_id = "test_survey"
>>> run_dict = lc.get_runs(1)

assign_run_names(df, zeros=4)[source]

Assign run names based on start and end times by checking whether a file has the same start time as the previous file's end time.

Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.

Parameters
  • df (pandas.DataFrame) – Dataframe with the appropriate columns

  • zeros (int, optional) – number of zeros in run name, defaults to 4

Returns

Dataframe with run names

Return type

pandas.DataFrame
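
The naming rule can be illustrated without pandas as below. This is a sketch under stated assumptions, not the library's code: files are plain dictionaries with start and end keys, a new run begins whenever a file's start differs from the previous file's end by more than max_gap, and the function name and max_gap parameter are hypothetical.

```python
from datetime import datetime, timedelta


def assign_run_names_sketch(files, sample_rate=1, zeros=4,
                            max_gap=timedelta(0)):
    """Label time-ordered files with run names like 'sr1_0001'."""
    named = []
    run_number = 0
    previous_end = None
    for entry in sorted(files, key=lambda f: f["start"]):
        # Start a new run on the first file or when continuity breaks.
        if previous_end is None or abs(entry["start"] - previous_end) > max_gap:
            run_number += 1
        run_name = f"sr{sample_rate}_{run_number:0{zeros}}"
        named.append({**entry, "run": run_name})
        previous_end = entry["end"]
    return named


day = datetime(2022, 8, 31)
files = [
    {"start": day, "end": day + timedelta(hours=1)},
    {"start": day + timedelta(hours=1), "end": day + timedelta(hours=2)},
    {"start": day + timedelta(hours=3), "end": day + timedelta(hours=4)},
]
runs = [entry["run"] for entry in assign_run_names_sketch(files)]
```

The first two files are contiguous and share a run, while the third starts after a gap and opens a new one.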

to_dataframe(sample_rates=[1], run_name_zeros=4, calibration_path=None)[source]

Create a data frame of each TXT file in a given directory.

Note

This assumes the given directory contains a single station.

Parameters
  • sample_rates (int or list, optional) – sample rate to get; this is always 1 for LEMI data, defaults to [1]

  • run_name_zeros (int, optional) – number of zeros to assign to the run name, defaults to 4

  • calibration_path (string or Path, optional) – path to calibration files, defaults to None

Returns

Dataframe with information of each TXT file in the given directory.

Return type

pandas.DataFrame

Example
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> lemi_df = lc.to_dataframe()

Module contents

class mth5.io.lemi.LEMI424(fn=None, **kwargs)[source]

class mth5.io.lemi.LEMICollection(file_path=None, **kwargs)[source]

mth5.io.lemi.read_lemi424(fn, e_channels=['e1', 'e2'], fast=True, logger_file_handler=None)[source]
