mth5.io.lemi package

Submodules

mth5.io.lemi.lemi424 module

Created on Tue May 11 15:31:31 2021

copyright: Jared Peacock (jpeacock@usgs.gov)
license: MIT

class mth5.io.lemi.lemi424.LEMI424(fn=None, **kwargs)[source]

Bases: object

Read in a LEMI424 file. This is a placeholder until IRIS finalizes their reader.

Parameters

fn (pathlib.Path or string) – full path to LEMI424 file
sample_rate (float) – sample rate of the file, default is 1.0
chunk_size (integer) – chunk size for pandas to use; it does not change reading time much for a single-day file. Default is 8640
file_column_names (list of strings) – column names of the LEMI424 file
dtypes (dictionary with keys of column names and values of data types) – data types for each column
data_column_names (dictionary with keys of column names and values of data types) – same as file_column_names with an added column for date, which is the combined date and time columns.

LEMI424 File Column Names

year
month
day
hour
minute
second
bx
by
bz
temperature_e
temperature_h
e1
e2
e3
e4
battery
elevation
latitude
lat_hemisphere
longitude
lon_hemisphere
n_satellites
gps_fix
time_diff

Data Column Names

date
bx
by
bz
temperature_e
temperature_h
e1
e2
e3
e4
battery
elevation
latitude
lat_hemisphere
longitude
lon_hemisphere
n_satellites
gps_fix
time_diff
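The relationship between the two column lists can be sketched in plain Python. This is only an illustration, assuming the single date column replaces the six date and time columns; the variable names mirror the parameters above, but the snippet does not use mth5 itself.

```python
# File columns, in the order listed above.
file_column_names = [
    "year", "month", "day", "hour", "minute", "second",
    "bx", "by", "bz",
    "temperature_e", "temperature_h",
    "e1", "e2", "e3", "e4",
    "battery", "elevation",
    "latitude", "lat_hemisphere",
    "longitude", "lon_hemisphere",
    "n_satellites", "gps_fix", "time_diff",
]

# Assumption: the six date/time columns collapse into one "date" column
# and every other column is carried over unchanged.
date_time_columns = file_column_names[:6]
data_column_names = ["date"] + file_column_names[6:]

print(len(file_column_names), len(data_column_names))
```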

property data: Data represented as a pandas.DataFrame with data_column names

property elevation: median elevation where data have been collected in the LEMI424 file

property end: end time of data collection in the LEMI424 file

property file_size: size of file in bytes

property fn: full path to LEMI424 file

property gps_lock: has GPS lock

property latitude: median latitude where data have been collected in the LEMI424 file

property longitude: median longitude where data have been collected in the LEMI424 file

property n_samples: number of samples in the file

read(fn=None, fast=True)[source]

Read a LEMI424 file using pandas. The fast way reads the first and last lines to get the start and end times and builds a time index from them. It then reads the data without parsing the date and time columns and checks that the number of points matches the expected number. If it does not, the file is read the slower way, which uses the date-time parser to ensure any time gaps are respected.

Parameters

fn (string or pathlib.Path, optional) – full path to file, defaults to None. Uses LEMI424.fn if not provided
fast (bool, optional) – read the fast way (True) or not (False), defaults to True

Returns

DESCRIPTION

Return type

TYPE
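The consistency check behind the fast path can be sketched with the standard library alone. This is a minimal illustration rather than the mth5 implementation: the helper names expected_n_samples and can_use_fast_read are hypothetical, and the sketch assumes the time index includes both the first and the last sample.

```python
from datetime import datetime, timezone


def expected_n_samples(start, end, sample_rate=1.0):
    """Samples in a gap-free record from start to end, both ends included."""
    duration = (end - start).total_seconds()
    return int(round(duration * sample_rate)) + 1


def can_use_fast_read(start, end, n_rows, sample_rate=1.0):
    """True when the row count matches a gap-free time index (hypothetical)."""
    return n_rows == expected_n_samples(start, end, sample_rate)


# Times taken from the first and last lines of a hypothetical one-day file.
start = datetime(2021, 5, 11, 0, 0, 0, tzinfo=timezone.utc)
end = datetime(2021, 5, 11, 23, 59, 59, tzinfo=timezone.utc)

full_day = can_use_fast_read(start, end, n_rows=86400)  # no gaps
with_gap = can_use_fast_read(start, end, n_rows=86000)  # rows are missing
```

When the check fails, as in the second call, a reader following the description above would fall back to parsing the date and time columns of every row so that the gaps appear in the index.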

read_metadata()[source]: Read only first and last rows to get important metadata to use in the collection.

property run_metadata: run metadata as mt_metadata.timeseries.Run

property start: start time of data collection in the LEMI424 file

property station_metadata: station metadata as mt_metadata.timeseries.Station

to_run_ts(fn=None, e_channels=['e1', 'e2'])[source]

Create a mth5.timeseries.RunTS object from the data

Parameters

fn (string or pathlib.Path, optional) – full path to file, defaults to None. Will use LEMI424.fn if None.
e_channels (list of strings, optional) – columns for the electric channels to use, defaults to [“e1”, “e2”]

Returns

RunTS object

Return type

mth5.timeseries.RunTS

mth5.io.lemi.lemi424.lemi_date_parser(year, month, day, hour, minute, second)[source]

Convenience function to combine the date-time columns that are output by the LEMI into a single column.

Assumes UTC

Parameters

year (int) – year
month (int) – month
day (int) – day of the month
hour (int) – hour in 24 hr format
minute (int) – minutes in the hour
second (int) – seconds in the minute

Returns

date time as a single column

Return type

pandas.DateTime
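As a rough illustration of what combining the columns means, the sketch below builds one timezone-aware UTC timestamp per row using only the standard library. The helper name combine_date_time is hypothetical, and the real function works on pandas columns rather than on Python tuples.

```python
from datetime import datetime, timezone


def combine_date_time(year, month, day, hour, minute, second):
    """Combine LEMI date and time fields into one UTC timestamp (hypothetical)."""
    return datetime(
        int(year), int(month), int(day),
        int(hour), int(minute), int(second),
        tzinfo=timezone.utc,  # the LEMI columns are assumed to be UTC
    )


# Two consecutive 1 Hz rows: year, month, day, hour, minute, second.
rows = [(2021, 5, 11, 15, 31, 31), (2021, 5, 11, 15, 31, 32)]
dates = [combine_date_time(*row) for row in rows]
```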

mth5.io.lemi.lemi424.lemi_hemisphere_parser(hemisphere)[source]

Convert a hemisphere string into a sign, either -1 or 1. Assumes the prime meridian is 0.

Parameters: hemisphere (string) – hemisphere string [ ‘N’ | ‘S’ | ‘E’ | ‘W’]
Returns: unity with a sign for the given hemisphere
Return type: signed integer
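A minimal sketch of the mapping, assuming the usual convention that south and west are negative and north and east are positive. The helper name hemisphere_sign is hypothetical, and the input validation is an addition that the original function may not perform.

```python
def hemisphere_sign(hemisphere):
    """Return 1 for 'N' or 'E' and -1 for 'S' or 'W' (hypothetical helper)."""
    signs = {"N": 1, "E": 1, "S": -1, "W": -1}
    key = hemisphere.strip().upper()
    if key not in signs:
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    return signs[key]


signs = [hemisphere_sign(h) for h in ("N", "S", "E", "W")]
```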

mth5.io.lemi.lemi424.lemi_position_parser(position)[source]

Convenience function to parse the location strings into a decimal float. Uses the hemisphere for the sign.

Note

The location format is unusual: the LEMI multiplies the degrees by 100 and adds the minutes, so a single floating-point value holds both the whole degrees and the decimal minutes –> {degrees}{degrees[mm.ss]}. For example, 40.50166 decimal degrees (40 degrees, 30.1 minutes) is represented as 4030.1.

Parameters: position (TYPE) – DESCRIPTION
Returns: DESCRIPTION
Return type: TYPE
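The conversion described in the note can be sketched as follows. This is an illustration under stated assumptions, not the mth5 code: the helper name position_to_decimal_degrees and its hemisphere argument are hypothetical, and the position is assumed to be a non-negative value in degrees-and-minutes form.

```python
def position_to_decimal_degrees(position, hemisphere="N"):
    """Convert a degrees-and-minutes value such as 4030.1 to decimal degrees.

    Hypothetical helper: 'S' and 'W' are assumed to give negative values.
    """
    value = float(position)
    degrees = int(value // 100)      # 4030.1 -> 40
    minutes = value - degrees * 100  # 4030.1 -> 30.1
    sign = -1 if hemisphere.strip().upper() in ("S", "W") else 1
    return sign * (degrees + minutes / 60)


latitude = position_to_decimal_degrees(4030.1, "N")      # about 40.50167
longitude = position_to_decimal_degrees("12230.0", "W")  # -122.5
```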

mth5.io.lemi.lemi424.read_lemi424(fn, e_channels=['e1', 'e2'], fast=True, logger_file_handler=None)[source]

Read a LEMI 424 TXT file.

Parameters

fn (string or Path) – input file name
e_channels (list of strings, optional) – a list of electric channels to read, defaults to [“e1”, “e2”]

Returns

A RunTS object with appropriate metadata

Return type

mth5.timeseries.RunTS

mth5.io.lemi.lemi_collection module

LEMI 424 Collection

Collection of TXT files combined into runs

Created on Wed Aug 31 10:32:44 2022

@author: jpeacock

class mth5.io.lemi.lemi_collection.LEMICollection(file_path=None, **kwargs)[source]

Bases: Collection

Collection of LEMI 424 files into runs based on start and end times. Will assign the run name as ‘sr1_{index:0{zeros}}’ –> ‘sr1_0001’ for zeros = 4.

Parameters

file_path (string or pathlib.Path) – full path to single station LEMI424 directory
file_ext (string) – extension of LEMI424 files, default is ‘txt’
station_id (string) – station id
survey_id (string) – survey id

Note

This class assumes that the given file path contains a single LEMI station. To process multiple stations, merge the returned data frames.

Note

LEMI data come with little metadata about the station or survey, so you should assign station_id and survey_id.

>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection(r"/path/to/single/lemi/station")
>>> lc.station_id = "mt001"
>>> lc.survey_id = "test_survey"
>>> run_dict = lc.get_runs(1)

assign_run_names(df, zeros=4)[source]

Assign run names based on start and end times. Checks whether a file has the same start time as the last end time.

Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.

Parameters

df (pandas.DataFrame) – Dataframe with the appropriate columns
zeros (int, optional) – number of zeros in run name, defaults to 4

Returns

Dataframe with run names

Return type

pandas.DataFrame
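The naming rule can be sketched without pandas. This is a hedged illustration, not the library's method: the function name sketch_run_names and the max_gap parameter are hypothetical, files are represented as (start, end) tuples instead of data frame rows, and a file is assumed to continue the current run when it starts no later than max_gap after the previous file ends.

```python
from datetime import datetime, timedelta


def sketch_run_names(intervals, sample_rate=1, zeros=4, max_gap=timedelta(0)):
    """Name runs as sr{sample_rate}_{run_number:0{zeros}} (hypothetical).

    Returns one name per interval, in order of start time.
    """
    names = []
    run_number = 0
    last_end = None
    for start, end in sorted(intervals):
        # A new run begins at the first file or after a gap in time.
        if last_end is None or start - last_end > max_gap:
            run_number += 1
        names.append(f"sr{sample_rate}_{run_number:0{zeros}}")
        last_end = end
    return names


t0 = datetime(2021, 5, 11)
hour = timedelta(hours=1)
files = [
    (t0, t0 + hour),                 # first file
    (t0 + hour, t0 + 2 * hour),      # starts when the first one ends
    (t0 + 3 * hour, t0 + 4 * hour),  # starts after a one-hour gap
]
run_names = sketch_run_names(files)
```

With the default max_gap of zero, the first two files share a run and the third starts a new one. For real 1 Hz data a tolerance of one sample may be more appropriate, which is why the gap is exposed as a parameter in this sketch.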

to_dataframe(sample_rates=[1], run_name_zeros=4, calibration_path=None)[source]

Create a data frame of each TXT file in a given directory.

Note

This assumes the given directory contains a single station

Parameters

sample_rates (int or list, optional) – sample rate to get; will always be 1 for LEMI data, defaults to [1]
run_name_zeros (int, optional) – number of zeros to assign to the run name, defaults to 4
calibration_path (string or Path, optional) – path to calibration files, defaults to None

Returns

Dataframe with information of each TXT file in the given directory.

Return type

pandas.DataFrame

Example

>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> lemi_df = lc.to_dataframe()

Module contents

class mth5.io.lemi.LEMI424(fn=None, **kwargs)[source]

Bases: object

Package-level alias of mth5.io.lemi.lemi424.LEMI424. See that entry above for the parameters, column names, properties, and methods.

class mth5.io.lemi.LEMICollection(file_path=None, **kwargs)[source]

Bases: Collection

Package-level alias of mth5.io.lemi.lemi_collection.LEMICollection. See that entry above for the parameters, notes, methods, and examples.

mth5.io.lemi.read_lemi424(fn, e_channels=['e1', 'e2'], fast=True, logger_file_handler=None)[source]

Read a LEMI 424 TXT file. Package-level alias of mth5.io.lemi.lemi424.read_lemi424. See that entry above for the parameters and return type.