mth5.io.lemi package
Submodules
mth5.io.lemi.lemi424 module
Created on Tue May 11 15:31:31 2021
- copyright
Jared Peacock (jpeacock@usgs.gov)
- license
MIT
- class mth5.io.lemi.lemi424.LEMI424(fn=None, **kwargs)[source]
Bases:
object
Read in a LEMI424 file. This is a placeholder until IRIS finalizes their reader.
- Parameters
fn (pathlib.Path or string) – full path to LEMI424 file
sample_rate (float) – sample rate of the file, default is 1.0
chunk_size (integer) – chunk size for pandas to use; it does not change reading time much for a single-day file. Default is 8640
file_column_names (list of strings) – column names of the LEMI424 file
dtypes (dictionary with keys of column names and values of data types) – data types for each column
data_column_names (dictionary with keys of column names and values of data types) – same as file_column_names with an added column for date, which combines the date and time columns.
- LEMI424 File Column Names
year
month
day
hour
minute
second
bx
by
bz
temperature_e
temperature_h
e1
e2
e3
e4
battery
elevation
latitude
lat_hemisphere
longitude
lon_hemisphere
n_satellites
gps_fix
time_diff
- Data Column Names
date
bx
by
bz
temperature_e
temperature_h
e1
e2
e3
e4
battery
elevation
latitude
lat_hemisphere
longitude
lon_hemisphere
n_satellites
gps_fix
time_diff
- property data
Data represented as a pandas.DataFrame with data_column_names
- property elevation
median elevation where data have been collected in the LEMI424 file
- property end
end time of data collection in the LEMI424 file
- property file_size
size of file in bytes
- property fn
full path to LEMI424 file
- property gps_lock
has GPS lock
- property latitude
median latitude where data have been collected in the LEMI424 file
- property longitude
median longitude where data have been collected in the LEMI424 file
- property n_samples
number of samples in the file
- read(fn=None, fast=True)[source]
Read a LEMI424 file using pandas. The fast way reads the first and last lines to get the start and end times and builds a time index from them, then reads the data without parsing the date-time columns. It checks that the number of points read matches the expected number. If not, it falls back to the slower way, which uses the date-time parser so that any time gaps are respected.
- Parameters
fn (string or pathlib.Path, optional) – full path to file, defaults to None. Uses LEMI424.fn if not provided
fast – read the fast way (True) or not (False)
- Returns
DESCRIPTION
- Return type
TYPE
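The fast and slow paths described for read() can be illustrated with a small pandas sketch. This is not the library's implementation: the function name read_fast_or_slow, the whitespace-separated layout, and the shortened nine-column sample are assumptions made for illustration only.

```python
import io

import pandas as pd

# Shortened, hypothetical column set; a real LEMI424 file has more columns.
DATE_COLUMNS = ["year", "month", "day", "hour", "minute", "second"]
COLUMNS = DATE_COLUMNS + ["bx", "by", "bz"]


def read_fast_or_slow(text, sample_rate=1.0):
    """Return a time-indexed DataFrame, parsing every date only if needed."""
    # Assumption: columns are separated by whitespace.
    df = pd.read_csv(io.StringIO(text), sep=r"\s+", names=COLUMNS)

    # Parse only the first and last rows to get the start and end times.
    ends = pd.to_datetime(df[DATE_COLUMNS].iloc[[0, -1]])
    start, end = ends.iloc[0], ends.iloc[1]

    # Number of samples expected if the file has no time gaps.
    expected = round((end - start).total_seconds() * sample_rate) + 1

    if expected == len(df):
        # Fast path: build a regular index without parsing each row.
        step = pd.Timedelta(seconds=1 / sample_rate)
        index = pd.date_range(start, periods=len(df), freq=step)
    else:
        # Slow path: parse every row so gaps keep their true timestamps.
        index = pd.DatetimeIndex(pd.to_datetime(df[DATE_COLUMNS]))

    data = df.drop(columns=DATE_COLUMNS)
    data.index = index.rename("date")
    return data


CONTINUOUS = """\
2021 05 11 15 31 31 1.0 2.0 3.0
2021 05 11 15 31 32 1.1 2.1 3.1
2021 05 11 15 31 33 1.2 2.2 3.2
"""
GAPPY = """\
2021 05 11 15 31 31 1.0 2.0 3.0
2021 05 11 15 31 32 1.1 2.1 3.1
2021 05 11 15 31 35 1.2 2.2 3.2
"""

fast = read_fast_or_slow(CONTINUOUS)  # sample count matches, regular index
slow = read_fast_or_slow(GAPPY)       # count mismatch, every row is parsed
```

The sample-count check is what makes the fast path safe: a regular index is only used when the first and last timestamps account for every row.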
- read_metadata()[source]
Read only first and last rows to get important metadata to use in the collection.
- property run_metadata
run metadata as
mt_metadata.timeseries.Run
- property start
start time of data collection in the LEMI424 file
- property station_metadata
station metadata as
mt_metadata.timeseries.Station
- to_run_ts(fn=None, e_channels=['e1', 'e2'])[source]
Create a mth5.timeseries.RunTS object from the data
- Parameters
fn (string or pathlib.Path, optional) – full path to file, defaults to None. Will use LEMI424.fn if None.
e_channels (list of strings, optional) – columns for the electric channels to use, defaults to [“e1”, “e2”]
- Returns
RunTS object
- Return type
- mth5.io.lemi.lemi424.lemi_date_parser(year, month, day, hour, minute, second)[source]
Convenience function to combine the date-time columns that are output by the LEMI into a single column.
Assumes UTC.
- Parameters
year (int) – year
month (int) – month
day (int) – day of the month
hour (int) – hour in 24 hr format
minute (int) – minutes in the hour
second (int) – seconds in the minute
- Returns
date time as a single column
- Return type
pandas.DateTime
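As an illustration of combining the six date-time columns into one, the following sketch uses pandas.to_datetime on a frame of components. The helper name combine_date_columns is hypothetical and this is not necessarily how lemi_date_parser is written.

```python
import pandas as pd


def combine_date_columns(year, month, day, hour, minute, second):
    """Combine separate date and time columns into one UTC datetime column."""
    parts = pd.DataFrame(
        {
            "year": year,
            "month": month,
            "day": day,
            "hour": hour,
            "minute": minute,
            "second": second,
        }
    )
    # pandas assembles a datetime from these component columns;
    # utc=True marks the result as UTC, matching the stated assumption.
    return pd.to_datetime(parts, utc=True)


dates = combine_date_columns(
    [2021, 2021], [5, 5], [11, 11], [15, 15], [31, 31], [31, 32]
)
```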
- mth5.io.lemi.lemi424.lemi_hemisphere_parser(hemisphere)[source]
Convert a hemisphere string into a sign, -1 or 1. Assumes the prime meridian is 0.
- Parameters
hemisphere (string) – hemisphere string [ ‘N’ | ‘S’ | ‘E’ | ‘W’]
- Returns
unity with a sign for the given hemisphere
- Return type
signed integer
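A minimal sketch of the hemisphere-to-sign mapping, assuming north and east are positive and south and west are negative. The name hemisphere_sign and the error raised for unknown input are illustrative choices, not the library's behavior.

```python
def hemisphere_sign(hemisphere):
    """Return 1 for 'N' or 'E' and -1 for 'S' or 'W'."""
    signs = {"N": 1, "E": 1, "S": -1, "W": -1}
    key = hemisphere.strip().upper()
    if key not in signs:
        # Assumption: reject anything that is not a hemisphere letter.
        raise ValueError(f"Unknown hemisphere: {hemisphere!r}")
    return signs[key]
```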
- mth5.io.lemi.lemi424.lemi_position_parser(position)[source]
Convenience function to parse the location strings into a decimal float. Uses the hemisphere for the sign.
Note
The format of the location is odd in that the degrees are multiplied by 100 within the LEMI to provide a single floating point value that holds the degrees followed by the decimal minutes –> {degrees}{minutes[mm.mm]}. For example, 40.50166 would be represented as 4030.1.
- Parameters
position (TYPE) – DESCRIPTION
- Returns
DESCRIPTION
- Return type
TYPE
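The conversion in the note can be worked through in a few lines. This sketch assumes the value is non-negative and laid out as degrees times 100 plus decimal minutes, which is what the 4030.1 example implies. The function name and the optional hemisphere argument are hypothetical.

```python
def ddmm_to_decimal_degrees(position, hemisphere="N"):
    """Convert a degrees*100 + minutes value into signed decimal degrees."""
    # Split 4030.1 into 40 whole degrees and 30.1 minutes.
    degrees, minutes = divmod(float(position), 100)
    value = degrees + minutes / 60
    # South and west are negative; north and east are positive.
    sign = -1 if hemisphere.strip().upper() in ("S", "W") else 1
    return sign * value


latitude = ddmm_to_decimal_degrees(4030.1)          # about 40.5017
southern = ddmm_to_decimal_degrees("4030.1", "S")   # about -40.5017
```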
- mth5.io.lemi.lemi424.read_lemi424(fn, e_channels=['e1', 'e2'], fast=True, logger_file_handler=None)[source]
Read a LEMI 424 TXT file.
- Parameters
fn (string or Path) – input file name
e_channels (list of strings, optional) – a list of electric channels to read, defaults to [“e1”, “e2”]
- Returns
A RunTS object with appropriate metadata
- Return type
mth5.timeseries.RunTS
mth5.io.lemi.lemi_collection module
LEMI 424 Collection
Collection of TXT files combined into runs
Created on Wed Aug 31 10:32:44 2022
@author: jpeacock
- class mth5.io.lemi.lemi_collection.LEMICollection(file_path=None, **kwargs)[source]
Bases:
Collection
Collection of LEMI 424 files into runs based on start and end times. Will assign the run name as ‘sr1_{index:0{zeros}}’ –> ‘sr1_0001’ for zeros = 4.
- Parameters
file_path (string or pathlib.Path) – full path to single station LEMI424 directory
file_ext (string) – extension of LEMI424 files, default is ‘txt’
station_id (string) – station id
survey_id (string) – survey id
Note
This class assumes that the given file path contains a single LEMI station. To process multiple stations, merge the returned data frames.
Note
LEMI data comes with little metadata about the station or survey, therefore you should assign station_id and survey_id.
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection(r"/path/to/single/lemi/station")
>>> lc.station_id = "mt001"
>>> lc.survey_id = "test_survey"
>>> run_dict = lc.get_runs(1)
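Merging data frames from several stations can be done with pandas.concat. The two frames below are stand-ins for what each station's collection might return. Their column names are assumptions for illustration and do not reflect the actual columns.

```python
import pandas as pd

# Stand-ins for the data frames of two single-station directories.
# Hypothetical columns: the real frames carry more information.
df_mt001 = pd.DataFrame(
    {"station": ["mt001", "mt001"], "run": ["sr1_0001", "sr1_0002"]}
)
df_mt002 = pd.DataFrame({"station": ["mt002"], "run": ["sr1_0001"]})

# Stack the per-station frames and renumber the rows from zero.
all_stations = pd.concat([df_mt001, df_mt002], ignore_index=True)
```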
- assign_run_names(df, zeros=4)[source]
Assign run names based on start and end times, checks if a file has the same start time as the last end time.
Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.
- Parameters
df (pandas.DataFrame) – Dataframe with the appropriate columns
zeros (int, optional) – number of zeros in run name, defaults to 4
- Returns
Dataframe with run names
- Return type
pandas.DataFrame
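The run-naming rule can be sketched with pandas as follows. This follows the description literally: a file continues the current run only when its start time equals the previous file's end time. The function name name_runs and the start, end, and sample_rate columns are assumptions. The actual method may compare times differently.

```python
import pandas as pd


def name_runs(df, zeros=4):
    """Add a 'run' column named sr{sample_rate}_{run_number:0{zeros}}."""
    df = df.sort_values("start").reset_index(drop=True)
    # True where a file does not start at the previous file's end time.
    # The first row compares against a missing value, so it starts run 1.
    new_run = df["start"] != df["end"].shift(1)
    run_number = new_run.cumsum()
    df["run"] = [
        f"sr{int(sr)}_{int(n):0{zeros}}"
        for sr, n in zip(df["sample_rate"], run_number)
    ]
    return df


files = pd.DataFrame(
    {
        "start": pd.to_datetime(
            ["2022-08-31 00:00", "2022-08-31 01:00", "2022-08-31 05:00"]
        ),
        "end": pd.to_datetime(
            ["2022-08-31 01:00", "2022-08-31 02:00", "2022-08-31 06:00"]
        ),
        "sample_rate": [1, 1, 1],
    }
)
named = name_runs(files)
```

Here the first two files are contiguous and share a run, while the third starts after a gap and begins a new one.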
- to_dataframe(sample_rates=[1], run_name_zeros=4, calibration_path=None)[source]
Create a data frame of each TXT file in a given directory.
Note
This assumes the given directory contains a single station
- Parameters
sample_rates (int or list, optional) – sample rates to get; this is always 1 for LEMI data, defaults to [1]
run_name_zeros (int, optional) – number of zeros to assign to the run name, defaults to 4
calibration_path (string or Path, optional) – path to calibration files, defaults to None
- Returns
Dataframe with information of each TXT file in the given directory.
- Return type
pandas.DataFrame
- Example
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> lemi_df = lc.to_dataframe()
Module contents
- class mth5.io.lemi.LEMI424(fn=None, **kwargs)[source]
Bases:
object
Read in a LEMI424 file. This is a placeholder until IRIS finalizes their reader.
- Parameters
fn (pathlib.Path or string) – full path to LEMI424 file
sample_rate (float) – sample rate of the file, default is 1.0
chunk_size (integer) – chunk size for pandas to use; it does not change reading time much for a single-day file. Default is 8640
file_column_names (list of strings) – column names of the LEMI424 file
dtypes (dictionary with keys of column names and values of data types) – data types for each column
data_column_names (dictionary with keys of column names and values of data types) – same as file_column_names with an added column for date, which combines the date and time columns.
- LEMI424 File Column Names
year
month
day
hour
minute
second
bx
by
bz
temperature_e
temperature_h
e1
e2
e3
e4
battery
elevation
latitude
lat_hemisphere
longitude
lon_hemisphere
n_satellites
gps_fix
time_diff
- Data Column Names
date
bx
by
bz
temperature_e
temperature_h
e1
e2
e3
e4
battery
elevation
latitude
lat_hemisphere
longitude
lon_hemisphere
n_satellites
gps_fix
time_diff
- property data
Data represented as a pandas.DataFrame with data_column_names
- property elevation
median elevation where data have been collected in the LEMI424 file
- property end
end time of data collection in the LEMI424 file
- property file_size
size of file in bytes
- property fn
full path to LEMI424 file
- property gps_lock
has GPS lock
- property latitude
median latitude where data have been collected in the LEMI424 file
- property longitude
median longitude where data have been collected in the LEMI424 file
- property n_samples
number of samples in the file
- read(fn=None, fast=True)[source]
Read a LEMI424 file using pandas. The fast way reads the first and last lines to get the start and end times and builds a time index from them, then reads the data without parsing the date-time columns. It checks that the number of points read matches the expected number. If not, it falls back to the slower way, which uses the date-time parser so that any time gaps are respected.
- Parameters
fn (string or pathlib.Path, optional) – full path to file, defaults to None. Uses LEMI424.fn if not provided
fast – read the fast way (True) or not (False)
- Returns
DESCRIPTION
- Return type
TYPE
- read_metadata()[source]
Read only first and last rows to get important metadata to use in the collection.
- property run_metadata
run metadata as
mt_metadata.timeseries.Run
- property start
start time of data collection in the LEMI424 file
- property station_metadata
station metadata as
mt_metadata.timeseries.Station
- to_run_ts(fn=None, e_channels=['e1', 'e2'])[source]
Create a mth5.timeseries.RunTS object from the data
- Parameters
fn (string or pathlib.Path, optional) – full path to file, defaults to None. Will use LEMI424.fn if None.
e_channels (list of strings, optional) – columns for the electric channels to use, defaults to [“e1”, “e2”]
- Returns
RunTS object
- Return type
- class mth5.io.lemi.LEMICollection(file_path=None, **kwargs)[source]
Bases:
Collection
Collection of LEMI 424 files into runs based on start and end times. Will assign the run name as ‘sr1_{index:0{zeros}}’ –> ‘sr1_0001’ for zeros = 4.
- Parameters
file_path (string or pathlib.Path) – full path to single station LEMI424 directory
file_ext (string) – extension of LEMI424 files, default is ‘txt’
station_id (string) – station id
survey_id (string) – survey id
Note
This class assumes that the given file path contains a single LEMI station. To process multiple stations, merge the returned data frames.
Note
LEMI data comes with little metadata about the station or survey, therefore you should assign station_id and survey_id.
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection(r"/path/to/single/lemi/station")
>>> lc.station_id = "mt001"
>>> lc.survey_id = "test_survey"
>>> run_dict = lc.get_runs(1)
- assign_run_names(df, zeros=4)[source]
Assign run names based on start and end times, checks if a file has the same start time as the last end time.
Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.
- Parameters
df (pandas.DataFrame) – Dataframe with the appropriate columns
zeros (int, optional) – number of zeros in run name, defaults to 4
- Returns
Dataframe with run names
- Return type
pandas.DataFrame
- to_dataframe(sample_rates=[1], run_name_zeros=4, calibration_path=None)[source]
Create a data frame of each TXT file in a given directory.
Note
This assumes the given directory contains a single station
- Parameters
sample_rates (int or list, optional) – sample rates to get; this is always 1 for LEMI data, defaults to [1]
run_name_zeros (int, optional) – number of zeros to assign to the run name, defaults to 4
calibration_path (string or Path, optional) – path to calibration files, defaults to None
- Returns
Dataframe with information of each TXT file in the given directory.
- Return type
pandas.DataFrame
- Example
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> lemi_df = lc.to_dataframe()
- mth5.io.lemi.read_lemi424(fn, e_channels=['e1', 'e2'], fast=True, logger_file_handler=None)[source]
Read a LEMI 424 TXT file.
- Parameters
fn (string or Path) – input file name
e_channels (list of strings, optional) – a list of electric channels to read, defaults to [“e1”, “e2”]
- Returns
A RunTS object with appropriate metadata
- Return type
mth5.timeseries.RunTS