mth5.io.lemi package
Submodules
mth5.io.lemi.lemi424 module
Created on Tue May 11 15:31:31 2021
- copyright:
Jared Peacock (jpeacock@usgs.gov)
- license:
MIT
- class mth5.io.lemi.lemi424.LEMI424(fn: str | Path | None = None, **kwargs: Any)
Bases: object
Read and process LEMI424 magnetotelluric data files.
This is a placeholder until IRIS finalizes their reader.
- Parameters:
fn (str or pathlib.Path, optional) – Full path to LEMI424 file, by default None.
**kwargs (dict) – Additional keyword arguments for configuration.
- data_column_names
Same as file_column_names with an added column for date.
- Type:
list of str
Notes
- LEMI424 File Column Names:
year, month, day, hour, minute, second, bx, by, bz, temperature_e, temperature_h, e1, e2, e3, e4, battery, elevation, latitude, lat_hemisphere, longitude, lon_hemisphere, n_satellites, gps_fix, time_diff
- Data Column Names:
date, bx, by, bz, temperature_e, temperature_h, e1, e2, e3, e4, battery, elevation, latitude, lat_hemisphere, longitude, lon_hemisphere, n_satellites, gps_fix, time_diff
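The relationship between the two lists can be sketched in plain Python. This only illustrates the documented column names; it is not taken from the library's source, and the variable `date_parts` is introduced here for illustration.

```python
# Column names as listed in the notes above.
file_column_names = [
    "year", "month", "day", "hour", "minute", "second",
    "bx", "by", "bz", "temperature_e", "temperature_h",
    "e1", "e2", "e3", "e4", "battery", "elevation",
    "latitude", "lat_hemisphere", "longitude", "lon_hemisphere",
    "n_satellites", "gps_fix", "time_diff",
]

# The six date-time columns collapse into a single "date" column.
date_parts = file_column_names[:6]
data_column_names = ["date"] + file_column_names[6:]
```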
- property data: DataFrame | None
Data represented as a pandas DataFrame with data column names.
- Returns:
The loaded data or None if no data is loaded.
- Return type:
pd.DataFrame or None
- property elevation: float | None
Median elevation where data have been collected.
- Returns:
Median elevation in meters or None if no data is loaded.
- Return type:
float or None
- property end: MTime | None
End time of data collection in the LEMI424 file.
- Returns:
End time or None if no data is loaded.
- Return type:
MTime or None
- property file_size: int | None
Size of file in bytes.
- Returns:
File size in bytes or None if no file is set.
- Return type:
int or None
- property fn: Path | None
Full path to LEMI424 file.
- Returns:
Path to the file or None if not set.
- Return type:
pathlib.Path or None
- property gps_lock: Any | None
GPS lock status array.
- Returns:
GPS fix values or None if no data is loaded.
- Return type:
numpy.ndarray or None
- property latitude: float | None
Median latitude where data have been collected.
- Returns:
Median latitude in degrees or None if no data is loaded.
- Return type:
float or None
- property longitude: float | None
Median longitude where data have been collected.
- Returns:
Median longitude in degrees or None if no data is loaded.
- Return type:
float or None
- property n_samples: int | None
Number of samples in the file.
- Returns:
Number of samples or None if no data/file available.
- Return type:
int or None
- read(fn: str | Path | None = None, fast: bool = True) -> None
Read a LEMI424 file using pandas.
The fast method reads only the first and last lines to get the start and end times and builds a time index from them. It then reads the data without parsing the date-time columns and checks that the number of points matches the expected count. If the counts differ, it falls back to the slower method, which uses the date-time parser so that any time gaps are respected.
- Parameters:
fn (str, pathlib.Path, or None, optional) – Full path to file. Uses LEMI424.fn if not provided, by default None.
fast (bool, optional) – Read the fast way (True) or not (False), by default True.
- Raises:
IOError – If file cannot be found.
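The consistency check behind the fast path can be sketched with the standard library. This is a minimal illustration, assuming that both end points are included in the count; the helper names `expected_n_samples` and `can_use_fast_read` are hypothetical and are not part of mth5.

```python
from datetime import datetime, timezone


def expected_n_samples(start, end, sample_rate=1.0):
    """Samples a gap-free record from start to end (both included) should hold."""
    return int(round((end - start).total_seconds() * sample_rate)) + 1


def can_use_fast_read(start, end, n_rows, sample_rate=1.0):
    """True when the row count matches a gap-free time index."""
    return n_rows == expected_n_samples(start, end, sample_rate)


# Hypothetical first and last time stamps of a one-minute file at 1 Hz.
start = datetime(2021, 5, 11, 0, 0, 0, tzinfo=timezone.utc)
end = datetime(2021, 5, 11, 0, 0, 59, tzinfo=timezone.utc)

complete = can_use_fast_read(start, end, n_rows=60)  # no gaps
gappy = can_use_fast_read(start, end, n_rows=55)     # rows are missing
```

When the check fails, as for `gappy`, the time index cannot be built from the end points alone, which is why the slower path parses every time stamp.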
- read_calibration(fn: str | Path) -> FrequencyResponseTableFilter
Read a LEMI424 calibration file.
Calibration files are assumed to be JSON files with the following format:
{
    "Calibration": {
        "gain": float,
        "Freq": [float],
        "Re": [float],
        "Im": [float]
    }
}
- Parameters:
fn (str or pathlib.Path) – Full path to calibration file.
- Returns:
Calibration filter object.
- Return type:
mt_metadata.timeseries.filters.FrequencyResponseTableFilter
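A file in the layout described above can be parsed with the standard library alone. The sketch below uses invented values and only shows how the Freq, Re and Im lists line up; how the library turns them into a FrequencyResponseTableFilter is not shown here.

```python
import json

# Hypothetical calibration content following the documented layout.
text = """
{
  "Calibration": {
    "gain": 1.0,
    "Freq": [0.001, 0.01, 0.1],
    "Re": [0.5, 0.9, 1.0],
    "Im": [0.1, 0.05, 0.0]
  }
}
"""

cal = json.loads(text)["Calibration"]

# Pair each frequency with a complex response built from Re and Im.
response = {
    freq: complex(real, imag)
    for freq, real, imag in zip(cal["Freq"], cal["Re"], cal["Im"])
}
gain = cal["gain"]
```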
- read_metadata() -> None
Read only first and last rows to get important metadata.
This method is used to extract essential metadata from the collection without loading the entire dataset.
- property run_metadata: Run
Run metadata as mt_metadata.timeseries.Run object.
- Returns:
Run metadata object.
- Return type:
mt_metadata.timeseries.Run
- property start: MTime | None
Start time of data collection in the LEMI424 file.
- Returns:
Start time or None if no data is loaded.
- Return type:
MTime or None
- property station_metadata: Station
Station metadata as mt_metadata.timeseries.Station object.
- Returns:
Station metadata object.
- Return type:
mt_metadata.timeseries.Station
- to_run_ts(fn: str | Path | None = None, e_channels: list[str] = ['e1', 'e2'], calibration_dict: dict | None = None) -> RunTS
Create a RunTS object from the data.
- Parameters:
fn (str, pathlib.Path, or None, optional) – Full path to file. Will use LEMI424.fn if None, by default None.
e_channels (list of str, optional) – Column names for the electric channels to use, by default [“e1”, “e2”].
calibration_dict (dict, optional) – Calibration dictionary to apply to the data, by default None. Keys are the channel names and values are the calibration file paths. Each file path is assumed to be in the format lemi-{component}.sr.json.
- Returns:
RunTS object containing the data.
- Return type:
RunTS
- mth5.io.lemi.lemi424.lemi_date_parser(year: int, month: int, day: int, hour: int, minute: int, second: int) -> Series
Combine the date-time columns that are output by LEMI into a single column.
Assumes UTC timezone.
- Parameters:
year (int) – Year value.
month (int) – Month value (1-12).
day (int) – Day of the month (1-31).
hour (int) – Hour in 24-hour format (0-23).
minute (int) – Minutes in the hour (0-59).
second (int) – Seconds in the minute (0-59).
- Returns:
Combined date-time as a pandas DatetimeIndex.
- Return type:
pd.DatetimeIndex
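For a single row, the combination can be sketched with the standard library. This is a scalar illustration only: the documented function returns a pandas object, and the name `combine_lemi_date` is hypothetical.

```python
from datetime import datetime, timezone


def combine_lemi_date(year, month, day, hour, minute, second):
    """Build one UTC time stamp from the six LEMI date-time fields."""
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


stamp = combine_lemi_date(2021, 5, 11, 15, 31, 31)
```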
- mth5.io.lemi.lemi424.lemi_hemisphere_parser(hemisphere: str) -> int
Convert a hemisphere letter into a sign, either -1 or 1.
Assumes the prime meridian is 0.
- Parameters:
hemisphere (str) – Hemisphere string. Valid values are ‘N’, ‘S’, ‘E’, ‘W’.
- Returns:
Unity with a sign for the given hemisphere. Returns -1 for ‘S’ or ‘W’, 1 for ‘N’ or ‘E’.
- Return type:
int
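The mapping described above amounts to a small lookup. The sketch below is illustrative: the name `hemisphere_sign` is hypothetical, and stripping whitespace and upper-casing the input are assumptions, as is raising KeyError for any other value.

```python
def hemisphere_sign(hemisphere):
    """Return -1 for 'S' or 'W' and 1 for 'N' or 'E'."""
    signs = {"N": 1, "E": 1, "S": -1, "W": -1}
    # Normalizing the input is an assumption made for this sketch.
    return signs[hemisphere.strip().upper()]
```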
- mth5.io.lemi.lemi424.lemi_position_parser(position: float) -> float
Parse LEMI location strings into decimal degrees.
Uses the hemisphere for the sign.
Notes
The location format is unusual: the LEMI multiplies the whole degrees by 100 and adds the minutes, giving a single floating point value of the form {degrees}{minutes}, where the minutes carry the decimal part. For example, 40.50166 would be represented as 4030.1.
- Parameters:
position (float) – LEMI position value to parse.
- Returns:
Decimal degrees position.
- Return type:
float
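The example in the notes (40.50166 stored as 4030.1) is consistent with whole degrees followed by decimal minutes. A minimal sketch under that reading follows; the function name and the `sign` argument are hypothetical, and the input is assumed to be non-negative.

```python
def ddmm_to_decimal_degrees(position, sign=1):
    """Convert a {degrees}{minutes} value such as 4030.1 to decimal degrees."""
    degrees = int(position // 100)      # leading digits are whole degrees
    minutes = position - degrees * 100  # remainder is decimal minutes
    return sign * (degrees + minutes / 60)


lat = ddmm_to_decimal_degrees(4030.1)
```

Here `lat` is approximately 40.50167, matching the documented example to the precision given.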
- mth5.io.lemi.lemi424.read_lemi424(fn: str | Path | list[str | Path], e_channels: list[str] = ['e1', 'e2'], fast: bool = True, calibration_dict: dict | None = None) -> RunTS
Read a LEMI 424 TXT file.
- Parameters:
fn (str, pathlib.Path, or list of str or pathlib.Path) – Input file name or list of file names.
e_channels (list of str, optional) – A list of electric channels to read, by default [“e1”, “e2”].
fast (bool, optional) – Use fast reading method, by default True.
calibration_dict (dict, optional) – Calibration dictionary to apply to the data, by default None. Keys are the channel names and values are the calibration file path.
- Returns:
A RunTS object with appropriate metadata.
- Return type:
RunTS
mth5.io.lemi.lemi_collection module
LEMI 424 Collection
Collection of TXT files combined into runs
Created on Wed Aug 31 10:32:44 2022
@author: jpeacock
- class mth5.io.lemi.lemi_collection.LEMICollection(file_path: str | Path | None = None, file_ext: List[str] | None = None, **kwargs)
Bases: Collection
Collection of LEMI 424 files into runs based on start and end times.
Will assign the run name as ‘sr1_{index:0{zeros}}’, for example ‘sr1_0001’ for zeros = 4.
Notes
This class assumes that the given file path contains a single LEMI station. To process multiple stations, merge the returned data frames.
LEMI data comes with little metadata about the station or survey; therefore, you should assign station_id and survey_id.
- Parameters:
file_path (str or pathlib.Path, optional) – Full path to single station LEMI424 directory, by default None
file_ext (list of str, optional) – Extension of LEMI424 files, by default [“txt”, “TXT”]
**kwargs – Additional keyword arguments passed to parent Collection class
Examples
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection(r"/path/to/single/lemi/station")
>>> lc.station_id = "mt001"
>>> lc.survey_id = "test_survey"
>>> run_dict = lc.get_runs(1)
- assign_run_names(df: DataFrame, zeros: int = 4) -> DataFrame
Assign run names based on start and end times.
Checks if a file has the same start time as the last end time. Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.
- Parameters:
df (pd.DataFrame) – DataFrame with the appropriate columns
zeros (int, optional) – Number of zeros in run name, by default 4
- Returns:
DataFrame with run names assigned
- Return type:
pd.DataFrame
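The naming rule can be sketched without pandas. This is an illustration under stated assumptions, not the library's implementation: files are given as (start, end) pairs already sorted by start time, two files belong to the same run only when the start equals the previous end exactly, and the name `sketch_run_names` is hypothetical.

```python
def sketch_run_names(files, sample_rate=1, zeros=4):
    """Give contiguous files the same run name.

    files is a list of (start, end) pairs sorted by start time.
    """
    names = []
    run_number = 0
    last_end = None
    for start, end in files:
        # A new run begins unless this file picks up where the last one ended.
        if last_end is None or start != last_end:
            run_number += 1
        names.append(f"sr{sample_rate}_{run_number:0{zeros}}")
        last_end = end
    return names


# Times in seconds; the third file starts after a gap.
files = [(0, 100), (100, 200), (250, 300)]
names = sketch_run_names(files)
```

The first two files share a run because the second starts where the first ends; the gap before the third file starts a new run.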
- get_calibrations(calibration_path: str | Path) -> dict
Get calibration dictionary for LEMI424 files. This assumes that the calibration files are in JSON format and named ‘LEMI-424-<component>.json’.
- Parameters:
calibration_path (str or pathlib.Path) – Path to calibration files
- Returns:
Calibration dictionary for LEMI424 files
- Return type:
dict
Examples
>>> from pathlib import Path
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> cal_dict = lc.get_calibrations(Path("/path/to/calibrations"))
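The file naming convention suggests a simple directory scan. The sketch below is an assumption about how such a lookup could work, not the library's code: the helper `find_calibrations` is hypothetical, and component names such as "bx" are chosen for illustration.

```python
import tempfile
from pathlib import Path


def find_calibrations(calibration_path):
    """Map component name to file for 'LEMI-424-<component>.json' files."""
    prefix = "LEMI-424-"
    found = {}
    for path in sorted(Path(calibration_path).glob(f"{prefix}*.json")):
        # The component is whatever follows the prefix in the file stem.
        component = path.stem[len(prefix):]
        found[component] = path
    return found


# Demonstrate on a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for comp in ("bx", "by", "bz"):
        (Path(tmp) / f"LEMI-424-{comp}.json").touch()
    cal_dict = find_calibrations(tmp)
    components = sorted(cal_dict)
```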
- to_dataframe(sample_rates: int | List[int] | None = None, run_name_zeros: int = 4, calibration_path: str | Path | None = None) -> DataFrame
Create a data frame of each TXT file in a given directory.
Notes
This assumes the given directory contains a single station
- Parameters:
sample_rates (int or list of int, optional) – Sample rate to get, will always be 1 for LEMI data, by default [1]
run_name_zeros (int, optional) – Number of zeros to assign to the run name, by default 4
calibration_path (str or pathlib.Path, optional) – Path to calibration files, by default None
- Returns:
DataFrame with information of each TXT file in the given directory
- Return type:
pd.DataFrame
Examples
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> lemi_df = lc.to_dataframe()
Module contents
- class mth5.io.lemi.LEMI424(fn: str | Path | None = None, **kwargs: Any)
Bases: object
Read and process LEMI424 magnetotelluric data files.
This is a placeholder until IRIS finalizes their reader.
- Parameters:
fn (str or pathlib.Path, optional) – Full path to LEMI424 file, by default None.
**kwargs (dict) – Additional keyword arguments for configuration.
- sample_rate
Sample rate of the file, default is 1.0.
- Type:
float
- chunk_size
Chunk size for pandas to use, default is 8640.
- Type:
int
- file_column_names
Column names of the LEMI424 file.
- Type:
list of str
- dtypes
Data types for each column.
- Type:
dict
- data_column_names
Same as file_column_names with an added column for date.
- Type:
list of str
- data
The loaded data.
- Type:
pd.DataFrame or None
Notes
- LEMI424 File Column Names:
year, month, day, hour, minute, second, bx, by, bz, temperature_e, temperature_h, e1, e2, e3, e4, battery, elevation, latitude, lat_hemisphere, longitude, lon_hemisphere, n_satellites, gps_fix, time_diff
- Data Column Names:
date, bx, by, bz, temperature_e, temperature_h, e1, e2, e3, e4, battery, elevation, latitude, lat_hemisphere, longitude, lon_hemisphere, n_satellites, gps_fix, time_diff
- property data: DataFrame | None
Data represented as a pandas DataFrame with data column names.
- Returns:
The loaded data or None if no data is loaded.
- Return type:
pd.DataFrame or None
- property elevation: float | None
Median elevation where data have been collected.
- Returns:
Median elevation in meters or None if no data is loaded.
- Return type:
float or None
- property end: MTime | None
End time of data collection in the LEMI424 file.
- Returns:
End time or None if no data is loaded.
- Return type:
MTime or None
- property file_size: int | None
Size of file in bytes.
- Returns:
File size in bytes or None if no file is set.
- Return type:
int or None
- property fn: Path | None
Full path to LEMI424 file.
- Returns:
Path to the file or None if not set.
- Return type:
pathlib.Path or None
- property gps_lock: Any | None
GPS lock status array.
- Returns:
GPS fix values or None if no data is loaded.
- Return type:
numpy.ndarray or None
- property latitude: float | None
Median latitude where data have been collected.
- Returns:
Median latitude in degrees or None if no data is loaded.
- Return type:
float or None
- property longitude: float | None
Median longitude where data have been collected.
- Returns:
Median longitude in degrees or None if no data is loaded.
- Return type:
float or None
- property n_samples: int | None
Number of samples in the file.
- Returns:
Number of samples or None if no data/file available.
- Return type:
int or None
- read(fn: str | Path | None = None, fast: bool = True) -> None
Read a LEMI424 file using pandas.
The fast method reads only the first and last lines to get the start and end times and builds a time index from them. It then reads the data without parsing the date-time columns and checks that the number of points matches the expected count. If the counts differ, it falls back to the slower method, which uses the date-time parser so that any time gaps are respected.
- Parameters:
fn (str, pathlib.Path, or None, optional) – Full path to file. Uses LEMI424.fn if not provided, by default None.
fast (bool, optional) – Read the fast way (True) or not (False), by default True.
- Raises:
IOError – If file cannot be found.
- read_calibration(fn: str | Path) -> FrequencyResponseTableFilter
Read a LEMI424 calibration file.
Calibration files are assumed to be JSON files with the following format:
{
    "Calibration": {
        "gain": float,
        "Freq": [float],
        "Re": [float],
        "Im": [float]
    }
}
- Parameters:
fn (str or pathlib.Path) – Full path to calibration file.
- Returns:
Calibration filter object.
- Return type:
mt_metadata.timeseries.filters.FrequencyResponseTableFilter
- read_metadata() -> None
Read only first and last rows to get important metadata.
This method is used to extract essential metadata from the collection without loading the entire dataset.
- property run_metadata: Run
Run metadata as mt_metadata.timeseries.Run object.
- Returns:
Run metadata object.
- Return type:
mt_metadata.timeseries.Run
- property start: MTime | None
Start time of data collection in the LEMI424 file.
- Returns:
Start time or None if no data is loaded.
- Return type:
MTime or None
- property station_metadata: Station
Station metadata as mt_metadata.timeseries.Station object.
- Returns:
Station metadata object.
- Return type:
mt_metadata.timeseries.Station
- to_run_ts(fn: str | Path | None = None, e_channels: list[str] = ['e1', 'e2'], calibration_dict: dict | None = None) -> RunTS
Create a RunTS object from the data.
- Parameters:
fn (str, pathlib.Path, or None, optional) – Full path to file. Will use LEMI424.fn if None, by default None.
e_channels (list of str, optional) – Column names for the electric channels to use, by default [“e1”, “e2”].
calibration_dict (dict, optional) – Calibration dictionary to apply to the data, by default None. Keys are the channel names and values are the calibration file paths. Each file path is assumed to be in the format lemi-{component}.sr.json.
- Returns:
RunTS object containing the data.
- Return type:
RunTS
- class mth5.io.lemi.LEMICollection(file_path: str | Path | None = None, file_ext: List[str] | None = None, **kwargs)
Bases: Collection
Collection of LEMI 424 files into runs based on start and end times.
Will assign the run name as ‘sr1_{index:0{zeros}}’, for example ‘sr1_0001’ for zeros = 4.
Notes
This class assumes that the given file path contains a single LEMI station. To process multiple stations, merge the returned data frames.
LEMI data comes with little metadata about the station or survey; therefore, you should assign station_id and survey_id.
- Parameters:
file_path (str or pathlib.Path, optional) – Full path to single station LEMI424 directory, by default None
file_ext (list of str, optional) – Extension of LEMI424 files, by default [“txt”, “TXT”]
**kwargs – Additional keyword arguments passed to parent Collection class
- station_id
Station identification string, defaults to “mt001”
- Type:
str
- survey_id
Survey identification string, defaults to “mt”
- Type:
str
Examples
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection(r"/path/to/single/lemi/station")
>>> lc.station_id = "mt001"
>>> lc.survey_id = "test_survey"
>>> run_dict = lc.get_runs(1)
- assign_run_names(df: DataFrame, zeros: int = 4) -> DataFrame
Assign run names based on start and end times.
Checks if a file has the same start time as the last end time. Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.
- Parameters:
df (pd.DataFrame) – DataFrame with the appropriate columns
zeros (int, optional) – Number of zeros in run name, by default 4
- Returns:
DataFrame with run names assigned
- Return type:
pd.DataFrame
- get_calibrations(calibration_path: str | Path) -> dict
Get calibration dictionary for LEMI424 files. This assumes that the calibration files are in JSON format and named ‘LEMI-424-<component>.json’.
- Parameters:
calibration_path (str or pathlib.Path) – Path to calibration files
- Returns:
Calibration dictionary for LEMI424 files
- Return type:
dict
Examples
>>> from pathlib import Path
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> cal_dict = lc.get_calibrations(Path("/path/to/calibrations"))
- to_dataframe(sample_rates: int | List[int] | None = None, run_name_zeros: int = 4, calibration_path: str | Path | None = None) -> DataFrame
Create a data frame of each TXT file in a given directory.
Notes
This assumes the given directory contains a single station
- Parameters:
sample_rates (int or list of int, optional) – Sample rate to get, will always be 1 for LEMI data, by default [1]
run_name_zeros (int, optional) – Number of zeros to assign to the run name, by default 4
calibration_path (str or pathlib.Path, optional) – Path to calibration files, by default None
- Returns:
DataFrame with information of each TXT file in the given directory
- Return type:
pd.DataFrame
Examples
>>> from mth5.io.lemi import LEMICollection
>>> lc = LEMICollection("/path/to/single/lemi/station")
>>> lemi_df = lc.to_dataframe()
- mth5.io.lemi.read_lemi424(fn: str | Path | list[str | Path], e_channels: list[str] = ['e1', 'e2'], fast: bool = True, calibration_dict: dict | None = None) -> RunTS
Read a LEMI 424 TXT file.
- Parameters:
fn (str, pathlib.Path, or list of str or pathlib.Path) – Input file name or list of file names.
e_channels (list of str, optional) – A list of electric channels to read, by default [“e1”, “e2”].
fast (bool, optional) – Use fast reading method, by default True.
calibration_dict (dict, optional) – Calibration dictionary to apply to the data, by default None. Keys are the channel names and values are the calibration file path.
- Returns:
A RunTS object with appropriate metadata.
- Return type:
RunTS