mth5.io.nims package
Submodules
mth5.io.nims.gps module
Created on Thu Sep 1 11:43:56 2022
@author: jpeacock
- class mth5.io.nims.gps.GPS(gps_string, index=0)[source]
Bases:
object
Class to parse a GPS stamp from the NIMS.
Depending on the type of stamp, different attributes will be filled.
GPRMC has full date and time information and declination; GPGGA has elevation data.
Note
GPGGA date is set to 1980-01-01 so that the time can be estimated. Should use GPRMC for accurate date/time information.
- property declination
geomagnetic declination in degrees from north
- property elevation
elevation in meters
- property fix
GPS fixed
- property gps_type
GPRMC or GPGGA
- property latitude
Latitude in decimal degrees, WGS84
- property longitude
Longitude in decimal degrees, WGS84
- parse_gps_string(gps_string)[source]
Parse a raw gps string from the NIMS and set appropriate attributes. GPS string will first be validated, then parsed.
- Parameters
gps_string (string) – raw GPS string to be parsed
- property time_stamp
return a datetime object of the time stamp
- validate_gps_list(gps_list)[source]
check to make sure the gps stamp is the correct format, checks each element for the proper format
- Parameters
gps_list (list) – a parsed gps string from a NIMS
- Raises
mth5.io.nims.GPSError
if anything is wrong.
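As a rough illustration of what parsing a GPRMC stamp involves, here is a minimal standard-library sketch. It assumes the standard NMEA field order and does not reproduce the validation done by GPS.validate_gps_list; the names nmea_to_decimal and parse_gprmc and the sample sentence are hypothetical, and real NIMS stamps may deviate from the standard layout.

```python
from datetime import datetime


def nmea_to_decimal(value, hemisphere):
    """Convert an NMEA (d)ddmm.mmmm coordinate to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)        # leading digits are whole degrees
    minutes = raw - degrees * 100    # remainder is decimal minutes
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gprmc(sentence):
    """Parse a standard GPRMC sentence; any checksum suffix is ignored."""
    fields = sentence.strip().lstrip("$").split("*")[0].split(",")
    if fields[0] != "GPRMC":
        raise ValueError(f"not a GPRMC sentence: {fields[0]!r}")

    # Field 9 is the date (ddmmyy), field 1 the time (hhmmss[.ss]).
    time_stamp = datetime.strptime(fields[9] + fields[1][:6], "%d%m%y%H%M%S")

    # Fields 10 and 11 hold the magnetic variation and its direction.
    declination = None
    if len(fields) > 10 and fields[10]:
        declination = float(fields[10])
        if len(fields) > 11 and fields[11] == "W":
            declination = -declination

    return {
        "fix": fields[2] == "A",     # "A" = valid fix, "V" = warning
        "latitude": nmea_to_decimal(fields[3], fields[4]),
        "longitude": nmea_to_decimal(fields[5], fields[6]),
        "time_stamp": time_stamp,
        "declination": declination,
    }


# Made-up sample sentence, for illustration only.
stamp = parse_gprmc("$GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E")
```

A GPGGA stamp carries no date field, which is why the class falls back to a placeholder date for that stamp type.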
mth5.io.nims.header module
Created on Thu Sep 1 12:57:32 2022
@author: jpeacock
- class mth5.io.nims.header.NIMSHeader(fn=None)[source]
Bases:
object
class to hold the NIMS header information.
A typical header looks like
'''
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>user field>>>>>>>>>>>>>>>>>>>>>>>>>>>>
SITE NAME: Budwieser Spring
STATE/PROVINCE: CA
COUNTRY: USA
>>> The following code in double quotes is REQUIRED to start the NIMS <<
>>> The next 3 lines contain values required for processing <<<<<<<<<<<<
>>> The lines after that are optional <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
"300b" <-- 2CHAR EXPERIMENT CODE + 3 CHAR SITE CODE + RUN LETTER
1105-3; 1305-3 <-- SYSTEM BOX I.D.; MAG HEAD ID (if different)
106 0 <-- N-S Ex WIRE LENGTH (m); HEADING (deg E mag N)
109 90 <-- E-W Ey WIRE LENGTH (m); HEADING (deg E mag N)
1 <-- N ELECTRODE ID
3 <-- E ELECTRODE ID
2 <-- S ELECTRODE ID
4 <-- W ELECTRODE ID
Cu <-- GROUND ELECTRODE INFO
GPS INFO: 01/10/19 16:16:42 1616.7000 3443.6088 115.7350 W 946.6
OPERATOR: KP
COMMENT: N/S CRS: .95/.96 DCV: 3.5 ACV:1
E/W CRS: .85/.86 DCV: 1.5 ACV: 1
Redeployed site for run b b/c possible animal disturbance
'''
- property file_size
Size of the file
- property fn
Full path to NIMS file
- read_header(fn=None)[source]
read header information
- Parameters
fn (string or pathlib.Path) – full path to file to read
- Raises
mth5.io.nims.NIMSError
if something is not right.
- property station
Station ID
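The annotated lines in a NIMS header put the values to the left of a "<--" marker and a description to the right. A minimal sketch of splitting such a line follows; split_header_line is a hypothetical helper for illustration, not the parsing logic used by read_header.

```python
def split_header_line(line):
    """Split 'values <-- description' into (list of values, description)."""
    value_part, _, label = line.partition("<--")
    # Values may be separated by spaces or semicolons and wrapped in quotes.
    values = value_part.replace(";", " ").replace('"', " ").split()
    return values, label.strip()


values, label = split_header_line(
    "106 0 <-- N-S Ex WIRE LENGTH (m); HEADING (deg E mag N)"
)
# The first value is the wire length in meters, the second the heading.
wire_length, heading = float(values[0]), float(values[1])
```

Lines without a "<--" marker, such as "SITE NAME: ..." or "OPERATOR: ...", come back with an empty description and would need separate key/value handling.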
mth5.io.nims.nims module
NIMS
deals with reading in NIMS DATA.BIN files
- This is a translation from Matlab codes written and edited by:
Anna Kelbert
Paul Bedrosian
Esteban Bowles-Martinez
Possibly others.
I’ve tested it against a version of the Matlab code, and it matches. I still don’t understand the data/GPS gaps, so for now the time series is just made continuous and the number of missing seconds is clipped from the end of the time series.
Note
This only works for 8 Hz data for now.
- copyright
Jared Peacock (jpeacock@usgs.gov)
- license
MIT
- class mth5.io.nims.nims.NIMS(fn=None)[source]
Bases:
NIMSHeader
NIMS class will read in a NIMS DATA.BIN file.
A fast way to read the binary files is to first read in the GPS strings (the third byte in each block) as characters and parse those into valid GPS stamps.
Then read in the entire data set as unsigned 8-bit integers and reshape the data to be n seconds x block size. Then parse that array into the status information and data.
I only have a limited number of .BIN files to test, so this will likely break if there are issues such as data gaps. This has been tested against the Matlab program loadNIMS by Anna Kelbert, and the results match for all the .bin files I have. If something looks weird, check it against that program.
Warning
Currently only 8 Hz data is supported.
- align_data(data_array, stamps)[source]
Need to match up the first good GPS stamp with the data
Do this by using the first GPS stamp and assuming that the time from the first time stamp to the start is the index value.
put the data into a pandas data frame that is indexed by time
- Parameters
data_array (array) – structure array with columns for each component [hx, hy, hz, ex, ey]
stamps (list) – list of GPS stamps [[status_index, [GPRMC, GPGGA]]]
- Returns
pandas DataFrame with columns of components, indexed by time and initialized by the start time.
Note
Data gaps are squeezed out because it is not clear what a gap actually means.
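The start-time estimate described for align_data can be sketched with the standard library. This assumes each data block spans one second, so that the block index of the first good stamp equals the number of seconds since the start of the series; estimate_start_time is a hypothetical name.

```python
from datetime import datetime, timedelta


def estimate_start_time(stamp_time, stamp_index):
    """Back out the series start from the first good GPS stamp.

    stamp_index is the block index at which the stamp was found; with
    one-second blocks (an assumption) it is also the elapsed seconds.
    """
    return stamp_time - timedelta(seconds=stamp_index)


# A stamp at 18:35:11 found in block 71 implies a start at 18:34:00.
start = estimate_start_time(datetime(2019, 9, 26, 18, 35, 11), 71)
```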
- property box_temperature
data logger temperature, sampled at 1 second
- check_timing(stamps)[source]
make sure that there are the correct number of seconds in between the first and last GPS GPRMC stamps
- Parameters
stamps (list) – list of GPS stamps [[status_index, [GPRMC, GPGGA]]]
- Returns
[ True | False ] if data is valid or not.
- Returns
gap index locations
Note
currently it is assumed that if a data gap occurs the data can be squeezed to remove them. Probably a more elegant way of doing it.
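The timing check can be pictured as comparing the seconds elapsed between two GPS stamps with the number of blocks recorded between them. The sketch below assumes one block per second and represents each stamp as a hypothetical (block_index, datetime) pair; it is not the structure that check_timing itself receives.

```python
from datetime import datetime


def timing_is_consistent(first, last):
    """Compare GPS elapsed time with the number of blocks recorded.

    first and last are (block_index, datetime) pairs. Returns a flag and
    the number of seconds unaccounted for by the recorded blocks.
    """
    elapsed_seconds = int((last[1] - first[1]).total_seconds())
    n_blocks = last[0] - first[0]
    missing = elapsed_seconds - n_blocks
    return missing == 0, missing


ok, missing = timing_is_consistent(
    (10, datetime(2019, 9, 26, 0, 0, 10)),
    (110, datetime(2019, 9, 26, 0, 1, 55)),
)
```

Here 105 seconds pass on the GPS clock while only 100 blocks were recorded, so 5 seconds are missing somewhere in between.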
- property declination
Median declination value from all the GPS stamps, in degrees
Only get from the first stamp within the sets
- property elevation
Median elevation value from all the GPS stamps, in meters
Only get from the first stamp within the sets
- property end_time
End time of the time series.
- property ex
EX
- property ex_metadata
- property ey
EY
- property ey_metadata
- find_sequence(data_array, block_sequence=None)[source]
find a sequence in a given array
- Parameters
data_array (array) – array of the data with shape [n, m] where n is the number of seconds recorded m is the block length for a given sampling rate.
block_sequence (list) – sequence pattern to locate; the default is [1, 131], the start of a data block.
- Returns
array of index locations where the sequence is found.
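A minimal sketch of locating a byte pattern such as [1, 131] follows. It works on a flat bytes object rather than the [n, m] array described above and uses only the standard library; find_byte_sequence is a hypothetical name and the sample data are made up.

```python
def find_byte_sequence(data, sequence=(1, 131)):
    """Return every index in data where the byte sequence starts."""
    pattern = bytes(sequence)
    hits = []
    start = data.find(pattern)
    while start != -1:
        hits.append(start)
        # Resume one byte later so overlapping matches are not skipped.
        start = data.find(pattern, start + 1)
    return hits


sample = bytes([9, 1, 131, 5, 6, 1, 131, 7])
block_starts = find_byte_sequence(sample)
```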
- get_channel_response(channel, dipole_length=1)[source]
Get the channel response for a given channel
- Parameters
channel (TYPE) – DESCRIPTION
dipole_length (TYPE, optional) – DESCRIPTION, defaults to 1
- Returns
DESCRIPTION
- Return type
TYPE
- get_stamps(nims_string)[source]
get a list of valid GPS strings and match synchronous GPRMC with GPGGA stamps if possible.
- Parameters
nims_string (str) – raw GPS string output by NIMS
- property hx
HX
- property hx_metadata
- property hy
HY
- property hy_metadata
- property hz
HZ
- property hz_metadata
- property latitude
median latitude value from all the GPS stamps in decimal degrees WGS84
Only get from the GPRMC stamp as they should be duplicates
- property longitude
median longitude value from all the GPS stamps in decimal degrees WGS84
Only get from the first stamp within the sets
- make_dt_index(start_time, sample_rate, stop_time=None, n_samples=None)[source]
make time index array
Note
date-time format should be YYYY-MM-DDThh:mm:ss.ms UTC
- Parameters
start_time (string) – start time
stop_time (string, optional) – end time
sample_rate (float) – sample_rate in samples/second
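For intuition, an evenly spaced time index can be built from a start time, a sample rate, and a sample count. The sketch below uses only the standard library and returns a plain list of datetimes; make_time_index is a hypothetical stand-in and does not claim to match what make_dt_index returns.

```python
from datetime import datetime, timedelta


def make_time_index(start_time, sample_rate, n_samples):
    """Return n_samples datetimes spaced 1/sample_rate seconds apart."""
    start = datetime.fromisoformat(start_time)
    step = timedelta(seconds=1.0 / sample_rate)
    return [start + i * step for i in range(n_samples)]


# Two seconds of 8 Hz data plus one sample: 17 time stamps.
index = make_time_index("2019-09-26T18:35:11", 8, 17)
```

When a stop time is given instead of a sample count, the count follows from the duration multiplied by the sample rate.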
- match_status_with_gps_stamps(status_array, gps_list)[source]
Match the index values from the status array with the index values of the GPS stamps. There appears to be a bit of wiggle room between when the lock is recorded and the stamp was actually recorded. This is typically 1 second and sometimes 2.
- Parameters
status_array (array) – array of status values from each data block
gps_list (list) – list of valid GPS stamps [[GPRMC, GPGGA], …]
Note
I think there is a 2 second gap between the lock and the first stamp character.
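The matching idea can be sketched as pairing each GPS lock index with the nearest stamp index that lies within a small tolerance. The tolerance of 2 comes from the description above; the symmetric window, the flat lists of integers, and the name match_locks_to_stamps are assumptions made for illustration.

```python
def match_locks_to_stamps(lock_indices, stamp_indices, max_offset=2):
    """Pair each lock index with the closest stamp index within max_offset."""
    matches = []
    for lock in lock_indices:
        candidates = [s for s in stamp_indices if abs(s - lock) <= max_offset]
        if candidates:
            nearest = min(candidates, key=lambda s: abs(s - lock))
            matches.append((lock, nearest))
    return matches


# The lock at index 90 has no stamp within 2 seconds and is left unmatched.
pairs = match_locks_to_stamps([10, 50, 90], [11, 52, 200])
```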
- property n_samples
- read_nims(fn=None)[source]
Read NIMS DATA.BIN file.
Reads in the header information and stores it as attributes with the same names as in the header file.
Locate the beginning of the data blocks by looking for the first [1, 131, …] combo. Anything before that is cut out.
Make sure the data is a multiple of the block length, if the data is longer the extra bits are cut off.
Read in the GPS data (3rd byte of each block) as characters. Parses those into valid GPS stamps with appropriate index locations of where the ‘$’ was found.
Read in the data as unsigned 8-bit integers and reshape the array into [N, data_block_length]. Parse this array into the status information and the data.
Remove duplicate blocks, by removing the first of the duplicates as suggested by Anna and Paul.
Match the GPS locks from the status with valid GPS stamps.
Check to make sure that there is the correct number of seconds between the first and last GPS stamp. The extra seconds are cut off from the end of the time series. Not sure if this is the best way to accommodate gaps in the data.
Note
The data and information array returned have the duplicates removed and the sequence reset to be monotonic.
- Parameters
fn (str) – full path to DATA.BIN file
- Example
>>> from mth5.io import nims
>>> n = nims.NIMS(r"/home/mt_data/nims/mt001.bin")
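The first few steps of read_nims (locate the first block, trim to a whole number of blocks, reshape, and pull the GPS character from the third byte) can be sketched on a toy byte string. The block length of 4 is purely illustrative and is not the real NIMS block length; split_into_blocks is a hypothetical helper.

```python
def split_into_blocks(raw, block_length, sequence=(1, 131)):
    """Cut leading bytes, trim the tail, and split raw into equal blocks."""
    start = raw.find(bytes(sequence))
    if start == -1:
        raise ValueError("block start sequence not found")
    data = raw[start:]                        # drop everything before block 0
    n_blocks = len(data) // block_length
    data = data[: n_blocks * block_length]    # drop any incomplete last block
    return [
        data[i * block_length:(i + 1) * block_length] for i in range(n_blocks)
    ]


# Toy data: two junk bytes, two complete 4-byte blocks, one partial block.
raw = bytes([0, 0, 1, 131, 36, 6, 1, 131, 71, 8, 1, 131])
blocks = split_into_blocks(raw, block_length=4)

# The third byte of each block is one character of the GPS string.
gps_string = "".join(chr(block[2]) for block in blocks)
```

With real data the same reshaping is typically done on an array of unsigned 8-bit integers, which makes the later column slicing into status and channel data cheap.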
- remove_duplicates(info_array, data_array)[source]
remove duplicate blocks, removing the first duplicate as suggested by Paul and Anna. Checks to make sure that the mag data are identical for the duplicate blocks. Removes the blocks from the information and data arrays and returns the reduced arrays. This should sync up the timing of GPS stamps and index values.
- Parameters
info_array (np.array) – structured array of block information
data_array (np.array) – structured array of the data
- Returns
reduced information array
- Returns
reduced data array
- Returns
index of duplicates in raw data
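A minimal sketch of dropping the first block of each duplicated pair follows. It assumes that duplicates show up as consecutive blocks sharing the same sequence number and that a mismatch in the magnetic data should raise an error; both are assumptions, as are the dictionary layout and the name drop_first_duplicates.

```python
def drop_first_duplicates(blocks):
    """Remove the first block of every consecutive duplicated pair.

    blocks is a list of dicts with 'sequence' and 'mag' keys.
    Returns (kept blocks, indices of the removed blocks).
    """
    kept, removed = [], []
    for i, block in enumerate(blocks):
        following = blocks[i + 1] if i + 1 < len(blocks) else None
        if following is not None and following["sequence"] == block["sequence"]:
            # Duplicated blocks are expected to carry identical mag data.
            if following["mag"] != block["mag"]:
                raise ValueError(f"mag data differ for duplicate at index {i}")
            removed.append(i)
        else:
            kept.append(block)
    return kept, removed


blocks = [
    {"sequence": 1, "mag": (1, 2, 3)},
    {"sequence": 2, "mag": (4, 5, 6)},
    {"sequence": 2, "mag": (4, 5, 6)},
    {"sequence": 3, "mag": (7, 8, 9)},
]
kept, removed = drop_first_duplicates(blocks)
```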
- property run_metadata
Run metadata
- property start_time
start time is the first good GPS time stamp minus the seconds to the beginning of the time series.
- property station_metadata
Station metadata from nims file
mth5.io.nims.nims_collection module
NIMS Collection
Collection of NIMS files combined into runs
Created on Wed Aug 31 10:32:44 2022
@author: jpeacock
- class mth5.io.nims.nims_collection.NIMSCollection(file_path=None, **kwargs)[source]
Bases:
Collection
Collection of NIMS files into runs.
>>> from mth5.io.nims import NIMSCollection
>>> nc = NIMSCollection(r"/path/to/single/nims/station")
>>> nc.station_id = "mt001"
>>> nc.survey_id = "test_survey"
>>> run_dict = nc.get_runs(1)
- assign_run_names(df, zeros=2)[source]
Assign run names assuming a row represents single station
Run names are assigned as sr{sample_rate}_{run_number:0{zeros}}.
- Parameters
df (pandas.DataFrame) – Dataframe with the appropriate columns
zeros (int, optional) – number of zeros in run name, defaults to 2
- Returns
Dataframe with run names
- Return type
pandas.DataFrame
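The run-name pattern sr{sample_rate}_{run_number:0{zeros}} is an ordinary Python format string with a nested width. A small sketch shows how it expands; make_run_name is a hypothetical helper, not part of the class.

```python
def make_run_name(sample_rate, run_number, zeros=2):
    """Format a run name with the run number zero-padded to `zeros` digits."""
    return f"sr{sample_rate}_{run_number:0{zeros}}"


first_run = make_run_name(8, 1)        # "sr8_01"
padded_run = make_run_name(1, 12, 4)   # "sr1_0012"
```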
- to_dataframe(sample_rates=[1], run_name_zeros=2, calibration_path=None)[source]
Create a data frame of each NIMS file in a given directory.
Note
This assumes the given directory contains a single station
- Parameters
sample_rates (int or list, optional) – sample rates to get, defaults to [1]
run_name_zeros (int, optional) – number of zeros to assign to the run name, defaults to 2
calibration_path (string or Path, optional) – path to calibration files, defaults to None
- Returns
Dataframe with information of each NIMS file in the given directory.
- Return type
pandas.DataFrame
- Example
>>> from mth5.io.nims import NIMSCollection
>>> nc = NIMSCollection("/path/to/single/nims/station")
>>> nims_df = nc.to_dataframe()
mth5.io.nims.response_filters module
Created on Fri Sep 2 13:50:51 2022
@author: jpeacock
- class mth5.io.nims.response_filters.Response(system_id=None, **kwargs)[source]
Bases:
object
Common NIMS response filters for electric and magnetic channels
- dipole_filter(length)[source]
Make a dipole filter
- Parameters
length (TYPE) – dipole length in meters
- Returns
DESCRIPTION
- Return type
TYPE
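Conceptually, a dipole filter scales a measured voltage by the dipole length to give an electric field. The sketch below shows only that arithmetic and assumes a simple division of volts by meters; it is not the filter object that dipole_filter builds, and voltage_to_field is a hypothetical name.

```python
def voltage_to_field(voltage, length):
    """Convert a dipole voltage (V) to an electric field (V/m)."""
    if length <= 0:
        raise ValueError("dipole length must be positive")
    return voltage / length


# 0.212 V measured across a 106 m dipole gives 0.002 V/m.
field = voltage_to_field(0.212, 106.0)
```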
- property electric_conversion
Electric channel conversion from counts to volts
- Returns
DESCRIPTION
- Return type
TYPE
- property electric_high_pass_hp
1-pole high pass filter for 1 Hz instruments
- Returns
DESCRIPTION
- Return type
TYPE
- property electric_high_pass_pc
1-pole high pass filter for 8 Hz instruments
- Returns
DESCRIPTION
- Return type
TYPE
- property electric_low_pass
5-pole electric low pass filter
- Returns
DESCRIPTION
- Return type
TYPE
- property electric_physical_units
- Returns
DESCRIPTION
- Return type
TYPE
- get_channel_response(channel, dipole_length=1)[source]
Get the full channel response filter
- Parameters
channel (TYPE) – DESCRIPTION
dipole_length (TYPE, optional) – DESCRIPTION, defaults to 1
- Returns
DESCRIPTION
- Return type
TYPE
- get_electric_high_pass(hardware='pc')[source]
get the electric high pass filter based on the hardware
- property magnetic_conversion
- Returns
DESCRIPTION
- Return type
TYPE
- property magnetic_low_pass
Low pass 3 pole filter
- Returns
DESCRIPTION
- Return type
TYPE
Module contents
- class mth5.io.nims.GPS(gps_string, index=0)[source]
Same class as mth5.io.nims.gps.GPS, documented above.
- class mth5.io.nims.NIMS(fn=None)[source]
Same class as mth5.io.nims.nims.NIMS, documented above.
- class mth5.io.nims.NIMSCollection(file_path=None, **kwargs)[source]
Same class as mth5.io.nims.nims_collection.NIMSCollection, documented above.
- class mth5.io.nims.NIMSHeader(fn=None)[source]
Same class as mth5.io.nims.header.NIMSHeader, documented above.
- class mth5.io.nims.Response(system_id=None, **kwargs)[source]
Same class as mth5.io.nims.response_filters.Response, documented above.