mth5.io.nims package
Submodules
mth5.io.nims.gps module
NIMS GPS data parser for magnetotelluric surveys.
This module provides functionality to parse GPS stamps from NIMS (Narod Intelligent Magnetotelluric System) data files. It handles both GPRMC and GPGGA GPS message formats, extracting location, time, and other GPS-related information.
Classes
- GPSError (Exception)
Custom exception for GPS parsing errors.
- GPS (object)
Main class for parsing and validating GPS stamp data.
Notes
The GPS parser handles two main GPS message types:
- GPRMC: Provides full date/time information and magnetic declination
- GPGGA: Provides elevation data and fix quality information
Binary data contamination is automatically cleaned during parsing.
Examples
>>> from mth5.io.nims.gps import GPS
>>> gps_string = "GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*"
>>> gps = GPS(gps_string)
>>> print(f"Latitude: {gps.latitude}, Longitude: {gps.longitude}")
Created
Thu Sep 1 11:43:56 2022
- class mth5.io.nims.gps.GPS(gps_string: str | bytes, index: int = 0)[source]
Bases: object
Parser for GPS stamps from NIMS magnetotelluric data.
Handles parsing and validation of GPS strings from NIMS data files. Supports both GPRMC and GPGGA message formats, automatically detecting the type and extracting relevant geographic and temporal information.
- Parameters:
gps_string (str or bytes) – Raw GPS string to be parsed. Can contain binary contamination which will be automatically cleaned.
index (int, default 0) – Index or sequence number for this GPS record.
Notes
GPS message format differences:
- GPRMC (Recommended Minimum Course)
Contains: date, time, coordinates, speed, course, magnetic declination
Date: Full date information (year, month, day)
- GPGGA (Global Positioning System Fix Data)
Contains: time, coordinates, fix quality, elevation
Date: Defaults to 1980-01-01 for time estimation only
The parser automatically handles:
- Binary contamination in GPS strings
- Missing comma delimiters
- GPS type auto-detection and correction
- Coordinate conversion from degrees-minutes to decimal degrees
Examples
Parse a GPRMC string:
>>> gps_string = "GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*"
>>> gps = GPS(gps_string)
>>> print(f"Position: {gps.latitude:.5f}, {gps.longitude:.5f}")
Position: 34.72683, -115.73501
Parse a GPGGA string:
>>> gps_string = "GPGGA,183511,3443.6098,N,11544.1007,W,1,04,2.6,937.2,M,-28.1,M,*"
>>> gps = GPS(gps_string)
>>> print(f"Elevation: {gps.elevation} {gps.elevation_units}")
Elevation: 937.2 meters
Handle invalid GPS data:
>>> gps = GPS("invalid_string")
>>> print(f"Valid: {gps.valid}")
Valid: False
- property declination: float | None[source]
Magnetic declination in degrees from true north.
- Returns:
Magnetic declination in degrees. Positive values indicate eastward declination, negative values indicate westward declination. Returns None if declination data is not available.
- Return type:
float or None
Notes
Magnetic declination is only available in GPRMC messages. GPGGA messages will return None as they don’t contain declination data.
Western declination values are automatically converted to negative.
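The sign convention described in these notes can be sketched in plain Python. This is only an illustration under stated assumptions, not the parser's actual code: `signed_declination` is a hypothetical helper name, and the value and hemisphere are assumed to arrive as the last two text fields of a GPRMC message.

```python
from typing import Optional


def signed_declination(value: str, hemisphere: str) -> Optional[float]:
    """Return declination in degrees, negative for west (hypothetical helper)."""
    try:
        declination = float(value)
    except (TypeError, ValueError):
        # Missing or unparseable field: mirror the documented None result
        return None
    if hemisphere.upper() == "W":
        declination = -declination
    return declination


print(signed_declination("013.1", "E"))  # 13.1
print(signed_declination("013.1", "W"))  # -13.1
```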
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.declination
13.1
- property elevation: float[source]
Elevation above sea level in meters.
- Returns:
Elevation in meters. Returns 0.0 if elevation data is not available or cannot be converted.
- Return type:
float
Notes
Elevation is typically only available in GPGGA messages. GPRMC messages will return 0.0 as they don’t contain elevation data.
Conversion errors are logged but don’t raise exceptions.
Examples
>>> gps = GPS("GPGGA,183511,3443.6098,N,11544.1007,W,1,04,2.6,937.2,M,-28.1,M,*")
>>> gps.elevation
937.2
- property fix: str | None[source]
GPS fix status.
- Returns:
GPS fix status (typically “A” for valid fix), or None if fix information is not available or not applicable for the message type.
- Return type:
str or None
Notes
Fix status is typically available in GPRMC messages:
- “A”: Valid fix
- “V”: Invalid fix
GPGGA messages use different fix quality indicators.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.fix
'A'
- property gps_type: str | None[source]
GPS message type.
- Returns:
GPS message type: “GPRMC” or “GPGGA”, or None if not set.
- Return type:
str or None
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.gps_type
'GPRMC'
- property latitude: float[source]
Latitude in decimal degrees (WGS84).
- Returns:
Latitude in decimal degrees. Negative values indicate Southern hemisphere. Returns 0.0 if coordinate data is invalid.
- Return type:
float
Notes
Converts from GPS format (DDMM.MMMM) to decimal degrees: decimal_degrees = degrees + minutes/60
Southern hemisphere coordinates are automatically converted to negative values.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.latitude
34.72683
- property longitude: float[source]
Longitude in decimal degrees (WGS84).
- Returns:
Longitude in decimal degrees. Negative values indicate Western hemisphere. Returns 0.0 if coordinate data is invalid.
- Return type:
float
Notes
Converts from GPS format (DDDMM.MMMM) to decimal degrees: decimal_degrees = degrees + minutes/60
Western hemisphere coordinates are automatically converted to negative values.
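The degrees-minutes conversion used for both latitude and longitude can be sketched as a standalone function. This is a minimal illustration of the formula in the notes, not the library's implementation; `ddmm_to_decimal` is a hypothetical name.

```python
def ddmm_to_decimal(value: str, hemisphere: str) -> float:
    """Convert NMEA (D)DDMM.MMMM text to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)      # leading digits are whole degrees
    minutes = raw - degrees * 100  # remainder is decimal minutes
    decimal = degrees + minutes / 60
    # South and West are negative by convention
    if hemisphere.upper() in ("S", "W"):
        decimal = -decimal
    return decimal


print(round(ddmm_to_decimal("3443.6098", "N"), 5))   # 34.72683
print(round(ddmm_to_decimal("11544.1007", "W"), 5))  # -115.73501
```

Dividing by 100 separates degrees from minutes regardless of whether the degree part has two digits (latitude) or three (longitude).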
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.longitude
-115.73501166666667
- parse_gps_string(gps_string: str | bytes) None[source]
Parse GPS string and populate object attributes.
Main parsing method that validates the GPS string, identifies the message type (GPRMC/GPGGA), and extracts all relevant information into object attributes.
- Parameters:
gps_string (str or bytes) – Raw GPS string from NIMS data file.
Notes
This method performs the following operations:
1. Splits and validates the GPS string
2. Handles missing comma delimiter between time and coordinates
3. Validates each GPS field according to message type
4. Sets object attributes based on parsed values
5. Sets the valid flag based on parsing success
If any validation errors occur, they are logged but parsing continues with None values for invalid fields.
The method automatically detects GPS message type and applies appropriate field validation rules.
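The splitting step can be illustrated with a small standalone sketch. It assumes the 12-field GPRMC layout that appears in the examples of this reference; the field names and the `split_gprmc` helper are hypothetical and are not attributes of the GPS class.

```python
# Field order taken from the GPRMC example strings used in this reference
GPRMC_FIELDS = (
    "type", "time", "fix", "latitude", "latitude_hemisphere",
    "longitude", "longitude_hemisphere", "speed", "course",
    "date", "declination", "declination_hemisphere",
)


def split_gprmc(gps_string: str) -> dict:
    """Split a cleaned GPRMC string into a name -> text mapping."""
    parts = gps_string.rstrip("*").split(",")
    if parts[0] != "GPRMC" or len(parts) != len(GPRMC_FIELDS):
        raise ValueError(f"not a 12-field GPRMC string: {gps_string!r}")
    return dict(zip(GPRMC_FIELDS, parts))


fields = split_gprmc(
    "GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*"
)
print(fields["time"], fields["date"], fields["fix"])  # 183511 260919 A
```

Unlike the documented method, this sketch raises on malformed input instead of logging and continuing.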
Examples
Parse a valid GPS string:
>>> gps = GPS("")
>>> gps.parse_gps_string("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> print(f"Valid: {gps.valid}, Type: {gps.gps_type}")
Valid: True, Type: GPRMC
Handle invalid GPS string:
>>> gps.parse_gps_string("invalid_gps_data")
>>> print(f"Valid: {gps.valid}")
Valid: False
- property time_stamp: datetime | None[source]
GPS timestamp as datetime object.
- Returns:
Timestamp parsed from GPS data, or None if time data is invalid.
- Return type:
datetime.datetime or None
Notes
For GPRMC messages: Uses full date and time information
For GPGGA messages: Uses time with default date of 1980-01-01
Time format: HHMMSS (hours, minutes, seconds)
Date format: DDMMYY (day, month, 2-digit year)
Invalid date strings are logged but return None rather than raising exceptions.
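How the HHMMSS and DDMMYY fields combine into a datetime can be sketched with the standard library. This is an assumption-laden illustration, not the property's source: `gps_datetime` and `DEFAULT_DATE` are hypothetical names, and the fallback date simply encodes the documented 1980-01-01 default.

```python
from datetime import datetime
from typing import Optional

DEFAULT_DATE = "010180"  # 1980-01-01, the documented GPGGA placeholder date


def gps_datetime(time_text: str, date_text: Optional[str] = None) -> Optional[datetime]:
    """Combine HHMMSS and DDMMYY text into a datetime, or None if invalid."""
    try:
        return datetime.strptime(
            (date_text or DEFAULT_DATE) + time_text, "%d%m%y%H%M%S"
        )
    except ValueError:
        # Invalid text yields None rather than an exception
        return None


print(gps_datetime("183511", "260919"))  # 2019-09-26 18:35:11
print(gps_datetime("183511"))            # 1980-01-01 18:35:11
print(gps_datetime("bad", "260919"))     # None
```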
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.time_stamp
datetime.datetime(2019, 9, 26, 18, 35, 11)
- validate_gps_list(gps_list: list[str]) tuple[list[str] | None, list[str]][source]
Validate GPS field list and check format compliance.
Performs comprehensive validation of GPS message components including type checking, length validation, and field-specific validation.
- Parameters:
gps_list (list of str) – GPS message components split by delimiter.
- Returns:
gps_list (list of str or None) – Validated GPS list with corrected values, or None if critical validation fails.
error_list (list of str) – List of validation error messages encountered during processing.
Notes
Validation steps performed:
1. GPS message type validation and correction
2. Message length validation based on type
3. Time format validation (6 digits)
4. Coordinate validation (latitude/longitude + hemisphere)
5. Date validation for GPRMC messages
6. Elevation validation for GPGGA messages
Non-critical validation errors are collected but don’t halt processing. Critical errors (type or length) return None and stop validation.
Examples
Validate a correct GPS list:
>>> gps = GPS("")
>>> gps_data = ["GPRMC", "183511", "A", "3443.6098", "N", "11544.1007", "W",
...             "000.0", "000.0", "260919", "013.1", "E"]
>>> validated, errors = gps.validate_gps_list(gps_data)
>>> print(f"Errors: {len(errors)}")
Errors: 0
Handle validation errors:
>>> bad_data = ["INVALID", "time", "fix"]
>>> validated, errors = gps.validate_gps_list(bad_data)
>>> print(f"Result: {validated}, Errors: {len(errors)}")
Result: None, Errors: 1
- validate_gps_string(gps_string: str | bytes) str | None[source]
Validate and clean GPS string.
Removes binary contamination, finds string terminator, and validates format. Handles both string and bytes input.
- Parameters:
gps_string (str or bytes) – Raw GPS string to validate. May contain binary contamination that will be automatically removed.
- Returns:
Cleaned GPS string with terminator removed, or None if validation fails due to missing terminator or decode errors.
- Return type:
str or None
- Raises:
TypeError – If input is not string or bytes.
Notes
Binary contamination bytes that are automatically removed:
- \xd9, \xc7, \xcc
- \x00 (null byte, replaced with ‘*’ terminator)
The GPS string must end with the ‘*’ character to be considered valid.
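The cleaning rules in these notes can be sketched as a standalone function. This is a rough approximation under stated assumptions, not the method's source: `clean_gps_string` and `CONTAMINATION` are hypothetical names, and encoding str input as latin-1 is a choice made only for this sketch.

```python
from typing import Optional, Union

CONTAMINATION = (b"\xd9", b"\xc7", b"\xcc")  # bytes listed in the notes


def clean_gps_string(gps_string: Union[str, bytes]) -> Optional[str]:
    """Strip contamination, cut at the '*' terminator, return None on failure."""
    if isinstance(gps_string, str):
        gps_string = gps_string.encode("latin-1", errors="ignore")
    elif not isinstance(gps_string, bytes):
        raise TypeError("gps_string must be str or bytes")
    for bad in CONTAMINATION:
        gps_string = gps_string.replace(bad, b"")
    # A null byte is treated as a stand-in for the terminator
    gps_string = gps_string.replace(b"\x00", b"*")
    end = gps_string.find(b"*")
    if end < 0:
        return None  # no terminator found
    try:
        return gps_string[:end].decode("ascii")
    except UnicodeDecodeError:
        return None


print(clean_gps_string(b"GPRMC,183511,A\xd9,3443.6098,N*"))  # GPRMC,183511,A,3443.6098,N
print(clean_gps_string("GPRMC,183511,A,3443.6098,N"))        # None
```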
Examples
Clean a contaminated binary GPS string:
>>> gps = GPS("")
>>> contaminated = b"GPRMC,183511,A\xd9,3443.6098,N*"
>>> clean = gps.validate_gps_string(contaminated)
>>> print(clean)
GPRMC,183511,A,3443.6098,N
Handle missing terminator:
>>> invalid = "GPRMC,183511,A,3443.6098,N"  # No '*'
>>> result = gps.validate_gps_string(invalid)
>>> print(result)
None
mth5.io.nims.header module
Created on Thu Sep 1 12:57:32 2022
@author: jpeacock
- class mth5.io.nims.header.NIMSHeader(fn: str | Path | None = None)[source]
Bases: object
Class to hold NIMS header information.
This class parses and stores header information from NIMS DATA.BIN files. The header contains metadata about the measurement site, equipment setup, GPS coordinates, electrode configuration, and other survey parameters.
- Parameters:
fn (str or Path, optional) – Path to the NIMS file to read, by default None
Examples
A typical header looks like:
'''
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>user field>>>>>>>>>>>>>>>>>>>>>>>>>>>>
SITE NAME: Budwieser Spring
STATE/PROVINCE: CA
COUNTRY: USA
>>> The following code in double quotes is REQUIRED to start the NIMS <<
>>> The next 3 lines contain values required for processing <<<<<<<<<<<<
>>> The lines after that are optional <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
"300b" <-- 2CHAR EXPERIMENT CODE + 3 CHAR SITE CODE + RUN LETTER
1105-3; 1305-3 <-- SYSTEM BOX I.D.; MAG HEAD ID (if different)
106 0 <-- N-S Ex WIRE LENGTH (m); HEADING (deg E mag N)
109 90 <-- E-W Ey WIRE LENGTH (m); HEADING (deg E mag N)
1 <-- N ELECTRODE ID
3 <-- E ELECTRODE ID
2 <-- S ELECTRODE ID
4 <-- W ELECTRODE ID
Cu <-- GROUND ELECTRODE INFO
GPS INFO: 26/09/19 18:29:29 34.7268 N 115.7350 W 939.8
OPERATOR: KP
COMMENT: N/S CRS: .95/.96 DCV: 3.5 ACV:1
E/W CRS: .85/.86 DCV: 1.5 ACV: 1
Redeployed site for run b b/c possible animal disturbance
'''
- property file_size: int | None[source]
Size of the NIMS file in bytes.
- Returns:
File size in bytes, or None if no file is set
- Return type:
int or None
- Raises:
FileNotFoundError – If the file does not exist
- property fn: Path | None[source]
Full path to NIMS file.
- Returns:
Path object representing the NIMS file location, or None if no file is set
- Return type:
Path or None
- parse_header_dict(header_dict: dict[str, str] | None = None) None[source]
Parse the header dictionary into individual attributes.
This method takes the raw header dictionary and extracts specific information into class attributes for easy access.
- Parameters:
header_dict (dict of str, optional) – Dictionary containing header key-value pairs. Uses self.header_dict if not provided.
Notes
Parses various header fields including: - Wire lengths and azimuths for electric field measurements - System box and magnetometer IDs - GPS coordinates and timestamp - Run identifier - Other metadata fields
- read_header(fn: str | Path | None = None) None[source]
Read header information from a NIMS file.
This method reads and parses the header section of a NIMS DATA.BIN file, extracting metadata about the survey setup, GPS coordinates, electrode configuration, and other parameters.
- Parameters:
fn (str or Path, optional) – Full path to NIMS file to read. Uses self.fn if not provided.
- Raises:
NIMSError – If the file does not exist or cannot be read
Notes
The method reads up to _max_header_length bytes from the beginning of the file, parses the header information, and stores the results in the header_dict attribute and individual properties.
mth5.io.nims.nims module
NIMS
deals with reading in NIMS DATA.BIN files
- This is a translation from Matlab codes written and edited by:
Anna Kelbert
Paul Bedrosian
Esteban Bowles-Martinez
Possibly others.
I’ve tested it against a version, and it matches. I still don’t understand the data/GPS gaps, so for now the time series is simply made continuous and the number of missing seconds is clipped from the end of the time series.
Note
this only works for 8Hz data for now
- copyright:
Jared Peacock (jpeacock@usgs.gov)
- license:
MIT
- class mth5.io.nims.nims.NIMS(fn: str | Path | None = None)[source]
Bases: NIMSHeader
NIMS Class for reading NIMS DATA.BIN files.
A fast way to read the binary files is to first read in the GPS strings (the third byte in each block, read as a character) and parse them into valid GPS stamps.
Then read in the entire data set as unsigned 8 bit integers and reshape the data to be n seconds x block size. Then parse that array into the status information and data.
- Parameters:
fn (str or Path, optional) – Path to the NIMS DATA.BIN file to read, by default None
Notes
I only have a limited number of .BIN files to test with, so this will likely break if there are issues such as data gaps. This has been tested against the Matlab program loadNIMS by Anna Kelbert, and the results match for all the .bin files I have. If something looks weird, check it against that program.
Warning
Currently only 8 Hz data is supported.
Examples
>>> from mth5.io.nims import nims
>>> n = nims.NIMS(r"/home/mt_data/nims/mt001.bin")
>>> n.read_nims()
- align_data(data_array, stamps)[source]
Match up the first good GPS stamp with the data.
Do this by using the first GPS stamp and assuming that the time from the first time stamp to the start is the index value.
Put the data into a pandas DataFrame that is indexed by time.
- Parameters:
data_array (array) – structure array with columns for each component [hx, hy, hz, ex, ey]
stamps (list) – list of GPS stamps [[status_index, [GPRMC, GPGGA]]]
- Returns:
pandas DataFrame with columns of components, indexed by time and initialized by the start time.
Note
Data gaps are squeezed because it is not clear what a gap actually means.
- property box_temperature: ChannelTS | None[source]
Data logger temperature channel.
- Returns:
Temperature channel sampled at 1 second, interpolated to match the time series sample rate, or None if no time series data
- Return type:
ChannelTS or None
Notes
Temperature is measured in Celsius and interpolated onto the same time grid as the magnetic and electric field channels.
- check_timing(stamps)[source]
Make sure that there is the correct number of seconds between the first and last GPRMC GPS stamps.
- Parameters:
stamps (list) – list of GPS stamps [[status_index, [GPRMC, GPGGA]]]
- Returns:
[ True | False ] if data is valid or not.
- Returns:
gap index locations
Note
Currently it is assumed that if a data gap occurs, the data can be squeezed to remove it. There is probably a more elegant way of doing this.
- property declination: float | None[source]
Median magnetic declination value from all GPS stamps.
- Returns:
Median magnetic declination in decimal degrees from GPRMC stamps, or None if no declination data available
- Return type:
float or None
Notes
Only uses GPRMC stamps as they contain declination information.
- property elevation: float | None[source]
Median elevation value from all GPS stamps.
- Returns:
Median elevation in meters (WGS84) from GPS stamps, or header GPS elevation if no stamps available
- Return type:
float or None
Notes
Uses the first stamp within each GPS stamp set. For paired stamps (GPRMC/GPGGA), uses the GPGGA elevation if available.
- property end_time: MTime[source]
End time of the time series data.
- Returns:
End time derived from the last time series index, or estimated from start time and number of samples
- Return type:
MTime
Notes
If time series data is available, uses the last timestamp. Otherwise estimates end time from start time plus duration calculated from number of samples and sample rate.
- property ex: ChannelTS | None[source]
EX electric field channel time series.
- Returns:
Time series data for the EX electric field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property ex_metadata: Electric | None[source]
Metadata for the EX electric field channel.
- Returns:
Electric field metadata object for the EX channel, or None if no time series data is loaded
- Return type:
Electric or None
- property ey: ChannelTS | None[source]
EY electric field channel time series.
- Returns:
Time series data for the EY electric field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property ey_metadata: Electric | None[source]
Metadata for the EY electric field channel.
- Returns:
Electric field metadata object for the EY channel, or None if no time series data is loaded
- Return type:
Electric or None
- find_sequence(data_array: ndarray, block_sequence: list[int] | None = None) ndarray[source]
Find a sequence pattern in the data array.
- Parameters:
data_array (ndarray) – Array of the data with shape [n, m] where n is the number of seconds recorded and m is the block length for a given sampling rate
block_sequence (list of int, optional) – Sequence pattern to locate, by default [1, 131] (start of data block)
- Returns:
Array of index locations where the sequence is found
- Return type:
ndarray
Notes
Uses numpy rolling and comparison to find all occurrences of the specified sequence pattern in the data array.
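The idea of locating a byte pattern can be illustrated without numpy. This is a simplified one-dimensional sketch on a raw byte string, not the method's rolling-comparison implementation; `find_sequence_1d` is a hypothetical name.

```python
def find_sequence_1d(data: bytes, block_sequence=(1, 131)) -> list:
    """Return every index where block_sequence starts in data."""
    pattern = bytes(block_sequence)
    hits = []
    start = data.find(pattern)
    while start != -1:
        hits.append(start)
        # Step by one so overlapping matches are not skipped
        start = data.find(pattern, start + 1)
    return hits


raw = bytes([9, 9, 1, 131, 0, 42, 1, 131, 7])
print(find_sequence_1d(raw))  # [2, 6]
```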
- get_channel_response(channel: str, dipole_length: float = 1) Any[source]
Get the channel response for a given channel.
- Parameters:
channel (str) – Channel identifier (e.g., ‘hx’, ‘hy’, ‘hz’, ‘ex’, ‘ey’)
dipole_length (float, optional) – Dipole length for electric field channels, by default 1
- Returns:
Channel response object from the NIMS response filters
- Return type:
Any
Notes
Uses the NIMS response filters to generate appropriate response functions for magnetic and electric field channels at the current sample rate.
- get_stamps(nims_string: bytes) list[tuple[Any, list[GPS]]][source]
Extract and parse valid GPS strings, matching GPRMC with GPGGA stamps.
- Parameters:
nims_string (bytes) – Raw GPS binary string output by NIMS
- Returns:
List of matched GPS stamps where each element is a tuple containing index and list of GPS objects [GPRMC, GPGGA] (or just [GPRMC])
- Return type:
list of tuple
Notes
Skips the first entry as it tends to be incomplete. Attempts to match synchronous GPRMC with GPGGA stamps when possible.
- has_data() bool[source]
Check if the NIMS object contains time series data.
- Returns:
True if ts_data is not None, False otherwise
- Return type:
bool
- property hx: ChannelTS | None[source]
HX magnetic field channel time series.
- Returns:
Time series data for the HX magnetic field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property hx_metadata: Magnetic | None[source]
Metadata for the HX magnetic field channel.
- Returns:
Magnetic field metadata object for the HX channel, or None if no time series data is loaded
- Return type:
Magnetic or None
- property hy: ChannelTS | None[source]
HY magnetic field channel time series.
- Returns:
Time series data for the HY magnetic field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property hy_metadata: Magnetic | None[source]
Metadata for the HY magnetic field channel.
- Returns:
Magnetic field metadata object for the HY channel, or None if no time series data is loaded
- Return type:
Magnetic or None
- property hz: ChannelTS | None[source]
HZ magnetic field channel time series.
- Returns:
Time series data for the HZ magnetic field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property hz_metadata: Magnetic | None[source]
Metadata for the HZ magnetic field channel.
- Returns:
Magnetic field metadata object for the HZ channel, or None if no time series data is loaded
- Return type:
Magnetic or None
- property latitude: float | None[source]
Median latitude value from all GPS stamps.
- Returns:
Median latitude in decimal degrees (WGS84) from GPRMC stamps, or header GPS latitude if no stamps available
- Return type:
float or None
Notes
Only uses GPRMC stamps as they should be duplicates of GPGGA stamps but include additional validation.
- property longitude: float | None[source]
Median longitude value from all GPS stamps.
- Returns:
Median longitude in decimal degrees (WGS84) from GPS stamps, or header GPS longitude if no stamps available
- Return type:
float or None
Notes
Uses the first stamp within each GPS stamp set.
- make_dt_index(start_time: str, sample_rate: float, stop_time: str | None = None, n_samples: int | None = None) DatetimeIndex[source]
Create datetime index array for time series data.
- Parameters:
start_time (str) – Start time in format YYYY-MM-DDThh:mm:ss.ms UTC
sample_rate (float) – Sample rate in samples/second
stop_time (str, optional) – End time in same format as start_time
n_samples (int, optional) – Number of samples to generate
- Returns:
Pandas datetime index with UTC timezone
- Return type:
DatetimeIndex
Notes
Either stop_time or n_samples must be provided. The datetime format should be YYYY-MM-DDThh:mm:ss.ms UTC.
- Raises:
ValueError – If neither stop_time nor n_samples is provided
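The construction of an evenly sampled time axis can be sketched with the standard library alone. This returns a plain list of datetimes rather than a pandas DatetimeIndex. `make_time_axis` is a hypothetical name, and counting the stop time as an inclusive endpoint is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def make_time_axis(start_time: str, sample_rate: float,
                   stop_time: Optional[str] = None,
                   n_samples: Optional[int] = None) -> list:
    """Evenly spaced UTC datetimes from start_time (list, not a DatetimeIndex)."""
    start = datetime.fromisoformat(start_time).replace(tzinfo=timezone.utc)
    if n_samples is None:
        if stop_time is None:
            raise ValueError("either stop_time or n_samples must be given")
        stop = datetime.fromisoformat(stop_time).replace(tzinfo=timezone.utc)
        # Count samples from start through stop inclusive (an assumption)
        n_samples = int((stop - start).total_seconds() * sample_rate) + 1
    step = timedelta(seconds=1.0 / sample_rate)
    return [start + i * step for i in range(n_samples)]


axis = make_time_axis("2019-09-26T18:29:29", 8, n_samples=4)
print(axis[0].isoformat(), axis[-1].isoformat())
```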
- match_status_with_gps_stamps(status_array, gps_list)[source]
Match the index values from the status array with the index values of the GPS stamps. There appears to be a bit of wiggle room between when the lock is recorded and the stamp was actually recorded. This is typically 1 second and sometimes 2.
- Parameters:
status_array (array) – array of status values from each data block
gps_list (list) – list of valid GPS stamps [[GPRMC, GPGGA], …]
Note
I think there is a 2 second gap between the lock and the first stamp character.
- property n_samples: int | None[source]
Number of samples in the time series.
- Returns:
Number of samples if data is loaded, estimated from file size if file exists, None otherwise
- Return type:
int or None
- read_nims(fn: str | Path | None = None) None[source]
Read NIMS DATA.BIN file and parse all data.
This method performs the complete data reading and processing workflow:
1. Read header information and store as attributes
2. Locate data block beginning by finding first [1, 131, …] sequence
3. Ensure data is multiple of block length, trim excess bits
4. Extract GPS data (3rd byte of each block) and parse GPS stamps
5. Read data as unsigned 8-bit integers, reshape to [N, block_length]
6. Remove duplicate blocks (first of each duplicate pair)
7. Match GPS status locks with valid GPS stamps
8. Verify timing between first/last GPS stamps, trim excess seconds
- Parameters:
fn (str or Path, optional) – Path to NIMS DATA.BIN file. Uses self.fn if not provided.
Notes
The data and information arrays returned have duplicates removed and sequence reset to be monotonic. Extra seconds due to timing gaps are trimmed from the end of the time series.
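Two of the steps in this workflow, trimming the data to a whole number of blocks and pulling the third byte of each block, can be sketched on toy data. The helper names are hypothetical and the block length of 4 is purely illustrative; real NIMS blocks are longer and the reader works on numpy arrays.

```python
def split_into_blocks(data: bytes, block_length: int) -> list:
    """Drop trailing bytes that do not fill a block, then split into blocks."""
    n_blocks = len(data) // block_length
    trimmed = data[: n_blocks * block_length]
    return [trimmed[i * block_length:(i + 1) * block_length] for i in range(n_blocks)]


def gps_bytes(blocks: list) -> bytes:
    """Collect the third byte (index 2) of every block."""
    return bytes(block[2] for block in blocks)


# Toy data: block_length=4 is illustrative only
raw = bytes([1, 4, ord("G"), 0, 1, 4, ord("P"), 0, 1, 4])
blocks = split_into_blocks(raw, block_length=4)
print(len(blocks), gps_bytes(blocks))  # 2 b'GP'
```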
Examples
>>> from mth5.io import nims
>>> n = nims.NIMS(r"/home/mt_data/nims/mt001.bin")
>>> n.read_nims()
- remove_duplicates(info_array, data_array)[source]
Remove duplicate blocks, removing the first duplicate as suggested by Paul and Anna. Checks that the mag data are identical for the duplicate blocks. Removes the blocks from the information and data arrays and returns the reduced arrays. This should sync up the timing of GPS stamps and index values.
- Parameters:
info_array (np.array) – structured array of block information
data_array (np.array) – structured array of the data
- Returns:
reduced information array
- Returns:
reduced data array
- Returns:
index of duplicates in raw data
- property run_metadata: Run | None[source]
Run metadata for the NIMS data collection.
- Returns:
MT run metadata including data logger information, timing, and channel metadata, or None if no time series data is loaded
- Return type:
Run or None
- property start_time: MTime[source]
Start time of the time series data.
- Returns:
Start time derived from the first GPS time stamp index, or header GPS stamp if no time series data available
- Return type:
MTime
Notes
The start time is calculated from the first good GPS time stamp minus the seconds to the beginning of the time series.
- property station_metadata: Station | None[source]
Station metadata from NIMS file.
- Returns:
MT station metadata including geographic information and location data, or None if no time series data is loaded
- Return type:
Station or None
- to_runts(calibrate: bool = False) RunTS | None[source]
Get xarray RunTS object for the NIMS data.
- Parameters:
calibrate (bool, optional) – Whether to apply calibration to the data, by default False
- Returns:
Time series run object containing all channels and metadata, or None if no time series data is loaded
- Return type:
RunTS or None
Notes
Includes all magnetic field channels (hx, hy, hz), electric field channels (ex, ey), and box temperature data.
- unwrap_sequence(sequence: ndarray) ndarray[source]
Unwrap sequence to sequential numbers instead of modulo 256.
- Parameters:
sequence (ndarray) – Sequence of byte numbers (0-255) to unwrap
- Returns:
Unwrapped sequence with first number set to 0 and subsequent values forming a continuous count
- Return type:
ndarray
Notes
Handles the fact that sequence numbers are stored as single bytes (0-255) but represent a continuous count. When a value of 255 is encountered, the next rollover is anticipated.
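One way to unwrap a modulo-256 counter is to accumulate the step between neighbours modulo 256. The sketch below uses that approach on a plain list; it is not necessarily how the method itself anticipates rollovers, and `unwrap_modulo_256` is a hypothetical name.

```python
def unwrap_modulo_256(sequence: list) -> list:
    """Turn byte counters (0-255) into a continuous count starting at 0."""
    if not sequence:
        return []
    unwrapped = [0]
    for previous, current in zip(sequence, sequence[1:]):
        # The modulo recovers the true step even across a 255 -> 0 rollover
        step = (current - previous) % 256
        unwrapped.append(unwrapped[-1] + step)
    return unwrapped


print(unwrap_modulo_256([254, 255, 0, 1, 3]))  # [0, 1, 2, 3, 5]
```

This approach assumes no more than 255 counts are skipped between neighbouring entries; a larger gap would be indistinguishable from a smaller one.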
- mth5.io.nims.nims.read_nims(fn: str | Path) RunTS | None[source]
Convenience function to read a NIMS DATA.BIN file.
- Parameters:
fn (str or Path) – Path to the NIMS DATA.BIN file
- Returns:
Time series run object containing all channels and metadata, or None if reading fails
- Return type:
RunTS or None
Examples
>>> from mth5.io.nims import nims
>>> run_ts = nims.read_nims("/path/to/data.bin")
mth5.io.nims.nims_collection module
NIMS Collection
Collection of NIMS binary files combined into runs for magnetotelluric data processing.
Created on Wed Aug 31 10:32:44 2022
@author: jpeacock
- class mth5.io.nims.nims_collection.NIMSCollection(file_path: str | Path | None = None, **kwargs: Any)[source]
Bases: Collection
Collection of NIMS binary files into runs.
This class provides functionality for organizing and processing multiple NIMS binary files into a structured format for magnetotelluric data analysis.
- Parameters:
file_path (str or Path, optional) – Path to the directory containing NIMS binary files.
**kwargs (dict) – Additional keyword arguments passed to the parent Collection class.
Examples
>>> from mth5.io.nims import NIMSCollection
>>> nc = NIMSCollection(r"/path/to/nims/station")
>>> nc.survey_id = "mt001"
>>> df = nc.to_dataframe()
See also
mth5.io.collection.Collection: Base collection class
mth5.io.nims.NIMS: NIMS file reader
- assign_run_names(df: DataFrame, zeros: int = 2) DataFrame[source]
Assign standardized run names to DataFrame entries by station.
This method assigns run names following the pattern ‘sr{sample_rate}_{run_number}’ where run_number is zero-padded according to the zeros parameter. Run names are assigned sequentially within each station, ordered by start time.
- Parameters:
df (pd.DataFrame) – DataFrame containing NIMS file metadata with required columns: ‘station’, ‘start’, ‘run’, ‘sample_rate’. The DataFrame will be modified in-place.
zeros (int, default 2) – Width of the zero-padded run number in the generated run names (e.g., zeros=2 gives ‘01’, ‘02’, etc.).
- Returns:
The input DataFrame with updated ‘run’ and ‘sequence_number’ columns. Run names follow the format ‘sr{sample_rate}_{run_number:0{zeros}}’.
- Return type:
pd.DataFrame
Notes
- Existing run names (non-None values) are preserved
- Files are processed in chronological order within each station
- Sequence numbers are assigned incrementally starting from 1
- Only files with None run names receive new assignments
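The naming rule can be sketched on plain dictionaries instead of a DataFrame. This is a simplified illustration of the documented pattern, not the method's code: `run_name` and `assign_names` are hypothetical helpers, and numbering runs per station only (ignoring sample rate) is an assumption of this sketch.

```python
def run_name(sample_rate: float, run_number: int, zeros: int = 2) -> str:
    """Format a run name such as 'sr8_01'."""
    return f"sr{sample_rate:.0f}_{run_number:0{zeros}}"


def assign_names(records: list, zeros: int = 2) -> list:
    """Number unnamed runs per station in start-time order (records are dicts)."""
    counters = {}
    for record in sorted(records, key=lambda r: (r["station"], r["start"])):
        n = counters.get(record["station"], 0) + 1
        counters[record["station"]] = n
        record["sequence_number"] = n
        if record["run"] is None:  # keep names that already exist
            record["run"] = run_name(record["sample_rate"], n, zeros)
    return records


records = [
    {"station": "mt001", "start": "2019-09-26T18:29:29", "run": None, "sample_rate": 8},
    {"station": "mt001", "start": "2019-09-25T10:00:00", "run": None, "sample_rate": 8},
    {"station": "mt002", "start": "2019-09-27T09:00:00", "run": "a", "sample_rate": 1},
]
named = assign_names(records, zeros=3)
print([r["run"] for r in named])  # ['sr8_002', 'sr8_001', 'a']
```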
Examples
>>> import pandas as pd
>>> from mth5.io.nims import NIMSCollection
>>> # Assuming df has columns: station, start, run, sample_rate
>>> nc = NIMSCollection()
>>> df_updated = nc.assign_run_names(df, zeros=3)
>>> print(df_updated['run'].tolist())
['sr8_001', 'sr8_002', 'sr1_001']
- to_dataframe(sample_rates: int | list[int] = [1], run_name_zeros: int = 2, calibration_path: str | Path | None = None) DataFrame[source]
Create a DataFrame of each NIMS binary file in the collection directory.
This method processes all NIMS binary files in the specified directory and extracts metadata to create a structured DataFrame suitable for further magnetotelluric data processing.
- Parameters:
sample_rates (int or list of int, default [1]) – Sample rates to include in the DataFrame. Note that for NIMS data, this parameter is present for interface consistency but all files will be processed regardless of their sample rate.
run_name_zeros (int, default 2) – Width of the zero-padded run number used when formatting run names in the output.
calibration_path (str or Path, optional) – Path to calibration files. Currently not used in NIMS processing but included for interface consistency.
- Returns:
DataFrame containing metadata for each NIMS file with columns:
- survey : Survey identifier
- station : Station name from NIMS file
- run : Run identifier from NIMS file
- start : Start time in ISO format
- end : End time in ISO format
- fn : File path
- sample_rate : Sampling rate
- file_size : File size in bytes
- n_samples : Number of samples
- dipole : Electric dipole lengths [Ex, Ey]
- channel_id : Channel identifier (always 1)
- sequence_number : Sequence number (always 0)
- component : Comma-separated component list
- instrument_id : Instrument identifier (always ‘NIMS’)
- Return type:
pd.DataFrame
Notes
This method assumes the directory contains files from a single station. Each NIMS file is read to extract header information including timing, station identification, and measurement parameters.
Examples
>>> from mth5.io.nims import NIMSCollection
>>> nc = NIMSCollection("/path/to/nims/station")
>>> df = nc.to_dataframe(run_name_zeros=3)
>>> print(df[['station', 'run', 'start', 'sample_rate']])
mth5.io.nims.response_filters module
Created on Fri Sep 2 13:50:51 2022
@author: jpeacock
- class mth5.io.nims.response_filters.Response(system_id=None, **kwargs)[source]
Bases: object
Common NIMS response filters for electric and magnetic channels.
- dipole_filter(length)[source]
Make a dipole filter
- Parameters:
length – dipole length in meters
- property electric_conversion[source]
Electric channel conversion from counts to volts.
- property electric_high_pass_hp[source]
1-pole high pass filter for 1 Hz instruments.
- property electric_high_pass_pc[source]
1-pole high-pass filter for 8 Hz instruments.
- property electric_low_pass[source]
5-pole electric low-pass filter.
- get_channel_response(channel, dipole_length=1)[source]
Get the full channel response filter.
- Parameters:
channel – channel to get the response for
dipole_length (optional) – dipole length, defaults to 1
Module contents
- class mth5.io.nims.GPS(gps_string: str | bytes, index: int = 0)[source]
Bases: object
Parser for GPS stamps from NIMS magnetotelluric data.
Handles parsing and validation of GPS strings from NIMS data files. Supports both GPRMC and GPGGA message formats, automatically detecting the type and extracting relevant geographic and temporal information.
- Parameters:
gps_string (str or bytes) – Raw GPS string to be parsed. Can contain binary contamination which will be automatically cleaned.
index (int, default 0) – Index or sequence number for this GPS record.
- gps_string
The original GPS string provided for parsing.
- Type:
str
- index
Index or sequence number for this GPS record.
- Type:
int
- valid
Whether the GPS string was successfully parsed and validated.
- Type:
bool
- elevation_units
Units for elevation measurements, typically “meters”.
- Type:
str
- logger
Logger instance for debugging and error reporting.
- Type:
loguru.Logger
Notes
GPS message format differences:
- GPRMC (Recommended Minimum Course)
Contains: date, time, coordinates, speed, course, magnetic declination
Date: Full date information (year, month, day)
- GPGGA (Global Positioning System Fix Data)
Contains: time, coordinates, fix quality, elevation
Date: Defaults to 1980-01-01 for time estimation only
The parser automatically handles:
- Binary contamination in GPS strings
- Missing comma delimiters
- GPS type auto-detection and correction
- Coordinate conversion from degrees-minutes to decimal degrees
Examples
Parse a GPRMC string:
>>> gps_string = "GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*"
>>> gps = GPS(gps_string)
>>> print(f"Position: {gps.latitude:.5f}, {gps.longitude:.5f}")
Position: 34.72683, -115.73501
Parse a GPGGA string:
>>> gps_string = "GPGGA,183511,3443.6098,N,11544.1007,W,1,04,2.6,937.2,M,-28.1,M,*"
>>> gps = GPS(gps_string)
>>> print(f"Elevation: {gps.elevation} {gps.elevation_units}")
Elevation: 937.2 meters
Handle invalid GPS data:
>>> gps = GPS("invalid_string")
>>> print(f"Valid: {gps.valid}")
Valid: False
- property declination: float | None
Magnetic declination in degrees from true north.
- Returns:
Magnetic declination in degrees. Positive values indicate eastward declination, negative values indicate westward declination. Returns None if declination data is not available.
- Return type:
float or None
Notes
Magnetic declination is only available in GPRMC messages. GPGGA messages will return None as they don’t contain declination data.
Western declination values are automatically converted to negative.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.declination
13.1
- property elevation: float
Elevation above sea level in meters.
- Returns:
Elevation in meters. Returns 0.0 if elevation data is not available or cannot be converted.
- Return type:
float
Notes
Elevation is typically only available in GPGGA messages. GPRMC messages will return 0.0 as they don’t contain elevation data.
Conversion errors are logged but don’t raise exceptions.
Examples
>>> gps = GPS("GPGGA,183511,3443.6098,N,11544.1007,W,1,04,2.6,937.2,M,-28.1,M,*")
>>> gps.elevation
937.2
- property fix: str | None
GPS fix status.
- Returns:
GPS fix status (typically “A” for valid fix), or None if fix information is not available or not applicable for the message type.
- Return type:
str or None
Notes
Fix status is typically available in GPRMC messages:
- “A”: Valid fix
- “V”: Invalid fix
GPGGA messages use different fix quality indicators.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.fix
'A'
- property gps_type: str | None
GPS message type.
- Returns:
GPS message type: “GPRMC” or “GPGGA”, or None if not set.
- Return type:
str or None
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.gps_type
'GPRMC'
- property latitude: float
Latitude in decimal degrees (WGS84).
- Returns:
Latitude in decimal degrees. Negative values indicate Southern hemisphere. Returns 0.0 if coordinate data is invalid.
- Return type:
float
Notes
Converts from GPS format (DDMM.MMMM) to decimal degrees:
decimal_degrees = degrees + minutes/60
Southern hemisphere coordinates are automatically converted to negative values.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.latitude
34.72683
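The degrees-minutes conversion described in the notes can be sketched as a stand-alone helper. This is an illustration only, not the parser's internal code; the function name `dm_to_decimal` is hypothetical.

```python
def dm_to_decimal(value: str, hemisphere: str) -> float:
    """Convert DDMM.MMMM (or DDDMM.MMMM) to signed decimal degrees."""
    number = float(value)
    degrees = int(number // 100)       # leading digits are whole degrees
    minutes = number - degrees * 100   # remaining digits are minutes
    decimal = degrees + minutes / 60.0
    # Southern and Western hemispheres are reported as negative values
    if hemisphere.upper() in ("S", "W"):
        decimal = -decimal
    return decimal


print(round(dm_to_decimal("3443.6098", "N"), 5))   # 34.72683
print(round(dm_to_decimal("11544.1007", "W"), 5))  # -115.73501
```

The same helper covers both latitude and longitude because the degree digits are always everything to the left of the two minute digits.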
- property longitude: float
Longitude in decimal degrees (WGS84).
- Returns:
Longitude in decimal degrees. Negative values indicate Western hemisphere. Returns 0.0 if coordinate data is invalid.
- Return type:
float
Notes
Converts from GPS format (DDDMM.MMMM) to decimal degrees:
decimal_degrees = degrees + minutes/60
Western hemisphere coordinates are automatically converted to negative values.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.longitude
-115.73501166666667
- parse_gps_string(gps_string: str | bytes) None[source]
Parse GPS string and populate object attributes.
Main parsing method that validates the GPS string, identifies the message type (GPRMC/GPGGA), and extracts all relevant information into object attributes.
- Parameters:
gps_string (str or bytes) – Raw GPS string from NIMS data file.
Notes
This method performs the following operations:
1. Splits and validates the GPS string
2. Handles missing comma delimiter between time and coordinates
3. Validates each GPS field according to message type
4. Sets object attributes based on parsed values
5. Sets valid flag based on parsing success
If any validation errors occur, they are logged but parsing continues with None values for invalid fields.
The method automatically detects GPS message type and applies appropriate field validation rules.
Examples
Parse a valid GPS string:
>>> gps = GPS("")
>>> gps.parse_gps_string("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> print(f"Valid: {gps.valid}, Type: {gps.gps_type}")
Valid: True, Type: GPRMC
Handle invalid GPS string:
>>> gps.parse_gps_string("invalid_gps_data")
>>> print(f"Valid: {gps.valid}")
Valid: False
- property time_stamp: datetime | None
GPS timestamp as datetime object.
- Returns:
Timestamp parsed from GPS data, or None if time data is invalid.
- Return type:
datetime.datetime or None
Notes
- For GPRMC messages: Uses full date and time information
- For GPGGA messages: Uses time with default date of 1980-01-01
Time format: HHMMSS (hours, minutes, seconds)
Date format: DDMMYY (day, month, 2-digit year)
Invalid date strings are logged but return None rather than raising exceptions.
Examples
>>> gps = GPS("GPRMC,183511,A,3443.6098,N,11544.1007,W,000.0,000.0,260919,013.1,E*")
>>> gps.time_stamp
datetime.datetime(2019, 9, 26, 18, 35, 11)
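A minimal sketch of how the HHMMSS and DDMMYY fields can be combined into a datetime, assuming the 1980-01-01 default is applied whenever no date field exists. The helper name `gps_time_stamp` and the `DEFAULT_DATE` constant are hypothetical and not part of the package.

```python
from datetime import datetime
from typing import Optional

# DDMMYY for 1980-01-01, used when the message carries no date (GPGGA)
DEFAULT_DATE = "010180"


def gps_time_stamp(time_str: str, date_str: Optional[str] = None) -> Optional[datetime]:
    """Build a datetime from HHMMSS and DDMMYY strings, or None if invalid."""
    if date_str is None:
        date_str = DEFAULT_DATE
    try:
        return datetime.strptime(date_str + time_str, "%d%m%y%H%M%S")
    except ValueError:
        # Invalid fields yield None instead of raising
        return None


print(gps_time_stamp("183511", "260919"))  # 2019-09-26 18:35:11
print(gps_time_stamp("183511"))            # 1980-01-01 18:35:11
```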
- validate_gps_list(gps_list: list[str]) tuple[list[str] | None, list[str]][source]
Validate GPS field list and check format compliance.
Performs comprehensive validation of GPS message components including type checking, length validation, and field-specific validation.
- Parameters:
gps_list (list of str) – GPS message components split by delimiter.
- Returns:
gps_list (list of str or None) – Validated GPS list with corrected values, or None if critical validation fails.
error_list (list of str) – List of validation error messages encountered during processing.
Notes
Validation steps performed:
1. GPS message type validation and correction
2. Message length validation based on type
3. Time format validation (6 digits)
4. Coordinate validation (latitude/longitude + hemisphere)
5. Date validation for GPRMC messages
6. Elevation validation for GPGGA messages
Non-critical validation errors are collected but don’t halt processing. Critical errors (type or length) return None and stop validation.
Examples
Validate a correct GPS list:
>>> gps = GPS("")
>>> gps_data = ["GPRMC", "183511", "A", "3443.6098", "N", "11544.1007", "W",
...             "000.0", "000.0", "260919", "013.1", "E"]
>>> validated, errors = gps.validate_gps_list(gps_data)
>>> print(f"Errors: {len(errors)}")
Errors: 0
Handle validation errors:
>>> bad_data = ["INVALID", "time", "fix"]
>>> validated, errors = gps.validate_gps_list(bad_data)
>>> print(f"Result: {validated}, Errors: {len(errors)}")
Result: None, Errors: 1
- validate_gps_string(gps_string: str | bytes) str | None[source]
Validate and clean GPS string.
Removes binary contamination, finds string terminator, and validates format. Handles both string and bytes input.
- Parameters:
gps_string (str or bytes) – Raw GPS string to validate. May contain binary contamination that will be automatically removed.
- Returns:
Cleaned GPS string with terminator removed, or None if validation fails due to missing terminator or decode errors.
- Return type:
str or None
- Raises:
TypeError – If input is not string or bytes.
Notes
Binary contamination bytes that are automatically removed:
- \xd9, \xc7, \xcc
- \x00 (null byte, replaced with ‘*’ terminator)
The GPS string must end with ‘*’ character to be considered valid.
Examples
Clean a contaminated binary GPS string:
>>> gps = GPS("")
>>> contaminated = b"GPRMC,183511,A\xd9,3443.6098,N*"
>>> clean = gps.validate_gps_string(contaminated)
>>> print(clean)
GPRMC,183511,A,3443.6098,N
Handle missing terminator:
>>> invalid = "GPRMC,183511,A,3443.6098,N"  # No '*'
>>> result = gps.validate_gps_string(invalid)
>>> print(result)
None
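The cleaning rules listed in the notes can be approximated with plain byte replacement. The sketch below is an assumption about how such cleaning might look, not the method's actual implementation; `clean_gps_string` and `CONTAMINATION` are hypothetical names.

```python
from typing import Optional, Union

# Contamination bytes named in the notes above
CONTAMINATION = (b"\xd9", b"\xc7", b"\xcc")


def clean_gps_string(raw: Union[str, bytes]) -> Optional[str]:
    """Strip contamination, cut at the '*' terminator, and decode to str."""
    if isinstance(raw, str):
        raw = raw.encode("latin-1", errors="replace")
    elif not isinstance(raw, bytes):
        raise TypeError("GPS string must be str or bytes")
    for junk in CONTAMINATION:
        raw = raw.replace(junk, b"")
    # A null byte is treated as the terminator
    raw = raw.replace(b"\x00", b"*")
    end = raw.find(b"*")
    if end == -1:
        return None  # no terminator found
    try:
        return raw[:end].decode("ascii")
    except UnicodeDecodeError:
        return None


print(clean_gps_string(b"GPRMC,183511,A\xd9,3443.6098,N*"))
```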
- exception mth5.io.nims.GPSError[source]
Bases: Exception
Custom exception for GPS parsing and validation errors.
Raised when GPS string parsing fails or when GPS data validation encounters invalid values.
- class mth5.io.nims.NIMS(fn: str | Path | None = None)[source]
Bases: NIMSHeader
NIMS class for reading NIMS DATA.BIN files.
A fast way to read the binary files is to first read in the GPS strings (the third byte in each block, read as a character) and parse them into valid GPS stamps.
Then read in the entire data set as unsigned 8 bit integers and reshape the data to be n seconds x block size. Then parse that array into the status information and data.
- Parameters:
fn (str or Path, optional) – Path to the NIMS DATA.BIN file to read, by default None
- block_size
Size of data blocks (default 131 for 8 Hz data)
- Type:
int
- block_sequence
Sequence pattern to locate [1, 131]
- Type:
list of int
- sample_rate
Sample rate in samples/second (default 8)
- Type:
int
- e_conversion_factor
Electric field conversion factor
- Type:
float
- h_conversion_factor
Magnetic field conversion factor
- Type:
float
- t_conversion_factor
Temperature conversion factor
- Type:
float
- t_offset
Temperature offset value
- Type:
int
- info_array
Structured array of block information
- Type:
ndarray or None
- stamps
List of valid GPS stamps
- Type:
list or None
- ts_data
Time series data as pandas DataFrame
- Type:
DataFrame or None
- gaps
List of timing gaps found in data
- Type:
list or None
- duplicate_list
List of duplicate blocks found
- Type:
list or None
Notes
I only have a limited number of .BIN files to test with, so this will likely break if there are issues such as data gaps. This has been tested against the MATLAB program loadNIMS by Anna Kelbert, and the results match for all the .bin files I have. If something looks weird, check it against that program.
Warning
Currently only 8 Hz data is supported.
Examples
>>> from mth5.io.nims import nims
>>> n = nims.NIMS(r"/home/mt_data/nims/mt001.bin")
>>> n.read_nims()
- align_data(data_array, stamps)[source]
Match up the first good GPS stamp with the data.
Do this by using the first GPS stamp and assuming that the time from the first time stamp to the start is the index value.
Put the data into a pandas DataFrame that is indexed by time.
- Parameters:
data_array (array) – structure array with columns for each component [hx, hy, hz, ex, ey]
stamps (list) – list of GPS stamps [[status_index, [GPRMC, GPGGA]]]
- Returns:
pandas DataFrame with columns of components and indexed by time initialized by the start time.
Note
Data gaps are squeezed because it is not clear what a gap actually means.
- property box_temperature: ChannelTS | None
Data logger temperature channel.
- Returns:
Temperature channel sampled at 1 second, interpolated to match the time series sample rate, or None if no time series data
- Return type:
ChannelTS or None
Notes
Temperature is measured in Celsius and interpolated onto the same time grid as the magnetic and electric field channels.
- check_timing(stamps)[source]
Make sure that there is the correct number of seconds between the first and last GPRMC stamps.
- Parameters:
stamps (list) – list of GPS stamps [[status_index, [GPRMC, GPGGA]]]
- Returns:
[ True | False ] whether the data is valid or not
gap index locations
Note
Currently it is assumed that if data gaps occur, the data can be squeezed to remove them. There is probably a more elegant way of doing this.
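One way to picture the timing check, assuming each data block covers one second: compare the seconds elapsed between two GPS stamps with the number of blocks between them. The `(block index, time)` tuple layout and the name `timing_mismatch` are hypothetical simplifications, not the structures used by the method.

```python
from datetime import datetime
from typing import Tuple

# Hypothetical simplified stamp: (block index, GPS time)
Stamp = Tuple[int, datetime]


def timing_mismatch(first: Stamp, last: Stamp) -> float:
    """Return GPS-elapsed seconds minus the number of blocks between stamps."""
    (first_index, first_time), (last_index, last_time) = first, last
    gps_seconds = (last_time - first_time).total_seconds()
    block_seconds = last_index - first_index  # assumes one block per second
    return gps_seconds - block_seconds


first = (10, datetime(2019, 9, 26, 18, 35, 11))
last = (70, datetime(2019, 9, 26, 18, 36, 14))
print(timing_mismatch(first, last))  # 3.0
```

A result of zero means the block count agrees with the GPS clock; a positive value suggests seconds missing from the recorded blocks.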
- property declination: float | None
Median magnetic declination value from all GPS stamps.
- Returns:
Median magnetic declination in decimal degrees from GPRMC stamps, or None if no declination data available
- Return type:
float or None
Notes
Only uses GPRMC stamps as they contain declination information.
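Taking a median over per-stamp values while skipping missing ones can be sketched with the standard library. The helper name `median_or_none` is hypothetical, and the input is assumed to be a plain list of floats or None rather than GPS objects.

```python
from statistics import median
from typing import Iterable, Optional


def median_or_none(values: Iterable[Optional[float]]) -> Optional[float]:
    """Median of the values that are not None, or None if nothing is left."""
    valid = [value for value in values if value is not None]
    return median(valid) if valid else None


print(median_or_none([13.1, None, 13.2, 13.1]))  # 13.1
print(median_or_none([None, None]))              # None
```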
- property elevation: float | None
Median elevation value from all GPS stamps.
- Returns:
Median elevation in meters (WGS84) from GPS stamps, or header GPS elevation if no stamps available
- Return type:
float or None
Notes
Uses the first stamp within each GPS stamp set. For paired stamps (GPRMC/GPGGA), uses the GPGGA elevation if available.
- property end_time: MTime
End time of the time series data.
- Returns:
End time derived from the last time series index, or estimated from start time and number of samples
- Return type:
MTime
Notes
If time series data is available, uses the last timestamp. Otherwise estimates end time from start time plus duration calculated from number of samples and sample rate.
- property ex: ChannelTS | None
EX electric field channel time series.
- Returns:
Time series data for the EX electric field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property ex_metadata: Electric | None
Metadata for the EX electric field channel.
- Returns:
Electric field metadata object for the EX channel, or None if no time series data is loaded
- Return type:
Electric or None
- property ey: ChannelTS | None
EY electric field channel time series.
- Returns:
Time series data for the EY electric field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property ey_metadata: Electric | None
Metadata for the EY electric field channel.
- Returns:
Electric field metadata object for the EY channel, or None if no time series data is loaded
- Return type:
Electric or None
- find_sequence(data_array: ndarray, block_sequence: list[int] | None = None) ndarray[source]
Find a sequence pattern in the data array.
- Parameters:
data_array (ndarray) – Array of the data with shape [n, m] where n is the number of seconds recorded and m is the block length for a given sampling rate
block_sequence (list of int, optional) – Sequence pattern to locate, by default [1, 131] (start of data block)
- Returns:
Array of index locations where the sequence is found
- Return type:
ndarray
Notes
Uses numpy rolling and comparison to find all occurrences of the specified sequence pattern in the data array.
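To illustrate what locating the block-start pattern means, the sketch below searches a byte string for every occurrence of `[1, 131]`. It deliberately uses a plain `bytes.find` loop rather than the numpy rolling comparison mentioned in the notes, so it is a stand-in for the idea and not the method itself.

```python
from typing import List, Sequence


def find_sequence(data: bytes, pattern: Sequence[int] = (1, 131)) -> List[int]:
    """Return every index at which the byte pattern starts in data."""
    pattern_bytes = bytes(pattern)
    hits = []
    start = data.find(pattern_bytes)
    while start != -1:
        hits.append(start)
        # Continue searching just past the previous hit
        start = data.find(pattern_bytes, start + 1)
    return hits


print(find_sequence(bytes([0, 1, 131, 5, 1, 131])))  # [1, 4]
```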
- get_channel_response(channel: str, dipole_length: float = 1) Any[source]
Get the channel response for a given channel.
- Parameters:
channel (str) – Channel identifier (e.g., ‘hx’, ‘hy’, ‘hz’, ‘ex’, ‘ey’)
dipole_length (float, optional) – Dipole length for electric field channels, by default 1
- Returns:
Channel response object from the NIMS response filters
- Return type:
Any
Notes
Uses the NIMS response filters to generate appropriate response functions for magnetic and electric field channels at the current sample rate.
- get_stamps(nims_string: bytes) list[tuple[Any, list[GPS]]][source]
Extract and parse valid GPS strings, matching GPRMC with GPGGA stamps.
- Parameters:
nims_string (bytes) – Raw GPS binary string output by NIMS
- Returns:
List of matched GPS stamps where each element is a tuple containing index and list of GPS objects [GPRMC, GPGGA] (or just [GPRMC])
- Return type:
list of tuple
Notes
Skips the first entry as it tends to be incomplete. Attempts to match synchronous GPRMC with GPGGA stamps when possible.
- has_data() bool[source]
Check if the NIMS object contains time series data.
- Returns:
True if ts_data is not None, False otherwise
- Return type:
bool
- property hx: ChannelTS | None
HX magnetic field channel time series.
- Returns:
Time series data for the HX magnetic field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property hx_metadata: Magnetic | None
Metadata for the HX magnetic field channel.
- Returns:
Magnetic field metadata object for the HX channel, or None if no time series data is loaded
- Return type:
Magnetic or None
- property hy: ChannelTS | None
HY magnetic field channel time series.
- Returns:
Time series data for the HY magnetic field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property hy_metadata: Magnetic | None
Metadata for the HY magnetic field channel.
- Returns:
Magnetic field metadata object for the HY channel, or None if no time series data is loaded
- Return type:
Magnetic or None
- property hz: ChannelTS | None
HZ magnetic field channel time series.
- Returns:
Time series data for the HZ magnetic field component, or None if no time series data is loaded
- Return type:
ChannelTS or None
- property hz_metadata: Magnetic | None
Metadata for the HZ magnetic field channel.
- Returns:
Magnetic field metadata object for the HZ channel, or None if no time series data is loaded
- Return type:
Magnetic or None
- property latitude: float | None
Median latitude value from all GPS stamps.
- Returns:
Median latitude in decimal degrees (WGS84) from GPRMC stamps, or header GPS latitude if no stamps available
- Return type:
float or None
Notes
Only uses GPRMC stamps as they should be duplicates of GPGGA stamps but include additional validation.
- property longitude: float | None
Median longitude value from all GPS stamps.
- Returns:
Median longitude in decimal degrees (WGS84) from GPS stamps, or header GPS longitude if no stamps available
- Return type:
float or None
Notes
Uses the first stamp within each GPS stamp set.
- make_dt_index(start_time: str, sample_rate: float, stop_time: str | None = None, n_samples: int | None = None) DatetimeIndex[source]
Create datetime index array for time series data.
- Parameters:
start_time (str) – Start time in format YYYY-MM-DDThh:mm:ss.ms UTC
sample_rate (float) – Sample rate in samples/second
stop_time (str, optional) – End time in same format as start_time
n_samples (int, optional) – Number of samples to generate
- Returns:
Pandas datetime index with UTC timezone
- Return type:
DatetimeIndex
Notes
Either stop_time or n_samples must be provided. The datetime format should be YYYY-MM-DDThh:mm:ss.ms UTC.
- Raises:
ValueError – If neither stop_time nor n_samples is provided
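The index construction can be sketched without pandas to show the arithmetic involved. This version returns a list of `datetime` objects instead of a `DatetimeIndex`, and it assumes the stop time is included as the last sample; both the name `make_time_index` and that inclusivity are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def make_time_index(
    start_time: str,
    sample_rate: float,
    stop_time: Optional[str] = None,
    n_samples: Optional[int] = None,
) -> List[datetime]:
    """Evenly spaced timestamps starting at start_time."""
    start = datetime.fromisoformat(start_time)
    if n_samples is None:
        if stop_time is None:
            raise ValueError("Provide either stop_time or n_samples")
        stop = datetime.fromisoformat(stop_time)
        # Assumption: the stop time itself is the last sample
        n_samples = int(round((stop - start).total_seconds() * sample_rate)) + 1
    step = timedelta(seconds=1.0 / sample_rate)
    return [start + i * step for i in range(n_samples)]


index = make_time_index("2019-09-26T18:35:11", 8, n_samples=16)
print(index[0], index[-1])
```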
- match_status_with_gps_stamps(status_array, gps_list)[source]
Match the index values from the status array with the index values of the GPS stamps. There appears to be a bit of wiggle room between when the lock is recorded and the stamp was actually recorded. This is typically 1 second and sometimes 2.
- Parameters:
status_array (array) – array of status values from each data block
gps_list (list) – list of valid GPS stamps [[GPRMC, GPGGA], …]
Note
I think there is a 2 second gap between the lock and the first stamp character.
- property n_samples: int | None
Number of samples in the time series.
- Returns:
Number of samples if data is loaded, estimated from file size if file exists, None otherwise
- Return type:
int or None
- read_nims(fn: str | Path | None = None) None[source]
Read NIMS DATA.BIN file and parse all data.
This method performs the complete data reading and processing workflow:
1. Read header information and store as attributes
2. Locate data block beginning by finding first [1, 131, …] sequence
3. Ensure data is multiple of block length, trim excess bits
4. Extract GPS data (3rd byte of each block) and parse GPS stamps
5. Read data as unsigned 8-bit integers, reshape to [N, block_length]
6. Remove duplicate blocks (first of each duplicate pair)
7. Match GPS status locks with valid GPS stamps
8. Verify timing between first/last GPS stamps, trim excess seconds
- Parameters:
fn (str or Path, optional) – Path to NIMS DATA.BIN file. Uses self.fn if not provided.
Notes
The data and information arrays returned have duplicates removed and sequence reset to be monotonic. Extra seconds due to timing gaps are trimmed from the end of the time series.
Examples
>>> from mth5.io import nims
>>> n = nims.NIMS(r"/home/mt_data/nims/mt001.bin")
>>> n.read_nims()
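The block handling in the workflow (trimming to a whole number of blocks and pulling the third byte of each block as a GPS character) can be sketched on raw bytes. This uses plain Python slicing instead of a reshaped array, and the helper names `split_blocks` and `gps_characters` are hypothetical.

```python
from typing import List

BLOCK_SIZE = 131  # bytes per block for 8 Hz data


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Trim trailing bytes and cut the data into whole blocks."""
    usable = len(data) - (len(data) % block_size)
    return [data[i:i + block_size] for i in range(0, usable, block_size)]


def gps_characters(blocks: List[bytes]) -> bytes:
    """Collect the third byte of every block, which carries GPS text."""
    return bytes(block[2] for block in blocks)


# Three synthetic blocks starting with [1, 131], plus 5 stray bytes
raw = b"".join(bytes([1, 131, ord(c)]) + bytes(128) for c in "GPR") + bytes(5)
blocks = split_blocks(raw)
print(len(blocks), gps_characters(blocks))  # 3 b'GPR'
```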
- remove_duplicates(info_array, data_array)[source]
Remove duplicate blocks, dropping the first block of each duplicate pair as suggested by Paul and Anna. Checks that the magnetic data are identical for the duplicate blocks, removes the blocks from the information and data arrays, and returns the reduced arrays. This should sync up the timing of GPS stamps and index values.
- Parameters:
info_array (np.array) – structured array of block information
data_array (np.array) – structured array of the data
- Returns:
reduced information array
reduced data array
index of duplicates in raw data
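A rough sketch of dropping the first block of each duplicate pair, under the assumption that duplicates are adjacent blocks sharing a sequence number and identical magnetic data. The dictionary layout and the name `drop_first_duplicates` are hypothetical; the method itself works on structured arrays.

```python
from typing import Dict, List, Tuple

# Hypothetical block layout: {"sequence": int, "mag": tuple}
Block = Dict[str, object]


def drop_first_duplicates(blocks: List[Block]) -> Tuple[List[Block], List[int]]:
    """Drop the first block of each adjacent duplicate pair."""
    kept: List[Block] = []
    duplicate_index: List[int] = []
    for i, block in enumerate(blocks):
        if i + 1 < len(blocks):
            following = blocks[i + 1]
            same_sequence = following["sequence"] == block["sequence"]
            same_mag = following["mag"] == block["mag"]
            if same_sequence and same_mag:
                duplicate_index.append(i)  # remember where the duplicate was
                continue
        kept.append(block)
    return kept, duplicate_index


blocks = [
    {"sequence": 1, "mag": (1, 2, 3)},
    {"sequence": 2, "mag": (4, 5, 6)},
    {"sequence": 2, "mag": (4, 5, 6)},
    {"sequence": 3, "mag": (7, 8, 9)},
]
kept, dropped = drop_first_duplicates(blocks)
print([b["sequence"] for b in kept], dropped)  # [1, 2, 3] [1]
```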
- property run_metadata: Run | None
Run metadata for the NIMS data collection.
- Returns:
MT run metadata including data logger information, timing, and channel metadata, or None if no time series data is loaded
- Return type:
Run or None
- property start_time: MTime
Start time of the time series data.
- Returns:
Start time derived from the first GPS time stamp index, or header GPS stamp if no time series data available
- Return type:
MTime
Notes
The start time is calculated from the first good GPS time stamp minus the seconds to the beginning of the time series.
- property station_metadata: Station | None
Station metadata from NIMS file.
- Returns:
MT station metadata including geographic information and location data, or None if no time series data is loaded
- Return type:
Station or None
- to_runts(calibrate: bool = False) RunTS | None[source]
Get xarray RunTS object for the NIMS data.
- Parameters:
calibrate (bool, optional) – Whether to apply calibration to the data, by default False
- Returns:
Time series run object containing all channels and metadata, or None if no time series data is loaded
- Return type:
RunTS or None
Notes
Includes all magnetic field channels (hx, hy, hz), electric field channels (ex, ey), and box temperature data.
- unwrap_sequence(sequence: ndarray) ndarray[source]
Unwrap sequence to sequential numbers instead of modulo 256.
- Parameters:
sequence (ndarray) – Sequence of byte numbers (0-255) to unwrap
- Returns:
Unwrapped sequence with first number set to 0 and subsequent values forming a continuous count
- Return type:
ndarray
Notes
Handles the fact that sequence numbers are stored as single bytes (0-255) but represent a continuous count. When a value of 255 is encountered, the next rollover is anticipated.
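Unwrapping a modulo-256 counter can be illustrated with a short loop. Note that this sketch detects a rollover whenever a value drops below its predecessor, which is a simplification and not necessarily the same rule the method uses (the notes describe anticipating the rollover after a 255).

```python
from typing import List, Sequence


def unwrap_sequence(sequence: Sequence[int]) -> List[int]:
    """Turn byte counters (0-255) into a continuous count starting at 0."""
    unwrapped: List[int] = []
    rollovers = 0
    previous = None
    for value in sequence:
        # A drop in value means the counter wrapped past 255
        if previous is not None and value < previous:
            rollovers += 1
        unwrapped.append(value + 256 * rollovers)
        previous = value
    first = unwrapped[0] if unwrapped else 0
    return [value - first for value in unwrapped]


print(unwrap_sequence([254, 255, 0, 1]))  # [0, 1, 2, 3]
```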
- class mth5.io.nims.NIMSCollection(file_path: str | Path | None = None, **kwargs: Any)[source]
Bases: Collection
Collection of NIMS binary files into runs.
This class provides functionality for organizing and processing multiple NIMS binary files into a structured format for magnetotelluric data analysis.
- Parameters:
file_path (str or Path, optional) – Path to the directory containing NIMS binary files.
**kwargs (dict) – Additional keyword arguments passed to the parent Collection class.
- file_ext
File extension for NIMS binary files (‘bin’).
- Type:
str
- survey_id
Survey identifier, defaults to ‘mt’.
- Type:
str
Examples
>>> from mth5.io.nims import NIMSCollection
>>> nc = NIMSCollection(r"/path/to/nims/station")
>>> nc.survey_id = "mt001"
>>> df = nc.to_dataframe()
See also
mth5.io.collection.Collection : Base collection class
mth5.io.nims.NIMS : NIMS file reader
- assign_run_names(df: DataFrame, zeros: int = 2) DataFrame[source]
Assign standardized run names to DataFrame entries by station.
This method assigns run names following the pattern ‘sr{sample_rate}_{run_number}’ where run_number is zero-padded according to the zeros parameter. Run names are assigned sequentially within each station, ordered by start time.
- Parameters:
df (pd.DataFrame) – DataFrame containing NIMS file metadata with required columns: ‘station’, ‘start’, ‘run’, ‘sample_rate’. The DataFrame will be modified in-place.
zeros (int, default 2) – Number of zeros to use for zero-padding the run number in the generated run names (e.g., zeros=2 gives ‘01’, ‘02’, etc.).
- Returns:
The input DataFrame with updated ‘run’ and ‘sequence_number’ columns. Run names follow the format ‘sr{sample_rate}_{run_number:0{zeros}}’.
- Return type:
pd.DataFrame
Notes
- Existing run names (non-None values) are preserved
- Files are processed in chronological order within each station
- Sequence numbers are assigned incrementally starting from 1
- Only files with None run names receive new assignments
Examples
>>> import pandas as pd
>>> from mth5.io.nims import NIMSCollection
>>> # Assuming df has columns: station, start, run, sample_rate
>>> nc = NIMSCollection()
>>> df_updated = nc.assign_run_names(df, zeros=3)
>>> print(df_updated['run'].tolist())
['sr8_001', 'sr8_002', 'sr1_001']
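The naming pattern can be sketched on plain dictionaries to show how the zero-padding works. This is a simplified stand-in that uses a list of dicts instead of a DataFrame, and it assumes the per-station counter also advances for rows whose run name is preserved; that detail is a guess, not confirmed behaviour.

```python
from typing import Dict, List


def assign_run_names(rows: List[Dict], zeros: int = 2) -> List[Dict]:
    """Fill missing run names as sr{sample_rate}_{number}, ordered by start."""
    counters: Dict[str, int] = {}
    for row in sorted(rows, key=lambda r: (r["station"], r["start"])):
        count = counters.get(row["station"], 0) + 1
        counters[row["station"]] = count
        if row["run"] is None:
            # Nested format spec pads the run number to `zeros` digits
            row["run"] = f"sr{row['sample_rate']}_{count:0{zeros}}"
        row["sequence_number"] = count
    return rows


rows = [
    {"station": "mt001", "start": "2019-09-27T00:00:00", "run": None, "sample_rate": 8},
    {"station": "mt001", "start": "2019-09-26T00:00:00", "run": None, "sample_rate": 8},
]
assign_run_names(rows, zeros=3)
print([row["run"] for row in rows])  # ['sr8_002', 'sr8_001']
```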
- to_dataframe(sample_rates: int | list[int] = [1], run_name_zeros: int = 2, calibration_path: str | Path | None = None) DataFrame[source]
Create a DataFrame of each NIMS binary file in the collection directory.
This method processes all NIMS binary files in the specified directory and extracts metadata to create a structured DataFrame suitable for further magnetotelluric data processing.
- Parameters:
sample_rates (int or list of int, default [1]) – Sample rates to include in the DataFrame. Note that for NIMS data, this parameter is present for interface consistency but all files will be processed regardless of their sample rate.
run_name_zeros (int, default 2) – Number of zeros to use when formatting run names in the output.
calibration_path (str or Path, optional) – Path to calibration files. Currently not used in NIMS processing but included for interface consistency.
- Returns:
DataFrame containing metadata for each NIMS file with columns:
- survey : Survey identifier
- station : Station name from NIMS file
- run : Run identifier from NIMS file
- start : Start time in ISO format
- end : End time in ISO format
- fn : File path
- sample_rate : Sampling rate
- file_size : File size in bytes
- n_samples : Number of samples
- dipole : Electric dipole lengths [Ex, Ey]
- channel_id : Channel identifier (always 1)
- sequence_number : Sequence number (always 0)
- component : Comma-separated component list
- instrument_id : Instrument identifier (always ‘NIMS’)
- Return type:
pd.DataFrame
Notes
This method assumes the directory contains files from a single station. Each NIMS file is read to extract header information including timing, station identification, and measurement parameters.
Examples
>>> from mth5.io.nims import NIMSCollection
>>> nc = NIMSCollection("/path/to/nims/station")
>>> df = nc.to_dataframe(run_name_zeros=3)
>>> print(df[['station', 'run', 'start', 'sample_rate']])
- class mth5.io.nims.NIMSHeader(fn: str | Path | None = None)[source]
Bases: object
Class to hold NIMS header information.
This class parses and stores header information from NIMS DATA.BIN files. The header contains metadata about the measurement site, equipment setup, GPS coordinates, electrode configuration, and other survey parameters.
- Parameters:
fn (str or Path, optional) – Path to the NIMS file to read, by default None
- fn
Path to the NIMS file
- Type:
Path or None
- site_name
Name of the measurement site
- Type:
str or None
- state_province
State or province of the measurement location
- Type:
str or None
- country
Country of the measurement location
- Type:
str or None
- box_id
System box identifier
- Type:
str or None
- mag_id
Magnetometer head identifier
- Type:
str or None
- ex_length
North-South electric field wire length in meters
- Type:
float or None
- ex_azimuth
North-South electric field wire heading in degrees
- Type:
float or None
- ey_length
East-West electric field wire length in meters
- Type:
float or None
- ey_azimuth
East-West electric field wire heading in degrees
- Type:
float or None
- n_electrode_id
North electrode identifier
- Type:
str or None
- s_electrode_id
South electrode identifier
- Type:
str or None
- e_electrode_id
East electrode identifier
- Type:
str or None
- w_electrode_id
West electrode identifier
- Type:
str or None
- ground_electrode_info
Ground electrode information
- Type:
str or None
- header_gps_stamp
GPS timestamp from header
- Type:
MTime or None
- header_gps_latitude
GPS latitude from header in decimal degrees
- Type:
float or None
- header_gps_longitude
GPS longitude from header in decimal degrees
- Type:
float or None
- header_gps_elevation
GPS elevation from header in meters
- Type:
float or None
- operator
Operator name
- Type:
str or None
- comments
Survey comments
- Type:
str or None
- run_id
Run identifier
- Type:
str or None
- data_start_seek
Byte position where data begins in file
- Type:
int
Examples
A typical header looks like:
'''
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>user field>>>>>>>>>>>>>>>>>>>>>>>>>>>>
SITE NAME: Budwieser Spring
STATE/PROVINCE: CA
COUNTRY: USA
>>> The following code in double quotes is REQUIRED to start the NIMS <<
>>> The next 3 lines contain values required for processing <<<<<<<<<<<<
>>> The lines after that are optional <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
"300b" <-- 2CHAR EXPERIMENT CODE + 3 CHAR SITE CODE + RUN LETTER
1105-3; 1305-3 <-- SYSTEM BOX I.D.; MAG HEAD ID (if different)
106 0 <-- N-S Ex WIRE LENGTH (m); HEADING (deg E mag N)
109 90 <-- E-W Ey WIRE LENGTH (m); HEADING (deg E mag N)
1 <-- N ELECTRODE ID
3 <-- E ELECTRODE ID
2 <-- S ELECTRODE ID
4 <-- W ELECTRODE ID
Cu <-- GROUND ELECTRODE INFO
GPS INFO: 26/09/19 18:29:29 34.7268 N 115.7350 W 939.8
OPERATOR: KP
COMMENT: N/S CRS: .95/.96 DCV: 3.5 ACV:1
E/W CRS: .85/.86 DCV: 1.5 ACV: 1
Redeployed site for run b b/c possible animal disturbance
'''
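As an illustration of how one header line maps onto the attributes above, the sketch below splits the `GPS INFO` line from the example into a timestamp, signed coordinates, and elevation. It assumes the date is written day/month/year and that fields are whitespace-separated; the function name `parse_gps_info` is hypothetical and this is not the class's own parser.

```python
from datetime import datetime
from typing import Dict


def parse_gps_info(line: str) -> Dict[str, object]:
    """Parse 'GPS INFO: DD/MM/YY HH:MM:SS lat N|S lon E|W elevation'."""
    _, _, payload = line.partition(":")  # split at the first colon only
    date, time, lat, lat_hemi, lon, lon_hemi, elev = payload.split()
    latitude = float(lat) * (-1 if lat_hemi.upper() == "S" else 1)
    longitude = float(lon) * (-1 if lon_hemi.upper() == "W" else 1)
    return {
        # Assumption: day/month/two-digit-year ordering
        "time": datetime.strptime(f"{date} {time}", "%d/%m/%y %H:%M:%S"),
        "latitude": latitude,
        "longitude": longitude,
        "elevation": float(elev),
    }


info = parse_gps_info("GPS INFO: 26/09/19 18:29:29 34.7268 N 115.7350 W 939.8")
print(info["time"], info["latitude"], info["longitude"], info["elevation"])
```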
- property file_size: int | None
Size of the NIMS file in bytes.
- Returns:
File size in bytes, or None if no file is set
- Return type:
int or None
- Raises:
FileNotFoundError – If the file does not exist
- property fn: Path | None
Full path to NIMS file.
- Returns:
Path object representing the NIMS file location, or None if no file is set
- Return type:
Path or None
- parse_header_dict(header_dict: dict[str, str] | None = None) None[source]
Parse the header dictionary into individual attributes.
This method takes the raw header dictionary and extracts specific information into class attributes for easy access.
- Parameters:
header_dict (dict of str, optional) – Dictionary containing header key-value pairs. Uses self.header_dict if not provided.
Notes
Parses various header fields including:
- Wire lengths and azimuths for electric field measurements
- System box and magnetometer IDs
- GPS coordinates and timestamp
- Run identifier
- Other metadata fields
- read_header(fn: str | Path | None = None) None[source]
Read header information from a NIMS file.
This method reads and parses the header section of a NIMS DATA.BIN file, extracting metadata about the survey setup, GPS coordinates, electrode configuration, and other parameters.
- Parameters:
fn (str or Path, optional) – Full path to NIMS file to read. Uses self.fn if not provided.
- Raises:
NIMSError – If the file does not exist or cannot be read
Notes
The method reads up to _max_header_length bytes from the beginning of the file, parses the header information, and stores the results in the header_dict attribute and individual properties.
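The read step described in these notes can be sketched as follows. `read_header_bytes` and its 1000-byte default are hypothetical stand-ins for this example; the actual limit is whatever `_max_header_length` holds in the library, and this sketch raises the built-in `FileNotFoundError` rather than `NIMSError`.

```python
from pathlib import Path
import tempfile


def read_header_bytes(fn, max_header_length=1000):
    """Return the first max_header_length bytes of fn as text (illustrative only)."""
    fn = Path(fn)
    if not fn.exists():
        raise FileNotFoundError(f"Could not find file {fn}")
    with fn.open("rb") as fid:
        raw = fid.read(max_header_length)
    # Binary data follows the header, so undecodable bytes are replaced
    # instead of raising an error.
    return raw.decode("ascii", errors="replace")


# Demonstrate on a temporary file: a text header followed by binary bytes.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "DATA.BIN"
    path.write_bytes(b"OPERATOR: KP\r" + bytes([0xFF, 0x00, 0x81]) * 10)
    text = read_header_bytes(path, max_header_length=12)
```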
- property station: str | None
Station ID derived from run ID.
- Returns:
Station identifier (run ID without the last character), or None if run_id is not set
- Return type:
str or None
Notes
The station ID is typically the run ID with the last character (run letter) removed.
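A minimal sketch of that rule, assuming the run ID is a plain string ending in a single run letter. `station_from_run_id` is a hypothetical helper for illustration, not part of the mth5 API.

```python
def station_from_run_id(run_id):
    """Drop the trailing run letter from a run ID (illustrative only)."""
    if run_id is None:
        return None
    return run_id[:-1]


station = station_from_run_id("300b")  # run "b" at station "300"
```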
- class mth5.io.nims.Response(system_id=None, **kwargs)[source]
Bases:
object
Common NIMS response filters for electric and magnetic channels.
- dipole_filter(length)[source]
Make a dipole filter.
- Parameters:
length (float) – Dipole length in meters
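Conceptually, accounting for a dipole amounts to scaling by its length: a voltage measured across the dipole is divided by the dipole length to give an electric field. The sketch below shows that arithmetic only; `voltage_to_field` is a hypothetical helper and does not represent the filter object this method returns.

```python
def voltage_to_field(voltage_volts, dipole_length_m):
    """Convert a dipole voltage (V) to an electric field (V/m)."""
    if dipole_length_m <= 0:
        raise ValueError("dipole length must be positive")
    return voltage_volts / dipole_length_m


# 0.0053 V measured across a 106 m dipole
field = voltage_to_field(0.0053, 106.0)
```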
- property electric_conversion
Electric channel conversion from counts to volts.
- property electric_high_pass_hp
1-pole high pass filter for 1 Hz instruments.
- property electric_high_pass_pc
1-pole high pass filter for 8 Hz instruments.
- property electric_low_pass
5-pole electric low pass filter.
- property electric_physical_units
- get_channel_response(channel, dipole_length=1)[source]
Get the full channel response filter.
- Parameters:
channel – Channel to get the response for
dipole_length (optional) – Dipole length, defaults to 1
- get_electric_high_pass(hardware='pc')[source]
Get the electric high pass filter for the given hardware (default 'pc').
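For orientation, a generic one-pole high-pass response has the form H(f) = (i f / f_c) / (1 + i f / f_c), where f_c is the corner frequency. The sketch below evaluates that textbook expression. The corner frequency is an arbitrary placeholder, not a value taken from the NIMS hardware, and `one_pole_high_pass` is a hypothetical helper rather than part of the mth5 API.

```python
def one_pole_high_pass(freq_hz, corner_hz):
    """Complex response of a generic one-pole high-pass filter."""
    ratio = 1j * freq_hz / corner_hz
    return ratio / (1 + ratio)


corner = 0.001  # placeholder corner frequency in Hz (assumption)

# Amplitude at the corner, far above it, and far below it.
at_corner = abs(one_pole_high_pass(corner, corner))
well_above = abs(one_pole_high_pass(1000 * corner, corner))
well_below = abs(one_pole_high_pass(corner / 1000, corner))
```

The amplitude is 1/sqrt(2) at the corner, approaches 1 well above it, and falls toward 0 well below it, which is the behavior that distinguishes a high pass from a low pass.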
- property magnetic_conversion
- property magnetic_low_pass
3-pole magnetic low pass filter.
- mth5.io.nims.read_nims(fn: str | Path) RunTS | None[source]
Convenience function to read a NIMS DATA.BIN file.
- Parameters:
fn (str or Path) – Path to the NIMS DATA.BIN file
- Returns:
Time series run object containing all channels and metadata, or None if reading fails
- Return type:
RunTS or None
Examples
>>> from mth5.io.nims import nims
>>> run_ts = nims.read_nims("/path/to/data.bin")