Make an MTH5 from LEMI data

This notebook provides an example of how to read LEMI (.TXT) files into an MTH5.

[1]:
from mth5.mth5 import MTH5
from mth5.io.lemi import LEMICollection
from mth5 import read_file
2022-09-07 18:07:09,774 [line 135] mth5.setup_logger - INFO: Logging file can be found C:\Users\jpeacock\OneDrive - DOI\Documents\GitHub\mth5\logs\mth5_debug.log

LEMI Collection

We will use LEMICollection to sort the .TXT files into a logical order by schedule action, or run. Each LEMI output file contains the data for every channel.

IMPORTANT: LEMICollection assumes the given file path is for a single station.

Metadata: we need to input the station_id and the survey_id to provide minimal metadata when making an MTH5 file.

LEMICollection.get_runs() returns a two-level ordered dictionary (OrderedDict). The first level is keyed by station ID, and each value is itself an ordered dictionary keyed by run ID. You can therefore loop over stations and then over the runs of each station.
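
As a minimal sketch of that two-level layout, the loop below walks a hand-built OrderedDict. The file names are hypothetical, and plain lists stand in for the per-run tables that get_runs() returns, so the snippet runs without mth5 installed.

```python
from collections import OrderedDict

# Stand-in for the result of LEMICollection.get_runs(). The real inner
# values are per-run tables; here they are plain lists of hypothetical
# file names so the sketch has no third-party dependencies.
runs = OrderedDict(
    mt001=OrderedDict(
        sr1_0001=["file_a.TXT"],
        sr1_0002=["file_b.TXT", "file_c.TXT"],
    )
)

visited = []
# First level: station ID. Second level: run ID.
for station_id, station_runs in runs.items():
    for run_id, run_files in station_runs.items():
        visited.append((station_id, run_id, len(run_files)))
        print(f"{station_id} {run_id}: {len(run_files)} file(s)")
```

The same nested loop shape is what is needed when adding stations and runs to an MTH5 file.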

Note: n_samples is an estimate based on file size, not on the data. To get an accurate number, read in the full file.

[2]:
zc = LEMICollection(r"c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0110")
zc.station_id = "mt001"
zc.survey_id = "test"
runs = zc.get_runs(sample_rates=[1])
print(f"Found {len(runs)} station with {len(runs[list(runs.keys())[0]])} runs")
Found 1 station with 5 runs
[3]:
for run_id, run_df in runs[zc.station_id].items():
    display(run_df)
survey station run start end channel_id component fn sample_rate file_size n_samples sequence_number instrument_id calibration_fn
0 test mt001 sr1_0001 2020-09-30 20:21:00+00:00 2020-09-30 20:28:15+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 66272 436 0 LEMI424 None
survey station run start end channel_id component fn sample_rate file_size n_samples sequence_number instrument_id calibration_fn
1 test mt001 sr1_0002 2020-09-30 20:29:00+00:00 2020-09-30 20:42:16+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 121144 797 0 LEMI424 None
survey station run start end channel_id component fn sample_rate file_size n_samples sequence_number instrument_id calibration_fn
2 test mt001 sr1_0003 2020-09-30 20:54:00+00:00 2020-09-30 21:11:01+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 155344 1022 0 LEMI424 None
survey station run start end channel_id component fn sample_rate file_size n_samples sequence_number instrument_id calibration_fn
3 test mt001 sr1_0004 2020-09-30 21:12:00+00:00 2020-09-30 21:13:45+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 16112 106 0 LEMI424 None
survey station run start end channel_id component fn sample_rate file_size n_samples sequence_number instrument_id calibration_fn
4 test mt001 sr1_0005 2020-09-30 21:14:00+00:00 2020-09-30 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 1513920 9960 0 LEMI424 None
5 test mt001 sr1_0005 2020-10-01 00:00:00+00:00 2020-10-01 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 13132800 86400 0 LEMI424 None
6 test mt001 sr1_0005 2020-10-02 00:00:00+00:00 2020-10-02 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 13132800 86400 0 LEMI424 None
7 test mt001 sr1_0005 2020-10-03 00:00:00+00:00 2020-10-03 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 13132800 86400 0 LEMI424 None
8 test mt001 sr1_0005 2020-10-04 00:00:00+00:00 2020-10-04 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 13132800 86400 0 LEMI424 None
9 test mt001 sr1_0005 2020-10-05 00:00:00+00:00 2020-10-05 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 13132801 86400 0 LEMI424 None
10 test mt001 sr1_0005 2020-10-06 00:00:00+00:00 2020-10-06 23:59:59+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 13132800 86400 0 LEMI424 None
11 test mt001 sr1_0005 2020-10-07 00:00:00+00:00 2020-10-07 14:19:46+00:00 1 temperature_e,temperature_h,e1,e2,bx,by,bz c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... 1.0 7841224 51587 0 LEMI424 None
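
The file_size and n_samples columns above are consistent with a fixed line length: for example, 66272 / 436 = 152 bytes per sample. The sketch below reproduces the estimate under that assumption. The 152-byte figure is inferred from this table only, not taken from the mth5 source, and estimate_n_samples is a hypothetical helper name.

```python
# Assumption: every line of a LEMI text file has the same length.
# 152 bytes per line is inferred from the table in this notebook
# (66272 bytes / 436 samples), not from the mth5 source code.
BYTES_PER_LINE = 152


def estimate_n_samples(file_size_bytes, bytes_per_line=BYTES_PER_LINE):
    """Estimate the sample count from the file size alone."""
    return file_size_bytes // bytes_per_line


# (file_size, n_samples) pairs copied from the table above.
listed = [
    (66272, 436),
    (121144, 797),
    (13132800, 86400),
    (13132801, 86400),
    (7841224, 51587),
]

for file_size, n_samples in listed:
    print(file_size, estimate_n_samples(file_size), n_samples)
```

Because the estimate never opens the file, it can be off whenever a file contains a partial or irregular line, such as the 13132801-byte file that is one byte longer than its neighbours.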

Build MTH5

Now that we have a logical collection of files, let's load them into an MTH5. We will simply loop over the stations, runs, and channels in the ordered dictionary.

There are a few things to keep in mind:

  • The LEMI raw files come with very little metadata, so as a user you will have to manually input most of it.

  • The output files from a LEMI are already calibrated into units of nT and mV/km (I think), therefore there are no filters to apply to calibrate the data.

  • Since this is an MTH5 file version 0.2.0, the filters are in the survey_group, so add them there.

[4]:
m = MTH5()
m.open_mth5(zc.file_path.joinpath("from_lemi.h5"))
2022-09-07 18:07:12,663 [line 663] mth5.mth5.MTH5._initialize_file - INFO: Initialized MTH5 0.2.0 file c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0110\from_lemi.h5 in mode a
[5]:
survey_group = m.add_survey(zc.survey_id)
[6]:
%%time
for station_id in runs.keys():
    station_group = survey_group.stations_group.add_station(station_id)
    for run_id, run_df in runs[station_id].items():
        run_group = station_group.add_run(run_id)
        # Read every file that belongs to this run, then store it in the run group
        run_ts = read_file(run_df.fn.to_list())
        run_group.from_runts(run_ts)
    # Update the station metadata from the last run read and write it to the file
    station_group.metadata.update(run_ts.station_metadata)
    station_group.write_metadata()
Wall time: 42 s
[7]:
%%time
station_group.validate_station_metadata()
station_group.write_metadata()

survey_group.update_survey_metadata()
survey_group.write_metadata()
Wall time: 27.3 s

MTH5 Structure

Have a look at the MTH5 structure and make sure it looks correct.

[8]:
m
[8]:
/:
====================
    |- Group: Experiment
    --------------------
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Surveys
        -----------------
            |- Group: test
            --------------
                |- Group: Filters
                -----------------
                    |- Group: coefficient
                    ---------------------
                    |- Group: fap
                    -------------
                    |- Group: fir
                    -------------
                    |- Group: time_delay
                    --------------------
                    |- Group: zpk
                    -------------
                |- Group: Reports
                -----------------
                |- Group: Standards
                -------------------
                    --> Dataset: summary
                    ......................
                |- Group: Stations
                ------------------
                    |- Group: mt001
                    ---------------
                        |- Group: Transfer_Functions
                        ----------------------------
                        |- Group: sr1_0001
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0002
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0003
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0004
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0005
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
        --> Dataset: channel_summary
        ..............................
        --> Dataset: tf_summary
        .........................

Channel Summary

Have a look at the channel summary and make sure everything looks good.

[9]:
m.channel_summary.summarize()
m.channel_summary.to_dataframe()
[9]:
survey station run latitude longitude elevation component start end n_samples sample_rate measurement_type azimuth tilt units hdf5_reference run_hdf5_reference station_hdf5_reference
0 test mt001 a 34.080655 -107.214079 2202.8 bx 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
1 test mt001 a 34.080655 -107.214079 2202.8 by 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
2 test mt001 a 34.080655 -107.214079 2202.8 bz 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
3 test mt001 a 34.080655 -107.214079 2202.8 e1 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
4 test mt001 a 34.080655 -107.214079 2202.8 e2 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
5 test mt001 a 34.080655 -107.214079 2202.8 temperature_e 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
6 test mt001 a 34.080655 -107.214079 2202.8 temperature_h 2020-09-30 20:21:00+00:00 2020-09-30 20:28:16+00:00 436 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
7 test mt001 a 34.080655 -107.214079 2202.8 bx 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
8 test mt001 a 34.080655 -107.214079 2202.8 by 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
9 test mt001 a 34.080655 -107.214079 2202.8 bz 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
10 test mt001 a 34.080655 -107.214079 2202.8 e1 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
11 test mt001 a 34.080655 -107.214079 2202.8 e2 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
12 test mt001 a 34.080655 -107.214079 2202.8 temperature_e 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
13 test mt001 a 34.080655 -107.214079 2202.8 temperature_h 2020-09-30 20:29:00+00:00 2020-09-30 20:42:17+00:00 797 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
14 test mt001 a 34.080655 -107.214079 2202.8 bx 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
15 test mt001 a 34.080655 -107.214079 2202.8 by 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
16 test mt001 a 34.080655 -107.214079 2202.8 bz 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
17 test mt001 a 34.080655 -107.214079 2202.8 e1 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
18 test mt001 a 34.080655 -107.214079 2202.8 e2 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
19 test mt001 a 34.080655 -107.214079 2202.8 temperature_e 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
20 test mt001 a 34.080655 -107.214079 2202.8 temperature_h 2020-09-30 20:54:00+00:00 2020-09-30 21:11:02+00:00 1022 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
21 test mt001 a 34.080655 -107.214079 2202.8 bx 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
22 test mt001 a 34.080655 -107.214079 2202.8 by 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
23 test mt001 a 34.080655 -107.214079 2202.8 bz 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
24 test mt001 a 34.080655 -107.214079 2202.8 e1 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
25 test mt001 a 34.080655 -107.214079 2202.8 e2 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
26 test mt001 a 34.080655 -107.214079 2202.8 temperature_e 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
27 test mt001 a 34.080655 -107.214079 2202.8 temperature_h 2020-09-30 21:12:00+00:00 2020-09-30 21:13:46+00:00 106 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
28 test mt001 a 34.080655 -107.214079 2202.8 bx 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
29 test mt001 a 34.080655 -107.214079 2202.8 by 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
30 test mt001 a 34.080655 -107.214079 2202.8 bz 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 magnetic 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
31 test mt001 a 34.080655 -107.214079 2202.8 e1 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
32 test mt001 a 34.080655 -107.214079 2202.8 e2 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 electric 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
33 test mt001 a 34.080655 -107.214079 2202.8 temperature_e 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
34 test mt001 a 34.080655 -107.214079 2202.8 temperature_h 2020-09-30 21:14:00+00:00 2020-10-07 17:05:47+00:00 589907 1.0 auxiliary 0.0 0.0 none <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
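
One quick check on a summary like this is whether n_samples agrees with the time span and sample rate. The sketch below does that for two rows copied from the table above, using only the standard library. The helper name expected_n_samples is hypothetical, and the rule that end minus start equals n_samples / sample_rate is an assumption drawn from these rows rather than a documented guarantee.

```python
from datetime import datetime


def expected_n_samples(start, end, sample_rate):
    """Number of samples implied by the start and end times."""
    duration = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return round(duration.total_seconds() * sample_rate)


# (start, end, n_samples, sample_rate) copied from the channel summary.
rows = [
    ("2020-09-30 20:21:00+00:00", "2020-09-30 20:28:16+00:00", 436, 1.0),
    ("2020-09-30 21:14:00+00:00", "2020-10-07 17:05:47+00:00", 589907, 1.0),
]

mismatches = [
    row for row in rows
    if expected_n_samples(row[0], row[1], row[3]) != row[2]
]
print(f"{len(mismatches)} mismatched row(s)")
```

The same comparison could be applied to the columns of the DataFrame returned by to_dataframe().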

Close the MTH5

This is important: close the file when you are done using it. Otherwise, bad things can happen if you try to open it with another program or Python interpreter.

[10]:
m.close_mth5()
2022-09-07 18:08:23,271 [line 744] mth5.mth5.MTH5.close_mth5 - INFO: Flushing and closing c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0110\from_lemi.h5
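
One way to make sure close_mth5() runs even when an error is raised partway through is to wrap the open and close calls in a small context manager. This is only a sketch: the helper name opened and the FakeMTH5 stand-in class are hypothetical, and only the open_mth5 and close_mth5 method names come from this notebook.

```python
from contextlib import contextmanager


@contextmanager
def opened(mth5_obj, path):
    """Open mth5_obj at path and guarantee it is closed afterwards."""
    mth5_obj.open_mth5(path)
    try:
        yield mth5_obj
    finally:
        # Runs on normal exit and when an exception propagates.
        mth5_obj.close_mth5()


class FakeMTH5:
    """Hypothetical stand-in that only records open and close calls."""

    def __init__(self):
        self.is_open = False

    def open_mth5(self, path):
        self.is_open = True

    def close_mth5(self):
        self.is_open = False


fake = FakeMTH5()
try:
    with opened(fake, "from_lemi.h5") as m:
        raise RuntimeError("simulated failure while writing")
except RuntimeError:
    pass

print(fake.is_open)
```

With a real MTH5 object in place of the stand-in, the body of the with block would hold the loops that add stations and runs.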