Make an MTH5 from LEMI data
This notebook provides an example of how to read LEMI (.TXT) files into an MTH5.
[1]:
from mth5.mth5 import MTH5
from mth5.io.lemi import LEMICollection
from mth5 import read_file
2022-09-07 18:07:09,774 [line 135] mth5.setup_logger - INFO: Logging file can be found C:\Users\jpeacock\OneDrive - DOI\Documents\GitHub\mth5\logs\mth5_debug.log
LEMI Collection
We will use the LEMICollection to assemble the .txt files into a logical order by schedule action or run. The output LEMI files include all data for each channel.
IMPORTANT: LEMICollection assumes the given file path is for a single station.
Metadata: we need to input the station_id and the survey_id to provide minimal metadata when making an MTH5 file.
LEMICollection.get_runs() returns a two-level ordered dictionary (OrderedDict). The first level is keyed by station ID, and each value is in turn an ordered dictionary keyed by run ID. Therefore you can loop over stations and runs.
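The nested layout described above can be sketched with plain Python containers. This is only an illustration: `example_runs` is a hand-built stand-in whose values are lists of plain dicts rather than the DataFrames that get_runs() returns, and the file names are hypothetical.

```python
from collections import OrderedDict

# Hand-built stand-in for the get_runs() result: station ID -> run ID -> rows.
# The real values are DataFrames; plain dicts are used here so the sketch
# runs without any data files. File names are hypothetical.
example_runs = OrderedDict()
example_runs["mt001"] = OrderedDict()
example_runs["mt001"]["sr1_0001"] = [{"fn": "file_a.TXT", "n_samples": 436}]
example_runs["mt001"]["sr1_0002"] = [{"fn": "file_b.TXT", "n_samples": 797}]

summary = []
# Outer loop: stations. Inner loop: runs for that station, in insertion order.
for station_id, station_runs in example_runs.items():
    for run_id, rows in station_runs.items():
        total = sum(row["n_samples"] for row in rows)
        summary.append((station_id, run_id, len(rows), total))
        print(f"{station_id} {run_id}: {len(rows)} file(s), ~{total} samples")
```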
Note: n_samples is an estimate based on file size, not the data. To get an accurate number you should read in the full file.
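As a rough illustration of a size-based estimate, the file_size and n_samples values shown in the get_runs() output of this notebook are consistent with a fixed record length of 152 bytes per sample. That figure is inferred from those values, not taken from the LEMI or MTH5 documentation, and `estimate_n_samples` is a hypothetical helper rather than part of the library.

```python
# Record length inferred from the file_size / n_samples columns in this
# notebook's output (for example 66272 / 436); an assumption, not a
# documented constant.
BYTES_PER_RECORD = 152


def estimate_n_samples(file_size, bytes_per_record=BYTES_PER_RECORD):
    """Estimate the number of samples from a file size in bytes."""
    return file_size // bytes_per_record


# File sizes copied from the run tables in this notebook.
for size in (66272, 121144, 13132800, 13132801):
    print(size, estimate_n_samples(size))
```

An estimate like this cannot detect missing or malformed lines, which is why reading the full file is the only way to get an exact count.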
[2]:
zc = LEMICollection(r"c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0110")
zc.station_id = "mt001"
zc.survey_id = "test"
runs = zc.get_runs(sample_rates=[1])
print(f"Found {len(runs)} station with {len(runs[list(runs.keys())[0]])} runs")
Found 1 station with 5 runs
[3]:
for run_id, run_df in runs[zc.station_id].items():
    display(run_df)
survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | test | mt001 | sr1_0001 | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:15+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 66272 | 436 | 0 | LEMI424 | None |
survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | test | mt001 | sr1_0002 | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:16+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 121144 | 797 | 0 | LEMI424 | None |
survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | test | mt001 | sr1_0003 | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:01+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 155344 | 1022 | 0 | LEMI424 | None |
survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | test | mt001 | sr1_0004 | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:45+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 16112 | 106 | 0 | LEMI424 | None |
survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | test | mt001 | sr1_0005 | 2020-09-30 21:14:00+00:00 | 2020-09-30 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 1513920 | 9960 | 0 | LEMI424 | None |
5 | test | mt001 | sr1_0005 | 2020-10-01 00:00:00+00:00 | 2020-10-01 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 13132800 | 86400 | 0 | LEMI424 | None |
6 | test | mt001 | sr1_0005 | 2020-10-02 00:00:00+00:00 | 2020-10-02 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 13132800 | 86400 | 0 | LEMI424 | None |
7 | test | mt001 | sr1_0005 | 2020-10-03 00:00:00+00:00 | 2020-10-03 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 13132800 | 86400 | 0 | LEMI424 | None |
8 | test | mt001 | sr1_0005 | 2020-10-04 00:00:00+00:00 | 2020-10-04 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 13132800 | 86400 | 0 | LEMI424 | None |
9 | test | mt001 | sr1_0005 | 2020-10-05 00:00:00+00:00 | 2020-10-05 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 13132801 | 86400 | 0 | LEMI424 | None |
10 | test | mt001 | sr1_0005 | 2020-10-06 00:00:00+00:00 | 2020-10-06 23:59:59+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 13132800 | 86400 | 0 | LEMI424 | None |
11 | test | mt001 | sr1_0005 | 2020-10-07 00:00:00+00:00 | 2020-10-07 14:19:46+00:00 | 1 | temperature_e,temperature_h,e1,e2,bx,by,bz | c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0... | 1.0 | 7841224 | 51587 | 0 | LEMI424 | None |
Build MTH5
Now that we have a logical collection of files, let's load them into an MTH5. We will simply loop over the stations, runs, and channels in the ordered dictionary.
There are a few things to keep in mind:
The LEMI raw files come with very little metadata, so as a user you will have to manually input most of it.
The output files from a LEMI are already calibrated into units of nT and mV/km (I think), therefore there are no filters to apply to calibrate the data.
Since this is an MTH5 file version 0.2.0, the filters are in the survey_group, so add them there.
[4]:
m = MTH5()
m.open_mth5(zc.file_path.joinpath("from_lemi.h5"))
2022-09-07 18:07:12,663 [line 663] mth5.mth5.MTH5._initialize_file - INFO: Initialized MTH5 0.2.0 file c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0110\from_lemi.h5 in mode a
[5]:
survey_group = m.add_survey(zc.survey_id)
[6]:
%%time
for station_id in runs.keys():
    station_group = survey_group.stations_group.add_station(station_id)
    for run_id, run_df in runs[station_id].items():
        run_group = station_group.add_run(run_id)
        run_ts = read_file(run_df.fn.to_list())
        run_group.from_runts(run_ts)
    station_group.metadata.update(run_ts.station_metadata)
    station_group.write_metadata()
Wall time: 42 s
[7]:
%%time
station_group.validate_station_metadata()
station_group.write_metadata()
survey_group.update_survey_metadata()
survey_group.write_metadata()
Wall time: 27.3 s
MTH5 Structure
Have a look at the MTH5 structure and make sure it looks correct.
[8]:
m
[8]:
/:
====================
    |- Group: Experiment
    --------------------
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Surveys
        -----------------
            |- Group: test
            --------------
                |- Group: Filters
                -----------------
                    |- Group: coefficient
                    ---------------------
                    |- Group: fap
                    -------------
                    |- Group: fir
                    -------------
                    |- Group: time_delay
                    --------------------
                    |- Group: zpk
                    -------------
                |- Group: Reports
                -----------------
                |- Group: Standards
                -------------------
                    --> Dataset: summary
                    ......................
                |- Group: Stations
                ------------------
                    |- Group: mt001
                    ---------------
                        |- Group: Transfer_Functions
                        ----------------------------
                        |- Group: sr1_0001
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0002
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0003
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0004
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
                        |- Group: sr1_0005
                        ------------------
                            --> Dataset: bx
                            .................
                            --> Dataset: by
                            .................
                            --> Dataset: bz
                            .................
                            --> Dataset: e1
                            .................
                            --> Dataset: e2
                            .................
                            --> Dataset: temperature_e
                            ............................
                            --> Dataset: temperature_h
                            ............................
        --> Dataset: channel_summary
        ..............................
        --> Dataset: tf_summary
        .........................
Channel Summary
Have a look at the channel summary and make sure everything looks good.
[9]:
m.channel_summary.summarize()
m.channel_summary.to_dataframe()
[9]:
survey | station | run | latitude | longitude | elevation | component | start | end | n_samples | sample_rate | measurement_type | azimuth | tilt | units | hdf5_reference | run_hdf5_reference | station_hdf5_reference | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bx | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
1 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | by | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
2 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bz | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
3 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e1 | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
4 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e2 | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
5 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_e | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
6 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_h | 2020-09-30 20:21:00+00:00 | 2020-09-30 20:28:16+00:00 | 436 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
7 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bx | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
8 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | by | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
9 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bz | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
10 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e1 | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
11 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e2 | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
12 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_e | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
13 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_h | 2020-09-30 20:29:00+00:00 | 2020-09-30 20:42:17+00:00 | 797 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
14 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bx | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
15 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | by | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
16 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bz | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
17 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e1 | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
18 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e2 | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
19 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_e | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
20 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_h | 2020-09-30 20:54:00+00:00 | 2020-09-30 21:11:02+00:00 | 1022 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
21 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bx | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
22 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | by | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
23 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bz | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
24 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e1 | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
25 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e2 | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
26 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_e | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
27 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_h | 2020-09-30 21:12:00+00:00 | 2020-09-30 21:13:46+00:00 | 106 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
28 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bx | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
29 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | by | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
30 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | bz | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | magnetic | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
31 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e1 | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
32 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | e2 | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | electric | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
33 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_e | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
34 | test | mt001 | a | 34.080655 | -107.214079 | 2202.8 | temperature_h | 2020-09-30 21:14:00+00:00 | 2020-10-07 17:05:47+00:00 | 589907 | 1.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
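One check that can be done by eye or in code is that each row's time span agrees with its sample count. In the summary above, the difference between end and start in seconds, multiplied by sample_rate, equals n_samples. The sketch below repeats that arithmetic with values copied from the table; `span_matches` is a hypothetical helper, and the rule it encodes is read off this output rather than taken from the MTH5 documentation.

```python
from datetime import datetime


def span_matches(start, end, n_samples, sample_rate):
    """Return True if (end - start) in seconds times sample_rate equals n_samples."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return round(delta.total_seconds() * sample_rate) == n_samples


# (start, end, n_samples, sample_rate) copied from rows of the channel summary.
rows = [
    ("2020-09-30 20:21:00+00:00", "2020-09-30 20:28:16+00:00", 436, 1.0),
    ("2020-09-30 20:29:00+00:00", "2020-09-30 20:42:17+00:00", 797, 1.0),
    ("2020-09-30 21:14:00+00:00", "2020-10-07 17:05:47+00:00", 589907, 1.0),
]
for start, end, n_samples, sample_rate in rows:
    print(start, n_samples, span_matches(start, end, n_samples, sample_rate))
```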
Close the MTH5
This is important: you should close the file after you are done using it. Otherwise bad things can happen if you try to open it with another program or Python interpreter.
[10]:
m.close_mth5()
2022-09-07 18:08:23,271 [line 744] mth5.mth5.MTH5.close_mth5 - INFO: Flushing and closing c:\Users\jpeacock\OneDrive - DOI\mt\lemi\DATA0110\from_lemi.h5
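One way to make sure the file is always closed, even if an earlier step raises an error, is to wrap the work in try/finally. The sketch below uses a small stand-in class so that it runs without MTH5 installed; `FakeMTH5` and `build` are hypothetical, and only the open_mth5 and close_mth5 method names are taken from the cells above.

```python
class FakeMTH5:
    """Stand-in with the same open/close method names used in this notebook."""

    def __init__(self):
        self.is_open = False

    def open_mth5(self, path):
        self.is_open = True

    def close_mth5(self):
        self.is_open = False


def build(m, path, fail=False):
    """Open the file, do some work, and close it no matter what happens."""
    m.open_mth5(path)
    try:
        if fail:
            # Simulate an error part-way through writing the file.
            raise RuntimeError("simulated failure while writing")
    finally:
        # Runs on success and on error, so the file is never left open.
        m.close_mth5()


m_demo = FakeMTH5()
try:
    build(m_demo, "from_lemi.h5", fail=True)
except RuntimeError:
    pass
print(m_demo.is_open)
```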