mth5.processing.run_summary
This module contains the RunSummary class.
This is a helper class that summarizes the Runs in an mth5.
TODO: This class and methods could be replaced by methods in MTH5.
Functionality of RunSummary():
1. The user can get a list of local_station options, which correspond to unique pairs of values: (survey, station).
2. The user can see all possible ways of processing the data: one list per (survey, station) pair in the run_summary.

Some of the following functionality may end up in KernelDataset:
3. The user can select a local_station. This can trigger a reduction of runs to only those from the local station and the simultaneous runs at other stations.
4. Given a local station, a list of possible reference stations can be generated.
5. Given a remote reference station, a list of all relevant runs, truncated to maximize coverage of the local station runs, is generated.
6. Given such a “restricted run list”, runs can be dropped.
7. Time interval endpoints can be changed.
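As a rough illustration of items 1 and 3, the sketch below works on a plain pandas DataFrame rather than on a RunSummary instance. The column names (survey, station, run, start, end) and both helper functions are assumptions made for this example; they are not taken from the mth5 API.

```python
import pandas as pd

# Hypothetical run-summary table; the column names are assumptions.
df = pd.DataFrame(
    {
        "survey": ["s1", "s1", "s1", "s1"],
        "station": ["A", "A", "B", "C"],
        "run": ["001", "002", "001", "001"],
        "start": pd.to_datetime(
            ["2020-01-01 00:00", "2020-01-02 00:00",
             "2020-01-01 06:00", "2020-01-05 00:00"]
        ),
        "end": pd.to_datetime(
            ["2020-01-01 12:00", "2020-01-02 12:00",
             "2020-01-01 18:00", "2020-01-05 12:00"]
        ),
    }
)


def local_station_options(df):
    """Return the unique (survey, station) pairs as a list of tuples."""
    pairs = df[["survey", "station"]].drop_duplicates()
    return list(pairs.itertuples(index=False, name=None))


def simultaneous_runs(df, survey, station):
    """Return runs at other stations that overlap any run at the local station."""
    is_local = (df["survey"] == survey) & (df["station"] == station)
    local = df[is_local]
    others = df[~is_local]
    keep = []
    for idx, row in others.iterrows():
        # Two intervals overlap when each one starts before the other ends.
        overlaps = (local["start"] < row["end"]) & (row["start"] < local["end"])
        if overlaps.any():
            keep.append(idx)
    return others.loc[keep]


options = local_station_options(df)
remote = simultaneous_runs(df, "s1", "A")
```

Here station B qualifies as a possible reference for station A because its run overlaps one of A's runs in time, while station C does not.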
- Development Notes:
- TODO: consider adding methods:
“drop_runs_shorter_than”: removes short runs from the summary
“fill_gaps_by_time_interval”: allows runs to be merged if the gaps between them are short
“fill_gaps_by_run_names”: allows runs to be merged if the gaps between them are short
TODO: Consider whether this should return a copy or modify in-place when querying the df.
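One possible shape for the proposed drop_runs_shorter_than is sketched below. It is not part of the module: the function signature, the start and end column names, and the choice to return a copy instead of modifying the table in place are all assumptions made for illustration.

```python
import pandas as pd


def drop_runs_shorter_than(df, min_duration_s):
    """Return a copy of df without runs shorter than min_duration_s seconds.

    Hypothetical helper: assumes datetime columns named "start" and "end".
    """
    duration = (df["end"] - df["start"]).dt.total_seconds()
    return df[duration >= min_duration_s].copy()


df = pd.DataFrame(
    {
        "run": ["001", "002"],
        "start": pd.to_datetime(["2020-01-01 00:00:00", "2020-01-01 01:00:00"]),
        "end": pd.to_datetime(["2020-01-01 00:00:30", "2020-01-01 03:00:00"]),
    }
)

# Run 001 lasts 30 s and is dropped; run 002 lasts 2 h and is kept.
long_runs = drop_runs_shorter_than(df, 60)
```

Returning a copy leaves the original table untouched, which is one answer to the copy versus in-place question raised above. An in-place variant would reassign the filtered frame to the object's df attribute instead.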
Classes
RunSummary | Class to contain a run-summary table from one or more mth5s.
Functions
extract_run_summaries_from_mth5s | Given a list of mth5's, iterate over them, extracting run_summaries and merging into one big table.
Module Contents
- class mth5.processing.run_summary.RunSummary(input_dict: dict | None = None, df: pandas.DataFrame | None = None)
Class to contain a run-summary table from one or more mth5s.
WIP: For the full MMT case this may need modification to a channel based summary.
- from_mth5s(mth5_list) → list
Iterates over the mth5s in the list and creates one big dataframe summarizing the runs.
- property mini_summary: pandas.DataFrame
Shows the dataframe with only a few columns for readability.
- mth5.processing.run_summary.extract_run_summaries_from_mth5s(mth5_list, summary_type='run', deduplicate=True)
Given a list of mth5’s, iterate over them, extracting run_summaries and merging into one big table.
Development Notes:
- ToDo: Move this method into mth5 or mth5_helpers?
- ToDo: Make this a class so that the __repr__ is a nice visual representation of the df, like what channel summary does in mth5.
- 2022-05-28: Modified to allow this method to accept mth5 objects as well as the already supported types of pathlib.Path or str.
In order to drop duplicates I used the solution here: https://stackoverflow.com/questions/43855462/pandas-drop-duplicates-method-not-working-on-dataframe-containing-lists
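The underlying issue is that pandas cannot hash list-valued cells, so drop_duplicates typically fails on such a frame. A minimal sketch of the string-cast workaround follows. The table contents and column names are invented for this example, and the function's actual implementation may differ in detail.

```python
import pandas as pd

# Two hypothetical per-file summaries that share one identical row.
a = pd.DataFrame(
    {"station": ["A"], "run": ["001"], "input_channels": [["hx", "hy"]]}
)
b = pd.DataFrame(
    {
        "station": ["A", "B"],
        "run": ["001", "001"],
        "input_channels": [["hx", "hy"], ["hx", "hy"]],
    }
)

merged = pd.concat([a, b], ignore_index=True)

# Calling merged.drop_duplicates() directly would fail because lists are
# unhashable. Casting every cell to str gives hashable values; the surviving
# index labels are then used to select rows from the original frame, so the
# list-valued cells are preserved as lists.
unique_index = merged.astype(str).drop_duplicates().index
deduped = merged.loc[unique_index].reset_index(drop=True)
```

The string cast is only used to decide which rows to keep. One caveat of this approach is that two cells count as duplicates whenever their string representations match.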
- Parameters:
mth5_list (list) – Paths or strings that point to mth5s; mth5 objects are also accepted.
summary_type (string, optional) – One of [“channel”, “run”]. “channel” returns the concatenated channel summary; “run” returns the concatenated run summary. By default, “run”.
deduplicate (bool, optional) – Whether to drop duplicate rows from the merged table. By default, True.
- Returns:
super_summary – Given a list of mth5s, a dataframe of all available runs.
- Return type:
pd.DataFrame