ecg_analysis package¶
Subpackages¶
Submodules¶
ecg_analysis.calculations module¶
- ecg_analysis.calculations.dict2json(my_dict: dict, folder: str = 'files')¶
Takes a dict and folder name and writes dict data to file in folder
Takes a dictionary and writes it to a JSON file in a folder. The folder is called ‘files’ unless specified otherwise. The base name of the file is given by the ‘filename’ key in the dictionary, which is not included in the JSON itself.
- Parameters
my_dict (dict) – dictionary of items to write to the JSON file
folder (str) – folder to save the files in. Default is ‘files’
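The behaviour described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the name dict_to_json_sketch is hypothetical, and the ‘.json’ extension, the creation of a missing folder, and the returned path are assumptions.

```python
import json
import os


def dict_to_json_sketch(my_dict: dict, folder: str = "files") -> str:
    """Write my_dict to <folder>/<filename>.json and return the path (hypothetical helper)."""
    # Work on a copy so the caller's dictionary is left untouched.
    payload = dict(my_dict)
    # The 'filename' entry names the file and is left out of the JSON body.
    base_name = payload.pop("filename")
    # Assumption: the folder is created when it does not exist yet.
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, f"{base_name}.json")
    with open(path, "w") as handle:
        json.dump(payload, handle, indent=2)
    return path
```

Copying the dictionary before removing the ‘filename’ key keeps the caller's data intact, which matters if the same dictionary is reused after saving.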
- ecg_analysis.calculations.get_metrics(data: pandas.core.frame.DataFrame, t_key: str = 'time', v_key: str = 'voltage', rounding: int = 3) dict ¶
Calculates all relevant metrics in an ECG data set
This function takes a DataFrame with the columns named by ‘t_key’ and ‘v_key’, which default to ‘time’ and ‘voltage’ respectively. It returns a dictionary that stores the duration, voltage extremes, filename, number of beats, beats per minute, and a list of beat times, each under its own key. Each numeric metric is rounded to three decimal places by default, which can be changed by passing an integer to the ‘rounding’ parameter.
- Parameters
data (DataFrame) – A pandas dataframe that contains the fields t_key and v_key
t_key (str) – String indicating the name of the time column in data. ‘time’ by default.
v_key (str) – String indicating the name of the voltage column in data. ‘voltage’ by default.
rounding (int) – An integer indicating the number of decimals to round to. 3 by default
- Returns
A dictionary with the keys: duration, beats, extremes, filename, num_beats, mean_hr_bpm
- Return type
dict
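As a rough illustration of how such a dictionary could be assembled, the sketch below works on plain sequences and takes the beat times as an input, since the beat detection method is not described here. The name summarize_ecg_sketch is hypothetical, and the formulas are assumptions: time is taken to be in seconds, duration is the last time minus the first, extremes is a (min, max) pair, and the mean heart rate is 60 times the number of beats divided by the duration. The filename entry is omitted.

```python
from typing import Sequence


def summarize_ecg_sketch(
    times: Sequence[float],
    voltages: Sequence[float],
    beats: Sequence[float],
    rounding: int = 3,
) -> dict:
    """Build a metrics dictionary from time, voltage, and beat-time sequences."""
    # Assumption: duration is the span between the first and last sample.
    duration = times[-1] - times[0]
    num_beats = len(beats)
    # Assumption: time is in seconds, so beats per second is scaled by 60.
    mean_hr_bpm = 60 * num_beats / duration
    return {
        "duration": round(duration, rounding),
        "extremes": (round(min(voltages), rounding), round(max(voltages), rounding)),
        "num_beats": num_beats,
        "mean_hr_bpm": round(mean_hr_bpm, rounding),
        "beats": [round(b, rounding) for b in beats],
    }
```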
- ecg_analysis.calculations.remove_dir(directory: str)¶
Takes a folder name and removes it, even if it contains files
Deletes the selected folder by listing through the files and deleting them one by one and then deleting the empty folder
- Parameters
directory (str) – Path of the directory to remove, together with its contents
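The file-by-file deletion described above might look like the sketch below. The name remove_dir_sketch is hypothetical, and the sketch assumes the folder holds only files; a folder with nested subfolders would need something like shutil.rmtree from the standard library instead.

```python
import os


def remove_dir_sketch(directory: str) -> None:
    """Delete every file in directory, then the directory itself (hypothetical helper)."""
    # os.rmdir only succeeds on an empty folder, so the files go first.
    for name in os.listdir(directory):
        os.remove(os.path.join(directory, name))
    os.rmdir(directory)
```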
ecg_analysis.ecg_reader module¶
- ecg_analysis.ecg_reader.apply_to_df(my_data: pandas.core.frame.DataFrame, func: object, invert=False)¶
Applies given boolean function to DataFrame rows and removes False rows
Takes a DataFrame my_data and applies a function func element by element to obtain a boolean value for each element. For each row, it then checks the results across all columns. If invert is False, it deletes any row where func is false in at least one column. Conversely, if invert is True, it deletes every row where func is true in at least one column.
- Parameters
my_data (DataFrame) – DataFrame where func is applied element-wise
func (object) – function passed to apply. Takes in a DataFrame element and returns a boolean value
invert (bool) – Indication of whether to delete rows where the func is true or false in at least one column
- Returns
DataFrame where func returns true or false (depending on invert) for every column
- Return type
DataFrame
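The row-filtering rule can be sketched with pandas as below. This is an illustration of the described behaviour rather than the package's code; apply_to_df_sketch is a hypothetical name, and the sketch assumes func returns a plain boolean for every element.

```python
import pandas as pd


def apply_to_df_sketch(my_data: pd.DataFrame, func, invert: bool = False) -> pd.DataFrame:
    """Keep rows according to an element-wise boolean test (hypothetical helper)."""
    # Evaluate func on every element, column by column, giving a boolean frame.
    mask = my_data.apply(lambda col: col.map(func))
    if invert:
        # Drop a row as soon as func is true in at least one column.
        keep = ~mask.any(axis=1)
    else:
        # Drop a row as soon as func is false in at least one column.
        keep = mask.all(axis=1)
    return my_data[keep]
```

With invert left at False, only rows that pass the test in every column survive; with invert set to True, only rows that fail it in every column survive.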
- ecg_analysis.ecg_reader.check_range(data_set: pandas.core.series.Series, filename: str, upper: Union[float, int], lower: Union[float, int])¶
Checks to see if a value of a given Series is outside a given range
Takes a Series and logs an error if there are any values above the param upper or below the param lower.
- Parameters
data_set (Series) – Series to check the range of its values
filename (str) – File name to indicate in the error log
upper (Union[float, int]) – Upper limit of the range. Log if any values are above this.
lower (Union[float, int]) – Lower limit of the range. Log if any values are below this.
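A minimal sketch of such a range check, using the standard logging module, is shown below. The name check_range_sketch and the log message wording are hypothetical, and the boolean return value is added only so the result is easy to inspect; the documented function is described as logging only.

```python
import logging
from typing import Iterable, Union

Number = Union[float, int]


def check_range_sketch(
    data_set: Iterable[Number], filename: str, upper: Number, lower: Number
) -> bool:
    """Log an error if any value lies above upper or below lower."""
    # Values equal to a limit are treated as inside the range.
    out_of_range = any(v > upper or v < lower for v in data_set)
    if out_of_range:
        logging.error("%s: values outside the range [%s, %s]", filename, lower, upper)
    return out_of_range
```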
- ecg_analysis.ecg_reader.clean_data(my_data: pandas.core.frame.DataFrame)¶
Cleans missing, nan, and non numeric values from a DataFrame
Takes a DataFrame and checks each element for a missing, nan, or non numeric value, removing the whole row if one is found. Returns the cleaned DataFrame and logs the row indices and DataFrame names of erroneous rows.
- Parameters
my_data (DataFrame) – DataFrame to be cleaned
- Returns
DataFrame without any missing, nan, or non numeric values
- Return type
DataFrame
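One way to picture the cleaning step is the sketch below, which tries to convert every element to a float and drops any row where a conversion fails or yields nan. This is a swapped-in approach for illustration, not necessarily how the package does it; clean_data_sketch and _as_float are hypothetical names, and the log message is an assumption.

```python
import logging
import math

import pandas as pd


def _as_float(value):
    """Return value as a float, or nan when it is missing or non numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def clean_data_sketch(my_data: pd.DataFrame) -> pd.DataFrame:
    """Drop rows holding a missing, nan, or non numeric element (hypothetical helper)."""
    # Convert element-wise; anything unusable becomes nan.
    numeric = my_data.apply(lambda col: col.map(_as_float))
    bad_rows = numeric.isna().any(axis=1)
    if bad_rows.any():
        logging.error("Dropping rows with bad values: %s", list(bad_rows[bad_rows].index))
    # Return the original (unconverted) rows that passed the check.
    return my_data[~bad_rows]
```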
- ecg_analysis.ecg_reader.filter_data(my_data: pandas.core.series.Series, first_samp: Union[float, int], last_samp: Union[float, int], high: Union[float, int], low: Union[float, int], **kwargs) numpy.ndarray ¶
Filter ECG data using a band pass FIR filter with reflected padding
Takes a pandas Series holding the voltage values of an ECG and filters it using a finite impulse response (FIR) filter. The padding is reflected to avoid excessive attenuation or inversion of the signal. It uses the first and last time points to calculate the sample rate, and takes adjustable high and low frequencies for the band pass. This function uses mne as its backend; the mne documentation is linked below. Keyword arguments described in that documentation can be passed through this function call.
link: https://mne.tools/stable/generated/mne.filter.filter_data.html
- Parameters
my_data (Series) – Series data describing the voltage time-series
first_samp (Union[float, int]) – number indicating the time at which the first sample was taken
last_samp (Union[float, int]) – number indicating the time at which the last sample was taken
high (Union[float, int]) – The top of the range of the band pass filter. Must be higher than the low parameter
low (Union[float, int]) – The bottom of the range of the band pass filter. Must be lower than the high parameter
- Returns
numpy array of the filtered data
- Return type
np.ndarray
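The sample-rate step can be illustrated with the sketch below. The exact formula the package uses is not stated, so this is an assumption: sampling is taken to be uniform, time is in seconds, and N samples span N - 1 intervals. The name sample_rate_sketch is hypothetical. A value computed this way is the kind of number that mne.filter.filter_data expects as its sfreq argument.

```python
def sample_rate_sketch(n_samples: int, first_samp: float, last_samp: float) -> float:
    """Estimate the sampling frequency in Hz from the first and last time points."""
    # n_samples points are separated by n_samples - 1 sampling intervals.
    return (n_samples - 1) / (last_samp - first_samp)
```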
- ecg_analysis.ecg_reader.is_mt_str(my_str) bool ¶
Check to see if the given input is an empty string
Function that checks the given parameter my_str to see if it is an empty string. Uses the innate identity __eq__ to check.
- Parameters
my_str (str) – String to check if it is empty
- Returns
Boolean indicating if the input is an empty string
- Return type
bool
- ecg_analysis.ecg_reader.is_nan(x: Union[int, float, str]) bool ¶
Check to see if x is a nan value
Check to see if the given parameter x is a nan value. Similar to np.isnan, but with the capability to read strings as well as number types. Attempts to convert strings to floats to do so.
- Parameters
x (Union[int, float, str]) – Input to check if it is nan
- Returns
Boolean indicating whether the input is nan
- Return type
bool
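A minimal sketch of a nan check that also accepts strings is shown below. The name is_nan_sketch is hypothetical, and returning False for input that cannot be converted to a float is an assumption.

```python
import math


def is_nan_sketch(x) -> bool:
    """Return True when x is nan or a string that parses to nan."""
    try:
        # float() accepts numbers and strings such as "nan" or "1.5".
        return math.isnan(float(x))
    except (TypeError, ValueError):
        # Assumption: input that is not convertible is not counted as nan.
        return False
```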
- ecg_analysis.ecg_reader.is_num(num: Any) bool ¶
Function that tests if object can be converted to number
A function that takes any input and detects if it is a number by attempting to convert the input to a float. This function catches convertible digit string cases.
Source: https://stackoverflow.com/questions/354038
- Parameters
num (Any) – object data to determine if it is conceivably a number
- Returns
a boolean determination if the input is a number
- Return type
bool
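The float-conversion test described above can be sketched as follows. The name is_num_sketch is hypothetical. Note that float() also accepts the string "nan", so a check of this kind reports nan as a number; catching nan needs a separate test.

```python
from typing import Any


def is_num_sketch(num: Any) -> bool:
    """Return True when num can be converted to a float."""
    try:
        float(num)
    except (TypeError, ValueError):
        # TypeError covers objects such as None; ValueError covers bad strings.
        return False
    return True
```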
- ecg_analysis.ecg_reader.load_csv(local_file: str, cols: Union[List[str], Tuple[str]] = ('time', 'voltage')) pandas.core.frame.DataFrame ¶
Loads a patient file.csv into a DataFrame
Takes a file path to a csv file, reads it, and writes the data into a DataFrame with the column headers indicated by cols.
- Parameters
local_file (str) – path to local .csv file
cols (Union[List[str], Tuple[str]]) – list or tuple indicating the column headers for the output dataframe. (‘time’, ‘voltage’) by default
- Returns
DataFrame read in of the csv file with headers given by cols
- Return type
DataFrame
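A loader of this kind could be sketched with pandas.read_csv as below. The name load_csv_sketch is hypothetical, and the sketch assumes the csv file has no header row of its own, so the first line is read as data.

```python
from typing import Sequence

import pandas as pd


def load_csv_sketch(local_file: str, cols: Sequence[str] = ("time", "voltage")) -> pd.DataFrame:
    """Read a headerless csv file into a DataFrame with the given column names."""
    # header=None treats the first line as data; names supplies the headers.
    return pd.read_csv(local_file, header=None, names=list(cols))
```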
- ecg_analysis.ecg_reader.preprocess_data(file_path: str, tlabel: str = 'time', vlabel: str = 'voltage', raw_max: Union[float, int] = 300, raw_min: Optional[Union[float, int]] = None, l_freq: Union[float, int] = 1, h_freq: Union[float, int] = 50, clean_only: bool = False, **kwargs) pandas.core.frame.DataFrame ¶
Takes a csv ECG file and runs a full preprocessing pipeline on it
Reads a given csv file path into a DataFrame, which is assigned the headers tlabel and vlabel. First, it cleans any bad data from the DataFrame using clean_data(). It then logs whether any values are outside the range [raw_min, raw_max]. Lastly, it takes the vlabel column and filters it using the filter_data() function. Any mne keyword arguments may be piped into the filter function through **kwargs.
- Parameters
file_path (str) – Path to the csv file that will be read into the DataFrame
tlabel (str) – Label for the time column of the DataFrame.
vlabel (str) – Label for the voltage column of the DataFrame
raw_max (Union[float, int]) – Upper end of the range allowed for the voltage data.
raw_min (Union[float, int]) – Lower end of the range allowed for the voltage data. Default value is the negative of the raw_max.
l_freq (Union[float, int]) – Lower frequency band of the band pass filter
h_freq (Union[float, int]) – Upper frequency band of the band pass filter
clean_only (bool) – Option to not filter the data
- Returns
A fully preprocessed DataFrame of the time and voltage data
- Return type
DataFrame
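The default for raw_min described in the parameters above (the negative of raw_max when no value is given) can be sketched as below. The name resolve_voltage_range is hypothetical and is used only to illustrate that rule.

```python
from typing import Optional, Tuple, Union

Number = Union[float, int]


def resolve_voltage_range(raw_max: Number = 300, raw_min: Optional[Number] = None) -> Tuple[Number, Number]:
    """Return the (lower, upper) voltage limits, mirroring raw_max when raw_min is unset."""
    if raw_min is None:
        # No explicit lower limit: use a range symmetric about zero.
        raw_min = -raw_max
    return raw_min, raw_max
```

Comparing against None rather than testing truthiness means an explicit lower limit of 0 is respected instead of being replaced by the default.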