ecg_analysis package

Subpackages

Submodules

ecg_analysis.calculations module

ecg_analysis.calculations.dict2json(my_dict: dict, folder: str = 'files')

Takes a dict and a folder name and writes the dict data to a file in that folder

Takes a dictionary and writes it to a JSON file in a folder. The folder is named ‘files’ unless specified otherwise. The base name of the file is taken from the ‘filename’ key in the dictionary; that key is not included in the JSON itself.

Parameters
  • my_dict (dict) – dictionary of items to write to the JSON file

  • folder (str) – folder to save the files in. Default is ‘files’
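The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the name dict_to_json_sketch, the ‘.json’ extension, the creation of a missing folder, and the returned path are all assumptions made for the example.

```python
import json
import os


def dict_to_json_sketch(my_dict: dict, folder: str = "files") -> str:
    """Write my_dict to <folder>/<filename>.json, leaving 'filename' out of the JSON."""
    # Copy first so the caller's dictionary is not modified.
    payload = dict(my_dict)
    # The 'filename' key supplies the base name; a KeyError is raised if it is missing.
    base_name = payload.pop("filename")
    # Assumption: the folder is created when it does not exist yet.
    os.makedirs(folder, exist_ok=True)
    out_path = os.path.join(folder, base_name + ".json")
    with open(out_path, "w") as handle:
        json.dump(payload, handle)
    # Returning the path is a convenience of this sketch only.
    return out_path
```

Calling dict_to_json_sketch({"filename": "test1", "duration": 27.775}) would write a file named test1.json inside ‘files’ that contains only the duration entry.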

ecg_analysis.calculations.get_metrics(data: pandas.core.frame.DataFrame, t_key: str = 'time', v_key: str = 'voltage', rounding: int = 3) dict

Calculates all relevant metrics in an ECG data set

This function takes in a DataFrame with the columns named by ‘t_key’ and ‘v_key’, which by default are ‘time’ and ‘voltage’ respectively. It returns a dictionary containing the duration, voltage extremes, filename, number of beats, beats per minute, and a list of beat times, each stored under its corresponding key. Each numeric metric is rounded to three decimal places by default; this can be changed by passing an integer to the ‘rounding’ parameter.

Parameters
  • data (DataFrame) – A pandas dataframe that contains the fields t_key and v_key

  • t_key (str) – String indicating the name of the time column in data. ‘time’ by default.

  • v_key (str) – String indicating the name of the voltage column in data. ‘voltage’ by default.

  • rounding (int) – An integer indicating the number of decimals to round to. 3 by default

Returns

A dictionary with the keys: duration, beats, extremes, filename, num_beats, mean_hr_bpm

Return type

dict
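To make the shape of the returned dictionary concrete, here is a small sketch that builds most of the documented keys from plain lists. It is not the package's code: the helper name summarize_ecg_sketch is hypothetical, beat detection is not shown (beat times are passed in), the ‘filename’ key is omitted because the source does not say where it comes from, and the formulas for duration and mean heart rate are assumptions.

```python
def summarize_ecg_sketch(times, voltages, beat_times, rounding=3):
    """Build a metrics dictionary from time, voltage, and already-detected beat times."""
    # Assumption: duration is the last time stamp minus the first.
    duration = times[-1] - times[0]
    # Assumption: mean heart rate is beats per minute over the whole recording.
    mean_hr_bpm = len(beat_times) * 60 / duration
    return {
        "duration": round(duration, rounding),
        "extremes": (round(min(voltages), rounding), round(max(voltages), rounding)),
        "num_beats": len(beat_times),
        "mean_hr_bpm": round(mean_hr_bpm, rounding),
        "beats": [round(t, rounding) for t in beat_times],
    }
```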

ecg_analysis.calculations.remove_dir(directory: str)

Takes a folder name and removes it, even if it contains files

Deletes the selected folder by iterating over its files, deleting them one by one, and then deleting the empty folder.

Parameters

directory (str) – Path of the directory to remove along with its contents
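The file-by-file deletion described above could look roughly like this. It is a sketch under the assumption that the folder holds only plain files (no subfolders); the name remove_dir_sketch is hypothetical.

```python
import os


def remove_dir_sketch(directory: str) -> None:
    """Delete every file in directory, then delete the now-empty directory."""
    for name in os.listdir(directory):
        # Assumes each entry is a regular file; os.remove fails on subfolders.
        os.remove(os.path.join(directory, name))
    # os.rmdir only succeeds once the directory is empty.
    os.rmdir(directory)
```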

ecg_analysis.ecg_reader module

ecg_analysis.ecg_reader.apply_to_df(my_data: pandas.core.frame.DataFrame, func: object, invert=False)

Applies given boolean function to DataFrame rows and removes False rows

Takes a DataFrame my_data and applies the function func element by element to obtain a boolean value for each element. For each row, it then checks the results across all columns. If invert is False, it deletes any row where func is false in at least one column. Conversely, if invert is True, it deletes every row where func is true in at least one column.

Parameters
  • my_data (DataFrame) – DataFrame where func is applied element-wise

  • func (object) – function passed to apply. Takes in a DataFrame element and returns a boolean value

  • invert (bool) – Indication of whether to delete rows where the func is true or false in at least one column

Returns

DataFrame containing only the rows where func returns true (if invert is False) or false (if invert is True) for every column

Return type

DataFrame
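The row-filtering rule can be illustrated with a short pandas sketch. The name apply_to_df_sketch is hypothetical, and building the mask column by column with Series.map is one possible approach, not necessarily the one the package uses.

```python
import pandas as pd


def apply_to_df_sketch(my_data: pd.DataFrame, func, invert: bool = False) -> pd.DataFrame:
    """Keep rows based on an element-wise boolean function."""
    # Build a boolean DataFrame by applying func to every element.
    mask = my_data.apply(lambda col: col.map(func)).astype(bool)
    if invert:
        # Drop rows where func is true in at least one column.
        keep = ~mask.any(axis=1)
    else:
        # Drop rows where func is false in at least one column.
        keep = mask.all(axis=1)
    return my_data[keep]
```

For example, with func = lambda x: x < 0 and invert=True, every row that contains a negative value in any column is removed.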

ecg_analysis.ecg_reader.check_range(data_set: pandas.core.series.Series, filename: str, upper: Union[float, int], lower: Union[float, int])

Checks to see if a value of a given Series is outside a given range

Takes a Series and logs an error if any of its values are above the upper parameter or below the lower parameter.

Parameters
  • data_set (Series) – Series to check the range of its values

  • filename (str) – File name to indicate in the error log

  • upper (Union[float, int]) – Upper limit of the range. Log if any values are above this.

  • lower (Union[float, int]) – Lower limit of the range. Log if any values are below this.
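A minimal sketch of such a range check follows. The name check_range_sketch, the wording of the log message, and the boolean return value are assumptions for illustration; the documented function is only described as logging an error.

```python
import logging


def check_range_sketch(data_set, filename, upper, lower) -> bool:
    """Log an error if any value lies above upper or below lower."""
    out_of_range = any(value > upper or value < lower for value in data_set)
    if out_of_range:
        # Message format is hypothetical; only the file name is known to be logged.
        logging.error("%s: values outside the range [%s, %s]", filename, lower, upper)
    # Returned only so that the sketch is easy to check.
    return out_of_range
```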

ecg_analysis.ecg_reader.clean_data(my_data: pandas.core.frame.DataFrame)

Cleans missing, nan, and non-numeric values from a DataFrame

Takes a DataFrame and checks each element for a missing, nan, or non-numeric value, then removes the whole row if one is found. Returns the cleaned DataFrame and logs the row indices and DataFrame names of the erroneous rows.

Parameters

my_data (DataFrame) – DataFrame to be cleaned

Returns

DataFrame without any missing, nan, or non-numeric values

Return type

DataFrame
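One way to get the same effect is sketched below. Note the swap in technique: this example uses pandas.to_numeric with errors="coerce" rather than the element-wise helper functions documented in this module, and the name clean_data_sketch and the log message are hypothetical.

```python
import logging

import pandas as pd


def clean_data_sketch(my_data: pd.DataFrame) -> pd.DataFrame:
    """Drop rows that contain missing, nan, or non-numeric values."""
    # Anything that cannot be parsed as a number becomes NaN.
    numeric = my_data.apply(pd.to_numeric, errors="coerce")
    bad_rows = numeric.isna().any(axis=1)
    for idx in my_data.index[bad_rows]:
        # Message wording is an assumption made for this sketch.
        logging.error("Removing row %s: missing, nan, or non-numeric value", idx)
    return numeric[~bad_rows]
```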

ecg_analysis.ecg_reader.filter_data(my_data: pandas.core.series.Series, first_samp: Union[float, int], last_samp: Union[float, int], high: Union[float, int], low: Union[float, int], **kwargs) numpy.ndarray

Filter ECG data using a band pass FIR filter with reflected padding

Takes a pandas Series of ECG voltage values and filters it using a finite impulse response (FIR) filter. The padding is reflected to avoid excessive attenuation or inversion of the signal. The function uses the first and last time points to calculate the sample rate, and takes variable high and low cutoff frequencies for the band pass. It uses mne as its backend; the mne documentation is linked below. Keyword arguments described in that documentation can be passed through this function call.

link: https://mne.tools/stable/generated/mne.filter.filter_data.html

Parameters
  • my_data (Series) – Series data describing the voltage time-series

  • first_samp (Union[float, int]) – number indicating the time at which the first sample was taken

  • last_samp (Union[float, int]) – number indicating the time at which the last sample was taken

  • high (Union[float, int]) – The top of the range of the band pass filter. Must be higher than the low parameter

  • low (Union[float, int]) – The bottom of the range of the band pass filter. Must be lower than the high parameter

Returns

numpy array of the filtered data

Return type

np.ndarray
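The sample rate calculation mentioned in the description can be sketched as follows. The exact formula used by the package is not stated here, so this example assumes evenly spaced samples and divides the number of intervals by the elapsed time; the name estimate_sample_rate is hypothetical.

```python
def estimate_sample_rate(n_samples: int, first_samp: float, last_samp: float) -> float:
    """Estimate the sampling frequency in Hz from the first and last time stamps."""
    elapsed = last_samp - first_samp
    if elapsed <= 0:
        raise ValueError("last_samp must be greater than first_samp")
    # Assumption: n_samples evenly spaced points span n_samples - 1 intervals.
    return (n_samples - 1) / elapsed
```

For a band pass filter to be valid, the high cutoff must stay below half of this sampling frequency (the Nyquist frequency).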

ecg_analysis.ecg_reader.is_mt_str(my_str) bool

Check to see if the given input is an empty string

Function that checks the given parameter my_str to see if it is an empty string. Uses the built-in equality comparison (__eq__) to check.

Parameters

my_str (str) – String to check if it is empty

Returns

Boolean indicating if the input is an empty string

Return type

bool

ecg_analysis.ecg_reader.is_nan(x: Union[int, float, str]) bool

Check to see if x is a nan value

Check to see if the given parameter x is a nan value. Similar to np.isnan, but able to handle strings as well as numeric types. It attempts to convert strings to floats to do so.

Parameters

x (Union[int, float, str]) – Input to check if it is nan

Returns

Boolean indicating whether the input is nan

Return type

bool
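A string-tolerant nan check along these lines might look like the sketch below. The name is_nan_sketch is hypothetical, and returning False for inputs that cannot be converted to float is an assumption.

```python
import math


def is_nan_sketch(x) -> bool:
    """Return True if x is nan, accepting strings such as 'nan' as well as numbers."""
    try:
        # float() accepts numbers and numeric strings, including 'nan'.
        return math.isnan(float(x))
    except (TypeError, ValueError):
        # Assumption: values that cannot be converted are treated as not nan.
        return False
```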

ecg_analysis.ecg_reader.is_num(num: Any) bool

Function that tests if object can be converted to number

A function that takes any input and detects if it is a number by attempting to convert the input to a float. This function also catches convertible digit strings.

Source: https://stackoverflow.com/questions/354038

Parameters

num (Any) – object to test for whether it can be interpreted as a number

Returns

a boolean determination if the input is a number

Return type

bool
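The float-conversion test described above is commonly written as a try/except. The sketch below follows that pattern; the name is_num_sketch is hypothetical.

```python
def is_num_sketch(num) -> bool:
    """Return True if num can be converted to a float."""
    try:
        float(num)
        return True
    except (TypeError, ValueError):
        # TypeError covers inputs such as None; ValueError covers strings like 'abc'.
        return False
```

Note that float("nan") succeeds, so a check of this kind reports ‘nan’ as a number; nan values need a separate test.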

ecg_analysis.ecg_reader.load_csv(local_file: str, cols: Union[List[str], Tuple[str]] = ('time', 'voltage')) pandas.core.frame.DataFrame

Loads a patient file.csv into a DataFrame

Takes a file path to a csv file and reads the data into a DataFrame with the column headers indicated by cols.

Parameters
  • local_file (str) – path to local .csv file

  • cols (Union[List[str], Tuple[str]]) – list or tuple indicating the column headers for the output dataframe. (‘time’, ‘voltage’) by default

Returns

DataFrame read in of the csv file with headers given by cols

Return type

DataFrame
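Reading a two-column csv file under caller-supplied headers can be sketched with pandas.read_csv. This example assumes the file has no header row of its own; the name load_csv_sketch is hypothetical.

```python
import pandas as pd


def load_csv_sketch(local_file: str, cols=("time", "voltage")) -> pd.DataFrame:
    """Read a header-less csv file into a DataFrame with the given column names."""
    # header=None treats the first line as data; names assigns the column headers.
    return pd.read_csv(local_file, header=None, names=list(cols))
```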

ecg_analysis.ecg_reader.preprocess_data(file_path: str, tlabel: str = 'time', vlabel: str = 'voltage', raw_max: Union[float, int] = 300, raw_min: Optional[Union[float, int]] = None, l_freq: Union[float, int] = 1, h_freq: Union[float, int] = 50, clean_only: bool = False, **kwargs) pandas.core.frame.DataFrame

Takes a csv ECG file and runs a full preprocessing pipeline on it

Reads a given csv file path into a DataFrame, which is assigned the headers tlabel and vlabel. First, it cleans any bad data from the DataFrame using clean_data(). It then checks whether any values are outside the range [raw_min, raw_max] and logs an error if so. Lastly, it takes the vlabel column and filters it using the filter_data() function. Any mne keyword arguments may be passed to the filter function through **kwargs.

Parameters
  • file_path (str) – Path to the csv file that will be read into the DataFrame

  • tlabel (str) – Label for the time column of the DataFrame.

  • vlabel (str) – Label for the voltage column of the DataFrame

  • raw_max (Union[float, int]) – Upper end of the range allowed for the voltage data.

  • raw_min (Union[float, int]) – Lower end of the range allowed for the voltage data. Defaults to the negative of raw_max.

  • l_freq (Union[float, int]) – Lower frequency band of the band pass filter

  • h_freq (Union[float, int]) – Upper frequency band of the band pass filter

  • clean_only (bool) – Option to not filter the data

Returns

A fully preprocessed DataFrame of the time and voltage data

Return type

DataFrame

Module contents