data_analysis_plotting_tools.PlottingTool

Tool to facilitate data set plotting.

Module Contents

Classes

PlottingTool

Tool to facilitate data set plotting.

class data_analysis_plotting_tools.PlottingTool.PlottingTool

Tool to facilitate data set plotting.

__start_local_bokeh_server(bkapp) -> None

Private method. Starts a local Bokeh server so that the app runs in the browser.

__get_random_color_code() -> str

Private method. Returns a random hexadecimal color code.
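A minimal sketch of how such a helper could work, using only the standard library. The function name without leading underscores and the `#RRGGBB` output format are assumptions for illustration; the source does not state the exact format the private method returns.

```python
import random


def get_random_color_code() -> str:
    """Return a random color as a '#RRGGBB' string (assumed format)."""
    # Draw one integer covering the full 24-bit RGB range and
    # format it as six zero-padded lowercase hex digits.
    return "#{:06x}".format(random.randint(0, 0xFFFFFF))


print(get_random_color_code())
```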

__is_date(string: str, fuzzy: bool = False)

Private method. Returns whether the string can be interpreted as a date.

Parameters:
  • string (str) – String to check for a date.

  • fuzzy (bool) – If True, ignore unknown tokens in the string.
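The described behavior matches the `fuzzy` option of `dateutil.parser.parse`, so one plausible way to write such a check is sketched below. That the tool actually relies on `dateutil` is an assumption, as is the public name `is_date`.

```python
from dateutil.parser import parse


def is_date(string: str, fuzzy: bool = False) -> bool:
    """Return True if `string` can be parsed as a date (sketch, not the tool's code)."""
    try:
        # With fuzzy=True the parser skips tokens it does not recognize.
        parse(string, fuzzy=fuzzy)
        return True
    except (ValueError, OverflowError):
        # dateutil raises ValueError (ParserError) for unparseable input.
        return False


print(is_date("1990-12-1"))
print(is_date("today is 2019-03-27", fuzzy=True))
```

Without `fuzzy=True`, a string that mixes a date with other words is rejected, because the surrounding tokens cannot be parsed.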

add_data_set(df_name: str, data_frame: pandas.DataFrame, disable_feedback: bool = False) -> None

Add a data set to be used.

Parameters:
  • df_name (str) – Name to give the data set.

  • data_frame (pd.DataFrame) – Data set as pandas DataFrame.

  • disable_feedback (bool) – If True, suppress the confirmation message.

Return type:

None.

plot_interactive(data_frames: dict) -> None

Plot data sets on a preset 2D interactive chart.

Parameters:

data_frames (dict) –

Specifies the data sets and columns to use, keyed by data set name. The first column listed for each data set is plotted on the x-axis, and the x-axis column must be exactly the same for every data set.

Example: {'berlin': ['date', 'rain_sum'], 'paris': ['date', 'temperature']}

Return type:

None.
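The x-axis constraint on `data_frames` can be checked before calling `plot_interactive`. The helper below is a hypothetical sketch and not part of the package; the requirement of at least two columns per data set is also an assumption.

```python
def check_shared_x_axis(data_frames: dict) -> str:
    """Return the common x-axis column, or raise ValueError if the spec is inconsistent."""
    if not data_frames:
        raise ValueError("data_frames must name at least one data set")
    x_columns = set()
    for name, columns in data_frames.items():
        # Assumption: each entry lists the x-axis column plus at least one y column.
        if len(columns) < 2:
            raise ValueError(f"{name!r} needs an x-axis column and at least one y column")
        # The first listed column is the one placed on the x-axis.
        x_columns.add(columns[0])
    if len(x_columns) != 1:
        raise ValueError(f"x-axis columns differ: {sorted(x_columns)}")
    return x_columns.pop()


spec = {"berlin": ["date", "rain_sum"], "paris": ["date", "temperature"]}
print(check_shared_x_axis(spec))  # prints "date"
```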

plot_univariate_graphs(df_name: str, number_columns_unvariate_graphs: int) -> None

Plot a univariate pairplot from the numeric variables in the data set.

Parameters:
  • df_name (str) – Name of the data set to be plotted.

  • number_columns_unvariate_graphs (int) – Number of rows across which the plots are arranged.

Return type:

None.

plot_bivariate_graphs(df_name: str, numeric_variables: list[str]) -> None

Plot a bivariate pairplot from the numeric variables in the data set.

Parameters:
  • df_name (str) – Name of the data set to be plotted.

  • numeric_variables (list[str]) – Names of the numeric variables to plot.

Return type:

None.

plot_correlation_heatmap(df_name: str, numeric_variables: list[str]) -> None

Plot a correlation heatmap using the numeric variables in the data set.

Parameters:
  • df_name (str) – Name of the data set to be plotted.

  • numeric_variables (list[str]) – Names of the numeric variables to include in the heatmap.

Return type:

None.
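A correlation heatmap visualizes a matrix of pairwise correlations between the selected variables. The sketch below computes such a matrix in plain Python, assuming the Pearson coefficient; the source does not say which correlation method the tool uses, and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns: dict) -> dict:
    """Map each pair of column names to its correlation; this is the data a heatmap colors."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up values purely for illustration.
data = {"a": [1, 2, 3], "b": [2, 4, 6], "c": [3, 2, 1]}
matrix = correlation_matrix(data)
print(matrix["a"]["b"], matrix["a"]["c"])
```

Columns with zero variance would cause a division by zero in this sketch, so constant columns need to be filtered out first.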

get_regression_model_summary(df_name: str, target_variable: str, predictor_variables: list[str], disable_feedback: bool = False, disable_plotting: bool = False)

Fit a regression model on the variables to be studied, optionally plot it, and return the model summary.

Parameters:
  • df_name (str) – Name of the data set to be plotted.

  • target_variable (str) – Variable to be predicted.

  • predictor_variables (list[str]) – Input variables on which the output would be based.

  • disable_feedback (bool) – If True, suppress console feedback such as the model summary.

  • disable_plotting (bool) – If True, do not plot the regression model.

Return type:

Model summary.
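A typical call sequence registers a data set first and then requests the summary. The sketch below takes an already constructed tool instance as an argument so that it stays independent of how the class is instantiated. The function name `run_regression_workflow` and the column names are hypothetical; the keyword arguments follow the signatures documented above.

```python
def run_regression_workflow(tool, data_frame):
    """Register a data set, then return its regression summary without console or plot output."""
    # `tool` is expected to be a PlottingTool instance, e.g. (assumed import path):
    #   from data_analysis_plotting_tools.PlottingTool import PlottingTool
    #   tool = PlottingTool()
    tool.add_data_set(df_name="berlin", data_frame=data_frame, disable_feedback=True)
    return tool.get_regression_model_summary(
        df_name="berlin",
        target_variable="rain_sum",          # hypothetical column name
        predictor_variables=["temperature"],  # hypothetical column name
        disable_feedback=True,
        disable_plotting=True,
    )
```

Setting both `disable_feedback` and `disable_plotting` is useful when only the returned summary is needed, for example in a batch script.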