All you need to know about Exploratory Data Analysis

Ved Panse • 17 May, 2022

In this documentation, we will be covering how to perform an Exploratory Data Analysis (EDA). EDA is the analysis of data for predicting a future outcome. It is critical for companies as they can almost foresee their future sales and take the necessary actions.

What to use?

We will be using Python in this documentation. Python is not the only way to carry out an Exploratory Data Analysis. It just helps us to visualize CSVs interactively. You can use Microsoft Excel or any other spreadsheet application as well.

Importing modules

We will start by importing plotly. Plotly is a popular Python library used for visualizing.

import plotly.express as px

Line Graphs

Line modeling is one of the most basic models. It compares the growth rate of features. What are features?
Suppose you want to compare the stock prices of several companies. You are more likely to use a line graph to compare the growth rate. Here, the stock prices are the features.


Let's start plotting using Python.

We have to start by uploading a dataset first. Fortunately, plotly has some inbuilt sets for us. Let us import the stocks dataset, that contains details about the stock prices of several major companies.

df_stocks = px.data.stocks()

If you print df_stocks, you will see the features (headings of columns). You can use these features to plot your graphs. Let's plot a date VS price graph for Google (GOOG in the data frame).

graph = px.line(df_stocks, x="date", y="GOOG", labels={"x": "Date", "y": "Price"})
# labels are the titles for respective axes.

graph.show()

You can print the date VS price graph for multiple companies.

graph = px.line(df_stocks, x="date", y=["GOOG", "AAPL", "AMZN", "FB", "NFLX", "MSFT"], labels={ "x": "Date", "y": "Company stocks"})

graph.show()
Analyzing the graph

Now, it is time for the most important job: analyzing data. Just observe trends and try to identify the reasons behind those trends. This analysis should also provide a valuable insight that can help us predict a future outcome. Try changing the features that the coordinate axes represent to improve your computation.


Try to include many spectacular details to create a successful EDA. All the best!