All you need to know about Exploratory Data Analysis
In this guide, we will cover how to perform an Exploratory Data Analysis (EDA). EDA is the process of summarizing and visualizing a dataset to understand its structure, spot trends and anomalies, and inform predictions about future outcomes. It matters to companies because the trends it reveals can help them estimate future sales and take the necessary actions.
What to use?
We will be using Python in this guide. Python is not the only way to carry out an Exploratory Data Analysis; it simply makes it easy to visualize tabular data, such as CSV files, interactively. You can also use Microsoft Excel or any other spreadsheet application.
Importing modules
We will start by importing plotly. Plotly is a popular Python library for creating interactive visualizations.
Line Graphs
A line graph is one of the most basic visualizations. It shows how one or more features change over time, which makes it easy to compare their growth rates. What are features?
Suppose you want to compare the stock prices of several companies. You would most likely use a line graph to compare their growth rates. Here, the stock prices are the features: each one is a column of values in the dataset.
Let's start plotting using Python.
We have to start by loading a dataset. Fortunately, plotly ships with some built-in datasets. Let us load the stocks dataset, which contains the stock prices of several major companies.
If you print df_stocks, you will see the features (the column headings). You can use these features to plot your graphs. Let's plot a date vs. price graph for Google (GOOG in the data frame).
# labels are the titles for the respective axes.
graph = px.line(df_stocks, x="date", y="GOOG", labels={"date": "Date", "GOOG": "Price"})
graph.show()
You can also plot the date vs. price graph for multiple companies on the same axes by passing a list of columns.
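graph = px.line(df_stocks, x="date", y=["GOOG", "AAPL", "AMZN"], labels={"date": "Date", "value": "Price", "variable": "Company"})
graph.show()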
Now, it is time for the most important job: analyzing the data. Observe the trends in the graph and try to identify the reasons behind them. The analysis should yield insights that help us anticipate a future outcome. Try changing the features that the coordinate axes represent to look at the data from different angles.
Include as many relevant details as you can, such as clear axis titles and notes on the trends you find, to make your EDA useful. All the best!