Explore Data

A prompt that asks the model to act as a data scientist and write code to explore and visualise your dataset.

Prompt

I want you to act as a data scientist and code for me. I have a dataset of [describe dataset]. Please write code for data visualisation and exploration.

Example answer

  1. Import the necessary libraries (pandas, matplotlib and seaborn).

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

  2. Read the data into a dataframe.

    df = pd.read_csv('[file path]')

  3. Explore the data using the head(), info() and describe() methods.

    print(df.head())      # first 5 rows of the dataframe
    df.info()             # column names, dtypes and non-null counts (prints directly)
    print(df.describe())  # summary statistics of the numerical columns

  4. Plot a histogram for each numerical column to visualise the distribution of its values.

    df.hist(figsize=(20, 15))
    plt.show()

  5. Use the Seaborn library to create countplots for the categorical variables in the dataset.

    for col in df.columns:   # loop through
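The example answer breaks off at the start of the countplot loop, so the sketch below shows one way that last step might be completed. It is an illustration, not the original answer: the helper names `categorical_columns` and `plot_countplots` are hypothetical, and treating a column as categorical when it is neither numeric nor datetime and has at most 20 distinct values is an arbitrary assumption you may want to adjust for your data.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def categorical_columns(df, max_unique=20):
    """Return the names of columns that look categorical (hypothetical helper).

    Assumption: a column is categorical if it is neither numeric nor datetime
    and has at most `max_unique` distinct values.
    """
    cols = []
    for col in df.columns:
        is_numeric = pd.api.types.is_numeric_dtype(df[col])
        is_datetime = pd.api.types.is_datetime64_any_dtype(df[col])
        if not is_numeric and not is_datetime and df[col].nunique() <= max_unique:
            cols.append(col)
    return cols


def plot_countplots(df, max_unique=20):
    """Draw one seaborn countplot per categorical column and return the axes."""
    axes = []
    for col in categorical_columns(df, max_unique):
        fig, ax = plt.subplots(figsize=(8, 4))
        sns.countplot(data=df, x=col, ax=ax)       # bar per distinct value
        ax.set_title(f"Count of values in {col}")
        ax.tick_params(axis="x", labelrotation=45)  # keep long labels readable
        fig.tight_layout()
        axes.append(ax)
    return axes
```

With the dataframe from the earlier steps, calling `plot_countplots(df)` followed by `plt.show()` would display the charts. Returning the axes instead of showing them inside the function leaves it to the caller to display or save the figures.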