Plentiful Palmer Penguins

Data Analysis
Author: Reetinav Das

Published: October 16, 2023

Penguins Dataset Exploration

In this post we explore the Palmer Penguins dataset, which records body measurements for three penguin species.

We will set things up by importing necessary packages and taking a look at the first few rows of the dataset.

import pandas as pd
import plotly.io as pio
pio.renderers.default="iframe"
url = "https://raw.githubusercontent.com/PhilChodrow/PIC16B/master/datasets/palmer_penguins.csv"
penguins = pd.read_csv(url)
penguins.head()
  | studyName | Sample Number | Species | Region | Island | Stage | Individual ID | Clutch Completion | Date Egg | Culmen Length (mm) | Culmen Depth (mm) | Flipper Length (mm) | Body Mass (g) | Sex | Delta 15 N (o/oo) | Delta 13 C (o/oo) | Comments
0 | PAL0708 | 1 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N1A1 | Yes | 11/11/07 | 39.1 | 18.7 | 181.0 | 3750.0 | MALE | NaN | NaN | Not enough blood for isotopes.
1 | PAL0708 | 2 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N1A2 | Yes | 11/11/07 | 39.5 | 17.4 | 186.0 | 3800.0 | FEMALE | 8.94956 | -24.69454 | NaN
2 | PAL0708 | 3 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N2A1 | Yes | 11/16/07 | 40.3 | 18.0 | 195.0 | 3250.0 | FEMALE | 8.36821 | -25.33302 | NaN
3 | PAL0708 | 4 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N2A2 | Yes | 11/16/07 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Adult not sampled.
4 | PAL0708 | 5 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N3A1 | Yes | 11/16/07 | 36.7 | 19.3 | 193.0 | 3450.0 | FEMALE | 8.76651 | -25.32426 | NaN

This is great info! Let’s see which species appear in the dataset.

print(penguins["Species"].unique())
['Adelie Penguin (Pygoscelis adeliae)'
 'Chinstrap penguin (Pygoscelis antarctica)'
 'Gentoo penguin (Pygoscelis papua)']
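
The full species labels are long, which can crowd plot legends and hover text. As an optional step that is not part of the original analysis, a minimal sketch of shortening each label to its first word might look like this. The three strings are copied from the output above; the names `species` and `short_names` are illustrative.

```python
import pandas as pd

# The three labels printed by penguins["Species"].unique() above.
species = pd.Series([
    "Adelie Penguin (Pygoscelis adeliae)",
    "Chinstrap penguin (Pygoscelis antarctica)",
    "Gentoo penguin (Pygoscelis papua)",
])

# Split each label on whitespace and keep only the first word.
short_names = species.str.split().str.get(0)
print(short_names.tolist())  # ['Adelie', 'Chinstrap', 'Gentoo']
```

On the full dataset, the same expression could be applied to `penguins["Species"]`. Whether to overwrite the column or store the result in a new one is a matter of taste.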

We are interested in the differences between the three species. We will explore these differences by looking at four measurements: culmen length, culmen depth, flipper length, and body mass. First, we drop the rows where any of these values is NaN (that is, not available).

penguins = penguins.dropna(subset=["Culmen Length (mm)", 
                                   "Culmen Depth (mm)", 
                                   "Flipper Length (mm)", 
                                   "Body Mass (g)"])

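To see what `dropna` does here, the sketch below rebuilds the four measurement columns from the five rows printed by `penguins.head()` above and counts what gets removed. The names `sample`, `measurement_cols`, and `cleaned` are illustrative. On the real data you would run the same checks on `penguins` before reassigning it.

```python
import numpy as np
import pandas as pd

# Measurements copied from the five rows shown by penguins.head().
sample = pd.DataFrame({
    "Culmen Length (mm)": [39.1, 39.5, 40.3, np.nan, 36.7],
    "Culmen Depth (mm)": [18.7, 17.4, 18.0, np.nan, 19.3],
    "Flipper Length (mm)": [181.0, 186.0, 195.0, np.nan, 193.0],
    "Body Mass (g)": [3750.0, 3800.0, 3250.0, np.nan, 3450.0],
})

measurement_cols = list(sample.columns)

# Count missing values per column before dropping anything.
missing_per_column = sample[measurement_cols].isna().sum()
print(missing_per_column)

# Drop every row that is missing at least one of the measurements.
cleaned = sample.dropna(subset=measurement_cols)
rows_dropped = len(sample) - len(cleaned)
print(f"Dropped {rows_dropped} of {len(sample)} rows")  # Dropped 1 of 5 rows
```

Only the row labeled "Adult not sampled." is removed in this small sample, since it is the only one with missing measurements.
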
We will first look at culmen length and culmen depth using Plotly (the culmen is the upper ridge of the bill). We will put culmen length on the x-axis and culmen depth on the y-axis. To make the plot more useful, we’ll color-code the points by species so we can see how these measurements differ.

from plotly import express as px
fig = px.scatter(data_frame=penguins, 
                 x="Culmen Length (mm)", 
                 y="Culmen Depth (mm)", 
                 color="Species", 
                 hover_name="Species", 
                 hover_data=["Island","Sex"], 
                 width=700, 
                 height=300)
# Remove the figure margins to reduce whitespace around the plot
fig.update_layout(margin={"r":0, "t":0, "l":0, "b":0})
fig.show()

We can see here that culmen length and culmen depth together are pretty good predictors of a penguin's species! In this plot, Adelie penguins tend to have shorter culmens than the other two species, whereas Gentoo penguins tend to have shallower culmens (lower culmen depth) than the other two. Chinstrap penguins tend to be high on both: their culmen lengths are comparable to Gentoo and their culmen depths are comparable to Adelie.
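
One way to back up what the scatter plot suggests is to compare per-species averages. The helper below is a minimal sketch, not part of the original analysis. `species_means` is a hypothetical name, and it assumes the DataFrame has a `Species` column plus the numeric columns you pass in.

```python
import pandas as pd

def species_means(df, columns, species_col="Species"):
    """Return the mean of each column in `columns` for every species, rounded to 1 decimal."""
    return df.groupby(species_col)[columns].mean().round(1)
```

Assuming `penguins` is the cleaned DataFrame from earlier, calling `species_means(penguins, ["Culmen Length (mm)", "Culmen Depth (mm)"])` gives one row per species, which makes it easy to check which species sits low or high on each axis.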

Let’s take a look at the other two measurements: flipper length and body mass. We will make a similar scatter plot and see whether any patterns stand out.

fig = px.scatter(data_frame=penguins, 
                 x="Body Mass (g)", 
                 y="Flipper Length (mm)", 
                 color="Species", 
                 hover_name="Species", 
                 hover_data=["Island","Sex"], 
                 width=700, 
                 height=300)
# Remove the figure margins to reduce whitespace around the plot
fig.update_layout(margin={"r":0, "t":0, "l":0, "b":0})
fig.show()

This plot suggests a couple of interesting facts. Adelie and Chinstrap penguins overlap heavily in both flipper length and body mass, so these two measurements do little to tell them apart. Gentoo penguins, however, are on average noticeably heavier and have longer flippers than the other two species.
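
To put a rough number on how far one species stands apart from the rest, one option is to subtract the mean of all other penguins from that species' mean. This is a minimal sketch under stated assumptions. `mean_gap` is a hypothetical helper, and it assumes the species labels start with the species' common name, as in the output of `unique()` earlier.

```python
import pandas as pd

def mean_gap(df, column, species_prefix, species_col="Species"):
    """Mean of `column` for the matching species minus the mean for all other rows."""
    # Match rows whose species label starts with the given prefix, e.g. "Gentoo".
    is_target = df[species_col].str.startswith(species_prefix, na=False)
    return df.loc[is_target, column].mean() - df.loc[~is_target, column].mean()
```

With the cleaned `penguins` DataFrame, `mean_gap(penguins, "Body Mass (g)", "Gentoo")` and `mean_gap(penguins, "Flipper Length (mm)", "Gentoo")` would give the average difference in grams and millimeters. A positive value means the chosen species is larger on average.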