Body measurements for three species of penguins on the Palmer Archipelago in Antarctica.

penguins

Format

A data frame with 333 rows and 8 variables:

species

a factor denoting penguin species (Adélie, Chinstrap, or Gentoo)

island

a factor denoting the island in the Palmer Archipelago, Antarctica (Biscoe, Dream, or Torgersen)

bill_length_mm

a number denoting bill length (millimeters)

bill_depth_mm

a number denoting bill depth (millimeters)

flipper_length_mm

an integer denoting flipper length (millimeters)

body_mass_g

an integer denoting body mass (grams)

sex

a factor denoting penguin sex (female or male)

year

an integer denoting the study year (2007, 2008, or 2009)

Source

This data set is borrowed from the R package palmerpenguins by Allison Horst, Alison Hill and Kristen Gorman (https://allisonhorst.github.io/palmerpenguins/). Check out their webpage for complimentary penguin artwork!

Note

penguins is a great introductory data set, widely used in the data science and machine learning community. All of its variables are exogenous: penguins can't choose their measurements! To make things a bit easier, all rows containing NAs were removed, so summary routines such as mean() can be called without passing na.rm = TRUE.
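
As a minimal sketch of why removing rows with NAs simplifies summaries, the base R snippet below uses a small made-up vector (the values and the name toy_bill_length_mm are hypothetical stand-ins, not taken from the penguins data). It shows that mean() returns NA when a missing value is present, and that filtering missing values up front gives the same result as passing na.rm = TRUE.

```r
# Hypothetical measurement column with one missing value
# (illustrative numbers only, not rows from the penguins data).
toy_bill_length_mm <- c(39.1, NA, 40.3)

# With an NA present, mean() propagates the missing value.
mean_with_na <- mean(toy_bill_length_mm)

# Passing na.rm = TRUE drops the NA before averaging.
mean_na_rm <- mean(toy_bill_length_mm, na.rm = TRUE)

# Removing incomplete values once, up front, makes na.rm unnecessary later.
complete_values <- toy_bill_length_mm[!is.na(toy_bill_length_mm)]
mean_complete <- mean(complete_values)

print(mean_with_na)
print(mean_na_rm)
print(mean_complete)
```

For a whole data frame, base R's na.omit() plays the same role as the filtering step here by dropping every row that contains at least one NA.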