BYU-I Course Project Data
  • Data Science Program
Categories
All (99)
DS 350 (1)
DS250 (1)
DS350 (2)
MATH221 (48)
advertising (2)
animals (7)
anthropology (2)
astronomy (3)
athletics (1)
behavioralscience (1)
biology (1)
business (1)
chemistry (1)
child (9)
climate (2)
currency (1)
ecology (4)
economics (1)
education (3)
efficiency (1)
entertainment (4)
environment (2)
environmentalhealth (1)
finance (2)
food (5)
geology (1)
haberdashery (1)
health (44)
housing (1)
language (1)
machinelearning (1)
marathon (8)
mechanicalengineering (1)
mentalhealth (2)
military (1)
mortality (1)
music (3)
nutrition (4)
occupationalsafety (1)
pharmaceuticals (1)
physics (6)
politics (1)
population (4)
production (1)
products (7)
psychology (1)
questionablebeautystandards (1)
realestate (1)
religion (1)
resources (1)
science (3)
sports (5)
stocks (1)
substanceabuse (1)
technology (1)
war (1)
world (3)

Data Overview

The Author listed for each dataset is the originating BYU-I course that prompted the data. All terms are searchable. This page shows all datasets available on this site. You can explore datasets by topic or course by browsing data topics. We use Posit’s pins packages for Python (pins) and R (pins).

For R users, the following process will read the data from this website. You will need to change DATANAMEONPOSIT to the actual name.

R code example to read data
library(pins)
url_data <- "https://posit.byui.edu/data/DATANAMEONPOSIT/"
board_url <- board_connect_url(c("dat" = url_data))
dat <- pin_read(board_url, "dat")

Use the following Python function (read_url_pin()) to load the data into a pandas DataFrame.

Python code example to read data
import pandas as pd
import requests
from io import BytesIO

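def read_url_pin(name):
    """Download a pinned dataset by name and return it as a pandas DataFrame."""
    # Each pin is served as <name>/<name>.parquet under the data directory.
    url = "https://posit.byui.edu/data/" + name + "/" + name + ".parquet"
    response = requests.get(url)
    if response.status_code == 200:
        # Wrap the raw bytes so pandas can read them like a file.
        # pd.read_parquet needs a parquet engine such as pyarrow installed.
        parquet_content = BytesIO(response.content)
        pandas_dataframe = pd.read_parquet(parquet_content)
        return pandas_dataframe
    else:
        print(f"Failed to retrieve data. Status code: {response.status_code}")
        return None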

# Example usage:
pandas_df = read_url_pin("DATANAMEONPOSIT")
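
If you read the same dataset repeatedly, it may help to download the parquet file once and reuse the local copy. The following is a minimal sketch of that idea using only the standard library. It assumes the same URL pattern as read_url_pin() above; the helper names pin_parquet_url() and download_pin() and the pin_cache folder are hypothetical and not part of this site or of the pins packages.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen

# Base address taken from the examples above.
BASE_URL = "https://posit.byui.edu/data"


def pin_parquet_url(name: str) -> str:
    """Build the parquet URL for a pin, following the pattern in read_url_pin()."""
    cleaned = name.strip().strip("/")
    if not cleaned:
        raise ValueError("pin name must not be empty")
    safe = quote(cleaned, safe="")  # escape characters that are not URL-safe
    return f"{BASE_URL}/{safe}/{safe}.parquet"


def download_pin(name: str, cache_dir: str = "pin_cache") -> Path:
    """Download the parquet file once; later calls return the cached copy."""
    url = pin_parquet_url(name)
    target = Path(cache_dir) / url.rsplit("/", 1)[-1]
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # urlopen raises urllib.error.HTTPError if the server reports an error.
        with urlopen(url, timeout=30) as response:
            target.write_bytes(response.read())
    return target


# Example usage (hypothetical, requires network access and pandas):
# path = download_pin("DATANAMEONPOSIT")
# pandas_df = pd.read_parquet(path)
```

Unlike read_url_pin(), this sketch raises an exception on a failed request instead of returning None, so a mistyped dataset name is reported at the download step.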

Data Posts

Mahon
MATH221
health
nutrition
A group of researchers including Annie Mahon investigated the weight loss of middle-aged women when they followed a reduced-calorie diet for 9 weeks. The weights of the…
MATH 221

6/25/24, 11:02:27 AM

AM/PM Heights
MATH221
health
Researcher Peter Stothart noted that “We have known for well over a century that people are taller in the morning, shrink progressively through the day and recover their…
MATH 221

5/25/24, 10:33:36 AM

Apollo Missions
astronomy
physics
During the 1960s and early 1970s, the United States was in a race to explore the moon. Seven missions (Apollo 11 through Apollo 17) were launched in an attempt to reach the…
MATH 221

5/22/24, 4:14:29 PM

Baby Boom
population
health
anthropology
The Mater Mothers’ Hospital is a busy hospital in Brisbane, Australia. The birth weights and times of all children born on December 18, 1997 in this hospital were recorded.
MATH 221

5/23/24, 8:46:46 AM

Batting Averages
MATH221
sports
athletics
Athletes’ statistics such as the batting average of a baseball player are regularly publicized and are a topic of discussion among sports enthusiasts. The batting average…
MATH 221

5/25/24, 10:33:36 AM

Biggest Loser
entertainment
questionablebeautystandards
A dataset containing information on Biggest Loser contestants.
MATH 221

5/25/24, 10:41:51 AM

BLEU Scores
MATH221
technology
language
machinelearning
Computer software is commonly used to translate text from one language to another. As part of his Ph.D. thesis, Philipp Koehn developed a phrase-based translation program…
MATH 221

5/25/24, 10:33:36 AM

Body Measurements
MATH221
health
Estimating percentage of body fat is one method by which the health of a person is assessed. To measure it accurately is often inconvenient or costly (finding density…
MATH 221

5/25/24, 10:33:36 AM

Body Temperature
MATH221
health
Data on body temperature extracted from a figure. Note that this data is representative.
MATH 221

6/25/24, 11:11:03 AM

Bone Mineral Density
MATH221
health
Kudzu is a plant that was imported to the United States from Japan and now covers over seven million acres in the South. The plant contains chemicals called isoflavones that…
MATH 221

5/25/24, 10:33:36 AM

Book of Mormon Wordprint
MATH221
religion
For several years, researchers have used statistics to try to determine the author of a disputed literary work. These techniques have been applied to the Federalist Papers…
MATH 221

5/23/24, 9:08:33 AM

Movie Revenue
MATH221
entertainment
The budget and worldwide revenue of movies released from 1991 to 2015
MATH 221

5/25/24, 10:33:36 AM

Cardiac Arrest Health
MATH221
health
A group of researchers led by Jared Bunch studied the long-term effects suffered by patients who experienced a cardiac arrest outside a hospital. Using the Short-Form…
MATH 221

5/25/24, 10:44:53 AM

Cause of Death
MATH221
health
nutrition
mortality
Researchers used forensic autopsy results to assess the causes of death in several malnourished people in Japan. The ultimate goal of the research is to reduce premature…
MATH 221

5/25/24, 10:33:36 AM

2015 Census
MATH221
population
Documentation on this dataset is scarce, so tread lightly. Dataset contains summary statistics from a 2015 census in the U.S. It is grouped by county.
MATH 221

5/25/24, 10:44:29 AM

Chiropractors
MATH221
health
business
The reasons people go to chiropractors in Europe, Australia, and the United States
MATH 221

5/25/24, 10:33:36 AM

Comet Water Production and Magnitude
MATH221
astronomy
physics
science
A comet is a small icy object which orbits the sun. As a comet approaches the sun, water and other particles thaw and detach from the comet. This forms a small temporary…
MATH 221

5/25/24, 10:33:36 AM

Conjugated Linoleic Acid
MATH221
health
animals
nutrition
Conjugated linoleic acid (CLA) is found in milk fat from cows. It has recently been discovered that CLA has several health-promoting characteristics, including cancer risk…
MATH 221

5/25/24, 10:33:36 AM

COPD Rehab
MATH221
health
The National Heart Lung and Blood Institute gives the following explanation of COPD: COPD, or chronic obstructive pulmonary (PULL-mun-ary) disease, is a progressive disease…
MATH 221

5/25/24, 10:44:29 AM

Cuckoo Eggs
animals
ecology
The size of eggs that cuckoos lay in the nests of other species
MATH 221

5/25/24, 10:33:36 AM

DART Expert DOW 6-month ANOVA
finance
stocks
Percent return of the 3 stock picking options (DARTS, DJIA, PROS)
MATH 221

5/25/24, 10:33:36 AM

DASL Cheese
MATH221
products
food
chemistry
As cheese ages, various chemical processes take place that determine the taste of the final product. Concentrations of various chemicals were measured in 30 samples of…
MATH 221

5/25/24, 10:33:36 AM

DASL Helium Football
sports
physics
Researchers at Ohio State University wanted to determine whether a football filled with helium would fly further than an identical football filled with normal air. The…
MATH 221

5/25/24, 10:33:36 AM

DASL Hot Dog Nutrition
products
health
food
Calorie and sodium contents of different types of hot dogs
MATH 221

5/25/24, 10:33:36 AM

DASL Stepping
MATH221
health
In 1993, students at Ohio State University wanted to determine how heart rate was affected by various stepping exercises. They wanted to consider the relationship between…
MATH 221

5/25/24, 10:33:36 AM

DASL Student
MATH221
health
In his landmark paper on the t-distribution, William S. Gosset referenced data on the number of additional hours of sleep patients obtained by using the drug…
MATH 221

5/25/24, 10:33:36 AM

DASL Taste Test Scores
MATH221
food
In testing food products for palatability, General Foods employed a 7-point scale from -3 (terrible) to +3 (excellent) with 0 representing “average”. Their standard method…
MATH 221

5/25/24, 10:33:36 AM

DASL Waste Run Up
MATH221
production
efficiency
The Levi-Strauss clothing manufacturing plant in Albuquerque, New Mexico gets its cloth supplies from other supplying plants. These data were collected in order to determine…
MATH 221

5/25/24, 10:33:36 AM

Diving Elephant Seals
ecology
animals
Researchers Jessica U. Meir and Paul J. Ponganis measured the body temperatures of a sample of diving elephant seals. A thermistor was placed at a specific location on each…
MATH 221

5/25/24, 10:39:11 AM

Vietnam War Draft
MATH221
politics
war
An aggregated dataset that totals the number of people drafted for the Vietnam War over time
MATH 221

5/25/24, 10:33:36 AM

Estuarine Crocodiles
MATH221
animals
ecology
Head and Body length of estuarine crocodiles. These data were collected in order to estimate the length of the Sarcosuchus imperator, nicknamed “SuperCroc.” Sarcosuchus is a…
MATH 221

5/25/24, 10:44:29 AM

Euro Weight
MATH221
currency
The weight of 2,000 euro coins, measured in laboratory conditions. Researcher Herman Callaert (Hasselt University, Belgium) suggested that the weights of Euro coins might…
MATH 221

5/25/24, 10:33:36 AM

Forced Expiratory Volume
MATH221
health
Forced expiratory volume is the amount of air a person can exhale during a forced breath. This dataset includes the forced expiratory volume of various youths under 20 years…
MATH 221

5/25/24, 10:33:36 AM

Freshman Dinner
MATH221
The number of times freshman students from Colorado and Utah cooked their own dinner in a month
MATH 221

5/25/24, 10:33:36 AM

Gharial Crocodiles
MATH221
animals
Head and Body length of gharial crocodiles. These data were collected in order to estimate the length of the Sarcosuchus imperator, nicknamed ‘SuperCroc.’ Sarcosuchus is a…
MATH 221

6/25/24, 11:09:34 AM

BYUI GPAs
MATH221
education
The GPAs of various BYUI students
MATH 221

6/25/24, 11:08:35 AM

Gratitude
mentalhealth
In a study, people were asked to journal on either things they were grateful for, hassles, or events. They were given a happiness score based on a survey. In 2003 Professors…
MATH 221

6/25/24, 11:08:07 AM

Hot Dog Health
MATH221
health
food
Researchers for Consumer Reports wanted to determine how nutritional content varied between different types of hot dogs. They conducted a laboratory analysis of three…
MATH 221

6/25/24, 11:06:19 AM

Hubble’s Constant: Supernovas
MATH221
astronomy
physics
Recession velocity and distance for 36 supernovas were recorded using the Hubble Space Telescope.
MATH 221

6/25/24, 11:05:12 AM

Illinois Birth Weights
population
health
The birth weights of babies in Illinois were taken, grouped by race and origin of the mother.
MATH 221

6/25/24, 11:04:45 AM

Insulin Resistance, Depression
health
pharmaceuticals
mentalhealth
This dataset is simulated matching reported summary statistics. Type II diabetes is a medical condition involving insulin resistance. Insulin resistance means that the…
MATH 221

6/25/24, 11:04:21 AM

JSE Hats
MATH221
products
haberdashery
The dimensions of various hats
MATH 221

6/25/24, 11:04:00 AM

Lead Exposure and Behavior
MATH221
health
environmentalhealth
behavioralscience
Researchers investigated whether exposure to lead affected the behavior in poor children between the ages of 1 and 3 years old. The researchers measured the blood lead level…
MATH 221

6/25/24, 11:03:23 AM

Madison County Real Estate
MATH221
realestate
economics
Note that these data are from 2010, and no longer relevant unless you’re interested in ancient history. Real estate prices are very variable, and they depend on a variety of…
MATH 221

6/25/24, 11:02:55 AM

Manatees
MATH221
animals
population
ecology
Manatees are curious, peaceful sea creatures that like to sun themselves just below the ocean’s surface. Sadly, this puts them in direct contact with powerboat propellers. …
MATH 221

6/25/24, 11:02:00 AM

Math Self Efficacy
MATH221
education
Shane Goodwin and other researchers studied factors that affect a student’s confidence on a multiple-choice Mathematics exam. A group of n = 139 students in an…
MATH 221

6/25/24, 10:59:37 AM

Movies
MATH221
entertainment
A large dataset containing movies. There are some unknown columns, but some may potentially be of interest.
MATH 221

6/25/24, 10:58:59 AM

Music Height (Long)
music
The heights of singers in various sections (Alto, Bass, Soprano, Tenor)
MATH 221

6/25/24, 10:58:29 AM

NASDAQ Price and Volume
finance
Price and volume of NASDAQ shares
MATH 221

6/25/24, 10:58:04 AM

NBA Players
MATH221
sports
A large dataset of NBA players that spans decades. It has some useful columns and a slew of weird columns of unknown purpose.
MATH 221

6/25/24, 10:57:32 AM

Nicotine Test
MATH221
health
substanceabuse
Cigarette labels warn pregnant women against smoking. Does nicotine actually reach the fetus, crossing the protective placental barrier? Researchers selected consecutive…
MATH 221

6/25/24, 10:49:31 AM

Nosocomial Infections
MATH221
health
Data representing the total number of days patients were hospitalized. Patients were matched based on having a similar condition and other physical characteristics. The data…
MATH 221

6/25/24, 10:48:12 AM

Old Faithful
MATH221
environment
geology
Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. Researchers observed 272 eruptions of…
MATH 221

6/25/24, 10:47:34 AM

Double Stuf Oreos
MATH221
food
products
advertising
A group of statistics students wanted to test the claim that Double Stuf Oreo cookies have twice as much filling (“Stuf”) as the traditional Oreo cookies. The students…
MATH 221

5/25/24, 10:44:29 AM

Patient Satisfaction: Doctor vs. Nurse
MATH221
health
1033 patients with no regular health care provider were randomly assigned to receive treatment from either a doctor or a nurse practitioner for primary and follow-up care…
MATH 221

6/25/24, 10:48:47 AM

Pine Beetle
environment
animals
These data represent observed counts of the number of lodgepole pines per hectare in tree stands before and seven years after a mountain pine beetle outbreak.
MATH 221

6/25/24, 10:46:34 AM

Protein Requirement Campbell
MATH221
health
biology
nutrition
The protein requirement of an individual is the amount of protein they must consume daily to stay in equilibrium. This number varies from individual to individual.…
MATH 221

6/25/24, 10:56:08 AM

Reading Practice
MATH221
education
These data represent the number of days each week that children with developmental problems (the DEV group) and children without developmental problems (the GEN group)…
MATH 221

6/25/24, 10:55:35 AM

REE Classical Music
MATH221
music
psychology
health
Obesity is a growing problem worldwide. Many scientists are seeking creative solutions to trim down this epidemic. Reduced energy expenditure is a potential cause of…
MATH 221

6/25/24, 10:54:50 AM

Singer Heights
MATH221
music
The heights of singers by their group (Soprano, Alto, Tenor, Bass)
MATH 221

6/25/24, 10:43:43 AM

Soccer Shoes
sports
products
advertising
Nike, a company that makes sporting goods including shoes, funded a study to compare five soccer shoe designs. The objective of the research was to assess if footwear could…
MATH 221

6/25/24, 10:39:39 AM

Soviet Accidents
occupationalsafety
military
mechanicalengineering
Dr. Wm. Robert Johnston compiled a list of nuclear accidents on Soviet submarines that caused acute radiation casualties, together with information such as how many deaths…
MATH 221

6/25/24, 10:40:32 AM

Speed of Light by Michelson 1879
MATH221
science
physics
In 1879, Albert Abraham Michelson, an American physicist, published several observations of the speed of light in air. This was early in his quest to measure the speed of…
MATH 221

6/25/24, 10:53:43 AM

Speed of Light by Michelson 1882
MATH221
science
physics
A single-column dataframe of Michelson’s speed of light measurements in 1882. Compare this dataset to the one he recorded in 1879.
MATH 221

6/25/24, 10:53:57 AM

Superbowl Movies
entertainment
Revenue of films and whether they were advertised during the Superbowl
MATH 221

5/25/24, 10:44:29 AM

Twins Diabetes
health
A group of researchers studied the heights of youths diagnosed with type I diabetes. They wanted to see how diabetes affected height. The researchers compared identical…
MATH 221

5/25/24, 10:44:29 AM

Vertebral Heights
health
anthropology
When an x-ray or lateral radiograph of the spine is taken, it is not immediately clear whether a vertebra is fractured. Experts may disagree on their interpretations, and…
MATH 221

5/25/24, 10:44:29 AM

World Cup Heart Attacks
sports
health
Counts of heart attacks during and outside the World Cup
MATH 221

5/25/24, 10:44:29 AM

Wrong Site, Wrong Patient
health
On rare occasions, a medical procedure is performed on the wrong part of the body or on the wrong patient. These are called wrong-site and wrong-patient mistakes. …
MATH 221

5/25/24, 10:44:29 AM

Zinc for Colds
health
Note that this dataset was extracted from a figure in a report. As a possible treatment for common colds, we tested zinc gluconate lozenges in a double-blind, placebo…
MATH 221

5/25/24, 10:44:29 AM

LED example bulbs of lumen output
products
An example data set of LED bulbs based on actual data.
M119

6/25/24, 10:36:25 AM

LED example bulbs of lumen output for two products with standard procedure time point measurements
products
An example data set of LED bulbs based on actual data.
M119

6/25/24, 10:36:47 AM

Access to Drinking Water
DS350
world
resources
Data on drinking water access around the world
DS 350

5/25/24, 10:33:36 AM

Climate Change: Antarctica
DS350
world
climate
The amount of sea ice in Antarctica over time
DS 350

5/24/24, 12:33:27 PM

Climate Change: Ocean
DS 350
world
climate
Ocean heat content is measured relative to the 1971–2000 average, which is set at zero for reference. It is measured in 10²² joules. For reference, 10²² joules are equal to…
DS 350

5/25/24, 10:33:36 AM

Denver residential dwelling sales for 2013
housing
DS250
Attributes of each dwelling with their selling price for homes that sold in Denver in 2013
DS 250

1/10/25, 3:40:52 PM

Child height and weight measurements for all data from three studies at one year of age.
health
child
Data from three different research studies. Each study had different research objectives.
DS 150

5/22/24, 3:57:41 PM

Child height and weight HAZ summaries for multiple countries
health
child
Data from three different research studies. Each study had different research objectives.
DS 150

5/22/24, 3:57:41 PM

Dutch child birth data
health
child
Longitudinal height and weight measurements during ages 0-2 years for a representative sample of 1,933 Dutch children born in 1988-1989.
DS 150

5/22/24, 3:57:41 PM

Dutch child height and weight measurements
health
child
Longitudinal height and weight measurements during ages 0-2 years for a representative sample of 1,933 Dutch children born in 1988-1989.
DS 150

5/22/24, 3:57:41 PM

WHO coefficients for height Z-score calculations
health
child
See https://www.cdc.gov/nchs/data/nhsr/nhsr063.pdf for a description of how calculations are made. However, the CDC has different coefficients.
DS 150

5/22/24, 3:57:41 PM

Child height, weight, head circumference measurements in resource-poor environments
health
child
Subset of growth data from the Malnutrition and Enteric Disease Study (MAL-ED).
DS 150

5/22/24, 3:57:41 PM

US child birth data
health
child
Subset of growth data from the collaborative perinatal project (CPP).
DS 150

5/22/24, 3:57:41 PM

US child height and weight measurements
health
child
Subset of growth data from the collaborative perinatal project (CPP).
DS 150

5/22/24, 3:57:41 PM

WHO coefficients for weight Z-score calculations
health
child
See https://www.cdc.gov/nchs/data/nhsr/nhsr063.pdf for a description of how calculations are made. However, the CDC has different coefficients.
DS 150

5/22/24, 3:57:41 PM

The full set of runners for all races during 2010.
marathon
This data set has 800k runners. The NYT had a good article -…
DS 150

5/25/24, 11:06:49 AM

The 50% sample of male/female runners for all years of the Berlin marathon that recorded gender.
marathon
This data set has ~200k observations. Marathon website - https://www.bmw-berlin-marathon.com/en/
DS 150

6/25/24, 11:00:23 AM

The full set of runners for the Big Sur marathon.
marathon
This data set has ~40k observations. Marathon website - https://www.bigsurmarathon.org/
DS 150

5/25/24, 11:06:49 AM

The full set of runners for the Jerusalem marathon.
marathon
This data set has ~2.5k observations. Marathon website - https://jerusalem-marathon.com/en/home-page/
DS 150

5/25/24, 11:06:49 AM

A random sample of 50% of males and females for each year of runners for all years of the New York City marathon where gender is recorded.
marathon
This data set has just over 200k runners. The NYT had a good article -…
DS 150

5/25/24, 11:06:49 AM

A resampled set of runners from all marathons with more than 50 runners.
marathon
Each marathon will have 100 runners (50 male, 50 female) per year. So any marathon with fewer than 50 runners in the group will have multiple resampled runners. This data set…
DS 150

5/25/24, 11:06:49 AM

Table of Information about Marathons
marathon
An interesting data set to see the effects of goals on what should be a unimodal distribution of finish times. The NYT had a good article -…
DS 150

5/25/24, 11:06:49 AM

Race Location
marathon
This data set has ~2k observations.
DS 150

1/10/25, 4:13:21 PM

World Health Organization (WHO) Tuberculosis budgets by country
health
See source for description of the data. tb_dictionary describes the column names.
DS 150

6/25/24, 10:26:06 AM

World Health Organization (WHO) Tuberculosis case notifications by country
health
See source for description of the data. tb_dictionary describes the column names.
DS 150

6/25/24, 10:26:29 AM

World Health Organization (WHO) Tuberculosis csv file column names
health
Data dictionary for tuberculosis datasets. File found at https://extranet.who.int/tme/generateCSV.asp?ds=dictionary
DS 150

6/25/24, 10:27:43 AM

Tuberculosis Estimates
health
See source for description of the data. tb_dictionary describes the column names.
DS 150

6/25/24, 10:26:54 AM

World Health Organization (WHO) Tuberculosis treatment outcomes by country
health
See source for description of the data. tb_dictionary describes the column names.
DS 150

6/25/24, 10:36:06 AM

World Health Organization (WHO) Tuberculosis expenditures and utilization by country
health
See source for description of the data. tb_dictionary describes the column names.
DS 150

6/25/24, 10:27:25 AM
