Dutch child birth data

Longitudinal height and weight measurements during ages 0-2 years for a representative sample of 1,933 Dutch children born in 1988-1989.
health
child
Author

DS 150

Published

November 5, 2023

Data details

There are 206 rows and 3 columns, one row per child. The data source is used to create our data, which is stored in our pins table. You can access this pin from a connection to posit.byui.edu using hathawayj/birth_dutch.

Variable description

  • subjid: unique identifier of each child
  • sex: Male or Female
  • birthwt: birth weight of the child, in grams

Variable summary

Variable type: numeric

skim_variable  n_missing  complete_rate  mean      sd      p0     p25       p50      p75       p100   hist
subjid         0          1              10632.60  502.06  10001  10064.25  11017.5  11071.75  11127  ▆▁▁▁▇
birthwt        0          1              3484.73   571.14  1180   3135.00   3500.0   3897.50   5100   ▁▂▇▇▁

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
sex            0          1              4    6    0      2         0
Explore generating code using R
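# Load the required packages (pacman installs any that are missing)
pacman::p_load(tidyverse, fs, sf, arrow, googledrive, downloader, glue, rvest, pins, connectapi)

# Download the source .rda file at a fixed commit, then load smocc_hgtwgt
t.dat <- tempfile()
download("https://github.com/stefvanbuuren/brokenstick/raw/71dc99e62ce57b58d5c1d2a1074fbd4bf394e559/data/smocc_hgtwgt.rda", t.dat, mode = "wb")

load(t.dat)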

birth_dutch <- smocc_hgtwgt %>%
  group_by(subjid) %>%
  summarise(sex = sex[1], birthwt = birthwt[1]) %>%
  ungroup()


# Publish the data as a parquet pin that anyone can read
board <- board_connect()

pin_write(board, birth_dutch, type = "parquet", access_type = "all")

# Give the pin a vanity URL (data/birth_dutch) for public access
pin_name <- "birth_dutch"
meta <- pin_meta(board, paste0("hathawayj/", pin_name))
client <- connect()
my_app <- content_item(client, meta$local$content_id)
set_vanity_url(my_app, paste0("data/", pin_name))
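
The reduction step in the R code above keeps one row per child by taking the first recorded sex and birthwt for each subjid. A minimal Python sketch of the same idea follows, using only the standard library. The sample rows and the function name first_per_child are illustrative assumptions, not part of the source data.

```python
def first_per_child(rows):
    """Keep the first sex and birthwt seen for each subjid."""
    first = {}
    for row in rows:
        # setdefault only stores a record the first time a subjid appears
        first.setdefault(row["subjid"], {
            "subjid": row["subjid"],
            "sex": row["sex"],
            "birthwt": row["birthwt"],
        })
    # Return the children ordered by subjid, like the grouped summary in R
    return [first[key] for key in sorted(first)]


# Hypothetical long-format rows: several visits per child
visits = [
    {"subjid": 10002, "age": 0.0, "sex": "Female", "birthwt": 3100},
    {"subjid": 10001, "age": 0.0, "sex": "Male", "birthwt": 3500},
    {"subjid": 10001, "age": 0.1, "sex": "Male", "birthwt": 3500},
    {"subjid": 10002, "age": 0.2, "sex": "Female", "birthwt": 3100},
]

birth_rows = first_per_child(visits)
print(birth_rows)
```

Each child appears once in the result, however many visits were recorded.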

Access data

This data is available to all.

Direct Download: birth_dutch.parquet

R and Python Download:

URL Connections:

For public data, any user can connect and read the data using pins::board_connect_url() in R.

library(pins)
url_data <- "https://posit.byui.edu/data/birth_dutch/"
board_url <- board_connect_url(c("dat" = url_data))
dat <- pin_read(board_url, "dat")

In Python, use this custom function to read the data into a pandas DataFrame.

import pandas as pd
import requests
from io import BytesIO

def read_url_pin(name):
  url = "https://posit.byui.edu/data/" + name + "/" + name + ".parquet"
  response = requests.get(url)
  if response.status_code == 200:
    parquet_content = BytesIO(response.content)
    pandas_dataframe = pd.read_parquet(parquet_content)
    return pandas_dataframe
  else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")
    return None

# Example usage:
pandas_df = read_url_pin("birth_dutch")
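
The function above builds the file URL from the pin name following the pattern data/<name>/<name>.parquet. As a rough sketch, the same pattern can be used to save the file locally with only the standard library. The helper names pin_parquet_url and download_pin are hypothetical, and the sketch assumes other public pins on this server follow the same URL pattern.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://posit.byui.edu/data/"  # same base URL as the function above


def pin_parquet_url(name, base_url=BASE_URL):
    """Build <base>/<name>/<name>.parquet, escaping the pin name."""
    safe = quote(name, safe="")
    return f"{base_url.rstrip('/')}/{safe}/{safe}.parquet"


def download_pin(name, dest, timeout=30):
    """Save the parquet file to dest; urlopen raises on HTTP or network errors."""
    with urlopen(pin_parquet_url(name), timeout=timeout) as response:
        data = response.read()
    with open(dest, "wb") as fh:
        fh.write(data)
    return dest


# Building the URL needs no network access
print(pin_parquet_url("birth_dutch"))
```

Calling download_pin("birth_dutch", "birth_dutch.parquet") would write the file to the working directory, after which pd.read_parquet can read it from disk. Raising on failure, rather than printing and returning None, is a design choice of this sketch.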

Authenticated Connection:

Our Connect server is https://posit.byui.edu. Assign this URL to your CONNECT_SERVER environment variable. You must also create an API key and store it in your environment under CONNECT_API_KEY.

Read more about environment variables and the pins package to understand how these environment variables are stored and accessed in R and Python with pins.

library(pins)
board <- board_connect(auth = "auto")
dat <- pin_read(board, "hathawayj/birth_dutch")

import os
from pins import board_rsconnect
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')

board = board_rsconnect(server_url=SERVER, api_key=API_KEY)
dat = board.pin_read("hathawayj/birth_dutch")
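
Both authenticated examples depend on CONNECT_SERVER and CONNECT_API_KEY being set. A small check before connecting can give a clearer error than a failed request. This is only a sketch: the function name connect_settings is hypothetical, and the key shown is a placeholder.

```python
import os


def connect_settings(env=None):
    """Return (server, api_key), or raise if either variable is missing."""
    env = os.environ if env is None else env
    required = ("CONNECT_SERVER", "CONNECT_API_KEY")
    # Treat unset and empty values the same way
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env["CONNECT_SERVER"], env["CONNECT_API_KEY"]


# Placeholder values for illustration; a real key comes from your Connect account
example_env = {
    "CONNECT_SERVER": "https://posit.byui.edu",
    "CONNECT_API_KEY": "not-a-real-key",
}
server, api_key = connect_settings(example_env)
print(server)
```

Called with no argument, connect_settings() reads the real environment, so it can run right after load_dotenv().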