Child height, weight, head circumference measurements in resource-poor environments

Subset of growth data from the Malnutrition and Enteric Disease Study (MAL-ED).
health
child
Author

DS 150

Published

November 5, 2023

Data details

There are 48,632 rows and 16 columns. The data source¹ is used to create the dataset stored in our pins table. You can access this pin from a connection to posit.byui.edu using hathawayj/childhealth_maled.
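As a quick sanity check after loading, the documented shape can be compared against the DataFrame you actually received. The sketch below assumes the pin has already been read into pandas. `check_maled_shape` and the two constants are hypothetical names introduced here for illustration; the row count and column names are copied from this page.

```python
import pandas as pd

# Documented on this page: 48,632 rows and 16 columns.
EXPECTED_ROWS = 48_632
EXPECTED_COLUMNS = [
    "subjid", "country", "sex", "agedays", "wtkg", "stcm", "htcm", "lncm",
    "lh_used", "hccm", "lhaz", "haz", "laz", "waz", "hcaz", "whz",
]

def check_maled_shape(df: pd.DataFrame, expected_rows: int = EXPECTED_ROWS) -> list[str]:
    """Return a list of readable problems; an empty list means the shape matches."""
    problems = []
    # Compare by membership so that column order does not matter.
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    extra = [c for c in df.columns if c not in EXPECTED_COLUMNS]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if len(df) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(df)}")
    return problems
```

Calling `check_maled_shape(dat)` on the loaded data should return an empty list if the pin still matches this description.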

Variable description

  • subjid: Unique identifier of each child
  • country: Label for the country (8 distinct values)
  • sex: Male or Female
  • agedays: Age in days
  • wtkg: Weight measurement in kg (1.66-20.08 in this subset)
  • stcm: Stature (either length or height) in cm
  • htcm: Height in cm
  • lncm: Length in cm
  • lh_used: Length or height used for stature
  • hccm: Head Circumference in cm
  • lhaz: Length or Height for age in SDS relative to WHO child growth standard
  • haz: Height for age in SDS relative to WHO child growth standard
  • laz: Length for age in SDS relative to WHO child growth standard
  • waz: Weight for age in SDS relative to WHO child growth standard
  • hcaz: Head circumference for age in SDS relative to WHO child growth standard
  • whz: Weight for height or length in SDS relative to WHO child growth standard

Variable summary

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
agedays 0 1.00 341.81 226.81 0.00 149.00 334.00 546.00 759.00 ▇▆▆▆▅
wtkg 29 1.00 7.83 2.52 1.66 6.29 8.10 9.52 20.08 ▃▇▅▁▁
stcm 8314 0.83 69.30 9.63 38.90 63.00 70.60 76.60 97.00 ▁▃▇▇▁
htcm 48527 0.00 83.16 4.28 73.50 80.50 82.50 85.60 96.80 ▂▇▅▂▁
lncm 8316 0.83 69.30 9.63 38.90 63.00 70.60 76.60 97.00 ▁▃▇▇▁
hccm 13142 0.73 42.87 3.72 29.00 40.90 43.60 45.50 53.70 ▁▂▆▇▁
lhaz 8314 0.83 -1.29 1.20 -6.72 -2.08 -1.30 -0.53 4.71 ▁▃▇▂▁
haz 48527 0.00 -1.32 1.29 -3.98 -2.15 -1.45 -0.53 2.90 ▂▇▅▂▁
laz 8316 0.83 -1.29 1.20 -6.72 -2.08 -1.30 -0.53 4.71 ▁▃▇▂▁
waz 29 1.00 -0.78 1.24 -6.27 -1.57 -0.79 0.00 5.33 ▁▃▇▁▁
hcaz 13142 0.73 -0.89 1.21 -5.60 -1.73 -0.91 -0.10 9.35 ▁▇▂▁▁
whz 8346 0.83 0.09 1.26 -5.92 -0.76 0.04 0.89 6.80 ▁▃▇▁▁

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
subjid 0 1.00 18 18 0 2145 0
sex 0 1.00 4 6 0 2 0
country 0 1.00 4 12 0 8 0
lh_used 8293 0.83 6 6 0 2 0
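
The n_missing and complete_rate columns above can be reproduced in pandas. This is a minimal sketch, assuming complete_rate is the share of non-missing rows rounded to two decimals, which matches the table (for stcm, 1 - 8314/48632 rounds to 0.83). `missing_summary` is a hypothetical helper name, and the toy values are made up rather than taken from MAL-ED.

```python
import pandas as pd

def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column missing count and share of non-missing rows."""
    n_missing = df.isna().sum()
    complete_rate = (1 - n_missing / len(df)).round(2)
    return pd.DataFrame({"n_missing": n_missing, "complete_rate": complete_rate})

# Toy data (made up for illustration, not MAL-ED values).
toy = pd.DataFrame({
    "wtkg": [3.2, 4.1, None, 5.0],
    "hccm": [None, None, 38.5, 40.0],
})
print(missing_summary(toy))
```

Running the same function on the loaded data gives one row per variable, which can be compared against the tables above.
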
Explore generating code using R
## Obfuscate data
# https://clinepidb.org/ce/app/record/dataset/DS_5c41b87221

pacman::p_load(tidyverse, fs, sf, arrow, googledrive, downloader, glue, rvest, pins, connectapi)

sdrive <- shared_drive_find("byuids_data")
maled_file <- drive_ls(sdrive)  |>
    filter(stringr::str_detect(name, "MALED"))
tempf <- tempfile()
drive_download(maled_file, tempf)
dat <- read_csv(tempf)

childhealth_maled <- dat %>%
  select(
    subjid = `Participant ID`, sex = Sex, country = Country,
    agedays = `Age (days)`, wtkg = `Weight (kg)`, stcm = `Stature (cm)`,
    htcm = `Height (cm)`, lncm = `Recumbent length (cm)`,
    lh_used = `Recumbent length or height used for stature`,
    hccm = `Head circumference (cm)`,
    lhaz = `Length- or height-for-age z-score`,
    haz = `Height-for-age z-score`, laz= `Length-for-age z-score`,
    waz = `Weight-for-age z-score`, hcaz = `Head circumference-for-age z-score`,
    whz = `Weight-for-length or -height z-score`)

board <- board_connect()
pin_write(board, childhealth_maled, type = "parquet") # adjust permission to campus on site.

pin_name <- "childhealth_maled"
meta <- pin_meta(board, paste0("hathawayj/", pin_name))
client <- connect()
my_app <- content_item(client, meta$local$content_id)
set_vanity_url(my_app, paste0("data/", pin_name))
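
For Python users, the column selection and renaming step in the R code above can be sketched in pandas. This covers only the select() call, not the Google Drive download or the pin upload. `RENAME_MAP` and `select_and_rename` are hypothetical names introduced here, and the source column names are copied from the R code on the assumption that the raw CSV uses exactly those headers.

```python
import pandas as pd

# Source column name -> short name, copied from the select() call above.
RENAME_MAP = {
    "Participant ID": "subjid",
    "Sex": "sex",
    "Country": "country",
    "Age (days)": "agedays",
    "Weight (kg)": "wtkg",
    "Stature (cm)": "stcm",
    "Height (cm)": "htcm",
    "Recumbent length (cm)": "lncm",
    "Recumbent length or height used for stature": "lh_used",
    "Head circumference (cm)": "hccm",
    "Length- or height-for-age z-score": "lhaz",
    "Height-for-age z-score": "haz",
    "Length-for-age z-score": "laz",
    "Weight-for-age z-score": "waz",
    "Head circumference-for-age z-score": "hcaz",
    "Weight-for-length or -height z-score": "whz",
}

def select_and_rename(raw: pd.DataFrame) -> pd.DataFrame:
    """Keep only the mapped columns, in mapping order, under their short names."""
    return raw[list(RENAME_MAP)].rename(columns=RENAME_MAP)
```

Any column of the raw file that is not listed in the mapping is dropped, which mirrors how select() behaves in the R code.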

Access data

This data is available to BYUI users.

Direct Download: childhealth_maled.parquet

R and Python Download:

URL Connections:

For public data, any user can connect and read the data using pins::board_connect_url() in R.

library(pins)
url_data <- "https://posit.byui.edu/data/childhealth_maled/"
board_url <- board_connect_url(c("dat" = url_data))
dat <- pin_read(board_url, "dat")

Use this custom function in Python to load the data into a pandas DataFrame.

import pandas as pd
import requests
from io import BytesIO

def read_url_pin(name):
  url = "https://posit.byui.edu/data/" + name + "/" + name + ".parquet"
  response = requests.get(url)
  if response.status_code == 200:
    parquet_content = BytesIO(response.content)
    pandas_dataframe = pd.read_parquet(parquet_content)
    return pandas_dataframe
  else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")
    return None

# Example usage:
pandas_df = read_url_pin("childhealth_maled")

Authenticated Connection:

Our Connect server is https://posit.byui.edu. Assign this URL to your CONNECT_SERVER environment variable. You must also create an API key and store it in your environment under CONNECT_API_KEY.

Read more about environment variables and the pins package to understand how these environment variables are stored and accessed in R and Python with pins.

library(pins)
board <- board_connect(auth = "auto")
dat <- pin_read(board, "hathawayj/childhealth_maled")

The equivalent authenticated read in Python:

import os
from pins import board_rsconnect
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')

board = board_rsconnect(server_url=SERVER, api_key=API_KEY)
dat = board.pin_read("hathawayj/childhealth_maled")
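
Because the authenticated connection depends on both environment variables being set, it can help to check them before building the board. The sketch below is one way to do that. `connect_settings` and `REQUIRED_VARS` are hypothetical names introduced here, not part of pins or python-dotenv.

```python
import os

# The two variables described in the Authenticated Connection section.
REQUIRED_VARS = ("CONNECT_SERVER", "CONNECT_API_KEY")

def connect_settings(env=os.environ) -> dict:
    """Return the two Connect settings, or raise if either is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `connect_settings()` after `load_dotenv()` returns a dictionary whose values can be passed as `server_url` and `api_key`. If a variable is missing, it fails with a message naming that variable.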