Tuberculosis Estimates

See source for description of the data. tb_dictionary describes the column names.
health
Author

DS 150

Published

February 20, 2024

Data details

The data has 4,917 rows and 50 columns. It is created from the data source and stored in our pins table. You can access this pin from a connection to posit.byui.edu using hathawayj/tb_estimates.

This data is available to all.

Variable description

See source for description of the data. tb_dictionary describes the column names.

Variable summary

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
year 0 1.00 2011.04 6.63 2000 2005.00 2011.00 2017.00 2.022000e+03 ▇▆▇▆▇
e_pop_num 0 1.00 32961977.37 130655150.64 1343 751115.00 5866405.00 21239457.00 1.425893e+09 ▇▁▁▁▁
e_inc_100k 0 1.00 123.05 184.82 0 12.00 46.00 161.00 1.590000e+03 ▇▁▁▁▁
e_inc_100k_lo 0 1.00 73.07 96.75 0 9.90 33.00 97.00 6.670000e+02 ▇▁▁▁▁
e_inc_100k_hi 0 1.00 193.60 362.00 0 14.00 57.00 235.00 5.510000e+03 ▇▁▁▁▁
e_inc_num 0 1.00 51549.20 250315.49 0 210.00 2900.00 17000.00 3.590000e+06 ▇▁▁▁▁
e_inc_num_lo 0 1.00 31866.85 153766.38 0 150.00 2100.00 11000.00 2.610000e+06 ▇▁▁▁▁
e_inc_num_hi 0 1.00 78242.33 409580.78 0 260.00 3700.00 25000.00 7.080000e+06 ▇▁▁▁▁
e_tbhiv_prct 685 0.86 12.23 17.02 0 1.10 5.10 16.00 1.000000e+02 ▇▁▁▁▁
e_tbhiv_prct_lo 685 0.86 8.54 13.77 0 0.44 2.60 10.00 8.300000e+01 ▇▁▁▁▁
e_tbhiv_prct_hi 685 0.86 17.67 21.44 0 2.80 8.70 24.00 1.000000e+02 ▇▂▁▁▁
e_inc_tbhiv_100k 685 0.86 36.16 116.77 0 0.21 1.95 12.00 1.320000e+03 ▇▁▁▁▁
e_inc_tbhiv_100k_lo 685 0.86 14.90 44.08 0 0.08 0.83 5.60 4.370000e+02 ▇▁▁▁▁
e_inc_tbhiv_100k_hi 685 0.86 70.22 263.64 0 0.57 3.60 21.00 4.570000e+03 ▇▁▁▁▁
e_inc_tbhiv_num 685 0.86 7133.26 31846.62 0 11.00 160.00 1825.00 4.610000e+05 ▇▁▁▁▁
e_inc_tbhiv_num_lo 685 0.86 2997.77 13392.37 0 4.00 72.00 890.00 2.290000e+05 ▇▁▁▁▁
e_inc_tbhiv_num_hi 685 0.86 13527.80 63177.17 0 19.75 240.00 3100.00 1.040000e+06 ▇▁▁▁▁
e_mort_exc_tbhiv_100k 23 1.00 14.64 22.55 0 0.87 4.00 19.00 1.880000e+02 ▇▁▁▁▁
e_mort_exc_tbhiv_100k_lo 23 1.00 9.12 13.09 0 0.73 3.30 12.00 9.600000e+01 ▇▁▁▁▁
e_mort_exc_tbhiv_100k_hi 23 1.00 21.77 35.77 0 1.00 4.60 26.00 3.100000e+02 ▇▁▁▁▁
e_mort_exc_tbhiv_num 23 1.00 6633.41 37296.31 0 16.00 220.00 1700.00 7.870000e+05 ▇▁▁▁▁
e_mort_exc_tbhiv_num_lo 23 1.00 4743.14 27937.41 0 13.00 190.00 1100.00 5.900000e+05 ▇▁▁▁▁
e_mort_exc_tbhiv_num_hi 23 1.00 8946.76 48350.25 0 17.00 250.00 2500.00 1.010000e+06 ▇▁▁▁▁
e_mort_tbhiv_100k 23 1.00 11.18 39.25 0 0.01 0.19 2.20 4.810000e+02 ▇▁▁▁▁
e_mort_tbhiv_100k_lo 23 1.00 6.02 22.43 0 0.00 0.08 1.10 3.060000e+02 ▇▁▁▁▁
e_mort_tbhiv_100k_hi 23 1.00 18.19 61.70 0 0.04 0.38 3.70 6.940000e+02 ▇▁▁▁▁
e_mort_tbhiv_num 23 1.00 2376.11 12208.58 0 0.00 16.00 340.00 2.230000e+05 ▇▁▁▁▁
e_mort_tbhiv_num_lo 23 1.00 1119.46 5480.67 0 0.00 7.00 170.00 9.800000e+04 ▇▁▁▁▁
e_mort_tbhiv_num_hi 23 1.00 4219.52 23987.97 0 1.00 28.00 560.00 5.550000e+05 ▇▁▁▁▁
e_mort_100k 23 1.00 25.83 53.92 0 0.98 4.60 24.00 5.330000e+02 ▇▁▁▁▁
e_mort_100k_lo 23 1.00 16.91 33.90 0 0.80 3.80 17.00 3.550000e+02 ▇▁▁▁▁
e_mort_100k_hi 23 1.00 36.84 79.13 0 1.20 5.40 31.00 7.470000e+02 ▇▁▁▁▁
e_mort_num 23 1.00 9014.90 46343.85 0 18.00 270.00 2800.00 9.800000e+05 ▇▁▁▁▁
e_mort_num_lo 23 1.00 6351.96 33441.23 0 15.00 230.00 1900.00 7.130000e+05 ▇▁▁▁▁
e_mort_num_hi 23 1.00 12227.81 61749.72 0 22.00 300.00 3700.00 1.340000e+06 ▇▁▁▁▁
cfr 123 0.97 0.16 0.13 0 0.08 0.11 0.22 1.000000e+00 ▇▂▁▁▁
cfr_lo 123 0.97 0.09 0.08 0 0.05 0.08 0.12 9.700000e-01 ▇▁▁▁▁
cfr_hi 123 0.97 0.24 0.21 0 0.11 0.16 0.34 1.000000e+00 ▇▂▂▁▁
cfr_pct 123 0.97 16.33 13.44 0 8.00 11.00 22.00 1.000000e+02 ▇▂▁▁▁
cfr_pct_lo 123 0.97 9.33 7.96 0 5.00 8.00 12.00 9.700000e+01 ▇▁▁▁▁
cfr_pct_hi 123 0.97 24.26 20.73 0 11.00 16.00 34.00 1.000000e+02 ▇▂▂▁▁
c_newinc_100k 181 0.96 73.52 104.45 0 10.00 37.00 95.00 9.330000e+02 ▇▁▁▁▁
c_cdr 286 0.94 73.33 19.44 0 61.00 80.00 87.00 2.400000e+02 ▁▇▁▁▁
c_cdr_lo 286 0.94 57.71 20.27 0 41.00 63.00 75.00 1.700000e+02 ▃▇▇▁▁
c_cdr_hi 286 0.94 107.49 118.83 0 95.50 100.00 100.00 6.700000e+03 ▇▁▁▁▁
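
The numeric summary above can be approximated in pandas. The following is a minimal sketch, not the code that produced the table: skim_numeric and the toy frame are hypothetical names and made-up values, and the quantiles rely on pandas' default linear interpolation.

```python
import pandas as pd


def skim_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise numeric columns using the column layout of the table above."""
    numeric = df.select_dtypes(include="number")
    summary = pd.DataFrame({
        "n_missing": numeric.isna().sum(),
        "complete_rate": numeric.notna().mean(),
        "mean": numeric.mean(),
        "sd": numeric.std(),  # sample standard deviation (ddof=1)
        "p0": numeric.min(),
        "p25": numeric.quantile(0.25),
        "p50": numeric.quantile(0.50),
        "p75": numeric.quantile(0.75),
        "p100": numeric.max(),
    })
    summary.index.name = "skim_variable"
    return summary


# Tiny made-up frame, only to show the call; these are not real TB estimates.
toy = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003],
    "cfr": [0.10, None, 0.20, 0.30],
    "country": ["A", "A", "B", "B"],
})
print(skim_numeric(toy).round(2))
```

Rounding to two decimals explains one detail of the table: a column with 23 missing values out of 4,917 rows has a complete_rate of about 0.9953, which displays as 1.00.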

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
country 0 1 4 56 0 217 0
iso2 23 1 2 2 0 216 0
iso3 0 1 3 3 0 217 0
iso_numeric 0 1 3 3 0 217 0
g_whoregion 0 1 3 3 0 6 0
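
The iso2 row shows 23 missing values and 216 unique codes against 217 countries. One likely explanation, which is an inference and not something the source states, is that Namibia's two-letter code "NA" is parsed as a missing value when the CSV is read with default settings (readr's read_csv and pandas' read_csv both treat the string "NA" as missing by default). The sketch below illustrates the effect in pandas on a made-up two-row CSV.

```python
import io

import pandas as pd

# Made-up two-row CSV; only the iso2 values matter here.
csv_text = "country,iso2,year\nNamibia,NA,2000\nNepal,NP,2000\n"

# Default parsing treats the literal string "NA" as a missing value.
default_read = pd.read_csv(io.StringIO(csv_text))

# Keep "NA" as text by treating only empty fields as missing.
strict_read = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,
    na_values=[""],
)

print(default_read["iso2"].isna().sum())  # 1
print(strict_read["iso2"].tolist())       # ['NA', 'NP']
```

If this is the cause, rows with a missing iso2 can still be identified through the iso3 column, which has no missing values.
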
Explore generating code using R

library(tidyverse)
library(pins)
library(connectapi)

# Download the TB estimates CSV from the WHO site.
tb_estimates <- read_csv("https://extranet.who.int/tme/generateCSV.asp?ds=estimates")

# Publish the data to the server with Bro. Hathaway as the owner.
pin_name <- "tb_estimates"
board <- board_connect()
pin_write(board, tb_estimates, name = pin_name, type = "parquet")

# Look up the pin's content item and give it a readable vanity URL.
meta <- pin_meta(board, paste0("hathawayj/", pin_name))
client <- connect()
my_app <- content_item(client, meta$local$content_id)
set_vanity_url(my_app, paste0("data/", pin_name))

Access data

Direct Download: tb_estimates.parquet

R and Python Download:

URL Connections:

For public data, any user can connect and read the data using pins::board_connect_url() in R.

library(pins)
url_data <- "https://posit.byui.edu/data/tb_estimates/"
board_url <- board_connect_url(c("dat" = url_data))
dat <- pin_read(board_url, "dat")

Use this custom function in Python to load the data into a pandas DataFrame.

import pandas as pd
import requests
from io import BytesIO

def read_url_pin(name):
  url = "https://posit.byui.edu/data/" + name + "/" + name + ".parquet"
  response = requests.get(url)
  if response.status_code == 200:
    parquet_content = BytesIO(response.content)
    pandas_dataframe = pd.read_parquet(parquet_content)
    return pandas_dataframe
  else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")
    return None

# Example usage:
pandas_df = read_url_pin("tb_estimates")

Authenticated Connection:

Our Connect server is https://posit.byui.edu. Assign this URL to your CONNECT_SERVER environment variable. You must also create an API key and store it in your environment under CONNECT_API_KEY.

Read more about environment variables and the pins package to understand how these environment variables are stored and accessed in R and Python with pins.
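
Before opening an authenticated connection, it can help to confirm that both variables are set. The helper below is a hypothetical sketch (connect_settings is not part of pins or any other library); it only reads a mapping such as os.environ and raises a clear error when a value is missing.

```python
import os

REQUIRED_VARS = ("CONNECT_SERVER", "CONNECT_API_KEY")


def connect_settings(env=os.environ):
    """Return (server URL, API key), or raise if either variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["CONNECT_SERVER"], env["CONNECT_API_KEY"]


# Example with a stand-in mapping; a real key should never be hard-coded.
server, key = connect_settings({
    "CONNECT_SERVER": "https://posit.byui.edu",
    "CONNECT_API_KEY": "placeholder-key",
})
print(server)
```

Calling connect_settings() with no argument checks the real environment, so a missing key fails early with a readable message.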

library(pins)
board <- board_connect(auth = "auto")
dat <- pin_read(board, "hathawayj/tb_estimates")

# Python equivalent
import os
from pins import board_rsconnect
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')

board = board_rsconnect(server_url=SERVER, api_key=API_KEY)
dat = board.pin_read("hathawayj/tb_estimates")