US Census Record Names

First Names of US Social Security Card Holders
DS250
names
Author

DS 250

Published

October 15, 2025

Data details

There are 640,295 rows and 54 columns. The data source is used to create the dataset stored in our pins table. You can access this pin from a connection to posit.byui.edu using hathawayj/names_year_csv.

This data is available to all.

Variable description

  • name: The birth name
  • year: The year of their social security number creation.
  • State Name Columns (AK through WY): One column for each of the 50 states and DC, holding the count for that name and year in that state.
  • Total: The total over all the states.
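To make the wide layout concrete, the sketch below builds a two-row toy frame with only three of the 51 state columns and checks that Total equals the row-wise sum of the state columns. The counts are invented for illustration and are not taken from the data, and the helper name state_columns is hypothetical.

```python
import pandas as pd

# Toy rows in the same wide layout; the counts are made up for illustration.
toy = pd.DataFrame(
    {
        "name": ["Ruth", "Ruth"],
        "year": [1911, 1912],
        "AK": [7, 0],
        "CA": [120, 131],
        "UT": [15, 18],
        "Total": [142, 149],
    }
)


def state_columns(df):
    """Return every column that is not name, year, or Total."""
    return [c for c in df.columns if c not in ("name", "year", "Total")]


# The row-wise sum of the state columns should reproduce Total.
row_sums = toy[state_columns(toy)].sum(axis=1)
totals_match = bool((row_sums == toy["Total"]).all())
```

The same check should carry over to the full table after it is read into pandas, assuming the column names match the variable description above.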

Variable summary

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
year 0 1 1980.63 32.00 1910 1957 1989 2007 2024 ▃▃▃▆▇
AK 0 1 0.72 4.95 0 0 0 0 206 ▇▁▁▁▁
AL 0 1 9.53 56.31 0 0 0 0 3050 ▇▁▁▁▁
AR 0 1 5.62 32.33 0 0 0 0 1627 ▇▁▁▁▁
AZ 0 1 6.30 29.80 0 0 0 0 1047 ▇▁▁▁▁
CA 0 1 51.70 240.24 0 0 6 19 8315 ▇▁▁▁▁
CO 0 1 6.19 30.98 0 0 0 0 1037 ▇▁▁▁▁
CT 0 1 5.69 37.79 0 0 0 0 1624 ▇▁▁▁▁
DC 0 1 2.33 17.21 0 0 0 0 883 ▇▁▁▁▁
DE 0 1 1.06 7.60 0 0 0 0 312 ▇▁▁▁▁
FL 0 1 17.59 85.48 0 0 0 7 3506 ▇▁▁▁▁
GA 0 1 14.48 72.85 0 0 0 6 3080 ▇▁▁▁▁
HI 0 1 1.60 8.63 0 0 0 0 326 ▇▁▁▁▁
IA 0 1 6.88 43.21 0 0 0 0 2370 ▇▁▁▁▁
ID 0 1 2.04 10.92 0 0 0 0 530 ▇▁▁▁▁
IL 0 1 25.60 145.72 0 0 0 8 6248 ▇▁▁▁▁
IN 0 1 12.29 72.68 0 0 0 0 3014 ▇▁▁▁▁
KS 0 1 5.47 30.94 0 0 0 0 1387 ▇▁▁▁▁
KY 0 1 9.02 55.19 0 0 0 0 2561 ▇▁▁▁▁
LA 0 1 9.39 46.69 0 0 0 0 1681 ▇▁▁▁▁
MA 0 1 12.75 88.15 0 0 0 0 3910 ▇▁▁▁▁
MD 0 1 7.81 45.69 0 0 0 0 1793 ▇▁▁▁▁
ME 0 1 2.29 14.97 0 0 0 0 778 ▇▁▁▁▁
MI 0 1 19.63 118.25 0 0 0 6 4941 ▇▁▁▁▁
MN 0 1 9.55 56.53 0 0 0 0 2144 ▇▁▁▁▁
MO 0 1 11.56 66.68 0 0 0 0 2803 ▇▁▁▁▁
MS 0 1 6.60 40.02 0 0 0 0 2294 ▇▁▁▁▁
MT 0 1 1.58 10.11 0 0 0 0 473 ▇▁▁▁▁
NC 0 1 14.84 79.31 0 0 0 5 3900 ▇▁▁▁▁
ND 0 1 1.77 10.99 0 0 0 0 497 ▇▁▁▁▁
NE 0 1 3.81 23.05 0 0 0 0 1060 ▇▁▁▁▁
NH 0 1 1.63 10.96 0 0 0 0 447 ▇▁▁▁▁
NJ 0 1 14.32 87.42 0 0 0 5 3713 ▇▁▁▁▁
NM 0 1 2.79 14.35 0 0 0 0 589 ▇▁▁▁▁
NV 0 1 1.71 9.82 0 0 0 0 340 ▇▁▁▁▁
NY 0 1 40.09 229.81 0 0 0 13 10051 ▇▁▁▁▁
OH 0 1 24.02 146.96 0 0 0 7 5899 ▇▁▁▁▁
OK 0 1 7.06 37.79 0 0 0 0 2072 ▇▁▁▁▁
OR 0 1 4.70 25.58 0 0 0 0 1121 ▇▁▁▁▁
PA 0 1 27.86 184.81 0 0 0 7 8197 ▇▁▁▁▁
RI 0 1 1.90 14.53 0 0 0 0 682 ▇▁▁▁▁
SC 0 1 7.53 43.33 0 0 0 0 2326 ▇▁▁▁▁
SD 0 1 1.76 11.07 0 0 0 0 507 ▇▁▁▁▁
TN 0 1 11.02 62.75 0 0 0 0 3296 ▇▁▁▁▁
TX 0 1 39.38 162.64 0 0 6 17 5064 ▇▁▁▁▁
UT 0 1 4.24 20.74 0 0 0 0 707 ▇▁▁▁▁
VA 0 1 11.75 63.90 0 0 0 0 2688 ▇▁▁▁▁
VT 0 1 0.90 6.53 0 0 0 0 325 ▇▁▁▁▁
WA 0 1 8.35 42.95 0 0 0 0 1750 ▇▁▁▁▁
WI 0 1 10.33 64.66 0 0 0 0 2403 ▇▁▁▁▁
WV 0 1 4.85 34.06 0 0 0 0 1796 ▇▁▁▁▁
WY 0 1 0.70 5.06 0 0 0 0 229 ▇▁▁▁▁
Total 0 1 512.54 2657.86 5 8 25 139 99849 ▇▁▁▁▁

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
name 0 1 2 15 0 33400 0

Explore generating code using Python

# https://www.ssa.gov/oact/babynames/limits.html
# Download and unzip to same folder as this script
# AK,F,1911,Ruth,7
# %%
import os  # needed for os.getenv below

import polars as pl
from pins import board_connect
from dotenv import load_dotenv
# from posit import connect

load_dotenv("../../../.env")
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')


# %%
dat = pl.read_csv(
  'namesbystate/*.TXT',
  has_header=False,
  new_columns = ['state', 'gender', 'year', 'name', 'count'])\
  .group_by('state', 'name', 'year')\
  .agg(pl.sum('count').alias('count'))\
  .sort('state')

dat_total = dat.group_by('name', 'year').agg(pl.sum('count').alias('Total'))
# %%
# Now pivot
out_dat = dat.pivot('state', index=['name', 'year'], values='count')\
  .fill_null(0)\
  .join(dat_total, on=['name', 'year'])\
  .sort('name', 'year')


# %%
# Publish the data to the server with Bro. Hathaway as the owner.
pin_name = "names_year"
board = board_connect(server_url=SERVER, api_key=API_KEY)
board.pin_write(out_dat.to_pandas(), "hathawayj/" + pin_name, type="parquet")

# %%
meta = board.pin_meta("hathawayj/" + pin_name)
# https://docs.posit.co/connect/user/python-pins/
# https://rstudio.github.io/pins-python/
meta.local.get("content_id")



#%% Need to set the vanity url
# Do by hand https://posit.byui.edu/connect/#/apps/c0c197d3-f6bc-4129-9df4-4683c1f25e61/access
# Soon there will be code.
# https://github.com/posit-dev/posit-sdk-py/issues/175
# https://posit-dev.github.io/posit-sdk-py/quickstart.html
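
The reshaping steps in the script above (sum counts over sex, pivot states into columns, fill gaps with 0, add Total) can be traced on a few lines of input. This is a plain-Python illustration of the logic only, not the polars code that built the pin. The first sample line comes from the comment at the top of the script; the other three are invented.

```python
import csv
from collections import defaultdict
from io import StringIO

# Sample in the state-file format: state, sex, year, name, count.
# Only the first line appears in the source; the rest are invented.
sample = StringIO(
    "AK,F,1911,Ruth,7\n"
    "AK,M,1911,Ruth,5\n"
    "CA,F,1911,Ruth,120\n"
    "CA,F,1912,Ruth,131\n"
)

# Step 1: sum counts over sex for each (name, year, state).
counts = defaultdict(int)
for state, _sex, year, name, count in csv.reader(sample):
    counts[(name, int(year), state)] += int(count)

# Step 2: pivot to one row per (name, year) with one column per state,
# using 0 where a state has no record for that name and year.
states = sorted({state for _, _, state in counts})
wide = {}
for (name, year, state), n in counts.items():
    row = wide.setdefault((name, year), dict.fromkeys(states, 0))
    row[state] = n

# Step 3: add Total as the sum across the state columns.
for row in wide.values():
    row["Total"] = sum(row[s] for s in states)
```

In this toy input Ruth has no Alaska record for 1912, so that cell becomes 0 rather than missing, which mirrors what fill_null(0) does in the script.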

Access data

This data is available to all.

Direct Download: names_year_csv.csv

R and Python Download:

URL Connections:

For public data, any user can connect and read the data using pins::board_connect_url() in R.

library(pins)
url_data <- "https://posit.byui.edu/data/names_year_csv/"
board_url <- board_connect_url(c("dat" = url_data))
dat <- pin_read(board_url, "dat")

Use this custom function in Python to load the data into a pandas DataFrame.

import pandas as pd
import requests
from io import BytesIO

def read_url_pin(name):
  url = "https://posit.byui.edu/data/" + name + "/" + name + ".parquet"
  response = requests.get(url)
  if response.status_code == 200:
    parquet_content = BytesIO(response.content)
    pandas_dataframe = pd.read_parquet(parquet_content)
    return pandas_dataframe
  else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")
    return None

# Example usage:
pandas_df = read_url_pin("names_year_csv")
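
As a possible variation on read_url_pin, the sketch below separates URL construction from the download and raises an error instead of printing and returning None. It uses only the standard library. The helper names pin_parquet_url and fetch_pin_bytes and the 30-second timeout are assumptions for illustration and are not part of the original function.

```python
from io import BytesIO
from urllib.error import HTTPError
from urllib.request import urlopen

BASE_URL = "https://posit.byui.edu/data"  # same host used by read_url_pin


def pin_parquet_url(name, base_url=BASE_URL):
    """Build the parquet URL with the same pattern as read_url_pin."""
    return f"{base_url}/{name}/{name}.parquet"


def fetch_pin_bytes(name, timeout=30):
    """Download the pin's parquet file into memory, raising on HTTP errors."""
    url = pin_parquet_url(name)
    try:
        with urlopen(url, timeout=timeout) as response:
            return BytesIO(response.read())
    except HTTPError as err:
        raise RuntimeError(f"Failed to retrieve {url}: HTTP {err.code}") from err
```

The returned buffer can be passed to pd.read_parquet, as in the function above. Raising makes a failed download stop the script at the point of failure instead of passing None to later steps.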

Authenticated Connection:

Our Connect server is https://posit.byui.edu; assign this URL to your CONNECT_SERVER environment variable. You must also create an API key and store it in your environment as CONNECT_API_KEY.

Read more about environment variables and the pins package to understand how these environment variables are stored and accessed in R and Python with pins.
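
Before connecting, it can help to confirm that both variables are actually set. The sketch below is one way to do that; the function name connect_settings is hypothetical, and the .env layout in the comment is an assumed example of what load_dotenv would read.

```python
import os

# Assumed .env layout (the key value is a placeholder):
#   CONNECT_SERVER=https://posit.byui.edu
#   CONNECT_API_KEY=<your API key>


def connect_settings(env=os.environ):
    """Return (server, api_key), or raise if either variable is missing."""
    required = ("CONNECT_SERVER", "CONNECT_API_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env["CONNECT_SERVER"], env["CONNECT_API_KEY"]
```

Calling connect_settings() after load_dotenv() gives an explicit error message when a variable is absent, rather than a less obvious failure when the board connection is attempted.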

# R
library(pins)
board <- board_connect(auth = "auto")
dat <- pin_read(board, "hathawayj/names_year_csv")

# Python
import os
# board_connect matches the generating code; older pins releases name it board_rsconnect
from pins import board_connect
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')

board = board_connect(server_url=SERVER, api_key=API_KEY)
dat = board.pin_read("hathawayj/names_year_csv")