Denver residential dwelling sales for 2013

Attributes of each dwelling, along with its selling price, for homes sold in Denver in 2013
housing
DS250

Author: DS 250

Published: February 1, 2024

Data details

There are 22,913 rows and 26 columns. The source data is used to create the dataset stored in our pins table. You can access this pin from a connection to posit.byui.edu using hathawayj/dwellings_denver.

This data is available to all.

Variable description

  • parcel: The parcel ID
  • nbhd: Neighborhood of the home
  • abstrprd: Not documented in the source
  • livearea: Square footage of livable space
  • finbsmnt: Finished square footage of the basement
  • basement: Total square footage of the basement
  • yrbuilt: Year the home was built
  • condition: Condition of the home (6 levels provided)
  • quality: A letter rating
  • totunits: Number of dwelling units in the building
  • stories: The number of stories
  • gartype: Details on the garage type
  • nocars: Size of the garage in cars
  • xtraffic: Empty
  • floorlvl: Mostly for apartments; the level the living unit is on
  • numbdrm: Number of bedrooms
  • numbaths: Number of bathrooms
  • arcstyle: Type of home
  • sprice: Selling price
  • deduct: Deduction from the selling price
  • netprice: Net price of the home
  • tasp: Tax-assessed selling price
  • smonth: Month sold
  • syear: Year sold
  • qualified: Q or U, with 66 percent Q
  • status: I or V, with over 90 percent I
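As a quick illustration of how these columns are used together, here is a minimal sketch computing price per livable square foot with pandas. The three-row frame is made up for the example; it is not the real pin.

```python
import pandas as pd

# Small made-up frame mirroring three of the pin's columns (illustrative only).
homes = pd.DataFrame({
    "parcel": ["A1", "A2", "A3"],
    "livearea": [1000, 1500, 2000],
    "sprice": [200000, 330000, 500000],
})

# Price per livable square foot, rounded to the dollar.
homes["ppsf"] = (homes["sprice"] / homes["livearea"]).round(0)
print(homes[["parcel", "ppsf"]])
```

The same two columns drive most price-per-area comparisons across neighborhoods (nbhd) in this data.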

Variable summary

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
abstrprd 0 1.00 1118.98 152.60 100 1112 1112 1114 9279 ▇▁▁▁▁
livearea 0 1.00 1510.78 787.02 293 975 1302 1823 21503 ▇▁▁▁▁
finbsmnt 0 1.00 330.51 469.91 0 0 0 716 4320 ▇▂▁▁▁
basement 0 1.00 524.50 567.65 0 0 401 959 9025 ▇▁▁▁▁
yrbuilt 0 1.00 1963.69 36.93 1873 1940 1963 2002 2013 ▂▃▆▅▇
totunits 0 1.00 1.02 0.26 0 1 1 1 10 ▇▁▁▁▁
stories 0 1.00 1.42 0.57 1 1 1 2 4 ▇▅▁▁▁
nocars 0 1.00 1.36 1.10 0 0 2 2 17 ▇▁▁▁▁
floorlvl 18025 0.21 5.20 7.04 1 1 2 5 44 ▇▁▁▁▁
numbdrm 0 1.00 2.52 0.88 0 2 2 3 9 ▁▇▁▁▁
numbaths 0 1.00 2.35 1.12 0 2 2 3 11 ▇▅▁▁▁
sprice 0 1.00 561859.64 2135231.25 1000 134000 230000 378400 21750000 ▇▁▁▁▁
deduct 0 1.00 1336.95 3061.79 0 0 0 2383 101809 ▇▁▁▁▁
netprice 0 1.00 560522.69 2135373.06 1000 132000 229000 375500 21750000 ▇▁▁▁▁
tasp 0 1.00 563820.73 2135272.35 1000 132890 231000 382184 21750000 ▇▁▁▁▁
smonth 0 1.00 6.54 3.27 1 4 6 9 12 ▇▇▆▆▇
syear 0 1.00 2011.06 0.71 2010 2011 2011 2012 2012 ▃▁▇▁▅

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
parcel 0 1.00 16 16 0 20465 0
condition 0 1.00 3 5 0 5 0
quality 0 1.00 1 1 0 5 0
gartype 0 1.00 2 7 0 7 0
xtraffic 22053 0.04 1 1 0 1 0
arcstyle 0 1.00 8 23 0 12 0
qualified 0 1.00 1 1 0 2 0
status 0 1.00 1 1 0 2 0
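The n_missing and complete_rate columns in the tables above are straightforward to recompute. A minimal sketch on a made-up series standing in for a partially missing variable such as floorlvl:

```python
import pandas as pd

# Made-up series with missing values (illustrative, not the real floorlvl data).
col = pd.Series([1, None, 2, None, 5, 1])

# skim-style summary quantities: count of missing values and share complete.
n_missing = int(col.isna().sum())
complete_rate = round(1 - n_missing / len(col), 2)
print(n_missing, complete_rate)  # 2 0.67
```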
Explore generating code using Python
# https://www.denvergov.org/opendata/dataset/city-and-county-of-denver-real-property-sales-book-2013
# https://github.com/abresler/realtR

# %%
# packages
import os

import pandas as pd
import polars as pl
from dotenv import load_dotenv
from pins import board_connect
from oauth2client.service_account import ServiceAccountCredentials
from pydrive2.auth import GoogleAuth
from pydrive2.drive import GoogleDrive

load_dotenv("../../../../.env")
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')
# %%
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']
gauth = GoogleAuth()
gauth.credentials = ServiceAccountCredentials.from_json_keyfile_name(
    "client_secrets.json", SCOPES)

# %%
# gauth.LocalWebserverAuth() is only needed for an interactive user OAuth
# flow; with service-account credentials it can be skipped.
drive = GoogleDrive(gauth)


# %%
# `url` should point to the source CSV from the Denver open data portal
# (see the link at the top of this script).
dat = pl.read_csv(url, ignore_errors=True, truncate_ragged_lines=True)\
  .filter((pl.col("COMMUSE").is_null()) & (pl.col("LIVEAREA") > 0))\
  .drop(['Model', 'Cluster', 'Group', 'CondoComplex', 'CondoComplexName', 'COMMUSE'])\
  .with_columns(pl.col("NBHD").cast(pl.String()).cast(pl.Categorical()))\
  .select(pl.all().name.to_lowercase())

#%%
# pandas alternative to the polars pipeline above
datp = pd.read_csv(url)\
    .query('COMMUSE.isna()')\
    .drop(columns = ['Model', 'Cluster', 'Group', 'CondoComplex', 'CondoComplexName', 'COMMUSE'])\
    .assign(NBHD = lambda x: x.NBHD.astype('object'))\
    .query('LIVEAREA > 0')

# The vanity URL can be changed through the Connect server API:
# https://posit.co/blog/rstudio-connect-1-8-6-server-api/
# rename columns to lower case
datp.columns = datp.columns.str.lower()

# %%

# Publish the data to the server with Bro. Hathaway as the owner.
pin_name = "dwellings_denver"
board = board_connect(server_url=SERVER, api_key=API_KEY)
board.pin_write(dat.to_pandas(), "hathawayj/" + pin_name, type="parquet")

# %%
meta = board.pin_meta("hathawayj/" + pin_name)
# https://docs.posit.co/connect/user/python-pins/
# https://rstudio.github.io/pins-python/
meta.local.get("content_id")



#%% R code
# Publish the data to the server with Bro. Hathaway as the owner.
library(pins)
library(connectapi)

pin_name <- "dwellings_denver"
board <- board_connect()
pin_write(board, dat, name = pin_name, type = "parquet")

meta <- pin_meta(board, paste0("hathawayj/", pin_name))
client <- connect()
my_app <- content_item(client, meta$local$content_id)
set_vanity_url(my_app, paste0("data/", pin_name))

Access data

This data is available to all.

Direct Download: dwellings_denver.parquet

R and Python Download:

URL Connections:

For public data, any user can connect and read the data using pins::board_connect_url() in R.

library(pins)
url_data <- "https://posit.byui.edu/data/dwellings_denver/"
board_url <- board_connect_url(c("dat" = url_data))
dat <- pin_read(board_url, "dat")

Use this custom function in Python to have the data in a Pandas DataFrame.

import pandas as pd
import requests
from io import BytesIO

def read_url_pin(name):
  url = "https://posit.byui.edu/data/" + name + "/" + name + ".parquet"
  response = requests.get(url)
  if response.status_code == 200:
    parquet_content = BytesIO(response.content)
    pandas_dataframe = pd.read_parquet(parquet_content)
    return pandas_dataframe
  else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")
    return None

# Example usage:
pandas_df = read_url_pin("dwellings_denver")

Authenticated Connection:

Our connect server is https://posit.byui.edu which you assign to your CONNECT_SERVER environment variable. You must create an API key and store it in your environment under CONNECT_API_KEY.

Read more about environment variables and the pins package to understand how these environment variables are stored and accessed in R and Python with pins.
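A minimal sketch of how those two environment variables are read in Python. The values below are placeholders, not real credentials; calling load_dotenv() would populate os.environ the same way from a .env file.

```python
import os

# Placeholder values standing in for a real .env file or shell export.
os.environ["CONNECT_SERVER"] = "https://posit.byui.edu"
os.environ["CONNECT_API_KEY"] = "not-a-real-key"

# Read them back the same way the access code below does.
SERVER = os.getenv("CONNECT_SERVER")
API_KEY = os.getenv("CONNECT_API_KEY")
print(SERVER)  # https://posit.byui.edu
```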

library(pins)
board <- board_connect(auth = "auto")
dat <- pin_read(board, "hathawayj/dwellings_denver")

import os
from pins import board_connect
from dotenv import load_dotenv

load_dotenv()
API_KEY = os.getenv('CONNECT_API_KEY')
SERVER = os.getenv('CONNECT_SERVER')

board = board_connect(server_url=SERVER, api_key=API_KEY)
dat = board.pin_read("hathawayj/dwellings_denver")