Unit 2 Task 1: Batting Averages
Setup
SQL setup and test
Background
When you hear the word “relationship” what is the first thing that comes to mind? Probably not baseball. But a relationship is simply a way to describe how two or more objects are connected. There are many relationships in baseball such as those between teams and managers, players and salaries, even stadiums and concession prices.
The graphs on Data Visualizations from Best Tickets show many other relationships that exist in baseball.
Client Request
For this project, the Client wants you to use SQL queries called from within Python.
Data
This project will use the Lahman Baseball Database. In order to complete this project, you will need to download the database and save it inside the DS250 folder (hopefully you have a folder dedicated to this class) next to the .qmd file you will be using for this assignment. Note: Right click the ‘Download’ link and select “Save Link As” to download the data to your computer.
Download: lahmansbaseballdb
Information: Lahman Data Dictionary
Setup Instructions: See SQL Setup
Readings
Complete these readings before we cover the material in class. This will help you retain the material and make the class period more engaging - not less.
In other words, if an assignment is due on Wednesday, we will cover the material needed to complete the assignment on Tuesday. Therefore, you should complete the readings on Monday (or anytime before class on Tuesday).
Review the set-up instructions in preparation for class.
There are many flavors of SQL but most flavors have the same base commands. SQL queries are typed in the following pattern;
SELECT -- <columns> and <column calculations>
FROM -- <table name>
JOIN -- <table name>
ON -- <columns to join>
WHERE -- <filter condition on rows>
GROUP BY -- <subsets for column calculations>
HAVING -- <filter conditions on groups>
ORDER BY -- <how the output is returned in sequence>
LIMIT -- <number of rows to return>
The reading for this task are from this excellent SQL Guide. Read the following for this task:
All the categories in the “Basic” group (topics are grouped in the menu on the left side of the page).
And, the following sections in the Intermediate group:
Optional References
Questions and Tasks
Download the Unit 2 Task 1 Template. Note: batting average is number of hits divided by the number of at-bats. It is a number between 0 and 1 that represents what % of the time a player gets a hit (as opposed to getting out, for example).
Write an SQL query that provides playerID, yearID, and batting average for players with at least 1 at bat that year. Sort the table from highest batting average to lowest, and then by playerid alphabetically. Show the top 5 results in your report.
Use the same query as above, but only include players with at least 10 at bats that year. Print the top 5 results.
Now calculate the batting average for players over their entire careers (all years combined). Only include players with at least 100 at bats over their entire career, and print the top 5 results.
Submission:
When you have completed the report, you will need to follow this process to submit your work:
- Have the
unit2_task1_template
open in VS Code and nothing else - Click
Preview Button
in VS Code in the top right of the screen- This will render the project but also entire course work portfolio into
HTML
files for review - Confirm everything displas as you would like it to
- How you see it will be how it is viewed for grading
- If there is an error in any cell of the quarto files, the rendering will stop and you will need to fix the error before rendering again (if you get stuck post your error in Slack)
- This will render the project but also entire course work portfolio into
- Once the report is confirmed close the preview and open a VS Code
Terminal
- click
Terminal
in the top menu bar and thenNew Terminal
- click
- Type the following in the terminal
quarto render
- Then drag and drop
unit2_task1_template.qmd
into the terminal this will add the file path to the terminal command - Press Enter
- Then drag and drop
- This will render the project into a
HTML
file in the same location as the.qmd
file - To locate the file in VS Code
- Right click on the file in the file explorer and select
Reveal in File Explorer
(Win) orReveal in Finder
(Mac)
- Right click on the file in the file explorer and select
- Upload
HTML
file into Canvas
Deliverables:
Use this unit2_task1_template to submit your Client Report.
Answers to the questions | tasks. Each should include a written description of your results, code cells with comments, charts and/or tables.
Your report should be written in quarto markdown files and rendered to an HTML File. Upload the HTML file in Canvas. (Do not submit the .qmd
file)