
University of the Cumberlands Clean and Explore Data Worksheet

Description

Review the entire attached Clean and Explore Data worksheet.

ZipcodeR Steps

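# Install the released version of the package
install.packages("zipcodeR")

# Or install the development version from GitHub (requires devtools)
# install.packages("devtools")
devtools::install_github("gavinrozzi/zipcodeR")

library(zipcodeR)

# Look up the ZIP codes for New Jersey
search_state("NJ")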
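
As a possible next step after the lookup above, the result can be inspected for the fields this assignment asks about. This is only a sketch: it assumes zipcodeR is installed and that the table returned by search_state() includes columns named zipcode, population, and median_household_income. The variable name nj is illustrative. The guard skips the code when the package is not available.

```r
# Sketch: inspect the ZIP-level data for one state.
# Assumption: the result has columns `zipcode`, `population`, and
# `median_household_income`; adjust the names if your version differs.
if (requireNamespace("zipcodeR", quietly = TRUE)) {
  nj <- zipcodeR::search_state("NJ")

  # Check the data types assigned to the fields of interest
  str(nj[, c("zipcode", "population", "median_household_income")])

  # Number of ZIP code observations for the state
  print(nrow(nj))

  # Range of population and median household income, ignoring missing values
  print(range(nj$population, na.rm = TRUE))
  print(range(nj$median_household_income, na.rm = TRUE))
}
```

Replace "NJ" with the abbreviation of the state you are assigned.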

Tutorials

https://cran.r-project.org/web/packages/zipcodeR/zipcodeR.pdf

https://www.gavinrozzi.com/post/an-r-package-for-zip-codes/
https://gavinrozzi.github.io/zipcodeR/

https://www.r-bloggers.com/2011/01/my-first-r-package-zipcode/

Update the plan

For this week’s objective, you will need to update the plan and add information about the status of the project. You will need to run your R scripts on the datasets and then review the outputs. You can use the zipcodeR package or any other suitable package.

To formulate the brief, answer these questions:

  • Why is this interesting or important? What about it is important?
  • What requires clarification?
  • What pitfalls could cause the analysis to be incomplete or incorrect?
  • Who is the audience? What do you think the audience expects?
  • How much time do you have to complete the project?
  • What are the project conditions?
  • What tools do you have access to?
  • Or, as is the case in this course, which software are you limited to?
  • Can the evidence be summarized in one visualization? Two? Several?
  • Will the results of this analysis be an exhibit (evidence), an explanation (presentation), or an exploration (audience interaction)?

Working with data

When you work with the data, whether cleaning, investigating, or exploring, ask yourself questions as you progress through the process. These questions may include:

  • Are the right data types assigned?
  • How many observations are associated with the state I’m assigned?
  • How do I filter for the complaints specific to the analysis I’m assigned?
  • Which fields apply to this analysis?
  • What is the range of the median household income, the population, and the delay between receiving and forwarding customer complaints?
  • Should observations with no (zero) delay be separated from the rest? (Is management more interested in the overall picture or in how response times can be improved?)
  • Are there fields of data outside the current scope that would add to the data story? Does the scope need to be modified? Perhaps you found that a specific product or company was associated with all of the delays exceeding 50 days. That could be very useful information for the management team. Perhaps one type of company response to the customer has the longest delays. Perhaps focusing on delays that exceed a certain number of days offers very different insight than focusing on complaints with no delay or only a few days of delay. Or perhaps the time of year, such as a particular season, coincides with the length of the delay.
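
Several of the questions above translate directly into a few lines of base R. The following is a minimal sketch, not the required solution: the data frame complaints and every column name and value in it (state, product, company, date_received, date_sent) are made up for illustration and should be replaced with the fields in your own dataset.

```r
# Toy stand-in for the complaints data. All names and values are hypothetical.
complaints <- data.frame(
  state         = c("NJ", "NJ", "NY", "NJ", "NJ"),
  product       = c("Mortgage", "Credit card", "Mortgage", "Mortgage", "Debt collection"),
  company       = c("A", "B", "A", "C", "A"),
  date_received = c("2015-01-05", "2015-02-10", "2015-03-01", "2015-06-20", "2015-07-04"),
  date_sent     = c("2015-01-05", "2015-02-13", "2015-03-02", "2015-08-25", "2015-07-04"),
  stringsAsFactors = FALSE
)

# 1. Assign the right data types: parse the date strings as Date objects.
complaints$date_received <- as.Date(complaints$date_received)
complaints$date_sent     <- as.Date(complaints$date_sent)
str(complaints)

# 2. Count the observations per state, then keep only the assigned state.
print(table(complaints$state))
nj <- complaints[complaints$state == "NJ", ]

# 3. Delay in days between receiving and forwarding each complaint.
nj$delay_days <- as.numeric(nj$date_sent - nj$date_received)
print(range(nj$delay_days))

# 4. Separate zero-delay observations from delayed ones.
no_delay <- nj[nj$delay_days == 0, ]
delayed  <- nj[nj$delay_days > 0, ]

# 5. See which companies and products account for delays over 50 days.
long_delays <- nj[nj$delay_days > 50, ]
print(table(long_delays$company, long_delays$product))

# 6. Check for a time-of-year pattern: mean delay by month received.
nj$month <- format(nj$date_received, "%m")
print(aggregate(delay_days ~ month, data = nj, FUN = mean))
```

Keeping the zero-delay and delayed rows in separate data frames makes it easy to report the overall picture and the improvement opportunities side by side.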

Submission requirements

When you document this information, you will need to write it as a paper. This is not a blog, a discussion, or a short-answer paper. You will need to include an introduction, a topic sentence, supporting paragraphs, and a conclusion. Not great at writing? Make sure that you document your work using the standards of APA 7. To help with formatting, use the APA 7 Student Paper template.

Your work with R in RStudio must be documented in a .r script file. You must submit a paper and a script file for this assignment.

Your paper is still about the project planning and data cleansing. The functions, libraries, or other elements specific to R are appropriate content for the paper. You will need to create a plan and address the questions based on your results from the R scripts.
