MIS 470 Colorado State University GC Housing Price Forecasting Paper
Description
Explore and Predict Sales Price of the Ames, IA Housing data sets
In real estate, housing market prediction (forecasting) is crucial, and many factors can influence house prices. In this portfolio project, using R and RStudio, you will create an R script (*.R) file to explore, statistically and visually, the given Ames, IA Housing data sets (one data set for training and another for testing). In addition, you will predict the sale prices of residential homes in Ames, IA via linear regression. You will also provide a two-page summary of what you did, an interpretation of the results you obtained, and a reflection on your learning experience.
To prepare for this Portfolio Project:
Download the Ames, IA Housing Training data set: MIS470housingtraining(1000×25).csv
Download the Ames, IA Housing Testing data set: MIS470housingtesting(460×25).csv
To complete this assignment:
In the Week 4 Portfolio Milestone, you examined the MIS470housingtraining(1000×25).csv data set. Now, create an R script (*.R) file to calculate six (6) statistical and visual measures (five (5) statistical and one (1) visual) of the sale price variable of the Ames, IA Housing Testing data set. Create one (1) additional visual measure of the combined training and testing data sets. In addition, predict the sale prices of residential homes in Ames, IA via linear regression. Follow the steps below. Give the script file a name that includes your first name and last name, like this: Solution-W8-FirstName-LastName-Portfolio.R
Read the MIS470housingtesting(460×25).csv file into an R testing data frame. The file contains 460 records and 25 quantitative explanatory variables describing many aspects of residential homes in Ames, IA.
Calculate five summary statistics (minimum, maximum, mean, median, and standard deviation) for the sale price variable of the testing data set.
Plot a histogram of the distribution of the sale price variable of the testing data set.
Combine the two data sets (training and testing) into a single data set. In R, data frames with the same columns can be stacked row-wise with a function such as rbind(). Create a histogram of sale prices for the combined data set.
Using only the training data set, fit a linear regression model using all the explanatory variables and SalePrice as the response variable.
Remove all rows with missing values (NA) from the testing data set. The function complete.cases() can be used. Using only the first 20 rows of the cleaned testing data set, predict the sale price. The R function predict() can perform this task. You should have 20 predicted sale prices.
Execute your *.R script file and display the results of its execution in the RStudio console and/or the Plots tabs.
Take screenshots, showing the current date and time, to demonstrate successful completion of your work. The screenshots should show the R commands you applied and the results you obtained. Do not capture trial-and-error results; capture only your final results.
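The scripted steps above can be sketched in R as follows. This is a minimal illustration under stated assumptions, not a reference solution: it runs on synthetic stand-in data written to temporary CSV files, because the real files are not reproduced here. The helper make_housing() and the columns LotArea, GrLivArea, and YearBuilt are hypothetical; only SalePrice is named in the assignment. rbind() is used to stack the two data frames, which assumes both have identical columns.

```r
# Sketch of the workflow on synthetic stand-in data (not the real Ames files).
set.seed(470)

# Hypothetical helper: builds a small data frame shaped like housing data.
# Column names other than SalePrice are invented for illustration.
make_housing <- function(n) {
  df <- data.frame(
    LotArea   = round(runif(n, 5000, 15000)),
    GrLivArea = round(runif(n, 800, 3000)),
    YearBuilt = sample(1950:2010, n, replace = TRUE)
  )
  df$SalePrice <- round(20000 + 3 * df$LotArea + 80 * df$GrLivArea +
                          rnorm(n, sd = 15000))
  df
}

# Write stand-in CSV files so that read.csv() can be demonstrated.
train_path <- tempfile(fileext = ".csv")
test_path  <- tempfile(fileext = ".csv")
write.csv(make_housing(200), train_path, row.names = FALSE)
test_raw <- make_housing(60)
test_raw$LotArea[c(3, 7)] <- NA          # inject a few missing values
write.csv(test_raw, test_path, row.names = FALSE)

# Read the data sets into data frames.
training <- read.csv(train_path)
testing  <- read.csv(test_path)

# Five statistical measures of the sale price in the testing data.
price_stats <- c(
  min    = min(testing$SalePrice),
  max    = max(testing$SalePrice),
  mean   = mean(testing$SalePrice),
  median = median(testing$SalePrice),
  sd     = sd(testing$SalePrice)
)
print(price_stats)

# One visual measure: histogram of the testing sale prices.
hist(testing$SalePrice,
     main = "Sale price (testing data)", xlab = "Sale price")

# Stack training and testing rows, then plot the combined distribution.
combined <- rbind(training, testing)
hist(combined$SalePrice,
     main = "Sale price (combined data)", xlab = "Sale price")

# Fit a linear model on the training data only;
# "SalePrice ~ ." uses every other column as an explanatory variable.
model <- lm(SalePrice ~ ., data = training)
print(summary(model))

# Drop rows with any NA, keep the first 20, and predict their sale prices.
testing_complete <- testing[complete.cases(testing), ]
first20   <- head(testing_complete, 20)
predicted <- predict(model, newdata = first20)
print(predicted)
```

With the real data, the synthetic-data section would be dropped and the two temporary paths replaced by the paths to the downloaded CSV files.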
Summarize your work in one page in which you explain what you did, interpret your results, and reflect on your experience:
Explain how you completed this assignment and how you resolved the issues you faced, if any.
Interpret the results you obtained. Compare the statistical and visual measures across the training, testing, and combined data sets; identify the significant factors in your linear regression model; and compare the predicted sale prices with the actual sale prices from the testing data set.
Reflect on your experience with this assignment and the lessons you learned.
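As a rough aid to the interpretation step, the sketch below shows one possible way to list statistically significant predictors and to set predicted prices beside actual ones. It again uses synthetic data with hypothetical columns (GrLivArea, YearBuilt). The 0.05 significance cutoff and the use of root-mean-square error as a summary of prediction error are assumptions made here for illustration; the assignment does not prescribe either.

```r
# Sketch: significant factors and predicted-vs-actual comparison
# on synthetic data (column names other than SalePrice are hypothetical).
set.seed(1)
n <- 120
housing <- data.frame(
  GrLivArea = runif(n, 800, 3000),
  YearBuilt = sample(1950:2010, n, replace = TRUE)
)
housing$SalePrice <- 30000 + 90 * housing$GrLivArea + rnorm(n, sd = 12000)

training <- housing[1:100, ]
testing  <- housing[101:120, ]

model <- lm(SalePrice ~ ., data = training)

# Coefficient table: estimate, standard error, t value, and p-value per term.
coefs <- summary(model)$coefficients

# Terms whose p-value falls below the assumed 0.05 cutoff.
significant <- rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05]
print(significant)

# Place predictions next to the actual prices and compute the errors.
predicted  <- predict(model, newdata = testing)
comparison <- data.frame(
  actual    = testing$SalePrice,
  predicted = predicted,
  error     = predicted - testing$SalePrice
)
print(head(comparison))

# Root-mean-square error: a single number summarizing typical prediction error.
rmse <- sqrt(mean(comparison$error^2))
print(rmse)
```

A table like comparison can go straight into the written summary, alongside a sentence on which predictors were significant and how large the typical error was relative to the sale prices.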
To submit your response to this assignment:
Prepare all the required screenshots.
Prepare your summary of your work (what you did, interpretation of results, and reflection).