Subject Code: | DATA4200 |
Subject Name: | Data Acquisition and Management |
Assessment Title: | Sampling and data mining project |
Assessment Type: | Report |
Word Count: | 1600 words (+/-10%) |
Weighting: | 30% |
Total Marks: | 30 |
Submission: | via MyKBS and Turnitin |
Due Date: | By Tuesday Week 5 (Report) 23:55 AEST |
Read the Assessment Instructions and complete sections (a) – (e)
Consider the rubric at the end of the assignment for guidance on structure and content.
Submit your written report (in Word) and your software file (e.g. Excel, Power BI) via MyKBS by Tuesday 23:55 AEST Week 5.
Business Problem: Airbnb is a U.S. company that provides an online marketplace for short-term and/or holiday accommodation. Airbnb collects large volumes of data to gain insight into its clients and associated customers, such as review scores, host acceptance rates, 'superhosts', popular accommodation types and the density of listings in particular locations.
Data sets: We have obtained data on Airbnb listings in Melbourne with a variety of variables. Sampled datasets, the original data and data dictionary will be available from Week 4. See sections below.
Analysis and Report (30 marks)
Use Microsoft Excel or Power BI or Tableau.
Recall the sampling methods below that you have learnt about in lectures.
A data dictionary file and the following datasets (as .csv files), which contain sample data generated using quota, systematic, simple random and stratified sampling, will be available from Week 4; see section (c) below. You will also need to access the original population dataset cleansed_listings_dec_18.csv from the source; see sections (a) and (e) below.
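As a refresher, the four sampling methods named above can be sketched in a few lines of Python. This is only an illustration on a small made-up population: the field names (`id`, `room_type`, `price`), the helper function names and the toy values are assumptions and do not come from the Airbnb files or the data dictionary. The assessment itself is still to be completed in Excel, Power BI or Tableau.

```python
import random
from collections import defaultdict

# Hypothetical toy population: 20 listings with made-up room types and prices.
population = [
    {"id": i, "room_type": "Entire home" if i % 4 else "Private room", "price": 80 + 5 * i}
    for i in range(1, 21)
]

def simple_random_sample(rows, n, seed=0):
    """Simple random sampling: every row has the same chance of selection."""
    return random.Random(seed).sample(rows, n)

def systematic_sample(rows, n, seed=0):
    """Systematic sampling: pick a random start, then take every k-th row."""
    k = len(rows) // n
    start = random.Random(seed).randrange(k)
    return rows[start::k][:n]

def stratified_sample(rows, key, fraction, seed=0):
    """Stratified sampling: randomly draw the same fraction from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample

def quota_sample(rows, key, quotas):
    """Quota sampling (non-random): take rows in order until each quota is full."""
    counts = defaultdict(int)
    sample = []
    for row in rows:
        group = row[key]
        if counts[group] < quotas.get(group, 0):
            sample.append(row)
            counts[group] += 1
    return sample
```

Note that the quota sample depends entirely on the order of the rows, which is one reason it can drift away from the population more than the random methods do.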
Create a report and include your response to the following questions:
Explain the variations in your report and include the supporting data. Explain possible ethical issues that could occur from the use of sampled data.
Briefly evaluate the software that you have used to produce the summaries. (500 words, 10 marks)
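One possible way to approach the comparison is to compute the same summary statistics for the population and for each sample, then measure how far each sample sits from the population. The sketch below does this in Python for a single numeric variable. The price values, the sample names and the `summarise` and `relative_gap` helpers are made up for illustration and are not taken from the Airbnb files; averaging percentage gaps is just one reasonable choice of distance measure.

```python
import statistics

def summarise(values):
    """Return basic summary statistics for one numeric variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

def relative_gap(sample_stats, population_stats):
    """Average absolute percentage difference between sample and population statistics."""
    gaps = [
        abs(sample_stats[name] - population_stats[name]) / abs(population_stats[name])
        for name in population_stats
    ]
    return 100 * sum(gaps) / len(gaps)

# Hypothetical nightly prices (not real Airbnb data).
population_prices = [90, 110, 120, 150, 160, 180, 200, 220, 250, 320]
samples = {
    "simple_random": [110, 150, 180, 220, 320],
    "systematic": [90, 120, 160, 200, 250],
    "quota": [90, 110, 120, 150, 160],
}

population_stats = summarise(population_prices)
gaps = {name: relative_gap(summarise(vals), population_stats) for name, vals in samples.items()}
closest = min(gaps, key=gaps.get)

# List the samples from nearest to furthest from the population.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:.1f}% average gap from population")
print("Closest to population:", closest)
```

A smaller gap suggests that the sample's statistics are nearer the main dataset. The same comparison can be built in Excel with functions such as AVERAGE, MEDIAN and STDEV.S.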
KBS values academic integrity. All students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Academic Integrity and Conduct Policy.
What is academic integrity and misconduct? What are the penalties for academic misconduct? What are the late penalties?
How can I appeal my grade?
Click here for answers to these questions: http://www.kbs.edu.au/current-students/student-policies/.
Submissions that exceed the word limit by more than 10% will cease to be marked from the point at which that limit is exceeded.
Students may seek study assistance from their local Academic Learning Advisor or refer to the resources on the MyKBS Academic Success Centre page. Further details can be accessed at https://elearning.kbs.edu.au/course/view.php?id=1481
Please see the level of Generative AI that this assessment has been designed to accept:
Traffic Light | Amount of Generative Artificial Intelligence (AI) usage | Evidence Required | This assessment (✓) |
Level 1 | This assessment fully integrates Generative AI, encouraging you to harness the technology's full potential in collaboration with your own expertise. It will highlight your ability to demonstrate how effectively you can work alongside AI to achieve sophisticated outcomes, blending human intellect and artificial intelligence. | Your collaboration with AI must be clearly referenced and documented in the appendix of your submission, including all prompts and responses used for the assessment. | |
Level 2 | This assessment invites you to engage with Generative AI as a means of expanding your creativity and idea generation. It will highlight your ability to complement your original thinking with the capabilities of AI. For example, through brainstorming and preliminary concept development. | Your collaboration with AI must be clearly referenced and documented in the appendix of your submission, including all prompts and responses used for the assessment. | ✓ |
Level 3 | This assessment showcases your individual knowledge and skills in the absence of Generative AI support. It will highlight your personal abilities. For example, to analyse, synthesise, and create based on your own understanding and learning. | Use of generative AI is prohibited and may potentially result in penalties for academic misconduct, including but not limited to a mark of zero for the assessment. | |
Section | Criteria | NN (Fail) 0-0.5 mark | P (Pass) 50%-64% | CR (Credit) 65%-74% | DN (Distinction) 75%-84% | HD (High Distinction) 85%-100% |
(a) | Comments on the usefulness of at least 4 variables in relation to insights (2 marks) | No comments | Comments on one selected variable | Comments on two selected variables | Comments on three selected variables | Comments on at least 4 selected variables |
(b) | State at least 3 advantages/disadvantages and limitations (2 marks) | Not stated | One advantage/disadvantage and one limitation stated | Two advantages/disadvantages and two limitations stated | Any three advantages/disadvantages and fewer than 3 limitations | At least 3 advantages/disadvantages and limitations stated |
(c) | Summary statistics for each sample across the four selected variables (6 marks) | One sample and one selected variable | Two samples and two selected variables | 2-3 samples and 3 variables | Any three advantages/disadvantages and fewer than 3 limitations | At least 3 advantages/disadvantages and limitations stated |
(d) | Comparisons made of results generated above and conclusions drawn and documented (10 marks) | No or limited comparison/conclusions drawn | Results compared to 2 samples and 2 selected variables with limited conclusions | Results compared to 2 samples and 2 selected variables with limited conclusions | 3-4 samples and 3 variables used in comparison of results with meaningful conclusions | 4 samples and at least 4 variables used in comparison of results with meaningful conclusions |
(e) | Explained with statistical examples which sampling method's summary statistics across all selected variables were nearest the main dataset, and variations were explained. Explain ethical issues and evaluate the software. (10 marks) | No, or very limited, explanation of the comparative variations across 0-1 selected variables. Ethics not considered. Evaluation of software not mentioned. | Comparison of summary stats across one sample and just two simple variables. Ethics considered in a very general way. Evaluation of software very general. | Comparison of summary stats across at least two samples and two unrelated variables. Ethics considered in a more relevant way, but may not be practical. Evaluation of software relevant. | Comparison of summary stats across all samples and three variables. Ethics considered in a very relevant, practical and realistic way. Evaluation of software relevant and specific to this project. | Comparison of summary stats across at least four samples and at least four variables. Diverse variable choices and originality shown. Report engaging, novel and well integrated. Ethics considered in a very relevant, novel and practical way. Evaluation of software detailed, relevant and specific to this project. |