ICT583 Data Science Applications
- This assignment must be done individually by each student.
- Write your answers in a report format. Submit a separate report for each part.
- In the Part One report, clearly indicate each question number, give your code followed by a screenshot of your results, and explain your solution.
- Submit your .R files along with the Part One report. Use comments (#) to indicate each question number. Make sure your code exactly matches the answers provided in your report and is easy to read and understand. Only include code and comments necessary for answering the questions.
- The Part Two report is a separate Word document containing a small-scale literature review.
Part One – small project (40%)
Do not rush into finding answers without a good understanding of the dataset. First ask yourself some questions: What is this data frame about? What does each column mean? What is the data type of each column? Do any columns describe similar things?
All the data manipulation and visualization must be done using R.
The COVID-19 outbreak was first identified in December 2019 in Wuhan, China. The WHO declared the outbreak a Public Health Emergency of International Concern on 30 January 2020 and a pandemic on 11 March 2020 (Wikipedia). Organizations worldwide have been collecting data so that governments can monitor and learn from this pandemic. You will use the dataset ‘time_series_covid_19_confirmed.csv’ from LMS to explore the COVID-19 data. (20 points)
Note: Details of this dataset can be found at https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset#time_series_covid_19_confirmed.csv
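As a starting point for getting to know the data, the following is a minimal sketch in base R, not a model answer. It uses a small made-up table in place of the real file. The assumed layout (four descriptive columns followed by one column per date) and all values are illustrative assumptions, so check them against the actual CSV.

```r
# Toy stand-in for 'time_series_covid_19_confirmed.csv'. All values are made up.
# Assumed layout: four descriptive columns, then one column per date.
csv_text <- "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
Hubei,China,30.97,112.27,10,20,40
Beijing,China,40.18,116.41,1,2,3
,Italy,43.00,12.00,0,0,2"

# check.names = FALSE keeps headers such as "1/22/20" unchanged.
# With the real file you would pass the file path instead of text = csv_text.
confirmed <- read.csv(text = csv_text, check.names = FALSE,
                      stringsAsFactors = FALSE)

str(confirmed)           # column names and data types
dim(confirmed)           # number of rows and columns
head(confirmed[, 1:5])   # first rows of the first few columns

date_cols <- 5:ncol(confirmed)               # assumed position of date columns
latest <- names(confirmed)[ncol(confirmed)]  # name of the most recent date

# Several rows can belong to one country, so sum the counts per country.
by_country <- aggregate(confirmed[, date_cols],
                        by = list(Country = confirmed[["Country/Region"]]),
                        FUN = sum)

# Order countries by the latest cumulative count, largest first.
by_country <- by_country[order(-by_country[[latest]]), ]
print(by_country)
```

Summing per country before ranking matters because some countries may be split across several province-level rows.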
Your data analysis should include, but is not limited to, the answers to the following questions:
1. Create two graphs that display the latest number of COVID-19 cases of the top 15 and bottom 15 countries, respectively. Consider how to improve the quality and aesthetics of your visualization. (25 points)
2. Visualize the confirmed cases worldwide from January to March. (10 points)
3. Visualize the confirmed cases of COVID-19 in China and the rest of the world from January to March. Can you relate the main changes observed in the plot to landmark events such as the WHO's declaration of a pandemic? (20 points)
4. Add a smooth trend line using linear regression to measure how fast the number of cases is growing in China after 15 February 2020. How does the rest of the world compare to linear growth? (20 points)
5. Raise at least two interesting questions of your own about the COVID-19 pandemic, then use the given dataset to answer them and explain your findings. (25 points)
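The trend-line idea in question 4 can be sketched as follows. This is a hedged illustration on made-up daily totals, not the expected solution: the numbers, the object names (`china_totals`, `rest_totals`, `to_long`) and the choice of comparing R-squared values are all assumptions made for the example.

```r
# Made-up cumulative totals, shaped like the output of colSums() over date columns.
china_totals <- c("2/15/20" = 100, "2/16/20" = 112, "2/17/20" = 119,
                  "2/18/20" = 131, "2/19/20" = 140)
rest_totals  <- c("2/15/20" = 10, "2/16/20" = 20, "2/17/20" = 40,
                  "2/18/20" = 80, "2/19/20" = 160)

# Hypothetical helper: turn a named vector into a long data frame with real dates.
to_long <- function(totals) {
  d <- data.frame(date = as.Date(names(totals), format = "%m/%d/%y"),
                  cases = as.numeric(totals))
  d$day <- as.numeric(d$date - min(d$date))  # days since the first date
  d
}

china <- to_long(china_totals)
rest  <- to_long(rest_totals)

# Linear regression: the slope estimates new cases per day.
fit_china <- lm(cases ~ day, data = china)
fit_rest  <- lm(cases ~ day, data = rest)

slope_china <- coef(fit_china)[["day"]]
r2_china <- summary(fit_china)$r.squared
r2_rest  <- summary(fit_rest)$r.squared

# A straight line on the log scale suggests exponential rather than linear growth.
fit_rest_log <- lm(log(cases) ~ day, data = rest)
daily_factor <- exp(coef(fit_rest_log)[["day"]])  # multiplicative growth per day

cat("China slope (cases/day):", slope_china, "\n")
cat("R-squared, China vs rest:", r2_china, r2_rest, "\n")
cat("Rest of world daily growth factor:", daily_factor, "\n")
```

In this toy data the linear model fits the China series better than the rest-of-world series, which doubles each day. With the real dataset the picture may differ, so inspect the plots and residuals before drawing conclusions.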
Part Two – small-scale review (60%)
For this part, you are asked to complete a small-scale literature review on a data science application in managing the COVID-19 pandemic. You need to locate at least five computing journal articles on your topic and write a 4–5 page literature review of the articles you have selected.
Steps to complete your small-scale literature review:
1. Choose a data science application that interests you. There are various data science applications related to COVID-19, including but not limited to the analysis of COVID-19 trends, risk modeling, computer-aided diagnosis, contact tracing, and social media topic modeling, all of which help manage and control the pandemic to keep us safe. In this review, you are asked to focus on ONE specific data science application domain only.
2. Narrow down to a specific inquiry question that describes what you would like to know about your selected data science application topic.
3. Go to the library to search for and locate journals that cover your topic.
4. Search for articles and read the abstracts to select those that correspond well to your topic and inquiry question.
5. Read your selected articles and begin to sort and classify them in a meaningful way. Always consider your original topic and inquiry question. In your review, summarize the main findings of the articles you reviewed and point out the information that you found particularly important in answering the inquiry question.
6. Write your review.
Suggested components of your review:
Introduction: Define the topic of your study and provide any background information that helps your reader to understand the topic. State your inquiry question for this review.
Summary: A concise summary of the selected papers, with elaboration.
Your own opinions of the papers: Give your views by connecting the papers reviewed to your inquiry question and the context of the application of interest.
References: List all references cited in your review. Please use the IEEE numbered reference style.