Google Analytics assignment
Part A: Google Analytics and Tableau report
Select your own live website and present a live report of Google Analytics and Tableau using the data of the live website. Data analytics method: explaining the machine-learning models (minimum 1 clustering and 2 classifications) and/or Google Analytics details on the Real-time Monitoring Report, Audience Analysis Report, Acquisition Report, and Consumer Behaviour Report.
The analytics problems which you are going to solve include:
• Customer Profiling/Classification
• Target customers? Example: Who will buy Life Insurance?
• Customer Avatar/Profile. Example: Age, Gender, Race, Ed
• Customer Behavior Prediction
• Loan or not? Example: Will he default on this car loan?
• Buy or Not Buy? Example: Will she watch this movie?
• How many units/sales? Example: How many shoes to stock?
• Next-time buy? Example: When will she buy this Fish Oil again?
• Churn Problem? Example: (Y/N) When will she switch to Bell?
• Customer Service (Chatbot)? Example: How to serve online better?
Part B: Python
The sources of your datasets can come from:
• WWW.KAGGLE.COM
• Google dataset search
• Dataverse
• Open Data Kit
• Ckan
• Open Data Monitor
• Plenar.io
• Open Data Impact Map
• Smart City Open Dataset, including IoTs, Traffic, Transportation, Housing, etc. (get source here)
• Google Analytics Dashboard (tool screenshot) (github source here)
• Sharing-economy Airbnb (github source here)
• Travel Website Analytics (github source here)
• Sports Club Analytics (github source here)
• Twitter and Elections (github source here)
What needs to be done on Python
3) Data cleaning, exploratory data analysis, and data visualization
• Clustering (minimum 1 tool, e.g., K-means, K-NN)
• Data analytics: Classification (minimum 2 tools, e.g., Linear, LR, SVM, ANN)
Project presentation PPT format:
• Business problem: a clear business data problem statement which your project addresses.
• Motivation: why is this data problem interesting and challenging or innovative to solve?
• Data Visualisation approach: please show your visualization results here.
• Data Analytics approach: please demonstrate your machine-learning models and/or Google Analytics and results here.
• Conclusion and Future work: please summarize key achievements and identify future work here.
7) Project submission files over UCW course portal
• Project files: code, dataset, key outputs, performance matrix, to be compressed to a ZIP file;
• Project presentation PPT;
• Project final report (MS Word): your final project report should follow the APA style, including
o Introduction: describing the business problem and the motivation to solve it;
o Relevant work: describing 5+ peer-reviewed papers related to your project with reference;
o Data and visualization: describing your dataset and showing the data using visualization tools;
o Data analytics method: explaining the machine-learning models (minimum 1 clustering and 2 classifications) and/or Google Analytics details on the Real-time Monitoring Report, Audience Analysis Report, Acquisition Report, and Consumer Behaviour Report;
o Results: demonstrating the results of your data analytics models and visualizing the data output;
o Discussion, Conclusion, and Future work;
o Reference list.
I would need an MS Word document and a PPT for the same. The assignment is on Google Analytics + Tableau (live website) and Python (on a dataset from one of the sources mentioned above in the mail).
Data Analysis and Data Visualization
Table of Contents
Data Description
Data Visualization
Conclusion and Future Work
Part A: Google Analytics and Tableau Report
The impact of structured product insurance on equity market instruments is investigated. We examine different elements of the structured products market, such as investor motivation, product risks, hedging behaviour, and the influence of hedging on exchange-traded products. We find that hedging would be volatility-supportive during a sell-off and volatility-suppressive during a large rally. We also propose the introduction of certain new exchange-traded products that would simplify the hedging process while also allowing individual investors to express their views more effectively without incurring the financial risks associated with structured instruments. For this purpose, we analyze live data from the stock exchange.
Derivative trading is an important instrument for market health since it improves market pricing and adds stability. Although derivative contracts were introduced late in Indian equity exchanges, they swiftly gained significance. Index futures, the first exchange-traded derivative instrument in the Indian market, were introduced in June 2000. Index options, stock options, and finally stock futures were launched over the following year and a half. Since then, derivative trading has grown to a multiple of stock market volumes, providing market players with a means of speculating and hedging that would not be available in the stock market. According to NSE data from 2007, individual investors have become the major players in the derivatives market during the last three to five years, accounting for over 60% of all derivatives trading in general. Although derivatives are useful tools for expressing complicated nonlinear market views, a lack of knowledge and comprehension has led to investment in financial instruments that have speculative payoffs but are customised and not publicly tradable.
- Dataset Description
The NSE data contains trading values for various symbols, such as Nifty 50, Tata Motors, ONGC, LT, and HDFC. For each symbol, the dataset includes values such as the day's high, the day's low, the last price, the trade volume, and the trade value. We took the dataset from the live website. First, we inspected the live site and identified the target values needed to extract the data, such as the URL, encoding, and cookies. We then extracted the data from the live website into Excel and performed the data visualization in Tableau.
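The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used for the report: the endpoint URL, the header and cookie values, the `data` key, and the field names are all placeholders that would have to be replaced with the values observed when inspecting the live site. CSV is used here as a stand-in for the Excel export, since both Excel and Tableau can open it.

```python
import csv
import io
import json
import urllib.request

# Hypothetical endpoint and header values (assumptions, not the real site's API).
DATA_URL = "https://example.com/api/market-data"
HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "application/json",
    "Cookie": "session=PLACEHOLDER",
}

# Assumed field names for each quote record.
FIELDS = ["symbol", "dayHigh", "dayLow", "lastPrice", "tradedVolume", "tradedValue"]


def fetch_payload(url=DATA_URL, headers=HEADERS, timeout=10):
    """Request the JSON payload, sending the inspected headers and cookies."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def payload_to_rows(payload):
    """Keep only the target fields from each record under the assumed 'data' key."""
    return [{name: record.get(name) for name in FIELDS}
            for record in payload.get("data", [])]


def write_csv(rows, handle):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Offline demonstration with a made-up payload (no network access needed).
sample = {"data": [{"symbol": "AAA", "dayHigh": 105.0, "dayLow": 99.5,
                    "lastPrice": 101.2, "tradedVolume": 1200,
                    "tradedValue": 121440.0, "extra": "ignored"}]}
buffer = io.StringIO()
write_csv(payload_to_rows(sample), buffer)
print(buffer.getvalue())
```

In practice, `fetch_payload()` would supply the payload and the CSV would be written to a file that Tableau connects to.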
- Data Visualization
We analyzed which stock symbols have the highest value on a particular day and mapped the symbols on the basis of that value. From the above analysis, we can see that SHREECEM has the highest trade value on that day.
The above graph shows the analysis of the highest trade value, plotted using the identifier and the total trade value. In this graph, NIFTY 50 has the highest trade value.
The trade market values change on a daily basis. The above table analyzes the 30-day trade value by company name. From the table, UPL has the highest sales over the 30 days, whereas HINDUNILVR has the lowest.
We took the trade volume for each company and plotted it as a bar graph. Because trading values change every day, the volumes change accordingly. From the above analysis, we can see that NIFTY 50 has the highest trade volume.
For the above analysis, we took 360 days of trading data for each company. The trade value changes as the market moves daily. We plotted a table of the 360-day trade value. From the table, we can conclude that JSWSTEEL has the highest trade value over the last 360 days, whereas EICHERMOT has the lowest.
The above dashboard shows the trade value for each company over the 30-day and 360-day windows. The values differ between the two windows and are updated for each company, so the company with the highest trade value over 30 days is not the same as the one over 360 days. Across the data, however, NIFTY 50 has the highest trade volume and sales value.
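The 30-day versus 360-day comparison made in Tableau can also be reproduced in a few lines of Python. The sketch below is only illustrative: the record layout `(day_index, symbol, traded_value)` and the numbers are assumptions, not the NSE data used in the dashboard.

```python
from collections import defaultdict


def rank_by_trade_value(records, window_days):
    """Sum traded value per symbol over the most recent `window_days` days.

    `records` is an iterable of (day_index, symbol, traded_value) tuples,
    where day_index 0 is the most recent trading day.
    """
    totals = defaultdict(float)
    for day_index, symbol, traded_value in records:
        if day_index < window_days:
            totals[symbol] += traded_value
    # Highest total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up records, not real market data.
records = [
    (0, "AAA", 50.0), (0, "BBB", 80.0), (0, "CCC", 20.0),
    (40, "AAA", 100.0), (40, "BBB", 10.0), (40, "CCC", 30.0),
]

ranking_30 = rank_by_trade_value(records, 30)
ranking_360 = rank_by_trade_value(records, 360)
print("30-day highest:", ranking_30[0][0], "lowest:", ranking_30[-1][0])
print("360-day highest:", ranking_360[0][0], "lowest:", ranking_360[-1][0])
```

With this toy data the leading symbol differs between the two windows, which mirrors the observation that the 30-day and 360-day leaders are not the same.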
Part B: Python
Financial fraud is a growing problem that harms the wider economy, collaborating organisations, and government administration. Along with advancements in cloud computing and heavy internet penetration, transactions are expanding at a quicker rate. With the advancement of technology and the growing use of credit cards, fraud rates have become a problem for companies. Whenever enhanced security features are added to credit card transactions, fraudsters devise new patterns or find new loopholes to exploit them. As a result, the behaviour of fraudulent and regular transactions is continually changing. Another issue with credit card data is that it is heavily skewed, which makes fraudulent charges difficult to detect. Imbalanced or skewed data is therefore pre-processed with a re-sampling (over-sampling or under-sampling) approach to obtain better results. This report compares distinct ratios of datasets and applies a random under-sampling strategy to the skewed samples. The machine learning techniques used in this study are logistic regression, Naive Bayes, and K-nearest neighbour. The effectiveness of these methods is documented in a comparative study. The work is performed in Python, and the algorithms' performance is measured by accuracy, sensitivity, specificity, precision, F-measure, and area under the curve. On these metrics, a logistic regression-based model for fraud prediction was found to be superior to the models based on Naive Bayes and K-nearest neighbour. Applying under-sampling to the data before building the prediction model also yields better outcomes. (Mehndiratta, 2019)
Keywords: credit card fraud, fraud detection, random under-sampling, logistic regression, KNN, Naive Bayes.
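To make the random under-sampling step concrete, a minimal sketch is given below. It assumes each row is a tuple whose last element is the class label (1 for fraud, 0 for a regular transaction); the function name, the `ratio` parameter, and the toy data are illustrative assumptions rather than the exact procedure used in the study.

```python
import random


def random_under_sample(rows, label_index=-1, ratio=1.0, seed=0):
    """Randomly drop majority-class (label 0) rows.

    `ratio` is the desired number of majority rows kept per minority row.
    """
    rng = random.Random(seed)
    minority = [row for row in rows if row[label_index] == 1]
    majority = [row for row in rows if row[label_index] == 0]
    keep = min(len(majority), int(len(minority) * ratio))
    balanced = minority + rng.sample(majority, keep)
    rng.shuffle(balanced)
    return balanced


# Made-up skewed data: 5 fraud rows (label 1) among 1000 transactions.
rows = [(float(i), 1 if i % 200 == 0 else 0) for i in range(1000)]
balanced = random_under_sample(rows, ratio=2.0)
fraud = sum(label for _, label in balanced)
print(len(rows), "->", len(balanced), "rows, of which", fraud, "are fraud")
```

Varying `ratio` gives the different class proportions that can then be compared across classifiers.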
- INTRODUCTION –
A credit card is a compact, thin plastic or fibre card that carries identification details, such as a photograph or signature, and allows the person named on the card to charge purchases of products and services to the linked account, which is billed periodically. ATMs, swipe devices, retail readers, banks, and Internet transactions all read the card information quickly. Each card has a unique account number, which is extremely significant; the card's security relies primarily on the physical security of the card as well as the confidentiality of the account number. (Jayant, 2014)
The number of credit card transactions is rapidly increasing, which has resulted in a significant spike in fraud incidents. To detect fraud, a variety of analytical and statistical tools are employed. Artificial intelligence and pattern recognition are used in several fraud detection strategies. It is critical to prevent fraud using effective and secure methods.
Credit card fraud is on the rise, and the resulting financial losses are growing sharply. As modern technology emerges, Internet and online payments are expanding in popularity, and credit cards are used in these transactions.
- DISCUSSION –
Fraud is often defined as criminal deception with the intent to profit. With the increasing reliance on online technology, the number of credit card scams has reached a record high. Credit cards are nowadays used in most transactions, whether online or offline. A number of research works cover a wide range of detection approaches. Inner card fraud and external card fraud are the two main types of credit card fraud. Inner card fraud occurs when a false identity is used to commit fraud through mutual agreement between issuers and the bank; external card fraud, on the other hand, occurs when a credit card is used to obtain money through dubious means. (Meenakshi)
Credit card fraud is a serious problem that costs financial institutions and card firms a lot of money. Because of this weakness in the payment system, banking institutions consider credit card fraud to be a serious threat, and they have protection mechanisms in place to monitor transactions and detect fraud as rapidly as possible. Predictive analytics is required to limit the impact of fraudulent transactions on service delivery, costs, and the value of the firm. Machine learning has helped to address a variety of critical business challenges, including email spam detection, targeted recommendations, and accurate diagnosis, among others.
Implementation method that has been chosen
Many techniques are used in the research literature to detect fraud. According to the analysis, Naive Bayes, logistic regression, and K-nearest neighbour are more effective in detecting fraud than other techniques.
- Naive Bayes
Naive Bayes is a machine learning algorithm based on Bayes' theorem. It is a straightforward but effective method. Bayes' theorem gives the conditional probability of an event in terms of prior knowledge of a related event:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
where $P(A)$ is the prior probability of A, $P(B)$ is the prior probability of B, $P(B \mid A)$ is the likelihood of B given A, and $P(A \mid B)$ is the posterior probability of A given B. The Naive Bayes method is quick and easy. It is highly scalable and requires little training data.
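As a small worked illustration of Bayes' theorem in a fraud setting, the sketch below computes a posterior probability from a prior and two likelihoods. The event names and all numbers are made up for the example, and P(B) is expanded with the law of total probability.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) using Bayes' theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)).
    """
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Made-up numbers: A = "transaction is fraudulent",
# B = "transaction amount is unusually large".
p_fraud_given_large = posterior(
    prior_a=0.01,
    likelihood_b_given_a=0.9,
    likelihood_b_given_not_a=0.1,
)
print(round(p_fraud_given_large, 4))
```

Even with a strong signal, the posterior stays fairly low because fraud is rare, which is the same class imbalance that motivates re-sampling.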
- Logistic Regression
This algorithm is similar to linear regression. However, linear regression is used for predicting or forecasting continuous outcomes, whereas logistic regression is used for classification.
Logistic regression is divided into three types:
• Binomial – only two possible categories (i.e. 0 or 1)
• Multinomial – three or more categories that are not ordered
• Ordinal – three or more ordered categories (i.e. very poor, poor, good, very good)
This approach is simple to apply to both binary and multiclass classification.
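The binomial case can be sketched from scratch as below. This is a minimal illustration trained with batch gradient descent on the log loss; the learning rate, epoch count, and the single toy feature are assumptions chosen for the example, and a library implementation would normally be used on the real dataset.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = sigmoid(score) - y  # gradient of the log loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x, threshold=0.5):
    """Return class 1 when the predicted probability reaches the threshold."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= threshold else 0


# Toy, standardised single feature: negative values are class 0, positive class 1.
features = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(features, labels)
predictions = [predict(weights, bias, x) for x in features]
print(predictions)
```

The `threshold` argument is where a fraud detector could trade sensitivity against specificity.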
- K-Nearest Neighbor (KNN)
Several anomaly detection approaches have exploited the idea of nearest neighbor analysis. The k-nearest neighbor technique is a supervised learning technique in which a new query instance is classified according to the majority category among its K nearest neighbors. It is one of the best classification methods that have been used in credit card fraud detection.
Three primary factors influence the performance of the KNN algorithm:
• The distance metric used to locate the nearest neighbors.
• The distance rule used to derive a classification from the K nearest neighbors.
• The number of neighbors used to classify the new instance.
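The three factors listed above map directly onto the parameters of a small from-scratch sketch: the `distance` function, the majority-vote rule, and `k`. The Euclidean metric, the majority vote, and the toy points are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: distance(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up two-feature transactions forming two well-separated groups.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["legitimate"] * 3 + ["fraud"] * 3

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (5.1, 5.0)))
```

An odd `k` avoids tied votes in the two-class case.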
- CONCLUSION AND FUTURE WORK –
Logistic regression provides the maximum accuracy when the methods are executed. Its run time is relatively long; however, in this situation, accuracy is the most important criterion for assessing the results. The results demonstrate that the naive Bayes, logistic regression, and k-nearest neighbor classifiers have maximum accuracy of percent, 99.99 percent, and 99.84 percent, respectively. Based on these comparisons, logistic regression outperforms both the naive Bayes and k-nearest neighbor approaches, so it can be used to detect credit card fraud. (Mehndiratta, 2019)
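Because the comparison rests on accuracy and the related measures named earlier in this report, a minimal sketch of how they follow from the confusion matrix is given below. The labels are made up and do not reproduce the reported figures, and area under the curve is not covered here.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives, treating 1 as fraud."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(actual, predicted):
    """Compute accuracy, sensitivity, specificity, precision, and F-measure."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    sensitivity = safe_ratio(tp, tp + fn)  # recall on the fraud class
    precision = safe_ratio(tp, tp + fp)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": safe_ratio(tn, tn + fp),
        "precision": precision,
        "f_measure": safe_ratio(2 * precision * sensitivity,
                                precision + sensitivity),
    }


# Made-up labels: 3 fraud cases among 10 transactions.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = evaluation_metrics(actual, predicted)
for name, value in metrics.items():
    print(name, round(value, 3))
```

On heavily skewed data, accuracy alone can look high even when few fraud cases are caught, so sensitivity and precision are worth reporting alongside it.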
- REFERENCES –
- Jayant, P. (2014). Survey on Credit Card Fraud Detection Techniques. Retrieved 11 March 2021, from https://www.ijert.org/research/survey-on-credit-card-fraud-detection-techniques-IJERTV3IS031593.pdf
- Mehndiratta, S. (2019). Credit Card Fraud Detection Techniques: A Review. Retrieved 11 March 2021, from https://ijcsmc.com/docs/papers/August2019/V8I8201911.pdf
- Meenakshi, “Comparison of Machine Learning Algorithms for Credit Card Fraud Detection” From,
- Priya, B. (2014). Survey on Credit Card Fraud Detection Using Hidden Markov Model. Retrieved 11 March 2021, from https://ijarcce.com/wp-content/uploads/2012/03/IJARCCE2A-s-malvika-Survey-on-Credit-Card-Fraud-final.pdf
- Randhawa, K., Loo, C.K., Seera, M., Lim, C.P. & Nandi, A.K. (2018). Credit card fraud detection using AdaBoost and majority voting. IEEE access, 6, pp.14277-14284.
- R. Ramesh, “Credit card fraud detection using machine learning” From,
- Saini, A. (2019). (PDF) Credit Card Fraud Detection using Machine Learning and Data Science. Retrieved 11 March 2021, from https://www.researchgate.net/publication/336800562_Credit_Card_Fraud_Detection_using_Machine_Learning_and_Data_Science
- Shirgave, S. (2019). A Review On Credit Card Fraud Detection Using Machine Learning. Retrieved 11 March 2021, from http://www.ijstr.org/final-print/oct2019/A-Review-On-Credit-Card-Fraud-Detection-Using-Machine-Learning.pdf
- Varun Kumar, “Credit card fraud detection using machine learning” From,