Computer Science-QA5 online

Data Warehousing and Data Mining Coursework specification – Data Mining

Assignment Position

This specification covers the Data Mining half of this module. Hence it contributes 50% of the overall assessment towards the module.

Module Learning Outcomes

Module Learning Outcomes are the official statement of what you are intended to gain from the module. (“On successful completion of the module the student will be able to …..”). All module specifications carry these statements. In the bullet points below, we list the three Module Learning Outcomes for Data Mining.

Using sources from the literature, provide a critical evaluation of the role of data mining techniques in business intelligence.

Explain the concepts that underpin the subject area of [data mining], making reference to the main established concepts and some developing areas.

Apply practical Data Mining tools in real problem contexts.

Part A

Using Sources from the Literature, Provide a Critical Evaluation of the Role of the Data Mining Techniques in Business Intelligence

To answer Part A well, you will have to gather information by reading books, journals and company white papers. You can also use video lectures, tutorials, people’s online blogs and company adverts, but do not rely only on non-peer-reviewed sources. Many items, including online books, can be accessed through the Library “Gateway” facility (see the menu bars of Blackboard).

Almost certainly you will discuss what several authors say, highlighting the similarities or differences between their answers. Try to analyse why they differ: what perspective are the authors taking? When was their document written? Does it have a bias towards a particular application or usage sector? You will use this comparison to review some of the techniques you used in the practical sessions of this module with SAS Enterprise Miner, and to explain to what extent your work is representative of what the DM industry or researchers say is the topic of “data mining”.

It is recommended that you pick a small number of topics (e.g. three or four) and discuss these in detail, rather than take a large number of topics and discuss each only in trivial depth.

Examples of Topics you Might Discuss Are

What methodologies are used in Data Mining projects? (“SEMMA, CRISP-DM, Method A and Virtuous Circle” is a good search starting point.) What does the DM industry use? Which of these is closest to the approach you used in the practical sessions?

Industry-scale Data Mining probably uses many software tools to help identify patterns and support knowledge discovery. What are the main tools? Which processes are most often covered by tools? How do these tools compare with the SAS Enterprise Miner used in this module?

Data mining has several advantages when used in particular industries. However, there are also limitations associated with data mining. Examine in greater detail the pros and cons of data mining in different industries.

Why does real data need so much data preparation? What is industry practice for data preparation in data mining? In what ways does (or could) the approach you used in the practical sessions simulate what industry does?

Notice that we recommend that you discuss THREE areas, yet we have already listed four topic examples. This is to illustrate that there is a wide range to choose from. You can also choose other topic areas yourself.

The total writing for Part A should be around 1200 words. Words beyond 1400 will not be read. Referencing must follow the APA style.

Part B

Review of a Good Data Mining Journal Article

Find an example in the academic literature (i.e. an article or paper from a reputable academic journal) where data mining has been used. You should select an article where clustering, a decision tree or another data mining technique has been used.

Discuss your chosen article. Briefly describe the situation in which data mining was applied, what was discovered and whether (with reasons) you think data mining was used effectively.

The total writing for Part B should be around 500 words. Words beyond 550 will not be read. Referencing must follow the APA style.

Part C

Apply Practical Data Mining Tools in Real Problem Context

Problem Outline

For this part, you are required to analyse a data set taken from the data mining competition held prior to the third international conference on Principles and Practice of Knowledge Discovery in Databases (PKDD). This conference was held in Prague in 1999[1]. One of the challenges given for the competition was a set of datasets concerning financial transactions and customer details at a Czech bank. The ERD of the database is shown below:

Details of the Query and the Resulting Data

We wish to build a model of customers for the bank in order to gain some insight into the patterns that exist in the customer groups. Several queries were developed to produce a final query, QueryR, which is described in the appendix. There were 4500 records for these customers. For each customer, different types of credits and withdrawals take place. These are categorised as follows:

Credits (paying money into your account): Cash; Bank collect; Other

Withdrawals (taking money out of your account): Cash; Bank remittance; Card

From the transactional table it is possible to calculate the number of each type of transaction and the total value of each type of transaction. From these, the average value of each type of transaction has been calculated by dividing the total value of transactions by the number of transactions. The resulting final table was produced in the Access database and is called Queryr. It also contains other background information: age, sex, whether there is a second account holder (second), whether the client has a loan (loan) and the frequency with which statements are issued (frequency).
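The averaging step described above can be sketched in a few lines of Python. This is only an illustration, not the query used to build Queryr: the function name `average_value` and the sample figures are hypothetical, and returning 0 for a customer with no transactions of a given type is an assumption, since the specification does not say how such cases were handled.

```python
def average_value(total_value: float, n_transactions: int) -> float:
    """Average value of one transaction type for one customer.

    Assumption (not stated in the specification): a customer with no
    transactions of this type is given an average of 0.0.
    """
    if n_transactions == 0:
        return 0.0
    return total_value / n_transactions


# Hypothetical figures for a single customer, not taken from the bank data.
cash_credit_avg = average_value(total_value=24000.0, n_transactions=3)
bank_collect_avg = average_value(total_value=0.0, n_transactions=0)
print(cash_credit_avg, bank_collect_avg)
```

Guarding the zero-count case explicitly avoids a division-by-zero error for customers who never use a particular transaction type.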

Analysis Carried Out and Questions

This data has been analysed using SAS Enterprise Miner as follows.

Question

The bank wishes to see if different customers have similar financial profiles and has therefore asked that the Queryr data be clustered. They are looking for about eight clusters. Since cluster analysis requires the use of fields that are as symmetrical as possible, each field in the Queryr data is investigated. This resulted in the plots shown in Figure 2 and the table of suitable summary measures given in Table 1.

1. Using only the plots shown in Figure 2, discuss the shape of each field in the dataset. What other features do you notice?
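To see why the symmetry of a field is worth checking before clustering, the sketch below measures the skewness of a small right-skewed field before and after a log transform. Everything here is illustrative: the values in `raw` are hypothetical, the log transform is one common remedy and is not something the specification prescribes, and `skewness` uses the simple population formula, so its results will differ slightly from the adjusted statistic reported by SAS Enterprise Miner.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical average values for ten customers (not from the bank data):
# most are zero and a few are very large, giving a long right tail.
raw = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1200.0, 5000.0, 21000.0, 54000.0]

# log1p(x) = log(1 + x) is defined at zero, unlike a plain logarithm.
logged = [math.log1p(v) for v in raw]

print("skewness before:", round(skewness(raw), 3))
print("skewness after: ", round(skewness(logged), 3))
```

A skewness close to zero indicates a roughly symmetrical field; large positive values indicate a long right tail that can dominate distance-based methods such as clustering.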

Interval Variable Summary Statistics

Variable	Mean	Standard Deviation	Non Missing	Missing	Minimum	Median	Maximum	Skewness
abankcol	4279.939	10054.18	4500	0	0	0	54824.5	3.245687
abankr	3083.5	2726.309	4500	0	0	2512	14811	1.110656
acashcr	10763.78	9295.205	4500	0	200	8328.839	31777.98	0.354047
acashwd	4875.521	3948.209	4500	0	141.6462	3568.881	20651.72	0.901878
acredit	7769.748	5456.384	4500	0	851.4029	6597.685	26885.96	0.857385
age	45.61529	17.08485	4500	0	17.05681	44.82409	81.9822	0.25241
aothcr	145.7339	61.31085	4500	0	0	139.61	335.1289	0.347512

2. Discuss the summary measures: min, max, mean, range and Std. Dev. (standard deviation) to explain any other features of the data. Do any of these measures help in understanding some of the features you discussed in question 1? If so, explain how.

3. Use the Skewness shown in Table 1 to further investigate the shape of the distribution of each field. Do your results confirm what you found in question 1?
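The range is not listed in the summary table, but it follows directly from the minimum and maximum. The sketch below derives it, together with two rough shape indicators, from three rows copied from the table. The helper name `describe` and the choice of indicators are illustrative assumptions rather than part of the SAS output.

```python
# Rows copied from the Interval Variable Summary Statistics table:
# variable -> (mean, std_dev, minimum, median, maximum)
summary = {
    "abankcol": (4279.939, 10054.18, 0.0, 0.0, 54824.5),
    "abankr": (3083.5, 2726.309, 0.0, 2512.0, 14811.0),
    "age": (45.61529, 17.08485, 17.05681, 44.82409, 81.9822),
}


def describe(stats):
    """Derive the range and two rough indicators of shape from one row."""
    mean, std_dev, minimum, median, maximum = stats
    return {
        "range": maximum - minimum,
        # A mean well above the median hints at a long right tail.
        "mean_minus_median": mean - median,
        # Spread relative to the mean (coefficient of variation).
        "std_over_mean": std_dev / mean,
    }


for name, stats in summary.items():
    d = describe(stats)
    print(
        f"{name:9s} range={d['range']:10.2f} "
        f"mean-median={d['mean_minus_median']:9.2f} "
        f"std/mean={d['std_over_mean']:5.2f}"
    )
```

These indicators are only a quick cross-check against the plots and the Skewness column; they do not replace either.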

A final cluster solution is produced. Some of the results are shown in Figure 3 and Figure 4. Using these results, answer the following:

4. How many clusters have been fitted? Which cluster has the most observations (customers) in it? Which has the fewest?

5. Which cluster has customers that are most similar (consistent)? Explain what evidence there is for this.

A set of box plots comparing the clusters is given below.

6. What features do clusters 1 and 4 have in common that differ from the remaining clusters?

7. Which cluster tends to have older customers? Which has younger customers?

8. How do customers in cluster 3 use their bank account? How does cluster 5 compare to this?

Product code : Computer Science-QA5
