Data Mining
Introduction
Retail establishments have invested heavily in information technology to manage their businesses more effectively and to gain a competitive advantage. Over the last three decades, retailers have captured and stored large amounts of critical business data electronically, and the volume continues to grow. Yet despite this wealth of data, many retailers have not been able to capitalize fully on its value. The information implicit in the data is difficult to discern and differentiate.
For example, a discounter routinely collects point-of-sale data about the purchases made by each customer, but still has difficulty understanding the buying patterns of its customer base. Fortunately, advances in the field of data mining are helping businesses leverage their data to obtain meaningful information that can give them a competitive edge. According to one merchandiser already working with advanced data mining techniques, "Developing a thorough understanding of buying habits and preferences is central to our strategy of being the customer's first choice. Data mining allows us to make more informed merchandising and advertising decisions to better serve our customers."
But data mining is rapidly becoming an over-hyped phrase. It is used synonymously with terms such as micro merchandising, micromarketing, and data warehousing, and it is increasingly viewed as a panacea for all the problems associated with large amounts of data. Can data mining really help a retailer better understand its business?
Data Mining Defined
Data analysis is the key to addressing many business problems. Its main purpose is to help make more intelligent business decisions. Data mining is a class of analysis techniques that helps provide more knowledge about your customers. In the academic definition, data mining is the process of making discoveries from large amounts of detailed data. The real potential of data mining lies in its ability to go beyond what is already known. By providing new insights based on historical, factual data, data mining uncovers new knowledge for category managers, merchants, store managers, and buyers, which in turn results in more informed decision-making. One of the benefits of data mining in retail is a better understanding of customer behavior, leading to more effective micro merchandising and micromarketing strategies. This use of data mining relies on the fundamental business premise that there are valuable behavioral insights to be learned from historical data.
Data mining complements traditional decision support systems, which rely on the user to define the questions to be answered. It complements traditional hypothesis-based decision-making by using the facts and patterns hidden in historical data to generate questions and insights about the business that were previously unknown.
Establishing and maintaining customer relationships has always been crucial to a retailer's success. However, a number of companies have realized that success in the information age takes more than generalized coupons, promotional events, and price discounts. Consumers want merchandise quickly and at an attractive price, and they also want to be recognized consistently. In other words, the consumer should be one of the retailer's main priorities. To achieve this, the retailer needs to know its customers in detail. Data mining lets the retailer learn more about its customers and direct its offerings and services toward their needs and desires.
With the help of data mining, an organization compiles personal, pertinent, and actionable information about the purchasing habits of current and potential customers. Traditionally, data mining was used by catalogers, who had no other way of knowing their customers' choices. As a result, it became an integral way of doing business and spread quickly through the market. Knowing customers' behavior and demographic profiles came to be regarded as an important way of establishing competitive advantage. In addition, it built brand loyalty and awareness over the long term.
Scope of Data Mining
Consumers were often bombarded with unsolicited mailers and surveys, which many found intrusive. As privacy concerns rose, many companies rethought their data-gathering initiatives. However, the internet spawned new and unique ways of accumulating consumer information. Importantly, consumers also relay personal information freely.
Many marketers believe that the key to direct marketing is collecting information on the consumer's terms rather than the company's. Email addresses and websites now allow consumers to initiate the communication themselves.
As a result, companies are publicizing their web addresses instead of spending time on phone numbers and mailed surveys. Web surfers can purchase products, make reservations, input preferential data, and comment on the organization's merchandise or services. This data is collected and compiled in a database, and then used to segment consumers, focus marketing efforts, develop new products or re-invent old ones, and deliver a degree of individual customization that can improve customer response, conversion rates, and purchases by as much as 50 percent.
Retail point-of-sale purchases are another means of capturing consumer-specific data. Various point-of-sale software packages let consumers reveal personal information without feeling intruded upon. Marketing and merchandising activities make use of this purchasing information. With it, store managers can accurately identify their best customers and determine the inventories to carry. It also supports the development of in-store service and customer recognition. The information enables companies to offer tailored local store assortments and to address seasonal and size variations across stores and markets. It also supports targeted marketing programs, examples of which have been discussed above.
The website of Borders Books and Music once had users log in as members or guests and recorded their queries. From that point on, a user's preferences were shown first whenever the user logged in to the site. If the user had purchased books or music in the past, the site would inform the user of new books or CDs matching the same preferences. In the same way, if a consumer had inquired about an author, even without making a purchase, the consumer would be informed of new material from that author.
Amazon.com has customized browsing of its site for individual customers in much the same way as Borders. On a second visit to the site, users are greeted by name and their past queries are recognized. Amazon.com also offered e-mail notification of new material by an author, and interested consumers are notified about particular genres as well.
Winn Dixie offers its customers a VIP card. Cardholders receive special discounts and services, and the card is used to monitor their spending habits. The store also offers products tailored to different demographics. Where grocery stores are abundant, the card gives consumers an added incentive to shop at Winn Dixie supermarkets.
Specific customer-purchase behavior information has been captured in this way. Retail store credit cards and customer cards have been widely used both as a source of income and as a convenient service.
Different patterns of customer behavior are then analyzed and acted on at a local level. Some retailers augment other data mining sources with more traditional consumer research and, as a result, build closer customer relationships. For instance, in-store intercept surveys can be conducted in a sample of stores. The information collected from these surveys can be used to analyze the views of the stores' customers. The surveys go beyond raw data capture: they let the analyst determine such things as the retailer's market share and the categories of goods for which the retailer is the customer's first choice, along with the real and perceived factors behind that choice. By applying consumer research in these ways, astute retailers capture a greater share of their existing customers' purchases. Successful retailers become the customer's first choice.
Data mining is a loose term that encompasses a broad set of tools and practices. Banks use these tools to gather, catalogue, and capitalize on market information. Specifically, data mining involves analyzing large quantities of data to identify meaningful patterns among customers and prospects, which gives banks opportunities to target their marketing efforts. In recent years, data mining has been considered a hot topic, and the phrase has caught on in the industry. It has become so popular that data mining practitioners use a slang term, "smurfing", to describe a layperson without a proper understanding of data mining. Banks and some technology firms are using another catch phrase, "customer relationship management" (CRM), in place of the term "data mining". According to Brud Deas, director of sales and marketing for Medici Technology Inc, community bankers usually talk about data mining and CRM as if they were the same thing. With data mining, the collected information is used to evaluate, probe, dice, and slice strata of customers. In fact, CRM is only one perspective on data mining. Bankers who use data mining software only to inform and market to existing customers are not making the fullest use of the available tools.
Fast growth
Research Question on Data Mining
What is the relative importance of customer satisfaction attributes?
When evaluating continuous improvement initiatives, researchers often study the importance customers place on different product and service attributes. Some directly ask respondents to rate the importance of different attributes, but most researchers use statistically inferred ratings of importance. Employing statistical methods, researchers regress ratings of customer satisfaction with different attributes (independent variables) against overall satisfaction (dependent variable). Typically, multiple regression accomplishes this task. Multiple regression and related techniques make numerous assumptions that are often violated in practice. A typical problem in customer satisfaction research is a high level of correlation between attributes (multicollinearity). This can dramatically affect the standardized beta coefficients, the statistics used to determine the relative importance of the various attributes. If high levels of multicollinearity exist, the standardized beta coefficients will probably be unreliable, leading to inaccurate importance ratings. Furthermore, multiple regression assumes a normal distribution of ratings (i.e., the scores will resemble a normal bell curve). This isn't the case with customer satisfaction. Past research has shown that customer satisfaction data are often skewed toward the positive end of the scale: the majority of satisfaction scores fall at the upper end (8 to 10 on a 10-point scale). Finally, these statistical techniques also assume linear relationships between independent and dependent variables. This assumption is also often wrong: research has clearly demonstrated that relationships are often curvilinear, far from a straight line. In many industries, statistical assumptions don't hold, which can produce biased and misleading results.
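The multicollinearity problem described above can be checked before any regression is run. The following is a minimal sketch in plain Python, not a method taken from the research discussed here: it computes the correlation between two attribute ratings and the resulting variance inflation factor (VIF), which for exactly two predictors equals 1 / (1 - r^2). The ratings and the names taste and delivered_temperature are hypothetical illustration data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when a model has exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical 10-point ratings from eight respondents (illustration only).
taste = [9, 8, 10, 7, 9, 8, 10, 6]
delivered_temperature = [9, 8, 9, 7, 10, 8, 10, 6]

r = pearson(taste, delivered_temperature)
vif = vif_two_predictors(taste, delivered_temperature)
print(f"correlation = {r:.3f}, VIF = {vif:.2f}")
```

A VIF near 1 means the two attributes carry largely independent information. A large value suggests that their regression coefficients cannot be separated reliably, which is the situation the paragraph above warns about.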
What is the Impact of Satisfaction on Future Financial Performance?
Satisfaction research seeks to link measures of customer satisfaction to financial performance. To accomplish this goal, researchers use regression analysis or structural equation modeling, which can support sensitivity (ROI) analysis for different improvement initiatives. Roland Rust and colleagues have pointed out that these statistical techniques can show how satisfied customers would be with certain changes and how these changes would affect overall satisfaction and financial performance. Accurate analysis of this kind is extremely valuable, but invalid statistical assumptions undermine the accuracy of sensitivity analysis. Businesses that make these errors think the results are precise when really they are not.
What level of attribute performance should be set as a target?
Setting performance targets for customer satisfaction attributes is extremely difficult for many practitioners. Often, managers set them arbitrarily. Brandt suggests taking an outside-in approach to performance targets. He proposes that attribute performance targets should be set where important business goals such as sales revenue, market share, and retention are realized. For example, if on-time delivery with a score of 8 leads to loyalty, then the performance target for on-time delivery should be 8. Brandt suggests a focus on appropriate levels of "upstream performance" so that the desired "downstream performance" can be obtained. To develop performance targets, Myers suggests using performance breaks. This analysis graphs the relationship between the performance of a single attribute and a desired business outcome such as overall satisfaction, sales revenue, or retention. While the method is helpful, it lacks rigor because it can lead to a variety of interpretations. Better methods are needed.
How it Works
Data mining techniques overcome many limitations of traditional analysis and statistical techniques. Neural networks and decision trees are two useful data mining techniques.
Neural Network Analysis
Since the mid-'80s, neural networks have been the focus of a great deal of research. With the advent of high-power computing and improved algorithms, neural network analysis is now commonplace. The term "neural networks" refers to models, originally developed in the '40s, that are designed to simulate the human brain. As an analytic tool, neural networks overcome the limitations of traditional statistics that typically surface in customer satisfaction research. Neural networks are mathematically driven. Bishop has shown how, in situations where non-normal data, multicollinearity, and non-linear relationships are present, neural networks will outperform multiple regression. Multiple regression cannot cope with high levels of complexity. For example, it can only handle a limited number of variables, often no more than five. In contrast, neural networks and other data mining techniques can handle up to 200 variables and are thus far more realistic in representing the complexity that is typical of the business environment. Although researchers regularly draw elaborate comparisons between neural networks and the human brain, neural networks provide results similar to those of multiple regression. The two techniques analyze the data using different methods but, conceptually, the results are similar. Multiple regression uses independent variables to predict a dependent variable. Neural networks call variables neurons and include one or more neurons in an input layer (independent variables) to predict one or more neurons in an output layer (dependent variables). Both techniques are designed to make predictions and examine the impact of variables on those predictions. The key to neural network analysis as a model is the way it handles complexity and uncertainty. Hidden neurons make the connection between the input and output layers indirect. This is intended to account for and model complexity and uncertainty in the business world. 
This use of hidden neurons reflects an actual level of uncertainty and complexity found in business operations. The simplest form of neural network consists of three layers. The input layer includes one or more neurons that are independent (predictor) variables. The output layer includes one or more neurons that are dependent (outcome) variables. A hidden layer connects these two layers and their neurons. As in multiple regression, the input neurons predict the output neuron. Two important considerations affect neural network analysis: over-training and random starting positions. Most software packages include features to prevent over-training, which is critical because an over-trained network memorizes the data and produces results that will not generalize to the population. Random starting positions are necessary because starting positions influence the results. When a number of random starting positions generate similar results, the analyst can have confidence in those results.
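The three-layer structure and the role of random starting positions can be sketched in a few lines of Python. This is an untrained illustration under stated assumptions, not the procedure used by any particular software package: the layer sizes (seven inputs, three hidden neurons, one output), the sigmoid activation, the weight range, and the function names init_network and predict are all hypothetical choices made for the example.

```python
import math
import random

def init_network(n_inputs, n_hidden, seed):
    """Create random starting weights; the last weight in each list is a bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(network, inputs):
    """Forward pass: input layer -> hidden layer -> single output neuron."""
    hidden, output = network
    # zip stops at the shorter list, so the trailing bias weight is added separately.
    activations = [sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, inputs)))
                   for w in hidden]
    return sigmoid(output[-1] + sum(wo * a for wo, a in zip(output, activations)))

# Seven hypothetical attribute ratings, rescaled from a 10-point scale to 0..1.
ratings = [r / 10 for r in (9, 8, 7, 9, 6, 8, 9)]

# Different random starting positions give different untrained predictions,
# which is why results are compared across several starts.
for seed in (1, 2, 3):
    network = init_network(n_inputs=7, n_hidden=3, seed=seed)
    print(seed, round(predict(network, ratings), 4))
```

In a real analysis each starting position would be trained on the data, and the analyst would check that the trained networks agree.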
Neural Network Applications
Neural network analysis is well suited to research into customer satisfaction. It can evaluate the relative importance of customer satisfaction attributes and predict the effect of customer satisfaction on future financial performance (sensitivity analysis). Data from pizza customers illustrate the use of data mining techniques and how they compare to traditional approaches. Because financial data wasn't available, only relative importance will be analyzed. With the inclusion of financial data, however, a similar model could analyze the effect of different issues on financial performance and evaluate the effect of proposed changes. Data about customer satisfaction were collected from customers of a national pizza chain. The questionnaire asked about customer perceptions of the pizza they ate and about their overall satisfaction with the pizza and the service of the chain. The customers were asked about taste, amount of pizza, speed of delivery, reliability of delivery, price, delivered temperature, and friendliness of employees. A 10-point semantic differential scale (1 = poor, 10 = excellent) was used. Both multiple regression and neural network analysis were conducted to rate the relative importance of the seven attributes. In the multiple regression analysis, attributes were entered as independent variables and overall satisfaction as the dependent variable. SPSS 10.0 was used to conduct the analysis. In the neural network analysis, these same pizza attributes were entered as neurons in the input layer. Overall customer satisfaction was entered as the neuron in the output layer. Given the number of input neurons, three neurons were included in the hidden layer. Clementine 6.0, the SPSS data mining software package, was used to analyze the data. The importance rating of the different attributes was determined using sensitivity analysis in Clementine. 
In the past, a major criticism of neural networks has been their inability to explain their reasoning, which makes them resemble a black box. Sensitivity analysis overcomes this problem and provides scores that measure the importance of each neuron in the input layer. The scores for pizza attributes are derived from their ability to predict overall satisfaction and indicate the relative importance of each attribute, ranging between 0 and 1, with higher values representing more importance. Because this data set does not meet the conditions necessary for effective statistical analysis, neural network analysis provides more accurate ratings of attribute importance. Both techniques identified taste as the most important pizza attribute, but neural networks identified amount as the second most important attribute, while multiple regression placed this variable a distant fourth. Regression analysis displays a negative coefficient for friendliness, and two variables (taste and price) have much higher coefficients than the remaining pizza attributes. This is likely due to high levels of multicollinearity. The two techniques suggest quite different views of what is important.
Data Mining in Decision Tree Analysis
Another popular data mining technique is decision tree analysis, which is also driven by a mathematical algorithm. As with neural networks, non-normal distributions, multicollinearity, and non-linear relationships do not hinder the performance of decision tree analysis. In regard to complexity, decision tree analysis can handle up to 200 predictor variables. Decision tree analysis, also called rule induction, uses different algorithms including (1) CHAID, (2) C5.0, and (3) C&RT. Each technique is designed for slightly different purposes. For example, C5.0 can only predict categorical dependent variables, whereas C&RT can be used to predict interval or continuous variables. Essentially, decision tree analysis searches through the data to identify which predictor variable is most important to correctly predicting the dependent variable. Decision tree analysis partitions the data based on an initial split on the first variable. Thus, the most important predictor is selected first. For that variable, the level that corresponds to predicting the dependent variable is also identified. Then, the next most important variable for predicting the dependent variable is selected, along with the corresponding level of performance. Decision tree analysis continues until all relevant variables are selected. If a variable is not selected, then it is not critically important to the prediction. Decision tree analysis also may yield insight into different market segments. 
For example, the loyalty of different market segments is affected by different attributes in different ways. Decision tree analysis can identify various market segments in the data, display what is most important for each segment, and define the level of performance that would create loyalty.
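The first step described above, selecting the most important predictor and the level at which to split it, can be sketched as follows. This is a simplified illustration, not the CHAID, C5.0, or C&RT implementation: it assumes Gini impurity as the splitting criterion (the one commonly associated with CART-style trees), and the survey rows, labels, and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 loyalty labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, attributes):
    """Return (weighted impurity, attribute, threshold) of the best single split."""
    best = None
    n = len(rows)
    for attr in attributes:
        for threshold in sorted({row[attr] for row in rows}):
            at_or_above = [y for row, y in zip(rows, labels) if row[attr] >= threshold]
            below = [y for row, y in zip(rows, labels) if row[attr] < threshold]
            if not at_or_above or not below:
                continue  # a split must leave customers on both sides
            score = (len(at_or_above) * gini(at_or_above)
                     + len(below) * gini(below)) / n
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best

# Hypothetical survey rows: 10-point ratings plus a loyalty flag (1 = loyal).
rows = [
    {"taste": 9, "price": 7}, {"taste": 8, "price": 9}, {"taste": 9, "price": 8},
    {"taste": 7, "price": 9}, {"taste": 6, "price": 7}, {"taste": 7, "price": 8},
]
loyal = [1, 1, 1, 0, 0, 0]

score, attribute, threshold = best_split(rows, loyal, ["taste", "price"])
print(attribute, threshold, score)
```

A full tree would apply the same search again inside each of the two resulting groups, which is how the next most important variable and its level are found.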
Data Mining in Decision Tree Application
Decision tree analysis can be used to rank the importance of various attributes and to rigorously identify performance targets for customer satisfaction attributes. According to Brandt's outside-in approach, attribute performance levels should be set so that desired business outcomes are achieved. Assuming customer loyalty is the desired business outcome, decision tree analysis will rank the importance of customer satisfaction attributes according to how they predict loyalty and will indicate the level of performance required to create loyalty. Decision tree analysis is illustrated using data from the pizza chain about customer satisfaction. In this analysis, loyalty is the desired business outcome (dependent variable), and the pizza and service attributes are the predictor (independent) variables. Two segments show the levels of performance that create loyalty. One segment tells us how we have destroyed loyalty in the past. Segment A ranks attribute importance for loyalty in the following descending order: (1) taste, (2) price, and (3) reliable delivery. This segment requires the following minimum performance levels to become loyal: taste (8), price (9), and reliable delivery (8). Segment A wants a great tasting pizza (8 or better) that's very affordable (9 or better) and that will be delivered reliably (8 or better). In contrast, segment B wants a great tasting pizza (9 or better) that's big in size (9 or better). For these two segments, the predictive models are accurate with levels of .971 and .915. If pizza customers perceive taste to be 7 or below, and the amount to be 8 or below, then there's a high likelihood (.925) these customers will not be loyal. 
The results of decision tree analysis are similar to those of both regression and neural network analysis and help confirm them. Additionally, decision tree analysis goes beyond both of these techniques by providing insight into different segments and their perceptions of both attribute importance and performance. Clearly, researchers can gain the most understanding by analyzing customer satisfaction data with both statistical and data mining techniques.
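The segment rules reported for the pizza data can be written out as explicit conditions, which is one way to see why decision trees are also called rule induction. The sketch below is only a restatement of the thresholds quoted above. The field names, the precedence of segment A over segment B when a customer meets both rules, and the assignment of the accuracy figures .971 and .915 to segments A and B in that order are assumptions made for illustration.

```python
def classify_customer(ratings):
    """Apply the reported pizza-chain rules to one customer's 10-point ratings.

    Returns a (label, reported accuracy) pair; accuracy is None when no rule applies.
    """
    taste = ratings["taste"]
    if taste >= 8 and ratings["price"] >= 9 and ratings["reliable_delivery"] >= 8:
        return "loyal (segment A)", 0.971  # assumed to be the first figure quoted
    if taste >= 9 and ratings["amount"] >= 9:
        return "loyal (segment B)", 0.915  # assumed to be the second figure quoted
    if taste <= 7 and ratings["amount"] <= 8:
        return "not loyal", 0.925
    return "no rule applies", None

# Hypothetical customers (illustration only).
customer_a = {"taste": 8, "price": 9, "reliable_delivery": 9, "amount": 6}
customer_b = {"taste": 9, "price": 6, "reliable_delivery": 7, "amount": 10}
customer_c = {"taste": 6, "price": 8, "reliable_delivery": 8, "amount": 7}

for customer in (customer_a, customer_b, customer_c):
    print(classify_customer(customer))
```

Customers who match none of the three rules are left unclassified here, because the text does not report how the tree treats them.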
Ensuring Success
Many researchers perceive data mining techniques as tools to "fish" for results, with no regard for theory or validation. Rigorous data mining is far from a weekend fishing trip. It's critical that researchers follow a rigorous data mining methodology. The CRISP-DM methodology was created to do just that (An in-depth discussion of the CRISP-DM methodology). A critical element of the CRISP-DM methodology concerns the validation of models. Data mining techniques are not based on statistical theory, so validating and testing models is imperative. Once a model is developed on training data, it is validated on a holdout sample. Using business knowledge and theory as a guide, models are created and tested in one data set, then validated on another sample. These tests are the only way to ensure accurate, valid models that generalize to the population. Typically, the training set represents about 60 percent of the sample, while the test and validation samples represent 30 percent and 10 percent, respectively. Researchers should expect a small drop in performance on both the test and validation samples, yet the models should still perform relatively well. This practice ensures that the model produced is not valid only in the training data, but will generalize to the population. Researchers have long known that research triangulation offers the most valid and reliable perspective of customers. Different methods and techniques should be used as complements to draw conclusions about customers. Neural networks and decision tree analysis do not replace but complement statistical techniques. Researchers should look for convergence among both statistical and data mining techniques. 
If neural networks, decision trees, and regression analysis all derive similar results, then researchers can have increased confidence in the results. However, in the face of multicollinearity, non-normal distributions, and/or non-linear relationships, neural networks and decision trees are clearly the analytic techniques of choice.
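The 60/30/10 partition mentioned above can be sketched with the Python standard library. The function name split_sample, the fixed seed, and the use of a simple random shuffle are assumptions for illustration. CRISP-DM does not prescribe this particular procedure.

```python
import random

def split_sample(records, train_pct=60, test_pct=30, seed=0):
    """Shuffle records and split them into training, test, and validation sets.

    Whatever remains after the training and test shares (10 percent with the
    default arguments) becomes the validation set.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_test = len(shuffled) * test_pct // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation

# Hypothetical respondent ids standing in for survey records.
train, test, validation = split_sample(range(100))
print(len(train), len(test), len(validation))
```

The model is fitted on the training set only, and its performance on the other two sets is then compared with its training performance to check for the small drop the text describes.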
A data mining exercise
Data Mining and Modeling as a Marketing Activity
Only a handful of companies have truly integrated data mining and modeling into their daily marketing activities. The benefits of doing so are explained in this month's "how to". Data mining and modeling are performed within an organizational context, and a few high-performance, customer-centric organizations have used them to hone their marketing process. This article illustrates the elements of the data mining and modeling activity and, by means of process schematics, demonstrates a scenario in a high-performance context. In the marketing sciences, data mining and modeling have become a vital step in the marketing process under current sound practice. These tools and techniques give marketing strategists valuable information for developing intelligence-driven strategies and tactics. Specifically, data mining and modeling help identify target markets for marketing purposes, along with the needs and desires of customers, both individuals and businesses. Through results analysis, data mining also provides the mechanism for refining these strategies and tactics. Integrating these tools and techniques into campaign planning and design achieves a precision approach to intelligence-driven marketing.
Although many statistical modeling techniques are available for marketing purposes, two primary techniques are in common use: regression models and tree models. Recently, a third technique, neural nets, has also come into use. Each of these models has distinctly different uses. Regression models, such as linear or logistic models, are used to build predictive models that forecast an outcome and generate a mathematical algorithm. Tree models, such as CHAID or CART, are used to build segmentation models of the customer base. Neural nets are used when revenues and profits are affected by a large number of characteristics. Ordinary linear regression commonly uses the Ordinary Least Squares method, which seeks the smallest residual sum of squares. The method therefore attempts to fit a predicted line that lies as close as possible to the actual outcomes. It is used when the outcome is a continuous ratio measure, for instance the revenue or profit a customer will deliver to the company. Logistic regression should be used when the outcome to be predicted is a dichotomous variable. It is used when the data are categorical and therefore not normally distributed: a dependent variable that can take only one of two values cannot have a normal distribution, as discussed previously. CHAID and its next generation, CART, build a profile of future prospects, for instance of their likelihood of switching long-distance companies. Tree models create distinct and statistically significant segments, each with a unique profile. 
The results for each segment, and the percentage of the sample population it contains, are then compared with the results for the overall population. As a result, the marketing analyst can see which segments perform better or worse than the population as a whole.
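The Ordinary Least Squares idea mentioned above, choosing the line with the smallest residual sum of squares, can be made concrete for the one-predictor case. The sketch below uses the standard closed-form solution for simple linear regression. The data, which pair a hypothetical number of purchases with revenue per customer, are invented for illustration.

```python
def fit_ols(x, y):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residual_sum_of_squares(x, y, intercept, slope):
    """Sum of squared gaps between actual outcomes and the fitted line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data: purchases per year and revenue per customer.
purchases = [1, 2, 3, 4, 5, 6]
revenue = [52, 95, 160, 190, 260, 290]

intercept, slope = fit_ols(purchases, revenue)
rss_fitted = residual_sum_of_squares(purchases, revenue, intercept, slope)
# Any other line, such as one with a slightly steeper slope, fits worse.
rss_other = residual_sum_of_squares(purchases, revenue, intercept, slope + 1.0)
print(round(intercept, 2), round(slope, 2), rss_fitted < rss_other)
```

Logistic regression, which the text recommends for dichotomous outcomes, has no closed-form solution of this kind and is fitted iteratively instead.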
Because each method uses a very different methodology to determine which variables are significant, the right model must be chosen for the available data. CHAID is less flexible than CART, and CART provides richer information for defining segments and for positioning, creating, and aligning demographics. Once the strategic marketing plan has been developed, it can be executed by developing tactics. First, the purpose of the program must be decided:
Acquiring new customers
Cross-selling products and services to existing customers
Moving customers up to the next level
Building customer loyalty
Although these programs pursue distinctly different purposes, the same process is used to apply what has been learned from the statistical models. For instance, if the program is meant to acquire new customers, a detailed analysis of the profile of the company's best customers can be used to obtain prospect names for its products and services. If products and services are to be cross-sold to existing customers, the profile of the best customers who already own those products and services can be used. Likewise, to sell customers up to the next level, the profile of the best customers at that next level is used. It is essential that the mining and modeling activity be understood in relation to the integrated marketing process. A real-life scenario gives concrete context to mining and modeling. A company wants to expand its market nationally and to test a value proposition on 50,000 prospects. Mining and modeling are used to narrow the market focus and to identify and size the target segments. In addition, an a priori scoring algorithm is created. This scoring algorithm is translated into list specifications, which provide the company with 50,000 names. Because the company moves swiftly, direct and database marketing are coordinated with sales calls, and the marketing investment is timed and allocated cost-effectively.
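The step from a scoring algorithm to a list of names can be sketched roughly as follows. This is only an illustration under stated assumptions: the weights, the field names, and the four sample prospects are hypothetical, and a real score would come out of the mining and modeling work described above.

```python
def score_prospect(prospect, weights):
    """Weighted sum of a prospect's characteristics (hypothetical linear score)."""
    return sum(weight * prospect.get(field, 0.0) for field, weight in weights.items())


def select_top_prospects(prospects, weights, list_size):
    """Rank prospects by score and keep the best `list_size` of them."""
    ranked = sorted(prospects, key=lambda p: score_prospect(p, weights), reverse=True)
    return ranked[:list_size]


# Hypothetical weights standing in for the a priori scoring algorithm.
WEIGHTS = {"annual_revenue_musd": 0.5, "employees_hundreds": 0.3, "in_target_industry": 2.0}

PROSPECTS = [
    {"name": "A", "annual_revenue_musd": 10, "employees_hundreds": 2, "in_target_industry": 1},
    {"name": "B", "annual_revenue_musd": 4, "employees_hundreds": 1, "in_target_industry": 0},
    {"name": "C", "annual_revenue_musd": 8, "employees_hundreds": 5, "in_target_industry": 1},
    {"name": "D", "annual_revenue_musd": 12, "employees_hundreds": 3, "in_target_industry": 0},
]

# In the scenario above the list size would be 50,000; 2 keeps the example readable.
mailing_list = select_top_prospects(PROSPECTS, WEIGHTS, list_size=2)
```

The same ranking logic applies whether the list holds two names or 50,000; only the `list_size` argument changes.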
Decision Making and Data Mining
During the decision-making process, managers depend on detailed and accurate information. Data warehousing has shown that it can satisfy the need for information about customers and suppliers, as well as the need for information about market trends, competition, and the effectiveness of business procedures. Guesswork is no longer acceptable; decisions must rest on careful data analysis. Business intelligence plays a crucial role in the data mining process. Storing data is very cheap compared with entering it electronically into computers. The financial sector is already reaping benefits from the information that data warehousing generates. Data warehousing was developed to collect the information held in existing databases in order to reveal new facts, so that the organization makes better use of the data it already holds. One feature of data warehousing is that no new databases are created and the original databases are not edited in any way. Instead, information from the existing databases is copied into the data warehousing system, where it is processed and turned into charts and graphs that companies use to generate new facts.
An online general ledger system contains detailed data for the current year only. A financial data warehouse, which holds both current and historical data, can provide additional support for strategic decision-making.
In a survey carried out by MicroStrategy, a specialist in the technology used to explore, analyze, and distribute information held in central data warehouses, 10 percent of the participating financial companies were found to be already gaining commercially from the information in their data warehouses. This turns the data warehouse from a cost centre into a profit centre and provides information for developing new channels to market, such as electronic commerce.
A survey indicated that one-third of financial services companies in the United Kingdom had reached the production stage of data warehousing. As a result, a large number of users were carrying out complex analyses of the information to achieve specific business goals.
Demand for analytical applications focused on specific market or business needs is increasingly driving the market for data-warehousing solutions. Over fifty percent of the companies that took part in the survey had set up data warehousing projects to carry out analysis specific to their own business. In the financial services industry, these applications included churn analysis, one-to-one marketing, risk analysis, and several others.
In the retail industry, data warehousing is widely used to maximize the information gained from loyalty cards. Online analytical processing, also referred to as OLAP, is a method of accessing the information stored in a data warehouse. It involves using queries to investigate hypothesized relationships. For instance, management might begin with a query that breaks down sales by region and then follow it with additional queries, for example grouping sales by customer and by quarter. Companies can use such drill-down capabilities to develop profit and loss statements for individual customers and can subsequently use this information to renegotiate contracts.
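The drill-down sequence described above, a regional breakdown followed by a grouping by customer and quarter, can be sketched in a few lines. This is an in-memory illustration only, not how an OLAP server works internally; the `roll_up` helper and the sample rows are hypothetical.

```python
from collections import defaultdict


def roll_up(rows, dimensions, measure="sales"):
    """Total a measure over every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Hypothetical sales facts, one row per customer per quarter.
SALES = [
    {"region": "Midwest", "customer": "Acme", "quarter": "Q1", "sales": 120.0},
    {"region": "Midwest", "customer": "Acme", "quarter": "Q2", "sales": 80.0},
    {"region": "Midwest", "customer": "Bolt", "quarter": "Q1", "sales": 50.0},
    {"region": "South", "customer": "Crane", "quarter": "Q1", "sales": 200.0},
]

# First query: break sales down by region.
by_region = roll_up(SALES, ["region"])

# Follow-up query: drill into one region and group by customer and quarter.
midwest_rows = [row for row in SALES if row["region"] == "Midwest"]
by_customer_quarter = roll_up(midwest_rows, ["customer", "quarter"])
```

Each follow-up query narrows the rows or adds a dimension, which is the essence of drilling down from a regional total to an individual customer.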
It is clear that the use of data warehouses is growing and improving significantly, and the increasing interest in them is a major contributor to this trend. Among the most important objectives of a data warehouse are gaining competitive advantage and developing a detailed understanding of customers and their behavior. These objectives inherently require data at the level of individual customer transactions.
Data mining provides access to the stored information. It uses sophisticated statistical analysis to discover relationships in the data. Credit card companies use data mining to identify patterns of use that indicate fraud. Data mining techniques can also identify previously unknown relationships in sales data, which companies can use as the basis for future promotions. Data mining uses two sets of tools: the first set discovers patterns and trends in the data, and the second set verifies them. The discovery tools include data visualization, neural networks, cluster analysis, and factor analysis. The verification tools include familiar statistical techniques: regression analysis, forecasts, correlations, and t-tests. Price Waterhouse found that most warehouses were built to generate increased revenue through customer segmentation and better marketing. For many companies, the advent of call centers was an especially pressing requirement. The financial services sector is especially affected because of the idea of customer account "farming", which most companies in the market are using. Major retailers, likewise, are planning and implementing loyalty card schemes to make effective use of the data they collect.
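As an illustration of the verification tools mentioned above, the sketch below computes a two-sample t statistic to check whether a pattern suggested by the discovery tools, higher sales in promoted weeks, holds up. It assumes Welch's unequal-variance form of the test, and the weekly sales figures are hypothetical. Turning the statistic into a p-value requires the t distribution, which is left to a statistics table or library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(sample_a, sample_b):
    """Two-sample t statistic that does not assume equal variances (Welch's form)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical weekly sales with and without a promotion.
promoted_weeks = [120, 135, 128, 140, 132]
ordinary_weeks = [110, 115, 108, 120, 112]

t_value = welch_t_statistic(promoted_weeks, ordinary_weeks)
```

A large positive `t_value` suggests that the difference in average sales is unlikely to be due to chance alone, which supports using the pattern as a basis for future promotions.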
However, not all applications focus on sales. Data warehouses are ideal environments for measuring performance. Large companies have realized that relying on short-term financial data as an indicator of performance is not adequate; in today's business world it can be downright dangerous. The data warehouse is a business tool, and it must deliver measurable financial benefits to investors. Vague notions expressed in IT terms can never justify the investment in effective data management. Virtually all successful data warehouse projects were initiated with specific business goals. When setting up a data warehouse, the commissioning company must provide a detailed picture of how the warehouse will be implemented and must also consider its future prospects. It is equally important that the data warehouse supplier explain the process to the senior management of the commissioning company. For instance, the supplier must indicate that transferring data from the operational systems into the new database will account for a large part of the overall costs. The supplier should also point out that short-circuiting the evaluation process can lead to costly mistakes in the project.
Conclusion
Many community bankers may believe that they already know enough about their customers, and that they already have effective strategies for marketing new products. The historic profitability of many community banks over the past decade offers some support for this view. Yet data mining can be useful for every business. It can serve as a safety tool for maintaining employee records and goods records, and, overall, it can help businesses grow quickly. Data mining centers on a better understanding of customers' behavior. The most critical need is to capture and retain details about each customer's transaction. Most retail establishments today capture transaction details at the point-of-sale, and then promptly break the details apart at the end of the day into summaries of item movements by store. This data is then sent to the central site, where it is aggregated with item movement summaries from other stores to develop a composite view of activities. This data is used for traditional category management decisions. What gets lost through this process is the detailed data that contains the information about the customer's behavior. It is imperative that the transaction detail data be kept together and aggregated across stores for data mining and analysis.
This transaction data is first combined with descriptive data about the items and with causal data, such as promotion information about the item, store location information, and coupon data. The transaction data, as it is collected for all stores and all products over all time periods, is stored using data warehouse techniques and represents the universe of data available to be "mined". For effective data mining and analysis, a subset, or data mart, needs to be defined based on the particular business application and location, such as the promotion effectiveness for the back-to-school season time period (the six weeks from August to mid-September) and the advertising area (the 15 stores surrounding Chicago). This data mart, defined by the dimensions of the business problem, is now ready for meaningful analysis.
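Defining a data mart as a subset of the warehouse can be sketched as a simple filter on the time period and the store dimension. The dates, store identifiers, and record layout below are hypothetical and chosen only to mirror the back-to-school example; a real data mart would be defined in the warehouse itself rather than in application code.

```python
from datetime import date


def build_data_mart(transactions, start, end, store_ids):
    """Keep only transactions inside the time window and the chosen stores."""
    return [
        t for t in transactions
        if start <= t["date"] <= end and t["store_id"] in store_ids
    ]


# Hypothetical six-week back-to-school window and advertising-area stores.
SEASON_START = date(2023, 8, 1)
SEASON_END = date(2023, 9, 11)
CHICAGO_AREA_STORES = {"CHI-01", "CHI-02"}

TRANSACTIONS = [
    {"date": date(2023, 8, 15), "store_id": "CHI-01", "item": "notebook", "amount": 3.50},
    {"date": date(2023, 7, 20), "store_id": "CHI-01", "item": "notebook", "amount": 3.50},
    {"date": date(2023, 8, 20), "store_id": "NYC-07", "item": "backpack", "amount": 24.00},
    {"date": date(2023, 9, 5), "store_id": "CHI-02", "item": "backpack", "amount": 22.00},
]

data_mart = build_data_mart(TRANSACTIONS, SEASON_START, SEASON_END, CHICAGO_AREA_STORES)
```

Only the transactions that fall inside both the season and the advertising area survive the filter, so the later analysis runs on data that matches the business question.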
Data analysis has evolved rapidly into data mining. This evolution directly supports key retail initiatives in micromarketing and micro merchandising. Retailers are exploring this new technology because of the opportunities it offers to gain competitive advantage and to offer the goods that consumers want.