Data Mining Concepts in SQL Server 2005
Introduction
Data mining is described as "the process of extracting valid, authentic, and actionable information from large databases." In
other words, data mining derives patterns and trends that exist in data. These patterns and trends can be collected together
and defined as a mining model.
The Data Mining Process
An important concept is that building a mining model is part of a larger process that includes everything from defining the
basic problem that the model will solve, to deploying the model into a working environment. This process can be defined by
using the following six basic steps:
1) Defining the Problem
2) Preparing Data
3) Exploring Data
4) Building Models
5) Exploring and Validating Models
6) Deploying and Updating Models
It is important to understand that creating a data mining model is a process, and that each step in the process may be
repeated as many times as needed to create a good model.
SQL Server 2005 provides an integrated environment for creating and working with data mining models, called Business
Intelligence Development Studio. The environment includes data mining algorithms and tools that make it easy to build a
comprehensive solution for a variety of projects.
The First Five Steps
1) Defining the Problem
This step includes analyzing business requirements, defining the scope of the problem, defining the metrics by which the
model will be evaluated, and defining the final objective for the data mining project.
2) Preparing Data
The second step in the data mining process is to consolidate and clean the data that was identified in the Defining the
Problem step. Microsoft SQL Server 2005 Integration Services (SSIS) contains all the tools that you need to complete this
step, including transforms to automate data cleaning and consolidation.
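To make "cleaning and consolidation" concrete, here is a minimal sketch in plain Python. It is not SSIS code. The function name clean_and_consolidate and the sample records are hypothetical, and a real project would usually need rules specific to its own data.

```python
def clean_and_consolidate(*sources):
    """Merge several record lists, trim text, drop incomplete rows and duplicates."""
    seen = set()
    cleaned = []
    for source in sources:
        for record in source:
            # Normalize text fields by stripping surrounding whitespace.
            row = {k: v.strip() if isinstance(v, str) else v
                   for k, v in record.items()}
            # Skip rows that have a missing value (None or empty string).
            if any(v is None or v == "" for v in row.values()):
                continue
            # Skip rows already seen in this or an earlier source.
            key = tuple(sorted(row.items()))
            if key in seen:
                continue
            seen.add(key)
            cleaned.append(row)
    return cleaned


# Hypothetical records from two systems that describe the same customers.
crm = [{"customer": " Ann ", "age": 34}, {"customer": "Bob", "age": None}]
web = [{"customer": "Ann", "age": 34}, {"customer": "Cy", "age": 51}]
prepared = clean_and_consolidate(crm, web)
```

Here prepared holds one row for Ann and one for Cy: Bob is dropped because his age is missing, and the second Ann row is dropped as a duplicate.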
3) Exploring Data
The third step in the data mining process is to explore the prepared data. You must understand the data in order to make
appropriate decisions when you create the models. Exploration techniques include calculating the minimum and maximum values,
calculating the mean and standard deviation, and looking at the distribution of the data.
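The exploration techniques listed above can be sketched with Python's standard library. The function name explore, the bin_width parameter, and the sample ages are assumptions made for illustration. The distribution is approximated by counting values per fixed-width bin.

```python
import statistics
from collections import Counter


def explore(values, bin_width):
    """Return basic summary statistics and a binned distribution for a numeric column."""
    # Map each value to the lower edge of its bin, then count values per bin.
    bins = Counter((v // bin_width) * bin_width for v in values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "distribution": dict(sorted(bins.items())),
    }


# Hypothetical customer ages.
ages = [23, 35, 31, 47, 52, 38, 29, 41]
summary = explore(ages, bin_width=10)
```

For these ages the minimum is 23, the maximum is 52, the mean is 37, and the binned distribution is {20: 2, 30: 3, 40: 2, 50: 1}.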
4) Building Models
The fourth step in the data mining process is to build the mining models. Before you build a model, you must randomly split
the prepared data into separate training and testing datasets. You use the training dataset to build the model, and the
testing dataset to test the accuracy of the model by creating prediction queries. You can use the Percentage Sampling
Transformation in Integration Services to split the dataset.
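The idea behind the random split can be shown in a few lines of plain Python. This is only an analogy for the Percentage Sampling Transformation, not its API. The function name split_dataset, the 30 percent test fraction, and the seed are all assumptions.

```python
import random


def split_dataset(rows, test_fraction=0.3, seed=42):
    """Randomly split rows into (training, testing) lists."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(rows)          # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    # The first test_size shuffled rows become the testing set, the rest the training set.
    return shuffled[test_size:], shuffled[:test_size]


training, testing = split_dataset(range(100))
```

Every row lands in exactly one of the two sets, so the model is never tested on rows it was trained on.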
5) Exploring and Validating Models
The fifth step in the data mining process is to explore the models that you have built and test their effectiveness.
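One simple way to test effectiveness is to measure how often a model's predictions match the known outcomes in the testing dataset. The sketch below assumes a model is any function that maps a row to a prediction. The names accuracy and threshold_model, the rule itself, and the test rows are hypothetical.

```python
def accuracy(model, test_rows, target):
    """Return the fraction of test rows whose prediction matches the known target value."""
    if not test_rows:
        raise ValueError("test_rows must not be empty")
    correct = sum(1 for row in test_rows if model(row) == row[target])
    return correct / len(test_rows)


def threshold_model(row):
    # Hypothetical stand-in for a trained model: predict a buyer at age 40 or older.
    return row["age"] >= 40


test_rows = [
    {"age": 25, "buyer": False},
    {"age": 45, "buyer": True},
    {"age": 50, "buyer": False},
    {"age": 30, "buyer": False},
]
score = accuracy(threshold_model, test_rows, target="buyer")
```

Three of the four predictions match, so score is 0.75. Comparing this figure across candidate models is one way to decide which model to keep.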