DATA MINING TECHNIQUES
SUBMITTED BY:
ROLL NO. 1167714
M.TECH (I.T), 2 SEM

WEB MINING:
Web mining is the application of data mining techniques to discover patterns from the Web.
Web mining is the extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web.

WEB MINING CAN BE OF THREE TYPES:
WEB USAGE MINING.
WEB STRUCTURE MINING.
WEB CONTENT MINING.

WEB USAGE MINING:
Web usage mining is the process of extracting useful information from server logs, i.e. the users' browsing history.
Web usage mining is the process of finding out what users are looking for on the Internet.
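As a minimal sketch of mining server logs, the Python snippet below counts how often each page was successfully requested. The log lines are made-up sample data, and the Common Log Format layout, the `REQUEST_RE` pattern and the `page_counts` helper are assumptions chosen for illustration, not part of the original slides.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format (made-up data).
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:10:00:05 +0000] "GET /video/intro.mp4 HTTP/1.1" 200 90412',
    '10.0.0.1 - - [01/Jan/2024:10:01:10 +0000] "GET /index.html HTTP/1.1" 200 512',
    'malformed line',
]

# host, identity, user, [timestamp], "METHOD path protocol", status, size
REQUEST_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+'
)


def page_counts(lines):
    """Count successful GET requests per requested path."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        _host, method, path, status = match.groups()
        if method == "GET" and status.startswith("2"):
            counts[path] += 1
    return counts


counts = page_counts(SAMPLE_LOG)
print(counts.most_common())
```

The resulting counts give a first picture of what users are looking for. A real study would typically go further, for example by grouping requests into sessions per user.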
Some users might be looking at only textual data, whereas others might be interested in multimedia data.

WEB STRUCTURE MINING:
Web structure mining is the process of using graph theory to analyze the node and connection structure of a web site.
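The graph view of a web site can be sketched in a few lines of Python. Here pages are nodes and hyperlinks are directed edges; the `LINKS` data and the helper names `in_degrees` and `click_depths` are hypothetical and only illustrate two simple graph measures (inbound link counts and click distance from a start page).

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/home": ["/products", "/about"],
    "/products": ["/home", "/contact"],
    "/about": ["/home", "/contact"],
    "/contact": ["/home"],
}


def in_degrees(links):
    """Number of inbound hyperlinks for every page in the graph."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def click_depths(links, start):
    """Breadth-first search: fewest clicks needed to reach each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


inbound = in_degrees(LINKS)
depths = click_depths(LINKS, "/home")
print(inbound)
print(depths)
```

Pages with many inbound links or a small click depth are usually the structurally central ones; more elaborate link-analysis measures build on the same graph representation.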
Web structure mining can be divided into two kinds:
1) Extracting patterns from hyperlinks in the web.
2) Mining the document structure: analysis of the tree-like structure of pages to describe HTML or XML tag usage.

WEB CONTENT MINING:
Web content mining is the extraction, mining and integration of useful data, information and knowledge from Web page contents.

DIAGRAM OF WEB MINING:
[Diagram not included.]

SPATIAL DATA:
Spatial data, also known as geospatial data or geographic information, is data that identifies the geographic location of features and boundaries on Earth, such as natural or constructed features, oceans, and more.
Spatial data is usually stored as coordinates and topology, and is data that can be mapped. Spatial data is often accessed, manipulated or analyzed through Geographic Information Systems (GIS).

SPATIAL DATA MINING:
Spatial data mining is the application of data mining methods to spatial data.
Spatial data mining performs the same functions as data mining in general, with the end objective of finding patterns in geography.
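One very simple way to look for a geographic pattern is to bin coordinates into a regular grid and report the cells that contain many points. The Python sketch below does this; the sample points, the cell size, the threshold and the `dense_cells` helper are all assumptions made for illustration, and grid binning is only one of many possible techniques.

```python
import math
from collections import Counter

# Hypothetical (longitude, latitude) points, e.g. locations of reported cases.
POINTS = [
    (10.12, 50.11), (10.15, 50.13), (10.18, 50.19),
    (10.91, 50.75), (11.40, 51.20),
]


def dense_cells(points, cell_size, min_count):
    """Bin points into square grid cells and keep cells with many points."""
    cells = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in points
    )
    return {cell: n for cell, n in cells.items() if n >= min_count}


# Cells are identified by their integer (column, row) grid index.
hotspots = dense_cells(POINTS, cell_size=0.25, min_count=3)
print(hotspots)
```

A fixed grid is easy to compute but sensitive to where the cell borders fall, so density-based clustering methods are often preferred in practice.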
It is used in offices requiring analysis of geo-referenced statistical data.
It is used in public health services searching for explanations of disease clusters.
It is used by geo-marketing companies doing customer segmentation based on spatial location.

TEMPORAL DATA MINING:
Temporal data mining aims to discover hidden relations between sequences of events.
It basically involves two notions of time: valid time and transaction time.
Valid time denotes the time during which a fact is true with respect to the real world.
Transaction time is the time during which a fact is stored in the database.
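The difference between the two time dimensions can be made concrete with a small Python sketch. The `Fact` record, its field names and the sample rows are hypothetical; the sketch only assumes that each fact carries one interval for valid time and one for transaction time, with `None` meaning "still open".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    value: str
    valid_from: date               # valid time: true in the real world from...
    valid_to: Optional[date]       # ...until (None = still true)
    recorded_from: date            # transaction time: stored in the database from...
    recorded_to: Optional[date]    # ...until (None = still the current row)


def covers(start, end, day):
    """True if day lies in the half-open interval [start, end)."""
    return start <= day and (end is None or day < end)


def as_of(facts, valid_on, known_on):
    """What the database, as it stood on known_on, said was true on valid_on."""
    return [
        f.value for f in facts
        if covers(f.valid_from, f.valid_to, valid_on)
        and covers(f.recorded_from, f.recorded_to, known_on)
    ]


# A machine moved from Plant A to Plant B on 1 June, but the move was
# only entered into the database on 15 July (made-up data).
FACTS = [
    Fact("Plant A", date(2020, 1, 1), None, date(2020, 2, 1), date(2022, 7, 15)),
    Fact("Plant A", date(2020, 1, 1), date(2022, 6, 1), date(2022, 7, 15), None),
    Fact("Plant B", date(2022, 6, 1), None, date(2022, 7, 15), None),
]

before_fix = as_of(FACTS, valid_on=date(2022, 7, 1), known_on=date(2022, 7, 1))
after_fix = as_of(FACTS, valid_on=date(2022, 7, 1), known_on=date(2022, 8, 1))
print(before_fix, after_fix)
```

Both queries ask about the same real-world day, yet they give different answers because the database learned about the move only later. Keeping both intervals is what makes such questions answerable.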
Example domains are scientific, medical, financial, factory machinery performance, weather, and stock market data.

VISUAL WEB DATA MINING:
Visual web data mining is the application of information visualization techniques to the results of web mining, in order to further amplify the perception of extracted patterns and to visually explore new ones in the web domain.
Information visualization enables the viewer to gain knowledge about the internal structure of the data and the relationships in it.
Visualization is used in order to understand:
- The structure of a particular website.
- Web surfers’ behavior when visiting that website.

CLIENT SERVER COMPUTING:
Client–server computing is a distributed computing model in which client applications request services from server processes.
A client application is a process or program that sends messages to a server via the network.
The server process or program listens for client requests that are transmitted via the network. Servers receive those requests and perform actions such as database queries and reading files.
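The request and response cycle described above can be sketched with Python's standard `socket` module. Everything here is illustrative: the `ACCOUNTS` dictionary stands in for a database, the one-line text protocol is made up, and the server runs in a background thread on the local machine only so that the example is self-contained.

```python
import socket
import threading

# Hypothetical in-memory "database" held by the server.
ACCOUNTS = {"alice": 120, "bob": 75}


def serve_once(server_sock):
    """Server side: accept one client, look up the account, send the reply."""
    conn, _addr = server_sock.accept()
    with conn:
        # A single recv is enough for the short messages in this sketch.
        name = conn.recv(1024).decode("utf-8").strip()
        balance = ACCOUNTS.get(name)
        reply = "unknown account" if balance is None else str(balance)
        conn.sendall(reply.encode("utf-8"))


def query_balance(host, port, name):
    """Client side: send an account name and return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(name.encode("utf-8"))
        return client.recv(1024).decode("utf-8")


# The server listens on a free local port chosen by the operating system.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()

worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
worker.start()

answer = query_balance(host, port, "alice")
worker.join(timeout=5)
server.close()
print(answer)
```

In a real deployment the client and server would run as separate programs, usually on different machines, and the server would loop to handle many clients.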
An example of a client–server system is a banking application that allows a clerk to access account information on a central database server.

DISTRIBUTED PROCESSING:
Distributed processing refers to a variety of computer systems that use more than one computer (or processor) to run an application.
The term also refers to local-area networks (LANs) designed so that a single program can run simultaneously at various sites.
Distributed processing involves distributed databases, in which the data is stored across two or more computer systems.

DIAGRAM OF DISTRIBUTED PROCESSING:
[Diagram not included.]

PARALLEL PROCESSING:
Parallel processing is the ability to carry out multiple operations or tasks simultaneously.
It is the simultaneous use of more than one CPU or processor core to execute a program or multiple computational threads.
It makes programs run faster because there are more engines (CPUs or cores) running them.
CPUs or cores can execute different portions of a program without interfering with each other.
Parallel processing allows solving a problem in less time, or solving a larger problem in the same time.
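The idea of splitting a problem into portions that run side by side can be sketched with Python's `concurrent.futures` module. The helper names (`chunked`, `sum_of_squares`, `parallel_sum_of_squares`) and the chosen task are assumptions for illustration. Note that threads in CPython share one interpreter lock, so for CPU-heavy work one would typically pass `ProcessPoolExecutor` as `executor_cls` to use several cores; the thread pool is the default here only to keep the sketch easy to run anywhere.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, parts):
    """Split items into at most `parts` consecutive chunks of similar size."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The work done independently on each portion of the data."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(numbers, workers=4, executor_cls=ThreadPoolExecutor):
    """Run sum_of_squares on each chunk in a worker pool and combine results."""
    chunks = chunked(numbers, workers)
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))


numbers = list(range(1, 1001))
total = parallel_sum_of_squares(numbers)
print(total)
```

Each chunk is processed without touching the others, which is what lets the workers proceed without interfering with each other. Only the final `sum` combines their partial results.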