SPS Search MSDN evening 030604

Presentation Transcript

MSDN Event 03/06/2004: 

MSDN Event 03/06/2004 Welcome

SharePoint Search: 

Inge De Neef
SharePoint Consultant, Getronics Belgium
Inge.DeNeef@getronics.com

Agenda: 

Definition
Comparing Search Technologies
Search Extensibility
  Modifying the Built-in Default User Interface
  Thesaurus
  Using SharePoint Portal Server Search from other applications
  Extending SharePoint Portal Server Search to Index Other Content
Summary

Search - Definition: 

A single location from which to search multiple information sources simultaneously.
Microsoft Search Service ships with a number of default content sources:
  File shares
  Websites (both http and https)
  Exchange folders
  Active Directory
  Lotus Notes databases
  Other SharePoint sites and portals

Search - Definition: 

Keyword searches cover the full text of a document and the document's properties (metadata).
Best Bet classification marks documents selected as the best recommendation for a category or a specific keyword.

Search Technology Comparison: 

Search Technology Comparison

Search Extensibility: 

Modifying the Built-in Default User Interface
  Search Web Part Page: Search Box, Search Menu, Advanced Search, Search Results Page
Thesaurus
Using SharePoint Portal Server Search from Other Applications
  SharePoint Portal Server Query Web service (http://<portal>/_vti_bin/search.asmx)
  Research Library Task Pane
Extending SharePoint Portal Server Search to Index Other Content
  IFilter, IProtocolHandler, IWordBreaker, IStemmer

Slide8: 

Modifying the Built-in User Interface

Search Web Part Page: 

Search Web Part Page Ghosted page

SharePoint Portal Server’s Search Web Part Page: 

SharePoint Portal Server’s Search Web Part Page

Working Together: 

Search Menu
Search Box
Advanced Search
Search Result
JavaScript (in Search.js and in the HTML)
Hidden fields

Search Box: 

Search Box

SearchBox: 

Used to enter the search keyword
Present on every SharePoint page
Interesting property: SearchResultPageURL (by default opens Search.aspx)
Property ContextSensitiveScopeType determines the default search scope (All Sources or This Topic)

<SPSWC:RightBodySectionSearchBox runat="server" ContextSensitiveScopeType="1" SearchResultPageURL="SEARCH_HOME" FrameType="None"/>

Search Menu: 

Search Menu

Search Menu: 

JavaScript functions:
  Onshd
  OnToggleAllGroups
  OnPinSearch
  OnSubscribeSearch
  TooleMgmtAdv

Advanced Search : 

Advanced Search

Adding custom metadata to the Advanced Search: 

Adding custom metadata to the Advanced Search

Creating a New Advanced Search Web Part: 

Can be created by inheriting from Microsoft.SharePoint.Portal.WebControls.AdvancedSearchControl
Your own controls can be added
Only the CreateChildControls and RenderWebPart methods need to change

Search Results: 

Search Results

Search Results Web Part: 

3 ways of customizing:
  Property Sheet (basic customization)
  DWP file
  FrontPage 2003
Alternatively, create your own Search Results page

Search Results Web Part: 

Customize through the Property Sheet:
  The number of items returned
  The display text for the “no results” condition
  Widths of columns

Search Results Web Part: 

Customize through the DWP file:
  Open the search page in edit mode: http://localhost/Search.aspx?mode=Edit&PageView=Shared
  Export the Search Results Web Part

Search Results Web Part: 

Customize using FrontPage 2003:
  The search page is displayed as XML
  Properties can be changed by modifying their values

Search Results Web Part – Properties: 

ResultListID
FixLayout
GroupByList
DefaultGroupBy
SortByList
DefaultSortBy
ColumnURIs
ColumnWidths
TextForNoResults
RowNumberForEachItem
EnableQueryLoggingSearch
SupportExpandCollapseAll
EnableSQLCommandLogging
ColumnDisplayNames
OpenNewWindowForMatchingItems
ShowRankForEachItem
MaxMatchingItemsNumber

Search Results Web Part – Interesting properties: 

DefaultSortBy
  Default value: "urn:schemas-microsoft-com:fulltextqueryinfo:rank DESC"
  The most relevant result is shown first (ranking is based on the Okapi algorithm)
MaxMatchingItemsNumber
  Maximum number of search results shown

Search Results Web Part – Interesting properties: 

OpenNewWindowForMatchingItems
  Opens search results in a new window
  Set to false by default
  The search page contains no XML tag for it by default, so it must be added:

<OpenNewWindowForMatchingItems xmlns="urn:schemas-microsoft-com:sharepoint:DataResultBase">true</OpenNewWindowForMatchingItems>
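
The tag above can be added to an exported Web Part file by hand or with a short script. The following is a minimal sketch in Python, not part of the original deck. It assumes the export is an XML document whose root is a WebPart element in the http://schemas.microsoft.com/WebPart/v2 namespace; SAMPLE_DWP is a trimmed, hypothetical stand-in for a real export, and enable_new_window is an illustrative name.

```python
import xml.etree.ElementTree as ET

WEBPART_NS = "http://schemas.microsoft.com/WebPart/v2"  # assumed root namespace of the export
RESULT_NS = "urn:schemas-microsoft-com:sharepoint:DataResultBase"  # namespace from the slide

# Keep the Web Part namespace as the default namespace when serializing.
ET.register_namespace("", WEBPART_NS)

# Hypothetical, trimmed stand-in for an exported Search Results .dwp file.
SAMPLE_DWP = f"""<WebPart xmlns="{WEBPART_NS}">
  <Title>Search Results</Title>
</WebPart>"""


def enable_new_window(dwp_xml: str) -> str:
    """Return the DWP XML with OpenNewWindowForMatchingItems set to true."""
    root = ET.fromstring(dwp_xml)
    tag = f"{{{RESULT_NS}}}OpenNewWindowForMatchingItems"
    element = root.find(tag)
    if element is None:
        # The tag is absent by default, so it has to be added.
        element = ET.SubElement(root, tag)
    element.text = "true"
    return ET.tostring(root, encoding="unicode")
```

Calling enable_new_window(SAMPLE_DWP) returns the same document with the extra element appended; running it twice does not add a second copy.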

Creating your own Search Results Page: 

Can be created by inheriting from Microsoft.SharePoint.Portal.WebControls.SearchResults
The number of methods to override depends on the complexity of the search.

Your own Search Results Page: method examples: 

GenerateQueryString
  Parameters:
    QueryTemplateSelectPart
    QueryTemplateFromPart
    QueryTemplateWherePart
    QueryTemplateOrderByPart
    (out) strSavedQuery
IssueQuery
GenerateHtmlOneRowForOneItem

Example: How to Add Support for Wildcard Searches: 

Example: How to Add Support for Wildcard Searches
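
The transcript does not preserve the demo for this slide. One possible approach, sketched here in Python purely as an illustration, is to rewrite the WHERE part of the query so that keywords ending in an asterisk become full-text prefix terms of the form "term*" inside CONTAINS. The function name wildcard_where_clause is hypothetical, and where this string gets plugged in (for example an overridden GenerateQueryString) is an assumption, not something the slides state.

```python
def wildcard_where_clause(keywords: str) -> str:
    """Build a CONTAINS predicate, turning trailing-* keywords into prefix terms."""
    terms = []
    for word in keywords.split():
        # Double single quotes and drop double quotes so the SQL literal stays well-formed.
        word = word.replace("'", "''").replace('"', "")
        if not word.strip("*"):
            continue  # skip tokens that consist only of asterisks
        if word.endswith("*"):
            stem = word.rstrip("*")
            terms.append(f'"{stem}*"')  # prefix term: any word starting with the stem
        else:
            terms.append(f'"{word}"')   # exact term
    if not terms:
        return ""
    return "CONTAINS('" + " AND ".join(terms) + "')"
```

With the input share* portal this produces a predicate with one prefix term and one exact term joined by AND.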

Slide30: 

Thesaurus and Noise Words

Slide31: 

Thesaurus and noise words

Slide32: 

Thesaurus
Allows you to search for:
  The search term
  Synonyms
  Translations
  Chemical formulas
  … other matching words
SharePoint uses different thesaurus files for different languages

Slide33: 

Thesaurus
Expansion tags, e.g. holiday, verlof, congé
Replacement tags, e.g. MS is replaced by Microsoft
Stemming, e.g. talk**, speak**
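
The difference between expansion and replacement can be illustrated with a toy model. This Python sketch is not how the search service implements its thesaurus; it only assumes that an expansion group is symmetric (any member matches all members) and that a replacement is one-way (the original term is dropped). EXPANSIONS, REPLACEMENTS and expand_term are illustrative names.

```python
# Toy thesaurus using the example terms from the slide.
EXPANSIONS = [{"holiday", "verlof", "congé"}]  # symmetric synonym group
REPLACEMENTS = {"ms": "microsoft"}             # one-way replacement


def expand_term(term: str) -> set[str]:
    """Return the set of terms a query for `term` would match in this toy model."""
    key = term.lower()
    if key in REPLACEMENTS:
        # Replacement: only the substitute is searched, not the original term.
        return {REPLACEMENTS[key]}
    for group in EXPANSIONS:
        if key in group:
            # Expansion: the term plus every other member of its group.
            return set(group)
    return {key}
```

In this model a query for verlof also matches holiday and congé, while a query for MS matches only microsoft.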

Slide34: 

Thesaurus

Slide35: 

Thesaurus – Noise words
Noise words are words that are ignored in search queries, such as “the”, “or”, “if”, numbers, …
Stored in: C:\Program Files\SharePoint Portal Server\DATA\Config
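
The effect of a noise word list on a query can be sketched in a few lines of Python. This is an illustration only: NOISE_WORDS holds just the sample words from the slide (the real lists are language-specific files), and strip_noise is a hypothetical helper, not a SharePoint API.

```python
NOISE_WORDS = {"the", "or", "if"}  # sample entries from the slide only


def strip_noise(query: str) -> list[str]:
    """Return the query terms that remain after noise words and numbers are dropped."""
    kept = []
    for word in query.lower().split():
        if word in NOISE_WORDS or word.isdigit():
            continue  # ignored: contributes nothing to the search
        kept.append(word)
    return kept
```

A query that consists only of noise words ends up empty, which is worth keeping in mind when users report that a search returns nothing.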

Slide36: 

Noise Words

Summary: 

Search is powerful
Search is meant to be used by many clients
Search is extensible with many components
Search is customizable with many options
Get out there and build on it!

Slide38: 

SharePoint Portal Server Query Web Service

Query Web Service: 

Query
  Accepts Query XML, defined by the urn:Microsoft.Search.Query namespace
  Returns Response XML, defined by the urn:Microsoft.Search.Response namespace
QueryEx
  Accepts Query XML, defined by the urn:Microsoft.Search.Query namespace
  Returns the search results for the specified query as a DataSet

SharePoint Portal Server Query Web Service: 

Registration
  Defined by the urn:Microsoft.Search.Registration namespaces
  Returns the name of a portal site
SPSGetPortalSearchInfo
  Returns a list of search and catalog scopes
Status
  Returns a success code to indicate that the search provider is available

SharePoint Portal Server Query Web Service: 

Add a Web Reference: the service is found at http://<portal>/_vti_bin/search.asmx
Authenticate
Formulate and send a query
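
Formulating the query means building a Query XML packet in the urn:Microsoft.Search.Query namespace. The Python sketch below builds a trimmed keyword packet and is meant as a starting point only. The element and attribute names used (QueryPacket, Revision, Query, Context, QueryText, Range, StartAt, Count) are my reading of that schema and should be checked against the SDK; the real schema has further elements that are left out here, and the SOAP call itself is not shown.

```python
import xml.etree.ElementTree as ET

QUERY_NS = "urn:Microsoft.Search.Query"  # namespace named on the slides


def build_query_packet(keyword: str, start_at: int = 1, count: int = 10) -> str:
    """Build a trimmed keyword (type="STRING") query packet as an XML string."""
    ET.register_namespace("", QUERY_NS)  # serialize with a default namespace
    packet = ET.Element(f"{{{QUERY_NS}}}QueryPacket", {"Revision": "1000"})
    query = ET.SubElement(packet, f"{{{QUERY_NS}}}Query")
    context = ET.SubElement(query, f"{{{QUERY_NS}}}Context")
    text = ET.SubElement(
        context, f"{{{QUERY_NS}}}QueryText", {"language": "en-US", "type": "STRING"}
    )
    text.text = keyword  # ElementTree escapes &, < and > when serializing
    result_range = ET.SubElement(query, f"{{{QUERY_NS}}}Range")
    ET.SubElement(result_range, f"{{{QUERY_NS}}}StartAt").text = str(start_at)
    ET.SubElement(result_range, f"{{{QUERY_NS}}}Count").text = str(count)
    return ET.tostring(packet, encoding="unicode")
```

The returned string is what would be passed as the query XML argument of the Query or QueryEx method once a web reference and credentials are in place.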

SharePoint Portal Server Query Web Service: 

Syntax help
  Microsoft SharePoint Portal Server 2001 SDK
  Manage Properties of Indexed Content
  Manage Content Sources
  SPSGetPortalSearchInfo()
  SPSQueryServiceConst class: templates for SELECT, WHERE, CONTAINS
  View Source on search.aspx

SharePoint Portal Server Query Web Service: 

QueryText pointers
  QueryText type='STRING'
    Returns results with some Research Task Pane intelligent formatting
  QueryText type='MSSQLFT'
    Query() returns 2 columns regardless of the query: DAV:DisplayName, DAV:href
    SELECT must contain urn:schemas-microsoft-com:fulltextqueryinfo:sdid
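
For MSSQLFT queries, the requirement that the SELECT list contains the sdid property is easy to forget. The Python sketch below assembles a query string and adds that column when it is missing. It is illustrative only: build_select is a hypothetical helper, the FROM and WHERE parts are passed in untouched because their exact syntax comes from the SDK templates, and the sdid URN is written with hyphens to match the rank URN used for DefaultSortBy.

```python
# Written with hyphens, like the rank property URN shown for DefaultSortBy.
SDID = "urn:schemas-microsoft-com:fulltextqueryinfo:sdid"


def build_select(columns: list[str], from_part: str, where_part: str) -> str:
    """Assemble an MSSQLFT query string, adding the sdid column if it is missing."""
    cols = list(columns)  # copy so the caller's list is not modified
    if SDID not in cols:
        cols.append(SDID)
    select_list = ", ".join(f'"{name}"' for name in cols)
    # from_part and where_part are supplied by the caller, e.g. from the SDK templates.
    return f"SELECT {select_list} FROM {from_part} WHERE {where_part}"
```

The resulting string would be placed in a QueryText element with type='MSSQLFT'.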

Slide44: 

Research Library Task Pane

Research and Reference Task Pane : 

Task Pane in Microsoft Office System applications
Allows the user to search information sources
A platform content providers can build on
It supports rich content and forms
SharePoint Portal Server is compatible!

Research and Reference Task Pane: 

Registration function
Query function
Response XML

Slide47: 

Extending SharePoint Portal Server Search to Index Other Content

Extending SharePoint Portal Server to Index More Content: 

Architecture overview
Tools to be built:
  Protocol Handlers
  Filters
  Word Breakers

Search Characteristics: 

Enterprise scalability
  From ~5 M docs to ~20 M docs
  Cross-catalog querying, load-balanced queries
  Very significant for enterprise search scenarios
Shared Portal Services
Content aggregation
Probabilistic relevance ranking
Notifications/alerts, Topic Assistant
Adaptive crawling
Common search technology across Microsoft product offerings

Protocol Handlers and IFilters: 

SharePoint Portal Server indexing capability is extensible via the development of Protocol Handlers and IFilters
Protocol Handlers extend the indexing capability of SharePoint Portal Server to other content sources
IFilters are generally used for indexing specific types of files
  They are called by the Protocol Handler, and thus can be skipped if the Protocol Handler is willing to do all the work
This is low-level technology; you’ll still need to use COM
  You must write a COM component; your end result will be a .dll
  You can use VC.NET to develop these components; attributed C++ is an advantage, but the code is not managed

Search Structure: 

Search Structure

Protocol Handler General Features: 

Registers with the gatherer
Connects to the external content source
Collects data from the external content source
Binds to content in the external content source and streams it back to the gatherer
Obtains metadata and security information on the external content source and sends it back to the gatherer
Sends LCID info to the gatherer where appropriate

Protocol Handlers Provided by Microsoft: 

Microsoft Search Service ships with a number of Protocol Handlers out of the box:
  file://
  http://
  Exchange
  Active Directory
  Lotus Notes databases
  SharePoint sites and portals

IFilter General Features: 

Extends the types of files which can be indexed
Also COM based; the end result is a .DLL
Extracts internal properties from files as well as body text
IFilters can be used with any Microsoft search vehicle, not just SharePoint Portal Server:
  Microsoft® Windows®
  SQL Server
  Microsoft® Exchange Server

IFilters Provided by Microsoft: 

Microsoft Search Service ships with a number of IFilters out of the box:
  All Office System document formats
  TIFF
  XML
Popular 3rd-party IFilters:
  PDF
  CAD (.dwg)

Word Breaker General Features: 

Decomposition of text into individual text tokens, or words
Extends the locales of data which can be indexed
Also COM based; the end result is a .DLL
Word breakers can also be used with any Microsoft search vehicle, not just SharePoint Portal Server
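
What "decomposition of text into tokens" means can be shown with a toy tokenizer. The Python sketch below is an illustration only: real word breakers are language-specific COM components, whereas this one simply splits on non-word characters and records where each token starts. break_words is an illustrative name.

```python
import re

# One or more word characters; in Python 3 this is Unicode-aware for str patterns.
TOKEN = re.compile(r"\w+")


def break_words(text: str) -> list[tuple[str, int]]:
    """Return (lower-cased token, character offset) pairs for the given text."""
    return [(match.group().lower(), match.start()) for match in TOKEN.finditer(text)]
```

A rule this simple breaks down for languages without spaces between words or with compound words, which is why word breakers are provided per locale.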

Word breakers Provided by Microsoft: 

Many word breakers ship out of the box in the Microsoft Search Service
The interface was recently published in the Microsoft Platform SDK: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/html/ixrefint_9sfm.asp

Steps to Building a Protocol Handler: 

Install the sample and get it running: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/spssdk/html/_creating_a_protocol_handler_sample.asp
Connect to your content source
Iterate through the contents
Extract data and pass it to the Gatherer
Write metadata mapping code
Write security mapping code
Test

IFilters: 

Typically for parsing file formats
Implement them within a protocol handler to expose …
  Contents of directories (pretend the directory is a document and the files’ names form a multi-valued property; see the sample PH in the SDK for an example)
  Properties taken from the document store, not from inside the document itself

The Gatherer Pulls Data: 

You are reactive to the gatherer: you must wait for requests
The gatherer pulls; you can’t push
You must buffer/cache data until asked for it
You must keep more state than you might like
This isn’t rocket science, but it can be a pitfall.
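
The pull model can be sketched independently of COM. In the Python illustration below, the content source hands over items in batches, and a small buffer holds them until the consumer (playing the gatherer's role) asks for the next chunk. PullBuffer and next_chunk are hypothetical names and do not correspond to any gatherer interface.

```python
from collections import deque
from typing import Iterable, Optional


class PullBuffer:
    """Buffers items from a content source until the consumer asks for them."""

    def __init__(self, source: Iterable[bytes], batch_size: int = 2) -> None:
        self._source = iter(source)
        self._batch_size = batch_size
        self._pending = deque()  # state kept between requests

    def _fill(self) -> None:
        # Fetch up to one batch from the source; stop early when it is exhausted.
        for _ in range(self._batch_size):
            try:
                self._pending.append(next(self._source))
            except StopIteration:
                break

    def next_chunk(self) -> Optional[bytes]:
        """Called by the consumer; returns None when there is no more data."""
        if not self._pending:
            self._fill()
        return self._pending.popleft() if self._pending else None
```

The buffer only touches the source when it is asked for data, so the extra state (the pending queue and the source position) lives on the handler side.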

The Gatherer is Multithreaded: 

Address data locking early in the design process
Think about COM apartments and threading models
The gatherer promises some thread affinity; see the SDK for details
Are your libraries thread safe?

Returning Data: 

Return:
  File/folder indicators
  Document body
  Relevant metadata, including
    Standard metadata
    Custom metadata
  LCID (if applicable)
  Location information (URL)
  Security information

Clickable URLs in Search Results: 

Search is no good if you can’t get to the document
The default result URL is the search URL, but IE probably doesn’t understand your protocol (e.g. dctm://)
The DAV:href property controls the URL that search renders; you can overwrite it with a URL that the browser will like
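
The URL rewrite itself is a simple mapping. The Python sketch below shows one way to turn a custom-protocol crawl URL into an http URL for DAV:href. Everything specific in it is an assumption for illustration: WEB_FRONT_END is a hypothetical viewer address, and the path layout after it is invented.

```python
from urllib.parse import quote, urlsplit

WEB_FRONT_END = "http://docs.example.com/view"  # hypothetical browser-reachable viewer


def browser_href(crawl_url: str) -> str:
    """Map a custom-protocol crawl URL to an http URL a browser can open."""
    parts = urlsplit(crawl_url)
    if parts.scheme in ("http", "https"):
        return crawl_url  # already clickable, leave untouched
    # Keep the host-like part and the object path, percent-encoding unsafe characters.
    path = quote(parts.netloc + parts.path)
    return f"{WEB_FRONT_END}/{path}"
```

The value returned here is what the protocol handler would emit as DAV:href instead of the dctm:// address.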

Security Mapping: 

Search results contain file names, locations, and excerpts of the contents
The default allows every user to see every file
Search can “trim” results, but you must provide an NT ACL for each file at crawl time; there is no callback
Mapping users between domains or even OSes can be tricky
Mapping mechanism

Property Mapping: 

Done through the administrative UI
Maps source document properties (e.g., Author, Subject, Description) to specific target properties (e.g., Auteur, Onderwerp, Omschrijving)
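
The mapping is configured in the administrative UI, not in code, but its effect can be pictured as a lookup table. The Python sketch below uses the example names from the slide; PROPERTY_MAP and map_properties are illustrative only and not a SharePoint API.

```python
# Source property name -> target property name (examples from the slide).
PROPERTY_MAP = {
    "Author": "Auteur",
    "Subject": "Onderwerp",
    "Description": "Omschrijving",
}


def map_properties(source_props: dict[str, str]) -> dict[str, str]:
    """Rename mapped properties; unmapped properties keep their original name."""
    return {PROPERTY_MAP.get(name, name): value for name, value in source_props.items()}
```

A document crawled with an Author property would then be searchable through the Auteur property on the portal side.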