Storage Architectures and Options
Alan McSweeney
January 11, 2011

Objectives
To provide high-level information on storage options and architectures for storing and managing digital camera data
To provide indicative sample solutions
To initiate discussions on storage configurations and options

Agenda
Confirmation of Storage Requirements
Data Flows and Processes
Storage Management Architectures and Options
Storage Management Operation, Management and Use
Sample Solutions

Understanding of Requirements
Storage solution to manage raw and processed map image data
  Store raw and processed data
  No requirement to store intermediate pre-processed data
  Keep 6 months' raw and processed data on primary storage
  Keep online copy of additional data
  Keep all raw and processed data indefinitely
  Size for at least 5 years
Deliverables
  Draft data management/storage policy
  SLA options on data retrieval from non-primary storage
  Set of practical options
  Storage management policy document

Objectives of Storage Management
Data availability to meet service level commitments even during failures, disasters,
or other forms of primary data loss
Data protection against loss and to prevent unauthorised access
Data retention that is compliant with regulations and standards in an unalterable state, fully audited for long periods of time
Cost-effective storage management infrastructure

Backup and Data Archival
Backup
  Ensure efficient recoverability of data
  Does not make backup data directly available
  Optimised to bring large amounts of data back online quickly for system recovery
  Retention management at the volume level
  Not oriented to long-term management beyond life of current environment and media
Archiving
  Copy from online environment to separately managed (secure) storage to reduce cost of storage and enforce retention
  Provides easy (ideally transparent) access for retrieval
  Optimised to write and retrieve data at file granularity
  File-level retention management
  Designed to manage data over the long term, through media migration and with access auditing and controls
  Designed to manage multiple copies of data on different media types

High Level Storage Management Architectures
Multi-tier data storage architectures
  Primary/Secondary
  Primary/Secondary/Tertiary
  Primary/Secondary and Tertiary in parallel
Secondary disk storage layer is purely for convenience to allow recall of data
Advantages and disadvantages in terms of cost and service

Hierarchical Storage Management (HSM)
HSM is a key requirement of effective (and cost-effective) storage management
Data is migrated (moved / copied) from one storage layer to another, usually less expensive, form of storage
A stub is created for and replaces each migrated file
On the local system, a stub file looks and acts like a regular file
When user action restores a file but the user does not change the file, that file is "re-stubbed" during the next migration
process

Primary/Secondary
Primary Storage
  High speed fibre-channel disk
  Data is directly accessible
Secondary Storage
  Offline/nearline storage
  Retain data indefinitely
  Tape/optical media
Migrate After Defined Interval
Retrieve from Secondary to Primary

Primary/Secondary/Tertiary
Primary Storage
  High speed fibre-channel disk
  Data is directly accessible
Secondary Storage
  High capacity ATA (SATA/FATA) disk
  Data is directly accessible
Tertiary Storage
  Data resides on offline/nearline storage
  Retain data indefinitely
  Tape/optical media
Migrate After Defined Interval (between each pair of tiers)
Retrieve from Secondary/Tertiary to Primary

Primary/Secondary and Tertiary in Parallel
Primary Storage
Secondary Storage
Tertiary Storage
Migrate After Defined Interval
Take Copy Immediately

Hardware Options
Disk Storage
Tape Storage – Manual or Automated
Optical Storage – Manual or Automated
Hybrid devices
VTL (Virtual Tape Library)
EMC Centera
IBM DR550
Storage gateways

Hardware Options - Disk
Disk – Advantages
Speed - FC and SATA disk technologies allow the data to be housed on the appropriate disks
SATA drive technology has matured and can lead to decreased acquisition costs
FC and SATA can be used within the same storage system for primary and secondary data
Storage Virtualisation
  Virtualise disk arrays within a storage system
  Virtualise storage systems within a fabric
Thin provisioning allows over-commitment of disk – reducing acquisition costs
Single Instance Storage (Deduplication) can be used but its effectiveness depends on the nature of the data

Hardware Options - Disk
Disk – Disadvantages
Acquisition cost
Disk systems do not interoperate well
Management - multiple skill sets may be required even if all storage systems are from the same vendor
Most hardware vendors focus on ensuring hardware resilience; data resilience is not their concern
Operating costs – power, air conditioning, maintenance

Hardware Options – Removable Media
Advantages
  Control of costs
  Keep fixed number of media within automated library unit (could keep none)
Disadvantages
  External media needs media management and control
  Media management is greater for smaller capacity optical disks
  Manual costs of media management

Hardware Options – Optical Storage
Optical Storage
UDO (Ultra Density Optical)
60 GB media capacity
UDO media have a 50+ year life
UDO technology roadmap - 120 GB and 240 GB media capacities
Main vendor – Plasmon
Resold by other vendors: HP and IBM
WORM media option

Optical Library and Drive Performance
Poor performance relative to tape
Direct access medium
Use depends on data read (retrieval) and write volumes

[Figure: Single Drive/Path Tape and Optical Read and Write Performance]

Hardware Options – Optical Storage
Optical – Advantages
Reduced cost over disk
Larger capacity media planned for the future
Can have embedded encryption
Long media shelf life before refresh is required
Very reliable medium
True WORM option

Hardware Options – Optical Storage
Optical – Disadvantages
Low capacity
Media must be managed offline unless multiple
libraries are bought
Low data access speed – not suited to large data volume restores

Hardware Options – Optical Storage
Optical Storage Issues
Low medium capacity
  UDO – 60 GB currently, 120 GB and 240 GB planned
  Tape
    LTO-4 Ultrium 1840 – 800 GB uncompressed
    LTO-3 Ultrium 960 – 400 GB uncompressed

Tape and Optical Media Capacities
Optical media capacity cumulative annual increase of c. 31%
Tape media capacity cumulative annual increase of c. 64%

Hardware Options – Tape
Tape – Advantages
Cost
Very well defined road map for LTO
  LTO4 (Dec 2006) - 1.6 TB (2:1 compression) and data transfer rates of up to 240 MB/second (2:1 compression)
  LTO5 (Planned) - 3.2 TB (2:1 compression) and data transfer rates of up to 360 MB/second (assuming 2:1 compression)
  LTO6 (Planned) - 6.4 TB (2:1 compression) and data transfer rates of up to 540 MB/second (assuming 2:1 compression)
High capacity media
Designed for large data volume restore
Multiple media can be streamed to aggregate capacity and speed
Can have embedded encryption

Hardware Options – Tape
Tape – Disadvantages
Media shelf life – medium
Media long-term reliability
Cumbersome single file restores
Sequential access medium

Hardware Options – Tape Library
Widely available from large number of vendors: Dell, HP, IBM, Quantum
IBM System Storage TS3500 Tape Library
  One base frame, and up to 15 expansion frames
  Up to 12 drives per frame (up to 192 per library)
  Up to 5.5 PB with LTO 4 cartridges
  LTO Fibre Channel interface for server attachment
Very high capacity automated data management
Long-term data storage

VTL (Virtual Tape Library)
Hybrid units that emulate tape libraries
Use low cost disk (and possibly tape)
Works with existing tape
backup software
Improved backup speeds
No removable medium backup
Sample products
  IBM
    IBM Virtualization Engine TS7510
    IBM Virtualization Engine TS7520
  HP StorageWorks Virtual Library System (VLS)
    VLS1000i
    VLS6000

IBM Virtualization Engine TS75x0
TS7510
  96 TB Capacity at 2:1 Compression
  Maximum number of virtual libraries – 128
  Maximum number of virtual drives – 1,024
  Maximum number of virtual cartridges – 8,192
  Maximum number of concurrent backups – 32
TS7520
  2.6 PB Capacity at 2:1 Compression
  Maximum number of virtual libraries – 512
  Maximum number of virtual drives – 4,096
  Maximum number of virtual cartridges – 64,000
  Maximum number of concurrent backups – 32

HP StorageWorks Virtual Library System (VLS)
VLS1000i
  3 TB Capacity at 2:1 Compression
  Maximum number of virtual libraries – 6
  Maximum number of virtual drives – 12
VLS6000
  105 TB Capacity at 2:1 Compression
  Maximum number of virtual libraries – 16
  Maximum number of virtual drives – 128

IBM DR550
Uses multiple storage tiers (disk, tape, optical) within an archive
Software - System Storage Archive Manager
Two models
  DR1 - 36.88 TB raw
  DR2 - 168 TB raw
Attached devices – support for PB capacities
  Tape systems
  Optical systems
Awards
  Data Protection Summit, Information Lifecycle Management (ILM), Best of Show, 2007
  AIIM (The Enterprise Content Management Association), Best in Show, 2005, 2006

Software Options
HSM
  HSM is a principle; most products offer the same basic functionality
  Automatic migration and management of data from one medium to another
  Stubs or pointers are left in place of migrated files
  Speed of retrieval depends on the speed of the hardware to which the files have been migrated; this gives online, near-line and off-line options

Software Options
Bridgehead Software
  Small company,
employee owned
  Can they offer the level of service and support required when really needed?
  Are they possible acquisition targets?
  Ideal for mid – large customers
  Can it handle the levels of data over time?
Caminosoft
  Major corporation – publicly listed and managed by SEC rules and regulations
  Primary focus is on managing file server type data
  Repackaged by vendors such as CA

Software Options
Symantec
  Major corporation
  Two products: NetBackup, Enterprise Vault
  NetBackup
    HSM does not support Windows
  Enterprise Vault
    KVS staff still provide support, separate entity within Symantec
    Focus is largely on email and compliance
    Some integration with NetBackup
    Files to be migrated are collected into CAB files
    Entire CAB file recalled
    Poor support for tape as archival medium
    Recommended that you only use tape for data that is seldom or never accessed

Software Options
IBM – Tivoli
  Major corporation
  Vast knowledge within the company
  Extensive R&D budgets
  Agents and options from most major software and hardware vendors

Software Options
HP – File Archiver
  Major corporation
  Vast knowledge within the company
  Extensive R&D budgets
  "Simple Lightweight Solution" according to HP

Software Options
HSM Product
What is required from the chosen vendor / application?
  Stable and functionally bullet proof solution
  Easy to use
  Capable of handling files
  Capable of handling data volumes
  Must integrate with backup application (so that NetBackup does not initiate a restore when backing up or restoring stubs)
  Expert support knowledge
  Expert integration knowledge
These products are dependent on hardware vendors' solutions

Data Deduplication
Store only one copy of data
The deduplication process should be granular
The smaller the data block examined, the more likely it is that duplicate data will be found.
The deduplication process should be designed with minimal overhead when deduplicating (storing) and un-deduplicating (retrieving) data
Hardware better than software
The deduplication process should provide resiliency to ensure that all data can be reliably stored and retrieved, even in the event of system failure

Data Deduplication
Available for range of storage – hardware and software
Symantec Enterprise Vault creates an MD5 fingerprint for every file that is archived
  If multiple files have the same hash code, only one copy of the file is physically stored
IBM N Series has Advanced Single Instance Storage (ASIS)
  Hardware and block-based deduplication

Deduplication in Action
Client.ppt: identical file - 20 blocks
Sales ed.ppt: 20 x 4K blocks
White paper.doc: different file - 10 blocks
Sales ed v2.ppt: edited file - 24 blocks
With ASIS - 38 total blocks
Without ASIS – 74 total blocks

[Figure: Potential Deduplication Savings – Dependent on Data Types]

Software and Solution Design Constraints and Issues
Bottom Line
Produce a realistic design before implementation and validate design
Solutions must be fully tested to ensure they work as expected
Decisions can then easily be made on the basis of the tests
NetBackup integration must be thoroughly tested with any solution
Primary to secondary to tertiary migration and retrievals must be tested and documented
Misconfiguration or lack of understanding can lead to data loss or primary production system failure
Need to look at the total cost of ownership – maintenance, power, manual effort – put a cost on all elements and activities to ensure fair comparison
Reduced complexity – fewer components, vendors – means long-term ease of operation and use and has a genuine value

Sample Storage Capacity Planning
Sizing issues and assumptions
  Annual growth rate
  Overhead for determination of actual disk storage requirements (RAID overhead, etc.)
  Archival storage medium utilisation overhead (allowance for unfilled tapes, optical platters, RAID for VTL, etc.)
  Storage lifecycle
  Number of storage layers – 2 or 3
Sample storage capacity planning scenarios
  Annual growth rates – 0%, 10%, 20%, 30%
  Translated into monthly growth rates for calculations - 20% annual growth = 1.531% monthly
  Three tiers
  Migrate from Tier 1 to Tier 2 after 6 months
  Migrate from Tier 2 to Tier 3 after further 6 months

Disk Space Calculations
Storage estimates expressed as raw capacities required to accommodate data
Includes overhead for effective usability, RAID, snapshots, online spare, less than 100% utilisation, etc.
Primary storage after 5 years with 10% annual growth = 25,580 GB
Equates to at least 34,533 GB of raw disk capacity

[Tables and charts not captured in this transcript: Sample Storage Capacity Planning, Capacities, Storage Capacities and Media Requirements, each for annual growth rates of 0%, 10%, 20% and 30%]
[Figure: 10 Year Data Storage Capacities – Different Growth Rates]
[Figure: Single Drive/Path Tertiary Layer Data Write Times – Tape and Optical]

Implementation Options
Factors:
  2 or 3 tiers
  Optical, tape or VTL as the last tier
  Use of existing storage (HP/Dell) or new storage
  DR or no DR
  Offsite manual copy or replication
  Software HSM – use existing NetBackup or other: HT FileStore, CaminoSoft, IBM Tivoli

Spectrum of Options
All disk
DR option with replicated data
Primary disk, secondary tape
Mixed disk/tape/optical/VTL/manual/automated

Data Retrieval Operation
Secondary disk
  Data is retrieved to primary immediately – available within seconds/minutes
Secondary/tertiary VTL
  Data is retrieved to primary immediately – available within minutes
Secondary/tertiary tape library
  Data is retrieved to primary immediately – available within minutes
Secondary/tertiary optical library
  Data is retrieved to primary
immediately – available within hours
Manual media retrieval
  Retrieval times depend on media location and staff allocated to media handling

Sample Options
Three tiers – optical or tape library as third tier
All disk
Reuse/expand existing hardware
Low cost ATA disks for secondary storage
Not all available options – presented for review and feedback

[Figure: Physical Option 1 – Three Tiers – Optical or Tape]

Physical Option 1 - Components
Primary storage – SAN with fibre disk
Secondary storage – SAN with ATA disk
Tertiary storage – optical library
Software
  HT Filestore
  Caminosoft
  NetBackup Storage Migrator
  Tivoli Storage Manager

Resilience
Primary storage mirrored for resilience

[Figure: Operation and Service Level Agreement]

Physical Option 2 – All Disk Configuration
All disk storage option
Two mirrored sites with realtime replication
Multiple replicated components for resilience
Sample configuration
  Primary Storage: Clustered SAN Controllers with 594 x 300 GB Fibre Channel Drives = 151 TB Raw Storage
  Secondary Storage: Clustered SAN Controllers with 336 x 750 GB SATA Drives = 252 TB Raw Storage
  Total 403 TB of Raw Storage capacity (doubled for DR)

[Figure: All Disk Configuration]
[Figure: Resilience – Multiple Points of Redundancy]

Resilience
SAN switches
SAN controllers
Two disks per shelf
Entire site

All Disk Configuration
Indicative hardware and software (replication, snapshot) cost €1.8 million
€4,460 per TB
(doubled for DR)
5 standard racks in each location
Does not include
  HSM software
  Installation and commissioning
Represents high water mark in terms of costs and functionality

All Disk Configuration
Advantages
  High performance
  Low manual intervention
  Highly resilient
Disadvantages
  High cost of acquisition and operation
  Growth in data volumes means additional expense
  No upper limit on cost

Physical Option 3 – Existing Hardware
Raw, pre-processed and processed data resides on HP EVA
Replicated continuously to second EVA
Dell CX disk array used as secondary location
Existing ADIC LTO drives used for tertiary and long term offsite storage

Existing Hardware
Advantages
  Cost
  Some skill sets already in organisation
Disadvantages
  Investment in old technology
  Software based HSM product skills required

Introduction of Tertiary Device
Existing HP and Dell storage still employed
UDO or LTO device used as final destination before removal to offsite archive

Introduction of Tertiary Device
Advantages
  Cost – use of existing hardware
  Some skill sets already in organisation
  Media life is increased with UDO
Disadvantages
  Cost – UDO or new tape library
  Management of archived media – especially UDO as they are low capacity
  Investment in old technology
  Software based HSM product skills required
  UDO retrieval speeds

Virtual Tape Library
VTL device will act as a tape library
VTL will be secondary location
HSM product skills may not be required
NetBackup could manage this process
VTL data will ultimately be archived to tape via ADIC tape library

Virtual Tape Library
Advantages
  Some skill
sets already in organisation
  No new third party migration tool absolutely necessary
  Extension of NetBackup system using NetBackup Storage Migrator
Disadvantages
  Cost – VTL with required capacity can be expensive
  Cannot take VTL backups offsite – tertiary solution still required
  Lack of vendor implementation experience

Physical Option 4 – Disk Based Secondary Information Store
Single storage device with multiple PB of data scalability
Data can be retained on information store for 15+ years and beyond
1 TB disks make this possible
Data can be moved to storage attached tape
Internal backup features of information store can aid NetBackup routine (SnapShots, Vaulting)

Disk Based Information Store
Advantages
  Speed of retrieval
  No new third party migration tool absolutely necessary
  Simplicity
  Integration with NetBackup – no effect on daily backup routines
  Information store can be split across multiple information stores to give multiple PB capacity if required
Disadvantages
  Cost – may be expensive initially but storage can be added over time as needed

Central Management – Storage Virtualisation
Controller sits above storage systems
Handle day to day management of storage across all platforms
Advantages
  Skill set consolidation
  Costs
Disadvantages
  Vendor-based skills are still ultimately required

Key Questions
Number of storage tiers and preferred configuration
Use of tape/optical/VTL
Software HSM option
Disaster recovery/business continuity requirements and options
Capacity planning constraints and assumptions
New hardware or reuse of existing hardware
Level of automation required for archival level
Financial constraints and budget available
Implementation schedule

More Information
Alan McSweeney
alan@alanmcsweeney.com
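
The sizing arithmetic quoted on the Sample Storage Capacity Planning and Disk Space Calculations slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used to build the deck's tables. The function names are hypothetical. The 1.35 overhead factor is inferred from the ratio of the two figures on the slide (34,533 GB raw for 25,580 GB of data) and is not stated in the deck. The first-month intake passed to `cumulative_data_gb` is a placeholder, because the slides do not give one.

```python
def monthly_growth_rate(annual_rate: float) -> float:
    """Convert a compound annual growth rate into the equivalent monthly rate."""
    return (1.0 + annual_rate) ** (1.0 / 12.0) - 1.0


def cumulative_data_gb(first_month_gb: float, annual_rate: float, months: int) -> float:
    """Total data captured over `months` when the monthly intake grows at the
    compound rate implied by `annual_rate`.

    `first_month_gb` is a hypothetical input; the slides do not state it.
    """
    rate = monthly_growth_rate(annual_rate)
    total = 0.0
    intake = first_month_gb
    for _ in range(months):
        total += intake
        intake *= 1.0 + rate
    return total


def raw_capacity_gb(usable_gb: float, overhead_factor: float = 1.35) -> float:
    """Raw disk capacity needed to hold `usable_gb` of data.

    The default factor of 1.35 is an assumption inferred from the slide's own
    figures (34,533 / 25,580); it stands in for RAID, snapshots, spares and
    less-than-full utilisation.
    """
    return usable_gb * overhead_factor


if __name__ == "__main__":
    # 20% annual growth expressed as a monthly rate
    print(f"{monthly_growth_rate(0.20) * 100:.3f}% monthly")  # 1.531% monthly
    # Raw capacity for the 25,580 GB primary storage figure
    print(f"{raw_capacity_gb(25_580):,.0f} GB raw")  # 34,533 GB raw
```

Running the sizing for each growth scenario (0%, 10%, 20%, 30%) only requires changing `annual_rate`; the overhead factor should be replaced with a value derived from the actual RAID and snapshot configuration once that is chosen.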
Storage Architectures and Options alanmcsweeney Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT lite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 268 Category: Science & Tech.. License: Some Rights Reserved Like it (0) Dislike it (0) Added: January 11, 2011 This Presentation is Public Favorites: 1 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Storage Architectures and Options: Storage Architectures and Options Alan McSweeneyObjectives: January 11, 2011 2 Objectives To provide high-level information on storage options and architectures for storing and managing digital camera data To provide indicative sample solutions To initiate discussions on storage configurations and optionsAgenda: January 11, 2011 3 Agenda Confirmation of Storage Requirements Data Flows and Processes Storage Management Architectures and Options Storage Management Operation, Management and Use Sample SolutionsUnderstanding of Requirements: January 11, 2011 4 Understanding of Requirements Storage solution to manage raw and processed map image data Store raw and processed data No requirement to store intermediate pre-processed data Keep 6 month’s raw and processed data on primary storage Keep online copy of additional data Keep all raw and processed data indefinitely Size for at least 5 years Deliverables Draft data management/storage policy SLA options on data retrieval from non-primary storage Set of practical options Storage management policy documentObjectives of Storage Management: January 11, 2011 5 Objectives of Storage Management Data availability to meet service level commitments even during failures, disasters, or other forms of 
primary data loss Data protection against loss and to prevent unauthorised access Data retention that is compliant with regulations and standards in an unalterable state, fully audited for long periods of time Cost-effective storage management infrastructureBackup and Data Archival: January 11, 2011 6 Backup and Data Archival Backup Ensure efficient recoverability of data Does not make backup data directly available Optimised to bring large amounts of data back online quickly for system recovery Retention management at the volume level Not oriented to long-term management beyond life of current environment and media Archiving Copy from online environment to separately managed (secure) storage to reduce cost of storage and enforce retention Provides easy (ideally transparent) access for retrieval Optimised to write and retrieve data at file granularity File-level retention management Designed to manage data over long-term, through media migration and with access auditing and controls Designed to manage multiple copies of data on different media typesHigh Level Storage Management Architectures: January 11, 2011 7 High Level Storage Management Architectures Multi-tier data storage architectures Primary/Secondary Primary/Secondary/Tertiary Primary/Secondary and Tertiary in parallel Secondary disk storage layer is purely for convenience to allow recall of data Advantages and disadvantages in terms of cost and serviceHierarchical Storage Management (HSM): January 11, 2011 8 Hierarchical Storage Management (HSM) HSM is a key requirement of effective (and cost-effective) storage management Data is migrated (moved / copied) from one storage layer to another, usually less expensive, form of storage A stub is created for and replaces each migrated file On the local system, a stub file looks and act like a regular file When user action restores a file but the user does not change the file, that file is ″re-stubbed″ during the next migration processPrimary/Secondary: January 
11, 2011 9 Primary/Secondary Primary Storage Secondary Storage High speed fibre-channel disk Data is directly accessible Offline/nearline storage Retain data indefinitely Tape/optical media Migrate After Defined IntervalPrimary/Secondary: January 11, 2011 10 Primary/Secondary Primary Storage Secondary Storage Migrate After Defined Interval Retrieve from Secondary to PrimaryPrimary/Secondary/Tertiary: January 11, 2011 11 Primary/Secondary/Tertiary Primary Storage Secondary Storage Tertiary Storage High speed fibre-channel disk Data is directly accessible High capacity ATA (SATA/FATA) disk Data is directly accessible Data resides Offline/nearline storage Retain data indefinitely Tape/optical media Migrate After Defined Interval Migrate After Defined IntervalPrimary/Secondary/Tertiary: January 11, 2011 12 Primary/Secondary/Tertiary Primary Storage Secondary Storage Tertiary Storage Migrate After Defined Interval Migrate After Defined Interval Retrieve from Secondary/Tertiary to PrimaryPrimary/Secondary and Tertiary in Parallel: January 11, 2011 13 Primary/Secondary and Tertiary in Parallel Primary Storage Secondary Storage Tertiary Storage Migrate After Defined Interval Take Copy ImmediatelyHardware Options: January 11, 2011 14 Hardware Options Disk Storage Tape Storage – Manual or Automated Optical Storage – Manual or Automated Hybrid devices VTL (Virtual Tape Library) EMC Centera IBM DR550 Storage gatewaysHardware Options - Disk: January 11, 2011 15 Hardware Options - Disk Disk – Advantages Speed - FC and SATA disk technologies allow the data to be housed on the appropriate disks SATA Drive technology has mature and can lead to decreased acquisition costs FC and SATA can be used within the same storage system for primary and secondary data Storage Virtualisation Virtualise disk arrays within a storage system Virtualise storage systems within a fabric Thin provisioning allows over commitment of disk – reducing acquisition costs Single Instance Storage (Deduplication) 
can be used but its effectiveness depends on the nature of the data

Hardware Options - Disk:
Disk – Disadvantages
Acquisition cost
Disk systems do not interoperate well
Management - multiple skill sets may be required even if all storage systems are from the same vendor
Most hardware vendors focus on ensuring hardware resilience; data resilience is not their concern
Operating costs – power, air conditioning, maintenance

Hardware Options – Removable Media:
Advantages
  Control of costs
  Keep fixed number of media within automated library unit (could keep none)
Disadvantages
  External media needs media management and control
  Media management is greater for smaller capacity optical disks
  Manual costs of media management

Hardware Options – Optical Storage:
Optical Storage
UDO (Ultra Density Optical)
60 GB media capacity
UDO media have a 50+ year life
UDO technology roadmap - 120 GB and 240 GB media capacities
Main vendor – Plasmon
Resold by other vendors: HP and IBM
WORM media option

Optical Library and Drive Performance:
Poor performance relative to tape
Direct access medium
Use depends on data read (retrieval) and write volumes

Single Drive/Path Tape and Optical Read and Write Performance: [no text on slide]

Hardware Options – Optical Storage:
Optical – Advantages
Reduced cost over disk
Larger capacity media planned for the future
Can have embedded encryption
Long media shelf life before refresh is required
Very reliable medium
True WORM option

Hardware Options – Optical Storage:
Optical – Disadvantages
Low capacity
Media must be managed offline unless multiple libraries are bought
Low data access speed
– not suited to large data volume restores

Hardware Options – Optical Storage:
Optical Storage Issues
Low medium capacity
  UDO – 60 GB currently, 120 GB and 240 GB planned
  Tape
    LTO-4 Ultrium 1840 – 800 GB uncompressed
    LTO-3 Ultrium 960 – 400 GB uncompressed

Tape and Optical Media Capacities:
Optical media capacity cumulative annual increase of c. 31%
Tape media capacity cumulative annual increase of c. 64%

Hardware Options – Tape:
Tape – Advantages
Cost
Very well defined road map for LTO
  LTO4 (Dec 2006) - 1.6 TB (2:1 compression) and data transfer rates of up to 240 MB/second (2:1 compression)
  LTO5 (Planned) - 3.2 TB (2:1 compression) and data transfer rates of up to 360 MB/second (assuming 2:1 compression)
  LTO6 (Planned) - 6.4 TB (2:1 compression) and data transfer rates of up to 540 MB/second (assuming 2:1 compression)
High capacity media
Designed for large data volume restore
Multiple media can be streamed to aggregate capacity and speed
Can have embedded encryption

Hardware Options – Tape:
Tape – Disadvantages
Media shelf life – medium
Media long-term reliability
Cumbersome single file restores
Sequential access medium

Hardware Options – Tape Library:
Widely available from large number of vendors: Dell, HP, IBM, Quantum
IBM System Storage TS3500 Tape Library
  One base frame, and up to 15 expansion frames
  Up to 12 drives per frame (up to 192 per library)
  Up to 5.5 PB with LTO 4 cartridges
  LTO Fibre Channel interface for server attachment
Very high capacity automated data management
Long-term data storage

VTL (Virtual Tape Library):
Hybrid units that emulate tape libraries
Use low cost disk (and possibly tape)
Works with existing tape backup software
Improved backup speeds
No
removable medium backup
Sample products
  IBM: IBM Virtualization Engine TS7510, IBM Virtualization Engine TS7520
  HP StorageWorks Virtual Library System (VLS): VLS1000i, VLS6000

IBM Virtualization Engine TS75x0:
TS7510
  96 TB capacity at 2:1 compression
  Maximum number of virtual libraries – 128
  Maximum number of virtual drives – 1,024
  Maximum number of virtual cartridges – 8,192
  Maximum number of concurrent backups – 32
TS7520
  2.6 PB capacity at 2:1 compression
  Maximum number of virtual libraries – 512
  Maximum number of virtual drives – 4,096
  Maximum number of virtual cartridges – 64,000
  Maximum number of concurrent backups – 32

HP StorageWorks Virtual Library System (VLS):
VLS1000i
  3 TB capacity at 2:1 compression
  Maximum number of virtual libraries – 6
  Maximum number of virtual drives – 12
VLS6000
  105 TB capacity at 2:1 compression
  Maximum number of virtual libraries – 16
  Maximum number of virtual drives – 128

IBM DR550:
Uses multiple storage tiers (disk, tape, optical) within an archive
Software - System Storage Archive Manager
Two models
  DR1 - 36.88 TB raw
  DR2 - 168 TB raw
Attached devices – support for PB capacities
  Tape systems
  Optical systems
Awards
  Data Protection Summit, Information Lifecycle Management (ILM): Best of Show, 2007
  AIIM (The Enterprise Content Management Association): Best in Show, 2005, 2006

Software Options:
HSM
HSM is a principle; most products offer the same basic functionality
Automatic migration and management of data from one medium to another
Stubs or pointers are left in place of migrated files
Speed of retrieval depends on the speed of the hardware to which the files have been migrated; this gives online, near-line and off-line options

Software Options:
Bridgehead Software
Small company, employee owned
Can they offer the level of
service and support required when really needed?
Are they possible acquisition targets?
Ideal for mid – large customers
Can it handle the levels of data over time?
Caminosoft
Major corporation – publicly listed and subject to SEC rules and regulations
Primary focus is on managing file server type data
Repackaged by vendors such as CA

Software Options:
Symantec
Major corporation
Two products: NetBackup, Enterprise Vault
NetBackup
  HSM does not support Windows
Enterprise Vault
  KVS staff still provide support, as a separate entity within Symantec
  Focus is largely on email and compliance
  Some integration with NetBackup
  Files to be migrated are collected into CAB files
  Entire CAB file is recalled
  Poor support for tape as archival medium
  Recommended that you only use tape for data that is seldom or never accessed

Software Options:
IBM – Tivoli
Major corporation
Vast knowledge within the company
Extensive R&D budgets
Agents and options from most major software and hardware vendors

Software Options:
HP – File Archiver
Major corporation
Vast knowledge within the company
Extensive R&D budgets
“Simple Lightweight Solution” according to HP

Software Options:
HSM Product
What is required from the chosen vendor/application?
Stable and functionally bullet-proof solution
Easy to use
Capable of handling files
Capable of handling data volumes
Must integrate with backup application (so that NetBackup does not initiate a restore when backing up or restoring stubs)
Expert support knowledge
Expert integration knowledge
These products are dependent on hardware vendors' solutions

Data Deduplication:
Store only one copy of data
The deduplication process should be granular
The smaller the data block examined, the more likely it is that duplicate data will be found.
The deduplication process should be designed with minimal overhead when deduplicating (storing) and un-deduplicating (retrieving) data
Hardware better than software
The deduplication process should provide resiliency to ensure that all data can be reliably stored and retrieved, even in the event of system failure

Data Deduplication:
Available for range of storage – hardware and software
Symantec Enterprise Vault creates an MD5 fingerprint for every file that is archived
  If multiple files have the same hash code, only one copy of the file is physically stored
IBM N Series has Advanced Single Instance Storage (ASIS)
  Hardware and block-based deduplication

Deduplication in Action:
Client.ppt: identical file - 20 blocks
Sales ed.ppt: 20 x 4K blocks
White paper.doc: different file - 10 blocks
Sales ed v2.ppt: edited file - 24 blocks
Diagram legend marks identical blocks
With ASIS - 38 total blocks
Without ASIS – 74 total blocks

Potential Deduplication Savings – Dependent on Data Types: [no text on slide]

Software and Solution Design Constraints and Issues:
Bottom Line
Produce a realistic design before implementation and validate the design
The solution must be fully tested to ensure it works as expected
Decisions can then easily be made on the basis of the tests
NetBackup integration must be thoroughly tested with any solution
Primary to secondary to tertiary migration and retrievals must be tested and documented
Misconfiguration or lack of understanding can lead to data loss or primary production system failure
Need to look at the total cost of ownership – maintenance, power, manual effort – put a cost on all elements and activities to ensure fair comparison
Reduced complexity – fewer components, vendors – means long-term ease of operation and use and has a genuine value

Sample Storage Capacity
Planning:
Sizing issues and assumptions
  Annual growth rate
  Overhead for determination of actual disk storage requirements (RAID overhead, etc.)
  Archival storage medium utilisation overhead (allowance for unfilled tapes, optical platters, RAID for VTL, etc.)
  Storage lifecycle
  Number of storage layers – 2 or 3
Sample storage capacity planning scenarios
  Annual growth rates – 0%, 10%, 20%, 30%
  Translated into monthly growth rates for calculations - 20% annual growth = 1.531% monthly
  Three tiers
    Migrate from Tier 1 to Tier 2 after 6 months
    Migrate from Tier 2 to Tier 3 after further 6 months

Disk Space Calculations:
Storage estimates expressed as raw capacities required to accommodate data
Includes overhead for effective usability, RAID, snapshots, online spare, less than 100% utilisation, etc.
Primary storage after 5 years with 10% annual growth = 25,580 GB
Equates to at least 34,533 GB of raw disk capacity

Sample Storage Capacity Planning – 0% Annual Growth Rate: [no text on slide]
Capacities - Annual Growth Rate – 0%: [no text on slide]
Storage Capacities - 0% Annual Growth Rate: [no text on slide]
Media Requirements - 0% Annual Growth Rate: [no text on slide]
Sample Storage Capacity Planning – 10% Annual Growth Rate: [no text on slide]
Capacities - Annual Growth Rate – 10%: [no text on slide]
Storage Capacities - 10% Annual Growth Rate: [no text on slide]
Media Requirements - 10% Annual Growth Rate: [no text on slide]
Sample Storage Capacity Planning – 20% Annual Growth Rate: [no text on slide]
Capacities - Annual Growth Rate – 20%: [no text on slide]
Storage Capacities - 20% Annual Growth Rate: [no text on slide]
Media Requirements - 20% Annual Growth Rate: [no text on slide]
Sample Storage Capacity Planning – 30% Annual Growth Rate: [no text on slide]
Capacities - Annual Growth Rate – 30%: [no text on slide]
Storage Capacities - 30% Annual Growth Rate: [no text on slide]
Media Requirements - 30% Annual Growth Rate: [no text on slide]
10 Year Data Storage Capacities – Different Growth Rates: [no text on slide]
Single Drive/Path Tertiary Layer Data Write Times – Tape and Optical: [no text on slide]

Implementation Options:
Factors:
2 or 3 tiers
Optical, tape or VTL as the last tier
Use of existing storage (HP/Dell) or new storage
DR or no DR
Offsite manual copy or replication
Software HSM – use existing NetBackup or other: HT FileStore, CaminoSoft, IBM Tivoli

Spectrum of Options:
All disk
DR option with replicated data
Primary disk
Secondary - tape
Mixed disk/tape/optical/VTL/manual/automated

Data Retrieval Operation:
Secondary disk
  Data is retrieved to primary immediately – available within seconds/minutes
Secondary/tertiary VTL
  Data is retrieved to primary immediately – available within minutes
Secondary/tertiary tape library
  Data is retrieved to primary immediately – available within minutes
Secondary/tertiary optical library
  Data is retrieved to primary
immediately – available within hours
Manual media retrieval
  Retrieval times depend on media location and staff allocated to media handling

Sample Options:
Three tiers – optical or tape library as third tier
All disk
Reuse/expand existing hardware
Low cost ATA disks for secondary storage
Not all available options – presented for review and feedback

Physical Option 1 – Three Tiers – Optical or Tape: [two slides, no text on slide]

Physical Option 1 - Components:
Primary storage – SAN with fibre disk
Secondary storage – SAN with ATA disk
Tertiary storage – optical library
Software
  HT Filestore
  Caminosoft
  NetBackup Storage Migrator
  Tivoli Storage Manager

Resilience:
Primary storage mirrored for resilience

Operation and Service Level Agreement: [no text on slide]

Physical Option 2 – All Disk Configuration:
All disk storage option
Two mirrored sites with realtime replication
Multiple replicated components for resilience
Sample configuration
  Primary Storage: Clustered SAN Controllers with 594 x 300 GB Fibre Channel Drives = 151 TB Raw Storage
  Secondary Storage: Clustered SAN Controllers with 336 x 750 GB SATA Drives = 252 TB Raw Storage
  Total 403 TB of Raw Storage capacity (doubled for DR)

All Disk Configuration: [no text on slide]

Resilience – Multiple Points of Redundancy: [no text on slide]

Resilience:
SAN switches
SAN controllers
Two disks per shelf
Entire site

All Disk Configuration:
Indicative hardware and software (replication, snapshot) cost
€1.8 million
€4,460 per TB
(doubled for DR)
5 standard racks in each location
Does not include
  HSM software
  Installation and commissioning
Represents high water mark in terms of costs and functionality

All Disk Configuration:
Advantages
  High performance
  Low manual intervention
  Highly resilient
Disadvantages
  High cost of acquisition and operation
  Growth in data volumes means additional expense
  No upper limit on cost

Physical Option 3 – Existing Hardware:
Raw, pre-processed and processed data resides on HP EVA
Replicated continuously to second EVA
Dell CX disk array used as secondary location
Existing ADIC LTO drives used for tertiary and long term offsite storage

Existing Hardware:
Advantages
  Cost
  Some skill sets already in organisation
Disadvantages
  Investment in old technology
  Software based HSM product skills required

Introduction of Tertiary Device:
Existing HP and Dell storage still employed
UDO or LTO device used as final destination before removal to offsite archive

Introduction of Tertiary Device:
Advantages
  Cost – use of existing hardware
  Some skill sets already in organisation
  Media life is increased with UDO
Disadvantages
  Cost – UDO or new tape library
  Management of archived media – especially UDO as they are low capacity
  Investment in old technology
  Software based HSM product skills required
  UDO retrieval speeds

Virtual Tape Library:
VTL device will act as a tape library
VTL will be secondary location
HSM product skills may not be required
NetBackup could manage this process
VTL data will ultimately be archived to tape via ADIC tape library

Virtual Tape Library:
Advantages
  Some skill
sets already in organisation
  No new third party migration tool absolutely necessary
  Extension of NetBackup system using NetBackup Storage Migrator
Disadvantages
  Cost – VTL with required capacity can be expensive
  Cannot take VTL backups offsite – tertiary solution still required
  Lack of vendor implementation experience

Physical Option 4 – Disk Based Secondary Information Store:
Single storage device with multiple PB of data scalability
Data can be retained on the information store for 15+ years
1 TB disks make this possible
Data can be moved to storage attached tape
Internal backup features of the information store can aid the NetBackup routine (SnapShots, Vaulting)

Disk Based Information Store:
Advantages
  Speed of retrieval
  No new third party migration tool absolutely necessary
  Simplicity
  Integration with NetBackup – no effect on daily backup routines
  Information store can be split across multiple information stores to give multiple PB capacity if required
Disadvantages
  Cost – may be expensive initially but storage can be added over time as needed

Central Management – Storage Virtualisation:
Controller sits above storage systems
Handles day to day management of storage across all platforms
Advantages
  Skill set consolidation
  Costs
Disadvantages
  Vendor based skills are still ultimately required

Key Questions:
Number of storage tiers and preferred configuration
Use of tape/optical/VTL
Software HSM option
Disaster recovery/business continuity requirements and options
Capacity planning constraints and assumptions
New hardware or reuse of existing hardware
Level of automation required for archival level
Financial constraints and budget available
Implementation schedule

More Information:
Alan McSweeney
alan@alanmcsweeney.com
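
Worked example for the Sample Storage Capacity Planning and Disk Space Calculations slides. This is a minimal Python sketch, not the spreadsheet behind the presentation. It converts an annual growth rate into the equivalent compound monthly rate and applies a raw-capacity overhead to a usable figure. The 35% overhead is an assumption inferred from the two figures quoted on the slide (34,533 / 25,580 = 1.35), and the function names and the `tier_holdings` helper (with its monthly ingest input) are hypothetical.

```python
def monthly_rate(annual_rate: float) -> float:
    """Compound monthly growth rate equivalent to an annual growth rate."""
    return (1.0 + annual_rate) ** (1.0 / 12.0) - 1.0


def raw_capacity_gb(usable_gb: float, overhead: float = 0.35) -> float:
    """Raw disk needed for a usable capacity.

    The 35% default is inferred from the slide figures, not stated in them.
    """
    return usable_gb * (1.0 + overhead)


def tier_holdings(first_month_gb, annual_rate, months,
                  tier1_months=6, tier2_months=6):
    """Data held per tier after `months` months of ingest (hypothetical model).

    Tier 1 keeps the newest `tier1_months` of data, tier 2 the next
    `tier2_months`, and tier 3 everything older. Both windows must be > 0.
    """
    r = monthly_rate(annual_rate)
    ingest = [first_month_gb * (1.0 + r) ** m for m in range(months)]
    window = tier1_months + tier2_months
    tier1 = sum(ingest[-tier1_months:])
    tier2 = sum(ingest[-window:-tier1_months])
    tier3 = sum(ingest[:-window])
    return tier1, tier2, tier3


if __name__ == "__main__":
    print(f"{monthly_rate(0.20):.3%}")       # 1.531%, as quoted on the slide
    print(round(raw_capacity_gb(25_580)))    # 34533, as quoted on the slide
```

With a known monthly ingest volume, `tier_holdings(volume_gb, 0.10, 60)` would give a rough five-year split across the three tiers under the 6-month/6-month migration policy described in the slides; the real sizing would also need the media utilisation overheads listed there.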
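
Worked example for the Deduplication in Action slide. The sketch below illustrates block-level deduplication in general and is not how ASIS or any vendor product is implemented. It splits files into 4K blocks, fingerprints each block with SHA-256 (chosen here for illustration), and counts total versus unique blocks. The synthetic file contents are assumptions: the edited file is taken to share 16 blocks with the original and add 8 new ones, which is one way to arrive at the 38 and 74 block totals on the slide.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as on the slide


def make_block(tag: str) -> bytes:
    """Synthetic 4K block whose content is determined by `tag`."""
    return tag.encode().ljust(BLOCK_SIZE, b"\0")


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split a byte string into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedup_stats(files) -> tuple:
    """Return (total blocks, unique blocks) across all files."""
    total = 0
    fingerprints = set()
    for data in files:
        for blk in split_blocks(data):
            total += 1
            fingerprints.add(hashlib.sha256(blk).digest())
    return total, len(fingerprints)


# Synthetic stand-ins for the four files on the slide (contents are assumed).
sales_ed = b"".join(make_block(f"sales:{i}") for i in range(20))
client = sales_ed                                    # identical file, 20 blocks
white_paper = b"".join(make_block(f"paper:{i}") for i in range(10))
sales_ed_v2 = (sales_ed[:16 * BLOCK_SIZE]            # 16 unchanged blocks
               + b"".join(make_block(f"edit:{i}") for i in range(8)))

total, unique = dedup_stats([client, sales_ed, white_paper, sales_ed_v2])

if __name__ == "__main__":
    print(total, unique)  # 74 38
```

Hashing whole files instead of blocks, as in the file-level fingerprint approach mentioned for Enterprise Vault, would only remove the identical copy here; the shared blocks of the edited file are found only at block granularity.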