Data, Information and Knowledge Management Framework and DMBOK


Structured and Comprehensive Approach to Data Management and the Data Management Body of Knowledge (DMBOK):

Structured and Comprehensive Approach to Data Management and the Data Management Body of Knowledge (DMBOK), Alan McSweeney

Objectives:

January 9, 2011

To provide an overview of a structured approach to developing and implementing a detailed data management policy, including frameworks, standards, projects, teams and maturity assessment

Agenda:

Introduction to Data Management
State of Information and Data Governance
Other Data Management Frameworks
Data Management and the Data Management Body of Knowledge (DMBOK)
Conducting a Data Management Project
Creating a Data Management Team
Assessing Your Data Management Maturity

Preamble:

Every good presentation should start with quotations from The Prince and Dilbert.

Management Wisdom:

"There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things." – The Prince

"Never be in the same room as a decision. I'll illustrate my point with a puppet show that I call 'Journey to Blameville', starring 'Suggestion Sam' and 'Manager Meg'. You will often be asked to comment on things you don't understand. These handouts contain nonsense phrases that can be used in any situation, so let's dominate our industry with quality implementation of methodologies. Our executives have started their annual strategic planning sessions. This involves sitting in a room with inadequate data until an illusion of knowledge is attained. Then we'll reorganise, because that's all we know how to do." – Dilbert

Information:

Information in all its forms – input, processed, output – is a core component of any IT system
Applications exist to process data supplied by users and other applications
Data breathes life into applications
Data is stored and managed by infrastructure – hardware and software
Data is a key organisation asset with substantial value
Significant responsibilities are imposed on organisations in managing data

Data, Information and Knowledge:

Data is the representation of facts as text, numbers, graphics, images, sound or video
Data is the raw material used to create information
Facts are captured, stored and expressed as data
Information is data in context
Without context, data is meaningless – we create meaningful information by interpreting the context around data
Knowledge is information in perspective, integrated into a viewpoint based on the recognition and interpretation of patterns, such as trends, formed with other information and experience
Knowledge is about understanding the significance of information
Knowledge enables effective action

Data, Information, Knowledge and Action:

Data → Information → Knowledge → Action

Information is an Organisation Asset:

Tangible organisation assets are seen as having a value and are managed and controlled using inventory and asset management systems and procedures
Data, because it is less tangible, is less widely perceived as a real asset, assigned a real value and managed as if it had a value
High quality, accurate and available information is a prerequisite to the effective operation of any organisation

Data Management and Project Success:

Data is fundamental to the effective and efficient operation of any solution: the right data, at the right time, with the right tools and facilities
Without data the solution has no purpose
Data is too often overlooked in projects
Project managers frequently do not appreciate the complexity of data issues

Generalised Information Management Lifecycle:

Design, define and implement a framework to manage information through this lifecycle
This is a generalised lifecycle that differs for specific information types:
Enter, Create, Acquire, Derive, Update, Capture
Store, Manage, Replicate and Distribute
Protect and Recover
Archive and Recall
Delete/Remove
Manage, Control and Administer
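The lifecycle stages above can be sketched as a simple state model. A minimal illustration in Python – the stage names come from the slide, but the transition rules are illustrative assumptions, not part of any standard:

```python
from enum import Enum, auto

class Stage(Enum):
    CREATE = auto()    # Enter, Create, Acquire, Derive, Update, Capture
    STORE = auto()     # Store, Manage, Replicate and Distribute
    PROTECT = auto()   # Protect and Recover
    ARCHIVE = auto()   # Archive and Recall
    DELETE = auto()    # Delete/Remove

# One plausible forward flow through the lifecycle; "Manage, Control and
# Administer" spans all stages, so it is not modelled as a stage here.
TRANSITIONS = {
    Stage.CREATE: {Stage.STORE},
    Stage.STORE: {Stage.PROTECT, Stage.ARCHIVE, Stage.DELETE},
    Stage.PROTECT: {Stage.STORE},
    Stage.ARCHIVE: {Stage.STORE, Stage.DELETE},  # recall or dispose
    Stage.DELETE: set(),                         # terminal stage
}

def can_transition(current: Stage, target: Stage) -> bool:
    """Return True if this sketch of the lifecycle permits the move."""
    return target in TRANSITIONS[current]
```

A framework implementation would attach policies (retention, protection, disposal) to each stage and validate moves with a check like `can_transition`.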

Expanded Generalised Information Management Lifecycle:

The expanded lifecycle includes phases for information management lifecycle design and the implementation of appropriate hardware and software to actualise the lifecycle:
Plan, Design and Specify
Implement Underlying Infrastructure
Enter, Create, Acquire, Derive, Update, Capture
Store, Manage, Replicate and Distribute
Protect and Recover
Archive and Recall
Delete/Remove
Design, Implement, Manage, Control and Administer

Data and Information Management:

Data and information management is a business process consisting of the planning and execution of policies, practices and projects that acquire, control, protect, deliver and enhance the value of data and information assets.

Data and Information Management:

To manage and utilise information as a strategic asset
To implement processes, policies, infrastructure and solutions to govern, protect, maintain and use information
To make relevant and correct information available in all business processes and IT systems, for the right people, in the right context, at the right time, with the appropriate security and the right quality
To exploit information in business decisions, processes and relationships

Data Management Goals:

Primary goals:
To understand the information needs of the enterprise and all its stakeholders
To capture, store, protect and ensure the integrity of data assets
To continually improve the quality of data and information, including the accuracy, integrity, integration, relevance and usefulness of data
To ensure privacy and confidentiality, and to prevent unauthorised or inappropriate use of data and information
To maximise the effective use and value of data and information assets

Data Management Goals:

Secondary goals:
To control the cost of data management
To promote a wider and deeper understanding of the value of data assets
To manage information consistently across the enterprise
To align data management efforts and technology with business needs

Triggers for Data Management Initiative:

When an enterprise is about to undertake architectural transformation, data management issues need to be understood and addressed
A structured and comprehensive approach to data management enables the effective use of data to capitalise on its competitive advantages

Data Management Principles:

Data and information are valuable enterprise assets
Manage data and information carefully, like any other asset, by ensuring adequate quality, security, integrity, protection, availability, understanding and effective use
Share responsibility for data management between business data owners and IT data management professionals
Data management is a business function and a set of related disciplines

Organisation Data Management Function:

The business function of planning for, controlling and delivering data and information assets
The development, execution and supervision of plans, policies, programs, projects, processes, practices and procedures that control, protect, deliver and enhance the value of data and information assets
The scope of the data management function and the scale of its implementation vary widely with the size, means and experience of organisations
The role of data management remains the same across organisations even though implementation differs widely

Scope of Complete Data Management Function:


Shared Role Between Business and IT:

Data management is a shared responsibility between data management professionals within IT and the business data owners representing the interests of data producers and information consumers
Business data ownership is concerned with accountability for business responsibilities in data management
Business data owners are data subject matter experts
They represent the data interests of the business and take responsibility for the quality and use of data

Why Develop and Implement a Data Management Framework?:

Improve organisation data management efficiency
Deliver better service to the business
Improve the cost-effectiveness of data management
Match the requirements of the business to the management of the data
Embed the handling of compliance and regulatory rules into the data management framework
Achieve consistency in data management across systems and applications
Enable growth and change more easily
Reduce data management and administration effort and cost
Assist in the selection and implementation of appropriate data management solutions
Implement a technology-independent data architecture

Data Management Issues:


Data Management Issues:

Discovery – cannot find the right information
Integration – cannot manipulate and combine information
Insight – cannot extract value and knowledge from information
Dissemination – cannot consume information
Management – cannot manage and control information volumes and growth

Data Management Problems – User View:

Managing Storage Equipment
Application Recoveries / Backup Retention
Vendor Management
Power Management
Regulatory Compliance
Lack of Integrated Tools
Dealing with Performance Problems
Data Mobility
Archiving and Archive Management
Storage Provisioning
Managing Complexity
Managing Costs
Backup Administration and Management
Proper Capacity Forecasting and Storage Reporting
Managing Storage Growth

Information Management Challenges:

Explosive data growth: the value and volume of data is overwhelming; more data is seen as critical; annual growth rate of 50+%
Compliance requirements: compliance with stringent regulatory requirements and audit procedures
Fragmented storage environment: lack of an enterprise-wide hardware and software data storage strategy and discipline
Budgets: frozen or being cut

Data Quality:

Poor data quality costs real money
Process efficiency is negatively impacted by poor data quality
The full potential benefits of new systems may not be realised because of poor data quality
Decision making is negatively affected by poor data quality

State of Information and Data Governance:

Source: Information and Data Governance Report, April 2008, by the International Association for Information and Data Quality (IAIDQ) and the University of Arkansas at Little Rock Information Quality Program (UALR-IQ)

Your Organisation Recognises and Values Information as a Strategic Asset and Manages it Accordingly:


Direction of Change in the Results and Effectiveness of the Organisation's Formal or Informal Information/Data Governance Processes Over the Past Two Years:


Perceived Effectiveness of the Organisation's Current Formal or Informal Information/Data Governance Processes:


Actual Information/Data Governance Effectiveness vs. Organisation's Perception:


Current Status of Organisation's Information/Data Governance Initiatives:


Expected Changes in Organisation's Information/Data Governance Efforts Over the Next Two Years:


Overall Objectives of Information / Data Governance Efforts:


Change In Organisation's Information / Data Quality Over the Past Two Years:


Maturity Of Information / Data Governance Goal Setting And Measurement In Your Organisation:


Maturity Of Information / Data Governance Processes And Policies In Your Organisation:


Maturity Of Responsibility And Accountability For Information / Data Governance Among Employees In Your Organisation:


Other Data Management Frameworks:


Other Data Management-Related Frameworks:

TOGAF (and other enterprise architecture standards) defines a process for arriving at an enterprise architecture definition, including data
TOGAF has a phase relating to data architecture
TOGAF deals with the high level; DMBOK translates the high level into specific details
COBIT is concerned with IT governance and controls:
  IT must implement internal controls around how it operates
  The systems IT delivers to the business, and the underlying business processes these systems actualise, must be controlled – these are controls external to IT
  To govern IT effectively, COBIT defines the activities and risks within IT that need to be managed
COBIT has a process relating to data management
Neither TOGAF nor COBIT is concerned with detailed data management design and implementation

DMBOK, TOGAF and COBIT:

TOGAF defines the process for creating a data architecture as part of an overall enterprise architecture
COBIT provides data governance as part of overall IT governance
DMBOK provides detail for the definition, implementation and operation of data management and utilisation
Can be a precursor to implementing data management
Can provide a maturity model for assessing data management
DMBOK is a specific and comprehensive data-oriented framework

DMBOK, TOGAF and COBIT – Scope and Overlap:

Diagram showing the scope and overlap of DMBOK, COBIT and TOGAF across the data management functions: Data Governance, Data Architecture Management, Data Management, Data Migration, Data Development, Data Operations Management, Reference and Master Data Management, Data Warehousing and Business Intelligence Management, Document and Content Management, Metadata Management, Data Quality Management and Data Security Management

TOGAF and Data Management:

Phase C1: Data Architecture
Phase C2: Solutions and Application Architecture
Phase C1 (a subset of Phase C) relates to defining a data architecture

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Objectives:

The purpose is to define the major types and sources of data necessary to support the business, in a way that is:
  Understandable by stakeholders
  Complete and consistent
  Stable
Define the data entities relevant to the enterprise
Not concerned with the design of logical or physical storage systems or databases

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Overview:


TOGAF Phase C1: Information Systems Architectures - Data Architecture - Approach - Key Considerations for Data Architecture:

Data Management
It is important to understand and address data management issues
A structured and comprehensive approach to data management enables the effective use of data to capitalise on its competitive advantages
Clear definition of which application components in the landscape will serve as the system of record or reference for enterprise master data
Will there be an enterprise-wide standard that all application components, including software packages, need to adopt?
Understand how data entities are utilised by business functions, processes and services
Understand how and where enterprise data entities are created, stored, transported and reported
The level and complexity of data transformations required to support the information exchange needs between applications
The requirement for software to support data integration with external organisations

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Approach - Key Considerations for Data Architecture:

Data Migration
Identify data migration requirements and also provide indicators as to the level of transformation for new/changed applications
Ensure the target application has quality data when it is populated
Ensure an enterprise-wide common data definition is established to support the transformation

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Approach - Key Considerations for Data Architecture:

Data Governance
Ensures that the organisation has the necessary dimensions in place to enable the data transformation:
Structure – ensures the organisation has the necessary structure and standards bodies to manage the data entity aspects of the transformation
Management System – ensures the organisation has the necessary management system and data-related programs to manage the governance aspects of data entities throughout their lifecycle
People – addresses what data-related skills and roles the organisation requires for the transformation

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Outputs:

Refined and updated versions of the Architecture Vision phase deliverables:
  Statement of Architecture Work
  Validated data principles, business goals and business drivers
Draft Architecture Definition Document:
  Baseline Data Architecture
  Target Data Architecture:
    Business data model
    Logical data model
    Data management process models
    Data Entity/Business Function matrix
  Views corresponding to the selected viewpoints, addressing key stakeholder concerns
Draft Architecture Requirements Specification:
  Gap analysis results
  Data interoperability requirements
  Relevant technical requirements
  Constraints on the Technology Architecture about to be designed
  Updated business requirements
  Updated application requirements
Data Architecture components of an Architecture Roadmap

COBIT Structure:


COBIT and Data Management:

COBIT objective DS11 Manage Data sits within the Deliver and Support (DS) domain
Effective data management requires identification of data requirements
The data management process includes establishing effective procedures to manage the media library, backup and recovery of data, and proper disposal of media
Effective data management helps ensure the quality, timeliness and availability of business data

COBIT and Data Management:

The objective is control over the IT process of managing data that meets the business requirement for IT of optimising the use of information and ensuring information is available as required
Focuses on maintaining the completeness, accuracy, availability and protection of data
Involves:
  Backing up data and testing restoration
  Managing onsite and offsite storage of data
  Securely disposing of data and equipment
Measured by:
  User satisfaction with availability of data
  Percent of successful data restorations
  Number of incidents where sensitive data were retrieved after media were disposed of

COBIT Process DS11 Manage Data:

DS11.1 Business Requirements for Data Management
Establish arrangements to ensure that source documents expected from the business are received, all data received from the business are processed, all output required by the business is prepared and delivered, and restart and reprocessing needs are supported

DS11.2 Storage and Retention Arrangements
Define and implement procedures for data storage and archival, so data remain accessible and usable
Procedures should consider retrieval requirements, cost-effectiveness, continued integrity and security requirements
Establish storage and retention arrangements to satisfy legal, regulatory and business requirements for documents, data, archives, programmes, reports and messages (incoming and outgoing) as well as the data (keys, certificates) used for their encryption and authentication

DS11.3 Media Library Management System
Define and implement procedures to maintain an inventory of onsite media and ensure their usability and integrity
Procedures should provide for timely review and follow-up on any discrepancies noted

DS11.4 Disposal
Define and implement procedures to prevent access to sensitive data and software from equipment or media when they are disposed of or transferred to another use
Procedures should ensure that data marked as deleted or to be disposed of cannot be retrieved

DS11.5 Backup and Restoration
Define and implement procedures for backup and restoration of systems, data and documentation in line with business requirements and the continuity plan
Verify compliance with the backup procedures, and verify the ability to and time required for successful and complete restoration
Test backup media and the restoration process

DS11.6 Security Requirements for Data Management
Establish arrangements to identify and apply security requirements applicable to the receipt, processing, physical storage and output of data and sensitive messages
Includes physical records, data transmissions and any data stored offsite
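The DS11 "measured by" metrics lend themselves to mechanical computation from operational records. A minimal sketch over hypothetical restore-test and media-disposal logs – the record field names are invented for illustration:

```python
# Hypothetical operational records; field names are invented for illustration.
restore_tests = [
    {"backup_id": "b1", "restored_ok": True},
    {"backup_id": "b2", "restored_ok": True},
    {"backup_id": "b3", "restored_ok": False},
    {"backup_id": "b4", "restored_ok": True},
]
disposal_incidents = [
    {"media_id": "m7", "sensitive_data_retrieved": True},
    {"media_id": "m9", "sensitive_data_retrieved": False},
]

def pct_successful_restorations(tests):
    """DS11 indicator: percent of successful data restorations."""
    if not tests:
        return 0.0
    return 100.0 * sum(t["restored_ok"] for t in tests) / len(tests)

def post_disposal_retrievals(incidents):
    """DS11 indicator: number of incidents where sensitive data
    were retrieved after media were disposed of."""
    return sum(1 for i in incidents if i["sensitive_data_retrieved"])

print(pct_successful_restorations(restore_tests))    # 75.0
print(post_disposal_retrievals(disposal_incidents))  # 1
```

In practice these figures would be fed from backup-software reports and incident-management records rather than hand-built lists.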

COBIT Data Management Goals and Metrics:

Activity Goals:
  Backing up data and testing restoration
  Managing onsite and offsite storage of data
  Securely disposing of data and equipment
Are measured by Key Performance Indicators:
  Frequency of testing of backup media
  Average time for data restoration
These drive the Process Goals:
  Maintain the completeness, accuracy, validity and accessibility of stored data
  Secure data during disposal of media
  Effectively manage storage media
Are measured by Process Key Goal Indicators:
  % of successful data restorations
  # of incidents where sensitive data were retrieved after media were disposed of
  # of downtime or data integrity incidents caused by insufficient storage capacity
These in turn drive the IT goals, measured by IT Key Goal Indicators:
  Occurrences of inability to recover data critical to business processes
  User satisfaction with availability of data
  Incidents of noncompliance with laws due to storage management issues

Data Management Body of Knowledge (DMBOK):


Data Management Body of Knowledge (DMBOK):

DMBOK is a generalised and comprehensive framework for managing data across the entire lifecycle
Developed by DAMA (the Data Management Association)
DMBOK provides a detailed framework to assist the development and implementation of data management processes and procedures and ensures all requirements are addressed
Enables effective and appropriate data management across the organisation
Provides awareness and visibility of data management issues and requirements

Data Management Body of Knowledge (DMBOK):

Not a solution to your data management needs
A framework and methodology for developing and implementing an appropriate solution
A generalised framework to be customised to meet specific needs
Provides a work breakdown structure for a data management project, allowing the effort to be assessed
No magic bullet

Scope and Structure of the Data Management Body of Knowledge (DMBOK):

Data Management Environmental Elements
Data Management Functions

DMBOK Data Management Functions:


DMBOK Data Management Functions:

Data Governance - planning, supervision and control over data management and use
Data Architecture Management - defining the blueprint for managing data assets
Data Development - analysis, design, implementation, testing, deployment and maintenance
Data Operations Management - providing support from data acquisition to purging
Data Security Management - ensuring privacy, confidentiality and appropriate access
Data Quality Management - defining, monitoring and improving data quality
Reference and Master Data Management - managing master versions and replicas
Data Warehousing and Business Intelligence Management - enabling reporting and analysis
Document and Content Management - managing data found outside of databases
Metadata Management - integrating, controlling and providing metadata

DMBOK Data Management Environmental Elements:


DMBOK Data Management Environmental Elements:

Goals and Principles - the directional business goals of each function and the fundamental principles that guide the performance of each function
Activities - each function is composed of lower-level activities, sub-activities, tasks and steps
Primary Deliverables - information and physical databases and documents created as interim and final outputs of each function; some deliverables are essential, some are generally recommended, and others are optional depending on circumstances
Roles and Responsibilities - the business and IT roles involved in performing and supervising the function, and the specific responsibilities of each role in that function; many roles will participate in multiple functions
Practices and Techniques - common and popular methods and procedures used to perform the processes and produce the deliverables; may also include common conventions, best practice recommendations and alternative approaches, without elaboration
Technology - categories of supporting technology such as software tools, standards and protocols, product selection criteria and learning curves
Organisation and Culture - issues such as management metrics, critical success factors, reporting structures, budgeting, resource allocation, expectations and attitudes, style, culture and approach to change management

DMBOK Data Management Functions and Environmental Elements:

Matrix: each of the seven environmental elements (Goals and Principles; Activities; Primary Deliverables; Roles and Responsibilities; Practices and Techniques; Technology; Organisation and Culture) applies across the scope of each data management function (Data Governance; Data Architecture Management; Data Development; Data Operations Management; Data Security Management; Data Quality Management; Reference and Master Data Management; Data Warehousing and Business Intelligence Management; Document and Content Management; Metadata Management)

Scope of the Data Management Body of Knowledge (DMBOK) Data Management Framework:

Hierarchy: Function, Activity, Sub-Activity (not in all cases)
Each activity is classified as one (or more) of:
Planning Activities (P) - activities that set the strategic and tactical course for other data management activities; may be performed on a recurring basis
Development Activities (D) - activities undertaken within implementation projects and recognised as part of the systems development lifecycle (SDLC), creating data deliverables through analysis, design, building, testing, preparation and deployment
Control Activities (C) - supervisory activities performed on an ongoing basis
Operational Activities (O) - service and support activities performed on an ongoing basis
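The Function/Activity hierarchy and the P/D/C/O classification can be modelled as a small data structure. A sketch under the assumption that an activity carries one or more group codes; the example activities are paraphrased from the data governance goals elsewhere in this document, and the class layout itself is illustrative:

```python
from dataclasses import dataclass, field

# DMBOK activity group codes
PLANNING, DEVELOPMENT, CONTROL, OPERATIONAL = "P", "D", "C", "O"

@dataclass
class Activity:
    name: str
    groups: set                                          # one or more of P/D/C/O
    sub_activities: list = field(default_factory=list)   # not in all cases

@dataclass
class Function:
    name: str
    activities: list

# Illustrative function, paraphrasing data governance goals from this document
data_governance = Function(
    name="Data Governance",
    activities=[
        Activity("Define data strategies and policies", {PLANNING}),
        Activity("Track regulatory compliance", {CONTROL}),
        Activity("Manage and resolve data-related issues", {CONTROL, OPERATIONAL}),
    ],
)

def activities_in_group(function: Function, group: str):
    """Select a function's activities by group, e.g. to scope a sub-project."""
    return [a.name for a in function.activities if group in a.groups]
```

For example, `activities_in_group(data_governance, CONTROL)` returns the activities that belong in an ongoing supervisory sub-project.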

Activity Groups Within Functions:

Activity groups are classifications of data management activities
The activity groups are: Planning Activities, Development Activities, Control Activities and Operational Activities
Use the activity groupings to define the scope of data management sub-projects and identify the appropriate tasks: analysis and design, implementation, operational improvement, and management and administration

DMBOK Function and Activity Structure:


DMBOK Function and Activity - Planning Activities:


DMBOK Function and Activity - Control Activities:


DMBOK Function and Activity - Development Activities:


DMBOK Function and Activity - Operational Activities:


DMBOK Environmental Elements Structure:


Data Governance:


Data Governance:

A core function of the Data Management Framework
Interacts with and influences each of the surrounding data management functions
Data governance is the exercise of authority and control (planning, monitoring and enforcement) over the management of data assets
The data governance function guides how all other data management functions are performed
High-level, executive data stewardship
Data governance is not the same thing as IT governance
Data governance is focused exclusively on the management of data assets

Data Governance – Definition and Goals:

Definition: the exercise of authority and control (planning, monitoring and enforcement) over the management of data assets
Goals:
To define, approve and communicate data strategies, policies, standards, architecture, procedures and metrics
To track and enforce regulatory compliance and conformance to data policies, standards, architecture and procedures
To sponsor, track and oversee the delivery of data management projects and services
To manage and resolve data-related issues
To understand and promote the value of data assets

Data Governance - Overview:

Inputs: Business Goals, Business Strategies, IT Objectives, IT Strategies, Data Needs, Data Issues, Regulatory Requirements
Suppliers: Business Executives, IT Executives, Data Stewards, Regulatory Bodies
Tools: Intranet Website, E-Mail, Metadata Tools, Metadata Repository, Issue Management Tools, Data Governance KPI Dashboard
Participants: Executive Data Stewards, Coordinating Data Stewards, Business Data Stewards, Data Professionals, DM Executive, CIO
Primary Deliverables: Data Policies, Data Standards, Resolved Issues, Data Management Projects and Services, Quality Data and Information, Recognised Data Value
Consumers: Data Producers, Knowledge Workers, Managers and Executives, Data Professionals, Customers
Metrics: Data Value, Data Management Cost, Achievement of Objectives, # of Decisions Made, Steward Representation / Coverage, Data Professional Headcount, Data Management Process Maturity

Data Governance Function, Activities and Sub-Activities:


Data Governance:

Data governance is accomplished most effectively as an ongoing programme and a continual improvement process
Every data governance programme is unique, taking into account distinctive organisational and cultural issues and the immediate data management challenges and opportunities
Data governance is at the core of managing data assets

Data Governance - Possible Organisation Structure:


Data Governance Shared Decision Making:

Decision making is shared between business and IT under the data governance model
Business decisions include: enterprise information model, business operating model, information needs, information specifications, quality requirements, funding, issue resolution
Shared decisions include: enterprise information management strategy, policies, standards, metrics and services
IT decisions include: IT leadership, capital investments, research and development, database architecture, data integration architecture, data warehousing and business intelligence architecture, metadata architecture, technical metadata

Data Stewardship:

Formal accountability for business responsibilities ensuring effective control and use of data assets
A data steward is a business leader and/or recognised subject matter expert designated as accountable for these responsibilities
Data stewards manage data assets on behalf of others and in the best interests of the organisation
They represent the data interests of all stakeholders, including but not limited to the interests of their own functional departments and divisions
They protect, manage, and leverage data resources
They must take an enterprise perspective to ensure the quality and effective use of enterprise data

Data Stewardship - Roles:

Executive Data Stewards - provide data governance and make high-level data stewardship decisions
Coordinating Data Stewards - lead and represent teams of business data stewards in discussions across teams and with executive data stewards
Business Data Stewards - subject matter experts who work with data management professionals on an ongoing basis to define and control data

Data Stewardship Roles Across Data Management Functions - 1:

Data steward responsibilities (spanning executive, coordinating and business data stewards) by data management function:
Data Architecture Management - review, validate, approve, maintain and refine data architecture; review and approve the enterprise data architecture; integrate specifications, resolving differences; define data requirements specifications
Data Development - validate physical data models and database designs; participate in database testing and conversion; define data requirements and specifications
Data Operations Management - define requirements for data recovery, retention and performance; help identify, acquire, and control externally sourced data
Data Security Management - provide security, privacy and confidentiality requirements; identify and resolve data security issues; assist in data security audits; classify information confidentiality
Reference and Master Data Management - control the creation, update, and retirement of code values and other reference data; define master data management requirements; identify and help resolve issues

Data Stewardship Roles Across Data Management Functions - 2:

Data Warehousing and Business Intelligence Management - provide business intelligence requirements and management metrics; identify and help resolve business intelligence issues
Document and Content Management - define enterprise taxonomies and resolve content management issues
Metadata Management - create and maintain business metadata (names, meanings, business rules); define metadata access and integration needs; use metadata to make effective data stewardship and governance decisions
Data Quality Management - define data quality requirements and business rules; test application edits and validations; assist in the analysis, certification, and auditing of data quality; lead clean-up efforts; identify ways to address the causes of poor data quality; promote data quality awareness

Data Strategy:

A high-level course of action to achieve high-level goals
A data strategy is a data management programme strategy: a plan for maintaining and improving data quality, integrity, security and access
It should address all data management functions relevant to the organisation

Elements of Data Strategy:

Vision for data management
Summary business case for data management
Guiding principles, values, and management perspectives
Mission and long-term directional goals of data management
Management measures of data management success
Short-term data management programme objectives
Descriptions of data management roles and business units, along with a summary of their responsibilities and decision rights
Descriptions of data management programme components and initiatives
Outline of the data management implementation roadmap
Scope boundaries

Data Strategy:

Data Management Scope Statement - goals and objectives for a defined planning horizon, and the roles, organisations, and individual leaders accountable for achieving these objectives
Data Management Programme Charter - overall vision, business case, goals, guiding principles, measures of success, critical success factors and recognised risks
Data Management Implementation Roadmap - identifies specific programmes, projects, task assignments, and delivery milestones

Data Policies:

Statements of intent and fundamental rules governing the creation, acquisition, integrity, security, quality, and use of data and information
More fundamental, global, and business critical than data standards
Describe what to do and what not to do
There should be few data policies, stated briefly and directly

Data Policies:

Possible topics for data policies:
Data modeling and other data development activities
Development and use of data architecture
Data quality expectations, roles, and responsibilities
Data security, including confidentiality classification policies, intellectual property policies, personal data privacy policies, general data access and usage policies, and data access by external parties
Database recovery and data retention
Access and use of externally sourced data
Sharing data internally and externally
Data warehousing and business intelligence
Unstructured data - electronic files and physical records

Data Architecture:

The enterprise data model and other aspects of data architecture are sponsored at the data governance level
Pay particular attention to the alignment of the enterprise data model with key business strategies, processes, business units and systems
Includes:
Data technology architecture
Data integration architecture
Data warehousing and business intelligence architecture
Metadata architecture

Data Standards and Procedures:

Include naming standards, requirement specification standards, data modeling standards, database design standards, architecture standards and procedural standards for each data management function
Standards must be effectively communicated, monitored, enforced and periodically re-evaluated
Data management procedures are the methods, techniques, and steps followed to accomplish a specific activity or task

Data Standards and Procedures:

Possible topics for data standards and procedures:
Data modeling and architecture standards, including data naming conventions, definition standards, standard domains, and standard abbreviations
Standard business and technical metadata to be captured, maintained, and integrated
Data model management guidelines and procedures
Metadata integration and usage procedures
Standards for database recovery and business continuity, database performance, data retention, and external data acquisition
Data security standards and procedures
Reference data management control procedures
Match / merge and data cleansing standards and procedures
Business intelligence standards and procedures
Enterprise content management standards and procedures, including use of enterprise taxonomies, support for legal discovery, document and e-mail retention, electronic signatures, report formatting standards and report distribution approaches
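Naming-convention standards of the kind listed above lend themselves to automated enforcement. A minimal sketch, assuming an illustrative convention (lower_snake_case names, a standard abbreviation list, and a maximum length - these specific values are assumptions, not DMBOK prescriptions):

```python
import re

# Illustrative standards: these values are assumptions, not DMBOK prescriptions
MAX_NAME_LENGTH = 30
STANDARD_ABBREVIATIONS = {"number": "num", "amount": "amt", "identifier": "id"}
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def check_column_name(name: str) -> list[str]:
    """Return a list of naming-standard violations for a proposed column name."""
    violations = []
    if not NAME_PATTERN.match(name):
        violations.append("name must be lower_snake_case")
    if len(name) > MAX_NAME_LENGTH:
        violations.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    for word, abbrev in STANDARD_ABBREVIATIONS.items():
        if word in name.split("_"):
            violations.append(f"use standard abbreviation '{abbrev}' for '{word}'")
    return violations

print(check_column_name("customer_number"))   # flags the abbreviation rule
print(check_column_name("CustomerNum"))       # flags the case rule
```

A check like this can run in a model review or build pipeline, turning the documented standard into the "monitored, enforced" control the slide calls for.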

Regulatory Compliance:

Most organisations are impacted by government and industry regulations
Many of these regulations dictate how data and information are to be managed
Compliance is generally mandatory
Data governance guides the implementation of adequate controls to ensure, document, and monitor compliance with data-related regulations

Regulatory Compliance:

Data governance needs to work with the business to find the best answers to the following regulatory compliance questions:
How relevant is a regulation? Why is it important for us?
How do we interpret it? What policies and procedures does it require?
Do we comply now? How do we comply now?
How should we comply in the future? What will it take? When will we comply?
How do we demonstrate and prove compliance?
How do we monitor compliance? How often do we review compliance?
How do we identify and report non-compliance?
How do we manage and rectify non-compliance?

Issue Management:

Data governance assists in identifying, managing, and resolving data-related issues:
Data quality issues
Data naming and definition conflicts
Business rule conflicts and clarifications
Data security, privacy, and confidentiality issues
Regulatory non-compliance issues
Non-conformance issues (policies, standards, architecture, and procedures)
Conflicting policies, standards, architecture, and procedures
Conflicting stakeholder interests in data and information
Organisational and cultural change management issues
Issues regarding data governance procedures and decision rights
Negotiation and review of data sharing agreements

Issue Management, Control and Escalation:

Data governance implements issue controls and procedures for:
Identifying, capturing, logging and updating issues
Tracking the status of issues
Documenting stakeholder viewpoints and resolution alternatives
Holding objective, neutral discussions where all viewpoints are heard
Escalating issues to higher levels of authority
Determining, documenting and communicating issue resolutions
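The issue controls above - logging, status tracking, escalation and resolution - can be sketched as a simple issue log. The statuses, escalation levels and the example issue are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class DataIssue:
    issue_id: int
    description: str
    issue_type: str                 # e.g. "data quality", "naming conflict"
    status: str = "open"            # illustrative lifecycle: open -> escalated -> resolved
    escalation_level: int = 0       # 0 = steward team; higher = more senior authority
    resolution: str = ""
    viewpoints: list = field(default_factory=list)

class IssueLog:
    def __init__(self):
        self.issues = {}

    def log(self, issue: DataIssue):
        """Capture and log a new issue."""
        self.issues[issue.issue_id] = issue

    def escalate(self, issue_id: int):
        """Raise an unresolved issue to the next level of authority."""
        issue = self.issues[issue_id]
        issue.escalation_level += 1
        issue.status = "escalated"

    def resolve(self, issue_id: int, resolution: str):
        """Record the documented resolution so it can be communicated."""
        issue = self.issues[issue_id]
        issue.status = "resolved"
        issue.resolution = resolution

log = IssueLog()
log.log(DataIssue(1, "Conflicting definitions of 'active customer'", "naming conflict"))
log.escalate(1)   # no agreement reached at steward team level
log.resolve(1, "Adopt the marketing definition; document in business glossary")
```

In practice this role is usually played by an issue management tool, but the lifecycle it must support is exactly the one sketched here.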

Data Management Projects:

The data management roadmap sets out a course of action for initiating and/or improving data management functions
It consists of an assessment of current functions, a definition of the target environment and target objectives, and a transition plan outlining the steps required to reach these targets, including an approach to organisational change management
Every data management project should follow the project management standards of the organisation

Data Asset Valuation:

Data and information are truly assets because they have business value, tangible or intangible
There are different approaches to estimating the value of data assets:
Identify the direct and indirect business benefits derived from use of the data
Identify the cost of data loss - the impact of not having the current amount and quality of data
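One hedged way to express the two valuation approaches numerically is to combine realised benefits with avoided loss. All figures here are invented for illustration:

```python
# Illustrative valuation of a data asset by benefits plus avoided cost of loss.
# All monetary figures and the recovery period are hypothetical assumptions.

def data_asset_value(direct_benefits, indirect_benefits,
                     cost_of_loss_per_day, outage_days):
    """Estimate value as realised benefits plus the loss that owning the data avoids."""
    avoided_loss = cost_of_loss_per_day * outage_days
    return direct_benefits + indirect_benefits + avoided_loss

# e.g. a customer database: revenue attributed to its use, efficiency gains,
# and the cost of operating without it for an estimated recovery period
value = data_asset_value(direct_benefits=500_000,
                         indirect_benefits=120_000,
                         cost_of_loss_per_day=25_000,
                         outage_days=10)
print(value)  # 870000
```

The point is not the arithmetic but that each input forces a concrete business conversation: what revenue depends on this data, and what would its loss actually cost per day?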

Data Architecture Management:


Data Architecture Management:

Concerned with defining and maintaining specifications that:
Provide a standard common business vocabulary
Express strategic data requirements
Outline high-level integrated designs to meet these requirements
Align with enterprise strategy and related business architecture
Data architecture is an integrated set of specification artifacts used to define data requirements, guide integration and control of data assets, and align data investments with business strategy
It includes formal data names, comprehensive data definitions, effective data structures, precise data integrity rules, and robust data documentation

Data Architecture Management – Definition and Goals:

Definition: defining the data needs of the enterprise and designing the master blueprints to meet those needs
Goals:
To plan with vision and foresight to provide high quality data
To identify and define common data requirements
To design conceptual structures and plans to meet the current and long-term data requirements of the enterprise

Data Architecture Management - Overview:

Inputs: Business Goals, Business Strategies, Business Architecture, Process Architecture, IT Objectives, IT Strategies, Data Strategies, Data Issues and Needs, Technical Architecture
Suppliers: Executives, Data Stewards, Data Producers, Information Consumers
Tools: Data Modeling Tools, Model Management Tool, Metadata Repository, Office Productivity Tools
Participants: Data Stewards, Subject Matter Experts (SMEs), Data Architects, Data Analysts and Modelers, Other Enterprise Architects, DM Executive and Managers, CIO and Other Executives, Database Administrators, Data Model Administrator
Primary Deliverables: Enterprise Data Model, Information Value Chain Analysis, Data Technology Architecture, Data Integration / MDM Architecture, DW / BI Architecture, Metadata Architecture, Enterprise Taxonomies and Namespaces, Document Management Architecture, Metadata
Consumers: Data Producers, Knowledge Workers, Managers and Executives, Data Professionals, Customers
Metrics: Data Value, Data Management Cost, Achievement of Objectives, # of Decisions Made, Steward Representation / Coverage, Data Professional Headcount, Data Management Process Maturity

Enterprise Data Architecture:

An integrated set of specifications and documents:
Enterprise Data Model - the core of enterprise data architecture
Information Value Chain Analysis - aligns data with business processes and other enterprise architecture components
Related Data Delivery Architecture - including database architecture, data integration architecture, data warehousing / business intelligence architecture, document content architecture, and metadata architecture

Data Architecture Management Activities:

Understand Enterprise Information Needs
Develop and Maintain the Enterprise Data Model
Analyse and Align With Other Business Models
Define and Maintain the Database Architecture
Define and Maintain the Data Integration Architecture
Define and Maintain the Data Warehouse / Business Intelligence Architecture
Define and Maintain Enterprise Taxonomies and Namespaces
Define and Maintain the Metadata Architecture

Understanding Enterprise Information Needs:

In order to create an enterprise data architecture, the organisation must first define its information needs
An enterprise data model is a way of capturing and defining enterprise information needs and data requirements - the master blueprint for enterprise-wide data integration
The enterprise data model is a critical input to all future systems development projects and the baseline for additional data requirements analysis
Evaluate the current inputs and outputs required by the organisation, both to and from internal and external parties

Develop and Maintain the Enterprise Data Model:

Data is the set of facts collected about business entities
A data model is a set of data specifications that reflect data requirements and designs
An enterprise data model is an integrated, subject-oriented data model defining the critical data produced and consumed across the organisation
Define and analyse data requirements, then design logical and physical data structures that support these requirements

Enterprise Data Model:


Enterprise Data Model:

Build an enterprise data model in layers
Focus on the most critical business subject areas

Subject Area Model:

A list of major subject areas that collectively express the essential scope of the enterprise
Important to the success of the entire enterprise data model
The list of enterprise subject areas becomes one of the most significant organisation classifications
It must be acceptable to organisation stakeholders and useful as the organising framework for data governance, data stewardship, and further enterprise data modeling

Conceptual Data Model:

The conceptual data model defines business entities and their relationships
Business entities are the primary organisational structures in a conceptual data model; the business needs data about business entities
Includes a glossary containing the business definitions and other metadata associated with business entities and their relationships
Assists business understanding and the reconciliation of terms and their meanings
Provides the framework for developing integrated information systems to support both transactional processing and business intelligence
Depicts how the enterprise sees information

Enterprise Logical Data Models:

Logical data models contain a level of detail below the conceptual data model
They contain the essential data attributes for each entity
Essential data attributes are those without which the enterprise cannot function - which can be a subjective decision

Other Enterprise Data Model Components:

Data Steward Responsibility Assignments - for subject areas, entities, attributes, and/or reference data value sets
Valid Reference Data Values - controlled value sets for codes and/or labels and their business meaning
Data Quality Specifications - rules for essential data attributes, such as accuracy / precision requirements, currency (timeliness), integrity rules, nullability, formatting, match / merge rules, and/or audit requirements
Entity Life Cycles - show the different lifecycle states of the most important entities and the trigger events that change an entity from one state to another
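Data quality specifications of the kind listed above can be expressed as executable rules. A minimal sketch; the attribute names and the specific rules for this hypothetical customer entity are illustrative assumptions:

```python
import re
from datetime import date

# Illustrative quality rules (nullability, format, range) for a hypothetical
# 'customer' entity - not drawn from any real model
RULES = {
    "customer_id":   {"nullable": False, "format": r"^C\d{6}$"},
    "email":         {"nullable": True,  "format": r"^[^@\s]+@[^@\s]+$"},
    "date_of_birth": {"nullable": True,  "max_value": date.today()},
}

def validate_record(record: dict) -> list[str]:
    """Check one record against the data quality specifications."""
    errors = []
    for attr, rule in RULES.items():
        value = record.get(attr)
        if value is None:
            if not rule["nullable"]:
                errors.append(f"{attr}: must not be null")
            continue
        fmt = rule.get("format")
        if fmt and not re.match(fmt, str(value)):
            errors.append(f"{attr}: fails format rule")
        max_value = rule.get("max_value")
        if max_value is not None and value > max_value:
            errors.append(f"{attr}: exceeds maximum value")
    return errors

print(validate_record({"customer_id": "C000123", "email": "a@example.com"}))  # []
print(validate_record({"customer_id": None, "email": "not-an-email"}))
```

Capturing the rules as data (the RULES structure) rather than code keeps them reviewable by the business data stewards who own them.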

Analyse and Align with Other Business Models:

Information value-chain analysis maps the relationships between enterprise model elements and other business models
The business value chain identifies the functions of an organisation that contribute directly or indirectly to the organisation's goals

Define and Maintain the Data Technology Architecture:

The data technology architecture guides the selection and integration of data-related technology
It defines standard tool categories, preferred tools in each category, and technology standards and protocols for technology integration
Technology categories include:
Database management systems (DBMS)
Database management utilities
Data modelling and model management tools
Business intelligence software for reporting and analysis
Extract-transform-load (ETL), changed data capture (CDC), and other data integration tools
Data quality analysis and data cleansing tools
Metadata management software, including metadata repositories

Define and Maintain the Data Technology Architecture:

Classify technology architecture components as:
Current - currently supported and used
Deployment - deployed for use in the next 1-2 years
Strategic - expected to be available for use in the next 2+ years
Retirement - the organisation has retired it or intends to retire it this year
Preferred - preferred for use by most applications
Containment - limited to use by certain applications
Emerging - being researched and piloted for possible future deployment
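These lifecycle classifications can be captured directly in a technology inventory. The category names follow the list above; the inventory entries are invented:

```python
from enum import Enum

class TechStatus(Enum):
    CURRENT = "currently supported and used"
    DEPLOYMENT = "deployed for use in the next 1-2 years"
    STRATEGIC = "expected to be available in the next 2+ years"
    RETIREMENT = "retired or to be retired this year"
    PREFERRED = "preferred for use by most applications"
    CONTAINMENT = "limited to use by certain applications"
    EMERGING = "being researched and piloted"

# Hypothetical inventory entries mapping technologies to their lifecycle status
inventory = {
    "relational DBMS X": TechStatus.PREFERRED,
    "legacy reporting tool Y": TechStatus.RETIREMENT,
    "graph database Z": TechStatus.EMERGING,
}

# A simple query the architecture team might run: what is being retired?
to_retire = [tech for tech, status in inventory.items()
             if status is TechStatus.RETIREMENT]
print(to_retire)  # ['legacy reporting tool Y']
```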

Define and Maintain the Data Integration Architecture:

Defines how data flows through all systems from beginning to end
It is both data architecture and application architecture, because it includes both the databases and the applications that control the flow of data into the system, between databases, and back out of the system

Define and Maintain the Data Warehouse / Business Intelligence Architecture :

Focuses on how data changes and snapshots are stored in data warehouse systems for maximum usefulness and performance
The data integration architecture shows how data moves from source systems through staging databases into data warehouses and data marts
The business intelligence architecture defines how decision support makes data available, including the selection and use of business intelligence tools

Define and Maintain Enterprise Taxonomies and Namespaces:

A taxonomy is a hierarchical structure used for organising topics
Organisations develop their own taxonomies to organise collective thinking about topics
The overall enterprise data architecture includes organisational taxonomies
Definitions of terms used in such taxonomies should be consistent with the enterprise data model
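A taxonomy's hierarchical structure can be sketched as a simple tree. The subject areas here are invented examples, not a recommended enterprise taxonomy:

```python
# Hypothetical enterprise taxonomy as nested dictionaries:
# each key is a topic, each value is its sub-topics
taxonomy = {
    "Party": {
        "Customer": {"Retail Customer": {}, "Corporate Customer": {}},
        "Supplier": {},
    },
    "Product": {"Goods": {}, "Services": {}},
}

def find_path(tree: dict, term: str, path=()) -> tuple:
    """Return the path from the taxonomy root to a term, or () if absent."""
    for node, children in tree.items():
        current = path + (node,)
        if node == term:
            return current
        found = find_path(children, term, current)
        if found:
            return found
    return ()

print(find_path(taxonomy, "Retail Customer"))
# ('Party', 'Customer', 'Retail Customer')
```

Locating a term's full path is exactly what content management and navigation systems do with the enterprise taxonomy, which is why its terms must stay consistent with the enterprise data model.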

Define and Maintain the Metadata Architecture:

The metadata architecture is the design for the integration of metadata across software tools, repositories, directories, glossaries, and data dictionaries
It defines the managed flow of metadata: how metadata is created, integrated, controlled, and accessed
A metadata repository is the core of any metadata architecture
The focus of metadata architecture is to ensure the quality, integration, and effective use of metadata
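At its simplest, a metadata repository maps each managed data element to its business and technical metadata. A minimal sketch; the element names, definitions and lineage are hypothetical:

```python
# Minimal sketch of a metadata repository keyed by qualified element name.
# All entries are invented for illustration.
repository = {
    "sales.order.order_total": {
        "business_name": "Order Total",
        "definition": "Total value of an order including tax",
        "steward": "Finance business data steward",
        "source_systems": ["order entry"],
        "consumed_by": ["data warehouse", "finance reports"],
    },
}

def lineage(element: str) -> dict:
    """Return where an element comes from and where it flows to."""
    meta = repository[element]
    return {"from": meta["source_systems"], "to": meta["consumed_by"]}

print(lineage("sales.order.order_total"))
```

Real metadata repositories add versioning, access control and automated harvesting from tools, but the core is this mapping from element to business meaning, ownership and flow.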

Data Architecture Management Guiding Principles:

Data architecture is an integrated set of specification master blueprints used to define data requirements, guide data integration, control data assets, and align data investments with business strategy
Enterprise data architecture is part of the overall enterprise architecture, along with process architecture, business architecture, systems architecture, and technology architecture
Enterprise data architecture includes three major categories of specifications: the enterprise data model, information value chain analysis, and data delivery architecture
Enterprise data architecture is about more than just data - it helps to establish a common business vocabulary
An enterprise data model is an integrated, subject-oriented data model defining the essential data used across an entire organisation
Information value-chain analysis defines the critical relationships between data, processes, roles and organisations, and other enterprise elements
Data delivery architecture defines the master blueprint for how data flows across databases and applications
Architectural frameworks like TOGAF help organise collective thinking about architecture

Data Development:


Data Development:

The analysis, design, implementation, deployment, and maintenance of data solutions to maximise the value of data resources to the enterprise
A subset of project activities within the system development lifecycle, focused on defining data requirements, designing the data solution components, and implementing these components
The primary data solution components are databases and other data structures

Data Development – Definition and Goals:

Definition: designing, implementing, and maintaining solutions to meet the data needs of the enterprise
Goals:
Identify and define data requirements
Design data structures and other solutions to these requirements
Implement and maintain solution components that meet these requirements
Ensure solution conformance to data architecture and standards as appropriate
Ensure the integrity, security, usability, and maintainability of structured data assets

Data Development - Overview:

Inputs: Business Goals and Strategies, Data Needs and Strategies, Data Standards, Data Architecture, Process Architecture, Application Architecture, Technical Architecture
Suppliers: Data Stewards, Subject Matter Experts, IT Steering Committee, Data Governance Council, Data Architects and Analysts, Software Developers, Data Producers, Information Consumers
Tools: Data Modeling Tools, Database Management Systems, Software Development Tools, Testing Tools, Data Profiling Tools, Model Management Tools, Configuration Management Tools, Office Productivity Tools
Participants: Data Stewards and SMEs, Data Architects and Analysts, Database Administrators, Data Model Administrators, Software Developers, Project Managers, DM Executives and Other IT Management
Primary Deliverables: Data Requirements and Business Rules, Conceptual Data Models, Logical Data Models and Specifications, Physical Data Models and Specifications, Metadata (Business and Technical), Data Modeling and DB Design Standards, Data Model and DB Design Reviews, Version Controlled Data Models, Test Data, Development and Test Databases, Information Products, Data Access Services, Data Integration Services, Migrated and Converted Data
Consumers: Data Producers, Knowledge Workers, Managers and Executives, Customers, Data Professionals, Other IT Professionals

Data Development Function, Activities and Sub-Activities:


Data Development - Principles:

Data development activities are an integral part of the software development lifecycle
Data modeling is an essential technique for effective data management and system design
Conceptual and logical data modeling express business and application requirements, while physical data modeling represents solution design
Data modeling and database design define detailed solution component specifications
Data modeling and database design balance tradeoffs and needs
Data professionals should collaborate with other project team members to design information products and data access and integration interfaces
Data modeling and database design should follow documented standards
Design reviews should cover all data models and designs, to ensure they meet business requirements and follow design standards
Data models represent valuable knowledge resources and should be carefully managed and controlled through library, configuration, and change management to ensure data model quality and availability
Database administrators and other data professionals play important roles in the construction, testing, and deployment of databases and related application systems

Data Modeling, Analysis, and Solution Design:

Data modeling is an analysis and design method used to define and analyse data requirements and to design data structures that support these requirements
A data model is a set of data specifications and related diagrams that reflect data requirements and designs
Data modeling is a complex process involving interactions between people and technology that must not compromise the integrity or security of the data
Good data models accurately express and effectively communicate data requirements and quality solution design

Data Model:

The purposes of a data model are:
Communication - a data model is a bridge to understanding data between people with different levels and types of experience. Data models help us understand a business area, an existing application, or the impact of modifying an existing structure. Data models may also facilitate training new business and/or technical staff
Formalisation - a data model documents a single, precise definition of data requirements and data-related business rules
Scope - a data model can help explain the data context and scope of purchased application packages
Data models that include the same data may differ by:
Scope - expressing a perspective about data in terms of function (business view or application view), realm (process, department, division, enterprise, or industry view), and time (current state, short-term future, long-term future)
Focus - basic and critical concepts (conceptual view), detailed but independent of context (logical view), or optimised for a specific technology and use (physical view)

Analyse Information Requirements:

Information is relevant and timely data in context
To identify information requirements, first identify business information needs, often in the context of one or more business processes
Business processes (and the underlying IT systems) consume information output from other business processes
Requirements analysis includes the elicitation, organisation, documentation, review, refinement, approval, and change control of business requirements
Some of these requirements identify business needs for data and information
Logical data modeling is an important means of expressing business data requirements

Develop and Maintain Conceptual Data Models:

A visual, high-level perspective on a subject area of importance to the business
Contains the basic and critical business entities within a given realm and function, with a description of each entity and the relationships between entities
Defines the meanings of the essential business vocabulary
Reflects the data associated with a business process or application function
Independent of technology and usage context

Develop and Maintain Conceptual Data Models:

January 9, 2011 131 Develop and Maintain Conceptual Data Models Entities A data entity is a collection of data about something that the business deems important and worthy of capture Entities appear in conceptual or logical data models Relationships Business rules define constraints on what can and cannot be done Data Rules – define constraints on how data relates to other data Action Rules - instructions on what to do when data elements contain certain values
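The two kinds of business rules described above can be sketched as simple functions: a data rule as a predicate over related data, an action rule as a conditional update. All names, fields, and thresholds here are invented for illustration, not taken from DMBOK:

```python
# Illustrative sketch of business rules as functions (names hypothetical).

def data_rule_order_has_customer(order, customer_ids):
    """Data rule: every order must reference an existing customer."""
    return order["customer_id"] in customer_ids

def action_rule_flag_large_order(order, threshold=10_000):
    """Action rule: when the amount exceeds a threshold, mark for review."""
    if order["amount"] > threshold:
        order["review_required"] = True
    return order

customer_ids = {1, 2, 3}
order = {"customer_id": 2, "amount": 25_000}
assert data_rule_order_has_customer(order, customer_ids)
order = action_rule_flag_large_order(order)
assert order["review_required"] is True
```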

Develop and Maintain Logical Data Models:

January 9, 2011 132 Develop and Maintain Logical Data Models Detailed representation of data requirements and the business rules that govern data quality Independent of any technology or specific implementation technical constraints Extension of a conceptual data model Logical data models transform conceptual data model structures by normalisation and abstraction Normalisation is the process of applying rules to organise business complexity into stable data structures Abstraction is the redefinition of data entities, elements, and relationships by removing details to broaden the applicability of data structures to a wider class of situations
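The normalisation step can be illustrated with a minimal sketch: customer details repeated across flat order rows are factored out into a single customer structure referenced by a surrogate key. The data and field names are invented for illustration:

```python
# Minimal sketch of normalisation: customer details repeated across flat
# order rows are factored out into one customer row per customer.
flat_rows = [
    {"order_id": 1, "customer": "Acme", "city": "Dublin", "amount": 100},
    {"order_id": 2, "customer": "Acme", "city": "Dublin", "amount": 250},
    {"order_id": 3, "customer": "Beta", "city": "Cork", "amount": 75},
]

customers = {}  # name -> customer row; city is now stored once per customer
orders = []     # order rows reference customers by surrogate key
for row in flat_rows:
    if row["customer"] not in customers:
        customers[row["customer"]] = {
            "customer_id": len(customers) + 1,
            "name": row["customer"],
            "city": row["city"],
        }
    orders.append({
        "order_id": row["order_id"],
        "customer_id": customers[row["customer"]]["customer_id"],
        "amount": row["amount"],
    })

assert len(customers) == 2 and len(orders) == 3
```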

Develop and Maintain Physical Data Models:

January 9, 2011 133 Develop and Maintain Physical Data Models Physical data model optimises the implementation of detailed data requirements and business rules in light of technology constraints, application usage, performance requirements, and modeling standards Physical data modeling transforms the logical data model Includes specific decisions Name of each table and column or file and field or schema and element Logical domain, physical data type, length, and nullability of each column or field Default values Primary and alternate unique keys and indexes
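The physical decisions listed above (names, types, nullability, defaults, keys, indexes) can be sketched as DDL. The following uses SQLite purely as a convenient, runnable example; the schema itself is hypothetical:

```python
import sqlite3

# Sketch of physical design decisions expressed as DDL (schema illustrative).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,          -- primary key
    name        TEXT    NOT NULL,             -- nullability decision
    country     TEXT    NOT NULL DEFAULT 'IE' -- default value
);
CREATE UNIQUE INDEX ux_customer_name ON customer(name);  -- alternate key
""")
con.execute("INSERT INTO customer (name) VALUES ('Acme')")
row = con.execute("SELECT country FROM customer").fetchone()
assert row[0] == "IE"   # default value applied by the database
```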

Detailed Data Design:

January 9, 2011 134 Detailed Data Design Detailed data design activities include Detailed physical database design, including views, functions, triggers, and stored procedures Definition of supporting data structures, such as XML schemas and object classes Creation of information products, such as the use of data in screens and reports Definition of data access solutions, including data access objects, integration services, and reporting and analysis services

Design Physical Databases:

January 9, 2011 135 Design Physical Databases Create detailed database implementation specifications Ensure the design meets data integrity requirements Determine the most appropriate physical structure to house and organise the data, such as relational or other type of DBMS, files, OLAP cubes, XML, etc. Determine database resource requirements, such as server size and location, disk space requirements, CPU and memory requirements, and network requirements Create detailed design specifications for data structures, such as relational database tables, indexes, views, OLAP data cubes, XML schemas, etc. Ensure performance requirements are met, including batch and online response time requirements for queries, inserts, updates, and deletes Design for backup, recovery, archiving, and purge processing, ensuring availability requirements are met Design data security implementation, including authentication, encryption needs, application roles and data access and update permissions Review code to ensure that it meets coding standards and will run efficiently

Physical Database Design:

January 9, 2011 136 Physical Database Design Choose a database design based on both a choice of architecture and a choice of technology Base the choice of architecture (for example, relational, hierarchical, network, object, star schema, snowflake, cube, etc.) on data considerations Consider factors such as how long the data needs to be kept, whether it must be integrated with other data or passed across system or application boundaries, and on requirements of data security, integrity, recoverability, accessibility, and reusability Consider organisational or political factors, including organisational biases and developer skill sets, that lean toward a particular technology or vendor

Physical Database Design - Principles:

January 9, 2011 137 Physical Database Design - Principles Performance and Ease of Use - Ensure quick and easy access to data by approved users in a usable and business-relevant form Reusability - The database structure should ensure that, where appropriate, multiple applications would be able to use the data Integrity - The data should always have a valid business meaning and value, regardless of context, and should always reflect a valid state of the business Security - True and accurate data should always be immediately available to authorised users, but only to authorised users Maintainability - Perform all data work at a cost that yields value by ensuring that the cost of creating, storing, maintaining, using, and disposing of data does not exceed its value to the organisation

Physical Database Design - Questions:

January 9, 2011 138 Physical Database Design - Questions What are the performance requirements? What is the maximum permissible time for a query to return results, or for a critical set of updates to occur? What are the availability requirements for the database? What are the window(s) of time for performing database operations? How often should database backups and transaction log backups be done (i.e., what is the longest period of time we can risk non-recoverability of the data)? What is the expected size of the database? What is the expected rate of growth of the data? At what point can old or unused data be archived or deleted? How many concurrent users are anticipated? What sorts of data virtualisation are needed to support application requirements in a way that does not tightly couple the application to the database schema? Will other applications need the data? If so, what data and how? Will users expect to be able to do ad-hoc querying and reporting of the data? If so, how and with which tools? What, if any, business or application processes does the database need to implement? (e.g., trigger code that does cross-database integrity checking or updating, application classes encapsulated in database procedures or functions, database views that provide table recombination for ease of use or security purposes, etc.). Are there application or developer concerns regarding the database, or the database development process, that need to be addressed? Is the application code efficient? Can a code change relieve a performance issue?

Performance Modifications:

January 9, 2011 139 Performance Modifications Consider how the database will perform when applications make requests to access and modify data Indexing can improve query performance in many cases Denormalisation is the deliberate transformation of a normalised logical data model into tables with redundant data
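The indexing point can be demonstrated with a small sketch using SQLite's EXPLAIN QUERY PLAN, which shows the access path changing from a full table scan to an index search once an index exists (table and index names are illustrative):

```python
import sqlite3

# Sketch: observe how an index changes the query access path in SQLite.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER, val TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?)",
                [(i, str(i)) for i in range(1000)])

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE id = 500").fetchall()
assert "SCAN" in plan[0][3]          # full table scan without an index

con.execute("CREATE INDEX ix_t_id ON t(id)")
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE id = 500").fetchall()
assert "ix_t_id" in plan[0][3]       # index search after indexing
```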

Physical Database Design Documentation:

January 9, 2011 140 Physical Database Design Documentation Create physical database design document to assist implementation and maintenance

Design Information Products:

January 9, 2011 141 Design Information Products Design data-related deliverables Design screens and reports to meet business data requirements Ensure consistent use of business data terminology Reporting services give business users the ability to execute both pre-developed and ad-hoc reports Analysis services give business users the ability to slice and dice data across multiple dimensions Dashboards display a wide array of analytics indicators, such as charts and graphs, efficiently Scorecards display information that indicates scores or calculated evaluations of performance Use data integrated from multiple databases as input to software for business process automation that coordinates multiple business processes across disparate platforms Data integration is a component of Enterprise Application Integration (EAI) software, enabling data to be easily passed from application to application across disparate platforms

Design Data Access Services:

January 9, 2011 142 Design Data Access Services May be necessary to access and combine data from remote databases with data in the local database Goal is to enable easy and inexpensive reuse of data across the organisation preventing, wherever possible, redundant and inconsistent data Options include Linked database connections SOA web services Message brokers Data access classes ETL Replication
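One of the options above, a data access class, can be sketched as follows. Callers use methods rather than SQL, so the schema can change behind a stable interface; the CustomerDAO name and schema are invented for illustration:

```python
import sqlite3

# Minimal sketch of a data access class (names and schema hypothetical).
class CustomerDAO:
    def __init__(self, con):
        self.con = con
        self.con.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT)")

    def add(self, name):
        cur = self.con.execute(
            "INSERT INTO customer (name) VALUES (?)", (name,))
        return cur.lastrowid

    def find(self, customer_id):
        row = self.con.execute(
            "SELECT id, name FROM customer WHERE id = ?",
            (customer_id,)).fetchone()
        return {"id": row[0], "name": row[1]} if row else None

dao = CustomerDAO(sqlite3.connect(":memory:"))
cid = dao.add("Acme")
assert dao.find(cid) == {"id": cid, "name": "Acme"}
```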

Design Data Integration Services:

January 9, 2011 143 Design Data Integration Services Critical aspect of database design is determining appropriate update mechanisms and database transactions for recovery Define source-to-target mappings and data transformation designs for extract-transform-load (ETL) programs and other technology for ongoing data movement, cleansing and integration Design programs and utilities for data migration and conversion from old data structures to new data structures
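A source-to-target mapping driving a transform step can be sketched as below; the source field names, target names, and transformation rules are all illustrative:

```python
# Sketch of a source-to-target mapping for an ETL transform step
# (field names and rules are hypothetical).
mapping = {
    "CUST_NM": ("customer_name", str.strip),
    "CTRY_CD": ("country_code", str.upper),
    "ORD_AMT": ("order_amount", float),
}

def transform(source_row):
    """Apply the mapping: rename each source field and clean its value."""
    return {target: fn(source_row[src])
            for src, (target, fn) in mapping.items()}

row = transform({"CUST_NM": " Acme ", "CTRY_CD": "ie", "ORD_AMT": "99.50"})
assert row == {"customer_name": "Acme", "country_code": "IE",
               "order_amount": 99.5}
```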

Data Model and Design Quality Management:

January 9, 2011 144 Data Model and Design Quality Management Balance the needs of information consumers (the people with business requirements for data) and the data producers who capture the data in usable form Time and budget constraints Ensure data resides in data structures that are secure, recoverable, sharable, and reusable, and that this data is as correct, timely, relevant, and usable as possible Balance the short-term versus long-term business data interests of the organisation

Develop Data Modeling and Design Standards:

January 9, 2011 145 Develop Data Modeling and Design Standards Data modeling and database design standards serve as the guiding principles to effectively meet business data needs, conform to data architecture, and ensure data quality Data modeling and database design standards should include A list and description of standard data modeling and database design deliverables A list of standard names, acceptable abbreviations, and abbreviation rules for uncommon words that apply to all data model objects A list of standard naming formats for all data model objects, including attribute and column class words A list and description of standard methods for creating and maintaining these deliverables A list and description of data modeling and database design roles and responsibilities A list and description of all metadata properties captured in data modeling and database design, including both business metadata and technical metadata, with guidelines defining metadata quality expectations and requirements Guidelines for how to use data modeling tools Guidelines for preparing for and leading design reviews

Review Data Model and Database Design Quality:

January 9, 2011 146 Review Data Model and Database Design Quality Conduct requirements reviews and design reviews, including a conceptual data model review, a logical data model review, and a physical database design review

Conceptual and Logical Data Model Reviews:

January 9, 2011 147 Conceptual and Logical Data Model Reviews Conceptual data model and logical data model design reviews should ensure that: Business data requirements are completely captured and clearly expressed in the model, including the business rules governing entity relationships Business (logical) names and business definitions for entities and attributes (business semantics) are clear, practical, consistent, and complementary Data modeling standards, including naming standards, have been followed The conceptual and logical data models have been validated

Physical Database Design Review:

January 9, 2011 148 Physical Database Design Review Physical database design reviews should ensure that: The design meets business, technology, usage, and performance requirements Database design standards, including naming and abbreviation standards, have been followed Availability, recovery, archiving, and purging procedures are defined according to standards Metadata quality expectations and requirements are met in order to properly update any metadata repository The physical data model has been validated

Data Model Validation:

January 9, 2011 149 Data Model Validation Validate data models against modeling standards, business requirements, and database requirements Ensure the model matches applicable modeling standards Ensure the model matches the business requirements Ensure the model matches the database requirements

Manage Data Model Versioning and Integration :

January 9, 2011 150 Manage Data Model Versioning and Integration Data models and other design specifications require change control Each change should include Why the project or situation required the change What and how the object(s) changed, including which tables had columns added, modified, or removed, etc. When the change was approved and when the change was made to the model Who made the change Where the change was made
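The change-record items listed above (why, what, when, who, where) can be captured in a simple structure. This dataclass is a sketch of one possible format, not a prescribed DMBOK layout:

```python
from dataclasses import dataclass
from datetime import date

# Sketch of a change-control record for a data model (structure illustrative).
@dataclass
class ModelChange:
    why: str        # project or situation requiring the change
    what: str       # object(s) changed and how
    approved: date  # when the change was approved
    applied: date   # when the change was made to the model
    who: str        # who made the change
    where: str      # model or environment in which it was made

change = ModelChange(
    why="New regulatory reporting requirement",
    what="Added column customer.tax_id",
    approved=date(2011, 1, 5),
    applied=date(2011, 1, 9),
    who="data modeler",
    where="enterprise logical data model",
)
assert change.what.startswith("Added column")
```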

Data Implementation:

January 9, 2011 151 Data Implementation Data implementation consists of data management activities that support system building, testing, and deployment Database implementation and change management in the development and test environments Test data creation, including any security procedures Development of data migration and conversion programs, both for project development through the SDLC and for business situations Validation of data quality requirements Creation and delivery of user training Contribution to the development of effective documentation

Implement Development / Test Database Changes:

January 9, 2011 152 Implement Development / Test Database Changes Implement changes to the database that are required during the course of application development Monitor database code to ensure that it is written to the same standards as application code Identify poor SQL coding practices that could lead to errors or performance problems

Create and Maintain Test Data:

January 9, 2011 153 Create and Maintain Test Data Populate databases in the development environment with test data Observe privacy and confidentiality requirements and practices for test data
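One common way to observe privacy and confidentiality requirements is to mask identifying values before they are loaded as test data. The hashing approach below is a sketch of one possible technique, with illustrative names, and is not the only option (synthetic generation and subsetting are alternatives):

```python
import hashlib

# Sketch: mask production identifiers deterministically before use as
# test data, so real values never reach the development environment.
def mask(value, salt="test-env"):
    digest = hashlib.sha256((salt + value).encode()).hexdigest()[:8]
    return f"user_{digest}"

production = ["alice@example.com", "bob@example.com"]
test_rows = [mask(v) for v in production]

assert all(r.startswith("user_") for r in test_rows)
assert "alice@example.com" not in test_rows      # original value removed
assert mask("alice@example.com") == test_rows[0] # deterministic mapping
```

Deterministic masking preserves referential consistency: the same source value always masks to the same test value, so joins across masked tables still work.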

Migrate and Convert Data:

January 9, 2011 154 Migrate and Convert Data Key component of many projects is the migration of legacy data to a new database environment, including any necessary data cleansing and reformatting

Build and Test Information Products:

January 9, 2011 155 Build and Test Information Products Implement mechanisms for integrating data from multiple sources, along with the appropriate metadata to ensure meaningful integration of the data Implement mechanisms for reporting and analysing the data, including online and web-based reporting, ad-hoc querying, BI scorecards, OLAP, portals, and the like Implement mechanisms for replication of the data, if network latency or other concerns make it impractical to service all users from a single data source

Build and Test Data Access Services:

January 9, 2011 156 Build and Test Data Access Services Develop, test, and execute data migration and conversion programs and procedures, first for development and test data and later for production deployment Data requirements should include business rules for data quality to guide the implementation of application edits and database referential integrity constraints Business data stewards and other subject matter experts should validate the correct implementation of data requirements through user acceptance testing
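The point about database referential integrity constraints enforcing business data rules can be sketched with SQLite, which needs foreign-key enforcement switched on explicitly; the schema is illustrative:

```python
import sqlite3

# Sketch of a referential-integrity constraint enforcing a data rule.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite: FKs are off by default
con.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);
""")
con.execute("INSERT INTO customer (id) VALUES (1)")
con.execute("INSERT INTO orders VALUES (1, 1)")       # valid reference
try:
    con.execute("INSERT INTO orders VALUES (2, 99)")  # no such customer
    ok = True
except sqlite3.IntegrityError:
    ok = False
assert ok is False   # the database rejected the orphan row
```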

Validate Information Requirements:

January 9, 2011 157 Validate Information Requirements Test and validate that the solution meets the requirements, and plan deployment, including the development of training and documentation Data requirements may change abruptly in response to changed business requirements, invalid assumptions regarding the data, or reprioritisation of existing requirements Test the implementation of the data requirements and ensure that the application requirements are satisfied

Prepare for Data Deployment:

January 9, 2011 158 Prepare for Data Deployment Leverage the business knowledge captured in data modeling to define clear and consistent language in user training and documentation Business concepts, terminology, definitions, and rules depicted in data models are an important part of application user training Data stewards and data analysts should participate in deployment preparation, including development and review of training materials and system documentation, especially to ensure consistent use of defined business data terminology Help desk support staff also require orientation and training in how system users appropriately access, manipulate, and interpret data Once installed, business data stewards and data analysts should monitor the early use of the system to see that business data requirements are indeed met

Data Operations Management:

January 9, 2011 159 Data Operations Management

Data Operations Management:

January 9, 2011 160 Data Operations Management Data operations management is the development, maintenance, and support of structured data to maximise the value of the data resources to the enterprise and includes Database support Data technology management

Data Operations Management – Definition and Goals:

January 9, 2011 161 Data Operations Management – Definition and Goals Definition Planning, control, and support for structured data assets across the data lifecycle, from creation and acquisition through archival and purge Goals Protect and ensure the integrity of structured data assets Manage the availability of data throughout its lifecycle Optimise performance of database transactions

Data Operations Management - Overview:

January 9, 2011 162 Data Operations Management - Overview
Inputs: Data Requirements, Data Architecture, Data Models, Legacy Data, Service Level Agreements
Suppliers: Executives, IT Steering Committee, Data Governance Council, Data Stewards, Data Architects and Modelers, Software Developers
Tools: Database Management Systems, Data Development Tools, Database Administration Tools, Office Productivity Tools
Participants: Database Administrators, Software Developers, Project Managers, Data Stewards, Data Architects and Analysts, DM Executives and Other IT Management, IT Operators
Primary Deliverables: DBMS Technical Environments; Dev/Test, QA, DR, and Production Databases; Externally Sourced Data; Database Performance; Data Recovery Plans; Business Continuity; Data Retention Plan; Archived and Purged Data
Metrics: Availability, Performance
Consumers: Data Creators, Information Consumers, Enterprise Customers, Data Professionals, Other IT Professionals

Data Operations Management Function, Activities and Sub-Activities:

January 9, 2011 163 Data Operations Management Function, Activities and Sub-Activities

Data Operations Management - Principles:

January 9, 2011 164 Data Operations Management - Principles Write everything down Keep everything Whenever possible, automate a procedure Focus to understand the purpose of each task, manage scope, simplify, do one thing at a time Measure twice, cut once React to problems and issues calmly and rationally, because panic causes more errors Understand the business, not just the technology Work together to collaborate, be accessible, share knowledge Use all of the resources at your disposal Keep up to date

Database Support - Scope:

January 9, 2011 165 Database Support - Scope Ensure the performance and reliability of the database, including performance tuning, monitoring, and error reporting Implement appropriate backup and recovery mechanisms to guarantee the recoverability of the data in any circumstance Implement mechanisms for clustering and failover of the database, if continual data availability is a requirement Implement mechanisms for archiving data

Database Support - Deliverables:

January 9, 2011 166 Database Support - Deliverables A production database environment, including an instance of the DBMS and its supporting server, of a sufficient size and capacity to ensure adequate performance, configured for the appropriate level of security, reliability and availability Mechanisms and processes for controlled implementation and changes to databases into the production environment Appropriate mechanisms for ensuring the availability, integrity, and recoverability of the data in response to all possible circumstances that could result in loss or corruption of data Appropriate mechanisms for detecting and reporting any error that occurs in the database, the DBMS, or the data server Database availability, recovery, and performance in accordance with service level agreements

Implement and Control Database Environments:

January 9, 2011 167 Implement and Control Database Environments Updating DBMS software Maintaining multiple installations, including different DBMS versions Installing and administering related data technology, including data integration software and third party data administration tools Setting and tuning DBMS system parameters Managing database connectivity Tuning operating systems, networks, and transaction processing middleware to work with the DBMS Optimising the use of different storage technology for cost-effective storage

Obtain Externally Sourced Data:

January 9, 2011 168 Obtain Externally Sourced Data Managed approach to data acquisition centralises responsibility for data subscription services Document the external data source in the logical data model and data dictionary Implement the necessary processes to load the data into the database and/or make it available to applications

Plan for Data Recovery:

January 9, 2011 169 Plan for Data Recovery Establish service level agreements (SLAs) with IT data management services organisations for data availability and recovery SLAs set availability expectations, allowing time for database maintenance and backup, and set recovery time expectations for different recovery scenarios, including potential disasters Ensure a recovery plan exists for all databases and database servers, covering all possible scenarios Loss of the physical database server Loss of one or more disk storage devices Loss of a database, including the DBMS master database, temporary storage database, transaction log segment, etc. Corruption of database index or data pages Loss of the database or log segment file system Loss of database or transaction log backup files

Backup and Recover Data:

January 9, 2011 170 Backup and Recover Data Make regular backups of database and the database transaction logs Balance the importance of the data against the cost of protecting it Databases should reside on some sort of managed storage area For critical data, implement some sort of replication facility
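A minimal sketch of a database backup, using SQLite's online backup API as a stand-in for a production backup utility; a real environment would also back up transaction logs and write to managed storage rather than memory:

```python
import sqlite3

# Sketch of a regular database backup using SQLite's online backup API.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE t (x INTEGER)")
source.execute("INSERT INTO t VALUES (42)")

backup = sqlite3.connect(":memory:")   # stands in for a backup destination
source.backup(backup)                  # copy the whole database online

assert backup.execute("SELECT x FROM t").fetchone() == (42,)
```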

Set Database Performance Service Levels:

January 9, 2011 171 Set Database Performance Service Levels Database performance has two components - availability and speed An unavailable database has a performance measure of zero SLAs between data management services organisations and data owners define expectations for database performance Availability is the percentage of time that a system or database can be used for productive work Availability requirements are constantly increasing, raising the business risks and costs of unavailable data
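The availability measure described above can be sketched as a simple calculation over the agreed service window; the figures are illustrative:

```python
# Sketch of the availability measure: percentage of the agreed service
# window in which the database was usable (figures illustrative).
def availability(window_hours, downtime_hours):
    return 100.0 * (window_hours - downtime_hours) / window_hours

# A 24x7 month (720 h) with 3.6 h of unplanned outage:
assert round(availability(720, 3.6), 1) == 99.5
```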

Set Database Performance Service Levels:

January 9, 2011 172 Set Database Performance Service Levels Factors affecting availability include Manageability - ability to create and maintain an effective environment Recoverability - ability to reestablish service after interruption, and correct errors caused by unforeseen events or component failures Reliability - ability to deliver service at specified levels for a stated period Serviceability - ability to determine the existence of problems, diagnose their causes, and repair / solve the problems Tasks to ensure databases stay online and operational Running database backup utilities Running database reorganisation utilities Running statistics gathering utilities Running integrity checking utilities Automating the execution of these utilities Exploiting table space clustering and partitioning Replicating data across mirror databases to ensure high availability

Set Database Performance Service Levels:

January 9, 2011 173 Set Database Performance Service Levels Cause of loss of database availability Planned and unplanned outages Loss of the server hardware Disk hardware failure Operating system failure DBMS software failure Application problems Network failure Data center site loss Security and authorisation problems Corruption of data (due to bugs, poor design, or user error) Loss of database objects Loss of data Data replication failure Severe performance problems Recovery failures Human error

Monitor and Tune Database Performance:

January 9, 2011 174 Monitor and Tune Database Performance Optimise database performance both proactively and reactively, by monitoring performance and by responding to problems quickly and effectively Run activity and performance reports against both the DBMS and the server on a regular basis including during periods of heavy activity When performance problems occur, use the monitoring and administration tools of the DBMS to help identify the source of the problem Memory allocation (buffer / cache for data) Locking and blocking Failure to update database statistics Poor SQL coding Insufficient indexing Application activity Increase in the number, size, or use of databases Database volatility

Support Specialised Databases:

January 9, 2011 175 Support Specialised Databases Some specialised situations require specialised types of databases

Data Technology Management:

January 9, 2011 176 Data Technology Management Managing data technology should follow the same principles and standards for managing any technology Use a reference model for technology management such as Information Technology Infrastructure Library (ITIL)

Understand Data Technology Requirements:

January 9, 2011 177 Understand Data Technology Requirements Understand the data and information needs of the business Understand the best possible applications of technology to solve business problems and take advantage of new business opportunities Understand the requirements of a data technology before determining what technical solution to choose for a particular situation What problem does this data technology mean to solve? What does this data technology do that is unavailable in other data technologies? What does this data technology not do that is available in other data technologies? Are there any specific hardware requirements for this data technology? Are there any specific Operating System requirements for this data technology? Are there any specific software requirements or additional applications required for this data technology to perform as advertised? Are there any specific storage requirements for this data technology? Are there any specific network or connectivity requirements for this data technology? Does this data technology include data security functionality? If not, what other tools does this technology work with that provide data security functionality? Are there any specific skills required to support this data technology? Do we have those skills in-house or must we acquire them?

Define the Data Technology Architecture:

January 9, 2011 178 Define the Data Technology Architecture Data technology architecture addresses three core questions What technologies are standard (which are required, preferred, or acceptable)? Which technologies apply to which purposes and circumstances? In a distributed environment, which technologies exist where, and how does data move from one node to another? Technology is never free - even open-source technology requires maintenance Technology should always be regarded as the means to an end, rather than the end itself Buying the same technology that everyone else is using, and using it in the same way, does not create business value or competitive advantage for the organisation

Define the Data Technology Architecture:

January 9, 2011 179 Define the Data Technology Architecture Technology categories include Database management systems (DBMS) Database management utilities Data modelling and model management tools Business intelligence software for reporting and analysis Extract-transform-load (ETL), changed data capture (CDC), and other data integration tools Data quality analysis and data cleansing tools Metadata management software, including metadata repositories

Define the Data Technology Architecture:

January 9, 2011 180 Define the Data Technology Architecture Classify technology architecture components as Current - currently supported and used Deployment - deployed for use in the next 1-2 years Strategic - expected to be available for use in the next 2+ years Retirement - the organisation has retired or intends to retire this year Preferred - preferred for use by most applications Containment - limited to use by certain applications Emerging - being researched and piloted for possible future deployment Create a road map for the organisation consisting of these components to help govern future technology decisions

Evaluate Data Technology:

January 9, 2011 181 Evaluate Data Technology Selecting appropriate data related technology, particularly the appropriate database management technology, is an important data management responsibility Data technologies to be researched and evaluated include: Database management systems (DBMS) software Database utilities, such as backup and recovery tools, and performance monitors Data modeling and model management software Database management tools, such as editors, schema generators, and database object generators Business intelligence software for reporting and analysis Extract-transform-load (ETL) and other data integration tools Data quality analysis and data cleansing tools Data virtualisation technology Metadata management software, including metadata repositories

Evaluate Data Technology:

January 9, 2011 182 Evaluate Data Technology Use a standard technology evaluation process Understand user needs, objectives, and related requirements Understand the technology in general Identify available technology alternatives Identify the features required Weigh the importance of each feature Understand each technology alternative Evaluate and score each technology alternative’s ability to meet requirements Calculate total scores and rank technology alternatives by score Evaluate the results, including the weighted criteria Present the case for selecting the highest ranking alternative
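The weighting, scoring, and ranking steps above can be sketched as a small weighted matrix; the features, weights, scores, and product names below are invented for illustration:

```python
# Sketch of the weighted scoring step: each alternative's feature scores
# are weighted, totalled, and ranked (all values illustrative).
weights = {"scalability": 0.5, "cost": 0.3, "support": 0.2}

alternatives = {
    "DBMS A": {"scalability": 8, "cost": 6, "support": 9},
    "DBMS B": {"scalability": 7, "cost": 9, "support": 6},
}

totals = {
    name: sum(weights[f] * score for f, score in scores.items())
    for name, scores in alternatives.items()
}
ranked = sorted(totals, key=totals.get, reverse=True)

assert ranked[0] == "DBMS A"   # 0.5*8 + 0.3*6 + 0.2*9 = 7.6 vs 7.4
```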

Evaluate Data Technology:

January 9, 2011 183 Evaluate Data Technology Selecting strategic DBMS software is very important Factors to consider when selecting DBMS software include: Product architecture and complexity Application profile, such as transaction processing, business intelligence, and personal profiles Organisational appetite for technical risk Hardware platform and operating system support Availability of supporting software tools Performance benchmarks Scalability Software, memory, and storage requirements Available supply of trained technical professionals Cost of ownership, such as licensing, maintenance, and computing resources Vendor reputation Vendor support policy and release schedule Customer references

Install and Administer Data Technology:

January 9, 2011 184 Install and Administer Data Technology Need to deploy new technology products in development / test, QA / certification, and production environments Create and document processes and procedures for administering the product Cost and complexity of implementing new technology is usually underestimated Features and benefits are usually overestimated Start with small pilot projects and proof-of-concept (POC) implementations to get a good idea of the true costs and benefits before proceeding with larger production implementation

Inventory and Track Data Technology Licenses:

January 9, 2011 185 Inventory and Track Data Technology Licenses Comply with licensing agreements and regulatory requirements Track and conduct yearly audits of software license and annual support costs Track other costs such as server lease agreements and other fixed costs Use data to determine the total cost-of-ownership (TCO) for each type of technology and technology product Evaluate technologies and products that are becoming obsolete, unsupported, less useful, or too expensive

Support Data Technology Usage and Issues:

January 9, 2011 186 Support Data Technology Usage and Issues Work with business users and application developers to Ensure the most effective use of the technology Explore new applications of the technology Address any problems or issues that surface from its use Training is important to effective understanding and use of any technology

Data Security Management:

January 9, 2011 187 Data Security Management

Data Security Management:

January 9, 2011 188 Data Security Management Planning, development, and execution of security policies and procedures to provide proper authentication, authorisation, access, and auditing of data and information assets Effective data security policies and procedures ensure that the right people can use and update data in the right way, and that all inappropriate access and update is restricted Effective data security management function establishes governance mechanisms that are easy enough to abide by on a daily operational basis

Data Security Management – Definition and Goals:

January 9, 2011 189 Data Security Management – Definition and Goals Definition Planning, development, and execution of security policies and procedures to provide proper authentication, authorisation, access, and auditing of data and information. Goals Enable appropriate, and prevent inappropriate, access and change to data assets Meet regulatory requirements for privacy and confidentiality Ensure the privacy and confidentiality needs of all stakeholders are met

Data Security Management:

January 9, 2011 190 Data Security Management Protect information assets in alignment with privacy and confidentiality regulations and business requirements Stakeholder Concerns - organisations must recognise the privacy and confidentiality needs of their stakeholders, including clients, patients, students, citizens, suppliers, or business partners Government Regulations - government regulations protect some of the stakeholder security interests. Some regulations restrict access to information, while other regulations ensure openness, transparency, and accountability Proprietary Business Concerns - each organisation has its own proprietary data to protect - ensuring competitive advantage provided by intellectual property and intimate knowledge of customer needs and business partner relationships is a cornerstone in any business plan Legitimate Access Needs - data security implementers must also understand the legitimate needs for data access

Data Security Requirements and Procedures:

January 9, 2011 191 Data Security Requirements and Procedures Data security requirements and the procedures to meet these requirements Authentication - validate users are who they say they are Authorisation - identify the right individuals and grant them the right privileges to specific, appropriate views of data Access - enable these individuals and their privileges in a timely manner Audit - review security actions and user activity to ensure compliance with regulations and conformance with policy and standards
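A minimal sketch of the four requirements working together; the user store, roles, and privileges are invented for illustration:

```python
# Sketch only: a real system would use a directory service and hashed,
# salted credentials managed by a security product.
import hashlib

USERS = {"alice": hashlib.sha256(b"s3cret").hexdigest()}  # authentication store
ROLES = {"alice": "analyst"}
PRIVILEGES = {"analyst": {"read"}, "admin": {"read", "write"}}
AUDIT_LOG = []  # security actions recorded for later audit review

def authenticate(user: str, password: str) -> bool:
    """Validate users are who they say they are."""
    return USERS.get(user) == hashlib.sha256(password.encode()).hexdigest()

def authorised(user: str, action: str) -> bool:
    """Check the user's role grants the requested privilege."""
    return action in PRIVILEGES.get(ROLES.get(user, ""), set())

def access(user: str, password: str, action: str) -> bool:
    """Enable access only when authentication and authorisation both
    pass, and record every attempt for compliance auditing."""
    ok = authenticate(user, password) and authorised(user, action)
    AUDIT_LOG.append((user, action, ok))
    return ok
```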

Data Security Management - Overview:

January 9, 2011 192 Data Security Management - Overview Suppliers: Data Stewards, IT Steering Committee, Data Stewardship Council, Government, Customers Inputs: Business Goals, Business Strategy, Business Rules, Business Process, Data Strategy, Data Privacy Issues, Related IT Policies and Standards Participants: Data Stewards, Data Security Administrators, Database Administrators, BI Analysts, Data Architects, DM Leader, CIO/CTO, Help Desk Analysts Tools: Database Management System, Business Intelligence Tools, Application Frameworks, Identity Management Technologies, Change Control Systems Primary Deliverables: Data Security Policies, Data Privacy and Confidentiality Standards, User Profiles, Passwords and Memberships, Data Security Permissions, Data Security Controls, Data Access Views, Document Classifications, Authentication and Access History, Data Security Audits Consumers: Data Producers, Knowledge Workers, Managers, Executives, Customers, Data Professionals

Data Security Management Function, Activities and Sub-Activities:

January 9, 2011 193 Data Security Management Function, Activities and Sub-Activities

Data Operations Management - Principles:

January 9, 2011 194 Data Operations Management - Principles Be a responsible trustee of data about all parties. Understand and respect the privacy and confidentiality needs of all stakeholders, be they clients, patients, students, citizens, suppliers, or business partners Understand and comply with all pertinent regulations and guidelines Data-to-process and data-to-role relationship (CRUD Create, Read, Update, Delete) matrices help map data access needs and guide definition of data security role groups, parameters, and permissions Definition of data security requirements and data security policy is a collaborative effort involving IT security administrators, data stewards, internal and external audit teams, and the legal department Identify detailed application security requirements in the analysis phase of every systems development project Classify all enterprise data and information products against a simple confidentiality classification schema Every user account should have a password set by the user following a set of password complexity guidelines, and expiring every 45 to 60 days Create role groups; define privileges by role; and grant privileges to users by assigning them to the appropriate role group. 
Whenever possible, assign each user to only one role group Some level of management must formally request, track, and approve all initial authorisations and subsequent changes to user and group authorisations To avoid data integrity issues with security access information, centrally manage user identity data and group membership data Use relational database views to restrict access to sensitive columns and / or specific rows Strictly limit and carefully consider every use of shared or service user accounts Monitor data access to certain information actively, and take periodic snapshots of data access activity to understand trends and compare against standards criteria Periodically conduct objective, independent, data security audits to verify regulatory compliance and standards conformance, and to analyse the effectiveness and maturity of data security policy and practice In an outsourced environment, be sure to clearly define the roles and responsibilities for data security and understand the chain of custody of data across organisations and roles.
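The principle of using relational views to restrict access to sensitive columns and rows can be sketched as follows; the table, column, and view names are invented, and SQLite stands in for any relational DBMS:

```python
# Sketch of column- and row-level restriction via a view. Role groups
# would be granted SELECT on the view, not on the base table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, name TEXT, salary REAL, active INTEGER);
    INSERT INTO employee VALUES (1, 'Alice', 90000, 1), (2, 'Bob', 80000, 0);
    -- The view hides the sensitive salary column and the inactive rows.
    CREATE VIEW active_employee AS
        SELECT id, name FROM employee WHERE active = 1;
""")
rows = conn.execute("SELECT * FROM active_employee").fetchall()
```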

Understand Data Security Needs and Regulatory Requirements:

January 9, 2011 195 Understand Data Security Needs and Regulatory Requirements Distinguish between business rules and procedures and the rules imposed by application software products Common for systems to have their own unique set of data security requirements over and above those required by business processes

Business Requirements:

January 9, 2011 196 Business Requirements Implementing data security within an enterprise requires an understanding of business requirements Business needs of an enterprise define the degree of rigidity required for data security Business rules and processes define the security touch points Data-to-process and data-to-role relationship matrices are useful tools to map these needs and guide definition of data security role-groups, parameters, and permissions Identify detailed application security requirements in the analysis phase of every systems development project
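A data-to-role CRUD matrix of the kind mentioned above might be represented as follows; the roles, entities, and entries are illustrative assumptions:

```python
# Sketch of a data-to-role relationship matrix. Each cell lists the
# CRUD operations (Create, Read, Update, Delete) a role may perform
# on a data entity; the matrix guides role-group permission definition.
CRUD_MATRIX = {
    ("clerk",   "Order"):    "CRU",   # clerks create and update orders
    ("clerk",   "Customer"): "R",     # read-only on customer data
    ("manager", "Order"):    "CRUD",  # managers may also delete
    ("manager", "Customer"): "RU",
}

def allowed(role: str, entity: str, op: str) -> bool:
    """op is one of 'C', 'R', 'U', 'D'; absent cells grant nothing."""
    return op in CRUD_MATRIX.get((role, entity), "")
```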

Regulatory Requirements:

January 9, 2011 197 Regulatory Requirements Organisations must comply with a growing set of regulations Some regulations impose security controls on information management

Define Data Security Policy:

January 9, 2011 198 Define Data Security Policy Definition of data security policy based on data security requirements is a collaborative effort involving IT security administrators, data stewards, internal and external audit teams, and the legal department Enterprise IT strategy and standards typically dictate high-level policies for access to enterprise data assets Data security policies are more granular in nature and take a very data-centric approach compared to an IT security policy

Define Data Security Standards:

January 9, 2011 199 Define Data Security Standards No one prescribed way of implementing data security to meet privacy and confidentiality requirements Regulations generally focus on achieving an end without defining the means for achieving it Organisations should design their own security controls, demonstrate that the controls meet the requirements of the law or regulations, and document the implementation of those controls Information technology security standards can also affect Tools used to manage data security Data encryption standards and mechanisms Access guidelines to external vendors and contractors Data transmission protocols over the internet Documentation requirements Remote access standards Security breach incident reporting procedures

Define Data Security Standards:

January 9, 2011 200 Define Data Security Standards Consider physical security, especially with the explosion of portable devices and media, to formulate an effective data security strategy Access to data using mobile devices Storage of data on portable devices such as laptops, DVDs, CDs or USB drives Disposal of these devices in compliance with records management policies An organisation should develop a practical, implementable security policy including data security guiding principles Focus should be on quality and consistency, not on creating a lengthy body of guidelines Execution of the policy requires satisfying the elements of securing information assets: authentication, authorisation, access, and audit Information classification, access rights, role groups, users, and passwords are the means to implementing policy and satisfying these elements

Define Data Security Controls and Procedures:

January 9, 2011 201 Define Data Security Controls and Procedures Implementation and administration of data security policy is primarily the responsibility of security administrators Database security is often one responsibility of database administrators Implement proper controls to meet the objectives of relevant laws Implement a process to validate assigned permissions against a change management system used for tracking all user permission requests

Manage Users, Passwords, and Group Membership:

January 9, 2011 202 Manage Users, Passwords, and Group Membership Role groups enable security administrators to define privileges by role and to grant these privileges to users by enrolling them in the appropriate role group Data consistency in user and group management is a challenge in a mixed IT environment Construct group definitions at a workgroup or business unit level Organise roles in a hierarchy, so that child roles further restrict the privileges of parent roles
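One way to sketch a role hierarchy in which child roles further restrict their parents; the role names and privilege sets are assumptions:

```python
# Sketch: a child role's effective privileges are the intersection of
# its own grants with every ancestor's grants, so a child can only
# narrow, never widen, what its parent allows.
PARENT = {"regional_analyst": "analyst", "analyst": None}
GRANTS = {
    "analyst": {"read_sales", "read_hr"},
    # write_sales is not in the parent's grants, so it will be dropped:
    "regional_analyst": {"read_sales", "write_sales"},
}

def effective(role: str) -> set:
    """Walk up the hierarchy, intersecting grants at each level."""
    privs = set(GRANTS[role])
    parent = PARENT.get(role)
    while parent:
        privs &= GRANTS[parent]
        parent = PARENT.get(parent)
    return privs
```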

Password Standards and Procedures:

January 9, 2011 203 Password Standards and Procedures Passwords are the first line of defence in protecting access to data Every user account should be required to have a password set by the user with a sufficient level of password complexity defined in the security standards
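A complexity check of the kind such standards define might be sketched as follows; the specific rules shown (minimum length and character classes) are illustrative, not a mandated standard:

```python
# Sketch of a password complexity guideline: 8+ characters with at
# least one upper-case letter, lower-case letter, digit, and symbol.
import re

def complex_enough(password: str) -> bool:
    return (len(password) >= 8
            and re.search(r"[A-Z]", password) is not None
            and re.search(r"[a-z]", password) is not None
            and re.search(r"\d", password) is not None
            and re.search(r"[^A-Za-z0-9]", password) is not None)
```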

Manage Data Access Views and Permissions:

January 9, 2011 204 Manage Data Access Views and Permissions Data security management involves not just preventing inappropriate access, but also enabling valid and appropriate access to data Most sets of data do not have any restricted access requirements Control sensitive data access by granting permissions - opt-in Access control degrades when achieved through shared or service accounts Implemented as a convenience for administrators, these accounts often come with enhanced privileges and are untraceable to any particular user or administrator Enterprises using shared or service accounts run the risk of data security breaches Evaluate the use of such accounts carefully, and never use them routinely or by default

Monitor User Authentication and Access Behaviour:

January 9, 2011 205 Monitor User Authentication and Access Behaviour Monitoring authentication and access behaviour is critical because: It provides information about who is connecting and accessing information assets, which is a basic requirement for compliance auditing It alerts security administrators to unforeseen situations, compensating for oversights in data security planning, design, and implementation Monitoring helps detect unusual or suspicious transactions that may warrant further investigation and issue resolution Perform monitoring either actively or passively Automated systems with human checks and balances in place best accomplish both methods
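Passive monitoring via periodic snapshots might be sketched as follows; the baseline figures and the threshold factor are illustrative assumptions:

```python
# Sketch: compare a current snapshot of per-user access counts against
# a baseline and flag deviations for human investigation (the "checks
# and balances" noted above).
baseline = {"alice": 100, "bob": 90}          # typical daily accesses
snapshot = {"alice": 110, "bob": 400, "carol": 50}

def suspicious(baseline: dict, snapshot: dict, factor: float = 3.0) -> list:
    """Flag users whose activity exceeds factor x their baseline, or
    who have no baseline at all (new or unknown accounts)."""
    flagged = []
    for user, count in snapshot.items():
        base = baseline.get(user)
        if base is None or count > factor * base:
            flagged.append(user)
    return sorted(flagged)
```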

Classify Information Confidentiality:

January 9, 2011 206 Classify Information Confidentiality Classify an organisation’s data and information using a simple confidentiality classification schema Most organisations classify the level of confidentiality for information found within documents, including reports A typical classification schema might include the following five confidentiality classification levels: For General Audiences: Information available to anyone, including the general public Internal Use Only: Information limited to employees or members, but with minimal risk if shared Confidential: Information which should not be shared outside the organisation. Client Confidential information may not be shared with other clients Restricted Confidential: Information limited to individuals performing certain roles with the need to know Registered Confidential: Information so confidential that anyone accessing the information must sign a legal agreement to access the data and assume responsibility for its secrecy
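The five-level schema above can be modelled as an ordered classification; the clearance-check rule below is an assumed simplification for illustration:

```python
# Sketch: the levels form an ordered scale, so a reader's clearance
# must be at least the document's classification level.
from enum import IntEnum

class Confidentiality(IntEnum):
    GENERAL = 1      # For General Audiences
    INTERNAL = 2     # Internal Use Only
    CONFIDENTIAL = 3
    RESTRICTED = 4   # Restricted Confidential
    REGISTERED = 5   # Registered Confidential

def may_read(clearance: Confidentiality, doc: Confidentiality) -> bool:
    return clearance >= doc
```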

Audit Data Security:

January 9, 2011 207 Audit Data Security Auditing data security is a recurring control activity with responsibility to analyse, validate, counsel, and recommend policies, standards, and activities related to data security management Auditing is a managerial activity performed with the help of analysts working on the actual implementation and details The goal of auditing is to provide management and the data governance council with objective, unbiased assessments, and rational, practical recommendations Auditing data security is no substitute for effective management of data security Auditing is a supportive, repeatable process, which should occur regularly, efficiently, and consistently

Audit Data Security:

January 9, 2011 208 Audit Data Security Auditing data security includes Analysing data security policy and standards against best practices and needs Analysing implementation procedures and actual practices to ensure consistency with data security goals, policies, standards, guidelines, and desired outcomes Assessing whether existing standards and procedures are adequate and in alignment with business and technology requirements Verifying the organisation is in compliance with regulatory requirements Reviewing the reliability and accuracy of data security audit data Evaluating escalation procedures and notification mechanisms in the event of a data security breach Reviewing contracts, data sharing agreements, and data security obligations of outsourced and external vendors, ensuring they meet their obligations, and ensuring the organisation meets its obligations for externally sourced data Reporting to senior management, data stewards, and other stakeholders on the state of data security within the organisation and the maturity of its practices Recommending data security design, operational, and compliance improvements

Data Security and Outsourcing:

January 9, 2011 209 Data Security and Outsourcing Outsourcing IT operations introduces additional data security challenges and responsibilities Outsourcing increases the number of people who share accountability for data across organisational and geographic boundaries Previously informal roles and responsibilities must now be explicitly defined as contractual obligations Outsourcing contracts must specify the responsibilities and expectations of each role Any form of outsourcing increases risk to the organisation Data security risk is escalated to include the outsource vendor, so any data security measures and processes must look at the risk from the outsource vendor not only as an external risk, but also as an internal risk

Data Security and Outsourcing:

January 9, 2011 210 Data Security and Outsourcing Transferring control, but not accountability, requires tighter risk management and control mechanisms: Service level agreements Limited liability provisions in the outsourcing contract Right-to-audit clauses in the contract Clearly defined consequences to breaching contractual obligations Frequent data security reports from the service vendor Independent monitoring of vendor system activity More frequent and thorough data security auditing Constant communication with the service vendor In an outsourced environment, it is important to maintain and track the lineage, or flow, of data across systems and individuals to maintain a chain of custody

Reference and Master Data Management:

January 9, 2011 211 Reference and Master Data Management

Reference and Master Data Management:

January 9, 2011 212 Reference and Master Data Management Reference and Master Data Management is the ongoing reconciliation and maintenance of reference data and master data Reference Data Management is control over defined domain values (also known as vocabularies), including control over standardised terms, code values and other unique identifiers, business definitions for each value, business relationships within and across domain value lists, and the consistent, shared use of accurate, timely and relevant reference data values to classify and categorise data Master Data Management is control over master data values to enable consistent, shared, contextual use across systems, of the most accurate, timely, and relevant version of truth about essential business entities Reference data and master data provide the context for transaction data

Reference and Master Data Management – Definition and Goals:

January 9, 2011 213 Reference and Master Data Management – Definition and Goals Definition Planning, implementation, and control activities to ensure consistency with a golden version of contextual data values Goals Provide authoritative source of reconciled, high-quality master and reference data Lower cost and complexity through reuse and leverage of standards Support business intelligence and information integration efforts

Reference and Master Data Management - Overview:

January 9, 2011 214 Reference and Master Data Management - Overview Suppliers: Steering Committees, Business Data Stewards, Subject Matter Experts, Data Consumers, Standards Organisations, Data Providers Inputs: Business Drivers, Data Requirements, Policy and Regulations, Standards, Code Sets, Master Data, Transactional Data Participants: Data Stewards, Subject Matter Experts, Data Architects, Data Analysts, Application Architects, Data Governance Council, Data Providers, Other IT Professionals Tools: Reference Data Management Applications, Master Data Management Applications, Data Modeling Tools, Process Modeling Tools, Metadata Repositories, Data Profiling Tools, Data Cleansing Tools, Data Integration Tools, Business Process and Rule Engines, Change Management Tools Primary Deliverables: Master and Reference Data Requirements, Data Models and Documentation, Reliable Reference and Master Data, Golden Record Data Lineage, Data Quality Metrics and Reports, Data Cleansing Services Metrics: Reference and Master Data Quality, Change Activity, Issues, Costs, Volume, Use and Re-Use, Availability, Data Steward Coverage Consumers: Application Users, BI and Reporting Users, Application Developers and Architects, Data Integration Developers and Architects, BI Developers and Architects, Vendors, Customers, and Partners

Reference and Master Data Management Function, Activities and Sub-Activities:

January 9, 2011 215 Reference and Master Data Management Function, Activities and Sub-Activities

Reference and Master Data Management - Principles:

January 9, 2011 216 Reference and Master Data Management - Principles Shared reference and master data belongs to the organisation, not to a particular application or department Reference and master data management is an on-going data quality improvement program; its goals cannot be achieved by one project alone Business data stewards are the authorities accountable for controlling reference data values. Business data stewards work with data professionals to improve the quality of reference and master data Golden data values represent the organisation’s best efforts at determining the most accurate, current, and relevant data values for contextual use. New data may prove earlier assumptions to be false. Therefore, apply matching rules with caution, and ensure that any changes that are made are reversible Replicate master data values only from the database of record Request, communicate, and, in some cases, approve of changes to reference data values before implementation

Reference Data:

January 9, 2011 217 Reference Data Reference data is data used to classify or categorise other data Business rules usually dictate that reference data values conform to one of several allowed values In all organisations, reference data exists in virtually every database Reference tables link via foreign keys into other relational database tables, and the referential integrity functions within the database management system ensure only valid values from the reference tables are used in other tables
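The referential-integrity mechanism described above can be demonstrated as follows; SQLite is used for illustration and the table names and codes are invented:

```python
# Sketch: a reference table of allowed domain values, linked via a
# foreign key so the DBMS rejects any value not in the reference table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires enabling FK checks
conn.executescript("""
    CREATE TABLE country_code (code TEXT PRIMARY KEY, name TEXT);
    INSERT INTO country_code VALUES ('IE', 'Ireland'), ('GB', 'United Kingdom');
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        country TEXT REFERENCES country_code(code)  -- only valid codes allowed
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'IE')")      # valid reference
try:
    conn.execute("INSERT INTO customer VALUES (2, 'XX')")  # no such code
except sqlite3.IntegrityError:
    pass  # the DBMS enforced referential integrity
```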

Master Data:

January 9, 2011 218 Master Data Master data is data about the business entities that provide context for business transactions Master data is the authoritative, most accurate data available about key business entities, used to establish the context for transactional data Master data values are considered golden Master Data Management is the process of defining and maintaining how master data will be created, integrated, maintained, and used throughout the enterprise

Master Data Challenges:

January 9, 2011 219 Master Data Challenges What are the important roles, organisations, places, and things referenced repeatedly? What data is describing the same person, organisation, place, or thing? Where is this data stored? What is the source for the data? Which data is more accurate? Which data source is more reliable and credible? Which data is most current? What data is relevant for specific needs? How do these needs overlap or conflict? What data from multiple sources can be integrated to create a more complete view and provide a more comprehensive understanding of the person, organisation, place or thing? What business rules can be established to automate master data quality improvement by accurately matching and merging data about the same person, organisation, place, or thing? How do we identify and restore data that was inappropriately matched and merged? How do we provide our golden data values to other systems across the enterprise? How do we identify where and when data other than the golden values is used?

Party Master Data:

January 9, 2011 220 Party Master Data Includes data about individuals, organisations, and the roles they play in business relationships Customer relationship management (CRM) systems perform MDM for customer data (also called Customer Data Integration (CDI)) Focus is to provide the most complete and accurate information about each and every customer Need to identify duplicate, redundant and conflicting data Party master data issues Complexity of roles and relationships played by individuals and organisations Difficulties in unique identification High number of data sources Business importance and potential impact of the data

Financial Master Data:

January 9, 2011 221 Financial Master Data Includes data about business units, cost centers, profit centers, general ledger accounts, budgets, projections, and projects Financial MDM solutions focus on not only creating, maintaining, and sharing information, but also simulating how changes to existing financial data may affect the organisation’s bottom line

Product Master Data:

January 9, 2011 222 Product Master Data Product master data can consist of information on an organisation’s products and services, or on the entire industry in which the organisation operates, including competitor products and services Product Lifecycle Management (PLM) focuses on managing the lifecycle of a product or service from its conception (such as research), through its development, manufacturing, sale / delivery, service, and disposal

Location Master Data:

January 9, 2011 223 Location Master Data Provides the ability to track and share reference information about different geographies, and create hierarchical relationships or territories based on geographic information to support other processes Different industries require specialised earth science data (geographic data about seismic faults, flood plains, soil, annual rainfall, and severe weather risk areas) and related sociological data (population, ethnicity, income, and terrorism risk), usually supplied from external sources

Understand Reference and Master Data Integration Needs:

January 9, 2011 224 Understand Reference and Master Data Integration Needs Reference and master data requirements are relatively easy to discover and understand for a single application Potentially much more difficult to develop an understanding of these needs across applications, especially across the entire organisation Analysing the root causes of a data quality problem usually uncovers requirements for reference and master data integration Organisations that have successfully managed reference and master data typically have focused on one subject area at a time Analyse all occurrences of a few business entities, across all physical databases and for differing usage patterns

Identify Reference and Master Data Sources and Contributors:

January 9, 2011 225 Identify Reference and Master Data Sources and Contributors Successful organisations first understand the needs for reference and master data Then trace the lineage of this data to identify the original and interim source databases, files, applications, organisations and the individual roles that create and maintain the data Understand both the upstream sources and the downstream needs to capture quality data at its source

Define and Maintain the Data Integration Architecture:

January 9, 2011 226 Define and Maintain the Data Integration Architecture Effective data integration architecture controls the shared access, replication, and flow of data to ensure data quality and consistency, particularly for reference and master data Without data integration architecture, local reference and master data management occurs in application silos, inevitably resulting in redundant and inconsistent data The selected data integration architecture should also provide common data integration services Change request processing, including review and approval Data quality checks on externally acquired reference and master data Consistent application of data quality rules and matching rules Consistent patterns of processing Consistent metadata about mappings, transformations, programs and jobs Consistent audit, error resolution and performance monitoring data Consistent approach to replicating data Establishing master data standards can be a time-consuming task as it may involve multiple stakeholders. Apply the same data standards, regardless of integration technology, to enable effective standardisation, sharing, and distribution of reference and master data

Data Integration Services Architecture:

January 9, 2011 227 Data Integration Services Architecture Diagram components: Data Quality Management; Metadata Management; Integration Metadata; Job Flow and Statistics; Data Acquisition, File Management and Audit; Replication Management; Data Standardisation; Cleansing and Matching; Business Metadata; data stores: Source Data Archives, Rules, Errors, Staging, Reconciled Master Data, Subscriptions

Implement Reference and Master Data Management Solutions:

January 9, 2011 228 Implement Reference and Master Data Management Solutions Reference and master data management solutions are complex Given the variety, complexity, and instability of requirements, no single solution or implementation project is likely to meet all reference and master data management needs Organisations should expect to implement reference and master data management solutions iteratively and incrementally through several related projects and phases

Define and Maintain Match Rules:

January 9, 2011 229 Define and Maintain Match Rules Matching, merging, and linking of data from multiple systems about the same person, group, place, or thing is a major master data management challenge Matching attempts to remove redundancy, to improve data quality, and provide information that is more comprehensive Data matching is performed by applying inference rules Duplicate identification match rules focus on a specific set of fields that uniquely identify an entity and identify merge opportunities without taking automatic action Match-merge rules match records and merge the data from these records into a single, unified, reconciled, and comprehensive record. Match-link rules identify and cross-reference records that appear to relate to a master record without updating the content of the cross-referenced record
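The match rules above can be sketched as duplicate identification followed by a match-merge; the records, fields, normalisation rule, and survivorship rule are illustrative assumptions, not DMBOK-prescribed logic:

```python
# Sketch: records sharing a normalised (name, email) key are treated
# as duplicates and merged into one reconciled golden record, keeping
# source ids as lineage so the merge remains reversible.
records = [
    {"id": 1, "name": "Jane Smith ", "email": "JANE@EXAMPLE.COM", "phone": ""},
    {"id": 2, "name": "jane smith",  "email": "jane@example.com", "phone": "555-0101"},
    {"id": 3, "name": "Bob Jones",   "email": "bob@example.com",  "phone": ""},
]

def match_key(rec: dict) -> tuple:
    """Duplicate-identification rule: same normalised name + email."""
    return (rec["name"].strip().lower(), rec["email"].strip().lower())

def merge(group: list) -> dict:
    """Match-merge rule: one reconciled record, preferring the first
    non-empty value seen for each field (a simple survivorship rule)."""
    golden = {}
    for rec in group:
        for field, value in rec.items():
            if field != "id" and value and field not in golden:
                golden[field] = value
    golden["source_ids"] = [r["id"] for r in group]  # lineage for reversibility
    return golden

groups = {}
for rec in records:
    groups.setdefault(match_key(rec), []).append(rec)
golden_records = [merge(g) for g in groups.values()]
```

A match-link rule would instead record the cross-reference in `source_ids` without collapsing the records into one.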

Establish Golden Records:

January 9, 2011 230 Establish Golden Records Establishing golden master data values requires more inference, application of matching rules, and review of the results

Vocabulary Management and Reference Data:

January 9, 2011 231 Vocabulary Management and Reference Data A vocabulary is a collection of terms / concepts and their relationships Vocabulary management is defining, sourcing, importing, and maintaining a vocabulary and its associated reference data See ANSI/NISO Z39.19 - Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies - http://www.niso.org/kst/reports/standards?step=2&gid=&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a Vocabulary management requires the identification of the standard list of preferred terms and their synonyms Vocabulary management requires data governance, enabling data stewards to assess stakeholder needs

Vocabulary Management and Reference Data:

January 9, 2011 232 Vocabulary Management and Reference Data Key questions to ask to enable vocabulary management What information concepts (data attributes) will this vocabulary support? Who is the audience for this vocabulary? What processes do they support, and what roles do they play? Why is the vocabulary needed? Will it support applications, content management, analytics, and so on? Who identifies and approves the preferred vocabulary and vocabulary terms? What are the current vocabularies different groups use to classify this information? Where are they located? How were they created? Who are their subject matter experts? Are there any security or privacy concerns for any of them? Are there existing standards that can be leveraged to fulfill this need? Are there concerns about using an external standard vs. internal? How frequently is the standard updated and what is the degree of change of each update? Are standards accessible in an easy to import / maintain format in a cost efficient manner?

Defining Golden Master Data Values:

January 9, 2011 233 Defining Golden Master Data Values Golden data values are the data values thought to be the most accurate, current, and relevant for shared, consistent use across applications Determine golden values by analysing data quality, applying data quality rules and matching rules, and incorporating data quality controls into the applications that acquire, create, and update data Establish data quality measurements to set expectations, measure improvements, and help identify root causes of data quality problems Assess data quality through a combination of data profiling activities and verification against adherence to business rules Once the data is standardised and cleansed, the next step is to attempt reconciliation of redundant data through application of matching rules
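A standardisation-and-cleansing pass of the kind that precedes matching might be sketched as follows; the specific formatting rules (whitespace, case, phone digits) are illustrative examples of data quality rules:

```python
# Sketch: normalise a record's fields into a standard form so that
# subsequent matching rules compare like with like.
import re

def standardise(rec: dict) -> dict:
    out = dict(rec)
    out["name"] = " ".join(rec["name"].split()).title()  # collapse spaces, title case
    out["phone"] = re.sub(r"\D", "", rec["phone"])       # keep digits only
    return out

raw = {"name": "  jANE   smith ", "phone": "(555) 010-1234"}
clean = standardise(raw)
```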

Define and Maintain Hierarchies and Affiliations:

January 9, 2011 234 Define and Maintain Hierarchies and Affiliations Vocabularies and their associated reference data sets are often more than lists of preferred terms and their synonyms Affiliation management is the establishment and maintenance of relationships between master data records
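A hierarchy over master data records is often stored as simple parent-child relationships that can be walked upward for roll-up reporting. The structure and names below are hypothetical examples, not a prescribed model.

```python
# Sketch of a master data hierarchy held as parent-child links,
# e.g. cost centres rolling up to divisions. Names are illustrative only.

PARENT_OF = {
    "Cost Centre 101": "Sales Division",
    "Cost Centre 102": "Sales Division",
    "Sales Division": "Corporate",
}

def rollup_path(node):
    """Return the chain of ancestors from a node to the top of the hierarchy."""
    path = [node]
    while node in PARENT_OF:
        node = PARENT_OF[node]
        path.append(node)
    return path
```

Affiliation management maintains links like `PARENT_OF` (and peer-to-peer relationships) as governed reference data rather than as ad hoc application logic.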

Plan and Implement Integration of New Data Sources:

January 9, 2011 235 Plan and Implement Integration of New Data Sources Integrating new reference data sources involves Receiving and responding to new data acquisition requests from different groups Performing data quality assessment services using data cleansing and data profiling tools Assessing data integration complexity and cost Piloting the acquisition of data and its impact on match rules Determining who will be responsible for data quality Finalising data quality metrics

Replicate and Distribute Reference and Master Data:

January 9, 2011 236 Replicate and Distribute Reference and Master Data Reference and master data may be read directly from a database of record, or may be replicated from the database of record to other application databases for transaction processing, and data warehouses for business intelligence Reference data most commonly appears as pick list values in applications Replication aids maintenance of referential integrity
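The referential-integrity point can be made concrete: a consuming application validates transaction values against its replicated copy of the reference set, exactly as a pick list would constrain data entry. Codes and names below are illustrative assumptions.

```python
# Sketch: validating a transaction against a replicated reference data set,
# the way an application pick list constrains input. Codes are illustrative.

REFERENCE_COUNTRY_CODES = {"IE": "Ireland", "GB": "United Kingdom", "US": "United States"}

def validate_transaction(txn):
    """Reject transactions whose country code is not in the replicated
    reference set, preserving referential integrity locally."""
    if txn["country"] not in REFERENCE_COUNTRY_CODES:
        raise ValueError(f"Unknown country code: {txn['country']}")
    return True
```

Because the reference set is replicated from the database of record, every consuming application enforces the same value domain without querying the master system on each transaction.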

Manage Changes to Reference and Master Data:

January 9, 2011 237 Manage Changes to Reference and Master Data Specific individuals have the role of a business data steward with the authority to create, update, and retire reference data Formally control changes to controlled vocabularies and their reference data sets Carefully assess the impact of reference data changes

Data Warehousing and Business Intelligence Management:

January 9, 2011 238 Data Warehousing and Business Intelligence Management

Data Warehousing and Business Intelligence Management:

January 9, 2011 239 Data Warehousing and Business Intelligence Management A Data Warehouse is a combination of two primary components An integrated decision support database Related software programs used to collect, cleanse, transform, and store data from a variety of operational and external sources Both components combine to support historical, analytical, and business intelligence (BI) requirements A Data Warehouse may also include dependent data marts, which are subset copies of a data warehouse database A Data Warehouse includes any data stores or extracts used to support the delivery of data for BI purposes

Data Warehousing and Business Intelligence Management:

January 9, 2011 240 Data Warehousing and Business Intelligence Management Data Warehousing means the operational extract, cleansing, transformation, and load processes and associated control processes that maintain the data contained within a data warehouse The Data Warehousing process focuses on enabling an integrated and historical business context on operational data by enforcing business rules and maintaining appropriate business data relationships and processes that interact with metadata repositories Business Intelligence is a set of business capabilities including Query, analysis, and reporting activity by knowledge workers to monitor and understand the financial and operational health of, and make business decisions about, the enterprise Strategic and operational analytics and reporting on corporate operational data to support business decisions, risk management, and compliance

Data Warehousing and Business Intelligence Management :

January 9, 2011 241 Data Warehousing and Business Intelligence Management Together Data Warehousing and Business Intelligence Management is the collection, integration, and presentation of data to knowledge workers for the purpose of business analysis and decision-making Composed of activities supporting all phases of the decision support life cycle that provides context Moves and transforms data from sources to a common target data store Provides knowledge workers various means of access, manipulation, and reporting of the integrated target data

Data Warehousing and Business Intelligence Management – Definition and Goals:

January 9, 2011 242 Data Warehousing and Business Intelligence Management – Definition and Goals Definition Planning, implementation, and control processes to provide decision support data and support knowledge workers engaged in reporting, query and analysis Goals To support and enable effective business analysis and decision making by knowledge workers To build and maintain the environment / infrastructure to support business intelligence activity, specifically leveraging all the other data management functions to cost effectively deliver consistent integrated data for all BI activity

Data Warehousing and Business Intelligence Management - Overview:

January 9, 2011 243 Data Warehousing and Business Intelligence Management - Overview
Inputs: Business Drivers, BI Data and Access Requirements, Data Quality Requirements, Data Security Requirements, Data Architecture, Technical Architecture, Data Modeling Standards and Guidelines, Transactional Data, Master and Reference Data, Industry and External Data
Suppliers: Executives and Managers, Subject Matter Experts, Data Governance Council, Information Consumers (Internal and External), Data Producers, Data Architects and Analysts
Tools: Database Management Systems, Data Profiling Tools, Data Integration Tools, Data Cleansing Tools, Business Intelligence Tools, Analytic Applications, Data Modeling Tools, Performance Management Tools, Metadata Repository, Data Quality Tools, Data Security Tools
Participants: Business Executives and Managers, DM Execs and Other IT Management, BI Program Manager, SMEs and Other Information Consumers, Data Stewards, Project Managers, Data Architects and Analysts, Data Integration (ETL) Specialists, BI Specialists, Database Administrators, Data Security Administrators, Data Quality Analysts
Primary Deliverables: DW/BI Architecture, Data Warehouses, Data Marts and OLAP Cubes, Dashboards and Scorecards, Analytic Applications, File Extracts (for Data Mining/Statistical Tools), BI Tools and User Environments, Data Quality Feedback Mechanism/Loop
Metrics: Usage Metrics, Customer/User Satisfaction, Subject Area Coverage %, Response/Performance Metrics
Consumers: Knowledge Workers, Managers and Executives, External Customers and Systems, Internal Customers and Systems, Data Professionals, Other IT Professionals

Data Warehousing and Business Intelligence Management Objectives:

January 9, 2011 244 Data Warehousing and Business Intelligence Management Objectives Providing integrated storage of required current and historical data, organised by subject areas Ensuring credible, quality data for all appropriate access capabilities Ensuring a stable, high-performance, reliable environment for data acquisition, data management, and data access Providing an easy-to-use, flexible, and comprehensive data access environment Delivering both content and access to the content in increments appropriate to the organisation’s objectives Leveraging, rather than duplicating, relevant data management component functions such as Reference and Master Data Management, Data Governance, Data Quality, and Metadata Providing an enterprise focal point for data delivery in support of the decisions, policies, procedures, definitions, and standards that arise from data governance Defining, building, and supporting all data stores, data processes, data infrastructure, and data tools that contain integrated, post-transactional, and refined data used for information viewing, analysis, or data request fulfillment Integrating newly discovered data as a result of BI processes into the DW for further analytics and BI use.

Data Warehousing and Business Intelligence Management Function, Activities and Sub-Activities:

January 9, 2011 245 Data Warehousing and Business Intelligence Management Function, Activities and Sub-Activities

Data Warehousing and Business Intelligence Management Principles:

January 9, 2011 246 Data Warehousing and Business Intelligence Management Principles Obtain executive commitment and support as these projects are labour intensive Secure business SMEs as their support and high availability are necessary for getting the correct data and a useful BI solution Be business focused and driven. Make sure DW / BI work is serving real priority business needs and solving burning business problems. Let the business drive the prioritisation Demonstrable data quality is essential Provide incremental value. Ideally deliver in continual 2-3 month segments Provide transparency and self-service. The more context (metadata of all kinds) provided, the more value customers derive. Wisely exposing information about the process reduces calls and increases satisfaction. One size does not fit all. Make sure you find the right tools and products for each of your customer segments Think and architect globally, act and build locally. Let the big picture and end-vision guide the architecture, but build and deliver incrementally, with a much shorter-term, more project-based focus Collaborate with and integrate all other data initiatives, especially those for data governance, data quality, and metadata Start with the end in mind. Let the business priority and scope of end-data-delivery in the BI space drive the creation of the DW content. The main purpose for the existence of the DW is to serve up data to the end business customers via the BI capabilities Summarise and optimise last, not first. Build on the atomic data and add aggregates or summaries as needed for performance, but not to replace the detail.

Understand Business Intelligence Information Needs:

January 9, 2011 247 Understand Business Intelligence Information Needs All projects start with requirements Gathering requirements for DW-BIM projects has both similarities to and differences from gathering requirements for other projects For DW-BIM projects, it is important to understand the broader business context of the targeted business area, as reporting is generalised and exploratory Capturing the actual business vocabulary and terminology is a key to success Document the business context, then explore the details of the actual source data Typically, the ETL portion can consume 60%-70% of a DW-BIM project’s budget and time The DW is often the first place where the pain of poor quality data in source systems and / or data entry functions becomes apparent Creating an executive summary of the identified business intelligence needs is a best practice When starting a DW-BIM programme, a good way to decide where to start is to use a simple assessment of business impact and technical feasibility Technical feasibility will take into consideration things like complexity, availability and state of the data, and the availability of subject matter experts Projects that have high business impact and high technical feasibility are good candidates for starting.

Define and Maintain the DW-BI Architecture:

January 9, 2011 248 Define and Maintain the DW-BI Architecture Successful DW-BIM architecture requires the identification and bringing together of a number of key roles Technical Architect - hardware, operating systems, databases and DW-BIM architecture Data Architect - data analysis, systems of record, data modeling and data mapping ETL Architect / Design Lead - staging and transform, data marts, and schedules Metadata Specialist - metadata interfaces, metadata architecture and contents BI Application Architect / Design Lead - BI tool interfaces and report design, metadata delivery, data and report navigation and delivery Technical requirements including performance, availability, and timing needs are key drivers in developing the DW-BIM architecture The design decisions and principles for what data detail the DW contains is a key design priority for DW-BIM architecture Important that the DW-BIM architecture integrate with the overall corporate reporting architecture

Define and Maintain the DW-BI Architecture:

January 9, 2011 249 Define and Maintain the DW-BI Architecture No DW-BIM effort can be successful without business acceptance of data Business acceptance includes the data being understandable, having verifiable quality and having a demonstrable origin Sign-off by the Business on the data should be part of the User Acceptance Testing Structured random testing of the data in the BIM tool against data in the source systems over the initial load and a few update load cycles should be performed to meet sign-off criteria Meeting these requirements is paramount for every DW-BIM architecture

Implement Data Warehouses and Data Marts:

January 9, 2011 250 Implement Data Warehouses and Data Marts The purpose of a data warehouse is to integrate data from multiple sources and then serve up that integrated data for BI purposes Consumption is typically through data marts or other systems A single data warehouse will integrate data from multiple source systems and serve data to multiple data marts Purpose of data marts is to provide data for analysis to knowledge workers Start with the end in mind - identify the business problem to solve, then identify the details and what would be used and continue to work back into the integrated data required and ultimately all the way back to the data sources.

Implement Business Intelligence Tools and User Interfaces:

January 9, 2011 251 Implement Business Intelligence Tools and User Interfaces There is a well-defined set of well-proven BI tools available Implementing the right BI tool or User Interface (UI) is about identifying the right tools for the right user set Almost all BI tools also come with their own metadata repositories to manage their internal data maps and statistics

Query and Reporting Tools:

January 9, 2011 252 Query and Reporting Tools Query and reporting is the process of querying a data source and then formatting the results to create a report With business query and reporting the data source is more often a data warehouse or data mart While IT develops production reports, power users and casual business users develop their own reports with business query tools Business query and reporting tools enable users who want to author their own reports or create outputs for use by others

Query and Reporting Tools Landscape:

January 9, 2011 253 Query and Reporting Tools Landscape The landscape maps tool types - Published Reports, Business Query, Interactive Fixed Reports, Scorecards, Embedded BI, Dashboards, OLAP, Statistics, BI Spreadsheets and Production Reporting Tools, grouped into commonly used tools and specialist tools - against user segments: Customers, Suppliers and Regulators; Frontline Workers; Executives and Managers; Analysts and Information Workers; IT Developers

On Line Analytical Processing (OLAP) Tools:

January 9, 2011 254 On Line Analytical Processing (OLAP) Tools OLAP provides interactive, multi-dimensional analysis with different dimensions and different levels of detail The value of OLAP tools and cubes is reduction of the chance of confusion and erroneous interpretation by aligning the data content with the analyst's mental model Common OLAP operations include slice and dice, drill down, drill up, roll up, and pivot Slice - a slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset Dice - the dice operation is a slice on more than two dimensions of a data cube, or more than two consecutive slices Drill Down / Up - drilling down or up is a specific analytical technique whereby the user navigates among levels of data, ranging from the most summarised (up) to the most detailed (down) Roll-Up – a roll-up involves computing all of the data relationships for one or more dimensions. To do this, define a computational relationship or formula Pivot - to change the dimensional orientation of a report or page display
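The slice and roll-up operations listed above can be illustrated on a toy fact set. This is a pure-Python sketch of the concepts; real OLAP tools operate on pre-built multi-dimensional cubes, and the data here is invented for the example.

```python
# Toy fact data to illustrate OLAP operations on two dimensions (region, year).

facts = [
    {"region": "EMEA", "year": 2010, "product": "A", "sales": 100},
    {"region": "EMEA", "year": 2011, "product": "A", "sales": 120},
    {"region": "APAC", "year": 2010, "product": "B", "sales": 80},
]

def slice_cube(facts, **fixed):
    """Slice: fix one or more dimension members and keep the matching cells."""
    return [f for f in facts if all(f[d] == v for d, v in fixed.items())]

def roll_up(facts, dimension):
    """Roll-up: aggregate the sales measure over one dimension."""
    totals = {}
    for f in facts:
        totals[f[dimension]] = totals.get(f[dimension], 0) + f["sales"]
    return totals
```

Drill down / drill up then correspond to applying `roll_up` at finer or coarser dimension levels, and pivot is merely a change of presentation over the same cells.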

Analytic Applications:

January 9, 2011 255 Analytic Applications Analytic applications include the logic and processes to extract data from well-known source systems, such as vendor ERP systems, a data model for the data mart, and pre-built reports and dashboards Analytic applications provide businesses with a pre-built solution to optimise a functional area or industry vertical Different types of analytic applications include customer, financial, supply chain, manufacturing, and human resource applications

Implementing Management Dashboards and Scorecards:

January 9, 2011 256 Implementing Management Dashboards and Scorecards Dashboards and scorecards are both ways of efficiently presenting performance information Dashboards are oriented more toward dynamic presentation of operational information while scorecards are more static representations of longer-term organisational, tactical, or strategic goals Typically, scorecards are divided into 4 quadrants or views of the organisation such as Finance, Customer, Environment, and Employees, each with a number of metrics

Performance Management Tools:

January 9, 2011 257 Performance Management Tools Performance management applications include budgeting, planning, and financial consolidation

Predictive Analytics and Data Mining Tools:

January 9, 2011 258 Predictive Analytics and Data Mining Tools Data mining is a particular kind of analysis that reveals patterns in data using various algorithms A data mining tool will help users discover relationships or show patterns in more exploratory fashion

Advanced Visualisation and Discovery Tools:

January 9, 2011 259 Advanced Visualisation and Discovery Tools Advanced visualisation and discovery tools allow users to interact with the data in a highly visual, interactive way Patterns in a large dataset can be difficult to recognise in a numeric display A pattern can be picked up visually fairly quickly when thousands of data points are loaded into a sophisticated display on a single page

Process Data for Business Intelligence:

January 9, 2011 260 Process Data for Business Intelligence Most of the work in any DW-BIM effort involves the preparation and processing of the data

Staging Areas:

January 9, 2011 261 Staging Areas A staging area is the intermediate data store between an original data source and the centralised data repository All required cleansing, transformation, reconciliation, and relationships happen in this area

Mapping Sources and Targets:

January 9, 2011 262 Mapping Sources and Targets Source-to-target mapping is the documentation activity that defines data type details and transformation rules for all required entities and data elements, from each individual source to each individual target DW-BIM adds additional requirements to the source-to-target mapping process encountered as a component of any typical data migration One of the goals of the DW-BIM effort should be to provide a complete lineage for each data element available in the BI environment all the way back to its respective source(s) A solid taxonomy is necessary to match the data elements in different systems into a consistent structure in the enterprise data warehouse (EDW)
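A source-to-target mapping with lineage capture can be sketched as a declarative table of (source field, target field, transform) rules. The field names, transforms, and system name below are hypothetical examples, not part of any standard mapping notation.

```python
# Sketch of a declarative source-to-target mapping with lineage capture.
# Field names, transforms, and the 'CRM' system name are illustrative.

MAPPING = [
    # (source field, target field, transform)
    ("cust_nm",  "customer_name", str.strip),
    ("cntry_cd", "country_code",  str.upper),
]

def map_row(source_row, source_system):
    """Apply the mapping; record, per target element, where its value came from."""
    target, lineage = {}, {}
    for src, tgt, transform in MAPPING:
        target[tgt] = transform(source_row[src])
        lineage[tgt] = f"{source_system}.{src}"
    return target, lineage
```

Keeping the mapping as data rather than buried in ETL code is one way to make the complete lineage of each BI data element documentable and auditable.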

Data Cleansing and Transformations (Data Acquisition):

January 9, 2011 263 Data Cleansing and Transformations (Data Acquisition) Data cleansing focuses on the activities that correct and enhance the domain values of individual data elements, including enforcement of standards Cleansing is particularly necessary for initial loads where significant history is involved The preferred strategy is to push data cleansing and correction activity back to the source systems whenever possible Data transformation focuses on activities that provide organisational context between data elements, entities, and subject areas Organisational context includes cross-referencing, reference and master data management, and complete and correct relationships Data transformation is an essential component of being able to integrate data from multiple sources
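Correcting the domain values of a single element during acquisition can be as simple as a standardisation map plus a flag for unmappable values. The gender codes below are purely an illustrative example of a value domain.

```python
# Sketch of data cleansing at acquisition: standardising the domain values
# of one element (gender codes, chosen purely as an example).

STANDARD_GENDER = {"m": "M", "male": "M", "f": "F", "female": "F"}

def cleanse_gender(raw):
    """Map source variants onto the standard code set; flag what cannot be mapped
    so it can be pushed back to the source system for correction."""
    key = str(raw).strip().lower()
    return STANDARD_GENDER.get(key, "UNKNOWN")
```

The `"UNKNOWN"` flag matters: rather than silently guessing, the load surfaces unmappable values so the preferred strategy of fixing data at the source can be applied.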

Monitor and Tune Data Warehousing Processes:

January 9, 2011 264 Monitor and Tune Data Warehousing Processes Processing should be monitored across the system for bottlenecks and dependencies among processes Database tuning techniques should be employed where and when needed, including partitioning and tuned backup and recovery strategies Archiving is a difficult subject in data warehousing Users often consider the data warehouse an active archive due to the long histories that are built, and are unwilling to see the data warehouse engage in archiving, particularly if the OLAP sources have dropped records

Monitor and Tune BI Activity and Performance:

January 9, 2011 265 Monitor and Tune BI Activity and Performance A best practice for BI monitoring and tuning is to define and display a set of customer- facing satisfaction metrics Average query response time and the number of users per day / week / month, are examples of useful metrics to display Regular review of usage statistics and patterns is essential Reports providing frequency and resource usage of data, queries, and reports allow prudent enhancement Tuning BI activity is analogous to the principle of profiling applications in order to know where the bottlenecks are and where to apply optimisation efforts
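The customer-facing metrics mentioned above (average query response time, distinct users) can be computed directly from a query log. The log structure below is an assumed example, not a specific BI product's format.

```python
# Sketch of customer-facing BI metrics computed from an assumed query log
# of (user, response seconds) entries.

query_log = [
    {"user": "anne", "seconds": 1.2},
    {"user": "bob",  "seconds": 3.4},
    {"user": "anne", "seconds": 2.0},
]

def usage_metrics(log):
    """Average response time and distinct user count over the log window."""
    avg = sum(q["seconds"] for q in log) / len(log)
    distinct_users = len({q["user"] for q in log})
    return {"avg_response_seconds": round(avg, 2), "users": distinct_users}
```

Publishing such numbers regularly supports both the satisfaction display recommended above and the profiling-style tuning: the same log reveals which queries and reports consume the most resources.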

Document and Content Management:

January 9, 2011 266 Document and Content Management

Document and Content Management:

January 9, 2011 267 Document and Content Management Document and Content Management is the control over capture, storage, access, and use of data and information stored outside relational databases Strategic and tactical focus overlaps with other data management functions in addressing the need for data governance, architecture, security, managed metadata, and data quality for unstructured data Document and Content Management includes two sub-functions: Document management is the storage, inventory, and control of electronic and paper documents. Document management encompasses the processes, techniques, and technologies for controlling and organising documents and records, whether stored electronically or on paper Content management refers to the processes, techniques, and technologies for organising, categorising, and structuring access to information content, resulting in effective retrieval and reuse. Content management is particularly important in developing websites and portals, but the techniques of indexing based on keywords, and organising based on taxonomies, can be applied across technology platforms.

Document and Content Management – Definition and Goals:

January 9, 2011 268 Document and Content Management – Definition and Goals Definition Planning, implementation, and control activities to store, protect, and access data found within electronic files and physical records (including text, graphics, images, audio, and video) Goals To safeguard and ensure the availability of data assets stored in less structured formats To enable effective and efficient retrieval and use of data and information in unstructured formats To comply with legal obligations and customer expectations To ensure business continuity through retention, recovery, and conversion To control document storage operating costs

Document and Content Management - Overview:

January 9, 2011 269 Document and Content Management - Overview
Inputs: Text Documents, Reports, Spreadsheets, Email, Instant Messages, Faxes, Voicemail, Images, Video Recordings, Audio Recordings, Printed Paper Files, Microfiche/Microfilm, Graphics
Suppliers: Employees, External Parties, Suppliers
Tools: Stored Documents, Office Productivity Tools, Image and Workflow Management Tools, Records Management Tools, XML Development Tools, Collaboration Tools, Internet, Email Systems
Participants: All Employees, Data Stewards, DM Professionals, Records Management Staff, Other IT Professionals, Data Management Executive, Other IT Managers, Chief Information Officer, Chief Knowledge Officer
Primary Deliverables: Managed Records in Many Media Formats, E-discovery Records, Outgoing Letters and Emails, Contracts and Financial Documents, Policies and Procedures, Audit Trails and Logs, Meeting Minutes, Formal Reports, Significant Memoranda
Metrics: Return on Investment, Key Performance Indicators, Balanced Scorecards
Consumers: Business and IT Users, Government Regulatory Agencies, Senior Management, External Customers

Document and Content Management Function, Activities and Sub-Activities:

January 9, 2011 270 Document and Content Management Function, Activities and Sub-Activities

Document and Content Management - Principles:

January 9, 2011 271 Document and Content Management - Principles Everyone in an organisation has a role to play in protecting its future. Everyone must create, use, retrieve, and dispose of records in accordance with the established policies and procedures Experts in the handling of records and content should be fully engaged in policy and planning. Regulatory requirements and best practices can vary significantly based on industry sector and legal jurisdiction Even if records management professionals are not available to the organisation, everyone can be trained and have an understanding of the issues. Once trained, business stewards and others can collaborate on an effective approach to records management

Document and Content Management:

January 9, 2011 272 Document and Content Management A document management system is an application used to track and store electronic documents and electronic images of paper documents Document management systems commonly provide storage, versioning, security, metadata management, content indexing, and retrieval capabilities A content management system is used to collect, organise, index, and retrieve information content; storing the content either as components or whole documents, while maintaining links between components While a document management system may provide content management functionality over the documents under its control, a content management system is essentially independent of where and how the documents are stored

Document / Record Management:

January 9, 2011 273 Document / Record Management Document / Record Management is the lifecycle management of the designated significant documents of the organisation Records can be Physical, such as documents, memos, contracts, reports or microfiche Electronic, such as email content, attachments, and instant messaging Content on a website Documents on all types of media and hardware Data captured in databases of all kinds More than 90% of the records created today are electronic Growth in email and instant messaging has made the management of electronic records critical to an organisation

Document / Record Management:

January 9, 2011 274 Document / Record Management The lifecycle of Document / Record Management includes: Identification of existing and newly created documents / records Creation, Approval, and Enforcement of documents / records policies Classification of documents / records Documents / Records Retention Policy Storage: Short and long term storage of physical and electronic documents / records Retrieval and Circulation: Allowing access and circulation of documents / records in accordance with policies, security and control standards, and legal requirements Preservation and Disposal: Archiving and destroying documents / records according to organisational needs, statutes, and regulations

Plan for Managing Documents / Records:

January 9, 2011 275 Plan for Managing Documents / Records Plan the document lifecycle from creation or receipt, through organisation for retrieval and distribution, to archiving or disposition Develop classification / indexing systems and taxonomies so that the retrieval of documents is easy Create planning and policy around documents and records based on the value of the data to the organisation and as evidence of business transactions Identify the organisational unit responsible and accountable for managing the documents / records Develop and execute a retention plan and policy, such as archiving selected records for long-term preservation Records are destroyed at the end of their lifecycle according to operational needs, procedures, statutes and regulations

Implement Document / Record Management Systems for Acquisition, Storage, Access, and Security Controls:

January 9, 2011 276 Implement Document / Record Management Systems for Acquisition, Storage, Access, and Security Controls Documents can be created within a document management system or captured via scanners or OCR software Electronic documents must be indexed via keywords or text during the capture process so that the document can be found A document repository enables check-in and check-out features, versioning, collaboration, comparison, archiving, status state(s), migration from one storage media to another and disposition Document management can support different types of workflows Manual workflows that indicate where the user sends the document Rules-based workflow, where rules are created that dictate the flow of the document within an organisation Dynamic rules that allow for different workflows based on content
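The keyword indexing performed at capture time can be sketched as a minimal inverted index mapping terms to the documents that contain them. The documents below are invented examples; production systems add stemming, stop-word removal, and metadata fields.

```python
# Sketch of keyword indexing at document capture: a minimal inverted index
# mapping terms to the documents that contain them. Documents are illustrative.

documents = {
    "doc1": "annual contract renewal terms",
    "doc2": "supplier contract amendment",
}

index = {}
for doc_id, text in documents.items():
    for word in set(text.lower().split()):
        index.setdefault(word, set()).add(doc_id)

def search(term):
    """Return the set of documents indexed under a keyword."""
    return index.get(term.lower(), set())
```

Without this step at capture, a document stored in the repository is effectively lost: it exists but cannot be found.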

Backup and Recover Documents / Records:

January 9, 2011 277 Backup and Recover Documents / Records The document / record management system needs to be included as part of the overall corporate backup and recovery activities for all data and information The document / records manager should be involved in risk mitigation and management, and business continuity, especially regarding security for vital records A vital records program provides the organisation with access to the records necessary to conduct its business during a disaster and to resume normal business afterward

Retention and Disposition of Documents / Records:

January 9, 2011 278 Retention and Disposition of Documents / Records Defines the period of time during which documents / records for operational, legal, financial or historical value must be maintained Specifies the processes for compliance, and the methods and schedules for the disposition of documents / records Must deal with privacy and data protection issues Legal and regulatory requirements must be considered when setting up document record retention schedules
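Applying a retention schedule mechanically looks like the sketch below: each record type carries a retention period, and records past that period become candidates for disposition. The record types and periods are illustrative assumptions, not legal guidance, and real schedules also handle legal holds and jurisdiction-specific rules.

```python
# Sketch of applying a retention schedule: records past their retention
# period become disposition candidates. Types and periods are illustrative.

from datetime import date

RETENTION_YEARS = {"invoice": 7, "meeting_minutes": 3}

def due_for_disposition(record_type, created, today):
    """True once the record has exceeded its retention period.
    (Simplified: ignores legal holds and leap-day edge cases.)"""
    years = RETENTION_YEARS[record_type]
    return today >= date(created.year + years, created.month, created.day)
```

The compliance process then reviews the candidates this check produces before any destruction, satisfying the requirement that disposition follow documented methods and schedules.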

Audit Document / Records Management:

January 9, 2011 279 Audit Document / Records Management Document / records management requires auditing on a periodic basis to ensure that the right information is getting to the right people at the right time for decision making or performing operational activities Inventory - Each location in the inventory is uniquely identified Storage - Storage areas for physical documents / records have adequate space to accommodate growth Reliability and Accuracy - Spot checks are executed to confirm that the documents / records are an adequate reflection of what has been created or received Classification and Indexing Schemes - Metadata and document file plans are well described Access and Retrieval - End users find and retrieve critical information easily Retention Processes - Retention schedule is structured in a logical way Disposition Methods - Documents / records are disposed of as recommended Security and Confidentiality - Breaches of document / record confidentiality and loss of documents / records are recorded as security incidents and managed appropriately Organisational Understanding of Documents / Records Management - Appropriate training is provided to stakeholders and staff as to the roles and responsibilities related to document / records management

Content Management:

January 9, 2011 280 Content Management Organisation, categorisation, and structure of data / resources so that they can be stored, published, and reused in multiple ways Includes data / information, that exists in many forms and in multiple stages of completion within its lifecycle Content management systems manage the content of a website or intranet through the creation, editing, storing, organising, and publishing of content

Define and Maintain Enterprise Taxonomies (Information Content Architecture):

January 9, 2011 281 Define and Maintain Enterprise Taxonomies (Information Content Architecture) Process of creating a structure for a body of information or content Contains a controlled vocabulary that can help with navigation and search systems Content Architecture identifies the links and relationships between documents and content, specifies document requirements and attributes and defines the structure of content in a document or content management system
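The controlled vocabulary and narrower-term relationships described above can be sketched as a small tree structure. This is a minimal illustration, not any particular content management product's model; all class and term names are invented for the example.

```python
# Minimal sketch of an enterprise taxonomy node with a controlled vocabulary.
# Preferred terms form the tree; synonyms are non-preferred entry terms that
# resolve to a preferred term, supporting navigation and search.
class TaxonomyNode:
    def __init__(self, term, synonyms=None):
        self.term = term                      # preferred term
        self.synonyms = set(synonyms or [])   # non-preferred entry terms
        self.children = []                    # narrower terms

    def add_child(self, node):
        self.children.append(node)
        return node

    def find(self, word):
        """Resolve a search word to a preferred term by walking the tree."""
        if word == self.term or word in self.synonyms:
            return self.term
        for child in self.children:
            hit = child.find(word)
            if hit:
                return hit
        return None

root = TaxonomyNode("Finance")
root.add_child(TaxonomyNode("Invoices", synonyms={"bills"}))
print(root.find("bills"))  # the synonym resolves to "Invoices"
```

A real taxonomy would also carry relationships between terms (related, broader) and per-term metadata, but the resolve-to-preferred-term step is the core of how a controlled vocabulary aids search.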

Document / Index Information Content Metadata:

January 9, 2011 282 Document / Index Information Content Metadata Development of metadata for unstructured data content Maintenance of metadata for unstructured data becomes the maintenance of a cross-reference of various local schemes to the official set of organisation metadata
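The cross-reference of local schemes to official organisation metadata can be pictured as a per-source mapping table. The source names, tags, and official terms below are illustrative assumptions, not part of any standard.

```python
# Sketch of a cross-reference from local metadata tags in different content
# repositories to the organisation's official metadata terms.
OFFICIAL_TERMS = {"cust_no", "order_date"}   # the official organisation set

CROSS_REFERENCE = {
    "sharepoint": {"CustomerNumber": "cust_no", "OrderedOn": "order_date"},
    "fileshare":  {"custid": "cust_no"},
}

def to_official(source, local_tag):
    """Translate a local tag into the official term, failing loudly on gaps."""
    term = CROSS_REFERENCE.get(source, {}).get(local_tag)
    if term not in OFFICIAL_TERMS:
        raise KeyError(f"{local_tag!r} from {source!r} has no official mapping")
    return term

print(to_official("sharepoint", "CustomerNumber"))  # cust_no
```

Maintaining unstructured-data metadata then becomes maintaining this table: when a local scheme changes, only its row of the cross-reference needs updating.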

Provide Content Access and Retrieval:

January 9, 2011 283 Provide Content Access and Retrieval Once the content has been described by metadata / key word tagging and classified within the appropriate Information Content Architecture, it is available for retrieval and use Finding unstructured data can be eased through portal technology

Govern for Quality Content:

January 9, 2011 284 Govern for Quality Content Managing unstructured data requires effective partnerships between data stewards, data professionals, and records managers The focus of data governance can include document and record retention policies, electronic signature policies, reporting formats, and report distribution policies High quality, accurate, and up-to-date information will aid in critical business decisions Timeliness of the decision-making process with high quality information may increase competitive advantage and business effectiveness

Metadata Management:

January 9, 2011 285 Metadata Management

Metadata Management:

January 9, 2011 286 Metadata Management Metadata is data about data Metadata Management is the set of processes that ensure proper creation, storage, integration, and control to support associated usage of metadata Leveraging metadata in an organisation can provide benefits Increase the value of strategic information by providing context for the data, thus aiding analysts in making more effective decisions Reduce training costs and lower the impact of staff turnover through thorough documentation of data context, history, and origin Reduce data-oriented research time by assisting business analysts in finding the information they need, in a timely manner Improve communication by bridging the gap between business users and IT professionals, leveraging work done by other teams, and increasing confidence in IT system data Increase speed of system development time-to-market by reducing system development life-cycle time Reduce risk of project failure through better impact analysis at various levels during change management Identify and reduce redundant data and processes, thereby reducing rework and use of redundant, out-of-date, or incorrect data

Metadata Management – Definition and Goals:

January 9, 2011 287 Metadata Management – Definition and Goals Definition Planning, implementation, and control activities to enable easy access to high quality, integrated metadata Goals Provide organisational understanding of terms, and usage Integrate metadata from diverse sources Provide easy, integrated access to metadata Ensure metadata quality and security

Metadata:

January 9, 2011 288 Metadata Metadata is information about the physical data, technical and business processes, data rules and constraints, and logical and physical structures of the data, as used by an organisation Descriptive tags describe data, concepts and the connections between the data and concepts
Business Analytics: Data definitions, reports, users, usage, performance
Business Architecture: Roles and organisations, goals and objectives
Business Definitions: The business terms and explanations for a particular concept, fact, or other item found in an organisation
Business Rules: Standard calculations and derivation methods
Data Governance: Policies, standards, procedures, programs, roles, organisations, stewardship assignments
Data Integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII, migration / conversion
Data Quality: Defects, metrics, ratings
Document Content Management: Unstructured data, documents, taxonomies, name sets, legal discovery, search engine indexes
Information Technology Infrastructure: Platforms, networks, configurations, licenses
Logical Data Models: Entities, attributes, relationships and rules, business names and definitions
Physical Data Models: Files, tables, columns, views, business definitions, indexes, usage, performance, change management
Process Models: Functions, activities, roles, inputs / outputs, workflow, business rules, timing, stores
Systems Portfolio and IT Governance: Databases, applications, projects and programs, integration roadmap, change management
Service-Oriented Architecture (SOA) Information: Components, services, messages, master data
System Design and Development: Requirements, designs and test plans, impact
Systems Management: Data security, licenses, configuration, reliability, service levels

Metadata Management - Overview:

January 9, 2011 289 Metadata Management - Overview
Inputs: Metadata Requirements, Metadata Issues, Data Architecture, Business Metadata, Technical Metadata, Process Metadata, Operational Metadata, Data Stewardship Metadata
Suppliers: Data Stewards, Data Architects, Data Modelers, Database Administrators, Other Data Professionals, Data Brokers, Government and Industry Regulators
Tools: Metadata Repositories, Data Modeling Tools, Database Management Systems, Data Integration Tools, Business Intelligence Tools, System Management Tools, Object Modeling Tools, Process Modeling Tools, Report Generating Tools, Data Quality Tools, Data Development and Administration Tools, Reference and Master Data Management Tools
Participants: Metadata Specialist, Data Integration Architects, Data Stewards, Data Architects and Modelers, Database Administrators, Other DM Professionals, Other IT Professionals, DM Executive, Business Users
Primary Deliverables: Metadata Repositories, Quality Metadata, Metadata Models and Architecture, Metadata Management Operational Analysis, Metadata Analysis, Data Lineage, Change Impact Analysis, Metadata Control Procedures
Metrics: Metadata Quality, Master Data Service Data Compliance, Metadata Repository Contribution, Metadata Documentation Quality, Steward Representation / Coverage, Metadata Usage / Reference, Metadata Management Maturity, Metadata Repository Availability
Consumers: Data Stewards, Data Professionals, Other IT Professionals, Knowledge Workers, Managers and Executives, Customers and Collaborators, Business Users

Metadata Management Function, Activities and Sub-Activities:

January 9, 2011 290 Metadata Management Function, Activities and Sub-Activities

Metadata Management - Principles:

January 9, 2011 291 Metadata Management - Principles Establish and maintain a metadata strategy and appropriate policies, especially clear goals and objectives for metadata management and usage Secure sustained commitment, funding, and vocal support from senior management concerning metadata management for the enterprise Take an enterprise perspective to ensure future extensibility, but implement through iterative and incremental delivery Develop a metadata strategy before evaluating, purchasing, and installing metadata management products Create or adopt metadata standards to ensure interoperability of metadata across the enterprise Ensure effective metadata acquisition for both internal and external metadata Maximise user access, since a solution that is not accessed or is under-accessed will not show business value Understand and communicate the necessity of metadata and the purpose of each type of metadata; socialisation of the value of metadata will encourage business usage Measure content and usage Leverage XML, messaging, and Web services Establish and maintain enterprise-wide business involvement in data stewardship, assigning accountability for metadata Define and monitor procedures and processes to ensure correct policy implementation Include a focus on roles, staffing, standards, procedures, training, and metrics Provide dedicated metadata experts to the project and beyond Certify metadata quality

Understand Metadata Requirements:

January 9, 2011 292 Understand Metadata Requirements Metadata management strategy must reflect an understanding of enterprise needs for metadata Gather requirements to confirm the need for a metadata management environment, to set scope and priorities, to educate and communicate, to guide tool evaluation and implementation, to guide metadata modeling and internal metadata standards, to guide services that rely on metadata, and to estimate and justify staffing needs Gather requirements from business and technical users Summarise the requirements from an analysis of roles, responsibilities, challenges, and the information needs of selected individuals in the organisation

Business User Requirements:

January 9, 2011 293 Business User Requirements Business users require improved understanding of the information from operational and analytical systems Business users require a high level of confidence in the information obtained from corporate data warehouses, analytical applications, and operational systems Need appropriate access to information delivery methods, such as reports, queries, ad-hoc, OLAP, dashboards with a high degree of quality documentation and context Business users must understand the intent and purpose of metadata management

Technical User Requirements:

January 9, 2011 294 Technical User Requirements Technical requirement topics include: Daily feed throughput (size and processing time); Existing metadata; Sources (known and unknown); Targets; Transformations; Architecture flow (logical and physical); Non-standard metadata requirements Technical users must understand the business context of the data at a sufficient level to provide the necessary support, including implementing the calculations or derived data rules

Define the Metadata Architecture:

January 9, 2011 295 Define the Metadata Architecture Metadata management solutions consist of Metadata creation / sourcing, metadata integration, metadata repositories, metadata delivery, metadata usage, and metadata control / management

Centralised Metadata Architecture:

January 9, 2011 296 Centralised Metadata Architecture Single metadata repository that contains copies of the live metadata from the various sources Advantages High availability, since it is independent of the source systems Quick metadata retrieval, since the repository and the query reside together Resolved database structures that are not affected by the proprietary nature of third party or commercial systems Extracted metadata may be transformed or enhanced with additional metadata that may not reside in the source system, improving quality Disadvantages Complex processes are necessary to ensure that changes in source metadata quickly replicate into the repository Maintenance of a centralised repository can be substantial Extraction could require custom additional modules or middleware Validation and maintenance of customised code can increase the demands on both internal IT staff and the software vendors
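The centralised pattern can be sketched in a few lines: metadata is copied out of each source, optionally enhanced on the way in, and all queries run against the local copy. This is an illustrative sketch only; the class and method names are assumptions for the example.

```python
# Sketch of a centralised metadata repository: live metadata is copied from
# source systems into one store; queries never touch the sources themselves.
class CentralRepository:
    def __init__(self):
        self.store = {}   # (source, key) -> metadata record

    def extract(self, source, records, enhance=None):
        """Copy metadata from a source, optionally enhancing each record
        with information that does not exist in the source system."""
        for key, record in records.items():
            if enhance:
                record = enhance(record)
            self.store[(source, key)] = record

    def query(self, key):
        # Fast and highly available: the repository and the query reside
        # together, independent of the source systems.
        return [r for (src, k), r in self.store.items() if k == key]

repo = CentralRepository()
repo.extract("crm", {"customer": {"type": "table"}},
             enhance=lambda r: {**r, "steward": "unassigned"})
print(repo.query("customer"))
```

The cost shows up in what the sketch omits: the replication processes that keep `store` synchronised with changing source metadata are where most of the maintenance effort goes.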

Distributed Metadata Architecture:

January 9, 2011 297 Distributed Metadata Architecture Metadata retrieval engine responds to user requests by retrieving data from source systems in real time with no persistent repository Advantages Metadata is always as current and valid as possible Queries are distributed, possibly improving response / process time Metadata requests from proprietary systems are limited to query processing rather than requiring a detailed understanding of proprietary data structures, therefore minimising the implementation and maintenance effort required Development of automated metadata query processing is likely simpler, requiring minimal manual intervention Batch processing is reduced, with no metadata replication or synchronisation processes Disadvantages No enhancement or standardisation of metadata is possible between systems Query capabilities are directly affected by the availability of the participating source systems No ability to support user-defined or manually inserted metadata entries since there is no repository in which to place these additions
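By contrast, the distributed pattern keeps no copy at all: a retrieval engine fans each query out to the live sources. Again a hedged sketch with invented names, to show the trade-off rather than prescribe a design.

```python
# Sketch of a distributed metadata architecture: no persistent repository;
# a retrieval engine federates each user request to the live source systems.
class DistributedEngine:
    def __init__(self, sources):
        self.sources = sources  # name -> callable(key) -> record or None

    def query(self, key):
        results = {}
        for name, lookup in self.sources.items():
            try:
                record = lookup(key)   # hits the source system in real time
            except ConnectionError:
                continue   # query capability depends on source availability
            if record is not None:
                results[name] = record
        return results   # always as current as the sources themselves

crm_metadata = {"customer": {"type": "table"}}
engine = DistributedEngine({"crm": crm_metadata.get})
print(engine.query("customer"))
```

Note what cannot happen here: with no repository, there is nowhere to standardise metadata across systems or to hold user-added entries, exactly as the disadvantages above state.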

Hybrid Metadata Architecture:

January 9, 2011 298 Hybrid Metadata Architecture Hybrid architecture where metadata still moves directly from the source systems into a repository but the repository design only accounts for the user-added metadata, the critical standardised items and the additions from manual sources Advantages Near-real-time retrieval of metadata from its source and enhanced metadata to meet user needs most effectively, when needed Lowers the effort for manual IT intervention and custom-coded access functionality to proprietary systems Disadvantages Source systems must be available because the distributed nature of the back-end systems handles processing of queries Additional overhead is required to link those initial results with metadata augmentation in the central repository before presenting the result set to the end user Design forces the metadata repository to contain the latest version of the metadata source and forces it to manage changes to the source, as well Sets of program / process interfaces to tie the repository back to the metadata source(s) must be built and maintained

Develop and Maintain Metadata Standards:

January 9, 2011 299 Develop and Maintain Metadata Standards Check industry or consensus standards and international standards International standards provide the framework from which the industry standards are developed and executed

Industry / Consensus Metadata Standards:

January 9, 2011 300 Industry / Consensus Metadata Standards Understanding the various standards for the implementation and management of metadata in industry is essential to the appropriate selection and use of a metadata solution for an enterprise
OMG (Object Management Group) specifications: Common Warehouse Metamodel (CWM), Information Management Metamodel (IMM), MDC Open Information Model (OIM), Extensible Markup Language (XML), Unified Modeling Language (UML), XML Metadata Interchange (XMI), Ontology Definition Metamodel (ODM)
World Wide Web Consortium (W3C): RDF (Resource Description Framework) for describing and interchanging metadata using XML
Dublin Core Metadata Initiative (DCMI): interoperable online metadata standard using RDF
Distributed Management Task Force (DMTF): Web-Based Enterprise Management (WBEM) and Common Information Model (CIM), standards-based management tools facilitating the exchange of data across otherwise disparate technologies and platforms
Metadata standards for unstructured data: ISO 5964 - Guidelines for the establishment and development of multilingual thesauri; ISO 2788 - Guidelines for the establishment and development of monolingual thesauri; ANSI/NISO Z39.1 - American Standard Reference Data and Arrangement of Periodicals; ISO 704 - Terminology work: Principles and methods

International Metadata Standards:

January 9, 2011 301 International Metadata Standards ISO / IEC 11179 is an international metadata standard for the standardisation and registration of data elements to make data understandable and shareable

Standard Metadata Metrics:

January 9, 2011 302 Standard Metadata Metrics Controlling the effectiveness of the metadata deployed environment requires measurements to assess user uptake, organisational commitment, and content coverage and quality Metadata Repository Completeness Metadata Documentation Quality Master Data Service Data Compliance Steward Representation / Coverage Metadata Usage / Reference Metadata Management Maturity Metadata Repository Availability
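Two of the metrics above lend themselves to a simple worked calculation. The field names, required-documentation fields, and data below are illustrative assumptions, not DMBOK-defined formulas.

```python
# Illustrative calculations for two standard metadata metrics.
def repository_completeness(expected_elements, repository):
    """Fraction of the expected metadata elements actually present."""
    present = sum(1 for e in expected_elements if e in repository)
    return present / len(expected_elements)

def documentation_quality(repository, required=("definition", "steward")):
    """Fraction of repository entries carrying all required documentation."""
    if not repository:
        return 0.0
    ok = sum(1 for entry in repository.values()
             if all(entry.get(f) for f in required))
    return ok / len(repository)

repo = {"customer": {"definition": "A party we sell to", "steward": "A. Byrne"},
        "order": {"definition": "", "steward": "A. Byrne"}}  # undocumented

print(repository_completeness(["customer", "order", "invoice"], repo))
print(documentation_quality(repo))
```

Reporting these as trends over time, rather than point values, is what makes them useful for assessing user uptake and organisational commitment.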

Implement a Managed Metadata Environment:

January 9, 2011 303 Implement a Managed Metadata Environment Implement a managed metadata environment in incremental steps in order to minimise risks to the organisation and to facilitate acceptance First implementation is a pilot to prove concepts and learn about managing the metadata environment

Create and Maintain Metadata:

January 9, 2011 304 Create and Maintain Metadata Metadata creation and update facility provides for the periodic scanning and updating of the repository in addition to the manual insertion and manipulation of metadata by authorised users and programs Audit process validates activities and reports exceptions Metadata is the guide to the data in the organisation so its quality is critical

Integrate Metadata:

January 9, 2011 305 Integrate Metadata Integration processes gather and consolidate metadata from across the enterprise including metadata from data acquired outside the enterprise Challenges will arise in integration that will require resolution through the governance process Use a non-persistent metadata staging area to store temporary and backup files that supports rollback and recovery processes and provides an interim audit trail to assist repository managers when investigating metadata source or quality issues ETL tools used for data warehousing and Business Intelligence applications are often used effectively in metadata integration processes

Manage Metadata Repositories:

January 9, 2011 306 Manage Metadata Repositories Implement a number of control activities in order to manage the metadata environment Control of repositories is control of metadata movement and repository updates performed by the metadata specialist

Metadata Repositories:

January 9, 2011 307 Metadata Repositories Metadata repository refers to the physical tables in which the metadata are stored Generic design and not merely reflecting the source system database designs Metadata should be as integrated as possible; this will be one of the most direct value-added elements of the repository

Directories, Glossaries and Other Metadata Stores:

January 9, 2011 308 Directories, Glossaries and Other Metadata Stores A Directory is a type of metadata store that limits the metadata to the location or source of data in the enterprise A Glossary typically provides guidance for use of terms Other Metadata stores include specialised lists such as source lists or interfaces, code sets, lexicons, spatial and temporal schema, spatial reference, and distribution of digital geographic data sets, repositories of repositories and business rules

Distribute and Deliver Metadata:

January 9, 2011 309 Distribute and Deliver Metadata Metadata delivery layer is responsible for the delivery of the metadata from the repository to the end users and to any applications or tools that require metadata feeds to them

Query, Report and Analyse Metadata:

January 9, 2011 310 Query, Report and Analyse Metadata Metadata guides management and use of data assets A metadata repository must have a front-end application that supports the search-and-retrieval functionality required for all this guidance and management of data assets

Data Quality Management:

January 9, 2011 311 Data Quality Management

Data Quality Management:

January 9, 2011 312 Data Quality Management Critical support process in organisational change management Data quality is synonymous with information quality since poor data quality results in inaccurate information and poor business performance Data cleansing may result in short-term and costly improvements that do not address the root causes of data defects More rigorous data quality program is necessary to provide an economic solution to improved data quality and integrity Institutionalising processes for data quality oversight, management, and improvement hinges on identifying the business needs for quality data and determining the best ways to measure, monitor, control, and report on the quality of data Continuous process for defining the parameters for specifying acceptable levels of data quality to meet business needs, and for ensuring that data quality meets these levels

Data Quality Management – Definition and Goals:

January 9, 2011 313 Data Quality Management – Definition and Goals Definition Planning, implementation, and control activities that apply quality management techniques to measure, assess, improve, and ensure the fitness of data for use Goals To measurably improve the quality of data in relation to defined business expectations To define requirements and specifications for integrating data quality control into the system development lifecycle To provide defined processes for measuring, monitoring, and reporting conformance to acceptable levels of data quality

Data Quality Management:

January 9, 2011 314 Data Quality Management Data quality expectations provide the inputs necessary to define the data quality framework Framework includes defining the requirements, inspection policies, measures, and monitors that reflect changes in data quality and performance Requirements reflect three aspects of business data expectations Way to record the expectation in business rules Way to measure the quality of data within that dimension Acceptability threshold

Data Quality Management Approach:

January 9, 2011 315 Data Quality Management Approach Planning for the assessment of the current state and identification of key metrics for measuring data quality Deploying processes for measuring and improving the quality of data Monitoring and measuring the levels in relation to the defined business expectations Acting to resolve any identified issues to improve data quality and better meet business expectations

Data Quality Management - Overview:

January 9, 2011 316 Data Quality Management - Overview
Inputs: Business Requirements, Data Requirements, Data Quality Expectations, Data Policies and Standards, Business Metadata, Technical Metadata, Data Sources and Data Stores
Suppliers: External Sources, Regulatory Bodies, Business Subject Matter Experts, Information Consumers, Data Producers, Data Architects, Data Modelers
Tools: Data Profiling Tools, Statistical Analysis Tools, Data Cleansing Tools, Data Integration Tools, Issue and Event Management Tools
Participants: Data Quality Analysts, Data Analysts, Database Administrators, Data Stewards, Other Data Professionals, DRM Director, Data Stewardship Council
Primary Deliverables: Improved Quality Data, Data Management Operational Analysis, Data Profiles, Data Quality Certification Reports, Data Quality Service Level Agreements
Metrics: Data Value Statistics, Errors / Requirement Violations, Conformance to Expectations, Conformance to Service Levels
Consumers: Data Stewards, Data Professionals, Other IT Professionals, Knowledge Workers, Managers and Executives, Customers

Data Quality Management Function, Activities and Sub-Activities:

January 9, 2011 317 Data Quality Management Function, Activities and Sub-Activities

Data Quality Management - Principles:

January 9, 2011 318 Data Quality Management - Principles Manage data as a core organisational asset All data elements will have a standardised data definition, data type, and acceptable value domain Leverage Data Governance for the control and performance of DQM Use industry and international data standards whenever possible Downstream data consumers specify data quality expectations Define business rules to assert conformance to data quality expectations Validate data instances and data sets against defined business rules Business process owners will agree to and abide by data quality SLAs Apply data corrections at the original source, if possible If it is not possible to correct data at the source, forward data corrections to the owner of the original source whenever possible Report measured levels of data quality to appropriate data stewards, business process owners, and SLA managers Identify a gold record for all data elements

Develop and Promote Data Quality Awareness:

January 9, 2011 319 Develop and Promote Data Quality Awareness Promoting data quality awareness means more than ensuring that the right people in the organisation are aware of the existence of data quality issues Establish a data governance framework for data quality Set priorities for data quality Develop and maintain standards for data quality Report relevant measurements of enterprise-wide data quality Provide guidance that facilitates staff involvement Establish communications mechanisms for knowledge sharing Develop and apply certification and compliance policies Monitor and report on performance Identify opportunities for improvements and build consensus for approval Resolve variations and conflicts

Define Data Quality Requirements:

January 9, 2011 320 Define Data Quality Requirements Applications are dependent on the use of data that meets specific needs associated with the successful completion of a business process Data quality requirements are often hidden within defined business policies Identify key data components associated with business policies Determine how identified data assertions affect the business Evaluate how data errors are categorised within a set of data quality dimensions Specify the business rules that measure the occurrence of data errors Provide a means for implementing measurement processes that assess conformance to those business rules Dimensions of data quality: Accuracy, Completeness, Consistency, Currency, Precision, Privacy, Reasonableness, Referential Integrity, Timeliness, Uniqueness, Validity
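The three aspects of a data quality expectation noted earlier (a business rule, a way to measure against a dimension, and an acceptability threshold) can be made concrete with a tiny completeness example. Field names, the rule, and the threshold are illustrative assumptions.

```python
# One data quality requirement expressed as rule + measure + threshold.
def completeness_rule(record):
    """Business rule for the completeness dimension: email is populated."""
    return bool(record.get("email"))

def measure(records, rule):
    """Measure: fraction of records conforming to the rule."""
    return sum(rule(r) for r in records) / len(records)

ACCEPTABILITY_THRESHOLD = 0.95   # the agreed acceptable level

records = [{"email": "a@example.ie"}, {"email": ""}, {"email": "b@example.ie"}]
score = measure(records, completeness_rule)
print(score, "acceptable" if score >= ACCEPTABILITY_THRESHOLD else "below threshold")
```

The same rule/measure/threshold triple applies unchanged to the other dimensions listed; only the rule body differs (e.g. referential integrity checks a key exists in a reference set).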

Profile, Analyse and Assess Data Quality:

January 9, 2011 321 Profile, Analyse and Assess Data Quality Perform an assessment of the data using two different approaches, bottom-up and top-down Bottom-up assessment of existing data quality issues involves inspection and evaluation of the data sets themselves Top-down approach involves understanding how business processes consume data, and which data elements are critical to the success of the business application Identify a data set for review Catalog the business uses of that data set Subject the data set to empirical analysis using data profiling tools and techniques List all potential anomalies, review and evaluate Prioritise criticality of important anomalies in preparation for defining data quality metrics
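A minimal bottom-up profiling pass looks like the following: per-column null counts, distinct counts, and value frequencies, from which anomalies (such as the unexpected code below) surface for review. This is a hand-rolled sketch, not the output format of any specific profiling tool.

```python
from collections import Counter

# Minimal column profiling over a tabular data set represented as dicts.
def profile(rows, columns):
    report = {}
    for col in columns:
        values = [r.get(col) for r in rows]
        freq = Counter(v for v in values if v not in (None, ""))
        report[col] = {
            "nulls": sum(1 for v in values if v in (None, "")),
            "distinct": len(freq),
            "top": freq.most_common(3),   # candidate anomalies to review
        }
    return report

rows = [{"country": "IE"}, {"country": "IE"},
        {"country": ""}, {"country": "XX"}]   # "XX" is a suspect value
print(profile(rows, ["country"]))
```

Real profiling tools add pattern analysis, cross-column dependencies, and sampling for large volumes, but the null/distinct/frequency trio is the starting point of most assessments.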

Define Data Quality Metrics:

January 9, 2011 322 Define Data Quality Metrics Poor data quality affects the achievement of business objectives Seek and use indicators of data quality performance to report the relationship between flawed data and missed business objectives Measure data quality in the same way as any other business performance activity Data quality metrics should be reasonable and effective: Measurability, Business Relevance, Acceptability, Accountability / Stewardship, Controllability, Trackability

Define Data Quality Business Rules:

January 9, 2011 323 Define Data Quality Business Rules Measurement of conformance to specific business rules requires definition Monitoring conformance to these rules requires Segregating data values, records, and collections of records that do not meet business needs from the valid ones Generating a notification event alerting a data steward of a potential data quality issue Establishing an automated or event driven process for aligning or possibly correcting flawed data within business expectations
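The three monitoring steps above (segregate non-conforming records, notify a steward, feed a correction process) can be sketched as one function. The rule and the notification hook are placeholders for illustration.

```python
# Sketch of rule monitoring: apply a business rule to each record, segregate
# failures from valid records, and raise a notification when any fail.
def monitor(records, rule, notify):
    valid, invalid = [], []
    for record in records:
        (valid if rule(record) else invalid).append(record)
    if invalid:
        # Alert a data steward; the segregated records feed the
        # correction process rather than flowing onward as valid data.
        notify(f"{len(invalid)} record(s) failed data quality rule")
    return valid, invalid

alerts = []
non_negative_amount = lambda r: r.get("amount", 0) >= 0   # example rule
valid, invalid = monitor([{"amount": 10}, {"amount": -5}],
                         non_negative_amount, alerts.append)
print(len(valid), len(invalid), alerts)
```

In production, `notify` would post an incident to the issue-tracking system described later rather than append to a list.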

Test and Validate Data Quality Requirements:

January 9, 2011 324 Test and Validate Data Quality Requirements Data profiling tools analyse data to find potential anomalies Data profiling tools allow data analysts to define data rules for validation, assessing frequency distributions and corresponding measurements and then applying the defined rules against the data sets Characterising data quality levels based on data rule conformance provides an objective measure of data quality By using defined data rules to validate data, an organisation can distinguish those records that conform to defined data quality expectations and those that do not In turn, these data rules are used to baseline the current level of data quality as compared to ongoing audits

Set and Evaluate Data Quality Service Levels:

January 9, 2011 325 Set and Evaluate Data Quality Service Levels Data quality SLAs specify the organisation’s expectations for response and remediation Having data quality inspection and monitoring in place increases the likelihood of detection and remediation of a data quality issue before a significant business impact can occur Operational data quality control defined in a data quality SLA includes The data elements covered by the agreement The business impacts associated with data flaws The data quality dimensions associated with each data element The expectations for quality for each data element for each of the identified dimensions in each application or system in the value chain The methods for measuring against those expectations The acceptability threshold for each measurement The individual(s) to be notified in case the acceptability threshold is not met. The timelines and deadlines for expected resolution or remediation of the issue The escalation strategy and possible rewards and penalties when the resolution times are met.
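The SLA elements listed above map naturally onto a configuration record. The field names and defaults below are assumptions for illustration, not a standard schema.

```python
from dataclasses import dataclass, field

# A data quality SLA for one data element, expressed as configuration.
@dataclass
class DataQualitySLA:
    element: str                   # data element covered by the agreement
    dimension: str                 # e.g. completeness, accuracy
    threshold: float               # acceptability threshold for the measure
    notify: list = field(default_factory=list)   # notified on breach
    resolve_within_hours: int = 24 # deadline for resolution / remediation

    def breached(self, measured):
        """True when a measurement falls below the acceptability threshold."""
        return measured < self.threshold

sla = DataQualitySLA("customer.email", "completeness", 0.95,
                     notify=["data.steward@example.org"])
print(sla.breached(0.90))  # True: 90% completeness is below the 95% threshold
```

Holding SLAs as data rather than prose makes the inspection-and-monitoring processes able to evaluate breaches and route notifications automatically.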

Continuously Measure and Monitor Data Quality:

January 9, 2011 326 Continuously Measure and Monitor Data Quality Provide continuous monitoring by incorporating control and measurement processes into the information processing flow Incorporating the results of the control and measurement processes into both the operational procedures and reporting frameworks enable continuous monitoring of the levels of data quality

Manage Data Quality Issues:

January 9, 2011 327 Manage Data Quality Issues Supporting the enforcement of the data quality SLA requires a mechanism for reporting and tracking data quality incidents and activities for researching and resolving those incidents Data quality incident reporting system provides this capability Tracking of data quality incidents provides performance reporting data, including mean-time-to-resolve issues, frequency of occurrence of issues, types of issues, sources of issues and common approaches for correcting or eliminating problems Data quality incident tracking also requires a focus on training staff to recognise when data issues appear and how they are to be classified, logged and tracked according to the data quality SLA Implementing a data quality issues tracking system provides a number of benefits Information and knowledge sharing can improve performance and reduce duplication of effort Analysis of all the issues will help data quality team members determine any repetitive patterns, their frequency, and potentially the source of the issue

Clean and Correct Data Quality Defects:

January 9, 2011 328 Clean and Correct Data Quality Defects Perform data correction in three general ways Automated correction - Submit the data to data quality and data cleansing techniques using a collection of data transformations and rule-based standardisations, normalisations, and corrections Manual directed correction - Use automated tools to cleanse and correct data but require manual review before committing the corrections to persistent storage Manual correction - Data stewards inspect invalid records and determine the correct values, make the corrections, and commit the updated records
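The routing between the three correction modes can be sketched as a confidence-based dispatch: high-confidence standardisations commit automatically, lower-confidence ones are held for review, and everything else goes to a steward. The standardisation rule and confidence cut-off are illustrative assumptions.

```python
# Sketch of routing a record to one of the three correction modes.
def correct(record, standardise, confidence_threshold=0.9):
    fixed, confidence = standardise(record)
    if confidence >= confidence_threshold:
        return "automated", fixed        # commit without review
    if confidence > 0.0:
        return "manual-directed", fixed  # hold for review before committing
    return "manual", record              # steward determines correct values

def standardise_country(record):
    """Rule-based standardisation: normalise known country spellings."""
    country = record.get("country", "").strip().upper()
    if country in {"IE", "IRL", "IRELAND"}:
        return {**record, "country": "IE"}, 1.0
    return record, 0.0   # no applicable rule: zero confidence

print(correct({"country": "ireland"}, standardise_country))
```

Whichever mode fires, the principle from earlier still applies: corrections should be made at, or forwarded to, the original source rather than patched only downstream.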

Design and Implement Operational DQM Procedures:

January 9, 2011 329 Design and Implement Operational DQM Procedures Using defined rules for validation of data quality provides a means of integrating data inspection into a set of operational procedures associated with active DQM Design and implement detailed procedures for operationalising activities Inspection and monitoring Diagnosis and evaluation of remediation alternatives Resolving the issue Reporting

Monitor Operational DQM Procedures and Performance:

January 9, 2011 330 Monitor Operational DQM Procedures and Performance Accountability is critical to the governance protocols overseeing data quality control Issues must be assigned to some number of individuals, groups, departments, or organisations Tracking process should specify and document the ultimate issue accountability to prevent issues from falling through the cracks Metrics can provide valuable insights into the effectiveness of the current workflow, as well as systems and resource utilisation, and are important management data points that can drive continuous operational improvement for data quality control

Conducting a Data Management Project:


Conducting a Data Management Project:

Conducting a Data Management Project. A data management project depends on: the scope of the project - the data management functions to be encompassed; the type of project - from architecture to analysis to implementation; and the scope within the organisation - one or more business units or the entire organisation.

Data Management Function and Project Type:

Data Management Function and Project Type. Scope of project (data management functions): Data Governance; Data Architecture Management; Data Development; Data Operations Management; Data Security Management; Reference and Master Data Management; Data Warehousing and Business Intelligence Management; Document and Content Management; Metadata Management; Data Quality Management. Type of project: Architecture; Analysis and Design; Implementation; Operational Improvement; Management and Administration.

Mapping the Path Through the Selected Data Management Project:

Mapping the Path Through the Selected Data Management Project. Use the framework to define the breakdown of the selected project.

Project Elements – Data Management Functions, Type of Project, Organisational Scope:

Project Elements – Data Management Functions, Type of Project, Organisational Scope. Select the project building blocks based on the project scope: the organisational scope of the project, the type of project and the data management functions within the scope of the project.
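The three scoping dimensions above can be captured in a simple structure. The value set is taken from the slides, but the class itself is an illustrative construct, not part of DMBOK.

```python
from dataclasses import dataclass

# Project types listed on the preceding slides.
PROJECT_TYPES = {
    "Architecture", "Analysis and Design", "Implementation",
    "Operational Improvement", "Management and Administration",
}

@dataclass
class DataManagementProject:
    functions: set        # data management functions within scope
    project_type: str     # one of PROJECT_TYPES
    org_scope: list       # business units covered, or ["enterprise"]

    def __post_init__(self):
        if self.project_type not in PROJECT_TYPES:
            raise ValueError(f"unknown project type: {self.project_type}")
```

Making the three dimensions explicit at project definition time forces the scoping conversation the slides describe before any work begins.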

Creating a Data Management Team:


Creating a Data Management Team:

Creating a Data Management Team. Once a data management framework has been implemented, it must be monitored, managed and constantly improved. There is a need to consolidate and coordinate data management and governance efforts to meet the challenges of: demand for performance management data; complexity in systems and processes; greater regulatory and compliance requirements. Build a Data Management Center of Excellence (DMCOE).

Data Management Center of Excellence:

Data Management Center of Excellence. Separate business units within the organisation generally implement their own solutions; each business unit will have different IT systems, data warehouses/data marts and business intelligence tools. Organisation-wide coordination of data resources requires a centralised, dedicated structure like the DMCOE providing data services. The DMCOE leads the organisation to business benefits through continuous improvement of data management. DMCOE functions need to focus on leveraging organisational knowledge and skills to maximise the value of data to the organisation: maximise the technology investment while decreasing costs and increasing efficiency, centralise best practices and standards, empower knowledge workers with information and provide thought leadership to the entire company. The DMCOE does not exist in isolation from other operations and service management functions.

DMCOE Functions:

DMCOE Functions. Maximise the value of the data technology investment to the organisation by taking a portfolio approach to increase skills and leverage and to optimise the infrastructure. Focus on project delivery and information asset creation, with an emphasis on reusability and knowledge management alongside solution delivery. Ensure the integrity of the organisation's business processes and information systems. Ensure the quality compliance effort related to the configuration, development and documentation of enhancements. Develop information learning and effective practices.

Data Charter:

Data Charter. Create a charter that lists the fundamental principles of data management the DMCOE will adhere to. Data Strategy - create a data blueprint, based upon business functions, to facilitate data design. Data Sharing - promote the sharing of data across the organisation and reduce data redundancy. Data Integrity - ensure the integrity of data from design and availability perspectives. Technical Expertise - provide the expertise for the development and support of data systems. High Availability and Optimal Performance - ensure consistently high availability of data systems through proper design and use, and optimise the performance of the data systems.

DMCOE Skills:

DMCOE Skills. The DMCOE needs skills across three dimensions: specific data management functions; business management and administration; technology and service management.

DMCOE Skills:

DMCOE Skills. This is an idealised set of DMCOE skills that needs to be customised to suit specific organisation needs; it is just one view of a DMCOE.
Data management specific functions: Data Governance; Data Architecture Management; Data Development; Data Operations Management; Data Security Management; Reference and Master Data Management; Data Warehousing and Business Intelligence Management; Document and Content Management; Metadata Management; Data Quality Management.
Data management business skills: Data Management Strategy; Data Management Portfolio Management; Personnel Management; Data Management Design and Development; Data Management Process Management.
Data management technology and service functions: Technical Architecture; Application Deployment and Data Migration; Environment and Infrastructure Management; Service Management and Support.

DMCOE Business Management and Administration Skills:


DMCOE Technology and Service Management Skills:


Benefits of DMCOE:

Benefits of DMCOE. A consistent infrastructure that reduces the time to analyse, design and implement new IT solutions. Reduced data management costs through a consistent data architecture and data integration infrastructure - reduced complexity, redundancy and tool proliferation. A centralised repository of the organisation's data knowledge. An organisation-wide standard methodology and processes to develop and maintain data infrastructure and procedures. Increased data availability. Increased data quality.

Assessing Your Data Management Maturity:


Assessing Your Data Management Maturity:

Assessing Your Data Management Maturity. A Data Management Maturity Model is a measure of, and a process for determining, the level of maturity that exists within an organisation's data management function. It provides a systematic framework for improving data management capability, identifying and prioritising opportunities, reducing cost and optimising the business value of data management investments. It provides a measure of data management maturity so that: it can be tracked over time to measure improvements; it can be used to define projects for data management maturity improvements within cost, time and return-on-investment constraints. It enables organisations to improve their data management function so that they can increase productivity, increase quality, decrease cost and decrease risk.

Data Management Maturity Model:

Data Management Maturity Model. Assesses data management maturity on a scale of 1 to 5 across a number of data management capabilities.
Level 1 - Initial: data management is ad hoc and localised. Everybody has their own approach that is unique and not standardised, except for local initiatives.
Level 2 - Repeatable and Reactive: data management has become independent of the person or business unit administering it and is standardised.
Level 3 - Defined and Standardised: data management is fully documented, determined by subject matter experts and validated.
Level 4 - Managed and Predictable: data management results and outcomes are stored and proactively cross-related within and between business units. The data management function actively exploits the benefits of standardisation.
Level 5 - Optimising and Innovating: as time, resources, technology, requirements and the business landscape change, the data management function can be easily and quickly adjusted to fit new needs and environments.

Maturity Level 1 - Initial:

Maturity Level 1 - Initial. Data management processes are mostly disorganised and generally performed on an ad hoc or even chaotic basis. Data is considered general purpose and is not viewed by either business or executive management to be a problem or a priority. Data is accessible but not always available, and is not secure or auditable. There is no data management group and no one owns the responsibility for ensuring the quality, accuracy or integrity of the data. Data management (to the degree that it is done at all) is reliant on the efforts and competence of individuals. Data proliferates without control and quality is inconsistent across the various business and application silos. Data exists in unconnected databases and spreadsheets using multiple formats and inconsistent definitions. There is little data profiling or analysis, and data is not considered or understood as a component of linked processes. There are no formal data quality processes, and the processes that do exist are not repeatable because they are neither well defined nor well documented.

Maturity Level 2 - Repeatable and Reactive:

Maturity Level 2 - Repeatable and Reactive. Fundamental data management practices are established, defined, documented and can be repeated. Data policies for creation and change management exist, but still rely on individuals and are not institutionalised throughout the organisation. Data as a valuable asset is a concept understood by some, but senior management support is lacking and there is little organisational buy-in to the importance of an enterprise-wide approach to managing data. Data is stored locally and data quality is reactive to circumstances. Requirements are known and managed at the business unit and application level. Procurement is ad hoc, based on individual needs, and data duplication is mostly invisible. Data quality varies among business units and data failures occur on a cross-functional basis. Most data is integrated point-to-point and not across business units.

Maturity Level 3 - Defined and Standardised:

Maturity Level 3 - Defined and Standardised. Business analysts begin to control the data management process, with IT playing a supporting role. Data is recognised as a business enabler and moves from an undervalued commodity to an enterprise asset, but there are still limited controls in place. Executive management appreciates and understands the role of data governance and commits resources to its management. A data administration function exists as a complement to the database administration function, and data is present in both business and IT related development discussions. Some core data has a defined policy that is documented as part of the application development lifecycle; the policies are enforced to a limited extent and testing is performed to ensure that data quality requirements are being achieved. Data quality is not fully defined and there are multiple views of what quality means. A metadata repository exists and a data group maintains corporate data definitions and business rules. A centralised platform for managing data is available at the group level and feeds analytical data marts. Data is available to business users and can be audited.

Maturity Level 4 - Managed and Predictable:

Maturity Level 4 - Managed and Predictable. Data is treated as a critical corporate asset and viewed as equivalent to other enterprise-wide assets. A unified data governance strategy exists throughout the enterprise, with executive-level and CEO support. Data management objectives are reviewed by senior management. Business process interaction is completely documented and planning is centralised. Data quality control, integration and synchronisation are integral parts of all business processes. Content is monitored and corrected in real time to manage the reliability of the data manufacturing process, based on the needs of customers, end users and the organisation as a whole. Data quality is understood in statistical terms and managed throughout the transaction lifecycle. Root cause analysis is well established and proactive steps are taken to prevent, and not just correct, data inconsistencies. A centralised metadata repository exists and all changes are synchronised. Data consistency is expected and achieved. The data platform is managed at the enterprise level and feeds all reference data repositories. Advanced platform tools are used to manage the metadata repository and all data transformation processes. Data quality and integration tools are standardised across the enterprise.

Maturity Level 5 - Optimising and Innovating:

Maturity Level 5 - Optimising and Innovating. The organisation is in continuous improvement mode. Process enhancements are managed through monitoring feedback and a quantitative understanding of the causes of data inconsistencies. Enterprise-wide business intelligence is possible. The organisation is agile enough to respond to changing circumstances and evolving business objectives. Data is considered the key resource for process improvement. Data requirements for all projects are defined and agreed prior to initiation. Development stresses the re-use of data and is synchronised with the procurement process. The process of data management is continuously being improved. Data quality (both monitoring and correction) is fully automated and adaptive. Uncontrolled data duplication is eliminated and controlled duplication must be justified. Governance is data driven and the organisation adopts a "test and learn" philosophy.

Data Management Maturity Evaluation - Key Capabilities and Maturity Levels:

Data Management Maturity Evaluation - Key Capabilities and Maturity Levels. Each of the ten data management capabilities - Data Governance; Data Architecture Management; Data Development; Data Operations Management; Data Security Management; Reference and Master Data Management; Data Warehousing and Business Intelligence Management; Document and Content Management; Metadata Management; Data Quality Management - is evaluated against Levels 1 to 5, with a description of the capability associated with each maturity level.
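The evaluation grid can be captured as a simple scoring routine: each capability receives a level from 1 to 5, and the summary reports overall maturity and the weakest capability. This is an illustrative sketch under the capability list above, not a DMBOK-prescribed calculation.

```python
# The ten data management capabilities from the maturity evaluation grid.
CAPABILITIES = [
    "Data Governance", "Data Architecture Management", "Data Development",
    "Data Operations Management", "Data Security Management",
    "Reference and Master Data Management",
    "Data Warehousing and Business Intelligence Management",
    "Document and Content Management", "Metadata Management",
    "Data Quality Management",
]

def assess(scores):
    """Validate per-capability maturity levels (1-5) and summarise."""
    for cap in CAPABILITIES:
        if scores.get(cap) not in (1, 2, 3, 4, 5):
            raise ValueError(f"{cap}: level must be 1-5")
    overall = sum(scores[c] for c in CAPABILITIES) / len(CAPABILITIES)
    weakest = min(CAPABILITIES, key=lambda c: scores[c])
    return {"overall": round(overall, 1), "weakest": weakest}
```

Repeating the assessment over time gives the trend measure described earlier, and the weakest capability is a natural candidate for the next improvement project.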

More Information:

More Information. Alan McSweeney alan@alanmcsweeney.com
