Presentation2


National Digital Archive of Datasets: 

Jim Jamieson
Senior Archivist
University of London Computer Centre
j.jamieson@ulcc.ac.uk

NDAD: 

National Digital Archive of Datasets
- National = UK Government
- Digital = electronic records, mostly databases and digital documents
- Archive = records no longer in current use, in need of preservation
- Datasets = databases and collections of data

Context: 

- Under contract to The National Archives (TNA)
- A long-term contract
- Based at University of London Computer Centre
- Service is a unique combination of archival and digital preservation skills

What NDAD does: 

- Captures information from databases
- Preserves the data and metadata long-term
- Renders data searchable online for users
- Closes data where appropriate
- Offers supporting documentation
- Provides full descriptive catalogues
- Develops systems and solutions to enable this

What NDAD doesn’t do: 

- Keep copies of programs (not a hardware/software museum)
- Emulate programs, or their functionality
- Allow searchers to visit
- Choose its holdings
- Provide first-line support to its users

NDAD workflow diagram: 

[Diagram: NDAD workflow. Labels shown, in extraction order: Archivists; Administration; Data specialists; Scanning & digitisation; The National Archives; Upload; Transfer of dataset; Cataloguing (EAD/XML); Acquire documentation; Access conditions; Thesaurus / indexing; Stage 1 Receipt; Target date; Suspensions; Communications; Dealings with TNA; Stage 2 Receipt; Dataset ingest; Analysis; Data transformation / conversion processes; Validate / check; NDAD processing; Conversion; Scan paper documents; Processing; Checking; NDAD preservation; Catalogues upload; Data upload; Documents upload; Publish to live site; Dealing with Govt Depts; Dataset documentation; Development; Software support; Tool design; Process development]

Accessioning: 

- Workflow diagram: the Open Archival Information System (OAIS) functional model was used as the basis for this
- ISO 9000 procedure documents each stage of the ingest process
- Detailed procedure manuals
- But… no such thing as a “typical” dataset

How it happens: 

- Large-scale database projects get ‘spotted’ by TNA Client managers
- NDAD and TNA negotiate with creating Department
- Agree on timetable for transfer
- Transfer forms
- Receive data and documentation

Capturing a dataset: 

- Can be a ‘snapshot’ of a still-current or semi-current data collection
- Or a copy of a ‘dead’ database which has been superseded
- Data specialists suggest best method for export/copying process
- Ideally, process will be well documented

Processing a dataset: 

- NDAD processes the data into a format that doesn’t rely on proprietary software (e.g. CSV)
- This allows long-term preservation of the data
- Also makes it browsable and searchable online
- And allows downloads of tables, or even entire datasets
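As an illustration of the conversion step, here is a minimal Python sketch that copies one database table into CSV. It is not NDAD's actual tooling: SQLite stands in for whatever source system a department supplies, and the function name, table name and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header row first."""
    # Quote the identifier so unusual table names survive. The name should
    # come from a trusted source such as the database's own catalogue.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


def demo_export():
    """Build a tiny in-memory table (hypothetical data) and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schools (id INTEGER, name TEXT, pupils INTEGER)")
    conn.executemany(
        "INSERT INTO schools VALUES (?, ?, ?)",
        [(1, "Hill Road Primary", 210), (2, "St Mary's, East", 145)],
    )
    buffer = io.StringIO()
    export_table_to_csv(conn, "schools", buffer)
    conn.close()
    return buffer.getvalue()


if __name__ == "__main__":
    print(demo_export())
```

The result is plain text with one header row and one line per record, so it can be read without the software that created the original database.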

Documentation: 

- Documents are records too!
- NDAD identifies and selects any documentation which helps users interpret and understand the dataset better:
  - How it was created and used
  - Who used it
  - When it was used
  - What it was used for

Rendering Documents: 

- Aim is to make these available online
- We convert digital docs, putting them in common (and safe) formats: PDF, TXT…
- We scan paper docs
- We always make TXT versions available
- Even websites now rendered as docs

Catalogue: 

- NDAD provides full descriptive catalogue of everything: datasets and documents
- Done to ISAD(G) standards, i.e. hierarchical
- Datasets arranged in Series
- Online catalogue, with hyperlinks
- Covers system history (Government computing)
- Searchable, with automated thesaurus terms
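The hierarchical arrangement can be pictured as a tree of catalogue entries. The Python sketch below is illustrative only: the level names (fonds, series, item) are standard ISAD(G) levels of description, but the class, the reference codes and the titles are hypothetical and do not reflect NDAD's real catalogue schema.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One description in a multilevel catalogue (hypothetical structure)."""
    level: str       # e.g. "fonds", "series", "item"
    reference: str   # hypothetical reference code
    title: str
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a lower-level description and return it for chaining."""
        self.children.append(child)
        return child


def outline(entry, depth=0):
    """Return the hierarchy as indented lines, general to specific."""
    lines = ["  " * depth + f"{entry.reference} [{entry.level}] {entry.title}"]
    for child in entry.children:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    root = CatalogueEntry("fonds", "EX", "Example department")
    series = root.add(CatalogueEntry("series", "EX/1", "Example survey datasets"))
    series.add(CatalogueEntry("item", "EX/1/1", "Example dataset, 1990"))
    print("\n".join(outline(root)))
```

Each entry describes only what is specific to its own level, and a reader moves from the general description down to an individual dataset.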

Thesaurus/Indexing: 

- All catalogues indexed
- UNESCO thesaurus enhanced by in-house thesaurus and UKAT
- NCA Rules for the Construction of Personal, Place and Corporate Names

Access: 

- Many of the datasets held by NDAD are subject to restrictions on access
- But few are closed completely
- Public Records Act: “30 Year Rule”
- Data Protection Act
- Freedom of Information
- Copyright

Access: 

- Selective closure of:
  - Datasets
  - Tables
  - Fields
  - Documents
- Aggregation
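Closure at table and field level can be sketched as a filter applied before data is released. The Python below is a minimal illustration under assumed conventions, not NDAD's implementation: the function name, the `[closed]` marker and the sample data are hypothetical.

```python
CLOSED = "[closed]"  # hypothetical marker shown in place of a withheld value


def apply_closure(tables, closed_tables=(), closed_fields=()):
    """Return an open-access copy of `tables`; the input is not modified.

    tables:        dict mapping table name -> list of row dicts
    closed_tables: names of tables withheld entirely
    closed_fields: (table, field) pairs whose values are masked
    """
    closed_tables = set(closed_tables)
    closed_fields = set(closed_fields)
    released = {}
    for name, rows in tables.items():
        if name in closed_tables:
            continue  # the whole table is closed
        released[name] = [
            {
                field: (CLOSED if (name, field) in closed_fields else value)
                for field, value in row.items()
            }
            for row in rows
        ]
    return released


if __name__ == "__main__":
    data = {
        "claims": [{"id": 1, "name": "A. Person", "amount": 100}],
        "notes": [{"id": 1, "text": "internal remark"}],
    }
    print(apply_closure(data, {"notes"}, {("claims", "name")}))
```

Masking a field rather than dropping it keeps the table's structure visible, so users can see that a column exists even when its contents are closed.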

Examples: 

Schools’ Census
- Good run of statistics from 1975 onwards
- Data already published, but can be re-purposed in many ways
- A successful ‘data rescue’ operation, reconstructing information from truncated fields

Examples: 

Home and Leisure Accident Survey
- Unusual collection of records of accidents in the home
- TNA thought this would be ‘popular’ with public
- So large, NDAD had to divide it into annual tables
- Oddly enough, created by DTI (not a Health Department)
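Dividing one very large table into annual tables might look like the Python sketch below. It assumes each record carries an ISO-style date string (YYYY-MM-DD); the field names and sample rows are hypothetical and are not taken from the survey itself.

```python
from collections import defaultdict


def split_by_year(rows, date_field):
    """Group row dicts into one list per year.

    Assumes row[date_field] is a string that starts with a four-digit year,
    as in an ISO date such as "1998-03-01".
    """
    annual = defaultdict(list)
    for row in rows:
        year = int(row[date_field][:4])
        annual[year].append(row)
    return dict(annual)


if __name__ == "__main__":
    records = [
        {"date": "1998-03-01", "injury": "burn"},
        {"date": "1999-07-12", "injury": "fall"},
        {"date": "1998-11-30", "injury": "cut"},
    ]
    for year, table in sorted(split_by_year(records, "date").items()):
        print(year, len(table))
```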

Examples: 

Enemy Property Claims Assessment Panel (EPCAP)
- Interesting subject (Nazi war victims being compensated by UK Govt)
- But surprisingly dull data
- Odd case where the Panel offered their data online, and it got out of sync with their own database

Examples: 

National Lottery Awards Database
- Details of nearly 250,000 lottery grants
- Also available via departmental website
- Documentation included User Guide in the form of a Windows Help file

Examples: 

Local Heritage Initiative
- HLF awards for community heritage projects managed by Countryside Agency
- First dataset received by NDAD to include multimedia content
- Not yet available on NDAD website

Questions: 

http://www.ndad.nationalarchives.gov.uk