Presentation Transcript

Danish Legal Deposit on the Internet: Current Solutions and Approaches for the Future
ECDL, September 2001
by Birgit N. Henriksen
Head of Digitization and Web Department
The Royal Library, Denmark
bnh@kb.dk

Presentation outline
Three different initiatives:
- Selection-based archiving (in production since 1998)
- netarchive.dk (new project, multiple archiving strategies, 2001-2002)
- Nordic Web Archive (project 2000-2001, access to web archives)

The Danish Legal Deposit Law
- 1697: All printers in royal and ducal lands must deposit
- 1703: Only printers in Copenhagen have to deposit
- 1781: All printers in royal and ducal lands must deposit
- 1902: All printed materials to be deposited
- 1927: Posters and some types of ephemera excluded
- 1997: All published works to be deposited

The law from 1997 covers
- any work published in Denmark regardless of medium
- “work”: a delimited quantity of information which must be considered a final and independent unit
- “published”: when … copies of the work have been placed on sale or otherwise distributed to the public

Types of Net Publications
- Static (only periodically updated): included
  - monographs
  - periodicals
- Dynamic (continuously updated): excluded
  - databases
  - homepages

www.pligtaflevering.dk

How do we get the material?
- Download based on notification
NOT
- Harvesting the Danish domain
- Delivery of works (a collection of files) from the individual publishers

Registration
- WHO: the person in charge of the technical completion of the digital copy
- HOW: by filling out a form at http://www.pligtaflevering.dk

Registration Form - Monographs

Download - workflow
The staff at the Danish Department, The Royal Library:
- determines whether a publication is covered by the law
- if yes, downloads all files belonging to the work
- checks the downloaded work
- catalogues and classifies the work in the OPAC (only periodicals)
- transfers the work to the archival server (server mirrored nightly to the State and University Library, Århus)

Plug-ins

System Environment

Domain names in .dk domain

Volume in archived material

Monographs vs Periodicals

Public vs. Private Publishers

Staff resources

MimeType Statistics – % of collected files

Three generations using the internet

The modifications from 1902
- Brochures and advertisements
- Catalogues
- Election campaign material
- Club/organisation magazines
- Songs
- Scouting magazines, church newsletters
- Maps
- Portraits
- Art prints

Problems related to the notification concept
- Lack of notification of multiple representations of a publication
- Lack of notification of new versions

Problems related to technical issues
- Errors or inconsistencies in the published files
- Java applets: no solution at the moment
Solutions found to previous problems:
- Documents with JavaScript
- Data behind forms
- Data behind username/password logins
- Cookie-based session handling
- SSL encryption

Gains if harvesting is used
- Better coverage of Denmark outside the public sphere
- Updated versions, also for static publications
- New trends on the net as soon as they appear

Why not only harvesting?
- Programs and plug-ins are difficult to keep track of
- Harvesting is not always possible (e.g. streamed and webcast material)
- Harvesting may not give a useful result
  - technical problems (Java, interactive sites)
  - personalised sites
- Harvesting may produce a collection of documents that have never existed on the net
- Harvesting may not always give the best format for long-term preservation

Net Art

Home banking

Searching the catalogue

Collections made by harvesting
- Are not complete (see previous slides)
- No robot will ever be able to make a ’true’ snapshot: the snapshot contains a mix of documents that have never been published together at the same time, i.e. a ’fake’

Archive for Danish Literature
www.adl.dk from 1 October 2001
- All full texts are structured in XML on work level
- The XML is loaded into a database
- The database performs the web publishing in well-formed HTML on a page level
What do we prefer to archive and for what purpose?

Birte Christensen-Dalsgaard: Archive Experience, not Data

Web Archiving Conference, CPH June 2001
Focus: User expectations of web archiving in DK
Brought together:
- members of the user community, scholars as well as scientists
- members from the organisations traditionally in charge of preserving oral and written material
- members with technical knowledge
Proceedings (UK) – netarchive.dk

Web Archiving Conference, CPH June 2001
Scholars & scientists:
- Archive the dynamic part of the web
- Focus on archiving
  - the content
  - the context
  - the evidence of use
Archivists:
- Use different archiving approaches
- New methods for archiving dynamic material
- Budgets for making snapshots and making selective collections are comparable

Birte Christensen-Dalsgaard: 3 dimensions - duration
Diagram axis: Signal lifetime (Real time dialog / Published, static)
Diagram labels: Hourly update; Book-like publications; Scientific journals; News sites; Chat

Birte Christensen-Dalsgaard: 3 dimensions - Permanent value
Diagram axis: Permanent value (Transient / Persistent)
What is worth preserving? Quality vs. representative

Birte Christensen-Dalsgaard: Background - Nature of Information
Diagram axes: Interactivity (Static / Dynamic); Permanent value (Transient / Persistent); Signal lifetime (Real time dialog / Published, static)
Diagram labels: Hourly update

Birte Christensen-Dalsgaard: Domain of different harvesting methods
Diagram axes: Interactivity (Static / Dynamic); Permanent value (Transient / Persistent); Signal lifetime (Real time dialog / Published, static)
Diagram labels: Legal Deposit, DK; Hourly update; Accumulative harvesting; Snapshot

Birte Christensen-Dalsgaard: What is missing?
Diagram axes: Interactivity (Static / Dynamic); Permanent value (Transient / Persistent); Signal lifetime (Real time dialog / Published, static)
Diagram labels: Legal Deposit, DK; Hourly update; Accumulative harvesting; Snapshot

netarchive.dk (1)
Diagram axes: Interactivity (Static / Dynamic); Permanent value (Transient / Persistent); Signal lifetime (Real time dialog / Published, static)
Diagram labels: Accumulative; Snapshot; Process
Test different archival approaches and the subsequent usability of the archived material for research

netarchive.dk (2)
Pilot project testing different archival approaches and the subsequent usability of the archived material for research
Project partners:
- State and University Library, Aarhus
- Centre for Internet Research
- The Royal Library
With economic support from the Danish Electronic Research Library (DEF)
Period: August 2001 – July 2002
Case: Danish municipal elections, November 2001

netarchive.dk (3)
- Which materials, with what frequency?
- Collection method?
- Which software?
- How should the collection of materials be organized and how should it be stored?
- How should obsolescence of data formats be dealt with?
- How should access be given?
- Budgets for collecting and storing

netarchive.dk (4)
Net material covered by netarchive.dk:
- net activities from existing news media (newspapers, radio, TV; national, regional and local media)
- political parties' official pages, national and local
- individual politicians’ personal pages
- official (county) municipal pages
- voters’ personal pages
- »local themes« pages
- special interest organisations
- portals in the broadest sense
- opinion polling firms
- public emails / press releases
- news groups / Usenet
- net conferences and chat

How do we catch the missing part?
- Process rather than material: ‘filming’ the net through a browser
- Goal: catch chronological series of displayed web pages
- Tools to take into consideration:
  - Business intelligence tools
  - Tools used in usability laboratories
  - …

Nordic Web Archive (NWA)
- Establish a Danish test archive in order to participate in NWA
- Software: NEDLIB robot
- Status 1/9 2001:
  - Archiving started 20/8 2001
  - 1.9 million documents
  - 43 GB uncompressed data

Slide 44: Questions?
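As a rough worked example (not part of the original slides), the status figures on the Nordic Web Archive slide imply an average document size and an average harvest rate. The sketch below assumes that "43 GB" means decimal gigabytes and that harvesting ran continuously from 20 August to 1 September 2001. Neither assumption is stated in the slides, and all names in the code are illustrative.

```python
from datetime import date

# Figures quoted on the Nordic Web Archive status slide.
DOCUMENTS = 1_900_000            # "1.9 mio documents"
TOTAL_BYTES = 43 * 10**9         # "43 GB"; ASSUMED to be decimal gigabytes
START = date(2001, 8, 20)        # "Archiving started 20/8 2001"
STATUS = date(2001, 9, 1)        # "Status 1/9 2001"


def average_document_size(total_bytes: int, documents: int) -> float:
    """Mean size of one harvested document, in bytes."""
    return total_bytes / documents


def documents_per_day(documents: int, start: date, status: date) -> float:
    """Mean harvest rate, ASSUMING the robot ran for the whole period."""
    elapsed_days = (status - start).days
    return documents / elapsed_days


avg_kb = average_document_size(TOTAL_BYTES, DOCUMENTS) / 1000
rate = documents_per_day(DOCUMENTS, START, STATUS)

print(f"average document size: about {avg_kb:.1f} kB")
print(f"average harvest rate: about {rate:,.0f} documents per day")
```

Under these assumptions the test archive averages a little over 22 kB per document. If "GB" was meant as a binary unit, or if the robot was paused during the period, the derived figures shift accordingly.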