AHDS-Introduction to Creating Digital Resources.pdf


  • 8/10/2019 AHDS-Introduction to Creating Digital Resources.pdf


    The AHDS provides subject-focused specialist advice through its five Centres for History, Visual Arts, Performing Arts, Archaeology, and Language, Linguistics and Literature. If uncertain, please contact the AHDS Executive.

    Creating 'Fit for Purpose' Digital Resources

    A place can be found within the arts and humanities for virtually any kind of digital resource, ranging from transcribed texts to immersive virtual reality models. These resources are created using a wide variety of tools and techniques, and with so many possibilities on offer it can be difficult to select the best option, or even to understand how best to assess the options. Keep in mind the maxim 'fit for purpose'. The tools and techniques that you use to create a digital resource should be determined by the intended purpose of that resource; it is important to ensure that technology is used to support the research or pedagogical objectives of a project, whilst not coming to dominate them.

    One of the benefits of thinking about building a digital resource that will be fit for purpose, instead of one that is led by the available technology and expertise, is that it helps keep the focus on the research, learning and teaching objectives of the project, rather than on the technical methodology that is used to achieve them. It should be possible to describe the goal of a digital resource creation project in the arts and humanities in non-technical language, and from this it should be possible to describe non-technical criteria that can then be used to assess the possible technical approaches.

    Digital resources are often expensive to create, and as well as ensuring the finished resource meets the requirements of the project, it is worth spending some time making the resource flexible enough to be used by others in the future, perhaps for entirely different purposes. A key part of this, and probably the most important, is documentation. A well-documented resource engenders confidence, and is both easier to reuse and more likely to be reused.

    Documentation should cover provenance, methodology, technical standards and unresolved problems. Documentation is discussed in more detail in the AHDS Guidelines for Documenting Digital Resources.

    The Project Team

    A digital resource creation project relies on the expertise of the project team to achieve its goals. Digital resource creation projects in the arts and humanities require both subject expertise and technical expertise.

    When technical work is outsourced, it is important that project staff develop a solid working knowledge of the relevant technical issues so that they can understand and properly assess the recommendations of their technical contractors. When technical staff are brought into a project, it is important that their work is not carried out in isolation from the subject matter of the project. One of the most valuable commodities a digital resource creation project can have is staff who understand both the subject area and the technical issues.

    Remember that technical decisions are interwoven with the overall subject-focused objectives of the project, so it is important to ensure that technical decisions are not made in isolation from wider considerations.

    Planning

    Early Planning

    Many projects do not adequately plan how they will create their digital resource. Planning is often left too late and there is too little of it. Early planning does not need to be especially detailed to be useful. A short one- or two-page project brief that describes the project's purpose, its intended audience and the likely content will help to clarify the project's objectives, and will provide a useful summary that can be shared with the AHDS and other technical experts. This type of document will be sufficient for technical specialists to suggest possible technical approaches and to highlight potential pitfalls. Seeking out this type of advice early on means that the project will have enough time to examine several possible technical solutions and decide which one will best meet the project's objectives.

    Your project should be guided by a firm understanding of the content and the intended purpose of the digital resource. This understanding will help you manage some of the trade-offs involved in creating a digital resource:

    Amount and detail versus time and cost of creation

    Complexity of the digital resource versus ease of use

    Flexibility of the digital resource versus suitability for a specific use

    Content creation with current technology versus future possibilities

    Trials, Prototypes and Pilots

    Creating a digital resource is a practical activity, and the value of trying things out ahead of time should not be underestimated. Trials, prototypes and pilots are all useful, particularly in the areas of digitisation, software and hardware selection, and data entry. Including a formal piloting stage in your project can be especially worthwhile. It can be used to test the feasibility of different methods of digitisation, or to check how your target audience responds to the resource.

    One area where trials and pilots are extremely useful is IPR (Intellectual Property Rights): establishing in practice what you will need to do to ensure that you have all the rights and permissions needed to access and use material owned or held by others.

    The Project Timetable

    The basis for a sound project timetable is a clear understanding of how long each task will take and the order in which the tasks must be completed. The time allocated to each task must be based on realistic estimates of the effort required. Trials, prototypes and pilots can all help to inform the project timetable.

    Once the main tasks have been identified, and the time needed to complete them has been estimated, a project timetable can be drawn up. The timetable should show how long each task will take, the order in which tasks will be started and finished, and what the members of the project team are doing at any given point in time. Links between tasks should be clearly specified in the project timetable. A timetable like this will help to reveal problems such as staff who are committed for more than 100% of their time, or tasks scheduled to begin before prerequisite tasks have been finished.
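
    The two problems mentioned above can be checked mechanically once a timetable is written down as data. The following is a minimal sketch in Python, not a recommended tool: the task names, week numbers, staff names and the structure of TIMETABLE are all invented for illustration, and weeks are assumed to be whole and inclusive.

```python
from collections import defaultdict

# Hypothetical timetable (all names and figures are invented):
# task -> (start week, end week, prerequisite tasks, {person: share of their time})
TIMETABLE = {
    "pilot": (1, 2, [], {"alex": 0.5}),
    "digitise": (3, 10, ["pilot"], {"alex": 0.8, "sam": 1.0}),
    "metadata": (8, 14, ["digitise"], {"alex": 0.5}),
}


def early_starts(timetable):
    """Return (task, prerequisite) pairs where the task starts before the prerequisite ends."""
    problems = []
    for task, (start, _end, prerequisites, _staff) in timetable.items():
        for prerequisite in prerequisites:
            prerequisite_end = timetable[prerequisite][1]
            if start <= prerequisite_end:
                problems.append((task, prerequisite))
    return problems


def overcommitted(timetable):
    """Return sorted (person, week) pairs where allocations add up to more than 100%."""
    load = defaultdict(float)
    for start, end, _prerequisites, staff in timetable.values():
        for week in range(start, end + 1):
            for person, share in staff.items():
                load[(person, week)] += share
    # A small tolerance avoids false alarms from floating-point rounding.
    return sorted(key for key, total in load.items() if total > 1.0 + 1e-9)


print(early_starts(TIMETABLE))
print(overcommitted(TIMETABLE))
```

    In this invented timetable, "metadata" is flagged because it starts in week 8 while "digitise" runs until week 10, and "alex" is flagged for weeks 8 to 10 because the two overlapping allocations total 130%.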

    By identifying all the interdependencies between tasks in your project, you can identify its critical path. The critical path is the sequence of dependent tasks that must all be completed on time for the entire project to be completed on time. A delay to any task on the critical path will delay later tasks, and the entire project will fall behind schedule.
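
    As an illustration of the idea, the critical path can be found by looking for the longest chain of dependent tasks. The sketch below assumes each task has a fixed duration and a list of prerequisites, and that the dependencies contain no cycles; the task names and durations in TASKS are hypothetical.

```python
from functools import lru_cache

# Hypothetical tasks (invented for illustration):
# task -> (duration in weeks, prerequisite tasks)
TASKS = {
    "clear_rights": (4, []),
    "pilot_scanning": (2, []),
    "digitise": (10, ["clear_rights", "pilot_scanning"]),
    "create_metadata": (6, ["digitise"]),
    "build_website": (5, ["pilot_scanning"]),
    "document": (3, ["create_metadata", "build_website"]),
}


def critical_path(tasks):
    """Return (total duration, ordered task names) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def longest_to(name):
        duration, prerequisites = tasks[name]
        # Find the longest chain that must finish before this task can start.
        best_length, best_chain = 0, ()
        for prerequisite in prerequisites:
            length, chain = longest_to(prerequisite)
            if length > best_length:
                best_length, best_chain = length, chain
        return best_length + duration, best_chain + (name,)

    total, chain = max(longest_to(name) for name in tasks)
    return total, list(chain)


total_weeks, path = critical_path(TASKS)
print(total_weeks, path)
```

    With these invented figures the longest chain runs through rights clearance, digitisation, metadata creation and documentation, and takes 23 weeks. The website build is not on the critical path, so it could slip by some weeks without delaying the project.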

    A good project timetable is a useful tool for monitoring progress. Regular reports by those responsible for each task can be compared with the progress anticipated in the project timetable, allowing problems to be identified and resolved as early as possible. Progress can only be monitored with up-to-date information, and this is best provided through some formal (but possibly quite simple) framework of meetings and reporting intended to share information about progress between project members.


    Project Management

    A large team of academics, research students and technical specialists can be involved in the project, and good project management is vital in ensuring that work is coordinated and delivered on time. Projects should clearly identify who has the authority to act if the project is not going to plan. Usually, a single project manager is best, possibly supported by a management or advisory committee in the case of larger projects.

    Digital resource creation projects need to be managed with a degree of formality. Reports and meetings should be used to track progress against the project timetable and to help identify problems as early as possible. In many projects, technical tasks are undertaken and managed somewhat separately from other parts of the project, and in this situation it is important to ensure that the timing and objectives of technical work remain coordinated with other project objectives. Planning the technical work in distinct stages, moving from design documents through to prototypes and then periodic reviews of ongoing work before the final resource is completed, can help. Each of these milestones provides an opportunity to judge how much concrete progress has been made.

    Developing a Digital Resource

    Design

    It is useful to think about designing a digital resource in terms of the following three elements:

    The underlying data the resource contains

    The software (and hardware) that is needed to make sensible use of the data

    The user interface through which the user interacts with the software to retrieve, search, and manipulate the data

    The software and user interface can be thought of as layers that lie on top of the actual data, making it easier to work with the data but, at the same time, constraining the way in which a user can access and manipulate the underlying data your digital resource contains. Consider, as an example, a collection of digital images based on the works held in an art gallery. The digital resource could simply consist of a set of TIFF images. However, you will also want to provide users with some information about each image. You could create HTML pages that contain both the image and the information about it, but now the user will have to have access to a web browser if they are to view the resource properly. You may then wish to allow users to search for specific images by keyword. This could be done by creating a database of keywords that can be queried from the webpage, but now you will need to run a database server alongside the web server. Each additional piece of functionality complicates the resource further.

    A digital resource should be designed so that its core content is as independent as possible from the means of accessing that content. This will help keep the finished resource as flexible as possible, allowing it to change and develop to meet unanticipated requirements while avoiding becoming locked into obsolete software, hardware or methods of interacting with data. To achieve these aims, you need to start by thinking about the underlying structure and organisation of the data without worrying about how it will ultimately be presented or stored for specific uses. Your design goal should be to hold master versions of all your data in forms that can be converted to meet varying purposes.

    Type of Data     Environments
    Texts            Web, print, textual analysis software
    Datasets         Spreadsheets, databases, statistical analysis, dynamic web site
    Audio            Streamed from Web, desktop PC, editing software
    Still Images     Web, published print quality, OCR (for vector images: GIS, CAD)
    Moving Images    Streamed from Web, progressive download, DVD player, desktop PC, editing software, TV broadcast

    Basically, designing a digital resource involves answering the question: what are you developing? Is it a database, a website, a GIS (Geographical Information System), a catalogue or some other type of resource? Each type of digital resource entails a different approach to organising information (a data model), which will be appropriate for some tasks, but not for others. Perhaps the simplest data model is embodied in plain text files. These files simply store a sequence of numeric codes which represent characters. All you need to know is which character each code represents. A far more sophisticated data model that many people have some experience of is the relational data model, used by most database software applications, such as Microsoft Access. This data model imposes a range of constraints on how the content of a database can be organised (it must be arranged in discrete fields, each record must be unique and so on) which ensure that data is organised consistently and predictably, so that the validation, searching and display of data can be automated.
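
    To show what such constraints look like in practice, here is a small sketch using SQLite through Python's standard sqlite3 module rather than Microsoft Access. The table name, column names, year range and sample records are all invented for illustration; a real project would define its own fields and rules.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # temporary in-memory database

# Hypothetical table: discrete fields, a unique key, and a simple validation rule.
connection.execute(
    """
    CREATE TABLE artwork (
        accession_no TEXT PRIMARY KEY,   -- each record must be unique
        title        TEXT NOT NULL,      -- one discrete field per fact
        artist       TEXT NOT NULL,
        year_made    INTEGER CHECK (year_made BETWEEN 1000 AND 2100)
    )
    """
)
connection.execute(
    "INSERT INTO artwork VALUES (?, ?, ?, ?)",
    ("A001", "Harbour at Dusk", "Unknown artist", 1887),
)


def try_insert(row):
    """Attempt an insert; return True if the database accepted the row."""
    try:
        connection.execute("INSERT INTO artwork VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        # Raised when a row breaks the PRIMARY KEY, NOT NULL or CHECK rules.
        return False


duplicate_accepted = try_insert(("A001", "Another title", "Someone", 1900))
bad_year_accepted = try_insert(("A002", "Study", "Someone", 18870))
valid_accepted = try_insert(("A003", "Study II", "Someone", 1901))

rows = connection.execute(
    "SELECT accession_no FROM artwork ORDER BY accession_no"
).fetchall()
print(duplicate_accepted, bad_year_accepted, valid_accepted, rows)
```

    The duplicate accession number and the mistyped year are both rejected by the database itself, which is the kind of automated validation the relational model makes possible.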

    Different data models are implemented using different sets of standards, file formats and software, so it is very important to understand the type of resource you are building before proceeding. For example, laying out a table as HTML for a web page is appropriate if you want people to read the table, but storing the table as delimited text and loading it into a spreadsheet would be more appropriate if you plan to perform complex calculations on the table.
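
    The same point can be illustrated by holding a table once, as delimited text, and deriving other forms from it when needed. The sketch below is only one possible approach; the column names and figures in MASTER_CSV are made up, and the helper functions are hypothetical.

```python
import csv
import io
from html import escape

# Hypothetical master data held as delimited text (invented values).
MASTER_CSV = """parish,year,baptisms
St Mary,1801,42
St Mary,1802,38
St John,1801,17
"""


def read_rows(text):
    """Parse delimited text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def to_html_table(rows):
    """Render the rows as an HTML table for reading on a web page."""
    columns = list(rows[0].keys())
    header = "".join(f"<th>{escape(name)}</th>" for name in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(row[name])}</td>" for name in columns) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


rows = read_rows(MASTER_CSV)
# A calculation and a presentation form, both derived from the same master copy.
total_baptisms = sum(int(row["baptisms"]) for row in rows)
html_table = to_html_table(rows)
print(total_baptisms)
print(html_table)
```

    Because the HTML is generated from the delimited master rather than typed by hand, the presentation can be changed or regenerated without touching the data.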

    Resource Type              Things to investigate
    Texts                      XML, TEI, Dublin Core, PDF
    Dataset                    Relational data model, SQL, normalisation, XML
    GIS                        Vector and raster data models, polygon topology, Open GIS standards
    Library/Archive Catalogue  XML, OAI, Dublin Core, subject specific metadata schemas (e.g. DDI, VRA Core), XSLT, controlled vocabularies
    Website                    XHTML, W3C web accessibility standards, database connectivity (ODBC, ADO, JDBC), scripting languages (PHP, JavaScript, ASP)
    Audio Clips                Lossless compression, MP3, sampling rates, bit rate
    Still Images               Resolution and colour depth, TIFF, PNG, lossless compression, NISO technical metadata, VRA Core 3.0 metadata, Dublin Core
    Moving Images              Compression, MPEG, frame rate, resolution and colour depth, screen size, 'codecs'

    For more detailed information about different types of digital resources and the issues involved in designing them, you should read the AHDS Guide to Good Practice series. Some basic characteristics apply across all types of digital resource and suggest that a resource has been well designed:

    Repetitive tasks can be easily automated

    Data structures are consistent, well defined and documented


    Data is created according to consistent rules

    The presentation of data can be easily changed

    If you find that one of these points doesn't apply to your digital resource, then you can probably improve it, making it more efficient (and your work less taxing!).

    Selecting Hardware

    All projects need hardware, but most projects need not be too concerned about the exact specifications of desktop PCs, laptops and printers. Standard computing hardware is now very powerful, and will meet most needs, although projects should always seek advice about purchasing these items from their organisation's IT support service.

    More attention needs to be paid to the purchase of scanners, digital cameras and other digitisation tools. These devices directly determine the quality of your digital master versions, so care should be taken to compare and test different devices before making a purchase.

    Selecting Software

    Digital resource creation projects in the arts and humanities may need a wide range of software, and it is not possible to discuss each category in detail here. A good rule to apply to any situation is to select software that implements relevant standards and allows data to be easily imported and exported. By adopting standards you will make it easier to share your data with others, and it will be easier for them to understand the data. Standard formats are also the most likely to import into another piece of software without any loss of formatting or structure. By selecting software with lots of export options you can minimise the risk of your data becoming dependent on an inappropriate or obsolete piece of software (and again, make it easier to share your data with others).

    Creating, Acquiring and Digitising Content

    The content of a digital resource may be created from scratch, digitised from existing analogue sources or taken from existing digital material. However you obtain content for your digital resource, the process should follow documented procedures and be consistent over time. Simple techniques, such as template documents and automation using macros, can help to maintain consistency and reduce the likelihood of errors. This is especially important if more than one person will be doing the same work in parallel, as it is remarkably easy for differences in practice to creep in.

    Digitisation

    Digitisation is the central component of many digital resource creation projects in the arts and humanities. Still images (photographs, artwork) and written documents are the commonest targets for digitisation, but audio and moving image recordings are also digitised, along with a range of more esoteric sources.

    Probably the most important task associated with digitisation is obtaining all the rights and permissions needed to gain access to the material, digitise it, and use the resulting digital surrogates as you wish. While investigating these issues, it is sensible to also assess the practical difficulties that the material you plan to digitise may present. Fragile material, materials that cannot be moved, and materials that are of unusual sizes and shapes will all pose additional problems that may affect the amount of material you can digitise, or the way in which you decide to digitise it.

    The aim of digitisation is to create an accurate digital surrogate for the original object. Because the cost, effort and technical challenges involved in digitisation increase as the digitised surrogate is made more accurate, you will need to focus on the intended purpose of the digital surrogate in order to make sensible trade-offs between accuracy and other considerations (chiefly, time and cost).


    There are two main factors in the digitisation process that will affect the accuracy of the final digital surrogate: the characteristics of the digitisation tool (scanner, digital camera etc.) you use, and the file format used to store the digital surrogate.
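
    One part of the accuracy versus cost trade-off can be made concrete with simple arithmetic: the size of an uncompressed raster image is its width in pixels multiplied by its height in pixels multiplied by the number of bytes per pixel. The sketch below is illustrative only; the 10 x 8 inch photograph and the three scanning resolutions are assumptions, and file headers and compression are ignored.

```python
def uncompressed_size_bytes(width_inches, height_inches, ppi, bits_per_pixel):
    """Size of an uncompressed raster image in bytes, ignoring file headers."""
    width_px = round(width_inches * ppi)
    height_px = round(height_inches * ppi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical source: a 10 x 8 inch photograph scanned in 24-bit RGB colour.
for ppi in (150, 300, 600):
    size = uncompressed_size_bytes(10, 8, ppi, 24)
    print(f"{ppi} ppi: {size / 1_000_000:.1f} MB")
```

    Doubling the resolution quadruples the storage needed for each image, which is one reason the intended purpose of the surrogate should drive the choice of settings.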

    Material to Digitise            Tools                                Things to Investigate
    Live performance                Digital camcorders                   Digital Video (DV), MPEG, YUV colour space, resolution, Firewire (IEEE 1394), lossy compression
    Paintings, pictures, diagrams   Scanner, digital camera              (optical) resolution, dynamic range, RGB colour space
    Written documents               Scanner, optical character           (optical) resolution, dynamic range, RGB colour space, Unicode, double-keying, spelling and grammar software, optical character recognition (OCR) software
                                    recognition software,
                                    keyboard (transcription)
    Audio recording                 Sound card with analogue to digital  Hardware compression, voice vs music
    Moving picture recording        Video capture card                   Signal standards supported (PAL, NTSC, SECAM), hardware compression
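
    Double-keying, listed in the table above, means having the same text typed independently by two people and then comparing the two versions to catch transcription errors. A minimal sketch of the comparison step follows; it assumes both typists keep the same line breaks, and the sample lines are invented.

```python
from itertools import zip_longest


def keying_differences(first_keying, second_keying):
    """Return (line number, first version, second version) for every line that differs."""
    first_lines = first_keying.splitlines()
    second_lines = second_keying.splitlines()
    differences = []
    # zip_longest pads the shorter version with empty lines, so missing lines are reported too.
    paired = zip_longest(first_lines, second_lines, fillvalue="")
    for number, (first, second) in enumerate(paired, start=1):
        if first != second:
            differences.append((number, first, second))
    return differences


# Hypothetical transcriptions of the same two-line source.
typist_one = "Baptised John son of Wm Smith\nBuried Mary Jones widow"
typist_two = "Baptised John son of Wm Smyth\nBuried Mary Jones widow"

differences = keying_differences(typist_one, typist_two)
print(differences)
```

    Each reported line then needs to be checked against the original document by a person; the script only finds disagreements, it cannot tell which version is correct.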

    Creating New Content

    As well as being useful for editing digitised material, software such as word processors, HTML editors, CAD packages and other tools can be used to create new digital content from scratch. When more than one person will be creating material, it is important to use the same software, or establish how content can be shared and integrated before work begins. This is especially important when a large number of contributors from outside the project team will be providing content.

    It is important that everybody creating material understands the terms and conditions under which it will be used. Requiring all contributors to sign a formal licence, detailing their rights and those of the project, is a good idea. A number of model licences for different situations exist.

    Documenting the Resource

    General Documentation

    Comprehensive documentation, such as user guides, interview scripts, codebooks and performance notes, is vital if a digital resource is to be shared and remain usable in the long term. Indeed, good documentation often proves its worth when you return to a digital resource you have designed yourself after an absence. The AHDS strongly recommends that all projects devote a reasonable part of their total effort to documenting the digital resources they create.

    Documenting a resource should not be left until it is completed, but should be seen as an integral part of its development. Tasks should be documented as they occur, when the activity is fresh in the mind. This approach guards against information being misplaced, forgotten, or taken away from the project if key staff depart.

    Documentation should cover provenance of sources, methodology of digitisation, design of databases, XML schemas and other data structures, and give details of code books, controlled vocabularies, abbreviations and other project-specific knowledge. Documentation should also include key correspondence and formal agreements relating to the creation and use of the resource's content. Online delivery systems, software and source code should, of course, be accompanied by suitable technical documentation.

    The AHDS provides Guidelines for Documenting Digital Resources.

    Resource Discovery Metadata

    In addition to general documentation, most types of digital resource should also be accompanied by some formal, structured, resource discovery metadata. Metadata is 'data about data'. Resource discovery metadata is information that describes your digital resource and helps potential users find it, similar to the type of information you find in a library catalogue. If your resource is a collection of texts, images, or some other type of material where users will need to search a large set of items, you will need to create resource discovery metadata for each item as part of the digital resource creation process. There are now many formal standards for resource discovery metadata intended for different subject areas and levels of detail. You should at least create a basic metadata record for each item that conforms to the Dublin Core standard.
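
    As a rough illustration of what a basic per-item record might look like, the sketch below builds a small XML record using elements from the Dublin Core element set with Python's standard library. The wrapper element name "record", the helper function and all the field values are assumptions made for this example; a real project should follow the encoding guidance of its chosen repository or standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NAMESPACE)


def dublin_core_record(fields):
    """Build a simple XML record from (Dublin Core element name, value) pairs."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NAMESPACE}}}{name}")
        element.text = value
    return record


# Hypothetical item; every value here is invented.
item = dublin_core_record([
    ("title", "Harbour at Dusk"),
    ("creator", "Unknown artist"),
    ("date", "1887"),
    ("type", "StillImage"),
    ("format", "image/tiff"),
    ("identifier", "A001"),
    ("rights", "Copyright status to be confirmed"),
])

xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

    Generating records like this from the project's own database or spreadsheet, rather than typing them by hand, helps keep the metadata consistent across a large set of items.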