



Metadata and Data Quality Problems in the Digital Library

Jeffrey Beall
Email: jeffrey.beall@cudenver.edu
URL: http://carbon.cudenver.edu/~jbeall/
This is a summary version of the paper. The author's authoritative full text is available as a PDF (20 pages, 292 KB).
Abstract

This paper describes the main types of data quality errors that occur in digital libraries, both in full-text objects and in metadata. Studying these errors is important because they can block access to online documents and because digital libraries should eliminate errors where possible. Common types of errors include typographical errors, scanning and data conversion errors, and find-and-replace errors. Errors in metadata can also hinder access in digital libraries. The paper also discusses who is responsible for errors in digital documents and offers suggestions for managing digital library data quality.

Editor's Note

The original version has been removed because it contained typos and was not the final, edited version of the article. This is the authoritative final version, republished February 24, 2006.

Index of Figures

Figure 1. Typographical Error in an online document

Index of Tables

Table 1. Summary of the error categories
Table 2. Number of hits in JSTOR for five common typographical errors