[Series] Emerging Careers in Librarianship: Data Curation

The Economist, February 25, 2010.

Librarianship is a quickly changing field. Thanks to computers and the internet, it has undergone a tremendous metamorphosis over the last two decades. In light of these changes, we’re kicking off a new series on the many emerging careers in librarianship. Today’s inaugural installment covers the emerging field of data curation.

Background

Back in February, I talked about data sharing and wondered whether scientists are ready to share their research data. If they are, special steps must be taken to ensure the data remain available into the future. Digital data are fragile. If you don’t believe me, try opening a file created on a computer from the early 1990s. Chances are high that you won’t be able to, for any of several reasons. You probably don’t have the right drive or software. Or the file has been corrupted by bit rot. Or the diskette itself has physically degraded. Yet there are many reasons why we may want to preserve what’s contained within that file. There are specialized techniques for preserving the important parts of the data for future use. This field is called data curation, and this is where the data curation librarian comes in.
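
To make the bit rot problem concrete, here is a minimal sketch of fixity checking, one common way of detecting silent corruption: record a checksum when a file is deposited, then recompute and compare it later. The post does not name this technique, so treat it as an illustration only. The file name and the helper names (sha256_of, is_intact) are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, recorded_digest: str) -> bool:
    """True if the file's current digest matches the one recorded earlier."""
    return sha256_of(path) == recorded_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical data file standing in for a deposited dataset.
        data_file = Path(tmp) / "observations.csv"
        data_file.write_bytes(b"site,temp_c\nA,21.5\n")

        # Record the checksum at deposit time.
        recorded = sha256_of(data_file)
        print(is_intact(data_file, recorded))  # True

        # Simulate bit rot by flipping a single bit in the first byte.
        raw = bytearray(data_file.read_bytes())
        raw[0] ^= 0x01
        data_file.write_bytes(bytes(raw))
        print(is_intact(data_file, recorded))  # False
```

A check like this only tells you that a file has changed. Recovering it still depends on having kept good copies elsewhere, and on the metadata needed to interpret them.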

Data curation is defined as “the active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education” (GSLIS). The volume of scientific data is growing exponentially across all scientific disciplines. This phenomenon has been termed the “data deluge.” The data deluge is now a fundamental characteristic of e-science and “big science,” especially in disciplines such as physics, astronomy, and earth and atmospheric sciences. Moreover, stakeholders are beginning to recognize the value of sharing data assets with each other and of curating data for re-use over the long term. Competent information professionals are needed to curate these data for future research and education requirements.

Training

Only recently have LIS graduate programs begun to offer specialized training in data curation. Those with formal training in traditional LIS skills are natural fits for data curation because it draws on skills from that set, such as metadata, archiving, and collection development.

Depending on your background, you might already be well qualified for a particular kind of data curation. Coming from an engineering background, I am better qualified for scientific data curation, whereas a humanist would be better suited to humanities data curation.

Career Outlook

I haven’t done a rigorous employment projection, but I can tell you with a high degree of certainty that the data curation field is wide open. There are positions within academic libraries, special libraries, data centers, large data repositories, and even in private industry. I’m interested in the academic library world, so I’ve been tracking the names used in job advertisements to describe the position I want. Just in the last four months, I’ve seen job advertisements with all of the following job titles:

  1. Data management specialist
  2. Data management librarian
  3. Data curator
  4. Data librarian
  5. Data services specialist
  6. Data research scientist
  7. Data curation specialist
  8. E-science librarian
  9. E-research librarian
  10. Science data librarian
  11. Research data librarian

This list doesn’t indicate the actual number of jobs, since many positions share the same title. There will almost certainly be more positions within this field in the coming years as universities and researchers realize the value in it. And it’s a good time to be in the LIS field, because I believe libraries are best positioned to offer these services to their institutions. From where I sit, I see a pretty promising field.

Summary

Just as libraries had to adapt to accommodate the internet and online public access catalogs, they will have to keep adapting now that big data looms on the research horizon. We need trained professionals who can curate data for future research and education. Library and information science students are in a prime position to fill this need.

References

Graduate School of Library and Information Science (GSLIS) at University of Illinois, Urbana-Champaign. Data Curation Education Program. In GSLIS – Center for Informatics Research in Science and Scholarship – Collections & Curation. Retrieved March 25, 2012, from http://cirss.lis.illinois.edu/CollMeta/dcep.html.

16 replies

  1. Question: isn’t what you describe in paragraph one actually more Digital Preservation (another buzzword)? Do these two areas overlap? Is a project like Digital Humanities Now an example of Data Curation in the way you describe it?

    • My explanation of trying to open an old file was simply to demonstrate the fragility of digital data and the need for more than simply saving the file on a disk if one expects to use it for a long time. “Digital preservation” and “data curation” are sometimes used interchangeably, but they are not the same thing. Digital preservation is simply one part of the larger process of data curation. Curation is more than simply preserving the file for long-term use. It’s managing the data from the time it is collected, through processing, preservation, and then re-use later. It includes a large metadata component (without metadata, data will become useless pretty quickly). Digital humanities must include data curation in the same way that digital science must. Is it clearer now?

  2. I don’t think my program has data curation type courses…. How much overlap is there with data management in librarianship versus in the IT world? Would “database administrator” be akin to data curator as a job title?

    • Data management is part of data curation. Database administration is part of data management. So yes, it’s all related, but I think data curation is the broad overarching field, which has several sub-parts. So if the job ad just said “database administrator,” the position would be focused on databases, not on the entire data lifecycle.
