What would a typical ‘day in the life’ of Gale’s production team consist of, and at what points do you work with other parts of the business?
Sarah: For a production team member, there is no typical “day in the life”. We work on different projects with different teams from different countries, in different time zones and in different languages, and each day presents unique challenges to overcome.
We take an idea from an Acquisitions Editor and get that idea online so that people can search archives and collections from source institutions around the world. We must then work with multiple different teams and vendors, both inside and outside Gale, to make the final product.
From Product Managers:
New or updated requirements from the Product Management team can come in at any point. The product managers review comments, feedback, and suggestions from users across all our projects. They collate all the information and turn it into user stories and requirements to get the most sought-after features in place across our platforms. This usually means we in production need to add, change, or standardise something within existing XML to support the new feature. An example of this is the brand-new Browse Manuscript feature. Users had been asking for it consistently, so we received a requirement to make sure the required manuscripts had a manuscript number, and a way to sort by manuscript number in the product, so that a user could easily find what they are looking for.
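As a rough illustration of why sorting by manuscript number needs its own rule, here is a minimal sketch of a natural sort. The "MS 10" style of number and the function name are assumptions for the example, not Gale’s actual format or code. A plain alphabetical sort would place "MS 10" before "MS 2"; splitting out the digits avoids that.

```python
import re

def manuscript_sort_key(number: str):
    """Split a manuscript number into text and integer chunks for natural ordering."""
    parts = re.split(r"(\d+)", number)
    # Digit runs compare as integers; everything else compares as lower-cased text.
    return [
        (0, int(p), "") if p.isdecimal() else (1, 0, p.lower())
        for p in parts
        if p
    ]

# Hypothetical manuscript numbers, for illustration only.
numbers = ["MS 10", "MS 2", "MS 2a", "MS 1"]
sorted_numbers = sorted(numbers, key=manuscript_sort_key)
print(sorted_numbers)
```

In practice a sort value like this could be stored alongside each manuscript in the XML, so the platform does not need to work it out at display time.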
From Acquisitions Editors:
We can get a new idea from an Acquisitions Editor at any time and we start off with a kick-off meeting. This meeting is to get the idea, vision, and details about the project. Where is it coming from? What does it contain? How big is it? What should the final online archive look like? What is special about this archive?
Production then take the metadata and pore over it to make sure everything makes sense and there is nothing obvious missing. We see how the metadata is organised and how the library has catalogued the content, and we try to keep the browsability similar so that users can easily find what they want. We sometimes create wireframes to show what a new feature or a new programme would look like. These wireframes vary from text boxes in a Word document to a fully fleshed-out working wireframe. We check to make sure that what they are asking for is feasible and in line with other archives; for example, we do not want one archive using U.S.A. as a term to filter on and another archive using United States of America.
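The consistency check on filter terms can be pictured as a lookup against a list of preferred terms. This is only a sketch: the mapping, the choice of "United States of America" as the preferred form, and the function name are assumptions made for the example.

```python
# Hypothetical mapping from variant spellings to one preferred filter term.
PREFERRED_TERMS = {
    "u.s.a.": "United States of America",
    "usa": "United States of America",
    "united states": "United States of America",
    "united states of america": "United States of America",
}

def normalise_term(term: str) -> str:
    """Return the preferred form of a filter term, or the tidied input if it is unknown."""
    cleaned = " ".join(term.split())  # collapse stray whitespace
    return PREFERRED_TERMS.get(cleaned.lower(), cleaned)

print(normalise_term("U.S.A."))
print(normalise_term("France"))
```

Terms that are not in the mapping pass through unchanged, so anything unfamiliar can be reviewed by a person rather than silently altered.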
MAKING THE ARCHIVE
Source Institutions, universities, and libraries:
After an Acquisitions Editor gives us their idea, we work with the source institution, university or library to work out how we can get in to scan their material, or how we can get the content out to scan it. We work with them on getting MARC records, complete lists and metadata, and sometimes they help us flag the content so that the scanning teams know exactly what to scan when they get there. Sometimes the source institution scans the content itself, or has previously scanned some of it, so production work with them to get the images to us for review. As some of these collections have not been properly catalogued, or even touched, for a number of years, questions and clarifications flow in both directions between the production team and the source library as we gather all the content together to start the scanning.
To scan an average of 10 million pages of content each year, we use scanning vendors who can go into source institutions or scan shipped content at the vendor’s location. We provide a list of required items to the scanning teams and the source institution. These items can be books, pamphlets, flyers, newspapers, periodicals, magazines, manuscripts, scrolls, maps, photographs, and so on. Each type of content has different problems, solutions, and workflows. Production teams get questions and clarifications daily from the scanning vendors throughout the scanning process, which lasts anywhere from three months to two years. Questions and clarifications cover scenarios such as missing items, extra items, duplicate items, items that require conservation, items with special requirements (for example, what is the best way to scan a 4ft scroll?), and items that are just too delicate to scan. Often the questions require input from the original Acquisitions Editor. The production team help facilitate these questions and monitor the progress of the scanning so that it remains on schedule and matches the originally planned size.
The images then go through a quality assurance process with a different vendor. This QA vendor checks that the correct images have been scanned, in the correct order, in focus, at the right DPI and without any pages missing. This process also generates clarifications and questions that we deal with or facilitate.
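Some of these checks lend themselves to automation. The sketch below covers page order, missing pages and DPI for a single item; focus would need image analysis and is left out. The record layout, the 300 DPI threshold and all names here are assumptions for illustration, not a description of the QA vendor’s tooling.

```python
from dataclasses import dataclass

@dataclass
class ScannedPage:
    item_id: str
    page_number: int
    dpi: int

def qa_issues(pages, expected_page_count, minimum_dpi=300):
    """Return a list of issues for one item's pages, given in delivery order."""
    issues = []
    numbers = [p.page_number for p in pages]
    if numbers != sorted(numbers):
        issues.append("pages are out of order")
    missing = sorted(set(range(1, expected_page_count + 1)) - set(numbers))
    if missing:
        issues.append(f"missing pages: {missing}")
    for p in pages:
        if p.dpi < minimum_dpi:
            issues.append(f"page {p.page_number} scanned at {p.dpi} DPI")
    return issues

# Hypothetical delivery: page 3 arrives before page 2, and page 4 is absent.
delivery = [
    ScannedPage("item-1", 1, 300),
    ScannedPage("item-1", 3, 200),
    ScannedPage("item-1", 2, 300),
]
print(qa_issues(delivery, expected_page_count=4))
```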
Conversion to XML (Extensible Markup Language):
To process 10 million pages of content each year, we send the scanned images to a conversion vendor. We also send metadata in the form of MARC records, library cataloguing, our own metadata created in house, or cataloguing from freelancers inside the libraries or institutions. The vendor matches the correct metadata to the correct set of images, captures all the words, both typed and handwritten, and records each word’s coordinates on the full page. This allows a user both to search for words on the scanned image and to locate them easily, as each matching word will be highlighted. This process takes around 3-9 months depending on the size. Production teams get questions and clarifications daily from the vendor throughout this process as well. These include missing metadata or MARC records, wrong metadata or MARC records, and unique content that requires special instructions (for example, what type label do we give to a book of stamps?). The conversion vendor sends the XML and images back to us in batches, and we look at the metadata to make sure that the vendor is capturing everything correctly, such as titles, authors, and publication dates. These clarifications and QA of work are the bulk of what the production teams do day to day.
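To show how word coordinates make highlighting possible, here is a minimal sketch. It assumes each captured word is stored with a pixel bounding box; the field names and the matching rule are invented for the example and do not reflect the actual XML schema.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    x: int       # left edge of the word on the page image, in pixels
    y: int       # top edge of the word on the page image, in pixels
    width: int
    height: int

def find_highlights(words, query):
    """Return the bounding box of every word matching the query, ignoring case and punctuation."""
    wanted = query.strip().lower()
    return [
        (w.x, w.y, w.width, w.height)
        for w in words
        if w.text.strip(".,;:!?\"'()").lower() == wanted
    ]

# Hypothetical words captured from one page.
page = [
    Word("The", 10, 10, 30, 12),
    Word("Archive,", 45, 10, 60, 12),
    Word("archive", 10, 30, 55, 12),
]
print(find_highlights(page, "archive"))
```

Each box returned is a region the viewer could draw over the page image, which is how a search hit ends up highlighted in place.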
Working with Dev:
We deliver the QA’d, and sometimes corrected, XML and images to the DEV and DEV processing teams, who load the content and images to the platforms. At this stage, we occasionally work with the DEV team if they spot a problem or if new indexes or requirements are involved.
Working with indexing teams:
Sometimes the products call for additional requirements that are out of scope for, or not achievable within, the production teams. This includes scenarios like subject matter expert involvement or assigning subjects to newspaper articles. For these types of cases, the production team engages with the Indexing Team. They take our XML or metadata, depending on the type of work, and give us back appropriate subjects for us to put into our XML.