Pilot test images

How and where to upload test images
Official EAL test materials
Previous test materials

How and where to upload test images

Official pilot test images from the East Asian Library collections should go on the AWS server in a sub-folder of the /var/www/html/pilot_images/ directory. This way we can view them from the web, and they will be available to sites running on the Django framework as well as to Tesseract OCR. Here are the steps necessary to upload the images and make them accessible. It's best to do this from the Terminal shell on macOS/Linux or from a terminal-like environment (for example, PowerShell) on Windows.

Copy the images from your camera/phone to a local folder on your computer, e.g., /home/pete/test_photos
Log in to the AWS server using the instructions on this wiki page.
On the AWS server, go to the pilot_images/ directory: $ cd /var/www/html/pilot_images
On the AWS server, create a folder to house your images, with a descriptive name: $ mkdir nexus5x
On your local machine, copy the files over to the target directory on the AWS server via SSH copy (scp): $ scp -i LOCATION_OF_ccing.pem_FILE /home/pete/test_photos/* [email protected]:/var/www/html/pilot_images/nexus5x/.
On the AWS server, make sure the images are world-readable: $ chmod -R 755 /var/www/html/pilot_images/nexus5x
Open a web browser and make sure you can view the images at the expected URL: http://ec2-54-173-153-28.compute-1.amazonaws.com/pilot_images/nexus5x/
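The server-side steps above can also be issued from the local machine as remote ssh commands. The following is a minimal sketch of that variation, not the procedure we actually use: the function name print_upload_commands and its arguments are hypothetical, and it only prints the commands so you can review them before running anything. It uses mkdir -p instead of a cd followed by mkdir.

```shell
# Sketch only: prints the commands for the upload steps instead of running them.
# The function name and all four arguments are illustrative placeholders.
print_upload_commands() {
    local key_file="$1"    # path to the .pem key file
    local remote="$2"      # SSH login in user@host form (see the wiki login instructions)
    local local_dir="$3"   # local folder holding the photos
    local folder="$4"      # descriptive sub-folder name, e.g. nexus5x
    local target="/var/www/html/pilot_images/${folder}"

    # Create the target folder on the server (-p: no error if it already exists).
    echo "ssh -i ${key_file} ${remote} mkdir -p ${target}"
    # Copy the images from the local machine to the server.
    echo "scp -i ${key_file} ${local_dir}/* ${remote}:${target}/."
    # Make the uploaded images world-readable so the web server can serve them.
    echo "ssh -i ${key_file} ${remote} chmod -R 755 ${target}"
}
```

Calling it with your key file, your SSH login, /home/pete/test_photos, and nexus5x prints three commands that mirror the mkdir, scp, and chmod steps; paste them into the terminal one at a time once they look right.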

Official test materials from the East Asian Library

We now have two or three collections of ~100 books each, some already with UCLA Library barcodes and some not, that we can use for an official pilot test, i.e., taking pictures of the books and their associated barcodes, using the barcodes to rename the images, and then uploading these images to Scribe. Details are available on this Google doc.
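The renaming step could be scripted along the following lines. This is only a sketch under stated assumptions: it presumes a hypothetical two-column mapping file with one "image filename,barcode" pair per line, which is not something the Google doc is known to specify, and the function name rename_by_barcode is made up for illustration. It also assumes every image filename has an extension.

```shell
# Sketch only: rename images to <barcode>.<ext> using a mapping file.
# Assumed (hypothetical) mapping format, one pair per line: image_filename,barcode
rename_by_barcode() {
    local image_dir="$1"      # folder containing the photos
    local mapping_file="$2"   # comma-separated mapping file
    local filename barcode ext

    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS=, read -r filename barcode || [ -n "$filename" ]; do
        # Skip rows that lack either field.
        [ -n "$filename" ] && [ -n "$barcode" ] || continue
        # Skip rows whose image is not present in the folder.
        [ -f "${image_dir}/${filename}" ] || continue
        # Keep the original extension (text after the last dot).
        ext="${filename##*.}"
        # -n: never overwrite an existing file with the same barcode name.
        mv -n "${image_dir}/${filename}" "${image_dir}/${barcode}.${ext}"
    done < "$mapping_file"
}
```

Run it on a copy of the photo folder first, since the rename happens in place.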

Previous test materials

Until we get actual images of book covers and title pages from the catalogers, we can use one of the Internet Archive’s extensive collections of book cover images for testing -- see for example https://archive.org/search.php?query=book%20covers.

Update as of 3/30/17: One likely workflow for CCing is that librarians will drop the book page images into a web folder somewhere, maybe hosted by Box (and perhaps with other metadata in different folders and files). They would then provide this link to a server-side app via a basic interface and use it to kick off the process of "ingesting" the images into CCing's OCR->Scribe workflow.

Initial testing of this workflow with Box ran into problems because it's not clear how to get direct download links for the full-resolution versions of the cover images from Box. So for now, we've installed an Apache web server on the AWS machine and are serving the test images from there. The images are stored on the server in /var/www/html/test_images. They are accessible via the web at URLs like the following:

A bunch of book covers from part 9 of the Internet Archive's "Amazon covers crawl":

http://ec2-54-173-153-28.compute-1.amazonaws.com/test_images/

Some old-timey book covers that we tried to serve from Box:

http://ec2-54-173-153-28.compute-1.amazonaws.com/test_images/old_sample/

A bunch more images from Amazon, via the IA:

http://ec2-54-173-153-28.compute-1.amazonaws.com/test_images/amazon/
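The relationship between a folder on the server and its web address can be sketched as a small helper. This assumes that /var/www/html is the Apache document root, which the paths and URLs on this page suggest but do not state outright; the function name path_to_url is hypothetical.

```shell
# Sketch only: map a path under the (assumed) Apache document root to its URL.
path_to_url() {
    local path="$1"
    local doc_root="/var/www/html"   # assumption: Apache serves this directory
    local base_url="http://ec2-54-173-153-28.compute-1.amazonaws.com"
    local rel

    case "$path" in
        "$doc_root"/*)
            # Strip the document root prefix; the remainder is the URL path.
            rel="${path#$doc_root}"
            echo "${base_url}${rel}"
            ;;
        *)
            echo "path is outside ${doc_root}: ${path}" >&2
            return 1
            ;;
    esac
}
```

For instance, passing /var/www/html/test_images/amazon/ yields the last URL listed above, and the same mapping applies to sub-folders of pilot_images/.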
