Project Requirements

Project Overview

The goal of this project is to develop a system that enables image-based search of the Special Collections and Archives (SCA) at the Cline Library. Users will be able to upload an image and retrieve visually similar images from the library's digital archive, which makes the collection more accessible and faster to search.
Description from our client

System Requirements

1. Access Images in the Special Collections and Archives

The system must integrate with the SCA’s existing CONTENTdm database to retrieve archived images and metadata.

2. Interpret Images

The system will use a Vision Transformer (ViT) or a Convolutional Neural Network (CNN) to generate image embeddings. An embedding is a fixed-length numeric vector that encodes an image's visual content, so that visually similar images map to nearby vectors.

Pre-trained models will be fine-tuned on the SCA’s data to improve accuracy on the archive's material.
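
To make the idea of an embedding concrete, here is a minimal sketch of the "image in, fixed-length vector out" contract. It does not use a ViT or CNN: as a loudly labeled stand-in, it builds a normalized intensity histogram from a grayscale pixel grid. The function name `toy_embedding`, the `bins` parameter, and the sample pixel values are all hypothetical, chosen only for illustration.

```python
import math


def toy_embedding(pixels, bins=8):
    """Map a grayscale image (rows of 0-255 ints) to a unit-length vector.

    Stand-in only: a real system would take this vector from a
    fine-tuned ViT or CNN rather than from an intensity histogram.
    """
    counts = [0.0] * bins
    for row in pixels:
        for value in row:
            # Clamp so out-of-range pixel values still land in a valid bin.
            index = max(0, min(value * bins // 256, bins - 1))
            counts[index] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    if norm == 0.0:
        return counts  # empty image: return the zero vector unchanged
    # L2-normalize so every embedding has length 1.
    return [c / norm for c in counts]


# Two images of different sizes produce vectors of the same length.
image_a = [[0, 10, 250], [255, 128, 5]]
image_b = [[0, 0, 0], [0, 0, 0], [255, 255, 255]]
vec_a = toy_embedding(image_a)
vec_b = toy_embedding(image_b)
print(len(vec_a), len(vec_b))
```

The property that carries over to a real model is that every image, whatever its size, yields a vector of the same length, so any two images can be compared numerically.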

3. Store Image Interpretations

A vector database will store the generated embeddings alongside their corresponding image references, enabling efficient similarity searches.
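
The pairing of embeddings with image references can be sketched as a small in-memory store. This is only an illustration of the data layout, assuming a fixed embedding dimension. It is not the vector database the project will use, and the class name `EmbeddingStore` and the reference strings such as `sca/item-0001` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddingStore:
    """Hypothetical in-memory stand-in for a vector database."""

    dimension: int
    _vectors: dict = field(default_factory=dict)

    def add(self, image_ref, embedding):
        # Reject vectors of the wrong length so all entries stay comparable.
        if len(embedding) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} values, got {len(embedding)}"
            )
        self._vectors[image_ref] = list(embedding)

    def get(self, image_ref):
        return self._vectors[image_ref]

    def items(self):
        # (image_ref, embedding) pairs, ready for a similarity scan.
        return list(self._vectors.items())

    def __len__(self):
        return len(self._vectors)


store = EmbeddingStore(dimension=3)
store.add("sca/item-0001", [0.1, 0.2, 0.3])
store.add("sca/item-0002", [0.0, 0.5, 0.5])
print(len(store), store.get("sca/item-0001"))
```

A dedicated vector database adds persistence and approximate nearest-neighbor indexing on top of this basic mapping, which matters once the archive grows beyond what a linear scan can handle.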

4. Perform Image Similarity Searches

Users can upload an image, for which the system will generate an embedding. A similarity metric such as cosine similarity or Euclidean distance will compare that embedding with the embeddings of archived images in the vector database, and the system will return the top matching results.
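
The comparison step can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions, not the project's implementation: the archive is a small dictionary with made-up references and three-dimensional vectors, and the function names `cosine_similarity`, `euclidean_distance`, and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b; 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, archive, k=3):
    """Return the k archive entries most similar to query, best first."""
    scored = [(ref, cosine_similarity(query, vec)) for ref, vec in archive.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up archive: image reference -> embedding.
archive = {
    "photo-a": [1.0, 0.0, 0.0],
    "photo-b": [0.9, 0.1, 0.0],
    "photo-c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], archive, k=2)
for ref, score in results:
    print(ref, round(score, 4))
```

Cosine similarity ranks higher values first, while Euclidean distance ranks lower values first. When all embeddings are normalized to unit length, the two metrics produce the same ordering, so the choice mainly affects how scores are reported.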

5. Integration with the Cline Library Website

The system will be hosted on a dedicated page within the Cline Library’s website, offering users direct access to the tool.

Visual Workflow

Below is a simplified flowchart illustrating the system's workflow, from image upload to displaying the results:

System Workflow Flowchart

Project Timeline

The following Gantt chart outlines the key milestones and timeline for the project's development:

Project Gantt Chart

Explore In-Depth Documentation

For detailed insights and implementation strategies, refer to the following documents: