...
Plan and set a timeline for the CDM->ScholarShare ETD migration.
Alicia and Gabe can start ASAP
Conditions
New export from CDM needed (Stefanie)
Celio needs to wrap up the current batch before the load.
Embargoes are clearly documented. Which doc is used for this currently, and is it up-to-date?
Tracked through this sheet: https://docs.google.com/spreadsheets/d/13ty_Qkqes69_bly8lDHzekW2vETP62mcPQdI-CmgkRQ/edit#gid=1242336235
Inappropriate supplemental files removed from prior batches are clearly marked. In which doc?
No in-progress metadata cleanup blocks this. Holly and Stefanie say we’re good to go
Timeline: goal deadline is end of summer
Possible complications
File names are not reflected accurately in the CDM export. This could slow down uploads (a pre-upload filename check is sketched below)
Unknown limits on how many items can be uploaded simultaneously
Further issues with remote access to Alicia’s desktop and/or the shared drive
(Gabe and Alicia already defined a strategy for loads administered remotely with Chin)
Do we need to correct any filename issues we find in CDM?
No
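Even though we won't correct filenames in CDM itself, mismatches should still be flagged before upload. A minimal sketch, assuming a CSV export with hypothetical "filename" and "identifier" columns and a local directory of exported files (all paths and column names are placeholders, not confirmed):

```python
# Hypothetical sketch: flag rows in the CDM export whose listed file name
# doesn't match a file on disk, so mismatches surface before upload.
# The column names "filename"/"identifier" and the paths are assumptions.
import csv
from pathlib import Path

EXPORT_CSV = Path("cdm_export.csv")  # assumed export location
FILES_DIR = Path("cdm_files")        # assumed directory of exported files

on_disk = {p.name for p in FILES_DIR.iterdir() if p.is_file()}

with EXPORT_CSV.open(newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        name = row.get("filename", "").strip()
        if name and name not in on_disk:
            # Print the record identifier and bad name for manual review.
            print(f"{row.get('identifier', '?')}: missing or renamed file {name!r}")
```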
Is there any metadata cleanup that must, should, or could happen mid-migration in OpenRefine or another program while Alicia is formatting the data for import?
In previous discussions, we agreed this could happen post-load, so that this doesn’t slow down the migration timeline. Still true?
Still true.
However, if there are obvious issues that are easy to clean up, we can pursue this.
Assign a point person or interested group for metadata questions from Alicia and Gabe during this process (as we did with SWORD mapping and systems testing).
Gabe and Alicia will reach out to Holly and Stefanie with questions; they can refer them to Michael, Carla, or Celio
Workflow assignment
1. Export metadata from CDM (Stefanie?)
2. Field reformatting and renaming in OpenRefine; bulk metadata corrections (Alicia, Gabe)
3. Breaking the load into discrete batches (all items with supplementals; all embargoed items; all other items, broken into manageable chunks; see the sketch below). (Gabe, Alicia)
4. Bulk loads (Alicia)
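To illustrate step 3, a minimal sketch of the batching logic, assuming a cleaned CSV export; the column names ("supplemental_files", "embargo_date"), file paths, and chunk size are placeholders to confirm against the real export:

```python
# Hypothetical sketch of step 3: split the cleaned export into the batch
# groups described above. Column names and CHUNK_SIZE are assumptions.
import csv
from pathlib import Path

CHUNK_SIZE = 200  # assumed "manageable chunk"; tune to ScholarShare's limits

def write_batch(rows, fieldnames, name):
    with Path(f"batch_{name}.csv").open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

with Path("etds_cleaned.csv").open(newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    rows = list(reader)
    fields = reader.fieldnames

# Classify each record into exactly one batch group.
supplemental, embargoed, remaining = [], [], []
for r in rows:
    if r.get("supplemental_files", "").strip():
        supplemental.append(r)
    elif r.get("embargo_date", "").strip():
        embargoed.append(r)
    else:
        remaining.append(r)

write_batch(supplemental, fields, "supplemental")
write_batch(embargoed, fields, "embargoed")
for i in range(0, len(remaining), CHUNK_SIZE):
    write_batch(remaining[i:i + CHUNK_SIZE], fields, f"other_{i // CHUNK_SIZE + 1}")
```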
...
Plan to retest and document the entire ETD workflow in prod, from PQ SWORD import through external delivery via ScholarShare's OAI endpoint.
Holly will be doing this, so no discrete step is needed. She’s working with Chad to harvest from the OAI endpoint into Alma, through a process similar to the one used in several other systems. When new MARC records are generated, this will serve as the test described.
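As a spot check alongside the Alma harvest, something like the following could confirm that loaded ETDs are exposed by the OAI endpoint. The endpoint URL and set name are placeholders; the verbs and oai_dc prefix are standard OAI-PMH:

```python
# Hypothetical sketch: spot-check that newly loaded ETDs appear at the
# ScholarShare OAI-PMH endpoint before the Alma harvest. URL is a placeholder.
import requests
import xml.etree.ElementTree as ET

OAI_URL = "https://scholarshare.example.edu/oai/request"  # placeholder URL
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

resp = requests.get(OAI_URL, params={
    "verb": "ListIdentifiers",
    "metadataPrefix": "oai_dc",
    # "set": "col_etds",  # optional: restrict to an ETD set, if one exists
}, timeout=30)
resp.raise_for_status()

root = ET.fromstring(resp.content)
headers = root.findall(".//oai:header", NS)
print(f"{len(headers)} identifiers in the first page of results")
for h in headers[:5]:
    print(h.findtext("oai:identifier", namespaces=NS),
          h.findtext("oai:datestamp", namespaces=NS))
```

Note this only reads the first page; a full check would follow the OAI-PMH resumptionToken to page through all records.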
Follow-up task: update the OAI endpoint for other systems that currently harvest our ETDs (NDLTD, etc.)
Holly and Annie will investigate. Point person and workflow TBD.
Document the workflow and assignments once finalized.
Alicia has questions about Carla and Celio’s workflows.
Why do we need to be able to isolate the batches via metadata?
Action items
- Gabe Galson will set up ETD admin access for Alicia
- Gabe Galson will figure out how Celio can isolate individual batches, report back to group
- Holly Tomren will talk to Carla and Celio, then update this group on how to proceed with planning a training, through which we’ll get their input on the workflow structure
- Stefanie Ramsay will request a new share, defining the name and access in consultation with Annie and Alicia
- Gabe Galson will think of ways to split out new items from the Preservation exports
- Gabe Galson will schedule a part 2 meeting covering the ETD migration plan
- Gabe Galson will transfer the ETD delivery process from FTP to SWORD (a deposit sketch follows these action items)
- Alicia will train Celio and Carla
- Holly Tomren will share the cleanup sheet she used to isolate supplemental files.
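For the FTP-to-SWORD action item above, a minimal sketch of a SWORD v1 deposit as a starting point; the collection URI, credentials, and packaging value are placeholders to confirm against ScholarShare's actual SWORD service document:

```python
# Hypothetical sketch of a SWORD v1 deposit replacing FTP delivery.
# The URI, credentials, and X-Packaging value below are all assumptions.
import requests

COLLECTION_URI = "https://scholarshare.example.edu/sword/deposit/col_etds"  # placeholder

with open("etd_package.zip", "rb") as pkg:
    resp = requests.post(
        COLLECTION_URI,
        data=pkg,
        auth=("deposit_user", "secret"),  # placeholder credentials
        headers={
            "Content-Type": "application/zip",
            "Content-Disposition": "filename=etd_package.zip",
            # Packaging format depends on the repository; this value is a guess.
            "X-Packaging": "http://purl.org/net/sword-types/METSDSpaceSIP",
        },
        timeout=120,
    )
resp.raise_for_status()
print("Deposit accepted:", resp.status_code)  # a successful SWORD deposit returns 201
```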