A note in advance: Please do not use the "Easy Apply" button automatically created by LinkedIn. Instead, send your cover letter explaining your motivation for this role and for working with us, along with your CV, to our Manager, Kerstin Arnold (kerstin.arnold@archivesportaleurope.net). You can also contact her if you have any questions about the role, the team, or the foundation.
Archives Portal Europe (APE) is the world’s largest online repository for archival materials, hosting collections from over a thousand institutions in more than 20 languages, all searchable through keyword-based queries.
In this context, we are currently extending our search functionality so that users can search not only the descriptions of archival collections, but also the digitised archival materials directly. For now, we are piloting this with materials that have already been digitised and processed with optical character recognition (OCR) or handwritten text recognition (HTR), but in the longer term we are also looking at possibly integrating an OCR / HTR service into our workflows.
We also intend to further develop our Automated Topic Detection (ATD) tool as another potential extension to the current keyword search. At the moment, this tool includes (1) a cross-lingual search and (2) a multilingual detection. The first is aimed at users and data providers alike; the second is aimed specifically at data providers and supports them in their data preparation. You can find the technical details of the tool on our GitHub page (last updated in 2022) and read about it in the publication that followed its first iteration in the "Journal on Computing and Cultural Heritage": https://dl.acm.org/doi/10.1145/3494572.
What we are looking for
To advance both of these activities, we are looking for a data scientist with experience in:
The tasks at hand
With regard to the new option for searching in digitised archival materials directly, the work will focus on (1) testing, evaluating and validating OCR / HTR tools for potential use by, or integration into, Archives Portal Europe, and (2) designing and piloting an OCR / HTR workflow as a new service offering.
The further developments that we are looking at for the ATD tool include expanding it to more (and ideally all) of the data available in our portal, supporting more languages, working with more reference ontologies, and - ultimately - integrating the cross-lingual search into the search functionalities of our frontend and the multilingual detection into the workflows of our backend. This latter aspect will be addressed in close collaboration with other members of our technical team.
Find out more
If you think this could be you, please send your cover letter explaining your motivation for this role and for working with us, along with your CV, to our Manager, Kerstin Arnold (kerstin.arnold@archivesportaleurope.net). You can also contact her if you have any questions about the role, the team, or the foundation.
We envisage this work as a short-term assignment agreement equivalent to approximately 2 months of full-time employment (or 270 hours). We are open to arrangements ranging from part-time to full-time and offer an hourly rate of up to €60, depending on your experience and skill set.
The deadline for applications is Friday, 27 February, with interviews intended to be held during the week commencing Monday, 9 March, and a start as soon as possible thereafter.
Data Scientist • Netherlands