Foundation INtegrated models for Libraries, Archives and Museums
The digital transformation of libraries, which has relied on OCR (Optical Character Recognition) technology for more than 20 years, faces limitations both in quality, due to the diversity of the collections and the limits of OCR technology, and in added value, due to a lack of structuring and high-level indexing. Named entity extraction is still little used because it requires language processing technologies that until recently were not very adaptable. More generally, the semantic indexing of collections remains underdeveloped and poorly integrated with metadata. We propose to develop multimodal (text + image) models for extracting information from the collections of digitised documents held by large libraries. The literature shows that work in this direction is still limited and mainly targets the processing of commercial documents (invoices, etc.).
The ANR project FINLAM relies on the expertise of LITIS to study the most relevant multimodal architectures for integrating the linguistic knowledge carried by recently developed large language models, and to study how these models can be specialised and adapted in conjunction with the training of a generic optical encoder, benefiting from the annotated collections available at the BnF. User interaction will be considered according to different closed- and open-query scenarios. TEKLIA will bring its expertise to prepare the data, carry out model integration experiments, and deploy its production chain on selected target corpora. Specific user interaction scenarios will be specified and implemented, giving rise to original experiments conducted in collaboration with BnF operators and users. The BnF will be responsible for evaluating the performance of the proposed solutions quantitatively and qualitatively in terms of ergonomics, usability, and user acceptability.