Institutional repository collection development with web scraping

Many institutions report low rates of self-archiving in their institutional repositories (IRs), so repository managers must actively seek out content to expand their collections. This presentation discusses a method for collecting articles and metadata from open repositories using Beautiful Soup and Selenium as web scraping tools. Using this method, thousands of articles and their descriptive metadata were added to an IR quickly and easily, increasing both the visibility of the research and engagement with the IR.
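
As a rough illustration of the metadata-harvesting step, the sketch below pulls descriptive metadata out of a single article landing page. It is not the presenters' code. It uses Python's standard-library `html.parser` in place of Beautiful Soup so that it runs without extra dependencies, and it assumes the source repository exposes Highwire Press style `citation_*` meta tags, which many open repositories do but not all. The sample page, its URL, and the names `CitationMetaParser` and `extract_metadata` are hypothetical.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> tags from one page."""

    def __init__(self):
        super().__init__()
        self.fields = {}  # maps a meta name to the list of its content values

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Assumption: the repository uses citation_* names for its metadata.
        if name.startswith("citation_") and content:
            self.fields.setdefault(name, []).append(content.strip())


def extract_metadata(html):
    """Return a small record suitable for loading into an IR batch import."""
    parser = CitationMetaParser()
    parser.feed(html)

    def first(key):
        values = parser.fields.get(key)
        return values[0] if values else None

    return {
        "title": first("citation_title"),
        "authors": parser.fields.get("citation_author", []),
        "date": first("citation_publication_date"),
        "pdf_url": first("citation_pdf_url"),
    }


# Hypothetical landing page, invented for this example.
SAMPLE_PAGE = """
<html><head>
<meta name="citation_title" content="An Example Article">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_author" content="Roe, Richard">
<meta name="citation_publication_date" content="2019/05/01">
<meta name="citation_pdf_url" content="https://repository.example.org/files/123.pdf">
</head><body></body></html>
"""

record = extract_metadata(SAMPLE_PAGE)
print(record)
```

In a real workflow the HTML string would come from the live page rather than a constant. For pages that only render their content with JavaScript, Selenium's `driver.page_source` can supply the rendered HTML, and Beautiful Soup could replace the hand-written parser class. Checking each source repository's terms of use and the license of each article before harvesting is a sensible precaution.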


4:35 PM
10 minutes