
Text and Data Mining (TDM)

Instructions and links for Text and Data Mining

Publishers and Text and Data Mining

Database publishers are aware of the growing interest in text and data mining (TDM) and of the different methods used to perform it. Publishers often include clauses in our licenses about what we are allowed to do. Most commonly, these clauses restrict any text and data mining that could be considered commercial use. To assist researchers, some publishers have created tools or software for use on their specific data.

Publishers usually have methods in place to detect bots and other automated TDM activity, so it is important to be aware of any restrictions a given publisher places on our usage: stepping over the boundaries we have agreed to can have a large impact on everyone's ability to use a resource. We have collected information on some of the larger publishers we have subscriptions with. If you are unsure who publishes the resource you wish to mine, or the publisher is not on the list, please contact us before you begin and we will investigate whether there are any restrictions in our license or services offered by the publisher.
 

Database/Vendor | Details | More information
Adam Matthew Digital We allow Data Mining/Text Analysis by "Authorized Users" for fair use/academic research. Data must be kept on secure local storage and can be kept for a maximum of 3 years.
Researchers can apply to Adam Matthew to perform text and data mining. A sample application is available in the PDF link to the right.
Adam Matthew Data/Text Mining Statement
American Association for the Advancement of Science Please see clauses 4.3.7 and 5.1.3 and Annex A of the linked license agreement.

Text and data mining is generally allowed for non-commercial, internal, research-oriented uses, but the use of any automated computer program or activity to search, index, test, download, or grab information from the Licensed Materials is not allowed.

Science Online Journals Institutional License Agreement
American Medical Association (JAMA) AMA is extending text and data mining rights to researchers at subscribing institutions worldwide for non-commercial research purposes. If your institution has a valid site license agreement with the AMA for JAMA Network Licensed material, you can register for limited rights to text and data mine (TDM) online content for non-commercial purposes by agreeing to abide by the provisions of this special license for Registered Users.

American Medical Association Policy

TDM Account creation

Association for Computing Machinery

Universities and Authorized Users shall be able to conduct TDM by an API provided by Vendor or a mutually agreed third-party provider.

In all cases where Licensed Materials offers both HTML/XML and PDF versions of the Licensed Materials, both versions shall be accessible for TDM.

The Licensed Materials provided for TDM shall be provided in such a manner as to be useful to Authorized Users. For example, there will be no rate or volume limits placed on TDM by ACM unless there has been evidence of disruption of ACM’s normal services, and any such limits shall be communicated to Universities in advance of taking effect.

Uses of TDM Output: It is mutually understood that Licensed Materials and “TDM Output” (the result of any Text and Data Mining activity or operation, capable of fixation, reproduction and/or communication in any form) provided or generated under this Agreement may be retained by Authorized Users throughout the full lifecycle of the TDM project, including for publication, and as necessary for replication and validation of research results.

 
Biochemical Society / Portland Press

3.2. The Institution shall be entitled to permit Authorised Users, for Educational Purposes only: ...

3.2.10. to download and make copies of the whole or any parts of the Licensed Material for the purposes of, and to perform and engage in computational analysis (including text and data mining) using the Licensed Material for the purpose of research and other Educational Purposes but not for Commercial Use, and to permit Authorised Users to distribute and display and otherwise use (publicly or otherwise), other than for Commercial Use,  the results, provided that such results do not reproduce the whole or a substantial part of any Licensed Content.  Copies of Licensed Content made under this Clause 3.2.10 shall be deleted promptly after the computational analysis has been completed;

 
Bloomsbury

2 GRANT OF LICENSE USAGE RIGHTS AND LIMITATIONS ON USE

2.2 For each Licensed Work, respectively, Licensor grants the Licensee the non-exclusive and non-transferable right for the Licensed Work Term and subject to any Concurrency Restriction(s) and the terms of the Legal Notice for that Licensed Work (including any Usage Rights specified in the Legal Notice) to allow Authorised Users at the Sites for the purposes of research, teaching, and private study to:

2.2.6 carry out Text And Data Mining provided consent has been obtained from the Licensor prior to commencing Text And Data Mining activities. 

 
Brill TDM is permitted, without written permission. Subclause 3.2.4 of the standard version of the license agreement from 2017 onwards, states: “Authorized users may … use Text and Data Mining technologies to derive information from the Licensed Materials. Authorized Users may use the results of any Text Mining activity for their research, including without limitation the creation of an index, abstract, or description of Licensed Materials, whether in the form of a direct extraction or a representation in any form which is based on subscribed Content. If published, the research must be original and must not amount to a derivative work.”  
British Medical Journal (BMJ) Through the Crossref Text and Data Mining Service we are extending text and data mining rights to researchers at subscribing institutions worldwide for non-commercial research purposes under the terms and conditions below. This service will enable researchers to mine content across a wide range of publishers from a single site. BMJ TDM Licence/Policy
British Online Archives TDM is permitted, without written permission. Subclause 4.2.3 of the standard version of the license agreement from 2018 onwards, states: “The Licensee may permit its Affiliated Users to … perform and engage in text mining/data mining activities in relation to the Publication for legitimate academic research and other non-commercial educational purposes without obtaining the Licensor’s prior written consent.”  
Cambridge University Press

3 PERMITTED USES

3.3 Authorised Users may download, extract, store and index the Products for the purposes of TDM and may mount, load, integrate and analyse the results of TDM on their personal devices or Secure Network. Any copies of the Products accessed or reproduced by an Authorised User for the purposes of TDM must be deleted once the analysis of the results of the TDM is complete.

3.4 Authorised Users may use the results of their TDM in their research and make the results of their TDM publicly available, provided that no Product or part of a Product is reproduced within such research, other than as expressly permitted by applicable law.

 

Company of Biologists

3. PERMITTED USES
3.2. The Institution shall be entitled to permit Authorised Users, for Educational and Research Purposes only:

3.2.9. to download and make copies of the whole or any parts of the Licensed Material for the purposes of, and to perform and engage in computational analysis (including text and data mining) using the Licensed Material for the purpose of research and other Educational Purposes but not for Commercial Use, and to permit Authorised Users to distribute and display and otherwise use (publicly or otherwise), other than for Commercial Use, the results, provided that such results do not reproduce the whole or a substantial part of any Licensed Content . . . 

 
De Gruyter There is no reference to TDM in the standard license agreement. However, TDM might be permitted on a case-by-case basis; the publisher will consider each application on request.  
Elsevier - Science Direct Elsevier allows researchers to text mine subscribed content on ScienceDirect for non-commercial purposes via the ScienceDirect APIs. Researchers should register for an API key; instructions are available at the link to the right. Elsevier Text and Data Mining Policy
Elsevier - Scopus

Scopus can be text and data mined; however, an outline of the project must be submitted for review (by the Scopus content team) and you must agree to their TDM terms and conditions.

You can find out more by clicking on the link, signing in and looking at the section on Scopus TDM.

Scopus TDM information

Factiva

Text and data mining is not allowed by Factiva, although we have been informed that if a business proposal is submitted Dow Jones will provide a quote for that specific case. Factiva Terms of Use
Gale Gale will provide data for text and data mining on a hard drive. This must be requested by institutions, not individuals, and is not a free service. TDM License addendum under investigation. Content from most Gale Digital Collections, including essential research databases like Eighteenth Century Collections Online and Nineteenth Century Collections Online, as well as content from Gale’s extensive newspaper archives and other collections, is available. Gale Data Mining and Textual Analytics
IEEE Xplore

The IEEE Xplore Metadata API addresses growing customer requests and the STM industry movement towards machine-readable content. All you need is an API key (register to get one), and then you can try any or all of the available API calls; no coding is required. Simply replace any of the fields (like DOI) with your own list of DOIs and away you go.

The Xplore Metadata API provides access to metadata for millions of documents available in the IEEE Xplore Digital Library including IEEE journals, conferences, books / ebooks, courses and standards.

IEEE Xplore Metadata API
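For researchers who prefer to script their requests, the "replace the DOI field with your own list" step can be sketched in Python as below. This is only an illustration: the endpoint URL and the `apikey`, `format` and `doi` parameter names are assumptions that should be checked against IEEE's API documentation, and `build_doi_queries` is a hypothetical helper, not part of any IEEE tool.

```python
from urllib.parse import urlencode

# Assumed endpoint for the IEEE Xplore Metadata API; confirm it against
# IEEE's own API documentation before use.
BASE_URL = "https://ieeexploreapi.ieee.org/api/v1/search/articles"


def build_doi_queries(dois, api_key, base_url=BASE_URL):
    """Return one request URL per DOI (hypothetical helper).

    The query parameter names used here are assumptions.
    """
    urls = []
    for doi in dois:
        doi = doi.strip()
        if not doi:
            continue  # skip blank entries in a pasted DOI list
        params = {"apikey": api_key, "format": "json", "doi": doi}
        # urlencode percent-encodes the "/" inside a DOI
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls
```

The function only builds URLs; sending the requests is left to the researcher so that any usage limits tied to the API key can be respected.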
IOP Science Text and data mining for non-commercial purposes is allowed. Researchers must contact IOP to arrange for an exception to the normal blocks they have in place to prevent systematic downloading of their content. IOP Science Text and Data Mining Policy
JSTOR Data for Research is a free service for researchers wishing to analyze content on JSTOR through a variety of lenses and perspectives. DfR enables researchers to find useful patterns, associations and unforeseen relationships in the body of research available in the journal and pamphlet archives on JSTOR. To this end we provide data sets of documents to researchers: OCR, metadata, Key Terms, N-grams and reference text. JSTOR About Data for Research
Knowledge Unlatched All content is published in open access under various Creative Commons licenses (usually CC-BY). As long as the TDM process adheres to this licensing, then yes, it is permitted without restriction.  
Oxford University Press Oxford University Press accommodates TDM for non-commercial use. Although researchers are not required to request permission for non-commercial text-mining, OUP offers consultation with a technical project manager to assist in planning the project, including how to avoid triggering any technical safeguards OUP has in place to protect the stability and security of their websites. Oxford Third Party Data Mining
Peter Lang (eBooks) Licensee will make every effort to inform users that they must obtain written permission from the Publisher to perform and engage in text and data mining activities. Requests must specify whether the text and data mining activities are for commercial or non-commercial use. Permission shall not be unreasonably withheld by the Publisher, provided that Authorized Users only engage in text and data mining activities for legitimate academic research and other educational purposes. Applicable ‘cost-recovery fees’ will be determined by the Publisher upon receipt of each request.
ProQuest

10. Customer and its Authorized Users shall not:

i) Text mine, data mine or harvest metadata from the Service;

ProQuest Terms and Conditions
Royal Society The Institution shall be entitled to permit Authorised Users, for Educational and Research Purposes only: . . . to download and make copies of the whole or any parts of the Licensed Material for the purposes of, and to perform and engage in computational analysis (including text and data mining) using the Licensed material for . . . the purpose of research and other Educational Purposes but not for Commercial Use, and to permit Authorised Users to distribute and display and otherwise use (publicly or otherwise), other than for Commercial Use, the results, provided that such results do not reproduce the whole or a substantial part of any Licensed Content. Copies of Licensed Content made . . . shall be deleted promptly after the computational analysis has been completed. However, any copies required for purposes of research data retention by Licensee if and to the extent needed to comply with government and funding body data management requirements, may be retained. Royal Society Data Sharing and Mining
Royal Society of Chemistry (RSC) Publisher licenses Customer and Authorised Users to download, extract and index information from the Publisher Content and, where required, load and integrate the results on a server used for Customer's or Authorised User's text mining system and evaluate and interpret the text and data mining ("TDM") output for access and use by the Authorised User carrying out the TDM ("TDM User") and other Authorised Users. Customer ensures that all TDM is carried out under the other conditions of this Clause 2. TDM may be undertaken on either locally loaded Publisher Content or as mutually agreed. For avoidance of doubt, if the TDM is being carried out from the Publisher's publishing platform at pubs.rsc.org, Authorised User shall send a request through to Publisher via tdm@rsc.org and Publisher shall ask the TDM User and not the Customer to perform any administrative task needed before carrying out the TDM. Royal Society of Chemistry Eighth Addendum 2023
Sage Publications Authorized Users shall be permitted to extract or use information contained in the Product for Educational Purposes, including, but not limited to, text and data mining, extraction and manipulation of information for the purposes of illustration, explanation, example, comment, criticism, teaching, research, or analysis. Sage Publications Ltd. Transformative Agreement
Sage Research Methods Our license allows Authorized Users to "use the licensed material to perform and engage in text mining /data mining activities for legitimate academic research and other educational purposes. Those uses beyond educational use shall require SAGE's permission."
Society for Industrial and Applied Mathematics Authorized Users may use the Licensed Materials to perform text and data mining for legitimate academic research and other educational purposes, but not for commercial purposes, for as long as the Licensee maintains paid access to the Licensed Materials.  
Springer This publisher allows non-commercial text and data mining. Springer is a supporter of the CrossRef TDM Initiative and expects their data to be fully supported soon. Springer Text and Data Mining Policy
Taylor & Francis (T&F eBooks, SDGO and Routledge Handbooks) Licensee will ensure that Authorised Users obtain prior written permission from the Publisher, to perform and engage in text and data mining activities in relation to the Licensed Materials or Online Services, which are understood as a machine process by which information may be derived by identifying patterns and trends within natural language through text categorization, statistical pattern recognition, concept or sentiment extraction, and the association of natural language with indexing terms. Requests for which, shall specify if text and data mining activities are to be for commercial or non-commercial use. If authorised by the Publisher, applicable ‘cost-recovery fees’ will be determined by the Publisher at its sole discretion.  
Taylor & Francis Journals 3. PERMITTED USES
3.3.8. to download and make single copies of the whole or any parts of the Licensed Material for the purposes of, and to perform and engage in computational analysis (including text and data mining) using the Licensed Material, aside from Commercial Use, and to permit Authorised Users to distribute and display and otherwise use (publicly or otherwise), other than for Commercial Use, the results, provided that i) such results do not reproduce the whole or a substantial part of any Licensed Content and ii) the Publisher has been notified in writing in advance of this use to ensure that they can provide the appropriate technical assistance and maintain a log of projects. Copies of Licensed Content made under this Clause 3.3.9 shall be deleted promptly after the computational analysis has been completed; Notwithstanding the aforesaid, copies made under this Clause 3.3.9 required to be made for purposes of research data retention by Licensee, if and to the extent needed to comply with government and/or funding body data management requirements, may be retained, either for such period of time as may be required by such government and/or funding body, or otherwise for so long as reasonably required to be retained by the Licensee for such purpose;
3.3.9. to download the Licensed Material in whole or in part as is reasonably necessary for the Authorised User’s personal Educational and Research Purpose onto personal computing devices including, but not limited to, tablets, e-book readers and laptops, and stand-alone computers, without any limit in number. The Publisher makes no warranty as to the suitability of any Licensed Material for use on such devices; 
Taylor & Francis Online Terms & Conditions; Text and Data Mining
Wiley Wiley allows text and data mining for non-commercial purposes as long as it is done using an approved API service such as CrossRef. Wiley Text and Data Mining Agreement
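
Several of the publishers above (BMJ, Springer, Wiley) refer to Crossref for text and data mining. As a rough sketch, and assuming a work record shaped like the `message` object returned by the Crossref REST API (a `link` list whose entries carry `URL`, `content-type` and `intended-application` fields), the full-text links a publisher has flagged for text mining could be picked out as follows. The function name and the sample record are hypothetical, and a link being listed does not by itself mean our license permits downloading it.

```python
def text_mining_links(work):
    """Return (url, content_type) pairs flagged for text mining.

    `work` is assumed to be a dict shaped like the `message` object of a
    Crossref REST API works response; this helper is hypothetical.
    """
    links = []
    for link in work.get("link", []):
        if link.get("intended-application") == "text-mining":
            links.append((link.get("URL"), link.get("content-type", "unspecified")))
    return links


# Made-up sample record, for illustration only.
sample_work = {
    "DOI": "10.1000/example",
    "link": [
        {
            "URL": "https://publisher.example/fulltext.xml",
            "content-type": "application/xml",
            "intended-application": "text-mining",
        },
        {
            "URL": "https://publisher.example/article.pdf",
            "content-type": "application/pdf",
            "intended-application": "similarity-checking",
        },
    ],
}

for url, content_type in text_mining_links(sample_work):
    print(content_type, url)
```

Records without a `link` list simply yield an empty result, which is a cue to contact us or the publisher rather than to scrape the publisher's website.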