Data collection

The toolbox is meant to be used in three steps. The first is the data collection step, in which you collect large amounts of data from the relevant databases. The second is the analysis step, in which you analyze the data objectively using easy-to-use bibliometric analysis tools. Finally, the communication step helps you communicate your findings to the outside world via standard visualization and reporting methods. Treat each pass through these three steps as a single iteration of the whole process: depending on the outputs of the analysis and the feedback you receive after communicating, repeat the same steps to make the required improvements.

Downloading large batches of data is possible only from specific academic databases, the best known being Scopus and Web of Science. However, about 200 academic databases and search systems are accessible within the TU Delft network (the complete list).

 Download data from Scopus: see Instructions. (Download limit: 2000 records at a time. To download all search results, choose “Select all” from the check box shown in the image on the right.)

 Download data from Web of Science: see Instructions. (Download limit: 500 records at a time.)
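Because of these download limits, a large result set has to be exported in several batches and combined afterwards. The following is a minimal Python sketch of that bookkeeping, not an official procedure of either database. It assumes the batches were exported as CSV files and that each record carries a column that identifies it uniquely. The column name "id", the helper names and the sample rows are hypothetical, so adjust them to the columns in your own export.

```python
import csv
import math


def batches_needed(total_records, limit):
    """Number of separate downloads needed when each is capped at `limit`."""
    return math.ceil(total_records / limit)


def read_export(path):
    """Read one exported CSV file into a list of row dictionaries."""
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


def merge_batches(batches, key):
    """Concatenate batches, keeping the first row seen for each `key` value."""
    seen = set()
    merged = []
    for rows in batches:
        for row in rows:
            identifier = row.get(key, "").strip()
            if identifier and identifier in seen:
                continue  # duplicate caused by overlapping batches
            seen.add(identifier)
            merged.append(row)
    return merged


# In-memory rows standing in for two downloaded files (made-up data).
batch_1 = [{"id": "a1", "Title": "First paper"}, {"id": "a2", "Title": "Second paper"}]
batch_2 = [{"id": "a2", "Title": "Second paper"}, {"id": "a3", "Title": "Third paper"}]

merged = merge_batches([batch_1, batch_2], key="id")
print(len(merged))                  # 3
print(batches_needed(4500, 2000))   # 3 downloads at 2000 records each
print(batches_needed(4500, 500))    # 9 downloads at 500 records each
```

With real files, you would build the list of batches with something like `[read_export(p) for p in paths]`. Rows without an identifier are kept as they are, since there is no safe way to tell whether they are duplicates.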

API Access – Do you know how to use APIs to collect data? If so, you will find the APIs for Scholarly Resources provided by the MIT Libraries very helpful.

When you’re collecting and analyzing data, it’s very important to document your work. Some of the details we recommend documenting are listed below.

  • The date you collected your data
  • The location where you’re storing your files
  • The source you collected the data from
  • The keyword(s) you searched for and the filters you applied
  • The basic statistics (e.g., number of records, top authors, top countries, top sources and top organizations)
  • Annual distribution of publications
  • The Term map corresponding to your search results and the key terms in each cluster
  • The Co-authorship map corresponding to your search results and the key authors in each cluster
  • The Citation map corresponding to your search results and the key papers in each cluster
  • The trendy and unfashionable terms in your search results

Except for the last item above, which is still difficult to do with existing tools, you can easily generate all of these with the Data Analysis tools introduced here.
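If you would like to cross-check the basic statistics and the annual distribution in a script rather than in a dedicated tool, a few lines of Python are usually enough. The sketch below is only an illustration. It assumes the records are already loaded as dictionaries, that authors sit in one field separated by semicolons, and that the publication year is in its own field. The field names "Authors" and "Year", the function name and the sample rows are assumptions, not a fixed export format.

```python
from collections import Counter


def basic_statistics(rows, author_field="Authors", year_field="Year",
                     separator=";", top_n=3):
    """Count records, most frequent authors and publications per year."""
    authors = Counter()
    years = Counter()
    for row in rows:
        # One record can list several authors in a single field.
        for name in row.get(author_field, "").split(separator):
            name = name.strip()
            if name:
                authors[name] += 1
        year = row.get(year_field, "").strip()
        if year:
            years[year] += 1
    return {
        "number_of_records": len(rows),
        "top_authors": authors.most_common(top_n),
        "annual_distribution": sorted(years.items()),
    }


# Made-up records, used only to show the shape of the output.
rows = [
    {"Authors": "Author A; Author B", "Year": "2019"},
    {"Authors": "Author B", "Year": "2020"},
    {"Authors": "Author B; Author C", "Year": "2020"},
]

stats = basic_statistics(rows, top_n=1)
print(stats["number_of_records"])    # 3
print(stats["top_authors"])          # [('Author B', 3)]
print(stats["annual_distribution"])  # [('2019', 1), ('2020', 2)]
```

The same counting approach extends to top countries, sources and organizations by pointing `author_field` at the corresponding column, provided that column uses a consistent separator.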

To document your analysis, we’ve prepared a template MS Word file for you:
 Download the Data Collection and Exploration template 

To get an idea of what this internal report can look like, check the following examples.
 Download Internal Report on “Geopolymer Concrete”
 Download Internal Report on “Social Network + Game Theory”