NLP / Bibliometrics

From literature collection to topic & network analysis — in one flow.

Upload PDFs or query OpenAlex, then view topics, keyword networks, and per-period topic shifts in a single pipeline. Available only in the on-premise desktop app, because the pipeline processes large PDF batches and downloads models locally.

At a glance

Common use

Literature trends · Keyword clusters · Academic reports

Sources

OpenAlex search / Excel metadata / Bulk PDF upload

Languages

Korean · English (stanza-based tokenisation)

Engine

gensim LDA · networkx + pyvis

Output

Word cloud · Topic table · Network graph · Strategic diagram

Delivery

On-premise desktop app only

Data preparation

  1. OpenAlex search: keyword + year range → auto metadata collection
  2. Or Excel upload: title / abstract / year columns required
  3. Or bulk PDF upload (group folders auto-recognised as labels)
  4. Preprocessing: stopword + user-defined stopword removal
  5. Confirm tokenisation, then run topic modeling

Group PDFs by folder before upload — folder names become group labels so you can compare topics across groups.
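
As a rough illustration of the folder-to-label idea, the sketch below derives a group label from the first folder under an upload root. The function name `group_pdfs_by_folder`, the sample paths, and the `(ungrouped)` placeholder are hypothetical, and the app's actual rule (for example, top-level folder versus immediate parent) may differ.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_pdfs_by_folder(paths, root):
    """Map each top-level folder under `root` to the PDF paths it contains.

    Assumption: the group label is the first folder below the upload root.
    """
    groups = defaultdict(list)
    root = PurePosixPath(root)
    for raw in paths:
        path = PurePosixPath(raw)
        if path.suffix.lower() != ".pdf":
            continue  # skip anything that is not a PDF
        relative = path.relative_to(root)
        # A file sitting directly under the root has no folder to use as a label.
        label = relative.parts[0] if len(relative.parts) > 1 else "(ungrouped)"
        groups[label].append(str(path))
    return dict(groups)


# Hypothetical upload listing
uploads = [
    "corpus/marketing/q1_report.pdf",
    "corpus/marketing/q2_report.pdf",
    "corpus/engineering/design_notes.pdf",
    "corpus/readme.txt",
]
groups = group_pdfs_by_folder(uploads, "corpus")
```

Here `groups` has the keys `marketing` and `engineering`, each holding the PDFs found under that folder; the text file is ignored.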

Workflow

  1. Data collection (OpenAlex / Excel / PDF)
  2. Tokenisation + stopword removal (stanza)
  3. Word cloud + frequency analysis
  4. LDA optimal K auto-discovery (coherence)
  5. Topic modeling + per-topic keywords + distribution
  6. Network analysis (keyword co-occurrence)
  7. Time-series topic shifts + strategic diagram
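
To make the stopword-removal and co-occurrence steps above concrete, here is a minimal standard-library sketch. It assumes documents are already tokenised and that two keywords co-occur when they appear in the same document; the app may use a different window (sentence, abstract) or weighting. The function name `cooccurrence_edges` and the `min_count` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(tokenised_docs, stopwords=frozenset(), min_count=1):
    """Count how many documents each unordered keyword pair shares."""
    pair_counts = Counter()
    for tokens in tokenised_docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same (alphabetical) order.
        keywords = sorted(set(tokens) - set(stopwords))
        pair_counts.update(combinations(keywords, 2))
    # Keep only pairs that co-occur in at least `min_count` documents.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Toy tokenised corpus (illustrative only)
docs = [
    ["topic", "model", "lda", "the"],
    ["topic", "network", "lda"],
    ["network", "centrality", "the"],
]
edges = cooccurrence_edges(docs, stopwords={"the"}, min_count=2)
```

The resulting weighted pairs could then be loaded into a graph library such as networkx (for instance with `Graph.add_weighted_edges_from`) to compute centrality and clusters.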

Supported analyses

  • Word cloud + frequency

    Overall / per-group keywords visualised + Top-N table

  • LDA topic modeling

    Auto-pick optimal K via coherence, then derive topics + keywords

  • Network analysis

    Keyword co-occurrence graph + centrality + clustering

  • Time-series topic analysis

    Year-on-year topic share shifts + Mann-Kendall trend test

  • Strategic diagram

    Density × centrality quadrant for research positioning
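
For readers unfamiliar with the Mann-Kendall trend test listed above, the following is a minimal sketch of the standard statistic applied to one topic's yearly share. It uses the normal approximation with a continuity correction and omits the tie correction to the variance, so it is an illustration under those assumptions rather than the app's implementation. The sample shares are made up.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumption: no tie correction is applied to the variance of S.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S = number of increasing pairs minus number of decreasing pairs (i < j).
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    # Continuity-corrected Z score under the normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return s, z, p


# Hypothetical yearly share of one topic over seven years
shares = [0.10, 0.12, 0.15, 0.14, 0.18, 0.21, 0.25]
s, z, p = mann_kendall(shares)
```

A positive `s` with a small `p` suggests a rising topic, a negative `s` a declining one. The normal approximation is usually considered rough for very short series, so results over only a few years deserve caution.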

Use cases

  • Literature trend discovery

    Query OpenAlex with topic keywords → auto-analyse 10-year topic shifts.

  • Internal report topic mapping

    Upload internal PDFs grouped by department → compare key topics per department.

  • Academic review draft

    Pipeline output (collection → analysis → visualisation) exports straight to a LaTeX paper.

What you get

  • Collected metadata table + year distribution
  • Word cloud + Top-N keyword table
  • Per-topic keywords + document distribution + time shifts
  • Network graph + strategic diagram
  • Auto-generated paper (LaTeX → PDF)
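
As a reading aid for the strategic diagram in the outputs above, the sketch below assigns each topic to a density × centrality quadrant. It assumes the axes are split at the median and uses the quadrant names common in co-word analysis; the app's cut-offs and labels may differ, and the topic names and scores are invented.

```python
from statistics import median

# Keys are (centrality above cut-off, density above cut-off).
QUADRANTS = {
    (True, True): "motor theme",
    (False, True): "niche theme",
    (False, False): "emerging or declining theme",
    (True, False): "basic theme",
}


def classify_topics(topics):
    """topics: {name: (centrality, density)} -> {name: quadrant label}."""
    # Assumption: quadrants are separated at the median of each axis.
    c_cut = median(c for c, _ in topics.values())
    d_cut = median(d for _, d in topics.values())
    return {
        name: QUADRANTS[(c > c_cut, d > d_cut)]
        for name, (c, d) in topics.items()
    }


# Hypothetical topics with (centrality, density) scores
topics = {
    "deep learning": (0.9, 0.8),
    "survey methods": (0.7, 0.2),
    "legacy systems": (0.1, 0.1),
    "quantum sensing": (0.2, 0.9),
}
labels = classify_topics(topics)
```

Topics high on both axes are well developed and well connected; topics high on centrality but low on density are broad but loosely developed, which is often where review articles focus.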