NLP / Bibliometrics
From literature collection to topic & network analysis — in one flow.
Upload PDFs or query OpenAlex, then view topics, keyword networks, and per-period topic shifts in a single pipeline. Available only as an on-premise desktop app, because the pipeline processes large PDF batches and downloads models locally.
At a glance
Common use
Literature trends · Keyword clusters · Academic reports
Sources
OpenAlex search / Excel metadata / Bulk PDF upload
Languages
Korean · English (stanza-based tokenisation)
Engine
gensim LDA · networkx + pyvis
Output
Word cloud · Topic table · Network graph · Strategic diagram
Delivery
On-premise desktop app only
Data preparation
- 1. OpenAlex search: keyword + year range → auto metadata collection
- 2. Or Excel upload: title / abstract / year columns required
- 3. Or bulk PDF upload (group folders auto-recognised as labels)
- 4. Preprocessing: removal of standard and user-defined stopwords
- 5. Confirm tokenisation, then run topic modeling
Group PDFs by folder before upload — folder names become group labels so you can compare topics across groups.
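As a rough illustration of the folder-to-label idea, the sketch below maps each top-level folder to the PDFs beneath it. The function name `group_pdfs_by_folder`, the `ungrouped` fallback label, and the rule that nested subfolders inherit the top-level folder's label are all assumptions made for this example, not the app's documented behaviour.

```python
from collections import defaultdict
from pathlib import Path


def group_pdfs_by_folder(root):
    """Map each top-level folder name under `root` to the PDFs beneath it."""
    root = Path(root)
    groups = defaultdict(list)
    for pdf in sorted(root.rglob("*.pdf")):
        rel = pdf.relative_to(root)
        # A PDF sitting directly in the root has no folder, so it gets a
        # fallback label (hypothetical choice for this sketch).
        label = rel.parts[0] if len(rel.parts) > 1 else "ungrouped"
        groups[label].append(pdf)
    return dict(groups)
```

Calling `group_pdfs_by_folder("reports")` on a directory with `reports/sales/` and `reports/research/` subfolders would return one entry per folder, which is the shape a per-group comparison needs.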
Workflow
- 1. Data collection (OpenAlex / Excel / PDF)
- 2. Tokenisation (stanza) + stopword removal
- 3. Word cloud + frequency analysis
- 4. LDA optimal K auto-discovery (coherence)
- 5. Topic modeling + per-topic keywords + distribution
- 6. Network analysis (keyword co-occurrence)
- 7. Time-series topic shifts + strategic diagram
Supported analyses
Word cloud + frequency
Overall / per-group keywords visualised + Top-N table
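A Top-N table of this kind can be approximated in a few lines. The sketch assumes the documents are already tokenised; the stopword sets and sample tokens are placeholders invented for the example, and `top_n_keywords` is a hypothetical helper name.

```python
from collections import Counter


def top_n_keywords(docs, stopwords, n=10):
    """Count tokens across tokenised documents, skipping stopwords."""
    blocked = {w.lower() for w in stopwords}
    counts = Counter(
        tok.lower() for doc in docs for tok in doc if tok.lower() not in blocked
    )
    return counts.most_common(n)


# Placeholder stopword lists and documents, for illustration only.
STANDARD_STOPWORDS = {"the", "of", "and"}
USER_STOPWORDS = {"study"}

docs = [
    ["the", "topic", "model", "of", "study"],
    ["network", "model", "and", "topic"],
    ["topic", "network", "study"],
]
top = top_n_keywords(docs, STANDARD_STOPWORDS | USER_STOPWORDS, n=3)
```

Running the same function on each group's documents separately would give the per-group variant.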
LDA topic modeling
Auto-pick optimal K via coherence, then derive topics + keywords
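One plausible way to pick K by coherence with gensim is sketched below. The candidate range, the `c_v` coherence measure, `passes=5`, and the tie-break toward the smaller K are assumptions for illustration; the page does not state which settings the app uses.

```python
def coherence_by_k(tokenised_docs, k_values, seed=0):
    """Train one LDA model per candidate K and return {K: coherence}."""
    # Imported lazily so pick_best_k below stays usable without gensim.
    from gensim.corpora import Dictionary
    from gensim.models import CoherenceModel, LdaModel

    dictionary = Dictionary(tokenised_docs)
    corpus = [dictionary.doc2bow(doc) for doc in tokenised_docs]
    scores = {}
    for k in k_values:
        lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                       random_state=seed, passes=5)
        cm = CoherenceModel(model=lda, texts=tokenised_docs,
                            dictionary=dictionary, coherence="c_v")
        scores[k] = cm.get_coherence()
    return scores


def pick_best_k(scores):
    """Return the K with the highest coherence; ties go to the smaller K."""
    return min(scores, key=lambda k: (-scores[k], k))
```

A typical call would be `pick_best_k(coherence_by_k(tokenised_docs, range(2, 16)))`, where `tokenised_docs` is a list of token lists.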
Network analysis
Keyword co-occurrence graph + centrality + clustering
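To make the co-occurrence idea concrete, here is a standard-library sketch that counts keyword pairs per document and derives degree centrality, defined as a node's neighbour count divided by n − 1 (the same definition networkx uses). The helper names and sample documents are invented for the example, and which centrality measure the app reports is not stated on this page.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs):
    """Count, for each keyword pair, how many documents contain both."""
    edges = Counter()
    for doc in docs:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(doc)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Fraction of the other keywords each keyword is linked to."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


docs = [["lda", "topic", "corpus"], ["topic", "network"], ["lda", "topic"]]
edges = cooccurrence_edges(docs)
centrality = degree_centrality(edges)
```

The edge counts can serve as edge weights when the pairs are loaded into a graph library for layout and clustering.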
Time-series topic analysis
Year-on-year topic share shifts + Mann-Kendall trend test
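For reference, the Mann-Kendall statistic can be computed with the standard library alone. The sketch uses the usual tie-corrected variance and a normal approximation for the two-sided p-value; the yearly shares are made-up numbers, and whether the app applies further corrections (for example for autocorrelation) is not stated here.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Return (S, Z, two-sided p) for a monotonic trend in `values`."""
    n = len(values)
    # S sums the signs of all later-minus-earlier differences.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1) for j in range(i + 1, n)
    )
    # Variance of S with the standard correction for tied values.
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18
    if var_s == 0 or s == 0:
        return s, 0.0, 1.0
    # Continuity correction: shrink |S| by one before standardising.
    z = (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return s, z, p


# Illustrative yearly shares of one topic (invented values).
shares = [0.08, 0.10, 0.11, 0.13, 0.15, 0.18, 0.19, 0.22, 0.24, 0.27]
s, z, p = mann_kendall(shares)
```

A positive S with a small p suggests a rising topic share; a negative S suggests a declining one.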
Strategic diagram
Density × centrality quadrant for research positioning
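The quadrant assignment behind a strategic diagram can be sketched as follows. The quadrant names follow the common convention for such diagrams (motor, niche, basic, emerging or declining); splitting the axes at the median and the sample values are assumptions for this example, since the page does not say how the app sets the thresholds or labels the quadrants.

```python
from statistics import median


def strategic_quadrants(themes):
    """themes: {name: (centrality, density)} -> {name: quadrant label}."""
    # Assumption: split each axis at its median across all themes.
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())
    labels = {}
    for name, (c, d) in themes.items():
        central, dense = c >= c_cut, d >= d_cut
        if central and dense:
            labels[name] = "motor"
        elif dense:
            labels[name] = "niche"
        elif central:
            labels[name] = "basic"
        else:
            labels[name] = "emerging or declining"
    return labels


# Invented (centrality, density) pairs for illustration.
themes = {
    "topic modeling": (0.9, 0.8),
    "citation analysis": (0.8, 0.2),
    "stylometry": (0.1, 0.7),
    "patent mining": (0.2, 0.1),
}
quadrants = strategic_quadrants(themes)
```

Plotting centrality on the x-axis and density on the y-axis, with the two cut lines drawn in, gives the diagram itself.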
Use cases
Literature trend discovery
Query OpenAlex with topic keywords → auto-analyse 10-year topic shifts.
Internal report topic mapping
Upload internal PDFs grouped by department → compare key topics per department.
Academic review draft
Pipeline output (collection → analysis → visualisation) exports straight to a LaTeX paper draft.
What you get
- Collected metadata table + year distribution
- Word cloud + Top-N keyword table
- Per-topic keywords + document distribution + time shifts
- Network graph + strategic diagram
- Auto-generated paper (LaTeX → PDF)