Local Knowledge Base
All-in-One Literature Hub: Search · Import · Retrieve · Reason
InkCop's local knowledge base is the central hub of your workflow — search 4 literature platforms directly, batch import multi-format local resources, then retrieve with dual RAG + Knowledge Graph engines. The LiteraturePrepareAgent owns the full pipeline. 100% local storage with AI-enhanced retrieval throughout your research lifecycle.
4 Literature Platforms · Native Integration
PubMed · Biomedical authority (NIH/NLM)
Standard .nbib metadata supported
arXiv · STEM preprints in Physics / Math / CS
Boolean query + subject filters
OpenAlex · Open-access index of 200M+ scholarly works
OA / full-text filter + citation counts
CORE · World's largest OA paper aggregator
One-click full-text PDF & abstract
Direct Search · One-Click Import
No more juggling between websites. Search PubMed / arXiv / OpenAlex / CORE inside InkCop. Batch import to your local knowledge base — metadata auto-recovered, full text indexed.
Advanced Query
Boolean + field filters
Select / Select All
Multi-select to target folder
Bulk Import
Metadata + full text saved
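For a sense of what an advanced query looks like in practice, here is a small sketch that runs a Boolean, field-filtered PubMed search against NCBI's public E-utilities endpoint. The endpoint, query, and helper are illustrative only and independent of InkCop's own search integration.

```ts
// Illustrative only: the kind of Boolean + field-filtered query PubMed accepts.
// searchPubMed() talks directly to NCBI's public E-utilities endpoint and is
// NOT an InkCop API -- it just shows what an "advanced query" can express.
const ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

async function searchPubMed(term: string, max = 20): Promise<string[]> {
  const url = `${ESEARCH}?db=pubmed&retmode=json&retmax=${max}&term=${encodeURIComponent(term)}`;
  const res = await fetch(url);
  const json = await res.json();
  return json.esearchresult.idlist as string[]; // PMIDs ready for import
}

// Boolean operators plus field tags ([tiab] = title/abstract, [pt] = publication
// type, [dp] = publication date).
const query =
  '(CRISPR[tiab] AND "base editing"[tiab]) NOT review[pt] AND ("2022"[dp] : "2024"[dp])';

searchPubMed(query).then((pmids) => console.log(pmids));
```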
Bulk Local Import · Any Format
Got a massive backlog? Drag in a folder and InkCop recursively scans every supported format. PDFs go through MinerU for layout recovery, Word/EPUB/RTF/ODT are converted via Pandoc, and PNG/JPG images are read via OCR (a routing sketch follows the format list below).
- Async background processing — keep working in the foreground
- Resume from checkpoint after app restart — no task loss
- Auto-triggers LLM preprocessing on import
- Choose target subfolder to align with your existing structure
PDF · Layout preserved + AI metadata recovery
Word · .doc / .docx via Pandoc
PowerPoint · .ppt / .pptx slide import
EPUB · E-book chapters fully parsed
Markdown · Native format, zero conversion loss
RTF / ODT · Rich-text formats, lossless import
Image · PNG / JPG with OCR extraction
Plain Text · Logs, notes, batch outputs
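As a rough sketch of how such an importer can dispatch files, the table below maps extensions to the processing steps named above (MinerU for PDFs, Pandoc conversion, OCR for images). The route names and the routeFile helper are hypothetical, not InkCop's module names.

```ts
// Minimal sketch: route each scanned file by extension to a processing step.
// Route names are hypothetical labels for the tools described above.
import * as path from "node:path";

type Route = "mineru" | "pandoc" | "slides" | "ocr" | "plain";

const ROUTES: Record<string, Route> = {
  ".pdf": "mineru",                                    // layout-aware PDF parsing
  ".doc": "pandoc", ".docx": "pandoc",                 // Word conversion
  ".epub": "pandoc", ".rtf": "pandoc", ".odt": "pandoc",
  ".ppt": "slides", ".pptx": "slides",                 // slide import (converter unspecified above)
  ".png": "ocr", ".jpg": "ocr", ".jpeg": "ocr",        // text extracted via OCR
  ".md": "plain", ".txt": "plain",                     // ingested as-is
};

export function routeFile(file: string): Route | null {
  const ext = path.extname(file).toLowerCase();
  return ROUTES[ext] ?? null;                          // null = unsupported, skip with a warning
}
```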
Dual Retrieval: RAG + Knowledge Graph
Vector-only retrieval drifts; graph-only retrieval is too narrow. InkCop runs both engines in parallel and cross-calibrates them, giving the AI context that is both broad and precise.
RAG Vector Retrieval
Broad semantic similarity · Fault-tolerant
- ObjectBox + HNSW local index, millisecond nearest-neighbor search
- Multimodal embeddings (text + image)
- Optional reranker model for higher-precision ranking
- LLM-driven semantic chunking
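A minimal sketch of the RAG leg under stated assumptions: embed() stands in for the embedding model and CHUNKS for the local vector store; exact cosine top-k is shown where the real index would use HNSW approximate search, optionally followed by a reranker.

```ts
// Sketch of vector retrieval: embed the query, rank stored chunks by cosine
// similarity, keep the top k. embed() and CHUNKS are hypothetical stand-ins.
interface Chunk { id: string; docId: string; text: string; vector: number[]; }

declare function embed(text: string): Promise<number[]>; // hypothetical embedding call
declare const CHUNKS: Chunk[];                           // hypothetical local store

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

async function ragRetrieve(query: string, k = 8): Promise<Chunk[]> {
  const q = await embed(query);
  return [...CHUNKS]
    .map((c) => ({ c, score: cosine(q, c.vector) }))
    .sort((x, y) => y.score - x.score)   // exact ranking here; HNSW approximates this
    .slice(0, k)
    .map((x) => x.c);                    // optionally re-ordered by a reranker model
}
```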
Knowledge Graph Retrieval
Rigorous entity links · Multi-hop reasoning
- Kuzu graph database, native Cypher queries
- Multi-hop reasoning across papers
- Concept nodes + entity nodes (two-tier)
- Every edge carries chunk citations for verification
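Because Kuzu speaks Cypher, a multi-hop question can be written as a path query. The schema (Concept nodes, edges carrying chunk_ids) and the runCypher helper below are hypothetical, shown only to illustrate what edge-level chunk citations look like in practice.

```ts
// Illustrative two-hop Cypher query over a hypothetical schema. runCypher()
// is a hypothetical stand-in for the graph-database connection.
declare function runCypher(query: string): Promise<Record<string, unknown>[]>;

const twoHop = `
  MATCH (a:Concept {name: 'ferroptosis'})-[r1]-(m:Concept)-[r2]-(b:Concept)
  RETURN b.name AS related, m.name AS via,
         r1.chunk_ids AS hop1_citations, r2.chunk_ids AS hop2_citations
  LIMIT 20
`;

// Every hop returns the chunk ids backing that edge, so claims stay verifiable.
runCypher(twoHop).then((rows) => console.log(rows));
```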
Dual-engine synergy: RAG retrieves candidate chunks → Graph expands by entity links → Merge & dedupe → Optional LLM relevance test → Return context. Transparent to the LLM, delivering "broad and precise" results.
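A sketch of that flow, assuming three hypothetical helpers (ragRetrieve, graphExpand, llmRelevanceFilter) rather than InkCop's actual function names:

```ts
// Dual-engine retrieval: RAG candidates -> graph expansion -> merge & dedupe
// -> optional LLM relevance test. All helper names are hypothetical.
interface Chunk { id: string; docId: string; text: string; }

declare function ragRetrieve(query: string, k: number): Promise<Chunk[]>;
declare function graphExpand(seed: Chunk[], hops: number): Promise<Chunk[]>;
declare function llmRelevanceFilter(query: string, chunks: Chunk[]): Promise<Chunk[]>;

export async function retrieveContext(query: string): Promise<Chunk[]> {
  const ragHits = await ragRetrieve(query, 8);            // 1. broad semantic candidates
  const graphHits = await graphExpand(ragHits, 2);        // 2. expand along entity links
  const merged = new Map<string, Chunk>();                // 3. merge & dedupe by chunk id
  for (const c of [...ragHits, ...graphHits]) merged.set(c.id, c);
  return llmRelevanceFilter(query, [...merged.values()]); // 4. optional LLM relevance test
}
```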
LiteraturePrepareAgent · LLM-Driven Pipeline
Traditional knowledge base systems use rule-based chunking, rule-based extraction, rule-based summarization — four or five independent modules in a chain, each seeing only its slice of input. The "weakest link" drags down overall quality. InkCop reverses this: hand the entire preprocessing pipeline to a single LLM agent.
Semantic Chunking
LLM identifies semantic boundaries to chunk documents — preserving complete argument paragraphs. Avoids rule-based chunking that fragments coherent reasoning.
Entity & Relation Extraction
Identifies people, institutions, concepts, methods, metrics; extracts coreference, contrast, causality, dependency relations into the graph database.
Graph Concept Generation
Generates "concept nodes" per document for cross-paper retrieval. Powers multi-hop reasoning and orphan-concept matching — making the graph truly serve retrieval.
Summary / Keywords / Category
Single-pass production of summary, keywords, and topic classification. Combined with citation metadata, forms a complete document profile for precise retrieval.
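One way to picture the single-pass output is as one structured payload that carries chunks, graph data, and the document profile together, so everything can be committed at once. The field names below are illustrative, not InkCop's actual schema.

```ts
// Hypothetical shape of what one preprocessing pass can return: semantic chunks,
// entities/relations with citations, concept nodes, and the document profile.
interface PreparedDocument {
  chunks: { id: string; text: string; section?: string }[];                     // semantic chunks
  entities: { id: string; name: string; type: string }[];                       // people, methods, metrics...
  relations: { from: string; to: string; kind: string; chunkIds: string[] }[];  // edges with chunk citations
  concepts: { name: string; definition: string }[];                             // cross-paper concept nodes
  summary: string;
  keywords: string[];
  category: string;
}
```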
6 Benefits of LLM-Driven Pipeline
Why not the traditional "rules + multi-module" pipeline?
Unified Context
In a traditional pipeline, each module sees only its slice of input. With LLM ownership, chunking, extraction, and summarization share the full context for deeper understanding.
Atomic Submission
All outputs commit atomically via base_data_updater / index_data_updater tools. No more “chunked OK but graph extraction failed” intermediate states.
Unified Rules, Easy Evolution
All processing logic lives in the Agent prompt. Switching research domains or upgrading rules requires only a prompt tweak — no rewriting multiple modules.
Controlled Fallback & Retry
Failed jobs persist in the signals table. A scan at app startup retries them automatically; jobs past the retry threshold wait for a manual trigger, so nothing is lost silently.
Full LLM Capability
As LLMs improve, preprocessing quality benefits naturally. No code rewrite — just swap to a stronger model or refine the prompt.
Independent LLM Config
LiteraturePreparer can be decoupled from the chat-facing main_agent — configure a cheaper or faster local model for batch processing.
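To make the "Atomic Submission" and "Controlled Fallback & Retry" cards concrete, here is an illustrative commit-or-record sketch. Every name in it (beginTransaction, writeChunks, writeGraph, signals) is a hypothetical stand-in, not InkCop's base_data_updater / index_data_updater internals.

```ts
// Sketch of the atomic-commit + persisted-retry pattern. All names hypothetical.
type PreparedDoc = Record<string, unknown>; // see the fuller shape sketched earlier

declare function beginTransaction(): { commit(): Promise<void>; rollback(): Promise<void> };
declare function writeChunks(tx: unknown, doc: PreparedDoc): Promise<void>; // vector side
declare function writeGraph(tx: unknown, doc: PreparedDoc): Promise<void>;  // graph side
declare const signals: { record(docId: string, err: unknown): Promise<void> };

async function commitPrepared(docId: string, doc: PreparedDoc): Promise<void> {
  const tx = beginTransaction();
  try {
    await writeChunks(tx, doc);
    await writeGraph(tx, doc);
    await tx.commit();                 // all or nothing: no half-indexed documents
  } catch (err) {
    await tx.rollback();
    await signals.record(docId, err);  // persisted failure, retried at next app start
  }
}
```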
Knowledge Graph Visualization · See the Structure
The knowledge graph is not just for AI retrieval — it's your cognitive tool. AntV G6-powered interactive graph lets you literally "see" the relationship network inside your knowledge base.
Six Layout Algorithms
radial, force-cluster, concentric, antv-dagre, circular, grid — the system auto-recommends the best layout based on graph density.
Concept Graph + Document Graph
Concept graph shows cross-paper entity relations; document graph shows similarity and citation links between documents in the knowledge base. One-click switch.
Click-to-Expand Neighbors
Click any node to fetch its neighbor subgraph and merge with the current view. Progressively explore the graph without loading everything at once.
Focus Highlight + LOD
Selecting a node highlights first-degree neighbors and dims the rest. Labels auto-hide on low zoom — smooth rendering even with millions of nodes.
Related Chunks & Source Jump
A drawer shows chunks linked to the node and lets you jump straight to the source paper position. From graph to detail without tool-switching.
Keyword Search + Subgraph Filter
Filter the subgraph via the toolbar by keyword or knowledge-base ID. When nodes/edges exceed the limit, the graph auto-truncates with a notice.
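A minimal sketch of the click-to-expand interaction, assuming the AntV G6 v4 API; fetchNeighbors is a hypothetical call to the local backend that returns a neighbor subgraph.

```ts
// Sketch only: radial layout plus click-to-expand, written against the G6 v4 API.
import G6, { GraphData } from "@antv/g6";

declare function fetchNeighbors(nodeId: string): Promise<GraphData>; // hypothetical backend call

const graph = new G6.Graph({
  container: "kg-canvas",
  width: 1200,
  height: 800,
  layout: { type: "radial" },                                    // one of the six layouts
  modes: { default: ["drag-canvas", "zoom-canvas", "drag-node"] },
});

graph.data({ nodes: [], edges: [] });
graph.render();

// Click a node: fetch its neighbor subgraph and merge it into the current view.
graph.on("node:click", async (evt) => {
  const id = evt.item?.getID();
  if (!id) return;
  const sub = await fetchNeighbors(id);
  const cur = graph.save() as GraphData;
  const nodes = new Map((cur.nodes ?? []).concat(sub.nodes ?? []).map((n) => [n.id, n]));
  graph.changeData({
    nodes: [...nodes.values()],                                  // dedupe nodes by id
    edges: [...(cur.edges ?? []), ...(sub.edges ?? [])],
  });
});
```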
Typical Use Cases
Literature Review
Find domain entity networks via the concept graph; locate research gaps; multi-hop reasoning connects implicit logic across papers.
Topic Exploration
In an unfamiliar field, start from one key concept and recursively expand neighbors — quickly build a cognitive map of the domain.
Relation Verification
Is the AI-claimed "X relates to Y" actually grounded? Every graph edge carries chunk citations for item-by-item verification.
Team Knowledge Sharing
Visualize years of team literature accumulation. New members onboard quickly without reinventing the wheel.
Complete Capability Matrix
Core capabilities for the full research lifecycle
Direct Search Across 4 Literature Platforms
Native integration with PubMed / arXiv / OpenAlex / CORE. Advanced Boolean queries with multi-dimensional filters. One-click batch import of full text + complete metadata (title, authors, journal, volume, DOI, PMID, etc.) into the local knowledge base.
Bulk Import of Multi-Format Local Resources
Drag in a folder — recursive import. Full support for PDF / Word / PowerPoint / EPUB / RTF / ODT / Markdown / plain text plus PNG/JPG images (OCR). No more uploading one by one. Background tasks resume automatically after restart.
Dual Retrieval: RAG + Knowledge Graph
Vector RAG retrieval delivers “broad semantic similarity”; the knowledge graph delivers “rigorous entity links.” Local ObjectBox + HNSW for millisecond nearest-neighbor; Kuzu graph DB enables multi-hop reasoning across papers.
LLM-Driven Literature Preprocessing Pipeline
The LiteraturePrepareAgent owns the entire pipeline: semantic chunking, entity/relation extraction, knowledge-graph data, summary, and keywords — all in a single LLM call with atomic submission. Eliminates context loss and inconsistencies from multi-call pipelines.
Multimodal RAG & Chart Understanding
Beyond text — recognizes charts, curves, and pathology images in papers, converting them into searchable knowledge units. Supports multimodal embedding models (e.g. qwen3-vl-embedding).
Interactive Knowledge Graph Visualization
AntV G6-powered Concept Graph + Document Graph with six layouts (radial, force-cluster, concentric, Dagre, circular, grid). Click a node to expand neighbors, jump to source paper, or inspect related chunks.
Auto Metadata Discovery & Citation Archive
Drop any PDF — AI auto-recovers title, authors, abstract, journal, DOI, PMID. Saved as Zotero-style citation in metadata. Full .nbib import support for PubMed/Endnote workflows.
Reading Guides & Visual HTML Reports
One-click reading guides, reverse reading, visual diagrams, and interactive HTML reports for every paper in the knowledge base. Grasp the core in 10 minutes.
@Precise Targeting & Context Pinning
@document, @knowledge article, @specific text — pin context to concrete objects for grounded answers with traceable sources.
The local knowledge base connects to all solutions: