🧠 Core Hub · Literature Center

Local Knowledge Base

All-in-One Literature Hub: Search · Import · Retrieve · Reason

InkCop's local knowledge base is the central hub of your workflow — search 4 literature platforms directly, batch import multi-format local resources, then retrieve with dual RAG + Knowledge Graph engines. The LiteraturePrepareAgent owns the full pipeline. 100% local storage with AI-enhanced retrieval throughout your research lifecycle.

4 Literature Platforms · Native Integration

PubMed Biomedical

Biomedical authority (NIH/NLM)

Standard .nbib metadata supported

arXiv STEM

Physics / Math / CS preprints

Boolean query + subject filters

OpenAlex Open Access

200M+ open scholarly index

OA / fulltext filter + citation counts

CORE Aggregator

World's largest OA paper aggregator

One-click full-text PDF & abstract

4 Search platforms
8+ Local file formats
2 Retrieval engines
100% Local · Private
🔎 Feature 1

Direct Search · One-Click Import

No more juggling between websites. Search PubMed / arXiv / OpenAlex / CORE inside InkCop. Batch import to your local knowledge base — metadata auto-recovered, full-text indexed.

🔎

Advanced Query

Boolean + field filters

📋

Select / Select All

Multi-select to target folder

📥

Bulk Import

Metadata + full text saved

📁 Feature 2

Bulk Local Import · Any Format

Got a massive backlog? Drag in a folder to recursively scan all supported formats. PDFs go through MinerU for layout recovery, Word/EPUB/RTF/ODT files through Pandoc, and PNG/JPG images through OCR.

  • Async background processing — keep working in the foreground
  • Resume from checkpoint after app restart — no task loss
  • Auto-triggers LLM preprocessing on import
  • Choose target subfolder to align with your existing structure
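
As a rough illustration of the recursive scan and checkpoint resume described above, here is a minimal Python sketch. It is not InkCop's implementation: the routing table, the JSON checkpoint file, and the `process` callback are all assumptions made for the example, and only the formats whose converters are named in the text are routed.

```python
import json
from pathlib import Path

# Hypothetical routing table: converter labels follow the text above,
# but the mapping itself is an assumption for illustration.
CONVERTERS = {
    ".pdf": "mineru",
    ".doc": "pandoc", ".docx": "pandoc", ".epub": "pandoc",
    ".rtf": "pandoc", ".odt": "pandoc",
    ".png": "ocr", ".jpg": "ocr",
}

def scan_folder(root: Path) -> list[tuple[Path, str]]:
    """Recursively collect supported files with the converter each would use."""
    jobs = []
    for path in sorted(root.rglob("*")):
        converter = CONVERTERS.get(path.suffix.lower())
        if path.is_file() and converter:
            jobs.append((path, converter))
    return jobs

def run_import(root: Path, checkpoint: Path, process) -> int:
    """Process pending files; record each finished path so a restart skips it."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    processed = 0
    for path, converter in scan_folder(root):
        if str(path) in done:
            continue  # already imported before the restart
        process(path, converter)
        done.add(str(path))
        # Persist after every file so an interruption loses at most one job.
        checkpoint.write_text(json.dumps(sorted(done)))
        processed += 1
    return processed
```

Calling `run_import` a second time with the same checkpoint file processes nothing, which is the behavior "resume from checkpoint" implies.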
📄 PDF

Layout preserved + AI metadata recovery

📝 Word

.doc / .docx via Pandoc

📊 PowerPoint

.ppt / .pptx slide import

📚 EPUB

E-book chapters fully parsed

Ⓜ️ Markdown

Native format, zero conversion loss

📃 RTF / ODT

Rich text formats lossless import

🖼️ Image

PNG / JPG with OCR extraction

📋 Plain Text

Logs, notes, batch outputs

🔗 Feature 3

Dual Retrieval: RAG + Knowledge Graph

Vector-only retrieval drifts; graph-only retrieval is too narrow. InkCop runs both engines in parallel, cross-calibrating to give AI both "broad" and "precise" context.

📊

RAG Vector Retrieval

Broad semantic similarity · Fault-tolerant

  • ObjectBox + HNSW local index, ms-level nearest-neighbor
  • Multimodal embeddings (text + image)
  • Optional reranker model for re-ranking
  • LLM-driven semantic chunking
🕸️

Knowledge Graph Retrieval

Rigorous entity links · Multi-hop reasoning

  • Kuzu graph database, native Cypher queries
  • Multi-hop reasoning across papers
  • Concept nodes + entity nodes (two-tier)
  • Every edge carries chunk citations for verification

Dual-engine synergy: RAG retrieves candidate chunks → Graph expands by entity links → Merge & dedupe → Optional LLM relevance test → Return context. Transparent to the LLM, delivering "broad and precise" results.
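
The flow above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the in-memory dictionaries standing in for the vector index and the graph, and the `is_relevant` callback standing in for the optional LLM relevance test are all assumptions, not InkCop's API.

```python
from typing import Callable, Optional

def dual_retrieve(
    vector_hits: list[tuple[str, float]],   # (chunk_id, similarity) from the vector index
    chunk_entities: dict[str, set[str]],    # chunk_id -> entities mentioned in that chunk
    entity_chunks: dict[str, set[str]],     # entity -> chunk_ids linked to it in the graph
    is_relevant: Optional[Callable[[str], bool]] = None,
) -> list[str]:
    # Step 1: RAG candidates, most similar first.
    ordered = [cid for cid, _ in sorted(vector_hits, key=lambda h: h[1], reverse=True)]
    # Step 2: expand each candidate through the entities it mentions.
    expanded = []
    for cid in ordered:
        for entity in sorted(chunk_entities.get(cid, ())):
            expanded.extend(sorted(entity_chunks.get(entity, ())))
    # Step 3: merge and dedupe, keeping first-seen order (vector hits stay on top).
    merged = list(dict.fromkeys(ordered + expanded))
    # Step 4: optional relevance test (a stand-in for the LLM check).
    if is_relevant is not None:
        merged = [cid for cid in merged if is_relevant(cid)]
    return merged
```

Keeping vector hits ahead of graph-expanded chunks is one possible ordering; a real system might instead re-score the merged set.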

🤖 Feature 4 · Core Innovation

LiteraturePrepareAgent · LLM-Driven Pipeline

Traditional knowledge base systems chain four or five independent rule-based modules (chunking, extraction, summarization, and so on), each seeing only its slice of the input, so the weakest link drags down overall quality. InkCop reverses this: it hands the entire preprocessing pipeline to a single LLM agent.

✂️

Semantic Chunking

LLM identifies semantic boundaries to chunk documents — preserving complete argument paragraphs. Avoids rule-based chunking that fragments coherent reasoning.

🔍

Entity & Relation Extraction

Identifies people, institutions, concepts, methods, metrics; extracts coreference, contrast, causality, dependency relations into the graph database.

🧬

Graph Concept Generation

Generates "concept nodes" per document for cross-paper retrieval. Powers multi-hop reasoning and orphan-concept matching — making the graph truly serve retrieval.

🏷️

Summary / Keywords / Category

Single-pass production of summary, keywords, and topic classification. Combined with citation metadata, forms a complete document profile for precise retrieval.

6 Benefits of an LLM-Driven Pipeline

Why not the traditional "rules + multi-module" pipeline?

🌐

Unified Context

In a traditional pipeline, each module sees only its slice of input. With LLM ownership, chunking, extraction, and summarization share the full context for deeper understanding.

⚛️

Atomic Submission

All outputs commit atomically via base_data_updater / index_data_updater tools. No more “chunked OK but graph extraction failed” intermediate states.
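
To make the all-or-nothing idea concrete, here is a minimal Python sketch using a single SQLite database as a stand-in. InkCop's actual stores and the internals of base_data_updater / index_data_updater are not shown on this page, so the schema and the `submit_atomically` function are assumptions for illustration only.

```python
import sqlite3

def create_schema(conn: sqlite3.Connection) -> None:
    """Create two illustrative tables: text chunks and graph edges."""
    with conn:
        conn.execute("CREATE TABLE chunks (doc_id TEXT NOT NULL, text TEXT NOT NULL)")
        conn.execute(
            "CREATE TABLE edges (doc_id TEXT NOT NULL, src TEXT NOT NULL, "
            "rel TEXT NOT NULL, dst TEXT NOT NULL)"
        )

def submit_atomically(conn: sqlite3.Connection, doc_id: str,
                      chunks: list[str], edges: list[tuple]) -> None:
    """Write chunks and edges in one transaction; any error rolls back both."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO chunks (doc_id, text) VALUES (?, ?)",
            [(doc_id, text) for text in chunks],
        )
        conn.executemany(
            "INSERT INTO edges (doc_id, src, rel, dst) VALUES (?, ?, ?, ?)",
            [(doc_id, *edge) for edge in edges],
        )
```

If an edge is malformed, the chunk rows written moments earlier are rolled back too, so no "chunked OK but graph extraction failed" state is left behind.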

🧭

Unified Rules, Easy Evolution

All processing logic lives in the Agent prompt. Switching research domains or upgrading rules requires only a prompt tweak — no rewriting multiple modules.

🔄

Controlled Fallback & Retry

Failed jobs persist in the signals table. A scan at app startup retries them automatically; jobs that exceed the retry threshold wait for a manual trigger, so nothing is lost silently.
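
A startup retry scan of this kind might look like the following Python sketch. The `Signal` record, the status names, and the threshold value are hypothetical; the page does not describe the real signals table schema.

```python
from dataclasses import dataclass
from typing import Callable

MAX_AUTO_RETRIES = 3  # hypothetical threshold; the real value is not documented here

@dataclass
class Signal:
    """One row of an illustrative signals table."""
    doc_id: str
    attempts: int = 0
    status: str = "failed"  # "failed", "done", or "needs_manual"

def retry_on_startup(signals: list[Signal], run_job: Callable[[str], None]) -> None:
    """Retry failed jobs once each; park jobs that have used up their retries."""
    for sig in signals:
        if sig.status != "failed":
            continue
        if sig.attempts >= MAX_AUTO_RETRIES:
            sig.status = "needs_manual"  # surfaced to the user instead of retried
            continue
        sig.attempts += 1
        try:
            run_job(sig.doc_id)
            sig.status = "done"
        except Exception:
            pass  # stays "failed" and is picked up by the next startup scan
```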

🚀

Full LLM Capability

As LLMs improve, preprocessing quality benefits naturally. No code rewrite — just swap to a stronger model or refine the prompt.

⚙️

Independent LLM Config

LiteraturePrepareAgent can be decoupled from the chat-facing main_agent, so you can configure a cheaper or faster local model for batch processing.

🕸️ Feature 5

Knowledge Graph Visualization · See the Structure

The knowledge graph is not just for AI retrieval — it's your cognitive tool. AntV G6-powered interactive graph lets you literally "see" the relationship network inside your knowledge base.

📐

Six Layout Algorithms

radial, force-cluster, concentric, antv-dagre, circular, grid — the system auto-recommends the best layout based on graph density.

🔀

Concept Graph + Document Graph

Concept graph shows cross-paper entity relations; document graph shows similarity and citation links between documents in the knowledge base. One-click switch.

🔭

Click-to-Expand Neighbors

Click any node to fetch its neighbor subgraph and merge with the current view. Progressively explore the graph without loading everything at once.
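
The merge step can be pictured with a small Python sketch. The node and edge dictionaries below follow a generic `id` / `source` / `target` shape chosen for the example; it is an assumption, not the data format InkCop or AntV G6 actually uses.

```python
def merge_subgraph(view: dict, neighbors: dict) -> dict:
    """Merge a fetched neighbor subgraph into the current view without duplicates."""
    # Nodes are keyed by id; nodes already in the view win over fetched copies.
    nodes = {n["id"]: n for n in view["nodes"]}
    for n in neighbors["nodes"]:
        nodes.setdefault(n["id"], n)
    # Edges are keyed by (source, target, label) so parallel relations survive.
    edges = {(e["source"], e["target"], e["label"]): e for e in view["edges"]}
    for e in neighbors["edges"]:
        edges.setdefault((e["source"], e["target"], e["label"]), e)
    return {"nodes": list(nodes.values()), "edges": list(edges.values())}
```

Because only the clicked node's neighborhood is fetched, the view grows by a bounded amount per click.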

🎯

Focus Highlight + LOD

Selecting a node highlights first-degree neighbors and dims the rest. Labels auto-hide on low zoom — smooth rendering even with millions of nodes.

↩️

Related Chunks & Source Jump

A drawer shows chunks linked to the node and lets you jump straight to the source paper position. From graph to detail without tool-switching.

🔎

Keyword Search + Subgraph Filter

Filter the subgraph via the toolbar by keyword or knowledge-base ID. When nodes/edges exceed the limit, the graph auto-truncates with a notice.

Typical Use Cases

📚

Literature Review

Find domain entity networks via the concept graph; locate research gaps; multi-hop reasoning connects implicit logic across papers.

🗺️

Topic Exploration

In an unfamiliar field, start from one key concept and recursively expand neighbors — quickly build a cognitive map of the domain.

Relation Verification

Is the AI-claimed "X relates to Y" actually grounded? Every graph edge carries chunk citations for item-by-item verification.
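
As an illustration of how edge-level citations support this kind of check, the Python sketch below finds a multi-hop path between two entities and returns the chunk IDs attached to each hop. The edge tuple layout and the `find_path` helper are assumptions for the example; InkCop's graph lives in Kuzu and is queried differently.

```python
from collections import deque

def find_path(edges, start, goal, max_hops=3):
    """Breadth-first search returning the hops (with cited chunks) from start to goal.

    Each edge is (source, relation, target, chunk_ids); returns None if no
    path exists within max_hops.
    """
    adjacency = {}
    for src, rel, dst, chunks in edges:
        adjacency.setdefault(src, []).append((rel, dst, chunks))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for rel, dst, chunks in adjacency.get(node, []):
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [(node, rel, dst, chunks)]))
    return None
```

Each hop in the returned path names the chunks that justify it, so a claimed relation can be checked against the source text one edge at a time.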

👥

Team Knowledge Sharing

Visualize years of team literature accumulation. New members onboard quickly without reinventing the wheel.

Complete Capability Matrix

Core capabilities for the full research lifecycle

🔎

Direct Search Across 4 Literature Platforms

Native integration with PubMed / arXiv / OpenAlex / CORE. Advanced Boolean queries with multi-dimensional filters. One-click batch import of full text + complete metadata (title, authors, journal, volume, DOI, PMID, etc.) into the local knowledge base.

📁

Bulk Import of Multi-Format Local Resources

Drag in a folder — recursive import. Full support for PDF / Word / PowerPoint / EPUB / RTF / ODT / Markdown / plain text plus PNG/JPG images (OCR). No more uploading one by one. Background tasks resume automatically after restart.

🔗

Dual Retrieval: RAG + Knowledge Graph

Vector RAG retrieval delivers “broad semantic similarity”; the knowledge graph delivers “rigorous entity links.” Local ObjectBox + HNSW for millisecond nearest-neighbor; Kuzu graph DB enables multi-hop reasoning across papers.

🤖

LLM-Driven Literature Preprocessing Pipeline

The LiteraturePrepareAgent owns the entire pipeline: semantic chunking, entity/relation extraction, knowledge-graph data, summary, and keywords — all in a single LLM call with atomic submission. Eliminates context loss and inconsistencies from multi-call pipelines.

🔬

Multimodal RAG & Chart Understanding

Beyond text — recognizes charts, curves, and pathology images in papers, converting them into searchable knowledge units. Supports multimodal embedding models (e.g. qwen3-vl-embedding).

🕸️

Interactive Knowledge Graph Visualization

AntV G6-powered Concept Graph + Document Graph with six layouts (radial, force-cluster, concentric, Dagre, circular, grid). Click a node to expand neighbors, jump to source paper, or inspect related chunks.

🏷️

Auto Metadata Discovery & Citation Archive

Drop in any PDF and AI auto-recovers the title, authors, abstract, journal, DOI, and PMID, saved as a Zotero-style citation in the metadata. Full .nbib import support for PubMed/EndNote workflows.

📖

Reading Guides & Visual HTML Reports

One-click reading guides, reverse reading, visual diagrams, and interactive HTML reports for every paper in the knowledge base. Grasp the core in 10 minutes.

🎯

@Precise Targeting & Context Pinning

Use @document, @knowledge article, or @specific text to pin context to specific items, giving grounded answers with traceable sources.
