litlas
Architecture

The schema and ingest pipeline have their own route

This page exposes the raw / norm / canon / ops layout plus the latest ingest runs for each source.

Database

Postgres + Alembic

Schema layout: raw / norm / canon / ops

raw
norm
canon
ops

Runtime surfaces

Deployment components

frontend

  • runtime: Next.js
  • target: Vercel or containerized Node

api

  • runtime: FastAPI
  • target: Fly.io app

search

  • runtime: database-first search
  • sync: ops.outbox_event -> search.reindex.requested
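The sync line above describes an outbox-style flow: a row in ops.outbox_event later becomes a search.reindex.requested event. A minimal sketch of that idea follows, using an in-memory SQLite table as a stand-in for the Postgres ops schema. The table columns, the function names (open_db, request_reindex, drain_outbox) and the payload shape are assumptions for illustration, not the project's actual schema or API.

```python
import json
import sqlite3

# Event name taken from the page; everything else here is hypothetical.
REINDEX_REQUESTED = "search.reindex.requested"


def open_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for ops.outbox_event (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE ops_outbox_event (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            processed INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    return conn


def request_reindex(conn: sqlite3.Connection, source_slug: str, paper_key: str) -> None:
    """Record a reindex request; in a real pipeline this would share the
    transaction that writes the paper itself."""
    payload = json.dumps({"source_slug": source_slug, "paper_key": paper_key})
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO ops_outbox_event (event_type, payload) VALUES (?, ?)",
            (REINDEX_REQUESTED, payload),
        )


def drain_outbox(conn: sqlite3.Connection, handler) -> int:
    """Deliver unprocessed events in insertion order and mark each one done.
    Returns the number of events delivered."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM ops_outbox_event "
        "WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        handler(event_type, json.loads(payload))
        with conn:
            conn.execute(
                "UPDATE ops_outbox_event SET processed = 1 WHERE id = ?",
                (event_id,),
            )
    return len(rows)
```

Because each event is marked processed only after the handler returns, a crash in between would redeliver that event, so a handler in this sketch should be idempotent.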

auth

  • runtime: Firebase
  • configured: true

billing

  • runtime: Stripe
  • configured: true

Scale principles

Pipeline guarantees

  • raw.source_record and ops.outbox_event act as ledgers for ingest and for downstream sync requests
  • ops.ingest_run stores per-source requested_count, raw_record_count, paper_count, and status
  • raw / norm / canon / ops are separated in the SQLAlchemy model and Alembic migration
  • canonical papers keep reversible source_slug + paper_key mappings back to raw records
  • search stays on a dedicated ConoHa-ready Postgres host so the full DBLP corpus can be queried without loading every paper in memory
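The reversible source_slug + paper_key mapping in the list above can be pictured as a two-way index: from a raw reference to its canonical paper, and from a canonical paper back to every raw reference it was built from. The sketch below is an in-memory illustration only. The class name CanonMapping, its methods, the integer paper ids and the sample keys in the usage note are all hypothetical; the real mapping presumably lives in Postgres tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A raw reference is (source_slug, paper_key), as named on the page.
SourceRef = Tuple[str, str]


@dataclass
class CanonMapping:
    """Two-way index between canonical paper ids and raw source references."""

    _to_paper: Dict[SourceRef, int] = field(default_factory=dict)
    _to_sources: Dict[int, List[SourceRef]] = field(default_factory=dict)

    def link(self, paper_id: int, source_slug: str, paper_key: str) -> None:
        """Attach a raw reference to a canonical paper, keeping both directions."""
        ref = (source_slug, paper_key)
        existing = self._to_paper.get(ref)
        if existing is not None and existing != paper_id:
            # A raw record may belong to only one canonical paper in this sketch.
            raise ValueError(f"{ref} already maps to paper {existing}")
        if existing is None:
            self._to_paper[ref] = paper_id
            self._to_sources.setdefault(paper_id, []).append(ref)

    def paper_for(self, source_slug: str, paper_key: str) -> Optional[int]:
        """Forward lookup: raw reference -> canonical paper id (or None)."""
        return self._to_paper.get((source_slug, paper_key))

    def sources_for(self, paper_id: int) -> List[SourceRef]:
        """Reverse lookup: canonical paper id -> raw references, in link order."""
        return list(self._to_sources.get(paper_id, []))
```

For example, after link(1, "dblp", "conf/a/1") and link(1, "arxiv", "2401.00001") (made-up keys), paper_for("dblp", "conf/a/1") returns 1 and sources_for(1) lists both references.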
Latest ingest status

Per-source run results

These counts come from ops.ingest_run and the canonical paper inventory.

OpenAlex

openalex

0 raw · 0 papers · running

OpenCitations

opencitations

100 raw · 0 papers · completed

Crossref

crossref

100 raw · 100 papers · completed

DataCite

datacite

100 raw · 100 papers · completed

ROR / ORCID

ror_orcid

100 raw · 0 papers · completed

arXiv

arxiv

100 raw · 100 papers · completed

DBLP

dblp

12414064 raw · 12414064 papers · completed

OpenReview

openreview

100 raw · 100 papers · completed

ACL Anthology

acl_anthology

100 raw · 100 papers · completed

PubMed

pubmed

100 raw · 100 papers · completed

PMC Open Access Subset

pmc_open_access

100 raw · 100 papers · completed

DOAJ

doaj

100 raw · 100 papers · completed

CORE

core

100 raw · 100 papers · completed

OpenAIRE Graph

openaire_graph

100 raw · 100 papers · completed

Unpaywall

unpaywall

100 raw · 0 papers · completed

Common Crawl

common_crawl

100 raw · 0 papers · completed
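Several completed runs above report raw records but no canonical papers. As a rough sketch of how such runs could be listed from ops.ingest_run-style rows, the helper below filters on the three fields shown in the cards. RunSummary and raw_only_sources are hypothetical names, and the sample rows copy a few of the cards above; the page does not say why those sources yield no papers, and this code does not try to.

```python
from typing import List, NamedTuple


class RunSummary(NamedTuple):
    """The fields visible on each run card (names follow ops.ingest_run)."""

    source_slug: str
    raw_record_count: int
    paper_count: int
    status: str


def raw_only_sources(runs: List[RunSummary]) -> List[str]:
    """Slugs of completed runs that stored raw records but produced no papers."""
    return [
        run.source_slug
        for run in runs
        if run.status == "completed"
        and run.raw_record_count > 0
        and run.paper_count == 0
    ]


# A few rows copied from the cards above.
RUNS = [
    RunSummary("openalex", 0, 0, "running"),
    RunSummary("opencitations", 100, 0, "completed"),
    RunSummary("crossref", 100, 100, "completed"),
    RunSummary("dblp", 12414064, 12414064, "completed"),
    RunSummary("unpaywall", 100, 0, "completed"),
]

print(raw_only_sources(RUNS))  # ['opencitations', 'unpaywall']
```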