Mission Statement

Library science, applied to the open web.

What Cerulean is

An indie metasearch engine built on library science principles. Cerulean wraps the Brave, Serper, and DuckDuckGo search APIs and classifies every result by source role (how close to the original evidence) and source type (what kind of entity produced it). The classification is the product. The metasearch is the substrate. The full system, the three-tier classifier pipeline, and the honest limits live on the methodology page.

The lineage

Source evaluation predates the web. Library and information science has spent more than a century building frameworks for distinguishing kinds of sources, evaluating authority, and tracing claims back toward original evidence. The ACRL Framework for Information Literacy. The CRAAP test. The BEAM framework. The foundational primary-secondary-tertiary distinction that dates to nineteenth-century historiography. None of it is novel.

What is novel is applying it to web search at scale. Relevance ranking collapsed source-type distinctions because it didn't need them. A search engine optimizing for click-through can put an SEO listicle, a peer-reviewed article, and a Wikipedia entry side by side as long as all three contain the query terms. Cerulean restores the distinctions by tagging each result on two axes and surfacing the tags to the reader.

The model of the reader

Cerulean assumes the reader is an investigator. The patron at a reference desk. The graduate student running down a citation. The journalist verifying a quote. The analyst evaluating a vendor claim. The model is not "someone who wants an authoritative answer." It is "someone who wants to know what kinds of sources are available so they can decide what to trust." Librarians have always held this model of their patrons. Cerulean holds it for its users.

This was the model the early web held too. Ask Jeeves launched in 1996, before Google went public. You typed a question. You got a list of pages to read. You decided which to trust. Ask Jeeves shut down on May 1, 2026; the farewell read: "Every great search must come to an end." AI chatbots inherited the natural-language premise but inverted the model, replacing the reading-and-deciding step with a synthesized answer. Cerulean keeps the model AI search abandoned.

Where Cerulean fits

Cerulean is not a replacement for JSTOR, ProQuest, or EBSCO. Those databases hold the formal academic literature, with metadata, authority controls, and licensing infrastructure that general web search cannot match. For peer-reviewed work within their coverage, they will always be the right tool.

But none of them index the open web. The grey literature. The indie technical blog with a cleaner explanation than the textbook. The government report not yet archived in a formal database. The small-press journalism. The working draft on a researcher's institutional page. The IndieWeb post that does the field's clearest synthesis. The conference talk write-up. The standards body's working notes. All of that is invisible to the academic databases and visible to Cerulean.

The use case is complement, not replacement. A research librarian's workflow already includes both formal database queries and open-web reconnaissance. Cerulean is the open-web side of that workflow, done with the same source-evaluation discipline the databases assume. Even for the master librarian, it opens avenues that formal databases do not offer, while preserving the structural signals their training has taught them to look for.

For the patron, the student, the journalist, or the analyst without database access, Cerulean carries the same discipline into the only territory available to them.

What Cerulean does, structurally

Source-role and source-type tags on every classified result. Visible before the click, not after. Role tags use the library science taxonomy: primary, secondary, tertiary. Type tags name the producer: journalism, academic, indie, commercial, primary-source publisher, reference, community, aggregator, SEO farm.
No ads. Ever. Not commercial placement, not sponsored content, not recommended results.
No AI summary. The methodology is the opposite of synthesis: preserve provenance, surface authority signals, trust the reader to evaluate. AI search blends sources and hides them. Cerulean tags them and shows them.
The query you typed. No autocomplete narrowing what you can ask, no "did you mean" rewriting what you asked, no semantic flattening of precise technical language into common phrasing.
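
The two-axis tagging described above can be pictured as a small data model. What follows is a minimal sketch, not Cerulean's actual code: the enum values come straight from the taxonomy listed here, but the names SourceRole, SourceType, TaggedResult, and label are hypothetical, and treating an untagged result as "unclassified" is an assumption drawn from the phrase "every classified result."

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceRole(Enum):
    """How close the source sits to the original evidence."""
    PRIMARY = "primary"
    SECONDARY = "secondary"
    TERTIARY = "tertiary"


class SourceType(Enum):
    """What kind of entity produced the source."""
    JOURNALISM = "journalism"
    ACADEMIC = "academic"
    INDIE = "indie"
    COMMERCIAL = "commercial"
    PRIMARY_SOURCE_PUBLISHER = "primary-source publisher"
    REFERENCE = "reference"
    COMMUNITY = "community"
    AGGREGATOR = "aggregator"
    SEO_FARM = "SEO farm"


@dataclass(frozen=True)
class TaggedResult:
    """Hypothetical record for one search result and its two tags."""
    url: str
    title: str
    # Assumption: None means the classifier has not tagged this result.
    role: Optional[SourceRole] = None
    source_type: Optional[SourceType] = None

    def label(self) -> str:
        """Text shown next to the result, before the click."""
        if self.role is None or self.source_type is None:
            return "unclassified"
        return f"{self.role.value} / {self.source_type.value}"


result = TaggedResult(
    url="https://example.org/report",
    title="Annual report",
    role=SourceRole.PRIMARY,
    source_type=SourceType.PRIMARY_SOURCE_PUBLISHER,
)
print(result.label())  # prints "primary / primary-source publisher"
```

Keeping role and type as separate fields mirrors the claim that the axes are independent: a reader can see that a result is secondary and journalism, or tertiary and reference, without one tag standing in for the other.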

Scope

Text only. Image search and video search are out of scope. Both are dominated by AI-generated content and AI-recommended ranking, with no result-level provenance. A text page can be read, its domain judged, its citations checked. An image or video requires you to view it before you can evaluate it, and deepfakes and generative tools make verification fundamentally harder than reading a page.

Dedicated image and video endpoints stay out of scope until result-level provenance is solvable for non-text assets: a working asset-level detector, or durable watermark and signature standards at meaningful coverage. Neither exists today. Image and video links that appear organically inside text results (a YouTube clip that genuinely answers a how-to query, or a relevant image embedded in an article) are tagged and ranked alongside everything else.

What Cerulean is not

Not better than Google at everything. Commercial queries, hyperlocal search, shopping triage, "what time does the post office close." Google wins those. Cerulean is better at exactly one thing: showing the reader what kind of source they are looking at before they click, by a methodology a research librarian would recognize, with no commercial incentive to distort the result.

For research, reference work, investigative reporting, and anything where source provenance matters more than convenience, this is closer to what general web search used to be, before relevance ranking flattened the source-type distinctions and AI synthesis hid the sources entirely.

Tag the source. Trust the reader.