Build vs Buy · 2-minute read

The AI research system serious firms are trying to build.
Running today.

Across finance, serious research shops are quietly assembling the same four things. AI embedded into existing research workflows — across listed equity, private firms, policy topics, and macro events. An evaluation framework with drift monitoring. A full audit trail with model and prompt versioning. A short pilot that produces measurable ROI before commitment. Semper Signum is that system, operational on day one.

See a real production report →
Request the 30-day pilot scope →

What firms are trying to build. What we already ship.

Ten requirements we see on every serious institutional roadmap. Left column is the requirement in the buyer's own language. Right column is how Semper Signum delivers it today, on real tickers, with logged evaluations.

The requirement

"Embed AI into existing research workflows, not build standalone tools."

→

What Semper Signum does

JSON in, HTML + PDF out. Reports integrate into your existing PM workflow without new UI to adopt.

The requirement

"Build an evaluation framework with drift monitoring."

→

What Semper Signum does

Per-stage evaluators score every output before it flows downstream. Drift catalogued per ticker, per stage, per model.

See how the evaluator works →

The requirement

"Full audit trail for every agent decision. Model and prompt versioning. Rollback capability."

→

What Semper Signum does

Every model call is logged with prompt hash, parameters, timestamp. Every stage rollback is traceable. Compliance-ready.

See the audit schema →

The requirement

"Hallucination control and reliability at institutional scale."

→

What Semper Signum does

Step-level verification against source filings. Contradictions trigger a rollback and a re-run with a different model or prompt before the user sees anything.

The requirement

"Governance and Responsible AI posture that compliance and risk can sign off on."

→

What Semper Signum does

Data lineage diagram, model-provider list, retention policy, deletion SLA, PII rules. Reads like a security whitepaper.

Full governance posture →

The requirement

"Human-in-the-loop gating for high-risk decisions."

→

What Semper Signum does

Configurable thresholds on any stage. When the evaluator score drops below threshold, the stage escalates to a reviewer before publication.

The requirement

"4-8 week pilot that produces measurable ROI before commitment."

→

What Semper Signum does

30-day flat-fee pilot. Five production reports, evaluation framework tuned to your workflow, audit-trail handoff. Clean exit at day 30.

Pilot scope →

The requirement

"Adoption metrics, not deployment alone. Are analysts using it in their workflow?"

→

What Semper Signum does

Every report generation, every view, every export is logged. Pilot handoff includes an adoption dashboard tied to your named users.

The requirement

"Cost and latency tradeoffs explicit in the architecture."

→

What Semper Signum does

Failover model router: Bailian Qwen for cost, Claude Opus for quality, Claude Sonnet for latency. Customer-configurable per stage.

The requirement

"Integration with existing systems: Aladdin, Bloomberg, FactSet, S3, internal data lake."

→

What Semper Signum does

Standard REST / S3 / CSV inputs. Outputs as static HTML, PDF, or JSON. Your data stays in your stack; we add the analytical layer.
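
The audit-trail row above says every model call is logged with a prompt hash, parameters, and a timestamp. As a rough illustration only, a record of that shape might look like the sketch below. The field names, the `audit_record` helper, and the use of SHA-256 are assumptions made for this example, not the Semper Signum audit schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(model: str, prompt: str, params: dict, stage: str) -> dict:
    """Build one audit-log entry for a model call (hypothetical field names)."""
    return {
        "stage": stage,
        "model": model,
        # Hash the prompt so the log can show which prompt ran
        # without storing the prompt text itself.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "params": params,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


record = audit_record(
    model="example-model",  # placeholder label, not a real model ID
    prompt="Summarise the latest 10-K risk factors.",
    params={"temperature": 0.0},
    stage="risk_framework",
)
print(json.dumps(record, sort_keys=True))
```

Because the hash is deterministic, two calls made with the same prompt text produce the same `prompt_sha256`, which is what makes prompt versions comparable across runs.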

The math on build.

Rough year-one numbers to ship an equivalent internal system, based on US market compensation and current cloud and tooling costs. Your numbers will differ; the shape will not.

Build it yourself

Hire the team

Technical lead (fully loaded): $280-340k
Two senior engineers: $500k
Data / ML engineer: $200k
Cloud, model APIs, tooling: $80k
Time to first production artifact: 6-9 months
Risk of deprecation / team churn: High
Year 1 total: ~$1.1M

Buy Semper Signum

Run the pilot

30-day pilot (flat fee): $50k
Production reports delivered: 5+
Evaluation + audit layer: Included
Time to first report: 7 days
Internal team required: None
Exit after day 30: Clean
Pilot total: $50k

You can still build internally. A 30-day pilot gives your team a running start with a production system, an evaluator framework, and an audit trail instead of a blank repo.
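
For readers who want to check the build-side arithmetic, the short sketch below sums the year-one line items listed above. It uses only the figures on this page; the variable names are illustrative.

```python
# Year-one build line items from the comparison above, in thousands of USD,
# as (low, high) ranges. Single-point figures use the same low and high.
line_items = {
    "Technical lead (fully loaded)": (280, 340),
    "Two senior engineers": (500, 500),
    "Data / ML engineer": (200, 200),
    "Cloud, model APIs, tooling": (80, 80),
}

low = sum(lo for lo, _ in line_items.values())
high = sum(hi for _, hi in line_items.values())

print(f"Build, year 1: ${low}k to ${high}k")  # Build, year 1: $1060k to $1120k
```

The range of $1,060k to $1,120k is what rounds to the ~$1.1M year-one total.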

What production looks like.

Not a demo environment. Not a slide. Actual output shipped on real names, with the kind of audit log that compliance expects from a human analyst's workpapers.

Executive Summary

Variant Perception & Thesis

Valuation: DCF + Comps + Scenarios

Risk Framework & Kill Criteria

Competitive Positioning

Adversarial Challenge

The report

Twenty-two structured sections on any subject: public company, private firm, policy topic, or macro event. Same depth, applied consistently. Thesis, valuation through three independent methods, competitive position, risk framework, management assessment. The same analysis a senior analyst would produce in two to four weeks, delivered in hours.

JPM · MC fair_value = -$47 · silent error

eval: ocf_classification · score 0.22 · FAIL

cause: bank OCF structurally negative → FCF margin -81%

action: rollback stage · apply NI-margin proxy

eval: ocf_classification · score 0.94 · PASS

JPM · MC fair_value = $212 · P5 $133 · P95 $352

A real catch, on JPM

Banks carry structurally negative operating cash flow, which silently pushed the Monte Carlo valuation to a nonsensical fair value of -$47. The per-stage evaluator flagged it, rolled the stage back, applied the NI-margin proxy, and published a fair value of $212. Every model call, score, and rollback is logged. See the full JPM report →
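
The evaluate, roll back, and re-run pattern in the JPM trace can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production evaluator: the `run_gated_stage` function, the 0.8 threshold, and the attempt names are hypothetical, and the toy functions simply replay the values from the trace rather than computing a valuation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class StageResult:
    value: Optional[float]  # published value, or None if escalated
    escalated: bool         # True when every attempt failed the gate
    log: List[str] = field(default_factory=list)


def run_gated_stage(
    attempts: Sequence[Tuple[str, Callable[[], float]]],
    evaluate: Callable[[float], float],
    threshold: float = 0.8,  # hypothetical gating threshold
) -> StageResult:
    """Try each attempt in order; publish the first one that passes the gate."""
    log: List[str] = []
    for name, compute in attempts:
        value = compute()
        score = evaluate(value)
        verdict = "PASS" if score >= threshold else "FAIL"
        log.append(f"{name}: score {score:.2f} {verdict}")
        if verdict == "PASS":
            return StageResult(value, False, log)
        log.append(f"{name}: rollback")
    # Nothing passed: hand the stage to a human reviewer instead of publishing.
    return StageResult(None, True, log)


def toy_evaluator(fair_value: float) -> float:
    """Toy check that only looks at the sign and replays the trace scores."""
    return 0.94 if fair_value > 0 else 0.22


# Toy stand-ins that return the fair values shown in the JPM trace.
attempts = [
    ("fcf_margin", lambda: -47.0),
    ("ni_margin_proxy", lambda: 212.0),
]

result = run_gated_stage(attempts, toy_evaluator)
print(result.value)
print(result.log)
```

If every attempt fails the gate, the sketch returns `escalated=True` rather than a value, which corresponds to the human-in-the-loop escalation described earlier on this page.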

NVDA: 47 reports · drift 0.03 · stable

AAPL: 52 reports · drift 0.04 · stable

JPM: 38 reports · drift 0.12 · monitored

BABA: 19 reports · drift 0.28 · review

MSFT: 44 reports · drift 0.02 · stable

total: 90 tickers · avg drift 0.06

Drift monitoring

Across a production book, every ticker carries a drift score. When drift crosses threshold, the ticker enters review and the evaluator is retuned against fresh filings. Nothing silently degrades. Adoption dashboard and evaluator coverage tie directly to your named users.
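
As a rough sketch of how a drift score could map to the statuses in the sample above, the snippet below applies two cut-offs. The 0.10 and 0.25 thresholds are assumptions chosen only so the example reproduces the labels shown for these five tickers; the actual thresholds and the way drift is computed are not stated on this page.

```python
def drift_status(drift: float, monitor_at: float = 0.10, review_at: float = 0.25) -> str:
    """Map a drift score to a status label (threshold values are hypothetical)."""
    if drift >= review_at:
        return "review"
    if drift >= monitor_at:
        return "monitored"
    return "stable"


# Drift scores copied from the sample book above.
book = {"NVDA": 0.03, "AAPL": 0.04, "JPM": 0.12, "BABA": 0.28, "MSFT": 0.02}

statuses = {ticker: drift_status(drift) for ticker, drift in book.items()}
for ticker, status in statuses.items():
    print(f"{ticker}: {status}")
```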

Semper Signum operator view: grid of running and completed pipeline cards, each showing ticker, stage, agent status, model routing, and evaluator scores.

The operator view: every pipeline, every agent, every model call, with evaluator scores and rollback events surfaced in real time. This is the live surface the 30-day pilot hands you on day one.

The 30-day pilot.

One flat fee. Five reports. A tuned evaluation framework. A clean exit. No procurement drama.

What you get

Five production-quality Deep Dive reports on tickers you choose
Evaluation framework tuned to your workflow and risk tolerance
Audit-trail schema deployed in your VPC or ours
Drift-monitoring dashboard with your named users
Handoff runbook for your internal team

The ask

$50k flat, 30 days

Week 1: discovery + integration
Week 2: first production reports
Week 3: evaluation calibration
Week 4: handoff + training

Full week-by-week scope →

What IT, compliance, and legal will ask.

Eight questions we get in every institutional procurement cycle. If your reviewers have more, the governance page has the long form.

Do you work on-prem?

Hybrid deployment. Orchestration can run in your VPC or in ours. Models are called via your own API keys (Bailian, Anthropic, OpenAI, any combination). Output artifacts land wherever you route them: S3, SharePoint, internal wiki, your PM desk.

Which models do you use, and can we swap them?

Default failover queue: Bailian Qwen for cost, Claude Opus for quality, Claude Sonnet for latency. The router is config-driven. You can pin a single model, swap in your preferred provider, or route different stages to different models based on your cost and latency tradeoffs.
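
To make "config-driven" concrete, here is a minimal sketch of a failover queue with per-stage routing. Everything in it is an assumption for illustration: the `ROUTES` table, the stage names, and the model labels are placeholders rather than real provider model IDs, and `fake_call` stands in for a real provider client.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical routing table: each stage maps to an ordered failover queue.
ROUTES: Dict[str, List[str]] = {
    "default": ["bailian-qwen", "claude-opus", "claude-sonnet"],
    "valuation": ["claude-opus"],  # a stage pinned to a single model
}


def call_with_failover(
    stage: str,
    prompt: str,
    call_model: Callable[[str, str], str],
    routes: Dict[str, List[str]] = ROUTES,
) -> Tuple[str, str]:
    """Try each model in the stage's queue; return (model_used, response)."""
    errors = []
    for model in routes.get(stage, routes["default"]):
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real router would catch provider-specific errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a provider client: the first model in the queue times out."""
    if model == "bailian-qwen":
        raise TimeoutError("simulated timeout")
    return f"{model} handled: {prompt}"


used, reply = call_with_failover("thesis", "Summarise the filing.", fake_call)
print(used)  # claude-opus
```

In this sketch, pinning a stage to one model is just a one-entry queue, and swapping providers means editing the table rather than the calling code.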

What is the data retention policy?

Configurable. Default: 90 days on logs, 0 days on source data after report generation. On cancellation, full export within 7 business days, full deletion of logs and artifacts within 30 days. Exit is clean and documented.

How do you handle PII?

We do not accept it. Semper Signum researches companies—public and private—using publicly available and licensed data only. We never ingest client-identifying information, personal holdings, or portfolio data. If your workflow requires processing PII, we are not the right vendor. This is a hard constraint, not a policy.

Can compliance audit your evaluation framework?

Yes. Every scoring rule is documented. Thresholds are configurable per stage, per ticker, per customer. You can add your own evaluators and gating thresholds. The evaluator schema is handed over during pilot handoff and re-auditable in production.

What happens to our data if we cancel?

Full export in a standard format within 7 business days. Full deletion of logs, artifacts, and cached model output within 30 days. Written confirmation of deletion provided. No retention beyond that.

Is this reseller-friendly?

Yes, for hedge funds, PE firms, asset managers, RIAs, family offices, boutique investment banks, and wealth platforms whose clients are the end readers. Not for direct competitors in the research-platform space. License terms in the pilot contract are explicit on this.

We already have an internal AI roadmap. Why Semper Signum?

The pilot is how your internal effort accelerates. Instead of spending 6-9 months on infrastructure before the first production artifact, your team starts from a working system with an evaluator framework and an audit trail already in place. They spend year one extending, not building from zero.

Send me the pilot scope PDF.
