Methodology
How Proprietor rates outlets, clusters stories, and detects blindspots
A transparent account of how this site works — and where it falls short
Proprietor aggregates articles from UK news outlets, groups them into story clusters, and measures how coverage differs by ownership and editorial stance. Each part of that process involves choices. This page makes them explicit.
How outlets are rated
Each tracked outlet carries editorial ratings across four axes. These are human judgments, not algorithmic outputs. They draw on Media Bias/Fact Check (MBFC), the Reuters Institute Digital News Report, and academic media studies literature. Every rating is stored with a confidence level (high / medium / low) and a source tag. All are openly contestable; see About Proprietor below for how to challenge a rating.
All bias dimensions use a scale from −5 to +5, where 0 is genuine neutrality. Most outlets sit between −3 and +3. The poles are intentionally descriptive, not pejorative.
Economic axis
Poles: Pro-Labour / State-led (negative) to Pro-Market / Deregulatory (positive).
Measures how an outlet covers taxation, public spending, welfare, nationalisation, and economic inequality. At the negative end: outlets like Novara Media and The Canary (−4) that explicitly advocate state intervention and wealth redistribution. At the positive end: The Economist (+4) and the Financial Times (+3), which champion free trade, low regulation, and private enterprise. The Guardian sits at −2; The Telegraph at +2. This is Proprietor's primary analysis axis: it is a more structurally revealing divide than left/right.
Social axis
Poles: Socially Progressive (negative) to Socially Conservative (positive).
Measures coverage of immigration, gender, identity politics, and social liberalism versus conservatism. GB News (+4) and The Sun (+3) sit at the conservative end. The Guardian (−3) and Novara Media (−3) sit at the progressive end. The BBC and Financial Times sit near zero, not because they have no social perspective, but because their coverage does not consistently advocate on social issues.
Establishment axis
Poles: Anti-Establishment (negative) to Pro-Establishment (positive).
Measures how an outlet treats those in power: whether it tends towards deference or challenge. This axis cuts across left and right: Byline Times (−4) and openDemocracy (−3) are anti-establishment from the left; Guido Fawkes (−2) is anti-establishment from the right. The Times (+3) and the BBC (+2) sit towards the pro-establishment end, not because they never challenge power, but because their default posture tends to centre and legitimise existing institutions.
EU / Sovereignty axis
Poles: Pro-Brexit / Sovereignty (negative) to Pro-EU / Remain (positive).
The UK's most legible political divide. −4 and −5 outlets (The Express, The Spectator, GB News, Guido Fawkes) campaign actively for Brexit and British sovereignty. +3 and +4 outlets (The Economist, New Statesman, openDemocracy) are consistently pro-EU and critical of Brexit. This is the dimension where editorial positions are most clearly documented and most consistent over time.
Credibility score
Separately from bias, each outlet carries a credibility score from 0–100, drawn primarily from MBFC with Reuters Institute data for established broadcast outlets. This score affects blindspot detection: outlets rated below 40 are weighted at 0.3 rather than 1.0, so a tabloid pile-on does not trigger the same signal as genuine cross-spectrum coverage. GB News (30), The Canary (45), and The Express (35) are weighted down. The Economist (90), The Conversation UK (88), and the BBC (85) carry full weight.
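The weighting rule above can be sketched in a few lines. This is a minimal illustration, not Proprietor's actual code: the function names are hypothetical, and summing the weights into a single coverage figure is an assumption about how the weights are combined.

```python
LOW_CREDIBILITY_THRESHOLD = 40  # scores below this are down-weighted (from the text)
LOW_CREDIBILITY_WEIGHT = 0.3
FULL_WEIGHT = 1.0


def credibility_weight(score: int) -> float:
    """Return the blindspot weight for a 0-100 credibility score."""
    if not 0 <= score <= 100:
        raise ValueError(f"credibility score out of range: {score}")
    if score < LOW_CREDIBILITY_THRESHOLD:
        return LOW_CREDIBILITY_WEIGHT
    return FULL_WEIGHT


def weighted_outlet_count(scores: list[int]) -> float:
    """Sum the weights of the outlets covering one story (assumed aggregation)."""
    return sum(credibility_weight(score) for score in scores)
```

Under these assumptions, a story covered by GB News (30), The Express (35), and the BBC (85) counts as roughly 1.6 outlets rather than 3.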
Ownership data
Ownership records are drawn from three primary sources:
- Media Reform Coalition — Who Owns the Media? Annual report tracking UK print and digital ownership concentration. The most comprehensive public audit of UK media ownership.
- RSF Media Ownership Monitor. Reporters Without Borders' international monitoring of media ownership transparency, used for ultimate beneficial owner data where public records are thin.
- Companies House for corporate structure and beneficial ownership verification, where publicly registered UK entities are involved.
Each outlet record carries: ownership group, ultimate beneficial owner where known, funding model (advertising / subscription / public / membership / mixed), and whether editorial staff are union-recognised. Funding model and union data come from public reporting and are updated periodically.
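One way to picture such a record is as a small typed structure. The sketch below is illustrative only: the class and field names are hypothetical, and the example values are placeholders rather than data about a real outlet.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FundingModel(Enum):
    """The five funding models named in the text."""
    ADVERTISING = "advertising"
    SUBSCRIPTION = "subscription"
    PUBLIC = "public"
    MEMBERSHIP = "membership"
    MIXED = "mixed"


@dataclass(frozen=True)
class OutletOwnership:
    outlet: str
    ownership_group: str
    funding_model: FundingModel
    ultimate_beneficial_owner: Optional[str] = None  # None when not known
    union_recognised: Optional[bool] = None          # None when not reported


# Placeholder values, not a real outlet record.
example = OutletOwnership(
    outlet="Example Gazette",
    ownership_group="Example Media Group",
    funding_model=FundingModel("mixed"),
)
```

Using an enum for the funding model means an unrecognised value fails loudly at load time instead of silently entering the database.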
Story clustering
Every 15 minutes, freshly ingested articles are converted into numerical vectors using a multilingual sentence model. These vectors capture the semantic meaning of each article — two articles about the same event will produce similar vectors even if their headlines use different words.
A new article is compared against recent clusters. If its similarity to an existing cluster is above the threshold of 0.78, it joins that cluster. If its similarity to every recent cluster is below 0.62, it starts a new one. The grey zone between 0.62 and 0.78 is arbitrated by a language model asked a simple question: are these covering the same news event?
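The routing logic can be sketched as follows. The two thresholds come from the text; everything else is an assumption made for illustration, including the use of cosine similarity as the measure, the handling of values exactly on a boundary, and all function and enum names.

```python
import math
from enum import Enum

JOIN_THRESHOLD = 0.78   # above this, the article joins the cluster
GREY_ZONE_FLOOR = 0.62  # below this, the article starts a new cluster


class Decision(Enum):
    JOIN = "join"
    ASK_MODEL = "ask_model"
    NEW_CLUSTER = "new_cluster"


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_article(best_similarity: float) -> Decision:
    """Decide what happens to an article given its best match among recent clusters."""
    if best_similarity > JOIN_THRESHOLD:
        return Decision.JOIN
    if best_similarity >= GREY_ZONE_FLOOR:
        return Decision.ASK_MODEL  # grey zone: defer to the language model
    return Decision.NEW_CLUSTER
```

Keeping the grey zone as a separate outcome means the language model is consulted only for the ambiguous middle band, not for every article.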
Two safeguards prevent false groupings. First, both articles must share at least one named entity — a person, organisation, or place. A story about Keir Starmer and a story about interest rates will not cluster together even if they have similar economic vocabulary. Second, the grey-zone language model check adds a second opinion on ambiguous cases.
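The shared-entity safeguard amounts to a set intersection. The sketch below assumes entity extraction has already happened upstream and that names are matched after simple case and whitespace normalisation; both the function name and the normalisation rule are hypothetical.

```python
def _normalise(names: set[str]) -> set[str]:
    """Lower-case and trim entity names so trivial variants still match."""
    return {name.strip().casefold() for name in names}


def share_named_entity(entities_a: set[str], entities_b: set[str]) -> bool:
    """Return True when two articles have at least one named entity in common."""
    return not _normalise(entities_a).isdisjoint(_normalise(entities_b))


starmer_story = {"Keir Starmer", "Labour Party"}
rates_story = {"Bank of England"}
print(share_named_entity(starmer_story, rates_story))  # False: no shared entity
```

A real pipeline would likely also need alias handling (for example "PM" versus a full name), which this sketch does not attempt.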
Wire service content — PA Media stories republished verbatim across regional outlets — is detected using SimHash fingerprinting and shown once rather than once per republishing outlet. The originating outlet gets credit for coverage; the regional republications are recorded but do not inflate the coverage count.
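For readers unfamiliar with SimHash, here is a minimal sketch of the technique. It is not Proprietor's implementation: the 64-bit fingerprint size, the three-word shingles, the BLAKE2b hash, and the 3-bit distance cut-off are all assumptions chosen for illustration.

```python
import hashlib
import re

FINGERPRINT_BITS = 64
MAX_HAMMING_DISTANCE = 3  # assumed cut-off for "near-verbatim"


def _shingles(text: str, size: int = 3) -> list[str]:
    """Split text into overlapping runs of `size` lower-cased words."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def simhash(text: str) -> int:
    """Build a fingerprint where each bit is a majority vote over shingle hashes."""
    counts = [0] * FINGERPRINT_BITS
    for shingle in _shingles(text):
        digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        for bit in range(FINGERPRINT_BITS):
            counts[bit] += 1 if (value >> bit) & 1 else -1
    fingerprint = 0
    for bit, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a: str, text_b: str) -> bool:
    return hamming_distance(simhash(text_a), simhash(text_b)) <= MAX_HAMMING_DISTANCE
```

Because similar texts share most of their shingles, their fingerprints differ in only a few bits, which is why verbatim republication is caught while a full rewrite is not.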
Blindspot detection
A cluster qualifies as a blindspot when at least three distinct outlets have covered a story and one side of the economic axis has zero representation — all coverage comes from either pro-labour or pro-market outlets, with none from the other side.
This is a structural measure, not an editorial judgment. Proprietor does not decide whether a story deserves more coverage. It only observes that the outlets which did cover it are grouped on one side of the economic divide.
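The rule can be expressed as a short check over the economic ratings of the outlets in a cluster. This sketch uses hypothetical names and makes one assumption the text does not settle: outlets rated exactly 0 count towards neither side.

```python
from dataclasses import dataclass
from typing import Optional

MIN_OUTLETS = 3  # minimum distinct outlets before a blindspot can be flagged


@dataclass(frozen=True)
class Coverage:
    outlet: str
    economic: int  # economic-axis rating, -5 (pro-labour) to +5 (pro-market)


def economic_blindspot(coverage: list[Coverage]) -> Optional[str]:
    """Return which side is missing, or None if there is no blindspot."""
    by_outlet = {item.outlet: item.economic for item in coverage}  # de-duplicate outlets
    if len(by_outlet) < MIN_OUTLETS:
        return None
    has_labour = any(score < 0 for score in by_outlet.values())
    has_market = any(score > 0 for score in by_outlet.values())
    if has_market and not has_labour:
        return "pro-labour side absent"
    if has_labour and not has_market:
        return "pro-market side absent"
    return None


cluster = [
    Coverage("The Telegraph", 2),
    Coverage("The Economist", 4),
    Coverage("Financial Times", 3),
]
print(economic_blindspot(cluster))  # pro-labour side absent
```

Adding a single article from The Guardian (−2) to this cluster would clear the flag, which illustrates that the measure depends only on who covered the story, not on what they said.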
Blindspot types currently detected:
- Market press blindspot. A story covered only by pro-market outlets — no left-of-centre voices present.
- Establishment blindspot. A story covered only by pro-establishment outlets — no critical or anti-establishment voices.
- Brexit lens. An EU or sovereignty story with no pro-labour outlet coverage.
- Tabloid amplification. A story whose coverage is dominated by low-credibility outlets, suggesting amplification without quality corroboration.
- Independent Only. A story covered only by outlets confirmed as independently owned — not picked up by any outlet with a known corporate owner.
Known limitations
Proprietor is a work in progress. Current limitations worth naming explicitly:
- Ratings are editorial judgments. They reflect the people who made them and can be wrong. They are not produced by an algorithm and should not be treated as objective.
- Proprietor currently monitors 255 active GB outlets and 6,676 active outlets worldwide. Hundreds of significant titles — regional papers, specialist publications, community media — are absent. Blindspot detection is only as good as the outlets in the sample.
- RSS-based ingestion cannot retrieve paywalled articles. The Times, The Economist, and The Spectator contribute headline-level signals only.
- Clustering uses headline and summary text only. Full article bodies are not yet indexed. Two articles with identical headlines but opposite editorial takes may cluster together.
- Wire duplication detection catches near-verbatim republication but not rewrites. A regional paper that rewrites a PA Media story in its own words will be counted as original coverage.
- The blindspot threshold of three outlets favours widely covered national stories. Niche specialist coverage, such as a trade outlet covering a sector story that no national title picks up, is not measured.
- The economic axis is the primary blindspot axis. Social, establishment, and EU/sovereignty dimensions are tracked but do not yet trigger blindspot flags.
- Topic classification covers 9 broad categories. Specialist or cross-cutting stories may be miscategorised or left untagged.
- Cluster summaries are AI-generated and not editorially reviewed. They describe what happened based on headlines, not full article bodies, and may miss nuance present in the original reporting.
- The outlet comparison feature compares coverage within the current cluster database. Historical comparisons are limited to the period since the site launched.
AI and language model usage
Proprietor uses Claude Haiku (Anthropic) at four points in the pipeline:
- Clustering tiebreaks. When two articles fall in the similarity grey zone (0.62–0.78), Haiku is asked “Are these covering the same news event?” and replies YES or NO. This is the only decision point where a language model influences clustering.
- Cluster titles. When a new cluster forms, Haiku generates a neutral descriptive title from the article headlines.
- Cluster summaries. For clusters with 3 or more articles, Haiku generates a 2-sentence neutral summary. First sentence: what happened. Second sentence: why it matters.
- Topic tagging. Each cluster is classified into one or more of 9 topic categories using Haiku.
All language model outputs are stored and visible. None are presented as editorial judgment — they are labelling and organisational tools. The underlying article data, outlet ratings, and blindspot detection logic are entirely deterministic.
Topic classification
Proprietor classifies each story cluster into one or more of 9 topic categories: Politics, Economy, Crime, Health, Environment, International, Sport, Culture, Technology. Classification is performed by Claude Haiku based on the cluster title. A cluster can carry up to 3 topic tags. Topics can be browsed at /topics.
Topic classification is a best-effort labelling system. Ambiguous or highly localised stories may not receive a tag. Classifications are not manually reviewed.
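Because the tags come from a language model, its reply has to be checked against the fixed category list before it is stored. The sketch below shows one plausible way to do that; the comma-separated reply format and the function name are assumptions, not a description of Proprietor's actual prompt or parser.

```python
TOPICS = (
    "Politics", "Economy", "Crime", "Health", "Environment",
    "International", "Sport", "Culture", "Technology",
)
MAX_TAGS = 3  # a cluster can carry up to 3 topic tags


def clean_topic_tags(raw_reply: str) -> list[str]:
    """Keep only known categories from a comma-separated reply, de-duplicated and capped."""
    canonical = {topic.casefold(): topic for topic in TOPICS}
    tags: list[str] = []
    for part in raw_reply.split(","):
        topic = canonical.get(part.strip().casefold())
        if topic is not None and topic not in tags:
            tags.append(topic)
    return tags[:MAX_TAGS]


print(clean_topic_tags("politics, Economy, Weather"))  # ['Politics', 'Economy']
```

An empty result corresponds to the untagged case described above: the cluster simply carries no topic.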
Features
- Story clustering across 6,676 active outlets (including 255 GB) updated every 15 minutes
- Ownership data and editorial ratings for every outlet: four bias axes plus a credibility score
- Blindspot detection — stories covered only by one side of the economic divide
- Outlet comparison — put any two outlets side by side to see what they share and what each misses
- Topic pages — browse coverage by subject area
- Archive — browse coverage from any past date
- Search — full-text search across story clusters and outlets
- Cluster summaries — AI-generated 2-sentence summaries for stories with 3 or more articles
Data retention
Story clusters are never deleted. The archive reflects all coverage since the site launched. Articles older than the 72-hour clustering window are retained in the database but no longer contribute to new cluster formation unless the cluster is flagged as a running story.
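The eligibility rule reduces to a single time comparison. In this sketch the function and parameter names are hypothetical, and treating an article exactly 72 hours old as still eligible is an assumption.

```python
from datetime import datetime, timedelta, timezone

CLUSTERING_WINDOW = timedelta(hours=72)


def eligible_for_clustering(
    published_at: datetime,
    now: datetime,
    running_story: bool = False,
) -> bool:
    """Return True if an article can still contribute to new cluster formation."""
    if running_story:
        return True  # running stories stay open beyond the window
    return now - published_at <= CLUSTERING_WINDOW


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(eligible_for_clustering(now - timedelta(hours=80), now))  # False
```

Nothing is deleted in either case: the check only decides whether an article is considered when new clusters form.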
About Proprietor
Proprietor is open-source software, built in public, with no advertising and no investor funding. It was created to increase transparency about who owns and shapes UK news: a structurally aware alternative to Ground News, focused on the UK media landscape.
No outlet pays for listing or for its ratings. No outlet can pay to be removed. The outlet database includes publications across the full political spectrum, including outlets whose editorial positions we find objectionable.
Ratings can be challenged. If you believe a score is wrong, open a GitHub issue with your reasoning and evidence. We take corrections seriously and will update ratings when the argument is sound. Ownership data corrections are especially welcome — this is an area where public records are often incomplete.